11 Jan, 2012

14 commits

  • TI's TCA6507 is the LED driver in the GTA04 Openmoko motherboard. The
    driver provides full support for brightness levels and hardware blinking.

    This driver can drive each of 7 outputs as an LED or a GPIO output,
    and provides hardware-assisted blinking.

    [akpm@linux-foundation.org: fix __mod_i2c_device_table alias]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: NeilBrown
    Cc: Richard Purdie
    Cc: Randy Dunlap
    Cc: Dan Carpenter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • mpol_equal() logically returns a boolean. Use a bool type to slightly
    improve readability.

    Signed-off-by: KOSAKI Motohiro
    Cc: Stephen Wilson
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • oom_score_adj is used to guard processes from the OOM killer. One
    problem is that it is inherited at fork(). When a daemon sets
    oom_score_adj and creates children, it is hard to know where the value
    was set.

    This patch adds three tracepoints that are useful for debugging:
    - creating a new task
    - renaming a task (exec)
    - setting oom_score_adj

    To debug, users need to enable some of these tracepoints. Filtering may
    be useful, for example:

    # EVENT=/sys/kernel/debug/tracing/events/task/
    # echo "oom_score_adj != 0" > $EVENT/task_newtask/filter
    # echo "oom_score_adj != 0" > $EVENT/task_rename/filter
    # echo 1 > $EVENT/enable
    # EVENT=/sys/kernel/debug/tracing/events/oom/
    # echo 1 > $EVENT/enable

    The output will look like this:
    # grep oom /sys/kernel/debug/tracing/trace
    bash-7699 [007] d..3 5140.744510: oom_score_adj_update: pid=7699 comm=bash oom_score_adj=-1000
    bash-7699 [007] ...1 5151.818022: task_newtask: pid=7729 comm=bash clone_flags=1200011 oom_score_adj=-1000
    ls-7729 [003] ...2 5151.818504: task_rename: pid=7729 oldcomm=bash newcomm=ls oom_score_adj=-1000
    bash-7699 [002] ...1 5175.701468: task_newtask: pid=7730 comm=bash clone_flags=1200011 oom_score_adj=-1000
    grep-7730 [007] ...2 5175.701993: task_rename: pid=7730 oldcomm=bash newcomm=grep oom_score_adj=-1000
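
    As a rough illustration of post-processing this output, the sketch below
    pulls the oom_score_adj value out of a single trace line. The helper name
    parse_oom_score_adj() is hypothetical and not part of the patch; it only
    assumes the "oom_score_adj=<integer>" field format visible in the lines
    above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): extract the value of the
 * "oom_score_adj=" field from one line of trace output.
 * Returns 0 and stores the value in *val on success, -1 if the field is
 * missing or has no digits after the '='.
 */
int parse_oom_score_adj(const char *line, int *val)
{
	static const char key[] = "oom_score_adj=";
	const char *p = strstr(line, key);
	char *end;
	long v;

	if (!p)
		return -1;
	p += sizeof(key) - 1;		/* skip past the '=' */
	v = strtol(p, &end, 10);
	if (end == p)			/* nothing numeric followed the key */
		return -1;
	*val = (int)v;
	return 0;
}
```

    Note that the event name "oom_score_adj_update:" does not match the key,
    because the key includes the trailing '='.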

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • migrate was doing an rmap_walk with speculative lock-less access on
    pagetables. That could lead it to not serializing properly against mremap
    PT locks. But a second problem remains in the order of vmas in the
    same_anon_vma list used by the rmap_walk.

    If vma_merge succeeds in copy_vma, the src vma could be placed after the
    dst vma in the same_anon_vma list. That could still lead to migrate
    missing some pte.

    This patch adds an anon_vma_moveto_tail() function to force the dst vma at
    the end of the list before mremap starts to solve the problem.

    If the mremap is very large and there are a lot of parents or children
    sharing the anon_vma root lock, this should still scale better than taking
    the anon_vma root lock around every pte copy practically for the whole
    duration of mremap.

    Update: Hugh noticed special care is needed in the error path where
    move_page_tables goes in the reverse direction: a second
    anon_vma_moveto_tail() call is needed in the error path.

    This program exercises anon_vma_moveto_tail():

    ===

    /*
     * The headers and SIZE below are not in the listing above; SIZE is an
     * assumed value and only needs to be a multiple of twice the page size.
     */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/time.h>

    #define SIZE (4*1024*1024)

    int main()
    {
        static struct timeval oldstamp, newstamp;
        long diffsec;
        char *p, *p2, *p3, *p4;

        if (posix_memalign((void **)&p, 2*1024*1024, SIZE))
            perror("memalign"), exit(1);
        if (posix_memalign((void **)&p2, 2*1024*1024, SIZE))
            perror("memalign"), exit(1);
        if (posix_memalign((void **)&p3, 2*1024*1024, SIZE))
            perror("memalign"), exit(1);

        memset(p, 0xff, SIZE);
        printf("%p\n", p);
        memset(p2, 0xff, SIZE);
        memset(p3, 0x77, 4096);
        if (memcmp(p, p2, SIZE))
            printf("error\n");
        /* move the second half of p onto p3 ... */
        p4 = mremap(p+SIZE/2, SIZE/2, SIZE/2, MREMAP_FIXED|MREMAP_MAYMOVE, p3);
        if (p4 != p3)
            perror("mremap"), exit(1);
        /* ... and back again, so the vma is merged into its old neighbour */
        p4 = mremap(p4, SIZE/2, SIZE/2, MREMAP_FIXED|MREMAP_MAYMOVE, p+SIZE/2);
        if (p4 != p+SIZE/2)
            perror("mremap"), exit(1);
        if (memcmp(p, p2, SIZE))
            printf("error\n");
        printf("ok\n");

        return 0;
    }
    ===

    $ perf probe -a anon_vma_moveto_tail
    Add new event:
    probe:anon_vma_moveto_tail (on anon_vma_moveto_tail)

    You can now use it on all perf tools, such as:

    perf record -e probe:anon_vma_moveto_tail -aR sleep 1

    $ perf record -e probe:anon_vma_moveto_tail -aR ./anon_vma_moveto_tail
    0x7f2ca2800000
    ok
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.043 MB perf.data (~1860 samples) ]
    $ perf report --stdio
    100.00% anon_vma_moveto [kernel.kallsyms] [k] anon_vma_moveto_tail

    Signed-off-by: Andrea Arcangeli
    Reported-by: Nai Xia
    Acked-by: Mel Gorman
    Cc: Hugh Dickins
    Cc: Pawel Sikora
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • The maximum number of dirty pages that exist in the system at any time is
    determined by a number of pages considered dirtyable and a user-configured
    percentage of those, or an absolute number in bytes.

    This number of dirtyable pages is the sum of memory provided by all the
    zones in the system minus their lowmem reserves and high watermarks, so
    that the system can retain a healthy number of free pages without having
    to reclaim dirty pages.

    But there is a flaw in that we have a zoned page allocator which does not
    care about the global state but rather the state of individual memory
    zones. And right now there is nothing that prevents one zone from filling
    up with dirty pages while other zones are spared, which frequently leads
    to situations where kswapd, in order to restore the watermark of free
    pages, does indeed have to write pages from that zone's LRU list. This
    can interfere so badly with IO from the flusher threads that major
    filesystems (btrfs, xfs, ext4) mostly ignore write requests from reclaim
    already, taking away the VM's only possibility to keep such a zone
    balanced, aside from hoping the flushers will soon clean pages from that
    zone.

    Enter per-zone dirty limits. They are to a zone's dirtyable memory what
    the global limit is to the global amount of dirtyable memory, and try to
    make sure that no single zone receives more than its fair share of the
    globally allowed dirty pages in the first place. As the number of pages
    considered dirtyable excludes the zones' lowmem reserves and high
    watermarks, the maximum number of dirty pages in a zone is such that the
    zone can always be balanced without requiring page cleaning.

    As this is a placement decision in the page allocator and pages are
    dirtied only after the allocation, this patch allows allocators to pass
    __GFP_WRITE when they know in advance that the page will be written to and
    become dirty soon. The page allocator will then attempt to allocate from
    the first zone of the zonelist - which on NUMA is determined by the task's
    NUMA memory policy - that has not exceeded its dirty limit.

    At first glance, it would appear that the diversion to lower zones can
    increase pressure on them, but this is not the case. With a full high
    zone, allocations will be diverted to lower zones eventually, so it is
    more of a shift in timing of the lower zone allocations. Workloads that
    previously could fit their dirty pages completely in the higher zone may
    be forced to allocate from lower zones, but the number of pages that
    "spill over" is itself limited by the lower zones' dirty constraints,
    and thus unlikely to become a problem.

    For now, the problem of unfair dirty page distribution remains for NUMA
    configurations where the zones allowed for allocation are in sum not big
    enough to trigger the global dirty limits, wake up the flusher threads and
    remedy the situation. Because of this, an allocation that could not
    succeed on any of the considered zones is allowed to ignore the dirty
    limits before going into direct reclaim or even failing the allocation,
    until a future patch changes the global dirty throttling and flusher
    thread activation so that they take individual zone states into account.
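
    The placement decision described above can be sketched in userspace C
    roughly as follows. This is only a simplified model under stated
    assumptions: struct zone_model, zone_dirty_limit() and
    pick_zone_for_write() are hypothetical names for illustration, not the
    kernel's data structures or functions, and the limit is modelled purely
    as a percentage of a zone's dirtyable pages.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a zone; not the kernel's struct zone. */
struct zone_model {
	unsigned long dirtyable;	/* pages that may hold dirty page cache */
	unsigned long nr_dirty;		/* pages currently dirty */
};

/* A zone's dirty limit: the configured percentage of its dirtyable pages. */
unsigned long zone_dirty_limit(const struct zone_model *z, unsigned int ratio)
{
	return z->dirtyable * ratio / 100;
}

/*
 * Walk the zonelist in preference order and return the index of the first
 * zone that has not exceeded its dirty limit. If every zone is over its
 * limit, fall back to the most preferred zone, modelling an allocation that
 * is allowed to ignore the dirty limits rather than fail.
 * Returns -1 for an empty zonelist.
 */
int pick_zone_for_write(const struct zone_model *zones, size_t n,
			unsigned int ratio)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (zones[i].nr_dirty <= zone_dirty_limit(&zones[i], ratio))
			return (int)i;
	return n ? 0 : -1;
}
```

    With a 40% ratio and a preferred zone of 1000 dirtyable pages holding
    401 dirty ones, the sketch diverts the allocation to the next zone in
    the list as long as that zone is still within its own limit.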

    Test results

    15M DMA + 3246M DMA32 + 504M Normal = 3765M memory
    40% dirty ratio
    16G USB thumb drive
    10 runs of dd if=/dev/zero of=disk/zeroes bs=32k count=$((10 << 15))

              seconds                  nr_vmscan_write
                     (stddev)          min|      median|         max
    xfs
    vanilla:  549.747( 3.492)        0.000|       0.000|       0.000
    patched:  550.996( 3.802)        0.000|       0.000|       0.000

    fuse-ntfs
    vanilla: 1183.094(53.178)    54349.000|   59341.000|   65163.000
    patched:  558.049(17.914)        0.000|       0.000|      43.000

    btrfs
    vanilla:  573.679(14.015)   156657.000|  460178.000|  606926.000
    patched:  563.365(11.368)        0.000|       0.000|    1362.000

    ext4
    vanilla:  561.197(15.782)        0.000| 2725438.000| 4143837.000
    patched:  568.806(17.496)        0.000|       0.000|       0.000

    Signed-off-by: Johannes Weiner
    Reviewed-by: Minchan Kim
    Acked-by: Mel Gorman
    Reviewed-by: Michal Hocko
    Tested-by: Wu Fengguang
    Cc: KAMEZAWA Hiroyuki
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Shaohua Li
    Cc: Rik van Riel
    Cc: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Per-zone dirty limits try to distribute page cache pages allocated for
    writing across zones in proportion to the individual zone sizes, to reduce
    the likelihood of reclaim having to write back individual pages from the
    LRU lists in order to make progress.

    This patch:

    The amount of dirtyable pages should not include the full number of free
    pages: there is a number of reserved pages that the page allocator and
    kswapd always try to keep free.

    The closer (reclaimable pages - dirty pages) is to the number of reserved
    pages, the more likely it becomes for reclaim to run into dirty pages:

        +----------+ ---
        |   anon   |  |
        +----------+  |
        |          |  |
        |          |  -- dirty limit new    -- flusher new
        |   file   |  |                     |
        |          |  |                     |
        |          |  -- dirty limit old    -- flusher old
        |          |                        |
        +----------+                       --- reclaim
        | reserved |
        +----------+
        |  kernel  |
        +----------+

    This patch introduces a per-zone dirty reserve that takes both the lowmem
    reserve as well as the high watermark of the zone into account, and a
    global sum of those per-zone values that is subtracted from the global
    amount of dirtyable pages. The lowmem reserve is unavailable to page
    cache allocations and kswapd tries to keep the high watermark free. We
    don't want to end up in a situation where reclaim has to clean pages in
    order to balance zones.

    Not treating reserved pages as dirtyable on a global level is only a
    conceptual fix. In reality, dirty pages are not distributed equally
    across zones and reclaim runs into dirty pages on a regular basis.

    But it is important to get this right before tackling the problem on a
    per-zone level, where the distance between reclaim and the dirty pages is
    mostly much smaller in absolute numbers.
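
    The accounting change can be sketched as below. This is a simplified
    userspace model, not the patch itself: struct zone_counters,
    zone_dirty_reserve() and global_dirtyable() are hypothetical names, and
    the reserve is assumed here to be simply the zone's lowmem reserve plus
    its high watermark.

```c
#include <stddef.h>

/* Hypothetical, simplified per-zone counters; not the kernel's struct zone. */
struct zone_counters {
	unsigned long free_pages;
	unsigned long reclaimable_pages;
	unsigned long lowmem_reserve;	/* unavailable to page cache */
	unsigned long high_wmark;	/* kswapd tries to keep this free */
};

/* Pages of a zone that should never be counted as dirtyable. */
unsigned long zone_dirty_reserve(const struct zone_counters *z)
{
	return z->lowmem_reserve + z->high_wmark;
}

/*
 * Global dirtyable memory: free plus reclaimable pages of all zones,
 * minus the sum of the per-zone dirty reserves (clamped at zero).
 */
unsigned long global_dirtyable(const struct zone_counters *zones, size_t n)
{
	unsigned long total = 0, reserve = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		total += zones[i].free_pages + zones[i].reclaimable_pages;
		reserve += zone_dirty_reserve(&zones[i]);
	}
	return total > reserve ? total - reserve : 0;
}
```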

    [akpm@linux-foundation.org: fix highmem build]
    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Michal Hocko
    Reviewed-by: Minchan Kim
    Acked-by: Mel Gorman
    Cc: KAMEZAWA Hiroyuki
    Cc: Christoph Hellwig
    Cc: Wu Fengguang
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Shaohua Li
    Cc: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Calling alloc_pages_exact_node() means the allocation only passes the
    zonelist of a single node into the page allocator. If that node isn't
    online, its zonelist may never have been initialized, causing a strange
    oops whose cause may not be immediately clear.

    I recently debugged an issue where node 0 wasn't online and an allocator
    was passing 0 to alloc_pages_exact_node() and it resulted in a NULL
    pointer on zonelist->_zoneref. If CONFIG_DEBUG_VM is enabled, though, it
    would be nice to catch this a bit earlier.
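
    The kind of early check meant here can be modelled in userspace as
    below. Everything in the sketch is an assumption for illustration:
    MAX_NODES_MODEL, node_online_model[] and exact_node_ok() are hypothetical
    names, and the online map is hard-coded so that node 0 is offline, as in
    the case described above.

```c
#include <stdbool.h>

#define MAX_NODES_MODEL 4	/* hypothetical node count for this sketch */

/* Hypothetical online map: only nodes 1 and 2 are online. */
static const bool node_online_model[MAX_NODES_MODEL] = {
	false, true, true, false
};

/*
 * True if nid is within range and the node is online, i.e. its zonelist
 * can be expected to be initialized. A debug build would assert on this
 * before handing the node's zonelist to the allocator.
 */
bool exact_node_ok(int nid)
{
	return nid >= 0 && nid < MAX_NODES_MODEL && node_online_model[nid];
}
```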

    Signed-off-by: David Rientjes
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • With CONFIG_DEBUG_PAGEALLOC configured, the CPU will generate an exception
    on access (read, write) to an unallocated page, which permits us to catch
    code which corrupts memory. However, the kernel is trying to maximise
    memory usage, hence there are usually few free pages in the system and
    buggy code usually corrupts some crucial data.

    This patch changes the buddy allocator to keep more free/protected pages
    and to interlace free/protected and allocated pages to increase the
    probability of catching corruption.

    When the kernel is compiled with CONFIG_DEBUG_PAGEALLOC,
    debug_guardpage_minorder defines the minimum order used by the page
    allocator to grant a request. The requested size will be returned with
    the remaining pages used as guard pages.

    The default value of debug_guardpage_minorder is zero: no change from
    current behaviour.
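
    As a back-of-the-envelope model of the description above (not the
    allocator's actual splitting logic), the number of guard pages left over
    for one request can be computed as follows. guard_pages_for() is a
    hypothetical helper name used only for this sketch.

```c
/*
 * Simplified model: a request of 2^order pages is served from a block of
 * at least 2^minorder pages; whatever the request does not use is kept as
 * guard pages. With minorder == 0 there are never any guard pages.
 */
unsigned long guard_pages_for(unsigned int order, unsigned int minorder)
{
	unsigned int block_order = order > minorder ? order : minorder;

	return (1UL << block_order) - (1UL << order);
}
```

    For example, a single-page request (order 0) with a minimum order of 2
    is modelled as taking a 4-page block and leaving 3 guard pages.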

    [akpm@linux-foundation.org: tweak documentation, s/flg/flag/]
    Signed-off-by: Stanislaw Gruszka
    Cc: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: "Rafael J. Wysocki"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stanislaw Gruszka
     
  • BUILD_BUG() can be placed in definitions that we expect the compiler to
    remove by dead code elimination. If the assertion fails, we get a nice
    error message at build time.

    The GCC function attribute error("message") was added in version 4.3, so
    we define a new macro __linktime_error(message) to expand to this for
    GCC-4.3 and later. This will give us an error diagnostic from the
    compiler on the line that fails. For other compilers
    __linktime_error(message) expands to nothing, and we have to be content
    with a link time error, but at least we will still get a build error.

    BUILD_BUG() expands to the undefined function __build_bug_failed() and
    will fail at link time if the compiler ever emits code for it. On GCC-4.3
    and later, attribute((error())) is used so that the failure will be noted
    at compile time instead.
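
    A userspace sketch of the same idea follows. The names LINKTIME_ERROR,
    build_bug_failed() and MY_BUILD_BUG() are hypothetical stand-ins, not
    the kernel's definitions, and the example relies on the compiler
    discarding a branch whose condition is a compile-time constant false.

```c
#include <stddef.h>

/* Use the GCC "error" function attribute where available (GCC 4.3+). */
#if defined(__GNUC__) && !defined(__clang__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#define LINKTIME_ERROR(msg) __attribute__((error(msg)))
#else
#define LINKTIME_ERROR(msg)
#endif

/* Declared but deliberately never defined anywhere. */
extern void build_bug_failed(void) LINKTIME_ERROR("MY_BUILD_BUG failed");

/* Any call that survives to code generation breaks the build. */
#define MY_BUILD_BUG() build_bug_failed()

/*
 * The condition is a constant that is false on any conforming C
 * implementation, so the call is expected to be removed as dead code
 * and the program builds and links normally.
 */
size_t checked_long_size(void)
{
	if (sizeof(long) < sizeof(int))
		MY_BUILD_BUG();
	return sizeof(long);
}
```

    If the condition were ever true, GCC 4.3 or later would report the error
    message at compile time, while other compilers would fail at link time
    with an undefined reference to build_bug_failed().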

    Signed-off-by: David Daney
    Acked-by: David Rientjes
    Cc: DM
    Cc: Ralf Baechle
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Daney
     
  • Colin Cross reported:

    Under the following conditions, __alloc_pages_slowpath can loop forever:
    gfp_mask & __GFP_WAIT is true
    gfp_mask & __GFP_FS is false
    reclaim and compaction make no progress
    order
    Signed-off-by: Mel Gorman
    Acked-by: David Rientjes
    Cc: Minchan Kim
    Cc: Pekka Enberg
    Cc: KAMEZAWA Hiroyuki
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Rename mm_page_free_direct into mm_page_free and mm_pagevec_free into
    mm_page_free_batched

    Since v2.6.33-5426-gc475dab the kernel triggers mm_page_free_direct for
    all freed pages, not only for directly freed ones. So, let's name it
    properly. For pages freed via a page list we also trigger the
    mm_page_free_batched event.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Reviewed-by: Minchan Kim
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • It is not exported and nobody uses it anymore.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Reviewed-by: Minchan Kim
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • This patch adds the helper free_hot_cold_page_list() to free a list of
    0-order pages. It frees pages directly from the list without a temporary
    page-vector. It also calls trace_mm_pagevec_free() to simulate
    pagevec_free() behaviour.

    bloat-o-meter:

    add/remove: 1/1 grow/shrink: 1/3 up/down: 267/-295 (-28)
    function                                     old     new   delta
    free_hot_cold_page_list                        -     264    +264
    get_page_from_freelist                      2129    2132      +3
    __pagevec_free                               243     239      -4
    split_free_page                              380     373      -7
    release_pages                                606     510     -96
    free_page_list                               188       -    -188

    Signed-off-by: Konstantin Khlebnikov
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Acked-by: Minchan Kim
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • The tracing ring-buffer used this function briefly, but not anymore.
    Make it local to the writeback code again.

    Also, move the function so that no forward declaration needs to be
    reintroduced.

    Signed-off-by: Johannes Weiner
    Acked-by: Mel Gorman
    Reviewed-by: Michal Hocko
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

10 Jan, 2012

18 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: new helper - d_make_root()
    dcache: use a dispose list in select_parent
    ceph: d_alloc_root() may fail
    ext4: fix failure exits
    isofs: inode leak on mount failure

    Linus Torvalds
     
  • d_alloc_root() with iput() in case of allocation failure...

    Signed-off-by: Al Viro

    Al Viro
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    igmp: Avoid zero delay when receiving odd mixture of IGMP queries
    netdev: make net_device_ops const
    bcm63xx: make ethtool_ops const
    usbnet: make ethtool_ops const
    net: Fix build with INET disabled.
    net: introduce netif_addr_lock_nested() and call if when appropriate
    net: correct lock name in dev_[uc/mc]_sync documentations.
    net: sk_update_clone is only used in net/core/sock.c
    8139cp: fix missing napi_gro_flush.
    pktgen: set correct max and min in pktgen_setup_inject()
    smsc911x: Unconditionally include linux/smscphy.h in smsc911x.h
    asix: fix infinite loop in rx_fixup()
    net: Default UDP and UNIX diag to 'n'.
    r6040: fix typo in use of MCR0 register bits
    net: fix sock_clone reference mismatch with tcp memcontrol

    Linus Torvalds
     
  • clock management changes for i.MX

    Another simple series related to clock management, this time only for
    imx.

    * tag 'clk' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    ARM: mxs: select HAVE_CLK_PREPARE for clock
    clk: add config option HAVE_CLK_PREPARE into Kconfig
    ASoC: mxs-saif: convert to clk_prepare/clk_unprepare
    video: mxsfb: convert to clk_prepare/clk_unprepare
    serial: mxs-auart: convert to clk_prepare/clk_unprepare
    net: flexcan: convert to clk_prepare/clk_unprepare
    mtd: gpmi-lib: convert to clk_prepare/clk_unprepare
    mmc: mxs-mmc: convert to clk_prepare/clk_unprepare
    dma: mxs-dma: convert to clk_prepare/clk_unprepare
    net: fec: add clk_prepare/clk_unprepare
    ARM: mxs: convert platform code to clk_prepare/clk_unprepare
    clk: add helper functions clk_prepare_enable and clk_disable_unprepare

    Fix up trivial conflicts in drivers/net/ethernet/freescale/fec.c due to
    commit 0ebafefcaa7a ("net: fec: add clk_prepare/clk_unprepare") clashing
    trivially with commit e163cc97f9ac ("net/fec: fix the .remove code").

    Linus Torvalds
     
  • Driver specific changes

    Again, a lot of platforms have changes in here: pxa, samsung, omap,
    at91, imx, ...

    * tag 'drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
    ARM: sa1100: clean up of the clock support
    ARM: pxa: add dummy clock for sa1100-rtc
    RTC: sa1100: support sa1100, pxa and mmp soc families
    RTC: sa1100: remove redundant code of setting alarm
    RTC: sa1100: Clean out ost register
    Input: zylonite-wm97xx - replace IRQ_GPIO() with gpio_to_irq()
    pcmcia: pxa: replace IRQ_GPIO() with gpio_to_irq()
    ARM: EXYNOS: Modified files for SPI consolidation work
    ARM: S5P64X0: Enable SDHCI support
    ARM: S5P64X0: Add lookup of sdhci-s3c clocks using generic names
    ARM: S5P64X0: Add HSMMC setup for host Controller
    ARM: EXYNOS: Add USB OHCI support to ORIGEN board
    USB: Add Samsung Exynos OHCI diver
    ARM: EXYNOS: Add USB OHCI support to SMDKV310 board
    ARM: EXYNOS: Add USB OHCI device
    net: macb: fix build break with !CONFIG_OF
    i2c: tegra: Support DVC controller in device tree
    i2c: tegra: Add __devinit/exit to probe/remove
    net/at91_ether: use gpio_is_valid for phy IRQ line
    ARM: at91/net: add macb ethernet controller in 9g45/9g20 DT
    ...

    Linus Torvalds
     
  • New feature development

    This adds support for new features, and contains stuff from most
    platforms. A number of these patches could have fit into other
    branches, too, but were small enough not to cause too much
    confusion here.

    * tag 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
    mfd/db8500-prcmu: remove support for early silicon revisions
    ARM: ux500: fix the smp_twd clock calculation
    ARM: ux500: remove support for early silicon revisions
    ARM: ux500: update register files
    ARM: ux500: register DB5500 PMU dynamically
    ARM: ux500: update ASIC detection for U5500
    ARM: ux500: support DB8520
    ARM: picoxcell: implement watchdog restart
    ARM: OMAP3+: hwmod data: Add the default clockactivity for I2C
    ARM: OMAP3: hwmod data: disable multiblock reads on MMC1/2 on OMAP34xx/35xx <= ES2.1
    ARM: OMAP: USB: EHCI and OHCI hwmod structures for OMAP4
    ARM: OMAP: USB: EHCI and OHCI hwmod structures for OMAP3
    ARM: OMAP: hwmod data: Add support for AM35xx UART4/ttyO3
    ARM: Orion: Remove address map info from all platform data structures
    ARM: Orion: Get address map from plat-orion instead of via platform_data
    ARM: Orion: mbus_dram_info consolidation
    ARM: Orion: Consolidate the address map setup
    ARM: Kirkwood: Add configuration for MPP12 as GPIO
    ARM: Kirkwood: Recognize A1 revision of 6282 chip
    ARM: ux500: update the MOP500 GPIO assignments
    ...

    Linus Torvalds
     
  • Device tree conversions for samsung and tegra

    Both platforms had some initial device tree support, but this adds
    much more to actually make it usable.

    * tag 'dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (45 commits)
    ARM: dts: Add intial dts file for EXYNOS4210 SoC, SMDKV310 and ORIGEN
    ARM: EXYNOS: Add Exynos4 device tree enabled board file
    rtc: rtc-s3c: Add device tree support
    input: samsung-keypad: Add device tree support
    ARM: S5PV210: Modify platform data for pl330 driver
    ARM: S5PC100: Modify platform data for pl330 driver
    ARM: S5P64x0: Modify platform data for pl330 driver
    ARM: EXYNOS: Add a alias for pdma clocks
    ARM: EXYNOS: Limit usage of pl330 device instance to non-dt build
    ARM: SAMSUNG: Add device tree support for pl330 dma engine wrappers
    DMA: PL330: Add device tree support
    ARM: EXYNOS: Modify platform data for pl330 driver
    DMA: PL330: Infer transfer direction from transfer request instead of platform data
    DMA: PL330: move filter function into driver
    serial: samsung: Fix build for non-Exynos4210 devices
    serial: samsung: add device tree support
    serial: samsung: merge probe() function from all SoC specific extensions
    serial: samsung: merge all SoC specific port reset functions
    ARM: SAMSUNG: register uart clocks to clock lookup list
    serial: samsung: remove all uses of get_clksrc and set_clksrc
    ...

    Fix up fairly trivial conflicts in arch/arm/mach-s3c2440/clock.c and
    drivers/tty/serial/Kconfig both due to just adding code close to
    changes.

    Linus Torvalds
     
  • Cleanups on various subarchitectures

    Cleanup patches for various ARM platforms and some of their associated
    drivers, the bulk of these is for mach-91.

    Arnd ended up pulling in the restart branch from Russell in order to
    fix up some simple but annoying merge conflicts.

    * tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits)
    arm/at91: fix build of stamp9g20
    ARM: u300: delete memory.h
    MAINTAINERS: add maintainer entry for Picochip picoxcell
    ARM: picoxcell: move io mappings to common.c
    ARM: picoxcell: don't reserve irq_descs
    ARM: picoxcell: remove mach/memory.h
    ARM: at91: delete the pcontrol_g20_defconfig
    arm/tegra: Remove code that's ifndef CONFIG_ARM_GIC
    arm/tegra: remove unused defines
    arm/tegra: fix variable formatting in makefile
    ARM: davinci: vpif: move code to driver core header from platform
    ARM: at91/gpio: fix display of number of irq setuped
    ARM: at91/gpio: drop PIN_BASE
    ARM: at91/udc: use gpio_is_valid to check the gpio
    ARM: at91/ohci: use gpio_is_valid to check the gpio
    ARM: at91/nand: use gpio_is_valid to check the gpio
    ARM: at91/mmc: use gpio_is_valid to check the gpio
    ARM: at91/ide: use gpio_is_valid to check the gpio
    ARM: at91/pata: use gpio_is_valid to check the gpio
    ARM: at91/soc: use gpio_is_valid to check the gpio
    ...

    Linus Torvalds
     
  • Including trace/events/*.h TRACE_EVENT() macro headers in other headers
    can cause strange side effects if another trace/event/*.h header
    includes that header. Having trace/events/kmem.h inside slab_def.h
    caused a compile error in sparc64 when changes were done to some header
    files. Moving the kmem.h trace header out of slab.h and into slab.c
    fixes the problem.

    Note, both slub.c and slob.c already include the trace/events/kmem.h
    file. Only slab.c had it missing.

    Link: http://lkml.kernel.org/r/20120105190405.1e3191fb5a43b2a0f1655e1f@canb.auug.org.au

    Reported-by: Stephen Rothwell
    Signed-off-by: Steven Rostedt
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     
  • > net/core/sock.c: In function 'sk_update_clone':
    > net/core/sock.c:1278:3: error: implicit declaration of function 'sock_update_memcg'

    Reported-by: Randy Dunlap
    Signed-off-by: David S. Miller

    David S. Miller
     
  • * 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: Remove irqsafe_cpu_xxx variants

    Fix up conflict in arch/x86/include/asm/percpu.h due to clash with
    cebef5beed3d ("x86: Fix and improve percpu_cmpxchg{8,16}b_double()")
    which edited the (now removed) irqsafe_cpu_cmpxchg*_double code.

    Linus Torvalds
     
  • * 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
    cgroup: fix to allow mounting a hierarchy by name
    cgroup: move assignement out of condition in cgroup_attach_proc()
    cgroup: Remove task_lock() from cgroup_post_fork()
    cgroup: add sparse annotation to cgroup_iter_start() and cgroup_iter_end()
    cgroup: mark cgroup_rmdir_waitq and cgroup_attach_proc() as static
    cgroup: only need to check oldcgrp==newgrp once
    cgroup: remove redundant get/put of task struct
    cgroup: remove redundant get/put of old css_set from migrate
    cgroup: Remove unnecessary task_lock before fetching css_set on migration
    cgroup: Drop task_lock(parent) on cgroup_fork()
    cgroups: remove redundant get/put of css_set from css_set_check_fetched()
    resource cgroups: remove bogus cast
    cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
    cgroup, cpuset: don't use ss->pre_attach()
    cgroup: don't use subsys->can_attach_task() or ->attach_task()
    cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
    cgroup: improve old cgroup handling in cgroup_attach_proc()
    cgroup: always lock threadgroup during migration
    threadgroup: extend threadgroup_lock() to cover exit and exec
    threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem
    ...

    Fix up conflict in kernel/cgroup.c due to commit e0197aae59e5: "cgroups:
    fix a css_set not found bug in cgroup_attach_proc" that already
    mentioned that the bug is fixed (differently) in Tejun's cgroup
    patchset. This one, in other words.

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     
  • dev_uc_sync() and dev_mc_sync() acquire netif_addr_lock for the
    destination device of the synchronization. Since netif_addr_lock is
    already held for the source device at that time, this triggers a lockdep
    deadlock warning.

    This deadlock cannot actually happen, so use spin_lock_nested() to
    silence the warning.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • * 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (466 commits)
    net/hyperv: Add support for jumbo frame up to 64KB
    net/hyperv: Add NETVSP protocol version negotiation
    net/hyperv: Remove unnecessary kmap_atomic in netvsc driver
    staging/rtl8192e: Register against lib80211
    staging/rtl8192e: Convert to lib80211_crypt_info
    staging/rtl8192e: Convert to lib80211_crypt_data and lib80211_crypt_ops
    staging/rtl8192e: Add lib80211.h to rtllib.h
    staging/mei: add watchdog device registration wrappers
    drm/omap: GEM, deal with cache
    staging: vt6656: int.c, int.h: Change return of function to void
    staging: usbip: removed unused definitions from header
    staging: usbip: removed dead code from receive function
    staging:iio: Drop {mark,unmark}_in_use callbacks
    staging:iio: Drop buffer mark_param_change callback
    staging:iio: Drop the unused buffer enable() and is_enabled() callbacks
    staging:iio: Drop buffer busy flag
    staging:iio: Make sure a device is only opened once at a time
    staging:iio: Disallow modifying buffer size when buffer is enabled
    staging:iio: Disallow changing scan elements in all buffered modes
    staging:iio: Use iio_buffer_enabled instead of open coding it
    ...

    Fix up conflict in drivers/staging/iio/adc/ad799x_core.c (removal of
    module_init due to using module_i2c_driver() helper, next to removal of
    MODULE_ALIAS due to using MODULE_DEVICE_TABLE instead).

    Linus Torvalds
     
  • * 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (232 commits)
    USB: Add USB-ID for Multiplex RC serial adapter to cp210x.c
    xhci: Clean up 32-bit build warnings.
    USB: update documentation for usbmon
    usb: usb-storage doesn't support dynamic id currently, the patch disables the feature to fix an oops
    drivers/usb/class/cdc-acm.c: clear dangling pointer
    drivers/usb/dwc3/dwc3-pci.c: introduce missing kfree
    drivers/usb/host/isp1760-if.c: introduce missing kfree
    usb: option: add ZD Incorporated HSPA modem
    usb: ch9: fix up MaxStreams helper
    USB: usb-skeleton.c: cleanup open_count
    USB: usb-skeleton.c: fix open/disconnect race
    xhci: Properly handle COMP_2ND_BW_ERR
    USB: remove dead code from suspend/resume path
    USB: add quirk for another camera
    drivers: usb: wusbcore: Fix dependency for USB_WUSB
    xhci: Better debugging for critical host errors.
    xhci: Be less verbose during URB cancellation.
    xhci: Remove debugging about ring structure allocation.
    xhci: Remove debugging about toggling cycle bits.
    xhci: Remove debugging for individual transfers.
    ...

    Linus Torvalds
     
  • * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (65 commits)
    tty: serial: imx: move del_timer_sync() to avoid potential deadlock
    imx: add polled io uart methods
    imx: Add save/restore functions for UART control regs
    serial/imx: let probing fail for the dt case without a valid alias
    serial/imx: propagate error from of_alias_get_id instead of using -ENODEV
    tty: serial: imx: Allow UART to be a source for wakeup
    serial: driver for m32 arch should not have DEC alpha errata
    serial/documentation: fix documented name of DCD cpp symbol
    atmel_serial: fix spinlock lockup in RS485 code
    tty: Fix memory leak in virtual console when enable unicode translation
    serial: use DIV_ROUND_CLOSEST instead of open coding it
    serial: add support for 400 and 800 v3 series Titan cards
    serial: bfin-uart: Remove ASYNC_CTS_FLOW flag for hardware automatic CTS.
    serial: bfin-uart: Enable hardware automatic CTS only when CTS pin is available.
    serial: make FSL errata depend on 8250_CONSOLE, not just 8250
    serial: add irq handler for Freescale 16550 errata.
    serial: manually inline serial8250_handle_port
    serial: make 8250 timeout use the specified IRQ handler
    serial: export the key functions for an 8250 IRQ handler
    serial: clean up parameter passing for 8250 Rx IRQ handling
    ...

    Linus Torvalds
     
  • Conflicts:
    arch/arm/mach-mxs/include/mach/common.h

    Pull in previous samsung conflict merges and do a trivial
    merge of an mxs double-add conflict.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

09 Jan, 2012

8 commits

  • The j_barrier mutex is used for serializing different journal lock
    operations. The problem with it is that, for example, the FIFREEZE ioctl
    results in a process leaving the kernel with the j_barrier mutex held, which
    makes lockdep complain. The hibernation code also wants to freeze the
    filesystem, but it then cannot hibernate the system because the mutex is
    still locked.

    So we remove the j_barrier mutex and wait directly on j_barrier_count
    instead. Since locking the journal is a rare operation, we don't have to
    care about fairness or similar concerns.

    CC: Andrew Morton
    Acked-by: Joel Becker
    Signed-off-by: Jan Kara

    Jan Kara
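
The idea in the commit above (wait on a counter rather than hold a mutex) can be illustrated with a small user-space sketch. This is not the kernel code: `struct journal_model` and the `journal_model_*` functions are hypothetical names invented for illustration, and C11 atomics with `sched_yield()` stand in for whatever wait primitive the real patch uses. The point it shows is that a counter has no owner, so the context that took the barrier can return without any lock debugging tool seeing a held lock, and no fairness is attempted.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the journal; only the barrier counter is modelled. */
struct journal_model {
    atomic_int barrier_count;   /* 0 = journal unlocked, 1 = journal locked */
};

static void journal_model_init(struct journal_model *j)
{
    atomic_init(&j->barrier_count, 0);
}

/* Make one attempt to take the barrier; returns true on success. */
static bool journal_model_try_lock(struct journal_model *j)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&j->barrier_count, &expected, 1);
}

/* Wait directly on the counter. There is no queue, hence no fairness. */
static void journal_model_lock(struct journal_model *j)
{
    while (!journal_model_try_lock(j))
        sched_yield();
}

/* Drop the barrier. Any context may do this, since a counter has no owner. */
static void journal_model_unlock(struct journal_model *j)
{
    assert(atomic_load(&j->barrier_count) == 1);
    atomic_store(&j->barrier_count, 0);
}
```

Because the state is just a count, a freeze-style operation can call `journal_model_lock()` in one call and `journal_model_unlock()` in a later, separate call without anything remaining "held" in between from a lock tracker's point of view.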
     
  • so move it there. Fixes build errors when CONFIG_INET is not defined:

    In file included from include/linux/tcp.h:211:0,
    from include/linux/ipv6.h:221,
    from include/net/ipv6.h:16,
    from include/linux/sunrpc/clnt.h:26,
    from include/linux/nfs_fs.h:50,
    from init/do_mounts.c:20:
    include/net/sock.h: In function 'sk_update_clone':
    include/net/sock.h:1109:3: error: implicit declaration of function 'sock_update_memcg' [-Werror=implicit-function-declaration]

    Signed-off-by: Stephen Rothwell
    Signed-off-by: David S. Miller

    Stephen Rothwell
     
  • infiniband changes for 3.3 merge window

    * tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    rdma/core: Fix sparse warnings
    RDMA/cma: Fix endianness bugs
    RDMA/nes: Fix terminate during AE
    RDMA/nes: Make unnecessarily global nes_set_pau() static
    RDMA/nes: Change MDIO bus clock to 2.5MHz
    IB/cm: Fix layout of APR message
    IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
    IB/qib: Default some module parameters optimally
    IB/qib: Optimize locking for get_txreq()
    IB/qib: Fix a possible data corruption when receiving packets
    IB/qib: Eliminate 64-bit jiffies use
    IB/qib: Fix style issues
    IB/uverbs: Protect QP multicast list

    Linus Torvalds
     
  • * 'dma-buf-merge' of git://people.freedesktop.org/~airlied/linux:
    dma-buf: mark EXPERIMENTAL for 1st release.
    dma-buf: Documentation for buffer sharing framework
    dma-buf: Introduce dma buffer sharing mechanism

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (36 commits)
    mfd: Clearing events requires event registers to be writable for da9052-core
    mfd: Fix annotations in da9052-core
    gpiolib: Mark da9052 driver broken
    mfd: Declare da9052_regmap_config for the bus drivers
    MFD: DA9052/53 MFD core module add SPI support v2
    MFD: DA9052/53 MFD core module
    regmap: Add irq_base accessor to regmap_irq
    regmap: Allow drivers to reinitialise the register cache at runtime
    regmap: Add trace event for successful cache reads
    regmap: Allow regmap_update_bits() users to detect changes
    regmap: Report if we actually handled an interrupt in regmap-irq
    regmap: Fix rbtreee build when not using debugfs
    regmap: Provide debugfs dump of the rbtree cache data
    regmap: Do debugfs init before cache init
    regmap: Suppress noop writes in regmap_update_bits()
    regmap: Remove indexed cache type
    regmap: Drop check whether a register is readable in regcache_read
    regmap: Properly round cache_word_size
    regmap: Add support for 10/14 register formating
    regmap: Try cached read before checking if a hardware read is possible
    ...

    Linus Torvalds
     
  • md update for 3.3

    The big change is the new hot-replacement feature.
    A slot in an array can hold 2 devices: one that
    wants replacement and one that is the replacement.
    Once the replacement is built, either from the
    original or (in the case of errors) from elsewhere,
    the wants-replacement device will be removed.

    * tag 'md-3.3' of git://neil.brown.name/md: (36 commits)
    md/raid1: Mark device want_replacement when we see a write error.
    md/raid1: If there is a spare and a want_replacement device, start replacement.
    md/raid1: recognise replacements when assembling arrays.
    md/raid1: handle activation of replacement device when recovery completes.
    md/raid1: Allow a failed replacement device to be removed.
    md/raid1: Allocate spare to store replacement devices and their bios.
    md/raid1: Replace use of mddev->raid_disks with conf->raid_disks.
    md/raid10: If there is a spare and a want_replacement device, start replacement.
    md/raid10: recognise replacements when assembling array.
    md/raid10: Allow replacement device to be replace old drive.
    md/raid10: handle recovery of replacement devices.
    md/raid10: Handle replacement devices during resync.
    md/raid10: writes should get directed to replacement as well as original.
    md/raid10: allow removal of failed replacement devices.
    md/raid10: preferentially read from replacement device if possible.
    md/raid10: change read_balance to return an rdev
    md/raid10: prepare data structures for handling replacement.
    md/raid5: Mark device want_replacement when we see a write error.
    md/raid5: If there is a spare and a want_replacement device, start replacement.
    md/raid5: recognise replacements when assembling array.
    ...

    Linus Torvalds
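
The two-devices-per-slot description in the md tag message above can be modelled in a few lines. This is only a sketch under stated assumptions: `struct member_dev`, `struct array_slot` and both functions are hypothetical names, not the md data structures, and the rebuild itself (copying from the original or reconstructing from elsewhere) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one member device. */
struct member_dev {
    const char *name;
    bool want_replacement;      /* set when the device should be replaced */
};

/* Hypothetical model of one array slot, which may hold two devices. */
struct array_slot {
    struct member_dev *original;
    struct member_dev *replacement;
};

/*
 * Attach a spare as the replacement. Only allowed when the original is
 * flagged want_replacement and the slot has no replacement yet.
 */
static bool slot_add_replacement(struct array_slot *s, struct member_dev *spare)
{
    if (!s->original || !s->original->want_replacement || s->replacement)
        return false;
    s->replacement = spare;
    return true;
}

/*
 * Called once the replacement has been fully built: the replacement takes
 * over the slot and the old device is returned so the caller can remove it.
 */
static struct member_dev *slot_finish_replacement(struct array_slot *s)
{
    struct member_dev *old = s->original;

    assert(s->replacement != NULL);
    s->original = s->replacement;
    s->replacement = NULL;
    return old;
}
```

In this model the slot never drops to zero devices: the original stays in place until the replacement is complete, which is the property the tag message describes.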
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     
  • * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
    PM / Hibernate: Implement compat_ioctl for /dev/snapshot
    PM / Freezer: fix return value of freezable_schedule_timeout_killable()
    PM / shmobile: Allow the A4R domain to be turned off at run time
    PM / input / touchscreen: Make st1232 use device PM QoS constraints
    PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
    PM / shmobile: Remove the stay_on flag from SH7372's PM domains
    PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
    PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
    PM: Drop generic_subsys_pm_ops
    PM / Sleep: Remove forward-only callbacks from AMBA bus type
    PM / Sleep: Remove forward-only callbacks from platform bus type
    PM: Run the driver callback directly if the subsystem one is not there
    PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
    PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
    PM / Sleep: Merge internal functions in generic_ops.c
    PM / Sleep: Simplify generic system suspend callbacks
    PM / Hibernate: Remove deprecated hibernation snapshot ioctls
    PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
    ARM: S3C64XX: Implement basic power domain support
    PM / shmobile: Use common always on power domain governor
    ...

    Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
    XBT_FORCE_SLEEP bit

    Linus Torvalds