22 Oct, 2010

2 commits

  • …it/konrad/swiotlb-2.6

    * 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
    swiotlb: Use page alignment for early buffer allocation
    swiotlb: make io_tlb_overflow static

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (163 commits)
    tracing: Fix compile issue for trace_sched_wakeup.c
    [S390] hardirq: remove pointless header file includes
    [IA64] Move local_softirq_pending() definition
    perf, powerpc: Fix power_pmu_event_init to not use event->ctx
    ftrace: Remove recursion between recordmcount and scripts/mod/empty
    jump_label: Add COND_STMT(), reducer wrappery
    perf: Optimize sw events
    perf: Use jump_labels to optimize the scheduler hooks
    jump_label: Add atomic_t interface
    jump_label: Use more consistent naming
    perf, hw_breakpoint: Fix crash in hw_breakpoint creation
    perf: Find task before event alloc
    perf: Fix task refcount bugs
    perf: Fix group moving
    irq_work: Add generic hardirq context callbacks
    perf_events: Fix transaction recovery in group_sched_in()
    perf_events: Fix bogus AMD64 generic TLB events
    perf_events: Fix bogus context time tracking
    tracing: Remove parent recording in latency tracer graph options
    tracing: Use one prologue for the preempt irqs off tracer function tracers
    ...

    Linus Torvalds
     

19 Oct, 2010

1 commit


12 Oct, 2010

2 commits

  • We could call free_bootmem_late() if swiotlb is not used, and
    it will shrink to page alignment.

    So allocate them with page alignment in the first place, to avoid losing two pages.

    before patch:
    [ 0.000000] memblock_x86_reserve_range: [00d3600000, 00d7600000] swiotlb buffer
    [ 0.000000] memblock_x86_reserve_range: [00d7e7ef40, 00d7e9ef40] swiotlb list
    [ 0.000000] memblock_x86_reserve_range: [00d7e3ef40, 00d7e7ef40] swiotlb orig_ad
    [ 0.000000] memblock_x86_reserve_range: [000008a000, 0000092000] swiotlb overflo

    after patch:
    [ 0.000000] memblock_x86_reserve_range: [00d3600000, 00d7600000] swiotlb buffer
    [ 0.000000] memblock_x86_reserve_range: [00d7e7e000, 00d7e9e000] swiotlb list
    [ 0.000000] memblock_x86_reserve_range: [00d7e3e000, 00d7e7e000] swiotlb orig_ad
    [ 0.000000] memblock_x86_reserve_range: [000008a000, 0000092000] swiotlb overflo

    Signed-off-by: Yinghai Lu
    Acked-by: FUJITA Tomonori
    Cc: Becky Bruce
    Signed-off-by: Konrad Rzeszutek Wilk

    Yinghai Lu
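
    The effect of the unaligned start addresses in the "before" log can be checked with a little arithmetic. The following is a minimal user-space sketch, not kernel code: it assumes a 4 KiB page, and the `sketch_*` names are hypothetical helpers that count how many whole pages lie inside a reserved range, which is all a page-granular free can return.

```c
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE UINT64_C(0x1000)

/* Round an address up to the next page boundary. */
static uint64_t sketch_align_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Round an address down to the previous page boundary. */
static uint64_t sketch_align_down(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Number of whole pages lying entirely inside [start, end). */
static uint64_t sketch_whole_pages(uint64_t start, uint64_t end)
{
	uint64_t first = sketch_align_up(start);
	uint64_t last = sketch_align_down(end);

	return last > first ? (last - first) / SKETCH_PAGE_SIZE : 0;
}
```

    Applied to the "swiotlb list" range, the unaligned range from the "before" log contains fewer whole pages than the aligned range from the "after" log, because the partial pages at both ends cannot be handed back.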
     
  • We don't need to export io_tlb_overflow_buffer. I'll remove
    io_tlb_overflow_buffer completely in the long term though.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Konrad Rzeszutek Wilk

    FUJITA Tomonori
     

08 Oct, 2010

1 commit


07 Oct, 2010

2 commits


06 Oct, 2010

1 commit

  • With all the recent module loading cleanups, we've minimized the code
    that sits under module_mutex, fixing various deadlocks and making it
    possible to do most of the module loading in parallel.

    However, that whole conversion totally missed the rather obscure code
    that adds a new module to the list for BUG() handling. That code was
    doubly obscure because (a) the code itself lives in lib/bug.c (for
    dubious reasons) and (b) it gets called from the architecture-specific
    "module_finalize()" rather than from generic code.

    Calling it from arch-specific code makes no sense what-so-ever to begin
    with, and is now actively wrong since that code isn't protected by the
    module loading lock any more.

    So this commit moves the "module_bug_{finalize,cleanup}()" calls away
    from the arch-specific code, and into the generic code - and in the
    process protects it with the module_mutex so that the list operations
    are now safe.

    Future fixups:
    - move the module list handling code into kernel/module.c where it
    belongs.
    - get rid of 'module_bug_list' and just use the regular list of modules
    (called 'modules' - imagine that) that we already create and maintain
    for other reasons.

    Reported-and-tested-by: Thomas Gleixner
    Cc: Rusty Russell
    Cc: Adrian Bunk
    Cc: Andrew Morton
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
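
    The locking rule described in this commit (every add to or removal from the shared list happens under one mutex) can be illustrated in user space. This is only a sketch under stated assumptions: it uses a pthread mutex in place of module_mutex, a hand-rolled singly linked list in place of the kernel list, and all `sketch_*` names are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

struct sketch_module {
	const char *name;
	struct sketch_module *next;
};

/* Stand-ins for module_mutex and the BUG() module list. */
static pthread_mutex_t sketch_module_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_module *sketch_bug_list;

/* Add a module to the list; the list is only touched with the lock held. */
static void sketch_bug_finalize(struct sketch_module *mod)
{
	pthread_mutex_lock(&sketch_module_mutex);
	mod->next = sketch_bug_list;
	sketch_bug_list = mod;
	pthread_mutex_unlock(&sketch_module_mutex);
}

/* Remove a module from the list under the same lock. */
static void sketch_bug_cleanup(struct sketch_module *mod)
{
	struct sketch_module **pp;

	pthread_mutex_lock(&sketch_module_mutex);
	for (pp = &sketch_bug_list; *pp; pp = &(*pp)->next) {
		if (*pp == mod) {
			*pp = mod->next;
			break;
		}
	}
	pthread_mutex_unlock(&sketch_module_mutex);
}

/* Count list entries, again under the lock. */
static size_t sketch_bug_count(void)
{
	const struct sketch_module *m;
	size_t n = 0;

	pthread_mutex_lock(&sketch_module_mutex);
	for (m = sketch_bug_list; m; m = m->next)
		n++;
	pthread_mutex_unlock(&sketch_module_mutex);
	return n;
}
```

    Keeping both the add and the remove path in one place makes it harder for a caller to reach the list without the lock, which is the problem the commit describes for the arch-specific call sites.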
     

02 Oct, 2010

1 commit

  • If the original list is a power of two (POT) in length, the first callback from line 73
    will pass a==b both pointing to the original list_head. This is dangerous
    because the 'list_sort()' user can use 'container_of()' and access the
    "containing" object, which does not necessarily exist for the list head. So
    the user can access RAM which does not belong to him. If this is a write
    access, we can end up with memory corruption.

    Signed-off-by: Don Mullis
    Tested-by: Artem Bityutskiy
    Signed-off-by: Artem Bityutskiy
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
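
    To see why a callback invoked on the bare list head is hazardous, consider what a typical comparison callback does. The sketch below is a user-space illustration with hypothetical `sketch_*` names; it mirrors the general shape of a 'container_of()'-based comparator, not the actual lib/list_sort.c code.

```c
#include <stddef.h>

struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

/* User-space equivalent of container_of(): step back from member to object. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_item {
	int key;
	struct sketch_list_head node;	/* embedded list linkage */
};

/*
 * Comparison callback in the style a list_sort() user might write.
 * It is only valid when both a and b are embedded in a sketch_item.
 * A bare list head is not embedded in anything, so stepping back from
 * it would yield a pointer to memory that is not a sketch_item.
 */
static int sketch_cmp(void *priv, struct sketch_list_head *a,
		      struct sketch_list_head *b)
{
	struct sketch_item *ia = sketch_container_of(a, struct sketch_item, node);
	struct sketch_item *ib = sketch_container_of(b, struct sketch_item, node);

	(void)priv;
	return (ia->key > ib->key) - (ia->key < ib->key);
}
```

    The callback has no way to tell a real node from the list head, so the sort itself has to guarantee that it never passes the head.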
     

24 Sep, 2010

1 commit


23 Sep, 2010

1 commit


15 Sep, 2010

1 commit


10 Sep, 2010

1 commit

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: Range check cpu in blk_cpu_to_group
    scatterlist: prevent invalid free when alloc fails
    writeback: Fix lost wake-up shutting down writeback thread
    writeback: do not lose wakeup events when forking bdi threads
    cciss: fix reporting of max queue depth since init
    block: switch s390 tape_block and mg_disk to elevator_change()
    block: add function call to switch the IO scheduler from a driver
    fs/bio-integrity.c: return -ENOMEM on kmalloc failure
    bio-integrity.c: remove dependency on __GFP_NOFAIL
    BLOCK: fix bio.bi_rw handling
    block: put dev->kobj in blk_register_queue fail path
    cciss: handle allocation failure
    cfq-iosched: Documentation help for new tunables
    cfq-iosched: blktrace print per slice sector stats
    cfq-iosched: Implement tunable group_idle
    cfq-iosched: Do group share accounting in IOPS when slice_idle=0
    cfq-iosched: Do not idle if slice_idle=0
    cciss: disable doorbell reset on reset_devices
    blkio: Fix return code for mkdir calls

    Linus Torvalds
     

01 Sep, 2010

1 commit

  • When CONFIG_IRQSOFF_TRACER is set and CONFIG_PROVE_LOCKING is not, we
    get the following error:

    $ make oldconfig
    scripts/kconfig/conf --oldconfig arch/x86/Kconfig
    warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)
    warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING)

    This is because IRQSOFF_TRACER selects TRACE_IRQFLAGS but TRACE_IRQFLAGS
    has PROVE_LOCKING as a dependency. This code is incorrect, and
    this patch changes the TRACE_IRQFLAGS to be just a simple bool that
    does not depend or select anything. Instead both IRQSOFF_TRACER and
    PROVE_LOCKING select it.

    Reported-by: Richard Kennedy
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

31 Aug, 2010

1 commit

  • When alloc fails, free_table is being called. Depending on the number of
    bytes requested, we determine if we are going to call __get_free_page()
    or kmalloc(). When alloc fails, our math is wrong (due to sg_size - 1),
    and the last buffer is wrongfully assumed to have been allocated by
    kmalloc. Hence, kfree gets called and a panic occurs.

    Signed-off-by: Jeffrey Carlyle
    Signed-off-by: Olusanya Soyannwo
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Jeffrey Carlyle
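
    The failure mode described above comes from the allocation and free paths disagreeing about the entry count. The following is a simplified model, not the scatterlist code: it assumes the backend is chosen purely by comparing the entry count with a fixed threshold, and the threshold value and `sketch_*` names are hypothetical.

```c
#include <stddef.h>

/* Assumed threshold: a chunk with exactly this many entries fills one page. */
#define SKETCH_MAX_SINGLE_ALLOC 128u

enum sketch_backend {
	SKETCH_BACKEND_PAGE,	/* stands in for the page allocator */
	SKETCH_BACKEND_HEAP	/* stands in for kmalloc()/kfree() */
};

/* Both the alloc and the free path pick a backend from the entry count. */
static enum sketch_backend sketch_backend_for(unsigned int nents)
{
	return nents == SKETCH_MAX_SINGLE_ALLOC ? SKETCH_BACKEND_PAGE
						: SKETCH_BACKEND_HEAP;
}

/* Returns 1 when the free path would choose the backend that allocated. */
static int sketch_free_matches_alloc(unsigned int alloc_nents,
				     unsigned int free_nents)
{
	return sketch_backend_for(alloc_nents) == sketch_backend_for(free_nents);
}
```

    In this model, a chunk allocated with the full count but freed with a count that is one lower is released through the wrong backend, which corresponds to the kfree() on a page-allocated buffer that the commit reports.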
     

30 Aug, 2010

1 commit


24 Aug, 2010

1 commit


23 Aug, 2010

4 commits

  • …/linux-2.6-rcu into core/rcu

    Ingo Molnar
     
  • * 'radix-tree' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
    radix-tree: radix_tree_range_tag_if_tagged() can set incorrect tags
    radix-tree: clear all tags in radix_tree_node_rcu_free

    Linus Torvalds
     
  • Commit ebf8aa44beed48cd17893a83d92a4403e5f9d9e2 ("radix-tree:
    implement function radix_tree_range_tag_if_tagged") does not safely
    set tags on intermediate tree nodes. The code walks down the tree
    setting tags before it has fully resolved the path to the leaf under
    the assumption there will be a leaf slot with the tag set in the
    range it is searching.

    Unfortunately, this is not a valid assumption - we can abort after
    setting a tag on an intermediate node if we overrun the number of
    tags we are allowed to set in a batch, or stop scanning because we
    have passed the last scan index before we reach a leaf slot with
    the tag we are searching for set.

    As a result, we can leave the function with tags set on intermediate
    nodes which can be tripped over later by tag-based lookups. The
    result of these stale tags is that lookup may end prematurely or
    livelock because the lookup cannot make progress.

    The fix for the problem involves recording the traversal path we
    take to the leaf nodes, and only propagating the tags back up the
    tree once the tag is set in the leaf node slot. We are already
    recording the path for efficient traversal, so there is no
    additional overhead to do the intermediate node tag setting in
    this manner.

    This fixes a radix tree lookup livelock triggered by the new
    writeback sync livelock avoidance code introduced in commit
    f446daaea9d4a420d16c606f755f3689dcb2d0ce ("mm: implement writeback
    livelock avoidance using page tagging").

    Signed-off-by: Dave Chinner
    Acked-by: Jan Kara

    Dave Chinner
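
    The ordering the fix relies on (tag the leaf first, then propagate upward) can be shown on a toy tree. This sketch is an assumption-laden simplification: it uses parent pointers instead of a recorded traversal path, a single boolean per node instead of per-slot tag bitmaps, and hypothetical `sketch_*` names.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_rnode {
	struct sketch_rnode *parent;	/* NULL at the root */
	bool tagged;
};

/*
 * Tag a leaf and only then propagate the tag towards the root.
 * If the leaf does not qualify, no node is modified, so no stale
 * tag can be left on an intermediate node.
 */
static bool sketch_tag_leaf(struct sketch_rnode *leaf, bool leaf_matches)
{
	struct sketch_rnode *n;

	if (!leaf_matches)
		return false;

	leaf->tagged = true;
	/* Stop early once an ancestor already carries the tag. */
	for (n = leaf->parent; n && !n->tagged; n = n->parent)
		n->tagged = true;
	return true;
}
```

    Because intermediate nodes are written only after the leaf is tagged, aborting the scan at any point before that leaves the tree unchanged.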
     
  • Commit f446daaea9d4a420d16c606f755f3689dcb2d0ce ("mm: implement
    writeback livelock avoidance using page tagging") introduced a new
    radix tree tag, increasing the number of tags in each node from 2 to
    3. It did not, however, fix up the code in
    radix_tree_node_rcu_free() that cleans up after radix_tree_shrink()
    and hence could leave stray tags set in the new tag array.

    The result is that the livelock avoidance code added in the
    above commit would hit stale tags when doing tag based lookups,
    resulting in livelocks when trying to traverse the tree.

    Fix this problem in radix_tree_node_rcu_free() so it doesn't happen
    again in the future by using a loop to walk all the tags up to
    RADIX_TREE_MAX_TAGS to clear the stray tags radix_tree_shrink()
    leaves behind.

    Signed-off-by: Dave Chinner
    Acked-by: Nick Piggin
    Acked-by: Jan Kara

    Dave Chinner
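
    The idea of looping up to the maximum tag count, rather than clearing a hard-coded number of tags, can be sketched as follows. This is an illustration only: the node layout, the constants and the `sketch_*` names are assumptions, and the sketch clears every bit of every tag rather than reproducing exactly what radix_tree_node_rcu_free() does.

```c
#include <string.h>

/* Assumed sizes for the sketch; the kernel uses RADIX_TREE_MAX_TAGS. */
#define SKETCH_MAX_TAGS 3
#define SKETCH_TAG_LONGS 1

struct sketch_tnode {
	unsigned long tags[SKETCH_MAX_TAGS][SKETCH_TAG_LONGS];
};

/* Clear every tag array; adding a tag later needs no change here. */
static void sketch_clear_all_tags(struct sketch_tnode *node)
{
	int tag;

	for (tag = 0; tag < SKETCH_MAX_TAGS; tag++)
		memset(node->tags[tag], 0, sizeof(node->tags[tag]));
}

/* Returns 1 if any tag bit is still set on the node. */
static int sketch_any_tag_set(const struct sketch_tnode *node)
{
	int tag, i;

	for (tag = 0; tag < SKETCH_MAX_TAGS; tag++)
		for (i = 0; i < SKETCH_TAG_LONGS; i++)
			if (node->tags[tag][i])
				return 1;
	return 0;
}
```

    Driving the loop from the constant is what keeps the cleanup correct when the number of tags grows, as it did from 2 to 3 here.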
     

21 Aug, 2010

1 commit


20 Aug, 2010

4 commits

  • Currently, if RCU CPU stall warnings are enabled, they are enabled
    immediately upon boot. They can be manually disabled via /sys (and
    also re-enabled via /sys), and are automatically disabled upon panic.
    However, some users need RCU CPU stalls to be disabled at boot time,
    but to be enabled without rebuilding/rebooting. For example, someone
    running a real-time application in production might not want the
    additional latency of RCU CPU stall detection in normal operation, but
    might need to enable it at any point for fault isolation purposes.

    This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
    kernel configuration parameter that maintains the current behavior
    (enable at boot) by default, but allows a kernel to be configured
    with RCU CPU stall detection built into the kernel, but disabled at
    boot time.

    Requested-by: Clark Williams
    Requested-by: John Kacur
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Nick Piggin
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Also set the default to 60 seconds, up from the previous hard-coded timeout
    of 10 seconds. This allows people who care to set short timeouts, while
    sparing people with unusual configurations (make randconfig!!!) from being
    bothered with spurious CPU stall warnings.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • This commit provides definitions for the __rcu annotation defined earlier.
    This annotation permits sparse to check for correct use of RCU-protected
    pointers. If a pointer that is annotated with __rcu is accessed
    directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
    or one of their variants), sparse can be made to complain. To enable
    such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
    kernel configuration option. Please note that these sparse complaints are
    intended to be a debugging aid, -not- a code-style-enforcement mechanism.

    There are special rcu_dereference_protected() and rcu_access_pointer()
    accessors for use when RCU read-side protection is not required, for
    example, when no other CPU has access to the data structure in question
    or while the current CPU holds the update-side lock.

    This patch also updates a number of docbook comments that were showing
    their age.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Christopher Li
    Reviewed-by: Josh Triplett

    Paul E. McKenney
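
    The distinction between accessing a shared pointer directly and going through an accessor can be loosely illustrated in user space with C11 atomics. This is an analogy only, not the kernel RCU API and not a sparse annotation: it implements no grace periods, the release/acquire pair merely stands in for the publish and dereference steps, and every `sketch_*` name is hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_config {
	int value;
};

/*
 * The _Atomic qualifier plays a role loosely similar to an annotation:
 * it marks the pointer as one that should go through the accessors below.
 */
static _Atomic(struct sketch_config *) sketch_current;

/* Publish a fully initialised object (stand-in for the assign step). */
static void sketch_publish(struct sketch_config *cfg)
{
	atomic_store_explicit(&sketch_current, cfg, memory_order_release);
}

/* Fetch the current object (stand-in for the dereference step). */
static struct sketch_config *sketch_read(void)
{
	return atomic_load_explicit(&sketch_current, memory_order_acquire);
}
```

    The value of an annotation such as __rcu is that a checker can flag code that bypasses the accessors, which plain C would otherwise accept silently.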
     

17 Aug, 2010

1 commit

  • warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
    SCHED_DEBUG which has unmet direct dependencies (DEBUG_KERNEL &&
    PROC_FS)
    warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
    SCHEDSTATS which has unmet direct dependencies (DEBUG_KERNEL && PROC_FS)

    Add depends on STACKTRACE_SUPPORT for 'select STACKTRACE'.
    Add depends on PROC_FS since that is where the output goes.

    Signed-off-by: Randy Dunlap
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Randy Dunlap
     

13 Aug, 2010

2 commits

  • * 'for-linus' of git://neil.brown.name/md:
    Further tidyup of raid6 naming in lib/raid6
    Make lib/raid6/test build correctly.
    Rename raid6 files now they're in a 'raid6' directory.

    Linus Torvalds
     
  • Don't try and #include <linux/slab.h> in lib/inflate.c from the bootloader code
    as linux/slab.h hauls in function defs that aren't available in the bootloader
    code and may also haul in conflicting functions.

    To fix this, make the inclusion of linux/slab.h contingent on NO_INFLATE_MALLOC
    as are the usages of kmalloc() and kfree().

    In MN10300, this causes the following errors:

    In file included from include/linux/string.h:21,
    from include/linux/bitmap.h:8,
    from include/linux/nodemask.h:93,
    from include/linux/mmzone.h:16,
    from include/linux/gfp.h:4,
    from include/linux/slab.h:12,
    from arch/mn10300/boot/compressed/../../../../lib/inflate.c:106,
    from arch/mn10300/boot/compressed/misc.c:170:
    /warthog/am33/linux-2.6-mn10300/arch/mn10300/include/asm/string.h:19: error: conflicting types for 'memset'
    arch/mn10300/boot/compressed/misc.c:59: error: previous definition of 'memset' was here

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
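
    The pattern of making a header inclusion contingent on the same macro that guards its users can be sketched in user space. The example below is hypothetical throughout: `SKETCH_NO_MALLOC` plays the part of NO_INFLATE_MALLOC, `<stdlib.h>` plays the part of linux/slab.h, and the `sketch_*` names are invented for illustration.

```c
#include <stddef.h>

/*
 * Pull in the hosted allocator only when it is actually going to be used.
 * A build that defines SKETCH_NO_MALLOC must provide its own
 * sketch_alloc() and sketch_free() before this point, and never sees
 * the declarations from <stdlib.h>.
 */
#ifndef SKETCH_NO_MALLOC
#include <stdlib.h>
#define sketch_alloc(n)	malloc(n)
#define sketch_free(p)	free(p)
#endif

/* Allocate a working buffer through whichever allocator is configured. */
static unsigned char *sketch_make_window(size_t size)
{
	return sketch_alloc(size);
}

/* Release a buffer obtained from sketch_make_window(). */
static void sketch_release_window(unsigned char *window)
{
	sketch_free(window);
}
```

    Guarding the include and the allocator calls with one macro keeps a restricted environment from seeing declarations that could conflict with its own definitions.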
     

12 Aug, 2010

2 commits


11 Aug, 2010

6 commits

  • Fix checkstack error:

    lib/decompress_bunzip2.c: In function `get_next_block':
    lib/decompress_bunzip2.c:511: warning: the frame size of 1932 bytes is larger than 1024 bytes

    byteCount, symToByte, and mtfSymbol cannot be declared static or allocated
    dynamically so place them in the bunzip_data struct.

    Signed-off-by: Prarit Bhargava
    Cc: Phillip Lougher
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prarit Bhargava
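
    Moving large arrays from a function's locals into a caller-provided state structure is what shrinks the stack frame. A minimal sketch of that approach follows; the struct layout, element types and `sketch_*` names are assumptions for illustration and do not reproduce lib/decompress_bunzip2.c.

```c
#include <stddef.h>
#include <string.h>

/* Large working arrays live in the state object, not on the stack. */
struct sketch_bunzip_data {
	unsigned int byte_count[256];
	unsigned char sym_to_byte[256];
	unsigned char mtf_symbol[256];
};

/*
 * Build a byte histogram using the array inside the state object.
 * The function's own frame holds only a loop counter and arguments.
 */
static void sketch_count_bytes(struct sketch_bunzip_data *bd,
			       const unsigned char *buf, size_t len)
{
	size_t i;

	memset(bd->byte_count, 0, sizeof(bd->byte_count));
	for (i = 0; i < len; i++)
		bd->byte_count[buf[i]]++;
}
```

    The memory is still needed, but it is paid for once by whoever owns the state object instead of on every call.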
     
  • We are missing the oops end marker for the exception based WARN implementation
    in lib/bug.c. This is useful for logfile analysis tools.

    Signed-off-by: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Arjan van de Ven
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • There are a few issues with the exception based WARN implementation in
    lib/bug.c:

    - Inconsistent printk flags. The "cut here" line is printed at KERN_EMERG, so
    the console and all logged in users see the single line:

    ------------[ cut here ]------------

    for each WARN. Fix this so we print everything at KERN_WARNING to match the
    kernel/panic.c version.

    - The lib/bug.c WARN would print "Badness at". Change it to match the
    kernel/panic.c version which prints "WARNING: at".

    - Print the list of modules, similar to kernel/panic.c.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Arjan van de Ven
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • Linus asks 'why "raid6" twice?'. No reason.

    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • * 'for-linus' of git://neil.brown.name/md: (24 commits)
    md: clean up do_md_stop
    md: fix another deadlock with removing sysfs attributes.
    md: move revalidate_disk() back outside open_mutex
    md/raid10: fix deadlock with unaligned read during resync
    md/bitmap: separate out loading a bitmap from initialising the structures.
    md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.
    md/bitmap: optimise scanning of empty bitmaps.
    md/bitmap: clean up plugging calls.
    md/bitmap: reduce dependence on sysfs.
    md/bitmap: white space clean up and similar.
    md/raid5: export raid5 unplugging interface.
    md/plug: optionally use plugger to unplug an array during resync/recovery.
    md/raid5: add simple plugging infrastructure.
    md/raid5: export is_congested test
    raid5: Don't set read-ahead when there is no queue
    md: add support for raising dm events.
    md: export various start/stop interfaces
    md: split out md_rdev_init
    md: be more careful setting MD_CHANGE_CLEAN
    md/raid5: ensure we create a unique name for kmem_cache when mddev has no gendisk
    ...

    Linus Torvalds
     
  • * 'kmemleak' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-2.6-cm:
    kmemleak: Fix typo in the comment
    lib/scatterlist: Hook sg_kmalloc into kmemleak (v2)
    kmemleak: Add DocBook style comments to kmemleak.c
    kmemleak: Introduce a default off mode for kmemleak
    kmemleak: Show more information for objects found by alias

    Linus Torvalds
     

10 Aug, 2010

2 commits

  • More code can be pushed from rwsem_down_read_failed and
    rwsem_down_write_failed into rwsem_down_failed_common.

    The following change, which adds down_read_critical infrastructure support,
    also benefits from having the flags available in a register rather than having
    to fish them out of the struct rwsem_waiter...

    Signed-off-by: Michel Lespinasse
    Acked-by: David Howells
    Cc: Mike Waychison
    Cc: Suleiman Souhlal
    Cc: Ying Han
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change addresses the following situation:

    - Thread A acquires the rwsem for read
    - Thread B tries to acquire the rwsem for write, notices there is already
    an active owner for the rwsem.
    - Thread C tries to acquire the rwsem for read, notices that thread B already
    tried to acquire it.
    - Thread C grabs the spinlock and queues itself on the wait queue.
    - Thread B grabs the spinlock and queues itself behind C. At this point A is
    the only remaining active owner on the rwsem.

    In this situation thread B could notice that it was the last active writer
    on the rwsem, and decide to wake C to let it proceed in parallel with A
    since they both only want the rwsem for read.

    Signed-off-by: Michel Lespinasse
    Acked-by: David Howells
    Cc: Mike Waychison
    Cc: Suleiman Souhlal
    Cc: Ying Han
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
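
    The wake-up decision in the scenario above can be modelled as counting the readers at the front of the wait queue. This is a simplified sketch, not the rwsem implementation: the queue is a plain array, the active-writer state is passed in as a flag, and the `sketch_*` names are hypothetical.

```c
#include <stddef.h>

enum sketch_waiter {
	SKETCH_READER,
	SKETCH_WRITER
};

/*
 * Return how many waiters at the front of the queue can be woken to
 * run alongside the current readers. Nothing is woken while a writer
 * is active, and the scan stops at the first queued writer.
 */
static size_t sketch_readers_to_wake(const enum sketch_waiter *queue,
				     size_t len, int writer_active)
{
	size_t n = 0;

	if (writer_active)
		return 0;
	while (n < len && queue[n] == SKETCH_READER)
		n++;
	return n;
}
```

    With the queue from the scenario (C waiting for read, then B waiting for write) and no active writer, the model wakes exactly one waiter, thread C.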