02 Nov, 2009

1 commit

  • Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ,
    while rcu_irq_enter() is invoked unconditionally. This patch
    moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the
    calls are balanced.

    This patch has no effect on the behavior of the kernel because
    both rcu_irq_enter() and rcu_irq_exit() are empty for
    !CONFIG_NO_HZ, but the code is easier to understand if the calls
    are obviously balanced in all cases.
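
    In outline, the balanced arrangement looks roughly like this. The
    stub definitions mirror the description above; the example_*
    wrapper and the exact header placement are illustrative
    assumptions, not the actual kernel/softirq.c code:

    #ifdef CONFIG_NO_HZ
    extern void rcu_irq_enter(void);          /* real dynticks tracking */
    extern void rcu_irq_exit(void);
    #else
    static inline void rcu_irq_enter(void) { }  /* empty for !CONFIG_NO_HZ */
    static inline void rcu_irq_exit(void) { }   /* empty for !CONFIG_NO_HZ */
    #endif

    void example_irq_path(void)
    {
            rcu_irq_enter();          /* entry: always called */
            /* ... interrupt handler runs ... */
            rcu_irq_exit();           /* exit: now always called too */
    }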

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

27 Oct, 2009

1 commit

  • Use lockdep_set_class() to simplify the code and to avoid any
    additional overhead in the !LOCKDEP case. Also move the
    definition of rcu_root_class into kernel/rcutree.c, as suggested
    by Lai Jiangshan.
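
    A hedged sketch of what this amounts to; rcu_root_class is the
    name used in the text above, while the helper below, its contents,
    and the ->lock field name are illustrative assumptions:

    /* One extra lockdep class, reserved for the root rcu_node's lock. */
    static struct lock_class_key rcu_root_class;

    static void example_init_root_lock(struct rcu_node *rnp_root)
    {
            spin_lock_init(&rnp_root->lock);
            /*
             * lockdep_set_class() compiles away when lockdep is disabled,
             * so this adds no overhead in the !LOCKDEP case.
             */
            lockdep_set_class(&rnp_root->lock, &rcu_root_class);
    }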

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 Oct, 2009

5 commits

  • No change in functionality - just straighten out a few small
    stylistic details.

    Cc: Paul E. McKenney
    Cc: David Howells
    Cc: Josh Triplett
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Make rcutorture list the available torture_type values when it
    doesn't like the one specified.
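
    A sketch of the behavior, assuming a torture_ops[] table of the
    available implementations; struct rcu_torture_ops does carry a
    name field, but the helper and message text below are illustrative
    rather than the merged code:

    static int example_pick_torture_ops(const char *torture_type,
                                        struct rcu_torture_ops **cur_ops)
    {
            int i;

            for (i = 0; i < ARRAY_SIZE(torture_ops); i++)
                    if (strcmp(torture_type, torture_ops[i]->name) == 0) {
                            *cur_ops = torture_ops[i];
                            return 0;
                    }
            /* Unrecognized type: say so, then list what is available. */
            printk(KERN_ALERT "rcutorture: invalid torture type: \"%s\"\n",
                   torture_type);
            printk(KERN_ALERT "rcutorture types:");
            for (i = 0; i < ARRAY_SIZE(torture_ops); i++)
                    printk(" %s", torture_ops[i]->name);
            printk("\n");
            return -EINVAL;
    }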

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Adds the "srcu_expedited" torture type, and also renames
    sched_ops_sync to sched_sync_ops for consistency while we are in
    this file.

    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch creates a synchronize_srcu_expedited() that uses
    synchronize_sched_expedited() where synchronize_srcu()
    uses synchronize_sched(). The synchronize_srcu() and
    synchronize_srcu_expedited() functions become one-liners that
    pass synchronize_sched() or synchronize_sched_expedited(),
    respectively, to a new __synchronize_srcu() function.

    While in the file, move the EXPORT_SYMBOL_GPL()s to immediately
    follow the corresponding functions.
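
    In outline, the resulting structure looks roughly like this; the
    body of __synchronize_srcu() is elided and its signature is
    inferred from the description above:

    static void __synchronize_srcu(struct srcu_struct *sp,
                                   void (*sync_func)(void))
    {
            /* flip sp->completed and use sync_func() to wait out readers */
    }

    void synchronize_srcu(struct srcu_struct *sp)
    {
            __synchronize_srcu(sp, synchronize_sched);
    }
    EXPORT_SYMBOL_GPL(synchronize_srcu);

    void synchronize_srcu_expedited(struct srcu_struct *sp)
    {
            __synchronize_srcu(sp, synchronize_sched_expedited);
    }
    EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);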

    Requested-by: Avi Kivity
    Tested-by: Marcelo Tosatti
    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
    This patch provides a small-footprint RCU implementation designed
    for !SMP builds. In particular, the implementation of
    synchronize_rcu() is extremely lightweight and high performance
    (see the sketch after the size figures below). It passes
    rcutorture testing in each of the four relevant configurations
    (combinations of NO_HZ and PREEMPT) on x86. This saves about 1K
    bytes compared to old Classic RCU (which is no longer in
    mainline), and more than three kilobytes compared to Hierarchical
    RCU (updated to 2.6.30):

    CONFIG_TREE_RCU:

        text   data   bss     dec  filename
         183      4     0     187  kernel/rcupdate.o
        2783    520    36    3339  kernel/rcutree.o
                              3526  Total (vs 4565 for v7)

    CONFIG_TREE_PREEMPT_RCU:

        text   data   bss     dec  filename
         263      4     0     267  kernel/rcupdate.o
        4594    776    52    5422  kernel/rcutree.o
                              5689  Total (vs 6155 for v7)

    CONFIG_TINY_RCU:

        text   data   bss     dec  filename
          96      4     0     100  kernel/rcupdate.o
         734     24     0     758  kernel/rcutiny.o
                               858  Total (vs 848 for v7)

    The above is for x86. Your mileage may vary on other platforms.
    Further compression is possible, but is being procrastinated.
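
    As promised above, a minimal illustration of why the !SMP grace
    period is so cheap. This is the reasoning, not the merged rcutiny
    code: on a single-CPU, non-preemptible kernel, the mere fact that
    a process-context caller is running means no RCU read-side
    critical section is in flight, so waiting for a grace period
    degenerates to almost nothing.

    void example_tiny_synchronize_rcu(void)
    {
            might_sleep();  /* keep the usual "this may block" debug check */
            /*
             * Nothing further to wait for: on !SMP && !PREEMPT, reaching
             * this point already implies all pre-existing readers are done.
             * Callbacks queued by call_rcu() are still invoked later, from
             * softirq context.
             */
    }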

    Changes from v7 (http://lkml.org/lkml/2009/10/9/388)

    o Apply Lai Jiangshan's review comments (aside from
    might_sleep() in synchronize_sched(), which is covered by SMP builds).

    o Fix up expedited primitives.

    Changes from v6 (http://lkml.org/lkml/2009/9/23/293).

    o Forward ported to put it into the 2.6.33 stream.

    o Added lockdep support.

    o Make lightweight rcu_barrier.

    Changes from v5 (http://lkml.org/lkml/2009/6/23/12).

    o Ported to latest pre-2.6.32 merge window kernel.

    - Renamed rcu_qsctr_inc() to rcu_sched_qs().
    - Renamed rcu_bh_qsctr_inc() to rcu_bh_qs().
    - Provided trivial rcu_cpu_notify().
    - Provided trivial exit_rcu().
    - Provided trivial rcu_needs_cpu().
    - Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h.

    o Removed the dependence on EMBEDDED, with a view to making
    TINY_RCU default for !SMP at some time in the future.

    o Added (trivial) support for expedited grace periods.

    Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include:

    o Squeeze the size down a bit further by removing the
    ->completed field from struct rcu_ctrlblk.

    o This permits synchronize_rcu() to become the empty function.
    Previous concerns about rcutorture were unfounded, as
    rcutorture correctly handles a constant value from
    rcu_batches_completed() and rcu_batches_completed_bh().

    Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include:

    o Changed rcu_batches_completed(), rcu_batches_completed_bh()
    rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and
    rcu_nmi_exit(), to be static inlines, as suggested by David
    Howells. Doing this saves about 100 bytes from rcutiny.o.
    (The numbers for v3 and v4 of the patch are not directly
    comparable, since they were measured against different versions of
    Linux.)

    Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include:

    o Fix whitespace issues.

    o Change short-circuit "||" operator to instead be "+" in order
    to fix performance bug noted by "kraai" on LWN.

    (http://lwn.net/Articles/324348/)

    Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include:

    o This version depends on EMBEDDED as well as !SMP, as suggested
    by Ingo.

    o Updated rcu_needs_cpu() to unconditionally return zero,
    permitting the CPU to enter dynticks-idle mode at any time.
    This works because callbacks can be invoked upon entry to
    dynticks-idle mode.

    o Paul is now OK with this being included, based on a poll at
    the Kernel Miniconf at linux.conf.au, where about ten people said
    that they cared about saving 900 bytes on single-CPU systems.

    o Applies to both mainline and tip/core/rcu.

    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

15 Oct, 2009

4 commits

  • Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    Cc: Josh Triplett
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    kernel/rcutree_trace.c | 8 ++++++--
    1 file changed, 6 insertions(+), 2 deletions(-)

    Paul E. McKenney
     
  • For the short term, map synchronize_rcu_expedited() to
    synchronize_rcu() for TREE_PREEMPT_RCU and to
    synchronize_sched_expedited() for TREE_RCU.

    Longer term, there needs to be a real expedited grace period for
    TREE_PREEMPT_RCU, but candidate patches to date are considerably
    more complex and intrusive.
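
    The short-term mapping described above is small enough to sketch
    directly; treat the exact header placement as an assumption:

    #ifdef CONFIG_TREE_PREEMPT_RCU
    static inline void synchronize_rcu_expedited(void)
    {
            synchronize_rcu();              /* no true expedited GP yet */
    }
    #else   /* CONFIG_TREE_RCU */
    static inline void synchronize_rcu_expedited(void)
    {
            synchronize_sched_expedited();  /* sched GP suffices for TREE_RCU */
    }
    #endif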

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: npiggin@suse.de
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • As the number of callbacks on a given CPU rises, invoke
    force_quiescent_state() only every blimit number of callbacks
    (defaults to 10,000), and even then only if no other CPU has
    invoked force_quiescent_state() in the meantime.

    This should fix the performance regression reported by Nick.
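
    A sketch of the throttling check, using the names from the text
    above; the rcu_data field names are assumptions made for
    illustration, and the real test sits in the callback-enqueue path
    in kernel/rcutree.c:

    static void example_maybe_force_qs(struct rcu_state *rsp,
                                       struct rcu_data *rdp)
    {
            /* Only consider forcing once another blimit callbacks piled up. */
            if (unlikely(rdp->qlen > rdp->qlen_last_fqs_check + blimit)) {
                    /* ...and only if nobody forced a grace period meanwhile. */
                    if (rsp->n_force_qs == rdp->n_force_qs_snap)
                            force_quiescent_state(rsp, 0);
                    rdp->n_force_qs_snap = rsp->n_force_qs;
                    rdp->qlen_last_fqs_check = rdp->qlen;
            }
    }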

    Reported-by: Nick Piggin
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: jens.axboe@oracle.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • If userspace tries to perform a requeue_pi on a non-requeue_pi waiter,
    it will find the futex_q->requeue_pi_key to be NULL and OOPS.

    Check for NULL in match_futex() instead of doing explicit NULL pointer
    checks on all call sites. While match_futex(NULL, NULL) returning
    false is a little odd, it's still correct as we expect valid key
    references.
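
    A sketch of the check; the futex_key fields shown are the usual
    ones, but treat the exact form as an approximation of the patch:

    static int match_futex(union futex_key *key1, union futex_key *key2)
    {
            return (key1 && key2
                    && key1->both.word == key2->both.word
                    && key1->both.ptr == key2->both.ptr
                    && key1->both.offset == key2->both.offset);
    }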

    Signed-off-by: Darren Hart
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    CC: Eric Dumazet
    CC: Dinakar Guniguntala
    CC: John Stultz
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Darren Hart
     

14 Oct, 2009

1 commit

  • The futex code does not handle spurious wake up in futex_wait and
    futex_wait_requeue_pi.

    The code assumes that any wake up which was not caused by futex_wake /
    requeue or by a timeout was caused by a signal wake up and returns one
    of the syscall restart error codes.

    In case of a spurious wake up the signal delivery code which deals
    with the restart error codes is not invoked and we return that error
    code to user space. That causes applications which actually check the
    return codes to fail. Blaise reported that on preempt-rt a python test
    program ran into an exception trap. -rt exposed this due to a built-in
    spurious wake up accelerator :)

    Solve this by checking signal_pending(current) in the wake up path and
    handling the spurious wake up case without returning to user space.
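
    The shape of the fix, heavily simplified; the function, its
    parameters, and the loop structure below are illustrative rather
    than the actual futex_wait() code:

    static int example_futex_wait_loop(int *woken, int *timed_out)
    {
            for (;;) {
                    /* ... queue on the futex hash bucket and schedule() ... */
                    if (*woken || *timed_out)
                            return 0;
                    if (signal_pending(current))
                            return -ERESTARTNOINTR;  /* genuine signal */
                    /*
                     * Spurious wakeup: no waker, no timeout, no signal.
                     * Go around and wait again instead of handing a restart
                     * error code back to user space.
                     */
            }
    }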

    Reported-by: Blaise Gassend
    Debugged-by: Darren Hart
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:

    Thomas Gleixner
     

09 Oct, 2009

6 commits

  • Some tracepoint magic (TRACE_EVENT(lock_acquired)) relies on
    the fact that lock hold times are positive and uses div64 on
    that. That triggered a build warning on MIPS, and probably
    causes bad output in certain circumstances as well.

    Make it truly positive.

    Reported-by: Andrew Morton
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futex: fix requeue_pi key imbalance
    futex: Fix typo in FUTEX_WAIT/WAKE_BITSET_PRIVATE definitions
    rcu: Place root rcu_node structure in separate lockdep class
    rcu: Make hot-unplugged CPU relinquish its own RCU callbacks
    rcu: Move rcu_barrier() to rcutree
    futex: Move exit_pi_state() call to release_mm()
    futex: Nullify robust lists after cleanup
    futex: Fix locking imbalance
    panic: Fix panic message visibility by calling bust_spinlocks(0) before dying
    rcu: Replace the rcu_barrier enum with pointer to call_rcu*() function
    rcu: Clean up code based on review feedback from Josh Triplett, part 4
    rcu: Clean up code based on review feedback from Josh Triplett, part 3
    rcu: Fix rcu_lock_map build failure on CONFIG_PROVE_LOCKING=y
    rcu: Clean up code to address Ingo's checkpatch feedback
    rcu: Clean up code based on review feedback from Josh Triplett, part 2
    rcu: Clean up code based on review feedback from Josh Triplett

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Set correct normal_prio and prio values in sched_fork()

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: user local buffer variable for trace branch tracer
    tracing: fix warning on kernel/trace/trace_branch.c andtrace_hw_branches.c
    ftrace: check for failure for all conversions
    tracing: correct module boundaries for ftrace_release
    tracing: fix transposed numbers of lock_depth and preempt_count
    trace: Fix missing assignment in trace_ctxwake_*
    tracing: Use free_percpu instead of kfree
    tracing: Check total refcount before releasing bufs in profile_enable failure

    Linus Torvalds
     
  • …/linux/kernel/git/tip/linux-2.6-tip

    * 'sparc-perf-events-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA
    perf_event: Provide vmalloc() based mmap() backing

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf_events: Make ABI definitions available to userspace
    perf tools: elf_sym__is_function() should accept "zero" sized functions
    tracing/syscalls: Use long for syscall ret format and field definitions
    perf trace: Update eval_flag() flags array to match interrupt.h
    perf trace: Remove unused code in builtin-trace.c
    perf: Propagate term signal to child

    Linus Torvalds
     

08 Oct, 2009

6 commits


07 Oct, 2009

4 commits

  • Commit f2e21c9610991e95621a81407cdbab881226419b had unfortunate side
    effects with cpufreq governors on some systems.

    If the system did not switch into NOHZ mode, ts->inidle is not set
    when tick_nohz_stop_sched_tick() is called from the idle routine.
    Therefore all subsequent calls from irq_exit() to
    tick_nohz_stop_sched_tick() fail to call tick_nohz_start_idle().
    This results in bogus idle accounting information being passed to
    cpufreq governors.

    Set the inidle flag unconditionally, regardless of the NOHZ active
    state, to keep the idle time accounting correct in any case.
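
    A sketch of where the flag now gets set; the surrounding tick code
    is elided and the exact statements are an assumption:

    void example_stop_sched_tick(int inidle, struct tick_sched *ts)
    {
            /*
             * Record idle entry unconditionally: even if the tick is not
             * actually stopped (NOHZ inactive), the idle time accounting
             * that cpufreq governors read must still be started.
             */
            if (inidle)
                    ts->inidle = 1;

            /* ... only afterwards decide whether the tick can be stopped ... */
    }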

    [ tglx: Added comment and tweaked the changelog ]

    Reported-by: Steven Noonan
    Signed-off-by: Eero Nurkkala
    Cc: Rik van Riel
    Cc: Venkatesh Pallipadi
    Cc: Greg KH
    Cc: Steven Noonan
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Eero Nurkkala
     
  • Before this patch, all of the rcu_node structures were in the same lockdep
    class, so that lockdep would complain when rcu_preempt_offline_tasks()
    acquired the root rcu_node structure's lock while holding one of the leaf
    rcu_nodes' locks.

    This patch changes rcu_init_one() to use a separate
    spin_lock_init() for the root rcu_node structure's lock than is
    used for that of all of the rest of the rcu_node structures, which
    puts the root rcu_node structure's lock in its own lockdep class.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • The current interaction between RCU and CPU hotplug requires that
    RCU block in CPU notifiers waiting for callbacks to drain.

    This can be greatly simplified by having each CPU relinquish its
    own callbacks, and by having both _rcu_barrier() and the CPU_DEAD
    notifier adopt all callbacks that were previously relinquished.

    This change also eliminates the possibility of certain types of
    hangs due to the previous practice of waiting for callbacks to be
    invoked from within CPU notifiers. If you never wait, you cannot
    hang.
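
    The adoption step can be pictured with a simplified callback list;
    the real rcu_data lists are segmented singly-linked lists, and the
    type and function names below are purely illustrative (struct
    rcu_head as in <linux/rcupdate.h>):

    struct example_cblist {
            struct rcu_head *head;
            struct rcu_head **tail;  /* &last_element->next, or &head when empty */
    };

    /* Dying CPU leaves its list behind; a survivor splices it onto its own. */
    static void example_adopt(struct example_cblist *adopter,
                              struct example_cblist *orphan)
    {
            if (orphan->head == NULL)
                    return;
            *adopter->tail = orphan->head;
            adopter->tail = orphan->tail;
            orphan->head = NULL;
            orphan->tail = &orphan->head;
    }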

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Move the existing rcu_barrier() implementation to rcutree.c,
    consistent with the fact that the rcu_barrier() implementation is
    tied quite tightly to the RCU implementation.

    This opens the way to simplify and fix rcutree.c's rcu_barrier()
    implementation in a later patch.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

06 Oct, 2009

12 commits

  • exit_pi_state() is called from do_exit() but not from do_execve().
    Move it to release_mm() so it gets called from do_execve() as well.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: stable@kernel.org
    Cc: Anirban Sinha
    Cc: Peter Zijlstra

    Thomas Gleixner
     
  • The robust list pointers of user space held futexes are kept intact
    over an exec() call. When the exec'ed task exits exit_robust_list() is
    called with the stale pointer. The risk of corruption is minimal, but
    still it is incorrect to keep the pointers valid. Actually glibc
    should uninstall the robust list before calling exec() but we have to
    deal with it anyway.

    Nullify the pointers after [compat_]exit_robust_list() has been
    called.
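
    Sketched in place, the cleanup looks like this; the exact call
    site in the common exit/exec teardown path is an assumption:

    if (tsk->robust_list) {
            exit_robust_list(tsk);
            tsk->robust_list = NULL;   /* don't walk a stale pointer later */
    }
    #ifdef CONFIG_COMPAT
    if (tsk->compat_robust_list) {
            compat_exit_robust_list(tsk);
            tsk->compat_robust_list = NULL;
    }
    #endif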

    Reported-by: Anirban Sinha
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: stable@kernel.org

    Peter Zijlstra
     
    The state char variable S should be reassigned if S == 0.

    We are missing the state of the task that is going to sleep for the
    context switch events (in the raw mode).

    Fortunately the problem arises with the sched_switch/wake_up
    tracers, not the sched trace events.

    The former are legacy now, but still, that was buggy.

    Signed-off-by: Hiroshi Shimamoto
    Cc: Steven Rostedt
    Acked-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Hiroshi Shimamoto
     
  • Some architectures such as Sparc, ARM and MIPS (basically
    everything with flush_dcache_page()) need to deal with dcache
    aliases by carefully placing pages in both kernel and user maps.

    These architectures typically have to use vmalloc_user() for this.

    However, on other architectures, vmalloc() is not needed and has
    the downsides of being more restricted and slower than regular
    allocations.

    Signed-off-by: Peter Zijlstra
    Acked-by: David Miller
    Cc: Andrew Morton
    Cc: Jens Axboe
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The syscall event definitions use long for the syscall exit ret
    value, but unsigned long for the same thing in the format and field
    definitions. Change them all to long.

    Signed-off-by: Tom Zanussi
    Acked-by: Frederic Weisbecker
    Cc: rostedt@goodmis.org
    Cc: lizf@cn.fujitsu.com
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Rich reported a lock imbalance in the futex code:

    http://bugzilla.kernel.org/show_bug.cgi?id=14288

    It's caused by the displacement of the retry_private label in
    futex_wake_op(). The code unlocks the hash bucket locks in the
    error handling path and retries without locking them again which
    makes the next unlock fail.

    Move retry_private so we lock the hash bucket locks when we retry.

    Reported-by: Rich Ercolany
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Darren Hart
    Cc: stable-2.6.31
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Commit ffd71da4e3f ("panic: decrease oops_in_progress only after
    having done the panic") moved bust_spinlocks(0) to the end of the
    function, which in practice is never reached.

    As a result console_unblank() is not called, and on some systems
    the user may not see the panic message.

    Move it back up to before the unblanking.
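
    The intended ordering, in outline; the details of panic() are
    elided and the placement shown is an approximation:

    bust_spinlocks(1);
    /* ... format and printk() the panic message, dump_stack() ... */
    bust_spinlocks(0);      /* drops oops_in_progress and calls
                             * console_unblank(), so the message is
                             * actually visible on the console */
    /* ... notifiers and the final "do not return" loop ... */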

    Signed-off-by: Aaro Koskinen
    Reviewed-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Aaro Koskinen
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf tools: Run generate-cmdlist.sh properly
    perf_event: Clean up perf_event_init_task()
    perf_event: Fix event group handling in __perf_event_sched_*()
    perf timechart: Add a power-only mode
    perf top: Add poll_idle to the skip list

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    hrtimer: Remove overly verbose "switch to high res mode" message

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    kmemtrace: Fix up tracer registration
    tracing: Fix infinite recursion in ftrace_update_pid_func()

    Linus Torvalds
     
  • The rcu_barrier enum causes several problems:

    (1) you have to define the enum somewhere, and there is no
    convenient place,

    (2) the difference between TREE_RCU and TREE_PREEMPT_RCU causes
    problems when you need to map from rcu_barrier enum to struct
    rcu_state,

    (3) the switch statements are large, and

    (4) TINY_RCU really needs a different rcu_barrier() than do the
    treercu implementations.

    So replace it with a functionally equivalent but cleaner function
    pointer abstraction.
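
    A sketch of the resulting shape; the barrier bookkeeping inside
    _rcu_barrier() is elided:

    static void _rcu_barrier(void (*call_rcu_func)(struct rcu_head *head,
                                                   void (*func)(struct rcu_head *head)))
    {
            /* queue one callback per CPU via call_rcu_func() and wait for all */
    }

    void rcu_barrier(void)
    {
            _rcu_barrier(call_rcu);
    }

    void rcu_barrier_bh(void)
    {
            _rcu_barrier(call_rcu_bh);
    }

    void rcu_barrier_sched(void)
    {
            _rcu_barrier(call_rcu_sched);
    }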

    Signed-off-by: Paul E. McKenney
    Acked-by: Mathieu Desnoyers
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
    These issues were identified during an old-fashioned face-to-face
    code review extending over many hours. This group of changes
    improves an existing abstraction and introduces two new ones. It
    also fixes an RCU stall-warning bug found while making the other
    changes.

    o Make RCU_INIT_FLAVOR() declare its own variables, removing
    the need to declare them at each call site.

    o Create an rcu_for_each_leaf() macro that scans the leaf
    nodes of the rcu_node tree.

    o Create an rcu_for_each_node_breadth_first() macro that does
    a breadth-first traversal of the rcu_node tree, i.e., stepping
    through the array in index-number order (both new macros are
    sketched after this list).

    o If all CPUs corresponding to a given leaf rcu_node
    structure go offline, then any tasks queued on that leaf
    will be moved to the root rcu_node structure. Therefore,
    the stall-warning code must dump out tasks queued on the
    root rcu_node structure as well as those queued on the leaf
    rcu_node structures.
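
    A sketch of the two traversal macros named above; the constants
    and the root-first array layout they rely on are stated here as
    assumptions:

    /*
     * The rcu_node array is laid out with the root first and the leaves
     * last, so index order gives a breadth-first traversal and the
     * leaves form the tail of the array.
     */
    #define rcu_for_each_node_breadth_first(rsp, rnp) \
            for ((rnp) = &(rsp)->node[0]; \
                 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)

    #define rcu_for_each_leaf(rsp, rnp) \
            for ((rnp) = (rsp)->level[NUM_RCU_LVLS - 1]; \
                 (rnp) < &(rsp)->node[NUM_RCU_NODES]; (rnp)++)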

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney