15 Oct, 2009
1 commit
-
As the number of callbacks on a given CPU rises, invoke
force_quiescent_state() only every blimit number of callbacks
(defaults to 10,000), and even then only if no other CPU has
invoked force_quiescent_state() in the meantime.This should fix the performance regression reported by Nick.
Reported-by: Nick Piggin
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: jens.axboe@oracle.com
LKML-Reference:
Signed-off-by: Ingo Molnar
07 Oct, 2009
3 commits
-
Before this patch, all of the rcu_node structures were in the same lockdep
class, so that lockdep would complain when rcu_preempt_offline_tasks()
acquired the root rcu_node structure's lock while holding one of the leaf
rcu_nodes' locks.This patch changes rcu_init_one() to use a separate
spin_lock_init() for the root rcu_node structure's lock than is
used for that of all of the rest of the rcu_node structures, which
puts the root rcu_node structure's lock in its own lockdep class.Suggested-by: Peter Zijlstra
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
The current interaction between RCU and CPU hotplug requires that
RCU block in CPU notifiers waiting for callbacks to drain.This can be greatly simplified by having each CPU relinquish its
own callbacks, and for both _rcu_barrier() and CPU_DEAD notifiers
to adopt all callbacks that were previously relinquished.This change also eliminates the possibility of certain types of
hangs due to the previous practice of waiting for callbacks to be
invoked from within CPU notifiers. If you don't every wait, you
cannot hang.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
Move the existing rcu_barrier() implementation to rcutree.c,
consistent with the fact that the rcu_barrier() implementation is
tied quite tightly to the RCU implementation.This opens the way to simplify and fix rcutree.c's rcu_barrier()
implementation in a later patch.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
06 Oct, 2009
3 commits
-
These issues identified during an old-fashioned face-to-face code
review extending over many hours. This group improves an existing
abstraction and introduces two new ones. It also fixes an RCU
stall-warning bug found while making the other changes.o Make RCU_INIT_FLAVOR() declare its own variables, removing
the need to declare them at each call site.o Create an rcu_for_each_leaf() macro that scans the leaf
nodes of the rcu_node tree.o Create an rcu_for_each_node_breadth_first() macro that does
a breadth-first traversal of the rcu_node tree, AKA
stepping through the array in index-number order.o If all CPUs corresponding to a given leaf rcu_node
structure go offline, then any tasks queued on that leaf
will be moved to the root rcu_node structure. Therefore,
the stall-warning code must dump out tasks queued on the
root rcu_node structure as well as those queued on the leaf
rcu_node structures.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
Whitespace fixes, updated comments, and trivial code movement.
o Fix whitespace error in RCU_HEAD_INIT()
o Move "So where is rcu_write_lock()" comment so that it does
not come between the rcu_read_unlock() header comment and
the rcu_read_unlock() definition.o Move the module_param statements for blimit, qhimark, and
qlowmark to immediately follow the corresponding
definitions.o In __rcu_offline_cpu(), move the assignment to rdp_me
inside the "if" statement, given that rdp_me is not used
outside of that "if" statement.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
Move the rcu_lock_map definition from rcutree.c to rcupdate.c so that
TINY_RCU can use lockdep.Reported-by: Ingo Molnar
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar
24 Sep, 2009
3 commits
-
Move declarations and update storage classes to make checkpatch happy.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
These issues identified during an old-fashioned face-to-face code
review extending over many hours.o Add comments for tricky parts of code, and correct comments
that have passed their sell-by date.o Get rid of the vestiges of rcu_init_sched(), which is no
longer needed now that PREEMPT_RCU is gone.o Move the #include of rcutree_plugin.h to the end of
rcutree.c, which means that, rather than having a random
collection of forward declarations, the new set of forward
declarations document the set of plugins. The new home for
this #include also allows __rcu_init_preempt() to move into
rcutree_plugin.h.o Fix rcu_preempt_check_callbacks() to be static.
Suggested-by: Josh Triplett
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
Peter Zijlstra -
These issues identified during an old-fashioned face-to-face code
review extended over many hours.o Bury various forms of the "rsp->completed == rsp->gpnum"
comparison into an rcu_gp_in_progress() function, which has
the beneficial side-effect of forcing consistent use of
ACCESS_ONCE().o Replace hand-coded arithmetic with DIV_ROUND_UP().
o Bury several "!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x01])"
instances into an rcu_preempted_readers() function, as this
expression indicates that there are no readers blocked
within RCU read-side critical sections blocking the current
grace period. (Though there might well be similar readers
blocking the next grace period.)o Remove a dangling rcu_restart_cpu() declaration that has
been dangling for almost 20 minor releases of the kernel.Signed-off-by: Paul E. McKenney
Acked-by: Peter Zijlstra
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
19 Sep, 2009
4 commits
-
Fix a number of whitespace ^Ierrors in the include/linux/rcu*
and the kernel/rcu* files.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
[ did more checkpatch fixlets ]
Signed-off-by: Ingo Molnar -
Commit de078d8 ("rcu: Need to update rnp->gpnum if preemptable RCU
is to be reliable") repeatedly and incorrectly initializes the root
rcu_node structure's ->gpnum field rather than initializing the
->gpnum field of each node in the tree. Fix this. Also add an
additional consistency check to catch this in the future.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar -
o Drop the calls to cpu_quiet() from the online/offline code.
These are unnecessary, since force_quiescent_state() will
clean up, and removing them simplifies the code a bit.o Add a warning to check that we don't enqueue the same blocked
task twice onto the ->blocked_tasks[] lists.o Rework the phase computation in rcu_preempt_note_context_switch()
to be more readable, as suggested by Josh Triplett.o Disable irqs to close a race between the scheduling clock
interrupt and rcu_preempt_note_context_switch() WRT the
->rcu_read_unlock_special field.o Add comments to rnp->lock acquisition and release within
rcu_read_unlock_special() noting that irqs are already
disabled.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar -
o Verify that qsmask bits stay clear through GP
initialization.o Verify that cpu_quiet_msk_finish() is never invoked unless
there actually is an RCU grace period in progress.o Verify that all internal-node rcu_node structures have empty
blocked_tasks[] lists.o Verify that child rcu_node structure's bits remain clear after
acquiring parent's lock.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar
18 Sep, 2009
4 commits
-
The earlier approach required two scheduling-clock ticks to note an
preemptable-RCU quiescent state in the situation in which the
scheduling-clock interrupt is unlucky enough to always interrupt an
RCU read-side critical section.With this change, the quiescent state is instead noted by the
outermost rcu_read_unlock() immediately following the first
scheduling-clock tick, or, alternatively, by the first subsequent
context switch. Therefore, this change also speeds up grace
periods.Suggested-by: Josh Triplett
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar -
Check to make sure that there are no blocked tasks for the previous
grace period while initializing for the next grace period, verify
that rcu_preempt_qs() is given the correct CPU number and is never
called for an offline CPU.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar -
Prior implementations initialized the root and any internal
nodes without holding locks, then initialized the leaves
holding locks.This is a false economy, as the leaf nodes will usually greatly
outnumber the root and internal nodes. Acquiring locks on all
nodes is conceptually much simpler as well.Signed-off-by: Paul E. McKenney
Acked-by: Steven Rostedt
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
Without this patch, tasks preempted in RCU read-side critical
sections can fail to block the grace period, given that
rnp->gpnum is used to determine which rnp->blocked_tasks[]
element the preempted task is enqueued on.Before the patch, rnp->gpnum is always zero, so preempted tasks
are always enqueued on rnp->blocked_tasks[0], which is correct
only when the current CPU has not checked into the current
grace period and the grace-period number is even, or,
similarly, if the current CPU -has- checked into the current
grace period and the grace-period number is odd.Signed-off-by: Paul E. McKenney
Acked-by: Steven Rostedt
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
LKML-Reference:
Signed-off-by: Ingo Molnar
12 Sep, 2009
1 commit
-
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
rcu: Move end of special early-boot RCU operation earlier
rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
rcu: Remove lockdep annotations from RCU's _notrace() API members
rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
rcu: Add CPU-offline processing for single-node configurations
rcu: Add "notrace" to RCU function headers used by ftrace
rcu: Remove CONFIG_PREEMPT_RCU
rcu: Merge preemptable-RCU functionality into hierarchical RCU
rcu: Simplify rcu_pending()/rcu_check_callbacks() API
rcu: Use debugfs_remove_recursive() simplify code.
rcu: Merge per-RCU-flavor initialization into pre-existing macro
rcu: Fix online/offline indication for rcudata.csv trace file
rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
rcu: Renamings to increase RCU clarity
rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
rcu: Fix typo in rcu_irq_exit() comment header
rcu: Make rcupreempt_trace.c look at offline CPUs
...
29 Aug, 2009
2 commits
-
Changes suggested by review comments from Josh Triplett and
Mathieu Desnoyers.Signed-off-by: Paul E. McKenney
Acked-by: Josh Triplett
Acked-by: Mathieu Desnoyers
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
When offlining CPUs from a multi-level tree, there is the
possibility of offlining the last CPU from a given node when
there are preempted RCU read-side critical sections that
started life on one of the CPUs on that node.In this case, the corresponding tasks will be enqueued via the
task_struct's rcu_node_entry list_head onto one of the
rcu_node's blocked_tasks[] lists. These tasks need to be moved
somewhere else so that they will prevent the current grace
period from ending. That somewhere is the root rcu_node.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar
26 Aug, 2009
1 commit
-
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Paul Mundt
LKML-Reference:
Signed-off-by: Ingo Molnar
25 Aug, 2009
1 commit
-
Add preemptable-RCU plugin to handle the CPU-offline
processing.An additional plugin is forthcoming to handle multinode RCU
trees, but this current plugin works for configurations up to
32 CPUs (64 CPUs for 64-bit kernels).Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar
23 Aug, 2009
5 commits
-
Create a kernel/rcutree_plugin.h file that contains definitions
for preemptable RCU (or, under the #else branch of the #ifdef,
empty definitions for the classic non-preemptable semantics).
These definitions fit into plugins defined in kernel/rcutree.c
for this purpose.This variant of preemptable RCU uses a new algorithm whose
read-side expense is roughly that of classic hierarchical RCU
under CONFIG_PREEMPT. This new algorithm's update-side expense
is similar to that of classic hierarchical RCU, and, in absence
of read-side preemption or blocking, is exactly that of classic
hierarchical RCU. Perhaps more important, this new algorithm
has a much simpler implementation, saving well over 1,000 lines
of code compared to mainline's implementation of preemptable
RCU, which will hopefully be retired in favor of this new
algorithm.The simplifications are obtained by maintaining per-task
nesting state for running tasks, and using a simple
lock-protected algorithm to handle accounting when tasks block
within RCU read-side critical sections, making use of lessons
learned while creating numerous user-level RCU implementations
over the past 18 months.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
All calls from outside RCU are of the form:
if (rcu_pending(cpu))
rcu_check_callbacks(cpu, user);This is silly, instead we put a call to rcu_pending() in
rcu_check_callbacks(), and then make the outside calls be to
rcu_check_callbacks(). This cuts down on the code a bit and
also gives the compiler a better chance of optimizing.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
Rename the RCU_DATA_PTR_INIT() macro to RCU_INIT_FLAVOR() and
make it do the rcu_init_one() and rcu_boot_init_percpu_data()
calls. Merge the loop that was in the original macro with the
loops that were in __rcu_init().Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
Make RCU-sched, RCU-bh, and RCU-preempt be underlying
implementations, with "RCU" defined in terms of one of the
three. Update the outdated rcu_qsctr_inc() names, as these
functions no longer increment anything.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
Some information hiding that makes it easier to merge
preemptability into rcutree without descending into #include
hell.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: josht@linux.vnet.ibm.com
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
LKML-Reference:
Signed-off-by: Ingo Molnar
16 Aug, 2009
2 commits
-
Use the new cpu_notifier() API to simplify RCU's CPU-hotplug
notifiers, collapsing down to a single such notifier.This makes it trivial to provide the notifier-ordering
guarantee that rcu_barrier() depends on.Also remove redundant open_softirq() calls from Hierarchical
RCU notifier.Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: josht@linux.vnet.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: hugh.dickins@tiscali.co.uk
Cc: benh@kernel.crashing.org
LKML-Reference:
Signed-off-by: Ingo Molnar -
This patch divides the rcutree initialization into boot-time
and hotplug-time components, so that the tree data structures
are guaranteed to be fully linked at boot time regardless of
what might happen in CPU hotplug operations.This makes RCU more resilient against CPU hotplug misbehavior
(and vice versa), but more importantly, does a better job of
compartmentalizing the code.Reported-by: Ingo Molnar
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: josht@linux.vnet.ibm.com
Cc: akpm@linux-foundation.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: hugh.dickins@tiscali.co.uk
Cc: benh@kernel.crashing.org
LKML-Reference:
Signed-off-by: Ingo Molnar
02 Aug, 2009
1 commit
-
When debugging a recent lockup bug i found various deficiencies
in how our current lockup detection helpers work:- SysRq-L is not very efficient as it uses a workqueue, hence
it cannot punch through hard lockups and cannot see through
most soft lockups either.- The SysRq-L code depends on the NMI watchdog - which is off
by default.- We dont print backtraces from the RCU code's built-in
'RCU state machine is stuck' debug code. This debug
code tends to be one of the first (and only) mechanisms
that show that a lockup has occured.This patch changes the code so taht we:
- Trigger the NMI backtrace code from SysRq-L instead of using
a workqueue (which cannot punch through hard lockups)- Trigger print-all-CPU-backtraces from the RCU lockup detection
codeAlso decouple the backtrace printing code from the NMI watchdog:
- Dont use variable size cpumasks (it might not be initialized
and they are a bit more fragile anyway)- Trigger an NMI immediately via an IPI, instead of waiting
for the NMI tick to occur. This is a lot faster and can
produce more relevant backtraces. It will also work if the
NMI watchdog is disabled.- Dont print the 'dazed and confused' message when we print
a backtrace from the NMI- Do a show_regs() plus a dump_stack() to get maximum info
out of the dump. Worst-case we get two stacktraces - which
is not a big deal. Sometimes, if register content is
corrupted, the precise stack walker in show_regs() wont
give us a full backtrace - in this case dump_stack() will
do it.Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Andrew Morton
Cc: Linus Torvalds
LKML-Reference:
Signed-off-by: Ingo Molnar
24 Jun, 2009
1 commit
-
Removes the warnings about Hierarchical RCU being experimental,
given that it has gone through almost six months of being the
default RCU in mainline for the x86 with very little trouble.This makes hierarchical-RCU bootup look less scary.
Signed-off-by: Paul E. McKenney
Cc: akpm@linux-foundation.org
Cc: niv@us.ibm.com
Cc: dvhltc@us.ibm.com
Cc: dipankar@in.ibm.com
Cc: dhowells@redhat.com
Cc: lethal@linux-sh.org
Cc: kernel@wantstofly.org
Cc: cl@linux-foundation.org
Cc: schamp@sgi.com
Signed-off-by: Ingo Molnar
14 Apr, 2009
2 commits
-
Add tracing to __rcu_pending() to provide information on why RCU
processing was kicked off. This is helpful for debugging hierarchical
RCU, and might also be helpful in learning how hierarchical RCU operates.Located-by: Anton Blanchard
Tested-by: Anton Blanchard
Signed-off-by: Paul E. McKenney
Cc: anton@samba.org
Cc: akpm@linux-foundation.org
Cc: dipankar@in.ibm.com
Cc: manfred@colorfullife.com
Cc: cl@linux-foundation.org
Cc: josht@linux.vnet.ibm.com
Cc: schamp@sgi.com
Cc: niv@us.ibm.com
Cc: dvhltc@us.ibm.com
Cc: ego@in.ibm.com
Cc: laijs@cn.fujitsu.com
Cc: rostedt@goodmis.org
Cc: peterz@infradead.org
Cc: penberg@cs.helsinki.fi
Cc: andi@firstfloor.org
Cc: "Paul E. McKenney"
LKML-Reference:
Signed-off-by: Ingo Molnar -
This patch fixes a hierarchical-RCU performance bug located by Anton
Blanchard. The problem stems from a misguided attempt to provide a
work-around for jiffies-counter failure. This work-around uses a per-CPU
n_rcu_pending counter, which is incremented on each call to rcu_pending(),
which in turn is called from each scheduling-clock interrupt. Each CPU
then treats this counter as a surrogate for the jiffies counter, so
that if the jiffies counter fails to advance, the per-CPU n_rcu_pending
counter will cause RCU to invoke force_quiescent_state(), which in turn
will (among other things) send resched IPIs to CPUs that have thus far
failed to pass through an RCU quiescent state.Unfortunately, each CPU resets only its own counter after sending a
batch of IPIs. This means that the other CPUs will also (needlessly)
send -another- round of IPIs, for a full N-squared set of IPIs in the
worst case every three scheduler-clock ticks until the grace period
finally ends. It is not reasonable for a given CPU to reset each and
every n_rcu_pending for all the other CPUs, so this patch instead simply
disables the jiffies-counter "training wheels", thus eliminating the
excessive IPIs.Note that the jiffies-counter IPIs do not have this problem due to
the fact that the jiffies counter is global, so that the CPU sending
the IPIs can easily reset things, thus preventing the other CPUs from
sending redundant IPIs.Note also that the n_rcu_pending counter remains, as it will continue to
be used for tracing. It may also see use to update the jiffies counter,
should an appropriate kick-the-jiffies-counter API appear.Located-by: Anton Blanchard
Tested-by: Anton Blanchard
Signed-off-by: Paul E. McKenney
Cc: anton@samba.org
Cc: akpm@linux-foundation.org
Cc: dipankar@in.ibm.com
Cc: manfred@colorfullife.com
Cc: cl@linux-foundation.org
Cc: josht@linux.vnet.ibm.com
Cc: schamp@sgi.com
Cc: niv@us.ibm.com
Cc: dvhltc@us.ibm.com
Cc: ego@in.ibm.com
Cc: laijs@cn.fujitsu.com
Cc: rostedt@goodmis.org
Cc: peterz@infradead.org
Cc: penberg@cs.helsinki.fi
Cc: andi@firstfloor.org
Cc: "Paul E. McKenney"
LKML-Reference:
Signed-off-by: Ingo Molnar
03 Apr, 2009
2 commits
-
Impact: cleanup
We want to remove rcutree internals from the public rcutree.h file for
upcoming kmemtrace changes - but kernel/rcutree_trace.c depends on them.Introduce kernel/rcutree.h for internal definitions. (Probably all
the other data types from include/linux/rcutree.h could be
moved here too - except rcu_data.)Cc: Pekka Enberg
Cc: Eduard - Gabriel Munteanu
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
Impact: build fix for all non-x86 architectures
We want to remove percpu.h from rcuclassic.h/rcutree.h (for upcoming
kmemtrace changes) but that would break the DECLARE_PER_CPU based
declarations in these files.Move the quiescent counter management functions to their respective
RCU implementation .c files - they were slightly above the inlining
limit anyway.Cc: Pekka Enberg
Cc: Eduard - Gabriel Munteanu
Cc: paulmck@linux.vnet.ibm.com
LKML-Reference:
Signed-off-by: Ingo Molnar
26 Feb, 2009
1 commit
-
This patch fixes a bug located by Vegard Nossum with the aid of
kmemcheck, updated based on review comments from Nick Piggin,
Ingo Molnar, and Andrew Morton. And cleans up the variable-name
and function-name language. ;-)The boot CPU runs in the context of its idle thread during boot-up.
During this time, idle_cpu(0) will always return nonzero, which will
fool Classic and Hierarchical RCU into deciding that a large chunk of
the boot-up sequence is a big long quiescent state. This in turn causes
RCU to prematurely end grace periods during this time.This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
function to ignore the idle task as a quiescent state until the
system has started up the scheduler in rest_init(), introducing a
new non-API function rcu_idle_now_means_idle() to inform RCU of this
transition. RCU maintains an internal rcu_idle_cpu_truthful variable
to track this state, which is then used by rcu_check_callback() to
determine if it should believe idle_cpu().Because this patch has the effect of disallowing RCU grace periods
during long stretches of the boot-up sequence, this patch also introduces
Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
no-op if num_online_cpus() returns 1. This allows boot-time code that
calls synchronize_rcu() to proceed normally. Note, however, that RCU
callbacks registered by call_rcu() will likely queue up until later in
the boot sequence. Although rcuclassic and rcutree can also use this
same optimization after boot completes, rcupreempt must restrict its
use of this optimization to the portion of the boot sequence before the
scheduler starts up, given that an rcupreempt RCU read-side critical
section may be preeempted.In addition, this patch takes Nick Piggin's suggestion to make the
system_state global variable be __read_mostly.Changes since v4:
o Changes the name of the introduced function and variable to
be less emotional. ;-)Changes since v3:
o WARN_ON(nr_context_switches() > 0) to verify that RCU
switches out of boot-time mode before the first context
switch, as suggested by Nick Piggin.Changes since v2:
o Created rcu_blocking_is_gp() internal-to-RCU API that
determines whether a call to synchronize_rcu() is itself
a grace period.o The definition of rcu_blocking_is_gp() for rcuclassic and
rcutree checks to see if but a single CPU is online.o The definition of rcu_blocking_is_gp() for rcupreempt
checks to see both if but a single CPU is online and if
the system is still in early boot.This allows rcupreempt to again work correctly if running
on a single CPU after booting is complete.o Added check to rcupreempt's synchronize_sched() for there
being but one online CPU.Tested all three variants both SMP and !SMP, booted fine, passed a short
rcutorture test on both x86 and Power.Located-by: Vegard Nossum
Tested-by: Vegard Nossum
Tested-by: Paul E. McKenney
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar
14 Jan, 2009
1 commit
-
Impact: reduce memory footprint
add __cpuinit to rcu_init_percpu_data(), and this function's text
will be discarded after boot when !CONFIG_HOTPLUG_CPU.Signed-off-by: Lai Jiangshan
Signed-off-by: Ingo Molnar
05 Jan, 2009
2 commits
-
Impact: fix kernel warnings [and potential crash] during suspend+resume
Kudos to both Dhaval Giani and Jens Axboe for finding a bug in treercu
that causes warnings after suspend-resume cycles in Dhaval's case and
during stress tests in Jens's case. It would also probably cause failures
if heavily stressed. The solution, ironically enough, is to revert to
rcupreempt's code for initializing the dynticks state. And the patch
even results in smaller code -- so what was I thinking???This is 2.6.29 material, given that people really do suspend and resume
Linux these days. ;-)Reported-by: Dhaval Giani
Reported-by: Jens Axboe
Tested-by: Dhaval Giani
Tested-by: Jens Axboe
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar -
Impact: fix delays during bootup
Kudos to Andi Kleen for finding a grace-period-latency problem! The
problem was that the special-case code for small machines never updated
the ->signaled field to indicate that grace-period initialization had
completed, which prevented force_quiescent_state() from ever expediting
grace periods. This problem resulted in grace periods extending for more
than 20 seconds. Not subtle. I introduced this bug during my inspection
process when I fixed a race between grace-period initialization and
force_quiescent_state() execution.The following patch properly updates the ->signaled field for the
"small"-system case (no more than 32 CPUs for 32-bit kernels and no more
than 64 CPUs for 64-bit kernels).Reported-by: Andi Kleen
Tested-by: Andi Kleen
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar