06 May, 2009

1 commit

  • As a kernel thread, rcu_sched_grace_period() runs with all signals ignored.
    It can never receive a signal, even if it sleeps in TASK_INTERRUPTIBLE;
    it would need an explicit allow_signal() to be visible to signals.
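
    For illustration only (not part of this patch), a minimal sketch of the
    rule being relied on: a kernel thread never sees a pending signal unless
    it has opted in via allow_signal(), so a flush_signals() call in a thread
    that never does so is dead code. The thread function and the choice of
    SIGTERM below are hypothetical.

    #include <linux/kthread.h>
    #include <linux/sched.h>
    #include <linux/signal.h>

    static int example_kthread(void *unused)    /* hypothetical thread */
    {
        allow_signal(SIGTERM);  /* without this, no signal is ever delivered */

        while (!kthread_should_stop()) {
            schedule_timeout_interruptible(HZ);
            if (signal_pending(current))
                flush_signals(current); /* only reachable after allow_signal() */
        }
        return 0;
    }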

    [ Impact: reduce kernel size, remove dead code ]

    Signed-off-by: Oleg Nesterov
    Reviewed-by: Paul E. McKenney
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

03 Apr, 2009

1 commit


26 Feb, 2009

1 commit

  • This patch fixes a bug located by Vegard Nossum with the aid of
    kmemcheck, updated based on review comments from Nick Piggin,
    Ingo Molnar, and Andrew Morton. It also cleans up the variable-name
    and function-name language. ;-)

    The boot CPU runs in the context of its idle thread during boot-up.
    During this time, idle_cpu(0) will always return nonzero, which will
    fool Classic and Hierarchical RCU into deciding that a large chunk of
    the boot-up sequence is a big long quiescent state. This in turn causes
    RCU to prematurely end grace periods during this time.

    This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
    function to ignore the idle task as a quiescent state until the
    system has started up the scheduler in rest_init(), introducing a
    new non-API function rcu_idle_now_means_idle() to inform RCU of this
    transition. RCU maintains an internal rcu_idle_cpu_truthful variable
    to track this state, which is then used by rcu_check_callbacks() to
    determine if it should believe idle_cpu().
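
    A hedged sketch of the resulting check, using the names quoted above
    (rcu_qsctr_inc(), in_softirq(), hardirq_count() and HARDIRQ_SHIFT are
    taken from the existing RCU code and may not match the actual patch
    exactly):

    /* Set by rcu_idle_now_means_idle() once rest_init() starts the scheduler. */
    static int rcu_idle_cpu_truthful __read_mostly;

    void rcu_check_callbacks(int cpu, int user)
    {
        if (user ||
            (rcu_idle_cpu_truthful && idle_cpu(cpu) &&
             !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
            /*
             * Believe idle_cpu() only after the scheduler is running;
             * during early boot the "idle" task is doing the boot work.
             */
            rcu_qsctr_inc(cpu);     /* note a quiescent state */
        }
        /* ... remainder of callback processing elided ... */
    }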

    Because this patch has the effect of disallowing RCU grace periods
    during long stretches of the boot-up sequence, this patch also introduces
    Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
    no-op if num_online_cpus() returns 1. This allows boot-time code that
    calls synchronize_rcu() to proceed normally. Note, however, that RCU
    callbacks registered by call_rcu() will likely queue up until later in
    the boot sequence. Although rcuclassic and rcutree can also use this
    same optimization after boot completes, rcupreempt must restrict its
    use of this optimization to the portion of the boot sequence before the
    scheduler starts up, given that an rcupreempt RCU read-side critical
    section may be preempted.
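
    A hedged sketch of that short-circuit together with rcu_blocking_is_gp()
    as described under "Changes since v2" below (the rcu_synchronize /
    wakeme_after_rcu plumbing is the standard completion-based wait and is
    assumed here, not quoted from this patch):

    /* rcuclassic/rcutree flavor: on UP, blocking is itself a grace period. */
    static inline int rcu_blocking_is_gp(void)
    {
        return num_online_cpus() == 1;
    }

    void synchronize_rcu(void)
    {
        struct rcu_synchronize rcu;

        if (rcu_blocking_is_gp())
            return;         /* single CPU: no concurrent readers possible */

        init_completion(&rcu.completion);
        call_rcu(&rcu.head, wakeme_after_rcu);  /* queue wake-up callback */
        wait_for_completion(&rcu.completion);   /* wait out the grace period */
    }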

    In addition, this patch takes Nick Piggin's suggestion to make the
    system_state global variable be __read_mostly.

    Changes since v4:

    o Changes the name of the introduced function and variable to
    be less emotional. ;-)

    Changes since v3:

    o Added WARN_ON(nr_context_switches() > 0) to verify that RCU
    switches out of boot-time mode before the first context
    switch, as suggested by Nick Piggin.

    Changes since v2:

    o Created rcu_blocking_is_gp() internal-to-RCU API that
    determines whether a call to synchronize_rcu() is itself
    a grace period.

    o The definition of rcu_blocking_is_gp() for rcuclassic and
    rcutree checks to see if but a single CPU is online.

    o The definition of rcu_blocking_is_gp() for rcupreempt
    checks to see both if but a single CPU is online and if
    the system is still in early boot.

    This allows rcupreempt to again work correctly if running
    on a single CPU after booting is complete.

    o Added check to rcupreempt's synchronize_sched() for there
    being but one online CPU.

    Tested all three variants both SMP and !SMP, booted fine, passed a short
    rcutorture test on both x86 and Power.

    Located-by: Vegard Nossum
    Tested-by: Vegard Nossum
    Tested-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

05 Jan, 2009

1 commit

  • Impact: cleanup

    Expand the macro into the two files that use it.

    The synchronize_rcu_xxx macro is quite ugly and it's only used by two
    callers, so expand it instead. This makes this code easier to change.
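
    For reference, a hedged sketch of the macro in question (the exact
    definition in the tree may differ slightly); expanding it simply writes
    this body out by hand in each of the two files:

    #define synchronize_rcu_xxx(name, func) \
    void name(void) \
    { \
        struct rcu_synchronize rcu; \
    \
        init_completion(&rcu.completion); \
        func(&rcu.head, wakeme_after_rcu); \
        wait_for_completion(&rcu.completion); \
    }

    /* Stamped out the waiter once per flavor, e.g.: */
    synchronize_rcu_xxx(synchronize_rcu, call_rcu)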

    Signed-off-by: Andi Kleen
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

01 Jan, 2009

1 commit

  • Impact: use new cpumask API.

    rcu_ctrlblk contains a cpumask, and it's highly optimized so I don't want
    a cpumask_var_t (ie. a pointer) for the CONFIG_CPUMASK_OFFSTACK case. It
    could use a dangling bitmap, and be allocated in __rcu_init to save memory,
    but for the moment we use a bitmap.

    (Eventually 'struct cpumask' will be undefined for CONFIG_CPUMASK_OFFSTACK,
    so we use a bitmap here to show we really mean it).

    We remove on-stack cpumasks, using cpumask_var_t for
    rcu_torture_shuffle_tasks() and for_each_cpu_and in force_quiescent_state().
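
    A hedged example of the conversion pattern used in those two places
    (the function and variable names here are illustrative, not the actual
    rcutorture/rcuclassic code):

    static void example_shuffle(void)
    {
        cpumask_var_t tmp_mask;     /* was: cpumask_t tmp_mask on the stack */
        int cpu;

        if (!alloc_cpumask_var(&tmp_mask, GFP_KERNEL))
            return;     /* only a pointer when CONFIG_CPUMASK_OFFSTACK=y */

        cpumask_setall(tmp_mask);
        for_each_cpu_and(cpu, tmp_mask, cpu_online_mask)
            ;           /* operate on each online CPU in the mask */

        free_cpumask_var(tmp_mask);
    }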

    Signed-off-by: Rusty Russell

    Rusty Russell
     

19 Dec, 2008

1 commit

  • This patch fixes a long-standing performance bug in classic RCU that
    results in massive internal-to-RCU lock contention on systems with
    more than a few hundred CPUs. Although this patch creates a separate
    flavor of RCU for ease of review and patch maintenance, it is intended
    to replace classic RCU.

    This patch still handles stress better than does mainline, so I am still
    calling it ready for inclusion. This patch is against the -tip tree.
    Nevertheless, experience on an actual 1000+ CPU machine would still be
    most welcome.

    Most of the changes noted below were found while creating an rcutiny
    (which should permit ejecting the current rcuclassic) and while doing
    detailed line-by-line documentation.

    Updates from v9 (http://lkml.org/lkml/2008/12/2/334):

    o Fixes from remainder of line-by-line code walkthrough,
    including comment spelling, initialization, undesirable
    narrowing due to type conversion, removing redundant memory
    barriers, removing redundant local-variable initialization,
    and removing redundant local variables.

    I do not believe that any of these fixes address the CPU-hotplug
    issues that Andi Kleen was seeing, but please do give it a whirl
    in case the machine is smarter than I am.

    A writeup from the walkthrough may be found at the following
    URL, in case you are suffering from terminal insomnia or
    masochism:

    http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf

    o Made rcutree tracing use seq_file, as suggested some time
    ago by Lai Jiangshan.

    o Added a .csv variant of the rcudata debugfs trace file, to allow
    people having thousands of CPUs to drop the data into
    a spreadsheet. Tested with oocalc and gnumeric. Updated
    documentation to suit.

    Updates from v8 (http://lkml.org/lkml/2008/11/15/139):

    o Fix a theoretical race between grace-period initialization and
    force_quiescent_state() that could occur if more than three
    jiffies were required to carry out the grace-period
    initialization. Which it might, if you had enough CPUs.

    o Apply Ingo's printk-standardization patch.

    o Substitute local variables for repeated accesses to global
    variables.

    o Fix comment misspellings and redundant (but harmless) increments
    of ->n_rcu_pending (this latter after having explicitly added it).

    o Apply checkpatch fixes.

    Updates from v7 (http://lkml.org/lkml/2008/10/10/291):

    o Fixed a number of problems noted by Gautham Shenoy, including
    the cpu-stall-detection bug that he was having difficulty
    convincing me was real. ;-)

    o Changed cpu-stall detection to wait for ten seconds rather than
    three in order to reduce false positives, as suggested by Ingo
    Molnar.

    o Produced a design document (http://lwn.net/Articles/305782/).
    The act of writing this document uncovered a number of both
    theoretical and "here and now" bugs as noted below.

    o Fix dynticks_nesting accounting confusion, simplify WARN_ON()
    condition, fix kerneldoc comments, and add memory barriers
    in dynticks interface functions.

    o Add more data to tracing.

    o Remove unused "rcu_barrier" field from rcu_data structure.

    o Count calls to rcu_pending() from scheduling-clock interrupt
    to use as a surrogate timebase should jiffies stop counting.

    o Fix a theoretical race between force_quiescent_state() and
    grace-period initialization. Yes, initialization does have to
    go on for some jiffies for this race to occur, but given enough
    CPUs...

    Updates from v6 (http://lkml.org/lkml/2008/9/23/448):

    o Fix a number of checkpatch.pl complaints.

    o Apply review comments from Ingo Molnar and Lai Jiangshan
    on the stall-detection code.

    o Fix several bugs in !CONFIG_SMP builds.

    o Fix a misspelled config-parameter name so that RCU now announces
    at boot time if stall detection is configured.

    o Run tests on numerous combinations of configuration parameters,
    which after the fixes above, now build and run correctly.

    Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):

    o Fix a compiler error in the !CONFIG_RCU_FANOUT_EXACT case (blew a
    changeset some time ago, and finally got around to retesting
    this option).

    o Fix some tracing bugs in rcupreempt that caused incorrect
    totals to be printed.

    o I now test with a more brutal random-selection online/offline
    script (attached). Probably more brutal than it needs to be
    on the people reading it as well, but so it goes.

    o A number of optimizations and usability improvements:

    o Make rcu_pending() ignore the grace-period timeout when
    there is no grace period in progress.

    o Make force_quiescent_state() avoid going for a global
    lock in the case where there is no grace period in
    progress.

    o Rearrange struct fields to improve struct layout.

    o Make call_rcu() initiate a grace period if RCU was
    idle, rather than waiting for the next scheduling
    clock interrupt.

    o Invoke rcu_irq_enter() and rcu_irq_exit() only when
    idle, as suggested by Andi Kleen. I still don't
    completely trust this change, and might back it out.

    o Make CONFIG_RCU_TRACE be the single config variable
    manipulated for all forms of RCU, instead of the prior
    confusion.

    o Document tracing files and formats for both rcupreempt
    and rcutree.

    Updates from v4 for those missing v5 given its bad subject line:

    o Separated dynticks interface so that NMIs and irqs call separate
    functions, greatly simplifying it. In particular, this code
    no longer requires a proof of correctness. ;-)

    o Separated dynticks state out into its own per-CPU structure,
    avoiding the duplicated accounting.

    o The case where a dynticks-idle CPU runs an irq handler that
    invokes call_rcu() is now correctly handled, forcing that CPU
    out of dynticks-idle mode.

    o Review comments have been applied (thank you all!!!).
    For but one example, fixed the dynticks-ordering issue that
    Manfred pointed out, saving me much debugging. ;-)

    o Adjusted rcuclassic and rcupreempt to handle dynticks changes.

    Attached is an updated patch to Classic RCU that applies a hierarchy,
    greatly reducing the contention on the top-level lock for large machines.
    This passes 10-hour concurrent rcutorture and online-offline testing on
    128-CPU ppc64 without dynticks enabled, and exposes some timekeeping
    bugs in the presence of dynticks (it is exciting working on a system
    where "sleep 1" hangs until interrupted...), which were fixed in the
    2.6.27 kernel. It is getting more reliable than mainline by some
    measures, so the next version will be against -tip for inclusion.
    See also Manfred Spraul's recent patches (or his earlier work from
    2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2).
    We will converge onto a common patch in the fullness of time, but are
    currently exploring different regions of the design space. That said,
    I have already gratefully stolen quite a few of Manfred's ideas.

    This patch provides CONFIG_RCU_FANOUT, which controls the bushiness
    of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on
    64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT,
    there is no hierarchy. By default, the RCU initialization code will
    adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA
    architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable
    this balancing, allowing the hierarchy to be exactly aligned to the
    underlying hardware. Up to two levels of hierarchy are permitted
    (in addition to the root node), allowing up to 16,384 CPUs on 32-bit
    systems and up to 262,144 CPUs on 64-bit systems. I just know that I
    am going to regret saying this, but this seems more than sufficient
    for the foreseeable future. (Some architectures might wish to set
    CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs.
    If this becomes a real problem, additional levels can be added, but I
    doubt that it will make a significant difference on real hardware.)
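
    A hedged sketch of how the fanout maps to hierarchy depth (the constant
    names approximate rcutree.h and are not quoted from this patch):

    #define RCU_FANOUT      CONFIG_RCU_FANOUT       /* 32 or 64 by default */
    #define RCU_FANOUT_SQ   (RCU_FANOUT * RCU_FANOUT)
    #define RCU_FANOUT_CUBE (RCU_FANOUT_SQ * RCU_FANOUT)

    #if NR_CPUS <= RCU_FANOUT
    #  define NUM_RCU_LVLS  1   /* root node only: no real hierarchy */
    #elif NR_CPUS <= RCU_FANOUT_SQ
    #  define NUM_RCU_LVLS  2   /* root plus one level of leaf rcu_node */
    #elif NR_CPUS <= RCU_FANOUT_CUBE
    #  define NUM_RCU_LVLS  3   /* root plus two levels, the stated maximum */
    #else
    #  error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
    #endif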

    In the common case, a given CPU will manipulate its private rcu_data
    structure and the rcu_node structure that it shares with its immediate
    neighbors. This can reduce both lock and memory contention by multiple
    orders of magnitude, which should eliminate the need for the strange
    manipulations that are reported to be required when running Linux on
    very large systems.
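
    A hedged structural sketch of that common case (field names are
    illustrative rather than the exact rcutree.c layout):

    struct rcu_node {
        spinlock_t      lock;       /* contended by at most ~RCU_FANOUT CPUs */
        unsigned long   qsmask;     /* CPUs/children still owing a quiescent state */
        struct rcu_node *parent;    /* next level toward the root */
    };

    struct rcu_data {               /* strictly per-CPU: no cross-CPU contention */
        long            completed;  /* grace-period bookkeeping (illustrative) */
        struct rcu_node *mynode;    /* leaf shared only with immediate neighbors */
    };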

    Some shortcomings:

    o More bugs will probably surface as a result of an ongoing
    line-by-line code inspection.

    Patches will be provided as required.

    o There are probably hangs, rcutorture failures, &c. Seems
    quite stable on a 128-CPU machine, but that is kind of small
    compared to 4096 CPUs. However, seems to do better than
    mainline.

    Patches will be provided as required.

    o The memory footprint of this version is several KB larger
    than rcuclassic.

    A separate UP-only rcutiny patch will be provided, which will
    reduce the memory footprint significantly, even compared
    to the old rcuclassic. One such patch passes light testing,
    and has a memory footprint smaller even than rcuclassic.
    Initial reaction from various embedded guys was "it is not
    worth it", so am putting it aside.

    Credits:

    o Manfred Spraul for ideas, review comments, and bugs spotted,
    as well as some good friendly competition. ;-)

    o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
    Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton
    for reviews and comments.

    o Thomas Gleixner for much-needed help with some timer issues
    (see patches below).

    o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos,
    Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton
    Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines
    alive despite my heavy abuse^Wtesting.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

20 Oct, 2008

1 commit


18 Aug, 2008

1 commit


16 Jul, 2008

2 commits


15 Jul, 2008

1 commit


11 Jul, 2008

2 commits

  • Conflicts:

    include/linux/rculist.h
    kernel/rcupreempt.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • PREEMPT_RCU without HOTPLUG_CPU is broken. The rcu_online_cpu is called
    to initially populate rcu_cpu_online_map with all online CPUs when the
    hotplug event handler is installed, and also to populate the map with
    CPUs as they come online. The former case is meant to happen with and
    without HOTPLUG_CPU, but without HOTPLUG_CPU, the rcu_offline_cpu
    function is no-oped -- while it still gets called, it does not set the
    rcu CPU map.

    With a blank RCU CPU map, grace periods get to tick by completely
    oblivious to active RCU read side critical sections. This results in
    free-before-grace bugs.

    Fix is obvious once the problem is known. (Also, change __devinit to
    __cpuinit so the function gets thrown away on !HOTPLUG_CPU kernels).
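
    A hedged sketch of the fixed helper based on the description above (the
    lock name rcu_ctrlblk.fliplock is an assumption; the exact diff may
    differ):

    static void __cpuinit rcu_online_cpu(int cpu)
    {
        unsigned long flags;

        spin_lock_irqsave(&rcu_ctrlblk.fliplock, flags);
        cpu_set(cpu, rcu_cpu_online_map);   /* grace periods must now wait on this CPU */
        spin_unlock_irqrestore(&rcu_ctrlblk.fliplock, flags);
    }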

    Signed-off-by: Nick Piggin
    Reported-and-tested-by: Alexey Dobriyan
    Acked-by: Ingo Molnar
    Cc: Paul E. McKenney
    [ Nick is my personal hero of the day - Linus ]
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

06 Jul, 2008

1 commit


23 Jun, 2008

1 commit


19 Jun, 2008

1 commit


16 Jun, 2008

1 commit


25 May, 2008

1 commit

  • As git-grep shows, open_softirq() is always called with the last argument
    being NULL

    block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
    kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
    kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
    kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
    kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
    net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
    net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

    This observation has already been made by Matthew Wilcox in June 2002
    (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

    "I notice that none of the current softirq routines use the data element
    passed to them."

    and the situation hasn't changed since then. So it appears we can safely
    remove that extra argument to save 128 (54) bytes of kernel data (text).
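
    A hedged sketch of the resulting interface and one converted caller:

    /* New prototype: the unused void *data parameter is gone. */
    void open_softirq(int nr, void (*action)(struct softirq_action *));

    static void rcu_process_callbacks(struct softirq_action *unused)
    {
        /* ... process this CPU's pending RCU callbacks ... */
    }

    /* e.g. during RCU initialization: */
        open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);   /* was: ..., NULL); */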

    Signed-off-by: Carlos R. Mafra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Carlos R. Mafra
     

24 May, 2008

1 commit


19 May, 2008

4 commits

  • Removed duplicated include file in kernel/rcupreempt.c.

    Signed-off-by: Huang Weiyi
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Huang Weiyi
     
  • The comment was correct -- need to make the code match the comment.
    Without this patch, if a CPU goes dynticks idle (and stays there forever)
    in just the right phase of preemptible-RCU grace-period processing,
    grace periods stall. The offending sequence of events (courtesy
    of Promela/spin, at least after I got the liveness criterion coded
    correctly...) is as follows:

    o CPU 0 is in dynticks-idle mode. Its dynticks_progress_counter
    is (say) 10.

    o CPU 0 takes an interrupt, so rcu_irq_enter() increments CPU 0's
    dynticks_progress_counter to 11.

    o CPU 1 is doing RCU grace-period processing in rcu_try_flip_idle(),
    sees rcu_pending(), so invokes dyntick_save_progress_counter(),
    which in turn takes a snapshot of CPU 0's dynticks_progress_counter
    into CPU 0's rcu_dyntick_snapshot -- now set to 11. CPU 1 then
    updates the RCU grace-period state to rcu_try_flip_waitack().

    o CPU 0 returns from its interrupt, so rcu_irq_exit() increments
    CPU 0's dynticks_progress_counter to 12.

    o CPU 1 later invokes rcu_try_flip_waitack(), which notices that
    CPU 0 has not yet responded, and hence in turn invokes
    rcu_try_flip_waitack_needed(). This function examines the
    state of CPU 0's dynticks_progress_counter and rcu_dyntick_snapshot
    variables, which it copies to curr (== 12) and snap (== 11),
    respectively.

    Because curr!=snap, the first condition fails.

    Because curr-snap is only 1 and snap is odd, the second
    condition fails.

    rcu_try_flip_waitack_needed() therefore incorrectly concludes
    that it must wait for CPU 0 to explicitly acknowledge the
    counter flip.

    o CPU 0 remains forever in dynticks-idle mode, never taking
    any more hardware interrupts or any NMIs, and never running
    any more tasks. (Of course, -something- will usually eventually
    happen, which might be why we haven't seen this one in the
    wild. Still should be fixed!)

    Therefore the grace period never ends. Fix is to make the code match
    the comment, as shown below. With this fix, the above scenario
    would be satisfied with curr being even, and allow the grace period
    to proceed.
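
    A hedged reconstruction of the corrected check, consistent with the
    scenario above (the real function carries more commentary, and the
    wait-for-memory-barrier stage is handled the same way):

    static inline int rcu_try_flip_waitack_needed(int cpu)
    {
        long curr = per_cpu(dynticks_progress_counter, cpu);
        long snap = per_cpu(rcu_dyntick_snapshot, cpu);

        smp_mb();   /* pairs with barriers in the dynticks interface */

        /* CPU stayed in dynticks-idle the whole time: no ack needed. */
        if (curr == snap && (curr & 0x1) == 0)
            return 0;

        /*
         * CPU passed through, or now sits in, a dynticks-idle phase with
         * no irq handler running: test curr (not snap, as the buggy code
         * did) for being even, or for the counter having advanced past a
         * single interrupt entry/exit.
         */
        if ((curr - snap) > 2 || (curr & 0x1) == 0)
            return 0;

        return 1;   /* must wait for an explicit counter-flip acknowledgement */
    }

    In the scenario above, curr == 12 and snap == 11, so the second test now
    sees curr even and returns 0, allowing the grace period to proceed.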

    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Josh Triplett
    Cc: Dipankar Sarma
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Fourth cut of the patch to provide call_rcu_sched(). As before, this is
    to synchronize_sched() as call_rcu() is to synchronize_rcu().
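
    A hedged usage sketch of that analogy (the structure, callback, and
    helper names below are hypothetical):

    struct my_obj {
        struct rcu_head rcu;
        /* ... payload ... */
    };

    static void my_obj_free_cb(struct rcu_head *head)
    {
        kfree(container_of(head, struct my_obj, rcu));
    }

    static void retire_obj(struct my_obj *p)
    {
        /* Asynchronous: freed only after a full sched-RCU grace period. */
        call_rcu_sched(&p->rcu, my_obj_free_cb);
    }

    static void retire_obj_sync(struct my_obj *p)
    {
        synchronize_sched();    /* synchronous counterpart */
        kfree(p);
    }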

    Should be fine for experimental and -rt use, but not ready for inclusion.
    With some luck, I will be able to tell Andrew to come out of hiding on
    the next round.

    Passes multi-day rcutorture sessions with concurrent CPU hotplugging.

    Fixes since the first version include a bug that could result in
    indefinite blocking (spotted by Gautham Shenoy), better resiliency
    against CPU-hotplug operations, and other minor fixes.

    Fixes since the second version include reworking grace-period detection
    to avoid deadlocks that could happen when running concurrently with
    CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
    as well as Mathieu's fix to allow use earlier in boot.

    Fixes since the third version include a wrong-CPU bug spotted by
    Andrew, getting rid of the obsolete synchronize_kernel API that somehow
    snuck back in, merging spin_unlock() and local_irq_restore() in a
    few places, commenting the code that checks for quiescent states based
    on interrupting from user-mode execution or the idle loop, removing
    some inline attributes, and some code-style changes.

    Known/suspected shortcomings:

    o I still do not entirely trust the sleep/wakeup logic. Next step
    will be to use a private snapshot of the CPU online mask in
    rcu_sched_grace_period() -- if the CPU wasn't there at the start
    of the grace period, we don't need to hear from it. And the
    bit about accounting for changes in online CPUs inside of
    rcu_sched_grace_period() is ugly anyway.

    o It might be good for rcu_sched_grace_period() to invoke
    resched_cpu() when a given CPU wasn't responding quickly,
    but resched_cpu() is declared static...

    This patch also fixes a long-standing bug in the earlier preemptable-RCU
    implementation of synchronize_rcu() that could result in loss of
    concurrent external changes to a task's CPU affinity mask. I still cannot
    remember who reported this...

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Paul E. McKenney
     
  • rcu_batches_completed and rcu_batches_completed_bh are both declared
    in rcuclassic.h and rcupreempt.h. This patch removes the extra
    prototypes for them from rcupdate.h.

    rcu_batches_completed_bh is defined as a static inline in the rcupreempt.h
    header file. Trying to export this as EXPORT_SYMBOL_GPL causes linking problems
    with the powerpc linker. There's no need to export a static inlined function.
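
    For reference, a hedged sketch of the rcupreempt.h definition in
    question; a static inline like this is already visible to every module
    that includes the header, so there is nothing to export:

    static inline long rcu_batches_completed_bh(void)
    {
        return rcu_batches_completed();
    }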

    Modules must be compiled with the same type of RCU implementation as the
    kernel they are for.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

20 Apr, 2008

1 commit

  • * Modify sched_affinity functions to pass cpumask_t variables by reference
    instead of by value.

    * Use new set_cpus_allowed_ptr function.

    Depends on:
    [sched-devel]: sched: add new set_cpus_allowed_ptr function

    Cc: Paul Jackson
    Cc: Cliff Wickman
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

01 Mar, 2008

3 commits

  • This patch fixes a potentially invalid access to a per-CPU variable in
    rcu_process_callbacks().

    This per-CPU access needs to be done in such a way as to guarantee that
    the code using it cannot move to some other CPU before all uses of the
    value accessed have completed. Even though this code is currently only
    invoked from softirq context, which currently cannot migrate to some
    other CPU, life would be better if this code did not silently make such
    an assumption.
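
    A hedged illustration of one way to make that guarantee explicit (the
    get_cpu_var()/put_cpu_var() pairing and the rcu_data per-CPU variable
    are assumptions, not the actual diff):

    static void example_process_callbacks(void)
    {
        struct rcu_data *rdp = &get_cpu_var(rcu_data);  /* disables preemption */

        /* ... every use of rdp completes while migration is impossible ... */

        put_cpu_var(rcu_data);                          /* re-enables preemption */
    }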

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This fixes an oops encountered when doing hibernate/resume in the presence
    of
    PREEMPT_RCU.

    The problem was that the code failed to disable preemption when
    accessing a per-CPU variable. This is OK when called from code that
    already has preemption disabled, but such is not the case from the
    suspend/resume code path.

    Reported-by: Dave Young
    Tested-by: Dave Young
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The
    idle CPU will not progress RCU through its grace period, and a
    synchronize_rcu() may get stuck. Without this patch I have a box that will
    not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine
    with this patch.

    This patch comes from the -rt kernel where it has been tested for
    several months.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

26 Jan, 2008

2 commits

  • This patch allows preemptible RCU to tolerate CPU-hotplug operations.
    It accomplishes this by maintaining a local copy of a map of online
    CPUs, which it accesses under its own lock.

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch implements a new version of RCU which allows its read-side
    critical sections to be preempted. It uses a set of counter pairs
    to keep track of the read-side critical sections and flips them
    when all tasks have exited their read-side critical sections. The details
    of this implementation can be found in this paper -

    http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf

    and the article-

    http://lwn.net/Articles/253651/

    This patch was developed as a part of the -rt kernel development and
    meant to provide better latencies when read-side critical sections of
    RCU don't disable preemption. As a consequence of keeping track of RCU
    readers, the readers have a slight overhead (optimizations in the paper).
    This implementation co-exists with the "classic" RCU implementations
    and can be switched to at compile time.
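
    A heavily simplified, hedged sketch of the counter-pair idea (the real
    rcupreempt.c uses per-CPU counters, read-side nesting, memory barriers,
    and a multi-stage flip state machine; all names below are illustrative):

    static atomic_t flipctr[2];     /* one counter per grace-period phase */
    static int completed_phase;     /* parity picks the counter new readers use */

    static int demo_rcu_read_lock(void)
    {
        int idx = completed_phase & 0x1;

        atomic_inc(&flipctr[idx]);  /* record a reader in the current phase */
        return idx;                 /* reader remembers which counter it took */
    }

    static void demo_rcu_read_unlock(int idx)
    {
        atomic_dec(&flipctr[idx]);
    }

    static void demo_synchronize_rcu(void)
    {
        int old = completed_phase & 0x1;

        completed_phase++;          /* "flip": new readers use the other counter */
        while (atomic_read(&flipctr[old]) != 0)
            schedule_timeout_uninterruptible(1);    /* drain the old phase */
        /*
         * Note: this naive flip still has a window (a reader can sample the
         * old phase and increment its counter after the updater has checked
         * it); the multi-stage state machine and per-CPU acknowledgements in
         * the real implementation exist to close exactly that window.
         */
    }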

    Also includes RCU tracing summarized in debugfs.

    [ akpm@linux-foundation.org: build fixes on non-preempt architectures ]

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Dipankar Sarma
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Paul E. McKenney