26 Feb, 2009

1 commit

  • This patch fixes a bug located by Vegard Nossum with the aid of
    kmemcheck, updated based on review comments from Nick Piggin,
    Ingo Molnar, and Andrew Morton. It also cleans up the variable-name
    and function-name language. ;-)

    The boot CPU runs in the context of its idle thread during boot-up.
    During this time, idle_cpu(0) will always return nonzero, which will
    fool Classic and Hierarchical RCU into deciding that a large chunk of
    the boot-up sequence is a big long quiescent state. This in turn causes
    RCU to prematurely end grace periods during this time.

    This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
    function to ignore the idle task as a quiescent state until the
    system has started up the scheduler in rest_init(), introducing a
    new non-API function rcu_idle_now_means_idle() to inform RCU of this
    transition. RCU maintains an internal rcu_idle_cpu_truthful variable
    to track this state, which is then used by rcu_check_callbacks() to
    determine if it should believe idle_cpu().
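
    The check described above can be sketched in userspace C. This is only
    an illustration, not the kernel code: idle_cpu_stub() stands in for
    idle_cpu(), and idle_counts_as_quiescent() and boot_demo() are
    hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* False during boot; set once the scheduler is up (see rest_init() above). */
static bool rcu_idle_cpu_truthful;

/* Stub for idle_cpu(): the boot CPU runs in its idle thread, so always true. */
static bool idle_cpu_stub(int cpu)
{
	(void)cpu;
	return true;
}

/* Tells RCU that "idle" now really means idle. */
static void rcu_idle_now_means_idle(void)
{
	rcu_idle_cpu_truthful = true;
}

/* Hypothetical helper: should the idle task be treated as a quiescent state? */
static bool idle_counts_as_quiescent(int cpu)
{
	return rcu_idle_cpu_truthful && idle_cpu_stub(cpu);
}

void boot_demo(void)
{
	/* Early boot: idle_cpu() is nonzero but must not be believed. */
	assert(!idle_counts_as_quiescent(0));

	/* Scheduler started: idle is now a genuine quiescent state. */
	rcu_idle_now_means_idle();
	assert(idle_counts_as_quiescent(0));
}
```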

    Because this patch has the effect of disallowing RCU grace periods
    during long stretches of the boot-up sequence, this patch also introduces
    Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
    no-op if num_online_cpus() returns 1. This allows boot-time code that
    calls synchronize_rcu() to proceed normally. Note, however, that RCU
    callbacks registered by call_rcu() will likely queue up until later in
    the boot sequence. Although rcuclassic and rcutree can also use this
    same optimization after boot completes, rcupreempt must restrict its
    use of this optimization to the portion of the boot sequence before the
    scheduler starts up, given that an rcupreempt RCU read-side critical
    section may be preempted.

    In addition, this patch takes Nick Piggin's suggestion to make the
    system_state global variable be __read_mostly.

    Changes since v4:

    o Changes the name of the introduced function and variable to
    be less emotional. ;-)

    Changes since v3:

    o WARN_ON(nr_context_switches() > 0) to verify that RCU
    switches out of boot-time mode before the first context
    switch, as suggested by Nick Piggin.

    Changes since v2:

    o Created rcu_blocking_is_gp() internal-to-RCU API that
    determines whether a call to synchronize_rcu() is itself
    a grace period.

    o The definition of rcu_blocking_is_gp() for rcuclassic and
    rcutree checks to see if but a single CPU is online.

    o The definition of rcu_blocking_is_gp() for rcupreempt
    checks to see both if but a single CPU is online and if
    the system is still in early boot.

    This allows rcupreempt to again work correctly if running
    on a single CPU after booting is complete.

    o Added check to rcupreempt's synchronize_sched() for there
    being but one online CPU.
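
    The two rcu_blocking_is_gp() definitions listed above can be modeled
    roughly as follows. This is a sketch under stated assumptions:
    online_cpus and scheduler_started are stand-ins for num_online_cpus()
    and the early-boot check, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel state (assumptions made for this sketch). */
static int online_cpus = 1;      /* models num_online_cpus() */
static bool scheduler_started;   /* false models "still in early boot" */

/* rcuclassic and rcutree: one online CPU means blocking is a grace period. */
static bool blocking_is_gp_classic(void)
{
	return online_cpus == 1;
}

/* rcupreempt: readers may be preempted, so early boot is also required. */
static bool blocking_is_gp_preempt(void)
{
	return online_cpus == 1 && !scheduler_started;
}

void blocking_is_gp_demo(void)
{
	/* Early boot, one CPU: both variants may skip the wait. */
	online_cpus = 1;
	scheduler_started = false;
	assert(blocking_is_gp_classic() && blocking_is_gp_preempt());

	/* After boot on one CPU: only the non-preemptible variants may. */
	scheduler_started = true;
	assert(blocking_is_gp_classic() && !blocking_is_gp_preempt());

	/* A second CPU comes online: neither variant may. */
	online_cpus = 2;
	assert(!blocking_is_gp_classic() && !blocking_is_gp_preempt());
}
```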

    Tested all three variants both SMP and !SMP, booted fine, passed a short
    rcutorture test on both x86 and Power.

    Located-by: Vegard Nossum
    Tested-by: Vegard Nossum
    Tested-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

15 Aug, 2008

2 commits

  • Seems that I found a box that has a config that passes call_rcu_bh as a
    function pointer (see net/sctp/sm_make_chunk.c), so declaring the
    call_rcu_bh as a macro function isn't good enough.

    This patch makes it just another name of call_rcu for rcupreempt.
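
    To illustrate why a function-like macro breaks function-pointer users,
    here is a small userspace sketch. The struct rcu_head and call_rcu()
    below are simplified stand-ins, and the object-like #define is one
    plausible way to make call_rcu_bh "just another name"; it is an
    assumption of this sketch rather than a quote from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel structure. */
struct rcu_head {
	void (*func)(struct rcu_head *head);
};

static int queued;

/* Stand-in for call_rcu(): records the callback instead of deferring it. */
static void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
	head->func = func;
	queued++;
}

/*
 * A function-like macro, e.g.
 *     #define call_rcu_bh(head, func) call_rcu(head, func)
 * only expands when the name is followed by '(', so the bare name
 * call_rcu_bh could not be used as a function pointer.
 * An object-like macro makes the name itself an alias.
 */
#define call_rcu_bh call_rcu

typedef void (*rcu_queue_fn)(struct rcu_head *, void (*)(struct rcu_head *));

static void noop_callback(struct rcu_head *head)
{
	(void)head;
}

void call_rcu_bh_demo(void)
{
	struct rcu_head head = { NULL };
	rcu_queue_fn fn = call_rcu_bh;  /* works: expands to call_rcu */

	fn(&head, noop_callback);
	assert(head.func == noop_callback);
}
```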

    Signed-off-by: Steven Rostedt
    Reviewed-by: "Paul E. McKenney"
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Hello!

    Compared tip/core/rcu to my latest patchset, and found the following
    issues:

    o the memory barrier in rcu_exit_nohz() somehow got out of place
    (it is correct in mainline as of 2.6.26-rc7).

    o There is a duplicate declaration of rcu_dyntick_sched.

    The attached patch fixes these.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

26 Jul, 2008

1 commit

  • All ratelimit users share the same jiffies and burst params, so some
    messages (callbacks) will be lost.

    For example:
    a calls printk_ratelimit(5 * HZ, 1)
    b calls printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b
    will be suppressed.

    - rewrite __ratelimit to take a ratelimit_state as a parameter. Thanks
    for hints from Andrew.

    - Add WARN_ON_RATELIMIT, update rcupreempt.h

    - remove __printk_ratelimit

    - use __ratelimit in net_ratelimit
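
    The idea of per-caller state can be sketched in userspace C. The
    structure layout, the explicit "now" argument, and the names
    ratelimit_ok() and ratelimit_demo() are assumptions made for this
    illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Each caller owns one of these, so callers no longer share a window. */
struct ratelimit_state {
	unsigned long interval;  /* window length in ticks */
	int burst;               /* messages allowed per window */
	int printed;             /* messages allowed so far in this window */
	int missed;              /* messages suppressed in this window */
	unsigned long begin;     /* tick at which the window started */
	bool started;
};

/* Returns true if the caller may emit a message at time "now". */
static bool ratelimit_ok(struct ratelimit_state *rs, unsigned long now)
{
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

void ratelimit_demo(void)
{
	struct ratelimit_state a = { .interval = 500, .burst = 1 };
	struct ratelimit_state b = { .interval = 500, .burst = 1 };

	assert(ratelimit_ok(&a, 0));    /* a's first message passes */
	assert(ratelimit_ok(&b, 10));   /* b is not suppressed by a */
	assert(!ratelimit_ok(&a, 20));  /* a is still inside its own window */
	assert(ratelimit_ok(&a, 500));  /* a's window has expired */
}
```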

    Signed-off-by: Dave Young
    Cc: "David S. Miller"
    Cc: "Paul E. McKenney"
    Cc: Dave Young
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Young
     

19 May, 2008

1 commit

  • Fourth cut of patch to provide the call_rcu_sched(). This is again to
    synchronize_sched() as call_rcu() is to synchronize_rcu().

    Should be fine for experimental and -rt use, but not ready for inclusion.
    With some luck, I will be able to tell Andrew to come out of hiding on
    the next round.

    Passes multi-day rcutorture sessions with concurrent CPU hotplugging.

    Fixes since the first version include a bug that could result in
    indefinite blocking (spotted by Gautham Shenoy), better resiliency
    against CPU-hotplug operations, and other minor fixes.

    Fixes since the second version include reworking grace-period detection
    to avoid deadlocks that could happen when running concurrently with
    CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
    as well as Mathieu's fix to allow use earlier in boot.

    Fixes since the third version include a wrong-CPU bug spotted by
    Andrew, getting rid of the obsolete synchronize_kernel API that somehow
    snuck back in, merging spin_unlock() and local_irq_restore() in a
    few places, commenting the code that checks for quiescent states based
    on interrupting from user-mode execution or the idle loop, removing
    some inline attributes, and some code-style changes.

    Known/suspected shortcomings:

    o I still do not entirely trust the sleep/wakeup logic. Next step
    will be to use a private snapshot of the CPU online mask in
    rcu_sched_grace_period() -- if the CPU wasn't there at the start
    of the grace period, we don't need to hear from it. And the
    bit about accounting for changes in online CPUs inside of
    rcu_sched_grace_period() is ugly anyway.

    o It might be good for rcu_sched_grace_period() to invoke
    resched_cpu() when a given CPU wasn't responding quickly,
    but resched_cpu() is declared static...

    This patch also fixes a long-standing bug in the earlier preemptable-RCU
    implementation of synchronize_rcu() that could result in loss of
    concurrent external changes to a task's CPU affinity mask. I still cannot
    remember who reported this...

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Paul E. McKenney
     

30 Apr, 2008

1 commit


20 Mar, 2008

1 commit

  • In the process of writing up the mechanical proof of correctness for the
    dynticks/preemptable-RCU interface, I noticed misplaced memory barriers in
    rcu_enter_nohz() and rcu_exit_nohz().

    This patch puts them in the right place and adds a comment. The key thing to
    keep in mind is that rcu_enter_nohz() is -exiting- the mode that can legally
    execute RCU read-side critical sections.

    The memory barrier must be between any potential RCU read-side critical
    sections and the increment of the per-CPU dynticks_progress_counter, and thus
    must come -before- this increment. And vice versa for rcu_exit_nohz().

    The locking in the scheduler is probably saving us for the moment.

    Also, switch to smp_mb() - we don't need a barrier for uniprocessor kernels.
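
    The ordering described above can be approximated in userspace with C11
    atomics. This is a sketch only: a sequentially consistent fence stands
    in for smp_mb(), a single global counter stands in for the per-CPU
    dynticks_progress_counter, and the *_sketch and *_demo names are
    hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Per-CPU in the kernel; a single counter suffices for this sketch. */
static atomic_long dynticks_progress_counter;

/* Leaving the mode in which read-side critical sections are legal. */
static void enter_nohz_sketch(void)
{
	/* Barrier first: prior critical sections precede the increment. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_add_explicit(&dynticks_progress_counter, 1,
				  memory_order_relaxed);
}

/* Re-entering the mode in which read-side critical sections are legal. */
static void exit_nohz_sketch(void)
{
	atomic_fetch_add_explicit(&dynticks_progress_counter, 1,
				  memory_order_relaxed);
	/* Barrier last: the increment precedes later critical sections. */
	atomic_thread_fence(memory_order_seq_cst);
}

void nohz_demo(void)
{
	long before = atomic_load(&dynticks_progress_counter);

	enter_nohz_sketch();
	exit_nohz_sketch();
	assert(atomic_load(&dynticks_progress_counter) == before + 2);
}
```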

    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

01 Mar, 2008

1 commit

  • The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The
    idle CPU will not progress the RCU through its grace period and a
    synchronize_rcu() may get stuck. Without this patch I have a box that will
    not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine
    with this patch.

    This patch comes from the -rt kernel where it has been tested for
    several months.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

09 Feb, 2008

1 commit


26 Jan, 2008

1 commit

  • This patch implements a new version of RCU which allows its read-side
    critical sections to be preempted. It uses a set of counter pairs
    to keep track of the read-side critical sections and flips them
    when all tasks exit their read-side critical sections. The details
    of this implementation can be found in this paper -

    http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf

    and the article-

    http://lwn.net/Articles/253651/
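
    As a rough illustration of the counter-pair idea, here is a heavily
    simplified, single-threaded sketch. It uses one counter pair with no
    locking or memory barriers, and all names (rcu_flipctr, current_idx,
    and the sketch_* functions) are assumptions of this example; the paper
    and article above describe the real algorithm.

```c
#include <assert.h>
#include <stdbool.h>

static int rcu_flipctr[2];  /* one counter pair; the patch uses a set */
static int current_idx;     /* which counter new readers increment */

/* Each reader remembers which counter it incremented. */
struct reader {
	int idx;
};

static void sketch_read_lock(struct reader *r)
{
	r->idx = current_idx;
	rcu_flipctr[r->idx]++;
}

static void sketch_read_unlock(struct reader *r)
{
	rcu_flipctr[r->idx]--;
}

/* Flip: new readers use the other counter. Returns the index to drain. */
static int sketch_flip(void)
{
	int old = current_idx;

	current_idx = !current_idx;
	return old;
}

/* The grace period may end once the old counter has drained to zero. */
static bool sketch_drained(int old)
{
	return rcu_flipctr[old] == 0;
}

void flip_demo(void)
{
	struct reader r1, r2;
	int old;

	sketch_read_lock(&r1);         /* pre-existing reader */
	old = sketch_flip();
	sketch_read_lock(&r2);         /* new reader, other counter */
	assert(!sketch_drained(old));  /* r1 still blocks the grace period */

	sketch_read_unlock(&r1);
	assert(sketch_drained(old));   /* r2 does not block it */
	sketch_read_unlock(&r2);
}
```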

    This patch was developed as a part of the -rt kernel development and
    is meant to provide better latencies when read-side critical sections of
    RCU don't disable preemption. As a consequence of keeping track of RCU
    readers, the readers incur a slight overhead (the paper describes
    optimizations). This implementation co-exists with the "classic" RCU
    implementations and can be selected at compile time.

    Also includes RCU tracing summarized in debugfs.

    [ akpm@linux-foundation.org: build fixes on non-preempt architectures ]

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Dipankar Sarma
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Paul E. McKenney