16 Apr, 2009

1 commit

  • Don't try to predeclare inline functions like this:

    static inline void wait_migrated_callbacks(void)
    ...
    static void _rcu_barrier(enum rcu_barrier type)
    {
    ...
            wait_migrated_callbacks();
    }
    ...
    static inline void wait_migrated_callbacks(void)
    {
            wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
    }

    as it upsets some versions of gcc under some circumstances:

    kernel/rcupdate.c: In function `_rcu_barrier':
    kernel/rcupdate.c:125: sorry, unimplemented: inlining failed in call to 'wait_migrated_callbacks': function body not available
    kernel/rcupdate.c:152: sorry, unimplemented: called from here

    This can be dealt with by simply putting the static variables (rcu_migrate_*)
    at the top, and moving the implementation of the function up so that it
    replaces its forward declaration.
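
    In other words, a minimal sketch of the fixed layout (details elided
    with "..." as in the snippet above; the initializers of the
    rcu_migrate_* variables are assumptions, not quoted from the patch):

    static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
    static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);

    static inline void wait_migrated_callbacks(void)
    {
            wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
    }
    ...
    static void _rcu_barrier(enum rcu_barrier type)
    {
    ...
            wait_migrated_callbacks();
    }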

    Signed-off-by: David Howells
    Cc: Dipankar Sarma
    Cc: Paul E. McKenney
    Signed-off-by: Linus Torvalds

    David Howells
     

31 Mar, 2009

1 commit

  • CPU hotplug may happen asynchronously, so some RCU callbacks may
    still be on a dead CPU. rcu_barrier() also needs to wait for those
    callbacks to complete, so we must ensure that callbacks on a dead
    CPU are migrated to an online CPU (sketched after the review below).

    Paul E. McKenney's review:

    Good stuff, Lai!!! Simpler than any of the approaches that I was
    considering, and, better yet, independent of the underlying RCU
    implementation!!!

    I was initially worried that wake_up() might wake only one of two
    possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
    but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that issue.
    I was also worried about the fact that different RCU implementations have
    different mappings of call_rcu(), call_rcu_bh(), and call_rcu_sched(), but
    this is OK as well because we just get an extra (harmless) callback in the
    case that they map together (for example, Classic RCU has call_rcu_sched()
    mapping to call_rcu()).

    Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
    and any stray callbacks that arrive (for example, from irq handlers
    running on the dying CPU) either are ahead of the CPU_DYING callbacks on
    the one hand (and thus accounted for), or happened after the rcu_barrier()
    started on the other (and thus don't need to be accounted for).
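
    For illustration, a hedged sketch of the marker-callback idea described
    above, reusing the rcu_migrate_* names that appear in the 16 Apr entry;
    the actual patch may differ in detail:

    static struct rcu_head rcu_migrate_head[3];

    static void rcu_migrate_callback(struct rcu_head *notused)
    {
            /* Last marker invoked: every callback queued on the dying CPU
             * before the markers has been migrated and run. */
            if (atomic_dec_and_test(&rcu_migrate_type_count))
                    wake_up(&rcu_migrate_wq);
    }

    static int rcu_barrier_cpu_hotplug(struct notifier_block *self,
                                       unsigned long action, void *hcpu)
    {
            if (action == CPU_DYING) {
                    /* Queue one marker per RCU flavor on the dying CPU;
                     * each rides behind that flavor's pending callbacks. */
                    atomic_set(&rcu_migrate_type_count, 3);
                    call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
                    call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
                    call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
            }
            return NOTIFY_OK;
    }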

    Signed-off-by: Lai Jiangshan
    Reviewed-by: Paul E. McKenney
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

26 Feb, 2009

1 commit

  • This patch fixes a bug located by Vegard Nossum with the aid of
    kmemcheck, updated based on review comments from Nick Piggin,
    Ingo Molnar, and Andrew Morton. It also cleans up the variable-name
    and function-name language. ;-)

    The boot CPU runs in the context of its idle thread during boot-up.
    During this time, idle_cpu(0) will always return nonzero, which will
    fool Classic and Hierarchical RCU into deciding that a large chunk of
    the boot-up sequence is a big long quiescent state. This in turn causes
    RCU to prematurely end grace periods during this time.

    This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
    function to ignore the idle task as a quiescent state until the
    system has started up the scheduler in rest_init(), introducing a
    new non-API function rcu_idle_now_means_idle() to inform RCU of this
    transition. RCU maintains an internal rcu_idle_cpu_truthful variable
    to track this state, which is then used by rcu_check_callbacks() to
    determine whether it should believe idle_cpu().

    Because this patch has the effect of disallowing RCU grace periods
    during long stretches of the boot-up sequence, this patch also introduces
    Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
    no-op if num_online_cpus() returns 1. This allows boot-time code that
    calls synchronize_rcu() to proceed normally. Note, however, that RCU
    callbacks registered by call_rcu() will likely queue up until later in
    the boot sequence. Although rcuclassic and rcutree can also use this
    same optimization after boot completes, rcupreempt must restrict its
    use of this optimization to the portion of the boot sequence before the
    scheduler starts up, given that an rcupreempt RCU read-side critical
    section may be preempted.
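
    A minimal sketch of this short-circuit, using the rcu_blocking_is_gp()
    helper named in the change log below (assuming the rcuclassic/rcutree
    variant; the rest of synchronize_rcu() is elided):

    static inline int rcu_blocking_is_gp(void)
    {
            /* With one online CPU, no other CPU can hold an RCU reader. */
            return num_online_cpus() == 1;
    }

    void synchronize_rcu(void)
    {
            struct rcu_synchronize rcu;

            if (rcu_blocking_is_gp())
                    return;         /* boot-time / UP fast path */
    ...
    }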

    In addition, this patch takes Nick Piggin's suggestion to make the
    system_state global variable be __read_mostly.

    Changes since v4:

    o Changes the name of the introduced function and variable to
    be less emotional. ;-)

    Changes since v3:

    o Added WARN_ON(nr_context_switches() > 0) to verify that RCU
    switches out of boot-time mode before the first context
    switch, as suggested by Nick Piggin.

    Changes since v2:

    o Created rcu_blocking_is_gp() internal-to-RCU API that
    determines whether a call to synchronize_rcu() is itself
    a grace period.

    o The definition of rcu_blocking_is_gp() for rcuclassic and
    rcutree checks whether only a single CPU is online.

    o The definition of rcu_blocking_is_gp() for rcupreempt
    checks both whether only a single CPU is online and whether
    the system is still in early boot.

    This allows rcupreempt to again work correctly if running
    on a single CPU after booting is complete.

    o Added a check to rcupreempt's synchronize_sched() for there
    being only one online CPU.

    Tested all three variants both SMP and !SMP, booted fine, passed a short
    rcutorture test on both x86 and Power.

    Located-by: Vegard Nossum
    Tested-by: Vegard Nossum
    Tested-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

05 Jan, 2009

1 commit

  • Impact: cleanup

    Expand macro into two files.

    The synchronize_rcu_xxx macro is quite ugly and is only used by two
    callers, so expand it instead. This makes the code easier to change.
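
    For reference, the expanded form of one caller looks roughly like this
    (a sketch of what the macro stamped out; wakeme_after_rcu() is the
    callback that completes rcu.completion):

    void synchronize_rcu(void)
    {
            struct rcu_synchronize rcu;

            init_completion(&rcu.completion);
            /* Will wake us up after the grace period. */
            call_rcu(&rcu.head, wakeme_after_rcu);
            /* Wait for it. */
            wait_for_completion(&rcu.completion);
    }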

    Signed-off-by: Andi Kleen
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

21 Oct, 2008

1 commit

  • The current rcu_barrier_bh() looks like this:

    void rcu_barrier_bh(void)
    {
            BUG_ON(in_interrupt());
            /* Take cpucontrol mutex to protect against CPU hotplug */
            mutex_lock(&rcu_barrier_mutex);
            init_completion(&rcu_barrier_completion);
            atomic_set(&rcu_barrier_cpu_count, 0);
            /*
             * The queueing of callbacks in all CPUs must be atomic with
             * respect to RCU, otherwise one CPU may queue a callback,
             * wait for a grace period, decrement barrier count and call
             * complete(), while other CPUs have not yet queued anything.
             * So, we need to make sure that grace periods cannot complete
             * until all the callbacks are queued.
             */
            rcu_read_lock();
            on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
            rcu_read_unlock();
            wait_for_completion(&rcu_barrier_completion);
            mutex_unlock(&rcu_barrier_mutex);
    }

    The inconsistency between the code and the comment reveals a bug here.
    rcu_read_lock() cannot ensure that "grace periods for RCU_BH
    cannot complete until all the callbacks are queued";
    it only ensures that grace periods for RCU cannot complete
    until all the callbacks are queued.

    So we must use rcu_read_lock_bh() for rcu_barrier_bh(),
    like this:

    void rcu_barrier_bh(void)
    {
            ......
            rcu_read_lock_bh();
            on_each_cpu(rcu_barrier_func, (void *)RCU_BARRIER_BH, 1);
            rcu_read_unlock_bh();
            ......
    }

    rcu_barrier() and rcu_barrier_sched() would have to be changed the
    same way, which would bring a lot of duplicate code. My patch uses
    another way to fix this bug; please see the comment in the patch.
    Thanks to Paul E. McKenney, who rewrote the comment.

    Signed-off-by: Lai Jiangshan
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

21 Aug, 2008

1 commit

  • Fix RCU's synchronize_rcu() so that it looks like a C function, enabling
    it to be recognized as a function with kernel-doc annotation.

    Warning(linux-2.6.26-git11//kernel/rcupdate.c:81): No description found for parameter 'synchronize_rcu'
    Warning(linux-2.6.26-git11//kernel/rcupdate.c:81): No description found for parameter 'call_rcu'

    [akpm@linux-foundation.org: fix comment]
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Randy Dunlap
     

16 Jul, 2008

1 commit

  • Merge branch 'generic-ipi-for-linus' of
    git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    generic-ipi: more merge fallout
    generic-ipi: merge fix
    x86, visws: use mach-default/entry_arch.h
    x86, visws: fix generic-ipi build
    generic-ipi: fixlet
    generic-ipi: fix s390 build bug
    generic-ipi: fix linux-next tree build failure
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
    fix "smp_call_function: get rid of the unused nonatomic/retry argument"
    on_each_cpu(): kill unused 'retry' parameter
    smp_call_function: get rid of the unused nonatomic/retry argument
    sh: convert to generic helpers for IPI function calls
    parisc: convert to generic helpers for IPI function calls
    mips: convert to generic helpers for IPI function calls
    m32r: convert to generic helpers for IPI function calls
    arm: convert to generic helpers for IPI function calls
    alpha: convert to generic helpers for IPI function calls
    ia64: convert to generic helpers for IPI function calls
    powerpc: convert to generic helpers for IPI function calls
    ...

    Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually

    Linus Torvalds
     

26 Jun, 2008

1 commit


19 May, 2008

2 commits

  • Add rcu_barrier_sched() and rcu_barrier_bh(). With these in place,
    rcutorture no longer gives the occasional oops when repeatedly starting
    and stopping torturing rcu_bh. Also adds the API needed to flush out
    pre-existing call_rcu_sched() callbacks.
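
    A hedged usage sketch: a module whose callback functions live in module
    text must flush pending callbacks before unloading (my_exit() is a
    hypothetical example, not from the patch):

    static void __exit my_exit(void)
    {
            rcu_barrier_sched();    /* wait for call_rcu_sched() callbacks */
            rcu_barrier_bh();       /* wait for call_rcu_bh() callbacks */
    }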

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Paul E. McKenney
     
  • Fourth cut of the patch to provide call_rcu_sched(). It is to
    synchronize_sched() as call_rcu() is to synchronize_rcu().
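
    A hedged usage sketch of that analogy (struct foo and its helpers are
    hypothetical):

    struct foo {
            struct rcu_head rcu;
            int data;
    };

    static void foo_free_rcu(struct rcu_head *head)
    {
            kfree(container_of(head, struct foo, rcu));
    }

    static void foo_reclaim(struct foo *fp)
    {
            /* Asynchronous equivalent of: synchronize_sched(); kfree(fp); */
            call_rcu_sched(&fp->rcu, foo_free_rcu);
    }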

    Should be fine for experimental and -rt use, but not ready for inclusion.
    With some luck, I will be able to tell Andrew to come out of hiding on
    the next round.

    Passes multi-day rcutorture sessions with concurrent CPU hotplugging.

    Fixes since the first version include one for a bug that could result
    in indefinite blocking (spotted by Gautham Shenoy), better resiliency
    against CPU-hotplug operations, and other minor fixes.

    Fixes since the second version include reworking grace-period detection
    to avoid deadlocks that could happen when running concurrently with
    CPU hotplug, adding Mathieu's fix to avoid the softlockup messages,
    as well as Mathieu's fix to allow use earlier in boot.

    Fixes since the third version include a wrong-CPU bug spotted by
    Andrew, getting rid of the obsolete synchronize_kernel API that somehow
    snuck back in, merging spin_unlock() and local_irq_restore() in a
    few places, commenting the code that checks for quiescent states based
    on interrupting from user-mode execution or the idle loop, removing
    some inline attributes, and some code-style changes.

    Known/suspected shortcomings:

    o I still do not entirely trust the sleep/wakeup logic. Next step
    will be to use a private snapshot of the CPU online mask in
    rcu_sched_grace_period() -- if the CPU wasn't there at the start
    of the grace period, we don't need to hear from it. And the
    bit about accounting for changes in online CPUs inside of
    rcu_sched_grace_period() is ugly anyway.

    o It might be good for rcu_sched_grace_period() to invoke
    resched_cpu() when a given CPU wasn't responding quickly,
    but resched_cpu() is declared static...

    This patch also fixes a long-standing bug in the earlier preemptable-RCU
    implementation of synchronize_rcu() that could result in loss of
    concurrent external changes to a task's CPU affinity mask. I still cannot
    remember who reported this...

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Paul E. McKenney
     

14 Feb, 2008

1 commit

  • This comment caused some consternation during fastcall removal. Make it
    truthful.

    Signed-off-by: Paul E. McKenney
    Cc: Harvey Harrison
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

26 Jan, 2008

3 commits

  • Fix rcu_barrier() to work properly in a preemptive kernel environment.
    Also, the ordering of callbacks must be preserved while moving
    callbacks to another CPU during CPU hotplug.

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Dipankar Sarma
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch re-organizes the RCU code to enable multiple implementations
    of RCU. Users of RCU continue to include rcupdate.h and the
    RCU interfaces remain the same. This is in preparation for
    subsequently merging the preemptible RCU implementation.

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Dipankar Sarma
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This patch makes RCU use softirq instead of tasklets.

    It also adds a memory barrier after raising the softirq,
    in order to ensure that the CPU sees the most recently updated
    value of rcu->cur while processing callbacks.
    The discussion of the related theoretical race pointed out
    by James Huang can be found here: http://lkml.org/lkml/2007/11/20/603

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Steven Rostedt
    Signed-off-by: Dipankar Sarma
    Reviewed-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Dipankar Sarma
     

23 Jan, 2008

1 commit

  • rcu_online_cpu() should be __cpuinit instead of __devinit.

    WARNING: vmlinux.o(.text+0x4b6d5): Section mismatch: reference to .init.text: (between 'rcu_cpu_notify' and 'wakeme_after_rcu')

    Signed-off-by: Randy Dunlap
    Cc: Sam Ravnborg
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

17 Oct, 2007

1 commit


12 Oct, 2007

1 commit


10 May, 2007

1 commit

  • Since nonboot CPUs are now disabled after tasks and devices have been
    frozen and the CPU hotplug infrastructure is used for this purpose, we need
    special CPU hotplug notifications that will help the CPU-hotplug-aware
    subsystems distinguish normal CPU hotplug events from CPU hotplug events
    related to a system-wide suspend or resume operation in progress. This
    patch introduces such notifications and causes them to be used during
    suspend and resume transitions. It also changes all of the
    CPU-hotplug-aware subsystems to take these notifications into consideration
    (for now they are handled in the same way as the corresponding "normal"
    ones).
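
    A hedged sketch of what this looks like in a CPU-hotplug-aware notifier,
    treating the new _FROZEN variants like the normal events, as described
    above (handler body abbreviated):

    static int rcu_cpu_notify(struct notifier_block *self,
                              unsigned long action, void *hcpu)
    {
            long cpu = (long)hcpu;

            switch (action) {
            case CPU_UP_PREPARE:
            case CPU_UP_PREPARE_FROZEN:     /* hotplug during suspend/resume */
                    rcu_online_cpu(cpu);
                    break;
            case CPU_DEAD:
            case CPU_DEAD_FROZEN:
                    rcu_offline_cpu(cpu);
                    break;
            default:
                    break;
            }
            return NOTIFY_OK;
    }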

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Cc: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

08 Dec, 2006

1 commit

  • On some workloads (for example, when a lot of close() syscalls are
    done), the RCU queue length (qlen) can be quite large, and RCU heads
    are no longer in the CPU cache when rcu_do_batch() is called.

    This patch adds a prefetch() in rcu_do_batch() to give the CPU a hint
    to bring back the cache lines containing 'struct rcu_head's.

    Most list-manipulation macros include a prefetch(), but open-coded
    loops do not (at least with current C compilers :)).

    I got a nice speedup on a trivial benchmark (3.48 us per iteration instead
    of 3.95 us on a 1.6 GHz Pentium-M)

    while (1) { pipe(fd); close(fd[0]); close(fd[1]); }
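
    The change itself is small; a hedged sketch of the callback loop in
    rcu_do_batch() with the added prefetch() (loop shape assumed from the
    rcupdate.c of this era):

    while (list) {
            next = list->next;
            prefetch(next);         /* warm the next rcu_head's cache line */
            list->func(list);
            list = next;
            if (++count >= rdp->blimit)
                    break;
    }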

    Signed-off-by: Eric Dumazet
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

04 Oct, 2006

1 commit

  • Kill the hard-to-calculate 'rsinterval' boot parameter and the per-cpu
    rcu_data.last_rs_qlen. Instead, add a flag, rcu_ctrlblk.signaled,
    which records the fact that one of the CPUs has sent a resched IPI
    since the last rcu_start_batch().

    Roughly speaking, we need two rcu_start_batch()s in order to move callbacks
    from ->nxtlist to ->donelist. This means that when ->qlen exceeds qhimark
    and continues to grow, we should send a resched IPI, and then do it
    again after we have gone through a quiescent state.

    On the other hand, if it was already sent, we don't need to do it again
    when another CPU detects overflow of the queue.
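
    A hedged sketch of how the flag deduplicates the IPIs (the flag would
    be cleared again in rcu_start_batch(); IPI-sending details elided):

    static void force_quiescent_state(struct rcu_data *rdp,
                                      struct rcu_ctrlblk *rcp)
    {
            set_need_resched();
            if (!rcp->signaled) {
                    rcp->signaled = 1;
                    /* ... send resched IPIs to the other online CPUs ... */
            }
    }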

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

13 Sep, 2006

1 commit

  • rcu_do_batch() decrements rdp->qlen with irqs enabled. This is not
    good, because qlen can also be modified by call_rcu() from an interrupt.

    Decrement ->qlen once with irqs disabled, after the main loop.
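
    A hedged sketch of the fix: count invocations locally, then fold the
    total into ->qlen once, with irqs disabled:

    while (list) {
            next = list->next;
            list->func(list);
            list = next;
            if (++count >= rdp->blimit)
                    break;
    }
    local_irq_disable();
    rdp->qlen -= count;
    local_irq_enable();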

    Signed-off-by: Oleg Nesterov
    Cc: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

01 Aug, 2006

1 commit

  • A few of the callback functions and notifier blocks that are associated
    with cpu notifications incorrectly have __devinit and __devinitdata. They
    should be __cpuinit and __cpuinitdata instead.

    It makes no functional difference but wastes text area when CONFIG_HOTPLUG is
    enabled and CONFIG_HOTPLUG_CPU is not.

    This patch fixes all those instances.

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     

04 Jul, 2006

1 commit


28 Jun, 2006

3 commits

  • This patch reverts notifier_block changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • In 2.6.17, there was a problem with cpu_notifiers and XFS. I provided a
    band-aid solution to solve that problem. In the process, I undid all the
    changes you both were making to ensure that these notifiers were available
    only at init time (unless CONFIG_HOTPLUG_CPU is defined).

    We deferred the real fix to 2.6.18. Here is a set of patches that fixes the
    XFS problem cleanly and makes the cpu notifiers available only at init time
    (unless CONFIG_HOTPLUG_CPU is defined).

    If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
    time.

    This patch reverts the notifier_call changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • Add operations for the call_rcu_bh() variant of RCU. Also add an
    rcu_batches_completed_bh() function, which is needed by rcutorture.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

23 Jun, 2006

1 commit


16 May, 2006

1 commit

  • With "Paul E. McKenney"

    Introduce the rcu_needs_cpu() interface. This can be used to tell
    whether there will soon be a new RCU batch on a CPU, by looking at
    its curlist pointer. It can be used to avoid entering a tickless idle
    state in which the CPU would miss that a new batch is ready when
    rcu_start_batch() is called on a different CPU.
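
    A hedged sketch of the interface (the real version may also consult
    the rcu_bh per-cpu data):

    int rcu_needs_cpu(int cpu)
    {
            struct rcu_data *rdp = &per_cpu(rcu_data, cpu);

            /* A non-empty ->curlist means a batch will need this CPU soon. */
            return !!rdp->curlist || rcu_pending(cpu);
    }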

    Signed-off-by: Heiko Carstens
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     

26 Apr, 2006

2 commits

  • A few of the notifier_chain_register() callers use __init in the
    definition of notifier_call. This is incorrect, as the function
    definition should be available after initialization (they do not
    unregister the notifiers during initialization).

    This patch fixes all such usages to _not_ have the notifier_call __init
    section.

    Signed-off-by: Chandra Seetharaman
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • A few of the notifier_chain_register() callers use __devinitdata in
    the definition of the notifier_block data structure. This is incorrect,
    as the data structure should be available after initialization (they
    do not unregister the notifiers during initialization).

    This was leading to an oops when notifier_chain_register() call is
    invoked for those callback chains after initialization.

    This patch fixes all such usages to _not_ have the notifier_block data
    structure in the init data section.

    Signed-off-by: Chandra Seetharaman
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     

24 Mar, 2006

1 commit

  • __rcu_process_callbacks() disables interrupts to protect itself from
    call_rcu(), which adds new entries to ->nxtlist.

    However, we can check "->nxtlist != NULL" with interrupts enabled; we
    can't get "false positives" because call_rcu() can only change this
    condition from 0 to 1.
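
    A hedged sketch of the resulting fast path in __rcu_process_callbacks():

    if (rdp->nxtlist && !rdp->curlist) {    /* checked with irqs enabled */
            local_irq_disable();
            rdp->curlist = rdp->nxtlist;
            rdp->curtail = rdp->nxttail;
            rdp->nxtlist = NULL;
            rdp->nxttail = &rdp->nxtlist;
            local_irq_enable();
            /* ... start the next batch ... */
    }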

    Tested with rcutorture.ko.

    Signed-off-by: Oleg Nesterov
    Acked-by: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

23 Mar, 2006

2 commits


21 Mar, 2006

1 commit


09 Mar, 2006

1 commit

  • This patch adds new tunables for the RCU queue and finished batches.
    There are two types of controls: the number of completed RCU updates
    invoked in a batch (blimit) and monitoring for a high rate of incoming
    RCUs on a CPU (qhimark, qlowmark).

    By default, the per-cpu batch limit is set to a small value. If the
    input RCU rate exceeds the high watermark, we do two things: force a
    quiescent state on all CPUs and set the batch limit of the CPU to
    INT_MAX. Setting the batch limit to INT_MAX forces all finished RCUs
    to be processed in one shot. If we have more than INT_MAX RCUs queued
    up, then we have bigger problems anyway. Once the number of queued
    incoming RCUs falls below the low watermark, the batch limit is set
    back to the default.
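
    A hedged sketch of the watermark logic, on the enqueue side and in
    rcu_do_batch(); names follow the changelog:

    /* In call_rcu(), when the queue grows too long: */
    if (unlikely(++rdp->qlen > qhimark)) {
            rdp->blimit = INT_MAX;          /* drain everything next batch */
            force_quiescent_state(rdp, &rcu_ctrlblk);
    }

    /* In rcu_do_batch(), once the pressure subsides: */
    if (rdp->blimit == INT_MAX && rdp->qlen <= qlowmark)
            rdp->blimit = blimit;           /* back to the default */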

    Signed-off-by: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     

11 Jan, 2006

2 commits

  • Pointed out by Srivatsa Vaddagiri.

    rcu_do_batch() stops after processing maxbatch callbacks
    on ->donelist, leaving rcu_tasklet in TASKLET_STATE_SCHED
    state.

    If a CPU_DEAD event happens then, the remaining ->donelist
    entries are lost; rcu_offline_cpu() kills this tasklet.

    With this patch, ->donelist migrates along with ->curlist
    and ->nxtlist to the current CPU.
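
    A hedged sketch of the offlining path after the fix, assuming a
    rcu_move_batch() helper that splices a list onto the surviving CPU:

    static void __rcu_offline_cpu(struct rcu_data *this_rdp,
                                  struct rcu_data *rdp)
    {
            /* ->donelist now migrates too, so no callbacks are lost. */
            rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
            rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
            rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
    }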

    Compile tested.

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Cc: Srivatsa Vaddagiri
    Cc: Dipankar Sarma
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This patch moves rcu_state into the rcu_ctrlblk. I see no reason
    why we should have two different variables to control RCU state;
    every user of rcu_state also has "rcu_ctrlblk *rcp" in its
    parameter list.
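
    A hedged sketch of the consolidated control block (the exact field set
    is an assumption based on the rcupdate.c of this era):

    struct rcu_ctrlblk {
            long            cur;            /* current batch number */
            long            completed;      /* number of last completed batch */
            int             next_pending;   /* is the next batch already waiting? */
            spinlock_t      lock;           /* formerly rcu_state.lock */
            cpumask_t       cpumask;        /* formerly rcu_state.cpumask */
    };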

    Signed-off-by: Oleg Nesterov
    Acked-by: Paul E. McKenney
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 Jan, 2006

2 commits


09 Jan, 2006

1 commit