08 Feb, 2013

2 commits

  • SRCU has its own state machine and no longer relies on normal RCU.
    Its read-side critical sections can now be used by an offline CPU, so this
    commit removes the check and the comments, reverting the SRCU portion
    of ff195cb6 (rcu: Warn when srcu_read_lock() is used in an extended
    quiescent state).

    It also makes the code match the comments in whatisRCU.txt:

    g. Do you need read-side critical sections that are respected
    even though they are in the middle of the idle loop, during
    user-mode execution, or on an offlined CPU? If so, SRCU is the
    only choice that will work for you.

    [ paulmck: There is at least one remaining issue, namely use of lockdep
    with tracing enabled. ]

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     
  • SRCU has its own state machine and no longer relies on normal RCU.
    Its read-side critical sections can now be used by an offline CPU, so this
    commit removes the check and the comments, reverting the SRCU portion
    of c0d6d01b (rcu: Check for illegal use of RCU from offlined CPUs).

    It also makes the code match the comments in whatisRCU.txt:

    g. Do you need read-side critical sections that are respected
    even though they are in the middle of the idle loop, during
    user-mode execution, or on an offlined CPU? If so, SRCU is the
    only choice that will work for you.

    [ paulmck: There is at least one remaining issue, namely use of lockdep
    with tracing enabled. ]

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     

28 Oct, 2012

1 commit

  • In the old days, we had two different API sets for dynamically allocated
    per-CPU data and for DEFINE_PER_CPU()-defined per-CPU data, and because
    SRCU used dynamically allocated per-CPU data, its srcu_struct structures
    could not be declared statically. This commit therefore introduces
    DEFINE_SRCU() and DEFINE_STATIC_SRCU() to allow SRCU structures to be
    declared statically, using the new static per-CPU interfaces (a usage
    sketch follows this entry).

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    [ paulmck: Updated for __DELAYED_WORK_INITIALIZER() added argument,
    fixed whitespace issue. ]

    Lai Jiangshan
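
    A minimal sketch of the new interfaces in use (illustrative only;
    my_srcu and the surrounding functions are invented names):

        DEFINE_STATIC_SRCU(my_srcu);    /* file-scope SRCU domain, no init_srcu_struct() needed */

        static void reader(void)
        {
                int idx = srcu_read_lock(&my_srcu);     /* enter read-side critical section */
                /* ... access SRCU-protected data; sleeping is permitted ... */
                srcu_read_unlock(&my_srcu, idx);        /* exit, passing back the index */
        }

        static void updater(void)
        {
                /* ... unlink old data ... */
                synchronize_srcu(&my_srcu);             /* wait for pre-existing readers */
                /* ... now safe to free the old data ... */
        }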
     

01 May, 2012

5 commits

  • This commit implements an SRCU state machine in support of call_srcu().
    The state machine is preemptible, light-weight, and single-threaded,
    minimizing synchronization overhead. In particular, there is no longer
    any need for synchronize_srcu() to be guarded by a mutex.

    Expedited processing is handled, at least in the absence of concurrent
    grace-period operations on that same srcu_struct structure, by having
    the synchronize_srcu_expedited() thread take on the role of the
    workqueue thread for one iteration.

    There is a reasonable probability that a given SRCU callback will
    be invoked on the same CPU that registered it; however, there is no
    guarantee. Concurrent SRCU grace-period primitives can cause callbacks
    to be executed elsewhere, even in the absence of CPU-hotplug operations.

    Callbacks execute in process context, but under the influence of
    local_bh_disable(), so it is illegal to sleep in an SRCU callback
    function (see the usage sketch after this entry).

    Signed-off-by: Lai Jiangshan
    Acked-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
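
    A hedged illustration of call_srcu() in use (everything except the
    SRCU primitives is an invented name, and an srcu_struct my_srcu is
    assumed to be defined elsewhere); because the callback runs under
    local_bh_disable(), it must not sleep:

        struct foo {
                struct rcu_head rh;
                /* ... payload ... */
        };

        static void foo_reclaim(struct rcu_head *head)
        {
                /* process context, but bottom halves disabled: no sleeping */
                kfree(container_of(head, struct foo, rh));
        }

        /* updater, after unlinking fp from all SRCU-protected structures: */
        static void foo_retire(struct foo *fp)
        {
                call_srcu(&my_srcu, &fp->rh, foo_reclaim);
        }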
     
  • The old srcu_barrier() macro is now unused. This commit removes it so
    that the name may be reused for the SRCU flavor of rcu_barrier(), which
    will in turn be needed to allow the upcoming call_srcu() to be used from
    within modules.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     
  • This commit implements a variant of Peter's algorithm, which may be found
    at https://lkml.org/lkml/2012/2/1/119.

    o Make the checking lock-free to enable parallel checking.
    Parallel checking is required when (1) the original checking
    task is preempted for a long time, (2) synchronize_srcu_expedited()
    starts during an ongoing SRCU grace period, or (3) we wish to
    avoid acquiring a lock.

    o Since the checking is lock-free, we avoid a mutex in the state
    machine for call_srcu().

    o Remove the SRCU_REF_MASK and remove the coupling with the flipping.
    This might allow us to remove the preempt_disable() in future
    versions, though such removal will need great care because it
    rescinds the one-old-reader-per-CPU guarantee.

    o Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
    more intuitive.

    Inspired-by: Peter Zijlstra
    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     
  • The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
    that no reasonable series of srcu_read_lock() and srcu_read_unlock()
    operations can return the value of the counter to its original value.
    This guarantee is required only after the index has been switched to
    the other set of counters, so at most one srcu_read_lock() can affect
    a given CPU's counter. The number of srcu_read_unlock() operations
    on a given counter is limited to the number of tasks in the system,
    which, given the Linux kernel's current structure, is limited to far
    less than 2^30 on 32-bit systems and far less than 2^62 on 64-bit
    systems. (Something about a limited number of bytes in the kernel's
    address space.)

    Therefore, if srcu_read_lock() increments the upper bits, then
    srcu_read_unlock() need not do so. In this case, an srcu_read_lock() and
    an srcu_read_unlock() will flip the lower bit of the upper field of the
    counter. An unreasonably large additional number of srcu_read_unlock()
    operations would be required to return the counter to its initial value,
    thus preserving the guarantee.

    This commit takes this approach, which further allows it to shrink
    the size of the upper field to one bit, making the number of
    srcu_read_unlock() operations required to return the counter to its
    initial value even more unreasonable than before (see the arithmetic
    sketch after this entry).

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
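
    The arithmetic can be checked in isolation (a userspace sketch, not
    the kernel code, assuming that srcu_read_lock() adds the usage bit
    plus one and srcu_read_unlock() subtracts only one, as described
    above):

        #include <stdio.h>

        /* a one-bit "usage" field above the counting field */
        #define SRCU_USAGE_COUNT (1UL << (8 * sizeof(unsigned long) - 1))

        int main(void)
        {
                unsigned long c = 0;

                c += SRCU_USAGE_COUNT + 1;  /* srcu_read_lock(): flip usage bit, count reader */
                c -= 1;                     /* srcu_read_unlock(): uncount reader only */

                /* c is now 0x8000...0: ~2^63 further unlocks would be needed to reach 0 again */
                printf("after one lock/unlock pair: %#lx\n", c);
                return 0;
        }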
     
  • The current implementation of synchronize_srcu_expedited() can cause
    severe OS jitter due to its use of synchronize_sched(), which in turn
    invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
    This can result in severe performance degradation for real-time workloads
    and especially for short-iteration-length HPC workloads. Furthermore,
    because only one instance of try_stop_cpus() can be making forward progress
    at a given time, only one instance of synchronize_srcu_expedited() can
    make forward progress at a time, even if they are all operating on
    distinct srcu_struct structures.

    This commit, inspired by an earlier implementation by Peter Zijlstra
    (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
    takes a strictly algorithmic bits-in-memory approach. This has the
    disadvantage of requiring one explicit memory-barrier instruction in
    each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
    completely dispenses with OS jitter and furthermore allows SRCU to be
    used freely by CPUs that RCU believes to be idle or offline.

    The update-side implementation handles the single read-side memory
    barrier by rechecking the per-CPU counters after summing them and
    by running through the update-side state machine twice.

    This implementation has passed moderate rcutorture testing on both
    x86 and Power. Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
    as suggested by Peter Zijlstra.

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Acked-by: Peter Zijlstra
    Reviewed-by: Lai Jiangshan

    Paul E. McKenney
     

22 Feb, 2012

2 commits

  • The WARN_ON_ONCE() in rcu_lock_acquire() results in infinite recursion
    on S390, and also doesn't print very much information. Remove this.

    Updated patch to add lockdep-RCU assertions to RCU's read-side primitives.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Paul E. McKenney

    Heiko Carstens
     
  • Although it is legal to use RCU during early boot, it is anything
    but legal to use RCU at runtime from an offlined CPU. After all, RCU
    explicitly ignores offlined CPUs. This commit therefore adds checks
    for runtime use of RCU from offlined CPUs.

    These checks are not perfect; in particular, they can be subverted
    through use of things like rcu_dereference_raw(). Note that it is not
    possible to put checks in rcu_read_lock() and friends due to the fact
    that these primitives are used in code that might be used under either
    RCU or lock-based protection, which means that checking rcu_read_lock()
    gets you fat piles of false positives.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

12 Dec, 2011

4 commits

  • The intent is that a given RCU read-side critical section be confined
    to a single context. For example, it is illegal to invoke rcu_read_lock()
    in an exception handler and then invoke rcu_read_unlock() from the
    context of the task that received the exception.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The RCU implementations, including SRCU, are designed to be used in a
    lock-like fashion, so that the read-side lock and unlock primitives must
    execute in the same context for any given read-side critical section.
    This constraint is enforced by lockdep-RCU. However, there is a need
    to enter an SRCU read-side critical section within the context of an
    exception and then exit in the context of the task that encountered the
    exception. The cost of this capability is that the read-side operations
    incur the overhead of disabling interrupts.

    Note that although the current implementation allows a given read-side
    critical section to be entered by one task and then exited by another, all
    known possible implementations that allow this have scalability problems.
    Therefore, a given read-side critical section must be exited by the same
    task that entered it, though perhaps from an interrupt or exception
    handler running within that task's context. But if you are thinking
    in terms of interrupt handlers, make sure that you have considered the
    possibility of threaded interrupt handlers.

    Credit goes to Peter Zijlstra for suggesting use of the existing _raw
    suffix to indicate disabling lockdep over the earlier "bulkref" names.

    Requested-by: Srikar Dronamraju
    Signed-off-by: Paul E. McKenney
    Tested-by: Srikar Dronamraju

    Paul E. McKenney
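
    A sketch of the resulting usage (the srcu_struct and the hand-off
    helpers are invented names): the critical section is entered in an
    exception handler and exited later in the context of the task that
    took the exception, with the index passed between the two.

        /* in the exception handler: */
        int idx = srcu_read_lock_raw(&my_srcu); /* lockdep checking disabled */
        stash_idx_in_task(current, idx);        /* hypothetical hand-off helper */

        /* later, in the context of the task that took the exception: */
        srcu_read_unlock_raw(&my_srcu, fetch_idx_from_task(current));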
     
  • A common debug_lockdep_rcu_enabled() function is used to check whether
    RCU lockdep splats should be reported, but srcu_read_lock_held() does
    not use it. This commit therefore brings srcu_read_lock_held() up to date.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Catch SRCU up to the other variants of RCU by making PROVE_RCU
    complain if either srcu_read_lock() or srcu_read_lock_held() are
    used from within RCU-idle mode.

    Frederic reworked this to allow for the new versions of his patches
    that check for extended quiescent states.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

20 Aug, 2010

1 commit

  • This commit provides definitions for the __rcu annotation defined earlier.
    This annotation permits sparse to check for correct use of RCU-protected
    pointers. If a pointer that is annotated with __rcu is accessed
    directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
    or one of their variants), sparse can be made to complain. To enable
    such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
    kernel configuration option. Please note that these sparse complaints are
    intended to be a debugging aid, -not- a code-style-enforcement mechanism.

    There are special rcu_dereference_protected() and rcu_access_pointer()
    accessors for use when RCU read-side protection is not required, for
    example, when no other CPU has access to the data structure in question
    or while the current CPU holds the update-side lock.

    This patch also updates a number of docbook comments that were showing
    their age.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Christopher Li
    Reviewed-by: Josh Triplett

    Paul E. McKenney
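
    For illustration (the names are invented), a pointer annotated with
    __rcu must be accessed through the RCU accessors, or sparse (with
    CONFIG_SPARSE_RCU_POINTER=y) can be made to complain:

        struct foo __rcu *global_fp;            /* RCU-protected pointer */

        /* updater, holding update_lock: */
        rcu_assign_pointer(global_fp, new_fp);

        /* reader: */
        rcu_read_lock();
        fp = rcu_dereference(global_fp);
        /* ... use fp ... */
        rcu_read_unlock();

        /* a direct access such as "fp = global_fp;" draws a sparse warning */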
     

03 Mar, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: add __percpu sparse annotations to what's left
    percpu: add __percpu sparse annotations to fs
    percpu: add __percpu sparse annotations to core kernel subsystems
    local_t: Remove leftover local.h
    this_cpu: Remove pageset_notifier
    this_cpu: Page allocator conversion
    percpu, x86: Generic inc / dec percpu instructions
    local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
    module: Use this_cpu_xx to dynamically allocate counters
    local_t: Remove cpu_local_xx macros
    percpu: refactor the code in pcpu_[de]populate_chunk()
    percpu: remove compile warnings caused by __verify_pcpu_ptr()
    percpu: make accessors check for percpu pointer in sparse
    percpu: add __percpu for sparse.
    percpu: make access macros universal
    percpu: remove per_cpu__ prefix.

    Linus Torvalds
     

25 Feb, 2010

2 commits

  • Make rcu_dereference() check for being in an RCU read-side
    critical section, and create rcu_dereference_bh(),
    rcu_dereference_sched(), and srcu_dereference() to check for the
    other flavors of RCU. Also create rcu_dereference_raw() to
    avoid checking, and make rcu_dereference_check() use
    rcu_dereference_raw().

    Acked-by: Eric Dumazet
    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Inspection is proving insufficient to catch all RCU misuses,
    which is understandable given that rcu_dereference() might be
    protected by any of four different flavors of RCU (RCU, RCU-bh,
    RCU-sched, and SRCU), and might also/instead be protected by any
    of a number of locking primitives. It is therefore time to
    enlist the aid of lockdep.

    This set of patches is inspired by earlier work by Peter
    Zijlstra and Thomas Gleixner, and takes the following approach:

    o Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.

    o Set up separate lockdep classes for each instance of SRCU.

    o Create primitives that check for being in an RCU read-side
    critical section. These return exact answers if lockdep is
    fully enabled, but if unsure, report being in an RCU read-side
    critical section. (We want to avoid false positives!)
    The primitives are:

    For RCU: rcu_read_lock_held(void)

    For RCU-bh: rcu_read_lock_bh_held(void)

    For RCU-sched: rcu_read_lock_sched_held(void)

    For SRCU: srcu_read_lock_held(struct srcu_struct *sp)

    o Add rcu_dereference_check(), which takes a second argument
    in which one places a boolean expression based on the above
    primitives and/or lockdep_is_held().

    o A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
    rcu_dereference_check(). This depends on CONFIG_PROVE_LOCKING,
    and should be quite helpful during the transition period while
    CONFIG_PROVE_RCU-unaware patches are in flight.

    The existing rcu_dereference() primitive does no checking, but
    upcoming patches will change that.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
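
    For example (gp and its lock are invented names), a pointer that may
    legitimately be accessed either under rcu_read_lock() or while holding
    the updater's lock could be fetched as:

        p = rcu_dereference_check(gp->next,
                                  rcu_read_lock_held() ||
                                  lockdep_is_held(&gp->lock));

    With CONFIG_PROVE_RCU=y, lockdep then complains if neither condition
    holds at the time of the access.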
     

17 Feb, 2010

1 commit

  • Add __percpu sparse annotations to core subsystems.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Acked-by: Paul E. McKenney
    Cc: Jens Axboe
    Cc: linux-mm@kvack.org
    Cc: Rusty Russell
    Cc: Dipankar Sarma
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Eric Biederman

    Tejun Heo
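
    A small sketch of what the annotation covers (invented names):

        static DEFINE_PER_CPU(int, my_counter); /* annotated __percpu internally */
        int __percpu *dyn = alloc_percpu(int);  /* dynamically allocated per-CPU data */

        this_cpu_inc(my_counter);               /* accessor: no sparse warning */
        *per_cpu_ptr(dyn, cpu) = 0;             /* accessor converts the address space */
        /* "*dyn = 0;" would make sparse warn of an address-space mismatch */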
     

26 Oct, 2009

1 commit

  • This patch creates a synchronize_srcu_expedited() that uses
    synchronize_sched_expedited() where synchronize_srcu()
    uses synchronize_sched(). The synchronize_srcu() and
    synchronize_srcu_expedited() functions become one-liners that
    pass synchronize_sched() or synchronize_sched_expedited(),
    respectively, to a new __synchronize_srcu() function.

    While in the file, move the EXPORT_SYMBOL_GPL()s to immediately
    follow the corresponding functions.

    Requested-by: Avi Kivity
    Tested-by: Marcelo Tosatti
    Signed-off-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    Cc: avi@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
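
    In outline, the refactoring described above looks like this (a sketch;
    the exact signatures may differ):

        static void __synchronize_srcu(struct srcu_struct *sp,
                                       void (*sync_func)(void))
        {
                /* ... grace-period machinery, invoking sync_func() ... */
        }

        void synchronize_srcu(struct srcu_struct *sp)
        {
                __synchronize_srcu(sp, synchronize_sched);
        }
        EXPORT_SYMBOL_GPL(synchronize_srcu);

        void synchronize_srcu_expedited(struct srcu_struct *sp)
        {
                __synchronize_srcu(sp, synchronize_sched_expedited);
        }
        EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);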
     

04 Oct, 2006

3 commits

  • Currently the init_srcu_struct() routine has no way to report out-of-memory
    errors. This patch (as761) makes it return -ENOMEM when the per-cpu data
    allocation fails.

    The patch also makes srcu_init_notifier_head() report a BUG if a notifier
    head can't be initialized. Perhaps it should return -ENOMEM instead, but
    in the most likely cases where this might occur I don't think any recovery
    is possible. Notifier chains generally are not created dynamically.

    [akpm@osdl.org: avoid statement-with-side-effect in macro]
    Signed-off-by: Alan Stern
    Acked-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
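
    Callers can now check for allocation failure (a sketch with invented
    context):

        static struct srcu_struct my_srcu;

        static int __init my_subsys_init(void)
        {
                int err = init_srcu_struct(&my_srcu);   /* -ENOMEM if per-CPU allocation fails */

                if (err)
                        return err;
                /* ... */
                return 0;
        }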
     
  • This patch (as751) adds a new type of notifier chain, based on the SRCU
    (Sleepable Read-Copy Update) primitives recently added to the kernel. An
    SRCU notifier chain is much like a blocking notifier chain, in that it must
    be called in process context and its callout routines are allowed to sleep.
    The difference is that the chain's links are protected by the SRCU
    mechanism rather than by an rw-semaphore, so calling the chain has
    extremely low overhead: no memory barriers and no cache-line bouncing. On
    the other hand, unregistering from the chain is expensive and the chain
    head requires special runtime initialization (plus cleanup if it is to be
    deallocated).

    SRCU notifiers are appropriate for notifiers that will be called very
    frequently and for which unregistration occurs very seldom. The proposed
    "task notifier" scheme qualifies, as may some of the network notifiers.

    Signed-off-by: Alan Stern
    Acked-by: Paul E. McKenney
    Acked-by: Chandra Seetharaman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
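
    Usage follows the other notifier-chain flavors apart from the required
    runtime initialization (a sketch with invented names):

        static struct srcu_notifier_head my_chain;
        static struct notifier_block my_nb = {
                .notifier_call = my_event_handler,
        };

        /* at init time: */
        srcu_init_notifier_head(&my_chain);     /* required before first use */
        srcu_notifier_chain_register(&my_chain, &my_nb);

        /* on the hot path: very cheap, and handlers may sleep */
        srcu_notifier_call_chain(&my_chain, MY_EVENT, NULL);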
     
  • Updated patch adding a variant of RCU that permits sleeping in read-side
    critical sections. SRCU is as follows:

    o Each use of SRCU creates its own srcu_struct, and each
    srcu_struct has its own set of grace periods. This is
    critical, as it prevents one subsystem with a blocking
    reader from holding up SRCU grace periods for other
    subsystems.

    o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
    and synchronize_srcu()) all take a pointer to a srcu_struct.

    o The SRCU primitives must be called from process context.

    o srcu_read_lock() returns an int that must be passed to
    the matching srcu_read_unlock(). Realtime RCU avoids the
    need for this by storing the state in the task struct,
    but SRCU needs to allow a given code path to pass through
    multiple SRCU domains -- storing state in the task struct
    would therefore require either arbitrary space in the
    task struct or arbitrary limits on SRCU nesting. So I
    kicked the state-storage problem up to the caller.

    Of course, it is not permitted to call synchronize_srcu()
    while in an SRCU read-side critical section.

    o There is no call_srcu(). It would not be hard to implement
    one, but it seems like too easy a way to OOM the system.
    (Hey, we have enough trouble with call_rcu(), which does
    -not- permit readers to sleep!!!) So, if you want it,
    please tell me why...

    [josht@us.ibm.com: sparse notation]
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
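
    The resulting usage pattern (a sketch; everything other than the SRCU
    primitives is an invented name):

        static struct srcu_struct my_srcu;      /* one domain per subsystem */

        init_srcu_struct(&my_srcu);             /* runtime initialization */

        /* reader: may block inside the critical section */
        idx = srcu_read_lock(&my_srcu);
        /* ... read SRCU-protected data ... */
        srcu_read_unlock(&my_srcu, idx);        /* takes srcu_read_lock()'s return value */

        /* updater: never from within an SRCU read-side critical section */
        synchronize_srcu(&my_srcu);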