30 Jun, 2020

2 commits

  • SRCU disables interrupts to get a stable per-CPU pointer and then
    acquires the spinlock which is in the per-CPU data structure. The
    release uses spin_unlock_irqrestore(). While this is correct on a
    non-RT kernel, it conflicts with RT semantics because the spinlock is
    converted to a 'sleeping' spinlock, and sleeping locks cannot be
    acquired with interrupts disabled.

    Acquire the per-CPU pointer `ssp->sda' without disabling preemption and
    then acquire the spinlock_t of the per-CPU data structure. The lock will
    ensure that the data is consistent.

    The added call to check_init_srcu_struct() is now needed because a
    statically defined srcu_struct may remain uninitialized until this
    point and the newly introduced locking operation requires an initialized
    spinlock_t.

    This change was tested for four hours with 8*SRCU-N and 8*SRCU-P without
    causing any warnings.

    Cc: Lai Jiangshan
    Cc: "Paul E. McKenney"
    Cc: Josh Triplett
    Cc: Steven Rostedt
    Cc: Mathieu Desnoyers
    Cc: rcu@vger.kernel.org
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Sebastian Andrzej Siewior
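
    A minimal sketch of the pattern described above, not the verbatim
    patch; it assumes the spin_lock_irqsave_rcu_node() helpers that
    kernel/rcu already provides, and srcu_example_enqueue() is a
    hypothetical caller:

      /* Obtain the per-CPU pointer preemptibly; the per-CPU spinlock_t
       * (a sleeping lock on RT) then serializes all access, so interrupts
       * need not be disabled around the pointer dereference. */
      static void srcu_example_enqueue(struct srcu_struct *ssp)
      {
              struct srcu_data *sdp;
              unsigned long flags;

              check_init_srcu_struct(ssp);  /* spinlock_t must be initialized. */
              sdp = raw_cpu_ptr(ssp->sda);  /* Preemption may migrate us; the lock copes. */
              spin_lock_irqsave_rcu_node(sdp, flags);
              /* ... manipulate sdp->srcu_cblist here ... */
              spin_unlock_irqrestore_rcu_node(sdp, flags);
      }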
     
  • This commit fixes a typo in a comment.

    Signed-off-by: Ethon Paul
    Signed-off-by: Paul E. McKenney

    Ethon Paul
     

28 Apr, 2020

2 commits

  • The srcu_data structure's ->srcu_lock_count and ->srcu_unlock_count arrays
    are read and written locklessly, so this commit adds data_race() to the
    diagnostic-print loads from these arrays in order to mark them as known
    and approved data-racy accesses.

    This data race was reported by KCSAN. Not appropriate for backporting due
    to failure being unlikely and due to this being used only by rcutorture.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
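
    A hedged sketch of the marked loads; the loop is illustrative rather
    than the exact rcutorture diagnostic code:

      int cpu, idx = READ_ONCE(ssp->srcu_idx) & 0x1;
      unsigned long u0, u1;

      for_each_possible_cpu(cpu) {
              struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);

              /* data_race() tells KCSAN these lockless reads are intended. */
              u0 = data_race(sdp->srcu_lock_count[!idx]);
              u1 = data_race(sdp->srcu_unlock_count[!idx]);
              /* ... print or aggregate u0/u1 ... */
      }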
     
  • This commit adds stubs for KCSAN's data_race(), ASSERT_EXCLUSIVE_WRITER(),
    and ASSERT_EXCLUSIVE_ACCESS() macros to allow code using these macros to
    move ahead.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
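
    The stubs are no-ops by design; an illustrative shape (the real
    definitions live in the KCSAN headers and may differ):

      #ifndef data_race
      #define data_race(expr) ({ expr; })  /* Evaluate, no KCSAN report. */
      #endif
      #ifndef ASSERT_EXCLUSIVE_WRITER
      #define ASSERT_EXCLUSIVE_WRITER(var) do { (void)&(var); } while (0)
      #endif
      #ifndef ASSERT_EXCLUSIVE_ACCESS
      #define ASSERT_EXCLUSIVE_ACCESS(var) do { (void)&(var); } while (0)
      #endif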
     

22 Mar, 2020

1 commit

  • …'locktorture.2020.02.20a', 'ovld.2020.02.20a', 'rcu-tasks.2020.02.20a', 'srcu.2020.02.20a' and 'torture.2020.02.20a' into HEAD

    doc.2020.02.27a: Documentation updates.
    fixes.2020.03.21a: Miscellaneous fixes.
    kfree_rcu.2020.02.20a: Updates to kfree_rcu().
    locktorture.2020.02.20a: Lock torture-test updates.
    ovld.2020.02.20a: Updates to callback-overload handling.
    rcu-tasks.2020.02.20a: RCU-tasks updates.
    srcu.2020.02.20a: SRCU updates.
    torture.2020.02.20a: Torture-test updates.

    Paul E. McKenney
     

21 Feb, 2020

5 commits

  • A read of the srcu_struct structure's ->srcu_gp_seq field should not
    need READ_ONCE() when that structure's ->lock is held. However, this
    lock is not always held when the field is updated. This commit therefore
    acquires the lock around updates and removes a now-unneeded READ_ONCE().

    This data race was reported by KCSAN.

    Signed-off-by: Paul E. McKenney
    [ paulmck: Switch from READ_ONCE() to lock per Peter Zijlstra question. ]
    Acked-by: Peter Zijlstra (Intel)

    Paul E. McKenney
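
    The resulting pattern, sketched under the assumption of the
    spin_lock_irqsave_rcu_node() wrappers used elsewhere in kernel/rcu:

      unsigned long flags;
      int state;

      spin_lock_irqsave_rcu_node(ssp, flags);
      /* Plain load suffices here: all updaters now hold ->lock too. */
      state = rcu_seq_state(ssp->srcu_gp_seq);
      spin_unlock_irqrestore_rcu_node(ssp, flags);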
     
  • The srcu_struct structure's ->srcu_idx field is accessed locklessly,
    so reads must use READ_ONCE(). This commit therefore adds the needed
    READ_ONCE() invocation where it was missed.

    This data race was reported by KCSAN. Not appropriate for backporting
    due to failure being unlikely.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
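
    The lockless-read pattern, as a sketch resembling the SRCU read-side
    fastpath:

      /* ->srcu_idx may be flipped concurrently, so load it exactly once. */
      int idx = READ_ONCE(ssp->srcu_idx) & 0x1;

      this_cpu_inc(ssp->sda->srcu_lock_count[idx]);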
     
  • The srcu_struct structure's ->srcu_gp_seq_needed_exp field is accessed
    locklessly, so updates must use WRITE_ONCE(). This commit therefore
    adds the needed WRITE_ONCE() invocations.

    This data race was reported by KCSAN. Not appropriate for backporting
    due to failure being unlikely.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The srcu_node structure's ->srcu_gp_seq_needed_exp field is accessed
    locklessly, so updates must use WRITE_ONCE(). This commit therefore
    adds the needed WRITE_ONCE() invocations.

    This data race was reported by KCSAN. Not appropriate for backporting
    due to failure being unlikely.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Convert to plural and add a note that this is for Tree RCU.

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney

    SeongJae Park
     

25 Jan, 2020

3 commits

  • …_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD

    doc.2019.12.10a: Documentation updates
    exp.2019.12.09a: Expedited grace-period updates
    fixes.2020.01.24a: Miscellaneous fixes
    kfree_rcu.2020.01.24a: Batch kfree_rcu() work
    list.2020.01.10a: RCU-protected-list updates
    preempt.2020.01.24a: Preemptible RCU updates
    torture.2019.12.09a: Torture-test updates

    Paul E. McKenney
     
  • The ->srcu_last_gp_end field is accessed from any CPU at any time
    by synchronize_srcu(), so non-initialization references need to use
    READ_ONCE() and WRITE_ONCE(). This commit therefore makes that change.

    Reported-by: syzbot+08f3e9d26e5541e1ecf2@syzkaller.appspotmail.com
    Acked-by: Marco Elver
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit removes kfree_rcu() special-casing and the lazy-callback
    handling from Tree RCU. It moves some of this special casing to Tiny RCU,
    the removal of which will be the subject of later commits.

    This results in a nice negative delta.

    Suggested-by: Paul E. McKenney
    Signed-off-by: Joel Fernandes (Google)
    [ paulmck: Add slab.h #include, thanks to kbuild test robot . ]
    Signed-off-by: Paul E. McKenney

    Joel Fernandes (Google)
     

02 Aug, 2019

1 commit

  • Because pointer output is now obfuscated, and because what you really
    want to know is whether or not the callback lists are empty, this commit
    replaces the srcu_data structure's head callback pointer printout with
    a single character that is "." if the callback list is empty and "C"
    otherwise.

    This is the only remaining user of rcu_segcblist_head(), so this
    commit also removes this function's definition. It also turns out that
    rcu_segcblist_tail() no longer has any callers, so this commit removes
    that function's definition while in the area. They were both marked
    "Interim", and their end has come.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
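
    A sketch of the replacement printout (format string illustrative):

      /* "." for an empty callback list, "C" otherwise. */
      pr_cont(" %d(%c)", cpu,
              rcu_segcblist_empty(&sdp->srcu_cblist) ? '.' : 'C');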
     

29 May, 2019

2 commits

  • Because __call_srcu() is not used outside kernel/rcu/srcutree.c,
    this commit makes it static.

    Signed-off-by: Jiang Biao
    Signed-off-by: Paul E. McKenney

    Jiang Biao
     
  • Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module requires
    that the size of the reserved region be increased, which is not something
    we want to be doing all that often. One approach would be to require
    that loadable modules define an srcu_struct and invoke init_srcu_struct()
    from their module_init function and cleanup_srcu_struct() from their
    module_exit function. However, this is more than a bit user unfriendly.

    This commit therefore creates an ___srcu_struct_ptrs linker section,
    and pointers to srcu_struct structures created by DEFINE_SRCU() and
    DEFINE_STATIC_SRCU() within a module are placed into that module's
    ___srcu_struct_ptrs section. The required init_srcu_struct() and
    cleanup_srcu_struct() functions are then automatically invoked as needed
    when that module is loaded and unloaded, thus allowing modules to continue
    to use DEFINE_SRCU() and DEFINE_STATIC_SRCU() while avoiding the need
    to increase the size of the reserved region.

    Many of the algorithms and some of the code were cheerfully cherry-picked
    from other code making use of linker sections, perhaps most notably from
    tracepoints. All bugs are nevertheless the sole property of the author.

    Suggested-by: Mathieu Desnoyers
    [ paulmck: Use __section() and use "default" in srcu_module_notify()'s
    "switch" statement as suggested by Joel Fernandes. ]
    Signed-off-by: Paul E. McKenney
    Tested-by: Joel Fernandes (Google)

    Paul E. McKenney
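
    A simplified sketch of the mechanism; the section name matches the
    commit, while the struct module field names here are assumptions:

      #define DEFINE_SRCU(name)                                       \
              struct srcu_struct name;                                \
              static struct srcu_struct * const __srcu_struct_##name  \
                      __section("___srcu_struct_ptrs") = &name

      /* On module load, walk the module's pointer section and initialize
       * each srcu_struct; cleanup_srcu_struct() runs symmetrically on
       * unload. */
      static int srcu_module_coming(struct module *mod)
      {
              int i;
              struct srcu_struct **sspp = mod->srcu_struct_ptrs;

              for (i = 0; i < mod->num_srcu_structs; i++)
                      init_srcu_struct(*(sspp++));
              return 0;
      }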
     

27 Mar, 2019

2 commits

  • The cleanup_srcu_struct_quiesced() function was added because NVME
    used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that
    NVME workqueues waiting on SRCU workqueues could result in deadlocks
    during low-memory conditions. However, SRCU now also has WQ_MEM_RECLAIM
    workqueues, so there is no longer a potential for deadlock. Furthermore,
    it turns out to be extremely hard to use cleanup_srcu_struct_quiesced()
    correctly because SRCU callback invocation accesses the srcu_struct
    structure's per-CPU data area just after callbacks are invoked.
    Therefore, the usual practice of using srcu_barrier() to wait for
    callbacks to be invoked before invoking cleanup_srcu_struct_quiesced()
    fails because SRCU's callback-invocation workqueue handler might be
    delayed, which can result in cleanup_srcu_struct_quiesced() being invoked
    (and thus freeing the per-CPU data) before SRCU's callback-invocation
    workqueue handler has finished using that per-CPU data. Nor is this a
    theoretical problem: KASAN emitted use-after-free warnings because of
    this problem on actual runs.

    In short, NVME can now safely invoke cleanup_srcu_struct(), which
    avoids the use-after-free scenario. And cleanup_srcu_struct_quiesced()
    is quite difficult to use safely. This commit therefore removes
    cleanup_srcu_struct_quiesced(), switching its sole user back to
    cleanup_srcu_struct(). This effectively reverts the following pair
    of commits:

    f7194ac32ca2 ("srcu: Add cleanup_srcu_struct_quiesced()")
    4317228ad9b8 ("nvme: Avoid flush dependency in delete controller flow")

    Reported-by: Bart Van Assche
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Bart Van Assche
    Tested-by: Bart Van Assche

    Paul E. McKenney
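
    The safe teardown ordering, as a sketch with hypothetical names
    (my_srcu, obj, my_cb):

      call_srcu(&my_srcu, &obj->rh, my_cb);  /* Last callback posted. */
      srcu_barrier(&my_srcu);                /* Wait for callbacks to run. */
      cleanup_srcu_struct(&my_srcu);         /* Flushes remaining work; safe. */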
     
  • If someone fails to drain the corresponding SRCU callbacks (for
    example, by failing to invoke srcu_barrier()) before invoking either
    cleanup_srcu_struct() or cleanup_srcu_struct_quiesced(), the result is
    an ambiguous use-after-free report, and even then only if you are
    running something like KASAN. This commit therefore improves SRCU
    diagnostics by adding checks for in-flight callbacks at
    _cleanup_srcu_struct() time.

    Note that these diagnostics can still be defeated, for example, by
    invoking call_srcu() concurrently with cleanup_srcu_struct(). Which is
    a really bad idea, but sometimes all too easy to do. But even then,
    these diagnostics have at least some probability of catching the problem.

    Reported-by: Sagi Grimberg
    Reported-by: Bart Van Assche
    Signed-off-by: Paul E. McKenney
    Tested-by: Bart Van Assche

    Paul E. McKenney
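
    A sketch of the added check (shape only):

      int cpu;

      for_each_possible_cpu(cpu) {
              struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);

              /* Complain if callbacks remain queued at cleanup time,
               * typically because srcu_barrier() was never invoked. */
              if (WARN_ON(!rcu_segcblist_empty(&sdp->srcu_cblist)))
                      return;
      }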
     

26 Jan, 2019

1 commit

  • srcu_queue_delayed_work_on() disables preemption (and therefore CPU
    hotplug in RCU's case) and then checks, based on its own accounting,
    whether a CPU is online. If the CPU is online it uses
    queue_delayed_work_on(), otherwise it falls back to queue_delayed_work().
    The problem here is that queue_work() on -RT does not work with disabled
    preemption.

    queue_work_on() also works on an offlined CPU. queue_delayed_work_on()
    has the problem that it is possible to program a timer on an offlined
    CPU. This timer will fire once the CPU is online again, but until then
    the timer remains programmed and nothing will happen.

    Add a local timer which will fire (after the requested delay) on the
    local CPU and then enqueue the work on the specific CPU.

    RCUtorture testing with SRCU-P for 24h showed no problems.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Sebastian Andrzej Siewior
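
    A sketch of the approach, not the verbatim patch; the srcu_data
    field names (delay_work, work, cpu) are assumptions:

      /* Timer armed on the local CPU; once it fires, queue_work_on()
       * targets the intended CPU, which it tolerates even if that CPU
       * is currently offline. */
      static void srcu_delay_timer(struct timer_list *t)
      {
              struct srcu_data *sdp = from_timer(sdp, t, delay_work);

              queue_work_on(sdp->cpu, rcu_gp_wq, &sdp->work);
      }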
     

28 Nov, 2018

2 commits

  • In RCU, the distinction between "rsp", "rnp", and "rdp" has served well
    for a great many years, but in SRCU, "sp" vs. "sdp" has proven confusing.
    This commit therefore renames SRCU's "sp" pointers to "ssp", so that there
    is "ssp" for srcu_struct pointer, "snp" for srcu_node pointer, and "sdp"
    for srcu_data pointer.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The srcu_gp_start() function is called with the srcu_struct structure's
    ->lock held, but not with the srcu_data structure's ->lock. This is
    problematic because this function accesses and updates the srcu_data
    structure's ->srcu_cblist, which is protected by that lock. Failing to
    hold this lock can result in corruption of the SRCU callback lists,
    which in turn can result in arbitrarily bad results.

    This commit therefore makes srcu_gp_start() acquire the srcu_data
    structure's ->lock across the calls to rcu_segcblist_advance() and
    rcu_segcblist_accelerate(), thus preventing this corruption.

    Reported-by: Bart Van Assche
    Reported-by: Christoph Hellwig
    Reported-by: Sebastian Kuzminsky
    Signed-off-by: Dennis Krein
    Signed-off-by: Paul E. McKenney
    Tested-by: Dennis Krein
    Cc: # 4.16.x

    Dennis Krein
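
    The fix in sketch form, using the spin_lock_rcu_node() wrappers:

      /* ssp->lock is already held by the caller; now also take sdp's. */
      spin_lock_rcu_node(sdp);
      rcu_segcblist_advance(&sdp->srcu_cblist,
                            rcu_seq_current(&ssp->srcu_gp_seq));
      (void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
                                     rcu_seq_snap(&ssp->srcu_gp_seq));
      spin_unlock_rcu_node(sdp);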
     

09 Nov, 2018

1 commit

  • Ever since cdf7abc4610a ("srcu: Allow use of Tiny/Tree SRCU from
    both process and interrupt context"), it has been permissible
    to use SRCU read-side critical sections in interrupt context.
    This allows __call_srcu() to use SRCU read-side critical sections to
    prevent a new SRCU grace period from ending before the call to either
    srcu_funnel_gp_start() or srcu_funnel_exp_start() completes, thus preventing
    SRCU grace-period counter overflow during that time.

    Note that this does not permit removal of the counter-wrap checks in
    srcu_gp_end(). These checks are necessary to handle the case where
    a given CPU does not interact at all with SRCU for an extended time
    period.

    This commit therefore adds an SRCU read-side critical section to
    __call_srcu() in order to prevent grace period counter wrap during
    the funnel-locking process.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
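
    In sketch form (the funnel-start call is abbreviated):

      /* A reader pins the current grace period, so the GP counter
       * cannot wrap while the funnel locking is in progress. */
      idx = srcu_read_lock(ssp);
      srcu_funnel_gp_start(ssp, sdp, s, do_norm);  /* or srcu_funnel_exp_start() */
      srcu_read_unlock(ssp, idx);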
     

31 Aug, 2018

5 commits

  • … 'torture.2018.08.29a' into HEAD

    doc.2018.08.30a: Documentation updates
    dynticks.2018.08.30b: RCU flavor consolidation updates and cleanups
    srcu.2018.08.30b: SRCU updates
    torture.2018.08.29a: Torture-test updates

    Paul E. McKenney
     
  • Allocating a list_head structure that is almost never used, and, when
    used, is used only during early boot (rcu_init() and earlier), is a bit
    wasteful. This commit therefore eliminates that list_head in favor of
    the one in the work_struct structure. This is safe because the work_struct
    structure cannot be used until after rcu_init() returns.

    Reported-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Cc: Tejun Heo
    Cc: Lai Jiangshan
    Tested-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • Event tracing is moving to SRCU in order to take advantage of the fact
    that SRCU may be safely used from idle and even offline CPUs. However,
    event tracing can invoke call_srcu() very early in the boot process,
    even before workqueue_init_early() is invoked (let alone rcu_init()).
    Therefore, call_srcu()'s attempts to queue work fail miserably.

    This commit therefore detects this situation, and refrains from attempting
    to queue work before rcu_init() time, but does everything else that it
    would have done, and in addition, adds the srcu_struct to a global list.
    The rcu_init() function now invokes a new srcu_init() function, which
    is empty if CONFIG_SRCU=n. Otherwise, srcu_init() queues work for
    each srcu_struct on the list. This all happens early enough in boot
    that there is but a single CPU with interrupts disabled, which allows
    synchronization to be dispensed with.

    Of course, the queued work won't actually be invoked until after
    workqueue_init() is invoked, which happens shortly after the scheduler
    is up and running. This means that although call_srcu() may be invoked
    any time after per-CPU variables have been set up, there is still a very
    narrow window when synchronize_srcu() won't work, and this window
    extends from the time that the scheduler starts until the time that
    workqueue_init() returns. This can be fixed in a manner similar to
    the fix for synchronize_rcu_expedited() and friends, but until someone
    actually needs to use synchronize_srcu() during this window, this fix
    is added churn for no benefit.

    Finally, note that Tree SRCU's new srcu_init() function invokes
    queue_work() rather than the queue_delayed_work() function that is
    invoked post-boot. The reason is that queue_delayed_work() will (as you
    would expect) post a timer, and timers have not yet been initialized.
    So use of queue_work() avoids the complaints about use of uninitialized
    spinlocks that would otherwise result. Besides, some delay is already
    provided by the aforementioned fact that the queued work won't actually
    be invoked until after the scheduler is up and running.

    Requested-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Tested-by: Steven Rostedt (VMware)

    Paul E. McKenney
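
    A simplified sketch of the boot-time catch-up; srcu_boot_list and
    srcu_init_done name the global list and flag described above:

      void __init srcu_init(void)
      {
              struct srcu_struct *ssp;

              srcu_init_done = true;  /* call_srcu() may queue work itself now. */
              while (!list_empty(&srcu_boot_list)) {
                      ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
                                             work.work.entry);
                      list_del_init(&ssp->work.work.entry);
                      /* queue_work(), not queue_delayed_work(): no timers yet. */
                      queue_work(rcu_gp_wq, &ssp->work.work);
              }
      }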
     
  • Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • There now is only one rcu_state structure in a given build of the Linux
    kernel, so there is no need to pass it as a parameter to RCU's rcu_node
    tree's accessor macros. This commit therefore removes the rsp parameter
    from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp
    local variables while in the area.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

16 May, 2018

1 commit

  • The current cleanup_srcu_struct() flushes work, which prevents it
    from being invoked from some workqueue contexts, as well as from
    atomic (non-blocking) contexts. This commit therefore introduces
    cleanup_srcu_struct_quiesced(), which can be invoked only after all
    activity on the specified srcu_struct has completed. This restriction
    allows cleanup_srcu_struct_quiesced() to be invoked from workqueue
    contexts as well as from atomic contexts.

    Suggested-by: Christoph Hellwig
    Signed-off-by: Paul E. McKenney
    Tested-by: Nitzan Carmi
    Tested-by: Nicholas Piggin

    Paul E. McKenney
     

24 Feb, 2018

2 commits

  • fixes.2018.02.23a: Miscellaneous fixes
    srcu.2018.02.20a: SRCU updates
    torture.2018.02.20a: Torture-test updates

    Paul E. McKenney
     
  • RCU's expedited grace periods can participate in out-of-memory deadlocks
    due to all available system_wq kthreads being blocked and there not being
    memory available to create more. This commit prevents such deadlocks
    by allocating an RCU-specific workqueue_struct at early boot time, and
    providing it with a rescuer to ensure forward progress. This uses the
    shiny new init_rescuer() function provided by Tejun (but indirectly).

    This commit also causes SRCU to use this new RCU-specific
    workqueue_struct. Note that SRCU's use of workqueues never blocks them
    waiting for readers, so this should be safe from a forward-progress
    viewpoint. Note that this moves SRCU from system_power_efficient_wq
    to a normal workqueue. In the unlikely event that this results in
    measurable degradation, a separate power-efficient workqueue will be
    created for SRCU.

    Reported-by: Prateek Sood
    Reported-by: Tejun Heo
    Signed-off-by: Paul E. McKenney
    Acked-by: Tejun Heo

    Paul E. McKenney
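
    In sketch form; passing WQ_MEM_RECLAIM is what (indirectly) gives
    the workqueue its rescuer thread:

      struct workqueue_struct *rcu_gp_wq;

      /* Early-boot allocation; illustrative placement. */
      rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
      WARN_ON(!rcu_gp_wq);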
     

21 Feb, 2018

1 commit

  • The code in srcu_gp_end() inserts a delay every 0x3ff grace periods in
    order to prevent SRCU grace-period work from consuming an entire CPU
    when there is a long sequence of expedited SRCU grace-period requests.
    However, all of SRCU's grace-period work is carried out in workqueues,
    which are in turn within kthreads, which are automatically throttled as
    needed by the scheduler. In particular, if there is plenty of idle time,
    there is no point in throttling.

    This commit therefore removes the expedited SRCU grace-period throttling.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney