01 Dec, 2018

1 commit

  • commit 92aa39e9dc77481b90cbef25e547d66cab901496 upstream.

    The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent
    need for an RCU quiescent state from the force-quiescent-state processing
    within the grace-period kthread to context switches and to cond_resched().
    Unfortunately, such urgent needs are not communicated to need_resched(),
    which is sometimes used to decide when to invoke cond_resched(), for
    but one example, within the KVM vcpu_run() function. As of v4.15, this
    can result in synchronize_sched() being delayed by up to ten seconds,
    which can be problematic, to say nothing of annoying.

    This commit therefore checks rcu_dynticks.rcu_urgent_qs from within
    rcu_check_callbacks(), which is invoked from the scheduling-clock
    interrupt handler. If the current task is not an idle task and is
    not executing in usermode, a context switch is forced, and either way,
    the rcu_dynticks.rcu_urgent_qs variable is set to false. If the current
    task is an idle task, then RCU's dyntick-idle code will detect the
    quiescent state, so no further action is required. Similarly, if the
    task is executing in usermode, other code in rcu_check_callbacks() and
    its called functions will report the corresponding quiescent state.
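
    A rough sketch of the check described above, using the field name given in
    the text (rcu_dynticks.rcu_urgent_qs); the resched helpers shown
    (set_tsk_need_resched(), set_preempt_need_resched()) are an assumption about
    how the forced context switch is requested, and the exact backported code
    may differ:

        /* Invoked from the scheduling-clock interrupt. */
        void rcu_check_callbacks(int user)
        {
                /* ... existing processing ... */
                if (__this_cpu_read(rcu_dynticks.rcu_urgent_qs)) {
                        /* Idle tasks and usermode execution are handled elsewhere. */
                        if (!is_idle_task(current) && !user) {
                                set_tsk_need_resched(current);  /* force a context switch */
                                set_preempt_need_resched();
                        }
                        __this_cpu_write(rcu_dynticks.rcu_urgent_qs, false);
                }
                /* ... existing processing ... */
        }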

    Reported-by: Marius Hillenbrand
    Reported-by: David Woodhouse
    Suggested-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    [ paulmck: Backported to make patch apply cleanly on older versions. ]
    Tested-by: Marius Hillenbrand
    Cc: # 4.12.x - 4.19.x
    Signed-off-by: Greg Kroah-Hartman

    Paul E. McKenney
     

04 Oct, 2017

1 commit

  • Pull tracing fixlets from Steven Rostedt:
    "Two updates:

    - A memory fix with leftover code from splitting out ftrace_ops and
    the function graph tracer, where the function graph tracer could reset
    the trampoline pointer, leaving the old trampoline unfreed (a memory
    leak).

    - The update to Paul's patch that added the unnecessary READ_ONCE().
    This removes the unnecessary READ_ONCE() instead of having to
    rebase the branch to update the patch that added it"

    * tag 'trace-v4.14-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    rcu: Remove extraneous READ_ONCE()s from rcu_irq_{enter,exit}()
    ftrace: Fix kmemleak in unregister_ftrace_graph

    Linus Torvalds
     

03 Oct, 2017

1 commit

  • The read of ->dynticks_nmi_nesting in rcu_irq_enter() and rcu_irq_exit()
    is currently protected with READ_ONCE(). However, this protection is
    unnecessary because (1) ->dynticks_nmi_nesting is updated only by the
    current CPU, (2) although NMI handlers can update this field, they reset
    it back to its old value before returning, and (3) interrupts are disabled,
    so nothing else can modify it. The value of ->dynticks_nmi_nesting is
    thus effectively constant, and so no protection is required.

    This commit therefore removes the READ_ONCE() protection from these
    two accesses.
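
    A minimal before/after sketch of the change (illustrative; the surrounding
    code in rcu_irq_exit() is abbreviated):

        /* Before: the read was "protected", suggesting concurrent updates. */
        if (READ_ONCE(rdtp->dynticks_nmi_nesting))
                return;

        /*
         * After: a plain load, because only this CPU writes the field, NMIs
         * restore the old value before returning, and irqs are disabled here.
         */
        if (rdtp->dynticks_nmi_nesting)
                return;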

    Link: http://lkml.kernel.org/r/20170926031902.GA2074@linux.vnet.ibm.com

    Reported-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Paul E. McKenney
     

26 Sep, 2017

1 commit

  • Pull tracing fixes from Steven Rostedt:
    "Stack tracing and RCU has been having issues with each other and
    lockdep has been pointing out constant problems.

    The changes have been going into the stack tracer, but it has been
    discovered that the problem isn't with the stack tracer itself, but it
    is with calling save_stack_trace() from within the internals of RCU.

    The stack tracer is the one that can trigger the issue the easiest,
    but examining the problem further, it could also happen from a WARN()
    in the wrong place, or even if an NMI happened in this area and it did
    an rcu_read_lock().

    The critical area is where RCU is not watching, which can happen while
    going to and from idle, or while bringing up or taking down a CPU.

    The final fix was to put the protection in kernel_text_address() as it
    is the one that requires RCU to be watching while doing the stack
    trace.

    To make this work properly, Paul had to allow rcu_irq_enter() to happen
    after rcu_nmi_enter(). This should have been done anyway, since an NMI
    can page fault (reading vmalloc area), and a page fault triggers
    rcu_irq_enter().

    One patch is just a consolidation of code so that the fix only needed
    to be done in one location"

    * tag 'trace-v4.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Remove RCU work arounds from stack tracer
    extable: Enable RCU if it is not watching in kernel_text_address()
    extable: Consolidate *kernel_text_address() functions
    rcu: Allow for page faults in NMI handlers

    Linus Torvalds
     

24 Sep, 2017

1 commit

  • A number of architectures invoke rcu_irq_enter() on exception entry in
    order to allow RCU read-side critical sections in the exception handler
    when the exception is from an idle or nohz_full CPU. This works, at
    least unless the exception happens in an NMI handler. In that case,
    rcu_nmi_enter() would already have exited the extended quiescent state,
    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
    to think that it is again in an extended quiescent state. This will
    in turn result in lockdep splats in response to later RCU read-side
    critical sections.

    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
    take no action if there is an rcu_nmi_enter() in effect, thus avoiding
    the unscheduled return to RCU quiescent state. This in turn should
    make the kernel safe for on-demand RCU voyeurism.

    Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com

    Cc: stable@vger.kernel.org
    Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
    Reported-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Paul E. McKenney
     

09 Sep, 2017

1 commit

  • First, the number of CPUs can't be a negative number.

    Second, different signedness leads to suboptimal code in the following
    cases:

    1)
    kmalloc(nr_cpu_ids * sizeof(X));

    "int" has to be sign extended to size_t.

    2)
    while (loff_t *pos < nr_cpu_ids)

    MOVSXD is 1 byte longer than the same MOV.

    Other cases exist as well. Basically, the compiler is told that nr_cpu_ids
    can't be negative, which can't be deduced if it is "int".
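
    The codegen point can be seen in a tiny user-space analogue (illustrative
    only; this is not kernel code):

        #include <stdlib.h>

        /*
         * With a signed count, "n * sizeof(long)" forces a sign extension of
         * n to size_t (MOVSXD on x86-64); with an unsigned count a plain
         * zero-extending move suffices, and the compiler also knows the value
         * can never be negative.
         */
        void *alloc_signed(int n)            { return malloc(n * sizeof(long)); }
        void *alloc_unsigned(unsigned int n) { return malloc(n * sizeof(long)); }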

    Code savings on allyesconfig kernel: -3KB

    add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
    function                        old     new   delta
    coretemp_cpu_online             450     512     +62
    rcu_init_one                   1234    1272     +38
    pci_device_probe                374     399     +25
    ...
    pgdat_reclaimable_pages         628     556     -72
    select_fallback_rq              446     369     -77
    task_numa_find_cpu             1923    1807    -116

    Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

17 Aug, 2017

8 commits

  • Merge branches 'doc.2017.08.17a', 'fixes.2017.08.17a', 'hotplug.2017.07.25b',
    'misc.2017.08.17a', 'spin_unlock_wait_no.2017.08.17a', 'srcu.2017.07.27c'
    and 'torture.2017.07.24c' into HEAD

    doc.2017.08.17a: Documentation updates.
    fixes.2017.08.17a: RCU fixes.
    hotplug.2017.07.25b: CPU-hotplug updates.
    misc.2017.08.17a: Miscellaneous fixes outside of RCU (give or take conflicts).
    spin_unlock_wait_no.2017.08.17a: Remove spin_unlock_wait().
    srcu.2017.07.27c: SRCU updates.
    torture.2017.07.24c: Torture-test updates.

    Paul E. McKenney
     
  • The rcu_idle_exit() and rcu_idle_enter() functions are exported because
    they were originally used by RCU_NONIDLE(), which was intended to
    be usable from modules. However, RCU_NONIDLE() now instead uses
    rcu_irq_enter_irqson() and rcu_irq_exit_irqson(), which are not
    exported, and there have been no complaints.

    This commit therefore removes the exports from rcu_idle_exit() and
    rcu_idle_enter().

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • All current callers of rcu_idle_enter() have irqs disabled, and
    rcu_idle_enter() relies on this, but doesn't check. This commit
    therefore adds an RCU_LOCKDEP_WARN() to back that trust with some
    verification. While we are there, pass "true" rather than "1" to
    rcu_eqs_enter().
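
    A minimal sketch of such a check, assuming rcu_idle_enter() simply wraps
    rcu_eqs_enter() as described; the message text is illustrative:

        void rcu_idle_enter(void)
        {
                RCU_LOCKDEP_WARN(!irqs_disabled(),
                                 "rcu_idle_enter() invoked with irqs enabled");
                rcu_eqs_enter(true);
        }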

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • All callers of rcu_idle_enter() have irqs disabled, so there is no
    point in rcu_idle_enter() disabling them again. This commit therefore
    replaces the irq disabling with a RCU_LOCKDEP_WARN().

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Paul E. McKenney

    Peter Zijlstra (Intel)
     
  • This commit adds assertions verifying the consistency of the rcu_node
    structure's ->blkd_tasks list and its ->gp_tasks, ->exp_tasks, and
    ->boost_tasks pointers. In particular, the ->blkd_tasks lists must be
    empty except for leaf rcu_node structures.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Set disable_rcu_irq_enter on not only rcu_eqs_enter_common() but also
    rcu_eqs_exit(), since rcu_eqs_exit() suffers from the same issue as was
    fixed for rcu_eqs_enter_common() by commit 03ecd3f48e57 ("rcu/tracing:
    Add rcu_disabled to denote when rcu_irq_enter() will not work").

    Signed-off-by: Masami Hiramatsu
    Acked-by: Steven Rostedt (VMware)
    Signed-off-by: Paul E. McKenney

    Masami Hiramatsu
     
  • The _rcu_barrier_trace() function is a wrapper for trace_rcu_barrier(),
    which needs TPS() protection for strings passed through the second
    argument. However, it has escaped prior TPS()-ification efforts because
    _rcu_barrier_trace() does not start with "trace_". This commit
    therefore adds the needed TPS() protection.

    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • These RCU waits were set to use interruptible waits to avoid the kthreads
    contributing to system load average, even though they are not interruptible
    as they are spawned from a kthread. Use the new TASK_IDLE swaits, which
    make our goal clear and remove confusion about these paths possibly being
    interruptible -- they are not.

    When the system is idle the RCU grace-period kthread will spend all its time
    blocked inside the swait_event_interruptible(). If an interruptible wait
    were not used, then this kthread would contribute to the load average. This
    means
    that an idle system would have a load average of 2 (or 3 if PREEMPT=y),
    rather than the load average of 0 that almost fifty years of UNIX has
    conditioned sysadmins to expect.

    The same argument applies to swait_event_interruptible_timeout() use. The
    RCU grace-period kthread spends its time blocked inside this call while
    waiting for grace periods to complete. In particular, if there was only one
    busy CPU, but that CPU was frequently invoking call_rcu(), then the RCU
    grace-period kthread would spend almost all its time blocked inside the
    swait_event_interruptible_timeout(). This would mean that the load average
    would be 2 rather than the expected 1 for the single busy CPU.
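
    A minimal sketch of the underlying idea (a hypothetical kthread, not the
    RCU code itself): waiting in TASK_IDLE blocks uninterruptibly but, unlike
    TASK_UNINTERRUPTIBLE, is excluded from the load average:

        #include <linux/kthread.h>
        #include <linux/sched.h>

        /* Hypothetical kthread body; illustrates TASK_IDLE, nothing more. */
        static int example_kthread(void *unused)
        {
                while (!kthread_should_stop()) {
                        set_current_state(TASK_IDLE);   /* UNINTERRUPTIBLE | NOLOAD */
                        if (!kthread_should_stop())
                                schedule();             /* blocked, but not counted as load */
                        __set_current_state(TASK_RUNNING);
                        /* ... do the periodic work ... */
                }
                return 0;
        }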

    Acked-by: "Eric W. Biederman"
    Tested-by: Paul E. McKenney
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Paul E. McKenney

    Luis R. Rodriguez
     

26 Jul, 2017

10 commits

  • After adopting callbacks from a newly offlined CPU, the adopting CPU
    checks to make sure that its callback list's count is zero only if the
    list has no callbacks and vice versa. Unfortunately, it does so after
    enabling interrupts, which means that false positives are possible due to
    interrupt handlers invoking call_rcu(). Although these false positives
    are improbable, rcutorture did make it happen once.

    This commit therefore moves this check to an irq-disabled region of code,
    thus suppressing the false positive.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given that the rcu_state structure's ->orphan_pend and ->orphan_done
    fields are used only during migration of callbacks from the recently
    offlined CPU to a surviving CPU, if rcu_send_cbs_to_orphanage() and
    rcu_adopt_orphan_cbs() are combined, these fields can become local
    variables in the combined function. This commit therefore combines
    rcu_send_cbs_to_orphanage() and rcu_adopt_orphan_cbs() into a new
    rcu_segcblist_merge() function and removes the ->orphan_pend and
    ->orphan_done fields.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • When migrating callbacks from a newly offlined CPU, we are already
    holding the root rcu_node structure's lock, so it costs almost nothing
    to advance and accelerate the newly migrated callbacks. This patch
    therefore makes this advancing and acceleration happen.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The ->orphan_lock is acquired and released only within the
    rcu_migrate_callbacks() function, which now acquires the root rcu_node
    structure's ->lock. This commit therefore eliminates the ->orphan_lock
    in favor of the root rcu_node structure's ->lock.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • It is possible that the outgoing CPU is unaware of recent grace periods,
    and so it is also possible that some of its pending callbacks are actually
    ready to be invoked. The current callback-migration code would needlessly
    force these callbacks to pass through another grace period. This commit
    therefore invokes rcu_advance_cbs() on the outgoing CPU's callbacks in
    order to give them full credit for having passed through any recent
    grace periods.

    This also fixes an odd theoretical bug where there are no callbacks in
    the system except for those on the outgoing CPU, none of those callbacks
    have yet been associated with a grace-period number, there is never again
    another callback registered, and the surviving CPU never again takes a
    scheduling-clock interrupt, never goes idle, and never enters nohz_full
    userspace execution. Yes, this is (just barely) possible. It requires
    that the surviving CPU be a nohz_full CPU, that its scheduler-clock
    interrupt be shut off, and that it loop forever in the kernel. You get
    bonus points if you can make this one happen! ;-)

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU's CPU-hotplug callback-migration code first moves the outgoing
    CPU's callbacks to ->orphan_done and ->orphan_pend, and only then
    moves them to the NOCB callback list. This commit avoids the
    extra step (and simplifies the code) by moving the callbacks directly
    from the outgoing CPU's callback list to the NOCB callback list.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current CPU-hotplug RCU-callback-migration code checks
    for the source (newly offlined) CPU being a NOCBs CPU down in
    rcu_send_cbs_to_orphanage(). This commit simplifies callback migration a
    bit by moving this check up to rcu_migrate_callbacks(). This commit also
    adds a check for the source CPU having no callbacks, which eases analysis
    of the rcu_send_cbs_to_orphanage() and rcu_adopt_orphan_cbs() functions.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_node structure's ->n_cbs_orphaned and ->n_cbs_adopted fields
    are updated, but never read. This commit therefore removes them.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The updates of ->expmaskinitnext and of ->ncpus are unsynchronized,
    with the value of ->ncpus being incremented long before the corresponding
    ->expmaskinitnext mask is updated. If an RCU expedited grace period
    sees ->ncpus change, it will update the ->expmaskinit masks from the new
    ->expmaskinitnext masks. But it is possible that ->ncpus has already
    been updated, but the ->expmaskinitnext masks still have their old values.
    For the current expedited grace period, no harm done. The CPU could not
    have been online before the grace period started, so there is no need to
    wait for its non-existent pre-existing readers.

    But the next RCU expedited grace period is in a world of hurt. The value
    of ->ncpus has already been updated, so this grace period will assume
    that the ->expmaskinitnext masks have not changed. But they have, and
    they won't be taken into account until the next never-been-online CPU
    comes online. This means that RCU will be ignoring some CPUs that it
    should be paying attention to.

    The solution is to update ->ncpus and ->expmaskinitnext while holding
    the ->lock for the rcu_node structure containing the ->expmaskinitnext
    mask. Because smp_store_release() is now used to update ->ncpus and
    smp_load_acquire() is now used to locklessly read it, if the expedited
    grace period sees ->ncpus change, then the updating CPU has to
    already be holding the corresponding ->lock. Therefore, when the
    expedited grace period later acquires that ->lock, it is guaranteed
    to see the new value of ->expmaskinitnext.

    On the other hand, if the expedited grace period loads ->ncpus just
    before an update, earlier full memory barriers guarantee that
    the incoming CPU isn't far enough along to be running any RCU readers.

    This commit therefore makes the required change.
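
    The ordering argument can be summarized in a sketch (simplified; rsp, rnp,
    the mask handling, and ncpus_snap are placeholders rather than the exact
    kernel code):

        /* CPU-hotplug path (sketch): record the new CPU, then publish ->ncpus. */
        static void example_cpu_starting(struct rcu_state *rsp,
                                         struct rcu_node *rnp, unsigned long mask)
        {
                unsigned long flags;

                raw_spin_lock_irqsave(&rnp->lock, flags);
                rnp->expmaskinitnext |= mask;                   /* record the new CPU */
                smp_store_release(&rsp->ncpus, rsp->ncpus + 1); /* release store, under ->lock */
                raw_spin_unlock_irqrestore(&rnp->lock, flags);
        }

        /* Expedited grace period (sketch): lockless sample, then lock to resync. */
        static void example_exp_reset(struct rcu_state *rsp, struct rcu_node *rnp)
        {
                unsigned long flags;

                if (smp_load_acquire(&rsp->ncpus) == rsp->ncpus_snap) /* pairs with release */
                        return;
                rsp->ncpus_snap = smp_load_acquire(&rsp->ncpus);
                /*
                 * The updater held ->lock when it released ->ncpus, so
                 * acquiring that same ->lock guarantees we see its update
                 * to ->expmaskinitnext.
                 */
                raw_spin_lock_irqsave(&rnp->lock, flags);
                rnp->expmaskinit = rnp->expmaskinitnext;
                raw_spin_unlock_irqrestore(&rnp->lock, flags);
        }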

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU callbacks must be migrated away from an outgoing CPU, and this is
    done near the end of the CPU-hotplug operation, after the outgoing CPU is
    long gone. Unfortunately, this means that other CPU-hotplug callbacks
    can execute while the outgoing CPU's callbacks are still immobilized
    on the long-gone CPU's callback lists. If any of these CPU-hotplug
    callbacks must wait, either directly or indirectly, for the invocation
    of any of the immobilized RCU callbacks, the system will hang.

    This commit avoids such hangs by migrating the callbacks away from the
    outgoing CPU immediately upon its departure, shortly after the return
    from __cpu_die() in takedown_cpu(). Thus, RCU is able to advance these
    callbacks and invoke them, which allows all the after-the-fact CPU-hotplug
    callbacks to wait on these RCU callbacks without risk of a hang.

    While in the neighborhood, this commit also moves rcu_send_cbs_to_orphanage()
    and rcu_adopt_orphan_cbs() under a pre-existing #ifdef to avoid including
    dead code on the one hand and to avoid define-without-use warnings on the
    other hand.
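
    A sketch of the resulting hook placement (the helper name
    rcutree_migrate_callbacks() is an assumption; the surrounding code in
    takedown_cpu() is abbreviated):

        static int takedown_cpu(unsigned int cpu)
        {
                /* ... park the hotplug thread, stop the CPU ... */
                __cpu_die(cpu);                  /* outgoing CPU is now gone    */
                rcutree_migrate_callbacks(cpu);  /* adopt its callbacks at once */
                return 0;
        }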

    Reported-by: Jeffrey Hugo
    Link: http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org
    Signed-off-by: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Cc: Ingo Molnar
    Cc: Anna-Maria Gleixner
    Cc: Boris Ostrovsky
    Cc: Richard Weinberger

    Paul E. McKenney
     

09 Jun, 2017

6 commits

  • The NO_HZ_FULL_SYSIDLE full-system-idle capability was added in 2013
    by commit 0edd1b1784cb ("nohz_full: Add full-system-idle state machine"),
    but has not been used. This commit therefore removes it.

    If it turns out to be needed later, this commit can always be reverted.

    Signed-off-by: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Rik van Riel
    Cc: Ingo Molnar
    Acked-by: Linus Torvalds

    Paul E. McKenney
     
  • Anything that can be done with the RCU_KTHREAD_PRIO Kconfig option can
    also be done with the rcutree.kthread_prio kernel boot parameter.
    This commit therefore removes this Kconfig option.

    Reported-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Rik van Riel

    Paul E. McKenney
     
  • The RCU_TORTURE_TEST_SLOW_PREINIT, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY,
    RCU_TORTURE_TEST_SLOW_INIT, RCU_TORTURE_TEST_SLOW_INIT_DELAY,
    RCU_TORTURE_TEST_SLOW_CLEANUP, and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY
    Kconfig options are only
    useful for torture testing, and there are the rcutree.gp_cleanup_delay,
    rcutree.gp_init_delay, and rcutree.gp_preinit_delay kernel boot parameters
    that rcutorture can use instead. The effect of these parameters is to
    artificially slow down grace period initialization and cleanup in order
    to make some types of race conditions happen more often.

    This commit therefore simplifies Tree RCU a bit by removing the Kconfig
    options and adding the corresponding kernel parameters to rcutorture's
    .boot files instead. However, this commit also leaves out the kernel
    parameters for TREE02, TREE04, and TREE07 in order to have about the
    same number of tests slowed as not slowed. TREE01, TREE03, TREE05,
    and TREE06 are slowed, and the rest are not slowed.

    Reported-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The "__call_rcu(): Leaked duplicate callback" error message from
    __call_rcu() has proven to be unhelpful. This commit therefore changes
    it to "__call_rcu(): Double-freed CB" and adds the value of the pointer
    passed in. The value of the pointer improves debuggability by allowing
    correlation with tracing output, for example, the rcu:rcu_callback trace
    event.
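
    A sketch of the reworded diagnostic (the surrounding duplicate-detection
    check and the exact format string are illustrative only):

        if (debug_rcu_head_queue(head)) {
                /*
                 * Printing the pointer lets the report be correlated with
                 * the rcu:rcu_callback trace event for the same callback.
                 */
                WARN_ONCE(1, "__call_rcu(): Double-freed CB %p\n", head);
                return;
        }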

    Reported-by: Vegard Nossum
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The __rcu_is_watching() function is currently not used, aside from
    to implement the rcu_is_watching() function. This commit therefore
    eliminates __rcu_is_watching(), which has the beneficial side-effect
    of shrinking include/linux/rcupdate.h a bit.

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The include/linux/rcupdate.h file is included by more than 200
    files, so shrinking it should provide some build-time benefits.
    This commit therefore moves several docbook comments from rcupdate.h to
    kernel/rcu/update.c, kernel/rcu/tree.c, and kernel/rcu/tree_plugin.h, thus
    reducing the number of times that the compiler has to scan these comments.
    This likely provides only a small benefit, but every little bit helps.

    This commit also fixes a malformed bulleted list noted by the 0day
    Test Robot.

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

08 Jun, 2017

6 commits

  • Comments can be helpful, but assertions carry more force. This
    commit therefore adds lockdep_assert_held() and RCU_LOCKDEP_WARN()
    calls to enforce lock-held and interrupt-disabled preconditions.

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit updates rcu_bootup_announce_oddness() to check additional
    Kconfig options and module/boot parameters.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit adds WARN_ON_ONCE() calls that trigger if either
    rcu_sched_qs() or rcu_bh_qs() are invoked with preemption enabled.
    In the immortal words of Peter Zijlstra: "these are much harder to ignore
    than comments".

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The synchronize_kernel() primitive was removed in favor of
    synchronize_sched() more than a decade ago, and it seems likely that
    rather few kernel hackers are familiar with it. Its continued presence
    is therefore providing more confusion than enlightenment. This commit
    therefore removes the reference from the synchronize_sched() header
    comment, and adds the corresponding information to the synchronize_rcu()
    header comment.

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Although preemptible RCU allows its read-side critical sections to be
    preempted, general blocking is forbidden. The reason for this is that
    excessive preemption times can be handled by CONFIG_RCU_BOOST=y, but a
    voluntarily blocked task doesn't care how high you boost its priority.
    Because preemptible RCU is a global mechanism, one ill-behaved reader
    hurts everyone. Hence the prohibition against general blocking in
    RCU-preempt read-side critical sections. Preemption yes, blocking no.

    This commit enforces this prohibition.

    There is a special exception for the -rt patchset (which they kindly
    volunteered to implement): It is OK to block (as opposed to merely being
    preempted) within an RCU-preempt read-side critical section, but only if
    the blocking is subject to priority inheritance. This exception permits
    CONFIG_RCU_BOOST=y to get -rt RCU readers out of trouble.

    Why doesn't this exception also apply to mainline's rt_mutex? Because
    of the possibility that someone does general blocking while holding
    an rt_mutex. Yes, the priority boosting will affect the rt_mutex,
    but it won't help with the task doing general blocking while holding
    that rt_mutex.
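
    An illustration of the rule being enforced (not code from the patch;
    gp, struct foo, and use() are hypothetical):

        static void bad_reader(void)
        {
                struct foo *p;

                rcu_read_lock();
                p = rcu_dereference(gp);
                use(p);         /* being preempted here is fine: boosting can help  */
                msleep(10);     /* voluntarily blocking here is not: boosting cannot */
                rcu_read_unlock();
        }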

    Reported-by: Thomas Gleixner
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently rcu_barrier() uses call_rcu() to enqueue new callbacks
    on each CPU with a non-empty callback list. This works, but means
    that rcu_barrier() forces grace periods that are not otherwise needed.
    The key point is that rcu_barrier() never needs to wait for a grace
    period, but instead only for all pre-existing callbacks to be invoked.
    This means that rcu_barrier()'s new callbacks should be placed in
    the callback-list segment containing the last pre-existing callback.

    This commit makes this change using the new rcu_segcblist_entrain()
    function.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

11 May, 2017

1 commit

  • Pull RCU updates from Ingo Molnar:
    "The main changes are:

    - Debloat RCU headers

    - Parallelize SRCU callback handling (plus overlapping patches)

    - Improve the performance of Tree SRCU on a CPU-hotplug stress test

    - Documentation updates

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
    rcu: Open-code the rcu_cblist_n_lazy_cbs() function
    rcu: Open-code the rcu_cblist_n_cbs() function
    rcu: Open-code the rcu_cblist_empty() function
    rcu: Separately compile large rcu_segcblist functions
    srcu: Debloat the header
    srcu: Adjust default auto-expediting holdoff
    srcu: Specify auto-expedite holdoff time
    srcu: Expedite first synchronize_srcu() when idle
    srcu: Expedited grace periods with reduced memory contention
    srcu: Make rcutorture writer stalls print SRCU GP state
    srcu: Exact tracking of srcu_data structures containing callbacks
    srcu: Make SRCU be built by default
    srcu: Fix Kconfig botch when SRCU not selected
    rcu: Make non-preemptive schedule be Tasks RCU quiescent state
    srcu: Expedite srcu_schedule_cbs_snp() callback invocation
    srcu: Parallelize callback handling
    kvm: Move srcu_struct fields to end of struct kvm
    rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
    rcu: Use true/false in assignment to bool
    rcu: Use bool value directly
    ...

    Linus Torvalds
     

03 May, 2017

1 commit

  • Because the rcu_cblist_n_lazy_cbs() just samples the ->len_lazy counter,
    and because the rcu_cblist structure is quite straightforward, it makes
    sense to open-code rcu_cblist_n_lazy_cbs(p) as p->len_lazy, cutting out
    a level of indirection. This commit makes this change.
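
    In other words (p being a struct rcu_cblist pointer):

        /* Before: one level of indirection through the accessor. */
        n = rcu_cblist_n_lazy_cbs(p);

        /* After: open-coded field access. */
        n = p->len_lazy;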

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Linus Torvalds

    Paul E. McKenney