26 Jan, 2019

6 commits

  • It turns out that it is queue_delayed_work_on() rather than
    queue_work_on() that has difficulties when used concurrently with
    CPU-hotplug removal operations. It is therefore unnecessary to protect
    CPU identification and queue_work_on() with preempt_disable().

    This commit therefore removes the preempt_disable() and preempt_enable()
    from sync_rcu_exp_select_cpus(), which has the further benefit of reducing
    the number of changes that must be maintained in the -rt patchset.

    Reported-by: Thomas Gleixner
    Reported-by: Sebastian Siewior
    Suggested-by: Boqun Feng
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that _synchronize_rcu_expedited() has only one caller, and given that
    this is a tail call, this commit inlines _synchronize_rcu_expedited()
    into synchronize_rcu_expedited().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that rcu_blocking_is_gp() makes the correct immediate-return
    decision for both PREEMPT and !PREEMPT, a single implementation of
    synchronize_rcu() will work correctly under both configurations.
    This commit therefore eliminates a few lines of code by consolidating
    the two implementations of synchronize_rcu().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
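
    In outline, the consolidated function can then take the same shape under
    both configurations. The following is only a sketch of that shape, not
    the exact kernel code (debug assertions are omitted here):

    void synchronize_rcu(void)
    {
            if (rcu_blocking_is_gp())
                    return;         /* Caller's context already implies a grace period. */
            if (rcu_gp_is_expedited())
                    synchronize_rcu_expedited();
            else
                    wait_rcu_gp(call_rcu);
    }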
     
  • The CONFIG_PREEMPT=n and CONFIG_PREEMPT=y implementations of
    synchronize_rcu_expedited() are quite similar, and with small
    modifications to rcu_blocking_is_gp() can be made identical. This commit
    therefore makes this change in order to save a few lines of code and to
    reduce the amount of duplicate code.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Back when there could be multiple RCU flavors running in the same kernel
    at the same time, it was necessary to specify the expedited grace-period
    IPI handler at runtime. Now that there is only one RCU flavor, the
    IPI handler can be determined at build time. There is therefore no
    longer any reason for the RCU-preempt and RCU-sched IPI handlers to
    have different names, nor is there any reason to pass these handlers in
    function arguments and in the data structures enclosing workqueues.

    This commit therefore makes all these changes, pushing the specification
    of the expedited grace-period IPI handler down to the point of use.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • During expedited RCU grace-period initialization, IPIs are sent to
    all non-idle online CPUs. The IPI handler checks to see if the CPU is
    in a quiescent state, reporting one if so. This handler looks at three
    different cases: (1) The CPU is not in an rcu_read_lock()-based critical
    section, (2) The CPU is in the process of exiting an rcu_read_lock()-based
    critical section, and (3) The CPU is in an rcu_read_lock()-based critical
    section. In case (2), execution falls through into case (3).

    This is harmless from a functionality viewpoint, but can result in
    needless overhead during an improbable corner case. This commit therefore
    adds the "return" statement needed to prevent fall-through.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
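
    The three-way decision can be sketched as follows; the helper names are
    illustrative assumptions rather than the actual handler code:

    static void rcu_exp_handler_sketch(void *unused)
    {
            if (!cpu_in_rcu_reader()) {             /* Case (1): not in a reader. */
                    report_exp_quiescent_state();
                    return;
            }
            if (cpu_exiting_rcu_reader()) {         /* Case (2): reader on its way out. */
                    defer_report_to_rcu_read_unlock();
                    return;                         /* The "return" added by this commit. */
            }
            defer_report_to_rcu_read_unlock();      /* Case (3): still within a reader. */
    }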
     

13 Nov, 2018

1 commit

  • In PREEMPT kernels, an expedited grace period might send an IPI to a
    CPU that is executing an RCU read-side critical section. In that case,
    it would be nice if the rcu_read_unlock() directly interacted with the
    RCU core code to immediately report the quiescent state. And this does
    happen in the case where the reader has been preempted. But it would
    also be a nice performance optimization if immediate reporting also
    happened in the preemption-free case.

    This commit therefore adds an ->exp_hint field to the task_struct structure's
    ->rcu_read_unlock_special field. The IPI handler sets this hint when
    it has interrupted an RCU read-side critical section, and this causes
    the outermost rcu_read_unlock() call to invoke rcu_read_unlock_special(),
    which, if preemption is enabled, reports the quiescent state immediately.
    If preemption is disabled, then the report is required to be deferred
    until preemption (or bottom halves or interrupts or whatever) is re-enabled.

    Because this is a hint, it does nothing for more complicated cases. For
    example, if the IPI interrupts an RCU reader, but interrupts are disabled
    across the rcu_read_unlock(), but another rcu_read_lock() is executed
    before interrupts are re-enabled, the hint will already have been cleared.
    If you do crazy things like this, reporting will be deferred until some
    later RCU_SOFTIRQ handler, context switch, cond_resched(), or similar.

    Reported-by: Joel Fernandes
    Signed-off-by: Paul E. McKenney
    Acked-by: Joel Fernandes (Google)

    Paul E. McKenney
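
    The hint mechanism amounts to the following sketch; the exact layout of
    ->rcu_read_unlock_special shown here is an assumption, not the verified
    kernel definition:

    /* IPI handler, having interrupted an RCU read-side critical section: */
    WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);

    /* Outermost rcu_read_unlock(): any nonzero ->rcu_read_unlock_special,
     * including the hint, diverts into the slow path, which reports the
     * quiescent state immediately if preemption is enabled and defers it
     * otherwise. */
    if (READ_ONCE(t->rcu_read_unlock_special.s))
            rcu_read_unlock_special(t);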
     

12 Nov, 2018

1 commit

  • The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
    to prevent the cpu_online_mask from changing. However, this relies on
    the stop-machine mechanism in the CPU-hotplug offline code, which is not
    desirable (it would be good to someday remove the stop-machine mechanism).

    This commit therefore instead uses the relevant leaf rcu_node structure's
    ->ffmask, which has a bit set for all CPUs that are fully functional.
    A given CPU's bit is cleared very early during offline processing by
    rcutree_offline_cpu() and set very late during online processing by
    rcutree_online_cpu(). Therefore, if a CPU's bit is set in this mask, and
    preemption is disabled, we have to be before the synchronize_sched() in
    the CPU-hotplug offline code, which means that the CPU is guaranteed to be
    workqueue-ready throughout the duration of the enclosing preempt_disable()
    region of code.

    This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
    this leaf rcu_node structure are offline, which is an acceptable difference
    in behavior.

    Reported-by: Sebastian Andrzej Siewior
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
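
    The resulting CPU selection can be sketched as follows, assuming the
    ->ffmask, ->grplo, and ->grphi fields described above (the per-leaf work
    item shown as rnp->rew.rew_work is also an assumption):

    int cpu;

    preempt_disable();
    for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++)
            if (rnp->ffmask & (1UL << (cpu - rnp->grplo)))
                    break;                          /* First workqueue-ready CPU in this leaf. */
    if (cpu > rnp->grphi)
            cpu = WORK_CPU_UNBOUND;                 /* No ready CPUs: let the workqueue choose. */
    queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
    preempt_enable();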
     

31 Aug, 2018

13 commits

  • This commit moves ->dynticks from the rcu_dynticks structure to the
    rcu_data structure, replacing the field of the same name. It also updates
    the code to access ->dynticks from the rcu_data structure and to use the
    rcu_data structure directly, rather than following a pointer to the
    now-gone rcu_dynticks structure and its now-gone ->dynticks field. While
    in the area, this commit also fixes up comments.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit removes ->rcu_need_heavy_qs and ->rcu_urgent_qs from the
    rcu_dynticks structure and updates the code to access them from the
    rcu_data structure.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The resched_cpu() interface is quite handy, but it does acquire the
    specified CPU's runqueue lock, which does not come for free. This
    commit therefore substitutes the following when directing resched_cpu()
    at the current CPU:

    set_tsk_need_resched(current);
    set_preempt_need_resched();

    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra

    Paul E. McKenney
     
  • Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • There now is only one rcu_state structure in a given build of the Linux
    kernel, so there is no need to pass it as a parameter to RCU's rcu_node
    tree's accessor macros. This commit therefore removes the rsp parameter
    from those macros in kernel/rcu/rcu.h, and removes some now-unused rsp
    local variables while in the area.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • There now is only one rcu_state structure in a given build of the
    Linux kernel, so there is no need to pass it as a parameter to
    RCU's functions. This commit therefore removes the rsp parameter
    from the code in kernel/rcu/tree_exp.h, and removes all of the
    rsp local variables while in the area.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • There now is only one rcu_state structure in a given build of the
    Linux kernel, so there is no need to pass it as a parameter to RCU's
    functions. This commit therefore removes the rsp parameter from
    rcu_get_root().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_state_p pointer references the default rcu_state structure,
    that is, the one that call_rcu() uses, as opposed to call_rcu_bh()
    and sometimes call_rcu_sched(). But there is now only one rcu_state
    structure, so that one structure is by definition the default, which
    means that the rcu_state_p pointer no longer serves any useful purpose.
    This commit therefore removes it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_state structure's ->rda field was used to find the per-CPU
    rcu_data structures corresponding to that rcu_state structure. But now
    there is only one rcu_state structure (creatively named "rcu_state")
    and one set of per-CPU rcu_data structures (creatively named "rcu_data").
    Therefore, uses of the ->rda field can always be replaced by "rcu_data",
    and this commit makes that change and removes the ->rda field.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_state structure's ->call field references the corresponding RCU
    flavor's call_rcu() function. However, now that there is only ever one
    rcu_state structure in a given build of the Linux kernel, and that flavor
    uses plain old call_rcu(), there is not a lot of point in continuing to
    have the ->call field. This commit therefore removes it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that RCU-preempt knows about preemption disabling, its implementation
    of synchronize_rcu() works for synchronize_sched(), and likewise for the
    other RCU-sched update-side API members. This commit therefore confines
    the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines
    RCU-sched's update-side API members in terms of those of RCU-preempt.

    This means that any given build of the Linux kernel has only one
    update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds
    and RCU-sched for CONFIG_PREEMPT=n builds. This in turn means that kernels
    built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU.

    Signed-off-by: Paul E. McKenney
    Cc: Andi Kleen

    Paul E. McKenney
     
  • The rcu_report_exp_rdp() function is always invoked with its "wake"
    argument set to "true", so this commit drops this parameter. The only
    potential call site that would use "false" is in the code driving the
    expedited grace period, and that code uses rcu_report_exp_cpu_mult()
    instead, which therefore retains its "wake" parameter.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit defers reporting of RCU-preempt quiescent states at
    rcu_read_unlock_special() time when any of interrupts, softirq, or
    preemption are disabled. These deferred quiescent states are reported
    at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug
    offline operation. Of course, if another RCU read-side critical
    section has started in the meantime, the reporting of the quiescent
    state will be further deferred.

    This also means that disabling preemption, interrupts, and/or
    softirqs will act as an RCU-preempt read-side critical section.
    This is enforced by checking preempt_count() as needed.

    Some special cases must be handled on an ad-hoc basis, for example,
    context switch is a quiescent state even though both the scheduler and
    do_exit() disable preemption. In these cases, additional calls to
    rcu_preempt_deferred_qs() override the preemption disabling. Similar
    logic overrides disabled interrupts in rcu_preempt_check_callbacks()
    because in this case the quiescent state happened just before the
    corresponding scheduling-clock interrupt.

    In theory, this change lifts a long-standing restriction that required
    that if interrupts were disabled across a call to rcu_read_unlock()
    that the matching rcu_read_lock() also be contained within that
    interrupts-disabled region of code. Because the reporting of the
    corresponding RCU-preempt quiescent state is now deferred until
    after interrupts have been enabled, it is no longer possible for this
    situation to result in deadlocks involving the scheduler's runqueue and
    priority-inheritance locks. This may allow some code simplification that
    might reduce interrupt latency a bit. Unfortunately, in practice this
    would also defer deboosting a low-priority task that had been subjected
    to RCU priority boosting, so real-time-response considerations might
    well force this restriction to remain in place.

    Because RCU-preempt grace periods are now blocked not only by RCU
    read-side critical sections, but also by disabling of interrupts,
    preemption, and softirqs, it will be possible to eliminate RCU-bh and
    RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels. This may
    require some additional plumbing to provide the network denial-of-service
    guarantees that have been traditionally provided by RCU-bh. Once these
    are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh
    into RCU-sched. This would mean that all kernels would have but
    one flavor of RCU, which would open the door to significant code
    cleanup.

    Moving to a single flavor of RCU would also have the beneficial effect
    of reducing the NOCB kthreads by at least a factor of two.

    Signed-off-by: Paul E. McKenney
    [ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback
    from Joel Fernandes. ]
    [ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in
    response to bug reports from kbuild test robot. ]
    [ paulmck: Fix bug located by kbuild test robot involving recursion
    via rcu_preempt_deferred_qs(). ]

    Paul E. McKenney
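
    The defer-or-report decision can be sketched as follows; the two action
    helpers are hypothetical stand-ins, with only the preempt_count() and
    irqs_disabled() tests following from the description above:

    bool cannot_report_now = irqs_disabled() ||
                             (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK));

    if (cannot_report_now)
            mark_deferred_quiescent_state(t);       /* Reported later from RCU_SOFTIRQ, etc. */
    else
            report_quiescent_state_now(t);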
     

14 Aug, 2018

1 commit

  • Pull scheduler updates from Thomas Gleixner:

    - Cleanup and improvement of NUMA balancing

    - Refactoring and improvements to the PELT (Per Entity Load Tracking)
    code

    - Watchdog simplification and related cleanups

    - The usual pile of small incremental fixes and improvements

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
    watchdog: Reduce message verbosity
    stop_machine: Reflow cpu_stop_queue_two_works()
    sched/numa: Move task_numa_placement() closer to numa_migrate_preferred()
    sched/numa: Use group_weights to identify if migration degrades locality
    sched/numa: Update the scan period without holding the numa_group lock
    sched/numa: Remove numa_has_capacity()
    sched/numa: Modify migrate_swap() to accept additional parameters
    sched/numa: Remove unused task_capacity from 'struct numa_stats'
    sched/numa: Skip nodes that are at 'hoplimit'
    sched/debug: Reverse the order of printing faults
    sched/numa: Use task faults only if numa_group is not yet set up
    sched/numa: Set preferred_node based on best_cpu
    sched/numa: Simplify load_too_imbalanced()
    sched/numa: Evaluate move once per node
    sched/numa: Remove redundant field
    sched/debug: Show the sum wait time of a task group
    sched/fair: Remove #ifdefs from scale_rt_capacity()
    sched/core: Remove get_cpu() from sched_fork()
    sched/cpufreq: Clarify sugov_get_util()
    sched/sysctl: Remove unused sched_time_avg_ms sysctl
    ...

    Linus Torvalds
     

13 Jul, 2018

1 commit

  • Currently, the parallelized initialization of expedited grace periods uses
    the workqueue associated with each rcu_node structure's ->grplo field.
    This works fine unless that CPU is offline. This commit therefore uses
    the lowest-numbered online CPU corresponding to that rcu_node structure,
    or just queues the work on WORK_CPU_UNBOUND if there are no online CPUs
    corresponding to this rcu_node structure.

    Note that this patch uses cpu_is_offline() instead of the usual approach
    of checking bits in the rcu_node structure's ->qsmaskinitnext field. This
    is safe because preemption is disabled across both the cpu_is_offline()
    check and the call to queue_work_on().

    Signed-off-by: Boqun Feng
    [ paulmck: Disable preemption to close offline race window. ]
    Signed-off-by: Paul E. McKenney
    [ paulmck: Apply Peter Zijlstra feedback on CPU selection. ]
    Tested-by: Aneesh Kumar K.V

    Boqun Feng
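
    The selection logic reads roughly as follows; queue_work_on(),
    cpu_is_offline(), and WORK_CPU_UNBOUND are the real interfaces, while the
    per-leaf fields shown are assumptions:

    int cpu;

    preempt_disable();                              /* Close the race with CPU-hotplug removal. */
    for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++)
            if (!cpu_is_offline(cpu))
                    break;                          /* Lowest-numbered online CPU for this leaf. */
    if (cpu > rnp->grphi)
            cpu = WORK_CPU_UNBOUND;                 /* All offline: let the workqueue decide. */
    queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
    preempt_enable();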
     

26 Jun, 2018

1 commit

  • During expedited grace-period initialization, a work item is scheduled
    for each leaf rcu_node structure. However, that initialization code
    is itself (normally) executing from a workqueue, so one of the leaf
    rcu_node structures could just as well be handled by that pre-existing
    workqueue, and with less overhead. This commit therefore uses a
    shiny new rcu_is_leaf_node() macro to execute the last leaf rcu_node
    structure's initialization directly from the pre-existing workqueue.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
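
    In outline, the dispatch loop then looks something like the sketch below,
    where the "last leaf" test and the field names are assumptions standing in
    for the actual code:

    rcu_for_each_leaf_node(rsp, rnp) {
            if (this_is_the_last_leaf(rnp)) {
                    /* Last leaf: run the initialization right here, on the
                     * workqueue that is already executing this function. */
                    sync_rcu_exp_select_node_cpus(&rnp->rew.rew_work);
                    continue;
            }
            INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
            queue_work_on(rnp->grplo, rcu_par_gp_wq, &rnp->rew.rew_work);
    }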
     

20 Jun, 2018

1 commit

  • Since swait basically implemented exclusive waits only, make sure
    the API reflects that.

    $ git grep -l -e "\"
    -e "\" | while read file;
    do
    sed -i -e 's/\/&_one/g'
    -e 's/\/&_exclusive/g' $file;
    done

    With a few manual touch-ups.

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Acked-by: Linus Torvalds
    Cc: bigeasy@linutronix.de
    Cc: oleg@redhat.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: pbonzini@redhat.com
    Link: https://lkml.kernel.org/r/20180612083909.261946548@infradead.org

    Peter Zijlstra
     

16 May, 2018

5 commits

  • …orture.2018.05.15a' into HEAD

    exp.2018.05.15a: Parallelize expedited grace-period initialization.
    fixes.2018.05.15a: Miscellaneous fixes.
    lock.2018.05.15a: Decrease lock contention on root rcu_node structure,
    which is a step towards merging RCU flavors.
    torture.2018.05.15a: Torture-test updates.

    Paul E. McKenney
     
  • Commit ae91aa0adb14 ("rcu: Remove debugfs tracing") removed the
    RCU debugfs tracing code, but did not remove the no-longer used
    ->exp_workdone{0,1,2,3} fields in the srcu_data structure. This commit
    therefore removes these fields along with the code that uselessly
    updates them.

    Signed-off-by: Byungchul Park
    Signed-off-by: Paul E. McKenney
    Tested-by: Nicholas Piggin

    Byungchul Park
     
  • Currently, sync_rcu_preempt_exp_done() is called from some callsites
    without the corresponding rcu_node structure's ->lock held, which can
    introduce bugs, as in this scenario from Paul:

    o CPU 0 in sync_rcu_preempt_exp_done() reads ->exp_tasks and
    sees that it is NULL.

    o CPU 1 blocks within an RCU read-side critical section, so
    it enqueues the task and points ->exp_tasks at it and
    clears CPU 1's bit in ->expmask.

    o All other CPUs clear their bits in ->expmask.

    o CPU 0 reads ->expmask, sees that it is zero, so incorrectly
    concludes that all quiescent states have completed, despite
    the fact that ->exp_tasks is non-NULL.

    To fix this, sync_rcu_preempt_exp_done_unlocked() is introduced to replace
    the lockless callsites of sync_rcu_preempt_exp_done().

    Further, a lockdep annotation is added into sync_rcu_preempt_exp_done()
    to prevent mis-use in the future.

    Signed-off-by: Boqun Feng
    Signed-off-by: Paul E. McKenney
    Tested-by: Nicholas Piggin

    Boqun Feng
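
    The shape of the fix, as a sketch (the rcu_node locking wrappers shown here
    are the ones RCU uses elsewhere; treat the exact names as assumptions):

    static bool sync_rcu_preempt_exp_done_unlocked(struct rcu_node *rnp)
    {
            unsigned long flags;
            bool ret;

            raw_spin_lock_irqsave_rcu_node(rnp, flags);     /* Consistent view of ->exp_tasks/->expmask. */
            ret = sync_rcu_preempt_exp_done(rnp);
            raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
            return ret;
    }

    /* And within sync_rcu_preempt_exp_done() itself: */
    raw_lockdep_assert_held_rcu_node(rnp);                  /* Catch future lockless callers. */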
     
  • Since commit d9a3da0699b2 ("rcu: Add expedited grace-period support
    for preemptible RCU"), there are comments for some functions in
    rcu_report_exp_rnp()'s call chain saying that exp_mutex or its
    predecessors need to be held.

    However, exp_mutex and its predecessors were used only to synchronize
    between GPs, and it is clear that all variables visited by those functions
    are under the protection of the rcu_node structure's ->lock. Moreover,
    those functions are currently called without exp_mutex held, which does
    not seem to introduce any trouble.

    So this patch fixes this problem by updating the comments to match the
    current code.

    Signed-off-by: Boqun Feng
    Fixes: d9a3da0699b2 ("rcu: Add expedited grace-period support for preemptible RCU")
    Signed-off-by: Paul E. McKenney
    Tested-by: Nicholas Piggin

    Boqun Feng
     
  • The latency of RCU expedited grace periods grows with increasing numbers
    of CPUs, eventually failing to be all that expedited. Much of the growth
    in latency is in the initialization phase, so this commit uses workqueues
    to carry out this initialization concurrently on a rcu_node-by-rcu_node
    basis.

    This change makes use of a new rcu_par_gp_wq because flushing a work
    item from another work item running from the same workqueue can result
    in deadlock.

    Signed-off-by: Paul E. McKenney
    Tested-by: Nicholas Piggin

    Paul E. McKenney
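
    The arrangement can be sketched as follows; alloc_workqueue(),
    queue_work_on(), and flush_work() are the real interfaces, and the
    per-leaf field names are assumptions:

    /* Per-leaf initialization gets its own workqueue... */
    rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);

    /* ...so the expedited-GP machinery, which itself runs from a different
     * workqueue, can queue and then flush each leaf's work item without
     * deadlocking on its own queue: */
    queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
    flush_work(&rnp->rew.rew_work);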
     

24 Feb, 2018

1 commit

  • RCU's expedited grace periods can participate in out-of-memory deadlocks
    due to all available system_wq kthreads being blocked and there not being
    memory available to create more. This commit prevents such deadlocks
    by allocating an RCU-specific workqueue_struct at early boot time, and
    providing it with a rescuer to ensure forward progress. This uses the
    shiny new init_rescuer() function provided by Tejun (but indirectly).

    This commit also causes SRCU to use this new RCU-specific
    workqueue_struct. Note that SRCU's use of workqueues never blocks them
    waiting for readers, so this should be safe from a forward-progress
    viewpoint. Note that this moves SRCU from system_power_efficient_wq
    to a normal workqueue. In the unlikely event that this results in
    measurable degradation, a separate power-efficient workqueue will be
    created for SRCU.

    Reported-by: Prateek Sood
    Reported-by: Tejun Heo
    Signed-off-by: Paul E. McKenney
    Acked-by: Tejun Heo

    Paul E. McKenney
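
    The key ingredient is the WQ_MEM_RECLAIM flag, which causes the workqueue
    code to attach a rescuer thread to the queue. A sketch of the early-boot
    creation and of redirecting work onto it (the rcu_gp_wq name and the work
    item in the second line are assumptions):

    rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);   /* Rescuer guarantees progress. */
    queue_work(rcu_gp_wq, &rew.rew_work);                       /* Instead of system_wq. */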
     

21 Feb, 2018

3 commits

  • This commit reworks the first loop in sync_rcu_exp_select_cpus()
    to avoid doing unnecessary stores to other CPUs' rcu_data
    structures. This speeds up that first loop by roughly a factor of
    two on an old x86 system. In the case where the system is mostly
    idle, this loop incurs a large fraction of the overhead of the
    synchronize_rcu_expedited(). There is less benefit on busy systems
    because the overhead of the smp_call_function_single() in the second
    loop dominates in that case.

    However, it is not unusual to make configuration changes involving
    RCU grace periods (both expedited and normal) while the system is
    mostly idle, so this optimization is worth doing.

    While we are in the area, this commit also adds parentheses to arguments
    used by the for_each_leaf_node_possible_cpu() macro.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • If a CPU is transitioning to or from offline state, an expedited
    grace period may undergo a timed wait. This timed wait can unduly
    delay grace periods, so this commit adds a trace statement to make
    it visible.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit adds more tracing of expedited grace periods to enable
    improved debugging of slowdowns.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

26 Jul, 2017

1 commit

  • The updates of ->expmaskinitnext and ->ncpus are unsynchronized,
    with the value of ->ncpus being incremented long before the corresponding
    ->expmaskinitnext mask is updated. If an RCU expedited grace period
    sees ->ncpus change, it will update the ->expmaskinit masks from the new
    ->expmaskinitnext masks. But it is possible that ->ncpus has already
    been updated, but the ->expmaskinitnext masks still have their old values.
    For the current expedited grace period, no harm done. The CPU could not
    have been online before the grace period started, so there is no need to
    wait for its non-existent pre-existing readers.

    But the next RCU expedited grace period is in a world of hurt. The value
    of ->ncpus has already been updated, so this grace period will assume
    that the ->expmaskinitnext masks have not changed. But they have, and
    they won't be taken into account until the next never-been-online CPU
    comes online. This means that RCU will be ignoring some CPUs that it
    should be paying attention to.

    The solution is to update ->ncpus and ->expmaskinitnext while holding
    the ->lock for the rcu_node structure containing the ->expmaskinitnext
    mask. Because smp_store_release() is now used to update ->ncpus and
    smp_load_acquire() is now used to locklessly read it, if the expedited
    grace period sees ->ncpus change, then the updating CPU has to
    already be holding the corresponding ->lock. Therefore, when the
    expedited grace period later acquires that ->lock, it is guaranteed
    to see the new value of ->expmaskinitnext.

    On the other hand, if the expedited grace period loads ->ncpus just
    before an update, earlier full memory barriers guarantee that
    the incoming CPU isn't far enough along to be running any RCU readers.

    This commit therefore makes the required change.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
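
    The resulting ordering protocol can be sketched as follows, using the
    rsp-> naming of that era; the change-detection variable on the read side
    is an assumption:

    /* CPU-online path, holding the leaf rcu_node structure's ->lock: */
    raw_spin_lock_irqsave_rcu_node(rnp, flags);
    rnp->expmaskinitnext |= mask;                     /* Record the incoming CPU... */
    smp_store_release(&rsp->ncpus, rsp->ncpus + 1);   /* ...before publishing the new count. */
    raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

    /* Expedited grace-period path: */
    ncpus = smp_load_acquire(&rsp->ncpus);            /* Pairs with the store-release above. */
    if (ncpus != ncpus_seen_last_time) {
            /* A subsequent acquisition of rnp->lock is now guaranteed to see
             * the updated ->expmaskinitnext, so ->expmaskinit can be refreshed. */
    }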
     

19 Apr, 2017

2 commits

  • The expedited grace-period code contains several open-coded shifts that
    know the format of an rcu_seq grace-period counter, which is not
    particularly good style. This commit therefore creates a new
    rcu_seq_ctr() function that extracts the counter portion of the
    counter, and an rcu_seq_state() function that extracts the low-order
    state bit. This commit prepares for SRCU callback parallelization,
    which will require two state bits.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
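
    In outline, the two new accessors simply split the sequence value into its
    counter and state portions. This sketch assumes a single low-order state
    bit, as was the case when this commit was applied:

    #define RCU_SEQ_CTR_SHIFT   1   /* One state bit here; SRCU will need two. */
    #define RCU_SEQ_STATE_MASK  ((1 << RCU_SEQ_CTR_SHIFT) - 1)

    static inline unsigned long rcu_seq_ctr(unsigned long s)
    {
            return s >> RCU_SEQ_CTR_SHIFT;          /* Grace-period count. */
    }

    static inline unsigned long rcu_seq_state(unsigned long s)
    {
            return s & RCU_SEQ_STATE_MASK;          /* Low-order state bit(s). */
    }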
     
  • Expedited grace periods use workqueue handlers that wake up the requesters,
    but there is no lock mediating this wakeup. Therefore, memory barriers
    are required to ensure that the handler's memory references are seen by
    all to occur before synchronize_*_expedited() returns to its caller.
    Possibly detected by syzkaller.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
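
    The general pattern being enforced, shown generically rather than as the
    actual RCU code (gp_done and gp_wq are hypothetical):

    /* Workqueue-handler side: */
    smp_mb();                           /* Order the handler's prior accesses... */
    WRITE_ONCE(gp_done, true);          /* ...before the requester can observe completion. */
    wake_up(&gp_wq);

    /* Requester side, in synchronize_*_expedited(): */
    wait_event(gp_wq, READ_ONCE(gp_done));
    smp_mb();                           /* Order the completion check before returning to the caller. */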