23 Aug, 2016

1 commit

  • The current implementation of expedited grace periods has the user
    task drive the grace period. This works, but has downsides: (1) The
    user task must awaken tasks piggybacking on this grace period, which
    can result in latencies rivaling that of the grace period itself, and
    (2) User tasks can receive signals, which interfere with RCU CPU stall
    warnings.

    This commit therefore uses workqueues to drive the grace periods, so
    that the user task need not do the awakening. A subsequent commit
    will remove the now-unnecessary code allowing for signals.
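
    A hedged sketch of the shape of this change, using illustrative names
    (struct rcu_exp_work, drive_expedited_gp(), next_expedited_seq()) rather
    than the kernel's actual identifiers: the requester queues a work item
    and sleeps on a completion, and the workqueue handler both drives the
    grace period and performs the wakeup.

    #include <linux/completion.h>
    #include <linux/workqueue.h>

    struct rcu_exp_work {
            struct work_struct work;
            unsigned long exp_seq;          /* which expedited GP is needed */
            struct completion done;
    };

    static void rcu_exp_gp_workfn(struct work_struct *wp)
    {
            struct rcu_exp_work *rewp = container_of(wp, struct rcu_exp_work, work);

            drive_expedited_gp(rewp->exp_seq);  /* hypothetical: select CPUs, wait, clean up */
            complete(&rewp->done);              /* the workqueue, not the user task, awakens waiters */
    }

    static void synchronize_exp_sketch(void)
    {
            struct rcu_exp_work rew;

            rew.exp_seq = next_expedited_seq(); /* hypothetical */
            init_completion(&rew.done);
            INIT_WORK_ONSTACK(&rew.work, rcu_exp_gp_workfn);
            queue_work(system_unbound_wq, &rew.work);
            wait_for_completion(&rew.done);     /* sleeps uninterruptibly: no signal handling needed */
            destroy_work_on_stack(&rew.work);
    }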

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

16 Jun, 2016

1 commit

  • In many cases in the RCU tree code, we iterate over the set of cpus for
    a leaf node described by rcu_node::grplo and rcu_node::grphi, checking
    per-cpu data for each cpu in this range. However, if the set of possible
    cpus is sparse, some cpus described in this range are not possible, and
    thus no per-cpu region will have been allocated (or initialised) for
    them by the generic percpu code.

    Erroneous accesses to a per-cpu area for these !possible cpus may fault
    or may hit other data, depending on the address generated when the
    erroneous per-CPU offset is applied. In practice, both cases have been
    observed on arm64 hardware (the former being silent, but detectable with
    additional patches).

    To avoid issues resulting from this, we must iterate over the set of
    *possible* cpus for a given leaf node. This patch adds a new helper,
    for_each_leaf_node_possible_cpu, to enable this. As iteration is often
    intertwined with rcu_node local bitmask manipulation, a new
    leaf_node_cpu_bit helper is added to make this simpler and more
    consistent. The RCU tree code is made to use both of these where
    appropriate.
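
    A sketch of the two helpers and a typical use; the actual definitions
    live in kernel/rcu/tree.h and may differ in detail.

    /* Iterate over only the *possible* CPUs covered by leaf rcu_node rnp. */
    #define for_each_leaf_node_possible_cpu(rnp, cpu) \
            for ((cpu) = cpumask_next((rnp)->grplo - 1, cpu_possible_mask); \
                 (cpu) <= (rnp)->grphi; \
                 (cpu) = cpumask_next((cpu), cpu_possible_mask))

    /* Bit in rnp's bitmasks (->qsmask, ->expmask, ...) corresponding to cpu. */
    #define leaf_node_cpu_bit(rnp, cpu) \
            (1UL << ((cpu) - (rnp)->grplo))

    /* Usage sketch: per-CPU areas exist for every CPU this loop visits. */
    static void leaf_scan_sketch(struct rcu_node *rnp)
    {
            int cpu;

            for_each_leaf_node_possible_cpu(rnp, cpu) {
                    if (!(READ_ONCE(rnp->expmask) & leaf_node_cpu_bit(rnp, cpu)))
                            continue;
                    /* ... safe to touch this CPU's rcu_data here ... */
            }
    }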

    Without this patch, running reboot at a shell can result in an oops
    like:

    [ 3369.075979] Unable to handle kernel paging request at virtual address ffffff8008b21b4c
    [ 3369.083881] pgd = ffffffc3ecdda000
    [ 3369.087270] [ffffff8008b21b4c] *pgd=00000083eca48003, *pud=00000083eca48003, *pmd=0000000000000000
    [ 3369.096222] Internal error: Oops: 96000007 [#1] PREEMPT SMP
    [ 3369.101781] Modules linked in:
    [ 3369.104825] CPU: 2 PID: 1817 Comm: NetworkManager Tainted: G W 4.6.0+ #3
    [ 3369.121239] task: ffffffc0fa13e000 ti: ffffffc3eb940000 task.ti: ffffffc3eb940000
    [ 3369.128708] PC is at sync_rcu_exp_select_cpus+0x188/0x510
    [ 3369.134094] LR is at sync_rcu_exp_select_cpus+0x104/0x510
    [ 3369.139479] pc : [] lr : [] pstate: 200001c5
    [ 3369.146860] sp : ffffffc3eb9435a0
    [ 3369.150162] x29: ffffffc3eb9435a0 x28: ffffff8008be4f88
    [ 3369.155465] x27: ffffff8008b66c80 x26: ffffffc3eceb2600
    [ 3369.160767] x25: 0000000000000001 x24: ffffff8008be4f88
    [ 3369.166070] x23: ffffff8008b51c3c x22: ffffff8008b66c80
    [ 3369.171371] x21: 0000000000000001 x20: ffffff8008b21b40
    [ 3369.176673] x19: ffffff8008b66c80 x18: 0000000000000000
    [ 3369.181975] x17: 0000007fa951a010 x16: ffffff80086a30f0
    [ 3369.187278] x15: 0000007fa9505590 x14: 0000000000000000
    [ 3369.192580] x13: ffffff8008b51000 x12: ffffffc3eb940000
    [ 3369.197882] x11: 0000000000000006 x10: ffffff8008b51b78
    [ 3369.203184] x9 : 0000000000000001 x8 : ffffff8008be4000
    [ 3369.208486] x7 : ffffff8008b21b40 x6 : 0000000000001003
    [ 3369.213788] x5 : 0000000000000000 x4 : ffffff8008b27280
    [ 3369.219090] x3 : ffffff8008b21b4c x2 : 0000000000000001
    [ 3369.224406] x1 : 0000000000000001 x0 : 0000000000000140
    ...
    [ 3369.972257] [] sync_rcu_exp_select_cpus+0x188/0x510
    [ 3369.978685] [] synchronize_rcu_expedited+0x64/0xa8
    [ 3369.985026] [] synchronize_net+0x24/0x30
    [ 3369.990499] [] dev_deactivate_many+0x28c/0x298
    [ 3369.996493] [] __dev_close_many+0x60/0xd0
    [ 3370.002052] [] __dev_close+0x28/0x40
    [ 3370.007178] [] __dev_change_flags+0x8c/0x158
    [ 3370.012999] [] dev_change_flags+0x20/0x60
    [ 3370.018558] [] do_setlink+0x288/0x918
    [ 3370.023771] [] rtnl_newlink+0x398/0x6a8
    [ 3370.029158] [] rtnetlink_rcv_msg+0xe4/0x220
    [ 3370.034891] [] netlink_rcv_skb+0xc4/0xf8
    [ 3370.040364] [] rtnetlink_rcv+0x2c/0x40
    [ 3370.045663] [] netlink_unicast+0x160/0x238
    [ 3370.051309] [] netlink_sendmsg+0x2f0/0x358
    [ 3370.056956] [] sock_sendmsg+0x18/0x30
    [ 3370.062168] [] ___sys_sendmsg+0x26c/0x280
    [ 3370.067728] [] __sys_sendmsg+0x44/0x88
    [ 3370.073027] [] SyS_sendmsg+0x10/0x20
    [ 3370.078153] [] el0_svc_naked+0x24/0x28

    Signed-off-by: Mark Rutland
    Reported-by: Dennis Chen
    Cc: Catalin Marinas
    Cc: Josh Triplett
    Cc: Lai Jiangshan
    Cc: Mathieu Desnoyers
    Cc: Steve Capper
    Cc: Steven Rostedt
    Cc: Will Deacon
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Paul E. McKenney

    Mark Rutland
     

22 Apr, 2016

1 commit


01 Apr, 2016

5 commits

  • Recent kernels can fail to awaken the grace-period kthread for
    quiescent-state forcing. This commit is a crude hack that does
    a wakeup if a scheduling-clock interrupt sees that it has been
    too long since force-quiescent-state (FQS) processing.
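
    A hedged sketch of the kind of check this implies in the scheduling-clock
    interrupt path; the field and helper names are approximations, and
    wake_gp_kthread() is a hypothetical wrapper for the actual wakeup.

    static void check_fqs_starvation_sketch(struct rcu_state *rsp)
    {
            unsigned long j = jiffies;

            /* Called from the scheduling-clock interrupt: if FQS processing
             * appears overdue, poke the grace-period kthread. */
            if (rcu_gp_in_progress(rsp) &&
                time_after(j, READ_ONCE(rsp->jiffies_force_qs) + HZ))
                    wake_gp_kthread(rsp);   /* hypothetical: crude, unconditional wakeup */
    }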

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current expedited grace-period implementation makes subsequent grace
    periods wait on wakeups for the prior grace period. This does not fit
    the dictionary definition of "expedited", so this commit allows these two
    phases to overlap. Doing this requires four waitqueues rather than two
    because tasks can now be waiting on the previous, current, and next grace
    periods. The fourth waitqueue makes the bit masking work out nicely.
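
    A hedged sketch of the indexing (the structure and helpers are
    illustrative): the expedited sequence counter advances once at the start
    and once at the end of each grace period, so using its low-order bits to
    pick one of four queues keeps waiters for the previous, current, and next
    grace periods on separate queues.

    #include <linux/wait.h>

    struct exp_sketch {
            wait_queue_head_t exp_wq[4];    /* enough for previous/current/next */
            unsigned long exp_seq;          /* odd while an expedited GP runs   */
    };

    /* Low-order counter bits select a queue, so waiters on adjacent grace
     * periods never share one, and the 0x3 mask keeps the indexing cheap. */
    static inline int exp_wq_index(unsigned long s)
    {
            return (s >> 1) & 0x3;
    }

    static void exp_wait_sketch(struct exp_sketch *es, unsigned long snap)
    {
            wait_event(es->exp_wq[exp_wq_index(snap)],
                       ULONG_CMP_GE(READ_ONCE(es->exp_seq), snap));
    }

    static void exp_wake_sketch(struct exp_sketch *es, unsigned long completed)
    {
            /* Wake only the just-completed GP's queue; waiters for the next
             * grace period sit on a different queue and are left undisturbed. */
            wake_up_all(&es->exp_wq[exp_wq_index(completed)]);
    }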

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current mutex-based funnel-locking approach used by expedited grace
    periods is subject to severe unfairness. The problem arises when a
    few tasks, making a path from leaves to root, all wake up before other
    tasks do. A new task can then follow this path all the way to the root,
    which needlessly delays tasks whose grace period is done, but who do
    not happen to acquire the lock quickly enough.

    This commit avoids this problem by maintaining per-rcu_node wait queues,
    along with a per-rcu_node counter that tracks the latest grace period
    sought by an earlier task to visit this node. If that grace period
    would satisfy the current task, instead of proceeding up the tree,
    it waits on the current rcu_node structure using a pair of wait queues
    provided for that purpose. This decouples awakening of old tasks from
    the arrival of new tasks.

    If the wakeups prove to be a bottleneck, additional kthreads can be
    brought to bear for that purpose.
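
    A hedged sketch of the per-node decision (->exp_seq_rq and ->exp_lock
    follow the commit's description but are not guaranteed to match the real
    names): if an earlier visitor has already requested a grace period that
    covers this task, wait here rather than climbing further.

    static bool exp_funnel_wait_here(struct rcu_node *rnp, unsigned long s)
    {
            /* Has an earlier visitor already requested a GP that covers us? */
            if (ULONG_CMP_GE(READ_ONCE(rnp->exp_seq_rq), s))
                    return true;    /* yes: wait on this rcu_node's wait queue */

            /* No: record our request and keep climbing toward the root. */
            spin_lock(&rnp->exp_lock);
            if (ULONG_CMP_LT(rnp->exp_seq_rq, s))
                    WRITE_ONCE(rnp->exp_seq_rq, s);
            spin_unlock(&rnp->exp_lock);
            return false;
    }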

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Just a name change to save a few lines and a bit of typing.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Commit cdacbe1f91264 ("rcu: Add fastpath bypassing funnel locking")
    turns out to be a pessimization at high load because it forces a tree
    full of tasks to wait for an expedited grace period that they probably
    do not need. This commit therefore removes this optimization.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

15 Mar, 2016

1 commit


25 Feb, 2016

2 commits

  • As of commit dae6e64d2bcfd ("rcu: Introduce proper blocking to no-CBs kthreads
    GP waits") the RCU subsystem started making use of wait queues.

    Here we convert all additions of RCU wait queues to use simple wait queues,
    since they don't need the extra overhead of the full wait queue features.

    Originally this was done for RT kernels[1], since we would get things like...

    BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
    in_atomic(): 1, irqs_disabled(): 1, pid: 8, name: rcu_preempt
    Pid: 8, comm: rcu_preempt Not tainted
    Call Trace:
    [] __might_sleep+0xd0/0xf0
    [] rt_spin_lock+0x24/0x50
    [] __wake_up+0x36/0x70
    [] rcu_gp_kthread+0x4d2/0x680
    [] ? __init_waitqueue_head+0x50/0x50
    [] ? rcu_gp_fqs+0x80/0x80
    [] kthread+0xdb/0xe0
    [] ? finish_task_switch+0x52/0x100
    [] kernel_thread_helper+0x4/0x10
    [] ? __init_kthread_worker+0x60/0x60
    [] ? gs_change+0xb/0xb

    ...and hence simple wait queues were deployed on RT out of necessity
    (as simple wait uses a raw lock), but mainline might as well take
    advantage of the more streamlined support as well.

    [1] This is a carry-forward of work from v3.10-rt; the original conversion
    was by Thomas on an earlier -rt version, and Sebastian extended it to the
    RCU waiters added after v3.10; here I've added a commit log, unified the
    RCU changes into one patch, and uprev'd it to match mainline RCU.
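
    A hedged sketch of the conversion, using the ~v4.6 swait API; gp_swq and
    some_condition below are stand-ins for the real queue and wakeup
    condition.

    #include <linux/swait.h>

    static DECLARE_SWAIT_QUEUE_HEAD(gp_swq);
    static bool some_condition;

    static void waiter_sketch(void)
    {
            /* Was: wait_event_interruptible() on a full wait_queue_head_t. */
            swait_event(gp_swq, READ_ONCE(some_condition));
    }

    static void waker_sketch(void)
    {
            WRITE_ONCE(some_condition, true);
            /* Was: wake_up(&gp_wq). swait's internal lock is a raw spinlock,
             * so this is safe in contexts where RT forbids sleeping locks. */
            swake_up(&gp_swq);
    }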

    Signed-off-by: Daniel Wagner
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-rt-users@vger.kernel.org
    Cc: Boqun Feng
    Cc: Marcelo Tosatti
    Cc: Steven Rostedt
    Cc: Paul Gortmaker
    Cc: Paolo Bonzini
    Cc: "Paul E. McKenney"
    Link: http://lkml.kernel.org/r/1455871601-27484-6-git-send-email-wagi@monom.org
    Signed-off-by: Thomas Gleixner

    Paul Gortmaker
     
  • rcu_nocb_gp_cleanup() is called while holding rnp->lock. Currently,
    this is okay because the wake_up_all() in rcu_nocb_gp_cleanup() will
    not enable the IRQs. lockdep is happy.

    After switching over to swait, this is no longer true: swake_up_all()
    enables IRQs while processing the waiters. __do_softirq() can then
    run and will eventually call rcu_process_callbacks(), which wants to
    grab rnp->lock.

    Let's move the rcu_nocb_gp_cleanup() call outside the lock before we
    switch over to swait.
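
    A hedged sketch of the resulting pattern: note what needs waking while
    holding the lock, drop the lock, then do the wakeup
    (note_gp_cleanup_locked() is a hypothetical helper).

    static void gp_cleanup_sketch(struct rcu_node *rnp)
    {
            struct swait_queue_head *sq;

            raw_spin_lock_irq(&rnp->lock);
            sq = note_gp_cleanup_locked(rnp);   /* record, but do not wake, under the lock */
            raw_spin_unlock_irq(&rnp->lock);
            swake_up_all(sq);                   /* may re-enable IRQs internally: safe here */
    }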

    If we were to hold rnp->lock and use swait, lockdep would report the
    following:

    =================================
    [ INFO: inconsistent lock state ]
    4.2.0-rc5-00025-g9a73ba0 #136 Not tainted
    ---------------------------------
    inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
    rcu_preempt/8 [HC0[0]:SC0[0]:HE1:SE1] takes:
    (rcu_node_1){+.?...}, at: [] rcu_gp_kthread+0xb97/0xeb0
    {IN-SOFTIRQ-W} state was registered at:
    [] __lock_acquire+0xd5f/0x21e0
    [] lock_acquire+0xdf/0x2b0
    [] _raw_spin_lock_irqsave+0x59/0xa0
    [] rcu_process_callbacks+0x141/0x3c0
    [] __do_softirq+0x14d/0x670
    [] irq_exit+0x104/0x110
    [] smp_apic_timer_interrupt+0x46/0x60
    [] apic_timer_interrupt+0x70/0x80
    [] rq_attach_root+0xa6/0x100
    [] cpu_attach_domain+0x16d/0x650
    [] build_sched_domains+0x942/0xb00
    [] sched_init_smp+0x509/0x5c1
    [] kernel_init_freeable+0x172/0x28f
    [] kernel_init+0xe/0xe0
    [] ret_from_fork+0x3f/0x70
    irq event stamp: 76
    hardirqs last enabled at (75): [] _raw_spin_unlock_irq+0x30/0x60
    hardirqs last disabled at (76): [] _raw_spin_lock_irq+0x1f/0x90
    softirqs last enabled at (0): [] copy_process.part.26+0x602/0x1cf0
    softirqs last disabled at (0): [< (null)>] (null)
    other info that might help us debug this:
    Possible unsafe locking scenario:
    CPU0
    ----
    lock(rcu_node_1);

    lock(rcu_node_1);
    *** DEADLOCK ***
    1 lock held by rcu_preempt/8:
    #0: (rcu_node_1){+.?...}, at: [] rcu_gp_kthread+0xb97/0xeb0
    stack backtrace:
    CPU: 0 PID: 8 Comm: rcu_preempt Not tainted 4.2.0-rc5-00025-g9a73ba0 #136
    Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
    0000000000000000 000000006d7e67d8 ffff881fb081fbd8 ffffffff818379e0
    0000000000000000 ffff881fb0812a00 ffff881fb081fc38 ffffffff8110813b
    0000000000000000 0000000000000001 ffff881f00000001 ffffffff8102fa4f
    Call Trace:
    [] dump_stack+0x4f/0x7b
    [] print_usage_bug+0x1db/0x1e0
    [] ? save_stack_trace+0x2f/0x50
    [] mark_lock+0x66d/0x6e0
    [] ? check_usage_forwards+0x150/0x150
    [] mark_held_locks+0x78/0xa0
    [] ? _raw_spin_unlock_irq+0x30/0x60
    [] trace_hardirqs_on_caller+0x168/0x220
    [] trace_hardirqs_on+0xd/0x10
    [] _raw_spin_unlock_irq+0x30/0x60
    [] swake_up_all+0xb7/0xe0
    [] rcu_gp_kthread+0xab1/0xeb0
    [] ? trace_hardirqs_on_caller+0xff/0x220
    [] ? _raw_spin_unlock_irq+0x41/0x60
    [] ? rcu_barrier+0x20/0x20
    [] kthread+0x104/0x120
    [] ? _raw_spin_unlock_irq+0x30/0x60
    [] ? kthread_create_on_node+0x260/0x260
    [] ret_from_fork+0x3f/0x70
    [] ? kthread_create_on_node+0x260/0x260

    Signed-off-by: Daniel Wagner
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-rt-users@vger.kernel.org
    Cc: Boqun Feng
    Cc: Marcelo Tosatti
    Cc: Steven Rostedt
    Cc: Paul Gortmaker
    Cc: Paolo Bonzini
    Cc: "Paul E. McKenney"
    Link: http://lkml.kernel.org/r/1455871601-27484-5-git-send-email-wagi@monom.org
    Signed-off-by: Thomas Gleixner

    Daniel Wagner
     

24 Feb, 2016

1 commit

  • In the patch

    "rcu: Add transitivity to remaining rcu_node ->lock acquisitions"

    all locking operations on rcu_node::lock were replaced with wrappers
    because of the need for transitivity, which means we should never
    write code that uses LOCK primitives alone (i.e., without a proper
    barrier following them) on rcu_node::lock outside those wrappers. We
    could detect this kind of misuse of rcu_node::lock in the future by
    adding the __private modifier to rcu_node::lock.

    To privatize rcu_node::lock, unlock wrappers are also needed. Replacing
    spinlock unlocks with these wrappers not only privatizes rcu_node::lock
    but also makes it easier to figure out critical sections of rcu_node.

    This patch adds the __private modifier to rcu_node::lock and wraps every
    access to it in ACCESS_PRIVATE(). In addition, unlock wrappers are
    added, and raw_spin_unlock(&rnp->lock) and its friends are replaced with
    those wrappers.
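
    A hedged sketch of the privatization plus the new unlock wrappers (the
    struct name is illustrative; the lock-side wrappers from the 24 Nov, 2015
    entry below are unchanged apart from using ACCESS_PRIVATE()).

    /* Privatize the field and route all access through wrappers. */
    struct rcu_node_sketch {
            raw_spinlock_t __private lock;  /* a bare rnp->lock access now trips sparse */
            /* ... */
    };

    #define raw_spin_unlock_rcu_node(rnp) \
            raw_spin_unlock(&ACCESS_PRIVATE(rnp, lock))

    #define raw_spin_unlock_irqrestore_rcu_node(rnp, flags) \
            raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(rnp, lock), flags)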

    Signed-off-by: Boqun Feng
    Signed-off-by: Paul E. McKenney

    Boqun Feng
     

08 Dec, 2015

1 commit


06 Dec, 2015

1 commit

  • Currently, ->gp_state is printed as an integer, which slows debugging.
    This commit therefore prints a symbolic name in addition to the integer.
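
    A hedged sketch of the kind of mapping this implies; the actual state
    list lives in kernel/rcu/tree.h and may differ.

    static const char * const gp_state_names[] = {
            "RCU_GP_IDLE",
            "RCU_GP_WAIT_GPS",
            "RCU_GP_DONE_GPS",
            "RCU_GP_WAIT_FQS",
            "RCU_GP_DOING_FQS",
            "RCU_GP_CLEANUP",
            "RCU_GP_CLEANED",
    };

    static const char *gp_state_getname(int gs)
    {
            if (gs < 0 || gs >= (int)ARRAY_SIZE(gp_state_names))
                    return "???";
            return gp_state_names[gs];
    }

    /* Usage: pr_err("->gp_state: %d (%s)\n", gs, gp_state_getname(gs)); */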

    Signed-off-by: Paul E. McKenney
    [ paulmck: Updated to fix relational operator called out by Dan Carpenter. ]
    [ paulmck: More "const", as suggested by Josh Triplett. ]
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

05 Dec, 2015

2 commits

  • Currently, the piggybacked-work checks carried out by sync_exp_work_done()
    atomically increment a small set of variables (the ->expedited_workdone0,
    ->expedited_workdone1, ->expedited_workdone2, ->expedited_workdone3
    fields in the rcu_state structure), which will form a memory-contention
    bottleneck given a sufficiently large number of CPUs concurrently invoking
    either synchronize_rcu_expedited() or synchronize_sched_expedited().

    This commit therefore moves these four fields to the per-CPU rcu_data
    structure, eliminating the memory contention. The show_rcuexp() function
    is also changed to sum up each field across the rcu_data structures.
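
    A hedged sketch of the resulting layout; the struct and helper names here
    are illustrative, while the field names follow the commit text.

    /* Counters live in per-CPU data, so the fast path never bounces a
     * shared cache line; the reader sums them only when displaying. */
    struct rcu_data_sketch {
            unsigned long expedited_workdone0;
            unsigned long expedited_workdone1;
            unsigned long expedited_workdone2;
            unsigned long expedited_workdone3;
    };
    static DEFINE_PER_CPU(struct rcu_data_sketch, rcu_data_sketch);

    static unsigned long sum_exp_workdone1(void)
    {
            unsigned long sum = 0;
            int cpu;

            for_each_possible_cpu(cpu)
                    sum += per_cpu_ptr(&rcu_data_sketch, cpu)->expedited_workdone1;
            return sum;
    }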

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Analogy with the ->qsmaskinitnext field might lead one to believe that
    ->expmaskinitnext tracks online CPUs. This belief is incorrect: Any CPU
    that has ever been online will have its bit set in the ->expmaskinitnext
    field. This commit therefore adds a comment to make this clear, at
    least to people who read comments.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

24 Nov, 2015

1 commit

  • Providing RCU's memory-ordering guarantees requires that the rcu_node
    tree's locking provide transitive memory ordering, which the Linux kernel's
    spinlocks currently do not provide unless smp_mb__after_unlock_lock()
    is used. Having a separate smp_mb__after_unlock_lock() after each and
    every lock acquisition is error-prone, hard to read, and a bit annoying,
    so this commit provides wrapper functions that pull in the
    smp_mb__after_unlock_lock() invocations.
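
    A hedged sketch of such a wrapper; the real ones are in kernel/rcu/tree.h
    and also cover the _irq, _irqsave, and _bh variants.

    #define raw_spin_lock_rcu_node(rnp)                                     \
    do {                                                                    \
            raw_spin_lock(&(rnp)->lock);                                    \
            smp_mb__after_unlock_lock();    /* supply transitive ordering */ \
    } while (0)

    #define raw_spin_lock_irqsave_rcu_node(rnp, flags)                      \
    do {                                                                    \
            raw_spin_lock_irqsave(&(rnp)->lock, flags);                     \
            smp_mb__after_unlock_lock();                                    \
    } while (0)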

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Paul E. McKenney

    Peter Zijlstra
     

08 Oct, 2015

4 commits

  • exp.2015.10.07a: Reduce OS jitter of RCU-sched expedited grace periods.
    fixes.2015.10.06a: Miscellaneous fixes.

    Paul E. McKenney
     
  • This commit makes the RCU CPU stall warning message print online/offline
    indications immediately after the CPU number. An "O" indicates that the
    CPU is globally offline ("." otherwise), an "o" indicates that RCU
    believes the CPU to be offline for the current grace period ("."
    otherwise), and an "N" indicates that RCU believes the CPU will be
    offline for the next grace period ("." otherwise). So for CPU 10, you
    would normally see "10-...:", indicating that everything believes the
    CPU is online.
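
    A hedged sketch of producing such an indicator string; the predicates and
    the output path are illustrative.

    static void print_cpu_stall_flags_sketch(int cpu, struct rcu_node *rnp)
    {
            unsigned long bit = 1UL << (cpu - rnp->grplo);

            pr_cont("%d-%c%c%c:", cpu,
                    cpu_online(cpu) ? '.' : 'O',              /* globally offline? */
                    (rnp->qsmaskinit & bit) ? '.' : 'o',      /* offline this GP?  */
                    (rnp->qsmaskinitnext & bit) ? '.' : 'N'); /* offline next GP?  */
    }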

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This reverts commit af859beaaba4 (rcu: Silence lockdep false positive
    for expedited grace periods). Because synchronize_rcu_expedited()
    no longer invokes synchronize_sched_expedited(), ->exp_funnel_mutex
    acquisition is no longer nested, so the false positive no longer happens.
    This commit therefore removes the extra lockdep data structures, as they
    are no longer needed.

    Paul E. McKenney
     
  • This commit switches synchronize_sched_expedited() from stop_one_cpu_nowait()
    to smp_call_function_single(), thus moving from an IPI and a pair of
    context switches to an IPI and a single pass through the scheduler.
    Of course, if the scheduler actually does decide to switch to a different
    task, there will still be a pair of context switches, but there would
    likely have been a pair of context switches anyway, just a bit later.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

07 Oct, 2015

4 commits

  • This commit corrects the comment for the values of the ->gp_state field,
    which previously incorrectly said that these were for the ->gp_flags
    field.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Commit 4cdfc175c25c89ee ("rcu: Move quiescent-state forcing
    into kthread") started the process of folding the old ->fqs_state into
    ->gp_state, but did not complete it. This situation does not cause
    any malfunction, but can result in extremely confusing trace output.
    This commit completes this task of eliminating ->fqs_state in favor
    of ->gp_state.

    The old ->fqs_state was also used to decide when to collect dyntick-idle
    snapshots. For this purpose, we add a boolean variable to the kthread,
    which is set on the first call to rcu_gp_fqs() for a given grace period
    and cleared otherwise.
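
    A purely illustrative sketch of the first-time-through logic described
    above; the helper names are hypothetical.

    static void rcu_gp_fqs_sketch(bool first_time)
    {
            if (first_time)
                    snapshot_dyntick_idle_state();  /* hypothetical: take snapshots  */
            else
                    recheck_dyntick_idle_state();   /* hypothetical: compare to them */
    }

    /* In the grace-period kthread's loop:
     *
     *      bool first_gp_fqs = true;
     *      ...
     *      rcu_gp_fqs_sketch(first_gp_fqs);
     *      first_gp_fqs = false;
     */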

    Signed-off-by: Petr Mladek
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Petr Mladek
     
  • We have had the call_rcu_func_t typedef for quite a while, but we still
    use explicit function pointer types in some places. These types can
    confuse cscope and can be hard to read. This patch therefore replaces
    these types with the call_rcu_func_t typedef.

    Signed-off-by: Boqun Feng
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Boqun Feng
     
  • Now that we have the rcu_callback_t typedef as the type of RCU callbacks,
    we should use it in call_rcu*() and friends as the parameter type. This
    could save us a few lines of code and makes it clear which functions
    require an RCU callback rather than some other callback as their argument.

    Besides, this can also help cscope to generate a better database for
    code reading.
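
    A hedged sketch of the two typedefs and the before/after prototypes;
    call_rcu_old() and call_rcu_new() are illustrative names, and the real
    typedef definitions live in the kernel's type headers.

    struct rcu_head;

    typedef void (*rcu_callback_t)(struct rcu_head *head);
    typedef void (*call_rcu_func_t)(struct rcu_head *head, rcu_callback_t func);

    /* Before: an explicit function-pointer type in each prototype. */
    void call_rcu_old(struct rcu_head *head, void (*func)(struct rcu_head *head));

    /* After: the typedef names exactly what kind of callback is expected. */
    void call_rcu_new(struct rcu_head *head, rcu_callback_t func);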

    Signed-off-by: Boqun Feng
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Boqun Feng
     

21 Sep, 2015

5 commits

  • This commit converts the rcu_data structure's ->cpu_no_qs field
    to a union. The bytewise side of this union allows individual access
    to indications as to whether this CPU needs to find a quiescent state
    for a normal (.norm) and/or expedited (.exp) grace period. The setwise
    side of the union allows testing whether or not a quiescent state is
    needed at all, for either type of grace period.

    For now, only .norm is used. A later commit will introduce the expedited
    usage.
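
    A sketch of the union along the lines described above; the real
    definition is in kernel/rcu/tree.h.

    union rcu_noqs {
            struct {
                    u8 norm;        /* QS still needed for a normal GP?     */
                    u8 exp;         /* QS still needed for an expedited GP? */
            } b;                    /* bytewise access */
            u16 s;                  /* setwise: nonzero iff any QS is still needed */
    };

    /* Usage sketch:
     *      rdp->cpu_no_qs.b.norm = true;   -- normal QS needed
     *      if (!rdp->cpu_no_qs.s)          -- neither normal nor expedited needed
     *              ...
     */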

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit inverts the sense of the rcu_data structure's ->passed_quiesce
    field and renames it to ->cpu_no_qs. This will allow a later commit to
    use an "aggregate OR" operation to test expedited as well as normal grace
    periods without added overhead.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • An upcoming commit needs to invert the sense of the ->passed_quiesce
    rcu_data structure field, so this commit is taking this opportunity
    to clarify things a bit by renaming ->qs_pending to ->core_needs_qs.

    So if !rdp->core_needs_qs, then this CPU need not concern itself with
    quiescent states; in particular, it need not acquire its leaf rcu_node
    structure's ->lock to check. Otherwise, it needs to report the next
    quiescent state.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently, synchronize_sched_expedited() uses a single global counter
    to track the number of remaining context switches that the current
    expedited grace period must wait on. This is problematic on large
    systems, where the resulting memory contention can be pathological.
    This commit therefore makes synchronize_sched_expedited() instead use
    the combining tree in the same manner as synchronize_rcu_expedited(),
    keeping memory contention down to a dull roar.

    This commit creates a temporary function sync_sched_exp_select_cpus()
    that is very similar to sync_rcu_exp_select_cpus(). A later commit
    will consolidate these two functions, which becomes possible when
    synchronize_sched_expedited() switches from stop_one_cpu_nowait() to
    smp_call_function_single().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit replaces sync_rcu_preempt_exp_init1() and
    sync_rcu_preempt_exp_init2() with sync_exp_reset_tree_hotplug()
    and sync_exp_reset_tree(), which will also be used by
    synchronize_sched_expedited(), and sync_rcu_exp_select_nodes(), which
    contains code specific to synchronize_rcu_expedited().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

04 Aug, 2015

2 commits

  • RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
    likely the only thing that ever will use it, so this commit makes this
    macro private to RCU.

    Signed-off-by: Paul E. McKenney
    Cc: Will Deacon
    Cc: Peter Zijlstra
    Cc: Benjamin Herrenschmidt
    Cc: "linux-arch@vger.kernel.org"

    Paul E. McKenney
     
  • In a CONFIG_PREEMPT=y kernel, synchronize_rcu_expedited()
    acquires the ->exp_funnel_mutex in rcu_preempt_state, then invokes
    synchronize_sched_expedited, which acquires the ->exp_funnel_mutex in
    rcu_sched_state. There can be no deadlock because rcu_preempt_state
    ->exp_funnel_mutex acquisition always precedes that of rcu_sched_state.
    But lockdep does not know that, so it gives false-positive splats.

    This commit therefore associates a separate lock_class_key structure
    with the rcu_sched_state structure's ->exp_funnel_mutex, allowing
    lockdep to see the lock ordering, avoiding the false positives.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

18 Jul, 2015

8 commits

  • In the common case, there will be only one expedited grace period in
    the system at a given time, in which case it is not helpful to use
    funnel locking. This commit therefore adds a fastpath that bypasses
    funnel locking when the root ->exp_funnel_mutex is not held.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The grace-period kthread sleeps waiting to do a force-quiescent-state
    scan, and when awakened sets rsp->gp_state to RCU_GP_DONE_FQS.
    However, this is confusing because the kthread has not done the
    force-quiescent-state, but is instead just starting to do it. This commit
    therefore renames RCU_GP_DONE_FQS to RCU_GP_DOING_FQS in order to make
    things a bit easier on reviewers.

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Although synchronize_sched_expedited() historically has no RCU CPU stall
    warnings, the availability of the rcupdate.rcu_expedited boot parameter
    invalidates the old assumption that synchronize_sched()'s stall warnings
    would suffice. This commit therefore adds RCU CPU stall warnings to
    synchronize_sched_expedited().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The strictly rcu_node based funnel-locking scheme works well in many
    cases, but systems with CONFIG_RCU_FANOUT_LEAF=64 won't necessarily get
    all that much concurrency. This commit therefore extends the funnel
    locking into the per-CPU rcu_data structure, providing concurrency equal
    to the number of CPUs.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_seq operations were open-coded in _rcu_barrier(), so this commit
    replaces the open-coding with the shiny new rcu_seq operations.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Sequentially stopping the CPUs slows down expedited grace periods by
    at least a factor of two, based on rcutorture's grace-period-per-second
    rate. This is a conservative measure because rcutorture uses unusually
    long RCU read-side critical sections and because rcutorture periodically
    quiesces the system in order to test RCU's ability to ramp down to and
    up from the idle state. This commit therefore replaces the stop_one_cpu()
    with stop_one_cpu_nowait(), using an atomic-counter scheme to determine
    when all CPUs have passed through the stopped state.
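
    A hedged sketch of the atomic-counter scheme with illustrative names:
    count the outstanding per-CPU stops up front, have each stopper callback
    decrement the counter, and wake the initiator when it reaches zero.
    Hotplug and error handling are omitted.

    #include <linux/atomic.h>
    #include <linux/cpu.h>
    #include <linux/percpu.h>
    #include <linux/stop_machine.h>
    #include <linux/wait.h>

    static atomic_t exp_stop_count;
    static DECLARE_WAIT_QUEUE_HEAD(exp_stop_wq);
    static DEFINE_PER_CPU(struct cpu_stop_work, exp_stop_work);

    static int exp_stop_cpu_fn(void *arg)       /* runs on each target CPU */
    {
            /* ... the context switch into the stopper thread is the QS ... */
            if (atomic_dec_and_test(&exp_stop_count))
                    wake_up(&exp_stop_wq);      /* last CPU through wakes the waiter */
            return 0;
    }

    static void exp_stop_all_cpus_sketch(void)
    {
            int cpu;

            get_online_cpus();                  /* keep the set of online CPUs stable */
            atomic_set(&exp_stop_count, num_online_cpus());
            for_each_online_cpu(cpu)
                    stop_one_cpu_nowait(cpu, exp_stop_cpu_fn, NULL,
                                        &per_cpu(exp_stop_work, cpu));
            wait_event(exp_stop_wq, !atomic_read(&exp_stop_count));
            put_online_cpus();
    }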

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Peter Zijlstra
     
  • This commit gets rid of synchronize_sched_expedited()'s mutex_trylock()
    polling loop in favor of a funnel-locking scheme based on the rcu_node
    tree. The work-done check is done at each level of the tree, allowing
    high-contention situations to be resolved quickly with reasonable levels
    of mutex contention.
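
    A hedged sketch of the funnel walk, with my_leaf_rcu_node() and this
    particular sync_exp_work_done() signature standing in for the real code:
    climb from the leaf toward the root, always holding at most one level's
    mutex, and bail out at any level if the required grace period has already
    completed.

    static struct rcu_node *exp_funnel_lock_sketch(struct rcu_state *rsp,
                                                   unsigned long s)
    {
            struct rcu_node *rnp0, *rnp1 = NULL;

            for (rnp0 = my_leaf_rcu_node(rsp); rnp0; rnp0 = rnp0->parent) {
                    if (sync_exp_work_done(rsp, s))
                            goto unlock_out;    /* someone else's GP covered us */
                    mutex_lock(&rnp0->exp_funnel_mutex);
                    if (rnp1)
                            mutex_unlock(&rnp1->exp_funnel_mutex);
                    rnp1 = rnp0;
            }
            return rnp1;    /* root mutex held: this task drives the grace period */

    unlock_out:
            if (rnp1)
                    mutex_unlock(&rnp1->exp_funnel_mutex);
            return NULL;    /* required grace period already happened */
    }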

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Now that synchronize_sched_expedited() has a mutex, it can use a simpler
    work-already-done detection scheme. This commit simplifies this scheme
    by using something similar to the sequence-locking counter scheme.
    A counter is incremented before and after each grace period, so that
    the counter is odd in the midst of the grace period and even otherwise.
    So if the counter has advanced to the second even number that is
    greater than or equal to the snapshot, the required grace period has
    already happened.
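
    A self-contained sketch of the counter scheme described above; the
    in-kernel rcu_seq_* helpers embody the same idea. The caller is assumed
    to provide mutual exclusion and memory ordering for the start/end
    updates.

    static unsigned long exp_seq;

    static void exp_seq_start(void) { exp_seq++; /* now odd: GP in progress */ }
    static void exp_seq_end(void)   { exp_seq++; /* now even: GP complete   */ }

    /* Snapshot: the counter value at the end of the first full grace period
     * to start after the snapshot is taken. */
    static unsigned long exp_seq_snap(void)
    {
            return (READ_ONCE(exp_seq) + 3) & ~0x1UL;
    }

    /* The needed grace period has happened once the counter reaches the snapshot. */
    static bool exp_seq_done(unsigned long snap)
    {
            return ULONG_CMP_GE(READ_ONCE(exp_seq), snap);
    }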

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney