01 Dec, 2018

1 commit

  • commit 92aa39e9dc77481b90cbef25e547d66cab901496 upstream.

    The per-CPU rcu_dynticks.rcu_urgent_qs variable communicates an urgent
    need for an RCU quiescent state from the force-quiescent-state processing
    within the grace-period kthread to context switches and to cond_resched().
    Unfortunately, such urgent needs are not communicated to need_resched(),
    which is sometimes used to decide when to invoke cond_resched(), for
    but one example, within the KVM vcpu_run() function. As of v4.15, this
    can result in synchronize_sched() being delayed by up to ten seconds,
    which can be problematic, to say nothing of annoying.

    This commit therefore checks rcu_dynticks.rcu_urgent_qs from within
    rcu_check_callbacks(), which is invoked from the scheduling-clock
    interrupt handler. If the current task is not an idle task and is
    not executing in usermode, a context switch is forced, and either way,
    the rcu_dynticks.rcu_urgent_qs variable is set to false. If the current
    task is an idle task, then RCU's dyntick-idle code will detect the
    quiescent state, so no further action is required. Similarly, if the
    task is executing in usermode, other code in rcu_check_callbacks() and
    its called functions will report the corresponding quiescent state.
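
    A minimal sketch of the added check, based on the commit text (field and
    helper names follow the backported rcu_dynticks layout; exact placement
    within rcu_check_callbacks() may differ across stable trees):

        void rcu_check_callbacks(int user)
        {
                /* Pairs with the store-release that sets ->rcu_urgent_qs. */
                if (smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) {
                        /* Idle tasks and usermode are quiescent already. */
                        if (!is_idle_task(current) && !user) {
                                set_tsk_need_resched(current);
                                set_preempt_need_resched();
                        }
                        __this_cpu_write(rcu_dynticks.rcu_urgent_qs, false);
                }
                /* ... remainder of the scheduling-clock processing ... */
        }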

    Reported-by: Marius Hillenbrand
    Reported-by: David Woodhouse
    Suggested-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    [ paulmck: Backported to make patch apply cleanly on older versions. ]
    Tested-by: Marius Hillenbrand
Cc: stable@vger.kernel.org # 4.12.x - 4.19.x
    Signed-off-by: Greg Kroah-Hartman

    Paul E. McKenney
     

17 Feb, 2018

1 commit

  • commit 156baec39732f025dc778e00da95fc10d6e45885 upstream.

    Use of init_rcu_head() and destroy_rcu_head() from modules results in
    the following build-time error with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y:

    ERROR: "init_rcu_head" [drivers/scsi/scsi_mod.ko] undefined!
    ERROR: "destroy_rcu_head" [drivers/scsi/scsi_mod.ko] undefined!

    This commit therefore adds EXPORT_SYMBOL_GPL() for each to allow them to
    be used by GPL-licensed kernel modules.
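
    The fix amounts to adding an export next to each definition, roughly (the
    bodies shown follow the existing CONFIG_DEBUG_OBJECTS_RCU_HEAD helpers):

        void init_rcu_head(struct rcu_head *head)
        {
                debug_object_init(head, &rcuhead_debug_descr);
        }
        EXPORT_SYMBOL_GPL(init_rcu_head);

        void destroy_rcu_head(struct rcu_head *head)
        {
                debug_object_free(head, &rcuhead_debug_descr);
        }
        EXPORT_SYMBOL_GPL(destroy_rcu_head);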

    Reported-by: Bart Van Assche
    Reported-by: Stephen Rothwell
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Paul E. McKenney
     

24 Nov, 2017

1 commit

  • commit 135bd1a230bb69a68c9808a7d25467318900b80a upstream.

    The pending-callbacks check in rcu_prepare_for_idle() is backwards.
    It should accelerate if there are pending callbacks, but the check
    rather uselessly accelerates only if there are no callbacks. This commit
    therefore inverts this check.
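
    In effect, the fix is a one-character inversion, roughly:

        for_each_rcu_flavor(rsp) {
                rdp = this_cpu_ptr(rsp->rda);
                /* Was: if (rcu_segcblist_pend_cbs(&rdp->cblist)) -- which
                 * skipped acceleration exactly when there was work to do. */
                if (!rcu_segcblist_pend_cbs(&rdp->cblist))
                        continue;
                /* ... accelerate the pending callbacks ... */
        }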

    Fixes: 15fecf89e46a ("srcu: Abstract multi-tail callback list handling")
    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Greg Kroah-Hartman

    Neeraj Upadhyay
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
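
    Concretely, a tagged C source file now starts with a line such as:

        // SPDX-License-Identifier: GPL-2.0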

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).

    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

04 Oct, 2017

1 commit

  • Pull tracing fixlets from Steven Rostedt:
    "Two updates:

    - A memory fix for leftover code from splitting out ftrace_ops and the
    function graph tracer, where the function graph tracer could reset
    the trampoline pointer, leaving the old trampoline unfreed (a memory
    leak).

    - The update to Paul's patch that added the unnecessary READ_ONCE().
    This removes the unnecessary READ_ONCE() instead of having to
    rebase the branch to update the patch that added it"

    * tag 'trace-v4.14-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    rcu: Remove extraneous READ_ONCE()s from rcu_irq_{enter,exit}()
    ftrace: Fix kmemleak in unregister_ftrace_graph

    Linus Torvalds
     

03 Oct, 2017

1 commit

  • The read of ->dynticks_nmi_nesting in rcu_irq_enter() and rcu_irq_exit()
    is currently protected with READ_ONCE(). However, this protection is
    unnecessary because (1) ->dynticks_nmi_nesting is updated only by the
    current CPU, (2) although NMI handlers can update this field, they reset
    it back to its old value before returning, and (3) interrupts are disabled,
    so nothing else can modify it. The value of ->dynticks_nmi_nesting is
    thus effectively constant, and so no protection is required.

    This commit therefore removes the READ_ONCE() protection from these
    two accesses.
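
    The change is simply to drop the wrapper, roughly:

        /* Before, in both rcu_irq_enter() and rcu_irq_exit(): */
        if (READ_ONCE(rdtp->dynticks_nmi_nesting))
                return;

        /* After: interrupts are off and only this CPU writes the field,
         * so a plain read suffices. */
        if (rdtp->dynticks_nmi_nesting)
                return;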

    Link: http://lkml.kernel.org/r/20170926031902.GA2074@linux.vnet.ibm.com

    Reported-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Paul E. McKenney
     

26 Sep, 2017

1 commit

  • Pull tracing fixes from Steven Rostedt:
    "Stack tracing and RCU has been having issues with each other and
    lockdep has been pointing out constant problems.

    The changes have been going into the stack tracer, but it has been
    discovered that the problem isn't with the stack tracer itself, but
    with calling save_stack_trace() from within the internals of RCU.

    The stack tracer is the one that can trigger the issue most easily,
    but examining the problem further, it could also happen from a WARN()
    in the wrong place, or even if an NMI happened in this area and it did
    an rcu_read_lock().

    The critical area is where RCU is not watching, which can happen while
    going to and from idle, or while bringing up or taking down a CPU.

    The final fix was to put the protection in kernel_text_address() as it
    is the one that requires RCU to be watching while doing the stack
    trace.

    To make this work properly, Paul had to allow rcu_irq_enter() to happen
    after rcu_nmi_enter(). This should have been done anyway, since an NMI
    can page fault (reading vmalloc area), and a page fault triggers
    rcu_irq_enter().

    One patch is just a consolidation of code so that the fix only needed
    to be done in one location"

    * tag 'trace-v4.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Remove RCU work arounds from stack tracer
    extable: Enable RCU if it is not watching in kernel_text_address()
    extable: Consolidate *kernel_text_address() functions
    rcu: Allow for page faults in NMI handlers
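
    The kernel_text_address() fix follows this pattern (a simplified sketch of
    the approach described above; the real function also checks ftrace and
    other kernel text ranges):

        int kernel_text_address(unsigned long addr)
        {
                bool no_rcu;
                int ret;

                if (core_kernel_text(addr))
                        return 1;

                /* The lookups below use RCU; make RCU watch if needed. */
                no_rcu = !rcu_is_watching();
                if (no_rcu)
                        rcu_nmi_enter();

                ret = is_module_text_address(addr) ? 1 : 0;

                if (no_rcu)
                        rcu_nmi_exit();
                return ret;
        }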

    Linus Torvalds
     

24 Sep, 2017

1 commit

  • A number of architectures invoke rcu_irq_enter() on exception entry in
    order to allow RCU read-side critical sections in the exception handler
    when the exception is from an idle or nohz_full CPU. This works, at
    least unless the exception happens in an NMI handler. In that case,
    rcu_nmi_enter() would already have exited the extended quiescent state,
    which would mean that rcu_irq_enter() would (incorrectly) cause RCU
    to think that it is again in an extended quiescent state. This will
    in turn result in lockdep splats in response to later RCU read-side
    critical sections.

    This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
    take no action if there is an rcu_nmi_enter() in effect, thus avoiding
    the unscheduled return to RCU quiescent state. This in turn should
    make the kernel safe for on-demand RCU voyeurism.
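
    The guard amounts to an early return while NMI nesting is in effect,
    roughly (a follow-up commit, shown earlier in this log, later drops the
    READ_ONCE()):

        void rcu_irq_enter(void)
        {
                struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);

                /* An NMI handler has already exited the extended quiescent
                 * state, so taking further action here would confuse RCU. */
                if (READ_ONCE(rdtp->dynticks_nmi_nesting))
                        return;

                /* ... existing rcu_irq_enter() processing ... */
        }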

    Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com

    Cc: stable@vger.kernel.org
    Fixes: 0be964be0 ("module: Sanitize RCU usage and locking")
    Reported-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt (VMware)

    Paul E. McKenney
     

09 Sep, 2017

1 commit

  • First, the number of CPUs can't be negative.

    Second, different signedness leads to suboptimal code in the following
    cases:

    1)
    kmalloc(nr_cpu_ids * sizeof(X));

    "int" has to be sign extended to size_t.

    2)
    while (*pos < nr_cpu_ids)    /* where pos is a loff_t * */

    MOVSXD is 1 byte longer than the same MOV.

    Other cases exist as well. Basically, the compiler is told that nr_cpu_ids
    can't be negative, which it can't deduce when the type is "int".
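
    The core of the change is a one-word type change:

        /* Before: */
        extern int nr_cpu_ids;

        /* After: */
        extern unsigned int nr_cpu_ids;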

    Code savings on allyesconfig kernel: -3KB

    add/remove: 0/0 grow/shrink: 25/264 up/down: 261/-3631 (-3370)
    function                     old    new   delta
    coretemp_cpu_online          450    512     +62
    rcu_init_one                1234   1272     +38
    pci_device_probe             374    399     +25

    ...

    pgdat_reclaimable_pages      628    556     -72
    select_fallback_rq           446    369     -77
    task_numa_find_cpu          1923   1807    -116

    Link: http://lkml.kernel.org/r/20170819114959.GA30580@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

17 Aug, 2017

13 commits

  • Merge branches 'doc.2017.08.17a', 'fixes.2017.08.17a',
    'hotplug.2017.07.25b', 'misc.2017.08.17a',
    'spin_unlock_wait_no.2017.08.17a', 'srcu.2017.07.27c' and
    'torture.2017.07.24c' into HEAD

    doc.2017.08.17a: Documentation updates.
    fixes.2017.08.17a: RCU fixes.
    hotplug.2017.07.25b: CPU-hotplug updates.
    misc.2017.08.17a: Miscellaneous fixes outside of RCU (give or take conflicts).
    spin_unlock_wait_no.2017.08.17a: Remove spin_unlock_wait().
    srcu.2017.07.27c: SRCU updates.
    torture.2017.07.24c: Torture-test updates.

    Paul E. McKenney
     
  • The rcu_idle_exit() and rcu_idle_enter() functions are exported because
    they were originally used by RCU_NONIDLE(), which was intended to
    be usable from modules. However, RCU_NONIDLE() now instead uses
    rcu_irq_enter_irqson() and rcu_irq_exit_irqson(), which are not
    exported, and there have been no complaints.

    This commit therefore removes the exports from rcu_idle_exit() and
    rcu_idle_enter().

    Reported-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • All current callers of rcu_idle_enter() have irqs disabled, and
    rcu_idle_enter() relies on this, but doesn't check. This commit
    therefore adds an RCU_LOCKDEP_WARN() to add some verification to the trust.
    While we are there, pass "true" rather than "1" to rcu_eqs_enter().
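
    A sketch of the result (the warning string is illustrative; the "true"
    argument follows the commit text):

        void rcu_idle_enter(void)
        {
                RCU_LOCKDEP_WARN(!irqs_disabled(),
                                 "rcu_idle_enter() invoked with irqs enabled!");
                rcu_eqs_enter(true);    /* Was rcu_eqs_enter(1). */
        }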

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • All callers to rcu_idle_enter() have irqs disabled, so there is no
    point in rcu_idle_enter() disabling them again. This commit therefore
    replaces the irq disabling with an RCU_LOCKDEP_WARN().

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Paul E. McKenney

    Peter Zijlstra (Intel)
     
  • This commit adds assertions verifying the consistency of the rcu_node
    structure's ->blkd_tasks list and its ->gp_tasks, ->exp_tasks, and
    ->boost_tasks pointers. In particular, the ->blkd_tasks lists must be
    empty except for leaf rcu_node structures.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Set disable_rcu_irq_enter not only in rcu_eqs_enter_common() but also in
    rcu_eqs_exit(), since rcu_eqs_exit() suffers from the same issue as was
    fixed for rcu_eqs_enter_common() by commit 03ecd3f48e57 ("rcu/tracing:
    Add rcu_disabled to denote when rcu_irq_enter() will not work").

    Signed-off-by: Masami Hiramatsu
    Acked-by: Steven Rostedt (VMware)
    Signed-off-by: Paul E. McKenney

    Masami Hiramatsu
     
  • The _rcu_barrier_trace() function is a wrapper for trace_rcu_barrier(),
    which needs TPS() protection for strings passed through the second
    argument. However, it has escaped prior TPS()-ification efforts because
    _rcu_barrier_trace() does not start with "trace_". This commit
    therefore adds the needed TPS() protection.
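
    A representative call site (not the full diff):

        /* Before: */
        _rcu_barrier_trace(rsp, "Begin", -1, rsp->barrier_sequence);

        /* After: TPS() keeps the string resolvable in crash-dump traces. */
        _rcu_barrier_trace(rsp, TPS("Begin"), -1, rsp->barrier_sequence);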

    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • These RCU waits were set to use interruptible waits to avoid the kthreads
    contributing to the system load average, even though they are not
    interruptible, as they are spawned from a kthread. Use the new TASK_IDLE
    swaits, which make our goal clear and remove confusion about these paths
    possibly being interruptible -- they are not.

    When the system is idle the RCU grace-period kthread will spend all its time
    blocked inside the swait_event_interruptible(). If a non-interruptible wait
    were used instead, this kthread would contribute to the load average. This means
    that an idle system would have a load average of 2 (or 3 if PREEMPT=y),
    rather than the load average of 0 that almost fifty years of UNIX has
    conditioned sysadmins to expect.

    The same argument applies to swait_event_interruptible_timeout() use. The
    RCU grace-period kthread spends its time blocked inside this call while
    waiting for grace periods to complete. In particular, if there was only one
    busy CPU, but that CPU was frequently invoking call_rcu(), then the RCU
    grace-period kthread would spend almost all its time blocked inside the
    swait_event_interruptible_timeout(). This would mean that the load average
    would be 2 rather than the expected 1 for the single busy CPU.
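
    A sketch of the substitution, using the grace-period kthread's gp_wq wait
    as the example (swait_event_idle() is the TASK_IDLE variant introduced
    alongside this change):

        /* Before: "interruptible" only to keep the load average at 0. */
        swait_event_interruptible(rsp->gp_wq,
                        READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_INIT);

        /* After: TASK_IDLE makes the intent explicit, with the same
         * load-average behavior. */
        swait_event_idle(rsp->gp_wq,
                        READ_ONCE(rsp->gp_flags) & RCU_GP_FLAG_INIT);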

    Acked-by: "Eric W. Biederman"
    Tested-by: Paul E. McKenney
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Paul E. McKenney

    Luis R. Rodriguez
     
  • There is currently event tracing to track when a task is preempted
    within a preemptible RCU read-side critical section, and also when that
    task subsequently reaches its outermost rcu_read_unlock(), but none
    indicating when a new grace period starts when that grace period must
    wait on pre-existing readers that have been preempted at least once
    since the beginning of their current RCU read-side critical sections.

    This commit therefore adds an event trace at grace-period start in
    the case where there are such readers. Note that only the first
    reader in the list is traced.

    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • This commit saves a few lines in kernel/rcu/rcu.h by moving to single-line
    definitions for trivial functions, instead of the old style where the
    two curly braces each get their own line.
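
    That is, trivial definitions such as the following (function name
    hypothetical):

        /* Old style: */
        static inline void rcu_example_nop(void)
        {
        }

        /* New single-line style: */
        static inline void rcu_example_nop(void) { }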

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Strings used in event tracing need to be specially handled, for example,
    using the TPS() macro. Without the TPS() macro, although output looks
    fine from within a running kernel, extracting traces from a crash dump
    produces garbage instead of strings. This commit therefore adds the TPS()
    macro to some unadorned strings that were passed to event-tracing macros.
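
    A representative call site (one of several touched by the commit):

        /* Before: */
        trace_rcu_grace_period(rsp->name, rsp->gpnum, "newreq");

        /* After: */
        trace_rcu_grace_period(rsp->name, rsp->gpnum, TPS("newreq"));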

    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt (VMware)

    Paul E. McKenney
     
  • Currently, the exit-time support for TASKS_RCU is open-coded in do_exit().
    This commit creates exit_tasks_rcu_start() and exit_tasks_rcu_finish()
    APIs for do_exit() use. This has the benefit of confining the use of the
    tasks_rcu_exit_srcu variable to one file, allowing it to become static.
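
    A sketch of the resulting do_exit() usage (placement within do_exit() is
    illustrative):

        void do_exit(long code)
        {
                struct task_struct *tsk = current;
                /* ... */
                exit_tasks_rcu_start();
                exit_notify(tsk, group_dead);
                exit_tasks_rcu_finish();
                /* ... */
        }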

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The actual use of TASKS_RCU is only when PREEMPT, otherwise RCU-sched
    is used instead. This commit therefore makes synchronize_rcu_tasks()
    and call_rcu_tasks() available always, but mapped to synchronize_sched()
    and call_rcu_sched(), respectively, when !PREEMPT. This approach also
    allows some #ifdefs to be removed from rcutorture.
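
    Roughly, the resulting mapping in the !PREEMPT case looks like this:

        #ifdef CONFIG_TASKS_RCU
        /* Real TASKS_RCU implementation (PREEMPT kernels). */
        void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
        void synchronize_rcu_tasks(void);
        #else /* !CONFIG_TASKS_RCU */
        /* Without preemption, voluntary context switches are the needed
         * quiescent states, so RCU-sched provides the same guarantee. */
        #define call_rcu_tasks          call_rcu_sched
        #define synchronize_rcu_tasks   synchronize_sched
        #endif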

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Masami Hiramatsu
    Acked-by: Ingo Molnar

    Paul E. McKenney
     

28 Jul, 2017

1 commit

  • Tree RCU guarantees that every online CPU has a memory barrier between
    any given grace period and any of that CPU's RCU read-side sections that
    must be ordered against that grace period. Since RCU doesn't always
    know where read-side critical sections are, the actual implementation
    guarantees order against prior and subsequent non-idle non-offline code,
    whether in an RCU read-side critical section or not. As a result, there
    does not need to be a memory barrier at the end of synchronize_rcu()
    and friends because the ordering internal to the grace period has
    ordered every CPU's post-grace-period execution against each CPU's
    pre-grace-period execution, again for all non-idle online CPUs.

    In contrast, SRCU can have non-idle online CPUs that are completely
    uninvolved in a given SRCU grace period, for example, a CPU that
    never runs any SRCU read-side critical sections and took no part in
    the grace-period processing. It is in theory possible for a given
    synchronize_srcu()'s wakeup to be delivered to a CPU that was completely
    uninvolved in the prior SRCU grace period, which could mean that the
    code following that synchronize_srcu() would end up being unordered with
    respect to both the grace period and any pre-existing SRCU read-side
    critical sections.

    This commit therefore adds an smp_mb() to the end of __synchronize_srcu(),
    which prevents this scenario from occurring.
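
    The barrier sits at the very end of __synchronize_srcu(), roughly:

        static void __synchronize_srcu(struct srcu_struct *sp, bool do_norm)
        {
                /* ... queue the callback and wait for the grace period ... */
                wait_for_completion(&rcu.completion);
                destroy_rcu_head_on_stack(&rcu.head);

                /* Order subsequent code after the SRCU grace period, even on
                 * CPUs that took no part in grace-period processing. */
                smp_mb();
        }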

    Reported-by: Lance Roy
    Signed-off-by: Paul E. McKenney
    Acked-by: Lance Roy
    Cc: stable@vger.kernel.org # 4.12.x

    Paul E. McKenney
     

26 Jul, 2017

12 commits

  • After adopting callbacks from a newly offlined CPU, the adopting CPU
    checks that its callback list's count is zero if and only if the
    list has no callbacks. Unfortunately, it does so after
    enabling interrupts, which means that false positives are possible due to
    interrupt handlers invoking call_rcu(). Although these false positives
    are improbable, rcutorture did make it happen once.

    This commit therefore moves this check to an irq-disabled region of code,
    thus suppressing the false positive.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given changes to callback migration, rcu_cblist_head(),
    rcu_cblist_tail(), rcu_cblist_count_cbs(), rcu_segcblist_segempty(),
    rcu_segcblist_dequeued_lazy(), and rcu_segcblist_new_cbs() are
    no longer used. This commit therefore removes them.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Given that the rcu_state structure's ->orphan_pend and ->orphan_done
    fields are used only during migration of callbacks from the recently
    offlined CPU to a surviving CPU, if rcu_send_cbs_to_orphanage() and
    rcu_adopt_orphan_cbs() are combined, these fields can become local
    variables in the combined function. This commit therefore combines
    rcu_send_cbs_to_orphanage() and rcu_adopt_orphan_cbs() into a new
    rcu_segcblist_merge() function and removes the ->orphan_pend and
    ->orphan_done fields.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • When migrating callbacks from a newly offlined CPU, we are already
    holding the root rcu_node structure's lock, so it costs almost nothing
    to advance and accelerate the newly migrated callbacks. This patch
    therefore makes this advancing and acceleration happen.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The ->orphan_lock is acquired and released only within the
    rcu_migrate_callbacks() function, which now acquires the root rcu_node
    structure's ->lock. This commit therefore eliminates the ->orphan_lock
    in favor of the root rcu_node structure's ->lock.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • It is possible that the outgoing CPU is unaware of recent grace periods,
    and so it is also possible that some of its pending callbacks are actually
    ready to be invoked. The current callback-migration code would needlessly
    force these callbacks to pass through another grace period. This commit
    therefore invokes rcu_advance_cbs() on the outgoing CPU's callbacks in
    order to give them full credit for having passed through any recent
    grace periods.
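
    In outline, combined with the neighboring migration changes (a sketch;
    locking elided):

        /* In the callback-migration path, root rcu_node ->lock held: */
        rcu_advance_cbs(rsp, rnp_root, rdp);    /* Outgoing CPU: credit recent GPs. */
        rcu_advance_cbs(rsp, rnp_root, my_rdp); /* Surviving CPU: number its CBs. */
        rcu_segcblist_merge(&my_rdp->cblist, &rdp->cblist);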

    This also fixes an odd theoretical bug where there are no callbacks in
    the system except for those on the outgoing CPU, none of those callbacks
    have yet been associated with a grace-period number, there is never again
    another callback registered, and the surviving CPU never again takes a
    scheduling-clock interrupt, never goes idle, and never enters nohz_full
    userspace execution. Yes, this is (just barely) possible. It requires
    that the surviving CPU be a nohz_full CPU, that its scheduler-clock
    interrupt be shut off, and that it loop forever in the kernel. You get
    bonus points if you can make this one happen! ;-)

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU's CPU-hotplug callback-migration code first moves the outgoing
    CPU's callbacks to ->orphan_done and ->orphan_pend, and only then
    moves them to the NOCB callback list. This commit avoids the
    extra step (and simplifies the code) by moving the callbacks directly
    from the outgoing CPU's callback list to the NOCB callback list.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current CPU-hotplug RCU-callback-migration code checks
    for the source (newly offlined) CPU being a NOCBs CPU down in
    rcu_send_cbs_to_orphanage(). This commit simplifies callback migration a
    bit by moving this check up to rcu_migrate_callbacks(). This commit also
    adds a check for the source CPU having no callbacks, which eases analysis
    of the rcu_send_cbs_to_orphanage() and rcu_adopt_orphan_cbs() functions.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_node structure's ->n_cbs_orphaned and ->n_cbs_adopted fields
    are updated, but never read. This commit therefore removes them.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The updates of ->expmaskinitnext and ->ncpus are unsynchronized,
    with the value of ->ncpus being incremented long before the corresponding
    ->expmaskinitnext mask is updated. If an RCU expedited grace period
    sees ->ncpus change, it will update the ->expmaskinit masks from the new
    ->expmaskinitnext masks. But it is possible that ->ncpus has already
    been updated, but the ->expmaskinitnext masks still have their old values.
    For the current expedited grace period, no harm done. The CPU could not
    have been online before the grace period started, so there is no need to
    wait for its non-existent pre-existing readers.

    But the next RCU expedited grace period is in a world of hurt. The value
    of ->ncpus has already been updated, so this grace period will assume
    that the ->expmaskinitnext masks have not changed. But they have, and
    they won't be taken into account until the next never-been-online CPU
    comes online. This means that RCU will be ignoring some CPUs that it
    should be paying attention to.

    The solution is to update ->ncpus and ->expmaskinitnext while holding
    the ->lock for the rcu_node structure containing the ->expmaskinitnext
    mask. Because smp_store_release() is now used to update ->ncpus and
    smp_load_acquire() is now used to locklessly read it, if the expedited
    grace period sees ->ncpus change, then the updating CPU has to
    already be holding the corresponding ->lock. Therefore, when the
    expedited grace period later acquires that ->lock, it is guaranteed
    to see the new value of ->expmaskinitnext.

    On the other hand, if the expedited grace period loads ->ncpus just
    before an update, earlier full memory barriers guarantee that
    the incoming CPU isn't far enough along to be running any RCU readers.

    This commit therefore makes the required change.
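
    In outline (a sketch of the pairing; the ->ncpus_snap bookkeeping shown
    here is illustrative):

        /* CPU-onlining path, leaf rcu_node ->lock held: */
        rnp->expmaskinitnext |= mask;
        smp_store_release(&rsp->ncpus, rsp->ncpus + nbits);

        /* Expedited grace period, lockless check: */
        if (smp_load_acquire(&rsp->ncpus) == rsp->ncpus_snap)
                return; /* No new CPUs, so the masks are unchanged. */
        /* Otherwise, acquiring each rnp->lock below guarantees that the
         * updated ->expmaskinitnext values are seen. */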

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU callbacks must be migrated away from an outgoing CPU, and this is
    done near the end of the CPU-hotplug operation, after the outgoing CPU is
    long gone. Unfortunately, this means that other CPU-hotplug callbacks
    can execute while the outgoing CPU's callbacks are still immobilized
    on the long-gone CPU's callback lists. If any of these CPU-hotplug
    callbacks must wait, either directly or indirectly, for the invocation
    of any of the immobilized RCU callbacks, the system will hang.

    This commit avoids such hangs by migrating the callbacks away from the
    outgoing CPU immediately upon its departure, shortly after the return
    from __cpu_die() in takedown_cpu(). Thus, RCU is able to advance these
    callbacks and invoke them, which allows all the after-the-fact CPU-hotplug
    callbacks to wait on these RCU callbacks without risk of a hang.
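
    A sketch of the new call point (per the commit text):

        static int takedown_cpu(unsigned int cpu)
        {
                /* ... */
                __cpu_die(cpu);                 /* Outgoing CPU is gone. */
                rcutree_migrate_callbacks(cpu); /* Adopt its callbacks now, so
                                                 * later hotplug callbacks can
                                                 * safely wait on RCU. */
                return 0;
        }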

    While in the neighborhood, this commit also moves rcu_send_cbs_to_orphanage()
    and rcu_adopt_orphan_cbs() under a pre-existing #ifdef to avoid including
    dead code on the one hand and to avoid define-without-use warnings on the
    other hand.

    Reported-by: Jeffrey Hugo
    Link: http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org
    Signed-off-by: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Cc: Ingo Molnar
    Cc: Anna-Maria Gleixner
    Cc: Boris Ostrovsky
    Cc: Richard Weinberger

    Paul E. McKenney
     
  • The handling of RCU's no-CBs CPUs has a maintenance headache, namely
    that if call_rcu() is invoked with interrupts disabled, the rcuo kthread
    wakeup must be deferred to a point where we can be sure that scheduler
    locks are not held. Of course, there are a lot of code paths leading
    from an interrupts-disabled invocation of call_rcu(), and missing any
    one of these can result in excessive callback-invocation latency, and
    potentially even system hangs.

    This commit therefore uses a timer to guarantee that the wakeup will
    eventually occur. If one of the deferred-wakeup points kicks in, then
    the timer is simply cancelled.
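
    In outline (helper names are illustrative except where given above):

        /* call_rcu() with irqs disabled: defer the rcuo kthread wakeup,
         * arming a one-jiffy timer as a backstop. */
        static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype)
        {
                WRITE_ONCE(rdp->nocb_defer_wakeup, waketype);
                mod_timer(&rdp->nocb_timer, jiffies + 1);
        }

        /* Whichever deferred-wakeup point runs first performs the wakeup
         * and cancels the backstop timer. */
        static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
        {
                del_timer(&rdp->nocb_timer);
                /* ... wake the no-CBs leader kthread ... */
        }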

    This commit also fixes up an incomplete removal of commits that were
    intended to plug remaining exit paths, which should have the added
    benefit of reducing the overhead of RCU's context-switch hooks. In
    addition, it simplifies leader-to-follower callback-list handoff by
    introducing locking. The call_rcu()-to-leader handoff continues to
    use atomic operations in order to maintain good real-time latency for
    common-case use of call_rcu().

    Signed-off-by: Paul E. McKenney
    [ paulmck: Dan Carpenter fix for mod_timer() usage bug found by smatch. ]

    Paul E. McKenney
     

25 Jul, 2017

3 commits