16 May, 2013

1 commit

  • When rcu_init() is called, slab is already working; allocating
    bootmem at that point results in warnings and an allocation from
    slab. This commit therefore changes alloc_bootmem_cpumask_var() to
    alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called
    from rcu_init().
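
    A minimal sketch of the final form (zalloc_cpumask_var() per the note
    below; surrounding context and error handling are illustrative):

        static cpumask_var_t rcu_nocb_mask;

        static void __init rcu_bootup_announce_oddness(void)
        {
                /* slab is up by rcu_init() time, so allocate the zeroed
                 * cpumask from slab rather than from bootmem. */
                if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL))
                        pr_warn("Failed to allocate rcu_nocb_mask\n");
        }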

    Signed-off-by: Sasha Levin
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett
    Tested-by: Robin Holt

    [paulmck: convert to zalloc_cpumask_var(), as suggested by Yinghai Lu.]

    Sasha Levin
     

15 May, 2013

1 commit

  • Commit c0f4dfd4f (rcu: Make RCU_FAST_NO_HZ take advantage of numbered
    callbacks) introduced a bug that can result in excessively long grace
    periods. The bug reverses the sense of the "if" statement checking
    for lazy callbacks, so that RCU takes a lazy approach when there are
    in fact non-lazy callbacks. This can result in excessive boot, suspend,
    and resume times.

    This commit therefore fixes the sense of this "if" statement.
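
    The shape of the corrected test, sketched with this era's field and
    macro names (fragment illustrative, not the literal diff):

        unsigned long dj;  /* delay until the next RCU check, in jiffies */

        /* Take the lazy (long) timeout only when ALL callbacks are lazy. */
        if (rdtp->all_lazy)
                dj = round_jiffies(RCU_IDLE_LAZY_GP_DELAY + jiffies) - jiffies;
        else
                dj = round_jiffies(RCU_IDLE_GP_DELAY + jiffies) - jiffies;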

    Reported-by: Borislav Petkov
    Reported-by: Bjørn Mork
    Reported-by: Joerg Roedel
    Signed-off-by: Paul E. McKenney
    Tested-by: Bjørn Mork
    Tested-by: Joerg Roedel

    Paul E. McKenney
     

06 May, 2013

1 commit

  • Pull 'full dynticks' support from Ingo Molnar:
    "This tree from Frederic Weisbecker adds a new, (exciting! :-) core
    kernel feature to the timer and scheduler subsystems: 'full dynticks',
    or CONFIG_NO_HZ_FULL=y.

    This feature extends the nohz variable-size timer tick feature from
    idle to busy CPUs (running at most one task) as well, potentially
    reducing the number of timer interrupts significantly.

    This feature got motivated by real-time folks and the -rt tree, but
    the general utility and motivation of full-dynticks runs wider than
    that:

    - HPC workloads get faster: CPUs running a single task should be able
    to utilize a maximum amount of CPU power. A periodic timer tick at
    HZ=1000 can cause a constant overhead of up to 1.0%. This feature
    removes that overhead - and speeds up the system by 0.5%-1.0% on
    typical distro configs even on modern systems.

    - Real-time workload latency reduction: CPUs running critical tasks
    should experience as little jitter as possible. The last remaining
    source of kernel-related jitter was the periodic timer tick.

    - A single task executing on a CPU is a pretty common situation,
    especially with an increasing number of cores/CPUs, so this feature
    helps desktop and mobile workloads as well.

    The cost of the feature is mainly related to increased timer
    reprogramming overhead when a CPU switches its tick period, and thus
    slightly longer to-idle and from-idle latency.

    Configuration-wise a third mode of operation is added to the existing
    two NOHZ kconfig modes:

    - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named
    as a config option. This is the traditional Linux periodic tick
    design: there's a HZ tick going on all the time, regardless of
    whether a CPU is idle or not.

    - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the
    periodic tick when a CPU enters idle mode.

    - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the
    tick when a CPU is idle, also slows the tick down to 1 Hz (one
    timer interrupt per second) when only a single task is running on a
    CPU.

    The .config behavior is compatible: existing !CONFIG_NO_HZ and
    CONFIG_NO_HZ=y settings get translated to the new values, without the
    user having to configure anything. CONFIG_NO_HZ_FULL is turned off by
    default.

    This feature is based on a lot of infrastructure work that has been
    steadily going upstream in the last 2-3 cycles: related RCU support
    and non-periodic cputime support in particular is upstream already.

    This tree adds the final pieces and activates the feature. The pull
    request is marked RFC because:

    - it's marked 64-bit only at the moment - the 32-bit support patch is
    small but did not get ready in time.

    - it has a number of fresh commits that came in after the merge
    window. The overwhelming majority of commits are from before the
    merge window, but still some aspects of the tree are fresh and so I
    marked it RFC.

    - it's a pretty wide-reaching feature with lots of effects - and
    while the components have been in testing for some time, the full
    combination is still not very widely used. That it's default-off
    should reduce its regression abilities and obviously there are no
    known regressions with CONFIG_NO_HZ_FULL=y enabled either.

    - the feature is not completely idempotent: there is no 100%
    equivalent replacement for a periodic scheduler/timer tick. In
    particular there's ongoing work to map out and reduce its effects
    on scheduler load-balancing and statistics. This should not impact
    correctness though, there are no known regressions related to this
    feature at this point.

    - it's a pretty ambitious feature that with time will likely be
    enabled by most Linux distros, and we'd like you to make input on
    its design/implementation, if you dislike some aspect we missed.
    Without flaming us to crisp! :-)

    Future plans:

    - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off
    the periodic tick altogether when there's a single busy task on a
    CPU. We'd first like 1 Hz to be exposed more widely before we go
    for the 0 Hz target though.

    - once we reach 0 Hz we can remove the periodic tick assumption from
    nr_running>=2 as well, by essentially interrupting busy tasks only
    as frequently as the sched_latency constraints require us to do -
    once every 4-40 msecs, depending on nr_running.

    I am personally leaning towards biting the bullet and doing this in
    v3.10, like the -rt tree this effort has been going on for too long -
    but the final word is up to you as usual.

    More technical details can be found in Documentation/timers/NO_HZ.txt"
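
    For reference, a sketch of how the new mode is selected (fragment
    illustrative; the CPU list is an example):

        # .config: exactly one of the three tick modes
        # CONFIG_HZ_PERIODIC=y    # tick always runs
        # CONFIG_NO_HZ_IDLE=y     # tick stops on idle CPUs (old CONFIG_NO_HZ=y)
        CONFIG_NO_HZ_FULL=y       # tick also drops to 1 Hz on single-task CPUs

        # Boot parameter naming the full-dynticks CPUs (CPU 0 stays periodic):
        #   nohz_full=1-3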

    * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
    sched: Keep at least 1 tick per second for active dynticks tasks
    rcu: Fix full dynticks' dependency on wide RCU nocb mode
    nohz: Protect smp_processor_id() in tick_nohz_task_switch()
    nohz_full: Add documentation.
    cputime_nsecs: use math64.h for nsec resolution conversion helpers
    nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config
    nohz: Reduce overhead under high-freq idling patterns
    nohz: Remove full dynticks' superfluous dependency on RCU tree
    nohz: Fix unavailable tick_stop tracepoint in dynticks idle
    nohz: Add basic tracing
    nohz: Select wide RCU nocb for full dynticks
    nohz: Disable the tick when irq resume in full dynticks CPU
    nohz: Re-evaluate the tick for the new task after a context switch
    nohz: Prepare to stop the tick on irq exit
    nohz: Implement full dynticks kick
    nohz: Re-evaluate the tick from the scheduler IPI
    sched: New helper to prevent from stopping the tick in full dynticks
    sched: Kick full dynticks CPU that have more than one task enqueued.
    perf: New helper to prevent full dynticks CPUs from stopping tick
    perf: Kick full dynticks CPU if events rotation is needed
    ...

    Linus Torvalds
     

19 Apr, 2013

1 commit

  • We need full dynticks CPU to also be RCU nocb so
    that we don't have to keep the tick to handle RCU
    callbacks.

    Make sure the range passed to the nohz_full= boot
    parameter is a subset of rcu_nocbs=.

    The CPUs that fail to meet this requirement will be
    excluded from the nohz_full range. This is checked
    early in boot time, before any CPU has the opportunity
    to stop its tick.
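
    A sketch of the containment check (mask names as in this era's code;
    details illustrative):

        /* Every nohz_full= CPU must also be an rcu_nocbs= CPU. */
        if (!cpumask_subset(tick_nohz_full_mask, rcu_nocb_mask)) {
                /* Trim the offenders out of the full-dynticks set. */
                cpumask_and(tick_nohz_full_mask,
                            tick_nohz_full_mask, rcu_nocb_mask);
                pr_warn("NO_HZ: excluding nohz_full CPUs lacking rcu_nocbs\n");
        }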

    Suggested-by: Steven Rostedt
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Geoff Levand
    Cc: Gilad Ben Yossef
    Cc: Hakan Akkan
    Cc: Ingo Molnar
    Cc: Kevin Hilman
    Cc: Li Zhong
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     

16 Apr, 2013

1 commit

  • Adaptive-ticks CPUs inform RCU when they enter kernel mode, but they do
    not necessarily turn the scheduler-clock tick back on. This state of
    affairs could result in RCU waiting on an adaptive-ticks CPU running
    for an extended period in kernel mode. Such a CPU will never run the
    RCU state machine, and could therefore indefinitely extend the RCU state
    machine, sooner or later resulting in an OOM condition.

    This patch, inspired by an earlier patch by Frederic Weisbecker, therefore
    causes RCU's force-quiescent-state processing to check for this condition
    and to send an IPI to CPUs that remain in that state for too long.
    "Too long" currently means about three jiffies by default, which is
    quite some time for a CPU to remain in the kernel without blocking.
    The rcutree.jiffies_till_first_fqs and rcutree.jiffies_till_next_fqs
    sysfs variables may be used to tune "too long" if needed.
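
    For example, both timeouts can be set from the kernel command line
    (values illustrative):

        rcutree.jiffies_till_first_fqs=3 rcutree.jiffies_till_next_fqs=3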

    Reported-by: Frederic Weisbecker
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett
    Signed-off-by: Frederic Weisbecker
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Geoff Levand
    Cc: Gilad Ben Yossef
    Cc: Hakan Akkan
    Cc: Ingo Molnar
    Cc: Kevin Hilman
    Cc: Li Zhong
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Paul E. McKenney
     

26 Mar, 2013

12 commits

  • doc.2013.03.12a: Documentation changes.

    fixes.2013.03.13a: Miscellaneous fixes.

    idlenocb.2013.03.26b: Remove restrictions on no-CBs CPUs, make
    RCU_FAST_NO_HZ take advantage of numbered callbacks, add
    callback acceleration based on numbered callbacks.

    Paul E. McKenney
     
  • CPUs going idle will need to record the need for a future grace
    period, but won't actually need to block waiting on it. This commit
    therefore splits rcu_start_future_gp(), which does the recording, from
    rcu_nocb_wait_gp(), which now invokes rcu_start_future_gp() to do the
    recording, after which rcu_nocb_wait_gp() does the waiting.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • CPUs going idle need to be able to indicate their need for future grace
    periods. A mechanism for doing this already exists for no-callbacks
    CPUs, so the idea is to re-use that mechanism. This commit therefore
    moves the ->n_nocb_gp_requests field of the rcu_node structure out from
    under the CONFIG_RCU_NOCB_CPU #ifdef and renames it to ->need_future_gp.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • If CPUs are to give prior notice of needed grace periods, it will be
    necessary to invoke rcu_start_gp() without dropping the root rcu_node
    structure's ->lock. This commit takes a second step in this direction
    by moving the release of this lock to rcu_start_gp()'s callers.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Dyntick-idle CPUs need to be able to pre-announce their need for grace
    periods. This can be done using something similar to the mechanism used
    by no-CB CPUs to announce their need for grace periods. This commit
    moves in this direction by renaming the no-CBs grace-period event tracing
    to suit the new future-grace-period needs.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Because RCU callbacks are now associated with the number of the grace
    period that they must wait for, CPUs can now take advance callbacks
    corresponding to grace periods that ended while a given CPU was in
    dyntick-idle mode. This eliminates the need to try forcing the RCU
    state machine while entering idle, thus reducing the CPU intensiveness
    of RCU_FAST_NO_HZ, which should increase its energy efficiency.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU_FAST_NO_HZ operation is controlled by four compile-time C-preprocessor
    macros, but some use cases benefit greatly from runtime adjustment,
    particularly when tuning devices. This commit therefore creates the
    corresponding sysfs entries.
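
    Illustrative runtime access (the parameter name shown is an assumption
    derived from the corresponding compile-time macro):

        ls /sys/module/rcutree/parameters/
        echo 6 > /sys/module/rcutree/parameters/rcu_idle_gp_delay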

    Reported-by: Robin Randhawa
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently, the per-no-CBs-CPU kthreads are named "rcuo" followed by
    the CPU number, for example, "rcuo0" for CPU 0. This is problematic given that
    there are either two or three RCU flavors, each of which gets a per-CPU
    kthread with exactly the same name. This commit therefore introduces
    a one-letter abbreviation for each RCU flavor, namely 'b' for RCU-bh,
    'p' for RCU-preempt, and 's' for RCU-sched. This abbreviation is used
    to distinguish the "rcuo" kthreads, for example, for CPU 0 we would have
    "rcuob/0", "rcuop/0", and "rcuos/0".

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Tested-by: Dietmar Eggemann

    Paul E. McKenney
     
  • Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently, the no-CBs kthreads do repeated timed waits for grace periods
    to elapse. This is crude and energy inefficient, so this commit allows
    no-CBs kthreads to specify exactly which grace period they are waiting
    for and also allows them to block for the entire duration until the
    desired grace period completes.
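
    A sketch of the resulting wait, using field names from this period's
    no-CBs code (treat details as illustrative):

        /* Block until grace period "c" completes; the wait queue is
         * indexed by the low-order bit of the awaited GP number. */
        wait_event_interruptible(rnp->nocb_gp_wq[c & 0x1],
                                 ULONG_CMP_GE(ACCESS_ONCE(rnp->completed), c));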

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs
    kernel command-line parameter. This is inconvenient in some cases,
    particularly for randconfig testing, so this commit adds a new set of
    kernel configuration parameters. CONFIG_RCU_NOCB_CPU_NONE (the default)
    retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback
    processing from CPU 0 (along with any other CPUs specified by the
    rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads
    callback processing from all CPUs.
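
    Illustrative Kconfig selection (default shown enabled):

        CONFIG_RCU_NOCB_CPU=y
        CONFIG_RCU_NOCB_CPU_NONE=y    # offload only CPUs given by rcu_nocbs=
        # CONFIG_RCU_NOCB_CPU_ZERO=y  # additionally offload CPU 0
        # CONFIG_RCU_NOCB_CPU_ALL=y   # offload every CPU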

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

14 Mar, 2013

1 commit

  • If RCU's softirq handler is prevented from executing, an RCU CPU stall
    warning can result. Ways to prevent RCU's softirq handler from executing
    include: (1) CPU spinning with interrupts disabled, (2) infinite loop
    in some softirq handler, and (3) in -rt kernels, an infinite loop in a
    set of real-time threads running at priorities higher than that of RCU's
    softirq handler.

    Because this situation can be difficult to track down, this commit causes
    the count of RCU softirq handler invocations to be printed with RCU
    CPU stall warnings. This information does require some interpretation,
    as now documented in Documentation/RCU/stallwarn.txt.

    Reported-by: Thomas Gleixner
    Signed-off-by: Paul E. McKenney
    Tested-by: Paul Gortmaker

    Paul E. McKenney
     

13 Mar, 2013

1 commit

  • Currently, CPU 0 is constrained to not be a no-CBs CPU, and furthermore
    at least one no-CBs CPU must remain online at any given time. These
    restrictions are problematic in some situations, such as cases where
    all CPUs must run a real-time workload that needs to be insulated from
    OS jitter and latencies due to RCU callback invocation. This commit
    therefore provides no-CBs CPUs a (very crude and energy-inefficient)
    way to start and to wait for grace periods independently of the normal
    RCU callback mechanisms. This approach allows any or all of the CPUs to
    be designated as no-CBs CPUs, and allows any proper subset of the CPUs
    (whether no-CBs CPUs or not) to be offlined.

    This commit also provides a fix for a locking bug spotted by Xie
    Changlong.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

09 Jan, 2013

2 commits

  • The as-documented rcu_nocb_poll will fail to enable this feature
    for two reasons. (1) there is an extra "s" in the documented
    name which is not in the code, and (2) since it uses module_param,
    it really is expecting a prefix, akin to "rcutree.fanout_leaf"
    and the prefix isn't documented.

    However, there are several reasons why we might not want to
    simply fix the typo and add the prefix:

    1) we'd end up with rcutree.rcu_nocb_poll, and rather probably make
    a change to rcutree.nocb_poll

    2) if we did #1, then the prefix wouldn't be consistent with the
    rcu_nocbs= parameter (i.e. one with, one without prefix)

    3) the use of module_param in a header file is less than desired,
    since it isn't immediately obvious that it will get processed
    via rcutree.c and get the prefix from that (although use of
    module_param_named() could clarify that.)

    4) the implied export of /sys/module/rcutree/parameters/rcu_nocb_poll
    data to userspace via module_param() doesn't really buy us anything,
    as it is read-only and we can tell if it is enabled already without
    it, since there is a printk at early boot telling us so.

    In light of all that, just change it from a module_param() to an
    early_param() call, and worry about adding it to /sys later on if
    we decide to allow a dynamic setting of it.

    Also change the variable to be tagged as __read_mostly, since it
    will only ever be fiddled with, at most, once at boot.
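
    A sketch of the conversion (close to, though not necessarily identical
    with, the actual change):

        static bool __read_mostly rcu_nocb_poll;  /* now tagged __read_mostly */

        static int __init parse_rcu_nocb_poll(char *arg)
        {
                rcu_nocb_poll = true;
                return 0;
        }
        early_param("rcu_nocb_poll", parse_rcu_nocb_poll);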

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Paul E. McKenney

    Paul Gortmaker
     
  • The wait_event() at the head of the rcu_nocb_kthread() can result in
    soft-lockup complaints if the CPU in question does not register RCU
    callbacks for an extended period. This commit therefore changes
    the wait_event() to a wait_event_interruptible().
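
    The changed wait, sketched with this code's field names (illustrative):

        /* An interruptible sleep is ignored by the soft-lockup detector,
         * so a long wait for the first callback no longer complains. */
        wait_event_interruptible(rdp->nocb_wq, rdp->nocb_head != NULL);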

    Reported-by: Frederic Weisbecker
    Signed-off-by: Paul Gortmaker
    Signed-off-by: Paul E. McKenney

    Paul Gortmaker
     

17 Nov, 2012

3 commits

  • Currently, callback invocations from callback-free CPUs are accounted to
    the CPU that registered the callback, but using the same field that is
    used for normal callbacks. This makes it impossible to determine from
    debugfs output whether callbacks are in fact being diverted. This commit
    therefore adds a separate ->n_nocbs_invoked field in the rcu_data structure
    in which diverted callback invocations are counted. RCU's debugfs tracing
    still displays normal callback invocations using ci=, but displays
    diverted callbacks with nci=.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU callback execution can add significant OS jitter and also can
    degrade both scheduling latency and, in asymmetric multiprocessors,
    energy efficiency. This commit therefore adds the ability for selected
    CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
    to kthreads. If the "rcu_nocb_poll" boot parameter is also specified,
    these kthreads will do polling, removing the need for the offloaded
    CPUs to do wakeups. At least one CPU must be doing normal callback
    processing: currently CPU 0 cannot be selected as a no-CBs CPU.
    In addition, attempts to offline the last normal-CBs CPU will fail.
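
    Example invocation (CPU list illustrative):

        rcu_nocbs=1-7 rcu_nocb_poll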

    This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
    this commit includes fixes to problems located by Fengguang Wu's
    kbuild test robot.

    [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • …cu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD

    urgent.2012.10.27a: Fix for RCU user-mode transition (already in -tip).

    doc.2012.11.08a: Documentation updates, most notably codifying the
    memory-barrier guarantees inherent to grace periods.

    fixes.2012.11.13a: Miscellaneous fixes.

    srcu.2012.10.27a: Allow statically allocated and initialized srcu_struct
    structures (courtesy of Lai Jiangshan).

    stall.2012.11.13a: Add more diagnostic information to RCU CPU stall
    warnings, and decrease the stall timeout from 60 seconds to 21 seconds.

    hotplug.2012.11.08a: Minor updates to CPU hotplug handling.

    tracing.2012.11.08a: Improved debugfs tracing, courtesy of Michael Wang.

    idle.2012.10.24a: Updates to RCU idle/adaptive-idle handling, including
    a boot parameter that maps normal grace periods to expedited.

    Resolved conflict in kernel/rcutree.c due to side-by-side change.

    Paul E. McKenney
     

14 Nov, 2012

1 commit

  • This commit explicitly states the memory-ordering properties of the
    RCU grace-period primitives. Although these properties were in some
    sense implied by the fundamental property of RCU ("a grace period must
    wait for all pre-existing RCU read-side critical sections to complete"),
    stating it explicitly will be a great labor-saving device.

    Reported-by: Oleg Nesterov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Oleg Nesterov

    Paul E. McKenney
     

09 Nov, 2012

1 commit

  • The ->onofflock field in the rcu_state structure at one time synchronized
    CPU-hotplug operations for RCU. However, its scope has decreased over time
    so that it now only protects the lists of orphaned RCU callbacks. This
    commit therefore renames it to ->orphan_lock to reflect its current use.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

24 Oct, 2012

1 commit

  • There have been some embedded applications that would benefit from
    use of expedited grace-period primitives. In some ways, this is
    similar to synchronize_net() doing either a normal or an expedited
    grace period depending on lock state, but with control outside of
    the kernel.

    This commit therefore adds rcu_expedited boot and sysfs parameters
    that cause the kernel to substitute expedited primitives for the
    normal grace-period primitives.

    [ paulmck: Add trace/event/rcu.h to kernel/srcu.c to avoid build error.
    Get rid of infinite loop through contention path.]
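
    Illustrative usage of the new knobs (parameter prefix and sysfs path
    are assumptions, not taken from this changelog):

        rcupdate.rcu_expedited=1              # boot-time module parameter
        echo 1 > /sys/kernel/rcu_expedited    # runtime toggle via sysfs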

    Signed-off-by: Antti P Miettinen
    Signed-off-by: Paul E. McKenney

    Antti P Miettinen
     

26 Sep, 2012

2 commits

  • The current implementation of RCU_FAST_NO_HZ tries reasonably hard to rid
    the current CPU of RCU callbacks. This is appropriate when the CPU is
    entering idle, where it doesn't have much useful to do anyway, but is most
    definitely not what you want when transitioning to user-mode execution.
    This commit therefore detects the adaptive-tick case, and refrains from
    burning CPU time getting rid of RCU callbacks in that case.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The conflicts between kernel/rcutree.h and kernel/rcutree_plugin.h
    were due to adjacent insertions and deletions, which were resolved
    by simply accepting the changes on both branches.

    Paul E. McKenney
     

25 Sep, 2012

1 commit

  • …', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD

    bigrt.2012.09.23a contains additional commits to reduce scheduling latency
    from RCU on huge systems (many hundreds or thousands of CPUs).

    doctorture.2012.09.23a contains documentation changes and rcutorture fixes.

    fixes.2012.09.23a contains miscellaneous fixes.

    hotplug.2012.09.23a contains CPU-hotplug-related changes.

    idlechop.2012.09.23a fixes architectures for which RCU no longer considered
    the idle loop to be a quiescent state due to earlier
    adaptive-dynticks changes. Affected architectures are alpha,
    cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa,
    and ia64.

    Paul E. McKenney
     

23 Sep, 2012

9 commits

  • The print_cpu_stall_fast_no_hz() function attempts to print -1 when
    the ->idle_gp_timer is not pending, but unsigned arithmetic causes it
    to instead print ULONG_MAX, which is 4294967295 on 32-bit systems and
    18446744073709551615 on 64-bit systems. Neither of these are the most
    reader-friendly values, so this commit instead causes "timer not pending"
    to be printed when ->idle_gp_timer is not pending.

    Reported-by: Paul Walmsley
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The rcu_print_detail_task_stall_rnp() function invokes
    rcu_preempt_blocked_readers_cgp() to verify that there are some preempted
    RCU readers blocking the current grace period outside of the protection
    of the rcu_node structure's ->lock. This means that the last blocked
    reader might exit its RCU read-side critical section and remove itself
    from the ->blkd_tasks list before the ->lock is acquired, resulting in
    a segmentation fault when the subsequent code attempts to dereference
    the now-NULL gp_tasks pointer.

    This commit therefore moves the test under the lock. This will not
    have measurable effect on lock contention because this code is invoked
    only when printing RCU CPU stall warnings, in other words, in the common
    case, never.
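
    The reordering, sketched with the lock and predicate named in the text
    (fragment illustrative):

        raw_spin_lock_irqsave(&rnp->lock, flags);
        if (!rcu_preempt_blocked_readers_cgp(rnp)) {
                /* No blocked readers: nothing to print for this rcu_node. */
                raw_spin_unlock_irqrestore(&rnp->lock, flags);
                return;
        }
        /* ->gp_tasks cannot go NULL while ->lock is held: safe to walk. */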

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The increment_cpu_stall_ticks() function listed each RCU flavor
    explicitly, with an ifdef to handle preemptible RCU. This commit
    therefore applies for_each_rcu_flavor() to save a line of code.

    Because this commit switches from a code-based enumeration of the
    flavors of RCU to an rcu_state-list-based enumeration, it is no longer
    possible to apply __get_cpu_var() to the per-CPU rcu_data structures.
    We instead use __this_cpu_ptr() on the rcu_state structure's ->rda field
    that references the corresponding rcu_data structures.
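
    A sketch of the result (per-CPU field name illustrative):

        static void increment_cpu_stall_ticks(void)
        {
                struct rcu_state *rsp;

                for_each_rcu_flavor(rsp)
                        __this_cpu_ptr(rsp->rda)->ticks_this_gp++;
        }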

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Commit 1217ed1b (rcu: permit rcu_read_unlock() to be called while holding
    runqueue locks) made rcu_initiate_boost() restore irq state when releasing
    the rcu_node structure's ->lock, but failed to update the header comment
    accordingly. This commit therefore brings the header comment up to date.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The rcu_preempt_offline_tasks() function moves all tasks queued on a
    given leaf rcu_node structure to the root rcu_node, which is done when
    the last CPU corresponding to the leaf rcu_node structure goes offline.
    Now that
    RCU-preempt's synchronize_rcu_expedited() implementation blocks CPU-hotplug
    operations during the initialization of each rcu_node structure's
    ->boost_tasks pointer, rcu_preempt_offline_tasks() can do a better job
    of setting the root rcu_node's ->boost_tasks pointer.

    The key point is that rcu_preempt_offline_tasks() runs as part of the
    CPU-hotplug process, so that a concurrent synchronize_rcu_expedited()
    is guaranteed to either have not started on the one hand (in which case
    there is no boosting on behalf of the expedited grace period) or to be
    completely initialized on the other (in which case, in the absence of
    other priority boosting, all ->boost_tasks pointers will be initialized).
    Therefore, if rcu_preempt_offline_tasks() finds that the ->boost_tasks
    pointer is equal to the ->exp_tasks pointer, it can be sure that it is
    correctly placed.

    In the case where there was boosting ongoing at the time that the
    synchronize_rcu_expedited() function started, different nodes might start
    boosting the tasks blocking the expedited grace period at different times.
    In this mixed case, the root node will either be boosting tasks for
    the expedited grace period already, or it will start as soon as it gets
    done boosting for the normal grace period -- but in this latter case,
    the root node's tasks needed to be boosted in any case.

    This commit therefore adds a comparison of the ->boost_tasks pointer
    against the ->exp_tasks pointer to the list of conditions that prevent
    updating ->boost_tasks.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • When rcu_preempt_offline_tasks() clears tasks from a leaf rcu_node
    structure, it does not NULL out the structure's ->boost_tasks field.
    This commit therefore fixes this issue.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The current quiescent-state detection algorithm is needlessly
    complex. It records the grace-period number corresponding to
    the quiescent state at the time of the quiescent state, which
    works, but it seems better to simply erase any record of previous
    quiescent states at the time that the CPU notices the new grace
    period. This has the further advantage of removing another piece
    of RCU for which lockless reasoning is required.

    Therefore, this commit makes this change.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The synchronize_rcu_expedited() function disables interrupts across a
    scan of all leaf rcu_node structures, which is not good for real-time
    scheduling latency on large systems (hundreds or especially thousands
    of CPUs). This commit therefore holds off CPU-hotplug operations using
    get_online_cpus(), and removes the prior acquisition of the ->onofflock
    (which required disabling interrupts).
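
    The structural change, sketched:

        /* Exclude CPU-hotplug operations without disabling interrupts. */
        get_online_cpus();
        /* ... scan the leaf rcu_node structures, preemptibly ... */
        put_online_cpus();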

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • In the C language, signed overflow is undefined. It is true that
    two's-complement arithmetic normally comes to the rescue, but the
    compiler can subvert this any time it has any information about the
    values being compared. For example, given "if (a - b > 0)", if the
    compiler has enough information to realize that (for example) the
    value of "a" is positive and that of "b" is negative, the compiler
    is within its rights to optimize to a simple "if (1)", which might
    not be what you want.

    This commit therefore converts synchronize_rcu_expedited()'s work-done
    detection counter from signed to unsigned.
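
    A sketch of the hazard and the change (variable name as suggested by
    this commit's context; comparison macro from rcupdate.h):

        /* Unsigned arithmetic wraps with defined behavior, so "has the
         * counter advanced past my snapshot?" stays meaningful. */
        static unsigned long sync_rcu_preempt_exp_count;  /* was signed */
        unsigned long snap;

        snap = ACCESS_ONCE(sync_rcu_preempt_exp_count) + 1;
        /* ... later, detect that someone else did our work: */
        if (ULONG_CMP_LT(snap, ACCESS_ONCE(sync_rcu_preempt_exp_count)))
                return;  /* the needed grace period already elapsed */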

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney