23 Feb, 2017
1 commit
-
Move the x86_64 idle notifiers originally by Andi Kleen and Venkatesh
Pallipadi to generic.
Change-Id: Idf29cda15be151f494ff245933c12462643388d5
Acked-by: Nicolas Pitre
Signed-off-by: Todd Poynor
06 Jan, 2017
1 commit
-
commit 777c6e0daebb3fcefbbd6f620410a946b07ef6d0 upstream.
Yu Zhao has noticed that __unregister_cpu_notifier only unregisters its
notifiers when HOTPLUG_CPU=y while the registration might succeed even
when HOTPLUG_CPU=n if MODULE is enabled. This means that e.g. zswap
might keep a stale notifier on the list on the manual clean up during
the pool tear down and thus corrupt the list, resulting in the following:
[ 144.964346] BUG: unable to handle kernel paging request at ffff880658a2be78
[ 144.971337] IP: [] raw_notifier_chain_register+0x1b/0x40
[ 145.122628] Call Trace:
[ 145.125086] [] __register_cpu_notifier+0x18/0x20
[ 145.131350] [] zswap_pool_create+0x273/0x400
[ 145.137268] [] __zswap_param_set+0x1fc/0x300
[ 145.143188] [] ? trace_hardirqs_on+0xd/0x10
[ 145.149018] [] ? kernel_param_lock+0x28/0x30
[ 145.154940] [] ? __might_fault+0x4f/0xa0
[ 145.160511] [] zswap_compressor_param_set+0x17/0x20
[ 145.167035] [] param_attr_store+0x5c/0xb0
[ 145.172694] [] module_attr_store+0x1d/0x30
[ 145.178443] [] sysfs_kf_write+0x4f/0x70
[ 145.183925] [] kernfs_fop_write+0x149/0x180
[ 145.189761] [] __vfs_write+0x18/0x40
[ 145.194982] [] vfs_write+0xb2/0x1a0
[ 145.200122] [] SyS_write+0x52/0xa0
[ 145.205177] [] entry_SYSCALL_64_fastpath+0x12/0x17
This can even be triggered manually by changing
/sys/module/zswap/parameters/compressor multiple times.
Fix this issue by making unregister APIs symmetric to the register so
there are no surprises.
Fixes: 47e627bc8c9a ("[PATCH] hotplug: Allow modules to use the cpu hotplug notifiers even if !CONFIG_HOTPLUG_CPU")
Reported-and-tested-by: Yu Zhao
Signed-off-by: Michal Hocko
Cc: linux-mm@kvack.org
Cc: Andrew Morton
Cc: Dan Streetman
Link: http://lkml.kernel.org/r/20161207135438.4310-1-mhocko@kernel.org
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
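The register/unregister asymmetry described in this commit can be illustrated with a small user-space model. This is a sketch only, not the kernel code: `struct notifier`, `struct chain`, `chain_register()`, `chain_unregister()` and `chain_length()` are hypothetical names standing in for the raw notifier chain helpers. The point is that any node the register path can put on the list must be removable through an equally unconditional unregister path before its memory is released.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a notifier block on a singly linked chain. */
struct notifier {
	int (*call)(unsigned long event);
	struct notifier *next;
};

struct chain {
	struct notifier *head;
};

/* Append a notifier to the end of the chain. */
void chain_register(struct chain *c, struct notifier *n)
{
	struct notifier **pos = &c->head;

	assert(n != NULL);
	while (*pos)
		pos = &(*pos)->next;
	n->next = NULL;
	*pos = n;
}

/*
 * Remove a notifier from the chain. This must be available whenever
 * chain_register() is; otherwise a caller that frees 'n' after a no-op
 * unregister leaves a stale pointer on the list.
 * Returns 0 on success, -1 if 'n' was not on the chain.
 */
int chain_unregister(struct chain *c, struct notifier *n)
{
	struct notifier **pos;

	for (pos = &c->head; *pos; pos = &(*pos)->next) {
		if (*pos == n) {
			*pos = n->next;
			n->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Count the notifiers currently on the chain. */
size_t chain_length(const struct chain *c)
{
	const struct notifier *n;
	size_t len = 0;

	for (n = c->head; n; n = n->next)
		len++;
	return len;
}
```

In this model a pool create/destroy cycle registers and unregisters the same node, and the chain length returns to its previous value each time, which is the invariant the fix restores.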
16 Oct, 2016
1 commit
-
Use distinctive name for cpu_hotplug.dep_map to avoid the actual
cpu_hotplug.lock appearing as cpu_hotplug.lock#2 in lockdep splats.
Signed-off-by: Joonas Lahtinen
Reviewed-by: Chris Wilson
Acked-by: Gautham R. Shenoy
Cc: Andrew Morton
Cc: Daniel Vetter
Cc: Gautham R . Shenoy
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: intel-gfx@lists.freedesktop.org
Cc: trivial@kernel.org
Signed-off-by: Ingo Molnar
04 Oct, 2016
2 commits
-
Pull CPU hotplug updates from Thomas Gleixner:
"Yet another batch of cpu hotplug core updates and conversions:
- Provide core infrastructure for multi instance drivers so the
  drivers do not have to keep custom lists.
- Convert custom lists to the new infrastructure. The block-mq custom
  list conversion comes through the block tree and makes the diffstat
  tip over to more lines removed than added.
- Handle unbalanced hotplug enable/disable calls more gracefully.
- Remove the obsolete CPU_STARTING/DYING notifier support.
- Convert another batch of notifier users.
The relayfs changes which conflicted with the conversion have been
shipped to me by Andrew.
The remaining lot is targeted for 4.10 so that we finally can remove
the rest of the notifiers"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
cpufreq: Fix up conversion to hotplug state machine
blk/mq: Reserve hotplug states for block multiqueue
x86/apic/uv: Convert to hotplug state machine
s390/mm/pfault: Convert to hotplug state machine
mips/loongson/smp: Convert to hotplug state machine
mips/octeon/smp: Convert to hotplug state machine
fault-injection/cpu: Convert to hotplug state machine
padata: Convert to hotplug state machine
cpufreq: Convert to hotplug state machine
ACPI/processor: Convert to hotplug state machine
virtio scsi: Convert to hotplug state machine
oprofile/timer: Convert to hotplug state machine
block/softirq: Convert to hotplug state machine
lib/irq_poll: Convert to hotplug state machine
x86/microcode: Convert to hotplug state machine
sh/SH-X3 SMP: Convert to hotplug state machine
ia64/mca: Convert to hotplug state machine
ARM/OMAP/wakeupgen: Convert to hotplug state machine
ARM/shmobile: Convert to hotplug state machine
arm64/FP/SIMD: Convert to hotplug state machine
...
-
Pull RCU updates from Ingo Molnar:
"The main changes in this cycle were:
- Expedited grace-period changes, most notably avoiding having user
  threads drive expedited grace periods, using a workqueue instead.
- Miscellaneous fixes, including a performance fix for lists that was
  sent with the lists modifications.
- CPU hotplug updates, most notably providing exact CPU-online
  tracking for RCU. This will in turn allow removal of the checks
  supporting RCU's prior heuristic that was based on the assumption
  that CPUs would take no longer than one jiffy to come online.
- Torture-test updates.
- Documentation updates"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
list: Expand list_first_entry_or_null()
torture: TOROUT_STRING(): Insert a space between flag and message
rcuperf: Consistently insert space between flag and message
rcutorture: Print out barrier error as document says
torture: Add task state to writer-task stall printk()s
torture: Convert torture_shutdown() to hrtimer
rcutorture: Convert to hotplug state machine
cpu/hotplug: Get rid of CPU_STARTING reference
rcu: Provide exact CPU-online tracking for RCU
rcu: Avoid redundant quiescent-state chasing
rcu: Don't use modular infrastructure in non-modular code
sched: Make wake_up_nohz_cpu() handle CPUs going offline
rcu: Use rcu_gp_kthread_wake() to wake up grace period kthreads
rcu: Use RCU's online-CPU state for expedited IPI retry
rcu: Exclude RCU-offline CPUs from expedited grace periods
rcu: Make expedited RCU CPU stall warnings respond to controls
rcu: Stop disabling expedited RCU CPU stall warnings
rcu: Drive expedited grace periods from workqueue
rcu: Consolidate expedited grace period machinery
documentation: Record reason for rcu_head two-byte alignment
...
07 Sep, 2016
3 commits
-
Install the callbacks via the state machine.
Signed-off-by: Richard Weinberger
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
Reviewed-by: Sebastian Andrzej Siewior
Cc: Peter Zijlstra
Cc: Pekka Enberg
Cc: linux-mm@kvack.org
Cc: rt@linutronix.de
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Andrew Morton
Cc: Christoph Lameter
Link: http://lkml.kernel.org/r/20160823125319.abeapfjapf2kfezp@linutronix.de
Signed-off-by: Thomas Gleixner
-
Install the callbacks via the state machine. They are installed at run time but
relay_prepare_cpu() does not need to be invoked by the boot CPU because
relay_open() was not yet invoked and there are no pools that need to be created.
Signed-off-by: Richard Weinberger
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
Reviewed-by: Sebastian Andrzej Siewior
Cc: Peter Zijlstra
Cc: rt@linutronix.de
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20160818125731.27256-3-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner
-
All users are converted to state machine, remove CPU_STARTING and the
corresponding CPU_DYING.
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
Cc: Peter Zijlstra
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160818125731.27256-2-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner
06 Sep, 2016
1 commit
-
We should have all names in the scheme "[subsys/]facility:state". Fix the
core to comply.
Signed-off-by: Thomas Gleixner
05 Sep, 2016
1 commit
-
Some compilers are unhappy with the anon union in the state array. Replace
it with a named union.
While at it align the state array initializers properly and add the missing
name tags.
Fixes: cf392d10b69e "cpu/hotplug: Add multi instance support"
Reported-by: Ingo Molnar
Reported-by: Fengguang Wu
Signed-off-by: Thomas Gleixner
Cc: rt@linutronix.de
03 Sep, 2016
3 commits
-
When cpu_hotplug_enable() is called unbalanced w/o a preceding
cpu_hotplug_disable() the code emits a warning, but happily decrements the
disabled counter. This causes the next operations to malfunction.
Prevent the decrement and just emit a warning.
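The guarded decrement can be sketched in user space as follows. This is an illustration under assumptions, not the kernel implementation: `hotplug_disabled`, `hotplug_disable()`, `hotplug_enable()` and `hotplug_is_disabled()` are hypothetical names, and the return value of `hotplug_enable()` exists only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model of the "hotplug disabled" nesting counter. */
static int hotplug_disabled;

/* Each disable call nests; hotplug stays off until all are undone. */
void hotplug_disable(void)
{
	hotplug_disabled++;
}

/*
 * Undo one disable. An unbalanced call only warns and leaves the
 * counter untouched, so later disable/enable pairs still work.
 * Returns 0 for a balanced call, -1 for an unbalanced one.
 */
int hotplug_enable(void)
{
	if (hotplug_disabled <= 0) {
		fprintf(stderr, "warning: unbalanced hotplug enable\n");
		return -1;
	}
	hotplug_disabled--;
	return 0;
}

/* Report whether hotplug operations are currently blocked. */
int hotplug_is_disabled(void)
{
	assert(hotplug_disabled >= 0);	/* counter must never go negative */
	return hotplug_disabled > 0;
}
```

Without the early return the counter would drop to -1 on an unbalanced enable, and a later single disable would bring it back to 0, leaving hotplug enabled when the caller expected it to be blocked.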
Signed-off-by: Lianwei Wang
Cc: peterz@infradead.org
Cc: linux-pm@vger.kernel.org
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/1465541008-12476-1-git-send-email-lianwei.wang@gmail.com
Signed-off-by: Thomas Gleixner
-
This patch adds the ability for a given state to have multiple
instances. Until now all states have a single instance and the startup /
teardown callbacks use global variables.
A few drivers need to perform the same callbacks on multiple
"instances". Currently we have three drivers in tree which all have a
global list which they iterate over. With multi-instance support they
don't need their private list and the functionality has been moved into
core code. Plus we hold the hotplug lock in core so no CPUs come or go
while instances are registered and we do rollback in error case :)
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/1471024183-12666-3-git-send-email-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner
-
This is preparation for the following patch.
This rework here changes the arguments of cpuhp_invoke_callback(). It
now passes `state' and whether the `startup' or `teardown' callback should
be invoked. The callback then is looked up by the function.
The following is a cleanup of callers:
- cpuhp_issue_call() has one argument less
- struct cpuhp_cpu_state (which is used by the hotplug thread) gets also
its callback removed. The decision if it is a single callback
invocation moved to the `single' variable. Also a `bringup' variable
has been added to distinguish between startup and teardown callback.
- take_cpu_down() needs to start one step earlier. We always get here
via CPUHP_TEARDOWN_CPU callback. Before that change cpuhp_ap_states +
CPUHP_TEARDOWN_CPU pointed to an empty entry because TEARDOWN is saved
in bp_states for this reason. Now that we use cpuhp_get_step() to
lookup the state we must explicitly skip it in order not to invoke it
  twice.
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/1471024183-12666-2-git-send-email-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner
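The idea of passing a state plus a direction flag, and letting the invoking function look up the callback itself, can be sketched as below. This is a simplified user-space model: `struct cb_step`, `cb_steps[]`, `cb_invoke_callback()` and the `CB_*` states are hypothetical and do not mirror the kernel's actual data structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-state machine for illustration only. */
enum cb_state { CB_OFFLINE, CB_PREPARE, CB_ONLINE, CB_NR_STATES };

struct cb_step {
	const char *name;
	int (*startup)(unsigned int cpu);
	int (*teardown)(unsigned int cpu);
};

/* Counters so the effect of each invocation can be observed. */
int cb_startup_calls;
int cb_teardown_calls;

static int cb_prepare_up(unsigned int cpu)
{
	(void)cpu;
	cb_startup_calls++;
	return 0;
}

static int cb_prepare_down(unsigned int cpu)
{
	(void)cpu;
	cb_teardown_calls++;
	return 0;
}

static const struct cb_step cb_steps[CB_NR_STATES] = {
	[CB_OFFLINE] = { .name = "offline" },
	[CB_PREPARE] = { .name = "demo:prepare",
			 .startup = cb_prepare_up,
			 .teardown = cb_prepare_down },
	[CB_ONLINE]  = { .name = "online" },
};

/*
 * Callers pass only the state and the direction; the callback pointer
 * is looked up here. A state without a callback is a successful no-op.
 */
int cb_invoke_callback(unsigned int cpu, enum cb_state state, int bringup)
{
	const struct cb_step *step;
	int (*cb)(unsigned int cpu);

	assert((int)state >= 0 && (int)state < CB_NR_STATES);
	step = &cb_steps[state];
	cb = bringup ? step->startup : step->teardown;
	return cb ? cb(cpu) : 0;
}
```

Keeping the lookup in one place means callers no longer carry function pointers around, which is what allows the per-CPU state to shrink to a state number and a bringup flag.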
26 Aug, 2016
1 commit
-
disable_nonboot_cpus() assumes that the lowest numbered online CPU is
the boot CPU, and that this is the correct CPU to run any power
management code on.
On x86 this is always correct, as CPU0 cannot (easily) be taken offline.
On arm64 CPU0 can be taken offline. For hibernate/resume this means we
may hibernate on a CPU other than CPU0. If the system is rebooted with
kexec 'CPU0' will be assigned to a different physical CPU. This
complicates hibernate/resume as now we can't trust the CPU numbers.
Arch code can find the correct physical CPU, and ensure it is online
before resume from hibernate begins, but also needs to influence
disable_nonboot_cpus()'s choice of CPU.
Rename disable_nonboot_cpus() as freeze_secondary_cpus() and add an
argument indicating which CPU should be left standing. Follow the logic
in migrate_to_reboot_cpu() to use the lowest numbered online CPU if the
requested CPU is not online.
Add disable_nonboot_cpus() as an inline function that has the existing
behaviour.
Cc: Rafael J. Wysocki
Reviewed-by: Thomas Gleixner
Signed-off-by: James Morse
Signed-off-by: Will Deacon
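The CPU selection rule from this commit message can be modelled in a few lines of user-space C. This is a sketch under assumptions: the online CPUs are represented as a plain bitmask, and `demo_pick_primary()`, `demo_freeze_secondary_cpus()` and `demo_disable_nonboot_cpus()` are hypothetical names that only imitate the described behaviour (use the requested CPU if it is online, otherwise fall back to the lowest numbered online CPU).

```c
#include <assert.h>

#define DEMO_NR_CPUS 8

/*
 * Pick the CPU that stays online: the requested one if it is online,
 * otherwise the lowest numbered online CPU. Returns -1 if no CPU in
 * the mask is online.
 */
int demo_pick_primary(unsigned int online_mask, int requested)
{
	int cpu;

	if (requested >= 0 && requested < DEMO_NR_CPUS &&
	    (online_mask & (1u << requested)))
		return requested;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (online_mask & (1u << cpu))
			return cpu;
	return -1;
}

/* Take every CPU except the chosen one offline; returns the new mask. */
unsigned int demo_freeze_secondary_cpus(unsigned int online_mask, int primary)
{
	int keep = demo_pick_primary(online_mask, primary);

	assert(keep >= 0);	/* at least one CPU must be online */
	return 1u << keep;
}

/* Wrapper keeping the old behaviour: prefer CPU 0 as the survivor. */
unsigned int demo_disable_nonboot_cpus(unsigned int online_mask)
{
	return demo_freeze_secondary_cpus(online_mask, 0);
}
```

With CPUs 1 to 3 online (mask 0xE) and CPU 0 requested, the fallback keeps CPU 1, which matches the "lowest numbered online CPU" rule the message describes.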
23 Aug, 2016
2 commits
-
CPU_STARTING is scheduled for removal. There is no use of it in drivers
and core code uses it only for compatibility with old-style CPU-hotplug
notifiers. This patch therefore removes CPU_STARTING from an
RCU-related comment.
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Paul E. McKenney
-
Up to now, RCU has assumed that the CPU-online process makes it from
CPU_UP_PREPARE to set_cpu_online() within one jiffy. Given the recent
rise of virtualized environments, this assumption is very clearly
obsolete. Failing to meet this deadline can result in RCU paying
attention to an incoming CPU for one jiffy, then ignoring it until the
grace period following the one in which that CPU sets itself online.
This situation might prove to be fatally disappointing to any RCU
read-side critical sections that had the misfortune to execute during
the time in which RCU was ignoring the slow-to-come-online CPU.
This commit therefore updates RCU's internal CPU state-tracking
information at notify_cpu_starting() time, thus providing RCU with
an exact transition of the CPU's state from offline to online.
Note that this means that incoming CPUs must not use RCU read-side
critical sections (other than those of SRCU) until notify_cpu_starting()
time. Note also that the CPU_STARTING notifiers -are- allowed to use
RCU read-side critical sections. (Of course, CPU-hotplug notifiers are
rapidly becoming obsolete, so you need to act fast!)
If a given architecture or CPU family needs to use RCU read-side
critical sections earlier, the call to rcu_cpu_starting() from
notify_cpu_starting() will need to be architecture-specific, with
architectures that need early use being required to hand-place
the call to rcu_cpu_starting() at some point preceding the call to
notify_cpu_starting().
Signed-off-by: Paul E. McKenney
10 Aug, 2016
1 commit
-
Now that Xen no longer allocates irqs in _cpu_up() we can restore
commit a89941816726 ("hotplug: Prevent alloc/free of irq descriptors during cpu up/down")
Signed-off-by: Boris Ostrovsky
Reviewed-by: Juergen Gross
Acked-by: Thomas Gleixner
Cc: Anna-Maria Gleixner
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Sebastian Andrzej Siewior
Cc: david.vrabel@citrix.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1470244948-17674-3-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar
29 Jul, 2016
1 commit
-
On the tear-down path, the dead CPU callback for the timers was
misplaced within the 'cpuhp_state' enumeration. There is a hidden
dependency between the timers and block multiqueue. The timers
callback must happen before the block multiqueue callback otherwise an
RCU stall occurs.
Move the timers callback to the proper place in the state machine.
Reported-and-tested-by: Jon Hunter
Reported-by: kbuild test robot
Fixes: 24f73b99716a ("timers/core: Convert to hotplug state machine")
Signed-off-by: Richard Cochran
Cc: Peter Zijlstra
Cc: Sebastian Andrzej Siewior
Cc: Rasmus Villemoes
Cc: John Stultz
Cc: rt@linutronix.de
Cc: Oleg Nesterov
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/1469610498-25914-1-git-send-email-rcochran@linutronix.de
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
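Why the position inside the state enumeration matters can be shown with a toy model in which teardown walks the states from high to low. The sketch below is an assumption-laden illustration, not the kernel enum: `enum ord_state`, `ORD_BLOCK_DEAD`, `ORD_TIMERS_DEAD` and the helper functions are hypothetical names chosen to mirror the timers-before-block ordering described in this commit.

```c
#include <assert.h>

/*
 * Hypothetical state ordering. Teardown walks from high to low, so a
 * state that must be torn down first has to sit higher in the enum.
 */
enum ord_state {
	ORD_OFFLINE,
	ORD_BLOCK_DEAD,		/* block layer cleanup, runs second on teardown */
	ORD_TIMERS_DEAD,	/* timer migration, runs first on teardown */
	ORD_ONLINE,
};

#define ORD_MAX_LOG 8

/* Record of the order in which states were visited. */
enum ord_state ord_log[ORD_MAX_LOG];
int ord_log_len;

void ord_record(enum ord_state s)
{
	assert(ord_log_len < ORD_MAX_LOG);
	ord_log[ord_log_len++] = s;
}

/* Visit states from 'from' down to, but not including, 'target'. */
void ord_teardown(enum ord_state from, enum ord_state target)
{
	int s;

	for (s = (int)from; s > (int)target; s--)
		ord_record((enum ord_state)s);
}

/* Visit states above 'from' up to and including 'target'. */
void ord_bringup(enum ord_state from, enum ord_state target)
{
	int s;

	for (s = (int)from + 1; s <= (int)target; s++)
		ord_record((enum ord_state)s);
}
```

Swapping the two middle enumerators in this model reverses their teardown order, which is the kind of misplacement the commit corrects.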
15 Jul, 2016
4 commits
-
Straightforward conversion to the state machine. Though the question arises
whether this really needs all these state transitions to work.
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.982013161@linutronix.de
Signed-off-by: Ingo Molnar
-
Install the callbacks via the state machine. They are installed at runtime so
smpcfd_prepare_cpu() needs to be invoked by the boot-CPU.
Signed-off-by: Richard Weinberger
[ Added the dropped CPU dying case back in. ]
Signed-off-by: Richard Cochran
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: Davidlohr Bueso
Cc: Linus Torvalds
Cc: Mel Gorman
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rasmus Villemoes
Cc: Thomas Gleixner
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.818376366@linutronix.de
Signed-off-by: Ingo Molnar
-
When tearing down, call timers_dead_cpu() before notify_dead().
There is a hidden dependency between:
- timers
- block multiqueue
- rcutree
If timers_dead_cpu() comes later than blk_mq_queue_reinit_notify()
that latter function causes an RCU stall.
Signed-off-by: Richard Cochran
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: John Stultz
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rasmus Villemoes
Cc: Thomas Gleixner
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.566790058@linutronix.de
Signed-off-by: Ingo Molnar
-
Split out the clockevents callbacks instead of piggybacking them on
hrtimers.
This gets rid of a POST_DEAD user. See commit:
54e88fad223c ("sched: Make sure timers have migrated before killing the migration_thread")
We just move the callback state to the proper place in the state machine.
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rasmus Villemoes
Cc: Rusty Russell
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.485419196@linutronix.de
Signed-off-by: Ingo Molnar
14 Jul, 2016
3 commits
-
Get rid of the prio ordering of the separate notifiers and use a proper state
callback pair.
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Acked-by: Tejun Heo
Cc: Andrew Morton
Cc: Lai Jiangshan
Cc: Linus Torvalds
Cc: Nicolas Iooss
Cc: Oleg Nesterov
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Rasmus Villemoes
Cc: Rusty Russell
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153335.197083890@linutronix.de
Signed-off-by: Ingo Molnar
-
Actually a nice symmetric startup/teardown pair which fits properly into
the state machine concept. In the long run we should be able to invoke
the startup callback for the boot CPU via the state machine and get
rid of the init function which invokes it on the boot CPU.
Note: This comes actually before the perf hardware callbacks. In the notifier
model the hardware callbacks have a higher priority than the core
callback. But that's solely for CPU offline so that hardware migration of
events happens before the core is notified about the outgoing CPU.
With the symmetric state array model we have the following ordering:
UP: core -> hardware
DOWN: hardware -> core
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Siewior
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Rasmus Villemoes
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153333.587514098@linutronix.de
Signed-off-by: Ingo Molnar
-
We switched the hotplug machinery to smpboot threads. Early registration of
hotplug callbacks, i.e. from do_pre_smp_initcalls(), happens before the
threads are initialized. Instead of moving the thread init, we simply handle
it in the hotplug code itself and invoke the function directly.
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153332.896450738@linutronix.de
Signed-off-by: Ingo Molnar
13 Jul, 2016
1 commit
-
Xiaolong Ye reported lock debug warnings triggered by the following commit:
8de4a0066106 ("perf/x86: Convert the core to the hotplug state machine")
The bug is the following: the cpuhp_bp_states[] array is cut short when
CONFIG_SMP=n, but the dynamically registered callbacks are stored nevertheless
and happily scribble outside of the array bounds...
We need to store them in case that the state is unregistered so we can invoke
the teardown function. That's independent of CONFIG_SMP. Make sure the array
is large enough.
Reported-by: kernel test robot
Signed-off-by: Thomas Gleixner
Cc: Adam Borowski
Cc: Alexander Shishkin
Cc: Anna-Maria Gleixner
Cc: Arnaldo Carvalho de Melo
Cc: Borislav Petkov
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Sebastian Andrzej Siewior
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: lkp@01.org
Cc: stable@vger.kernel.org
Cc: tipbuild@zytor.com
Fixes: cff7d378d3fd "cpu/hotplug: Convert to a state machine for the control processor"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1607122144560.4083@nanos
Signed-off-by: Ingo Molnar
06 May, 2016
5 commits
-
The scheduler can handle per cpu threads before the cpu is set to active and
it does not allow user space threads on the cpu before active is
set. Attaching to the scheduling domains is also not required before user
space threads can be handled.
Move the activation to the end of the hotplug state space. That also means
that deactivation is the first action when a cpu is shut down.
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160310120025.597477199@linutronix.de
Signed-off-by: Thomas Gleixner
-
Remove the hotplug notifier and make it an explicit state.
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160310120025.502222097@linutronix.de
Signed-off-by: Thomas Gleixner
-
The sync_rcu stuff is specifically for clearing bits in the active
mask, such that everybody will observe the bit cleared and will not
consider the cleared CPU for load-balancing etc.
Signed-off-by: Peter Zijlstra (Intel)
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160310120025.169219710@linutronix.de
Signed-off-by: Thomas Gleixner
-
Now that we reduced everything into single notifiers, it's simple to move them
into the hotplug state machine space.
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Cc: rt@linutronix.de
Signed-off-by: Thomas Gleixner
-
Start disentangling the maze of hotplug notifiers in the scheduler.
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Cc: rt@linutronix.de
Signed-off-by: Thomas Gleixner
22 Apr, 2016
1 commit
-
The recent introduction of the hotplug thread which invokes the callbacks on
the plugged cpu caused the following regression:
If takedown_cpu() fails, then we run into several issues:
1) The rollback of the target cpu states is not invoked. That leaves the smp
   threads and the hotplug thread in disabled state.
2) notify_online() is executed due to a missing skip_onerr flag. That causes
   that both CPU_DOWN_FAILED and CPU_ONLINE notifications are invoked which
   confuses quite some notifiers.
3) The CPU_DOWN_FAILED notification is not invoked on the target CPU. That's
   not an issue per se, but it is inconsistent and in consequence blocks the
   patches which rely on these states being invoked on the target CPU and not
   on the controlling cpu. It also does not preserve the strict call order on
   rollback which is problematic for the ongoing state machine conversion as
   well.
To fix this we add a rollback flag to the remote callback machinery and invoke
the rollback including the CPU_DOWN_FAILED notification on the remote
cpu. Further mark the notify online state with 'skip_onerr' so we don't get a
double invocation.
This workaround will go away once we moved the unplug invocation to the target
cpu itself.
[ tglx: Massaged changelog and moved the CPU_DOWN_FAILED notification to the
  target cpu ]
Fixes: 4cb28ced23c4 ("cpu/hotplug: Create hotplug threads")
Reported-by: Heiko Carstens
Signed-off-by: Sebastian Andrzej Siewior
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Cc: Martin Schwidefsky
Cc: Anna-Maria Gleixner
Link: http://lkml.kernel.org/r/20160408124015.GA21960@linutronix.de
Signed-off-by: Thomas Gleixner
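The rollback requirement (undo a partially completed teardown in strict reverse order) can be sketched as follows. This is a simplified user-space model, not the remote callback machinery the commit touches: `rb_fail_at`, `rb_up[]`, `rb_teardown()`, `rb_startup()` and `rb_cpu_down()` are hypothetical names, and states are plain integers.

```c
#include <assert.h>

#define RB_NR_STATES 4

int rb_fail_at = -1;		/* state whose teardown fails, -1 for none */
int rb_up[RB_NR_STATES];	/* 1 while a state's startup side is in effect */

/* Tear one state down; fails for the state selected by rb_fail_at. */
int rb_teardown(int state)
{
	if (state == rb_fail_at)
		return -1;
	rb_up[state] = 0;
	return 0;
}

/* Bring one state back up. */
void rb_startup(int state)
{
	rb_up[state] = 1;
}

/*
 * Walk from 'cur' down to 'target'. If a teardown step fails, redo the
 * startup side of every state already torn down, in the reverse of the
 * order they were torn down, and report the original state.
 * Returns the state the CPU ends up in.
 */
int rb_cpu_down(int cur, int target)
{
	int start = cur;
	int s;

	assert(target >= 0 && cur < RB_NR_STATES && target <= cur);
	for (; cur > target; cur--) {
		if (rb_teardown(cur) != 0) {
			for (s = cur + 1; s <= start; s++)
				rb_startup(s);
			return start;
		}
	}
	return target;
}
```

After a failed walk every state in this model is back to its pre-teardown condition, which is the property the missing rollback violated for the smp threads and the hotplug thread.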
13 Mar, 2016
1 commit
-
Requested-by: Peter Zijlstra
Signed-off-by: Thomas Gleixner
11 Mar, 2016
1 commit
-
Commit 931ef163309e moved the smpboot thread park/unpark invocation to the
state machine. The move of the unpark invocation was premature as it depends
on work in progress patches.
As a result cpu down can fail, because rcu synchronization in takedown_cpu()
eventually requires a functional softirq thread. I never encountered the
problem in testing, but 0day testing managed to provide a reliable reproducer.
Remove the smpboot_threads_park() call from the state machine for now and put
it back into the original place after the rcu synchronization.
I'm embarrassed as I knew about the dependency and still managed to get it
wrong. Hotplug induced brain melt seems to be the only sensible explanation
for that.
Fixes: 931ef163309e "cpu/hotplug: Unpark smpboot threads from the state machine"
Reported-by: Fengguang Wu
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
06 Mar, 2016
1 commit
-
The check for the AP range in cpuhp_is_ap_state() is redundant after commit
8df3e07e7f21 "cpu/hotplug: Let upcoming cpu bring itself fully up" because all
states above CPUHP_BRINGUP_CPU are invoked on the hotplugged cpu. Remove it.
Reported-by: Richard Cochran
Signed-off-by: Thomas Gleixner
03 Mar, 2016
1 commit
-
Paul noticed that the conversion of the death reporting introduced a race
where the outgoing cpu might be delayed after waking the control processor,
so it might not be able to call rcu_report_dead() before being physically
removed, leading to RCU stalls.
We can't call complete after rcu_report_dead(), so instead of going back to
busy polling, simply issue a function call to do the completion.
Fixes: 27d50c7eeb0f "rcu: Make CPU_DYING_IDLE an explicit call"
Reported-by: Paul E. McKenney
Link: http://lkml.kernel.org/r/20160302201127.GA23440@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
02 Mar, 2016
4 commits
-
Make the RCU CPU_DYING_IDLE callback an explicit function call, so it gets
invoked at the proper place.
Signed-off-by: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel
Cc: Rafael Wysocki
Cc: "Srivatsa S. Bhat"
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sebastian Siewior
Cc: Rusty Russell
Cc: Steven Rostedt
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Andrew Morton
Cc: Paul McKenney
Cc: Linus Torvalds
Cc: Paul Turner
Link: http://lkml.kernel.org/r/20160226182341.870167933@linutronix.de
Signed-off-by: Thomas Gleixner
-
Kill the busy spinning on the control side and just wait for the hotplugged
cpu to tell that it reached the dead state.
Signed-off-by: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel
Cc: Rafael Wysocki
Cc: "Srivatsa S. Bhat"
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sebastian Siewior
Cc: Rusty Russell
Cc: Steven Rostedt
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Andrew Morton
Cc: Paul McKenney
Cc: Linus Torvalds
Cc: Paul Turner
Link: http://lkml.kernel.org/r/20160226182341.776157858@linutronix.de
Signed-off-by: Thomas Gleixner
-
Let the upcoming cpu kick the hotplug thread and let itself complete the
bringup. That way the control side can just wait for the completion or later
when we made the hotplug machinery async not care at all.
Signed-off-by: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel
Cc: Rafael Wysocki
Cc: "Srivatsa S. Bhat"
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sebastian Siewior
Cc: Rusty Russell
Cc: Steven Rostedt
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Andrew Morton
Cc: Paul McKenney
Cc: Linus Torvalds
Cc: Paul Turner
Link: http://lkml.kernel.org/r/20160226182341.697655464@linutronix.de
Signed-off-by: Thomas Gleixner
-
Let the hotplugged cpu invoke the setup/teardown callbacks
(CPU_ONLINE/CPU_DOWN_PREPARE) itself.
Signed-off-by: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel
Cc: Rafael Wysocki
Cc: "Srivatsa S. Bhat"
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sebastian Siewior
Cc: Rusty Russell
Cc: Steven Rostedt
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Andrew Morton
Cc: Paul McKenney
Cc: Linus Torvalds
Cc: Paul Turner
Link: http://lkml.kernel.org/r/20160226182341.536364371@linutronix.de
Signed-off-by: Thomas Gleixner