26 Sep, 2017

6 commits

  • Add a sysfs file to one-time fail a specific state. This can be used
    to test the state rollback code paths.

    Something like this (hotplug-up.sh):

    #!/bin/bash

    echo 0 > /debug/sched_debug
    echo 1 > /debug/tracing/events/cpuhp/enable

    ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
    STATES=${1:-$ALL_STATES}

    for state in $STATES
    do
        echo 0 > /sys/devices/system/cpu/cpu1/online
        echo 0 > /debug/tracing/trace
        echo "Fail state: $state"
        echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
        cat /sys/devices/system/cpu/cpu1/hotplug/fail
        echo 1 > /sys/devices/system/cpu/cpu1/online

        cat /debug/tracing/trace > hotfail-${state}.trace

        sleep 1
    done

    Can be used to test for all possible rollback (barring multi-instance)
    scenarios on CPU-up; CPU-down is a trivial modification of the above.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org

    Peter Zijlstra
     
  • With lockdep-crossrelease we get deadlock reports that span cpu-up and
    cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
    and cpu-down are globally serialized.

    takedown_cpu()
      irq_lock_sparse()
      wait_for_completion(&st->done)

                            cpuhp_thread_fun
                              cpuhp_up_callback
                                cpuhp_invoke_callback
                                  irq_affinity_online_cpu
                                    irq_lock_sparse()
                                    irq_unlock_sparse()

                            complete(&st->done)

    Now that we have consistent AP state, we can trivially separate the
    AP completion between up and down using st->bringup.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Acked-by: max.byungchul.park@gmail.com
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org

    Peter Zijlstra
     
  • With lockdep-crossrelease we get deadlock reports that span cpu-up and
    cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
    and cpu-down are globally serialized.

    CPU0                   CPU1                    CPU2
    cpuhp_up_callbacks:    takedown_cpu:           cpuhp_thread_fun:

    cpuhp_state
    irq_lock_sparse()
                           irq_lock_sparse()
                           wait_for_completion()
                                                   cpuhp_state
                                                   complete()

    Now that we have consistent AP state, we can trivially separate the
    AP-work class between up and down using st->bringup.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: max.byungchul.park@gmail.com
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org

    Peter Zijlstra
     
  • While the generic callback functions have an 'int' return and thus
    appear to be allowed to return error, this is not true for all states.

    Specifically, what used to be STARTING/DYING are run with IRQs
    disabled from critical parts of CPU bringup/teardown and are not
    allowed to fail. Add WARNs to enforce this rule.

    But since some callbacks are indeed allowed to fail, we can end up in
    a situation where a state-machine rollback itself encounters a
    failure; in that case we're stuck, unable to go forward and unable to
    go back. Also add a WARN for that case.

    AFAICT this is a fundamental 'problem' with no real obvious solution.
    We want the 'prepare' callbacks to allow failure on either up or down.
    Typically on prepare-up this would be things like -ENOMEM from
    resource allocations, and the typical usage in prepare-down would be
    something like -EBUSY to avoid CPUs being taken away.
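
    As a rough illustration of the distinction (names are hypothetical,
    not from this patch): a prepare-stage callback may fail, a
    STARTING-stage callback must not:

    #include <linux/cpuhotplug.h>
    #include <linux/percpu.h>
    #include <linux/slab.h>

    static DEFINE_PER_CPU(void *, foo_buf);

    /* PREPARE-stage callback: runs on the control CPU and may fail. */
    static int foo_prepare_cpu(unsigned int cpu)
    {
        void *buf = kzalloc(PAGE_SIZE, GFP_KERNEL);

        if (!buf)
            return -ENOMEM;     /* propagates and triggers rollback */
        per_cpu(foo_buf, cpu) = buf;
        return 0;
    }

    /* STARTING-stage callback: runs with IRQs disabled on the new CPU
     * during early bringup and is not allowed to fail. */
    static int foo_starting_cpu(unsigned int cpu)
    {
        /* enable per-cpu hardware here: no allocation, no sleeping */
        return 0;
    }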

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org

    Peter Zijlstra
     
  • There is currently no explicit state change on rollback. That is,
    st->bringup, st->rollback and st->target are not consistent when doing
    the rollback.

    Rework the AP state handling to be more coherent. This does mean we
    have to do a second AP kick-and-wait for rollback, but since rollback
    is the slow path of a slowpath, this really should not matter.

    Take this opportunity to simplify the AP thread function to only run a
    single callback per invocation. This unifies the three single/up/down
    modes it supports. The looping it used to do for up/down is achieved
    by retaining should_run and relying on the main smpboot_thread_fn()
    loop.

    (I have most of a patch that does the same for the BP state handling,
    but that's not critical and gets a little complicated because
    CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
    recursive @st usage; I still have to de-fugly that.)

    [ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
    avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
    one piece instead of having two consecutive ifdef sections of the
    same type. ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org

    Peter Zijlstra
     
  • Currently the rollback of multi-instance states is handled inside
    cpuhp_invoke_callback(). The problem is that when we want to allow an
    explicit state change for rollback, we need to return from the
    function without doing the rollback.

    Change cpuhp_invoke_callback() to optionally return the multi-instance
    state, such that rollback can be done from a subsequent call.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org

    Peter Zijlstra
     

26 Jul, 2017

1 commit

  • RCU callbacks must be migrated away from an outgoing CPU, and this is
    done near the end of the CPU-hotplug operation, after the outgoing CPU is
    long gone. Unfortunately, this means that other CPU-hotplug callbacks
    can execute while the outgoing CPU's callbacks are still immobilized
    on the long-gone CPU's callback lists. If any of these CPU-hotplug
    callbacks must wait, either directly or indirectly, for the invocation
    of any of the immobilized RCU callbacks, the system will hang.

    This commit avoids such hangs by migrating the callbacks away from the
    outgoing CPU immediately upon its departure, shortly after the return
    from __cpu_die() in takedown_cpu(). Thus, RCU is able to advance these
    callbacks and invoke them, which allows all the after-the-fact CPU-hotplug
    callbacks to wait on these RCU callbacks without risk of a hang.

    While in the neighborhood, this commit also moves rcu_send_cbs_to_orphanage()
    and rcu_adopt_orphan_cbs() under a pre-existing #ifdef to avoid including
    dead code on the one hand and to avoid define-without-use warnings on the
    other hand.

    Reported-by: Jeffrey Hugo
    Link: http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org
    Signed-off-by: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Cc: Ingo Molnar
    Cc: Anna-Maria Gleixner
    Cc: Boris Ostrovsky
    Cc: Richard Weinberger

    Paul E. McKenney
     

20 Jul, 2017

1 commit

  • If cpuhp_store_callbacks() is called for CPUHP_AP_ONLINE_DYN or
    CPUHP_BP_PREPARE_DYN, which are the indicators for dynamically allocated
    states, then cpuhp_store_callbacks() allocates a new dynamic state. The
    first allocation in each range returns CPUHP_AP_ONLINE_DYN or
    CPUHP_BP_PREPARE_DYN.

    If cpuhp_remove_state() is invoked for one of these states, then there is
    no protection against the allocation mechanism. So the removal, which
    should clear the callbacks and the name, gets a new state assigned and
    clears that one.

    As a consequence the state which should be cleared stays initialized. A
    consecutive CPU hotplug operation dereferences the state callbacks and
    accesses either freed or reused memory, resulting in crashes.

    Add a protection against this by checking the name argument for NULL. If
    it's NULL, it's a removal. If not, it's an allocation.
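
    For reference, a typical dynamic-state user looks roughly like this
    (callback and state names hypothetical); the positive return value is
    the allocated state, which must be handed back to
    cpuhp_remove_state() on the removal path:

    #include <linux/cpuhotplug.h>
    #include <linux/init.h>

    static enum cpuhp_state foo_state;

    static int foo_online(unsigned int cpu)  { return 0; }
    static int foo_offline(unsigned int cpu) { return 0; }

    static int __init foo_init(void)
    {
        int ret;

        /* CPUHP_AP_ONLINE_DYN allocates a free state in the dynamic range */
        ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "foo:online",
                                foo_online, foo_offline);
        if (ret < 0)
            return ret;
        foo_state = ret;        /* the state actually allocated */
        return 0;
    }

    static void __exit foo_exit(void)
    {
        /* Removal passes a NULL name internally; it must never be
         * mistaken for a fresh allocation. */
        cpuhp_remove_state(foo_state);
    }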

    [ tglx: Added a comment and massaged changelog ]

    Fixes: 5b7aa87e0482 ("cpu/hotplug: Implement setup/removal interface")
    Signed-off-by: Ethan Barnes
    Signed-off-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "Srivatsa S. Bhat"
    Cc: Sebastian Siewior
    Cc: Paul McKenney
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/DM2PR04MB398242FC7776D603D9F99C894A60@DM2PR04MB398.namprd04.prod.outlook.com

    Ethan Barnes
     

12 Jul, 2017

1 commit

  • The move of the unpark functions to the control thread moved the BUG_ON()
    there as well. While it made some sense in the idle thread of the upcoming
    CPU, it's bogus to crash the control thread on the already online CPU,
    especially as the function has a return value and the callsite is prepared
    to handle an error return.

    Replace it with a WARN_ON_ONCE() and return a proper error code.

    Fixes: 9cd4f1a4e7a8 ("smp/hotplug: Move unparking of percpu threads to the control CPU")
    Rightfully-ranted-at-by: Linus Torvalds
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

06 Jul, 2017

1 commit

  • Vikram reported the following backtrace:

    BUG: scheduling while atomic: swapper/7/0/0x00000002
    CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.32-perf+ #680
    schedule
    schedule_hrtimeout_range_clock
    schedule_hrtimeout
    wait_task_inactive
    __kthread_bind_mask
    __kthread_bind
    __kthread_unpark
    kthread_unpark
    cpuhp_online_idle
    cpu_startup_entry
    secondary_start_kernel

    He analyzed correctly that a parked cpu hotplug thread of an offlined CPU
    was still on the runqueue when the CPU came back online and tried to unpark
    it. This causes the thread which invoked kthread_unpark() to call
    wait_task_inactive() and subsequently schedule() with preemption disabled.
    His proposed workaround was to "make sure" that a parked thread has
    scheduled out when the CPU goes offline, so the situation cannot happen.

    But that's still wrong, because the root cause is neither the fact
    that the percpu thread is still on the runqueue nor that preemption is
    disabled; the latter could simply be solved by enabling preemption
    before calling kthread_unpark().

    The real issue is that the calling thread is the idle task of the upcoming
    CPU, which is not supposed to call anything which might sleep. The moron,
    who wrote that code, missed completely that kthread_unpark() might end up
    in schedule().

    The solution is simpler than expected. The thread which controls the
    hotplug operation is waiting for the CPU to call complete() on the hotplug
    state completion. So the idle task of the upcoming CPU can set its state to
    CPUHP_AP_ONLINE_IDLE and invoke complete(). This in turn wakes the control
    task on a different CPU, which then can safely do the unpark and kick the
    now unparked hotplug thread of the upcoming CPU to complete the bringup to
    the final target state.

    Control CPU                        AP

    bringup_cpu();
      __cpu_up()  ------------>
                                       bringup_ap();
      bringup_wait_for_ap()
        wait_for_completion();
                                       cpuhp_online_idle();
                  <------------        complete();
        unpark(AP->stopper);
        unpark(AP->hotplugthread);
                                       while(1)
                                         do_idle();
        kick(AP->hotplugthread);
        wait_for_completion();         hotplug_thread()
                                         run_online_callbacks();
                                         complete();
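
    The resulting idle-side hook is, in rough sketch form (structure and
    field names per the description above; details may differ):

    /* Called from the idle loop of the upcoming CPU. It must not sleep,
     * so it only marks the state and completes; the control CPU then
     * performs the unpark from a context that is allowed to block. */
    void cpuhp_online_idle(enum cpuhp_state state)
    {
        struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);

        /* Nothing to do for the boot CPU */
        if (state != CPUHP_AP_ONLINE_IDLE)
            return;

        st->state = CPUHP_AP_ONLINE_IDLE;
        complete(&st->done);    /* wakes bringup_wait_for_ap() */
    }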

    Fixes: 8df3e07e7f21 ("cpu/hotplug: Let upcoming cpu bring itself fully up")
    Reported-by: Vikram Mulukutla
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Sebastian Sewior
    Cc: Rusty Russell
    Cc: Tejun Heo
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707042218020.2131@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

04 Jul, 2017

1 commit

  • Pull SMP hotplug updates from Thomas Gleixner:
    "This update is primarily a cleanup of the CPU hotplug locking code.

    The hotplug locking mechanism is an open coded RWSEM, which allows
    recursive locking. The main problem with that is the recursive nature
    as it evades the full lockdep coverage and hides potential deadlocks.

    The rework replaces the open coded RWSEM with a percpu RWSEM and
    establishes full lockdep coverage that way.

    The bulk of the changes fix up recursive locking issues and address
    the now fully reported potential deadlocks all over the place. Some of
    these deadlocks have been observed in the RT tree, but on mainline the
    probability was low enough to hide them away."

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    cpu/hotplug: Constify attribute_group structures
    powerpc: Only obtain cpu_hotplug_lock if called by rtasd
    ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init
    cpu/hotplug: Remove unused check_for_tasks() function
    perf/core: Don't release cred_guard_mutex if not taken
    cpuhotplug: Link lock stacks for hotplug callbacks
    acpi/processor: Prevent cpu hotplug deadlock
    sched: Provide is_percpu_thread() helper
    cpu/hotplug: Convert hotplug locking to percpu rwsem
    s390: Prevent hotplug rwsem recursion
    arm: Prevent hotplug rwsem recursion
    arm64: Prevent cpu hotplug rwsem recursion
    kprobes: Cure hotplug lock ordering issues
    jump_label: Reorder hotplug lock and jump_label_lock
    perf/tracing/cpuhotplug: Fix locking order
    ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
    PCI: Replace the racy recursion prevention
    PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
    perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
    x86/perf: Drop EXPORT of perf_check_microcode
    ...

    Linus Torvalds
     

30 Jun, 2017

1 commit

    attribute_groups are not supposed to change at runtime. All functions
    working with attribute_groups provided by <linux/sysfs.h> work with
    const attribute_group.

    So mark the non-const structs as const:

    File size before:
       text    data  bss    dec   hex  filename
      12582   15361   20  27963  6d3b  kernel/cpu.o

    File size after adding 'const':
       text    data  bss    dec   hex  filename
      12710   15265   20  27995  6d5b  kernel/cpu.o
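
    The change itself is mechanical; sketched here on the hotplug sysfs
    attributes (member details elided). Marking the struct const moves it
    from .data to .rodata, which is why text grows while data shrinks:

    /* Before: writable at runtime, lands in .data */
    static struct attribute_group cpuhp_cpu_attr_group = {
        .attrs = cpuhp_cpu_attrs,
        .name = "hotplug",
    };

    /* After: const, placed in .rodata */
    static const struct attribute_group cpuhp_cpu_attr_group = {
        .attrs = cpuhp_cpu_attrs,
        .name = "hotplug",
    };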

    Signed-off-by: Arvind Yadav
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: anna-maria@linutronix.de
    Cc: bigeasy@linutronix.de
    Cc: boris.ostrovsky@oracle.com
    Cc: rcochran@linutronix.de
    Link: http://lkml.kernel.org/r/f9079e94e12b36d245e7adbf67d312bc5d0250c6.1498737970.git.arvind.yadav.cs@gmail.com
    Signed-off-by: Ingo Molnar

    Arvind Yadav
     

23 Jun, 2017

1 commit

    If a CPU goes offline, interrupts affine to the CPU are moved away. If
    the outgoing CPU is the last CPU in the affinity mask, the migration
    code breaks the affinity and sets it to all online CPUs.

    This is a problem for affinity managed interrupts, as CPU hotplug is
    often used for power management purposes. If the affinity is broken,
    the interrupt is no longer affine to the CPUs to which it was
    allocated.

    The affinity spreading makes it possible to lay out multi queue
    devices so that they are assigned to a single CPU or a group of CPUs.
    If the last CPU goes offline, the queue is no longer used, so the
    interrupt can be shut down gracefully and parked until one of the
    assigned CPUs comes online again.

    Add a graceful shutdown mechanism into the irq affinity breaking code path,
    mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.

    In the online path, scan the active interrupts for managed interrupts.
    If such an interrupt is functional and the newly online CPU is part of
    the affinity mask, restart it if it is marked MANAGED_SHUTDOWN, or, if
    it is already started up, try to add the CPU back to the effective
    affinity mask.

    Originally-by: Christoph Hellwig
    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: Marc Zyngier
    Cc: Michael Ellerman
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de

    Thomas Gleixner
     

13 Jun, 2017

1 commit

  • clang -Wunused-function found one remaining function that was
    apparently meant to be removed in a recent code cleanup:

    kernel/cpu.c:565:20: warning: unused function 'check_for_tasks' [-Wunused-function]

    Sebastian explained: the function became unused unintentionally, but
    there is already a failure check when a task cannot be removed from
    the outgoing CPU in the scheduler code, so bringing it back would not
    add any value.

    Fixes: 530e9b76ae8f ("cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions")
    Signed-off-by: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: Boris Ostrovsky
    Cc: Anna-Maria Gleixner
    Link: http://lkml.kernel.org/r/20170608085544.2257132-1-arnd@arndb.de
    Signed-off-by: Thomas Gleixner

    Arnd Bergmann
     

03 Jun, 2017

1 commit

    If a custom CPU target is specified and that one is not available _or_
    can't be interrupted, then the code returns to userland without
    dropping a lock, as noticed by lockdep:

    |echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target
    | ================================================
    | [ BUG: lock held when returning to user space! ]
    | ------------------------------------------------
    | bash/503 is leaving the kernel with locks still held!
    | 1 lock held by bash/503:
    | #0: (device_hotplug_lock){+.+...}, at: [] lock_device_hotplug_sysfs+0x10/0x40

    So release the lock then.
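
    A minimal sketch of the corrected pattern in the sysfs store path
    (the validation and transition helpers here are hypothetical; only
    the lock/unlock calls are the real device core API):

    static ssize_t target_store_sketch(const char *buf, size_t count)
    {
        int ret, target;

        ret = kstrtoint(buf, 10, &target);
        if (ret)
            return ret;

        ret = lock_device_hotplug_sysfs();
        if (ret)
            return ret;

        if (!cpuhp_target_valid(target)) {      /* hypothetical check */
            ret = -EINVAL;
            goto out;   /* previously returned here with the lock held */
        }

        ret = do_cpu_target_transition(target); /* hypothetical */
    out:
        unlock_device_hotplug();
        return ret ? ret : count;
    }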

    Fixes: 757c989b9994 ("cpu/hotplug: Make target state writeable")
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de

    Sebastian Andrzej Siewior
     

26 May, 2017

6 commits

  • The CPU hotplug callbacks are not covered by lockdep versus the cpu hotplug
    rwsem.

    CPU0                                 CPU1
    cpuhp_setup_state(STATE, startup, teardown);
      cpus_read_lock();
      invoke_callback_on_ap();
        kick_hotplug_thread(ap);
        wait_for_completion();           hotplug_thread_fn()
                                           lock(m);
                                           do_stuff();
                                           unlock(m);

    Lockdep does not know about this dependency and will not trigger on the
    following code sequence:

    lock(m);
    cpus_read_lock();

    Add a lockdep map and connect the initiator's lock chain with the
    hotplug thread's lock chain, so potential deadlocks can be detected.
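
    The mechanism is roughly a shared lockdep map acquired on both sides
    (a sketch of the idea, not the exact patch; the wrapper functions are
    illustrative):

    #include <linux/lockdep.h>

    static struct lockdep_map cpuhp_state_lock_map =
        STATIC_LOCKDEP_MAP_INIT("cpuhp_state", &cpuhp_state_lock_map);

    /* Initiator side: wraps "kick AP thread and wait". */
    static void cpuhp_issue_call_sketch(void)
    {
        lock_map_acquire(&cpuhp_state_lock_map);
        /* kick_hotplug_thread(ap); wait_for_completion(...); */
        lock_map_release(&cpuhp_state_lock_map);
    }

    /* Hotplug thread side: wraps the callback invocation, so any mutex
     * taken by a callback is chained after cpuhp_state_lock_map. */
    static void cpuhp_thread_fun_sketch(void)
    {
        lock_map_acquire(&cpuhp_state_lock_map);
        /* cpuhp_invoke_callback(...); */
        lock_map_release(&cpuhp_state_lock_map);
    }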

    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081549.709375845@linutronix.de

    Thomas Gleixner
     
  • There are no more (known) nested calls to get_online_cpus() and all
    observed lock ordering problems have been addressed.

    Replace the magic nested 'rwsem' hackery with a percpu-rwsem.
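
    After the conversion, the locking boils down to something like this
    (a sketch consistent with the description above):

    #include <linux/percpu-rwsem.h>

    DEFINE_STATIC_PERCPU_RWSEM(cpu_hotplug_lock);

    void cpus_read_lock(void)
    {
        percpu_down_read(&cpu_hotplug_lock);
    }

    void cpus_read_unlock(void)
    {
        percpu_up_read(&cpu_hotplug_lock);
    }

    void cpus_write_lock(void)
    {
        percpu_down_write(&cpu_hotplug_lock);
    }

    void cpus_write_unlock(void)
    {
        percpu_up_write(&cpu_hotplug_lock);
    }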

    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081549.447014063@linutronix.de

    Thomas Gleixner
     
  • takedown_cpu() is a cpu hotplug function invoking stop_machine(). The cpu
    hotplug machinery holds the hotplug lock for write.

    stop_machine() invokes get_online_cpus() as well. This is correct, but
    prevents the conversion of the hotplug locking to a percpu rwsem.

    Use stop_machine_cpuslocked() to avoid the nested call.
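
    The call site change in takedown_cpu() is then roughly:

    /* Before: nested read-lock of the hotplug lock via stop_machine() */
    err = stop_machine(take_cpu_down, NULL, cpumask_of(cpu));

    /* After: takedown_cpu() already holds cpu_hotplug_lock for write */
    err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));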

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081548.423292433@linutronix.de

    Sebastian Andrzej Siewior
     
  • Add cpuslocked() variants for the multi instance registration so this can
    be called from a cpus_read_lock() protected region.
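
    Usage from an already-locked region might look like this (driver
    names and surrounding structure hypothetical):

    #include <linux/cpuhotplug.h>

    struct foo_dev {
        struct hlist_node node;     /* hotplug multi-instance linkage */
        /* ... */
    };

    static enum cpuhp_state foo_hp_state;   /* set up elsewhere */

    static int foo_register(struct foo_dev *foo)
    {
        int ret;

        cpus_read_lock();
        /* ... other setup that must not race with CPU hotplug ... */
        ret = cpuhp_state_add_instance_cpuslocked(foo_hp_state,
                                                  &foo->node);
        cpus_read_unlock();
        return ret;
    }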

    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081547.321782217@linutronix.de

    Thomas Gleixner
     
  • Some call sites of cpuhp_setup/remove_state[_nocalls]() are within a
    cpus_read locked region.

    cpuhp_setup/remove_state[_nocalls]() call cpus_read_lock() as well, which
    is possible in the current implementation but prevents converting the
    hotplug locking to a percpu rwsem.

    Provide locked versions of the interfaces to avoid nested calls to
    cpus_read_lock().
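
    For example (state string and callbacks hypothetical):

    static int foo_online(unsigned int cpu)  { return 0; }
    static int foo_offline(unsigned int cpu) { return 0; }

    static int foo_setup(void)
    {
        int ret;

        cpus_read_lock();
        /* ... work that already required the hotplug lock ... */
        ret = cpuhp_setup_state_cpuslocked(CPUHP_AP_ONLINE_DYN,
                                           "foo:online",
                                           foo_online, foo_offline);
        cpus_read_unlock();
        return ret < 0 ? ret : 0;
    }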

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081547.239600868@linutronix.de

    Sebastian Andrzej Siewior
     
    The counting 'rwsem' hackery of get|put_online_cpus() is going to be
    replaced by a percpu rwsem.

    Rename the functions to make it clear that it's locking and not some
    refcount style interface. These new functions will be used for the
    preparatory patches which make the code ready for the percpu rwsem
    conversion.

    Rename all instances in the cpu hotplug code while at it.
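
    As I read the description, the mapping is:

    /*
     * Old refcount-style name      New locking name
     * ------------------------     --------------------
     * get_online_cpus()        ->  cpus_read_lock()
     * put_online_cpus()        ->  cpus_read_unlock()
     * cpu_hotplug_begin()      ->  cpus_write_lock()
     * cpu_hotplug_done()       ->  cpus_write_unlock()
     */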

    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081547.080397752@linutronix.de

    Thomas Gleixner
     

26 Mar, 2017

1 commit

  • Since commit 383776fa7527 ("locking/lockdep: Handle statically initialized
    PER_CPU locks properly") we try to collapse per-cpu locks into a single
    class by giving them all the same key. For this key we choose the canonical
    address of the per-cpu object, which would be the offset into the per-cpu
    area.

    This has two problems:

    - there is a case where we run a !0 lock->key through static_obj()
      and expect this to pass; it doesn't for canonical pointers.

    - 0 is a valid canonical address.

    Cure both issues by redefining the canonical address as the address of the
    per-cpu variable on the boot CPU.

    Since I didn't want to rely on CPU0 being the boot-cpu, or even existing at
    all, track the boot CPU in a variable.

    Fixes: 383776fa7527 ("locking/lockdep: Handle statically initialized PER_CPU locks properly")
    Reported-by: kernel test robot
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Borislav Petkov
    Cc: Sebastian Andrzej Siewior
    Cc: linux-mm@kvack.org
    Cc: wfg@linux.intel.com
    Cc: kernel test robot
    Cc: LKP
    Link: http://lkml.kernel.org/r/20170320114108.kbvcsuepem45j5cr@hirez.programming.kicks-ass.net
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

15 Mar, 2017

1 commit

    The setup/remove_state/instance() functions in the hotplug core code
    are serialized against concurrent CPU hotplug, but unfortunately not
    serialized against themselves.

    As a consequence, a concurrent invocation of these functions results
    in corruption of the callback machinery, because two instances try to
    invoke callbacks on remote CPUs at the same time. This results in
    missing callback invocations and initiator threads waiting forever on
    the completion.

    The obvious solution of replacing get_online_cpus() with
    cpu_hotplug_begin() is not possible because at least one call site
    calls into these functions from a get_online_cpus() locked region.

    Extend the protection scope of the cpuhp_state_mutex from solely protecting
    the state arrays to cover the callback invocation machinery as well.

    Fixes: 5b7aa87e0482 ("cpu/hotplug: Implement setup/removal interface")
    Reported-and-tested-by: Bart Van Assche
    Signed-off-by: Sebastian Andrzej Siewior
    Cc: hpa@zytor.com
    Cc: mingo@kernel.org
    Cc: akpm@linux-foundation.org
    Cc: torvalds@linux-foundation.org
    Link: http://lkml.kernel.org/r/20170314150645.g4tdyoszlcbajmna@linutronix.de
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     

18 Jan, 2017

1 commit

  • After the recent removal of the hotplug notifiers the variable 'hasdied' in
    _cpu_down() is set but no longer read, leading to the following GCC warning
    when building with 'make W=1':

    kernel/cpu.c:767:7: warning: variable ‘hasdied’ set but not used [-Wunused-but-set-variable]

    Fix it by removing the variable.

    Fixes: 530e9b76ae8f ("cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions")
    Signed-off-by: Tobias Klauser
    Cc: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: rt@linutronix.de
    Link: http://lkml.kernel.org/r/20170117143501.20893-1-tklauser@distanz.ch
    Signed-off-by: Thomas Gleixner

    Tobias Klauser
     

16 Jan, 2017

1 commit

  • Mathieu reported that the LTTNG modules are broken as of 4.10-rc1 due to
    the removal of the cpu hotplug notifiers.

    Usually I don't care much about out of tree modules, but LTTNG is widely
    used in distros. There are two ways to solve that:

    1) Reserve a hotplug state for LTTNG

    2) Add a dynamic range for the prepare states.

    While #1 is the simplest solution, #2 is the proper one as we can convert
    in tree users, which do not care about ordering, to the dynamic range as
    well.

    Add a dynamic range which allows LTTNG to request states in the prepare
    stage.

    Reported-and-tested-by: Mathieu Desnoyers
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Mathieu Desnoyers
    Cc: Peter Zijlstra
    Cc: Sebastian Sewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701101353010.3401@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

27 Dec, 2016

1 commit

  • The attempt to prevent overwriting an active state resulted in a
    disaster which effectively disables all dynamically allocated hotplug
    states.

    Cleanup the mess.

    Fixes: dc280d936239 ("cpu/hotplug: Prevent overwriting of callbacks")
    Reported-by: Markus Trippelsdorf
    Reported-by: Boris Ostrovsky
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

25 Dec, 2016

2 commits

  • hotcpu_notifier(), cpu_notifier(), __hotcpu_notifier(), __cpu_notifier(),
    register_hotcpu_notifier(), register_cpu_notifier(),
    __register_hotcpu_notifier(), __register_cpu_notifier(),
    unregister_hotcpu_notifier(), unregister_cpu_notifier(),
    __unregister_hotcpu_notifier(), __unregister_cpu_notifier()

    are unused now. Remove them and all related code.

    Remove also the now pointless cpu notifier error injection mechanism. The
    states can be executed step by step and error rollback is the same as cpu
    down, so any state transition can be tested w/o requiring the notifier
    error injection.

    Some CPU hotplug states are kept as they are (ab)used for hotplug state
    tracking.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: rt@linutronix.de
    Link: http://lkml.kernel.org/r/20161221192112.005642358@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Developers manage to overwrite states blindly without thought. That's fatal
    and hard to debug. Add sanity checks to make it fail.

    This requires restructuring the code so that the dynamic state
    allocation happens in the same lock protected section as the actual
    store. Otherwise the previous assignment of 'Reserved' to the name
    field would trigger the overwrite check.
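
    Inside the store path the check presumably boils down to something
    like this (a sketch; field layout as in kernel/cpu.c of that era):

    /* Refuse to overwrite an already occupied state. Because the
     * dynamic allocation now happens in the same locked section as the
     * store, the transient "Reserved" name can no longer trip this. */
    if (name && sp->name)
        return -EBUSY;

    sp->startup.single  = startup;
    sp->teardown.single = teardown;
    sp->name = name;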

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192111.675234535@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

16 Dec, 2016

1 commit

    When invoked with the CPUHP_AP_ONLINE_DYN state, __cpuhp_setup_state()
    is expected to return a positive value, namely the hotplug state that
    the routine assigns.

    Signed-off-by: Boris Ostrovsky
    Cc: linux-pm@vger.kernel.org
    Cc: viresh.kumar@linaro.org
    Cc: bigeasy@linutronix.de
    Cc: rjw@rjwysocki.net
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1481814058-4799-2-git-send-email-boris.ostrovsky@oracle.com
    Signed-off-by: Thomas Gleixner

    Boris Ostrovsky
     

08 Dec, 2016

1 commit

  • Yu Zhao has noticed that __unregister_cpu_notifier only unregisters its
    notifiers when HOTPLUG_CPU=y while the registration might succeed even
    when HOTPLUG_CPU=n if MODULE is enabled. This means that e.g. zswap
    might keep a stale notifier on the list on the manual clean up during
    the pool tear down and thus corrupt the list, resulting in the
    following splat:

    [ 144.964346] BUG: unable to handle kernel paging request at ffff880658a2be78
    [ 144.971337] IP: [] raw_notifier_chain_register+0x1b/0x40

    [ 145.122628] Call Trace:
    [ 145.125086] [] __register_cpu_notifier+0x18/0x20
    [ 145.131350] [] zswap_pool_create+0x273/0x400
    [ 145.137268] [] __zswap_param_set+0x1fc/0x300
    [ 145.143188] [] ? trace_hardirqs_on+0xd/0x10
    [ 145.149018] [] ? kernel_param_lock+0x28/0x30
    [ 145.154940] [] ? __might_fault+0x4f/0xa0
    [ 145.160511] [] zswap_compressor_param_set+0x17/0x20
    [ 145.167035] [] param_attr_store+0x5c/0xb0
    [ 145.172694] [] module_attr_store+0x1d/0x30
    [ 145.178443] [] sysfs_kf_write+0x4f/0x70
    [ 145.183925] [] kernfs_fop_write+0x149/0x180
    [ 145.189761] [] __vfs_write+0x18/0x40
    [ 145.194982] [] vfs_write+0xb2/0x1a0
    [ 145.200122] [] SyS_write+0x52/0xa0
    [ 145.205177] [] entry_SYSCALL_64_fastpath+0x12/0x17

    This can be even triggered manually by changing
    /sys/module/zswap/parameters/compressor multiple times.

    Fix this issue by making the unregister APIs symmetric to the register
    APIs, so there are no surprises.

    Fixes: 47e627bc8c9a ("[PATCH] hotplug: Allow modules to use the cpu hotplug notifiers even if !CONFIG_HOTPLUG_CPU")
    Reported-and-tested-by: Yu Zhao
    Signed-off-by: Michal Hocko
    Cc: linux-mm@kvack.org
    Cc: Andrew Morton
    Cc: Dan Streetman
    Link: http://lkml.kernel.org/r/20161207135438.4310-1-mhocko@kernel.org
    Signed-off-by: Thomas Gleixner

    Michal Hocko
     

16 Oct, 2016

1 commit

  • Use distinctive name for cpu_hotplug.dep_map to avoid the actual
    cpu_hotplug.lock appearing as cpu_hotplug.lock#2 in lockdep splats.

    Signed-off-by: Joonas Lahtinen
    Reviewed-by: Chris Wilson
    Acked-by: Gautham R. Shenoy
    Cc: Andrew Morton
    Cc: Daniel Vetter
    Cc: Gautham R . Shenoy
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: intel-gfx@lists.freedesktop.org
    Cc: trivial@kernel.org
    Signed-off-by: Ingo Molnar

    Joonas Lahtinen
     

04 Oct, 2016

2 commits

  • Pull CPU hotplug updates from Thomas Gleixner:
    "Yet another batch of cpu hotplug core updates and conversions:

    - Provide core infrastructure for multi instance drivers so the
    drivers do not have to keep custom lists.

    - Convert custom lists to the new infrastructure. The block-mq custom
    list conversion comes through the block tree and makes the diffstat
    tip over to more lines removed than added.

    - Handle unbalanced hotplug enable/disable calls more gracefully.

    - Remove the obsolete CPU_STARTING/DYING notifier support.

    - Convert another batch of notifier users.

    The relayfs changes which conflicted with the conversion have been
    shipped to me by Andrew.

    The remaining lot is targeted for 4.10 so that we finally can remove
    the rest of the notifiers"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
    cpufreq: Fix up conversion to hotplug state machine
    blk/mq: Reserve hotplug states for block multiqueue
    x86/apic/uv: Convert to hotplug state machine
    s390/mm/pfault: Convert to hotplug state machine
    mips/loongson/smp: Convert to hotplug state machine
    mips/octeon/smp: Convert to hotplug state machine
    fault-injection/cpu: Convert to hotplug state machine
    padata: Convert to hotplug state machine
    cpufreq: Convert to hotplug state machine
    ACPI/processor: Convert to hotplug state machine
    virtio scsi: Convert to hotplug state machine
    oprofile/timer: Convert to hotplug state machine
    block/softirq: Convert to hotplug state machine
    lib/irq_poll: Convert to hotplug state machine
    x86/microcode: Convert to hotplug state machine
    sh/SH-X3 SMP: Convert to hotplug state machine
    ia64/mca: Convert to hotplug state machine
    ARM/OMAP/wakeupgen: Convert to hotplug state machine
    ARM/shmobile: Convert to hotplug state machine
    arm64/FP/SIMD: Convert to hotplug state machine
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Expedited grace-period changes, most notably avoiding having user
    threads drive expedited grace periods, using a workqueue instead.

    - Miscellaneous fixes, including a performance fix for lists that was
    sent with the lists modifications.

    - CPU hotplug updates, most notably providing exact CPU-online
    tracking for RCU. This will in turn allow removal of the checks
    supporting RCU's prior heuristic that was based on the assumption
    that CPUs would take no longer than one jiffy to come online.

    - Torture-test updates.

    - Documentation updates"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
    list: Expand list_first_entry_or_null()
    torture: TOROUT_STRING(): Insert a space between flag and message
    rcuperf: Consistently insert space between flag and message
    rcutorture: Print out barrier error as document says
    torture: Add task state to writer-task stall printk()s
    torture: Convert torture_shutdown() to hrtimer
    rcutorture: Convert to hotplug state machine
    cpu/hotplug: Get rid of CPU_STARTING reference
    rcu: Provide exact CPU-online tracking for RCU
    rcu: Avoid redundant quiescent-state chasing
    rcu: Don't use modular infrastructure in non-modular code
    sched: Make wake_up_nohz_cpu() handle CPUs going offline
    rcu: Use rcu_gp_kthread_wake() to wake up grace period kthreads
    rcu: Use RCU's online-CPU state for expedited IPI retry
    rcu: Exclude RCU-offline CPUs from expedited grace periods
    rcu: Make expedited RCU CPU stall warnings respond to controls
    rcu: Stop disabling expedited RCU CPU stall warnings
    rcu: Drive expedited grace periods from workqueue
    rcu: Consolidate expedited grace period machinery
    documentation: Record reason for rcu_head two-byte alignment
    ...

    Linus Torvalds
     

07 Sep, 2016

2 commits

  • Install the callbacks via the state machine.

    Signed-off-by: Richard Weinberger
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Sebastian Andrzej Siewior
    Cc: Peter Zijlstra
    Cc: Pekka Enberg
    Cc: linux-mm@kvack.org
    Cc: rt@linutronix.de
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrew Morton
    Cc: Christoph Lameter
    Link: http://lkml.kernel.org/r/20160823125319.abeapfjapf2kfezp@linutronix.de
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     
    Install the callbacks via the state machine. They are installed at
    runtime, but relay_prepare_cpu() does not need to be invoked for the
    boot CPU because relay_open() has not been invoked yet and there are
    no pools that need to be created.
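
    Such a conversion typically reduces to registering the prepare
    callback once at init, roughly like this (the state constant and
    callback exist in the tree; the exact string label is approximate):

    static int __init relay_hotplug_init(void)
    {
        int ret;

        /* Prepare-stage callback only; no teardown needed. */
        ret = cpuhp_setup_state(CPUHP_RELAY_PREPARE, "lib/relay:prepare",
                                relay_prepare_cpu, NULL);
        WARN_ON(ret < 0);
        return 0;
    }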

    Signed-off-by: Richard Weinberger
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Reviewed-by: Sebastian Andrzej Siewior
    Cc: Peter Zijlstra
    Cc: rt@linutronix.de
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/20160818125731.27256-3-bigeasy@linutronix.de
    Signed-off-by: Thomas Gleixner

    Richard Weinberger