06 Dec, 2018

2 commits

  • commit a74cfffb03b73d41e08f84c2e5c87dec0ce3db9f upstream

    arch_smt_update() is only called when the sysfs SMT control knob is
    changed. This means that when SMT is enabled in the sysfs control knob the
    system is considered to have SMT active even if all siblings are offline.

    To allow finegrained control of the speculation mitigations, the actual SMT
    state is more interesting than the fact that siblings could be enabled.

    Rework the code, so arch_smt_update() is invoked from each individual CPU
    hotplug function, and simplify the update function while at it.
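
    As a rough illustration of the difference from userspace (a sketch only; the
    sysfs files below are the ones added by the SMT control series, and 'active'
    reports whether any sibling is actually online):

    # Administrative state of the knob: on / off / forceoff / notsupported
    cat /sys/devices/system/cpu/smt/control
    # Actual SMT state the mitigation code now keys off:
    cat /sys/devices/system/cpu/smt/active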

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Jiri Kosina
    Cc: Tom Lendacky
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Cc: David Woodhouse
    Cc: Tim Chen
    Cc: Andi Kleen
    Cc: Dave Hansen
    Cc: Casey Schaufler
    Cc: Asit Mallick
    Cc: Arjan van de Ven
    Cc: Jon Masters
    Cc: Waiman Long
    Cc: Greg KH
    Cc: Dave Stewart
    Cc: Kees Cook
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20181125185004.521974984@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 53c613fe6349994f023245519265999eed75957f upstream

    STIBP is a feature provided by certain Intel ucodes / CPUs. This feature
    (once enabled) prevents cross-hyperthread control of decisions made by
    indirect branch predictors.

    Enable this feature if

    - the CPU is vulnerable to spectre v2
    - the CPU supports SMT and has SMT siblings online
    - spectre_v2 mitigation autoselection is enabled (default)

    After some previous discussion, this leaves STIBP on all the time, as wrmsr
    on crossing kernel boundary is a no-no. This could perhaps later be a bit
    more optimized (like disabling it in NOHZ, experiment with disabling it in
    idle, etc) if needed.

    Note that the synchronization of the mask manipulation via newly added
    spec_ctrl_mutex is currently not strictly needed, as the only updater is
    already being serialized by cpu_add_remove_lock, but let's make this a
    little bit more future-proof.
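
    A hedged way to check the resulting state on a running system (the exact
    wording of this file varies between kernel versions and CPUs):

    cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
    # With this patch applied on an affected SMT system the reported mitigation
    # includes STIBP; on unaffected CPUs the file reads "Not affected".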

    Signed-off-by: Jiri Kosina
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Cc: "WoodhouseDavid"
    Cc: Andi Kleen
    Cc: Tim Chen
    Cc: "SchauflerCasey"
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438240.15880@cbobk.fhfr.pm
    Signed-off-by: Greg Kroah-Hartman

    Jiri Kosina
     

23 Nov, 2018

1 commit

  • This reverts commit 8a13906ae519b3ed95cd0fb73f1098b46362f6c4 which is
    commit 53c613fe6349994f023245519265999eed75957f upstream.

    It's not ready for the stable trees as there are major slowdowns
    involved with this patch.

    Reported-by: Jiri Kosina
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Cc: "WoodhouseDavid"
    Cc: Andi Kleen
    Cc: Tim Chen
    Cc: "SchauflerCasey"
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

14 Nov, 2018

1 commit

  • commit 53c613fe6349994f023245519265999eed75957f upstream.

    STIBP is a feature provided by certain Intel ucodes / CPUs. This feature
    (once enabled) prevents cross-hyperthread control of decisions made by
    indirect branch predictors.

    Enable this feature if

    - the CPU is vulnerable to spectre v2
    - the CPU supports SMT and has SMT siblings online
    - spectre_v2 mitigation autoselection is enabled (default)

    After some previous discussion, this leaves STIBP on all the time, as wrmsr
    on crossing kernel boundary is a no-no. This could perhaps later be a bit
    more optimized (like disabling it in NOHZ, experiment with disabling it in
    idle, etc) if needed.

    Note that the synchronization of the mask manipulation via newly added
    spec_ctrl_mutex is currently not strictly needed, as the only updater is
    already being serialized by cpu_add_remove_lock, but let's make this a
    little bit more future-proof.

    Signed-off-by: Jiri Kosina
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Josh Poimboeuf
    Cc: Andrea Arcangeli
    Cc: "WoodhouseDavid"
    Cc: Andi Kleen
    Cc: Tim Chen
    Cc: "SchauflerCasey"
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438240.15880@cbobk.fhfr.pm
    Signed-off-by: Greg Kroah-Hartman

    Jiri Kosina
     

20 Sep, 2018

2 commits

  • commit 69fa6eb7d6a64801ea261025cce9723d9442d773 upstream.

    When a teardown callback fails, the CPU hotplug code brings the CPU back to
    the previous state. The previous state becomes the new target state. The
    rollback happens in undo_cpu_down() which increments the state
    unconditionally even if the state is already the same as the target.

    As a consequence the next CPU hotplug operation will start at the wrong
    state. This is easy to observe when __cpu_disable() fails.

    Prevent the unconditional undo by checking the state vs. target before
    incrementing state and fix up the consequently wrong conditional in the
    unplug code which handles the failure of the final CPU take down on the
    control CPU side.
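
    The effect can be observed from userspace via the per-CPU hotplug state
    files (a sketch; the numeric values are kernel-internal state numbers and
    differ between versions):

    # Current and target state of the hotplug state machine for cpu1:
    cat /sys/devices/system/cpu/cpu1/hotplug/state
    cat /sys/devices/system/cpu/cpu1/hotplug/target
    # Mapping of numbers to state names:
    cat /sys/devices/system/cpu/hotplug/states
    # After a failed offline, state and target should agree again once the
    # rollback has completed.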

    Fixes: 4dddfb5faa61 ("smp/hotplug: Rewrite AP state machine core")
    Reported-by: Neeraj Upadhyay
    Signed-off-by: Thomas Gleixner
    Tested-by: Geert Uytterhoeven
    Tested-by: Sudeep Holla
    Tested-by: Neeraj Upadhyay
    Cc: josh@joshtriplett.org
    Cc: peterz@infradead.org
    Cc: jiangshanlai@gmail.com
    Cc: dzickus@redhat.com
    Cc: brendan.jackman@arm.com
    Cc: malat@debian.org
    Cc: sramana@codeaurora.org
    Cc: linux-arm-msm@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1809051419580.1416@nanos.tec.linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit f8b7530aa0a1def79c93101216b5b17cf408a70a upstream.

    The smp_mb() in cpuhp_thread_fun() is misplaced. It needs to be after the
    load of st->should_run to prevent reordering of the later load/stores
    w.r.t. the load of st->should_run.

    Fixes: 4dddfb5faa61 ("smp/hotplug: Rewrite AP state machine core")
    Signed-off-by: Neeraj Upadhyay
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: josh@joshtriplett.org
    Cc: peterz@infradead.org
    Cc: jiangshanlai@gmail.com
    Cc: dzickus@redhat.com
    Cc: brendan.jackman@arm.com
    Cc: malat@debian.org
    Cc: mojha@codeaurora.org
    Cc: sramana@codeaurora.org
    Cc: linux-arm-msm@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/1536126727-11629-1-git-send-email-neeraju@codeaurora.org
    Signed-off-by: Greg Kroah-Hartman

    Neeraj Upadhyay
     

16 Aug, 2018

12 commits

  • commit 269777aa530f3438ec1781586cdac0b5fe47b061 upstream.

    Commit 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
    breaks non-SMP builds.

    [ I suspect the 'bool' fields should just be made to be bitfields and be
    exposed regardless of configuration, but that's a separate cleanup
    that I'll leave to the owners of this file for later. - Linus ]

    Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
    Cc: Dave Hansen
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Signed-off-by: Abel Vesa
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Abel Vesa
     
  • commit bc2d8d262cba5736332cbc866acb11b1c5748aa9 upstream

    Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
    cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
    on the kernel command line as it cannot differentiate between SMT disabled
    by BIOS and SMT soft-disabled via 'nosmt'. That wrecks the state and
    makes the sysfs interface unusable.

    Rework this so that during bringup of the non boot CPUs the availability of
    SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
    'primary' thread, set the local cpu_smt_available marker and evaluate
    this explicitly right after the initial SMP bringup has finished.

    SMT evaluation on x86 is a trainwreck as the firmware has all the
    information _before_ booting the kernel, but there is no interface to query
    it.

    Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
    Reported-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 73d5e2b472640b1fcdb61ae8be389912ef211bda upstream

    If SMT is disabled in BIOS, the CPU code doesn't properly detect it.
    The /sys/devices/system/cpu/smt/control file shows 'on', and the 'l1tf'
    vulnerabilities file shows SMT as vulnerable.

    Fix it by forcing 'cpu_smt_control' to CPU_SMT_NOT_SUPPORTED in such a
    case. Unfortunately the detection can only be done after bringing all
    the CPUs online, so we have to overwrite any previous writes to the
    variable.
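
    On a machine with SMT disabled in the BIOS, the symptom and the fix can be
    checked as follows (a sketch; the exact l1tf wording depends on the kernel
    version):

    # Reported 'on' before this fix, 'notsupported' afterwards:
    cat /sys/devices/system/cpu/smt/control
    # Should no longer flag SMT as vulnerable:
    cat /sys/devices/system/cpu/vulnerabilities/l1tf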

    Reported-by: Joe Mario
    Tested-by: Jiri Kosina
    Fixes: f048c399e0f7 ("x86/topology: Provide topology_smt_supported()")
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     
  • commit fee0aede6f4739c87179eca76136f83210953b86 upstream

    The CPU_SMT_NOT_SUPPORTED state is set (if the processor does not support
    SMT) when the sysfs SMT control file is initialized.

    That was fine so far as this was only required to make the output of the
    control file correct and to prevent writes in that case.

    With the upcoming l1tf command line parameter, this needs to be set up
    before the L1TF mitigation selection and command line parsing happens.

    Signed-off-by: Thomas Gleixner
    Tested-by: Jiri Kosina
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20180713142323.121795971@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 8e1b706b6e819bed215c0db16345568864660393 upstream

    The L1TF mitigation will gain a command line parameter which allows setting
    a combination of hypervisor mitigation and SMT control.

    Expose cpu_smt_disable() so the command line parser can tweak SMT settings.

    [ tglx: Split out of larger patch and made it preserve an already existing
    force off state ]

    Signed-off-by: Jiri Kosina
    Signed-off-by: Thomas Gleixner
    Tested-by: Jiri Kosina
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/20180713142323.039715135@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Jiri Kosina
     
  • commit 215af5499d9e2b55f111d2431ea20218115f29b3 upstream

    Writing 'off' to /sys/devices/system/cpu/smt/control offlines all SMT
    siblings. Writing 'on' merely enables the ability to online them, but does
    not online them automatically.

    Make 'on' more useful by onlining all offline siblings.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 26acfb666a473d960f0fd971fe68f3e3ad16c70b upstream

    If the L1TF CPU bug is present we allow the KVM module to be loaded as the
    majority of users that use Linux and KVM have trusted guests and do not want a
    broken setup.

    Cloud vendors are the ones that are uncomfortable with CVE-2018-3620 and as
    such they are the ones that should set nosmt to one.

    Setting 'nosmt' means that the system administrator also needs to disable
    SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line
    parameter, or via the /sys/devices/system/cpu/smt/control. See commit
    05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT").

    Other mitigations are to use task affinity, cpu sets, interrupt binding,
    etc - anything to make sure that _only_ the same guests vCPUs are running
    on sibling threads.

    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Konrad Rzeszutek Wilk
     
  • commit 0cc3cd21657be04cb0559fe8063f2130493f92cf upstream

    Due to the way Machine Check Exceptions work on X86 hyperthreads it's
    required to boot up _all_ logical cores at least once in order to set the
    CR4.MCE bit.

    So instead of ignoring the sibling threads right away, let them boot up
    once so they can configure themselves. After they come out of the initial
    boot stage, check whether it's a "secondary" sibling and cancel the operation
    which puts the CPU back into offline state.

    Reported-by: Dave Hansen
    Signed-off-by: Thomas Gleixner
    Tested-by: Tony Luck
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 05736e4ac13c08a4a9b1ef2de26dd31a32cbee57 upstream

    Provide a command line and a sysfs knob to control SMT.

    The command line options are:

    'nosmt': Enumerate secondary threads, but do not online them

    'nosmt=force': Ignore secondary threads completely during enumeration
    via MP table and ACPI/MADT.

    The sysfs control file has the following states (read/write):

    'on':           SMT is enabled. Secondary threads can be freely onlined
    'off':          SMT is disabled. Secondary threads, even if enumerated,
                    cannot be onlined
    'forceoff':     SMT is permanently disabled. Writes to the control
                    file are rejected.
    'notsupported': SMT is not supported by the CPU

    The command line option 'nosmt' sets the sysfs control to 'off'. This
    can be changed to 'on' to reenable SMT during runtime.

    The command line option 'nosmt=force' sets the sysfs control to
    'forceoff'. This cannot be changed during runtime.

    When SMT is 'on' and the control file is changed to 'off' then all online
    secondary threads are offlined and attempts to online a secondary thread
    later on are rejected.

    When SMT is 'off' and the control file is changed to 'on' then secondary
    threads can be onlined again. The 'off' -> 'on' transition does not
    automatically online the secondary threads.

    When the control file is set to 'forceoff', the behaviour is the same as
    setting it to 'off', but the operation is irreversible and later writes to
    the control file are rejected.

    When the control status is 'notsupported' then writes to the control file
    are rejected.
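
    Putting the knobs together (illustrative only; cpu1 is assumed to be a
    secondary SMT sibling on the test machine):

    # Booted with 'nosmt' on the kernel command line:
    cat /sys/devices/system/cpu/smt/control         # -> off
    echo 1 > /sys/devices/system/cpu/cpu1/online    # rejected while 'off'
    echo on > /sys/devices/system/cpu/smt/control
    echo 1 > /sys/devices/system/cpu/cpu1/online    # allowed again
    echo forceoff > /sys/devices/system/cpu/smt/control
    echo on > /sys/devices/system/cpu/smt/control   # rejected, forceoff is final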

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit cc1fe215e1efa406b03aa4389e6269b61342dec5 upstream

    Split out the inner workings of do_cpu_down() to allow reuse of that
    function for the upcoming SMT disabling mechanism.

    No functional change.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit c4de65696d865c225fda3b9913b31284ea65ea96 upstream

    The asymmetry caused a warning to trigger if the bootup was stopped in state
    CPUHP_AP_ONLINE_IDLE. The warning no longer triggers as kthread_park() can
    now be invoked on already or still parked threads. But there is still no
    reason to have this be asymmetric.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit b5b1404d0815894de0690de8a1ab58269e56eae6 upstream.

    This is purely a preparatory patch for upcoming changes during the 4.19
    merge window.

    We have a function called "boot_cpu_state_init()" that isn't really
    about the bootup cpu state: that is done much earlier by the similarly
    named "boot_cpu_init()" (note lack of "state" in name).

    This function initializes some hotplug CPU state, and needs to run after
    the percpu data has been properly initialized. It even has a comment to
    that effect.

    Except it _doesn't_ actually run after the percpu data has been properly
    initialized. On x86 it happens to do that, but on at least arm and
    arm64, the percpu base pointers are initialized by the arch-specific
    'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().

    This had some unexpected results, and in particular we have a patch
    pending for the merge window that did the obvious cleanup of using
    'this_cpu_write()' in the cpu hotplug init code:

    - per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
    + this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);

    which is obviously the right thing to do. Except because of the
    ordering issue, it actually failed miserably and unexpectedly on arm64.

    So this just fixes the ordering, and changes the name of the function to
    be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
    hotplug state, because the core CPU state was supposed to have already
    been done earlier.

    Marked for stable, since the (not yet merged) patch that will show this
    problem is marked for stable.

    Reported-by: Vlastimil Babka
    Reported-by: Mian Yousaf Kaukab
    Suggested-by: Catalin Marinas
    Acked-by: Thomas Gleixner
    Cc: Will Deacon
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     

03 Jan, 2018

1 commit

  • commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

    The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
    them with a potentially stale clk and next_expiry value, which can cause
    trouble when the CPU is plugged in.

    Add a prepare callback which forwards the clock, sets next_expiry to far in
    the future and resets the control flags to a known state.

    Set base->must_forward_clk so the first timer which is queued will try to
    forward the clock to current jiffies.

    Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
    Reported-by: Paul E. McKenney
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Sebastian Siewior
    Cc: Anna-Maria Gleixner
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

14 Dec, 2017

1 commit

  • commit 46febd37f9c758b05cd25feae8512f22584742fe upstream.

    Commit 31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
    accidentally put this step in the wrong place. The step should be in the
    cpuhp_ap_states[] rather than the cpuhp_bp_states[].

    grep smpcfd /sys/devices/system/cpu/hotplug/states
    40: smpcfd:prepare
    129: smpcfd:dying

    "smpcfd:dying" was missing before.
    So was the invocation of the function smpcfd_dying_cpu().

    Fixes: 31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
    Signed-off-by: Lai Jiangshan
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Richard Weinberger
    Cc: Sebastian Andrzej Siewior
    Cc: Boris Ostrovsky
    Link: https://lkml.kernel.org/r/20171128131954.81229-1-jiangshanlai@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Lai Jiangshan
     

21 Oct, 2017

1 commit

  • The recent rework of the cpu hotplug internals changed the usage of the per
    cpu state->node field, but failed to clean it up after use.

    So subsequent hotplug operations use the stale pointer from a previous
    operation and hand it into the callback functions. The callbacks then
    dereference a pointer which either belongs to a different facility or
    points to freed and potentially reused memory. In either case data
    corruption and crashes are the obvious consequence.

    Reset the node and the last pointers in the per cpu state to NULL after the
    operation which set them has completed.

    Fixes: 96abb968549c ("smp/hotplug: Allow external multi-instance rollback")
    Reported-by: Tvrtko Ursulin
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: Boris Ostrovsky
    Cc: "Paul E. McKenney"
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710211606130.3213@nanos

    Thomas Gleixner
     

06 Oct, 2017

1 commit

  • Pull watchdog clean-up and fixes from Thomas Gleixner:
    "The watchdog (hard/softlockup detector) code is pretty much broken in
    its current state. The patch series addresses this by removing all
    duct tape and refactoring it into a workable state.

    The reasons why I ask for inclusion that late in the cycle are:

    1) The code causes lockdep splats vs. hotplug locking which get
    reported over and over. Unfortunately there is no easy fix.

    2) The risk of breakage is minimal because it's already broken

    3) As 4.14 is a long term stable kernel, I prefer to have working
    watchdog code in that and the lockdep issues resolved. I wouldn't
    ask you to pull if 4.14 wouldn't be a LTS kernel or if the
    solution would be easy to backport.

    4) The series was around before the merge window opened, but then got
    delayed due to the UP failure caused by the for_each_cpu()
    surprise which we discussed recently.

    Changes vs. V1:

    - Addressed your review points

    - Addressed the warning in the powerpc code which was discovered late

    - Changed two function names which made sense up to a certain point
    in the series. Now they match what they do in the end.

    - Fixed an 'unused variable' warning, which was not detected by the
    Intel robot. I triggered it when trying all possible related config
    combinations manually. Randconfig testing seems not random enough.

    The changes have been tested by and reviewed by Don Zickus and tested
    and acked by Michael Ellerman for powerpc"

    * 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    watchdog/core: Put softlockup_threads_initialized under ifdef guard
    watchdog/core: Rename some softlockup_* functions
    powerpc/watchdog: Make use of watchdog_nmi_probe()
    watchdog/core, powerpc: Lock cpus across reconfiguration
    watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()
    watchdog/hardlockup/perf: Fix spelling mistake: "permanetely" -> "permanently"
    watchdog/hardlockup/perf: Cure UP damage
    watchdog/hardlockup: Clean up hotplug locking mess
    watchdog/hardlockup/perf: Simplify deferred event destroy
    watchdog/hardlockup/perf: Use new perf CPU enable mechanism
    watchdog/hardlockup/perf: Implement CPU enable replacement
    watchdog/hardlockup/perf: Implement init time detection of perf
    watchdog/hardlockup/perf: Implement init time perf validation
    watchdog/core: Get rid of the racy update loop
    watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage
    watchdog/sysctl: Clean up sysctl variable name space
    watchdog/sysctl: Get rid of the #ifdeffery
    watchdog/core: Clean up header mess
    watchdog/core: Further simplify sysctl handling
    watchdog/core: Get rid of the thread teardown/setup dance
    ...

    Linus Torvalds
     

26 Sep, 2017

6 commits

  • Add a sysfs file to one-time fail a specific state. This can be used
    to test the state rollback code paths.

    Something like this (hotplug-up.sh):

    #!/bin/bash

    echo 0 > /debug/sched_debug
    echo 1 > /debug/tracing/events/cpuhp/enable

    ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
    STATES=${1:-$ALL_STATES}

    for state in $STATES
    do
            echo 0 > /sys/devices/system/cpu/cpu1/online
            echo 0 > /debug/tracing/trace
            echo Fail state: $state
            echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
            cat /sys/devices/system/cpu/cpu1/hotplug/fail
            echo 1 > /sys/devices/system/cpu/cpu1/online

            cat /debug/tracing/trace > hotfail-${state}.trace

            sleep 1
    done

    Can be used to test for all possible rollback (barring multi-instance)
    scenarios on CPU-up; CPU-down is a trivial modification of the above.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org

    Peter Zijlstra
     
  • With lockdep-crossrelease we get deadlock reports that span cpu-up and
    cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
    and cpu-down are globally serialized.

    takedown_cpu()
      irq_lock_sparse()
      wait_for_completion(&st->done)

                                cpuhp_thread_fun
                                 cpuhp_up_callback
                                  cpuhp_invoke_callback
                                   irq_affinity_online_cpu
                                    irq_lock_sparse()
                                    irq_unlock_sparse()
                                complete(&st->done)

    Now that we have consistent AP state, we can trivially separate the
    AP completion between up and down using st->bringup.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Acked-by: max.byungchul.park@gmail.com
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org

    Peter Zijlstra
     
  • With lockdep-crossrelease we get deadlock reports that span cpu-up and
    cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
    and cpu-down are globally serialized.

    CPU0                     CPU1                     CPU2
    cpuhp_up_callbacks:      takedown_cpu:            cpuhp_thread_fun:

    cpuhp_state
      irq_lock_sparse()
                             irq_lock_sparse()
                               wait_for_completion()
                                                      cpuhp_state
                                                      complete()

    Now that we have consistent AP state, we can trivially separate the
    AP-work class between up and down using st->bringup.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: max.byungchul.park@gmail.com
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org

    Peter Zijlstra
     
  • While the generic callback functions have an 'int' return and thus
    appear to be allowed to return error, this is not true for all states.

    Specifically, what used to be STARTING/DYING are run with IRQs
    disabled from critical parts of CPU bringup/teardown and are not
    allowed to fail. Add WARNs to enforce this rule.

    But since some callbacks are indeed allowed to fail, we have the
    situation where a state-machine rollback encounters a failure, in this
    case we're stuck, we can't go forward and we can't go back. Also add a
    WARN for that case.

    AFAICT this is a fundamental 'problem' with no real obvious solution.
    We want the 'prepare' callbacks to allow failure on either up or down.
    Typically on prepare-up this would be things like -ENOMEM from
    resource allocations, and the typical usage in prepare-down would be
    something like -EBUSY to avoid CPUs being taken away.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org

    Peter Zijlstra
     
  • There is currently no explicit state change on rollback. That is,
    st->bringup, st->rollback and st->target are not consistent when doing
    the rollback.

    Rework the AP state handling to be more coherent. This does mean we
    have to do a second AP kick-and-wait for rollback, but since rollback
    is the slow path of a slowpath, this really should not matter.

    Take this opportunity to simplify the AP thread function to only run a
    single callback per invocation. This unifies the three single/up/down
    modes it supports. The looping it used to do for up/down is achieved
    by retaining should_run and relying on the main smpboot_thread_fn()
    loop.

    (I have most of a patch that does the same for the BP state handling,
    but that's not critical and gets a little complicated because
    CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
    recursive @st usage; I still have to de-fugly that.)

    [ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
    avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
    one piece instead of having two consecutive ifdef sections of the
    same type. ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org

    Peter Zijlstra
     
  • Currently the rollback of multi-instance states is handled inside
    cpuhp_invoke_callback(). The problem is that when we want to allow an
    explicit state change for rollback, we need to return from the
    function without doing the rollback.

    Change cpuhp_invoke_callback() to optionally return the multi-instance
    state, such that rollback can be done from a subsequent call.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: bigeasy@linutronix.de
    Cc: efault@gmx.de
    Cc: rostedt@goodmis.org
    Cc: max.byungchul.park@gmail.com
    Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org

    Peter Zijlstra
     

14 Sep, 2017

1 commit

  • The following deadlock is possible in the watchdog hotplug code:

    cpus_write_lock()
      ...
      takedown_cpu()
        smpboot_park_threads()
          smpboot_park_thread()
            kthread_park()
              ->park() := watchdog_disable()
                watchdog_nmi_disable()
                  perf_event_release_kernel();
                    put_event()
                      _free_event()
                        ->destroy() := hw_perf_event_destroy()
                          x86_release_hardware()
                            release_ds_buffers()
                              get_online_cpus()

    when a per cpu watchdog perf event is destroyed which drops the last
    reference to the PMU hardware. The cleanup code there invokes
    get_online_cpus() which instantly deadlocks because the hotplug percpu
    rwsem is write locked.

    To solve this add a deferring mechanism:

    cpus_write_lock()
      kthread_park()
        watchdog_nmi_disable(deferred)
          perf_event_disable(event);
          move_event_to_deferred(event);
      ....
    cpus_write_unlock()
    cleanup_deferred_events()
      perf_event_release_kernel()

    This is still properly serialized against concurrent hotplug via the
    cpu_add_remove_lock, which is held by the task which initiated the hotplug
    event.

    This is also used to handle event destruction when the watchdog threads are
    parked via other mechanisms than CPU hotplug.

    Analyzed-by: Peter Zijlstra

    Reported-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Don Zickus
    Cc: Andrew Morton
    Cc: Chris Metcalf
    Cc: Linus Torvalds
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Ulrich Obergfell
    Link: http://lkml.kernel.org/r/20170912194146.884469246@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

05 Sep, 2017

1 commit


26 Jul, 2017

1 commit

  • RCU callbacks must be migrated away from an outgoing CPU, and this is
    done near the end of the CPU-hotplug operation, after the outgoing CPU is
    long gone. Unfortunately, this means that other CPU-hotplug callbacks
    can execute while the outgoing CPU's callbacks are still immobilized
    on the long-gone CPU's callback lists. If any of these CPU-hotplug
    callbacks must wait, either directly or indirectly, for the invocation
    of any of the immobilized RCU callbacks, the system will hang.

    This commit avoids such hangs by migrating the callbacks away from the
    outgoing CPU immediately upon its departure, shortly after the return
    from __cpu_die() in takedown_cpu(). Thus, RCU is able to advance these
    callbacks and invoke them, which allows all the after-the-fact CPU-hotplug
    callbacks to wait on these RCU callbacks without risk of a hang.

    While in the neighborhood, this commit also moves rcu_send_cbs_to_orphanage()
    and rcu_adopt_orphan_cbs() under a pre-existing #ifdef to avoid including
    dead code on the one hand and to avoid define-without-use warnings on the
    other hand.

    Reported-by: Jeffrey Hugo
    Link: http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org
    Signed-off-by: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Sebastian Andrzej Siewior
    Cc: Ingo Molnar
    Cc: Anna-Maria Gleixner
    Cc: Boris Ostrovsky
    Cc: Richard Weinberger

    Paul E. McKenney
     

20 Jul, 2017

1 commit

  • If cpuhp_store_callbacks() is called for CPUHP_AP_ONLINE_DYN or
    CPUHP_BP_PREPARE_DYN, which are the indicators for dynamically allocated
    states, then cpuhp_store_callbacks() allocates a new dynamic state. The
    first allocation in each range returns CPUHP_AP_ONLINE_DYN or
    CPUHP_BP_PREPARE_DYN.

    If cpuhp_remove_state() is invoked for one of these states, then there is
    no protection against the allocation mechanism. So the removal, which
    should clear the callbacks and the name, gets a new state assigned and
    clears that one.

    As a consequence the state which should be cleared stays initialized. A
    consecutive CPU hotplug operation dereferences the state callbacks and
    accesses either freed or reused memory, resulting in crashes.

    Add a protection against this by checking the name argument for NULL. If
    it's NULL it's a removal. If not, it's an allocation.

    [ tglx: Added a comment and massaged changelog ]

    Fixes: 5b7aa87e0482 ("cpu/hotplug: Implement setup/removal interface")
    Signed-off-by: Ethan Barnes
    Signed-off-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "Srivatsa S. Bhat"
    Cc: Sebastian Siewior
    Cc: Paul McKenney
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/DM2PR04MB398242FC7776D603D9F99C894A60@DM2PR04MB398.namprd04.prod.outlook.com

    Ethan Barnes
     

12 Jul, 2017

1 commit

  • The move of the unpark functions to the control thread moved the BUG_ON()
    there as well. While it made some sense in the idle thread of the upcoming
    CPU, it's bogus to crash the control thread on the already online CPU,
    especially as the function has a return value and the callsite is prepared
    to handle an error return.

    Replace it with a WARN_ON_ONCE() and return a proper error code.

    Fixes: 9cd4f1a4e7a8 ("smp/hotplug: Move unparking of percpu threads to the control CPU")
    Rightfully-ranted-at-by: Linus Torvalds
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

06 Jul, 2017

1 commit

  • Vikram reported the following backtrace:

    BUG: scheduling while atomic: swapper/7/0/0x00000002
    CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.32-perf+ #680
    schedule
    schedule_hrtimeout_range_clock
    schedule_hrtimeout
    wait_task_inactive
    __kthread_bind_mask
    __kthread_bind
    __kthread_unpark
    kthread_unpark
    cpuhp_online_idle
    cpu_startup_entry
    secondary_start_kernel

    He analyzed correctly that a parked cpu hotplug thread of an offlined CPU
    was still on the runqueue when the CPU came back online and tried to unpark
    it. This causes the thread which invoked kthread_unpark() to call
    wait_task_inactive() and subsequently schedule() with preemption disabled.
    His proposed workaround was to "make sure" that a parked thread has
    scheduled out when the CPU goes offline, so the situation cannot happen.

    But that's still wrong because the root cause is not the fact that the
    percpu thread is still on the runqueue and neither that preemption is
    disabled, which could be simply solved by enabling preemption before
    calling kthread_unpark().

    The real issue is that the calling thread is the idle task of the upcoming
    CPU, which is not supposed to call anything which might sleep. The moron,
    who wrote that code, missed completely that kthread_unpark() might end up
    in schedule().

    The solution is simpler than expected. The thread which controls the
    hotplug operation is waiting for the CPU to call complete() on the hotplug
    state completion. So the idle task of the upcoming CPU can set its state to
    CPUHP_AP_ONLINE_IDLE and invoke complete(). This in turn wakes the control
    task on a different CPU, which then can safely do the unpark and kick the
    now unparked hotplug thread of the upcoming CPU to complete the bringup to
    the final target state.

    Control CPU                          AP

    bringup_cpu();
      __cpu_up()  ------------>
                                         bringup_ap();
      bringup_wait_for_ap()
        wait_for_completion();
                                         cpuhp_online_idle();
      <------------                      complete();
      unpark(AP->stopper);
      unpark(AP->hotplugthread);
                                         while(1)
                                           do_idle();
      kick(AP->hotplugthread);
      wait_for_completion();             hotplug_thread()
                                           run_online_callbacks();
                                           complete();

    Fixes: 8df3e07e7f21 ("cpu/hotplug: Let upcoming cpu bring itself fully up")
    Reported-by: Vikram Mulukutla
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Sebastian Sewior
    Cc: Rusty Russell
    Cc: Tejun Heo
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707042218020.2131@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

04 Jul, 2017

1 commit

  • Pull SMP hotplug updates from Thomas Gleixner:
    "This update is primarily a cleanup of the CPU hotplug locking code.

    The hotplug locking mechanism is an open coded RWSEM, which allows
    recursive locking. The main problem with that is the recursive nature
    as it evades the full lockdep coverage and hides potential deadlocks.

    The rework replaces the open coded RWSEM with a percpu RWSEM and
    establishes full lockdep coverage that way.

    The bulk of the changes fix up recursive locking issues and address
    the now fully reported potential deadlocks all over the place. Some of
    these deadlocks have been observed in the RT tree, but on mainline the
    probability was low enough to hide them away."

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    cpu/hotplug: Constify attribute_group structures
    powerpc: Only obtain cpu_hotplug_lock if called by rtasd
    ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init
    cpu/hotplug: Remove unused check_for_tasks() function
    perf/core: Don't release cred_guard_mutex if not taken
    cpuhotplug: Link lock stacks for hotplug callbacks
    acpi/processor: Prevent cpu hotplug deadlock
    sched: Provide is_percpu_thread() helper
    cpu/hotplug: Convert hotplug locking to percpu rwsem
    s390: Prevent hotplug rwsem recursion
    arm: Prevent hotplug rwsem recursion
    arm64: Prevent cpu hotplug rwsem recursion
    kprobes: Cure hotplug lock ordering issues
    jump_label: Reorder hotplug lock and jump_label_lock
    perf/tracing/cpuhotplug: Fix locking order
    ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus()
    PCI: Replace the racy recursion prevention
    PCI: Use cpu_hotplug_disable() instead of get_online_cpus()
    perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode()
    x86/perf: Drop EXPORT of perf_check_microcode
    ...

    Linus Torvalds
     

30 Jun, 2017

1 commit

  • attribute_groups are not supposed to change at runtime. All functions
    working with attribute_groups provided by <linux/sysfs.h> work with const
    attribute_group.

    So mark the non-const structs as const:

    File size before:
       text    data     bss     dec     hex filename
      12582   15361      20   27963    6d3b kernel/cpu.o

    File size after adding 'const':
       text    data     bss     dec     hex filename
      12710   15265      20   27995    6d5b kernel/cpu.o

    Signed-off-by: Arvind Yadav
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: anna-maria@linutronix.de
    Cc: bigeasy@linutronix.de
    Cc: boris.ostrovsky@oracle.com
    Cc: rcochran@linutronix.de
    Link: http://lkml.kernel.org/r/f9079e94e12b36d245e7adbf67d312bc5d0250c6.1498737970.git.arvind.yadav.cs@gmail.com
    Signed-off-by: Ingo Molnar

    Arvind Yadav
     

23 Jun, 2017

1 commit

  • If a CPU goes offline, interrupts affine to the CPU are moved away. If the
    outgoing CPU is the last CPU in the affinity mask the migration code breaks
    the affinity and sets it to all online CPUs.

    This is a problem for affinity managed interrupts as CPU hotplug is often
    used for power management purposes. If the affinity is broken, the
    interrupt is no longer affine to the CPUs to which it was allocated.

    The affinity spreading allows laying out multi-queue devices in a way that
    they are assigned to a single CPU or a group of CPUs. If the last CPU goes
    offline, then the queue is no longer used, so the interrupt can be
    shut down gracefully and parked until one of the assigned CPUs comes online
    again.

    Add a graceful shutdown mechanism into the irq affinity breaking code path,
    mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.

    In the online path, scan the active interrupts for managed interrupts and,
    if the interrupt is functional and the newly online CPU is part of the
    affinity mask, restart it if it is marked MANAGED_SHUTDOWN. If the
    interrupt is already started up, try to add the CPU back to the effective
    affinity mask.

    Originally-by: Christoph Hellwig
    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: Marc Zyngier
    Cc: Michael Ellerman
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de

    Thomas Gleixner
     

13 Jun, 2017

1 commit

  • clang -Wunused-function found one remaining function that was
    apparently meant to be removed in a recent code cleanup:

    kernel/cpu.c:565:20: warning: unused function 'check_for_tasks' [-Wunused-function]

    Sebastian explained: The function became unused unintentionally, but there
    is already a failure check when a task cannot be removed from the outgoing
    CPU in the scheduler code, so bringing it back is not really giving any
    extra value.

    Fixes: 530e9b76ae8f ("cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions")
    Signed-off-by: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: Boris Ostrovsky
    Cc: Anna-Maria Gleixner
    Link: http://lkml.kernel.org/r/20170608085544.2257132-1-arnd@arndb.de
    Signed-off-by: Thomas Gleixner

    Arnd Bergmann
     

03 Jun, 2017

1 commit

  • If a custom CPU target is specified and that one is not available _or_
    can't be interrupted then the code returns to userland without dropping a
    lock, as noticed by lockdep:

    |echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target
    | ================================================
    | [ BUG: lock held when returning to user space! ]
    | ------------------------------------------------
    | bash/503 is leaving the kernel with locks still held!
    | 1 lock held by bash/503:
    | #0: (device_hotplug_lock){+.+...}, at: [] lock_device_hotplug_sysfs+0x10/0x40

    So release the lock then.

    Fixes: 757c989b9994 ("cpu/hotplug: Make target state writeable")
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de

    Sebastian Andrzej Siewior
     

26 May, 2017

1 commit

  • The CPU hotplug callbacks are not covered by lockdep versus the cpu hotplug
    rwsem.

    CPU0                                        CPU1
    cpuhp_setup_state(STATE, startup, teardown);
      cpus_read_lock();
        invoke_callback_on_ap();
          kick_hotplug_thread(ap);
          wait_for_completion();                hotplug_thread_fn()
                                                  lock(m);
                                                  do_stuff();
                                                  unlock(m);

    Lockdep does not know about this dependency and will not trigger on the
    following code sequence:

    lock(m);
    cpus_read_lock();

    Add a lockdep map and connect the initiator's lock chain with the hotplug
    thread lock chain, so potential deadlocks can be detected.

    Signed-off-by: Thomas Gleixner
    Tested-by: Paul E. McKenney
    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20170524081549.709375845@linutronix.de

    Thomas Gleixner