23 May, 2018

1 commit

  • commit 5596fe34495cf0f645f417eb928ef224df3e3cb4 upstream.

    for_each_cpu() unintuitively reports CPU0 as set independent of the actual
    cpumask content on UP kernels. This causes an unexpected PIT interrupt
    storm on a UP kernel running in an SMP virtual machine on Hyper-V, and as
    a result, the virtual machine can suffer from a strange random delay of 1~20
    minutes during boot-up, and sometimes it can hang forever.

    Protect it by checking whether the cpumask is empty before entering the
    for_each_cpu() loop.

    [ tglx: Use !IS_ENABLED(CONFIG_SMP) instead of #ifdeffery ]

    Signed-off-by: Dexuan Cui
    Signed-off-by: Thomas Gleixner
    Cc: Josh Poulson
    Cc: "Michael Kelley (EOSG)"
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: stable@vger.kernel.org
    Cc: Rakib Mullick
    Cc: Jork Loeser
    Cc: Greg Kroah-Hartman
    Cc: Andrew Morton
    Cc: KY Srinivasan
    Cc: Linus Torvalds
    Cc: Alexey Dobriyan
    Cc: Dmitry Vyukov
    Link: https://lkml.kernel.org/r/KL1P15301MB000678289FE55BA365B3279ABF990@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM
    Link: https://lkml.kernel.org/r/KL1P15301MB0006FA63BC22BEB64902EAA0BF930@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     

13 Jun, 2017

1 commit


21 Feb, 2017

1 commit

  • Pull timer updates from Thomas Gleixner:
    "Nothing exciting, just the usual pile of fixes, updates and cleanups:

    - A bunch of clocksource driver updates

    - Removal of CONFIG_TIMER_STATS and the related /proc file

    - More posix timer slim down work

    - A scalability enhancement in the tick broadcast code

    - Math cleanups"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    hrtimer: Catch invalid clockids again
    math64, tile: Fix build failure
    clocksource/drivers/arm_arch_timer:: Mark cyclecounter __ro_after_init
    timerfd: Protect the might cancel mechanism proper
    timer_list: Remove useless cast when printing
    time: Remove CONFIG_TIMER_STATS
    clocksource/drivers/arm_arch_timer: Work around Hisilicon erratum 161010101
    clocksource/drivers/arm_arch_timer: Introduce generic errata handling infrastructure
    clocksource/drivers/arm_arch_timer: Remove fsl-a008585 parameter
    clocksource/drivers/arm_arch_timer: Add dt binding for hisilicon-161010101 erratum
    clocksource/drivers/ostm: Add renesas-ostm timer driver
    clocksource/drivers/ostm: Document renesas-ostm timer DT bindings
    clocksource/drivers/tcb_clksrc: Use 32 bit tcb as sched_clock
    clocksource/drivers/gemini: Add driver for the Cortina Gemini
    clocksource: add DT bindings for Cortina Gemini
    clockevents: Add a clkevt-of mechanism like clksrc-of
    tick/broadcast: Reduce lock cacheline contention
    timers: Omit POSIX timer stuff from task_struct when disabled
    x86/timer: Make delay() work during early bootup
    delay: Add explanation of udelay() inaccuracy
    ...

    Linus Torvalds
     

13 Feb, 2017

1 commit

  • tick_broadcast_lock is taken from interrupt context, but the following call
    chain takes the lock without disabling interrupts:

    [ 12.703736] _raw_spin_lock+0x3b/0x50
    [ 12.703738] tick_broadcast_control+0x5a/0x1a0
    [ 12.703742] intel_idle_cpu_online+0x22/0x100
    [ 12.703744] cpuhp_invoke_callback+0x245/0x9d0
    [ 12.703752] cpuhp_thread_fun+0x52/0x110
    [ 12.703754] smpboot_thread_fn+0x276/0x320

    So the following deadlock can happen:

    lock(tick_broadcast_lock);
    <Interrupt>
      lock(tick_broadcast_lock);

    intel_idle_cpu_online() is the only place which violates the calling
    convention of tick_broadcast_control(). This was caused by the removal of
    the smp function call in the course of the cpu hotplug rework.

    Instead of slapping local_irq_disable/enable() at the call site, we can
    relax the calling convention and handle it in the core code, which makes
    the whole machinery more robust.

    Fixes: 29d7bbada98e ("intel_idle: Remove superfluous SMP fuction call")
    Reported-by: Gabriel C
    Signed-off-by: Mike Galbraith
    Cc: Ruslan Ruslichenko
    Cc: Jiri Slaby
    Cc: Greg KH
    Cc: Borislav Petkov
    Cc: lwn@lwn.net
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Anna-Maria Gleixner
    Cc: Sebastian Siewior
    Cc: stable
    Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@gmx.de
    Signed-off-by: Thomas Gleixner

    Mike Galbraith
     

04 Feb, 2017

1 commit

  • It was observed that on an Intel x86 system without the ARAT (Always
    running APIC timer) feature and with fairly large number of CPUs as
    well as CPUs coming in and out of intel_idle frequently, the lock
    contention on the tick_broadcast_lock can become significant.

    To reduce contention, the lock is put into its own cacheline and all
    the cpumask_var_t variables are put into the __read_mostly section.

    Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
    16-core 32-thread Nehalem system, the performance number improved
    from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
    a 4.9.6 kernel. This is a 3.4% improvement.

    Signed-off-by: Waiman Long
    Cc: "Peter Zijlstra (Intel)"
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com
    Signed-off-by: Thomas Gleixner

    Waiman Long
     

26 Dec, 2016

1 commit

    ktime is a union because the initial implementation stored the time in
    scalar nanoseconds on 64-bit machines and in an endianness-optimized
    timespec variant on 32-bit machines. The Y2038 cleanup removed the
    timespec variant and switched everything to scalar nanoseconds. The
    union remained, but became completely pointless.

    Get rid of the union and just keep ktime_t as simple typedef of type s64.

    The conversion was done with coccinelle and some manual mopping up.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
     

15 Dec, 2016

1 commit

  • When a dysfunctional timer, e.g. dummy timer, is installed, the tick core
    tries to setup the broadcast timer.

    If no broadcast device is installed, the kernel crashes with a NULL pointer
    dereference in tick_broadcast_setup_oneshot() because the function has no
    sanity check.

    Reported-by: Mason
    Signed-off-by: Thomas Gleixner
    Cc: Mark Rutland
    Cc: Anna-Maria Gleixner
    Cc: Richard Cochran
    Cc: Sebastian Andrzej Siewior
    Cc: Daniel Lezcano
    Cc: Peter Zijlstra
    Cc: Sebastian Frias
    Cc: Thibaud Cornic
    Cc: Robin Murphy
    Link: http://lkml.kernel.org/r/1147ef90-7877-e4d2-bb2b-5c4fa8d3144b@free.fr

    Thomas Gleixner
     

14 Jul, 2015

1 commit


11 Jul, 2015

1 commit


08 Jul, 2015

9 commits

  • Andriy reported that on a virtual machine the warning about negative
    expiry time in the clock events programming code triggered:

    hpet: hpet0 irq 40 for MSI
    hpet: hpet1 irq 41 for MSI
    Switching to clocksource hpet
    WARNING: at kernel/time/clockevents.c:239

    [] clockevents_program_event+0xdb/0xf0
    [] tick_handle_periodic_broadcast+0x41/0x50
    [] timer_interrupt+0x15/0x20

    When the second hpet is installed as a per cpu timer the broadcast
    event is no longer required and stopped, which sets the next_evt of
    the broadcast device to KTIME_MAX.

    If after that a spurious interrupt happens on the broadcast device,
    then the current code blindly handles it and tries to reprogram the
    broadcast device afterwards, which adds the period to
    next_evt. KTIME_MAX + period results in a negative expiry value
    causing the WARN_ON in the clockevents code to trigger.

    Add a proper check for the state of the broadcast device into the
    interrupt handler and return if the interrupt is spurious.

    [ Folded in pointer fix from Sudeep ]

    Reported-by: Andriy Gapon
    Signed-off-by: Thomas Gleixner
    Cc: Sudeep Holla
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Link: http://lkml.kernel.org/r/20150705205221.802094647@linutronix.de

    Thomas Gleixner
     
  • If the current cpu is the one which has the hrtimer based broadcast
    queued then we better return busy immediately instead of going through
    loops and hoops to figure that out.

    [ Split out from a larger combo patch ]

    Tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • Tell the idle code not to go deep if the broadcast IPI is about to
    arrive.

    [ Split out from a larger combo patch ]

    Tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • If the system is in periodic mode and the broadcast device is hrtimer
    based, return busy as we have no proper handling for this.

    [ Split out from a larger combo patch ]

    Tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • We need to check more than the periodic mode for proper operation in
    all runtime combinations. To avoid code duplication move the check
    into the enter state handling.

    No functional change.

    [ Split out from a larger combo patch ]

    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • Add a check for an installed broadcast device to the oneshot control
    function and return busy if not.

    [ Split out from a larger combo patch ]

    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • Currently the broadcast busy check, which prevents the idle code from
    going into deep idle, works only in one shot mode.

    If NOHZ and HIGHRES are off (config or command line) there is no
    sanity check at all, so under certain conditions cpus are allowed to
    go into deep idle, where the local timer stops, and are not woken up
    again because there is no broadcast timer installed or a hrtimer based
    broadcast device is not evaluated.

    Move tick_broadcast_oneshot_control() into the common code and provide
    proper subfunctions for the various config combinations.

    The common check in tick_broadcast_oneshot_control() is for the C3STOP
    misfeature flag of the local clock event device. If it's not set, idle
    can proceed. If set, further checks are necessary.

    Provide checks for the trivial cases:

    - If broadcast is disabled in the config, then return busy

    - If oneshot mode (NOHZ/HIGHRES) is disabled in the config, return
    busy if the broadcast device is hrtimer based.

    - If oneshot mode is enabled in the config call the original
    tick_broadcast_oneshot_control() function. That function needs
    extra checks which will be implemented in separate patches.

    [ Split out from a larger combo patch ]

    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • The broadcast code shuts down the local clock event unconditionally
    even if no broadcast device is installed or if the broadcast device is
    hrtimer based.

    Add proper sanity checks.

    [ Split out from a larger combo patch ]

    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     
  • The hrtimer based broadcast vehicle can cause a hrtimer recursion
    which went unnoticed until we changed the hrtimer expiry code to keep
    track of the currently running timer.

    local_timer_interrupt()
      local_handler()
        hrtimer_interrupt()
          expire_hrtimers()
            broadcast_hrtimer()
              send_ipis()
                local_handler()
                  hrtimer_interrupt()
                    ....

    Solution is simple: Prevent the local handler call from the broadcast
    code when the broadcast 'device' is hrtimer based.

    [ Split out from a larger combo patch ]

    Tested-by: Sudeep Holla
    Signed-off-by: Thomas Gleixner
    Cc: Suzuki Poulose
    Cc: Lorenzo Pieralisi
    Cc: Catalin Marinas
    Cc: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Ingo Molnar
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1507070929360.3916@nanos

    Thomas Gleixner
     

02 Jun, 2015

2 commits


05 May, 2015

2 commits

  • Simplify the oneshot logic by avoiding the reprogramming loops. That
    also allows calling the cpu local handler outside of the
    broadcast_lock held region.

    Tested-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • With the removal of the hrtimer softirq the switch to highres/nohz
    mode happens in the tick interrupt. That leads to a livelock when the
    per cpu event handler is directly called from the broadcast handler
    under broadcast lock because broadcast lock needs to be taken for the
    highres/nohz switch as well.

    Solve this by calling the cpu local handler outside the broadcast_lock
    held region.

    Fixes: c6eb3f70d448 "hrtimer: Get rid of hrtimer softirq"
    Reported-and-tested-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

03 Apr, 2015

3 commits

  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call. We are way better off to have explicit
    calls instead of this monstrosity.

    Split out the cleanup function for a dead cpu and invoke it
    directly from the cpu down code. Make it conditional on
    CPU_HOTPLUG as well.

    Temporary change, will be refined in the future.

    Signed-off-by: Thomas Gleixner
    [ Rebased, added clockevents_notify() removal ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1735025.raBZdQHM3m@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call. We are way better off to have explicit
    calls instead of this monstrosity.

    Split out the broadcast oneshot control into a separate function
    and provide inline helpers. Switch clockevents_notify() over.
    This will go away once all callers are converted.

    This also gets rid of the nested locking of clockevents_lock and
    broadcast_lock. The broadcast oneshot control functions do not
    require clockevents_lock. Only the managing functions
    (setup/shutdown/suspend/resume of the broadcast device) require
    clockevents_lock.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Rafael J. Wysocki
    Cc: Alexandre Courbot
    Cc: Daniel Lezcano
    Cc: Len Brown
    Cc: Peter Zijlstra
    Cc: Stephen Warren
    Cc: Thierry Reding
    Cc: Tony Lindgren
    Link: http://lkml.kernel.org/r/13000649.8qZuEDV0OA@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • clockevents_notify() is a leftover from the early design of the
    clockevents facility. It's really not a notification mechanism,
    it's a multiplex call. We are way better off to have explicit
    calls instead of this monstrosity.

    Split out the broadcast control into a separate function and
    provide inline helpers. Switch clockevents_notify() over. This
    will go away once all callers are converted.

    This also gets rid of the nested locking of clockevents_lock and
    broadcast_lock. The broadcast control functions do not require
    clockevents_lock. Only the managing functions
    (setup/shutdown/suspend/resume of the broadcast device) require
    clockevents_lock.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Rafael J. Wysocki
    Cc: Daniel Lezcano
    Cc: Len Brown
    Cc: Peter Zijlstra
    Cc: Tony Lindgren
    Link: http://lkml.kernel.org/r/8086559.ttsuS0n1Xr@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

02 Apr, 2015

1 commit

  • It was found when doing a hotplug stress test on POWER, that the
    machine either hit softlockups or rcu_sched stall warnings. The
    issue was traced to commit:

    7cba160ad789 ("powernv/cpuidle: Redesign idle states management")

    which exposed the cpu_down() race with hrtimer based broadcast mode:

    5d1638acb9f6 ("tick: Introduce hrtimer based broadcast")

    The race is the following:

    Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
    before it is taken down.

    CPU0                                CPU1

    cpu_down()                          take_cpu_down()
                                        disable_interrupts()

    cpu_die()

    while (CPU1 != CPU_DEAD) {
        msleep(100);
        switch_to_idle();
        stop_cpu_timer();
        schedule_broadcast();
    }

    tick_cleanup_cpu_dead()
        take_over_broadcast()

    So after CPU1 disabled interrupts it cannot handle the broadcast
    hrtimer anymore, so CPU0 will be stuck forever.

    Fix this by explicitly taking over broadcast duty before cpu_die().

    This is a temporary workaround. What we really want is a callback
    in the clockevent device which allows us to do that from the dying
    CPU by pushing the hrtimer onto a different cpu. That might involve
    an IPI and is definitely more complex than this immediate fix.

    Changelog was picked up from:

    https://lkml.org/lkml/2015/2/16/213

    Suggested-by: Thomas Gleixner
    Tested-by: Nicolas Pitre
    Signed-off-by: Preeti U. Murthy
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: mpe@ellerman.id.au
    Cc: nicolas.pitre@linaro.org
    Cc: peterz@infradead.org
    Cc: rjw@rjwysocki.net
    Fixes: http://linuxppc.10917.n7.nabble.com/offlining-cpus-breakage-td88619.html
    Link: http://lkml.kernel.org/r/20150330092410.24979.59887.stgit@preeti.in.ibm.com
    [ Merged it to the latest timer tree, renamed the callback, tidied up the changelog. ]
    Signed-off-by: Ingo Molnar

    Preeti U Murthy
     

01 Apr, 2015

2 commits

  • Xen calls on every cpu into tick_resume() which is just wrong.
    tick_resume() is for the syscore global suspend/resume
    invocation. What XEN really wants is a per cpu local resume
    function.

    Provide a tick_resume_local() function and use it in XEN.

    Also provide a complementary tick_suspend_local() and modify
    tick_unfreeze() and tick_freeze(), respectively, to use the
    new local tick resume/suspend functions.

    Signed-off-by: Thomas Gleixner
    [ Combined two patches, rebased, modified subject/changelog. ]
    Signed-off-by: Rafael J. Wysocki
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Konrad Rzeszutek Wilk
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan
    [ Merged to latest timers/core. ]
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Solely used in tick-broadcast.c and the return value is
    hardcoded 0. Make it static and void.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Rafael J. Wysocki
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1689058.QkHYDJSRKu@vostro.rjw.lan
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

27 Mar, 2015

2 commits

  • 'enum clock_event_mode' is used for two purposes today:

    - to pass mode to the driver of clockevent device::set_mode().

    - for managing state of the device for clockevents core.

    For supporting new modes/states we have moved away from the
    legacy set_mode() callback to new per-mode/state callbacks. New
    modes/states shouldn't be exposed to the legacy (now OBSOLETE)
    callbacks and so we shouldn't add new states to 'enum
    clock_event_mode'.

    Let's have separate enums for the two use cases mentioned above.
    Keep using the earlier enum for legacy set_mode() callback and
    mark it OBSOLETE. And add another enum to clearly specify the
    possible states of a clockevent device.

    This also renames the newly added per-mode callbacks to reflect
    state changes.

    We haven't got rid of 'mode' member of 'struct
    clock_event_device' as it is used by some of the clockevent
    drivers and it would automatically die down once we migrate
    those drivers to the new interface. It ('mode') is only updated
    now for the drivers using the legacy interface.

    Suggested-by: Peter Zijlstra
    Suggested-by: Ingo Molnar
    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/b6b0143a8a57bd58352ad35e08c25424c879c0cb.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     
  • Upcoming patch will redefine possible states of a clockevent
    device. The RESUME mode is a special case only for tick's
    clockevent devices. In future it can be replaced by ->resume()
    callback already available for clockevent devices.

    Let's handle it separately so that clockevents_set_mode() only
    handles states valid across all devices. This also renames
    set_mode_resume() to tick_resume() to make it more explicit.

    Signed-off-by: Viresh Kumar
    Acked-by: Peter Zijlstra
    Cc: Daniel Lezcano
    Cc: Frederic Weisbecker
    Cc: Kevin Hilman
    Cc: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: linaro-kernel@lists.linaro.org
    Cc: linaro-networking@linaro.org
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/c1b0112410870f49e7bf06958e1483eac6c15e20.1425037853.git.viresh.kumar@linaro.org
    Signed-off-by: Ingo Molnar

    Viresh Kumar
     

27 Aug, 2014

1 commit

  • Convert uses of __get_cpu_var for creating an address from a percpu
    offset to this_cpu_ptr.

    The two cases where get_cpu_var is used to actually access a percpu
    variable are changed to use this_cpu_read/raw_cpu_read.

    Reviewed-by: Thomas Gleixner
    Signed-off-by: Christoph Lameter
    Signed-off-by: Tejun Heo

    Christoph Lameter
     

02 Apr, 2014

1 commit

  • Pull timer changes from Thomas Gleixner:
    "This assorted collection provides:

    - A new timer based timer broadcast feature for systems which do not
    provide a global accessible timer device. That allows those
    systems to put CPUs into deep idle states where the per cpu timer
    device stops.

    - A few NOHZ_FULL related improvements to the timer wheel

    - The usual updates to timer devices found in ARM SoCs

    - Small improvements and updates all over the place"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
    tick: Remove code duplication in tick_handle_periodic()
    tick: Fix spelling mistake in tick_handle_periodic()
    x86: hpet: Use proper destructor for delayed work
    workqueue: Provide destroy_delayed_work_on_stack()
    clocksource: CMT, MTU2, TMU and STI should depend on GENERIC_CLOCKEVENTS
    timer: Remove code redundancy while calling get_nohz_timer_target()
    hrtimer: Rearrange comments in the order struct members are declared
    timer: Use variable head instead of &work_list in __run_timers()
    clocksource: exynos_mct: silence a static checker warning
    arm: zynq: Add support for cpufreq
    arm: zynq: Don't use arm_global_timer with cpufreq
    clocksource/cadence_ttc: Overhaul clocksource frequency adjustment
    clocksource/cadence_ttc: Call clockevents_update_freq() with IRQs enabled
    clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI
    sh: Remove Kconfig entries for TMU, CMT and MTU2
    ARM: shmobile: Remove CMT, TMU and STI Kconfig entries
    clocksource: armada-370-xp: Use atomic access for shared registers
    clocksource: orion: Use atomic access for shared registers
    clocksource: timer-keystone: Delete unnecessary variable
    clocksource: timer-keystone: introduce clocksource driver for Keystone
    ...

    Linus Torvalds
     

14 Feb, 2014

1 commit

  • AMD systems which use the C1E workaround in the amd_e400_idle routine
    trigger the WARN_ON_ONCE in the broadcast code when onlining a CPU.

    The reason is that the idle routine of those AMD systems switches the
    cpu into forced broadcast mode early on before the newly brought up
    CPU can switch over to high resolution / NOHZ mode. The timer related
    CPU1 bringup looks like this:

    clockevent_register_device(local_apic);
    tick_setup(local_apic);
    ...
    idle()
      tick_broadcast_on_off(FORCE);
      tick_broadcast_oneshot_control(ENTER)
        cpumask_set(cpu, broadcast_oneshot_mask);
      halt();

    Now the broadcast interrupt on CPU0 sets CPU1 in the
    broadcast_pending_mask and wakes CPU1. So CPU1 continues:

    local_apic_timer_interrupt()
      tick_handle_periodic();
        softirq()
          tick_init_highres();
            cpumask_clr(cpu, broadcast_oneshot_mask);

    tick_broadcast_oneshot_control(ENTER)
      WARN_ON(cpumask_test(cpu, broadcast_pending_mask));

    So while we remove CPU1 from the broadcast_oneshot_mask when we switch
    over to highres mode, we do not clear the pending bit, which then
    triggers the warning when we go back to idle.

    The reason why this is only visible on C1E affected AMD systems is
    that the other machines enter the deep sleep states via
    acpi_idle/intel_idle and exit the broadcast mode before executing the
    remote triggered local_apic_timer_interrupt. So the pending bit is
    already cleared when the switch over to highres mode is clearing the
    oneshot mask.

    The solution is simple: Clear the pending bit together with the mask
    bit when we switch over to highres mode.

    Stanislaw came up independently with the same patch by enforcing the
    C1E workaround and debugging the fallout. I picked mine, because mine
    has a changelog :)

    Reported-by: poma
    Debugged-by: Stanislaw Gruszka
    Signed-off-by: Thomas Gleixner
    Cc: Olaf Hering
    Cc: Dave Jones
    Cc: Justin M. Forbes
    Cc: Josh Boyer
    Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1402111434180.21991@ionos.tec.linutronix.de
    Cc: stable@vger.kernel.org # 3.10+
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

07 Feb, 2014

3 commits

  • On some architectures, in certain CPU deep idle states the local timers stop.
    An external clock device is used to wake up these CPUs. The kernel support for the
    wakeup of these CPUs is provided by the tick broadcast framework by using the
    external clock device as the wakeup source.

    However not all implementations of architectures provide such an external
    clock device. This patch includes support in the broadcast framework to handle
    the wakeup of the CPUs in deep idle states on such systems by queuing a hrtimer
    on one of the CPUs, which is meant to handle the wakeup of CPUs in deep idle states.

    This patchset introduces a pseudo clock device which can be registered by the
    archs as tick_broadcast_device in the absence of a real external clock
    device. Once registered, the broadcast framework will work as is for these
    architectures as long as the archs take care of the BROADCAST_ENTER
    notification failing for one of the CPUs. This CPU is made the standby CPU to
    handle wakeup of the CPUs in deep idle and it *must not enter deep idle states*.

    The CPU with the earliest wakeup is chosen to be this CPU. Hence the
    standby CPU dynamically moves around, and so does the hrtimer which is queued
    to trigger at the next earliest wakeup time. This is consistent with the case where
    an external clock device is present. The smp affinity of this clock device is
    set to the CPU with the earliest wakeup. This patchset also handles hotplug of
    the standby CPU by moving the hrtimer onto the CPU handling the CPU_DEAD
    notification.

    Originally-from: Thomas Gleixner
    Signed-off-by: Preeti U Murthy
    Cc: deepthi@linux.vnet.ibm.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: fweisbec@gmail.com
    Cc: paulus@samba.org
    Cc: srivatsa.bhat@linux.vnet.ibm.com
    Cc: svaidy@linux.vnet.ibm.com
    Cc: peterz@infradead.org
    Cc: benh@kernel.crashing.org
    Cc: rafael.j.wysocki@intel.com
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: http://lkml.kernel.org/r/20140207080632.17187.80532.stgit@preeti.in.ibm.com
    Signed-off-by: Thomas Gleixner

    Preeti U Murthy
     
  • The broadcast framework can also be used by archs which do not have an
    external clock device. In that case, one of the CPUs needs
    to handle the broadcasting of wakeup IPIs to the CPUs in deep idle. As a
    result its local timers must remain functional at all times. For such
    a CPU, the BROADCAST_ENTER notification has to fail indicating that its clock
    device cannot be shutdown. To make way for this support, change the return
    type of tick_broadcast_oneshot_control() and hence clockevents_notify() to
    indicate such scenarios.

    Signed-off-by: Preeti U Murthy
    Cc: deepthi@linux.vnet.ibm.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: fweisbec@gmail.com
    Cc: paulus@samba.org
    Cc: srivatsa.bhat@linux.vnet.ibm.com
    Cc: svaidy@linux.vnet.ibm.com
    Cc: peterz@infradead.org
    Cc: benh@kernel.crashing.org
    Cc: rafael.j.wysocki@intel.com
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: http://lkml.kernel.org/r/20140207080606.17187.78306.stgit@preeti.in.ibm.com
    Signed-off-by: Thomas Gleixner

    Preeti U Murthy
     
  • We can identify the broadcast device in the core and serialize all
    callers including interrupts on a different CPU against the update.
    Also, disabling interrupts is moved into the core allowing callers to
    leave interrupts enabled when calling clockevents_update_freq().

    Signed-off-by: Soren Brinkmann
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Soeren Brinkmann
    Cc: Daniel Lezcano
    Cc: Michal Simek
    Link: http://lkml.kernel.org/r/1391466877-28908-2-git-send-email-soren.brinkmann@xilinx.com
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

03 Dec, 2013

1 commit

  • A few functions use remote per CPU access APIs when they
    deal with local values.

    Just do the right conversion to improve performance, code
    readability and debug checks.

    While at it, let's extend some of these function names with *_this_cpu()
    suffix in order to display their purpose more clearly.

    Signed-off-by: Frederic Weisbecker
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Steven Rostedt

    Frederic Weisbecker
     

02 Oct, 2013

1 commit

  • On most ARM systems the per-cpu clockevents are truly per-cpu in
    the sense that they can't be controlled on any other CPU besides
    the CPU that they interrupt. If one of these clockevents were to
    become a broadcast source we will run into a lot of trouble
    because the broadcast source is enabled on the first CPU to go
    into deep idle (if that CPU suffers from FEAT_C3_STOP) and that
    could be a different CPU than what the clockevent is interrupting
    (or even worse the CPU that the clockevent interrupts could be
    offline).

    Theoretically it's possible to support per-cpu clockevents as the
    broadcast source but so far we haven't needed this and supporting
    it is rather complicated. Let's just deny the possibility for now
    until this becomes a reality (let's hope it never does!).

    Signed-off-by: Soren Brinkmann
    Signed-off-by: Daniel Lezcano
    Acked-by: Michal Simek

    Soren Brinkmann
     

12 Jul, 2013

1 commit

  • On ARM systems the dummy clockevent is registered with the cpu
    hotplug notifier chain before any other per-cpu clockevent. This
    has the side-effect of causing the dummy clockevent to be
    registered first in every hotplug sequence. Because the dummy is
    first, we'll try to turn the broadcast source on but the code in
    tick_device_uses_broadcast() assumes the broadcast source is in
    periodic mode and calls tick_broadcast_start_periodic()
    unconditionally.

    On boot this isn't a problem because we typically haven't
    switched into oneshot mode yet (if at all). During hotplug, if
    the broadcast source isn't in periodic mode we'll replace the
    broadcast oneshot handler with the broadcast periodic handler and
    start emulating oneshot mode when we shouldn't. Due to the way
    the broadcast oneshot handler programs the next_event it's
    possible for it to contain KTIME_MAX and cause us to hang the
    system when the periodic handler tries to program the next tick.
    Fix this by using the appropriate function to start the broadcast
    source.

    Reported-by: Stephen Warren
    Tested-by: Stephen Warren
    Signed-off-by: Stephen Boyd
    Cc: Mark Rutland
    Cc: Marc Zyngier
    Cc: ARM kernel mailing list
    Cc: John Stultz
    Cc: Joseph Lo
    Link: http://lkml.kernel.org/r/20130711140059.GA27430@codeaurora.org
    Signed-off-by: Thomas Gleixner

    Stephen Boyd
     

05 Jul, 2013

1 commit