Eric Lee / smarc-fsl-linux-kernel

25 Sep, 2013

2 commits

9809b18fc watchdog: update watchdog_thresh properly ... Browse Code »

watchdog_tresh controls how often nmi perf event counter checks per-cpu
hrtimer_interrupts counter and blows up if the counter hasn't changed
since the last check. The counter is updated by per-cpu
watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
period which guarantees that hrtimer is scheduled 2 times per the main
period. Both hrtimer and perf event are started together when the
watchdog is enabled.

So far so good. But...

But what happens when watchdog_thresh is updated from sysctl handler?

proc_dowatchdog will set a new sampling period and hrtimer callback
(watchdog_timer_fn) will use the new value in the next round. The
problem, however, is that nobody tells the perf event that the sampling
period has changed so it is ticking with the period configured when it
has been set up.

This might result in an ear ripping dissonance between perf and hrtimer
parts if the watchdog_thresh is increased. And even worse it might lead
to KABOOM if the watchdog is configured to panic on such a spurious
lockup.

This patch fixes the issue by updating both nmi perf even counter and
hrtimers if the threshold value has changed.

The nmi one is disabled and then reinitialized from scratch. This has
an unpleasant side effect that the allocation of the new event might
fail theoretically so the hard lockup detector would be disabled for
such cpus. On the other hand such a memory allocation failure is very
unlikely because the original event is deallocated right before.

It would be much nicer if we just changed perf event period but there
doesn't seem to be any API to do that right now. It is also unfortunate
that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because
perf_event_disable removes the event atomically before it clears the
per-cpu watchdog_ev so it cannot change anything under running handler
feet.

The hrtimer is simply restarted (thanks to Don Zickus who has pointed
this out) if it is queued because we cannot rely it will fire&adopt to
the new sampling period before a new nmi event triggers (when the
treshold is decreased).

[akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
Signed-off-by: Michal Hocko
Acked-by: Don Zickus
Cc: Frederic Weisbecker
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Fabio Estevam
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2013-09-25 08:00:25 +0800
359e6fab6 watchdog: update watchdog attributes atomically ... Browse Code »

proc_dowatchdog doesn't synchronize multiple callers which might lead to
confusion when two parallel callers might confuse watchdog_enable_all_cpus
resp watchdog_disable_all_cpus (eg watchdog gets enabled even if
watchdog_thresh was set to 0 already).

This patch adds a local mutex which synchronizes callers to the sysctl
handler.

Signed-off-by: Michal Hocko
Cc: Frederic Weisbecker
Acked-by: Don Zickus
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2013-09-25 08:00:25 +0800

31 Jul, 2013

1 commit

93786a5f6 watchdog: Make it work under full dynticks ... Browse Code »

A perf event can be used without forcing the tick to
stay alive if it doesn't use a frequency but a sample
period and if it doesn't throttle (raise storm of events).

Since the lockup detector neither use a perf event frequency
nor should ever throttle due to its high period, it can now
run concurrently with the full dynticks feature.

So remove the hack that disabled the watchdog.

Signed-off-by: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: Don Zickus
Cc: Srivatsa S. Bhat
Cc: Anish Singh
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1374539466-4799-9-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar

Frederic Weisbecker
2013-07-31 04:29:15 +0800

20 Jun, 2013

3 commits

940be35ac watchdog: Boot-disable by default on full dynticks ... Browse Code »

When the watchdog runs, it prevents the full dynticks
CPUs from stopping their tick because the hard lockup
detector uses perf events internally, which in turn
rely on the periodic tick.

Since this is a rather confusing behaviour that is not
easy to track down and identify for those who want to
test CONFIG_NO_HZ_FULL, let's default disable the
watchdog on boot time when full dynticks is enabled.

The user can still enable it later on runtime using
proc or sysctl.

Reported-by: Steven Rostedt
Suggested-by: Peter Zijlstra
Signed-off-by: Frederic Weisbecker
Cc: Steven Rostedt
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Andrew Morton
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Li Zhong
Cc: Don Zickus
Cc: Srivatsa S. Bhat
Cc: Anish Singh

Frederic Weisbecker
2013-06-20 21:46:32 +0800
3c00ea82c watchdog: Rename confusing state variable ... Browse Code »

We have two very conflicting state variable names in the
watchdog:

* watchdog_enabled: This one reflects the user interface. It's
set to 1 by default and can be overriden with boot options
or sysctl/procfs interface.

* watchdog_disabled: This is the internal toggle state that
tells if watchdog threads, timers and NMI events are currently
running or not. This state mostly depends on the user settings.
It's a convenient state latch.

Now we really need to find clearer names because those
are just too confusing to encourage deep review.

watchdog_enabled now becomes watchdog_user_enabled to reflect
its purpose as an interface.

watchdog_disabled becomes watchdog_running to suggest its
role as a pure internal state.

Signed-off-by: Frederic Weisbecker
Cc: Srivatsa S. Bhat
Cc: Anish Singh
Cc: Steven Rostedt
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Borislav Petkov
Cc: Li Zhong
Cc: Don Zickus

Frederic Weisbecker
2013-06-20 21:41:18 +0800
b8900bc02 watchdog: Register / unregister watchdog kthreads on sysctl control ... Browse Code »

The user activation/deactivation of the watchdog through boot parameters
or systcl is currently implemented with a dance involving kthreads parking
and unparking methods: the threads are unconditionally registered on
boot and they park as soon as the user want the watchdog to be disabled.

This method involves a few noisy details to handle though: the watchdog
kthreads may be unparked anytime due to hotplug operations, after which
the watchdog internals have to decide to park again if it is user-disabled.

As a result the setup() and unpark() methods need to be able to request a
reparking. This is not currently supported in the kthread infrastructure
so this piece of the watchdog code only works halfway.

Besides, unparking/reparking the watchdog kthreads consume unnecessary
cputime on hotplug operations when those could be simply ignored in the
first place.

As suggested by Srivatsa, let's instead only register the watchdog
threads when they are needed. This way we don't need to think about
hotplug operations and we don't burden the CPU onlining when the watchdog
is simply disabled.

Suggested-by: Srivatsa S. Bhat
Signed-off-by: Frederic Weisbecker
Cc: Srivatsa S. Bhat
Cc: Anish Singh
Cc: Steven Rostedt
Cc: Paul E. McKenney
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Borislav Petkov
Cc: Li Zhong
Cc: Don Zickus

Frederic Weisbecker
2013-06-20 07:16:09 +0800

14 Mar, 2013

1 commit

b66a2356d watchdog: Add comments to explain the watchdog_disabled variable ... Browse Code »

The watchdog_disabled flag is a bit cryptic. However it's
usefulness is multifold. Uses are:

1. Check if smpboot_register_percpu_thread function passed.

2. Makes sure that user enables and disables the watchdog in
sequence i.e. enable watchdog->disable watchdog->enable watchdog
Unlike enable watchdog->enable watchdog which is wrong.

Signed-off-by: anish kumar
[small text cleanups]
Signed-off-by: Don Zickus
Cc: chuansheng.liu@intel.com
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1363113848-18344-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar

anish kumar
2013-03-14 15:24:05 +0800

23 Feb, 2013

1 commit

3b5d8510b Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull core locking changes from Ingo Molnar:
"The biggest change is the rwsem lock-steal improvements, both to the
assembly optimized and the spinlock based variants.

The other notable change is the clean up of the seqlock implementation
to be based on the seqcount infrastructure.

The rest is assorted smaller debuggability, cleanup and continued -rt
locking changes."

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rwsem-spinlock: Implement writer lock-stealing for better scalability
futex: Revert "futex: Mark get_robust_list as deprecated"
generic: Use raw local irq variant for generic cmpxchg
lockdep: Selftest: convert spinlock to raw spinlock
seqlock: Use seqcount infrastructure
seqlock: Remove unused functions
ntp: Make ntp_lock raw
intel_idle: Convert i7300_idle_lock to raw_spinlock
locking: Various static lock initializer fixes
lockdep: Print more info when MAX_LOCK_DEPTH is exceeded
rwsem: Implement writer lock-stealing for better scalability
lockdep: Silence warning if CONFIG_LOCKDEP isn't set
watchdog: Use local_clock for get_timestamp()
lockdep: Rename print_unlock_inbalance_bug() to print_unlock_imbalance_bug()
locking/stat: Fix a typo

Linus Torvalds
2013-02-23 11:25:09 +0800

19 Feb, 2013

1 commit

c06b4f194 watchdog: Use local_clock for get_timestamp() ... Browse Code »

The get_timestamp() function is always called with current cpu,
thus using local_clock() would be more appropriate and it makes
the code shorter and cleaner IMHO.

Signed-off-by: Namhyung Kim
Acked-by: Don Zickus
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/1356576585-28782-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar

Namhyung Kim
2013-02-19 15:42:40 +0800

08 Feb, 2013

1 commit

8bd75c77b sched/rt: Move rt specific bits into new header file ... Browse Code »

Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h

Signed-off-by: Clark Williams
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar

Clark Williams
2013-02-08 03:51:08 +0800

20 Dec, 2012

1 commit

3935e8950 watchdog: Fix disable/enable regression ... Browse Code »

Commit 8d4516904b39 ("watchdog: Fix CPU hotplug regression") causes an
oops or hard lockup when doing

echo 0 > /proc/sys/kernel/nmi_watchdog
echo 1 > /proc/sys/kernel/nmi_watchdog

and the kernel is booted with nmi_watchdog=1 (default)

Running laptop-mode-tools and disconnecting/connecting AC power will
cause this to trigger, making it a common failure scenario on laptops.

Instead of bailing out of watchdog_disable() when !watchdog_enabled we
can initialize the hrtimer regardless of watchdog_enabled status. This
makes it safe to call watchdog_disable() in the nmi_watchdog=0 case,
without the negative effect on the enabled => disabled => enabled case.

All these tests pass with this patch:
- nmi_watchdog=1
echo 0 > /proc/sys/kernel/nmi_watchdog
echo 1 > /proc/sys/kernel/nmi_watchdog

- nmi_watchdog=0
echo 0 > /sys/devices/system/cpu/cpu1/online

- nmi_watchdog=0
echo mem > /sys/power/state

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661

Cc: # v3.7
Cc: Norbert Warmuth
Cc: Joseph Salisbury
Cc: Thomas Gleixner
Signed-off-by: Bjørn Mork
Signed-off-by: Linus Torvalds

Bjørn Mork
2012-12-20 04:10:33 +0800

18 Dec, 2012

1 commit

0f34c4009 watchdog: store the watchdog sample period as a variable ... Browse Code »

Currently getting the sample period is always thru a complex
calculation: get_softlockup_thresh() * ((u64)NSEC_PER_SEC / 5).

We can store the sample period as a variable, and set it as __read_mostly
type.

Signed-off-by: liu chuansheng
Cc: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chuansheng Liu
2012-12-18 09:15:13 +0800

05 Dec, 2012

1 commit

8d4516904 watchdog: Fix CPU hotplug regression ... Browse Code »

Norbert reported:
"3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
offline CPUs. It's reproducable with a KVM guest and physical
system."

The reason is that commit bcd951cf(watchdog: Use hotplug thread
infrastructure) missed to take this into account. So the cpu offline
code gets stuck in the teardown function because it accesses non
initialized data structures.

Add a check for watchdog_enabled into that path to cure the issue.

Reported-and-tested-by: Norbert Warmuth
Tested-by: Joseph Salisbury
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1211231033230.2701@ionos
Link: http://bugs.launchpad.net/bugs/1079534
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2012-12-05 02:56:59 +0800

27 Nov, 2012

1 commit

8ffeb9b0e watchdog: using u64 in get_sample_period() ... Browse Code »

In get_sample_period(), unsigned long is not enough:

watchdog_thresh * 2 * (NSEC_PER_SEC / 5)

case1:
watchdog_thresh is 10 by default, the sample value will be: 0xEE6B2800

case2:
set watchdog_thresh is 20, the sample value will be: 0x1 DCD6 5000

In case2, we need use u64 to express the sample period. Otherwise,
changing the threshold thru proc often can not be successful.

Signed-off-by: liu chuansheng
Acked-by: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chuansheng Liu
2012-11-27 09:41:24 +0800

13 Aug, 2012

1 commit

bcd951cf1 watchdog: Use hotplug thread infrastructure ... Browse Code »

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Srivatsa S. Bhat
Cc: Rusty Russell
Reviewed-by: Paul E. McKenney
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20120716103948.563736676@linutronix.de
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2012-08-13 23:01:07 +0800

09 Aug, 2012

1 commit

300d3739e Revert "NMI watchdog: fix for lockup detector breakage on resume" ... Browse Code »

Revert commit 45226e9 (NMI watchdog: fix for lockup detector breakage
on resume) which breaks resume from system suspend on my SH7372
Mackerel board (by causing a NULL pointer dereference to happen) and
is generally wrong, because it abuses the CPU hotplug functionality
in a shamelessly blatant way.

The original issue should be addressed through appropriate syscore
resume callback instead.

Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2012-08-09 02:49:45 +0800

31 Jul, 2012

1 commit

45226e944 NMI watchdog: fix for lockup detector breakage on resume ... Browse Code »
44

On the suspend/resume path the boot CPU does not go though an
offline->online transition. This breaks the NMI detector post-resume
since it depends on PMU state that is lost when the system gets
suspended.

Fix this by forcing a CPU offline->online transition for the lockup
detector on the boot CPU during resume.

To provide more context, we enable NMI watchdog on Chrome OS. We have
seen several reports of systems freezing up completely which indicated
that the NMI watchdog was not firing for some reason.

Debugging further, we found a simple way of repro'ing system freezes --
issuing the command 'tasket 1 sh -c "echo nmilockup > /proc/breakme"'
after the system has been suspended/resumed one or more times.

With this patch in place, the system freeze result in panics, as
expected.

These panics provide a nice stack trace for us to debug the actual issue
causing the freeze.

[akpm@linux-foundation.org: fiddle with code comment]
[akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
[akpm@linux-foundation.org: fix section errors]
Signed-off-by: Sameer Nanda
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Don Zickus
Cc: Mandeep Singh Baines
Cc: Srivatsa S. Bhat
Cc: Anshuman Khandual
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sameer Nanda
2012-07-31 08:25:13 +0800

14 Jun, 2012

1 commit

a70270468 watchdog: Quiet down the boot messages ... Browse Code »

A bunch of bugzillas have complained how noisy the nmi_watchdog
is during boot-up especially with its expected failure cases
(like virt and bios resource contention).

This is my attempt to quiet them down and keep it less confusing
for the end user. What I did is print the message for cpu0 and
save it for future comparisons. If future cpus have an
identical message as cpu0, then don't print the redundant info.
However, if a future cpu has a different message, happily print
that loudly.

Before the change, you would see something like:

..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz stepping 0a
Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
... version: 2
... bit width: 40
... generic registers: 2
... value mask: 000000ffffffffff
... max period: 000000007fffffff
... fixed-purpose events: 3
... event mask: 0000000700000003
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0, Processors #1
NMI watchdog enabled, takes one hw-pmu counter.
#2
NMI watchdog enabled, takes one hw-pmu counter.
#3 Ok.
NMI watchdog enabled, takes one hw-pmu counter.
Brought up 4 CPUs
Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simplified to:

..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz stepping 0a
Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
... version: 2
... bit width: 40
... generic registers: 2
... value mask: 000000ffffffffff
... max period: 000000007fffffff
... fixed-purpose events: 3
... event mask: 0000000700000003
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
Booting Node 0, Processors #1 #2 #3 Ok.
Brought up 4 CPUs

V2: little changes based on Joe Perches' feedback
V3: printk cleanup based on Ingo's feedback; checkpatch fix
V4: keep printk as one long line
V5: Ingo fix ups

Reported-and-tested-by: Nathan Zimmer
Signed-off-by: Don Zickus
Cc: nzimmer@sgi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar

Don Zickus
2012-06-14 18:20:50 +0800

08 Apr, 2012

1 commit

5d1c0f4a8 watchdog: add check for suspended vm in softlockup detector ... Browse Code »

A suspended VM can cause spurious soft lockup warnings. To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so. When the watchdog is reset successfully, clear the guest
paused flag.

Signed-off-by: Eric B Munson
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Eric B Munson
2012-04-08 17:49:03 +0800

24 Mar, 2012

3 commits

b60f796c4 kernel/watchdog.c: add comment to watchdog() exit path ... Browse Code »

Revelation from Peter.

Cc: Peter Zijlstra
Cc: Don Zickus
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2012-03-24 07:58:32 +0800
4501980aa kernel/watchdog.c: convert to pr_foo() ... Browse Code »

It fixes some 80-col wordwrappings and adds some consistency.

Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2012-03-24 07:58:32 +0800
7a05c0f7b watchdog: make sure the watchdog thread gets CPU on loaded system ... Browse Code »

If the system is loaded while hotplugging a CPU we might end up with a
bogus hardlockup detection. This has been seen during LTP pounder test
executed in parallel with hotplug test.

The main problem is that enable_watchdog (called when CPU is brought up)
registers perf event which periodically checks per-cpu counter
(hrtimer_interrupts), updated from a hrtimer callback, but the hrtimer
is fired from the kernel thread.

This means that while we already do check for the hard lockup the kernel
thread might be sitting on the runqueue with zillions of tasks so there
is nobody to update the value we rely on and so we KABOOM.

Let's fix this by boosting the watchdog thread priority before we wake
it up rather than when it's already running. This still doesn't handle
a case where we have the same amount of high prio FIFO tasks but that
doesn't seem to be common. The current implementation doesn't handle
that case anyway so this is not worse at least.

Unfortunately, we cannot start perf counter from the watchdog thread
because we could miss a real lock up and also we cannot start the
hrtimer watchdog_enable because we there is no way (at least I don't
know any) to start a hrtimer from a different CPU.

[dzickus@redhat.com: fix compile issue with param]
Cc: Ingo Molnar
Cc: Peter Zijlstra
Reviewed-by: Mandeep Singh Baines
Signed-off-by: Michal Hocko
Signed-off-by: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2012-03-24 07:58:32 +0800

11 Feb, 2012

1 commit

86f5e6a7b watchdog: Fix code/comments mismatches ... Browse Code »

Reflect the change in the soft and hard lockup thresholds and
their relation to the frequency of the hrtimer and NMI events in
the code comments. While at it, remove references to files that
do not exist anymore.

Signed-off-by: Fernando Luis Vazquez Cao
Signed-off-by: Don Zickus
Link: http://lkml.kernel.org/r/1328827342-6253-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar

Fernando Luis Vázquez Cao
2012-02-11 22:11:33 +0800

27 Jan, 2012

1 commit

b0f4c4b32 bugs, x86: Fix printk levels for panic, softlockups and stack dumps ... Browse Code »

rsyslog will display KERN_EMERG messages on a connected
terminal. However, these messages are useless/undecipherable
for a general user.

For example, after a softlockup we get:

Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
kernel:Stack:

Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
kernel:Call Trace:

Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89

This happens because the printk levels for these messages are
incorrect. Only an informational message should be displayed on
a terminal.

I modified the printk levels for various messages in the kernel
and tested the output by using the drivers/misc/lkdtm.c kernel
modules (ie, softlockups, panics, hard lockups, etc.) and
confirmed that the console output was still the same and that
the output to the terminals was correct.

For example, in the case of a softlockup we now see the much
more informative:

Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
BUG: soft lockup - CPU4 stuck for 60s!

instead of the above confusing messages.

AFAICT, the messages no longer have to be KERN_EMERG. In the
most important case of a panic we set console_verbose(). As for
the other less severe cases the correct data is output to the
console and /var/log/messages.

Successfully tested by me using the drivers/misc/lkdtm.c module.

Signed-off-by: Prarit Bhargava
Cc: dzickus@redhat.com
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com
Signed-off-by: Ingo Molnar

Prarit Bhargava
2012-01-27 04:28:45 +0800

01 Nov, 2011

1 commit

4ff819515 watchdog: move watchdog_*_all_cpus under CONFIG_SYSCTL ... Browse Code »

Fix compilation warnings for CONFIG_SYSCTL=n:

fixed compilation warnings in case of disabled CONFIG_SYSCTL
kernel/watchdog.c:483:13: warning: `watchdog_enable_all_cpus' defined but not used
kernel/watchdog.c:500:13: warning: `watchdog_disable_all_cpus' defined but not used

these functions are static and are used only in sysctl handler, so move
them inside #ifdef CONFIG_SYSCTL too

Signed-off-by: Vasily Averin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasily Averin
2011-11-01 08:30:53 +0800

18 Sep, 2011

1 commit

cba9bd22a watchdog: Drop FIFO policy in exit path ... Browse Code »

When the watchdog thread exits it runs through the exit path with FIFO
priority. There is no point in doing so. Switch back to SCHED_NORMAL
before exiting.

Cc: Don Zickus
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1109121337461.2723@ionos
Signed-off-by: Ingo Molnar

Thomas Gleixner
2011-09-18 20:34:07 +0800

14 Aug, 2011

1 commit

18e5a45db watchdog: Make the kthreads NUMA affine ... Browse Code »

Watchdog kthreads can use kthread_create_on_node() to NUMA affine their
stack and task_struct.

Signed-off-by: Eric Dumazet
Signed-off-by: Don Zickus
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1312394344-18815-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar

Eric Dumazet
2011-08-14 17:53:06 +0800

15 Jul, 2011

1 commit

f91298709 perf, x86: P4 PMU - Introduce event alias feature ... Browse Code »

Instead of hw_nmi_watchdog_set_attr() weak function
and appropriate x86_pmu::hw_watchdog_set_attr() call
we introduce even alias mechanism which allow us
to drop this routines completely and isolate quirks
of Netburst architecture inside P4 PMU code only.

The main idea remains the same though -- to allow
nmi-watchdog and perf top run simultaneously.

Note the aliasing mechanism applies to generic
PERF_COUNT_HW_CPU_CYCLES event only because arbitrary
event (say passed as RAW initially) might have some
additional bits set inside ESCR register changing
the behaviour of event and we can't guarantee anymore
that alias event will give the same result.

P.S. Thanks a huge to Don and Steven for for testing
and early review.

Acked-by: Don Zickus
Tested-by: Steven Rostedt
Signed-off-by: Cyrill Gorcunov
CC: Ingo Molnar
CC: Peter Zijlstra
CC: Stephane Eranian
CC: Lin Ming
CC: Arnaldo Carvalho de Melo
CC: Frederic Weisbecker
Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun
Signed-off-by: Steven Rostedt

Cyrill Gorcunov
2011-07-15 05:25:04 +0800

01 Jul, 2011

3 commits

4dc0da869 perf: Add context field to perf_event ... Browse Code »

The perf_event overflow handler does not receive any caller-derived
argument, so many callers need to resort to looking up the perf_event
in their local data structure. This is ugly and doesn't scale if a
single callback services many perf_events.

Fix by adding a context parameter to perf_event_create_kernel_counter()
(and derived hardware breakpoints APIs) and storing it in the perf_event.
The field can be accessed from the callback as event->overflow_handler_context.
All callers are updated.

Signed-off-by: Avi Kivity
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar

Avi Kivity
2011-07-01 17:06:38 +0800
a8b0ca17b perf: Remove the nmi parameter from the swevent and overflow interface ... Browse Code »
1

The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

- hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
the PMI-tail (ARM etc.)
- tracepoint: nmi=0; since tracepoint could be from NMI context.
- software: nmi=[0,1]; some, like the schedule thing cannot
perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra
Cc: Michael Cree
Cc: Will Deacon
Cc: Deng-Cheng Zhu
Cc: Anton Blanchard
Cc: Eric B Munson
Cc: Heiko Carstens
Cc: Paul Mundt
Cc: David S. Miller
Cc: Frederic Weisbecker
Cc: Jason Wessel
Cc: Don Zickus
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2011-07-01 17:06:35 +0800
1880c4ae1 perf, x86: Add hw_watchdog_set_attr() in a sake of nmi-watchdog on P4 ... Browse Code »

Due to restriction and specifics of Netburst PMU we need a separated
event for NMI watchdog. In particular every Netburst event
consumes not just a counter and a config register, but also an
additional ESCR register.

Since ESCR registers are grouped upon counters (i.e. if ESCR is occupied
for some event there is no room for another event to enter until its
released) we need to pick up the "least" used ESCR (or the most available
one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen.

With this patch nmi-watchdog and perf top should be able to run simultaneously.

Signed-off-by: Cyrill Gorcunov
CC: Lin Ming
CC: Arnaldo Carvalho de Melo
CC: Frederic Weisbecker
Tested-and-reviewed-by: Don Zickus
Tested-and-reviewed-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110623124918.GC13050@sun
Signed-off-by: Ingo Molnar

Cyrill Gorcunov
2011-07-01 17:06:34 +0800

24 May, 2011

1 commit

6e9101aee watchdog: Fix non-standard prototype of get_softlockup_thresh() ... Browse Code »

This build warning slipped through:

kernel/watchdog.c:102: warning: function declaration isn't a prototype

As reported by Stephen Rothwell.

Also address an unused variable warning that GCC 4.6.0 reports:
we cannot do anything about failed watchdog ops during CPU hotplug
(it's not serious enough to return an error from the notifier),
so ignore them.

Reported-by: Stephen Rothwell
Cc: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/20110524134129.8da27016.sfr@canb.auug.org.au
Signed-off-by: Ingo Molnar
LKML-Reference:

Ingo Molnar
2011-05-24 11:53:39 +0800

23 May, 2011

4 commits

4eec42f39 watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh ... Browse Code »

Before the conversion of the NMI watchdog to perf event, the
watchdog timeout was 5 seconds. Now it is 60 seconds. For my
particular application, netbooks, 5 seconds was a better
timeout. With a short timeout, we catch faults earlier and are
able to send back a panic. With a 60 second timeout, the user is
unlikely to wait and will instead hit the power button, causing
us to lose the panic info.

This change configures the NMI period to watchdog_thresh and
sets the softlockup_thresh to watchdog_thresh * 2. In addition,
watchdog_thresh was reduced to 10 seconds as suggested by Ingo
Molnar.

Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-4-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar
LKML-Reference:

Mandeep Singh Baines
2011-05-23 17:58:59 +0800
586692a5a watchdog: Disable watchdog when thresh is zero ... Browse Code »

This restores the previous behavior of softlock_thresh.

Currently, setting watchdog_thresh to zero causes the watchdog
kthreads to consume a lot of CPU.

In addition, the logic of proc_dowatchdog_thresh and
proc_dowatchdog_enabled has been factored into proc_dowatchdog.

Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-3-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar
LKML-Reference:

Mandeep Singh Baines
2011-05-23 17:58:59 +0800
e04ab2bc4 watchdog: Only disable/enable watchdog if neccessary ... Browse Code »

Don't take any action on an unsuccessful write to /proc.

Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-2-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar

Mandeep Singh Baines
2011-05-23 17:58:58 +0800
824c6b7f6 watchdog: Fix rounding bug in get_sample_period() ... Browse Code »

In get_sample_period(), softlockup_thresh is integer divided by
5 before the multiplication by NSEC_PER_SEC. This results in
softlockup_thresh being rounded down to the nearest integer
multiple of 5.

For example, a softlockup_thresh of 4 rounds down to 0.

Signed-off-by: Mandeep Singh Baines
Cc: Marcin Slusarz
Cc: Don Zickus
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1306127423-3347-1-git-send-email-msb@chromium.org
Signed-off-by: Ingo Molnar

Mandeep Singh Baines
2011-05-23 17:58:58 +0800

29 Apr, 2011

1 commit

1409f141a kernel/watchdog.c: disable nmi perf event in the error path of enabling watchdog ... Browse Code »

In corner cases where softlockup watchdog is not setup successfully, the
relevant nmi perf event for hardlockup watchdog could be disabled, then
the status of the underlying hardware remains unchanged.

Also, if the kthread doesn't start then the hrtimer won't run and the
hardlockup detector will falsely fire.

Signed-off-by: Hillf Danton
Signed-off-by: Don Zickus
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hillf Danton
2011-04-29 02:28:21 +0800

23 Mar, 2011

2 commits

f99a99330 kernel/watchdog.c: always return NOTIFY_OK during cpu up/down events ... Browse Code »

This patch addresses a couple of problems. One was the case when the
hardlockup failed to start, it also failed to start the softlockup. There
were valid cases when the hardlockup shouldn't start and that shouldn't
block the softlockup (no lapic, bios controls perf counters).

The second problem was when the hardlockup failed to start on boxes (from
a no lapic or bios controlled perf counter case), it reported failure to
the cpu notifier chain. This blocked the notifier from continuing to
start other more critical pieces of cpu bring-up (in our case based on a
2.6.32 fork, it was the mce). As a result, during soft cpu online/offline
testing, the system would panic when a cpu was offlined because the cpu
notifier would succeed in processing a watchdog disable cpu event and
would panic in the mce case as a result of un-initialized variables from a
never executed cpu up event.

I realized the hardlockup/softlockup cases are really just debugging aids
and should never impede the progress of a cpu up/down event. Therefore I
modified the code to always return NOTIFY_OK and instead rely on printks
to inform the user of problems.

Signed-off-by: Don Zickus
Acked-by: Peter Zijlstra
Reviewed-by: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Don Zickus
2011-03-23 08:44:12 +0800
fef2c9bc1 kernel/watchdog.c: allow hardlockup to panic by default ... Browse Code »

When a cpu is considered stuck, instead of limping along and just printing
a warning, it is sometimes preferred to just panic, let kdump capture the
vmcore and reboot. This gets the machine back into a stable state quickly
while saving the info that got it into a stuck state to begin with.

Add a Kconfig option to allow users to set the hardlockup to panic
by default. Also add in a 'nmi_watchdog=nopanic' to override this.

[akpm@linux-foundation.org: fix strncmp length]
Signed-off-by: Don Zickus
Acked-by: Peter Zijlstra
Reviewed-by: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Don Zickus
2011-03-23 08:44:12 +0800

10 Feb, 2011

1 commit

5651f7f47 watchdog, nmi: Lower the severity of error messages ... Browse Code »

During boot if the hardlockup detector fails to initialize, it
complains very loudly. Some failures should be expected under
certain situations, ie no lapics, or resource in-use. Tone
those error messages down a bit. Keep the rest at a high level.

Reported-by: Paul Bolle
Tested-by: Paul Bolle
Signed-off-by: Don Zickus
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Don Zickus
2011-02-10 20:21:59 +0800