25 Feb, 2010
9 commits
-
Under TREE_PREEMPT_RCU, print_other_cpu_stall() invokes
rcu_print_task_stall() with the root rcu_node structure's ->lock
held, and rcu_print_task_stall() acquires that same lock for
self-deadlock. Fix this by removing the lock acquisition from
rcu_print_task_stall(), and making all callers acquire the lock
instead.Tested-by: John Kacur
Tested-by: Thomas Gleixner
Located-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
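A rough sketch of the locking convention described above. Function and field names follow the 2.6.33-era TREE_PREEMPT_RCU code, but the bodies are illustrative, not the exact patch:

static void rcu_print_task_stall(struct rcu_node *rnp)
{
	struct task_struct *t;

	/* The caller now holds rnp->lock; no acquisition here anymore. */
	if (rcu_preempted_readers(rnp))
		list_for_each_entry(t, &rnp->blocked_tasks[rnp->gpnum & 0x1],
				    rcu_node_entry)
			printk(" P%d", t->pid);
}

static void print_other_cpu_stall(struct rcu_state *rsp)
{
	unsigned long flags;
	struct rcu_node *rnp = rcu_get_root(rsp);

	raw_spin_lock_irqsave(&rnp->lock, flags);	/* lock moved to the caller */
	rcu_print_task_stall(rnp);
	raw_spin_unlock_irqrestore(&rnp->lock, flags);
}

-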
The spinlocks in rcutree need to be real spinlocks in
preempt-rt. Convert them to raw_spinlocks.
Signed-off-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
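An illustrative before/after of the conversion, assuming the standard raw_spinlock_t API; the example struct and function are hypothetical, the real converted locks live in kernel/rcutree.h:

struct example_node {
	raw_spinlock_t lock;	/* was: spinlock_t lock; */
};

static void example_critical_section(struct example_node *np)
{
	unsigned long flags;

	/* was: spin_lock_irqsave(&np->lock, flags); */
	raw_spin_lock_irqsave(&np->lock, flags);
	/* ... critical section that must not sleep, even on preempt-rt ... */
	raw_spin_unlock_irqrestore(&np->lock, flags);
}

-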
The C standard does not specify the result of an operation that
overflows a signed integer, so such operations need to be
avoided. This patch changes the type of several fields from
"long" to "unsigned long" and adjusts operations as needed.
ULONG_CMP_GE() and ULONG_CMP_LT() macros are introduced to do
the modular comparisons that are appropriate given that overflow
is an expected event.
Acked-by: Mathieu Desnoyers
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
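A sketch of the wraparound-safe comparison macros described above; the definitions below match the usual shape of the upstream ones in kernel/rcutree.h, and grace_period_done() is a hypothetical caller:

#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 <  (a) - (b))

/* With unsigned arithmetic, "completed" may wrap past zero and still
 * compare correctly against an older snapshot. */
static int grace_period_done(unsigned long completed, unsigned long snap)
{
	return ULONG_CMP_GE(completed, snap);	/* modular comparison */
}

-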
Currently, rcu_needs_cpu() simply checks whether the current CPU
has an outstanding RCU callback, which means that the last CPU
to go into dyntick-idle mode might wait a few ticks for the
relevant grace periods to complete. However, if all the other
CPUs are in dyntick-idle mode, and if this CPU is in a quiescent
state (which it is for RCU-bh and RCU-sched any time that we are
considering going into dyntick-idle mode), then the grace period
is instantly complete.
This patch therefore repeatedly invokes the RCU grace-period
machinery in order to force any needed grace periods to complete
quickly. It does so a limited number of times in order to
prevent starvation by an RCU callback function that might pass
itself to call_rcu().
However, if any CPU other than the current one is not in
dyntick-idle mode, fall back to the simple check (with a fix for a
bug noted by Lai Jiangshan). Also take advantage of the last
grace-period forcing, an opportunity noted by Steve Rostedt, and
apply the simplified #ifdef condition suggested by Frederic
Weisbecker.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
As suggested by Peter Zijlstra, give for_each_domain_rd() a better
name, one containing "rcu_dereference", given that it is but a
wrapper for rcu_dereference_check(). The name
rcu_dereference_check_sched_domain() does that and provides a
separate per-subsystem name space.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
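A sketch of the renamed wrapper as it plausibly appears in kernel/sched.c; the exact lockdep expression is an assumption here:

#define rcu_dereference_check_sched_domain(p) \
	rcu_dereference_check((p), \
			      rcu_read_lock_sched_held() || \
			      lockdep_is_held(&sched_domains_mutex))

#define for_each_domain(cpu, __sd) \
	for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); \
	     __sd; __sd = __sd->parent)

-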
Update the rcu_dereference() usages to take advantage of the new
lockdep-based checking.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
[ -v2: fix allmodconfig missing symbol export build failure on x86 ]
Signed-off-by: Ingo Molnar
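An illustrative conversion in the spirit of this patch; the data structure and lock are hypothetical, only the rcu_dereference_check() form matters:

struct foo {
	int data;
};
static struct foo *global_foo;		/* published with rcu_assign_pointer() */
static DEFINE_SPINLOCK(foo_lock);	/* also protects updates */

static int read_foo_data(void)
{
	struct foo *p;

	/* Before: p = rcu_dereference(global_foo); */
	p = rcu_dereference_check(global_foo,
				  rcu_read_lock_held() ||
				  lockdep_is_held(&foo_lock));
	return p ? p->data : -1;
}

-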
Make rcu_dereference_check() print the list of held locks in
addition to the stack dump to ease debugging.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Inspection is proving insufficient to catch all RCU misuses,
which is understandable given that rcu_dereference() might be
protected by any of four different flavors of RCU (RCU, RCU-bh,
RCU-sched, and SRCU), and might also/instead be protected by any
of a number of locking primitives. It is therefore time to
enlist the aid of lockdep.
This set of patches is inspired by earlier work by Peter
Zijlstra and Thomas Gleixner, and takes the following approach:
o Set up separate lockdep classes for RCU, RCU-bh, and RCU-sched.
o Set up separate lockdep classes for each instance of SRCU.
o Create primitives that check for being in an RCU read-side
critical section. These return exact answers if lockdep is
fully enabled, but if unsure, report being in an RCU read-side
critical section. (We want to avoid false positives!)
The primitives are:
For RCU: rcu_read_lock_held(void)
For RCU-bh: rcu_read_lock_bh_held(void)
For RCU-sched: rcu_read_lock_sched_held(void)
For SRCU: srcu_read_lock_held(struct srcu_struct *sp)
o Add rcu_dereference_check(), which takes a second argument
in which one places a boolean expression based on the above
primitives and/or lockdep_is_held().
o A new kernel configuration parameter, CONFIG_PROVE_RCU, enables
rcu_dereference_check(). This depends on CONFIG_PROVE_LOCKING,
and should be quite helpful during the transition period while
CONFIG_PROVE_RCU-unaware patches are in flight.
The existing rcu_dereference() primitive does no checking, but
upcoming patches will change that.
Signed-off-by: Paul E. McKenney
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
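A small sketch of the four read-side checks listed above in use; the srcu_struct and the WARN_ON_ONCE() condition are illustrative only:

static struct srcu_struct my_srcu;	/* init_srcu_struct(&my_srcu) elsewhere */

static void assert_reader_context(void)
{
	/* Under CONFIG_PROVE_RCU these report exact answers; otherwise they
	 * err on the side of claiming to be in a read-side critical section. */
	WARN_ON_ONCE(!rcu_read_lock_held() &&
		     !rcu_read_lock_bh_held() &&
		     !rcu_read_lock_sched_held() &&
		     !srcu_read_lock_held(&my_srcu));
}

-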
Merge reason: Update from -rc4 to -final.
Signed-off-by: Ingo Molnar
23 Feb, 2010
2 commits
-
find_task_by_vpid() is not safe without rcu_read_lock(). 2.6.33-rc7 got
RCU protection for sys_setpriority() but missed it for sys_getpriority().
Signed-off-by: Tetsuo Handa
Cc: Oleg Nesterov
Cc: "Paul E. McKenney"
Acked-by: Serge Hallyn
Acked-by: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
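A sketch of the pattern the fix applies in sys_getpriority(); the helper name and the simplified logic are hypothetical:

static int nice_of_pid(pid_t who)
{
	struct task_struct *p;
	int niceval = -ESRCH;

	rcu_read_lock();		/* find_task_by_vpid() requires RCU */
	p = find_task_by_vpid(who);
	if (p)
		niceval = 20 - task_nice(p);
	rcu_read_unlock();

	return niceval;
}

-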
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf probe: Init struct probe_point and set counter correctly
hw-breakpoint: Keep track of dr7 local enable bits
hw-breakpoints: Accept breakpoints on NULL address
perf_events: Fix FORK events
17 Feb, 2010
2 commits
-
This patch fixes following sparse warnings:
include/linux/kfifo.h:127:25: warning: Using plain integer as NULL pointer
kernel/kfifo.c:83:21: warning: Using plain integer as NULL pointer
Signed-off-by: Anton Vorontsov
Acked-by: Stefani Seibold
Signed-off-by: Greg Kroah-Hartman
-
After kfifo rework it's no longer possible to reliably know if kfifo is
usable, since after kfifo_free(), kfifo_initialized() would still return
true. The correct behaviour is needed for at least the FHCI USB driver.
This patch fixes the issue by resetting the kfifo to zero values (the
same approach is used in kfifo_alloc() if allocation failed).
Signed-off-by: Anton Vorontsov
Acked-by: Stefani Seibold
Signed-off-by: Greg Kroah-Hartman
16 Feb, 2010
3 commits
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hrtimer, softirq: Fix hrtimer->softirq trampoline
-
…nel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing/kprobes: Fix probe parsing
tracing: Fix circular dead lock in stack trace
-
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf top: Fix help text alignment
perf: Fix hypervisor sample reporting
perf: Make bp_len type to u64 generic across the arch
15 Feb, 2010
1 commit
-
Commit 22e19085 ("Honour event state for aux stream data")
introduced a bug where we would drop FORK events.
The thing is that we deliver FORK events to the child process'
event, which at that time will be PERF_EVENT_STATE_INACTIVE
because the child won't be scheduled in (we're in the middle of
fork).
Solve this in two ways: change the event state filter to exclude only
disabled (STATE_OFF) or worse, and deliver FORK events to the
current (parent) task.
Signed-off-by: Peter Zijlstra
Cc: Anton Blanchard
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
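A sketch of the state-filter change described above; the helper name is hypothetical, the enum values are the standard perf event states:

static int event_wants_task_events(struct perf_event *event)
{
	/* Keep INACTIVE (the not-yet-scheduled child event) and ACTIVE;
	 * filter out only OFF (disabled) and ERROR. */
	return event->state >= PERF_EVENT_STATE_INACTIVE;
}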
14 Feb, 2010
1 commit
-
Trying to add a probe like:
echo p:myprobe 0x10000 > /sys/kernel/debug/tracing/kprobe_events
will fail since the wrong pointer is passed to strict_strtoul
when trying to convert the address to an unsigned long.
Signed-off-by: Heiko Carstens
Acked-by: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar
10 Feb, 2010
1 commit
-
Export getboottime and monotonic_to_bootbased so that they can be
used by a following patch.
Cc: stable@kernel.org
Signed-off-by: Jason Wang
Signed-off-by: Marcelo Tosatti
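For reference, the export amounts to something like the following in kernel/time/timekeeping.c (whether EXPORT_SYMBOL_GPL or EXPORT_SYMBOL was used is an assumption):

EXPORT_SYMBOL_GPL(getboottime);
EXPORT_SYMBOL_GPL(monotonic_to_bootbased);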
05 Feb, 2010
1 commit
-
…/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Handle futex value corruption gracefully
futex: Handle user space corruption gracefully
futex_lock_pi() key refcnt fix
softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume
04 Feb, 2010
2 commits
-
Change 'bp_len' type to __u64 to make it work across archs as
the s390 architecture watch point length can be up to 2^64.
Reference: http://lkml.org/lkml/2010/1/25/212
This is an ABI change that is not backward compatible with
the previous hardware breakpoint info layout integrated in this
development cycle; a rebuild of perf tools is necessary for
versions based on 2.6.33-rc1 - 2.6.33-rc6 to work with a
kernel based on this patch.
Signed-off-by: Mahesh Salgaonkar
Acked-by: Peter Zijlstra
Cc: Ananth N Mavinakayanahalli
Cc: "K. Prasad"
Cc: Maneesh Soni
Cc: Heiko Carstens
Cc: Martin
LKML-Reference:
Signed-off-by: Frederic Weisbecker
-
hrtimers callbacks are always done from hardirq context, either the
jiffy tick interrupt or the hrtimer device interrupt.
[ there is currently one exception that can still call a hrtimer
callback from softirq, but even in that case this will still
work correctly. ]
Reported-by: Wei Yongjun
Signed-off-by: Peter Zijlstra
Cc: Yury Polyanskiy
Tested-by: Wei Yongjun
Acked-by: David S. Miller
LKML-Reference:
Signed-off-by: Thomas Gleixner
03 Feb, 2010
7 commits
-
The WARN_ON in lookup_pi_state which complains about a mismatch
between pi_state->owner->pid and the pid which we retrieved from the
user space futex is completely bogus.
The code just emits the warning and then continues despite the fact
that it detected an inconsistent state of the futex. This is a
convenient way for user space to spam the syslog.
Replace the WARN_ON by a consistency check. If the values do not match,
return -EINVAL and let user space deal with the mess it created.
This also fixes the missing task_pid_vnr() when we compare the
pi_state->owner pid with the futex value.
Reported-by: Jermome Marchand
Signed-off-by: Thomas Gleixner
Acked-by: Darren Hart
Acked-by: Peter Zijlstra
Cc:
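A simplified sketch of the consistency check that replaces the WARN_ON; the helper is hypothetical, the real check sits in lookup_pi_state():

static int pi_state_owner_consistent(struct futex_pi_state *pi_state, u32 uval)
{
	pid_t pid = uval & FUTEX_TID_MASK;

	/* Compare against the namespace-local pid and refuse to continue on
	 * mismatch instead of merely warning. */
	if (pi_state->owner && pid != task_pid_vnr(pi_state->owner))
		return -EINVAL;

	return 0;
}

-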
If the owner of a PI futex dies we fix up the pi_state and set
pi_state->owner to NULL. When a malicious or just sloppy programmed
user space application sets the futex value to 0 e.g. by calling
pthread_mutex_init(), then the futex can be acquired again. A new
waiter manages to enqueue itself on the pi_state w/o damage, but on
unlock the kernel dereferences pi_state->owner and oopses.
Prevent this by checking pi_state->owner in the unlock path. If
pi_state->owner is not current, we know that user space manipulated the
futex value. Ignore the mess and return -EINVAL.
This catches the above case and also the case where a task hijacks the
futex by setting the tid value and then tries to unlock it.
Reported-by: Jermome Marchand
Signed-off-by: Thomas Gleixner
Acked-by: Darren Hart
Acked-by: Peter Zijlstra
Cc:
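A simplified sketch of the unlock-path check described above; the helper is hypothetical, the real logic sits around futex_unlock_pi():

static int pi_futex_unlock_allowed(struct futex_pi_state *pi_state)
{
	/* If the owner died and user space then reinitialized the futex,
	 * pi_state->owner may be NULL or point at some other task. */
	if (!pi_state->owner || pi_state->owner != current)
		return -EINVAL;	/* user space corrupted the futex value */

	return 0;
}

-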
This fixes a futex key reference count bug in futex_lock_pi(),
where a key's reference count is incremented twice but decremented
only once, causing the backing object to not be released.
If the futex is created in a temporary file in an ext3 file system,
this bug causes the file's inode to become an "undead" orphan,
which causes an oops from a BUG_ON() in ext3_put_super() when the
file system is unmounted. glibc's test suite is known to trigger this,
see .
The bug is a regression from 2.6.28-git3, namely Peter Zijlstra's
38d47c1b7075bd7ec3881141bb3629da58f88dab "[PATCH] futex: rely on
get_user_pages() for shared futexes". That commit made get_futex_key()
also increment the reference count of the futex key, and updated its
callers to decrement the key's reference count before returning.
Unfortunately the normal exit path in futex_lock_pi() wasn't corrected:
the reference count is incremented by get_futex_key() and queue_lock(),
but the normal exit path only decrements once, via unqueue_me_pi().
The fix is to put_futex_key() after unqueue_me_pi(), since 2.6.31
this is easily done by 'goto out_put_key' rather than 'goto out'.
Signed-off-by: Mikael Pettersson
Acked-by: Peter Zijlstra
Acked-by: Darren Hart
Signed-off-by: Thomas Gleixner
Cc:
-
…s/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
kernel/cred.c: use kmem_cache_free
-
In cgroup_create(), if alloc_css_id() returns failure, the errno is not
propagated to userspace, so mkdir will fail silently.
To trigger this bug, we mount blkio (or the memory subsystem), and create
more than 65534 cgroups. (The number of cgroups is limited to 65535 if a
subsystem has use_id == 1)
# mount -t cgroup -o blkio xxx /mnt
# for ((i = 0; i < 65534; i++)); do mkdir /mnt/$i; done
# mkdir /mnt/65534
(should return ENOSPC)
#
Signed-off-by: Li Zefan
Acked-by: Serge Hallyn
Acked-by: Paul Menage
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
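A sketch (not the real cgroup_create()) of the error propagation described above: the return value of alloc_css_id() is now handed back instead of being dropped, so mkdir reports ENOSPC to user space. The helper shown is hypothetical:

static int create_css_ids(struct cgroup_subsys *ss, struct cgroup *parent,
			  struct cgroup *cgrp)
{
	int err = 0;

	if (ss->use_id)
		err = alloc_css_id(ss, parent, cgrp);

	return err;	/* previously the error was silently ignored */
}

-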
Fix kfifo kernel-doc warnings:
Warning(kernel/kfifo.c:361): No description found for parameter 'total'
Warning(kernel/kfifo.c:402): bad line: @ @lenout: pointer to output variable with copied data
Warning(kernel/kfifo.c:412): No description found for parameter 'lenout'
Signed-off-by: Randy Dunlap
Cc: Stefani Seibold
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Free memory allocated using kmem_cache_zalloc using kmem_cache_free rather
than kfree.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

//
@@
expression x,E,c;
@@

x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
... when != x = E
    when != &x
?-kfree(x)
+kmem_cache_free(c,x)
//
Signed-off-by: Julia Lawall
Acked-by: David Howells
Cc: James Morris
Cc: Steve Dickson
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: James Morris
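An illustrative before/after of the pattern the semantic patch rewrites; my_cachep and struct my_obj are hypothetical:

#include <linux/slab.h>

struct my_obj {
	int val;
};

static struct kmem_cache *my_cachep;	/* set up with kmem_cache_create() */

static void my_obj_example(void)
{
	struct my_obj *obj = kmem_cache_zalloc(my_cachep, GFP_KERNEL);

	if (!obj)
		return;
	/* ... use obj ... */
	kmem_cache_free(my_cachep, obj);	/* was: kfree(obj) */
}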
02 Feb, 2010
5 commits
-
When we cat /tracing/stack_trace, we may cause circular lock:
sys_read()
t_start()
arch_spin_lock(&max_stack_lock);
t_show()
seq_printf(), vsnprintf() .... /* they are all trace-able,
when they are traced, max_stack_lock may be required again. */
The following script can trigger this circular dead lock very easily:
#!/bin/bash
echo 1 > /proc/sys/kernel/stack_tracer_enabled
mount -t debugfs xxx /mnt > /dev/null 2>&1
(
# make check_stack() zealous to require max_stack_lock
for ((; ;))
{
echo 1 > /mnt/tracing/stack_max_size
}
) &
for ((; ;))
{
cat /mnt/tracing/stack_trace > /dev/null
}
To fix this bug, we increase the percpu trace_active before
requiring the lock.
Reported-by: Li Zefan
Signed-off-by: Lai Jiangshan
LKML-Reference:
Signed-off-by: Steven Rostedt
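A sketch of the guard described above; trace_active and max_stack_lock already exist in kernel/trace/trace_stack.c, while the two helper names are hypothetical:

static DEFINE_PER_CPU(int, trace_active);	/* checked by check_stack() */

static void stack_trace_lock(void)
{
	local_irq_disable();
	__get_cpu_var(trace_active)++;		/* make check_stack() bail out */
	arch_spin_lock(&max_stack_lock);
}

static void stack_trace_unlock(void)
{
	arch_spin_unlock(&max_stack_lock);
	__get_cpu_var(trace_active)--;
	local_irq_enable();
}

-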
…/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: Fix check_usage_backwards() error message
-
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, hw_breakpoint, kgdb: Do not take mutex for kernel debugger
x86, hw_breakpoints, kgdb: Fix kgdb to use hw_breakpoint API
hw_breakpoints: Release the bp slot if arch_validate_hwbkpt_settings() fails.
perf: Ignore perf.data.old
perf report: Fix segmentation fault when running with '-g none'
-
…l/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Correct printk whitespace in warning from cpu down task check
sched: Fix incorrect sanity check
sched: Fix fork vs hotplug vs cpuset namespaces
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clocksource: Prevent potential kgdb dead lock
01 Feb, 2010
1 commit
-
When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets
the time from hardware such as the TSC on x86. In this
configuration kgdb will report a softlock warning message on
resuming or detaching from a debug session.
Sequence of events in the problem case:
1) "cpu sched clock" and "hardware time" are at 100 sec prior
to a call to kgdb_handle_exception()
2) Debugger waits in kgdb_handle_exception() for 80 sec and on
exit the following is called ... touch_softlockup_watchdog() -->
__raw_get_cpu_var(touch_timestamp) = 0;
3) "cpu sched clock" = 100s (it was not updated, because the
interrupt was disabled in kgdb) but the "hardware time" = 180 sec
4) The first timer interrupt after resuming from
kgdb_handle_exception updates the watchdog from the "cpu sched clock"
update_process_times() { ... run_local_timers() -->
softlockup_tick() --> check (touch_timestamp == 0) (it is "YES"
here, we have set "touch_timestamp = 0" at kgdb) -->
__touch_softlockup_watchdog() ***(A)--> reset "touch_timestamp"
to "get_timestamp()" (Here, the "touch_timestamp" will still be
set to 100s.) ...
scheduler_tick() ***(B)--> sched_clock_tick() (update "cpu sched
clock" to "hardware time" = 180s) ... }
5) The second timer interrupt handler appears to have a large
jump and trips the softlockup warning.
update_process_times() { ... run_local_timers() -->
softlockup_tick() --> "cpu sched clock" - "touch_timestamp" =
180s-100s > 60s --> printk "soft lockup error messages" ... }
note: ***(A) reset "touch_timestamp" to "get_timestamp(this_cpu)"
Why is "touch_timestamp" 100 sec, instead of 180 sec?
When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of
get_timestamp() is:
get_timestamp(this_cpu)
-->cpu_clock(this_cpu)
-->sched_clock_cpu(this_cpu)
-->__update_sched_clock(sched_clock_data, now)
The __update_sched_clock() function uses the GTOD tick value to
create a window to normalize the "now" values. So if "now"
value is too big for sched_clock_data, it will be ignored.
The fix is to invoke sched_clock_tick() to update "cpu sched
clock" in order to recover from this state. This is done by
introducing the function touch_softlockup_watchdog_sync(). This
allows kgdb to request that the sched clock is updated when the
watchdog thread runs the first time after a resume from kgdb.
[yong.zhang0@gmail.com: Use per cpu instead of an array]
Signed-off-by: Jason Wessel
Signed-off-by: Dongdong Deng
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: peterz@infradead.org
LKML-Reference:
Signed-off-by: Ingo Molnar
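A sketch of the mechanism described above, simplified from kernel/softlockup.c; softlockup_resync() is a hypothetical helper standing in for the check done early in softlockup_tick(), and touch_timestamp is the existing per-cpu watchdog timestamp:

static DEFINE_PER_CPU(bool, softlock_touch_sync);

void touch_softlockup_watchdog_sync(void)
{
	__raw_get_cpu_var(softlock_touch_sync) = true;
	__raw_get_cpu_var(touch_timestamp) = 0;
}

static void softlockup_resync(int this_cpu)
{
	if (unlikely(per_cpu(softlock_touch_sync, this_cpu))) {
		/* kgdb asked us to catch the sched clock up with hardware
		 * time before comparing timestamps. */
		per_cpu(softlock_touch_sync, this_cpu) = false;
		sched_clock_tick();
	}
}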
30 Jan, 2010
2 commits
-
This patch fixes the regression in functionality where the
kernel debugger and the perf API do not nicely share hw
breakpoint reservations.
The kernel debugger cannot use any mutex_lock() calls because it
can start the kernel running from an invalid context.
A mutex-free version of the reservation API needed to be created
for the kernel debugger to safely update hw breakpoint
reservations.
The possibility for a breakpoint reservation to be concurrently
processed at the time that kgdb interrupts the system is
improbable. Should this corner case occur, the end user is
warned, and the kernel debugger will prohibit updating the
hardware breakpoint reservations.
Any time the kernel debugger reserves a hardware breakpoint it
will be a system-wide reservation.
Signed-off-by: Jason Wessel
Acked-by: Frederic Weisbecker
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: K.Prasad
Cc: Peter Zijlstra
Cc: Alan Stern
Cc: torvalds@linux-foundation.org
LKML-Reference:
Signed-off-by: Ingo Molnar
-
In the 2.6.33 kernel, the hw_breakpoint API is now used for the
performance event counters. The hw_breakpoint_handler() now
consumes the hw breakpoints that were previously set by kgdb
arch specific code. In order for kgdb to work in conjunction
with this core API change, kgdb must use some of the low level
functions of the hw_breakpoint API to install, uninstall, and
deal with hw breakpoint reservations.
The kgdb core required a change to call kgdb_disable_hw_debug
anytime a slave cpu enters kgdb_wait() in order to keep all the
hw breakpoints in sync as well as to prevent hitting a hw
breakpoint while kgdb is active.
During the architecture specific initialization of kgdb, it will
pre-allocate 4 disabled (struct perf_event **) structures. Kgdb
will use these to manage the capabilities for the 4 hw
breakpoint registers, per cpu. Right now the hw_breakpoint API
does not have a way to ask how many breakpoints are available
on each CPU, so it is possible that the install of a breakpoint
might fail when kgdb restores the system to the run state. The
intent of this patch is to first get the basic functionality of
hw breakpoints working and leave it to the person debugging the
kernel to understand what hw breakpoints are in use and what
restrictions have been imposed as a result. Breakpoint
constraints will be dealt with in a future patch.
While atomic, the x86 specific kgdb code will call
arch_uninstall_hw_breakpoint() and arch_install_hw_breakpoint()
to manage the cpu specific hw breakpoints.
The net result of these changes allows kgdb to use the same pool
of hw_breakpoints that are used by the perf event API, but
neither knows about future reservations for the available hw
breakpoint slots.
Signed-off-by: Jason Wessel
Acked-by: Frederic Weisbecker
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: K.Prasad
Cc: Peter Zijlstra
Cc: Alan Stern
Cc: torvalds@linux-foundation.org
LKML-Reference:
Signed-off-by: Ingo Molnar
28 Jan, 2010
3 commits
-
On a given architecture, when hardware breakpoint registration fails
due to un-supported access type (read/write/execute), we lose the bp
slot since register_perf_hw_breakpoint() does not release the bp slot
on failure.
Hence, any subsequent hardware breakpoint registration starts failing
with a 'no space left on device' error.
This patch introduces error handling in register_perf_hw_breakpoint()
and releases the bp slot on error.
Signed-off-by: Mahesh Salgaonkar
Cc: Ananth N Mavinakayanahalli
Cc: K. Prasad
Cc: Maneesh Soni
LKML-Reference:
Signed-off-by: Frederic Weisbecker
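A simplified sketch of the error path described above; the shape follows the 2.6.33-era __register_perf_hw_breakpoint(), but details are illustrative:

int __register_perf_hw_breakpoint(struct perf_event *bp)
{
	int ret;

	ret = reserve_bp_slot(bp);
	if (ret)
		return ret;

	ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task);
	if (ret)
		release_bp_slot(bp);	/* don't leak the reservation */

	return ret;
}

-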
Due to an incorrect line break the output currently contains tabs.
Also remove trailing space.
The actual output that logcheck sent me looked like this:
Task events/1 (pid = 10) is on cpu 1^I^I^I^I(state = 1, flags = 84208040)
After this patch it becomes:
Task events/1 (pid = 10) is on cpu 1 (state = 1, flags = 84208040)
Signed-off-by: Frans Pop
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
We moved to migrate on wakeup, which means that sleeping tasks could
still be present on offline cpus. Amend the check to only test running
tasks.
Reported-by: Heiko Carstens
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar