
30 Jan, 2010

2 commits

  • This patch fixes a regression where the kernel debugger and
    the perf API did not nicely share hw breakpoint reservations.

    The kernel debugger cannot use any mutex_lock() calls because it
    can start the kernel running from an invalid context.

    A mutex-free version of the reservation API had to be created so
    that the kernel debugger can safely update hw breakpoint
    reservations (sketched below).

    It is improbable that a breakpoint reservation will be
    concurrently processed at the moment kgdb interrupts the system.
    Should this corner case occur, the end user is warned, and the
    kernel debugger will prohibit updating the hardware breakpoint
    reservations.

    Any time the kernel debugger reserves a hardware breakpoint it
    will be a system-wide reservation.

    Signed-off-by: Jason Wessel
    Acked-by: Frederic Weisbecker
    Cc: kgdb-bugreport@lists.sourceforge.net
    Cc: K.Prasad
    Cc: Peter Zijlstra
    Cc: Alan Stern
    Cc: torvalds@linux-foundation.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jason Wessel
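
    A minimal sketch of the mutex-free path, assuming the internal
    helpers nr_bp_mutex and __reserve_bp_slot() that the regular
    reservation API would use (names illustrative, not confirmed by
    this log):

      int dbg_reserve_bp_slot(struct perf_event *bp)
      {
              /* refuse rather than block if the mutex is already held */
              if (mutex_is_locked(&nr_bp_mutex))
                      return -1;      /* caller warns the user, refuses update */

              return __reserve_bp_slot(bp);
      }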
     
  • In the 2.6.33 kernel, the hw_breakpoint API is now used for the
    performance event counters. The hw_breakpoint_handler() now
    consumes the hw breakpoints that were previously set by kgdb
    arch specific code. In order for kgdb to work in conjunction
    with this core API change, kgdb must use some of the low level
    functions of the hw_breakpoint API to install, uninstall, and
    deal with hw breakpoint reservations.

    The kgdb core required a change to call kgdb_disable_hw_debug
    anytime a slave cpu enters kgdb_wait() in order to keep all the
    hw breakpoints in sync as well as to prevent hitting a hw
    breakpoint while kgdb is active.

    During the architecture-specific initialization of kgdb, it will
    pre-allocate 4 disabled (struct perf_event **) structures. Kgdb
    will use these to manage the capabilities for the 4 hw
    breakpoint registers per cpu. Right now the hw_breakpoint API
    does not have a way to ask how many breakpoints are available
    on each CPU, so it is possible that the install of a breakpoint
    might fail when kgdb restores the system to the run state. The
    intent of this patch is to first get the basic functionality of
    hw breakpoints working and leave it to the person debugging the
    kernel to understand what hw breakpoints are in use and what
    restrictions have been imposed as a result. Breakpoint
    constraints will be dealt with in a future patch.

    While atomic, the x86-specific kgdb code will call
    arch_uninstall_hw_breakpoint() and arch_install_hw_breakpoint()
    to manage the cpu-specific hw breakpoints (sketched below).

    The net result of these changes allows kgdb to use the same pool
    of hw breakpoints that is used by the perf event API, but
    neither knows about future reservations for the available hw
    breakpoint slots.

    Signed-off-by: Jason Wessel
    Acked-by: Frederic Weisbecker
    Cc: kgdb-bugreport@lists.sourceforge.net
    Cc: K.Prasad
    Cc: Peter Zijlstra
    Cc: Alan Stern
    Cc: torvalds@linux-foundation.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jason Wessel
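
    A hedged sketch of the atomic-context install pass; HBP_NUM is
    x86's debug-register count (4), while the breakinfo/pev layout is
    illustrative:

      /* While atomic: (re)install the enabled slots on this cpu. */
      for (i = 0; i < HBP_NUM; i++) {
              struct perf_event *bp;

              if (!breakinfo[i].enabled)
                      continue;
              bp = *per_cpu_ptr(breakinfo[i].pev, cpu); /* pre-allocated */
              if (arch_install_hw_breakpoint(bp))
                      printk(KERN_ERR "kgdb: hw breakpoint %d install failed\n", i);
      }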
     

28 Jan, 2010

3 commits

  • On a given architecture, when hardware breakpoint registration fails
    due to an unsupported access type (read/write/execute), we lose the bp
    slot, since register_perf_hw_breakpoint() does not release the bp slot
    on failure. Hence, any subsequent hardware breakpoint registration
    starts failing with a 'no space left on device' error.

    This patch introduces error handling in register_perf_hw_breakpoint()
    and releases the bp slot on error (sketched below).

    Signed-off-by: Mahesh Salgaonkar
    Cc: Ananth N Mavinakayanahalli
    Cc: K. Prasad
    Cc: Maneesh Soni
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Mahesh Salgaonkar
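
    A sketch of the corrected flow, assuming the 2.6.33-era helpers
    reserve_bp_slot(), release_bp_slot() and
    arch_validate_hwbkpt_settings():

      int register_perf_hw_breakpoint(struct perf_event *bp)
      {
              int ret;

              ret = reserve_bp_slot(bp);
              if (ret)
                      return ret;

              ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task);
              if (ret)
                      release_bp_slot(bp);    /* don't leak the slot */

              return ret;
      }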
     
  • Due to an incorrect line break the output currently contains tabs;
    also remove a trailing space. (The underlying pitfall is sketched
    below.)

    The actual output that logcheck sent me looked like this:
    Task events/1 (pid = 10) is on cpu 1^I^I^I^I(state = 1, flags = 84208040)

    After this patch it becomes:
    Task events/1 (pid = 10) is on cpu 1 (state = 1, flags = 84208040)

    Signed-off-by: Frans Pop
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frans Pop
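
    The likely culprit is a backslash line continuation inside the
    format string, which splices the next line's leading tabs into the
    output; a minimal illustration (format and arguments are
    stand-ins):

      /*
       * broken: the backslash continuation keeps the next line's
       * leading tabs inside the format string (hence the ^I^I^I^I)
       */
      printk(KERN_INFO "Task %s (pid = %d) is on cpu %d\
      		(state = %ld, flags = %x)\n",
             p->comm, p->pid, cpu, p->state, p->flags);

      /* fixed: adjacent string literals concatenate without whitespace */
      printk(KERN_INFO "Task %s (pid = %d) is on cpu %d "
             "(state = %ld, flags = %x)\n",
             p->comm, p->pid, cpu, p->state, p->flags);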
     
  • We moved to migrate on wakeup, which means that sleeping tasks could
    still be present on offline cpus. Amend the check to only test running
    tasks.

    Reported-by: Heiko Carstens
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

27 Jan, 2010

4 commits

  • Lockdep has found the real bug, but the output doesn't look right to me:

    > =========================================================
    > [ INFO: possible irq lock inversion dependency detected ]
    > 2.6.33-rc5 #77
    > ---------------------------------------------------------
    > emacs/1609 just changed the state of lock:
    > (&(&tty->ctrl_lock)->rlock){+.....}, at: [] tty_fasync+0xe8/0x190
    > but this lock took another, HARDIRQ-unsafe lock in the past:
    > (&(&sighand->siglock)->rlock){-.....}

    "HARDIRQ-unsafe" and "this lock took another" looks wrong, afaics.

    > ... key at: [] __key.46539+0x0/0x8
    > ... acquired at:
    > [] __lock_acquire+0x1056/0x15a0
    > [] lock_acquire+0x9f/0x120
    > [] _raw_spin_lock_irqsave+0x52/0x90
    > [] __proc_set_tty+0x3e/0x150
    > [] tty_open+0x51d/0x5e0

    The stack-trace shows that this lock (ctrl_lock) was taken under
    ->siglock (which is hopefully irq-safe).

    This is a clear typo in check_usage_backwards(), where we tell the
    fancy print routine that we are going forwards (see the sketch
    below).

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
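
    A hedged sketch of the one-flag nature of the fix; argument names
    are illustrative, the point being that check_usage_backwards()
    passed the 'forwards' flag to the report routine:

      /* check_usage_backwards(): report the inversion as backwards (0),
       * not forwards (1) */
      - return print_irq_inversion_bug(curr, backwards_match, this, 1, irqclass);
      + return print_irq_inversion_bug(curr, backwards_match, this, 0, irqclass);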
     
  • Update the graph tracer examples to cover the new frame pointer semantics
    (in terms of passing it along). Move the HAVE_FUNCTION_GRAPH_FP_TEST docs
    out of the Kconfig, into the right place, and expand on the details.

    Signed-off-by: Mike Frysinger
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Mike Frysinger
     
  • If the iterator comes to an empty page for some reason, or if
    the page is emptied by a consuming read, the iterator code
    currently does not check whether the iterator is past the
    contents, and may return a false entry.

    This patch adds a check to the ring buffer iterator to test if the
    current page has been completely read, and sets the iterator to the
    next page if necessary (sketched below).

    Signed-off-by: Steven Rostedt

    Steven Rostedt
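
    A hedged sketch of the added check, with illustrative names in the
    style of kernel/trace/ring_buffer.c (rb_page_commit() as "bytes on
    this page", rb_inc_iter() as "advance to the next page"):

      /* the iterator has consumed everything on this page */
      if (iter->head >= rb_page_commit(iter->head_page)) {
              rb_inc_iter(iter);      /* move on to the next page */
              goto again;             /* retry the peek from there */
      }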
     
  • Usually, reads of the ring buffer are performed by a single task.
    There are two types of reads from the ring buffer.

    One is a consuming read which will consume the entry that was read
    and the next read will be the entry that follows.

    The other is an iterator that will let the user read the contents of
    the ring buffer without modifying it. When an iterator is allocated,
    writes to the ring buffer are disabled to protect the iterator.

    The problem exists when consuming reads happen while an iterator is
    allocated, specifically the kind of read that swaps out an entire
    page (used by splice) and replaces it with a new one. If the
    iterator is on the page that is swapped out, then the next read may
    read from this swapped-out page and return garbage.

    This patch adds a check when reading the iterator to make sure that
    the iterator contents are still valid. If a consuming read has
    taken place, the iterator is reset (sketched below).

    Signed-off-by: Steven Rostedt

    Steven Rostedt
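
    A hedged sketch of the staleness check: cache the reader page and
    read count when the iterator is (re)set, and compare them on every
    read (field names follow the description and are illustrative):

      /* at iterator (re)set time: remember where the reader was */
      iter->cache_reader_page = cpu_buffer->reader_page;
      iter->cache_read = cpu_buffer->read;

      /* on every iterator read: a consuming read invalidates us */
      if (unlikely(iter->cache_read != cpu_buffer->read ||
                   iter->cache_reader_page != cpu_buffer->reader_page))
              rb_iter_reset(iter);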
     

26 Jan, 2010

2 commits

  • commit 0f8e8ef7 (clocksource: Simplify clocksource watchdog resume
    logic) introduced a potential kgdb deadlock. When the kernel is
    stopped by kgdb inside code which holds watchdog_lock, then kgdb
    deadlocks in clocksource_resume_watchdog().

    clocksource_resume_watchdog() is called from kgdb via
    clocksource_touch_watchdog() to avoid the clocksource watchdog
    marking TSC unstable after the kernel has been stopped.

    Solve this by replacing the spin_lock with a spin_trylock and just
    returning in case the lock is held (sketched below). Not resetting
    the watchdog might result in TSC being marked unstable, but that's
    an acceptable penalty for using kgdb.

    The timekeeping is easily screwed up by kgdb anyway when the system
    uses either jiffies or a clocksource which wraps in short intervals
    (e.g. pm_timer wraps about every 4.6s), so we really do not have to
    worry about that occasional TSC-marked-unstable side effect.

    The second caller of clocksource_resume_watchdog() is
    clocksource_resume(). The trylock is safe here as well because the
    system is UP at this point, interrupts are disabled and nothing
    else can hold watchdog_lock.

    Reported-by: Jason Wessel
    LKML-Reference:
    Cc: kgdb-bugreport@lists.sourceforge.net
    Cc: Martin Schwidefsky
    Cc: John Stultz
    Cc: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
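
    A sketch of the resulting function, assuming the existing
    clocksource_reset_watchdog() helper:

      static void clocksource_resume_watchdog(void)
      {
              unsigned long flags;

              /*
               * If kgdb stopped the kernel while watchdog_lock was
               * held, taking it here would deadlock: skip the reset.
               */
              if (!spin_trylock_irqsave(&watchdog_lock, flags))
                      return;

              clocksource_reset_watchdog();
              spin_unlock_irqrestore(&watchdog_lock, flags);
      }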
     
  • If the contents of the ftrace ring buffer get corrupted and the
    trace file is read, it can cause a kernel oops (usually just
    killing the user task thread). This is caused by the checking of
    the pid in the buffer: if the pid is negative, it still references
    the cmdline cache array, which could point to an invalid address.

    The simple fix is to test for negative PIDs (sketched below).

    Signed-off-by: Steven Rostedt

    Steven Rostedt
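
    A hedged sketch of the guard in the cmdline lookup path (the
    placeholder string is illustrative):

      /* a corrupted buffer can hand us a negative pid: don't index with it */
      if (WARN_ON_ONCE(pid < 0)) {
              strcpy(comm, "<XXX>");
              return;
      }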
     


22 Jan, 2010

2 commits

  • There are a number of issues:

    1) TASK_WAKING vs cgroup_clone (cpusets)

    copy_process():

      sched_fork()
        child->state = TASK_WAKING; /* waiting for wake_up_new_task() */
      if (current->nsproxy != p->nsproxy)
        ns_cgroup_clone()
          cgroup_clone()
            mutex_lock(inode->i_mutex)
            mutex_lock(cgroup_mutex)
            cgroup_attach_task()
              ss->can_attach()
              ss->attach() [ -> cpuset_attach() ]
                cpuset_attach_task()
                  set_cpus_allowed_ptr();
                    while (child->state == TASK_WAKING)
                      cpu_relax();

    will deadlock the system.

    2) cgroup_clone (cpusets) vs copy_process

    So even if the above would work we still have:

    copy_process():

      if (current->nsproxy != p->nsproxy)
        ns_cgroup_clone()
          cgroup_clone()
            mutex_lock(inode->i_mutex)
            mutex_lock(cgroup_mutex)
            cgroup_attach_task()
              ss->can_attach()
              ss->attach() [ -> cpuset_attach() ]
                cpuset_attach_task()
                  set_cpus_allowed_ptr();
      ...

      p->cpus_allowed = current->cpus_allowed

    over-writing the modified cpus_allowed.

    3) fork() vs hotplug

    if we unplug the child's cpu after the sanity check, when the child
    gets attached to the task_list but before wake_up_new_task(), shit
    will meet with fan.

    Solve all these issues by moving fork cpu selection into
    wake_up_new_task().

    Reported-by: Serge E. Hallyn
    Tested-by: Serge E. Hallyn
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: x86: Add support for the ANY bit
    perf: Change the is_software_event() definition
    perf: Honour event state for aux stream data
    perf: Fix perf_event_do_pending() fallback callsite
    perf kmem: Print usage help for unknown commands
    perf kmem: Increase "Hit" column length
    hw-breakpoints, perf: Fix broken mmiotrace due to dr6 by reference change
    perf timechart: Use tid not pid for COMM change

    Linus Torvalds
     

21 Jan, 2010

4 commits

  • Anton reported that perf record kept receiving events even after
    calling ioctl(PERF_EVENT_IOC_DISABLE). It turns out that FORK, COMM
    and MMAP events didn't respect the disabled state and kept flowing
    in (the guard is sketched below).

    Reported-by: Anton Blanchard
    Signed-off-by: Peter Zijlstra
    Tested-by: Anton Blanchard
    LKML-Reference:
    CC: stable@kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
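
    A hedged sketch of the guard added to the side-band match helpers
    (shown for a hypothetical task-match helper; the same check would
    apply to COMM and MMAP matching):

      static int perf_event_task_match(struct perf_event *event)
      {
              /* honour the disabled state for side-band records too */
              if (event->state < PERF_EVENT_STATE_INACTIVE)
                      return 0;

              return 1;       /* remaining filter checks elided */
      }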
     
  • Paul questioned the context in which we should call
    perf_event_do_pending(). After looking at that, I found that it
    should be called from IRQ context these days; however, the fallback
    call-site is placed in softirq context. Amend this by placing the
    callback in the IRQ timer path.

    Reported-by: Paul Mackerras
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Assume an A->B schedule is in progress. If B acquired the BKL
    earlier and needs to reschedule this time, then in B's context it
    will go to need_resched_nonpreemptible to reschedule. But at this
    point, prev and switch_count still refer to A. That is wrong and
    leads to incorrect scheduler statistics (the shape of the fix is
    sketched below).

    Signed-off-by: Yong Zhang
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yong Zhang
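
    A hedged sketch of the shape of the fix, re-deriving prev and
    switch_count after the label so a loop back picks up the task now
    on the cpu (fragment; surrounding schedule() code elided):

      need_resched_nonpreemptible:
              /*
               * When B re-enters here after re-acquiring the BKL, prev
               * and switch_count must describe B, not the task A we
               * switched away from.
               */
              prev = rq->curr;
              switch_count = &prev->nivcsw;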
     
  • SD_PREFER_SIBLING is set at the CPU domain level if power saving isn't
    enabled, leading to many cache misses on large machines as we traverse
    looking for an idle shared cache to wake to. Change the enabler of
    select_idle_sibling() to SD_SHARE_PKG_RESOURCES, and enable same at the
    sibling domain level.

    Reported-by: Lin Ming
    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

18 Jan, 2010

1 commit

  • Marc reported that the BUG_ON in clockevents_notify() triggers on his
    system. This happens because the kernel tries to remove an active
    clock event device (used for broadcasting) from the device list.

    The handling of devices which can be used as per cpu device and as a
    global broadcast device is suboptimal.

    The simplest solution for now (and for stable) is to check whether
    the device is used as the global broadcast device (sketched below),
    but this needs to be revisited.

    [ tglx: restored the cpuweight check and massaged the changelog ]

    Reported-by: Marc Dionne
    Tested-by: Marc Dionne
    Signed-off-by: Xiaotian Feng
    LKML-Reference:
    Signed-off-by: Thomas Gleixner
    Cc: stable@kernel.org

    Xiaotian Feng
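
    A hedged sketch of the check in the CPU-dead notification path;
    tick_is_broadcast_device() is an existing helper, the rest is
    illustrative:

      /*
       * Only drop a dying cpu's per-cpu device if it is not also
       * serving as the global broadcast device.
       */
      if (cpumask_test_cpu(cpu, dev->cpumask) &&
          cpumask_weight(dev->cpumask) == 1 &&
          !tick_is_broadcast_device(dev)) {
              BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
              list_del(&dev->list);
      }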
     

17 Jan, 2010

7 commits

  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futexes: Remove rw parameter from get_futex_key()

    Linus Torvalds
     
  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing/filters: Add comment for match callbacks
    tracing/filters: Fix MATCH_FULL filter matching for PTR_STRING
    tracing/filters: Fix MATCH_MIDDLE_ONLY filter matching
    lib: Introduce strnstr()
    tracing/filters: Fix MATCH_END_ONLY filter matching
    tracing/filters: Fix MATCH_FRONT_ONLY filter matching
    ftrace: Fix MATCH_END_ONLY function filter
    tracing/x86: Derive arch from bits argument in recordmcount.pl
    ring-buffer: Add rb_list_head() wrapper around new reader page next field
    ring-buffer: Wrap a list.next reference with rb_list_head()

    Linus Torvalds
     
  • The change in acpi_cpufreq to use smp_call_function_any() causes a
    warning when it is called, since the function erroneously passes
    the cpu id to cpumask_of_node() rather than the node that the cpu
    is on. Fix this (one-line sketch below).

    cpumask_of_node(3): node > nr_node_ids(1)
    Pid: 1, comm: swapper Not tainted 2.6.33-rc3-00097-g2c1f189 #223
    Call Trace:
    [] cpumask_of_node+0x23/0x58
    [] smp_call_function_any+0x65/0xfa
    [] ? do_drv_read+0x0/0x2f
    [] get_cur_val+0xb0/0x102
    [] get_cur_freq_on_cpu+0x74/0xc5
    [] acpi_cpufreq_cpu_init+0x417/0x515
    [] ? __down_write+0xb/0xd
    [] cpufreq_add_dev+0x278/0x922

    Signed-off-by: David John
    Cc: Suresh Siddha
    Cc: Rusty Russell
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David John
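
    The fix is presumably a one-liner of this shape in
    smp_call_function_any():

      /* buggy: passes a cpu id where a node id is expected */
      nodemask = cpumask_of_node(cpu);

      /* fixed: map the cpu to its node first */
      nodemask = cpumask_of_node(cpu_to_node(cpu));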
     
  • On my first try using them I missed that the fifos need to be a
    power of two in size, resulting in a runtime bug. Document that
    requirement everywhere (and fix one grammar bug). A usage sketch
    follows this entry.

    Signed-off-by: Andi Kleen
    Acked-by: Stefani Seibold
    Cc: Roland Dreier
    Cc: Dmitry Torokhov
    Cc: Andy Walls
    Cc: Vikram Dhillon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
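
    A minimal usage sketch against the 2.6.33 kfifo API, rounding the
    requested size up so the power-of-two requirement is met:

      #include <linux/kfifo.h>
      #include <linux/log2.h>
      #include <linux/slab.h>

      static int example_init(void)
      {
              struct kfifo fifo;
              int ret;

              /* kfifo sizes must be a power of two: round the request up */
              ret = kfifo_alloc(&fifo, roundup_pow_of_two(1000), GFP_KERNEL);
              if (ret)
                      return ret;

              /* ... use the fifo ... */
              kfifo_free(&fifo);
              return 0;
      }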
     
  • In some upcoming code it's useful to peek into a FIFO without
    permanently removing data. This patch implements a new
    kfifo_out_peek() to do this (usage sketched below).

    Signed-off-by: Andi Kleen
    Acked-by: Stefani Seibold
    Cc: Roland Dreier
    Cc: Dmitry Torokhov
    Cc: Andy Walls
    Cc: Vikram Dhillon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
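
    A hedged usage sketch, assuming the signature this patch appears to
    introduce (destination buffer, length, and an offset into the
    FIFO):

      unsigned char hdr[4];
      unsigned int n;

      /* look at the first bytes without consuming them */
      n = kfifo_out_peek(&fifo, hdr, sizeof(hdr), 0);

      /* the data is still there: a later kfifo_out() sees the same bytes */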
     
  • Right now, for kfifo_*_user it's not easily possible to distinguish
    between a user copy failing and the FIFO not containing enough
    data. The problem is that both conditions are multiplexed into the
    same return code.

    Avoid this by moving the "copy length" into a separate output
    parameter and only returning 0/-EFAULT in the main return value
    (sketched below).

    I didn't fully adapt the weird "record" variants; those seem to be
    unused anyway and were rather messy (should they just be removed?).

    I would appreciate some double checking that I did all the
    conversions correctly.

    Signed-off-by: Andi Kleen
    Cc: Stefani Seibold
    Cc: Roland Dreier
    Cc: Dmitry Torokhov
    Cc: Andy Walls
    Cc: Vikram Dhillon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
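
    A hedged sketch of the resulting calling convention, with the
    copied length as a separate output parameter as described above:

      unsigned int copied;
      int ret;

      /* the return value now only signals success or -EFAULT ... */
      ret = kfifo_to_user(&fifo, ubuf, len, &copied);
      if (ret)
              return ret;     /* user copy faulted */

      /* ... while the transferred length arrives separately */
      return copied;          /* may be less than len if the FIFO ran short */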
     
  • The pointers to user buffers are currently unsigned char *, which requires
    a lot of casting in the caller for any non-char typed buffers. Use void *
    instead.

    Signed-off-by: Andi Kleen
    Acked-by: Stefani Seibold
    Cc: Roland Dreier
    Cc: Dmitry Torokhov
    Cc: Andy Walls
    Cc: Vikram Dhillon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

15 Jan, 2010

6 commits

  • We should be clear on 2 things:

    - the length parameter of a match callback includes the
    trailing '\0'.

    - the string to be searched might not be NULL-terminated.

    Signed-off-by: Li Zefan
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Li Zefan
     
  • MATCH_FULL matching for PTR_STRING is not working correctly:

    # echo 'func == vt' > events/bkl/lock_kernel/filter
    # echo 1 > events/bkl/lock_kernel/enable
    ...
    # cat trace
    Xorg-1484 [000] 1973.392586: lock_kernel: ... func=vt_ioctl()
    gpm-1402 [001] 1974.027740: lock_kernel: ... func=vt_ioctl()

    We should pass to regex.match(..., len) the length (including '\0')
    of the source string instead of the length of the pattern string.

    Signed-off-by: Li Zefan
    LKML-Reference:
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Li Zefan
     
  • The @str might not be NULL-terminated if it's of type
    DYN_STRING or STATIC_STRING, so we should use strnstr()
    instead of strstr() (strnstr() is sketched below).

    Signed-off-by: Li Zefan
    LKML-Reference:
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Li Zefan
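
    The semantics of the newly introduced strnstr() are along these
    lines (a length-limited strstr(); a sketch, not necessarily the
    exact lib/string.c code):

      /*
       * Find the first occurrence of s2 in s1, looking at no more
       * than the first len bytes of s1.
       */
      char *strnstr(const char *s1, const char *s2, size_t len)
      {
              size_t l2 = strlen(s2);

              if (!l2)
                      return (char *)s1;
              while (len >= l2) {
                      len--;
                      if (!memcmp(s1, s2, l2))
                              return (char *)s1;
                      s1++;
              }
              return NULL;
      }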
     
  • For a '*foo' pattern, we should allow any string ending with
    'foo', but event filtering incorrectly disallows strings
    like bar_foo_foo.

    Signed-off-by: Li Zefan
    LKML-Reference:
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Li Zefan
     
  • MATCH_FRONT_ONLY actually behaves as a full match:

    # ./perf record -R -f -a -e lock:lock_acquire \
    --filter 'name ~rcu_*' sleep 1
    # ./perf trace
    (no output)

    We should pass the length of the pattern string to strncmp().

    Signed-off-by: Li Zefan
    LKML-Reference:
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Li Zefan
     
  • For a '*foo' pattern, we should allow any string ending with
    'foo', but the ftrace filter incorrectly disallows strings
    like bar_foo_foo (the corrected comparison is sketched below):

    # echo '*io' > set_ftrace_filter
    # cat set_ftrace_filter | grep 'req_bio_endio'
    # cat available_filter_functions | grep 'req_bio_endio'
    req_bio_endio

    Signed-off-by: Li Zefan
    LKML-Reference:
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Li Zefan
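
    A hedged sketch of the corrected end-only comparison (names
    illustrative: str is the candidate, regex/len the pattern and its
    length):

      case MATCH_END_ONLY:
              slen = strlen(str);
              /* suffix match: compare the last len bytes of str */
              if (slen >= len && memcmp(str + slen - len, regex, len) == 0)
                      matched = 1;
              break;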
     

13 Jan, 2010

1 commit

  • Currently, futexes have two problems:

    A) The current futex code doesn't handle private file mappings properly.

    get_futex_key() uses PageAnon() to distinguish file and
    anon, which can cause the following bad scenario:

    1) thread-A calls futex(private-mapping, FUTEX_WAIT); it
    sleeps on the file mapping object.
    2) thread-B writes to the variable, which triggers a COW; the
    page becomes anonymous.
    3) thread-B calls futex(private-mapping, FUTEX_WAKE); it
    wakes up blocked threads on the anonymous page (i.e. it wakes
    nothing).

    B) Current futex code doesn't handle zero page properly.

    Read mode get_user_pages() can return zero page, but current
    futex code doesn't handle it at all. Then, zero page makes
    infinite loop internally.

    The solution is to always use write-mode get_user_pages() for the
    page lookup (sketched below). This prevents the lookup from
    returning either the file page of a private mapping or the zero
    page.

    Performance concerns:

    Probably very little, because glibc always initializes futex
    variables before calling futex(), so glibc users never see the
    overhead of this patch.

    Compatibility concerns:

    This patch has few compatibility issues. After this patch,
    FUTEX_WAIT requires write access to futex variables (read-only
    mappings get EFAULT). But practically it's not a problem: glibc
    always initializes futex variables explicitly, and nobody uses
    read-only mappings.

    Reported-by: Hugh Dickins
    Signed-off-by: KOSAKI Motohiro
    Acked-by: Peter Zijlstra
    Acked-by: Darren Hart
    Cc:
    Cc: Linus Torvalds
    Cc: KAMEZAWA Hiroyuki
    Cc: Nick Piggin
    Cc: Ulrich Drepper
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    KOSAKI Motohiro
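
    A hedged sketch of the lookup in get_futex_key() after the change
    (retry logic elided; the write=1 argument is the point):

      /*
       * Always fault the page in for write: this forces the COW for a
       * private mapping and can never return the zero page, so the
       * resulting key is stable.
       */
      err = get_user_pages_fast(address, 1, 1 /* write */, &page);
      if (err < 0)
              return err;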
     

12 Jan, 2010

2 commits

  • When print-fatal-signals is enabled it's possible to dump any memory
    reachable by the kernel to the log by simply jumping to that address from
    user space.

    Or crash the system if there's some hardware with read side effects.

    The fatal signals handler will dump 16 bytes at the execution address,
    which is fully controlled by ring 3.

    In addition, when something jumps to an unmapped address there will
    be up to 16 additional useless page faults, which might be slow
    (and at least is not very efficient).

    Fortunately this option is off by default and only exists on i386.

    But fix it by checking for kernel addresses and also stopping when
    there's a page fault (sketched below).

    Signed-off-by: Andi Kleen
    Cc: Ingo Molnar
    Cc: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
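
    A hedged sketch of the dump loop; get_user() both rejects kernel
    addresses (via access_ok) and reports faults, so one check covers
    both problems:

      int i;

      for (i = 0; i < 16; i++) {
              unsigned char insn;

              if (get_user(insn, (unsigned char *)(regs->ip + i)))
                      break;  /* kernel address or faulting page: stop */
              printk("%02x ", insn);
      }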
     
  • The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
    here in cgroup_diput():

    /*
    * if we're getting rid of the cgroup, refcount should ensure
    * that there are no pidlists left.
    */
    BUG_ON(!list_empty(&cgrp->pidlists));

    The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
    when pidlist_array_load() calls cgroup_pidlist_find():

    (1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
    pre-existing cgroup_pidlist, and increments its use_count.
    (2) if no matching cgroup_pidlist is found, then a new one is allocated, it
    down_write's its mutex, and the use_count is set to 0.
    (3) the matching, or new, cgroup_pidlist is returned to pidlist_array_load(),
    which increments its use_count, regardless of whether it is new or
    pre-existing, and up_write's the mutex.

    So if a matching list is ever encountered by cgroup_pidlist_find() during
    the life of a cgroup directory, it results in an inflated use_count value,
    preventing it from ever getting released by cgroup_release_pid_array().
    Then if the directory is subsequently removed, cgroup_diput() hits the
    BUG_ON() when it finds that the directory's cgroup is still populated with
    a pidlist.

    The patch simply removes the use_count increment when a matching pidlist
    is found by cgroup_pidlist_find(), because it gets bumped by the calling
    pidlist_array_load() function while still protected by the list's mutex.

    Signed-off-by: Dave Anderson
    Reviewed-by: Li Zefan
    Acked-by: Ben Blum
    Cc: Paul Menage
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Anderson