Doug / smarc-fsl-linux-kernel | Embedian Git Server

01 Apr, 2013

1 commit

dbf520a9d Revert "lockdep: check that no locks held at freeze time"

This reverts commit 6aa9707099c4b25700940eb3d016f16c4434360d.

Commit 6aa9707099c4 ("lockdep: check that no locks held at freeze time")
causes problems with NFS root filesystems. The failures were noticed on
OMAP2 and 3 boards during kernel init:

[ BUG: swapper/0/1 still has locks held! ]
3.9.0-rc3-00344-ga937536 #1 Not tainted
-------------------------------------
1 lock held by swapper/0/1:
#0: (&type->s_umount_key#13/1){+.+.+.}, at: [] sget+0x248/0x574

stack backtrace:
rpc_wait_bit_killable
__wait_on_bit
out_of_line_wait_on_bit
__rpc_execute
rpc_run_task
rpc_call_sync
nfs_proc_get_root
nfs_get_root
nfs_fs_mount_common
nfs_try_mount
nfs_fs_mount
mount_fs
vfs_kern_mount
do_mount
sys_mount
do_mount_root
mount_root
prepare_namespace
kernel_init_freeable
kernel_init

Although the rootfs mounts, the system is unstable. Here's a transcript
from a PM test:

http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt

Here's what the test log should look like:

http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt

Mailing list discussion is here:

http://lkml.org/lkml/2013/3/4/221

Deal with this for v3.9 by reverting the problem commit, until folks can
figure out the right long-term course of action.

Signed-off-by: Paul Walmsley
Cc: Mandeep Singh Baines
Cc: Jeff Layton
Cc: Shawn Guo
Cc: Fengguang Wu
Cc: Trond Myklebust
Cc: Ingo Molnar
Cc: Ben Chan
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Rafael J. Wysocki
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Walmsley
2013-04-01 02:38:33 +0800

29 Mar, 2013

1 commit

2c3de1c2d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull userns fixes from Eric W Biederman:
"The bulk of the changes are fixing the worst consequences of the user
namespace design oversight in not considering what happens when one
namespace starts off as a clone of another namespace, as happens with
the mount namespace.

The rest of the changes are just plain bug fixes.

Many thanks to Andy Lutomirski for pointing out many of these issues."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Restrict when proc and sysfs can be mounted
ipc: Restrict mounting the mqueue filesystem
vfs: Carefully propogate mounts across user namespaces
vfs: Add a mount flag to lock read only bind mounts
userns: Don't allow creation if the user is chrooted
yama: Better permission check for ptraceme
pid: Handle the exit of a multi-threaded init.
scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.

Linus Torvalds
2013-03-29 04:43:46 +0800

27 Mar, 2013

2 commits

87a8ebd63 userns: Restrict when proc and sysfs can be mounted

Only allow unprivileged mounts of proc and sysfs if they are already
mounted when the user namespace is created.

proc and sysfs are interesting because they have content that is
per namespace, and so fresh mounts are needed when new namespaces
are created while at the same time proc and sysfs have content that
is shared between every instance.

Respect the policy of who may see the shared content of proc and sysfs
by only allowing new mounts if there was an existing mount at the time
the user namespace was created.

In practice there are only two interesting cases: either proc and sysfs
are mounted at their usual places, or proc and sysfs are not mounted at
all (some form of mount namespace jail).
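
The policy can be pictured with a small C model. This is only a sketch
under stated assumptions: struct user_ns_model, its two flags, both
helper names and the -EPERM return value are hypothetical and are not
taken from the patch itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the per-user-namespace state the policy needs. */
struct user_ns_model {
    bool may_mount_proc;   /* proc was mounted when the namespace was created */
    bool may_mount_sysfs;  /* sysfs was mounted when the namespace was created */
};

/* Record, at user namespace creation time, which filesystems were visible. */
static void userns_record_mounts(struct user_ns_model *ns,
                                 bool proc_mounted, bool sysfs_mounted)
{
    ns->may_mount_proc = proc_mounted;
    ns->may_mount_sysfs = sysfs_mounted;
}

/* A fresh proc mount: allowed if privileged, otherwise only if recorded. */
static int mount_proc_model(const struct user_ns_model *ns, bool privileged)
{
    if (!privileged && !ns->may_mount_proc)
        return -EPERM;
    return 0;
}
```

In the "mount namespace jail" case both flags stay false, so an
unprivileged mount of proc is refused in this model.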

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-03-27 22:50:08 +0800
3151527ee userns: Don't allow creation if the user is chrooted

Guarantee that the policy of which files may be accessed, as
established by setting the root directory, will not be violated
by user namespaces. Do this by verifying that the root directory
points to the root of the mount namespace at the time of user
namespace creation.

Changing the root is a privileged operation, and as a matter of policy
it serves to limit unprivileged processes to files below the current
root directory.

For reasons of simplicity and comprehensibility the privilege to
change the root directory is gated solely on the CAP_SYS_CHROOT
capability in the user namespace. Therefore when creating a user
namespace we must ensure that the policy of which files may be
accessed cannot be violated by changing the root directory.

Anyone who runs a process in a chroot and would like to use user
namespaces can set up the same view of filesystems with a mount
namespace instead. As a result, this is not a practical
limitation for using user namespaces.
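
A minimal sketch of the check described above, assuming a directory can
be identified by a (mount, dentry) pair. struct path_model, both
function names and the -EPERM error code are illustrative assumptions,
not the kernel's actual code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a (mount, dentry) pair naming a directory. */
struct path_model {
    int mnt;
    int dentry;
};

static bool path_equal_model(const struct path_model *a,
                             const struct path_model *b)
{
    return a->mnt == b->mnt && a->dentry == b->dentry;
}

/*
 * Refuse to create a user namespace when the caller's root directory is
 * not the root of its mount namespace, i.e. when the caller is chrooted.
 */
static int create_user_ns_model(const struct path_model *fs_root,
                                const struct path_model *mnt_ns_root)
{
    if (!path_equal_model(fs_root, mnt_ns_root))
        return -EPERM;  /* assumed error code for this sketch */
    return 0;
}
```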

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn
Reported-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-03-27 22:49:29 +0800

26 Mar, 2013

2 commits

751c644b9 pid: Handle the exit of a multi-threaded init.

When a multi-threaded init exits and the initial thread is not the
last thread to exit, the initial thread hangs around as a zombie
until the last thread exits. In that case zap_pid_ns_processes
needs to wait until there are only 2 hashed pids in the pid
namespace, not one.

v2. Replace thread_pid_vnr(me) == 1 with the test thread_group_leader(me)
as suggested by Oleg.
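
The counting rule can be sketched in plain C. This is a userspace model,
not kernel code: struct task_model and both helpers are hypothetical,
and is_group_leader merely stands in for thread_group_leader(me).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one task property this sketch needs. */
struct task_model {
    bool is_group_leader;  /* models thread_group_leader(me) */
};

/*
 * Hashed pids that may remain once every other process is reaped: the
 * exiting thread itself, plus the zombie group leader when the exiting
 * thread is not the leader.
 */
static int init_pids_remaining(const struct task_model *me)
{
    return me->is_group_leader ? 1 : 2;
}

/* True once the wait in zap_pid_ns_processes could stop, in this model. */
static bool pidns_drained(const struct task_model *me, int nr_hashed)
{
    return nr_hashed == init_pids_remaining(me);
}
```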

Cc: stable@vger.kernel.org
Cc: Oleg Nesterov
Reported-by: Caj Larsson
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-03-26 18:41:23 +0800
a12183c62 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
"A single bugfix which prevents a non-functional timer device from
being selected as the fallback device, which is supposed to serve
timer interrupts on behalf of non-functional devices ..."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Don't allow dummy broadcast timers

Linus Torvalds
2013-03-26 09:03:34 +0800

23 Mar, 2013

2 commits

2ca067efd poweroff: change orderly_poweroff() to use schedule_work()

David said:

Commit 6c0c0d4d1080 ("poweroff: fix bug in orderly_poweroff()")
apparently fixes one bug in orderly_poweroff(), but introduces
another. The comments on orderly_poweroff() claim it can be called
from any context - and indeed we call it from interrupt context in
arch/powerpc/platforms/pseries/ras.c for example. But since that
commit this is no longer safe, since call_usermodehelper_fns() is not
safe in interrupt context without the UMH_NO_WAIT option.

orderly_poweroff() can be used from any context but UMH_WAIT_EXEC is
sleepable. Move the "force" logic into __orderly_poweroff() and change
orderly_poweroff() to use the global poweroff_work which simply calls
__orderly_poweroff().

While at it, remove the unneeded "int argc" and change argv_split() to
use GFP_KERNEL.

We use the global "bool poweroff_force" to pass the argument; this can
obviously affect the previous request if it is pending/running. So we
only allow the "false => true" transition, assuming that the pending
"true" should succeed anyway. If schedule_work() fails after that, we
know that work->func() was not called yet, so it must see the new value.

This means that orderly_poweroff() becomes async even if we do not run
the command, and it always succeeds: schedule_work() can only fail if
the work is already pending. We can export __orderly_poweroff() and
change the non-atomic callers which want the old semantics.
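
The "false => true" rule can be modelled in userspace C as follows.
Everything with a _model suffix, the work_pending flag and the two run
counters are assumptions made for illustration; the real code uses
schedule_work() on the global poweroff_work.

```c
#include <stdbool.h>

static bool poweroff_force;   /* models the global "bool poweroff_force" */
static bool work_pending;     /* models the pending state of poweroff_work */
static int forced_runs;       /* times the work ran with force == true */
static int plain_runs;        /* times the work ran with force == false */

/* Models schedule_work(): fails only if the work is already pending. */
static bool schedule_work_model(void)
{
    if (work_pending)
        return false;
    work_pending = true;
    return true;
}

/* Models the work function, which reads poweroff_force when it runs. */
static void run_pending_work_model(void)
{
    if (!work_pending)
        return;
    work_pending = false;
    if (poweroff_force)
        forced_runs++;
    else
        plain_runs++;
}

/* Models orderly_poweroff(): asynchronous and always "succeeds". */
static void orderly_poweroff_model(bool force)
{
    if (force)                /* never override a pending "true" */
        poweroff_force = true;
    schedule_work_model();    /* failure means the work is already queued */
}
```

If a non-forced request is queued and a forced one arrives before the
work runs, the second schedule fails, but the single run still observes
force == true.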

Signed-off-by: Oleg Nesterov
Reported-by: Benjamin Herrenschmidt
Reported-by: David Gibson
Cc: Lucas De Marchi
Cc: Feng Hong
Cc: Kees Cook
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-03-23 07:41:20 +0800
dc72c32e1 printk: Provide a wake_up_klogd() off-case

wake_up_klogd() is useless when CONFIG_PRINTK=n because neither printk()
nor printk_sched() is in use and there are actually no waiters on the
log_wait waitqueue. It should be a stub in this case for users like
bust_spinlocks().

Otherwise this results in this warning when CONFIG_PRINTK=n and
CONFIG_IRQ_WORK=n:

kernel/built-in.o In function `wake_up_klogd':
(.text.wake_up_klogd+0xb4): undefined reference to `irq_work_queue'

To fix this, provide an off-case for wake_up_klogd() when
CONFIG_PRINTK=n.
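
The off-case pattern looks roughly like this in a standalone C file.
This is a sketch only: the klogd_wakeups counter and
bust_spinlocks_model() are invented for illustration, and the real
wake_up_klogd() does more than bump a counter.

```c
/* Build with -DCONFIG_PRINTK to select the "real" implementation. */
static int klogd_wakeups;  /* illustrative counter, not in the kernel */

#ifdef CONFIG_PRINTK
static void wake_up_klogd(void)
{
    klogd_wakeups++;       /* stands in for the real wakeup work */
}
#else
/* Off-case: an empty inline stub, so callers need no #ifdef. */
static inline void wake_up_klogd(void)
{
}
#endif

/* A caller in the style of bust_spinlocks(); it compiles either way. */
static void bust_spinlocks_model(void)
{
    wake_up_klogd();
}
```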

There is much more in console_unlock() and other console related code
in printk.c that should be moved under CONFIG_PRINTK. But for now,
focus on a minimal fix as we have passed the merge window already.

[akpm@linux-foundation.org: include printk.h in bust_spinlocks.c]
Signed-off-by: Frederic Weisbecker
Reported-by: James Hogan
Cc: James Hogan
Cc: Steven Rostedt
Cc: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Frederic Weisbecker
2013-03-23 07:41:20 +0800

21 Mar, 2013

1 commit

cd8234693 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"A fair chunk of the linecount comes from a fix for a tracing bug that
corrupts latency tracing buffers when the overwrite mode is changed on
the fly - the rest is mostly assorted fewliner fixlets."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event
kprobes/x86: Check Interrupt Flag modifier when registering probe
kprobes: Make hash_64() as always inlined
perf: Generate EXIT event only once per task context
perf: Reset hwc->last_period on sw clock events
tracing: Prevent buffer overwrite disabled for latency tracers
tracing: Keep overwrite in sync between regular and snapshot buffers
tracing: Protect tracer flags with trace_types_lock
perf tools: Fix LIBNUMA build with glibc 2.12 and older.
tracing: Fix free of probe entry by calling call_rcu_sched()
perf/POWER7: Create a sysfs format entry for Power7 events
perf probe: Fix segfault
libtraceevent: Remove hard coded include to /usr/local/include in Makefile
perf record: Fix -C option
perf tools: check if -DFORTIFY_SOURCE=2 is allowed
perf report: Fix build with NO_NEWT=1
perf annotate: Fix build with NO_NEWT=1
tracing: Fix race in snapshot swapping

Linus Torvalds
2013-03-21 23:29:11 +0800

19 Mar, 2013

1 commit

b63dc123b Merge branch 'for-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fix from Tejun Heo:
"Lai's patch to fix highly unlikely but still possible workqueue stall
during CPU hotunplug."

* 'for-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix possible pool stall bug in wq_unbind_fn()

Linus Torvalds
2013-03-19 09:47:07 +0800

18 Mar, 2013

3 commits

1f1b39675 Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent

Pull tracing fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2013-03-18 16:48:29 +0800
d610d98b5 perf: Generate EXIT event only once per task context

perf_event_task_event() iterates over the pmu list and generates
events for each eligible pmu context. But if task_event has a
task_ctx, as in EXIT, it'll generate events even though the pmu
doesn't have an eligible one. Fix it by moving the code to the
proper places.

Before this patch:

$ perf record -n true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~248 samples) ]

$ perf report -D | tail
Aggregated stats:
TOTAL events: 73
MMAP events: 67
COMM events: 2
EXIT events: 4
cycles stats:
TOTAL events: 73
MMAP events: 67
COMM events: 2
EXIT events: 4

After this patch:

$ perf report -D | tail
Aggregated stats:
TOTAL events: 70
MMAP events: 67
COMM events: 2
EXIT events: 1
cycles stats:
TOTAL events: 70
MMAP events: 67
COMM events: 2
EXIT events: 1

Signed-off-by: Namhyung Kim
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1363332433-7637-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar

Namhyung Kim
2013-03-18 16:47:33 +0800
778141e3c perf: Reset hwc->last_period on sw clock events

When cpu/task clock events are initialized, their sampling
frequencies are converted to have a fixed value. However, the
conversion failed to update hwc->last_period, which was set to 1
for initial sampling frequency calibration.

Because this hwc->last_period value is used as a period in
perf_swevent_hrtime(), every recorded sample will have an
incorrect period of 1.

$ perf record -e task-clock noploop 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.158 MB perf.data (~6919 samples) ]

$ perf report -n --show-total-period --stdio
# Samples: 4K of event 'task-clock'
# Event count (approx.): 4000
#
# Overhead Samples Period Command Shared Object Symbol
# ........ ............ ............ ....... ............. ..................
#
99.95% 3998 3998 noploop noploop [.] main
0.03% 1 1 noploop libc-2.15.so [.] init_cacheinfo
0.03% 1 1 noploop ld-2.15.so [.] open_verify

Note that this doesn't affect non-sampling events, so perf stat
still gets the correct value with or without this patch.

$ perf stat -e task-clock noploop 1

Performance counter stats for 'noploop 1':

1000.272525 task-clock # 1.000 CPUs utilized

1.000560605 seconds time elapsed
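
The missing update can be illustrated with a small C model. The struct,
the function name and the NSEC_PER_SEC_MODEL constant are assumptions
for this sketch; only the idea of copying the fixed period into
last_period comes from the description above.

```c
#include <stdint.h>

#define NSEC_PER_SEC_MODEL 1000000000ULL

/* Hypothetical subset of the per-event hardware state. */
struct hw_event_model {
    uint64_t sample_period;
    uint64_t last_period;  /* starts at 1 for frequency calibration */
};

/* Convert a sampling frequency (Hz) into a fixed period in nanoseconds. */
static void swclock_init_model(struct hw_event_model *hwc, uint64_t freq_hz)
{
    if (freq_hz == 0)
        return;  /* not a frequency-based event; nothing to convert */

    hwc->sample_period = NSEC_PER_SEC_MODEL / freq_hz;
    /* The step the fix adds: keep last_period in sync with the period. */
    hwc->last_period = hwc->sample_period;
}
```

Without the last assignment, last_period would stay at 1 and each
sample would be reported with a period of 1, as in the report output
above.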

Signed-off-by: Namhyung Kim
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1363574507-18808-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar

Namhyung Kim
2013-03-18 16:15:18 +0800

15 Mar, 2013

3 commits

613f04a0f tracing: Prevent buffer overwrite disabled for latency tracers

The latency tracers require the buffers to be in overwrite mode,
otherwise they get screwed up. Force the buffers to stay in overwrite
mode when latency tracers are enabled.

Added a flag_changed() method to the tracer structure to allow
the tracers to see what flags are being changed, and also be able
to prevent the change from happening.
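
A veto callback of this kind might look like the following userspace
sketch. The struct layout, the flag bit value and all _model names are
assumptions; only the existence of a flag_changed() method on the
tracer structure comes from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define ITER_OVERWRITE_MODEL 0x1u  /* hypothetical flag bit */

struct tracer_model {
    const char *name;
    /* Return 0 to accept the change, nonzero to veto it. May be NULL. */
    int (*flag_changed)(struct tracer_model *t, unsigned int mask, bool set);
};

/* A latency tracer refuses to have overwrite mode switched off. */
static int latency_flag_changed(struct tracer_model *t,
                                unsigned int mask, bool set)
{
    (void)t;
    if ((mask & ITER_OVERWRITE_MODEL) && !set)
        return -EINVAL;
    return 0;
}

/* Ask the current tracer first; only apply the change if it agrees. */
static int set_tracer_flag_model(struct tracer_model *t, unsigned int *flags,
                                 unsigned int mask, bool set)
{
    if (t->flag_changed && t->flag_changed(t, mask, set))
        return -EINVAL;

    if (set)
        *flags |= mask;
    else
        *flags &= ~mask;
    return 0;
}
```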

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-15 11:40:21 +0800
809028226 tracing: Keep overwrite in sync between regular and snapshot buffers

Changing the overwrite mode for the ring buffer via the trace
option only sets the normal buffer. But the snapshot buffer could
swap with it, and then the snapshot would be in non-overwrite mode
and the normal buffer would be in overwrite mode, even though the
option flag states otherwise.

Keep the two buffers' overwrite modes in sync.
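
The effect of the swap can be seen in a tiny C model. Both structs and
both function names are hypothetical; the point is only that the
setting is applied to both buffers so that a later swap cannot change
the observed mode.

```c
#include <stdbool.h>

struct ring_buffer_model {
    bool overwrite;
};

struct trace_array_model {
    struct ring_buffer_model *buffer;    /* normal buffer */
    struct ring_buffer_model *snapshot;  /* snapshot buffer */
};

/* Apply the overwrite option to both buffers, not just the normal one. */
static void set_overwrite_model(struct trace_array_model *tr, bool on)
{
    tr->buffer->overwrite = on;
    tr->snapshot->overwrite = on;
}

/* Taking a snapshot exchanges the roles of the two buffers. */
static void swap_buffers_model(struct trace_array_model *tr)
{
    struct ring_buffer_model *tmp = tr->buffer;

    tr->buffer = tr->snapshot;
    tr->snapshot = tmp;
}
```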

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-15 11:40:15 +0800
69d34da29 tracing: Protect tracer flags with trace_types_lock

It seems that the tracer flags have never been protected from
concurrent writes. Luckily, admins don't usually modify the
tracing flags via two different tasks. But if scripts were to
be used to modify them, then they could get corrupted.

Move the trace_types_lock that protects against tracers changing
to also protect the flags being set.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-15 01:50:56 +0800

14 Mar, 2013

7 commits

0b34083f4 Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent

Pull tracing fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2013-03-14 15:12:20 +0800
842d223f2 Merge branch 'akpm' (fixes from Andrew)

Merge misc fixes from Andrew Morton:

- A bunch of fixes

- Finish off the idr API conversions before someone starts to use the
old interfaces again.

* emailed patches from Andrew Morton :
idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
UAPI: fix endianness conditionals in M32R's asm/stat.h
UAPI: fix endianness conditionals in linux/raid/md_p.h
UAPI: fix endianness conditionals in linux/acct.h
UAPI: fix endianness conditionals in linux/aio_abi.h
decompressors: fix typo "POWERPC"
mm/fremap.c: fix oops on error path
idr: deprecate idr_pre_get() and idr_get_new[_above]()
tidspbridge: convert to idr_alloc()
zcache: convert to idr_alloc()
mlx4: remove leftover idr_pre_get() call
workqueue: convert to idr_alloc()
nfsd: convert to idr_alloc()
nfsd: remove unused get_new_stid()
kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
signal: always clear sa_restorer on execve
mm: remove_memory(): fix end_pfn setting
include/linux/res_counter.h needs errno.h

Linus Torvalds
2013-03-14 06:21:57 +0800
e68035fb6 workqueue: convert to idr_alloc()

idr_get_new*() and friends are about to be deprecated. Convert to the
new idr_alloc() interface.

Signed-off-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2013-03-14 06:21:46 +0800
522cff142 kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER

__ARCH_HAS_SA_RESTORER is the preferred conditional for use in 3.9 and
later kernels, per Kees.

Cc: Emese Revfy
Cc: PaX Team
Cc: Al Viro
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Serge Hallyn
Cc: Julien Tinnes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2013-03-14 06:21:45 +0800
2ca39528c signal: always clear sa_restorer on execve

When the new signal handlers are set up, the location of sa_restorer is
not cleared, leaking a parent process's address space location to
children. This allows for a potential bypass of the parent's ASLR by
examining the sa_restorer value returned when calling sigaction().

Based on what should be considered "secret" about addresses, it only
matters across the exec not the fork (since the VMAs haven't changed
until the exec). But since exec sets SIG_DFL and keeps sa_restorer,
this is where it should be fixed.

Given the few uses of sa_restorer, a "set" function was not written
since this would be the only use. Instead, we use
__ARCH_HAS_SA_RESTORER, as already done in other places.

Example of the leak before applying this patch:

$ cat /proc/$$/maps
...
7fb9f3083000-7fb9f3238000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
...
$ ./leak
...
7f278bc74000-7f278be29000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
...
1 0 (nil) 0x7fb9f30b94a0
2 4000000 (nil) 0x7f278bcaa4a0
3 4000000 (nil) 0x7f278bcaa4a0
4 0 (nil) 0x7fb9f30b94a0
...
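
The fix amounts to clearing one more field while resetting the handlers.
A userspace model of that loop, with a hypothetical struct, enum and
table size (the kernel operates on its own k_sigaction table), might
read:

```c
#include <stddef.h>

#define NSIG_MODEL 4  /* tiny table, for illustration only */

enum handler_model { HANDLER_DFL, HANDLER_IGN, HANDLER_USER };

struct sigaction_model {
    enum handler_model handler;
    unsigned long flags;
    void *restorer;  /* models sa_restorer */
};

/* Reset signal dispositions as done on exec. */
static void flush_signal_handlers_model(struct sigaction_model *ka,
                                        int force_default)
{
    for (int i = 0; i < NSIG_MODEL; i++) {
        if (force_default || ka[i].handler != HANDLER_IGN)
            ka[i].handler = HANDLER_DFL;
        ka[i].flags = 0;
        ka[i].restorer = NULL;  /* the fix: do not leak the old address */
    }
}
```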

[akpm@linux-foundation.org: use SA_RESTORER for backportability]
Signed-off-by: Kees Cook
Reported-by: Emese Revfy
Cc: Emese Revfy
Cc: PaX Team
Cc: Al Viro
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Serge Hallyn
Cc: Julien Tinnes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2013-03-14 06:21:44 +0800
e66eded83 userns: Don't allow CLONE_NEWUSER | CLONE_FS

Don't allow sharing the root directory with processes in a
different user namespace. There doesn't seem to be any point, and to
allow it would require the overhead of putting a user namespace
reference in fs_struct (for permission checks) and incrementing that
reference count on practically every call to fork.

So just perform the inexpensive test of forbidding sharing fs_struct
across processes in different user namespaces. We already disallow
other forms of threading when unsharing a user namespace so this
should be no real burden in practice.

This updates setns, clone, and unshare to disallow multiple user
namespaces sharing an fs_struct.
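
The inexpensive test is a plain flag check. The sketch below uses the
flag values from the Linux UAPI header <linux/sched.h> under local
names; the function name and the choice of -EINVAL are assumptions for
illustration.

```c
#include <errno.h>

/* Values of CLONE_FS and CLONE_NEWUSER from <linux/sched.h>. */
#define CLONE_FS_MODEL      0x00000200UL
#define CLONE_NEWUSER_MODEL 0x10000000UL

/* Reject requests that would share fs_struct across user namespaces. */
static int check_clone_flags_model(unsigned long clone_flags)
{
    const unsigned long both = CLONE_NEWUSER_MODEL | CLONE_FS_MODEL;

    if ((clone_flags & both) == both)
        return -EINVAL;  /* assumed error code for this sketch */
    return 0;
}
```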

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Linus Torvalds

Eric W. Biederman
2013-03-14 06:00:20 +0800
740466bc8 tracing: Fix free of probe entry by calling call_rcu_sched()

Because function tracing is very invasive, and can even trace
calls to rcu_read_lock(), RCU access in function tracing is done
with preempt_disable_notrace(). This requires a synchronize_sched()
for updates and not a synchronize_rcu().

Function probes (traceon, traceoff, etc) must be freed after
a synchronize_sched() once their entries have been removed from
the hash. But call_rcu() is used. Fix this by using call_rcu_sched().

Also fix the usage to use hlist_del_rcu() instead of hlist_del().

Cc: stable@vger.kernel.org
Cc: Paul McKenney
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-14 05:57:44 +0800

13 Mar, 2013

2 commits

6c23cbbd5 futex: fix kernel-doc notation and spello

Fix kernel-doc warning in futex.c and convert 'Returns' to the new Return:
kernel-doc notation format.

Warning(kernel/futex.c:2286): Excess function parameter 'clockrt' description in 'futex_wait_requeue_pi'

Fix one spello.

Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds

Randy Dunlap
2013-03-13 11:42:10 +0800
20f22ab42 signals: fix new kernel-doc warnings

Fix new kernel-doc warnings in kernel/signal.c:

Warning(kernel/signal.c:2689): No description found for parameter 'uset'
Warning(kernel/signal.c:2689): Excess function parameter 'set' description in 'sys_rt_sigpending'

Signed-off-by: Randy Dunlap
Cc: Alexander Viro
Signed-off-by: Linus Torvalds

Randy Dunlap
2013-03-13 11:42:10 +0800

12 Mar, 2013

1 commit

2721e72dd tracing: Fix race in snapshot swapping

Although the swap is wrapped with a spin_lock, the assignment
of the temp buffer used to swap is not within that lock.
It needs to be moved into that lock, otherwise two swaps
happening on two different CPUs can end up using the wrong
temp buffer to assign in the swap.

Luckily, all current callers of the swap function appear to have
their own locks. But in case something is added that allows two
different callers to call the swap, then there's a chance that
this race can trigger and corrupt the buffers.

New code is coming soon that will allow for this race to trigger.

I've Cc'd stable, so this bug will not show up if someone backports
one of the changes that can trigger this bug.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-12 23:56:33 +0800

11 Mar, 2013

1 commit

7c6baa304 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Misc minor fixes mostly related to tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
s390: Fix a header dependencies related build error
tracing: update documentation of snapshot utility
tracing: Do not return EINVAL in snapshot when not allocated
tracing: Add help of snapshot feature when snapshot is empty
ftrace: Update the kconfig for DYNAMIC_FTRACE

Linus Torvalds
2013-03-11 22:54:29 +0800

09 Mar, 2013

2 commits

eb2834285 workqueue: fix possible pool stall bug in wq_unbind_fn()

Since multiple pools per cpu have been introduced, wq_unbind_fn() has
a subtle bug which may theoretically stall work item processing. The
problem is two-fold.

* wq_unbind_fn() depends on the worker executing wq_unbind_fn() itself
to start unbound chain execution, which works fine when there was
only a single pool. With multiple pools, only the pool which is
running wq_unbind_fn() - the highpri one - is guaranteed to have
such kick-off. The other pool could stall when its busy workers
block.

* The current code is setting WORKER_UNBIND / POOL_DISASSOCIATED of
the two pools in succession without initiating work execution
in between. Because setting the flags requires grabbing assoc_mutex
which is held while new workers are created, this could lead to
stalls if a pool's manager is waiting for the previous pool's work
items to release memory. This is almost purely theoretical though.

Update wq_unbind_fn() such that it sets WORKER_UNBIND /
POOL_DISASSOCIATED, goes over schedule() and explicitly kicks off
execution for a pool and then moves on to the next one.

tj: Updated comments and description.

Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
Cc: stable@vger.kernel.org

Lai Jiangshan
2013-03-09 07:18:28 +0800
dc893e19b Revert parts of "hlist: drop the node parameter from iterators"

Commit b67bfe0d42ca ("hlist: drop the node parameter from iterators")
did a lot of nice changes but also contains two small hunks that seem to
have slipped in accidentally and have no apparent connection to the
intent of the patch.

This reverts the two extraneous changes.

Signed-off-by: Arnd Bergmann
Cc: Peter Senna Tschudin
Cc: Paul E. McKenney
Cc: Sasha Levin
Cc: Thomas Gleixner
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arnd Bergmann
2013-03-09 07:05:34 +0800

08 Mar, 2013

1 commit

a7dc19b86 clockevents: Don't allow dummy broadcast timers

Currently tick_check_broadcast_device doesn't reject clock_event_devices
with CLOCK_EVT_FEAT_DUMMY, and may select them in preference to real
hardware if they have a higher rating value. In this situation, the
dummy timer is responsible for broadcasting to itself, and the core
clockevents code may attempt to call non-existent callbacks for
programming the dummy, eventually leading to a panic.

This patch makes tick_check_broadcast_device always reject dummy timers,
preventing this problem.
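
The selection rule can be sketched as a predicate in C. The struct, the
feature bit value and the function name below are stand-ins chosen for
illustration; the real tick_check_broadcast_device performs further
checks that are omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

#define FEAT_DUMMY_MODEL 0x10u  /* hypothetical value for the dummy bit */

struct clock_event_device_model {
    unsigned int features;
    int rating;
};

/*
 * Return true if `dev` should replace `cur` (which may be NULL) as the
 * broadcast device. Dummy devices are rejected whatever their rating.
 */
static bool check_broadcast_device_model(
        const struct clock_event_device_model *cur,
        const struct clock_event_device_model *dev)
{
    if (dev->features & FEAT_DUMMY_MODEL)
        return false;
    if (cur != NULL && cur->rating >= dev->rating)
        return false;
    return true;
}
```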

Signed-off-by: Mark Rutland
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jon Medhurst (Tixy)
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner

Mark Rutland
2013-03-08 00:16:11 +0800

07 Mar, 2013

2 commits

c9960e485 tracing: Do not return EINVAL in snapshot when not allocated

To use the tracing snapshot feature, writing a '1' into the snapshot
file causes the snapshot buffer to be allocated if it has not already
been allocated and does a 'swap' with the main buffer, so that the
snapshot now contains what was in the main buffer, and the main buffer
now writes to what was the snapshot buffer.

To free the snapshot buffer, a '0' is written into the snapshot file.

To clear the snapshot buffer, any number but a '0' or '1' is written
into the snapshot file. But if the snapshot buffer is not allocated,
the write returns the -EINVAL error code. This is rather pointless.
It is better just to do nothing and return success.
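
The three write values and the new behaviour can be summarised in a
small C model. The struct, the fixed buffer length and the function name
are assumptions; real ring buffers are not flat int arrays.

```c
#include <stdbool.h>
#include <string.h>

#define BUF_LEN_MODEL 4

struct trace_model {
    int main_buf[BUF_LEN_MODEL];
    int snap_buf[BUF_LEN_MODEL];
    bool snap_allocated;
};

/* Models a write of `val` to the snapshot file. Returns 0 on success. */
static int snapshot_write_model(struct trace_model *tr, unsigned long val)
{
    int tmp[BUF_LEN_MODEL];

    switch (val) {
    case 0:  /* free the snapshot buffer */
        memset(tr->snap_buf, 0, sizeof(tr->snap_buf));
        tr->snap_allocated = false;
        break;
    case 1:  /* allocate if needed, then swap with the main buffer */
        if (!tr->snap_allocated) {
            memset(tr->snap_buf, 0, sizeof(tr->snap_buf));
            tr->snap_allocated = true;
        }
        memcpy(tmp, tr->main_buf, sizeof(tmp));
        memcpy(tr->main_buf, tr->snap_buf, sizeof(tmp));
        memcpy(tr->snap_buf, tmp, sizeof(tmp));
        break;
    default: /* clear; with no snapshot allocated, do nothing and succeed */
        if (tr->snap_allocated)
            memset(tr->snap_buf, 0, sizeof(tr->snap_buf));
        break;
    }
    return 0;
}
```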

Acked-by: Hiraku Toyooka
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-07 23:31:38 +0800
d8741e2e8 tracing: Add help of snapshot feature when snapshot is empty

When cat'ing the snapshot file, instead of showing an empty trace
header like the trace file does, show how to use the snapshot
feature.

Also, this is a good place to show if the snapshot has been allocated
or not. Users may want to "pre allocate" the snapshot to have a fast
"swap" of the current buffer. Otherwise, a swap would be slow and might
fail as it would need to allocate the snapshot buffer, and that might
fail under tight memory constraints.

Here's what it looked like before:

# tracer: nop
#
# entries-in-buffer/entries-written: 0/0 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |

Here's what it looks like now:

# tracer: nop
#
#
# * Snapshot is freed *
#
# Snapshot commands:
# echo 0 > snapshot : Clears and frees snapshot buffer
# echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
# Takes a snapshot of the main buffer.
# echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
# (Doesn't have to be '2' works with any number that
# is not a '0' or '1')

Acked-by: Hiraku Toyooka
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-03-07 23:31:22 +0800

06 Mar, 2013

2 commits

e3b59518c Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes and cleanups from Thomas Gleixner:
"Commit e5ab012c3271 ("nohz: Make tick_nohz_irq_exit() irq safe") is
the first commit in the series and the minimal necessary bugfix, which
needs to go back into stable.

The remaining commits enforce irq disabling in irq_exit(), sanitize
the hardirq/softirq preempt count transition and remove a bunch of no
longer necessary conditionals."

I personally love getting rid of the very subtle and confusing
IRQ_EXIT_OFFSET thing. Even apart from the whole "more lines removed
than added" thing.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq: Don't re-enable interrupts at the end of irq_exit
irq: Remove IRQ_EXIT_OFFSET workaround
Revert "nohz: Make tick_nohz_irq_exit() irq safe"
irq: Sanitize invoke_softirq
irq: Ensure irq_exit() code runs with interrupts disabled
nohz: Make tick_nohz_irq_exit() irq safe

Linus Torvalds
2013-03-06 10:10:04 +0800
6516ab6fd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull smpboot bugfix from Thomas Gleixner:
"A single bugfix for a regression introduced with the conversion of the
stop machine threads to the generic smpboot thread management
facility"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stop_machine: Mark per cpu stopper enabled early

Linus Torvalds
2013-03-06 10:07:12 +0800

04 Mar, 2013

2 commits

56a79b7b0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more VFS bits from Al Viro:
"Unfortunately, it looks like xattr series will have to wait until the
next cycle ;-/

This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
more file_inode() work"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify path_get/path_put and fs_struct.c stuff
fix nommu breakage in shmem.c
cache the value of file_inode() in struct file
9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
9p: make sure ->lookup() adds fid to the right dentry
9p: untangle ->lookup() a bit
9p: double iput() in ->lookup() if d_materialise_unique() fails
9p: v9fs_fid_add() can't fail now
v9fs: get rid of v9fs_dentry
9p: turn fid->dlist into hlist
9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
more file_inode() open-coded instances
selinux: opened file can't have NULL or negative ->f_path.dentry

(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)

Linus Torvalds
2013-03-04 05:23:03 +0800
8fd5e7a2d Merge tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull new ImgTec Meta architecture from James Hogan:
"This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:

- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of
metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()"

* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
metag: Provide dma_get_sgtable()
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
metag: copy devicetree to non-init memory
metag: cleanup metag_ksyms.c includes
metag: move mm/init.c exports out of metag_ksyms.c
metag: move usercopy.c exports out of metag_ksyms.c
metag: move setup.c exports out of metag_ksyms.c
metag: move kick.c exports out of metag_ksyms.c
metag: move traps.c exports out of metag_ksyms.c
metag: move irq enable out of irqflags.h on SMP
genksyms: fix metag symbol prefix on crc symbols
metag: hugetlb: convert to vm_unmapped_area()
metag: export clear_page and copy_page
metag: export metag_code_cache_flush_all
metag: protect more non-MMU memory regions
metag: make TXPRIVEXT bits explicit
metag: kernel/setup.c: sort includes
perf: Enable building perf tools for Meta
metag: add boot time LNKGET/LNKSET check
metag: add __init to metag_cache_probe()
...

Linus Torvalds
2013-03-04 04:06:09 +0800

03 Mar, 2013

4 commits

6ec40b423 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal

Pull sigprocmask compat fix from Al Viro:
"generic compat_sys_rt_sigprocmask() had a very dumb braino; I'd spent
quite a while staring at the offending commit before finally managing
to spot the idiocy ;-/"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
fix compat_sys_rt_sigprocmask()

Linus Torvalds
2013-03-03 11:32:06 +0800
db61ec29f fix compat_sys_rt_sigprocmask()

Converting bitmask to 32bit granularity is fine, but we'd better
_do_ something with the result. Such as "copy it to userland"...
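
The two steps, converting and then actually copying out the result,
look roughly like this in userspace C. Both function names, the
low-half-first word order and the plain memcpy standing in for a copy to
userland are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_COMPAT_WORDS_MODEL 8

/* Split each 64-bit mask word into two 32-bit words, low half first. */
static void sigset_to_compat_model(uint32_t *compat, const uint64_t *set,
                                   size_t nwords64)
{
    for (size_t i = 0; i < nwords64; i++) {
        compat[2 * i] = (uint32_t)set[i];
        compat[2 * i + 1] = (uint32_t)(set[i] >> 32);
    }
}

/* Report the old mask to a 32-bit caller. Returns 0 on success. */
static int get_old_mask_model(uint32_t *user_out, size_t out_words,
                              const uint64_t *old, size_t nwords64)
{
    uint32_t tmp[MAX_COMPAT_WORDS_MODEL];

    if (nwords64 * 2 > MAX_COMPAT_WORDS_MODEL || out_words < nwords64 * 2)
        return -1;

    sigset_to_compat_model(tmp, old, nwords64);
    /* The step that must not be forgotten: hand the converted copy out. */
    memcpy(user_out, tmp, nwords64 * 2 * sizeof(uint32_t));
    return 0;
}
```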

Signed-off-by: Al Viro

Al Viro
2013-03-03 09:39:15 +0800
649508f68 trace/ring_buffer: handle 64bit aligned structs

Some 32 bit architectures require 64 bit values to be aligned (for
example Meta which has 64 bit read/write instructions). These require 8
byte alignment of event data too, so use
!CONFIG_HAVE_64BIT_ALIGNED_ACCESS instead of !CONFIG_64BIT ||
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to decide alignment, and align
buffer_data_page::data accordingly.
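
The alignment choice can be sketched with a compile-time switch. The
macro and function names below are stand-ins with a _MODEL suffix;
define HAVE_64BIT_ALIGNED_ACCESS_MODEL at build time to model an
architecture that needs 8 byte alignment.

```c
#define RB_ALIGNMENT_MODEL 4U

#ifndef HAVE_64BIT_ALIGNED_ACCESS_MODEL
# define RB_ARCH_ALIGNMENT_MODEL RB_ALIGNMENT_MODEL
#else
# define RB_ARCH_ALIGNMENT_MODEL 8U
#endif

/* Round an event length up to the architecture's required alignment. */
static unsigned int rb_align_len_model(unsigned int len)
{
    return (len + RB_ARCH_ALIGNMENT_MODEL - 1) &
           ~(RB_ARCH_ALIGNMENT_MODEL - 1);
}
```

With the default 4 byte alignment a 9 byte event is padded to 12 bytes;
with 8 byte alignment it is padded to 16.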

Signed-off-by: James Hogan
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Acked-by: Steven Rostedt (previous version subtly different)

James Hogan
2013-03-03 04:09:16 +0800
3cfb07743 Merge tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB fixes and cleanups from Jason Wessel:
"For a change we removed more code than we added. If people aren't
using it we shouldn't be carrying it. :-)

Cleanups:
- Remove kdb ssb command - there is no in kernel disassembler to
support it

- Remove kdb ll command - Always caused a kernel oops and there were
no bug reports so no one was using this command

- Use kernel ARRAY_SIZE macro instead of array computations

Fixes:
- Stop oops in kdb if user executes kdb_defcmd with args

- kdb help command truncated text

- ppc64 support for kgdbts

- Add missing kconfig option from original kdb port for dealing with
catastrophic kernel crashes such that you can reboot automatically
on continue from kdb"

* tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
kdb: Remove unhandled ssb command
kdb: Prevent kernel oops with kdb_defcmd
kdb: Remove the ll command
kdb_main: fix help print
kdb: Fix overlap in buffers with strcpy
Fixed dead ifdef block by adding missing Kconfig option.
kdb: Setup basic kdb state before invoking commands via kgdb
kdb: use ARRAY_SIZE where possible
kgdb/kgdbts: support ppc64
kdb: A fix for kdb command table expansion

Linus Torvalds
2013-03-03 00:31:39 +0800