Eric Lee / linux-smarc-t335x-v3.2

16 Jul, 2011

3 commits

614243181 tracing/kprobes: Support module init function probing

To support probing module init functions, kprobe-tracer allows the
user to define a probe on a function that does not exist yet when it
is given with a module name. This also enables the user to set a probe
on a function in a specific module, even if a different function with
the same name is locally defined in another module.

The module name must precede the function name, separated
by a ':', e.g. btrfs:btrfs_init_sysfs
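
The "module:function" form described above can be illustrated with a
small user-space sketch. This is not the kernel parser; the helper name
split_probe_symbol() is hypothetical, and the only assumption taken
from the commit message is that the first ':' separates the module
name from the function name.

```python
from typing import Optional, Tuple


def split_probe_symbol(spec: str) -> Tuple[Optional[str], str]:
    """Split 'module:function' into (module, function).

    Hypothetical helper for illustration: a spec without ':' is
    treated as a plain function name with no module part.
    """
    module, sep, function = spec.partition(":")
    if not sep:
        return None, spec
    return module, function
```

For example, split_probe_symbol("btrfs:btrfs_init_sysfs") yields the
module "btrfs" and the function "btrfs_init_sysfs", while a bare
function name is returned with no module.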

Signed-off-by: Masami Hiramatsu
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072656.6528.89970.stgit@fedora15
Signed-off-by: Steven Rostedt

Masami Hiramatsu
2011-07-16 03:17:14 +0800
bc81d48d1 kprobes: Return -ENOENT if probe point doesn't exist

Return -ENOENT if the probe point doesn't exist, but still return
-EINVAL if both kprobe->addr and kprobe->symbol_name are
specified or neither is specified.
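
The error selection described above might be modelled as follows. This
is a user-space sketch, not the kernel code: resolve_probe_address()
and the symbol table are hypothetical, and None stands in for an unset
kprobe->addr or kprobe->symbol_name.

```python
import errno
from typing import Dict, Optional


def resolve_probe_address(addr: Optional[int],
                          symbol_name: Optional[str],
                          symbols: Dict[str, int]) -> int:
    """Return the probe address, or a negative errno value on failure."""
    # Exactly one of addr / symbol_name must be given: -EINVAL otherwise.
    if (addr is None) == (symbol_name is None):
        return -errno.EINVAL
    if addr is not None:
        return addr
    # The request is well formed but the symbol is unknown: -ENOENT.
    found = symbols.get(symbol_name)
    if found is None:
        return -errno.ENOENT
    return found
```

The point of the distinction is that a caller can tell a malformed
request (-EINVAL) apart from a well-formed request whose target is not
present (-ENOENT).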

Acked-by: Ananth N Mavinakayanahalli
Signed-off-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Arnaldo Carvalho de Melo
Cc: Ingo Molnar
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Anil S Keshavamurthy
Cc: "David S. Miller"
Link: http://lkml.kernel.org/r/20110627072650.6528.67329.stgit@fedora15
Signed-off-by: Steven Rostedt

Masami Hiramatsu
2011-07-16 03:11:47 +0800
1538f888f tracing/kprobes: Merge trace probe enable/disable functions

Merge redundant enable/disable functions into enable_trace_probe()
and disable_trace_probe().

Signed-off-by: Masami Hiramatsu
Cc: Arnaldo Carvalho de Melo
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/20110627072644.6528.26910.stgit@fedora15

[ converted kprobe selftest to use enable_trace_probe ]

Signed-off-by: Steven Rostedt

Masami Hiramatsu
2011-07-16 03:10:58 +0800

15 Jul, 2011

3 commits

7143f168e tracing/kprobes: Rename probe_* to trace_probe_*

Rename probe_* to trace_probe_* to avoid namespace
conflicts. This also fixes the improper names find_probe_event()
and cleanup_all_probes(), renaming them to find_trace_probe() and
release_all_trace_probes() respectively.

Signed-off-by: Masami Hiramatsu
Cc: Arnaldo Carvalho de Melo
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/20110627072636.6528.60374.stgit@fedora15
Signed-off-by: Steven Rostedt

Masami Hiramatsu
2011-07-15 05:44:43 +0800
f91298709 perf, x86: P4 PMU - Introduce event alias feature

Instead of the hw_nmi_watchdog_set_attr() weak function
and the corresponding x86_pmu::hw_watchdog_set_attr() call,
we introduce an event alias mechanism which allows us
to drop these routines completely and isolate the quirks
of the Netburst architecture inside the P4 PMU code only.

The main idea remains the same though -- to allow
nmi-watchdog and perf top to run simultaneously.

Note that the aliasing mechanism applies to the generic
PERF_COUNT_HW_CPU_CYCLES event only, because an arbitrary
event (say, one passed as RAW initially) might have some
additional bits set inside the ESCR register that change
the behaviour of the event, and we can no longer guarantee
that the alias event will give the same result.

P.S. Huge thanks to Don and Steven for testing
and early review.

Acked-by: Don Zickus
Tested-by: Steven Rostedt
Signed-off-by: Cyrill Gorcunov
CC: Ingo Molnar
CC: Peter Zijlstra
CC: Stephane Eranian
CC: Lin Ming
CC: Arnaldo Carvalho de Melo
CC: Frederic Weisbecker
Link: http://lkml.kernel.org/r/20110708201712.GS23657@sun
Signed-off-by: Steven Rostedt

Cyrill Gorcunov
2011-07-15 05:25:04 +0800
4a9bd3f13 tracing: Have dynamic size event stack traces

Currently the stack trace per event in ftrace is only 8 frames.
This can be quite limiting and sometimes useless. Especially when
the "ignore frames" is wrong and we also use up stack frames for
the event processing itself.

Change this to be dynamic by adding a percpu buffer that we can
write a large stack frame into and then copy into the ring buffer.

Interrupts and NMIs that come in while another event is being
processed will only get to use the 8 frame stack. That should be enough
as the task that was interrupted will have the full stack frame anyway.

Requested-by: Thomas Gleixner
Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-15 04:36:53 +0800

14 Jul, 2011

3 commits

6331c28c9 ftrace: Fix dynamic selftest failure on some archs

Archs that do not implement CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST will
fail the dynamic ftrace selftest.

The function tracer has a quick 'off' variable that will prevent
the call back functions from being called. This variable is called
function_trace_stop. In x86, this is implemented directly in the mcount
assembly, but for other archs, an intermediate function is used called
ftrace_test_stop_func().

In dynamic ftrace, the function pointer variable ftrace_trace_function is
used to update the caller code in the mcount caller. But for archs that
do not have CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST set, it only calls
ftrace_test_stop_func() instead, which in turn calls __ftrace_trace_function.

When more than one ftrace_ops is registered, the function it calls is
ftrace_ops_list_func(), which will iterate over all registered ftrace_ops
and call the callbacks that have their hash matching.

The issue happens when two ftrace_ops are registered for different functions
and one is then unregistered. The __ftrace_trace_function is then pointed
to the remaining ftrace_ops callback function directly. This means it will
be called for all functions that were registered to trace by both ftrace_ops
that were registered.

This is not an issue for archs with CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST,
because the update of ftrace_trace_function doesn't happen until after all
functions have been updated, and then the mcount caller is updated. But
for those archs that do use the ftrace_test_stop_func(), the update is
immediate.

The dynamic selftest fails because it hits this situation, and the
ftrace_ops that it registers fails to trace only what it was supposed to
and instead traces all other functions.

The solution is to delay the setting of __ftrace_trace_function until
after all the functions have been updated according to the registered
ftrace_ops. Also, function_trace_stop is set during the update to prevent
function tracing from calling code that is caused by the function tracer
itself.
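
The ordering problem described above can be shown with a toy model.
This is only a sketch under simplifying assumptions: the Ops class,
ops_list_func() and unregister_b() are hypothetical stand-ins for
ftrace_ops, ftrace_ops_list_func() and the unregister path, and the
"enabled sites" list stands in for the patched mcount call sites.

```python
class Ops:
    """Toy stand-in for an ftrace_ops: a filter set plus a callback."""

    def __init__(self, name, filter_funcs):
        self.name = name
        self.filter = set(filter_funcs)
        self.hits = []

    def callback(self, func):
        self.hits.append(func)


def ops_list_func(ops_list, func):
    """Model of the list function: call only the ops whose filter matches."""
    for ops in ops_list:
        if func in ops.filter:
            ops.callback(func)


def unregister_b(switch_pointer_first):
    """Return what ops 'a' traced when 'b' is unregistered."""
    a = Ops("a", {"schedule"})
    b = Ops("b", {"kfree"})
    enabled_sites = ["kfree", "schedule"]  # call sites enabled for a and b

    # While both are registered, the list function keeps them apart.
    for func in enabled_sites:
        ops_list_func([a, b], func)

    # b is unregistered here; only a remains.
    if switch_pointer_first:
        # Immediate switch to a.callback: stale sites still fire into it.
        for func in enabled_sites:
            a.callback(func)

    # Call sites are updated to match the remaining ops, and only then
    # is it safe to call a.callback directly.
    enabled_sites = [f for f in enabled_sites if f in a.filter]
    for func in enabled_sites:
        a.callback(func)
    return a.hits
```

With switch_pointer_first=True the model records "kfree" in a.hits,
which a never asked to trace; with the delayed switch it records only
"schedule".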

Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-14 10:25:09 +0800
072126f45 ftrace: Update filter when tracing enabled in set_ftrace_filter()

Currently, if set_ftrace_filter() is called when the ftrace_ops is
active, the function filters will not be updated. They will only be updated
when tracing is disabled and re-enabled.

Update the functions immediately during set_ftrace_filter().

Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-14 10:10:05 +0800
41fb61c2d ftrace: Balance records when updating the hash

Whenever the hash of the ftrace_ops is updated, the record counts
must be balanced. This requires disabling the records that are set
in the original hash, and then enabling the records that are set
in the updated hash.

Moving the update into ftrace_hash_move() removes the bug where the
hash was updated but the records were not, which results in ftrace
triggering a warning and disabling itself because the ftrace_ops filter
is updated while the ftrace_ops was registered, and then the failure
happens when the ftrace_ops is unregistered.

The current code will not trigger this bug, but new code will.
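
The balancing rule described above can be sketched with a reference
count per function record. This is a simplified model, not the kernel
implementation: hash_move() is a hypothetical name, and a Counter
stands in for the per-record reference counts.

```python
from collections import Counter


def hash_move(records: Counter, old_hash: set, new_hash: set) -> None:
    """Swap an ops' filter hash while keeping record counts balanced."""
    # Disable (drop one reference on) every record in the original hash.
    for func in old_hash:
        records[func] -= 1
    # Enable (take one reference on) every record in the updated hash.
    for func in new_hash:
        records[func] += 1
```

Doing both steps in one place means the hash and the record counts
cannot drift apart, which is the situation the commit message says
would otherwise trigger a warning when the ftrace_ops is unregistered.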

Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-14 10:00:50 +0800

08 Jul, 2011

2 commits

4376cac66 ftrace: Do not disable interrupts for modules in mcount update

When I mounted an NFS directory, it caused several modules to be loaded. At the
time I was running the preemptirqsoff tracer, and it showed the following
output:

# tracer: preemptirqsoff
#
# preemptirqsoff latency trace v1.1.5 on 2.6.33.9-rt30-mrg-test
# --------------------------------------------------------------------
# latency: 1177 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: modprobe-19370 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
# => started at: ftrace_module_notify
# => ended at: ftrace_module_notify
#
#
#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /_--=> lock-depth
#                |||||/     delay
#  cmd     pid   |||||| time  |   caller
#     \   /      ||||||   \   |   /
 modprobe-19370   3d....    0us!: ftrace_process_locs
=> ftrace_process_locs
=> ftrace_module_notify
=> notifier_call_chain
=> __blocking_notifier_call_chain
=> blocking_notifier_call_chain
=> sys_init_module
=> system_call_fastpath

That's over 1ms that interrupts are disabled on a Real-Time kernel!

Looking at the cause (being the ftrace author helped), I found that the
interrupts are disabled before the code modification of mcounts into nops. The
interrupts only need to be disabled on start up around this code, not when
modules are being loaded.

Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-08 10:39:38 +0800
e4a3f541f tracing: Still trace filtered irq functions when irq trace is disabled

If a function is set to be traced by the set_graph_function, but the
option funcgraph-irqs is zero, and the traced function happens to be
called from an interrupt, it will not be traced.

The point of funcgraph-irqs is to not trace interrupts when we are
preempted by an irq, not to not trace functions we want to trace that
happen to be *in* an irq.

Luckily the current->trace_recursion element is perfect to add a flag
to help us be able to trace functions within an interrupt even when
we are not tracing interrupts that preempt the trace.

Reported-by: Heiko Carstens
Tested-by: Heiko Carstens
Signed-off-by: Steven Rostedt

Steven Rostedt
2011-07-08 10:26:27 +0800

05 Jul, 2011

1 commit

931da6137 Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/…

…rostedt/linux-2.6-trace into perf/core

Ingo Molnar
2011-07-05 17:55:43 +0800

01 Jul, 2011

10 commits

26ca5c11f perf: export perf_event_refresh() to modules

KVM needs one-shot samples, since a PMC programmed to -X will fire after X
events and then again after 2^40 events (i.e. variable period).
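
The variable period mentioned above follows from counter wrap-around
arithmetic, which can be checked with a short sketch. The 40-bit width
is taken from the 2^40 figure in the message; the function names are
hypothetical.

```python
COUNTER_BITS = 40  # width implied by the 2^40 figure above


def program_counter(period: int, bits: int = COUNTER_BITS) -> int:
    """Two's-complement encoding of -period in a `bits`-wide counter."""
    return (-period) % (1 << bits)


def events_until_overflow(value: int, bits: int = COUNTER_BITS) -> int:
    """Events needed before a counter holding `value` wraps past 2**bits."""
    return (1 << bits) - (value % (1 << bits))
```

A counter programmed with program_counter(X) overflows after X events.
It then holds 0, so the next overflow takes events_until_overflow(0),
i.e. 2^40 events, unless it is reprogrammed.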

Signed-off-by: Avi Kivity
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1309362157-6596-4-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar

Avi Kivity
2011-07-01 17:06:40 +0800
4dc0da869 perf: Add context field to perf_event

The perf_event overflow handler does not receive any caller-derived
argument, so many callers need to resort to looking up the perf_event
in their local data structure. This is ugly and doesn't scale if a
single callback services many perf_events.

Fix by adding a context parameter to perf_event_create_kernel_counter()
(and derived hardware breakpoints APIs) and storing it in the perf_event.
The field can be accessed from the callback as event->overflow_handler_context.
All callers are updated.

Signed-off-by: Avi Kivity
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar

Avi Kivity
2011-07-01 17:06:38 +0800
a7ac67ea0 perf: Remove the perf_output_begin(.sample) argument

Since only samples call perf_output_sample(), it's much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!

Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2011-07-01 17:06:35 +0800
a8b0ca17b perf: Remove the nmi parameter from the swevent and overflow interface

The nmi parameter indicated whether we could do wakeups from the current
context; if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

- hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
the PMI-tail (ARM etc.)
- tracepoint: nmi=0; since tracepoint could be from NMI context.
- software: nmi=[0,1]; some, like the schedule thing cannot
perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra
Cc: Michael Cree
Cc: Will Deacon
Cc: Deng-Cheng Zhu
Cc: Anton Blanchard
Cc: Eric B Munson
Cc: Heiko Carstens
Cc: Paul Mundt
Cc: David S. Miller
Cc: Frederic Weisbecker
Cc: Jason Wessel
Cc: Don Zickus
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2011-07-01 17:06:35 +0800
1880c4ae1 perf, x86: Add hw_watchdog_set_attr() in a sake of nmi-watchdog on P4

Due to the restrictions and specifics of the Netburst PMU we need a
separate event for the NMI watchdog. In particular, every Netburst event
consumes not just a counter and a config register, but also an
additional ESCR register.

Since ESCR registers are grouped upon counters (i.e. if an ESCR is occupied
by some event there is no room for another event to enter until it's
released) we need to pick the "least" used ESCR (or the most available
one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen.

With this patch nmi-watchdog and perf top should be able to run simultaneously.

Signed-off-by: Cyrill Gorcunov
CC: Lin Ming
CC: Arnaldo Carvalho de Melo
CC: Frederic Weisbecker
Tested-and-reviewed-by: Don Zickus
Tested-and-reviewed-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110623124918.GC13050@sun
Signed-off-by: Ingo Molnar

Cyrill Gorcunov
2011-07-01 17:06:34 +0800
0d6412085 events: Ensure that timers are updated without requiring read() call

The event tracing infrastructure exposes two timers which should be updated
each time the value of the counter is updated. Currently, these timers are
only updated when userspace calls read() on the fd associated with an event.
This means that counters which are read via the mmap'd page exclusively never
have their timers updated. This patch ensures that the timers are updated
each time the values in the mmap'd page are updated.

Signed-off-by: Eric B Munson
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1308932786-5111-1-git-send-email-emunson@mgebm.net
Signed-off-by: Ingo Molnar

Eric B Munson
2011-07-01 17:06:34 +0800
c47942959 events: Move lockless timer calculation into helper function

Take the timer calculation from perf_output_read and move it to a helper
function for any place that needs timer values but cannot take the ctx->lock.

Signed-off-by: Eric B Munson
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1308861279-15216-2-git-send-email-emunson@mgebm.net
Signed-off-by: Ingo Molnar

Eric B Munson
2011-07-01 17:06:33 +0800
b7526f0ca events: Add note to update_event_times comment about holding ctx->lock

Signed-off-by: Eric B Munson
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1308861279-15216-1-git-send-email-emunson@mgebm.net
Signed-off-by: Ingo Molnar

Eric B Munson
2011-07-01 17:06:33 +0800
4ec8363df perf_events: Fix perf buffer watermark setting

Since 2.6.36 (specifically commit d57e34fdd60b ("perf: Simplify the
ring-buffer logic: make perf_buffer_alloc() do everything needed")),
the perf_buffer_init_code() has been mis-setting the buffer watermark
if perf_event_attr.wakeup_events has a non-zero value.

This is because perf_event_attr.wakeup_events is a union with
perf_event_attr.wakeup_watermark.

This commit re-enables the check for perf_event_attr.watermark being
set before continuing with setting a non-default watermark.

This bug is most noticeable when you are trying to use PERF_IOC_REFRESH
with a value larger than one and perf_event_attr.wakeup_events is set to
one. In this case the buffer watermark will be set to 1 and you will
get extraneous POLL_IN overflows rather than POLL_HUP as expected.
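
The union aliasing behind this bug can be reproduced in user space
with ctypes. This is a reduced sketch, not the real perf_event_attr
layout: the Attr structure keeps only the three fields named above,
the watermark flag is modelled as a whole integer, and the default
value and both helper functions are hypothetical.

```python
import ctypes


class WakeupUnion(ctypes.Union):
    # Both members share the same storage, as in perf_event_attr.
    _fields_ = [("wakeup_events", ctypes.c_uint32),
                ("wakeup_watermark", ctypes.c_uint32)]


class Attr(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [("watermark", ctypes.c_uint32),  # flag: union holds a watermark
                ("u", WakeupUnion)]


DEFAULT_WATERMARK = 4096  # hypothetical default, for illustration only


def buggy_watermark(attr: Attr) -> int:
    """Reads the union without checking which member is meant."""
    return attr.wakeup_watermark or DEFAULT_WATERMARK


def fixed_watermark(attr: Attr) -> int:
    """Uses wakeup_watermark only when attr.watermark says it is valid."""
    if attr.watermark and attr.wakeup_watermark:
        return attr.wakeup_watermark
    return DEFAULT_WATERMARK
```

Setting wakeup_events to 1 with watermark left at 0 makes the unchecked
read return a watermark of 1, while the checked version falls back to
the default.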

[ avoid using attr.wakeup_events when attr.watermark is set ]

Signed-off-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106011506390.5384@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2011-07-01 17:06:32 +0800
10e696276 Merge commit 'v3.0-rc5' into perf/core

Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar

Ingo Molnar
2011-07-01 16:28:46 +0800

28 Jun, 2011

1 commit

26c4caea9 taskstats: don't allow duplicate entries in listener mode

Currently a single process may register exit handlers unlimited times.
It may lead to a bloated listeners chain and very slow process
terminations.

E.g. after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs, ~300 Mb of
kernel memory is stolen for the handlers chain and "time id" shows 2-7
seconds instead of the normal 0.003. This makes it possible to exhaust all
kernel memory and to eat much CPU time by triggering numerous exits
on a single CPU.

The patch limits the number of times a single process may register
itself on a single CPU to one.
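
The limit described above amounts to a duplicate check before a
listener is appended. The following is a user-space sketch only: the
class and method names are hypothetical, and a plain dict of lists
stands in for the per-CPU listener chains.

```python
from collections import defaultdict


class ListenerRegistry:
    """Toy model of per-CPU listener chains keyed by CPU number."""

    def __init__(self):
        self.listeners = defaultdict(list)  # cpu -> list of pids

    def register(self, cpu: int, pid: int) -> bool:
        """Add pid to the chain for cpu; return False if already present."""
        chain = self.listeners[cpu]
        if pid in chain:
            return False  # duplicate: the chain does not grow
        chain.append(pid)
        return True
```

However many times the same pid registers on a CPU, the chain for that
CPU holds it once, so its length is bounded by the number of distinct
listeners rather than by the number of requests.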

One small issue is left unfixed: as taskstats_exit() is called before
exit_files() in do_exit(), the orphaned listener entry (if it was not
explicitly deregistered) is kept until some other process's next exit() and
the implicit deregistration in send_cpu_listeners(). So, if a process that
registered itself as a listener exits and the next spawned process gets
the same pid, it would inherit the taskstats attributes.

Signed-off-by: Vasiliy Kulikov
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasiliy Kulikov
2011-06-28 09:00:13 +0800

25 Jun, 2011

1 commit

8abf55883 Merge branch 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kerne…

…l/git/tip/linux-2.6-tip

* 'timer-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rtc: vt8500: Fix build error & cleanup rtc_class_ops->update_irq_enable()
alarmtimers: Return -ENOTSUPP if no RTC device is present
alarmtimers: Handle late rtc module loading

Linus Torvalds
2011-06-25 22:23:59 +0800

22 Jun, 2011

3 commits

1c6b39ad3 alarmtimers: Return -ENOTSUPP if no RTC device is present

Toralf Förster and Richard Weinberger noted that if there is
no RTC device, the alarm timers core prints out an annoying
"ALARM timers will not wake from suspend" message.

This warning has been removed in a previous patch, however
the issue still remains: The original idea was to support
alarm timers even if there was no rtc device, as long as the
system didn't go into suspend.

However, after further consideration, communicating to the application
that alarmtimers are not fully functional seems like the better
solution.

So this patch makes it so we return -ENOTSUPP to any posix _ALARM
clockid calls if there is no backing RTC device on the system.

Further, this changes the behavior so that when there is no rtc device
we will check for one on clock_getres, clock_gettime, timer_create,
and timer_nsleep instead of on suspend.

CC: Toralf Förster
CC: Richard Weinberger
CC: Thomas Gleixner
Reported-by: Toralf Förster
Reported-by: Richard Weinberger
Signed-off-by: John Stultz

John Stultz
2011-06-22 07:32:28 +0800
c008ba58a alarmtimers: Handle late rtc module loading

The alarmtimers code currently picks a rtc device to use at
late init time. However, if your rtc driver is loaded as a module,
it may be registered after the alarmtimers late init code, leaving
the alarmtimers nonfunctional.

This patch moves the rtc device selection to when we actually try
to use it, allowing us to make use of rtc modules that may have been
loaded at any point since bootup.

CC: Thomas Gleixner
CC: Meelis Roos
Reported-by: Meelis Roos
Signed-off-by: John Stultz

John Stultz
2011-06-22 06:38:33 +0800
8440f4b19 PM: Free memory bitmaps if opening /dev/snapshot fails

When opening the /dev/snapshot device, snapshot_open() creates memory
bitmaps which are freed in snapshot_release(). But if any of the
callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open()
fails, snapshot_release() is never called and the bitmaps are not freed.
The next attempt to open /dev/snapshot then triggers the BUG_ON() check in
create_basic_memory_bitmaps(). This happens e.g. when the vmwatchdog module
is active on s390x.
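
The leak and its fix follow a common error-path pattern, sketched here
in user space. Everything in this model is hypothetical: the class
only mimics the open/release pairing described above, a RuntimeError
stands in for BUG_ON(), and the -EPERM return value is chosen for
illustration.

```python
import errno


class SnapshotDevice:
    """Toy model of the bitmap lifetime around open() and release()."""

    def __init__(self):
        self.bitmaps = None

    def create_bitmaps(self):
        if self.bitmaps is not None:
            # Stands in for the BUG_ON() check mentioned above.
            raise RuntimeError("bitmaps already allocated")
        self.bitmaps = object()

    def free_bitmaps(self):
        self.bitmaps = None

    def open(self, notifier_ok: bool) -> int:
        self.create_bitmaps()
        if not notifier_ok:
            # release() will never run for a failed open(), so the
            # failure path has to free the bitmaps itself.
            self.free_bitmaps()
            return -errno.EPERM
        return 0

    def release(self):
        self.free_bitmaps()
```

With the cleanup in the failure path, a failed open() followed by a
second open() succeeds instead of hitting the allocation check.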

Signed-off-by: Michal Kubecek
Signed-off-by: Rafael J. Wysocki
Cc: stable@kernel.org

Michal Kubecek
2011-06-22 05:20:06 +0800

20 Jun, 2011

1 commit

8816ead9d Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', 'timers-urgent…

…-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tools/perf: Fix static build of perf tool
tracing: Fix regression in printk_formats file

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clocksource: Make watchdog robust vs. interruption
timerfd: Fix wakeup of processes when timer is cancelled on clock change

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, MAINTAINERS: Add x86 MCE people
x86, efi: Do not reserve boot services regions within reserved areas

Linus Torvalds
2011-06-20 00:00:18 +0800

19 Jun, 2011

1 commit

357ed6b1a Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kerne…

…l/git/tip/linux-2.6-tip

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes

Linus Torvalds
2011-06-19 23:56:56 +0800

18 Jun, 2011

1 commit

879669961 KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring

____call_usermodehelper() now erases any credentials set by the
subprocess_info::init() function. The problem is that commit
17f60a7da150 ("capabilites: allow the application of capability limits
to usermode helpers") creates and commits new credentials with
prepare_kernel_cred() after the call to the init() function. This wipes
all keyrings after umh_keys_init() is called.

The best way to deal with this is to put the init() call just prior to
the commit_creds() call, and pass the cred pointer to init(). That
means that umh_keys_init() and suchlike can modify the credentials
_before_ they are published and potentially in use by the rest of the
system.

The credential wipe prevents request_key() from working, because it can no
longer pass the session keyring it set up with the authorisation token to
/sbin/request-key, and so the latter can't assume the authority to
instantiate the key. This causes the in-kernel DNS resolver to fail
with ENOKEY unconditionally.

Signed-off-by: David Howells
Acked-by: Eric Paris
Tested-by: Jeff Layton
Signed-off-by: Linus Torvalds

David Howells
2011-06-18 00:40:48 +0800

17 Jun, 2011

3 commits

d8ad7d112 generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

The kdump (2nd) kernel sometimes hangs due to a pending IPI from
the 1st kernel. A kernel panic occurs because the IPI arrives
before call_single_queue is initialized.

To fix the crash, rename init_call_single_data() to call_function_init()
and call it in start_kernel() so that call_single_queue can be
initialized before enabling interrupts.

The details of the crash are:

(1) 2nd kernel boots up

(2) A pending IPI from 1st kernel comes when irqs are first enabled
in start_kernel().

(3) Kernel tries to handle the interrupt, but call_single_queue
is not initialized yet at this point. As a result, in the
generic_smp_call_function_single_interrupt(), NULL pointer
dereference occurs when list_replace_init() tries to access
&q->list.next.

Therefore this patch changes the name of init_call_single_data()
to call_function_init() and calls it before local_irq_enable()
in start_kernel().

Signed-off-by: Takao Indoh
Reviewed-by: WANG Cong
Acked-by: Neil Horman
Acked-by: Vivek Goyal
Acked-by: Peter Zijlstra
Cc: Milton Miller
Cc: Jens Axboe
Cc: Paul E. McKenney
Cc: kexec@lists.infradead.org
Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com
Signed-off-by: Ingo Molnar

Takao Indoh
2011-06-17 16:17:12 +0800
f8b7fc6b5 rcu: Move RCU_BOOST #ifdefs to header file

The commit "use softirq instead of kthreads except when RCU_BOOST=y"
just applied #ifdef in place. This commit is a cleanup that moves
the newly #ifdef'ed code to the header file kernel/rcutree_plugin.h.

Signed-off-by: Paul E. McKenney
Signed-off-by: Paul E. McKenney

Paul E. McKenney
2011-06-17 07:12:05 +0800
b5199515c clocksource: Make watchdog robust vs. interruption

The clocksource watchdog code is interruptible and it has been
observed that this can trigger false positives which disable the TSC.

The reason is that an interrupt storm or a long running interrupt
handler between the read of the watchdog source and the read of the
TSC brings the two far enough apart that the delta is larger than the
unstable threshold. Move both reads into a short interrupt disabled
region to avoid that.

Reported-and-tested-by: Vernon Mauery
Signed-off-by: Thomas Gleixner
Cc: stable@kernel.org

Thomas Gleixner
2011-06-17 01:30:53 +0800

16 Jun, 2011

5 commits

b4f9f2b64 Merge commit 'v3.0-rc3' into perf/core

Merge reason: add the latest fixes.

Signed-off-by: Ingo Molnar

Ingo Molnar
2011-06-16 19:23:22 +0800
a46e0899e rcu: use softirq instead of kthreads except when RCU_BOOST=y

This patch #ifdefs RCU kthreads out of the kernel unless RCU_BOOST=y,
thus eliminating context-switch overhead if RCU priority boosting has
not been configured.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2011-06-16 14:07:21 +0800
a1b6ae8ed Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kern…

…el/git/tip/linux-2.6-tip

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Check if lowest_mask is initialized in find_lowest_rq()
sched: Fix need_resched() when checking peempt

Linus Torvalds
2011-06-16 12:45:18 +0800
d2c322587 gcov: disable CONFIG_CONSTRUCTORS when not needed by CONFIG_GCOV_KERNEL

CONFIG_CONSTRUCTORS controls support for running constructor functions at
kernel init time. According to commit b99b87f70c7785ab ("kernel:
constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this. However,
CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it,
and CONFIG_GCOV_KERNEL depends on it. Instead, default it to n and have
CONFIG_GCOV_KERNEL select it, so that the normal case of
CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n.

Observed in the short list of =y values in a minimal kernel configuration.

Signed-off-by: Josh Triplett
Acked-by: WANG Cong
Acked-by: Peter Oberparleiter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2011-06-16 11:04:01 +0800
733eda7ac memcg: clear mm->owner when last possible owner leaves

The following crash was reported:

> Call Trace:
> [] mem_cgroup_from_task+0x15/0x17
> [] __mem_cgroup_try_charge+0x148/0x4b4
> [] ? need_resched+0x23/0x2d
> [] ? preempt_schedule+0x46/0x4f
> [] mem_cgroup_charge_common+0x9a/0xce
> [] mem_cgroup_newpage_charge+0x5d/0x5f
> [] khugepaged+0x5da/0xfaf
> [] ? __init_waitqueue_head+0x4b/0x4b
> [] ? add_mm_counter.constprop.5+0x13/0x13
> [] kthread+0xa8/0xb0
> [] ? sub_preempt_count+0xa1/0xb4
> [] kernel_thread_helper+0x4/0x10
> [] ? retint_restore_args+0x13/0x13
> [] ? __init_kthread_worker+0x5a/0x5a

What happens is that khugepaged tries to charge a huge page against an mm
whose last possible owner has already exited, and the memory controller
crashes when the stale mm->owner is used to look up the cgroup to charge.

mm->owner has never been set to NULL with the last owner going away, but
nobody cared until khugepaged came along.

Even then it wasn't a problem because the final mmput() on an mm was
forced to acquire and release mmap_sem in write-mode, preventing an
exiting owner from going away while the mmap_sem was held, and until "692e0b3
mm: thp: optimize memcg charge in khugepaged", the memory cgroup charge
was protected by mmap_sem in read-mode.

Instead of going back to relying on the mmap_sem to enforce lifetime of a
task, this patch ensures that mm->owner is properly set to NULL when the
last possible owner is exiting, which the memory controller can handle
just fine.

[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Hugh Dickins
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Johannes Weiner
Reported-by: Hugh Dickins
Reported-by: Dave Jones
Reviewed-by: Andrea Arcangeli
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2011-06-16 11:04:01 +0800

15 Jun, 2011

2 commits

0da938c44 sched: Check if lowest_mask is initialized in find_lowest_rq()

On system boot up, the lowest_mask is initialized with an
early_initcall(). But RT tasks may wake up on other
early_initcall() callers before the lowest_mask is initialized,
causing a system crash.

Commit "d72bce0e67 rcu: Cure load woes" was the first commit
to wake up RT tasks in early init. Before this commit this bug
should not happen.

Reported-by: Andrew Theurer
Tested-by: Andrew Theurer
Tested-by: Paul E. McKenney
Signed-off-by: Steven Rostedt
Acked-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20110614223657.824872966@goodmis.org
Signed-off-by: Ingo Molnar

Steven Rostedt
2011-06-15 17:44:48 +0800
8dd0de8be sched: Fix need_resched() when checking peempt

The RT preempt check tests the wrong task if NEED_RESCHED is
set. It currently checks the local CPU task. It is supposed to
check the task that is running on the runqueue we are about to
wake another task on.

Signed-off-by: Hillf Danton
Reviewed-by: Yong Zhang
Signed-off-by: Steven Rostedt
Link: http://lkml.kernel.org/r/20110614223657.450239027@goodmis.org
Signed-off-by: Ingo Molnar

Hillf Danton
2011-06-15 15:50:32 +0800