21 Oct, 2010

1 commit

  • With the addition of trace_softirq_raise() the softirq tracepoint got
    even more convoluted. Why the tracepoints take two pointers to assign
    an integer is beyond my comprehension.

    But adding an extra case which treats the first pointer as an unsigned
    long when the second pointer is NULL, including the back-and-forth
    type casting, is just horrible.

    Convert the softirq tracepoints to take a single unsigned int argument
    for the softirq vector number and fix the call sites (a before/after
    sketch follows this entry).

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Acked-by: Peter Zijlstra
    Acked-by: mathieu.desnoyers@efficios.com
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt

    Thomas Gleixner
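    A before/after sketch of such a call site (illustrative only, not the
    exact upstream diff; 'nr' stands for the softirq vector number at the
    raise site):

        /* before: the vector number is shoehorned through the pointer args */
        trace_softirq_raise((struct softirq_action *)(unsigned long)nr, NULL);

        /* after: the tracepoint takes the vector number directly */
        trace_softirq_raise(nr);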
     

20 Oct, 2010

1 commit


19 Oct, 2010

11 commits

  • The function start_func_tracer() was incorrectly added in the
    #ifdef CONFIG_FUNCTION_TRACER condition, but is still used even
    when function tracing is not enabled.

    The calls to register_ftrace_function() and register_ftrace_graph()
    become nops (and their arguments are even ignored), thus there is
    no reason to hide start_func_tracer() when function tracing is
    not enabled (a layout sketch follows this entry).

    Reported-by: Ingo Molnar
    Signed-off-by: Steven Rostedt

    Steven Rostedt
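    A rough sketch of the resulting layout, assuming the usual structure of
    kernel/trace/trace_irqsoff.c; trace_ops and the two graph callbacks
    stand in for the tracer's own hooks and are named illustratively:

        #ifdef CONFIG_FUNCTION_TRACER
        /* the actual function-tracer hook stays under the #ifdef */
        static void irqsoff_tracer_call(unsigned long ip, unsigned long parent_ip)
        {
                /* record the function call for the latency trace */
        }
        #endif /* CONFIG_FUNCTION_TRACER */

        /* outside the #ifdef: safe because both register_* calls degrade
         * to no-ops when function tracing is not configured */
        static int start_func_tracer(int graph)
        {
                if (!graph)
                        return register_ftrace_function(&trace_ops);

                return register_ftrace_graph(&irqsoff_graph_return,
                                             &irqsoff_graph_entry);
        }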
     
  • Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Trades a call + conditional + ret for an unconditional jmp.

    Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
    While there are still only a few users around, rename things to make
    them more consistent.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
    hw_breakpoint creation needs to do per-task accounting to ensure there
    are always sufficient hardware resources to back these things, due to
    ptrace.

    With the perf per pmu context changes the event initialization no
    longer has access to the event context, for the simple reason that we
    need to first find the pmu (result of initialization) before we can
    find the context.

    This makes hw_breakpoints unhappy, because they can no longer do per-task
    accounting. Cure this by frobbing a task pointer into the event::hw bits
    for now (see the sketch after this entry)...

    Signed-off-by: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
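    The gist, sketched as a data-structure change (the struct and field
    names below are illustrative, not necessarily what ended up upstream):

        /* stash the target task in the event's hardware state at init time,
         * so the breakpoint constraint accounting can still be done per task
         * even though no context has been located yet */
        struct hw_perf_event_sketch {
                /* ... the usual counter configuration ... */
                struct task_struct *bp_target;  /* task to account the bp against */
        };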
     
    Pass the task pointer to the event allocation, so that we can use
    task-associated data during event initialization.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Currently it looks like find_lively_task_by_vpid() takes a task ref
    and relies on find_get_context() to drop it.

    The problem is that perf_event_create_kernel_counter() shouldn't be
    dropping task refs.

    Signed-off-by: Peter Zijlstra
    Acked-by: Frederic Weisbecker
    Acked-by: Matt Helsley
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Matt found we trigger the WARN_ON_ONCE() in perf_group_attach() when we take
    the move_group path in perf_event_open().

    Since we cannot de-construct the group (we rely on it to move the events), we
    have to simply ignore the double attach; the group state is context invariant
    and doesn't need changing (a sketch of the resulting check follows this entry).

    Reported-by: Matt Fleming
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
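    The shape of the resulting check, as described above (a sketch, not the
    exact upstream diff):

        static void perf_group_attach(struct perf_event *event)
        {
                /* already attached: this is the move_group double attach,
                 * which is harmless because group state is context invariant */
                if (event->attach_state & PERF_ATTACH_GROUP)
                        return;

                event->attach_state |= PERF_ATTACH_GROUP;
                /* ... link the event into its leader's sibling list ... */
        }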
     
    Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like waking up a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this make do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work (a usage sketch follows this entry).

    Signed-off-by: Peter Zijlstra
    Acked-by: Kyle McMartin
    Acked-by: Martin Schwidefsky
    [ various fixes ]
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
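    A minimal usage sketch of the API this describes; the buffer-drain
    scenario and the names drain_task/drain_work/buffer_full_from_nmi are
    made up for illustration:

        #include <linux/irq_work.h>
        #include <linux/sched.h>

        static struct task_struct *drain_task;  /* consumer to wake up */
        static struct irq_work drain_work;

        static void drain_wakeup(struct irq_work *work)
        {
                /* runs later, in IRQ context, where waking a task is legal */
                wake_up_process(drain_task);
        }

        static void buffer_full_from_nmi(void)
        {
                /* NMI context: cannot wake the task directly, so defer it */
                irq_work_queue(&drain_work);
        }

        static int __init drain_init(void)
        {
                init_irq_work(&drain_work, drain_wakeup);
                return 0;
        }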
     
  • The group_sched_in() function uses a transactional approach to schedule
    a group of events. In a group, either all events can be scheduled or
    none are. To schedule each event in, the function calls event_sched_in().
    In case of error, event_sched_out() is called on each event in the group.

    The problem is that event_sched_out() does not completely cancel the
    effects of event_sched_in(). Furthermore, event_sched_out() changes the
    state of the event as if it had run, which is not true in this particular
    case.

    Those inconsistencies impact time tracking fields and may lead to events
    in a group not all reporting the same time_enabled and time_running values.
    This is demonstrated with the example below:

    $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
    1946101 unhalted_core_cycles (32.85% scaling, ena=829181, run=556827)
    11423 baclears (32.85% scaling, ena=829181, run=556827)
    7671 baclears (0.00% scaling, ena=556827, run=556827)

    2250443 unhalted_core_cycles (57.83% scaling, ena=962822, run=405995)
    11705 baclears (57.83% scaling, ena=962822, run=405995)
    11705 baclears (57.83% scaling, ena=962822, run=405995)

    Notice that in the first group, the last baclears event does not
    report the same timings as its siblings.

    This issue comes from the fact that tstamp_stopped is updated
    by event_sched_out() as if the event had actually run.

    To solve the issue, we must ensure that, in case of error, there is
    no change in the event state whatsoever. That means timings must
    remain as they were when entering group_sched_in().

    To do this we defer updating tstamp_running until we know the
    transaction succeeded. Therefore, we have split event_sched_in()
    into two parts, separating out the update to tstamp_running.

    Similarly, in case of error, we do not want to update tstamp_stopped.
    Therefore, we have split event_sched_out() into two parts, separating
    out the update to tstamp_stopped.

    With this patch, we now get the following output:

    $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
    2492050 unhalted_core_cycles (71.75% scaling, ena=1093330, run=308841)
    11243 baclears (71.75% scaling, ena=1093330, run=308841)
    11243 baclears (71.75% scaling, ena=1093330, run=308841)

    1852746 unhalted_core_cycles (0.00% scaling, ena=784489, run=784489)
    9253 baclears (0.00% scaling, ena=784489, run=784489)
    9253 baclears (0.00% scaling, ena=784489, run=784489)

    Note that the uneven timing between groups is a side effect of
    the process spending most of its time sleeping, i.e., there are not
    enough event rotations (but that's a separate issue). A schematic of
    the transactional scheme follows this entry.

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
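    A schematic of the transaction described above (heavily simplified;
    argument lists are trimmed and the helper call shapes are illustrative):

        static int group_sched_in_sketch(struct perf_event *leader)
        {
                struct perf_event *event, *partial = NULL;

                if (event_sched_in(leader))
                        return -EAGAIN;

                list_for_each_entry(event, &leader->sibling_list, group_entry) {
                        if (event_sched_in(event)) {   /* must not touch tstamp_running yet */
                                partial = event;
                                goto rollback;
                        }
                }
                /* whole group made it in: only now commit tstamp_running updates */
                return 0;

        rollback:
                /* undo the members that were scheduled, without pretending they
                 * ran, i.e. leave tstamp_stopped alone */
                list_for_each_entry(event, &leader->sibling_list, group_entry) {
                        if (event == partial)
                                break;
                        event_sched_out(event);
                }
                event_sched_out(leader);
                return -EAGAIN;
        }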
     
  • You can only call update_context_time() when the context
    is active, i.e., the thread it is attached to is still running.

    However, perf_event_read() can be called even when the context
    is inactive, e.g., when user space read()s the counters. The call to
    update_context_time() must be conditioned on the status of
    the context, otherwise bogus time_enabled and time_running values may
    be returned. Here is an example on AMD64. The task program
    is an example from libpfm4. The -p option prints deltas every 1s.

    $ task -p -e cpu_clk_unhalted sleep 5
    2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)

    Whereas if you don't read deltas, e.g., no call to perf_event_read() until
    the process terminates:

    $ task -e cpu_clk_unhalted sleep 5
    2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)

    Notice that time_enabled and time_running are bogus in the first example,
    causing bogus scaling.

    This patch fixes the problem by conditionally calling update_context_time()
    in perf_event_read() (the gist is sketched after this entry).

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
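    The gist of the fix, sketched below; this is simplified from what
    perf_event_read() actually does, and the wrapper name is illustrative:

        static u64 read_event_value_sketch(struct perf_event *event)
        {
                struct perf_event_context *ctx = event->ctx;
                unsigned long flags;

                raw_spin_lock_irqsave(&ctx->lock, flags);
                /* only fold in elapsed time if the attached thread is running */
                if (ctx->is_active)
                        update_context_time(ctx);
                update_event_times(event);
                raw_spin_unlock_irqrestore(&ctx->lock, flags);

                return perf_event_count(event);
        }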
     

18 Oct, 2010

7 commits

  • Even though the parent is recorded with the normal function tracing
    of the latency tracers (irqsoff and wakeup), the function graph
    recording is bogus.

    This is due to the function graph tracer messing with the return stack.
    The latency tracers pass CALLER_ADDR0 in as the parent, which
    works fine for plain function tracing. But this causes bogus output
    with the graph tracer:

    3) -0 | d.s3. 0.000 us | return_to_handler();
    3) -0 | d.s3. 0.000 us | _raw_spin_unlock_irqrestore();
    3) -0 | d.s3. 0.000 us | return_to_handler();
    3) -0 | d.s3. 0.000 us | trace_hardirqs_on();

    The "return_to_handler()" call is the trampoline of the
    function graph tracer, and is meaningless in this context.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
    The preempt and irqsoff tracers have three types of function tracers:
    the normal function tracer, function graph entry, and function graph return.
    Each of these uses a complex dance to prevent recursion and to decide
    whether or not to trace the data (depending on whether interrupts are enabled).

    This patch moves the duplicated code into a single routine, to
    prevent future mistakes when modifying such complex code (one possible
    shape of the shared routine is sketched after this entry).

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
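    One possible shape for such a shared routine (a sketch only; the real
    helper in kernel/trace/trace_irqsoff.c differs in its exact checks):

        static int func_prolog_sketch(struct trace_array *tr,
                                      struct trace_array_cpu **data,
                                      unsigned long *flags)
        {
                long disabled;
                int cpu;

                /* only trace inside the latency-critical (irqs-off) section */
                if (!irqs_disabled())
                        return 0;

                cpu = raw_smp_processor_id();
                *data = tr->data[cpu];
                disabled = atomic_inc_return(&(*data)->disabled);

                /* nesting/recursion guard: trace only the outermost entry */
                if (likely(disabled == 1)) {
                        local_save_flags(*flags);
                        return 1;
                }

                atomic_dec(&(*data)->disabled);
                return 0;
        }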
     
    The wakeup tracer has three types of function tracers: the normal
    function tracer, function graph entry, and function graph return.
    Each of these uses a complex dance to prevent recursion and to decide
    whether or not to trace the data (depending on the wake_task variable).

    This patch moves the duplicated code into a single routine, to
    prevent future mistakes when modifying such complex code.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add function graph support for wakeup latency tracer.
    The graph output is enabled by setting the 'display-graph'
    trace option.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
    Move the trace_graph_function() and print_graph_headers_flags() functions
    to trace_functions_graph.c so that they are globally available.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
    check_irq_entry() and check_irq_return() can be called
    from graph event context. In that case there is no graph
    private data allocated. Add checks to handle this case.

    Signed-off-by: Jiri Olsa
    LKML-Reference:

    [ Fixed some grammar in the comments ]

    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
    Remove an unnecessary cast from void * in an assignment (see the example
    after this entry).

    Signed-off-by: matt mooney
    Signed-off-by: Steven Rostedt

    matt mooney
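    An illustration with a hypothetical type: when the source has type
    void *, the cast in the first assignment is redundant in C.

        struct my_entry { int val; };

        static void example(void *data)
        {
                struct my_entry *field;

                field = (struct my_entry *)data;   /* before: unnecessary cast */
                field = data;                      /* after: void * converts implicitly */
                (void)field;
        }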
     

15 Oct, 2010

5 commits


14 Oct, 2010

1 commit


13 Oct, 2010

1 commit

  • Fix

    kernel/trace/trace_functions_graph.c: In function ‘trace_print_graph_duration’:
    kernel/trace/trace_functions_graph.c:652: warning: comparison of distinct pointer types lacks a cast

    when building 36-rc6 on a 32-bit machine, due to the strict type check
    failing in the min() macro (the usual idiom for this class of warning is
    sketched after this entry).

    Signed-off-by: Borislav Petkov
    Cc: Chase Douglas
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Borislav Petkov
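    The warning comes from min()'s strict type check; the usual idiom for
    silencing it is min_t() with an explicit common type. A generic example
    follows (illustrative only, not the exact change made in this commit):

        #include <linux/kernel.h>

        static size_t pick_len(size_t want, unsigned long avail)
        {
                /*
                 * min(want, avail) trips the type check wherever size_t and
                 * unsigned long are distinct types (e.g. some 32-bit builds);
                 * min_t() forces both arguments to one explicit type.
                 */
                return min_t(size_t, want, avail);
        }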
     

12 Oct, 2010

1 commit


11 Oct, 2010

1 commit

    Introduce a perf_pmu_name() helper function that returns the name of the
    PMU. This gives us a generic way to get the name of a PMU regardless of
    how an architecture identifies it internally (a sketch of the weak default
    follows this entry).

    Signed-off-by: Matt Fleming
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mundt
    Signed-off-by: Robert Richter

    Matt Fleming
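    A plausible shape for such a helper: a weak default that architecture
    code can override with its own PMU name (a sketch; the fallback string
    is illustrative):

        const char * __weak perf_pmu_name(void)
        {
                return "cpu";   /* generic fallback when the arch doesn't override */
        }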
     

08 Oct, 2010

1 commit


06 Oct, 2010

2 commits

    Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    rcu: rcu_read_lock_bh_held(): disabling irqs also disables bh
    generic-ipi: Fix deadlock in __smp_call_function_single

    Linus Torvalds
     
  • With all the recent module loading cleanups, we've minimized the code
    that sits under module_mutex, fixing various deadlocks and making it
    possible to do most of the module loading in parallel.

    However, that whole conversion totally missed the rather obscure code
    that adds a new module to the list for BUG() handling. That code was
    doubly obscure because (a) the code itself lives in lib/bugs.c (for
    dubious reasons) and (b) it gets called from the architecture-specific
    "module_finalize()" rather than from generic code.

    Calling it from arch-specific code makes no sense what-so-ever to begin
    with, and is now actively wrong since that code isn't protected by the
    module loading lock any more.

    So this commit moves the "module_bug_{finalize,cleanup}()" calls away
    from the arch-specific code, and into the generic code - and in the
    process protects it with the module_mutex so that the list operations
    are now safe.

    Future fixups:
    - move the module list handling code into kernel/module.c where it
    belongs.
    - get rid of 'module_bug_list' and just use the regular list of modules
    (called 'modules' - imagine that) that we already create and maintain
    for other reasons.

    Reported-and-tested-by: Thomas Gleixner
    Cc: Rusty Russell
    Cc: Adrian Bunk
    Cc: Andrew Morton
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Oct, 2010

1 commit

    This patch fixes an error in perf_event_open() when the pid
    provided by the user is invalid. find_lively_task_by_vpid()
    does not return NULL on error but an error code. Without the
    fix, the error code was silently passed to find_get_context(),
    which would eventually cause an invalid pointer dereference
    (see the sketch after this entry).

    Signed-off-by: Stephane Eranian
    Cc: peterz@infradead.org
    Cc: paulus@samba.org
    Cc: davem@davemloft.net
    Cc: fweisbec@gmail.com
    Cc: perfmon2-devel@lists.sf.net
    Cc: eranian@gmail.com
    Cc: robert.richter@amd.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
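    The shape of the fix, as a sketch; the wrapper below is illustrative,
    the real check sits inline in perf_event_open():

        static struct task_struct *lookup_target_task(pid_t vpid, int *err)
        {
                struct task_struct *task;

                task = find_lively_task_by_vpid(vpid);
                if (IS_ERR(task)) {             /* error code, not NULL, on failure */
                        *err = PTR_ERR(task);
                        return NULL;
                }
                return task;
        }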
     

02 Oct, 2010

1 commit

  • The kfifo_dma family of functions use sg_mark_end() on the last element in
    their scatterlist. This forces use of a fresh scatterlist for each DMA
    operation, which makes recycling a single scatterlist impossible.

    Change the behavior of the kfifo_dma functions to match the usage of the
    dma_map_sg function. This means that users must respect the returned
    nents value. The sample code is updated to reflect the change.

    This bug is trivial to cause: call kfifo_dma_in_prepare() such that it
    prepares a scatterlist with a single entry comprising the whole fifo.
    This is the case when you map the entirety of a newly created empty fifo.
    This causes the setup_sgl() function to mark the first scatterlist entry
    as the end of the chain, no matter what comes after it.

    Afterwards, add and remove some data from the fifo such that another call
    to kfifo_dma_in_prepare() will create two scatterlist entries. It returns
    nents=2. However, due to the previous sg_mark_end() call, sg_is_last()
    will now return true for the first scatterlist element. This causes the
    sample code to print a single scatterlist element when it should print
    two.

    By removing the call to sg_mark_end(), we make the API as similar as
    possible to the DMA mapping API. All users are required to respect the
    returned nents (a usage sketch follows this entry).

    Signed-off-by: Ira W. Snyder
    Cc: Stefani Seibold
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ira W. Snyder
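    A usage sketch under the new semantics: the returned nents is
    authoritative, just as with dma_map_sg(). The driver helper called in
    the loop is hypothetical:

        static void queue_fifo_dma(struct kfifo *fifo, unsigned int len)
        {
                struct scatterlist sg[2];
                unsigned int nents, i;

                nents = kfifo_dma_in_prepare(fifo, sg, ARRAY_SIZE(sg), len);
                for (i = 0; i < nents; i++) {
                        /* program one descriptor per entry; do not rely on
                         * sg_is_last(), only on the returned count */
                        start_one_transfer(&sg[i]);     /* hypothetical helper */
                }
        }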
     

24 Sep, 2010

1 commit


23 Sep, 2010

5 commits

  • The below bug in fork led to the rmap walk finding the parent huge-pmd
    twice instead of just once, because the anon_vma_chain objects of the
    child vma still point to the vma->vm_mm of the parent.

    The patch fixes it by making the rmap walk accurate during fork. It's not
    a big deal normally but it's worth being accurate considering the cost is
    the same.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Johannes Weiner
    Acked-by: Rik van Riel
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Make use of the jump label infrastructure for tracepoints.

    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     
    Add a jump_label_text_reserved(void *start, void *end) helper, so that other
    pieces of code that want to modify kernel text can first verify that
    jump label has not reserved the instruction.

    Acked-by: Masami Hiramatsu
    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     
  • Initialize the workqueue data structures *before* they are registered
    so that they are ready for callbacks.

    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     
    Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
    assembly gcc mechanism, we can now branch to labels from an 'asm goto'
    statement. This allows us to create a 'no-op' fast path, which can subsequently
    be patched with a jump to the slow path code. This is useful for code which
    might be rarely used, but which we'd like to be able to call if needed.
    Tracepoints are the current use case these are being implemented for (a
    generic illustration of 'asm goto' follows this entry).

    Acked-by: David S. Miller
    Signed-off-by: Jason Baron
    LKML-Reference:

    [ cleaned up some formatting ]

    Signed-off-by: Steven Rostedt

    Jason Baron
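    A generic illustration of the underlying gcc mechanism (not the kernel's
    actual jump-label macros): 'asm goto' lets inline assembly hand control
    to a C label, so the fast path can be a single nop that runtime patching
    later turns into a jump to the slow path.

        static int tracing_enabled_sketch(void)
        {
                /*
                 * The nop is the patchable fast path. A real implementation
                 * would also record its address and the target label in a
                 * special section so the patcher can find them; omitted here.
                 */
                asm goto("1: nop" : : : : do_trace);
                return 0;               /* default: tracing disabled */

        do_trace:
                return 1;               /* patched-in slow path */
        }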