13 Mar, 2010
2 commits
-
For consistency, use the newt API more fully.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
These are keys people expect, when pressed, to exit the current
widget, so associate all of them with this semantic.
Suggested-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
12 Mar, 2010
8 commits
-
Newt has widespread availability and provides a rather simple
API, as can be seen from the size of this patch.
The work needed to support it will benefit other frontends too.
In this initial patch it just checks if the output is a tty; if
not, it falls back to the previous behaviour. Likewise, if
newt-devel/libnewt-dev is not installed, the previous behaviour
is maintained.
Pressing enter on a symbol will annotate it; ESC in the
annotation window will return to the report symbol list.
More work will be done to remove the special casing in
color_fprintf, stop using fmemopen/FILE in the printing of
hist_entries, etc.
Also, the annotation doesn't need to be done by spawning "perf
annotate" and then browsing its output; we can do better by
calling the builtin-annotate.c functions directly, which would
then be moved to tools/perf/util/annotate.c and shared with perf
top, etc.
But let's go by baby steps. This patch already improves perf
usability by allowing users to quickly annotate symbols from
the report screen, and it provides a first experimentation with
libnewt/TUI integration of tools.
Tested on RHEL5 and Fedora12 x86_64 and on Debian PARISC64,
browsing a perf.data file collected on a Fedora12 x86_64 box.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Avi Kivity
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
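The tty check and the build-time fallback are simple enough to sketch in plain C; the helper name and the NO_NEWT_SUPPORT define below are illustrative, not necessarily the patch's actual symbols:

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Decide whether the newt TUI can be used at all. */
static bool use_browser(void)
{
        /* Output is not a tty (e.g. piped to a file or another
         * tool): keep the previous, plain stdio behaviour. */
        if (!isatty(fileno(stdout)))
                return false;
#ifdef NO_NEWT_SUPPORT
        /* Built without newt-devel/libnewt-dev: the previous
         * behaviour is likewise maintained. */
        return false;
#else
        return true;
#endif
}
-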
We need those to properly size the browser width in the newt
TUI.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Just like we do for pr_debug, so that we have a single point
from which to redirect to the currently used output system, be
it stdio or newt.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Will be used by the newt code too.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Merge reason: We want to queue up a dependent patch.
Signed-off-by: Ingo Molnar
-
[acme@mica linux-2.6-tip]$ perf record -a -f
Fatal: Permission error - are you root?
Consider tweaking /proc/sys/kernel/perf_event_paranoid.
[acme@mica linux-2.6-tip]$
Suggested-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
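Where the hint comes from is easy to sketch in plain C; only the /proc path is taken from the message above, the helper name is hypothetical:

#include <stdio.h>

/* Read the current paranoia level so the fatal error message can
 * suggest tweaking it; returns -1 if the file cannot be read. */
static int read_perf_event_paranoid(void)
{
        int level = -1;
        FILE *f = fopen("/proc/sys/kernel/perf_event_paranoid", "r");

        if (f) {
                if (fscanf(f, "%d", &level) != 1)
                        level = -1;
                fclose(f);
        }
        return level;
}
-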
Fixing this symptom:
[acme@mica linux-2.6-tip]$ perf record -a -f
Fatal: Permission error - are you root?
Bus error
[acme@mica linux-2.6-tip]$
I.e. if for some reason no data is collected (in this case, a
non-root user trying to do system-wide profiling), we end up
trying to mmap a zero-sized file and access the file header,
b00m.
Reported-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
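A minimal sketch of the guard, assuming a hypothetical map_perf_data() wrapper rather than the actual builtin-report.c code:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Refuse to mmap an empty perf.data file instead of faulting
 * later when the (nonexistent) file header is dereferenced. */
static void *map_perf_data(const char *path, size_t *sizep)
{
        struct stat st;
        void *map;
        int fd = open(path, O_RDONLY);

        if (fd < 0)
                return NULL;
        if (fstat(fd, &st) < 0 || st.st_size == 0) {
                close(fd);      /* zero-sized: no events were recorded */
                return NULL;
        }
        map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);
        if (map == MAP_FAILED)
                return NULL;
        *sizep = st.st_size;
        return map;
}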
11 Mar, 2010
8 commits
-
This patch is an optimization in perf_event_task_sched_in() to avoid
scheduling the events twice in a row.
Without it, the perf_disable()/perf_enable() pair is invoked twice,
so pinned events get scheduled again while the flexible events are
being scheduled, and we go through hw_perf_enable() twice.
By encapsulating the whole sequence in perf_disable()/perf_enable(),
we ensure hw_perf_enable() is going to be invoked only once, because
of the refcount protection.
Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
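The shape of the change can be sketched as follows; a simplified illustration with argument lists trimmed, not the exact diff (EVENT_PINNED/EVENT_FLEXIBLE follow the kernel naming of that era):

/* Wrap both scheduling passes in one disable/enable section: the
 * refcount inside perf_disable()/perf_enable() then collapses the
 * two hardware re-enables into a single hw_perf_enable() call at
 * the outermost perf_enable(). */
perf_disable();

ctx_sched_in(ctx, cpuctx, EVENT_PINNED);    /* pinned events first    */
ctx_sched_in(ctx, cpuctx, EVENT_FLEXIBLE);  /* then the flexible ones */

perf_enable();
-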
Export perf_trace_regs and perf_arch_fetch_caller_regs, since
modules will use these.
Signed-off-by: Xiao Guangrong
[ use EXPORT_PER_CPU_SYMBOL_GPL() ]
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
What happens is that we schedule badly like:
-1987 [019] 280.252808: x86_pmu_start: event-46/1300c0: idx: 0
-1987 [019] 280.252811: x86_pmu_start: event-47/1300c0: idx: 1
-1987 [019] 280.252812: x86_pmu_start: event-48/1300c0: idx: 2
-1987 [019] 280.252813: x86_pmu_start: event-49/1300c0: idx: 3
-1987 [019] 280.252814: x86_pmu_start: event-50/1300c0: idx: 32
-1987 [019] 280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
-1987 [019] 280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
-1987 [019] 280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
-1987 [019] 280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
-1987 [019] 280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
-1987 [019] 280.252834: x86_pmu_start: event-47/1300c0: idx: 1
-1987 [019] 280.252834: x86_pmu_start: event-48/1300c0: idx: 2
-1987 [019] 280.252835: x86_pmu_start: event-49/1300c0: idx: 3
-1987 [019] 280.252836: x86_pmu_start: event-50/1300c0: idx: 32
-1987 [019] 280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*
This happens because we only iterate the n_running events in the first
pass, and reset their index to -1 if they don't match, to force a
re-assignment.
Now, in our RR example, n_running == 0 because we fully unscheduled, so
event-50 will retain its idx==32, even though in scheduling it will have
gotten idx=0, and we don't trigger the re-assign path.
The easiest way to fix this is the below patch, which simply validates
the full assignment in the second pass.
Reported-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
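The validation might look roughly like the sketch below; field and array names follow the x86 perf_event code of the time, but this is a paraphrase, not the verbatim patch:

/* Second pass: force a re-assignment whenever *any* event's cached
 * hardware index disagrees with the freshly computed one, instead
 * of only checking the previously running events. */
for (i = 0; i < n; i++) {
        struct hw_perf_event *hwc = &cpuc->event_list[i]->hw;

        if (hwc->idx != assign[i])
                hwc->idx = -1;  /* stale: will be re-programmed on start */
}
-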
Fix:
arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'
Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Without this change, the install path is relative to
prefix/DESTDIR where prefix is automatically set to $HOME.
This can produce unexpected results. For example:
make -C tools/perf DESTDIR=/home/jkacur/tmp install-man
creates the directory: /home/jkacur/home/jkacur/tmp/share/...
instead of the expected: /home/jkacur/tmp/share/...
Signed-off-by: John Kacur
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Cc: Tom Zanussi
Cc: Kyle McMartin
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
-
From: Ananth N Mavinakayanahalli
When freeing the instruction slot, the arithmetic to calculate
the index of the slot in the page needs to account for the total
size of the instruction on the various architectures.
Calculate the index correctly when freeing the out-of-line
execution slot.
Reported-by: Sachin Sant
Reported-by: Heiko Carstens
Signed-off-by: Ananth N Mavinakayanahalli
Signed-off-by: Masami Hiramatsu
LKML-Reference:
Signed-off-by: Ingo Molnar
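The arithmetic in question looks roughly like this (a paraphrase of the fix; kip->insns and c->insn_size follow kernel/kprobes.c naming as I recall it, not a verbatim quote):

/* A slot's stride is c->insn_size opcode units of
 * sizeof(kprobe_opcode_t) bytes each, and this varies by
 * architecture; divide the byte offset by the full stride to
 * recover the slot index within the page. */
long idx = ((long)slot - (long)kip->insns) /
           (c->insn_size * sizeof(kprobe_opcode_t));
-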
At present, the perf subcommands that do system-wide monitoring
(perf stat, perf record and perf top) don't work properly unless
the online cpus are numbered 0, 1, ..., N-1. These tools ask
for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
and then try to create events for cpus 0, 1, ..., N-1.
This creates problems for systems where the online cpus are
numbered sparsely. For example, a POWER6 system in
single-threaded mode (i.e. only running 1 hardware thread per
core) will have only even-numbered cpus online.
This fixes the problem by reading the /sys/devices/system/cpu/online
file to find out which cpus are online. The code that does that is in
tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
function that sets up a cpumap[] array and returns the number of
online cpus. If /sys/devices/system/cpu/online can't be read or
can't be parsed successfully, it falls back to using sysconf to
ask how many cpus are online and sets up an identity map in cpumap[].
The perf record, perf stat and perf top code then calls
read_cpu_map() in the system-wide monitoring case (instead of
sysconf) and uses cpumap[] to get the cpu numbers to pass to
perf_event_open.
Signed-off-by: Paul Mackerras
Cc: Anton Blanchard
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
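The parsing can be sketched in plain C as below: a simplified stand-in for read_cpu_map() that handles the "N-M" range and comma-separated list syntax of the sysfs file (error handling trimmed, MAX_NR_CPUS illustrative):

#include <stdio.h>

#define MAX_NR_CPUS 4096

static int cpumap[MAX_NR_CPUS];

/* Parse "0-3,6,8-11" style contents of
 * /sys/devices/system/cpu/online into cpumap[]; returns the number
 * of online cpus, or -1 so the caller can fall back to sysconf(). */
static int read_cpu_map(void)
{
        FILE *f = fopen("/sys/devices/system/cpu/online", "r");
        int nr = 0, start, end;

        if (!f)
                return -1;

        while (fscanf(f, "%d", &start) == 1) {
                int c = fgetc(f);

                end = start;
                if (c == '-') {
                        if (fscanf(f, "%d", &end) != 1)
                                break;
                        c = fgetc(f);
                }
                while (start <= end && nr < MAX_NR_CPUS)
                        cpumap[nr++] = start++;
                if (c != ',')
                        break;
        }
        fclose(f);
        return nr;
}
-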
Anton Blanchard found that he could reliably make the kernel hit a
BUG_ON in the slab allocator by taking a cpu offline and then online
while a system-wide perf record session was running.
The reason is that when the cpu comes up, we completely reinitialize
the ctx field of the struct perf_cpu_context for the cpu. If there is
a system-wide perf record session running, then there will be a struct
perf_event that has a reference to the context, so its refcount will
be 2. (The perf_event has been removed from the context's group_entry
and event_entry lists by perf_event_exit_cpu(), but that doesn't
remove the perf_event's reference to the context and doesn't decrement
the context's refcount.)
When the cpu comes up, perf_event_init_cpu() gets called, and it calls
__perf_event_init_context() on the cpu's context. That resets the
refcount to 1. Then when the perf record session finishes and the
perf_event is closed, the refcount gets decremented to 0 and the
context gets kfreed after an RCU grace period. Since the context
wasn't kmalloced -- it's part of a per-cpu variable -- bad things
happen.
In fact we don't need to completely reinitialize the context when the
cpu comes up. It's sufficient to initialize the context once at boot,
but we need to do it for all possible cpus.
This moves the context initialization to happen at boot time. With
this, we don't trash the refcount and the context never gets kfreed,
and we don't hit the BUG_ON.
Reported-by: Anton Blanchard
Signed-off-by: Paul Mackerras
Tested-by: Anton Blanchard
Acked-by: Peter Zijlstra
Cc:
Signed-off-by: Ingo Molnar
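The gist of the fix as a hedged sketch (perf_cpu_context and __perf_event_init_context() are names from the kernel of this era; the surrounding function is abridged, not the actual hunk):

/* Initialize every possible cpu's context once at boot, instead of
 * re-running __perf_event_init_context() on each hotplug event,
 * which would reset the refcount under a live system-wide session. */
void __init perf_event_init(void)
{
        int cpu;

        for_each_possible_cpu(cpu) {
                struct perf_cpu_context *cpuctx;

                cpuctx = &per_cpu(perf_cpu_context, cpu);
                __perf_event_init_context(&cpuctx->ctx, NULL);
        }
}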
10 Mar, 2010
22 commits
-
Drop the obsolete "profile" naming used by perf for trace events.
Perf can now do more than simple event counting, so generalize
the API naming.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Steven Rostedt
Cc: Masami Hiramatsu
Cc: Jason Baron
-
We are taking a wrong regs snapshot when a trace event triggers.
Either we use get_irq_regs(), which gives us the interrupted
registers if we are in an interrupt, or we use task_pt_regs(),
which gives us the state before we entered the kernel, assuming
we are lucky enough not to be a kernel thread; otherwise
task_pt_regs() returns the initial set of regs from when the
kernel thread was started.
What we want is different. We need a hot snapshot of the regs,
so that we can get the instruction pointer to record in the
sample, the frame pointer for the callchain, and some other
things.
Let's use the new perf_fetch_caller_regs() for that.
Comparison with perf record -e lock: -R -a -f -g
Before:
perf [kernel] [k] __do_softirq
|
--- __do_softirq
|
|--55.16%-- __open
|
--44.84%-- __write_nocancel
After:
perf [kernel] [k] perf_tp_event
|
--- perf_tp_event
|
|--41.07%-- lock_acquire
| |
| |--39.36%-- _raw_spin_lock
| | |
| | |--7.81%-- hrtimer_interrupt
| | | smp_apic_timer_interrupt
| | | apic_timer_interrupt
The old case was producing unreliable callchains. Now that we
have the right frame and instruction pointers, we get the trace
we want.
Also, syscalls and kprobe events already have the right regs;
let's use them instead of wasting a retrieval.
v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: H. Peter Anvin
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Steven Rostedt
Cc: Arnaldo Carvalho de Melo
Cc: Masami Hiramatsu
Cc: Jason Baron
Cc: Archs
-
Events that trigger overflows by interrupting a context can
use get_irq_regs() or task_pt_regs() to retrieve the state
when the event triggered. But this is not the case for some
other classes of events, like trace events, as tracepoints are
executed in the same context as the code that triggered
the event.
It means we need a different API to capture the regs there,
namely we need a hot snapshot to get the most important
information for perf: the instruction pointer to get the
event origin, the frame pointer for the callchain, the code
segment for user_mode() tests (we always use __KERNEL_CS, as
trace events always occur from the kernel) and the eflags
for further purposes.
v2: rename perf_save_regs to perf_fetch_caller_regs, as per
Masami's suggestion.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: H. Peter Anvin
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Steven Rostedt
Cc: Arnaldo Carvalho de Melo
Cc: Masami Hiramatsu
Cc: Jason Baron
Cc: Archs
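On x86 the snapshot boils down to something like the sketch below; an approximation, not the actual arch helper (local_save_flags() and __KERNEL_CS are real kernel symbols, the rest is paraphrased):

/* Hot snapshot of the caller's state: enough for perf to record
 * the sample IP, walk the frame-pointer callchain, pass the
 * user_mode() test via the code segment, and keep the flags. */
static void fetch_caller_regs(struct pt_regs *regs, unsigned long ip)
{
        memset(regs, 0, sizeof(*regs));
        regs->ip = ip;                                        /* event origin   */
        regs->bp = (unsigned long)__builtin_frame_address(0); /* callchain root */
        regs->cs = __KERNEL_CS;        /* tracepoints fire in the kernel */
        local_save_flags(regs->flags); /* eflags, for further purposes   */
}
-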
We were using the frame pointer based stack walker in every
context on x86-32, but not on x86-64, where we only used the
seven-league boots on the exception stacks.
Use them also on the irq and process stacks. This greatly
accelerates the captures.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
-
There are RCU-locked read-side areas in the path where we submit
a trace event, and these rcu_read_(un)lock() calls trigger lock
events, which create recursive events.
One pair in do_perf_sw_event:
__lock_acquire
|
|--96.11%-- lock_acquire
| |
| |--27.21%-- do_perf_sw_event
| | perf_tp_event
| | |
| | |--49.62%-- ftrace_profile_lock_release
| | | lock_release
| | | |
| | | |--33.85%-- _raw_spin_unlock
Another pair in perf_output_begin/end:
__lock_acquire
|--23.40%-- perf_output_begin
| | __perf_event_overflow
| | perf_swevent_overflow
| | perf_swevent_add
| | perf_swevent_ctx_event
| | do_perf_sw_event
| | perf_tp_event
| | |
| | |--55.37%-- ftrace_profile_lock_acquire
| | | lock_acquire
| | | |
| | | |--37.31%-- _raw_spin_lock
The problem is not so much the trace recursion itself, as we
already have a recursion protection (though it's always wasteful
to recurse). But the trace events are outside the lockdep
recursion protection, so each lockdep event triggers a lock
trace, which will trigger two other lockdep events. Here the
recursive lock trace event won't be taken because of the trace
recursion, so the recursion stops there, but lockdep will still
analyse these new events.
To sum up, for each lockdep event we have:
lock_*()
|
trace lock_acquire
|
----- rcu_read_lock()
| |
| lock_acquire()
| |
| trace_lock_acquire() (stopped)
| |
| lockdep analyze
|
----- rcu_read_unlock()
|
lock_release
|
trace_lock_release() (stopped)
|
lockdep analyzeAnd you can repeat the above two times as we have two rcu read side
sections when we submit an event.This is fixed in this patch by moving the lock trace event under
the lockdep recursion protection.Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Steven Rostedt
Cc: Paul Mackerras
Cc: Hitoshi Mitake
Cc: Li Zefan
Cc: Lai Jiangshan
Cc: Masami Hiramatsu
Cc: Jens Axboe
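The fix can be pictured with this abridged sketch of lock_acquire() (current->lockdep_recursion is the real guard; the function body is heavily trimmed and paraphrased):

void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
                  int trylock, int read, int check,
                  struct lockdep_map *nest_lock, unsigned long ip)
{
        unsigned long flags;

        if (unlikely(current->lockdep_recursion))
                return;

        raw_local_irq_save(flags);
        check_flags(flags);

        current->lockdep_recursion = 1;
        /* The trace event now fires inside the recursion guard, so
         * the rcu_read_(un)lock() it does no longer feeds lockdep. */
        trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
        __lock_acquire(lock, subclass, trylock, read, check,
                       irqs_disabled_flags(flags), nest_lock, ip, 0);
        current->lockdep_recursion = 0;
        raw_local_irq_restore(flags);
}
-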
If -vv is used, just the map table will be printed; -vvv will
print the symbol table too. With it we can see that we have a
bug where some samples are not being resolved to a map when we
get them in the perf.data stream, but after we have it all
processed we can find the right map; some reordering is
probably happening.
Upcoming patches will provide ways to ask for most PERF_SAMPLE_
conditional samples to be taken for !PERF_RECORD_SAMPLE events
too; then we'll be able to ask for PERF_SAMPLE_TIME and
PERF_SAMPLE_CPU to help diagnose this.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Perf report does not handle multiple events being reported, even
though perf record stores them properly on disk. This patch
addresses that issue by adding the logic to perf report to use
the event stream id that is saved by record and the new data
structures to separate the event streams and report them
individually.
Signed-off-by: Eric B Munson
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Now that report can store histograms for multiple events, we
need to be able to do the post-processing work for each
histogram. This patch changes the post-processing functions so
that they can be called individually for each event's histogram.
Signed-off-by: Eric B Munson
[ Guarantee bisectabilty by fixing up builtin-report.c ]
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
This patch adds the structures necessary to count each event
type independently in perf report.
Signed-off-by: Eric B Munson
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
In order to minimize the impact of storing multiple events in a
report this function will now take the root of the histogram
tree so that the logic for selecting the proper tree can be
inserted before the call.
Signed-off-by: Eric B Munson
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Currently perf record does not write the event ID to disk. This
doesn't allow report to tell whether an event stream contains
one or more types of events. This patch adds this entry to the
list of data that record will write to disk if more than one
event was requested.
Signed-off-by: Eric B Munson
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
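The knob involved is the sample_type bitmask; a hedged sketch of the idea, paraphrasing the perf record attribute setup rather than quoting the actual hunk:

/* When more than one event is being recorded, tag each sample
 * with its event ID so perf report can tell the streams apart. */
static void event_attr_setup(struct perf_event_attr *attr, int nr_counters)
{
        attr->sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID;

        if (nr_counters > 1)
                attr->sample_type |= PERF_SAMPLE_ID;
}
-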
cc1: warnings being treated as errors
util/probe-finder.c: In function 'find_line_range':
util/probe-finder.c:172: warning: 'src' may be used uninitialized in this function
make: *** [util/probe-finder.o] Error 1
Signed-off-by: Arnaldo Carvalho de Melo
Acked-by: Masami Hiramatsu
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Signed-off-by: Arnaldo Carvalho de Melo
Cc: David S. Miller
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Fix typo. But the modularization here is ugly and should be improved.
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
-
The PEBS+LBR decoding magic needs the insn_get_length() infrastructure
to be able to decode x86 instruction lengths.
So split it out of the KPROBES dependency and make it enabled
when either KPROBES or PERF_EVENTS is enabled.
Cc: Peter Zijlstra
Cc: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Don't decrement the TOS twice...
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Pull the core handler in line with the nhm one; also make sure
we always drain the buffer.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
We don't need checking_{wr,rd}msr() calls, since we should know
what cpu we're running on and not blindly poke at MSRs.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
If we reset the LBR on each first counter, simple counter rotation which
first deschedules all counters and then reschedules the new ones will
lead to an LBR reset, even though we're still in the same task
context.
Reduce this by not flushing on the first counter, but only
flushing on different task contexts.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
We need to use the actual cpuc->pebs_enabled value, not a local
copy, for the changes to take effect.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
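The bug class is easy to picture (an abstract sketch; cpuc->pebs_enabled is the real field, the mask handling is illustrative):

/* Buggy: the update lands in a local copy and is never written back. */
u64 pebs_enabled = cpuc->pebs_enabled;
pebs_enabled &= ~(1ULL << idx);         /* change is lost */

/* Fixed: operate on the real field so later users observe it. */
cpuc->pebs_enabled &= ~(1ULL << idx);
-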
It's unclear whether the PEBS state record will have only a
single bit set; in case it does not and accumulates bits, deal
with that by only processing each event once.
Also, robustify some of the code.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar
-
The documentation says we have to enable PEBS before we enable the PMU
proper.
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference:
Signed-off-by: Ingo Molnar