19 Aug, 2009
1 commit
-
pushd tools/perf/Documentation
make html
popd
is failing for me...
ASCIIDOC perf-annotate.html
ERROR: unsafe: include file: /etc/asciidoc/./stylesheets/xhtml11.css
ERROR: unsafe: include file:
/etc/asciidoc/./stylesheets/xhtml11-manpage.css
ERROR: unsafe: include file:
/etc/asciidoc/./stylesheets/xhtml11-quirks.css
make: *** [perf-annotate.html] Error 1
Apparently asciidoc "unsafe" is the default mode of operation
in practice.
https://bugzilla.redhat.com/show_bug.cgi?id=506953
Works tidily now.
Signed-off-by: Kyle McMartin
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
18 Aug, 2009
1 commit
-
Linus reported this perf annotate segfault:
[torvalds@nehalem git]$ perf annotate unmap_vmas
Segmentation fault
#0 map__clone (self=) at builtin-annotate.c:236
#1 thread__fork (self=) at builtin-annotate.c:372
The bug here was that builtin-annotate.c was a copy of
builtin-report.c and a threading related fix to builtin-report.c
didn't get propagated to builtin-annotate.c ...
Reported-by: Linus Torvalds
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
17 Aug, 2009
1 commit
-
Rename it to examples.txt to avoid the perf-*.txt pattern in
the Makefile, otherwise 'make doc' fails because
perf-examples.txt is not formatted to be a man page:
ERROR: perf-examples.txt: line 1: manpage document title is mandatory
Signed-off-by: Carlos R. Mafra
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
15 Aug, 2009
1 commit
-
We were using 'fd' locally, but there was a global 'fd' too, so
when converting from open to fopen the test made against 'fd'
should have been changed to test 'fp'; because of that global,
the mistake didn't get discovered ...
Reported-by: Ulrich Drepper
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
13 Aug, 2009
4 commits
-
We're interested in just those symbols/DSOs, so filter out the
unresolved ones.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
While we can enable the perf sample records per tracepoint
counter, we may also want to enable this option for every
tracepoint counter we open, so that we don't need to add a
:record flag to each of them.
Add the -R, --raw-samples option for this purpose.
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Add a new flag field while opening a tracepoint perf counter:
-e tracepoint_subsystem:tracepoint_name:flags
This is intended to be generic although for now it only supports the
r[e[c[o[r[d]]]]] flag:
./perf record -e workqueue:workqueue_insertion:record
./perf record -e workqueue:workqueue_insertion:r
will have the same effect: enabling the raw samples record for
the given tracepoint counter.
In the future, we may want to support further flags, separated
by commas.
Signed-off-by: Frederic Weisbecker
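The r[e[c[o[r[d]]]]] notation in this commit message means that any non-empty prefix of "record" is accepted as the flag. A minimal Python sketch of that matching rule follows; the function name and error handling are hypothetical and only illustrate the idea, they are not the parser used by perf.

```python
def parse_tracepoint_event(spec):
    """Split 'subsystem:name[:flags]' and report whether raw records are requested.

    Hypothetical helper: only the 'record' flag (or any non-empty prefix
    of it, such as 'r' or 'rec') is understood, as the commit describes.
    """
    parts = spec.split(":")
    if len(parts) not in (2, 3) or not parts[0] or not parts[1]:
        raise ValueError(f"malformed tracepoint event: {spec!r}")

    subsystem, name = parts[0], parts[1]
    flags = parts[2] if len(parts) == 3 else ""

    record = False
    if flags:
        # 'r', 're', 'rec', 'reco', 'recor' and 'record' are all prefixes of "record".
        if not "record".startswith(flags):
            raise ValueError(f"unknown tracepoint flag: {flags!r}")
        record = True
    return subsystem, name, record


if __name__ == "__main__":
    print(parse_tracepoint_event("workqueue:workqueue_insertion:record"))
    print(parse_tracepoint_event("workqueue:workqueue_insertion:r"))
```

Both calls in the usage section return the same tuple, which mirrors the statement that the two command lines have the same effect.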
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
-
When /sys/kernel/debug is mounted the list can be immense, so
use the pager like the other tools.
Signed-off-by: Arnaldo Carvalho de Melo
Acked-by: Frederic Weisbecker
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
12 Aug, 2009
5 commits
-
perf top supports a -C for setting the profile CPU, but perf
record does not. This adds the same option for perf record,
allowing the user to specify a specific target profile CPU.
Signed-off-by: Jens Axboe
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
It is better than showing the map addr, this way at least we
know that we can't get the symtabs because the DSO was deleted
(system update) while an app still used such DSO.
Yeah, don't do that, but if you do, you'll figure it out
quicker this way.
[acme@doppio linux-2.6-tip]$ perf report | head -15
# Samples: 3796
#
# Overhead Command Shared Object Symbol
# ........ ....... ................................................................... ......
#
23.55% pidgin /lib64/libglib-2.0.so.0.2000.4.#prelink#.Pd98lu (deleted) [.] 0x00000000038844
21.55% pidgin /lib64/libpthread-2.10.1.so.#prelink#.AFwK8Q (deleted) [.] 0x0000000000a42d
10.85% pidgin [kernel] [.] vread_hpet
7.85% pidgin /lib64/libgobject-2.0.so.0.2000.4.#prelink#.o1vpU7 (deleted) [.] 0x00000000014de8
3.35% pidgin /lib64/libc-2.10.1.so (deleted) [.] 0x0000000007a875
3.19% pidgin /lib64/libdbus-1.so.3.4.0.#prelink#.6mwgZP (deleted) [.] 0x0000000001d254
3.06% pidgin /usr/lib64/libgtk-x11-2.0.so.0.1600.5.#prelink#.511hAl (deleted) [.] 0x000000002334e7
2.90% pidgin /usr/lib64/libgdk-x11-2.0.so.0.1600.5.#prelink#.5qlMo1 (deleted) [.] 0x00000000037b2d
1.84% pidgin [kernel] [k] do_sys_poll
1.45% pidgin /usr/lib64/libX11.so.6.2.0.#prelink#.iR59Rx (deleted) [.] 0x0000000004c751
[acme@doppio linux-2.6-tip]$
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Luis Claudio R. Gonçalves
Cc: Clark Williams
Cc: H. Peter Anvin
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Frédéric Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
In old binutils we can't access bfd_demangle(), use
cplus_demangle() just like oprofile.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Luis Claudio R. Gonçalves
Cc: H. Peter Anvin
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Frédéric Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
This made it easier to find the firefox threading related
bug.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: "H. Peter Anvin"
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Frédéric Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Noticed when trying to record events for a firefox thread. We
were synthesizing both .tid and .pid with the pid passed via
--pid.
Fix it by reading /proc/PID/status and getting the tgid
to use in .pid; .tid gets the specified "pid".
Signed-off-by: Arnaldo Carvalho de Melo
Cc: "H. Peter Anvin"
Cc: Frédéric Weisbecker
Cc: Peter Zijlstra
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
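The --pid fix above relies on the "Tgid:" field of /proc/PID/status to find the thread group a thread belongs to. The lookup can be sketched in Python as below; the function names are hypothetical, and the sketch only illustrates the field lookup, not the actual C code in perf.

```python
def parse_tgid(status_text):
    """Return the thread group id found in the text of a /proc/<tid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("Tgid:"):
            # The line looks like "Tgid:\t4242".
            return int(line.split()[1])
    raise ValueError("no Tgid: field found")


def read_tgid(tid):
    """Read /proc/<tid>/status on a Linux system and return its Tgid."""
    with open(f"/proc/{tid}/status") as status_file:
        return parse_tgid(status_file.read())


if __name__ == "__main__":
    sample = "Name:\tfirefox\nState:\tS (sleeping)\nTgid:\t4242\nPid:\t4250\n"
    # The synthesized event would use the Tgid as .pid and the given id as .tid.
    print(parse_tgid(sample))
```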
09 Aug, 2009
19 commits
-
Sometimes we get callchain branches that have a rate under the
limit given by the user.
Say you launched:
perf record -f -g -a ./hackbench 10
perf report -g fractal,10.0
And you got:
2.33% hackbench [kernel] [k] _spin_lock_irqsave
|
|--78.57%-- remove_wait_queue
| poll_freewait
| do_sys_poll
| sys_poll
| sysenter_dispatch
| 0xf7ffa430
| 0x1ffadea3c
|
|--7.14%-- __up_read
| up_read
| do_page_fault
| page_fault
| 0xf7ffa430
| 0xa0df710000000a
...
It is abnormal to get a 7.14% branch when we passed a 10%
filter.
The problem is that we round down the minimum threshold. This
happens mostly when we have a very low number of events. If the
total of your branch is 4 events and you have a sub-branch of 3
events, filtering at 90% is computed as follows:
limit = 4 * 0.9;
The result is 3.6, but the cast to integer rounds it
down to 3, so the filter is effectively 75%.
We must then explicitly round up the minimum threshold.
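The arithmetic above can be reproduced with a short Python sketch. The helper names are hypothetical; the sketch only contrasts truncation with rounding up and does not reproduce the perf source.

```python
import math


def min_hits_truncated(total_hits, min_percent):
    """Threshold as computed before the fix: the fractional part is dropped."""
    return int(total_hits * min_percent / 100.0)


def min_hits_rounded_up(total_hits, min_percent):
    """Threshold with explicit rounding up, as the commit proposes."""
    return math.ceil(total_hits * min_percent / 100.0)


if __name__ == "__main__":
    total, sub_branch = 4, 3
    # With truncation a 3-hit sub-branch passes a 90% filter on 4 hits.
    print(sub_branch >= min_hits_truncated(total, 90))
    # With rounding up the same sub-branch is filtered out.
    print(sub_branch >= min_hits_rounded_up(total, 90))
```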
Reported-by: Ingo Molnar
Signed-off-by: Frederic Weisbecker
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: efault@gmx.de
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Due to a libz dependency in some distros' binutils packages,
C++ demangle support isn't compiled in despite the necessary
libraries being available.
Fix this by adding a -lz link test to the dependency detection
rules.
Signed-off-by: Mike Galbraith
Acked-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
A few examples of how 'perf' can be used, from an e-mail by
Ingo Molnar (http://lkml.org/lkml/2009/8/4/346).
Signed-off-by: Carlos R. Mafra
Cc: Peter Zijlstra
Cc: Valdis.Kletnieks@vt.edu
LKML-Reference:
Signed-off-by: Ingo Molnar
-
… ignored chains in fractal mode
When we filter the callchains below a given percentage, we
ignore them and the end result only shows entries whose
percentage is above the filter threshold.
It then seems to users that we have an imbalance in the
percentages, as if the sum inside a profiled branch doesn't
reach 100%.
Since in the past there have been real perf report bugs that
showed the same symptom, it would be nice to assure the user
that the data is accurate and trustworthy and that it all sums
up to 100.00%.
So fix this by displaying the remaining hits that have been
filtered, with no more detail than their amount in each
branch. Example while filtering below 50%:
7.73% [k] delay_tsc
|
|--98.22%-- __const_udelay
| |
| |--86.37%-- ath5k_hw_register_timeout
| | ath5k_hw_noise_floor_calibration
| | ath5k_hw_reset
| | ath5k_reset
| | ath5k_config
| | ieee80211_hw_config
| | |
| | |--88.53%-- ieee80211_scan_work
| | | worker_thread
| | | kthread
| | | child_rip
| | --11.47%-- [...]
| --13.63%-- [...]
--1.78%-- [...]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
If we recorded with the -g option to record the callchain, right now
we require a -g option to perf report as well - and people reported
this as an unnecessary complication: the user already specified -g
once, no need to require it a second time.
So if the recording includes call-chains, display the callchain by
default from perf report.
( The user can override this default using the "-g none" option of
perf report. )
Reported-by: Ingo Molnar
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
-
When the callchain tree comes to insert an empty backtrace, it
raises a warning about the fact that we are inserting an
empty backtrace. The warning is spurious: the radix tree assumes it
did something wrong to reach this state, but it didn't; we just met
an empty callchain that has to be ignored.
This happens occasionally with certain types of call-chain
recordings. If it happens it's a big nuisance as perf report
output starts with thousands of warning lines.
Reported-by: Ingo Molnar
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
-
1. Ignore the -A argument if there is no perf.data file
2. Treat an empty file like a non-existent file.
Otherwise, perf will try to read the perf.data header, and fail with
an error.
Treating an empty file like a non-existent file makes sense,
since an interrupted (as in SIGKILLed) perf could leave such
files around, and you don't want to annoy the user with errors
for files with no data in them.
Signed-off-by: Pierre Habouzit
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Signed-off-by: Ingo Molnar
-
While toying with perf, I've noticed that perf record can
easily enter a busy loop when doing something as silly as:
$ perf record -A ls
Yeah, do_read here really wants to read a known size; not being
able to should die(), not busy-loop ;)
That was the cause of the bug.
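The behaviour asked of do_read, read exactly a known size or fail loudly, can be sketched in Python as follows. The function name read_exact is hypothetical, and raising an exception stands in for die(); this is an illustration of the idea, not the perf implementation.

```python
import io


def read_exact(stream, size):
    """Read exactly `size` bytes from a binary stream or raise EOFError.

    An empty read means end of file: retrying would loop forever,
    so the short read is reported as an error instead.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError(f"expected {size} bytes, got {size - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(read_exact(io.BytesIO(b"abcdef"), 4))
    try:
        # An empty file, as left behind by an interrupted run.
        read_exact(io.BytesIO(b""), 4)
    except EOFError as err:
        print("error:", err)
```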
Signed-off-by: Pierre Habouzit
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Signed-off-by: Ingo Molnar
-
Stop perf list from displaying tracepoints without an id file.
Those are special tracepoints that are not interfaced to
perfcounters, so listing them is erroneous and passing them as
events will produce no output.
Signed-off-by: Peter Zijlstra
Acked-by: Jason Baron
Cc: Steven Rostedt
Cc: Chris Mason
Signed-off-by: Ingo Molnar
-
We want to use a coherent flag for -S/--stat across all tools,
so free up -S in perf stat.
Signed-off-by: Brice Goglin
Cc: Peter Zijlstra
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar
-
…gin (DSO, build-id, kernel, etc)
Used with perf report --verbose:
[acme@doppio linux-2.6-tip]$ perf report -v | head -16
5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal
1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable
1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet
1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock
1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet
1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light
1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc
[acme@doppio linux-2.6-tip]$
The new field is just after the symbol address. It helps in
figuring out symbol resolution bugs.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Brice Goglin reported:
> I can easily sort them by thread id, but I don't know how to match
> my 4 events with each group of 4 lines.
Also report the counter id and the time running/enabled
stats (in case the counter got time-shared).
Reported-by: Brice Goglin
Signed-off-by: Peter Zijlstra
Tested-by: Brice Goglin
Signed-off-by: Ingo Molnar
-
Brice Goglin reported that only the first result from a
multi-counter perf record --stat run is accurate, the
rest look bogus.
A silly mistake made us re-read the first attribute for
every recorded attribute.
Reported-by: Brice Goglin
Signed-off-by: Peter Zijlstra
Tested-by: Brice Goglin
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar
-
The callchain fractal mode builds each new total hits in a new
branch of profiling by using the parent's hits of the current
branch plus the hits of the children.
This is wrong: the total hits of a branch should be the
sum of its children's hits; we must ignore the parent's hits in
this scope.
This patch also fixes another mistake with the hit counting.
Now the rates are correct.
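The corrected rule, rates computed against the sum of the children's hits only, can be illustrated with a small Python sketch. The function name and the sample counts are hypothetical; the sketch is not taken from the perf source.

```python
def child_rates(child_hits):
    """Return each child's share, in percent, of the sum of all children's hits.

    The parent's own hits are deliberately left out of the total, so the
    shares inside one branch always add up to 100%.
    """
    total = sum(child_hits.values())
    if total == 0:
        return {}
    return {name: 100.0 * hits / total for name, hits in child_hits.items()}


if __name__ == "__main__":
    # Hypothetical hit counts for two sub-branches of one callchain node.
    rates = child_rates({"branch_a": 3, "branch_b": 1})
    for name, rate in rates.items():
        print(f"{rate:.2f}% {name}")
```

Adding the parent's hits to the denominator, as the old code did, would shrink every share and leave the branch summing to less than 100%.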
Signed-off-by: Frederic Weisbecker
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Mike Galbraith
Cc: Pekka Enberg
Signed-off-by: Ingo Molnar
-
perf_counter tools: update perf top manual page to reflect
current implementation.
Signed-off-by: Mike Galbraith
Cc: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
Pressing any key which is not currently mapped to
functionality, based on startup command line options, displays
currently mapped keys, and prompts for input.
Pressing any unmapped key at the prompt returns the user to
display mode with variables unchanged. eg, pressing ?
etc displays currently available keys, the value of the
variable associated with that key, and prompts.
Pressing same again aborts input.
Signed-off-by: Mike Galbraith
Cc: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
…vidual counter display
Add [w]eighted hotkey. Pressing [w] toggles between displaying
weighted total of all counters, and the counter selected via
[E]vent select key.
------------------------------------------------------------------------------
PerfTop: 90395 irqs/sec kernel:16.1% [cache-misses/cache-references/instructions], (all, 4 CPUs)
------------------------------------------------------------------------------
weight samples pcnt RIP kernel function
______ _______ _____ ________________ _______________
1275408.6 10881 - 5.3% - ffffffff81146f70 : copy_page_c
553683.4 43569 - 21.3% - ffffffff81146f20 : clear_page_c
74075.0 6768 - 3.3% - ffffffff81147190 : copy_user_generic_string
40602.9 7538 - 3.7% - ffffffff81284ba2 : _spin_lock
26882.1 965 - 0.5% - ffffffff8109d280 : file_ra_state_init
[w]
------------------------------------------------------------------------------
PerfTop: 91221 irqs/sec kernel:14.5% [10000Hz cache-misses], (all, 4 CPUs)
------------------------------------------------------------------------------
weight samples pcnt RIP kernel function
______ _______ _____ ________________ _______________
47320.00 - 22.3% - ffffffff81146f20 : clear_page_c
14261.00 - 6.7% - ffffffff810992f5 : __rmqueue
11046.00 - 5.2% - ffffffff81146f70 : copy_page_c
7842.00 - 3.7% - ffffffff81284ba2 : _spin_lock
7234.00 - 3.4% - ffffffff810aa1d6 : unmap_vmas
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
perf top used to have annotation support, but it bitrotted and
was removed.
This patch restores that: it allows the user to select any symbol
in kernel space for source level annotation on the fly, switch
between event counters and alter display variables. When symbol
details are being displayed, stopping annotation reverts to normal.
known keys:
[d] select display delay.
[e] select display entries (lines).
[E] select annotation event counter.
[f] select normal display count filter.
[F] select annotation display count filter (percentage).
[qQ] quit.
[s] select annotation symbol and start annotation.
[S] stop annotation, revert to normal display.
[z] toggle event count zeroing.
Sample:
------------------------------------------------------------------------------
PerfTop: 16719 irqs/sec kernel:78.7% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs)
------------------------------------------------------------------------------
Showing cache-misses for e1000_clean_rx_irq
Events Pcnt (>=3%)
0 0.0% /* adjust length to remove Ethernet CRC */
0 0.0% if (!(adapter->flags2 & FLAG2_CRC_STRIPPING))
0 0.0% length -= 4;
436 5.0% f039: 41 f6 84 24 5c 29 00 testb $0x1,0x295c(%r12)
0 0.0% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx
0 0.0% f08c: 48 83 ef 02 sub $0x2,%rdi
0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi
811 9.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
0 0.0%
0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) {
0 0.0% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15)
7226 82.6% f119: 0f 85 24 fe ff ff jne ef43
Available events:
0 cache-misses
1 cache-references
2 instructions
3 cycles
Enter details event counter: 2
------------------------------------------------------------------------------
PerfTop: 15035 irqs/sec kernel:79.0% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs)
------------------------------------------------------------------------------
Showing instructions for e1000_clean_rx_irq
Events Pcnt (>=3%)
0 0.0% int *work_done, int work_to_do)
0 0.0% {
175 0.9% eebf: 55 push %rbp
1898 9.8% eec0: 48 89 e5 mov %rsp,%rbp
0 0.0%
0 0.0% i = rx_ring->next_to_clean;
140 0.7% ef0a: 0f b7 41 1a movzwl 0x1a(%rcx),%eax
670 3.4% ef0e: 89 45 ac mov %eax,-0x54(%rbp)
0 0.0% {
0 0.0% memcpy(skb->data + offset, from, len);
91 0.5% f07b: 49 8b b6 e8 00 00 00 mov 0xe8(%r14),%rsi
1153 5.9% f082: 48 8b b8 e8 00 00 00 mov 0xe8(%rax),%rdi
42 0.2% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx
14 0.1% f08c: 48 83 ef 02 sub $0x2,%rdi
0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi
1618 8.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
0 0.0%
0 0.0% /* return some buffers to hardware, one at a time is too slow */
0 0.0% if (cleaned_count >= E1000_RX_BUFFER_WRITE) {
867 4.5% f0e7: 83 7d b0 0f cmpl $0xf,-0x50(%rbp)
0 0.0%
0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) {
37 0.2% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15)
4047 20.8% f119: 0f 85 24 fe ff ff jne ef43
Signed-off-by: Mike Galbraith
Signed-off-by: Peter Zijlstra
Cc: Paul Mackerras
Signed-off-by: Ingo Molnar
-
This patch implements the kernel side support for ftrace event
record sampling.
A new counter sampling attribute is added:
PERF_SAMPLE_TP_RECORD
which requests ftrace events record sampling. In this case
if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
fires, we emit the tracepoint binary record to the
perfcounter event buffer, as a sample.
Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
record:
perf record -f -F 1 -a -e workqueue:workqueue_execution
perf report -D
0x21e18 [0x48]: event: 9
.
. ... raw event: size 72 bytes
. 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........
. 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!......
. 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve
. 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1...........
. 0040: e0 b1 31 81 ff ff ff ff .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
The raw ftrace binary record starts at offset 0020.
Translation:
struct trace_entry {
type = 0x2b = 43;
flags = 1;
preempt_count = 2;
pid = 0xa = 10;
tgid = 0xa = 10;
}
thread_comm = "events/1"
thread_pid = 0xa = 10;
func = 0xffffffff8131b1e0 = flush_to_ldisc()
What will come next?
- Userspace support ('perf trace'), 'flight data recorder' mode
  for perf trace, etc.
- The unconditional copy from the profiling callback brings
  some cost even if someone wants no such sampling to
  occur, and needs to be fixed in the future. For that we need
  to have instant access to the perf counter attribute.
  This is a matter of adding a flag to struct ftrace_event.
- Take care of event recursion! Don't ever try to record
  a lock event for example; it seems some locking is used in
  the profiling fast path, which leads to tracing recursion.
  That will be fixed using a raw spinlock or recursion
  protection.
- [...]
- Profit! :-)
Signed-off-by: Frederic Weisbecker
Cc: Li Zefan
Cc: Tom Zanussi
Cc: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Steven Rostedt
Cc: Paul Mackerras
Cc: Pekka Enberg
Cc: Gabriel Munteanu
Cc: Lai Jiangshan
Signed-off-by: Ingo Molnar
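The translation of the raw ftrace record in the commit above can be reproduced by unpacking bytes 0x20 to 0x47 of the hex dump. The Python sketch below assumes a packed little-endian layout (u16 type, u8 flags, u8 preempt_count, s32 pid, s32 tgid, a 16-byte comm, s32 thread_pid, u64 func); these field widths are inferred from the dump and the translation in the commit message, not taken from kernel headers, and the function name is hypothetical.

```python
import struct

# Bytes 0x20..0x47 of the raw event, copied from the hex dump in the commit message.
RAW_RECORD = bytes.fromhex(
    "2b0001020a0000000a000000"          # type, flags, preempt_count, pid, tgid
    "6576656e74732f310000000000000000"  # thread_comm, NUL padded to 16 bytes
    "0a000000"                          # thread_pid
    "e0b13181ffffffff"                  # func
)

# Assumed layout (inferred from the dump): little-endian, no padding.
LAYOUT = struct.Struct("<HBBii16siQ")


def decode_record(raw):
    """Unpack one record into a dict using the assumed layout."""
    (ev_type, flags, preempt_count, pid, tgid,
     comm, thread_pid, func) = LAYOUT.unpack(raw)
    return {
        "type": ev_type,
        "flags": flags,
        "preempt_count": preempt_count,
        "pid": pid,
        "tgid": tgid,
        "thread_comm": comm.split(b"\0", 1)[0].decode("ascii"),
        "thread_pid": thread_pid,
        "func": func,
    }


if __name__ == "__main__":
    for key, value in decode_record(RAW_RECORD).items():
        print(key, "=", hex(value) if key == "func" else value)
```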
07 Aug, 2009
2 commits
-
Adds autodetection for libelf as well, and simplifies the
libbfd code. Furthermore, fail make with an error when libelf
is not found and warn about the lack of libbfd.
Also provide an option to build a 32bit version even though you
might be running a 64bit kernel.
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
In some cases distros have binaries and debuginfo in weird places:
[root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
-rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
-rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
[root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/firefox-3.5.2/firefox
[root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
xulrunner-1.9.1.2-1.fc11.x86_64
firefox-3.5.2-2.fc11.x86_64
[root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
-rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug
Seemingly we don't have a .symtab when we actually can find it
if we use the .note.gnu.build-id ELF section put in place by
some distros. Use it and find the symbols we need.
Signed-off-by: Arnaldo Carvalho de Melo
Acked-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
05 Aug, 2009
3 commits
-
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Since the C++ demangling isn't needed for everybody and
bfd/iberty aren't widely/easily available on all machines, make
it optional.
It also allows you to forcefully disable demangling by using
NO_DEMANGLE=1 and otherwise tries to detect libbfd/libiberty
combinations that result in a compiling demangler.
Reported-by: Jens Axboe
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Kyle McMartin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
If you're doing performance testing, you're interested in the
symbols anyway so let's make "--sort comm,dso,symbol" the
default sort option.
Signed-off-by: Pekka Enberg
Acked-by: Peter Zijlstra
Cc: acme@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
04 Aug, 2009
1 commit
-
Check whether index is within bounds before testing the element.
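The ordering this commit describes can be shown with a generic Python sketch. The function name is hypothetical and the original C code is not reproduced here; in C an out-of-bounds read is undefined behaviour, while Python raises IndexError instead, but the principle of testing the bound first is the same.

```python
def element_is_set(values, index):
    """Return True if `index` is valid and the element there is truthy.

    The bounds test comes first; `and` short-circuits, so the element
    is only read once the index is known to be in range.
    """
    return 0 <= index < len(values) and bool(values[index])


if __name__ == "__main__":
    flags = [1, 0]
    print(element_is_set(flags, 0))
    print(element_is_set(flags, 1))
    # Out of range: reported as False without touching the list.
    print(element_is_set(flags, 2))
```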
Signed-off-by: Roel Kluin
Cc: a.p.zijlstra@chello.nl
Cc: Andrew Morton
LKML-Reference:
Signed-off-by: Ingo Molnar
02 Aug, 2009
2 commits
-
We skip the display of idle routine related symbols because
they are typically rather erratic and confusing: they depend
on the IRQ rate or sometimes they dominate the profile if
they are polling based.
Add mwait_idle_with_hints too; this is one of the idle
routines on x86.
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Ingo Molnar
-
This patch fixes a spelling error that resulted from copying
and pasting. The location of the error was found using a
semantic patch, but the semantic patch was not trying to find
these errors. After looking things over it seemed logical that
this change was needed. Please review it and then include the
patch if it is in fact the correct change.
Signed-off-by: Stoyan Gaydarov
Signed-off-by: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar