21 Jul, 2011
4 commits
-
Moving out the option parameter from parse_events function,
and adding new parse_events_option function instead.The option parameter is used only to carry "struct perf_evlist"
pointer for chaining new events. Putting it away, enable us
to call parse_events from other places without using the
option parameter.Signed-off-by: Jiri Olsa
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar -
Non-callchain path is using al.addr which prints as:
openssl 14564 17672.003587: 7862d _x86_64_AES_encrypt_compactThis should be sample->ip to print as:
openssl 14564 17672.003587: 3f7867862d _x86_64_AES_encrypt_compactSigned-off-by: David Ahern
Acked-by: Frederic Weisbecker
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1306768587-15376-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar -
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing eventsSigned-off-by: David Ahern
Acked-by: Frederic Weisbecker
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc:
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar -
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.Adding read/write/prefetch as allowed operations for node cache
event.Signed-off-by: Jiri Olsa
Acked-by: Peter Zijlstra
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar
16 Jul, 2011
7 commits
-
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions via perf-probe.
If user gives the path of module with -m option, perf-probe
expects the module is offline.
This feature works with --add, --funcs, and --vars.E.g)
# perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
-a "extent_io_init:5 extent_state_cache"
Add new events:
probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)You can now use it on all perf tools, such as:
perf record -e probe:extent_io_init_1 -aR sleep 1
Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt -
Add probed module name and ":" in front of function name
if -m module option is given. In the result, the symbol
name passed to kprobe-tracer becomes MODULE:FUNCTION,
so that kallsyms can solve it as a symbol in the module
correctly.Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072745.6528.26416.stgit@fedora15
Signed-off-by: Steven Rostedt -
Introduce debuginfo to encapsulate dwarf information.
This new object allows us to reuse and expand debuginfo easily.Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072739.6528.12438.stgit@fedora15
Signed-off-by: Steven Rostedt -
Move dwarf library related routines to dwarf-aux.{c,h}.
This includes several minor changes.
- Add simple documents for each API.
- Rename die_find_real_subprogram() to die_find_realfunc()
- Rename line_walk_handler_t to line_walk_callback_t.
- Minor cleanups.Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072727.6528.57647.stgit@fedora15
Signed-off-by: Steven Rostedt -
Since there are dwarf_bitsize, dwarf_bitoffset and dwarf_bytesize
defined in libdw, we don't need die_get_bit_size, die_get_bit_offset
and die_get_byte_size anymore.Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072721.6528.2747.stgit@fedora15
Signed-off-by: Steven Rostedt -
Since strtailcmp() is enough generic, it should be defined in string.c.
Signed-off-by: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20110627072715.6528.10677.stgit@fedora15
Signed-off-by: Steven Rostedt -
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both of failed and found cases, it should be "END"
instead "FOUND" for avoiding confusion.Signed-off-by: Masami Hiramatsu
Reported-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Paul Mackerras
Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt
15 Jul, 2011
1 commit
-
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded. This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file. It also simplifies the code somewhat.Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Signed-off-by: Sonny Rao
Signed-off-by: Michael Neuling
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt
05 Jul, 2011
1 commit
-
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C .v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentationv3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard
Acked-by: David Ahern
Cc: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar
01 Jul, 2011
2 commits
-
…ic/random-tracing into perf/core
-
Merge reason: Pick up the latest fixes.
Signed-off-by: Ingo Molnar
30 Jun, 2011
5 commits
-
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
As for newt ui, don't display entries that have been marked
as ignored.The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:# Overhead Command Shared Object Symbol
# ........ ........... ................. ............
#
^A
|
--- __lock_acquire
|
|--95.97%-- lock_acquire
| |
| |--30.75%-- _raw_spin_lockDiscard these from the stdio display.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
These are probably some old leftovers.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
These don't need to be globally visible.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.Also add "-G" alias for inverted call graph.
Signed-off-by: Sam Liao
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Signed-off-by: Frederic Weisbecker
19 Jun, 2011
1 commit
-
…l/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes
17 Jun, 2011
1 commit
-
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: Call depmod.sh via shell
perf: clear out make flags when calling kernel make kernelver
16 Jun, 2011
2 commits
-
Merge reason: add the latest fixes.
Signed-off-by: Ingo Molnar
-
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may
trigger additional output which pollute the contents.Signed-off-by: Andy Whitcroft
Signed-off-by: Michal Marek
15 Jun, 2011
1 commit
-
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.Signed-off-by: Shaohua Li
Tested-by: "Alex,Shi"
Signed-off-by: Paul E. McKenney
10 Jun, 2011
2 commits
-
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
perf: Use make kernelversion instead of parsing the Makefile
kbuild: Hack for depmod not handling X.Y versions
kbuild: Move depmod call to a separate script
kbuild: Fix for empty SUBLEVEL or PATCHLEVEL
kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
kbuild: silence Nothing to be done for 'all' message -
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Signed-off-by: Michal Marek
03 Jun, 2011
10 commits
-
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:# ./python/twatch.py
Traceback (most recent call last):
File "./python/twatch.py", line 41, in
main()
File "./python/twatch.py", line 32, in main
event = evlist.read_on_cpu(cpu)
RuntimeError: more argument specifiers than keyword list entriesHence, add cpu to the name list.
Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.oOne of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.Fixes this problem:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Resolve to a function or variable if possible and if the sym option is
enabled.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option. This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains
for those events which gets confusing as to which events are supported,
but not counted and which are not supported.Before:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
0.571283 task-clock # 0.001 CPUs utilized
1 context-switches # 0.002 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.275 M/sec
1,037,707 cycles # 1.816 GHz
stalled-cycles-frontend
stalled-cycles-backend
654,499 instructions # 0.63 insns per cycle
136,129 branches # 238.286 M/sec
branch-misses
L1-dcache-loads
L1-dcache-load-misses
LLC-loads
LLC-load-misses
L1-icache-loads
L1-icache-load-misses
dTLB-loads
dTLB-load-misses
iTLB-loads
iTLB-load-misses
L1-dcache-prefetches
L1-dcache-prefetch-misses1.001004836 seconds time elapsed
After:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
1.350326 task-clock # 0.001 CPUs utilized
2 context-switches # 0.001 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.116 M/sec
11,986 cycles # 0.009 GHz
stalled-cycles-frontend
stalled-cycles-backend
496,986 instructions # 41.46 insns per cycle
138,065 branches # 102.246 M/sec
7,245 branch-misses # 5.25% of all branches
L1-dcache-loads
L1-dcache-load-misses
LLC-loads
LLC-load-misses
L1-icache-loads
L1-icache-load-misses
dTLB-loads
dTLB-load-misses
iTLB-loads
iTLB-load-misses
L1-dcache-prefetches
L1-dcache-prefetch-misses1.002397333 seconds time elapsed
v1->v2:
changed supported type from int to boolv2->v3
fixed vertical alignment of new struct elementCc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The list of methods argument names only needs to be NULL terminated
once. Remove the second ones.Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:# ./python/twatch.py
Traceback (most recent call last):
File "./python/twatch.py", line 41, in
main()
File "./python/twatch.py", line 32, in main
event = evlist.read_on_cpu(cpu)
RuntimeError: more argument specifiers than keyword list entriesHence, add cpu to the name list.
Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo
02 Jun, 2011
3 commits
-
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.oOne of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.Fixes this problem:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo