02 Aug, 2012
1 commit
-
Since we need evsel->{attr.{sample_{id_all,type}},sample_size},
reducing the number of parameters tools have to pass.Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-wdtmgak0ihgsmw1brb54a8h4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Mar, 2012
1 commit
-
This patch adds:
- ability to parse samples with PERF_SAMPLE_BRANCH_STACK
- sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
- build histograms on branchesSigned-off-by: Roberto Agostino Vitillo
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
12 Dec, 2011
1 commit
-
It's the counterpart of perf_session__parse_sample.
v2: fixed mistakes found by David Ahern.
v3: s/data/sample/
s/perf_event__change_sample/perf_event__synthesize_sampleReviewed-by: David Ahern
Cc: Arun Sharma
Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: devel@openvz.org
Link: http://lkml.kernel.org/r/1323266161-394927-3-git-send-email-avagin@openvz.org
Signed-off-by: Andrew Vagin
Signed-off-by: Arnaldo Carvalho de Melo
02 Dec, 2011
1 commit
-
So that tools like 'perf test' can print the events when in verbose
mode, for instance.Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-xnovdqfi25nc48gy6604k7yp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
28 Nov, 2011
3 commits
-
To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So that we don't need to have that many globals.
Next steps will remove the 'session' pointer, that in most cases is
not needed.Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
24 Sep, 2011
1 commit
-
Currently, analyzing PPC data files on x86 the cpu field is always 0 and
the tid and pid are backwards. For example, analyzing a PPC file on PPC
the pid/tid fields show:rsyslogd 1210/1212
and analyzing the same PPC file using an x86 perf binary shows:
rsyslogd 1212/1210
The problem is that the swap_op method for samples is
perf_event__all64_swap which assumes all elements in the sample_data
struct are u64s. cpu, tid and pid are u32s and need to be handled
individually. Given that the swap is done before the sample is parsed,
the simplest solution is to undo the 64-bit swap of those elements when
the sample is parsed and do the proper swap.The RAW data field is generic and perf cannot have programmatic knowledge
of how to treat that data. Instead a warning is given to the user.Thanks to Anton Blanchard for providing a data file for a mult-CPU
PPC system so I could verify the fix for the CPU fields.v3 -> v4:
- fixed use of WARN_ONCEv2 -> v3:
- used WARN_ONCE for message regarding raw data
- removed struct wrapper around union
- fixed whitespace issuesv1 -> v2:
- added a union for undoing the byte-swap on u64 and redoing swap on
u32's to address compiler errors (see git commit 65014ab3)Cc: Anton Blanchard
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1315321946-16993-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo
03 Jun, 2011
1 commit
-
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.oOne of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
22 May, 2011
1 commit
-
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
11 Feb, 2011
1 commit
-
Fixups due to rename of event_t routines from event__ to perf_event__
done in perf/core.Conflicts:
tools/perf/builtin-record.c
tools/perf/builtin-top.c
tools/perf/util/event.c
tools/perf/util/event.hSigned-off-by: Arnaldo Carvalho de Melo
10 Feb, 2011
1 commit
-
Jeff Moyer reported these messages:
Warning: ... trying to fall back to cpu-clock-ticks
couldn't open /proc/-1/status
couldn't open /proc/-1/maps
[ls output]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]That lead me and David Ahern to see that something was fishy on the thread
synthesizing routines, at least for the case where the workload is started
from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
PERF_RECORD_COMM events for the thread -1, a bug.So I investigated this and noticed that when we introduced support for
recording a process and its threads using --pid some bugs were introduced and
that the way to fix it was to instead of passing the target_tid to the event
synthesizing routines we should better pass the thread_map that has the list of
threads for a --pid or just the single thread for a --tid.Checked in the following ways:
On a 8-way machine run cyclictest:
[root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122
T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27
T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8
T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96
T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12
T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12
T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9
T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ][root@emilia ~]#
This will create one extra thread per CPU:
[root@emilia ~]# tuna -t cyclictest -CP
thread ctxt_switches
pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
28825 OTHER 0 0xff 2169 671 cyclictest
28832 FIFO 93 6 52338 1 cyclictest
28833 FIFO 92 7 46524 1 cyclictest
28826 FIFO 99 0 209360 1 cyclictest
28827 FIFO 98 1 139577 1 cyclictest
28828 FIFO 97 2 104686 0 cyclictest
28829 FIFO 96 3 83751 1 cyclictest
28830 FIFO 95 4 69794 1 cyclictest
28831 FIFO 94 5 59825 1 cyclictest
[root@emilia ~]#So we should expect only samples for the above 9 threads when using the
--dump-raw-trace|-D perf report switch to look at the column with the tid:[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
629 28825
110 28826
491 28827
308 28828
198 28829
621 28830
225 28831
203 28832
89 28833
[root@emilia ~]#So for workloads started by 'perf record' seems to work, now for existing workloads,
just run cyclictest first, without 'perf record':[root@emilia ~]# tuna -t cyclictest -CP
thread ctxt_switches
pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
28859 OTHER 0 0xff 594 200 cyclictest
28864 FIFO 95 4 16587 1 cyclictest
28865 FIFO 94 5 14219 1 cyclictest
28866 FIFO 93 6 12443 0 cyclictest
28867 FIFO 92 7 11062 1 cyclictest
28860 FIFO 99 0 49779 1 cyclictest
28861 FIFO 98 1 33190 1 cyclictest
28862 FIFO 97 2 24895 1 cyclictest
28863 FIFO 96 3 19918 1 cyclictest
[root@emilia ~]#and then later did:
[root@emilia ~]# perf record --pid 28859 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
[root@emilia ~]#To collect 3 seconds worth of samples for pid 28859 and its children:
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
15 28859
33 28860
19 28861
13 28862
13 28863
10 28864
11 28865
9 28866
255 28867
[root@emilia ~]#Works, last thing is to check if looking at just one of those threads also works:
[root@emilia ~]# perf record --tid 28866 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
3 28866
[root@emilia ~]#Works too.
Reported-by: Jeff Moyer
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jeff Moyer
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
30 Jan, 2011
2 commits
-
And move the event_t methods to the perf_event__ too.
No code changes, just namespace consistency.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Making the namespace more uniform.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
24 Jan, 2011
1 commit
-
To avoid linking more stuff in the python binding I'm working on, future
csets will make the sample type be taken from the evsel itself, but for
that we need to first have one file per cpu and per sample_type, not a
single perf.data file.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
09 Dec, 2010
2 commits
-
The dump code used by perf report -D is scattered all over the place.
Move it to separate functions.Acked-by: Ian Munsie
Cc: Frederic Weisbecker
Cc: Ian Munsie
Cc: Ingo Molnar
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Thomas Gleixner
Signed-off-by: Arnaldo Carvalho de Melo -
event__name[] is missing an entry for PERF_RECORD_FINISHED_ROUND, but we
happily access the array from the dump code.Make event__name[] static and provide an accessor function, fix up all
callers and add the missing string.Cc: Frederic Weisbecker
Cc: Ian Munsie
Cc: Ingo Molnar
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Thomas Gleixner
Signed-off-by: Arnaldo Carvalho de Melo
05 Dec, 2010
2 commits
-
So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:
$ perf record -aT
$ perf report -D | grep PERF_RECORD_
3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
First column is the cpu and the second the timestamp.
That way we can investigate problems in the event stream.
If the new perf binary is run on an older kernel, it will disable this feature
automatically.Tested-by: Thomas Gleixner
Reviewed-by: Thomas Gleixner
Acked-by: Ian Munsie
Acked-by: Thomas Gleixner
Cc: Frédéric Weisbecker
Cc: Ian Munsie
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Stephane Eranian
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.Tested-by: Thomas Gleixner
Reviewed-by: Thomas Gleixner
Acked-by: Ian Munsie
Acked-by: Thomas Gleixner
Cc: Frédéric Weisbecker
Cc: Ian Munsie
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Stephane Eranian
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
04 Aug, 2010
1 commit
-
The event__process function is useful in processing /proc//maps. All of
the functions that are called from event__process are defined in util/event.c.
Though its defined in builtin-top.c, it could be reused for perf probe for
uprobes. Hence moving it to util/event.c and exporting the function.LKML-Reference:
Signed-off-by: Srikar Dronamraju
Signed-off-by: Arnaldo Carvalho de Melo
05 Jun, 2010
1 commit
-
Simplifying the tools that were using both in sequence and allowing
upcoming simplifications, such as Arun's patch to sort by cpus.Cc: David S. Miller
Cc: Frédéric Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
14 May, 2010
1 commit
-
This is one more thing that started global but are more useful per hist
or per session.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
11 May, 2010
1 commit
-
In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.Cc: Eric B Munson
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
09 May, 2010
1 commit
-
In order to provide a more rubust and deterministic reordering
algorithm, we need to know when we reach a point where we just
did a pass through over every counter buffers to read every thing
they had.This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
that only consist in an event header and doesn't need to contain
anything.Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
Cc: Tom Zanussi
Cc: Masami Hiramatsu
28 Apr, 2010
1 commit
-
struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.Cc: Avi Kivity
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Zhang, Yanmin
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
19 Apr, 2010
1 commit
-
Here is the patch of userspace perf tool.
Signed-off-by: Zhang Yanmin
Signed-off-by: Avi Kivity
14 Apr, 2010
5 commits
-
Bypasses the build_id perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.Signed-off-by: Tom Zanussi
Acked-by: Thomas Gleixner
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference:
Signed-off-by: Ingo Molnar -
Bypasses the tracing_data perf header code and replaces it with
a synthesized event and processing function that accomplishes
the same thing, used when reading/writing perf data to/from a
pipe.The tracing data is pretty large, and this patch doesn't attempt
to break it down into component events. The tracing_data event
itself doesn't actually contain the tracing data, rather it
arranges for the event processing code to skip over it after
it's read, using the skip return value added to the event
processing loop in a previous patch.Signed-off-by: Tom Zanussi
Acked-by: Thomas Gleixner
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference:
Signed-off-by: Ingo Molnar -
Bypasses the event type perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.Signed-off-by: Tom Zanussi
Acked-by: Thomas Gleixner
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference:
Signed-off-by: Ingo Molnar -
Bypasses the attr perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.Making the attrs into events allows them to be streamed over a
pipe along with the rest of the header data (in later patches).
It also paves the way to allowing events to be added and removed
from perf sessions dynamically.Signed-off-by: Tom Zanussi
Acked-by: Thomas Gleixner
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference:
Signed-off-by: Ingo Molnar -
This patch makes several changes to allow the perf event stream
to be sent and received over a pipe:- adds pipe-specific versions of the header read/write code
- adds pipe-specific version of the event processing code
- adds a range of event types to be used for header or other
pseudo events, above the range used by the kernel- checks the return value of event handlers, which they can use
to skip over large events during event processing rather than actually
reading them into event objects.- unifies the multiple do_read() functions and updates its
users.Note that none of these changes affect the existing perf data
file format or processing - this code only comes into play if
perf output is sent to stdout (or is read from stdin).Signed-off-by: Tom Zanussi
Acked-by: Thomas Gleixner
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference:
Signed-off-by: Ingo Molnar
08 Apr, 2010
1 commit
-
Using 'pahole --packable' I found some structs that could be reorganized
to eliminate alignment holes, in some cases getting them to be cacheline
multiples.[acme@doppio linux-2.6-tip]$ codiff perf.old ~/bin/perf
builtin-annotate.c:
struct perf_session | -8
struct perf_header | -8
2 structs changedbuiltin-diff.c:
struct sample_data | -8
1 struct changed
diff__process_sample_event | -8
1 function changed, 8 bytes removed, diff: -8builtin-sched.c:
struct sched_atom | -8
1 struct changedbuiltin-timechart.c:
struct per_pid | -8
1 struct changed
cmd_timechart | -16
1 function changed, 16 bytes removed, diff: -16builtin-probe.c:
struct perf_probe_point | -8
struct perf_probe_event | -8
2 structs changed
opt_add_probe_event | -3
1 function changed, 3 bytes removed, diff: -3util/probe-finder.c:
struct probe_finder | -8
1 struct changed
find_kprobe_trace_events | -16
1 function changed, 16 bytes removed, diff: -16/home/acme/bin/perf:
4 functions changed, 43 bytes removed, diff: -43
[acme@doppio linux-2.6-tip]$Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
10 Mar, 2010
1 commit
-
This patch adds the structures necessary to count each event
type independently in perf report.Signed-off-by: Eric B Munson
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
14 Jan, 2010
1 commit
-
We were always looking at the running machine /proc/modules,
even when processing a perf.data file, which only makes sense
when we're doing 'perf record' and 'perf report' on the same
machine, and in close sucession, or if we don't use modules at
all, right Peter? ;-)Now, at 'perf record' time we read /proc/modules, find the long
path for modules, and put them as PERF_MMAP events, just like we
did to encode the reloc reference symbol for vmlinux. Talking
about that now it is encoded in .pgoff, so that we can use
.{start,len} to store the address boundaries for the kernel so
that when we reconstruct the kmaps tree we can do lookups right
away, without having to fixup the end of the kernel maps like we
did in the past (and now only in perf record).One more step in the 'perf archive' direction when we'll finally
be able to collect data in one machine and analyse in another.Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
13 Jan, 2010
2 commits
-
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar -
DSOs don't have this problem because the kernel emits a
PERF_MMAP for each new executable mapping it performs on
monitored threads.To fix the kernel case we simulate the same behaviour, by having
'perf record' to synthesize a PERF_MMAP for the kernel, encoded
like this:[root@doppio ~]# perf record -a -f sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.344 MB perf.data (~15038 samples) ]
[root@doppio ~]# perf report -D | head -100xd0 [0x40]: event: 1
.
. ... raw event: size 64 bytes
. 0000: 01 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 ......@........
. 0010: 00 00 00 81 ff ff ff ff 00 00 00 00 00 00 00 00 ...............
. 0020: 00 00 00 00 00 00 00 00 5b 6b 65 72 6e 65 6c 2e ........ [kernel
. 0030: 6b 61 6c 6c 73 79 6d 73 2e 5f 74 65 78 74 5d 00 kallsyms._text]
. 0xd0
[0x40]: PERF_RECORD_MMAP 0/0: [0xffffffff81000000((nil)) @ (nil)]: [kernel.kallsyms._text]I.e. we identify such event as having:
.pid = 0
.filename = [kernel.kallsyms.REFNAME]
.start = REFNAME addr in /proc/kallsyms at 'perf record' timeand use now a hardcoded value of '.text' for REFNAME.
Then, later, in 'perf report', if there are any kernel hits and
thus we need to resolve kernel symbols, we search for REFNAME
and if its address changed, relocation happened and we thus must
change the kernel mapping routines to one that uses .pgoff as
the relocation to apply.This way we use the same mechanism used for the other DSOs and
don't have to do a two pass in all the kernel symbols.Reported-by: Xiao Guangrong
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: "H. Peter Anvin"
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Xiao Guangrong
LKML-Reference:
Signed-off-by: Ingo Molnar
28 Dec, 2009
1 commit
-
And this resulted in the need for adding some missing includes
in some places that were getting the definitions needed out of
sheer luck.Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
18 Dec, 2009
1 commit
-
Pekka Enberg reported weird percentages in perf report. It
turns out we are overflowing a 32-bit variables in struct
events_stats on 32-bit architectures.Before:
[acme@ana linux-2.6-tip]$ perf report -i pekka.perf.data 2> /dev/null | head -10
281.96% Xorg b710a561 [.] 0x000000b710a561
140.15% Xorg [kernel] [k] __initramfs_end
51.56% metacity libgobject-2.0.so.0.2000.1 [.] 0x00000000026e46
35.12% evolution libcairo.so.2.10800.6 [.] 0x000000000203bd
33.84% metacity libpthread-2.9.so [.] 0x00000000007a3dAfter:
[acme@ana linux-2.6-tip]$ perf report -i pekka.perf.data 2> /dev/null | head -10
30.04% Xorg b710a561 [.] 0x000000b710a561
14.93% Xorg [kernel] [k] __initramfs_end
5.49% metacity libgobject-2.0.so.0.2000.1 [.] 0x00000000026e46
3.74% evolution libcairo.so.2.10800.6 [.] 0x000000000203bd
3.61% metacity libpthread-2.9.so [.] 0x00000000007a3dReported-by: Pekka Enberg
Tested-by: Pekka Enberg
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
16 Dec, 2009
1 commit
-
Check build-id of vmlinux by using functions in symbol.c.
This also exposes map__load() for getting vmlinux path,
and removes vmlinux path list in builtin-probe.c,
because symbol.c already has that. Checking build-id
prevents users to open old or different debuginfo from
current running kernel.Signed-off-by: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Steven Rostedt
Cc: Jim Keniston
Cc: Ananth N Mavinakayanahalli
Cc: Christoph Hellwig
Cc: Frank Ch. Eigler
Cc: Jason Baron
Cc: K.Prasad
Cc: Peter Zijlstra
Cc: Srikar Dronamraju
Cc: systemtap
Cc: DLE
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
15 Dec, 2009
1 commit
-
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar