30 Oct, 2015
1 commit
-
Although previous patch allows setting BPF compiler related options in
perfconfig, on some ad-hoc situation it still requires passing options
through cmdline. This patch introduces 2 options to 'perf record' for
this propose: --clang-path and --clang-opt.Signed-off-by: Wang Nan
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-9-git-send-email-wangnan0@huawei.com
[ Add the new options to the 'record' man page ]
Signed-off-by: Arnaldo Carvalho de Melo
27 Oct, 2015
1 commit
-
Now usage_with_options() setup a pager before printing message so normal
printf() or pr_err() will not be shown. The usage_with_options_msg()
can be used to print some help message before usage strings.Signed-off-by: Namhyung Kim
Acked-by: Masami Hiramatsu
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
23 Oct, 2015
2 commits
-
The --call-graph option is complex so we should provide better guide for
users. Also change help message to be consistent with config option
names. Now perf top will show help like below:$ perf top --call-graph
Error: option `call-graph' requires a valueUsage: perf top []
--call-graph
setup and enables call-graph (stack chain/backtrace):record_mode: call graph recording mode (fp|dwarf|lbr)
record_size: if record_mode is 'dwarf', max size of stack recording ()
default: 8192 (bytes)
print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold ()
print_limit: maximum number of call graph entry ()
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)Default: fp,graph,0.5,caller,function
Requested-by: Ingo Molnar
Signed-off-by: Namhyung Kim
Acked-by: Frederic Weisbecker
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Brendan Gregg
Cc: Chandler Carruth
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Wang Nan
Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
These messages will be used by 'perf top' in the next patch.
Signed-off-by: Namhyung Kim
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Brendan Gregg
Cc: Chandler Carruth
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Wang Nan
Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
03 Oct, 2015
1 commit
-
When run "perf record -e", the number of samples showed up is wrong on some
32 bit systems, i.e. powerpc and arm.For example, run the below commands on 32 bit powerpc:
perf probe -x /lib/libc.so.6 malloc
perf record -e probe_libc:malloc -a ls perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]Actually, "perf script" just shows 21 samples. The number of samples is also
absurd since samples is long type, but it is printed as PRIu64.Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
Signed-off-by: Yang Shi
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
[ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
Signed-off-by: Arnaldo Carvalho de Melo
01 Oct, 2015
1 commit
-
A previous patch added a synthesized comm event for forked child process
but it missed that the event should contain area for sample_id_hdr at
the end. It worked by accident since the perf_event union contains
bigger event structs like mmap_events.This patch fixes it by dynamically allocating event struct including
those area like in perf_event__synthesize_thread_map().Reported-by: Arnaldo Carvalho de Melo
Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1443577526-3240-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
23 Sep, 2015
1 commit
-
When perf creates a new child to profile, the events are enabled on
exec(). And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec(). But there's an window
between the enabling and the event generation.It used to be overcome since samples are only in kernel (so we always
have the map) and the comm is overridden by a later COMM event.
However it won't work if events are processed and displayed before the
COMM event overrides like in 'perf script'. This leads to those early
samples (like native_write_msr_safe) not having a comm but pid (like
':15328').So it needs to synthesize COMM event for the child explicitly before
enabling so that it can have a correct comm. But at this time, the
comm will be "perf" since it's not exec-ed yet.Committer note:
Before this patch:
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf script --show-task-events
:4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)After:
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf script --show-task
perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446
perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446
usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1
usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so)
usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446)
#Signed-off-by: Namhyung Kim
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
01 Sep, 2015
2 commits
-
This patch modifies the -I/--int-regs option to enablepassing the name
of the registers to sample on interrupt. Registers can be specified by
their symbolic names. For instance on x86, --intr-regs=ax,si.The motivation is to reduce the size of the perf.data file and the
overhead of sampling by only collecting the registers useful to a
specific analysis. For instance, for value profiling, sampling only the
registers used to passed arguements to functions.With no parameter, the --intr-regs still records all possible registers
based on the architecture.To name registers, it is necessary to use the long form of the option,
i.e., --intr-regs:$ perf record --intr-regs=si,di,r8,r9 .....
To record any possible registers:
$ perf record -I .....
$ perf report --intr-regs ...To display the register, one can use perf report -D
To list the available registers:
$ perf record --intr-regs=\?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15Signed-off-by: Stephane Eranian
Tested-by: Arnaldo Carvalho de Melo
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo -
An evsel may have different cpus and threads than the evlist it is in.
Use it's own cpus and threads, when opening the evsel in 'perf record'.
Signed-off-by: Kan Liang
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1440138194-17001-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
20 Aug, 2015
2 commits
-
Signed-off-by: Ingo Molnar
-
After recording, 'perf record' post-processes the data to determine
which buildids are needed.That processing must process the data in time order, if possible,
because otherwise dependent events, like forks and mmaps, will not make
sense.Signed-off-by: Adrian Hunter
Tested-by: Jiri Olsa
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/1439994561-27436-4-git-send-email-adrian.hunter@intel.com
[ Moved the sample_id_add to after trying to open the events, use pr_warning ]
Signed-off-by: Arnaldo Carvalho de Melo
06 Aug, 2015
1 commit
-
Pass global callchain_param into parse_callchain_record_opt and
perf_evsel__config_callgraph as parameter. So we can reuse these
functions to parse/config local param for callchain.Signed-off-by: Kan Liang
Acked-by: Jiri Olsa
Cc: Andi Kleen
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1438677022-34296-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
30 Jul, 2015
1 commit
-
Introduce callgraph_set to indicate whether the callgraph option was set
by user.Signed-off-by: Kan Liang
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1438162936-59698-4-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
24 Jul, 2015
1 commit
-
Add an option to select PERF_RECORD_SWITCH events.
Signed-off-by: Adrian Hunter
Acked-by: Peter Zijlstra (Intel)
Tested-by: Arnaldo Carvalho de Melo
Tested-by: Jiri Olsa
Cc: Andi Kleen
Cc: Mathieu Poirier
Cc: Pawel Moll
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1437471846-26995-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
21 Jul, 2015
1 commit
-
This patch allows 'perf record' to exclude events issued by perf itself
by '--exclude-perf' option.Before this patch, when doing something like:
# perf record -a -e syscalls:sys_enter_write
One could easily get result like this:
# /tmp/perf report --stdio
...
# Overhead Command Shared Object Symbol
# ........ ....... .................. ....................
#
99.99% perf libpthread-2.18.so [.] __write_nocancel
0.01% ls libc-2.18.so [.] write
0.01% sshd libc-2.18.so [.] write
...Where most events are generated by perf itself.
A shell trick can be done to filter perf itself out:
# cat << EOF > ./tmp
> #!/bin/sh
> exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
> EOF
# chmod a+x ./tmp
# ./tmpHowever, doing so is user unfriendly.
This patch extracts evsel iteration framework introduced by patch 'perf
record: Apply filter to all events in a glob matching' into
foreach_evsel_in_last_glob(), and makes exclude_perf() function append
new filter expression to each evsel selected by a '-e' selector.To avoid losing filters if user pass '--filter' after '--exclude-perf',
this patch uses perf_evsel__append_filter() in both case, instead of
perf_evsel__set_filter() which removes old filter. As a side effect, now
it is possible to use multiple '--filter' option for one selector. They
are combinded with '&&'.Signed-off-by: Wang Nan
Cc: Andi Kleen
Cc: Brendan Gregg
Cc: Steven Rostedt
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1436513770-8896-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
06 Jul, 2015
1 commit
-
If the option -T is used with option --per-thread, then time is still
not sampled. Fix that by using OPT_BOOLEAN_SET to distinguish when the
user used the -T option as opposed to the default case when timestamps
are enabled but only for per-cpu recording.Signed-off-by: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1436183461-1918-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
20 Jun, 2015
1 commit
-
The time out to limit the individual proc map processing was hard code
to 500ms. This patch introduce a new option --proc-map-timeout to make
the time limit configurable.Signed-off-by: Kan Liang
Cc: Andi Kleen
Cc: David Ahern
Cc: Ying Huang
Link: http://lkml.kernel.org/r/1434549071-25611-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
10 Jun, 2015
1 commit
-
Because there's too many options and I cannot read, I frequently get
confused between -c and -P, and try to do things like:perf record -P 50000 -- foo
Which does not work; try and make the option description slightly longer
and hopefully less confusing.Signed-off-by: Peter Zijlstra (Intel)
Link: http://lkml.kernel.org/r/20150610144850.GP19282@twins.programming.kicks-ass.net
[ Do those changes on the man page as well ]
Signed-off-by: Arnaldo Carvalho de Melo
08 Jun, 2015
1 commit
-
The size of perf.data is missing update in no-buildid mode, which gives
wrong output result.Before this patch:
$ perf.perf record -B -e syscalls:sys_enter_open uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB perf.data ]After this patch:
$ perf.perf record -B -e syscalls:sys_enter_open uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data ]Signed-off-by: He Kuang
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Wang Nan
Link: http://lkml.kernel.org/r/1432819050-30511-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
28 May, 2015
1 commit
-
.. to allow sharing between builtin-record and builtin-top later. No
code changes, just moved code.Signed-off-by: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1432749114-904-9-git-send-email-andi@firstfloor.org
[ Rename too generic branch.[ch] name to parse-branch-options.[ch] ]
Signed-off-by: Arnaldo Carvalho de Melo
06 May, 2015
2 commits
-
Add a new option and support for Instruction Tracing Snapshot Mode.
When the new option is selected, no AUX area tracing data is captured
until a signal (SIGUSR2) is received.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1430404667-10593-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo -
Add build option NO_AUXTRACE to exclude compiling support for AUX area
tracing. Support for both recording and processing is excluded and by
implication any future additions such as Intel PT and Intel BTS will
also not be compiled in with this option.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1430404667-10593-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
05 May, 2015
2 commits
-
We need to include all buildids when a perf.data file contains AUX area
tracing data because we do not decode the trace for that purpose because
it would take too long.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1430404667-10593-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo -
Add an index of AUX area tracing events within a perf.data file.
perf record uses a special user event PERF_RECORD_FINISHED_ROUND to
enable sorting of events in chunks instead of having to sort all events
altogether.AUX area tracing events contain data that can span back to the very
beginning of the recording period. i.e. they do not obey the rules of
PERF_RECORD_FINISHED_ROUND.By adding an index, AUX area tracing events can be found in advance and
the PERF_RECORD_FINISHED_ROUND approach works as usual.The index is recorded with the auxtrace feature in the perf.data file.
A session reads the index but does not process it. An AUX area decoder
can queue all the AUX area data in advance using
auxtrace_queues__process_index() or otherwise process the index in some
custom manner.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1430404667-10593-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
29 Apr, 2015
2 commits
-
Extend the -m option so that the number of mmap pages for AUX area
tracing can be specified by adding a comma followed by the number of
pages.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1428594864-29309-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo -
Amend the perf record tool to read the AUX area tracing mmap and
synthesize AUX area tracing events.Signed-off-by: Adrian Hunter
Acked-by: Jiri Olsa
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1428594864-29309-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
10 Apr, 2015
1 commit
-
The data_head and data_tail fields are defined as __u64 in
linux/perf_event.h, but perf userspace uses int and unsigned int.Convert all references to u64 for consistency.
Signed-off-by: David Ahern
Acked-by: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1428420037-26599-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
08 Apr, 2015
1 commit
-
Teach perf-record about the new perf_event_attr::{use_clockid, clockid}
fields. Add a simple parameter to set the clock (if any) to be used for
the events to be recorded into the data file.Since we store the entire perf_event_attr in the EVENT_DESC section we
also already store the used clockid in the data file.Signed-off-by: Peter Zijlstra (Intel)
Acked-by: David Ahern
Cc: "H. Peter Anvin"
Cc: Adrian Hunter
Cc: Andrew Morton
Cc: Jiri Olsa
Cc: John Stultz
Cc: Linus Torvalds
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Yunlong Song
Link: http://lkml.kernel.org/r/20150407154851.GR23123@twins.programming.kicks-ass.net
[ Conditionally define CLOCK_BOOTTIME, at least rhel6 doesn't have it - dsahern
Ditto for CLOCK_MONOTONIC_RAW, sles11sp2 doesn't have it - yunlong.song ]
Signed-off-by: Arnaldo Carvalho de Melo
26 Mar, 2015
1 commit
-
Use of a bad filter currently generates the message:
Error: failed to set filter with 22 (Invalid argument)Add the event name to make it clear to which event the filter
failed to apply:
Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argumentTo test it use something like:
# perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1
Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument)
#Based-on-a-patch-by: David Ahern
Acked-by: David Ahern
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
11 Mar, 2015
1 commit
-
By keeping pointers to machines, evlist and tool in ordered_events.
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-0c6huyaf59mqtm2ek9pmposl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
25 Feb, 2015
1 commit
-
Add an option to perf record to record running/enabled time for read
events, similar to what stat does.This is useful to understand multiplexing problems.
Right now the report support is not great, but at least report -D
already supports it.Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1424819620-16043-1-git-send-email-andi@firstfloor.org
[ Fixed the Documentation entry to match the OPT_BOOLEAN one ]
Signed-off-by: Arnaldo Carvalho de Melo
19 Feb, 2015
1 commit
-
Currently, there are two call chain recording options, fp and dwarf.
Haswell has a new feature that utilizes the existing LBR facility to
record call chains. Kernel side LBR support code provides this as a
third option to record call chains. This patch enables the lbr call
stack support on the tooling side.LBR call stack has some limitations:
- It reuses current LBR facility, so LBR call stack and branch record
can not be enabled at the same time.- It is only available for user-space callchains.
However, it also offers some advantages:
- LBR call stack can work on user apps which don't have frame-pointers
or dwarf debug info compiled. It is a good alternative when nothing
else works.Tested-by: Jiri Olsa
Signed-off-by: Kan Liang
Signed-off-by: Peter Zijlstra (Intel)
Cc: Adrian Hunter
Cc: Anshuman Khandual
Cc: Arnaldo Carvalho de Melo
Cc: Cody P Schafer
Cc: David Ahern
Cc: Don Zickus
Cc: Jacob Shin
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Masanari Iida
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Rodrigo Campos
Cc: Stephane Eranian
Cc: Sukadev Bhattiprolu
Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar
30 Jan, 2015
3 commits
-
Do not reference file->fd directly since we want hide the
implementation details from outside for possible future changes.Signed-off-by: Namhyung Kim
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1422518843-25818-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
After perf record finishes, it prints file size and number of samples in
the file but this info is wrong since it assumes typical sample size of
24 bytes and divides file size by the value.However as we post-process recorded samples for build-id, it can show
correct number like below. If build-id post-processing is not requested
just omit the wrong number of samples.$ perf record noploop 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.159 MB perf.data (3989 samples) ]$ perf report --stdio -n
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 3K of event 'cycles'
# Event count (approx.): 3771330663
#
# Overhead Samples Command Shared Object Symbol
# ........ ............ ....... ................ ..........................
#
99.90% 3982 noploop noploop [.] main
0.09% 1 noploop ld-2.17.so [.] _dl_check_map_versions
0.01% 1 noploop [kernel.vmlinux] [k] setup_arg_pages
0.00% 5 noploop [kernel.vmlinux] [k] intel_pmu_enable_allReported-by: Milian Wolff
Signed-off-by: Namhyung Kim
Reviewed-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1422518843-25818-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
It's only used for perf record to process build-id because its file size
it's not fixed at this time due to remaining header features.However data offset and size is available so that we can use the
perf_session__process_events() once we set the file size as the current
offset like for now.Signed-off-by: Namhyung Kim
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1422518843-25818-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
16 Nov, 2014
1 commit
-
Add -I/--intr-regs option to capture machine state registers at
interrupt.Add the corresponding man page description
Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra (Intel)
Link: http://lkml.kernel.org/r/1411559322-16548-6-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: Adrian Hunter
Cc: Anshuman Khandual
Cc: Arnaldo Carvalho de Melo
Cc: Linus Torvalds
Cc: Masanari Iida
Signed-off-by: Ingo Molnar
05 Nov, 2014
1 commit
-
When perf record finishes a session, it pre-processes samples in order
to write build-id info from DSOs that had samples.During this process it'll call map__load() for the kernel map, and it
ends up calling dso__load_vmlinux_path() which replaces dso->long_name.But this function checks kernel's build-id before searching vmlinux path
so it'll end up with a cryptic name, the pathname for the entry in the
~/.debug cache, which can be confusing to users.This patch adds a flag to skip the build-id check during record, so
that it'll have the original vmlinux path for the kernel dso->long_name,
not the entry in the ~/.debug cache.Before:
# perf record -va sleep 3
mmap size 528384B
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.196 MB perf.data (~8545 samples) ]
Looking at the vmlinux_path (7 entries long)
Using /home/namhyung/.debug/.build-id/f0/6e17aa50adf4d00b88925e03775de107611551 for symbolsAfter:
# perf record -va sleep 3
mmap size 528384B
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.193 MB perf.data (~8432 samples) ]
Looking at the vmlinux_path (7 entries long)
Using /lib/modules/3.16.4-1-ARCH/build/vmlinux for symbolsSigned-off-by: Namhyung Kim
Cc: Adrian Hunter
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1415063674-17206-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
29 Oct, 2014
1 commit
-
Those are shared with other builtin commands like kvm, script. So
make it accessable from them. This is a preparation of later change
that limiting possible options.Signed-off-by: Namhyung Kim
Acked-by: Hemant Kumar
Cc: Alexander Yarygin
Cc: David Ahern
Cc: Hemant Kumar
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1413990949-13953-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
17 Oct, 2014
1 commit
-
The only thing we need is a forward declaration for 'struct cgroup_sel',
that is inside 'struct perf_evsel'.Include cgroup.h instead on the tools that support cgroups.
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
15 Oct, 2014
1 commit
-
It was lost in hist.h, move it to where it belongs, callchain.h, as
there are places that gets hist.h by means of evsel.h, and since evsel.h
is being untangled from hist.h...Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-0rg7ji1jnbm6q6gj35j37jby@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo