28 May, 2020
1 commit
-
As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
06 May, 2020
3 commits
-
As those is a 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
As those are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
As they are not 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
27 Feb, 2020
2 commits
-
For all the perf-config options that can also be set from command line
option, the preference is given to command line version in case of any
conflict. But that's opposite in case of perf annotate. i.e. the more
preference is given to default option rather than command line option.
Fix it.Before:
$ ./perf config
annotate.show_nr_samples=false$ ./perf annotate shash --show-nr-samples
Percent│
│24: mov -0xc(%rbp),%eax
49.19 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%raxAfter:
Samples│
│24: mov -0xc(%rbp),%eax
1 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%raxSigned-off-by: Ravi Bangoria
Tested-by: Arnaldo Carvalho de Melo
Cc: Adrian Hunter
Cc: Alexey Budankov
Cc: Changbin Du
Cc: Ian Rogers
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Leo Yan
Cc: Namhyung Kim
Cc: Song Liu
Cc: Taeung Song
Cc: Thomas Richter
Cc: Yisheng Xie
Link: http://lore.kernel.org/lkml/20200213064306.160480-7-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo -
perf default config set by user in [annotate] section is totally ignored
by annotate code. Fix it.Before:
$ ./perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true$ ./perf annotate shash
│ unsigned h = 0;
│ movl $0x0,-0xc(%rbp)
│ while (*s)
│ ↓ jmp 44
│ h = 65599 * h + *s++;
11.33 │24: mov -0xc(%rbp),%eax
43.50 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%raxAfter:
│ movl $0x0,-0xc(%rbp)
│ ↓ jmp 44
1 │1 24: mov -0xc(%rbp),%eax
4 │ imul $0x1003f,%eax,%ecx
│ mov -0x18(%rbp),%raxNote that we have removed show_nr_samples and show_total_period from
annotation_options because they are not used. Instead of them we use
symbol_conf.show_nr_samples and symbol_conf.show_total_period.Committer testing:
Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio:
# perf config
#
# perf config annotate.hide_src_code=true
# perf config
annotate.hide_src_code=true
#
# perf config annotate.show_nr_jumps=true
# perf config annotate.show_nr_samples=true
# perf config
annotate.hide_src_code=true
annotate.show_nr_jumps=true
annotate.show_nr_samples=true
#
#Before:
# perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Percent
00000000000609f0 :
endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
100.00 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1 1b: xor %eax,%eax
pop %rbp
← retq
nop
1 20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap /dev/null
Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period]
ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0
Samples endbr64
cmpq $0x0,0x20(%rdi)
↓ je 10
xor %eax,%eax
← retq
xchg %ax,%ax
1 10: push %rbp
cmpq $0x0,0x18(%rdi)
mov %rdi,%rbp
↓ jne 20
1b: xor %eax,%eax
pop %rbp
← retq
nop
20: lea 0x18(%rdi),%rdi
→ callq JS_UpdateWeakPointerAfterGC(JS::Heap
Tested-by: Arnaldo Carvalho de Melo
Cc: Adrian Hunter
Cc: Alexey Budankov
Cc: Changbin Du
Cc: Ian Rogers
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Leo Yan
Cc: Namhyung Kim
Cc: Song Liu
Cc: Taeung Song
Cc: Thomas Richter
Cc: Yisheng Xie
Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
14 Jan, 2020
1 commit
-
The objdump utility has useful --prefix / --prefix-strip options to
allow changing source code file names hardcoded into executables' debug
info. Add options to 'perf report', 'perf top' and 'perf annotate',
which are then passed to objdump.$ mkdir foo
$ echo 'main() { for (;;); }' > foo/foo.c
$ gcc -g foo/foo.c
foo/foo.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | main() { for (;;); }
| ^~~~
$ perf record ./a.out
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.230 MB perf.data (5721 samples) ]
$ mv foo bar
$ perf annotate
$ perf annotate --prefix=/home/ak/lsrc/git/bar --prefix-strip=5
Signed-off-by: Andi Kleen
Tested-by: Jiri Olsa
LPU-Reference: 20200107210444.214071-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
12 Nov, 2019
2 commits
-
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we can have all we need for annotation so that we
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in the 'struct map' that have
tons of instances.Cc: Adrian Hunter
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We are already passing things like:
symbol__annotate(ms->sym, ms->map, ...)
So shorten the signature of such functions to receive the 'map_symbol'
pointer.This also paves the way to having the 'struct map_groups' pointer in the
'struct map_symbol' so that we can get rid of 'struct map'->groups.Cc: Adrian Hunter
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-23yx8v1t41nzpkpi7rdrozww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
07 Nov, 2019
1 commit
-
We can get the per sample cycles by hist__account_cycles(). It's also
useful to know the total cycles of all samples in order to get the
cycles coverage for a single program block in further. For example:coverage = per block sampled cycles / total sampled cycles
This patch creates a new argument 'total_cycles' in hist__account_cycles(),
which will be added with the cycles of each sample.Signed-off-by: Jin Yao
Reviewed-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jin Yao
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
21 Sep, 2019
1 commit
-
This patch is to return error code of perf_new_session function on
failure instead of NULL.Test Results:
Before Fix:
$ perf c2c report -input
failed to open nput: No such file or directory$ echo $?
0
$After Fix:
$ perf c2c report -input
failed to open nput: No such file or directory$ echo $?
254
$Committer notes:
Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
session), i.e. we need to pass zero in case of failure, which was the
case before when NULL was returned by perf_session__new() for failure,
but now we need to negate the result of IS_ERR(session) to respect that
TEST_ASSERT_VAL) expectation of zero meaning failure.Reported-by: Nageswara R Sastry
Signed-off-by: Mamatha Inamdar
Tested-by: Arnaldo Carvalho de Melo
Tested-by: Nageswara R Sastry
Acked-by: Ravi Bangoria
Reviewed-by: Jiri Olsa
Reviewed-by: Mukesh Ojha
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Greg Kroah-Hartman
Cc: Jeremie Galarneau
Cc: Kate Stewart
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Shawn Landden
Cc: Song Liu
Cc: Thomas Gleixner
Cc: Tzvetomir Stoyanov
Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo
20 Sep, 2019
1 commit
-
We use what is defined there, were getting it by luck, indirectly, fix
it.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-56g4jshmktniundmiw7h845k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
01 Sep, 2019
3 commits
-
The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Now that thread.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-kh333ivjbw05wsggckpziu86@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So that we can reduce the header dependency tree further, in the process
noticed that lots of places were getting even things like build-id
routines and 'struct perf_tool' definition indirectly, so fix all those
too.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
30 Jul, 2019
1 commit
-
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Michael Petlan
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Jul, 2019
1 commit
-
Eroding a bit more the tools/perf/util/util.h hodpodge header.
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
16 May, 2019
1 commit
-
The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.But it looks it's not necessary. In hist__account_cycles, it already
walks on all branch entries.This patch moves the hist__account_cycles out of callback, now the data
processing is much faster than before.Previous code has an issue that the ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly since
hist__account_cycles is called in each hist_iter__branch_callback, so
the counting of ch[offset].num is not correct (too big).With this patch, the issue is fixed. And we don't need the code of
"ch->reset >= ch->num / 2" to check if there are too many overlaps (in
annotation__count_and_fill), otherwise some data would be hidden.Now, we can try, for example:
perf record -b ...
perf annotate or perf report -s symbolThe before/after output should be no change.
v3:
---
Fix the crash in stdio mode.
Like previous code, it needs the checking of ui__has_annotation()
before hist__account_cycles()v2:
---
1. Cover the similar perf report
2. Remove the checking code "ch->reset >= ch->num / 2"Signed-off-by: Jin Yao
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
23 Feb, 2019
1 commit
-
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo
06 Feb, 2019
1 commit
-
Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
25 Jan, 2019
2 commits
-
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node), which is
something heavily required for histograms. Specifically, the following
are converted to rb_root_cached, and users accordingly:hist::entries_in_array
hist::entries_in
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_outSigned-off-by: Davidlohr Bueso
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20181206191819.30182-7-dave@stgolabs.net
[ Added some missing conversions to rb_first_cached() ]
Signed-off-by: Arnaldo Carvalho de Melo -
At the cost of an extra pointer, we can avoid the O(logN) cost of
finding the first element in the tree (smallest node).Signed-off-by: Davidlohr Bueso
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20181206191819.30182-6-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo
19 Sep, 2018
1 commit
-
Now that we keep a perf_tool pointer inside perf_session, there's no
need to have a perf_tool argument in the event_op2 callback. Remove it.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Aug, 2018
1 commit
-
Add --percent-type option to set annotation percent type from following
choices:global-period, local-period, global-hits, local-hits
Examples:
$ perf annotate --percent-type period-local --stdio | head -1
Percent | Source code ... es, percent: local period)
$ perf annotate --percent-type hits-local --stdio | head -1
Percent | Source code ... es, percent: local hits)
$ perf annotate --percent-type hits-global --stdio | head -1
Percent | Source code ... es, percent: global hits)
$ perf annotate --percent-type period-global --stdio | head -1
Percent | Source code ... es, percent: global period)The local/global keywords set if the percentage is computed in the scope
of the function (local) or the whole data (global).The period/hits keywords set the base the percentage is computed on -
the samples period or the number of samples (hits).Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
25 Jun, 2018
1 commit
-
perf_event__process_feature() accesses feat_ops[HEADER_LAST_FEATURE]
which is not defined and thus perf is crashing. HEADER_LAST_FEATURE is
used as an end marker for the perf report but it's unused for perf
script/annotate. Ignore HEADER_LAST_FEATURE for perf script/annotate,
just like it is done in 'perf report'.Before:
# perf record -o - ls | perf script
Segmentation fault (core dumped)
#After:
# perf record -o - ls | perf script
Segmentation fault (core dumped)
ls 7031 4392.099856: 250000 cpu-clock:uhH: 7f5e0ce7cd60
ls 7031 4392.100355: 250000 cpu-clock:uhH: 7f5e0c706ef7
#Signed-off-by: Ravi Bangoria
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: David Ahern
Cc: David Carrillo-Cisneros
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Namhyung Kim
Fixes: 57b5de463925 ("perf report: Support forced leader feature in pipe mode")
Link: http://lkml.kernel.org/r/20180625124220.6434-4-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
04 Jun, 2018
6 commits
-
One more step in grouping annotation options.
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-sogzdhugoavm6fyw60jnb0vs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So that things changed in the command line may percolate to the browser
code without using globals.Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-5daawc40zhl6gcs600com1ua@git.kernel.org
[ Merged fix for NO_SLANG=1 build provided by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo -
Continuing to group annotation specific stuff into a struct.
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-p3cdhltj58jt0byjzg3g7obx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Continuing to group annotation options in an annotation specific struct.
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-astei92tzxp4yccag5pxb2h7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Accross all the routines, this way we can have eventually have a
consistent set of defaults for all UIs.Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-6qgtixurjgdk5u0n3rw78ges@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
The code gets shorter and we'll be able to use evsel->evlist in a
followup patch.Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-t0s7vy19wq5kak74kavm8swf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
22 May, 2018
1 commit
-
With the '--group' option, even for non-explicit group, 'perf annotate'
will enable the group output.For example,
$ perf record -e cycles,branches ./div
$ perf annotate main --stdio --group: Disassembly of section .text:
:
: 00000000004004b0 :
: main():
:
: return i;
: }
:
: int main(void)
: {
0.00 0.00 : 4004b0: push %rbx
: int i;
: int flag;
: volatile double x = 1212121212, y = 121212;
:
: s_randseed = time(0);
0.00 0.00 : 4004b1: xor %edi,%edi
: srand(s_randseed);
0.00 0.00 : 4004b3: mov $0x77359400,%ebx
:
: return i;
: }
:But if without --group, there is only one event reported.
$ perf annotate main --stdio
: Disassembly of section .text:
:
: 00000000004004b0 :
: main():
:
: return i;
: }
:
: int main(void)
: {
0.00 : 4004b0: push %rbx
: int i;
: int flag;
: volatile double x = 1212121212, y = 121212;
:
: s_randseed = time(0);
0.00 : 4004b1: xor %edi,%edi
: srand(s_randseed);
0.00 : 4004b3: mov $0x77359400,%ebx
:
: return i;
: }Signed-off-by: Jin Yao
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
27 Apr, 2018
1 commit
-
Remove the split of symbol tables for data (MAP__VARIABLE) and for
functions (MAP__FUNCTION), its unneeded and there were various places
doing two lookups to find a symbol, so simplify this.We still will consider only the symbols that matched the filters in
place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in
the patch, just so that we consider only the same symbols as before,
to reduce the possibility of regressions.All the tests on 50-something build environments, in varios versions
of lots of distros and cross build environments were performed without
build regressions, as usual with all pull requests the other tests were
also performed: 'perf test' and 'make -C tools/perf build-test'.Also this was done at a great granularity so that regressions can be
bisected more easily.Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
21 Mar, 2018
3 commits
-
This is already present in 'perf top', albeit undocumented (will fix),
and is useful to use /proc/kcore instead of vmlinux and then get what is
really in place, not what the kernel starts with, before alternatives,
ftrace .text patching, etc, see the differences:# perf annotate --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /lib/modules/4.16.0-rc4/build/vmlinux
Event: anon group { cycles, instructions }0.00 3.17 → callq __fentry__
0.00 7.94 push %rbx
7.69 36.51 → callq __page_file_index
mov %rax,%rbx
7.69 3.17 → callq *ffffffff82225cd0
xor %eax,%eax
mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
3.85 0.00 mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retq
[root@jouet ~]# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions }0.00 3.17 nop
0.00 7.94 push %rbx
0.00 23.81 pushfq
7.69 12.70 pop %rax
nop
mov %rax,%rbx
7.69 3.17 cli
nop
xor %eax,%eax
mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
3.85 0.00 mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq *ffffffff820e96b0
mov %rbx,%rax
pop %rbx
← retq
#Diff of the output of those commands:
# perf annotate --stdio2 _raw_spin_lock_irqsave > /tmp/vmlinux
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave > /tmp/kcore
# diff -y /tmp/vmlinux /tmp/kcore
_raw_spin_lock_irqsave() vmlinux | _raw_spin_lock_irqsave() /proc/kcore
Event: anon group { cycles, instructions } Event: anon group { cycles, instructions }0.00 3.17 → callq __fentry__ | 0.00 3.17 nop
0.00 7.94 push %rbx 0.00 7.94 push %rbx
7.69 36.51 → callq __page_file_index | 0.00 23.81 pushfq
> 7.69 12.70 pop %rax
> nop
mov %rax,%rbx mov %rax,%rbx
7.69 3.17 → callq *ffffffff82225cd0 | 7.69 3.17 cli
> nop
xor %eax,%eax xor %eax,%eax
mov $0x1,%edx mov $0x1,%edx
80.77 49.21 lock cmpxchg %edx,(%rdi) 80.77 49.21 lock cmpxchg %edx,(%rdi)
test %eax,%eax test %eax,%eax
↓ jne 2b ↓ jne 2b
3.85 0.00 mov %rbx,%rax 3.85 0.00 mov %rbx,%rax
pop %rbx pop %rbx
← retq ← retq
2b: mov %eax,%esi 2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath| → callq *ffffffff820e96b0
mov %rbx,%rax mov %rbx,%rax
pop %rbx pop %rbx
← retq ← retq
#This should be further streamlined by doing both annotations and
allowing the TUI to toggle initial/current, and show the patched
instructions in a slightly different color.Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-wz8d269hxkcwaczr0r4rhyjg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
One more thing that goes from the TUI code to be used more widely,
for instance it'll affect the default options used by:perf annotate --stdio2
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-0nsz0dm0akdbo30vgja2a10e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
This uses the TUI augmented formatting routines, modulo interactivity.
# perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
_raw_spin_lock_irqsave() /proc/kcore
Event: cycles:pppPercent
Disassembly of section load0:
ffffffff9a8734b0 :
nop
push %rbx
50.00 pushfq
pop %rax
nop
mov %rax,%rbx
cli
nop
xor %eax,%eax
mov $0x1,%edx
50.00 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retqTested-by: Jin Yao
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-6cte5o8z84mbivbvqlg14uh1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
08 Mar, 2018
1 commit
-
Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the IPC/Cycle even if branch info is recorded in perf
data file.perf record -b ...
perf annotate functionIt should show IPC/cycle, but it doesn't.
This patch lets perf annotate support the displaying of IPC/Cycle if
branch info is in perf data.For example,
perf annotate compute_flag
Percent│ IPC Cycle
│
│
│ Disassembly of section .text:
│
│ 0000000000400640 :
│ compute_flag():
│ volatile int count;
│ static unsigned int s_randseed;
│
│ __attribute__((noinline))
│ int compute_flag()
│ {
22.96 │1.18 584 sub $0x8,%rsp
│ int i;
│
│ i = rand() % 2;
23.02 │1.18 1 → callq rand@plt
│
│ return i;
27.05 │3.37 mov %eax,%edx
│ }
│3.37 add $0x8,%rsp
│ {
│ int i;
│
│ i = rand() % 2;
│
│ return i;
│3.37 shr $0x1f,%edx
│3.37 add %edx,%eax
│3.37 and $0x1,%eax
│3.37 sub %edx,%eax
│ }
26.97 │3.37 2 ← retqNote that, this patch only supports TUI mode. For stdio, now it just keeps
original behavior. Will support it in a follow-up patch.$ perf annotate compute_flag --stdio
Percent | Source code & Disassembly of div for cycles:ppp (7993 samples)
------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 0000000000400640 :
: compute_flag():
: volatile int count;
: static unsigned int s_randseed;
:
: __attribute__((noinline))
: int compute_flag()
: {
0.29 : 400640: sub $0x8,%rsp # +100.00%
: int i;
:
: i = rand() % 2;
42.93 : 400644: callq 400490 # -100.00% (p:100.00%)
:
: return i;
0.10 : 400649: mov %eax,%edx # +100.00%
: }
0.94 : 40064b: add $0x8,%rsp
: {
: int i;
:
: i = rand() % 2;
:
: return i;
27.02 : 40064f: shr $0x1f,%edx
0.15 : 400652: add %edx,%eax
1.24 : 400654: and $0x1,%eax
2.08 : 400657: sub %edx,%eax
: }
25.26 : 400659: retq # -100.00% (p:100.00%)Signed-off-by: Jin Yao
Acked-by: Andi Kleen
Link: http://lkml.kernel.org/r/20180223170210.GC7045@tassilo.jf.intel.com
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1519724327-7773-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
07 Nov, 2017
1 commit
-
Conflicts:
tools/perf/arch/arm/annotate/instructions.c
tools/perf/arch/arm64/annotate/instructions.c
tools/perf/arch/powerpc/annotate/instructions.c
tools/perf/arch/s390/annotate/instructions.c
tools/perf/arch/x86/tests/intel-cqm.c
tools/perf/ui/tui/progress.c
tools/perf/util/zlib.cSigned-off-by: Ingo Molnar
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
31 Oct, 2017
1 commit
-
Add struct perf_data_file to represent a single file within a perf_data
struct.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Changbin Du
Cc: David Ahern
Cc: Jin Yao
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo