23 Feb, 2019
2 commits
-
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo -
We are about to add support for multiple files, so we need each file to
keep its size.Signed-off-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
11 Feb, 2019
1 commit
-
Implement --affinity=node|cpu option for the record mode defaulting
to system affinity mask bouncing.Signed-off-by: Alexey Budankov
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
06 Feb, 2019
3 commits
-
Build node cpu masks for mmap data buffers. Apply node cpu masks to tool
thread every time it references data buffers cross node or cross cpu.Signed-off-by: Alexey Budankov
Reviewed-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/b25e4ebc-078d-2c7b-216c-f0bed108d073@linux.intel.com
[ Use cpu-set-sched.h to get the CPU_{EQUAL,OR}() fallbacks for older systems ]
Signed-off-by: Arnaldo Carvalho de Melo -
Allocate affinity option and masks for mmap data buffers and record
thread as well as initialize allocated objects.Signed-off-by: Alexey Budankov
Reviewed-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/526fa2b0-07de-6dbd-a7e9-26ba875593c9@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo -
CoreSight was the only client of the PMU's set_drv_config() API. Now
that it is no longer needed by CoreSight remove it from the code base.Signed-off-by: Mathieu Poirier
Acked-by: Suzuki K Poulouse
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexei Starovoitov
Cc: Greg Kroah-Hartman
Cc: H. Peter Anvin
Cc: Heiko Carstens
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Martin Schwidefsky
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20190131184714.20388-8-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo
22 Jan, 2019
2 commits
-
This patch synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for
BPF programs loaded before perf-record. This is achieved by gathering
information about all BPF programs via sys_bpf.Committer notes:
Fix the build on some older systems such as amazonlinux:1 where it was
breaking with:util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
util/bpf-event.c:52:9: error: missing initializer for field 'type' of 'struct bpf_prog_info' [-Werror=missing-field-initializers]
struct bpf_prog_info info = {};
^
In file included from /git/linux/tools/lib/bpf/bpf.h:26:0,
from util/bpf-event.c:3:
/git/linux/tools/include/uapi/linux/bpf.h:2699:8: note: 'type' declared here
__u32 type;
^
cc1: all warnings being treated as errorsFurther fix on a centos:6 system:
cc1: warnings being treated as errors
util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
util/bpf-event.c:50: error: 'func_info_rec_size' may be used uninitialized in this functionThe compiler is wrong, but to silence it, initialize that variable to
zero.One more fix, this time for debian:experimental-x-mips, x-mips64 and
x-mipsel:util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
util/bpf-event.c:93:16: error: implicit declaration of function 'calloc' [-Werror=implicit-function-declaration]
func_infos = calloc(sub_prog_cnt, func_info_rec_size);
^~~~~~
util/bpf-event.c:93:16: error: incompatible implicit declaration of built-in function 'calloc' [-Werror]
util/bpf-event.c:93:16: note: include '' or provide a declaration of 'calloc'Add the missing header.
Committer testing:
# perf record --bpf-event sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ]
# perf report -D | grep PERF_RECORD_BPF_EVENT | nl
1 0 0x4b10 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
2 0 0x4c60 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
3 0 0x4db0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
4 0 0x4f00 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
5 0 0x5050 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
6 0 0x51a0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
7 0 0x52f0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
8 0 0x5440 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
# bpftool prog
13: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
14: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
15: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
16: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
17: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
18: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
21: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
22: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
## perf report -D | grep -B22 PERF_RECORD_KSYMBOL
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff ......8..D......
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
. 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x49d8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc00644ff len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 48 6d 06 c0 ff ff ff ff ......8.Hm......
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
. 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x4b28 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0066d48 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 04 cf 03 c0 ff ff ff ff ......8.........
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
. 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x4c78 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc003cf04 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 96 28 04 c0 ff ff ff ff ......8..(......
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
. 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x4dc8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0042896 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 05 13 17 c0 ff ff ff ff ......8.........
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
. 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x4f18 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0171305 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 0a 8c 23 c0 ff ff ff ff ......8...#.....
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
. 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x5068 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0238c0a len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 2a a5 a4 c0 ff ff ff ff ......8.*.......
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
. 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x51b8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4a52a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
--
. ... raw event: size 312 bytes
. 0000: 11 00 00 00 00 00 38 01 9b c9 a4 c0 ff ff ff ff ......8.........
. 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
. 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
. 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............
. 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
. 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
. 0130: 00 00 00 00 00 00 00 00 ........0 0x5308 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4c99b len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
Signed-off-by: Song Liu
Reviewed-by: Arnaldo Carvalho de Melo
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Peter Zijlstra
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-8-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo -
This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of
PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
turn it on.Committer notes:
Add dummy machine__process_bpf_event() variant that returns zero for
systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
the build in such systems.Remove the needless include from bpf->event.h, provide just
forward declarations for the structs and unions in the parameters, to
reduce compilation time and needless rebuilds when machine.h gets
changed.Committer testing:
When running with:
# perf record --bpf-event
On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
is not present, we fallback to removing those two bits from
perf_event_attr, making the tool to continue to work on older kernels:perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off bpf_event
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
------------------------------------------------------------
sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off ksymbol
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------And then proceeds to work without those two features.
As passing --bpf-event is an explicit action performed by the user, perhaps we
should emit a warning telling that the kernel has no such feature, but this can
be done on top of this patch.Now with a kernel that supports these events, start the 'record --bpf-event -a'
and then run 'perf trace sleep 10000' that will use the BPF
augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus
should generate PERF_RECORD_BPF_EVENT events:[root@quaco ~]# perf record -e dummy -a --bpf-event
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.713 MB perf.data ][root@quaco ~]# bpftool prog
13: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
14: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 13,14
15: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
16: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:43-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 15,16
17: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
18: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:44-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 17,18
21: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
22: cgroup_skb tag 2a142ef67aaad174 gpl
loaded_at 2019-01-19T09:09:45-0300 uid 0
xlated 296B jited 229B memlock 4096B map_ids 21,22
31: tracepoint name sys_enter tag 12504ba9402f952f gpl
loaded_at 2019-01-19T09:19:56-0300 uid 0
xlated 512B jited 374B memlock 4096B map_ids 30,29,28
32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl
loaded_at 2019-01-19T09:19:56-0300 uid 0
xlated 256B jited 191B memlock 4096B map_ids 30,29
# perf report -D | grep PERF_RECORD_BPF_EVENT | nl
1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
#There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'.
Signed-off-by: Song Liu
Reviewed-by: Arnaldo Carvalho de Melo
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Peter Zijlstra
Cc: kernel-team@fb.com
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
18 Dec, 2018
3 commits
-
The default timeout of 500ms for parsing /proc//maps files is too
short for profiling many of our services.This can be overridden by passing --proc-map-timeout to the relevant
command but it'd be nice to globally increase our default value.This patch permits setting a different default with the
core.proc-map-timeout config file parameter.Signed-off-by: Mark Drayton
Acked-by: Song Liu
Acked-by: Namhyung Kim
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com
Signed-off-by: Arnaldo Carvalho de Melo -
Multi AIO trace writing allows caching more kernel data into userspace
memory postponing trace writing for the sake of overall profiling data
thruput increase. It could be seen as kernel data buffer extension into
userspace memory.With an --aio option value different from 0 (default value is 1) the
tool has capability to cache more and more data into user space along
with delegating spill to AIO.That allows avoiding to suspend at record__aio_sync() between calls of
record__mmap_read_evlist() and increases profiling data thruput at the
cost of userspace memory.Signed-off-by: Alexey Budankov
Reviewed-by: Jiri Olsa
Acked-by: Namhyung Kim
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo -
The trace file offset is read once before mmaps iterating loop and
written back after all performance data is enqueued for aio writing.The trace file offset is incremented linearly after every successful aio
write operation.record__aio_sync() blocks till completion of the started AIO operation
and then proceeds.record__aio_mmap_read_sync() implements a barrier for all incomplete
aio write requests.Signed-off-by: Alexey Budankov
Reviewed-by: Jiri Olsa
Acked-by: Namhyung Kim
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
06 Nov, 2018
1 commit
-
Implement a weak group fallback for 'perf record', similar to the
existing 'perf stat' support. This allows to use groups that might be
longer than the available counters without failing.Before:
$ perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}' -a sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
/bin/dmesg | grep -i perf may provide additional information.After:
$ ./perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W' -a sleep 1
WARNING: No sample_id_all support, falling back to unordered processing
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 8.136 MB perf.data (134069 samples) ]Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Link: http://lkml.kernel.org/r/20181001195927.14211-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
18 Oct, 2018
1 commit
-
Store -k clockid frequency into Perf trace to enable timestamps
derived metrics conversion into wall clock time on reporting stage.Below is the example of perf report output:
tools/perf/perf record -k raw -- ../../matrix/linux/matrix.gcc
...
[ perf record: Captured and wrote 31.222 MB perf.data (818054 samples) ]tools/perf/perf report --header
# ========
...
# event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, clockid = 4
...
# clockid frequency: 1000 MHz
...
# ========Signed-off-by: Alexey Budankov
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/23a4a1dc-b160-85a0-347d-40a2ed6d007b@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
19 Sep, 2018
2 commits
-
The struct perf_mmap map argument will hold the file pointer to write
the data to.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180913125450.21342-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
The perf_mmap struct will hold a file pointer to write the mmap's
contents, so we need to propagate it down the stack to record__write
callers instead of its member the auxtrace_mmap struct.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Alexey Budankov
Cc: Andi Kleen
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180913125450.21342-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
31 Aug, 2018
1 commit
-
To be able to pass in other than session's evlist.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180830063252.23729-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
17 Mar, 2018
2 commits
-
We need to synthesize events first, because some features works on top
of them (on report side).Signed-off-by: Jiri Olsa
Tested-by: Stephane Eranian
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180314092205.23291-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We have brought perf_default_config to the very beginning at main(), so
it no need to call perf_default_config() once more for most of config in
perf-record but only for record.call-graph.Signed-off-by: Yisheng Xie
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1520853957-36106-2-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
08 Mar, 2018
3 commits
-
It's no longer used.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180307155020.32613-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
It's used far more down to be declared on the top of the __cmd_record.
Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180307155020.32613-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Using the 'start' and 'end' which are stored in struct perf_mmap to
replace the temporary 'start' and 'end'.
The temporary variables will be discarded later.It doesn't need to pass 'overwrite' to perf_mmap__push(). It's stored in
struct perf_mmap.Signed-off-by: Kan Liang
Acked-by: Jiri Olsa
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: http://lkml.kernel.org/r/1520350567-80082-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
07 Mar, 2018
1 commit
-
In preparation for adding AUX area sampling support, combine some
auxtrace initialization into a single function.Signed-off-by: Adrian Hunter
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1520327598-1317-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
05 Mar, 2018
3 commits
-
Currently we can crash perf record when running in pipe mode, like:
$ perf record ls | perf report
# To display the perf.data header info, please use --header/--header-only options.
#
perf: Segmentation fault
Error:
The - file has no samples!The callstack of the crash is:
0x0000000000515242 in perf_event__synthesize_event_update_name
3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
(gdb) bt
#0 0x0000000000515242 in perf_event__synthesize_event_update_name
#1 0x00000000005158a4 in perf_event__synthesize_extra_attr
#2 0x0000000000443347 in record__synthesize
#3 0x00000000004438e3 in __cmd_record
#4 0x000000000044514e in cmd_record
#5 0x00000000004cbc95 in run_builtin
#6 0x00000000004cbf02 in handle_internal_command
#7 0x00000000004cc054 in run_argv
#8 0x00000000004cc422 in mainThe reason of the crash is that the evsel does not have ids array
allocated and the pipe's synthesize code tries to access it.We don't force evsel ids allocation when we have single event, because
it's not needed. However we need it when we are in pipe mode even for
single event as a key for evsel update event.Fixing this by forcing evsel ids allocation event for single event, when
we are in pipe mode.Signed-off-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
# perf record -F 200000 sleep 1
warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz.
The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate.
The kernel will lower it when perf's interrupts take too long.
Use --strict-freq to disable this throttling, refusing to record.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1For those wanting that it fails if the desired frequency can't be used:
# perf record --strict-freq -F 200000 sleep 1
error: Maximum frequency rate (15,000 Hz) exceeded.
Please use -F freq option with a lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
#Suggested-by: Ingo Molnar
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Add the handy '-F max' shortcut to reading and using the
kernel.perf_event_max_sample_rate value as the user supplied
sampling frequency:# perf record -F max sleep 1
info: Using a maximum frequency rate of 15,000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ]
# sysctl kernel.perf_event_max_sample_rate
kernel.perf_event_max_sample_rate = 15000
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1# perf record -F 10 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
#Suggested-by: Ingo Molnar
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
16 Feb, 2018
1 commit
-
There's no new-line after target-override warning, now:
$ perf record -a --per-thread
Warning:
SYSTEM/CPU switch overriding PER-THREAD^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]with patch:
$ perf record -a --per-thread
Warning:
SYSTEM/CPU switch overriding PER-THREAD
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.705 MB perf.data (2939 samples) ]Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Fixes: 16ad2ffb822c ("perf tools: Introduce perf_target__strerror()")
Link: http://lkml.kernel.org/r/20180206181813.10943-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
05 Feb, 2018
1 commit
-
Stephan reported we don't unset PERIOD sample type when --no-period is
specified. Adding the unset check and reset PERIOD if --no-period is
specified.Committer notes:
Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when
--no-period is used.Before:
# perf record --no-period sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]
# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
#After:
[root@jouet ~]# perf record --no-period sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ]
[root@jouet ~]# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[root@jouet ~]#Reported-by: Stephane Eranian
Signed-off-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Tested-by: Stephane Eranian
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
08 Jan, 2018
1 commit
-
In the default 'perf record' configuration, all samples are processed,
to create the HEADER_BUILD_ID table. So it's very easy to get the
first/last samples and save the time to perf file header via the
function write_sample_time().Later, at post processing time, perf report/script will fetch the time
from perf file header.Committer testing:
# perf record -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
[root@jouet home]# perf report --header | grep "time of "
# time of first sample : 22947.909226
# time of last sample : 22948.910704
#
# perf report -D | grep PERF_RECORD_SAMPLE\(
0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0
0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0
3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0
0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0
2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0
#Changelog:
v7: Just update the patch description according to Arnaldo's suggestion.
v6: Currently '--buildid-all' is not enabled at default. So the walking
on all samples is the default operation. There is no big overhead
to calculate the timestamp boundary in process_sample_event handler
once we already go through all samples. So the timestamp boundary
calculation is enabled by default when '--buildid-all' is not enabled.While if '--buildid-all' is enabled, we creates a new option
"--timestamp-boundary" for user to decide if it enables the
timestamp boundary calculation.v5: There is an issue that the sample walking can only work when
'--buildid-all' is not enabled. So we need to let the walking
be able to work even if '--buildid-all' is enabled and let the
processing skips the dso hit marking for this case.At first, I want to provide a new option "--record-time-boundaries".
While after consideration, I think a new option is not very
necessary.v3: Remove the definitions of first_sample_time and last_sample_time
from struct record and directly save them in perf_evlist.Signed-off-by: Jin Yao
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Kan Liang
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
27 Dec, 2017
2 commits
-
While monitoring a multithread process with pid option, perf sometimes
may return sys_perf_event_open failure with 3(No such process) if any of
the process's threads die before we open the event. However, we want
perf continue monitoring the remaining threads and do not exit with
error.Here, the patch enables perf_evsel::ignore_missing_thread for -p option
to ignore complete failure if any of threads die before we open the event.
But it may still return sys_perf_event_open failure with 22(Invalid) if we
monitors several event groups.sys_perf_event_open: pid 28960 cpu 40 group_fd 118202 flags 0x8
sys_perf_event_open: pid 28961 cpu 40 group_fd 118203 flags 0x8
WARNING: Ignored open failure for pid 28962
sys_perf_event_open: pid 28962 cpu 40 group_fd [118203] flags 0x8
sys_perf_event_open failed, error -22That is because when we ignore a missing thread, we change the thread_idx
without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
may return a wrong group_fd for the next thread and sys_perf_event_open()
return with 22.sys_perf_event_open(){
...
if (group_fd != -1)
perf_fget_light()//to get corresponding group_leader by group_fd
...
if (group_leader)
if (group_leader->ctx->task != ctx->task)//should on the same task
goto err_context
...
}This patch also fixes this bug by introducing perf_evsel__remove_fd() and
update_fds to allow removing fds for the missing thread.Changes since v1:
- Change group_fd__remove() into a more genetic way without changing code logic
- Remove redundant conditionChanges since v2:
- Use a proper function name and add some comment.
- Multiline comment style fixes.Committer testing:
Before this patch the recently added 'perf stat --per-thread' for system
wide counting would race while enumerating all threads using /proc:[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directoryUsage: perf stat [] []
-C, --cpu list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]# perf stat --per-thread
failed to parse CPUs map: No such file or directoryUsage: perf stat [] []
-C, --cpu list of cpus to monitor in system-wide
-a, --all-cpus system-wide collection from all CPUs
[root@jouet ~]#When, say, the kernel was being built, so lots of shortlived threads,
after this patch this doesn't happen.Signed-off-by: Mengting Zhang
Tested-by: Arnaldo Carvalho de Melo
Acked-by: Jiri Olsa
Cc: Cheng Jian
Cc: Li Bin
Cc: Wang Nan
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
[ Remove one use 'evlist' alias variable ]
Signed-off-by: Arnaldo Carvalho de Melo -
These duplicate includes have been found with scripts/checkincludes.pl
but they have been removed manually to avoid removing false positives.Signed-off-by: Pravin Shedge
Cc: David S. Miller
Cc: Greg Kroah-Hartman
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
06 Dec, 2017
4 commits
-
Remove the backward/forward concept to make it uniform with user
interface (the '--overwrite' option).Signed-off-by: Wang Nan
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Mengting Zhang
Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo -
'overwrite' argument is always 'false'. Remove it from arguments list of
perf_mmap__push().Signed-off-by: Wang Nan
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Cc: Kan Liang
Link: http://lkml.kernel.org/r/20171203020044.81680-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo -
evlist->overwrite is set to false in all users. It can be removed.
Signed-off-by: Wang Nan
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Cc: Kan Liang
Link: http://lkml.kernel.org/r/20171203020044.81680-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo -
All users of perf_evlist__mmap_ex set !overwrite. Remove it from its
arguments list.Signed-off-by: Wang Nan
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Cc: Kan Liang
Link: http://lkml.kernel.org/r/20171203020044.81680-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo
30 Nov, 2017
2 commits
-
Synthesize the per attr thread maps and cpu maps in 'perf record'.
This allows code from 'perf stat' called from 'perf script' to access
this information.Committer testing:
Please see the PERF_RECORD_THREAD_MAP and PERF_RECORD_CPU_MAP records,
added by this patch:$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf report -D | grep PERF_RECORD_ | head
0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 23568
0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x148 [0x28]: PERF_RECORD_COMM: perf:23568/23568
0x570 [0x8]: PERF_RECORD_FINISHED_ROUND
445342677837144 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:23568/23568
445342677847339 0x198 [0x68]: PERF_RECORD_MMAP2 23568/23568: [0x564c943a4000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
445342677862450 0x200 [0x70]: PERF_RECORD_MMAP2 23568/23568: [0x7f25968a8000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
445342677873174 0x270 [0x60]: PERF_RECORD_MMAP2 23568/23568: [0x7ffc98176000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
445342677891928 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 23568/23568: 0xffffffff8f84c7e7 period: 1 addr: 0
$Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/20171117214300.32746-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo -
Move the code to synthesize event updates for scale/unit/cpus to a
common utility file, and use it both from stat and record.This allows to access scale and other extra qualifiers from perf script.
Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Link: http://lkml.kernel.org/r/20171117214300.32746-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
17 Nov, 2017
1 commit
-
If we're not sampling the kernel, we shouldn't care about kptr_restrict
neither synthesize anything for assisting in resolving kernel samples,
like the reference relocation symbol or kernel modules information.Before:
$ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
2
2
$ perf record sleep 1
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf evlist -v
cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$After:
$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ]
$Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Namhyung Kim
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-t025e9zftbx2b8cq2w01g5e5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
13 Nov, 2017
1 commit
-
When we use an initial delay, e.g.: 'perf record --delay 1000', we do not
enable the events until that delay has passed after we started the workload,
including the tracking event, i.e. the one for which we have attr.mmap, etc,
enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata
events that will then allow us to resolve addresses in samples to the map, dso
and symbol. There will be a shadow that even synthesizing samples won't cover,
i.e. the workload that we start and other processes forking while we
wait for the initial delay to expire.So use a dummy event to be the tracking one and make it be enabled on exec.
Before:
# perf record --delay 1000 stress --cpu 1 --timeout 5
stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [9029] successful run completed in 5s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ]
# perf script | head
:9031 9031 32001.826888: 1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826893: 1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826895: 7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826897: 103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826899: 1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826902: 26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826913: 329739 cycles:ppp: 7fb2a5410932 [unknown] ([unknown])
:9031 9031 32001.827033: 1225451 cycles:ppp: 7fb2a5410930 [unknown] ([unknown])
:9031 9031 32001.827474: 1391725 cycles:ppp: 7fb2a5410930 [unknown] ([unknown])
:9031 9031 32001.827978: 1233697 cycles:ppp: 7fb2a5410928 [unknown] ([unknown])
#After:
# perf record --delay 1000 stress --cpu 1 --timeout 5
stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [9741] successful run completed in 5s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ]
# perf script | head
stress 9742 32110.959106: 1 cycles:ppp: ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959110: 1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959112: 7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959115: 101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959117: 1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959119: 23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959129: 329406 cycles:ppp: 7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so)
stress 9742 32110.959249: 1288322 cycles:ppp: 5566e1e7cbc9 hogcpu (/usr/bin/stress)
stress 9742 32110.959712: 1464046 cycles:ppp: 7f4b1b66179e __random (/usr/lib64/libc-2.25.so)
stress 9742 32110.960241: 1266918 cycles:ppp: 7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so)
#Reported-by: Bram Stolk
Tested-by: Bram Stolk
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Fixes: 6619a53ef757 ("perf record: Add --initial-delay option")
Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
07 Nov, 2017
1 commit
-
Conflicts:
tools/perf/arch/arm/annotate/instructions.c
tools/perf/arch/arm64/annotate/instructions.c
tools/perf/arch/powerpc/annotate/instructions.c
tools/perf/arch/s390/annotate/instructions.c
tools/perf/arch/x86/tests/intel-cqm.c
tools/perf/ui/tui/progress.c
tools/perf/util/zlib.cSigned-off-by: Ingo Molnar
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman