Eric Lee / smarc-fsl-linux-kernel

13 Nov, 2015

6 commits

092b1f0b5 perf probe: Clear probe_trace_event when add_probe_trace_event() fails ... Browse Code »

When probing with a glob, errors in add_probe_trace_event() won't be
passed to debuginfo__find_trace_events() because it would be modified by
probe_point_search_cb(). It causes a segfault if perf fails to find an
argument for a probe point matched by the glob. For example:

# ./perf probe -v -n 'SyS_dup? oldfd'
probe-definition(0): SyS_dup? oldfd
symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: oldfd into oldfd
1 arguments
Looking at the vmlinux_path (7 entries long)
Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: SyS_dup3
found inline addr: 0xffffffff812095c0
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
found inline addr: 0xffffffff812096d4
Probe point found: SyS_dup2+36
Searching 'oldfd' variable in context.
Failed to find 'oldfd' in this function.
Matched function: SyS_dup3
Probe point found: SyS_dup3+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Matched function: SyS_dup2
Probe point found: SyS_dup2+0
Searching 'oldfd' variable in context.
Converting variable oldfd into trace event.
oldfd type is long int.
Found 4 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
Segmentation fault (core dumped)
#

This patch ensures that add_probe_trace_event() doesn't touches
tf->ntevs and tf->tevs if those functions fail.

After the patch:

# perf probe 'SyS_dup? oldfd'
Failed to find 'oldfd' in this function.
Added new events:
probe:SyS_dup3 (on SyS_dup? with oldfd)
probe:SyS_dup3_1 (on SyS_dup? with oldfd)
probe:SyS_dup2 (on SyS_dup? with oldfd)

You can now use it in all perf tools, such as:

perf record -e probe:SyS_dup2 -aR sleep 1

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Masami Hiramatsu
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1447417761-156094-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-13 23:28:09 +0800
0196e787c perf probe: Fix memory leaking on failure by clearing all probe_trace_events ... Browse Code »

Fix memory leaking on the debuginfo__find_trace_events() failure path
which frees an array of probe_trace_events but doesn't clears all the
allocated sub-structures and strings.

So, before doing zfree(tevs), clear all the array elements which may
have allocated resources.

Reported-by: Wang Nan
Signed-off-by: Masami Hiramatsu
Cc: Alexei Starovoitov
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1447417761-156094-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2015-11-13 23:24:32 +0800
1216b65c5 perf buildid-list: Requires ordered events ... Browse Code »

'perf buildid-list' processes events to determine hits (i.e. with-hits
option). That may not work if events are not sorted in order. i.e. MMAP
events must be processed before the samples that depend on them so that
sample processing can 'hit' the DSO to which the MMAP refers.

Signed-off-by: Adrian Hunter
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1447408112-1920-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo

Adrian Hunter
2015-11-13 23:22:04 +0800
e266a753b perf symbols: Fix dso lookup by long name and missing buildids ... Browse Code »

Commit 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed
with rbtree") Added a tree to lookup dsos by long name. That tree gets
corrupted whenever a dso long name is changed because the tree is not
updated.

One effect of that is buildid-list does not work with the 'with-hits'
option because dso lookup fails and results in two structs for the same
dso. The first has the buildid but no hits, the second has hits but no
buildid. e.g.

Before:

$ tools/perf/perf record ls
arch certs CREDITS Documentation firmware include
ipc Kconfig lib Makefile net REPORTING-BUGS
scripts sound usr block COPYING crypto
drivers fs init Kbuild kernel MAINTAINERS
mm README samples security tools virt
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ]
$ tools/perf/perf buildid-list
574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
$ tools/perf/perf buildid-list -H
574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so

After:

$ tools/perf/perf buildid-list -H
574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so

The fix is to record the root of the tree on the dso so that
dso__set_long_name() can update the tree when the long name changes.

Signed-off-by: Adrian Hunter
Tested-by: Arnaldo Carvalho de Melo
Cc: Adrian Hunter
Cc: Don Zickus
Cc: Douglas Hatch
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Scott J Norton
Cc: Waiman Long
Fixes: 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed with rbtree")
Link: http://lkml.kernel.org/r/1447408112-1920-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo

Adrian Hunter
2015-11-13 22:14:36 +0800
2059fc7a5 perf symbols: Allow forcing reading of non-root owned files by root ... Browse Code »

When the root user tries to read a file owned by some other user we get:

# ls -la perf.data
-rw-------. 1 acme acme 20032 Nov 12 15:50 perf.data
# perf report
File perf.data not owned by current user or root (use -f to override)
# perf report -f | grep -v ^# | head -2
30.96% ls [kernel.vmlinux] [k] do_set_pte
28.24% ls libc-2.20.so [.] intel_check_word
#

That wasn't happening when the symbol code tried to read a JIT map,
where the same check was done but no forcing was possible, fix it.

Reported-by: Brendan Gregg
Tested-by: Brendan Gregg
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2380
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2015-11-13 05:58:18 +0800
866548dd6 perf symbols: Rebuild rbtree when adjusting symbols for kcore ... Browse Code »

Normally symbols are read from the DSO and adjusted, if need be, so that
the symbol start matches the file offset in the DSO file (we want the
file offset because that is what we know from MMAP events). That is done
by dso__load_sym() which inserts the symbols *after* adjusting them.

In the case of kcore, the symbols have been read from kallsyms and the
symbol start is the memory address. The symbols have to be adjusted to
match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
but now the adjustment is being done *after* the symbols have been
inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
changing the symbol start would not change the order in the rbtree -
which is, of course, not guaranteed.

Signed-off-by: Adrian Hunter
Tested-by: Wang Nan
Cc: Jiri Olsa
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/563CB241.2090701@intel.com
Signed-off-by: Arnaldo Carvalho de Melo

Adrian Hunter
2015-11-13 05:58:17 +0800

12 Nov, 2015

3 commits

421fd0845 perf probe: Verify parameters in two functions ... Browse Code »

On kernel with only one out of CONFIG_KPROBE_EVENTS and
CONFIG_UPROBE_EVENTS enabled, 'perf probe -d' causes a segfault because
perf_del_probe_events() calls probe_file__get_events() with a negative
fd.

This patch fixes it by adding parameter validation at the entry of
probe_file__get_events() and probe_file__get_rawlist(). Since they are
both non-static public functions (in .h file), parameter verifying is
required.

v1 -> v2: Verify fd at the head of probe_file__get_rawlist() instead of
checking at call site (suggested by Masami and Arnaldo at [1,2]).

[1] http://lkml.kernel.org/r/50399556C9727B4D88A595C8584AAB37526048E3@GSjpTKYDCembx32.service.hitachi.net
[2] http://lkml.kernel.org/r/20151105155830.GV13236@kernel.org

Signed-off-by: Wang Nan
Acked-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446803415-83382-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-12 05:41:32 +0800
e87b49116 perf session: Add missing newlines to some pr_err() calls ... Browse Code »

Before:

[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type[acme@zoo linux]$

After:

[acme@zoo linux]$ perf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type
[acme@zoo linux]$

Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-wscok3a2s7yrj8156oc2r6qe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2015-11-12 05:41:31 +0800
4a4c03c1b perf annotate: Support full source file paths for srcline fix ... Browse Code »

The --full-paths option did not show the full source file paths in the 'perf
annotate' tool, because the value of the option was not propagated into the
related functions.

With this patch the value of the --full-paths option is known to the function
that composes the srcline string, so it prints the full path when necessary.

Committer Note:

This affects annotate when the --print-line option is used:

# perf annotate -h 2>&1 | grep print-line
-l, --print-line print matching source lines (may be slow)

Looking just at the lines that should be affected by this change:

Before:

# perf annotate --print-line --full-paths --stdio fput | grep '\.[ch]:[0-9]\+'
94.44 atomic64_64.h:114
5.56 file_table.c:265
file_table.c:265 5.56 : ffffffff81219a00: callq ffffffff81769360
atomic64_64.h:114 94.44 : ffffffff81219a05: lock decq 0x38(%rdi)

After:

# perf annotate --print-line --full-paths --stdio fput | grep '\.[ch]:[0-9]\+'
94.44 /home/git/linux/arch/x86/include/asm/atomic64_64.h:114
5.56 /home/git/linux/fs/file_table.c:265
/home/git/linux/fs/file_table.c:265 5.56 : ffffffff81219a00: callq ffffffff81769360
/home/git/linux/arch/x86/include/asm/atomic64_64.h:114 94.44 : ffffffff81219a05: lock decq 0x38(%rdi)
#

Signed-off-by: Michael Petlan
Tested-by: Arnaldo Carvalho de Melo
Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2365
Signed-off-by: Arnaldo Carvalho de Melo

Michael Petlan
2015-11-12 05:41:30 +0800

07 Nov, 2015

4 commits

ba1fae431 perf test: Add 'perf test BPF' ... Browse Code »

This patch adds BPF testcase for testing BPF event filtering.

By utilizing the result of 'perf test LLVM', this patch compiles the
eBPF sample program then test its ability. The BPF script in 'perf test
LLVM' lets only 50% samples generated by epoll_pwait() to be captured.
This patch runs that system call for 111 times, so the result should
contain 56 samples.

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446817783-86722-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-07 04:50:03 +0800
d3e0ce393 perf bpf: Improve BPF related error messages ... Browse Code »

A series of bpf loader related error codes were introduced to help error
reporting. Functions were improved to return these new error codes.

Functions which return pointers were adjusted to encode error codes into
return value using the ERR_PTR() interface.

bpf_loader_strerror() was improved to convert these error messages to
strings. It checks the error codes and calls libbpf_strerror() and
strerror_r() accordingly, so caller don't need to consider checking the
range of the error code.

In bpf__strerror_load(), print kernel version of running kernel and the
object's 'version' section to notify user how to fix his/her program.

v1 -> v2:
Use macro for error code.

Fetch error message based on array index, eliminate for-loop.

Print version strings.

Before:

# perf record -e ./test_kversion_nomatch_program.o sleep 1
event syntax error: './test_kversion_nomatch_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP

After:

# perf record -e ./test_kversion_nomatch_program.o ls
event syntax error: './test_kversion_nomatch_program.o'
\___ 'version' (4.4.0) doesn't match running kernel (4.3.0)
SKIP

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446818289-87444-1-git-send-email-wangnan0@huawei.com
[ Add 'static inline' to bpf__strerror_prepare_load() when LIBBPF is disabled ]
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-07 04:14:20 +0800
07bc5c699 perf tools: Make fetch_kernel_version() publicly available ... Browse Code »

There are 2 places in llvm-utils.c which find kernel version information
through uname. This patch extracts the uname related code into a
fetch_kernel_version() function and puts it into util.h so it can be
reused.

Signed-off-by: Wang Nan
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446818135-87310-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-07 02:57:18 +0800
6371ca3b5 bpf tools: Improve libbpf error reporting ... Browse Code »

In this patch, a series of libbpf specific error numbers and
libbpf_strerror() are introduced to help reporting errors.

Functions are updated to pass correct the error number through the
CHECK_ERR() macro.

All users of bpf_object__open{_buffer}() and bpf_program__title() in
perf are modified accordingly. In addition, due to the error codes
changing, bpf__strerror_load() is also modified to use them.

bpf__strerror_head() is also changed accordingly so it can parse libbpf
errors. bpf_loader_strerror() is introduced for that purpose, and will
be improved by the following patch.

load_program() is improved not to dump log buffer if it is empty. log
buffer is also used to deduce whether the error was caused by an invalid
program or other problem.

v1 -> v2:

- Using macro for error code.

- Fetch error message based on array index, eliminate for-loop.

- Use log buffer to detect the reason of failure. 3 new error code
are introduced to replace LIBBPF_ERRNO__LOAD.

In v1:

# perf record -e ./test_ill_program.o ls
event syntax error: './test_ill_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP

# perf record -e ./test_kversion_nomatch_program.o ls
event syntax error: './test_kversion_nomatch_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP

# perf record -e ./test_big_program.o ls
event syntax error: './test_big_program.o'
\___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
SKIP

In v2:

# perf record -e ./test_ill_program.o ls
event syntax error: './test_ill_program.o'
\___ Kernel verifier blocks program loading
SKIP

# perf record -e ./test_kversion_nomatch_program.o
event syntax error: './test_kversion_nomatch_program.o'
\___ Incorrect kernel version
SKIP
(Will be further improved by following patches)

# perf record -e ./test_big_program.o
event syntax error: './test_big_program.o'
\___ Program too big
SKIP

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446817783-86722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-07 02:52:41 +0800

06 Nov, 2015

2 commits

0a62f6869 perf probe: Cleanup find_perf_probe_point_from_map to reduce redundancy ... Browse Code »

In find_perf_probe_point_from_map(), the 'ret' variable is initialized
with -ENOENT but overwritten by the return code of
kernel_get_symbol_address_by_name(), and after that it is re-initialized
with -ENOENT again.

Setting ret=-ENOENT twice looks a bit redundant. This avoids the
overwriting and just returns -ENOENT if some error happens to simplify
the code.

Signed-off-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Cc: Zefan Li
Link: http://lkml.kernel.org/n/tip-ufp1zgbktzmttcputozneomd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2015-11-06 21:47:33 +0800
62ec9b3f0 perf annotate: Inform the user about objdump failures in --stdio ... Browse Code »

When the browser fails to annotate it is difficult for users to find out
what went wrong.

Add some errors for objdump failures that are displayed in the UI.

Note it would be even better to handle these errors smarter, like
falling back to the binary when the debug info is somehow corrupted. But
for now just giving a better error is an improvement.

Committer note:

This works for --stdio, where errors just scroll by the screen:

# perf annotate --stdio intel_idle
Failure running objdump --start-address=0xffffffff81418290 --stop-address=0xffffffff814183ae -l -d --no-show-raw -S -C /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1 2>/dev/null|grep -v /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1|expand
Percent | Source code & Disassembly of vmlinux for cycles:pp
------------------------------------------------------------------

And with that one can use that command line to try to find out more about what
happened instead of getting a blank screen, an improvement.

We need tho to improve this further to get it to work with other UIs, like
--tui and --gtk, where it continues showing a blank screen, no messages, as
the pr_err() used is enough just for --stdio.

Signed-off-by: Andi Kleen
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1446779167-18949-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2015-11-06 21:20:48 +0800

05 Nov, 2015

5 commits

98d3b258e perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success ... Browse Code »

It is possible that find_perf_probe_point_from_map() fails to find a
symbol but still returns 0 because of an small error when coding:
find_perf_probe_point_from_map() set 'ret' to error code at first, but
also use it to hold return value of kernel_get_symbol_address_by_name().

This patch resets 'ret' to error even kernel_get_symbol_address_by_name()
success, so if !sym, the whole function returns error correctly.

Signed-off-by: Wang Nan
Cc: Jiri Olsa
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446729565-27592-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-05 23:47:52 +0800
4a4f66a1a perf llvm: Pass LINUX_VERSION_CODE to BPF program when compiling ... Browse Code »

Arnaldo suggests to make LINUX_VERSION_CODE works like __func__ and
__FILE__ so user don't need to care setting right linux version too
much. In this patch, perf llvm transfers LINUX_VERSION_CODE macro
through clang cmdline.

[1] http://lkml.kernel.org/r/20151029223744.GK2923@kernel.org

Committer notes:

Before, forgetting to update the version:

# uname -r
4.3.0-rc1+
# cat bpf.c
__attribute__((section("fork=_do_fork"), used))
int fork(void *ctx)
{
return 1;
}

char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = 0x40200;
#
# perf record -e bpf.c sleep 1
event syntax error: 'bpf.c'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
#

After:

# grep version bpf.c
int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
# perf record -e bpf.c sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data ]
# perf evlist -v
perf_bpf_probe:fork: type: 2, size: 112, config: 0x5ee, { sample_period,
sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
1, exclude_guest: 1, mmap2: 1, comm_exec: 1
#

Suggested-and-Tested-by: Arnaldo Carvalho de Melo
Signed-off-by: Wang Nan
Cc: Alexei Starovoitov
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446636007-239722-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-05 23:47:50 +0800
59f41af98 perf llvm: Pass number of configured CPUs to clang compiler ... Browse Code »

This patch introduces a new macro "__NR_CPUS__" to perf's embedded clang
compiler, which represent the number of configured CPUs in this system.
BPF programs can use this macro to create a map with the same number of
system CPUs. For example:

struct bpf_map_def SEC("maps") pmu_map = {
.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(u32),
.max_entries = __NR_CPUS__,
};

Signed-off-by: Wang Nan
Cc: Alexei Starovoitov
Cc: Namhyung Kim
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446636007-239722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-05 23:47:02 +0800
cb8382e05 perf tools: Insert split maps correctly into origin group ... Browse Code »

When new maps are cloned out of split map they are added into origin
map's group, but their groups pointer is not updated.

This could lead to a segfault, because map->groups is expected to be
always set as reported by Markus:

__map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
238 return __machine__kernel_map(map->groups->machine, map->type) =
(gdb) bt
#0 __map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
#1 0x00000000004393e4 in symbol_filter (map=map@entry=0x1abb7a0, sym=sym@entry
#2 0x00000000004fcd4d in dso__load_sym (dso=dso@entry=0x166dae0, map=map@entry
#3 0x00000000004a64e0 in dso__load (dso=0x166dae0, map=map@entry=0x1abb7a0, fi
#4 0x00000000004b941f in map__load (filter=0x4393c0 , map=groups pointer update. It takes no lock as opposed to existing
map_groups__insert, as maps__fixup_overlappings(), where it is being
called, already has the necessary lock held.

Using __map_groups__insert to add new maps after map split.

Reported-by: Markus Trippelsdorf
Signed-off-by: Jiri Olsa
Tested-by: Markus Trippelsdorf
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20151104140811.GA32664@krava.brq.redhat.com
Fixes: cfc5acd4c80b ("perf top: Filter symbols based on __map__is_kernel(map)")
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2015-11-05 22:39:38 +0800
4579ecc8b perf stat: Move sw clock metrics printout to stat-shadow ... Browse Code »

The sw clock metrics printing was missed in the earlier move to
stat-shadow of all the other metric printouts. Move it too.

v2: Fix metrics printing in this version to make bisect safe.

Signed-off-by: Andi Kleen
Acked-by: Jiri Olsa
Link: http://lkml.kernel.org/r/1446515428-7450-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2015-11-05 02:11:41 +0800

03 Nov, 2015

1 commit

7a0119468 perf bpf: Mute libbpf when '-v' not set ... Browse Code »

According to [1], libbpf should be muted. This patch reset info and
warning message level to ensure libbpf doesn't output anything even
if error happened.

[1] http://lkml.kernel.org/r/20151020151255.GF5119@kernel.org

Committer note:

Before:

Testing it with an incompatible kernel version in the .c file that
generated foo.o:

[root@zoo ~]# perf record -e /tmp/foo.o sleep 1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:

libbpf: -- END LOG --
libbpf: failed to load program 'fork=_do_fork'
libbpf: failed to load object '/tmp/foo.o'
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
[root@zoo ~]#

After:

[root@zoo ~]# perf record -e /tmp/foo.o sleep 1
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
[root@zoo ~]#

This, BTW, need fixing to emit a proper message by validating the
version in the foo.o "version" ELF section against the running kernel,
warning the user instead of asking the kernel to load a binary that it
will refuse due to unmatching kernel version.

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446547486-229499-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-11-03 23:06:04 +0800

30 Oct, 2015

3 commits

7ed4915ad perf unwind: Pass symbol source to libunwind ... Browse Code »

Even if --symfs is used to point to the debug binaries, we send in the
non-debug filenames to libunwind, which leads to libunwind not finding
the debug frame. Fix this by preferring the file in --symfs, if it is
available.

Signed-off-by: Rabin Vincent
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Rabin Vincent
Link: http://lkml.kernel.org/r/1446104978-26429-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo

Rabin Vincent
2015-10-30 04:48:38 +0800
d509db047 perf tools: Compile scriptlets to BPF objects when passing '.c' to --event ... Browse Code »

This patch provides infrastructure for passing source files to --event
directly using:

# perf record --event bpf-file.c command

This patch does following works:

1) Allow passing '.c' file to '--event'. parse_events_load_bpf() is
expanded to allow caller tell it whether the passed file is source
file or object.

2) llvm__compile_bpf() is called to compile the '.c' file, the result
is saved into memory. Use bpf_object__open_buffer() to load the
in-memory object.

Introduces a bpf-script-example.c so we can manually test it:

# perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1

Note that '--clang-opt' must put before '--event'.

Futher patches will merge it into a testcase so can be tested automatically.

Signed-off-by: Wang Nan
Acked-by: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-10-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-30 04:16:23 +0800
1f45b1d49 perf bpf: Attach eBPF filter to perf event ... Browse Code »

This is the final patch which makes basic BPF filter work. After
applying this patch, users are allowed to use BPF filter like:

# perf record --event ./hello_world.o ls

A bpf_fd field is appended to 'struct evsel', and setup during the
callback function add_bpf_event() for each 'probe_trace_event'.

PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF program to a newly
created perf event. The file descriptor of the eBPF program is passed to
perf record using previous patches, and stored into evsel->bpf_fd.

It is possible that different perf event are created for one kprobe
events for different CPUs. In this case, when trying to call the ioctl,
EEXIST will be return. This patch doesn't treat it as an error.

Committer note:

The bpf proggie used so far:

__attribute__((section("fork=_do_fork"), used))
int fork(void *ctx)
{
return 0;
}

char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = 0x40300;

failed to produce any samples, even with forks happening and it being
running in system wide mode.

That is because now the filter is being associated, and the code above
always returns zero, meaning that all forks will be probed but filtered
away ;-/

Change it to 'return 1;' instead and after that:

# trace --no-syscalls --event /tmp/foo.o
0.000 perf_bpf_probe:fork:(ffffffff8109be30))
2.333 perf_bpf_probe:fork:(ffffffff8109be30))
3.725 perf_bpf_probe:fork:(ffffffff8109be30))
4.550 perf_bpf_probe:fork:(ffffffff8109be30))
^C#

And it works with all tools, including 'perf trace'.

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-30 04:16:22 +0800

29 Oct, 2015

2 commits

4edf30e39 perf bpf: Collect perf_evsel in BPF object files ... Browse Code »

This patch creates a 'struct perf_evsel' for every probe in a BPF object
file(s) and fills 'struct evlist' with them. The previously introduced
dummy event is now removed. After this patch, the following command:

# perf record --event filter.o ls

Can trace on each of the probes defined in filter.o.

The core of this patch is bpf__foreach_tev(), which calls a callback
function for each 'struct probe_trace_event' event for a bpf program
with each associated file descriptors. The add_bpf_event() callback
creates evsels by calling parse_events_add_tracepoint().

Since bpf-loader.c will not be built if libbpf is turned off, an empty
bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.

Committer notes:

Before:

# /tmp/oldperf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.198 MB perf.data ]
# perf evlist
/tmp/foo.o
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1

I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch:

# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.210 MB perf.data ]
# perf evlist -v
perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
1, mmap2: 1, comm_exec: 1
#

We now have a PERF_TYPE_SOFTWARE (type: 1), but the config states 0x6bd,
which is how, after setting up the event via the kprobes interface, the
'perf_bpf_probe:fork' event is accessible via the perf_event_open
syscall. This is all transient, as soon as the 'perf record' session
ends, these probes will go away.

To see how it looks like, lets try doing a neverending session, one that
expects a control+C to end:

# perf record --event /tmp/foo.o -a

So, with that in place, we can use 'perf probe' to see what is in place:

# perf probe -l
perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c)

We also can use debugfs:

[root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
p:perf_bpf_probe/fork _text+638512

Ok, now lets stop and see if we got some forks:

[root@felicio linux]# perf record --event /tmp/foo.o -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]

[root@felicio linux]# perf script
sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)

Sure enough, we have 111 forks :-)

Callchains seems to work as well:

# perf report --stdio --no-child
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 562 of event 'perf_bpf_probe:fork'
# Event count (approx.): 562
#
# Overhead Command Shared Object Symbol
# ........ ........ ................ ............
#
44.66% sh [kernel.vmlinux] [k] _do_fork
|
---_do_fork
entry_SYSCALL_64_fastpath
__libc_fork
make_child

26.16% make [kernel.vmlinux] [k] _do_fork

#

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-29 00:11:59 +0800
1e5e3ee8f perf tools: Load eBPF object into kernel ... Browse Code »

This patch utilizes bpf_object__load() provided by libbpf to load all
objects into kernel.

Committer notes:

Testing it:

When using an incorrect kernel version number, i.e., having this in your
eBPF proggie:

int _version __attribute__((section("version"), used)) = 0x40100;

For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event
parsing time, to provide a better error report to the user:

# perf record --event /tmp/foo.o sleep 1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:

libbpf: -- END LOG --
libbpf: failed to load program 'fork=_do_fork'
libbpf: failed to load object '/tmp/foo.o'
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events

If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
kernel, the whole process goes thru:

# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data ]
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
#

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-29 00:09:50 +0800

28 Oct, 2015

10 commits

aa3abf30b perf tools: Create probe points for BPF programs ... Browse Code »

This patch introduces bpf__{un,}probe() functions to enable callers to
create kprobe points based on section names a BPF program. It parses the
section names in the program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work. The resuling 'struct perf_probe_event'
is stored into program private data for further using.

By utilizing the new probing API, this patch creates probe points during
event parsing.

To ensure probe points be removed correctly, register an atexit hook so
even perf quit through exit() bpf__clear() is still called, so probing
points are cleared. Note that bpf_clear() should be registered before
bpf__probe() is called, so failure of bpf__probe() can still trigger
bpf__clear() to remove probe points which are already probed.

strerror style error reporting scaffold is created by this patch.
bpf__strerror_probe() is the first error reporting function in
bpf-loader.c.

Committer note:

Trying it:

To build a test eBPF object file:

I am testing using a script I built from the 'perf test -v LLVM' output:

$ cat ~/bin/hello-ebpf
export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
export WORKING_DIR=/lib/modules/4.2.0/build
export CLANG_SOURCE=-
export CLANG_OPTIONS=-xc

OBJ=/tmp/foo.o
rm -f $OBJ
echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ

---

First asking to put a probe in a function not present in the kernel
(misses the initial _):

$ perf record --event /tmp/foo.o sleep 1
Probe point 'do_fork' not found.
event syntax error: '/tmp/foo.o'
\___ You need to check probing points in BPF file

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
$

---

Now, with "__attribute__((section("fork=_do_fork"), used)):

$ grep _do_fork /proc/kallsyms
ffffffff81099ab0 T _do_fork
$ perf record --event /tmp/foo.o sleep 1
Failed to open kprobe_events: Permission denied
event syntax error: '/tmp/foo.o'
\___ Permission denied

---

Cool, we need to provide some better hints, "kprobe_events" is too low
level, one doesn't strictly need to know the precise details of how
these things are put in place, so something that shows the command
needed to fix the permissions would be more helpful.

Lets try as root instead:

# perf record --event /tmp/foo.o sleep 1
Lowering default frequency rate to 1000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# perf evlist
/tmp/foo.o
[root@felicio ~]# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1

---

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-28 23:48:13 +0800
84c86ca12 perf tools: Enable passing bpf object file to --event ... Browse Code »

By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by an
eBPF object file. It calls parse_events_load_bpf() to load that file,
which uses bpf__prepare_load() and finally calls bpf_object__open() for
the object files.

After applying this patch, commands like:

# perf record --event foo.o sleep

become possible.

However, at this point it is unable to link any useful things onto the
evsel list because the creating of probe points and BPF program
attaching have not been implemented. Before real events are possible to
be extracted, to avoid perf report error because of empty evsel list,
this patch link a dummy evsel. The dummy event related code will be
removed when probing and extracting code is ready.

Commiter notes:

Using it:

$ ls -la foo.o
ls: cannot access foo.o: No such file or directory
$ perf record --event foo.o sleep
libbpf: failed to open foo.o: No such file or directory
event syntax error: 'foo.o'
\___ BPF object file 'foo.o' is invalid

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
$

$ file /tmp/build/perf/perf.o
/tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ perf record --event /tmp/build/perf/perf.o sleep
libbpf: /tmp/build/perf/perf.o is not an eBPF object file
event syntax error: '/tmp/build/perf/perf.o'
\___ BPF object file '/tmp/build/perf/perf.o' is invalid

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [] []
or: perf record [] -- []

-e, --event event selector. use 'perf list' to list available events
$

$ file /tmp/foo.o
/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
$ perf record --event /tmp/foo.o sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
$ perf evlist
/tmp/foo.o
$ perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$

So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.

$ perf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$

Signed-off-by: Wang Nan
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-28 23:48:12 +0800
69d262a93 perf ebpf: Add the libbpf glue ... Browse Code »

The 'bpf-loader.[ch]' files are introduced in this patch. Which will be
the interface between perf and libbpf. bpf__prepare_load() resides in
bpf-loader.c. Following patches will enrich these two files.

Signed-off-by: Wang Nan
Acked-by: Alexei Starovoitov
Cc: Brendan Gregg
Cc: Daniel Borkmann
Cc: David Ahern
Cc: He Kuang
Cc: Jiri Olsa
Cc: Kaixu Xia
Cc: Masami Hiramatsu
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-28 23:48:12 +0800
443f8c75e perf symbols: Fix endless loop in dso__split_kallsyms_for_kcore ... Browse Code »

Currently we split symbols based on the map comparison, but symbols are stored
within dso objects and maps could point into same dso objects (kernel maps).

Hence we could end up changing rbtree we are currently iterating and mess it
up. It's easily reproduced on s390x by running:

$ perf record -a -- sleep 3
$ perf buildid-list -i perf.data --with-hits

The fix is to compare dso objects instead.

Reported-by: Michael Petlan
Signed-off-by: Jiri Olsa
Acked-by: Adrian Hunter
Cc: Andi Kleen
Cc: Kan Liang
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2015-10-28 22:19:30 +0800
374ce938a perf tools: Enable pre-event inherit setting by config terms ... Browse Code »

This patch allows perf record setting event's attr.inherit bit by
config terms like:

# perf record -e cycles/no-inherit/ ...
# perf record -e cycles/inherit/ ...

So user can control inherit bit for each event separately.

In following example, a.out fork()s in main then do some complex
CPU intensive computations in both of its children.

Basic result with and without inherit:

# perf record -e cycles -e instructions ./a.out
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
# perf report --stdio
# ...
# Samples: 23K of event 'cycles'
# Event count (approx.): 23641752891
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30428312415

# perf record -i -e cycles -e instructions ./a.out
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
...
# Samples: 12K of event 'cycles'
# Event count (approx.): 11699501775
...
# Samples: 12K of event 'instructions'
# Event count (approx.): 15058023559

Cancel inherit for one event when globally enable:

# perf record -e cycles/no-inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
...
# Samples: 12K of event 'cycles/no-inherit/'
# Event count (approx.): 11895759282
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30668000441

Enable inherit for one event when globally disable:

# perf record -i -e cycles/inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
...
# Samples: 23K of event 'cycles/inherit/'
# Event count (approx.): 23285400229
...
# Samples: 11K of event 'instructions'
# Event count (approx.): 14969050259

Committer note:

One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:

# perf record -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#

So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:

# perf record --no-inherit -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1

No inherit bit set, then disabling it and setting just on the cycles event:

# perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
# perf evlist -v
cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#

We can see it as well in by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:

[root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8

Signed-off-by: Wang Nan
Acked-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Brendan Gregg
Cc: David S. Miller
Cc: Li Zefan
Cc: Peter Zijlstra
Cc: Zefan Li
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo

Wang Nan
2015-10-28 22:19:16 +0800
5baecbcd9 perf symbols: we can now read separate debug-info files based on a build ID ... Browse Code »

Recent GDB (at least on a vanilla Debian box) looks for debug information in

/usr/lib/debug/.build-id/nn/nnnnnnn

where nn/nnnnnn is the build-id of the stripped ELF binary. This is
documented here:

https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

This was not working in perf because we didn't read the build id until
AFTER we searched for the separate debug information file. This patch
reads the build ID and THEN does the search.

Signed-off-by: Dima Kogan
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo

Dima Kogan
2015-10-28 21:04:27 +0800
f2f309688 perf symbols: Fix type error when reading a build-id ... Browse Code »

This was benign, but wrong. The build-id should live in a char[], not a char*[]

Signed-off-by: Dima Kogan
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo

Dima Kogan
2015-10-28 21:02:00 +0800
f4efcce33 perf tools: Search for more options when passing args to -h ... Browse Code »

Recently 'perf -h' was made aware of arguments and would show
just the help for the arguments specified, but that required a strict
form, i.e.:

$ perf -h --tui

worked, but:

$ perf -h tui

didn't.

Make it support both cases and also look at the option help when neither
matches, so that he following examples works:

$ perf report -h interface

Usage: perf report []

--gtk Use the GTK2 interface
--stdio Use the stdio interface
--tui Use the TUI interface

$ perf report -h stack

Usage: perf report []

-g, --call-graph
Display call graph (stack chain/backtrace):

print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold ()
print_limit: maximum number of call graph entry ()
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)

Default: graph,0.5,caller,function
--max-stack Set the maximum stack depth when parsing the
callchain, anything beyond the specified depth
will be ignored. Default: 127
$

Suggested-by: Ingo Molnar
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Brendan Gregg
Cc: Chandler Carruth
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2015-10-28 04:35:58 +0800
2322f573f perf cpu_map: Add cpu_map__empty_new function ... Browse Code »

Adding cpu_map__empty_new interface to create empty cpumap with given
size. The cpumap entries are initialized with -1.

It'll be used for caching cpu_map in following patches.

Signed-off-by: Jiri Olsa
Tested-by: Kan Liang
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2015-10-28 02:05:36 +0800
af3399817 perf evsel: Move id_offset out of struct perf_evsel union member ... Browse Code »

Because the 'perf stat record' patches will use the id_offset member
together with the priv pointer.

Signed-off-by: Jiri Olsa
Tested-by: Kan Liang
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2015-10-28 02:04:29 +0800

27 Oct, 2015

4 commits

c71183697 perf tools: Introduce usage_with_options_msg() ... Browse Code »

Now usage_with_options() setup a pager before printing message so normal
printf() or pr_err() will not be shown. The usage_with_options_msg()
can be used to print some help message before usage strings.

Signed-off-by: Namhyung Kim
Acked-by: Masami Hiramatsu
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2015-10-27 20:28:44 +0800
01b19455c perf tools: Setup pager when printing usage and help ... Browse Code »

It's annoying to see error or help message when command has many options
like in perf record, report or top. So setup pager when print parser
error or help message - it should be OK since no UI is enabled at the
parsing time. The usage_with_options() already disables it by calling
exit_browser() anyway.

Signed-off-by: Namhyung Kim
Acked-by: Ingo Molnar
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445701767-12731-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2015-10-27 01:08:48 +0800
b272a59d8 perf report: Rename to --show-cpu-utilization ... Browse Code »

So that it can be more consistent with other --show-* options. The old
name (--showcpuutilization) is provided only for compatibility.

Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445701767-12731-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2015-10-27 01:06:04 +0800
a5f4a6932 perf tools: Improve ambiguous option help message ... Browse Code »

Currently if an option name is ambiguous it only prints first two
matched option names but no help. It'd be better it could show all
possible names and help messages too.

Before:
$ perf report --show
Error: Ambiguous option: show (could be --show-total-period or
--show-ref-call-graph)
Usage: perf report []

After:
$ perf report --show
Error: Ambiguous option: show (could be --show-total-period or
--show-ref-call-graph)
Usage: perf report []

-n, --show-nr-samples
Show a column with the number of samples
--showcpuutilization
Show sample percentage for different cpu modes
-I, --show-info Display extended information about perf.data file
--show-total-period
Show a column with the sum of periods
--show-ref-call-graph
Show callgraph from reference event

Signed-off-by: Namhyung Kim
Acked-by: Ingo Molnar
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1445701767-12731-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2015-10-27 00:59:06 +0800