13 Nov, 2015

6 commits

  • When probing with a glob, errors in add_probe_trace_event() are not
    passed on to debuginfo__find_trace_events() because the return value is
    overwritten by probe_point_search_cb(). This causes a segfault if perf
    fails to find an argument for a probe point matched by the glob. For
    example:

    # ./perf probe -v -n 'SyS_dup? oldfd'
    probe-definition(0): SyS_dup? oldfd
    symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
    parsing arg: oldfd into oldfd
    1 arguments
    Looking at the vmlinux_path (7 entries long)
    Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
    Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
    Try to find probe point from debuginfo.
    Matched function: SyS_dup3
    found inline addr: 0xffffffff812095c0
    Probe point found: SyS_dup3+0
    Searching 'oldfd' variable in context.
    Converting variable oldfd into trace event.
    oldfd type is long int.
    found inline addr: 0xffffffff812096d4
    Probe point found: SyS_dup2+36
    Searching 'oldfd' variable in context.
    Failed to find 'oldfd' in this function.
    Matched function: SyS_dup3
    Probe point found: SyS_dup3+0
    Searching 'oldfd' variable in context.
    Converting variable oldfd into trace event.
    oldfd type is long int.
    Matched function: SyS_dup2
    Probe point found: SyS_dup2+0
    Searching 'oldfd' variable in context.
    Converting variable oldfd into trace event.
    oldfd type is long int.
    Found 4 probe_trace_events.
    Opening /sys/kernel/debug/tracing//kprobe_events write=1
    Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
    Segmentation fault (core dumped)
    #

    This patch ensures that add_probe_trace_event() doesn't touch
    tf->ntevs and tf->tevs if any of the functions it calls fails.
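
    The "don't touch the finder state on failure" idea can be sketched as
    below. This is only an illustration of the commit-on-success pattern, not
    the code from the patch: struct finder_sketch, add_event_sketch() and the
    convert callbacks are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical, simplified stand-in for the finder state (tevs, ntevs). */
struct finder_sketch {
	long tevs[MAX_EVENTS];
	int ntevs;
};

/*
 * Build the new event in a local variable and publish it only once every
 * step has succeeded, so a failure leaves tevs[] and ntevs as they were.
 */
static int add_event_sketch(struct finder_sketch *tf, long addr,
			    int (*convert)(long addr, long *out))
{
	long tev;
	int ret;

	assert(tf != NULL && convert != NULL);

	if (tf->ntevs == MAX_EVENTS)
		return -ERANGE;

	ret = convert(addr, &tev);	/* may fail, e.g. argument not found */
	if (ret < 0)
		return ret;		/* nothing in *tf was modified */

	tf->tevs[tf->ntevs++] = tev;	/* commit only on success */
	return 0;
}

/* Example converters: one that succeeds, one that fails. */
static int convert_ok(long addr, long *out)
{
	*out = addr;
	return 0;
}

static int convert_fail(long addr, long *out)
{
	(void)addr;
	(void)out;
	return -ENOENT;
}
```

    With this shape, a failing match in the middle of a glob leaves the
    entries collected so far valid and the counter consistent with them.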

    After the patch:

    # perf probe 'SyS_dup? oldfd'
    Failed to find 'oldfd' in this function.
    Added new events:
    probe:SyS_dup3 (on SyS_dup? with oldfd)
    probe:SyS_dup3_1 (on SyS_dup? with oldfd)
    probe:SyS_dup2 (on SyS_dup? with oldfd)

    You can now use it in all perf tools, such as:

    perf record -e probe:SyS_dup2 -aR sleep 1

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Masami Hiramatsu
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1447417761-156094-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Fix a memory leak on the debuginfo__find_trace_events() failure path,
    which frees an array of probe_trace_events but doesn't clear all the
    allocated sub-structures and strings.

    So, before doing zfree(tevs), clear all the array elements which may
    have allocated resources.
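
    A minimal sketch of that cleanup order, assuming a reduced element type
    that owns two strings. struct tev_sketch and both helpers are
    hypothetical; the real structure has more members.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of a trace event that owns two strings. */
struct tev_sketch {
	char *group;
	char *event;
};

/* Release everything one element owns and reset it. */
static void clear_tev_sketch(struct tev_sketch *tev)
{
	free(tev->group);
	free(tev->event);
	memset(tev, 0, sizeof(*tev));
}

/* Failure path: clear each element first, then free the array itself. */
static void free_tevs_sketch(struct tev_sketch **tevs, int ntevs)
{
	int i;

	assert(tevs != NULL);

	for (i = 0; i < ntevs; i++)
		clear_tev_sketch(&(*tevs)[i]);
	free(*tevs);
	*tevs = NULL;	/* free and reset the pointer, as zfree() does */
}
```

    Freeing only the array would leak every string still referenced by its
    elements, which is the leak described above.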

    Reported-by: Wang Nan
    Signed-off-by: Masami Hiramatsu
    Cc: Alexei Starovoitov
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1447417761-156094-2-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu
     
  • 'perf buildid-list' processes events to determine hits (i.e. with the
    with-hits option). That may not work if events are not sorted in order,
    i.e. MMAP events must be processed before the samples that depend on them
    so that sample processing can 'hit' the DSO to which the MMAP refers.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1447408112-1920-3-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Commit 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed
    with rbtree") added a tree to look up dsos by long name. That tree gets
    corrupted whenever a dso long name is changed because the tree is not
    updated.

    One effect of that is buildid-list does not work with the 'with-hits'
    option because dso lookup fails and results in two structs for the same
    dso. The first has the buildid but no hits, the second has hits but no
    buildid. e.g.

    Before:

    $ tools/perf/perf record ls
    arch certs CREDITS Documentation firmware include
    ipc Kconfig lib Makefile net REPORTING-BUGS
    scripts sound usr block COPYING crypto
    drivers fs init Kbuild kernel MAINTAINERS
    mm README samples security tools virt
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ]
    $ tools/perf/perf buildid-list
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
    $ tools/perf/perf buildid-list -H
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so

    After:

    $ tools/perf/perf buildid-list -H
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so

    The fix is to record the root of the tree on the dso so that
    dso__set_long_name() can update the tree when the long name changes.
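
    The principle is that an element stored in a sorted container has to be
    unlinked, renamed and relinked, and to do that it must know which
    container it lives in. The sketch below illustrates this under loud
    assumptions: a sorted singly linked list stands in for the rbtree, and
    every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dso_list_sketch;

/* Hypothetical dso: remembers the container it is sorted in. */
struct dso_sketch {
	const char *long_name;
	struct dso_sketch *next;
	struct dso_list_sketch *owner;	/* NULL when not in any container */
};

/* A sorted singly linked list plays the role of the rbtree here. */
struct dso_list_sketch {
	struct dso_sketch *head;
};

static void list_insert_sketch(struct dso_list_sketch *list,
			       struct dso_sketch *dso)
{
	struct dso_sketch **p = &list->head;

	assert(list != NULL && dso != NULL);

	while (*p && strcmp((*p)->long_name, dso->long_name) < 0)
		p = &(*p)->next;
	dso->next = *p;
	*p = dso;
	dso->owner = list;
}

static void list_remove_sketch(struct dso_list_sketch *list,
			       struct dso_sketch *dso)
{
	struct dso_sketch **p = &list->head;

	while (*p && *p != dso)
		p = &(*p)->next;
	if (*p)
		*p = dso->next;
	dso->next = NULL;
	dso->owner = NULL;
}

/* Rename while keeping the container sorted: unlink, change key, relink. */
static void dso_set_long_name_sketch(struct dso_sketch *dso, const char *name)
{
	struct dso_list_sketch *owner = dso->owner;

	if (owner)
		list_remove_sketch(owner, dso);
	dso->long_name = name;
	if (owner)
		list_insert_sketch(owner, dso);
}

/* Returns 1 if the names appear in ascending order. */
static int list_is_sorted_sketch(const struct dso_list_sketch *list)
{
	const struct dso_sketch *d;

	for (d = list->head; d && d->next; d = d->next)
		if (strcmp(d->long_name, d->next->long_name) > 0)
			return 0;
	return 1;
}
```

    Changing long_name in place, without the unlink and relink, would leave
    the element at a position that no longer matches its key, so later
    lookups by name could miss it.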

    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Don Zickus
    Cc: Douglas Hatch
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Waiman Long
    Fixes: 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed with rbtree")
    Link: http://lkml.kernel.org/r/1447408112-1920-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • When the root user tries to read a file owned by some other user we get:

    # ls -la perf.data
    -rw-------. 1 acme acme 20032 Nov 12 15:50 perf.data
    # perf report
    File perf.data not owned by current user or root (use -f to override)
    # perf report -f | grep -v ^# | head -2
    30.96% ls [kernel.vmlinux] [k] do_set_pte
    28.24% ls libc-2.20.so [.] intel_check_word
    #

    Forcing was not possible when the symbol code tried to read a JIT map:
    the same ownership check was done there, but there was no way to
    override it. Fix it.

    Reported-by: Brendan Gregg
    Tested-by: Brendan Gregg
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2380
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Normally symbols are read from the DSO and adjusted, if need be, so that
    the symbol start matches the file offset in the DSO file (we want the
    file offset because that is what we know from MMAP events). That is done
    by dso__load_sym() which inserts the symbols *after* adjusting them.

    In the case of kcore, the symbols have been read from kallsyms and the
    symbol start is the memory address. The symbols have to be adjusted to
    match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
    but now the adjustment is being done *after* the symbols have been
    inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
    changing the symbol start would not change the order in the rbtree -
    which is, of course, not guaranteed.

    Signed-off-by: Adrian Hunter
    Tested-by: Wang Nan
    Cc: Jiri Olsa
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/563CB241.2090701@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

12 Nov, 2015

3 commits

  • On a kernel with only one of CONFIG_KPROBE_EVENTS and
    CONFIG_UPROBE_EVENTS enabled, 'perf probe -d' causes a segfault because
    perf_del_probe_events() calls probe_file__get_events() with a negative
    fd.

    This patch fixes it by adding parameter validation at the entry of
    probe_file__get_events() and probe_file__get_rawlist(). Since they are
    both non-static public functions (declared in a .h file), parameter
    validation is required.

    v1 -> v2: Verify fd at the head of probe_file__get_rawlist() instead of
    checking at call site (suggested by Masami and Arnaldo at [1,2]).

    [1] http://lkml.kernel.org/r/50399556C9727B4D88A595C8584AAB37526048E3@GSjpTKYDCembx32.service.hitachi.net
    [2] http://lkml.kernel.org/r/20151105155830.GV13236@kernel.org

    Signed-off-by: Wang Nan
    Acked-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446803415-83382-1-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Before:

    [acme@zoo linux]$ perf evlist
    WARNING: The perf.data file's data size field is 0 which is unexpected.
    Was the 'perf record' command properly terminated?
    non matching sample_type[acme@zoo linux]$

    After:

    [acme@zoo linux]$ perf evlist
    WARNING: The perf.data file's data size field is 0 which is unexpected.
    Was the 'perf record' command properly terminated?
    non matching sample_type
    [acme@zoo linux]$

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-wscok3a2s7yrj8156oc2r6qe@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The --full-paths option did not show the full source file paths in the 'perf
    annotate' tool, because the value of the option was not propagated into the
    related functions.

    With this patch the value of the --full-paths option is known to the function
    that composes the srcline string, so it prints the full path when necessary.

    Committer Note:

    This affects annotate when the --print-line option is used:

    # perf annotate -h 2>&1 | grep print-line
    -l, --print-line print matching source lines (may be slow)

    Looking just at the lines that should be affected by this change:

    Before:

    # perf annotate --print-line --full-paths --stdio fput | grep '\.[ch]:[0-9]\+'
    94.44 atomic64_64.h:114
    5.56 file_table.c:265
    file_table.c:265 5.56 : ffffffff81219a00: callq ffffffff81769360
    atomic64_64.h:114 94.44 : ffffffff81219a05: lock decq 0x38(%rdi)

    After:

    # perf annotate --print-line --full-paths --stdio fput | grep '\.[ch]:[0-9]\+'
    94.44 /home/git/linux/arch/x86/include/asm/atomic64_64.h:114
    5.56 /home/git/linux/fs/file_table.c:265
    /home/git/linux/fs/file_table.c:265 5.56 : ffffffff81219a00: callq ffffffff81769360
    /home/git/linux/arch/x86/include/asm/atomic64_64.h:114 94.44 : ffffffff81219a05: lock decq 0x38(%rdi)
    #

    Signed-off-by: Michael Petlan
    Tested-by: Arnaldo Carvalho de Melo
    Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2365
    Signed-off-by: Arnaldo Carvalho de Melo

    Michael Petlan
     

07 Nov, 2015

4 commits

  • This patch adds a BPF testcase for testing BPF event filtering.

    By utilizing the result of 'perf test LLVM', this patch compiles the
    eBPF sample program and then tests its behavior. The BPF script in 'perf
    test LLVM' lets only 50% of the samples generated by epoll_pwait() be
    captured. This patch runs that system call 111 times, so the result
    should contain 56 samples.
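
    The expected count follows from (111 + 1) / 2 = 56 if the filter passes
    every other call and the first call is one that passes. The plain C
    simulation below checks that arithmetic. It is not the BPF script itself:
    struct flip_filter and both functions are hypothetical, and "first call
    passes" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the BPF filter: a flag that flips on each call. */
struct flip_filter {
	int pass;	/* 1: let the next sample through, 0: drop it */
};

static int flip_filter_run(struct flip_filter *f)
{
	int pass;

	assert(f != NULL);

	pass = f->pass;
	f->pass = !f->pass;	/* alternate between passing and dropping */
	return pass;
}

/* Count how many of nr_calls invocations the filter lets through. */
static int count_samples(int nr_calls)
{
	struct flip_filter f = { .pass = 1 };	/* assumption: first call passes */
	int i, samples = 0;

	for (i = 0; i < nr_calls; i++)
		samples += flip_filter_run(&f);
	return samples;
}
```

    An odd number of calls makes the expected count sensitive to which call
    passes first, which is what lets the test tell 56 from 55.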

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446817783-86722-8-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • A series of bpf loader related error codes were introduced to help error
    reporting. Functions were improved to return these new error codes.

    Functions which return pointers were adjusted to encode error codes into
    return value using the ERR_PTR() interface.
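
    For readers unfamiliar with the ERR_PTR() convention, here is a
    self-contained userspace sketch of the idea: small negative error codes
    are stored in the pointer value itself, in an address range no valid
    object occupies. The lower-case helpers and lookup_sketch() are
    hypothetical re-implementations for illustration, not the headers perf
    uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095	/* error codes occupy the top 4095 addresses */

/* Encode a negative error code as a pointer value. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)error;
}

/* Recover the error code from an encoded pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer value lies in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static int table_sketch[4];

/* Example: a pointer-returning function that can also report why it failed. */
static int *lookup_sketch(int idx)
{
	if (idx < 0 || idx >= 4)
		return err_ptr(-ENOENT);
	return &table_sketch[idx];
}
```

    A caller checks is_err() instead of comparing against NULL, and gets the
    reason from ptr_err() without needing a separate out-parameter.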

    bpf_loader_strerror() was improved to convert these error codes to
    strings. It checks the error codes and calls libbpf_strerror() and
    strerror_r() accordingly, so callers don't need to check the range of
    the error code.

    In bpf__strerror_load(), print the version of the running kernel and the
    object's 'version' section to tell users how to fix their program.

    v1 -> v2:
    Use macro for error code.

    Fetch error message based on array index, eliminate for-loop.

    Print version strings.

    Before:

    # perf record -e ./test_kversion_nomatch_program.o sleep 1
    event syntax error: './test_kversion_nomatch_program.o'
    \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
    SKIP

    After:

    # perf record -e ./test_kversion_nomatch_program.o ls
    event syntax error: './test_kversion_nomatch_program.o'
    \___ 'version' (4.4.0) doesn't match running kernel (4.3.0)
    SKIP

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446818289-87444-1-git-send-email-wangnan0@huawei.com
    [ Add 'static inline' to bpf__strerror_prepare_load() when LIBBPF is disabled ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • There are 2 places in llvm-utils.c which find kernel version information
    through uname. This patch extracts the uname related code into a
    fetch_kernel_version() function and puts it into util.h so it can be
    reused.
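
    A rough sketch of what such a helper might look like, assuming the
    release string starts with "major.minor.sublevel" and that the result is
    packed the same way as LINUX_VERSION_CODE. The function names and
    signatures below are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "4.3.0-rc4+" style strings into a packed version code. */
static int parse_kernel_version_sketch(const char *release, unsigned int *code)
{
	int version, patchlevel, sublevel;

	assert(release != NULL && code != NULL);

	if (sscanf(release, "%d.%d.%d", &version, &patchlevel, &sublevel) != 3)
		return -1;

	*code = (version << 16) + (patchlevel << 8) + sublevel;
	return 0;
}

/* Ask the running kernel for its release string and parse it. */
static int fetch_kernel_version_sketch(unsigned int *code)
{
	struct utsname uts;

	if (uname(&uts) < 0)
		return -1;

	return parse_kernel_version_sketch(uts.release, code);
}
```

    Keeping the parsing separate from the uname() call makes the string
    handling testable with fixed inputs.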

    Signed-off-by: Wang Nan
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446818135-87310-1-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • In this patch, a series of libbpf specific error numbers and
    libbpf_strerror() are introduced to help reporting errors.

    Functions are updated to pass the correct error number through the
    CHECK_ERR() macro.

    All users of bpf_object__open{_buffer}() and bpf_program__title() in
    perf are modified accordingly. In addition, due to the error codes
    changing, bpf__strerror_load() is also modified to use them.

    bpf__strerror_head() is also changed accordingly so it can parse libbpf
    errors. bpf_loader_strerror() is introduced for that purpose, and will
    be improved by the following patch.

    load_program() is improved so that it does not dump the log buffer if it
    is empty. The log buffer is also used to deduce whether the error was
    caused by an invalid program or by some other problem.

    v1 -> v2:

    - Using macro for error code.

    - Fetch error message based on array index, eliminate for-loop.

    - Use log buffer to detect the reason for the failure. Three new error
    codes are introduced to replace LIBBPF_ERRNO__LOAD.
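
    The "array index instead of for-loop" approach from the list above can
    be sketched as follows. All identifiers and the start value 4000 are
    placeholders for illustration. The three message strings are the ones
    shown in the v2 output of this commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ERRNO_START 4000	/* assumed start of the private range */

enum sketch_errno {
	SKETCH_ERRNO_VERIFY = SKETCH_ERRNO_START,
	SKETCH_ERRNO_KVER,
	SKETCH_ERRNO_PROG2BIG,
	SKETCH_ERRNO_END,
};

/* One message per code, indexed by (code - SKETCH_ERRNO_START). */
static const char *sketch_messages[SKETCH_ERRNO_END - SKETCH_ERRNO_START] = {
	[SKETCH_ERRNO_VERIFY - SKETCH_ERRNO_START] =
		"Kernel verifier blocks program loading",
	[SKETCH_ERRNO_KVER - SKETCH_ERRNO_START] =
		"Incorrect kernel version",
	[SKETCH_ERRNO_PROG2BIG - SKETCH_ERRNO_START] =
		"Program too big",
};

/* Accepts err or -err; returns 0 if a message was found, -1 otherwise. */
static int sketch_strerror(int err, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);

	err = err > 0 ? err : -err;

	if (err < SKETCH_ERRNO_START) {
		/* Ordinary errno value: defer to the C library. */
		snprintf(buf, size, "%s", strerror(err));
		return 0;
	}

	if (err < SKETCH_ERRNO_END) {
		/* Private code: direct table lookup, no loop needed. */
		snprintf(buf, size, "%s",
			 sketch_messages[err - SKETCH_ERRNO_START]);
		return 0;
	}

	snprintf(buf, size, "Unknown error %d", err);
	return -1;
}
```

    Keeping the private codes above the ordinary errno range lets one
    function serve both kinds of error with a single range check.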

    In v1:

    # perf record -e ./test_ill_program.o ls
    event syntax error: './test_ill_program.o'
    \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
    SKIP

    # perf record -e ./test_kversion_nomatch_program.o ls
    event syntax error: './test_kversion_nomatch_program.o'
    \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
    SKIP

    # perf record -e ./test_big_program.o ls
    event syntax error: './test_big_program.o'
    \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
    SKIP

    In v2:

    # perf record -e ./test_ill_program.o ls
    event syntax error: './test_ill_program.o'
    \___ Kernel verifier blocks program loading
    SKIP

    # perf record -e ./test_kversion_nomatch_program.o
    event syntax error: './test_kversion_nomatch_program.o'
    \___ Incorrect kernel version
    SKIP
    (Will be further improved by following patches)

    # perf record -e ./test_big_program.o
    event syntax error: './test_big_program.o'
    \___ Program too big
    SKIP

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446817783-86722-2-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

06 Nov, 2015

2 commits

  • In find_perf_probe_point_from_map(), the 'ret' variable is initialized
    with -ENOENT but overwritten by the return code of
    kernel_get_symbol_address_by_name(), and after that it is re-initialized
    with -ENOENT again.

    Setting ret=-ENOENT twice looks a bit redundant. This avoids the
    overwriting and just returns -ENOENT if some error happens to simplify
    the code.

    Signed-off-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Cc: Zefan Li
    Link: http://lkml.kernel.org/n/tip-ufp1zgbktzmttcputozneomd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu
     
  • When the browser fails to annotate it is difficult for users to find out
    what went wrong.

    Add some errors for objdump failures that are displayed in the UI.

    Note it would be even better to handle these errors more smartly, like
    falling back to the binary when the debug info is somehow corrupted. But
    for now just giving a better error is an improvement.

    Committer note:

    This works for --stdio, where errors just scroll by the screen:

    # perf annotate --stdio intel_idle
    Failure running objdump --start-address=0xffffffff81418290 --stop-address=0xffffffff814183ae -l -d --no-show-raw -S -C /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1 2>/dev/null|grep -v /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1|expand
    Percent | Source code & Disassembly of vmlinux for cycles:pp
    ------------------------------------------------------------------

    And with that one can use that command line to try to find out more about what
    happened instead of getting a blank screen, which is an improvement.

    We still need to improve this further to get it to work with other UIs, like
    --tui and --gtk, where it continues showing a blank screen with no messages, as
    the pr_err() used is enough only for --stdio.

    Signed-off-by: Andi Kleen
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1446779167-18949-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

05 Nov, 2015

5 commits

  • It is possible for find_perf_probe_point_from_map() to fail to find a
    symbol but still return 0 because of a small coding error:
    find_perf_probe_point_from_map() sets 'ret' to an error code at first, but
    also uses it to hold the return value of kernel_get_symbol_address_by_name().

    This patch resets 'ret' to an error code even if
    kernel_get_symbol_address_by_name() succeeds, so if !sym, the whole
    function returns an error correctly.
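
    The bug pattern is easy to reproduce in isolation. In the sketch below
    the two lookup helpers are hypothetical stubs: the address lookup always
    succeeds and the symbol lookup always fails. That is exactly the
    combination that exposed the problem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sym_sketch {
	const char *name;
};

/* Stub: the address lookup always succeeds. */
static int get_address_sketch(const char *name, unsigned long *addr)
{
	assert(name != NULL && addr != NULL);
	*addr = 0x1000;
	return 0;
}

/* Stub: the symbol lookup always fails. */
static struct sym_sketch *find_symbol_sketch(unsigned long addr)
{
	(void)addr;
	return NULL;
}

static int find_point_buggy(const char *name)
{
	struct sym_sketch *sym;
	unsigned long addr;
	int ret = -ENOENT;

	ret = get_address_sketch(name, &addr);	/* overwrites -ENOENT with 0 */
	if (ret)
		goto out;

	sym = find_symbol_sketch(addr);
	if (!sym)
		goto out;	/* ret is still 0: failure reported as success */

	ret = 0;
out:
	return ret;
}

static int find_point_fixed(const char *name)
{
	struct sym_sketch *sym;
	unsigned long addr;
	int ret;

	ret = get_address_sketch(name, &addr);
	if (ret)
		goto out;

	ret = -ENOENT;	/* reset before the next step that can fail */
	sym = find_symbol_sketch(addr);
	if (!sym)
		goto out;

	ret = 0;
out:
	return ret;
}
```

    With these stubs the buggy variant returns 0 and the fixed one returns
    -ENOENT.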

    Signed-off-by: Wang Nan
    Cc: Jiri Olsa
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446729565-27592-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Arnaldo suggested [1] making LINUX_VERSION_CODE work like __func__ and
    __FILE__ so that users don't need to worry much about setting the right
    Linux version. In this patch, perf llvm passes the LINUX_VERSION_CODE
    macro through the clang command line.
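
    As an illustration of what ends up on the compiler command line, the
    sketch below packs a version triple the way LINUX_VERSION_CODE is laid
    out and formats it as a -D option. Both function names are hypothetical,
    and how perf actually assembles its clang command line is not shown
    here.

```c
#include <assert.h>
#include <stdio.h>

/* Pack major/minor/sublevel as (major << 16) + (minor << 8) + sublevel. */
static unsigned int version_code_sketch(unsigned int version,
					unsigned int patchlevel,
					unsigned int sublevel)
{
	return (version << 16) + (patchlevel << 8) + sublevel;
}

/*
 * Format the define for the compiler command line.
 * Returns the string length on success, -1 if buf is too small.
 */
static int version_define_sketch(char *buf, size_t size, unsigned int code)
{
	int len;

	assert(buf != NULL && size > 0);

	len = snprintf(buf, size, "-DLINUX_VERSION_CODE=0x%x", code);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

    For kernel 4.2.0 this yields 0x40200, the same value that was hard-coded
    in the "before" example of this commit.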

    [1] http://lkml.kernel.org/r/20151029223744.GK2923@kernel.org

    Committer notes:

    Before, forgetting to update the version:

    # uname -r
    4.3.0-rc1+
    # cat bpf.c
    __attribute__((section("fork=_do_fork"), used))
    int fork(void *ctx)
    {
    return 1;
    }

    char _license[] __attribute__((section("license"), used)) = "GPL";
    int _version __attribute__((section("version"), used)) = 0x40200;
    #
    # perf record -e bpf.c sleep 1
    event syntax error: 'bpf.c'
    \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    #

    After:

    # grep version bpf.c
    int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
    # perf record -e bpf.c sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.017 MB perf.data ]
    # perf evlist -v
    perf_bpf_probe:fork: type: 2, size: 112, config: 0x5ee, { sample_period,
    sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
    inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
    1, exclude_guest: 1, mmap2: 1, comm_exec: 1
    #

    Suggested-and-Tested-by: Arnaldo Carvalho de Melo
    Signed-off-by: Wang Nan
    Cc: Alexei Starovoitov
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446636007-239722-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This patch introduces a new macro "__NR_CPUS__" to perf's embedded clang
    compiler, which represents the number of configured CPUs in this system.
    BPF programs can use this macro to create a map with as many entries as
    there are CPUs in the system. For example:

    struct bpf_map_def SEC("maps") pmu_map = {
    .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size = sizeof(int),
    .value_size = sizeof(u32),
    .max_entries = __NR_CPUS__,
    };

    Signed-off-by: Wang Nan
    Cc: Alexei Starovoitov
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446636007-239722-2-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • When new maps are cloned out of a split map they are added into the
    origin map's group, but their groups pointer is not updated.

    This could lead to a segfault, because map->groups is expected to be
    always set, as reported by Markus:

    __map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
    238 return __machine__kernel_map(map->groups->machine, map->type) =
    (gdb) bt
    #0 __map__is_kernel (map=map@entry=0x1abb7a0) at util/map.c:238
    #1 0x00000000004393e4 in symbol_filter (map=map@entry=0x1abb7a0, sym=sym@entry
    #2 0x00000000004fcd4d in dso__load_sym (dso=dso@entry=0x166dae0, map=map@entry
    #3 0x00000000004a64e0 in dso__load (dso=0x166dae0, map=map@entry=0x1abb7a0, fi
    #4 0x00000000004b941f in map__load (filter=0x4393c0 , map=

    Add a __map_groups__insert() function that adds a map into the groups
    together with the map's groups pointer update. It takes no lock as
    opposed to the existing map_groups__insert(), as
    maps__fixup_overlappings(), where it is being called, already has the
    necessary lock held.

    Use __map_groups__insert() to add new maps after a map split.
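
    The invariant being restored is that inserting a map and setting its
    back-pointer happen in one place. A minimal sketch with hypothetical,
    heavily reduced types (a plain linked list instead of the real
    containers, and no locking at all):

```c
#include <assert.h>
#include <stddef.h>

struct groups_sketch;

/* Hypothetical, reduced map: only the members needed for the illustration. */
struct map_sketch {
	struct map_sketch *next;
	struct groups_sketch *groups;	/* back-pointer to the owning groups */
};

struct groups_sketch {
	struct map_sketch *maps;
};

/*
 * Lock-free insert: the caller is assumed to already hold whatever lock
 * protects mg->maps. Linking and setting the back-pointer are done
 * together, so no inserted map is left with a NULL groups pointer.
 */
static void groups_insert_sketch(struct groups_sketch *mg,
				 struct map_sketch *map)
{
	assert(mg != NULL && map != NULL);

	map->next = mg->maps;
	mg->maps = map;
	map->groups = mg;
}
```

    Code that later dereferences map->groups, as __map__is_kernel() does in
    the backtrace above, can then rely on it being set for every map reached
    through the groups.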

    Reported-by: Markus Trippelsdorf
    Signed-off-by: Jiri Olsa
    Tested-by: Markus Trippelsdorf
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20151104140811.GA32664@krava.brq.redhat.com
    Fixes: cfc5acd4c80b ("perf top: Filter symbols based on __map__is_kernel(map)")
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • The sw clock metrics printing was missed in the earlier move to
    stat-shadow of all the other metric printouts. Move it too.

    v2: Fix metrics printing in this version to make bisect safe.

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1446515428-7450-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

03 Nov, 2015

1 commit

  • According to [1], libbpf should be muted. This patch resets the info and
    warning message levels to ensure libbpf doesn't output anything even
    if an error happens.

    [1] http://lkml.kernel.org/r/20151020151255.GF5119@kernel.org

    Committer note:

    Before:

    Testing it with an incompatible kernel version in the .c file that
    generated foo.o:

    [root@zoo ~]# perf record -e /tmp/foo.o sleep 1
    libbpf: load bpf program failed: Invalid argument
    libbpf: -- BEGIN DUMP LOG ---
    libbpf:

    libbpf: -- END LOG --
    libbpf: failed to load program 'fork=_do_fork'
    libbpf: failed to load object '/tmp/foo.o'
    event syntax error: '/tmp/foo.o'
    \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    [root@zoo ~]#

    After:

    [root@zoo ~]# perf record -e /tmp/foo.o sleep 1
    event syntax error: '/tmp/foo.o'
    \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    [root@zoo ~]#

    This, BTW, needs fixing to emit a proper message by validating the
    version in the foo.o "version" ELF section against the running kernel,
    warning the user instead of asking the kernel to load a binary that it
    will refuse due to a mismatched kernel version.

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446547486-229499-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

30 Oct, 2015

3 commits

  • Even if --symfs is used to point to the debug binaries, we send in the
    non-debug filenames to libunwind, which leads to libunwind not finding
    the debug frame. Fix this by preferring the file in --symfs, if it is
    available.

    Signed-off-by: Rabin Vincent
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Rabin Vincent
    Link: http://lkml.kernel.org/r/1446104978-26429-1-git-send-email-rabin.vincent@axis.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Rabin Vincent
     
  • This patch provides infrastructure for passing source files to --event
    directly using:

    # perf record --event bpf-file.c command

    This patch does the following:

    1) Allow passing a '.c' file to '--event'. parse_events_load_bpf() is
    expanded to allow the caller to tell it whether the passed file is a
    source file or an object.

    2) llvm__compile_bpf() is called to compile the '.c' file, and the
    result is saved into memory. bpf_object__open_buffer() is used to load
    the in-memory object.

    It also introduces a bpf-script-example.c so we can manually test it:

    # perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1

    Note that '--clang-opt' must be put before '--event'.

    Further patches will merge it into a testcase so it can be tested automatically.

    Signed-off-by: Wang Nan
    Acked-by: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-10-git-send-email-wangnan0@huawei.com
    Signed-off-by: He Kuang
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This is the final patch which makes the basic BPF filter work. After
    applying this patch, users can use a BPF filter like:

    # perf record --event ./hello_world.o ls

    A bpf_fd field is appended to 'struct evsel', and set up during the
    callback function add_bpf_event() for each 'probe_trace_event'.

    The PERF_EVENT_IOC_SET_BPF ioctl is used to attach an eBPF program to a
    newly created perf event. The file descriptor of the eBPF program is
    passed to perf record using previous patches, and stored into
    evsel->bpf_fd.

    It is possible that different perf events are created for one kprobe
    event for different CPUs. In this case, when trying to call the ioctl,
    EEXIST will be returned. This patch doesn't treat it as an error.
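
    The attach step with the EEXIST tolerance can be sketched as below. The
    PERF_EVENT_IOC_SET_BPF request comes from <linux/perf_event.h>, while
    attach_bpf_sketch() and its return convention are hypothetical and only
    illustrate the error policy described above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Attach an already loaded eBPF program (bpf_fd) to a perf event (perf_fd).
 * Returns 0 on success or when a program is already attached (EEXIST),
 * and a negative errno value for any other failure.
 */
static int attach_bpf_sketch(int perf_fd, int bpf_fd)
{
	/* The caller is expected to have loaded the program beforehand. */
	assert(bpf_fd >= 0);

	if (ioctl(perf_fd, PERF_EVENT_IOC_SET_BPF, bpf_fd) == 0)
		return 0;

	if (errno == EEXIST)
		return 0;	/* same kprobe event seen again, e.g. per CPU */

	return -errno;
}
```

    Any other errno is still reported, so a bad descriptor or a kernel
    without support for the ioctl is not silently ignored.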

    Committer note:

    The bpf proggie used so far:

    __attribute__((section("fork=_do_fork"), used))
    int fork(void *ctx)
    {
    return 0;
    }

    char _license[] __attribute__((section("license"), used)) = "GPL";
    int _version __attribute__((section("version"), used)) = 0x40300;

    failed to produce any samples, even with forks happening and the session
    running in system-wide mode.

    That is because now the filter is being associated, and the code above
    always returns zero, meaning that all forks will be probed but filtered
    away ;-/

    Change it to 'return 1;' instead and after that:

    # trace --no-syscalls --event /tmp/foo.o
    0.000 perf_bpf_probe:fork:(ffffffff8109be30))
    2.333 perf_bpf_probe:fork:(ffffffff8109be30))
    3.725 perf_bpf_probe:fork:(ffffffff8109be30))
    4.550 perf_bpf_probe:fork:(ffffffff8109be30))
    ^C#

    And it works with all tools, including 'perf trace'.

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

29 Oct, 2015

2 commits

  • This patch creates a 'struct perf_evsel' for every probe in the BPF
    object file(s) and fills 'struct evlist' with them. The previously
    introduced dummy event is now removed. After this patch, the following
    command:

    # perf record --event filter.o ls

    can trace each of the probes defined in filter.o.

    The core of this patch is bpf__foreach_tev(), which calls a callback
    function for each 'struct probe_trace_event' of a bpf program, together
    with its associated file descriptor. The add_bpf_event() callback
    creates evsels by calling parse_events_add_tracepoint().

    Since bpf-loader.c will not be built if libbpf is turned off, an empty
    bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.
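
    The stub technique looks roughly like this. The macro name
    HAVE_LIBBPF_SKETCH, the callback type and both functions are
    hypothetical. The point is only that callers compile unchanged whether
    or not the real implementation is built.

```c
#include <assert.h>
#include <stddef.h>

/* Callback invoked once per event, with the program's file descriptor. */
typedef int (*tev_cb_sketch_t)(const char *event, int prog_fd, void *arg);

#ifdef HAVE_LIBBPF_SKETCH
/* Real implementation lives in a separately built source file. */
int foreach_tev_sketch(tev_cb_sketch_t cb, void *arg);
#else
/* Feature disabled: an empty inline stub keeps callers building. */
static inline int foreach_tev_sketch(tev_cb_sketch_t cb, void *arg)
{
	(void)cb;
	(void)arg;
	return 0;	/* nothing to iterate over */
}
#endif

/* Example callback: count the events it is called for. */
static int count_event_cb(const char *event, int prog_fd, void *arg)
{
	int *count = arg;

	assert(count != NULL);
	(void)event;
	(void)prog_fd;

	(*count)++;
	return 0;
}
```

    Making the stub 'static inline' in the header avoids both a missing
    symbol at link time and unused-function warnings in files that include
    the header without calling it.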

    Committer notes:

    Before:

    # /tmp/oldperf record --event /tmp/foo.o -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.198 MB perf.data ]
    # perf evlist
    /tmp/foo.o
    # perf evlist -v
    /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
    sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
    inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
    exclude_guest: 1, mmap2: 1, comm_exec: 1

    I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
    PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch:

    # perf record --event /tmp/foo.o -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.210 MB perf.data ]
    # perf evlist -v
    perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
    sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
    inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
    1, mmap2: 1, comm_exec: 1
    #

    We now have a PERF_TYPE_TRACEPOINT (type: 2) event, and the config
    states 0x6bd, which is how, after setting up the event via the kprobes
    interface, the 'perf_bpf_probe:fork' event is accessible via the
    perf_event_open syscall. This is all transient: as soon as the 'perf
    record' session ends, these probes will go away.

    To see what it looks like, let's try a never-ending session, one that
    expects a control+C to end:

    # perf record --event /tmp/foo.o -a

    So, with that in place, we can use 'perf probe' to see what is in place:

    # perf probe -l
    perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c)

    We also can use debugfs:

    [root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
    p:perf_bpf_probe/fork _text+638512

    Ok, now let's stop and see if we got some forks:

    [root@felicio linux]# perf record --event /tmp/foo.o -a
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]

    [root@felicio linux]# perf script
    sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
    sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
    sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
    sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)

    Sure enough, we have 111 forks :-)

    Callchains seem to work as well:

    # perf report --stdio --no-child
    # To display the perf.data header info, please use --header/--header-only options.
    #
    # Total Lost Samples: 0
    #
    # Samples: 562 of event 'perf_bpf_probe:fork'
    # Event count (approx.): 562
    #
    # Overhead Command Shared Object Symbol
    # ........ ........ ................ ............
    #
    44.66% sh [kernel.vmlinux] [k] _do_fork
    |
    ---_do_fork
    entry_SYSCALL_64_fastpath
    __libc_fork
    make_child

    26.16% make [kernel.vmlinux] [k] _do_fork

    #

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This patch utilizes bpf_object__load() provided by libbpf to load all
    objects into the kernel.

    Committer notes:

    Testing it:

    When using an incorrect kernel version number, i.e., having this in your
    eBPF proggie:

    int _version __attribute__((section("version"), used)) = 0x40100;

    For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event
    parsing time, to provide a better error report to the user:

    # perf record --event /tmp/foo.o sleep 1
    libbpf: load bpf program failed: Invalid argument
    libbpf: -- BEGIN DUMP LOG ---
    libbpf:

    libbpf: -- END LOG --
    libbpf: failed to load program 'fork=_do_fork'
    libbpf: failed to load object '/tmp/foo.o'
    event syntax error: '/tmp/foo.o'
    \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events

    If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
    kernel, the whole process goes through:

    # perf record --event /tmp/foo.o -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.202 MB perf.data ]
    # perf evlist -v
    /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
    sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
    inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
    exclude_guest: 1, mmap2: 1, comm_exec: 1
    #
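
    The two version values used in this test follow the kernel's
    KERNEL_VERSION(a, b, c) encoding, (a << 16) + (b << 8) + c. A minimal
    sketch of that arithmetic; version_code() and version_matches() are
    hypothetical helpers written for illustration, not perf or libbpf
    functions:

```c
#include <assert.h>

/* Same arithmetic as the kernel's KERNEL_VERSION(a, b, c) macro. */
unsigned int version_code(unsigned int a, unsigned int b, unsigned int c)
{
	assert(b < 256 && c < 256);
	return (a << 16) + (b << 8) + c;
}

/* The load is expected to succeed only when both values are equal. */
int version_matches(unsigned int prog_version, unsigned int kernel_version)
{
	return prog_version == kernel_version;
}
```

    So 0x40100 encodes 4.1.0 and 0x40300 encodes 4.3.0, which is why only
    the latter is accepted on the 4.3.0-rc6+ kernel used here.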

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

28 Oct, 2015

10 commits

  • This patch introduces bpf__{un,}probe() functions to enable callers to
    create kprobe points based on the section names of a BPF program. It
    parses the section names in the program and creates corresponding
    'struct perf_probe_event' structures. The parse_perf_probe_command()
    function is used to do the main parsing work. The resulting 'struct
    perf_probe_event' is stored in the program's private data for later use.

    By utilizing the new probing API, this patch creates probe points during
    event parsing.

    To ensure probe points are removed correctly, register an atexit hook so
    that bpf__clear() is still called, and probe points are cleared, even if
    perf quits through exit(). Note that bpf__clear() should be registered
    before bpf__probe() is called, so that a failure of bpf__probe() can
    still trigger bpf__clear() to remove probe points which were already
    created.
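
    The ordering argument above can be sketched as follows. This is a
    simplified illustration with hypothetical names (clear_probes(),
    probe_all(), nr_probed); a NULL name simulates a probe that fails:

```c
#include <assert.h>
#include <stdlib.h>

int nr_probed;              /* probe points currently set up */
static int hook_registered; /* atexit hook installed already? */

/* Cleanup: remove whatever probe points were created so far. */
void clear_probes(void)
{
	nr_probed = 0;
}

/*
 * Register the cleanup hook first, then create the probes, so that a
 * failure part-way through (followed by exit()) still removes the
 * probes that were already created.
 */
int probe_all(const char *const *names, int nr)
{
	assert(names != NULL);

	if (!hook_registered) {
		if (atexit(clear_probes) != 0)
			return -1;
		hook_registered = 1;
	}

	for (int i = 0; i < nr; i++) {
		if (names[i] == NULL)	/* simulated probe failure */
			return -1;
		nr_probed++;
	}
	return 0;
}
```

    Had the hook been registered only after a successful probe_all(), the
    early return on failure would leave the first probes behind.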

    A strerror-style error reporting scaffold is created by this patch.
    bpf__strerror_probe() is the first error reporting function in
    bpf-loader.c.

    Committer note:

    Trying it:

    To build a test eBPF object file:

    I am testing using a script I built from the 'perf test -v LLVM' output:

    $ cat ~/bin/hello-ebpf
    export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
    export WORKING_DIR=/lib/modules/4.2.0/build
    export CLANG_SOURCE=-
    export CLANG_OPTIONS=-xc

    OBJ=/tmp/foo.o
    rm -f $OBJ
    echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
    clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ

    ---

    First asking to put a probe in a function not present in the kernel
    (misses the initial _):

    $ perf record --event /tmp/foo.o sleep 1
    Probe point 'do_fork' not found.
    event syntax error: '/tmp/foo.o'
    \___ You need to check probing points in BPF file

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    $

    ---

    Now, with '__attribute__((section("fork=_do_fork"), used))':

    $ grep _do_fork /proc/kallsyms
    ffffffff81099ab0 T _do_fork
    $ perf record --event /tmp/foo.o sleep 1
    Failed to open kprobe_events: Permission denied
    event syntax error: '/tmp/foo.o'
    \___ Permission denied

    ---

    Cool, we need to provide some better hints, "kprobe_events" is too low
    level, one doesn't strictly need to know the precise details of how
    these things are put in place, so something that shows the command
    needed to fix the permissions would be more helpful.

    Let's try as root instead:

    # perf record --event /tmp/foo.o sleep 1
    Lowering default frequency rate to 1000.
    Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.013 MB perf.data ]
    # perf evlist
    /tmp/foo.o
    [root@felicio ~]# perf evlist -v
    /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
    sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
    inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
    sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1

    ---

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • By introducing new rules in tools/perf/util/parse-events.[ly], this
    patch enables 'perf record --event bpf_file.o' to select events by an
    eBPF object file. It calls parse_events_load_bpf() to load that file,
    which uses bpf__prepare_load() and finally calls bpf_object__open() for
    the object files.

    After applying this patch, commands like:

    # perf record --event foo.o sleep

    become possible.

    However, at this point it cannot link anything useful onto the evsel
    list, because the creation of probe points and the attaching of BPF
    programs have not been implemented yet. Until real events can be
    extracted, this patch links a dummy evsel to avoid a perf report error
    caused by an empty evsel list. The dummy event related code will be
    removed when the probing and extracting code is ready.

    Committer notes:

    Using it:

    $ ls -la foo.o
    ls: cannot access foo.o: No such file or directory
    $ perf record --event foo.o sleep
    libbpf: failed to open foo.o: No such file or directory
    event syntax error: 'foo.o'
    \___ BPF object file 'foo.o' is invalid

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    $

    $ file /tmp/build/perf/perf.o
    /tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
    $ perf record --event /tmp/build/perf/perf.o sleep
    libbpf: /tmp/build/perf/perf.o is not an eBPF object file
    event syntax error: '/tmp/build/perf/perf.o'
    \___ BPF object file '/tmp/build/perf/perf.o' is invalid

    (add -v to see detail)
    Run 'perf list' for a list of valid events

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event event selector. use 'perf list' to list available events
    $

    $ file /tmp/foo.o
    /tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
    $ perf record --event /tmp/foo.o sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.013 MB perf.data ]
    $ perf evlist
    /tmp/foo.o
    $ perf evlist -v
    /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
    $

    So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.

    $ perf report --stdio
    Error:
    The perf.data file has no samples!
    # To display the perf.data header info, please use --header/--header-only options.
    #
    $

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • The 'bpf-loader.[ch]' files are introduced in this patch; they will be
    the interface between perf and libbpf. bpf__prepare_load() resides in
    bpf-loader.c. Following patches will enrich these two files.

    Signed-off-by: Wang Nan
    Acked-by: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Currently we split symbols based on the map comparison, but symbols are stored
    within dso objects and maps could point into the same dso objects (kernel maps).

    Hence we could end up changing the rbtree we are currently iterating and mess it
    up. It's easily reproduced on s390x by running:

    $ perf record -a -- sleep 3
    $ perf buildid-list -i perf.data --with-hits

    The fix is to compare dso objects instead.
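
    A minimal sketch of why the comparison has to look at the dso rather
    than at the map. The structs below are heavily simplified, hypothetical
    stand-ins for perf's own 'struct map' and 'struct dso':

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins: the dso owns the symbols, maps only refer to it. */
struct dso_sketch {
	const char *name;
};

struct map_sketch {
	unsigned long start;
	struct dso_sketch *dso;
};

/* Two distinct maps share symbol storage when they refer to the same dso. */
bool maps_share_symbols(const struct map_sketch *a, const struct map_sketch *b)
{
	assert(a != NULL && b != NULL);
	return a->dso == b->dso;
}
```

    Two kernel maps can be different objects and still return true here,
    which is the case a map-based comparison misses.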

    Reported-by: Michael Petlan
    Signed-off-by: Jiri Olsa
    Acked-by: Adrian Hunter
    Cc: Andi Kleen
    Cc: Kan Liang
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • This patch allows perf record to set an event's attr.inherit bit through
    config terms like:

    # perf record -e cycles/no-inherit/ ...
    # perf record -e cycles/inherit/ ...

    So the user can control the inherit bit for each event separately.
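
    The precedence between the global option and the per-event term, as the
    examples in this message suggest it, can be sketched like this. The enum
    and effective_inherit() are hypothetical names used for illustration,
    not the actual perf_evsel config code:

```c
#include <assert.h>
#include <stdbool.h>

/* Per-event config term: absent, "inherit", or "no-inherit". */
enum inherit_term {
	TERM_UNSET,
	TERM_INHERIT,
	TERM_NO_INHERIT,
};

/* A per-event term, when present, overrides the global setting. */
bool effective_inherit(bool global_no_inherit, enum inherit_term term)
{
	assert(term == TERM_UNSET || term == TERM_INHERIT ||
	       term == TERM_NO_INHERIT);

	if (term == TERM_INHERIT)
		return true;
	if (term == TERM_NO_INHERIT)
		return false;
	return !global_no_inherit;
}
```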

    In the following example, a.out fork()s in main and then does some
    complex, CPU-intensive computation in both of its children.

    Basic result with and without inherit:

    # perf record -e cycles -e instructions ./a.out
    [ perf record: Woken up 9 times to write data ]
    [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
    # perf report --stdio
    # ...
    # Samples: 23K of event 'cycles'
    # Event count (approx.): 23641752891
    ...
    # Samples: 24K of event 'instructions'
    # Event count (approx.): 30428312415

    # perf record -i -e cycles -e instructions ./a.out
    [ perf record: Woken up 5 times to write data ]
    [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
    ...
    # Samples: 12K of event 'cycles'
    # Event count (approx.): 11699501775
    ...
    # Samples: 12K of event 'instructions'
    # Event count (approx.): 15058023559

    Cancel inherit for one event when it is globally enabled:

    # perf record -e cycles/no-inherit/ -e instructions ./a.out
    [ perf record: Woken up 7 times to write data ]
    [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
    ...
    # Samples: 12K of event 'cycles/no-inherit/'
    # Event count (approx.): 11895759282
    ...
    # Samples: 24K of event 'instructions'
    # Event count (approx.): 30668000441

    Enable inherit for one event when it is globally disabled:

    # perf record -i -e cycles/inherit/ -e instructions ./a.out
    [ perf record: Woken up 7 times to write data ]
    [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
    ...
    # Samples: 23K of event 'cycles/inherit/'
    # Event count (approx.): 23285400229
    ...
    # Samples: 11K of event 'instructions'
    # Event count (approx.): 14969050259

    Committer note:

    One can check if the bit was set, in addition to seeing the result in
    the perf.data file size as above by doing one of:

    # perf record -e cycles -e instructions -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
    # perf evlist -v
    cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
    instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
    #

    So the inherit bit was set in both. Now, if we disable it globally using
    --no-inherit:

    # perf record --no-inherit -e cycles -e instructions -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
    # perf evlist -v
    cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
    instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1

    No inherit bit is set. Now, disabling it globally and setting it just on the cycles event:

    # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
    # perf evlist -v
    cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
    instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
    #

    We can see it as well by using a more verbose level of debug messages in
    the tool that sets up the perf_event_attr, 'perf record' in this case:

    [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|ID|CPU|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    task 1
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ------------------------------------------------------------
    sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
    sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
    sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
    sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    config 0x1
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|ID|CPU|PERIOD
    read_format ID
    disabled 1
    freq 1
    sample_id_all 1
    exclude_guest 1
    ------------------------------------------------------------
    sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8

    Signed-off-by: Wang Nan
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: David S. Miller
    Cc: Li Zefan
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
    [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Recent GDB (at least on a vanilla Debian box) looks for debug information in

    /usr/lib/debug/.build-id/nn/nnnnnnn

    where nn/nnnnnn is the build-id of the stripped ELF binary. This is
    documented here:

    https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

    This was not working in perf because we didn't read the build id until
    AFTER we searched for the separate debug information file. This patch
    reads the build ID and THEN does the search.
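
    Why the build ID has to be known first becomes clear when the path is
    built from it. A rough sketch, assuming the layout from the GDB
    documentation linked above (first byte as directory, the remaining
    bytes plus a ".debug" suffix as file name); build_id_debug_path() is a
    hypothetical helper, not a perf function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format <debugdir>/.build-id/nn/nnnn....debug from raw build-id bytes. */
int build_id_debug_path(const unsigned char *id, size_t len,
			char *buf, size_t size)
{
	static const char suffix[] = ".debug";
	size_t off;
	int n;

	assert(id != NULL && buf != NULL);
	if (len < 2)
		return -1;

	n = snprintf(buf, size, "/usr/lib/debug/.build-id/%02x/",
		     (unsigned int)id[0]);
	if (n < 0 || (size_t)n >= size)
		return -1;
	off = (size_t)n;

	for (size_t i = 1; i < len; i++) {
		n = snprintf(buf + off, size - off, "%02x",
			     (unsigned int)id[i]);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += (size_t)n;
	}

	if (sizeof(suffix) > size - off)
		return -1;
	memcpy(buf + off, suffix, sizeof(suffix)); /* includes the NUL */
	return 0;
}
```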

    Signed-off-by: Dima Kogan
    Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Dima Kogan
     
  • This was benign, but wrong. The build-id should live in a char[], not a char*[]

    Signed-off-by: Dima Kogan
    Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Dima Kogan
     
  • Recently 'perf -h' was made aware of arguments and would show
    just the help for the arguments specified, but that required a strict
    form, i.e.:

    $ perf -h --tui

    worked, but:

    $ perf -h tui

    didn't.

    Make it support both cases and also look at the option help when neither
    matches, so that the following examples work:

    $ perf report -h interface

    Usage: perf report [<options>]

    --gtk Use the GTK2 interface
    --stdio Use the stdio interface
    --tui Use the TUI interface

    $ perf report -h stack

    Usage: perf report [<options>]

    -g, --call-graph
    Display call graph (stack chain/backtrace):

    print_type: call graph printing style (graph|flat|fractal|none)
    threshold: minimum call graph inclusion threshold ()
    print_limit: maximum number of call graph entry ()
    order: call graph order (caller|callee)
    sort_key: call graph sort key (function|address)
    branch: include last branch info to call graph (branch)

    Default: graph,0.5,caller,function
    --max-stack Set the maximum stack depth when parsing the
    callchain, anything beyond the specified depth
    will be ignored. Default: 127
    $
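
    The lookup order described in this message can be sketched as below.
    This is an illustration only: struct opt_sketch and select_help() are
    hypothetical, only long option names are handled, and the real parser
    code is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct opt_sketch {
	const char *long_name;
	const char *help;
};

/* Strip leading dashes so "--tui" and "tui" compare equal. */
const char *skip_dashes(const char *s)
{
	while (*s == '-')
		s++;
	return s;
}

/*
 * Select options for 'arg': first by exact long-name match; only if no
 * name matches, fall back to a substring search of the help text.
 * 'out' must have room for 'nr' entries. Returns the number selected.
 */
size_t select_help(const struct opt_sketch *opts, size_t nr,
		   const char *arg, const struct opt_sketch **out)
{
	size_t found = 0;

	assert(opts != NULL && arg != NULL && out != NULL);
	arg = skip_dashes(arg);

	for (size_t i = 0; i < nr; i++)
		if (strcmp(opts[i].long_name, arg) == 0)
			out[found++] = &opts[i];
	if (found)
		return found;

	for (size_t i = 0; i < nr; i++)
		if (strstr(opts[i].help, arg) != NULL)
			out[found++] = &opts[i];
	return found;
}
```

    With the three interface options shown above, "--tui" and "tui" both
    select one entry, while "interface" selects all three through the help
    text.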

    Suggested-by: Ingo Molnar
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: Brendan Gregg
    Cc: Chandler Carruth
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Add the cpu_map__empty_new() interface to create an empty cpumap of a
    given size. The cpumap entries are initialized with -1.

    It'll be used for caching cpu_map in following patches.
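
    A minimal sketch of such a constructor, assuming a simplified map type
    with a flexible array member. struct cpu_map_sketch and
    cpu_map_sketch_new() are hypothetical names, not perf's actual
    'struct cpu_map' or cpu_map__empty_new():

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified cpu map: a count followed by that many cpu numbers. */
struct cpu_map_sketch {
	int nr;
	int map[];
};

/* Allocate a map with 'nr' entries, each set to -1 (no cpu assigned). */
struct cpu_map_sketch *cpu_map_sketch_new(int nr)
{
	struct cpu_map_sketch *m;

	assert(nr >= 0);
	m = malloc(sizeof(*m) + sizeof(int) * (size_t)nr);
	if (m == NULL)
		return NULL;

	m->nr = nr;
	for (int i = 0; i < nr; i++)
		m->map[i] = -1;
	return m;
}
```

    The caller owns the returned memory and releases it with free().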

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Because the 'perf stat record' patches will use the id_offset member
    together with the priv pointer.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

27 Oct, 2015

4 commits

  • Now usage_with_options() sets up a pager before printing its message, so
    output from a normal printf() or pr_err() will not be shown. The
    usage_with_options_msg() function can be used to print a help message
    before the usage strings.

    Signed-off-by: Namhyung Kim
    Acked-by: Masami Hiramatsu
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • It's annoying to see the error or help message when a command has many
    options, as in perf record, report or top. So set up a pager when
    printing a parser error or help message; this should be OK since no UI
    is enabled at parsing time. usage_with_options() already disables it by
    calling exit_browser() anyway.

    Signed-off-by: Namhyung Kim
    Acked-by: Ingo Molnar
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445701767-12731-3-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • So that it can be more consistent with other --show-* options. The old
    name (--showcpuutilization) is provided only for compatibility.

    Signed-off-by: Namhyung Kim
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445701767-12731-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Currently, if an option name is ambiguous, only the first two matching
    option names are printed, with no help. It would be better to show all
    possible names and their help messages too.

    Before:
    $ perf report --show
    Error: Ambiguous option: show (could be --show-total-period or
    --show-ref-call-graph)
    Usage: perf report [<options>]

    After:
    $ perf report --show
    Error: Ambiguous option: show (could be --show-total-period or
    --show-ref-call-graph)
    Usage: perf report [<options>]

    -n, --show-nr-samples
    Show a column with the number of samples
    --showcpuutilization
    Show sample percentage for different cpu modes
    -I, --show-info Display extended information about perf.data file
    --show-total-period
    Show a column with the sum of periods
    --show-ref-call-graph
    Show callgraph from reference event
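
    Collecting every candidate for an ambiguous prefix, rather than only
    the first two, can be sketched as a simple prefix scan.
    prefix_matches() is a hypothetical helper written for illustration,
    not the parse-options implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Store every name that starts with 'prefix' in 'out' (which must have
 * room for 'nr' entries) and return how many were found.
 */
size_t prefix_matches(const char *const *names, size_t nr,
		      const char *prefix, const char **out)
{
	size_t len, found = 0;

	assert(names != NULL && prefix != NULL && out != NULL);
	len = strlen(prefix);

	for (size_t i = 0; i < nr; i++)
		if (strncmp(names[i], prefix, len) == 0)
			out[found++] = names[i];
	return found;
}
```

    A result greater than one means the prefix is ambiguous, and the
    caller can then print the help text of each collected option.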

    Signed-off-by: Namhyung Kim
    Acked-by: Ingo Molnar
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445701767-12731-1-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim