12 Apr, 2016

9 commits

  • Instead receive a callchain_param pointer to configure callchain
    aspects, not doing so if NULL is passed.

    This will allow fine grained control over which evsels in an evlist
    gets callchains enabled.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The kernel parts are not that useful:

    # trace -m 512 -e nanosleep --call dwarf usleep 1
    0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0
    syscall_slow_exit_work ([kernel.kallsyms])
    do_syscall_64 ([kernel.kallsyms])
    return_from_SYSCALL_64 ([kernel.kallsyms])
    __nanosleep (/usr/lib64/libc-2.22.so)
    usleep (/usr/lib64/libc-2.22.so)
    main (/usr/bin/usleep)
    __libc_start_main (/usr/lib64/libc-2.22.so)
    _start (/usr/bin/usleep)
    #

    So lets just use perf_event_attr.exclude_callchain_kernel to avoid
    collecting it in the ring buffer:

    # trace -m 512 -e nanosleep --call dwarf usleep 1
    0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0
    __nanosleep (/usr/lib64/libc-2.22.so)
    usleep (/usr/lib64/libc-2.22.so)
    main (/usr/bin/usleep)
    __libc_start_main (/usr/lib64/libc-2.22.so)
    _start (/usr/bin/usleep)
    #

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • In 'perf trace' we're just interested in printing callchains, and we
    don't want to use the symbol_conf.use_callchain, so move the callchain
    part to a new method.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • As it receives a FILE, and its more than just the IP, which can even be
    requested not to be printed.

    For consistency with other similar methods in tools/perf/, name it as
    perf_evsel__fprintf_sym() and make it return the number of bytes
    printed, just like 'fprintf(3)'

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Now, one can print the call chain for every encountered sys_exit event,
    e.g.:

    $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep
    1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0
    syscall_slow_exit_work ([kernel.kallsyms])
    syscall_return_slowpath ([kernel.kallsyms])
    int_ret_from_sys_call ([kernel.kallsyms])
    __nanosleep (/usr/lib/libc-2.23.so)
    [unknown] (/usr/lib/libQt5Core.so.5.6.0)
    QThread::sleep (/usr/lib/libQt5Core.so.5.6.0)
    main (path/to/ex_sleep)
    __libc_start_main (/usr/lib/libc-2.23.so)
    _start (path/to/ex_sleep)

    Note that it is advised to increase the number of mmap pages to prevent
    event losses when using this new feature. Often, adding `-m 10M` to the
    `perf trace` invocation is enough.

    This feature is also available in strace when built with libunwind via
    `strace -k`. Performance wise, this solution is much better:

    $ time find path/to/linux &> /dev/null

    real 0m0.051s
    user 0m0.013s
    sys 0m0.037s

    $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null

    real 0m2.624s
    user 0m1.203s
    sys 0m1.333s

    $ time strace -k find path/to/linux &> /dev/null

    real 0m35.398s
    user 0m10.403s
    sys 0m23.173s

    Note that it is currently not possible to configure the print output.
    Adding such a feature, similar to what is available in `perf script` via
    its `--fields` knob can be added later on.

    Signed-off-by: Milian Wolff
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com
    [ Split from a larger patch, do not print the IP, left align,
    remove dup call symbol__init(), added man page entry ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • For callchains, etc where we want it to align just below the syscall
    name, for instance, in 'perf trace'

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • As this function will be used in 'perf trace'.

    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org
    [ Split from a larger patch ]
    Signed-off-by: Milian Wolff

    Milian Wolff
     
  • This patch removes the need to set a bpf-output event in cmdline. By
    referencing a map named '__bpf_stdout__', perf automatically creates an
    event for it.

    For example:

    # perf record -e ./test_bpf_trace.c usleep 100000
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]
    # perf script
    usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ...
    BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a
    0008: 42 50 46 20 65 76 65 6e BPF even
    0010: 74 21 00 00 t!..
    BPF string: "Raise a BPF event!"

    usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ...
    BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a
    0008: 42 50 46 20 65 76 65 6e BPF even
    0010: 74 21 00 00 t!..
    BPF string: "Raise a BPF event!"

    perf record -e ./test_bpf_trace.c usleep 100000

    equals to:

    perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \
    -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \
    usleep 100000

    Where test_bpf_trace.c is:

    /************************ BEGIN **************************/
    #include
    struct bpf_map_def {
    unsigned int type;
    unsigned int key_size;
    unsigned int value_size;
    unsigned int max_entries;
    };
    #define SEC(NAME) __attribute__((section(NAME), used))
    static u64 (*ktime_get_ns)(void) =
    (void *)BPF_FUNC_ktime_get_ns;
    static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
    (void *)BPF_FUNC_trace_printk;
    static int (*get_smp_processor_id)(void) =
    (void *)BPF_FUNC_get_smp_processor_id;
    static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
    (void *)BPF_FUNC_perf_event_output;

    struct bpf_map_def SEC("maps") __bpf_stdout__ = {
    .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size = sizeof(int),
    .value_size = sizeof(u32),
    .max_entries = __NR_CPUS__,
    };

    static inline int __attribute__((always_inline))
    func(void *ctx, int type)
    {
    char output_str[] = "Raise a BPF event!";
    char err_str[] = "BAD %d\n";
    int err;

    err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(),
    &output_str, sizeof(output_str));
    if (err)
    trace_printk(err_str, sizeof(err_str), err);
    return 1;
    }
    SEC("func_begin=sys_nanosleep")
    int func_begin(void *ctx) {return func(ctx, 1);}
    SEC("func_end=sys_nanosleep%return")
    int func_end(void *ctx) { return func(ctx, 2);}
    char _license[] SEC("license") = "GPL";
    int _version SEC("version") = LINUX_VERSION_CODE;
    /************************* END ***************************/

    Committer note:

    Testing with 'perf trace':

    # trace -e nanosleep --ev test_bpf_stdout.c usleep 1
    0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ...
    0.007 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
    0.069 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460
    Signed-off-by: Wang Nan
    Cc: Jiri Olsa
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This patch allows cloning bpf-output event configuration among multiple
    bpf scripts. If there exist a map named '__bpf_output__' and not
    configured using 'map:__bpf_output__.event=', this patch clones the
    configuration of another '__bpf_stdout__' map. For example, following
    command:

    # perf trace --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
    --ev ./test_bpf_trace2.c usleep 100000

    equals to:

    # perf trace --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
    --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
    usleep 100000

    Signed-off-by: Wang Nan
    Suggested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

08 Apr, 2016

13 commits

  • To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17):

    CC /tmp/build/perf/arch/x86/util/dwarf-regs.o
    arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=]
    static const struct pt_regs_offset x86_32_regoffset_table[] = {
    ^~~~~~~~~~~~~~~~~~~~~~
    cc1: all warnings being treated as errors

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The readdir() function is thread safe as long as just one thread uses a
    DIR, which is the case when parsing tracepoint event definitions, to
    avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
    instead of readdir_r().

    See: http://man7.org/linux/man-pages/man3/readdir.3.html

    "However, in modern implementations (including the glibc implementation),
    concurrent calls to readdir() that specify different directory streams
    are thread-safe. In cases where multiple threads must read from the
    same directory stream, using readdir() with external synchronization is
    still preferable to the use of the deprecated readdir_r(3) function."

    Noticed while building on a Fedora Rawhide docker container.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The readdir() function is thread safe as long as just one thread uses a
    DIR, which is the case when synthesizing events for pre-existing threads
    by traversing /proc, so, to avoid breaking the build with glibc-2.23.90
    (upcoming 2.24), use it instead of readdir_r().

    See: http://man7.org/linux/man-pages/man3/readdir.3.html

    "However, in modern implementations (including the glibc implementation),
    concurrent calls to readdir() that specify different directory streams
    are thread-safe. In cases where multiple threads must read from the
    same directory stream, using readdir() with external synchronization is
    still preferable to the use of the deprecated readdir_r(3) function."

    Noticed while building on a Fedora Rawhide docker container.

    CC /tmp/build/perf/util/event.o
    util/event.c: In function '__event__synthesize_thread':
    util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations]
    while (!readdir_r(tasks, &dirent, &next) && next) {
    ^~~~~
    In file included from /usr/include/features.h:368:0,
    from /usr/include/stdint.h:25,
    from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9,
    from /git/linux/tools/include/linux/types.h:6,
    from util/event.c:1:
    /usr/include/dirent.h:189:12: note: declared here

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The readdir() function is thread safe as long as just one thread uses a
    DIR, which is the case in thread_map, so, to avoid breaking the build
    with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().

    See: http://man7.org/linux/man-pages/man3/readdir.3.html

    "However, in modern implementations (including the glibc implementation),
    concurrent calls to readdir() that specify different directory streams
    are thread-safe. In cases where multiple threads must read from the
    same directory stream, using readdir() with external synchronization is
    still preferable to the use of the deprecated readdir_r(3) function."

    Noticed while building on a Fedora Rawhide docker container.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The readdir() function is thread safe as long as just one thread uses a
    DIR, which is the case in 'perf script', so, to avoid breaking the build
    with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().

    See: http://man7.org/linux/man-pages/man3/readdir.3.html

    "However, in modern implementations (including the glibc implementation),
    concurrent calls to readdir() that specify different directory streams
    are thread-safe. In cases where multiple threads must read from the
    same directory stream, using readdir() with external synchronization is
    still preferable to the use of the deprecated readdir_r(3) function."

    Noticed while building on a Fedora Rawhide docker container.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • He Kuang reported a problem that perf fails to get correct symbol on
    Android platform in [1]. The problem can be reproduced on normal x86_64
    platform. I will describe the reproducing steps in detail at the end of
    commit message.

    The reason of this problem is the missing of symbol adjustment for normal
    shared objects. In most of the cases skipping adjustment is okay. However,
    when '.text' section have different 'address' and 'offset' the result is wrong.
    I checked all shared objects in my working platform, only wine dll objects and
    debug objects (in .debug) have this problem. However, it is common on Android.
    For example:

    $ readelf -S ./libsurfaceflinger.so | grep \.text
    [10] .text PROGBITS 0000000000029030 00012030

    This patch enables symbol adjustment for dynamic objects so the symbol
    address got from elfutils would be adjusted correctly.

    Now nearly all types of ELF files should adjust symbols. Makes
    ss->adjust_symbols default to true.

    Steps to reproduce the problem:

    $ cat ./Makefile
    PWD := $(shell pwd)
    LDFLAGS += "-Wl,-rpath=$(PWD)"
    CFLAGS += -g
    main: main.c libbuggy.so
    libbuggy.so: buggy.c
    gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@
    clean:
    rm -rf main libbuggy.so *.o

    $ cat ./buggy.c
    int fib(int x)
    {
    return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2);
    }

    $ cat ./main.c
    #include

    extern int fib(int x);
    int main()
    {
    int i;

    for (i = 0; i < 40; i++)
    printf("%d\n", fib(i));
    return 0;
    }

    $ make
    $ perf record ./main
    ...
    $ perf report --stdio
    # Overhead Command Shared Object Symbol
    # ........ ....... ................. ...............................
    #
    14.97% main libbuggy.so [.] 0x000000000000066c
    8.68% main libbuggy.so [.] 0x00000000000006aa
    8.52% main libbuggy.so [.] fib@plt
    7.95% main libbuggy.so [.] 0x0000000000000664
    5.94% main libbuggy.so [.] 0x00000000000006a9
    5.35% main libbuggy.so [.] 0x0000000000000678
    ...

    The correct result should be (after this patch):

    # Overhead Command Shared Object Symbol
    # ........ ....... ................. ...............................
    #
    91.47% main libbuggy.so [.] fib
    8.52% main libbuggy.so [.] fib@plt
    0.00% main [kernel.kallsyms] [k] kmem_cache_free

    [1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com

    Signed-off-by: Wang Nan
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Cody P Schafer
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kirill Smelkov
    Cc: Li Zefan
    Cc: Masami Hiramatsu
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460024671-64774-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • In this patch, the offset of '.text' section is stored into dso
    and used here to re-calculate address to objdump.

    In most of the cases, executable code is in '.text' section, so the
    adjustment made to a symbol in dso__load_sym (using
    sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to
    'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
    get objdump address from symbol address (rip). However, it is not true
    for kernel and kernel module since there could be multiple executable
    sections with different offset. Exclude kernel for this reason.

    After this patch, even dso->adjust_symbols is set to true for shared
    objects, map__rip_2objdump() and map__objdump_2mem() would return
    correct result, so perf behavior of annotate won't be changed.

    Signed-off-by: Wang Nan
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Cody P Schafer
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kirill Smelkov
    Cc: Li Zefan
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • We used libaudit to map ids to syscall names and vice-versa, but that
    imposes a delay in supporting new syscalls, having to wait for libaudit
    to get those new syscalls on its tables.

    To remove that delay, for x86_64 initially, grab a copy of
    arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those
    tables.

    Syscalls currently not available in audit-libs:

    # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
    Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd
    Hint: try 'perf list syscalls:sys_enter_*'
    Hint: and: 'man syscalls'
    #

    With this patch:

    # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
    8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36
    8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40
    30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096
    31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096
    31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org
    [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Tools should use a mechanism similar to arch/x86/entry/syscalls/ to
    generate a header file with the definitions for two variables:

    static const char *syscalltbl_x86_64[] = {
    [0] = "read",
    [1] = "write",

    [324] = "membarrier",
    [325] = "mlock2",
    [326] = "copy_file_range",
    };
    static const int syscalltbl_x86_64_max_id = 326;

    In a per arch file that should then be included in
    tools/perf/util/syscalltbl.c.

    First one will be for x86_64.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We're using libaudit for doing name to id and id to syscall name
    translations, but that makes 'perf trace' to have to wait for newer
    libaudit versions supporting recently added syscalls, such as
    "userfaultfd" at the time of this changeset.

    We have all the information right there, in the kernel sources, so move
    this code to a separate place, wrapped behind functions that will
    progressively use the kernel source files to extract the syscall table
    for use in 'perf trace'.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When reading the syscall tracepoint /format file, look for arguments of type
    "mode_t" and attach a beautifier:

    [root@jouet ~]# cat ~/bin/tp_with_fields_of_type
    #!/bin/bash
    grep -w $1 /sys/kernel/tracing/events/syscalls/*/format | sed -r 's%.*sys_enter_(.*)/format.*%\1%g' | paste -d, -s
    # tp_with_fields_of_type umode_t
    chmod,creat,fchmodat,fchmod,mkdirat,mkdir,mknodat,mknod,mq_open,openat,open
    #

    Testing it:

    #define S_ISUID 0004000
    #define S_ISGID 0002000
    #define S_ISVTX 0001000
    #define S_IRWXU 0000700
    #define S_IRUSR 0000400
    #define S_IWUSR 0000200
    #define S_IXUSR 0000100

    #define S_IRWXG 0000070
    #define S_IRGRP 0000040
    #define S_IWGRP 0000020
    #define S_IXGRP 0000010

    #define S_IRWXO 0000007
    #define S_IROTH 0000004
    #define S_IWOTH 0000002
    #define S_IXOTH 0000001

    # for mode in 4000 2000 1000 700 400 200 100 70 40 20 10 7 4 2 1 ; do \
    echo -n $mode '->' ; trace --no-inherit -e chmod,fchmodat,fchmod chmod $mode x; \
    done
    4000 -> 0.338 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID) = 0
    2000 -> 0.438 ( 0.015 ms): fchmodat(dfd: CWD, filename: x, mode: ISGID) = 0
    1000 -> 0.677 ( 0.040 ms): fchmodat(dfd: CWD, filename: x, mode: ISVTX) = 0
    700 -> 0.394 ( 0.013 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXU) = 0
    400 -> 0.337 ( 0.010 ms): fchmodat(dfd: CWD, filename: x, mode: IRUSR) = 0
    200 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IWUSR) = 0
    100 -> 0.249 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXUSR) = 0
    70 -> 0.266 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXG) = 0
    40 -> 0.329 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRGRP) = 0
    20 -> 0.250 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWGRP) = 0
    10 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXGRP) = 0
    7 -> 0.249 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXO) = 0
    4 -> 0.278 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IROTH) = 0
    2 -> 0.276 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWOTH) = 0
    1 -> 0.250 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXOTH) = 0
    #
    # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7777 x
    0.258 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IALLUGO) = 0
    # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7770 x
    0.258 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID|ISGID|ISVTX|IRWXU|IRWXG) = 0
    # trace --no-inherit -e chmod,fchmodat,fchmod chmod 777 x
    0.293 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXUGO
    #

    Now lets see if check by using the tracepoint for that specific syscall,
    instead of raw_syscalls:sys_enter as 'trace' does for its strace fu:

    # trace --no-inherit --ev syscalls:sys_enter_fchmodat -e fchmodat chmod 666 x
    0.255 ( ): syscalls:sys_enter_fchmodat:dfd: 0xffffffffffffff9c, filename: 0x55db32a3f0f0, mode: 0x000001b6)
    0.268 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRUGO|IWUGO ) = 0
    #

    Perfect, 0x1bc == 0666.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-18e8zfgbkj83xo87yoom43kd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Andreas reported following command produces no output:

    # cat test.py
    #!/usr/bin/env python

    def stat__krava(cpu, thread, time, val, ena, run):
    print "event %s cpu %d, thread %d, time %d, val %d, ena %d, run %d" % \
    ("krava", cpu, thread, time, val, ena, run)
    # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py
    ^C
    #

    The reason is that 'perf script' does not process event update events and
    will never get the event name update thus the python callback is never
    called.

    The fix is just to add already existing callback we use in 'perf stat
    report'.

    Committer note:

    After the patch:

    # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py
    event krava cpu -1, thread -1, time 1000239179, val 1789051, ena 4000690920, run 4000690920
    event krava cpu -1, thread -1, time 2000479061, val 2391338, ena 4000879596, run 4000879596
    event krava cpu -1, thread -1, time 3000740802, val 1939121, ena 4000977209, run 4000977209
    event krava cpu -1, thread -1, time 4001006730, val 2356115, ena 4001000489, run 4001000489
    ^C
    #

    Reported-by: Andreas Hollmann
    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460013073-18444-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Milian reported issue with thread::priv, which was double booked by perf
    trace and DWARF unwind code. So using those together is impossible at
    the moment.

    Moving DWARF unwind private data into separate variable so perf trace
    can keep using thread::priv.

    Reported-and-Tested-by: Milian Wolff
    Signed-off-by: Jiri Olsa
    Cc: Andreas Hollmann
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

07 Apr, 2016

6 commits

  • To be used in cases for both sides trim.

    Signed-off-by: Jiri Olsa
    Cc: Andreas Hollmann
    Cc: David Ahern
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • When reading the syscall tracepoint /format file, look for arguments
    of type "pid_t" and attach the PID beautifier, that will do a lookup
    on the threads it knows, i.e. the ones that came from PERF_RECORD_COMM
    events and add the COMM after the pid in such args:

    Excerpt of a system wide trace for syscalls with pid_t args:

    55602.977 ( 0.006 ms): bash/12122 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0
    55603.024 ( 0.004 ms): bash/24347 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0
    55691.527 (88.397 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) ...
    55692.479 ( 0.952 ms): git/24347 wait4(upid: 24368, stat_addr: 0x7ffe030d5724) ...
    55694.549 ( 2.070 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4fc10) = 24369 (pre-commit)
    55694.575 ( 0.002 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f650, options: NOHANG) = -1 ECHILD No child processes
    55695.934 ( 0.010 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = 24370 (git)
    55695.937 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = -1 ECHILD No child processes
    55717.963 ( 0.000 ms): pre-commit/24371 ... [continued]: wait4()) = 24372
    55717.978 (21.468 ms): :24371/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) ...
    55718.087 ( 0.109 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24373 (tr)
    55718.187 ( 0.096 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24374 (wc)
    55718.218 ( 0.002 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4eed0, options: NOHANG) = -1 ECHILD No child processes
    55718.367 ( 0.005 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = 24371 (pre-commit)
    55718.369 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = -1 ECHILD No child processes
    55741.021 (49.494 ms): git/24347 ... [continued]: wait4()) = 24368 (pre-commit)
    74146.427 (18319.601 ms): git/24347 wait4(upid: 24375 (git), stat_addr: 0x7ffe030d6824) ...
    74149.036 ( 0.891 ms): bash/24391 wait4(upid: -1, stat_addr: 0x7ffe0cee0560) = 24393 (sed)

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-75yl9hzjhb020iadc81gdj8t@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Showing the COMM for that return, if available.

    # trace -e getpid,getppid,set_tid_address
    490.007 ( 0.005 ms): sh/8250 getpid(...) = 8250 (sh)
    490.014 ( 0.001 ms): sh/8250 getppid(...) = 7886 (make)
    491.156 ( 0.004 ms): install/8251 set_tid_address(tidptr: 0x7f204a9d4ad0) = 8251 (install)
    ^C

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-psbpplqupatom9x4uohbxid5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Starting with clone, waitid and wait4:

    # trace -e waitid,wait4
    1.385 ( 1.385 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) = 1210 (ls)
    1.426 ( 0.002 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1150, options: NOHANG|UNTRACED|CONTINUED) = 0
    3.293 ( 0.604 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee0560 ) = 1214 (sed)
    3.342 ( 0.002 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee01d0, options: NOHANG ) = -1 ECHILD No child processes
    3.576 ( 0.016 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee0550, options: NOHANG|UNTRACED|CONTINUED) = 1211 (bash)
    ^C# trace -e clone
    0.027 ( 0.000 ms): systemd/1 ... [continued]: clone()) = 1227 (systemd)
    0.050 ( 0.000 ms): systemd/1227 ... [continued]: clone()) = 0
    ^C[root@jouet ~]#

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-lyf5d3y5j15wikjb6pe6ukoi@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • # trace -e waitid,wait4

    0.557 ( 0.557 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27336
    1.250 ( 0.685 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27337
    1.312 ( 0.002 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f44690, options: NOHANG) = -1 ECHILD No child processes
    1.550 ( 0.015 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 27335
    1.552 ( 0.001 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 0
    #

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • $ trace -e sched_setscheduler chrt -f 1 usleep 1
    chrt: failed to set pid 0's policy: Operation not permitted
    0.005 ( 0.005 ms): chrt/19189 sched_setscheduler(policy: FIFO, param: 0x7ffec5273d70) = -1 EPERM Operation not permitted
    $

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

06 Apr, 2016

6 commits

  • Document some features for specifying events in the perf list manpage:

    - Event groups
    - Leader sampling
    - How to specify raw PMU events in the new syntax
    - Global versus per process PMUs.
    - Access restrictions
    - Fix Intel SDM URL

    v2: Lots of new content. address review feedback.

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1459810686-15913-1-git-send-email-andi@firstfloor.org
    [ Add quotes to some keywords, such as "any" ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Link: http://lkml.kernel.org/n/tip-yk6brsq3opuotr9by18xlkr8@git.kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • This ended up triggering these warnings when building on Ubuntu 12.04.5:

    util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain':
    util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value]
    util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value]
    util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value]
    util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value]
    util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value]
    cc1: all warnings being treated as errors
    mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory
    make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1

    Fix it by doing error checking when building the perl data structures
    related to callchains.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Dima Kogan
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Fixes: f7380c12ec6c ("perf script perl: Perl scripts now get a backtrace, like the python ones")
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • If not, tell the user that:

    config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157

    And return -ENOTSUPP in die_get_var_range(), failing features that
    need it, like the one pointed out above.

    This fixes the build on older systems, such as Ubuntu 12.04.5.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Vinson Lee
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Fix build error on Ubuntu 12.04.5 with GCC 4.6.3.

    CC util/config.o
    util/config.c: In function ‘perf_buildid_config’:
    util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow]

    Signed-off-by: Vinson Lee
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Jiri Olsa
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Taeung Song
    Cc: Wang Nan
    Fixes: 9cb5987c8227 ("perf config: Rework buildid_dir_command_config to perf_buildid_config")
    Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Vinson Lee
     
  • …ernel/git/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    User visible changes:

    - Do not use events that don't have timestamps when setting 'perf trace's
    base timestamp, fixing up the timestamp column for syscalls (Arnaldo Carvalho de Melo)

    - Make the 'bpf-output' sample_type be the same as tracepoint's, fixing up
    'perf trace's timestamp column for bpf events (Wang Nan)

    - Fix PMU term format max value calculation (Kan Liang)

    - Pretty print 'seccomp', 'getrandom' syscalls in 'perf trace' (Arnaldo Carvalho de Melo)

    Infrastructure changes:

    - Add support for using TSC as an ARCH timestamp when synthesizing
    JIT records (Adrian Hunter)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

02 Apr, 2016

6 commits

  • Before this patch we can see very large time in the events before the
    'bpf-output' event. For example:

    # perf trace -vv -T --ev sched:sched_switch \
    --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:channel.event=evt/ \
    usleep 10
    ...
    18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ...
    18446744073709.551 ( ): evt:Raise a BPF event!..)
    179427791.076 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
    179427791.081 ( ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120])
    ...

    We can also see the differences between bpf-output events and
    breakpoint events:

    For bpf output event:
    sample_type IP|TID|RAW|IDENTIFIER

    For tracepoint events:
    sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER

    This patch fix this differences by adding more sample type for
    bpf-output events.

    After this patch:

    # perf trace -vv -T --ev sched:sched_switch \
    --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:channel.event=evt/ \
    usleep 10
    ...
    179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ...
    179877370.878 ( ): evt:Raise a BPF event!..)
    179877370.878 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
    179877370.882 ( ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120])
    179877370.945 ( ): evt:Raise a BPF event!..)
    ...

    # ./perf trace -vv -T --ev sched:sched_switch \
    --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:channel.event=evt/ \
    usleep 10 2>&1 | grep sample_type
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW
    sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW

    The 'IDENTIFIER' info is not required because all events have the same
    sample_type.

    Committer notes:

    Further testing, on top of the changes making 'perf trace' avoid samples
    from events without PERF_SAMPLE_TIME:

    Before:

    # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10

    0.560 ( 0.001 ms): brk( ) = 0x55e5a1df8000
    18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ...
    18446640227439.430 ( ): evt:Raise a BPF event!..)
    0.576 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
    18446640227439.430 ( ): evt:Raise a BPF event!..)
    0.645 ( ): perf_bpf_probe:func_end:(ffffffff81112460
    0.292 ( 0.001 ms): brk( ) = 0x55c7cd6e1000
    0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ...
    0.302 ( ): evt:Raise a BPF event!..)
    0.303 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
    0.397 ( ): evt:Raise a BPF event!..)
    0.397 ( ): perf_bpf_probe:func_end:(ffffffff81112460
    Reported-and-Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This was causing bogus values to be shown at the timestamp column:

    Before:

    # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
    94631143.385 ( 0.001 ms): brk( ) = 0x555555757000
    94631143.398 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7ffff7ff6000
    94631143.406 ( 0.004 ms): access(filename: 0xf7df9e10, mode: R ) = -1 ENOENT No such file or directory
    94631143.412 ( 0.004 ms): open(filename: 0xf7df8761, flags: CLOEXEC) = 3
    94631143.415 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fffffffd6b0 ) = 0
    94631143.419 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7ffff7fdb000
    94631143.420 ( 0.001 ms): close(fd: 3 ) = 0
    94631143.432 ( 0.004 ms): open(filename: 0xf7ff6640, flags: CLOEXEC) = 3

    After:

    # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
    0.022 ( 0.001 ms): brk( ) = 0x55d7668a6000
    0.037 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7f8fbeb97000
    0.123 ( 0.083 ms): access(filename: 0xbe995e10, mode: R ) = -1 ENOENT No such file or directory
    0.130 ( 0.004 ms): open(filename: 0xbe994761, flags: CLOEXEC) = 3
    0.133 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fff6487a890 ) = 0
    0.138 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7f8fbeb7c000
    0.140 ( 0.001 ms): close(fd: 3 ) = 0
    0.151 ( 0.004 ms): open(filename: 0xbeb97640, flags: CLOEXEC) = 3

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-p7m8llv81iv55ekxexdp5n57@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • That is used in both live runs, i.e.:

    # trace ls

    As when processing events recorded in a perf.data file:

    # trace -i perf.data

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-901l6yebnzeqg7z8mbaf49xb@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Currently the max value of format is calculated by the bits number. It
    relies on the continuity of the format.

    However, uncore event format is not continuous. E.g. uncore qpi event
    format can be 0-7,21.

    If bit 21 is set, there is parsing issues as below.

    $ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/
    event syntax error: '..pi_0/event=0x200002,umask=0x8/'
    \___ value too big for format, maximum is 511

    This patch return the real max value by setting all possible bits to 1.

    Signed-off-by: Kan Liang
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     
  • For Intel PT / BTS, define the environment variable that selects TSC
    timestamps in the jitdump file.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1457426333-30260-1-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Intel PT uses TSC as a timestamp, so add support for using TSC instead
    of the monotonic clock. Use of TSC is selected by an environment
    variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file
    with flag JITDUMP_FLAGS_ARCH_TIMESTAMP.

    Signed-off-by: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Sukadev Bhattiprolu
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com
    [ Added the fixup from He Kuang to make it build on other arches, ]
    [ such as aarch64, to avoid inserting this bisectiong breakage upstream ]
    Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter