18 Sep, 2020

1 commit

  • The kernel can release tasks while they are still running. This can
    result in a task having no tid, in which case perf records a tid of -1.
    Improve the perf script output in that case.

    Example:

    Before:

    # cat ./autoreap.c

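    #include <sys/types.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <signal.h>

    /* Ignoring SIGCHLD makes the kernel reap children automatically. */
    struct sigaction act = {
    	.sa_handler = SIG_IGN,
    };

    int main()
    {
    	pid_t child;
    	int status = 0;

    	sigaction(SIGCHLD, &act, NULL);
    	child = fork();
    	if (child == 0)
    		return 123;
    	wait(&status);
    	return 0;
    }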

    # gcc -o autoreap autoreap.c
    # ./perf record -a -e dummy --switch-events ./autoreap
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.948 MB perf.data ]
    # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
    swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
    perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
    autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
    autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
    swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191
    autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
    rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
    autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
    PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4294967295/4294967295
    swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
    autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
    autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
    swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188

    After:

    # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
    swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
    perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
    autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
    autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
    swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191
    autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
    rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
    autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
    :-1 -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: -1/-1
    swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
    autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
    autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
    autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
    swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
    swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188

    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Yu-cheng Yu
    Link: http://lore.kernel.org/lkml/20200909084923.9096-2-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
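
The difference between the Before and After output comes down to how the tid is interpreted when printed. A minimal sketch, assuming the tid travels in an unsigned 32-bit field (which the 4294967295 in the Before output suggests); `format_tid` is a hypothetical helper, not a perf function:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a tid for display. An unsigned 32-bit field holding -1 reads
 * back as 4294967295, so convert to a signed type before printing.
 * (The unsigned-to-signed conversion is two's complement on the
 * platforms perf targets.)
 */
int format_tid(char *buf, size_t size, uint32_t raw_tid)
{
	return snprintf(buf, size, "%d", (int)(int32_t)raw_tid);
}
```

With this conversion a released task shows up as -1 rather than as a large positive number.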
     

06 Aug, 2020

2 commits

  • Add a 'tod' field to display a time-of-day column with wallclock time.

    # perf record -k CLOCK_MONOTONIC kill
    kill: not enough arguments
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.033 MB perf.data (8 samples) ]

    # perf script
    perf 261340 152919.481538: 1 cycles: ffffffff8106d104 ...
    perf 261340 152919.481543: 1 cycles: ffffffff8106d104 ...
    perf 261340 152919.481545: 7 cycles: ffffffff8106d104 ...
    ...

    # perf script --ns
    perf 261340 152919.481538922: 1 cycles: ffffffff8106d ...
    perf 261340 152919.481543286: 1 cycles: ffffffff8106d ...
    perf 261340 152919.481545397: 7 cycles: ffffffff8106d ...
    ...

    # perf script -F+tod
    perf 261340 2020-07-13 18:26:55.620971 152919.481538: ...
    perf 261340 2020-07-13 18:26:55.620975 152919.481543: ...
    perf 261340 2020-07-13 18:26:55.620978 152919.481545: ...
    ...

    # perf script -F+tod --ns
    perf 261340 2020-07-13 18:26:55.620971621 152919.481538922: ...
    perf 261340 2020-07-13 18:26:55.620975985 152919.481543286: ...
    perf 261340 2020-07-13 18:26:55.620978096 152919.481545397: ...
    ...

    It's available only for recordings made with a clockid specified, because
    that is the only case where we can get a reference between the recorded
    clock and wallclock time. It can't be done with the perf clock yet.

    An error is displayed if you try to use the 'tod' field on data recorded
    without a clockid:

    # perf script -F+tod
    Can't provide 'tod' time, missing clock data. Please record with -k/--clockid option.

    Original-patch-by: David Ahern
    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Geneviève Bastien
    Cc: Ian Rogers
    Cc: Jeremie Galarneau
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lore.kernel.org/lkml/20200805093444.314999-8-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
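
The reason a reference is needed can be sketched as follows: one pair of readings, taken at the same moment from the recording clock and from the wall clock, is enough to map any sample timestamp onto time of day. This is an illustration only; `struct clock_ref` and `sample_tod_ns` are hypothetical names, and how perf stores the reference is not described above.

```c
#include <stdint.h>

/* Hypothetical reference pair captured at record time. */
struct clock_ref {
	uint64_t clockid_ns; /* reading of the recording clock */
	uint64_t tod_ns;     /* wall-clock reading at the same moment */
};

/*
 * Map a sample timestamp onto wall-clock nanoseconds by applying the
 * offset between the two clocks. Unsigned wrap-around keeps the result
 * correct for samples taken before the reference point as well.
 */
uint64_t sample_tod_ns(const struct clock_ref *ref, uint64_t sample_ns)
{
	return ref->tod_ns + (sample_ns - ref->clockid_ns);
}
```

Without a clockid there is no such pair to apply, which is consistent with the error message shown above.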
     
  • So it's possible to add new values. I did not find any place where the
    enum values are passed through some number type, so it's safe to make
    this change.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Geneviève Bastien
    Cc: Ian Rogers
    Cc: Jeremie Galarneau
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lore.kernel.org/lkml/20200805093444.314999-7-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

10 Jul, 2020

2 commits

  • It is generally more useful to show the symbol with an address. In this
    case, the print function requires the 'machine' which means changing
    callers to provide it as a parameter. It is optional because most events
    do not need it and the callers that matter can provide it.

    Committer notes:

    Made 'union perf_event' continue to be the first parameter to the
    perf_event__fprintf() and perf_event__fprintf_text_poke() functions.

    Signed-off-by: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Jiri Olsa
    Cc: Leo Yan
    Cc: Mark Rutland
    Cc: Masami Hiramatsu
    Cc: Mathieu Poirier
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Cc: x86@kernel.org
    Link: http://lore.kernel.org/lkml/20200512121922.8997-16-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Consistent with other new events, add an option to perf script to
    display text poke events and ksymbol events. Both text poke events and
    ksymbol events are displayed because some text pokes (e.g. ftrace
    trampolines) have corresponding ksymbol events.

    Signed-off-by: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Jiri Olsa
    Cc: Leo Yan
    Cc: Mark Rutland
    Cc: Masami Hiramatsu
    Cc: Mathieu Poirier
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Cc: x86@kernel.org
    Link: http://lore.kernel.org/lkml/20200512121922.8997-15-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

09 Jul, 2020

1 commit


06 Jul, 2020

1 commit

  • After recording PEBS-via-PT, perf script will not accept 'iregs' field e.g.

    # perf record -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
    ...
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.062 MB perf.data ]
    # ./perf script --itrace=eop -F+iregs
    Samples for 'dummy:u' event do not have IREGS attribute set. Cannot print 'iregs' field.

    Fix by using allow_user_set, which is true when recording AUX area data.

    Fixes: 9e64cefe4335b ("perf intel-pt: Process options for PEBS event synthesis")
    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Luwei Kang
    Link: http://lore.kernel.org/lkml/20200630133935.11150-3-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

23 Jun, 2020

3 commits


18 Jun, 2020

1 commit

  • Fixes segmentation fault when trying to interpret zstd-compressed data
    with perf script:

    ```
    $ perf record -z ls
    ...
    [ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
    $ memcheck perf script
    ...
    ==67911== Invalid read of size 4
    ==67911== at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
    ==67911== by 0x6E726B: zstd_decompress_stream (zstd.c:100)
    ==67911== by 0x65729C: perf_session__process_compressed_event (session.c:72)
    ==67911== by 0x6598E8: perf_session__process_user_event (session.c:1583)
    ==67911== by 0x65BA59: reader__process_events (session.c:2177)
    ==67911== by 0x65BA59: __perf_session__process_events (session.c:2234)
    ==67911== by 0x65BA59: perf_session__process_events (session.c:2267)
    ==67911== by 0x5A7397: __cmd_script (builtin-script.c:2447)
    ==67911== by 0x5A7397: cmd_script (builtin-script.c:3840)
    ==67911== by 0x5FE9D2: run_builtin (perf.c:312)
    ==67911== by 0x711627: handle_internal_command (perf.c:364)
    ==67911== by 0x711627: run_argv (perf.c:408)
    ==67911== by 0x711627: main (perf.c:538)
    ==67911== Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
    ```

    Signed-off-by: Milian Wolff
    Acked-by: Alexey Budankov
    Tested-by: Arnaldo Carvalho de Melo
    LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     

28 May, 2020

5 commits

  • Make process_attr() respect -F-ip, noting also that the condition in
    process_attr() (callchain_param.record_mode != CALLCHAIN_NONE) is always
    true so test the sample type directly.

    Example:

    Before:

    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.033 MB perf.data ]
    $ perf script --call-trace | head -5
    uname 30992 [006] 41758.313696574: cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown] )
    uname 30992 [006] 41758.313696907: _start 7f71792c4100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so )
    uname 30992 [006] 41758.313699574: _dl_start 7f71792c4103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so )
    uname 30992 [006] 41758.313699907: _dl_start 7f71792c4e18 _dl_start+0x28 (/usr/lib/x86_64-linux-gnu/ld-2.31.so )
    uname 30992 [006] 41758.313701574: _dl_start 7f71792c5128 _dl_start+0x338 (/usr/lib/x86_64-linux-gnu/ld-2.31.so )

    After:

    $ perf script --call-trace | head -5
    uname 30992 [006] 41758.313696574: cbr: 42 freq: 4219 MHz (156%)
    uname 30992 [006] 41758.313696907: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _start
    uname 30992 [006] 41758.313699574: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
    uname 30992 [006] 41758.313699907: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start
    uname 30992 [006] 41758.313701574: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start

    Fixes: f288e8e1aa4f ("perf script: Enable IP fields for callchains")
    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200527180250.16723-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
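
The logic described in this commit can be sketched as below: decide from the event's own sample type whether callchain fields are needed, and leave 'ip' alone when the user removed it with -F-ip. All names and bit values here are illustrative assumptions, not the identifiers used in builtin-script.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; these names are assumptions, not perf's own. */
#define SAMPLE_HAS_CALLCHAIN (1u << 0)
#define FIELD_IP             (1u << 0)
#define FIELD_SYM            (1u << 1)

struct output_opts {
	uint32_t fields;      /* fields currently selected for printing */
	bool user_removed_ip; /* true when the user asked for -F-ip */
};

/*
 * Enable the fields callchains need, unless the user removed 'ip'.
 * The decision looks at the sample type directly rather than at a
 * global setting that may always be true.
 */
void setup_callchain_fields(struct output_opts *o, uint32_t sample_type)
{
	if (!(sample_type & SAMPLE_HAS_CALLCHAIN))
		return;
	if (o->user_removed_ip)
		return;
	o->fields |= FIELD_IP | FIELD_SYM;
}
```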
     
  • --xed currently forces less. When piping the output to other scripts
    this can waste a lot of CPU time because less is rather slow.
    I've seen it using up a full core on its own in a pipeline.
    Only force less when the output is actually a terminal.

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200522020914.527564-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
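
A check of this kind is typically done with isatty(). The following is a minimal sketch of the idea, not the code from the patch; `output_wants_pager` is a hypothetical name.

```c
#include <stdbool.h>
#include <unistd.h>

/* Page the output only when fd refers to a terminal. */
bool output_wants_pager(int fd)
{
	return isatty(fd) == 1;
}
```

When stdout is a pipe, isatty() returns 0, so a downstream script receives the raw output without a pager in between.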
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array
    member[1][2], introduced in C99:

    struct foo {
    	int stuff;
    	struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning in
    case the flexible array does not occur last in the structure, which will
    help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by this
    change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    sizeof(flexible-array-member) triggers a warning because flexible array
    members have incomplete type[1]. There are some instances of code in
    which the sizeof operator is being incorrectly/erroneously applied to
    zero-length arrays and the result is zero. Such instances may be hiding
    some bugs. So, this work (flexible-array member conversions) will also
    help to get completely rid of those sorts of issues.

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Gustavo A. R. Silva
    Cc: Ian Rogers
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
    Signed-off-by: Arnaldo Carvalho de Melo

    Gustavo A. R. Silva
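
To illustrate why dynamic allocations are unaffected, here is a small sketch that allocates the `struct foo` from the message with a flexible array member. The definition of `struct boo` and the `foo_alloc` helper are assumptions made for the example.

```c
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo array[]; /* flexible array member: must come last */
};

/* Allocate a foo with room for n trailing elements. */
struct foo *foo_alloc(size_t n)
{
	/* sizeof(*f) covers the fixed part only; add the array explicitly. */
	struct foo *f = malloc(sizeof(*f) + n * sizeof(f->array[0]));

	if (f)
		f->stuff = (int)n;
	return f;
}
```

The size computation is the same expression one would write for a zero-length array, so existing allocation sites keep working after the conversion.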
     
  • In case the callchains were deleted in pipe mode, we need to ensure that
    the IP fields are enabled, otherwise the callchain is not displayed.

    Enable IP and SYM, which should be enough for callchains.

    Committer testing:

    Before:

    # ls
    # perf record -g -e 'syscalls:*' sleep 0.1 2>/dev/null | perf script | tail
    sleep 5677 [0] 5034.295882: syscalls:sys_exit_mmap: 0x7fcbcfa74000
    sleep 5677 [0] 5034.295885: syscalls:sys_enter_close: fd: 0x00000003
    sleep 5677 [0] 5034.295886: syscalls:sys_exit_close: 0x0
    sleep 5677 [0] 5034.295911: syscalls:sys_enter_nanosleep: rqtp: 0x7fff775b33a0, rmtp: 0x00000000
    sleep 5677 [0] 5034.396021: syscalls:sys_exit_nanosleep: 0x0
    sleep 5677 [0] 5034.396027: syscalls:sys_enter_close: fd: 0x00000001
    sleep 5677 [0] 5034.396028: syscalls:sys_exit_close: 0x0
    sleep 5677 [0] 5034.396029: syscalls:sys_enter_close: fd: 0x00000002
    sleep 5677 [0] 5034.396029: syscalls:sys_exit_close: 0x0
    sleep 5677 [0] 5034.396032: syscalls:sys_enter_exit_group: error_code: 0x00000000
    #
    # ls
    #

    After:

    # perf record --call-graph=dwarf -e 'syscalls:sys_enter*' sleep 0.1 2>/dev/null | perf script | tail -37
    sleep 33010 [000] 5400.625269: syscalls:sys_enter_nanosleep: rqtp: 0x7fff2d0e7860, rmtp: 0x00000000
    7f1406f131a7 __GI___nanosleep (inlined)
    561c4f996966 [unknown]
    561c4f99673f [unknown]
    561c4f9937af [unknown]
    7f1406e6c1a2 __libc_start_main
    561c4f99388d [unknown]

    sleep 33010 [000] 5400.725391: syscalls:sys_enter_close: fd: 0x00000001
    7f1406f3c3cb __GI___close_nocancel (inlined)
    7f1406ec7d6f _IO_new_file_close_it (inlined)
    7f1406ebafa5 _IO_new_fclose (inlined)
    561c4f996a40 [unknown]
    561c4f993d79 [unknown]
    7f1406e83e86 __run_exit_handlers
    7f1406e8403f __GI_exit (inlined)
    7f1406e6c1a9 __libc_start_main
    561c4f99388d [unknown]

    sleep 33010 [000] 5400.725395: syscalls:sys_enter_close: fd: 0x00000002
    7f1406f3c3cb __GI___close_nocancel (inlined)
    7f1406ec7d6f _IO_new_file_close_it (inlined)
    7f1406ebafa5 _IO_new_fclose (inlined)
    561c4f996a40 [unknown]
    561c4f993da2 [unknown]
    7f1406e83e86 __run_exit_handlers
    7f1406e8403f __GI_exit (inlined)
    7f1406e6c1a9 __libc_start_main
    561c4f99388d [unknown]

    sleep 33010 [000] 5400.725399: syscalls:sys_enter_exit_group: error_code: 0x00000000
    7f1406f13466 __GI__exit (inlined)
    7f1406e83fa1 __run_exit_handlers
    7f1406e8403f __GI_exit (inlined)
    7f1406e6c1a9 __libc_start_main
    561c4f99388d [unknown]
    #

    And if we install coreutils-debuginfo, those [unknown] entries, which
    belong to the /usr/bin/sleep binary, get resolved. Use:

    # dnf debuginfo-install coreutils

    On Fedora and derivatives, then:

    # perf record --call-graph=dwarf -e 'syscalls:sys_enter*' sleep 0.1 2>/dev/null | perf script | tail -37
    sleep 33046 [009] 5533.910074: syscalls:sys_enter_nanosleep: rqtp: 0x7ffea6fa7ab0, rmtp: 0x00000000
    7f5f786e81a7 __GI___nanosleep (inlined)
    564472454966 rpl_nanosleep
    56447245473f xnanosleep
    5644724517af main
    7f5f786411a2 __libc_start_main
    56447245188d _start

    sleep 33046 [009] 5534.010218: syscalls:sys_enter_close: fd: 0x00000001
    7f5f787113cb __GI___close_nocancel (inlined)
    7f5f7869cd6f _IO_new_file_close_it (inlined)
    7f5f7868ffa5 _IO_new_fclose (inlined)
    564472454a40 close_stream
    564472451d79 close_stdout
    7f5f78658e86 __run_exit_handlers
    7f5f7865903f __GI_exit (inlined)
    7f5f786411a9 __libc_start_main
    56447245188d _start

    sleep 33046 [009] 5534.010224: syscalls:sys_enter_close: fd: 0x00000002
    7f5f787113cb __GI___close_nocancel (inlined)
    7f5f7869cd6f _IO_new_file_close_it (inlined)
    7f5f7868ffa5 _IO_new_fclose (inlined)
    564472454a40 close_stream
    564472451da2 close_stdout
    7f5f78658e86 __run_exit_handlers
    7f5f7865903f __GI_exit (inlined)
    7f5f786411a9 __libc_start_main
    56447245188d _start

    sleep 33046 [009] 5534.010229: syscalls:sys_enter_exit_group: error_code: 0x00000000
    7f5f786e8466 __GI__exit (inlined)
    7f5f78658fa1 __run_exit_handlers
    7f5f7865903f __GI_exit (inlined)
    7f5f786411a9 __libc_start_main
    56447245188d _start

    #

    Reported-by: Paul Khuong
    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Ian Rogers
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200507095024.2789147-6-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Callchains are automatically initialized by checking on event's
    sample_type. For pipe mode we need to put this check into attr event
    code.

    Move the callchain setup code into the callchain_param_setup() function
    and call it from the attr event processing code.

    This enables pipe output having callchains, like:

    # perf record -g -e 'raw_syscalls:sys_enter' true | perf script
    # perf record -g -e 'raw_syscalls:sys_enter' true | perf report

    Committer notes:

    We still need the next patch for the above output to work.

    Reported-by: Paul Khuong
    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Ian Rogers
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200507095024.2789147-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

06 May, 2020

4 commits


30 Apr, 2020

1 commit

  • When printing iregs, a double newline was printed because
    perf_sample__fprintf_regs() printed its own and then, at the end of
    all fields, perf script added another. This caused a blank line in
    the output:

    Before:

    $ perf script -Fip,iregs
    401b8d ABI:2 DX:0x100 SI:0x4a8340 DI:0x4a9340

    401b8d ABI:2 DX:0x100 SI:0x4a9340 DI:0x4a8340

    401b8d ABI:2 DX:0x100 SI:0x4a8340 DI:0x4a9340

    401b8d ABI:2 DX:0x100 SI:0x4a9340 DI:0x4a8340

    After:

    $ perf script -Fip,iregs
    401b8d ABI:2 DX:0x100 SI:0x4a8340 DI:0x4a9340
    401b8d ABI:2 DX:0x100 SI:0x4a9340 DI:0x4a8340
    401b8d ABI:2 DX:0x100 SI:0x4a8340 DI:0x4a9340

    Committer testing:

    First we need to figure out how to request that registers be recorded,
    so we use:

    # perf record -h reg

    Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
    sample selected machine registers on interrupt, use '-I?' to list register names
    --buildid-all Record build-id of all DSOs regardless of hits
    --user-regs[=<any register>]
    sample selected machine registers on interrupt, use '--user-regs=?' to list register names

    #

    Ok, now let's ask for them all:

    # perf record -a --intr-regs --user-regs sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 4.105 MB perf.data (2760 samples) ]
    #

    Let's look at the first 6 output lines:

    # perf script -Fip,iregs | head -6
    ffffffff8a06f2f4 ABI:2 AX:0xffffd168fee0a980 BX:0xffff8a23b087f000 CX:0xfffeb69aaeb25d73 DX:0xffff8a253e8310f0 SI:0xfffffff9bafe7359 DI:0xffffb1690204fb10 BP:0xffffd168fee0a950 SP:0xffffb1690204fb88 IP:0xffffffff8a06f2f4 FLAGS:0x4e CS:0x10 SS:0x18 R8:0x1495f0a91129a R9:0xffff8a23b087f000 R10:0x1 R11:0xffffffff R12:0x0 R13:0xffff8a253e827e00 R14:0xffffd168fee0aa5c R15:0xffffd168fee0a980

    ffffffff8a06f2f4 ABI:2 AX:0x0 BX:0xffffd168fee0a950 CX:0x5684cc1118491900 DX:0x0 SI:0xffffd168fee0a9d0 DI:0x202 BP:0xffffb1690204fd70 SP:0xffffb1690204fd20 IP:0xffffffff8a06f2f4 FLAGS:0x24e CS:0x10 SS:0x18 R8:0x0 R9:0xffffd168fee0a9d0 R10:0x1 R11:0xffffffff R12:0xffffffff8a23e480 R13:0xffff8a23b087f240 R14:0xffff8a23b087f000 R15:0xffffd168fee0a950

    ffffffff8a06f2f4 ABI:2 AX:0x0 BX:0x0 CX:0x7f25f334335b DX:0x0 SI:0x2400 DI:0x4 BP:0x7fff5f264570 SP:0x7fff5f264538 IP:0xffffffff8a06f2f4 FLAGS:0x24e CS:0x10 SS:0x2b R8:0x0 R9:0x2312d20 R10:0x0 R11:0x246 R12:0x22cc0e0 R13:0x0 R14:0x0 R15:0x22d0780

    #

    Reproduced, apply the patch and:

    [root@five ~]# perf script -Fip,iregs | head -6
    ffffffff8a06f2f4 ABI:2 AX:0xffffd168fee0a980 BX:0xffff8a23b087f000 CX:0xfffeb69aaeb25d73 DX:0xffff8a253e8310f0 SI:0xfffffff9bafe7359 DI:0xffffb1690204fb10 BP:0xffffd168fee0a950 SP:0xffffb1690204fb88 IP:0xffffffff8a06f2f4 FLAGS:0x4e CS:0x10 SS:0x18 R8:0x1495f0a91129a R9:0xffff8a23b087f000 R10:0x1 R11:0xffffffff R12:0x0 R13:0xffff8a253e827e00 R14:0xffffd168fee0aa5c R15:0xffffd168fee0a980
    ffffffff8a06f2f4 ABI:2 AX:0x0 BX:0xffffd168fee0a950 CX:0x5684cc1118491900 DX:0x0 SI:0xffffd168fee0a9d0 DI:0x202 BP:0xffffb1690204fd70 SP:0xffffb1690204fd20 IP:0xffffffff8a06f2f4 FLAGS:0x24e CS:0x10 SS:0x18 R8:0x0 R9:0xffffd168fee0a9d0 R10:0x1 R11:0xffffffff R12:0xffffffff8a23e480 R13:0xffff8a23b087f240 R14:0xffff8a23b087f000 R15:0xffffd168fee0a950
    ffffffff8a06f2f4 ABI:2 AX:0x0 BX:0x0 CX:0x7f25f334335b DX:0x0 SI:0x2400 DI:0x4 BP:0x7fff5f264570 SP:0x7fff5f264538 IP:0xffffffff8a06f2f4 FLAGS:0x24e CS:0x10 SS:0x2b R8:0x0 R9:0x2312d20 R10:0x0 R11:0x246 R12:0x22cc0e0 R13:0x0 R14:0x0 R15:0x22d0780
    ffffffff8a24074b ABI:2 AX:0xcb BX:0xcb CX:0x0 DX:0x0 SI:0xffffb1690204ff58 DI:0xcb BP:0xffffb1690204ff58 SP:0xffffb1690204ff40 IP:0xffffffff8a24074b FLAGS:0x24e CS:0x10 SS:0x18 R8:0x0 R9:0x0 R10:0x0 R11:0x0 R12:0x0 R13:0x0 R14:0x0 R15:0x0
    ffffffff8a310600 ABI:2 AX:0x0 BX:0xffffffff8b8c39a0 CX:0x0 DX:0xffff8a2503890300 SI:0xffffb1690204ff20 DI:0xffff8a23e4080000 BP:0xffff8a23e4080000 SP:0xffffb1690204fec0 IP:0xffffffff8a310600 FLAGS:0x28e CS:0x10 SS:0x18 R8:0x0 R9:0x0 R10:0x0 R11:0x0 R12:0xffffffffffffffea R13:0xffff8a23e4080020 R14:0x0 R15:0x0
    ffffffff8a11b688 ABI:2 AX:0x0 BX:0xffff8a237b7c8800 CX:0xffffb1690204fae0 DX:0x78 SI:0xffff8a237b7c8800 DI:0xffffb1690204fa10 BP:0xffffb1690204fb00 SP:0xffffb1690204fa00 IP:0xffffffff8a11b688 FLAGS:0x8a CS:0x10 SS:0x18 R8:0x1495f0a917eba R9:0xffffd168fde19a48 R10:0xffffb1690204fd98 R11:0xffff8a253e82afb0 R12:0xffff8a237b7c8800 R13:0xffffb1690204fb00 R14:0x0 R15:0xffff8a237b7c8800
    [root@five ~]#

    To see it more clearly, let's get just two of those registers per sample:

    # perf record -a --intr-regs=ax,bx --user-regs=cx,dx sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 3.502 MB perf.data (1653 samples) ]
    #

    Extra info, let's see what gets set up in that 'struct perf_event_attr':

    # perf evlist -v
    cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|REGS_USER|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_user: 0xc, sample_regs_intr: 0x3
    #

    Cool, some PERF_SAMPLE_REGS_USER|PERF_SAMPLE_REGS_INTR +
    attr.sample_regs_user and attr.sample_regs_intr register masks, now let's
    see if those newlines are gone in a more compact fashion:

    # perf script -Fip,iregs,uregs
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a29b78d ABI:2 AX:0x2a20ffcd6000 BX:0x2ec7d9000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    #

    And where was that?

    # perf script -Fip,iregs,uregs,sym,dso
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0xffff8a25137b6028 BX:0xffff8a2502f18000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    ffffffff8a29b78d __vma_link_rb (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2 AX:0x2a20ffcd6000 BX:0x2ec7d9000 ABI:2 CX:0x7f204460e49b DX:0xf42920
    #

    Signed-off-by: Stephane Eranian
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Andi Kleen
    Cc: Ian Rogers
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200418231908.152212-1-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
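
The shape of the fix can be sketched like this: the per-field printer does not end the line itself, and the caller emits exactly one newline after all fields. The function names below are hypothetical and write into a buffer for easy checking; they are not the perf functions named in the message.

```c
#include <stdio.h>

/* Print one register without a trailing newline; the caller ends the line. */
int print_reg(char *buf, size_t size, const char *name, unsigned long val)
{
	return snprintf(buf, size, " %s:0x%lx", name, val);
}

/*
 * Print all fields of a sample, then exactly one newline at the end.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int print_sample(char *buf, size_t size, unsigned long ip, unsigned long dx)
{
	int len = snprintf(buf, size, "%lx ABI:2", ip);
	int r;

	if (len < 0 || (size_t)len >= size)
		return -1;
	r = print_reg(buf + len, size - (size_t)len, "DX", dx);
	if (r < 0 || (size_t)r >= size - (size_t)len)
		return -1;
	len += r;
	if ((size_t)len + 1 >= size)
		return -1;
	buf[len++] = '\n'; /* the only newline for this sample */
	buf[len] = '\0';
	return len;
}
```

Keeping the line terminator in one place means adding more fields (for example uregs after iregs) cannot reintroduce blank lines.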
     

18 Apr, 2020

1 commit

  • With the LBR stitching approach, the reconstructed LBR call stack can
    break the HW limitation. However, it may reconstruct invalid call stacks
    in some cases, e.g. exception handling such as setjmp/longjmp. Also, it
    may impact the processing time, especially when the number of samples
    with stitched LBRs is huge.

    Add an option to enable the approach.

    Committer testing:

    Using the same perf.data as with the latest cset committer testing
    section:

    $ perf script --stitch-lbr

    tchain_edit 11131 15164.984292: 437491 cycles:u:
    401106 f43+0x0 (/wb/tchain_edit)
    40114c f42+0x18 (/wb/tchain_edit)
    401172 f41+0xe (/wb/tchain_edit)
    401194 f40+0x0 (/wb/tchain_edit)
    40119b f39+0x0 (/wb/tchain_edit)
    4011a2 f38+0x0 (/wb/tchain_edit)
    4011a9 f37+0x0 (/wb/tchain_edit)
    4011b0 f36+0x0 (/wb/tchain_edit)
    4011b7 f35+0x0 (/wb/tchain_edit)
    4011be f34+0x0 (/wb/tchain_edit)
    4011c5 f33+0x0 (/wb/tchain_edit)
    4011cc f32+0x0 (/wb/tchain_edit)
    401207 f31+0x34 (/wb/tchain_edit)
    401212 f30+0x0 (/wb/tchain_edit)
    401219 f29+0x0 (/wb/tchain_edit)
    401220 f28+0x0 (/wb/tchain_edit)
    401227 f27+0x0 (/wb/tchain_edit)
    40122e f26+0x0 (/wb/tchain_edit)
    401235 f25+0x0 (/wb/tchain_edit)
    40123c f24+0x0 (/wb/tchain_edit)
    401243 f23+0x0 (/wb/tchain_edit)
    40124a f22+0x0 (/wb/tchain_edit)
    401251 f21+0x0 (/wb/tchain_edit)
    401258 f20+0x0 (/wb/tchain_edit)
    40125f f19+0x0 (/wb/tchain_edit)
    401266 f18+0x0 (/wb/tchain_edit)
    40126d f17+0x0 (/wb/tchain_edit)
    401274 f16+0x0 (/wb/tchain_edit)
    40127b f15+0x0 (/wb/tchain_edit)
    401282 f14+0x0 (/wb/tchain_edit)
    401289 f13+0x0 (/wb/tchain_edit)
    401290 f12+0x0 (/wb/tchain_edit)
    401297 f11+0x0 (/wb/tchain_edit)
    40129e f10+0x0 (/wb/tchain_edit)
    4012a5 f9+0x0 (/wb/tchain_edit)
    4012ac f8+0x0 (/wb/tchain_edit)
    4012b3 f7+0x0 (/wb/tchain_edit)
    4012ba f6+0x0 (/wb/tchain_edit)
    4012c1 f5+0x0 (/wb/tchain_edit)
    4012c8 f4+0x0 (/wb/tchain_edit)
    4012cf f3+0x0 (/wb/tchain_edit)
    4012d6 f2+0x0 (/wb/tchain_edit)
    4012dd f1+0x0 (/wb/tchain_edit)
    4012e4 main+0x0 (/wb/tchain_edit)
    7f41a5016f41 __libc_start_main+0xf1 (/usr/lib64/libc-2.29.so)

    $

    Signed-off-by: Kan Liang
    Reviewed-by: Andi Kleen
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Alexey Budankov
    Cc: Mathieu Poirier
    Cc: Michael Ellerman
    Cc: Namhyung Kim
    Cc: Pavel Gerasimov
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Cc: Stephane Eranian
    Cc: Vitaly Slobodskoy
    Link: http://lore.kernel.org/lkml/20200319202517.23423-15-kan.liang@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

16 Apr, 2020

2 commits

  • Currently, callchains can be synthesized only for synthesized events. Add
    an itrace option to synthesize callchains for regular events.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200401101613.6201-9-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • This simplifies the print functions for the following perf script
    options:

    --show-task-events
    --show-namespace-events
    --show-cgroup-events
    --show-mmap-events
    --show-switch-events
    --show-lost-events
    --show-bpf-events

    Example:
    # perf record --switch-events -a -e cycles -c 10000 sleep 1
    Before:
    # perf script --show-task-events --show-namespace-events --show-cgroup-events --show-mmap-events --show-switch-events --show-lost-events --show-bpf-events > out-before.txt
    After:
    # perf script --show-task-events --show-namespace-events --show-cgroup-events --show-mmap-events --show-switch-events --show-lost-events --show-bpf-events > out-after.txt
    # diff -s out-before.txt out-after.txt
    Files out-before.txt and out-after.txt are identical

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200402141548.21283-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

03 Apr, 2020

2 commits

  • closedir(lang_dir) frees the memory of script_dirent->d_name, which
    gets accessed in the next line in a call to scnprintf().

    Valgrind report:

    Invalid read of size 1
    ==413557== at 0x483CBE6: strlen (vg_replace_strmem.c:461)
    ==413557== by 0x4DD45FD: __vfprintf_internal (vfprintf-internal.c:1688)
    ==413557== by 0x4DE6679: __vsnprintf_internal (vsnprintf.c:114)
    ==413557== by 0x53A037: vsnprintf (stdio2.h:80)
    ==413557== by 0x53A037: scnprintf (vsprintf.c:21)
    ==413557== by 0x435202: get_script_path (builtin-script.c:3223)
    ==413557== Address 0x52e7313 is 1,139 bytes inside a block of size 32,816 free'd
    ==413557== at 0x483AA0C: free (vg_replace_malloc.c:540)
    ==413557== by 0x4E303C0: closedir (closedir.c:50)
    ==413557== by 0x4351DC: get_script_path (builtin-script.c:3222)
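The ordering problem in the report above can be sketched in isolation. This is a minimal illustration, not the code from builtin-script.c: `demo_entry_path()` and its parameters are hypothetical names. The point it shows is that anything derived from `d_name` must be formatted before `closedir()` releases the directory stream.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look for an entry called `name` inside `dir_path`
 * and write "<dir_path>/<name>" into `out`. Returns 0 on success, -1 if
 * the directory cannot be opened or the entry is not found.
 */
static int demo_entry_path(const char *dir_path, const char *name,
                           char *out, size_t out_len)
{
    DIR *dir;
    struct dirent *ent;
    int ret = -1;

    assert(out != NULL && out_len > 0);

    dir = opendir(dir_path);
    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, name) != 0)
            continue;
        /* Format the path while ent->d_name is still valid... */
        snprintf(out, out_len, "%s/%s", dir_path, ent->d_name);
        ret = 0;
        break;
    }

    /* ...and only then release the directory stream and its buffers. */
    closedir(dir);
    return ret;
}
```

Swapping the `snprintf()` and `closedir()` calls would reproduce the kind of invalid read that Valgrind reports above.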

    Signed-off-by: Andreas Gerstmayr
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200402124337.419456-1-agerstmayr@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Andreas Gerstmayr
     
  • Add the --show-cgroup-events option to print CGROUP events in the
    output, as is already done for other event types.

    Committer testing:

    [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ]
    [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2
    swapper 0 0.000000: PERF_RECORD_CGROUP cgroup: 1 /
    perf 12145 11200.440730: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    perf 12145 11200.440733: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    --
    cgtest 12145 11200.440739: 193472 cycles: ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    cgtest 12145 11200.440790: 2691608 cycles: 7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so)
    cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub
    cgtest 12147 11200.441054: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    cgtest 12147 11200.441057: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    --
    cgtest 12148 11200.441103: 10227 cycles: ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    cgtest 12148 11200.441106: 273295 cycles: ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1
    cgtest 12147 11200.441143: 2788845 cycles: ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2
    cgtest 12148 11200.441182: 2669546 cycles: 401020 _init+0x20 (/wb/cgtest)
    cgtest 12149 11200.441247: 1 cycles: ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
    [root@seventh ~]#

    Signed-off-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

27 Mar, 2020

1 commit

  • For some kinds of analysis, deltatime output is more human friendly and
    reduces the cognitive load of further analysis.

    The following output demonstrates the new option "deltatime", which
    calculates the time difference relative to the previous event.

    $ perf script --deltatime
    test 2525 [001] 0.000000: sdt_libev:ev_add: (5635e72a5ebd)
    test 2525 [001] 0.000091: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 1.000051: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
    test 2525 [001] 0.000685: sdt_libev:ev_add: (5635e72a5ebd)
    test 2525 [001] 0.000048: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 1.000104: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
    test 2525 [001] 0.003895: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 0.996034: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
    test 2525 [001] 0.000058: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 1.000004: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
    test 2525 [001] 0.000064: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 0.999934: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
    test 2525 [001] 0.000056: sdt_libev:epoll_wait_enter: (5635e72a76a9)
    test 2525 [001] 0.999930: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1

    Committer testing:

    So go from the default output to --reltime and then to this new --deltatime, to
    contrast the various timestamp presentation modes for a random perf.data file I
    had lying around:

    [root@five ~]# perf script --reltime | head
    perf 442394 [000] 0.000000: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000004: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000006: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000009: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000036: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000038: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000040: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000041: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000044: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    [root@five ~]# perf script --deltatime | head
    perf 442394 [000] 0.000000: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000001: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000001: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 0.000002: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000027: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000001: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000001: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 0.000002: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    [root@five ~]# perf script | head
    perf 442394 [000] 7600.157861: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 7600.157864: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 7600.157866: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 7600.157867: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [000] 7600.157870: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 7600.157897: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 7600.157900: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 7600.157901: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 7600.157903: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    perf 442394 [001] 7600.157906: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
    [root@five ~]#
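The relationship between the three presentation modes can be written down as plain arithmetic. The sketch below is not perf's implementation; `demo_reltime()` and `demo_deltatimes()` are hypothetical names, and timestamps are taken as integer microseconds for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Timestamp relative to the first event of the session. */
static uint64_t demo_reltime(uint64_t ts, uint64_t first_ts)
{
    assert(ts >= first_ts);
    return ts - first_ts;
}

/*
 * Fill out[i] with the time elapsed since the previous event.
 * The first event has no predecessor, so its delta is 0.
 */
static void demo_deltatimes(const uint64_t *ts, size_t n, uint64_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (i == 0) ? 0 : ts[i] - ts[i - 1];
}
```

Feeding in the first absolute timestamps from the default listing (7600.157861, 7600.157864, 7600.157866, 7600.157867) gives deltas of 0, 3, 2 and 1 microseconds. The listings above differ by a microsecond in places, presumably because the printed values are rounded from finer-grained timestamps.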

    Andi suggested it would be better to implement it as a new field, i.e. -F deltatime, like:

    [root@five ~]# perf script -F deltatime
    Invalid field requested.

    Usage: perf script []
    or: perf script [] record ]
    or: perf script [] report ] ]
    or: perf script [] [script-args]

    -F, --fields comma separated output fields prepend with 'type:'. +field to add and -field to remove.Valid types: hw,sw,trace,raw,synth. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,srcline,period,iregs,uregs,brstack,brstacksym,flags,bpf-output,brstackinsn,brstackoff,callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc
    [root@five ~]#

    I.e. we have -F for maximum flexibility:

    [root@five ~]# perf script -F comm,pid,cpu,time | head
    perf 442394 [000] 7600.157861:
    perf 442394 [000] 7600.157864:
    perf 442394 [000] 7600.157866:
    perf 442394 [000] 7600.157867:
    perf 442394 [000] 7600.157870:
    perf 442394 [001] 7600.157897:
    perf 442394 [001] 7600.157900:
    perf 442394 [001] 7600.157901:
    perf 442394 [001] 7600.157903:
    perf 442394 [001] 7600.157906:
    [root@five ~]#

    But since we already have --reltime, having --deltatime documented right
    after it is sensible.

    Signed-off-by: Hagen Paul Pfeifer
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200204173709.489161-1-hagen@jauu.net
    [ Added 'perf script' man page entry for --deltatime ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Hagen Paul Pfeifer
     

10 Mar, 2020

1 commit

  • The low level index of raw branch records for the most recent branch can
    be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
    branch_sample_type. Extend struct branch_stack to support it.

    However, if PERF_SAMPLE_BRANCH_HW_INDEX is not set, the kernel outputs
    only nr and entries[]. The entries[] pointer could then be wrong,
    because the output layout differs from the new struct branch_stack.
    Add a variable no_hw_idx to struct perf_sample to indicate whether
    hw_idx is output. Add get_branch_entry() to return the corresponding
    pointer to entries[0].
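The layout difference can be illustrated on a raw array of 64-bit words. This is a simplified sketch, not the perf helper itself: `struct demo_branch_entry` and `demo_first_entry()` are hypothetical, and the entry is modelled as three plain 64-bit fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one branch record. */
struct demo_branch_entry {
    uint64_t from;
    uint64_t to;
    uint64_t flags;
};

/*
 * raw[0] is always nr. raw[1] is hw_idx only when the kernel emitted it;
 * otherwise the entries start right after nr.
 */
static struct demo_branch_entry demo_first_entry(const uint64_t *raw,
                                                 bool no_hw_idx)
{
    const uint64_t *entries;
    struct demo_branch_entry e;

    assert(raw != NULL);
    entries = raw + (no_hw_idx ? 1 : 2);
    memcpy(&e, entries, sizeof(e));
    return e;
}
```

Reading a buffer without hw_idx through the new struct layout would treat the first entry's `from` address as hw_idx and shift every later field by one word, which is why a flag such as no_hw_idx is needed.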

    To keep the dummy branch sample consistent with the new branch sample,
    add hw_idx to struct dummy_branch_stack for cs-etm and intel-pt.

    Apply the new struct branch_stack for synthetic events as well.

    Extend test case sample-parsing to support new struct branch_stack.

    Committer notes:

    Renamed get_branch_entries() to perf_sample__branch_entries() to have
    proper namespacing and pave the way for this to be moved to libperf,
    eventually.

    Add 'static' to that inline as it is in a header.

    Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
    on arm64.

    Signed-off-by: Kan Liang
    Cc: Adrian Hunter
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Mathieu Poirier
    Cc: Michael Ellerman
    Cc: Namhyung Kim
    Cc: Pavel Gerasimov
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Cc: Stephane Eranian
    Cc: Vitaly Slobodskoy
    Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

28 Nov, 2019

2 commits

  • The 'len' returned by grab_bb() includes an extra MAXINSN bytes to allow
    for the last instruction, so the final 'offs' will not be 'len'.
    Fix the error condition logic accordingly.

    Before:

    $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
    [ perf record: Woken up 19 times to write data ]
    [ perf record: Captured and wrote 2.274 MB perf.data ]
    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
    bmexec+2485:
    00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
    00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
    00005641d5806bd6 add %rdi, %rax
    00005641d5806bd9 movzxb -0x1(%rax), %edx
    00005641d5806bdd cmp %rax, %r14
    00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
    mismatch of LBR data and executable
    00005641d58069c0 movzxb (%r13,%rdx,1), %edi

    After:

    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
    bmexec+2485:
    00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
    00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
    00005641d5806bd6 add %rdi, %rax
    00005641d5806bd9 movzxb -0x1(%rax), %edx
    00005641d5806bdd cmp %rax, %r14
    00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
    00005641d58069c0 movzxb (%r13,%rdx,1), %edi
    00005641d58069c6 add %rax, %rdi
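A small arithmetic sketch shows why comparing the final offset against 'len' reports a false mismatch. It rests on assumptions not stated in the message: that the slack is 15 bytes (`DEMO_MAXINSN`), that the earlier check compared the offset with 'len', and that a clean walk stops exactly at the final branch. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAXINSN 15 /* assumed slack, in bytes, for the last instruction */

/* Bytes a grab_bb()-like helper would report for the block [start, end]. */
static unsigned int demo_grabbed_len(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (unsigned int)(end - start) + DEMO_MAXINSN;
}

/* Check against len: flags a mismatch unless the walk consumed all of len. */
static bool demo_mismatch_vs_len(unsigned int off, unsigned int len)
{
    return off != len;
}

/* Check against the distance to the final branch instead. */
static bool demo_mismatch_vs_end(unsigned int off, uint64_t start, uint64_t end)
{
    return off != end - start;
}
```

For the block running from 0x5641d5806bd0 to the branch at 0x5641d5806be0, a clean walk ends at offset 16 while the reported length is 31, so only the second check stays quiet.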

    Fixes: e98df280bc2a ("perf script brstackinsn: Fix recovery from LBR/binary mismatch")
    Reported-by: Andi Kleen
    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191127095631.15663-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • brstackinsn must be allowed to be set by the user when AUX area data has
    been captured because, in that case, the branch stack might be
    synthesized on the fly. This fixes the following error:

    Before:

    $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
    [ perf record: Woken up 19 times to write data ]
    [ perf record: Captured and wrote 2.274 MB perf.data ]
    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    Display of branch stack assembler requested, but non all-branch filter set
    Hint: run 'perf record -b ...'

    After:

    $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
    [ perf record: Woken up 19 times to write data ]
    [ perf record: Captured and wrote 2.274 MB perf.data ]
    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
    bmexec+2485:
    00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
    00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
    00005641d5806bd6 add %rdi, %rax
    00005641d5806bd9 movzxb -0x1(%rax), %edx
    00005641d5806bdd cmp %rax, %r14
    00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
    mismatch of LBR data and executable
    00005641d58069c0 movzxb (%r13,%rdx,1), %edi

    Fixes: 48d02a1d5c13 ("perf script: Add 'brstackinsn' for branch stacks")
    Reported-by: Andi Kleen
    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

26 Nov, 2019

1 commit


15 Oct, 2019

1 commit

  • My earlier patch to just enable --reltime with --time was a little too
    optimistic. The --time parsing would accept absolute time, which is
    very confusing to the user.

    Support relative time in --time parsing too. This only works with recent
    perf record that records the first sample time. Otherwise we error out.

    Fixes: 3714437d3fcc ("perf script: Allow --time with --reltime")
    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

07 Oct, 2019

1 commit

  • The original --reltime patch forbade --time with --reltime.

    But it turns out --time doesn't really care about --reltime, because the
    relative time is only used at final output, while the time filtering
    always works earlier on absolute time.

    So just remove the check and allow combining the two options.

    Fixes: 90b10f47c0ee ("perf script: Support relative time")
    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191002164642.1719-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

01 Oct, 2019

1 commit

  • When the LBR data and the instructions in a binary do not match, the loop
    printing instructions could get confused and print a long stream of
    bogus instructions.

    The problem was that if the instruction decoder could not decode an
    instruction, ilen wasn't initialized, so the loop going through the
    basic block would continue with the previous value.

    Harden the code to avoid such problems:

    - Make sure ilen is always freshly initialized and is 0 for bad
    instructions.

    - Do not overrun the code buffer while printing instructions

    - Print a warning message if the final jump is not on an instruction
    boundary.
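The three hardening points can be sketched with a toy decoder. Everything here is hypothetical and much simpler than real instruction decoding: the first byte of each "instruction" is taken as its length, a zero byte is undecodable, and on failure the decoder returns without writing `*ilen`, which is the situation the message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_walk_status {
    DEMO_WALK_OK,
    DEMO_WALK_BAD_INSN,
    DEMO_WALK_OFF_BOUNDARY,
};

/* Toy decoder: on failure it returns without touching *ilen. */
static void demo_decode(const uint8_t *code, size_t avail, int *ilen)
{
    if (avail == 0 || code[0] == 0 || code[0] > avail)
        return;
    *ilen = code[0];
}

/* Walk code[0..len) and check that the walk lands exactly on end_off. */
static enum demo_walk_status demo_walk(const uint8_t *code, size_t len,
                                       size_t end_off)
{
    size_t off = 0;
    int ilen;

    assert(end_off <= len);
    while (off < end_off) {
        ilen = 0; /* fresh for every instruction, 0 means "bad" */
        /* Pass only the remaining bytes so the buffer is never overrun. */
        demo_decode(code + off, len - off, &ilen);
        if (ilen == 0)
            return DEMO_WALK_BAD_INSN;
        off += (size_t)ilen;
    }
    /* The final jump must sit on an instruction boundary. */
    return off == end_off ? DEMO_WALK_OK : DEMO_WALK_OFF_BOUNDARY;
}
```

Without the `ilen = 0` reset inside the loop, a failed decode would silently reuse the previous instruction's length and keep walking.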

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20190927233546.11533-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

26 Sep, 2019

2 commits

  • We already had evsel_fprintf.c, add its counterpart, so that we can
    reduce evsel.h a bit more.

    We needed a new perf_event_attr_fprintf.c file so as to have a separate
    object to link with the python binding in tools/perf/util/python-ext-sources
    and not drag symbol_conf, etc into the python binding.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we can later link it to the python binding without having to
    drag the symbol object files.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-8823tveyasocnuoelq4qopwf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

25 Sep, 2019

2 commits

  • Add perf_evlist__first()/last() functions to libperf, as internal
    functions, and rename perf's original functions to evlist__first/last.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Move the 'system_wide' member from perf's evsel to libperf's perf_evsel.

    Committer notes:

    Added stdbool.h as we now use bool here.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20190913132355.21634-20-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

21 Sep, 2019

1 commit

  • Return an error code from the perf_session__new() function on failure
    instead of NULL.

    Test Results:

    Before Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    0
    $

    After Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    254
    $

    Committer notes:

    Fix the 'perf tests topology' case, where we use TEST_ASSERT_VAL(...,
    session), i.e. we need to pass zero in case of failure. That was the
    case before, when perf_session__new() returned NULL on failure, but
    now we need to negate the result of IS_ERR(session) to respect the
    TEST_ASSERT_VAL() expectation that zero means failure.
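The error-pointer convention referred to here can be sketched in user space. The helpers below imitate the kernel-style ERR_PTR()/IS_ERR()/PTR_ERR() idea under hypothetical `demo_` names, and `demo_session_new()` is an illustrative stand-in, not perf_session__new() itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static void *demo_err_ptr(long err) { return (void *)err; }
static long demo_ptr_err(const void *p) { return (long)p; }
static bool demo_is_err(const void *p)
{
    return (unsigned long)p >= (unsigned long)-DEMO_MAX_ERRNO;
}

struct demo_session {
    FILE *fp;
};

/* Returns a valid session, or an error pointer carrying -errno. */
static struct demo_session *demo_session_new(const char *path)
{
    struct demo_session *s;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (!fp)
        return demo_err_ptr(-errno);

    s = malloc(sizeof(*s));
    if (!s) {
        fclose(fp);
        return demo_err_ptr(-ENOMEM);
    }
    s->fp = fp;
    return s;
}

static void demo_session_delete(struct demo_session *s)
{
    fclose(s->fp);
    free(s);
}
```

A caller that used to test for NULL now tests `demo_is_err(s)` and can propagate `demo_ptr_err(s)`. A test macro that expects zero on failure would be given `!demo_is_err(s)`. On Linux, -ENOENT is -2, which becomes 254 when truncated to an 8-bit exit status; that is consistent with the "After Fix" output above, although the message does not spell this out.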

    Reported-by: Nageswara R Sastry
    Signed-off-by: Mamatha Inamdar
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Nageswara R Sastry
    Acked-by: Ravi Bangoria
    Reviewed-by: Jiri Olsa
    Reviewed-by: Mukesh Ojha
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Greg Kroah-Hartman
    Cc: Jeremie Galarneau
    Cc: Kate Stewart
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Shawn Landden
    Cc: Song Liu
    Cc: Thomas Gleixner
    Cc: Tzvetomir Stoyanov
    Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
    Signed-off-by: Arnaldo Carvalho de Melo

    Mamatha Inamdar
     

01 Sep, 2019

1 commit

  • Remove the last unneeded use of cache.h in a header, so that we can check
    where it is really needed, i.e. we can remove it and be sure that it isn't
    being obtained indirectly.

    This is an old file, by now included incorrectly in many places, so it was
    providing includes needed indirectly; fix up this fallout.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-3x3l8gihoaeh7714os861ia7@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo