15 Oct, 2020

3 commits

  • The output format for metrics has been reorganized, update documentation
    to reflect the changes for it.

    Signed-off-by: Leo Yan
    Cc: Al Grant
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Ian Rogers
    Cc: James Clark
    Cc: Jiri Olsa
    Cc: Joe Mario
    Cc: Kan Liang
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20201015144548.18482-10-leo.yan@linaro.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Leo Yan
     
  • This patch enables perf-diff with "--stream" option.

    "--stream": Enable hot streams comparison

    Now let's see example.

    perf record -b ... Generate perf.data.old with branch data
    perf record -b ... Generate perf.data with branch data
    perf diff --stream

    [ Matched hot streams ]

    hot chain pair 1:
    cycles: 1, hits: 27.77% cycles: 1, hits: 9.24%
    --------------------------- --------------------------
    main div.c:39 main div.c:39
    main div.c:44 main div.c:44

    hot chain pair 2:
    cycles: 34, hits: 20.06% cycles: 27, hits: 16.98%
    --------------------------- --------------------------
    __random_r random_r.c:360 __random_r random_r.c:360
    __random_r random_r.c:388 __random_r random_r.c:388
    __random_r random_r.c:388 __random_r random_r.c:388
    __random_r random_r.c:380 __random_r random_r.c:380
    __random_r random_r.c:357 __random_r random_r.c:357
    __random random.c:293 __random random.c:293
    __random random.c:293 __random random.c:293
    __random random.c:291 __random random.c:291
    __random random.c:291 __random random.c:291
    __random random.c:291 __random random.c:291
    __random random.c:288 __random random.c:288
    rand rand.c:27 rand rand.c:27
    rand rand.c:26 rand rand.c:26
    rand@plt rand@plt
    rand@plt rand@plt
    compute_flag div.c:25 compute_flag div.c:25
    compute_flag div.c:22 compute_flag div.c:22
    main div.c:40 main div.c:40
    main div.c:40 main div.c:40
    main div.c:39 main div.c:39

    hot chain pair 3:
    cycles: 9, hits: 4.48% cycles: 6, hits: 4.51%
    --------------------------- --------------------------
    __random_r random_r.c:360 __random_r random_r.c:360
    __random_r random_r.c:388 __random_r random_r.c:388
    __random_r random_r.c:388 __random_r random_r.c:388
    __random_r random_r.c:380 __random_r random_r.c:380

    [ Hot streams in old perf data only ]

    hot chain 1:
    cycles: 18, hits: 6.75%
    --------------------------
    __random_r random_r.c:360
    __random_r random_r.c:388
    __random_r random_r.c:388
    __random_r random_r.c:380
    __random_r random_r.c:357
    __random random.c:293
    __random random.c:293
    __random random.c:291
    __random random.c:291
    __random random.c:291
    __random random.c:288
    rand rand.c:27
    rand rand.c:26
    rand@plt
    rand@plt
    compute_flag div.c:25
    compute_flag div.c:22
    main div.c:40

    hot chain 2:
    cycles: 29, hits: 2.78%
    --------------------------
    compute_flag div.c:22
    main div.c:40
    main div.c:40
    main div.c:39

    [ Hot streams in new perf data only ]

    hot chain 1:
    cycles: 4, hits: 4.54%
    --------------------------
    main div.c:42
    compute_flag div.c:28

    hot chain 2:
    cycles: 5, hits: 3.51%
    --------------------------
    main div.c:39
    main div.c:44
    main div.c:42
    compute_flag div.c:28

    Signed-off-by: Jin Yao
    Acked-by: Jiri Olsa
    Link: https://lore.kernel.org/r/20201009022845.13141-8-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jin Yao
     
  • Document the higher level --insn-trace etc. perf script options.

    Include the howto how to build xed into the manpage

    Signed-off-by: Andi Kleen
    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20201014035346.4772-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

14 Oct, 2020

1 commit

  • Peter suggested that using the exclusive mode in perf could avoid some
    problems with bad scheduling of groups. Exclusive is implemented in the
    kernel, but wasn't exposed by the perf tool, so hard to use without
    custom low level API users.

    Add support for marking groups or events with :e for exclusive in the
    perf tool. The implementation is basically the same as the existing
    pinned attribute.

    Committer testing:

    # perf test "parse event"
    6: Parse event definition strings : Ok
    # perf test -v "parse event" |& grep :u*e
    running test 56 'instructions:uep'
    running test 57 '{cycles,cache-misses,branch-misses}:e'
    #
    #
    # grep "model name" -m1 /proc/cpuinfo
    model name : AMD Ryzen 9 3900X 12-Core Processor
    #
    # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

    Performance counter stats for 'system wide':

    cycles (0.00%)
    cache-misses (0.00%)
    branch-misses (0.00%)

    1.001269893 seconds time elapsed

    Some events weren't counted. Try disabling the NMI watchdog:
    echo 0 > /proc/sys/kernel/nmi_watchdog
    perf stat ...
    echo 1 > /proc/sys/kernel/nmi_watchdog
    # echo 0 > /proc/sys/kernel/nmi_watchdog
    # perf stat -a -e '{cycles,cache-misses,branch-misses}:e' sleep 1

    Performance counter stats for 'system wide':

    1,298,663,141 cycles
    30,962,215 cache-misses
    5,325,150 branch-misses

    1.001474934 seconds time elapsed

    #
    # The output for asking for precise events on AMD needs to improve, it
    # supposedly works only for system wide or per CPU
    #
    # perf stat -a -e '{cycles,cache-misses,branch-misses}:uep' sleep 1
    Error:
    The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
    /bin/dmesg | grep -i perf may provide additional information.

    # perf stat -a -e '{cycles,cache-misses,branch-misses}:ue' sleep 1

    Performance counter stats for 'system wide':

    746,363,126 cycles
    16,881,611 cache-misses
    2,871,259 branch-misses

    1.001636066 seconds time elapsed

    #

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Andi Kleen
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20201014144255.22699-1-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

13 Oct, 2020

1 commit

  • Like 'perf record', we can even more speedup build-id processing by just
    using all DSOs. Then we don't need to look at all the sample events
    anymore. The following patch will update 'perf bench' to show the result
    of the --buildid-all option too.

    Signed-off-by: Namhyung Kim
    Original-patch-by: Stephane Eranian
    Acked-by: Ian Rogers
    Acked-by: Jiri Olsa
    Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

28 Sep, 2020

1 commit

  • The --for-each-cgroup option is a syntax sugar to monitor large number
    of cgroups easily. Current command line requires to list all the events
    and cgroups even if users want to monitor same events for each cgroup.
    This patch addresses that usage by copying given events for each cgroup
    on user's behalf.

    For instance, if they want to monitor 6 events for 200 cgroups each they
    should write 1200 event names (with -e) AND 1200 cgroup names (with -G)
    on the command line. But with this change, they can just specify 6
    events and 200 cgroups with a new option.

    A simpler example below: It wants to measure 3 events for 2 cgroups ('A'
    and 'B'). The result is that total 6 events are counted like below.

    $ perf stat -a -e cpu-clock,cycles,instructions --for-each-cgroup A,B sleep 1

    Performance counter stats for 'system wide':

    988.18 msec cpu-clock A # 0.987 CPUs utilized
    3,153,761,702 cycles A # 3.200 GHz (100.00%)
    8,067,769,847 instructions A # 2.57 insn per cycle (100.00%)
    982.71 msec cpu-clock B # 0.982 CPUs utilized
    3,136,093,298 cycles B # 3.182 GHz (99.99%)
    8,109,619,327 instructions B # 2.58 insn per cycle (99.99%)

    1.001228054 seconds time elapsed

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Ian Rogers
    Cc: Mark Rutland
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lore.kernel.org/lkml/20200924124455.336326-3-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

18 Sep, 2020

3 commits

  • perf has many undocumented options, such as:-vv, --exec-path,
    --html-path, -p, --paginate,--no-pager, --debugfs-dir, --buildid-dir,
    --list-cmds, --list-opts.

    Add entris for these options in perf.txt.

    Signed-off-by: Zejiang Tang
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lore.kernel.org/lkml/1599645194-8438-1-git-send-email-tangzejiang@loongson.cn
    Signed-off-by: Arnaldo Carvalho de Melo

    Zejiang Tang
     
  • Add some documentation how to use the topdown metrics in ring 3.

    Signed-off-by: Andi Kleen
    Co-developed-by: Kan Liang
    Acked-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lore.kernel.org/lkml/20200911144808.27603-5-kan.liang@linux.intel.com
    Signed-off-by: Kan Liang
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Icelake has support for reporting per thread TopDown metrics.

    These are reported differently than the previous TopDown support,
    each metric is standalone, but scaled to pipeline "slots".

    We don't need to do anything special for HyperThreading anymore.
    Teach perf stat --topdown to handle these new metrics and
    print them in the same way as the previous TopDown metrics.

    The restrictions of only being able to report information per core is
    gone.

    Signed-off-by: Andi Kleen
    Co-developed-by: Kan Liang
    Acked-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lore.kernel.org/lkml/20200911144808.27603-4-kan.liang@linux.intel.com
    Signed-off-by: Kan Liang
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

05 Sep, 2020

6 commits

  • Same as 'perf probe -F', this patch adds filter support for the ftrace
    subcommand option '-F, --funcs '.

    Here is an example that only lists functions which start with 'vfs_':

    $ sudo perf ftrace -F vfs_*
    vfs_fadvise
    vfs_fallocate
    vfs_truncate
    vfs_open
    vfs_setpos
    vfs_llseek
    vfs_readf
    vfs_writef
    ...

    Suggested-by: Arnaldo Carvalho de Melo
    Signed-off-by: Changbin Du
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200904152357.6053-1-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • The documentation describes snapshot mode. Update it to include the new
    snapshot control command.

    Signed-off-by: Adrian Hunter
    Acked-by: Alexey Budankov
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lore.kernel.org/lkml/20200901093758.32293-7-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • # perf annotate --stdio2 acpi_processor_ffh_cstate_enter > default
    # perf config annotate.disassembler_style=intel
    # perf config annotate.disassembler_style
    annotate.disassembler_style=intel
    # perf annotate --stdio2 acpi_processor_ffh_cstate_enter > intel
    # diff -u default intel
    --- default 2020-09-04 13:09:26.019205732 -0300
    +++ intel 2020-09-04 13:09:52.823795081 -0300
    @@ -1,42 +1,42 @@
    Samples: 1K of event 'cycles', 4000 Hz, Event count (approx.): 990065316, [percent: local period]
    acpi_processor_ffh_cstate_enter() /lib/modules/5.9.0-rc3/build/vmlinux
    -Percent → callq __fentry__
    - mov cpu_number,%edx
    - mov %edx,%edx
    - mov cpu_cstate_entry,%rax
    - add -0x7dbe9700(,%rdx,8),%rax
    - movzbl 0x9(%rdi),%edx
    - mov 0x4(%rax,%rdx,8),%edi
    - mov (%rax,%rdx,8),%esi
    - → jmpq 137ccc6
    - 2d: → jmpq 137ccd8
    +Percent → call __fentry__
    + mov edx,DWORD PTR gs:[rip+0x7e541d74]
    + mov edx,edx
    + mov rax,QWORD PTR [rip+0x152b8fb]
    + add rax,QWORD PTR [rdx*8-0x7dbe9700]
    + movzx edx,BYTE PTR [rdi+0x9]
    + mov edi,DWORD PTR [rax+rdx*8+0x4]
    + mov esi,DWORD PTR [rax+rdx*8]
    + → jmp 137ccc6
    + 2d: → jmp 137ccd8
    mfence
    - mov %gs:0x17bc0,%rax
    - clflush (%rax)
    + mov rax,QWORD PTR gs:0x17bc0
    + clflush BYTE PTR [rax]
    mfence
    - xor %edx,%edx
    - mov %rdx,%rcx
    - mov %gs:0x17bc0,%rax
    - 0.00 monitor %rax,%ecx,%edx
    - mov (%rax),%rax
    - test $0x8,%al
    + xor edx,edx
    + mov rcx,rdx
    + mov rax,QWORD PTR gs:0x17bc0
    + 0.00 monitor
    + mov rax,QWORD PTR [rax]
    + test al,0x8
    ↓ jne 71
    - ↓ jmpq 68
    - verw 0x538b08(%rip) # ffffffff82008150
    - 68: mov %rsi,%rax
    - mov %rdi,%rcx
    -100.00 mwait %eax,%ecx
    - 71: mov %gs:0x17bc0,%rax
    - lock andb $0xdf,0x2(%rax)
    - lock addl $0x0,-0x4(%rsp)
    - mov (%rax),%rax
    - test $0x8,%al
    + ↓ jmp 68
    + verw WORD PTR [rip+0x538b08] # ffffffff82008150
    + 68: mov rax,rsi
    + mov rcx,rdi
    +100.00 mwait
    + 71: mov rax,QWORD PTR gs:0x17bc0
    + lock and BYTE PTR [rax+0x2],0xdf
    + lock add DWORD PTR [rsp-0x4],0x0
    + mov rax,QWORD PTR [rax]
    + test al,0x8
    ↓ je 97
    - andl $0x7fffffff,__preempt_count
    - 97: ← retq
    - mov %gs:0x17bc0,%rax
    - lock orb $0x20,0x2(%rax)
    - mov (%rax),%rax
    - test $0x8,%al
    + and DWORD PTR gs:[rip+0x7e548509],0x7fffffff
    + 97: ret
    + mov rax,QWORD PTR gs:0x17bc0
    + lock or BYTE PTR [rax+0x2],0x20
    + mov rax,QWORD PTR [rax]
    + test al,0x8
    ↑ jne 71
    - ↑ jmpq 2d
    + ↑ jmp 2d
    #

    Requested-by: Matt P. Dziubinski
    Cc: Adrian Hunter
    Cc: Ian Rogers
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Add 'snapshot' control command to create an AUX area tracing snapshot
    the same as if sending SIGUSR2. The advantage of the FIFO is that access
    is governed by access to the FIFO.

    Example:

    $ mkfifo perf.control
    $ mkfifo perf.ack
    $ cat perf.ack &
    [1] 15235
    $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
    [2] 15243
    $ ps -e | grep perf
    15244 pts/1 00:00:00 perf
    $ kill -USR2 15244
    bash: kill: (15244) - Operation not permitted
    $ echo snapshot > perf.control
    ack
    $

    Signed-off-by: Adrian Hunter
    Acked-by: Alexey Budankov
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lore.kernel.org/lkml/20200901093758.32293-6-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Enable the --control option to accept file names as an alternative to
    file descriptors.

    Example:

    $ mkfifo perf.control
    $ mkfifo perf.ack
    $ cat perf.ack &
    [1] 6808
    $ perf record --control fifo:perf.control,perf.ack -- sleep 300 &
    [2] 6810
    $ echo disable > perf.control
    $ Events disabled
    ack

    $ echo enable > perf.control
    $ Events enabled
    ack

    $ echo disable > perf.control
    $ Events disabled
    ack

    $ kill %2
    [ perf record: Woken up 4 times to write data ]
    $ [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]

    [1]- Done cat perf.ack
    [2]+ Terminated perf record --control fifo:perf.control,perf.ack -- sleep 300
    $

    Signed-off-by: Adrian Hunter
    Acked-by: Alexey Budankov
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lore.kernel.org/lkml/20200902105707.11491-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The --control option does not display well in man pages unless AsciiDoc
    formatting is used.

    Signed-off-by: Adrian Hunter
    Acked-by: Alexey Budankov
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lore.kernel.org/lkml/20200901093758.32293-4-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

04 Sep, 2020

2 commits

  • Event modifiers are not mentioned in the perf record or perf stat
    manpages. Add them to orient new users more effectively by pointing
    them to the perf list manpage for details.

    Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS")
    Signed-off-by: Kim Phillips
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Ian Rogers
    Cc: Jin Yao
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Paul Clarke
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tony Jones
    Cc: stable@vger.kernel.org
    Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kim Phillips
     
  • There's a risk that outputting interval mode summaries by default breaks
    CSV consumers. It already broke pmu-tools/toplev.

    So now we turn off the summary by default but we create a new option
    '--summary' to enable the summary. This is active even when not using
    CSV mode.

    Before:

    root@kbl-ppc:~# perf stat -I1000 --interval-count 2
    # time counts unit events
    1.000265904 8,005.73 msec cpu-clock # 8.006 CPUs utilized
    1.000265904 601 context-switches # 0.075 K/sec
    1.000265904 10 cpu-migrations # 0.001 K/sec
    1.000265904 0 page-faults # 0.000 K/sec
    1.000265904 66,746,521 cycles # 0.008 GHz
    1.000265904 71,874,398 instructions # 1.08 insn per cycle
    1.000265904 13,356,781 branches # 1.668 M/sec
    1.000265904 298,756 branch-misses # 2.24% of all branches
    2.001857667 8,012.52 msec cpu-clock # 8.013 CPUs utilized
    2.001857667 164 context-switches # 0.020 K/sec
    2.001857667 10 cpu-migrations # 0.001 K/sec
    2.001857667 2 page-faults # 0.000 K/sec
    2.001857667 5,822,188 cycles # 0.001 GHz
    2.001857667 2,186,170 instructions # 0.38 insn per cycle
    2.001857667 442,378 branches # 0.055 M/sec
    2.001857667 44,750 branch-misses # 10.12% of all branches

    Performance counter stats for 'system wide':

    16,018.25 msec cpu-clock # 7.993 CPUs utilized
    765 context-switches # 0.048 K/sec
    20 cpu-migrations # 0.001 K/sec
    2 page-faults # 0.000 K/sec
    72,568,709 cycles # 0.005 GHz
    74,060,568 instructions # 1.02 insn per cycle
    13,799,159 branches # 0.861 M/sec
    343,506 branch-misses # 2.49% of all branches

    2.004118489 seconds time elapsed

    After:

    root@kbl-ppc:~# perf stat -I1000 --interval-count 2
    # time counts unit events
    1.001336393 8,013.28 msec cpu-clock # 8.013 CPUs utilized
    1.001336393 82 context-switches # 0.010 K/sec
    1.001336393 8 cpu-migrations # 0.001 K/sec
    1.001336393 0 page-faults # 0.000 K/sec
    1.001336393 4,199,121 cycles # 0.001 GHz
    1.001336393 1,373,991 instructions # 0.33 insn per cycle
    1.001336393 270,681 branches # 0.034 M/sec
    1.001336393 31,659 branch-misses # 11.70% of all branches
    2.003905006 8,020.52 msec cpu-clock # 8.021 CPUs utilized
    2.003905006 184 context-switches # 0.023 K/sec
    2.003905006 8 cpu-migrations # 0.001 K/sec
    2.003905006 2 page-faults # 0.000 K/sec
    2.003905006 5,446,190 cycles # 0.001 GHz
    2.003905006 2,312,547 instructions # 0.42 insn per cycle
    2.003905006 451,691 branches # 0.056 M/sec
    2.003905006 37,925 branch-misses # 8.40% of all branches

    root@kbl-ppc:~# perf stat -I1000 --interval-count 2 --summary
    # time counts unit events
    1.001313128 8,013.20 msec cpu-clock # 8.013 CPUs utilized
    1.001313128 83 context-switches # 0.010 K/sec
    1.001313128 8 cpu-migrations # 0.001 K/sec
    1.001313128 0 page-faults # 0.000 K/sec
    1.001313128 4,470,950 cycles # 0.001 GHz
    1.001313128 1,440,045 instructions # 0.32 insn per cycle
    1.001313128 283,222 branches # 0.035 M/sec
    1.001313128 33,576 branch-misses # 11.86% of all branches
    2.003857385 8,020.34 msec cpu-clock # 8.020 CPUs utilized
    2.003857385 154 context-switches # 0.019 K/sec
    2.003857385 8 cpu-migrations # 0.001 K/sec
    2.003857385 2 page-faults # 0.000 K/sec
    2.003857385 4,515,676 cycles # 0.001 GHz
    2.003857385 2,180,449 instructions # 0.48 insn per cycle
    2.003857385 435,254 branches # 0.054 M/sec
    2.003857385 31,179 branch-misses # 7.16% of all branches

    Performance counter stats for 'system wide':

    16,033.53 msec cpu-clock # 7.992 CPUs utilized
    237 context-switches # 0.015 K/sec
    16 cpu-migrations # 0.001 K/sec
    2 page-faults # 0.000 K/sec
    8,986,626 cycles # 0.001 GHz
    3,620,494 instructions # 0.40 insn per cycle
    718,476 branches # 0.045 M/sec
    64,755 branch-misses # 9.01% of all branches

    2.006124542 seconds time elapsed

    Fixes: c7e5b328a8d4 ("perf stat: Report summary for interval mode")
    Signed-off-by: Jin Yao
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200903010113.32232-1-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jin Yao
     

14 Aug, 2020

13 commits

  • This allows us to trace single thread instead of the whole process.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-17-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-D/--delay' to allow us to start tracing some times
    later after workload is launched.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-16-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This is to have a consistent view of all graph tracer options.
    The original option '--graph-depth' is marked as deprecated.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-15-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts thresh' to setup trace duration
    threshold for funcgraph tracer.

    $ sudo ./perf ftrace -G '*' --graph-opts thresh=100
    3) ! 184.060 us | } /* schedule */
    3) ! 185.600 us | } /* exit_to_usermode_loop */
    2) ! 225.989 us | } /* schedule_idle */
    2) # 4140.051 us | } /* do_idle */

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-14-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • Sometimes we want ftrace display more and longer information about the
    trace.

    $ sudo perf ftrace -G '*'
    2) 0.979 us | mutex_unlock();
    2) 1.540 us | __fsnotify_parent();
    2) 0.433 us | fsnotify();

    $ sudo perf ftrace -G '*' --graph-opts verbose
    14160.770883 | 0) -47814 | .... | 1.289 us | mutex_unlock();
    14160.770886 | 0) -47814 | .... | 1.624 us | __fsnotify_parent();
    14160.770887 | 0) -47814 | .... | 0.636 us | fsnotify();
    14160.770888 | 0) -47814 | .... | 0.328 us | __sb_end_write();
    14160.770888 | 0) -47814 | d... | 0.430 us | fpregs_assert_state_consistent();
    14160.770889 | 0) -47814 | d... | | do_syscall_64() {
    14160.770889 | 0) -47814 | .... | | __x64_sys_close() {

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-13-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds support to display irq context info for function tracer. To do
    this, just specify a '--func-opts irq-info' option.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-12-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts noirqs' to filter out functions executed
    in irq context.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-11-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts nosleep-time' which allow us only to
    measure on-CPU time. This option is function_graph tracer only.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-10-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds support to display call trace for function tracer. To do this,
    just specify a '--func-opts call-graph' option.

    Example:

    $ sudo perf ftrace -T vfs_read --func-opts call-graph
    iio-sensor-prox-855 [003] 6168.369657: vfs_read
    => vfs_read
    => ksys_read
    => __x64_sys_read
    => do_syscall_64
    => entry_SYSCALL_64_after_hwframe
    ...

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-9-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--inherit' to allow us trace children
    processes spawned by our target.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-7-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-m/--buffer-size' to allow us set the size of per-cpu
    tracing buffer.

    Committer testing:

    Before running with this option:

    # find /sys/kernel/tracing/ -name buffer_size_kb | xargs cat
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    #

    Then, run:

    # perf ftrace -m 2048K | head -10
    2) | mutex_unlock() {
    2) ==========> |
    2) | smp_irq_work_interrupt() {
    2) | irq_enter() {
    2) 0.121 us | rcu_irq_enter();
    2) 0.128 us | irqtime_account_irq();
    2) 0.719 us | }
    2) | __wake_up() {
    2) | __wake_up_common_lock() {
    2) 0.105 us | _raw_spin_lock_irqsave();
    #

    Now look at those tracefs knobs:

    # find /sys/kernel/tracing/ -name buffer_size_kb | xargs cat
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    #

    This should be similar to the -m option in the other perf tools, such as
    'perf record', 'perf trace', etc.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-5-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-F/--funcs' to list all available functions to
    trace, which is read from tracing file 'available_filter_functions'.

    $ sudo ./perf ftrace -F | head
    trace_initcall_finish_cb
    initcall_blacklisted
    do_one_initcall
    do_one_initcall
    trace_initcall_start_cb
    run_init_process
    try_to_run_init_process
    match_dev_by_label
    match_dev_by_uuid
    rootfs_init_fs_context
    $

    Committer notes:

    This is the same command line option and for the same purpose as in
    'perf probe'.

    Signed-off-by: Changbin Du
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-3-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • The '-g/-G' options have already implied function_graph tracer should be
    used instead of function tracer. So we don't need extra option
    '--tracer' in this case.

    This patch changes the behavior as below:

    - If '-g' or '-G' option is on, then function_graph tracer is used.
    - If '-T' or '-N' option is on, then function tracer is used.
    - The function_graph has priority over function tracer.
    - The option '--tracer' only take effect if neither -g/-G nor -T/-N
    is specified.

    Here are some examples.

    This will start tracing all functions using default tracer:

    $ sudo perf ftrace

    This will trace all functions using function graph tracer:

    $ sudo perf ftrace -G '*'

    This will trace function vfs_read using function graph tracer:

    $ sudo perf ftrace -G vfs_read

    This will trace function vfs_read using function tracer:

    $ sudo perf ftrace -T vfs_read

    Committer notes:

    Using '-h -G' will tell what that option is about, so to further clarify
    the above examples:

    # perf ftrace -h -G

    -G, --graph-funcs Set graph filter on given functions

    # perf ftrace -h -g

    -g, --nograph-funcs Set nograph filter on given functions

    # perf ftrace -h -T

    -T, --trace-funcs trace given functions only

    # perf ftrace -h -N

    -N, --notrace-funcs do not trace given functions

    #

    Signed-off-by: Changbin Du
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-2-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     

06 Aug, 2020

10 commits

  • Adad support to convert and store time of day in CTF data conversion for
    'perf data convert' subcommand.

    The perf.data used for conversion needs to have clock data information -
    must be recorded with -k/--clockid option).

    New --tod option is added to 'perf data convert' subcommand to convert
    data with timestamps converted to wall clock time.

    Record data with clockid set:

    # perf record -k CLOCK_MONOTONIC kill
    kill: not enough arguments
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.033 MB perf.data (8 samples) ]

    Convert data with TOD timestamps:

    # perf data convert --tod --to-ctf ./ctf
    [ perf data convert: Converted 'perf.data' into CTF data './ctf' ]
    [ perf data convert: Converted and wrote 0.000 MB (8 samples) ]

    Display data in perf script:

    # perf script -F+tod --ns
    perf 262150 2020-07-13 18:38:50.097678523 153633.958246159: 1 cycles: ...
    perf 262150 2020-07-13 18:38:50.097682941 153633.958250577: 1 cycles: ...
    perf 262150 2020-07-13 18:38:50.097684997 153633.958252633: 7 cycles: ...
    ...

    Display data in babeltrace:

    # babeltrace --clock-date ./ctf
    [2020-07-13 18:38:50.097678523] (+?.?????????) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
    [2020-07-13 18:38:50.097682941] (+0.000004418) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
    [2020-07-13 18:38:50.097684997] (+0.000002056) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
    ...

    It's available only for recording with clockid specified, because it's
    the only case where we can get reference time to wallclock time. It's
    can't do that with perf clock yet.

    Error is display if you want to use --tod on data without clockid
    specified:

    # perf data convert --tod --to-ctf ./ctf
    Can't provide --tod time, missing clock data. Please record with -k/--clockid option.
    Failed to setup CTF writer.
    Error during conversion setup.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Geneviève Bastien
    Cc: Ian Rogers
    Cc: Jeremie Galarneau
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lore.kernel.org/lkml/20200805093444.314999-6-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add a new CLOCK_DATA feature that stores reference times when
    -k/--clockid option is specified.

    It contains the clock id and its reference time together with wall clock
    time taken at the 'same time', both values are in nanoseconds.

    The format of data is as below:

    struct {
    u32 version; /* version = 1 */
    u32 clockid;
    u64 wall_clock_ns;
    u64 clockid_time_ns;
    };

    This clock reference times will be used in following changes to display
    wall clock for perf events.

    It's available only for recording with clockid specified, because it's
    the only case where we can get reference time to wallclock time. It's
    can't do that with perf clock yet.

    Committer testing:

    $ perf record -h -k

    Usage: perf record [] []
    or: perf record [] -- []

    -k, --clockid
    clockid to use for events, see clock_gettime()

    $ perf record -k monotonic sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.017 MB perf.data (8 samples) ]
    $ perf report --header-only | grep clockid -A1
    # event : name = cycles:u, , id = { 88815, 88816, 88817, 88818, 88819, 88820, 88821, 88822 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, exclude_kernel = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, ksymbol = 1, bpf_event = 1, clockid = 1
    # CPU_TOPOLOGY info available, use -I to display
    --
    # clockid frequency: 1000 MHz
    # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
    # clockid: monotonic (1)
    # reference time: 2020-08-06 09:40:21.619290 = 1596717621.619290 (TOD) = 21931.077673635 (monotonic)
    $

    Original-patch-by: David Ahern
    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Geneviève Bastien
    Cc: Ian Rogers
    Cc: Jeremie Galarneau
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lore.kernel.org/lkml/20200805093444.314999-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • A single q option decodes ip from only FUP/TIP packets. Make it so that
    repeating the q option (i.e. qq) decodes only PSB+, getting ip if there
    is a FUP packet within PSB+ (i.e. between PSB and PSBEND).

    Example:

    $ perf record -e intel_pt//u grep -rI pudding drivers
    [ perf record: Woken up 52 times to write data ]
    [ perf record: Captured and wrote 57.870 MB perf.data ]
    $ time perf script --itrace=bi | wc -l
    58948289

    real 1m23.863s
    user 1m23.251s
    sys 0m7.452s
    $ time perf script --itrace=biq | wc -l
    3385694

    real 0m4.453s
    user 0m4.455s
    sys 0m0.328s
    $ time perf script --itrace=biqq | wc -l
    1883

    real 0m0.047s
    user 0m0.043s
    sys 0m0.009s

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-13-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Use the new itrace 'q' option to add support for a mode of decoding that
    ignores TNT, does not walk object code, but gets the ip from FUP and TIP
    packets.

    Example:

    $ perf record -e intel_pt//u grep -rI pudding drivers
    [ perf record: Woken up 52 times to write data ]
    [ perf record: Captured and wrote 57.870 MB perf.data ]
    $ time perf script --itrace=bi | wc -l
    58948289

    real 1m23.863s
    user 1m23.251s
    sys 0m7.452s
    $ time perf script --itrace=biq | wc -l
    3385694

    real 0m4.453s
    user 0m4.455s
    sys 0m0.328s

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-12-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The 'q' option is for modes of decoding that are quicker because they
    skip or omit decoding some aspects of trace data.

    If supported, the 'q' option may be repeated to increase the effect.

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-11-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Change the debug logging (when used with the --time option) to time
    filter logged perf events, but allow that to be overridden by using
    "d+a" instead of plain "d".

    That can reduce the size of the log file.

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-10-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The "d" option may be followed by flags which affect what debug messages
    will or will not be logged. Each flag must be preceded by either '+' or
    '-'. The flags support by Intel PT are:

    -a Suppress logging of perf events

    Suppressing perf events is useful for decreasing the size of the log.

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-9-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Allow the 'd' option to be followed by flags which will affect what debug
    messages will or will not be reported. Each flag must be preceded by either
    '+' or '-'. The flags are:
    a all perf events

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-8-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The itrace "e" option may be followed by flags which affect what errors
    will or will not be reported. Each flag must be preceded by either '+' or '-'.
    The flags supported by Intel PT are:

    -o Suppress overflow errors
    -l Suppress trace data lost errors
    For example, for errors but not overflow or data lost errors:

    --itrace=e-o-l

    Suppressing those errors can be useful for testing and debugging because
    they are not due to decoding.

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-7-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Allow the 'e' option to be followed by flags which will affect what errors
    will or will not be reported. Each flag must be preceded by either '+' or
    '-'. The flags are:
    o overflow
    l trace data lost

    Signed-off-by: Adrian Hunter
    Reviewed-by: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200710151104.15137-6-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter