06 Feb, 2019

1 commit

  • The timestamp can be useful to find the part of a trace that has an error
    without outputting all of the trace, e.g. using the itrace 's' option to
    skip an initial number of events.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/20190206103947.15750-6-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

25 Jan, 2019

1 commit


22 Jan, 2019

2 commits

  • This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of
    PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
    turn it on.

    Committer notes:

    Add dummy machine__process_bpf_event() variant that returns zero for
    systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
    the build in such systems.

    Remove the needless include from bpf-event.h, provide just
    forward declarations for the structs and unions in the parameters, to
    reduce compilation time and needless rebuilds when machine.h gets
    changed.

    Committer testing:

    When running with:

    # perf record --bpf-event

    On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
    are not present, we fall back to removing those two bits from
    perf_event_attr, allowing the tool to continue to work on older kernels:

    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ksymbol 1
    bpf_event 1
    ------------------------------------------------------------
    sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
    sys_perf_event_open failed, error -22
    switching off bpf_event
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ksymbol 1
    ------------------------------------------------------------
    sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
    sys_perf_event_open failed, error -22
    switching off ksymbol
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ------------------------------------------------------------

    And then proceeds to work without those two features.
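
    The fallback shown in the log above can be sketched as a retry loop. This
    is only an illustration in Python, not the perf implementation: the
    helper names (open_with_fallback, old_kernel_open) are hypothetical, and
    the stand-in "kernel" simply rejects any attr that still carries one of
    the two new bits.

```python
import errno

# Optional attr bits, in the order they are switched off in the log above.
OPTIONAL_BITS = ["bpf_event", "ksymbol"]


def open_with_fallback(attr, try_open):
    """Retry try_open(attr), clearing one optional bit after each -EINVAL."""
    attr = dict(attr)  # work on a copy so the caller's dict is untouched
    while True:
        err = try_open(attr)
        if err == 0:
            return attr
        if err != -errno.EINVAL:
            raise OSError(-err, "unexpected failure")
        for bit in OPTIONAL_BITS:
            if attr.get(bit):
                attr[bit] = 0  # corresponds to "switching off <bit>"
                break
        else:
            raise OSError(errno.EINVAL, "no optional bits left to switch off")


def old_kernel_open(attr):
    """Hypothetical stand-in for sys_perf_event_open on an older kernel."""
    if attr.get("bpf_event") or attr.get("ksymbol"):
        return -errno.EINVAL  # -22, as in the log above
    return 0


final_attr = open_with_fallback(
    {"mmap": 1, "comm": 1, "ksymbol": 1, "bpf_event": 1}, old_kernel_open
)
```

    With the stand-in above, both bits end up cleared after two failed
    attempts, matching the three perf_event_attr dumps in the log.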

    As passing --bpf-event is an explicit action performed by the user, perhaps we
    should emit a warning telling that the kernel has no such feature, but this can
    be done on top of this patch.

    Now with a kernel that supports these events, start the 'record --bpf-event -a'
    and then run 'perf trace sleep 10000' that will use the BPF
    augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus
    should generate PERF_RECORD_BPF_EVENT events:

    [root@quaco ~]# perf record -e dummy -a --bpf-event
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.713 MB perf.data ]

    [root@quaco ~]# bpftool prog
    13: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    14: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    15: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    16: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    17: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    18: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    21: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    22: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    31: tracepoint name sys_enter tag 12504ba9402f952f gpl
    loaded_at 2019-01-19T09:19:56-0300 uid 0
    xlated 512B jited 374B memlock 4096B map_ids 30,29,28
    32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl
    loaded_at 2019-01-19T09:19:56-0300 uid 0
    xlated 256B jited 191B memlock 4096B map_ids 30,29
    # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
    1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
    2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
    3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
    4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
    5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
    6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
    7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
    8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
    9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
    10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
    11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
    12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
    13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
    14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
    #

    There, the 31 and 32 tracepoint BPF programs put in place by 'perf trace'.

    Signed-off-by: Song Liu
    Reviewed-by: Arnaldo Carvalho de Melo
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Peter Zijlstra
    Cc: kernel-team@fb.com
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     
  • This patch handles PERF_RECORD_KSYMBOL in perf record/report.
    Specifically, map and symbol are created for ksymbol register, and
    removed for ksymbol unregister.

    This patch also sets perf_event_attr.ksymbol properly. The flag is ON by
    default.
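
    As a rough sketch of the register/unregister handling described above,
    assuming a plain dictionary in place of perf's map and symbol structures
    (process_ksymbol and resolve are hypothetical helper names; the flag
    value is the one defined in the perf_event.h UAPI header):

```python
# Flag value from the perf_event.h UAPI header.
PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER = 1 << 0


def process_ksymbol(symbols, addr, length, flags, name):
    """Create an entry on register, remove it on unregister."""
    if flags & PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER:
        symbols.pop(addr, None)
    else:
        symbols[addr] = (length, name)


def resolve(symbols, ip):
    """Return the name of the symbol covering ip, or None if there is none."""
    for addr, (length, name) in symbols.items():
        if addr <= ip < addr + length:
            return name
    return None


symbols = {}
process_ksymbol(symbols, 0xFFFFFFFFC0000000, 229, 0, "bpf_prog_example")
found = resolve(symbols, 0xFFFFFFFFC0000010)
process_ksymbol(symbols, 0xFFFFFFFFC0000000, 229,
                PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER, "bpf_prog_example")
gone = resolve(symbols, 0xFFFFFFFFC0000010)
```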

    Committer notes:

    Use proper inttypes.h for u64, fixing the build in some environments
    like in the Android NDK r15c targeting ARM 32-bit.

    I.e. fixing this build error:

    util/event.c: In function 'perf_event__fprintf_ksymbol':
    util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
    event->ksymbol_event.flags, event->ksymbol_event.name);
    ^
    cc1: all warnings being treated as errors

    Signed-off-by: Song Liu
    Reviewed-by: Arnaldo Carvalho de Melo
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Peter Zijlstra
    Cc: kernel-team@fb.com
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     

18 Dec, 2018

1 commit

  • The default timeout of 500ms for parsing /proc//maps files is too
    short for profiling many of our services.

    This can be overridden by passing --proc-map-timeout to the relevant
    command but it'd be nice to globally increase our default value.

    This patch permits setting a different default with the
    core.proc-map-timeout config file parameter.
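
    The resulting precedence can be illustrated with a small sketch. It
    assumes the value is in milliseconds, that an explicit command-line value
    wins over the config file, and that the config text is simple enough for
    Python's configparser; proc_map_timeout() is a hypothetical helper, not a
    perf function.

```python
import configparser

DEFAULT_PROC_MAP_TIMEOUT_MS = 500  # built-in default stated above


def proc_map_timeout(config_text, cli_value=None):
    """Resolve the timeout: command line first, then config, then default."""
    if cli_value is not None:
        return cli_value
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getint("core", "proc-map-timeout",
                         fallback=DEFAULT_PROC_MAP_TIMEOUT_MS)


example_config = "[core]\nproc-map-timeout = 3000\n"
from_config = proc_map_timeout(example_config)
from_default = proc_map_timeout("")
from_cli = proc_map_timeout(example_config, cli_value=10000)
```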

    Signed-off-by: Mark Drayton
    Acked-by: Song Liu
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Mark Drayton
     

23 May, 2018

1 commit

  • Like the kernel text, the location of x86 PTI entry trampolines must be
    recorded in the perf.data file. Like the kernel, synthesize a mmap event
    for that, and add processing for it.

    Signed-off-by: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Jiri Olsa
    Cc: Joerg Roedel
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

18 Jan, 2018

1 commit


08 Jan, 2018

1 commit

  • Add support for displaying the sample misc field as
    one letter per bit:

    # perf script -F +misc ...
    sched-messaging 1414 K 28690.636582: 4590 cycles ...
    sched-messaging 1407 U 28690.636600: 325620 cycles ...
    sched-messaging 1414 K 28690.636608: 19473 cycles ...
    misc field __________/

    The misc bits are assigned to the following letters:

      PERF_RECORD_MISC_KERNEL        K
      PERF_RECORD_MISC_USER          U
      PERF_RECORD_MISC_HYPERVISOR    H
      PERF_RECORD_MISC_GUEST_KERNEL  G
      PERF_RECORD_MISC_GUEST_USER    g
      PERF_RECORD_MISC_MMAP_DATA*    M
      PERF_RECORD_MISC_COMM_EXEC     E
      PERF_RECORD_MISC_SWITCH_OUT    S
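
    A minimal sketch of such a mapping follows. The constant values are the
    ones from the perf_event.h UAPI header, where MMAP_DATA, COMM_EXEC and
    SWITCH_OUT share bit 13 and are told apart by record type. The function
    misc_letters() is hypothetical and does not claim to reproduce the exact
    'perf script' output format.

```python
# Values from the perf_event.h UAPI header.
PERF_RECORD_MISC_CPUMODE_MASK = 7
CPUMODE_LETTERS = {
    1: "K",  # PERF_RECORD_MISC_KERNEL
    2: "U",  # PERF_RECORD_MISC_USER
    3: "H",  # PERF_RECORD_MISC_HYPERVISOR
    4: "G",  # PERF_RECORD_MISC_GUEST_KERNEL
    5: "g",  # PERF_RECORD_MISC_GUEST_USER
}

# Bit 13 is shared by MMAP_DATA, COMM_EXEC and SWITCH_OUT.
PERF_RECORD_MISC_BIT13 = 1 << 13

PERF_RECORD_MMAP = 1
PERF_RECORD_COMM = 3
PERF_RECORD_SAMPLE = 9
PERF_RECORD_MMAP2 = 10
PERF_RECORD_SWITCH = 14
PERF_RECORD_SWITCH_CPU_WIDE = 15

BIT13_LETTERS = {
    PERF_RECORD_MMAP: "M",
    PERF_RECORD_MMAP2: "M",
    PERF_RECORD_COMM: "E",
    PERF_RECORD_SWITCH: "S",
    PERF_RECORD_SWITCH_CPU_WIDE: "S",
}


def misc_letters(record_type, misc):
    """Translate a header misc value into the letters listed above."""
    out = CPUMODE_LETTERS.get(misc & PERF_RECORD_MISC_CPUMODE_MASK, "")
    if misc & PERF_RECORD_MISC_BIT13:
        out += BIT13_LETTERS.get(record_type, "")
    return out
```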

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180107160356.28203-9-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

07 Nov, 2017

1 commit

  • Conflicts:
    tools/perf/arch/arm/annotate/instructions.c
    tools/perf/arch/arm64/annotate/instructions.c
    tools/perf/arch/powerpc/annotate/instructions.c
    tools/perf/arch/s390/annotate/instructions.c
    tools/perf/arch/x86/tests/intel-cqm.c
    tools/perf/ui/tui/progress.c
    tools/perf/util/zlib.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

03 Oct, 2017

1 commit

  • The proc files, which are sorted in alphabetical order, are evenly
    assigned to several synthesize threads to be processed in parallel.

    For 'perf top', the number of threads is hard-coded to the number of
    online CPUs. The following patch will introduce an option to set it.

    For other perf tools, the thread number is 1, because the processing
    function, e.g. process_synthesized_event, is not ready for
    multithreading.

    This patch series only supports multithreaded event synthesis for 'perf
    top'. For other tools, it can be done separately later.

    With multithread applied, the total processing time can get up to 1.56x
    speedup on Knights Mill for 'perf top'.

    For specific single event processing, the processing time could increase
    because of the lock contention. So proc_map_timeout may need to be
    increased. Otherwise some proc maps will be truncated.

    Based on my test, increasing the proc_map_timeout has a small impact
    on the total processing time. The total processing time still gets a 1.49x
    speedup on Knights Mill after increasing the proc_map_timeout.
    The patch itself doesn't increase the proc_map_timeout.

    There is no need to implement multithreading for per-task monitoring
    (perf_event__synthesize_thread_map), since it has no performance issue.
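
    The even assignment of sorted proc entries to threads can be sketched as
    follows. This is an illustration only: it assumes each thread gets one
    contiguous slice of the sorted list, with any remainder spread over the
    first threads, and assign_to_threads() is a hypothetical name.

```python
def assign_to_threads(entries, nr_threads):
    """Split alphabetically sorted entries into nr_threads contiguous chunks."""
    entries = sorted(entries)
    base, extra = divmod(len(entries), nr_threads)
    chunks = []
    start = 0
    for i in range(nr_threads):
        # The first 'extra' threads take one more entry than the rest.
        size = base + (1 if i < extra else 0)
        chunks.append(entries[start:start + size])
        start += size
    return chunks


chunks = assign_to_threads(["42", "1", "300", "10", "2"], 2)
```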

    Committer testing:

    # getconf _NPROCESSORS_ONLN
    4
    # perf trace --no-inherit -e clone -o /tmp/output perf top
    # tail -4 /tmp/bla
    0.124 ( 0.041 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eb3a8f30, parent_tidptr: 0x7fc3eb3a99d0, child_tidptr: 0x7fc3eb3a99d0, tls: 0x7fc3eb3a9700) = 9548 (perf)
    0.246 ( 0.023 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3eaba7f30, parent_tidptr: 0x7fc3eaba89d0, child_tidptr: 0x7fc3eaba89d0, tls: 0x7fc3eaba8700) = 9549 (perf)
    0.286 ( 0.019 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9550 (perf)
    246.540 ( 0.047 ms): clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7fc3ea3a6f30, parent_tidptr: 0x7fc3ea3a79d0, child_tidptr: 0x7fc3ea3a79d0, tls: 0x7fc3ea3a7700) = 9551 (perf)
    #

    Signed-off-by: Kan Liang
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexei Starovoitov
    Cc: Andi Kleen
    Cc: He Kuang
    Cc: Lukasz Odzioba
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/1506696477-146932-4-git-send-email-kan.liang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

02 Sep, 2017

1 commit

  • Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address.

    Add new option --phys-data to record sample physical address.

    Signed-off-by: Kan Liang
    Tested-by: Jiri Olsa
    Acked-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Madhavan Srinivasan
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com
    [ Added missing printing in evsel.c patch sent by Jiri Olsa ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

19 Jul, 2017

2 commits

  • Create new util/branch.c and util/branch.h to contain the common branch
    functions, such as:

    branch_type_count(): Count the numbers of branch types
    branch_type_name() : Return the name of branch type
    branch_type_stat_display(): Display branch type statistics info
    branch_type_str(): Construct the branch type string.

    The branch type is saved in branch_flags.

    Change log:

    v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.

    v7: Since the common branch type name is changed (e.g. JCC->COND),
    this patch is modified accordingly.

    v6: Move that multiline conditional code inside {} brackets.
    Move branch_type_stat_display() from builtin-report.c to
    branch.c.
    Move branch_type_str() from callchain.c to branch.c.

    v5: It's a new patch in v5 patch series.

    Signed-off-by: Yao Jin
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Kan Liang
    Cc: Michael Ellerman
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
    [ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jin Yao
     
  • Add header record types to pipe-mode, reusing the functions
    used in file-mode and leveraging the new struct feat_fd.

    For alignment, check that synthesized events don't exceed
    pagesize.

    Add the perf_event__synthesize_feature event call back to
    process the new header records.

    Before this patch:

    $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.000 MB - ]
    ...

    After this patch:
    $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
    # ========
    # captured on: Mon May 22 16:33:43 2017
    # ========
    #
    # hostname : my_hostname
    # os release : 4.11.0-dbx-up_perf
    # perf version : 4.11.rc6.g6277c80
    # arch : x86_64
    # nrcpus online : 72
    # nrcpus avail : 72
    # cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
    # cpuid : GenuineIntel,6,63,2
    # total memory : 263457192 kB
    # cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
    # HEADER_CPU_TOPOLOGY info available, use -I to display
    # HEADER_NUMA_TOPOLOGY info available, use -I to display
    # pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.000 MB - ]
    ...

    Support added for the subcommands: report, inject, annotate and script.

    Signed-off-by: David Carrillo-Cisneros
    Acked-by: David Ahern
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: He Kuang
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Paul Turner
    Cc: Peter Zijlstra
    Cc: Simon Que
    Cc: Stephane Eranian
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Carrillo-Cisneros
     

30 Jun, 2017

1 commit


27 Jun, 2017

1 commit

  • Instruction trace decoders such as Intel PT may have additional information
    recorded in the trace. For example, Intel PT has power information and
    there is a new instruction 'ptwrite' that can write a value into a PTWRITE
    trace packet.

    Such information may be associated with an IP and so can be treated as a
    sample (PERF_RECORD_SAMPLE). Custom data can be incorporated in the
    sample as raw_data (PERF_SAMPLE_RAW).

    However a means of identifying the raw data format is needed. That will
    be done by synthesizing an attribute for it.

    So add an attribute type for custom synthesized events. Different
    synthesized events will be identified by the attribute 'config'.

    Committer notes:

    Start those PERF_TYPE_ after the PMU range, i.e. after (INT_MAX + 1U),
    i.e. after perf_pmu_register() -> idr_alloc(end=0).
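
    The arithmetic behind that range split, as a small sketch. It assumes a
    32-bit int, and SYNTH_TYPE_BASE and is_synth_type() are hypothetical
    names used only for illustration.

```python
INT_MAX = 2**31 - 1  # assumes a 32-bit int

# First attribute type value past the PMU range: (INT_MAX + 1U) in C.
SYNTH_TYPE_BASE = INT_MAX + 1


def is_synth_type(attr_type):
    """True if attr_type lies above the range used for PMU type ids."""
    return attr_type >= SYNTH_TYPE_BASE
```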

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1498040239-32418-1-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

04 May, 2017

1 commit

  • Mostly in the documentation.

    Signed-off-by: Kim Phillips
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com
    [ Fix spelling of "parameter" in one of the spell-checked lines ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Kim Phillips
     

03 May, 2017

1 commit

  • That is the case of _text on s390, and we have some functions that return an
    address, using address zero to report problems, oops.

    This would lead the symbol loading routines to not use "_text" as the reference
    relocation symbol, or the first symbol for the kernel, but use instead
    "_stext", that is at the same address on x86_64 and others, but not on s390:

    [acme@localhost perf-4.11.0-rc6]$ head -15 /proc/kallsyms
    0000000000000000 T _text
    0000000000000418 t iplstart
    0000000000000800 T start
    000000000000080a t .base
    000000000000082e t .sk8x8
    0000000000000834 t .gotr
    0000000000000842 t .cmd
    0000000000000846 t .parm
    000000000000084a t .lowcase
    0000000000010000 T startup
    0000000000010010 T startup_kdump
    0000000000010214 t startup_kdump_relocated
    0000000000011000 T startup_continue
    00000000000112a0 T _ehead
    0000000000100000 T _stext
    [acme@localhost perf-4.11.0-rc6]$

    Which in turn would make 'perf test vmlinux' fail because it wouldn't find
    the symbols before "_stext" in kallsyms.

    Fix it by using the return value only for errors and storing the
    address, when the symbol is successfully found, in a provided pointer
    arg.
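
    The ambiguity and the fix can be illustrated with a sketch. Python has
    no pointer arguments, so the second variant returns an (err, addr) pair
    to play the role of "error code plus address stored via a pointer". Both
    function names are hypothetical.

```python
def symbol_addr_old(kallsyms, name):
    """Old style: return the address, using 0 to report 'not found'."""
    return kallsyms.get(name, 0)


def symbol_addr_new(kallsyms, name):
    """New style: return (err, addr); err is 0 on success, -1 on failure."""
    if name not in kallsyms:
        return -1, None
    return 0, kallsyms[name]


# Addresses taken from the s390 /proc/kallsyms excerpt below.
s390_kallsyms = {"_text": 0x0, "_stext": 0x100000}

old_result = symbol_addr_old(s390_kallsyms, "_text")   # looks like an error
new_result = symbol_addr_new(s390_kallsyms, "_text")   # success, address 0
missing = symbol_addr_new(s390_kallsyms, "no_such_symbol")
```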

    Before this patch:

    [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
    1: vmlinux symtab matches kallsyms :
    --- start ---
    test child forked, pid 40693
    Looking at the vmlinux_path (8 entries long)
    Using /usr/lib/debug/lib/modules/3.10.0-654.el7.s390x/vmlinux for symbols
    ERR : 0: _text not on kallsyms
    ERR : 0x418: iplstart not on kallsyms
    ERR : 0x800: start not on kallsyms
    ERR : 0x80a: .base not on kallsyms
    ERR : 0x82e: .sk8x8 not on kallsyms
    ERR : 0x834: .gotr not on kallsyms
    ERR : 0x842: .cmd not on kallsyms
    ERR : 0x846: .parm not on kallsyms
    ERR : 0x84a: .lowcase not on kallsyms
    ERR : 0x10000: startup not on kallsyms
    ERR : 0x10010: startup_kdump not on kallsyms
    ERR : 0x10214: startup_kdump_relocated not on kallsyms
    ERR : 0x11000: startup_continue not on kallsyms
    ERR : 0x112a0: _ehead not on kallsyms

    test child finished with -1
    ---- end ----
    vmlinux symtab matches kallsyms: FAILED!
    [acme@localhost perf-4.11.0-rc6]$

    After:

    [acme@localhost perf-4.11.0-rc6]$ tools/perf/perf test -v 1
    1: vmlinux symtab matches kallsyms :
    --- start ---
    test child forked, pid 47160

    test child finished with 0
    ---- end ----
    vmlinux symtab matches kallsyms: Ok
    [acme@localhost perf-4.11.0-rc6]$

    Reported-by: Michael Petlan
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-9x9bwgd3btwdk1u51xie93fz@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

26 Apr, 2017

1 commit


25 Apr, 2017

1 commit


17 Mar, 2017

1 commit

  • This patch decodes the 'partial' flag in AUX records and prints
    a warning to the user, so that they don't have to guess why their
    PT traces contain gaps (or are missing altogether):

    Warning:
    AUX data had gaps in it 8 times out of 8!

    Are you running a KVM guest in the background?

    Trying to be even more helpful, we will detect if the user's kvm driver sets up
    exclusive VMX root mode for the entire lifespan of the kvm process:

    Reloading kvm_intel module with vmm_exclusive=0
    will reduce the gaps to only guest's timeslices.

    Note however, that you'll still have gaps in cpu-wide traces even with
    vmm_exclusive=0, but the number of gaps will be below 100% (as opposed to the
    above example).

    Currently this is the only reason for partial records.
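
    A minimal sketch of the counting behind that warning, assuming the flags
    of each AUX record are available as integers. PERF_AUX_FLAG_PARTIAL is
    the value from the perf_event.h UAPI header; gap_warning() is a
    hypothetical helper that only reproduces the first line of the message.

```python
PERF_AUX_FLAG_PARTIAL = 0x04  # value from the perf_event.h UAPI header


def gap_warning(aux_flags):
    """Return the warning text if any AUX record is partial, else None."""
    total = len(aux_flags)
    partial = sum(1 for flags in aux_flags if flags & PERF_AUX_FLAG_PARTIAL)
    if partial == 0:
        return None
    return "AUX data had gaps in it %d times out of %d!" % (partial, total)
```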

    Signed-off-by: Alexander Shishkin
    Cc: Adrian Hunter
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/8760j941ig.fsf@ashishki-desk.ger.corp.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexander Shishkin
     

15 Mar, 2017

1 commit

  • Synthesize PERF_RECORD_NAMESPACES events for processes that were running prior
    to invocation of perf record. The data for this is taken from /proc/$PID/ns.
    These changes make way for analyzing events with regard to namespaces.
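
    For illustration, the namespace data under /proc/$PID/ns is a set of
    symbolic links whose targets look like "net:[4026531969]". The sketch
    below parses such a target and, on Linux, reads the links for one
    process. Both helper names are hypothetical, and the sketch does not
    claim to mirror how perf itself gathers this data.

```python
import os
import re

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target such as 'net:[4026531969]' into (name, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("unexpected namespace link: %r" % target)
    return match.group(1), int(match.group(2))


def read_namespaces(pid):
    """Return {name: inode} for one process (Linux only, needs /proc)."""
    ns_dir = "/proc/%d/ns" % pid
    result = {}
    for entry in os.listdir(ns_dir):
        name, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[name] = inode
    return result
```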

    Committer notes:

    Check if 'tool' is NULL in perf_event__synthesize_namespaces(), as in the
    test__mmap_thread_lookup case, i.e. 'perf test "Lookup mmap thread"'.

    Testing it:

    # ps axH > /tmp/allthreads
    # perf record -a --namespaces usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.169 MB perf.data (8 samples) ]
    # perf report -D | grep PERF_RECORD_NAMESPACES | wc -l
    602
    # wc -l /tmp/allthreads
    601 /tmp/allthreads
    # tail /tmp/allthreads
    16951 pts/4 T 0:00 git rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
    16952 pts/4 T 0:00 /bin/sh /usr/libexec/git-core/git-rebase -i a033bf1bfacdaa25642e6bcc857a7d0f67cc3c92^
    17176 pts/4 T 0:00 git commit --amend --no-post-rewrite
    17204 pts/4 T 0:00 vim /home/acme/git/linux/.git/COMMIT_EDITMSG
    18939 ? S 0:00 [kworker/2:1]
    18947 ? S 0:00 [kworker/3:0]
    18974 ? S 0:00 [kworker/1:0]
    19047 ? S 0:00 [kworker/0:1]
    19152 pts/6 S+ 0:00 weechat
    19153 pts/7 R+ 0:00 ps axH
    # perf report -D | grep PERF_RECORD_NAMESPACES | tail
    0 0 0x125068 [0xa0]: PERF_RECORD_NAMESPACES 17176/17176 - nr_namespaces: 7
    0 0 0x1255b8 [0xa0]: PERF_RECORD_NAMESPACES 17204/17204 - nr_namespaces: 7
    0 0 0x125df0 [0xa0]: PERF_RECORD_NAMESPACES 18939/18939 - nr_namespaces: 7
    0 0 0x125f00 [0xa0]: PERF_RECORD_NAMESPACES 18947/18947 - nr_namespaces: 7
    0 0 0x126010 [0xa0]: PERF_RECORD_NAMESPACES 18974/18974 - nr_namespaces: 7
    0 0 0x126120 [0xa0]: PERF_RECORD_NAMESPACES 19047/19047 - nr_namespaces: 7
    0 0 0x126230 [0xa0]: PERF_RECORD_NAMESPACES 19152/19152 - nr_namespaces: 7
    0 0 0x129330 [0xa0]: PERF_RECORD_NAMESPACES 19154/19154 - nr_namespaces: 7
    0 0 0x12a1f8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
    0 0 0x12b0b8 [0xa0]: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
    #

    Humm, investigate why we got two records for the 19155 pid/tid...

    Signed-off-by: Hari Bathini
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Ananth N Mavinakayanahalli
    Cc: Aravinda Prasad
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: Eric Biederman
    Cc: Peter Zijlstra
    Cc: Sargun Dhillon
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/148891931111.25309.11073854609798681633.stgit@hbathini.in.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Hari Bathini
     

14 Mar, 2017

1 commit

  • Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
    by the kernel when fork, clone, setns or unshare are invoked. And update
    perf-record documentation with the new option to record namespace
    events.

    Committer notes:

    Combined it with a later patch to allow printing it via 'perf report -D'
    and be able to test the feature introduced in this patch. Had to move
    here also perf_ns__name(), that was introduced in another later patch.

    Also used PRIu64 and PRIx64 to fix the build in some environments wrt:

    util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
    ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
    ^
    Testing it:

    # perf record --namespaces -a
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
    #
    # perf report -D

    3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
    [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
    4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

    0x1151e0 [0x30]: event: 9
    .
    . ... raw event: size 48 bytes
    . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
    . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
    . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................

    NAMESPACES events: 1

    #

    Signed-off-by: Hari Bathini
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Ananth N Mavinakayanahalli
    Cc: Aravinda Prasad
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: Eric Biederman
    Cc: Peter Zijlstra
    Cc: Sargun Dhillon
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Hari Bathini
     

24 Oct, 2016

1 commit

  • Change Intel PT and BTS to pass up the length and the instruction
    bytes of the decoded or sampled instruction in the perf sample.

    The decoder already knows this information, we just need to pass it
    up. Since it is only a couple of movs it is not very expensive.

    Handle instruction cache too. Make sure ilen is always initialized.

    Used in the next patch.

    [Adrian: re-base on top (and adjust for) instruction buffer size tidy-up]
    [Adrian: add BTS support and adjust commit message accordingly]

    Signed-off-by: Adrian Hunter
    Link: http://lkml.kernel.org/r/1475847747-30994-3-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Andi Kleen
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

25 Jul, 2016

1 commit

  • This reverts commit e083a21fcac9311ca425e600a15332f4792c56cc.

    Not needed at all, tools/perf/util/perf_regs.h, included via:

    #include "perf_regs.h"

    Should have a definition for PERF_REGS_MAX, and since this is dependent
    on HAVE_PERF_REGS_SUPPORT, fixes the build on powerpc, noticed by trying
    to cross compile this from ubuntu16.04 with a locally built libz &
    elfutils pair, since those are not available in multilib packages.

    Cc: Jiri Olsa
    Cc: Naveen N. Rao
    Cc: Stephane Eranian
    Cc: Sukadev Bhattiprolu
    Link: http://lkml.kernel.org/n/tip-0bv204s71t4wuw1l53b6fz79@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

13 Jul, 2016

1 commit


31 Mar, 2016

1 commit

  • Intel PT uses the time members from the perf_event_mmap_page to convert
    between TSC and perf time.

    Due to a lack of foresight when Intel PT was implemented, those time
    members were recorded in the (implementation dependent) AUXTRACE_INFO
    event, the structure of which is generally inaccessible outside of the
    Intel PT decoder. However now the conversion between TSC and perf time
    is needed when processing a jitdump file when Intel PT has been used for
    tracing.

    So add a user event to record the time members. 'perf record' will
    synthesize the event if the information is available. And session
    processing will put a copy of the event on the session so that tools
    like 'perf inject' can easily access it.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
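
    The TSC to perf time conversion that this commit refers to can be
    sketched as below. The field names time_shift, time_mult and time_zero
    follow perf_event_mmap_page; the struct name tsc_conv and both function
    names are assumptions made for illustration, not the identifiers used
    by the patch.

```c
#include <stdint.h>

/* Illustrative container for the three time members; the field names follow
 * perf_event_mmap_page, the struct name itself is an assumption. */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* TSC cycles -> perf time (ns). The cycle count is split into quotient and
 * remainder so the multiplication does not overflow 64 bits. */
static uint64_t tsc_to_perf_time(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem  = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* perf time (ns) -> TSC cycles, the inverse of the function above. */
static uint64_t perf_time_to_tsc(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t t    = ns - tc->time_zero;
	uint64_t quot = t / tc->time_mult;
	uint64_t rem  = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

    Carrying these three values in a user event means a consumer of the
    perf.data file needs only the event payload to do the arithmetic, not
    the Intel PT specific AUXTRACE_INFO layout.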
     

23 Mar, 2016

3 commits

  • Since none of the perf_event fields are used anymore, just the
    perf_sample ones, and since this resolves to (map, symbol) from data
    structures within struct thread, rename it to thread__resolve and make
    the argument ordering similar to the one in machine__resolve().

    Cc: Adrian Hunter
    Cc: Hemant Kumar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Ravi Bangoria
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-2b33hs9bp550tezzlhl4kejh@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Since we only deal with fields in the passed struct perf_sample move
    this method to struct machine, that is where the perf_sample fields
    will be resolved to a struct addr_location, i.e. thread, map, symbol,
    etc.

    Cc: Adrian Hunter
    Cc: Hemant Kumar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Ravi Bangoria
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-a1ww2lbm2vbuqsv4p7ilubu9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • To avoid parsing event->header.misc in many locations.

    This will also allow setting perf.sample.{ip,cpumode} in a single place,
    from tracepoint fields, as needed by 'perf kvm' with PPC guests, where
    the guest hardware counters are not available at the host.

    Cc: Adrian Hunter
    Cc: Hemant Kumar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Ravi Bangoria
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-qp3yradhyt6q3wl895b1aat0@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
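
    A minimal sketch of the idea: decode the cpumode bits of header.misc
    once and keep the result with the sample, so later code reads the
    cached value. The PERF_RECORD_MISC_* values match the perf_event UAPI
    header; the demo_* structs and functions are hypothetical stand-ins,
    not the actual perf_sample or perf_event_header definitions.

```c
#include <stdint.h>

/* Values as defined in the perf_event UAPI header. */
#define PERF_RECORD_MISC_CPUMODE_MASK	(7 << 0)
#define PERF_RECORD_MISC_KERNEL		(1 << 0)
#define PERF_RECORD_MISC_USER		(2 << 0)
#define PERF_RECORD_MISC_HYPERVISOR	(3 << 0)
#define PERF_RECORD_MISC_GUEST_KERNEL	(4 << 0)
#define PERF_RECORD_MISC_GUEST_USER	(5 << 0)

/* Hypothetical, cut-down stand-ins for the event header and the sample. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct demo_sample {
	uint64_t ip;
	uint8_t  cpumode;
};

/* Parse header.misc in a single place and cache the result in the sample. */
static void demo_sample__set_cpumode(struct demo_sample *sample,
				     const struct demo_header *hdr)
{
	sample->cpumode = hdr->misc & PERF_RECORD_MISC_CPUMODE_MASK;
}

/* Consumers look only at the cached field, never at header.misc. */
static const char *demo_cpumode_name(uint8_t cpumode)
{
	switch (cpumode) {
	case PERF_RECORD_MISC_KERNEL:		return "kernel";
	case PERF_RECORD_MISC_USER:		return "user";
	case PERF_RECORD_MISC_HYPERVISOR:	return "hypervisor";
	case PERF_RECORD_MISC_GUEST_KERNEL:	return "guest kernel";
	case PERF_RECORD_MISC_GUEST_USER:	return "guest user";
	default:				return "unknown";
	}
}
```

    With the value stored in the sample, a tool that derives ip and cpumode
    from tracepoint fields can overwrite both in that one place.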
     

18 Dec, 2015

10 commits

  • Adding the cpumask 'event update' event, that stores/transfers the
    cpumask for an event.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-25-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding name type 'event update' event, that stores/transfers the event's
    name. The event's name is stored within perf.data's EVENT_DESC feature,
    but we don't have it if we get the report data from a pipe.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-24-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding scale type 'event update' event, that stores/transfer
    events scale value. The PMU events can define the scale
    value which is used to multiply events data.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-23-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding unit type 'event update' event, that stores/transfers the event's unit
    name. The unit name is part of the perf stat output data.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-22-git-send-email-jolsa@kernel.org
    [ Rename __alloc() to __new() for consistency ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • It'll serve as a base event for additional event attributes details,
    that are not part of the attr event.

    At the moment this event is just a dummy one without any specific
    functionality. The type value will distinguish the update event details.
    It'll come in the following patches.

    The idea for this event is to be extensible for any update that the
    event might need in the future.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-21-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
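
    The extensible update event described above amounts to a record with a
    type discriminator that selects how the payload is interpreted. A
    minimal sketch, in which every demo_* and DEMO_* name and the fixed
    size payload are assumptions made for illustration rather than the
    layout used by the patches:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical discriminator values, one per kind of update. */
enum demo_update_type {
	DEMO_UPDATE_UNIT,
	DEMO_UPDATE_SCALE,
	DEMO_UPDATE_NAME,
};

/* Hypothetical update record: 'type' says which payload member is valid,
 * 'id' says which event the update applies to. */
struct demo_event_update {
	uint64_t type;
	uint64_t id;
	union {
		char   str[32];	/* unit or name */
		double scale;
	} data;
};

/* Hypothetical per-event state that the updates are applied to. */
struct demo_evsel {
	uint64_t id;
	char     name[32];
	char     unit[32];
	double   scale;
};

/* Apply one update; returns 0 on success, -1 for a foreign id or a type
 * this sketch does not handle. */
static int demo_apply_update(struct demo_evsel *evsel,
			     const struct demo_event_update *up)
{
	if (up->id != evsel->id)
		return -1;

	switch (up->type) {
	case DEMO_UPDATE_UNIT:
		snprintf(evsel->unit, sizeof(evsel->unit), "%s", up->data.str);
		return 0;
	case DEMO_UPDATE_SCALE:
		evsel->scale = up->data.scale;
		return 0;
	case DEMO_UPDATE_NAME:
		snprintf(evsel->name, sizeof(evsel->name), "%s", up->data.str);
		return 0;
	default:
		return -1;
	}
}
```

    Adding a further kind of update then means adding one enum value and
    one switch case, which is the extensibility the commit message is
    after.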
     
  • Introduce the perf_event__synthesize_stat_round function to
    synthesize a 'struct stat_round_event'.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-19-git-send-email-jolsa@kernel.org
    [ Renamed 'time' parameter to 'evtime' to fix build on older systems ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding the stat round event to be stored after each stat interval round,
    so that report tools (report/script) get notified and process interval
    data.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-18-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Introduce the perf_event__synthesize_stat function to synthesize a
    'struct stat_event'.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-16-git-send-email-jolsa@kernel.org
    [ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding a stat event to store a 'struct perf_counter_values' for a given
    event/cpu/thread.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-15-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Introducing the perf_event__read_stat_config function to read a struct
    perf_stat_config object data from a stat config event.

    Signed-off-by: Jiri Olsa
    Tested-by: Kan Liang
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1445784728-21732-14-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa