10 Mar, 2019

1 commit

  • …ux/kernel/git/acme/linux into perf/urgent

    Pull perf/core changes from Arnaldo Carvalho de Melo:

    perf bpf:

    Arnaldo Carvalho de Melo:

    - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
    tools such as 'bpftool map dump' can pretty print map keys and values.

    perf c2c:

    Jiri Olsa:

    - Fix report for empty NUMA node.

    perf diff:

    Jin Yao:

    - Support --time, --cpu, --pid and --tid filter options.

    perf probe:

    Arnaldo Carvalho de Melo:

    - Clarify error message about not finding kernel modules debuginfo.

    perf record:

    Jiri Olsa:

    - Fixup probing for max attr.precise_ip.

    perf trace:

    Arnaldo Carvalho de Melo:

    - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.

    perf annotate:

    Arnaldo Carvalho de Melo:

    - Calculate the max instruction name, align column to that, removing the
    hardcoded max 6 chars and cope with instructions with names longer than that,
    such as vpmovmskb, vpcmpeqb, etc.

    kernel:

    Song Liu:

    - Consider events with attr.bpf_event set as side-band.

    Gustavo A. R. Silva:

    - Mark expected switch fall-through in perf_event_parse_addr_filter().

    Libraries:

    Jiri Olsa:

    - Fix leaks and double frees on error paths.

    libtraceevent:

    Tony Jones:

    - Fix buffer overflow in arg_eval().

    python scripting:

    Tony Jones:

    - More python3 fixes.

    Trivial:

    Yang Wei:

    - Remove needless extra semicolon in clang C++ glue code.

    Intel PT/BTS:

    Adrian Hunter:

    - Improve auxtrace address filter error message when there is no DSO.

    - Fix divide by zero when TSC is not available.

    - Further improvements to the export to sqlite/postgresql python scripts
    and to the GUI sqlviewer, exporting 'parent_id' so that we enable
    the creation of call trees.

    Andi Kleen:

    - Generalize function to copy from thread addr space from intel-bts code.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

07 Mar, 2019

8 commits

  • Making sure the data->file.path is zeroed on perf_data__open error path
    and in perf_data__close, so we don't double free it in case someone calls
    it twice.
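
    A minimal sketch of the free-and-zero idea, using hypothetical,
    simplified stand-ins (struct data, data__open, data__close) rather than
    perf's real types. The simulate_error parameter exists only so the
    error path can be exercised:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for perf's data structures. */
struct data_file {
    char *path;    /* heap-allocated copy of the path */
};

struct data {
    struct data_file file;
};

/* Free the path and zero the pointer so that a second call is harmless. */
void data__close(struct data *data)
{
    free(data->file.path);    /* free(NULL) is a no-op */
    data->file.path = NULL;
}

/* Returns 0 on success, -1 on failure; path is left NULL on failure. */
int data__open(struct data *data, const char *path, int simulate_error)
{
    size_t len = strlen(path) + 1;

    data->file.path = malloc(len);
    if (!data->file.path)
        return -1;
    memcpy(data->file.path, path, len);

    if (simulate_error) {
        /* Error path: release and zero, same as in close. */
        data__close(data);
        return -1;
    }
    return 0;
}
```

    With the pointer zeroed, calling data__close() twice in a row only
    frees the memory once, since the second call sees NULL.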

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jonas Rabenstein
    Cc: Nageswara R Sastry
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We can't call perf_data__close and subsequently perf_session__delete,
    because it will call perf_data__close again and cause double free for
    data->file.path.

    $ perf report -i .
    incompatible file format (rerun with -v to learn more)
    free(): double free detected in tcache 2
    Aborted (core dumped)

    In fact we don't need to call perf_data__close at all, because by the
    time the goto out_close is reached, session->data is already initialized,
    so the perf_data__close call will be triggered from
    perf_session__delete.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jonas Rabenstein
    Cc: Nageswara R Sastry
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Fixes: 2d4f27999b88 ("perf data: Add global path holder")
    Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Currently we probe for precise_ip with user specified perf_event_attr,
    which might fail because of unsupported kernel features, which would get
    disabled during the open time anyway.

    Switch the probe to use a simple hw cycles event, so that the following
    record sets the proper precise_ip:

    # perf record -e cycles:P ls
    # perf evlist -v
    cycles:P: size: 112, ... precise_ip: 3, ...
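
    The probing can be sketched as a descending loop over precise_ip
    levels. This is only an illustration: try_open_fn and fake_try_open are
    hypothetical names, and the callback stands in for what would be a
    perf_event_open() attempt on a plain hardware cycles event in the real
    tool:

```c
#include <stdbool.h>

/*
 * Hypothetical callback: returns true if an event with the given
 * precise_ip level can be opened. A real implementation would try
 * perf_event_open() on a plain hardware cycles event.
 */
typedef bool (*try_open_fn)(int precise_ip, void *ctx);

/* Try the highest level first and step down until one opens. */
int probe_max_precise_ip(int start, try_open_fn try_open, void *ctx)
{
    int level;

    for (level = start; level > 0; level--) {
        if (try_open(level, ctx))
            return level;
    }
    return 0;    /* level 0 (no precision constraint) is the fallback */
}

/* Fake opener for illustration: supports levels up to *(int *)ctx. */
bool fake_try_open(int precise_ip, void *ctx)
{
    return precise_ip <= *(int *)ctx;
}
```

    Because the probe event carries none of the user's other attributes, an
    unsupported feature elsewhere in the user's perf_event_attr cannot make
    the probe fail.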

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jonas Rabenstein
    Cc: Nageswara R Sastry
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Read the caps/max_precise value and store it in struct perf_pmu to be
    used when setting the maximum precise_ip field in the following patch.
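
    Reading such a value amounts to parsing one integer from a small text
    file. A minimal sketch, where read_int_file is a hypothetical helper
    and not the function added by this patch:

```c
#include <stdio.h>

/*
 * Read a single integer from a text file such as a PMU's
 * caps/max_precise entry. Returns 0 and stores the value on success,
 * -1 if the file is missing or does not hold an integer.
 */
int read_int_file(const char *path, int *value)
{
    FILE *fp = fopen(path, "r");
    int ret = -1;

    if (!fp)
        return -1;
    if (fscanf(fp, "%d", value) == 1)
        ret = 0;
    fclose(fp);
    return ret;
}
```

    On an x86 system the file would likely be found under the PMU's sysfs
    directory, for example
    /sys/bus/event_source/devices/cpu/caps/max_precise. That path is an
    assumption here and may not exist on other architectures.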

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jonas Rabenstein
    Cc: Nageswara R Sastry
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We can't allocate he->srcline unconditionally, only when a new hist_entry
    is created. Moving he->srcline allocation into hist_entry__init
    function.

    Original-patch-by: Jonas Rabenstein
    Suggested-by: Namhyung Kim
    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Nageswara R Sastry
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding error path into hist_entry__init to unify error handling, so
    every new member does not need to free everything else.
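
    The pattern is the usual single-label cleanup. A minimal sketch with a
    hypothetical two-member struct entry (the real hist_entry has many more
    members, and dup_string is an illustrative helper):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for perf's struct hist_entry. */
struct entry {
    char *name;
    char *srcline;
};

/* Duplicate a string with malloc(); NULL on failure or NULL input. */
char *dup_string(const char *s)
{
    size_t len;
    char *copy;

    if (!s)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Initialise every member; on any failure jump to one shared label
 * that undoes whatever was already allocated.
 */
int entry__init(struct entry *e, const char *name, const char *srcline)
{
    e->name = NULL;
    e->srcline = NULL;

    e->name = dup_string(name);
    if (!e->name)
        goto err;

    e->srcline = dup_string(srcline);
    if (!e->srcline)
        goto err;

    return 0;

err:
    /* free(NULL) is a no-op, so members not yet set are safe here. */
    free(e->srcline);
    free(e->name);
    e->srcline = NULL;
    e->name = NULL;
    return -1;
}
```

    A member added later only needs its own allocation check plus one
    free() under the err label.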

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jonas Rabenstein
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Cc: Nageswara R Sastry
    Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add a utility function to fetch executable code. Convert one
    user over to it. There are more places doing that, but they
    do significantly different actions, so they are not
    easy to fit into a single library function.

    Committer changes:

    . No need to cast around, make 'buf' be a void pointer.

    . Rename it to thread__memcpy() to reflect the fact it is about copying
    a chunk of memory from a thread, i.e. from its address space.

    . No need to have it in a separate object file, move it to thread.[ch]

    . Check the return of map__load(), the original code didn't do it, but
    since we're moving this around, check that as well.

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • We were hardcoding '6' as the max instruction name, and we have lots
    that are longer than that, see the diff from two 'P' printed TUI
    annotations for a libc function that uses instructions with long names,
    such as 'vpmovmskb' with its 9 chars:

    --- __strcmp_avx2.annotation.before 2019-03-06 16:31:39.368020425 -0300
    +++ __strcmp_avx2.annotation 2019-03-06 16:32:12.079450508 -0300
    @@ -2,284 +2,284 @@
    Event: cycles:ppp

    Percent endbr64
    - 0.10 mov %edi,%eax
    + 0.10 mov %edi,%eax
    - xor %edx,%edx
    + xor %edx,%edx
    - 3.54 vpxor %ymm7,%ymm7,%ymm7
    + 3.54 vpxor %ymm7,%ymm7,%ymm7
    - or %esi,%eax
    + or %esi,%eax
    - and $0xfff,%eax
    + and $0xfff,%eax
    - cmp $0xf80,%eax
    + cmp $0xf80,%eax
    - ↓ jg 370
    + ↓ jg 370
    - 27.07 vmovdqu (%rdi),%ymm1
    + 27.07 vmovdqu (%rdi),%ymm1
    - 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0
    + 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0
    - 2.15 vpminub %ymm1,%ymm0,%ymm0
    + 2.15 vpminub %ymm1,%ymm0,%ymm0
    - 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0
    + 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0
    - 0.43 vpmovmskb %ymm0,%ecx
    + 0.43 vpmovmskb %ymm0,%ecx
    - 1.53 test %ecx,%ecx
    + 1.53 test %ecx,%ecx
    - ↓ je b0
    + ↓ je b0
    - 5.26 tzcnt %ecx,%edx
    + 5.26 tzcnt %ecx,%edx
    - 18.40 movzbl (%rdi,%rdx,1),%eax
    + 18.40 movzbl (%rdi,%rdx,1),%eax
    - 7.09 movzbl (%rsi,%rdx,1),%edx
    + 7.09 movzbl (%rsi,%rdx,1),%edx
    - 3.34 sub %edx,%eax
    + 3.34 sub %edx,%eax
    2.37 vzeroupper
    ← retq
    nop
    - 50: tzcnt %ecx,%edx
    + 50: tzcnt %ecx,%edx
    - movzbl 0x20(%rdi,%rdx,1),%eax
    + movzbl 0x20(%rdi,%rdx,1),%eax
    - movzbl 0x20(%rsi,%rdx,1),%edx
    + movzbl 0x20(%rsi,%rdx,1),%edx
    - sub %edx,%eax
    + sub %edx,%eax
    vzeroupper
    ← retq
    - data16 nopw %cs:0x0(%rax,%rax,1)
    + data16 nopw %cs:0x0(%rax,%rax,1)
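
    The approach can be sketched as two small steps: find the longest
    instruction name, then pad every name to that width. The helper names
    below are hypothetical and the sketch is not the annotate code itself:

```c
#include <stdio.h>
#include <string.h>

/* Return the length of the longest instruction name in the array. */
size_t max_ins_name_len(const char *const names[], size_t n)
{
    size_t i, max = 0;

    for (i = 0; i < n; i++) {
        size_t len = strlen(names[i]);

        if (len > max)
            max = len;
    }
    return max;
}

/* Format "name operands" with the name left-aligned to 'width' columns. */
int format_ins(char *buf, size_t size, size_t width,
               const char *name, const char *operands)
{
    return snprintf(buf, size, "%-*s %s", (int)width, name, operands);
}
```

    With 'vpmovmskb' in the list the width becomes 9, so the operands of
    short instructions such as 'mov' start in the same column as those of
    the long ones.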

    Reported-by: Travis Downs
    LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

06 Mar, 2019

2 commits

  • Pull perf updates from Ingo Molnar:
    "Lots of tooling updates - too many to list, here's a few highlights:

    - Various subcommand updates to 'perf trace', 'perf report', 'perf
    record', 'perf annotate', 'perf script', 'perf test', etc.

    - CPU and NUMA topology and affinity handling improvements,

    - HW tracing and HW support updates:
    - Intel PT updates
    - ARM CoreSight updates
    - vendor HW event updates

    - BPF updates

    - Tons of infrastructure updates, both on the build system and the
    library support side

    - Documentation updates.

    - ... and lots of other changes, see the changelog for details.

    Kernel side updates:

    - Tighten up kprobes blacklist handling, reduce the number of places
    where developers can install a kprobe and hang/crash the system.

    - Fix/enhance vma address filter handling.

    - Various PMU driver updates, small fixes and additions.

    - refcount_t conversions

    - BPF updates

    - error code propagation enhancements

    - misc other changes"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
    perf script python: Add Python3 support to syscall-counts-by-pid.py
    perf script python: Add Python3 support to syscall-counts.py
    perf script python: Add Python3 support to stat-cpi.py
    perf script python: Add Python3 support to stackcollapse.py
    perf script python: Add Python3 support to sctop.py
    perf script python: Add Python3 support to powerpc-hcalls.py
    perf script python: Add Python3 support to net_dropmonitor.py
    perf script python: Add Python3 support to mem-phys-addr.py
    perf script python: Add Python3 support to failed-syscalls-by-pid.py
    perf script python: Add Python3 support to netdev-times.py
    perf tools: Add perf_exe() helper to find perf binary
    perf script: Handle missing fields with -F +..
    perf data: Add perf_data__open_dir_data function
    perf data: Add perf_data__(create_dir|close_dir) functions
    perf data: Fail check_backup in case of error
    perf data: Make check_backup work over directories
    perf tools: Add rm_rf_perf_data function
    perf tools: Add pattern name checking to rm_rf
    perf tools: Add depth checking to rm_rf
    perf data: Add global path holder
    ...

    Linus Torvalds
     
  • Delete a superfluous semicolon in getBPFObjectFromModule().

    Signed-off-by: Yang Wei
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Yang Wei
    Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Yang Wei
     

02 Mar, 2019

3 commits

  • The call_path can be used to find the parent symbol for a call but not
    the exact parent call. To do that add parent_id to the call_return
    export. This enables the creation of a call tree from the exported data.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: https://lkml.kernel.org/n/tip-6j7tzdxo67cox6kan7k22oo6@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • When TSC is not available, "timeless" decoding is used but a divide by
    zero occurs if perf_time_to_tsc() is called.

    Ensure the divisor is not zero.
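
    A minimal sketch of such a guard, assuming the usual mult/shift time
    conversion. struct tsc_conv and time_to_tsc_checked are hypothetical
    names, and returning 0 when no conversion data exists is a choice made
    for this sketch; the actual fix may instead skip the call:

```c
#include <stdint.h>

/* Hypothetical subset of the time conversion parameters. */
struct tsc_conv {
    uint16_t time_shift;
    uint32_t time_mult;
    uint64_t time_zero;
};

/*
 * Convert perf time (ns) to TSC ticks. Returns 0 when time_mult is
 * zero, i.e. when no TSC conversion information is available.
 */
uint64_t time_to_tsc_checked(uint64_t ns, const struct tsc_conv *tc)
{
    uint64_t t, quot, rem;

    if (!tc->time_mult)
        return 0;    /* guard: avoid dividing by zero */

    t = ns - tc->time_zero;
    quot = t / tc->time_mult;
    rem = t % tc->time_mult;
    return (quot << tc->time_shift) +
           (rem << tc->time_shift) / tc->time_mult;
}
```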

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: stable@vger.kernel.org # v4.9+
    Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The message does not indicate the possibility that the symbol is not
    found because the file does not exist.

    Before:

    $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls
    Symbol 'strcmp' not found.
    Note that symbols must be functions.
    Failed to parse address filter: 'filter strcmp / strcpy @ foo '
    Filter format is: filter|start|stop|tracestop [/ ] [@]
    Where multiple filters are separated by space or comma.

    After:

    $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls
    File 'foo' not found or has no symbols.
    Symbol 'strcmp' not found.
    Note that symbols must be functions.
    Failed to parse address filter: 'filter strcmp / strcpy @ foo '
    Filter format is: filter|start|stop|tracestop [/ ] [@]
    Where multiple filters are separated by space or comma.

    Reported-by: Alexander Shishkin
    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Link: https://lkml.kernel.org/n/tip-dvngzxd0jkplzw1ary69dilb@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

01 Mar, 2019

3 commits

  • Jiri points out that we don't need any time checking and time string
    parsing if the --time option is not set. That makes sense.

    This patch refactors the time range parsing code, moves the duplicated
    code from perf report and perf script to time_utils, and checks whether
    the --time option is set before parsing the time string. No logic change
    is expected, so the usage of --time is the same as before.

    For example:

    Select the first and second 10% time slices:
    perf report --time 10%/1,10%/2
    perf script --time 10%/1,10%/2

    Select the slices from 0% to 10% and from 30% to 40%:
    perf report --time 0%-10%,30%-40%
    perf script --time 0%-10%,30%-40%

    Select the time slices from timestamp 3971 to 3973:
    perf report --time 3971,3973
    perf script --time 3971,3973
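
    To make the percent notation concrete, here is a rough sketch of how a
    spec such as 10%/2 could map to timestamps. percent_slice is a
    hypothetical helper that assumes whole-number percentages and integer
    timestamps; perf's own parser is more general:

```c
#include <stdint.h>

/*
 * Compute the bounds of the idx-th slice (1-based) that is 'percent'
 * wide, for a trace spanning [first, last].
 * Returns 0 on success, -1 if the slice falls outside the trace.
 */
int percent_slice(uint64_t first, uint64_t last, unsigned int percent,
                  unsigned int idx, uint64_t *start, uint64_t *end)
{
    uint64_t width;

    if (last < first || percent == 0 || idx == 0)
        return -1;
    if (percent > 100 || idx > 100 || percent * idx > 100)
        return -1;

    /* Width of one slice, then offset by the slices that precede it. */
    width = (last - first) * percent / 100;
    *start = first + width * (idx - 1);
    *end = *start + width;
    return 0;
}
```

    For a trace running from 1000 to 2000, 10%/1 gives 1000 to 1100 and
    10%/2 gives 1100 to 1200.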

    Committer testing:

    Using the above examples, check before and after to see if it remains
    the same:

    $ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null
    [ perf record: Woken up 3 times to write data ]
    [ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ]
    $
    $ perf report --time 10%/1,10%/2 > /tmp/report.before.1
    $ perf script --time 10%/1,10%/2 > /tmp/script.before.1
    $ perf report --time 0%-10%,30%-40% > /tmp/report.before.2
    $ perf script --time 0%-10%,30%-40% > /tmp/script.before.2
    $ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3
    $ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3

    For example, the 3rd test produces this slice:

    $ cat /tmp/script.before.3
    cat 3147 180457.375844: 2143 cycles:uppp: 7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so)
    cat 3147 180457.375986: 2245 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
    cat 3147 180457.376012: 2164 cycles:uppp: 7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so)
    cat 3147 180457.376140: 2921 cycles:uppp: 558b70f3a554 [unknown] (/usr/bin/cat)
    cat 3147 180457.376296: 2844 cycles:uppp: 7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so)
    cat 3147 180457.376431: 2717 cycles:uppp: 558b70f3b0ca [unknown] (/usr/bin/cat)
    cat 3147 180457.376667: 2630 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat)
    cat 3147 180457.376795: 2442 cycles:uppp: 7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so)
    cat 3147 180457.376927: 2376 cycles:uppp: ffffffff9aa00163 [unknown] ([unknown])
    cat 3147 180457.376954: 2307 cycles:uppp: 7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so)
    cat 3147 180457.377116: 3091 cycles:uppp: 7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so)
    cat 3147 180457.377362: 2945 cycles:uppp: 558b70f3a3b0 [unknown] (/usr/bin/cat)
    cat 3147 180457.377517: 2727 cycles:uppp: 558b70f3a9aa [unknown] (/usr/bin/cat)
    $

    Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the
    above chunk translates into this 'perf report' output:

    $ cat /tmp/report.before.3
    # To display the perf.data header info, please use --header/--header-only options.
    #
    #
    # Total Lost Samples: 0
    #
    # Samples: 13 of event 'cycles:uppp' (time slices: 180457.375844,180457.377717)
    # Event count (approx.): 33552
    #
    # Overhead Command Shared Object Symbol
    # ........ ....... ................ ......................
    #
    17.69% cat libc-2.28.so [.] malloc
    14.53% cat cat [.] 0x000000000000586e
    13.33% cat libc-2.28.so [.] _int_malloc
    8.78% cat cat [.] 0x00000000000023b0
    8.71% cat cat [.] 0x0000000000002554
    8.13% cat cat [.] 0x00000000000029aa
    8.10% cat cat [.] 0x00000000000030ca
    7.28% cat libc-2.28.so [.] read
    7.08% cat [unknown] [k] 0xffffffff9aa00163
    6.39% cat libc-2.28.so [.] cfree@GLIBC_2.2.5

    #
    # (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
    #
    $

    Now let's see after applying this patch, nothing should change:

    $ perf report --time 10%/1,10%/2 > /tmp/report.after.1
    $ perf script --time 10%/1,10%/2 > /tmp/script.after.1
    $ perf report --time 0%-10%,30%-40% > /tmp/report.after.2
    $ perf script --time 0%-10%,30%-40% > /tmp/script.after.2
    $ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3
    $ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3
    $ diff -u /tmp/report.before.1 /tmp/report.after.1
    $ diff -u /tmp/script.before.1 /tmp/script.after.1
    $ diff -u /tmp/report.before.2 /tmp/report.after.2
    --- /tmp/report.before.2 2019-03-01 11:01:53.526094883 -0300
    +++ /tmp/report.after.2 2019-03-01 11:09:18.231770467 -0300
    @@ -352,5 +352,5 @@

    #
    -# (Tip: Generate a script for your data: perf script -g )
    +# (Tip: Treat branches as callchains: perf report --branch-history)
    #
    $ diff -u /tmp/script.before.2 /tmp/script.after.2
    $ diff -u /tmp/report.before.3 /tmp/report.after.3
    --- /tmp/report.before.3 2019-03-01 11:03:08.890045588 -0300
    +++ /tmp/report.after.3 2019-03-01 11:09:40.660224002 -0300
    @@ -22,5 +22,5 @@

    #
    -# (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
    +# (Tip: List events using substring match: perf list )
    #
    $ diff -u /tmp/script.before.3 /tmp/script.after.3
    $

    Cool, just the 'perf report' tips changed, QED.

    Signed-off-by: Jin Yao
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jin Yao
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jin Yao
     
  • For historical reasons the helper to loop over maps in an object
    is called bpf_map__for_each while it really should be called
    bpf_object__for_each_map. Rename and add a correctly named
    define for backward compatibility.

    Switch all in-tree users to the correct name (Quentin).
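
    The rename-with-alias technique can be sketched with toy types. All
    toy_* names below are hypothetical stand-ins and not libbpf's API:

```c
#include <stddef.h>

/* Hypothetical toy types standing in for libbpf's object and map. */
struct toy_map {
    const char *name;
};

struct toy_object {
    struct toy_map *maps;
    size_t nr_maps;
};

/* Return the map after 'prev', or the first map when 'prev' is NULL. */
struct toy_map *toy_map__next(struct toy_map *prev, struct toy_object *obj)
{
    size_t idx;

    if (!obj->nr_maps)
        return NULL;
    if (!prev)
        return &obj->maps[0];
    idx = (size_t)(prev - obj->maps) + 1;
    return idx < obj->nr_maps ? &obj->maps[idx] : NULL;
}

/* New, correctly named iterator. */
#define toy_object__for_each_map(pos, obj)          \
    for ((pos) = toy_map__next(NULL, (obj));        \
         (pos) != NULL;                             \
         (pos) = toy_map__next((pos), (obj)))

/* Old name kept as an alias so existing callers still compile. */
#define toy_map__for_each toy_object__for_each_map

/* Count maps through the new name. */
size_t count_maps_new(struct toy_object *obj)
{
    struct toy_map *pos;
    size_t n = 0;

    toy_object__for_each_map(pos, obj)
        n++;
    return n;
}

/* Count maps through the old, aliased name. */
size_t count_maps_old(struct toy_object *obj)
{
    struct toy_map *pos;
    size_t n = 0;

    toy_map__for_each(pos, obj)
        n++;
    return n;
}
```

    Both counters expand to the same loop, so out-of-tree code using the
    old name keeps building while in-tree users move to the new one.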

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • 'perf probe' supports using just the kernel module name, but that will
    work only when the module is loaded, or using the full pathname to the
    file with the DWARF debug info, but the warning was cryptic:

    Before:

    # perf probe -m cls_flower -L fl_change
    Failed to find the path for cls_flower: No such file or directory
    Error: Failed to show lines.
    #

    After:

    # perf probe -m cls_flower -L fl_change
    Module cls_flower is not loaded, please specify its full path name.
    Error: Failed to show lines.
    # perf probe -m /lib/modules/5.0.0-rc7+/kernel/net/sched/cls_flower.ko -L fl_change | head -7

    0 static int fl_change(struct net *net, struct sk_buff *in_skb,
    struct tcf_proto *tp, unsigned long base,
    u32 handle, struct nlattr **tca,
    void **arg, bool ovr, struct netlink_ext_ack *extack)
    4 {
    5 struct cls_fl_head *head = rtnl_dereference(tp->root);
    #

    The behaviour doesn't change when the module is loaded:

    # modprobe cls_flower
    # perf probe -m cls_flower -L fl_change | head -7

    0 static int fl_change(struct net *net, struct sk_buff *in_skb,
    struct tcf_proto *tp, unsigned long base,
    u32 handle, struct nlattr **tca,
    void **arg, bool ovr, struct netlink_ext_ack *extack)
    4 {
    5 struct cls_fl_head *head = rtnl_dereference(tp->root);
    #

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Marcelo Ricardo Leitner
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-q4njvk9mshra00jacqjbzfn5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

25 Feb, 2019

8 commits

  • Also convert one existing user.

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224153722.27020-9-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Add perf_data__open_dir_data to open files inside 'struct perf_data'
    path directory:

    static int perf_data__open_dir(struct perf_data *data);

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-10-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add perf_data__create_dir() to create nr files inside 'struct perf_data'
    path directory:

    int perf_data__create_dir(struct perf_data *data, int nr);

    and function to close that data:

    void perf_data__close_dir(struct perf_data *data);

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-9-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • And display the error message from removing the old data file:

    $ perf record ls
    Can't remove old data: Permission denied (perf.data.old)
    Perf session creation failed.

    $ perf record ls
    Can't remove old data: Unknown file found (perf.data.old)
    Perf session creation failed.

    Not sure how to make the rename fail (after we successfully remove the
    destination file/dir) to show the message, anyway let's have it there.

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-8-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Change check_backup() to call rm_rf_perf_data() instead of unlink() to
    work over directory paths.

    Also move the call earlier in the code, before we fork for file/dir, so
    it can also back up directory data.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-7-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add rm_rf_perf_data() to remove perf.data, including when it is a
    directory, checking that it holds only the expected files and no other
    directories inside.

    Signed-off-by: Jiri Olsa
    Suggested-by: Andi Kleen
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add pattern argument to rm_rf_depth() (and rename it to rm_rf_depth_pat())
    to specify the name pattern files need to match inside the directory.

    The function fails if it finds a file that does not match the pattern.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding depth argument to rm_rf (and renaming it to rm_rf_depth) to
    specify the depth we will go searching for files to remove.

    It will be used to specify a single depth for perf.data directory removal
    in the following patch.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190224190656.30163-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

23 Feb, 2019

3 commits

  • Add a 'path' member to 'struct perf_data'. It will keep the configured
    path for the data (const char *). The path in struct perf_data_file is
    now dynamically allocated (duped) from it.

    This scheme is useful/used in the following patches, where struct
    perf_data::path holds the 'configured' directory path and struct
    perf_data_file::path holds the allocated path for specific files.

    Also it actually makes the code a little simpler.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
    [ Fixup data-convert-bt.c missing conversion ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We are about to add support for multiple files, so we need each file to
    keep its size.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • x86 retpoline functions pollute the call graph by showing up everywhere
    there is an indirect branch, but they do not really mean anything. Make
    changes so that the default retpoline functions will no longer appear in
    the call graph. Note this only affects the call graph, since all the
    original branches are left unchanged.

    This does not handle function return thunks, nor is there any
    improvement for the handling of inline thunks or extern thunks.

    Example:

    $ cat simple-retpoline.c
    __attribute__((noinline)) int bar(void)
    {
    return -1;
    }

    int foo(void)
    {
    return bar() + 1;
    }

    __attribute__((indirect_branch("thunk"))) int main()
    {
    int (*volatile fn)(void) = foo;

    fn();
    return fn();
    }
    $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
    $ objdump -d simple-retpoline

    0000000000001040 :
    1040: 48 83 ec 18 sub $0x18,%rsp
    1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170
    104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
    1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
    1055: e8 1f 01 00 00 callq 1179
    105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
    105f: 48 83 c4 18 add $0x18,%rsp
    1063: e9 11 01 00 00 jmpq 1179

    0000000000001160 :
    1160: b8 ff ff ff ff mov $0xffffffff,%eax
    1165: c3 retq

    0000000000001170 :
    1170: e8 eb ff ff ff callq 1160
    1175: 83 c0 01 add $0x1,%eax
    1178: c3 retq
    0000000000001179 :
    1179: e8 07 00 00 00 callq 1185
    117e: f3 90 pause
    1180: 0f ae e8 lfence
    1183: eb f9 jmp 117e
    1185: 48 89 04 24 mov %rax,(%rsp)
    1189: c3 retq

    $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
    $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
    2019-01-08 14:03:37.851655 Creating database...
    2019-01-08 14:03:37.863256 Writing records...
    2019-01-08 14:03:38.069750 Adding indexes
    2019-01-08 14:03:38.078799 Done
    $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

    Before:

    main
    -> __x86_indirect_thunk_rax
    -> __x86_indirect_thunk_rax
    -> foo
    -> bar

    After:

    main
    -> foo
    -> bar

    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/20190109091835.5570-7-adrian.hunter@intel.com
    [ Remove (sym->name != NULL) test, this is not a pointer and breaks the build with clang version 7.0.1 (Fedora 7.0.1-2.fc30) ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

22 Feb, 2019

2 commits

  • Improve thread_stack__no_call_return() to better handle 'returns' that
    do not match the stack i.e. 'no call'. See code comments for details.
    The example below shows how retpolines are affected:

    Example:

    $ cat simple-retpoline.c
    __attribute__((noinline)) int bar(void)
    {
    return -1;
    }

    int foo(void)
    {
    return bar() + 1;
    }

    __attribute__((indirect_branch("thunk"))) int main()
    {
    int (*volatile fn)(void) = foo;

    fn();
    return fn();
    }
    $ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
    $ objdump -d simple-retpoline

    0000000000001040 :
    1040: 48 83 ec 18 sub $0x18,%rsp
    1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170
    104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
    1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
    1055: e8 1f 01 00 00 callq 1179
    105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
    105f: 48 83 c4 18 add $0x18,%rsp
    1063: e9 11 01 00 00 jmpq 1179

    0000000000001160 :
    1160: b8 ff ff ff ff mov $0xffffffff,%eax
    1165: c3 retq

    0000000000001170 :
    1170: e8 eb ff ff ff callq 1160
    1175: 83 c0 01 add $0x1,%eax
    1178: c3 retq
    0000000000001179 :
    1179: e8 07 00 00 00 callq 1185
    117e: f3 90 pause
    1180: 0f ae e8 lfence
    1183: eb f9 jmp 117e
    1185: 48 89 04 24 mov %rax,(%rsp)
    1189: c3 retq

    $ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
    $ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
    2019-01-08 14:03:37.851655 Creating database...
    2019-01-08 14:03:37.863256 Writing records...
    2019-01-08 14:03:38.069750 Adding indexes
    2019-01-08 14:03:38.078799 Done
    $ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db

    Before:

    main
    -> __x86_indirect_thunk_rax
    -> __x86_indirect_thunk_rax
    -> __x86_indirect_thunk_rax
    -> bar

    After:

    main
    -> __x86_indirect_thunk_rax
    -> __x86_indirect_thunk_rax
    -> foo
    -> bar

    Committer testing:

    Choose "Reports", then "Context-Sensitive Call Graph", and then keep
    expanding:

    Before:

    simple-retpolin
    PID:PID
    _start
    _start
    __libc_start_main
    main
    __x86_indirect_thunk_rax
    __x86_indirect_thunk_rax
    bar

    After:

    Remove the "simple-retpoline.db" file, rerun the 'perf script' line
    to regenerate the .db file, and run exported-sql-viewer.py again. The
    tree is the same all the way down to 'main'; from there, including 'main':

    main
    __x86_indirect_thunk_rax
    __x86_indirect_thunk_rax
    foo
    bar

    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/20190109091835.5570-6-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
    ("perf annotate: No need to calculate notes->start twice") removed the
    notes->start assignment in symbol__calc_lines(). As a result,
    find_address_in_section(), reached from symbol__tty_annotate(), fails
    because a2l->addr is wrong, so the annotate summary does not report source
    line numbers correctly.

    Before fix:

    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
    void hotspot_1(void)
    {
            volatile int i;

            for (i = 0; i < 0x10000000; i++);
            for (i = 0; i < 0x10000000; i++);
            for (i = 0; i < 0x10000000; i++);
    }

    int main(void)
    {
            hotspot_1();

            return 0;
    }
    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1

    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

    Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
    ----------------------------------------------

    19.30 common_while_1[32]
    19.03 common_while_1[4e]
    19.01 common_while_1[16]
    5.04 common_while_1[13]
    4.99 common_while_1[4b]
    4.78 common_while_1[2c]
    4.77 common_while_1[10]
    4.66 common_while_1[2f]
    4.59 common_while_1[51]
    4.59 common_while_1[35]
    4.52 common_while_1[19]
    4.20 common_while_1[56]
    0.51 common_while_1[48]
    Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
    -----------------------------------------------------------------------------------------------------------------
    :
    :
    :
    : Disassembly of section .text:
    :
    : 00000000000005fa <hotspot_1>:
    : hotspot_1():
    : void hotspot_1(void)
    : {
    0.00 : 5fa: push %rbp
    0.00 : 5fb: mov %rsp,%rbp
    : volatile int i;
    :
    : for (i = 0; i < 0x10000000; i++);
    0.00 : 5fe: movl $0x0,-0x4(%rbp)
    0.00 : 605: jmp 610
    0.00 : 607: mov -0x4(%rbp),%eax
    common_while_1[10] 4.77 : 60a: add $0x1,%eax
    common_while_1[13] 5.04 : 60d: mov %eax,-0x4(%rbp)
    common_while_1[16] 19.01 : 610: mov -0x4(%rbp),%eax
    common_while_1[19] 4.52 : 613: cmp $0xfffffff,%eax
    0.00 : 618: jle 607
    : for (i = 0; i < 0x10000000; i++);
    ...

    After fix:

    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
    liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

    Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
    ----------------------------------------------

    33.34 common_while_1.c:5
    33.34 common_while_1.c:6
    33.32 common_while_1.c:7
    Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
    -----------------------------------------------------------------------------------------------------------------
    :
    :
    :
    : Disassembly of section .text:
    :
    : 00000000000005fa <hotspot_1>:
    : hotspot_1():
    : void hotspot_1(void)
    : {
    0.00 : 5fa: push %rbp
    0.00 : 5fb: mov %rsp,%rbp
    : volatile int i;
    :
    : for (i = 0; i < 0x10000000; i++);
    0.00 : 5fe: movl $0x0,-0x4(%rbp)
    0.00 : 605: jmp 610
    0.00 : 607: mov -0x4(%rbp),%eax
    common_while_1.c:5 4.70 : 60a: add $0x1,%eax
    4.89 : 60d: mov %eax,-0x4(%rbp)
    common_while_1.c:5 19.03 : 610: mov -0x4(%rbp),%eax
    common_while_1.c:5 4.72 : 613: cmp $0xfffffff,%eax
    0.00 : 618: jle 607
    : for (i = 0; i < 0x10000000; i++);
    0.00 : 61a: movl $0x0,-0x4(%rbp)
    0.00 : 621: jmp 62c
    0.00 : 623: mov -0x4(%rbp),%eax
    common_while_1.c:6 4.54 : 626: add $0x1,%eax
    4.73 : 629: mov %eax,-0x4(%rbp)
    common_while_1.c:6 19.54 : 62c: mov -0x4(%rbp),%eax
    common_while_1.c:6 4.54 : 62f: cmp $0xfffffff,%eax
    ...

    Signed-off-by: Wei Li
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Jin Yao
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice")
    Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wei Li
     

21 Feb, 2019

5 commits

  • Let rm_rf() remove a file if it's provided by path, not just
    directories.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190220122800.864-7-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • So it does not screw up single -v verbose output.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190220122800.864-6-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add a missing newline to the pr_debug() call in perf_event__synthesize_bpf_events(),
    so that the error message does not garble the verbose output.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Song Liu
    Link: http://lkml.kernel.org/r/20190220122800.864-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Force sample_type setup for slave events in group leader sessions.

    We don't get samples for slave events; we create them when delivering
    the group leader sample. Make the slave events follow the master's
    sample_type to simplify reporting.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190220122800.864-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • There's no reason to deliver a sample with zero period. It means there
    was no value for the slave event since its last group leader sample.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190220122800.864-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
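
A minimal sketch of the zero-period rule, assuming each member's period is the difference between its current counter value and the value seen at the previous group leader sample. The struct and function names are hypothetical and do not come from the perf sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-member state: the counter value seen at the previous leader sample. */
struct member_sketch {
	uint64_t last_value;
};

/*
 * Walk the counter values read with one group leader sample and compute a
 * period for each member as the delta since the previous leader sample.
 * Members whose delta is zero are skipped. Returns how many samples would
 * be delivered; periods[] receives the delta per member (0 if skipped).
 */
static size_t deliver_group_sketch(struct member_sketch *members,
				   const uint64_t *values,
				   uint64_t *periods, size_t nr)
{
	size_t i, delivered = 0;

	for (i = 0; i < nr; i++) {
		uint64_t period = values[i] - members[i].last_value;

		members[i].last_value = values[i];
		periods[i] = period;

		/* Nothing was counted for this member: no sample to deliver. */
		if (period == 0)
			continue;

		delivered++;
	}
	return delivered;
}
```

With two members whose counters went from 100 to 150 and from 50 to 50, only the first one yields a sample; the second is dropped because its period is zero.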
     

20 Feb, 2019

1 commit

  • At some point I'll suggest moving this to libbpf; for now I'll
    experiment with ways to dump BPF maps set by events in 'perf trace',
    starting with a very basic dumper for the current very limited needs
    of the augmented_raw_syscalls code: dumping booleans.

    Having functions that apply to the map keys and values and do table
    lookup in things like syscall id to string tables should come next.

    Cc: Adrian Hunter
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Jiri Olsa
    Cc: Martin KaFai Lau
    Cc: Namhyung Kim
    Cc: Yonghong Song
    Link: https://lkml.kernel.org/n/tip-lz14w0esqyt1333aon05jpwc@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
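
To illustrate the idea of a boolean dumper with a key-to-string lookup, here is a small sketch. It assumes the map contents have already been copied into a plain array indexed by key, so no BPF or libbpf call appears in it. All names are hypothetical, and the three-entry table (x86_64 syscall numbers 0 to 2) only stands in for a real syscall id to string table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative id-to-name table; the three entries are x86_64 syscall numbers. */
static const char *syscall_name_sketch(unsigned int id)
{
	static const char *const names[] = { "read", "write", "open" };

	return id < sizeof(names) / sizeof(names[0]) ? names[id] : NULL;
}

/*
 * Print one line per key whose boolean value is set, using the name table
 * when it knows the key and the raw number otherwise. 'values' stands in
 * for map contents already copied into memory; no BPF call is made here.
 * Returns the number of entries printed.
 */
static size_t dump_bool_map_sketch(FILE *fp, const bool *values, size_t nr)
{
	size_t i, printed = 0;

	for (i = 0; i < nr; i++) {
		const char *name;

		if (!values[i])
			continue;

		name = syscall_name_sketch((unsigned int)i);
		if (name != NULL)
			fprintf(fp, "%s: true\n", name);
		else
			fprintf(fp, "%zu: true\n", i);
		printed++;
	}
	return printed;
}
```

Keeping the key formatter separate from the walk over the values is what would let the same loop serve other key types later, which is the direction the commit message points at.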
     

19 Feb, 2019

4 commits

  • We can't assume inlined symbols with the same name are equal, because
    their address ranges may differ. Otherwise symbols with different
    addresses are shadowed when added to the hist entry, which leads to an
    ERANGE error when the symbol address is checked during sample parsing:
    the addr must be within the range [sym.start, sym.end].

    The error message is like: "0x36aea60 [0x8]: failed to process type: 68".

    The second parameter of symbol__new() is the length of the fake symbol
    for the inline frame, which is the subtraction of the end and start
    address of base_sym.

    Signed-off-by: He Kuang
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Fixes: aa441895f7b4 ("perf report: Compare symbol name for inlined frames when sorting")
    Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    He Kuang
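
The reasoning above can be sketched with a simplified symbol type. The names below are hypothetical, and the equality test shown (identical name and identical range) is one conservative way to express the idea; the actual patch may use a different range test.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a symbol: [start, end] plus a name. */
struct sym_sketch {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* Length handed to the fake inline symbol: end minus start of the base symbol. */
static uint64_t sym_sketch__len(const struct sym_sketch *base)
{
	return base->end - base->start;
}

/* The range check from the commit text: addr must lie in [start, end]. */
static bool sym_sketch__contains(const struct sym_sketch *sym, uint64_t addr)
{
	return addr >= sym->start && addr <= sym->end;
}

/*
 * Two inline frames are only treated as the same symbol when both the name
 * and the address range match; the name alone is not enough.
 */
static bool inline_sym_sketch__equal(const struct sym_sketch *a,
				     const struct sym_sketch *b)
{
	return a->start == b->start && a->end == b->end &&
	       strcmp(a->name, b->name) == 0;
}
```

If two frames named `inl` cover 0x1000-0x1010 and 0x2000-0x2010, comparing by name alone merges them, and a sample at 0x2008 then fails the range check against the first symbol. Comparing the range as well keeps them apart.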
     
  • Use sysfs__mountpoint() when reading sysfs files to obtain cpu/numa
    topologies.

    Also use scnprintf instead of sprintf as suggested by Namhyung.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190219095815.15931-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
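
For readers unfamiliar with the difference, the sketch below shows scnprintf()-style semantics built on the standard vsnprintf(): the return value is the number of characters actually stored, never the length the output would have had. `scnprintf_sketch()` and `node_cpulist_path()` are hypothetical names. The path layout is the usual Linux sysfs one, and `mnt` stands in for whatever a mountpoint lookup such as sysfs__mountpoint() returned.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Sketch of scnprintf()-style semantics on top of vsnprintf(): return the
 * number of characters actually stored in 'buf' (excluding the trailing
 * NUL), never the length the output "would have had".
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	/* Clamp to what fits, so callers can safely advance by the result. */
	return (size_t)n < size ? n : (int)(size - 1);
}

/*
 * Build "<mnt>/devices/system/node/node<N>/cpulist" into 'buf', using the
 * mountpoint passed in rather than a hardcoded "/sys".
 */
static int node_cpulist_path(char *buf, size_t size, const char *mnt, int node)
{
	return scnprintf_sketch(buf, size,
				"%s/devices/system/node/node%d/cpulist",
				mnt, node);
}
```

Unlike sprintf(), this cannot write past the end of the buffer, and unlike snprintf(), the returned length can be used directly as an offset for appending without a separate truncation check.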
     
  • Add the numa_topology object to return the list of numa nodes together
    with their cpus. It will replace the numa code in header.c and will be
    used from 'perf record' in the following patches.

    Add the following interface functions to load numa details:

    struct numa_topology *numa_topology__new(void);
    void numa_topology__delete(struct numa_topology *tp);

    And replace the current (copied) local interface, with no functional
    changes.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190219095815.15931-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Make struct cpu_topo global and rename it to 'struct cpu_topology', so
    that it can be used from the 'perf record' command in the following
    patches.

    Add the following interface functions to load/free cpu topology details:

    struct cpu_topology *cpu_topology__new(void);
    void cpu_topology__delete(struct cpu_topology *tp);

    Move it to a separate source file cputopo.c together with numa related
    object in the following patches.

    No functional change, the new interface will be used in upcoming changes.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190219095815.15931-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa