18 Mar, 2018

1 commit

  • commit de19e5c3c51fdb1ff20d0f61d099db902ff7494b upstream.

    trigger_on() means that the trigger is available but not ready, however
    trigger_on() was making it ready. That can segfault if the signal comes
    before trigger_ready(). e.g. (USR2 signal delivery not shown)

    $ perf record -e intel_pt//u -S sleep 1
    perf: Segmentation fault
    Obtained 16 stack frames.
    /home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
    /home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6]
    /home/ahunter/bin/perf() [0x43a45b]
    /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf]
    /lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5]
    /home/ahunter/bin/perf() [0x4ec6c9]
    /home/ahunter/bin/perf() [0x4ec73b]
    /home/ahunter/bin/perf() [0x4ec73b]
    /home/ahunter/bin/perf() [0x4ec73b]
    /home/ahunter/bin/perf() [0x4eca15]
    /home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77]
    /home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0]
    /home/ahunter/bin/perf(cmd_record+0x722) [0x43c132]
    /home/ahunter/bin/perf() [0x4a11ae]
    /home/ahunter/bin/perf(main+0x5d4) [0x427fb4]

    Note, for testing purposes, this is hard to hit unless you add some sleep()
    in builtin-record.c before record__open().
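
    The distinction between "available" and "ready" described above can be
    sketched as a small state machine. This is only an illustration: the
    demo_* names and the exact set of states are assumptions, not the perf
    trigger class itself. The point it shows is that a signal-driven "hit"
    is ignored until the trigger has explicitly been made ready.

```c
/* Hypothetical states; the real trigger class may define more. */
enum demo_trigger_state { DEMO_OFF, DEMO_ON, DEMO_READY, DEMO_HIT };

struct demo_trigger {
	volatile enum demo_trigger_state state;
};

/* "On" only marks the trigger as available; it is not yet ready. */
static void demo_trigger_on(struct demo_trigger *t)
{
	t->state = DEMO_ON;
}

/* Only an available trigger can become ready. */
static void demo_trigger_ready(struct demo_trigger *t)
{
	if (t->state == DEMO_ON)
		t->state = DEMO_READY;
}

/* Called from the signal path: returns 1 if the hit was accepted. */
static int demo_trigger_hit(struct demo_trigger *t)
{
	if (t->state != DEMO_READY)
		return 0;	/* too early: setup has not finished */
	t->state = DEMO_HIT;
	return 1;
}
```

    With this shape, a signal that arrives between demo_trigger_on() and
    demo_trigger_ready() is dropped instead of touching state that has not
    been set up yet.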

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: Wang Nan
    Cc: stable@vger.kernel.org
    Fixes: 3dcc4436fa6f ("perf tools: Introduce trigger class")
    Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     

25 Feb, 2018

2 commits

  • [ Upstream commit 321a7c35c90cc834851ceda18a8ee18f1d032b92 ]

    Certain systems are designed to have sparse/discontiguous nodes. On
    such systems, 'perf bench numa' hangs, shows wrong number of nodes and
    shows values for non-existent nodes. Handle this by only taking nodes
    that are exposed by kernel to userspace.

    Signed-off-by: Satheesh Rajendran
    Reviewed-by: Srikar Dronamraju
    Acked-by: Naveen N. Rao
    Link: http://lkml.kernel.org/r/1edbcd353c009e109e93d78f2f46381930c340fe.1511368645.git.sathnaga@linux.vnet.ibm.com
    Signed-off-by: Balamuruhan S
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Satheesh Rajendran
     
  • [ Upstream commit 89d0aeab4252adc2a7ea693637dd21c588bfa2d1 ]

    The stdio perf top crashes when we change the terminal
    window size. The reason is that we assumed we get the
    perf_top pointer as a signal handler argument which is
    not the case.

    Change the SIGWINCH handler logic to set a global
    resize variable, which is checked in the main thread
    loop.
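
    A minimal sketch of that pattern, assuming a POSIX system. The demo_*
    names are hypothetical and are not the identifiers used in perf top;
    the handler only records the event and the main loop picks it up later.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for the "global resize variable". */
static volatile sig_atomic_t demo_resize;

/* The handler receives only the signal number, so it just sets the flag. */
static void demo_sigwinch_handler(int sig)
{
	(void)sig;
	demo_resize = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int demo_install_sigwinch(void)
{
	struct sigaction sa;

	sa.sa_handler = demo_sigwinch_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	return sigaction(SIGWINCH, &sa, NULL);
}

/* Called from the main loop: returns true once per pending resize. */
static bool demo_consume_resize(void)
{
	if (!demo_resize)
		return false;
	demo_resize = 0;
	return true;
}
```

    The main loop would call demo_consume_resize() on each iteration and
    redraw only when it returns true.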

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Ravi Bangoria
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-ysuzwz77oev1ftgvdscn9bpu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     

24 Jan, 2018

1 commit

  • commit 7a759cd8e8272ee18922838ee711219c7c796a31 upstream.

    With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable)
    when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
    ARCH=x86, so the perf build process searches for header files in
    tools/arch/x86_64/include, which doesn't exist.

    The following build failure is seen:

    In file included from util/event.c:2:0:
    tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
    compilation terminated.

    Fix this issue by using SRCARCH instead of ARCH in perf, just like the
    main kernel Makefile and tools/objtool's.

    Signed-off-by: Jiada Wang
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Eugeniu Rosca
    Cc: Jan Stancek
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Cc: Rui Teng
    Cc: Sukadev Bhattiprolu
    Cc: Wang Nan
    Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable")
    Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Tuomas Tynkkynen
    Signed-off-by: Greg Kroah-Hartman

    Jiada Wang
     

20 Dec, 2017

1 commit

  • [ Upstream commit e7ede72a6d40cb3a30c087142d79381ca8a31dab ]

    The current symbols__fixup_end() heuristic for the last entry in the rb
    tree is suboptimal as it leads to not being able to recognize the symbol
    in the call graph in a couple of corner cases, for example:

    i) If the symbol has a start address (f.e. exposed via kallsyms)
    that is at a page boundary, then the roundup(curr->start, 4096)
    for the last entry will result in curr->start == curr->end with
    a symbol length of zero.

    ii) If the symbol has a start address that is shortly before a page
    boundary, then also here, curr->end - curr->start will just be
    very few bytes, where it's unrealistic that we could perform a
    match against.

    Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so
    that we can catch such corner cases and have a better chance to find
    that specific symbol. It's still just best effort as the real end of the
    symbol is unknown to us (and could even be at a larger offset than the
    current range), but better than the current situation.
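
    The two corner cases can be checked with a few lines of arithmetic. The
    helper names below are hypothetical; only the roundup expressions come
    from the description above.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL

/* Round x up to the next multiple of y (unchanged if already aligned). */
static uint64_t demo_roundup(uint64_t x, uint64_t y)
{
	return ((x + y - 1) / y) * y;
}

/* Old heuristic: end of the last symbol is the next page boundary. */
static uint64_t demo_old_end(uint64_t start)
{
	return demo_roundup(start, DEMO_PAGE_SIZE);
}

/* New heuristic: one extra page past that boundary. */
static uint64_t demo_new_end(uint64_t start)
{
	return demo_roundup(start, DEMO_PAGE_SIZE) + DEMO_PAGE_SIZE;
}
```

    For a page-aligned start such as 0x2000, the old heuristic yields an end
    of 0x2000 (length zero) while the new one yields 0x3000. For a start of
    0x2ff8, the old length is 8 bytes and the new length is 0x1008 bytes.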

    Alexei reported that he recently ran into case i) with a JITed eBPF
    program (these are all page aligned) as the last symbol which wasn't
    properly shown in the call graph (while other eBPF program symbols in
    the rb tree were displayed correctly). Since this is a generic issue,
    let's try to improve the heuristic a bit.

    Reported-and-Tested-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Fixes: 2e538c4a1847 ("perf tools: Improve kernel/modules symbol lookup")
    Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     

10 Dec, 2017

1 commit

  • [ Upstream commit 22905582f6dd4bbd0c370fe5732c607452010c04 ]

    Command perf test -v 16 (Setup struct perf_event_attr test) always
    reports success even if the test case fails. It works correctly if you
    also specify -F (for don't fork).

    [root@s35lp76 perf]# ./perf test -v 16
    15: Setup struct perf_event_attr :
    --- start ---
    running './tests/attr/test-record-no-delay'
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.002 MB /tmp/tmp4E1h7R/perf.data
    (1 samples) ]
    expected task=0, got 1
    expected precise_ip=0, got 3
    expected wakeup_events=1, got 0
    FAILED './tests/attr/test-record-no-delay' - match failure
    test child finished with 0
    ---- end ----
    Setup struct perf_event_attr: Ok

    The reason for the wrong error reporting is the return value of the
    system() library call. It is called from run_dir() in tests/attr.c and
    returns the exit status, in above case 0xff00.

    This value is given as parameter to the exit() function which can only
    handle values 0-0xff.

    The child process terminates with exit value of 0 and the parent does
    not detect any error.

    This patch corrects the error reporting and prints the correct test
    result.
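
    The truncation can be reproduced with plain arithmetic on the 0xff00
    value quoted above. This sketch is not the patch itself; the demo_*
    helpers are hypothetical and show one way to decode a wait-style status
    before reporting it.

```c
#include <sys/wait.h>

/* What the parent sees if the raw status is passed straight to exit():
 * only the low 8 bits survive. */
static int demo_truncated_exit_code(int raw_status)
{
	return raw_status & 0xff;
}

/* Decode a status as returned by system(); -1 means "did not exit normally". */
static int demo_decoded_exit_code(int raw_status)
{
	if (raw_status == -1)
		return -1;
	if (WIFEXITED(raw_status))
		return WEXITSTATUS(raw_status);
	return -1;
}
```

    With a raw status of 0xff00, the first helper returns 0 (which looks
    like success) and the second returns 255.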

    Signed-off-by: Thomas-Mich Richter
    Acked-by: Jiri Olsa
    Cc: Heiko Carstens
    Cc: Hendrik Brueckner
    Cc: Martin Schwidefsky
    Cc: Thomas-Mich Richter
    LPU-Reference: 20170913081209.39570-2-tmricht@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/n/tip-rdube6rfcjsr1nzue72c7lqn@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Thomas Richter
     

08 Nov, 2017

1 commit

  • [ Upstream commit 75fc5ae5cc53fff71041ecadeb3354a2b4c9fe42 ]

    Signed-off-by: Taeung Song
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/1485952447-7013-2-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Taeung Song
     

30 Aug, 2017

1 commit

  • commit eebc509b20881b92d62e317b2c073e57c5f200f0 upstream.

    Fix --funcs (-F) option to show correct symbols for offline module.
    Since previous perf-probe uses machine__findnew_module_map() for offline
    module, even if user passes a module file (with full path) which is for
    other architecture, perf-probe always tries to load symbol map for
    current kernel module.

    This fix uses dso__new_map() to load the map from the given binary, in
    the same way as a map for user applications.

    Signed-off-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Krister Johansen
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     

07 Aug, 2017

3 commits

  • [ Upstream commit 7934c98a6e04028eb34c1293bfb5a6b0ab630b66 ]

    Markus reported that perf segfaults when reading /sys/kernel/notes from
    a kernel linked with GNU gold, due to what looks like a gold bug, so do
    some bounds checking to avoid crashing in that case.

    Reported-by: Markus Trippelsdorf
    Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • [ Upstream commit 30a9c6444810429aa2b7cbfbd453ce339baaadbf ]

    Those are binaries as well, so should be installed by:

    make -C tools/perf install-bin

    too.

    Cc: Alexander Shishkin
    Cc: Daniel Bristot de Oliveira
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/n/tip-3841b37u05evxrs1igkyu6ks@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • [ Upstream commit 1f2ed153b916c95a49a1ca9d7107738664224b7f ]

    Since 'perf probe' supports cross-arch probes, it is possible to analyze
    a kernel image for a different arch that has a different bits-per-long.

    In that case, it fails to get the module name because it uses the
    MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
    of the target arch bits-per-long.

    This fixes the above issue by changing modname-offset based on the target
    arch's bit width. This is OK because the Linux kernel uses the LP64 model on
    64bit arch.
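
    As a rough illustration of why the offset depends on the target's
    pointer size: assume (this layout is an assumption, not taken from the
    message above) that the module name is preceded by a 4-byte enum and a
    two-pointer list head. The offset then follows from the target pointer
    width, not the host's.

```c
#include <stddef.h>

/*
 * Hypothetical helper: offset of a name field that follows a 4-byte enum
 * and two pointers, for a target whose pointers are target_ptr_bytes wide.
 */
static size_t demo_mod_name_offset(size_t target_ptr_bytes)
{
	size_t state_bytes = 4;
	/* Pad the enum so the following pointers are naturally aligned. */
	size_t aligned = (state_bytes + target_ptr_bytes - 1)
			 / target_ptr_bytes * target_ptr_bytes;

	return aligned + 2 * target_ptr_bytes;
}
```

    Under that assumed layout, a 32-bit target gives an offset of 12 and a
    64-bit target gives 24, so using the host value on a cross-arch module
    reads the name from the wrong place.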

    E.g. without this (on x86_64, and target module is arm32):

    $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
    p:probe/configfs_lookup :configfs_lookup+0
    ^-Here is an empty module name.

    With this fix, you can see correct module name:

    $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
    p:probe/configfs_lookup configfs:configfs_lookup+0

    Signed-off-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     

28 Jul, 2017

9 commits

  • commit 80f62589fa52f530cffc50e78c0b5a2ae572d61e upstream.

    When the jump instruction is displayed at the row 0 in annotate view,
    the arrow is broken. An example:

    16.86 │ ┌──je 82
    0.01 │ movsd (%rsp),%xmm0
    │ movsd 0x8(%rsp),%xmm4
    │ movsd 0x8(%rsp),%xmm1
    │ movsd (%rsp),%xmm3
    │ divsd %xmm4,%xmm0
    │ divsd %xmm3,%xmm1
    │ movsd (%rsp),%xmm2
    │ addsd %xmm1,%xmm0
    │ addsd %xmm2,%xmm0
    │ movsd %xmm0,(%rsp)
    │82: sub $0x1,%ebx
    83.03 │ ↑ jne 38
    │ add $0x10,%rsp
    │ xor %eax,%eax
    │ pop %rbx
    │ ← retq

    The patch increments the row number before checking with 0.

    Signed-off-by: Yao Jin
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Peter Zijlstra
    Fixes: 944e1abed9e1 ("perf ui browser: Add method to draw up/down arrow line")
    Link: http://lkml.kernel.org/r/1496901704-30275-1-git-send-email-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Jin Yao
     
  • commit 6a558f12dbe85437acbdec5e149ea07b5554eced upstream.

    Sometimes a FUP packet is associated with a TSX transaction and a flag is
    set to indicate that. Ensure that flag is cleared on any error condition
    because at that point the decoder can no longer assume it is correct.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit 622b7a47b843c78626f40c1d1aeef8483383fba2 upstream.

    The decoder will try to use branch packets to find an IP to start decoding
    or to recover from errors. Currently the FUP packet is used only in the
    case of an overflow, however there is no reason for that to be a special
    case. So just use FUP always when scanning for an IP.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit f952eaceb089b691eba7c4e13686e742a8f26bf5 upstream.

    Intel PT uses IP compression based on the last IP. For decoding purposes,
    'last IP' is not updated when a branch target has been suppressed, which is
    indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
    ensure never to set 'last_ip' when packet 'count' is zero.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit ee14ac0ef6827cd6f9a572cc83dd0191ea17812c upstream.

    Intel PT uses IP compression based on the last IP. For decoding
    purposes, 'last IP' is considered to be reset to zero whenever there is
    a synchronization packet (PSB). The decoder wasn't doing that, and was
    treating the zero value to mean that there was no last IP, whereas
    compression can be done against the zero value. Fix by setting last_ip
    to zero when a PSB is received and keep track of have_last_ip.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit ad7167a8cd174ba7d8c0d0ed8d8410521206d104 upstream.

    A value of zero is used to indicate that there is no IP. Ensure the
    value is zero when the state is INTEL_PT_STATE_NO_IP.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit 12b7080609097753fd8198cc1daf589be3ec1cca upstream.

    The return compression stack must be cleared whenever there is a PSB. Fix
    one case where that was not happening.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit 3f04d98e972b59706bd43d6cc75efac91f8fba50 upstream.

    The decoder uses its current timestamp in samples. Usually that is a
    timestamp that has already passed, but in some cases it is a timestamp
    for a branch that the decoder is walking towards, and consequently
    hasn't reached. Improve that situation by using the pkt_state to
    determine when to use the current or previous timestamp.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit 22c06892332d8916115525145b78e606e9cc6492 upstream.

    Move decoder error setting into one condition.

    Cc'ed to stable because later fixes depend on it.

    Signed-off-by: Adrian Hunter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     

15 Jul, 2017

11 commits

  • commit 3e96dac7c956089d3f23aca98c4dfca57b6aaf8a upstream.

    Add error check codes on post processing and improve it for offline
    probe events as:

    - post processing fails if no matched symbol found in map(-ENOENT)
    or strdup() failed(-ENOMEM).

    - Even if the symbol name is the same, it updates symbol address
    and offset.

    Signed-off-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Krister Johansen
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     
  • commit 8a937a25a7e3c19d5fb3f9d92f605cf5fda219d8 upstream.

    Fix perf-probe to show probe definition on gcc generated symbols for
    offline kernel (including cross-arch kernel image).

    gcc sometimes optimizes functions and generates new symbols with suffixes
    such as ".constprop.N" or ".isra.N" etc. Since those symbol names are
    not recorded in DWARF, we have to find correct generated symbols from
    offline ELF binary to probe on it (kallsyms doesn't correct it). For
    online kernel or uprobes we don't need it because those are rebased on
    _text, or a section relative address.

    E.g. Without this:

    $ perf probe -k build-arm/vmlinux -F __slab_alloc*
    __slab_alloc.constprop.9
    $ perf probe -k build-arm/vmlinux -D __slab_alloc
    p:probe/__slab_alloc __slab_alloc+0

    If you put above definition on target machine, it should fail
    because there is no __slab_alloc in kallsyms.

    With this fix, perf probe shows correct probe definition on
    __slab_alloc.constprop.9:

    $ perf probe -k build-arm/vmlinux -D __slab_alloc
    p:probe/__slab_alloc __slab_alloc.constprop.9+0

    Signed-off-by: Masami Hiramatsu
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Krister Johansen
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     
  • commit d7dd112ea5cacf91ae72c0714c3b911eb6016fea upstream.

    Fix below compile error:

    CC util/scripting-engines/trace-event-perl.o
    In file included from /usr/lib/perl5/5.22.2/i686-linux/CORE/perl.h:5673:0,
    from util/scripting-engines/trace-event-perl.c:31:
    /usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h: In function 'S__is_utf8_char_slow':
    /usr/lib/perl5/5.22.2/i686-linux/CORE/inline.h:270:5: error: nested extern declaration of 'Perl___notused' [-Werror=nested-externs]
    dTHX; /* The function called below requires thread context */
    ^
    cc1: all warnings being treated as errors

    After digging through the perl5 repository, I found that this
    compile error occurs with perl from v5.21.1 to v5.25.4.

    Signed-off-by: Wang YanQing
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/20170212024655.GA15997@udknight
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Wang YanQing
     
  • commit 8434a2ec13d5c8cb25716950bfbf7c9d7b64628a upstream.

    In commit daeecbc0c431 ("perf tools: Add event_update event scale type"), the
    handling of PERF_EVENT_UPDATE__SCALE cast struct event_update_event->data to a
    pointer to event_update_event_scale, uses some field from this casted struct
    and then ends up falling through to the handling of another event type,
    PERF_EVENT_UPDATE__CPUS, where it casts that ev->data to yet another type, oops,
    fix it by inserting the missing break.

    Noticed when building perf using gcc 7 on Fedora Rawhide:

    util/header.c: In function 'perf_event__process_event_update':
    util/header.c:3207:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
    evsel->scale = ev_scale->scale;
    ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
    util/header.c:3208:2: note: here
    case PERF_EVENT_UPDATE__CPUS:
    ^~~~

    This wasn't noticed because probably PERF_EVENT_UPDATE__CPUS comes after
    PERF_EVENT_UPDATE__SCALE, so we would just create a bogus evsel->own_cpus when
    processing a PERF_EVENT_UPDATE__SCALE to then leak it and create a new cpu map
    with the correct data.
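
    The effect of the missing break can be seen in a reduced form. The
    types and names below are hypothetical stand-ins, not the perf
    structures; the sketch only shows the corrected control flow.

```c
enum demo_update_type { DEMO_UPDATE_SCALE, DEMO_UPDATE_CPUS };

struct demo_evsel {
	double scale;
	int cpus_updates;	/* counts how often the CPUS branch ran */
};

static void demo_process_update(struct demo_evsel *evsel,
				enum demo_update_type type, double scale)
{
	switch (type) {
	case DEMO_UPDATE_SCALE:
		evsel->scale = scale;
		break;	/* without this, control continues into the CPUS case */
	case DEMO_UPDATE_CPUS:
		evsel->cpus_updates++;
		break;
	default:
		break;
	}
}
```

    With the break in place, a scale update leaves cpus_updates untouched.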

    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Fixes: daeecbc0c431 ("perf tools: Add event_update event scale type")
    Link: http://lkml.kernel.org/n/tip-lukcf9hdj092ax2914ss95at@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit 3aff8ba0a4c9c9191bb788171a1c54778e1246a2 upstream.

    Addressing this warning from gcc 7:

    CC /tmp/build/perf/bench/numa.o
    bench/numa.c: In function '__bench_numa':
    bench/numa.c:1582:42: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size between 8 and 17 [-Werror=format-truncation=]
    snprintf(tname, 32, "process%d:thread%d", p, t);
    ^~
    bench/numa.c:1582:25: note: directive argument in the range [0, 2147483647]
    snprintf(tname, 32, "process%d:thread%d", p, t);
    ^~~~~~~~~~~~~~~~~~~~
    In file included from /usr/include/stdio.h:939:0,
    from bench/../util/util.h:47,
    from bench/../builtin.h:4,
    from bench/numa.c:11:
    /usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 17 and 35 bytes into a destination of size 32
    return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    __bos (__s), __fmt, __va_arg_pack ());
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    cc1: all warnings being treated as errors

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Petr Holasek
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-twa37vsfqcie5gwpqwnjuuz9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit 2e2bbc039fad9eabad6c4c1a473c8b2554cdd2d4 upstream.

    Addressing a few cases spotted by a new warning in gcc 7:

    tests/parse-events.c: In function 'test_pmu_events':
    tests/parse-events.c:1790:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 90 [-Werror=format-truncation=]
    snprintf(name, MAX_NAME, "cpu/event=%s/u", ent->d_name);
    ^~
    In file included from /usr/include/stdio.h:939:0,
    from /git/linux/tools/perf/util/map.h:9,
    from /git/linux/tools/perf/util/symbol.h:7,
    from /git/linux/tools/perf/util/evsel.h:10,
    from tests/parse-events.c:3:
    /usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 13 and 268 bytes into a destination of size 100
    return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    __bos (__s), __fmt, __va_arg_pack ());
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    tests/parse-events.c:1798:29: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 100 [-Werror=format-truncation=]
    snprintf(name, MAX_NAME, "%s:u,cpu/event=%s/u", ent->d_name, ent->d_name);

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Fixes: 945aea220bb8 ("perf tests: Move test objects into 'tests' directory")
    Link: http://lkml.kernel.org/n/tip-ty4q2p8zp1dp3mskvubxskm5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit 7ea6856d6f5629d742edc23b8b76e6263371ef45 upstream.

    To address new warnings emitted by gcc 7, e.g.:

    CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
    CC /tmp/build/perf/tests/parse-events.o
    util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc':
    util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
    if (!(packet->count))
    ^
    util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here
    case INTEL_PT_CYC:
    ^~~~
    CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
    cc1: all warnings being treated as errors

    Acked-by: Andi Kleen
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-mf0hw789pu9x855us5l32c83@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit bdf23a9a190d7ecea092fd5c4aabb7d4bd0a9980 upstream.

    The size of dirent->d_name is NAME_MAX + 1, but the size for the 'path'
    buffer is hard coded at 256, which may truncate it because we also
    prepend "/proc/", so take all that into account and thank gcc 7 for this
    warning:

    /git/linux/tools/perf/util/thread_map.c: In function 'thread_map__new_by_uid':
    /git/linux/tools/perf/util/thread_map.c:119:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 250 [-Werror=format-truncation=]
    snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
    ^~
    In file included from /usr/include/stdio.h:939:0,
    from /git/linux/tools/perf/util/thread_map.c:5:
    /usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 7 and 262 bytes into a destination of size 256
    return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    __bos (__s), __fmt, __va_arg_pack ());
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
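
    One way to size such a buffer, shown as a sketch and not necessarily
    the form used in the patch: reserve room for the "/proc/" prefix, the
    longest possible d_name and the terminating NUL. The DEMO_/demo_ names
    are hypothetical.

```c
#define _GNU_SOURCE
#include <limits.h>	/* NAME_MAX */
#include <stddef.h>
#include <stdio.h>

/* "/proc/" prefix + longest d_name + terminating NUL. */
#define DEMO_PROC_PATH_MAX (sizeof("/proc/") - 1 + NAME_MAX + 1)

/* Build "/proc/<name>"; returns 0 if it fit, -1 on truncation or error. */
static int demo_proc_path(char *buf, size_t size, const char *name)
{
	int n = snprintf(buf, size, "/proc/%s", name);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

    Checking the snprintf() return value against the buffer size also makes
    any remaining truncation visible to the caller.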

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-csy0r8zrvz5efccgd4k12c82@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit 7b0214b702ad8e124e039a317beeebb3f020d125 upstream.

    The implicit fall through case label here is intended, so let us inform
    that to gcc >= 7:

    CC /tmp/build/perf/builtin-top.o
    builtin-top.c: In function 'display_thread':
    builtin-top.c:644:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
    if (errno == EINTR)
    ^
    builtin-top.c:647:3: note: here
    default:
    ^~~~~~~

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-lmcfnnyx9ic0m6j0aud98p4e@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit d64b721d27aef3fbeb16ecda9dd22ee34818ff70 upstream.

    The implicit fall through case label here is intended, so let us inform
    that to gcc >= 7:

    util/strfilter.c: In function 'strfilter_node__sprint':
    util/strfilter.c:270:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
    if (len < 0)
    ^
    util/strfilter.c:272:2: note: here
    case '!':
    ^~~~
    cc1: all warnings being treated as errors

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-z2dpywg7u8fim000hjfbpyfm@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit 94bdd5edb34e472980d1e18b4600d6fb92bd6b0a upstream.

    The implicit fall through case label here is intended, so let us inform
    that to gcc >= 7:

    CC /tmp/build/perf/util/string.o
    util/string.c: In function 'perf_atoll':
    util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
    if (*p)
    ^
    util/string.c:24:3: note: here
    case '\0':
    ^~~~
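
    A reduced example of an intended fall through, loosely modeled on a
    size-suffix parser. The function is hypothetical and is not perf_atoll()
    itself. gcc 7 and later accept a "fall through" comment placed directly
    before the next case label as a marker that the fall through is
    deliberate.

```c
/*
 * Hypothetical parser step: 'unit' is the suffix character and 'next'
 * is the character following it. Returns -1 on a malformed suffix.
 */
static long long demo_apply_unit(long long value, char unit, char next)
{
	switch (unit) {
	case 'b':
	case 'B':
		if (next != '\0')
			return -1;
		/* fall through */
	case '\0':
		return value;
	case 'k':
	case 'K':
		return value * 1024;
	default:
		return -1;
	}
}
```

    Here "5B" and "5" both yield 5, while "5Bx" is rejected before the
    shared return is reached.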

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-0ophb30v9apkk6o95el0rqlq@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     

05 Jul, 2017

2 commits

  • [ Upstream commit 613f050d68a8ed3c0b18b9568698908ef7bbc1f7 ]

    Fix to probe on gcc generated functions on modules. Since
    probing on a module is based on its symbol name, it should
    be adjusted on actual symbols.

    E.g. without this fix, perf probe shows probe definition
    on a non-existent symbol as below.

    $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
    in_range.isra.12
    $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
    p:probe/in_range nf_nat:in_range+0

    With this fix, perf probe correctly shows a probe on
    gcc-generated symbol.

    $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
    p:probe/in_range nf_nat:in_range.isra.12+0

    This also fixes same problem on online module as below.

    $ perf probe -m i915 -D assert_plane
    p:probe/assert_plane i915:assert_plane.constprop.134+0

    Signed-off-by: Masami Hiramatsu
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     
  • [ Upstream commit d2d4edbebe07ddb77980656abe7b9bc7a9e0cdf7 ]

    Fix to show correct locations for events on modules by relocating given
    address instead of retrying after failure.

    This happens when the module text size is big enough, bigger than
    sh_addr, because the original code retries with given address + sh_addr
    if it failed to find CU DIE at the given address.

    Any address smaller than sh_addr always fails and it retries with the
    correct address, but addresses bigger than sh_addr will get a CU DIE
    which is on the given address (not adjusted by sh_addr).

    In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
    Since i915 is a huge kernel module, we can see this issue as below.

    $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
    ffffffffc0270000 t i915_switcheroo_can_switch [i915]

    ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
    symbols across this boundary.

    $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
    | head -n 2
    ffffffffc027ff80 t haswell_init_clock_gating [i915]
    ffffffffc0280110 t valleyview_init_clock_gating [i915]
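
    The difference between retrying and relocating can be modelled with the
    two symbols above. Their offsets from the first i915 text symbol are
    0xff80 and 0x10110, assuming that symbol marks the start of the module
    text. The CU ranges and all function names in this sketch are invented
    for illustration; this is not perf's code.

```python
# Toy model of the CU lookup. All names and the CU ranges are hypothetical.
SH_ADDR = 0x10030  # sh_addr of ".text" in the environment described above

# Debug-info ranges (start, end, label), expressed as offset + sh_addr.
CU_RANGES = [
    (0x10030, 0x20000, "cu_a"),
    (0x20000, 0x30000, "cu_b"),
]


def find_cu(addr):
    """Return the label of the CU covering addr, or None if there is none."""
    for start, end, label in CU_RANGES:
        if start <= addr < end:
            return label
    return None


def lookup_retry(offset):
    """Old behaviour: try the raw offset, add sh_addr only after a failure."""
    cu = find_cu(offset)
    if cu is None:
        cu = find_cu(offset + SH_ADDR)
    return cu


def lookup_relocated(offset):
    """Fixed behaviour: always relocate the offset before the lookup."""
    return find_cu(offset + SH_ADDR)
```

    For offset 0xff80 (below sh_addr) both functions agree, because the raw
    lookup fails and the retry relocates it. For offset 0x10110 (above
    sh_addr) lookup_retry() finds a CU at the unrelocated address and stops
    there, while lookup_relocated() lands in the other CU.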

    So set up probes on both functions and see what happens.

    $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
    -a valleyview_init_clock_gating
    Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

    You can now use it in all perf tools, such as:

    perf record -e probe:valleyview_init_clock_gating -aR sleep 1

    $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

    As you can see, haswell_init_clock_gating is correctly shown,
    but valleyview_init_clock_gating is not.

    With this patch, both events are shown correctly.

    $ sudo ./perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

    Committer notes:

    In my case:

    # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
    Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

    You can now use it in all perf tools, such as:

    perf record -e probe:valleyview_init_clock_gating -aR sleep 1

    # perf probe -l
    probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
    probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
    #

    # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
    [Nr] Name Type Address Off Size ES Flg Lk Inf Al
    [ 1] .text PROGBITS ffffffff81000000 200000 822fd3 00 AX 0 0 4096
    #

    So both are b0rked, now with the fix:

    # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
    Added new events:
    probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

    You can now use it in all perf tools, such as:

    perf record -e probe:valleyview_init_clock_gating -aR sleep 1

    # perf probe -l
    probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
    #

    Both look correct.

    Signed-off-by: Masami Hiramatsu
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     

20 May, 2017

1 commit

  • commit c3a0bbc7ad7598dec5a204868bdf8a2b1b51df14 upstream.

    Address filtering with kernel symbols incorrectly resulted in the error
    "Cannot determine size of symbol" because the no_size logic was the wrong
    way around.

    Signed-off-by: Adrian Hunter
    Tested-by: Andi Kleen
    Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     

12 Mar, 2017

1 commit

  • commit aa33b9b9a2ebb00d33c83a5312d4fbf2d5aeba36 upstream.

    If dso__load_kcore frees all of the existing maps, but one has already
    been attached to a callchain cursor node, then we can get a SIGSEGV in
    any function that happens to try to use this invalid cursor. Use the
    existing map refcount mechanism to forestall cleanup of a map until the
    cursor iterates past the node.
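
    The idea of pinning a map for as long as a cursor node refers to it can
    be sketched as follows. This is a simplified model: Map, CursorNode,
    map_get() and map_put() are illustrative names, and the "freed" flag
    stands in for the actual release of memory.

```python
# Toy reference-counting model. All names here are hypothetical.

class Map:
    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # reference held by whoever created the map
        self.freed = False   # stands in for the memory being released


def map_get(m):
    """Take an extra reference and return the map."""
    m.refcnt += 1
    return m


def map_put(m):
    """Drop one reference; the map is 'freed' when the count reaches zero."""
    m.refcnt -= 1
    if m.refcnt == 0:
        m.freed = True


class CursorNode:
    """A callchain cursor node that keeps its map alive while attached."""

    def __init__(self, m):
        self.map = map_get(m)

    def release(self):
        """Called once the cursor has moved past this node."""
        map_put(self.map)
        self.map = None
```

    In this model, dropping the owner's reference while a node is still
    attached leaves the map intact; it is only marked freed after the node
    calls release().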

    Signed-off-by: Krister Johansen
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Fixes: 84c2cafa2889 ("perf tools: Reference count struct map")
    Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Krister Johansen
     

15 Feb, 2017

2 commits

  • commit 8381cdd0e32dd748bd34ca3ace476949948bd793 upstream.

    The -o/--order option selects the column number used to sort a diff result.

    It does the job by adding a hpp field at the beginning of the sort list.
    But it should not be added to the output field list as it has no
    callbacks required by an output field.

    During setup_sorting(), perf_hpp__setup_output_field() appends
    the given sort keys to the output field list if they are not there already.

    Originally it was checked by fmt->list being non-empty. But commit
    3f931f2c4274 ("perf hists: Make hpp setup function generic") changed it
    to check the ->equal callback.

    Anyways, we don't need to add the pseudo hpp field to the output field
    list since it won't be used for output. So just skip fields if they
    have no ->color or ->entry callbacks.
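
    A minimal sketch of that skip rule, assuming a field is skipped only
    when both callbacks are missing. Fmt and setup_output_fields() are
    illustrative stand-ins, not the perf structures or functions.

```python
# Toy model of copying sort keys into the output field list.
# Fmt and setup_output_fields() are hypothetical names.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Fmt:
    name: str
    color: Optional[Callable] = None   # output callback, may be absent
    entry: Optional[Callable] = None   # output callback, may be absent


def setup_output_fields(sort_fields, output_fields):
    """Append sort keys to the output list, skipping sort-only fields."""
    for fmt in sort_fields:
        # Assumption: a field with neither callback cannot produce output.
        if fmt.color is None and fmt.entry is None:
            continue
        if fmt not in output_fields:
            output_fields.append(fmt)
    return output_fields
```

    A pseudo field such as the one added for -o/--order has neither
    callback in this model, so it stays in the sort list but never reaches
    the output list.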

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Peter Zijlstra
    Fixes: 3f931f2c4274 ("perf hists: Make hpp setup function generic")
    Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Namhyung Kim
     
  • commit a1c9f97f0b64e6337d9cfcc08c134450934fdd90 upstream.

    Commit 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field
    interface") changed list_add() to perf_hpp__register_sort_field().

    This resulted in a behavior change since the field was added to the tail
    instead of the head. So the -o option is mostly ignored due to its
    order in the list.

    This patch fixes it by adding perf_hpp__prepend_sort_field().
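
    Why the position in the list matters can be shown with a small model in
    which earlier sort fields take precedence over later ones. The helper
    names and the sample entries are made up for illustration; only the
    head-versus-tail distinction comes from the description above.

```python
# Toy model of sort-key ordering. All names and data are hypothetical.

def register_sort_field(sort_fields, field):
    """Add a field at the tail of the sort list."""
    sort_fields.append(field)


def prepend_sort_field(sort_fields, field):
    """Add a field at the head of the sort list."""
    sort_fields.insert(0, field)


def sort_entries(entries, sort_fields):
    """Sort lexicographically: earlier fields dominate later ones."""
    return sorted(entries, key=lambda e: tuple(e[f] for f in sort_fields))


ENTRIES = [
    {"sym": "a", "col": 2},
    {"sym": "b", "col": 1},
]
```

    With "col" registered at the tail after "sym", the entries stay in
    symbol order and the column only breaks ties. With "col" prepended, it
    decides the order first, which is the behaviour the -o option expects.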

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Peter Zijlstra
    Fixes: 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface")
    Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Namhyung Kim
     

26 Jan, 2017

3 commits

  • commit 621cb4e7837e39d25a5af5a785ad282cdd2b4ce8 upstream.

    This patch modifies the build dependencies of the jitdump support in
    perf. As it stands, jitdump was wrongly made entirely dependent on
    DWARF. However, the DWARF dependency only exists when generating the
    source line table in genelf_debug.c. The rest of the support does not
    need DWARF.

    This patch removes the dependency on DWARF for the entire jitdump
    support. It keeps it only for the genelf_debug.c support.

    Signed-off-by: Maciej Debski
    Reviewed-by: Stephane Eranian
    Cc: Anton Blanchard
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com
    Fixes: e12b202f8fb9 ("perf jitdump: Build only on supported archs")
    [ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ]
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Maciej Debski
     
  • commit cf346d5bd4b9d61656df2f72565c9b354ef3ca0d upstream.

    Both register_perl_scripting() and register_python_scripting() allocate
    this variable; fix it by checking whether it was already allocated.
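
    The guard can be pictured as below. This is only a model: STATE,
    alloc_context() and the two register functions are hypothetical
    stand-ins, and the allocation counter exists purely so the effect can
    be observed.

```python
# Toy model of two registration paths sharing one lazily allocated object.
# All names are hypothetical stand-ins for the functions mentioned above.

STATE = {"context": None, "allocations": 0}


def alloc_context():
    """Stand-in for the allocation; counts how often it runs."""
    STATE["allocations"] += 1
    return {}


def _ensure_context():
    # Allocate only if no earlier registration has done so already.
    if STATE["context"] is None:
        STATE["context"] = alloc_context()


def register_perl():
    _ensure_context()


def register_python():
    _ensure_context()
```

    Calling both register functions in either order performs a single
    allocation in this model; without the check, the second call would
    replace the object created by the first.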

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Tom Zanussi
    Cc: Wang Nan
    Fixes: 7e4b21b84c43 ("perf/scripts: Add Python scripting engine")
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit c56cb33b56c13493eeb95612f80e4dd6e35cd109 upstream.

    Since 841e3558b2d ("perf callchain: Recording 'dwarf' callchains do not
    need DWARF unwinding support"), --call-graph dwarf is allowed in 'perf
    record' even without unwind support. A couple of other places don't
    reflect this yet though: the help text should list dwarf as a valid
    record mode and the dump_size config should be respected too.

    Signed-off-by: Rabin Vincent
    Cc: He Kuang
    Fixes: 841e3558b2de ("perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support")
    Link: http://lkml.kernel.org/r/1470837148-7642-1-git-send-email-rabin.vincent@axis.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Rabin Vincent