07 Aug, 2015

22 commits

  • Fix the perf-with-kcore script so that it doesn't split arguments that
    contain spaces.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1437150840-31811-13-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • PERF_ITRACE_PERIOD_INSTRUCTIONS is zero so it got overwritten by the
    default period type.

    Fix by checking if the period type was set rather than if the value was
    zero when applying the default.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1437150840-31811-12-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
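
    The period-type fix above can be illustrated with a minimal sketch. All
    names here are hypothetical stand-ins, not the perf source; the only
    fact taken from the commit text is that the "instructions" period type
    has the value zero, so a zero test cannot tell "unset" from "set to
    instructions".

```c
#include <stdbool.h>

/* Hypothetical period types; only "instructions == 0" comes from the text. */
enum period_type {
	PERIOD_INSTRUCTIONS = 0,
	PERIOD_TICKS,
	PERIOD_NANOSECS,
};

#define DEFAULT_PERIOD_TYPE PERIOD_NANOSECS /* assumed default, for illustration */

struct synth_opts_sketch {
	enum period_type period_type;
	bool period_type_set; /* true once the user chose a type explicitly */
};

/* Record an explicit user choice. */
void set_period_type(struct synth_opts_sketch *opts, enum period_type type)
{
	opts->period_type = type;
	opts->period_type_set = true;
}

/*
 * Apply the default only when nothing was chosen. Testing
 * "period_type == 0" instead would discard an explicit
 * PERIOD_INSTRUCTIONS request.
 */
void apply_default_period_type(struct synth_opts_sketch *opts)
{
	if (!opts->period_type_set)
		opts->period_type = DEFAULT_PERIOD_TYPE;
}
```

    Keeping a separate "was set" flag costs one extra field but keeps every
    enum value, including zero, usable as an explicit choice.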
     
  • Signed-off-by: Max Filippov
    Cc: Chris Zankel
    Cc: Marc Gauthier
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: linux-xtensa@linux-xtensa.org
    Link: http://lkml.kernel.org/r/1437208216-15729-9-git-send-email-jcmvbkbc@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Max Filippov
     
  • Display the cycles by default in branch sort mode.

    To make enough room for the new column I removed dso_to. It is usually
    redundant with dso_from.

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-9-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Now that we can process branch data in annotate it makes sense to
    support enabling branch recording from top too. Most of the code needed
    for this is already in shared code with report. But we need to add:

    - The option parsing code (using shared code from the previous patch)
    - Document the options
    - Set up the IPC/cycles accounting state in the top session
    - Call the accounting code in the hist iter callback

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-8-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Add two new columns to the annotate display and display the average
    cycles and the compute IPC if available.

    When the LBR was not in any branch mode the IPC computation is
    automatically disabled. We still display the cycle information.

    Example output (with made up numbers):

    The second column is the IPC and third average cycles.

                      │    __attribute__((noinline)) f2()
                      │    {
      5.15  0.07      │      push   %rbp
      0.01  0.07      │      mov    %rsp,%rbp
                      │        c = a / b;
      9.87  0.07      │      mov    a,%eax
            0.07      │      mov    b,%ecx
            0.07      │      cltd
      4.92  0.07   123│      idiv   %ecx
     70.79  0.07      │      mov    %eax,__TMC_END__
                      │    }
      9.25  0.07      │      pop    %rbp
      0.01  0.07   123│    ← retq

    v2: Fix display problems.

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-7-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Compute the IPC and the basic block cycles for the annotate display.

    IPC is computed by counting the instructions in the basic block, and
    then dividing that count by the accounted cycles.

    The actual IPC computation can only be done at annotate time, because we
    need to parse the objdump output first to know the number of
    instructions in the basic block.

    The cycles/IPC are also put into the perf function annotation so that
    the display code can show them.

    Again basic block overlaps are not handled, with the longest winning,
    but there are some heuristics to hide the IPC when the longest is not
    the most common.

    v2: Compute IPC correctly.

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-6-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
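
    As a worked illustration of the IPC computation described above, here is
    a minimal sketch. The structure and function names are hypothetical; it
    assumes that the cycles for a basic block are accumulated over several
    samples and that IPC (instructions per cycle) is the instruction count
    divided by the average cycles per sample.

```c
/* Hypothetical per-basic-block accounting record. */
struct block_cycles {
	unsigned long long cycles; /* cycles summed over all samples of the block */
	unsigned int samples;      /* number of samples that hit the block */
};

/* Average cycles per execution of the block; 0.0 if never sampled. */
double block_avg_cycles(const struct block_cycles *bc)
{
	return bc->samples ? (double)bc->cycles / bc->samples : 0.0;
}

/*
 * Instructions per cycle. n_insn is the number of instructions in the
 * block, which is only known once the disassembly has been parsed.
 */
double block_ipc(const struct block_cycles *bc, unsigned int n_insn)
{
	double avg = block_avg_cycles(bc);

	return avg > 0.0 ? n_insn / avg : 0.0;
}
```

    For example, a block of 50 instructions that took 400 cycles over 4
    samples averages 100 cycles, giving an IPC of 0.5.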
     
  • Call the earlier added cycle histogram infrastructure from the perf
    report hist iter callback. For this we walk the branch records.

    This allows cycle histograms to be used when browsing perf report annotate.

    v2: Rename flag

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-5-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • This adds the basic infrastructure to keep track of cycle counts per
    basic block for annotate. We allocate an array similar to the normal
    accounting, and then account branch cycles there.

    We handle two cases:

    - cycles per basic block with start
    - cycles per branch

    (these are later used for either IPC or just cycles per BB)

    In the start case we cannot handle overlaps, so always the longest basic
    block wins.

    For the cycles per branch case everything is accurately accounted.

    v2: Remove unnecessary checks. Slight restructure. Move
    symbol__get_annotation to another patch. Move histogram allocation.
    v3: Merged with current tree

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-4-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
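
    The "longest basic block wins" rule above can be sketched as follows.
    This is a simplified model with hypothetical names, not the perf
    implementation: it assumes one slot per branch source address, where a
    smaller start offset means a longer block ending at that branch.

```c
/* Hypothetical slot, indexed by the offset of the branch ending the block. */
struct bb_cycles {
	unsigned int start;             /* start offset of the block tracked here */
	unsigned long long cycles;      /* cycles for the tracked (longest) block */
	unsigned int num;               /* samples for the tracked block */
	unsigned long long cycles_aggr; /* per-branch cycles, always accounted */
	unsigned int num_aggr;          /* per-branch samples, always accounted */
};

void account_cycles(struct bb_cycles *slot, unsigned int start,
		    unsigned int cycles)
{
	/* Per-branch case: every sample is accounted accurately. */
	slot->cycles_aggr += cycles;
	slot->num_aggr++;

	/* Per-block case: overlaps are not handled, the longest block wins. */
	if (slot->num) {
		if (start < slot->start) {
			/* A longer block showed up: restart the accounting. */
			slot->cycles = 0;
			slot->num = 0;
		} else if (start > slot->start) {
			/* A shorter block: ignore it for the per-block stats. */
			return;
		}
	}

	slot->start = start;
	slot->cycles += cycles;
	slot->num++;
}
```

    After samples with starts 10, 4 and 10 in that order, the slot tracks
    only the block starting at 4, while the per-branch totals cover all
    three samples.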
     
  • Later patches need to cheaply check that the branch mode is in ANY. Add
    a new function to check all event attrs and add a flag to the report
    state, which is then initialized.

    v2: Rename flag

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-3-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
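
    A rough sketch of such a check, under loud assumptions: the attribute
    structure, the flag name and the bit value below are local stand-ins,
    and combining the branch types of all events with OR is only one
    plausible way to "check all event attrs". The point it illustrates is
    computing the answer once and caching it in the report state.

```c
#include <stdbool.h>
#include <stddef.h>

#define SAMPLE_BRANCH_ANY (1ULL << 3) /* stand-in bit, assumed for this sketch */

/* Hypothetical, minimal view of an event attribute. */
struct event_attr_sketch {
	unsigned long long branch_sample_type;
};

/* Hypothetical report state holding the cached flag. */
struct report_sketch {
	bool nonany_branch_mode;
};

/* OR together the branch sample types of all events. */
unsigned long long combined_branch_type(const struct event_attr_sketch *attrs,
					size_t n)
{
	unsigned long long type = 0;

	for (size_t i = 0; i < n; i++)
		type |= attrs[i].branch_sample_type;
	return type;
}

/* Walk the attrs once at setup; later code only reads the flag. */
void report_init_branch_flag(struct report_sketch *rep,
			     const struct event_attr_sketch *attrs, size_t n)
{
	rep->nonany_branch_mode =
		!(combined_branch_type(attrs, n) & SAMPLE_BRANCH_ANY);
}
```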
     
  • cycles is a new branch_info field available on some CPUs that indicates
    the time deltas between branches in the LBR.

    Add a sort key and output code for the cycles so that the basic block
    cycles can be displayed individually in perf report.

    We also pass in the cycles for weight when LBRs are processed, which
    makes it possible to get global and local weight, and so an estimate of
    the total cost.

    And also print the cycles information for perf report -D. I also added
    printing for the previously missing LBR flags (mispredict etc.)

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • perf currently fails to build on MIPS as there is no
    tools/perf/arch/mips/Build file. Adding an empty file fixes this as
    there are no MIPS-specific sources to build.

    It looks like the same is needed for Alpha and PA-RISC, though I
    haven't been able to test those.

    Signed-off-by: Ben Hutchings
    Fixes: 5e8c0fb6a957 ("perf build: Add arch x86 objects building")
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1438704627.7315.2.camel@decadent.org.uk
    Signed-off-by: Arnaldo Carvalho de Melo

    Ben Hutchings
     
  • Moving counter processing code into stat object as
    perf_stat__process_counter.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-8-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Passing 'struct perf_stat_config' into process_counter(), so that we can
    make process_counter() non static and use it from other places.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-7-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
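
    The idea of passing a config structure instead of relying on file-local
    globals might look like the sketch below. The four field names come from
    this patch series (aggr_mode, scale, output, interval); everything else,
    including the counter structure, the enum values and the scaling by
    enabled/running time, is a hypothetical illustration.

```c
#include <stdbool.h>
#include <stdio.h>

enum aggr_mode_sketch { AGGR_NONE_SKETCH, AGGR_GLOBAL_SKETCH }; /* hypothetical */

/* Central stat configuration, passed explicitly to the routines using it. */
struct stat_config_sketch {
	enum aggr_mode_sketch aggr_mode;
	bool scale;
	FILE *output;
	unsigned int interval;
};

/* Hypothetical counter reading: raw value plus enabled/running times. */
struct counter_sketch {
	const char *name;
	unsigned long long val, ena, run;
};

/*
 * Because everything the function needs arrives through its parameters,
 * it no longer has to be static to one file.
 */
unsigned long long process_counter_sketch(const struct stat_config_sketch *config,
					  const struct counter_sketch *c)
{
	unsigned long long val = c->val;

	if (config->scale && c->run)
		val = val * c->ena / c->run;
	if (config->output)
		fprintf(config->output, "%s: %llu\n", c->name, val);
	return val;
}
```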
     
  • Moving 'interval' into struct perf_stat_config. The point is to
    centralize the base stat config so it could be used locally together with
    other stat routines in other parts of perf code.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-6-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Moving 'output' into struct perf_stat_config. The point is to centralize
    the base stat config so it could be used locally together with other stat
    routines in other parts of perf code.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-5-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Moving 'scale' into struct perf_stat_config. The point is to centralize
    the base stat config so it could be used locally together with other stat
    routines in other parts of perf code.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-4-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Moving 'aggr_mode' into new struct. The point is to centralize the base
    stat config so it could be used locally together with other stat routines
    in other parts of perf code.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1437481927-29538-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
    Commit 7b6ff0bdbf4f7f429c2116cca92a6d171217449e ("perf probe ppc64le:
    Fixup function entry if using kallsyms lookup") adds 'struct map' into
    probe-event.h but does not forward declare it. This patch fixes that.

    Signed-off-by: Wang Nan
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Fixes: 7b6ff0bdbf4f ("perf probe ppc64le: Fixup function entry if using kallsyms lookup")
    Link: http://lkml.kernel.org/n/1436445342-1402-30-git-send-email-wangnan0@huawei.com
    [ No need to include map.h, just forward declare 'struct map' ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
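
    A forward declaration of the kind mentioned above can be sketched like
    this. The surrounding structure and function are hypothetical; the point
    is that a header only needs the incomplete type when it uses pointers to
    'struct map', so it does not have to include the header defining it.

```c
#include <stddef.h>

/* Forward declaration: the type is named, but its layout stays unknown. */
struct map;

/* Hypothetical structure holding a pointer to the incomplete type. */
struct probe_point_sketch {
	const char *symbol;
	struct map *map; /* pointers to incomplete types are allowed */
};

/* Prototypes may mention the incomplete type through pointers as well. */
int point_has_map(const struct probe_point_sketch *pp);

int point_has_map(const struct probe_point_sketch *pp)
{
	return pp->map != NULL;
}
```

    Only code that dereferences the pointer or needs sizeof(struct map) has
    to see the full definition.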
     
  • va_args alternative to eprintf().

    Signed-off-by: Wang Nan
    Acked-by: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
    [ split from another patch ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
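
    The usual shape of such a va_list variant is sketched below, assuming it
    follows the common vprintf()/printf() split. The function names carry a
    _sketch suffix because the exact signatures in perf are not shown in
    this log; the level/var parameters are likewise an assumption.

```c
#include <stdarg.h>
#include <stdio.h>

/* va_list variant: usable by callers that already hold a va_list. */
int veprintf_sketch(int level, int var, const char *fmt, va_list args)
{
	int ret = 0;

	/* Print only if the current verbosity (var) reaches the level. */
	if (var >= level)
		ret = vfprintf(stderr, fmt, args);
	return ret;
}

/* Variadic front end, implemented on top of the va_list variant. */
int eprintf_sketch(int level, int var, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = veprintf_sketch(level, var, fmt, args);
	va_end(args);
	return ret;
}
```

    A print callback that receives (fmt, va_list) from a library can then
    forward straight to the va_list variant, which a purely variadic
    function cannot do.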
     
    By copying the BPF-related operations to the uprobe processing path, this
    patch allows users to attach BPF programs to uprobes, as they already
    can to kprobes.

    After this patch, users are allowed to use PERF_EVENT_IOC_SET_BPF on a
    uprobe perf event, which makes it possible to profile user space programs
    and kernel events together using BPF.

    Because of this patch, CONFIG_BPF_EVENTS should be selected by
    CONFIG_UPROBE_EVENT to ensure trace_call_bpf() is compiled even if
    KPROBE_EVENT is not set.

    Signed-off-by: Wang Nan
    Acked-by: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1435716878-189507-3-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • Commit e1abf2cc8d5d80b41c4419368ec743ccadbb131e ("bpf: Fix the build on
    BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable")
    updated the building condition of bpf_trace.o from CONFIG_BPF_SYSCALL
    to CONFIG_BPF_EVENTS, but the corresponding #ifdef controller in
    trace_events.h for trace_call_bpf() was not changed. Which, in theory,
    is incorrect.

    With current Kconfigs, we can create a .config with CONFIG_BPF_SYSCALL=y
    and CONFIG_BPF_EVENTS=n by unselecting CONFIG_KPROBE_EVENT and
    selecting CONFIG_BPF_SYSCALL. With these options, trace_call_bpf() will
    be defined as an extern function, but if anyone calls it a symbol missing
    error will be triggered since bpf_trace.o was not built.

    This patch changes the #ifdef controller for trace_call_bpf() from
    CONFIG_BPF_SYSCALL to CONFIG_BPF_EVENTS. I'll show its correctness:

    Before this patch:

    BPF_SYSCALL  BPF_EVENTS  trace_call_bpf  bpf_trace.o
    y            y           normal          compiled
    n            n           inline          not compiled
    y            n           normal          not compiled (incorrect)
    n            y           impossible (BPF_EVENTS depends on BPF_SYSCALL)

    After this patch:

    BPF_SYSCALL  BPF_EVENTS  trace_call_bpf  bpf_trace.o
    y            y           normal          compiled
    n            n           inline          not compiled
    y            n           inline          not compiled (fixed)
    n            y           impossible (BPF_EVENTS depends on BPF_SYSCALL)

    So this patch doesn't break anything. QED.

    Signed-off-by: Wang Nan
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: David Ahern
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kaixu Xia
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1435716878-189507-2-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
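
    The pattern behind this fix, guarding the declaration with the same
    symbol that controls whether the object file is built, can be sketched
    as below. This is an illustration, not the kernel header: the macro is
    left undefined to model CONFIG_BPF_EVENTS=n, and the stub's return value
    of 1 is an assumption made for the sketch.

```c
#include <stddef.h>

/* Left undefined on purpose to model CONFIG_BPF_EVENTS=n. */
/* #define CONFIG_BPF_EVENTS 1 */

struct bpf_prog; /* opaque here */

#ifdef CONFIG_BPF_EVENTS
/* "normal": extern declaration, resolved by bpf_trace.o at link time. */
unsigned int trace_call_bpf(struct bpf_prog *prog, void *ctx);
#else
/*
 * "inline": stub used whenever bpf_trace.o is not built, so callers
 * never reference a symbol that does not exist.
 */
static inline unsigned int trace_call_bpf(struct bpf_prog *prog, void *ctx)
{
	(void)prog;
	(void)ctx;
	return 1; /* assumed "nothing filtered" value */
}
#endif
```

    With the guard and the build rule keyed on the same symbol, the
    "normal, not compiled" row of the first table cannot occur.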
     

06 Aug, 2015

7 commits

  • It is cumbersome to manually calculate the total time spent in a given
    syscall by multiplying the average value with the number of calls.

    Instead, we now do this directly inside perf trace.

    Note that this is also done by 'strace', which even adds a column with
    relative numbers - something we could do in the future.

    Example:

    perf trace -s find /some/folder > /dev/null

    Summary of events:

    find (19976), 700123 events, 100.0%, 0.000 msec

    syscall            calls     total       min       avg       max     stddev
                                (msec)    (msec)    (msec)    (msec)        (%)
    --------------- -------- --------- --------- --------- ---------     ------
    read                   4     0.006     0.001     0.002     0.003     27.42%
    write               8046     9.617     0.001     0.001     0.035      0.56%
    open               34196    40.384     0.001     0.001     0.071      0.30%
    close              68375    57.104     0.001     0.001     0.076      0.25%
    stat                   4     0.004     0.001     0.001     0.001      3.14%
    fstat              34189    27.518     0.001     0.001     0.060      0.34%
    mmap                  13     0.029     0.001     0.002     0.003     10.74%
    mprotect               6     0.018     0.002     0.003     0.005     17.04%
    munmap                 3     0.014     0.003     0.005     0.006     24.87%
    brk                   87     0.490     0.001     0.006     0.016      6.50%
    ioctl                  3     0.004     0.001     0.001     0.003     36.39%
    access                 1     0.004     0.004     0.004     0.004      0.00%
    uname                  1     0.001     0.001     0.001     0.001      0.00%
    getdents           68393   143.600     0.001     0.002     0.187      0.95%
    fchdir             68371    56.980     0.001     0.001     0.111      0.39%
    arch_prctl             1     0.001     0.001     0.001     0.001      0.00%
    openat             34184    41.737     0.001     0.001     0.102      0.41%
    newfstatat         34184    41.180     0.001     0.001     0.064      0.34%

    Signed-off-by: Milian Wolff
    Tested-by: Arnaldo Carvalho de Melo
    LPU-Reference: 1438853069-5902-1-git-send-email-milian.wolff@kdab.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • …/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    New features:

    - Deref sys_enter pointer args with contents from probe:vfs_getname, showing
    pathnames instead of pointers in many syscalls in 'perf trace'. (Arnaldo Carvalho de Melo)

    - Make 'perf trace' write to stderr by default, just like 'strace'. (Milian Wolff)

    Infrastructure changes:

    - color_vfprintf() fixes. (Andi Kleen, Jiri Olsa)

    - Allow enabling/disabling PERF_SAMPLE_TIME per event. (Kan Liang)

    - Fix build errors with mipsel-linux-uclibc compiler. (Petri Gynther)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • linux/tools$ make ARCH=mips CROSS_COMPILE=mipsel-linux- perf
    ...
    config/Makefile:256: *** No gnu/libc-version.h found, please install
    glibc-dev[el]. Stop.
    make[1]: *** [all] Error 2
    make: *** [perf] Error 2

    ...
    In file included from builtin-sched.c:13:0:
    util/cloexec.h:8:12: error: redundant redeclaration of ‘sched_getcpu’
    [-Werror=redundant-decls]
    extern int sched_getcpu(void) __THROW;

    mipsel-buildroot-linux-uclibc/sysroot/usr/include/bits/sched.h:88:12:
    note: previous declaration of ‘sched_getcpu’ was here
    extern int sched_getcpu (void) __THROW;

    uclibc info:
    sysroot/usr/include/bits/uClibc_config.h
    __UCLIBC_MAJOR__ 0
    __UCLIBC_MINOR__ 9
    __UCLIBC_SUBLEVEL__ 33

    sysroot/usr/include/features.h
    __UCLIBC__ 1
    __GLIBC__ 2
    __GLIBC_MINOR__ 2

    Signed-off-by: Petri Gynther
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1438735081-24131-1-git-send-email-pgynther@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Petri Gynther
     
    Without this patch, it is cumbersome to read the trace output while
    ignoring the normal, potentially verbose, output of the debuggee. One
    common example is doing something like the following:

    perf trace -s find /tmp > /dev/null

    Without this patch, the trace summary will be lost. Now, it will still
    be printed at the end. This behavior is also applied by strace.

    Cc: Milian Wolff
    Cc: David Ahern
    Link: http://lkml.kernel.org/n/tip-tqnks6y2cnvm5f9g2dsfr7zl@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • color_vprintf was including the length of the invisible escape sequences
    in its return argument. Don't include them to make the return value
    usable for indentation calculations.

    v2: Add comment, rebase

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1438649408-20807-3-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
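
    The return-value fix above can be illustrated with a minimal sketch that
    counts only the visible characters. The function names and the escape
    sequences used are assumptions for illustration, not the perf source.

```c
#include <stdarg.h>
#include <stdio.h>

#define COLOR_RED_SKETCH   "\033[31m" /* assumed escape sequences */
#define COLOR_RESET_SKETCH "\033[m"

/* Returns the number of visible characters written, without the escapes. */
int color_vfprintf_sketch(FILE *fp, const char *color, const char *fmt,
			  va_list args)
{
	int r;

	if (color)
		fputs(color, fp);          /* invisible: not counted */
	r = vfprintf(fp, fmt, args);       /* visible text: counted */
	if (color)
		fputs(COLOR_RESET_SKETCH, fp); /* invisible: not counted */
	return r;
}

int color_fprintf_sketch(FILE *fp, const char *color, const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = color_vfprintf_sketch(fp, color, fmt, args);
	va_end(args);
	return r;
}
```

    A caller can then pad a column to a fixed width by subtracting the
    return value from that width, with or without color enabled.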
     
  • Seems like it's always '\n' through color_fprintf_ln, which is not used
    at all, removing.. ;-)

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1438649408-20807-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Andi Kleen
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Pass global callchain_param into parse_callchain_record_opt and
    perf_evsel__config_callgraph as parameter. So we can reuse these
    functions to parse/config local param for callchain.

    Signed-off-by: Kan Liang
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1438677022-34296-3-git-send-email-kan.liang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

05 Aug, 2015

6 commits

  • This patchkit adds the ability to turn off time stamps per event.

    One useful case for partial time is to work with per-event callgraph to
    enable "PEBS threshold > 1" (https://lkml.org/lkml/2015/5/10/196), which
    can significantly reduce the sampling overhead.

    The event samples with time stamps off will not be ordered.

    Signed-off-by: Kan Liang
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/1438677022-34296-2-git-send-email-kan.liang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     
  • Those were covered and tested in this cset:

    access, chdir, chmod, chown, chroot, creat, getxattr,
    inotify_add_watch, lchown, lgetxattr, listxattr,
    lsetxattr, mkdir, mkdirat, mknod, rmdir, faccessat,
    newfstatat, openat, readlink, readlinkat, removexattr,
    setxattr, statfs, swapon, swapoff, truncate, unlinkat,
    utime, utimes, utimensat.

    E.g.:

    # trace -e statfs,access,mkdir mkdir /tmp/bla
    0.285 (0.020 ms): mkdir/2799 access(filename: /etc/ld.so.preload, mode: R ) = -1 ENOENT No such file or directory
    1.070 (0.032 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc930) = 0
    1.087 (0.013 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc820) = 0
    1.189 (0.014 ms): mkdir/2799 access(filename: /etc/selinux/config ) = 0
    1.905 (0.610 ms): mkdir/2799 mkdir(pathname: /tmp/bla, mode: 511 ) = 0
    #

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-wbqtnlktquun3wtpjdz3okul@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • To work like strace and dereference syscall pointer args we need to
    insert probes (or tracepoints) right after we copy those bytes from
    userspace.

    Since we're formatting the syscall args at raw_syscalls:sys_enter time,
    we need to have a formatter that just stores the position where, later,
    when we get the probe:vfs_getname, we can insert the pointer contents.

    Now, if a probe:vfs_getname with this format is in place:

    # perf probe -l
    probe:vfs_getname (on getname_flags:72@/home/git/linux/fs/namei.c with pathname)

    That was, in this case, put in place with:

    # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
    Added new event:
    probe:vfs_getname (on getname_flags:72 with pathname=filename:string)

    You can now use it in all perf tools, such as:

    perf record -e probe:vfs_getname -aR sleep 1
    #

    Then 'perf trace' will notice that and do the pointer -> contents
    expansion:

    # trace -e open touch /tmp/bla
    0.165 (0.010 ms): touch/17752 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
    0.195 (0.011 ms): touch/17752 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
    0.512 (0.012 ms): touch/17752 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
    0.582 (0.012 ms): touch/17752 open(filename: /tmp/bla, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: 438) = 3
    #

    Roughly equivalent to strace's output:

    # strace -rT -e open touch /tmp/bla
    0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    0.000317 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    0.001461 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
    0.000405 open("/tmp/bla", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
    0.000641 +++ exited with 0 +++
    #

    Now we need to look at all syscalls that have arguments marked as
    pointers with some well known names ("filename", "pathname", etc)
    and set their arg formatter to the one used for the "open" syscall in
    this patch.

    This implementation works for syscalls with just one string being copied
    from userspace. For syscalls with more than one string being copied via
    the same probe/trace point (vfs_getname), we need to extend the
    vfs_getname probe spec to include the pointer too. There are some
    problems with that in 'perf probe' or the kernel kprobes code, which
    need investigating before considering support for multiple strings per
    syscall.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-xvuwx6nuj8cf389kf9s2ue2s@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
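
    The "store the position now, insert the contents later" approach above
    can be sketched as follows. The buffer type and function names are
    hypothetical; the sketch assumes a single pending pathname per line, as
    the commit message describes, and uses memmove() to open a gap for it.

```c
#include <string.h>

/* Hypothetical per-thread buffer for one formatted syscall entry. */
struct entry_sketch {
	char buf[128];
	size_t len;       /* bytes used, not counting the trailing NUL */
	size_t name_pos;  /* offset where the pathname must be inserted */
	int name_pending; /* set at sys_enter, cleared once expanded */
};

/* Append text, truncating if the buffer is full. */
void entry_append(struct entry_sketch *e, const char *s)
{
	size_t n = strlen(s);

	if (e->len + n >= sizeof(e->buf))
		n = sizeof(e->buf) - 1 - e->len;
	memcpy(e->buf + e->len, s, n);
	e->len += n;
	e->buf[e->len] = '\0';
}

/* sys_enter time: the formatter prints nothing, it only marks the spot. */
void entry_mark_name(struct entry_sketch *e)
{
	e->name_pos = e->len;
	e->name_pending = 1;
}

/* vfs_getname time: shift the tail right and copy the pathname in. */
int entry_insert_name(struct entry_sketch *e, const char *name)
{
	size_t n = strlen(name);

	if (!e->name_pending || e->len + n >= sizeof(e->buf))
		return -1;
	memmove(e->buf + e->name_pos + n, e->buf + e->name_pos,
		e->len - e->name_pos + 1); /* +1 moves the NUL as well */
	memcpy(e->buf + e->name_pos, name, n);
	e->len += n;
	e->name_pending = 0;
	return 0;
}
```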
     
  • We were using it as a magic number, 1024, fix that.

    Eventually we need to stop doing it per line, and do it per
    arg, traversing the args at output time, to avoid the memmove()
    calls that will be used in the next cset to replace pointers
    present at raw_syscalls:sys_enter time with its contents that
    appear at probe:vfs_getname time, before raw_syscalls:sys_exit
    time.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-4sz3wid39egay1pp8qmbur4u@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we can later decide if we will store where to expand the
    pathname once we are handling vfs_getname or if we should instead
    just go on and straight away print the pointer.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-ytxk5s5jpc50wahffmlxgxuw@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We were accessing trace->syscalls.events members even when that struct
    wasn't initialized, i.e. --no-syscalls was specified on the command
    line, fix it to show that, still in debug mode, when we have an event
    qualifier list, i.e. when we actually are doing subset syscall tracing.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Fixes: 19867b6186f3 ("perf trace: Use event filters for the event qualifier list")
    Link: http://lkml.kernel.org/n/tip-7980ym6vujgh3yiai0cqzc88@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

04 Aug, 2015

5 commits

  • The libtraceevent handler (session->tevent) is only initialized when
    there are tracepoints in a perf.data event list, so do not call
    pevent_set_function_resolve() in those cases, fixing a segfault.

    Reported-by: Jiri Olsa
    Tested-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-xyynkucl5p4bcs13zi4i4b1f@git.kernel.org
    Report-link: http://lkml.kernel.org/r/20150803174113.GA20282@krava.redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Vince Weaver and Stephane Eranian reported warnings in the PEBS
    code when running the perf fuzzer. Stephane wrote:

    > I can reproduce the problem on my HSW running the fuzzer.
    >
    > I can see why this could be happening if you are mixing PEBS and non PEBS events
    > in the bottom 4 counters. I suspect:
    > for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
    > if ((counts[bit] == 0) && (error[bit] == 0))
    > continue;
    >
    > This test is not correct when you have non-PEBS events mixed with
    > PEBS events and they overflow at the same time. They will have
    > counts[i] != 0 but error[i] == 0, and thus you fall thru the loop
    > and hit the assert. Or it is something along those lines.

    The only way I can make this work is if ->status only has !PEBS events
    set, because if it has both set we'll take that slow path which masks
    out the !PEBS bits.

    After masking there are 3 options:

    - there is one bit set, and it's @bit, we increment counts[bit].

    - there are multiple bits set, we increment error[] for each set bit,
    we do not increment counts[].

    - there are no bits set, we do nothing.

    The intent was to never increment counts[] for !PEBS events.

    Now if we start out with only a single !PEBS event set, we'll pass the
    test and increment counts[] for a !PEBS and hit the warn.

    Reported-by: Vince Weaver
    Reported-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kan.liang@intel.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
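
    The three options listed above can be modelled with a small sketch.
    This is a simplified, hypothetical model and not the kernel's drain
    code: it assumes each record carries a status bitmask, that
    pebs_enabled has one bit per PEBS event, and that masking happens
    before any counting, so a lone !PEBS bit can never reach counts[].

```c
#include <stdint.h>

#define MAX_PEBS_EVENTS_SKETCH 8 /* assumed number of PEBS-capable counters */

void account_pebs_record(uint64_t status, uint64_t pebs_enabled,
			 unsigned int counts[MAX_PEBS_EVENTS_SKETCH],
			 unsigned int error[MAX_PEBS_EVENTS_SKETCH])
{
	/* Mask out !PEBS bits first, whatever the status looks like. */
	uint64_t pebs = status & pebs_enabled &
			((1ULL << MAX_PEBS_EVENTS_SKETCH) - 1);
	unsigned int bit;

	/* No bits set: do nothing. */
	if (!pebs)
		return;

	/* Exactly one bit set: that event owns the record. */
	if ((pebs & (pebs - 1)) == 0) {
		for (bit = 0; !(pebs & (1ULL << bit)); bit++)
			;
		counts[bit]++;
		return;
	}

	/* Several bits set: the owner is ambiguous, count errors only. */
	for (bit = 0; bit < MAX_PEBS_EVENTS_SKETCH; bit++)
		if (pebs & (1ULL << bit))
			error[bit]++;
}
```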
     
  • When disabling a PEBS event, we need to drain the buffer. Doing so
    requires a correct cpuc->pebs_active mask.

    The current code clears the pebs_active bit before draining the
    buffer. Fix that.

    Signed-off-by: "Liang, Kan"
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com
    [ Fixed the SOB. ]
    Signed-off-by: Ingo Molnar

    Liang, Kan
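
    The ordering issue above can be shown with a toy model. Everything here
    is hypothetical (structure, field names, the rule that records of
    inactive events are dropped); it only demonstrates why the active bit
    has to stay set until the drain has run.

```c
#include <stdint.h>

#define NR_EVENTS_SKETCH 8 /* assumed size for the sketch */

struct cpu_sketch {
	uint64_t active;                          /* one bit per active event */
	unsigned int pending[NR_EVENTS_SKETCH];   /* records still buffered */
	unsigned int delivered[NR_EVENTS_SKETCH]; /* records handed over */
};

/* Deliver buffered records, but only for events still marked active. */
void drain_sketch(struct cpu_sketch *c)
{
	for (unsigned int bit = 0; bit < NR_EVENTS_SKETCH; bit++) {
		if (c->active & (1ULL << bit))
			c->delivered[bit] += c->pending[bit];
		c->pending[bit] = 0; /* inactive events lose their records */
	}
}

/* Correct order: drain while the bit is set, then clear it. */
void disable_event_sketch(struct cpu_sketch *c, unsigned int bit)
{
	drain_sketch(c);
	c->active &= ~(1ULL << bit);
}
```

    Swapping the two statements in disable_event_sketch() would make the
    drain skip the event being disabled and lose its pending records.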
     
    This patch adds an MSR PMU to support free-running MSR counters. These
    include the time- and frequency-related counters TSC, IA32_APERF,
    IA32_MPERF and IA32_PPERF, as well as SMI_COUNT.

    The events are exposed in sysfs for use by perf stat and other tools.
    The files are under /sys/devices/msr/events/

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Kan Liang
    [ s/freq/msr/, added SMI_COUNT, fixed bugs. ]
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: acme@kernel.org
    Cc: adrian.hunter@intel.com
    Cc: dsahern@gmail.com
    Cc: eranian@google.com
    Cc: jolsa@kernel.org
    Cc: mark.rutland@arm.com
    Cc: namhyung@kernel.org
    Link: http://lkml.kernel.org/r/1437407346-31186-1-git-send-email-kan.liang@intel.com
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
    The uncore subsystem for Broadwell-DE is similar to Haswell-EP. There
    are some differences in PCI device IDs, box numbers and constraints.

    Please refer to the public document:

    http://www.intel.com/content/www/us/en/processors/xeon/xeon-d-1500-uncore-performance-monitoring.html

    Signed-off-by: Kan Liang
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: eranian@google.com
    Link: http://lkml.kernel.org/r/1435839172-15114-1-git-send-email-kan.liang@intel.com
    Signed-off-by: Ingo Molnar

    Kan Liang