13 Apr, 2016

14 commits

  • Add a --color-cpus option to display selected cpus with a background
    color (red by default). It helps with navigating the 'perf sched map'
    output.

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-8-git-send-email-jolsa@kernel.org
    [ Added entry to man page ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add a --color-pids option to display selected pids in color (blue by
    default). It helps with navigating the 'perf sched map' output.

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-7-git-send-email-jolsa@kernel.org
    [ Added entry to man page ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • It will be used in following patch.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-6-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • As preparation for next patch.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-5-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add compact map display that does not output the whole cpu matrix, only
    the cpus that got an event.

    $ perf sched map --compact
    *A0 1082427.094098 secs A0 => perf:19404 (CPU 2)
    A0 *. 1082427.094127 secs . => swapper:0 (CPU 1)
    A0 . *B0 1082427.094174 secs B0 => rcuos/2:25 (CPU 3)
    A0 . *. 1082427.094177 secs
    *C0 . . 1082427.094187 secs C0 => migration/2:21
    C0 *A0 . 1082427.094193 secs
    *. A0 . 1082427.094195 secs
    *D0 A0 . 1082427.094402 secs D0 => rngd:968
    *. A0 . 1082427.094406 secs
    . *E0 . 1082427.095221 secs E0 => kworker/1:1:5333
    . E0 *F0 1082427.095227 secs F0 => xterm:3342

    It helps to display sane output for small thread loads on big cpu
    servers.

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-4-git-send-email-jolsa@kernel.org
    [ Add entry in 'perf sched' man page ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
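
In the sample output above, columns appear in the order in which cpus first produce an event (CPU 2, then CPU 1, then CPU 3). A minimal sketch of that column assignment follows, assuming that reading of the output is right. All names (sketch_compact, sketch_compact__column, SKETCH_MAX_CPUS) are hypothetical and are not taken from the perf sources.

```c
#define SKETCH_MAX_CPUS 1024

/* Hypothetical state: cpus listed in the order they first got an event. */
struct sketch_compact {
	int cpus[SKETCH_MAX_CPUS];
	int nr;
};

/*
 * Return the display column for 'cpu'. A cpu seen for the first time is
 * given the next free column. Returns -1 if the table is full.
 */
static int sketch_compact__column(struct sketch_compact *c, int cpu)
{
	for (int i = 0; i < c->nr; i++) {
		if (c->cpus[i] == cpu)
			return i;
	}

	if (c->nr == SKETCH_MAX_CPUS)
		return -1;

	c->cpus[c->nr] = cpu;
	return c->nr++;
}
```

With this scheme the width of each printed row is bounded by the number of cpus that actually ran something, not by the number of cpus in the machine.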
     
  • Add cpu_map__has() to return whether a cpu is present in a cpus map.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Add thread_map__has() to return whether a pid is present in a threads map.

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460467771-26532-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
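
Both membership helpers (cpu_map__has() in the entry above and thread_map__has() here) boil down to a search over the map's entries. The sketch below shows the idea with simplified, hypothetical structures (sketch_cpu_map, sketch_thread_map); the real perf maps have a different layout, so treat this as an illustration only.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-ins for perf's cpu and thread maps. */
struct sketch_cpu_map {
	int nr;
	const int *cpus;
};

struct sketch_thread_map {
	int nr;
	const pid_t *pids;
};

/* Return true if 'cpu' is one of the entries in the map. */
static bool sketch_cpu_map__has(const struct sketch_cpu_map *map, int cpu)
{
	for (int i = 0; i < map->nr; i++) {
		if (map->cpus[i] == cpu)
			return true;
	}
	return false;
}

/* Return true if 'pid' is one of the entries in the map. */
static bool sketch_thread_map__has(const struct sketch_thread_map *map, pid_t pid)
{
	for (int i = 0; i < map->nr; i++) {
		if (map->pids[i] == pid)
			return true;
	}
	return false;
}
```

A caller such as a --color-cpus or --color-pids style option would build the map once from the option argument and then query it per printed cell.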
     
  • We already were able to ask for callchains for a specific event:

    # trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1

    This would enable tracing just the "nanosleep" syscall, with callchains
    at syscall exit, and would ask the kernel for frame pointer callchains to
    be enabled for the "sched:sched_switch" tracepoint event. It's just that
    we were not resolving the callchain and printing it in 'perf trace'. Do
    it:

    # trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1
    0.425 ( 0.013 ms): usleep/6718 nanosleep(rqtp: 0x7ffcc1d16e20) ...
    0.425 ( ): sched:sched_switch:usleep:6718 [120] S ==> swapper/2:0 [120])
    __schedule+0xfe200402 ([kernel.kallsyms])
    schedule+0xfe200035 ([kernel.kallsyms])
    do_nanosleep+0xfe20006f ([kernel.kallsyms])
    hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
    sys_nanosleep+0xfe20007a ([kernel.kallsyms])
    do_syscall_64+0xfe200062 ([kernel.kallsyms])
    return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
    __nanosleep+0xffff008b8cbe2010 (/usr/lib64/libc-2.22.so)
    0.486 ( 0.073 ms): usleep/6718 ... [continued]: nanosleep()) = 0
    __nanosleep+0x10 (/usr/lib64/libc-2.22.so)
    usleep+0x34 (/usr/lib64/libc-2.22.so)
    main+0x1eb (/usr/bin/usleep)
    __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
    _start+0x29 (/usr/bin/usleep)
    #

    Pretty compact, huh? DWARF callchains for raw_syscalls:sys_exit + frame
    pointer callchains for a tracepoint, if your hardware supports LBR, go
    wild with /call-graph=lbr/, guess the next step is to lift this from
    'perf script':

    -F, --fields comma separated output fields prepend with 'type:'. Valid types: hw,sw,trace,raw.
    Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,period,iregs,brstack,brstacksym,flags

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-2e7yiv5hqdm8jywlmfivvx2v@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The new sanity check introduced by:

    26657848502b ("perf/core: Verify we have a single perf_hw_context PMU")

    ... triggered on the AMD uncore driver.

    Uncore PMUs are per node, they cannot have per-task counters. Fix it.

    Reported-by: Borislav Petkov
    Reported-by: Ingo Molnar
    Tested-by: Borislav Petkov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: acme@redhat.com
    Cc: alexander.shishkin@linux.intel.com
    Cc: eranian@google.com
    Cc: jolsa@redhat.com
    Cc: linux-tip-commits@vger.kernel.org
    Cc: vincent.weaver@maine.edu
    Link: http://lkml.kernel.org/r/20160404140208.GA3448@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • At the moment, the initialization path uses test_cpu_cap(&boot_cpu_data)
    to detect PT, which is just open coding boot_cpu_has(). Use the latter
    instead.

    Signed-off-by: Alexander Shishkin
    Acked-by: Borislav Petkov
    Cc: Arnaldo Carvalho de Melo
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: eranian@google.com
    Cc: vince@deater.net
    Link: http://lkml.kernel.org/r/1459953307-14372-1-git-send-email-alexander.shishkin@linux.intel.com
    Signed-off-by: Ingo Molnar

    Alexander Shishkin
     
  • The uprobe_xol_ops structures are never modified, so declare them as const.

    Done with the help of Coccinelle.

    Signed-off-by: Julia Lawall
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kernel-janitors@vger.kernel.org
    Link: http://lkml.kernel.org/r/1460200649-32526-1-git-send-email-Julia.Lawall@lip6.fr
    Signed-off-by: Ingo Molnar

    Julia Lawall
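
As an illustration of the pattern, the sketch below declares an operations table that is filled in once and never written afterwards, so it can be const and callers only ever hold a pointer to const. The structure and function names are hypothetical and do not mirror the real uprobe_xol_ops layout.

```c
#include <stdbool.h>

/* Hypothetical instruction descriptor. */
struct sketch_insn {
	int opcode;
};

/* Hypothetical ops table: only function pointers, set at build time. */
struct sketch_xol_ops {
	bool (*emulate)(const struct sketch_insn *insn);
	int (*pre_xol)(const struct sketch_insn *insn);
};

static bool sketch_default_emulate(const struct sketch_insn *insn)
{
	(void)insn;
	return false;	/* this sketch never emulates anything */
}

static int sketch_default_pre_xol(const struct sketch_insn *insn)
{
	(void)insn;
	return 0;	/* nothing to prepare in this sketch */
}

/* Never modified after initialization, hence const. */
static const struct sketch_xol_ops sketch_default_xol_ops = {
	.emulate = sketch_default_emulate,
	.pre_xol = sketch_default_pre_xol,
};

/* Callers get a pointer to const, so they cannot patch the table. */
static const struct sketch_xol_ops *sketch_pick_ops(const struct sketch_insn *insn)
{
	(void)insn;
	return &sketch_default_xol_ops;
}
```

Declaring such tables const lets the compiler place them in read-only data and turns an accidental write into a build error.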
     
  • …ernel/git/acme/linux into perf/core

    Pull perf/core improvements from Arnaldo Carvalho de Melo:

    User visible changes:

    - Automagically create a 'bpf-output' event, easing the setup of BPF
    C "scripts" that produce output via the perf ring buffer. Now it is
    just a matter of calling any perf tool, such as 'trace', with a C
    source file that references the __bpf_stdout__ output channel and
    that channel will be created and connected to the script:

    # trace -e nanosleep --event test_bpf_stdout.c usleep 1
    0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ...
    0.013 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
    0.261 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
    0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0
    #

    Further work is needed to reduce the number of lines in a perf bpf C source
    file; this is the part where we greatly reduce the command line setup (Wang Nan)

    - 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
    libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
    processing. This reduces the overhead by asking just for userspace callchains
    and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
    (Milian Wolff, Arnaldo Carvalho de Melo)

    Try it with, for instance:

    # perf trace --call dwarf ping 127.0.0.1

    An excerpt of a system wide 'perf trace --call dwarf' session is at:

    https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt

    You may need to bump the number of mmap pages, using -m/--mmap-pages,
    but on a Broadwell machine the defaults allowed system wide tracing to
    work without losing that many records, experiment with just some
    syscalls, like:

    # perf trace --call dwarf -e nanosleep,futex

    All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
    etc) should work. Also --duration may be interesting to try.

    To get filenames from pointer args in various syscalls (open, etc.), add this
    to the mix:

    # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'

    Making this work is next in line:

    # trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1

    I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
    in raw_syscalls:sys_exit.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • …ernel/git/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    User visible changes:

    - Beautify more syscall arguments in 'perf trace', using the type column in
    tracepoint /format fields to attach, for instance, a pid_t resolver to the
    thread COMM, also attach a mode_t beautifier in the same fashion
    (Arnaldo Carvalho de Melo)

    - Build the syscall table id <-> name resolver using the same .tbl file
    used in the kernel to generate headers, to avoid the delay in getting
    new syscalls supported in the audit-libs external dependency, done so
    far only for x86_64 (Arnaldo Carvalho de Melo)

    - Improve the documentation of event specifications (Andi Kleen)

    - Process update events in 'perf script', fixing up this use case:

    # perf stat -a -I 1000 -e cycles record | perf script -s script.py

    - Shared object symbol adjustment fixes, fixing symbol resolution in
    Android (Wang Nan)

    Infrastructure changes:

    - Add dedicated unwind addr_space member into thread struct, to allow
    tools to use thread->priv, noticed while working on having callchains
    in 'perf trace' (Jiri Olsa)

    Build fixes:

    - Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     

12 Apr, 2016

14 commits

  • Instead of having "[unknown]" as the name used for unresolved symbols,
    use the address in the callchain, in hexadecimal form:

    28.801 ( 0.007 ms): qemu-system-x8/10065 ppoll(ufds: 0x55c98b39e400, nfds: 72, tsp: 0x7fffe4e4fe60, sigsetsize: 8) = 0 Timeout
    ppoll+0x91 (/usr/lib64/libc-2.22.so)
    [0x337309] (/usr/bin/qemu-system-x86_64)
    [0x336ab4] (/usr/bin/qemu-system-x86_64)
    main+0x1724 (/usr/bin/qemu-system-x86_64)
    __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
    [0xc59a9] (/usr/bin/qemu-system-x86_64)
    35.265 (14.805 ms): gnome-shell/2287 ... [continued]: poll()) = 1
    [0xf6fdd] (/usr/lib64/libc-2.22.so)
    g_main_context_iterate.isra.29+0x17c (/usr/lib64/libglib-2.0.so.0.4600.2)
    g_main_loop_run+0xc2 (/usr/lib64/libglib-2.0.so.0.4600.2)
    meta_run+0x2c (/usr/lib64/libmutter.so.0.0.0)
    main+0x3f7 (/usr/bin/gnome-shell)
    __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
    [0x2909] (/usr/bin/gnome-shell)

    Suggested-by: Milian Wolff
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The fprintf_sym() and fprintf_callchain() methods now allow users to
    change the existing behaviour of showing "[unknown]" as the name of
    unresolved symbols to instead show "[0x123456]", i.e. its address.

    The current patch doesn't change tools to use this facility; the results
    from 'perf trace' and 'perf script' continue like:

    70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
    [unknown] (/usr/lib64/libc-2.22.so)
    [unknown] (/usr/lib64/libspice-server.so.1.10.0)
    [unknown] (/usr/lib64/libspice-server.so.1.10.0)
    [unknown] (/usr/lib64/libspice-server.so.1.10.0)
    start_thread+0xca (/usr/lib64/libpthread-2.22.so)
    __clone+0x6d (/usr/lib64/libc-2.22.so)

    The next patch will make 'perf trace' use the new formatting.

    Suggested-by: Milian Wolff
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
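
The choice between "[unknown]" and "[0x123456]" can be sketched as a small formatting helper. This is an assumption-laden illustration: sketch_snprintf_sym() and its parameters are hypothetical, and it formats into a buffer rather than a FILE so that the result is easy to inspect.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one callchain entry into 'buf'.
 *   name            resolved symbol name, or NULL if unresolved
 *   offset          offset inside the symbol (used only when resolved)
 *   addr            raw address (used only when unresolved)
 *   unknown_as_addr show "[0x<addr>]" instead of "[unknown]"
 * Returns what snprintf(3) returns.
 */
static int sketch_snprintf_sym(char *buf, size_t size, const char *name,
			       uint64_t offset, uint64_t addr,
			       bool unknown_as_addr)
{
	if (name != NULL)
		return snprintf(buf, size, "%s+0x%" PRIx64, name, offset);

	if (unknown_as_addr)
		return snprintf(buf, size, "[0x%" PRIx64 "]", addr);

	return snprintf(buf, size, "[unknown]");
}
```

Keeping the old string as the default, with the address form behind a flag, matches the commit's note that existing tool output does not change until a tool opts in.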
     
  • We don't need the callchains at the syscall enter tracepoint, just when
    finishing it at syscall exit, so reduce the overhead by asking for
    callchains just at syscall exit.

    Suggested-by: Milian Wolff
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The rename is for consistency with the parameter name.

    Make it public for fine grained control of which evsels should have
    callchains enabled, like, for instance, will be done in the next
    changesets in 'perf trace', to enable callchains just on the
    "raw_syscalls:sys_exit" tracepoint.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • For fiddling with sample_type fields in all evsels in an evlist.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Instead receive a callchain_param pointer to configure callchain
    aspects, not doing so if NULL is passed.

    This will allow fine grained control over which evsels in an evlist
    gets callchains enabled.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
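
A minimal sketch of the "NULL means do not configure callchains" convention, and of how a caller could use it to enable callchains on a single evsel. The types and functions here (sketch_callchain_param, sketch_evsel, sketch_evsel__config, sketch_evlist__config) are hypothetical, pared-down stand-ins and not the perf API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down callchain configuration. */
struct sketch_callchain_param {
	bool enabled;
};

/* Hypothetical, pared-down event selector. */
struct sketch_evsel {
	const char *name;
	bool callchain;
};

/* Configure one evsel; callchain setup is skipped when 'callchain' is NULL. */
static void sketch_evsel__config(struct sketch_evsel *evsel,
				 const struct sketch_callchain_param *callchain)
{
	evsel->callchain = (callchain != NULL && callchain->enabled);
}

/* Pass the callchain config only to the evsel named 'target'. */
static void sketch_evlist__config(struct sketch_evsel *evsels, size_t nr,
				  const struct sketch_callchain_param *callchain,
				  const char *target)
{
	for (size_t i = 0; i < nr; i++) {
		bool wanted = strcmp(evsels[i].name, target) == 0;

		sketch_evsel__config(&evsels[i], wanted ? callchain : NULL);
	}
}
```

Calling sketch_evlist__config() with "raw_syscalls:sys_exit" as the target leaves every other evsel without callchains.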
     
  • The kernel parts are not that useful:

    # trace -m 512 -e nanosleep --call dwarf usleep 1
    0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0
    syscall_slow_exit_work ([kernel.kallsyms])
    do_syscall_64 ([kernel.kallsyms])
    return_from_SYSCALL_64 ([kernel.kallsyms])
    __nanosleep (/usr/lib64/libc-2.22.so)
    usleep (/usr/lib64/libc-2.22.so)
    main (/usr/bin/usleep)
    __libc_start_main (/usr/lib64/libc-2.22.so)
    _start (/usr/bin/usleep)
    #

    So let's just use perf_event_attr.exclude_callchain_kernel to avoid
    collecting it in the ring buffer:

    # trace -m 512 -e nanosleep --call dwarf usleep 1
    0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0
    __nanosleep (/usr/lib64/libc-2.22.so)
    usleep (/usr/lib64/libc-2.22.so)
    main (/usr/bin/usleep)
    __libc_start_main (/usr/lib64/libc-2.22.so)
    _start (/usr/bin/usleep)
    #

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
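
For reference, exclude_callchain_kernel is a bit in struct perf_event_attr from <linux/perf_event.h>. The sketch below only fills in an attr; the helper name is hypothetical, the tracepoint id that would go in attr->config is deliberately left out, and the extra sample_type bits a DWARF unwind needs are not shown.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Fill 'attr' for a tracepoint event that samples callchains but asks
 * the kernel not to record the kernel part of the chain.
 * Assumption: the caller still has to set attr->config to the tracepoint
 * id before passing the attr to perf_event_open(2).
 */
static void sketch_user_callchain_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_TRACEPOINT;
	attr->sample_type = PERF_SAMPLE_CALLCHAIN;
	attr->exclude_callchain_kernel = 1;
}
```

Filtering in the kernel this way means the unwanted frames never reach the ring buffer, which is cheaper than dropping them at print time.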
     
  • In 'perf trace' we're just interested in printing callchains, and we
    don't want to use the symbol_conf.use_callchain, so move the callchain
    part to a new method.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • As it receives a FILE, and it's more than just the IP, which can even be
    requested not to be printed.

    For consistency with other similar methods in tools/perf/, name it as
    perf_evsel__fprintf_sym() and make it return the number of bytes
    printed, just like 'fprintf(3)'

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
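
The "return the number of bytes printed, just like fprintf(3)" convention lets callers add up the output of nested printers. A small sketch follows; the names and the left_alignment parameter are hypothetical, and error returns from fprintf() are ignored to keep it short.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical resolved callchain entry. */
struct sketch_entry {
	const char *sym;
	const char *dso;
};

/* Print one entry, indented by 'left_alignment' spaces; return bytes printed. */
static int sketch_fprintf_sym(FILE *fp, const struct sketch_entry *e,
			      int left_alignment)
{
	return fprintf(fp, "%*s%s (%s)\n", left_alignment, "", e->sym, e->dso);
}

/* Print a whole chain; the return value is the sum of the per-entry counts. */
static int sketch_fprintf_callchain(FILE *fp, const struct sketch_entry *chain,
				    size_t nr, int left_alignment)
{
	int printed = 0;

	for (size_t i = 0; i < nr; i++)
		printed += sketch_fprintf_sym(fp, &chain[i], left_alignment);

	return printed;
}
```

Because every layer returns a count, a caller can keep track of how much it has written without re-measuring the output.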
     
  • Now, one can print the call chain for every encountered sys_exit event,
    e.g.:

    $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep
    1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0
    syscall_slow_exit_work ([kernel.kallsyms])
    syscall_return_slowpath ([kernel.kallsyms])
    int_ret_from_sys_call ([kernel.kallsyms])
    __nanosleep (/usr/lib/libc-2.23.so)
    [unknown] (/usr/lib/libQt5Core.so.5.6.0)
    QThread::sleep (/usr/lib/libQt5Core.so.5.6.0)
    main (path/to/ex_sleep)
    __libc_start_main (/usr/lib/libc-2.23.so)
    _start (path/to/ex_sleep)

    Note that it is advised to increase the number of mmap pages to prevent
    event losses when using this new feature. Often, adding `-m 10M` to the
    `perf trace` invocation is enough.

    This feature is also available in strace when built with libunwind via
    `strace -k`. Performance wise, this solution is much better:

    $ time find path/to/linux &> /dev/null

    real 0m0.051s
    user 0m0.013s
    sys 0m0.037s

    $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null

    real 0m2.624s
    user 0m1.203s
    sys 0m1.333s

    $ time strace -k find path/to/linux &> /dev/null

    real 0m35.398s
    user 0m10.403s
    sys 0m23.173s

    Note that it is currently not possible to configure the print output.
    Such a feature, similar to what is available in `perf script` via
    its `--fields` knob, can be added later on.

    Signed-off-by: Milian Wolff
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com
    [ Split from a larger patch, do not print the IP, left align,
    remove dup call symbol__init(), added man page entry ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • For callchains, etc where we want it to align just below the syscall
    name, for instance, in 'perf trace'

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • As this function will be used in 'perf trace'.

    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org
    [ Split from a larger patch ]
    Signed-off-by: Milian Wolff

    Milian Wolff
     
  • This patch removes the need to set a bpf-output event in cmdline. By
    referencing a map named '__bpf_stdout__', perf automatically creates an
    event for it.

    For example:

    # perf record -e ./test_bpf_trace.c usleep 100000
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]
    # perf script
    usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ...
    BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a
    0008: 42 50 46 20 65 76 65 6e BPF even
    0010: 74 21 00 00 t!..
    BPF string: "Raise a BPF event!"

    usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ...
    BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a
    0008: 42 50 46 20 65 76 65 6e BPF even
    0010: 74 21 00 00 t!..
    BPF string: "Raise a BPF event!"

    perf record -e ./test_bpf_trace.c usleep 100000

    is equivalent to:

    perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \
    -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \
    usleep 100000

    Where test_bpf_trace.c is:

    /************************ BEGIN **************************/
    #include
    struct bpf_map_def {
            unsigned int type;
            unsigned int key_size;
            unsigned int value_size;
            unsigned int max_entries;
    };
    #define SEC(NAME) __attribute__((section(NAME), used))
    static u64 (*ktime_get_ns)(void) =
            (void *)BPF_FUNC_ktime_get_ns;
    static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
            (void *)BPF_FUNC_trace_printk;
    static int (*get_smp_processor_id)(void) =
            (void *)BPF_FUNC_get_smp_processor_id;
    static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
            (void *)BPF_FUNC_perf_event_output;

    struct bpf_map_def SEC("maps") __bpf_stdout__ = {
            .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
            .key_size = sizeof(int),
            .value_size = sizeof(u32),
            .max_entries = __NR_CPUS__,
    };

    static inline int __attribute__((always_inline))
    func(void *ctx, int type)
    {
            char output_str[] = "Raise a BPF event!";
            char err_str[] = "BAD %d\n";
            int err;

            err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(),
                                    &output_str, sizeof(output_str));
            if (err)
                    trace_printk(err_str, sizeof(err_str), err);
            return 1;
    }
    SEC("func_begin=sys_nanosleep")
    int func_begin(void *ctx) {return func(ctx, 1);}
    SEC("func_end=sys_nanosleep%return")
    int func_end(void *ctx) { return func(ctx, 2);}
    char _license[] SEC("license") = "GPL";
    int _version SEC("version") = LINUX_VERSION_CODE;
    /************************* END ***************************/

    Committer note:

    Testing with 'perf trace':

    # trace -e nanosleep --ev test_bpf_stdout.c usleep 1
    0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ...
    0.007 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
    0.069 ( ): __bpf_stdout__:Raise a BPF event!..)
    0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460
    Signed-off-by: Wang Nan
    Cc: Jiri Olsa
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     
  • This patch allows cloning bpf-output event configuration among multiple
    bpf scripts. If there exists a map named '__bpf_output__' that is not
    configured using 'map:__bpf_output__.event=', this patch clones the
    configuration of another '__bpf_stdout__' map. For example, the following
    command:

    # perf trace --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
    --ev ./test_bpf_trace2.c usleep 100000

    is equivalent to:

    # perf trace --ev bpf-output/no-inherit,name=evt/ \
    --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
    --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
    usleep 100000

    Signed-off-by: Wang Nan
    Suggested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Zefan Li
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

11 Apr, 2016

5 commits

  • Linus Torvalds
     
  • Pull ARM fixes from Russell King:
    "A couple of small fixes, and wiring up the new syscalls which appeared
    during the merge window"

    * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
    ARM: 8550/1: protect idiv patching against undefined gcc behavior
    ARM: wire up preadv2 and pwritev2 syscalls
    ARM: SMP enable of cache maintanence broadcast

    Linus Torvalds
     
  • Pull MMC fixes from Ulf Hansson:
    "Here are a couple of mmc fixes intended for v4.6 rc3:

    MMC host:
    - sdhci: Fix regression setting power on Trats2 board
    - sdhci-pci: Add support and PCI IDs for more Broxton host controllers"

    * tag 'mmc-v4.6-rc1' of git://git.linaro.org/people/ulf.hansson/mmc:
    mmc: sdhci-pci: Add support and PCI IDs for more Broxton host controllers
    mmc: sdhci: Fix regression setting power on Trats2 board

    Linus Torvalds
     
  • Pull i2c fixes from Wolfram Sang:
    "Some bugfixes from I2C:

    - fix a uevent triggered boot problem by removing a useless debug
    print

    - fix sysfs-attributes of the new i2c-demux-pinctrl driver to follow
    standard kernel behaviour

    - fix a potential division-by-zero error (needed two takes)"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: jz4780: really prevent potential division by zero
    Revert "i2c: jz4780: prevent potential division by zero"
    i2c: jz4780: prevent potential division by zero
    i2c: mux: demux-pinctrl: Update docs to new sysfs-attributes
    i2c: mux: demux-pinctrl: Clean up sysfs attributes
    i2c: prevent endless uevent loop with CONFIG_I2C_DEBUG_CORE

    Linus Torvalds
     
  • This reverts commit 1028b55bafb7611dda1d8fed2aeca16a436b7dff.

    It's broken: it makes ext4 return an error at an invalid point, causing
    the readdir wrappers to write the position of the last successful
    directory entry into the position field, which means that the next
    readdir will now return that last successful entry _again_.

    You can only return fatal errors (that terminate the readdir directory
    walk) from within the filesystem readdir functions, the "normal" errors
    (that happen when the readdir buffer fills up, for example) happen in
    the iterator where we know the position of the actual failing entry.

    I do have a very different patch that does the "signal_pending()"
    handling inside the iterator function where it is allowable, but while
    that one passes all the sanity checks, I screwed up something like four
    times while emailing it out, so I'm not going to commit it today.

    So my track record is not good enough, and the stars will have to align
    better before that one gets committed. And it would be good to get some
    review too, of course, since celestial alignments are always an iffy
    debugging model.

    IOW, let's just revert the commit that caused the problem for now.

    Reported-by: Greg Thelen
    Cc: Theodore Ts'o
    Signed-off-by: Linus Torvalds

    Linus Torvalds
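
The position bookkeeping described in this revert can be illustrated with a toy model. This is not kernel code: the names are hypothetical, "entries" are plain integers, and the only "normal" stop modeled is the caller's buffer filling up. The point is that the stop is detected where an entry is handed over, so the saved position names the entry that did not fit and nothing is returned twice.

```c
/* Hypothetical walk context shared by the emit step and the directory walk. */
struct sketch_dir_ctx {
	int pos;	/* index of the next entry to hand out */
	int room;	/* free slots left in the caller's buffer */
	int out[8];	/* caller's buffer; keep room <= 8 in this sketch */
	int nr_out;
};

/* Emit step: the place that knows exactly which entry failed to fit. */
static int sketch_emit(struct sketch_dir_ctx *ctx, int entry)
{
	if (ctx->room == 0)
		return -1;	/* buffer full: a "normal" stop, not a fatal error */

	ctx->out[ctx->nr_out++] = entry;
	ctx->room--;
	return 0;
}

/* Directory walk: the position advances only after a successful emit. */
static void sketch_readdir(struct sketch_dir_ctx *ctx, const int *entries, int nr)
{
	while (ctx->pos < nr) {
		if (sketch_emit(ctx, entries[ctx->pos]) != 0)
			break;
		ctx->pos++;
	}
}
```

Resuming with the saved position after the buffer has been drained continues at the first entry that was not delivered.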
     

10 Apr, 2016

7 commits

  • Pull parisc fixes from Helge Deller:
    "Since commit 0de798584bde ("parisc: Use generic extable search and
    sort routines") module loading is broken on parisc, because the parisc
    module loader wasn't prepared for the new R_PARISC_PCREL32 relocations.

    In addition, due to that breakage, Mikulas Patocka noticed that
    handling exceptions from modules probably never worked on parisc. It
    was just masked by the fact that exceptions from modules don't happen
    during normal use.

    This patch series fixes those issues and survives the tests of the
    lib/test_user_copy kernel module test. Some patches are tagged for
    stable"

    * 'parisc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Update comment regarding relative extable support
    parisc: Unbreak handling exceptions from kernel modules
    parisc: Fix kernel crash with reversed copy_from_user()
    parisc: Avoid function pointers for kernel exception routines
    parisc: Handle R_PARISC_PCREL32 relocations in kernel modules

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:
    "Three fixes, the first two are tagged for -stable:

    - The ndctl utility/library gained expanded unit tests illuminating a
    long standing bug in the libnvdimm SMART data retrieval
    implementation.

    It has been broken since its initial implementation, now fixed.

    - Another one line fix for the detection of stale info blocks.

    Without this change userspace can get into a situation where it is
    unable to reconfigure a namespace.

    - Fix the badblock initialization path in the presence of the new (in
    v4.6-rc1) section alignment workarounds.

    Without this change badblocks will be reported at the wrong offset.

    These have received a build success report from the kbuild robot and
    have appeared in -next with no reported issues"

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm, pfn: fix nvdimm_namespace_add_poison() vs section alignment
    libnvdimm, pfn: fix uuid validation
    libnvdimm: fix smart data retrieval

    Linus Torvalds
     
  • Pull GPIO fixes from Linus Walleij:
    "Here is a set of four GPIO fixes. The two fixes to the core are
    serious as they are regressing minor architectures.

    Core fixes:

    - Defer GPIO device setup until after gpiolib is initialized.

    It turns out that a few very tightly integrated GPIO platform
    drivers initialize so early (before core_initcall()) that the
    gpiolib isn't even initialized itself. That limits what the
    library can do, and we cannot reference uninitialized fields until
    later.

    Defer some of the initialization until right after the gpiolib is
    initialized in these (rare) cases.

    - As a consequence: do not use devm_* resources when allocating the
    states in the initial set-up of the gpiochip.

    Driver fixes:

    - In ACPI retrieval: ignore GpioInt when looking for output GPIOs.

    - Fix legacy builds on the PXA without a backing pin controller.

    - Use correct datatype on pca953x register writes"

    * tag 'gpio-v4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
    gpio: pca953x: Use correct u16 value for register word write
    gpiolib: Defer gpio device setup until after gpiolib initialization
    gpiolib: Do not use devm functions when registering gpio chip
    gpio: pxa: fix legacy non pinctrl aware builds
    gpio / ACPI: ignore GpioInt() GPIOs when requesting GPIO_OUT_*

    Linus Torvalds
     
  • Pull tty fixes from Greg KH:
    "Here are two tty fixes for issues found.

    One was due to a merge error in 4.6-rc1, and the other a regression
    fix for UML consoles that broke in 4.6-rc1.

    Both have been in linux-next for a while"

    * tag 'tty-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    tty: Fix merge of "tty: Refactor tty_open()"
    tty: Fix UML console breakage

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are some USB fixes and new device ids for 4.6-rc3.

    Nothing major, the normal USB gadget fixes and usb-serial driver ids,
    along with some other fixes mixed in. All except the USB serial ids
    have been tested in linux-next, the id additions should be fine as
    they are 'trivial'"

    * tag 'usb-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
    USB: option: add "D-Link DWM-221 B1" device id
    USB: serial: cp210x: Adding GE Healthcare Device ID
    USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
    usb: dwc3: keystone: drop dma_mask configuration
    usb: gadget: udc-core: remove manual dma configuration
    usb: dwc3: pci: add ID for one more Intel Broxton platform
    usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
    usb: dwc2: do not override forced dr_mode in gadget setup
    usb: gadget: f_midi: unlock on error
    USB: digi_acceleport: do sanity checking for the number of ports
    USB: cypress_m8: add endpoint sanity check
    USB: mct_u232: add sanity checking in probe
    usb: fix regression in SuperSpeed endpoint descriptor parsing
    USB: usbip: fix potential out-of-bounds write
    usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
    usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
    usb: gadget: f_midi: Fixed a bug when buflen was smaller than wMaxPacketSize
    usb: phy: qcom-8x16: fix regulator API abuse
    usb: ch9: Fix SSP Device Cap wFunctionalitySupport type
    usb: gadget: composite: Access SSP Dev Cap fields properly
    ...

    Linus Torvalds
     
  • Pull staging and IIO driver fixes from Greg KH:
    "Here are some IIO driver fixes, along with two staging driver fixes
    for 4.6-rc3.

    One staging driver patch reverts the deletion of a driver that
    happened in 4.6-rc1. We thought that laptop.org was dead, but it's
    still alive and kicking, and has users that were mad we broke their
    hardware by deleting a driver for their machines. So that driver is
    added back and everyone is happy again.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'staging-4.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    Revert "Staging: olpc_dcon: Remove obsolete driver"
    staging/rdma/hfi1: select CRC32
    iio: gyro: bmg160: fix buffer read values
    iio: gyro: bmg160: fix endianness when reading axes
    iio: accel: bmc150: fix endianness when reading axes
    iio: st_magn: always define ST_MAGN_TRIGGER_SET_STATE
    iio: fix config watermark initial value
    iio: health: max30100: correct FIFO check condition
    iio: imu: Fix inv_mpu6050 dependencies
    iio: adc: Fix build error of missing devm_ioremap_resource on UM
    iio: light: apds9960: correct FIFO check condition
    iio: adc: max1363: correct reference voltage
    iio: adc: max1363: add missing adc to max1363_id

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "This is a set of eight fixes.

    Two are trivial gcc-6 updates (brace additions and unused variable
    removal). There's a couple of cxlflash regressions, a correction for
    sd being overly chatty on revalidation (causing excess log increases).
    A VPD issue which could crash USB devices because they seem very
    intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer
    overrun fix"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    scsi: Do not attach VPD to devices that don't support it
    sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
    scsi_dh_alua: Fix a recently introduced deadlock
    scsi: Declare local symbols static
    cxlflash: Move to exponential back-off when cmd_room is not available
    cxlflash: Fix regression issue with re-ordering patch
    mpt3sas: Don't overreach ioc->reply_post[] during initialization
    aacraid: add missing curly braces

    Linus Torvalds