16 May, 2019

8 commits

  • The available registers for --intr-regs and --user-regs may be different,
    e.g. XMM registers.

    Split parse_regs into two dedicated functions for --intr-regs and
    --user-regs respectively.

    Modify the warning message: "--user-regs=?" should be used to list
    the registers available for --user-regs.

    Signed-off-by: Kan Liang
    Tested-by: Ravi Bangoria
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com
    [ Changed docs as suggested by Ravi and agreed by Kan ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     
  • Implemented -z,--compression_level[=] option that enables runtime
    compression of the content of the mmaped kernel data buffers during
    perf record collection. The default option value is 1 (fastest compression).

    Compression overhead has been measured for serial and AIO streaming when
    profiling a matrix multiplication workload:

    ------------------------------------------------------------------
    |    |           SERIAL            |            AIO-1            |
    |----------------------------------------------------------------|
    | -z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
    |----------------------------------------------------------------|
    |  0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
    |  1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
    |  2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
    |  3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
    |  5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
    |  8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
    ------------------------------------------------------------------

    OVH = (Execution time with -z N) / (Execution time with -z 0)

    ratio - compression ratio
    size - number of bytes that were compressed

    size ~= trace size x ratio

    Committer notes:

    While testing it I noticed that it failed to disable build id
    processing when compression is enabled. Since we'd have to uncompress
    everything to look for the PERF_RECORD_{MMAP,SAMPLE,etc} events to figure
    out which build ids to read from DSOs, we'd better disable build id
    processing when compression is enabled, logging with pr_debug() when
    doing so:

    Original patch:

    # perf record -z2
    ^C[ perf record: Woken up 1 times to write data ]
    0x1746e0 [0x76]: failed to process type: 81 [Invalid argument]
    [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ]
    #

    After auto-disabling build id processing when compression is enabled:

    $ perf record -z2 sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ]
    $ perf record -v -z2 sleep 1
    Compression enabled, disabling build id collection at the end of the session.

    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ]
    $

    Also, with parts of the patch originally after this one moved to just
    before this one we get:

    $ perf record -z2 sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ]
    $ perf report -D | grep COMPRESS
    0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled!
    0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled!
    COMPRESSED events: 2
    COMPRESSED events: 0
    $

    I.e. when faced with a PERF_RECORD_COMPRESSED event that we still have
    no code to process, we just show it as not being handled, skip it and
    continue, while before we had:

    $ perf report -D | grep COMPRESS
    0x1b8 [0x169]: failed to process type: 81 [Invalid argument]
    Error:
    failed to process sample
    0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED
    $

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Compression is implemented using the functions from zstd.c. The
    mmap->aio.data[] buffers serve as the memory that the compression
    operates on. If the Zstd streaming compression API fails for some reason,
    the data to be compressed is just copied into the memory buffers using
    plain memcpy().

    A compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
    records. Each element of the array is no longer than PERF_SAMPLE_MAX_SIZE
    and consists of a perf_event_header followed by the compressed chunk
    that is decompressed at the loading stage.
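The record layout described above can be illustrated with a minimal sketch. It is not the perf implementation: zlib stands in for Zstd because it ships with Python, MAX_RECORD_SIZE is an assumed stand-in for PERF_SAMPLE_MAX_SIZE, and frame()/unframe() are hypothetical helper names. The type number 81 is the one that appears in the logs of this series, and the header layout (u32 type, u16 misc, u16 size) is that of perf_event_header, assumed little-endian here.

```python
import struct
import zlib

PERF_RECORD_COMPRESSED = 81      # type number seen in the logs of this series
HEADER = struct.Struct("<IHH")   # perf_event_header: u32 type, u16 misc, u16 size
MAX_RECORD_SIZE = 0xFFFF         # assumption: stand-in for PERF_SAMPLE_MAX_SIZE


def frame(data, level=1):
    """Compress data and split the result into header-prefixed records."""
    payload = zlib.compress(data, level)  # zlib stands in for Zstd here
    max_chunk = MAX_RECORD_SIZE - HEADER.size
    records = []
    for off in range(0, len(payload), max_chunk):
        chunk = payload[off:off + max_chunk]
        header = HEADER.pack(PERF_RECORD_COMPRESSED, 0, HEADER.size + len(chunk))
        records.append(header + chunk)
    return records


def unframe(records):
    """Strip the headers, join the chunks and decompress them."""
    chunks = []
    for record in records:
        rec_type, _misc, size = HEADER.unpack_from(record)
        if rec_type != PERF_RECORD_COMPRESSED:
            raise ValueError("unexpected record type %d" % rec_type)
        chunks.append(record[HEADER.size:size])
    return zlib.decompress(b"".join(chunks))
```

The sketch leaves out the memcpy() fallback mentioned above and only shows how a compressed stream can be cut into size-limited records and reassembled at load time.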

    perf_mmap__aio_push() is replaced by perf_mmap__push(), which is now used
    in both the serial and AIO streaming cases. perf_mmap__push() is extended
    with positive return values to signify the absence of data ready for
    processing.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Compression is implemented using the functions from zstd.c. The
    mmap->data buffer serves as the memory that the compression operates on.

    If the Zstd streaming compression API fails for some reason, the data to
    be compressed is just copied into the memory buffers using plain memcpy().

    A compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
    records. Each element of the array is no longer than
    PERF_SAMPLE_MAX_SIZE and consists of a perf_event_header followed by the
    compressed chunk that is decompressed at the loading stage.

    Committer notes:

    Undo some unnecessary line breaks, remove some unnecessary () around
    zstd_data to then just get its address, and fix conflicts with
    BPF_PROG_INFO/BPF_BTF patchkits.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Implemented mmap data buffer that is used as the memory to operate
    on when compressing data in case of serial trace streaming.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Implemented PERF_RECORD_COMPRESSED event, related data types, header
    feature and functions to write, read and print feature attributes from
    the trace header section.

    comp_mmap_len preserves the size of the mmaped kernel buffer that was
    used during collection. At the loading stage comp_mmap_len is used as the
    size of the decomp buffer for decompressing the content of COMPRESSED
    events.

    Committer notes:

    Fixed up conflict with BPF_PROG_INFO and BPF_BTF header features.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate
    the compression ratio at the end of the data collection:

    compression ratio = bytes_transferred / bytes_compressed

    The 'bytes_transferred' metric accumulates the number of bytes that were
    extracted from the mmaped kernel buffers for compression, while
    'bytes_compressed' accumulates the number of bytes that were obtained
    after applying compression.
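A minimal sketch of this accounting, assuming a hypothetical CompressionStats helper and using zlib as a stand-in for Zstd (neither is taken from the patch). Only the two counter names and the ratio formula come from the text above.

```python
import zlib


class CompressionStats:
    """Accumulates the two counters used to report the compression ratio."""

    def __init__(self):
        self.bytes_transferred = 0  # bytes taken from the buffers for compression
        self.bytes_compressed = 0   # bytes obtained after compression

    def account(self, chunk, level=1):
        """Compress one chunk and update both counters."""
        out = zlib.compress(chunk, level)  # zlib stands in for Zstd here
        self.bytes_transferred += len(chunk)
        self.bytes_compressed += len(out)
        return out

    def ratio(self):
        """compression ratio = bytes_transferred / bytes_compressed"""
        if self.bytes_compressed == 0:
            return 0.0  # assumption: report 0 when nothing was compressed
        return self.bytes_transferred / self.bytes_compressed
```

Calling account() once per extracted chunk and ratio() once at the end mirrors the "at the end of the data collection" wording.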

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • $ perf record -h -I

    Usage: perf record [] []
    or: perf record [] -- []

    -I, --intr-regs[=]
    sample selected machine registers on interrupt, use -I ? to list register names

    $
    $ perf record -I ?
    Workload failed: No such file or directory
    $

    After:

    $ perf record -h -I

    Usage: perf record [] []
    or: perf record [] -- []

    -I, --intr-regs[=]
    sample selected machine registers on interrupt, use '-I?' to list register names

    $
    $ perf record -I?
    available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15

    Usage: perf record [] []
    or: perf record [] -- []

    -I, --intr-regs[=]
    sample selected machine registers on interrupt, use '-I?' to list register names
    $

    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Fixes: bcc84ec65ad1 ("perf record: Add ability to name registers to record")
    Link: https://lkml.kernel.org/n/tip-r0xhfhy5radmkhhcbcfs5izf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

02 Apr, 2019

1 commit

  • Implement a --mmap-flush option that specifies the minimal number of
    bytes that is extracted from the mmaped kernel buffer to store into a
    trace. The default option value is 1 byte, which means that every time
    the trace writing thread finds some new data in the mmaped buffer the
    data is extracted, possibly compressed and written to a trace.

    $ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc
    $ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc

    The option is independent of the -z setting, doesn't vary with the
    compression level and can serve two purposes.

    The first purpose is to increase the compression ratio of the trace data.
    Larger data chunks are compressed more effectively, so the implemented
    option allows specifying the data chunk size to compress. Also, in some
    cases executing more write syscalls with a smaller data size can take
    longer than executing fewer write syscalls with a bigger data size due to
    syscall overhead, so extracting the bigger data chunks specified by the
    option value could additionally decrease runtime overhead.

    The second purpose is to avoid a self monitoring live-lock issue in
    system wide (-a) profiling mode. Profiling in system wide mode with
    compression (-a -z) can additionally induce data into the kernel buffers
    along with the data from the monitored processes. If the performance data
    rate and volume from the monitored processes are high, then the trace
    streaming and compression activity in the tool is also high. High tool
    process activity can lead to a subtle live-lock effect, where compressing
    a single new byte from one of the mmaped kernel buffers generates the
    next single byte in some mmaped buffer, so the perf tool process ends up
    in endless self monitoring.

    The implemented synch parameter is the means to force data to be moved
    independently of the specified flush threshold value. Regardless of the
    provided flush value, the tool needs the capability to unconditionally
    drain the memory buffers, at least at the end of the collection.
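The interplay between the flush threshold and the synch parameter can be sketched as follows. MmapBuffer and its methods are hypothetical names for illustration, and returning the number of bytes written is a choice of this sketch rather than the perf_mmap__push() convention.

```python
class MmapBuffer:
    """Toy stand-in for one mmaped buffer drained by the trace writer."""

    def __init__(self, flush=1):
        self.flush = flush          # minimal number of bytes worth extracting
        self.pending = bytearray()  # data produced but not yet written

    def produce(self, data):
        """Simulate new data showing up in the buffer."""
        self.pending += data

    def push(self, sink, synch=False):
        """Write pending data if the threshold is met, or always when synch is set.

        Returns the number of bytes handed to sink (0 if nothing was written).
        """
        available = len(self.pending)
        if available == 0 or (not synch and available < self.flush):
            return 0
        sink(bytes(self.pending))
        self.pending.clear()
        return available
```

With flush=1 (the default described above) every push that finds data writes it; a larger threshold batches writes, and synch=True drains whatever is left at the end of the collection.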

    Committer testing:

    Running with the default value, i.e. as soon as there is something to
    read go on consuming, we first write the synthesized events, small
    chunks of about 128 bytes:

    # perf trace -m 2048 --call-graph dwarf -e write -- perf record

    101.142 ( 0.004 ms): perf/25821 write(fd: 3, buf: 0x210db60, count: 120) = 120
    __libc_write (/usr/lib64/libpthread-2.28.so)
    ion (/home/acme/bin/perf)
    record__write (inlined)
    process_synthesized_event (/home/acme/bin/perf)
    perf_tool__process_synth_event (inlined)
    perf_event__synthesize_mmap_events (/home/acme/bin/perf)

    Then we move to reading the mmap buffers consuming the events put there
    by the kernel perf infrastructure:

    107.561 ( 0.005 ms): perf/25821 write(fd: 3, buf: 0x7f1befc02000, count: 336) = 336
    __libc_write (/usr/lib64/libpthread-2.28.so)
    ion (/home/acme/bin/perf)
    record__write (inlined)
    record__pushfn (/home/acme/bin/perf)
    perf_mmap__push (/home/acme/bin/perf)
    record__mmap_read_evlist (inlined)
    record__mmap_read_all (inlined)
    __cmd_record (inlined)
    cmd_record (/home/acme/bin/perf)
    12919.953 ( 0.136 ms): perf/25821 write(fd: 3, buf: 0x7f1befc83150, count: 184984) = 184984

    12920.094 ( 0.155 ms): perf/25821 write(fd: 3, buf: 0x7f1befc02150, count: 261816) = 261816

    12920.253 ( 0.093 ms): perf/25821 write(fd: 3, buf: 0x7f1befb81120, count: 170832) = 170832

    If we ask it to write only when more than 16MB is available for
    reading, the threshold is capped at a quarter of the --mmap-pages size
    set for 'perf record', which by default comes to 528384 bytes, as found
    using 'record -v':

    mmap flush: 132096
    mmap size 528384B

    With that in place, all the writes coming from
    record__mmap_read_evlist(), i.e. from the mmap buffers set up by the
    kernel perf infrastructure, were at least 132096 bytes long.

    Trying with a bigger mmap size:

    perf trace -e write perf record -v -m 2048 --mmap-flush 16M
    74982.928 ( 2.471 ms): perf/26500 write(fd: 3, buf: 0x7ff94a6cc000, count: 3580888) = 3580888
    74985.406 ( 2.353 ms): perf/26500 write(fd: 3, buf: 0x7ff949ecb000, count: 3453256) = 3453256
    74987.764 ( 2.629 ms): perf/26500 write(fd: 3, buf: 0x7ff9496ca000, count: 3859232) = 3859232
    74990.399 ( 2.341 ms): perf/26500 write(fd: 3, buf: 0x7ff948ec9000, count: 3769032) = 3769032
    74992.744 ( 2.064 ms): perf/26500 write(fd: 3, buf: 0x7ff9486c8000, count: 3310520) = 3310520
    74994.814 ( 2.619 ms): perf/26500 write(fd: 3, buf: 0x7ff947ec7000, count: 4194688) = 4194688
    74997.439 ( 2.787 ms): perf/26500 write(fd: 3, buf: 0x7ff9476c6000, count: 4029760) = 4029760

    Was again limited to a quarter of the mmap size:

    mmap flush: 2098176
    mmap size 8392704B
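The capping observed in both runs can be reproduced with a short sketch. parse_size() and effective_flush() are hypothetical helpers, and the suffix table (B/K/M/G as binary multiples) is an assumption of this sketch; only the "quarter of the mmap size" limit and the numbers checked below come from the output above.

```python
# Assumption: size suffixes are binary multiples.
SUFFIXES = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Turn '1024', '1K' or '16M' into a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def effective_flush(requested, mmap_size):
    """Cap the requested flush threshold at a quarter of the mmap size."""
    return min(requested, mmap_size // 4)


# The two cases from the committer testing above:
print(effective_flush(parse_size("16M"), 528384))   # 132096
print(effective_flush(parse_size("16M"), 8392704))  # 2098176
```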

    A warning about that would be good to have but can be added later,
    something like:

    "max flush is a quarter of the mmap size, if wanting to bump the mmap
    flush further, bump the mmap size as well using -m/--mmap-pages"

    Also rename the 'sync' parameters to 'synch' to keep tools/perf building
    with older glibcs:

    cc1: warnings being treated as errors
    builtin-record.c: In function 'record__mmap_read_evlist':
    builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration
    /usr/include/unistd.h:933: warning: shadowed declaration is here
    builtin-record.c: In function 'record__mmap_read_all':
    builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration
    /usr/include/unistd.h:933: warning: shadowed declaration is here

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     

21 Mar, 2019

2 commits

  • To fully annotate BPF programs with source code mapping, four pieces of
    information are needed:

    1) PERF_RECORD_KSYMBOL
    2) PERF_RECORD_BPF_EVENT
    3) bpf_prog_info
    4) btf

    This patch handles 3) and 4) for BPF programs loaded after 'perf
    record|top'.

    For timely processing of this information, a dedicated event is added to
    the side band evlist.

    When PERF_RECORD_BPF_EVENT is received via the side band event, the
    polling thread gathers 3) and 4) via sys_bpf and stores them in perf_env.

    This information is saved to perf.data at the end of 'perf record'.

    Committer testing:

    The 'wakeup_watermark' member in 'struct perf_event_attr' is inside an
    unnamed union, so it can't be used in a struct designated initialization
    with older gccs. Take it out of the initializer, isolating it as
    'attr.wakeup_watermark = 1;' to work with all gcc versions.

    We also need to add '--no-bpf-event' to the 'perf record'
    perf_event_attr tests in 'perf test'. That test intercepts the events
    being set up and checks whether they match the fields described in the
    control files; since it now first finds the side band event used to
    catch the PERF_RECORD_BPF_EVENT, they all fail.

    With these issues fixed:

    Same scenario as for testing BPF programs loaded before 'perf record' or
    'perf top' starts, except that the BPF programs are started after
    'perf record|top', so that their information gets collected by the
    sideband thread. The rest works as for the programs loaded before
    monitoring starts.

    Add missing 'inline' to the bpf_event__add_sb_event() when
    HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
    binutils devel files installed.

    Signed-off-by: Song Liu
    Reviewed-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stanislav Fomichev
    Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     
  • This patch introduces a side band thread that captures extended
    information for events like PERF_RECORD_BPF_EVENT.

    This new thread uses its own evlist, which uses a ring buffer with a very
    low watermark for lower latency.

    To use the side band thread, we need to:

    1. add side band event(s) by calling perf_evlist__add_sb_event();
    2. call perf_evlist__start_sb_thread();
    3. at the end of the perf run, call perf_evlist__stop_sb_thread().

    In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT.

    Committer notes:

    Add fix by Jiri Olsa for when the sb_thread can't get started and then at
    the end the stop_sb_thread() segfaults when joining the (non-existing)
    thread.

    That can happen when running 'perf top' or 'perf record' as a normal
    user, for instance.

    Further checks need to be done on top of this to more gracefully handle
    these possible failure scenarios.
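The add/start/stop life cycle listed above, together with the "stop must not join a thread that never started" fix, can be sketched as follows. This is only an analogy: SidebandThread and its methods are hypothetical names, and a queue replaces the ring buffer.

```python
import queue
import threading


class SidebandThread:
    """Toy side band thread: its own event source, handlers, start and stop."""

    _STOP = object()  # sentinel that tells the thread to exit

    def __init__(self):
        self.handlers = []
        self.events = queue.Queue()  # stands in for the low-watermark ring buffer
        self.thread = None

    def add_sb_event(self, handler):
        """Step 1: register a handler for side band events."""
        self.handlers.append(handler)

    def post(self, event):
        """Simulate an event arriving on the side band."""
        self.events.put(event)

    def start(self):
        """Step 2: start the polling thread."""
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            event = self.events.get()
            if event is self._STOP:
                return
            for handler in self.handlers:
                handler(event)

    def stop(self):
        """Step 3: stop the thread; a no-op if it never got started."""
        if self.thread is None:
            return
        self.events.put(self._STOP)
        self.thread.join()
        self.thread = None
```

The early return in stop() is the part that corresponds to the committer note: joining is only attempted when a thread actually exists.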

    Signed-off-by: Song Liu
    Reviewed-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stanislav Fomichev
    Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     

20 Mar, 2019

3 commits

  • This patch changes the arguments of perf_event__synthesize_bpf_events()
    to include perf_session* instead of perf_tool*. perf_session will be
    used in the next patch.

    Signed-off-by: Song Liu
    Reviewed-by: Jiri Olsa
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stanislav Fomichev
    Cc: kernel-team@fb.com
    Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     
  • Currently, monitoring of BPF programs through bpf_event is off by
    default for 'perf record'.

    To turn it on, the user needs to use the option "--bpf-event". As BPF
    gets wider adoption in different subsystems, this option becomes
    inconvenient.

    This patch makes bpf_event on by default, and adds option "--no-bpf-event"
    to turn it off. Since option --bpf-event is not released yet, it is safe
    to remove it.

    Signed-off-by: Song Liu
    Reviewed-by: Jiri Olsa
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: kernel-team@fb.com
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stanislav Fomichev
    Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     
  • The help description for --switch-output looks like there are multiple
    comma separated fields. But it's actually a choice of different options.
    Make it clear and less confusing.

    Before:

    % perf record -h
    ...
    --switch-output[=]
    Switch output when receive SIGUSR2 or cross size,time threshold

    After:

    % perf record -h
    ...

    --switch-output[=]
    Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    LPU-Reference: 20190314225002.30108-4-andi@firstfloor.org
    Link: https://lkml.kernel.org/n/tip-9yecyuha04nyg8toyd1b2pgi@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

19 Mar, 2019

1 commit

  • When doing long term recording and waiting for some event to snapshot
    on, we often only care about the last minute or so.

    The --switch-output command line option supports rotating the perf.data
    file when the size exceeds a threshold. But the disk would still be
    filled with unnecessary old files.

    Add a new option to only keep a number of rotated files, so that the
    disk space usage can be limited.
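The idea of keeping only the most recent rotated files can be sketched as follows. RotatedFiles and demo() are hypothetical, and the "perf.data.N" names are made up for the demo; the sketch only shows the bounded queue of file names with the oldest file deleted on overflow.

```python
import os
import tempfile
from collections import deque


class RotatedFiles:
    """Remembers rotated output files and deletes the oldest beyond a limit."""

    def __init__(self, max_files):
        self.max_files = max_files
        self.paths = deque()

    def add(self, path):
        """Register a newly rotated file; return the paths that were removed."""
        self.paths.append(path)
        removed = []
        while len(self.paths) > self.max_files:
            oldest = self.paths.popleft()
            if os.path.exists(oldest):
                os.remove(oldest)
            removed.append(oldest)
        return removed


def demo(max_files=2, rotations=5):
    """Rotate a few empty files in a temporary directory and list what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        keeper = RotatedFiles(max_files)
        for i in range(rotations):
            path = os.path.join(tmp, "perf.data.%d" % i)  # hypothetical naming
            with open(path, "w"):
                pass
            keeper.add(path)
        return sorted(os.listdir(tmp))
```

With the defaults, demo() leaves only the two newest files in the directory.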

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org
    Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

11 Mar, 2019

2 commits

  • The data files layout is described by the HEADER_DIR_FORMAT feature.
    Currently it holds only the version number (1):

    uint64_t version;

    The current version (1) means that the data files:

    - Follow the 'data.*' name format.

    - Contain raw events data in standard perf format as read from kernel
    (and need to be sorted)

    Future versions are expected to describe different data files layout
    according to special needs.
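As a rough illustration of what such a feature carries, the sketch below packs and checks a single 64-bit version and selects files that follow the 'data.*' name format. The helper names are hypothetical, little-endian byte order is an assumption, and nothing here reflects how perf itself reads or writes the header.

```python
import fnmatch
import struct

DIR_FORMAT_VERSION = 1  # the only version described above


def pack_dir_format(version=DIR_FORMAT_VERSION):
    """Serialize the feature payload: a single uint64_t version."""
    return struct.pack("<Q", version)  # assumption: little-endian


def unpack_dir_format(blob):
    """Read the version back and reject layouts this sketch does not know."""
    (version,) = struct.unpack("<Q", blob)
    if version != DIR_FORMAT_VERSION:
        raise ValueError("unsupported directory format version %d" % version)
    return version


def data_files(names):
    """Pick the entries that follow the 'data.*' name format."""
    return sorted(n for n in names if fnmatch.fnmatchcase(n, "data.*"))
```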

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We can't store the auxtrace index when we store into multiple files,
    because we keep only offset for it, not the file.

    The auxtrace data will be processed correctly in the 'pipe' mode.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190308134745.5057-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

23 Feb, 2019

2 commits

  • Add a 'path' member to 'struct perf_data'. It will keep the configured
    path for the data (const char *). The path in struct perf_data_file is
    now dynamically allocated (duped) from it.

    This scheme is used in the following patches, where struct
    perf_data::path holds the configured directory path and struct
    perf_data_file::path holds the allocated path for specific files.

    It also makes the code a little simpler.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
    [ Fixup data-convert-bt.c missing conversion ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We are about to add support for multiple files, so we need each file to
    keep its size.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

11 Feb, 2019

1 commit

  • Implement --affinity=node|cpu option for the record mode defaulting
    to system affinity mask bouncing.

    Signed-off-by: Alexey Budankov
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/083f5422-ece9-10dd-8305-bf59c860f10f@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     

06 Feb, 2019

3 commits

  • Build node cpu masks for mmap data buffers. Apply the node cpu masks to
    the tool thread every time it references data buffers across nodes or
    cpus.
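A minimal sketch of the idea, with CPU masks modelled as frozensets. build_node_masks() and ThreadAffinity are hypothetical names, and the cpu-to-node mapping is supplied by the caller rather than read from the system.

```python
def build_node_masks(cpu_to_node):
    """Group CPUs by NUMA node: {node: frozenset of cpus}."""
    masks = {}
    for cpu, node in cpu_to_node.items():
        masks.setdefault(node, set()).add(cpu)
    return {node: frozenset(cpus) for node, cpus in masks.items()}


class ThreadAffinity:
    """Re-applies the affinity mask only when the thread crosses to another mask."""

    def __init__(self, apply_mask):
        self.apply_mask = apply_mask  # callback that actually sets the affinity
        self.current = None

    def visit(self, buffer_mask):
        """Call before touching a buffer whose data lives under buffer_mask."""
        if buffer_mask != self.current:
            self.apply_mask(buffer_mask)
            self.current = buffer_mask
```

On Linux, a callback such as `lambda mask: os.sched_setaffinity(0, mask)` could be passed as apply_mask; the comparison before applying keeps the number of affinity changes down when consecutive buffers share a mask.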

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/b25e4ebc-078d-2c7b-216c-f0bed108d073@linux.intel.com
    [ Use cpu-set-sched.h to get the CPU_{EQUAL,OR}() fallbacks for older systems ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • Allocate affinity option and masks for mmap data buffers and record
    thread as well as initialize allocated objects.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/526fa2b0-07de-6dbd-a7e9-26ba875593c9@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • CoreSight was the only client of the PMU's set_drv_config() API. Now
    that it is no longer needed by CoreSight remove it from the code base.

    Signed-off-by: Mathieu Poirier
    Acked-by: Suzuki K Poulose
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Greg Kroah-Hartman
    Cc: H. Peter Anvin
    Cc: Heiko Carstens
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Martin Schwidefsky
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190131184714.20388-8-mathieu.poirier@linaro.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Mathieu Poirier
     

22 Jan, 2019

2 commits

  • This patch synthesizes PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for
    BPF programs loaded before perf-record. This is achieved by gathering
    information about all BPF programs via sys_bpf.

    Committer notes:

    Fix the build on some older systems such as amazonlinux:1 where it was
    breaking with:

    util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
    util/bpf-event.c:52:9: error: missing initializer for field 'type' of 'struct bpf_prog_info' [-Werror=missing-field-initializers]
    struct bpf_prog_info info = {};
    ^
    In file included from /git/linux/tools/lib/bpf/bpf.h:26:0,
    from util/bpf-event.c:3:
    /git/linux/tools/include/uapi/linux/bpf.h:2699:8: note: 'type' declared here
    __u32 type;
    ^
    cc1: all warnings being treated as errors

    Further fix on a centos:6 system:

    cc1: warnings being treated as errors
    util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
    util/bpf-event.c:50: error: 'func_info_rec_size' may be used uninitialized in this function

    The compiler is wrong, but to silence it, initialize that variable to
    zero.

    One more fix, this time for debian:experimental-x-mips, x-mips64 and
    x-mipsel:

    util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
    util/bpf-event.c:93:16: error: implicit declaration of function 'calloc' [-Werror=implicit-function-declaration]
    func_infos = calloc(sub_prog_cnt, func_info_rec_size);
    ^~~~~~
    util/bpf-event.c:93:16: error: incompatible implicit declaration of built-in function 'calloc' [-Werror]
    util/bpf-event.c:93:16: note: include '' or provide a declaration of 'calloc'

    Add the missing header.

    Committer testing:

    # perf record --bpf-event sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ]
    # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
    1 0 0x4b10 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
    2 0 0x4c60 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
    3 0 0x4db0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
    4 0 0x4f00 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
    5 0 0x5050 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
    6 0 0x51a0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
    7 0 0x52f0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
    8 0 0x5440 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
    # bpftool prog
    13: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    14: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    15: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    16: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    17: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    18: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    21: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    22: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    #

    # perf report -D | grep -B22 PERF_RECORD_KSYMBOL
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff ......8..D......
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
    . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x49d8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc00644ff len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 48 6d 06 c0 ff ff ff ff ......8.Hm......
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
    . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x4b28 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0066d48 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 04 cf 03 c0 ff ff ff ff ......8.........
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
    . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x4c78 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc003cf04 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 96 28 04 c0 ff ff ff ff ......8..(......
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
    . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x4dc8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0042896 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 05 13 17 c0 ff ff ff ff ......8.........
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
    . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x4f18 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0171305 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 0a 8c 23 c0 ff ff ff ff ......8...#.....
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
    . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x5068 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0238c0a len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 2a a5 a4 c0 ff ff ff ff ......8.*.......
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b
    . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%.........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x51b8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4a52a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
    --
    . ... raw event: size 312 bytes
    . 0000: 11 00 00 00 00 00 38 01 9b c9 a4 c0 ff ff ff ff ......8.........
    . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 ........bpf_prog
    . 0020: 5f 32 61 31 34 32 65 66 36 37 61 61 61 64 31 37 _2a142ef67aaad17
    . 0030: 34 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4...............

    . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 ........!.......
    . 0120: 2a 14 2e f6 7a aa d1 74 00 00 00 00 00 00 00 00 *...z..t........
    . 0130: 00 00 00 00 00 00 00 00 ........

    0 0x5308 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0a4c99b len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
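
    The raw bytes in the first dump above can be decoded by hand. The sketch
    below is only an illustration: it assumes a little-endian layout of an
    8-byte perf_event_header (type, misc, size) followed by addr, len,
    ksym_type, flags and a NUL-terminated name. The helper name
    decode_ksymbol() is made up for this example and is not part of perf.

```python
import struct

# First 64 bytes of the first "raw event" dump shown above.
RAW_HEX = """
11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff
e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67
5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62
61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""


def decode_ksymbol(raw):
    """Decode the fixed part of a ksymbol record (hypothetical helper)."""
    # perf_event_header: u32 type, u16 misc, u16 size
    ev_type, misc, size = struct.unpack_from("<IHH", raw, 0)
    # Record body: u64 addr, u32 len, u16 ksym_type, u16 flags
    addr, length, ksym_type, flags = struct.unpack_from("<QIHH", raw, 8)
    # The name starts right after the fixed fields and is NUL-terminated.
    name = raw[24:].split(b"\0", 1)[0].decode("ascii")
    return {
        "type": ev_type,
        "misc": misc,
        "size": size,
        "addr": addr,
        "len": length,
        "ksym_type": ksym_type,
        "flags": flags,
        "name": name,
    }


event = decode_ksymbol(bytes.fromhex("".join(RAW_HEX.split())))
print(hex(event["addr"]), event["len"], event["name"])
```

    The decoded addr, len and name agree with the one-line summary that
    'perf report -D' prints for the same event.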

    Signed-off-by: Song Liu
    Reviewed-by: Arnaldo Carvalho de Melo
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Peter Zijlstra
    Cc: kernel-team@fb.com
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190117161521.1341602-8-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     
  • This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of
    PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
    turn it on.

    Committer notes:

    Add dummy machine__process_bpf_event() variant that returns zero for
    systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
    the build in such systems.

    Remove the needless include from bpf-event.h and provide just
    forward declarations for the structs and unions in the parameters, to
    reduce compilation time and needless rebuilds when machine.h gets
    changed.

    Committer testing:

    When running with:

    # perf record --bpf-event

    On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
    are not present, we fall back to removing those two bits from
    perf_event_attr, allowing the tool to keep working on older kernels:

    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ksymbol 1
    bpf_event 1
    ------------------------------------------------------------
    sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
    sys_perf_event_open failed, error -22
    switching off bpf_event
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ksymbol 1
    ------------------------------------------------------------
    sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8
    sys_perf_event_open failed, error -22
    switching off ksymbol
    ------------------------------------------------------------
    perf_event_attr:
    size 112
    { sample_period, sample_freq } 4000
    sample_type IP|TID|TIME|PERIOD
    read_format ID
    disabled 1
    inherit 1
    mmap 1
    comm 1
    freq 1
    enable_on_exec 1
    task 1
    precise_ip 3
    sample_id_all 1
    exclude_guest 1
    mmap2 1
    comm_exec 1
    ------------------------------------------------------------

    And then proceeds to work without those two features.
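
    The retry sequence in the log above can be pictured with the sketch
    below. It is a simulation under stated assumptions, not the perf
    implementation: try_open() is a hypothetical callable standing in for
    sys_perf_event_open(), the attr is a plain dict, and the drop order
    simply mirrors the order seen in the log (bpf_event first, then
    ksymbol).

```python
import errno

# Feature bits, in the order the log above switches them off.
FALLBACK_ORDER = ["bpf_event", "ksymbol"]


def open_with_fallback(attr, try_open):
    """Retry try_open(attr), clearing one optional bit per EINVAL failure."""
    attr = dict(attr)  # work on a copy, leave the caller's attr untouched
    dropped = []
    while True:
        try:
            return try_open(attr), dropped
        except OSError as err:
            if err.errno != errno.EINVAL:
                raise
            for bit in FALLBACK_ORDER:
                if attr.get(bit):
                    attr[bit] = 0
                    dropped.append(bit)
                    break
            else:
                raise  # nothing left to switch off, report the failure


def old_kernel_open(attr):
    """Hypothetical stand-in for a kernel that knows neither feature."""
    if attr.get("bpf_event") or attr.get("ksymbol"):
        raise OSError(errno.EINVAL, "Invalid argument")
    return 3  # pretend file descriptor


fd, dropped = open_with_fallback(
    {"mmap": 1, "comm": 1, "ksymbol": 1, "bpf_event": 1}, old_kernel_open
)
print(fd, dropped)
```

    Any error other than EINVAL (the -22 in the log) is passed through
    unchanged, so unrelated failures are not hidden by the fallback.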

    As passing --bpf-event is an explicit action performed by the user, perhaps we
    should emit a warning telling them that the kernel lacks this feature, but this
    can be done on top of this patch.

    Now, with a kernel that supports these events, start 'perf record --bpf-event -a'
    and then run 'perf trace sleep 10000', which uses the prebuilt BPF
    augmented_raw_syscalls.o (built for another kernel version, even) and thus
    should generate PERF_RECORD_BPF_EVENT events:

    [root@quaco ~]# perf record -e dummy -a --bpf-event
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.713 MB perf.data ]

    [root@quaco ~]# bpftool prog
    13: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    14: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 13,14
    15: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    16: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:43-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 15,16
    17: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    18: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:44-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 17,18
    21: cgroup_skb tag 7be49e3934a125ba gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    22: cgroup_skb tag 2a142ef67aaad174 gpl
    loaded_at 2019-01-19T09:09:45-0300 uid 0
    xlated 296B jited 229B memlock 4096B map_ids 21,22
    31: tracepoint name sys_enter tag 12504ba9402f952f gpl
    loaded_at 2019-01-19T09:19:56-0300 uid 0
    xlated 512B jited 374B memlock 4096B map_ids 30,29,28
    32: tracepoint name sys_exit tag c1bd85c092d6e4aa gpl
    loaded_at 2019-01-19T09:19:56-0300 uid 0
    xlated 256B jited 191B memlock 4096B map_ids 30,29
    # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
    1 0 55834574849 0x4fc8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
    2 0 60129542145 0x5118 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
    3 0 64424509441 0x5268 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
    4 0 68719476737 0x53b8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
    5 0 73014444033 0x5508 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
    6 0 77309411329 0x5658 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
    7 0 90194313217 0x57a8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
    8 0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
    9 7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29
    10 7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29
    11 7 620922579199 0xb6490 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 30
    12 7 620922580240 0xb6510 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 30
    13 7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31
    14 7 620922874543 0xb6620 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 32
    #

    There they are: 31 and 32 are the tracepoint BPF programs put in place by
    'perf trace'.
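
    To cross-check such a dump against 'bpftool prog', the load and unload
    events can be replayed. The sketch below is illustrative only: it
    assumes the line format shown above and that type 1 means program load
    and type 2 means program unload. The function name live_prog_ids() is
    made up for this example.

```python
import re

# Assumed meaning of the "type" field: 1 = program load, 2 = program unload.
PROG_LOAD, PROG_UNLOAD = 1, 2

BPF_EVENT_RE = re.compile(
    r"PERF_RECORD_BPF_EVENT bpf event with type (\d+), flags (\d+), id (\d+)"
)


def live_prog_ids(dump_lines):
    """Return the ids that were loaded and not unloaded again, sorted."""
    live = set()
    for line in dump_lines:
        match = BPF_EVENT_RE.search(line)
        if not match:
            continue  # not a BPF event line
        ev_type, prog_id = int(match.group(1)), int(match.group(3))
        if ev_type == PROG_LOAD:
            live.add(prog_id)
        elif ev_type == PROG_UNLOAD:
            live.discard(prog_id)
    return sorted(live)


# A few lines copied from the output above.
SAMPLE = [
    "0 94489280513 0x58f8 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22",
    "7 620922484360 0xb6390 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 29",
    "7 620922486018 0xb6410 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 2, flags 0, id 29",
    "7 620922765207 0xb6598 [0x30]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 31",
]
print(live_prog_ids(SAMPLE))
```

    In the full output above, ids 29 and 30 are loaded and then unloaded, so
    they drop out, and the remaining ids match the 'bpftool prog' listing.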

    Signed-off-by: Song Liu
    Reviewed-by: Arnaldo Carvalho de Melo
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Peter Zijlstra
    Cc: kernel-team@fb.com
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20190117161521.1341602-7-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Song Liu
     

18 Dec, 2018

3 commits

  • The default timeout of 500ms for parsing /proc/<pid>/maps files is too
    short for profiling many of our services.

    This can be overridden by passing --proc-map-timeout to the relevant
    command but it'd be nice to globally increase our default value.

    This patch permits setting a different default with the
    core.proc-map-timeout config file parameter.
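
    The resulting precedence can be sketched as follows. This is a model of
    the behaviour described above, not the perf source: the function name
    and the dict used to represent the config file are assumptions made for
    illustration.

```python
DEFAULT_PROC_MAP_TIMEOUT_MS = 500  # built-in default quoted above


def resolve_proc_map_timeout(cli_value=None, config=None):
    """Pick the timeout: command line first, then config file, then default."""
    if cli_value is not None:
        return cli_value  # --proc-map-timeout given explicitly
    if config and "core.proc-map-timeout" in config:
        return int(config["core.proc-map-timeout"])  # new global default
    return DEFAULT_PROC_MAP_TIMEOUT_MS


print(resolve_proc_map_timeout(config={"core.proc-map-timeout": "2000"}))
```

    The value 2000 is only an example. The point is that a per-command
    option still wins over the configured default.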

    Signed-off-by: Mark Drayton
    Acked-by: Song Liu
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20181204203420.1683114-1-mbd@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Mark Drayton
     
  • Multi AIO trace writing allows caching more kernel data in userspace
    memory, postponing trace writing for the sake of increasing overall
    profiling data throughput. It can be seen as an extension of the kernel
    data buffer into userspace memory.

    With an --aio option value different from 0 (the default value is 1) the
    tool is able to cache more and more data in user space while
    delegating the spill to AIO.

    That avoids suspending at record__aio_sync() between calls of
    record__mmap_read_evlist() and increases profiling data throughput at
    the cost of userspace memory.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     
  • The trace file offset is read once before the loop that iterates over
    the mmaps and written back after all performance data is enqueued for
    AIO writing.

    The trace file offset is incremented linearly after every successful AIO
    write operation.

    record__aio_sync() blocks until the started AIO operation completes
    and then proceeds.

    record__aio_mmap_read_sync() implements a barrier for all incomplete
    AIO write requests.
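
    The offset bookkeeping can be illustrated with the sketch below. It
    deliberately swaps POSIX AIO for a thread pool issuing os.pwrite()
    calls, so it shows the idea (explicit per-write offsets plus a final
    barrier) rather than the perf code. The function name
    write_chunks_async() is made up, and the offset is advanced when a write
    is queued, which is a simplification.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait


def write_chunks_async(path, chunks, workers=2):
    """Queue one positional write per chunk, then wait for all of them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        offset = 0  # file offset is tracked once, outside the loop
        pending = []
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for chunk in chunks:
                # Every write carries its own offset, so the order in which
                # the writes complete does not matter.
                pending.append(pool.submit(os.pwrite, fd, chunk, offset))
                offset += len(chunk)  # advance linearly per chunk
            wait(pending)  # barrier for all incomplete writes
            for future in pending:
                future.result()  # re-raise any write error
        return offset  # final offset, to be "written back" by the caller
    finally:
        os.close(fd)


tmp_fd, tmp_path = tempfile.mkstemp()
os.close(tmp_fd)
final_offset = write_chunks_async(tmp_path, [b"abc", b"defg", b"h"])
with open(tmp_path, "rb") as f:
    written = f.read()
os.remove(tmp_path)
print(final_offset, written)
```

    The sketch ignores short writes, which a real implementation would have
    to handle.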

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     

06 Nov, 2018

1 commit

  • Implement a weak group fallback for 'perf record', similar to the
    existing 'perf stat' support. This allows using groups that might
    contain more events than there are available counters without failing.

    Before:

    $ perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}' -a sleep 1
    Error:
    The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
    /bin/dmesg | grep -i perf may provide additional information.

    After:

    $ ./perf record -e '{cycles,cache-misses,cache-references,cpu_clk_unhalted.thread,cycles,cycles,cycles}:W' -a sleep 1
    WARNING: No sample_id_all support, falling back to unordered processing
    [ perf record: Woken up 3 times to write data ]
    [ perf record: Captured and wrote 8.136 MB perf.data (134069 samples) ]
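
    The effect of the ':W' modifier can be modelled roughly as below. This
    is a simplified simulation: the real fallback reacts to
    sys_perf_event_open() failing, whereas the sketch just compares the
    group size against a hypothetical counter budget. The function name
    open_group() and the counter count are assumptions.

```python
import errno


def open_group(events, available_counters, weak=False):
    """Return the list of groups that end up being opened."""
    if len(events) <= available_counters:
        return [list(events)]  # the whole group fits, keep it together
    if not weak:
        # Without the weak modifier an oversized group is an error.
        raise OSError(errno.EINVAL, "Invalid argument")
    # Weak group: break it up and open every event on its own.
    return [[event] for event in events]


events = ["cycles", "cache-misses", "cache-references", "cycles", "cycles"]
print(open_group(events, available_counters=4, weak=True))
```

    Breaking up the group keeps the recording session alive, at the cost of
    the events no longer being scheduled together.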

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/20181001195927.14211-2-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

18 Oct, 2018

1 commit

  • Store the -k clockid frequency in the perf trace to enable converting
    timestamp-derived metrics into wall clock time at the reporting stage.

    Below is an example of perf report output:

    tools/perf/perf record -k raw -- ../../matrix/linux/matrix.gcc
    ...
    [ perf record: Captured and wrote 31.222 MB perf.data (818054 samples) ]

    tools/perf/perf report --header
    # ========
    ...
    # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, use_clockid = 1, clockid = 4
    ...
    # clockid frequency: 1000 MHz
    ...
    # ========
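
    With the frequency stored in the header, the conversion is a single
    division. The sketch below assumes that timestamps count ticks of the
    selected clock at the reported frequency. The function name
    ticks_to_seconds() is made up for this example.

```python
def ticks_to_seconds(ticks, clockid_freq_mhz):
    """Convert a tick count to seconds using the clockid frequency in MHz."""
    return ticks / (clockid_freq_mhz * 1_000_000)


# At the 1000 MHz reported above, one tick is one nanosecond.
print(ticks_to_seconds(2_500_000_000, 1000))
```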

    Signed-off-by: Alexey Budankov
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/23a4a1dc-b160-85a0-347d-40a2ed6d007b@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     

19 Sep, 2018

2 commits

  • The struct perf_mmap map argument will hold the file pointer to write
    the data to.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180913125450.21342-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • The perf_mmap struct will hold a file pointer to write the mmap's
    contents, so we need to propagate it down the stack to record__write
    callers instead of its member the auxtrace_mmap struct.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180913125450.21342-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

31 Aug, 2018

1 commit


17 Mar, 2018

2 commits

  • We need to synthesize events first, because some features work on top
    of them (on the report side).

    Signed-off-by: Jiri Olsa
    Tested-by: Stephane Eranian
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180314092205.23291-1-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We have moved perf_default_config to the very beginning of main(), so
    there is no need to call perf_default_config() once more for most of
    the config in perf-record, only for record.call-graph.

    Signed-off-by: Yisheng Xie
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1520853957-36106-2-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Yisheng Xie
     

08 Mar, 2018

3 commits

  • It's no longer used.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180307155020.32613-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • It's used too far down to be declared at the top of __cmd_record().

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180307155020.32613-4-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Use the 'start' and 'end' which are stored in struct perf_mmap to
    replace the temporary 'start' and 'end'.
    The temporary variables will be discarded later.

    There is no need to pass 'overwrite' to perf_mmap__push(). It's stored
    in struct perf_mmap.

    Signed-off-by: Kan Liang
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/1520350567-80082-3-git-send-email-kan.liang@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kan Liang
     

07 Mar, 2018

1 commit


05 Mar, 2018

1 commit

  • Currently we can crash perf record when running in pipe mode, like:

    $ perf record ls | perf report
    # To display the perf.data header info, please use --header/--header-only options.
    #
    perf: Segmentation fault
    Error:
    The - file has no samples!

    The callstack of the crash is:

    0x0000000000515242 in perf_event__synthesize_event_update_name
    3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]);
    (gdb) bt
    #0 0x0000000000515242 in perf_event__synthesize_event_update_name
    #1 0x00000000005158a4 in perf_event__synthesize_extra_attr
    #2 0x0000000000443347 in record__synthesize
    #3 0x00000000004438e3 in __cmd_record
    #4 0x000000000044514e in cmd_record
    #5 0x00000000004cbc95 in run_builtin
    #6 0x00000000004cbf02 in handle_internal_command
    #7 0x00000000004cc054 in run_argv
    #8 0x00000000004cc422 in main

    The reason for the crash is that the evsel does not have its ids array
    allocated and the pipe's synthesize code tries to access it.

    We don't force evsel ids allocation when we have a single event, because
    it's not needed. However, we need it in pipe mode even for a single
    event, as a key for the evsel update event.

    Fix this by forcing evsel ids allocation even for a single event when
    we are in pipe mode.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa