17 Nov, 2020

1 commit

  • "perf inject" can create corrupt files when synthesizing sample events from AUX
    data. This happens when in the input file, the first event (for the AUX data)
    has a different sample_type from the second event (generally dummy).

    Specifically, they differ in the bits that indicate the standard fields
    appended to perf records in the mmap buffer. "perf inject" deletes the first
    event and moves up the second event to first position.

    The problem is with the synthetic PERF_RECORD_MMAP (etc.) events created
    by "perf record".

    Since these are synthetic versions of events which are normally produced
    by the kernel, they have to have the standard fields appended as
    described by sample_type.

    "perf record" fills these in with zeroes, including the IDENTIFIER
    field; perf readers interpret records with zero IDENTIFIER using the
    descriptor for the first event in the file.

    Since "perf inject" changes the first event, these synthetic records are
    then processed with the wrong value of sample_type, and the perf reader
    reads bad data, reports on incorrect length records etc.

    Mismatching sample_types are seen with "perf record -e cs_etm//", where the AUX
    event has TID|TIME|CPU|IDENTIFIER and the dummy event has TID|TIME|IDENTIFIER.

    Perhaps they could be the same, but it isn't normally a problem if they aren't
    - perf has no problems reading the file.

    The sample_types have to agree on the position of IDENTIFIER, because
    that's how perf finds the right event descriptor in the first place, but
    they don't normally have to agree on other fields, and perf doesn't
    check that they do.

    The problem is specific to the way "perf inject" reorganizes the events
    and the way synthetic MMAP events are recorded with a zero identifier. A
    simple solution is to stop "perf inject" deleting the tracing event.
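
    To make the size mismatch concrete, here is a minimal sketch (not perf's
    own code) that computes how many bytes of standard fields follow a
    non-sample record for a given sample_type. It assumes sample_id_all is
    set and that each selected field occupies one 64-bit slot. The helper
    name id_trailer_size() is hypothetical; the bit values are the ones from
    the perf_event_open() UAPI.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values as in the perf_event_open() UAPI (enum perf_event_sample_format). */
#define PERF_SAMPLE_TID        (1ULL << 1)
#define PERF_SAMPLE_TIME       (1ULL << 2)
#define PERF_SAMPLE_ID         (1ULL << 6)
#define PERF_SAMPLE_CPU        (1ULL << 7)
#define PERF_SAMPLE_STREAM_ID  (1ULL << 9)
#define PERF_SAMPLE_IDENTIFIER (1ULL << 16)

/*
 * Hypothetical helper: size in bytes of the standard fields appended to a
 * non-sample record. Only these six bits contribute, and each one adds a
 * single 64-bit slot (TID and CPU are pairs of 32-bit values).
 */
size_t id_trailer_size(uint64_t sample_type)
{
	static const uint64_t fields[] = {
		PERF_SAMPLE_TID, PERF_SAMPLE_TIME, PERF_SAMPLE_ID,
		PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU, PERF_SAMPLE_IDENTIFIER,
	};
	size_t size = 0;

	for (size_t i = 0; i < sizeof(fields) / sizeof(fields[0]); i++)
		if (sample_type & fields[i])
			size += sizeof(uint64_t);
	return size;
}
```

    For TID|TIME|CPU|IDENTIFIER this gives 32 bytes and for
    TID|TIME|IDENTIFIER it gives 24 bytes, so a reader using the wrong
    descriptor looks for the fields at the wrong offsets.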

    Committer testing

    Removed the now unused 'evsel' variable, update the comment about the
    evsel removal not being performed anymore, and apply the patch manually
    as it failed with this warning:

    warning: Patch sent with format=flowed; space at the end of lines might be lost.

    Testing it with:

    $ perf bench internals inject-build-id
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.543 msec (+- 0.130 msec)
    Average time per event: 0.838 usec (+- 0.013 usec)
    Average memory usage: 12717 KB (+- 9 KB)
    Average build-id-all injection took: 5.710 msec (+- 0.058 msec)
    Average time per event: 0.560 usec (+- 0.006 usec)
    Average memory usage: 12079 KB (+- 7 KB)
    $

    Signed-off-by: Al Grant
    Acked-by: Adrian Hunter
    Acked-by: Namhyung Kim
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Peter Zijlstra
    LPU-Reference: b9cf5611-daae-2390-3439-6617f8f0a34b@foss.arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Al Grant
     

14 Oct, 2020

2 commits

  • Pass a build_id object to filename__read_build_id function, so it can
    populate the size of the build_id object.

    Change the filename__read_build_id() code for both the ELF and non-ELF
    cases.

    Signed-off-by: Jiri Olsa
    Acked-by: Ian Rogers
    Link: https://lore.kernel.org/r/20201013192441.1299447-3-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Replace build_id byte array with struct build_id object and all the code
    that references it.

    The objective is to carry size together with build id array, so it's
    better to keep both together.

    This is a preparatory change for the following patches, and there's no
    functional change.
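
    A minimal sketch of the idea of carrying the size together with the
    bytes. The field layout, the BUILD_ID_SIZE value and the helper
    build_id_set() are assumptions made for illustration, not a copy of the
    perf definitions.

```c
#include <stddef.h>
#include <string.h>

#define BUILD_ID_SIZE 20 /* assumed capacity, enough for a SHA-1 build-id */

struct build_id {
	unsigned char data[BUILD_ID_SIZE];
	size_t size; /* number of valid bytes in data[] */
};

/* Hypothetical helper: store a build-id and remember its real length. */
int build_id_set(struct build_id *bid, const unsigned char *data, size_t size)
{
	if (size > BUILD_ID_SIZE)
		return -1;
	memset(bid->data, 0, sizeof(bid->data));
	memcpy(bid->data, data, size);
	bid->size = size;
	return 0;
}
```

    With the size stored next to the array, a 16-byte build-id no longer has
    to be treated as if it were 20 bytes long.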

    Signed-off-by: Jiri Olsa
    Acked-by: Ian Rogers
    Link: https://lore.kernel.org/r/20201013192441.1299447-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

13 Oct, 2020

5 commits

  • Like 'perf record', we can speed up build-id processing even more by
    just using all DSOs. Then we don't need to look at all the sample events
    anymore. The following patch will update 'perf bench' to show the result
    of the --buildid-all option too.

    Signed-off-by: Namhyung Kim
    Original-patch-by: Stephane Eranian
    Acked-by: Ian Rogers
    Acked-by: Jiri Olsa
    Link: https://lore.kernel.org/r/20201012070214.2074921-6-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • No need to load symbols in a DSO when injecting build-id. I guess the
    reason was to check the DSO is a special file like anon files. Use some
    helper functions in map.c to check them before reading build-id. Also
    pass sample event's cpumode to a new build-id event.

    It brought a speedup in the benchmark of 25 -> 21 msec on my laptop.
    Also the memory usage (Max RSS) went down by ~200 KB.
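
    As a rough illustration of checking the mapping name before reading a
    build-id, the sketch below filters out names that cannot refer to a
    regular file. The function name and the exact list of prefixes are
    assumptions; the real helpers in map.c may check more or different
    cases.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical check: return true only when the mapping name looks like a
 * regular file that could carry a build-id.
 */
bool map_name_may_have_build_id(const char *name)
{
	if (name == NULL || name[0] == '\0')
		return false;
	if (starts_with(name, "//anon"))        /* anonymous memory */
		return false;
	if (starts_with(name, "/dev/zero") ||
	    starts_with(name, "/anon_hugepage"))
		return false;
	if (starts_with(name, "[heap]") || starts_with(name, "[stack"))
		return false;
	return true;
}
```

    A check like this only needs the name from the mmap event, so no symbol
    table has to be loaded to make the decision.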

    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 21.389 msec (+- 0.138 msec)
    Average time per event: 2.097 usec (+- 0.014 usec)
    Average memory usage: 8225 KB (+- 0 KB)

    Committer notes:

    Before:

    $ perf stat -r5 perf bench internals inject-build-id > /dev/null

    Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

    4,020.56 msec task-clock:u # 1.271 CPUs utilized ( +- 0.74% )
    0 context-switches:u # 0.000 K/sec
    0 cpu-migrations:u # 0.000 K/sec
    123,354 page-faults:u # 0.031 M/sec ( +- 0.81% )
    7,119,951,568 cycles:u # 1.771 GHz ( +- 1.74% ) (83.27%)
    230,086,969 stalled-cycles-frontend:u # 3.23% frontend cycles idle ( +- 1.97% ) (83.41%)
    1,168,298,765 stalled-cycles-backend:u # 16.41% backend cycles idle ( +- 1.13% ) (83.44%)
    11,173,083,669 instructions:u # 1.57 insn per cycle
    # 0.10 stalled cycles per insn ( +- 1.58% ) (83.31%)
    2,413,908,936 branches:u # 600.392 M/sec ( +- 1.69% ) (83.26%)
    46,576,289 branch-misses:u # 1.93% of all branches ( +- 2.20% ) (83.31%)

    3.1638 +- 0.0309 seconds time elapsed ( +- 0.98% )

    $

    After:

    $ perf stat -r5 perf bench internals inject-build-id > /dev/null

    Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

    2,379.94 msec task-clock:u # 1.473 CPUs utilized ( +- 0.18% )
    0 context-switches:u # 0.000 K/sec
    0 cpu-migrations:u # 0.000 K/sec
    62,584 page-faults:u # 0.026 M/sec ( +- 0.07% )
    2,372,389,668 cycles:u # 0.997 GHz ( +- 0.29% ) (83.14%)
    106,937,862 stalled-cycles-frontend:u # 4.51% frontend cycles idle ( +- 4.89% ) (83.20%)
    581,697,915 stalled-cycles-backend:u # 24.52% backend cycles idle ( +- 0.71% ) (83.47%)
    3,659,692,199 instructions:u # 1.54 insn per cycle
    # 0.16 stalled cycles per insn ( +- 0.10% ) (83.63%)
    791,372,961 branches:u # 332.518 M/sec ( +- 0.27% ) (83.39%)
    10,648,083 branch-misses:u # 1.35% of all branches ( +- 0.22% ) (83.16%)

    1.61570 +- 0.00172 seconds time elapsed ( +- 0.11% )

    $

    Signed-off-by: Namhyung Kim
    Original-patch-by: Stephane Eranian
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Link: https://lore.kernel.org/r/20201012070214.2074921-5-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • It should be in a proper mnt namespace when accessing the file.

    I think this had no problem since the build-id was actually read from
    map__load() -> dso__load() already. But I'd like to change it in the
    following commit.

    Acked-by: Jiri Olsa
    Signed-off-by: Namhyung Kim
    Link: https://lore.kernel.org/r/20201012070214.2074921-4-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • I found some events (like PERF_RECORD_CGROUP) are not copied by perf
    inject due to the missing callbacks. Let's add them.

    While at it, I've changed the order of the callbacks to match with
    struct perf_tool so that we can compare them easily.
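
    The effect of a missing callback can be sketched as follows. The types
    and names here (tool_ops, repipe, deliver) are simplified stand-ins, not
    the real struct perf_tool, and the sketch assumes that an unset callback
    means the event is silently skipped.

```c
#include <stddef.h>

struct event; /* opaque in this sketch */

typedef int (*event_cb)(const struct event *ev);

/* Simplified stand-in for a table of per-event-type callbacks. */
struct tool_ops {
	event_cb mmap;
	event_cb comm;
	event_cb cgroup;
};

/* Pretend to copy the event to the output; returns events written. */
int repipe(const struct event *ev)
{
	(void)ev;
	return 1;
}

/* An unset callback means the event never reaches the output. */
int deliver(event_cb cb, const struct event *ev)
{
	return cb ? cb(ev) : 0;
}

/* Initializers listed in the same order as the struct members. */
const struct tool_ops inject_ops = {
	.mmap   = repipe,
	.comm   = repipe,
	.cgroup = repipe,
};
```

    Keeping the initializers in declaration order makes it easy to spot a
    member that has no entry.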

    Acked-by: Jiri Olsa
    Signed-off-by: Namhyung Kim
    Link: https://lore.kernel.org/r/20201012070214.2074921-3-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Sometimes I can see that 'perf record' piped with 'perf inject' takes a
    long time processing build-ids.

    So introduce an inject-build-id benchmark to the internals benchmark
    suite to measure its overhead regularly.

    It runs the 'perf inject' command internally and feeds the given number
    of synthesized events (MMAP2 + SAMPLE basically).

    Usage: perf bench internals inject-build-id

    -i, --iterations Number of iterations used to compute average (default: 100)
    -m, --nr-mmaps Number of mmap events for each iteration (default: 100)
    -n, --nr-samples Number of sample events per mmap event (default: 100)
    -v, --verbose be more verbose (show iteration count, DSO name, etc)

    By default, it measures average processing time of 100 MMAP2 events
    and 10000 SAMPLE events. Below is a result on my laptop.

    $ perf bench internals inject-build-id
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 25.789 msec (+- 0.202 msec)
    Average time per event: 2.528 usec (+- 0.020 usec)
    Average memory usage: 8411 KB (+- 7 KB)

    Committer testing:

    $ perf bench
    Usage:
    perf bench [] []

    # List of all available benchmark collections:

    sched: Scheduler and IPC benchmarks
    syscall: System call benchmarks
    mem: Memory access benchmarks
    numa: NUMA scheduling and MM benchmarks
    futex: Futex stressing benchmarks
    epoll: Epoll stressing benchmarks
    internals: Perf-internals benchmarks
    all: All benchmarks

    $ perf bench internals

    # List of available benchmarks for collection 'internals':

    synthesize: Benchmark perf event synthesis
    kallsyms-parse: Benchmark kallsyms parsing
    inject-build-id: Benchmark build-id injection

    $ perf bench internals inject-build-id
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.202 msec (+- 0.059 msec)
    Average time per event: 1.392 usec (+- 0.006 usec)
    Average memory usage: 12650 KB (+- 10 KB)
    Average build-id-all injection took: 12.831 msec (+- 0.071 msec)
    Average time per event: 1.258 usec (+- 0.007 usec)
    Average memory usage: 11895 KB (+- 10 KB)
    $

    $ perf stat -r5 perf bench internals inject-build-id
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.380 msec (+- 0.056 msec)
    Average time per event: 1.410 usec (+- 0.006 usec)
    Average memory usage: 12608 KB (+- 11 KB)
    Average build-id-all injection took: 11.889 msec (+- 0.064 msec)
    Average time per event: 1.166 usec (+- 0.006 usec)
    Average memory usage: 11838 KB (+- 10 KB)
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.246 msec (+- 0.065 msec)
    Average time per event: 1.397 usec (+- 0.006 usec)
    Average memory usage: 12744 KB (+- 10 KB)
    Average build-id-all injection took: 12.019 msec (+- 0.066 msec)
    Average time per event: 1.178 usec (+- 0.006 usec)
    Average memory usage: 11963 KB (+- 10 KB)
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.321 msec (+- 0.067 msec)
    Average time per event: 1.404 usec (+- 0.007 usec)
    Average memory usage: 12690 KB (+- 10 KB)
    Average build-id-all injection took: 11.909 msec (+- 0.041 msec)
    Average time per event: 1.168 usec (+- 0.004 usec)
    Average memory usage: 11938 KB (+- 10 KB)
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.287 msec (+- 0.059 msec)
    Average time per event: 1.401 usec (+- 0.006 usec)
    Average memory usage: 12864 KB (+- 10 KB)
    Average build-id-all injection took: 11.862 msec (+- 0.058 msec)
    Average time per event: 1.163 usec (+- 0.006 usec)
    Average memory usage: 12103 KB (+- 10 KB)
    # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 14.402 msec (+- 0.053 msec)
    Average time per event: 1.412 usec (+- 0.005 usec)
    Average memory usage: 12876 KB (+- 10 KB)
    Average build-id-all injection took: 11.826 msec (+- 0.061 msec)
    Average time per event: 1.159 usec (+- 0.006 usec)
    Average memory usage: 12111 KB (+- 10 KB)

    Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

    4,267.48 msec task-clock:u # 1.502 CPUs utilized ( +- 0.14% )
    0 context-switches:u # 0.000 K/sec
    0 cpu-migrations:u # 0.000 K/sec
    102,092 page-faults:u # 0.024 M/sec ( +- 0.08% )
    3,894,589,578 cycles:u # 0.913 GHz ( +- 0.19% ) (83.49%)
    140,078,421 stalled-cycles-frontend:u # 3.60% frontend cycles idle ( +- 0.77% ) (83.34%)
    948,581,189 stalled-cycles-backend:u # 24.36% backend cycles idle ( +- 0.46% ) (83.25%)
    5,835,587,719 instructions:u # 1.50 insn per cycle
    # 0.16 stalled cycles per insn ( +- 0.21% ) (83.24%)
    1,267,423,636 branches:u # 296.996 M/sec ( +- 0.22% ) (83.12%)
    17,484,290 branch-misses:u # 1.38% of all branches ( +- 0.12% ) (83.55%)

    2.84176 +- 0.00222 seconds time elapsed ( +- 0.08% )

    $

    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Signed-off-by: Namhyung Kim
    Link: https://lore.kernel.org/r/20201012070214.2074921-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

09 Jul, 2020

1 commit

  • **perf-.map and jit-.dump designs:

    When a JIT generates code to be executed, it must allocate memory and
    mark it executable using an mmap call.

    *** perf-.map design

    The perf-.map assumes that any sample recorded in an anonymous
    memory page is JIT code. It then tries to resolve the symbol name by
    looking at the process' perf-.map.

    *** jit-.dump design

    The jit-.dump mechanism takes a different approach. It requires a
    JIT to write a `/jit-.dump` file. This file must also be
    mmapped so that perf inject --jit can find the file. The JIT must also
    add JIT_CODE_LOAD records for any functions it generates. The records
    are timestamped using a clock which can be correlated to the perf record
    clock.

    After perf record, the `perf inject --jit` pass parses the recording
    looking for a `/jit-.dump` file. When it finds the file, it
    parses it and for each JIT_CODE_LOAD record:
    * creates an ELF file `/jitted--.so`
    * injects a new mmap record mapping the new ELF file into the process.

    *** Coexistence design

    The kernel and perf support both of these mechanisms. We need to make
    sure perf works on an app supporting either or both of these mechanisms.
    Both designs rely on mmap records to determine how to resolve an ip
    address.

    The mmap records of both techniques by definition overlap. When the JIT
    compiles a method, it must:

    * allocate memory (mmap)
    * add execution privilege (mprotect or mmap; either will
    generate an mmap event from the kernel to perf)
    * compile code into memory
    * add a function record to perf-.map and/or jit-.dump

    Because the jit-.dump mechanism supports greater capabilities, perf
    prefers the symbols from jit-.dump. It implements this based on
    timestamp ordering of events. There is an implicit ASSUMPTION that the
    JIT_CODE_LOAD record timestamp will be after the // anon mmap event that
    was generated during memory allocation or adding the execution privilege setting.

    *** Problems with the ASSUMPTION

    The ASSUMPTION made in the Coexistence design section above is violated
    in the following scenario.

    *** Scenario

    While a JIT is jitting code it will eventually need to commit more
    pages and change these pages to executable permissions. Typically the
    JIT will want these collocated to minimize branch displacements.

    The kernel will coalesce these anonymous mappings with identical
    permissions before sending an MMAP event for the new pages. The address
    range of the new mmap will not be just the most recently mmapped pages.
    It will include the entire coalesced mmap region.

    See mm/mmap.c

    unsigned long mmap_region(struct file *file, unsigned long addr,
                              unsigned long len, vm_flags_t vm_flags,
                              unsigned long pgoff, struct list_head *uf)
    {
            ...
            /*
             * Can we just expand an old mapping?
             */
            ...
            perf_event_mmap(vma);
            ...
    }

    *** Symptoms

    The coalesced // anon mmap event will be timestamped after the
    JIT_CODE_LOAD records. This means it will be used as the most recent
    mapping for that entire address range. For remaining events it will look
    at the inferior perf-.map for symbols.

    If both mechanisms are supported, the symbol will appear twice with
    different module names. This causes weird behavior in reporting.

    If only jit-.dump is supported, the symbol will no longer be resolved.
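
    The symptom can be reproduced with a small model of timestamp-ordered
    mmap events. This is only a sketch: struct mmap_event and resolve() are
    hypothetical names, and "jitted.so" is a placeholder for the generated
    ELF file.

```c
#include <stddef.h>
#include <stdint.h>

struct mmap_event {
	uint64_t time;        /* event timestamp */
	uint64_t start, end;  /* mapped range [start, end) */
	const char *name;
};

/*
 * Return the most recent event at or before 'when' whose range covers
 * 'addr', or NULL if there is none.
 */
const struct mmap_event *resolve(const struct mmap_event *evs, size_t n,
				 uint64_t addr, uint64_t when)
{
	const struct mmap_event *best = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct mmap_event *e = &evs[i];

		if (e->time > when || addr < e->start || addr >= e->end)
			continue;
		if (best == NULL || e->time >= best->time)
			best = e;
	}
	return best;
}
```

    Given an initial //anon mapping at time 1, a "jitted.so" mapping at
    time 2 and a coalesced //anon mapping at time 3 that covers the same
    range, a sample taken at time 4 inside the jitted function resolves to
    the coalesced //anon mapping instead of "jitted.so".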

    ** Implemented solution

    This patch solves the issue by removing // anon mmap events for any
    process which has a valid jit-.dump file.

    It tracks on a per process basis to handle the case where some running
    apps support jit-.dump, but some only support perf-.map.

    It adds new assumptions:
    * // anon mmap events are only required for perf-.map support.
    * An app that uses jit-.dump, no longer needs
    perf-.map support. It assumes that any perf-.map info is
    inferior.

    *** Details

    Use thread->priv to store whether a jitdump file has been processed

    During "perf inject --jit", discard "//anon*" mmap events for any pid which
    has successfully processed a jitdump file.
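
    A minimal sketch of the per-process filtering described above. The
    patch keeps the flag in thread->priv; this sketch instead uses a small
    fixed-size pid table, and all names in it (jit_state, jit_state_mark,
    should_drop_mmap) are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_PIDS 64

/* Set of pids for which a jitdump file has been processed. */
struct jit_state {
	int pids[MAX_PIDS];
	int nr;
};

bool jit_state_has(const struct jit_state *st, int pid)
{
	for (int i = 0; i < st->nr; i++)
		if (st->pids[i] == pid)
			return true;
	return false;
}

/* Record that 'pid' has a valid jitdump; false if the table is full. */
bool jit_state_mark(struct jit_state *st, int pid)
{
	if (jit_state_has(st, pid))
		return true;
	if (st->nr == MAX_PIDS)
		return false;
	st->pids[st->nr++] = pid;
	return true;
}

/* Drop "//anon*" mmap events only for pids that have a jitdump. */
bool should_drop_mmap(const struct jit_state *st, int pid, const char *filename)
{
	return strncmp(filename, "//anon", 6) == 0 && jit_state_has(st, pid);
}
```

    Processes without a jitdump keep their //anon events, so perf-.map
    lookups continue to work for them.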

    ** Testing:

    // jitdump case

    perf record
    perf inject --jit --input perf.data --output perfjit.data

    // verify mmap "//anon" events present initially

    perf script --input perf.data --show-mmap-events | grep '//anon'

    // verify mmap "//anon" events removed

    perf script --input perfjit.data --show-mmap-events | grep '//anon'

    // no jitdump case

    perf record
    perf inject --jit --input perf.data --output perfjit.data

    // verify mmap "//anon" events present initially

    perf script --input perf.data --show-mmap-events | grep '//anon'

    // verify mmap "//anon" events not removed

    perf script --input perfjit.data --show-mmap-events | grep '//anon'

    ** Repro:

    This issue was discovered while testing the initial CoreCLR jitdump
    implementation. https://github.com/dotnet/coreclr/pull/26897.

    ** Alternate solutions considered

    These were also briefly considered:

    * Change kernel to not coalesce mmap regions.

    * Change kernel reporting of coalesced mmap regions to perf. Only
    include newly mapped memory.

    * Only strip parts of // anon mmap events overlapping existing
    jitted--.so mmap events.

    Signed-off-by: Steve MacLean
    Acked-by: Ian Rogers
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lore.kernel.org/lkml/1590544271-125795-1-git-send-email-steve.maclean@linux.microsoft.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Steve MacLean
     

28 May, 2020

1 commit

  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array
    member[1][2], introduced in C99:

    struct foo {
            int stuff;
            struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning in
    case the flexible array does not occur last in the structure, which will
    help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that dynamic memory allocations won't be affected by this
    change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    sizeof(flexible-array-member) triggers a warning because flexible array
    members have incomplete type[1]. There are some instances of code in
    which the sizeof operator is being incorrectly/erroneously applied to
    zero-length arrays and the result is zero. Such instances may be hiding
    some bugs. So, this work (flexible-array member conversions) will also
    help to get completely rid of those sorts of issues.
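
    For illustration, a typical allocation of a structure that ends in a
    flexible array member looks like the sketch below. The helper
    foo_alloc() and the contents of struct boo are hypothetical, and
    overflow checking of the size computation is left out for brevity.

```c
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo array[]; /* flexible array member: must be last */
};

/* Hypothetical helper: allocate a struct foo with room for n elements. */
struct foo *foo_alloc(size_t n)
{
	/* sizeof(*f) does not include the array; add the elements by hand. */
	struct foo *f = calloc(1, sizeof(*f) + n * sizeof(f->array[0]));

	if (f != NULL)
		f->stuff = (int)n;
	return f;
}
```

    The size expression is the same whether the member is declared as
    array[0] or array[], which is why existing allocations keep working
    after the conversion.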

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Gustavo A. R. Silva
    Cc: Ian Rogers
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
    Signed-off-by: Arnaldo Carvalho de Melo

    Gustavo A. R. Silva
     

06 May, 2020

5 commits


04 Dec, 2019

1 commit

  • The ID index event is used when decoding, but can result in the
    following error:

    $ perf record --aux-sample -e '{intel_pt//,branch-misses}:u' ls
    $ perf inject -i perf.data -o perf.data.inj --itrace=be
    $ perf script -i perf.data.inj
    0x1020 [0x410]: failed to process type: 69 [No such file or directory]

    Fix by having 'perf inject' drop the ID index event.

    Fixes: c0a6de06c446 ("perf record: Add support for AUX area sampling")
    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191204120800.8138-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

22 Nov, 2019

1 commit

  • After decoding AUX area samples, the AUX area data is no longer needed
    (having been replaced by synthesized events) so cut it out.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191115124225.5247-9-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

07 Nov, 2019

1 commit

  • create_gcov (refer to the autofdo example in tools/perf/Documentation/intel-pt.txt)
    now needs the evsels to read the perf.data file. So don't strip them.

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191105100057.21465-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

21 Sep, 2019

1 commit

  • Make perf_session__new() return an error code on failure instead of
    NULL.

    Test Results:

    Before Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    0
    $

    After Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    254
    $

    Committer notes:

    Fix the 'perf tests topology' case, where we use TEST_ASSERT_VAL(...,
    session), i.e. we need to pass zero in case of failure. That was the
    case before, when perf_session__new() returned NULL on failure, but now
    we need to negate the result of IS_ERR(session) to respect the
    TEST_ASSERT_VAL() expectation that zero means failure.
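
    The error-pointer convention involved here can be sketched in plain C.
    The lower-case helpers below are a simplified reimplementation written
    for this example, and session_new() is a hypothetical stand-in for
    perf_session__new().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Pointers in the top MAX_ERRNO values of the address space are errors. */
bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct session {
	const char *path;
};

/* Hypothetical constructor: never returns NULL, only a session or an error. */
struct session *session_new(const char *path)
{
	struct session *s;

	if (path == NULL)
		return err_ptr(-EINVAL);
	s = malloc(sizeof(*s));
	if (s == NULL)
		return err_ptr(-ENOMEM);
	s->path = path;
	return s;
}
```

    A caller that used to test the pointer for NULL now has to test
    !is_err(session): an error pointer is non-NULL, so passing it directly
    to an assertion that treats zero as failure would report success.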

    Reported-by: Nageswara R Sastry
    Signed-off-by: Mamatha Inamdar
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Nageswara R Sastry
    Acked-by: Ravi Bangoria
    Reviewed-by: Jiri Olsa
    Reviewed-by: Mukesh Ojha
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Greg Kroah-Hartman
    Cc: Jeremie Galarneau
    Cc: Kate Stewart
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Shawn Landden
    Cc: Song Liu
    Cc: Thomas Gleixner
    Cc: Tzvetomir Stoyanov
    Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
    Signed-off-by: Arnaldo Carvalho de Melo

    Mamatha Inamdar
     

20 Sep, 2019

1 commit

  • Those are the only routines using the perf_event__handler_t typedef and
    are all related, so move them to a separate header to reduce the header
    dependency tree. Lots of places were getting event.h, and even stdio.h
    and limits.h, indirectly, so fix those as well.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

01 Sep, 2019

1 commit


30 Aug, 2019

1 commit


30 Jul, 2019

5 commits

  • Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.

    Committer notes:

    Fixed up these:

    tools/perf/arch/arm/util/auxtrace.c
    tools/perf/arch/arm/util/cs-etm.c
    tools/perf/arch/arm64/util/arm-spe.c
    tools/perf/arch/s390/util/auxtrace.c
    tools/perf/util/cs-etm.c

    Also

    cc1: warnings being treated as errors
    tests/sample-parsing.c: In function 'do_test':
    tests/sample-parsing.c:162: error: missing initializer
    tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')

    struct evsel evsel = {
    .needs_swap = false,
    - .core.attr = {
    - .sample_type = sample_type,
    - .read_format = read_format,
    + .core = {
    + . attr = {
    + .sample_type = sample_type,
    + .read_format = read_format,
    + },

    [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
    gcc (GCC) 4.4.7

    Also we don't need to include perf_event.h in
    tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
    perf_event_attr' is enough. And this even fixes the build in some
    systems where things are used somewhere down the include path from
    perf_event.h without defining __always_inline.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Rename perf_evlist__remove() to evlist__remove(), so we don't have a
    name clash when we add perf_evlist__remove() in libperf.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-14-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Rename perf_evsel__delete() to evsel__delete(), so we don't have a name
    clash when we add perf_evsel__delete() in libperf.

    Also renaming perf_evsel__delete_priv() to evsel__delete_priv().

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-11-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Rename struct perf_evlist to struct evlist, so we don't have a name
    clash when we add struct perf_evlist in libperf.

    Committer notes:

    Added fixes to build on arm64, from Jiri and from me
    (tools/perf/util/cs-etm.c)

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Rename struct perf_evsel to struct evsel, so we don't have a name clash
    when we add struct perf_evsel in libperf.

    Committer notes:

    Added fixes for arm64, provided by Jiri.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

09 Jul, 2019

1 commit

  • Check first, as machines__deliver_event() may have
    perf_evlist__id2evsel() returning NULL.

    This was found while checking a report from Leo Yan that used the smatch
    tool to find places where a pointer is checked before use and then,
    later in the same function gets used without checking.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Leo Yan
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-muvb8xqyh0gysgfjfq35w642@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

16 May, 2019

1 commit

  • Initialize the decompression part of the Zstd-based API so that
    COMPRESSED records are decompressed into the resulting output data file.

    Signed-off-by: Alexey Budankov
    Reviewed-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/c27d7500-ecdd-3569-cab5-8f70bbed5ea4@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexey Budankov
     

23 Feb, 2019

1 commit

  • Add a 'path' member to 'struct perf_data'. It will keep the configured
    path for the data (const char *). The path in struct perf_data_file is
    now dynamically allocated (duped) from it.

    This scheme is used in the following patches, where struct
    perf_data::path holds the configured directory path and struct
    perf_data_file::path holds the allocated path for specific files.

    It also makes the code a little simpler.
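
    The ownership split can be sketched as below. The struct and function
    names are simplified, hypothetical versions of struct perf_data and
    struct perf_data_file, kept only detailed enough to show which path is
    borrowed and which one is an owned copy.

```c
#include <stdlib.h>
#include <string.h>

struct data_file {
	char *path; /* heap-allocated copy, owned by the file */
};

struct data {
	const char *path; /* configured path, not owned */
	struct data_file file;
};

/* Duplicate the configured path into the file; returns 0 on success. */
int data_open(struct data *d)
{
	size_t len = strlen(d->path) + 1;
	char *copy = malloc(len);

	if (copy == NULL)
		return -1;
	memcpy(copy, d->path, len);
	d->file.path = copy;
	return 0;
}

/* Release the copy made by data_open(). */
void data_close(struct data *d)
{
	free(d->file.path);
	d->file.path = NULL;
}
```

    Because each file owns its own copy, a directory-style layout can give
    every file a different path derived from the single configured one.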

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
    [ Fixup data-convert-bt.c missing conversion ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

06 Feb, 2019

2 commits

  • Several places were using definitions found in symbols.h but not
    including it, getting it by sheer luck from some other headers that now
    are in the process of removing that include because they don't need it
    or because simply having struct forward declarations is enough, fix it.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Lots of places get the map.h file indirectly, and since we're going to
    remove it from machine.h, then those need to include it directly, do it
    now, before we remove that dep.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

20 Sep, 2018

1 commit

  • I often forget all the options that --itrace accepts. Instead of burying
    them in the man page only, report them in the normal command-line help
    too, to make them more easily accessible.

    v2: Align

    Signed-off-by: Andi Kleen
    Cc: Jiri Olsa
    Cc: Kim Phillips
    Link: http://lkml.kernel.org/r/20180914031038.4160-2-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

19 Sep, 2018

2 commits

  • Now that we keep a perf_tool pointer inside perf_session, there's no need
    to have a perf_tool argument in the event_op3 callback. Remove it.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180913125450.21342-3-jolsa@kernel.org
    [ Fix the builtin-inject.c build for !HAVE_AUXTRACE_SUPPORT ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Now that we keep a perf_tool pointer inside perf_session, there's no
    need to have a perf_tool argument in the event_op2 callback. Remove it.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
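
The two callback changes above (event_op3 and event_op2) follow the same pattern, which can be sketched in C as below. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical and the structs are simplified stand-ins, not perf's real `perf_tool`, `perf_session` or event definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's types (not the real ones). */
struct demo_session;
union demo_event { int type; };

/* The callback takes only the session and the event; no separate tool argument. */
typedef int (*demo_event_op2)(struct demo_session *session,
                              union demo_event *event);

struct demo_tool {
    demo_event_op2 build_id;   /* hypothetical callback slot */
    int events_seen;           /* counter used only by this sketch */
};

struct demo_session {
    struct demo_tool *tool;    /* the session keeps a pointer to its tool */
};

/* A handler that needs the tool reaches it through the session. */
static int demo_count_event(struct demo_session *session,
                            union demo_event *event)
{
    struct demo_tool *tool = session->tool;

    (void)event;               /* event contents are not used in this sketch */
    tool->events_seen++;
    return 0;
}

/* The dispatcher passes the session only, since the tool is reachable from it. */
static int demo_deliver(struct demo_session *session, union demo_event *event)
{
    return session->tool->build_id(session, event);
}
```

Because the session already carries the tool pointer, passing the tool as a separate argument would be redundant, which is presumably why the extra parameter could be dropped.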
     

27 Apr, 2018

2 commits

  • It was returning the searched map just on the addr_location passed, with
    the function itself returning void.

    Make it return the map so that we can make the code more compact.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: https://lkml.kernel.org/n/tip-tzlrrzdeoof4i6ktyqv1t6ks@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
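
The change described above, returning the map as well as storing it in the addr_location, can be sketched as follows. This is a minimal illustration, not the perf implementation: `demo_map`, `demo_addr_location` and `demo_find_map()` are hypothetical names, and the linear search stands in for whatever lookup perf actually performs.

```c
#include <stddef.h>

/* Hypothetical simplified types; perf's real structs carry far more state. */
struct demo_map {
    unsigned long start;   /* inclusive */
    unsigned long end;     /* exclusive */
};

struct demo_addr_location {
    struct demo_map *map;
    unsigned long addr;
};

/*
 * Fill in 'al' as before, but also return the map that was found
 * (or NULL), so that callers can test the result directly.
 */
static struct demo_map *demo_find_map(struct demo_map *maps, size_t nr_maps,
                                      unsigned long addr,
                                      struct demo_addr_location *al)
{
    al->addr = addr;
    al->map = NULL;

    for (size_t i = 0; i < nr_maps; i++) {
        if (addr >= maps[i].start && addr < maps[i].end) {
            al->map = &maps[i];
            break;
        }
    }
    return al->map;
}
```

With a return value, a caller can write `if (demo_find_map(maps, n, addr, &al) == NULL)` instead of calling the function and then checking `al.map` in a separate statement.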
     
  • Out of thread__find_add_map(..., MAP__FUNCTION, ...), idea here is to
    continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
    getting both types of symbols in the same rbtree, as various places do
    two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.

    So thread__find_map() will eventually do just that, and 'struct symbol'
    will have the symbol type, for code that cares about that.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: https://lkml.kernel.org/n/tip-q27xee34l4izpfau49w103s6@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

18 Jan, 2018

1 commit


07 Nov, 2017

1 commit

  • Conflicts:
    tools/perf/arch/arm/annotate/instructions.c
    tools/perf/arch/arm64/annotate/instructions.c
    tools/perf/arch/powerpc/annotate/instructions.c
    tools/perf/arch/s390/annotate/instructions.c
    tools/perf/arch/x86/tests/intel-cqm.c
    tools/perf/ui/tui/progress.c
    tools/perf/util/zlib.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
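
For illustration, a C source file tagged this way starts with the identifier as a comment on its first line. In the sketch below, only that first line reflects the convention described above; the `demo_has_spdx_tag()` helper is hypothetical and is included only so the example is a complete, checkable file.

```c
// SPDX-License-Identifier: GPL-2.0
#include <string.h>

/*
 * Hypothetical helper (not part of the kernel): report whether a line
 * of text contains an SPDX license tag.
 */
static int demo_has_spdx_tag(const char *line)
{
    return strstr(line, "SPDX-License-Identifier:") != NULL;
}
```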

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman