04 Mar, 2017

1 commit

  • The refcount_t type and corresponding API should be used instead of atomic_t
    when the variable is used as a reference counter.

    This helps avoid accidental refcounter overflows that might lead to
    use-after-free situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: David Windsor
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andrew Morton
    Cc: David Windsor
    Cc: Greg Kroah-Hartman
    Cc: Hans Liljestrand
    Cc: Jiri Olsa
    Cc: Kees Cook
    Cc: Mark Rutland
    Cc: Matija Glavinic Pecotic
    Cc: Peter Zijlstra
    Cc: alsa-devel@alsa-project.org
    Link: http://lkml.kernel.org/r/1487691303-31858-5-git-send-email-elena.reshetova@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Elena Reshetova
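
The overflow protection this commit refers to can be illustrated with a small userspace sketch. It is not the kernel's refcount_t implementation: the demo_refcount names and the saturate-at-UINT_MAX policy are assumptions chosen for illustration, built on C11 atomics.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a saturating reference counter. */
struct demo_refcount {
	atomic_uint refs;
};

#define DEMO_REFCOUNT_SATURATED UINT_MAX

static void demo_refcount_set(struct demo_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

/* Take a reference; once saturated, the counter never moves again. */
static void demo_refcount_inc(struct demo_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == DEMO_REFCOUNT_SATURATED)
			return; /* pinned: leak the object rather than wrap to 0 */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));
}

/* Drop a reference; returns true only when the last one was released. */
static bool demo_refcount_dec_and_test(struct demo_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == DEMO_REFCOUNT_SATURATED || old == 0)
			return false; /* saturated or already released: never free */
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

	return old == 1;
}
```

A plain atomic counter would wrap from UINT_MAX to 0 and let a later put free an object that is still in use; the saturating variant trades that use-after-free for a leak.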
     

13 Jul, 2016

1 commit


01 Jul, 2016

1 commit

  • I hit a bug when running the test suite without forking
    each test (-F option):

    $ perf test -F dso
    8: Test dso data read : Ok
    9: Test dso data cache : FAILED!
    10: Test dso data reopen : FAILED!

    The reason is that the session file limit is set just once for
    the perf process, so we need to reset it for each test;
    otherwise the wrong limit is taken into account.

    Signed-off-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Nilay Vaish
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1467113345-12669-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

31 May, 2016

1 commit

  • Use path/to/bin/buildid/elf instead of path/to/bin/buildid
    to store corresponding elf binary.
    This also stores vdso in buildid/vdso, kallsyms in buildid/kallsyms.

    Note that the existing caches are not updated until the user adds
    or updates the cache. If an old-style build-id cache exists, perf
    falls back to using it (IOW, the change is backward compatible).

    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Masami Hiramatsu
    Acked-by: Namhyung Kim
    Cc: Ananth N Mavinakayanahalli
    Cc: Brendan Gregg
    Cc: Hemant Kumar
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20160528151537.16098.85815.stgit@devbox
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu
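
The layout change and its fallback can be sketched as follows. This is an illustration only: demo_cache_basename() and demo_cache_path() are hypothetical names, and detecting an old-style entry by checking whether the build-id path is a regular file is an assumption, not something this commit message states.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: file name used inside a build-id directory. */
static const char *demo_cache_basename(bool is_kallsyms, bool is_vdso)
{
	if (is_kallsyms)
		return "kallsyms";
	return is_vdso ? "vdso" : "elf";
}

/*
 * Resolve the cached file for the build-id entry at 'linkdir'.
 * Returns 0 on success, -1 if 'buf' is too small.
 */
static int demo_cache_path(char *buf, size_t size, const char *linkdir,
			   bool is_kallsyms, bool is_vdso)
{
	struct stat st;
	int n;

	if (stat(linkdir, &st) == 0 && S_ISREG(st.st_mode)) {
		/* Old style (assumed check): the entry is the binary itself. */
		n = snprintf(buf, size, "%s", linkdir);
	} else {
		/* New style: the entry is a directory holding elf/vdso/kallsyms. */
		n = snprintf(buf, size, "%s/%s", linkdir,
			     demo_cache_basename(is_kallsyms, is_vdso));
	}
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```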
     

23 Mar, 2016

1 commit


19 Mar, 2016

1 commit

  • Store the DSO's .text offset in the DSO. It is used for VDSOs and will
    also be used for other needs, like handling kernel modules.

    Signed-off-by: Wang Nan
    Cc: Adrian Hunter
    Cc: Alexei Starovoitov
    Cc: Cody P Schafer
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Kirill Smelkov
    Cc: Li Zefan
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: pi3orama@163.com
    Link: http://lkml.kernel.org/r/1456479154-136027-2-git-send-email-wangnan0@huawei.com
    [ Extracted from larger patch ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
     

13 Nov, 2015

1 commit

  • Commit 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed
    with rbtree") added a tree to look up dsos by long name. That tree gets
    corrupted whenever a dso long name is changed because the tree is not
    updated.

    One effect of that is buildid-list does not work with the 'with-hits'
    option because dso lookup fails and results in two structs for the same
    dso. The first has the buildid but no hits, the second has hits but no
    buildid. e.g.

    Before:

    $ tools/perf/perf record ls
    arch certs CREDITS Documentation firmware include
    ipc Kconfig lib Makefile net REPORTING-BUGS
    scripts sound usr block COPYING crypto
    drivers fs init Kbuild kernel MAINTAINERS
    mm README samples security tools virt
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ]
    $ tools/perf/perf buildid-list
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
    $ tools/perf/perf buildid-list -H
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so

    After:

    $ tools/perf/perf buildid-list -H
    574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
    30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so

    The fix is to record the root of the tree on the dso so that
    dso__set_long_name() can update the tree when the long name changes.

    Signed-off-by: Adrian Hunter
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Don Zickus
    Cc: Douglas Hatch
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Waiman Long
    Fixes: 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed with rbtree")
    Link: http://lkml.kernel.org/r/1447408112-1920-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

25 Aug, 2015

1 commit

  • The 'annotate' tool does some filtering of the entries in a DSO but
    forgot to reset the cache kept by dso__find_symbol(), causing a SEGV:

    [root@zoo ~]# perf annotate netlink_poll
    perf: Segmentation fault
    -------- backtrace --------
    perf[0x526ceb]
    /lib64/libc.so.6(+0x34960)[0x7faedfbe0960]
    perf(rb_erase+0x223)[0x499d63]
    perf[0x4213e9]
    perf[0x4bc123]
    perf[0x4bc621]
    perf[0x4bf26b]
    perf[0x4bc855]
    perf(perf_session__process_events+0x340)[0x4bddc0]
    perf(cmd_annotate+0x6bb)[0x421b5b]
    perf[0x479063]
    perf(main+0x60a)[0x42098a]
    /lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0]
    perf[0x420aa9]
    [0x0]
    [root@zoo ~]#

    Fix it by resetting the find cache when removing symbols.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol lookup")
    Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
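
The bug class fixed here, a lookup cache that outlives the entry it points to, can be shown with a minimal sketch. All demo_ names are hypothetical, and a small array stands in for the rbtree that perf uses.

```c
#include <stddef.h>

#define DEMO_MAX_SYMS 8

struct demo_sym {
	unsigned long start, end; /* covers [start, end) */
};

struct demo_dso {
	struct demo_sym *syms[DEMO_MAX_SYMS];
	size_t nr_syms;
	struct demo_sym *last_found; /* one-entry front-end cache */
};

static struct demo_sym *demo_find(struct demo_dso *dso, unsigned long addr)
{
	struct demo_sym *hit = dso->last_found;
	size_t i;

	/* Fast path: the previous hit often covers the next address too. */
	if (hit && addr >= hit->start && addr < hit->end)
		return hit;

	for (i = 0; i < dso->nr_syms; i++) {
		struct demo_sym *s = dso->syms[i];

		if (addr >= s->start && addr < s->end) {
			dso->last_found = s;
			return s;
		}
	}
	return NULL;
}

static void demo_remove(struct demo_dso *dso, struct demo_sym *sym)
{
	size_t i;

	for (i = 0; i < dso->nr_syms; i++) {
		if (dso->syms[i] != sym)
			continue;
		dso->syms[i] = dso->syms[--dso->nr_syms];
		break;
	}
	/* Without this reset, demo_find() could return the removed symbol. */
	dso->last_found = NULL;
}
```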
     

23 Jul, 2015

1 commit


08 Jun, 2015

2 commits

  • This has a different model than the 'thread' and 'map' struct lifetimes:
    there is not a definitive "don't use this DSO anymore" event, i.e. we may
    get many 'struct map' holding references to the '/usr/lib64/libc-2.20.so'
    DSO but then at some point some DSO may have no references but we still
    don't want to straight away release its resources, because "soon" we may
    get a new 'struct map' that needs it and we want to reuse its symtab or
    other resources.

    So we need some way to garbage collect it when crossing some memory
    usage threshold, which is left for another patch. For now it is
    sufficient to release it when calling dsos__exit(), i.e. when deleting
    the whole list as part of deleting the 'struct machine' containing it,
    which will leave only referenced objects being used.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/n/tip-majzgz07cm90t2tejrjy4clf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • To allow concurrent access, next step: refcount struct dso instances, so
    that we can ditch unused ones when the last map pointing to them goes
    away.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/n/tip-yk1k08etpd2aoe3tnrf0oizn@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

03 Jun, 2015

1 commit

  • Before patch ba92732e9808 ('perf kmaps: Check kmaps to make code more
    robust'), 'perf report' and 'perf annotate' will segfault if trace data
    contains kernel module information like this:

    # perf report -D -i ./perf.data
    ...
    0 0 0x188 [0x50]: PERF_RECORD_MMAP -1/0: [0xffffffbff1018000(0xf068000) @ 0]: x [test_module]
    ...

    # perf report -i ./perf.data --objdump=/path/to/objdump --kallsyms=/path/to/kallsyms

    perf: Segmentation fault
    -------- backtrace --------
    /path/to/perf[0x503478]
    /lib64/libc.so.6(+0x3545f)[0x7fb201f3745f]
    /path/to/perf[0x499b56]
    /path/to/perf(dso__load_kallsyms+0x13c)[0x49b56c]
    /path/to/perf(dso__load+0x72e)[0x49c21e]
    /path/to/perf(map__load+0x6e)[0x4ae9ee]
    /path/to/perf(thread__find_addr_map+0x24c)[0x47deec]
    /path/to/perf(perf_event__preprocess_sample+0x88)[0x47e238]
    /path/to/perf[0x43ad02]
    /path/to/perf[0x4b55bc]
    /path/to/perf(ordered_events__flush+0xca)[0x4b57ea]
    /path/to/perf[0x4b1a01]
    /path/to/perf(perf_session__process_events+0x3be)[0x4b428e]
    /path/to/perf(cmd_report+0xf11)[0x43bfc1]
    /path/to/perf[0x474702]
    /path/to/perf(main+0x5f5)[0x42de95]
    /lib64/libc.so.6(__libc_start_main+0xf4)[0x7fb201f23bd4]
    /path/to/perf[0x42dfc4]

    This is because __kmod_path__parse treats names with a leading '[' as
    the kernel instead of as kernel module names.

    If perf.data contains build information and the buildid of such modules
    can be found, the dso->kernel of it will be set to DSO_TYPE_KERNEL by
    __event_process_build_id(), not kernel module.

    It will then be passed to dso__load() -> dso__load_kernel_sym() ->
    dso__load_kcore() if --kallsyms is provided.

    The referenced patch adds a NULL pointer check to avoid the segfault.
    However, such kernel modules are still processed incorrectly.

    This patch fixes __kmod_path__parse so that it treats names like
    '[test_module]' as kernel modules.

    kmod-path.c is also updated to reflect the above changes.

    Signed-off-by: Wang Nan
    Acked-by: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Zefan Li
    Link: http://lkml.kernel.org/r/1433321541-170245-1-git-send-email-wangnan0@huawei.com
    [ Fixed up the merge with 0443f36b0de0 ("perf machine: Fix the search
    for the kernel DSO on the unified list") ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
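
The classification rule described in this commit can be sketched as below. The function name is hypothetical, and the list of bracketed names that are not modules is an assumption for illustration; the commit message only states that names like '[test_module]' must be treated as kernel modules.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: is a bracketed map name a kernel module? */
static bool demo_bracket_name_is_kmod(const char *name)
{
	/* Assumed set of bracketed names that are not modules. */
	static const char *const not_modules[] = {
		"[kernel.kallsyms]", "[vdso]", "[vsyscall]",
	};
	size_t i;

	if (name[0] != '[')
		return false; /* not bracketed: decided by other rules */

	for (i = 0; i < sizeof(not_modules) / sizeof(not_modules[0]); i++)
		if (strcmp(name, not_modules[i]) == 0)
			return false;

	return true; /* e.g. "[test_module]" */
}
```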
     

29 May, 2015

1 commit

  • It never was a 'struct dso' method, so fix that by renaming
    dso__kernel_findnew() to machine__findnew_kernel().

    At some point I'll move it all to the machine.[ch] files; for now
    let's ease patch review by not moving too much stuff.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-zrxmblgsg5vx0iv4rhvq2f6l@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

27 May, 2015

1 commit

  • Using dso__data_fd() in a multi-threaded environment is not safe since
    the returned fd can be closed and/or reused at any time.

    So convert it to the dso__data_get/put_fd() pair to protect the access
    with a lock.

    The original dso__data_fd() is deprecated and kept only for testing.

    Signed-off-by: Namhyung Kim
    Acked-by: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1432137821-10853-3-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

18 May, 2015

1 commit

  • Add mutex to protect it from concurrent dso__load().

    Signed-off-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1431909055-21442-26-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

29 Apr, 2015

1 commit

  • Add a member to struct dso that can be used by Instruction Trace
    implementations to hold a cache for decoded instructions.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1428594864-29309-16-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

24 Mar, 2015

1 commit

  • Before, when some problem happened while trying to load the kernel
    symtab, 'perf top' would show:

    ┌─Warning:───────────────────────────┐
    │The vmlinux file can't be used.     │
    │Kernel samples will not be resolved.│
    │                                    │
    │                                    │
    │Press any key...                    │
    └────────────────────────────────────┘

    Now, it reports:

    # perf top --vmlinux /dev/null

    ┌─Warning:───────────────────────────────────────────┐
    │The /tmp/passwd file can't be used: Invalid ELF file│
    │Kernel samples will not be resolved.                │
    │                                                    │
    │                                                    │
    │Press any key...                                    │
    └────────────────────────────────────────────────────┘

    This is possible because we now register the reason for not being able
    to load the symtab in the dso->load_errno member, and provide a
    dso__strerror_load() routine to format this error into a strerror-like
    string with a short reason for the error while loading.

    That can mean just forwarding the dso__strerror_load() call to
    strerror_r() or, for a separate errno range, providing a custom message.

    Reported-by: Ingo Molnar
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-u5rb5uq63xqhkfb8uv2lxd5u@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
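
The two-range error formatting described above can be sketched as follows. The enum values, message strings other than "Invalid ELF file", and demo_ names are assumptions; the sketch also uses strerror() rather than strerror_r() to sidestep the GNU/XSI signature difference.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical custom range, kept well away from system errno values. */
enum demo_load_errno {
	DEMO_LOAD_ERRNO__START = -10000,
	DEMO_LOAD_ERRNO__INVALID_ELF = DEMO_LOAD_ERRNO__START,
	DEMO_LOAD_ERRNO__NO_SYMTAB,
	DEMO_LOAD_ERRNO__END,
};

struct demo_dso {
	int load_errno; /* reason the last symtab load failed */
};

/* Format dso->load_errno into buf; returns 0 on success, -1 if unknown. */
static int demo_strerror_load(const struct demo_dso *dso, char *buf, size_t len)
{
	static const char *const msgs[] = {
		"Invalid ELF file",
		"No symbol table",
	};
	int err = dso->load_errno;

	if (err >= 0) {
		/* System errno: forward to the C library. */
		snprintf(buf, len, "%s", strerror(err));
		return 0;
	}
	if (err < DEMO_LOAD_ERRNO__START || err >= DEMO_LOAD_ERRNO__END)
		return -1;

	/* Custom range: index into the table of short reasons. */
	snprintf(buf, len, "%s", msgs[err - DEMO_LOAD_ERRNO__START]);
	return 0;
}
```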
     

23 Mar, 2015

2 commits

  • Because it's no longer needed.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-bb84vlg76t78q8y8fdeed2qn@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • We no longer need the 'compressed' argument, because all
    current users use 'NULL' for it.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-d72q2s7ggbmy2yzhumux4zzw@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

22 Mar, 2015

2 commits

  • Separate the creation of new dso object and its addition to the dsos
    list. It will be used in following patch.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-8j43jod97fdt5dwdsushwwae@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Provides a unified way of parsing a kernel module path
    into several components.

    The new kmod_path__parse function and a few defines:

    int __kmod_path__parse(struct kmod_path *m, const char *path,
    bool alloc_name, bool alloc_ext);

    #define kmod_path__parse(__m, __p) __kmod_path__parse(__m, __p, false, false)
    #define kmod_path__parse_name(__m, __p) __kmod_path__parse(__m, __p, true , false)
    #define kmod_path__parse_ext(__m, __p) __kmod_path__parse(__m, __p, false, true)

    It parses the kernel module @path and updates the @m argument as follows:

    @comp - true if @path contains a supported compression suffix,
            false otherwise
    @kmod - true if @path contains the '.ko' suffix in the right position,
            false otherwise
    @name - if (@alloc_name && @kmod) is true, it contains the strdup-ed base
            name of the kernel module without suffixes, otherwise the
            strdup-ed base name of @path
    @ext  - if (@alloc_ext && @comp) is true, it contains the strdup-ed
            compression suffix

    It returns 0 if there's no strdup error, -ENOMEM otherwise.

    Signed-off-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-9t6eqg8j610r94l743hkntiv@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
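
The field semantics listed in this commit can be approximated with a simplified parser. This is a sketch, not __kmod_path__parse(): it uses a fixed buffer instead of strdup(), always fills the name, omits @ext, and assumes ".gz" is the only supported compression suffix. All demo_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct demo_kmod_path {
	bool comp;     /* path ends in a supported compression suffix */
	bool kmod;     /* ".ko" sits in the right position */
	char name[64]; /* module name without suffixes, or the full base name */
};

static bool demo_ends_with(const char *s, size_t len, const char *suffix)
{
	size_t n = strlen(suffix);

	return len >= n && memcmp(s + len - n, suffix, n) == 0;
}

static void demo_kmod_path_parse(struct demo_kmod_path *m, const char *path)
{
	const char *base = strrchr(path, '/');
	size_t len;

	base = base ? base + 1 : path;
	len = strlen(base);
	memset(m, 0, sizeof(*m));

	/* Assumption: ".gz" is the only compression suffix known here. */
	if (demo_ends_with(base, len, ".gz")) {
		m->comp = true;
		len -= 3;
	}
	/* ".ko" must be the last suffix once compression is stripped. */
	if (demo_ends_with(base, len, ".ko")) {
		m->kmod = true;
		len -= 3;
	}
	if (!m->kmod)
		len = strlen(base); /* not a module: keep the full base name */

	snprintf(m->name, sizeof(m->name), "%.*s", (int)len, base);
}
```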
     

20 Mar, 2015

1 commit

  • Commit f1f13af99a90 ("perf callchain: Cache eh/debug frame offset for
    dwarf unwind") introduces a cache for .debug_frame and .eh_frame_hdr.
    Unfortunately, it makes them share the same cache (dso->frame_offset),
    which causes unwind failures on ARM:

    $ perf test unwind
    Test dwarf unwind: FAILED!

    The reason is that, if a dso has '.debug_frame' but doesn't have
    '.eh_frame_hdr' (as on ARM), dso->frame_offset is filled with the offset
    of '.debug_frame' by the first call to find_proc_info() ->
    read_unwind_spec_debug_frame(), and is then taken to be the offset of
    '.eh_frame_hdr' by the second call to find_proc_info() ->
    read_unwind_spec_eh_frame(), since '.eh_frame_hdr' is checked before
    '.debug_frame'.

    This patch solves the problem by creating two cache fields for
    '.eh_frame_hdr' and '.debug_frame'.

    Signed-off-by: Wang Nan
    Acked-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Li Zefan
    Link: http://lkml.kernel.org/r/55028BA0.1030701@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Wang Nan
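
The fix amounts to giving each section its own cache slot. A minimal sketch, with hypothetical field and function names and a stand-in lookup that models an ARM-like object carrying only '.debug_frame':

```c
#include <stdint.h>
#include <string.h>

struct demo_dso {
	/* One slot per section; 0 means "not cached". */
	uint64_t eh_frame_hdr_offset;
	uint64_t debug_frame_offset;
};

/* Stand-in for an ELF section lookup: only .debug_frame exists. */
static uint64_t demo_section_offset(const char *section)
{
	return strcmp(section, ".debug_frame") == 0 ? 0x400 : 0;
}

static uint64_t demo_eh_frame_hdr_offset(struct demo_dso *dso)
{
	if (dso->eh_frame_hdr_offset == 0)
		dso->eh_frame_hdr_offset = demo_section_offset(".eh_frame_hdr");
	return dso->eh_frame_hdr_offset;
}

static uint64_t demo_debug_frame_offset(struct demo_dso *dso)
{
	if (dso->debug_frame_offset == 0)
		dso->debug_frame_offset = demo_section_offset(".debug_frame");
	return dso->debug_frame_offset;
}
```

With a single shared slot, the '.eh_frame_hdr' query would return 0x400 after '.debug_frame' had been cached; with two slots it correctly reports that the section is absent.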
     

30 Jan, 2015

1 commit

  • When libunwind tries to resolve callchains it needs to know the offset
    of .eh_frame_hdr or .debug_frame to access the dso.

    Since it will always return the same result for a given DSO, just cache
    the result as an optimization.

    Signed-off-by: Namhyung Kim
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1422518843-25818-41-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

04 Nov, 2014

1 commit

  • This patch adds basic support for handling compressed kernel modules, as
    some distros (such as Arch Linux) ship them now. The actual work using
    a compression library will be added later.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1415063674-17206-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

29 Oct, 2014

2 commits

  • This patch introduces an abstraction for exporting sample data in a
    database-friendly way. The abstraction does not implement the actual
    output. A subsequent patch takes this facility into use for extending
    the script interface.

    The abstraction is needed because static data like symbols, dsos, comms
    etc need to be exported only once. That means allocating them a unique
    identifier and recording it on each structure. The member 'db_id' is
    used for that. 'db_id' is just a 64-bit sequence number.

    Exporting centres around the db_export__sample() function which exports
    the associated data structures if they have not yet been allocated a
    db_id.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1414061124-26830-6-git-send-email-adrian.hunter@intel.com
    [ committer note: Stash db_id using symbol_conf.priv_size + symbol__priv() and foo->priv areas ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Cache the DWARF debug info for DSO so we don't have to rebuild it for each
    address in the DSO.

    Note that dso__new() uses calloc(), so there is no need to set dso->dwfl to NULL.

    $ /tmp/perf.orig --version
    perf version 3.18.rc1.gc2661b8
    $ /tmp/perf.new --version
    perf version 3.18.rc1.g402d62
    $ perf stat -e cycles,instructions /tmp/perf.orig report -g > orig

    Performance counter stats for '/tmp/perf.orig report -g':

    6,428,177,183 cycles # 0.000 GHz
    4,176,288,391 instructions # 0.65 insns per cycle

    1.840666132 seconds time elapsed

    $ perf stat -e cycles,instructions /tmp/perf.new report -g > new

    Performance counter stats for '/tmp/perf.new report -g':

    305,773,142 cycles # 0.000 GHz
    276,048,272 instructions # 0.90 insns per cycle

    0.087693543 seconds time elapsed
    $ diff orig new
    $

    Changelog[v2]:

    [Arnaldo Carvalho] Cache in existing global objects rather than create
    new static/globals in functions.

    Reported-by: Anton Blanchard
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Anton Blanchard
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/20141022000958.GB2228@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

02 Oct, 2014

1 commit

  • With a workload that spawns and destroys many threads and processes, it
    was found that perf-mem could take a long time to post-process the perf
    data after the target workload had completed its operation.

    The performance bottleneck was found to be the lookup and insertion of
    the new DSO structures (thousands of them in this case).

    In a dual-socket Ivy-Bridge E7-4890 v2 machine (30-core, 60-thread), the
    perf profile below shows what perf was doing after the profiled AIM7
    shared workload completed:

    - 83.94% perf libc-2.11.3.so [.] __strcmp_sse42
    - __strcmp_sse42
    - 99.82% map__new
    machine__process_mmap_event
    perf_session_deliver_event
    perf_session__process_event
    __perf_session__process_events
    cmd_record
    cmd_mem
    run_builtin
    main
    __libc_start_main
    - 13.17% perf perf [.] __dsos__findnew
    __dsos__findnew
    map__new
    machine__process_mmap_event
    perf_session_deliver_event
    perf_session__process_event
    __perf_session__process_events
    cmd_record
    cmd_mem
    run_builtin
    main
    __libc_start_main

    So about 97% of the CPU time was spent in the map__new() function trying
    to insert a new DSO entry into the DSO linked list. The whole
    post-processing step took about 9 minutes.

    The DSO structures are currently searched linearly. So the total
    processing time will be proportional to n^2.

    To overcome this performance problem, the DSO code is modified to also
    put the DSO structures in an RB tree sorted by long name, in
    addition to keeping them in a simple linked list. With this change, the
    processing time becomes proportional to n*log(n), which is much
    quicker for large n. However, the short name is still searched
    using the old linear method. With that patch in place, the
    same perf-mem post-processing step took less than 30 seconds to
    complete.

    Signed-off-by: Waiman Long
    Cc: Adrian Hunter
    Cc: Don Zickus
    Cc: Douglas Hatch
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Link: http://lkml.kernel.org/r/1412098575-27863-3-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Waiman Long
     

30 Sep, 2014

1 commit

  • This is a precursor patch to enable long name searching of DSOs using
    a rbtree.

    In this patch, a new dsos structure is created which contains only a
    list head structure for the moment.

    The new dsos structure is used, in turn, in the machine structure for
    the user_dsos and kernel_dsos fields.

    Only the following 3 dsos functions are modified to accept the new dsos
    structure parameter instead of list_head:

    - dsos__add()
    - dsos__find()
    - __dsos__findnew()

    Signed-off-by: Waiman Long
    Cc: Adrian Hunter
    Cc: Don Zickus
    Cc: Douglas Hatch
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Link: http://lkml.kernel.org/r/1412021249-19201-2-git-send-email-Waiman.Long@hp.com
    [ Move struct dsos to dso.h to reduce the dso methods depends on machine.h ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Waiman Long
     

24 Jul, 2014

1 commit

  • dso__type() determines whether a dso is 32-bit, x32 (32-bit with 64-bit
    registers) or 64-bit.

    dso__type() will be used to determine the VDSO a program maps.

    Reviewed-by: Jiri Olsa
    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1406035081-14301-51-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
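
The three-way distinction can be derived from the ELF header alone: x32 objects are ELFCLASS32 files whose machine type is EM_X86_64. The sketch below assumes that approach and uses hypothetical demo_ names; it relies on the constants from <elf.h> and is not the dso__type() implementation.

```c
#include <elf.h>
#include <string.h>

enum demo_dso_type {
	DEMO_DSO_TYPE_UNKNOWN,
	DEMO_DSO_TYPE_64BIT,
	DEMO_DSO_TYPE_32BIT,
	DEMO_DSO_TYPE_X32BIT,
};

/* Classify an object from its e_ident bytes and e_machine field. */
static enum demo_dso_type demo_dso_type(const unsigned char *e_ident,
					unsigned int e_machine)
{
	if (memcmp(e_ident, ELFMAG, SELFMAG) != 0)
		return DEMO_DSO_TYPE_UNKNOWN; /* not an ELF file */

	switch (e_ident[EI_CLASS]) {
	case ELFCLASS64:
		return DEMO_DSO_TYPE_64BIT;
	case ELFCLASS32:
		/* 32-bit container with the 64-bit instruction set is x32. */
		return e_machine == EM_X86_64 ? DEMO_DSO_TYPE_X32BIT
					      : DEMO_DSO_TYPE_32BIT;
	default:
		return DEMO_DSO_TYPE_UNKNOWN;
	}
}
```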
     

23 Jul, 2014

3 commits

  • Add a function to return the dso data size, for use in estimating the
    size of an instruction cache.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1406035081-14301-27-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Add a function to track whether a caller has seen the data status of a
    dso. This is needed to enable callers to report the error exactly once
    per dso.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1406035081-14301-11-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
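
One way to implement "report exactly once per dso, per caller" is a bitmask with one bit per caller. The sketch below assumes that design; the enum values and demo_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callers that may want to report a data error. */
enum demo_seen_by {
	DEMO_SEEN_BY_DECODER,
	DEMO_SEEN_BY_REPORT,
};

struct demo_dso {
	uint32_t status_seen; /* one bit per caller */
};

/*
 * Returns true if 'by' has already seen this dso's status, false the
 * first time, so the caller reports only when this returns false.
 */
static bool demo_status_seen(struct demo_dso *dso, enum demo_seen_by by)
{
	uint32_t flag = UINT32_C(1) << by;

	if (dso->status_seen & flag)
		return true;

	dso->status_seen |= flag;
	return false;
}
```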
     
  • Add 'data.status' to record whether a dso has data (i.e. an object
    file). This is used to avoid repeatedly creating the file name and
    attempting to open a file that is not present.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1406035081-14301-10-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

17 Jul, 2014

1 commit

  • Add a flag to 'struct dso' to record if the dso is 64-bit or not.
    Update the flag when reading the ELF.

    This is needed for instruction decoding. For example, x86 instruction
    decoding depends on whether or not the 64-bit instruction set is used.

    Signed-off-by: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1405332185-4050-18-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     

12 Jun, 2014

5 commits

  • Adding descriptions/explanations for dso__data_* interface
    functions.

    Acked-by: Namhyung Kim
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1401892622-30848-10-git-send-email-jolsa@kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • Adding a file size check, because lseek will succeed for
    any offset beyond the file size and thus succeed when it was
    expected to fail.

    Factoring the code to check the offset against the file size
    earlier in the flow.

    Acked-by: Namhyung Kim
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1401892622-30848-8-git-send-email-jolsa@kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • Adding global list of opened dso objects, so we can
    track them and use the list for caching dso data file
    descriptors.

    Acked-by: Namhyung Kim
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1401892622-30848-5-git-send-email-jolsa@kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • Adding data_fd into the dso object so we can handle caching
    of opened dso data file descriptors coming in the next patches.

    Adding dso__data_close interface to keep the data_fd updated
    when the descriptor is closed.

    Acked-by: Namhyung Kim
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1401892622-30848-4-git-send-email-jolsa@kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • Add a separate structure/namespace for data related
    variables. We are going to add more of them, so this
    way they will be clearly separated.

    Acked-by: Namhyung Kim
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1401892622-30848-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     

02 May, 2014

1 commit

  • Combine all definitions into a common tools/include/linux/types.h and
    kill the wild growth elsewhere. Move DECLARE_BITMAP to its proper
    bitmap.h header.

    Signed-off-by: Borislav Petkov
    Acked-by: Rusty Russell
    Link: http://lkml.kernel.org/n/tip-azczs7qcv6h9xek9od10hiv2@git.kernel.org
    Signed-off-by: Jiri Olsa

    Borislav Petkov
     

18 Feb, 2014

1 commit

  • Allow adding events on local functions without debuginfo.
    (With debuginfo, we can add events even on inlined functions.)
    Currently, probing on local functions requires debuginfo to
    locate the actual address. It is also possible without debuginfo since
    we have symbol maps.

    Without this change;
    ----
    # ./perf probe -a t_show
    Added new event:
    probe:t_show (on t_show)

    You can now use it in all perf tools, such as:

    perf record -e probe:t_show -aR sleep 1

    # ./perf probe -x perf -a identity__map_ip
    no symbols found in /kbuild/ksrc/linux-3/tools/perf/perf, maybe install a debug package?
    Failed to load map.
    Error: Failed to add events. (-22)
    ----
    As the results above show, perf probe puts just one event
    on the first symbol found for a kprobe event. Moreover,
    for a uprobe event, perf probe fails to find local
    functions.

    With this change;
    ----
    # ./perf probe -a t_show
    Added new events:
    probe:t_show (on t_show)
    probe:t_show_1 (on t_show)
    probe:t_show_2 (on t_show)
    probe:t_show_3 (on t_show)

    You can now use it in all perf tools, such as:

    perf record -e probe:t_show_3 -aR sleep 1

    # ./perf probe -x perf -a identity__map_ip
    Added new events:
    probe_perf:identity__map_ip (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
    probe_perf:identity__map_ip_1 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
    probe_perf:identity__map_ip_2 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)
    probe_perf:identity__map_ip_3 (on identity__map_ip in /kbuild/ksrc/linux-3/tools/perf/perf)

    You can now use it in all perf tools, such as:

    perf record -e probe_perf:identity__map_ip_3 -aR sleep 1
    ----
    Now we succeed in putting events on every given local function
    for both kprobes and uprobes. :)

    Note that this also introduces some symbol rbtree
    iteration macros; symbols__for_each, dso__for_each_symbol,
    and map__for_each_symbol. These are for walking through
    the symbol list in a map.

    Changes from v2:
    - Fix add_exec_to_probe_trace_events() not to convert address
    to tp->symbol any more.
    - Fix to set kernel probes based on ref_reloc_sym.

    Signed-off-by: Masami Hiramatsu
    Cc: David Ahern
    Cc: "David A. Long"
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Oleg Nesterov
    Cc: Srikar Dronamraju
    Cc: Steven Rostedt
    Cc: yrl.pp-manager.tt@hitachi.com
    Link: http://lkml.kernel.org/r/20140206053225.29635.15026.stgit@kbuild-fedora.yrl.intra.hitachi.co.jp
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu