17 Jan, 2015

1 commit

  • dwfl_report_offline() works only when libraries are prelinked.

    Replace dwfl_report_offline() with dwfl_report_elf() so we correctly
    extract debug info even from libraries that are not prelinked.

    Reported-by: Jiri Olsa
    Signed-off-by: Sukadev Bhattiprolu
    Tested-by: Jiri Olsa
    Cc: Jiri Olsa
    Cc: Michael Ellerman
    Link: http://lkml.kernel.org/r/20150114221045.GA17703@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

29 Oct, 2014

2 commits

  • Stop passing both machine and thread to several thread methods,
    reducing function signature length.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Jean Pihet
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-ckcy19dcp1jfkmdihdjcqdn1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Cache the DWARF debug info for DSO so we don't have to rebuild it for each
    address in the DSO.

    Note that dso__new() uses calloc(), so there is no need to set dso->dwfl
    to NULL explicitly.
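
    The caching idea can be sketched as follows. This is a minimal
    illustration, not the code from the patch: struct dso_sketch,
    struct debug_handle and the dso_sketch__*() helpers are hypothetical
    stand-ins, and the "expensive" build step is reduced to a malloc().

```c
#include <stdlib.h>

/* Hypothetical stand-in for a debug-info handle that is costly to build. */
struct debug_handle {
	int id;
};

/* Hypothetical, much-reduced stand-in for perf's struct dso. */
struct dso_sketch {
	struct debug_handle *dwfl;	/* cached handle, NULL until first use */
	int build_count;		/* how many times the handle was built */
};

/*
 * calloc() zero-fills the object, so the cache pointer starts out as NULL
 * without an explicit assignment (assuming all-bits-zero is a null pointer,
 * which holds on the usual Linux targets).
 */
static struct dso_sketch *dso_sketch__new(void)
{
	return calloc(1, sizeof(struct dso_sketch));
}

/* Return the cached handle, building it only on the first call. */
static struct debug_handle *dso_sketch__debug_handle(struct dso_sketch *dso)
{
	if (dso->dwfl == NULL) {
		dso->dwfl = malloc(sizeof(*dso->dwfl));
		if (dso->dwfl == NULL)
			return NULL;
		dso->dwfl->id = 1;
		dso->build_count++;
	}
	return dso->dwfl;
}

/* Release the object together with its cached handle. */
static void dso_sketch__delete(struct dso_sketch *dso)
{
	if (dso != NULL) {
		free(dso->dwfl);
		free(dso);
	}
}
```

    Looking up many addresses in the same object then calls
    dso_sketch__debug_handle() repeatedly while the build step runs once.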

    $ /tmp/perf.orig --version
    perf version 3.18.rc1.gc2661b8
    $ /tmp/perf.new --version
    perf version 3.18.rc1.g402d62
    $ perf stat -e cycles,instructions /tmp/perf.orig report -g > orig

    Performance counter stats for '/tmp/perf.orig report -g':

    6,428,177,183 cycles # 0.000 GHz
    4,176,288,391 instructions # 0.65 insns per cycle

    1.840666132 seconds time elapsed

    $ perf stat -e cycles,instructions /tmp/perf.new report -g > new

    Performance counter stats for '/tmp/perf.new report -g':

    305,773,142 cycles # 0.000 GHz
    276,048,272 instructions # 0.90 insns per cycle

    0.087693543 seconds time elapsed
    $ diff orig new
    $

    Changelog[v2]:

    [Arnaldo Carvalho] Cache in existing global objects rather than create
    new static/globals in functions.

    Reported-by: Anton Blanchard
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Anton Blanchard
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/20141022000958.GB2228@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

18 Sep, 2014

1 commit


20 Aug, 2014

1 commit

  • It looks like util/debug.h was previously included indirectly and no
    longer is. As a result, pr_debug is left undefined and the build of the
    perf tool fails on PowerPC.

    Explicitly include util/debug.h.

    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Jiri Olsa
    Cc: Jiri Olsa
    Cc: Michael Ellerman
    Link: http://lkml.kernel.org/r/20140807072700.GA17623@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

24 Jul, 2014

1 commit


27 Jun, 2014

1 commit

  • When saving the callchain on Power, the kernel conservatively saves excess
    entries in the callchain. A few of these entries are needed in some cases
    but not others. We should use the DWARF debug information to determine
    when the entries are needed.

    E.g., the value in the link register (LR) is needed only when it holds
    the return address of a function. At other times it must be ignored.

    If the unnecessary entries are not ignored, we end up with duplicate arcs
    in the call-graphs.

    Use the DWARF debug information to determine if any callchain entries
    should be ignored when building call-graphs.
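
    The filtering step alone can be sketched as below. This is an
    illustration under assumptions, not the patch itself: the function name
    callchain_copy_skipping() is hypothetical, and the index to skip is
    assumed to have been derived from the DWARF call-frame information
    elsewhere (that analysis is not shown).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy callchain entries from src to dst, dropping the entry at skip_idx.
 * A negative skip_idx means "keep every entry".
 * dst must have room for nr entries. Returns the number of entries written.
 */
static size_t callchain_copy_skipping(const uint64_t *src, size_t nr,
				      int skip_idx, uint64_t *dst)
{
	size_t i, out = 0;

	for (i = 0; i < nr; i++) {
		if (skip_idx >= 0 && i == (size_t)skip_idx)
			continue;	/* e.g. an LR slot that holds no return address */
		dst[out++] = src[i];
	}
	return out;
}
```

    With the stale entry dropped, the same caller no longer shows up twice
    in a row, which is what produces the duplicate arcs.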

    Callgraph before the patch:

    14.67%          2234  sprintft  libc-2.18.so       [.] __random
            |
            --- __random
               |
               |--61.12%-- __random
               |          |
               |          |--97.15%-- rand
               |          |          do_my_sprintf
               |          |          main
               |          |          generic_start_main.isra.0
               |          |          __libc_start_main
               |          |          0x0
               |          |
               |           --2.85%-- do_my_sprintf
               |                     main
               |                     generic_start_main.isra.0
               |                     __libc_start_main
               |                     0x0
               |
                --38.88%-- rand
                          |
                          |--94.01%-- rand
                          |          do_my_sprintf
                          |          main
                          |          generic_start_main.isra.0
                          |          __libc_start_main
                          |          0x0
                          |
                           --5.99%-- do_my_sprintf
                                     main
                                     generic_start_main.isra.0
                                     __libc_start_main
                                     0x0

    Callgraph after the patch:

    14.67%          2234  sprintft  libc-2.18.so       [.] __random
            |
            --- __random
               |
               |--95.93%-- rand
               |          do_my_sprintf
               |          main
               |          generic_start_main.isra.0
               |          __libc_start_main
               |          0x0
               |
                --4.07%-- do_my_sprintf
                          main
                          generic_start_main.isra.0
                          __libc_start_main
                          0x0

    TODO: For split-debug info objects like glibc, we can determine
    the call-frame-address only when both the .eh_frame and .debug_info
    sections are available. We should be able to determine the CFA
    even without the .eh_frame section.

    Fix suggested by Anton Blanchard.

    Thanks to valuable input on DWARF debug information from Ulrich Weigand.

    Reported-by: Maynard Johnson
    Tested-by: Maynard Johnson
    Signed-off-by: Sukadev Bhattiprolu
    Link: http://lkml.kernel.org/r/20140625154903.GA29607@us.ibm.com
    Signed-off-by: Jiri Olsa

    Sukadev Bhattiprolu
     

16 Mar, 2013

1 commit

  • Including libio.h causes build failures on uClibc systems (which lack
    libio.h).

    It appears that libio.h was only included to pull in a definition for
    NULL, so it has been replaced by stddef.h.

    On powerpc, libio.h was conditionally included, but could be removed
    completely as it is unneeded. Also, the include of stdlib.h was changed
    to stddef.h (as again, only NULL is needed).

    Signed-off-by: Cody P Schafer
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1363300074-26288-1-git-send-email-cody@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Cody P Schafer
     

14 Mar, 2012

1 commit

  • Several places expected the value returned by snprintf() to be the
    number of characters actually printed, not the number that would have
    been printed had there been enough space.

    Fix it by using the scnprintf and vscnprintf variants we inherited from
    the kernel sources.

    Some corner cases where the number of printed characters was not
    accounted for were fixed too.
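
    The difference in return values can be sketched in userspace as below.
    The sketch_scnprintf() and sketch_vscnprintf() names are hypothetical;
    the logic follows the kernel-style contract of returning the number of
    characters actually stored, excluding the trailing NUL. Treating a
    negative vsnprintf() result as 0 is a choice made for this sketch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Return the number of characters actually written to buf (without NUL). */
static int sketch_vscnprintf(char *buf, size_t size, const char *fmt,
			     va_list args)
{
	int i = vsnprintf(buf, size, fmt, args);

	if (i < 0)
		return 0;		/* output error: nothing usable written */
	if ((size_t)i < size)
		return i;		/* everything fit */
	if (size != 0)
		return (int)(size - 1);	/* truncated: buffer is full */
	return 0;			/* zero-sized buffer */
}

/* Variadic wrapper around sketch_vscnprintf(). */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int i;

	va_start(args, fmt);
	i = sketch_vscnprintf(buf, size, fmt, args);
	va_end(args);
	return i;
}
```

    Code that advances a write pointer by the return value stays inside the
    buffer with this variant, whereas the snprintf() return value can point
    past the end after a truncation.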

    Reported-by: Anton Blanchard
    Cc: Anton Blanchard
    Cc: Eric B Munson
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Yanmin Zhang
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/n/tip-kwxo2eh29cxmd8ilixi2005x@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

25 Nov, 2011

1 commit


08 Oct, 2011

1 commit

  • The goal of this patch is to include more information about the host
    environment in perf.data so it is more self-descriptive. Over time,
    profiles are captured on various machines and it becomes hard to track
    what was recorded, on which machine, and when.

    This patch provides a way to solve this by extending the perf.data file
    with basic information about the host machine. To add those extensions,
    we leverage the feature bits capabilities of the perf.data format. The
    change is backward compatible with existing perf.data files.

    We define the following useful new extensions:
    - HEADER_HOSTNAME: the hostname
    - HEADER_OSRELEASE: the kernel release number
    - HEADER_ARCH: the hw architecture
    - HEADER_CPUDESC: generic CPU description
    - HEADER_NRCPUS: number of online/avail cpus
    - HEADER_CMDLINE: perf command line
    - HEADER_VERSION: perf version
    - HEADER_TOPOLOGY: cpu topology
    - HEADER_EVENT_DESC: full event description (attrs)
    - HEADER_CPUID: easy-to-parse low level CPU identification

    The small granularity for the entries is to make it easier to extend
    without breaking backward compatibility. Many entries are provided as
    ASCII strings.
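
    How per-feature bits keep the format backward compatible can be sketched
    as follows. Everything here is illustrative: the sketch_* names, the
    numeric feature ids and the bitmap size are assumptions, not the values
    used by perf.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature ids; the numeric values are illustrative only. */
enum sketch_feat {
	SKETCH_FEAT_HOSTNAME = 0,
	SKETCH_FEAT_OSRELEASE,
	SKETCH_FEAT_ARCH,
	SKETCH_FEAT_MAX = 256	/* assumed size of the feature bitmap */
};

#define SKETCH_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_FEAT_MAX + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* Hypothetical file header: one bit per optional section. */
struct sketch_header {
	unsigned long features[SKETCH_WORDS];
};

/* Writer side: mark an optional section as present. */
static void sketch_header__set_feat(struct sketch_header *h, unsigned int feat)
{
	if (feat < SKETCH_FEAT_MAX)
		h->features[feat / SKETCH_BITS_PER_WORD] |=
			1UL << (feat % SKETCH_BITS_PER_WORD);
}

/* Reader side: a clear bit (e.g. in an older file) means "skip it". */
static bool sketch_header__has_feat(const struct sketch_header *h,
				    unsigned int feat)
{
	if (feat >= SKETCH_FEAT_MAX)
		return false;
	return (h->features[feat / SKETCH_BITS_PER_WORD] >>
		(feat % SKETCH_BITS_PER_WORD)) & 1UL;
}
```

    A file written before a feature existed simply has that bit clear, and a
    reader that does not know a feature never tests its bit, so either side
    can be extended independently.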

    Perf report/script have been modified to print the basic information as
    easy-to-parse ASCII strings. Extended information about CPU and NUMA
    topology may be requested with the -I option.

    Thanks to David Ahern for reviewing and testing the many versions of
    this patch.

    $ perf report --stdio
    # ========
    # captured on : Mon Sep 26 15:22:14 2011
    # hostname : quad
    # os release : 3.1.0-rc4-tip
    # perf version : 3.1.0-rc4
    # arch : x86_64
    # nrcpus online : 4
    # nrcpus avail : 4
    # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    # cpuid : GenuineIntel,6,15,11
    # total memory : 8105360 kB
    # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
    # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
    # HEADER_CPU_TOPOLOGY info available, use -I to display
    # HEADER_NUMA_TOPOLOGY info available, use -I to display
    # ========
    #
    ...

    $ perf report --stdio -I
    # ========
    # captured on : Mon Sep 26 15:22:14 2011
    # hostname : quad
    # os release : 3.1.0-rc4-tip
    # perf version : 3.1.0-rc4
    # arch : x86_64
    # nrcpus online : 4
    # nrcpus avail : 4
    # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    # cpuid : GenuineIntel,6,15,11
    # total memory : 8105360 kB
    # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
    # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
    # sibling cores : 0-3
    # sibling threads : 0
    # sibling threads : 1
    # sibling threads : 2
    # sibling threads : 3
    # node0 meminfo : total = 8320608 kB, free = 7571024 kB
    # node0 cpu list : 0-3
    # ========
    #
    ...

    Reviewed-by: David Ahern
    Tested-by: David Ahern
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Andi Kleen
    Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
    Signed-off-by: Stephane Eranian
    [ committer notes: Use --show-info in the tools as was in the docs, rename
    perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
    conflict with f69b64f7 "perf: Support setting the disassembler style" ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

22 Apr, 2010

1 commit

  • This adds mappings from the register numbers from DWARF to the
    register names used in the PowerPC Regs and Stack Access API. This
    allows perf probe to be used to record variable contents on PowerPC.

    This requires the functionality represented by the config symbol
    HAVE_REGS_AND_STACK_ACCESS_API in order to function, although it will
    compile without it. That functionality is added for PowerPC in commit
    359e4284 ("powerpc: Add kprobe-based event tracer").
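
    A table-driven lookup of this kind might look like the sketch below.
    The struct and function names are hypothetical, only a few
    general-purpose registers are listed, and the "%gprN" spelling of the
    register names is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical table entry: DWARF register number and its name. */
struct sketch_regmap {
	const char *name;
	unsigned int dwarfnum;
};

/* Illustrative subset; the name strings are assumed, not authoritative. */
static const struct sketch_regmap sketch_regmap_table[] = {
	{ "%gpr0", 0 },
	{ "%gpr1", 1 },
	{ "%gpr2", 2 },
	{ "%gpr3", 3 },
};

/* Return the register name for a DWARF number, or NULL if it is unmapped. */
static const char *sketch_dwarf_regstr(unsigned int dwarfnum)
{
	size_t i;
	size_t nr = sizeof(sketch_regmap_table) / sizeof(sketch_regmap_table[0]);

	for (i = 0; i < nr; i++) {
		if (sketch_regmap_table[i].dwarfnum == dwarfnum)
			return sketch_regmap_table[i].name;
	}
	return NULL;
}
```

    A caller that gets NULL back would have to report the variable location
    as unsupported rather than guess a register.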

    Signed-off-by: Ian Munsie
    Acked-by: Masami Hiramatsu
    Signed-off-by: Paul Mackerras

    Ian Munsie