01 Apr, 2013

2 commits

  • This new command is a wrapper on top of perf record and perf report to
    make it easier to configure for memory access profiling.

    To record loads:
    $ perf mem -t load rec .....

    To record stores:
    $ perf mem -t store rec .....

    To get the report:
    $ perf mem -t load rep

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1359040242-8269-15-git-send-email-eranian@google.com
    [ Fixed minor conflict with 66857b5 "Sort command-list.txt alphabetically" ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • perf record has a new option -W that enables weighted sampling.

    Add sorting support in top/report for the average weight per sample and the
    total weight sum. This allows comparing both the relative cost per event
    and the total cost over the measurement period.
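
    As a rough illustration of the two sort values described above, the sketch
    below computes the total weight and the average weight per sample for each
    symbol. The function name and the (symbol, weight) input shape are
    assumptions made for this example; they are not perf's internal data
    structures.

```python
from collections import defaultdict


def summarize_weights(samples):
    """Return {symbol: (total_weight, average_weight)} for (symbol, weight) pairs."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for symbol, weight in samples:
        totals[symbol] += weight
        counts[symbol] += 1
    # The total shows cost over the whole run, the average shows cost per sample.
    return {sym: (totals[sym], totals[sym] / counts[sym]) for sym in totals}


# Hypothetical samples, not real perf output.
samples = [("memcpy", 120), ("memcpy", 80), ("main", 10)]
summary = summarize_weights(samples)
```

    Sorting by the first tuple element ranks symbols by total cost, while
    sorting by the second ranks them by cost per event.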

    Add the necessary glue to perf report, record and the library.

    v2: Merge with new hist refactoring.
    v3: Fix manpage. Remove value check.
    Rename global_weight to weight and weight to local_weight.
    v4: Readd sort keys to manpage
    v5: Move weight to end
    v6: Move weight to template
    v7: Rename weight key.

    Original patch from Andi modified by Stephane Eranian
    to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.

    Signed-off-by: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
    [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

27 Mar, 2013

1 commit

  • It's sometimes useful to see the raw, undemangled symbol name, for example
    when other tools use the perf output to manipulate binaries.

    Signed-off-by: Namhyung Kim
    Suggested-by: William Cohen
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: William Cohen
    BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=55571
    Link: http://lkml.kernel.org/r/1364203098-17741-1-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

26 Mar, 2013

2 commits

  • This patch adds the --per-core option to perf stat.

    This option is used to aggregate system-wide counts
    on a per physical core basis. On processors with
    hyperthreading, this means counts of all HT threads
    running on a physical core are aggregated.

    This mode is useful to find imbalance between physical
    cores running a uniform workload. Cores are identified
    by socket: S0-C1 means physical core 1 on socket 0. Note
    that cores are identified using their physical core id,
    thus their numbering may not be contiguous.

    Per core aggregation can be combined with interval printing:

    # perf stat -a --per-core -I 1000 -e cycles sleep 1000
    # time core cpus counts events
    1.000090030 S0-C0 1 4,765,747 cycles
    1.000090030 S0-C1 1 5,580,647 cycles
    1.000090030 S0-C2 1 221,181 cycles
    1.000090030 S0-C3 1 266,092 cycles
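
    A minimal sketch of how such output could be post-processed to spot
    imbalance, assuming the whitespace-separated columns shown above (time,
    core, cpus, counts, events). The helper name is hypothetical and the real
    column alignment may differ.

```python
def parse_per_core(lines):
    """Parse 'time core cpus counts event' rows into {core: summed count}."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# time core ...' header
        _time, core, _cpus, count, _event = line.split()
        counts[core] = counts.get(core, 0) + int(count.replace(",", ""))
    return counts


sample = """\
# time core cpus counts events
1.000090030 S0-C0 1 4,765,747 cycles
1.000090030 S0-C1 1 5,580,647 cycles
1.000090030 S0-C2 1 221,181 cycles
1.000090030 S0-C3 1 266,092 cycles
"""
per_core = parse_per_core(sample.splitlines())
busiest = max(per_core, key=per_core.get)
idlest = min(per_core, key=per_core.get)
```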

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
    [ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • To make it more obvious what this option does as suggested by Andi on
    LKML.

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360846649-6411-3-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

16 Mar, 2013

2 commits

  • The following patch causes 'perf stat --repeat 0' to be interpreted as
    'forever', displaying the stats for every run.

    We act as if a single run was asked, and reset the stats in each
    iteration. In this mode SIGINT is passed to perf to be able to stop the
    loop with Ctrl+C.

    Signed-off-by: Frederik Deweerdt
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20130301180227.GA24385@ks398093.ip-192-95-24.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Frederik Deweerdt
     
  • Add --group option to enable event grouping. When enabled, the
    information for all group members is shown with the leader, so
    non-leader events are skipped.

    It currently supports only --stdio output. Later patches will add
    further features.

    $ perf annotate --group --stdio
    ...
    Percent | Source code & Disassembly of libpthread-2.15.so
    --------------------------------------------------------------------------------
    :
    :
    :
    : Disassembly of section .text:
    :
    : 000000387dc0aa50 :
    8.08 2.40 5.29 : 387dc0aa50: mov %rdi,%rdx
    0.00 0.00 0.00 : 387dc0aa53: mov 0x10(%rdi),%edi
    0.00 0.00 0.00 : 387dc0aa56: mov %edi,%eax
    0.00 0.80 0.00 : 387dc0aa58: and $0x7f,%eax
    3.03 2.40 3.53 : 387dc0aa5b: test $0x7c,%dil
    0.00 0.00 0.00 : 387dc0aa5f: jne 387dc0aaa9
    0.00 0.00 0.00 : 387dc0aa81: nop
    0.00 0.00 0.00 : 387dc0aa82: xor %eax,%eax
    0.00 0.00 0.00 : 387dc0aa84: retq
    ...

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1362462812-30885-6-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

15 Feb, 2013

3 commits

  • Add --skip-missing option for skipping symbols that cannot be used for
    annotation. This is the case for kernel symbols when the user doesn't
    have a vmlinux image file.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360227734-375-8-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Basic implementation of perf annotate on GTK2. Currently only
    shows first symbol. Add a new --gtk option to use it.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360227734-375-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Adding a vmlinux file to the build-id cache would fail, since a kallsyms
    dso with the same build-id was already added by perf record.

    So one needs to remove the kallsyms first to add vmlinux into the cache.
    Add --update option for doing it at once.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360227734-375-5-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

07 Feb, 2013

1 commit

  • This patch adds per-processor socket count aggregation for system-wide
    mode measurements. This is a useful mode to detect imbalance between
    sockets.

    To enable this mode, use --aggr-socket in addition
    to -a (system-wide).

    The output includes the socket number and the number of online
    processors on that socket. This is useful to gauge the amount of
    aggregation.

    # ./perf stat -I 1000 -a --aggr-socket -e cycles sleep 2
    # time socket cpus counts events
    1.000097680 S0 4 5,788,785 cycles
    2.000379943 S0 4 27,361,546 cycles
    2.001167808 S0 4 818,275 cycles

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360161962-9675-3-git-send-email-eranian@google.com
    [ committer note: Added missing man page entry based on above comments ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

01 Feb, 2013

2 commits

  • Add '-g/--group' option for showing event groups. For simplicity it is
    currently not compatible with other options.

    $ perf evlist --group
    {ref-cycles,cycles}

    $ perf evlist
    ref-cycles
    cycles

    Suggested-by: Arnaldo Carvalho de Melo
    Signed-off-by: Namhyung Kim
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1358845787-1350-20-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Add --group option to enable event grouping. When enabled, all the
    group members information will be shown together with the leader.

    $ perf report --group
    ...
    # group: {ref-cycles,cycles}
    # ========
    #
    # Samples: 7K of event 'anon group { ref-cycles, cycles }'
    # Event count (approx.): 6876107743
    #
    # Overhead Command Shared Object Symbol
    # ................ ....... ................. ..........................
    #
    99.84% 99.76% noploop noploop [.] main
    0.07% 0.00% noploop ld-2.15.so [.] strcmp
    0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
    0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu
    0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time
    0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask
    0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe
    0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock
    0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page
    0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks
    0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1358845787-1350-18-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

30 Jan, 2013

1 commit

  • This patch adds a new printing mode for perf stat. It allows interval
    printing. That means perf stat can now print event deltas at regular
    time intervals. This is useful to detect phases in programs.

    The -I option enables interval printing. It expects an interval duration
    in milliseconds. Minimum is 100ms. Once activated, perf stat prints
    event deltas since the last printout. All modes are supported.

    $ perf stat -I 1000 -e cycles noploop 10
    noploop for 10 seconds
    # time counts events
    1.000109853 2,388,560,546 cycles
    2.000262846 2,393,332,358 cycles
    3.000354131 2,393,176,537 cycles
    4.000439503 2,393,203,790 cycles
    5.000527075 2,393,167,675 cycles
    6.000609052 2,393,203,670 cycles
    7.000691082 2,393,175,678 cycles

    The output format makes it easy to feed into a plotting program such as
    gnuplot when the -I option is used in combination with the -x option:

    $ perf stat -x, -I 1000 -e cycles noploop 10
    noploop for 10 seconds
    1.000084113,2378775498,cycles
    2.000245798,2391056897,cycles
    3.000354445,2392089414,cycles
    4.000459115,2390936603,cycles
    5.000565341,2392108173,cycles
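
    As a sketch of the plotting use case, the snippet below turns the
    comma-separated interval output into (time, count, event) tuples that a
    plotting tool could consume. The function name is hypothetical, and the
    three-field layout is assumed from the sample above.

```python
import csv
import io


def parse_interval_csv(text):
    """Return a list of (timestamp_seconds, count, event_name) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != 3:
            continue  # e.g. the workload's own 'noploop for 10 seconds' line
        try:
            rows.append((float(fields[0]), int(fields[1]), fields[2]))
        except ValueError:
            continue  # not a counter line
    return rows


sample = """\
noploop for 10 seconds
1.000084113,2378775498,cycles
2.000245798,2391056897,cycles
3.000354445,2392089414,cycles
"""
rows = parse_interval_csv(sample)
```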

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1359460064-3060-3-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

25 Jan, 2013

3 commits

  • Sometimes a test is problematic for some reason and one wants to skip it,
    for instance:

    [root@sandy ~]# perf test
    1: vmlinux symtab matches kallsyms : Ok
    2: detect open syscall event : Ok
    3: detect open syscall event on all cpus : Ok
    4: read samples using the mmap interface : Ok
    5: parse events tests : Warning: bad op token {
    Warning: bad op token {
    Warning: bad op token {
    Warning: bad op token {
    Warning: bad op token {
    Warning: function is_writable_pte not defined
    Segmentation fault (core dumped)

    So now we can use -s/--skip while the problematic tests are being fixed,
    allowing us to test all the other entries:

    [root@sandy ~]# perf test -s 5
    1: vmlinux symtab matches kallsyms : Ok
    2: detect open syscall event : Ok
    3: detect open syscall event on all cpus : Ok
    4: read samples using the mmap interface : Ok
    5: parse events tests : Skip (user override)
    6: x86 rdpmc test : Ok
    7: Validate PERF_RECORD_* events & perf_sample fields : Ok
    8: Test perf pmu format parsing : Ok
    9: Test dso data interface : Ok
    10: roundtrip evsel->name check : Ok
    11: Check parsing of sched tracepoints fields : Ok
    12: Generate and check syscalls:sys_enter_open event fields: Ok
    13: struct perf_event_attr setup : Ok
    14: Test matching and linking mutliple hists : Ok
    15: Try 'use perf' in python, checking link problems : Ok
    [root@sandy ~]#

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-klzd8p57jzdryafqkmlppcb1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The tracepoints used by the workqueue-stats script no longer exist so
    trying to run the script results in:

    # perf script record workqueue-stats
    invalid or unsupported event: 'workqueue:workqueue_creation'
    Run 'perf list' for a list of valid events

    So remove the script until it can be reworked using the new workqueue
    tracepoints.

    Signed-off-by: Tom Zanussi
    Link: http://lkml.kernel.org/r/e7a7637d5df9df86887c3bff7683574665ec5360.1358527965.git.tom.zanussi@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Tom Zanussi
     
  • Add description of sort keys to the perf-report document and also add
    missing cpu and srcline keys to the command line help string.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1356599507-14226-11-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

24 Jan, 2013

1 commit

  • …/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    . perf build-id cache now can show DSOs present in a perf.data file that are
    not in the cache, to integrate with build-id servers being put in place by
    organizations such as Fedora.

    . perf buildid-list -i an-elf-file-instead-of-a-perf.data is back showing its
    build-id.

    . No need to do feature checks when doing a 'make tags'

    . Fix some 'perf test' errors and make them use the tracepoint evsel constructor.

    . perf top now shares more of the evsel config/creation routines with 'record',
    paving the way for further integration like 'top' snapshots, etc.

    . perf top now supports DWARF callchains.

    . perf evlist decodes sample_type and read_format, helping diagnose problems.

    . Fix mmap limitations on 32-bit, fix from David Miller.

    . perf diff fixes from Jiri Olsa.

    . Ignore ABS symbols when loading data maps, fix from Namhyung Kim

    . Hists improvements from Namhyung Kim

    . Don't check configuration on make clean, from Namhyung Kim

    . Fix dso__fprintf() print statement, from Stephane Eranian.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

14 Dec, 2012

1 commit

  • Pull trivial branch from Jiri Kosina:
    "Usual stuff -- comment/printk typo fixes, documentation updates, dead
    code elimination."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    HOWTO: fix double words typo
    x86 mtrr: fix comment typo in mtrr_bp_init
    propagate name change to comments in kernel source
    doc: Update the name of profiling based on sysfs
    treewide: Fix typos in various drivers
    treewide: Fix typos in various Kconfig
    wireless: mwifiex: Fix typo in wireless/mwifiex driver
    messages: i2o: Fix typo in messages/i2o
    scripts/kernel-doc: check that non-void fcts describe their return value
    Kernel-doc: Convention: Use a "Return" section to describe return values
    radeon: Fix typo and copy/paste error in comments
    doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
    various: Fix spelling of "asynchronous" in comments.
    Fix misspellings of "whether" in comments.
    eisa: Fix spelling of "asynchronous".
    various: Fix spelling of "registered" in comments.
    doc: fix quite a few typos within Documentation
    target: iscsi: fix comment typos in target/iscsi drivers
    treewide: fix typo of "suport" in various comments and Kconfig
    treewide: fix typo of "suppport" in various comments
    ...

    Linus Torvalds
     

12 Dec, 2012

1 commit

  • Using struct perf_record_opts to specify how to configure the evsel
    perf_event_attrs.

    This gets top closer to record in the way it sets up evsels, with the
    aim of sharing more and more to the point that both will be a single
    utility.

    In this direction top now uses the same callchain option parsing as
    record and that brings DWARF callchains to top, something that was
    already available for record.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-u03o0bsrqcjgskciso3pvsjr@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

09 Dec, 2012

4 commits

  • This will allow connecting with services being put in place by distros such
    as Fedora, where one can retrieve DSOs by their build-id.

    Example usage:

    for buildid in $(perf buildid-cache --missing perf.data | cut -d' ' -f1) ; do
    echo "trying to get $buildid"
    wget -q https://darkserver.fedoraproject.org/buildids/$buildid
    cat $buildid ; echo
    rm -f $buildid
    done

    Now it's just a matter of some porcelain to get the details provided by such
    a service, retrieve the file and use 'perf buildid-cache --add $FILE' to
    insert it in the cache, then use 'perf report' or 'annotate' that will find
    the required files in the cache.
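
    The first step of the shell loop above (taking the first space-separated
    field of each '--missing' line and building a lookup URL) could be sketched
    as follows. The sample lines are made up for illustration, and the
    "build-id, space, path" layout is an assumption inferred from the
    cut -d' ' -f1 in the loop.

```python
DARKSERVER = "https://darkserver.fedoraproject.org/buildids/"


def missing_buildid_urls(output):
    """Map each non-empty line of '--missing' output to a darkserver URL."""
    urls = []
    for line in output.splitlines():
        if not line.strip():
            continue
        buildid = line.split(" ")[0]  # same field as cut -d' ' -f1
        urls.append(DARKSERVER + buildid)
    return urls


# Hypothetical output lines, not captured from a real perf.data file.
sample = (
    "0123456789abcdef0123456789abcdef01234567 /usr/lib64/libexample.so\n"
    "89abcdef0123456789abcdef0123456789abcdef /usr/bin/example\n"
)
urls = missing_buildid_urls(sample)
```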

    More information about the darkserver service at:

    https://darkserver.fedoraproject.org/

    Cc: David Ahern
    Cc: Frank Eigler
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Kushal Das
    Cc: Mark Wielaard
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-6fuktuiyjn4jykxmt7c9f7xq@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • It seems not very useful, because it's possible and even more convenient to
    look up the related symbol by name. Also the output value for both
    'baseline' and 'new' data is quite apparent from diff output.

    And above all it complicates hist code factoring ;)

    Ditching out PERF_HPP__DISPL column with related output functions.

    Suggested-by: Arnaldo Carvalho de Melo
    Signed-off-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20121206132228.GB1080@krava.brq.redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Doing the same thing done in:

    b059dee: perf tools: Don't check configuration on make clean

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-n2ni4riphpqxw7d6ziv1ndyc@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The current perf build process checks various system configuration on
    every invocation of make, but this is not needed just for cleaning.

    To avoid that, move some of the python related variables out of the
    conditional, since the 'clean' target needs them. The normal path should
    not be affected by this.

    Signed-off-by: Namhyung Kim
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1352867990-658-1-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

03 Dec, 2012

1 commit


31 Oct, 2012

1 commit

  • Without defining ARCH=arm, building perf for Android ARM will fail,
    because it needs architecture specific files.

    So add the relevant information to the android documentation.

    Signed-off-by: Joonsoo Kim
    Reviewed-by: Namhyung Kim
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Irina Tirdea
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1351518066-4791-1-git-send-email-js1304@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Joonsoo Kim
     

26 Oct, 2012

3 commits

  • In order to measure kernel builds, one has to do some pre/post cleanup
    work in order to do the repeat build.

    So provide --pre and --post command hooks to allow doing just that.

    perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \
    -- make -s -j64 O=defconfig-build/ bzImage
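
    The idea behind the hooks can be sketched as a repeat loop in which only
    the measured command is timed, with the pre and post commands run outside
    the timed region. This illustrates the concept only and is not how perf
    stat is implemented; the function name is hypothetical, and running the
    hooks through the shell is an assumption.

```python
import subprocess
import sys
import time


def run_repeated(cmd, repeat, pre=None, post=None):
    """Run cmd `repeat` times; time only cmd, not the pre/post hooks."""
    durations = []
    for _ in range(repeat):
        if pre:
            subprocess.run(pre, shell=True, check=True)  # cleanup before the run
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
        if post:
            subprocess.run(post, shell=True, check=True)  # cleanup after the run
    return durations


# Trivial stand-ins for the build command and the cleanup hooks.
durations = run_repeated([sys.executable, "-c", "pass"], repeat=3,
                         pre="true", post="true")
```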

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1350992414.13456.5.camel@twins
    [ committer note: Added respective entries in Documentation/perf-stat.txt ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Peter Zijlstra
     
  • You may want to know where and how long a task is sleeping. A callchain
    may be found in sched_switch and a time slice in stat_iowait, so I add
    a handler in perf inject for merging these events.

    My code saves the sched_switch event for each process and, when it meets
    stat_iowait, it reports the sched_switch event, because this event
    contains a correct callchain. In other words, it replaces all
    stat_iowait events with the proper sched_switch events.
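
    A minimal sketch of the merging logic described above, using plain
    dictionaries as stand-ins for perf events. The field names ("name", "pid",
    "delay", "period", "callchain") are hypothetical, and copying the sleep
    time into the emitted event's period is an assumption about how the time
    slice is carried over.

```python
def merge_sched_events(events):
    """Replace each sched_stat_* event with the saved sched_switch of its task."""
    last_switch = {}  # pid -> most recent sched_switch event for that task
    merged = []
    for ev in events:
        if ev["name"] == "sched:sched_switch":
            last_switch[ev["pid"]] = ev  # remember it, do not emit it yet
        elif ev["name"].startswith("sched:sched_stat_"):
            saved = last_switch.get(ev["pid"])
            if saved is None:
                continue  # no callchain available for this task
            out = dict(saved)
            out["period"] = ev["delay"]  # assumption: weight by time slept
            merged.append(out)
        else:
            merged.append(ev)  # unrelated events pass through unchanged
    return merged


events = [
    {"name": "sched:sched_switch", "pid": 1,
     "callchain": ["__schedule", "schedule", "do_nanosleep"]},
    {"name": "sched:sched_stat_sleep", "pid": 1, "delay": 10000000},
    {"name": "sched:sched_stat_sleep", "pid": 2, "delay": 5},
]
merged = merge_sched_events(events)
```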

    I use the following sequence of commands for testing:

    perf record -e sched:sched_stat_sleep -e sched:sched_switch \
    -e sched:sched_process_exit -g -o ~/perf.data.raw \
    ~/test-program
    perf inject -v -s -i ~/perf.data.raw -o ~/perf.data
    perf report --stdio -i ~/perf.data
    100.00% foo [kernel.kallsyms] [k] __schedule
    |
    --- __schedule
    schedule
    |
    |--79.75%-- schedule_hrtimeout_range_clock
    | schedule_hrtimeout_range
    | poll_schedule_timeout
    | do_select
    | core_sys_select
    | sys_select
    | system_call_fastpath
    | __select
    | __libc_start_main
    |
    --20.25%-- do_nanosleep
    hrtimer_nanosleep
    sys_nanosleep
    system_call_fastpath
    __GI___libc_nanosleep
    __libc_start_main

    And here is test-program.c:

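    #include <unistd.h>
    #include <time.h>        /* nanosleep(), struct timespec */
    #include <sys/select.h>  /* select(), struct timeval */

    int main()
    {
            struct timespec ts1;
            struct timeval tv1;
            int i;
            long s;

            for (i = 0; i < 10; i++) {
                    /* sleep for 10 ms in nanosleep() */
                    ts1.tv_sec = 0;
                    ts1.tv_nsec = 10000000;
                    nanosleep(&ts1, NULL);

                    /* sleep for 40 ms in select() */
                    tv1.tv_sec = 0;
                    tv1.tv_usec = 40000;
                    select(0, NULL, NULL, NULL, &tv1);
            }
            return 1;
    }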

    Signed-off-by: Andrew Vagin
    Acked-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1344344165-369636-4-git-send-email-avagin@openvz.org
    [ committer note: Made it use evsel->handler ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Andrew Vagin
     
  • Before this patch, "perf inject" could only handle data from a pipe.

    I want to use "perf inject" for reworking events. See my following patch.

    v2: add information about new options in tools/perf/Documentation/

    Signed-off-by: Andrew Vagin
    Acked-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1344344165-369636-2-git-send-email-avagin@openvz.org
    [ committer note: fixed it up to cope with 5852a44, 5ded57a, 002439e & f62d3f0 ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Andrew Vagin
     

25 Oct, 2012

3 commits

  • [root@sandy ~]# perf trace --sched --duration 0.100 --pid `pidof firefox`

    17079.847 ( 0.009 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
    17079.892 ( 0.010 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable
    17079.921 ( 0.013 ms): 17643 poll(ufds: 140037623086496, nfds: 11, timeout_msecs: 0) = 0 Timeout
    17079.949 ( 0.009 ms): 17643 read(fd: 4, buf: 140038178943092, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable
    ^C
    _____________________________________________________________________
    __) Summary of events (__

    [ task - pid ] [ events ] [ ratio ] [ runtime ]
    _____________________________________________________________________

    firefox - 17643 : 18013 [ 72.2% ] 359.110 ms
    firefox - 17663 : 41 [ 0.2% ] 21.439 ms
    firefox - 17664 : 6840 [ 27.4% ] 133.642 ms
    firefox - 17667 : 46 [ 0.2% ] 0.682 ms
    [root@sandy ~]#

    This is equivalent to the 'perf trace summary' subcommand in the
    tmp.perf/trace2 branch.

    Another example, setting a huge duration filter to get just a system
    wide summary:

    [root@sandy ~]# perf trace --duration 10000.0 --sched
    ^C
    _____________________________________________________________________
    __) Summary of events (__

    [ task - pid ] [ events ] [ ratio ] [ runtime ]
    _____________________________________________________________________

    scsi_eh_1 - 258 : 15 [ 0.0% ] 0.133 ms
    kworker/0:1H - 322 : 13 [ 0.0% ] 0.032 ms
    jbd2/dm-0-8 - 384 : 4 [ 0.0% ] 0.115 ms
    flush-253:0 - 470 : 1 [ 0.0% ] 0.027 ms
    firefox - 950 : 4783 [ 0.1% ] 24.863 ms
    firefox - 992 : 1883 [ 0.1% ] 6.808 ms
    firefox - 995 : 35 [ 0.0% ] 0.111 ms
    ksoftirqd/6 - 4362 : 2 [ 0.0% ] 0.005 ms
    ksoftirqd/7 - 4365 : 1 [ 0.0% ] 0.007 ms
    Xorg - 4671 : 148 [ 0.0% ] 0.912 ms
    gnome-settings- - 4846 : 14 [ 0.0% ] 0.086 ms
    seahorse-daemon - 4847 : 14 [ 0.0% ] 0.092 ms
    gnome-panel - 4875 : 46 [ 0.0% ] 0.159 ms
    gnome-power-man - 4918 : 16 [ 0.0% ] 0.065 ms
    gvfs-afc-volume - 4992 : 77 [ 0.0% ] 0.136 ms
    gnome-screensav - 5114 : 24 [ 0.0% ] 0.128 ms
    xchat - 8082 : 466 [ 0.0% ] 2.019 ms
    synergyc - 8369 : 941 [ 0.0% ] 3.291 ms
    synergyc - 8371 : 85 [ 0.0% ] 1.817 ms
    jbd2/dm-4-8 - 9352 : 4 [ 0.0% ] 0.109 ms
    rpcbind - 9786 : 3 [ 0.0% ] 0.017 ms
    rtkit-daemon - 12802 : 10 [ 0.0% ] 0.038 ms
    rtkit-daemon - 12803 : 8 [ 0.0% ] 0.000 ms
    udisks-daemon - 13020 : 27 [ 0.0% ] 0.240 ms
    kworker/7:0 - 14651 : 669 [ 0.0% ] 2.616 ms
    kworker/5:1 - 16220 : 2 [ 0.0% ] 0.069 ms
    kworker/4:0 - 19776 : 13 [ 0.0% ] 0.176 ms
    openvpn - 20131 : 133 [ 0.0% ] 0.762 ms
    plugin-containe - 20508 : 60658 [ 1.7% ] 131.153 ms
    npviewer.bin - 20520 : 72208 [ 2.0% ] 138.945 ms
    npviewer.bin - 20542 : 35 [ 0.0% ] 0.074 ms
    npviewer.bin - 20543 : 30 [ 0.0% ] 0.074 ms
    npviewer.bin - 20547 : 35 [ 0.0% ] 0.092 ms
    npviewer.bin - 20552 : 35 [ 0.0% ] 0.093 ms
    sshd - 20645 : 32 [ 0.0% ] 0.071 ms
    npviewer.bin - 21053 : 35 [ 0.0% ] 0.074 ms
    npviewer.bin - 21054 : 35 [ 0.0% ] 0.097 ms
    kworker/0:2 - 21169 : 149 [ 0.0% ] 1.143 ms
    kworker/3:0 - 22171 : 113 [ 0.0% ] 96.892 ms
    flush-253:4 - 22410 : 1 [ 0.0% ] 0.028 ms
    kworker/6:0 - 24581 : 25 [ 0.0% ] 0.275 ms
    kworker/1:0 - 25572 : 4 [ 0.0% ] 0.103 ms
    kworker/2:1 - 26299 : 138 [ 0.0% ] 1.440 ms
    kworker/0:0 - 26325 : 1 [ 0.0% ] 0.003 ms
    perf - 26330 : 3506967 [ 96.1% ] 6648.310 ms
    [root@sandy ~]#

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/n/tip-mzuli0srnxyi1o029py6537x@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • There's a portion in the "perf list" output referring to the exact
    specification of raw hardware events.

    Since this description is in the perf-list manpage, try to build and
    install the man pages, warning the user when that is not possible
    due to missing packages (xmlto and asciidoc).

    Signed-off-by: Borislav Petkov
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/n/tip-ij71ysszkdvz3fy3wr331bke@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Borislav Petkov
     
  • Example:

    [acme@sandy linux]$ perf trace --duration 0.025 usleep 1
    2.221 ( 0.958 ms): 6724 execve(arg0: 140733557168278, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = -2
    3.690 ( 1.443 ms): 6724 execve(arg0: 140733557168295, arg1: 140733557178768, arg2: 16134304, arg3: 140733557167840, arg4: 7955998171588342573, arg5: 6723) = 0
    3.979 ( 0.048 ms): 6724 open(filename: 208733843841, flags: 0, mode: 1 ) = 3
    4.071 ( 0.075 ms): 6724 open(filename: 139744419925673, flags: 0, mode: 0 ) = 3
    4.318 ( 0.056 ms): 6724 nanosleep(rqtp: 140734030404608, rmtp: 0 ) = 0
    [acme@sandy linux]$ perf trace --duration 0.100 usleep 1
    1.143 ( 1.021 ms): 6726 execve(arg0: 140736323962279, arg1: 140736323972752, arg2: 34926752, arg3: 140736323961824, arg4: 7955998171588342573, arg5: 6725) = 0
    [acme@sandy linux]$
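
    The effect of the filter in the example above (only syscalls that took at
    least the given number of milliseconds are shown) can be sketched by
    applying the same rule to already captured lines. The regular expression
    and function name are assumptions based on the line layout shown here, and
    treating the threshold as inclusive is also an assumption.

```python
import re

# timestamp ( duration ms): pid syscall(
LINE_RE = re.compile(r"^\s*([\d.]+) \(\s*([\d.]+) ms\): (\d+) (\w+)\(")


def filter_by_duration(lines, min_ms):
    """Keep (syscall, duration_ms) for lines whose duration is >= min_ms."""
    kept = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and float(m.group(2)) >= min_ms:
            kept.append((m.group(4), float(m.group(2))))
    return kept


lines = [
    "3.979 ( 0.048 ms): 6724 open(filename: 208733843841, flags: 0, mode: 1 ) = 3",
    "4.071 ( 0.075 ms): 6724 open(filename: 139744419925673, flags: 0, mode: 0 ) = 3",
    "4.318 ( 0.056 ms): 6724 nanosleep(rqtp: 140734030404608, rmtp: 0 ) = 0",
]
slow = filter_by_duration(lines, 0.050)
```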

    Cherry picked from tmp.perf/trace2 branch.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/n/tip-oslw2j2958we9qf0ctra4whd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

09 Oct, 2012

1 commit

  • Add documentation for cross-compiling on Android including:

    () instructions on how to set the Android NDK environment
    () how to cross-compile perf for Android
    () how to install on an Android device/emulator, set the runtime
    environment and run it

    Signed-off-by: Irina Tirdea
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1349678613-7045-4-git-send-email-irina.tirdea@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Irina Tirdea
     

06 Oct, 2012

6 commits

  • Adding -F option to display the formula for specified computation.

    This is mainly to facilitate debugging, but can be useful anyway.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-7-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding -p option to show period values for both compared hist entries.
    This shows the hist column PERF_HPP__PERIOD and the newly added hist
    column PERF_HPP__PERIOD_BASELINE.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-6-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding 'wdiff' as a new computation to compare hist entries.

    If specified, the 'Weighted diff' column is displayed with the value 'd'
    computed as:

    d = B->period * WEIGHT-A - A->period * WEIGHT-B

    - A/B being matching hist entry from first/second file specified
    (or perf.data/perf.data.old) respectively.
    - period being the hist entry period value
    - WEIGHT-A/WEIGHT-B being user supplied weights in the '-c' option
    behind the ':' separator, like '-c wdiff:1,2'.
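
    The formula can be tried out with a minimal sketch. The helper names
    are hypothetical and this is not the perf implementation. It assumes
    that the two numbers after ':' are given in the order WEIGHT-A,
    WEIGHT-B and that they are integers. The period values in the usage
    lines are made up.

```python
def parse_wdiff_option(value):
    """Split a '-c' style value such as 'wdiff:1,2' into two weights.

    Assumption: the weights appear in the order WEIGHT-A, WEIGHT-B.
    """
    name, _, weights = value.partition(":")
    if name != "wdiff":
        raise ValueError("not a wdiff option: %r" % value)
    weight_a, weight_b = (int(w) for w in weights.split(","))
    return weight_a, weight_b


def wdiff(period_a, period_b, weight_a, weight_b):
    """d = B->period * WEIGHT-A - A->period * WEIGHT-B"""
    return period_b * weight_a - period_a * weight_b


# Made-up periods for one matching pair of hist entries.
weight_a, weight_b = parse_wdiff_option("wdiff:1,2")
d = wdiff(period_a=100, period_b=300, weight_a=weight_a, weight_b=weight_b)
```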

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-5-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding support to sort hist entries based on the outcome of the
    selected computation. It's now possible to specify '+' as the first
    character of the '-c' option value to sort by that outcome.

    Example:

    $ perf diff -c ratio -b
    # Event 'cache-misses'
    #
    # Baseline           Ratio  Shared Object      Symbol
    # ........  ..............  .................  ................................
    #
        19.64%            0.69  [kernel.kallsyms]  [k] clear_page
         0.30%            0.17  [kernel.kallsyms]  [k] mm_alloc
         0.04%            0.20  [kernel.kallsyms]  [k] kmem_cache_alloc

    $ perf diff -c +ratio -b
    # Event 'cache-misses'
    #
    # Baseline           Ratio  Shared Object      Symbol
    # ........  ..............  .................  ................................
    #
        19.64%            0.69  [kernel.kallsyms]  [k] clear_page
         0.04%            0.20  [kernel.kallsyms]  [k] kmem_cache_alloc
         0.30%            0.17  [kernel.kallsyms]  [k] mm_alloc
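
    The difference between the two listings can be reproduced with a
    short sketch. It is not the perf code: the entry layout and the
    order_entries helper are hypothetical, and it assumes, judging from
    the output above, that the default order is by baseline and that a
    leading '+' orders by the computed value, both descending.

```python
# The three entries from the example output above.
entries = [
    {"baseline": 19.64, "ratio": 0.69, "symbol": "clear_page"},
    {"baseline": 0.30, "ratio": 0.17, "symbol": "mm_alloc"},
    {"baseline": 0.04, "ratio": 0.20, "symbol": "kmem_cache_alloc"},
]


def order_entries(entries, compute_opt):
    """Order entries the way the '-c' value asks for (assumed semantics).

    A leading '+' selects the computed column as the sort key;
    otherwise the baseline column is used.
    """
    key = compute_opt[1:] if compute_opt.startswith("+") else "baseline"
    return sorted(entries, key=lambda entry: entry[key], reverse=True)


by_baseline = [e["symbol"] for e in order_entries(entries, "ratio")]
by_ratio = [e["symbol"] for e in order_entries(entries, "+ratio")]
```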

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-4-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding -c option to select the computation method, with the current
    'Delta' computation as the default. The currently possible values of
    this option are 'delta' and 'ratio'.

    Adding 'ratio' as a new computation to compare hist entries. If
    specified, the 'Ratio' column is displayed with the value 'r' computed as:

    r = A->period / B->period

    with:
    - A/B being matching hist entry from first/second file specified
    (or perf.data/perf.data.old) respectively.
    - period being the hist entry period value
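
    As a rough sketch of the above, with hypothetical names and made-up
    period values, the ratio can be computed for every symbol that has a
    matching entry in both files. Skipping entries whose B period is
    missing or zero is an assumption of this sketch, not something the
    commit states.

```python
def ratios(periods_a, periods_b):
    """Return r = A->period / B->period for each matching hist entry.

    periods_a / periods_b map a symbol name to its period value in the
    first / second file. Entries without a usable match are skipped
    (assumption of this sketch).
    """
    result = {}
    for symbol, period_a in periods_a.items():
        period_b = periods_b.get(symbol)
        if period_b:
            result[symbol] = period_a / period_b
    return result


# Made-up period values, for illustration only.
first_file = {"clear_page": 50, "mm_alloc": 30, "only_in_a": 7}
second_file = {"clear_page": 100, "mm_alloc": 60, "only_in_b": 9}

r = ratios(first_file, second_file)
```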

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Adding -b option to the perf diff command to display only entries with
    a match in the baseline.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1349448287-18919-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

27 Sep, 2012

1 commit

  • Initially it should look loosely like the venerable 'strace' tool, but
    using the infrastructure in the perf tools to allow tracing extra
    targets:

    [acme@sandy linux]$ perf trace --hell
    Error: unknown option `hell'

    usage: perf trace

    -p, --pid         trace events on existing process id
        --tid         trace events on existing thread id
        --all-cpus    system-wide collection from all CPUs
        --cpu         list of cpus to monitor
        --no-inherit  child tasks do not inherit counters
        --mmap-pages  number of mmap data pages
        --uid         user to profile

    [acme@sandy linux]$

    Those should have the same semantics as when used with 'perf record'.

    It gets stuck sometimes, but hey, it works sometimes too!

    In time it should support perf.data based workloads, i.e. it should have
    a

    -o filename

    command line option that produces a perf.data file that can then be
    used with 'perf trace' or any of the other perf tools (script, report,
    etc).

    It will also eventually have the set of functionalities described in the
    previous 'trace' prototype by Thomas Gleixner:

    "Announcing a new utility: 'trace'"
    http://lwn.net/Articles/415728/

    Also planned is to have some of the features suggested in the comments
    of that LWN article.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/n/tip-v9x3q9rv4caxtox7wtjpchq5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo