23 Oct, 2015

1 commit

  • commit 601083cffb7cabdcc55b8195d732f0f7028570fa upstream.

    print_aggr() fails to print per-core/per-socket statistics after commit
    582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
    if events have different cpus, because in print_aggr() aggr_get_id needs
    an index (not a cpu id) to find the core/pkg id. Also, the evsel cpu maps
    should be used to get the aggregated id.

    Here is an example:

    Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
    cpumask 0,18)

    $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2

    Without this patch, it fails to get the CPU 18 result.

    Performance counter stats for 'CPU(s) 0,18':

    S0-C0 1 7526851 cycles
    S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/
    S1-C0 0 <not counted> cycles
    S1-C0 0 <not counted> MiB uncore_imc_0/cas_count_read/

    With this patch, it can get both the CPU0 and CPU18 results.

    Performance counter stats for 'CPU(s) 0,18':

    S0-C0 1 6327768 cycles
    S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/
    S1-C0 1 330228 cycles
    S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/
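
    The index versus cpu id distinction can be sketched as follows. This is
    an illustration only, not the perf code: struct cpu_map_sketch,
    socket_of_cpu() and the 18-cpus-per-socket topology are assumptions made
    up for the example.

```c
#include <stddef.h>

/* Hypothetical cpu map: the cpu ids an event was opened on. */
struct cpu_map_sketch {
	int nr;      /* number of valid entries in map[] */
	int map[8];  /* cpu ids, e.g. {0, 18} */
};

/* Assumed topology for illustration only: 18 cpus per socket. */
static int socket_of_cpu(int cpu)
{
	return cpu / 18;
}

/*
 * Takes an index into the map, not a cpu id, and returns the socket id,
 * or -1 when the index is out of range.
 */
static int map_get_socket(const struct cpu_map_sketch *m, int idx)
{
	if (m == NULL || idx < 0 || idx >= m->nr)
		return -1;
	return socket_of_cpu(m->map[idx]);
}
```

    With a map of {0, 18}, index 1 resolves to socket 1, while passing the
    cpu id 18 as if it were an index falls outside the map and finds nothing.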

    Signed-off-by: Kan Liang
    Acked-by: Jiri Olsa
    Acked-by: Stephane Eranian
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
    Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Kan Liang
     

26 Mar, 2015

1 commit

  • Use of a bad filter currently generates the message:
    Error: failed to set filter with 22 (Invalid argument)

    Add the event name to make it clear to which event the filter
    failed to apply:
    Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argument

    To test it use something like:

    # perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1
    Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument)
    #
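
    A minimal sketch of building such a message, assuming a hypothetical
    helper format_filter_error() (not a perf function) that follows the
    layout of the test output above:

```c
#include <stdio.h>
#include <string.h>

/*
 * Format a filter failure so that both the filter string and the event
 * name appear in the message. Returns what snprintf() returns.
 */
static int format_filter_error(char *buf, size_t len, const char *filter,
			       const char *event, int err)
{
	return snprintf(buf, len,
			"failed to set filter \"%s\" on event %s with %d (%s)",
			filter, event, err, strerror(err));
}
```

    The error text itself comes from strerror(), so it depends on the C
    library in use.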

    Based-on-a-patch-by: David Ahern
    Acked-by: David Ahern
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

13 Mar, 2015

3 commits

    When cycles or instructions do not print anything, as in the
    --per-socket or --per-core modes, the ratio column was not correctly
    indented for them. This led to some ratios not lining up with the
    others. Always indent correctly when nothing is printed.

    Signed-off-by: Andi Kleen
    Link: http://lkml.kernel.org/r/1426087682-22765-3-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
    perf stat didn't compute the IPC and other formulas for individual CPUs
    with -A. Fix this for the easy -A case. As before, --per-core and
    --per-socket do not handle it; they simply print nothing.

    Signed-off-by: Andi Kleen
    Link: http://lkml.kernel.org/r/1426087682-22765-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
    The information on how long a counter ran in 'perf stat' can be quite
    interesting for other tools to judge how trustworthy a measurement is.

    Currently it is only output in non-CSV mode.

    This patch makes perf stat always output the running time and the
    enabled/running ratio in CSV mode.

    This adds two new fields at the end for each line. I assume that
    existing tools ignore new fields at the end, so it's on by default.

    Only CSV mode is affected, no difference otherwise.
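
    The two extra fields could be produced along these lines. This is a
    sketch under assumptions: format_running_csv() is a hypothetical name,
    and reporting 100.00 for a counter that was never enabled is a choice
    made here to avoid printing nan.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Append the running time and the running/enabled percentage as two
 * separator-prefixed CSV fields.
 */
static int format_running_csv(char *buf, size_t len, const char *sep,
			      uint64_t run, uint64_t ena)
{
	/* Guard the division so a never-enabled counter does not print nan. */
	double pct = ena ? 100.0 * (double)run / (double)ena : 100.0;

	return snprintf(buf, len, "%s%" PRIu64 "%s%.2f", sep, run, sep, pct);
}
```

    For run=500 and ena=1000 with "," as separator this yields ",500,50.00".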

    v2: Add extra print_running function
    v3: Avoid printing nan
    v4: Remove some elses and add brackets.
    v5: Move non CSV case into print_running

    Signed-off-by: Andi Kleen
    Reviewed-by: Jiri Olsa
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1426083387-17006-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

02 Mar, 2015

1 commit

    Commit 1971f59 ("perf stat: Use read_counter in read_counter_aggr")
    broke the perf stat output for unsupported counters.

    $ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
    Warning:
    CCI_400/config=24/ event is not supported by the kernel.

    Performance counter stats for 'system wide':

    0 CCI_400/config=24/

    1.080265400 seconds time elapsed

    Where it used to be:

    $ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
    Warning:
    CCI_400/config=24/ event is not supported by the kernel.

    Performance counter stats for 'system wide':

    <not supported> CCI_400/config=24/

    1.083840675 seconds time elapsed

    This patch fixes the issues by checking if the counter is supported,
    before reading and logging the counter value.
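
    The idea of checking support before printing a value can be sketched
    like this. struct counter_sketch and format_counter() are hypothetical;
    the "<not supported>" marker is what perf stat prints for such events.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-event state, for illustration only. */
struct counter_sketch {
	const char *name;
	bool supported;
	uint64_t val;
};

/* Print a marker instead of a value when the event is unsupported. */
static int format_counter(char *buf, size_t len,
			  const struct counter_sketch *c)
{
	if (!c->supported)
		return snprintf(buf, len, "<not supported> %s", c->name);
	return snprintf(buf, len, "%" PRIu64 " %s", c->val, c->name);
}
```

    An unsupported event thus never shows a misleading 0 count.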

    Signed-off-by: Suzuki K. Poulose
    Acked-by: David Ahern
    Tested-by: David Ahern
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1423852858-8455-1-git-send-email-suzuki.poulose@arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Suzuki K. Poulose
     

22 Jan, 2015

1 commit

  • Janitorial stuff: boredom moment.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

02 Dec, 2014

5 commits

    The .snapshot file indicates that the provided event value is a snapshot
    value. Bypass the delta computation logic for such events.
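
    A sketch of the bypass, assuming a hypothetical struct event_sketch
    that remembers the previous raw value (the names are not from perf):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event state: snapshot flag plus the previous raw value. */
struct event_sketch {
	bool snapshot;
	uint64_t prev;
};

/*
 * Value to report for one interval: a snapshot event reports the value
 * as read, a regular counter reports the delta since the last read.
 */
static uint64_t interval_value(struct event_sketch *ev, uint64_t cur)
{
	uint64_t delta;

	if (ev->snapshot)
		return cur;

	delta = cur - ev->prev;
	ev->prev = cur;
	return delta;
}
```

    Reading 100 and then 250 reports 100 and 150 for a regular counter,
    but 100 and 250 for a snapshot event.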

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1416562275-12404-12-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
    The .per-pkg file indicates that all but one value per socket should be
    discarded. Add the logic for skipping the rest of the socket once the
    first value was read.
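
    One way to express "keep the first value per socket" is a bitmask of
    sockets already seen. This is a sketch only; per_pkg_skip() and the
    64-socket limit are assumptions for the example.

```c
#include <stdbool.h>

#define MAX_SOCKETS 64

/*
 * Returns true when a value from this socket was already taken and the
 * current one should be skipped. The first call per socket returns false
 * and records the socket in *seen.
 */
static bool per_pkg_skip(unsigned long long *seen, int socket)
{
	unsigned long long bit;

	if (socket < 0 || socket >= MAX_SOCKETS)
		return false;

	bit = 1ULL << socket;
	if (*seen & bit)
		return true;

	*seen |= bit;
	return false;
}
```

    The mask has to be cleared before each new read pass so that every
    socket contributes one value again.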

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1416562275-12404-11-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Use the read_counter function as the values retrieval function for aggr
    counter values thus eliminating the use of __perf_evsel__read function.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1416562275-12404-7-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • The read function will be used later for both aggr and cpu counters, so
    we need to make it work over threads as well.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1416562275-12404-6-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Replacing __perf_evsel__read_on_cpu function with perf_evsel__read_cb
    function. The read_cb callback will be used later for global aggregation
    counter values as well.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1416562275-12404-5-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

17 Oct, 2014

1 commit

  • The only thing we need is a forward declaration for 'struct cgroup_sel',
    that is inside 'struct perf_evsel'.

    Include cgroup.h instead on the tools that support cgroups.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Jean Pihet
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

26 Sep, 2014

1 commit

  • On systems with more than one socket perf stat --per-core would either
    segfault or stop before outputting all cores.

    The problem was that the output code referenced the id including the
    socket number in the higher bits, which is far beyond any per cpu array.

    Mask out the socket number before referencing cpus in abs_printout.

    I also renamed the variable in nsec_printout to be clear what it is,
    even though it doesn't reference cpus.
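
    The failure mode can be illustrated with a small sketch. The 16-bit
    split between socket and core and the helper names are assumptions for
    the example, not taken from the patch.

```c
/* Pack socket and core into one id, socket in the higher bits. */
static int make_core_id(int socket, int core)
{
	return (socket << 16) | (core & 0xffff);
}

/* Mask out the socket number to get a value usable as an array index. */
static int id_to_core(int id)
{
	return id & 0xffff;
}

/* Recover the socket number from the higher bits. */
static int id_to_socket(int id)
{
	return (id >> 16) & 0xffff;
}
```

    Core 3 on socket 1 packs to 65539, far beyond any per cpu array, while
    the masked value is 3 again.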

    Signed-off-by: Andi Kleen
    Acked-by: Stephane Eranian
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1411591846-32736-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

16 Aug, 2014

1 commit

  • Use strerror_r instead of strerror in error message for thread-safety.

    Signed-off-by: Masami Hiramatsu
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Naohiro Aota
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140814022255.3545.81549.stgit@kbuild-fedora.novalocal
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu
     

27 Jun, 2014

1 commit

  • Check real allocated pointer for NULL.

    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-5rfzbalwjphmdzzil74eazyl@git.kernel.org
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     

14 Apr, 2014

1 commit

    perf stat initialized the stats structure used to compute
    stddev etc. incorrectly: it merely zeroed it. But one member
    (min) needs to be set to a non-zero value. This causes min
    to not be computed at all. Call init_stats() correctly.

    It doesn't matter for stat currently because it doesn't use
    min, but it's still better to do it correctly.

    The other users of statistics are already correct.
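
    Why zeroing is not enough can be shown with a sketch of a running
    statistics structure. The _sketch names are hypothetical and the layout
    is an assumption; only the min handling is the point.

```c
#include <stdint.h>

/* Running statistics (Welford's method) plus min/max tracking. */
struct stats_sketch {
	double n, mean, M2;
	uint64_t max, min;
};

/* min must start at the largest value, otherwise no sample is below it. */
static void init_stats_sketch(struct stats_sketch *s)
{
	s->n = 0.0;
	s->mean = 0.0;
	s->M2 = 0.0;
	s->min = UINT64_MAX;
	s->max = 0;
}

static void update_stats_sketch(struct stats_sketch *s, uint64_t val)
{
	double delta;

	s->n++;
	delta = (double)val - s->mean;
	s->mean += delta / s->n;
	s->M2 += delta * ((double)val - s->mean);

	if (val > s->max)
		s->max = val;
	if (val < s->min)
		s->min = val;
}
```

    With a merely zeroed structure min stays 0 forever, since no unsigned
    sample can be smaller than 0.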

    Signed-off-by: Andi Kleen
    Acked-by: Namhyung Kim
    Link: http://lkml.kernel.org/r/1395768699-16060-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Jiri Olsa

    Andi Kleen
     

13 Jan, 2014

6 commits

  • For the common evsel list traversal, so that it becomes more compact.

    Use the opportunity to start ditching the 'perf_' from 'perf_evlist__',
    as discussed, as the whole conversion touches a lot of places, lets do
    it piecemeal when we have the chance due to other work, like in this
    case.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • That 'argc' argument _is_ being used.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-t2gsxc15zulkorieg8zq996o@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Instead of requiring tools to do an extra destructor call just before
    calling perf_evlist__delete.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-0jd2ptzyikxb5wp7inzz2ah2@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we have the boilerplate in the preparation method, instead of
    open coded in tools wanting the reporting when the exec fails.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-purbdzcphdveskh7wwmnm4t7@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When a tool uses perf_evlist__start_workload and the supplied workload
    fails (e.g.: its binary wasn't found), perror was being used to print
    the error reason.

    This is undesirable, as the caller may be a GUI that wants to have
    total control of the error reporting process.

    So move to using sigaction(SA_SIGINFO) + siginfo_t->si_value.sival_int
    to communicate the errno to the caller and let it print it using the UI
    of its choosing.
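
    The mechanism can be sketched as below. This is not the evlist code:
    the function names, the use of SIGUSR1 and the global variable are
    assumptions for the example. The sender attaches the errno with
    sigqueue() and the receiver picks it up from si_value.sival_int.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* errno reported by the workload side, 0 if nothing was reported. */
static volatile sig_atomic_t workload_errno;

/* SA_SIGINFO handler: the payload travels in si_value.sival_int. */
static void exec_error_handler(int signo, siginfo_t *info, void *ucontext)
{
	(void)signo;
	(void)ucontext;
	workload_errno = info->si_value.sival_int;
}

/* Caller side: install the handler before starting the workload. */
static int install_exec_error_handler(void)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	sigemptyset(&act.sa_mask);
	act.sa_flags = SA_SIGINFO;
	act.sa_sigaction = exec_error_handler;
	return sigaction(SIGUSR1, &act, NULL);
}

/* Workload side: after a failed exec, send errno to the given process. */
static int report_exec_error(pid_t target, int err)
{
	union sigval val;

	val.sival_int = err;
	return sigqueue(target, SIGUSR1, val);
}
```

    The caller can then turn workload_errno into a message in whatever UI
    it uses, instead of having perror() write to stderr behind its back.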

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-epgcv7kjq8ll2udqfken92pz@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When starting a workload 'stat' wasn't using prepare_workload evlist
    method's signal based exec() error reporting mechanism.

    Use it so that we don't report 'not counted' counters.

    Before:

    [acme@zoo linux]$ perf stat dfadsfa
    dfadsfa: No such file or directory

    Performance counter stats for 'dfadsfa':

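    <not counted> task-clock
    <not counted> context-switches
    <not counted> cpu-migrations
    <not counted> page-faults
    <not counted> cycles
    <not counted> stalled-cycles-frontend
    <not counted> stalled-cycles-backend
    <not counted> instructions
    <not counted> branches
    <not counted> branch-misses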

    0.001831462 seconds time elapsed

    [acme@zoo linux]$

    After:

    [acme@zoo linux]$ perf stat dfadsfa
    dfadsfa: No such file or directory
    [acme@zoo linux]$

    Reported-by: David Ahern
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-5yui3bv7e3hitxucnjsn6z8q@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

28 Dec, 2013

1 commit

  • For the frequent idiom of:

    free(ptr);
    ptr = NULL;

    Make it expect a pointer to the pointer being freed, so that it becomes
    clear at first sight that the variable being freed is being modified.
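
    A minimal sketch of such a helper in portable C. The name zfree and the
    macro form are assumptions here, since the text above does not spell
    them out.

```c
#include <stdlib.h>

/*
 * Free the object a pointer variable refers to and reset the variable.
 * Takes the address of the pointer, so the call site shows that the
 * variable itself is modified.
 */
#define zfree(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)
```

    A call then reads zfree(&ptr); and calling it twice is harmless because
    free(NULL) is a no-op.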

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-pfw02ezuab37kha18wlut7ir@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

27 Nov, 2013

1 commit

  • This patch adds perf stat support for handling event units and
    scales as exported by the kernel.

    The kernel can export a PMU event's actual unit and scaling factor
    via sysfs:

    $ ls -1 /sys/devices/power/events/energy-*
    /sys/devices/power/events/energy-cores
    /sys/devices/power/events/energy-cores.scale
    /sys/devices/power/events/energy-cores.unit
    /sys/devices/power/events/energy-pkg
    /sys/devices/power/events/energy-pkg.scale
    /sys/devices/power/events/energy-pkg.unit
    $ cat /sys/devices/power/events/energy-cores.scale
    2.3283064365386962890625e-10
    $ cat /sys/devices/power/events/energy-cores.unit
    Joules

    This patch modifies the pmu event alias code to check
    for the presence of the .unit and .scale files to load
    the corresponding values. They are then used by perf stat
    transparently:

    # perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
    # time counts unit events
    1.000214717 3.07 Joules power/energy-pkg/ [100.00%]
    1.000214717 0.53 Joules power/energy-cores/
    1.000214717 12965028 cycles [100.00%]
    2.000749289 3.01 Joules power/energy-pkg/
    2.000749289 0.52 Joules power/energy-cores/
    2.000749289 15817043 cycles

    When the event does not have an explicit unit exported by
    the kernel, nothing is printed. In csv output mode, there
    will be an empty field.
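
    How a .scale value turns a raw count into the printed number can be
    sketched as follows. parse_scale() and scaled_count() are hypothetical
    helpers, and reading the sysfs file is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse the text of a .scale file; returns 0 on success, -1 otherwise. */
static int parse_scale(const char *text, double *scale)
{
	char *end;
	double v = strtod(text, &end);

	if (end == text)
		return -1;

	*scale = v;
	return 0;
}

/* Value to print: the raw counter value multiplied by the scale. */
static double scaled_count(uint64_t raw, double scale)
{
	return (double)raw * scale;
}
```

    The scale shown above is 2^-32, so a raw energy-cores count of
    4294967296 corresponds to 1 Joule. Note that strtod() is locale
    dependent; the sketch assumes the default "C" locale.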

    Special thanks to Jiri for providing the supporting code
    in the parser to trigger reading of the scale and unit files.

    Signed-off-by: Stephane Eranian
    Reviewed-by: Jiri Olsa
    Reviewed-by: Andi Kleen
    Signed-off-by: Peter Zijlstra
    Cc: zheng.z.yan@intel.com
    Cc: bp@alien8.de
    Cc: maria.n.dimakopoulou@gmail.com
    Cc: acme@redhat.com
    Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Stephane Eranian
     

13 Nov, 2013

1 commit

    Getting unwieldily long; for this app domain it should be descriptive
    enough, and the use of __ to separate the class from the method names
    should help with avoiding clashes with other code bases.

    Reported-by: David Ahern
    Suggested-by: Ingo Molnar
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

04 Nov, 2013

1 commit

  • Print related option help messages only when it failed to process
    options. While at it, modify parse_options_usage() to skip usage part
    so that it can be used for showing multiple option help messages
    naturally like below:

    $ perf stat -Bx, ls
    -B option not supported with -x

    usage: perf stat [<options>] [<command>]

    -B, --big-num print large numbers with thousands' separators
    -x, --field-separator
    print counts with custom separator

    Signed-off-by: Namhyung Kim
    Acked-by: Ingo Molnar
    Reviewed-by: Ingo Molnar
    Enthusiastically-Supported-by: Ingo Molnar
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1383291195-24386-6-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

11 Oct, 2013

5 commits

  • Ingo pointed out that the task-clock counter should have the units
    explicitly stated since it is not a counter.

    Before:

    perf stat -a -- sleep 1

    Performance counter stats for 'sleep 1':

    16186.874834 task-clock # 16.154 CPUs utilized
    ...

    After:

    perf stat -a -- sleep 1

    Performance counter stats for 'system wide':

    16146.402138 task-clock (msec) # 16.125 CPUs utilized
    ...

    Reported-by: Ingo Molnar
    Signed-off-by: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1380400080-9211-4-git-send-email-dsahern@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
    The "perf stat" command can count system wide or on one or more cpus.
    For these options, do not require a workload to be specified.

    v2: use perf_target__none per Namhyung's comment.

    Signed-off-by: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/52497F3C.9070908@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • The "perf stat" tool displays the command run in its summary output
    which is misleading when using a cpu list or system wide collection.

    Before:

    perf stat -a -- sleep 1

    Performance counter stats for 'sleep 1':

    16152.670249 task-clock # 16.132 CPUs utilized
    417 context-switches # 0.002 M/sec
    7 cpu-migrations # 0.030 K/sec
    ...

    After:

    perf stat -a -- sleep 1

    Performance counter stats for 'system wide':

    16206.931120 task-clock # 16.144 CPUs utilized
    395 context-switches # 0.002 M/sec
    5 cpu-migrations # 0.030 K/sec
    ...

    or

    perf stat -C1 -- sleep 1

    Performance counter stats for 'CPU(s) 1':

    1001.669257 task-clock # 1.000 CPUs utilized
    4,264 context-switches # 0.004 M/sec
    3 cpu-migrations # 0.003 K/sec
    ...

    Signed-off-by: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1380400080-9211-2-git-send-email-dsahern@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • When only the instructions event is requested:

    $ perf stat -e instructions git s
    M builtin-stat.c

    Performance counter stats for 'git s':

    917,453,420 instructions # 0.00 insns per cycle

    0.213002926 seconds time elapsed

    The 0.00 insns per cycle comment in the output is totally bogus and
    misleading. It happens because update_shadow_stats() doesn't touch
    runtime_cycles_stats when only the instructions event is requested. So,
    omit printing the bogus data altogether.
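
    The guard amounts to not dividing by a cycles count that was never
    collected. A sketch, with format_ipc() as a hypothetical helper:

```c
#include <stdio.h>

/*
 * Write the "insns per cycle" comment into buf, or an empty string when
 * no cycles were counted. Returns the length of the comment.
 */
static int format_ipc(char *buf, size_t len, double instructions,
		      double cycles)
{
	if (cycles == 0.0) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, "# %.2f insns per cycle",
			instructions / cycles);
}
```

    With only the instructions event requested, the cycles total is 0 and
    the comment is left out instead of showing 0.00.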

    Signed-off-by: Ramkumar Ramachandra
    Acked-by: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1380616604-4077-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • When only the cycles event is requested:

    $ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
    1000000+0 records in
    1000000+0 records out
    512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s

    Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

    911,626,453 cycles # 0.000 GHz

    0.262113350 seconds time elapsed

    The 0.000 GHz comment in the output is totally bogus and misleading. It
    happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
    it is only written when a requested counter matches a SW_TASK_CLOCK. In
    our case, since we have only requested HW_CPU_CYCLES,
    runtime_nsecs_stats is unavailable. So, omit printing the comment
    altogether.

    Signed-off-by: Ramkumar Ramachandra
    Acked-by: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1380539585-23859-3-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     

08 Oct, 2013

1 commit


05 Oct, 2013

1 commit

  • The commit acf2892270dc ("perf stat: Use perf_evlist__prepare/
    start_workload()") converted to use the function but forgot to update
    child_pid. Fix it.

    Signed-off-by: Namhyung Kim
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1380531671-28076-1-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

04 Oct, 2013

1 commit

    Add support to perf stat to print the basic transactional execution statistics:
    Total cycles, Cycles in Transaction, Cycles in aborted transactions
    using the in_tx and in_tx_checkpoint qualifiers.
    Transaction Starts and Elision Starts, to compute the average transaction
    length.

    This gives a reasonable overview of the success of the transactions.

    Also support architectures that have a transaction aborted cycles
    counter like POWER8. Since that is awkward to abstract in the kernel,
    handle both cases here.

    Enable with a new --transaction / -T option.

    This requires measuring these events in a group, since they depend on each
    other.

    This is implemented by using TM sysfs events exported by the kernel.

    Signed-off-by: Andi Kleen
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
     

08 Aug, 2013

2 commits

    When interval mode is outputting to a pipe, each measurement should be
    flushed individually, so that the reader sees it in a timely manner.

    With a terminal each line is automatically flushed by stdio, but that is
    disabled with non-terminal output.

    Simply fflush the output after each time interval.
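
    A sketch of the pattern, with print_interval_line() as a hypothetical
    helper and a simplified line layout:

```c
#include <stdio.h>

/*
 * Print one interval measurement and flush right away, so that a reader
 * on the other end of a pipe does not wait for the stdio buffer to fill.
 * Returns 0 on success, -1 on error.
 */
static int print_interval_line(FILE *out, double timestamp,
			       unsigned long long count, const char *event)
{
	if (fprintf(out, "%.9f %llu %s\n", timestamp, count, event) < 0)
		return -1;
	return fflush(out) == 0 ? 0 : -1;
}
```

    Without the fflush() a fully buffered stream may hold back many
    intervals before the reader sees the first one.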

    Signed-off-by: Andi Kleen
    Reviewed-by: Jiri Olsa
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1375490473-1503-5-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • When measuring workloads the startup phase -- doing page faults, dynamic
    linking, opening files -- is often very different from the rest of the
    workload. Especially with smaller kernels and using counter
    multiplexing this can give significant measurement errors.

    Multiplexing assumes that the workload is mostly the same over longer
    periods. But at startup there is typically some spike of activity which
    is relatively short. If many groups are multiplexing, the one group
    that sees the spike, and is then scaled up over the time needed to run
    all groups, may show a significant error.

    Also in general it's often not useful to measure the startup, because it
    is so different from the rest.

    One way around this is to use interval mode and discard the first
    sample, but this can be awkward because interval mode doesn't support
    intervals of less than 100ms, and also a useful interval is not
    necessarily the same as a useful startup delay.

    This patch adds a new --initial-delay / -D option to skip measuring for
    the startup phase. The time can be specified in ms.

    Here's a simple example:

    perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
    ...
    3,721 page-faults
    ...

    If we just wait 20 ms the number of page faults is about a quarter lower:

    perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
    ...
    2,823 page-faults
    ...

    So we filtered out most of the startup noise from bash.

    Signed-off-by: Andi Kleen
    Reviewed-by: Jiri Olsa
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     

09 Jul, 2013

2 commits

  • This patch fixes a problem reported by Andi Kleen on perf
    stat when measuring uncore events:

    # perf stat --per-socket -e uncore_pcu/event=0x0/ -I1000 -a sleep 2

    It would not report counts for the second socket. That was due to a
    cpu mapping bug in print_aggr().

    This patch also fixes the socket numbering bug for
    events.

    Reported-by: Andi Kleen
    Signed-off-by: Stephane Eranian
    Tested-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: zheng.z.yan@intel.com
    Link: http://lkml.kernel.org/r/20130705170645.GA32519@quad
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     
  • This patch fixes a problem with perf stat whereby on termination it may
    send a SIGTERM signal to random processes on systems with high PID
    recycling. I got some actual bug reports on this.

    There is a race between the SIGCHLD and sig_atexit() handlers. This patch
    addresses this problem by clearing child_pid in the SIGCHLD handler.
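
    The shape of the fix can be sketched as follows. The handler and
    function names are assumptions for the example; only child_pid is named
    in the text above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/types.h>

/* pid of the forked workload, -1 when there is none (or it has exited). */
static volatile pid_t child_pid = -1;

/* Once the child has exited its pid may be recycled, so forget it. */
static void sigchld_handler(int signo)
{
	(void)signo;
	child_pid = -1;
}

/* Exit path: only signal the child if it is still known to be ours. */
static int terminate_child(void)
{
	if (child_pid == -1)
		return 0;
	return kill(child_pid, SIGTERM);
}
```

    After SIGCHLD has been handled, the exit path finds child_pid cleared
    and sends no signal, so a process that reused the pid is left alone.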

    Signed-off-by: Stephane Eranian
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20130604154426.GA2928@quad
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

26 Mar, 2013

1 commit

  • This patch adds the --per-core option to perf stat.

    This option is used to aggregate system-wide counts
    on a per physical core basis. On processors with
    hyperthreading, this means counts of all HT threads
    running on a physical core are aggregated.

    This mode is useful to find imbalance between physical
    cores running a uniform workload. Cores are identified
    by socket: S0-C1 means physical core 1 on socket 0. Note
    that cores are identified using their physical core id,
    thus their numbering may not be continuous.

    Per core aggregation can be combined with interval printing:

    # perf stat -a --per-core -I 1000 -e cycles sleep 1000
    # time core cpus counts events
    1.000090030 S0-C0 1 4,765,747 cycles
    1.000090030 S0-C1 1 5,580,647 cycles
    1.000090030 S0-C2 1 221,181 cycles
    1.000090030 S0-C3 1 266,092 cycles
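
    The aggregation itself can be sketched as summing per cpu counts into
    one bucket per physical core. The names, the fixed MAX_CORES bound and
    the linear search are assumptions made to keep the example short.

```c
#include <stdint.h>

#define MAX_CORES 16

/* One output row: core id, number of cpus merged into it, summed count. */
struct core_total {
	int id;
	int cpus;
	uint64_t count;
};

/*
 * core_id[i] is the core identifier of cpu i and counts[i] its count.
 * Fills out[] (MAX_CORES entries) and returns the number of distinct
 * cores, or -1 if there are more than MAX_CORES.
 */
static int aggregate_per_core(const int *core_id, const uint64_t *counts,
			      int ncpus, struct core_total *out)
{
	int ncores = 0;
	int i, j;

	for (i = 0; i < ncpus; i++) {
		for (j = 0; j < ncores; j++)
			if (out[j].id == core_id[i])
				break;

		if (j == ncores) {
			if (ncores == MAX_CORES)
				return -1;
			out[j].id = core_id[i];
			out[j].cpus = 0;
			out[j].count = 0;
			ncores++;
		}

		out[j].cpus++;
		out[j].count += counts[i];
	}
	return ncores;
}
```

    With hyperthreading, two cpus map to the same core id and their counts
    end up in one row with a cpus value of 2.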

    Signed-off-by: Stephane Eranian
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
    [ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian