Eric Lee / smarc-fsl-linux-kernel

23 Oct, 2015

1 commit

63f155ca0 perf stat: Get correct cpu id for print_aggr

commit 601083cffb7cabdcc55b8195d732f0f7028570fa upstream.

print_aggr() fails to print per-core/per-socket statistics after commit
582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
if events have different cpus, because in print_aggr() aggr_get_id needs
an index (not a cpu id) to find the core/pkg id. Also, evsel cpu maps
should be used to get the aggregated id.

Here is an example:

Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)

$ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2

Without this patch, it fails to get the CPU 18 result.

Performance counter stats for 'CPU(s) 0,18':

S0-C0 1 7526851 cycles
S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/
S1-C0 0 <not counted> cycles
S1-C0 0 <not counted> MiB uncore_imc_0/cas_count_read/

With this patch, it can get both the CPU0 and CPU18 results.

Performance counter stats for 'CPU(s) 0,18':

S0-C0 1 6327768 cycles
S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/
S1-C0 1 330228 cycles
S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/
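
The index-versus-cpu-id distinction described in this commit can be sketched roughly as follows. This is not the perf source: struct cpu_map_sketch, socket_of_cpu() and socket_by_index() are hypothetical names, and the topology (CPUs 0-17 on socket 0, the rest on socket 1) is only an assumption chosen to match the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for an event's cpu map: map[idx] holds a CPU id. */
struct cpu_map_sketch {
    int nr;
    const int *map;
};

/* Assumed topology for the example machine: CPUs 0-17 sit on socket 0. */
static int socket_of_cpu(int cpu)
{
    return cpu < 18 ? 0 : 1;
}

/*
 * Look up the socket for position idx in the event's own map.
 * Passing a CPU id (such as 18) where an index is expected falls
 * outside the map and yields -1.
 */
static int socket_by_index(const struct cpu_map_sketch *cpus, int idx)
{
    if (idx < 0 || idx >= cpus->nr)
        return -1;
    return socket_of_cpu(cpus->map[idx]);
}
```

With a map of {0, 18}, index 1 resolves to socket 1, while treating the cpu id 18 as an index finds nothing, which matches the missing S1-C0 counts shown for the unpatched case.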

Signed-off-by: Kan Liang
Acked-by: Jiri Olsa
Acked-by: Stephane Eranian
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman

Kan Liang
2015-10-23 05:43:12 +0800

26 Mar, 2015

1 commit

23d4aad48 perf evlist: Return the first evsel with an invalid filter in apply_filters()

Use of a bad filter currently generates the message:
Error: failed to set filter with 22 (Invalid argument)

Add the event name to make it clear to which event the filter
failed to apply:
Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argument

To test it use something like:

# perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1
Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument)
#

Based-on-a-patch-by: David Ahern
Acked-by: David Ahern
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2015-03-26 21:52:28 +0800

13 Mar, 2015

3 commits

791035285 perf stat: Always correctly indent ratio column

When cycles or instructions do not print anything, as in the
--per-socket or --per-core modes, the ratio column was not correctly
indented for them. This led to some ratios not lining up with the
others. Always indent correctly when nothing is printed.

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1426087682-22765-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2015-03-13 18:47:44 +0800

56f0fd45d perf stat: Fix IPC and other formulas with -A

perf stat didn't compute the IPC and other formulas for individual CPUs
with -A. Fix this for the easy -A case. As before, --per-core and
--per-socket do not handle it, they simply print nothing.

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1426087682-22765-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2015-03-13 18:46:10 +0800

d73515c03 perf stat: Output running time and run/enabled ratio in CSV mode

The information on how long a counter ran in 'perf stat' can be quite
interesting for other tools to judge how trustworthy a measurement is.

Currently it is only output in non CSV mode.

This patch makes perf stat always output the running time and the
enabled/running ratio in CSV mode.

This adds two new fields at the end for each line. I assume that
existing tools ignore new fields at the end, so it's on by default.

Only CSV mode is affected, no difference otherwise.
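
A minimal sketch of appending the two trailing fields, assuming a simplified record layout. format_csv_line() is a hypothetical helper, and the field order, the "%.2f" precision and the 100.00 fallback for a counter that was never enabled are illustrative choices rather than the exact perf output format.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format one CSV record followed by the two extra fields:
 * running time and the run/enabled ratio as a percentage.
 */
static int format_csv_line(char *buf, size_t len, unsigned long long count,
                           const char *event, unsigned long long run,
                           unsigned long long ena)
{
    /* Guard the division so a never-enabled counter cannot print nan. */
    double pct = ena ? 100.0 * (double)run / (double)ena : 100.0;

    return snprintf(buf, len, "%llu,%s,%llu,%.2f", count, event, run, pct);
}
```

Because the new fields come last, a consumer that splits on the separator and reads only the leading columns keeps working.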

v2: Add extra print_running function
v3: Avoid printing nan
v4: Remove some elses and add brackets.
v5: Move non CSV case into print_running

Signed-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1426083387-17006-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2015-03-13 18:46:04 +0800

02 Mar, 2015

1 commit

3b4331d9a perf stat: Report unsupported events properly

Commit 1971f59 (perf stat: Use read_counter in read_counter_aggr)
broke the perf stat output for unsupported counters.

$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.

Performance counter stats for 'system wide':

0 CCI_400/config=24/

1.080265400 seconds time elapsed

Where it used to be:

$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.

Performance counter stats for 'system wide':

<not supported> CCI_400/config=24/

1.083840675 seconds time elapsed

This patch fixes the issue by checking if the counter is supported,
before reading and logging the counter value.

Signed-off-by: Suzuki K. Poulose
Acked-by: David Ahern
Tested-by: David Ahern
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1423852858-8455-1-git-send-email-suzuki.poulose@arm.com
Signed-off-by: Arnaldo Carvalho de Melo

Suzuki K. Poulose
2015-03-02 22:51:17 +0800

22 Jan, 2015

1 commit

48000a1ae perf tools: Remove EOL whitespaces

Janitorial stuff: boredom moment.

Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2015-01-22 00:24:31 +0800

02 Dec, 2014

5 commits

6c0345b73 perf stat: Add support for snapshot counters

The .snapshot file indicates that the provided event value is a snapshot
value. Bypass the delta computation logic for such events.

Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2014-12-02 07:00:31 +0800

779d0b997 perf stat: Add support for per-pkg counters

The .per-pkg file indicates that all but one value per socket should be
discarded. Add the logic of skipping the rest of the socket once the
first value was read.
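
The "keep one value per socket" rule might be sketched as below. struct pkg_mask, MAX_SOCKETS and both helpers are hypothetical names for illustration, not the structures used by perf.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SOCKETS 64 /* hypothetical upper bound for this sketch */

/* One flag per socket: has a value from this socket been accepted yet? */
struct pkg_mask {
    bool seen[MAX_SOCKETS];
};

static void pkg_mask_reset(struct pkg_mask *m)
{
    memset(m, 0, sizeof(*m));
}

/*
 * Return true if the value read on this socket should be discarded,
 * either because the socket was already counted or the id is out of range.
 */
static bool pkg_should_skip(struct pkg_mask *m, int socket)
{
    if (socket < 0 || socket >= MAX_SOCKETS)
        return true;
    if (m->seen[socket])
        return true;
    m->seen[socket] = true;
    return false;
}
```

The mask would be reset before each read pass, so that every pass again accepts exactly one value per socket.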

Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2014-12-02 07:00:30 +0800

1971f59f1 perf stat: Use read_counter in read_counter_aggr

Use the read_counter function as the values retrieval function for aggr
counter values thus eliminating the use of __perf_evsel__read function.

Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2014-12-02 07:00:30 +0800

9bf1a5291 perf stat: Make read_counter work over the thread dimension

The read function will be used later for both aggr and cpu counters, so
we need to make it work over threads as well.

Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2014-12-02 07:00:30 +0800

060c4f9c8 perf stat: Use perf_evsel__read_cb in read_counter

Replace the __perf_evsel__read_on_cpu function with the perf_evsel__read_cb
function. The read_cb callback will be used later for global aggregation
counter values as well.

Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2014-12-02 07:00:30 +0800

17 Oct, 2014

1 commit

f14d57078 perf evsel: No need to drag util/cgroup.h

The only thing we need is a forward declaration for 'struct cgroup_sel',
that is inside 'struct perf_evsel'.

Include cgroup.h instead on the tools that support cgroups.

Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-10-17 23:17:40 +0800

26 Sep, 2014

1 commit

da88c7f78 perf stat: Fix --per-core on multi socket systems

On systems with more than one socket perf stat --per-core would either
segfault or stop before outputting all cores.

The problem was that the output code referenced the id including the
socket number in the higher bits, which is far beyond any per cpu array.

Mask out the socket number before referencing cpus in abs_printout.
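
The masking step can be illustrated with a small sketch. The 16-bit split between socket and core is an assumption made here for illustration, and make_core_id(), id_socket() and id_core() are hypothetical helper names.

```c
/* Assumed encoding: socket in the upper bits, core in the lower 16 bits. */
static int make_core_id(int socket, int core)
{
    return (socket << 16) | (core & 0xffff);
}

/* Recover the socket number from a combined id. */
static int id_socket(int id)
{
    return id >> 16;
}

/* Mask out the socket so the result is safe to use as a per-cpu index. */
static int id_core(int id)
{
    return id & 0xffff;
}
```

For socket 1, core 3 the combined id is 65539; indexing a per-cpu array with that value instead of the masked 3 runs far past its end, which fits the segfault described.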

I also renamed the variable in nsec_printout to be clear what it is,
even though it doesn't reference cpus.

Signed-off-by: Andi Kleen
Acked-by: Stephane Eranian
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1411591846-32736-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2014-09-26 21:17:13 +0800

16 Aug, 2014

1 commit

759e612bf perf stat: Use strerror_r instead of strerror

Use strerror_r instead of strerror in error message for thread-safety.

Signed-off-by: Masami Hiramatsu
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Naohiro Aota
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20140814022255.3545.81549.stgit@kbuild-fedora.novalocal
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2014-08-16 00:08:40 +0800

27 Jun, 2014

1 commit

d180ac14a perf tools: Fix wrong condition for allocation failure

Check real allocated pointer for NULL.

Cc: Arnaldo Carvalho de Melo
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-5rfzbalwjphmdzzil74eazyl@git.kernel.org
Signed-off-by: Jiri Olsa

Jiri Olsa
2014-06-27 17:14:54 +0800

14 Apr, 2014

1 commit

90f6bb6c9 perf stat: Initialize statistics correctly

perf stat initialized the stats structure used to compute
stddev etc. incorrectly: it merely zeroed it. But one member
(min) needs to be set to a non-zero value, otherwise min
is not computed at all. Call init_stats() correctly.

It doesn't matter for stat currently because it doesn't use
min, but it's still better to do it correctly.

The other users of statistics are already correct.
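
Why zeroing is not enough for min can be shown with a small running-statistics sketch. The names carry a _sketch suffix because this is an illustration under assumed field names, not the perf implementation; the mean and M2 updates follow Welford's method.

```c
#include <stdint.h>

struct stats_sketch {
    double n, mean, M2;
    uint64_t min, max;
};

static void init_stats_sketch(struct stats_sketch *s)
{
    s->n = 0.0;
    s->mean = 0.0;
    s->M2 = 0.0;
    s->min = UINT64_MAX; /* must start high so the first sample can lower it */
    s->max = 0;
}

static void update_stats_sketch(struct stats_sketch *s, uint64_t val)
{
    double delta;

    s->n += 1.0;
    delta = (double)val - s->mean;
    s->mean += delta / s->n;
    s->M2 += delta * ((double)val - s->mean);

    if (val < s->min)
        s->min = val;
    if (val > s->max)
        s->max = val;
}
```

With a zero-filled structure, min starts at 0 and no sample can ever be smaller, so it stays 0 regardless of the data.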

Signed-off-by: Andi Kleen
Acked-by: Namhyung Kim
Link: http://lkml.kernel.org/r/1395768699-16060-1-git-send-email-andi@firstfloor.org
Signed-off-by: Jiri Olsa

Andi Kleen
2014-04-14 18:56:06 +0800

13 Jan, 2014

6 commits

0050f7aa1 perf evlist: Introduce evlist__for_each() & friends

For the common evsel list traversal, so that it becomes more compact.

Use the opportunity to start ditching the 'perf_' from 'perf_evlist__',
as discussed. As the whole conversion touches a lot of places, let's do
it piecemeal when we have the chance due to other work, like in this
case.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:25 +0800

41cde4767 perf stat: Remove misplaced __maybe_unused

That 'argc' argument _is_ being used.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-t2gsxc15zulkorieg8zq996o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:22 +0800

03ad9747c perf evlist: Move destruction of maps to evlist destructor

Instead of requiring tools to do an extra destructor call just before
calling perf_evlist__delete.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-0jd2ptzyikxb5wp7inzz2ah2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:21 +0800

735f7e0bb perf evlist: Move the SIGUSR1 error reporting logic to prepare_workload

So that we have the boilerplate in the preparation method, instead of
open coded in tools wanting the reporting when the exec fails.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-purbdzcphdveskh7wwmnm4t7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:21 +0800

f33cbe72e perf evlist: Send the errno in the signal when workload fails

When a tool uses perf_evlist__start_workload and the supplied workload
fails (e.g.: its binary wasn't found), perror was being used to print
the error reason.

This is undesirable, as the caller may be a GUI that wants to have
total control of the error reporting process.

So move to using sigaction(SA_SIGINFO) + siginfo_t->si_value.sival_int
to communicate to the caller the errno and let it print it using the UI
of its choosing.
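
A minimal sketch of passing an errno through a queued signal, assuming a Linux/POSIX system with sigqueue(). The helper names are hypothetical, and SIGUSR1 is used because this series names it as the reporting signal. This shows the general mechanism only, not the evlist code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Last errno reported by the workload side; 0 means nothing reported. */
static volatile sig_atomic_t workload_errno;

/* SA_SIGINFO handler: the sender's errno arrives in si_value.sival_int. */
static void workload_failed(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    workload_errno = info->si_value.sival_int;
}

/* Install the handler in the process that wants the report. */
static int install_failure_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_flags = SA_SIGINFO;
    act.sa_sigaction = workload_failed;
    sigemptyset(&act.sa_mask);
    return sigaction(SIGUSR1, &act, NULL);
}

/* Called by the side whose exec() failed: queue SIGUSR1 carrying err. */
static int report_exec_failure(pid_t parent, int err)
{
    union sigval val;

    val.sival_int = err;
    return sigqueue(parent, SIGUSR1, val);
}
```

The receiver can then turn workload_errno into a message with whatever UI it prefers, instead of having perror output forced on it.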

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-epgcv7kjq8ll2udqfken92pz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:21 +0800

6af206fd9 perf stat: Don't show counter information when workload fails

When starting a workload 'stat' wasn't using prepare_workload evlist
method's signal based exec() error reporting mechanism.

Use it so that we don't report 'not counted' counters.

Before:

[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directory

Performance counter stats for 'dfadsfa':

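<not counted> task-clock
<not counted> context-switches
<not counted> cpu-migrations
<not counted> page-faults
<not counted> cycles
<not counted> stalled-cycles-frontend
<not counted> stalled-cycles-backend
<not counted> instructions
<not counted> branches
<not counted> branch-misses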

0.001831462 seconds time elapsed

[acme@zoo linux]$

After:

[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directory
[acme@zoo linux]$

Reported-by: David Ahern
Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-5yui3bv7e3hitxucnjsn6z8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2014-01-13 21:06:21 +0800

28 Dec, 2013

1 commit

046625231 perf tools: Introduce zfree

For the frequent idiom of:

free(ptr);
ptr = NULL;

Make it expect a pointer to the pointer being freed, so that it becomes
clear at first sight that the variable being freed is being modified.
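
A portable sketch of such a helper, written as a macro so it works for any pointer type. It is named zfree_sketch here to make clear that it illustrates the idea and is not necessarily the definition added by this commit.

```c
#include <stdlib.h>

/*
 * Free the pointed-to pointer and reset it to NULL.
 * Taking the address makes it visible at the call site
 * that the variable itself is modified.
 */
#define zfree_sketch(pptr)      \
    do {                        \
        free(*(pptr));          \
        *(pptr) = NULL;         \
    } while (0)
```

A call then reads zfree_sketch(&ptr); and calling it twice is harmless, since free(NULL) is a no-op.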

Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-pfw02ezuab37kha18wlut7ir@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2013-12-28 02:17:00 +0800

27 Nov, 2013

1 commit

410136f5d tools/perf/stat: Add event unit and scale support

This patch adds perf stat support for handling event units and
scales as exported by the kernel.

The kernel can export the actual unit and scaling factor of PMU events
via sysfs:

$ ls -1 /sys/devices/power/events/energy-*
/sys/devices/power/events/energy-cores
/sys/devices/power/events/energy-cores.scale
/sys/devices/power/events/energy-cores.unit
/sys/devices/power/events/energy-pkg
/sys/devices/power/events/energy-pkg.scale
/sys/devices/power/events/energy-pkg.unit
$ cat /sys/devices/power/events/energy-cores.scale
2.3283064365386962890625e-10
$ cat /sys/devices/power/events/energy-cores.unit
Joules

This patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:

# perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
# time counts unit events
1.000214717 3.07 Joules power/energy-pkg/ [100.00%]
1.000214717 0.53 Joules power/energy-cores/
1.000214717 12965028 cycles [100.00%]
2.000749289 3.01 Joules power/energy-pkg/
2.000749289 0.52 Joules power/energy-cores/
2.000749289 15817043 cycles

When the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.
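
How a raw count, a .scale value and a .unit string combine can be sketched as follows. parse_scale() and format_scaled() are hypothetical helpers; the 1.0 fallback for unparsable text and the "%.2f" formatting are assumptions for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Parse the text of a .scale file; fall back to 1.0 if nothing parses. */
static double parse_scale(const char *text)
{
    char *end;
    double scale = strtod(text, &end);

    return end == text ? 1.0 : scale;
}

/* Format "value unit"; with an empty unit nothing follows the number. */
static int format_scaled(char *buf, size_t len, unsigned long long raw,
                         double scale, const char *unit)
{
    double value = (double)raw * scale;

    if (unit && unit[0])
        return snprintf(buf, len, "%.2f %s", value, unit);
    return snprintf(buf, len, "%.2f", value);
}
```

The scale shown in the sysfs listing, 2.3283064365386962890625e-10, equals 2^-32, so a raw count of 4294967296 corresponds to exactly 1 Joule.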

Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.

Signed-off-by: Stephane Eranian
Reviewed-by: Jiri Olsa
Reviewed-by: Andi Kleen
Signed-off-by: Peter Zijlstra
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar

Stephane Eranian
2013-11-27 18:16:39 +0800

13 Nov, 2013

1 commit

602ad878d perf target: Shorten perf_target__ to target__

perf_target__ is getting unwieldy; for this app domain, target__ should
be descriptive enough, and the use of __ to separate the class from the
method names should help with avoiding clashes with other code bases.

Reported-by: David Ahern
Suggested-by: Ingo Molnar
Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2013-11-13 03:51:03 +0800

04 Nov, 2013

1 commit

cc03c5429 perf stat: Enhance option parse error message

Print related option help messages only when it failed to process
options. While at it, modify parse_options_usage() to skip usage part
so that it can be used for showing multiple option help messages
naturally like below:

$ perf stat -Bx, ls
-B option not supported with -x

usage: perf stat [<options>] [<command>]

-B, --big-num print large numbers with thousands' separators
-x, --field-separator
print counts with custom separator

Signed-off-by: Namhyung Kim
Acked-by: Ingo Molnar
Reviewed-by: Ingo Molnar
Enthusiastically-Supported-by: Ingo Molnar
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1383291195-24386-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2013-11-04 23:57:36 +0800

11 Oct, 2013

5 commits

4bbe5a61f perf stat: Add units to nanosec-based counters

Ingo pointed out that the task-clock counter should have the units
explicitly stated since it is not a counter.

Before:

perf stat -a -- sleep 1

Performance counter stats for 'sleep 1':

16186.874834 task-clock # 16.154 CPUs utilized
...

After:

perf stat -a -- sleep 1

Performance counter stats for 'system wide':

16146.402138 task-clock (msec) # 16.125 CPUs utilized
...

Reported-by: Ingo Molnar
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380400080-9211-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

David Ahern
2013-10-11 23:17:46 +0800

ac3063bd4 perf stat: Don't require a workload when using system wide or CPU options

The "perf stat" command can count system wide or on one or more cpus.
For these options do not require a workload to be specified.

v2: use perf_target__none per Namhyung's comment.

Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/52497F3C.9070908@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

David Ahern
2013-10-11 23:17:44 +0800

62d3b617c perf stat: Fix misleading message when specifying cpu list or system wide

The "perf stat" tool displays the command run in its summary output
which is misleading when using a cpu list or system wide collection.

Before:

perf stat -a -- sleep 1

Performance counter stats for 'sleep 1':

16152.670249 task-clock # 16.132 CPUs utilized
417 context-switches # 0.002 M/sec
7 cpu-migrations # 0.030 K/sec
...

After:

perf stat -a -- sleep 1

Performance counter stats for 'system wide':

16206.931120 task-clock # 16.144 CPUs utilized
395 context-switches # 0.002 M/sec
5 cpu-migrations # 0.030 K/sec
...

or

perf stat -C1 -- sleep 1

Performance counter stats for 'CPU(s) 1':

1001.669257 task-clock # 1.000 CPUs utilized
4,264 context-switches # 0.004 M/sec
3 cpu-migrations # 0.003 K/sec
...

Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380400080-9211-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

David Ahern
2013-10-11 23:17:42 +0800

3e7a08179 perf stat: Don't print bogus data on -e instructions

When only the instructions event is requested:

$ perf stat -e instructions git s
M builtin-stat.c

Performance counter stats for 'git s':

917,453,420 instructions # 0.00 insns per cycle

0.213002926 seconds time elapsed

The 0.00 insns per cycle comment in the output is totally bogus and
misleading. It happens because update_shadow_stats() doesn't touch
runtime_cycles_stats when only the instructions event is requested. So,
omit printing the bogus data altogether.
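
The guard amounts to printing the derived ratio only when its denominator was actually collected. A rough sketch with a hypothetical format_ipc() helper and a simplified output format:

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Append the "insns per cycle" comment only when a cycles count exists.
 * avg_cycles stays 0 when the cycles event was not requested.
 */
static int format_ipc(char *buf, size_t len, unsigned long long instructions,
                      double avg_cycles)
{
    if (avg_cycles > 0.0)
        return snprintf(buf, len, "%llu instructions # %.2f insns per cycle",
                        instructions, (double)instructions / avg_cycles);

    return snprintf(buf, len, "%llu instructions", instructions);
}
```

Leaving the comment out is preferable to printing 0.00, which a reader could mistake for a measured value.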

Signed-off-by: Ramkumar Ramachandra
Acked-by: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1380616604-4077-1-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

Ramkumar Ramachandra
2013-10-11 23:17:35 +0800

c458fe62c perf stat: Don't print bogus data on -e cycles

When only the cycles event is requested:

$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

911,626,453 cycles # 0.000 GHz

0.262113350 seconds time elapsed

The 0.000 GHz comment in the output is totally bogus and misleading. It
happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
it is only written when a requested counter matches a SW_TASK_CLOCK. In
our case, since we have only requested HW_CPU_CYCLES,
runtime_nsecs_stats is unavailable. So, omit printing the comment
altogether.

Signed-off-by: Ramkumar Ramachandra
Acked-by: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1380539585-23859-3-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo

Ramkumar Ramachandra
2013-10-11 23:17:33 +0800

08 Oct, 2013

1 commit

429eb0510 Merge branch 'perf/urgent' into tools/perf/build

Ingo Molnar
2013-10-08 17:51:31 +0800

05 Oct, 2013

1 commit

d20a47e70 perf stat: Set child_pid after perf_evlist__prepare_workload()

The commit acf2892270dc ("perf stat: Use perf_evlist__prepare/
start_workload()") converted to use the function but forgot to update
child_pid. Fix it.

Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380531671-28076-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2013-10-05 02:16:05 +0800

04 Oct, 2013

1 commit

4cabc3d1c tools/perf/stat: Add perf stat --transaction

Add support to perf stat to print the basic transactional execution
statistics: total cycles, cycles in transaction, and cycles in aborted
transactions, using the in_tx and in_tx_checkpoint qualifiers; and
transaction starts and elision starts, to compute the average
transaction length.

This is a reasonable overview of the success of the transactions.

Also support architectures that have a transaction aborted cycles
counter like POWER8. Since that is awkward to handle in the kernel
abstract handle both cases here.

Enable with a new --transaction / -T option.

This requires measuring these events in a group, since they depend on each
other.

This is implemented by using TM sysfs events exported by the kernel

Signed-off-by: Andi Kleen
Acked-by: Arnaldo Carvalho de Melo
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar

Andi Kleen
2013-10-04 16:06:07 +0800

08 Aug, 2013

2 commits

2bbf03f16 perf stat: Flush output after each line in interval mode

When interval mode is outputting to a pipe, each measurement should be
flushed individually, so that the reader sees it in a timely manner.

With a terminal each line is automatically flushed by stdio, but that is
disabled with non terminal output.

Simply fflush output after each time interval

Signed-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Cc: Jiri Olsa
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1375490473-1503-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2013-08-08 04:35:29 +0800

411916880 perf stat: Add support for --initial-delay option

When measuring workloads the startup phase -- doing page faults, dynamic
linking, opening files -- is often very different from the rest of the
workload. Especially with smaller kernels and using counter
multiplexing this can give significant measurement errors.

Multiplexing assumes that the workload is mostly the same over longer
periods. But at startup there is typically some spike of activity which
is relatively short. If many groups are multiplexing, the one group
seeing the spike, which is then scaled up over the time to run all
groups, may see a significant error.

Also in general it's often not useful to measure the startup, because it
is so different from the rest.

One way around this is to use interval mode and discard the first
sample, but this can be awkward because interval mode doesn't support
intervals of less than 100ms, and also a useful interval is not
necessarily the same as a useful startup delay.

This patch adds a new --initial-delay / -D option to skip measuring for
the startup phase. The time can be specified in ms.

Here's a simple example:

perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
3,721 page-faults
...

If we just wait 20 ms, the number of page faults is about 1/4 less:

perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
2,823 page-faults
...

So we filtered out most of the startup noise from bash.

Signed-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Cc: Jiri Olsa
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo

Andi Kleen
2013-08-08 04:35:29 +0800

09 Jul, 2013

2 commits

582ec0829 perf stat: Fix per-socket output bug for uncore events

This patch fixes a problem reported by Andi Kleen on perf
stat when measuring uncore events:

# perf stat --per-socket -e uncore_pcu/event=0x0/ -I1000 -a sleep 2

It would not report counts for the second socket. That was due to a
cpu mapping bug in print_aggr().

This patch also fixes the socket numbering bug for
events.

Reported-by: Andi Kleen
Signed-off-by: Stephane Eranian
Tested-by: Andi Kleen
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/20130705170645.GA32519@quad
Signed-off-by: Arnaldo Carvalho de Melo

Stephane Eranian
2013-07-09 05:01:46 +0800

d07f0b120 perf stat: Avoid sending SIGTERM to random processes

This patch fixes a problem with perf stat whereby on termination it may
send a SIGTERM signal to random processes on systems with high PID
recycling. I got some actual bug reports on this.

There is a race between the SIGCHLD and sig_atexit() handlers. This patch
addresses this problem by clearing child_pid in the SIGCHLD handler.
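
The fix can be sketched as below, assuming -1 as the "no child" marker and sig_atomic_t as the storage type. on_sigchld() and terminate_child() are hypothetical names standing in for the handlers mentioned in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <signal.h>

/* PID of the measured child; -1 means there is no child left to signal. */
static volatile sig_atomic_t child_pid = -1;

/* SIGCHLD handler: the child has exited, so its PID may now be recycled. */
static void on_sigchld(int sig)
{
    (void)sig;
    child_pid = -1;
}

/*
 * Exit-time cleanup. Returns 0 if there was nothing to signal,
 * 1 if SIGTERM was sent, and -1 if kill() failed.
 */
static int terminate_child(void)
{
    if (child_pid == -1)
        return 0;
    return kill((pid_t)child_pid, SIGTERM) == 0 ? 1 : -1;
}
```

Once the SIGCHLD handler has run, the exit path no longer holds a stale PID, so an unrelated process that later reuses the number cannot receive the SIGTERM.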

Signed-off-by: Stephane Eranian
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20130604154426.GA2928@quad
Signed-off-by: Arnaldo Carvalho de Melo

Stephane Eranian
2013-07-09 04:36:33 +0800

26 Mar, 2013

1 commit

12c08a9f5 perf stat: Add per-core aggregation

This patch adds the --per-core option to perf stat.

This option is used to aggregate system-wide counts
on a per physical core basis. On processors with
hyperthreading, this means counts of all HT threads
running on a physical core are aggregated.

This mode is useful to find imbalance between physical
cores running a uniform workload. Cores are identified
by socket: S0-C1 means physical core 1 on socket 0. Note
that cores are identified using their physical core id,
thus their numbering may not be continuous.

Per core aggregation can be combined with interval printing:

# perf stat -a --per-core -I 1000 -e cycles sleep 1000
# time core cpus counts events
1.000090030 S0-C0 1 4,765,747 cycles
1.000090030 S0-C1 1 5,580,647 cycles
1.000090030 S0-C2 1 221,181 cycles
1.000090030 S0-C3 1 266,092 cycles
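
Per-core aggregation amounts to summing the counts of all logical CPUs that share a (socket, core) pair. A rough sketch with hypothetical types, not the perf data structures:

```c
#include <stddef.h>

struct cpu_sample {
    int socket;               /* physical socket of this logical CPU */
    int core;                 /* physical core id within the socket */
    unsigned long long count; /* counter value read on this CPU */
};

/*
 * Sum the counts of every logical CPU (e.g. HT siblings) on one
 * physical core, and report how many CPUs contributed.
 */
static unsigned long long core_total(const struct cpu_sample *s, size_t n,
                                     int socket, int core, int *nr_cpus)
{
    unsigned long long total = 0;
    size_t i;

    *nr_cpus = 0;
    for (i = 0; i < n; i++) {
        if (s[i].socket == socket && s[i].core == core) {
            total += s[i].count;
            (*nr_cpus)++;
        }
    }
    return total;
}
```

The nr_cpus result corresponds to the "cpus" column in the output above: how many logical CPUs were folded into each core's line.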

Signed-off-by: Stephane Eranian
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
[ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo

Stephane Eranian
2013-03-26 03:13:26 +0800