23 Oct, 2015
1 commit
-
commit 601083cffb7cabdcc55b8195d732f0f7028570fa upstream.
print_aggr() fails to print per-core/per-socket statistics after commit
582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
if events have differnt cpus. Because in print_aggr(), aggr_get_id needs
index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
used to get aggregated id.Here is an example:
Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)$ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2
Without this patch, it failes to get CPU 18 result.
Performance counter stats for 'CPU(s) 0,18':
S0-C0 1 7526851 cycles
S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/
S1-C0 0 cycles
S1-C0 0 MiB uncore_imc_0/cas_count_read/With this patch, it can get both CPU0 and CPU18 result.
Performance counter stats for 'CPU(s) 0,18':
S0-C0 1 6327768 cycles
S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/
S1-C0 1 330228 cycles
S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/Signed-off-by: Kan Liang
Acked-by: Jiri Olsa
Acked-by: Stephane Eranian
Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman
26 Mar, 2015
1 commit
-
Use of a bad filter currently generates the message:
Error: failed to set filter with 22 (Invalid argument)Add the event name to make it clear to which event the filter
failed to apply:
Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argumentTo test it use something like:
# perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1
Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument)
#Based-on-a-patch-by: David Ahern
Acked-by: David Ahern
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
13 Mar, 2015
3 commits
-
When cycles or instructions do not print anything, as in being,
--per-socket or --per-core modi, the ratio column was not correctly
indented for them. This lead to some ratios not lining up with the
others. Always indent correctly when nothing is printed.Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1426087682-22765-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo -
perf stat didn't compute the IPC and other formulas for individual CPUs
with -A. Fix this for the easy -A case. As before, --per-core and
--per-socket do not handle it, they simply print nothing.Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1426087682-22765-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo -
The information how much a counter ran in 'perf stat' can be quite
interesting for other tools to judge how trustworthy a measurement is.Currently it is only output in non CSV mode.
This patches make perf stat always output the running time and the
enabled/running ratio in CSV mode.This adds two new fields at the end for each line. I assume that
existing tools ignore new fields at the end, so it's on by default.Only CSV mode is affected, no difference otherwise.
v2: Add extra print_running function
v3: Avoid printing nan
v4: Remove some elses and add brackets.
v5: Move non CSV case into print_runningSigned-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Acked-by: Namhyung Kim
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1426083387-17006-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
02 Mar, 2015
1 commit
-
Commit 1971f59 (perf stat: Use read_counter in read_counter_aggr )
broke the perf stat output for unsupported counters.$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.Performance counter stats for 'system wide':
0 CCI_400/config=24/
1.080265400 seconds time elapsed
Where it used to be :
$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.Performance counter stats for 'system wide':
CCI_400/config=24/
1.083840675 seconds time elapsed
This patch fixes the issues by checking if the counter is supported,
before reading and logging the counter value.Signed-off-by: Suzuki K. Poulose
Acked-by: David Ahern
Tested-by: David Ahern
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1423852858-8455-1-git-send-email-suzuki.poulose@arm.com
Signed-off-by: Arnaldo Carvalho de Melo
22 Jan, 2015
1 commit
-
Janitorial stuff: boredom moment.
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
02 Dec, 2014
5 commits
-
The .snapshot file indicates that the provided event value is a snapshot
value. Bypassing the delta computation logic for such event.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
The .per-pkg file indicates that all but one value per socket should be
discarded. Adding the logic of skipping the rest of the socket once
first value was read.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Use the read_counter function as the values retrieval function for aggr
counter values thus eliminating the use of __perf_evsel__read function.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
The read function will be used later for both aggr and cpu counters, so
we need to make it work over threads as well.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Replacing __perf_evsel__read_on_cpu function with perf_evsel__read_cb
function. The read_cb callback will be used later for global aggregation
counter values as well.Signed-off-by: Jiri Olsa
Cc: Andi Kleen
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Matt Fleming
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1416562275-12404-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
17 Oct, 2014
1 commit
-
The only thing we need is a forward declaration for 'struct cgroup_sel',
that is inside 'struct perf_evsel'.Include cgroup.h instead on the tools that support cgroups.
Cc: Adrian Hunter
Cc: Borislav Petkov
Cc: David Ahern
Cc: Don Zickus
Cc: Frederic Weisbecker
Cc: Jean Pihet
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
26 Sep, 2014
1 commit
-
On systems with more than one socket perf stat --per-core would either
segfault or stop before outputting all cores.The problem was that the output code referenced the id including the
socket number in the higher bits, which is far beyond any per cpu array.Mask out the socket number before referencing cpus in abs_printout.
I also renamed the variable in nsec_printout to be clear what it is,
even though it doesn't reference cpus.Signed-off-by: Andi Kleen
Acked-by: Stephane Eranian
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1411591846-32736-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
16 Aug, 2014
1 commit
-
Use strerror_r instead of strerror in error message for thread-safety.
Signed-off-by: Masami Hiramatsu
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Naohiro Aota
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20140814022255.3545.81549.stgit@kbuild-fedora.novalocal
Signed-off-by: Arnaldo Carvalho de Melo
27 Jun, 2014
1 commit
-
Check real allocated pointer for NULL.
Cc: Arnaldo Carvalho de Melo
Cc: Corey Ashford
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-5rfzbalwjphmdzzil74eazyl@git.kernel.org
Signed-off-by: Jiri Olsa
14 Apr, 2014
1 commit
-
perf stat did initialize the stats structure used to compute
stddev etc. incorrectly. It merely zeroes it. But one member
(min) needs to be set to a non zero value. This causes min
to be not computed at all. Call init_stats() correctly.It doesn't matter for stat currently because it doesn't use
min, but it's still better to do it correctly.The other users of statistics are already correct.
Signed-off-by: Andi Kleen
Acked-by: Namhyung Kim
Link: http://lkml.kernel.org/r/1395768699-16060-1-git-send-email-andi@firstfloor.org
Signed-off-by: Jiri Olsa
13 Jan, 2014
6 commits
-
For the common evsel list traversal, so that it becomes more compact.
Use the opportunity to start ditching the 'perf_' from 'perf_evlist__',
as discussed, as the whole conversion touches a lot of places, lets do
it piecemeal when we have the chance due to other work, like in this
case.Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-qnkx7dzm2h6m6uptkfk03ni6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
That 'argc' argument _is_ being used.
Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-t2gsxc15zulkorieg8zq996o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Instead of requiring tools to do an extra destructor call just before
calling perf_evlist__delete.Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-0jd2ptzyikxb5wp7inzz2ah2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So that we have the boilerplate in the preparation method, instead of
open coded in tools wanting the reporting when the exec fails.Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-purbdzcphdveskh7wwmnm4t7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
When a tool uses perf_evlist__start_workload and the supplied workload
fails (e.g.: its binary wasn't found), perror was being used to print
the error reason.This is undesirable, as the caller may be a GUI, when it wants to have
total control of the error reporting process.So move to using sigaction(SA_SIGINFO) + siginfo_t->sa_value->sival_int
to communicate to the caller the errno and let it print it using the UI
of its choosing.Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-epgcv7kjq8ll2udqfken92pz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
When starting a workload 'stat' wasn't using prepare_workload evlist
method's signal based exec() error reporting mechanism.Use it so that the we don't report 'not counted' counters.
Before:
[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directoryPerformance counter stats for 'dfadsfa':
task-clock
context-switches
cpu-migrations
page-faults
cycles
stalled-cycles-frontend
stalled-cycles-backend
instructions
branches
branch-misses0.001831462 seconds time elapsed
[acme@zoo linux]$
After:
[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directory
[acme@zoo linux]$Reported-by: David Ahern
Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-5yui3bv7e3hitxucnjsn6z8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
28 Dec, 2013
1 commit
-
For the frequent idiom of:
free(ptr);
ptr = NULL;Make it expect a pointer to the pointer being freed, so that it becomes
clear at first sight that the variable being freed is being modified.Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-pfw02ezuab37kha18wlut7ir@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
27 Nov, 2013
1 commit
-
This patch adds perf stat support for handling event units and
scales as exported by the kernel.The kernel can export PMU events actual unit and scaling factor
via sysfs:$ ls -1 /sys/devices/power/events/energy-*
/sys/devices/power/events/energy-cores
/sys/devices/power/events/energy-cores.scale
/sys/devices/power/events/energy-cores.unit
/sys/devices/power/events/energy-pkg
/sys/devices/power/events/energy-pkg.scale
/sys/devices/power/events/energy-pkg.unit
$ cat /sys/devices/power/events/energy-cores.scale
2.3283064365386962890625e-10
$ cat cat /sys/devices/power/events/energy-cores.unit
JoulesThis patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:# perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
# time counts unit events
1.000214717 3.07 Joules power/energy-pkg/ [100.00%]
1.000214717 0.53 Joules power/energy-cores/
1.000214717 12965028 cycles [100.00%]
2.000749289 3.01 Joules power/energy-pkg/
2.000749289 0.52 Joules power/energy-cores/
2.000749289 15817043 cyclesWhen the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.Signed-off-by: Stephane Eranian
Reviewed-by: Jiri Olsa
Reviewed-by: Andi Kleen
Signed-off-by: Peter Zijlstra
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
13 Nov, 2013
1 commit
-
Getting unwieldly long, for this app domain should be descriptive enough
and the use of __ to separate the class from the method names should
help with avoiding clashes with other code bases.Reported-by: David Ahern
Suggested-by: Ingo Molnar
Cc: Adrian Hunter
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Jiri Olsa
Cc: Mike Galbraith
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo
04 Nov, 2013
1 commit
-
Print related option help messages only when it failed to process
options. While at it, modify parse_options_usage() to skip usage part
so that it can be used for showing multiple option help messages
naturally like below:$ perf stat -Bx, ls
-B option not supported with -xusage: perf stat [] []
-B, --big-num print large numbers with thousands' separators
-x, --field-separator
print counts with custom separatorSigned-off-by: Namhyung Kim
Acked-by: Ingo Molnar
Reviewed-by: Ingo Molnar
Enthusiastically-Supported-by: Ingo Molnar
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1383291195-24386-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
11 Oct, 2013
5 commits
-
Ingo pointed out that the task-clock counter should have the units
explicitly stated since it is not a counter.Before:
perf stat -a -- sleep 1
Performance counter stats for 'sleep 1':
16186.874834 task-clock # 16.154 CPUs utilized
...After:
perf stat -a -- sleep 1
Performance counter stats for 'system wide':
16146.402138 task-clock (msec) # 16.125 CPUs utilized
...Reported-by: Ingo Molnar
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380400080-9211-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
The "perf stat" command can do system wide counters or one or more cpus.
For these options do not require a workload to be specified.v2: use perf_target__none per Namhyung's comment.
Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/52497F3C.9070908@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
The "perf stat" tool displays the command run in its summary output
which is misleading when using a cpu list or system wide collection.Before:
perf stat -a -- sleep 1
Performance counter stats for 'sleep 1':
16152.670249 task-clock # 16.132 CPUs utilized
417 context-switches # 0.002 M/sec
7 cpu-migrations # 0.030 K/sec
...After:
perf stat -a -- sleep 1
Performance counter stats for 'system wide':
16206.931120 task-clock # 16.144 CPUs utilized
395 context-switches # 0.002 M/sec
5 cpu-migrations # 0.030 K/sec
...or
perf stat -C1 -- sleep 1
Performance counter stats for 'CPU(s) 1':
1001.669257 task-clock # 1.000 CPUs utilized
4,264 context-switches # 0.004 M/sec
3 cpu-migrations # 0.003 K/sec
...Signed-off-by: David Ahern
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380400080-9211-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
When only the instructions event is requested:
$ perf stat -e instructions git s
M builtin-stat.cPerformance counter stats for 'git s':
917,453,420 instructions # 0.00 insns per cycle
0.213002926 seconds time elapsed
The 0.00 insns per cycle comment in the output is totally bogus and
misleading. It happens because update_shadow_stats() doesn't touch
runtime_cycles_stats when only the instructions event is requested. So,
omit printing the bogus data altogether.Signed-off-by: Ramkumar Ramachandra
Acked-by: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1380616604-4077-1-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo -
When only the cycles event is requested:
$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/sPerformance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':
911,626,453 cycles # 0.000 GHz
0.262113350 seconds time elapsed
The 0.000 GHz comment in the output is totally bogus and misleading. It
happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
it is only written when a requested counter matches a SW_TASK_CLOCK. In
our case, since we have only requested HW_CPU_CYCLES,
runtime_nsecs_stats is unavailable. So, omit printing the comment
altogether.Signed-off-by: Ramkumar Ramachandra
Acked-by: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1380539585-23859-3-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo
08 Oct, 2013
1 commit
05 Oct, 2013
1 commit
-
The commit acf2892270dc ("perf stat: Use perf_evlist__prepare/
start_workload()") converted to use the function but forgot to update
child_pid. Fix it.Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1380531671-28076-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
04 Oct, 2013
1 commit
-
Add support to perf stat to print the basic transactional execution statistics:
Total cycles, Cycles in Transaction, Cycles in aborted transsactions
using the in_tx and in_tx_checkpoint qualifiers.
Transaction Starts and Elision Starts, to compute the average transaction
length.This is a reasonable overview over the success of the transactions.
Also support architectures that have a transaction aborted cycles
counter like POWER8. Since that is awkward to handle in the kernel
abstract handle both cases here.Enable with a new --transaction / -T option.
This requires measuring these events in a group, since they depend on each
other.This is implemented by using TM sysfs events exported by the kernel
Signed-off-by: Andi Kleen
Acked-by: Arnaldo Carvalho de Melo
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar
08 Aug, 2013
2 commits
-
When interval mode is outputting to a pipe, each measurement should be
flushed individually, so that the reader sees it timely.With a terminal each line is automatically flushed by stdio, but that is
disabled with non terminal output.Simply fflush output after each time interval
Signed-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Cc: Jiri Olsa
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1375490473-1503-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo -
When measuring workloads the startup phase -- doing page faults, dynamic
linking, opening files -- is often very different from the rest of the
workload. Especially with smaller kernels and using counter
multiplexing this can give significant measurement errors.Multiplexing assumes that the workload is mostly the same over longer
periods. But at startup there is typically some spike of activity which
is relatively short. If many groups are multiplexing the one group
seeing the spike, and which is then scaled up over the time to run all
groups, may see a significant error.Also in general it's often not useful to measure the startup, because it
is so different from the rest.One way around this is to use interval mode and discard the first
sample, but this can be awkward because interval mode doesn't support
intervals of less than 100ms, and also a useful interval is not
necessarily the same as a useful startup delay.This patch adds a new --initial-delay / -D option to skip measuring for
the startup phase. The time can be specified in msHere's a simple example:
perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
3,721 page-faults
...If we just wait 20 ms the number of page faults is 1/3 less:
perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
2,823 page-faults
...So we filtered out most of the startup noise from bash.
Signed-off-by: Andi Kleen
Reviewed-by: Jiri Olsa
Cc: Jiri Olsa
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Jul, 2013
2 commits
-
This patch fixes a problem reported by Andi Kleen on perf
stat when measuring uncore events:# perf stat --per-socket -e uncore_pcu/event=0x0/ -I1000 -a sleep 2
It would not report counts for the second socket. That was due to a
cpu mapping bug in print_aggr().This patch also fixes the socket numbering bug for
events.Reported-by: Andi Kleen
Signed-off-by: Stephane Eranian
Tested-by: Andi Kleen
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: zheng.z.yan@intel.com
Link: http://lkml.kernel.org/r/20130705170645.GA32519@quad
Signed-off-by: Arnaldo Carvalho de Melo -
This patch fixes a problem with perf stat whereby on termination it may
send a SIGTERM signal to random processes on systems with high PID
recycling. I got some actual bug reports on this.There is race between the SIGCHLD and sig_atexit() handlers. This patch
addresses this problem by clearing child_pid in the SIGCHLD handler.Signed-off-by: Stephane Eranian
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20130604154426.GA2928@quad
Signed-off-by: Arnaldo Carvalho de Melo
26 Mar, 2013
1 commit
-
This patch adds the --per-core option to perf stat.
This option is used to aggregate system-wide counts
on a per physical core basis. On processors with
hyperthreading, this means counts of all HT threads
running on a physical core are aggregated.This mode is useful to find imblance between physical
cores running an uniform workload. Cores are identified
by socket: S0-C1, means physical core 1 on socket 0. Note
that cores are identified using their physical core id,
thus their numbering may not be continuous.Per core aggregation can be combined with interval printing:
# perf stat -a --per-core -I 1000 -e cycles sleep 1000
# time core cpus counts events
1.000090030 S0-C0 1 4,765,747 cycles
1.000090030 S0-C1 1 5,580,647 cycles
1.000090030 S0-C2 1 221,181 cycles
1.000090030 S0-C3 1 266,092 cyclesSigned-off-by: Stephane Eranian
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
[ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo