15 Jul, 2011
1 commit
-
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded. This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file. It also simplifies the code somewhat.Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Signed-off-by: Sonny Rao
Signed-off-by: Michael Neuling
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt
05 Jul, 2011
1 commit
-
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C .v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentationv3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard
Acked-by: David Ahern
Cc: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar
01 Jul, 2011
3 commits
-
Previously, when you want perf-stat to output the statistics in
csv mode, no information of the noise will be printed out.For example right now we output this --repeat information:
./perf stat -r3 -x, sleep 1
1.164789,task-clock
8,context-switches
0,CPU-migrations
219,page-faults
3337800,cyclesWith this patch, the output will be appended with an additional
entry for the noise value:./perf stat -r3 -x, sleep 1
1.164789,task-clock,3.75%
8,context-switches,75.00%
0,CPU-migrations,100.00%
219,page-faults,0.00%
3337800,cycles,3.36%Signed-off-by: Zhengyu He
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: Venkatesh Pallipadi
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
Signed-off-by: Ingo Molnar -
…ic/random-tracing into perf/core
-
Merge reason: Pick up the latest fixes.
Signed-off-by: Ingo Molnar
30 Jun, 2011
6 commits
-
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").However if parent filtering is used in combination with
explicit parent sorting ( -s parent), we want to
display it.Result with:
perf report -p kernel_thread -s parent
Before:
# Overhead Parent symbol
# ........ .............
#
0.07%
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helpeAfter:
# Overhead Parent symbol
# ........ .............
#
0.07% kernel_thread_helper
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helperSigned-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
As for newt ui, don't display entries that have been marked
as ignored.The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:# Overhead Command Shared Object Symbol
# ........ ........... ................. ............
#
^A
|
--- __lock_acquire
|
|--95.97%-- lock_acquire
| |
| |--30.75%-- _raw_spin_lockDiscard these from the stdio display.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
These are probably some old leftovers.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
These don't need to be globally visible.
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao -
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.Also add "-G" alias for inverted call graph.
Signed-off-by: Sam Liao
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Signed-off-by: Frederic Weisbecker
20 Jun, 2011
1 commit
-
…-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tools/perf: Fix static build of perf tool
tracing: Fix regression in printk_formats file* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clocksource: Make watchdog robust vs. interruption
timerfd: Fix wakeup of processes when timer is cancelled on clock change* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, MAINTAINERS: Add x86 MCE people
x86, efi: Do not reserve boot services regions within reserved areas
19 Jun, 2011
1 commit
-
…l/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes
17 Jun, 2011
1 commit
-
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: Call depmod.sh via shell
perf: clear out make flags when calling kernel make kernelver
16 Jun, 2011
3 commits
-
Merge reason: add the latest fixes.
Signed-off-by: Ingo Molnar
-
To build a statically linked version of the perf tool all needed
libraries must be added in the correct order to get the symbols
resolved. Currently this is broken when, e.g. python or newt
support is enabled -- libpython needs libpthread which is an
unconditional link dependency of the perf tool; libslang needs
libm, another unconditional dependency. To solve the problem in
the long run without the need to keep track of transitive
library dependencies, simply make the linker look at the EXTLIBS
multiple times until it has all symbols resolved.Signed-off-by: Mathias Krause
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar -
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may
trigger additional output which pollute the contents.Signed-off-by: Andy Whitcroft
Signed-off-by: Michal Marek
15 Jun, 2011
1 commit
-
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.Signed-off-by: Shaohua Li
Tested-by: "Alex,Shi"
Signed-off-by: Paul E. McKenney
10 Jun, 2011
2 commits
-
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
perf: Use make kernelversion instead of parsing the Makefile
kbuild: Hack for depmod not handling X.Y versions
kbuild: Move depmod call to a separate script
kbuild: Fix for empty SUBLEVEL or PATCHLEVEL
kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
kbuild: silence Nothing to be done for 'all' message -
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Signed-off-by: Michal Marek
08 Jun, 2011
1 commit
-
…l/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Fix comments in include/linux/perf_event.h
perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI
perf python: Fix argument name list of read_on_cpu()
perf evlist: Don't die if sample_{id_all|type} is invalid
perf python: Use exception to propagate errors
perf evlist: Remove dependency on debug routines
perf, cgroups: Fix up for new API
04 Jun, 2011
2 commits
-
Conflicts:
tools/perf/util/python.cMerge reason: resolve the conflict with perf/urgent.
Signed-off-by: Ingo Molnar
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
ktest: Ignore unset values of the minconfig in config_bisect
ktest: Fix result of rebooting the kernel
ktest: Fix off-by-one in config bisect result
03 Jun, 2011
10 commits
-
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:# ./python/twatch.py
Traceback (most recent call last):
File "./python/twatch.py", line 41, in
main()
File "./python/twatch.py", line 32, in main
event = evlist.read_on_cpu(cpu)
RuntimeError: more argument specifiers than keyword list entriesHence, add cpu to the name list.
Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.oOne of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.Fixes this problem:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Resolve to a function or variable if possible and if the sym option is
enabled.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option. This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains
for those events which gets confusing as to which events are supported,
but not counted and which are not supported.Before:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
0.571283 task-clock # 0.001 CPUs utilized
1 context-switches # 0.002 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.275 M/sec
1,037,707 cycles # 1.816 GHz
stalled-cycles-frontend
stalled-cycles-backend
654,499 instructions # 0.63 insns per cycle
136,129 branches # 238.286 M/sec
branch-misses
L1-dcache-loads
L1-dcache-load-misses
LLC-loads
LLC-load-misses
L1-icache-loads
L1-icache-load-misses
dTLB-loads
dTLB-load-misses
iTLB-loads
iTLB-load-misses
L1-dcache-prefetches
L1-dcache-prefetch-misses1.001004836 seconds time elapsed
After:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
1.350326 task-clock # 0.001 CPUs utilized
2 context-switches # 0.001 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.116 M/sec
11,986 cycles # 0.009 GHz
stalled-cycles-frontend
stalled-cycles-backend
496,986 instructions # 41.46 insns per cycle
138,065 branches # 102.246 M/sec
7,245 branch-misses # 5.25% of all branches
L1-dcache-loads
L1-dcache-load-misses
LLC-loads
LLC-load-misses
L1-icache-loads
L1-icache-load-misses
dTLB-loads
dTLB-load-misses
iTLB-loads
iTLB-load-misses
L1-dcache-prefetches
L1-dcache-prefetch-misses1.002397333 seconds time elapsed
v1->v2:
changed supported type from int to boolv2->v3
fixed vertical alignment of new struct elementCc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The list of methods argument names only needs to be NULL terminated
once. Remove the second ones.Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:# ./python/twatch.py
Traceback (most recent call last):
File "./python/twatch.py", line 41, in
main()
File "./python/twatch.py", line 32, in main
event = evlist.read_on_cpu(cpu)
RuntimeError: more argument specifiers than keyword list entriesHence, add cpu to the name list.
Cc: David Ahern
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo
02 Jun, 2011
6 commits
-
By ignoring the unset values of the minconfig in deciding
what to test in the config_bisect can cause the problem
config from being tested too.Just do not test the configs that are set in the minconfig.
Signed-off-by: Steven Rostedt
-
The command that is called that reboots the kernel may fail
but the return code is not passed back to the ktest.pl script.
This is because a ';' is used between the two commands and
if the second command fails, only the first command's return
code is returned. Using a '&&' between the two commands fixes
this.Signed-off-by: Steven Rostedt
-
Because in perl the array size returned by $#arr, is the last
index and not the actually size of the array, we end the config
bisect early, thinking there is only one config left when there
are in fact two. Thus the result has a 50% chance of picking
the correct config that caused the problem.Signed-off-by: Steven Rostedt
-
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.oOne of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.Fixes this problem:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "", line 1, in
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
30 May, 2011
1 commit
-
Add ability to test the new event idx feature,
enable by default.Signed-off-by: Rusty Russell