13 Oct, 2009
1 commit
-
Timechart doesn't work if debugfs is not in /sys/kernel/debug/.
Fixed by using the global debugfs_path, which is filled in by perf.
Signed-off-by: Ashwin Chaugule
Cc: "Arjan van de Ven"
LKML-Reference:
Signed-off-by: Ingo Molnar
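A global path like this has to be discovered from the mount table rather than assumed. The following is a minimal sketch of that discovery step, assuming the usual /proc/mounts line layout ("device mountpoint fstype options ..."). The helper name parse_debugfs_mount() is hypothetical and is not taken from perf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not perf's actual code: inspect one line of a
 * /proc/mounts-style table ("device mountpoint fstype options ...").
 * Returns 1 and copies the mount point into 'path' if the filesystem
 * type is debugfs, 0 otherwise.
 */
static int parse_debugfs_mount(const char *line, char *path, size_t len)
{
	char mnt[256], type[64];

	assert(line != NULL && path != NULL);
	/* Skip the device field, then read mount point and fs type. */
	if (sscanf(line, "%*s %255s %63s", mnt, type) != 2)
		return 0;
	if (strcmp(type, "debugfs") != 0)
		return 0;
	if (strlen(mnt) >= len)
		return 0;	/* caller's buffer is too small */
	strcpy(path, mnt);
	return 1;
}
```

A caller would feed it each line read from /proc/mounts and stop at the first match, so a debugfs mounted somewhere other than /sys/kernel/debug/ is still found.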
12 Oct, 2009
2 commits
-
Randy Dunlap reported that 'make NO_64BIT=1' fails to build
a pure 32-bit binary on 64-bit/64-bit x86 systems.

The reason is that we don't pass in -m32 and GCC defaults
to -m64.

So pass it in - and also extend the warning message about libelf
dependencies - glibc-dev[el] is needed as well beyond the libelf
library.
Reported-by: Randy Dunlap
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
LKML-Reference: Message-Id:
Signed-off-by: Ingo Molnar
-
The following perf build warnings/errors in function
argument types:

builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type

... trigger because older GCC is not able to prove that
sort_dimension__add() does not change the string.

Same goes for test_type_token().
Fix this by improving type consistency.
Signed-off-by: Randy Dunlap
Acked-by: Frederic Weisbecker
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
LKML-Reference:
[ Also remove ugly type cast now unnecessary. ]
Signed-off-by: Ingo Molnar
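The kind of type consistency this refers to can be sketched as follows. This is only an illustration with hypothetical names (find_dimension() and the table contents are placeholders, not perf's own): a lookup helper declares its string parameter as const char *, so literals and const-qualified pointers can be passed without discarding qualifiers and without a cast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table; the names are placeholders, not perf's own. */
static const char *const dimensions[] = { "avg", "max", "switch" };

/*
 * Taking 'const char *' documents that the callee never writes through
 * the pointer, so callers may pass literals or const-qualified strings
 * without a cast. Returns the index of 'tok', or -1 if it is unknown.
 */
static int find_dimension(const char *tok)
{
	size_t i;

	assert(tok != NULL);
	for (i = 0; i < sizeof(dimensions) / sizeof(dimensions[0]); i++) {
		if (strcmp(dimensions[i], tok) == 0)
			return (int)i;
	}
	return -1;
}
```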
06 Oct, 2009
4 commits
-
Some architectures such as Sparc, ARM and MIPS (basically
everything with flush_dcache_page()) need to deal with dcache
aliases by carefully placing pages in both kernel and user maps.

These architectures typically have to use vmalloc_user() for this.
However, on other architectures, vmalloc() is not needed and has
the downsides of being more restricted and slower than regular
allocations.
Signed-off-by: Peter Zijlstra
Acked-by: David Miller
Cc: Andrew Morton
Cc: Jens Axboe
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Asm routines that end up having size equal to zero are not really
zero sized, and as we now do kernel_maps__fixup_sym_end, at least
for kernel routines this gets fixed.

A similar fixup needs to be done for the userspace bits as well,
but as this fixup started only because in /proc/kallsyms we don't
have the end address nor the function size, it appeared here first.
Signed-off-by: Arnaldo Carvalho de Melo
Cc: Frédéric Weisbecker
Cc: Peter Zijlstra
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Add missing BLOCK_IOPOLL_SOFTIRQ entry.
Signed-off-by: Tom Zanussi
Acked-by: Frederic Weisbecker
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
-
And some minor whitespace cleanup.
Signed-off-by: Tom Zanussi
Acked-by: Frederic Weisbecker
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
LKML-Reference:
Signed-off-by: Ingo Molnar
05 Oct, 2009
1 commit
-
If we launch the child on behalf of the user, ensure that it dies
along with ourselves when we are interrupted.
Signed-off-by: Chris Wilson
Cc: Chris Wilson
LKML-Reference:
Signed-off-by: Ingo Molnar
01 Oct, 2009
2 commits
-
Right now generate-cmdlist.sh is not executable, so we
should pass it as an argument to the shell's "." command.

This fixes cases where due to different umask defaults
the generate-cmdlist.sh script is not executable in
a kernel tree checkout.
Signed-off-by: Mulyadi Santosa
Acked-by: Sam Ravnborg
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
For doing work on the Linux power management components, I need to
make long (30+ seconds) traces. Currently, this then results in a
HUGE svg file, with mostly process data that isn't interesting.

This patch adds a --power-only mode to perf timechart that only
outputs the CPU power section of the SVG; this significantly
reduces the size of the SVG file, making even 30+ second traces
viewable with inkscape.

As a minor tweak for the same effect, the minimum text size is
decreased; current inkscape cannot zoom in deep enough to show text
this small, but it reduces inkscape compute time.
Signed-off-by: Arjan van de Ven
Cc: peterz@infradead.org
LKML-Reference:
Signed-off-by: Ingo Molnar
30 Sep, 2009
1 commit
-
Signed-off-by: Arnaldo Carvalho de Melo
Acked-by: Eric Dumazet
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
25 Sep, 2009
1 commit
-
openat() is still a young glibc facility; better not to use it in a
non-performance-critical program (perf list).

Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36
on my dev machine for example).
Signed-off-by: Eric Dumazet
Cc: Peter Zijlstra
Cc: Ulrich Drepper
LKML-Reference:
Signed-off-by: Ingo Molnar
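The usual replacement for openat() in a program where performance does not matter is to build the full path and call plain open(). A minimal sketch, assuming a hypothetical helper name open_in_dir() and an arbitrary 4096-byte path buffer (neither is taken from the patch):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Sketch of an openat()-free helper (hypothetical name): join 'dir' and
 * 'name' into a full path and open() that instead. Returns the file
 * descriptor, or -1 if the path does not fit or open() fails.
 */
static int open_in_dir(const char *dir, const char *name, int flags)
{
	char path[4096];
	int n;

	assert(dir != NULL && name != NULL);
	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* formatting error or truncated path */
	return open(path, flags);
}
```

Unlike openat(), this re-resolves the directory on every call, which is the cost accepted here in exchange for working with older glibc versions.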
24 Sep, 2009
3 commits
-
"perf top" dumps core on my dev machine, if run from a directory
where vmlinux is present:

*** glibc detected *** malloc(): memory corruption: 0x085670d0 ***
Signed-off-by: Eric Dumazet
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
-
I've tried building the docs in tools/perf/Documentation/, and after
that `git status` showed dozens of untracked HTML files. Let's ignore them.
Signed-off-by: Kirill Smelkov
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Inform util/module.c::mod_dso__load_module_paths() that relative
paths do exist in some modules.dep files, and make it fail noisily
should it encounter a path that it doesn't understand, or a module it
cannot open.
Reported-by: Avi Kivity
Signed-off-by: Mike Galbraith
Cc: Arnaldo Carvalho de Melo
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Masami Hiramatsu
LKML-Reference:
Signed-off-by: Ingo Molnar
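Handling both path styles comes down to a small normalization step. The sketch below is illustrative only: resolve_module_path() is a hypothetical name, and treating relative entries as relative to the directory that holds modules.dep is an assumption, not something stated in the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a modules.dep entry into an absolute path.
 * Entries starting with '/' are used as-is; anything else is taken to be
 * relative to 'base' (assumed here to be the directory holding
 * modules.dep). Returns 0 on success, -1 if 'out' is too small.
 */
static int resolve_module_path(const char *base, const char *entry,
			       char *out, size_t len)
{
	size_t need;

	assert(base != NULL && entry != NULL && out != NULL);
	need = strlen(entry) + 1;		/* entry plus NUL */
	if (entry[0] != '/')
		need += strlen(base) + 1;	/* base plus '/' */
	if (need > len)
		return -1;

	if (entry[0] == '/')
		snprintf(out, len, "%s", entry);
	else
		snprintf(out, len, "%s/%s", base, entry);
	return 0;
}
```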
23 Sep, 2009
1 commit
-
Avi Kivity reported 'perf annotate' failures with modules: the
requested function was not annotated.

If there are no modules currently loaded, or the last module
scanned is not loaded, dso__load_modules() steps on the value from
dso__load_vmlinux(), so we happily load the kallsyms symbols on top
of what we've already loaded.

Fix that such that the total count of symbols loaded is returned.
Should module symbol load fail after parsing of vmlinux, it's a
hard failure, so do not silently fall back to kallsyms.
Reported-by: Avi Kivity
Signed-off-by: Mike Galbraith
Cc: Arnaldo Carvalho de Melo
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Masami Hiramatsu
LKML-Reference:
Signed-off-by: Ingo Molnar
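The accounting described above can be sketched as below. The function name and the convention that each loader returns a symbol count or a negative error are assumptions made for illustration; this is not the patched code itself.

```c
/*
 * Sketch with hypothetical names: combine the results of two loaders
 * that each return the number of symbols loaded, or a negative value on
 * error. A failure in either step is propagated; otherwise the counts
 * are added, so a module step that loads nothing (0) no longer hides
 * the symbols already loaded from vmlinux.
 */
static int total_symbols_loaded(int vmlinux_count, int modules_count)
{
	if (vmlinux_count < 0)
		return vmlinux_count;
	if (modules_count < 0)
		return modules_count;	/* hard failure: no silent fallback */
	return vmlinux_count + modules_count;
}
```

With this shape, a caller that falls back to kallsyms only when the result is 0 will not do so after vmlinux has already supplied symbols.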
22 Sep, 2009
1 commit
-
Before:
0 sched:sched_switch # nan M/sec
After:
0 sched:sched_switch # 0.000 M/sec
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
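The "nan" in the old output is what a floating-point 0/0 prints as. A minimal sketch of the guard, assuming a hypothetical helper rate_m_per_sec() that takes a count and an elapsed time in seconds (perf's own computation may use different units):

```c
#include <assert.h>

/*
 * Hypothetical helper: rate in millions of events per second. When no
 * time has elapsed the division would be 0.0 / 0.0, which is NaN in
 * IEEE 754 arithmetic, so report 0.0 instead.
 */
static double rate_m_per_sec(double count, double seconds)
{
	assert(count >= 0.0 && seconds >= 0.0);
	if (seconds == 0.0)
		return 0.0;
	return count / seconds / 1e6;
}
```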
21 Sep, 2009
5 commits
-
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out of its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in all, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks go to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done

  FILES=$(find . -name perf_event.*)

  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
  with hardware registers - and these sed scripts are a bit
  over-eager in renaming them. I've undone some of that, but
  in case there's something left where 'counter' would be
  better than 'event' we can undo that on an individual basis
  instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian
Acked-by: Peter Zijlstra
Acked-by: Paul Mackerras
Reviewed-by: Arjan van de Ven
Cc: Mike Galbraith
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Cc: Steven Rostedt
Cc: Benjamin Herrenschmidt
Cc: David Howells
Cc: Kyle McMartin
Cc: Martin Schwidefsky
Cc: "David S. Miller"
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Tweak the output SVG to increase performance in SVG viewers by
limiting the different types of font sizes and by smarter
transformations on the text.

At least with Inkscape this gives a notable performance improvement
during zoom and scrolling.
Signed-off-by: Arjan van de Ven
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
This patch adds a command line option for timechart that allows the
user to specify the width of the SVG file.

This patch also makes sure that each second of recording has at
least 200 units (pixels at 96 DPI) of width. This impacts
recordings longer than 5 seconds; recordings shorter than 5 seconds
will scale up to have a width of 1000 units for the whole recording
(as before).
Signed-off-by: Arjan van de Ven
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Given that scheduler latencies are the hot thing nowadays, show the
duration of said latencies in the SVG in text form.

In addition, if the latency is more than 10 msec, pick a brighter
yellow color as a way to point these long delays out.
Signed-off-by: Arjan van de Ven
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Timechart currently shows thin green lines for sending or receiving
wakeups. This patch also prints (in a very small font) the name of
the process that is being woken/wakes up this process.
Signed-off-by: Arjan van de Ven
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
20 Sep, 2009
4 commits
-
As per Ingo's review: use a #define rather than an open coded constant
for the maximum length of a trace event for storing in the perf.data file.
Signed-off-by: Arjan van de Ven
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras
LKML-Reference:
[ add a few comments to nearby functions ]
Signed-off-by: Ingo Molnar
-
As suggested by Ingo, add a timechart man page help text, as well
as add it to the "perf help" overview.
Signed-off-by: Arjan van de Ven
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Be more consistent in the svghelper about the minimum text size
by having a global #define for this.

There needs to be a minimum text size in order to keep the size
of the SVG file within the reach of what current SVG viewers can
cope with.
Signed-off-by: Arjan van de Ven
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras
Cc: Arjan van de Ven
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Add a command line option to record a trace, similar to "perf sched record".
Signed-off-by: Arjan van de Ven
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
19 Sep, 2009
7 commits
-
timechart is a tool to visualize what is going on in the system.
The user makes a trace of what is going on with
> perf record --timechart /usr/bin/some_command
and then can turn the output of this into an svg file
> perf timechart
which then can be viewed with any SVG viewer; inkscape works well
enough for me.

The idea behind timechart is to create an "infinitely zoomable"
picture; something that has high level information on a 1:1 zoom
level, but which exposes more details every time you zoom into a
specific area.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
The timechart tool writes out SVG format output; this patch adds a
set of helper functions to abstract dealing with SVG from the core
timechart code.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Add a sample_event type to the event_union so that raw samples can
be processed easily.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
timechart needs to add a "callback" type command line argument that
does not take arguments.

This patch adds the parse-options.h infrastructure to make this
possible.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
The trace event name-to-id mapping is dynamic for each kernel
compile. In order for perf.data to be usable outside the actual
system, we thus need to store a table of this mapping for later
use.

This patch adds this table to perf.data, and provides helper
functions for looking up fields from this table.

To avoid mistakes, lookup-from-table is kept completely separate
from lookup-from-local-debugfs.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
perf timechart needs to know when a process forked, in order to be
able to visualize properly when tasks start.

This patch adds a time field to the event structure, and fills it
in appropriately.
Signed-off-by: Arjan van de Ven
Acked-by: Peter Zijlstra
Cc: Paul Mackerras
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Merge reason: Bring in tracing changes we depend on.
Signed-off-by: Ingo Molnar
18 Sep, 2009
6 commits
-
perf sched record passes unparsed args on to perf record, so
specifying an output file via perf sched record -o FILE (cmd) just
works. Ergo, provide an option to specify input file as well.

Also add the missing 'map' command to help.
Signed-off-by: Mike Galbraith
Acked-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Sample timestamp and cpu just like the -R option.
Before:
init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042
init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=18 handler=eth0
init-0 [-01] 1266874889.17179869184709551615: irq_handler_entry: irq=1 handler=i8042
After:
init-0 [001] 7364.568965353: irq_handler_entry: irq=18 handler=eth0
init-0 [001] 7365.530226877: irq_handler_entry: irq=1 handler=i8042
init-0 [001] 7365.542831563: irq_handler_entry: irq=18 handler=eth0
init-0 [001] 7365.644156299: irq_handler_entry: irq=18 handler=eth0
init-0 [001] 7365.694556201: irq_handler_entry: irq=18 handler=eth0
Signed-off-by: Li Zefan
Acked-by: Frederic Weisbecker
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
The name length of some trace events is longer than 30, like
sys_enter_sched_get_priority_max and
ext4_mb_discard_preallocations.

Passing those events to perf-record will fail, try:
# ./perf record -f -e syscalls:sys_enter_sched_get_priority_max -F 1 -a
Signed-off-by: Li Zefan
Acked-by: Frederic Weisbecker
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
get_tracing_file() should be paired with put_tracing_file().
Signed-off-by: Li Zefan
Acked-by: Frederic Weisbecker
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
For 'perf sched map' output, determine max_cpu automatically,
instead of the static default of 15.
[ v2: use sysconf() pointed out by Arjan van de Ven ]
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
I noticed that perf-record continues profiling itself after the
child terminated and we're draining the buffer.

This can cause a _lot_ of overhead with --all recording - we keep
on recording, which produces more and more events.
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
16 Sep, 2009
1 commit
-
This prints a textual context-switching outline of workload
captured via perf sched record.

For example, on a 16 CPU box it outputs:
N1 O1 . . . S1 . . . B0 . *I0 C1 . M1 . 23002.773423 secs
N1 O1 . *Q0 . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs
N1 O1 . Q0 . S1 . . . B0 . *R1 C1 . M1 . 23002.773485 secs
N1 O1 . Q0 . S1 . *S0 . B0 . R1 C1 . M1 . 23002.773478 secs
*L0 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773523 secs
L0 O1 . *. . S1 . S0 . B0 . R1 C1 . M1 . 23002.773531 secs
L0 O1 . . . S1 . S0 . B0 . R1 C1 *T1 M1 . 23002.773547 secs T1 => irqbalance:2089
L0 O1 . . . S1 . S0 . *P0 . R1 C1 T1 M1 . 23002.773549 secs
*N1 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773566 secs
N1 O1 . . . *J0 . S0 . P0 . R1 C1 T1 M1 . 23002.773571 secs
N1 O1 . . . J0 . S0 *B0 P0 . R1 C1 T1 M1 . 23002.773592 secs
N1 O1 . . . J0 . *U0 B0 P0 . R1 C1 T1 M1 . 23002.773582 secs
N1 O1 . . . *S1 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773604 secs
N1 O1 . . . S1 . U0 B0 *. . R1 C1 T1 M1 . 23002.773615 secs
N1 O1 . . . S1 . U0 B0 . . *K0 C1 T1 M1 . 23002.773631 secs
N1 O1 . *M0 . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773624 secs
N1 O1 . M0 . S1 . U0 *. . . K0 C1 T1 M1 . 23002.773644 secs
N1 O1 . M0 . S1 . U0 . . . *R1 C1 T1 M1 . 23002.773662 secs
N1 O1 . M0 . S1 . *. . . . R1 C1 T1 M1 . 23002.773648 secs
N1 O1 . *. . S1 . . . . . R1 C1 T1 M1 . 23002.773680 secs
N1 O1 . . . *L0 . . . . . R1 C1 T1 M1 . 23002.773717 secs
*N0 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773709 secs
*N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773747 secs

Columns stand for individual CPUs, from CPU0 to CPU15, and the
two-letter shortcuts stand for tasks that are running on a CPU.

'*' denotes the CPU that had the event.
A dot signals an idle CPU.
New tasks are assigned new two-letter shortcuts - when they occur
first they are printed. In the above example 'T1' stood for irqbalance:

T1 => irqbalance:2089
Cc: Peter Zijlstra
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
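Handing out two-character shortcuts like these can be sketched as a small generator. The scheme below (letter A..Z first, digit bumped when the letter wraps) is an assumption chosen to produce names of the same shape as in the output above. The struct and function names are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Next shortcut to hand out; initialise to { 'A', '0' }. */
struct shortname_gen {
	char letter;	/* 'A'..'Z' */
	char digit;	/* '0'..'9' */
};

/*
 * Write the next two-character shortcut plus a terminating NUL into
 * 'out', then advance: the letter cycles A..Z and the digit is bumped
 * each time the letter wraps. After "Z9" the sequence restarts at "A0".
 */
static void next_shortname(struct shortname_gen *g, char out[3])
{
	assert(g != NULL && out != NULL);
	out[0] = g->letter;
	out[1] = g->digit;
	out[2] = '\0';

	if (g->letter < 'Z') {
		g->letter++;
	} else {
		g->letter = 'A';
		g->digit = (g->digit < '9') ? (char)(g->digit + 1) : '0';
	}
}
```

A caller would request a new shortcut the first time a task is seen and print the mapping once, in the same "T1 => irqbalance:2089" style as above.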