18 May, 2010
6 commits
-
…git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
perf tools: Add mode to build without newt support
perf symbols: symbol inconsistency message should be done only at verbose=1
perf tui: Add explicit -lslang option
perf options: Type check all the remaining OPT_ variants
perf options: Type check OPT_BOOLEAN and fix the offenders
perf options: Check v type in OPT_U?INTEGER
perf options: Introduce OPT_UINTEGER
perf tui: Add workaround for slang < 2.1.4
perf record: Fix bug mismatch with -c option definition
perf options: Introduce OPT_U64
perf tui: Add help window to show key associations
perf tui: Make <- exit menus too
perf newt: Add single key shortcuts for zoom into DSO and threads
perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
perf newt: Fix the 'A'/'a' shortcut for annotate
perf newt: Make <- exit the ui_browser
x86, perf: P4 PMU - fix counters management logic
perf newt: Make <- zoom out filters
perf report: Report number of events, not samples
perf hist: Clarify events_stats fields usage
...Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
-
That happened for an old perf.data file that had no fake MMAP events for
the kernel modules, but even then it should warn once for each module,
not one time for every symbol in every module not found.Reported-by: Ingo Molnar
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.Several string constifications were done to make OPT_STRING require a
const char * type.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
To avoid problems like the one fixed by Stephane Eranian in 3de29ca, now
we'll got this instead:bench/sched-messaging.c:259: error: negative width in bit-field ‘’
bench/sched-messaging.c:261: error: negative width in bit-field ‘’Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.With it in place found some problems, fixed by changing the affected
variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER.Next csets will go thru converting each of the remaining OPT_ so that
review can be made easier by grouping changes per type per patch.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
For unsigned int options to be parsed, next patches will make use of it.
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
17 May, 2010
4 commits
-
Older versions of the slang library didn't used the 'const' specifier,
causing problems with modern compilers of this kind:util/newt.c:252: error: passing argument 1 of ‘SLsmg_printf’ discards
qualifiers from pointer target typeFix it by using some wrappers that when needed const the affected
parameters back to plain (char *).Reported-by: Lin Ming
Cc: Frédéric Weisbecker
Cc: Lin Ming
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
We have things like user_interval (-c/--count) in 'perf record' that
needs this.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
In fact it is now added to the hot key list when newt_form__new is used,
allowing us to remove the explicit assignment in all its users.The visible change is that is
pressed (and Enter when callchains are not being used).Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
16 May, 2010
4 commits
-
'D'/'d' for zooming into the DSO in the current highlighted hist entry,
'T'/'t' for zooming into the current thread.Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
ESC still asks for confirmation.
Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Reported-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Right now that means that pressing the left arrow willl make the symbol
annotation window to exit back to the main symbol histogram browser.This is another improvement on the UI fastpath, i.e. just the arrows and
enter are enough for most browsing.Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
15 May, 2010
3 commits
-
After we use the filters to zoom into DSOs or threads, we can use
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Number of samples is meaningless after we switched to auto-freq, so
report the number of events, i.e. not the sum of the different periods,
but the number PERF_RECORD_SAMPLE emitted by the kernel.While doing this I noticed that naming "count" to the sum of all the
event periods can be confusing, so rename it to .period, just like in
struct sample.data, so that we become more consistent.This helps with the next step, that was to record in struct hist_entry
the number of sample events for each instance, we need that because we
use it to generate the number of events when applying filters to the
tree of hist entries like it is being done in the TUI report browser.Suggested-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.Looking at the users, builtin-sched.c can make use of these fields and
stop doing it again.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
14 May, 2010
3 commits
-
This is one more thing that started global but are more useful per hist
or per session.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
hist.c needs to include util.h so that it gets stdio.h
inclusion with __GNU_SOURCE defined.Fixes:
util/hist.c: In function ‘hist_entry__parse_objdump_line’:
util/hist.c:931: erreur: implicit declaration of function ‘getline’
util/hist.c:931: erreur: nested extern declaration of ‘getline’Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Fix mistake in a parameter type of the no-newt hists__browse()
version.Fixes:
builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:314: erreur: incompatible type for argument 1 of ‘hists__browse’Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
12 May, 2010
2 commits
-
Now we don't anymore use popen to run 'perf annotate' for the selected
symbol, instead we collect per address samplings when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.The new ui_browser class will be reused for the main, callchain aware,
histogram browser, when it will be made generic and don't assume that
the objects are always instances of the objdump_line class maintained
using list_heads.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Initially this was just to be able to have a printf like method to
prepare the formatted string and then pass to newtPushHelpLine, but as
we already have for ui_progress, etc, its a step in identifying a
restricted, highlevel set of widgets we can then have implementations
for multiple widget sets (GTK, etc).Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
11 May, 2010
5 commits
-
Those are really not specific to the newt code, can be used by other UI
frontends.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
The raw_field_ptr() helper, used to retrieve the address of a field
inside a trace event, treats every strings as if they were dynamic
ie: having a secondary level of indirection to retrieve their
contents.FIELD_IS_STRING doesn't mean FIELD_IS_DYNAMIC, we only need to
compute the secondary dereference for the latter case.This fixes perf sched segfaults, bad cmdline report and may be
some other bugs.Reported-by: Jason Baron
Reported-by: Arnaldo Carvalho de Melo
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Tom Zanussi -
Only print the script start/stop messages in verbose mode - users
normally don't care and it just clutters up the output.Cc: Frédéric Weisbecker
Cc: Ingo Molnar
LKML-Reference:
Signed-off-by: Tom Zanussi
Signed-off-by: Arnaldo Carvalho de Melo -
Better done when we are adding entries, be it initially of when we're
re-sorting the histograms.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.Cc: Eric B Munson
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
10 May, 2010
10 commits
-
Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create
another machine instance for the host machine, and since 1f626bc we have
it out of the machines rb_tree.Fix it by using machine__create_kernel_maps(&self->host_machine)
directly.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Instead of newtAddComponents(just-one-entry, NULL), that is not needed
if, like in this browser, we're adding just one component at a time.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
…ic/random-tracing into perf/core
-
Works by adding a third parameter to the '-g' argument, after the graph
type and minimum percentage, for example:[root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2
Will show only the first two symbols where at least 0.5% of the samples
took place.All the other symbols that don't fall outside these constraints will be
put together in the last entry, prefixed with "[...]" and the total
percentage for them.Suggested-by: Arjan van de Ven
Acked-by: Arjan van de Ven
Cc: Arjan van de Ven
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
We have just one host on a given session, and that is the most common
setup right now, so embed a ->host_machine struct machine instance
directly in the perf_session class, check if we're looking for it before
going to the rb_tree.This also fixes a problem found when we try to process old perf.data
files where we didn't have MMAP events for the kernel and modules and
thus don't create the kernel maps, do it in event__preprocess_sample if
it wasn't already.Reported-by: Ingo Molnar
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
Cc: Zhang, Yanmin
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Which can happen when processing old files that had no fake kernel MMAP,
events.That shouldn't result in perf_session__create_kernel_maps not being
called, this will be fixed in a followup patch, for now do these checks
to avoid segfaulting.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
By using BITS_PER_LONG / 4, that is the number of chars that will be
used in such cases as the DSO "name".Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
And with that fix at least one bug:
The first hit for an entry, the one that calls malloc to create a new
instance in __perf_session__add_hist_entry, wasn't adding the count to
the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo -
Cc: Frédéric Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
09 May, 2010
3 commits
-
Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
only an event header and no data. In this case, a 0-length payload
will be read, and the 0 return value will be wrongly interpreted as an
'unexpected end of event stream'.This patch allows for proper handling of data-less events by skipping
0-length reads.Signed-off-by: Tom Zanussi
Cc: Arnaldo Carvalho de Melo
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Masami Hiramatsu
LKML-Reference:
Signed-off-by: Frederic Weisbecker -
The current events reordering algorithm is based on a heuristic that
gets broken once we deal with a very fast flow of events.Indeed the time period based flushing is not suitable anymore
in the following case, assuming we have a flush period of two
seconds.CPU 0 | CPU 1
|
cnt1 timestamps | cnt1 timestamps
|
0 | 0
1 | 1
2 | 2
3 | 3
[...] | [...]
4 seconds laterIf we spend too much time to read the buffers (case of a lot of
events to record in each buffers or when we have a lot of CPU buffers
to read), in the next pass the CPU 0 buffer could contain a slice
of several seconds of events. We'll read them all and notice we've
reached the period to flush. In the above example we flush the first
half of the CPU 0 buffer, then we read the CPU 1 buffer where we
have events that were on the flush slice and then the reordering
fails.It's simple to reproduce with:
perf lock record perf bench sched messaging
To solve this, we use a new solution that doesn't rely on an
heuristical time slice period anymore but on a deterministic basis
based on how perf record does its job.perf record saves the buffers through passes. A pass is a tour
on every buffers from every CPUs. This is made in order: for
each CPU we read the buffers of every counters. So the more
buffers we visit, the later will be the timstamps of their events.When perf record finishes a pass it records a
PERF_RECORD_FINISHED_ROUND pseudo event.
We record the max timestamp t found in the pass n. Assuming these
timestamps are monotonic across cpus, we know that if a buffer
still has events with timestamps below t, they will be all available
and then read in the pass n + 1.
Hence when we start to read the pass n + 2, we can safely flush every
events with timestamps below t.============ PASS n =================
CPU 0 | CPU 1
|
cnt1 timestamps | cnt2 timestamps
1 | 2
2 | 3
- | 4
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
Cc: Tom Zanussi
Cc: Masami Hiramatsu -
In order to provide a more rubust and deterministic reordering
algorithm, we need to know when we reach a point where we just
did a pass through over every counter buffers to read every thing
they had.This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
that only consist in an event header and doesn't need to contain
anything.Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
Cc: Tom Zanussi
Cc: Masami Hiramatsu