24 Dec, 2011
1 commit
-
The default input file for perf report is not handled the same way as
perf record does it for its output file. This leads to unexpected
behavior of perf report, etc. E.g.:

# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)

While perf record writes to a fifo, perf report expects perf.data to be
read. This patch changes this to accept fifos as input file.

Applies to the following commands:
perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart

Also fixes char const* -> const char* type declaration for filename
strings.

v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter
Signed-off-by: Arnaldo Carvalho de Melo
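The fifo handling described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name default_input_name and its fd parameter are hypothetical, and it assumes that "-" is the name that means "read from stdin".

```c
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the actual perf code): choose the input file
 * name when the user did not pass one explicitly.
 * Returns "-" (assumed to mean stdin) when fd is a fifo, "perf.data" otherwise.
 */
static const char *default_input_name(const char *input_name, int fd)
{
	struct stat st;

	/* An explicit, non-empty name always wins. */
	if (input_name != NULL && input_name[0] != '\0')
		return input_name;

	/* A fifo on the descriptor means the data is being piped to us. */
	if (fstat(fd, &st) == 0 && S_ISFIFO(st.st_mode))
		return "-";

	return "perf.data";
}
```

A caller would typically pass STDIN_FILENO as fd, so that a command run on the receiving end of a pipe picks up the stream instead of looking for perf.data.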
22 Dec, 2011
1 commit
-
perf report does not take a command from command line.
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1323703017-6060-8-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
20 Dec, 2011
1 commit
-
The '--call-graph' command line option can receive an undocumented optional
print_limit argument. Also, use strtoul() to parse the option since
its type is u32.

Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1323703017-6060-2-git-send-email-namhyung@gmail.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
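Parsing an unsigned 32-bit option value with strtoul() might look like the sketch below. The helper name parse_u32 and its error convention are assumptions made for illustration; the patch itself may validate the value differently.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal string into a 32-bit unsigned value.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_u32(const char *str, uint32_t *out)
{
	char *end;
	unsigned long val;

	/* Require a leading digit: rejects empty strings, signs and whitespace. */
	if (str == NULL || *str < '0' || *str > '9')
		return -1;

	errno = 0;
	val = strtoul(str, &end, 10);
	if (errno != 0 || *end != '\0' || val > UINT32_MAX)
		return -1;

	*out = (uint32_t)val;
	return 0;
}
```

Checking the leading character first avoids strtoul() silently accepting a negative number and wrapping it around.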
28 Nov, 2011
7 commits
-
Currently the meaning of -C varies by perf command: for perf-top,
perf-stat, perf-record it means cpu list. For perf-report it means comm
list. Then perf-annotate, perf-report and perf-script use -c for cpu
list.

Fix annotate, report and script to use -C for cpu list to be consistent
with top, stat and record. This means report needs to use -c for comm
list which does introduce a backward compatibility change.

v1 -> v2
- update perf-script.txt too

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1321209008-7004-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo
-
To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
So that we don't need to have that many globals.
Next steps will remove the 'session' pointer, that in most cases is
not needed.

Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Paving the way to remove these globals when we change the perf_event_ops
to receive as a first parameter a pointer to a perf_event_ops that will
then provide access to perf_report via container_of.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-2eh2vi2nb5z3tg1lvoxv09xu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Since we have it in evsel->hists.callchain_cursor, remove it from
perf_session.

One more step in disentangling several places from requiring a
perf_session pointer.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-rxr5dj3di7ckyfmnz0naku1z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Since symbol__alloc_hists needs it, to avoid passing it around in many
functions have it in the symbol_conf struct.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-cwv8ysvpywzjq4v3xtbd4zwv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
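Several of the 28 Nov commits above move tool state out of globals and plan to reach it through container_of from a perf_event_ops pointer. A minimal sketch of that pattern follows. The structs are simplified stand-ins: the members shown, the callback signature and the nr argument are hypothetical, not the real perf definitions.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for the ops "base class". */
struct perf_event_ops {
	int (*sample)(struct perf_event_ops *ops, int nr);
};

/* Simplified stand-in for a tool that embeds the ops. */
struct perf_report {
	struct perf_event_ops ops;   /* embedded base, passed to callbacks */
	int nr_samples;              /* tool-private state, formerly a global */
};

/* The callback only receives the ops pointer, yet reaches the tool state. */
static int report__process_sample(struct perf_event_ops *ops, int nr)
{
	struct perf_report *rep = container_of(ops, struct perf_report, ops);

	rep->nr_samples += nr;
	return 0;
}
```

Because the ops struct is embedded in the tool struct, no global pointer to the tool is needed: each callback derives it from its first argument.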
08 Oct, 2011
2 commits
-
And add the annotation output knobs to all the tools that have
integrated annotation (top, report).

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-gnlob67mke6sji2kf4nstp7m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
The goal of this patch is to include more information about the host
environment into the perf.data so it is more self-descriptive. Over time,
profiles are captured on various machines and it becomes hard to track
what was recorded, on what machine and when.

This patch provides a way to solve this by extending the perf.data file
with basic information about the host machine. To add those extensions,
we leverage the feature bits capabilities of the perf.data format. The
change is backward compatible with existing perf.data files.

We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identification

The small granularity for the entries is to make it easier to extend
without breaking backward compatibility. Many entries are provided as
ASCII strings.

Perf report/script have been modified to print the basic information as
easy-to-parse ASCII strings. Extended information about CPU and NUMA
topology may be requested with the -I option.

Thanks to David Ahern for reviewing and testing the many versions of
this patch.

$ perf report --stdio
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
...

$ perf report --stdio -I
# ========
# captured on : Mon Sep 26 15:22:14 2011
# hostname : quad
# os release : 3.1.0-rc4-tip
# perf version : 3.1.0-rc4
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
# cpuid : GenuineIntel,6,15,11
# total memory : 8105360 kB
# cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
# sibling cores : 0-3
# sibling threads : 0
# sibling threads : 1
# sibling threads : 2
# sibling threads : 3
# node0 meminfo : total = 8320608 kB, free = 7571024 kB
# node0 cpu list : 0-3
# ========
#
...

Reviewed-by: David Ahern
Tested-by: David Ahern
Cc: David Ahern
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Andi Kleen
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian
[ committer notes: Use --show-info in the tools as was in the docs, rename
perf_header_fprintf_info to perf_file_section__fprintf_info, fixup
conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo
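The feature-bit mechanism this commit relies on can be sketched as a bitmap in the file header: a writer sets one bit per optional section, and a reader prints only the sections whose bit is set, which is why files that lack the new bits still load. The HEADER_* names below come from the commit; their numeric values, the bitmap size and the struct and function names are assumptions for illustration only.

```c
#include <limits.h>
#include <stdbool.h>

/* Feature identifiers: names from the commit, numeric values illustrative only. */
enum {
	HEADER_HOSTNAME,
	HEADER_OSRELEASE,
	HEADER_ARCH,
	HEADER_FEAT_BITS = 256,   /* hypothetical bitmap size */
};

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define FEAT_LONGS ((HEADER_FEAT_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical header holding only the feature bitmap. */
struct header_sketch {
	unsigned long features[FEAT_LONGS];
};

/* Mark an optional section as present (writer side). */
static void header__set_feat(struct header_sketch *h, unsigned int feat)
{
	h->features[feat / BITS_PER_LONG] |= 1UL << (feat % BITS_PER_LONG);
}

/* Check whether an optional section is present (reader side). */
static bool header__has_feat(const struct header_sketch *h, unsigned int feat)
{
	return (h->features[feat / BITS_PER_LONG] >> (feat % BITS_PER_LONG)) & 1UL;
}
```

A reader that does not know a given bit simply never tests it, so adding small, independent entries does not disturb existing consumers.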
07 Oct, 2011
3 commits
-
This allows passing a timer to be run periodically, which will update
the hists tree that then gets refreshed on the screen, just like the
Live mode (symbol entries, annotation) we already have in 'perf top
--tui'.

Will be used by the new hist_entry/hists based 'top' tool.
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-2r44qd8oe4sagzcgoikl8qzc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
Just like --show-nr-samples, to help in diagnosing problems in the
tools.

Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-1lr7ejdjfvy2uwy2wkmatcpq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
-
So that we can reuse hists__fprintf in the new perf top tool.
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-huazw48x05h8r9niz5cf63za@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
30 Sep, 2011
2 commits
-
In the past we tried to avoid printing the name of the event when just
one event was found in the perf.data file, after some refactorings it
ended up not printing the event name if just one hist_entry was found in
one of the events.

Fix it by always printing the name of the event, even if just one is
found.

Reported-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Signed-off-by: Arnaldo Carvalho de Melo
-
Add -M option to report/annotate to pass directly to objdump. This
allows using -M intel for Intel style disassembler syntax, which is
useful for people who are very used to the Intel syntax.

Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org
[committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7]
Cc: Frederic Weisbecker
Cc: Stephane Eranian
Signed-off-by: Andi Kleen
Signed-off-by: Arnaldo Carvalho de Melo
03 Aug, 2011
1 commit
-
So that we get a proper warning in the TUI in cases like:

$ perf report --stdio -g fractal,0.5,caller --sort pid
Selected -g but no callchain data. Did you call 'perf record' without -g?
$

The --stdio case is ok because it uses fprintf, ui__warning is needed to
figure out if --stdio or --tui is being used.

Cc: Arun Sharma
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Sam Liao
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-ag9fz2wd17mbbfjsbznq1wms@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
05 Jul, 2011
1 commit
-
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C.

v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation

v3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard
Acked-by: David Ahern
Cc: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra
Cc: Paul Mackerras
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar
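Turning a CPU list into a bitmap for fast per-sample lookup, as this commit describes, could look like the sketch below. It is a standalone illustration that parses the list text directly; it does not use cpu_map__new or perf_session__cpu_bitmap(), and the function names, the list syntax accepted and the MAX_NR_CPUS bound are assumptions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NR_CPUS 256   /* assumed upper bound for this sketch */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define CPU_LONGS ((MAX_NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Parse a list such as "0-3,7" into a bitmap; returns 0 on success, -1 on bad input. */
static int cpu_list_to_bitmap(const char *list, unsigned long *bitmap)
{
	const char *p = list;

	memset(bitmap, 0, CPU_LONGS * sizeof(unsigned long));

	while (*p != '\0') {
		char *end;
		unsigned long first, last, cpu;

		if (*p < '0' || *p > '9')
			return -1;
		first = last = strtoul(p, &end, 10);

		/* Optional "-N" turns the single CPU into a range. */
		if (*end == '-') {
			p = end + 1;
			if (*p < '0' || *p > '9')
				return -1;
			last = strtoul(p, &end, 10);
		}
		if (first > last || last >= MAX_NR_CPUS)
			return -1;

		for (cpu = first; cpu <= last; cpu++)
			bitmap[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);

		if (*end == ',')
			end++;
		else if (*end != '\0')
			return -1;
		p = end;
	}
	return 0;
}

/* Constant-time membership test, suitable for filtering each sample. */
static bool cpu_is_selected(const unsigned long *bitmap, unsigned int cpu)
{
	if (cpu >= MAX_NR_CPUS)
		return false;
	return (bitmap[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

The parse cost is paid once, while the membership test runs for every sample, which is the reason a bitmap is preferable to scanning the list each time.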
30 Jun, 2011
2 commits
-
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However if parent filtering is used in combination with
explicit parent sorting (-s parent), we want to
display it.

Result with:
perf report -p kernel_thread -s parent
Before:
# Overhead Parent symbol
# ........ .............
#
0.07%
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helpe

After:
# Overhead Parent symbol
# ........ .............
#
0.07% kernel_thread_helper
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helper

Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Cc: Sam Liao
-
Add "caller/callee" option to support inverted butterfly report.
In the inverted report (with caller option), the call graph starts
from the callee's ancestor. Users can use such a view to catch the system's
performance bottleneck from a sysprof like view. Using this option
with a specified sort order like pid gives us a high level view of call
graph statistics.

Also add "-G" alias for inverted call graph.
Signed-off-by: Sam Liao
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Stephane Eranian
Cc: David Ahern
Signed-off-by: Frederic Weisbecker
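One way to picture the caller ordering is as the same chain of addresses walked from the outermost ancestor down to the sampled function. The sketch below simply reverses an array of addresses; it is an illustration of the ordering only, the function name is hypothetical, and perf may well build the inverted graph without reversing anything in memory.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reverse a callchain in place so that it starts at
 * the outermost caller instead of the sampled (innermost) function.
 */
static void callchain_reverse(uint64_t *ips, size_t nr)
{
	size_t i;

	for (i = 0; i < nr / 2; i++) {
		uint64_t tmp = ips[i];

		ips[i] = ips[nr - 1 - i];
		ips[nr - 1 - i] = tmp;
	}
}
```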
28 May, 2011
1 commit
-
Suggested-by: Ingo Molnar
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
26 May, 2011
1 commit
-
Perf uses /proc/modules to figure out where kernel modules are loaded.
With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.

So check if kptr_restrict is non zero and don't generate the synthetic
PERF_RECORD_MMAP events for them.

Warn the user about it in perf record and in perf report.

In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.

Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.

Restricted /proc/kallsyms don't go to the buildid cache anymore.

Example:
[acme@emilia ~]$ perf record -F 100000 sleep 1
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
/proc/sys/kernel/kptr_restrict.

Samples in kernel functions may not be resolved if a suitable vmlinux file is
not found in the buildid cache or in the vmlinux path.

Samples in kernel modules won't be resolved at all.

If some relocation was applied (e.g. kexec) symbols may be misresolved even
with a suitable vmlinux or kallsyms file.

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
[acme@emilia ~]$

[acme@emilia ~]$ perf report --stdio
Kernel address maps (/proc/{kallsyms,modules}) were restricted,
check /proc/sys/kernel/kptr_restrict before running 'perf record'.

If some relocation was applied (e.g. kexec) symbols may be misresolved.
Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. .....................
#
20.24% sleep [kernel.kallsyms] [k] page_fault
20.04% sleep [kernel.kallsyms] [k] filemap_fault
19.78% sleep [kernel.kallsyms] [k] __lru_cache_add
19.69% sleep ld-2.12.so [.] memcpy
14.71% sleep [kernel.kallsyms] [k] dput
4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers
0.73% sleep [kernel.kallsyms] [k] perf_event_comm
0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$

This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).

If we remove that file from the vmlinux path:
[root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
/lib/modules/2.6.39-rc7+/build/vmlinux.OFF
[acme@emilia ~]$ perf report --stdio
[kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
not found, continuing without symbols

Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
/proc/sys/kernel/kptr_restrict before running 'perf record'.

As no suitable kallsyms nor vmlinux was found, kernel samples can't be
resolved.

Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ......
#
80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a
19.69% sleep ld-2.12.so [.] memcpy
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$

Reported-by: Stephane Eranian
Suggested-by: David Miller
Cc: Dave Jones
Cc: David Miller
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Kees Cook
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Pekka Enberg
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
24 Mar, 2011
1 commit
-
Resolving the sample->id to an evsel since the most advanced tools,
report and annotate, need it, and the others will too when they evolve to
properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
10 Mar, 2011
1 commit
-
So that we can reuse things like the id to attr lookup routine
(perf_evlist__id2evsel) that uses a hash table instead of the linear
lookup done in the older perf_header_attr routines, etc.

Also to make evsels/evlist more pervasive an API, simplifying using the
emerging perf lib.

Cc: Arun Sharma
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
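The difference between a linear attribute scan and a hashed id lookup can be sketched as below. This is not the perf implementation: the struct names carry a _sketch suffix, the table size and hash function are assumptions, and only the function name evlist__id2evsel echoes the routine named in the commit.

```c
#include <stddef.h>
#include <stdint.h>

#define ID_HASH_BITS 8                      /* assumed table size: 256 buckets */
#define ID_HASH_SIZE (1u << ID_HASH_BITS)

/* Hypothetical stand-in for an event selector. */
struct evsel_sketch {
	const char *name;
};

/* One sample id mapped to its event selector, chained per bucket. */
struct id_entry {
	uint64_t id;
	struct evsel_sketch *evsel;
	struct id_entry *next;
};

struct evlist_sketch {
	struct id_entry *heads[ID_HASH_SIZE];
};

/* Illustrative hash: keep the low bits of the id. */
static unsigned int id_hash(uint64_t id)
{
	return (unsigned int)(id & (ID_HASH_SIZE - 1));
}

/* Register an id; the entry is owned by the caller. */
static void evlist__add_id(struct evlist_sketch *l, struct id_entry *e)
{
	unsigned int h = id_hash(e->id);

	e->next = l->heads[h];
	l->heads[h] = e;
}

/* Look an id up by walking only its bucket; NULL means the id is unknown. */
static struct evsel_sketch *evlist__id2evsel(const struct evlist_sketch *l, uint64_t id)
{
	const struct id_entry *e;

	for (e = l->heads[id_hash(id)]; e != NULL; e = e->next)
		if (e->id == id)
			return e->evsel;
	return NULL;
}
```

A NULL result doubles as the validation step: a sample carrying an id that was never registered can be detected instead of being attributed to the wrong event.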
07 Mar, 2011
2 commits
-
When multiple events were used in 'perf record', allow the user to
choose which one is wanted before showing the per event histograms.

Annotations will be performed on the chosen event.

Allow going back and forth from event to event quickly using just the
arrow keys and enter.

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: William Cohen
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
-
By creating a perf_evlist out of the attributes in the perf.data file
header, so that we can use evlists and evsels when reading recorded
sessions in addition to when we record sessions.

More work is needed to allow tools to let the user select which
events are wanted when browsing sessions, be it just one or a subset of
them, aggregated or shown at the same time but with different
indications on the UI to allow seeing workloads through different views at
the same time.

But the overall goal/trend is to more uniformly use evsels and evlists.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
18 Feb, 2011
1 commit
-
[root@emilia ~]# perf report --stdio
The perf.data file has no samples!
[root@emilia ~]#

The TUI shows a popup warning message with the same message.
Reported-by: Ingo Molnar
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Steven Rostedt
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
11 Feb, 2011
1 commit
-
We only allocate it when in TUI mode. In --stdio mode unconditionally
initializing this area leads to memory corruption.

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
09 Feb, 2011
1 commit
-
Since we'll need it when implementing the live annotate TUI browser.
This also simplifies things a bit by having the list head for the source
code be in the dynamically allocated part of struct annotation, that
way we don't have to pass it around, it can be found from the struct
symbol that is passed everywhere.

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
05 Feb, 2011
2 commits
-
The perf annotate tool continues aggregating everything on just one
histogram, but to support the top model add support for one histogram
per evsel in the evlist.

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
-
They will be used by perf top, so that we have just one set of routines
to do annotation.

Rename "struct sym_priv" to "struct annotation", etc, to clarify this
code a bit.

Rename "struct sym_ext" to "struct source_line", to give it a meaningful
name, that clarifies that it is the result of an addr2line call, that
is sorted by percentage one particular source code line appeared in the
annotation.

And since we're moving things around also rename 'sym_hist->ip' to
'sym_hist->addr' as we want to do data structure annotation at some
point.

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
01 Feb, 2011
1 commit
-
Because in tools like 'top' we don't want the pager.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
30 Jan, 2011
2 commits
-
And move the event_t methods to the perf_event__ too.
No code changes, just namespace consistency.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
-
Making the namespace more uniform.
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tom Zanussi
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
23 Jan, 2011
3 commits
-
To make the callchain API naming more consistent.
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo
-
The callchains are fed with an array of a fixed size.
As a result we iterate over each callchain three times:

- 1st to resolve symbols
- 2nd to filter out context boundaries
- 3rd for the insertion into the tree

This also involves some pairs of memory allocation/deallocation
every time we insert a callchain, for the filtered out array of
addresses and for the array of symbols that comes along.

Instead, feed the callchains through a linked list with persistent
allocations. It brings several pros like:

- Merge the 1st and 2nd iterations in one. That was possible before
  but in a way that would involve allocating an array slightly larger
  than necessary because we don't know in advance the number of context
  boundaries to filter out.

- Far fewer allocations/deallocations. The linked list keeps
  persistent empty entries for the next usages and is extendable at
  will.

- Makes it easier for multiple sources of callchains to feed a
  stacktrace together. This is meant to pave the way for cfi based
  callchains wherein traditional frame pointer based kernel
  stacktraces will precede cfi based user ones, producing an overall
  callchain whose size is hard to predict. This requirement
  makes the static array obsolete and makes a linked list based
  iterator a much more flexible fit.

Basic testing on a big perf file containing callchains (~ 176 MB)
has shown a throughput gain of about 11% with perf report.

Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo
-
Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.

Reported by Denis Kirjanov, who provided a patch for one case; I went
and changed all cases.

Reported-by: Denis Kirjanov
Tested-by: Denis Kirjanov
LKML-Reference:
Cc: Denis Kirjanov
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Pingtian Han
Cc: Stephane Eranian
Cc: Tom Zanussi
Signed-off-by: Arnaldo Carvalho de Melo
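Two ideas from the 23 Jan commits can be sketched together: a callchain cursor built as a linked list whose nodes persist between callchains, and portable printing of 64-bit values through PRIx64. The struct and function names below are illustrative rather than the perf definitions, and the sketch never frees its nodes, which a real implementation would do at teardown.

```c
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint64_t u64;   /* 64-bit type built on a stdint.h type */

/* One callchain entry; nodes stay allocated between callchains. */
struct cursor_node {
	u64 ip;
	struct cursor_node *next;
};

struct cursor {
	unsigned int nr;             /* entries valid for the current callchain */
	struct cursor_node *first;   /* head of the persistent list */
	struct cursor_node **last;   /* slot where the next entry goes */
};

/* Start a new callchain, keeping already allocated nodes for reuse. */
static void cursor_reset(struct cursor *c)
{
	c->nr = 0;
	c->last = &c->first;
}

/* Append one address, allocating only when no spare node is left. */
static int cursor_append(struct cursor *c, u64 ip)
{
	struct cursor_node *node = *c->last;

	if (node == NULL) {
		node = calloc(1, sizeof(*node));
		if (node == NULL)
			return -1;
		*c->last = node;
	}
	node->ip = ip;
	c->nr++;
	c->last = &node->next;
	return 0;
}

/* Format one entry with PRIx64 rather than a non-portable %Lx. */
static int cursor_node_snprintf(const struct cursor_node *node, char *buf, size_t size)
{
	return snprintf(buf, size, "%#" PRIx64, node->ip);
}
```

After the first few callchains the list is usually long enough, so appending becomes a plain store with no allocation, and consumers iterate over the first nr nodes only.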
22 Dec, 2010
2 commits
-
The symfs argument allows analysis of perf.data file using a locally accessible
filesystem tree with debug symbols - e.g., tree created during image builds,
sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
with an OS tree can be analyzed from anywhere without the need to populate a
local data store with build-ids.

Committer notes:
o Fixed up symfs="/" variants handling.
o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
outside the symfs directory.

LKML-Reference:
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo
-
This patch changes perf report to ask for the ID info on all events by
default if recording from multiple CPUs.

Perf report, annotate and diff will now process the events in order if
the kernel is able to provide timestamps on all events. This ensures
that events such as COMM and MMAP which are necessary to correctly
interpret samples are processed prior to those samples so that they are
attributed correctly.

Before:

# perf record ./cachetest
# perf report

# Events: 6K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
74.11% :3259 [unknown] [k] 0x4a6c
1.50% cachetest ld-2.11.2.so [.] 0x1777c
1.46% :3259 [kernel.kallsyms] [k] .perf_event_mmap_ctx
1.25% :3259 [kernel.kallsyms] [k] restore
0.74% :3259 [kernel.kallsyms] [k] ._raw_spin_lock
0.71% :3259 [kernel.kallsyms] [k] .filemap_fault
0.66% :3259 [kernel.kallsyms] [k] .memset
0.54% cachetest [kernel.kallsyms] [k] .sha_transform
0.54% :3259 [kernel.kallsyms] [k] .copy_4K_page
0.54% :3259 [kernel.kallsyms] [k] .find_get_page
0.52% :3259 [kernel.kallsyms] [k] .trace_hardirqs_off
0.50% :3259 [kernel.kallsyms] [k] .__do_fault

After:

# perf report

# Events: 6K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
44.28% cachetest cachetest [.] sumArrayNaive
22.53% cachetest cachetest [.] sumArrayOptimal
6.59% cachetest ld-2.11.2.so [.] 0x1777c
2.13% cachetest [unknown] [k] 0x340
1.46% cachetest [kernel.kallsyms] [k] .perf_event_mmap_ctx
1.25% cachetest [kernel.kallsyms] [k] restore
0.74% cachetest [kernel.kallsyms] [k] ._raw_spin_lock
0.71% cachetest [kernel.kallsyms] [k] .filemap_fault
0.66% cachetest [kernel.kallsyms] [k] .memset
0.54% cachetest [kernel.kallsyms] [k] .copy_4K_page
0.54% cachetest [kernel.kallsyms] [k] .find_get_page
0.54% cachetest [kernel.kallsyms] [k] .sha_transform
0.52% cachetest [kernel.kallsyms] [k] .trace_hardirqs_off
0.50% cachetest [kernel.kallsyms] [k] .__do_fault

Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
LKML-Reference:
Signed-off-by: Ian Munsie
Signed-off-by: Arnaldo Carvalho de Melo
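The effect of processing events in timestamp order can be illustrated with the sketch below, which sorts a small batch of events so that COMM and MMAP records are seen before the samples that depend on them. It is a simplification: the event struct is hypothetical, a plain qsort() stands in for whatever queueing perf actually uses, and the tie-break rule on equal timestamps is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum ev_type { EV_COMM, EV_MMAP, EV_SAMPLE };   /* illustrative subset */

/* Hypothetical event record carrying only what the ordering needs. */
struct ev {
	uint64_t time;        /* timestamp supplied by the kernel */
	enum ev_type type;
};

/* Order by timestamp; on a tie, let COMM/MMAP go before SAMPLE (assumed rule). */
static int ev_cmp(const void *a, const void *b)
{
	const struct ev *x = a, *y = b;

	if (x->time != y->time)
		return x->time < y->time ? -1 : 1;
	return (int)x->type - (int)y->type;
}

/* Sort a batch of events in place before handing them to the tool. */
static void sort_events(struct ev *evs, size_t nr)
{
	qsort(evs, nr, sizeof(*evs), ev_cmp);
}
```

With this ordering a sample is never processed before the COMM event that names its task, which matches the change from ":3259" to "cachetest" in the output above.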