24 Mar, 2012
1 commit
-
This renames for_each_set_bit_cont() to for_each_set_bit_from() because
it is analogous to list_for_each_entry_from() in list.h rather than
list_for_each_entry_continue().This doesn't remove for_each_set_bit_cont() for now.
Signed-off-by: Akinobu Mita
Cc: Robert Richter
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Mar, 2012
1 commit
-
Pull perf events changes for v3.4 from Ingo Molnar:
- New "hardware based branch profiling" feature both on the kernel and
the tooling side, on CPUs that support it. (modern x86 Intel CPUs
with the 'LBR' hardware feature currently.)This new feature is basically a sophisticated 'magnifying glass' for
branch execution - something that is pretty difficult to extract from
regular, function histogram centric profiles.The simplest mode is activated via 'perf record -b', and the result
looks like this in perf report:$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printfThis output shows from/to branch columns and shows the highest
percentage (from,to) jump combinations - i.e. the most likely taken
branches in the system. "branches" can also include function calls
and any other synchronous and asynchronous transitions of the
instruction pointer that are not 'next instruction' - such as system
calls, traps, interrupts, etc.This feature comes with (hopefully intuitive) flat ascii and TUI
support in perf report.- Various 'perf annotate' visual improvements for us assembly junkies.
It will now recognize function calls in the TUI and by hitting enter
you can follow the call (recursively) and back, amongst other
improvements.- Multiple threads/processes recording support in perf record, perf
stat, perf top - which is activated via a comma-list of PIDs:perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485- Support for per UID views, via the --uid paramter to perf top, perf
report, etc. For example 'perf top --uid mingo' will only show the
tasks that I am running, excluding other users, root, etc.- Jump label restructurings and improvements - this includes the
factoring out of the (hopefully much clearer) include/linux/static_key.h
generic facility:struct static_key key = STATIC_KEY_INIT_FALSE;
...
if (static_key_false(&key))
do unlikely code
else
do likely code...
static_key_slow_inc();
...
static_key_slow_inc();
...The static_key_false() branch will be generated into the code with as
little impact to the likely code path as possible. the
static_key_slow_*() APIs flip the branch via live kernel code patching.This facility can now be used more widely within the kernel to
micro-optimize hot branches whose likelihood matches the static-key
usage and fast/slow cost patterns.- SW function tracer improvements: perf support and filtering support.
- Various hardenings of the perf.data ABI, to make older perf.data's
smoother on newer tool versions, to make new features integrate more
smoothly, to support cross-endian recording/analyzing workflows
better, etc.- Restructuring of the kprobes code, the splitting out of 'optprobes',
and a corner case bugfix.- Allow the tracing of kernel console output (printk).
- Improvements/fixes to user-space RDPMC support, allowing user-space
self-profiling code to extract PMU counts without performing any
system calls, while playing nice with the kernel side.- 'perf bench' improvements
- ... and lots of internal restructurings, cleanups and fixes that made
these features possible. And, as usual this list is incomplete as
there were also lots of other improvements* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
perf report: Fix annotate double quit issue in branch view mode
perf report: Remove duplicate annotate choice in branch view mode
perf/x86: Prettify pmu config literals
perf report: Enable TUI in branch view mode
perf report: Auto-detect branch stack sampling mode
perf record: Add HEADER_BRANCH_STACK tag
perf record: Provide default branch stack sampling mode option
perf tools: Make perf able to read files from older ABIs
perf tools: Fix ABI compatibility bug in print_event_desc()
perf tools: Enable reading of perf.data files from different ABI rev
perf: Add ABI reference sizes
perf report: Add support for taken branch sampling
perf record: Add support for sampling taken branch
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
x86/kprobes: Split out optprobe related code to kprobes-opt.c
x86/kprobes: Fix a bug which can modify kernel code permanently
x86/kprobes: Fix instruction recovery on optimized path
perf: Add callback to flush branch_stack on context switch
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
perf/x86: Add LBR software filter support for Intel CPUs
...
14 Mar, 2012
4 commits
-
On ancient systems I get this build failure:
util/../../../arch/x86/include/asm/unistd.h:67:29: error: asm/unistd_64.h: No such file or directory
In file included from util/cache.h:7,
from builtin-test.c:8:
util/../perf.h: In function ‘sys_perf_event_open’:In file included from util/../perf.h:16
perf.h:170: error: ‘__NR_perf_event_open’ undeclared (first use in this function)The reason is that this old system does not have the split
unistd.h headers yet, from which to pick up the syscall
definitions.Add the syscall numbers to the already existing i386 and x86_64
blocks in perf.h, and also provide empty include file stubs.With this patch perf builds and works fine on 5 years old
user-space as well.Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Link: http://lkml.kernel.org/n/tip-jctwg64le1w47tuaoeyftsg9@git.kernel.org
Signed-off-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo -
Several places were expecting that the value returned was the number of
characters printed, not what would be printed if there was space.Fix it by using the scnprintf and vscnprintf variants we inherited from
the kernel sources.Some corner cases where the number of printed characters were not
accounted were fixed too.Reported-by: Anton Blanchard
Cc: Anton Blanchard
Cc: Eric B Munson
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Yanmin Zhang
Cc: stable@kernel.org
Link: http://lkml.kernel.org/n/tip-kwxo2eh29cxmd8ilixi2005x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
I have a workload where perf top scribbles over the stack and we SEGV.
What makes it interesting is that an snprintf is causing this.The workload is a c++ gem that has method names over 3000 characters
long, but snprintf is designed to avoid overrunning buffers. So what
went wrong?The problem is we assume snprintf returns the number of characters
written:ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
...
ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name);Unfortunately this is not how snprintf works. snprintf returns the
number of characters that would have been written if there was enough
space. In the above case, if the first snprintf returns a value larger
than size, we pass a negative size into the second snprintf and happily
scribble over the stack. If you have 3000 character c++ methods thats a
lot of stack to trample.This patch fixes repsep_snprintf by clamping the value at size - 1 which
is the maximum snprintf can write before adding the NULL terminator.I get the sinking feeling that there are a lot of other uses of snprintf
that have this same bug, we should audit them all.Cc: David Ahern
Cc: Eric B Munson
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Yanmin Zhang
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120307114249.44275ca3@kryten
Signed-off-by: Anton Blanchard
Signed-off-by: Arnaldo Carvalho de Melo -
This patch fixes a buffer overrun bug in
tracepoint_id_to_path(). The bug manisfested itself as a memory
error reported by perf record. I ran into it with perf sched:$ perf sched rec noploop 2 noploop for 2 seconds
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 42.701 MB perf.data (~1865622 samples) ]
Fatal: No memory to alloc tracepoints listIt turned out that tracepoint_id_to_path() was reading the
tracepoint id using read() but the buffer was not large enough
to include the \n terminator for id with 4 digits or more.The patch fixes the problem by extending the buffer to a more
reasonable size covering all possible id length include \n
terminator. Note that atoll() stops at the first non digit
character, thus it is not necessary to clear the buffer between
each read.Signed-off-by: Stephane Eranian
Acked-by: Arnaldo Carvalho de Melo
Acked-by: Peter Zijlstra
Cc: fweisbec@gmail.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/20120313155102.GA6465@quad
Signed-off-by: Ingo Molnar
13 Mar, 2012
3 commits
-
Merge reason: The 'perf record -b' hardware branch sampling feature is ready for upstream.
Signed-off-by: Ingo Molnar
-
This patch fixes perf report to not go back two levels when
pressing the 'q' key while annotating in branch view mode.When pressing 'q' in annotate mode and if the branch source
and target belong to different functions, perf now brings
up the annotation popup menu again to offer the option to
annotate the other branch source or target.As part of the code restructuring in perf_evsel__hists_browse()
we also fix a memory leak on options[] in case of error.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331565210-10865-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch removes the duplicated annotate selection when
browsing in branch view mode. If the sym and dso oof the branch
source and target are the same, then only one annotate choice is
proposed.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331565210-10865-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
09 Mar, 2012
10 commits
-
This patch updates perf report to support TUI mode
when the perf.data file contains samples with branch
stacks.For each row in the report, it is possible to annotate
either the source or target of each branch.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch enhances perf report to auto-detect when the
perf.data file contains samples with branch stacks. That way it
is not necessary to use the -b option.To force branch view mode to off, simply use --no-branch-stack.
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch adds a new feature bit, namely,
HEADER_BRANCH_STACK. When present, it indicates
that sample records may contain branch stack.This could be useful to a viewer to switch to
branch mode without having to parse all the
samples or without a specific cmdline option.This will be used in a subsequent patch to
enhance perf report with branch stacks.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch chanegs the logic of the -b, --branch-stack options
of perf record.Based on users' request, the patch provides a default filter
mode with the -b (or --branch-any) option. With the option,
any type of taken branches is sampled.With -j (or --branch-filter), the user can specify any
valid combination of branch types and privilege levels
if supported by the underlying hardware.The -b (--branch any) is a shortcut for: --branch-filter any.
$ perf record -b foo
or:
$ perf record --branch-filter any foo
For more specific filtering:
$ perf record --branch-filter ind_call,u foo
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patches provides a way to handle legacy perf.data
files. Legacy files are those using the older PERFFILE
signature.For those, it is still necessary to detect endianness but
without comparing their header->attr_size with the
tool's own version as it may be different. Instead, we use
a reference table for all known sizes from the legacy era.We try all the combinations for sizes and endianness. If we find
a match, we proceed, otherwise we return: "incompatible file
format".This is also done for the pipe-mode file format.
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-19-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patches cleans up local variable types for msz and ret.
They need to be size_t and ssize_t respectively.It also fixes a bug whereby perf would not read attr struct
with a different size than what it knows about.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-18-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch allows perf to process perf.data files generated
using an ABI that has a different perf_event_attr struct size,
i.e., a different ABI version.The perf_event_attr can be extended, yet perf needs to cope with
older perf.data files. Similarly, perf must be able to cope with
a perf.data file which is using a newer version of the ABI than
what it knows about.This patch adds read_attr(), a routine that reads a
perf_event_attr struct from a file incrementally based on its
advertised size. If the on-file struct is smaller than what perf
knows, then the extra fields are zeroed. If the on-file struct
is bigger, then perf only uses what it knows about, the rest is
skipped.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-17-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch adds support for taken branch sampling, i.e, the
PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
words, to display histograms based on taken branches rather
than executed instructions addresses.The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.The output shows symbols, modules, sorted by 'who branches
where' the most often. The percentages reported in the first
column refer to the total number of branches captured and
not the usual number of samples.Here is a quick example.
Here branchy is simple test program which looks as follows:void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
if (n & 1UL)
f2();
else
f3();
}
int main(void)
{
unsigned long i;for (i=0; i < N; i++)
f1(i);
return 0;
}Here is the output captured on Nehalem, if we are
only interested in user level function calls.$ perf record -b any_call,u -e cycles:u branchy
$ perf report -b --sort=symbol
52.34% [.] main [.] f1
24.04% [.] f1 [.] f3
23.60% [.] f1 [.] f2
0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow
0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn
0.01% [k] _IO_vfprintf_internal [k] strchrnul
0.01% [k] __printf [k] _IO_vfprintf_internal
0.01% [k] main [k] __printfAbout half (52%) of the call branches captured are from main()
-> f1(). The second half (24%+23%) is split in two equal shares
between f1() -> f2(), f1() ->f3(). The output is as expected
given the code.It should be noted, that using -b in perf record does not
eliminate information in the perf.data file. Consequently, a
typical profile can also be obtained by perf report by simply
not using its -b option.It is possible to sort on branch related columns:
- dso_from, symbol_from
- dso_to, symbol_to
- mispredictSigned-off-by: Roberto Agostino Vitillo
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch adds a new option to enable taken branch stack
sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
of perf_events.There is a new option to active this mode: -b.
It is possible to pass a set of filters to select the type of
branches to sample.The following filters are available:
- any : any type of branches
- any_call : any function call or system call
- any_ret : any function return or system call return
- any_ind : any indirect branch
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the branch target is in the hypervisorFilters can be combined by passing a comma separated list
to the option:$ perf record -b any_call,u -e cycles:u branchy
Signed-off-by: Roberto Agostino Vitillo
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch adds:
- ability to parse samples with PERF_SAMPLE_BRANCH_STACK
- sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
- build histograms on branchesSigned-off-by: Roberto Agostino Vitillo
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
05 Mar, 2012
8 commits
-
If perf.data couldn't find vmlinux image for the given build-id,
it would print error message. However it lacked a newline at the
end, so the output looked like below:$ perf annotate --stdio
No vmlinux file with build id 63b554b2e90f14a4bced200008865e757d3e8b36
was found in the path.Please use:
perf buildid-cache -av vmlinux
or:
--vmlinux vmlinux Percent | Source code & Disassembly of a.out
------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 00000000004004f4 :
0.00 : 4004f4: push %rbp
0.00 : 4004f5: mov %rsp,%rbp
0.00 : 4004f8: movl $0x0,-0x4(%rbp)
0.00 : 4004ff: jmp 400517
14.70 : 400501: mov 0x200b28(%rip),%rax # 601030
0.02 : 400508: add $0x1,%rax
0.01 : 40050c: mov %rax,0x200b1d(%rip) # 601030
0.01 : 400513: addl $0x1,-0x4(%rbp)
13.92 : 400517: cmpl $0x98967f,-0x4(%rbp)
71.33 : 40051e: jle 400501
0.00 : 400520: leaveq
0.00 : 400521: retqFix it by adding a newline at the end of the message. It doesn't affect
the tui output AFAICS. New output will look like this:...
or:--vmlinux vmlinux
Percent | Source code & Disassembly of a.out
------------------------------------------------
:
:
:
: Disassembly of section .text:
...Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329986784-4916-6-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Separate multiple binding using /, capitalize descriptions, add missing
key binding.Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329986784-4916-5-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
On tui annotation, the title was set to name of the target symbol if
user selects the target. However it remained after returning to original
symbol from the target. Fix it.Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329986784-4916-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Accepting upper case character only is unconvenient since it requires
SHIFT key too. Why not change to it accept a simple key stroke?Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329986784-4916-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Print unselected asm code lines as blue. This is what we do now for
--stdio.Cc: Ingo Molnar
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329986784-4916-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
There are some variable arguments can be specified on make invocation,
but some of them are missing descriptions so that user cannot be
informed easily. Fix it.Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329980894-4289-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
If perf_evsel__open() failed, the errno was set and returned properly.
However since the perf_evlist__open() called close() on fd's for all of
evsel x cpu x thread after the failure, the errno was overridden by
other code (EBADF). So the caller of the function ended up seeing
different error message and getting confused.Fit it by restoring original return value. Because one of caller of the
function is in the python extension, and it uses system errno
internally, it'd be better restoring the original value rather than
using the return value of the function directly, IMHO (i.e. I'm not a
python expert :)Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329966816-23175-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo -
Conflicts:
tools/perf/builtin-record.c
tools/perf/builtin-top.c
tools/perf/perf.h
tools/perf/util/top.hMerge reason: resolve these cherry-picking conflicts.
Signed-off-by: Ingo Molnar
03 Mar, 2012
3 commits
-
Just fall back to resetting those fields, if set, warning the user that
that feature is not available.If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.Reported-by: Namhyung Kim
Tested-by: Joerg Roedel
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Joerg Roedel
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Setting perf_guest to true by default makes no sense because the perf
subcommands can not setup guest symbol information and thus not process
and guest samples. The only exception is perf-kvm which changes the
perf_guest value on its own. So change the default for perf_guest back
to false.Cc: David Ahern
Cc: Ingo Molnar
Cc: Jason Wang
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel
Signed-off-by: Arnaldo Carvalho de Melo -
A recent refactoring of perf-record introduced the following:
perf record -a -B
Couldn't generating buildids. Use --no-buildid to profile anyway.
sleep: TerminatedI believe the triple negative was meant to be only a double negative.
:-) While I'm there, fixed the grammar on the error message.Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo
01 Mar, 2012
4 commits
-
The 'perf probe' command allows kprobe to be inserted at any offset from
a function start, which results in adding kprobes to unintended
location. (example: perf probe do_fork+10000 is allowed even though
size of do_fork is ~904).My previous patch https://lkml.org/lkml/2012/2/24/42 addressed the case
where DWARF info was available for the kernel. This patch fixes the
case where perf probe is used on a kernel without debuginfo available.Acked-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Jason Baron
Cc: Masami Hiramatsu
Cc: Srikar Dronamraju
Cc: Steven Rostedt
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/4F4C544D.1010909@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa
Signed-off-by: Arnaldo Carvalho de Melo -
If threads in a multi-threaded process have names shorter than the main
thread the comm for the named threads is not properly terminated.E.g., for the process 'namedthreads' where each thread is named noploop%d
where %d is the thread number:Before:
perf script -f comm,tid,ip,sym,dso
noploop:4ads 21616 400a49 noploop (/tmp/namedthreads)
The 'ads' in the thread comm bleeds over from the process name.After:
perf script -f comm,tid,ip,sym,dso
noploop:4 21616 400a49 noploop (/tmp/namedthreads)Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1330111898-68071-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern
Signed-off-by: Arnaldo Carvalho de Melo -
The perf probe command allows kprobe to be inserted at any offset from a
function start, which results in adding kprobes to unintended location.Example: perf probe do_fork+10000 is allowed even though size of do_fork
is ~904.This patch will ensure probe addition fails when the offset specified is
greater than size of the function.Acked-by: Masami Hiramatsu
Cc: Ananth N Mavinakayanahalli
Cc: Srikar Dronamraju
Cc: Steven Rostedt
Cc: Andrew Morton
Cc: Jason Baron
Cc: Masami Hiramatsu
Link: http://lkml.kernel.org/r/4F473F33.4060409@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa
Signed-off-by: Arnaldo Carvalho de Melo -
On old kernels that don't support sample_id_all feature,
perf_evlist__id2evsel() returns NULL for non-sampling events.This breaks perf top when multiple events are given on command line. Fix
it by using first evsel in the evlist. This will also prevent getting
the same (potential) problem in such new tool/ old kernel combo.Suggested-by: Arnaldo Carvalho de Melo
Cc: Ingo Molnar
Cc: Paul Mackerras
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1329702447-25045-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo
22 Feb, 2012
1 commit
-
The following commit:
b52956c perf tools: Allow multiple threads or processes in record, stat, topintroduced a bug in the thread_map code which caused perf record -a to
not setup system-wide monitoring properly.$ taskset -c 1 noploop 1000 &
$ perf record -a -C 1 sleep 10
$ perf report -D | tail -20
cycles stats:
TOTAL events: 4413
MMAP events: 4025
COMM events: 340
SAMPLE events: 48Here I was expecting about 10,000 samples and not 48.
In system-wide mode, the PID passed to perf_event_open() must be -1 and
it was 0. That caused the kernel to setup a per-process event on PID:0.
Consequently, the number of samples captured does not correspond to the
requested measurement.The following one-liner fixes the problem for me with or without -C.
I would also suggest to change the malloc() to something that matches
the struct definition. thread_map->map[] is declared as int map[] and
not pid_t map[]. If map[] can only contain pids, then change the struct
definition.Acked-by: David Ahern
Cc: David Ahern
Cc: Eric Dumazet
Cc: Ingo Molnar
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120221145424.GA6757@quad
Signed-off-by: Stephane Eranian
Signed-off-by: Arnaldo Carvalho de Melo
18 Feb, 2012
2 commits
-
tools/perf/util/probe-event.c included 'string.h' twice, remove the
duplicate.Acked-by: Masami Hiramatsu
Cc: Danny Kukawka
Cc: Ingo Molnar
Cc: Jovi Zhang
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/1329400459-31570-1-git-send-email-danny.kukawka@bisect.de
Signed-off-by: Danny Kukawka
Signed-off-by: Arnaldo Carvalho de Melo -
The __print_symbolic() function takes a sequence of key-value pairs for
pretty-printing a constant. The new kvm:kvm_exit print fmt uses the
expression:__print_symbolic(..., { 0x040 + 1, "DB excp" }, ...)
Currently only atoms are supported and this print fmt fails to parse.
This patch adds support for expressions instead of just atoms so that
0x040 + 1 is parsed successfully. Also add arg_num_eval() support for
the '+' operator.Acked-by: Steven Rostedt
Cc: Ingo Molnar
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/1315148939-14313-1-git-send-email-stefanha@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi
Signed-off-by: Arnaldo Carvalho de Melo
15 Feb, 2012
2 commits
-
Instead of requiring that users of perf_record_opts set
.sample_id_all_avail to true, just invert the logic, using
.sample_id_all_missing, that doesn't need to be explicitely initialized
since gcc will zero members ommitted in a struct initialization.Just like the newly introduced .exclude_{guest,host} feature test.
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo -
Just fall back to resetting those fields, if set, warning the user that
that feature is not available.If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.Reported-by: Namhyung Kim
Tested-by: Joerg Roedel
Cc: David Ahern
Cc: Frederic Weisbecker
Cc: Joerg Roedel
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Stephane Eranian
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
14 Feb, 2012
1 commit
-
The perf_event_attr size needs to be initialized in all cases because it
captures the ABI version.This patch moves the initialization of the field from the
perf_event_open() syscall stub to its proper location in the
event_attr_init().Cc: Ingo Molnar
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120209151238.GA10272@quad
Signed-off-by: Stephane Eranian
Signed-off-by: Arnaldo Carvalho de Melo