02 May, 2013
1 commit
-
The full dynticks tree needs the latest RCU and sched
upstream updates in order to fix some dependencies.Merge a common upstream merge point that has these
updates.Conflicts:
include/linux/perf_event.h
kernel/rcutree.h
kernel/rcutree_plugin.hSigned-off-by: Frederic Weisbecker
23 Apr, 2013
1 commit
-
Provide a new helper that help full dynticks CPUs to prevent
from stopping their tick in case there are events in the local
rotation list.This way we make sure that perf_event_task_tick() is serviced
on demand.Signed-off-by: Frederic Weisbecker
Cc: Chris Metcalf
Cc: Christoph Lameter
Cc: Geoff Levand
Cc: Gilad Ben Yossef
Cc: Hakan Akkan
Cc: Ingo Molnar
Cc: Kevin Hilman
Cc: Li Zhong
Cc: Oleg Nesterov
Cc: Paul E. McKenney
Cc: Paul Gortmaker
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: Stephane Eranian
Cc: Jiri Olsa
08 Apr, 2013
1 commit
-
Pull IBM zEnterprise EC12 support patchlet from Robert Richter.
Signed-off-by: Ingo Molnar
01 Apr, 2013
3 commits
-
This patch adds PERF_SAMPLE_DATA_SRC.
PERF_SAMPLE_DATA_SRC collects the data source, i.e., where
did the data associated with the sampled instruction
come from. Information is stored in a perf_mem_data_src
structure. It contains opcode, mem level, tlb, snoop,
lock information, subject to availability in hardware.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo -
For some events it's useful to weight sample with a hardware
provided number. This expresses how expensive the action the
sample represent was. This allows the profiler to scale
the samples to be more informative to the programmer.There is already the period which is used similarly, but it
means something different, so I chose to not overload it.
Instead a new sample type for WEIGHT is added.Can be used for multiple things. Initially it is used for TSX
abort costs and profiling by memory latencies (so to make
expensive load appear higher up in the histograms). The concept
is quite generic and can be extended to many other kinds of
events or architectures, as long as the hardware provides
suitable auxillary values. In principle it could be also used
for software tracepoints.This adds the generic glue. A new optional sample format for a
64-bit weight value.Signed-off-by: Andi Kleen
Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo -
This patch adds a flags field to each event constraint.
It can be used to store event specific features which can
then later be used by scheduling code or low-level x86 code.The flags are propagated into event->hw.flags during the
get_event_constraint() call. They are cleared during the
put_event_constraint() call.This mechanism is going to be used by the PEBS-LL patches.
It avoids defining yet another table to hold event specific
information.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
Signed-off-by: Arnaldo Carvalho de Melo
27 Mar, 2013
1 commit
-
This patch extends Jiri's changes to make generic
events mapping visible via sysfs. The patch extends
the mechanism to non-generic events by allowing
the mappings to be hardcoded in strings.This mechanism will be used by the PEBS-LL patch
later on.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Cc: acme@redhat.com
Cc: jolsa@redhat.com
Cc: namhyung.kim@lge.com
Link: http://lkml.kernel.org/r/1359040242-8269-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
[ fixed up conflict with 2663960 "perf: Make EVENT_ATTR global" ]
Signed-off-by: Arnaldo Carvalho de Melo
18 Mar, 2013
1 commit
-
Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:arch/x86/power/built-in.o: In function `restore_processor_state':
(.text+0x45c): undefined reference to `perf_restore_debug_store'Fix it by defining the dummy function appropriately.
Signed-off-by: David Rientjes
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds
16 Mar, 2013
1 commit
-
This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.Reported-by: Linus Torvalds
Signed-off-by: Stephane Eranian
Signed-off-by: Linus Torvalds
06 Mar, 2013
1 commit
-
Move struct perf_cgroup_info and perf_cgroup to
kernel/perf/core.c, and then we can remove include of cgroup.h.Signed-off-by: Li Zefan
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Tejun Heo
Link: http://lkml.kernel.org/r/513568A0.6020804@huawei.com
Signed-off-by: Ingo Molnar
09 Feb, 2013
1 commit
-
sys_perf_event_open()->perf_init_event(event) is called before
find_get_context(event), this means that event->ctx == NULL when
class->reg(TRACE_REG_PERF_REGISTER/OPEN) is called and thus it
can't know if this event is per-task or system-wide.This patch adds hw_perf_event->tp_target for PERF_TYPE_TRACEPOINT,
this is analogous to PERF_TYPE_BREAKPOINT/bp_target we already have.
The patch also moves ->bp_target up so that it can overlap with the
new member, this can help the compiler to generate the better code.trace_uprobe_register() will use it for prefiltering to avoid the
unnecessary breakpoints in mm's we do not want to trace.->tp_target doesn't have its own reference, but we can rely on the
fact that either sys_perf_event_open() holds a reference, or it is
equal to event->ctx->task. So this pointer is always valid until
free_event().Also add the "struct list_head tp_list" into this union. It is not
strictly necessary, but it can simplify the next changes and we can
add it for free.Signed-off-by: Oleg Nesterov
01 Feb, 2013
1 commit
-
Rename EVENT_ATTR() to PMU_EVENT_ATTR() and make it global so it is
available to all architectures.Further to allow architectures flexibility, have PMU_EVENT_ATTR() pass
in the variable name as a parameter.Changelog[v2]
- [Jiri Olsa] No need to define PMU_EVENT_PTR()Signed-off-by: Sukadev Bhattiprolu
Acked-by: Jiri Olsa
Cc: Andi Kleen
Cc: Anton Blanchard
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20130123062422.GC13720@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
24 Oct, 2012
2 commits
-
The perf_cpu_notifier() macro invokes smp_processor_id()
multiple times. Optimize it by using a local variable.Signed-off-by: Srivatsa S. Bhat
Reviewed-by: Paul E. McKenney
Cc: peterz@infradead.org
Cc: acme@ghostprotocols.net
Link: http://lkml.kernel.org/r/20121016075817.3572.76733.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar -
The CPU_STARTING notifiers are supposed to be run with irqs
disabled. But the perf_cpu_notifier() macro invokes them without
doing that. Fix it.Signed-off-by: Srivatsa S. Bhat
Reviewed-by: Paul E. McKenney
Cc: peterz@infradead.org
Cc: acme@ghostprotocols.net
Link: http://lkml.kernel.org/r/20121016075809.3572.47848.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar
13 Oct, 2012
1 commit
-
Signed-off-by: David Howells
Acked-by: Arnd Bergmann
Acked-by: Thomas Gleixner
Acked-by: Michael Kerrisk
Acked-by: Paul E. McKenney
Acked-by: Dave Jones
05 Oct, 2012
1 commit
-
Stephane thought the perf_cpu_context::active_pmu name confusing and
suggested using 'unique_pmu' instead.This pointer is a pointer to a 'random' pmu sharing the cpuctx
instance, therefore limiting a for_each_pmu loop to those where
cpuctx->unique_pmu matches the pmu we get a loop over unique cpuctx
instances.Suggested-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-kxyjqpfj2fn9gt7kwu5ag9ks@git.kernel.org
Signed-off-by: Ingo Molnar
02 Oct, 2012
1 commit
-
Pull perf update from Ingo Molnar:
"Lots of changes in this cycle as well, with hundreds of commits from
over 30 contributors. Most of the activity was on the tooling side.Higher level changes:
- New 'perf kvm' analysis tool, from Xiao Guangrong.
- New 'perf trace' system-wide tracing tool
- uprobes fixes + cleanups from Oleg Nesterov.
- Lots of patches to make perf build on Android out of box, from
Irina Tirdea- Extend ftrace function tracing utility to be more dynamic for its
users. It allows for data passing to the callback functions, as
well as reading regs as if a breakpoint were to trigger at function
entry.The main goal of this patch series was to allow kprobes to use
ftrace as an optimized probe point when a probe is placed on an
ftrace nop. With lots of help from Masami Hiramatsu, and going
through lots of iterations, we finally came up with a good
solution.- Add cpumask for uncore pmu, use it in 'stat', from Yan, Zheng.
- Various tracing updates from Steve Rostedt
- Clean up and improve 'perf sched' performance by elliminating lots
of needless calls to libtraceevent.- Event group parsing support, from Jiri Olsa
- UI/gtk refactorings and improvements from Namhyung Kim
- Add support for non-tracepoint events in perf script python, from
Feng Tang- Add --symbols to 'script', similar to the one in 'report', from
Feng Tang.Infrastructure enhancements and fixes:
- Convert the trace builtins to use the growing evsel/evlist
tracepoint infrastructure, removing several open coded constructs
like switch like series of strcmp to dispatch events, etc.
Basically what had already been showcased in 'perf sched'.- Add evsel constructor for tracepoints, that uses libtraceevent just
to parse the /format events file, use it in a new 'perf test' to
make sure the libtraceevent format parsing regressions can be more
readily caught.- Some strange errors were happening in some builds, but not on the
next, reported by several people, problem was some parser related
files, generated during the build, didn't had proper make deps, fix
from Eric Sandeen.- Introduce struct and cache information about the environment where
a perf.data file was captured, from Namhyung Kim.- Fix handling of unresolved samples when --symbols is used in
'report', from Feng Tang.- Add union member access support to 'probe', from Hyeoncheol Lee.
- Fixups to die() removal, from Namhyung Kim.
- Render fixes for the TUI, from Namhyung Kim.
- Don't enable annotation in non symbolic view, from Namhyung Kim.
- Fix pipe mode in 'report', from Namhyung Kim.
- Move related stats code from stat to util/, will be used by the
'stat' kvm tool, from Xiao Guangrong.- Remove die()/exit() calls from several tools.
- Resolve vdso callchains, from Jiri Olsa
- Don't pass const char pointers to basename, so that we can
unconditionally use libgen.h and thus avoid ifdef BIONIC lines,
from David Ahern- Refactor hist formatting so that it can be reused with the GTK
browser, From Namhyung Kim- Fix build for another rbtree.c change, from Adrian Hunter.
- Make 'perf diff' command work with evsel hists, from Jiri Olsa.
- Use the only field_sep var that is set up: symbol_conf.field_sep,
fix from Jiri Olsa.- .gitignore compiled python binaries, from Namhyung Kim.
- Get rid of die() in more libtraceevent places, from Namhyung Kim.
- Rename libtraceevent 'private' struct member to 'priv' so that it
works in C++, from Steven Rostedt- Remove lots of exit()/die() calls from tools so that the main perf
exit routine can take place, from David Ahern- Fix x86 build on x86-64, from David Ahern.
- {int,str,rb}list fixes from Suzuki K Poulose
- perf.data header fixes from Namhyung Kim
- Allow user to indicate objdump path, needed in cross environments,
from Maciek Borzecki- Fix hardware cache event name generation, fix from Jiri Olsa
- Add round trip test for sw, hw and cache event names, catching the
problem Jiri fixed, after Jiri's patch, the test passes
successfully.- Clean target should do clean for lib/traceevent too, fix from David
Ahern- Check the right variable for allocation failure, fix from Namhyung
Kim- Set up evsel->tp_format regardless of evsel->name being set
already, fix from Namhyung Kim- Oprofile fixes from Robert Richter.
- Remove perf_event_attr needless version inflation, from Jiri Olsa
- Introduce libtraceevent strerror like error reporting facility,
from Namhyung Kim- Add pmu mappings to perf.data header and use event names from cmd
line, from Robert Richter- Fix include order for bison/flex-generated C files, from Ben
Hutchings- Build fixes and documentation corrections from David Ahern
- Assorted cleanups from Robert Richter
- Let O= makes handle relative paths, from Steven Rostedt
- perf script python fixes, from Feng Tang.
- Initial bash completion support, from Frederic Weisbecker
- Allow building without libelf, from Namhyung Kim.
- Support DWARF CFI based unwind to have callchains when %bp based
unwinding is not possible, from Jiri Olsa.- Symbol resolution fixes, while fixing support PPC64 files with an
.opt ELF section was the end goal, several fixes for code that
handles all architectures and cleanups are included, from Cody
Schafer.- Assorted fixes for Documentation and build in 32 bit, from Robert
Richter- Cache the libtraceevent event_format associated to each evsel
early, so that we avoid relookups, i.e. calling pevent_find_event
repeatedly when processing tracepoint events.[ This is to reduce the surface contact with libtraceevents and
make clear what is that the perf tools needs from that lib: so
far parsing the common and per event fields. ]- Don't stop the build if the audit libraries are not installed, fix
from Namhyung Kim.- Fix bfd.h/libbfd detection with recent binutils, from Markus
Trippelsdorf.- Improve warning message when libunwind devel packages not present,
from Jiri Olsa"* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits)
perf trace: Add aliases for some syscalls
perf probe: Print an enum type variable in "enum variable-name" format when showing accessible variables
perf tools: Check libaudit availability for perf-trace builtin
perf hists: Add missing period_* fields when collapsing a hist entry
perf trace: New tool
perf evsel: Export the event_format constructor
perf evsel: Introduce rawptr() method
perf tools: Use perf_evsel__newtp in the event parser
perf evsel: The tracepoint constructor should store sys:name
perf evlist: Introduce set_filter() method
perf evlist: Renane set_filters method to apply_filters
perf test: Add test to check we correctly parse and match syscall open parms
perf evsel: Handle endianity in intval method
perf evsel: Know if byte swap is needed
perf tools: Allow handling a NULL cpu_map as meaning "all cpus"
perf evsel: Improve tracepoint constructor setup
tools lib traceevent: Fix error path on pevent_parse_event
perf test: Fix build failure
trace: Move trace event enable from fs_initcall to core_initcall
tracing: Add an option for disabling markers
...
13 Sep, 2012
1 commit
-
Current implementation simply ignores attribute flags. Thus, there is
no notification to userland of unsupported features. Check syscall's
attribute flags to let userland know if a feature is supported by the
kernel. This is also needed to distinguish between future kernels what
might support a feature.Cc: v3.5..
Signed-off-by: Robert Richter
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com
Signed-off-by: Ingo Molnar
04 Sep, 2012
2 commits
-
While debugging a warning message on PowerPC while using hardware
breakpoints, it was discovered that when perf_event_disable is invoked
through hw_breakpoint_handler function with interrupts disabled, a
subsequent IPI in the code path would trigger a WARN_ON_ONCE message in
smp_call_function_single function.This patch calls __perf_event_disable() when interrupts are already
disabled, instead of perf_event_disable().Reported-by: Edjunior Barbosa Machado
Signed-off-by: K.Prasad
[naveen.n.rao@linux.vnet.ibm.com: v3: Check to make sure we target current task]
Signed-off-by: Naveen N. Rao
Acked-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120802081635.5811.17737.stgit@localhost.localdomain
[ Fixed build error on MIPS. ]
Signed-off-by: Ingo Molnar -
Don't mess with file refcounts (or keep a reference to file, for
that matter) in perf_event. Use explicit refcount of its own
instead. Deal with the race between the final reference to event
going away and new children getting created for it by use of
atomic_long_inc_not_zero() in inherit_event(); just have the
latter free what it had allocated and return NULL, that works
out just fine (children of siblings of something doomed are
created as singletons, same as if the child of leader had been
created and immediately killed).Signed-off-by: Al Viro
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120820135925.GG23464@ZenIV.linux.org.uk
Signed-off-by: Ingo Molnar
23 Aug, 2012
1 commit
-
Stashing version 4 under version 3 and removing version 4, because both
version changes were within single patchset.Reported-by: Peter Zijlstra
Signed-off-by: Jiri Olsa
Acked-by: Peter Zijlstra
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: H. Peter Anvin
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/20120822083540.GB1003@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo
10 Aug, 2012
5 commits
-
Introducing following bits to the the perf_event_attr struct:
- exclude_callchain_kernel to filter out kernel callchain
from the sample dump- exclude_callchain_user to filter out user callchain
from the sample dumpWe need to be able to disable standard user callchain dump when we use
the dwarf cfi callchain mode, because frame pointer based user
callchains are useless in this mode.Implementing also exclude_callchain_kernel to have complete set of
options.Signed-off-by: Jiri Olsa
[ Added kernel callchains filtering ]
Cc: "Frank Ch. Eigler"
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1344345647-11536-7-git-send-email-jolsa@redhat.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Introducing PERF_SAMPLE_STACK_USER sample type bit to trigger the dump
of the user level stack on sample. The size of the dump is specified by
sample_stack_user value.Being able to dump parts of the user stack, starting from the stack
pointer, will be useful to make a post mortem dwarf CFI based stack
unwinding.Added HAVE_PERF_USER_STACK_DUMP config option to determine if the
architecture provides user stack dump on perf event samples. This needs
access to the user stack pointer which is not unified across
architectures. Enabling this for x86 architecture.Signed-off-by: Jiri Olsa
Original-patch-by: Frederic Weisbecker
Cc: "Frank Ch. Eigler"
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1344345647-11536-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Introducing perf_output_skip function to be able to skip data within the
perf ring buffer.When writing data into perf ring buffer we first reserve needed place in
ring buffer and then copy the actual data.There's a possibility we won't be able to fill all the reserved size
with data, so we need a way to skip the remaining bytes.This is going to be useful when storing the user stack dump, where we
might end up with less data than we originally requested.Signed-off-by: Jiri Olsa
Acked-by: Frederic Weisbecker
Cc: "Frank Ch. Eigler"
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1344345647-11536-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo -
Adding a generic way to use __output_copy function with specific copy
function via DEFINE_PERF_OUTPUT_COPY macro.Using this to add new __output_copy_user function, that provides output
copy from user pointers. For x86 the copy_from_user_nmi function is used
and __copy_from_user_inatomic for the rest of the architectures.This new function will be used in user stack dump on sample, coming in
next patches.Signed-off-by: Jiri Olsa
Cc: "Frank Ch. Eigler"
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1344345647-11536-4-git-send-email-jolsa@redhat.com
Signed-off-by: Frederic Weisbecker
Signed-off-by: Arnaldo Carvalho de Melo -
Introducing PERF_SAMPLE_REGS_USER sample type bit to trigger the dump of
user level registers on sample. Registers we want to dump are specified
by sample_regs_user bitmask.Only user level registers are dumped at the moment. Meaning the register
values of the user space context as it was before the user entered the
kernel for whatever reason (syscall, irq, exception, or a PMI happening
in userspace).The layout of the sample_regs_user bitmap is described in
asm/perf_regs.h for archs that support register dump.This is going to be useful to bring Dwarf CFI based stack unwinding on
top of samples.Original-patch-by: Frederic Weisbecker
[ Dump registers ABI specification. ]
Signed-off-by: Jiri Olsa
Suggested-by: Stephane Eranian
Cc: "Frank Ch. Eigler"
Cc: Arun Sharma
Cc: Benjamin Redelings
Cc: Corey Ashford
Cc: Cyrill Gorcunov
Cc: Frank Ch. Eigler
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Tom Zanussi
Cc: Ulrich Drepper
Link: http://lkml.kernel.org/r/1344345647-11536-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo
31 Jul, 2012
1 commit
-
A few events are interesting not only for a current task.
For example, sched_stat_* events are interesting for a task
which wakes up. For this reason, it will be good if such
events will be delivered to a target task too.Now a target task can be set by using __perf_task().
The original idea and a draft patch belongs to Peter Zijlstra.
I need these events for profiling sleep times. sched_switch is used for
getting callchains and sched_stat_* is used for getting time periods.
These events are combined in user space, then it can be analyzed by
perf tools.Inspired-by: Peter Zijlstra
Cc: Steven Rostedt
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Steven Rostedt
Cc: Arun Sharma
Signed-off-by: Andrew Vagin
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342016098-213063-1-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar
18 Jun, 2012
1 commit
-
Originally from Peter Zijlstra. The helper migrates perf events
from one cpu to another cpu.Signed-off-by: Zheng Yan
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1339741902-8449-5-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar
06 Jun, 2012
2 commits
-
The rdpmc instruction is faster than the equivelant rdmsr call,
so use it when possible in the kernel.The perfctr kernel patches did this, after extensive testing showed
rdpmc to always be faster (One can look in etc/costs in the perfctr-2.6
package to see a historical list of the overhead).I have done some tests on a 3.2 kernel, the kernel module I used
was included in the first posting of this patch:rdmsr rdpmc
Core2 T9900: 203.9 cycles 30.9 cycles
AMD fam0fh: 56.2 cycles 9.8 cycles
Atom 6/28/2: 129.7 cycles 50.6 cyclesThe speedup of using rdpmc is large.
[ It's probably possible (and desirable) to do this without
requiring a new field in the hw_perf_event structure, but
the fixed events make this tricky. ]Signed-off-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1203011724030.26934@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar -
Stack depth of 255 seems excessive, given that copy_from_user_nmi()
could be slow.Signed-off-by: Arun Sharma
Cc: Linus Torvalds
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1334961696-19580-3-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar
31 May, 2012
1 commit
-
We faced segmentation fault on perf top -G at very high sampling rate
due to a corrupted callchain. While the root cause was not revealed (I
failed to figure it out), this patch tries to protect us from the
segfault on such cases.Reported-by: Arnaldo Carvalho de Melo
Signed-off-by: Namhyung Kim
Cc: Namhyung Kim
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Sunjin Yang
Link: http://lkml.kernel.org/r/1338443007-24857-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo
23 May, 2012
1 commit
-
This reverts commit cb04ff9ac424 ("sched, perf: Use a single
callback into the scheduler").Before this change was introduced, the process switch worked
like this (wrt. to perf event schedule):schedule (prev, next)
- schedule out all perf events for prev
- switch to next
- schedule in all perf events for current (next)After the commit, the process switch looks like:
schedule (prev, next)
- schedule out all perf events for prev
- schedule in all perf events for (next)
- switch to nextThe problem is, that after we schedule perf events in, the pmu
is enabled and we can receive events even before we make the
switch to next - so "current" still being prev process (event
SAMPLE data are filled based on the value of the "current"
process).Thats exactly what we see for test__PERF_RECORD test. We receive
SAMPLES with PID of the process that our tracee is scheduled
from.Discussed with Peter Zijlstra:
> Bah!, yeah I guess reverting is the right thing for now. Sad
> though.
>
> So by having the two hooks we have a black-spot between them
> where we receive no events at all, this black-spot covers the
> hand-over of current and we thus don't receive the 'wrong'
> events.
>
> I rather liked we could do away with both that black-spot and
> clean up the code a little, but apparently people rely on it.Signed-off-by: Jiri Olsa
Acked-by: Peter Zijlstra
Cc: acme@redhat.com
Cc: paulus@samba.org
Cc: cjashfor@linux.vnet.ibm.com
Cc: fweisbec@gmail.com
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20120523111302.GC1638@m.brq.redhat.com
Signed-off-by: Ingo Molnar
09 May, 2012
2 commits
-
We can easily use a single callback for both sched-in and sched-out. This
reduces the code footprint in the scheduler path as well as removes
the PMU black spot otherwise present between the out and in callback.Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-o56ajxp1edwqg6x9d31wb805@git.kernel.org
Signed-off-by: Ingo Molnar -
We always need to pass the last sample period to
perf_sample_data_init(), otherwise the event distribution will be
wrong. Thus, modifiyng the function interface with the required period
as argument. So basically a pattern like this:perf_sample_data_init(&data, ~0ULL);
data.period = event->hw.last_period;will now be like that:
perf_sample_data_init(&data, ~0ULL, event->hw.last_period);
Avoids unininitialized data.period and simplifies code.
Signed-off-by: Robert Richter
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar
24 Mar, 2012
1 commit
-
Having the build time assertion in header is making the perf
build fail on x86 with:../../include/linux/perf_event.h:411:32: error: variably modified \
‘__assert_mmap_data_head_offset’ at file scope [-Werror]I'm moving the build time validation out of the header, because
I think it's better than to lessen the perf build warn/error
check.Signed-off-by: Jiri Olsa
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Cc: cjashfor@linux.vnet.ibm.com
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/1332513680-7870-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar
23 Mar, 2012
1 commit
-
Complete the syscall-less self-profiling feature and address
all complaints, namely:- capabilities, so we can detect what is actually available at runtime
Add a capabilities field to perf_event_mmap_page to indicate
what is actually available for use.- on x86: RDPMC weirdness due to being 40/48 bits and not sign-extending
properly.- ABI documentation as to how all this stuff works.
Also improve the documentation for the new features.
Signed-off-by: Peter Zijlstra
Cc: Stephane Eranian
Cc: Vince Weaver
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1332433596.2487.33.camel@twins
Signed-off-by: Ingo Molnar
17 Mar, 2012
1 commit
-
Adding sysfs group 'format' attribute for pmu device that
contains a syntax description on how to construct raw events.The event configuration is described in following
struct pefr_event_attr attributes:config
config1
config2Each sysfs attribute within the format attribute group,
describes mapping of name and bitfield definition within
one of above attributes.eg:
"/sys/.../format/event" contains "config:0-7"
"/sys/.../format/umask" contains "config:8-15"
"/sys/.../format/usr" contains "config:16"the attribute value syntax is:
line: config ':' bits
config: 'config' | 'config1' | 'config2"
bits: bits ',' bit_term | bit_term
bit_term: VALUE '-' VALUE | VALUEAdding format attribute definitions for x86 cpu pmus.
Acked-by: Peter Zijlstra
Signed-off-by: Peter Zijlstra
Signed-off-by: Jiri Olsa
Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
09 Mar, 2012
1 commit
-
This patch adds reference sizes for revision 1
and 2 of the perf_event ABI, i.e., the size of
the perf_event_attr struct.With Rev1: config2 was added = +8 bytes
With Rev2: branch_sample_type was added = +8 bytesAdds the definition for Rev1, Rev2.
This is useful for tools trying to decode the revision
numbers based on the size of the struct.Signed-off-by: Stephane Eranian
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-16-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar
05 Mar, 2012
2 commits
-
With branch stack sampling, it is possible to filter by priv levels.
In system-wide mode, that means it is possible to capture only user
level branches. The builtin SW LBR filter needs to disassemble code
based on LBR captured addresses. For that, it needs to know the task
the addresses are associated with. Because of context switches, the
content of the branch stack buffer may contain addresses from
different tasks.We need a callback on context switch to either flush the branch stack
or save it. This patch adds a new callback in struct pmu which is called
during context switches. The callback is called only when necessary.
That is when a system-wide context has, at least, one event which
uses PERF_SAMPLE_BRANCH_STACK. The callback is never called for
per-thread context.In this version, the Intel x86 code simply flushes (resets) the LBR
on context switches (fills it with zeroes). Those zeroed branches are
then filtered out by the SW filter.Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328826068-11713-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar -
This patch adds the ability to sample taken branches to the
perf_event interface.The ability to capture taken branches is very useful for all
sorts of analysis. For instance, basic block profiling, call
counts, statistical call graph.This new capability requires hardware assist and as such may
not be available on all HW platforms. On Intel x86 it is
implemented on top of the Last Branch Record (LBR) facility.To enable taken branches sampling, the PERF_SAMPLE_BRANCH_STACK
bit must be set in attr->sample_type.Sampled taken branches may be filtered by type and/or priv
levels.The patch adds a new field, called branch_sample_type, to the
perf_event_attr structure. It contains a bitmask of filters
to apply to the sampled taken branches.Filters may be implemented in HW. If the HW filter does not exist
or is not good enough, some arch may also implement a SW filter.The following generic filters are currently defined:
- PERF_SAMPLE_USER
only branches whose targets are at the user level- PERF_SAMPLE_KERNEL
only branches whose targets are at the kernel level- PERF_SAMPLE_HV
only branches whose targets are at the hypervisor level- PERF_SAMPLE_ANY
any type of branches (subject to priv levels filters)- PERF_SAMPLE_ANY_CALL
any call branches (may incl. syscall on some arch)- PERF_SAMPLE_ANY_RET
any return branches (may incl. syscall returns on some arch)- PERF_SAMPLE_IND_CALL
indirect call branchesObviously filter may be combined. The priv level bits are optional.
If not provided, the priv level of the associated event are used. It
is possible to collect branches at a priv level different from the
associated event. Use of kernel, hv priv levels is subject to permissions
and availability (hv).The number of taken branch records present in each sample may vary based
on HW, the type of sampled branches, the executed code. Therefore
each sample contains the number of taken branches it contains.Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1328826068-11713-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar