16 Feb, 2011
2 commits
-
By pre-computing the maximum number of samples per tick we can avoid a
multiplication and a conditional, since MAX_INTERRUPTS >
max_samples_per_tick.

Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
This kernel patch adds the ability to filter monitoring based on
container groups (cgroups). This is for use in per-cpu mode only.

The cgroup to monitor is passed as a file descriptor in the pid
argument to the syscall. The file descriptor must be opened to
the cgroup name in the cgroup filesystem. For instance, if the
cgroup name is foo and cgroupfs is mounted in /cgroup, then the
file descriptor is opened to /cgroup/foo. Cgroup mode is
activated by passing PERF_FLAG_PID_CGROUP in the flags argument
to the syscall.

For instance, to measure in cgroup foo on CPU1, assuming
cgroupfs is mounted under /cgroup:

    struct perf_event_attr attr;
    int cgroup_fd, fd;

    cgroup_fd = open("/cgroup/foo", O_RDONLY);
    fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
    close(cgroup_fd);

Signed-off-by: Stephane Eranian
[ added perf_cgroup_{exit,attach} ]
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
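
For completeness, a standalone sketch of the example above; it assumes a kernel with this patch and cgroupfs mounted at /cgroup, and since perf_event_open() has no glibc wrapper it goes through syscall(2):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <linux/perf_event.h>

    static int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                   int cpu, int group_fd, unsigned long flags)
    {
            /* no glibc wrapper exists, call the syscall directly */
            return syscall(__NR_perf_event_open, attr, pid, cpu,
                           group_fd, flags);
    }

    int main(void)
    {
            struct perf_event_attr attr;
            int cgroup_fd, fd;

            memset(&attr, 0, sizeof(attr));
            attr.size   = sizeof(attr);
            attr.type   = PERF_TYPE_HARDWARE;
            attr.config = PERF_COUNT_HW_CPU_CYCLES;

            cgroup_fd = open("/cgroup/foo", O_RDONLY);
            if (cgroup_fd < 0)
                    return 1;

            /* count cycles in cgroup foo on CPU1 */
            fd = sys_perf_event_open(&attr, cgroup_fd, 1, -1,
                                     PERF_FLAG_PID_CGROUP);
            close(cgroup_fd);
            if (fd < 0)
                    return 1;

            /* ... run the workload elsewhere, then read(fd, ...) ... */
            close(fd);
            return 0;
    }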
16 Dec, 2010
3 commits
-
Simple sysfs enumeration of the PMUs.

Use an "event_source" bus, and add PMU devices using their name.
Each PMU device has a type attribute which contains the value needed
for perf_event_attr::type to identify this PMU.

This is the minimal stub needed to start using this interface;
we'll consider extending the sysfs usage later.

Cc: Kay Sievers
Cc: Greg KH
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
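
From userspace the type value would then be consumed roughly as follows; a minimal sketch assuming the /sys/bus/event_source/devices/<name>/type layout described above (the pmu_type() helper name is ours):

    #include <stdio.h>

    /* read /sys/bus/event_source/devices/<name>/type; the result is
     * the value to put in perf_event_attr::type, or -1 on error */
    static int pmu_type(const char *name)
    {
            char path[128];
            FILE *f;
            int type = -1;

            snprintf(path, sizeof(path),
                     "/sys/bus/event_source/devices/%s/type", name);
            f = fopen(path, "r");
            if (!f)
                    return -1;
            if (fscanf(f, "%d", &type) != 1)
                    type = -1;
            fclose(f);
            return type;
    }
-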
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.

Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.

If we want to enumerate the PMUs they need a name, so provide one.

Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
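
A sketch of what registration then looks like; my_pmu and its init function are illustrative, and we assume passing -1 as the type requests a dynamically allocated one:

    #include <linux/init.h>
    #include <linux/perf_event.h>

    static struct pmu my_pmu = {
            /* .event_init, .add, .del, .start, .stop, .read ... */
    };

    static int __init my_pmu_init(void)
    {
            /* named "my_pmu"; -1 asks for a dynamic perf_event_attr::type */
            return perf_pmu_register(&my_pmu, "my_pmu", -1);
    }
-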
Merge reason: We want to apply a dependent patch.
Signed-off-by: Ingo Molnar
09 Dec, 2010
1 commit
-
Because the multi-pmu bits can share contexts between struct pmu
instances we could get duplicate events by iterating the pmu list.

Signed-off-by: Peter Zijlstra
Signed-off-by: Thomas Gleixner
LKML-Reference:
Signed-off-by: Ingo Molnar
05 Dec, 2010
2 commits
-
If perf_event_attr.sample_id_all is set it will add the PERF_SAMPLE_ identity
info (TID, TIME, ID, CPU, STREAM_ID) as a trailer, so that older perf tools
can process new files, just ignoring the extra payload.

With this it's possible to do further analysis on problems in the event
stream, like detecting reordering of MMAP and FORK events, etc.

V2: Fix up the header size in comm, mmap and task processing, as we have to
take into account different sample_types for each matching event, noticed by
Thomas Gleixner.

Thomas also noticed a problem in v2 where, if we didn't have space in the
buffer, we wouldn't restore the header size.

Tested-by: Thomas Gleixner
Reviewed-by: Thomas Gleixner
Acked-by: Ian Munsie
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Frédéric Weisbecker
Cc: Ian Munsie
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Stephane Eranian
Cc: Thomas Gleixner
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
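
For illustration, the identity trailer lays out roughly as below; a sketch assuming the same field order as regular samples, with each field present only if its bit is set in perf_event_attr::sample_type (the struct name is ours):

    #include <linux/types.h>

    /* hypothetical view of the sample_id_all trailer */
    struct sample_id_trailer {
            __u32 pid, tid;         /* if PERF_SAMPLE_TID */
            __u64 time;             /* if PERF_SAMPLE_TIME */
            __u64 id;               /* if PERF_SAMPLE_ID */
            __u64 stream_id;        /* if PERF_SAMPLE_STREAM_ID */
            __u32 cpu, res;         /* if PERF_SAMPLE_CPU */
    };
-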
Those will be made available in sample-like events such as MMAP, EXEC, etc.
in a followup patch. So precalculate the extra id header space and have a
separate routine to fill them in.

V2: Thomas noticed that the id header needs to be precalculated at
inherit_events too.
Tested-by: Thomas Gleixner
Reviewed-by: Thomas Gleixner
Acked-by: Ian Munsie
Acked-by: Peter Zijlstra
Acked-by: Thomas Gleixner
Cc: Frédéric Weisbecker
Cc: Ian Munsie
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Stephane Eranian
Cc: Thomas Gleixner
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
01 Dec, 2010
1 commit
-
PERF_SAMPLE_{CALLCHAIN,RAW} have variable lengths per sample, but the others
can be precalculated, reducing the per-sample cost a bit.

Acked-by: Peter Zijlstra
Cc: Frédéric Weisbecker
Cc: Ian Munsie
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Stephane Eranian
LKML-Reference:
Signed-off-by: Arnaldo Carvalho de Melo
26 Nov, 2010
3 commits
-
and use it when appropriate.
Signed-off-by: Franck Bui-Huu
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Stephane noticed that because the perf_sw_event() call is inside the
perf_event_task_sched_out() call it won't get called unless we
have a per-task counter.

Reported-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
It was found that sometimes children of tasks with inherited events had
one extra event. Eventually it turned out to be due to the list rotation
not being exclusive with the list iteration in the inheritance code.

Cure this by temporarily disabling the rotation while we inherit the events.
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra
LKML-Reference:
Cc:
Signed-off-by: Ingo Molnar
11 Nov, 2010
1 commit
-
This patch corrects time tracking in samples. Without this patch
both time_enabled and time_running are bogus when the user asks for
PERF_SAMPLE_READ.

One uses PERF_SAMPLE_READ to sample the values of other counters
in each sample. Because of multiplexing, it is necessary to know
both time_enabled and time_running to be able to scale counts correctly.

In this second version of the patch, we maintain a shadow
copy of ctx->time which allows us to compute ctx->time without
calling update_context_time() from NMI context. We avoid the
issue that update_context_time() must always be called with
ctx->lock held.

We do not keep shadow copies of the other event timings
because if the lead event is overflowing then it is active
and thus has been scheduled in via event_sched_in(), in
which case neither tstamp_stopped nor tstamp_running can be modified.

This timing logic only applies to samples when PERF_SAMPLE_READ
is used.

Note that this patch does not address timing issues related
to sampling inheritance between tasks. This will be addressed
in a future patch.

With this patch, the libpfm4 example task_smpl now reports
correct counts (shown on a 2.4GHz Core 2):

    $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears noploop 5
    noploop for 5 seconds
    IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
    2,400,000,254 unhalted_core_cycles:u (33)
    2,399,273,744 instructions_retired:u (34)
    53,340 baclears (35)

Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
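
For reference, the usual scaling step that depends on these two times; a minimal userspace sketch (the scale_count() helper is ours):

    /* estimate what a multiplexed count would have been had the event
     * been scheduled for the whole time it was enabled */
    static unsigned long long scale_count(unsigned long long count,
                                          unsigned long long time_enabled,
                                          unsigned long long time_running)
    {
            if (!time_running)
                    return 0;       /* event never ran */
            return (unsigned long long)
                    ((double)count * time_enabled / time_running);
    }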
19 Oct, 2010
5 commits
-
The use of the JUMP_LABEL() construct ends up creating endless silly
wrappers; create a higher level construct to reduce this clutter.

Signed-off-by: Peter Zijlstra
Cc: Jason Baron
Cc: Steven Rostedt
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Acked-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Trades a call + conditional + ret for an unconditional jmp.
Acked-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
hw_breakpoint creation needs to account stuff per-task to ensure there
are always sufficient hardware resources to back these things, due to
ptrace.

With the perf per-pmu context changes the event initialization no
longer has access to the event context, for the simple reason that we
need to first find the pmu (the result of initialization) before we can
find the context.

This makes hw_breakpoints unhappy, because it can no longer do per-task
accounting; cure this by frobbing a task pointer in the event::hw
bits for now...

Signed-off-by: Peter Zijlstra
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like waking up a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf, so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this make do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra
Acked-by: Kyle McMartin
Acked-by: Martin Schwidefsky
[ various fixes ]
Signed-off-by: Huang Ying
LKML-Reference:
Signed-off-by: Ingo Molnar
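
A sketch of a typical user, assuming the interface this patch introduces in <linux/irq_work.h>; the drain_work/drain_func names and the reader_task wakeup target are illustrative:

    #include <linux/init.h>
    #include <linux/irq_work.h>
    #include <linux/sched.h>

    static struct task_struct *reader_task;  /* consumer of our buffers */
    static struct irq_work drain_work;

    static void drain_func(struct irq_work *work)
    {
            /* runs in hard IRQ context, where waking a task is safe */
            wake_up_process(reader_task);
    }

    static void my_nmi_handler(void)
    {
            /* NMI context: just queue; the callback runs via self-IPI,
             * or from the timer tick on architectures without one */
            irq_work_queue(&drain_work);
    }

    static int __init my_init(void)
    {
            init_irq_work(&drain_work, drain_func);
            return 0;
    }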
15 Oct, 2010
1 commit
-
Conflicts:
arch/arm/oprofile/common.c
kernel/perf_event.c
11 Oct, 2010
2 commits
-
Introduce the perf_pmu_name() helper function that returns the name of the
pmu. This gives us a generic way to get the name of a pmu regardless of
how an architecture identifies it internally.

Signed-off-by: Matt Fleming
Acked-by: Peter Zijlstra
Acked-by: Paul Mundt
Signed-off-by: Robert Richter
-
The number of counters for the registered pmu is needed in a few places,
so provide a helper function that returns this number.

Signed-off-by: Matt Fleming
Tested-by: Will Deacon
Acked-by: Paul Mundt
Acked-by: Peter Zijlstra
Signed-off-by: Robert Richter
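
Together the two helpers give generic code a uniform view of the PMU; a minimal sketch (the report_pmu() wrapper is ours):

    #include <linux/kernel.h>
    #include <linux/perf_event.h>

    /* illustrative client of the two helpers introduced above */
    static void report_pmu(void)
    {
            pr_info("PMU: %s with %d counters\n",
                    perf_pmu_name(), perf_num_counters());
    }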
17 Sep, 2010
2 commits
-
Revert the per-cpu-context timers because of an unfortunate
nohz interaction. Fixing that would have been somewhat ugly, so
go back to driving things from the regular tick. Provide a
jiffies interval feature for people who want slower rotations.

Signed-off-by: Peter Zijlstra
Cc: Stephane Eranian
Cc: Robert Richter
Cc: Yinghai Lu
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Aside from allowing software events into a !software group,
allow adding !software events to pure software groups.

Once we've moved the software group and attached the first
!software event, the group will no longer be a pure software
group and hence no longer be eligible for movement, at which
point the straight ctx comparison is correct again.

Signed-off-by: Peter Zijlstra
Cc: Stephane Eranian
Cc: Robert Richter
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar
15 Sep, 2010
1 commit
-
The kernel perf event creation path shouldn't use find_task_by_vpid()
because a vpid exists in a specific namespace. find_task_by_vpid() uses
current's pid namespace, which isn't always the correct namespace to use
for the vpid in all the places perf_event_create_kernel_counter() (and
thus find_get_context()) is called.

The goal is to clean up pid namespace handling and prevent bugs like:

https://bugzilla.kernel.org/show_bug.cgi?id=17281

Instead of using pids, switch find_get_context() to use task struct
pointers directly. The syscall is responsible for resolving the pid to
a task struct. This moves the pid namespace resolution into the syscall,
much like every other syscall that takes pid parameters.

Signed-off-by: Matt Helsley
Signed-off-by: Peter Zijlstra
Cc: Robin Green
Cc: Prasad
Cc: Arnaldo Carvalho de Melo
Cc: Steven Rostedt
Cc: Will Deacon
Cc: Mahesh Salgaonkar
LKML-Reference:
Signed-off-by: Ingo Molnar
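
The resolution pattern then lives in the caller, along these lines; a sketch of what the syscall path does, running in the caller's own pid namespace (the resolve_target() name is ours):

    #include <linux/rcupdate.h>
    #include <linux/sched.h>

    /* resolve a vpid to a task in the caller's namespace; the task
     * struct, not the pid, is what find_get_context() now takes */
    static struct task_struct *resolve_target(pid_t vpid)
    {
            struct task_struct *task;

            rcu_read_lock();
            task = vpid ? find_task_by_vpid(vpid) : current;
            if (task)
                    get_task_struct(task);
            rcu_read_unlock();

            return task;    /* NULL if no such task */
    }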
10 Sep, 2010
14 commits
-
I missed a perf_event_ctxp user when converting it to an array. Pull this
last user into perf_event.c as well and fix it up.

Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Since software events are always schedulable, mixing them up with
hardware events (which are not) can lead to funny scheduling oddities.

Giving them their own context solves this.
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Frederic Weisbecker
Cc: Lin Ming
Cc: Yanmin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Provide the infrastructure for multiple task contexts.

A more flexible approach would have resulted in more pointer chases
in the scheduling hot-paths. This approach has the limitation of a
static number of task contexts.

Since I expect most external PMUs to be system wide, or at least node
wide (as per the Intel uncore unit), they won't actually need a task
context.

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Frederic Weisbecker
Cc: Lin Ming
Cc: Yanmin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Allocate per-cpu contexts per pmu.
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Frederic Weisbecker
Cc: Lin Ming
Cc: Yanmin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Give each cpu-context its own timer so that it is a self-contained
entity; this eases the way for per-pmu-per-cpu contexts as well as
provides the basic infrastructure to allow different rotation
times per pmu.

Things to look at:
- folding the tick and these TICK_NSEC timers
- separate task context rotation

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Frederic Weisbecker
Cc: Lin Ming
Cc: Yanmin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Separate the swevent hash-table from the cpu_context bits in
preparation for per-pmu cpu contexts.

This keeps the swevent hash a global entity.
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Frederic Weisbecker
Cc: Lin Ming
Cc: Yanmin
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Neither the overcommit nor the reservation sysfs parameter was
actually working; remove them as they'll only get in the way.

Signed-off-by: Peter Zijlstra
Cc: paulus
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.

The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.

This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).

It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).

The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:

1) We disable the counter:
   a) the pmu has per-counter enable bits, we flip that
   b) we program a NOP event, preserving the counter state

2) We store the counter state and ignore all read/overflow events.
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
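
As a sketch, a pmu then wires up the new methods along these lines; the my_pmu_* names are illustrative, while PERF_EF_START, PERF_EF_RELOAD and PERF_EF_UPDATE are the flags this interface passes:

    #include <linux/perf_event.h>

    static void my_pmu_start(struct perf_event *event, int flags)
    {
            /* program the counter; PERF_EF_RELOAD: reprogram the period */
    }

    static void my_pmu_stop(struct perf_event *event, int flags)
    {
            /* halt the counter; PERF_EF_UPDATE: fold in the final count */
    }

    /* add: schedule the counter on the PMU, counting only if asked to */
    static int my_pmu_add(struct perf_event *event, int flags)
    {
            if (flags & PERF_EF_START)
                    my_pmu_start(event, PERF_EF_RELOAD);
            return 0;
    }

    /* del: unschedule, folding in the final count */
    static void my_pmu_del(struct perf_event *event, int flags)
    {
            my_pmu_stop(event, PERF_EF_UPDATE);
    }

    static struct pmu my_pmu = {
            .add    = my_pmu_add,   /* replaces ->enable() */
            .del    = my_pmu_del,   /* replaces ->disable() */
            .start  = my_pmu_start,
            .stop   = my_pmu_stop,
    };
-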
Use hw_perf_event::period_left instead of hw_perf_event::remaining
and win back 8 bytes.

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Provide default implementations for the pmu txn methods; this
allows us to remove some conditional code.

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Changes perf_disable() into perf_pmu_disable().
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
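
The calling convention then names the pmu being affected explicitly; a sketch (the reprogram() wrapper is ours):

    #include <linux/perf_event.h>

    static void reprogram(struct perf_event *event)
    {
            struct pmu *pmu = event->pmu;

            perf_pmu_disable(pmu);  /* was: perf_disable() */
            /* ... touch the hardware ... */
            perf_pmu_enable(pmu);   /* was: perf_enable() */
    }
-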
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
-
Simple registration interface for struct pmu; this provides the
infrastructure for removing all the weak functions.

Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
-
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
Signed-off-by: Peter Zijlstra
Cc: paulus
Cc: stephane eranian
Cc: Robert Richter
Cc: Will Deacon
Cc: Paul Mundt
Cc: Frederic Weisbecker
Cc: Cyrill Gorcunov
Cc: Lin Ming
Cc: Yanmin
Cc: Deng-Cheng Zhu
Cc: David Miller
Cc: Michael Cree
LKML-Reference:
Signed-off-by: Ingo Molnar
19 Aug, 2010
2 commits
-
…rostedt/linux-2.6-trace into perf/core
-
Instead of hardcoding the number of contexts for the recursion
barriers, define a cpp constant to make the code more
self-explanatory.

Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Paul Mackerras
Cc: Stephane Eranian