16 Feb, 2011

2 commits

  • By pre-computing the maximum number of samples per tick we can avoid a
    multiplication and a conditional since MAX_INTERRUPTS >
    max_samples_per_tick.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
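
A minimal userspace sketch of the idea, not the kernel code itself: the per-tick limit is derived once, whenever the sample rate changes, so the interrupt path needs only a single comparison. TOY_HZ, update_sample_rate() and over_limit() are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical tick rate; the kernel uses its configured HZ. */
#define TOY_HZ 1000u

/* Precomputed limit, refreshed only when the sample rate changes. */
static uint32_t max_samples_per_tick;

/* Slow path: round up so that a non-zero rate never yields a limit of 0. */
static void update_sample_rate(uint32_t sample_rate)
{
    max_samples_per_tick = (sample_rate + TOY_HZ - 1) / TOY_HZ;
}

/* Hot path: one comparison, no multiplication by the tick rate. */
static int over_limit(uint64_t interrupts_this_tick)
{
    return interrupts_this_tick >= max_samples_per_tick;
}
```

Because a "throttled" marker value that is larger than any possible limit also satisfies the same comparison, no separate check for that marker is needed in this sketch.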
     
  • This kernel patch adds the ability to filter monitoring based on
    container groups (cgroups). This is for use in per-cpu mode only.

    The cgroup to monitor is passed as a file descriptor in the pid
    argument to the syscall. The file descriptor must be opened to
    the cgroup name in the cgroup filesystem. For instance, if the
    cgroup name is foo and cgroupfs is mounted in /cgroup, then the
    file descriptor is opened to /cgroup/foo. Cgroup mode is
    activated by passing PERF_FLAG_PID_CGROUP in the flags argument
    to the syscall.

    For instance to measure in cgroup foo on CPU1 assuming
    cgroupfs is mounted under /cgroup:

    struct perf_event_attr attr;
    int cgroup_fd, fd;

    cgroup_fd = open("/cgroup/foo", O_RDONLY);
    fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
    close(cgroup_fd);

    Signed-off-by: Stephane Eranian
    [ added perf_cgroup_{exit,attach} ]
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
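
A fuller version of the fragment above might look like the following sketch. It assumes kernel headers that define PERF_FLAG_PID_CGROUP, picks CPU cycles as an arbitrary event, and goes through syscall() because glibc does not provide a perf_event_open() wrapper. The helper name open_cgroup_cycles_counter() is made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open a cycles counter restricted to one cgroup on one CPU.
 * Returns the event file descriptor, or -1 on failure.
 */
static int open_cgroup_cycles_counter(const char *cgroup_path, int cpu)
{
    struct perf_event_attr attr;
    int cgroup_fd, fd;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;          /* assumption: any event type works here */
    attr.config = PERF_COUNT_HW_CPU_CYCLES;

    /* The cgroup is identified by a descriptor on its cgroupfs directory. */
    cgroup_fd = open(cgroup_path, O_RDONLY);
    if (cgroup_fd < 0)
        return -1;

    /* In cgroup mode the pid argument carries the cgroup descriptor. */
    fd = (int)syscall(__NR_perf_event_open, &attr, cgroup_fd, cpu, -1,
                      (unsigned long)PERF_FLAG_PID_CGROUP);

    /* The cgroup descriptor is no longer needed once the event exists. */
    close(cgroup_fd);
    return fd;
}
```

A call such as open_cgroup_cycles_counter("/cgroup/foo", 1) corresponds to the example in the commit message; it typically needs sufficient privileges to succeed.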
     

16 Dec, 2010

3 commits

  • Simple sysfs enumeration of the PMUs.

    Use a "event_source" bus, and add PMU devices using their name.

    Each PMU device has a type attribute which contains the value needed
    for perf_event_attr::type to identify this PMU.

    This is the minimal stub needed to start using this interface,
    we'll consider extending the sysfs usage later.

    Cc: Kay Sievers
    Cc: Greg KH
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
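
As a rough illustration of how a tool could consume this, the sketch below reads the type attribute of a named PMU. It assumes the bus is exposed under /sys/bus/event_source/devices/; both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse a single decimal integer from a file; -1 on any failure. */
static int read_pmu_type_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int type;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}

/* Look up the value to place in perf_event_attr::type for a PMU name. */
static int read_pmu_type(const char *pmu_name)
{
    char path[256];
    int n = snprintf(path, sizeof(path),
                     "/sys/bus/event_source/devices/%s/type", pmu_name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return read_pmu_type_file(path);
}
```

The returned value would then be assigned to attr.type before calling perf_event_open; which PMU names exist depends on the machine.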
     
  • Extend the perf_pmu_register() interface to allow for named and
    dynamic pmu types.

    Because we need to support the existing static types we cannot use
    dynamic types for everything, hence provide a type argument.

    If we want to enumerate the PMUs they need a name, provide one.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Merge reason: We want to apply a dependent patch.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

09 Dec, 2010

1 commit


05 Dec, 2010

2 commits

  • If perf_event_attr.sample_id_all is set it will add the PERF_SAMPLE_ identity
    info:

    TID, TIME, ID, CPU, STREAM_ID

    As a trailer, so that older perf tools can process new files, just ignoring the
    extra payload.

    With this it's possible to do further analysis on problems in the event stream,
    like detecting reordering of MMAP and FORK events, etc.

    V2: Fixup header size in comm, mmap and task processing, as we have to take into
    account different sample_types for each matching event, noticed by Thomas Gleixner.

    Thomas also noticed a problem in v2 where if we didn't have space in the buffer we
    wouldn't restore the header size.

    Tested-by: Thomas Gleixner
    Reviewed-by: Thomas Gleixner
    Acked-by: Ian Munsie
    Acked-by: Peter Zijlstra
    Acked-by: Thomas Gleixner
    Cc: Frédéric Weisbecker
    Cc: Ian Munsie
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Those will be made available in sample-like events such as MMAP, EXEC, etc. in a
    followup patch. So precalculate the extra id header space and have a separate
    routine to fill them up.

    V2: Thomas noticed that the id header needs to be precalculated at
    inherit_events too:

    LKML-Reference:

    Tested-by: Thomas Gleixner
    Reviewed-by: Thomas Gleixner
    Acked-by: Ian Munsie
    Acked-by: Peter Zijlstra
    Acked-by: Thomas Gleixner
    Cc: Frédéric Weisbecker
    Cc: Ian Munsie
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
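
The precalculation of the extra id header space can be pictured with the sketch below. It is an illustration rather than the kernel routine: it assumes the trailer carries one 8-byte slot for each of the five identity fields listed above (pid/tid and cpu/reserved each packed as two 32-bit values), and the function name id_trailer_size() is hypothetical.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes appended to a non-sample record for a given sample_type mask. */
static size_t id_trailer_size(uint64_t sample_type)
{
    size_t size = 0;

    if (sample_type & PERF_SAMPLE_TID)
        size += 2 * sizeof(uint32_t);   /* pid, tid */
    if (sample_type & PERF_SAMPLE_TIME)
        size += sizeof(uint64_t);
    if (sample_type & PERF_SAMPLE_ID)
        size += sizeof(uint64_t);
    if (sample_type & PERF_SAMPLE_STREAM_ID)
        size += sizeof(uint64_t);
    if (sample_type & PERF_SAMPLE_CPU)
        size += 2 * sizeof(uint32_t);   /* cpu, reserved */

    return size;
}
```

Computing this once per event, rather than per record, is the point of doing it ahead of time; bits outside the five identity fields do not contribute to the trailer in this sketch.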
     

01 Dec, 2010

1 commit


26 Nov, 2010

3 commits

  • and use it when appropriate.

    Signed-off-by: Franck Bui-Huu
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Franck Bui-Huu
     
  • Stephane noticed that because the perf_sw_event() call is inside the
    perf_event_task_sched_out() call it won't get called unless we
    have a per-task counter.

    Reported-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • It was found that sometimes children of tasks with inherited events had
    one extra event. Eventually it turned out to be due to the list rotation
    not being exclusive with the list iteration in the inheritance code.

    Cure this by temporarily disabling the rotation while we inherit the events.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Cc:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

11 Nov, 2010

1 commit

  • This patch corrects time tracking in samples. Without this patch
    both time_enabled and time_running are bogus when the user asks for
    PERF_SAMPLE_READ.

    One uses PERF_SAMPLE_READ to sample the values of other counters
    in each sample. Because of multiplexing, it is necessary to know
    both time_enabled and time_running to be able to scale counts correctly.

    In this second version of the patch, we maintain a shadow
    copy of ctx->time which allows us to compute ctx->time without
    calling update_context_time() from NMI context. We avoid the
    issue that update_context_time() must always be called with
    ctx->lock held.

    We do not keep shadow copies of the other event timings
    because if the lead event is overflowing then it is active
    and thus it's been scheduled in via event_sched_in() in
    which case neither tstamp_stopped nor tstamp_running can be modified.

    This timing logic only applies to samples when PERF_SAMPLE_READ
    is used.

    Note that this patch does not address timing issues related
    to sampling inheritance between tasks. This will be addressed
    in a future patch.

    With this patch, the libpfm4 example task_smpl now reports
    correct counts (shown on 2.4GHz Core 2):

    $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears noploop 5
    noploop for 5 seconds
    IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3
    2,400,000,254 unhalted_core_cycles:u (33)
    2,399,273,744 instructions_retired:u (34)
    53,340 baclears (35)

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
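
The scaling that time_enabled and time_running make possible is usually a simple proportion. The sketch below shows one way a tool might apply it; the function name scale_count() is hypothetical, and double arithmetic is used only to sidestep 64-bit overflow, at the cost of exactness for very large counts.

```c
#include <stdint.h>

/*
 * Estimate what a multiplexed counter would have read had it been
 * running for the whole time it was enabled.
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                            uint64_t time_running)
{
    if (time_running == 0)
        return 0;               /* never scheduled: nothing to extrapolate */

    /* raw * enabled / running, rounded to the nearest integer */
    return (uint64_t)((double)raw * (double)time_enabled /
                      (double)time_running + 0.5);
}
```

In the task_smpl output above ENA and RUN are equal, so the scaled and raw values coincide; a counter that ran for only half of its enabled time would have its count doubled.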
     

19 Oct, 2010

5 commits

  • The use of the JUMP_LABEL() construct ends up creating endless silly
    wrappers, create a higher level construct to reduce this clutter.

    Signed-off-by: Peter Zijlstra
    Cc: Jason Baron
    Cc: Steven Rostedt
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Trades a call + conditional + ret for an unconditional jmp.

    Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • hw_breakpoint creation needs to account stuff per-task to ensure there
    are always sufficient hardware resources to back these things due to
    ptrace.

    With the perf per pmu context changes the event initialization no
    longer has access to the event context, for the simple reason that we
    need to first find the pmu (result of initialization) before we can
    find the context.

    This makes hw_breakpoints unhappy, because it can no longer do per
    task accounting, cure this by frobbing a task pointer in the event::hw
    bits for now...

    Signed-off-by: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like waking up a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this have to make do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

    Signed-off-by: Peter Zijlstra
    Acked-by: Kyle McMartin
    Acked-by: Martin Schwidefsky
    [ various fixes ]
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

15 Oct, 2010

1 commit


11 Oct, 2010

2 commits

  • Introduce perf_pmu_name() helper function that returns the name of the
    pmu. This gives us a generic way to get the name of a pmu regardless of
    how an architecture identifies it internally.

    Signed-off-by: Matt Fleming
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mundt
    Signed-off-by: Robert Richter

    Matt Fleming
     
  • The number of counters for the registered pmu is needed in a few places
    so provide a helper function that returns this number.

    Signed-off-by: Matt Fleming
    Tested-by: Will Deacon
    Acked-by: Paul Mundt
    Acked-by: Peter Zijlstra
    Signed-off-by: Robert Richter

    Matt Fleming
     

17 Sep, 2010

2 commits

  • Revert the per cpu-context timers because of unfortunate
    nohz interaction. Fixing that would have been somewhat ugly, so
    go back to driving things from the regular tick. Provide a
    jiffies interval feature for people who want slower rotations.

    Signed-off-by: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Robert Richter
    Cc: Yinghai Lu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Aside from allowing software events into a !software group,
    allow adding !software events to pure software groups.

    Once we've moved the software group and attached the first
    !software event, the group will no longer be a pure software
    group and hence no longer be eligible for movement, at which
    point the straight ctx comparison is correct again.

    Signed-off-by: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Robert Richter
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

15 Sep, 2010

1 commit

  • The kernel perf event creation path shouldn't use find_task_by_vpid()
    because a vpid exists in a specific namespace. find_task_by_vpid() uses
    current's pid namespace which isn't always the correct namespace to use
    for the vpid in all the places perf_event_create_kernel_counter() (and
    thus find_get_context()) is called.

    The goal is to clean up pid namespace handling and prevent bugs like:

    https://bugzilla.kernel.org/show_bug.cgi?id=17281

    Instead of using pids switch find_get_context() to use task struct
    pointers directly. The syscall is responsible for resolving the pid to
    a task struct. This moves the pid namespace resolution into the syscall
    much like every other syscall that takes pid parameters.

    Signed-off-by: Matt Helsley
    Signed-off-by: Peter Zijlstra
    Cc: Robin Green
    Cc: Prasad
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Will Deacon
    Cc: Mahesh Salgaonkar
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Matt Helsley
     

10 Sep, 2010

14 commits

  • I missed a perf_event_ctxp user when converting it to an array. Pull this
    last user into perf_event.c as well and fix it up.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since software events are always schedulable, mixing them up with
    hardware events (which are not) can lead to funny scheduling oddities.

    Giving them their own context solves this.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Frederic Weisbecker
    Cc: Lin Ming
    Cc: Yanmin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Provide the infrastructure for multiple task contexts.

    A more flexible approach would have resulted in more pointer chases
    in the scheduling hot-paths. This approach has the limitation of a
    static number of task contexts.

    Since I expect most external PMUs to be system wide, or at least node
    wide (as per the intel uncore unit) they won't actually need a task
    context.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Frederic Weisbecker
    Cc: Lin Ming
    Cc: Yanmin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Allocate per-cpu contexts per pmu.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Frederic Weisbecker
    Cc: Lin Ming
    Cc: Yanmin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Give each cpu-context its own timer so that it is a self contained
    entity, this eases the way for per-pmu-per-cpu contexts as well as
    provides the basic infrastructure to allow different rotation
    times per pmu.

    Things to look at:
    - folding the tick and these TICK_NSEC timers
    - separate task context rotation

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Frederic Weisbecker
    Cc: Lin Ming
    Cc: Yanmin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Separate the swevent hash-table from the cpu_context bits in
    preparation for per pmu cpu contexts.

    This keeps the swevent hash a global entity.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Frederic Weisbecker
    Cc: Lin Ming
    Cc: Yanmin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Neither the overcommit nor the reservation sysfs parameter were
    actually working, remove them as they'll only get in the way.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Replace pmu::{enable,disable,start,stop,unthrottle} with
    pmu::{add,del,start,stop}, all of which take a flags argument.

    The new interface extends the capability to stop a counter while
    keeping it scheduled on the PMU. We replace the throttled state with
    the generic stopped state.

    This also allows us to efficiently stop/start counters over certain
    code paths (like IRQ handlers).

    It also allows scheduling a counter without it starting, allowing for
    a generic frozen state (useful for rotating stopped counters).

    The stopped state is implemented in two different ways, depending on
    how the architecture implemented the throttled state:

    1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

    2) We store the counter state and ignore all read/overflow events

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
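
The split between scheduling a counter and starting it can be modelled in a few lines of userspace C. This is a toy state machine under stated assumptions, not the kernel interface: every identifier below (toy_counter, TOY_EF_START, toy_add() and so on) is invented for illustration, and the real callbacks operate on perf events rather than on this struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: ask add() to start counting right away. */
#define TOY_EF_START 0x1

struct toy_counter {
    bool scheduled;   /* occupies a slot on the toy PMU */
    bool stopped;     /* scheduled, but currently not counting */
    uint64_t count;
};

/* Resume counting; only meaningful while the counter is scheduled. */
static void toy_start(struct toy_counter *c, int flags)
{
    (void)flags;
    if (c->scheduled)
        c->stopped = false;
}

/* Pause counting while keeping the counter on the PMU. */
static void toy_stop(struct toy_counter *c, int flags)
{
    (void)flags;
    c->stopped = true;
}

/* Schedule the counter; it stays stopped unless TOY_EF_START is given. */
static int toy_add(struct toy_counter *c, int flags)
{
    if (c->scheduled)
        return -1;
    c->scheduled = true;
    c->stopped = true;
    if (flags & TOY_EF_START)
        toy_start(c, 0);
    return 0;
}

/* Take the counter off the PMU, stopping it first. */
static void toy_del(struct toy_counter *c, int flags)
{
    toy_stop(c, flags);
    c->scheduled = false;
}

/* One simulated hardware event: counted only if scheduled and running. */
static void toy_tick(struct toy_counter *c)
{
    if (c->scheduled && !c->stopped)
        c->count++;
}
```

In this model a throttled counter is simply one that is scheduled and stopped, which mirrors the way the commit folds the throttled state into the generic stopped state.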
     
  • Use hw_perf_event::period_left instead of hw_perf_event::remaining
    and win back 8 bytes.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Provide default implementations for the pmu txn methods, this
    allows us to remove some conditional code.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Changes perf_disable() into perf_pmu_disable().

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since the current perf_disable() usage is only an optimization,
    remove it for now. This eases the removal of the __weak
    hw_perf_enable() interface.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Simple registration interface for struct pmu, this provides the
    infrastructure for removing all the weak functions.

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

    Signed-off-by: Peter Zijlstra
    Cc: paulus
    Cc: stephane eranian
    Cc: Robert Richter
    Cc: Will Deacon
    Cc: Paul Mundt
    Cc: Frederic Weisbecker
    Cc: Cyrill Gorcunov
    Cc: Lin Ming
    Cc: Yanmin
    Cc: Deng-Cheng Zhu
    Cc: David Miller
    Cc: Michael Cree
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

19 Aug, 2010

2 commits