03 Jun, 2011

1 commit

  • perf stat continues running even if the event list contains counters
    that are not supported. The resulting output then contains <not counted>
    for those events, which makes it hard to tell which events are supported
    but not counted and which are not supported at all.

    Before:

    perf stat -ddd -- sleep 1

    Performance counter stats for 'sleep 1':

    0.571283 task-clock # 0.001 CPUs utilized
    1 context-switches # 0.002 M/sec
    0 CPU-migrations # 0.000 M/sec
    157 page-faults # 0.275 M/sec
    1,037,707 cycles # 1.816 GHz
    stalled-cycles-frontend
    stalled-cycles-backend
    654,499 instructions # 0.63 insns per cycle
    136,129 branches # 238.286 M/sec
    branch-misses
    L1-dcache-loads
    L1-dcache-load-misses
    LLC-loads
    LLC-load-misses
    L1-icache-loads
    L1-icache-load-misses
    dTLB-loads
    dTLB-load-misses
    iTLB-loads
    iTLB-load-misses
    L1-dcache-prefetches
    L1-dcache-prefetch-misses

    1.001004836 seconds time elapsed

    After:

    perf stat -ddd -- sleep 1

    Performance counter stats for 'sleep 1':

    1.350326 task-clock # 0.001 CPUs utilized
    2 context-switches # 0.001 M/sec
    0 CPU-migrations # 0.000 M/sec
    157 page-faults # 0.116 M/sec
    11,986 cycles # 0.009 GHz
    stalled-cycles-frontend
    stalled-cycles-backend
    496,986 instructions # 41.46 insns per cycle
    138,065 branches # 102.246 M/sec
    7,245 branch-misses # 5.25% of all branches
    L1-dcache-loads
    L1-dcache-load-misses
    LLC-loads
    LLC-load-misses
    L1-icache-loads
    L1-icache-load-misses
    dTLB-loads
    dTLB-load-misses
    iTLB-loads
    iTLB-load-misses
    L1-dcache-prefetches
    L1-dcache-prefetch-misses

    1.002397333 seconds time elapsed
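
    The distinction this patch draws can be sketched as follows. This is a
    minimal Python illustration, not the perf C code: the Counter class and
    the format_value helper are hypothetical names, the sample events are
    made up, and only the boolean "supported" flag comes from the notes in
    this commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Counter:
    """Hypothetical stand-in for one event and its result."""
    name: str
    supported: bool = True       # cleared when the event cannot be opened at all
    value: Optional[int] = None  # None: the event was opened but never counted


def format_value(counter: Counter) -> str:
    """Return the text for the value column of one event."""
    if not counter.supported:
        return "<not supported>"
    if counter.value is None:
        return "<not counted>"
    return f"{counter.value:,}"


# Illustrative data only; which events are unsupported depends on the machine.
counters = [
    Counter("cycles", value=11986),
    Counter("some-unsupported-event", supported=False),
    Counter("some-uncounted-event"),
]
for c in counters:
    print(f"{format_value(c):>18} {c.name}")
```

    Keeping the flag separate from the value means a supported event that
    simply never got scheduled is still reported differently from one the
    hardware cannot count.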

    v1->v2:
    changed supported type from int to bool

    v2->v3
    fixed vertical alignment of new struct element

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     

02 Jun, 2011

1 commit

  • Fixes two more cases where the python binding would not load:

    . Not finding die(), which it shouldn't find anyway: it is not good to
    stop the world just because some particular perf.data file is invalid,
    so just propagate the error to the caller.

    . Not finding perf_sample_size: fix it by moving it from event.c to evsel,
    where it belongs, as most cases are moving to operate on an evsel object.

    One of the fixed problems:

    [root@emilia ~]# python
    >>> import perf
    Traceback (most recent call last):
    File "", line 1, in
    ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
    >>>
    [root@emilia ~]#
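
    The first point, propagating an error instead of calling die(), could
    look roughly like this from the caller's side. It is a hedged Python
    sketch: InvalidHeaderError, read_header, the toy 16-byte header layout
    and the MAGIC value are all assumptions made for the example and are
    not taken from the perf sources.

```python
import struct


class InvalidHeaderError(ValueError):
    """Raised when a data file does not start with the expected header."""


MAGIC = b"PERFFILE"  # placeholder magic, used only by this sketch


def read_header(blob: bytes) -> int:
    """Parse a toy header: 8-byte magic followed by a little-endian u64 size.

    Invalid input raises an exception rather than terminating the process,
    so the caller decides whether to skip the file, retry, or give up.
    """
    if len(blob) < 16 or blob[:8] != MAGIC:
        raise InvalidHeaderError("not a valid data file")
    (size,) = struct.unpack_from("<Q", blob, 8)
    return size


try:
    read_header(b"garbage")
except InvalidHeaderError as err:
    # An interactive session survives a bad file and can move on.
    print(f"skipping file: {err}")
```

    A library routine that exits the process would take the whole
    interpreter down with it, which is why the binding has no use for die().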

    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

15 Apr, 2011

1 commit

  • perf stat doesn't mmap and it's perfectly fine for it to use task-bound
    counters with inheritance.

    So set the attr.inherit on the caller and leave the syscall itself to
    validate it.

    When the mmap fails perf_evlist__mmap will just emit a warning if this
    is the failure reason.

    Reported-by: Peter Zijlstra
    Acked-by: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

10 Mar, 2011

1 commit

  • So that we can reuse things like the id to attr lookup routine
    (perf_evlist__id2evsel) that uses a hash table instead of the linear
    lookup done in the older perf_header_attr routines, etc.

    Also to make evsels/evlist a more pervasive API, simplifying use of the
    emerging perf lib.
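
    The difference between the two lookup styles can be sketched like this.
    It is a Python illustration under stated assumptions: the Evsel and
    Evlist classes and the method names id2evsel and id2evsel_linear are
    hypothetical and only mirror the idea behind perf_evlist__id2evsel,
    with a dict playing the role of the hash table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Evsel:
    """Toy event selector holding the sample ids that belong to it."""
    name: str
    ids: List[int] = field(default_factory=list)


class Evlist:
    """Toy event list that indexes its selectors by sample id."""

    def __init__(self) -> None:
        self.entries: List[Evsel] = []
        self._by_id: Dict[int, Evsel] = {}

    def add(self, evsel: Evsel) -> None:
        self.entries.append(evsel)
        for sample_id in evsel.ids:
            self._by_id[sample_id] = evsel  # index once, at insertion time

    def id2evsel(self, sample_id: int) -> Optional[Evsel]:
        """Hashed lookup: one probe regardless of how many events exist."""
        return self._by_id.get(sample_id)

    def id2evsel_linear(self, sample_id: int) -> Optional[Evsel]:
        """Older style: walk every entry and every id until a match."""
        for evsel in self.entries:
            if sample_id in evsel.ids:
                return evsel
        return None


evlist = Evlist()
evlist.add(Evsel("cycles", ids=[10, 11]))
evlist.add(Evsel("instructions", ids=[20, 21]))
print(evlist.id2evsel(21).name)
```

    Both methods return the same selector; the hashed one just avoids
    rescanning the list for every sample that is read.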

    Cc: Arun Sharma
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

07 Mar, 2011

1 commit

  • By creating a perf_evlist out of the attributes in the perf.data file
    header, so that we can use evlists and evsels when reading recorded
    sessions in addition to when we record sessions.

    More work is needed so that tools let the user select which events are
    wanted when browsing sessions, be it just one or a subset of them,
    aggregated or shown at the same time but with different indications on
    the UI, to allow seeing workloads through different views at the same
    time.

    But the overall goal/trend is to more uniformly use evsels and evlists.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

17 Feb, 2011

1 commit

  • This patch changes the way perf stat prints event names at the end of a
    run. Until now, it was trying to reconstruct the event name from its
    encoding. The problem is that it would only print generic events without
    their modifiers (u, k, pp).

    This patch saves the event name as passed by the user in the evsel
    struct and uses it to print the final event name.

    This would also work in case perf is linked with a library (such as
    libpfm4) which provides full PMU event tables.

    $ perf stat -e cycles:u,cycles:k date
    Wed Feb 16 14:58:52 CET 2011

    Performance counter stats for 'date':

    568600 cycles:u
    2779715 cycles:k

    0.001908182 seconds time elapsed
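
    Why the saved name matters can be shown with a small sketch. This is
    illustrative Python, not the perf implementation: the Evsel class, the
    GENERIC_NAMES table and both helper functions are hypothetical, and the
    (type, config) pairs cover only two generic hardware events.

```python
from dataclasses import dataclass


@dataclass
class Evsel:
    """Toy event selector: the encoding plus the string the user typed."""
    type: int
    config: int
    name: str = ""  # e.g. "cycles:u", stored at parse time


# Reverse table from encoding to generic name; modifiers are not recoverable.
GENERIC_NAMES = {(0, 0): "cycles", (0, 1): "instructions"}


def reconstructed_name(evsel: Evsel) -> str:
    """Old behavior: rebuild a name from the encoding alone."""
    return GENERIC_NAMES.get((evsel.type, evsel.config), "unknown")


def display_name(evsel: Evsel) -> str:
    """New behavior: prefer the saved name, fall back to reconstruction."""
    return evsel.name or reconstructed_name(evsel)


# Counts taken from the example run in this commit message.
results = [
    (568600, Evsel(0, 0, "cycles:u")),
    (2779715, Evsel(0, 0, "cycles:k")),
]
for count, evsel in results:
    print(f"{count:>10} {display_name(evsel)}")
```

    Both events share one encoding here, so reconstruction alone would
    print "cycles" twice and the two lines could not be told apart.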

    Cc: Arun Sharma
    Cc: David S. Miller
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Stephane Eranian
    [ committer note: Fixed a merge problem with 023695d "Add cgroup support" ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

16 Feb, 2011

1 commit

  • This patch adds the ability to filter monitoring based on container groups
    (cgroups) for both perf stat and perf record. It is possible to monitor
    multiple cgroups in parallel. There is one cgroup per event. The cgroups to
    monitor are passed via a new -G option followed by a comma-separated list of
    cgroup names.

    The cgroup filesystem has to be mounted. Given a cgroup name, the perf tool
    finds the corresponding directory in the cgroup filesystem and opens it. It
    then passes that file descriptor to the kernel.

    Example:

    $ perf stat -B -a -e cycles:u,cycles:u,cycles:u -G test1,,test2 -- sleep 1
    Performance counter stats for 'sleep 1':

    2,368,667,414 cycles test1
    2,369,661,459 cycles
    cycles test2

    1.001856890 seconds time elapsed
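
    How the -G list lines up with the -e list can be sketched as below.
    This is a hedged Python illustration: pair_events_with_cgroups and
    open_cgroup are hypothetical helpers, the mount point is a parameter
    because its location is system-specific, and rejecting a length
    mismatch is an assumption of the sketch, not documented tool behavior.

```python
import os
from typing import List, Optional, Tuple


def pair_events_with_cgroups(events: List[str],
                             cgroups: str) -> List[Tuple[str, Optional[str]]]:
    """Pair the n-th event with the n-th cgroup name.

    An empty name between commas means "no cgroup filter for this event".
    """
    names = cgroups.split(",")
    if len(names) != len(events):
        # Assumption of this sketch: require one entry per event.
        raise ValueError("need exactly one cgroup entry per event")
    return [(event, name or None) for event, name in zip(events, names)]


def open_cgroup(mount_point: str, name: str) -> int:
    """Open the cgroup's directory and return its file descriptor.

    The descriptor is what would then be handed to the kernel.
    """
    return os.open(os.path.join(mount_point, name), os.O_RDONLY)


pairs = pair_events_with_cgroups(["cycles:u", "cycles:u", "cycles:u"],
                                 "test1,,test2")
for event, cgroup in pairs:
    print(event, cgroup if cgroup else "(system-wide, no cgroup)")
```

    With the list from the example above, the first and third events are
    restricted to test1 and test2, and the second one is left unfiltered.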

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
     

30 Jan, 2011

1 commit

  • They were in evsel.c because they came from refactoring existing evsel
    methods, so, to make reviewing the changes easier, I kept them there. Now
    it's a plain move.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

24 Jan, 2011

1 commit

  • Out of the {con,de}structor, as in interpreted language bindings we will
    need to go back from the wrapper object to the real thing. In that case
    using container_of will save us from having an extra pointer in the
    perf_evsel struct.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

23 Jan, 2011

4 commits

  • Adopting the new model used in 'perf record', where we don't have a map
    per thread per cpu. Instead we have one mmap per cpu, established on the
    first fd for that cpu, and ask the kernel, using the
    PERF_EVENT_IOC_SET_OUTPUT ioctl, to send events for the other fds on that
    cpu to the one with the mmap.

    The methods moved from perf_evsel to perf_evlist, but to ease review
    they were modified in place, in evsel.c; the next patch will move the
    migrated methods to evlist.c.

    With this 'perf top' now uses the same mmap model used by 'perf record'
    and the next patches will make 'perf record' use these new routines,
    establishing a common codebase for both tools.
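
    The per-cpu mmap model described above can be sketched as a planning
    step. This is illustrative Python only: plan_mmaps is a hypothetical
    helper that works on plain integers, issues no mmap or ioctl calls,
    and just decides which fd would own the buffer and which fds would be
    redirected to it.

```python
from typing import Dict, List, Tuple


def plan_mmaps(fds_by_cpu: Dict[int, List[int]]) -> Tuple[Dict[int, int],
                                                          Dict[int, int]]:
    """Decide, per cpu, which fd gets the mmap and which fds follow it.

    Returns (mmap_fd, redirect): mmap_fd maps cpu -> fd to mmap, and
    redirect maps every other fd -> the fd whose buffer it should use
    (the step that the SET_OUTPUT ioctl would perform).
    """
    mmap_fd: Dict[int, int] = {}
    redirect: Dict[int, int] = {}
    for cpu, fds in fds_by_cpu.items():
        if not fds:
            continue
        first, *rest = fds
        mmap_fd[cpu] = first      # one ring buffer per cpu
        for fd in rest:
            redirect[fd] = first  # remaining fds share that buffer
    return mmap_fd, redirect


# Two cpus with two fds each (made-up descriptor numbers).
mmap_fd, redirect = plan_mmaps({0: [3, 5], 1: [4, 6]})
print(mmap_fd)
print(redirect)
```

    The number of buffers then scales with the number of cpus rather than
    with cpus times threads.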

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Out of the code in 'perf top'. Record is next in line.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • As this is a per-cpu attribute, we can't set it up in advance and use it
    for all the calls.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • perf_evsel__open now has an extra boolean argument specifying whether
    event grouping is desired.

    The first file descriptor created on a CPU becomes the group leader.
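
    The leader rule can be sketched with a fake open call. This Python
    snippet is an assumption-laden illustration: open_on_cpus and fake_open
    are hypothetical, no real syscall is made, and -1 is used as the
    "no group" value passed for the first descriptor on each CPU.

```python
from typing import Callable, Dict, List, Tuple

# (cpu, index, group_fd) -> new fd; stands in for the real open syscall.
OpenFn = Callable[[int, int, int], int]


def open_on_cpus(cpus: List[int], per_cpu: int, group: bool,
                 sys_open: OpenFn) -> Dict[Tuple[int, int], int]:
    """Create per_cpu descriptors on each cpu, optionally as one group."""
    fds: Dict[Tuple[int, int], int] = {}
    for cpu in cpus:
        group_fd = -1  # no leader yet on this cpu
        for index in range(per_cpu):
            fd = sys_open(cpu, index, group_fd)
            fds[(cpu, index)] = fd
            if group and group_fd == -1:
                group_fd = fd  # first fd on this cpu becomes the leader
    return fds


calls: List[Tuple[int, int, int]] = []


def fake_open(cpu: int, index: int, group_fd: int) -> int:
    """Record the arguments and hand out increasing fake descriptors."""
    calls.append((cpu, index, group_fd))
    return 2 + len(calls)


fds = open_on_cpus([0, 1], per_cpu=2, group=True, sys_open=fake_open)
for call in calls:
    print(call)
```

    With grouping disabled every call would receive -1, so each
    descriptor would stand alone.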

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

07 Jan, 2011

1 commit

  • Since commit 69aad6f1 (perf tools: Introduce event selectors), only
    perf_event_attr::type and ::config are passed to the event selector,
    which makes the perf tool work incorrectly.

    For example, PEBS does not work because perf_event_attr::precise_ip is
    not passed to the syscall.
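
    The bug class is easy to reproduce in miniature. The following Python
    sketch is not the perf code: Attr is a hypothetical four-field stand-in
    for perf_event_attr, and the two constructor helpers only contrast
    copying two fields with copying the whole structure.

```python
from dataclasses import dataclass, replace


@dataclass
class Attr:
    """Toy subset of an event attribute structure."""
    type: int = 0
    config: int = 0
    precise_ip: int = 0
    sample_period: int = 0


def new_evsel_attr_lossy(attr: Attr) -> Attr:
    """Buggy variant: only type and config survive the copy."""
    return Attr(type=attr.type, config=attr.config)


def new_evsel_attr_full(attr: Attr) -> Attr:
    """Fixed variant: copy every field the caller configured."""
    return replace(attr)


requested = Attr(type=0, config=0, precise_ip=2, sample_period=1000)
print(new_evsel_attr_lossy(requested))
print(new_evsel_attr_full(requested))
```

    In the lossy copy precise_ip silently falls back to 0, which matches
    the symptom described above: the request for precise sampling never
    reaches the syscall.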

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Lin Ming
    Signed-off-by: Arnaldo Carvalho de Melo

    Lin Ming
     

06 Jan, 2011

1 commit

  • This patch fixes the usage of the perf_event.h header file
    between command modules and the supporting code in util.

    It is necessary to ensure that ALL files use the SAME
    perf_event.h header from the kernel source tree.

    There were a couple of #include <linux/perf_event.h> mixed
    with #include "../../perf_event.h".

    This caused issues on some distros because of mismatch
    in the layout of struct perf_event_attr. That eventually
    led perf stat to segfault.
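
    Why mixing two versions of a header is dangerous can be shown with
    ctypes. This is a sketch under loud assumptions: AttrOld and AttrNew
    are invented three- and four-field structures, not the real layout of
    struct perf_event_attr, and they only demonstrate how two definitions
    of "the same" struct can disagree on size and field offsets.

```python
import ctypes


class AttrOld(ctypes.Structure):
    """Layout as seen by code built against an older header (made up)."""
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("size", ctypes.c_uint32),
        ("config", ctypes.c_uint64),
    ]


class AttrNew(ctypes.Structure):
    """Layout as seen by code built against a newer header (made up)."""
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("size", ctypes.c_uint32),
        ("config", ctypes.c_uint64),
        ("extra", ctypes.c_uint64),  # field the older header lacks
    ]


old_size = ctypes.sizeof(AttrOld)
new_size = ctypes.sizeof(AttrNew)
extra_offset = AttrNew.extra.offset

print(f"old header: {old_size} bytes, new header: {new_size} bytes")
print(f"'extra' lives at offset {extra_offset}")
# If one file allocates old_size bytes and another writes 'extra',
# the write lands past the end of the allocation.
print("out-of-bounds write:", extra_offset >= old_size)
```

    Using a single header everywhere removes the disagreement, which is
    what this patch enforces.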

    Cc: David S. Miller
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Stephane Eranian
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

04 Jan, 2011

5 commits

  • Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Abstracting away the loops needed to create the various event fd handlers.

    The users have to pass a configured perf->evsel.attr field, which is already
    usable after perf_evsel__new (constructor) time, using defaults.

    Comes out of the ad-hoc routines in builtin-stat, which now uses it.

    Fixed a small silly bug where we were die()ing before killing our
    children, dysfunctional family this one 8-)

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Making them hopefully generic enough to be used in 'perf test',
    we'll see.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Out of ad-hoc code and global arrays with hard coded sizes.

    This is the first step towards having a library that will first be
    used in regression tests in the 'perf test' tool.

    [acme@felicio linux]$ size /tmp/perf.before
    text data bss dec hex filename
    1273776 97384 5104416 6475576 62cf38 /tmp/perf.before
    [acme@felicio linux]$ size /tmp/perf.new
    text data bss dec hex filename
    1275422 97416 1392416 2765254 2a31c6 /tmp/perf.new
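
    The effect shown by the two size runs can be worked out directly. The
    Python below only does arithmetic on the numbers printed above; the
    dictionary and variable names are arbitrary.

```python
# Section sizes in bytes, copied from the two `size` runs above.
before = {"text": 1273776, "data": 97384, "bss": 5104416}
after = {"text": 1275422, "data": 97416, "bss": 1392416}

# Per-section change (negative means the new binary is smaller).
delta = {section: after[section] - before[section] for section in before}
total_delta = sum(after.values()) - sum(before.values())

for section, change in delta.items():
    print(f"{section:>4}: {change:+,} bytes")
print(f"total: {total_delta:+,} bytes")
```

    Text and data grow slightly (by 1,646 and 32 bytes), while bss shrinks
    by 3,712,000 bytes, for a net reduction of 3,710,322 bytes. That is
    consistent with replacing statically sized global arrays with
    allocations made at run time.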

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo