21 Jul, 2011

1 commit

  • Move the option parameter out of the parse_events function and
    add a new parse_events_option function instead.

    The option parameter is used only to carry the "struct perf_evlist"
    pointer for chaining new events. Removing it lets us call
    parse_events from other places without using the option
    parameter.

    Signed-off-by: Jiri Olsa
    Cc: acme@redhat.com
    Cc: a.p.zijlstra@chello.nl
    Cc: paulus@samba.org
    Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Ingo Molnar

    Jiri Olsa
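
The split described in this commit can be sketched as follows. This is a minimal illustration in Python rather than the C used in tools/perf; the `Evlist` and `Option` classes and the comma-separated event syntax are stand-ins invented for the example, not the actual perf definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Evlist:
    """Hypothetical stand-in for struct perf_evlist: a list of event names."""
    entries: list = field(default_factory=list)


@dataclass
class Option:
    """Hypothetical stand-in for the option struct; `value` carries the evlist."""
    value: Evlist


def parse_events(evlist, spec):
    """Core parser: takes the evlist directly, so any caller can use it."""
    names = [name.strip() for name in spec.split(",") if name.strip()]
    if not names:
        return -1  # nothing to parse
    evlist.entries.extend(names)
    return 0


def parse_events_option(opt, spec):
    """Thin wrapper for option parsing: unpacks the evlist and delegates."""
    return parse_events(opt.value, spec)
```

Code that already holds an evlist calls `parse_events` directly, while the command-line option table would keep pointing at `parse_events_option`.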
     

28 May, 2011

3 commits

  • We now just warn the user about the fact and go on providing just
    userspace samples.

    This fixes a problem when no vmlinux is explicitly passed by the user,
    thus symbol_conf.vmlinux_name is NULL, no suitable vmlinux is found, and
    then we get:

    aldebaran:~> perf top -p 7557
    [kernel.kallsyms] with build id 44d9a989eabbd79e486bc079d6b743d397c204e0
    not found, continuing without symbols
    The (null) file can't be used

    Reported-by: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/n/tip-cj2g81hn64wv2bipmqk4fy2m@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Reported-by: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/n/tip-cyl5zmi1nu35vyu7l5im2pyv@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/n/tip-weqbs0tkk2u0qp1xxdxxosfg@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

22 May, 2011

2 commits


15 May, 2011

1 commit

  • The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
    --pid when monitoring multithreaded apps, as we can only share a ring
    buffer for events on the same thread if not doing per cpu.

    Fix it by using per thread ring buffers.

    Tested with:

    [root@felicio ~]# tuna -t 26131 -CP | nl
    1 thread ctxt_switches
    2 pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
    3 26131 OTHER 0 0,1 10814276 2397830 chromium-browse
    4 642 OTHER 0 0,1 14688 0 chromium-browse
    5 26148 OTHER 0 0,1 713602 115479 chromium-browse
    6 26149 OTHER 0 0,1 801958 2262 chromium-browse
    7 26150 OTHER 0 0,1 1271128 248 chromium-browse
    8 26151 OTHER 0 0,1 3 0 chromium-browse
    9 27049 OTHER 0 0,1 36796 9 chromium-browse
    10 618 OTHER 0 0,1 14711 0 chromium-browse
    11 661 OTHER 0 0,1 14593 0 chromium-browse
    12 29048 OTHER 0 0,1 28125 0 chromium-browse
    13 26143 OTHER 0 0,1 2202789 781 chromium-browse
    [root@felicio ~]#

    So 11 threads under pid 26131, then:

    [root@felicio ~]# perf record -F 50000 --pid 26131

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
    1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

    11 mmaps, one per thread since we didn't specify any CPU list, so we need one
    mmap per thread and:

    [root@felicio ~]# perf record -F 50000 --pid 26131
    ^M
    ^C[ perf record: Woken up 79 times to write data ]
    [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
    1 371310 26131
    2 96516 26148
    3 95694 26149
    4 95203 26150
    5 7291 26143
    6 87 27049
    7 76 661
    8 60 29048
    9 47 618
    10 43 642
    [root@felicio ~]#

    Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
    others are there.

    Then, if I specify one CPU:

    [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
    1 8444 26131
    2 2584 26149
    3 2518 26148
    4 2324 26150
    5 123 26143
    6 9 661
    7 9 29048
    [root@felicio ~]#

    This machine has two cores, so fewer threads appeared on the radar, and:

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
    1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

    Just one mmap, as now we can use just one per-cpu buffer instead of the
    per-thread needed in the previous case.

    For global profiling:

    [root@felicio ~]# perf record -F 50000 -a
    ^C[ perf record: Woken up 26 times to write data ]
    [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
    1 7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    2 7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

    It uses per-cpu buffers.

    For just one thread:

    [root@felicio ~]# perf record -F 50000 --tid 26148
    ^C[ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
    1 9969 26148
    [root@felicio ~]#

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
    1 7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

    Tested-by: David Ahern
    Tested-by: Lin Ming
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
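
The buffer counts observed in the test session above follow a simple rule, sketched here in Python. The function name and parameters are made up for illustration and do not correspond to anything in the perf sources.

```python
def expected_mmaps(nr_threads, cpu_list=None, system_wide=False, nr_online_cpus=1):
    """Number of ring buffers expected under the rules in the commit message.

    - system wide (-a): one per-cpu buffer for each online CPU
    - a CPU list was given (--cpu): one per-cpu buffer for each listed CPU
    - otherwise (--pid / --tid without CPUs): one buffer per monitored thread
    """
    if system_wide:
        return nr_online_cpus
    if cpu_list:
        return len(cpu_list)
    return nr_threads
```

With the numbers from the session: 11 threads and no CPU list give 11 mmaps, `--cpu 1` gives 1, `-a` on the two-core machine gives 2, and a single `--tid` gives 1.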
     

15 Apr, 2011

1 commit

  • perf stat doesn't mmap and it's perfectly fine for it to use task-bound
    counters with inheritance.

    So set the attr.inherit on the caller and leave the syscall itself to
    validate it.

    When the mmap fails perf_evlist__mmap will just emit a warning if this
    is the failure reason.

    Reported-by: Peter Zijlstra
    Acked-by: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

30 Mar, 2011

2 commits

  • Resend of patch sent back in January 2011 in light of recent confusion around
    unsupported events for a given platform.

    Improve sys_perf_event_open ENOENT return handling in top and record, just
    like 5a3446b does for stat.

    Retry of Arnaldo's patch using ui_warning instead of die which allows the
    fallback from hardware cycles to software clock.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    LKML-Reference:
    Signed-off-by: David Ahern
    [ committer note: Some adjustments to make it apply to newer codebase ]
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
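
A rough sketch of the warn-instead-of-die idea, in Python. The control flow is an assumption made for illustration, and `sys_open` is a hypothetical stand-in for sys_perf_event_open that returns an fd or raises OSError; the real logic lives in the C sources of top and record.

```python
import errno


def open_counter(event, sys_open, warn):
    """Open `event`; warn rather than abort so a fallback can still happen.

    Returns (event_used, fd), or None if the event is not supported.
    """
    try:
        return event, sys_open(event)
    except OSError as err:
        if event == "cycles":
            # Hardware cycles unavailable: retry with the software clock.
            warn("cycles not supported, trying to fall back to cpu-clock")
            return open_counter("cpu-clock", sys_open, warn)
        if err.errno == errno.ENOENT:
            warn(f"the {event} event is not supported")
            return None
        raise


def fake_sys_open(event):
    """Made-up kernel for the demo: only the software clock can be opened."""
    if event == "cpu-clock":
        return 3
    raise OSError(errno.ENOENT, "event not supported")


warnings = []
result = open_counter("cycles", fake_sys_open, warnings.append)
```

Had the first failure terminated the process, the retry with the software clock would never have been reached.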
     
  • We have to deal with the TUI mode in perf top, so that we don't end up
    with a garbled screen when, say, a non root user on a machine with a
    paranoid setting (the default) tries to use 'perf top'.

    Introduce a ui__warning_paranoid() routine shared by top and record that
    tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

23 Mar, 2011

1 commit

  • builtin-top.c has an uninitialized variable.
    gcc(version 4.5.1) warns about it and it results in build failure:

    builtin-top.c: In function 'display_thread':
    builtin-top.c:518:9: error: 'counter' may be used uninitialized

    This situation can indeed trigger, if the getline() call in
    prompt_integer() fails.

    Signed-off-by: Akihiro Nagai
    Cc: Arnaldo Carvalho de Melo
    Cc: Masami Hiramatsu
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Akihiro Nagai
     

12 Mar, 2011

4 commits

  • While going thru each of the sym_entry fields looking to reduce it to
    the set of entries needed when in an active symbols list, 'skip' should
    really be in symbol, as we set it when loading the symtab.

    And the space used by the basic symbol allocation remains the same as
    we had 5 bytes of padding.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • And the DSO__ORIG_ enum to SYMTAB__, to clarify that this is about from
    where the symtab was obtained.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We can get it from syme->map->dso->kernel (that should be renamed to
    origin, but leave this for another patch).

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • We can get that counter index from perf_top->sym_evsel->idx instead.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

10 Mar, 2011

1 commit

  • So that we can reuse things like the id to attr lookup routine
    (perf_evlist__id2evsel) that uses a hash table instead of the linear
    lookup done in the older perf_header_attr routines, etc.

    Also to make evsels/evlist a more pervasive API, simplifying use of the
    emerging perf lib.

    Cc: Arun Sharma
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
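
The difference between a hashed id lookup and a linear one can be illustrated with a small Python sketch. The classes below are simplified stand-ins; only the idea of mapping a sample id to its evsel comes from the commit message.

```python
class Evsel:
    """Hypothetical stand-in for an event selector owning some sample ids."""

    def __init__(self, name, ids):
        self.name = name
        self.ids = list(ids)


class Evlist:
    """Hypothetical stand-in for an evlist with an id -> evsel hash table."""

    def __init__(self):
        self.entries = []
        self._by_id = {}

    def add(self, evsel):
        self.entries.append(evsel)
        for sample_id in evsel.ids:
            self._by_id[sample_id] = evsel

    def id2evsel(self, sample_id):
        """Hashed lookup: cost does not grow with the number of events."""
        return self._by_id.get(sample_id)

    def id2evsel_linear(self, sample_id):
        """Linear lookup, shown for comparison: scans every entry."""
        for evsel in self.entries:
            if sample_id in evsel.ids:
                return evsel
        return None
```

Both methods return the same evsel; the hashed variant avoids walking the whole list for every sample.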
     

01 Mar, 2011

2 commits


22 Feb, 2011

1 commit

  • Now one has just to press the right key, 'a' or Enter on the main 'perf
    top --tui' screen to live annotate the symbol under the cursor.

    The annotate window starts centered on the hottest line (the one with
    most samples so far) then TAB and shift+TAB can be used to go to the
    prev/next hot line.

    Pressing 'H' at any point will center again the screen on the hottest
    line.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

11 Feb, 2011

1 commit


10 Feb, 2011

1 commit

  • Jeff Moyer reported these messages:

    Warning: ... trying to fall back to cpu-clock-ticks

    couldn't open /proc/-1/status
    couldn't open /proc/-1/maps
    [ls output]
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]

    That led me and David Ahern to see that something was fishy on the thread
    synthesizing routines, at least for the case where the workload is started
    from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
    parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
    PERF_RECORD_COMM events for the thread -1, a bug.

    So I investigated this and noticed that when we introduced support for
    recording a process and its threads using --pid some bugs were introduced and
    that the way to fix it was to instead of passing the target_tid to the event
    synthesizing routines we should better pass the thread_map that has the list of
    threads for a --pid or just the single thread for a --tid.

    Checked in the following ways:

    On an 8-way machine run cyclictest:

    [root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
    policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798

    T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122
    T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27
    T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8
    T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96
    T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12
    T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12
    T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9
    T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]

    [root@emilia ~]#

    This will create one extra thread per CPU:

    [root@emilia ~]# tuna -t cyclictest -CP
    thread ctxt_switches
    pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
    28825 OTHER 0 0xff 2169 671 cyclictest
    28832 FIFO 93 6 52338 1 cyclictest
    28833 FIFO 92 7 46524 1 cyclictest
    28826 FIFO 99 0 209360 1 cyclictest
    28827 FIFO 98 1 139577 1 cyclictest
    28828 FIFO 97 2 104686 0 cyclictest
    28829 FIFO 96 3 83751 1 cyclictest
    28830 FIFO 95 4 69794 1 cyclictest
    28831 FIFO 94 5 59825 1 cyclictest
    [root@emilia ~]#

    So we should expect only samples for the above 9 threads when using the
    --dump-raw-trace|-D perf report switch to look at the column with the tid:

    [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
    629 28825
    110 28826
    491 28827
    308 28828
    198 28829
    621 28830
    225 28831
    203 28832
    89 28833
    [root@emilia ~]#

    So it seems to work for workloads started by 'perf record'. Now, for existing
    workloads, just run cyclictest first, without 'perf record':

    [root@emilia ~]# tuna -t cyclictest -CP
    thread ctxt_switches
    pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
    28859 OTHER 0 0xff 594 200 cyclictest
    28864 FIFO 95 4 16587 1 cyclictest
    28865 FIFO 94 5 14219 1 cyclictest
    28866 FIFO 93 6 12443 0 cyclictest
    28867 FIFO 92 7 11062 1 cyclictest
    28860 FIFO 99 0 49779 1 cyclictest
    28861 FIFO 98 1 33190 1 cyclictest
    28862 FIFO 97 2 24895 1 cyclictest
    28863 FIFO 96 3 19918 1 cyclictest
    [root@emilia ~]#

    and then later did:

    [root@emilia ~]# perf record --pid 28859 sleep 3
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
    [root@emilia ~]#

    To collect 3 seconds worth of samples for pid 28859 and its children:

    [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
    15 28859
    33 28860
    19 28861
    13 28862
    13 28863
    10 28864
    11 28865
    9 28866
    255 28867
    [root@emilia ~]#

    Works, last thing is to check if looking at just one of those threads also works:

    [root@emilia ~]# perf record --tid 28866 sleep 3
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
    [root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
    3 28866
    [root@emilia ~]#

    Works too.

    Reported-by: Jeff Moyer
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jeff Moyer
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
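
The shell pipeline used in the checks above (grep, two cuts, sort, uniq -c) can be mirrored in Python as below. The sample lines in the usage part are made up to match the shape the `cut` commands rely on (a `pid/tid:` field); they are not real `perf report -D` output.

```python
from collections import Counter


def samples_per_tid(dump_lines):
    """Count RECORD_SAMPLE lines per tid, like grep | cut -d/ -f2 | cut -d: -f1 | uniq -c."""
    counts = Counter()
    for line in dump_lines:
        if "RECORD_SAMPLE" not in line:
            continue  # the grep step
        parts = line.split("/")
        if len(parts) < 2:
            # cut would pass a line without '/' through whole; here it is skipped.
            continue
        tid = parts[1].split(":")[0]  # second '/' field, then first ':' field
        counts[tid] += 1
    return counts


# Hypothetical dump lines, for illustration only.
example = [
    "0x1f0 [0x28]: PERF_RECORD_SAMPLE(IP, 1): 28859/28860: 0xffffffff8100a0b0 period: 1",
    "0x218 [0x28]: PERF_RECORD_SAMPLE(IP, 1): 28859/28860: 0xffffffff8100a0b4 period: 1",
    "0x240 [0x28]: PERF_RECORD_SAMPLE(IP, 2): 28859/28866: 0x0000000000400b10 period: 1",
    "0x268 [0x30]: PERF_RECORD_COMM: cyclictest:28859",
]
per_tid = samples_per_tid(example)
```

Any tid that shows up in the result but is not in the monitored thread list would point at the kind of bug this commit fixes.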
     

09 Feb, 2011

2 commits

  • The live annotation done in 'perf top' needs to limit the context before
    lines that aren't filtered out by the min percent filter. If we don't do
    that, the screen in a tty is often too small to show what is
    interesting: lines with hits and a few source code lines before them.

    Reported-by: Mike Galbraith
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
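
One way to picture the context limiting is the Python sketch below. It assumes each annotated line is a `(percent, text)` pair and that `context` is the number of preceding lines to keep; both are choices made for this example, not the data layout used by perf.

```python
def filter_with_context(lines, min_percent, context):
    """Keep lines at or above min_percent plus up to `context` lines before each."""
    keep = set()
    for i, (percent, _text) in enumerate(lines):
        if percent >= min_percent:
            # Keep the hit itself and a bounded window of lines before it.
            keep.update(range(max(0, i - context), i + 1))
    return [lines[i] for i in sorted(keep)]
```

For instance, with a threshold of 1.0 and a context of 2, a single hot line is shown together with only the two lines that precede it, however long the function is.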
     
  • Since we'll need it when implementing the live annotate TUI browser.

    This also simplifies things a bit by having the list head for the source
    code to be in the dynamically allocated part of struct annotation, that
    way we don't have to pass it around, it can be found from the struct
    symbol that is passed everywhere.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

07 Feb, 2011

2 commits

  • GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
    due to the -Werror=unused-but-set-variable flag.

    I've gone through and annotated some of the assignments that had side
    effects (ie: return value from a function) with the __used annotation,
    and in some cases, just removed unused code.

    In a few cases, we were assigning something useful, but not using it in
    later parts of the function.

    kyle@dreadnought:~/src% gcc --version
    gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)

    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Kyle McMartin
    [ committer note: Fixed up the annotation fixes, as that code moved recently ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Kyle McMartin
     
  • Next step: Live TUI annotation in perf top, just press enter on a symbol
    line.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

05 Feb, 2011

1 commit


01 Feb, 2011

2 commits

  • Disabled by default as there are features found in the stdio based one
    that aren't implemented, like live annotation, filtering knobs data
    entry.

    Annotation hopefully will get somehow merged with the 'perf annotate'
    code.

    To use it:

    perf top --tui

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Paving the way for a slang browser a la 'perf report --tui'.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

31 Jan, 2011

1 commit

  • So that we don't have to pass it around to the several methods that
    need it, simplifying usage.

    There is one case where we don't have the thread/cpu map in advance,
    which is in the parsing routines used by top, stat, record, that we have
    to wait till all options are parsed to know if a cpu or thread list was
    passed to then create those maps.

    For that case consolidate the cpu and thread map creation via
    perf_evlist__create_maps() out of the code in top and record, while also
    providing a perf_evlist__set_maps() for cases where multiple evlists
    share maps or for when maps that represent CPU sockets, for instance,
    get crafted out of topology information or subsets of threads in a
    particular application are to be monitored, providing more granularity
    in specifying which cpus and threads to monitor.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
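
The arrangement described in this commit, where the evlist owns its cpu and thread maps, might look like this in a simplified Python model. The method names echo perf_evlist__create_maps() and perf_evlist__set_maps(), but the signatures, the use of plain lists as maps, and the `-1` placeholder for "any" are assumptions made for the sketch.

```python
class Evlist:
    """Hypothetical stand-in for struct perf_evlist holding its own maps."""

    def __init__(self):
        self.cpus = None
        self.threads = None

    def create_maps(self, tids, cpu_list):
        """Build both maps once option parsing has finished."""
        self.cpus = sorted(set(cpu_list)) if cpu_list else [-1]
        self.threads = list(tids) if tids else [-1]

    def set_maps(self, cpus, threads):
        """Install maps built elsewhere, e.g. shared with another evlist."""
        self.cpus = cpus
        self.threads = threads

    def nr_fds(self, nr_events):
        """Methods read the maps from self; callers no longer pass them in."""
        return nr_events * len(self.cpus) * len(self.threads)
```

A second evlist can reuse the same maps through `set_maps`, which is the sharing case the commit message mentions.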
     

30 Jan, 2011

3 commits


24 Jan, 2011

2 commits

  • To avoid linking more stuff in the python binding I'm working on, future
    csets will make the sample type be taken from the evsel itself, but for
    that we need to first have one file per cpu and per sample_type, not a
    single perf.data file.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • To untangle it from struct thread handling, that is tied to symbols, etc.

    Right now in the python bindings I'm working on I need just a subset of
    the util/ files, untangling it allows me to do that.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

23 Jan, 2011

6 commits

  • Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Will be used in the upcoming 'perf test' entry for the evlist mmap
    routines.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Adopting the new model used in 'perf record', where we don't have a map
    per thread per cpu, instead we have an mmap per cpu, established on the
    first fd for that cpu and ask the kernel using the
    PERF_EVENT_IOC_SET_OUTPUT ioctl to send events for the other fds on that
    cpu for the one with the mmap.

    The methods moved from perf_evsel to perf_evlist, but for easing review
    they were modified in place, in evsel.c, the next patch will move the
    migrated methods to evlist.c.

    With this 'perf top' now uses the same mmap model used by 'perf record'
    and the next patches will make 'perf record' use these new routines,
    establishing a common codebase for both tools.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
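
The per-cpu mmap model can be sketched as a small planning function in Python. It only models the bookkeeping: which fd on each cpu would get the mmap and which fds would be redirected to it with the PERF_EVENT_IOC_SET_OUTPUT ioctl. The input layout (`{cpu: [fd, ...]}`) is an assumption for the example.

```python
def plan_mmaps(fds_by_cpu):
    """Return (mmapped, redirects) for a {cpu: [fd, ...]} layout.

    mmapped:   {cpu: fd}  the first fd on each cpu, which gets the mmap
    redirects: {fd: target_fd}  every other fd, redirected to that first fd
    """
    mmapped = {}
    redirects = {}
    for cpu, fds in fds_by_cpu.items():
        if not fds:
            continue
        first, rest = fds[0], fds[1:]
        mmapped[cpu] = first
        for fd in rest:
            redirects[fd] = first
    return mmapped, redirects
```

The number of mmaps then equals the number of cpus, regardless of how many events or threads are opened on each cpu.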
     
  • Out of the code in 'perf top'. Record is next in line.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Now that it handles group_fd and inherit we can use it, sharing it with
    stat.

    Next step: 'perf record' should use, then move the mmap_array out of
    ->priv and into perf_evsel, with top and record sharing this, and at the
    same time, write a 'perf test' stress test.

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Allocating just the space needed for nr_cpus * nr_threads * nr_evsels,
    not the MAX_NR_CPUS and counters.

    LKML-Reference:
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
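
As a worked example of the sizing, here is the arithmetic in Python. The function name is made up, and the numbers used for illustration are taken from other entries in this log (a two-core machine, 11 threads), not from this commit.

```python
def fd_slots(nr_cpus, nr_threads, nr_evsels):
    """Slots needed when sizing by what is actually monitored."""
    return nr_cpus * nr_threads * nr_evsels


# Two CPUs, 11 threads, one event: 22 slots instead of a fixed maximum.
needed = fd_slots(2, 11, 1)
```

Sizing by the actual counts means the allocation scales with the workload being monitored rather than with compile-time limits.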