19 Aug, 2009

1 commit

  • pushd tools/perf/Documentation
    make html
    popd

    is failing for me...

    ASCIIDOC perf-annotate.html
    ERROR: unsafe: include file: /etc/asciidoc/./stylesheets/xhtml11.css
    ERROR: unsafe: include file:
    /etc/asciidoc/./stylesheets/xhtml11-manpage.css
    ERROR: unsafe: include file:
    /etc/asciidoc/./stylesheets/xhtml11-quirks.css
    make: *** [perf-annotate.html] Error 1

    Apparently asciidoc "unsafe" is the default mode of operation
    in practice.

    https://bugzilla.redhat.com/show_bug.cgi?id=506953

    Works tidily now.

    Signed-off-by: Kyle McMartin
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kyle McMartin
     

18 Aug, 2009

1 commit

  • Linus reported this perf annotate segfault:

    [torvalds@nehalem git]$ perf annotate unmap_vmas
    Segmentation fault

    #0 map__clone (self=) at builtin-annotate.c:236
    #1 thread__fork (self=) at builtin-annotate.c:372

    The bug here was that builtin-annotate.c was a copy of
    builtin-report.c and a threading related fix to builtin-report.c
    didn't get propagated to builtin-annotate.c ...

    Reported-by: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

17 Aug, 2009

1 commit

  • Rename it to examples.txt to avoid the perf-*.txt pattern in
    the Makefile, otherwise 'make doc' fails because
    perf-examples.txt is not formatted to be a man page:

    ERROR: perf-examples.txt: line 1: manpage document title is mandatory

    Signed-off-by: Carlos R. Mafra
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Carlos R. Mafra
     

15 Aug, 2009

1 commit

  • We were using 'fd' locally, but there was a global 'fd' too, so
    when converting from open to fopen the test made against 'fd'
    should have been made against 'fp', but since we have that
    global the mistake didn't get discovered ...

    Reported-by: Ulrich Drepper
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

13 Aug, 2009

4 commits


12 Aug, 2009

5 commits

  • perf top supports a -C for setting the profile CPU, but perf
    record does not. This adds the same option for perf record,
    allowing the user to specify a specific target profile CPU.

    Signed-off-by: Jens Axboe
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jens Axboe
     
  • It is better than showing the map addr, this way at least we
    know that we can't get the symtabs because the DSO was deleted
    (system update) while an app still used such DSO.

    Yeah, don't do that, but if you do, you'll figure it out
    quicker this way.

    [acme@doppio linux-2.6-tip]$ perf report | head -15
    # Samples: 3796
    #
    # Overhead Command Shared Object Symbol
    # ........ ....... ................................................................... ......
    #
    23.55% pidgin /lib64/libglib-2.0.so.0.2000.4.#prelink#.Pd98lu (deleted) [.] 0x00000000038844
    21.55% pidgin /lib64/libpthread-2.10.1.so.#prelink#.AFwK8Q (deleted) [.] 0x0000000000a42d
    10.85% pidgin [kernel] [.] vread_hpet
    7.85% pidgin /lib64/libgobject-2.0.so.0.2000.4.#prelink#.o1vpU7 (deleted) [.] 0x00000000014de8
    3.35% pidgin /lib64/libc-2.10.1.so (deleted) [.] 0x0000000007a875
    3.19% pidgin /lib64/libdbus-1.so.3.4.0.#prelink#.6mwgZP (deleted) [.] 0x0000000001d254
    3.06% pidgin /usr/lib64/libgtk-x11-2.0.so.0.1600.5.#prelink#.511hAl (deleted) [.] 0x000000002334e7
    2.90% pidgin /usr/lib64/libgdk-x11-2.0.so.0.1600.5.#prelink#.5qlMo1 (deleted) [.] 0x00000000037b2d
    1.84% pidgin [kernel] [k] do_sys_poll
    1.45% pidgin /usr/lib64/libX11.so.6.2.0.#prelink#.iR59Rx (deleted) [.] 0x0000000004c751
    [acme@doppio linux-2.6-tip]$

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Luis Claudio R. Gonçalves
    Cc: Clark Williams
    Cc: H. Peter Anvin
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • In old binutils we can't access bfd_demangle(), use
    cplus_demangle() just like oprofile.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Luis Claudio R. Gonçalves
    Cc: H. Peter Anvin
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • This made it easier to find the firefox threading related
    bug.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: "H. Peter Anvin"
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Noticed when trying to record events for a firefox thread. We
    were synthesizing both .tid and .pid with the pid passed via
    --pid.

    Fix it by reading /proc/PID/status and getting the tgid
    to use in .pid, .tid gets the specified "pid".
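
    The lookup described above can be sketched as follows. This is only an
    illustration and not the code from the patch: the helper names
    parse_tgid_line() and tgid_of() are hypothetical, and the sketch assumes
    the usual "Tgid:" line of /proc/PID/status.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: if `line` is the "Tgid:" line of a status file,
 * store the value in *tgid and return 1; otherwise return 0.
 */
static int parse_tgid_line(const char *line, pid_t *tgid)
{
	int value;

	assert(line != NULL && tgid != NULL);
	if (sscanf(line, "Tgid: %d", &value) != 1)
		return 0;
	*tgid = (pid_t)value;
	return 1;
}

/*
 * Hypothetical helper: return the thread group id of the task `tid`,
 * or -1 if the status file cannot be read or has no "Tgid:" line.
 */
static pid_t tgid_of(pid_t tid)
{
	char path[64];
	char line[256];
	pid_t tgid = -1;
	FILE *fp;

	snprintf(path, sizeof(path), "/proc/%d/status", (int)tid);
	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;
	while (fgets(line, sizeof(line), fp) != NULL) {
		if (parse_tgid_line(line, &tgid))
			break;
	}
	fclose(fp);
	return tgid;
}
```

    With such a helper, the synthesized event would carry tgid_of(pid) in
    .pid and the value given to --pid in .tid.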

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: "H. Peter Anvin"
    Cc: Frédéric Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

09 Aug, 2009

19 commits

  • Sometimes we get callchain branches that have a rate under the
    limit given by the user.

    Say you launched:

    perf record -f -g -a ./hackbench 10
    perf report -g fractal,10.0

    And you got:

    2.33% hackbench [kernel] [k] _spin_lock_irqsave
    |
    |--78.57%-- remove_wait_queue
    | poll_freewait
    | do_sys_poll
    | sys_poll
    | sysenter_dispatch
    | 0xf7ffa430
    | 0x1ffadea3c
    |
    |--7.14%-- __up_read
    | up_read
    | do_page_fault
    | page_fault
    | 0xf7ffa430
    | 0xa0df710000000a
    ...

    It is abnormal to get a 7.14% branch whereas we passed a 10%
    filter.

    The problem is that we round down the minimum threshold. This
    happens mostly when we have a very low number of events. If the
    total amount of your branch is 4 and you have a sub-branch of 3
    events, filtering to 90% will be computed as follows:

    limit = 4 * 0.9;

    The result is 3.6, but the cast to integer rounds it down to 3,
    so the 3-event sub-branch passes. It means that our filter is
    actually 75%.

    We must then explicitly round up the minimum threshold.
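
    The arithmetic above can be reproduced with a small sketch. The two
    helper names are hypothetical and this is not the code from the patch;
    it only contrasts a truncating cast with an explicit round-up of the
    same threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncating cast, as in the computation described. */
static uint64_t min_hits_truncated(uint64_t total, double min_fraction)
{
	assert(min_fraction >= 0.0);
	return (uint64_t)(total * min_fraction);
}

/* Hypothetical helper: round up so a fractional limit is never lowered. */
static uint64_t min_hits_rounded_up(uint64_t total, double min_fraction)
{
	double exact;
	uint64_t limit;

	assert(min_fraction >= 0.0);
	exact = total * min_fraction;
	limit = (uint64_t)exact;
	if ((double)limit < exact)
		limit++;	/* 3.6 becomes 4 instead of 3 */
	return limit;
}
```

    For a branch of 4 events and a 90% filter, the first helper yields 3
    and the second yields 4, so a 3-event sub-branch is kept only by the
    truncating version.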

    Reported-by: Ingo Molnar
    Signed-off-by: Frederic Weisbecker
    Cc: acme@redhat.com
    Cc: peterz@infradead.org
    Cc: efault@gmx.de
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Due to a libz dependency in some distro's binutils package,
    C++ demangle support isn't compiled in despite the necessary
    libraries being available.

    Fix this by adding a -lz link test to the dependency detection
    rules.

    Signed-off-by: Mike Galbraith
    Acked-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • A few examples of how 'perf' can be used, from an e-mail by
    Ingo Molnar http://lkml.org/lkml/2009/8/4/346.

    Signed-off-by: Carlos R. Mafra
    Cc: Peter Zijlstra
    Cc: Valdis.Kletnieks@vt.edu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Carlos R. Mafra
     
  • … ignored chains in fractal mode

    When we filter the callchains below a given percentage, we
    ignore them and the end result only shows entries whose
    percentage is above the filter threshold.

    It seems to users then that we have an imbalance in the
    percentage, as if the sum inside a profiled branch doesn't
    reach 100%.

    Since in the past there have been real perf report bugs that
    showed the same symptom, it would be nice to assure the user
    that the data is perfect and trustable and it all sums up to
    100.00%.

    So fix this by displaying the remaining hits that have been
    filtered but without more detail than their amount in each
    branch. Example while filtering below 50%:

    7.73% [k] delay_tsc
    |
    |--98.22%-- __const_udelay
    | |
    | |--86.37%-- ath5k_hw_register_timeout
    | | ath5k_hw_noise_floor_calibration
    | | ath5k_hw_reset
    | | ath5k_reset
    | | ath5k_config
    | | ieee80211_hw_config
    | | |
    | | |--88.53%-- ieee80211_scan_work
    | | | worker_thread
    | | | kthread
    | | | child_rip
    | | --11.47%-- [...]
    | --13.63%-- [...]
    --1.78%-- [...]

    Reported-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Mike Galbraith <efault@gmx.de>
    LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Frederic Weisbecker
     
  • If we recorded with -g option to record the callchain, right now
    we require a -g option to perf report as well - and people reported
    this as unnecessary complication: the user already specified -g
    once, no need to require it a second time.

    So if the recording includes call-chains, display the callchain by
    default from perf report.

    ( The user can override this default using "-g none" option from
    perf report. )

    Reported-by: Ingo Molnar
    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • When the callchain tree comes to insert an empty backtrace, it
    raises a spurious warning about inserting an empty chain. The
    warning is spurious because the radix tree assumes it did
    something wrong to reach this state. But it didn't, we just met
    an empty callchain that has to be ignored.

    This happens occasionally with certain types of call-chain
    recordings. If it happens it's a big nuisance as perf report
    output starts with thousands of warning lines.

    Reported-by: Ingo Molnar
    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • 1. Ignore the -A argument if there is no perf.data file
    2. Treat an empty file like a non-existent file.

    Else, perf will try to read the perf.data header, and fail with
    an error.

    Treating an empty file like a non-existent file makes sense,
    since an interrupted (as in SIGKILLed) perf could leave such
    files around, and you don't want to annoy the user with errors
    for files with no data in them.

    Signed-off-by: Pierre Habouzit
    Acked-by: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Pierre Habouzit
     
  • While toying with perf, I've noticed that perf record can
    easily enter a busy loop when doing something as silly as:

    $ perf record -A ls

    Yeah, do_read here really wants to read a known size, not being
    able to should die(), not busy-loop ;)

    That was the cause for the bug.
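
    A read loop that cannot spin on a short file might look like the sketch
    below. It is an illustration under assumptions, not the patched
    do_read(): the name read_exact() is hypothetical, and it returns -1
    where the message suggests calling die(), so that the behaviour is easy
    to check in isolation.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly `size` bytes from `fd` into `buf`.
 * Returns 0 on success and -1 on error or premature end of file, so the
 * caller can bail out instead of retrying forever on a zero-byte read.
 */
static int read_exact(int fd, void *buf, size_t size)
{
	char *p = buf;

	assert(buf != NULL);
	while (size > 0) {
		ssize_t ret = read(fd, p, size);

		if (ret < 0 && errno == EINTR)
			continue;	/* interrupted by a signal: retry */
		if (ret <= 0)
			return -1;	/* error, or EOF before `size` bytes */
		p += ret;
		size -= (size_t)ret;
	}
	return 0;
}
```

    The important case is read() returning 0: the loop treats it as a
    failure rather than as progress of zero bytes.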

    Signed-off-by: Pierre Habouzit
    Acked-by: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Pierre Habouzit
     
  • Stop perf list from displaying tracepoints without an id file,
    those are special tracepoints that are not interfaced to
    perfcounters so listing them is erroneous and passing them as
    events will produce no output.

    Signed-off-by: Peter Zijlstra
    Acked-by: Jason Baron
    Cc: Steven Rostedt
    Cc: Chris Mason
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • We want to use a coherent flag for -S/--stat across all tools,
    so free up -S in perf stat.

    Signed-off-by: Brice Goglin
    Cc: Peter Zijlstra
    Cc: paulus@samba.org
    Signed-off-by: Ingo Molnar

    Brice Goglin
     
  • …gin (DSO, build-id, kernel, etc)

    Used with perf report --verbose:

    [acme@doppio linux-2.6-tip]$ perf report -v | head -16
    5.17% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
    2.56% firefox /lib64/libpthread-2.10.1.so 0x0000000000008e02 d [.] __pthread_mutex_lock_internal
    1.94% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x0000000000d0af8f f [.] SearchTable
    1.75% firefox [kernel] 0xffffffffff60013b k [.] vread_hpet
    1.63% firefox /lib64/libpthread-2.10.1.so 0x000000000000a404 d [.] __pthread_mutex_unlock
    1.47% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
    1.42% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
    1.24% firefox [kernel] 0xffffffff8102ca4a k [k] read_hpet
    1.16% firefox [kernel] 0xffffffff810f3dd4 k [k] fget_light
    1.11% firefox /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
    0.98% firefox /usr/lib64/firefox-3.5.2/firefox 0x000000000000dd23 b [.] arena_ralloc
    [acme@doppio linux-2.6-tip]$

    The new field is just after the symbol address, to help in
    figuring out symbol resolution bugs.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Acked-by: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Arnaldo Carvalho de Melo
     
  • Brice Goglin reported:

    > I can easily sort them by thread id, but I don't know how to match
    > my 4 events with each group of 4 lines.

    Also report the counter id and the time running/enabled
    stats (in case the counter got time-shared).

    Reported-by: Brice Goglin
    Signed-off-by: Peter Zijlstra
    Tested-by: Brice Goglin
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Brice Goglin reported that only the first result from a
    multi-counter perf record --stat run is accurate, the
    rest looks bogus.

    A silly mistake made us re-read the first attribute for
    every recorded attribute.

    Reported-by: Brice Goglin
    Signed-off-by: Peter Zijlstra
    Tested-by: Brice Goglin
    Cc: paulus@samba.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The callchain fractal mode builds the total hits of each new
    profiling branch from the hits of the current branch's parent
    plus the hits of the children.

    This is wrong: the total hits of a branch should be the sum of
    all the children's hits; we must ignore the parent's hits in
    this scope.

    This patch also fixes another mistake with the hit counting.

    Now the rates are correct.
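
    The corrected accounting can be pictured with the sketch below. The
    struct and function names are hypothetical simplifications and not the
    callchain code itself; the sketch also assumes that each child
    contributes its own hits plus those of its descendants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a callchain tree node. */
struct chain_node {
	uint64_t hits;			/* hits ending exactly at this node */
	struct chain_node **children;
	size_t nr_children;
};

/* Hits accumulated by a node and everything below it. */
static uint64_t cumulated_hits(const struct chain_node *node)
{
	uint64_t sum = node->hits;
	size_t i;

	for (i = 0; i < node->nr_children; i++)
		sum += cumulated_hits(node->children[i]);
	return sum;
}

/*
 * Base for the percentages of the branches below `parent`: the sum over
 * the children only, leaving out the parent's own hits.
 */
static uint64_t branch_total(const struct chain_node *parent)
{
	uint64_t sum = 0;
	size_t i;

	assert(parent != NULL);
	for (i = 0; i < parent->nr_children; i++)
		sum += cumulated_hits(parent->children[i]);
	return sum;
}
```

    With a parent holding 5 hits of its own and two children holding 3 and
    1, the children are rated against 4 (75% and 25%), not against 9.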

    Signed-off-by: Frederic Weisbecker
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    Cc: Pekka Enberg
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • perf_counter tools: update perf top manual page to reflect
    current implementation.

    Signed-off-by: Mike Galbraith
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Pressing any key which is not currently mapped to
    functionality, based on startup command line options, displays
    currently mapped keys, and prompts for input.

    Pressing any unmapped key at the prompt returns the user to
    display mode with variables unchanged. eg, pressing ?
    etc displays currently available keys, the value of the
    variable associated with that key, and prompts.

    Pressing same again aborts input.

    Signed-off-by: Mike Galbraith
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • …vidual counter display

    Add [w]eighted hotkey. Pressing [w] toggles between displaying
    weighted total of all counters, and the counter selected via
    [E]vent select key.

    ------------------------------------------------------------------------------
    PerfTop: 90395 irqs/sec kernel:16.1% [cache-misses/cache-references/instructions], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    weight samples pcnt RIP kernel function
    ______ _______ _____ ________________ _______________

    1275408.6 10881 - 5.3% - ffffffff81146f70 : copy_page_c
    553683.4 43569 - 21.3% - ffffffff81146f20 : clear_page_c
    74075.0 6768 - 3.3% - ffffffff81147190 : copy_user_generic_string
    40602.9 7538 - 3.7% - ffffffff81284ba2 : _spin_lock
    26882.1 965 - 0.5% - ffffffff8109d280 : file_ra_state_init

    [w]

    ------------------------------------------------------------------------------
    PerfTop: 91221 irqs/sec kernel:14.5% [10000Hz cache-misses], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    weight samples pcnt RIP kernel function
    ______ _______ _____ ________________ _______________

    47320.00 - 22.3% - ffffffff81146f20 : clear_page_c
    14261.00 - 6.7% - ffffffff810992f5 : __rmqueue
    11046.00 - 5.2% - ffffffff81146f70 : copy_page_c
    7842.00 - 3.7% - ffffffff81284ba2 : _spin_lock
    7234.00 - 3.4% - ffffffff810aa1d6 : unmap_vmas

    Signed-off-by: Mike Galbraith <efault@gmx.de>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

    Mike Galbraith
     
  • perf top used to have annotation support, but it bitrotted and
    was removed.

    This patch restores that: it allows the user to select any symbol
    in kernel space for source level annotation on the fly, switch
    between event counters and alter display variables. When symbol
    details are being displayed, stopping annotation reverts to normal.

    known keys:
    [d] select display delay.
    [e] select display entries (lines).
    [E] select annotation event counter.
    [f] select normal display count filter.
    [F] select annotation display count filter (percentage).
    [qQ] quit.
    [s] select annotation symbol and start annotation.
    [S] stop annotation, revert to normal display.
    [z] toggle event count zeroing.

    Sample:
    ------------------------------------------------------------------------------
    PerfTop: 16719 irqs/sec kernel:78.7% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    Showing cache-misses for e1000_clean_rx_irq
    Events Pcnt (>=3%)
    0 0.0% /* adjust length to remove Ethernet CRC */
    0 0.0% if (!(adapter->flags2 & FLAG2_CRC_STRIPPING))
    0 0.0% length -= 4;
    436 5.0% f039: 41 f6 84 24 5c 29 00 testb $0x1,0x295c(%r12)
    0 0.0% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx
    0 0.0% f08c: 48 83 ef 02 sub $0x2,%rdi
    0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi
    811 9.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
    0 0.0%
    0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) {
    0 0.0% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15)
    7226 82.6% f119: 0f 85 24 fe ff ff jne ef43

    Available events:
    0 cache-misses
    1 cache-references
    2 instructions
    3 cycles
    Enter details event counter: 2
    ------------------------------------------------------------------------------
    PerfTop: 15035 irqs/sec kernel:79.0% [cache-misses/cache-references/instructions/cycles], (all, 4 CPUs)
    ------------------------------------------------------------------------------

    Showing instructions for e1000_clean_rx_irq
    Events Pcnt (>=3%)
    0 0.0% int *work_done, int work_to_do)
    0 0.0% {
    175 0.9% eebf: 55 push %rbp
    1898 9.8% eec0: 48 89 e5 mov %rsp,%rbp
    0 0.0%
    0 0.0% i = rx_ring->next_to_clean;
    140 0.7% ef0a: 0f b7 41 1a movzwl 0x1a(%rcx),%eax
    670 3.4% ef0e: 89 45 ac mov %eax,-0x54(%rbp)
    0 0.0% {
    0 0.0% memcpy(skb->data + offset, from, len);
    91 0.5% f07b: 49 8b b6 e8 00 00 00 mov 0xe8(%r14),%rsi
    1153 5.9% f082: 48 8b b8 e8 00 00 00 mov 0xe8(%rax),%rdi
    42 0.2% f089: 8b 4d 84 mov -0x7c(%rbp),%ecx
    14 0.1% f08c: 48 83 ef 02 sub $0x2,%rdi
    0 0.0% f090: 48 83 ee 02 sub $0x2,%rsi
    1618 8.3% f094: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
    0 0.0%
    0 0.0% /* return some buffers to hardware, one at a time is too slow */
    0 0.0% if (cleaned_count >= E1000_RX_BUFFER_WRITE) {
    867 4.5% f0e7: 83 7d b0 0f cmpl $0xf,-0x50(%rbp)
    0 0.0%
    0 0.0% while (rx_desc->status & E1000_RXD_STAT_DD) {
    37 0.2% f114: 41 f6 47 0c 01 testb $0x1,0xc(%r15)
    4047 20.8% f119: 0f 85 24 fe ff ff jne ef43

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • This patch implements the kernel side support for ftrace event
    record sampling.

    A new counter sampling attribute is added:

    PERF_SAMPLE_TP_RECORD

    which requests ftrace events record sampling. In this case
    if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
    fires, we emit the tracepoint binary record to the
    perfcounter event buffer, as a sample.

    Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
    record:

    perf record -f -F 1 -a -e workqueue:workqueue_execution
    perf report -D

    0x21e18 [0x48]: event: 9
    .
    . ... raw event: size 72 bytes
    . 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........
    . 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!......
    . 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve
    . 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1...........
    . 0040: e0 b1 31 81 ff ff ff ff .......
    .
    0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33

    The raw ftrace binary record starts at offset 0020.

    Translation:

    struct trace_entry {
    type = 0x2b = 43;
    flags = 1;
    preempt_count = 2;
    pid = 0xa = 10;
    tgid = 0xa = 10;
    }

    thread_comm = "events/1"
    thread_pid = 0xa = 10;
    func = 0xffffffff8131b1e0 = flush_to_ldisc()
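
    The translation above can be checked mechanically with the sketch
    below. It is an illustration only: the struct name, the helper names
    and the field offsets are assumptions inferred from the hex dump
    (little-endian, record starting at offset 0020), not definitions taken
    from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields listed in the translation; the struct name is hypothetical. */
struct decoded_entry {
	uint16_t type;
	uint8_t flags;
	uint8_t preempt_count;
	int32_t pid;
	int32_t tgid;
	char comm[17];		/* 16 record bytes plus a terminator */
	int32_t thread_pid;
	uint64_t func;
};

/* Read a little-endian integer of `len` bytes (len <= 8). */
static uint64_t le_load(const uint8_t *p, size_t len)
{
	uint64_t v = 0;

	assert(len <= 8);
	while (len-- > 0)
		v = (v << 8) | p[len];
	return v;
}

/* Decode the 40-byte record; offsets are inferred from the dump. */
static void decode_entry(const uint8_t *rec, struct decoded_entry *out)
{
	size_t i;

	out->type = (uint16_t)le_load(rec, 2);
	out->flags = rec[2];
	out->preempt_count = rec[3];
	out->pid = (int32_t)le_load(rec + 4, 4);
	out->tgid = (int32_t)le_load(rec + 8, 4);
	for (i = 0; i < 16; i++)
		out->comm[i] = (char)rec[12 + i];
	out->comm[16] = '\0';
	out->thread_pid = (int32_t)le_load(rec + 28, 4);
	out->func = le_load(rec + 32, 8);
}

/* Bytes 0x20..0x47 of the raw event shown in the dump. */
static const uint8_t sample_record[40] = {
	0x2b, 0x00, 0x01, 0x02, 0x0a, 0x00, 0x00, 0x00,
	0x0a, 0x00, 0x00, 0x00, 0x65, 0x76, 0x65, 0x6e,
	0x74, 0x73, 0x2f, 0x31, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00,
	0xe0, 0xb1, 0x31, 0x81, 0xff, 0xff, 0xff, 0xff,
};
```

    Passing sample_record to decode_entry() should give back the values in
    the translation: type 43, flags 1, preempt_count 2, pid and tgid 10,
    the comm string events/1 and the func address 0xffffffff8131b1e0.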

    What will come next?

    - Userspace support ('perf trace'), 'flight data recorder' mode
    for perf trace, etc.

    - The unconditional copy from the profiling callback has a
    cost even when nobody wants such sampling to occur, and that
    needs to be fixed in the future. For that we need to have
    instant access to the perf counter attribute. This is a
    matter of adding a flag to the struct ftrace_event.

    - Take care of event recursion! Don't ever try to record
    a lock event for example, it seems some locking is used in
    the profiling fast path and leads to tracing recursion.
    That will be fixed using raw spinlocks or recursion
    protection.

    - [...]

    - Profit! :-)

    Signed-off-by: Frederic Weisbecker
    Cc: Li Zefan
    Cc: Tom Zanussi
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Gabriel Munteanu
    Cc: Lai Jiangshan
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

07 Aug, 2009

2 commits

  • Adds autodetection for libelf as well, and simplifies the
    libbfd code. Furthermore, fail make with an error when libelf
    is not found and warn about the lack of libbfd.

    Also provide an option to build a 32bit version even though you
    might be running a 64bit kernel.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In some cases distros have binaries and debuginfo in weird places:

    [root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
    -rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
    -rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
    [root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
    19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
    19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/firefox-3.5.2/firefox
    [root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
    xulrunner-1.9.1.2-1.fc11.x86_64
    firefox-3.5.2-2.fc11.x86_64
    [root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
    ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
    -rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug

    Seemingly we don't have a .symtab when we actually can find it
    if we use the .note.gnu.build-id ELF section put in place by
    some distros. Use it and find the symbols we need.

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

05 Aug, 2009

3 commits


04 Aug, 2009

1 commit


02 Aug, 2009

2 commits

  • We skip the display of idle routine related symbols because
    they are typically rather erratic and confusing: they depend
    on the IRQ rate or sometimes they dominate the profile if
    they are polling based.

    Add mwait_idle_with_hints too, this is one of the idle
    routines on x86.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • This patch fixes a spelling error that has resulted from copy
    and pasting. The location of the error was found using a
    semantic patch but the semantic patch was not trying to find
    these errors. After looking things over it seemed logical that
    this change was needed. Please review it and then include the
    patch if it is in fact the correct change.

    Signed-off-by: Stoyan Gaydarov
    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stoyan Gaydarov