09 Jun, 2009

3 commits


08 Jun, 2009

4 commits

  • Standardize and tidy up all the messages we print during
    perfcounter initialization.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Fill in core2_hw_cache_event_id[] with the Atom model specific events.

    The events can be used in all the tools via the -e (--event) parameter,
    for example "-e l1-misses" or "-e l2-accesses" or "-e l2-write-misses".

    ( Note: these are straight from the Intel manuals - not tested yet.)

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Fill in core2_hw_cache_event_id[] with the Core2 model specific events.

    The events can be used in all the tools via the -e (--event) parameter,
    for example "-e l1-misses" or "-e l2-accesses" or "-e l2-write-misses".

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Before:

    7549326754 cycles # 3201.811 M/sec
    10007594937 instructions # 4244.408 M/sec

    After:

    7542051194 cycles # 3201.996 M/sec
    10007743852 instructions # 4248.811 M/sec # 1.327 per cycle

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
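
The "per cycle" figure appended in the After output is the ratio of the two counts printed above it. A minimal sketch of that arithmetic follows; the helper names `instructions_per_cycle` and `format_ipc` are hypothetical and not taken from the perf source.

```python
def instructions_per_cycle(instructions, cycles):
    """Return instructions divided by cycles, or None if no cycles were counted."""
    if cycles <= 0:
        return None  # avoid dividing by zero when the cycles counter is empty
    return instructions / cycles


def format_ipc(instructions, cycles):
    """Build a trailing annotation in the style of the After output."""
    ipc = instructions_per_cycle(instructions, cycles)
    return "" if ipc is None else f"# {ipc:.3f} per cycle"


if __name__ == "__main__":
    # Counts copied from the After output above.
    print(format_ipc(10007743852, 7542051194))
```

Returning an empty annotation when the cycle count is zero is a design choice of this sketch, so that a missing counter does not produce a bogus ratio.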
     

07 Jun, 2009

15 commits

  • Before:

    $ perf report
    failed to open file: No such file or directory

    After:

    $ perf report
    failed to open file: perf.data (try 'perf record' first)

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • If perf is run on a !CONFIG_PERF_COUNTER kernel right now it
    bails out with no messages or with confusing messages.

    Standardize this case some more and explain the situation.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • On architectures/CPUs without PMU support but with perfcounters
    enabled 'perf record' currently fails because it cannot create a
    cycle based hw-perfcounter.

    Fall back to the cpu-clock-tick sw-perfcounter in this case, which
    is hrtimer based and will always work (as long as perfcounters
    are enabled).

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
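
The fallback described above can be sketched as a try-in-order loop. This is only an illustration: `open_counter` stands in for the real counter-creation syscall, and the event names and the fake opener are assumptions made for the example, not perf's actual code.

```python
import errno


def open_with_fallback(open_counter, events=("cycles", "cpu-clock-ticks")):
    """Try each event in order and return (event, handle) for the first that opens.

    open_counter is a caller-supplied function that returns a handle or
    raises OSError; it stands in for the real counter-creation syscall.
    """
    last_error = None
    for event in events:
        try:
            return event, open_counter(event)
        except OSError as err:
            last_error = err  # remember why this event failed, then try the next
    raise OSError(errno.ENODEV, "no usable counter") from last_error


def fake_open_without_pmu(event):
    """Simulated opener for a machine with no PMU: only the sw clock works."""
    if event == "cycles":
        raise OSError(errno.ENODEV, "No such device")
    return 3  # pretend file descriptor


if __name__ == "__main__":
    print(open_with_fallback(fake_open_without_pmu))
```

Passing the opener in as a parameter keeps the fallback policy separate from the syscall, which also makes it easy to exercise without PMU-less hardware.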
     
  • On architectures/CPUs without PMU support but with perfcounters
    enabled 'perf top' currently fails because it cannot create a
    cycle based hw-perfcounter.

    Fall back to the cpu-clock-tick sw-perfcounter in this case, which
    is hrtimer based and will always work (as long as perfcounters
    are enabled).

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Before:

    $ perf stat ~/hackbench 5

    error: syscall returned with -1 (No such device)

    After:

    $ perf stat ~/hackbench 5
    Time: 1.640

    Performance counter stats for '/home/mingo/hackbench 5':

    6524.570382 task-clock-ticks # 3.838 CPU utilization factor
    35704 context-switches # 0.005 M/sec
    191 CPU-migrations # 0.000 M/sec
    8958 page-faults # 0.001 M/sec
    cycles
    instructions
    cache-references
    cache-misses

    Wall-clock time elapsed: 1699.999995 msecs

    Also add -v (--verbose) option to allow the printing of failed
    counter opens.

    Plus don't print 'inf' if wall-time is zero (due to jiffies granularity),
    instead skip the printing of the CPU utilization factor.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
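
The zero wall-time rule above can be illustrated with a small formatter. This is a sketch under assumptions: the function name and the exact output layout are made up for the example, and only the "skip the factor when wall-time is zero" behaviour comes from the commit text.

```python
def format_task_clock(task_clock_msecs, walltime_msecs):
    """Format the task-clock line, adding the utilization factor only when valid."""
    line = f"{task_clock_msecs:.6f} task-clock-ticks"
    if walltime_msecs > 0:
        # Only divide when some wall-clock time was actually measured.
        factor = task_clock_msecs / walltime_msecs
        line += f" # {factor:.3f} CPU utilization factor"
    return line


if __name__ == "__main__":
    # Values copied from the After output above.
    print(format_task_clock(6524.570382, 1699.999995))
    # Wall-time rounded down to zero: the factor is omitted instead of 'inf'.
    print(format_task_clock(6524.570382, 0.0))
```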
     
  • The first snapshot reading often occurs before any events have
    been read in the mapped perfcounter files.

    Just wait until we have at least one event before starting the
    snapshot, or the delay before the first set of entries to be
    displayed may be long in case of low refresh rate.

    Note: we could also use a semaphore to wait until
    "print_entries" number of events is reached, but this
    value is tunable and we can't ensure we will even reach it.
    We could also rely on a default minimum set of entries for the
    first refresh, say 15, but again, the minimal sample is
    tunable, and we could end up displaying nothing until we have a
    minimal default set of events, which can take some time in case
    of high sample filters.

    Hence this simple solution which partially covers the default
    case.

    [ Impact: fix display artifacts in perf top ]

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Arjan noticed this bug in the perf annotate help output:

    -s, --symbol symbol to annotate

    that should be instead.

    Reported-by: Arjan van de Ven
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The "perf report" utility crashed in some circumstances
    because the "sym" stack variable was not initialized before use
    (as also proven by valgrind).

    With this fix the crash goes away and valgrind no longer complains.

    Signed-off-by: Arjan van de Ven
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Right now kernel debug info does not get resolved by default, because
    we don't know where to look for the vmlinux.

    The -k option can be used for that - but if no option is given, pick
    up vmlinux files in the current directory - in case a kernel hacker
    runs profiling from the source directory that the kernel was built in.

    The real solution would be to embed the location (and perhaps the
    date/timestamp) of the vmlinux file in /proc/kallsyms, so that
    tools can pick it up automatically.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
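
The lookup order described above (explicit -k first, then a vmlinux in the current directory) can be sketched as follows. The function name `pick_vmlinux` and its parameters are hypothetical; the real tool's option handling is not shown in this log.

```python
import os


def pick_vmlinux(k_option=None, cwd="."):
    """Return the vmlinux path to use, or None if nothing suitable is found."""
    if k_option:
        return k_option  # an explicit -k argument always wins
    candidate = os.path.join(cwd, "vmlinux")
    # Fall back to a vmlinux sitting in the directory perf was started from.
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    print(pick_vmlinux())
```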
     
  • gcc warned about this bug:

    util/parse-events.c: In function ‘parse_generic_hw_symbols’:
    util/parse-events.c:175: warning: comparison is always false due to limited range of data type
    util/parse-events.c:182: warning: comparison is always false due to limited range of data type
    util/parse-events.c:190: warning: comparison is always false due to limited range of data type

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Several people have suggested that 'perf' has become a full-fledged
    tool that should be moved out of Documentation/. Move it to the
    (new) tools/ directory.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Merge reason: Pick up the latest fixes before the -v8 perfcounters
    release.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Add new perf sub-command to display annotated source code:

    $ perf annotate decode_tree_entry

    ------------------------------------------------
    Percent | Source code & Disassembly of /home/mingo/git/git
    ------------------------------------------------
    :
    : /home/mingo/git/git: file format elf64-x86-64
    :
    :
    : Disassembly of section .text:
    :
    : 00000000004a0da0 :
    : *modep = mode;
    : return str;
    : }
    :
    : static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
    : {
    3.82 : 4a0da0: 41 54 push %r12
    : const char *path;
    : unsigned int mode, len;
    :
    : if (size < 24 || buf[size - 21])
    0.17 : 4a0da2: 48 83 fa 17 cmp $0x17,%rdx
    : *modep = mode;
    : return str;
    : }
    :
    : static void decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned long size)
    : {
    0.00 : 4a0da6: 49 89 fc mov %rdi,%r12
    0.00 : 4a0da9: 55 push %rbp
    3.37 : 4a0daa: 53 push %rbx
    : const char *path;
    : unsigned int mode, len;
    :
    : if (size < 24 || buf[size - 21])
    0.08 : 4a0dab: 76 73 jbe 4a0e20
    0.00 : 4a0dad: 80 7c 16 eb 00 cmpb $0x0,-0x15(%rsi,%rdx,1)
    3.48 : 4a0db2: 75 6c jne 4a0e20
    : static const char *get_mode(const char *str, unsigned int *modep)
    : {
    : unsigned char c;
    : unsigned int mode = 0;
    :
    : if (*str == ' ')
    1.94 : 4a0db4: 0f b6 06 movzbl (%rsi),%eax
    0.39 : 4a0db7: 3c 20 cmp $0x20,%al
    0.00 : 4a0db9: 74 65 je 4a0e20
    : return NULL;
    :
    : while ((c = *str++) != ' ') {
    0.06 : 4a0dbb: 89 c2 mov %eax,%edx
    : if (c < '0' || c > '7')
    1.99 : 4a0dbd: 31 ed xor %ebp,%ebp
    : unsigned int mode = 0;
    :
    : if (*str == ' ')
    : return NULL;
    :
    : while ((c = *str++) != ' ') {
    1.74 : 4a0dbf: 48 8d 5e 01 lea 0x1(%rsi),%rbx
    : if (c < '0' || c > '7')
    0.00 : 4a0dc3: 8d 42 d0 lea -0x30(%rdx),%eax
    0.17 : 4a0dc6: 3c 07 cmp $0x7,%al
    0.00 : 4a0dc8: 76 0d jbe 4a0dd7
    0.00 : 4a0dca: eb 54 jmp 4a0e20
    0.00 : 4a0dcc: 0f 1f 40 00 nopl 0x0(%rax)
    16.57 : 4a0dd0: 8d 42 d0 lea -0x30(%rdx),%eax
    0.14 : 4a0dd3: 3c 07 cmp $0x7,%al
    0.00 : 4a0dd5: 77 49 ja 4a0e20
    : return NULL;
    : mode = (mode << 3) + (c - '0');
    3.12 : 4a0dd7: 0f b6 c2 movzbl %dl,%eax
    : unsigned int mode = 0;
    :
    : if (*str == ' ')
    : return NULL;
    :
    : while ((c = *str++) != ' ') {
    0.00 : 4a0dda: 0f b6 13 movzbl (%rbx),%edx
    16.74 : 4a0ddd: 48 83 c3 01 add $0x1,%rbx
    : if (c < '0' || c > '7')
    : return NULL;
    : mode = (mode << 3) + (c - '0');

    The first column is the percentage of samples that arrived on that
    particular line - relative to the total cost of the function.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
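
The first-column percentage described above is each line's sample count divided by the function's total. A minimal sketch, with a hypothetical helper name and made-up sample counts (the log does not give raw counts):

```python
def line_percentages(samples_by_address):
    """Map each address to its share of the function's samples, in percent."""
    total = sum(samples_by_address.values())
    if total == 0:
        # No samples landed in this function: report zero everywhere.
        return {addr: 0.0 for addr in samples_by_address}
    return {addr: 100.0 * count / total
            for addr, count in samples_by_address.items()}


if __name__ == "__main__":
    # Made-up counts keyed by two addresses from the listing above.
    for addr, pct in line_percentages({0x4a0da0: 1, 0x4a0da2: 3}).items():
        print(f"{pct:6.2f} : {addr:x}")
```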
     
  • Prepare for the 'perf annotate' implementation by splitting off
    builtin-annotate.c from builtin-report.c.

    ( We keep this commit separate to ease the later librarization
    of the facilities that perf-report and perf-annotate share. )

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

06 Jun, 2009

13 commits

  • Also fix a misalignment in usage string printing.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Also add perf list to command-list.txt.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Also update other areas of the help texts.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Also standardize the cache printout (so that it can be pasted back
    into the command) and sort out the aliases.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • perf list: List all the available event types which can be used in
    -e (--event) options.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Extend generic event enumeration with the PERF_TYPE_HW_CACHE
    method.

    This is a 3-dimensional space:

    { L1-D, L1-I, L2, ITLB, DTLB, BPU } x
    { load, store, prefetch } x
    { accesses, misses }

    User-space passes in the 3 coordinates and the kernel provides
    a counter. (if the hardware supports that type and if the
    combination makes sense.)

    Combinations that make no sense produce a -EINVAL.
    Combinations that are not supported by the hardware produce -ENOTSUP.

    Extend the tools to deal with this, and rewrite the event symbol
    parsing code with various popular aliases for the units and
    access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
    both valid aliases.

    ( x86 is supported for now, with the Nehalem event table filled in,
    and with Core2 and Atom having placeholder tables. )

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
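
The three coordinates and the two error cases above can be sketched as a table lookup. Everything concrete here is an assumption for illustration: the table contents, the marker values, the placeholder event IDs, and the choice to treat an instruction-cache store as a nonsensical combination.

```python
import errno

CACHES = ("L1-D", "L1-I", "L2", "ITLB", "DTLB", "BPU")
OPS = ("load", "store", "prefetch")
RESULTS = ("accesses", "misses")

NONSENSE = -1     # marker: the combination makes no sense
UNSUPPORTED = 0   # marker: the hardware has no event for it

# Made-up table for an imaginary CPU; the values are placeholders,
# not real event IDs from any vendor manual.
EXAMPLE_TABLE = {
    ("L1-D", "load", "accesses"): 0x01,
    ("L1-D", "load", "misses"): 0x02,
    ("L1-I", "store", "accesses"): NONSENSE,
    ("L1-I", "store", "misses"): NONSENSE,
}


def lookup_cache_event(table, cache, op, result):
    """Return an event ID, or a negative errno for invalid/unsupported requests."""
    if cache not in CACHES or op not in OPS or result not in RESULTS:
        return -errno.EINVAL  # coordinate outside the 3-dimensional space
    event_id = table.get((cache, op, result), UNSUPPORTED)
    if event_id == NONSENSE:
        return -errno.EINVAL
    if event_id == UNSUPPORTED:
        return -errno.ENOTSUP
    return event_id


if __name__ == "__main__":
    print(lookup_cache_event(EXAMPLE_TABLE, "L1-D", "load", "misses"))
```

Keeping two distinct markers lets a caller tell "this request can never work" apart from "this CPU cannot count it", which matches the two error codes named in the commit text.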
     
  • Counter type is a frequently used value and we do a lot of
    bit juggling by encoding and decoding it from attr->config.

    Clean this up by creating a separate attr->type field.

    Also clean up the various similarly complex user-space bits
    all around counter attribute management.

    The net improvement is significant, and it will be easier
    to add a new major type (which is what triggered this cleanup).

    (This changes the ABI, all tools are adapted.)
    (PowerPC build-tested.)

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • If perf top is executed with a zero value for the refresh rate,
    we get a division by zero exception while computing samples_per_sec.

    A zero refresh rate is not possible, and we do not want to
    accept negative values either.

    [ Impact: fix division by zero in perf top ]

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
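
One way to guard against the division by zero described above is to validate the refresh value before it is ever used as a divisor. This sketch rejects bad values outright; that policy, and the names `parse_refresh_delay` and `delay_secs`, are assumptions, since the log does not show how the actual fix handles them.

```python
def parse_refresh_delay(text):
    """Parse the refresh value and reject zero or negative numbers."""
    delay_secs = int(text)
    if delay_secs < 1:
        raise ValueError(f"refresh value must be at least 1, got {delay_secs}")
    return delay_secs


def samples_per_sec(samples, delay_secs):
    """Safe to divide: delay_secs has already been validated as >= 1."""
    return samples / delay_secs


if __name__ == "__main__":
    delay = parse_refresh_delay("2")
    print(samples_per_sec(100, delay))
```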
     
  • Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • To allow the debugging of frequency-adjusting counters, sample
    those adjustments and display them in perf report -D.

    Acked-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In order to allow easy tracking of the period, also provide means of
    adding it to the sample data.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The purpose of PERF_SAMPLE_CONFIG was to identify the counters.
    Since then we've added counter ids, so use those instead.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

05 Jun, 2009

5 commits

  • Turns out that neither PowerPC nor older x86 compilers know this switch
    ...

    and since it does not make a measurable difference, just omit it.

    Reported-by: Paul Mackerras
    Reported-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • A number of places said 'events' while they should say 'samples'.

    Acked-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Before:

    25.96% copy_user_generic_string
    15.23% two_op
    15.19% one_op
    6.92% enough_duration
    1.23% alloc_pages_current
    1.14% acpi_os_read_port
    1.08% _spin_lock

    After:

    25.96% [k] copy_user_generic_string
    15.23% [.] two_op
    15.19% [.] one_op
    6.92% [.] enough_duration
    1.23% [k] alloc_pages_current
    1.14% [k] acpi_os_read_port
    1.08% [k] _spin_lock

    The '[k]' differentiator is a quick clue that it's a kernel symbol,
    without having to bring in the full dso column.

    Acked-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
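
The After formatting above amounts to inserting a one-token marker between the percentage and the symbol. A minimal sketch with hypothetical helper names, assuming each entry already knows whether its symbol is a kernel symbol:

```python
def symbol_marker(is_kernel):
    """Return '[k]' for kernel symbols and '[.]' for everything else."""
    return "[k]" if is_kernel else "[.]"


def format_entry(percent, symbol, is_kernel):
    """Format one report line in the style of the After output."""
    return f"{percent:.2f}% {symbol_marker(is_kernel)} {symbol}"


if __name__ == "__main__":
    # Two entries copied from the After output above.
    print(format_entry(25.96, "copy_user_generic_string", True))
    print(format_entry(15.23, "two_op", False))
```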
     
  • In order to deal with [vdso] maps, generalize the ip->symbol path
    a bit and allow some bits to be overridden with custom functions.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In order to track the vdso also generate mmap events for
    install_special_mapping().

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra