13 Jun, 2009

1 commit


12 Jun, 2009

3 commits

  • Provide for means of extending the perf_counter_attr in a 'natural' way.

    We allow growing the structure by appending fields at the end by specifying
    the full structure size inside it.

    When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
    When an old kernel sees a larger (new) structure, it will verify the tail
    consists of 0s, otherwise fail.

    If we fail due to a size-mismatch, we return -E2BIG and write the kernel's
    native attribute size back into the provided structure.

    Furthermore, add some attribute verification, so that we'll fail counter
    creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
    the __reserved fields).

    (This ABI detail is introduced while keeping the existing syscall ABI.)

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
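
    The size-based extension scheme described in this commit can be sketched
    in user space as follows. This is only an illustration under stated
    assumptions: struct demo_attr, DEMO_ATTR_SIZE_VER0 and demo_copy_attr()
    are hypothetical names, the layout is not the real perf_counter_attr, and
    the kernel's actual copy routine may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for perf_counter_attr, not the real layout. */
struct demo_attr {
	uint32_t type;
	uint32_t size;        /* struct size as known to the caller */
	uint64_t config;
	uint64_t newer_field; /* pretend a later version appended this */
};

#define DEMO_ATTR_SIZE_VER0 16u /* type + size + config */

/*
 * Copy a caller-supplied blob into the receiver's native struct.
 * "blob" must hold at least as many bytes as its size field claims.
 * Returns 0 on success, or -E2BIG after writing the native size back.
 */
static int demo_copy_attr(unsigned char *blob, struct demo_attr *native)
{
	const uint32_t native_size = sizeof(*native);
	uint32_t size, i;

	assert(blob != NULL && native != NULL);
	memcpy(&size, blob + offsetof(struct demo_attr, size), sizeof(size));

	if (size < DEMO_ATTR_SIZE_VER0)
		goto err_size;

	/* Larger (newer) struct: accept only if the unknown tail is all 0s. */
	for (i = native_size; i < size; i++)
		if (blob[i] != 0)
			goto err_size;

	/* Smaller (older) struct: the part we do not copy stays 0 padded. */
	memset(native, 0, sizeof(*native));
	memcpy(native, blob, size < native_size ? size : native_size);
	return 0;

err_size:
	/* Tell the caller which size this receiver understands. */
	memcpy(blob + offsetof(struct demo_attr, size), &native_size,
	       sizeof(native_size));
	return -E2BIG;
}
```

    A caller that gets -E2BIG back can read the size field to learn which
    structure size the receiver supports and retry with that layout.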
     
  • Up until now record has worked on the assumption that type=0, config=0
    was a suitable configuration - which it is. Let's make this a little more
    explicit and more readable via the use of proper symbols.

    [ Impact: cleanup ]

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Otherwise all L1-instruction aliases will be recognized as
    L1-data by strcasestr() when calling function parse_aliases.

    Signed-off-by: Yong Wang
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yong Wang
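
    The substring-matching pitfall behind this fix can be reproduced with a
    small sketch. The alias tables below are hypothetical (the commit text
    does not list the real aliases), and contains_nocase() is a portable
    stand-in for what strcasestr() does on glibc.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_ALIASES 4

struct cache_aliases {
	const char *name;
	const char *aliases[MAX_ALIASES]; /* unused slots are NULL */
};

/* Hypothetical table in which L1-data carries a very short alias. */
static const struct cache_aliases with_short_alias[] = {
	{ "L1-data",        { "l1-d", "l1d", "l1" } },
	{ "L1-instruction", { "l1-i", "l1i" } },
};

/* Same table with the ambiguous short alias removed. */
static const struct cache_aliases without_short_alias[] = {
	{ "L1-data",        { "l1-d", "l1d" } },
	{ "L1-instruction", { "l1-i", "l1i" } },
};

/* Case-insensitive substring test, like strcasestr() != NULL. */
static int contains_nocase(const char *hay, const char *needle)
{
	assert(hay != NULL && needle != NULL);
	for (; *hay; hay++) {
		size_t i = 0;

		while (needle[i] && hay[i] &&
		       tolower((unsigned char)hay[i]) ==
		       tolower((unsigned char)needle[i]))
			i++;
		if (!needle[i])
			return 1;
	}
	return *needle == '\0';
}

/* Return the index of the first row with an alias found in str, or -1. */
static int first_match(const struct cache_aliases *table, size_t rows,
		       const char *str)
{
	size_t r, a;

	for (r = 0; r < rows; r++)
		for (a = 0; a < MAX_ALIASES && table[r].aliases[a]; a++)
			if (contains_nocase(str, table[r].aliases[a]))
				return (int)r;
	return -1;
}
```

    With the short alias present, an event string such as "l1-i-loads"
    matches the L1-data row first, because "l1" is a substring of every
    L1-instruction alias.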
     

11 Jun, 2009

3 commits

  • Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • A build error slipped in:

    builtin-report.c: In function ‘hist_entry__fprintf’:
    builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’

    Because we got a bit sloppy with those types. uint64_t really sucks,
    because there's no printf format for it. So standardize on __u64
    instead - for all types that go to or come from the ABI (which is __u64),
    or for values that need to be large enough even on 32-bit.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
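
    As an illustration of the formatting problem, one common way to print a
    64-bit counter with a fixed field width is to cast it to unsigned long
    long and use %llu. This is a sketch only: format_count() is a
    hypothetical helper, and uint64_t stands in here for the kernel's __u64
    so that the example builds without kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit count right-aligned in a 12 character field.
 * The explicit cast makes the argument type match %llu on both
 * 32-bit and 64-bit systems.
 */
static int format_count(char *buf, size_t len, uint64_t value)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%12llu", (unsigned long long)value);
}
```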
     
  • When we use variable period sampling, add the period to the sample
    data and use that to normalize the samples.

    Signed-off-by: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
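
    The normalization idea can be sketched as follows: each sample is
    weighted by the period it carries instead of counting as one hit. The
    struct and function names are hypothetical and do not reflect the actual
    record layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded sample: which symbol was hit, and its period. */
struct demo_sample {
	int symbol;
	uint64_t period; /* events that elapsed before this sample fired */
};

/* Percentage of all events attributed to "symbol", weighted by period. */
static double weighted_share(const struct demo_sample *samples, size_t n,
			     int symbol)
{
	uint64_t total = 0, hits = 0;
	size_t i;

	assert(samples != NULL || n == 0);
	for (i = 0; i < n; i++) {
		total += samples[i].period;
		if (samples[i].symbol == symbol)
			hits += samples[i].period;
	}
	return total ? 100.0 * (double)hits / (double)total : 0.0;
}
```

    For example, three samples with period 100 in one symbol and one sample
    with period 1000 in another give the first symbol 75% of the samples by
    count but only about 23% of the events once weighted.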
     

10 Jun, 2009

2 commits

  • Currently report and stat catch SIGINT (and others) without altering
    their exit state. This means that loops like:

    while :; do perf stat ./foo ; done

    become hard to interrupt, because bash never sees perf terminate
    due to interruption. Fix this.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
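
    A common idiom for this kind of fix is to remember the caught signal,
    finish cleanup, then restore the default handler and re-send the signal
    to ourselves so the parent shell sees a signal death. The sketch below
    shows that idiom under the assumption of a POSIX system; all function
    names are hypothetical and it is not claimed to be the code in this
    commit.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t pending_signal;

/* Handler: only record the signal so that cleanup can still run. */
static void note_signal(int signr)
{
	pending_signal = signr;
}

/* After cleanup, die from the signal so the exit state reflects it. */
static void die_by_signal(int signr)
{
	assert(signr > 0);
	signal(signr, SIG_DFL);
	kill(getpid(), signr);
	_exit(1); /* only reached if the signal did not terminate us */
}

/*
 * Demo helper: fork a child that catches SIGINT, "cleans up", and then
 * re-raises it. Returns the signal that killed the child, 0 if the child
 * exited normally, or -1 if fork() or waitpid() failed.
 */
static int child_termination_signal(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		signal(SIGINT, note_signal);
		raise(SIGINT); /* simulate the user pressing ^C */
		if (pending_signal)
			die_by_signal(pending_signal);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

    A shell running such a process in a loop sees the signal death and can
    stop the loop instead of starting the next iteration.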
     
  • Create the counter in a disabled state and only enable it after we
    mmap() the buffer. This allows us to see the first few samples (and
    observe the frequency ramp).

    Furthermore, print the period in the verbose report.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

09 Jun, 2009

2 commits

  • The rule is:

    - high overhead: red
    - mid overhead: green
    - low overhead: normal (white/black)

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
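
    The rule above could be expressed roughly as follows. The commit text
    does not state the cut-off values, so the 5.0% and 0.5% thresholds and
    all names in this sketch are assumptions made for illustration.

```c
#include <assert.h>

enum demo_color { DEMO_COLOR_NORMAL, DEMO_COLOR_GREEN, DEMO_COLOR_RED };

/* Hypothetical thresholds, in percent of total overhead. */
#define DEMO_MIN_RED   5.0
#define DEMO_MIN_GREEN 0.5

/* Map an overhead percentage to the color class used for its line. */
static enum demo_color overhead_color(double percent)
{
	if (percent >= DEMO_MIN_RED)
		return DEMO_COLOR_RED;
	if (percent >= DEMO_MIN_GREEN)
		return DEMO_COLOR_GREEN;
	return DEMO_COLOR_NORMAL;
}

/* ANSI escape sequence that switches the terminal to the given color. */
static const char *color_escape(enum demo_color color)
{
	switch (color) {
	case DEMO_COLOR_RED:
		return "\033[31m";
	case DEMO_COLOR_GREEN:
		return "\033[32m";
	default:
		return "\033[0m"; /* reset to the terminal default */
	}
}
```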
     
  • This patch adds support for profiling JIT generated code to 'perf
    report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map"
    symbols map that is parsed when looking up and displaying symbols.

    Thanks to Peter Zijlstra for his help with this patch!

    Example "perf report" output with the Jato JIT:

    #
    # (40311 samples)
    #
    # Overhead Command Shared Object Symbol
    # ........ ................ ......................... ......
    #
    97.80% jato /tmp/perf-11915.map [.] Fibonacci.fib(I)I
    0.56% jato 00000000b7fa023b 0x000000b7fa023b
    0.45% jato /tmp/perf-11915.map [.] Fibonacci.main([Ljava/lang/String;)V
    0.38% jato [kernel] [k] get_page_from_freelist
    0.06% jato [kernel] [k] kunmap_atomic
    0.05% jato ./jato [.] utf8Hash
    0.04% jato ./jato [.] executeJava
    0.04% jato ./jato [.] defineClass

    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Pekka Enberg
    Cc: a.p.zijlstra@chello.nl
    Cc: acme@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Pekka Enberg
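
    The commit text does not spell out the map file layout. The sketch below
    assumes the convention used by perf map files, where each line holds a
    hexadecimal start address, a hexadecimal size, and the symbol name
    ("START SIZE symbolname"). The struct and function names are
    hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* One entry of a JIT symbol map (assumed "START SIZE name" format). */
struct jit_symbol {
	uint64_t start;
	uint64_t size;
	char name[256];
};

/*
 * Parse one map line into sym. The name is everything after the size
 * up to the end of the line. Returns 0 on success, -1 on a malformed
 * line.
 */
static int parse_map_line(const char *line, struct jit_symbol *sym)
{
	int fields;

	assert(line != NULL && sym != NULL);
	fields = sscanf(line, "%" SCNx64 " %" SCNx64 " %255[^\n]",
			&sym->start, &sym->size, sym->name);
	return fields == 3 ? 0 : -1;
}
```

    A sample address that falls inside [start, start + size) would then be
    reported under the parsed name, as in the Fibonacci.fib(I)I line of the
    example output above.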
     

08 Jun, 2009

1 commit

  • Before:

    7549326754 cycles # 3201.811 M/sec
    10007594937 instructions # 4244.408 M/sec

    After:

    7542051194 cycles # 3201.996 M/sec
    10007743852 instructions # 4248.811 M/sec # 1.327 per cycle

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
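
    The extra "per cycle" column in the After output is the ratio of the two
    counters. A minimal sketch of that arithmetic follows; the function name
    is hypothetical and the zero check is an added assumption.

```c
#include <stdint.h>

/* Instructions retired per cycle; 0.0 if no cycles were counted. */
static double instructions_per_cycle(uint64_t instructions, uint64_t cycles)
{
	if (cycles == 0)
		return 0.0;
	return (double)instructions / (double)cycles;
}
```

    With the After values above, 10007743852 instructions over 7542051194
    cycles gives about 1.327 instructions per cycle.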
     

07 Jun, 2009

12 commits

  • Before:

    $ perf report
    failed to open file: No such file or directory

    After:

    $ perf report
    failed to open file: perf.data (try 'perf record' first)

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • If perf is run on a !CONFIG_PERF_COUNTER kernel right now it
    bails out with no messages or with confusing messages.

    Standardize this case some more and explain the situation.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • On architectures/CPUs without PMU support but with perfcounters
    enabled 'perf record' currently fails because it cannot create a
    cycle based hw-perfcounter.

    Fall back to the cpu-clock-tick sw-perfcounter in this case, which
    is hrtimer based and will always work (as long as perfcounters
    are enabled).

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
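
    The fallback logic can be sketched independently of the real syscall by
    passing in the function that opens a counter. Everything here (the enum
    names, the opener type, the fake opener) is hypothetical and only
    illustrates the order of attempts described above.

```c
#include <assert.h>
#include <stddef.h>

enum demo_type { DEMO_TYPE_HARDWARE, DEMO_TYPE_SOFTWARE };
enum demo_config { DEMO_HW_CPU_CYCLES, DEMO_SW_CPU_CLOCK };

/* Opens a counter; returns a file descriptor or a negative value. */
typedef int (*demo_open_fn)(int type, int config);

/*
 * Try the cycle-based hardware counter first and fall back to the
 * timer-based software clock if that fails. *used_type reports which
 * kind was attempted last.
 */
static int open_with_fallback(demo_open_fn open_counter, int *used_type)
{
	int fd;

	assert(open_counter != NULL && used_type != NULL);

	*used_type = DEMO_TYPE_HARDWARE;
	fd = open_counter(DEMO_TYPE_HARDWARE, DEMO_HW_CPU_CYCLES);
	if (fd >= 0)
		return fd;

	*used_type = DEMO_TYPE_SOFTWARE;
	return open_counter(DEMO_TYPE_SOFTWARE, DEMO_SW_CPU_CLOCK);
}

/* Fake opener modelling a machine without PMU support. */
static int open_without_pmu(int type, int config)
{
	(void)config;
	return type == DEMO_TYPE_SOFTWARE ? 3 : -1;
}

/* Fake opener modelling a machine where hardware counters work. */
static int open_with_pmu(int type, int config)
{
	(void)config;
	return type == DEMO_TYPE_HARDWARE ? 4 : 5;
}
```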
     
  • On architectures/CPUs without PMU support but with perfcounters
    enabled 'perf top' currently fails because it cannot create a
    cycle based hw-perfcounter.

    Fall back to the cpu-clock-tick sw-perfcounter in this case, which
    is hrtimer based and will always work (as long as perfcounters
    are enabled).

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Before:

    $ perf stat ~/hackbench 5

    error: syscall returned with -1 (No such device)

    After:

    $ perf stat ~/hackbench 5
    Time: 1.640

    Performance counter stats for '/home/mingo/hackbench 5':

    6524.570382 task-clock-ticks # 3.838 CPU utilization factor
    35704 context-switches # 0.005 M/sec
    191 CPU-migrations # 0.000 M/sec
    8958 page-faults # 0.001 M/sec
    cycles
    instructions
    cache-references
    cache-misses

    Wall-clock time elapsed: 1699.999995 msecs

    Also add -v (--verbose) option to allow the printing of failed
    counter opens.

    Also, don't print 'inf' if wall-time is zero (due to jiffies granularity);
    instead skip the printing of the CPU utilization factor.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The first snapshot reading often occurs before any events have
    been read in the mapped perfcounter files.

    Just wait until we have at least one event before starting the
    snapshot; otherwise the delay before the first set of entries is
    displayed may be long in case of a low refresh rate.

    Note: we could also use a semaphore to wait until the
    "print_entries" number of events is reached, but this
    value is tunable and we can't ensure we will even reach it.
    We could also wait for a default minimum set of entries before the
    first refresh, say 15, but again, the minimal sample count is
    tunable, and we could end up displaying nothing until we have a
    minimal default set of events, which can take some time in case
    of high sample filters.

    Hence this simple solution which partially covers the default
    case.

    [ Impact: fix display artifacts in perf top ]

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Corey Ashford
    Cc: Marcelo Tosatti
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Arjan noticed this bug in the perf annotate help output:

    -s, --symbol symbol to annotate

    that should be instead.

    Reported-by: Arjan van de Ven
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The "perf report" utility crashed in some circumstances
    because the "sym" stack variable was not initialized before being used
    (as also proven by valgrind).

    With this fix the crash goes away and valgrind no longer complains.

    Signed-off-by: Arjan van de Ven
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Right now kernel debug info does not get resolved by default, because
    we don't know where to look for the vmlinux.

    The -k option can be used for that - but if no option is given, pick
    up vmlinux files in the current directory - in case a kernel hacker
    runs profiling from the source directory that the kernel was built in.

    The real solution would be to embed the location (and perhaps the
    date/timestamp) of the vmlinux file in /proc/kallsyms, so that
    tools can pick it up automatically.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • gcc warned about this bug:

    util/parse-events.c: In function ‘parse_generic_hw_symbols’:
    util/parse-events.c:175: warning: comparison is always false due to limited range of data type
    util/parse-events.c:182: warning: comparison is always false due to limited range of data type
    util/parse-events.c:190: warning: comparison is always false due to limited range of data type

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Several people have suggested that 'perf' has become a full-fledged
    tool that should be moved out of Documentation/. Move it to the
    (new) tools/ directory.

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar