24 Dec, 2011

1 commit

  • The default input file for perf report is not handled the same way as
    perf record does it for its output file. This leads to unexpected
    behavior of perf report, etc. E.g.:

    # perf record -a -e cpu-cycles sleep 2 | perf report | cat
    failed to open perf.data: No such file or directory (try 'perf record' first)

    While perf record writes to a fifo, perf report expects perf.data to be
    read. This patch changes this to accept fifos as input file.
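
    A minimal sketch of the idea, not the patch itself: pick "-" (stdin) as
    the default input when the descriptor is a fifo, and fall back to
    perf.data otherwise. The helper names default_input_name() and
    default_stdin_input_name() are hypothetical and used only for
    illustration.

```c
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: choose the default input name for a reader command.
 * If 'fd' refers to a fifo (e.g. the read end of a shell pipe), read from
 * stdin ("-"); otherwise fall back to the regular perf.data file.
 */
static const char *default_input_name(int fd)
{
	struct stat st;

	if (fstat(fd, &st) == 0 && S_ISFIFO(st.st_mode))
		return "-";
	return "perf.data";
}

/* Convenience wrapper for the common case of inspecting stdin. */
static const char *default_stdin_input_name(void)
{
	return default_input_name(STDIN_FILENO);
}
```

    An explicitly given input file name would still take precedence over
    this default in a real tool.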

    Applies to the following commands:

    perf annotate
    perf buildid-list
    perf evlist
    perf kmem
    perf lock
    perf report
    perf sched
    perf script
    perf timechart

    Also fixes char const* -> const char* type declaration for filename
    strings.

    v2:
    * Prevent potential null pointer access to input_name in
    builtin-report.c. Needed due to removal of patch "perf report: Setup
    browser if stdout is a pipe"

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
    Signed-off-by: Robert Richter
    Signed-off-by: Arnaldo Carvalho de Melo

    Robert Richter
     

28 Nov, 2011

3 commits

  • To better reflect that it became the base class for all tools, that must
    be in each tool struct and where common stuff will be put.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Reducing the exposure of perf_session further, so that we can use the
    classes in cases where no perf.data file is created.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we don't need to have that many globals.

    Next steps will remove the 'session' pointer, that in most cases is
    not needed.

    Then we can rename perf_event_ops to 'perf_tool' that better describes
    this class hierarchy.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

08 Aug, 2011

1 commit

  • Looks to me like the :r modifier is not supported anymore, so remove it
    from the list of events. Without this fix 'perf lock record' doesn't
    work.

    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Zhu Yanhai
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1312035232-9534-1-git-send-email-gaoyang.zyh@taobao.com
    Signed-off-by: Zhu Yanhai
    Signed-off-by: Arnaldo Carvalho de Melo

    Zhu Yanhai
     

24 Mar, 2011

1 commit

  • Resolving the sample->id to an evsel since the most advanced tools,
    report and annotate, need it, and the others will too when they evolve
    to properly support multi-event perf.data files.

    Good also because it does an extra validation, checking that the ID is
    valid when present. When that is not the case, the overhead is just a
    branch + function call (perf_evlist__id2evsel).

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

16 Mar, 2011

1 commit

  • If lock was uncontended, wait_time_min == ULLONG_MAX, so we need to
    handle this case differently to show high wait times first
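
    One way to express this, as a sketch under assumptions: treat the
    ULLONG_MAX sentinel as "no wait at all" when comparing, so uncontended
    locks sort last. The struct and function names below are hypothetical
    and are not taken from the patch.

```c
#include <limits.h>

/* Hypothetical, trimmed-down stand-in for a per-lock statistics record. */
struct lock_stat_sketch {
	/* Stays at ULLONG_MAX until the lock has been contended once. */
	unsigned long long wait_time_min;
};

/* Map the "never contended" sentinel to 0 so that it sorts last. */
static unsigned long long effective_wait_min(const struct lock_stat_sketch *st)
{
	return st->wait_time_min == ULLONG_MAX ? 0 : st->wait_time_min;
}

/* Returns non-zero when 'one' should be displayed before 'two'. */
static int wait_time_min_bigger(const struct lock_stat_sketch *one,
				const struct lock_stat_sketch *two)
{
	return effective_wait_min(one) > effective_wait_min(two);
}
```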

    Acked-by: Hitoshi Mitake
    Cc: Hitoshi Mitake
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Marcin Slusarz
    Signed-off-by: Arnaldo Carvalho de Melo

    Marcin Slusarz
     

23 Feb, 2011

1 commit


30 Jan, 2011

2 commits


23 Jan, 2011

1 commit

  • Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
    by making our 64 bit integers typedefs of stdint.h types and using
    PRI[ux]64 like, for instance, git does.
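
    For illustration, a small sketch of the pattern: a u64 typedef built on
    stdint.h and printed with the PRIu64 macro from inttypes.h. The
    format_wait_time() helper is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* 64-bit integer type defined in terms of stdint.h, as described above. */
typedef uint64_t u64;

/*
 * Hypothetical helper: format a nanosecond counter portably.
 * PRIu64 expands to the right length modifier for uint64_t on the
 * target architecture. Returns what snprintf() returns.
 */
static int format_wait_time(char *buf, size_t len, u64 nsecs)
{
	return snprintf(buf, len, "%" PRIu64 " ns", nsecs);
}
```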

    Reported by Denis Kirjanov, who provided a patch for one case; I went
    and changed all cases.

    Reported-by: Denis Kirjanov
    Tested-by: Denis Kirjanov
    LKML-Reference:
    Cc: Denis Kirjanov
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Pingtian Han
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

22 Dec, 2010

1 commit

  • If we are running the new perf on an old kernel without support for
    sample_id_all, we should fall back to the old unordered processing of
    events. If we didn't then we would *always* process events without
    timestamps out of order, whether or not we hit a reordering race. In
    other words, instead of there being a chance of not attributing samples
    correctly, we would guarantee that samples would not be attributed.

    While processing all events without timestamps before events with
    timestamps may seem like an intuitive solution, it falls down as
    PERF_RECORD_EXIT events would also be processed before any samples.
    Even with a workaround for that case, samples before/after an exec would
    not be attributed correctly.

    This patch allows commands to indicate whether they need to fall back to
    unordered processing, so that commands that do not care about timestamps
    on every event will not be affected. If we do fallback, this will print
    out a warning if report -D was invoked.

    This patch adds the test in perf_session__new so that we only need to
    test once per session. Commands that do not use an event_ops (such as
    record and top) can simply pass NULL in its place.

    Acked-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ian Munsie
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian Munsie
     

06 Dec, 2010

1 commit

  • There were a few stray calloc()'s and malloc()'s which were not having
    their return values checked for success.

    As the calling code either already coped with failure or didn't actually
    care we just return -ENOMEM at that point.
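
    The pattern looks roughly like the sketch below. The struct and the
    thread_stat_alloc() helper are hypothetical and only illustrate the
    check.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct thread_stat_sketch {
	unsigned int tid;
};

/*
 * Allocate a zeroed record. Returns 0 on success and stores the pointer
 * in *out, or -ENOMEM if the allocation failed (in which case *out is
 * left untouched).
 */
static int thread_stat_alloc(struct thread_stat_sketch **out)
{
	struct thread_stat_sketch *st = calloc(1, sizeof(*st));

	if (!st)
		return -ENOMEM;
	*out = st;
	return 0;
}
```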

    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Signed-off-by: Chris Samuel
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Chris Samuel
     

05 Dec, 2010

1 commit

  • At perf_session__process_event, so that we reduce the number of lines in each
    tool sample processing routine that now receives a sample_data pointer already
    parsed.

    This will also be useful in the next patch, where we'll allow sampling the
    identity fields in MMAP, FORK, EXIT, etc., making it possible to see (cpu,
    timestamp) right before every event.

    Also validate callchains in perf_session__process_event, i.e. as early as
    possible, and keep a counter of the number of events discarded due to invalid
    callchains, warning the user about it if it happens.

    There is an assumption that was kept that all events have the same sample_type,
    that will be dealt with in the future, when this preexisting limitation will be
    removed.

    Tested-by: Thomas Gleixner
    Reviewed-by: Thomas Gleixner
    Acked-by: Ian Munsie
    Acked-by: Thomas Gleixner
    Cc: Frédéric Weisbecker
    Cc: Ian Munsie
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

17 Nov, 2010

1 commit


18 May, 2010

1 commit


10 May, 2010

1 commit

  • This patch drops "-a" from the default arguments passed to
    perf record by perf lock.

    If a user wants to do a system wide record of lock events,
    perf lock record -a ...
    is enough for this purpose.

    This can reduce the size of the perf.data file.

    % sudo ./perf lock record whoami
    root
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ]
    % sudo ./perf lock record -a whoami # with -a option
    root
    [ perf record: Woken up 0 times to write data ]
    [ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ]

    Signed-off-by: Hitoshi Mitake
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Jens Axboe
    Cc: Jason Baron
    Cc: Xiao Guangrong
    LKML-Reference: Message-Id:
    Signed-off-by: Frederic Weisbecker

    Hitoshi Mitake
     

09 May, 2010

5 commits

  • When a lock is acquired after being contended, we update the
    wait time statistics for the given lock.
    But if the min wait time is updated, we don't check the max wait
    time. This is wrong because the first time we update the wait time,
    we want to update both min and max wait time.
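
    A sketch of the corrected update, assuming hypothetical field and
    function names: the min and max checks are independent rather than
    chained with an else, so the first sample updates both.

```c
#include <limits.h>

/* Hypothetical wait-time statistics for one lock. */
struct wait_stats_sketch {
	unsigned long long wait_time_total;
	unsigned long long wait_time_min;
	unsigned long long wait_time_max;
};

static void wait_stats_init(struct wait_stats_sketch *s)
{
	s->wait_time_total = 0;
	s->wait_time_min = ULLONG_MAX; /* no contention seen yet */
	s->wait_time_max = 0;
}

/* Account one contended acquisition that waited 'wait' nanoseconds. */
static void wait_stats_update(struct wait_stats_sketch *s,
			      unsigned long long wait)
{
	s->wait_time_total += wait;
	if (wait < s->wait_time_min)
		s->wait_time_min = wait;
	/* Deliberately not "else if": the first sample is both min and max. */
	if (wait > s->wait_time_max)
		s->wait_time_max = wait;
}
```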

    Before:
    Name acquired contended total wait (ns) max wait (ns) min wait (ns)
    key 8 1 21656 0 21656

    After:
    Name acquired contended total wait (ns) max wait (ns) min wait (ns)
    key 8 1 21656 21656 21656

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake

    Frederic Weisbecker
     
  • Fix the cast made to get the bad rate. It is made in the result
    instead of the operands. We need the operands to be cast in double,
    otherwise the result will always be zero.
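
    The difference can be seen in a small sketch (function names are
    hypothetical): casting the result of an integer division keeps the
    truncation, casting the operands does not.

```c
/* Buggy form: the integer division truncates before the cast happens. */
static double bad_ratio_cast_result(int bad, int total)
{
	return (double)(bad / total);
}

/* Fixed form: both operands are converted first, so the division is
 * carried out in floating point. */
static double bad_ratio_cast_operands(int bad, int total)
{
	return (double)bad / (double)total;
}
```

    With bad = 233 and total = 2279, the first form yields 0 while the
    second yields roughly 0.102.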

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake

    Frederic Weisbecker
     
  • Use an enum instead of plain constants for lock flags.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake

    Frederic Weisbecker
     
  • Use enum to get a human view of bad_hist indexes and
    put bad histogram output in its own function.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake

    Frederic Weisbecker
     
  • This adds the "info" subcommand to perf lock which can be used
    to dump metadata like threads or addresses of lock instances.
    "map" was removed because info should do the work for it.

    This will be useful not only for debugging but also for ordinary
    analysis.

    v2: adding example of usage
    % sudo ./perf lock info -t
    | Thread ID: comm
    | 0: swapper
    | 1: init
    | 18: migration/5
    | 29: events/2
    | 32: events/5
    | 33: events/6
    ...

    % sudo ./perf lock info -m
    | Address of instance: name of class
    | 0xffff8800b95adae0: &(&sighand->siglock)->rlock
    | 0xffff8800bbb41ae0: &(&sighand->siglock)->rlock
    | 0xffff8800bf165ae0: &(&sighand->siglock)->rlock
    | 0xffff8800b9576a98: &p->cred_guard_mutex
    | 0xffff8800bb890a08: &(&p->alloc_lock)->rlock
    | 0xffff8800b9522a08: &(&p->alloc_lock)->rlock
    | 0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock
    | 0xffff8800bba72a08: &(&p->alloc_lock)->rlock
    | 0xffff8800bf18ea08: &(&p->alloc_lock)->rlock
    | 0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock
    | 0xffff88009bf818a0: &(&ip->i_lock)->mr_lock
    | 0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock
    | 0xffff8800bb6478a0: &(shost->host_lock)->rlock

    v3: fixed some problems Frederic pointed out
    * better rbtree tracking in dump_threads()
    * removed printf() and used pr_info() and pr_debug()

    Signed-off-by: Hitoshi Mitake
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Jens Axboe
    Cc: Jason Baron
    Cc: Xiao Guangrong
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Hitoshi Mitake
     

03 May, 2010

1 commit

  • Currently, perf 'live mode' writes build-ids at the end of the
    session, which isn't actually useful for processing live mode events.

    What would be better would be to have the build-ids sent before any of
    the samples that reference them, which can be done by processing the
    event stream and retrieving the build-ids on the first hit. Doing
    that in perf-record itself, however, is off-limits.

    This patch introduces perf-inject, which does the same job while
    leaving perf-record untouched. Normal mode perf still records the
    build-ids at the end of the session as it should, but for live mode,
    perf-inject can be injected in between the record and report steps
    e.g.:

    perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

    perf-inject reads a perf-record event stream and repipes it to stdout.
    At any point the processing code can inject other events into the
    event stream - in this case build-ids (-b option) are read and
    injected as needed into the event stream.

    Build-ids are just the first user of perf-inject - potentially
    anything that needs userspace processing to augment the trace stream
    with additional information could make use of this facility.

    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Tom Zanussi
    Signed-off-by: Arnaldo Carvalho de Melo

    Tom Zanussi
     

24 Apr, 2010

2 commits

  • The sample events recorded by perf record are not time ordered
    because we have one buffer per cpu for each event (even demultiplexed
    per task/per cpu for task bound events). But when we read trace events
    we want them to be ordered by time because many state machines are
    involved.

    There are currently two ways perf tools deal with that:

    - use -M to multiplex all buffers (perf sched, perf kmem)
    But this creates a lot of contention on SMP machines at
    record time.

    - use a post-processing time reordering (perf timechart, perf lock)
    The reordering used by timechart is simple but doesn't scale well
    with a huge flow of events, in terms of performance and memory use
    (unusable with perf lock for example).
    Perf lock has its own sample reordering that flushes its memory
    use on a regular basis and that uses a sort based on the
    previous event queued (a new event to be queued is close to the
    previous one most of the time).

    This patch proposes to export perf lock's samples reordering facility
    to the session layer that reads the events. So if a tool wants to
    get ordered sample events, it needs to set its
    struct perf_event_ops::ordered_samples to true and that's it.

    This prepares tracing based perf tools to get rid of the need to
    use buffers multiplexing (-M) or to implement their own
    reordering.

    Also lower the flush period to 2 as it's sufficient already.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake
    Cc: Ingo Molnar
    Cc: Masami Hiramatsu
    Cc: Tom Zanussi

    Frederic Weisbecker
     
  • Previous state machine of perf lock was really broken.
    This patch improves it a little.

    This patch prepares the list of state machines that represent
    lock sequences for each thread.

    These state machines can be one of these sequences:

    1) acquire -> acquired -> release
    2) acquire -> contended -> acquired -> release
    3) acquire (w/ try) -> release
    4) acquire (w/ read) -> release

    The case of 4) is a little special.
    Double acquire of read lock is allowed, so the state machine
    counts read lock number, and permits double acquire and release.
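
    The four sequences can be sketched as a small transition function.
    This is an illustration under assumptions, not the patch's code: the
    enum values, struct, and lock_seq_step() are hypothetical names, and
    any event that does not fit a sequence is simply marked as bad.

```c
/* Hypothetical per-thread, per-lock sequence states. */
enum seq_state_sketch {
	SEQ_UNINITIALIZED,
	SEQ_ACQUIRING,      /* saw acquire, waiting for acquired/contended */
	SEQ_CONTENDED,
	SEQ_ACQUIRED,
	SEQ_READ_ACQUIRED,  /* read lock held, possibly several times */
	SEQ_RELEASED,
	SEQ_BAD,            /* event did not match any known sequence */
};

enum lock_event_sketch {
	EV_ACQUIRE,
	EV_ACQUIRE_TRY,
	EV_ACQUIRE_READ,
	EV_CONTENDED,
	EV_ACQUIRED,
	EV_RELEASE,
};

struct lock_seq_sketch {
	enum seq_state_sketch state;
	int read_count; /* nesting depth for read locks, sequence 4) */
};

static void lock_seq_step(struct lock_seq_sketch *seq,
			  enum lock_event_sketch ev)
{
	switch (seq->state) {
	case SEQ_UNINITIALIZED:
	case SEQ_RELEASED:
		if (ev == EV_ACQUIRE) {
			seq->state = SEQ_ACQUIRING;
		} else if (ev == EV_ACQUIRE_TRY) {
			seq->state = SEQ_ACQUIRED; /* 3) goes straight to release */
		} else if (ev == EV_ACQUIRE_READ) {
			seq->state = SEQ_READ_ACQUIRED;
			seq->read_count = 1;
		} else {
			seq->state = SEQ_BAD;
		}
		break;
	case SEQ_ACQUIRING:
		if (ev == EV_ACQUIRED)
			seq->state = SEQ_ACQUIRED;
		else if (ev == EV_CONTENDED)
			seq->state = SEQ_CONTENDED;
		else
			seq->state = SEQ_BAD;
		break;
	case SEQ_CONTENDED:
		seq->state = (ev == EV_ACQUIRED) ? SEQ_ACQUIRED : SEQ_BAD;
		break;
	case SEQ_ACQUIRED:
		seq->state = (ev == EV_RELEASE) ? SEQ_RELEASED : SEQ_BAD;
		break;
	case SEQ_READ_ACQUIRED:
		if (ev == EV_ACQUIRE_READ) {
			seq->read_count++; /* double acquire is allowed */
		} else if (ev == EV_RELEASE) {
			if (--seq->read_count == 0)
				seq->state = SEQ_RELEASED;
		} else {
			seq->state = SEQ_BAD;
		}
		break;
	case SEQ_BAD:
		break; /* stay bad until the caller resets the sequence */
	}
}
```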

    But, things are not so simple. Something in my model is still wrong.
    I counted the number of lock instances with bad sequence,
    and ratio is like this (case of tracing whoami): bad:233, total:2279

    version 2:
    * threads are now identified with tid, not pid
    * prepared SEQ_STATE_READ_ACQUIRED for read lock.
    * bunch of struct lock_seq_stat is now linked list
    * debug information enhanced (this has to be removed someday)
    e.g.
    | === output for debug===
    |
    | bad:233, total:2279
    | bad rate:0.000000
    | histogram of events caused bad sequence
    | acquire: 165
    | acquired: 0
    | contended: 0
    | release: 68

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Jens Axboe
    Cc: Jason Baron
    Cc: Xiao Guangrong
    Cc: Ingo Molnar
    LKML-Reference:
    [rename SEQ_STATE_UNINITED to SEQ_STATE_UNINITIALIZED]
    Signed-off-by: Frederic Weisbecker

    Hitoshi Mitake
     

14 Apr, 2010

1 commit

  • Parsing an option from the command line with OPT_BOOLEAN on a
    bool data type would not work on a big-endian machine due to the
    manner in which the boolean was being cast into an int and
    incremented. For example, running 'perf probe --list' on a
    PowerPC machine would fail to properly set the list_events bool
    and would therefore print out the usage information and
    terminate.

    This patch makes OPT_BOOLEAN work as expected with a bool
    datatype. For cases where the original OPT_BOOLEAN was
    intentionally being used to increment an int each time it was
    passed in on the command line, this patch introduces OPT_INCR
    with the old behaviour of OPT_BOOLEAN (the verbose variable is
    currently the only such example of this).
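
    The split between the two option kinds can be sketched as follows.
    The enum, struct, and apply_option() below are hypothetical
    simplifications for illustration and do not reproduce the real
    option-parsing structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified option kinds. */
enum opt_kind_sketch {
	OPT_KIND_BOOLEAN, /* value points to a bool */
	OPT_KIND_INCR,    /* value points to an int that counts occurrences */
};

struct option_sketch {
	enum opt_kind_sketch kind;
	void *value;
};

/*
 * Apply one occurrence of an option. 'unset' is true for the negated
 * form of the option. Each kind accesses the storage through a pointer
 * of its real type, so nothing depends on where the low-order byte of
 * an int happens to live.
 */
static void apply_option(const struct option_sketch *opt, bool unset)
{
	switch (opt->kind) {
	case OPT_KIND_BOOLEAN:
		*(bool *)opt->value = !unset;
		break;
	case OPT_KIND_INCR:
		*(int *)opt->value = unset ? 0 : *(int *)opt->value + 1;
		break;
	}
}
```

    Passing a verbosity flag twice would then leave the int at 2, while a
    boolean flag is simply true regardless of how often it was given.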

    I have reviewed every use of OPT_BOOLEAN to verify that a true
    C99 bool was passed. Where integers were used, I verified that
    they were only being used for boolean logic and changed them to
    bools to ensure that they would not be mistakenly used as ints.
    The major exception was the verbose variable which now uses
    OPT_INCR instead of OPT_BOOLEAN.

    Signed-off-by: Ian Munsie
    Acked-by: David S. Miller
    Cc: # NOTE: wont apply to .3[34].x cleanly, please backport
    Cc: Git development list
    Cc: Ian Munsie
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: KOSAKI Motohiro
    Cc: Hitoshi Mitake
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Eric B Munson
    Cc: Valdis.Kletnieks@vt.edu
    Cc: WANG Cong
    Cc: Thiago Farina
    Cc: Masami Hiramatsu
    Cc: Xiao Guangrong
    Cc: Jaswinder Singh Rajput
    Cc: Arjan van de Ven
    Cc: OGAWA Hirofumi
    Cc: Mike Galbraith
    Cc: Tom Zanussi
    Cc: Anton Blanchard
    Cc: John Kacur
    Cc: Li Zefan
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ian Munsie
     

28 Feb, 2010

1 commit

  • We need to deal with time ordered events to build a correct
    state machine of lock events. This is why we multiplex the lock
    events buffers. But the ordering is done from the kernel, on
    the tracing fast path, leading to high contention between cpus.

    Without multiplexing, the events appear in a weak order.
    If we have four events, each split per cpu, perf record will
    read the events buffers in the following order:

    [ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....]

    To handle a post-processing reordering, we could just read everything
    and sort it in memory, but that doesn't scale to large numbers
    of events: lock events can pile up in huge amounts in a short time.

    Basically we need to sort in memory and find a "grace period"
    point when we know that a given slice of previously sorted events
    can be committed for post-processing, so that we can unload the
    memory usage step by step and keep a scalable sorting list.

    There are no strong rules about how to define such a "grace period".
    What this patch does is:

    We define a FLUSH_PERIOD value that defines a grace period in
    seconds.
    We want to have a slice of events covering 2 * FLUSH_PERIOD in our
    sorted list.
    If FLUSH_PERIOD is big enough, it ensures that all events that occurred
    in the first half of the timeslice have been buffered, that none
    remain, and that no further events will land inside this
    first timeslice. Then once we reach the 2 * FLUSH_PERIOD
    timeslice, we flush the first half to be gentle with the memory
    (the second half can still get new events in the middle, so wait
    another period to flush it)

    FLUSH_PERIOD is defined to 5 seconds. Say the first event started on
    time t0. We can safely assume that at the time we are processing
    events of t0 + 10 seconds, there won't be any more events to read
    from perf.data that occurred between t0 and t0 + 5 seconds. Hence
    we can safely flush the first half.
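
    The flush decision described above can be sketched as follows. This is
    only an illustration under assumptions: the struct, the nanosecond
    constant, and flush_limit_for() are hypothetical names, and the sorted
    list itself is left out.

```c
/* Grace period from the description above, expressed in nanoseconds. */
#define FLUSH_PERIOD_NS (5ULL * 1000000000ULL)

/* Hypothetical bookkeeping for the sorted, not-yet-flushed window. */
struct reorder_window_sketch {
	unsigned long long slice_start; /* start of the oldest unflushed slice */
	int started;                    /* set once the first event was seen */
};

/*
 * Called with the timestamp of each event as it is queued.
 * Returns 0 when nothing can be flushed yet. Otherwise returns a
 * timestamp limit: every queued event older than that limit belongs to
 * the first half of the window and can be committed for processing.
 */
static unsigned long long
flush_limit_for(struct reorder_window_sketch *w, unsigned long long ts)
{
	if (!w->started) {
		w->slice_start = ts;
		w->started = 1;
		return 0;
	}
	/* Window does not yet span 2 * FLUSH_PERIOD: keep buffering. */
	if (ts < w->slice_start + 2 * FLUSH_PERIOD_NS)
		return 0;
	/* Flush the first half and slide the window by one period. */
	w->slice_start += FLUSH_PERIOD_NS;
	return w->slice_start;
}
```

    With a first event at t0, the first non-zero limit is returned when an
    event at t0 + 10 seconds arrives, and that limit is t0 + 5 seconds.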

    To point out funky bugs, we have a guard that checks that a new event's
    timestamp is not below the timestamp of the last flushed event and that
    displays a warning in this case.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake
    Cc: Li Zefan
    Cc: Lai Jiangshan
    Cc: Masami Hiramatsu
    Cc: Jens Axboe

    Frederic Weisbecker
     

31 Jan, 2010

2 commits

  • Fix up a few small stylistic details:

    - use consistent vertical spacing/alignment
    - remove line80 artifacts
    - group some global variables better
    - remove dead code

    Plus rename 'prof' to 'report' to make it more in line with other
    tools, and remove the line/file keying as we really want to use
    IPs like the other tools do.

    Signed-off-by: Ingo Molnar
    Cc: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Adding new subcommand "perf lock" to perf.

    I have a lot of remaining ToDos, but for now perf lock can
    already provide minimal functionality for analyzing lock
    statistics.

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Hitoshi Mitake