24 May, 2010

1 commit

  • The hists__tty_browse_tree function was created with the loop to print
    all events, and its equivalent, hists__tui_browse_tree, was created in a
    similar fashion, where it is possible to switch among the multiple
    events, if present, using TAB to go to the next event, and shift+TAB
    (UNTAB) to go to the previous.

    The report TUI now shows the name of the event as the window title, and
    a pstack leak was fixed.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

21 May, 2010

1 commit

  • Using the same scheme as for git's/perf's pager setup, i.e. if one
    wants, on a newt enabled perf binary, to disable the TUI for
    'perf report', it's just a matter of doing:

    [root@doppio linux-2.6-tip]# printf "[tui]\n\nreport = off\n" > /root/.perfconfig
    [root@doppio linux-2.6-tip]# cat /root/.perfconfig
    [tui]

    report = off
    [root@doppio linux-2.6-tip]#

    System wide settings are also possible, by editing /etc/perfconfig, etc.,
    i.e. the git machinery for config files applies to perf as well, so when
    in doubt about where to put your settings, consult the git documentation;
    if that fails, please let us know.

    Suggested-by: Ingo Molnar
    Discussed-with: Stephane Eranian
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

18 May, 2010

1 commit

  • OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
    tools is to set something that has an enum type, that is builtin
    compatible with unsigned int.

    Several string constifications were done to make OPT_STRING require a
    const char * type.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

15 May, 2010

2 commits

  • Number of samples is meaningless after we switched to auto-freq, so
    report the number of events, i.e. not the sum of the different periods,
    but the number of PERF_RECORD_SAMPLE events emitted by the kernel.

    While doing this I noticed that using the name "count" for the sum of all
    the event periods can be confusing, so rename it to .period, just like in
    struct sample.data, so that we become more consistent.

    This helps with the next step, which was to record in struct hist_entry
    the number of sample events for each instance. We need that because we
    use it to generate the number of events when applying filters to the
    tree of hist entries, as is done in the TUI report browser.

    Suggested-by: Ingo Molnar
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The events_stats.total field is too generic, rename it to .total_period,
    and also add a comment explaining that it is the sum of all the .period
    fields in samples, that is needed because we use auto-freq to avoid
    sampling artifacts.

    Ditto for events_stats.lost, that is the sum of all lost_event.lost
    fields, i.e. the number of events the kernel dropped.

    Looking at the users, builtin-sched.c can make use of these fields and
    stop doing it again.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
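
The distinction drawn in the two commits above, between the sum of the sample periods and the number of sample events, can be illustrated with a minimal sketch. The struct and function names below (sample_sketch, stats_sketch, stats_sketch__add, stats_sketch__from) are hypothetical and are not the actual perf definitions; they only mirror the .period / .total_period / event-count idea described in the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one sample record; only the period matters here. */
struct sample_sketch {
	uint64_t period;
};

/* Hypothetical stand-in for the per-event statistics discussed above. */
struct stats_sketch {
	uint64_t total_period; /* sum of the .period field of every sample */
	uint64_t nr_events;    /* number of sample records seen */
};

/* Account one sample: the period is summed, the event count goes up by one. */
static void stats_sketch__add(struct stats_sketch *stats,
			      const struct sample_sketch *sample)
{
	stats->total_period += sample->period;
	stats->nr_events++;
}

/* Build the statistics for an array of samples. */
static struct stats_sketch stats_sketch__from(const struct sample_sketch *samples,
					      size_t nr)
{
	struct stats_sketch stats = { 0, 0 };

	for (size_t i = 0; i < nr; i++)
		stats_sketch__add(&stats, &samples[i]);
	return stats;
}
```

With auto-freq the periods differ from sample to sample, so the two counters diverge: three samples with periods 100, 250 and 50 give a total period of 400 but an event count of 3.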
     

14 May, 2010

1 commit


12 May, 2010

1 commit

  • We no longer use popen to run 'perf annotate' for the selected
    symbol; instead we collect per address samples when processing samples
    in 'perf report' if we're using the newt browser, then we use this data
    directly to do annotation.

    Done this way we can actually traverse the objdump_line objects
    directly, matching the addresses to the collected samples and colouring
    them appropriately using lower level slang routines.

    The new ui_browser class will be reused for the main, callchain aware,
    histogram browser, once it is made generic and no longer assumes that
    the objects are always instances of the objdump_line class maintained
    using list_heads.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

11 May, 2010

3 commits

  • Those are really not specific to the newt code, can be used by other UI
    frontends.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Better done when we are adding entries, be it initially or when we're
    re-sorting the histograms.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • In cbbc79a we introduced support for multiple events by introducing a
    new "event_stat_id" struct and then made several perf_session methods
    receive a pointer to it instead of a pointer to perf_session, and kept the
    event_stats and hists rb_tree in perf_session.

    While working on the new newt based browser, I realised that it would be
    better to introduce a new class, "hists" (short for "histograms"),
    renaming the "event_stat_id" struct and the perf_session methods that
    were really "hists" methods, as they manipulate only struct hists
    members, not touching anything in the other perf_session members.

    Other optimizations, such as calculating the maximum length of a symbol
    name present in a hists instance will be possible as we add them,
    avoiding a re-traversal just for finding that information.

    The rationale for the name "hists" to replace "event_stat_id" is that we
    may have multiple sets of hists for the same event_stat id, as, for
    instance, the 'perf diff' tool has, so event stat id is not what
    characterizes what this struct and the functions that manipulate it do.

    Cc: Eric B Munson
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
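
The optimization mentioned in the "hists" commit above, keeping the maximum symbol name length up to date while entries are added instead of re-traversing the tree, might look roughly like the sketch below. The names hists_sketch, max_sym_namelen and hists_sketch__add_entry are assumptions made for illustration, not the real perf struct or methods.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced "hists" instance. */
struct hists_sketch {
	size_t nr_entries;
	size_t max_sym_namelen; /* longest symbol name added so far */
};

/*
 * Account one entry. The maximum is updated at insertion time, so no
 * second pass over the entries is needed just to size a column.
 */
static void hists_sketch__add_entry(struct hists_sketch *hists,
				    const char *sym_name)
{
	size_t len = strlen(sym_name);

	hists->nr_entries++;
	if (len > hists->max_sym_namelen)
		hists->max_sym_namelen = len;
}
```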
     

10 May, 2010

4 commits


09 May, 2010

1 commit

  • This patch improves 'perf report -h' output for the
    '--call-graph' command line option by enumerating the
    different output types.

    Signed-off-by: Pekka Enberg
    Cc: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Pekka Enberg
     

03 May, 2010

1 commit

  • Currently, perf 'live mode' writes build-ids at the end of the
    session, which isn't actually useful for processing live mode events.

    What would be better would be to have the build-ids sent before any of
    the samples that reference them, which can be done by processing the
    event stream and retrieving the build-ids on the first hit. Doing
    that in perf-record itself, however, is off-limits.

    This patch introduces perf-inject, which does the same job while
    leaving perf-record untouched. Normal mode perf still records the
    build-ids at the end of the session as it should, but for live mode,
    perf-inject can be injected in between the record and report steps
    e.g.:

    perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

    perf-inject reads a perf-record event stream and repipes it to stdout.
    At any point the processing code can inject other events into the
    event stream - in this case build-ids (-b option) are read and
    injected as needed into the event stream.

    Build-ids are just the first user of perf-inject - potentially
    anything that needs userspace processing to augment the trace stream
    with additional information could make use of this facility.

    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Tom Zanussi
    Signed-off-by: Arnaldo Carvalho de Melo

    Tom Zanussi
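
The "inject on first hit" idea described in the commit above can be sketched as a filter over a record stream. This is only an illustration under stated assumptions: the record layout (rec_sketch), the record types and the function inject_build_ids are hypothetical, and real perf events are variable-sized records read from a pipe rather than array elements.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DSOS_SKETCH 8

/* Hypothetical record types: a sample and an injected build-id. */
enum rec_type_sketch { REC_SAMPLE, REC_BUILD_ID };

/* Hypothetical record: its type plus the index of the DSO it refers to. */
struct rec_sketch {
	enum rec_type_sketch type;
	int dso;
};

/*
 * Copy records from 'in' to 'out', emitting a build-id record right
 * before the first sample that references each DSO. Stops early if
 * 'out' cannot hold the next record(s). Returns the number written.
 */
static size_t inject_build_ids(const struct rec_sketch *in, size_t n_in,
			       struct rec_sketch *out, size_t out_cap)
{
	bool seen[MAX_DSOS_SKETCH] = { false };
	size_t n_out = 0;

	for (size_t i = 0; i < n_in; i++) {
		int dso = in[i].dso;
		bool inject = in[i].type == REC_SAMPLE &&
			      dso >= 0 && dso < MAX_DSOS_SKETCH && !seen[dso];
		size_t needed = inject ? 2 : 1;

		if (n_out + needed > out_cap)
			break;
		if (inject) {
			out[n_out++] = (struct rec_sketch){ REC_BUILD_ID, dso };
			seen[dso] = true;
		}
		out[n_out++] = in[i]; /* repipe the original record unchanged */
	}
	return n_out;
}
```

A consumer reading the output therefore always sees the build-id for a DSO before any sample that references it, which is the ordering the live mode report step needs.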
     

28 Apr, 2010

2 commits

  • Now those methods don't operate on a global list of dsos, but on lists
    of machines, so make this clear by renaming the functions.

    Cc: Avi Kivity
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Zhang, Yanmin
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • struct kernel_info and kerninfo__ are too vague, what they really
    describe are machines, virtual ones or hosts.

    There are more changes to introduce helpers to shorten function calls
    and to make more clear what is really being done, but I left that for
    subsequent patches.

    Cc: Avi Kivity
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Zhang, Yanmin
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

19 Apr, 2010

1 commit


14 Apr, 2010

6 commits

  • Bypasses the build_id perf header code and replaces it with a
    synthesized event and processing function that accomplishes the
    same thing, used when reading/writing perf data to/from a pipe.

    Signed-off-by: Tom Zanussi
    Acked-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: k-keiichi@bx.jp.nec.com
    Cc: acme@ghostprotocols.net
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Bypasses the tracing_data perf header code and replaces it with
    a synthesized event and processing function that accomplishes
    the same thing, used when reading/writing perf data to/from a
    pipe.

    The tracing data is pretty large, and this patch doesn't attempt
    to break it down into component events. The tracing_data event
    itself doesn't actually contain the tracing data, rather it
    arranges for the event processing code to skip over it after
    it's read, using the skip return value added to the event
    processing loop in a previous patch.

    Signed-off-by: Tom Zanussi
    Acked-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: k-keiichi@bx.jp.nec.com
    Cc: acme@ghostprotocols.net
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Bypasses the event type perf header code and replaces it with a
    synthesized event and processing function that accomplishes the
    same thing, used when reading/writing perf data to/from a pipe.

    Signed-off-by: Tom Zanussi
    Acked-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: k-keiichi@bx.jp.nec.com
    Cc: acme@ghostprotocols.net
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Bypasses the attr perf header code and replaces it with a
    synthesized event and processing function that accomplishes the
    same thing, used when reading/writing perf data to/from a pipe.

    Making the attrs into events allows them to be streamed over a
    pipe along with the rest of the header data (in later patches).
    It also paves the way to allowing events to be added and removed
    from perf sessions dynamically.

    Signed-off-by: Tom Zanussi
    Acked-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: k-keiichi@bx.jp.nec.com
    Cc: acme@ghostprotocols.net
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Adds special treatment for stdin - if the user specifies '-i -'
    to perf report, the intent is that the event stream be read
    from stdin rather than from a disk file.

    The actual handling of the '-' filename is done by the session;
    this just adds a signal handler to stop reporting, and turns off
    interference by the pager.

    Signed-off-by: Tom Zanussi
    Acked-by: Thomas Gleixner
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: k-keiichi@bx.jp.nec.com
    Cc: acme@ghostprotocols.net
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Parsing an option from the command line with OPT_BOOLEAN on a
    bool data type would not work on a big-endian machine due to the
    manner in which the boolean was being cast into an int and
    incremented. For example, running 'perf probe --list' on a
    PowerPC machine would fail to properly set the list_events bool
    and would therefore print out the usage information and
    terminate.

    This patch makes OPT_BOOLEAN work as expected with a bool
    datatype. For cases where the original OPT_BOOLEAN was
    intentionally being used to increment an int each time it was
    passed in on the command line, this patch introduces OPT_INCR
    with the old behaviour of OPT_BOOLEAN (the verbose variable is
    currently the only such example of this).

    I have reviewed every use of OPT_BOOLEAN to verify that a true
    C99 bool was passed. Where integers were used, I verified that
    they were only being used for boolean logic and changed them to
    bools to ensure that they would not be mistakenly used as ints.
    The major exception was the verbose variable which now uses
    OPT_INCR instead of OPT_BOOLEAN.

    Signed-off-by: Ian Munsie
    Acked-by: David S. Miller
    Cc: # NOTE: wont apply to .3[34].x cleanly, please backport
    Cc: Git development list
    Cc: Ian Munsie
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: KOSAKI Motohiro
    Cc: Hitoshi Mitake
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Eric B Munson
    Cc: Valdis.Kletnieks@vt.edu
    Cc: WANG Cong
    Cc: Thiago Farina
    Cc: Masami Hiramatsu
    Cc: Xiao Guangrong
    Cc: Jaswinder Singh Rajput
    Cc: Arjan van de Ven
    Cc: OGAWA Hirofumi
    Cc: Mike Galbraith
    Cc: Tom Zanussi
    Cc: Anton Blanchard
    Cc: John Kacur
    Cc: Li Zefan
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ian Munsie
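
The endianness problem described in the OPT_BOOLEAN commit above can be reproduced in miniature. The sketch below is not the perf option parser: slot_sketch, incr_int_option, set_bool_option and the other names are hypothetical. It deliberately uses int-sized storage so that the increment stays in bounds, and only shows which byte an int increment touches compared with the byte a bool reads.

```c
#include <stdbool.h>
#include <string.h>

/* int-sized storage whose first byte is where a bool would live. */
union slot_sketch {
	bool flag;
	int as_int;
};

/* Old style: treat the option storage as an int and increment it. */
static void incr_int_option(int *value)
{
	(*value)++;
}

/* New style: the option storage really is a bool, so just set it. */
static void set_bool_option(bool *value)
{
	*value = true;
}

/*
 * Increment through the int view, then read back through the bool view.
 * The increment changes the least significant byte of the int, which is
 * the first byte on little-endian hosts but the last one on big-endian
 * hosts, so the bool only reads as true on little-endian.
 */
static bool bool_seen_after_int_increment(void)
{
	union slot_sketch slot;

	memset(&slot, 0, sizeof(slot));
	incr_int_option(&slot.as_int);
	return slot.flag;
}

/* Detect the host byte order by looking at the first byte of the value 1. */
static bool host_is_little_endian(void)
{
	unsigned int one = 1;
	unsigned char first_byte;

	memcpy(&first_byte, &one, 1);
	return first_byte == 1;
}
```

On a little-endian machine bool_seen_after_int_increment() returns true, which is why the problem went unnoticed there; on a big-endian machine such as PowerPC it returns false, matching the 'perf probe --list' failure described above. Setting the bool directly works on both.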
     

03 Apr, 2010

6 commits

  • So that it can use it in the 'perf annotate' command line, otherwise
    it'll use the default and not the specified -i filename passed to 'perf
    report'.

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we avoid conflict with libc's string.h header.

    Reviewed-by: KOSAKI Motohiro
    Suggested-by: KOSAKI Motohiro
    Cc: KOSAKI Motohiro
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Propagate error instead.

    LKML-Reference:
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Return NULL instead and make the caller propagate the error.

    LKML-Reference:
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The struct callchain_node size is 120 bytes, that are never used when
    there are no callchains or '-g none' is specified, so conditionally
    allocate it, reducing sizeof(struct hist_entry) from 210 bytes to only
    96, greatly speeding the non-callchain processing.

    LKML-Reference:
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • For when we are processing the events and inserting the entries in the
    browser.

    Experimentation here: naming "ui_something" we may be treading into
    creating a TUI/GUI set of routines that can then be implemented in terms
    of multiple backends.

    Also, adding things to the "browser" visually takes (I guess I should
    do some profiling here ;-) ) more time than processing the
    events...

    That means we probably need to create a custom hist_entry browser, so
    that we reuse the structures we have in place instead of duplicating
    them in newt.

    But progress was made and at least we can see something while long files
    are being loaded, that must be one of UI 101 bullet points :-)

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
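
The conditional allocation described in the callchain_node commit above can be sketched with a trailing flexible array member that is only given storage when callchains are in use. The struct layouts and names here (callchain_node_sketch, hist_entry_sketch and its helpers) are assumptions for illustration; only the 120 byte figure comes from the commit message, and whether perf uses exactly this layout is not confirmed by the text.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 120 byte payload standing in for struct callchain_node. */
struct callchain_node_sketch {
	unsigned char payload[120];
};

/* Hypothetical, heavily reduced hist entry. */
struct hist_entry_sketch {
	uint64_t period;
	/* Storage for this member exists only if it was allocated. */
	struct callchain_node_sketch callchain[];
};

/* Size to allocate, depending on whether callchains are processed. */
static size_t hist_entry_sketch__size(bool use_callchain)
{
	size_t size = sizeof(struct hist_entry_sketch);

	if (use_callchain)
		size += sizeof(struct callchain_node_sketch);
	return size;
}

/* Allocate a zeroed entry; the caller must free() it. */
static struct hist_entry_sketch *hist_entry_sketch__new(bool use_callchain)
{
	return calloc(1, hist_entry_sketch__size(use_callchain));
}
```

Code that touches callchain[0] must first check that callchains are enabled, since in the non-callchain case that memory was never allocated.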
     

26 Mar, 2010

1 commit


23 Mar, 2010

1 commit

  • Callchains have markers inside their capture to tell we
    enter a context (kernel, user, ...).

    Those are not displayed in the callchains but they are
    incidentally an active part of the radix tree where
    callchains are stored, just like any other address.

    If we have the two following callchains:

    addr1 -> addr2 -> user context -> addr3
    addr1 -> addr2 -> user context -> addr4
    addr1 -> addr2 -> addr5

    This is pretty common if addr1 and addr2 are part of an
    interrupt path, addr3 and addr4 are user addresses and
    addr5 is a kernel non interrupt path.

    This will be stored as follows in the tree:

               addr1
               addr2
              /     \
             /       addr5
       user context
        /      \
     addr3    addr4

    But we ignore the context markers in the report, hence
    the addr3 and addr4 will appear as orphan branches:

    |--28.30%-- hrtimer_interrupt
    |          smp_apic_timer_interrupt
    |          apic_timer_interrupt
    |          |
    Signed-off-by: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Signed-off-by: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

12 Mar, 2010

1 commit

  • Newt has widespread availability and provides a rather simple
    API as can be seen by the size of this patch.

    The work needed to support it will benefit other frontends too.

    In this initial patch it just checks if the output is a tty, if
    not it falls back to the previous behaviour, also if
    newt-devel/libnewt-dev is not installed the previous behaviour
    is maintained.

    Pressing enter on a symbol will annotate it, ESC in the
    annotation window will return to the report symbol list.

    More work will be done to remove the special casing in
    color_fprintf, stop using fmemopen/FILE in the printing of
    hist_entries, etc.

    Also the annotation doesn't need to be done via spawning "perf
    annotate" and then browsing its output, we can do better by
    calling directly the builtin-annotate.c functions, that would
    then be moved to tools/perf/util/annotate.c and shared with perf
    top, etc.

    But let's go by baby steps, this patch already improves perf
    usability by allowing to quickly do annotations on symbols from
    the report screen and provides a first experimentation with
    libnewt/TUI integration of tools.

    Tested on RHEL5 and Fedora12 X86_64 and on Debian PARISC64 to
    browse a perf.data file collected on a Fedora12 x86_64 box.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Avi Kivity
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

10 Mar, 2010

3 commits

  • Perf report does not handle multiple events being reported, even
    though perf record stores them properly on disk. This patch
    addresses that issue by adding the logic to perf report to use
    the event stream id that is saved by record and the new data
    structures to separate the event streams and report them
    individually.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • Now that report can store histograms for multiple events we
    need to be able to do the post processing work for each
    histogram. This patch changes the post processing functions so
    that they can be called individually for each event's histogram.

    Signed-off-by: Eric B Munson
    [ Guarantee bisectability by fixing up builtin-report.c ]
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • In order to minimize the impact of storing multiple events in a
    report this function will now take the root of the histogram
    tree so that the logic for selecting the proper tree can be
    inserted before the call.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     

29 Jan, 2010

1 commit


27 Jan, 2010

1 commit


16 Jan, 2010

1 commit

  • Since they can come from another architecture with bigger
    pointers, i.e. processing a 64-bit perf.data on a 32-bit arch.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo