29 Oct, 2012

1 commit

  • Currently many perf commands annotate/evlist/report/script/lock etc all
    support "-i" option to chose a specific perf data, and all of them
    create a local "input_name" to save the file name for that perf data.

    Since most of these commands need it, we can add a global variable for
    it, also it can some other benefits:

    1. When calling script browser inside hists/annotation browser, it needs
    to know the perf data file name to run that script.

    2. For further feature like runtime switching to another perf data file,
    this variable can also help.

    Signed-off-by: Feng Tang
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1351569369-26732-2-git-send-email-feng.tang@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Feng Tang
     

03 Oct, 2012

1 commit

  • Some variables were global but used in just one function, so move it to
    where it belongs.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-fapdrw3h3hz713w8h5eww596@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

24 Sep, 2012

1 commit

  • Use zalloc for the malloc+memset open coded sequence.

    Fix leak on the #ifdef'ed C state handling and when detecting invalid
    data in p_state_change().

    Cc: Arjan van de Ven
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-v9x3q9rv4caxtox7wtjpchq5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

11 Sep, 2012

1 commit

  • perf defines both __used and __unused variables to use for marking
    unused variables. The variable __used is defined to
    __attribute__((__unused__)), which contradicts the kernel definition to
    __attribute__((__used__)) for new gcc versions. On Android, __used is
    also defined in system headers and this leads to warnings like: warning:
    '__used__' attribute ignored

    __unused is not defined in the kernel and is not a standard definition.
    If __unused is included everywhere instead of __used, this leads to
    conflicts with glibc headers, since glibc has a variables with this name
    in its headers.

    The best approach is to use __maybe_unused, the definition used in the
    kernel for __attribute__((unused)). In this way there is only one
    definition in perf sources (instead of 2 definitions that point to the
    same thing: __used and __unused) and it works on both Linux and Android.
    This patch simply replaces all instances of __used and __unused with
    __maybe_unused.

    Signed-off-by: Irina Tirdea
    Acked-by: Pekka Enberg
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
    [ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Irina Tirdea
     

24 Dec, 2011

1 commit

  • The default input file for perf report is not handled the same way as
    perf record does it for its output file. This leads to unexpected
    behavior of perf report, etc. E.g.:

    # perf record -a -e cpu-cycles sleep 2 | perf report | cat
    failed to open perf.data: No such file or directory (try 'perf record' first)

    While perf record writes to a fifo, perf report expects perf.data to be
    read. This patch changes this to accept fifos as input file.

    Applies to the following commands:

    perf annotate
    perf buildid-list
    perf evlist
    perf kmem
    perf lock
    perf report
    perf sched
    perf script
    perf timechart

    Also fixes char const* -> const char* type declaration for filename
    strings.

    v2:
    * Prevent potential null pointer access to input_name in
    builtin-report.c. Needed due to removal of patch "perf report: Setup
    browser if stdout is a pipe"

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
    Signed-off-by: Robert Richter
    Signed-off-by: Arnaldo Carvalho de Melo

    Robert Richter
     

28 Nov, 2011

4 commits

  • To better reflect that it became the base class for all tools, that must
    be in each tool struct and where common stuff will be put.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Reducing the exposure of perf_session further, so that we can use the
    classes in cases where no perf.data file is created.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • So that we don't need to have that many globals.

    Next steps will remove the 'session' pointer, that in most cases is
    not needed.

    Then we can rename perf_event_ops to 'perf_tool' that better describes
    this class hierarchy.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Eventually session->sample_type will go away as we want to support
    multiple sample types per session, so use it from the evsel which is a
    step in that direction.

    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-0vwdpjcwbjezw459lw5n3ew1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

24 Mar, 2011

1 commit

  • Resolving the sample->id to an evsel since the most advanced tools,
    report and annotate, and the others will too when they evolve to
    properly support multi-event perf.data files.

    Good also because it does an extra validation, checking that the ID is
    valid when present. When that is not the case, the overhead is just a
    branch + function call (perf_evlist__id2evsel).

    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

04 Mar, 2011

1 commit


28 Feb, 2011

1 commit

  • Currently numcpus is determined in pid_put_sample which is only
    called on sched_switch/sched_wakeup sample processing.

    On a machine with a lot cpus I often saw the last cpu missing.

    Check for (max) numcpus on every event happening and in the
    beginning. -> fixes the issue for me.

    Signed-off-by: Thomas Renninger
    Cc: Arjan van de Ven
    Cc: Arnaldo Carvalho de Melo
    Cc: lenb@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Renninger
     

30 Jan, 2011

2 commits


04 Jan, 2011

1 commit

  • builtin-timechart must only pass -e power:xy events if they are supported by
    the running kernel, otherwise try to fetch the old power:power{start,end}
    events.

    For this I added the tiny helper function:

    int is_valid_tracepoint(const char *event_string)

    to parse-events.[hc], which could be more generic as an interface and support
    hardware/software/... events, not only tracepoints, but someone else could
    extend that if needed...

    Signed-off-by: Thomas Renninger
    Acked-by: Arjan van de Ven
    Acked-by: Jean Pihet
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Thomas Renninger
     

22 Dec, 2010

2 commits

  • The symfs argument allows analysis of perf.data file using a locally accessible
    filesystem tree with debug symbols - e.g., tree created during image builds,
    sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
    with an OS tree can be analyzed from anywhere without the need to populate a
    local data store with build-ids.

    Commiter notes:

    o Fixed up symfs="/" variants handling.

    o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
    outside the symfs directory.

    LKML-Reference:
    Signed-off-by: David Ahern
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
     
  • If we are running the new perf on an old kernel without support for
    sample_id_all, we should fall back to the old unordered processing of
    events. If we didn't than we would *always* process events without
    timestamps out of order, whether or not we hit a reordering race. In
    other words, instead of there being a chance of not attributing samples
    correctly, we would guarantee that samples would not be attributed.

    While processing all events without timestamps before events with
    timestamps may seem like an intuitive solution, it falls down as
    PERF_RECORD_EXIT events would also be processed before any samples.
    Even with a workaround for that case, samples before/after an exec would
    not be attributed correctly.

    This patch allows commands to indicate whether they need to fall back to
    unordered processing, so that commands that do not care about timestamps
    on every event will not be affected. If we do fallback, this will print
    out a warning if report -D was invoked.

    This patch adds the test in perf_session__new so that we only need to
    test once per session. Commands that do not use an event_ops (such as
    record and top) can simply pass NULL in it's place.

    Acked-by: Thomas Gleixner
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ian Munsie
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian Munsie
     

06 Dec, 2010

1 commit

  • There were a few stray calloc()'s and malloc()'s which were not having
    their return values checked for success.

    As the calling code either already coped with failure or didn't actually
    care we just return -ENOMEM at that point.

    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Signed-off-by: Chris Samuel
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Chris Samuel
     

05 Dec, 2010

1 commit

  • At perf_session__process_event, so that we reduce the number of lines in eache
    tool sample processing routine that now receives a sample_data pointer already
    parsed.

    This will also be useful in the next patch, where we'll allow sample the
    identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
    timestamp) just after before every event.

    Also validate callchains in perf_session__process_event, i.e. as early as
    possible, and keep a counter of the number of events discarded due to invalid
    callchains, warning the user about it if it happens.

    There is an assumption that was kept that all events have the same sample_type,
    that will be dealt with in the future, when this preexisting limitation will be
    removed.

    Tested-by: Thomas Gleixner
    Reviewed-by: Thomas Gleixner
    Acked-by: Ian Munsie
    Acked-by: Thomas Gleixner
    Cc: Frédéric Weisbecker
    Cc: Ian Munsie
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

06 Aug, 2010

1 commit

  • Outdent the code following the if.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r disable braces4@
    position p1,p2;
    statement S1,S2;
    @@

    (
    if (...) { ... }
    |
    if (...) S1@p1 S2@p2
    )

    @script:python@
    p1 << r.p1;
    p2 << r.p2;
    @@

    if (p1[0].column == p2[0].column):
    cocci.print_main("branch",p1)
    cocci.print_secs("after",p2)
    //

    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Julia Lawall
    Signed-off-by: Arnaldo Carvalho de Melo

    Julia Lawall
     

22 Jul, 2010

1 commit

  • and fix the broken case if a core's frequency depends on others.

    trace_power_frequency was only implemented in a rather ungeneric
    way in acpi-cpufreq driver's target() function only.

    -> Move the call to trace_power_frequency to
    cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE
    notifier is triggered.
    This will support power frequency tracing by all cpufreq
    drivers.

    trace_power_frequency did not trace frequency changes correctly
    when the userspace governor was used or when CPU cores'
    frequency depend on each other.

    -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu
    which gets switched automatically fixes this.

    Robert Schoene provided some important fixes on top of my
    initial quick shot version which are integrated in this patch:
    - Forgot some changes in power_end trace (TP_printk/variable names)
    - Variable dummy in power_end must now be cpu_id
    - Use static 64 bit variable instead of unsigned int for cpu_id

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Thomas Renninger
    Cc: davej@codemonkey.org.uk
    Signed-off-by: Ingo Molnar
    Cc: Dave Jones
    Acked-by: Arjan van de Ven
    Cc: Robert Schoene
    Tested-by: Robert Schoene
    Signed-off-by: Andrew Morton

    Thomas Renninger
     

03 May, 2010

1 commit

  • Currently, perf 'live mode' writes build-ids at the end of the
    session, which isn't actually useful for processing live mode events.

    What would be better would be to have the build-ids sent before any of
    the samples that reference them, which can be done by processing the
    event stream and retrieving the build-ids on the first hit. Doing
    that in perf-record itself, however, is off-limits.

    This patch introduces perf-inject, which does the same job while
    leaving perf-record untouched. Normal mode perf still records the
    build-ids at the end of the session as it should, but for live mode,
    perf-inject can be injected in between the record and report steps
    e.g.:

    perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

    perf-inject reads a perf-record event stream and repipes it to stdout.
    At any point the processing code can inject other events into the
    event stream - in this case build-ids (-b option) are read and
    injected as needed into the event stream.

    Build-ids are just the first user of perf-inject - potentially
    anything that needs userspace processing to augment the trace stream
    with additional information could make use of this facility.

    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Tom Zanussi
    Signed-off-by: Arnaldo Carvalho de Melo

    Tom Zanussi
     

24 Apr, 2010

1 commit


14 Apr, 2010

1 commit

  • Parsing an option from the command line with OPT_BOOLEAN on a
    bool data type would not work on a big-endian machine due to the
    manner in which the boolean was being cast into an int and
    incremented. For example, running 'perf probe --list' on a
    PowerPC machine would fail to properly set the list_events bool
    and would therefore print out the usage information and
    terminate.

    This patch makes OPT_BOOLEAN work as expected with a bool
    datatype. For cases where the original OPT_BOOLEAN was
    intentionally being used to increment an int each time it was
    passed in on the command line, this patch introduces OPT_INCR
    with the old behaviour of OPT_BOOLEAN (the verbose variable is
    currently the only such example of this).

    I have reviewed every use of OPT_BOOLEAN to verify that a true
    C99 bool was passed. Where integers were used, I verified that
    they were only being used for boolean logic and changed them to
    bools to ensure that they would not be mistakenly used as ints.
    The major exception was the verbose variable which now uses
    OPT_INCR instead of OPT_BOOLEAN.

    Signed-off-by: Ian Munsie
    Acked-by: David S. Miller
    Cc: # NOTE: wont apply to .3[34].x cleanly, please backport
    Cc: Git development list
    Cc: Ian Munsie
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: KOSAKI Motohiro
    Cc: Hitoshi Mitake
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Eric B Munson
    Cc: Valdis.Kletnieks@vt.edu
    Cc: WANG Cong
    Cc: Thiago Farina
    Cc: Masami Hiramatsu
    Cc: Xiao Guangrong
    Cc: Jaswinder Singh Rajput
    Cc: Arjan van de Ven
    Cc: OGAWA Hirofumi
    Cc: Mike Galbraith
    Cc: Tom Zanussi
    Cc: Anton Blanchard
    Cc: John Kacur
    Cc: Li Zefan
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ian Munsie
     

08 Apr, 2010

1 commit

  • Using 'pahole --packable' I found some structs that could be reorganized
    to eliminate alignment holes, in some cases getting them to be cacheline
    multiples.

    [acme@doppio linux-2.6-tip]$ codiff perf.old ~/bin/perf
    builtin-annotate.c:
    struct perf_session | -8
    struct perf_header | -8
    2 structs changed

    builtin-diff.c:
    struct sample_data | -8
    1 struct changed
    diff__process_sample_event | -8
    1 function changed, 8 bytes removed, diff: -8

    builtin-sched.c:
    struct sched_atom | -8
    1 struct changed

    builtin-timechart.c:
    struct per_pid | -8
    1 struct changed
    cmd_timechart | -16
    1 function changed, 16 bytes removed, diff: -16

    builtin-probe.c:
    struct perf_probe_point | -8
    struct perf_probe_event | -8
    2 structs changed
    opt_add_probe_event | -3
    1 function changed, 3 bytes removed, diff: -3

    util/probe-finder.c:
    struct probe_finder | -8
    1 struct changed
    find_kprobe_trace_events | -16
    1 function changed, 16 bytes removed, diff: -16

    /home/acme/bin/perf:
    4 functions changed, 43 bytes removed, diff: -43
    [acme@doppio linux-2.6-tip]$

    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

03 Apr, 2010

1 commit


29 Jan, 2010

1 commit


17 Jan, 2010

1 commit

  • A process that changes its comm field, does this on a per kernel
    task struct basis. The timechart tool used, incorrectly, the pid
    to track this, and should have used the tid instead...

    Signed-off-by: Arjan van de Ven
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    CC:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     

28 Dec, 2009

3 commits


16 Dec, 2009

2 commits


15 Dec, 2009

1 commit


14 Dec, 2009

5 commits

  • As we'll need to sort multiple times for multiple perf sessions,
    so that we can then do a diff.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • There is still some more work to do to disentangle map creation
    from DSO loading, but this happens only for the kernel, and for
    the early adopters of perf diff, where this disentanglement
    matters most, we'll be testing different kernels, so no problem
    here.

    Further clarification: right now we create the kernel maps for
    the various modules and discontiguous kernel text maps when
    loading the DSO, we should do it as a two step process, first
    creating the maps, for multiple mappings with the same DSO
    store, then doing the dso load just once, for the first hit on
    one of the maps sharing this DSO backing store.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • By having the cwd/cwdlen in the perf_session struct and
    full_paths in perf_event_ops.

    Now its just a matter of passing the ops.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Pass the event_ops to perf_session__process_events instead.

    Also move the event_ops definition to session.h, starting to
    move things around to their right place, trimming the many
    unneeded headers we have.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • They will need it to get the right threads list, etc.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

12 Dec, 2009

1 commit

  • That does all the initialization boilerplate, opening the file,
    reading the header, checking if it is valid, etc.

    And that will as well have the threads list, kmap (now) global
    variable, etc, so that we can handle two (or more) perf.data files
    describing sessions to compare.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo