30 Nov, 2009

4 commits

  • Fedora needs perl-ExtUtils-Embed for Perl scripting, which also
    brings along libperl-devel; note this info for the convenience
    of Fedora users.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • The common_* functions (e.g. common_pc()) are exported as
    common_* but named get_common_*, resulting in unresolved
    subroutine errors when executing scripts.

    Make the internal and external names match.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • The debugging versions of the ENTER and LEAVE internal perl
    macros, used when embedding perl, define a local block with a
    my_perl perl variable that shadows a global variable of the same
    name, which is also the name expected by the embedding API for
    the embedded interpreter.

    Since we don't have control over the code generated in this case
    (it's an externality) and can't get rid of the warning, ignore it.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • The backtick shell substitutions for PERL_EMBED_LDOPT/CCOPT make
    a lot of noise on stderr if Embed.pm isn't installed - this
    silences them.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     

28 Nov, 2009

20 commits

  • Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • To capture the relevant events for a given Perl script and to
    avoid having to continually remember and type in long
    command-lines, add a scripts/perl/bin directory containing two
    simple shell scripts for each Perl script, one for recording and
    one for processing/display. For example, to record perf data for
    the rw-by-pid.pl script, run scripts/perl/bin/rw-by-pid-record
    and to actually run the script and display the output run
    scripts/perl/bin/rw-by-pid-report.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Adds perf-trace-perl Documentation and a link to it from the
    perf-trace page.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • The Perl scripting support for perf trace allows most of a trace
    event's data to be accessed directly as handler arguments, but
    not all of it; e.g. the less common fields aren't passed in. To
    give scripts access to the other fields and/or any other data or
    metadata in the main perf executable that might be useful, a way
    to access the C data in perf from Perl is needed; this patch
    uses the Perl XS facility to do it for the common_xxx event
    fields not passed to handler functions.

    Context.pm exports three functions to Perl scripts that access
    fields for the current event by calling back into perf:
    common_pc(), common_flags() and common_lock_depth(). Support
    for common_flags() field values was added to Core.pm and a
    script used to sanity check these and other basic scripting
    features, check-perf-trace.pl, was also added.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Add Perf-Trace-Util Perl module and some scripts that use it.
    Core.pm contains Perl code to define and access flag and
    symbolic fields. Util.pm contains general-purpose utility
    functions.

    Also adds some makefile bits to install them in
    libexec/perf-core/scripts/perl (or wherever perfexec_instdir
    points).

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Implement trace_scripting_ops to make Perl a supported perf
    trace scripting language.

    Additionally adds code that allows Perl trace scripts to access
    the 'flag' and 'symbolic' (__print_flags(), __print_symbolic())
    field information parsed from the trace format files.

    Also adds the Perl implementation of the generate_script()
    trace_scripting_op, which creates a ready-to-run perf trace Perl
    script based on existing trace data. Scripts generated by this
    implementation print out all the fields for each event mentioned
    in perf.data (and will detect and generate the proper scripting
    code for 'flag' and 'symbolic' fields), and will additionally
    generate handlers for the special 'trace_unhandled',
    'trace_begin' and 'trace_end' handlers. Script authors can
    simply remove the printing code to implement their own custom
    event handling.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • It's useful to know whether a field is a flag or symbolic field,
    e.g. when generating scripts - it allows us to translate those
    fields specially rather than literally as plain numeric values.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Adds an interface, scripting_ops, that when implemented for a
    particular scripting language enables built-in support for trace
    stream processing using that language.

    The interface is designed to enable full-fledged language
    interpreters to be embedded inside the perf executable and
    thereby make the full capabilities of the supported languages
    available for trace processing.

    See below for details on the interface.

    This patch also adds a couple of command-line options to 'perf
    trace':

    The -s option is used to specify the script to be run.
    Script names that can be used with -s take the form:

    [language spec:]scriptname[.ext]

    Scripting languages register a set of 'language specs' that can
    be used to specify scripts for the registered languages. The
    specs can be used either as prefixes or extensions.

    If [language spec:] is used, the script is taken as a script of
    the matching language regardless of any extension it might have.
    If [language spec:] is not used, [.ext] is used to look up the
    language it corresponds to. Language specs are case
    insensitive.

    e.g. Perl scripts can be specified in the following ways:

    Perl:scriptname
    pl:scriptname.py # extension ignored
    PL:scriptname
    scriptname.pl
    scriptname.perl
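
    The naming rule above can be sketched roughly as follows. This is
    not the code from the patch: parse_script_name(), spec_matches()
    and struct parsed_script are hypothetical names used only for
    illustration, and the sketch assumes a ':' never appears in the
    script name itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: the spec that was found and the script name. */
struct parsed_script {
	char spec[32];        /* language spec, e.g. "Perl" or "pl" */
	const char *script;   /* script name, with any "spec:" prefix skipped */
};

/*
 * Split "[spec:]scriptname[.ext]". A "spec:" prefix wins; otherwise the
 * text after the last '.' is used as the spec. Returns 0 on success and
 * -1 if no spec can be determined or it does not fit in the buffer.
 */
static int parse_script_name(const char *arg, struct parsed_script *out)
{
	const char *colon, *dot;
	size_t len;

	assert(arg && out);

	colon = strchr(arg, ':');
	if (colon) {
		len = (size_t)(colon - arg);
		out->script = colon + 1;
	} else {
		dot = strrchr(arg, '.');
		if (!dot || dot[1] == '\0')
			return -1;
		out->script = arg;
		arg = dot + 1;
		len = strlen(arg);
	}
	if (len == 0 || len >= sizeof(out->spec))
		return -1;
	memcpy(out->spec, arg, len);
	out->spec[len] = '\0';
	return 0;
}

/* Specs are case insensitive, so "PL", "pl" and "Pl" all match. */
static int spec_matches(const char *spec, const char *registered)
{
	for (; *spec && *registered; spec++, registered++)
		if (tolower((unsigned char)*spec) !=
		    tolower((unsigned char)*registered))
			return 0;
	return *spec == *registered;
}
```

    With this sketch, "pl:scriptname.py" yields the spec "pl" and the
    script "scriptname.py", while a bare "scriptname" is rejected
    because no language can be inferred.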

    The -g [language spec] option gives users an easy starting point
    for writing scripts in the specified language. Scripting
    support for a particular language can implement a
    generate_script() scripting op that outputs an empty (or
    near-empty) set of handlers for all the events contained in a
    given perf.data trace file - this option gives users a direct
    way to access that.

    Adding support for a scripting language
    ---------------------------------------

    The main thing that needs to be done to add support for a new
    language is to implement the scripting_ops interface:

    It consists of the following four functions:

    start_script()
    stop_script()
    process_event()
    generate_script()

    start_script() is called before any events are processed, and is
    meant to give the scripting language support an opportunity to
    set things up to receive events e.g. create and initialize an
    instance of a language interpreter.

    stop_script() is called after all events are processed, and is
    meant to give the scripting language support an opportunity to
    clean up e.g. destroy the interpreter instance, etc.

    process_event() is called once for each event and takes as its
    main parameter a pointer to the binary trace event record to be
    processed. The implementation is responsible for picking out the
    binary fields from the event record and sending them to the
    script handler function associated with that event, e.g. a
    function whose name is derived from the event it's meant to
    handle, such as 'sched::sched_switch()'. The 'format' information
    for trace events can be used to parse the binary data and map it
    into a form usable by a given scripting language; see the Perl
    implementation in subsequent patches for one possible way to
    leverage the existing trace format parsing code in perf and map
    that info into specific scripting language types.

    generate_script() should generate a ready-to-run script for the
    current set of events in the trace, preferably with bodies that
    print out every field for each event. Again, look at the Perl
    implementation for clues as to how that can be done. This is an
    optional, but very useful op.
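
    The call order described above can be sketched as a table of
    function pointers plus a small driver. The four op names come from
    this commit message, but their signatures are not given here, so
    the ones below are assumptions; struct scripting_ops_sketch,
    run_script() and the demo_* stand-ins are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed shape: the commit names the four ops, not their signatures. */
struct scripting_ops_sketch {
	int (*start_script)(const char *script);
	int (*stop_script)(void);
	void (*process_event)(const void *record, size_t size);
	int (*generate_script)(const char *outfile);   /* optional, may be NULL */
};

/* Run one script over an array of raw event records. */
static int run_script(const struct scripting_ops_sketch *ops,
		      const char *script,
		      const void *const *records, const size_t *sizes,
		      size_t nr)
{
	size_t i;
	int err;

	assert(ops && ops->start_script && ops->stop_script &&
	       ops->process_event);

	err = ops->start_script(script);        /* before any event */
	if (err)
		return err;
	for (i = 0; i < nr; i++)                /* once per event */
		ops->process_event(records[i], sizes[i]);
	return ops->stop_script();              /* after all events */
}

/* A trivial "language" that only counts calls, standing in for an
 * embedded interpreter. */
static int demo_started, demo_stopped;
static size_t demo_events;

static int demo_start(const char *script)
{
	(void)script;
	demo_started++;
	return 0;
}

static int demo_stop(void)
{
	demo_stopped++;
	return 0;
}

static void demo_event(const void *record, size_t size)
{
	(void)record;
	(void)size;
	demo_events++;
}

static const struct scripting_ops_sketch demo_ops = {
	.start_script	 = demo_start,
	.stop_script	 = demo_stop,
	.process_event	 = demo_event,
	.generate_script = NULL,
};
```

    A real implementation would create the interpreter in
    start_script(), unpack each record's fields in process_event()
    and tear the interpreter down in stop_script().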

    Support for a given language should also add a language-specific
    setup function and call it from setup_scripting(). The
    language-specific setup function associates the scripting
    ops for that language with one or more 'language specifiers'
    (see above) using script_spec_register(). When a script name is
    specified on the command line, the scripting ops associated with
    the specified language are used to instantiate and use the
    appropriate interpreter to process the trace stream.
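
    As a minimal sketch of the registration step, assuming a small
    fixed-size table: the real script_spec_register() signature is not
    shown in this message, so spec_register(), spec_lookup(),
    struct ops_sketch and setup_perl_sketch() below are hypothetical
    stand-ins.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_SPECS 16

struct ops_sketch { const char *name; };   /* stand-in for the scripting ops */

static struct {
	const char *spec;
	const struct ops_sketch *ops;
} spec_table[MAX_SPECS];
static size_t nr_specs;

/* Case-insensitive comparison, since language specs ignore case. */
static int spec_equal(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
	return *a == *b;
}

/* Return the ops registered for a spec, or NULL if there are none. */
static const struct ops_sketch *spec_lookup(const char *spec)
{
	size_t i;

	for (i = 0; i < nr_specs; i++)
		if (spec_equal(spec_table[i].spec, spec))
			return spec_table[i].ops;
	return NULL;
}

/* Associate a spec with a set of ops; -1 on duplicate or full table. */
static int spec_register(const char *spec, const struct ops_sketch *ops)
{
	assert(spec && ops);
	if (spec_lookup(spec) || nr_specs == MAX_SPECS)
		return -1;
	spec_table[nr_specs].spec = spec;
	spec_table[nr_specs].ops = ops;
	nr_specs++;
	return 0;
}

static const struct ops_sketch perl_ops_sketch = { "perl" };

/* Language-specific setup: register every spec that selects Perl. */
static void setup_perl_sketch(void)
{
	spec_register("Perl", &perl_ops_sketch);
	spec_register("pl", &perl_ops_sketch);
}
```

    After setup, a lookup of "PL" or "perl" returns the Perl ops, and
    an unregistered spec such as "py" returns NULL.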

    In general, it should be relatively easy to add support for a
    new language, especially if the language implementation supports
    an interface allowing an interpreter to be 'embedded' inside
    another program (in this case the containing program will be
    'perf trace'). If so, it should be relatively straightforward to
    translate trace events into invocations of user-defined script
    functions where e.g. the function name corresponds to the event
    type and the function parameters correspond to the event fields.
    The event and field type information exported by the event
    tracing infrastructure (via the event 'format' files) should be
    enough to parse and send any piece of trace data to the user
    script. The easiest way to see how this can be done would be to
    look at the Perl implementation contained in
    perf/util/trace-event-perl.c/.h.

    There are a couple of other things that aren't covered by the
    scripting_ops or setup interface and are technically optional,
    but should be implemented if possible. One of these is support
    for 'flag' and 'symbolic' fields e.g. being able to use more
    human-readable values such as 'GFP_KERNEL' or
    HI/BLOCK_IOPOLL/TASKLET in place of raw flag values. See the
    Perl implementation to see how this can be done. The other thing
    is support for 'calling back' into the perf executable to access
    e.g. uncommon fields not passed by default into handler
    functions, or any metadata the implementation might want to make
    available to users via the language interface. Again, see the
    Perl implementation for examples.
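
    To illustrate the difference between the two field kinds, here is
    a rough sketch of turning raw values into readable text. The
    tables and their values are made up for the example; real names
    and values would come from the event 'format' files, and
    flags_to_str() and symbol_to_str() are hypothetical helpers, not
    functions from perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct value_name { unsigned long value; const char *name; };

/* Illustrative values only, not real kernel definitions. */
static const struct value_name demo_flags[] = {
	{ 0x1, "WAIT" }, { 0x2, "IO" }, { 0x4, "FS" },
};

static const struct value_name demo_symbols[] = {
	{ 0, "HI" }, { 1, "TIMER" }, { 2, "NET_TX" },
};

/* 'flag' field: a bitmask, printed as NAME|NAME plus any leftover bits. */
static void flags_to_str(unsigned long value, const struct value_name *tbl,
			 size_t n, char *buf, size_t len)
{
	size_t i, used = 0;
	int ret;

	assert(buf && len > 0);
	buf[0] = '\0';
	for (i = 0; i < n && used < len; i++) {
		if (!tbl[i].value || (value & tbl[i].value) != tbl[i].value)
			continue;
		ret = snprintf(buf + used, len - used, "%s%s",
			       used ? "|" : "", tbl[i].name);
		if (ret < 0)
			return;
		used += (size_t)ret;
		value &= ~tbl[i].value;
	}
	if (value && used < len)        /* bits with no known name */
		snprintf(buf + used, len - used, "%s0x%lx",
			 used ? "|" : "", value);
}

/* 'symbolic' field: a single value that maps to exactly one name. */
static const char *symbol_to_str(unsigned long value,
				 const struct value_name *tbl, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tbl[i].value == value)
			return tbl[i].name;
	return NULL;
}
```

    A flag field can carry several names at once, while a symbolic
    field carries exactly one, which is why the two need different
    translation code.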

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    Cc: anton@samba.org
    Cc: hch@infradead.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • Now we have a very high level routine for simple tools to
    process IP sample events:

    int event__preprocess_sample(const event_t *self,
                                 struct addr_location *al,
                                 symbol_filter_t filter)

    It receives the event itself and will insert new threads in the
    global threads list and resolve the map and symbol, filling all
    this info into the new addr_location struct, so that tools like
    annotate and report can further process the event by creating
    hist_entries in their specific way (with or without callgraphs,
    etc).

    It in turn uses the new next layer function:

    void thread__find_addr_location(struct thread *self, u8 cpumode,
                                    enum map_type type, u64 addr,
                                    struct addr_location *al,
                                    symbol_filter_t filter)

    Given a thread (userspace or the kernel kthread one), this one
    will find the given type (MAP__FUNCTION now, MAP__VARIABLE too in
    the near future) at the given cpumode, taking vdsos into account
    (userspace hit, but kernel symbol), and will fill all these
    details into the given addr_location.

    Tools that need a more compact API for plain function
    resolution, like 'kmem', can use this other one:

    struct symbol *thread__find_function(struct thread *self, u64 addr,
                                         symbol_filter_t filter)

    So, to resolve a kernel symbol, which is all the 'kmem' tool
    needs, it's just a matter of calling:

    sym = thread__find_function(kthread, addr, NULL);

    The 'filter' parameter is needed because we do lazy
    parsing/loading of ELF symtabs or /proc/kallsyms.
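
    Why lazy loading forces the filter into the lookup call can be
    sketched as follows. This is a simplified model, not perf's code:
    the types, the raw_syms table and the convention that a filter
    returns nonzero to drop a symbol are all assumptions made for the
    example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym_sketch {
	unsigned long start, end;   /* covers [start, end) */
	const char *name;
};

/* Assumed convention for this sketch: return nonzero to drop the symbol. */
typedef int (*sym_filter_sketch_t)(const struct sym_sketch *sym);

/* Stand-in for what would be parsed from an ELF symtab or kallsyms. */
static const struct sym_sketch raw_syms[] = {
	{ 0x1000, 0x1100, "alpha" },
	{ 0x1100, 0x1200, "beta"  },
	{ 0x1200, 0x1300, "gamma" },
};
#define NR_RAW (sizeof(raw_syms) / sizeof(raw_syms[0]))

struct symtab_sketch {
	int loaded;                     /* set on first lookup */
	size_t nr;
	struct sym_sketch syms[NR_RAW];
};

static void symtab_load(struct symtab_sketch *tab, sym_filter_sketch_t filter)
{
	size_t i;

	tab->nr = 0;
	for (i = 0; i < NR_RAW; i++) {
		if (filter && filter(&raw_syms[i]))
			continue;       /* filtered out while loading */
		tab->syms[tab->nr++] = raw_syms[i];
	}
	tab->loaded = 1;
}

static const struct sym_sketch *
find_function_sketch(struct symtab_sketch *tab, unsigned long addr,
		     sym_filter_sketch_t filter)
{
	size_t i;

	assert(tab);
	if (!tab->loaded)               /* lazy: parse only when first needed */
		symtab_load(tab, filter);
	for (i = 0; i < tab->nr; i++)
		if (addr >= tab->syms[i].start && addr < tab->syms[i].end)
			return &tab->syms[i];
	return NULL;
}

/* Example filter: drop every symbol named "beta". */
static int drop_beta(const struct sym_sketch *sym)
{
	return strcmp(sym->name, "beta") == 0;
}
```

    Because the table is only built on the first lookup, that lookup
    is the last chance to apply a filter; later calls reuse whatever
    was loaded.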

    With this we remove more code duplication all around, which is
    always good, huh? :-)

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: John Kacur
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • While implementing event__preprocess_sample, that will do all of
    the symbol lookup in one convenient function, I noticed that
    util/process_event.[ch] were not being used at all, then started
    looking if there were other functions that could be shared
    and...

    None of those functions really needs to receive offset + head;
    the only thing they did with them was common to all of them, so
    do it in one place instead.

    Stats about the number of events of each type processed are now
    kept in a central place.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: John Kacur
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Making the routines that were so far specific to the kernel maps
    useful for all threads.

    This is done by making the kernel maps be contained in a kernel
    "thread".

    This gets the kernel specific routines closer to the userspace
    counterparts, which will help in reducing the boilerplate for
    resolving a symbol, as will be demonstrated in the next patches.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • So that we can support multiple symbol table types.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • So that the kallsyms loading routines are the direct counterpart
    of the vmlinux loading ones, i.e. dso__load_kallsyms is the
    counterpart of dso__load_vmlinux.

    In the process make them also use the symbols rb tree indexed by
    map->type, paving the way for supporting other types of symtabs,
    such as the next one to be supported: variables.

    This also allowed removal of yet another global variable:
    kernel_map__functions.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • By using an array of rb_roots in struct dso we can, from a
    struct map instance, get the right symbol rb_tree more easily.
    This way we can have just one symbol lookup method for struct
    map instances, map__find_symbol, instead of one per symtab type
    (functions, variables).
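
    The indexing idea can be sketched like this. To keep the example
    short, a sorted array stands in for each rb_root, and all sk_*
    names are hypothetical rather than perf's own definitions.

```c
#include <assert.h>
#include <stddef.h>

enum sk_map_type { SK_MAP_FUNCTION, SK_MAP_VARIABLE, SK_MAP_NR_TYPES };

struct sk_symbol {
	unsigned long start, end;   /* covers [start, end) */
	const char *name;
};

/* A sorted array replaces the rb_root here; the indexing is the point. */
struct sk_symtab { const struct sk_symbol *syms; size_t nr; };

struct sk_dso { struct sk_symtab symbols[SK_MAP_NR_TYPES]; };

struct sk_map { enum sk_map_type type; const struct sk_dso *dso; };

/* One lookup for every symtab type: the map's type selects the table. */
static const struct sk_symbol *sk_map_find_symbol(const struct sk_map *map,
						  unsigned long addr)
{
	const struct sk_symtab *tab;
	size_t lo = 0, hi;

	assert(map && map->dso && map->type < SK_MAP_NR_TYPES);
	tab = &map->dso->symbols[map->type];
	hi = tab->nr;
	while (lo < hi) {               /* binary search over sorted ranges */
		size_t mid = lo + (hi - lo) / 2;
		const struct sk_symbol *sym = &tab->syms[mid];

		if (addr < sym->start)
			hi = mid;
		else if (addr >= sym->end)
			lo = mid + 1;
		else
			return sym;
	}
	return NULL;
}

static const struct sk_symbol demo_funcs[] = {
	{ 0x1000, 0x1040, "main" }, { 0x1040, 0x1080, "helper" },
};
static const struct sk_symbol demo_vars[] = {
	{ 0x2000, 0x2008, "counter" },
};
static const struct sk_dso demo_dso = {
	.symbols = {
		[SK_MAP_FUNCTION] = { demo_funcs, 2 },
		[SK_MAP_VARIABLE] = { demo_vars, 1 },
	},
};
```

    A function map and a variable map that share the same DSO go
    through the same lookup routine and simply land in different
    tables.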

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • That way we will be able to check if the right symtab is loaded
    in the underlying DSO.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • perf annotate was the only user, and it doesn't really need it.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • We don't need to look at modules in dsos__findnew because the
    kernel events come only with user DSOs. Also we need a way to
    list just the module DSOs so that we can create multiple sets of
    maps, now that we will support maps for the variables in a
    symtab.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • As we'll have kernel_map[s]__variables too.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • This should be properly fixed when we remove the XXX comment in
    'perf report', function resolve_symbol.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

25 Nov, 2009

1 commit

  • Commit 13999e59343b042b0807be2df6ae5895d29782a0 (perf tools:
    Handle the case with and without the "signed" trace field)
    removed code to set the FIELD_IS_SIGNED flag that was originally
    added by commit 26a50744b21fff65bd754874072857bee8967f4d
    (tracing/events: Add 'signed' field to format files).

    This adds it back.

    Signed-off-by: Tom Zanussi
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     

24 Nov, 2009

15 commits

  • Paving the way for supporting variables in addition to function
    symbols.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • And also make xrealloc and xmalloc weak symbols so that we don't
    have this problem:

    /usr/lib/gcc/x86_64-redhat-linux/4.4.1/../../../../lib64/libiberty.a(xmalloc.o):
    In function `xrealloc':
    (.text+0xc0): multiple definition of `xrealloc'
    libperf.a(wrapper.o):/home/acme_unencrypted/git/linux-2.6-tip/tools/perf/util/wrapper.c:67:
    first defined here
    collect2: ld returned 1 exit status
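
    A minimal sketch of the weak-symbol approach, assuming GCC or
    Clang (the attribute is a compiler extension). The function bodies
    here are illustrative and are not copied from wrapper.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * A weak definition yields to a strong definition of the same name
 * (for example one pulled in from another library) instead of causing
 * a "multiple definition" link error.
 */
__attribute__((weak)) void *xmalloc(size_t size)
{
	void *ret;

	assert(size > 0);   /* sketch: zero-size requests are not handled */
	ret = malloc(size);
	if (!ret) {
		fprintf(stderr, "Out of memory, malloc failed\n");
		exit(128);
	}
	return ret;
}

__attribute__((weak)) void *xrealloc(void *ptr, size_t size)
{
	void *ret;

	assert(size > 0);   /* realloc(ptr, 0) is implementation-defined */
	ret = realloc(ptr, size);
	if (!ret) {
		fprintf(stderr, "Out of memory, realloc failed\n");
		exit(128);
	}
	return ret;
}
```

    When no other definition is linked in, the weak versions are used
    as ordinary functions.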

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • This way we type fewer characters and it looks more like the
    kzalloc kernel counterpart.
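
    The commit subject is not shown above, so the following is only a
    guess at the kind of helper meant: a zero-filling allocator, here
    called zalloc() by analogy with kzalloc. The struct and the
    constructor are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Zero-filled allocation of one object of the given size. */
static inline void *zalloc(size_t size)
{
	return calloc(1, size);
}

struct thread_sketch {      /* hypothetical struct, for illustration */
	int pid;
	char *comm;
};

static struct thread_sketch *thread_sketch_new(int pid)
{
	struct thread_sketch *t = zalloc(sizeof(*t));

	assert(pid >= 0);
	if (t)
		t->pid = pid;   /* every other field is already zeroed */
	return t;
}
```

    Compared with calloc(1, sizeof(*t)), the call site is shorter and
    reads like its kernel counterpart.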

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • And also express its configuration toggles via a struct.

    Now all one has to do is to call symbol__init(NULL) if the
    defaults are OK, or pass a struct symbol_conf pointer with the
    desired configuration.

    If a tool uses kernel_maps__find_symbol() to look at the kernel
    and modules mappings for a symbol but didn't call symbol__init()
    first, that will generate a one time warning too, alerting the
    subcommand developer that symbol__init() must be called.
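
    The init-with-defaults and warn-once behaviour can be sketched as
    below. The fields of the real struct symbol_conf are not listed in
    this message, so the ones here are invented, and every *_sketch
    name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toggles; the real struct's fields are not listed here. */
struct symbol_conf_sketch {
	const char *vmlinux_name;
	int use_modules;
};

static const struct symbol_conf_sketch default_conf = { NULL, 1 };
static struct symbol_conf_sketch active_conf;
static int symbols_initialized;
static int nr_warnings;   /* exposed so the one-time behaviour can be checked */

/* Pass NULL to accept the defaults. */
static int symbol_init_sketch(const struct symbol_conf_sketch *conf)
{
	assert(!symbols_initialized);   /* sketch: init is expected once */
	active_conf = conf ? *conf : default_conf;
	symbols_initialized = 1;
	return 0;
}

/* Returns a symbol name for addr, or NULL. Warns once if init was skipped. */
static const char *find_symbol_sketch(unsigned long addr)
{
	if (!symbols_initialized) {
		if (nr_warnings == 0) {
			fprintf(stderr,
				"BUG: symbol_init_sketch() was not called\n");
			nr_warnings++;
		}
		return NULL;
	}
	return addr == 0x1000 ? "demo_symbol" : NULL;   /* placeholder lookup */
}
```

    Calling the lookup twice before initialization prints a single
    warning; after symbol_init_sketch(NULL) the defaults are in effect
    and lookups work.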

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Ingo found it confusing, and I agree: for 'perf report' it's OK
    because the output is static, but for a tool that keeps
    refreshing it, the eventual switch from column to summary at the
    top may seem confusing.

    Suggested-by: Ingo Molnar
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Prevent bit-rot in perf-annotate by using common functions where
    possible. Here we create process_events.[ch] to hold the common
    functions.

    Signed-off-by: John Kacur
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: acme@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    John Kacur
     
  • Signed-off-by: John Kacur
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: acme@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    John Kacur
     
  • Merge reason: Looks mergeable - ready it for the merge window.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Add Documentation/perf-kmem.txt

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Show statistics for allocations and frees on different cpus:

    ------------------------------------------------------------------------------------------------------
    Callsite                   | Total_alloc/Per | Total_req/Per |  Hit | Ping-pong | Frag
    ------------------------------------------------------------------------------------------------------
    perf_event_alloc.clone.0+0 |        7504/682 |      7128/648 |   11 |         0 | 5.011%
    alloc_buffer_head+16       |          288/57 |        280/56 |    5 |         0 | 2.778%
    radix_tree_preload+51      |         296/296 |       288/288 |    1 |         0 | 2.703%
    tracepoint_add_probe+32e   |          157/31 |        154/30 |    5 |         0 | 1.911%
    do_maps_open+0             |          796/12 |        792/12 |   66 |         0 | 0.503%
    sock_alloc_send_pskb+16e   |       23780/495 |     23744/494 |   48 |        38 | 0.151%
    anon_vma_prepare+9a        |         3744/44 |       3740/44 |   85 |         0 | 0.107%
    d_alloc+21                 |       64948/164 |     64944/164 |  396 |         0 | 0.006%
    proc_alloc_inode+23        |      262292/676 |    262288/676 |  388 |         0 | 0.002%
    create_object+28           |      459600/200 |    459600/200 | 2298 |        71 | 0.000%
    journal_start+67           |        14440/40 |      14440/40 |  361 |         0 | 0.000%
    get_empty_filp+df          |       53504/256 |     53504/256 |  209 |         0 | 0.000%
    getname+2a                 |     823296/4096 |   823296/4096 |  201 |         0 | 0.000%
    seq_read+2b0               |     544768/4096 |   544768/4096 |  133 |         0 | 0.000%
    seq_open+6d                |       17024/128 |     17024/128 |  133 |         0 | 0.000%
    mmap_region+2e6            |        11704/88 |      11704/88 |  133 |         0 | 0.000%
    single_open+0              |         1072/16 |       1072/16 |   67 |         0 | 0.000%
    __alloc_skb+2e             |       12544/256 |     12544/256 |   49 |        38 | 0.000%
    __sigqueue_alloc+4a        |        1296/144 |      1296/144 |    9 |         8 | 0.000%
    tracepoint_add_probe+6f    |           80/16 |         80/16 |    5 |         0 | 0.000%
    ------------------------------------------------------------------------------------------------------
    ...
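
    The Frag and Per columns in the sample output are consistent with
    the arithmetic below. This is inferred from the figures shown, not
    taken from builtin-kmem.c, and frag_percent() and per_hit() are
    hypothetical helper names.

```c
#include <assert.h>

/* Share of allocated bytes that was never requested, as a percentage. */
static double frag_percent(unsigned long bytes_alloc, unsigned long bytes_req)
{
	assert(bytes_req <= bytes_alloc);
	if (bytes_alloc == 0)
		return 0.0;
	return 100.0 * (double)(bytes_alloc - bytes_req) / (double)bytes_alloc;
}

/* Average bytes per allocation, truncated like the "/Per" columns. */
static unsigned long per_hit(unsigned long total, unsigned long hit)
{
	return hit ? total / hit : 0;
}
```

    For the first row, 7504 bytes allocated against 7128 requested
    over 11 hits gives about 5.011% fragmentation and per-hit averages
    of 682 and 648 bytes.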

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Show cross node memory allocations:

    # ./perf kmem

    SUMMARY
    =======
    ...
    Cross node allocations: 0/3633

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Make the output sort by fragmentation by default.

    Also make the usage of "--sort" option consistent with other
    perf tools. That is, we support multi keys: "--sort
    key1[,key2]...".

    # ./perf kmem --stat caller
    ------------------------------------------------------------------------------
    Callsite                   | Total_alloc/Per | Total_req/Per | Hit | Frag
    ------------------------------------------------------------------------------
    __netdev_alloc_skb+23      |       5048/1682 |     4564/1521 |   3 | 9.588%
    perf_event_alloc.clone.0+0 |        7504/682 |      7128/648 |  11 | 5.011%
    tracepoint_add_probe+32e   |          157/31 |        154/30 |   5 | 1.911%
    alloc_buffer_head+16       |          456/57 |        448/56 |   8 | 1.754%
    radix_tree_preload+51      |         584/292 |       576/288 |   2 | 1.370%
    ...
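
    Multi-key sorting of the "--sort key1[,key2]..." kind can be
    sketched as a chain of comparators, where later keys only break
    ties left by earlier ones. The key names, struct alloc_stat and
    setup_sorting() below are illustrative and are not taken from
    builtin-kmem.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct alloc_stat {
	unsigned long bytes_alloc, bytes_req, hit;
};

typedef int (*sort_fn_t)(const struct alloc_stat *, const struct alloc_stat *);

/* Each comparator orders larger values first. */
static int cmp_hit(const struct alloc_stat *a, const struct alloc_stat *b)
{
	return (a->hit < b->hit) - (a->hit > b->hit);
}

static int cmp_bytes(const struct alloc_stat *a, const struct alloc_stat *b)
{
	return (a->bytes_alloc < b->bytes_alloc) -
	       (a->bytes_alloc > b->bytes_alloc);
}

/* Compare (alloc - req) / alloc by cross-multiplying, avoiding floats. */
static int cmp_frag(const struct alloc_stat *a, const struct alloc_stat *b)
{
	unsigned long long x =
		(unsigned long long)(a->bytes_alloc - a->bytes_req) * b->bytes_alloc;
	unsigned long long y =
		(unsigned long long)(b->bytes_alloc - b->bytes_req) * a->bytes_alloc;

	return (x < y) - (x > y);
}

static const struct { const char *name; sort_fn_t cmp; } known_keys[] = {
	{ "hit", cmp_hit }, { "bytes", cmp_bytes }, { "frag", cmp_frag },
};
#define NR_KNOWN (sizeof(known_keys) / sizeof(known_keys[0]))

#define MAX_KEYS 4
static sort_fn_t active_keys[MAX_KEYS];
static size_t nr_active;

/* Parse "key1[,key2]..." into the active comparator list. */
static int setup_sorting(const char *arg)
{
	assert(arg);
	nr_active = 0;
	while (*arg) {
		size_t len = strcspn(arg, ","), i;

		for (i = 0; i < NR_KNOWN; i++)
			if (strlen(known_keys[i].name) == len &&
			    !strncmp(known_keys[i].name, arg, len))
				break;
		if (i == NR_KNOWN || nr_active == MAX_KEYS)
			return -1;      /* unknown key or too many keys */
		active_keys[nr_active++] = known_keys[i].cmp;
		arg += len;
		if (*arg == ',')
			arg++;
	}
	return nr_active ? 0 : -1;
}

/* qsort() callback: the first key that differs decides the order. */
static int multi_cmp(const void *a, const void *b)
{
	size_t i;

	for (i = 0; i < nr_active; i++) {
		int r = active_keys[i](a, b);

		if (r)
			return r;
	}
	return 0;
}
```

    After setup_sorting("frag,hit"), calling qsort() with multi_cmp
    orders entries by fragmentation and uses the hit count only to
    break ties.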

    TODO:
    - Extract duplicate code in builtin-kmem.c and builtin-sched.c
    into util/sort.c.

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Add option "--raw-ip" to show raw ip instead of symbols:

    # ./perf kmem --stat caller --raw-ip
    ------------------------------------------------------------------------------
    Callsite   | Total_alloc/Per | Total_req/Per | Hit | Frag
    ------------------------------------------------------------------------------
    0xc05301aa |     733184/4096 |   733184/4096 | 179 | 0.000%
    0xc0542ba0 |     483328/4096 |   483328/4096 | 118 | 0.000%
    ...

    Also show symbols with format sym+offset instead of sym/offset.

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     
  • Currently, perf fails to compile on powerpc with this error:

    CC util/header.o
    In file included from util/../perf.h:17,
    from util/header.c:9:
    util/../../../arch/powerpc/include/asm/unistd.h:360:27: error:
    linux/linkage.h: No such file or directory
    make: *** [util/header.o] Error 1

    The reason is that we still have a #define __KERNEL__ in effect
    at the point where <asm/unistd.h> gets included, which means we
    get extra stuff that we don't need or want.

    This fixes the problem by undefining __KERNEL__ once we have
    included the file for which we need __KERNEL__ defined.

    Signed-off-by: Paul Mackerras
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Mackerras
     
  • E.g.:

    [root@doppio linux-2.6-tip]# perf kmem record sleep 3s
    [ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.804 MB perf.data (~35105 samples) ]

    [root@doppio linux-2.6-tip]# perf kmem --stat caller | head -10
    ------------------------------------------------------------------------------
    Callsite              | Total_alloc/Per | Total_req/Per | Hit | Frag
    ------------------------------------------------------------------------------
    getname/40            |    1519616/4096 |  1519616/4096 | 371 | 0.000%
    seq_read/a2           |     987136/4096 |   987136/4096 | 241 | 0.000%
    __netdev_alloc_skb/43 |     260368/1049 |   259968/1048 | 248 | 0.154%
    __alloc_skb/5a        |       77312/256 |     77312/256 | 302 | 0.000%
    proc_alloc_inode/33   |       76480/632 |     76472/632 | 121 | 0.010%
    get_empty_filp/8d     |       70272/192 |     70272/192 | 366 | 0.000%
    split_vma/8e          |       42064/176 |     42064/176 | 239 | 0.000%
    [root@doppio linux-2.6-tip]#

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Frédéric Weisbecker
    Cc: linux-mm@kvack.org
    Cc: Li Zefan
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo