01 Mar, 2010

1 commit

  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits)
    perf_event, amd: Fix spinlock initialization
    perf_event: Fix preempt warning in perf_clock()
    perf tools: Flush maps on COMM events
    perf_events, x86: Split PMU definitions into separate files
    perf annotate: Handle samples not at objdump output addr boundaries
    perf_events, x86: Remove superflous MSR writes
    perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()
    perf_events, x86: AMD event scheduling
    perf_events: Add new start/stop PMU callbacks
    perf_events: Report the MMAP pgoff value in bytes
    perf annotate: Defer allocating sym_priv->hist array
    perf symbols: Improve debugging information about symtab origins
    perf top: Use a macro instead of a constant variable
    perf symbols: Check the right return variable
    perf/scripts: Tag syscall_name helper as not yet available
    perf/scripts: Add perf-trace-python Documentation
    perf/scripts: Remove unnecessary PyTuple resizes
    perf/scripts: Add syscall tracing scripts
    perf/scripts: Add Python scripting engine
    perf/scripts: Remove check-perf-trace from listed scripts
    ...

    Fix trivial conflict in tools/perf/util/probe-event.c

    Linus Torvalds
     

26 Feb, 2010

3 commits

  • Even though we don't register the counters until the child is right about
    to exec(), we're still going to get at least a few events while the
    fork()'d child is still executing 'perf' and in particular we're going to
    get the MMAP events.

    We can't distinguish the ones in the newly executed process because the
    PID will be the same.

    One way to solve this would be to have a PERF_RECORD_EXEC event, and when
    this is seen 'perf' can flush its map cache. We can't use
    PERF_RECORD_COMM since that's generated by other things, not just exec().

    Actually, thinking about it some more, using PERF_RECORD_COMM might be a
    good enough approximation.
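
A minimal sketch of the idea, assuming a simplified per-thread map cache. The toy_thread and toy_map types and both helpers are hypothetical stand-ins, not perf's real data structures: a COMM event is treated as an approximation of exec(), so the maps collected so far are dropped.

```c
#include <stdio.h>

#define MAX_MAPS 16

/* Hypothetical, simplified stand-ins for perf's thread and map bookkeeping. */
struct toy_map {
    unsigned long long start, end;
};

struct toy_thread {
    int pid;
    char comm[16];
    struct toy_map maps[MAX_MAPS];
    int nr_maps;
};

/* Record an MMAP event for the thread; returns 0 on success, -1 if full. */
static int toy_thread__insert_map(struct toy_thread *t,
                                  unsigned long long start,
                                  unsigned long long end)
{
    if (t->nr_maps == MAX_MAPS)
        return -1;
    t->maps[t->nr_maps].start = start;
    t->maps[t->nr_maps].end = end;
    t->nr_maps++;
    return 0;
}

/*
 * Handle a COMM event: treat it as an approximation of exec(), so every
 * map seen so far (which may describe the pre-exec image) is dropped.
 */
static void toy_thread__set_comm(struct toy_thread *t, const char *comm)
{
    snprintf(t->comm, sizeof(t->comm), "%s", comm);
    t->nr_maps = 0;
}
```

The trade-off named in the message still applies: COMM is generated by other things than exec(), so this sketch can flush maps that were still valid.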

    Signed-off-by: David S. Miller
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    David S. Miller
     
  • Without this patch we get this for need_resched:

    [root@mica ~]# perf annotate need_resched

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    :
    :
    : Disassembly of section .text:
    :
    : ffffffff810095ed :
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095ed: 55 push %rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    0.00 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi
    :
    : static inline struct thread_info *current_thread_info(void)
    : {
    : struct thread_info *ti;
    : ti = (void *)(percpu_read_stable(kernel_stack) +
    0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi
    0.00 : ffffffff810095fa: 00 00
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    0.00 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi
    0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8
    : }
    0.00 : ffffffff8100960b: c9 leaveq
    0.00 : ffffffff8100960c: 85 c0 test %eax,%eax
    0.00 : ffffffff8100960e: 0f 95 c0 setne %al
    0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax
    : Disassembly of section .vsyscall_0:
    : Disassembly of section .vsyscall_fn:
    : Disassembly of section .vsyscall_1:
    : Disassembly of section .vsyscall_2:
    : Disassembly of section .init.text:
    : Disassembly of section .altinstr_replacement:
    : Disassembly of section .exit.text:
    [root@mica ~]#

    But from the 'perf report' result we know that there are hits
    for need_resched on a 4 way machine mostly doing nothing, so
    after adding code to show what is in each hist offset and
    collapsing IP hits for what happens between objdump lines we
    get, for the same perf.data file:

    [root@mica ~]# perf annotate -v need_resched

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    :
    :
    : Disassembly of section .text:
    :
    : ffffffff810095ed :
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095ed: 55 push %rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    52.78 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi
    :
    : static inline struct thread_info *current_thread_info(void)
    : {
    : struct thread_info *ti;
    : ti = (void *)(percpu_read_stable(kernel_stack) +
    0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi
    0.00 : ffffffff810095fa: 00 00
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    9.72 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi
    0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8
    : }
    0.00 : ffffffff8100960b: c9 leaveq
    0.00 : ffffffff8100960c: 85 c0 test %eax,%eax
    37.50 : ffffffff8100960e: 0f 95 c0 setne %al
    0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax
    : Disassembly of section .vsyscall_0:
    : Disassembly of section .vsyscall_fn:
    : Disassembly of section .vsyscall_1:
    : Disassembly of section .vsyscall_2:
    : Disassembly of section .init.text:
    : Disassembly of section .altinstr_replacement:
    : Disassembly of section .exit.text:
    [root@mica ~]#

    And now 'perf annotate -v', verbose mode, will show the hits per
    precise IP, so that one can make sense of the attribution to
    each objdump line:

    [root@mica ~]# perf annotate -v need_resched
    Looking at the vmlinux_path (5 entries long)
    Using /lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux for symbols
    annotate_sym: filename=/lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux, sym=need_resched, start=0xffffffff810095ed, end=0xffffffff81009614

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    ffffffff810095f1: 152
    ffffffff81009603: 28
    ffffffff8100960f: 55
    ffffffff81009610: 53
    h->sum: 288
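
The per-line percentages follow from summing the precise-IP hits that fall between one objdump line address and the next. A minimal sketch of that collapsing step, assuming a histogram indexed by offset from the symbol start (the function and parameter names are illustrative, not the ones used in perf):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the samples whose IP falls inside [line_addr, next_line_addr).
 * hist[i] holds the hit count for address sym_start + i, mirroring a
 * per-offset histogram. Addresses outside the histogram are ignored.
 */
static uint64_t hits_for_line(const uint64_t *hist, size_t hist_len,
                              uint64_t sym_start, uint64_t line_addr,
                              uint64_t next_line_addr)
{
    uint64_t sum = 0;
    uint64_t addr;

    for (addr = line_addr; addr < next_line_addr; addr++) {
        size_t off = (size_t)(addr - sym_start);

        if (off < hist_len)
            sum += hist[off];
    }
    return sum;
}
```

With the verbose output above, the hits at ffffffff8100960f (55) and ffffffff81009610 (53) both land on the line at ffffffff8100960e, giving 108 of 288 samples, or 37.50%.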

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: David Miller
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Because symbol->end is not fixed up at symbol_filter time, only
    after all symbols for a DSO are loaded, and that, for asm
    symbols, may be bogus, causing segfaults when hits happen in
    these symbols.
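
To illustrate why the end address cannot be trusted early, here is a hedged sketch of an end fixup pass over symbols sorted by start address. The toy_symbol type and both helpers are hypothetical; the real fixup in perf may differ in detail (for example in whether the end is inclusive).

```c
#include <stddef.h>
#include <stdint.h>

struct toy_symbol {
    uint64_t start;
    uint64_t end;
    const char *name;
};

/*
 * Given symbols sorted by start address, give every symbol without a
 * usable size an end just before the start of the next symbol. The last
 * symbol keeps whatever end it already had.
 */
static void toy_symbols__fixup_end(struct toy_symbol *syms, size_t nr)
{
    size_t i;

    for (i = 0; i + 1 < nr; i++) {
        if (syms[i].end <= syms[i].start &&
            syms[i + 1].start > syms[i].start)
            syms[i].end = syms[i + 1].start - 1;
    }
}

/* Code that runs before the fixup must not size anything from sym->end. */
static int toy_symbol__end_is_reliable(const struct toy_symbol *sym)
{
    return sym->end > sym->start;
}
```

An asm symbol with no size information has end equal to start until the pass runs, so anything allocated from end - start at filter time would be too small.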

    Reported-by: David Miller
    Reported-by: Anton Blanchard
    Acked-by: David Miller
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: # for .33.x. Does not apply cleanly, needs backport.
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

25 Feb, 2010

8 commits

  • Be more clear about DSO long names and tell from which file
    kernel symbols were obtained, all in --verbose mode:

    [root@mica ~]# perf report -v > /dev/null
    Looking at the vmlinux_path (5 entries long)
    Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols
    [root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd
    [root@mica ~]# perf report -v > /dev/null
    Looking at the vmlinux_path (5 entries long)
    Using /proc/kallsyms for symbols
    [root@mica ~]#

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • To overcome a silly gcc warning:

    cc1: warnings being treated as errors
    builtin-top.c: In function ‘lookup_sym_source’:
    builtin-top.c:291: warning: not protecting local variables: variable length buffer
    make: *** [builtin-top.o] Error 1
    make: *** Waiting for unfinished jobs....

    That is emitted for this:

    const size_t pattern_len = BITS_PER_LONG / 4 + 2;
    char pattern[pattern_len + 1];
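
A sketch of the macro form, under stated assumptions: BITS_PER_LONG is derived here from sizeof(long) so the example builds on its own, and format_addr_pattern() with its "0x" plus zero-padded hex layout is a hypothetical helper, not the code in builtin-top.c. Because the array size is now an integer constant expression, the compiler no longer sees a variable length buffer.

```c
#include <stdio.h>
#include <string.h>

/* Assumption for this sketch: derive BITS_PER_LONG from the compiler. */
#ifndef BITS_PER_LONG
#define BITS_PER_LONG ((int)(8 * sizeof(long)))
#endif

/* Room for "0x" plus one hex digit per 4 bits of a long. */
#define PATTERN_LEN (BITS_PER_LONG / 4 + 2)

/*
 * Format addr as "0x" followed by zero-padded hex into out.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int format_addr_pattern(unsigned long addr, char *out, size_t out_len)
{
    char pattern[PATTERN_LEN + 1];  /* fixed-size array, not a VLA */
    int len;

    len = snprintf(pattern, sizeof(pattern), "0x%0*lx",
                   BITS_PER_LONG / 4, addr);
    if (len < 0 || (size_t)len >= out_len)
        return -1;
    memcpy(out, pattern, (size_t)len + 1);
    return len;
}
```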

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    [ -v2: macroify the naming style ]
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • In function dso__split_kallsyms(), curr_map saves the return value
    of map__new2. So check it instead of var map after the call returns.

    Signed-off-by: Zhang Yanmin
    Acked-by: David S. Miller
    Cc: # for .33.x
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Zhang, Yanmin
     
  • syscall_name() helper, which resolves a syscall arch number to
    its name, is not yet available as we first need to implement
    event injection for it to work.

    Remove it from the documentation or tag its references as
    unavailable yet. Once it's implemented, we can just revert
    the current patch.

    Signed-off-by: Frederic Weisbecker
    Cc: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo

    Frederic Weisbecker
     
  • Also small update to perf-trace-perl and perf-trace docs.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • If we know the size of a tuple in advance, there's no need to resize
    it - start out with the known size in the first place.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • Adds a set of scripts that aggregate system call totals and system
    call errors. Most are Python scripts that also test basic
    functionality of the new Python engine, but there's also one Perl
    script added for comparison and for reference in some new
    Documentation contained in a later patch.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • Add base support for Python scripting to perf trace.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     

24 Feb, 2010

5 commits

  • The check-perf-trace script only checks Perl functionality, and
    doesn't really need to be listed as a user script anyway.

    This only removes the '-report' shell script, so although it doesn't
    appear in the listing, the '-record' shell script and the
    check-perf-trace Perl script itself are still available and can still
    be run manually as such:

    $ libexec/perf-core/scripts/perl/bin/check-perf-trace-record
    $ perf trace -s libexec/perf-core/scripts/perl/check-perf-trace.pl

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • Create a scripting-engines directory to contain scripting engine
    implementation code, in anticipation of the addition of new scripting
    support. Also removes trace-event-perl.h.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • This stuff is needed by all scripting engines; move it from the Perl
    engine source to a more common place.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • Fix bogus calculation.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     
  • 'perf trace -s list' prints a list of the supported scripting
    languages. One problem with it is that it falls through and prints
    the trace as well. The use of 'list' for this also makes it easy to
    confuse with 'perf trace -l', used for listing available scripts. So
    change 'perf trace -s list' to 'perf trace -s lang' and fix the
    fall-through problem.

    Signed-off-by: Tom Zanussi
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Cc: Keiichi KII
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Tom Zanussi
     

22 Feb, 2010

3 commits

  • Clear struct probe_point before using it in
    show_perf_probe_events(), and set pp->found counter correctly in
    synthesize_perf_probe_point(). Without this initialization,
    clear_probe_point() will free random addresses.

    Signed-off-by: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: systemtap
    Cc: DLE
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • As the parent comm then is worthless, confusing users about the
    thread where the sample really happened and leading them to think
    that the sample happened in the parent, not where it really
    happened: in the children of a thread for which a PERF_RECORD_COMM
    event was not received.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • In 2161db9 we stopped failing when not finding modules when
    asked to, but then the kernel map (just one, for vmlinux)
    did not have its ->end field correctly set up, so symbols were
    not being found for the vmlinux map because its range was 0-0.

    Reported-by: Ingo Molnar
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

14 Feb, 2010

1 commit

  • Print this:

    Mapped keys:
    [d] display refresh delay. (2)
    [e] display entries (lines). (46)
    [f] profile display filter (count). (5)
    [F] annotate display filter (percent). (5%)
    [s] annotate symbol. (NULL)
    [S] stop annotation.
    [K] hide kernel_symbols symbols. (no)
    [U] hide user symbols. (no)
    [z] toggle sample zeroing. (0)
    [qQ] quit.

    instead of:

    Mapped keys:
    [d] display refresh delay. (2)
    [e] display entries (lines). (46)
    [f] profile display filter (count). (5)
    [F] annotate display filter (percent). (5%)
    [s] annotate symbol. (NULL)
    [S] stop annotation.
    [K] hide kernel_symbols symbols. (no)
    [U] hide user symbols. (no)
    [z] toggle sample zeroing. (0)
    [qQ] quit.

    Signed-off-by: Kirill Smelkov
    Acked-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kirill Smelkov
     

09 Feb, 2010

1 commit

  • cpumode bits are defined as such:

    #define PERF_RECORD_MISC_KERNEL (1 << 0)
    #define PERF_RECORD_MISC_USER (2 << 0)
    #define PERF_RECORD_MISC_HYPERVISOR (3 << 0)

    We need to compare against the complete value of cpumode,
    otherwise hypervisor samples get incorrectly attributed as
    userspace.
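
A small sketch of the difference, reusing the three definitions above. PERF_RECORD_MISC_CPUMODE_MASK is assumed here to cover the low two bits, and the two helper functions are illustrative only: a bitwise test for the USER value (2) also fires for HYPERVISOR (3), while comparing the masked value does not.

```c
#include <stdint.h>

/* Assumption for this sketch: the cpumode field is the low two bits. */
#define PERF_RECORD_MISC_CPUMODE_MASK (3 << 0)
#define PERF_RECORD_MISC_KERNEL       (1 << 0)
#define PERF_RECORD_MISC_USER         (2 << 0)
#define PERF_RECORD_MISC_HYPERVISOR   (3 << 0)

/* Buggy: HYPERVISOR (3) has the USER bit (2) set, so this matches it too. */
static int is_user_bitwise(uint16_t misc)
{
    return (misc & PERF_RECORD_MISC_USER) != 0;
}

/* Correct: compare the complete cpumode value. */
static int is_user_exact(uint16_t misc)
{
    return (misc & PERF_RECORD_MISC_CPUMODE_MASK) == PERF_RECORD_MISC_USER;
}
```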

    Signed-off-by: Anton Blanchard
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: fweisbec@gmail.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Anton Blanchard
     

08 Feb, 2010

3 commits

  • When using 'perf record -g' on an existing process, 'perf report'
    still cannot resolve symbols, even with debuginfo packages installed.

    try:

    perf record -g -p `pidof xxx` -f
    perf report

    68.26% :1181 b74870f2 [.] 0x000000b74870f2
    |
    |--32.09%-- 0xb73b5b44
    | 0xb7487102
    | 0xb748a4e2
    | 0xb748633d
    | 0xb73b41cd
    | 0xb73b4467
    | 0xb747d531

    The reason is: for an existing process, in __cmd_record(),
    the pid is 0 rather than the existing process id.

    Signed-off-by: Austin Zhang
    Acked-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    austin_zhang@linux.intel.com
     
  • Because we may have aliases, like __GI___strcoll_l in
    /lib64/libc-2.10.2.so that appears in objdump as:

    $ objdump --start-address=0x0000003715a86420 \
    --stop-address=0x0000003715a872dc -dS /lib64/libc-2.10.2.so

    0000003715a86420 :
    3715a86420: 55 push %rbp
    3715a86421: 48 89 e5 mov %rsp,%rbp
    3715a86424: 41 57 push %r15
    [root@doppio linux-2.6-tip]#

    So look for the address exactly at the start of the line instead
    so that annotation can work in these cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Kirill Smelkov
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • First, for programs and prelinked libraries, annotate code was
    fooled by objdump output IPs (src->eip in the code) being
    wrongly converted to absolute IPs. In such cases no conversion was
    needed, but in

    src->eip = strtoull(src->line, NULL, 16);
    src->eip = map->unmap_ip(map, src->eip); // = eip + map->start - map->pgoff

    we were reading an absolute address from objdump (e.g. 8048604) and
    then almost doubling it, because eip and map->start are close to
    each other for small programs.

    Needless to say, later, in record_precise_ip(), nothing matched the
    real runtime IPs.
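
The "almost doubling" can be checked with a few lines of arithmetic. This is a sketch only: toy_map is a hypothetical two-field stand-in for the real map, and the mapping address 0x8048000 is an assumed typical load address for a small ET_EXEC program.

```c
#include <stdint.h>

/* Hypothetical stand-in holding only the two fields the formula uses. */
struct toy_map {
    uint64_t start;  /* address where the object is mapped */
    uint64_t pgoff;  /* file offset of the mapping */
};

/* Absolute runtime IP -> address relative to the mapped object. */
static uint64_t toy_map__map_ip(const struct toy_map *map, uint64_t ip)
{
    return ip - map->start + map->pgoff;
}

/* Relative address -> absolute runtime IP (the formula quoted above). */
static uint64_t toy_map__unmap_ip(const struct toy_map *map, uint64_t rip)
{
    return rip + map->start - map->pgoff;
}
```

With start = 0x8048000 and pgoff = 0, feeding the already absolute objdump address 0x8048604 to toy_map__unmap_ip() yields 0x10090604, which no runtime sample will ever match.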

    And second, as with `perf annotate`, the problem with
    non-prelinked *.so was that we were doing the rip -> objdump address
    conversion wrong.

    Also, because unlike `perf annotate`, `perf top` code does
    annotation based on absolute IPs for performance reasons(*), a new
    helper for mapping objdump addresses to IPs is introduced.

    (*) we get samples info in absolute IPs, and since we do lots of
    hit-testing on absolute IPs at runtime in record_precise_ip(), it's
    better to convert objdump addresses to IPs once and do no conversion
    at runtime.

    I also had to fix how objdump output is parsed (with a hardcoded
    8/16 character format, which was inappropriate for ET_DYN dsos
    with small addresses like '4ac').

    Also note that not all objdump output lines have associated
    IPs, e.g. look at the source lines here:

    000004ac :
    extern "C"
    int my_strlen(const char *s)
    4ac: 55 push %ebp
    4ad: 89 e5 mov %esp,%ebp
    4af: 83 ec 10 sub $0x10,%esp
    {
    int len = 0;
    4b2: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
    4b9: eb 08 jmp 4c3

    while (*s) {
    ++len;
    4bb: 83 45 fc 01 addl $0x1,-0x4(%ebp)
    ++s;
    4bf: 83 45 08 01 addl $0x1,0x8(%ebp)

    So we mark them with eip=0, and ignore such lines in annotate
    lookup code.
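
A hedged sketch of width-independent parsing along these lines. objdump_line_addr() is a hypothetical helper, not the function in the patch: it accepts any number of hex digits followed directly by ':' and returns 0 for lines that carry no address. A source line that happens to look like "abc:" would be misread, so a real parser needs more context than this.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Return the address at the start of an objdump line such as
 * "     4ac:  55  push %ebp", or 0 when the line carries no address
 * (source lines, blank lines, "000004ac <sym>:" headers).
 */
static uint64_t objdump_line_addr(const char *line)
{
    char *end;
    uint64_t addr;

    while (isspace((unsigned char)*line))
        line++;
    if (!isxdigit((unsigned char)*line))
        return 0;

    addr = strtoull(line, &end, 16);
    /* An instruction line has the ':' right after the hex digits. */
    if (end == line || *end != ':')
        return 0;
    return addr;
}
```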

    Signed-off-by: Kirill Smelkov
    [ Note: one hunk of this patch was applied by Mike in 57d8188 ]
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kirill Smelkov
     

04 Feb, 2010

12 commits

  • perf top and perf record refuse to initialize on non-modular kernels:

    $ perf top -v
    map_groups__set_modules_path_dir: cannot open /lib/modules/2.6.33-rc6-tip-00586-g398dde3-dirty/

    Cc: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Setting _FILE_OFFSET_BITS and using O_LARGEFILE, lseek64, etc,
    is redundant. Thanks H. Peter Anvin for pointing it out.

    So, this patch removes O_LARGEFILE, lseek64, etc.
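
A minimal sketch of why the explicit large-file calls are redundant, assuming a glibc-style toolchain that honours _FILE_OFFSET_BITS. The two helper functions are illustrative, not perf code.

```c
/* Must come before the first system header to take effect. */
#define _FILE_OFFSET_BITS 64

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* With _FILE_OFFSET_BITS=64, off_t is 64 bits wide even on 32-bit targets. */
static int off_t_is_64bit(void)
{
    return sizeof(off_t) == 8;
}

/*
 * Plain open() and lseek() are enough: no O_LARGEFILE or lseek64().
 * Returns the file size in bytes, or -1 on error.
 */
static off_t file_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    off_t size;

    if (fd < 0)
        return -1;
    size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```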

    Suggested-by: "H. Peter Anvin"
    Signed-off-by: Xiao Guangrong
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Xiao Guangrong
     
  • Signed-off-by: Mike Galbraith
    Cc: Kirill Smelkov
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • By relying on logic in dso__load_kernel_sym(), we can
    automatically load vmlinux.

    The only thing which needs to be adjusted is how the --sym-annotate
    option is handled: now we can't rely on vmlinux being loaded
    until a full successful pass of dso__load_vmlinux(), but that's
    not the case if we do the sym_filter_entry setup in
    symbol_filter().

    So move this step right after event__process_sample() where we
    know the whole dso__load_kernel_sym() pass is done.

    By the way, though conceptually similar, `perf top` still can't
    annotate userspace - see next patches with fixes.

    Signed-off-by: Kirill Smelkov
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kirill Smelkov
     
  • The problem was we were incorrectly calculating objdump
    addresses for sym->start and sym->end, look:

    For simple ET_DYN type DSO (*.so) with one function, objdump -dS
    output is something like this:

    000004ac :
    int my_strlen(const char *s)
    4ac: 55 push %ebp
    4ad: 89 e5 mov %esp,%ebp
    4af: 83 ec 10 sub $0x10,%esp
    {

    i.e. we have relative-to-dso-mapping IPs (=RIP) there.

    For ET_EXEC type and probably for prelinked libs as well (sorry
    can't test - I don't use prelink) objdump outputs absolute IPs,
    e.g.

    08048604 :
    extern "C"
    int zz_strlen(const char *s)
    8048604: 55 push %ebp
    8048605: 89 e5 mov %esp,%ebp
    8048607: 83 ec 10 sub $0x10,%esp
    {

    So, if sym->start is always relative to dso mapping(*), we'll
    have to unmap it for ET_EXEC like cases, and leave as is for
    ET_DYN cases.

    (*) and it is - we've explicitly made it relative. Look for
    adjust_symbols handling in dso__load_sym()

    Previously we were always unmapping sym->start and for ET_DYN
    dsos resulting addresses were wrong, and so objdump output was
    empty.

    The end result was that perf annotate output for symbols from
    non-prelinked *.so always showed 0.00% on every line, which is
    wrong.

    To fix it, let's introduce a helper for converting rip to
    objdump address, and also let's document what map_ip() and
    unmap_ip() do -- I had to study sources for several hours to
    understand it.
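
A sketch of what such a helper could look like, under stated assumptions: toy_dso, toy_map and toy_map__rip_2objdump() are hypothetical names, and the adjust_symbols flag stands for "symbol addresses were made relative at load time", as in the ET_EXEC-like case described above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the dso and map structures. */
struct toy_dso {
    int adjust_symbols;  /* nonzero for ET_EXEC-like objects */
};

struct toy_map {
    uint64_t start;      /* address where the object is mapped */
    uint64_t pgoff;      /* file offset of the mapping */
    const struct toy_dso *dso;
};

/*
 * Convert a relative IP (rip) to the address objdump prints:
 * - ET_EXEC-like objects: objdump shows absolute addresses, so unmap;
 * - ET_DYN objects: objdump already shows relative addresses.
 */
static uint64_t toy_map__rip_2objdump(const struct toy_map *map, uint64_t rip)
{
    if (map->dso->adjust_symbols)
        return rip + map->start - map->pgoff;
    return rip;
}
```

For the two listings above, a relative IP of 0x604 in an executable mapped at 0x8048000 becomes 8048604, while 0x4ac in a shared object stays 4ac wherever the library is mapped.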

    Signed-off-by: Kirill Smelkov
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Kirill Smelkov
     
  • So as not to pollute 'perf annotate' debugging sessions too much.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • We want to stream events as fast as possible to perf.data, and
    also in the future we want to have splice working, when no
    interception will be possible.

    Using build_id__mark_dso_hit_ops to create the list of DSOs that
    back MMAPs we also optimize disk usage in the build-id cache by
    only caching DSOs that had hits.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Xiao Guangrong
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Because 'perf record' will have to find the build-ids after
    we stop recording, so as to reduce even more the impact in the
    workload while we do the measurement.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • With the recent modifications done to untie the session and
    symbol layers, 'perf probe' now can use just the symbols layer.

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Masami Hiramatsu
    Cc: Frédéric Weisbecker
    Cc: Masami Hiramatsu
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • We can check using strcmp, most DSOs don't start with '[' so the
    test is cheap enough and we had to test it there anyway since
    when reading perf.data files we weren't calling the routine that
    created this global variable and thus weren't setting it as
    "loaded", which was causing a bogus:

    Failed to open [vdso], continuing without symbols

    Message as the first line of 'perf report'.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • While debugging a problem reported by Pekka Enberg by printing
    the IP and all the maps for a thread when we don't find a map
    for an IP I noticed that dso__load_sym needs to fixup these
    extra maps it creates to hold symbols in different ELF sections
    than the main kernel one.

    Now we're back showing things like:

    [root@doppio linux-2.6-tip]# perf report | grep vsyscall
    0.02% mutt [kernel.kallsyms].vsyscall_fn [.] vread_hpet
    0.01% named [kernel.kallsyms].vsyscall_fn [.] vread_hpet
    0.01% NetworkManager [kernel.kallsyms].vsyscall_fn [.] vread_hpet
    0.01% gconfd-2 [kernel.kallsyms].vsyscall_0 [.] vgettimeofday
    0.01% hald-addon-rfki [kernel.kallsyms].vsyscall_fn [.] vread_hpet
    0.00% dbus-daemon [kernel.kallsyms].vsyscall_fn [.] vread_hpet
    [root@doppio linux-2.6-tip]#

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • I noticed while writing the first test in 'perf regtest' that to
    just test the symbol handling routines one needs to create a
    perf session, that is a layer centered on a perf.data file,
    events, etc, so I untied these layers.

    This reduces the complexity for the users, as the number of
    parameters to most of the symbols and session APIs is now
    reduced, while not adding more state to all the map instances:
    only the data needed to split the kernel (kallsyms and ELF
    symtab sections) maps and to do vmlinux relocation is kept, on
    the main kernel map.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

03 Feb, 2010

1 commit

  • Open perf data file with O_LARGEFILE flag since its size is
    easily larger than 2G.

    For example:

    # rm -rf perf.data
    # ./perf kmem record sleep 300

    [ perf record: Woken up 0 times to write data ]
    [ perf record: Captured and wrote 3142.147 MB perf.data
    (~137282513 samples) ]

    # ll -h perf.data
    -rw------- 1 root root 3.1G .....

    Signed-off-by: Xiao Guangrong
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Xiao Guangrong
     

31 Jan, 2010

2 commits

  • Fix up a few small stylistic details:

    - use consistent vertical spacing/alignment
    - remove line80 artifacts
    - group some global variables better
    - remove dead code

    Plus rename 'prof' to 'report' to make it more in line with other
    tools, and remove the line/file keying as we really want to use
    IPs like the other tools do.

    Signed-off-by: Ingo Molnar
    Cc: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Adding new subcommand "perf lock" to perf.

    I have a lot of remaining ToDos, but for now perf lock can
    already provide minimal functionality for analyzing lock
    statistics.

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Hitoshi Mitake