19 Mar, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
    perf: Fix unexported generic perf_arch_fetch_caller_regs
    perf record: Don't try to find buildids in a zero sized file
    perf: export perf_trace_regs and perf_arch_fetch_caller_regs
    perf, x86: Fix hw_perf_enable() event assignment
    perf, ppc: Fix compile error due to new cpu notifiers
    perf: Make the install relative to DESTDIR if specified
    kprobes: Calculate the index correctly when freeing the out-of-line execution slot
    perf tools: Fix sparse CPU numbering related bugs
    perf_event: Fix oops triggered by cpu offline/online
    perf: Drop the obsolete profile naming for trace events
    perf: Take a hot regs snapshot for trace events
    perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
    perf/x86-64: Use frame pointer to walk on irq and process stacks
    lockdep: Move lock events under lockdep recursion protection
    perf report: Print the map table just after samples for which no map was found
    perf report: Add multiple event support
    perf session: Change perf_session post processing functions to take histogram tree
    perf session: Add storage for separating event types in report
    perf session: Change add_hist_entry to take the tree root instead of session
    perf record: Add ID and to recorded event data when recording multiple events
    ...

    Linus Torvalds
     

14 Mar, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Provide generic perf_sample_data initialization
    MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
    perf trace: Don't use pager if scripting
    perf trace/scripting: Remove extraneous header read
    perf, ARM: Modify kuser rmb() call to compile for Thumb-2
    x86/stacktrace: Don't dereference bad frame pointers
    perf archive: Don't try to collect files without a build-id
    perf_events, x86: Fixup fixed counter constraints
    perf, x86: Restrict the ANY flag
    perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
    perf, x86: add some IBS macros to perf_event.h
    perf, x86: make IBS macros available in perf_event.h
    hw-breakpoints: Remove stub unthrottle callback
    x86/hw-breakpoints: Remove the name field
    perf: Remove pointless breakpoint union
    perf lock: Drop the buffers multiplexing dependency
    perf lock: Fix and add misc documentally things
    percpu: Add __percpu sparse annotations to hw_breakpoint

    Linus Torvalds
     

12 Mar, 2010

1 commit

  • Fixing this symptom:

    [acme@mica linux-2.6-tip]$ perf record -a -f
    Fatal: Permission error - are you root?

    Bus error
    [acme@mica linux-2.6-tip]$

    I.e. if for some reason no data is collected (in this case, a
    non-root user trying to do systemwide profiling), we end up
    trying to mmap a zero sized file and access the file header,
    b00m.

    Reported-by: Ingo Molnar
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
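
The failure described in this commit (mapping an empty file and then touching its header) can be avoided with a size check before the mapping. The following is only an illustrative sketch in Python, not the perf code: `map_header` and `HEADER_SIZE` are hypothetical names, and the 8-byte header length is an arbitrary assumption.

```python
import mmap
import os

HEADER_SIZE = 8  # hypothetical header length, chosen only for this sketch


def map_header(path):
    """Return the first HEADER_SIZE bytes of path, or None if the file is too small."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        # An empty (or truncated) file has no header to read, so bail out
        # here instead of mapping it and faulting on the first access.
        if size < HEADER_SIZE:
            return None
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m[:HEADER_SIZE])
```

Returning early keeps the "no data was collected" case on an ordinary error path rather than ending in a bus error.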
     

11 Mar, 2010

2 commits

  • Without this change, the install path is relative to
    prefix/DESTDIR where prefix is automatically set to $HOME.

    This can produce unexpected results. For example:

    make -C tools/perf DESTDIR=/home/jkacur/tmp install-man

    creates the directory: /home/jkacur/home/jkacur/tmp/share/...
    instead of the expected: /home/jkacur/tmp/share/...

    Signed-off-by: John Kacur
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Tom Zanussi
    Cc: Kyle McMartin
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    John Kacur
     
  • At present, the perf subcommands that do system-wide monitoring
    (perf stat, perf record and perf top) don't work properly unless
    the online cpus are numbered 0, 1, ..., N-1. These tools ask
    for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
    and then try to create events for cpus 0, 1, ..., N-1.

    This creates problems for systems where the online cpus are
    numbered sparsely. For example, a POWER6 system in
    single-threaded mode (i.e. only running 1 hardware thread per
    core) will have only even-numbered cpus online.

    This fixes the problem by reading the /sys/devices/system/cpu/online
    file to find out which cpus are online. The code that does that is in
    tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
    function that sets up a cpumap[] array and returns the number of
    online cpus. If /sys/devices/system/cpu/online can't be read or
    can't be parsed successfully, it falls back to using sysconf to
    ask how many cpus are online and sets up an identity map in cpumap[].

    The perf record, perf stat and perf top code then calls
    read_cpu_map() in the system-wide monitoring case (instead of
    sysconf) and uses cpumap[] to get the cpu numbers to pass to
    perf_event_open.

    Signed-off-by: Paul Mackerras
    Cc: Anton Blanchard
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Mackerras
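
The approach in the commit above (read the list of online cpus from sysfs, fall back to an identity map) can be sketched as follows. This is a Python illustration under stated assumptions, not the code in tools/perf/util/cpumap.[ch]: `parse_cpu_list` is a hypothetical helper, `read_cpu_map` only borrows the name used in the commit message, and `os.cpu_count()` stands in for the sysconf fallback.

```python
import os


def parse_cpu_list(text):
    """Parse a cpu list such as "0,2-4,8" into a sorted list of cpu numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            lo, hi = int(first), int(last)
            if lo > hi:
                raise ValueError("bad cpu range: %r" % part)
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))  # raises ValueError on empty or garbage input
    return sorted(cpus)


def read_cpu_map(path="/sys/devices/system/cpu/online"):
    """Return the online cpu numbers, or an identity map if sysfs is unusable."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except (OSError, ValueError):
        # Fallback: assume cpus 0..N-1 are online (stand-in for sysconf).
        return list(range(os.cpu_count() or 1))
```

On a machine with only even-numbered cpus online, the sysfs file would contain something like `0,2,4,6`, and the returned list holds those numbers rather than 0..3.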
     

10 Mar, 2010

9 commits

  • If -vv is used, just the map table will be printed; -vvv will
    print the symbol table too. With it we can see that we have a
    bug where some samples are not being resolved to a map when we
    get them in the perf.data stream, but after it has all been
    processed we can find the right map, so some reordering is
    probably happening.

    Upcoming patches will provide ways to ask for most PERF_SAMPLE_
    conditional samples to be taken for !PERF_RECORD_SAMPLE events
    too, then we'll be able to ask for PERF_SAMPLE_TIME and
    PERF_SAMPLE_CPU to help diagnose this.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Perf report does not handle multiple events being reported, even
    though perf record stores them properly on disk. This patch
    addresses that issue by adding the logic to perf report to use
    the event stream id that is saved by record and the new data
    structures to separate the event streams and report them
    individually.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
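
The idea of separating samples by event stream id and keeping one histogram per event might look roughly like the sketch below. It is a Python illustration only; the `(stream_id, symbol)` sample shape and the function names are assumptions made for the example, not perf's data structures.

```python
from collections import Counter, defaultdict


def build_histograms(samples):
    """Group (stream_id, symbol) samples into one histogram per stream id."""
    hists = defaultdict(Counter)
    for stream_id, symbol in samples:
        hists[stream_id][symbol] += 1
    return hists


def report(hists):
    """Post-process each event's histogram on its own: sort entries by hit count."""
    return {stream_id: hist.most_common() for stream_id, hist in hists.items()}
```

Keeping the post-processing step per histogram is what lets each event be reported individually.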
     
  • Now that report can store histograms for multiple events we
    need to be able to do the post processing work for each
    histogram. This patch changes the post processing functions so
    that they can be called individually for each event's histogram.

    Signed-off-by: Eric B Munson
    [ Guarantee bisectability by fixing up builtin-report.c ]
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • This patch adds the structures necessary to count each event
    type independently in perf report.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • In order to minimize the impact of storing multiple events in a
    report this function will now take the root of the histogram
    tree so that the logic for selecting the proper tree can be
    inserted before the call.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • Currently perf record does not write the ID or the to disk for
    events. This doesn't allow report to tell if an event stream
    contains one or more types of events. This patch adds this
    entry to the list of data that record will write to disk if more
    than one event was requested.

    Signed-off-by: Eric B Munson
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eric B Munson
     
  • cc1: warnings being treated as errors
    util/probe-finder.c: In function 'find_line_range':
    util/probe-finder.c:172: warning: 'src' may be used uninitialized in this function
    make: *** [util/probe-finder.o] Error 1

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Masami Hiramatsu
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Arnaldo Carvalho de Melo
    Cc: David S. Miller
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Conflicts:
    tools/perf/util/probe-event.c

    Merge reason: Pick up -rc1 and resolve the conflict as well.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

08 Mar, 2010

1 commit


06 Mar, 2010

1 commit

  • …nel/git/tip/linux-2.6-tip

    * 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Issue at least one memory barrier in stop_machine_text_poke()
    perf probe: Correct probe syntax on command line help
    perf probe: Add lazy line matching support
    perf probe: Show more lines after last line
    perf probe: Check function address range strictly in line finder
    perf probe: Use libdw callback routines
    perf probe: Use elfutils-libdw for analyzing debuginfo
    perf probe: Rename probe finder functions
    perf probe: Fix bugs in line range finder
    perf probe: Update perf probe document
    perf probe: Do not show --line option without dwarf support
    kprobes: Add documents of jump optimization
    kprobes/x86: Support kprobes jump optimization on x86
    x86: Add text_poke_smp for SMP cross modifying code
    kprobes/x86: Cleanup save/restore registers
    kprobes/x86: Boost probes when reentering
    kprobes: Jump optimization sysctl interface
    kprobes: Introduce kprobes jump optimization
    kprobes: Introduce generic insn_slot framework
    kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE

    Linus Torvalds
     

04 Mar, 2010

4 commits

  • It's useful for paging through raw traces, but just gets in the
    way when scripting.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • perf_header__read() is already done in perf_session__open(), so
    remove it from the script gen case.

    Signed-off-by: Tom Zanussi
    Cc: fweisbec@gmail.com
    Cc: rostedt@goodmis.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • The Thumb-2 instruction set does not provide an encoding
    for sub pc, r0, #95 as present in the rmb() definition used
    by perf. This results in compilation failure when using a
    compiler targeting an instruction set other than ARM.

    This patch redefines rmb() for ARM by casting the address
    of the kuser helper to a function pointer, therefore getting
    the compiler to take care of making the call.

    Patch taken against tip/master.

    Signed-off-by: Will Deacon
    Cc: Russell King - ARM Linux
    Cc: Jamie Iles
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Will Deacon
     
  • Move @SRC right after FUNC in syntax according to syntax change
    on command line help.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     

03 Mar, 2010

1 commit

  • To avoid these errors:

    [root@doppio ~]# perf archive
    tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
    No such file or directory
    tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
    No such file or directory
    tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
    No such file or directory
    tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat:
    No such file or directory
    tar: Exiting with failure status due to previous errors
    [root@doppio ~]#

    More work is needed to support archiving symtabs for binaries
    without a build-id, perhaps creating a perf.data UUID + adding
    build-ids for the binaries copied into the cache and then have
    this perf.data session UUID be a directory with symlinks to the
    by now calculated build-id of the files inside it.

    Or just do an extra pass and insert the calculated build-ids in
    the perf.data header.

    Reported-by: Ingo Molnar
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
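
The filtering described in this commit (do not hand tar a path for a file that has no build-id) can be sketched as below. This is a hypothetical Python illustration, not the perf-archive script: `build_id_cache_paths` is an invented name, and the `.build-id/xx/rest` layout is taken from the tar errors quoted in the commit message.

```python
def build_id_cache_paths(build_ids):
    """Map hex build-ids to .build-id/xx/rest paths, skipping missing ones."""
    paths = []
    for bid in build_ids:
        # An empty or all-zero id means the binary had no build-id,
        # so there is no cache entry for tar to collect.
        if set(bid) <= {"0"}:
            continue
        paths.append(".build-id/%s/%s" % (bid[:2], bid[2:]))
    return paths
```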
     

01 Mar, 2010

1 commit

  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits)
    perf_event, amd: Fix spinlock initialization
    perf_event: Fix preempt warning in perf_clock()
    perf tools: Flush maps on COMM events
    perf_events, x86: Split PMU definitions into separate files
    perf annotate: Handle samples not at objdump output addr boundaries
    perf_events, x86: Remove superfluous MSR writes
    perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()
    perf_events, x86: AMD event scheduling
    perf_events: Add new start/stop PMU callbacks
    perf_events: Report the MMAP pgoff value in bytes
    perf annotate: Defer allocating sym_priv->hist array
    perf symbols: Improve debugging information about symtab origins
    perf top: Use a macro instead of a constant variable
    perf symbols: Check the right return variable
    perf/scripts: Tag syscall_name helper as not yet available
    perf/scripts: Add perf-trace-python Documentation
    perf/scripts: Remove unnecessary PyTuple resizes
    perf/scripts: Add syscall tracing scripts
    perf/scripts: Add Python scripting engine
    perf/scripts: Remove check-perf-trace from listed scripts
    ...

    Fix trivial conflict in tools/perf/util/probe-event.c

    Linus Torvalds
     

28 Feb, 2010

2 commits

  • We need to deal with time ordered events to build a correct
    state machine of lock events. This is why we multiplex the lock
    events buffers. But the ordering is done from the kernel, on
    the tracing fast path, leading to high contention between cpus.

    Without multiplexing, the events appear in a weak order.
    If we have four events, each split per cpu, perf record will
    read the events buffers in the following order:

    [ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....]

    To handle a post processing reordering, we could just read and sort
    the whole stream in memory, but that doesn't scale with large numbers
    of events: lock events can fill huge amounts of memory in a short time.

    Basically we need to sort in memory and find a "grace period"
    point when we know that a given slice of previously sorted events
    can be committed for post-processing, so that we can unload the
    memory usage step by step and keep a scalable sorting list.

    There are no strong rules about how to define such a "grace period".
    What this patch does is:

    We define a FLUSH_PERIOD value that defines a grace period in
    seconds.
    We want to have a slice of events covering 2 * FLUSH_PERIOD in our
    sorted list.
    If FLUSH_PERIOD is big enough, it ensures that every event that occurred
    in the first half of the timeslice has already been buffered, so
    no further events will need to be put inside this
    first timeslice. Then once we reach the 2 * FLUSH_PERIOD
    timeslice, we flush the first half to be gentle with the memory
    (the second half can still get new events in the middle, so wait
    another period to flush it)

    FLUSH_PERIOD is defined to 5 seconds. Say the first event started on
    time t0. We can safely assume that at the time we are processing
    events of t0 + 10 seconds, there won't be any more events to read
    from perf.data that occurred between t0 and t0 + 5 seconds. Hence
    we can safely flush the first half.

    To point out funky bugs, we have a guardian that checks that a new
    event's timestamp is not below the timestamp of the last flushed
    event, and that displays a warning in this case.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Hitoshi Mitake
    Cc: Li Zefan
    Cc: Lai Jiangshan
    Cc: Masami Hiramatsu
    Cc: Jens Axboe

    Frederic Weisbecker
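
The FLUSH_PERIOD scheme described in the commit above can be sketched as follows. This is a simplified Python illustration and not the perf lock implementation: the `EventSorter` class, its `deliver` callback and the warning counter are hypothetical names, and only the 5 second period comes from the commit message.

```python
import bisect

FLUSH_PERIOD = 5.0  # seconds, the value given in the commit message


class EventSorter:
    """Keep a sorted window of about 2 * FLUSH_PERIOD and flush its first half."""

    def __init__(self, deliver, flush_period=FLUSH_PERIOD):
        self.deliver = deliver        # called as deliver(timestamp, event)
        self.flush_period = flush_period
        self.pending = []             # sorted list of (timestamp, seq, event)
        self.seq = 0                  # tie-breaker so events are never compared
        self.slice_start = None       # start of the current timeslice
        self.last_flushed = None      # highest timestamp already delivered
        self.warnings = 0             # count of events that arrived too late

    def push(self, timestamp, event):
        # Guardian: an event older than what was already flushed is a bug.
        if self.last_flushed is not None and timestamp < self.last_flushed:
            self.warnings += 1
        bisect.insort(self.pending, (timestamp, self.seq, event))
        self.seq += 1
        if self.slice_start is None:
            self.slice_start = timestamp
        # Once the window spans 2 * FLUSH_PERIOD, commit its first half.
        if timestamp >= self.slice_start + 2 * self.flush_period:
            self._flush(self.slice_start + self.flush_period)
            self.slice_start += self.flush_period

    def _flush(self, limit):
        n = 0
        while n < len(self.pending) and (limit is None or self.pending[n][0] < limit):
            n += 1
        for ts, _, event in self.pending[:n]:
            self.deliver(ts, event)
            if self.last_flushed is None or ts > self.last_flushed:
                self.last_flushed = ts
        del self.pending[:n]

    def finish(self):
        """Deliver whatever is still buffered at end of input."""
        self._flush(None)
```

Only the first half of the window is ever committed, so memory use stays bounded while the second half can still absorb events that arrive out of order.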
     
  • I forgot to add the 'perf lock' line to command-list.txt,
    so users of perf could not find perf lock when they typed 'perf'.

    Fixing command-list.txt requires a document
    (tools/perf/Documentation/perf-lock.txt).
    But perf lock is too much "under construction" to write a
    stable document, so this is something like a pseudo document for now.

    I also added a description of perf lock to the help section of
    CONFIG_LOCK_STAT, to guide users of lock trace events.

    Signed-off-by: Hitoshi Mitake
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Hitoshi Mitake
     

27 Feb, 2010

1 commit


26 Feb, 2010

12 commits

  • Even though we don't register the counters until the child is right about
    to exec(), we're still going to get at least a few events while the
    fork()'d child is still executing 'perf' and in particular we're going to
    get the MMAP events.

    We can't distinguish the ones in the newly executed process because the
    PID will be the same.

    One way to solve this would be to have a PERF_RECORD_EXEC event, and when
    this is seen 'perf' can flush its map cache. We can't use
    PERF_RECORD_COMM since that's generated by other things, not just exec().

    Actually, thinking about it some more, using PERF_RECORD_COMM might be a
    good enough approximation.

    Signed-off-by: David S. Miller
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    David S. Miller
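
The approximation settled on in this commit (treat a COMM event as "the process has exec'd" and drop its stale maps) can be sketched as below. This is a hypothetical Python model: the tuple-shaped events and the `ThreadMaps` class are assumptions made for illustration, not perf's thread or map code.

```python
class ThreadMaps:
    """Per-pid list of (start, end, name) maps rebuilt from an event stream."""

    def __init__(self):
        self.maps = {}  # pid -> list of (start, end, name)

    def process(self, event):
        kind, pid = event[0], event[1]
        if kind == "MMAP":
            _, _, start, length, name = event
            self.maps.setdefault(pid, []).append((start, start + length, name))
        elif kind == "COMM":
            # Approximation of exec(): the maps seen so far belong to the
            # old image (the fork()'d 'perf'), so forget them.
            self.maps.pop(pid, None)

    def find(self, pid, addr):
        """Return the name of the map covering addr for pid, or None."""
        for start, end, name in self.maps.get(pid, []):
            if start <= addr < end:
                return name
        return None
```

Without the flush, a lookup at an address reused by the new image would resolve to the map recorded while the child was still running 'perf'.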
     
  • Without this patch we get this for need_resched:

    [root@mica ~]# perf annotate need_resched

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    :
    :
    : Disassembly of section .text:
    :
    : ffffffff810095ed :
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095ed: 55 push %rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    0.00 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi
    :
    : static inline struct thread_info *current_thread_info(void)
    : {
    : struct thread_info *ti;
    : ti = (void *)(percpu_read_stable(kernel_stack) +
    0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi
    0.00 : ffffffff810095fa: 00 00
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    0.00 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi
    0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8
    : }
    0.00 : ffffffff8100960b: c9 leaveq
    0.00 : ffffffff8100960c: 85 c0 test %eax,%eax
    0.00 : ffffffff8100960e: 0f 95 c0 setne %al
    0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax
    : Disassembly of section .vsyscall_0:
    : Disassembly of section .vsyscall_fn:
    : Disassembly of section .vsyscall_1:
    : Disassembly of section .vsyscall_2:
    : Disassembly of section .init.text:
    : Disassembly of section .altinstr_replacement:
    : Disassembly of section .exit.text:
    [root@mica ~]#

    But from the 'perf report' result we know that there are hits
    for need_resched on a 4 way machine mostly doing nothing, so
    after adding code to show what is in each hist offset and
    collapsing IP hits for what happens between objdump lines we
    get, for the same perf.data file:

    [root@mica ~]# perf annotate -v need_resched

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    :
    :
    : Disassembly of section .text:
    :
    : ffffffff810095ed :
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095ed: 55 push %rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    52.78 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi
    :
    : static inline struct thread_info *current_thread_info(void)
    : {
    : struct thread_info *ti;
    : ti = (void *)(percpu_read_stable(kernel_stack) +
    0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi
    0.00 : ffffffff810095fa: 00 00
    : return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    : }
    :
    : static inline int need_resched(void)
    : {
    0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp
    : return unlikely(test_thread_flag(TIF_NEED_RESCHED));
    9.72 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi
    0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8
    : }
    0.00 : ffffffff8100960b: c9 leaveq
    0.00 : ffffffff8100960c: 85 c0 test %eax,%eax
    37.50 : ffffffff8100960e: 0f 95 c0 setne %al
    0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax
    : Disassembly of section .vsyscall_0:
    : Disassembly of section .vsyscall_fn:
    : Disassembly of section .vsyscall_1:
    : Disassembly of section .vsyscall_2:
    : Disassembly of section .init.text:
    : Disassembly of section .altinstr_replacement:
    : Disassembly of section .exit.text:
    [root@mica ~]#

    And now 'perf annotate -v', verbose mode, will show the hits per
    precise IP, so that one can make sense of the attribution to
    each objdump line:

    [root@mica ~]# perf annotate -v need_resched
    Looking at the vmlinux_path (5 entries long)
    Using /lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux for symbols
    annotate_sym: filename=/lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux, sym=need_resched, start=0xffffffff810095ed, end=0xffffffff81009614

    ------------------------------------------------
    Percent | Source code & Disassembly of vmlinux
    ------------------------------------------------
    ffffffff810095f1: 152
    ffffffff81009603: 28
    ffffffff8100960f: 55
    ffffffff81009610: 53
    h->sum: 288

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: David Miller
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
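
The collapsing of IP hits onto objdump lines can be checked against the numbers quoted in this commit. The sketch below is a Python illustration, not the perf annotate code: `collapse_hits` is a hypothetical helper that attributes each sampled IP to the nearest preceding instruction address and converts the totals to percentages. The addresses and hit counts are the ones shown in the verbose output above.

```python
import bisect


def collapse_hits(line_addrs, hits):
    """Attribute each sampled IP to the last objdump line starting at or before it."""
    line_addrs = sorted(line_addrs)
    per_line = dict.fromkeys(line_addrs, 0)
    for ip, count in hits.items():
        idx = bisect.bisect_right(line_addrs, ip) - 1
        if idx >= 0:  # ignore IPs that fall before the first line
            per_line[line_addrs[idx]] += count
    total = sum(hits.values())
    return {addr: (100.0 * n / total if total else 0.0)
            for addr, n in per_line.items()}


# Instruction start addresses from the need_resched disassembly.
LINES = [0xffffffff810095ed, 0xffffffff810095ee, 0xffffffff810095f3,
         0xffffffff810095fc, 0xffffffff810095ff, 0xffffffff81009606,
         0xffffffff8100960b, 0xffffffff8100960c, 0xffffffff8100960e,
         0xffffffff81009611]

# Hits per precise IP, as printed by 'perf annotate -v' (h->sum: 288).
HITS = {0xffffffff810095f1: 152, 0xffffffff81009603: 28,
        0xffffffff8100960f: 55, 0xffffffff81009610: 53}

percent = collapse_hits(LINES, HITS)
```

With these inputs the three non-zero lines come out at 152/288, 28/288 and 108/288 of the hits, which matches the 52.78, 9.72 and 37.50 shown in the annotated listing.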
     
  • Add lazy line matching support for specifying new probes.
    This also changes the syntax of perf probe a bit. Now
    perf probe accepts one of below probe event definitions.

    1) Define event based on function name
    [EVENT=]FUNC[@SRC][:RLN|+OFF|%return|;PTN] [ARG ...]

    2) Define event based on source file with line number
    [EVENT=]SRC:ALN [ARG ...]

    3) Define event based on source file with lazy pattern
    [EVENT=]SRC;PTN [ARG ...]

    - New lazy matching pattern (PTN) follows ';' (semicolon), and it
    must be put at the end of the definition.
    - So, @SRC is no longer the part which must be put at the end
    of the definition.

    Note that ';' (semicolon) can be interpreted as the end of
    a command by the shell. This means that you need to quote it.
    (You will need to quote the lazy pattern itself anyway,
    because it may contain other shell-sensitive characters, like
    '[', ']', etc.)

    Lazy matching
    -------------
    The lazy line matching is similar to glob matching, except that
    it ignores spaces in both the pattern and the target.

    e.g.
    'a=*' can match 'a=b', 'a = b', 'a == b' and so on.

    This provides some sort of flexibility and robustness to
    probe point definitions against minor code changes.
    (For example, the actual 10th line of schedule() can easily be
    changed by modifying schedule(), but a line matching
    'rq=cpu_rq*' may still exist.)

    Changes in v3:
    - Cast Dwarf_Addr to uintmax_t for printf-formats.

    Changes in v2:
    - Cast Dwarf_Addr to unsigned long long for printf-formats.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
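
The lazy matching rule described in this commit (glob matching that ignores spaces in both pattern and target) can be approximated in a few lines. This is a Python sketch using the standard `fnmatch` module, not perf's matcher: `lazy_match` is a hypothetical name, and it assumes the pattern must match the whole line.

```python
import fnmatch


def strip_spaces(s):
    """Remove all whitespace from s."""
    return "".join(s.split())


def lazy_match(pattern, line):
    """Glob-match line against pattern after dropping spaces from both."""
    return fnmatch.fnmatchcase(strip_spaces(line), strip_spaces(pattern))
```

Because whitespace is discarded before matching, a pattern such as 'rq=cpu_rq*' keeps matching a source line even if its indentation or spacing around '=' changes.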
     
  • Show 2 more lines after the last probe-able line.
    This will clearly show the last closed-brace of
    inline functions.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Check (inlined) function address range strictly for
    improving output of probe-able lines of inline functions.

    Without this change, perf probe --line sometimes
    showed other inline function bodies too, because it didn't
    filter out inlined functions.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Ulrich Drepper
    Cc: Roland McGrath
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Use libdw callback functions aggressively, and remove
    local tree-search API. This change simplifies the code.

    Changes in v3:
    - Cast Dwarf_Addr to uintmax_t for printf-formats.

    Changes in v2:
    - Cast Dwarf_Addr to unsigned long long for printf-formats.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Ulrich Drepper
    Cc: Roland McGrath
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Newer gcc introduces newer & richer debuginfo, and only libdw
    in elfutils project can support it. So perf probe moves onto
    elfutils-libdw from libdwarf.

    Changes in v3:
    - Cast Dwarf_Addr/Dwarf_Word to uintmax_t for printf-formats.
    - Recover a sign-prefix which was removed in v2 by mistake.

    Changes in v2:
    - Fix a type-casting bug in Makefile.
    - Cast Dwarf_Addr/Dwarf_Word to unsigned long long for printf-formats.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Ulrich Drepper
    Cc: Roland McGrath
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Rename *_probepoint to *_probe_point, for nothing
    but a cosmetic reason.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Fix find_line_range_by_line() to init line_list and remove the
    mistaken 'found' marking, which should be done only when
    real lines are found (if there are no probe-able lines,
    find_line_range() should return 0).

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Update perf-probe.txt to suit the current perf-probe command
    and add some examples.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Do not show --line option in help message when perf
    doesn't support dwarf.

    Signed-off-by: Masami Hiramatsu
    Cc: systemtap
    Cc: DLE
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Mike Galbraith
    Cc: K.Prasad
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Because symbol->end is not fixed up at symbol_filter time, only
    after all symbols for a DSO are loaded, and that, for asm
    symbols, may be bogus, causing segfaults when hits happen in
    these symbols.

    Reported-by: David Miller
    Reported-by: Anton Blanchard
    Acked-by: David Miller
    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: # for .33.x. Does not apply cleanly, needs backport.
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

25 Feb, 2010

3 commits

  • Be more clear about DSO long names and tell from which file
    kernel symbols were obtained, all in --verbose mode:

    [root@mica ~]# perf report -v > /dev/null
    Looking at the vmlinux_path (5 entries long)
    Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols
    [root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd
    [root@mica ~]# perf report -v > /dev/null
    Looking at the vmlinux_path (5 entries long)
    Using /proc/kallsyms for symbols
    [root@mica ~]#

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • To overcome a silly gcc warning:

    cc1: warnings being treated as errors
    builtin-top.c: In function ‘lookup_sym_source’:
    builtin-top.c:291: warning: not protecting local variables: variable length buffer
    make: *** [builtin-top.o] Error 1
    make: *** Waiting for unfinished jobs....

    That is emitted for this:

    const size_t pattern_len = BITS_PER_LONG / 4 + 2;
    char pattern[pattern_len + 1];

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Frédéric Weisbecker
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    LKML-Reference:
    [ -v2: macroify the naming style ]
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • In function dso__split_kallsyms(), curr_map saves the return value
    of map__new2. So check it instead of var map after the call returns.

    Signed-off-by: Zhang Yanmin
    Acked-by: David S. Miller
    Cc: # for .33.x
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Zhang, Yanmin