23 May, 2011

1 commit


19 Aug, 2010

4 commits

  • Store the kernel and user contexts from the generic layer instead
    of in each arch; this consolidates some repetitive code (a sketch
    follows below the sign-offs).

    Signed-off-by: Frederic Weisbecker
    Acked-by: Paul Mackerras
    Tested-by: Will Deacon
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Miller
    Cc: Paul Mundt
    Cc: Borislav Petkov

    Frederic Weisbecker
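
    As an editorial aside, a minimal sketch of the generic dispatch this
    refactoring points at, assuming the perf_callchain_kernel() /
    perf_callchain_user() arch hooks and the perf_callchain_store()
    helper introduced elsewhere in this series (kernel-internal code,
    not verified against this exact revision):

      /* The generic layer stores the PERF_CONTEXT_* markers once, then
       * delegates the actual unwinding to the arch hooks. */
      static void perf_callchain(struct perf_callchain_entry *entry,
                                 struct pt_regs *regs)
      {
              entry->nr = 0;

              if (!user_mode(regs)) {
                      perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
                      perf_callchain_kernel(entry, regs);
                      /* fall through to user space if the task has an mm */
                      regs = current->mm ? task_pt_regs(current) : NULL;
              }

              if (regs) {
                      perf_callchain_store(entry, PERF_CONTEXT_USER);
                      perf_callchain_user(entry, regs);
              }
      }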
     
  • - Most archs use one callchain buffer per cpu, except x86, which needs
    to deal with NMIs. Provide a default perf_callchain_buffer()
    implementation that x86 overrides.

    - Centralize all the kernel/user regs handling and invoke the new arch
    handlers from there: perf_callchain_user() / perf_callchain_kernel().
    This avoids repeating the user_mode(), current->mm checks and so on
    in every arch.

    - Invert some parameters in the perf_callchain_*() helpers: entry on
    the left, regs on the right, following the traditional (dst, src)
    order (the resulting hooks are sketched below the sign-offs).

    Signed-off-by: Frederic Weisbecker
    Acked-by: Paul Mackerras
    Tested-by: Will Deacon
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Miller
    Cc: Paul Mundt
    Cc: Borislav Petkov

    Frederic Weisbecker
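
    A hedged sketch of the two pieces described above: a __weak default
    for the per-cpu callchain buffer that x86 can override, and the new
    arch hook prototypes with the (dst, src)-style argument order
    (variable names here are illustrative, not taken from the tree):

      /* Common case: one callchain buffer per cpu.  x86 overrides this
       * weak definition because NMIs can nest inside an ongoing capture. */
      static DEFINE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);

      __weak struct perf_callchain_entry *perf_callchain_buffer(void)
      {
              return &__get_cpu_var(perf_callchain_entry);
      }

      /* New arch hooks: entry (destination) on the left, regs (source)
       * on the right. */
      extern void perf_callchain_kernel(struct perf_callchain_entry *entry,
                                        struct pt_regs *regs);
      extern void perf_callchain_user(struct perf_callchain_entry *entry,
                                      struct pt_regs *regs);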
     
  • callchain_store() is the same on every arch, so inline it in
    perf_event.h and rename it to perf_callchain_store() to avoid
    any collision.

    This removes repetitive code (the resulting inline helper is
    sketched below the sign-offs).

    Signed-off-by: Frederic Weisbecker
    Acked-by: Paul Mackerras
    Tested-by: Will Deacon
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: David Miller
    Cc: Paul Mundt
    Cc: Borislav Petkov

    Frederic Weisbecker
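
    The shared inline helper described above is small enough to sketch
    in full; this is the shape every arch's private callchain_store()
    used to duplicate (modulo details in this exact revision):

      static inline void
      perf_callchain_store(struct perf_callchain_entry *entry, u64 ip)
      {
              if (entry->nr < PERF_MAX_STACK_DEPTH)
                      entry->ip[entry->nr++] = ip;
      }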
     
  • Drop the TASK_RUNNING test on user tasks for callchains, as
    this check doesn't seem to make any sense.

    Also remove the test for !current, which is not supposed to
    happen, and the current->pid test, as idle filtering should be
    handled at the generic level with the exclude_idle attribute
    (the removed checks are sketched below the sign-offs).

    Signed-off-by: Frederic Weisbecker
    Tested-by: Will Deacon
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: David Miller
    Cc: Paul Mundt
    Cc: Borislav Petkov

    Frederic Weisbecker
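
    For illustration, a sketch of the kind of checks being removed here;
    arch_user_callchain() is a hypothetical stand-in, not a function
    from the tree:

      static void arch_user_callchain(struct perf_callchain_entry *entry,
                                      struct pt_regs *regs)
      {
              if (!current || !current->pid)          /* removed: core's job */
                      return;
              if (current->state != TASK_RUNNING)     /* removed: meaningless */
                      return;
              /* ... walk the user stack ... */
      }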
     

28 Jan, 2010

1 commit

  • When running perf across all cpus with backtracing (-a -g), sometimes we
    get samples without associated backtraces:

    23.44%     init     [kernel]       [k] restore
    11.46%     init     eeba0c         [k] 0x00000000eeba0c
     6.77%     swapper  [kernel]       [k] .perf_ctx_adjust_freq
     5.73%     init     [kernel]       [k] .__trace_hcall_entry
     4.69%     perf     libc-2.9.so    [.] 0x0000000006bb8c
               |
               |--11.11%-- 0xfffa941bbbc

    It turns out the backtrace code has a check for the idle task that the
    IP sampling does not. This creates problems when profiling an
    interrupt-heavy workload (in my case 10Gbit ethernet), since we get no
    backtraces for interrupts received while idle (i.e. most of the workload).

    Right now x86 and sh check that current is not NULL, which should never
    happen, so remove that check too.

    The idle task's exclusion must be performed in the core code, based on
    perf_event_attr::exclude_idle (a sketch of that direction follows below).

    Signed-off-by: Anton Blanchard
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Anton Blanchard
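
    A hedged sketch of the direction the last paragraph points at: filter
    idle-task samples once, in core code, keyed off exclude_idle, so the
    IP and the callchain can never disagree (perf_sample_allowed() is a
    hypothetical name, not the actual perf core function):

      static bool perf_sample_allowed(struct perf_event *event)
      {
              /* pid 0 == idle task, matching the test quoted above */
              if (event->attr.exclude_idle && current->pid == 0)
                      return false;   /* drop IP and callchain together */
              return true;
      }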
     

05 Nov, 2009

1 commit

  • This implements preliminary support for perf callchains (at the moment
    only the kernel side is implemented). The implementation itself is just
    a simple wrapper around the unwinder API, which allows for callchain
    generation with or without the DWARF unwinder (a sketch of such a
    wrapper follows below).

    Signed-off-by: Paul Mundt

    Paul Mundt
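
    A sketch of what such a thin wrapper can look like on sh, assuming
    the arch's unwind_stack() / stacktrace_ops interface and using the
    pre-rename callchain_store() helper and (regs, entry) argument order
    that this 2009-era code would have had, before the 2010 commits
    above renamed and inverted them (details not verified against this
    exact revision):

      static void callchain_address(void *data, unsigned long addr, int reliable)
      {
              struct perf_callchain_entry *entry = data;

              if (reliable)
                      callchain_store(entry, addr);
      }

      static const struct stacktrace_ops callchain_ops = {
              .address = callchain_address,
              /* .stack and the other callbacks are omitted from this sketch */
      };

      static void perf_callchain_kernel(struct pt_regs *regs,
                                        struct perf_callchain_entry *entry)
      {
              callchain_store(entry, PERF_CONTEXT_KERNEL);
              callchain_store(entry, regs->pc);

              /* Let the unwinder walk the kernel stack, with or without
               * DWARF support, feeding each address back through the ops. */
              unwind_stack(NULL, regs, NULL, &callchain_ops, entry);
      }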