22 Oct, 2010

1 commit

  • Fix to get the actual type DIE of variables by using dwarf_attr_integrate(),
    which gets an attribute from a DIE even if the type DIE is connected via
    DW_AT_abstract_origin.

    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Arnaldo Carvalho de Melo

    Masami Hiramatsu
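A rough userspace illustration of the API in question (assumes elfutils libdw; the helper name and error handling are invented, the real code lives in perf's probe-finder):

```c
#include <elfutils/libdw.h>
#include <dwarf.h>

/* Resolve a variable's type DIE.  dwarf_attr_integrate() also searches
 * the DIE referenced by DW_AT_abstract_origin (and DW_AT_specification),
 * which plain dwarf_attr() does not, so inlined/abstract instances of a
 * variable still yield their type. */
static int die_get_type(Dwarf_Die *vr_die, Dwarf_Die *die_mem)
{
	Dwarf_Attribute attr;

	if (dwarf_attr_integrate(vr_die, DW_AT_type, &attr) != NULL &&
	    dwarf_formref_die(&attr, die_mem) != NULL)
		return 0;

	return -1;
}
```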
     

21 Oct, 2010

1 commit

  • With the addition of trace_softirq_raise() the softirq tracepoint got
    even more convoluted. Why the tracepoints take two pointers to assign
    an integer is beyond my comprehension.

    But adding an extra case which treats the first pointer as an unsigned
    long when the second pointer is NULL, including the back-and-forth
    type casting, is just horrible.

    Convert the softirq tracepoints to take a single unsigned int argument
    for the softirq vector number and fix the call sites.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Acked-by: Peter Zijlstra
    Acked-by: mathieu.desnoyers@efficios.com
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt

    Thomas Gleixner
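For reference, the converted tracepoint shape looks roughly like this (boilerplate paraphrased, not the verbatim patch):

```c
/* Sketch: the softirq tracepoints take the vector number directly,
 * instead of two pointers with NULL-dependent casting. */
TRACE_EVENT(softirq_entry,

	TP_PROTO(unsigned int vec),

	TP_ARGS(vec),

	TP_STRUCT__entry(
		__field(unsigned int, vec)
	),

	TP_fast_assign(
		__entry->vec = vec;
	),

	TP_printk("vec=%u", __entry->vec)
);
```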
     

20 Oct, 2010

1 commit


19 Oct, 2010

19 commits

  • The function start_func_tracer() was incorrectly added in the
    #ifdef CONFIG_FUNCTION_TRACER condition, but is still used even
    when function tracing is not enabled.

    The calls to register_ftrace_function() and register_ftrace_graph()
    become nops (and their arguments are even ignored), thus there is
    no reason to hide start_func_tracer() when function tracing is
    not enabled.

    Reported-by: Ingo Molnar
    Signed-off-by: Steven Rostedt

    Steven Rostedt
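The reason the #ifdef is unnecessary: with function tracing disabled, the registration API already degrades to stubs, along these lines (paraphrased from the ftrace headers):

```c
/* With CONFIG_FUNCTION_TRACER off, registering a tracer compiles to a
 * nop that ignores its argument, so callers need no #ifdef of their own. */
#ifdef CONFIG_FUNCTION_TRACER
int register_ftrace_function(struct ftrace_ops *ops);
#else
static inline int register_ftrace_function(struct ftrace_ops *ops)
{
	return 0;	/* nop: the argument is ignored */
}
#endif
```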
     
  • Remove a couple of pointless header file includes.
    Fixes a compile bug caused by header file include dependencies with
    "irq: Add tracepoint to softirq_raise" within linux-next.

    Reported-by: Sachin Sant
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    [ cherry-picked from the s390 tree to fix "2bf2160: irq: Add tracepoint to softirq_raise" ]
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • Ugly #include dependencies. We need to have local_softirq_pending()
    defined before it gets used in . But
    provides the definition *after* this #include chain:



    Signed-off-by: Tony Luck
    [ cherry-picked from the ia64 tree to fix "2bf2160: irq: Add tracepoint to softirq_raise" ]
    Signed-off-by: Ingo Molnar

    Tony Luck
     
  • Commit c3f00c70 ("perf: Separate find_get_context() from event
    initialization") changed the generic perf_event code to call
    perf_event_alloc, which calls the arch-specific event_init code,
    before looking up the context for the new event. Unfortunately,
    power_pmu_event_init uses event->ctx->task to see whether the
    new event is a per-task event or a system-wide event, and thus
    crashes since event->ctx is NULL at the point where
    power_pmu_event_init gets called.

    (The reason it needs to know whether it is a per-task event is
    because there are some hardware events on Power systems which
    only count when the processor is not idle, and there are some
    fixed-function counters which count such events. For example,
    the "run cycles" event counts cycles when the processor is not
    idle. If the user asks to count cycles, we can use "run cycles"
    if this is a per-task event, since the processor is running when
    the task is running, by definition. We can't use "run cycles"
    if the user asks for "cycles" on a system-wide counter.)

    Fortunately the information we need is in the
    event->attach_state field, so we just use that instead.

    Signed-off-by: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Reported-by: Alexey Kardashevskiy
    Signed-off-by: Ingo Molnar

    Paul Mackerras
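The shape of the fix, heavily abridged (flag usage paraphrased; PPMU_ONLY_COUNT_RUN is powerpc's marker for the limited "run" counters):

```c
/* Sketch: event->ctx is still NULL when the arch init code runs, but
 * attach_state already records whether this is a per-task event. */
static int power_pmu_event_init(struct perf_event *event)
{
	unsigned long flags = 0;

	/* before (crashes): looked at event->ctx->task, a NULL deref */
	if (event->attach_state & PERF_ATTACH_TASK)
		flags |= PPMU_ONLY_COUNT_RUN;	/* "run cycles" may stand in */

	/* rest of the init path unchanged */
	return 0;
}
```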
     
  • …nel/git/rostedt/linux-2.6-trace into perf/core

    Ingo Molnar
     
  • When DYNAMIC_FTRACE is enabled and we use the C version of recordmcount,
    all objects are run through the recordmcount program to create a
    separate section that stores all the callers of mcount.

    The build process has a special file: scripts/mod/empty.o. This is
    built from empty.c which is literally an empty file (except for a
    single comment). This file is used to find information about the target
    elf format, like endianness and word size.

    The problem comes up when we need to build recordmcount. The
    build process requires that empty.o is built first. The build rules
    for empty.o will try to execute recordmcount on the empty.o file.
    We get an error that recordmcount does not exist.

    To avoid this recursion, the build file will skip running recordmcount
    if the file that it is building is scripts/mod/empty.o.

    [ extra comment Suggested-by: Sam Ravnborg ]

    Reported-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Michal Marek
    Cc: linux-kbuild@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt
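The guard can be sketched as a Makefile fragment (paraphrased from scripts/Makefile.build):

```make
# Never run recordmcount on the one object that recordmcount itself
# needs in order to be built, breaking the recursion.
sub_cmd_record_mcount = \
	if [ $(@) != "scripts/mod/empty.o" ]; then \
		$(objtree)/scripts/recordmcount "$(@)"; \
	fi;
```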
     
  • The use of the JUMP_LABEL() construct ends up creating endless silly
    wrappers. Create a higher level construct to reduce this clutter.

    Signed-off-by: Peter Zijlstra
    Cc: Jason Baron
    Cc: Steven Rostedt
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
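The higher level construct is, roughly, a run-this-statement-if-the-key-is-enabled macro; a sketch from memory (details may differ from the merged version):

```c
/* Run @stmt only when the jump label @key is enabled, without each
 * caller open-coding the JUMP_LABEL() + local-label dance. */
#define COND_STMT(key, stmt)					\
do {								\
	__label__ jl_enabled;					\
	JUMP_LABEL(key, jl_enabled);				\
	if (0) {						\
jl_enabled:							\
		stmt;						\
	}							\
} while (0)
```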
     
  • Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Trades a call + conditional + ret for an unconditional jmp.

    Acked-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add an interface to allow usage of jump_labels with atomic counters.

    Signed-off-by: Peter Zijlstra
    Acked-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
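A sketch of such an atomic_t interface (paraphrased; may differ in detail from the merged code):

```c
/* Enable the jump label on the first increment, disable it again when
 * the reference count drops back to zero. */
static inline void jump_label_inc(atomic_t *key)
{
	if (atomic_add_return(1, key) == 1)
		jump_label_enable(key);
}

static inline void jump_label_dec(atomic_t *key)
{
	if (atomic_dec_and_test(key))
		jump_label_disable(key);
}
```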
     
  • Now that there are still only a few users around, rename things to make
    them more consistent.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • hw_breakpoint creation needs to account stuff per-task to ensure there
    are always sufficient hardware resources to back these things, due to
    ptrace.

    With the perf per pmu context changes the event initialization no
    longer has access to the event context, for the simple reason that we
    need to first find the pmu (result of initialization) before we can
    find the context.

    This makes hw_breakpoints unhappy, because it can no longer do per-task
    accounting. Cure this by frobbing a task pointer in the event::hw
    bits for now...

    Signed-off-by: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • So that we can pass the task pointer to the event allocation, so that
    we can use task associated data during event initialization.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Currently it looks like find_lively_task_by_vpid() takes a task ref
    and relies on find_get_context() to drop it.

    The problem is that perf_event_create_kernel_counter() shouldn't be
    dropping task refs.

    Signed-off-by: Peter Zijlstra
    Acked-by: Frederic Weisbecker
    Acked-by: Matt Helsley
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Matt found we trigger the WARN_ON_ONCE() in perf_group_attach() when we take
    the move_group path in perf_event_open().

    Since we cannot de-construct the group (we rely on it to move the events), we
    have to simply ignore the double attach. The group state is context invariant
    and doesn't need changing.

    Reported-by: Matt Fleming
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
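The fix amounts to an early return instead of the warning; sketched (abridged):

```c
static void perf_group_attach(struct perf_event *event)
{
	struct perf_event *group_leader = event->group_leader;

	/* Tolerate the double attach on the move_group path: the group
	 * state is context invariant, so there is nothing to redo. */
	if (event->attach_state & PERF_ATTACH_GROUP)
		return;

	event->attach_state |= PERF_ATTACH_GROUP;

	/* linking into the leader's sibling list proceeds as before */
}
```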
     
  • Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like waking up a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this make do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

    Signed-off-by: Peter Zijlstra
    Acked-by: Kyle McMartin
    Acked-by: Martin Schwidefsky
    [ various fixes ]
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
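Typical usage of the resulting interface looks roughly like this (names per the irq_work API as merged; the callback body is hypothetical):

```c
#include <linux/irq_work.h>

/* Runs later, from IRQ context, where it is safe to take locks and
 * wake the task draining the buffers. */
static void wake_reader(struct irq_work *work)
{
	/* e.g. wake_up(&reader_wait); */
}

static struct irq_work reader_work;

static void setup(void)
{
	init_irq_work(&reader_work, wake_reader);
}

/* From NMI context: queue it and return immediately. */
static void nmi_handler(void)
{
	irq_work_queue(&reader_work);
}
```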
     
  • The group_sched_in() function uses a transactional approach to schedule
    a group of events. In a group, either all events can be scheduled or
    none are. To schedule each event in, the function calls event_sched_in().
    In case of error, event_sched_out() is called on each event in the group.

    The problem is that event_sched_out() does not completely cancel the
    effects of event_sched_in(). Furthermore, event_sched_out() changes the
    state of the event as if it had run, which is not true in this particular
    case.

    Those inconsistencies impact time tracking fields and may lead to events
    in a group not all reporting the same time_enabled and time_running values.
    This is demonstrated with the example below:

    $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
    1946101 unhalted_core_cycles (32.85% scaling, ena=829181, run=556827)
    11423 baclears (32.85% scaling, ena=829181, run=556827)
    7671 baclears (0.00% scaling, ena=556827, run=556827)

    2250443 unhalted_core_cycles (57.83% scaling, ena=962822, run=405995)
    11705 baclears (57.83% scaling, ena=962822, run=405995)
    11705 baclears (57.83% scaling, ena=962822, run=405995)

    Notice that in the first group, the last baclears event does not
    report the same timings as its siblings.

    This issue comes from the fact that tstamp_stopped is updated
    by event_sched_out() as if the event had actually run.

    To solve the issue, we must ensure that, in case of error, there is
    no change in the event state whatsoever. That means timings must
    remain as they were when entering group_sched_in().

    To do this we defer updating tstamp_running until we know the
    transaction succeeded. Therefore, we have split event_sched_in()
    in two parts separating the update to tstamp_running.

    Similarly, in case of error, we do not want to update tstamp_stopped.
    Therefore, we have split event_sched_out() in two parts separating
    the update to tstamp_stopped.

    With this patch, we now get the following output:

    $ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
    2492050 unhalted_core_cycles (71.75% scaling, ena=1093330, run=308841)
    11243 baclears (71.75% scaling, ena=1093330, run=308841)
    11243 baclears (71.75% scaling, ena=1093330, run=308841)

    1852746 unhalted_core_cycles (0.00% scaling, ena=784489, run=784489)
    9253 baclears (0.00% scaling, ena=784489, run=784489)
    9253 baclears (0.00% scaling, ena=784489, run=784489)

    Note that the uneven timing between groups is a side effect of
    the process spending most of its time sleeping, i.e., not enough
    event rotations (but that's a separate issue).

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
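The all-or-nothing shape can be modeled in a few lines of plain C (all names below are invented for illustration; the real patch splits event_sched_in()/event_sched_out() inside the perf core):

```c
#include <stddef.h>

/* Userspace model of the transaction: timestamps are committed only
 * after every event in the group has scheduled; a failed transaction
 * leaves every timestamp untouched. */
enum { OFF, ON };

struct event {
	int state;
	long tstamp_running;	/* written only when the group commits */
	int fails;		/* test knob: refuse to schedule */
};

static int event_sched_in_begin(struct event *e)
{
	if (e->fails)
		return -1;
	e->state = ON;		/* no timestamp update yet */
	return 0;
}

static void event_sched_in_commit(struct event *e, long now)
{
	e->tstamp_running = now;	/* deferred until success is known */
}

static void event_sched_out_nostamp(struct event *e)
{
	e->state = OFF;		/* roll back without touching timestamps */
}

/* Schedule all n events or none; returns 0 on success. */
int group_sched_in(struct event *ev, size_t n, long now)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		if (event_sched_in_begin(&ev[i])) {
			for (j = 0; j < i; j++)
				event_sched_out_nostamp(&ev[j]);
			return -1;
		}
	}
	for (i = 0; i < n; i++)
		event_sched_in_commit(&ev[i], now);
	return 0;
}
```

With this split, siblings of a failed group keep identical time_enabled/time_running values, which is exactly the property the example output above demonstrates.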
     
  • PERF_COUNT_HW_CACHE_DTLB:READ:MISS had a bogus umask value of 0 which
    counts nothing. Needed to be 0x7 (to count all possibilities).

    PERF_COUNT_HW_CACHE_ITLB:READ:MISS had a bogus umask value of 0 which
    counts nothing. Needed to be 0x3 (to count all possibilities).

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Cc: Robert Richter
    Cc: # as far back as it applies
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
     
  • You can only call update_context_time() when the context
    is active, i.e., the thread it is attached to is still running.

    However, perf_event_read() can be called even when the context
    is inactive, e.g., when the user read()s the counters. The call to
    update_context_time() must be conditioned on the status of
    the context, otherwise bogus time_enabled, time_running may
    be returned. Here is an example on AMD64. The task program
    is an example from libpfm4. The -p prints deltas every 1s.

    $ task -p -e cpu_clk_unhalted sleep 5
    2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
    5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)

    Whereas if you don't read deltas, e.g., no call to perf_event_read() until
    the process terminates:

    $ task -e cpu_clk_unhalted sleep 5
    2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)

    Notice that time_enabled, time_running are bogus in the first example,
    causing bogus scaling.

    This patch fixes the problem, by conditionally calling update_context_time()
    in perf_event_read().

    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephane Eranian
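The conditional call, roughly as described above (the surrounding locking is paraphrased):

```c
	raw_spin_lock_irqsave(&ctx->lock, flags);
	/*
	 * A read() on an inactive context must not advance
	 * time_enabled/time_running, so only fold wall-clock time
	 * in when the context is live.
	 */
	if (ctx->is_active)
		update_context_time(ctx);
	update_event_times(event);
	raw_spin_unlock_irqrestore(&ctx->lock, flags);
```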
     

18 Oct, 2010

7 commits

  • Even though the parent is recorded with the normal function tracing
    of the latency tracers (irqsoff and wakeup), the function graph
    recording is bogus.

    This is due to the function graph messing with the return stack.
    The latency tracers pass in CALLER_ADDR0 as the parent, which
    works fine for plain function tracing. But this causes bogus output
    with the graph tracer:

    3) -0 | d.s3. 0.000 us | return_to_handler();
    3) -0 | d.s3. 0.000 us | _raw_spin_unlock_irqrestore();
    3) -0 | d.s3. 0.000 us | return_to_handler();
    3) -0 | d.s3. 0.000 us | trace_hardirqs_on();

    The "return_to_handler()" call is the trampoline of the
    function graph tracer, and is meaningless in this context.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The preempt and irqsoff tracers have three types of function tracers:
    normal function tracer, function graph entry, and function graph return.
    Each of these uses a complex dance to prevent recursion and to decide
    whether to trace the data or not (depending on whether interrupts are
    enabled).

    This patch moves the duplicate code into a single routine, to
    prevent future mistakes with modifying duplicate complex code.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The wakeup tracer has three types of function tracers: normal
    function tracer, function graph entry, and function graph return.
    Each of these uses a complex dance to prevent recursion and to decide
    whether to trace the data or not (depending on the wake_task variable).

    This patch moves the duplicate code into a single routine, to
    prevent future mistakes with modifying duplicate complex code.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add function graph support for wakeup latency tracer.
    The graph output is enabled by setting the 'display-graph'
    trace option.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
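Enabling it from userspace looks roughly like this (assumes debugfs mounted at the usual location):

```
cd /sys/kernel/debug/tracing
echo wakeup > current_tracer
echo 1 > options/display-graph
cat trace
```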
     
  • Move the trace_graph_function() and print_graph_headers_flags() functions
    to trace_functions_graph.c to be globally available.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The check_irq_entry and check_irq_return functions could be called
    from graph event context. In such a case there's no graph
    private data allocated. Add checks to handle this case.

    Signed-off-by: Jiri Olsa
    LKML-Reference:

    [ Fixed some grammar in the comments ]

    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Unnecessary cast from void* in assignment.

    Signed-off-by: matt mooney
    Signed-off-by: Steven Rostedt

    matt mooney
     

17 Oct, 2010

1 commit


16 Oct, 2010

2 commits


15 Oct, 2010

8 commits

  • The file kernel/trace/ftrace.c references the mcount() call to
    convert the mcount() callers to nops. But because it references
    mcount(), the mcount() address is placed in the relocation table.

    The C version of recordmcount reads the relocation table of all
    object files, and it will add all references to mcount to the
    __mcount_loc table that is used to find the places that call mcount()
    and change the call to a nop. When recordmcount finds the mcount reference
    in kernel/trace/ftrace.o, it saves that location even though it is
    not a call, but a data reference to mcount.

    On boot up, when all calls are converted to nops, the code has a safety
    check to determine what op code it is actually replacing before it
    replaces it. If that op code at the address does not match, then
    a warning is printed and the function tracer is disabled.

    The reference to mcount in ftrace.c causes this warning to trigger,
    since the reference is not a call to mcount(). The ftrace.c file is
    not compiled with the -pg flag, so no calls to mcount() should be
    expected.

    This patch simply makes recordmcount.c skip the kernel/trace/ftrace.c
    file. This was the same solution used by the perl version of
    recordmcount.

    Reported-by: Ingo Molnar
    Cc: John Reiser
    Signed-off-by: Steven Rostedt

    Steven Rostedt
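The skip itself is just a path-suffix test. A minimal userspace sketch of the idea (the function name is invented; the real check lives in scripts/recordmcount.c):

```c
#include <string.h>

/* Return nonzero when the object being processed is the one file that
 * references mcount as data rather than calling it, and must therefore
 * be skipped by recordmcount. */
int should_skip(const char *path)
{
	static const char tail[] = "kernel/trace/ftrace.o";
	size_t len = strlen(path);
	size_t tlen = sizeof(tail) - 1;

	return len >= tlen && strcmp(path + len - tlen, tail) == 0;
}
```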
     
  • Make the !CONFIG_PM function stubs static inline and remove the section
    attribute.

    Signed-off-by: Robert Richter

    Robert Richter
     
  • Commit e9677b3ce (oprofile, ARM: Use oprofile_arch_exit() to
    cleanup on failure) caused oprofile_perf_exit to be called
    in the cleanup path of oprofile_perf_init. The __exit tag
    for oprofile_perf_exit should therefore be dropped.

    The same has to be done for exit_driverfs as well, as this
    function is called from oprofile_perf_exit. Otherwise, we get
    the following two linker errors:

    LD .tmp_vmlinux1
    `oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
    make: *** [.tmp_vmlinux1] Error 1

    LD .tmp_vmlinux1
    `exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
    make: *** [.tmp_vmlinux1] Error 1

    Signed-off-by: Anand Gadiyar
    Cc: Will Deacon
    Signed-off-by: Robert Richter

    Anand Gadiyar
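The before/after of the tag, sketched (bodies omitted):

```c
/* Before: lives in .exit.text, which is discarded for built-in code,
 * so the reference from the init error path breaks the link: */
static void __exit exit_driverfs(void);

/* After: no __exit tag; the function stays in .text and may be called
 * from oprofile_perf_exit() on the init failure path as well. */
static void exit_driverfs(void);
```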
     
  • oprofile_perf.c needs to include platform_device.h. Otherwise we get
    the following build break:

    CC arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.o
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: 'struct platform_device' declared inside parameter list
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: its scope is only this definition or declaration, which is probably not what you want
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:201: warning: 'struct platform_device' declared inside parameter list
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:210: error: variable 'oprofile_driver' has initializer but incomplete type
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: unknown field 'driver' specified in initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: extra brace group at end of initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: (near initialization for 'oprofile_driver')
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: excess elements in struct initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: (near initialization for 'oprofile_driver')
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: error: unknown field 'resume' specified in initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: excess elements in struct initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: (near initialization for 'oprofile_driver')
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: error: unknown field 'suspend' specified in initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: excess elements in struct initializer
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: (near initialization for 'oprofile_driver')
    arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c: In function 'init_driverfs':

    Signed-off-by: Anand Gadiyar
    Cc: Matt Fleming
    Cc: Will Deacon
    Signed-off-by: Robert Richter

    Anand Gadiyar
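Per the message above, the fix is the one missing include:

```c
/* Supplies the full definition of struct platform_device used by
 * oprofile_driver and the suspend/resume prototypes. */
#include <linux/platform_device.h>
```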
     
  • Conflicts:
    arch/arm/oprofile/common.c
    kernel/perf_event.c

    Robert Richter
     
  • …nel/git/rostedt/linux-2.6-trace into perf/core

    Ingo Molnar
     
  • The config option used by archs to let the build system know that
    the C version of recordmcount works for said arch is currently
    called HAVE_C_MCOUNT_RECORD, which enables BUILD_C_RECORDMCOUNT. To
    be more consistent with the names used elsewhere, it has been
    renamed to HAVE_C_RECORDMCOUNT. This is less confusing, since
    we are building a C recordmcount and not a mcount_record.

    Suggested-by: Ingo Molnar
    Cc:
    Cc: Michal Marek
    Cc: linux-kbuild@vger.kernel.org
    Cc: John Reiser
    Signed-off-by: Steven Rostedt

    Steven Rostedt
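An arch opts in via its Kconfig entry; sketched (x86 shown as a plausible example, entry abbreviated):

```
config X86
	select HAVE_C_RECORDMCOUNT
```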
     
  • …ic/random-tracing into perf/core

    Ingo Molnar