08 Jul, 2011

1 commit

  • If a function is set to be traced via set_graph_function, but the
    option funcgraph-irqs is zero, and the traced function happens to be
    called from an interrupt, it will not be traced.

    The point of funcgraph-irqs is to avoid tracing interrupts when we are
    preempted by an IRQ, not to suppress functions we want to trace that
    happen to be *in* an IRQ.

    Luckily, the current->trace_recursion element is the perfect place to
    add a flag that lets us trace functions within an interrupt even when
    we are not tracing the interrupts that preempt the trace.
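
    A rough user-space sketch of the idea (the bit position and helper
    names here are illustrative, not the exact kernel macros):

    #include <stdio.h>

    /* model of a per-task flag word like current->trace_recursion */
    #define TRACE_IRQ_BIT  (1UL << 13)   /* hypothetical bit position */

    static unsigned long trace_recursion;

    static void trace_recursion_set(unsigned long bit)   { trace_recursion |= bit;  }
    static void trace_recursion_clear(unsigned long bit) { trace_recursion &= ~bit; }
    static int  trace_recursion_test(unsigned long bit)  { return !!(trace_recursion & bit); }

    int main(void)
    {
        /* entering an explicitly filtered function inside an IRQ: set
         * the flag so nested calls are traced despite funcgraph-irqs=0 */
        trace_recursion_set(TRACE_IRQ_BIT);
        printf("trace inside irq? %d\n", trace_recursion_test(TRACE_IRQ_BIT));
        trace_recursion_clear(TRACE_IRQ_BIT);
        return 0;
    }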

    Reported-by: Heiko Carstens
    Tested-by: Heiko Carstens
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

15 Jun, 2011

4 commits

  • The function_graph tracer does not follow the global context-info
    option. Adding a TRACE_ITER_CONTEXT_INFO trace_flags check to enable it.
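
    A minimal user-space sketch of the check (TRACE_ITER_CONTEXT_INFO is
    the real flag name; everything around it is modeled, not kernel code):

    #include <stdio.h>

    #define TRACE_ITER_CONTEXT_INFO  (1 << 0)   /* bit value is illustrative */

    static unsigned long trace_flags;   /* cleared by: echo 0 > options/context-info */

    static void print_graph_entry(int cpu, const char *func)
    {
        /* print the CPU/DURATION context columns only when enabled */
        if (trace_flags & TRACE_ITER_CONTEXT_INFO)
            printf(" %d)               |  ", cpu);
        printf("%s();\n", func);
    }

    int main(void)
    {
        print_graph_entry(1, "timekeeping_max_deferment");
        return 0;
    }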

    With the following commands:
    # echo function_graph > ./current_tracer
    # echo 0 > options/context-info
    # cat trace

    This is what it looked like before:
    # tracer: function_graph
    #
    # TIME       CPU  DURATION                  FUNCTION CALLS
    #  |          |    |   |                     |   |   |   |
    1)   0.079 us   |            } /* __vma_link_rb */
    1)   0.056 us   |            copy_page_range();
    1)              |            security_vm_enough_memory() {
    ...

    This is what it looks like now:
    # tracer: function_graph
    #
    } /* update_ts_time_stats */
    timekeeping_max_deferment();
    ...

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1307113131-10045-6-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The lock_depth field was removed in commit
    e6e1e25 ("tracing: Remove lock_depth from event entry").

    Removing the lock_depth info from the function_graph latency header.

    With the following commands:
    # echo function_graph > ./current_tracer
    # echo 1 > options/latency-format
    # cat trace

    This is what it looked like before:
    # tracer: function_graph
    #
    # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+
    # --------------------------------------------------------------------
    # latency: 0 us, #59756/311298, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
    # -----------------
    # | task: <idle>-0 (uid:0 nice:0 policy:0 rt_prio:0)
    # -----------------
    #
    #                _-----=> irqs-off
    #               / _----=> need-resched
    #              | / _---=> hardirq/softirq
    #              || / _--=> preempt-depth
    #              ||| / _-=> lock-depth
    #              |||| /
    #  CPU  |||||  DURATION                  FUNCTION CALLS
    #   |   |||||   |   |                     |   |   |   |
    0)  ....  0.068 us   |    } /* __rcu_read_unlock */
    ...

    This is what it looks like now:
    # tracer: function_graph
    #
    # function_graph latency trace v1.1.5 on 3.0.0-rc1-tip+
    # --------------------------------------------------------------------
    # latency: 0 us, #59747/1744610, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
    # -----------------
    # | task: <idle>-0 (uid:0 nice:0 policy:0 rt_prio:0)
    # -----------------
    #
    #                _-----=> irqs-off
    #               / _----=> need-resched
    #              | / _---=> hardirq/softirq
    #              || / _--=> preempt-depth
    #              ||| /
    #  CPU  ||||  DURATION                  FUNCTION CALLS
    #   |   ||||   |   |                     |   |   |   |
    0)  ..s.  1.641 us   |    } /* __rcu_process_callbacks */
    ...

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1307113131-10045-5-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The functions print_graph_overhead() and print_graph_duration() display
    data for one field - DURATION.

    I merged them into a single function, print_graph_duration(),
    and added a way to display the empty parts of the field.

    This way the print_graph_irq() function can use this column to display
    the IRQ signs if needed, and the DURATION field details stay inside
    the print_graph_duration() function.
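
    A sketch of the merged helper's interface (the sentinel names and
    values are assumptions for illustration, not the exact kernel code):

    #include <stdio.h>

    /* negative "durations" ask print_graph_duration() to fill the
     * column with blanks instead of a number */
    enum {
        DURATION_FILL_FULL  = -1,
        DURATION_FILL_START = -2,
        DURATION_FILL_END   = -3,
    };

    static void print_graph_duration(long long duration)
    {
        if (duration == DURATION_FILL_FULL) {
            printf("              |  ");   /* empty field, e.g. for IRQ marks */
            return;
        }
        printf("%4lld.%03lld us |  ", duration / 1000, duration % 1000);
    }

    int main(void)
    {
        print_graph_duration(119214);             /* a real duration in ns */
        printf("} /* smp_apic_timer_interrupt */\n");
        print_graph_duration(DURATION_FILL_FULL); /* blank column */
        printf("==========>\n");                  /* irq entry marker */
        return 0;
    }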

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1307113131-10045-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The display of the absolute time and duration fields is based on the
    latency field. This was added during the irqsoff/wakeup tracers
    graph support changes.

    It's causing confusion about which fields will be displayed for the
    function_graph tracer itself. So I'm removing this dependency, and
    adding the absolute time and duration fields to the preemptirqsoff,
    preemptoff, irqsoff and wakeup tracers.

    With the following commands:
    # echo function_graph > ./current_tracer
    # cat trace

    This is what it looked like before:
    # tracer: function_graph
    #
    # TIME       CPU  DURATION                  FUNCTION CALLS
    #  |          |    |   |                     |   |   |   |
    0)   0.068 us   |            } /* page_add_file_rmap */
    0)              |            _raw_spin_unlock() {
    ...

    This is what it looks like now:
    # tracer: function_graph
    #
    # CPU  DURATION                  FUNCTION CALLS
    # |     |   |                     |   |   |   |
    0)   0.068 us   |    } /* add_preempt_count */
    0)   0.993 us   |    } /* vfsmount_lock_local_lock */
    ...

    For preemptirqsoff preemptoff irqsoff wakeup tracers,
    this is what it looked like before:
    SNIP
    #                  _-----=> irqs-off
    #                 / _----=> need-resched
    #                | / _---=> hardirq/softirq
    #                || / _--=> preempt-depth
    #                ||| / _-=> lock-depth
    #                |||| /
    #  CPU  TASK/PID   |||||  DURATION                  FUNCTION CALLS
    #   |    |    |    |||||   |   |                     |   |   |   |
    1)    <idle>-0    |  d..1  0.000 us |  acpi_idle_enter_simple();
    ...

    This is what it looks like now:
    SNIP
    #
    #                  _-----=> irqs-off
    #                 / _----=> need-resched
    #                | / _---=> hardirq/softirq
    #                || / _--=> preempt-depth
    #                ||| /
    #  TIME   CPU  TASK/PID   ||||  DURATION                  FUNCTION CALLS
    #   |      |    |    |    ||||   |   |                     |   |   |   |
    19.847735 |  1)    <idle>-0    |  d..1  0.000 us |  acpi_idle_enter_simple();
    ...

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1307113131-10045-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

31 Mar, 2011

1 commit


18 Oct, 2010

3 commits

  • Even though the parent is recorded with the normal function tracing
    of the latency tracers (irqsoff and wakeup), the function graph
    recording is bogus.

    This is due to the function graph tracer messing with the return stack.
    The latency tracers pass in CALLER_ADDR0 as the parent, which
    works fine for plain function tracing. But this causes bogus output
    with the graph tracer:

    3)    <idle>-0    |  d.s3.  0.000 us |  return_to_handler();
    3)    <idle>-0    |  d.s3.  0.000 us |  _raw_spin_unlock_irqrestore();
    3)    <idle>-0    |  d.s3.  0.000 us |  return_to_handler();
    3)    <idle>-0    |  d.s3.  0.000 us |  trace_hardirqs_on();

    The "return_to_handle()" call is the trampoline of the
    function graph tracer, and is meaningless in this context.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Move the trace_graph_function() and print_graph_headers_flags()
    functions to trace_functions_graph.c to make them globally available.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The check_irq_entry and check_irq_return functions can be called
    from graph event context. In such a case there is no graph
    private data allocated. Adding checks to handle this case.
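
    A sketch of the guard (the function and field names follow the
    commit's description; the structures here are simplified stand-ins):

    #include <stddef.h>
    #include <stdio.h>

    struct fgraph_data    { int depth_irq; };
    struct trace_iterator { void *private; };

    static int check_irq_entry(struct trace_iterator *iter, unsigned long addr)
    {
        struct fgraph_data *data = iter->private;

        /* called from graph event context: no private data was
         * allocated, so there is nothing to track */
        if (!data)
            return 0;

        /* ... normal irq-entry bookkeeping would go here ... */
        (void)addr;
        return 0;
    }

    int main(void)
    {
        struct trace_iterator iter = { .private = NULL };
        printf("%d\n", check_irq_entry(&iter, 0));   /* safely returns 0 */
        return 0;
    }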

    Signed-off-by: Jiri Olsa
    LKML-Reference:

    [ Fixed some grammar in the comments ]

    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

13 Oct, 2010

1 commit

  • Fix

    kernel/trace/trace_functions_graph.c: In function ‘trace_print_graph_duration’:
    kernel/trace/trace_functions_graph.c:652: warning: comparison of distinct pointer types lacks a cast

    when building 2.6.36-rc6 on 32-bit, due to the strict type check
    failing in the min() macro.
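
    The warning comes from the kernel's type-checking min(): it compares
    the addresses of two temporaries, which gcc flags when the argument
    types differ (size_t vs. unsigned long on 32-bit, where size_t is
    unsigned int). A self-contained illustration of the check and the
    usual min_t() fix:

    #include <stdio.h>

    /* kernel-style min(): the pointer comparison exists only to make
     * gcc warn when x and y have distinct types */
    #define min(x, y) ({                \
            typeof(x) _min1 = (x);      \
            typeof(y) _min2 = (y);      \
            (void) (&_min1 == &_min2);  \
            _min1 < _min2 ? _min1 : _min2; })

    /* min_t() casts both sides to one type first, avoiding the warning */
    #define min_t(type, x, y) min((type)(x), (type)(y))

    int main(void)
    {
        char nsecs_str[5];
        size_t len = 3;   /* size_t is unsigned int on 32-bit */

        /* min(sizeof(nsecs_str), 8UL - len) warns on 32-bit builds;
         * casting via min_t makes both operands the same type */
        printf("%zu\n", min_t(size_t, sizeof(nsecs_str), 8UL - len));
        return 0;
    }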

    Signed-off-by: Borislav Petkov
    Cc: Chase Douglas
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Borislav Petkov
     

15 Sep, 2010

2 commits

  • When the function graph tracer funcgraph-irqs option is zero, disable
    tracing in IRQs. This makes the option have two effects.

    1) When reading the trace file, do not display the functions that
    happen in interrupt context (when detected)

    2) [*new*] When recording a trace, skip functions that are detected
    to be in an interrupt by the 'in_irq()' function

    Note, in_irq() is updated at irq_enter() and irq_exit(). There are
    still functions recorded by the function graph tracer that are in
    interrupt context but outside the irq_enter/exit() routines.
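
    A user-space model of the new recording-side check (in_irq() is the
    real kernel helper; the option flag and wiring here are illustrative):

    #include <stdbool.h>
    #include <stdio.h>

    static bool opt_funcgraph_irqs = false;  /* echo 0 > options/funcgraph-irqs */
    static bool fake_in_irq = true;          /* stands in for in_irq() */

    /* return true when the current entry should NOT be recorded */
    static bool ftrace_graph_ignore_irqs(void)
    {
        if (opt_funcgraph_irqs)   /* option on: record everything */
            return false;
        return fake_in_irq;       /* option off: skip while in hard irq */
    }

    int main(void)
    {
        printf("record entry? %s\n", ftrace_graph_ignore_irqs() ? "no" : "yes");
        return 0;
    }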

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • It's handy to be able to disable the IRQ-related output and not have
    to jump over each piece of IRQ-related code when you have no
    interest in it.

    The option is enabled by default, so there's no change to the
    current behaviour. It affects only the final output, so all
    the IRQ-related data stays in the ring buffer.

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

19 Aug, 2010

2 commits


16 Aug, 2010

1 commit


07 Aug, 2010

1 commit

  • With CONFIG_DEBUG_PAGEALLOC, I observed an unallocated memory access in
    the function_graph trace. It appears we find a small-size entry in the
    ring buffer, but we access it as a big-size entry. The access overflows
    the page size and touches an unallocated page.

    Cc:
    Signed-off-by: Shaohua Li
    LKML-Reference:
    [ Added a comment to explain the problem - SDR ]
    Signed-off-by: Steven Rostedt

    Shaohua Li
     

29 Jun, 2010

1 commit

  • The nsecs_str string is a local variable defined as:

    char nsecs_str[5];

    It is possible for the snprintf call to use a size value larger than the
    size of the string. This should not cause a buffer overrun as the code
    is written now, because the value for the string format "%03lu" cannot
    be larger than 1000. However, this change makes it correct. By making
    the size correct we guard against potential future changes that could
    actually cause a buffer overrun.
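
    A self-contained illustration of the fix (the variable names follow
    the commit; the surrounding logic is simplified):

    #include <stdio.h>

    #define min(x, y) ((x) < (y) ? (x) : (y))   /* simplified, untyped min */

    int main(void)
    {
        char nsecs_str[5];
        unsigned long nsecs_rem = 123;   /* fractional part to print */
        size_t len = 2;                  /* characters already used */

        /* "8UL - len" alone may exceed sizeof(nsecs_str); clamping with
         * min() guarantees snprintf can never write past the buffer */
        snprintf(nsecs_str, min(sizeof(nsecs_str), 8UL - len), "%03lu", nsecs_rem);
        printf("%s\n", nsecs_str);   /* "123" */
        return 0;
    }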

    Signed-off-by: Chase Douglas
    LKML-Reference:

    [ added 'UL' to number 8 to fix gcc warning comparing it to sizeof() ]

    Signed-off-by: Steven Rostedt

    Chase Douglas
     

15 May, 2010

1 commit

  • Multiple events may use the same method to print their data.
    Instead of having all events have a pointer to their print functions,
    the trace_event structure now points to a trace_event_functions
    structure that holds the way to print out the event.

    The event itself is now passed to the print function to let the print
    function know what kind of event it should print.

    This opens the door to consolidating the way several events print
    their output.
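
    A simplified model of the indirection (it mirrors the trace_event /
    trace_event_functions split described above; the details are assumed):

    #include <stdio.h>

    struct trace_event;   /* forward declaration */

    struct trace_event_functions {
        /* the event is passed in so one printer can serve many events */
        void (*trace)(struct trace_event *event);
    };

    struct trace_event {
        int type;
        struct trace_event_functions *funcs;
    };

    static void print_generic(struct trace_event *event)
    {
        printf("event type %d\n", event->type);
    }

    static struct trace_event_functions generic_funcs = { .trace = print_generic };

    int main(void)
    {
        struct trace_event ent = { .type = 42, .funcs = &generic_funcs };
        struct trace_event ret = { .type = 43, .funcs = &generic_funcs };

        ent.funcs->trace(&ent);   /* two events share one print function */
        ret.funcs->trace(&ret);
        return 0;
    }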

       text    data     bss     dec     hex filename
    4913961 1088356  861512 6863829  68bbd5 vmlinux.orig
    4900382 1048964  861512 6810858  67ecea vmlinux.init
    4900446 1049028  861512 6810986  67ed6a vmlinux.preprint

    This change slightly increases the size but is needed for the next change.

    v3: Fix the branch tracer events to handle this change.

    v2: Fix the new function graph tracer event calls to handle this change.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

28 Apr, 2010

1 commit


27 Apr, 2010

2 commits


08 Apr, 2010

1 commit

  • Conflicts:
    include/linux/module.h
    kernel/module.c

    Semantic conflict:
    include/trace/events/module.h

    Merge reason: Resolve the conflict with upstream commit 5fbfb18 ("Fix up
    possibly racy module refcounting")

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

01 Apr, 2010

1 commit

  • Currently, when the ring buffer drops events, it does not record
    the fact that it did so. It does inform the writer that the event
    was dropped by returning a NULL event, but it does not put any
    placeholder where the event was dropped.

    This is not a trivial thing to add because the ring buffer mostly
    runs in overwrite (flight recorder) mode. That is, when the ring
    buffer is full, new data will overwrite old data.

    In a producer/consumer mode, where new data is simply dropped when
    the ring buffer is full, it is trivial to add the placeholder
    for dropped events. When there's more room to write new data,
    a special event can be added to notify the reader about the dropped
    events.

    But in overwrite mode, any new write can overwrite events. A
    placeholder can not be inserted into the ring buffer since there may
    never be room. A reader could also come in at any time and miss the
    placeholder.

    Luckily, the way the ring buffer works, the read side can find out
    if events were lost or not, and how many events. Every time a write
    takes place, if it overwrites the header page (the next read) it
    updates an "overrun" variable that keeps track of the number of
    lost events. When a reader swaps out a page from the ring buffer,
    it can record this number, perform the swap, and then check to
    see if the number changed, and take the diff if it has, which would be
    the number of events dropped. This can be stored by the reader
    and returned to callers of the reader.

    Since the reader page swap will fail if the writer moved the head
    page since the time the reader page set up the swap, this gives room
    to record the overruns without worrying about races. If the reader
    sets up the pages, records the overrun, then performs the swap,
    and the swap succeeds, then the overrun variable has not been
    updated since the setup before the swap.
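
    A compact model of that reader-side sequence (hypothetical names,
    not the real ring_buffer API):

    #include <stdio.h>

    struct cpu_buffer {
        unsigned long overrun;       /* lost-event count, updated by the writer */
        unsigned long read_overrun;  /* value recorded at the last read */
    };

    /* the real swap is a cmpxchg that fails if the writer moved the
     * head page; this stand-in always succeeds */
    static int try_swap_reader_page(struct cpu_buffer *b)
    {
        (void)b;
        return 0;
    }

    static int read_page(struct cpu_buffer *b, unsigned long *lost)
    {
        unsigned long overrun = b->overrun;   /* snapshot before the swap */

        if (try_swap_reader_page(b) != 0)
            return -1;                        /* raced with the writer: retry */

        /* the swap succeeded, so the snapshot is still current */
        *lost = overrun - b->read_overrun;
        b->read_overrun = overrun;
        return 0;
    }

    int main(void)
    {
        struct cpu_buffer buf = { .overrun = 7, .read_overrun = 3 };
        unsigned long lost;

        if (read_page(&buf, &lost) == 0)
            printf("events lost since last read: %lu\n", lost);   /* 4 */
        return 0;
    }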

    For binary readers of the ring buffer, a flag is set in the header
    of each sub page (sub buffer) of the ring buffer. This flag is embedded
    in the size field of the data on the sub buffer, in the 31st bit (the
    size can be 32 or 64 bits depending on the architecture), but only 27
    bits need to be used for the actual size (less, actually).
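
    For instance, the packing could look like this (the flag and mask
    names are assumptions for illustration):

    #include <stdint.h>
    #include <stdio.h>

    #define RB_MISSED_EVENTS (UINT32_C(1) << 31)       /* "events dropped" flag */
    #define RB_SIZE_MASK     ((UINT32_C(1) << 27) - 1) /* ~27 bits carry the size */

    int main(void)
    {
        uint32_t commit = 4096 | RB_MISSED_EVENTS;   /* sub-buffer size field */

        printf("size: %u, missed events: %s\n",
               commit & RB_SIZE_MASK,
               (commit & RB_MISSED_EVENTS) ? "yes" : "no");
        return 0;
    }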

    We could add a new field in the sub buffer header to also record the
    number of events dropped since the last read, but this will change the
    format of the binary ring buffer a bit too much. Perhaps this change can
    be made if the information on the number of events dropped is considered
    important enough.

    Note, the notification of dropped events is only used by consuming reads
    or peeking at the ring buffer. Iterating over the ring buffer does not
    keep this information because the necessary data is only available when
    a page swap is made, and the iterator does not swap out pages.

    Cc: Robert Richter
    Cc: Andi Kleen
    Cc: Li Zefan
    Cc: Arnaldo Carvalho de Melo
    Cc: "Luis Claudio R. Goncalves"
    Cc: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, and for others adding it to the
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them, as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers, which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

14 Mar, 2010

1 commit

  • …nel/git/tip/linux-2.6-tip

    * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    tracing: Do not record user stack trace from NMI context
    tracing: Disable buffer switching when starting or stopping trace
    tracing: Use same local variable when resetting the ring buffer
    function-graph: Init curr_ret_stack with ret_stack
    ring-buffer: Move disabled check into preempt disable section
    function-graph: Add tracing_thresh support to function_graph tracer
    tracing: Update the comm field in the right variable in update_max_tr
    function-graph: Use comment notation for func names of dangling '}'
    function-graph: Fix unused reference to ftrace_set_func()
    tracing: Fix warning in s_next of trace file ops
    tracing: Include irqflags headers from trace clock

    Linus Torvalds
     

06 Mar, 2010

2 commits

  • Add support for tracing_thresh to the function_graph tracer. This
    version of this feature isolates the checks into new entry and
    return functions, to avoid adding more conditional code into the
    main function_graph paths.

    When the tracing_thresh is set and the function graph tracer is
    enabled, only the functions that took longer than the time in
    microseconds that was set in tracing_thresh are recorded. To do this
    efficiently, only the function exits are recorded:

    [tracing]# echo 100 > tracing_thresh
    [tracing]# echo function_graph > current_tracer
    [tracing]# cat trace
    # tracer: function_graph
    #
    # CPU  DURATION                  FUNCTION CALLS
    # |     |   |                     |   |   |   |
    1) ! 119.214 us  |  } /* smp_apic_timer_interrupt */
    1)
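
    A user-space sketch of the return-side check (the struct fields and
    the comparison mirror the description above; the recording step is a
    printf stand-in):

    #include <stdio.h>

    struct ftrace_graph_ret {
        unsigned long long calltime;   /* ns */
        unsigned long long rettime;    /* ns */
    };

    static unsigned long long tracing_thresh = 100000;  /* 100 us in ns */

    static void trace_graph_thresh_return(struct ftrace_graph_ret *trace)
    {
        /* record only function exits whose duration met the threshold */
        if (tracing_thresh &&
            (trace->rettime - trace->calltime < tracing_thresh))
            return;
        printf("record: %llu.%03llu us\n",
               (trace->rettime - trace->calltime) / 1000,
               (trace->rettime - trace->calltime) % 1000);
    }

    int main(void)
    {
        struct ftrace_graph_ret fast = { .calltime = 0, .rettime = 50000 };
        struct ftrace_graph_ret slow = { .calltime = 0, .rettime = 119214 };

        trace_graph_thresh_return(&fast);   /* below threshold: dropped */
        trace_graph_thresh_return(&slow);   /* recorded: 119.214 us */
        return 0;
    }
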
    Signed-off-by: Tim Bird
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Tim Bird
     
  • When a '}' does not have a matching function start, the name is printed
    within parentheses. But this makes it easy to confuse an ending '}'
    with a function start. This patch makes the function name appear in C
    comment notation.

    Old view:
    3) 1.281 us | } (might_fault)
    3) 3.620 us | } (filldir)
    3) 5.251 us | } (call_filldir)
    3) | call_filldir() {
    3) | filldir() {

    New view:
    3) 1.281 us | } /* might_fault */
    3) 3.620 us | } /* filldir */
    3) 5.251 us | } /* call_filldir */
    3) | call_filldir() {
    3) | filldir() {

    Requested-by: Ingo Molnar
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

03 Mar, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: add __percpu sparse annotations to what's left
    percpu: add __percpu sparse annotations to fs
    percpu: add __percpu sparse annotations to core kernel subsystems
    local_t: Remove leftover local.h
    this_cpu: Remove pageset_notifier
    this_cpu: Page allocator conversion
    percpu, x86: Generic inc / dec percpu instructions
    local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
    module: Use this_cpu_xx to dynamically allocate counters
    local_t: Remove cpu_local_xx macros
    percpu: refactor the code in pcpu_[de]populate_chunk()
    percpu: remove compile warnings caused by __verify_pcpu_ptr()
    percpu: make accessors check for percpu pointer in sparse
    percpu: add __percpu for sparse.
    percpu: make access macros universal
    percpu: remove per_cpu__ prefix.

    Linus Torvalds
     

27 Feb, 2010

1 commit

  • The function graph tracer is currently the most invasive tracer
    in the ftrace family. It can easily overflow the buffer even with
    10 MB per CPU. This means that events can often be lost.

    On start up, or after events are lost, if the function return is
    recorded but the function enter was lost, all we get to see is the
    exiting '}'.

    Here is how a typical trace output starts:

    [tracing] cat trace
    # tracer: function_graph
    #
    # CPU  DURATION                  FUNCTION CALLS
    # |     |   |                     |   |   |   |
    0) + 91.897 us   |  }
    0) ! 567.961 us  |  }
    0)

    Steven Rostedt
     

25 Feb, 2010

1 commit


29 Jan, 2010

1 commit

  • In the function graph tracer, a calling function is to be traced
    only when it is enabled through the set_graph_function file,
    or when it is nested in an enabled function.

    Current code uses TSK_TRACE_FL_GRAPH to test whether it is nested
    or not. Looking at the code, we can see the equivalence:
    (trace->depth > 0) <==> (TSK_TRACE_FL_GRAPH is set)

    trace->depth is more explicit in telling us that it is nested,
    so we use trace->depth directly and simplify the code.

    No functionality is changed.
    TSK_TRACE_FL_GRAPH is not removed yet, it is left for future usage.
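
    A small model of the simplified test (the struct and helper names
    follow the entry path described above; the helper body is a stand-in):

    #include <stdbool.h>
    #include <stdio.h>

    struct ftrace_graph_ent { unsigned long func; int depth; };

    /* stand-in for the real set_graph_function lookup */
    static bool ftrace_graph_addr(unsigned long func)
    {
        return func == 0x1000;   /* pretend only this address is enabled */
    }

    static bool should_trace(struct ftrace_graph_ent *trace)
    {
        /* depth > 0 means we are nested inside an enabled function,
         * which is exactly what TSK_TRACE_FL_GRAPH used to signal */
        return trace->depth > 0 || ftrace_graph_addr(trace->func);
    }

    int main(void)
    {
        struct ftrace_graph_ent nested = { .func = 0x2000, .depth = 2 };
        struct ftrace_graph_ent top    = { .func = 0x1000, .depth = 0 };

        printf("%d %d\n", should_trace(&nested), should_trace(&top));  /* 1 1 */
        return 0;
    }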

    Signed-off-by: Lai Jiangshan
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Lai Jiangshan
     

17 Jan, 2010

1 commit

  • Each time we save a function entry from the function graph
    tracer, we check if the trace array is set, which is wasteful
    because it is set anyway before we start the tracer. All we need
    is to ensure we have good read and write orderings: when we set
    the trace array, we just need to guarantee that it is visible
    before tracing starts.
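
    A user-space model of the ordering requirement, using C11 release
    semantics (the kernel uses its own barriers; the names here are
    illustrative):

    #include <stdatomic.h>
    #include <stdio.h>

    struct trace_array { int buffer_ready; };

    static _Atomic(struct trace_array *) graph_array;

    static void set_graph_array(struct trace_array *tr)
    {
        /* release: all writes to *tr are visible to anyone who observes
         * the pointer, so the tracing callbacks never need a per-entry
         * NULL check on the hot path */
        atomic_store_explicit(&graph_array, tr, memory_order_release);
    }

    int main(void)
    {
        static struct trace_array tr = { .buffer_ready = 1 };

        set_graph_array(&tr);   /* done once, before tracing is enabled */
        printf("published: %p\n", (void *)atomic_load(&graph_array));
        return 0;
    }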

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    Cc: Lai Jiangshan
    Cc: Paul E. McKenney
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

05 Jan, 2010

1 commit


15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

10 Dec, 2009

1 commit

  • There is a case where the graph tracer might get confused and omit
    displaying a single record. This applies mostly to the trace_pipe
    file, since it is unlikely that the trace_seq buffer will overflow
    with the trace file.

    As the function_graph tracer goes through the trace entries keeping a
    pointer to the current record:

    current -> func1 ENTRY
               func2 ENTRY
               func2 RETURN
               func1 RETURN

    When a function ENTRY is encountered, it moves the pointer to the
    next entry to check if the function is a nested or a leaf function.

               func1 ENTRY
    current -> func2 ENTRY
               func2 RETURN
               func1 RETURN

    If the rest of the writing of the function fills the trace_seq buffer,
    then the trace_pipe read will ignore this entry. The next read will
    now start at the current location, but the first entry (func1) will
    be discarded.

    This patch keeps a copy of the current entry in the iterator private
    storage and will keep track of when the trace_seq buffer fills. When
    the trace_seq buffer fills, it will reuse the copy of the entry in the
    next iteration.
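
    A minimal model of the bookkeeping (the field names loosely follow
    the iterator's private graph data; simplified for illustration):

    #include <stdbool.h>
    #include <stdio.h>

    struct ring_entry { const char *func; };

    struct fgraph_data {
        struct ring_entry ent;   /* copy of the entry being printed */
        bool failed;             /* the trace_seq overflowed last time */
    };

    static struct ring_entry *
    next_to_print(struct fgraph_data *data, struct ring_entry *next)
    {
        if (data->failed) {
            data->failed = false;
            return &data->ent;   /* re-emit the entry we failed to print */
        }
        data->ent = *next;       /* keep a copy in case printing fails */
        return next;
    }

    int main(void)
    {
        struct fgraph_data data = { .failed = false };
        struct ring_entry e1 = { "func1" }, e2 = { "func2" };

        printf("%s\n", next_to_print(&data, &e1)->func);  /* func1 */
        data.failed = true;   /* pretend the trace_seq buffer filled up */
        printf("%s\n", next_to_print(&data, &e2)->func);  /* func1 again */
        return 0;
    }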

    [
    This patch has been largely modified by Steven Rostedt in order to
    clean it up and simplify it. The original idea and concept was from
    Jirka, and for that, this patch will go under his name to give him
    the credit he deserves. But because this was modified by Steven
    Rostedt, anything wrong with the patch should be blamed on Steven.
    ]

    Signed-off-by: Jiri Olsa
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

29 Oct, 2009

1 commit

  • Now that the return from alloc_percpu is compatible with the address
    of per-cpu vars, it makes sense to hand around the address of per-cpu
    variables. To make this sane, we remove the per_cpu__ prefix we
    created to stop people accidentally using these vars directly.

    Now we have sparse, we can use that (next patch).

    tj: * Updated to convert stuff which was missed by or added after the
    original patch.

    * Kill per_cpu_var() macro.

    Signed-off-by: Rusty Russell
    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter

    Rusty Russell
     

13 Oct, 2009

1 commit


18 Sep, 2009

1 commit


12 Sep, 2009

2 commits

  • Both trace_output.c and trace_functions_graph.c do basically the same
    thing to handle the printing of the latency-format. This patch moves
    the code into one function that both can use.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • This patch adds the lock depth of the big kernel lock to the generic
    entry header. This way we can see the depth of the lock, which helps
    in removing the BKL.

    Example:

    #                _------=> CPU#
    #               / _-----=> irqs-off
    #              | / _----=> need-resched
    #              || / _---=> hardirq/softirq
    #              ||| / _--=> preempt-depth
    #              |||| / _-=> lock-depth
    #              |||||/ delay
    #  cmd     pid |||||| time  |   caller
    #     \   /    ||||||   \   |   /
      <idle>-0     2.N..3 5902255250us+: lock_acquire: read rcu_read_lock
      <idle>-0     2.N..3 5902255253us+: lock_release: rcu_read_lock
      <idle>-0     2dN..3 5902255257us+: lock_acquire: xtime_lock
      <idle>-0     2dN..4 5902255259us : lock_acquire: clocksource_lock
      <idle>-0     2dN..4 5902255261us+: lock_release: clocksource_lock

    Signed-off-by: Steven Rostedt

    Steven Rostedt