15 May, 2010

1 commit

  • Multiple events may use the same method to print their data.
    Instead of having all events have a pointer to their print funtions,
    the trace_event structure now points to a trace_event_functions structure
    that will hold the way to print ouf the event.

    The event itself is now passed to the print function to let the print
    function know what kind of event it should print.

    This opens the door to consolidating the way several events print
    their output.

    text data bss dec hex filename
    4913961 1088356 861512 6863829 68bbd5 vmlinux.orig
    4900382 1048964 861512 6810858 67ecea vmlinux.init
    4900446 1049028 861512 6810986 67ed6a vmlinux.preprint

    This change slightly increases the size but is needed for the next change.

    v3: Fix the branch tracer events to handle this change.

    v2: Fix the new function graph tracer event calls to handle this change.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

10 Feb, 2010

1 commit

  • The branch annotation is a bit difficult to see the worst offenders
    because it only sorts by percentage:

    correct incorrect % Function File Line
    ------- --------- - -------- ---- ----
    0 163 100 qdisc_restart sch_generic.c 179
    0 163 100 pfifo_fast_dequeue sch_generic.c 447
    0 4 100 pskb_trim_rcsum skbuff.h 1689
    0 4 100 llc_rcv llc_input.c 170
    0 18 100 psmouse_interrupt psmouse-base.c 304
    0 3 100 atkbd_interrupt atkbd.c 389
    0 5 100 usb_alloc_dev usb.c 437
    0 11 100 vsscanf vsprintf.c 1897
    0 2 100 IS_ERR err.h 34
    0 23 100 __rmqueue_fallback page_alloc.c 865
    0 4 100 probe_wakeup_sched_switch trace_sched_wakeup.c 142
    0 3 100 move_masked_irq migration.c 11

    Adding the incorrect and correct values as sort keys makes this file a
    bit more informative:

    correct incorrect % Function File Line
    ------- --------- - -------- ---- ----
    0 366541 100 audit_syscall_entry auditsc.c 1637
    0 366538 100 audit_syscall_exit auditsc.c 1685
    0 115839 100 sched_info_switch sched_stats.h 269
    0 74567 100 sched_info_queued sched_stats.h 222
    0 66578 100 sched_info_dequeued sched_stats.h 177
    0 15113 100 trace_workqueue_insertion workqueue.h 38
    0 15107 100 trace_workqueue_execution workqueue.h 45
    0 3622 100 syscall_trace_leave ptrace.c 1772
    0 2750 100 sched_move_task sched.c 10100
    0 2750 100 sched_move_task sched.c 10110
    0 1815 100 pre_schedule_rt sched_rt.c 1462
    0 837 100 audit_alloc auditsc.c 879
    0 814 100 tcp_mss_split_point tcp_output.c 1302

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

08 Oct, 2009

2 commits


07 May, 2009

1 commit


14 Apr, 2009

3 commits

  • Before patch:

    # tracer: branch
    #
    # TASK-PID CPU# TIMESTAMP FUNCTION
    # | | | | |
    -2981 [000] 24008.872738: [ ok ] trace_irq_handler_exit:irq_event_types.h:41
    -2981 [000] 24008.872742: [ ok ] note_interrupt:spurious.c:229
    ...

    After patch:

    # tracer: branch
    #
    # TASK-PID CPU# TIMESTAMP CORRECT FUNC:FILE:LINE
    # | | | | | |
    -2985 [000] 26329.142970: [ ok ] slab_free:slub.c:1776
    -2985 [000] 26329.142972: [ ok ] trace_kmem_cache_free:kmem_event_types.h:191
    ...

    Signed-off-by: Zhao Lei
    Acked-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Zhaolei
     
  • This patch changes filter_check_discard() to make use of the new
    ring_buffer_discard_commit() function and modifies the current users to
    call the old commit function in the non-discard case.

    It also introduces a version of filter_check_discard() that uses the
    global trace buffer (filter_current_check_discard()) for those cases.

    v2 changes:

    - fix compile error noticed by Ingo Molnar

    Signed-off-by: Tom Zanussi
    Cc: Steven Rostedt
    Cc: fweisbec@gmail.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     
  • This patch adds run-time field descriptions to all the event formats
    exported using TRACE_EVENT_FORMAT. It also hooks up all the tracers
    that use them (i.e. the tracers in the 'ftrace subsystem') so they can
    also have their output filtered by the event-filtering mechanism.

    When I was testing this, there were a couple of things that fooled me
    into thinking the filters weren't working, when actually they were -
    I'll mention them here so others don't make the same mistakes (and file
    bug reports. ;-)

    One is that some of the tracers trace multiple events e.g. the
    sched_switch tracer uses the context_switch and wakeup events, and if
    you don't set filters on all of the traced events, the unfiltered output
    from the events without filters on them can make it look like the
    filtering as a whole isn't working properly, when actually it is doing
    what it was asked to do - it just wasn't asked to do the right thing.

    The other is that for the really high-volume tracers e.g. the function
    tracer, the volume of filtered events can be so high that it pushes the
    unfiltered events out of the ring buffer before they can be read so e.g.
    cat'ing the trace file repeatedly shows either no output, or once in
    awhile some output but that isn't there the next time you read the
    trace, which isn't what you normally expect when reading the trace file.
    If you read from the trace_pipe file though, you can catch them before
    they disappear.

    Changes from v1:

    As suggested by Frederic Weisbecker:

    - get rid of externs in functions
    - added unlikely() to filter_check_discard()

    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Tom Zanussi
     

25 Mar, 2009

1 commit

  • Currently, if a trace_stat user wants a handle to some private data,
    the trace_stat infrastructure does not supply a way to do that.

    This patch passes the trace_stat structure to the start function of
    the trace_stat code.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

11 Mar, 2009

1 commit


05 Mar, 2009

1 commit

  • Impact: clean up

    The trace and latency_trace function pointers are identical for
    every tracer but the function tracer. The differences in the function
    tracer are trivial (latency output puts paranthesis around parent).

    This patch removes the latency_trace pointer and all prints will
    now just use the trace output function pointer.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

11 Feb, 2009

1 commit


06 Feb, 2009

3 commits

  • Impact: cleanup

    To make it easy for ftrace plugin writers, as this was open coded in
    the existing plugins

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Impact: new API

    These new functions do what previously was being open coded, reducing
    the number of details ftrace plugin writers have to worry about.

    It also standardizes the handling of stacktrace, userstacktrace and
    other trace options we may introduce in the future.

    With this patch, for instance, the blk tracer (and some others already
    in the tree) can use the "userstacktrace" /d/tracing/trace_options
    facility.

    $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
    linux-2.6-tip/kernel/trace/trace.c:
    trace_vprintk | -5
    trace_graph_return | -22
    trace_graph_entry | -26
    trace_function | -45
    __ftrace_trace_stack | -27
    ftrace_trace_userstack | -29
    tracing_sched_switch_trace | -66
    tracing_stop | +1
    trace_seq_to_user | -1
    ftrace_trace_special | -63
    ftrace_special | +1
    tracing_sched_wakeup_trace | -70
    tracing_reset_online_cpus | -1
    13 functions changed, 2 bytes added, 355 bytes removed, diff: -353

    linux-2.6-tip/block/blktrace.c:
    __blk_add_trace | -58
    1 function changed, 58 bytes removed, diff: -58

    linux-2.6-tip/kernel/trace/trace.c:
    trace_buffer_lock_reserve | +88
    trace_buffer_unlock_commit | +86
    2 functions changed, 174 bytes added, diff: +174

    /tmp/vmlinux.after:
    16 functions changed, 176 bytes added, 413 bytes removed, diff: -237

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Impact: API change, cleanup

    >From ring_buffer_{lock_reserve,unlock_commit}.

    $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
    linux-2.6-tip/kernel/trace/trace.c:
    trace_vprintk | -14
    trace_graph_return | -14
    trace_graph_entry | -10
    trace_function | -8
    __ftrace_trace_stack | -8
    ftrace_trace_userstack | -8
    tracing_sched_switch_trace | -8
    ftrace_trace_special | -12
    tracing_sched_wakeup_trace | -8
    9 functions changed, 90 bytes removed, diff: -90

    linux-2.6-tip/block/blktrace.c:
    __blk_add_trace | -1
    1 function changed, 1 bytes removed, diff: -1

    /tmp/vmlinux.after:
    10 functions changed, 91 bytes removed, diff: -91

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

05 Feb, 2009

4 commits


03 Feb, 2009

1 commit


29 Jan, 2009

1 commit


14 Jan, 2009

1 commit

  • Impact: tracing's Api change

    Currently, the stat tracing depends on the events tracing.
    When you switch to a new tracer, the stats files of the previous tracer
    will disappear. But it's more scalable to separate those two engines.
    This way, we can keep the stat files of one or several tracers when we
    want, without bothering of multiple tracer stat files or tracer switching.

    To build/destroys its stats files, a tracer just have to call
    register_stat_tracer/unregister_stat_tracer everytimes it wants to.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

11 Jan, 2009

1 commit

  • Impact: new API for tracers

    Make the stat tracing API reentrant. And also provide the new directory
    /debugfs/tracing/trace_stat which will contain all the stat files for the
    current active tracer.

    Now a tracer will, if desired, want to provide a zero terminated array of
    tracer_stat structures.
    Each one contains the callbacks necessary for one stat file.
    It have to provide at least a name for its stat file, an iterator with
    stat_start/start_next callback and an output callback for one stat entry.

    Also adapt the branch tracer to this new API.
    We create two files "all" and "annotated" inside the /debugfs/tracing/trace_stat
    directory, making the both stats simultaneously available instead of needing
    to change an option to switch from one stat file to another.

    The output of these stats haven't changed.

    Changes in v2:

    _ Apply the previous memory leak fix (rebase against tip/master)

    Changes in v3:

    _ Merge the patch that adapted the branch tracer to this Api in this patch to
    not break the kernel build.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

29 Dec, 2008

2 commits

  • Impact: refactor the branch tracer

    This patch adapts the branch tracer to the tracing API.

    This is a proof of concept because the branch tracer implements two
    "stat tracing" that were split in two files.

    So I added an option to the branch tracer: stat_all_branch.
    If it is set, then trace_stat will output all of the branches
    entries stats. Otherwise, it will print the annotated branches.

    Its is a kind of quick trick, waiting for a better solution.

    By default, the annotated branches stat are sorted by incorrect branch
    prediction percentage.

    Ie:

    correct incorrect % Function File Line
    ------- --------- - -------- ---- ----
    0 1 100 native_smp_prepare_cpus smpboot.c 1228
    0 1 100 hpet_rtc_timer_reinit hpet.c 1057
    0 18032 100 sched_info_queued sched_stats.h 223
    0 684 100 yield_task_fair sched_fair.c 984
    0 282 100 pre_schedule_rt sched_rt.c 1263
    0 13414 100 sched_info_dequeued sched_stats.h 178
    0 21724 100 sched_info_switch sched_stats.h 270
    0 1 100 get_signal_to_deliver signal.c 1820
    0 8 100 __cancel_work_timer workqueue.c 560
    0 212 100 verify_export_symbols module.c 1509
    0 17 100 __rmqueue_fallback page_alloc.c 793
    0 43 100 clear_page_mlock internal.h 129
    0 124 100 try_to_unmap_anon rmap.c 1021
    0 53 100 try_to_unmap_anon rmap.c 1013
    0 6 100 vma_address rmap.c 232
    0 3301 100 try_to_unmap_file rmap.c 1082
    0 466 100 try_to_unmap_file rmap.c 1077
    0 1 100 mem_cgroup_create memcontrol.c 1090
    0 3 100 inotify_find_update_watch inotify.c 726
    2 30163 99 perf_counter_task_sched_out perf_counter.c 385
    1 2935 99 percpu_free allocpercpu.c 138
    1544 297672 99 dentry_lru_del_init dcache.c 153
    8 1074 99 input_pass_event input.c 86
    1390 76781 98 mapping_unevictable pagemap.h 50
    280 6665 95 pick_next_task_rt sched_rt.c 889
    750 4826 86 next_pidmap pid.c 194
    2 8 80 blocking_notifier_chain_regist notifier.c 220
    36 130 78 ioremap_pte_range ioremap.c 22
    1093 3247 74 IS_ERR err.h 34
    1023 2908 73 sched_slice sched_fair.c 445
    22 60 73 disk_put_part genhd.h 206
    [...]

    It enables a developer to quickly address the source of incorrect branch
    predictions. Note that this sorting would be better with a second sort on
    the number of incorrect predictions.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Impact: rework trace.c to use new event register API

    Almost every ftrace event has to implement its output display in
    trace.c through a different function. Some events did not handle
    all the formats (trace, latency-trace, raw, hex, binary), and
    this method does not scale well.

    This patch converts the format functions to use the event API to
    find the event and and print its format. Currently, we have
    a print function for trace, latency_trace, raw, hex and binary.
    A trace_nop_print is available if the event wants to avoid output
    on a particular format.

    Perhaps other tracers could use this in the future (like mmiotrace and
    function_graph).

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

03 Dec, 2008

1 commit


29 Nov, 2008

1 commit

  • Impact: fix build error on branch tracer

    This should fix a build error reported on alpha in linux-next:

    CC kernel/trace/trace_branch.o
    kernel/trace/trace_branch.c: In function 'probe_likely_condition':
    kernel/trace/trace_branch.c:44: error: implicit declaration of function 'raw_local_irq_save'
    kernel/trace/trace_branch.c:76: error: implicit declaration of function 'raw_local_irq_restore'

    Unfortunately, I can't test it since I don't have any Alpha build environment.

    Reported-by: Alexey Dobriyan
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

23 Nov, 2008

4 commits

  • Impact: fix compiler warning

    The ftrace_pointers used in the branch profiler are constant values.
    They should never change. But the compiler complains when they are
    passed into the debugfs_create_file as a data pointer, because the
    function discards the qualifier.

    This patch typecasts the parameter to debugfs_create_file back to
    a void pointer. To remind the callbacks that they are pointing to
    a constant value, I also modified the callback local pointers to
    be const struct ftrace_pointer * as well.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: feature to profile if statements

    This patch adds a branch profiler for all if () statements.
    The results will be found in:

    /debugfs/tracing/profile_branch

    For example:

    miss hit % Function File Line
    ------- --------- - -------- ---- ----
    0 1 100 x86_64_start_reservations head64.c 127
    0 1 100 copy_bootdata head64.c 69
    1 0 0 x86_64_start_kernel head64.c 111
    32 0 0 set_intr_gate desc.h 319
    1 0 0 reserve_ebda_region head.c 51
    1 0 0 reserve_ebda_region head.c 47
    0 1 100 reserve_ebda_region head.c 42
    0 0 X maxcpus main.c 165

    Miss means the branch was not taken. Hit means the branch was taken.
    The percent is the percentage the branch was taken.

    This adds a significant amount of overhead and should only be used
    by those analyzing their system.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: cleanup on output of branch profiler

    When a branch has not been taken, it does not make sense to show
    a percentage incorrect or hit. This patch changes the behaviour
    to print out a 'X' when the branch has not been executed yet.

    For example:

    correct incorrect % Function File Line
    ------- --------- - -------- ---- ----
    2096 0 0 do_arch_prctl process_64.c 832
    0 0 X do_arch_prctl process_64.c 804
    2604 0 0 IS_ERR err.h 34
    130228 5765 4 __switch_to process_64.c 673
    0 0 X enable_TSC process_64.c 448
    0 0 X disable_TSC process_64.c 431

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: clean up to make one profiler of like and unlikely tracer

    The likely and unlikely profiler prints out the file and line numbers
    of the annotated branches that it is profiling. It shows the number
    of times it was correct or incorrect in its guess. Having two
    different files or sections for that matter to tell us if it was a
    likely or unlikely is pretty pointless. We really only care if
    it was correct or not.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

17 Nov, 2008

1 commit


16 Nov, 2008

2 commits

  • Impact: fix crash when enabling the branch-tracer

    When the branch tracer inserts an event through
    probe_likely_condition(), it calls local_irq_save() and then results
    in a trace recursion.

    local_irq_save() -> trace_hardirqs_off() -> trace_hardirqs_off_caller()
    -> unlikely()

    The trace_branch.c file is protected by DISABLE_BRANCH_PROFILING but
    that doesn't prevent from external call to functions that use
    unlikely().

    My box crashed each time I tried to set this tracer (sudden and hard
    reboot).

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Impact: extend the ->init() method with the ability to fail

    This bring a way to know if the initialization of a tracer successed.
    A tracer must return 0 on success and a traditional error (ie:
    -ENOMEM) if it fails.

    If a tracer fails to init, it is free to print a detailed warn. The
    tracing api will not and switch to a new tracer will just return the
    error from the init callback.

    Note: this will be used for the return tracer.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

13 Nov, 2008

1 commit

  • Impact: File name change of trace_unlikely.c

    The "unlikely" name for the tracer is quite ugly. We renamed all the
    parts of it to "branch" and now it is time to rename the file too.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt