06 Nov, 2013

2 commits

  • The original SOFT_DISABLE patches didn't add support for soft disable
    of syscall events; this adds it.

    Add an array of ftrace_event_file pointers indexed by syscall number
    to the trace array and remove the existing enabled bitmaps, which as a
    result are now redundant. The ftrace_event_file structs in turn
    contain the soft disable flags we need for per-syscall soft disable
    accounting.

    Adding ftrace_event_files also means we can remove the USE_CALL_FILTER
    bit, thus enabling multibuffer filter support for syscall events.
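
    A rough sketch of the resulting layout (the struct and field names
    below follow the description above, but treat the details as
    illustrative assumptions rather than the exact patch):

    struct trace_array {
        ...
        /* one entry per syscall, replacing the old enabled bitmaps */
        struct ftrace_event_file *enter_syscall_files[NR_syscalls];
        struct ftrace_event_file *exit_syscall_files[NR_syscalls];
    };

    static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
    {
        struct trace_array *tr = data;
        struct ftrace_event_file *file;
        int syscall_nr = syscall_get_nr(current, regs);

        file = rcu_dereference_raw(tr->enter_syscall_files[syscall_nr]);
        if (!file)
            return;   /* event not enabled in this instance */

        /* per-file soft disable accounting */
        if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &file->flags))
            return;
        ...
    }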

    Link: http://lkml.kernel.org/r/6e72b566e85d8df8042f133efbc6c30e21fb017e.1382620672.git.tom.zanussi@linux.intel.com

    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt

    Tom Zanussi
     
  • The trace event filters are still tied to event calls rather than
    event files, which means you don't get what you'd expect when using
    filters in the multibuffer case:

    Before:

    # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    bytes_alloc > 8192
    # mkdir /sys/kernel/debug/tracing/instances/test1
    # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
    # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    bytes_alloc > 2048
    # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
    bytes_alloc > 2048

    Setting the filter in tracing/instances/test1/events shouldn't affect
    the same event in tracing/events as it does above.

    After:

    # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    bytes_alloc > 8192
    # mkdir /sys/kernel/debug/tracing/instances/test1
    # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
    # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
    bytes_alloc > 8192
    # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
    bytes_alloc > 2048

    We'd like to just move the filter directly from ftrace_event_call to
    ftrace_event_file, but there are a couple cases that don't yet have
    multibuffer support and therefore have to continue using the current
    event_call-based filters. For those cases, a new USE_CALL_FILTER bit
    is added to the event_call flags, whose main purpose is to keep the
    old behavior for those cases until they can be updated with
    multibuffer support; at that point, the USE_CALL_FILTER flag (and the
    new associated call_filter_check_discard() function) can go away.

    The multibuffer support also made filter_current_check_discard()
    redundant, so this change removes that function as well and replaces
    it with filter_check_discard() (or call_filter_check_discard() as
    appropriate).
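
    The resulting pair of helpers can be sketched like this (signatures
    and flag names are assumptions based on the description):

    /* multibuffer-aware path: the filter hangs off the event file */
    static inline int
    filter_check_discard(struct ftrace_event_file *file, void *rec,
                         struct ring_buffer *buffer,
                         struct ring_buffer_event *event)
    {
        if (unlikely(file->flags & FTRACE_EVENT_FL_FILTERED) &&
            !filter_match_preds(file->filter, rec)) {
            ring_buffer_discard_commit(buffer, event);
            return 1;
        }
        return 0;
    }

    /* legacy path for events that still set USE_CALL_FILTER */
    static inline int
    call_filter_check_discard(struct ftrace_event_call *call, void *rec,
                              struct ring_buffer *buffer,
                              struct ring_buffer_event *event)
    {
        if (unlikely(call->flags & TRACE_EVENT_FL_FILTERED) &&
            !filter_match_preds(call->filter, rec)) {
            ring_buffer_discard_commit(buffer, event);
            return 1;
        }
        return 0;
    }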

    Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com

    Signed-off-by: Tom Zanussi
    Signed-off-by: Steven Rostedt

    Tom Zanussi
     

19 Oct, 2013

2 commits

  • The set_graph_notrace filter is analogous to set_ftrace_notrace and
    can be used to eliminate uninteresting parts of the function graph
    trace output. It also works nicely with set_graph_function.

    # cd /sys/kernel/debug/tracing/
    # echo do_page_fault > set_graph_function
    # perf ftrace live true
    2)               |  do_page_fault() {
    2)               |    __do_page_fault() {
    2)   0.381 us    |      down_read_trylock();
    2)   0.055 us    |      __might_sleep();
    2)   0.696 us    |      find_vma();
    2)               |      handle_mm_fault() {
    2)               |        handle_pte_fault() {
    2)               |          __do_fault() {
    2)               |            filemap_fault() {
    2)               |              find_get_page() {
    2)   0.033 us    |                __rcu_read_lock();
    2)   0.035 us    |                __rcu_read_unlock();
    2)   1.696 us    |              }
    2)   0.031 us    |              __might_sleep();
    2)   2.831 us    |            }
    2)               |            _raw_spin_lock() {
    2)   0.046 us    |              add_preempt_count();
    2)   0.841 us    |            }
    2)   0.033 us    |            page_add_file_rmap();
    2)               |            _raw_spin_unlock() {
    2)   0.057 us    |              sub_preempt_count();
    2)   0.568 us    |            }
    2)               |            unlock_page() {
    2)   0.084 us    |              page_waitqueue();
    2)   0.126 us    |              __wake_up_bit();
    2)   1.117 us    |            }
    2)   7.729 us    |          }
    2)   8.397 us    |        }
    2)   8.956 us    |      }
    2)   0.085 us    |      up_read();
    2) + 12.745 us   |    }
    2) + 13.401 us   |  }
    ...

    # echo handle_mm_fault > set_graph_notrace
    # perf ftrace live true
    1)               |  do_page_fault() {
    1)               |    __do_page_fault() {
    1)   0.205 us    |      down_read_trylock();
    1)   0.041 us    |      __might_sleep();
    1)   0.344 us    |      find_vma();
    1)   0.069 us    |      up_read();
    1)   4.692 us    |    }
    1)   5.311 us    |  }
    ...
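
    Internally the filter can be checked with a helper along these lines
    (a sketch assuming array names from the series; the real patch also
    does extra return-stack bookkeeping so that the children of a matched
    function are suppressed too):

    static inline int ftrace_graph_notrace_addr(unsigned long addr)
    {
        int i;

        if (!ftrace_graph_notrace_count)
            return 0;

        for (i = 0; i < ftrace_graph_notrace_count; i++) {
            /* suppress this function and everything it calls */
            if (addr == ftrace_graph_notrace_funcs[i])
                return 1;
        }
        return 0;
    }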

    Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org

    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Namhyung Kim
     
  • The ftrace_graph_filter_enabled flag means that the user has set a
    function filter, which is exactly what ftrace_graph_count > 0 already
    tells us, so the separate flag is redundant.
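
    In other words, call sites can trade the flag for a count check
    (illustrative):

        /* before */
        if (!ftrace_graph_filter_enabled)
            return 1;

        /* after: the count alone carries the same information */
        if (!ftrace_graph_count)
            return 1;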

    Link: http://lkml.kernel.org/r/1381739066-7531-2-git-send-email-namhyung@kernel.org

    Signed-off-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Namhyung Kim
     

10 Sep, 2013

1 commit

  • Pull tracing updates from Steven Rostedt:
    "Not much changes for the 3.12 merge window. The major tracing changes
    are still in flux, and will have to wait for 3.13.

    The changes for 3.12 are mostly clean ups and minor fixes.

    H Peter Anvin added a check to x86_32 static function tracing that
    helps a small segment of the kernel community.

    Oleg Nesterov had a few changes from 3.11, but they were mostly clean
    ups and not worth pushing in the -rc time frame.

    Li Zefan had a small clean up, annotating a raw_init with __init.

    I fixed a slight race in updating function callbacks, but the race is
    so small and the bug that happens when it occurs is so minor it's not
    even worth pushing to stable.

    The only real enhancement is from Alexander Z Lam, who made the
    tracing_cpumask work for trace buffer instances, instead of them all
    sharing a global cpumask"

    * tag 'trace-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    ftrace/rcu: Do not trace debug_lockdep_rcu_enabled()
    x86-32, ftrace: Fix static ftrace when early microcode is enabled
    ftrace: Fix a slight race in modifying what function callback gets traced
    tracing: Make tracing_cpumask available for all instances
    tracing: Kill the !CONFIG_MODULES code in trace_events.c
    tracing: Don't pass file_operations array to event_create_dir()
    tracing: Kill trace_create_file_ops() and friends
    tracing/syscalls: Annotate raw_init function with __init

    Linus Torvalds
     

03 Sep, 2013

1 commit

  • …/linux-rcu into core/rcu

    Pull RCU updates from Paul E. McKenney:

    "
    * Update RCU documentation. These were posted to LKML at
    https://lkml.org/lkml/2013/8/19/611.

    * Miscellaneous fixes. These were posted to LKML at
    https://lkml.org/lkml/2013/8/19/619.

    * Full-system idle detection. This is for use by Frederic
    Weisbecker's adaptive-ticks mechanism. Its purpose is
    to allow the timekeeping CPU to shut off its tick when
    all other CPUs are idle. These were posted to LKML at
    https://lkml.org/lkml/2013/8/19/648.

    * Improve rcutorture test coverage. These were posted to LKML at
    https://lkml.org/lkml/2013/8/19/675.
    "

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

23 Aug, 2013

1 commit

  • Allow tracer instances to disable tracing by cpu by moving
    the static global tracing_cpumask into trace_array.
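
    A sketch of the move (field placement is an assumption based on the
    description):

    struct trace_array {
        ...
        /* was: a single static cpumask_var_t in trace.c */
        cpumask_var_t tracing_cpumask;
    };

    Each instance's tracing_cpumask file then reads and writes
    tr->tracing_cpumask rather than the former global.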

    Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git.azl@google.com

    Cc: Vaibhav Nagarnaik
    Cc: David Sharp
    Cc: Alexander Z Lam
    Signed-off-by: Alexander Z Lam
    Signed-off-by: Steven Rostedt

    Alexander Z Lam
     

27 Jul, 2013

1 commit

  • There are several tracepoints (mostly in RCU) that reference a string
    pointer and use the print format "%s" to display a string that
    exists in the kernel, instead of copying the actual string to the
    ring buffer (saving time and ring buffer space).

    But this has an issue with userspace tools that read the binary
    buffers: they have the address of the string but no access to the
    string itself. The end result is output that looks like:

    rcu_dyntick: ffffffff818adeaa 1 0
    rcu_dyntick: ffffffff818adeb5 0 140000000000000
    rcu_dyntick: ffffffff818adeb5 0 140000000000000
    rcu_utilization: ffffffff8184333b
    rcu_utilization: ffffffff8184333b

    The above is pretty useless when read by the userspace tools. Ideally
    we would want something that looks like this:

    rcu_dyntick: Start 1 0
    rcu_dyntick: End 0 140000000000000
    rcu_dyntick: Start 140000000000000 0
    rcu_callback: rcu_preempt rhp=0xffff880037aff710 func=put_cred_rcu 0/4
    rcu_callback: rcu_preempt rhp=0xffff880078961980 func=file_free_rcu 0/5
    rcu_dyntick: End 0 1

    The trace_printk() code, which likewise stores only the address of the
    format string instead of recording the string in the buffer itself,
    exports the mapping of kernel addresses to format strings via the
    printk_formats file in the debugfs tracing directory.

    The tracepoint strings can use this same method and output the format
    to the same file, and the userspace tools will be able to decipher
    the address without any modification.

    The tracepoint strings need their own section to save the strings,
    because anything placed within the trace_printk section will cause the
    trace_printk() buffers to be allocated. As trace_printk() is only
    used for debugging and should never remain in a production kernel, we
    can not reuse the trace_printk sections.

    Add a new tracepoint_str section that will also be examined to produce
    the output of the printk_formats file.
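
    The interface this ends up providing looks roughly like the following
    (a sketch; the exact macro names are assumptions here):

    /* place the string pointer in the new section */
    #define __tracepoint_string \
        __attribute__((section("__tracepoint_str")))

    #define tracepoint_string(str)                                   \
        ({                                                           \
            static const char *___tp_str __tracepoint_string = str; \
            ___tp_str;                                               \
        })

    A tracepoint then records tracepoint_string("Start") instead of a
    bare string literal, and the address-to-string mapping shows up in
    printk_formats for userspace tools to resolve.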

    Cc: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

24 Jul, 2013

1 commit

  • After the previous changes trace_array_cpu->trace_cpu and
    trace_array->trace_cpu have become write-only. Remove these members
    and kill "struct trace_cpu" as well.

    As a side effect this also removes memset(per_cpu_memory, 0).
    It was not needed, alloc_percpu() returns zero-filled memory.

    Link: http://lkml.kernel.org/r/20130723152613.GA23741@redhat.com

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     

19 Jul, 2013

2 commits

  • Trivial. trace_array->waiter has no users since 6eaaa5d5
    "tracing/core: use appropriate waiting on trace_pipe".

    Link: http://lkml.kernel.org/r/20130719142036.GA1594@redhat.com

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     
  • The selftests for the function and function graph tracers are defined
    as __init, as they are only executed at boot up. The "tracer" structs
    that are associated with those tracers are not set up as __init, as
    they are used after boot. To stop section mismatch warnings, those
    structures need to be annotated with __ref_data.

    Currently, the tracer structures are defined as __read_mostly, as they
    do not really change. But in the future they should be converted to
    const, though that will take a little work because they have a "next"
    pointer that gets updated when they are registered. That will have to
    wait till the next major release.

    Link: http://lkml.kernel.org/r/1373596735.17876.84.camel@gandalf.local.home

    Reported-by: kbuild test robot
    Reported-by: Chen Gang
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

12 Jul, 2013

1 commit

  • Pull tracing changes from Steven Rostedt:
    "The majority of the changes here are cleanups for the large changes
    that were added to 3.10, which includes several bug fixes that have
    been marked for stable.

    As for new features, there were a few, but nothing to write to LWN
    about. These include:

    New function triggers called "dump" and "cpudump" that will cause
    ftrace to dump its buffer to the console when the function is called.
    The difference between them is that "dump" will dump the entire
    contents of the ftrace buffer, whereas "cpudump" will only dump the
    contents of the ftrace buffer for the CPU that called the function.

    Another small enhancement is a new sysctl switch called
    "traceoff_on_warning" which, when enabled, will disable tracing if any
    WARN_ON() is triggered. This is useful if you want to debug what
    caused a warning and do not want to risk losing your trace data to the
    ring buffer overwriting it before you can disable tracing. There's
    also a kernel command line option of the same name that enables this
    at boot up"

    * tag 'trace-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (34 commits)
    tracing: Make tracing_open_generic_{tr,tc}() static
    tracing: Remove ftrace() function
    tracing: Remove TRACE_EVENT_TYPE enum definition
    tracing: Make tracer_tracing_{off,on,is_on}() static
    tracing: Fix irqs-off tag display in syscall tracing
    uprobes: Fix return value in error handling path
    tracing: Fix race between deleting buffer and setting events
    tracing: Add trace_array_get/put() to event handling
    tracing: Get trace_array ref counts when accessing trace files
    tracing: Add trace_array_get/put() to handle instance refs better
    tracing: Protect ftrace_trace_arrays list in trace_events.c
    tracing: Make trace_marker use the correct per-instance buffer
    ftrace: Do not run selftest if command line parameter is set
    tracing/kprobes: Don't pass addr=ip to perf_trace_buf_submit()
    tracing: Use flag buffer_disabled for irqsoff tracer
    tracing/kprobes: Turn trace_probe->files into list_head
    tracing: Fix disabling of soft disable
    tracing: Add missing syscall_metadata comment
    tracing: Simplify code for showing of soft disabled flag
    tracing/kprobes: Kill probe_enable_lock
    ...

    Linus Torvalds
     

03 Jul, 2013

3 commits

  • The only caller of the function ftrace(...) was removed a long time
    ago, so remove the function body as well.

    Link: http://lkml.kernel.org/r/1365564393-10972-10-git-send-email-jovi.zhangwei@huawei.com

    Signed-off-by: zhangwei(Jovi)
    Signed-off-by: Steven Rostedt

    zhangwei(Jovi)
     
  • The TRACE_EVENT_TYPE enum is not used at present; remove it.

    Link: http://lkml.kernel.org/r/1365564393-10972-8-git-send-email-jovi.zhangwei@huawei.com

    Signed-off-by: zhangwei(Jovi)
    Signed-off-by: Steven Rostedt

    zhangwei(Jovi)
     
  • Commit a695cb58162 "tracing: Prevent deleting instances when they are being read"
    tried to fix a race between deleting a trace instance and reading contents
    of a trace file. But it wasn't good enough. The following could crash the kernel:

    # cd /sys/kernel/debug/tracing/instances
    # ( while :; do mkdir foo; rmdir foo; done ) &
    # ( while :; do echo 1 > foo/events/sched/sched_switch 2> /dev/null; done ) &

    Luckily this can only be done by the root user, but it should be fixed
    regardless.

    The problem is that the file can be deleted after the event file has
    been opened for writing, but before the enabling happens.

    The solution is to make sure the trace_array is available before
    succeeding in opening for write, and to increment the ref counter
    while it is open.

    Now the instance can be deleted when the events are writing to the buffer,
    but the deletion of the instance will disable all events before the instance
    is actually deleted.
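
    The open path can be sketched like this (simplified; helper names and
    error handling here are assumptions):

    static int ftrace_event_set_open(struct inode *inode, struct file *filp)
    {
        struct trace_array *tr = inode->i_private;
        int ret;

        /* fail the open if the instance is already going away */
        if (trace_array_get(tr) < 0)
            return -ENODEV;

        ret = ...;  /* the normal open work */
        if (ret < 0)
            trace_array_put(tr);  /* drop the ref on failure */
        return ret;
    }

    The matching release handler drops the reference, so an instance with
    open event files cannot be torn down underneath them.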

    Cc: stable@vger.kernel.org # 3.10
    Reported-by: Alexander Lam
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

02 Jul, 2013

2 commits

  • There are multiple places where the ftrace_trace_arrays list is accessed in
    trace_events.c without the trace_types_lock held.
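
    The fix follows the usual locking pattern (a sketch):

    struct trace_array *tr;

    mutex_lock(&trace_types_lock);
    list_for_each_entry(tr, &ftrace_trace_arrays, list) {
        /* ... walk or modify the instance list ... */
    }
    mutex_unlock(&trace_types_lock);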

    Link: http://lkml.kernel.org/r/1372732674-22726-1-git-send-email-azl@google.com

    Cc: Vaibhav Nagarnaik
    Cc: David Sharp
    Cc: Alexander Z Lam
    Cc: stable@vger.kernel.org # 3.10
    Signed-off-by: Alexander Z Lam
    Signed-off-by: Steven Rostedt

    Alexander Z Lam
     
  • If the kernel command line ftrace filter parameters are set
    (ftrace_filter or ftrace_notrace), force the function self test to
    pass, with a warning why it was forced.

    If the user adds a filter to the kernel command line, it is assumed
    that they know what they are doing, and the self test should just not
    run instead of failing (which disables function tracing) or clearing
    the filter, as that will probably annoy the user.

    If the user wants the selftest to run, the warning message will tell
    them why it did not.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

12 Jun, 2013

1 commit

  • The output formats of the x86-tsc and counter clocks should be raw,
    but after applying commit 2b6080f28c7cc3efc8625ab71495aae89aeb63a0 the
    format was changed to nanoseconds, because the global variable
    trace_clock_id was used. When multiple buffers are used, the clock_id
    of each sub-buffer should be used instead. This patch therefore uses
    tr->clock_id in place of the global variable trace_clock_id.

    [ Basically, this fixes a regression where the multibuffer code changed the
    trace_clock file to update tr->clock_id but the traces still used the old
    global trace_clock_id variable, negating the file's effect. The global
    trace_clock_id variable is obsolete and removed. - SR ]
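
    A sketch of the change (the in_ns flag on the trace_clocks table is
    assumed from context):

        /* was: trace_clocks[trace_clock_id].in_ns */
        if (trace_clocks[tr->clock_id].in_ns)
            iter->iter_flags |= TRACE_FILE_TIME_IN_NS;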

    Link: http://lkml.kernel.org/r/20130423013239.22334.7394.stgit@yunodevel

    Signed-off-by: Yoshihiro YUNOMAE
    Signed-off-by: Steven Rostedt

    Yoshihiro YUNOMAE
     

30 Apr, 2013

1 commit

  • Pull perf updates from Ingo Molnar:
    "Features:

    - Add "uretprobes" - an optimization to uprobes, like kretprobes are
    an optimization to kprobes. "perf probe -x file sym%return" now
    works like kretprobes. By Oleg Nesterov.

    - Introduce per core aggregation in 'perf stat', from Stephane
    Eranian.

    - Add memory profiling via PEBS, from Stephane Eranian.

    - Event group view for 'annotate' in --stdio, --tui and --gtk, from
    Namhyung Kim.

    - Add support for AMD NB and L2I "uncore" counters, by Jacob Shin.

    - Add Ivy Bridge-EP uncore support, by Zheng Yan

    - IBM zEnterprise EC12 oprofile support patchlet from Robert Richter.

    - Add perf test entries for checking breakpoint overflow signal
    handler issues, from Jiri Olsa.

    - Add perf test entry for checking the number of EXIT events, from
    Namhyung Kim.

    - Add perf test entries for checking --cpu in record and stat, from
    Jiri Olsa.

    - Introduce perf stat --repeat forever, from Frederik Deweerdt.

    - Add --no-demangle to report/top, from Namhyung Kim.

    - PowerPC fixes plus a couple of cleanups/optimizations in uprobes
    and trace_uprobes, by Oleg Nesterov.

    Various fixes and refactorings:

    - Fix dependency of the python binding wrt libtraceevent, from
    Naohiro Aota.

    - Simplify some perf_evlist methods and to allow 'stat' to share code
    with 'record' and 'trace', by Arnaldo Carvalho de Melo.

    - Remove dead code related to libtraceevent integration, from
    Namhyung Kim.

    - Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf
    sched lat' back working, by Arnaldo Carvalho de Melo

    - We don't use Newt anymore, just plain libslang, by Arnaldo Carvalho
    de Melo.

    - Kill a bunch of die() calls, from Namhyung Kim.

    - Fix build on non-glibc systems due to libio.h absence, from Cody P
    Schafer.

    - Remove some perf_session and tracing dead code, from David Ahern.

    - Honor parallel jobs, fix from Borislav Petkov

    - Introduce tools/lib/lk library, initially just removing duplication
    among tools/perf and tools/vm, from Borislav Petkov.

    ... and many more I missed to list, see the shortlog and git log for
    more details."

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (136 commits)
    perf/x86/intel/P4: Robistify P4 PMU types
    perf/x86/amd: Fix AMD NB and L2I "uncore" support
    perf/x86/amd: Remove old-style NB counter support from perf_event_amd.c
    perf/x86: Check all MSRs before passing hw check
    perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
    perf/x86/intel: Add Ivy Bridge-EP uncore support
    perf/x86/intel: Fix SNB-EP CBO and PCU uncore PMU filter management
    perf/x86: Avoid kfree() in CPU_{STARTING,DYING}
    uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty
    uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit()
    uprobes/tracing: Change create_trace_uprobe() to support uretprobes
    uprobes/tracing: Make seq_printf() code uretprobe-friendly
    uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly
    uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly
    uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher()
    uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers
    uprobes/tracing: Generalize struct uprobe_trace_entry_head
    uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls
    uprobes/tracing: Kill the pointless seq_print_ip_sym() call
    uprobes/tracing: Kill the pointless task_pt_regs() calls
    ...

    Linus Torvalds
     

13 Apr, 2013

1 commit

  • struct uprobe_trace_entry_head has a single member for reporting,
    "unsigned long ip". If we want to support uretprobes we need to
    create another struct which has "func" and "ret_ip" and duplicate
    a lot of functions, like trace_kprobe.c does.

    To avoid this copy-and-paste horror we turn ->ip into ->vaddr[]
    and add a couple of trivial helpers to calculate the sizeof/data
    offsets. This uglifies the code a bit, but it allows us to avoid a lot
    more complications later, when we add the support for ret-probes.
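
    The resulting layout and helpers look roughly like this (a sketch
    based on the description; details may differ from the actual patch):

    struct uprobe_trace_entry_head {
        struct trace_entry ent;
        unsigned long      vaddr[];
    };

    /* entry probes record one address (ip); ret-probes will record
     * two (func and ret_ip) */
    #define SIZEOF_TRACE_ENTRY(is_return)                   \
        (sizeof(struct uprobe_trace_entry_head) +           \
         sizeof(unsigned long) * ((is_return) ? 2 : 1))

    #define DATAOF_TRACE_ENTRY(entry, is_return)            \
        ((void *)(entry) + SIZEOF_TRACE_ENTRY(is_return))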

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Tested-by: Anton Arapov

    Oleg Nesterov
     

16 Mar, 2013

1 commit

  • By moving find_event_field() and trace_find_field() into trace_events.c,
    the ftrace_common_fields list and trace_get_fields() can become local to
    the trace_events.c file.

    find_event_field() is renamed to trace_find_event_field() to conform to
    the tracing global function names.

    Link: http://lkml.kernel.org/r/513D8426.9070109@huawei.com

    Signed-off-by: zhangwei(Jovi)
    [ rostedt: Modified trace_find_field() to trace_find_event_field() ]
    Signed-off-by: Steven Rostedt

    zhangwei(Jovi)
     

15 Mar, 2013

18 commits

  • Currently, the only way to stop the latency tracers from doing function
    tracing is to fully disable the function tracer from the proc file
    system:

    echo 0 > /proc/sys/kernel/ftrace_enabled

    This is a big hammer approach as it disables function tracing for
    all users. This includes kprobes, perf, stack tracer, etc.

    Instead, create a function-trace option that the latency tracers can
    check to determine whether they should enable function tracing.
    This option can be set or cleared even while the tracer is active,
    and the tracers will disable or enable function tracing depending
    on how the option was set.

    Instead of using the proc file, disable latency function tracing with

    echo 0 > /debug/tracing/options/function-trace

    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Clark Williams
    Cc: John Kacur
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • There are a few places where ftrace uses trace_printk() for internal
    use, but this requires context (normal, softirq, irq, NMI) buffers
    to keep things lockless. But trace_puts() does not, as it can
    write the string directly into the ring buffer. Make an internal
    helper for trace_puts() and have the internal functions use that.

    This way the extra context buffers are not used.
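
    A sketch of the helper (simplified from the description; the internal
    reserve/commit calls are assumptions):

    int __trace_puts(unsigned long ip, const char *str, int size)
    {
        struct ring_buffer_event *event;
        struct print_entry *entry;
        int alloc = sizeof(*entry) + size + 1;

        /* write the string straight into the ring buffer: no
         * per-context (normal/softirq/irq/NMI) buffers needed */
        event = trace_buffer_lock_reserve(global_trace.trace_buffer.buffer,
                                          TRACE_PRINT, alloc, 0, 0);
        if (!event)
            return 0;

        entry = ring_buffer_event_data(event);
        entry->ip = ip;
        memcpy(&entry->buf, str, size);
        entry->buf[size] = '\0';

        __buffer_unlock_commit(global_trace.trace_buffer.buffer, event);
        return size;
    }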

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The trace_printk() is extremely fast and is very handy as it can be
    used in any context (including NMIs!). But it still requires scanning
    the fmt string to parse the args. Even trace_bprintk() requires
    a scan to know what args will be saved, although it doesn't copy the
    format string itself.

    Quite often trace_printk() has no args at all, and wastes cpu cycles
    scanning the fmt string anyway.

    Adding trace_puts() allows the developer to use an even faster
    tracing method that only saves the pointer to the string in the
    ring buffer without doing any format parsing at all. This will
    help remove even more of the "Heisenbug" effect, when debugging.

    Also fixed up the F_printk()s for the ftrace internal bprint and print events.
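
    The user-facing side ends up as a macro roughly like this (a sketch:
    constant strings store only a pointer, runtime strings are copied in
    full):

    #define trace_puts(str) ({                                        \
        static const char *trace_printk_fmt                           \
            __attribute__((section("__trace_printk_fmt"))) =          \
            __builtin_constant_p(str) ? str : NULL;                   \
                                                                      \
        if (__builtin_constant_p(str))                                \
            __trace_bputs(_THIS_IP_, trace_printk_fmt);               \
        else                                                          \
            __trace_puts(_THIS_IP_, str, strlen(str));                \
    })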

    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • If debugging the kernel, and the developer wants to use
    tracing_snapshot() in places where tracing_snapshot_alloc() may
    be difficult (or more likely, the developer is lazy and doesn't
    want to bother with tracing_snapshot_alloc() at all), then adding

    alloc_snapshot

    to the kernel command line parameter will tell ftrace to allocate
    the snapshot buffer (if configured) when it allocates the main
    tracing buffer.

    I also noticed that ring_buffer_expanded and tracing_selftest_disabled
    had inconsistent use of boolean "true" and "false" with "0" and "1".
    I cleaned that up too.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Add a ref count to the trace_array structure and prevent removal
    of instances that have open descriptors.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The snapshot buffer belongs to the trace array, not the tracer that is
    running. The trace array should be the data structure that keeps track
    of whether or not the snapshot buffer is allocated, not the tracer
    descriptor. Having the trace array keep track of it makes modifications
    much easier.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Currently, the way the latency tracers and snapshot feature works
    is to have a separate trace_array called "max_tr" that holds the
    snapshot buffer. For latency tracers, this snapshot buffer is used
    to swap the running buffer with this buffer to save the current max
    latency.

    The only items the max_tr really needs are a copy of the buffer
    itself, the per_cpu data pointers, the time_start timestamp that states
    when the max latency was triggered, and the cpu that the max latency
    was triggered on. All other fields in trace_array are unused by the
    max_tr, making the max_tr mostly bloat.

    This change removes the max_tr completely, and adds a new structure
    called trace_buffer, that holds the buffer pointer, the per_cpu data
    pointers, the time_start timestamp, and the cpu where the latency occurred.

    The trace_array now has two trace_buffers, one for the normal trace and
    one for the max trace or snapshot. By doing this, not only do we remove
    the bloat from the max_tr, but trace instances can now use their own
    snapshot feature, instead of only the top level global_trace having
    snapshots and latency tracers for itself.
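
    A sketch of the new structure and its use (the field list follows the
    description above; exact details are assumptions):

    struct trace_buffer {
        struct trace_array              *tr;
        struct ring_buffer              *buffer;
        struct trace_array_cpu __percpu *data;
        u64                              time_start; /* when max hit */
        int                              cpu;        /* where it hit */
    };

    struct trace_array {
        ...
        struct trace_buffer trace_buffer;   /* normal trace */
    #ifdef CONFIG_TRACER_MAX_TRACE
        struct trace_buffer max_buffer;     /* snapshot / max latency */
    #endif
    };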

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Currently we do not know what buffer a module event was enabled in.
    On unload, it is safest to clear all buffer instances, not just the
    top level buffer.

    Todo: Clear only the buffer that the event was used in. The
    infrastructure is there to do this, but it makes the code a bit
    more complex. Let's get the current code vetted before we add that.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • With the conversion of the data array to per cpu, sparse now complains
    about the use of per_cpu_ptr() on the variable. But the variable is
    allocated with alloc_percpu() and is fine to use. Since the structure
    that contains the data variable does not annotate it as per cpu,
    sparse gives out a lot of false warnings.
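
    The fix is then presumably just the annotation (sketch):

        /* tell sparse this is a per-cpu pointer */
        struct trace_array_cpu __percpu *data;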

    Reported-by: Fengguang Wu
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The names used to display the field and type in the event format
    files are copied, as well as the system name that is displayed.

    All these names are created from constant values passed in.
    If one of these values were to be removed by a module, the module
    would also be required to remove any event it created.

    By using the strings directly, we can save over 100K of memory.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add a method to the hijacked dentry descriptor of the
    "instances" directory to allow rmdir to remove an
    instance of a multibuffer.

    Example:

    cd /debug/tracing/instances
    mkdir hello
    ls
    hello/
    rmdir hello
    ls

    Like the mkdir method, the i_mutex is dropped for the instances
    directory. The instances directory is created at boot up and can
    not be renamed or removed. The trace_types_lock mutex is used to
    synchronize adding and removing of instances.

    I've run several stress tests with different threads trying to
    create and delete directories of the same name, and it has stood
    up fine.

    Cc: Al Viro
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add the interface ("instances" directory) to add multiple buffers
    to ftrace. To create a new instance, simply do a mkdir in the
    instances directory:

    This will create a directory with the following:

    # cd instances
    # mkdir foo
    # ls foo
    buffer_size_kb free_buffer trace_clock trace_pipe
    buffer_total_size_kb set_event trace_marker tracing_enabled
    events/ trace trace_options tracing_on

    Currently only events are able to be set, and there isn't a way
    to delete a buffer when one is created (yet).

    Note, the i_mutex lock is dropped from the parent "instances"
    directory during the mkdir operation. As the "instances" directory
    can not be renamed or deleted (created on boot), I do not see
    any harm in dropping the lock. The creation of the sub directories
    is protected by trace_types_lock mutex, which only lets one
    instance get into the code path at a time. If two tasks try to
    create or delete directories of the same name, only one will occur
    and the other will fail with -EEXIST.

    Cc: Al Viro
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Currently the syscall events record into the global buffer. But if
    multiple buffers are in place, then we need to have syscall events
    record in the proper buffers.

    By adding descriptors to pass to the syscall event functions, the
    syscall events can now record into the buffers that have been assigned
    to them (one event may be applied to multiple buffers).

    This will allow tracing high volume syscalls along with seldom occurring
    syscalls without losing the seldom syscall events.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The global and max-tr currently use static per_cpu arrays for the CPU
    data descriptors. But newly allocated trace_arrays need dynamically
    allocated per_cpu arrays. Instead of using the static arrays, switch
    the global and max-tr to use allocated data.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The global_trace variable in kernel/trace/trace.c has been kept 'static' and
    local to that file so that it would not be used too much outside of that
    file. This has paid off, even though there were lots of changes to make
    the trace_array structure more generic (not depending on global_trace).

    Removal of a lot of direct usages of global_trace is needed to be able to
    create more trace_arrays such that we can add multiple buffers.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Both RING_BUFFER_ALL_CPUS and TRACE_PIPE_ALL_CPU are defined as
    -1 and used to say that all the ring buffers are to be modified
    or read (instead of just a single cpu, which would be >= 0).

    There's no reason to keep TRACE_PIPE_ALL_CPU, as it has also come to
    be used for more than what it was created for. Now that the ring
    buffer code has added a generic RING_BUFFER_ALL_CPUS define, we can
    clean up the trace code to use that instead and remove the
    TRACE_PIPE_ALL_CPU macro.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The trace events for ftrace are all defined via global variables.
    The arrays of events and event systems are linked to a global list.
    This prevents multiple users of the event system from independently
    choosing what to enable and what not to.

    By adding descriptors to represent the event/file relation, as well
    as to which trace_array descriptor they are associated with, allows
    for more than one set of events to be defined. Once the trace events
    files have a link between the trace event and the trace_array they
    are associated with, we can create multiple trace_arrays that can
    record separate events in separate buffers.
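
    The new descriptor ties an event to an instance, roughly (a sketch;
    the field list is an assumption based on the description):

    struct ftrace_event_file {
        struct list_head          list;        /* on the tr's file list */
        struct ftrace_event_call *event_call;  /* the event itself */
        struct dentry            *dir;         /* debugfs dir for event */
        struct trace_array       *tr;          /* owning trace array */
        struct ftrace_subsystem_dir *system;   /* subsystem dir */
    };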

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The latency tracers require the buffers to be in overwrite mode,
    otherwise they get screwed up. Force the buffers to stay in overwrite
    mode when latency tracers are enabled.

    Added a flag_changed() method to the tracer structure to allow
    the tracers to see what flags are being changed, and also to be able
    to prevent the change from happening.
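
    A sketch of the callback (signature assumed from the description):

    struct tracer {
        ...
        /* return an error to veto a flag change while active */
        int (*flag_changed)(struct tracer *tracer, u32 mask, int set);
    };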

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

31 Jan, 2013

1 commit

  • Ftrace has a snapshot feature available from kernel space, and the
    latency tracers (e.g. irqsoff) are using it. This patch enables
    user applications to take a snapshot via debugfs.

    Add "snapshot" debugfs file in "tracing" directory.

    snapshot:
    This is used to take a snapshot and to read the output of the
    snapshot.

    # echo 1 > snapshot

    This will allocate the spare buffer for snapshot (if it is
    not allocated), and take a snapshot.

    # cat snapshot

    This will show contents of the snapshot.

    # echo 0 > snapshot

    This will free the snapshot if it is allocated.

    Any other positive values will clear the snapshot contents if
    the snapshot is allocated, or return EINVAL if it is not allocated.

    Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia

    Cc: Jiri Olsa
    Cc: David Sharp
    Signed-off-by: Hiraku Toyooka
    [
    Fixed irqsoff selftest and also a conflict with a change
    that fixes the update_max_tr.
    ]
    Signed-off-by: Steven Rostedt

    Hiraku Toyooka