19 Feb, 2009

3 commits

  • Impact: prevent deadlock if ring buffer gets corrupted

    This patch adds a paranoid check to make sure the ring buffer consumer
    does not go into an infinite loop. Since the ring buffer has been set
    to read only, the consumer should not loop for more than the ring buffer
    size. A check is added to make sure the consumer does not loop more than
    the ring buffer size.
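
    The kind of check described above can be sketched in plain C. This is
    an illustration only, not the kernel code: struct sketch_ring and
    sketch_consume_all() are hypothetical names, and the buffer is reduced
    to a head and tail index over a fixed number of entries.

```c
#include <stddef.h>

/* Hypothetical, simplified ring buffer used only for this sketch. */
struct sketch_ring {
    int   *entries;  /* storage (not dereferenced by the consumer below) */
    size_t size;     /* capacity, in entries */
    size_t head;     /* next entry to consume */
    size_t tail;     /* one past the last valid entry */
};

/*
 * Consume entries until the buffer looks empty, but never loop more than
 * `size` times. Returns the number of entries consumed, or -1 if the loop
 * limit was hit, which would indicate a corrupted buffer.
 */
long sketch_consume_all(struct sketch_ring *rb)
{
    size_t loops = 0;
    long consumed = 0;

    while (rb->head != rb->tail) {
        /* Paranoid check: nothing is writing, so `size` loops is the cap. */
        if (loops++ >= rb->size)
            return -1;
        rb->head = (rb->head + 1) % rb->size;
        consumed++;
    }
    return consumed;
}
```

    With a sane tail index the loop ends on its own; with a corrupted tail
    that head can never reach, the function gives up after `size` passes
    instead of spinning forever.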

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Impact: fix output of function tracer to be useful

    The function tracer is pretty useless if KALLSYMS is not configured.
    Unless you are good at reading hex values, the function tracer should
    select the KALLSYMS configuration.

    Also, the dynamic function tracer will fail its self test if KALLSYMS
    is not selected.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Impact: fix to prevent hard lockup on self tests

    If one of the tracers is broken and is constantly filling the ring
    buffer while the test of the ring buffer is running, it will hang
    the box. The reason is that the test is a consumer that will not
    stop till the ring buffer is empty. But if the tracer is broken and
    is constantly producing input to the buffer, this test will never
    end. The result is a lockup of the box.

    This happened when KALLSYMS was not defined and the dynamic ftrace
    test constantly filled the ring buffer, because the filter failed
    and all functions were being traced. Something was being called
    that constantly filled the buffer.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

18 Feb, 2009

1 commit

  • When the function graph tracer is activated, it iterates over the task_list
    to allocate a stack to store the return addresses.

    But the per cpu idle tasks are not iterated by using
    do_each_thread / while_each_thread.

    So we have to iterate on them manually.

    This fixes some weirdness in the traces and many lost traces.
    Examples on two cpus:

    0) Xorg-4287 | 2.906 us | }
    0) Xorg-4287 | 3.965 us | }
    0) Xorg-4287 | 5.302 us | }
    ------------------------------------------
    0) Xorg-4287 => <idle>-0
    ------------------------------------------

    0) <idle>-0 | 2.861 us | }
    0) <idle>-0 | 0.526 us | set_normalized_timespec();
    0) <idle>-0 | 7.201 us | }
    0) <idle>-0 | 8.214 us | }
    0) <idle>-0 | | clockevents_program_event() {
    0) <idle>-0 | | lapic_next_event() {
    0) <idle>-0 | 0.510 us | native_apic_mem_write();
    0) <idle>-0 | 1.546 us | }
    0) <idle>-0 | 2.583 us | }
    0) <idle>-0 | + 12.435 us | }
    0) <idle>-0 | + 13.470 us | }
    0) <idle>-0 | 0.608 us | _spin_unlock_irqrestore();
    0) <idle>-0 | + 23.270 us | }
    0) <idle>-0 | + 24.336 us | }
    0) <idle>-0 | + 25.417 us | }
    0) <idle>-0 | 0.593 us | _spin_unlock();
    0) <idle>-0 | + 41.869 us | }
    0) <idle>-0 | + 42.906 us | }
    0) <idle>-0 | + 95.035 us | }
    0) <idle>-0 | 0.540 us | menu_reflect();
    0) <idle>-0 | ! 100.404 us | }
    0) <idle>-0 | 0.564 us | mce_idle_callback();
    0) <idle>-0 | | enter_idle() {
    0) <idle>-0 | 0.526 us | mce_idle_callback();
    0) <idle>-0 | 1.757 us | }
    0) <idle>-0 | | cpuidle_idle_call() {
    0) <idle>-0 | | menu_select() {
    0) <idle>-0 | 0.525 us | pm_qos_requirement();
    0) <idle>-0 | 0.518 us | tick_nohz_get_sleep_length();
    0) <idle>-0 | 2.621 us | }
    [...]
    1) <idle>-0 | 0.518 us | touch_softlockup_watchdog();
    1) <idle>-0 | + 14.355 us | }
    1) <idle>-0 | + 22.840 us | }
    1) <idle>-0 | + 25.949 us | }
    1) <idle>-0 | | handle_irq() {
    1) <idle>-0 | 0.511 us | irq_to_desc();
    1) <idle>-0 | | handle_edge_irq() {
    1) <idle>-0 | 0.638 us | _spin_lock();
    1) <idle>-0 | | ack_apic_edge() {
    1) <idle>-0 | 0.510 us | irq_to_desc();
    1) <idle>-0 | | move_native_irq() {
    1) <idle>-0 | 0.510 us | irq_to_desc();
    1) <idle>-0 | 1.532 us | }
    1) <idle>-0 | 0.511 us | native_apic_mem_write();
    ------------------------------------------
    1) <idle>-0 => cat-5073
    ------------------------------------------

    1) cat-5073 | 3.731 us | }
    1) cat-5073 | | run_local_timers() {
    1) cat-5073 | 0.533 us | hrtimer_run_queues();
    1) cat-5073 | | raise_softirq() {
    1) cat-5073 | | __raise_softirq_irqoff() {
    1) cat-5073 | | /* nr: 1 */
    1) cat-5073 | 2.718 us | }
    1) cat-5073 | 3.814 us | }

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

16 Feb, 2009

2 commits

  • Impact: cosmetic change in Kconfig menu layout

    This patch was originally suggested by Peter Zijlstra, but it seems to
    have been forgotten.

    CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable
    directly under the Kernel hacking / debugging menu in the kernel
    configuration system. They were present only for x86 and x86_64.

    Other tracers that use the ftrace tracing framework are in their own
    sub-menu. This patch moves the mmiotrace configuration options there.
    Since the Kconfig file, where the tracer menu is, is not architecture
    specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by
    x86/x86_64. CONFIG_MMIOTRACE now depends on it.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Impact: enhances lost events counting in mmiotrace

    The tracing framework, or the ring buffer facility it uses, has a switch
    to stop recording data. When recording is off, the trace events will be
    lost. The framework does not count these, so mmiotrace has to count them
    itself.
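
    A minimal sketch of the idea, assuming a tracer that keeps its own
    counter next to the framework's on/off switch. The names
    sketch_mmio_trace and sketch_log_event() are hypothetical and do not
    come from mmiotrace.

```c
#include <stdbool.h>

/* Hypothetical tracer state used only for this sketch. */
struct sketch_mmio_trace {
    bool          recording;  /* stand-in for the framework's on/off switch */
    unsigned long stored;     /* events that reached the buffer */
    unsigned long lost;       /* events dropped, counted by the tracer itself */
};

/*
 * Try to log one event. When recording is off the event is dropped, and
 * because the framework does not count such drops, the tracer does.
 * Returns true if the event was stored.
 */
bool sketch_log_event(struct sketch_mmio_trace *t)
{
    if (!t->recording) {
        t->lost++;
        return false;
    }
    t->stored++;
    return true;
}
```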

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     

04 Feb, 2009

1 commit

  • "ftrace: use struct pid" commit 978f3a45d9499c7a447ca7615455cefb63d44165
    converted ftrace_pid_trace to "struct pid*".

    But we can't use do_each_pid_task() without rcu_read_lock() even if
    we know the pid itself can't go away (it was pinned in ftrace_pid_write).
    The exiting task can detach itself from this pid at any moment.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

22 Jan, 2009

1 commit


21 Jan, 2009

5 commits

  • Impact: trace max latencies on start of latency tracing

    This patch sets the max latency to zero whenever one of the
    irq variant tracers or the wakeup tracer is set to current tracer.

    Most developers expect to see output when starting up a latency
    tracer. But since the max_latency is already set to max, and
    it takes a latency greater than max_latency to be recorded, there
    is no trace. This is not the expected behavior and has even confused
    me.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: limit ftrace dump output

    Currently ftrace_dump only calls ftrace_kill, which is a fast way
    to prevent the function tracer functions from being called (it just sets
    a flag and clears the function to call, nothing else). It is better
    to also turn off any recording to the ring buffers.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: fix to print out ftrace_dump when expected

    I was debugging a hard race condition, only to find out that
    after I hit the race, my log level was not at a level to show
    KERN_INFO. The time it took to trigger the race was wasted because
    I did not capture the trace.

    Since ftrace_dump is only called from kernel oops (and only when
    it is set in the kernel command line to do so), or when a
    developer adds it to their own local tree, the log level of
    the print should be at KERN_EMERG to make sure the print appears.

    ftrace_dump is not called by a normal user setup, and will not
    add extra unwanted print out to the console. There is no reason
    it should be at KERN_INFO.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: reset struct buffer_page.write when interrupt storm

    If struct buffer_page.write is not reset, any subsequent commit
    will corrupt the ring_buffer:

    static inline void
    rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
    {
            ......
            cpu_buffer->commit_page->commit =
                    cpu_buffer->commit_page->write;
            ......
    }

    When "if (RB_WARN_ON(cpu_buffer, next_page == reader_page))" triggers, the
    ring_buffer is disabled, but some reserved buffers may not have been
    committed, so we need to reset struct buffer_page.write.

    When "if (unlikely(next_page == cpu_buffer->commit_page))" triggers, the
    ring_buffer is still available, and we should not corrupt it.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     
  • Impact: fix a crash while kernel image restore

    When the function graph tracer is running during suspend to disk, some racy
    and dangerous things happen to this tracer.

    The current task will save its registers, including the stack pointer, which
    contains the return address hooked by the tracer. But the current task will
    continue to enter other functions after that to save the memory, and then
    it will store other return addresses and finally lose the old depth that
    matches the return address saved in the old stack (when the registers were saved).

    So on image restore, the code will return to wrong addresses.
    And there are other things: on restore, the task will have its "current"
    pointer overwritten while the registers are restored, switching from one task
    to another. It would be insane to try to trace function graphs at these
    stages.

    This patch makes the function graph tracer listen to power events and
    disable its tracing for the current task (the one that performs the
    hibernation work) during suspend/resume to disk, which makes tracing safe
    during hibernation.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

20 Jan, 2009

1 commit

  • Impact: fix to allow some archs to use the ring buffer

    Commits in the ring buffer are checked by pointer arithmetic.
    If the calculation is incorrect, then the commits will never take
    place and the buffer will simply fill up and report an error.

    Each page in the ring buffer has a small header:

    struct buffer_data_page {
            u64 time_stamp;
            local_t commit;
            unsigned char data[];
    };

    Unfortunately, some of the calculations used sizeof(struct buffer_data_page)
    to know the size of the header. But this is incorrect on some archs,
    where sizeof(struct buffer_data_page) does not equal
    offsetof(struct buffer_data_page, data), and on those archs, the commits
    are never processed.

    This patch replaces the sizeof with offsetof.
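
    The difference can be shown with a userspace stand-in for the header.
    This is only a sketch: struct sketch_data_page mirrors the layout
    above but uses uint64_t and long in place of u64 and local_t, which is
    an assumption about their sizes, and the two helper functions are
    hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct buffer_data_page (types are assumptions). */
struct sketch_data_page {
    uint64_t      time_stamp;
    long          commit;      /* stand-in for local_t */
    unsigned char data[];      /* payload starts here */
};

/* Header size as some calculations computed it: may include tail padding. */
size_t sketch_header_size_sizeof(void)
{
    return sizeof(struct sketch_data_page);
}

/* Header size as the patch computes it: the exact offset of the payload. */
size_t sketch_header_size_offsetof(void)
{
    return offsetof(struct sketch_data_page, data);
}
```

    On an ABI where both fields are 8 bytes the two values agree. On a
    32-bit ABI that aligns 64-bit integers to 8 bytes, the struct is padded
    to a multiple of 8 while data[] starts right after the 4-byte commit
    field, so sizeof is larger than offsetof.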

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

10 Jan, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (31 commits)
    powerpc/oprofile: fix whitespaces in op_model_cell.c
    powerpc/oprofile: IBM CELL: add SPU event profiling support
    powerpc/oprofile: fix cell/pr_util.h
    powerpc/oprofile: IBM CELL: cleanup and restructuring
    oprofile: make new cpu buffer functions part of the api
    oprofile: remove #ifdef CONFIG_OPROFILE_IBS in non-ibs code
    ring_buffer: fix ring_buffer_event_length()
    oprofile: use new data sample format for ibs
    oprofile: add op_cpu_buffer_get_data()
    oprofile: add op_cpu_buffer_add_data()
    oprofile: rework implementation of cpu buffer events
    oprofile: modify op_cpu_buffer_read_entry()
    oprofile: add op_cpu_buffer_write_reserve()
    oprofile: rename variables in add_ibs_begin()
    oprofile: rename add_sample() in cpu_buffer.c
    oprofile: rename variable ibs_allowed to has_ibs in op_model_amd.c
    oprofile: making add_sample_entry() inline
    oprofile: remove backtrace code for ibs
    oprofile: remove unused ibs macro
    oprofile: remove unused components in struct oprofile_cpu_buffer
    ...

    Linus Torvalds
     

08 Jan, 2009

1 commit

    Function ring_buffer_event_length() provides an interface to detect
    the length of data stored in an entry. However, the returned length
    includes offsets that depend on internal usage, which makes it unusable.
    This patch fixes that: ring_buffer_event_length() now returns the
    aligned length that was used in ring_buffer_lock_reserve().
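
    As an illustration of what an "aligned length" means here, the sketch
    below rounds a requested length up to a fixed alignment. The 4-byte
    alignment and the names SKETCH_ALIGNMENT and sketch_aligned_length()
    are assumptions made for the example, not taken from the patch.

```c
#include <stddef.h>

/* Assumed alignment for this sketch; must be a power of two. */
#define SKETCH_ALIGNMENT ((size_t)4)

/* Round len up to the next multiple of SKETCH_ALIGNMENT. */
size_t sketch_aligned_length(size_t len)
{
    return (len + SKETCH_ALIGNMENT - 1) & ~(SKETCH_ALIGNMENT - 1);
}
```

    Under that assumption, a caller that reserved 5 bytes would be told the
    entry is 8 bytes long, which is the space the reservation actually used.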

    Cc: Steven Rostedt
    Signed-off-by: Robert Richter

    Robert Richter
     

01 Jan, 2009

2 commits

  • Impact: Reduce future memory usage, use new cpumask API.

    Since the last patch was created and acked, more old cpumask users
    slipped into kernel/trace.

    Mostly trivial conversions, except struct trace_iterator's "started"
    member becomes a cpumask_var_t.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Impact: Reduce future memory usage, use new cpumask API.

    (Eventually, cpumask_var_t will be allocated based on nr_cpu_ids, not NR_CPUS).

    Convert kernel trace functions to use struct cpumask API:
    1) Use cpumask_copy/cpumask_test_cpu/for_each_cpu.
    2) Use cpumask_var_t and alloc_cpumask_var/free_cpumask_var everywhere.
    3) Use on_each_cpu instead of playing with current->cpus_allowed.

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Acked-by: Steven Rostedt

    Rusty Russell
     

31 Dec, 2008

4 commits

  • Conflicts:

    arch/x86/kernel/io_apic.c

    Rusty Russell
     
  • Removed duplicated #include in kernel/trace/trace.c.

    Signed-off-by: Huang Weiyi
    Signed-off-by: Linus Torvalds

    Huang Weiyi
     
  • * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    oprofile: select RING_BUFFER
    ring_buffer: adding EXPORT_SYMBOLs
    oprofile: fix lost sample counter
    oprofile: remove nr_available_slots()
    oprofile: port to the new ring_buffer
    ring_buffer: add remaining cpu functions to ring_buffer.h
    oprofile: moving cpu_buffer_reset() to cpu_buffer.h
    oprofile: adding cpu_buffer_entries()
    oprofile: adding cpu_buffer_write_commit()
    oprofile: adding cpu buffer r/w access functions
    ftrace: remove unused function arg in trace_iterator_increment()
    ring_buffer: update description for ring_buffer_alloc()
    oprofile: set values to default when creating oprofilefs
    oprofile: implement switch/case in buffer_sync.c
    x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c
    x86/oprofile: reordering IBS code in op_model_amd.c
    oprofile: fix typo
    oprofile: whitspace changes only
    oprofile: update comment for oprofile_add_sample()
    oprofile: comment cleanup

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    hrtimers: fix warning in kernel/hrtimer.c
    x86: make sure we really have an hpet mapping before using it
    x86: enable HPET on Fujitsu u9200
    linux/timex.h: cleanup for userspace
    posix-timers: simplify de_thread()->exit_itimers() path
    posix-timers: check ->it_signal instead of ->it_pid to validate the timer
    posix-timers: use "struct pid*" instead of "struct task_struct*"
    nohz: suppress needless timer reprogramming
    clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
    nohz: no softirq pending warnings for offline cpus
    hrtimer: removing all ur callback modes, fix
    hrtimer: removing all ur callback modes, fix hotplug
    hrtimer: removing all ur callback modes
    x86: correct link to HPET timer specification
    rtc-cmos: export second NVRAM bank

    Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
    manually.

    Linus Torvalds
     

30 Dec, 2008

1 commit


29 Dec, 2008

1 commit

  • …el/git/tip/linux-2.6-tip

    * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
    sched, trace: update trace_sched_wakeup()
    tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
    Revert "x86: disable X86_PTRACE_BTS"
    ring-buffer: prevent false positive warning
    ring-buffer: fix dangling commit race
    ftrace: enable format arguments checking
    x86, bts: memory accounting
    x86, bts: add fork and exit handling
    ftrace: introduce tracing_reset_online_cpus() helper
    tracing: fix warnings in kernel/trace/trace_sched_switch.c
    tracing: fix warning in kernel/trace/trace.c
    tracing/ring-buffer: remove unused ring_buffer size
    trace: fix task state printout
    ftrace: add not to regex on filtering functions
    trace: better use of stack_trace_enabled for boot up code
    trace: add a way to enable or disable the stack tracer
    x86: entry_64 - introduce FTRACE_ frame macro v2
    tracing/ftrace: add the printk-msg-only option
    tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
    x86, bts: correctly report invalid bts records
    ...

    Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
    being already partly merged by the SH merge.

    Linus Torvalds
     

26 Dec, 2008

1 commit


25 Dec, 2008

2 commits


24 Dec, 2008

2 commits

  • Impact: eliminate false WARN_ON message

    If an interrupt goes off after the setting of the local variable
    tail_page and before incrementing the write index of that page,
    the interrupt could push the commit forward to the next page.

    Later a check is made to see if interrupts pushed the buffer around
    the entire ring buffer by comparing the next page to the last committed
    page. This can produce a false positive if the interrupt had pushed
    the commit page forward as stated above.

    Thanks to Jiaying Zhang for finding this race.

    Reported-by: Jiaying Zhang
    Signed-off-by: Steven Rostedt
    Cc:
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: fix stuck trace-buffers

    If an interrupt comes in during rb_set_commit_to_write and
    pushes the tail page forward at just the right time, the commit
    updates will miss the data added by the interrupt. This will
    cause the commit pointer to stop moving forward.

    Thanks to Jiaying Zhang for finding this race.

    Reported-by: Jiaying Zhang
    Signed-off-by: Steven Rostedt
    Cc:
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

19 Dec, 2008

4 commits


18 Dec, 2008

5 commits

  • Impact: remove dead code

    struct ring_buffer.size is not set after the ring_buffer is initialized
    or resized; it is always 0.

    We can use "buffer->pages * PAGE_SIZE" to get the ring_buffer's size.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     
  • Impact: fix occasionally incorrect trace output

    The tracing code has interesting varieties of printing out task state.

    Unfortunately, only one of the instances is correct, as it copies the
    code from sched.c:sched_show_task(). The others are plain wrong, as
    they treat the bitfield as an integer offset into the character
    array. Also, the size check of the character array is wrong, as it
    includes the trailing \0.

    Use a common state decoder inline which does the Right Thing.
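
    A decoder of this kind can be sketched in userspace C. This is not the
    kernel inline: the state string "RSDTtZX" and the names below are
    assumptions for the example. It maps the lowest set bit of the state
    bitfield to a character and leaves the trailing \0 out of the bound.

```c
#include <stddef.h>

/* Assumed state characters; index 0 is "running" (no bit set). */
static const char sketch_state_chars[] = "RSDTtZX";

/*
 * Decode a task state bitfield into one character. The index is
 * 1 + the position of the lowest set bit, or 0 when no bit is set.
 * Using the raw bitfield value as the index would be wrong.
 */
char sketch_state_to_char(unsigned long state)
{
    /* sizeof counts the trailing '\0', so subtract it from the bound. */
    size_t nchars = sizeof(sketch_state_chars) - 1;
    size_t idx = 0;

    if (state) {
        idx = 1;
        while (!(state & 1)) {
            state >>= 1;
            idx++;
        }
    }
    return idx < nchars ? sketch_state_chars[idx] : '?';
}
```

    For example, a state of 4 (bit 2 set) decodes to index 3, not index 4,
    and a bit beyond the end of the string yields '?' rather than the
    terminating \0.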

    Signed-off-by: Thomas Gleixner
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Impact: enhancement

    Ingo Molnar has asked about a way to remove items from the filter
    lists. Currently, you can only add or replace items. The way
    items are added to the list is through opening one of the list
    files (set_ftrace_filter or set_ftrace_notrace) via append.
    If the file is opened for truncate, the list is cleared.

    echo spin_lock > /debug/tracing/set_ftrace_filter

    The above will replace the list with only spin_lock

    echo spin_lock >> /debug/tracing/set_ftrace_filter

    The above will add spin_lock to the list.

    Now this patch adds:

    echo '!spin_lock' >> /debug/tracing/set_ftrace_filter

    This will remove spin_lock from the list.

    The limited glob features of these lists can also be negated.

    echo '!spin_*' >> /debug/tracing/set_ftrace_filter

    This will remove all functions that start with 'spin_'

    Note:

    echo '!spin_*' > /debug/tracing/set_ftrace_filter

    will simply clear out the list (notice the '>' instead of '>>')
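
    The semantics described above can be modelled in a few lines of
    userspace C. This is a sketch under assumptions, not the ftrace
    implementation: the struct and function names are hypothetical, and
    POSIX fnmatch() stands in for the kernel's own limited glob matching.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX 16

/* Hypothetical filter list used only for this sketch. */
struct sketch_filter {
    const char *names[SKETCH_MAX];
    size_t      count;
};

static bool sketch_contains(const struct sketch_filter *f, const char *name)
{
    for (size_t i = 0; i < f->count; i++)
        if (strcmp(f->names[i], name) == 0)
            return true;
    return false;
}

/*
 * Model one write to the filter file. `append` is true for '>>' and false
 * for '>'. `all` lists every traceable function name (nall entries).
 */
void sketch_filter_write(struct sketch_filter *f, const char *pattern,
                         const char *const *all, size_t nall, bool append)
{
    if (!append)
        f->count = 0;  /* '>' truncates the list before anything else */

    if (pattern[0] == '!') {
        /* Negated pattern: keep only the entries that do not match. */
        size_t kept = 0;
        for (size_t i = 0; i < f->count; i++)
            if (fnmatch(pattern + 1, f->names[i], 0) != 0)
                f->names[kept++] = f->names[i];
        f->count = kept;
        return;
    }

    /* Plain pattern: add every matching function not already listed. */
    for (size_t i = 0; i < nall; i++)
        if (fnmatch(pattern, all[i], 0) == 0 &&
            !sketch_contains(f, all[i]) && f->count < SKETCH_MAX)
            f->names[f->count++] = all[i];
}
```

    In this model, writing '!spin_*' with append set removes the matching
    entries, while the same pattern without append first truncates the list
    and then has nothing left to remove, so the list ends up empty.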

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: clean up

    Andrew Morton suggested using the stack_tracer_enabled variable
    to decide whether or not to start stack tracing on bootup.
    This lets us remove the start_stack_trace variable.

    Reported-by: Andrew Morton
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: enhancement to stack tracer

    The stack tracer is currently either on when configured in or
    off when it is not. It cannot be disabled when it is configured in
    (other than by disabling the function tracer that it uses).

    This patch adds a way to enable or disable the stack tracer at
    run time. It defaults off on bootup, but a kernel parameter 'stacktrace'
    has been added to enable it on bootup.

    A new sysctl has been added "kernel.stack_tracer_enabled" to let
    the user enable or disable the stack tracer at run time.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

17 Dec, 2008

2 commits

  • Impact: display ftrace_printk messages "as is"

    By default, ftrace_printk() messages are printed along with other
    information such as the pid, caller, ...
    Sometimes a developer just wants to see the ftrace_printk message "as is",
    without that other information.

    This is done by providing a default-off option called printk-msg-only.
    To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`

    Before the patch:

    <...>-2739 [000] 145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
    <...>-2739 [000] 145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep

    After the patch and the printk-msg-only option enabled:

    I'm an ftrace_printk msg in __might_sleep
    I'm another ftrace_printk msg in __might_sleep

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Impact: prevent a trace recursion

    After some tests with function graph tracer under x86-32, I saw some recursions
    caused by ring_buffer_time_stamp() that calls preempt_enable_no_notrace() which
    calls preempt_schedule() which is traced itself.

    This patch re-enables preemption without rescheduling.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker