14 Oct, 2008

40 commits

  • Make tracepoints use rcu sched. (cleanup)

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     
  • unregister bug:

    Code using markers typically calls marker_probe_unregister()
    and then destroys the data that marker_probe_func needs (or
    unloads its module). This is a bug when the corresponding
    marker_probe_func is still running on other CPUs, because
    it is using data that is being, or has been, destroyed.

    We should call synchronize_sched() after marker_update_probes().

    reenter bug:

    marker_probe_register(), marker_probe_unregister() and
    marker_probe_unregister_private_data() are not reentrant-safe
    functions. These three functions release markers_mutex and then
    acquire it again and do "entry->oldptr = old; ...", but entry->oldptr
    may still be in use, because these three functions can be re-entered
    while markers_mutex is released.

    We use synchronize_sched() instead of call_rcu_sched() to fix
    this bug. Alternatively we could do:
    "
    if (entry->rcu_pending)
            rcu_barrier_sched();
    "
    after acquiring markers_mutex again, but synchronize_sched()
    is better and simpler, since these three functions are not on a
    critical path.

    Signed-off-by: Lai Jiangshan
    Cc: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     
  • With latest -tip I get this bug:

    [ 49.439988] in_atomic():0, irqs_disabled():1
    [ 49.440118] INFO: lockdep is turned off.
    [ 49.440118] Pid: 2814, comm: modprobe Tainted: G W 2.6.27-rc7 #4
    [ 49.440118] [] __might_sleep+0xe1/0x120
    [ 49.440118] [] ftrace_modify_code+0x2a/0xd0
    [ 49.440118] [] ? ftrace_test_p6nop+0x0/0xa
    [ 49.440118] [] __ftrace_update_code+0xfe/0x2f0
    [ 49.440118] [] ? ftrace_test_p6nop+0x0/0xa
    [ 49.440118] [] ftrace_convert_nops+0x50/0x80
    [ 49.440118] [] ftrace_init_module+0x16/0x20
    [ 49.440118] [] load_module+0x185b/0x1d30
    [ 49.440118] [] ? find_get_page+0x0/0xf0
    [ 49.440118] [] ? sprintf+0x0/0x30
    [ 49.440118] [] ? mutex_lock_interruptible_nested+0x1f2/0x350
    [ 49.440118] [] sys_init_module+0x53/0x1b0
    [ 49.440118] [] ? do_page_fault+0x0/0x740
    [ 49.440118] [] syscall_call+0x7/0xb
    [ 49.440118] =======================

    This happens because ftrace_modify_code() calls copy_to_user and
    copy_from_user.
    These calls were added on the assumption that there
    couldn't be any race condition, but copy_[to/from]_user might
    sleep and __ftrace_update_code is called with local_irq_saved.

    These calls were introduced by this commit:
    d5e92e8978fd2574e415dc2792c5eb592978243d:
    "ftrace: x86 use copy from user function"

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frédéric Weisbecker
     
  • Could just as easily change the three casts to cast to the correct
    type...this patch changes the type of ftrace_nop instead.

    Suppresses sparse warnings:

    arch/x86/kernel/ftrace.c:157:14: warning: incorrect type in assignment (different signedness)
    arch/x86/kernel/ftrace.c:157:14: expected long *static [toplevel] ftrace_nop
    arch/x86/kernel/ftrace.c:157:14: got unsigned long *
    arch/x86/kernel/ftrace.c:161:14: warning: incorrect type in assignment (different signedness)
    arch/x86/kernel/ftrace.c:161:14: expected long *static [toplevel] ftrace_nop
    arch/x86/kernel/ftrace.c:161:14: got unsigned long *
    arch/x86/kernel/ftrace.c:165:14: warning: incorrect type in assignment (different signedness)
    arch/x86/kernel/ftrace.c:165:14: expected long *static [toplevel] ftrace_nop
    arch/x86/kernel/ftrace.c:165:14: got unsigned long *

    Signed-off-by: Harvey Harrison
    Signed-off-by: Ingo Molnar

    Harvey Harrison
     
  • With the recent updates to ftrace, there should not be any failures when
    modifying the code. If there are, then we need to warn about them.

    This patch has a cleaned up version of the code that I used to discover
    that the weak symbols were causing failures.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Replace "none" tracer by the recently created "nop" tracer.
    Both are pretty similar except that nop accepts TRACE_PRINT
    or TRACE_SPECIAL entries.

    And as a consequence, changing the size of the ring buffer now
    requires that tracing has already been disabled.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Steven Noonan
    Signed-off-by: Ingo Molnar

    Frédéric Weisbecker
     
  • Now that the nop tracer is used as the default tracer, replacing
    the "none" tracer, the tracing engine depends on it.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Steven Noonan
    Signed-off-by: Ingo Molnar

    Frédéric Weisbecker
     
  • If nop tracer is selected, some old entries from the previous tracer
    could still be enqueued. Tracing has to be reset.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Steven Noonan
    Signed-off-by: Ingo Molnar

    Frédéric Weisbecker
     
  • The functions are already 'extern' anyway, so there's no problem
    with linkage. Removing these ifdefs also helps find any potential
    compiler errors.

    Suggested by Andrew Morton.

    Signed-off-by: Steven Noonan
    Signed-off-by: Ingo Molnar

    Steven Noonan
     
  • When CONFIG_DYNAMIC_FTRACE isn't used, neither is mcount_addr. This
    patch eliminates that warning.

    Signed-off-by: Steven Noonan
    Signed-off-by: Ingo Molnar

    Steven Noonan
     
  • A no-op tracer which can serve two purposes:

    1. A template for development of a new tracer.
    2. A convenient way to see ftrace_printk() calls without
    an irrelevant trace making the output messy.

    [ mingo@elte.hu: resolved conflicts ]
    Signed-off-by: Steven Noonan
    Signed-off-by: Ingo Molnar

    Steven Noonan
     
  • Allow a user to inject a marker (TRACE_PRINT entry) into the trace ring
    buffer. The related file operations are derived from code by Frédéric
    Weisbecker.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Also make trace_seq_print_cont() non-static, and add a newline if the
    seq buffer can't hold all data.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Offer mmiotrace users a function to inject markers from inside the kernel.
    This depends on the trace_vprintk() patch.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Add trace_vprintk() for easier implementation of tracer-specific
    *_printk functions. Add a check for no_tracer, and implement
    __ftrace_printk() as a wrapper.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
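
The v-variant plus wrapper pattern described in this commit can be sketched in user-space C. This is only an illustration, not the kernel code: demo_vprintk() and demo_printk() are hypothetical names, and formatting into a caller-supplied buffer stands in for writing into the trace ring buffer.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a tracer-specific sink. It takes a va_list,
 * so any number of printf-style front ends can share it.
 */
static int demo_vprintk(char *buf, size_t len, const char *fmt, va_list args)
{
	/* vsnprintf() never writes more than len bytes, including the NUL. */
	return vsnprintf(buf, len, fmt, args);
}

/* printf-style wrapper: collect the variadic arguments and forward them. */
static int demo_printk(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_vprintk(buf, len, fmt, args);
	va_end(args);

	return ret;
}
```

A tracer-specific front end would be a second wrapper of the same shape as demo_printk(), which is why exposing the va_list variant makes such functions easy to add.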
     
  • Moves the mmiotrace specific functions from trace.c to
    trace_mmiotrace.c. Functions trace_wake_up(), tracing_get_trace_entry(),
    and tracing_generic_entry_update() are therefore made available outside
    trace.c.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • This must be brown paper bag week for Steven Rostedt!

    While working on ftrace for PPC, I discovered that the hash locking done
    when CONFIG_FTRACE_MCOUNT_RECORD is not set, is totally incorrect.

    With a cut and paste error, I had the hash lock macro to lock for both
    hash_lock _and_ hash_unlock!

    This bug did not affect x86 since this bug was introduced when
    CONFIG_FTRACE_MCOUNT_RECORD was added to x86.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • ftrace_release is necessary for all uses of dynamic ftrace and not just
    the archs that have CONFIG_FTRACE_MCOUNT_RECORD defined.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • make most of the tracers depend on DEBUG_KERNEL - that's their intended
    purpose. (most distributions have DEBUG_KERNEL enabled anyway so this is
    not a practical limitation - but it simplifies the tracing menu in the
    normal case)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • While profiling the SMP behaviour of the scheduler, it was necessary to
    know which CPU a task was woken on.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Currently ftrace_printk only works with the ftrace tracer; switch it to an
    iter_ctrl setting so we can make use of it with other tracers too.

    [rostedt@redhat.com: tweak to the disable condition]

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • An item in the trace buffer that is bigger than one entry may be split
    up using the TRACE_CONT entry. This makes it a virtual single entry.
    The current code increments the iterator index even while traversing
    TRACE_CONT entries, making it look like the iterator is further than
    it actually is.

    This patch adds code to not increment the iterator index while skipping
    over TRACE_CONT entries.

    Signed-off-by: Steven Rostedt
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Steven Rostedt
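
A minimal user-space sketch of the idea in this commit: the iterator keeps a physical position and a logical index, and only the position moves while continuation slots are skipped. The types and names (demo_type, demo_iter, demo_iter_next) are hypothetical and do not mirror the kernel's iterator.

```c
#include <stddef.h>

/* Hypothetical entry types; DEMO_CONT plays the role of TRACE_CONT. */
enum demo_type { DEMO_FN, DEMO_PRINT, DEMO_CONT };

struct demo_iter {
	const enum demo_type *ents; /* physical slots in the buffer */
	size_t nr;                  /* number of physical slots */
	size_t pos;                 /* current physical slot */
	size_t idx;                 /* logical index: counts virtual entries */
};

/*
 * Advance to the next virtual entry.
 * Returns 1 if the iterator now rests on an entry, 0 at the end.
 */
static int demo_iter_next(struct demo_iter *it)
{
	if (it->pos >= it->nr)
		return 0;

	it->pos++;

	/* Skip continuation slots without touching the logical index. */
	while (it->pos < it->nr && it->ents[it->pos] == DEMO_CONT)
		it->pos++;

	if (it->pos >= it->nr)
		return 0;

	it->idx++;
	return 1;
}
```

With the slots { PRINT, CONT, CONT, FN }, one call moves pos from 0 to 3 while idx only goes from 0 to 1, so the index reflects the number of virtual entries traversed.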
     
  • Peter Zijlstra provided me with a nice brown paper bag while letting me know
    that I was doing a logical AND and not a bitwise one, making a condition
    true more often than it should be.

    Luckily, a false true is handled by the calling function and no harm is
    done. But this needs to be fixed regardless.

    Signed-off-by: Steven Rostedt
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Steven Rostedt
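
The class of bug described in this commit is easy to reproduce in isolation. The sketch below is generic C, not the ftrace code: the helper names and the flag values used in the usage note are made up for illustration.

```c
/*
 * Buggy test: logical AND. The result is true whenever both operands are
 * non-zero, regardless of which bits are set.
 */
static int demo_flag_set_logical(unsigned int flags, unsigned int mask)
{
	return flags && mask;
}

/* Intended test: bitwise AND. True only if a bit from mask is set in flags. */
static int demo_flag_set_bitwise(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0;
}
```

With flags = 0x1 and mask = 0x4, the logical version reports the flag as set while the bitwise version correctly reports it as clear. The two agree whenever the bit really is set, which is why such a bug only shows up as a condition being true more often than it should be.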
     
  • Currently some of the ftrace output goes skew-whiff if you have more
    than 9 cpus, and some if you have more than 99.

    Twiddle with the headers and format strings to make up to 999 cpus
    display without causing spacing problems.

    Signed-off-by: Michael Ellerman
    Acked-by: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Michael Ellerman
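
As an illustration of the fixed-width approach, the helper below pads the CPU number to three digits so that columns stay aligned up to CPU 999. The "[%03d]" format and the name demo_format_cpu() are assumptions made for this sketch; the exact format strings changed by the patch are not shown in this log.

```c
#include <stdio.h>

/*
 * Format a CPU number into a fixed-width field (assumed format).
 * Returns the number of characters snprintf() would have written.
 */
static int demo_format_cpu(char *buf, size_t len, int cpu)
{
	/* A width of 3 keeps every value from 0 to 999 the same length. */
	return snprintf(buf, len, "[%03d]", cpu);
}
```

Both CPU 7 and CPU 123 produce a five-character field here, so whatever follows the CPU column starts at the same offset on every line.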
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • This patch adds indexes into the stack that the functions in the
    stack dump were found at. As an added bonus, I also added a diff
    to show which function is the most notorious consumer of the stack.

    The output now looks like this:

    # cat /debug/tracing/stack_trace
            Depth    Size   Location    (48 entries)
            -----    ----   --------
      0)     2476     212   blk_recount_segments+0x39/0x59
      1)     2264      12   bio_phys_segments+0x16/0x1d
      2)     2252      20   blk_rq_bio_prep+0x23/0xaf
      3)     2232      12   init_request_from_bio+0x74/0x77
      4)     2220      56   __make_request+0x294/0x331
      5)     2164     136   generic_make_request+0x34f/0x37d
      6)     2028      56   submit_bio+0xe7/0xef
      7)     1972      28   submit_bh+0xd1/0xf0
      8)     1944     112   block_read_full_page+0x299/0x2a9
      9)     1832       8   blkdev_readpage+0x14/0x16
     10)     1824      28   read_cache_page_async+0x7e/0x109
     11)     1796      16   read_cache_page+0x11/0x49
     12)     1780      32   read_dev_sector+0x3c/0x72
     13)     1748      48   read_lba+0x4d/0xaa
     14)     1700     168   efi_partition+0x85/0x61b
     15)     1532      72   rescan_partitions+0x10e/0x266
     16)     1460      40   do_open+0x1c7/0x24e
     17)     1420     292   __blkdev_get+0x79/0x84
     18)     1128      12   blkdev_get+0x12/0x14
     19)     1116      20   register_disk+0xd1/0x11e
     20)     1096      28   add_disk+0x34/0x90
     21)     1068      52   sd_probe+0x2b1/0x366
     22)     1016      20   driver_probe_device+0xa5/0x120
     23)      996       8   __device_attach+0xd/0xf
     24)      988      32   bus_for_each_drv+0x3e/0x68
     25)      956      24   device_attach+0x56/0x6c
     26)      932      16   bus_attach_device+0x26/0x4d
     27)      916      64   device_add+0x380/0x4b4
     28)      852      28   scsi_sysfs_add_sdev+0xa1/0x1c9
     29)      824     160   scsi_probe_and_add_lun+0x919/0xa2a
     30)      664      36   __scsi_add_device+0x88/0xae
     31)      628      44   ata_scsi_scan_host+0x9e/0x21c
     32)      584      28   ata_host_register+0x1cb/0x1db
     33)      556      24   ata_host_activate+0x98/0xb5
     34)      532     192   ahci_init_one+0x9bd/0x9e9
     35)      340      20   pci_device_probe+0x3e/0x5e
     36)      320      20   driver_probe_device+0xa5/0x120
     37)      300      20   __driver_attach+0x3f/0x5e
     38)      280      36   bus_for_each_dev+0x40/0x62
     39)      244      12   driver_attach+0x19/0x1b
     40)      232      28   bus_add_driver+0x9c/0x1af
     41)      204      28   driver_register+0x76/0xd2
     42)      176      20   __pci_register_driver+0x44/0x71
     43)      156       8   ahci_init+0x14/0x16
     44)      148     100   _stext+0x42/0x122
     45)       48      20   kernel_init+0x175/0x1dc
     46)       28      28   kernel_thread_helper+0x7/0x10

    The first column is simply an index, starting at the innermost function
    and counting up to the outermost.

    The next column (Depth) is the location at which the function was found
    on the stack.

    The next column (Size) is the amount of stack used by that function.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
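
In the sample output, each Size value equals the difference between that entry's Depth and the Depth of the next (outer) entry, for example 2476 - 2264 = 212. The sketch below reproduces that arithmetic in user-space C. demo_stack_sizes() is a hypothetical helper and is not taken from the patch; treating the outermost entry's size as its own depth is an assumption that matches the last row shown (28, 28).

```c
#include <stddef.h>

/*
 * depth[i] is the stack depth at entry i, innermost entry first.
 * size[i] is the stack consumed by entry i alone:
 *   size[i] = depth[i] - depth[i + 1]
 * The outermost entry has nothing below it, so its size is its own depth.
 */
static void demo_stack_sizes(const unsigned long *depth,
			     unsigned long *size, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long next = (i + 1 < n) ? depth[i + 1] : 0;

		size[i] = depth[i] - next;
	}
}
```

Feeding in the last two depths from the output above, { 48, 28 }, gives sizes { 20, 28 }, matching the kernel_init and kernel_thread_helper rows.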
     
  • The warning messages about old objcopy and local functions spam the
    user quite drastically. Remove the warning until we can find a nicer
    way of telling the user to upgrade their objcopy.

    Signed-off-by: Steven Rostedt
    Cc: Stephen Rothwell
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • The mcount record method of ftrace scans objdump output for references to
    mcount. Because mcount itself was used as the reference to test whether
    the calls being replaced are indeed calls to mcount, this use of mcount
    was also caught as a location to change. Using a variable that points to
    the mcount address moves this reference into the data section, which is
    not scanned, so we no longer try to modify a false location.

    The warn-on code is what detected this bug.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • This is another tracer using the ftrace infrastructure that examines
    the size of the stack at each function call. If the stack use is greater
    than the previous max, it is recorded.

    You can always see (and set) the max stack size seen. Setting it
    to zero starts the recording again. The backtrace is also available.

    For example:

    # cat /debug/tracing/stack_max_size
    1856

    # cat /debug/tracing/stack_trace
    [] stack_trace_call+0x8f/0x101
    [] ftrace_call+0x5/0x8
    [] clocksource_get_next+0x12/0x48
    [] update_wall_time+0x538/0x6d1
    [] do_timer+0x23/0xb0
    [] tick_do_update_jiffies64+0xd9/0xf1
    [] tick_sched_timer+0x4a/0xad
    [] __run_hrtimer+0x3e/0x75
    [] hrtimer_interrupt+0xf1/0x154
    [] smp_apic_timer_interrupt+0x71/0x84
    [] apic_timer_interrupt+0x2d/0x34
    [] finish_task_switch+0x29/0xa0
    [] schedule+0x765/0x7be
    [] schedule_timeout+0x1b/0x90
    [] wait_for_common+0xab/0x101
    [] wait_for_completion+0x12/0x14
    [] blk_execute_rq+0x84/0x99
    [] scsi_execute+0xc2/0x105
    [] scsi_execute_req+0x57/0x7f
    [] sr_test_unit_ready+0x3e/0x97
    [] sr_media_change+0x43/0x205
    [] media_changed+0x48/0x77
    [] cdrom_media_changed+0x31/0x37
    [] sr_block_media_changed+0x16/0x18
    [] check_disk_change+0x1b/0x63
    [] cdrom_open+0x7a1/0x806
    [] sr_block_open+0x78/0x8d
    [] do_open+0x90/0x257
    [] blkdev_open+0x2d/0x56
    [] __dentry_open+0x14d/0x23c
    [] nameidata_to_filp+0x24/0x38
    [] do_filp_open+0x347/0x626
    [] do_sys_open+0x47/0xbc
    [] sys_open+0x23/0x2b
    [] sysenter_do_call+0x12/0x26

    I've tested this on both x86_64 and i386.

    Signed-off-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • CHK include/linux/version.h
    CHK include/linux/utsrelease.h
    CC scripts/mod/empty.o
    /bin/sh: /usr/src/25/scripts/recordmcount.pl: Permission denied

    We shouldn't assume that files have their `x' bits set. There are various
    ways in which file permissions get lost, including use of patch(1).

    It might not be correct to assume that perl lives in $PATH?

    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Andrew Morton
     
  • The --globalize-symbols option came out in objcopy version 2.17.
    If the kernel is being compiled on a system with a lower version of
    objcopy, then we can not use the globalize / localize trick to
    link to symbols pointing to local functions.

    This patch tests the version of objcopy and will only use the trick
    if the version is greater than or equal to 2.17. Otherwise, if an
    object has only local functions within a section, it will give a
    nice warning and recommend that the user upgrade their objcopy.

    Leaving the symbols unrecorded is not that big of a deal, since the
    mcount record method changes the actual mcount code to be a simple
    "ret" without recording registers or anything.

    Reported-by: Stephen Rothwell
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
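
The version test described in this commit must compare numerically, since a plain string comparison would rank "2.9" above "2.17". The C sketch below illustrates that comparison only: demo_objcopy_can_globalize() is a hypothetical helper, it assumes the version has already been extracted as a "major.minor" string, and it does not reproduce how the patch obtains or checks the version.

```c
#include <stdio.h>

/*
 * Returns 1 if the version is at least 2.17, 0 if it is older,
 * and -1 if the string does not start with "major.minor".
 */
static int demo_objcopy_can_globalize(const char *version)
{
	int major, minor;

	if (sscanf(version, "%d.%d", &major, &minor) != 2)
		return -1;

	/* Compare numerically, component by component. */
	return major > 2 || (major == 2 && minor >= 17);
}
```

Anything after the minor number, such as the ".1" in "2.16.1", is ignored by this sketch.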
     
  • enclose the argument in parentheses. (especially since we cast it,
    which is a high-priority operation)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
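
Why the parentheses matter when a macro casts its argument: a cast binds tighter than most binary operators, so without parentheses it applies only to the first operand of the expression passed in. The macros below are hypothetical examples, not the macro touched by this commit.

```c
/* Unsafe: the cast applies only to the first token of x. */
#define DEMO_TRUNC_BAD(x)	((unsigned char)x)

/* Safe: the whole argument is evaluated first, then cast. */
#define DEMO_TRUNC_GOOD(x)	((unsigned char)(x))

static int demo_trunc_bad(int a, int b)
{
	/* Expands to ((unsigned char)a + b). */
	return DEMO_TRUNC_BAD(a + b);
}

static int demo_trunc_good(int a, int b)
{
	/* Expands to ((unsigned char)(a + b)). */
	return DEMO_TRUNC_GOOD(a + b);
}
```

With a = 255 and b = 1, the unsafe macro yields 256, because only a is truncated before the addition. The safe macro yields 0, because the sum 256 is truncated to eight bits.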
     
  • After disabling FTRACE_MCOUNT_RECORD via a patch, a dormant build
    failure surfaced:

    kernel/trace/ftrace.c: In function 'ftrace_record_ip':
    kernel/trace/ftrace.c:416: error: incompatible type for argument 1 of '_spin_lock_irqsave'
    kernel/trace/ftrace.c:433: error: incompatible type for argument 1 of '_spin_lock_irqsave'

    Introduced by commit 6dad8e07f4c10b17b038e84d29f3ca41c2e55cd0 ("ftrace:
    add necessary locking for ftrace records").

    Signed-off-by: Stephen Rothwell
    Signed-off-by: Ingo Molnar

    Stephen Rothwell
     
  • The modification of code is performed either by kstop_machine, before
    SMP starts, or on module code before the module is executed. There is
    no reason to do the modifications from assembly. The copy to and from
    user functions are sufficient and produce cleaner, easier-to-read
    code.

    Thanks to Benjamin Herrenschmidt for suggesting the idea.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • During tests and checks, I've discovered that there were failures to
    convert mcount callers into nops. Looking deeper into these failures,
    the code that we attempted to change was not an mcount caller.
    The current code only updates if the code being changed is what it expects,
    but I still investigate any time there is a failure.

    What was happening is that a weak symbol was being used as a reference
    for other mcount callers. That weak symbol was also referenced elsewhere
    so the offsets were using the strong symbol and not the function symbol
    that it was referenced from.

    This patch changes the setting up of the mcount_loc section to search
    for a global function that is not weak. It will pick a local over a weak
    but if only a weak is found in a section, a warning is printed and the
    mcount location is not recorded (just to be safe).

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
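
The selection order described in this commit (a global non-weak function first, a local one over a weak one, and nothing if only weak symbols exist) can be sketched as below. This C version is only an illustration under assumed types: demo_bind, demo_sym and demo_pick_ref() are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical symbol bindings for one section. */
enum demo_bind { DEMO_GLOBAL, DEMO_LOCAL, DEMO_WEAK };

struct demo_sym {
	const char *name;
	enum demo_bind bind;
};

/*
 * Pick the reference symbol for a section:
 *   1. the first global, non-weak function if there is one;
 *   2. otherwise the first local function;
 *   3. otherwise NULL, meaning only weak symbols were found and the
 *      caller should warn and leave the location unrecorded.
 */
static const struct demo_sym *demo_pick_ref(const struct demo_sym *syms,
					    size_t n)
{
	const struct demo_sym *local = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (syms[i].bind == DEMO_GLOBAL)
			return &syms[i];
		if (syms[i].bind == DEMO_LOCAL && !local)
			local = &syms[i];
	}

	return local;
}
```

Returning NULL for the weak-only case keeps the "just to be safe" decision in the caller, which can print the warning and skip the section.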
     
  • I'm trying to keep all the arch changes in recordmcount.pl in one place.
    I moved your code into that area, by adding the flags to the commands
    that were passed in.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • I'm seeing this when I use separate src/build dirs:

    make[3]: *** [arch/x86/kernel/time_32.o] Error 1
    /bin/sh: scripts/recordmcount.pl: No such file or directory
    make[3]: *** [arch/x86/kernel/irq_32.o] Error 1
    /bin/sh: scripts/recordmcount.pl: No such file or directory
    make[3]: *** [arch/x86/kernel/ldt.o] Error 1
    /bin/sh: scripts/recordmcount.pl: No such file or directory
    make[3]: *** [arch/x86/kernel/i8259.o] Error 1
    /bin/sh: scripts/recordmcount.pl: No such file or directory

    This fixes it.

    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • This patch fixes incorrect comment style of __ftrace_enabled_save().

    Signed-off-by: Huang Ying
    Signed-off-by: Ingo Molnar

    Huang Ying