01 Nov, 2011

1 commit


31 Oct, 2011

1 commit


26 Oct, 2011

2 commits

  • * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
    perf symbols: Increase symbol KSYM_NAME_LEN size
    perf hists browser: Refuse 'a' hotkey on non symbolic views
    perf ui browser: Use libslang to read keys
    perf tools: Fix tracing info recording
    perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
    perf hists: Don't consider filtered entries when calculating column widths
    perf hists: Don't decay total_period for filtered entries
    perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
    perf hists browser: Do not exit on tab key with single event
    perf annotate browser: Don't change selection line when returning from callq
    perf tools: handle endianness of feature bitmap
    perf tools: Add prelink suggestion to dso update message
    perf script: Fix unknown feature comment
    perf hists browser: Apply the dso and thread filters when merging new batches
    perf hists: Move the dso and thread filters from hist_browser
    perf ui browser: Honour the xterm colors
    perf top tui: Give color hints just on the percentage, like on --stdio
    perf ui browser: Make the colors configurable and change the defaults
    perf tui: Remove unneeded call to newtCls on startup
    perf hists: Don't format the percentage on hist_entry__snprintf
    ...

    Fix up conflicts in arch/x86/kernel/kprobes.c manually.

    Ingo's tree did the insane "add volatile to const array", which just
    doesn't make sense ("volatile const"?). But we could remove the const
    *and* make the array volatile to make doubly sure that gcc doesn't
    optimize it away..

    Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
    reader_lock has been turned into a raw lock by the core locking merge,
    and there was a new user of it introduced in this perf core merge. Make
    sure that new use also uses the raw accessor functions.

    Linus Torvalds
     
  • * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock()
    lockdep: Comment all warnings
    lib: atomic64: Change the type of local lock to raw_spinlock_t
    locking, lib/atomic64: Annotate atomic64_lock::lock as raw
    locking, x86, iommu: Annotate qi->q_lock as raw
    locking, x86, iommu: Annotate irq_2_ir_lock as raw
    locking, x86, iommu: Annotate iommu->register_lock as raw
    locking, dma, ipu: Annotate bank_lock as raw
    locking, ARM: Annotate low level hw locks as raw
    locking, drivers/dca: Annotate dca_lock as raw
    locking, powerpc: Annotate uic->lock as raw
    locking, x86: mce: Annotate cmci_discover_lock as raw
    locking, ACPI: Annotate c3_lock as raw
    locking, oprofile: Annotate oprofilefs lock as raw
    locking, video: Annotate vga console lock as raw
    locking, latencytop: Annotate latency_lock as raw
    locking, timer_stats: Annotate table_lock as raw
    locking, rwsem: Annotate inner lock as raw
    locking, semaphores: Annotate inner lock as raw
    locking, sched: Annotate thread_group_cputimer as raw
    ...

    Fix up conflicts in kernel/posix-cpu-timers.c manually: making
    cputimer->cputime a raw lock conflicted with the ABBA fix in commit
    bcd5cff7216f ("cputimer: Cure lock inversion").

    Linus Torvalds
     

14 Oct, 2011

2 commits

  • The trace_pipe_raw handler holds a cached page from the time the file
    is opened to the time it is closed. The cached page is used to handle
    the case of the user space buffer being smaller than what was read from
    the ring buffer. The left over buffer is held in the cache so that the
    next read will continue where the data left off.

    After EOF is returned (no more data in the buffer), the index of
    the cached page is set to zero. If a user app reads the page again
    after EOF, the check in the buffer will see that the cached page
    is less than page size and will return the cached page again. This
    will cause reading trace_pipe_raw again after EOF to return
    duplicate data, making the output look as though time went backwards,
    when in fact the data is simply repeated.

    The fix is to not reset the index right after all data is read
    from the cache, but to reset it after all data is read and more
    data exists in the ring buffer.

    Cc: stable
    Reported-by: Jeremy Eder
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The tracing_enabled option is deprecated.
    To start and stop tracing, write to /sys/kernel/debug/tracing/tracing_on
    instead of tracing_enabled. This patch is based on Linux 3.1.0-rc1

    Signed-off-by: Geunsik Lim
    Link: http://lkml.kernel.org/r/1313127022-23830-1-git-send-email-leemgs1@gmail.com
    Signed-off-by: Steven Rostedt

    Geunsik Lim
     

12 Oct, 2011

1 commit


11 Oct, 2011

3 commits

  • When doing intense tracing, the kmalloc inside trace_marker can
    introduce side effects to what is being traced.

    As trace_marker() is used by userspace to inject data into the
    kernel ring buffer, it needs to do so with the least amount
    of intrusion to the operations of the kernel or the user space
    application.

    As the ring buffer is designed to write directly into the buffer
    without the need to make a temporary buffer, and userspace already
    went through the hassle of knowing how big the write will be,
    we can simply pin the userspace pages and write the data directly
    into the buffer. This tremendously reduces the impact of tracing
    via trace_marker!

    Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the
    use of get_user_pages_fast() and kmap_atomic().

    Suggested-by: Thomas Gleixner
    Suggested-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • As the function tracer is very intrusive, lots of self checks are
    performed on the tracer and if something is found to be strange
    it will shut itself down keeping it from corrupting the rest of the
    kernel. This shutdown may still allow functions to be traced, as the
    tracing only stops new modifications from happening. Trying to stop
    the function tracer itself can cause more harm as it requires code
    modification.

    Although a WARN_ON() is executed, a user may not notice it. To help
    the user see that something isn't right with the tracing of the system
    a big warning is added to the output of the tracer that lets the user
    know that their data may be incomplete.

    Reported-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Fix kprobe-tracer not to delete a probe if the probe is in use.
    In that case, the delete operation will return -EBUSY.

    This bug can cause a kernel panic if enabled probes are deleted
    during perf record.

    (Add some probes on functions)
    sh-4.2# perf probe --del probe:\*
    sh-4.2# exit
    (kernel panic)

    This is originally reported on the fedora bugzilla:

    https://bugzilla.redhat.com/show_bug.cgi?id=742383

    I've also checked that this problem doesn't happen with
    tracepoints on module removal, because the perf event
    locks the target module.

    $ sudo ./perf record -e xfs:\* -aR sh
    sh-4.2# rmmod xfs
    ERROR: Module xfs is in use
    sh-4.2# exit
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.203 MB perf.data (~8862 samples) ]

    Signed-off-by: Masami Hiramatsu
    Cc: Arnaldo Carvalho de Melo
    Cc: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Frank Ch. Eigler
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/20111004104438.14591.6553.stgit@fedora15
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     

08 Oct, 2011

1 commit

  • * pm-runtime:
    PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set
    PM / Runtime: Replace dev_dbg() with trace_rpm_*()
    PM / Runtime: Introduce trace points for tracing rpm_* functions
    PM / Runtime: Don't run callbacks under lock for power.irq_safe set
    USB: Add wakeup info to debugging messages
    PM / Runtime: pm_runtime_idle() can be called in atomic context
    PM / Runtime: Add macro to test for runtime PM events
    PM / Runtime: Add might_sleep() to runtime PM functions

    Rafael J. Wysocki
     

30 Sep, 2011

1 commit


28 Sep, 2011

1 commit


22 Sep, 2011

1 commit


19 Sep, 2011

1 commit

  • When debugging tight race conditions, it can be helpful to have a
    synchronized tracing method. Although in most cases the global clock
    provides this functionality, when exact timing is not the issue it is
    more reassuring to know that the recorded order of events is the
    precise order in which they really happened.

    Instead of using a clock, add a "counter" that is simply an incrementing
    atomic 64-bit counter that orders the events as they are perceived to
    happen.

    The trace_clock_counter() is added from the attempt by Peter Zijlstra
    trying to convert the trace_clock_global() to it. I took Peter's counter
    code and made trace_clock_counter() instead, and added it to the choice
    of clocks. Just echo counter > /debug/tracing/trace_clock to activate
    it.

    Requested-by: Thomas Gleixner
    Requested-by: Peter Zijlstra
    Reviewed-By: Valdis Kletnieks
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

13 Sep, 2011

1 commit

  • The tracing locks can be taken in atomic context and therefore
    cannot be preempted on -rt - annotate it.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

31 Aug, 2011

3 commits

  • The stats file under per_cpu folder provides the number of entries,
    overruns and other statistics about the CPU ring buffer. However, the
    numbers do not provide any indication of how full the ring buffer is in
    bytes compared to the overall size in bytes. Also, it is helpful to know
    the rate at which the cpu buffer is filling up.

    This patch adds an entry "bytes: " in printed stats for per_cpu ring
    buffer which provides the actual bytes consumed in the ring buffer. This
    field includes the number of bytes used by recorded events and the
    padding bytes added when moving the tail pointer to next page.

    It also adds the following time stamps:
    "oldest event ts:" - the oldest timestamp in the ring buffer
    "now ts:" - the timestamp at the time of reading

    The field "now ts" provides a consistent time snapshot to the userspace
    when being read. This is read from the same trace clock used by tracing
    event timestamps.

    Together, these values provide the rate at which the buffer is filling
    up, from the formula:
    bytes / (now_ts - oldest_event_ts)

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Michael Rubin
    Cc: David Sharp
    Link: http://lkml.kernel.org/r/1313531179-9323-3-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     
  • The current file "buffer_size_kb" reports the size of per-cpu buffer and
    not the overall memory allocated which could be misleading. A new file
    "buffer_total_size_kb" adds up all the enabled CPU buffer sizes and
    reports it. The new entry is read-only.

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Michael Rubin
    Cc: David Sharp
    Link: http://lkml.kernel.org/r/1313531179-9323-2-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     
  • The self testing for event filters does not really need preemption
    disabled, as there are no races at the time of testing, but the functions
    it calls use rcu_dereference_sched(), which will complain if preemption
    is not disabled.

    Cc: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

20 Aug, 2011

11 commits

  • Adding automated tests that run as a late_initcall. The tests are
    compiled in with the CONFIG_FTRACE_STARTUP_TEST option.

    Adding a test event "ftrace_test_filter" used to simulate
    filter processing during event occurrence.

    String filters are compiled and tested against several
    test events with different values.

    Also testing that evaluation of explicit predicates is omitted
    due to the lazy filter evaluation.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-11-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Changing filter_match_preds function to use unified predicates tree
    processing.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-10-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Changing fold_pred_tree function to use unified predicates tree
    processing.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-9-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Changing fold_pred function to use unified predicates tree
    processing.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-8-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Changing count_leafs function to use unified predicates tree
    processing.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-7-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Adding walk_pred_tree function to be used for walking through
    the filter predicates.

    For each predicate the callback function is called, allowing
    users to add their own functionality or customize their way
    through the filter predicates.

    Changing check_pred_tree function to use walk_pred_tree.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-6-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • We don't need to perform a lookup through the ftrace_events list;
    instead we can use the 'tp_event' field.

    Each perf_event contains tracepoint event field 'tp_event', which
    got initialized during the tracepoint event initialization.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-5-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • The field_name was used just for finding the event's fields; removing
    it means we no longer need to care about field_name allocation and
    freeing.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-4-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Making the code cleaner by having one function to fully prepare
    the predicate (create_pred), and another to add the predicate to
    the filter (filter_add_pred).

    As a benefit, this way the dry_run flag stays only inside the
    replace_preds function and is not passed deeper.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Don't dynamically allocate filter_pred struct, use static memory.
    This way we can get rid of the code managing the dynamic filter_pred
    struct object.

    The create_pred function integrates create_logical_pred function.
    This way the static predicate memory is returned only from
    one place.

    Signed-off-by: Jiri Olsa
    Link: http://lkml.kernel.org/r/1313072754-4620-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • * 'for-linus' of git://git.kernel.dk/linux-block: (23 commits)
    Revert "cfq: Remove special treatment for metadata rqs."
    block: fix flush machinery for stacking drivers with differring flush flags
    block: improve rq_affinity placement
    blktrace: add FLUSH/FUA support
    Move some REQ flags to the common bio/request area
    allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
    xen/blkback: Make description more obvious.
    cfq-iosched: Add documentation about idling
    block: Make rq_affinity = 1 work as expected
    block: swim3: fix unterminated of_device_id table
    block/genhd.c: remove useless cast in diskstats_show()
    drivers/cdrom/cdrom.c: relax check on dvd manufacturer value
    drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
    bsg-lib: add module.h include
    cfq-iosched: Reduce linked group count upon group destruction
    blk-throttle: correctly determine sync bio
    loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
    loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
    loop: add management interface for on-demand device allocation
    loop: replace linked list of allocated devices with an idr index
    ...

    Linus Torvalds
     

11 Aug, 2011

2 commits

  • Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
    FUA follows WRITE, use the same 'F' flag for both cases and
    distinguish them by their (relative) position. The end results
    look like (other flags might be shown also):

    - WRITE: W
    - WRITE_FLUSH: FW
    - WRITE_FUA: WF
    - WRITE_FLUSH_FUA: FWF

    Note that we reuse TC_BARRIER due to the lack of bit space in
    act_mask, so older versions of the blktrace tools will report
    flush requests as barriers from now on.

    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Signed-off-by: Namhyung Kim
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Namhyung Kim
     
  • gcc incorrectly states that the variable "fmt" is uninitialized when
    CC_OPTIMIZE_FOR_SIZE is set.

    Instead of just blindly setting fmt to NULL, the code is cleaned up
    a little to be easier for humans to follow, and to let gcc see that
    the variables are initialized.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

05 Aug, 2011

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

25 Jul, 2011

1 commit


21 Jul, 2011

2 commits


16 Jul, 2011

3 commits

  • Since the address of a module-local variable can only be
    resolved after the target module is loaded, the symbol
    fetch-argument must be updated at module load time.

    Signed-off-by: Masami Hiramatsu
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/r/20110627072703.6528.75042.stgit@fedora15
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • To support probing module init functions, kprobe-tracer allows the
    user to define a probe on a function that does not exist yet, when
    it is given together with a module name. This also enables the user
    to set a probe on a function in a specific module, even if a function
    with the same name (but a different one) is defined locally in
    another module.

    The module name must come in front of the function name, separated
    by a ':'. e.g. btrfs:btrfs_init_sysfs

    Signed-off-by: Masami Hiramatsu
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/r/20110627072656.6528.89970.stgit@fedora15
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • Merge redundant enable/disable functions into enable_trace_probe()
    and disable_trace_probe().

    Signed-off-by: Masami Hiramatsu
    Cc: Arnaldo Carvalho de Melo
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: yrl.pp-manager.tt@hitachi.com
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20110627072644.6528.26910.stgit@fedora15

    [ converted kprobe selftest to use enable_trace_probe ]

    Signed-off-by: Steven Rostedt

    Masami Hiramatsu