09 Jan, 2012
1 commit
-
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...
04 Jan, 2012
1 commit
-
Signed-off-by: Al Viro
17 Nov, 2011
1 commit
-
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
it is useless.

The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.

Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
#    | task: <idle>-0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /     delay
#  cmd     pid   ||||| time  |   caller
#     \   /      |||||  \    |   /
migratio-6       0...2 41778231us+: rcu_note_context_switch

Instead, the irq, preempt-count and need-resched info is added to the
default format, which keeps its timestamps in seconds from boot:

#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
     <idle>-0     [000] d..2    49.309305: cpuidle_get_driver
     <idle>-0     [000] d..2    49.309307: mwait_idle
     <idle>-0     [000] d..2    49.309309: need_resched
     <idle>-0     [000] d..2    49.309310: test_ti_thread_flag
     <idle>-0     [000] d..2    49.309312: trace_power_start.constprop.13
     <idle>-0     [000] d..2    49.309313: trace_cpu_idle
     <idle>-0     [000] d..2    49.309315: need_resched

The old default format, without the flags, remains available:

     <idle>-0     [000]    49.309305: cpuidle_get_driver
     <idle>-0     [000]    49.309307: mwait_idle
     <idle>-0     [000]    49.309309: need_resched
     <idle>-0     [000]    49.309310: test_ti_thread_flag
     <idle>-0     [000]    49.309312: trace_power_start.constprop.13
     <idle>-0     [000]    49.309313: trace_cpu_idle
     <idle>-0     [000]    49.309315: need_resched
Signed-off-by: Steven Rostedt
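As an aside for readers decoding the flag columns above, here is a minimal
userspace sketch of how the four characters could be formed. The struct and
helper are hypothetical; only the character legend follows the header shown
above.

/*
 * Toy formatter for the "d..2"-style flag field. Assumes a
 * single-digit preempt depth; illustrative, not kernel code.
 */
#include <stdio.h>

struct trace_flags {
    int irqs_off;      /* interrupts disabled at the event    */
    int need_resched;  /* NEED_RESCHED set                    */
    int in_hardirq;    /* event happened in hardirq context   */
    int in_softirq;    /* event happened in softirq context   */
    int preempt_depth; /* preemption-disabled depth, 0 = none */
};

static void format_flags(const struct trace_flags *f, char out[5])
{
    out[0] = f->irqs_off ? 'd' : '.';
    out[1] = f->need_resched ? 'N' : '.';
    if (f->in_hardirq && f->in_softirq)
        out[2] = 'H';
    else if (f->in_hardirq)
        out[2] = 'h';
    else if (f->in_softirq)
        out[2] = 's';
    else
        out[2] = '.';
    out[3] = f->preempt_depth ? '0' + f->preempt_depth : '.';
    out[4] = '\0';
}

int main(void)
{
    struct trace_flags f = { 1, 0, 0, 0, 2 }; /* matches "d..2" above */
    char buf[5];

    format_flags(&f, buf);
    printf("%s\n", buf); /* prints d..2 */
    return 0;
}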
08 Nov, 2011
1 commit
-
In case the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
function tracer (CONFIG_FUNCTION_TRACER) is not set, the latency tracers
do not display a proper latency header.

The involved/fixed latency tracers are:
wakeup_rt
wakeup
preemptirqsoff
preemptoff
irqsoff

The patch adds proper handling of tracer configuration options for latency
tracers, and displays the correct header info accordingly.

* The current output (for wakeup tracer) with both graph and function
  tracers disabled is:

# tracer: wakeup
#
  <idle>-0       0d.h5    1us+:      0:120:R   + [000]     7:  0:R watchdog/0
  <idle>-0       0d.h5    3us+: ttwu_do_activate.clone.1

* The fixed output is:

#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /     delay
#  cmd     pid   ||||| time  |   caller
#     \   /      |||||  \    |   /
     cat-1129    0d..4    1us :   1129:120:R   + [000]     6:  0:R migration/0
     cat-1129    0d..4    2us+: ttwu_do_activate.clone.1

* Another example of the fixed output is:

#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /     delay
#  cmd     pid   ||||| time  |   caller
#     \   /      |||||  \    |   /
  <idle>-0       1d.h5    1us+:      0:120:R   + [001]    12:  0:R watchdog/1
  <idle>-0       1d.h5    3us : ttwu_do_activate.clone.1
Cc: Ingo Molnar
Signed-off-by: Jiri Olsa
Signed-off-by: Steven Rostedt
11 Oct, 2011
1 commit
-
As the function tracer is very intrusive, lots of self checks are
performed on the tracer and if something is found to be strange
it will shut itself down, keeping it from corrupting the rest of the
kernel. This shutdown may still allow functions to be traced, as the
tracing only stops new modifications from happening. Trying to stop
the function tracer itself can cause more harm as it requires code
modification.

Although a WARN_ON() is executed, a user may not notice it. To help
the user see that something isn't right with the tracing of the system,
a big warning is added to the output of the tracer that lets the user
know that their data may be incomplete.

Reported-by: Thomas Gleixner
Signed-off-by: Steven Rostedt
20 Aug, 2011
2 commits
-
Adding automated tests running as late_initcall. Tests are
compiled in with the CONFIG_FTRACE_STARTUP_TEST option.

Adding test event "ftrace_test_filter" used to simulate
filter processing during event occurrence.

String filters are compiled and tested against several
test events with different values.

Also testing that evaluation of explicit predicates is omitted
due to the lazy filter evaluation.

Signed-off-by: Jiri Olsa
Link: http://lkml.kernel.org/r/1313072754-4620-11-git-send-email-jolsa@redhat.com
Signed-off-by: Steven Rostedt
-
The field_name was used just for finding the event's fields. This way we
don't need to care about field_name allocation/free.

Signed-off-by: Jiri Olsa
Link: http://lkml.kernel.org/r/1313072754-4620-4-git-send-email-jolsa@redhat.com
Signed-off-by: Steven Rostedt
27 Jul, 2011
1 commit
-
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>.

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
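The semantics being centralized here can be sketched with C11 atomics. This
is a userspace illustration of what atomic_inc_not_zero() does; the kernel's
per-arch implementations differ.

/*
 * atomic_inc_not_zero(): increment v only if it is non-zero.
 * Returns non-zero if the increment happened. C11 sketch only.
 */
#include <stdatomic.h>
#include <stdio.h>

static int atomic_inc_not_zero(atomic_int *v)
{
    int c = atomic_load(v);

    while (c != 0) {
        /* on failure, c is reloaded with the current value */
        if (atomic_compare_exchange_weak(v, &c, c + 1))
            return 1;
    }
    return 0; /* was zero: leave it untouched */
}

int main(void)
{
    atomic_int refs = 1;

    printf("%d\n", atomic_inc_not_zero(&refs)); /* 1: refs is now 2 */
    atomic_store(&refs, 0);
    printf("%d\n", atomic_inc_not_zero(&refs)); /* 0: stays zero    */
    return 0;
}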
21 Jul, 2011
2 commits
-
…stedt/linux-2.6-trace into perf/core
-
Merge reason: pick up the latest fixes - they won't make v3.0.
Signed-off-by: Ingo Molnar
08 Jul, 2011
1 commit
-
If a function is set to be traced by the set_graph_function, but the
option funcgraph-irqs is zero, and the traced function happens to be
called from an interrupt, it will not be traced.

The point of funcgraph-irqs is to not trace interrupts when we are
preempted by an irq, not to skip tracing functions we want to trace that
happen to be *in* an irq.

Luckily the current->trace_recursion element is perfect to add a flag
to help us be able to trace functions within an interrupt even when
we are not tracing interrupts that preempt the trace.

Reported-by: Heiko Carstens
Tested-by: Heiko Carstens
Signed-off-by: Steven Rostedt
07 Jul, 2011
1 commit
-
The event system is freed when its nr_events is set to zero. This happens
when a module created an event system and then later the module is
removed. Modules may share systems, so the system is allocated when
it is created and freed when the modules are unloaded and all the
events under the system are removed (nr_events set to zero).

The problem arises when a task has opened the "filter" file for the
system. If the module is unloaded and it removed the last event for
that system, the system structure is freed. If the task that opened
the filter file accesses the "filter" file after the system has
been freed, it will access an invalid pointer.

By adding a ref_count, and using it to keep track of what
is using the event system, we can free it after all users
are finished with the event system.

Cc:
Reported-by: Johannes Berg
Signed-off-by: Steven Rostedt
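A minimal sketch of the ref_count idea, using C11 atomics in place of the
kernel's primitives. The structure and function names here are illustrative,
not the actual patch.

#include <stdatomic.h>
#include <stdlib.h>

struct event_system {
    atomic_int ref_count;
    /* ... name, filter, list of events ... */
};

static struct event_system *system_create(void)
{
    struct event_system *sys = calloc(1, sizeof(*sys));

    if (sys)
        atomic_init(&sys->ref_count, 1); /* the module's reference */
    return sys;
}

static struct event_system *system_get(struct event_system *sys)
{
    atomic_fetch_add(&sys->ref_count, 1); /* e.g. "filter" file opened */
    return sys;
}

static void system_put(struct event_system *sys)
{
    /* free only when the last user is done with the system */
    if (atomic_fetch_sub(&sys->ref_count, 1) == 1)
        free(sys);
}

int main(void)
{
    struct event_system *sys = system_create();

    system_get(sys); /* a task opens the "filter" file     */
    system_put(sys); /* module unload drops the last event */
    system_put(sys); /* the file is closed: now it is freed */
    return 0;
}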
15 Jun, 2011
2 commits
-
Fix to support kernel stack trace correctly on kprobe-tracer.
Since the execution path of kprobe-based dynamic events is different
from other tracepoint-based events, normal ftrace_trace_stack() doesn't
work correctly. To fix that, this introduces ftrace_trace_stack_regs()
which traces the stack via pt_regs instead of the current stack register.

e.g.
# echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events
# echo 1 > /sys/kernel/debug/tracing/options/stacktrace
# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
# head -n 20 /sys/kernel/debug/tracing/trace
bash-2968 [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca)
bash-2968 [000] 10297.050247: <stack trace>
=> schedule_timeout
=> n_tty_read
=> tty_read
=> vfs_read
=> sys_read
=> system_call_fastpath
kworker/0:1-2940 [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca)
kworker/0:1-2940 [000] 10297.050266: <stack trace>
=> worker_thread
=> kthread
=> kernel_thread_helper
sshd-1132 [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca)
sshd-1132 [000] 10297.050365: <stack trace>
=> sysret_careful

Note: Even with this fix, the first entry will be skipped
if the probe is put on the function entry area before
the frame pointer is set up (usually, that is 4 bytes
(push %bp; mov %sp %bp) on x86), because the stack unwinder
depends on the frame pointer.

Signed-off-by: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15
Signed-off-by: Steven Rostedt
-
Add a trace option to disable tracing on free. When this option is
set, a write into the free_buffer file will not only shrink the
ring buffer down to zero, but it will also disable tracing.

Cc: Vaibhav Nagarnaik
Signed-off-by: Steven Rostedt
26 May, 2011
1 commit
-
Witold reported a reboot caused by the selftests of the dynamic function
tracer. He sent me a config and I used ktest to do a config_bisect on it
(as my config did not cause the crash). It pointed out that the problem
config was CONFIG_PROVE_RCU.

What happened was that if multiple callbacks are attached to the
function tracer, we iterate a list of callbacks. Because the list is
managed by synchronize_sched() and preempt_disable, the access to the
pointers uses rcu_dereference_raw().

When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
debugging functions, which happen to be traced. The tracing of the debug
function would then call rcu_dereference_raw() which would then call the
debug function and then... well, you get the idea.

I first wrote two different patches to solve this bug:
1) add a __rcu_dereference_raw() that would not do any checks.
2) add notrace to the offending debug functions.

Both of these patches worked.
When I talked with Paul McKenney on IRC, he suggested adding recursion
detection instead. This seemed to be a better solution, so I decided to
implement it. As the task_struct already has a trace_recursion field to
detect recursion in the ring buffer, and that field only allows a very
small number, I decided to use that same variable to add flags that can detect
the recursion inside the infrastructure of the function tracer.

I plan to change it so that the task struct bit can be checked in
mcount, but as that requires changes to all archs, I will hold that off
to the next merge window.

Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Frederic Weisbecker
Cc: Paul E. McKenney
Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
Reported-by: Witold Baryluk
Signed-off-by: Steven Rostedt
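The recursion-detection idea can be sketched like this: a userspace toy
using a thread-local word in place of current->trace_recursion; names and
bit layout are illustrative.

#include <stdbool.h>
#include <stdio.h>

#define TRACE_FTRACE_BIT (1 << 0) /* one bit per trace context */

static _Thread_local unsigned long trace_recursion;

static bool trace_recursion_test_and_set(void)
{
    if (trace_recursion & TRACE_FTRACE_BIT)
        return false; /* already inside the tracer: bail out */
    trace_recursion |= TRACE_FTRACE_BIT;
    return true;
}

static void trace_recursion_clear(void)
{
    trace_recursion &= ~TRACE_FTRACE_BIT;
}

/* the traced callback refuses to re-enter itself */
static void function_trace_call(const char *func)
{
    if (!trace_recursion_test_and_set())
        return;
    printf("traced: %s\n", func); /* stands in for recording */
    trace_recursion_clear();
}

int main(void)
{
    function_trace_call("schedule");
    return 0;
}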
19 May, 2011
1 commit
-
Add some basic sanity tests for multiple users of the function
tracer at startup.

Signed-off-by: Steven Rostedt
10 Mar, 2011
2 commits
-
Move elements in struct tracer for better alignment.
Signed-off-by: Steven Rostedt
-
Add an "overwrite" trace_option for ftrace to control whether the buffer should
be overwritten on overflow or not. The default remains to overwrite old events
when the buffer is full. This patch adds the option to instead discard newest
events when the buffer is full. This is useful to get a snapshot of traces just
after enabling traces. Dropping the current event is also a simpler code path.

Signed-off-by: David Sharp
LKML-Reference:
Signed-off-by: Steven Rostedt
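The two overflow behaviors can be illustrated with a toy ring buffer in
standard C; the real ring buffer is far more involved, so treat this as a
sketch of the policy only.

#include <stdbool.h>
#include <stdio.h>

#define CAP 4

struct ring {
    int buf[CAP];
    int head, len; /* head indexes the oldest entry */
};

static bool ring_write(struct ring *r, int v, bool overwrite)
{
    if (r->len == CAP) {
        if (!overwrite)
            return false; /* option off: discard the newest */
        r->head = (r->head + 1) % CAP; /* default: drop oldest */
        r->len--;
    }
    r->buf[(r->head + r->len) % CAP] = v;
    r->len++;
    return true;
}

int main(void)
{
    struct ring r = { {0}, 0, 0 };

    for (int i = 1; i <= 6; i++)
        ring_write(&r, i, true); /* overwriting: keeps 3 4 5 6 */
    for (int i = 0; i < r.len; i++)
        printf("%d ", r.buf[(r.head + i) % CAP]);
    printf("\n");
    return 0;
}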
08 Feb, 2011
7 commits
-
Now that the filter logic does not require saving the pred results
on the stack, we can increase the max number of preds we allow.
As the preds are indexed by a short value, and we use the MSBs as flags,
we can increase the max preds to 2^14 (16384) which should be way
more than enough.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
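The packing described above can be sketched as follows; the specific flag
names and their layout here are illustrative, not the actual definitions.

#include <stdint.h>
#include <stdio.h>

#define FILTER_PRED_FOLD    (1 << 15) /* example MSB flag     */
#define FILTER_PRED_INVALID (1 << 14) /* example MSB flag     */
#define MAX_FILTER_PRED     (1 << 14) /* 16384 usable indexes */

static uint16_t pred_index(uint16_t v)
{
    return v & (MAX_FILTER_PRED - 1); /* strip the flag bits */
}

int main(void)
{
    uint16_t v = 123 | FILTER_PRED_FOLD;

    printf("index %u, folded %d\n",
           pred_index(v), !!(v & FILTER_PRED_FOLD));
    return 0;
}
-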
The MAX_FILTER_PRED is only needed by the kernel/trace/*.c files.
Move it to kernel/trace/trace.h.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
-
There are many cases where a filter will contain multiple ORs or
ANDs together near the leaves. Walking up and down the tree to get
to the next compare can be a waste.

If there are several ORs or ANDs together, fold them into a single
pred and allocate an array of the conditions that they check.
This will speed up the filter by linearly walking an array
and can still break out if a short circuit condition is met.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
-
Currently the filter_match_preds() requires a stack to push
and pop the preds to determine if the filter matches the record or not.
This has two drawbacks:

1) It requires a stack to store state information. As this is done
in fast paths we can't allocate the storage for this stack, and
we can't use a global as it must be re-entrant. The stack is stored
on the kernel stack and this greatly limits how many preds we
may allow.

2) All conditions are calculated even when a short circuit exists.
a || b will always calculate a and b even though a was determined
to be true.

Using a tree we can walk a constant structure that will save
the state as we go. The algorithm is simply:

pred = root;
do {
    switch (move) {
    case MOVE_DOWN:
        if (OR or AND) {
            pred = left;
            continue;
        }
        if (pred == root)
            break;
        match = pred->fn();
        pred = pred->parent;
        move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
        continue;

    case MOVE_UP_FROM_LEFT:
        /* Only OR or AND can be a parent */
        if (match && OR || !match && AND) {
            /* short circuit */
            if (pred == root)
                break;
            pred = pred->parent;
            move = left child ?
                MOVE_UP_FROM_LEFT :
                MOVE_UP_FROM_RIGHT;
            continue;
        }
        pred = pred->right;
        move = MOVE_DOWN;
        continue;

    case MOVE_UP_FROM_RIGHT:
        if (pred == root)
            break;
        pred = pred->parent;
        move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
        continue;
    }
    done = 1;
} while (!done);

This way there's no strict limit to how many preds we allow
and it also will short circuit the logical operations when possible.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
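For the curious, here is a runnable toy version of that walk, evaluating
"a || b" on a tiny pred tree. It is a sketch of the algorithm above in
standard C, not the kernel code; the types are invented for the demo.

#include <stdio.h>

enum op { OP_LEAF, OP_OR, OP_AND };
enum move { MOVE_DOWN, MOVE_UP_FROM_LEFT, MOVE_UP_FROM_RIGHT };

struct pred {
    enum op op;
    int value; /* leaf result, standing in for pred->fn() */
    struct pred *parent, *left, *right;
};

static int match_preds(struct pred *root)
{
    struct pred *pred = root;
    enum move move = MOVE_DOWN;
    int match = 0;

    for (;;) {
        switch (move) {
        case MOVE_DOWN:
            if (pred->op != OP_LEAF) {
                pred = pred->left;
                continue;
            }
            match = pred->value;
            if (pred == root)
                return match;
            move = (pred == pred->parent->left) ?
                MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
            pred = pred->parent;
            continue;
        case MOVE_UP_FROM_LEFT:
            /* only OR or AND can be a parent */
            if ((match && pred->op == OP_OR) ||
                (!match && pred->op == OP_AND)) {
                /* short circuit */
                if (pred == root)
                    return match;
                move = (pred == pred->parent->left) ?
                    MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
                pred = pred->parent;
                continue;
            }
            pred = pred->right;
            move = MOVE_DOWN;
            continue;
        case MOVE_UP_FROM_RIGHT:
            if (pred == root)
                return match;
            move = (pred == pred->parent->left) ?
                MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT;
            pred = pred->parent;
            continue;
        }
    }
}

int main(void)
{
    struct pred a = { OP_LEAF, 1, 0, 0, 0 };
    struct pred b = { OP_LEAF, 0, 0, 0, 0 };
    struct pred root = { OP_OR, 0, 0, 0, 0 };

    root.left = &a;
    root.right = &b;
    a.parent = b.parent = &root;

    printf("a || b = %d\n", match_preds(&root)); /* short-circuits on a */
    return 0;
}
-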
Currently we allocate an array of pointers to filter_preds, and then
allocate a separate filter_pred for each item in the array.
This adds slight overhead in the filters as it needs to dereference
twice to get to the op condition.

Allocating the preds themselves in a single array removes a dereference
as well as helps the cache footprint.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
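A sketch of the allocation change; the struct layout is illustrative.

#include <stdlib.h>

struct filter_pred { int op; /* ... fn, field, values ... */ };

/* before: an array of pointers to individually allocated preds */
static struct filter_pred **alloc_pred_pointers(int n)
{
    struct filter_pred **preds = calloc(n, sizeof(*preds));

    for (int i = 0; preds && i < n; i++)
        preds[i] = calloc(1, sizeof(**preds));
    return preds;
}

/* after: one contiguous, cache-friendly block */
static struct filter_pred *alloc_pred_array(int n)
{
    return calloc(n, sizeof(struct filter_pred));
}

int main(void)
{
    struct filter_pred **pp = alloc_pred_pointers(32);
    struct filter_pred *pa = alloc_pred_array(32);

    if (pp && pp[0])
        pp[0]->op = 1; /* two hops: pointer, then field */
    if (pa)
        pa[0].op = 1;  /* one hop: direct indexing      */
    /* cleanup elided in this sketch */
    return 0;
}
-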
For every filter that is made, we create predicates to hold every
operation within the filter. We have a max of 32 predicates that we
can hold. Currently, we allocate all 32 even if we only need to
use one.

Part of the reason we do this is that the filter can be used at
any moment by any event. Fortunately, the filter is only used
with preemption disabled. By resetting the count of preds used "n_preds"
to zero, then performing a synchronize_sched(), we can safely
free and reallocate a new array of preds.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
-
The ops OR and AND act differently from the other ops, as they
are the only ones to take other ops as their arguments.
These ops also change the logic of filter_match_preds.

By removing the OR and AND fn's we can also remove the val1 and val2
that are passed to all other fn's and are unused.

Cc: Tom Zanussi
Signed-off-by: Steven Rostedt
18 Oct, 2010
1 commit
-
Move the trace_graph_function() and print_graph_headers_flags() functions
to trace_functions_graph.c to be globally available.

Signed-off-by: Jiri Olsa
LKML-Reference:
Signed-off-by: Steven Rostedt
07 Aug, 2010
1 commit
-
…git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
05 Aug, 2010
1 commit
-
Add in a helper function to allow the kdb shell to dump the ftrace
buffer.

Modify trace.c to expose the capability to iterate over the ftrace
buffer in a read-only capacity.

Signed-off-by: Jason Wessel
Acked-by: Steven Rostedt
CC: Frederic Weisbecker
23 Jul, 2010
1 commit
-
…stedt/linux-2.6-trace into perf/core
21 Jul, 2010
2 commits
-
Documentation/trace/ftrace.txt says
buffer_size_kb:
This sets or displays the number of kilobytes each CPU
buffer can hold. The tracer buffers are the same size
for each CPU. The displayed number is the size of the
CPU buffer and not total size of all buffers. The
trace buffers are allocated in pages (blocks of memory
that the kernel uses for allocation, usually 4 KB in size).
If the last page allocated has room for more bytes
than requested, the rest of the page will be used,
making the actual allocation bigger than requested.
( Note, the size may not be a multiple of the page size
due to buffer management overhead. )This can only be updated when the current_tracer
is set to "nop".But it's incorrect. currently total memory consumption is
'buffer_size_kb x CPUs x 2'.Why two times difference is there? because ftrace implicitly allocate
the buffer for max latency too.That makes sad result when admin want to use large buffer. (If admin
want full logging and makes detail analysis). example, If admin
have 24 CPUs machine and write 200MB to buffer_size_kb, the system
consume ~10GB memory (200MB x 24 x 2). umm.. 5GB memory waste is
usually unacceptable.Fortunatelly, almost all users don't use max latency feature.
The max latency buffer can be disabled easily.This patch shrink buffer size of the max latency buffer if
unnecessary.Signed-off-by: KOSAKI Motohiro
LKML-Reference:
Signed-off-by: Steven Rostedt
-
We found that even enabling a single trace event that will rarely be
triggered can add big overhead to context switching.

(lmbench context switch test; first row with the event disabled,
second row with it enabled)
-------------------------------------------------
2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
------ ------ ------ ------ ------ ------- -------
2.19 2.3 2.21 2.56 2.13 2.54 2.07
2.39 2.51 2.35 2.75 2.27 2.81 2.24

The overhead is 6% ~ 11%.
It's because when a trace event is enabled 3 tracepoints (sched_switch,
sched_wakeup, sched_wakeup_new) will be activated to map pid to cmdname.

We'd like to avoid this overhead, so add a trace option '(no)record-cmd'
to allow disabling cmdline recording.

Signed-off-by: Li Zefan
LKML-Reference:
Signed-off-by: Steven Rostedt
20 Jul, 2010
2 commits
-
The special trace type was only used by sysprof. Let's remove it now
that the sysprof ftrace plugin has been dropped.

Signed-off-by: Frederic Weisbecker
Acked-by: Soeren Sandmann
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Li Zefan
-
The sysprof ftrace plugin doesn't seem to be seriously used
anywhere. There is a branch in the sysprof tree that makes
an interface to it, but the real sysprof tool uses either its
own module or perf events.

Drop the sysprof ftrace plugin then, as it's mostly useless.
Signed-off-by: Frederic Weisbecker
Acked-by: Soeren Sandmann
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Li Zefan
16 Jul, 2010
1 commit
-
The ksym (breakpoint) ftrace plugin has been superseded by perf
tools that are much more powerful for using the cpu breakpoints.
This tracer doesn't bring any more features. It has been deprecated
for a while now, so let's remove it.

Signed-off-by: Frederic Weisbecker
Cc: Steven Rostedt
Cc: Prasad
Cc: Ingo Molnar
29 Jun, 2010
1 commit
-
Every event has the same common fields, so it's a big waste of
memory to have a copy of those fields for every event.

Signed-off-by: Li Zefan
LKML-Reference:
Signed-off-by: Steven Rostedt
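The deduplication can be pictured like this: a sketch in which every event
points at one shared table of common fields instead of holding its own copy.
The field table mirrors the well-known common_* event fields; the struct
names are illustrative.

#include <stddef.h>
#include <stdio.h>

struct ftrace_field {
    const char *name, *type;
    size_t offset, size;
};

/* one shared table for the fields every event starts with */
static const struct ftrace_field common_fields[] = {
    { "common_type",          "unsigned short", 0, 2 },
    { "common_flags",         "unsigned char",  2, 1 },
    { "common_preempt_count", "unsigned char",  3, 1 },
    { "common_pid",           "int",            4, 4 },
};

struct ftrace_event {
    const struct ftrace_field *common; /* points at common_fields */
    struct ftrace_field *fields;       /* event-specific only     */
};

int main(void)
{
    printf("%zu common fields shared by all events\n",
           sizeof(common_fields) / sizeof(common_fields[0]));
    return 0;
}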
10 Jun, 2010
1 commit
-
…ic/random-tracing into perf/core
09 Jun, 2010
2 commits
-
We have been resisting new ftrace plugins and removing existing
ones, and kmemtrace has been superseded by kmem trace events
and perf-kmem, so we remove it.

Signed-off-by: Li Zefan
Acked-by: Pekka Enberg
Acked-by: Eduard - Gabriel Munteanu
Cc: Ingo Molnar
Cc: Steven Rostedt
[ remove kmemtrace from the makefile, handle slob too ]
Signed-off-by: Frederic Weisbecker
-
The boot tracer is useless. It simply logs the initcalls
but in fact these initcalls are also logged through printk
while using the initcall_debug kernel parameter.

Nobody seems to be using it so far, so just remove it.
Signed-off-by: WANG Cong
Cc: Chase Douglas
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Li Zefan
LKML-Reference:
[ remove the hooks in main.c, and the headers ]
Signed-off-by: Frederic Weisbecker
04 Jun, 2010
1 commit
-
The ftrace_preempt_disable/enable functions were to address a
recursive race caused by the function tracer. The function tracer
traces all functions which makes it easily susceptible to recursion.
One area was preempt_enable(). This would call the scheduler and
the scheduler would call the function tracer and loop.
(Or so it was thought.)

The ftrace_preempt_disable/enable was made to protect against recursion
inside the scheduler by storing the NEED_RESCHED flag. If it was
set before the ftrace_preempt_disable() it would not call schedule
on ftrace_preempt_enable(), thinking that if it was set before then
it would have already scheduled unless it was already in the scheduler.

This worked fine except in the case of SMP, where another task would set
the NEED_RESCHED flag for a task on another CPU, and then kick off an
IPI to trigger it. This could cause the NEED_RESCHED to be saved at
ftrace_preempt_disable() but the IPI to arrive in the preempt
disabled section. The ftrace_preempt_enable() would not call the scheduler
because the flag was already set before entering the section.

This bug would cause a missed preemption check and cause higher latencies.

Investigating further, I found that the recursion caused by the function
tracer was not due to schedule(), but due to preempt_schedule(). Now
that preempt_schedule is completely annotated with notrace, the recursion
is no longer an issue.

Reported-by: Thomas Gleixner
Signed-off-by: Steven Rostedt
18 May, 2010
1 commit
-
…nux-2.6-tip into trace/tip/tracing/core-6
Conflicts:
include/trace/ftrace.h
kernel/trace/trace_kprobe.c

Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>