09 Jan, 2012

1 commit

  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
    reiserfs: Properly display mount options in /proc/mounts
    vfs: prevent remount read-only if pending removes
    vfs: count unlinked inodes
    vfs: protect remounting superblock read-only
    vfs: keep list of mounts for each superblock
    vfs: switch ->show_options() to struct dentry *
    vfs: switch ->show_path() to struct dentry *
    vfs: switch ->show_devname() to struct dentry *
    vfs: switch ->show_stats to struct dentry *
    switch security_path_chmod() to struct path *
    vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
    vfs: trim includes a bit
    switch mnt_namespace ->root to struct mount
    vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
    vfs: opencode mntget() mnt_set_mountpoint()
    vfs: spread struct mount - remaining argument of next_mnt()
    vfs: move fsnotify junk to struct mount
    vfs: move mnt_devname
    vfs: move mnt_list to struct mount
    vfs: switch pnode.h macros to struct mount *
    ...

    Linus Torvalds
     

07 Jan, 2012

1 commit

  • * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits)
    perf kvm: Fix copy & paste error in description
    perf script: Kill script_spec__delete
    perf top: Fix a memory leak
    perf stat: Introduce get_ratio_color() helper
    perf session: Remove impossible condition check
    perf tools: Fix feature-bits rework fallout, remove unused variable
    perf script: Add generic perl handler to process events
    perf tools: Use for_each_set_bit() to iterate over feature flags
    perf tools: Unify handling of features when writing feature section
    perf report: Accept fifos as input file
    perf tools: Moving code in some files
    perf tools: Fix out-of-bound access to struct perf_session
    perf tools: Continue processing header on unknown features
    perf tools: Improve macros for struct feature_ops
    perf: builtin-record: Document and check that mmap_pages must be a power of two.
    perf: builtin-record: Provide advice if mmap'ing fails with EPERM.
    perf tools: Fix truncated annotation
    perf script: look up thread using tid instead of pid
    perf tools: Look up thread names for system wide profiling
    perf tools: Fix comm for processes with named threads
    ...

    Linus Torvalds
     

18 Nov, 2011

1 commit

  • Knowing the number of event entries in the ring buffer compared
    to the total number that were written is useful information. The
    latency format gives this information and there is no reason the
    default format should not.

    This information is now added to the default header, along with the
    number of online CPUs:

    # tracer: nop
    #
    # entries-in-buffer/entries-written: 159836/64690869 #P:4
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    <idle>-0 [000] ...2 49.442971: local_touch_nmi
    <idle>-0 [000] d..2 49.442973: enter_idle
    <idle>-0 [000] d..2 49.442974: atomic_notifier_call_chain
    <idle>-0 [000] d..2 49.442976: __atomic_notifier_call_chain
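
    To see this header on a live system, a minimal check (assuming debugfs is
    mounted at the usual location):

    head -n 20 /sys/kernel/debug/tracing/trace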

    Steven Rostedt
     

17 Nov, 2011

1 commit

  • People keep asking how to get the preempt count, irq, and need resched info
    and we keep telling them to enable the latency format. Some developers think
    that traces without this info are completely useless, and for a lot of tasks
    they are useless.

    The first option was to enable the latency trace as the default format, but
    the header for the latency format is pretty useless for most tracers, and
    it also reports the timestamp in straight microseconds from the time the
    trace started. This is sometimes more difficult to read, as the default
    trace shows seconds from the start of boot up.

    Latency format:

    # tracer: nop
    #
    # nop latency trace v1.1.5 on 3.2.0-rc1-test+
    # --------------------------------------------------------------------
    # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
    # -----------------
    # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
    # -----------------
    #
    # _------=> CPU#
    # / _-----=> irqs-off
    # | / _----=> need-resched
    # || / _---=> hardirq/softirq
    # ||| / _--=> preempt-depth
    # |||| / delay
    # cmd pid ||||| time | caller
    # \ / ||||| \ | /
    migratio-6 0...2 41778231us+: rcu_note_context_switch

    The new default format adds the irq, preempt-count and need-resched info:

    # tracer: nop
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |
    <idle>-0 [000] d..2 49.309305: cpuidle_get_driver
    <idle>-0 [000] d..2 49.309307: mwait_idle
    <idle>-0 [000] d..2 49.309309: need_resched
    <idle>-0 [000] d..2 49.309310: test_ti_thread_flag
    <idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13
    <idle>-0 [000] d..2 49.309313: trace_cpu_idle
    <idle>-0 [000] d..2 49.309315: need_resched

    The old default format (without this info):

    <idle>-0 [000] 49.309305: cpuidle_get_driver
    <idle>-0 [000] 49.309307: mwait_idle
    <idle>-0 [000] 49.309309: need_resched
    <idle>-0 [000] 49.309310: test_ti_thread_flag
    <idle>-0 [000] 49.309312: trace_power_start.constprop.13
    <idle>-0 [000] 49.309313: trace_cpu_idle
    <idle>-0 [000] 49.309315: need_resched

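    For reference, the latency format mentioned above is toggled through a
    trace option (path assuming the usual debugfs mount point); with this
    change the same context columns appear in the default output as well:

    # the old way to get the irq/preempt/need-resched columns
    echo 1 > /sys/kernel/debug/tracing/options/latency-format
    # back to the default format, which now carries the same columns
    echo 0 > /sys/kernel/debug/tracing/options/latency-format
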
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

08 Nov, 2011

1 commit

  • In case the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
    function tracer (CONFIG_FUNCTION_TRACER) is not set, the latency tracers
    do not display a proper latency header.

    The involved/fixed latency tracers are:
    wakeup_rt
    wakeup
    preemptirqsoff
    preemptoff
    irqsoff

    The patch adds proper handling of the tracer configuration options for
    latency tracers, and displays the correct header info accordingly.

    * The current output (for the wakeup tracer) with both the graph and
    function tracers disabled is:

    # tracer: wakeup
    #
    <idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0
    <idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1

    * The fixed output is (with the full latency header):

    # _------=> CPU#
    # / _-----=> irqs-off
    # | / _----=> need-resched
    # || / _---=> hardirq/softirq
    # ||| / _--=> preempt-depth
    # |||| / delay
    # cmd pid ||||| time | caller
    # \ / ||||| \ | /
    cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0
    cat-1129 0d..4 2us+: ttwu_do_activate.clone.1

    * The fixed output for a second configuration is:

    # _------=> CPU#
    # / _-----=> irqs-off
    # | / _----=> need-resched
    # || / _---=> hardirq/softirq
    # ||| / _--=> preempt-depth
    # |||| / delay
    # cmd pid ||||| time | caller
    # \ / ||||| \ | /
    <idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1
    <idle>-0 1d.h5 3us : ttwu_do_activate.clone.1
    Cc: Ingo Molnar
    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

26 Oct, 2011

1 commit

  • * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
    perf symbols: Increase symbol KSYM_NAME_LEN size
    perf hists browser: Refuse 'a' hotkey on non symbolic views
    perf ui browser: Use libslang to read keys
    perf tools: Fix tracing info recording
    perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
    perf hists: Don't consider filtered entries when calculating column widths
    perf hists: Don't decay total_period for filtered entries
    perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
    perf hists browser: Do not exit on tab key with single event
    perf annotate browser: Don't change selection line when returning from callq
    perf tools: handle endianness of feature bitmap
    perf tools: Add prelink suggestion to dso update message
    perf script: Fix unknown feature comment
    perf hists browser: Apply the dso and thread filters when merging new batches
    perf hists: Move the dso and thread filters from hist_browser
    perf ui browser: Honour the xterm colors
    perf top tui: Give color hints just on the percentage, like on --stdio
    perf ui browser: Make the colors configurable and change the defaults
    perf tui: Remove unneeded call to newtCls on startup
    perf hists: Don't format the percentage on hist_entry__snprintf
    ...

    Fix up conflicts in arch/x86/kernel/kprobes.c manually.

    Ingo's tree did the insane "add volatile to const array", which just
    doesn't make sense ("volatile const"?). But we could remove the const
    *and* make the array volatile to make doubly sure that gcc doesn't
    optimize it away..

    Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
    reader_lock has been turned into a raw lock by the core locking merge,
    and there was a new user of it introduced in this perf core merge. Make
    sure that new use also uses the raw accessor functions.

    Linus Torvalds
     

14 Oct, 2011

2 commits

  • The trace_pipe_raw handler holds a cached page from the time the file
    is opened to the time it is closed. The cached page is used to handle
    the case of the user space buffer being smaller than what was read from
    the ring buffer. The left over buffer is held in the cache so that the
    next read will continue where the data left off.

    After EOF is returned (no more data in the buffer), the index of
    the cached page is set to zero. If a user app reads the page again
    after EOF, the check in the buffer will see that the cached page
    is less than page size and will return the cached page again. This
    will cause reading the trace_pipe_raw again after EOF to return
    duplicate data, making the output look like the time went backwards
    but instead data is just repeated.

    The fix is to not reset the index right after all data is read
    from the cache, but to reset it after all data is read and more
    data exists in the ring buffer.
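
    A minimal way to exercise this path from user space, assuming the standard
    per-cpu layout under the tracing debugfs directory:

    # read one raw ring-buffer page from CPU 0; re-reading after EOF should now
    # return no data instead of a stale copy of the cached page
    dd if=/sys/kernel/debug/tracing/per_cpu/cpu0/trace_pipe_raw of=page.bin bs=4096 count=1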

    Cc: stable
    Reported-by: Jeremy Eder
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The tracing_enabled option is deprecated.
    To start or stop tracing, write to /sys/kernel/debug/tracing/tracing_on
    instead of using tracing_enabled. This patch is based on Linux 3.1.0-rc1.
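
    A usage sketch of the interface referenced above:

    # stop writing events into the ring buffer
    echo 0 > /sys/kernel/debug/tracing/tracing_on
    # resume tracing
    echo 1 > /sys/kernel/debug/tracing/tracing_on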

    Signed-off-by: Geunsik Lim
    Link: http://lkml.kernel.org/r/1313127022-23830-1-git-send-email-leemgs1@gmail.com
    Signed-off-by: Steven Rostedt

    Geunsik Lim
     

11 Oct, 2011

2 commits

  • When doing intense tracing, the kmalloc inside trace_marker can
    introduce side effects to what is being traced.

    As trace_marker() is used by userspace to inject data into the
    kernel ring buffer, it needs to do so with the least amount
    of intrusion to the operations of the kernel or the user space
    application.

    As the ring buffer is designed to write directly into the buffer
    without the need to make a temporary buffer, and userspace already
    went through the hassle of knowing how big the write will be,
    we can simply pin the userspace pages and write the data directly
    into the buffer. This improves the impact of tracing via trace_marker
    tremendously!

    Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the
    use of get_user_pages_fast() and kmap_atomic().
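
    A minimal usage sketch; the path is the standard tracing directory, and the
    tracing_mark_write label in the output is an assumption about how these
    writes show up in the trace:

    echo "hello from user space" > /sys/kernel/debug/tracing/trace_marker
    grep tracing_mark_write /sys/kernel/debug/tracing/trace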

    Suggested-by: Thomas Gleixner
    Suggested-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • As the function tracer is very intrusive, lots of self-checks are
    performed on the tracer, and if something is found to be strange
    it will shut itself down, keeping it from corrupting the rest of the
    kernel. This shutdown may still allow functions to be traced, as the
    tracing only stops new modifications from happening. Trying to stop
    the function tracer itself can cause more harm, as it requires code
    modification.

    Although a WARN_ON() is executed, a user may not notice it. To help
    the user see that something isn't right with the tracing of the system
    a big warning is added to the output of the tracer that lets the user
    know that their data may be incomplete.

    Reported-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

19 Sep, 2011

1 commit

  • When debugging tight race conditions, it can be helpful to have a
    synchronized tracing method. Although in most cases the global clock
    provides this functionality, if timing is not the issue it is more
    comforting to know that the events really happened in a precise
    order.

    Instead of using a clock, add a "counter" that is simply an incrementing
    atomic 64-bit counter that orders the events as they are perceived to
    happen.

    The trace_clock_counter() comes from Peter Zijlstra's attempt to convert
    trace_clock_global() to a counter. I took Peter's counter code, made it
    trace_clock_counter() instead, and added it to the choice of clocks.
    Just echo counter > /debug/tracing/trace_clock to activate it.
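
    A usage sketch; the commit text uses the older /debug mount point, the more
    common location is /sys/kernel/debug:

    # list the available clocks, the active one is shown in [brackets]
    cat /sys/kernel/debug/tracing/trace_clock
    echo counter > /sys/kernel/debug/tracing/trace_clock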

    Requested-by: Thomas Gleixner
    Requested-by: Peter Zijlstra
    Reviewed-By: Valdis Kletnieks
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

13 Sep, 2011

1 commit

  • The tracing locks can be taken in atomic context and therefore
    cannot be preempted on -rt - annotate it.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

31 Aug, 2011

2 commits

  • The stats file under per_cpu folder provides the number of entries,
    overruns and other statistics about the CPU ring buffer. However, the
    numbers do not provide any indication of how full the ring buffer is in
    bytes compared to the overall size in bytes. Also, it is helpful to know
    the rate at which the cpu buffer is filling up.

    This patch adds an entry "bytes: " in printed stats for per_cpu ring
    buffer which provides the actual bytes consumed in the ring buffer. This
    field includes the number of bytes used by recorded events and the
    padding bytes added when moving the tail pointer to next page.

    It also adds the following time stamps:
    "oldest event ts:" - the oldest timestamp in the ring buffer
    "now ts:" - the timestamp at the time of reading

    The field "now ts" provides a consistent time snapshot to the userspace
    when being read. This is read from the same trace clock used by tracing
    event timestamps.

    Together, these values provide the rate at which the buffer is filling
    up, from the formula:
    bytes / (now_ts - oldest_event_ts)
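
    A sketch of that calculation against one CPU's stats file, using the field
    labels described above (the exact formatting of the file is assumed):

    awk -F': ' '
        /^bytes/           { bytes  = $2 }
        /^oldest event ts/ { oldest = $2 }
        /^now ts/          { now    = $2 }
        END { if (now > oldest) printf "fill rate: %.0f bytes/sec\n", bytes / (now - oldest) }
    ' /sys/kernel/debug/tracing/per_cpu/cpu0/stats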

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Michael Rubin
    Cc: David Sharp
    Link: http://lkml.kernel.org/r/1313531179-9323-3-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     
  • The current file "buffer_size_kb" reports the size of per-cpu buffer and
    not the overall memory allocated which could be misleading. A new file
    "buffer_total_size_kb" adds up all the enabled CPU buffer sizes and
    reports it. This is only a readonly entry.
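
    A usage sketch, assuming the standard tracing directory:

    # size of one per-cpu buffer
    cat /sys/kernel/debug/tracing/buffer_size_kb
    # total across all enabled CPUs (read-only)
    cat /sys/kernel/debug/tracing/buffer_total_size_kb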

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Michael Rubin
    Cc: David Sharp
    Link: http://lkml.kernel.org/r/1313531179-9323-2-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     

15 Jul, 2011

1 commit

  • Currently the stack trace per event in ftrace is only 8 frames.
    This can be quite limiting and sometimes useless, especially when
    the "ignore frames" count is wrong and we also use up stack frames for
    the event processing itself.

    Change this to be dynamic by adding a percpu buffer that we can
    write a large stack frame into and then copy into the ring buffer.

    Interrupts and NMIs that come in while another event is being
    processed will only get to use the 8-frame stack. That should be enough,
    as the task that they interrupted will have the full stack frame anyway.

    Requested-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

15 Jun, 2011

7 commits

  • Fix to support kernel stack trace correctly on kprobe-tracer.
    Since the execution path of kprobe-based dynamic events is different
    from other tracepoint-based events, normal ftrace_trace_stack() doesn't
    work correctly. To fix that, this introduces ftrace_trace_stack_regs()
    which traces stack via pt_regs instead of current stack register.

    e.g.

    # echo p schedule+4 > /sys/kernel/debug/tracing/kprobe_events
    # echo 1 > /sys/kernel/debug/tracing/options/stacktrace
    # echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
    # head -n 20 /sys/kernel/debug/tracing/trace
    bash-2968 [000] 10297.050245: p_schedule_4: (schedule+0x4/0x4ca)
    bash-2968 [000] 10297.050247:
    => schedule_timeout
    => n_tty_read
    => tty_read
    => vfs_read
    => sys_read
    => system_call_fastpath
    kworker/0:1-2940 [000] 10297.050265: p_schedule_4: (schedule+0x4/0x4ca)
    kworker/0:1-2940 [000] 10297.050266:
    => worker_thread
    => kthread
    => kernel_thread_helper
    sshd-1132 [000] 10297.050365: p_schedule_4: (schedule+0x4/0x4ca)
    sshd-1132 [000] 10297.050365:
    => sysret_careful

    Note: Even with this fix, the first entry will be skipped
    if the probe is put on the function entry area before
    the frame pointer is set up (usually, that is 4 bytes
    (push %bp; mov %sp %bp) on x86), because stack unwinder
    depends on the frame pointer.

    Signed-off-by: Masami Hiramatsu
    Cc: Frederic Weisbecker
    Cc: yrl.pp-manager.tt@hitachi.com
    Cc: Peter Zijlstra
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/20110608070934.17777.17116.stgit@fedora15
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • This patch replaces the code for getting an unsigned long from a
    userspace buffer with a simple call to kstrtoul_from_user.
    This makes it easier to read and less error prone.

    Signed-off-by: Peter Huewe
    Link: http://lkml.kernel.org/r/1307476707-14762-1-git-send-email-peterhuewe@gmx.de
    Signed-off-by: Steven Rostedt

    Peter Huewe
     
  • The header display of the function tracer does not follow
    the context-info option, so field names are displayed even
    when this option is off.

    Added check for TRACE_ITER_CONTEXT_INFO trace_flags.

    With following commands:
    # echo function > ./current_tracer
    # echo 0 > options/context-info
    # cat trace

    This is what it looked like before:
    # tracer: function
    #
    # TASK-PID CPU# TIMESTAMP FUNCTION
    # | | | | |
    add_preempt_count
    Link: http://lkml.kernel.org/r/1307113131-10045-4-git-send-email-jolsa@redhat.com
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Add a trace option to disable tracing on free. When this option is
    set, a write into the free_buffer file will not only shrink the
    ring buffer down to zero, but it will also disable tracing.
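
    A minimal sketch of the intended flow; the option name disable_on_free is
    my assumption for how this knob is exposed under options/:

    echo 1 > /sys/kernel/debug/tracing/options/disable_on_free
    # with the option set, this write now disables tracing as well as
    # shrinking the ring buffer down to zero
    echo 0 > /sys/kernel/debug/tracing/free_buffer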

    Cc: Vaibhav Nagarnaik
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The proc file entry buffer_size_kb is used to set the size of tracing
    buffer. The memory to expand the buffer size is kernel memory. Consider
    a use case where tracing is handled by a user space utility, which acts
    as a gate keeper for tracing requests. In an OOM condition, tracing is
    considered a low priority task and if the utility gets killed the ring
    buffer memory cannot be released back to the kernel.

    This patch adds a proc file called "free_buffer" whose purpose is to
    stop tracing and free up the ring buffer when it is closed.

    The user space process can then set the desired size in buffer_size_kb
    file and open the fd to the "free_buffer" file. Under OOM condition, if
    the process gets killed, the kernel closes the file descriptor. The
    release handler stops the tracing and releases the kernel memory
    automatically.
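
    A sketch of the gate-keeper flow described above (the open mode used for
    free_buffer is an assumption):

    cd /sys/kernel/debug/tracing
    # expand the per-cpu buffers for this tracing session
    echo 10240 > buffer_size_kb
    # hold free_buffer open for the lifetime of the tracing session
    exec 3> free_buffer
    # ... run the traced workload ...
    # closing the fd (explicitly, or implicitly when the process is OOM-killed)
    # stops tracing and returns the ring buffer memory to the kernel
    exec 3>&-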

    Cc: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Michael Rubin
    Cc: David Sharp
    Signed-off-by: Vaibhav Nagarnaik
    Link: http://lkml.kernel.org/r/1308012717-11148-1-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     
  • The tracing ring buffer is a group of per-cpu ring buffers where
    allocation and logging is done on a per-cpu basis. The events that are
    generated on a particular CPU are logged in the corresponding buffer.
    This is to provide wait-free writes between CPUs and good NUMA node
    locality while accessing the ring buffer.

    However, the allocation routines consider NUMA locality only for buffer
    page metadata and not for the actual buffer page. This causes the pages
    to be allocated on the NUMA node local to the CPU where the allocation
    routine is running at the time.

    This patch fixes the problem by using a NUMA node specific allocation
    routine so that the pages are allocated from a NUMA node local to the
    logging CPU.

    I tested with the getuid_microbench from autotest. It is a simple binary
    that calls getuid() in a loop and measures the average time for the
    syscall to complete. The following command was used to test:
    $ getuid_microbench 1000000

    Compared the numbers found on kernel with and without this patch and
    found that logging latency decreases by 30-50 ns/call.
    tracing with non-NUMA allocation - 569 ns/call
    tracing with NUMA allocation - 512 ns/call

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Michael Rubin
    Cc: David Sharp
    Link: http://lkml.kernel.org/r/1304470602-20366-1-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     
  • When syscall tracing is used by concurrent processes, the wakeup() that is
    called in the event commit function causes contention on the spin lock
    of the waitqueue. I enabled the sys_enter_getuid and sys_exit_getuid
    tracepoints, and by running getuid_microbench from autotest in parallel
    I found that the contention causes an exponential latency increase in the
    tracing path.

    The autotest binary getuid_microbench calls getuid() in a tight loop for
    the given number of iterations and measures the average time required to
    complete a single invocation of syscall.

    The patch schedules a delayed work item 2 ms after an event commit
    to wake up the trace wait_queue. This removes the delay caused by
    contention on the spin lock in wakeup() and amortizes the wakeup() calls
    over the 2 ms period.

    In the following example, the script enables the sys_enter_getuid and
    sys_exit_getuid tracepoints and runs the getuid_microbench in parallel
    with the given number of processes. The output clearly shows the latency
    increase caused by contentions.
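
    The getuid.sh wrapper itself is not shown; a plausible sketch of what it
    does, per the description above (tracepoint paths assumed to be the
    standard syscalls event directory):

    #!/bin/sh
    # usage: getuid.sh <nr-processes>
    cd /sys/kernel/debug/tracing
    echo 1 > events/syscalls/sys_enter_getuid/enable
    echo 1 > events/syscalls/sys_exit_getuid/enable
    for i in $(seq "$1"); do
        getuid_microbench 1000000 &
    done
    wait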

    $ ~/getuid.sh 1
    1000000 calls in 0.720974253 s (720.974253 ns/call)

    $ ~/getuid.sh 2
    1000000 calls in 1.166457554 s (1166.457554 ns/call)
    1000000 calls in 1.168933765 s (1168.933765 ns/call)

    $ ~/getuid.sh 3
    1000000 calls in 1.783827516 s (1783.827516 ns/call)
    1000000 calls in 1.795553270 s (1795.553270 ns/call)
    1000000 calls in 1.796493376 s (1796.493376 ns/call)

    $ ~/getuid.sh 4
    1000000 calls in 4.483041796 s (4483.041796 ns/call)
    1000000 calls in 4.484165388 s (4484.165388 ns/call)
    1000000 calls in 4.484850762 s (4484.850762 ns/call)
    1000000 calls in 4.485643576 s (4485.643576 ns/call)

    $ ~/getuid.sh 5
    1000000 calls in 6.497521653 s (6497.521653 ns/call)
    1000000 calls in 6.502000236 s (6502.000236 ns/call)
    1000000 calls in 6.501709115 s (6501.709115 ns/call)
    1000000 calls in 6.502124100 s (6502.124100 ns/call)
    1000000 calls in 6.502936358 s (6502.936358 ns/call)

    After the patch, the latencies scale better.
    1000000 calls in 0.728720455 s (728.720455 ns/call)

    1000000 calls in 0.842782857 s (842.782857 ns/call)
    1000000 calls in 0.883803135 s (883.803135 ns/call)

    1000000 calls in 0.902077764 s (902.077764 ns/call)
    1000000 calls in 0.902838202 s (902.838202 ns/call)
    1000000 calls in 0.908896885 s (908.896885 ns/call)

    1000000 calls in 0.932523515 s (932.523515 ns/call)
    1000000 calls in 0.958009672 s (958.009672 ns/call)
    1000000 calls in 0.986188020 s (986.188020 ns/call)
    1000000 calls in 0.989771102 s (989.771102 ns/call)

    1000000 calls in 0.933518391 s (933.518391 ns/call)
    1000000 calls in 0.958897947 s (958.897947 ns/call)
    1000000 calls in 1.031038897 s (1031.038897 ns/call)
    1000000 calls in 1.089516025 s (1089.516025 ns/call)
    1000000 calls in 1.141998347 s (1141.998347 ns/call)

    Signed-off-by: Vaibhav Nagarnaik
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Michael Rubin
    Cc: David Sharp
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1305059241-7629-1-git-send-email-vnagarnaik@google.com
    Signed-off-by: Steven Rostedt

    Vaibhav Nagarnaik
     

07 May, 2011

1 commit

  • This partially reverts commit e6e1e2593592a8f6f6380496655d8c6f67431266.

    That commit changed the structure layout of the trace structure, which
    in turn broke PowerTOP (1.9x generation) quite badly.

    I appreciate not wanting to expose the variable in question, and
    PowerTOP was not using it, so I've replaced the variable with just a
    padding field - that way if in the future a new field is needed it can
    just use this padding field.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

05 Apr, 2011

1 commit

  • Running the following commands:

    # enable the binary option
    echo 1 > ./options/bin
    # disable context info option
    echo 0 > ./options/context-info
    # tracing only events
    echo 1 > ./events/enable
    cat trace_pipe

    while forcing the system to generate many tracing events causes a lockup
    (on non-preemptive kernels) inside the tracing_read_pipe function.

    The issue is also easily reproduced by running the LTP stress test
    (ftrace_stress_test.sh).

    The reasons are:
    - the bin/hex/raw output functions for events are set to the
    trace_nop_print function, which prints nothing and
    returns the TRACE_TYPE_HANDLED value
    - the LOST EVENT trace does not handle trace_seq overflow

    These reasons force the while loop in tracing_read_pipe
    function never to break.

    The attached patch fixes handling of the lost event trace, and
    changes trace_nop_print to print minimal info, which is needed
    for correct tracing_read_pipe processing.

    v2 changes:
    - omit the cond_resched changes by trace_nop_print changes
    - WARN changed to WARN_ONCE and added info to be able
    to find out the culprit

    v3 changes:
    - make more accurate patch comment

    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     

10 Mar, 2011

3 commits

  • If the kernel command line declares a tracer "ftrace=sometracer" and
    that tracer is either not defined or is enabled after irqsoff,
    then the irqs off selftest will fail with the following error:

    Testing tracer irqsoff:
    ------------[ cut here ]------------
    WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/trace.c:713 update_max_tr_single+0xfa/0x11b()
    Hardware name:
    Modules linked in:
    Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1
    Call Trace:
    [] ? warn_slowpath_common+0x65/0x7a
    [] ? update_max_tr_single+0xfa/0x11b
    [] ? warn_slowpath_null+0xf/0x13
    [] ? update_max_tr_single+0xfa/0x11b
    [] ? stop_critical_timing+0x154/0x204
    [] ? trace_selftest_startup_irqsoff+0x5b/0xc1
    [] ? trace_selftest_startup_irqsoff+0x5b/0xc1
    [] ? trace_selftest_startup_irqsoff+0x5b/0xc1
    [] ? time_hardirqs_on+0x25/0x28
    [] ? trace_hardirqs_on_caller+0x18/0x12f
    [] ? trace_hardirqs_on+0xb/0xd
    [] ? trace_selftest_startup_irqsoff+0x5b/0xc1
    [] ? register_tracer+0xf8/0x1a3
    [] ? init_irqsoff_tracer+0xd/0x11
    [] ? do_one_initcall+0x71/0x121
    [] ? init_irqsoff_tracer+0x0/0x11
    [] ? kernel_init+0x13a/0x1b6
    [] ? kernel_init+0x0/0x1b6
    [] ? kernel_thread_helper+0x6/0x10
    ---[ end trace e93713a9d40cd06c ]---
    .. no entries found ..FAILED!

    What happens is the "ftrace=..." will expand the ring buffer to its
    default size (from its minimum size) but it will not expand the
    max ring buffer (the ring buffer to store maximum latencies).
    When the irqsoff test runs, it will call the ring buffer swap routine
    that checks if the max ring buffer is the same size as the normal
    ring buffer, and will fail if it is not. This causes the test to fail.

    The solution is to expand the max ring buffer before running the self
    test if the max ring buffer is used by that tracer and the normal ring
    buffer is expanded. The max ring buffer should be shrunk again after
    the test is done to save space.
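
    For reference, the boot-time tracer selection referred to above is a kernel
    command-line parameter; an illustrative fragment (the tracer name is
    arbitrary):

    # appended to the kernel command line by the boot loader
    ftrace=irqsoff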

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The lock_depth field in the event headers was added as a temporary
    data point to help in removing the BKL. Now that the BKL has pretty
    much been removed, we can remove this field.

    This in turn changes the header from 12 bytes to 8 bytes,
    removing the 4 bytes of padding that gcc would insert if the first field
    in the data load was 8 bytes in size.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add an "overwrite" trace_option for ftrace to control whether the buffer should
    be overwritten on overflow or not. The default remains to overwrite old events
    when the buffer is full. This patch adds the option to instead discard newest
    events when the buffer is full. This is useful to get a snapshot of traces just
    after enabling traces. Dropping the current event is also a simpler code path.
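
    A usage sketch of the new option, assuming the standard tracing directory:

    # drop the newest events when the buffer is full (snapshot-style tracing)
    echo 0 > /sys/kernel/debug/tracing/options/overwrite
    # default behaviour: overwrite the oldest events when the buffer is full
    echo 1 > /sys/kernel/debug/tracing/options/overwrite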

    Signed-off-by: David Sharp
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    David Sharp
     

08 Jan, 2011

1 commit

  • While running my ftrace stress test, this showed up:

    BUG: sleeping function called from invalid context at mm/mmap.c:233
    ...
    note: cat[3293] exited with preempt_count 1

    The bug was introduced by commit 91e86e560d0b3ce4c5fc64fd2bbb99f856a30a4e
    ("tracing: Fix recursive user stack trace")

    Cc:
    Signed-off-by: Li Zefan
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Li Zefan
     

01 Dec, 2010

1 commit

  • The file_ops struct for the "trace" special file defined llseek as seq_lseek().
    However, if the file was opened for writing only, seq_open() was not called,
    and the seek would dereference a null pointer, file->private_data.

    This patch introduces a new wrapper for seq_lseek() which checks if the file
    descriptor is opened for reading first. If not, it does nothing.

    Cc:
    Signed-off-by: Slava Pestov
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Slava Pestov
     

27 Nov, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf symbols: Remove incorrect open-coded container_of()
    perf record: Handle restrictive permissions in /proc/{kallsyms,modules}
    x86/kprobes: Prevent kprobes to probe on save_args()
    irq_work: Drop cmpxchg() result
    perf: Fix owner-list vs exit
    x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG
    tracing: Fix recursive user stack trace
    perf,hw_breakpoint: Initialize hardware api earlier
    x86: Ignore trap bits on single step exceptions
    tracing: Force arch_local_irq_* notrace for paravirt
    tracing: Fix module use of trace_bprintk()

    Linus Torvalds
     

13 Nov, 2010

1 commit

  • The user stack trace can fault when examining the trace, which
    calls the do_page_fault handler, which traces again,
    which does the user stack trace, which faults and calls
    do_page_fault again ...

    Thus this causes a recursive bug. We need to have a recursion
    detector here.

    [ Resubmitted by Jiri Olsa ]

    [ Eric Dumazet recommended using __this_cpu_* instead of __get_cpu_* ]

    Cc: Eric Dumazet
    Signed-off-by: Jiri Olsa
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Steven Rostedt