Eric Lee / smarc-fsl-linux-kernel

10 Mar, 2009

1 commit

12e87e36e Merge branches 'tracing/doc', 'tracing/ftrace', 'tracing/printk' and 'linus' into tracing/core Browse Code »

Ingo Molnar
2009-03-10 16:56:25 +0800

09 Mar, 2009

1 commit

c3ffc7a40 tracing: Don't use tracing_record_cmdline() in workqueue tracer ... Browse Code »

Impact: improve workqueue tracer output

Currently, /sys/kernel/debug/tracing/trace_stat/workqueues can display
wrong and strange thread names.

Why?

Currently, ftrace has tracing_record_cmdline()/trace_find_cmdline()
convenience function that implements a task->comm string cache.

This can avoid unnecessary memcpy overhead and the workqueue tracer
uses it.

However, in general, any trace statistics feature shouldn't use
tracing_record_cmdline() because trace statistics can display
very old process. Then comm cache can return wrong string because
recent process overrides the cache.

Fortunately, workqueue trace guarantees that displayed processes
are live. Thus we can search comm string from PID at display time.

% cat workqueues
# CPU INSERTED EXECUTED NAME
# | | | |

7 431913 431913 kondemand/7
7 0 0 tail
7 21 21 git
7 0 0 ls
7 9 9 cat
7 832632 832632 unix_chkpwd
7 236292 236292 ls

Note: tail, git, ls, cat unix_chkpwd are obiously not workqueue thread.

% cat workqueues
# CPU INSERTED EXECUTED NAME
# | | | |

7 510 510 kondemand/7
7 0 0 kmpathd/7
7 15 15 ata/7
7 0 0 aio/7
7 11 11 kblockd/7
7 1063 1063 work_on_cpu/7
7 167 167 events/7

Signed-off-by: KOSAKI Motohiro
Cc: Lai Jiangshan
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Signed-off-by: Ingo Molnar

KOSAKI Motohiro
2009-03-09 17:26:13 +0800

08 Mar, 2009

2 commits

888b55dc3 ftrace: tracing header should put '#' at the beginning of a line ... Browse Code »

In a recent discussion, Andrew Morton pointed out that tracing header
should put '#' at the beginning of a line.

Then, we can easily filtered the header by following grep usage:

cat trace | grep -v '^#'

Wakeup trace also has the same header problem.

Comparison of headers displayed:

before this patch:

# tracer: wakeup
#
wakeup latency trace v1.1.5 on 2.6.29-rc7-tip-tip
--------------------------------------------------------------------
latency: 19059 us, #21277/21277, CPU#1 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
-----------------
| task: kondemand/1-1644 (uid:0 nice:-5 policy:0 rt_prio:0)
-----------------

# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
irqbalan-1887 1d.s. 0us : 1887:120:R + [001] 1644:115:S kondemand/1
irqbalan-1887 1d.s. 1us : default_wake_function CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
sshd-2496 0d.s. 0us : 2496:120:R + [000] 1641:115:S kondemand/0
sshd-2496 0d.s. 1us : default_wake_function
Cc: Andrew Morton
LKML-Reference:
Signed-off-by: Ingo Molnar

KOSAKI Motohiro
2009-03-08 23:57:22 +0800
dba58e39c Merge branches 'tracing/doc', 'tracing/ftrace', 'tracing/printk' and 'tracing/te… ... Browse Code »

…xtedit' into tracing/core

Ingo Molnar
2009-03-08 23:48:51 +0800

07 Mar, 2009

4 commits

9de36825b tracing: trace_bprintk() cleanups ... Browse Code »

Impact: cleanup

Remove a few leftovers and clean up the code a bit.

Signed-off-by: Frederic Weisbecker
Cc: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-07 00:59:12 +0800
769b0441f tracing/core: drop the old trace_printk() implementation in favour of trace_bprintk() ... Browse Code »

Impact: faster and lighter tracing

Now that we have trace_bprintk() which is faster and consume lesser
memory than trace_printk() and has the same purpose, we can now drop
the old implementation in favour of the binary one from trace_bprintk(),
which means we move all the implementation of trace_bprintk() to
trace_printk(), so the Api doesn't change except that we must now use
trace_seq_bprintk() to print the TRACE_PRINT entries.

Some changes result of this:

- Previously, trace_bprintk depended of a single tracer and couldn't
work without. This tracer has been dropped and the whole implementation
of trace_printk() (like the module formats management) is now integrated
in the tracing core (comes with CONFIG_TRACING), though we keep the file
trace_printk (previously trace_bprintk.c) where we can find the module
management. Thus we don't overflow trace.c

- changes some parts to use trace_seq_bprintk() to print TRACE_PRINT entries.

- change a bit trace_printk/trace_vprintk macros to support non-builtin formats
constants, and fix 'const' qualifiers warnings. But this is all transparent for
developers.

- etc...

V2:

- Rebase against last changes
- Fix mispell on the changelog

V3:

- Rebase against last changes (moving trace_printk() to kernel.h)

Signed-off-by: Frederic Weisbecker
Acked-by: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar

Frederic Weisbecker
2009-03-07 00:59:12 +0800
1ba28e02a tracing: add trace_bprintk() ... Browse Code »

Impact: add a generic printk() for tracing, like trace_printk()

trace_bprintk() uses the infrastructure to record events on ring_buffer.

[ fweisbec@gmail.com: ported to latest -tip, made it work if
!CONFIG_MODULES, never free the format strings from modules
because we can't keep track of them and conditionnaly create
the ftrace format strings section (reported by Steven Rostedt) ]

Signed-off-by: Lai Jiangshan
Signed-off-by: Frederic Weisbecker
Acked-by: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar

Lai Jiangshan
2009-03-07 00:59:11 +0800
1427cdf05 tracing: infrastructure for supporting binary record ... Browse Code »

Impact: save on memory for tracing

Current tracers are typically using a struct(like struct ftrace_entry,
struct ctx_switch_entry, struct special_entr etc...)to record a binary
event. These structs can only record a their own kind of events.
A new kind of tracer need a new struct and a lot of code too handle it.

So we need a generic binary record for events. This infrastructure
is for this purpose.

[fweisbec@gmail.com: rebase against latest -tip, make it safe while sched
tracing as reported by Steven Rostedt]

Signed-off-by: Lai Jiangshan
Signed-off-by: Frederic Weisbecker
Acked-by: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar

Lai Jiangshan
2009-03-07 00:59:11 +0800

06 Mar, 2009

11 commits

4460fdad8 tracing, Text Edit Lock - kprobes architecture independent support ... Browse Code »

Use the mutual exclusion provided by the text edit lock in the kprobes code. It
allows coherent manipulation of the kernel code by other subsystems.

Changelog:

Move the kernel_text_lock/unlock out of the for loops.
Use text_mutex directly instead of a function.
Remove whitespace modifications.

(note : kprobes_mutex is always taken outside of text_mutex)

Signed-off-by: Mathieu Desnoyers
Acked-by: Ananth N Mavinakayanahalli
Acked-by: Masami Hiramatsu
LKML-Reference:
Signed-off-by: Ingo Molnar

Mathieu Desnoyers
2009-03-06 23:49:00 +0800
f0ef03985 Merge branch 'x86/core' into tracing/textedit ... Browse Code »

Conflicts:
arch/x86/Kconfig
block/blktrace.c
kernel/irq/handle.c

Semantic conflict:
kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-06 23:45:01 +0800
10dd3ebe2 tracing: fix deadlock when setting set_ftrace_pid ... Browse Code »

Impact: fix deadlock while using set_ftrace_pid

Reproducer:

# cd /sys/kernel/debug/tracing
# echo $$ > set_ftrace_pid

then, console becomes hung.

Details:

when writing set_ftracepid, kernel callstack is following

ftrace_pid_write()
mutex_lock(&ftrace_lock);
ftrace_update_pid_func()
mutex_lock(&ftrace_lock);
mutex_unlock(&ftrace_lock);
mutex_unlock(&ftrace_lock);

then, system always deadlocks when ftrace_pid_write() is called.

In past days, ftrace_pid_write() used ftrace_start_lock, but
commit e6ea44e9b4c12325337cd1c06103cd515a1c02b2 consolidated
ftrace_start_lock to ftrace_lock.

Signed-off-by: KOSAKI Motohiro
Reviewed-by: Lai Jiangshan
Cc: Steven Rostedt
LKML-Reference:
Signed-off-by: Ingo Molnar

KOSAKI Motohiro
2009-03-06 19:07:38 +0800
422d3c7a5 tracing: current tip/master can't enable ftrace ... Browse Code »

After commit 40ada30f9621fbd831ac2437b9a2a399aad34b00,
"make menuconfig" doesn't display "Tracer" item.

Following modification restores it.

KOSAKI Motohiro
2009-03-06 18:56:42 +0800
bc722f508 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/g… ... Browse Code »

…it/rostedt/linux-2.6-trace into tracing/ftrace

Ingo Molnar
2009-03-06 18:40:37 +0800
160974397 Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core Browse Code »

Ingo Molnar
2009-03-06 18:39:18 +0800
770cb2434 tracing: add format files for ftrace default entries ... Browse Code »

Impact: allow user apps to read binary format of basic ftrace entries

Currently, only defined raw events export their formats so a binary
reader can parse them. There's no reason that the default ftrace entries
can't export their formats.

This patch adds a subsystem called "ftrace" in the events directory
that includes the ftrace entries for basic ftrace recorded items.

These only have three files in the events directory:

type : printf
available_types : printf
format : format for the event entry

For example:

# cat /debug/tracing/events/ftrace/wakeup/format
name: wakeup
ID: 3
format:
field:unsigned char type; offset:0; size:1;
field:unsigned char flags; offset:1; size:1;
field:unsigned char preempt_count; offset:2; size:1;
field:int pid; offset:4; size:4;
field:int tgid; offset:8; size:4;

field:unsigned int prev_pid; offset:12; size:4;
field:unsigned char prev_prio; offset:16; size:1;
field:unsigned char prev_state; offset:17; size:1;
field:unsigned int next_pid; offset:20; size:4;
field:unsigned char next_prio; offset:24; size:1;
field:unsigned char next_state; offset:25; size:1;
field:unsigned int next_cpu; offset:28; size:4;

print fmt: "%u:%u:%u ==+ %u:%u:%u [%03u]"

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-06 10:46:44 +0800
33b0c229e tracing: move print of event format to separate file ... Browse Code »

Impact: clean up

Move the macro that creates the event format file to a separate header.
This will allow the default ftrace events to use this same macro
to create the formats to read those events.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-06 10:46:42 +0800
5e2336a0d tracing: make all file_operations const ... Browse Code »

Impact: cleanup

All file_operations structures should be constant. No one is going to
change them.

Reported-by: Andrew Morton
Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-06 10:46:40 +0800
40ada30f9 tracing: clean up menu ... Browse Code »

Clean up menu structure, introduce TRACING_SUPPORT switch that signals
whether an architecture supports various instrumentation mechanisms.

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-06 04:53:25 +0800
a1413c89a Merge branch 'x86/urgent' into x86/core ... Browse Code »

Conflicts:
arch/x86/include/asm/fixmap_64.h
Semantic merge:
arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-06 04:48:50 +0800

05 Mar, 2009

15 commits

0012693ad tracing/function-graph-tracer: use the more lightweight local clock ... Browse Code »

Impact: decrease hangs risks with the graph tracer on slow systems

Since the function graph tracer can spend too much time on timer
interrupts, it's better now to use the more lightweight local
clock. Anyway, the function graph traces are more reliable on a
per cpu trace.

Signed-off-by: Frederic Weisbecker
Cc: Steven Rostedt
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Frederic Weisbecker
2009-03-05 19:14:41 +0800
5e1607a00 tracing: rename ftrace_printk() => trace_printk() ... Browse Code »

Impact: cleanup

Use a more generic name - this also allows the prototype to move
to kernel.h and be generally available to kernel developers who
want to do some quick tracing.

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-03-05 17:24:48 +0800
e9d25fe6e tracing: have latency tracers set the latency format ... Browse Code »

The latency tracers (irqsoff, preemptoff, preemptirqsoff, and wakeup)
are pretty useless with the default output format. This patch makes them
automatically enable the latency format when they are selected. They
also record the state of the latency option, and if it was not enabled
when selected, they disable it on reset.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 11:15:30 +0800
27d48be84 tracing: consolidate print_lat_fmt and print_trace_fmt ... Browse Code »

Impact: clean up

Both print_lat_fmt and print_trace_fmt do pretty much the same thing
except for one different function call. This patch consolidates the
two functions and adds an if statement to perform the difference.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 10:57:29 +0800
5fd73f862 tracing: remove extra latency_trace method from trace structure ... Browse Code »

Impact: clean up

The trace and latency_trace function pointers are identical for
every tracer but the function tracer. The differences in the function
tracer are trivial (latency output puts paranthesis around parent).

This patch removes the latency_trace pointer and all prints will
now just use the trace output function pointer.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 10:42:04 +0800
c032ef64d tracing: add latency output format option ... Browse Code »

With the removal of the latency_trace file, we lost the ability
to see some of the finer details in a trace. Like the state of
interrupts enabled, the preempt count, need resched, and if we
are in an interrupt handler, softirq handler or not.

This patch simply creates an option to bring back the old format.
This also removes the warning about an unused variable that held
the latency_trace file operations.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 09:34:24 +0800
e74da5235 tracing: fix seq read from trace files ... Browse Code »

The buffer used by trace_seq was updated incorrectly. Instead
of consuming what was actually read, it consumed the rest of the
buffer on reads.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 09:31:11 +0800
2dc5d12b1 tracing: do not return EFAULT if read copied anything ... Browse Code »

Impact: fix trace read to conform to standards

Andrew Morton, Theodore Tso and H. Peter Anvin brought to my attention
that a userspace read should not return -EFAULT if it succeeded in
copying anything. It should only return -EFAULT if it failed to copy
at all.

This patch modifies the check of copy_from_user and updates the return
code appropriately.

I also used H. Peter Anvin's short cut rule to just test ret == count.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 08:10:05 +0800
4f3640f8a ring-buffer: fix timestamp in partial ring_buffer_page_read ... Browse Code »

If a partial ring_buffer_page_read happens, then some of the
incremental timestamps may be lost. This patch writes the
recent timestamp into the page that is passed back to the caller.

A partial ring_buffer_page_read is where the full page would not
be written back to the user, and instead, just part of the page
is copied to the user. A full page would be a page swap with the
ring buffer and the timestamps would be correct.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 08:01:45 +0800
e543ad769 tracing: add cpu_file intialization for ftrace_dump ... Browse Code »

Impact: fix to ftrace_dump output corruption

The commit: b04cc6b1f6398b0e0b60d37e27ce51b4899672ec
tracing/core: introduce per cpu tracing files

added a new field to the iterator called cpu_file. This was a handle
to differentiate between the per cpu trace output files and the
all cpu "trace" file. The all cpu "trace" file required setting this
to TRACE_PIPE_ALL_CPU.

The problem is that the ftrace_dump sets up its own iterator but was
not updated to handle this change. The result was only CPU 0 printing
out on crash and a lot of ""'s also being printed.

Reported-by: Thomas Gleixner
Tested-by: Darren Hart
Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-05 07:32:28 +0800
64ca5ab91 rcu: increment quiescent state counter in ksoftirqd() ... Browse Code »

If a machine is flooded by network frames, a cpu can loop
100% of its time inside ksoftirqd() without calling schedule().
This can delay RCU grace period to insane values.

Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem.

Paul: "This regression was a result of the recent change from
"schedule()" to "cond_resched()", which got rid of that quiescent
state in the common case where a reschedule is not needed".

Signed-off-by: Eric Dumazet
Reviewed-by: Paul E. McKenney
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar

Eric Dumazet
2009-03-05 05:08:45 +0800
efed792d6 tracing: add lockdep tracepoints for lock acquire/release ... Browse Code »

Augment the traces with lock names when lockdep is available:

1) | down_read_trylock() {
1) | _spin_lock_irqsave() {
1) | /* lock_acquire: &sem->wait_lock */
1) 4.201 us | }
1) | _spin_unlock_irqrestore() {
1) | /* lock_release: &sem->wait_lock */
1) 3.523 us | }
1) | /* lock_acquire: try read &mm->mmap_sem */
1) + 13.386 us | }
1) 1.635 us | find_vma();
1) | handle_mm_fault() {
1) | __do_fault() {
1) | filemap_fault() {
1) | find_lock_page() {
1) | find_get_page() {
1) | /* lock_acquire: read rcu_read_lock */
1) | /* lock_release: rcu_read_lock */
1) 5.697 us | }
1) 8.158 us | }
1) + 11.079 us | }
1) | _spin_lock() {
1) | /* lock_acquire: __pte_lockptr(page) */
1) 3.949 us | }
1) 1.460 us | page_add_file_rmap();
1) | _spin_unlock() {
1) | /* lock_release: __pte_lockptr(page) */
1) 3.115 us | }
1) | unlock_page() {
1) 1.421 us | page_waitqueue();
1) 1.220 us | __wake_up_bit();
1) 6.519 us | }
1) + 34.328 us | }
1) + 37.452 us | }
1) | up_read() {
1) | /* lock_release: &mm->mmap_sem */
1) | _spin_lock_irqsave() {
1) | /* lock_acquire: &sem->wait_lock */
1) 3.865 us | }
1) | _spin_unlock_irqrestore() {
1) | /* lock_release: &sem->wait_lock */
1) 8.562 us | }
1) + 17.370 us | }

Signed-off-by: Peter Zijlstra
Cc: Steven Rostedt
Cc: =?ISO-8859-1?Q?T=F6r=F6k?= Edwin
Cc: Jason Baron
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2009-03-05 01:49:58 +0800
28b1bd1cb Merge branch 'core/locking' into tracing/ftrace Browse Code »

Ingo Molnar
2009-03-05 01:49:19 +0800
26575e28d lockdep: remove extra "irq" string ... Browse Code »

Impact: clarify lockdep printk text

print_irq_inversion_bug() gets handed state strings of the form

"HARDIRQ", "SOFTIRQ", "RECLAIM_FS"

and appends "-irq-{un,}safe" to them, which is either redudant for *IRQ or
confusing in the RECLAIM_FS case.

Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2009-03-05 01:38:34 +0800
1c21f14ec lockdep: fix incorrect state name ... Browse Code »

In the recent mark_lock_irq() rework a bug snuck in that would report the
state of write locks causing irq inversion under a read lock as a read
lock.

Fix this by masking the read bit of the state when validating write
dependencies.

Reported-by: Andrew Morton
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2009-03-05 01:38:33 +0800

04 Mar, 2009

6 commits

2602c3ba4 Merge branch 'rfc/splice/tip/tracing/ftrace' of git://git.kernel.org/pub/scm/lin… ... Browse Code »

…ux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace

Ingo Molnar
2009-03-04 18:15:42 +0800
a1be621df Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc7' into tracing/core Browse Code »

Ingo Molnar
2009-03-04 18:14:47 +0800
2cadf9135 tracing: add binary buffer files for use with splice ... Browse Code »

Impact: new feature

This patch creates a directory of files that correspond to the
per CPU ring buffers. These are binary files and are made to
be used with splice. This is the fastest way to extract data from
the ftrace ring buffers.

Thanks to Jiaying Zhang for pushing me to get this code fixed,
and to Eduard - Gabriel Munteanu for his splice code that helped
me debug my code.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-04 10:01:55 +0800
474d32b68 ring-buffer: make ring_buffer_read_page read from start on partial page ... Browse Code »

Impact: dont leave holes in read buffer page

The ring_buffer_read_page swaps a given page with the reader page
of the ring buffer, if certain conditions are set:

1) requested length is big enough to hold entire page data

2) a writer is not currently on the page

3) the page is not partially consumed.

Instead of swapping with the supplied page. It copies the data to
the supplied page instead. But currently the data is copied in the
same offset as the source page. This causes a hole at the start
of the reader page. This complicates the use of this function.
Instead, it should copy the data at the beginning of the function
and update the index fields accordingly.

Other small clean ups are also done in this patch.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-04 09:52:27 +0800
e3d6bf0a0 ring-buffer: replace sizeof of event header with offsetof ... Browse Code »

Impact: fix to possible alignment problems on some archs.

Some arch compilers include an NULL char array in the sizeof field.
Since the ring_buffer_event type includes one of these, it is better
to use the "offsetof" instead, to avoid strange bugs on these archs.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-04 09:52:01 +0800
ef7a4a161 ring-buffer: fix ring_buffer_read_page ... Browse Code »

The ring_buffer_read_page was broken if it were to only copy part
of the page. This patch fixes that up as well as adds a parameter
to allow a length field, in order to only copy part of the buffer page.

Signed-off-by: Steven Rostedt

Steven Rostedt
2009-03-04 09:51:24 +0800