
11 Aug, 2015

1 commit

  • commit e3eea1404f5ff7a2ceb7b5e7ba412a6fd94f2935 upstream.

    Commit 4104d326b670 ("ftrace: Remove global function list and call function
    directly") simplified the ftrace code by removing the global_ops list with a
    new design. But this cleanup also broke the filtering of PIDs that are added
    to the set_ftrace_pid file.

    Add back the proper hooks to have pid filtering working once again.

    Reported-by: Matt Fleming
    Reported-by: Richard Weinberger
    Tested-by: Matt Fleming
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     

04 Aug, 2015

4 commits

  • commit 6224beb12e190ff11f3c7d4bf50cb2922878f600 upstream.

    Fengguang Wu's tests triggered a bug in the branch tracer's start up
    test when CONFIG_DEBUG_PREEMPT is set. This was because that config
    adds some debug logic in the per cpu field, which calls back into
    the branch tracer.

    The branch tracer has its own recursive checks, but uses a per cpu
    variable to implement it. If retrieving the per cpu variable calls
    back into the branch tracer, you can see how things will break.

    Instead of using a per cpu variable, use the trace_recursion field
    of the current task struct. Simply set a bit when entering the
    branch tracing and clear it when leaving. If the bit is set on
    entry, just don't do the tracing.

    There's also the case with lockdep, as the local_irq_save() called
    before the recursion can also trigger code that can call back into
    the function. Changing that to a raw_local_irq_save() will protect
    that as well.

    This prevents the recursion and the inevitable crash that follows.
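The recursion guard described above can be sketched in plain C. This is a hypothetical userspace model, not the kernel code: the real flag lives in current->trace_recursion, and the bit name below is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for current->trace_recursion. */
static unsigned int trace_recursion;

#define TRACE_BRANCH_BIT 0x1u

/* Returns true if we may enter the tracer; false if we are recursing. */
static bool branch_trace_enter(void)
{
        if (trace_recursion & TRACE_BRANCH_BIT)
                return false;             /* already inside: don't trace */
        trace_recursion |= TRACE_BRANCH_BIT;
        return true;
}

static void branch_trace_exit(void)
{
        trace_recursion &= ~TRACE_BRANCH_BIT;
}
```

Because the flag lives in the task struct rather than in per-cpu storage, testing it cannot itself call back into the branch tracer.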

    Link: http://lkml.kernel.org/r/20150630141803.GA28071@wfg-t540p.sh.intel.com

    Reported-by: Fengguang Wu
    Tested-by: Fengguang Wu
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     
  • commit cc9e4bde03f2b4cfba52406c021364cbd2a4a0f3 upstream.

    The trace.h header, when included without CONFIG_EVENT_TRACING enabled
    (seldom done), will not compile because of a typo in the prototype
    of trace_event_enum_update().

    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     
  • commit 6b88f44e161b9ee2a803e5b2b1fbcf4e20e8b980 upstream.

    While debugging a WARN_ON() for filtering, I found that it is possible
    for the filter string to be referenced after its end. With the filter:

    # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

    The filter_parse() function can call infix_get_op() which calls
    infix_advance() that updates the infix filter pointers for the cnt
    and tail without checking if the filter is already at the end, which
    will put the cnt to zero and the tail beyond the end. The loop then calls
    infix_next() that has

    ps->infix.cnt--;
    return ps->infix.string[ps->infix.tail++];

    The cnt will now be below zero, and the tail that is returned is
    already past the end of the filter string. So far the allocation
    of the filter string usually has some buffer that is zeroed out, but
    if the filter string is of the exact size of the allocated buffer
    there's no guarantee that the character after the nul terminating
    character will be zero.

    Luckily, only root can write to the filter.
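The shape of the fix is a bounds check before consuming a character. The field names below mirror the report (cnt, tail, string), but this is an illustrative userspace model, not the actual kernel patch:

```c
struct infix {
        const char *string;
        int cnt;     /* characters remaining */
        int tail;    /* next index to read */
};

/* Return the next character, or 0 once the string is exhausted,
 * instead of letting cnt go negative and tail run past the end. */
static char infix_next(struct infix *in)
{
        if (in->cnt <= 0)
                return 0;
        in->cnt--;
        return in->string[in->tail++];
}
```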

    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     
  • commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 upstream.

    When testing the fix for the trace filter, I could not come up with
    a scenario where the operand count goes below zero, so I added a
    WARN_ON_ONCE(cnt < 0) to the logic. But there is a legitimate case
    where it can happen (although the filter would be wrong).

    # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

    That is, a single operation without any operands will hit the path
    where the WARN_ON_ONCE() can trigger. This is harmless, and the
    filter is reported as an error; but instead of spitting out
    a warning to the kernel dmesg, just fail nicely and report it via
    the proper channels.

    Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com

    Reported-by: Vince Weaver
    Reported-by: Sasha Levin
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     

17 Jun, 2015

1 commit

  • When the following filter is used it causes a warning to trigger:

    # cd /sys/kernel/debug/tracing
    # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
    -bash: echo: write error: Invalid argument
    # cat events/ext4/ext4_truncate_exit/filter
    ((dev==1)blocks==2)
    ^
    parse_error: No error

    ------------[ cut here ]------------
    WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
    Modules linked in: bnep lockd grace bluetooth ...
    CPU: 3 PID: 1223 Comm: bash Tainted: G W 4.1.0-rc3-test+ #450
    Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
    0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
    0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
    ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
    Call Trace:
    [] dump_stack+0x4c/0x6e
    [] warn_slowpath_common+0x97/0xe0
    [] ? _kstrtoull+0x2c/0x80
    [] warn_slowpath_null+0x1a/0x20
    [] replace_preds+0x3c5/0x990
    [] create_filter+0x82/0xb0
    [] apply_event_filter+0xd4/0x180
    [] event_filter_write+0x8f/0x120
    [] __vfs_write+0x28/0xe0
    [] ? __sb_start_write+0x53/0xf0
    [] ? security_file_permission+0x30/0xc0
    [] vfs_write+0xb8/0x1b0
    [] SyS_write+0x4f/0xb0
    [] system_call_fastpath+0x12/0x6a
    ---[ end trace e11028bd95818dcd ]---

    Worse yet, reading the error message (the filter again) it says that
    there was no error, when there clearly was. The issue is that the
    code that checks the input does not verify that the ops are balanced;
    that is, that an op is present between a closed parenthesis and the
    next token.

    This would only cause a warning, and fail out before doing any real
    harm, but it should still not cause a warning, and the error reported
    should work:

    # cd /sys/kernel/debug/tracing
    # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
    -bash: echo: write error: Invalid argument
    # cat events/ext4/ext4_truncate_exit/filter
    ((dev==1)blocks==2)
    ^
    parse_error: Meaningless filter expression

    And give no kernel warning.
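The missing validation amounts to rejecting an operand (or another opening parenthesis) that directly follows a closing parenthesis with no operator in between. A toy standalone check, not the kernel parser:

```c
#include <stdbool.h>
#include <ctype.h>

/* Scan a filter string; reject an operand or '(' appearing
 * immediately after ')' with no operator in between. */
static bool filter_ops_balanced(const char *s)
{
        bool after_close = false;

        for (; *s; s++) {
                if (isspace((unsigned char)*s))
                        continue;
                if (*s == ')') {
                        after_close = true;
                        continue;
                }
                if (after_close) {
                        /* Only an operator may follow ')'. */
                        if (isalnum((unsigned char)*s) || *s == '(')
                                return false;
                        after_close = false;
                }
        }
        return true;
}
```

With this kind of check, "((dev==1)blocks==2)" is rejected up front instead of reaching replace_preds() and tripping the WARN_ON().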

    Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home

    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Arnaldo Carvalho de Melo
    Cc: stable@vger.kernel.org # 2.6.31+
    Reported-by: Vince Weaver
    Tested-by: Vince Weaver
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     


07 May, 2015

1 commit

  • The only caller to this function (__print_array) was getting it wrong by
    passing the array length instead of buffer length. As the element size
    was already being passed for other reasons it seems reasonable to push
    the calculation of buffer length into the function.
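A minimal sketch of the idea: the caller passes the element count and element size, and the helper derives the byte length itself, so the old confusion between array length and buffer length cannot recur. The function name and hex formatting are illustrative, not the kernel implementation:

```c
#include <stdio.h>
#include <stddef.h>

/* Callers pass COUNT elements of EL_SIZE bytes each; the helper
 * computes the total buffer length once, internally. Formats each
 * byte as hex into 'out' and returns the derived buffer length. */
static size_t sketch_print_array(char *out, size_t outsz,
                                 const unsigned char *buf,
                                 size_t count, size_t el_size)
{
        size_t buf_len = count * el_size;   /* derived here, not by caller */
        size_t pos = 0;

        for (size_t i = 0; i < buf_len && pos + 3 < outsz; i++)
                pos += (size_t)snprintf(out + pos, outsz - pos,
                                        i ? ",%02x" : "%02x", buf[i]);
        return buf_len;
}
```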

    Link: http://lkml.kernel.org/r/1430320727-14582-1-git-send-email-alex.bennee@linaro.org

    Signed-off-by: Alex Bennée
    Signed-off-by: Steven Rostedt

    Alex Bennée
     

27 Apr, 2015

1 commit

  • Pull fourth vfs update from Al Viro:
    "d_inode() annotations from David Howells (sat in for-next since before
    the beginning of merge window) + four assorted fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    RCU pathwalk breakage when running into a symlink overmounting something
    fix I_DIO_WAKEUP definition
    direct-io: only inc/dec inode->i_dio_count for file systems
    fs/9p: fix readdir()
    VFS: assorted d_backing_inode() annotations
    VFS: fs/inode.c helpers: d_inode() annotations
    VFS: fs/cachefiles: d_backing_inode() annotations
    VFS: fs library helpers: d_inode() annotations
    VFS: assorted weird filesystems: d_inode() annotations
    VFS: normal filesystems (and lustre): d_inode() annotations
    VFS: security/: d_inode() annotations
    VFS: security/: d_backing_inode() annotations
    VFS: net/: d_inode() annotations
    VFS: net/unix: d_backing_inode() annotations
    VFS: kernel/: d_inode() annotations
    VFS: audit: d_backing_inode() annotations
    VFS: Fix up some ->d_inode accesses in the chelsio driver
    VFS: Cachefiles should perform fs modifications on the top layer only
    VFS: AF_UNIX sockets should call mknod on the top layer only

    Linus Torvalds
     

23 Apr, 2015

1 commit

  • Pull tracing fixes from Steven Rostedt:
    "This adds three fixes for the tracing code.

    The first is a bug when ftrace_dump_on_oops is triggered in atomic
    context and function graph tracer is the tracer that is being
    reported.

    The second fix is bad parsing of the trace_events from the kernel
    command line, where it would ignore specific events if the system name
    is used when defining the event (it enables all events within the
    system).

    The last one is a fix to the TRACE_DEFINE_ENUM(), where a check was
    missing to see if the ptr was incremented to the end of the string,
    but the loop increments it again and can miss the nul delimiter to
    stop processing"

    * tag 'trace-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Fix possible out of bounds memory access when parsing enums
    tracing: Fix incorrect enabling of trace events by boot cmdline
    tracing: Handle ftrace_dump() atomic context in graph_trace_open()

    Linus Torvalds
     

17 Apr, 2015

1 commit

    The code that replaces the enum names with the enum values in the
    tracepoints' format files could possibly miss the end of string nul
    character. This was caused by processing things like backslashes, quotes
    and other tokens. After processing the tokens, a check for the nul
    character needed to be done before continuing the loop, because the loop
    incremented the pointer before doing the check, which could bypass the nul
    character.
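The pattern of the fix, as a generic sketch (not the actual kernel loop): after consuming an escape token, check for the terminating nul before the loop advances the pointer again.

```c
#include <stdbool.h>

/* Walk a format string, skipping backslash escapes; return false if
 * the scan would run past the terminating nul, true otherwise. */
static bool scan_format(const char *ptr)
{
        for (; *ptr; ptr++) {
                if (*ptr == '\\') {
                        ptr++;             /* skip the escaped character */
                        if (!*ptr)         /* the check the fix adds:    */
                                return false; /* '\\' was the last char  */
                        continue;
                }
        }
        return true;
}
```

Without the inner nul check, the for-loop's ptr++ would step past the terminator and read out of bounds, which is what KASan caught.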

    Link: http://lkml.kernel.org/r/552E661D.5060502@oracle.com

    Reported-by: Sasha Levin # via KASan
    Tested-by: Andrey Ryabinin
    Fixes: 0c564a538aa9 "tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values"
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

16 Apr, 2015

4 commits

  • There is a problem that trace events are not properly enabled with
    boot cmdline. The problem is that if we pass "trace_event=kmem:mm_page_alloc"
    to the boot cmdline, it enables all kmem trace events, and not just
    the page_alloc event.

    This is caused by the parsing mechanism. When we parse the cmdline, the buffer
    contents are modified due to tokenization. And, if we use this buffer
    again, we will get the wrong result.

    Unfortunately, this buffer is accessed three times to set trace events
    properly at boot time. So, we need to handle this situation.

    There is already code handling ",", but we need another for ":".
    This patch adds it.
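The underlying hazard can be shown in a userspace sketch: strtok()-style tokenization writes nul bytes into the buffer it parses, so a buffer that must be parsed more than once has to be copied first. The function and parameter names below are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* strtok() overwrites delimiters with '\0', destroying the buffer.
 * Parse a copy so the original "system:event" string survives and
 * can be parsed again later. */
static int parse_trace_event(const char *arg, char *system, char *event)
{
        char *copy = malloc(strlen(arg) + 1);
        char *tok, *rest;
        int ret = -1;

        if (!copy)
                return -1;
        strcpy(copy, arg);
        system[0] = event[0] = '\0';

        tok = strtok(copy, ":");
        rest = tok ? strtok(NULL, "") : NULL;
        if (tok && rest) {              /* "system:event" form */
                strcpy(system, tok);
                strcpy(event, rest);
                ret = 0;
        } else if (tok) {               /* bare event name */
                strcpy(event, tok);
                ret = 0;
        }
        free(copy);
        return ret;
}
```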

    Link: http://lkml.kernel.org/r/1429159484-22977-1-git-send-email-iamjoonsoo.kim@lge.com

    Cc: stable@vger.kernel.org # 3.19+
    Signed-off-by: Joonsoo Kim
    [ added missing return ret; ]
    Signed-off-by: Steven Rostedt

    Joonsoo Kim
     
  • graph_trace_open() can be called in atomic context from ftrace_dump().
    Use GFP_ATOMIC for the memory allocations when that's the case, in order
    to avoid the following splat.

    BUG: sleeping function called from invalid context at mm/slab.c:2849
    in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/0
    Backtrace:
    ..
    [] (__might_sleep) from [] (kmem_cache_alloc_trace+0x160/0x238)
    r7:87800040 r6:000080d0 r5:810d16e8 r4:000080d0
    [] (kmem_cache_alloc_trace) from [] (graph_trace_open+0x30/0xd0)
    r10:00000100 r9:809171a8 r8:00008e28 r7:810d16f0 r6:00000001 r5:810d16e8
    r4:810d16f0
    [] (graph_trace_open) from [] (trace_init_global_iter+0x50/0x9c)
    r8:00008e28 r7:808c853c r6:00000001 r5:810d16e8 r4:810d16f0 r3:800cbd30
    [] (trace_init_global_iter) from [] (ftrace_dump+0x90/0x2ec)
    r4:810d2580 r3:00000000
    [] (ftrace_dump) from [] (sysrq_ftrace_dump+0x1c/0x20)
    r10:00000100 r9:809171a8 r8:808f6e7c r7:00000001 r6:00000007 r5:0000007a
    r4:808d5394
    [] (sysrq_ftrace_dump) from [] (return_to_handler+0x0/0x18)
    [] (__handle_sysrq) from [] (return_to_handler+0x0/0x18)
    r8:808c8100 r7:808c8444 r6:00000101 r5:00000010 r4:84eb3210
    [] (handle_sysrq) from [] (return_to_handler+0x0/0x18)
    [] (pl011_int) from [] (return_to_handler+0x0/0x18)
    r10:809171bc r9:809171a8 r8:00000001 r7:00000026 r6:808c6000 r5:84f01e60
    r4:8454fe00
    [] (handle_irq_event_percpu) from [] (handle_irq_event+0x4c/0x6c)
    r10:808c7ef0 r9:87283e00 r8:00000001 r7:00000000 r6:8454fe00 r5:84f01e60
    r4:84f01e00
    [] (handle_irq_event) from [] (handle_fasteoi_irq+0xf0/0x1ac)
    r6:808f52a4 r5:84f01e60 r4:84f01e00 r3:00000000
    [] (handle_fasteoi_irq) from [] (generic_handle_irq+0x3c/0x4c)
    r6:00000026 r5:00000000 r4:00000026 r3:8007a938
    [] (generic_handle_irq) from [] (__handle_domain_irq+0x8c/0xfc)
    r4:808c1e38 r3:0000002e
    [] (__handle_domain_irq) from [] (gic_handle_irq+0x34/0x6c)
    r10:80917748 r9:00000001 r8:88802100 r7:808c7ef0 r6:808c8fb0 r5:00000015
    r4:8880210c r3:808c7ef0
    [] (gic_handle_irq) from [] (__irq_svc+0x44/0x7c)
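The fix boils down to choosing the allocation flag from the calling context. A trivial sketch with stand-in flag values (in the kernel, GFP_KERNEL may sleep while GFP_ATOMIC may not):

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp flags. */
#define GFP_KERNEL 0x0
#define GFP_ATOMIC 0x1

/* graph_trace_open() callers that run in atomic context, such as
 * ftrace_dump(), must not use an allocation flag that can sleep. */
static int alloc_flags(bool in_atomic_ctx)
{
        return in_atomic_ctx ? GFP_ATOMIC : GFP_KERNEL;
}
```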

    Link: http://lkml.kernel.org/r/1428953721-31349-1-git-send-email-rabin@rab.in
    Link: http://lkml.kernel.org/r/1428957012-2319-1-git-send-email-rabin@rab.in

    Cc: stable@vger.kernel.org # 3.13+
    Signed-off-by: Rabin Vincent
    Signed-off-by: Steven Rostedt

    Rabin Vincent
     
  • The seq_printf return value, because it's frequently misused,
    will eventually be converted to void.

    See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
    seq_has_overflowed() and make public")

    Miscellanea:

    o Remove unused return value from trace_lookup_stack

    Signed-off-by: Joe Perches
    Acked-by: Steven Rostedt
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • relayfs and tracefs are dealing with inodes of their own;
    those two act as filesystem drivers

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

15 Apr, 2015

3 commits

  • Pull perf changes from Ingo Molnar:
    "Core kernel changes:

    - One of the more interesting features in this cycle is the ability
    to attach eBPF programs (user-defined, sandboxed bytecode executed
    by the kernel) to kprobes.

    This allows user-defined instrumentation on a live kernel image
    that can never crash, hang or interfere with the kernel negatively.
    (Right now it's limited to root-only, but in the future we might
    allow unprivileged use as well.)

    (Alexei Starovoitov)

    - Another non-trivial feature is per event clockid support: this
    allows, amongst other things, the selection of different clock
    sources for event timestamps traced via perf.

    This feature is sought by people who'd like to merge perf generated
    events with external events that were measured with different
    clocks:

    - cluster wide profiling

    - for system wide tracing with user-space events,

    - JIT profiling events

    etc. Matching perf tooling support is added as well, available via
    the -k, --clockid parameter to perf record et al.

    (Peter Zijlstra)

    Hardware enablement kernel changes:

    - x86 Intel Processor Trace (PT) support: which is a hardware tracer
    on steroids, available on Broadwell CPUs.

    The hardware trace stream is directly output into the user-space
    ring-buffer, using the 'AUX' data format extension that was added
    to the perf core to support hardware constraints such as the
    necessity to have the tracing buffer physically contiguous.

    This patch-set was developed for two years and this is the result.
    A simple way to make use of this is to use BTS tracing, the PT
    driver emulates BTS output - available via the 'intel_bts' PMU.
    More explicit PT specific tooling support is in the works as well -
    will probably be ready by 4.2.

    (Alexander Shishkin, Peter Zijlstra)

    - x86 Intel Cache QoS Monitoring (CQM) support: this is a hardware
    feature of Intel Xeon CPUs that allows the measurement and
    allocation/partitioning of caches to individual workloads.

    These kernel changes expose the measurement side as a new PMU
    driver, which exposes various QoS related PMU events. (The
    partitioning change is work in progress and is planned to be merged
    as a cgroup extension.)

    (Matt Fleming, Peter Zijlstra; CPU feature detection by Peter P
    Waskiewicz Jr)

    - x86 Intel Haswell LBR call stack support: this is a new Haswell
    feature that allows the hardware recording of call chains, plus
    tooling support. To activate this feature you have to enable it
    via the new 'lbr' call-graph recording option:

    perf record --call-graph lbr
    perf report

    or:

    perf top --call-graph lbr

    This hardware feature is a lot faster than stack walk or dwarf
    based unwinding, but has some limitations:

    - It reuses the current LBR facility, so LBR call stack and
    branch record can not be enabled at the same time.

    - It is only available for user-space callchains.

    (Yan, Zheng)

    - x86 Intel Broadwell CPU support and various event constraints and
    event table fixes for earlier models.

    (Andi Kleen)

    - x86 Intel HT CPUs event scheduling workarounds. This is a complex
    CPU bug affecting the SNB,IVB,HSW families that results in counter
    value corruption. The mitigation code is automatically enabled and
    is transparent.

    (Maria Dimakopoulou, Stephane Eranian)

    The perf tooling side had a ton of changes in this cycle as well, so
    I'm only able to list the user visible changes here, in addition to
    the tooling changes outlined above:

    User visible changes affecting all tools:

    - Improve support of compressed kernel modules (Jiri Olsa)
    - Save DSO loading errno to better report errors (Arnaldo Carvalho de Melo)
    - Bash completion for subcommands (Yunlong Song)
    - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa)
    - Support missing -f to override perf.data file ownership. (Yunlong Song)
    - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo)

    User visible changes in individual tools:

    'perf data':

    New tool for converting perf.data to other formats, initially
    for the CTF (Common Trace Format) from LTTng (Jiri Olsa,
    Sebastian Siewior)

    'perf diff':

    Add --kallsyms option (David Ahern)

    'perf list':

    Allow listing events with 'tracepoint' prefix (Yunlong Song)

    Sort the output of the command (Yunlong Song)

    'perf kmem':

    Respect -i option (Jiri Olsa)

    Print big numbers using thousands' group (Namhyung Kim)

    Allow -v option (Namhyung Kim)

    Fix alignment of slab result table (Namhyung Kim)

    'perf probe':

    Support multiple probes on different binaries on the same command line (Masami Hiramatsu)

    Support unnamed union/structure members data collection. (Masami Hiramatsu)

    Check kprobes blacklist when adding new events. (Masami Hiramatsu)

    'perf record':

    Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra)

    Support recording running/enabled time (Andi Kleen)

    'perf sched':

    Improve the performance of 'perf sched replay' on high CPU core count machines (Yunlong Song)

    'perf report' and 'perf top':

    Allow annotating entries in callchains in the hists browser (Arnaldo Carvalho de Melo)

    Indicate which callchain entries are annotated in the
    TUI hists browser (Arnaldo Carvalho de Melo)

    Add pid/tid filtering to 'report' and 'script' commands (David Ahern)

    Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one
    cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT
    events (Arnaldo Carvalho de Melo)

    'perf stat':

    Report unsupported events properly (Suzuki K. Poulose)

    Output running time and run/enabled ratio in CSV mode (Andi Kleen)

    'perf trace':

    Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo)

    Only insert blank duration bracket when tracing syscalls (Arnaldo Carvalho de Melo)

    Filter out the trace pid when no threads are specified (Arnaldo Carvalho de Melo)

    Dump stack on segfaults (Arnaldo Carvalho de Melo)

    No need to explicitly enable evsels for workload started from perf, let it
    be enabled via perf_event_attr.enable_on_exec, removing some events that take
    place in the 'perf trace' before a workload is really started by it.
    (Arnaldo Carvalho de Melo)

    Allow mixing with tracepoints and suppressing plain syscalls. (Arnaldo Carvalho de Melo)

    There's also been a ton of infrastructure work done, such as the
    split-out of perf's build system into tools/build/ and other changes -
    see the shortlog and changelog for details"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (358 commits)
    perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init()
    perf evlist: Fix type for references to data_head/tail
    perf probe: Check the orphaned -x option
    perf probe: Support multiple probes on different binaries
    perf buildid-list: Fix segfault when show DSOs with hits
    perf tools: Fix cross-endian analysis
    perf tools: Fix error path to do closedir() when synthesizing threads
    perf tools: Fix synthesizing fork_event.ppid for non-main thread
    perf tools: Add 'I' event modifier for exclude_idle bit
    perf report: Don't call map__kmap if map is NULL.
    perf tests: Fix attr tests
    perf probe: Fix ARM 32 building error
    perf tools: Merge all perf_event_attr print functions
    perf record: Add clockid parameter
    perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10
    perf sched replay: Support using -f to override perf.data file ownership
    perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files
    perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task
    perf sched replay: Fix the segmentation fault problem caused by pr_err in threads
    perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations
    ...

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "Some clean ups and small fixes, but the biggest change is the addition
    of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.

    Tracepoints have helper functions for the TP_printk() called
    __print_symbolic() and __print_flags() that let a numeric value be
    displayed as human comprehensible text. What is placed in the
    TP_printk() is also shown in the tracepoint format file such that user
    space tools like perf and trace-cmd can parse the binary data and
    express the values too. Unfortunately, the way the TRACE_EVENT()
    macro works, anything placed in the TP_printk() will be shown pretty
    much exactly as is. The problem arises when enums are used. That's
    because unlike macros, enums will not be changed into their values by
    the C pre-processor. Thus, the enum string is exported to the format
    file, and this makes it useless for user space tools.

    The TRACE_DEFINE_ENUM() solves this by converting the enum strings in
    the TP_printk() format into their number, and that is what is shown to
    user space. For example, the tracepoint tlb_flush currently has this
    in its format file:

    __print_symbolic(REC->reason,
    { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
    { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
    { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
    { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

    After adding:

    TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
    TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

    Its format file will contain this:

    __print_symbolic(REC->reason,
    { 0, "flush on task switch" },
    { 1, "remote shootdown" },
    { 2, "local shootdown" },
    { 3, "local mm shootdown" })"

    * tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
    tracing: Add enum_map file to show enums that have been mapped
    writeback: Export enums used by tracepoint to user space
    v4l: Export enums used by tracepoints to user space
    SUNRPC: Export enums in tracepoints to user space
    mm: tracing: Export enums in tracepoints to user space
    irq/tracing: Export enums in tracepoints to user space
    f2fs: Export the enums in the tracepoints to userspace
    net/9p/tracing: Export enums in tracepoints to userspace
    x86/tlb/trace: Export enums in used by tlb_flush tracepoint
    tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
    tracing: Allow for modules to convert their enums to values
    tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
    tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
    tracing: Give system name a pointer
    brcmsmac: Move each system tracepoints to their own header
    iwlwifi: Move each system tracepoints to their own header
    mac80211: Move message tracepoints to their own header
    tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
    tracing: Add TRACE_SYSTEM_VAR to kvm-s390
    tracing: Add TRACE_SYSTEM_VAR to intel-sst
    ...

    Linus Torvalds
     
  • Pull tracefs from Steven Rostedt:
    "This adds the new tracefs file system.

    This has been in linux-next for more than one release, as I had it
    ready for the 4.0 merge window, but a last minute thing that needed to
    go into Linux first had to be done. That was that perf hard coded the
    file system number when reading /sys/kernel/debugfs/tracing directory
    making sure that the path had the debugfs mount # before it would
    parse the tracing file. This broke other use cases of perf, and the
    check is removed.

    Now when mounting /sys/kernel/debug, tracefs is automatically mounted
    in /sys/kernel/debug/tracing such that old tools will still see that
    path as expected. But now system admins can mount tracefs directly
    and not need to mount debugfs, which can expose security issues. A
    new directory is created when tracefs is configured such that system
    admins can now mount it separately (/sys/kernel/tracing)"

    * tag 'trace-4.1-tracefs' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Have mkdir and rmdir be part of tracefs
    tracefs: Add directory /sys/kernel/tracing
    tracing: Automatically mount tracefs on debugfs/tracing
    tracing: Convert the tracing facility over to use tracefs
    tracefs: Add new tracefs file system
    tracing: Create cmdline tracer options on tracing fs init
    tracing: Only create tracer options files if directory exists
    debugfs: Provide a file creation function that also takes an initial size

    Linus Torvalds
     

08 Apr, 2015

3 commits

    Add an enum_map file in the tracing directory to see what enums have been
    saved to convert in the print fmt files.

    As this requires the enum mapping to be persistent in memory, it is only
    created if the new config option CONFIG_TRACE_ENUM_MAP_FILE is enabled.
    This is for debugging and will increase the persistent memory footprint
    of the kernel.

    Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

    Reviewed-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Update the infrastructure such that modules that declare TRACE_DEFINE_ENUM()
    will have those enums converted into their values in the tracepoint
    print fmt strings.

    Link: http://lkml.kernel.org/r/87vbhjp74q.fsf@rustcorp.com.au

    Acked-by: Rusty Russell
    Reviewed-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Several tracepoints use the helper functions __print_symbolic() or
    __print_flags() and pass in enums that do the mapping between the
    binary data stored and the value to print. This works well for reading
    the ASCII trace files, but when the data is read via userspace tools
    such as perf and trace-cmd, the conversion of the binary value to a
    human string format is lost if an enum is used, as userspace does not
    have access to what the ENUM is.

    For example, the tracepoint trace_tlb_flush() has:

    __print_symbolic(REC->reason,
    { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
    { TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
    { TLB_LOCAL_SHOOTDOWN, "local shootdown" },
    { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

    Which maps the enum values to the strings they represent. But perf and
    trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
    not be able to map it.

    With TRACE_DEFINE_ENUM(), developers can place these in the event header
    files and ftrace will convert the enums to their values:

    By adding:

    TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
    TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
    TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

    $ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
    [...]
    __print_symbolic(REC->reason,
    { 0, "flush on task switch" },
    { 1, "remote shootdown" },
    { 2, "local shootdown" },
    { 3, "local mm shootdown" })

    The above is what userspace expects to see, and tools do not need to
    be modified to parse them.

    Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

    Cc: Guilherme Cox
    Cc: Tony Luck
    Cc: Xie XiuQi
    Acked-by: Namhyung Kim
    Reviewed-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

03 Apr, 2015

1 commit

  • Dynamically allocated trampolines call ftrace_ops_get_func to get the
    function which they should call. For dynamic fops (FTRACE_OPS_FL_DYNAMIC
    flag is set) ftrace_ops_list_func is always returned. This is reasonable
    for static trampolines but goes against the main advantage of dynamic
    ones, that is avoidance of going through the list of all registered
    callbacks for functions that are only being traced by a single callback.

    We can fix it by returning ops->func (or recursion safe version) from
    ftrace_ops_get_func whenever it is possible for dynamic trampolines.

    Note that dynamic trampolines are not allowed for dynamic fops if
    CONFIG_PREEMPT=y.
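The decision the commit changes can be modeled with plain function pointers. The flag names echo the commit (FTRACE_OPS_FL_DYNAMIC, dynamic trampolines), but this is a simplified sketch, not the kernel logic, which also accounts for recursion-safe wrappers and per-cpu filtering:

```c
typedef void (*ftrace_func)(unsigned long ip);

#define FL_DYNAMIC     0x1u
#define FL_ALLOC_TRAMP 0x2u   /* ops has its own dynamic trampoline */

struct ops {
        unsigned int flags;
        ftrace_func func;
};

static void list_func(unsigned long ip) { (void)ip; }
static void my_func(unsigned long ip)   { (void)ip; }

/* Before the fix: any DYNAMIC ops got the list function, forcing a
 * walk of every registered callback. After: a dynamic trampoline may
 * call ops->func directly when it is safe to do so. */
static ftrace_func ops_get_func(struct ops *ops)
{
        if ((ops->flags & FL_DYNAMIC) && !(ops->flags & FL_ALLOC_TRAMP))
                return list_func;
        return ops->func;
}
```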

    Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1501291023000.25445@pobox.suse.cz
    Link: http://lkml.kernel.org/r/1424357773-13536-1-git-send-email-mbenes@suse.cz

    Reported-by: Miroslav Benes
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

02 Apr, 2015

5 commits

    So bpf_tracing.o depends on CONFIG_BPF_SYSCALL - but that's not its only
    dependency: it also depends on the tracing infrastructure and on kprobes,
    without which it will fail to build with:

    In file included from kernel/trace/bpf_trace.c:14:0:
    kernel/trace/trace.h: In function ‘trace_test_and_set_recursion’:
    kernel/trace/trace.h:491:28: error: ‘struct task_struct’ has no member named ‘trace_recursion’
    unsigned int val = current->trace_recursion;
    [...]

    It took quite some time to trigger this build failure, because right now
    BPF_SYSCALL is very obscure, hidden behind CONFIG_EXPERT. So also make
    BPF_SYSCALL more widely configurable, not just under CONFIG_EXPERT.

    If BPF_SYSCALL, tracing and kprobes are enabled then enable the bpf_tracing
    gateway as well.

    We might want to make this an interactive option later on, although
    I'd not complicate it unnecessarily: enabling BPF_SYSCALL is enough of
    an indicator that the user wants BPF support.

    Cc: Alexei Starovoitov
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Debugging of BPF programs needs some form of printk from the
    program, so let programs call limited trace_printk() with %d %u
    %x %p modifiers only.

    Similar to kernel modules, during program load the verifier checks
    whether the program calls bpf_trace_printk(), and if so, the kernel
    allocates trace_printk buffers and emits a big 'this is debug
    only' banner.
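    A rough userspace sketch of that style of format check follows; the
    real verifier logic in kernel/trace/bpf_trace.c is more involved, and
    fmt_allowed() is purely illustrative:

```c
#include <stdbool.h>
#include <string.h>

/* Accept a format string only if every '%' conversion is one of the
 * permitted d, u, x, p modifiers. Illustrative only. */
static bool fmt_allowed(const char *fmt)
{
	const char *p;

	for (p = fmt; *p; p++) {
		if (*p != '%')
			continue;
		p++;
		if (*p == '\0')
			return false;   /* dangling '%' */
		if (*p == '%')
			continue;       /* literal percent */
		if (!strchr("duxp", *p))
			return false;
	}
	return true;
}
```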

    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Steven Rostedt
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1427312966-8434-6-git-send-email-ast@plumgrid.com
    Signed-off-by: Ingo Molnar

    Alexei Starovoitov
     
    bpf_ktime_get_ns() is used by programs to compute the time delta
    between events or as a timestamp.

    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Steven Rostedt
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1427312966-8434-5-git-send-email-ast@plumgrid.com
    Signed-off-by: Ingo Molnar

    Alexei Starovoitov
     
  • BPF programs, attached to kprobes, provide a safe way to execute
    user-defined BPF byte-code programs without being able to crash or
    hang the kernel in any way. The BPF engine makes sure that such
    programs have a finite execution time and that they cannot break
    out of their sandbox.

    The user interface is to attach to a kprobe via the perf syscall:

    struct perf_event_attr attr = {
        .type = PERF_TYPE_TRACEPOINT,
        .config = event_id,
        ...
    };

    event_fd = perf_event_open(&attr, ...);
    ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);

    'prog_fd' is a file descriptor associated with a previously loaded
    BPF program.

    'event_id' is the ID of the created kprobe.

    Closing 'event_fd':

    close(event_fd);

    ... automatically detaches BPF program from it.

    BPF programs can call in-kernel helper functions to:

    - lookup/update/delete elements in maps

    - probe_read - a wrapper of probe_kernel_read() used to access any
    kernel data structures

    BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is
    architecture dependent) and return 0 to ignore the event and 1 to store
    kprobe event into the ring buffer.

    Note, kprobes are fundamentally _not_ a stable kernel ABI,
    so BPF programs attached to kprobes must be recompiled for
    every kernel version, and the user must supply the correct
    LINUX_VERSION_CODE in attr.kern_version during the bpf_prog_load() call.
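    A program body for this interface might look like the following
    minimal sketch. The program name and (absent) filter logic are
    hypothetical; only the pt_regs argument and the 0/1 return convention
    come from the description above:

```c
/* Sketch of a kprobe-attached BPF program body: it receives the
 * architecture-dependent pt_regs and returns 0 to ignore the event
 * or 1 to store it in the ring buffer. */
struct pt_regs; /* opaque here; the real layout is per-arch */

static int sample_kprobe_prog(struct pt_regs *ctx)
{
	(void)ctx;  /* a real program would filter on register state */
	return 1;   /* record this event */
}
```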

    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Steven Rostedt
    Reviewed-by: Masami Hiramatsu
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.com
    Signed-off-by: Ingo Molnar

    Alexei Starovoitov
     
    Add a TRACE_EVENT_FL_KPROBE flag to differentiate the kprobe type of
    tracepoints, since BPF programs can only be attached to the kprobe
    type of PERF_TYPE_TRACEPOINT perf events.

    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Steven Rostedt
    Reviewed-by: Masami Hiramatsu
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1427312966-8434-3-git-send-email-ast@plumgrid.com
    Signed-off-by: Ingo Molnar

    Alexei Starovoitov
     

31 Mar, 2015

1 commit

  • A clean up of the recursive protection code changed

    val = this_cpu_read(current_context);
    val--;
    val &= this_cpu_read(current_context);

    to

    val = this_cpu_read(current_context);
    val &= val & (val - 1);

    Which has a duplicate use of '&' as the above is the same as

    val = val & (val - 1);

    Actually, it would be best to remove that line altogether and
    just add it to where it is used.

    And Christoph even mentioned that it can be further compacted to
    just a single line:

    __this_cpu_and(current_context, __this_cpu_read(current_context) - 1);
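    The single-line version relies on the standard bit trick that
    val & (val - 1) clears the lowest set bit, which here pops the most
    recently entered context from the mask. Sketched in plain C:

```c
/* val & (val - 1) clears the lowest set bit of val -- the idiom the
 * compacted one-liner above depends on. */
static unsigned int clear_lowest_bit(unsigned int val)
{
	return val & (val - 1);
}
```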

    Link: http://lkml.kernel.org/alpine.DEB.2.11.1503271423580.23114@gentwo.org

    Suggested-by: Christoph Lameter
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

27 Mar, 2015

1 commit


25 Mar, 2015

4 commits

  • The commit that added a check for this to checkpatch says:

    "Using weak declarations can have unintended link defects. The __weak on
    the declaration causes non-weak definitions to become weak."

    In this case, when a PowerPC kernel is built with CONFIG_KPROBE_EVENT
    but not CONFIG_UPROBE_EVENT, it generates the following warning:

    WARNING: 1 bad relocations
    c0000000014f2190 R_PPC64_ADDR64 uprobes_fetch_type_table

    This is fixed by passing the fetch_table arrays to
    traceprobe_parse_probe_arg() which also means that they can never be NULL.

    Link: http://lkml.kernel.org/r/20150312165834.4482cb48@canb.auug.org.au

    Acked-by: Masami Hiramatsu
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Steven Rostedt

    Stephen Rothwell
     
    The TRACE_EVENT_FL_USE_CALL_FILTER flag in the ftrace:function event can
    be removed. This flag was first introduced in commit
    f306cc82a93d ("tracing: Update event filters for multibuffer").

    Now, the only place that uses this flag is ftrace:function, but the
    filter of ftrace:function takes a different code path from
    events/syscalls and events/tracepoints. It uses ftrace_filter_write()
    and perf's ftrace_profile_set_filter() to set the filter; the
    functionality of the file 'tracing/events/ftrace/function/filter' is
    bypassed in init_pred(), in which case neither call->filter nor
    file->filter is used.

    So we can safely remove TRACE_EVENT_FL_USE_CALL_FILTER flag from
    ftrace:function events.

    Link: http://lkml.kernel.org/r/1425367294-27852-1-git-send-email-hekuang@huawei.com

    Signed-off-by: He Kuang
    Signed-off-by: Steven Rostedt

    He Kuang
     
  • Use %pS for actual addresses, otherwise you'll get bad output
    on arches like ppc64 where %pF expects a function descriptor.

    Link: http://lkml.kernel.org/r/1426130037-17956-22-git-send-email-scottwood@freescale.com

    Signed-off-by: Scott Wood
    Signed-off-by: Steven Rostedt

    Scott Wood
     
  • It has come to my attention that this_cpu_read/write are horrible on
    architectures other than x86. Worse yet, they actually disable
    preemption or interrupts! This caused some unexpected tracing results
    on ARM.

    101.356868: preempt_count_add

    Reported-by: Uwe Kleine-Koenig
    Tested-by: Uwe Kleine-Koenig
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

23 Mar, 2015

1 commit

  • The only reason CQM had to use a hard-coded pmu type was so it could use
    cqm_target in hw_perf_event.

    Do away with the {tp,bp,cqm}_target pointers and provide a non type
    specific one.

    This allows us to do away with that silly pmu type as well.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Vince Weaver
    Cc: acme@kernel.org
    Cc: acme@redhat.com
    Cc: hpa@zytor.com
    Cc: jolsa@redhat.com
    Cc: kanaka.d.juvva@intel.com
    Cc: matt.fleming@intel.com
    Cc: tglx@linutronix.de
    Cc: torvalds@linux-foundation.org
    Cc: vikas.shivappa@linux.intel.com
    Link: http://lkml.kernel.org/r/20150305211019.GU21418@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

09 Mar, 2015

1 commit

    Some archs (specifically PowerPC) are sensitive to the ordering of
    enabling the calls to function tracing and setting the
    function to be traced.

    That is, update_ftrace_function() sets which function the ftrace_caller
    trampoline should call. Some archs require this to be set before
    calling ftrace_run_update_code().

    Another bug was discovered: ftrace_startup_sysctl() called
    ftrace_run_update_code() directly. If the function the ftrace_caller
    trampoline calls changes, then it will not be updated. Instead,
    ftrace_startup_enable() should be called, because it tests whether
    the callback changed since the code was disabled, and tells
    the arch to update appropriately. Most archs do not need this
    notification, but PowerPC does.

    The problem could be seen by the following commands:

    # echo 0 > /proc/sys/kernel/ftrace_enabled
    # echo function > /sys/kernel/debug/tracing/current_tracer
    # echo 1 > /proc/sys/kernel/ftrace_enabled
    # cat /sys/kernel/debug/tracing/trace

    The trace will show that function tracing was not active.

    Cc: stable@vger.kernel.org # 2.6.27+
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)