22 Oct, 2019

1 commit

  • A race condition exists while initializing perf_trace_buf from
    perf_trace_init() and perf_kprobe_init().

    CPU0                                     CPU1
    perf_trace_init()
     mutex_lock(&event_mutex)
      perf_trace_event_init()
       perf_trace_event_reg()
        total_ref_count == 0
        buf = alloc_percpu()
       perf_trace_buf[i] = buf
       tp_event->class->reg() //fails        perf_kprobe_init()
      goto fail                               perf_trace_event_init()
                                               perf_trace_event_reg()
     fail:                                      total_ref_count == 0
      total_ref_count == 0
                                                buf = alloc_percpu()
                                                perf_trace_buf[i] = buf
                                                tp_event->class->reg()
                                                total_ref_count++

     free_percpu(perf_trace_buf[i])
     perf_trace_buf[i] = NULL

    Any subsequent call to perf_trace_event_reg() will observe total_ref_count > 0
    and skip the allocation, so perf_trace_buf stays NULL forever. perf_trace_buf
    can then be accessed from perf_trace_buf_alloc() without ever having been
    initialized. Acquiring event_mutex in perf_kprobe_init() before calling
    perf_trace_event_init() should fix this race.
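
    A minimal sketch of that approach (context and helper names taken from
    kernel/trace/trace_event_perf.c; the user-space copy of the probe name
    is elided):

    int perf_kprobe_init(struct perf_event *p_event, bool is_retprobe)
    {
        struct trace_event_call *tp_event;
        char *func = NULL;
        int ret;

        /* ... copy attr.kprobe_func from user space into 'func' ... */

        tp_event = create_local_trace_kprobe(func,
                        (void *)(unsigned long)p_event->attr.kprobe_addr,
                        p_event->attr.probe_offset, is_retprobe);
        if (IS_ERR(tp_event)) {
                ret = PTR_ERR(tp_event);
                goto out;
        }

        mutex_lock(&event_mutex);   /* closes the race with perf_trace_init() */
        ret = perf_trace_event_init(tp_event, p_event);
        if (ret)
                destroy_local_trace_kprobe(tp_event);
        mutex_unlock(&event_mutex);
    out:
        kfree(func);
        return ret;
    }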

    The race caused the following bug:

    Unable to handle kernel paging request at virtual address 0000003106f2003c
    Mem abort info:
    ESR = 0x96000045
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    Data abort info:
    ISV = 0, ISS = 0x00000045
    CM = 0, WnR = 1
    user pgtable: 4k pages, 39-bit VAs, pgdp = ffffffc034b9b000
    [0000003106f2003c] pgd=0000000000000000, pud=0000000000000000
    Internal error: Oops: 96000045 [#1] PREEMPT SMP
    Process syz-executor (pid: 18393, stack limit = 0xffffffc093190000)
    pstate: 80400005 (Nzcv daif +PAN -UAO)
    pc : __memset+0x20/0x1ac
    lr : memset+0x3c/0x50
    sp : ffffffc09319fc50

    __memset+0x20/0x1ac
    perf_trace_buf_alloc+0x140/0x1a0
    perf_trace_sys_enter+0x158/0x310
    syscall_trace_enter+0x348/0x7c0
    el0_svc_common+0x11c/0x368
    el0_svc_handler+0x12c/0x198
    el0_svc+0x8/0xc

    Ramdumps showed the following:
    total_ref_count = 3
    perf_trace_buf = (
    0x0 -> NULL,
    0x0 -> NULL,
    0x0 -> NULL,
    0x0 -> NULL)

    Link: http://lkml.kernel.org/r/1571120245-4186-1-git-send-email-prsood@codeaurora.org

    Cc: stable@vger.kernel.org
    Fixes: e12f03d7031a9 ("perf/core: Implement the 'perf_kprobe' PMU")
    Acked-by: Song Liu
    Signed-off-by: Prateek Sood
    Signed-off-by: Steven Rostedt (VMware)

    Prateek Sood
     

18 Oct, 2019

1 commit

  • In current mainline, the degree of access to the perf_event_open(2)
    system call depends on the perf_event_paranoid sysctl. This has a
    number of limitations:

    1. The sysctl is only a single value. Many types of accesses are
    controlled based on that single value, making the control very limited
    and coarse-grained.
    2. The sysctl is global, so if it is changed, all processes get access
    to perf_event_open(2), opening the door to security issues.

    This patch adds LSM and SELinux access checking which will be used in
    Android to access perf_event_open(2) for the purposes of attaching BPF
    programs to tracepoints, perf profiling and other operations from
    userspace. These operations are intended for production systems.

    5 new LSM hooks are added:
    1. perf_event_open: This controls access during the perf_event_open(2)
    syscall itself. The hook is called from all the places where the
    perf_event_paranoid sysctl is checked, to keep it consistent with the
    sysctl. The hook gets passed a 'type' argument which controls CPU,
    kernel and tracepoint accesses (in this context, CPU, kernel and
    tracepoint have the same semantics as the perf_event_paranoid sysctl).
    Additionally, I added an 'open' type which is similar to the
    perf_event_paranoid == 3 patch carried in Android and several other
    distros but rejected in mainline [1] in 2016.

    2. perf_event_alloc: This allocates a new security object for the event
    which stores the current SID within the event. It will be useful when
    the perf event's FD is passed through IPC to another process which may
    try to read the FD. Appropriate security checks will limit access.

    3. perf_event_free: Called when the event is closed.

    4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.

    5. perf_event_write: Called from the ioctl(2) syscalls for the event.

    [1] https://lwn.net/Articles/696240/

    Since Peter had suggested LSM hooks in 2016 [1], I am adding his
    Suggested-by tag below.

    To use this patch, we set the perf_event_paranoid sysctl to -1 and then
    apply selinux checking as appropriate (default deny everything, and then
    add policy rules to give access to domains that need it). In the future
    we can remove the perf_event_paranoid sysctl altogether.
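
    As a rough illustration, a check site that today only consults the
    sysctl could defer to the new hook like this (the helper name and the
    PERF_SECURITY_TRACEPOINT constant are assumptions based on the
    description above):

    static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
    {
        /* Legacy coarse-grained gate first... */
        if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
                return -EPERM;

        /* ...then the per-type LSM decision (e.g. SELinux policy). */
        return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
    }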

    Suggested-by: Peter Zijlstra
    Co-developed-by: Peter Zijlstra
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: James Morris
    Cc: Arnaldo Carvalho de Melo
    Cc: rostedt@goodmis.org
    Cc: Yonghong Song
    Cc: Kees Cook
    Cc: Ingo Molnar
    Cc: Alexei Starovoitov
    Cc: jeffv@google.com
    Cc: Jiri Olsa
    Cc: Daniel Borkmann
    Cc: primiano@google.com
    Cc: Song Liu
    Cc: rsavitski@google.com
    Cc: Namhyung Kim
    Cc: Matthew Garrett
    Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org

    Joel Fernandes (Google)
     

21 Feb, 2019

1 commit

  • The first version of this method was missing the check for
    `ret == PATH_MAX`; then such a check was added, but it didn't call kfree()
    on error, so there was still a small memory leak in the error case.
    Fix it by using strndup_user() instead of open-coding it.
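
    The resulting copy looks roughly like this (a sketch of the
    perf_uprobe_init() path; error-code details may differ):

        path = strndup_user(u64_to_user_ptr(p_event->attr.uprobe_path),
                            PATH_MAX);
        if (IS_ERR(path))
                return PTR_ERR(path);   /* nothing to kfree() on error */
        if (path[0] == '\0') {
                kfree(path);
                return -EINVAL;
        }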

    Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com

    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Fixes: 0eadcc7a7bc0 ("perf/core: Fix perf_uprobe_init()")
    Reviewed-by: Masami Hiramatsu
    Acked-by: Song Liu
    Signed-off-by: Jann Horn
    Signed-off-by: Steven Rostedt (VMware)

    Jann Horn
     

11 Oct, 2018

1 commit

  • This patch enables uprobes with a reference counter in fd-based
    uprobes. The highest 32 bits of perf_event_attr.config are used to
    store the offset of the reference count (semaphore).

    Format information in /sys/bus/event_source/devices/uprobe/format/ is
    updated to reflect this new feature.
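
    From user space, the encoding would look something like this sketch
    (the PMU type must be read from /sys/bus/event_source/devices/uprobe/type;
    the shift follows the format description, field names per the uapi
    header):

    #include <linux/perf_event.h>
    #include <string.h>

    #define UPROBE_REF_CTR_OFFSET_SHIFT 32  /* config:32-63 per the format file */

    void setup_uprobe_attr(struct perf_event_attr *attr, __u32 uprobe_pmu_type,
                           const char *path, __u64 probe_offset,
                           __u64 ref_ctr_offset)
    {
        memset(attr, 0, sizeof(*attr));
        attr->size = sizeof(*attr);
        attr->type = uprobe_pmu_type;
        attr->uprobe_path = (__u64)(unsigned long)path;
        attr->probe_offset = probe_offset;
        /* reference counter (semaphore) offset in the high 32 bits */
        attr->config |= ref_ctr_offset << UPROBE_REF_CTR_OFFSET_SHIFT;
    }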

    Link: http://lkml.kernel.org/r/20181002053636.1896903-1-songliubraving@fb.com

    Cc: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Reviewed-and-tested-by: Ravi Bangoria
    Signed-off-by: Song Liu
    Signed-off-by: Steven Rostedt (VMware)

    Song Liu
     

10 Apr, 2018

2 commits

  • Similarly to the uprobe PMU fix in perf_kprobe_init(), fix error
    handling in perf_uprobe_init() as well.

    Reported-by: 范龙飞
    Signed-off-by: Song Liu
    Acked-by: Masami Hiramatsu
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: e12f03d7031a ("perf/core: Implement the 'perf_kprobe' PMU")
    Signed-off-by: Ingo Molnar

    Song Liu
     
  • Fix error handling in perf_kprobe_init():

    ==================================================================
    BUG: KASAN: slab-out-of-bounds in strlen+0x8e/0xa0 lib/string.c:482
    Read of size 1 at addr ffff88003f9cc5c0 by task syz-executor2/23095

    CPU: 0 PID: 23095 Comm: syz-executor2 Not tainted 4.16.0+ #24
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0xca/0x13e lib/dump_stack.c:113
    print_address_description+0x6e/0x2c0 mm/kasan/report.c:256
    kasan_report_error mm/kasan/report.c:354 [inline]
    kasan_report+0x256/0x380 mm/kasan/report.c:412
    strlen+0x8e/0xa0 lib/string.c:482
    kstrdup+0x21/0x70 mm/util.c:55
    alloc_trace_kprobe+0xc8/0x930 kernel/trace/trace_kprobe.c:325
    create_local_trace_kprobe+0x4f/0x3a0 kernel/trace/trace_kprobe.c:1438
    perf_kprobe_init+0x149/0x1f0 kernel/trace/trace_event_perf.c:264
    perf_kprobe_event_init+0xa8/0x120 kernel/events/core.c:8407
    perf_try_init_event+0xcb/0x2a0 kernel/events/core.c:9719
    perf_init_event kernel/events/core.c:9750 [inline]
    perf_event_alloc+0x1367/0x1e20 kernel/events/core.c:10022
    SYSC_perf_event_open+0x242/0x2330 kernel/events/core.c:10477
    do_syscall_64+0x198/0x640 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
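
    The underlying problem is that the user-space pointer in
    attr.kprobe_func reached kstrdup()/strlen() directly. A sketch of the
    kind of bounded copy the fix introduces (buffer size and error mapping
    assumed):

        char *func = kzalloc(KSYM_NAME_LEN, GFP_KERNEL);

        if (!func)
                return -ENOMEM;
        ret = strncpy_from_user(func,
                        u64_to_user_ptr(p_event->attr.kprobe_func),
                        KSYM_NAME_LEN);
        if (ret == KSYM_NAME_LEN)
                ret = -E2BIG;   /* symbol name did not fit */
        if (ret < 0)
                goto out;       /* error path frees 'func' */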

    Reported-by: 范龙飞
    Signed-off-by: Masami Hiramatsu
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Song Liu
    Cc: Thomas Gleixner
    Fixes: e12f03d7031a ("perf/core: Implement the 'perf_kprobe' PMU")
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     

06 Feb, 2018

2 commits

  • This patch adds perf_uprobe support, following a similar pattern to
    the previous (kprobe) patch.

    Two functions, create_local_trace_uprobe() and
    destroy_local_trace_uprobe(), are added so a uprobe can be created and
    attached to the file descriptor returned by perf_event_open().

    Signed-off-by: Song Liu
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Yonghong Song
    Reviewed-by: Josef Bacik
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20171206224518.3598254-7-songliubraving@fb.com
    Signed-off-by: Ingo Molnar

    Song Liu
     
  • A new PMU type, perf_kprobe is added. Based on attr from perf_event_open(),
    perf_kprobe creates a kprobe (or kretprobe) for the perf_event. This
    kprobe is private to this perf_event, and thus not added to global
    lists, and not available in tracefs.

    Two functions, create_local_trace_kprobe() and
    destroy_local_trace_kprobe(), are added to create and destroy these
    local trace_kprobes.
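
    From user space, such a private kprobe might be opened like this
    sketch (the PMU type would be read from
    /sys/bus/event_source/devices/kprobe/type; the helper and error
    handling are illustrative):

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <string.h>

    static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
    {
        return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int open_local_kprobe(__u32 kprobe_pmu_type, const char *symbol)
    {
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = kprobe_pmu_type;            /* dynamic PMU type */
        attr.kprobe_func = (__u64)(unsigned long)symbol;
        attr.probe_offset = 0;                  /* probe at symbol+0 */

        return perf_event_open(&attr, -1 /* any pid */, 0 /* cpu 0 */, -1, 0);
    }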

    Signed-off-by: Song Liu
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Yonghong Song
    Reviewed-by: Josef Bacik
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20171206224518.3598254-6-songliubraving@fb.com
    Signed-off-by: Ingo Molnar

    Song Liu
     

17 Oct, 2017

3 commits

  • ops->flags _should_ be 0 at this point, so setting the flag using
    bitwise or is a bit daft.

    Link: http://lkml.kernel.org/r/20171011080224.315585202@infradead.org

    Requested-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Steven Rostedt (VMware)

    Peter Zijlstra
     
  • The function-trace perf interface is a tad messed up. Where all
    the other trace perf interfaces use a single trace hook
    registration and use per-cpu RCU based hlist to iterate the events,
    function-trace actually needs multiple hook registrations in order to
    minimize function entry patching when filters are present.

    The end result is that we iterate events both on the trace hook and on
    the hlist, which results in reporting events multiple times.

    Since function-trace cannot use the regular scheme, fix it the other
    way around, use singleton hlists.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Steven Rostedt (VMware)

    Peter Zijlstra
     
  • Revert commit:

    75e8387685f6 ("perf/ftrace: Fix double traces of perf on ftrace:function")

    The reason I instantly stumbled on that patch is that it only addresses the
    ftrace situation and doesn't mention the other _5_ places that use this
    interface. It doesn't explain why those don't have the problem and if not, why
    their solution doesn't work for ftrace.

    It doesn't, but this is just putting more duct tape on.

    Link: http://lkml.kernel.org/r/20171011080224.200565770@infradead.org

    Cc: Zhou Chengming
    Cc: Jiri Olsa
    Cc: Ingo Molnar
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Steven Rostedt (VMware)

    Peter Zijlstra
     

29 Aug, 2017

1 commit

  • When running perf on the ftrace:function tracepoint, there is a bug
    which can be reproduced by:

    perf record -e ftrace:function -a sleep 20 &
    perf record -e ftrace:function ls
    perf script

    ls 10304 [005] 171.853235: ftrace:function: perf_output_begin
    ls 10304 [005] 171.853237: ftrace:function: perf_output_begin
    ls 10304 [005] 171.853239: ftrace:function: task_tgid_nr_ns
    ls 10304 [005] 171.853240: ftrace:function: task_tgid_nr_ns
    ls 10304 [005] 171.853242: ftrace:function: __task_pid_nr_ns
    ls 10304 [005] 171.853244: ftrace:function: __task_pid_nr_ns

    We can see that all the function traces are doubled.

    The problem is caused by an inconsistency between the registration
    function perf_ftrace_event_register() and the probe function
    perf_ftrace_function_call(): the former registers one probe for every
    perf_event, while the latter handles all perf_events on the current
    CPU. So when two perf_events are active on the current CPU, their
    traces are doubled.

    This patch therefore adds an extra "event" parameter to
    perf_tp_event(); when it is not NULL, sample data is sent only to that
    event.
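
    The resulting prototype looks roughly like this (a sketch; only the
    final 'event' parameter is new):

    void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
                       struct pt_regs *regs, struct hlist_head *head,
                       int rctx, struct task_struct *task,
                       struct perf_event *event);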

    Signed-off-by: Zhou Chengming
    Reviewed-by: Jiri Olsa
    Acked-by: Steven Rostedt (VMware)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: acme@kernel.org
    Cc: alexander.shishkin@linux.intel.com
    Cc: huawei.libin@huawei.com
    Link: http://lkml.kernel.org/r/1503668977-12526-1-git-send-email-zhouchengming1@huawei.com
    Signed-off-by: Ingo Molnar

    Zhou Chengming
     

18 May, 2016

1 commit

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support SPI based w5100 devices, from Akinobu Mita.

    2) Partial Segmentation Offload, from Alexander Duyck.

    3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

    4) Allow cls_flower stats offload, from Amir Vadai.

    5) Implement bpf blinding, from Daniel Borkmann.

    6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
    actually using FASYNC these atomics are superfluous. From Eric
    Dumazet.

    7) Run TCP more preemptibly, also from Eric Dumazet.

    8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
    driver, from Gal Pressman.

    9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

    10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

    11) Support tunneling offloads in qed, from Manish Chopra.

    12) aRFS offloading in mlx5e, from Maor Gottlieb.

    13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
    Leitner.

    14) Add MSG_EOR support to TCP, this allows controlling packet
    coalescing on application record boundaries for more accurate
    socket timestamp sampling. From Martin KaFai Lau.

    15) Fix alignment of 64-bit netlink attributes across the board, from
    Nicolas Dichtel.

    16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

    17) Several conversions of drivers to ethtool ksettings, from Philippe
    Reynes.

    18) Checksum neutral ILA in ipv6, from Tom Herbert.

    19) Factorize all of the various marvell dsa drivers into one, from
    Vivien Didelot

    20) Add VF support to qed driver, from Yuval Mintz"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
    Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
    Revert "phy dp83867: Make rgmii parameters optional"
    r8169: default to 64-bit DMA on recent PCIe chips
    phy dp83867: Make rgmii parameters optional
    phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
    bpf: arm64: remove callee-save registers use for tmp registers
    asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
    switchdev: pass pointer to fib_info instead of copy
    net_sched: close another race condition in tcf_mirred_release()
    tipc: fix nametable publication field in nl compat
    drivers: net: Don't print unpopulated net_device name
    qed: add support for dcbx.
    ravb: Add missing free_irq() calls to ravb_close()
    qed: Remove a stray tab
    net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fec-mpc52xx: use phydev from struct net_device
    bpf, doc: fix typo on bpf_asm descriptions
    stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
    net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fs-enet: use phydev from struct net_device
    ...

    Linus Torvalds
     

08 Apr, 2016

2 commits

  • The split allows moving the expensive update of 'struct trace_entry'
    to a later phase. Repurpose the unused 1st argument of perf_tp_event()
    to indicate the event type.

    While splitting, use a temporary variable 'rctx' instead of '*rctx' to
    avoid unnecessary loads generated by the compiler due to
    -fno-strict-aliasing.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Avoid the memset in perf_fetch_caller_regs(), since it is on the
    critical path of all tracepoints. It is called from
    perf_sw_event_sched(), perf_event_task_sched_in() and all of
    perf_trace_##call with this_cpu_ptr(&__perf_regs[..]), which is
    zero-initialized by the percpu init logic, and the subsequent call to
    perf_arch_fetch_caller_regs() initializes the same fields on all
    archs. So we can safely drop the memset from all of the above cases
    and move it into perf_ftrace_function_call(), which calls it with a
    stack-allocated pt_regs.

    Acked-by: Peter Zijlstra (Intel)
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

31 Mar, 2016

1 commit

  • Currently we check the sample type for ftrace:function events even if
    the event is not created as a sampling event. That prevents creating
    an ftrace:function event in counting mode.

    Make sure we check sample types only for sampling events.
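
    A sketch of the change in the event permission check (surrounding
    function assumed):

        /* Counting events have no sample_type to validate. */
        if (!is_sampling_event(p_event))
                return 0;

        /* ... existing sample_type restrictions for ftrace:function ... */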

    Before:
    $ sudo perf stat -e ftrace:function ls
    ...

    Performance counter stats for 'ls':

    ftrace:function

    0.001983662 seconds time elapsed

    After:
    $ sudo perf stat -e ftrace:function ls
    ...

    Performance counter stats for 'ls':

    44,498 ftrace:function

    0.037534722 seconds time elapsed

    Suggested-by: Namhyung Kim
    Signed-off-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Steven Rostedt
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1458138873-1553-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     

13 Jan, 2016

1 commit

  • Pull tracing updates from Steven Rostedt:
    "Not much new with tracing for this release. Mostly just clean ups and
    minor fixes.

    Here's what else is new:

    - A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for
    those that want both.

    - New selftest to test the instance create and delete

    - Better debug output when ftrace fails"

    * tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits)
    ftrace: Fix the race between ftrace and insmod
    ftrace: Add infrastructure for delayed enabling of module functions
    x86: ftrace: Fix the comments for ftrace_modify_code_direct()
    tracing: Fix comment to use tracing_on over tracing_enable
    metag: ftrace: Fix the comments for ftrace_modify_code
    sh: ftrace: Fix the comments for ftrace_modify_code()
    ia64: ftrace: Fix the comments for ftrace_modify_code()
    ftrace: Clean up ftrace_module_init() code
    ftrace: Join functions ftrace_module_init() and ftrace_init_module()
    tracing: Introduce TRACE_EVENT_FN_COND macro
    tracing: Use seq_buf_used() in seq_buf_to_user() instead of len
    bpf: Constify bpf_verifier_ops structure
    ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
    ftrace: Remove use of control list and ops
    ftrace: Fix output of enabled_functions for showing tramp
    ftrace: Fix a typo in comment
    ftrace: Show all tramps registered to a record on ftrace_bug()
    ftrace: Add variable ftrace_expected for archs to show expected code
    ftrace: Add new type to distinguish what kind of ftrace_bug()
    tracing: Update cond flag when enabling or disabling a trigger
    ...

    Linus Torvalds
     

24 Dec, 2015

1 commit

  • Currently perf has its own list function within the ftrace infrastructure
    that seems to be used only to allow for it to have per-cpu disabling as well
    as a check to make sure that it's not called while RCU is not watching. It
    uses something called the "control_ops" which is used to iterate over ops
    under it with the control_list_func().

    The problem is that this control_ops and control_list_func unnecessarily
    complicates the code. By replacing FTRACE_OPS_FL_CONTROL with two new flags
    (FTRACE_OPS_FL_RCU and FTRACE_OPS_FL_PER_CPU) we can remove all the code
    that is special with the control ops and add the needed checks within the
    generic ftrace_list_func().

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

23 Nov, 2015

1 commit

  • There were still a number of references to my old Red Hat email
    address in the kernel source. Remove these while keeping the
    Red Hat copyright notices intact.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

14 Jan, 2015

1 commit

  • Both Linus (most recent) and Steve (a while ago) reported that perf
    related callbacks have massive stack bloat.

    The problem is that software events need a pt_regs in order to
    properly report the event location and unwind stack. And because we
    could not assume one was present we allocated one on stack and filled
    it with minimal bits required for operation.

    Now, pt_regs is quite large, so this is undesirable. Furthermore it
    turns out that most sites actually have a pt_regs pointer available,
    making this even more onerous, as the stack space is pointless waste.

    This patch addresses the problem by observing that software events
    have well defined nesting semantics, therefore we can use static
    per-cpu storage instead of on-stack.

    Linus made the further observation that all but the scheduler callers
    of perf_sw_event() have a pt_regs available, so we change the regular
    perf_sw_event() to require a valid pt_regs (where it used to be
    optional) and add perf_sw_event_sched() for the scheduler.

    We have a scheduler specific call instead of a more generic _noregs()
    like construct because we can assume non-recursion from the scheduler
    and thereby simplify the code further (_noregs would have to put the
    recursion context call inline in order to ascertain which __perf_regs
    element to use).

    One last note on the implementation of perf_trace_buf_prepare(); we
    allow .regs = NULL for those cases where we already have a pt_regs
    pointer available and do not need another.
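
    A minimal sketch of the scheme (array depth and the recursion-context
    indexing are simplified; index 0 is used here since the scheduler is
    assumed non-recursive):

    DECLARE_PER_CPU(struct pt_regs, __perf_regs[4]);

    static __always_inline void
    perf_sw_event_sched(u32 event_id, u64 nr, u64 addr)
    {
        if (static_key_false(&perf_swevent_enabled[event_id])) {
                struct pt_regs *regs = this_cpu_ptr(&__perf_regs[0]);

                perf_fetch_caller_regs(regs);
                ___perf_sw_event(event_id, nr, regs, addr);
        }
    }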

    Reported-by: Linus Torvalds
    Reported-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Javi Merino
    Cc: Linus Torvalds
    Cc: Mathieu Desnoyers
    Cc: Oleg Nesterov
    Cc: Paul Mackerras
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Vaibhav Nagarnaik
    Link: http://lkml.kernel.org/r/20141216115041.GW3337@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
     

28 Jul, 2014

1 commit

  • There's no need to check a cloned event's permission once the parent
    has already been checked.

    Also, the code is checking the 'current' process's permissions, which
    is not the owner process for cloned events, and thus could end up with
    the wrong permission check result.

    Reported-by: Alexander Yarygin
    Tested-by: Alexander Yarygin
    Signed-off-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1405079782-8139-1-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     

24 Apr, 2014

1 commit

  • Use the NOKPROBE_SYMBOL macro to protect functions from kprobes
    instead of the __kprobes annotation in ftrace. This applies the
    nokprobe_inline annotation in some cases, because NOKPROBE_SYMBOL()
    would inhibit inlining by referring to the symbol address.

    Signed-off-by: Masami Hiramatsu
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20140417081828.26341.55152.stgit@ltc230.yrl.intra.hitachi.co.jp
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     

11 Mar, 2014

2 commits

  • Recent issues with user space callchains processing within
    page fault handler tracing showed as Peter said 'there's
    just too much fail surface'.

    The user space stack dump is just another source of this issue.

    Related list discussions:
    http://marc.info/?t=139302086500001&r=1&w=2
    http://marc.info/?t=139301437300003&r=1&w=2

    Suggested-by: Peter Zijlstra
    Signed-off-by: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Vince Weaver
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: H. Peter Anvin
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393775800-13524-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Recent issues with user space callchains processing within
    page fault handler tracing showed as Peter said 'there's
    just too much fail surface'.

    Related list discussions:

    http://marc.info/?t=139302086500001&r=1&w=2
    http://marc.info/?t=139301437300003&r=1&w=2

    Suggested-by: Peter Zijlstra
    Signed-off-by: Jiri Olsa
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: "H. Peter Anvin"
    Cc: Vince Weaver
    Cc: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393775800-13524-2-git-send-email-jolsa@redhat.com
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     

19 Nov, 2013

2 commits

  • The 64-bit attr.config value for perf trace events was being copied into
    an "int" before doing a comparison, meaning the top 32 bits were
    being truncated.

    As far as I can tell this didn't cause any errors, but it did mean
    it was possible to create valid aliases for all the tracepoint ids
    which I don't think was intended. (For example, 0xffffffff00000018
    and 0x18 both enable the same tracepoint).
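
    The fix is simply to keep the full width of attr.config (a sketch;
    variable name assumed):

        u64 event_id = p_event->attr.config;    /* was: int event_id = ...; */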

    Signed-off-by: Vince Weaver
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1311151236100.11932@vincent-weaver-1.um.maine.edu
    Signed-off-by: Ingo Molnar

    Vince Weaver
     
  • Vince's perf-trinity fuzzer found yet another 'interesting' problem.

    When we sample the irq_work_exit tracepoint with period==1 (or
    PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an
    infinite event generation loop:

    ,->
    | irq_work_exit() ->
    | trace_irq_work_exit() ->
    | ...
    | __perf_event_overflow() -> (due to fasync)
    | irq_work_queue() -> (irq_work_list must be empty)
    '--------- arch_irq_work_raise()

    Similar things can happen due to regular poll() wakeups if we exceed
    the ring-buffer wakeup watermark, or have an event_limit.

    To avoid this, dis-allow sampling this particular tracepoint.

    In order to achieve this, create a special perf_perm function pointer
    for each event and call this (when set) on trying to create a
    tracepoint perf event.

    [ roasted: use expr... to allow for ',' in your expression ]
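
    The resulting usage is roughly (a sketch of the new macro applied to
    the tracepoint in question):

        TRACE_EVENT_PERF_PERM(irq_work_exit,
                is_sampling_event(p_event) ? -EPERM : 0);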

    Reported-by: Vince Weaver
    Tested-by: Vince Weaver
    Signed-off-by: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Dave Jones
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

07 Nov, 2013

1 commit

  • The current default perf paranoid level is "1", for which
    "perf_paranoid_kernel()" returns false, giving normal users access to
    any operations that use it. Unfortunately, this includes function
    tracing, and normal users should not be allowed to enable function
    tracing by default.

    The proper level is "-1" (full perf access), which is the only level
    at which "perf_paranoid_tracepoint_raw()" grants access. Use that
    check instead for enabling function tracing.
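
    A sketch of the corrected check (surrounding perf permission function
    assumed):

        if (ftrace_event_is_function(tp_event) &&
            perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
                return -EPERM;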

    Reported-by: Dave Jones
    Reported-by: Vince Weaver
    Tested-by: Vince Weaver
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Frederic Weisbecker
    Cc: stable@vger.kernel.org # 3.4+
    CVE: CVE-2013-2930
    Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf")
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

19 Jul, 2013

2 commits

  • Every perf_trace_buf_prepare() caller does
    WARN_ONCE(size > PERF_MAX_TRACE_SIZE, message) and "message" is
    almost the same.

    Shift this WARN_ONCE() into perf_trace_buf_prepare(). This changes
    the meaning of _ONCE, but I think this is fine.

    - 4947014 2932448 10104832 17984294 1126b26 vmlinux
    + 4948422 2932448 10104832 17985702 11270a6 vmlinux

    on my build.
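
    A sketch of the centralized check inside perf_trace_buf_prepare():

        if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE,
                      "perf buffer not large enough"))
                return NULL;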

    Link: http://lkml.kernel.org/r/20130617170211.GA19813@redhat.com

    Acked-by: Peter Zijlstra
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     
  • perf_trace_buf_prepare() + perf_trace_buf_submit(head, task => NULL)
    make no sense if hlist_empty(head). Change perf_ftrace_function_call()
    to check event_function.perf_events beforehand.

    Link: http://lkml.kernel.org/r/20130617170204.GA19803@redhat.com

    Acked-by: Peter Zijlstra
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Steven Rostedt

    Oleg Nesterov
     

21 Aug, 2012

1 commit

  • …/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    * Fix include order for bison/flex-generated C files, from Ben Hutchings

    * Build fixes and documentation corrections from David Ahern

    * Group parsing support, from Jiri Olsa

    * UI/gtk refactorings and improvements from Namhyung Kim

    * NULL deref fix for perf script, from Namhyung Kim

    * Assorted cleanups from Robert Richter

    * Let O= makes handle relative paths, from Steven Rostedt

    * perf script python fixes, from Feng Tang.

    * Improve 'perf lock' error message when the needed tracepoints
    are not present, from David Ahern.

    * Initial bash completion support, from Frederic Weisbecker

    * Allow building without libelf, from Namhyung Kim.

    * Support DWARF CFI based unwind to have callchains when %bp
    based unwinding is not possible, from Jiri Olsa.

    * Symbol resolution fixes; while fixing support for PPC64 files with an
    .opt ELF section was the end goal, several fixes for code that handles
    all architectures and cleanups are included, from Cody Schafer.

    * Add a description for the JIT interface, from Andi Kleen.

    * Assorted fixes for Documentation and build in 32 bit, from Robert Richter

    * Add support for non-tracepoint events in perf script python, from Feng Tang

    * Cache the libtraceevent event_format associated to each evsel early, so that we
    avoid relookups, i.e. calling pevent_find_event repeatedly when processing
    tracepoint events.

    [ This is to reduce the surface contact with libtraceevent and make
    clear what it is that the perf tools need from that lib: so far,
    parsing the common and per-event fields. ]

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

31 Jul, 2012

1 commit

  • A few events are interesting not only for the current task. For
    example, sched_stat_* events are interesting for the task which wakes
    up. For this reason, it would be good if such events were delivered to
    a target task too.

    Now a target task can be set by using __perf_task().

    The original idea and a draft patch belongs to Peter Zijlstra.

    I need these events for profiling sleep times. sched_switch is used for
    getting callchains and sched_stat_* is used for getting time periods.
    These events are combined in user space, then it can be analyzed by
    perf tools.

    Inspired-by: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Steven Rostedt
    Cc: Arun Sharma
    Signed-off-by: Andrew Vagin
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1342016098-213063-1-git-send-email-avagin@openvz.org
    Signed-off-by: Ingo Molnar

    Andrew Vagin
     

20 Jul, 2012

2 commits

  • Pass the pt_regs as the 4th parameter to the function tracer callback.

    Later patches that implement regs passing for the architectures will require
    having the ftrace_ops set the SAVE_REGS flag, which will tell the arch
    to take the time to pass a full set of pt_regs to the ftrace_ops callback
    function. If the arch does not support it then it should pass NULL.

    If an arch can pass full regs, then it should define:
    ARCH_SUPPORTS_FTRACE_SAVE_REGS to 1
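
    Together with the ops argument introduced in the next entry below, the
    callback signature becomes (a sketch):

    typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
                                  struct ftrace_ops *op, struct pt_regs *regs);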

    Link: http://lkml.kernel.org/r/20120702201821.019966811@goodmis.org

    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Currently the function trace callback receives only the ip and
    parent_ip of the function that it traced. It would be more powerful to
    also pass the ftrace_ops that registered the function. This allows the
    same function to act differently depending on which ftrace_ops
    registered it.

    Link: http://lkml.kernel.org/r/20120612225424.267254552@goodmis.org

    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

22 Feb, 2012

3 commits

  • Adding support to filter function trace event via perf
    interface. It is now possible to use filter interface
    in the perf tool like:

    perf record -e ftrace:function --filter="(ip == mm_*)" ls

    The filter syntax is restricted to the 'ip' field only, and the
    following operators are accepted: '==', '!=', '||', ending up with
    filter strings like:

    ip == f1[, ]f2 ... || ip != f3[, ]f4 ...

    with comma ',' or space ' ' as a function separator. If the
    space ' ' is used as a separator, the right side of the
    assignment needs to be enclosed in double quotes '"', e.g.:

    perf record -e ftrace:function --filter '(ip == do_execve,sys_*,ext*)' ls
    perf record -e ftrace:function --filter '(ip == "do_execve,sys_*,ext*")' ls
    perf record -e ftrace:function --filter '(ip == "do_execve sys_* ext*")' ls

    The '==' operator adds trace filter with same effect as would
    be added via set_ftrace_filter file.

    The '!=' operator adds trace filter with same effect as would
    be added via set_ftrace_notrace file.

    The right side of the '!=' and '==' operators is a list of functions
    or regexps to be added to the filter, separated by spaces.

    The '||' operator is used for connecting multiple filter definitions
    together. It is possible to have more than one '==' and '!='
    operators within one filter string.

    Link: http://lkml.kernel.org/r/1329317514-8131-8-git-send-email-jolsa@redhat.com

    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Adding perf registration support for the ftrace function event,
    so it is now possible to register it via perf interface.

    The perf_event struct statically contains ftrace_ops as a handle
    for function tracer. The function tracer is registered/unregistered
    in open/close actions.

    To be efficient, we enable/disable the ftrace_ops each time the traced
    process is scheduled in/out (via the TRACE_REG_PERF_(ADD|DEL)
    handlers). This way tracing is enabled only when the process is
    running. This is intentionally done this way instead of via the
    event's hw state PERF_HES_STOPPED, which would not disable the
    ftrace_ops.

    It is now possible to use function trace within perf commands
    like:

    perf record -e ftrace:function ls
    perf stat -e ftrace:function ls

    Allowed only for root.

    Link: http://lkml.kernel.org/r/1329317514-8131-6-git-send-email-jolsa@redhat.com

    Acked-by: Frederic Weisbecker
    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Jiri Olsa
     
  • Adding TRACE_REG_PERF_ADD and TRACE_REG_PERF_DEL to handle
    perf event schedule in/out actions.

    The add action is invoked for when the perf event is scheduled in,
    while the del action is invoked when the event is scheduled out.
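
    A sketch of the additions to enum trace_reg (existing members elided):

    enum trace_reg {
        TRACE_REG_REGISTER,
        TRACE_REG_UNREGISTER,
        /* ... perf register/open actions ... */
        TRACE_REG_PERF_ADD,     /* perf event scheduled in on a CPU */
        TRACE_REG_PERF_DEL,     /* perf event scheduled out */
    };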

    Link: http://lkml.kernel.org/r/1329317514-8131-4-git-send-email-jolsa@redhat.com

    Acked-by: Frederic Weisbecker
    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt

    Jiri Olsa