09 Feb, 2013
12 commits
-
uprobe_perf_open/close call the costly uprobe_apply() every time,
we can avoid it if:- "nr_systemwide != 0" is not changed.
- There is another process/thread with the same ->mm.
- copy_proccess() does inherit_event(). dup_mmap() preserves the
inserted breakpoints.- event->attr.enable_on_exec == T, we can rely on uprobe_mmap()
called by exec/mmap paths.- tp_target is exiting. Only _close() checks PF_EXITING, I don't
think TRACE_REG_PERF_OPEN can hit the dying task too often.Signed-off-by: Oleg Nesterov
-
Change uprobe_trace_func() and uprobe_perf_func() to return "int". Change
uprobe_dispatcher() to return "trace_ret | perf_ret" although this is not
needed, currently TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive.The only functional change is that uprobe_perf_func() checks the filtering
too and returns UPROBE_HANDLER_REMOVE if nobody wants to trace current.Testing:
# perf probe -x /lib/libc.so.6 syscall
# perf record -e probe_libc:syscall -i perl -e 'fork; syscall -1 for 1..10; wait'
# perf report --show-total-period
100.00% 10 perl libc-2.8.so [.] syscallBefore this patch:
# cat /sys/kernel/debug/tracing/uprobe_profile
/lib/libc.so.6 syscall 20A child process doesn't have a counter, but still it hits this breakoint
"copied" by dup_mmap().After the patch:
# cat /sys/kernel/debug/tracing/uprobe_profile
/lib/libc.so.6 syscall 11The child process hits this int3 only once and does unapply_uprobe().
Signed-off-by: Oleg Nesterov
-
Finally implement uprobe_perf_filter() which checks ->nr_systemwide or
->perf_events to figure out whether we need to insert the breakpoint.uprobe_perf_open/close are changed to do uprobe_apply(true/false) when
the new perf event comes or goes away.Note that currently this is very suboptimal:
- uprobe_register() called by TRACE_REG_PERF_REGISTER becomes a
heavy nop, consumer->filter() always returns F at this stage.As it was already discussed we need uprobe_register_only() to
avoid the costly register_for_each_vma() when possible.- uprobe_apply() is oftenly overkill. Unless "nr_systemwide != 0"
changes we need uprobe_apply_mm(), unapply_uprobe() is almost
what we need.- uprobe_apply() can be simply avoided sometimes, see the next
changes.Testing:
# perf probe -x /lib/libc.so.6 syscall
# perl -e 'syscall -1 while 1' &
[1] 530# perf record -e probe_libc:syscall perl -e 'syscall -1 for 1..10; sleep 1'
# perf report --show-total-period
100.00% 10 perl libc-2.8.so [.] syscallBefore this patch:
# cat /sys/kernel/debug/tracing/uprobe_profile
/lib/libc.so.6 syscall 79291A huge ->nrhit == 79291 reflects the fact that the background process
530 constantly hits this breakpoint too, even if doesn't contribute to
the output.After the patch:
# cat /sys/kernel/debug/tracing/uprobe_profile
/lib/libc.so.6 syscall 10This shows that only the target process was punished by int3.
Signed-off-by: Oleg Nesterov
-
Introduce "struct trace_uprobe_filter" which records the "active"
perf_event's attached to ftrace_event_call. For the start we simply
use list_head, we can optimize this later if needed. For example, we
do not really need to record an event with ->parent != NULL, we can
rely on parent->child_list. And we can certainly do some optimizations
for the case when 2 events have the same ->tp_target or tp_target->mm.Change trace_uprobe_register() to process TRACE_REG_PERF_OPEN/CLOSE
and add/del this perf_event to the list.We can probably avoid any locking, but lets start with the "obvioulsy
correct" trace_uprobe_filter->rwlock which protects everything.Signed-off-by: Oleg Nesterov
-
Move tu->nhit++ from uprobe_trace_func() to uprobe_dispatcher().
->nhit counts how many time we hit the breakpoint inserted by this
uprobe, we do not want to loose this info if uprobe was enabled by
sys_perf_event_open().Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju -
trace_uprobe->consumer and "struct uprobe_trace_consumer" add the
unnecessary indirection and complicate the code for no reason.This patch simply embeds uprobe_consumer into "struct trace_uprobe",
all other changes only fix the compilation errors.Signed-off-by: Oleg Nesterov
-
probe_event_enable/disable() check tu->consumer != NULL to avoid the
wrong uprobe_register/unregister().We are going to kill this pointer and "struct uprobe_trace_consumer",
so we add the new helper, is_trace_uprobe_enabled(), which can rely
on TP_FLAG_TRACE/TP_FLAG_PROFILE instead.Note: the current logic doesn't look optimal, it is not clear why
TP_FLAG_TRACE/TP_FLAG_PROFILE are mutually exclusive, we will probably
change this later.Also kill the unused TP_FLAG_UPROBE.
Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju -
probe_event_enable/disable() check tu->inode != NULL at the start.
This is ugly, if igrab() can fail create_trace_uprobe() should not
succeed and "postpone" the failure.And S_ISREG(inode->i_mode) check added by d24d7dbf is not safe.
Note: alloc_uprobe() should probably check igrab() != NULL as well.
Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju -
probe_event_enable() does uprobe_register() and only after that sets
utc->tu and tu->consumer/flags. This can race with uprobe_dispatcher()
which can miss these assignments or see them out of order. Nothing
really bad can happen, but this doesn't look clean/safe.And this does not allow to use uprobe_consumer->filter() we are going
to add, it is called by uprobe_register() and it needs utc->tu.Change this code to initialize everything before uprobe_register(), and
reset tu->consumer/flags if it fails. We can't race with event_disable(),
the caller holds event_mutex, and if we could the code would be wrong
anyway.In fact I think uprobe_trace_consumer should die, it buys nothing but
complicates the code. We can simply add uprobe_consumer into trace_uprobe.Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju -
create_trace_uprobe() does kern_path() to find ->d_inode, but forgets
to do path_put(). We can do this right after igrab().Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju -
Change handle_swbp() to set regs->ip = bp_vaddr in advance, this is
what consumer->handler() needs but uprobe_get_swbp_addr() is not
exported.This also simplifies the code and makes it more consistent across
the supported architectures. handle_swbp() becomes the only caller
of uprobe_get_swbp_addr().Signed-off-by: Oleg Nesterov
Acked-by: Ananth N Mavinakayanahalli -
uprobe_consumer->filter() is pointless in its current form, kill it.
We will add it back, but with the different signature/semantics. Perhaps
we will even re-introduce the callsite in handler_chain(), but not to
just skip uc->handler().Signed-off-by: Oleg Nesterov
Acked-by: Srikar Dronamraju
22 Jan, 2013
1 commit
-
Without this patch, we can register a uprobe event for a directory.
Enabling such a uprobe event would anyway fail.Example:
$ echo 'p /bin:0x4245c0' > /sys/kernel/debug/tracing/uprobe_eventsHowever dirctories cannot be valid targets for uprobe.
Hence verify if the target is a regular file during the probe
registration.Link: http://lkml.kernel.org/r/20130103004212.690763002@goodmis.org
Cc: Namhyung Kim
Signed-off-by: Jovi Zhang
Acked-by: Srikar Dronamraju
[ cleaned up whitespace and removed redundant IS_DIR() check ]
Signed-off-by: Steven Rostedt
18 Dec, 2012
1 commit
-
Signed-off-by: Andy Shevchenko
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Dec, 2012
1 commit
-
…g/misc into perf/core
Pull uprobes fixes, cleanups and preparation for the ARM port from Oleg Nesterov.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
01 Nov, 2012
1 commit
-
* remove old string conversions with kstrto*
Link: http://lkml.kernel.org/r/20120926200838.GC1244@0x90.at
Signed-off-by: Daniel Walter
Signed-off-by: Steven Rostedt
25 Oct, 2012
1 commit
-
There don't have any 'r' prefix in uprobe event naming, remove it.
Signed-off-by: Jovi Zhang
Acked-by: Srikar Dronamraju
Signed-off-by: Oleg Nesterov
31 Jul, 2012
1 commit
-
A few events are interesting not only for a current task.
For example, sched_stat_* events are interesting for a task
which wakes up. For this reason, it will be good if such
events will be delivered to a target task too.Now a target task can be set by using __perf_task().
The original idea and a draft patch belongs to Peter Zijlstra.
I need these events for profiling sleep times. sched_switch is used for
getting callchains and sched_stat_* is used for getting time periods.
These events are combined in user space, then it can be analyzed by
perf tools.Inspired-by: Peter Zijlstra
Cc: Steven Rostedt
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Steven Rostedt
Cc: Arun Sharma
Signed-off-by: Andrew Vagin
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/1342016098-213063-1-git-send-email-avagin@openvz.org
Signed-off-by: Ingo Molnar
07 May, 2012
1 commit
-
Implements trace_event support for uprobes. In its current form
it can be used to put probes at a specified offset in a file and
dump the required registers when the code flow reaches the
probed address.The following example shows how to dump the instruction pointer
and %ax a register at the probed text address. Here we are
trying to probe zfree in /bin/zsh:# cd /sys/kernel/debug/tracing/
# cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g DF .text 0000000000000012 Base
zfree # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
# cat uprobe_events
p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420
# echo 1 > events/uprobes/enable
# sleep 20
# echo 0 > events/uprobes/enable
# cat trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79Signed-off-by: Srikar Dronamraju
Acked-by: Steven Rostedt
Acked-by: Masami Hiramatsu
Cc: Linus Torvalds
Cc: Ananth N Mavinakayanahalli
Cc: Jim Keniston
Cc: Linux-mm
Cc: Oleg Nesterov
Cc: Andi Kleen
Cc: Christoph Hellwig
Cc: Arnaldo Carvalho de Melo
Cc: Anton Arapov
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20120411103043.GB29437@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar