13 Nov, 2013

1 commit


12 Sep, 2013

2 commits

  • The current two insn slot caches both use module_alloc/module_free to
    allocate and free insn slot cache pages.

    For s390 this is not sufficient since there is the need to allocate insn
    slots that are either within the vmalloc module area or within dma memory.

    Therefore, add a mechanism that allows a custom allocator to be
    specified for a custom insn slot cache.

    Signed-off-by: Heiko Carstens
    Acked-by: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • The current kprobes insn caches allocate memory areas for insn slots
    with module_alloc(). The assumption is that the kernel image and module
    area are both within the same +/- 2GB memory area.

    This however is not true for s390 where the kernel image resides within
    the first 2GB (DMA memory area), but the module area is far away in the
    vmalloc area, usually somewhere close below the 4TB area.

    For new pc relative instructions s390 needs insn slots that are within
    +/- 2GB of each area. That way we can patch displacements of
    pc-relative instructions within the insn slots just like x86 and
    powerpc.

    The module area works already with the normal insn slot allocator,
    however there is currently no way to get insn slots that are within the
    first 2GB on s390 (aka DMA area).

    Therefore this patch set modifies the kprobes insn slot cache code to
    allow a custom allocator to be specified for the insn slot cache
    pages. In addition, architectures can now have private insn slot
    caches without the need to modify common code.

    Patch 1 unifies and simplifies the current insn and optinsn cache
    implementations. This is a preparation which makes it easy to add
    more insn caches.

    Patch 2 adds the possibility to specify a custom allocator.

    Patch 3 makes s390 use the new insn slot mechanisms and adds support for
    pc-relative instructions with long displacements.

    This patch (of 3):

    The two insn caches (insn, and optinsn) each have their own mutex and
    alloc/free functions (get_[opt]insn_slot() / free_[opt]insn_slot()).

    Since there is the need for yet another insn cache which satisfies dma
    allocations on s390, unify and simplify the current implementation:

    - Move the per insn cache mutex into struct kprobe_insn_cache.
    - Move the alloc/free functions to kprobe.h so they are simply
      wrappers for the generic __get_insn_slot/__free_insn_slot functions.
      The implementation is done with a DEFINE_INSN_CACHE_OPS() macro
      which provides the alloc/free functions for each cache if needed.
    - Move struct kprobe_insn_cache to kprobe.h, which allows architecture
      specific insn slot caches to be defined outside of the core kprobes
      code.

    Signed-off-by: Heiko Carstens
    Cc: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
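    A minimal sketch of how an architecture might plug in its own insn slot
    cache under this mechanism. DEFINE_INSN_CACHE_OPS() and the custom
    alloc/free hooks follow the description above; the DMA allocator
    mirrors the s390 case, but the names alloc_dmainsn_page /
    free_dmainsn_page and the exact field layout are illustrative, not
    verbatim kernel code:

        #include <linux/kprobes.h>
        #include <linux/gfp.h>

        /* generates the get_dmainsn_slot()/free_dmainsn_slot() wrappers */
        DEFINE_INSN_CACHE_OPS(dmainsn);

        static void *alloc_dmainsn_page(void)
        {
                /* insn slots must live in the first 2GB (DMA area) on s390 */
                return (void *)__get_free_page(GFP_KERNEL | GFP_DMA);
        }

        static void free_dmainsn_page(void *page)
        {
                free_page((unsigned long)page);
        }

        struct kprobe_insn_cache kprobe_dmainsn_slots = {
                .mutex     = __MUTEX_INITIALIZER(kprobe_dmainsn_slots.mutex),
                .alloc     = alloc_dmainsn_page,   /* the custom allocator */
                .free      = free_dmainsn_page,
                .pages     = LIST_HEAD_INIT(kprobe_dmainsn_slots.pages),
                .insn_size = MAX_INSN_SIZE,
        };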
     

04 Jul, 2013

1 commit

  • When writing invalid input to 'debug/kprobes/enabled' it'll silently be
    ignored. Even worse, when writing an empty string to this file, the
    outcome is purely random as the switch statement will make its decision
    based on the value of an uninitialized stack variable.

    Fix this by treating invalid/empty input as an error, returning -EINVAL.

    Signed-off-by: Mathias Krause
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"
    Cc: Masami Hiramatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathias Krause
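    A sketch of the intended behaviour (not the exact kernel function):
    copy at most a small, NUL-terminated buffer and reject empty or
    unrecognized input with -EINVAL instead of switching on an
    uninitialized variable:

        #include <linux/fs.h>
        #include <linux/uaccess.h>

        static ssize_t write_enabled_sketch(struct file *file,
                                            const char __user *ubuf,
                                            size_t count, loff_t *ppos)
        {
                char buf[32];
                size_t buf_size;

                buf_size = min(count, sizeof(buf) - 1);
                if (copy_from_user(buf, ubuf, buf_size))
                        return -EFAULT;
                buf[buf_size] = '\0';

                switch (buf[0]) {
                case 'y': case 'Y': case '1':
                        /* arm all kprobes */
                        break;
                case 'n': case 'N': case '0':
                        /* disarm all kprobes */
                        break;
                default:
                        return -EINVAL; /* empty or invalid input is now an error */
                }

                return count;
        }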
     

28 May, 2013

1 commit

  • Fix kprobes to free 'gone' and unused optprobes. This bug can cause a
    kernel panic if the user reuses a killed and unused probe.

    Reported at:

    http://sourceware.org/ml/systemtap/2013-q2/msg00142.html

    In the normal path, an optprobe on an init function is
    unregistered when a module goes live.

    unregister_kprobe(kp)
      -> __unregister_kprobe_top
         -> __disable_kprobe
            -> disarm_kprobe(ap == op)
               -> __disarm_kprobe
                  -> unoptimize_kprobe : the op is queued
                                         on unoptimizing_list
    and do nothing in __unregister_kprobe_bottom

    After a while (usually a 5-jiffy wait), kprobe_optimizer
    runs to unoptimize and free the optprobe.

    kprobe_optimizer
      -> do_unoptimize_kprobes
         -> arch_unoptimize_kprobes : moved to free_list
      -> do_free_cleaned_kprobes
         -> hlist_del : the op is removed
         -> free_aggr_kprobe
            -> arch_remove_optimized_kprobe
            -> arch_remove_kprobe
            -> kfree : the op is freed

    Here, if kprobes_module_callback is called and the delayed
    unoptimizing probe is picked BEFORE kprobe_optimizer runs,

    kprobes_module_callback
      -> kill_kprobe
         -> kill_optimized_kprobe : dequeued from unoptimizing_list
         -> arch_remove_optimized_kprobe
         -> arch_remove_kprobe
            (but op is not freed, and on the kprobe hash table)

    This doesn't happen if the probe unregistration is done AFTER
    kprobes_module_callback is called (because at that time the op
    is gone), and kprobe-tracer does it.

    To fix this bug, this patch changes kprobes_module_callback to
    enqueue the op to freeing_list at kill_optimized_kprobe only
    if the op is unused. The unused probes on freeing_list will
    be freed in do_free_cleaned_kprobes.

    Note that this means arch_remove_*kprobe can be called twice on the
    same probe, so those functions have to tolerate the double free.
    Fortunately, most architecture code already checks for that, except
    for mips. That will be fixed in the next patch.

    Signed-off-by: Masami Hiramatsu
    Cc: Timo Juhani Lindfors
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: Frank Ch. Eigler
    Cc: systemtap@sourceware.org
    Cc: yrl.pp-manager.tt@hitachi.com
    Cc: David S. Miller
    Cc: "David S. Miller"
    Link: http://lkml.kernel.org/r/20130522093409.9084.63554.stgit@mhiramat-M0-7522
    [ Minor edits. ]
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
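    The shape of the fix, reduced to its essentials. This is a simplified
    sketch built from the description above: freeing_list and
    do_free_cleaned_kprobes are named in the changelog, while
    kprobe_unused() and the exact control flow are assumptions; the real
    kill_optimized_kprobe() handles more states:

        static void kill_optimized_kprobe_sketch(struct kprobe *p)
        {
                struct optimized_kprobe *op;

                op = container_of(p, struct optimized_kprobe, kp);

                if (kprobe_unused(p))
                        /*
                         * The probe is queued for delayed unoptimization.
                         * Instead of silently dequeueing it (the leak), move
                         * it to freeing_list so do_free_cleaned_kprobes()
                         * frees it on the next optimizer pass.
                         */
                        list_move(&op->list, &freeing_list);
                else
                        list_del_init(&op->list);
        }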
     

18 Apr, 2013

1 commit

  • Fix a double locking bug caused when debug.kprobe-optimization=0.
    While proc_kprobes_optimization_handler locks kprobe_mutex,
    wait_for_kprobe_optimizer locks it again and that causes a double lock.
    To fix the bug, this patch introduces a separate mutex for protecting
    the sysctl parameter and locks it in proc_kprobes_optimization_handler.
    Of course, since we need to lock kprobe_mutex when touching kprobes
    resources, that is done in *optimize_all_kprobes().

    This bug was introduced by commit ad72b3bea744 ("kprobes: fix
    wait_for_kprobe_optimizer()")

    Signed-off-by: Masami Hiramatsu
    Acked-by: Ananth N Mavinakayanahalli
    Cc: Ingo Molnar
    Cc: Tejun Heo
    Cc: "David S. Miller"
    Signed-off-by: Linus Torvalds

    Masami Hiramatsu
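    The locking split boils down to the following sketch, with a
    simplified handler signature; kprobe_sysctl_mutex is the assumed name
    of the new mutex:

        static DEFINE_MUTEX(kprobe_sysctl_mutex); /* guards the sysctl knob only */

        static int kprobes_optimization_handler_sketch(int write, int enable)
        {
                mutex_lock(&kprobe_sysctl_mutex);   /* no longer kprobe_mutex */
                if (write) {
                        if (enable)
                                optimize_all_kprobes();   /* takes kprobe_mutex inside */
                        else
                                unoptimize_all_kprobes(); /* takes kprobe_mutex inside */
                }
                mutex_unlock(&kprobe_sysctl_mutex);

                return 0;
        }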
     

28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from the list ones, which look like this:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
      were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foudnation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
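    A generic before/after illustration of the interface change (not a
    hunk from this patch; the struct and function are made up for
    illustration):

        #include <linux/list.h>

        struct item {
                int value;
                struct hlist_node node;
        };

        static int sum_items(struct hlist_head *head)
        {
                struct item *pos;
                int sum = 0;

                /*
                 * Before this change an extra cursor had to be declared:
                 *
                 *      struct hlist_node *n;
                 *      hlist_for_each_entry(pos, n, head, node)
                 *              sum += pos->value;
                 */

                /* After: the iterator looks just like list_for_each_entry() */
                hlist_for_each_entry(pos, head, node)
                        sum += pos->value;

                return sum;
        }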
     

20 Feb, 2013

1 commit

  • Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
    "This is part of on-going cleanups to remove / minimize usages of
    workqueue interfaces which are deprecated and/or misleading.

    This round drops a number of usages of [delayed_]work_pending(), which
    are dangerous as they lack any form of synchronization and thus often
    lead to buggy / unnecessary code. There are a couple of legitimate use
    cases in the kernel. Hopefully, they can be converted and
    [delayed_]work_pending() can be removed completely. Even if not,
    removing most of the misuses should make it more difficult to find
    examples of misuse and thus slow their growth.

    These changes are independent from other workqueue changes."

    * 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    wimax/i2400m: fix i2400m->wake_tx_skb handling
    kprobes: fix wait_for_kprobe_optimizer()
    ipw2x00: simplify scan_event handling
    video/exynos: don't use [delayed_]work_pending()
    tty/max3100: don't use [delayed_]work_pending()
    x86/mce: don't use [delayed_]work_pending()
    rfkill: don't use [delayed_]work_pending()
    wl1251: don't use [delayed_]work_pending()
    thinkpad_acpi: don't use [delayed_]work_pending()
    mwifiex: don't use [delayed_]work_pending()
    sja1000: don't use [delayed_]work_pending()

    Linus Torvalds
     

10 Feb, 2013

1 commit

  • wait_for_kprobe_optimizer() seems largely broken. It uses
    optimizer_comp which is never re-initialized, so
    wait_for_kprobe_optimizer() will never wait for anything once
    kprobe_optimizer() finishes all pending jobs for the first time.

    Also, aside from completion, delayed_work_pending() is %false once
    kprobe_optimizer() starts execution and wait_for_kprobe_optimizer()
    won't wait for it.

    Reimplement it so that it flushes optimizing_work until
    [un]optimizing_lists are empty. Note that this also makes
    optimizing_work execute immediately if someone's waiting for it, which
    is the nicer behavior.

    Only compile tested.

    Signed-off-by: Tejun Heo
    Acked-by: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"

    Tejun Heo
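    The reimplementation described above amounts to roughly this loop (a
    sketch derived from the changelog, not necessarily the exact final
    code):

        static void wait_for_kprobe_optimizer(void)
        {
                mutex_lock(&kprobe_mutex);

                while (!list_empty(&optimizing_list) ||
                       !list_empty(&unoptimizing_list)) {
                        mutex_unlock(&kprobe_mutex);

                        /* flushing also makes optimizing_work run immediately */
                        flush_delayed_work(&optimizing_work);
                        /* optimizing_work might not have been queued yet, relax */
                        cpu_relax();

                        mutex_lock(&kprobe_mutex);
                }

                mutex_unlock(&kprobe_mutex);
        }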
     

22 Jan, 2013

1 commit

  • Split the ftrace-based kprobes code out of kprobes, and introduce the
    CONFIG_(HAVE_)KPROBES_ON_FTRACE Kconfig flags.
    As a cleanup, this also moves the kprobe_ftrace check into
    skip_singlestep.

    Link: http://lkml.kernel.org/r/20120928081520.3560.25624.stgit@ltc138.sdl.hitachi.co.jp

    Cc: Ingo Molnar
    Cc: Ananth N Mavinakayanahalli
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Frederic Weisbecker
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     

14 Sep, 2012

1 commit

  • Fix kprobes/x86 to support jprobes on ftrace-based kprobes. Because of
    ftrace's -mfentry support, the ftrace call is now placed at the very
    beginning of a function, which is where jprobes are put.

    Originally, ftrace-based kprobes did not support jprobes, because
    jprobes change regs->ip and ftrace did not support changing the IP;
    ftrace itself did not conflict with jprobes, so this did not matter.
    However, ftrace's -mfentry support moves the mcount call to the top of
    functions, where jprobes are put. This means that a jprobe now always
    conflicts with an ftrace-based kprobe and fails.

    This patch allows ftrace-based kprobes to support jprobes by allowing
    regs->ip to be modified, and the kprobes breakpoint handler now also
    allows single-stepping to be skipped, because the probe point holds a
    ftrace call rather than an original instruction.

    Link: http://lkml.kernel.org/r/20120905143125.10329.90836.stgit@localhost.localdomain

    Reported-by: Fengguang Wu
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     

31 Jul, 2012

5 commits

  • Add function-tracer-based kprobe optimization support handlers on x86.
    This allows kprobes to use the function tracer for probing at the
    mcount call site.

    Link: http://lkml.kernel.org/r/20120605102838.27845.26317.stgit@localhost.localdomain

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Ananth N Mavinakayanahalli
    Cc: "Frank Ch. Eigler"
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Signed-off-by: Masami Hiramatsu

    [ Updated to new port of ftrace save regs functions ]

    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • Introduce function-tracer-based kprobes optimization.

    With ftrace optimization, a kprobe on an mcount calling address uses
    ftrace's mcount call instead of a breakpoint. Furthermore, unlike the
    current jump-based optimization, this optimization also works with a
    preemptive kernel. Of course, this feature works only if the probe is
    on an mcount call (a usage sketch follows after this list).

    The only probes not optimized with ftrace (nor put on ftrace) are
    those with kprobe.break_handler set. The reason for this limitation is
    that break_handler may be used only from jprobes, which change the ip
    address (for fetching the function arguments), while the function
    tracer ignores a modified ip address.

    Changes in v2:
    - Fix ftrace_ops registering right after setting its filter.
    - Unregister ftrace_ops if there is no kprobe using it.
    - Remove notrace dependency from __kprobes macro.

    Link: http://lkml.kernel.org/r/20120605102832.27845.63461.stgit@localhost.localdomain

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Ananth N Mavinakayanahalli
    Cc: "Frank Ch. Eigler"
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • Break up a big critical region into fine-grained pieces in the kprobe
    registration path. This helps us solve a circular locking dependency
    when introducing ftrace-based kprobes.

    Link: http://lkml.kernel.org/r/20120605102826.27845.81689.stgit@localhost.localdomain

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Ananth N Mavinakayanahalli
    Cc: "Frank Ch. Eigler"
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • Separate probe-able address checking code from
    register_kprobe().

    Link: http://lkml.kernel.org/r/20120605102820.27845.90133.stgit@localhost.localdomain

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Ananth N Mavinakayanahalli
    Cc: "Frank Ch. Eigler"
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
     
  • Currently module_mutex is taken before kprobe_mutex, but this
    can cause issues when we have kprobes register with ftrace, as the
    ftrace mutex is taken before enabling a tracepoint, which currently
    takes the module mutex.

    If module_mutex is taken before kprobe_mutex, then we can not
    have kprobes use the ftrace infrastructure.

    There seems to be no reason that the kprobe_mutex can't be taken
    before the module_mutex. Running lockdep shows that it is safe
    among the kernels I've run.

    Link: http://lkml.kernel.org/r/20120605102814.27845.21047.stgit@localhost.localdomain

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Ananth N Mavinakayanahalli
    Cc: "Frank Ch. Eigler"
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Cc: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt
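    The usage sketch referenced above: a kprobe registered at a function
    entry, which with CONFIG_KPROBES_ON_FTRACE ends up backed by the
    function tracer's mcount/fentry call instead of a breakpoint. The
    probed symbol and module boilerplate are illustrative only:

        #include <linux/kernel.h>
        #include <linux/module.h>
        #include <linux/kprobes.h>

        static int sample_pre_handler(struct kprobe *p, struct pt_regs *regs)
        {
                pr_info("probe hit at %pS\n", (void *)instruction_pointer(regs));
                return 0;
        }

        static struct kprobe sample_kp = {
                .symbol_name = "do_fork",       /* function entry == mcount call site */
                .pre_handler = sample_pre_handler,
        };

        static int __init sample_init(void)
        {
                return register_kprobe(&sample_kp);
        }

        static void __exit sample_exit(void)
        {
                unregister_kprobe(&sample_kp);
        }

        module_init(sample_init);
        module_exit(sample_exit);
        MODULE_LICENSE("GPL");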
     

06 Mar, 2012

1 commit

  • register_kprobe() aborts if the address of the new request falls in a
    prohibited area (such as ftrace pouch, __kprobes annotated functions,
    non-kernel text addresses, jump label text). We however don't return the
    right error on this abort, resulting in a silent failure - incorrect
    adding/reporting of kprobes ('perf probe do_fork+18' or 'perf probe
    mcount' for instance).

    In V2 we are incorporating Masami Hiramatsu's feedback.

    This patch fixes it by returning -EINVAL upon failure.

    While we are here, rename the label used for exit to be more appropriate.

    Signed-off-by: Ananth N Mavinakayanahalli
    Signed-off-by: Prashanth K Nageshappa
    Acked-by: Masami Hiramatsu
    Cc: Jason Baron
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prashanth Nageshappa
     

04 Feb, 2012

1 commit

  • In function pre_handler_kretprobe(), the allocated kretprobe_instance
    object will get leaked if the entry_handler callback returns non-zero.
    This may cause all the preallocated kretprobe_instance objects to be
    exhausted.

    This issue can be reproduced by changing
    samples/kprobes/kretprobe_example.c to probe "mutex_unlock". And the fix
    is straightforward: just put the allocated kretprobe_instance object back
    onto the free_instances list.

    [akpm@linux-foundation.org: use raw_spin_lock/unlock]
    Signed-off-by: Jiang Liu
    Acked-by: Jim Keniston
    Acked-by: Ananth N Mavinakayanahalli
    Cc: Masami Hiramatsu
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
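    For context, a kretprobe with an entry_handler looks like the
    following generic sample (in the spirit of
    samples/kprobes/kretprobe_example.c, with illustrative names).
    Whenever the entry_handler returns non-zero, the kretprobe_instance
    handed to it must be recycled rather than leaked, which is what the
    fix restores:

        #include <linux/kernel.h>
        #include <linux/module.h>
        #include <linux/kprobes.h>

        static int sample_entry_handler(struct kretprobe_instance *ri,
                                        struct pt_regs *regs)
        {
                /* non-zero means "don't track this instance" */
                return 1;
        }

        static int sample_ret_handler(struct kretprobe_instance *ri,
                                      struct pt_regs *regs)
        {
                return 0;
        }

        static struct kretprobe sample_rp = {
                .kp.symbol_name = "mutex_unlock",  /* as used to reproduce the leak */
                .entry_handler  = sample_entry_handler,
                .handler        = sample_ret_handler,
                .maxactive      = 16,
        };

        static int __init rp_init(void)
        {
                return register_kretprobe(&sample_rp);
        }

        static void __exit rp_exit(void)
        {
                unregister_kretprobe(&sample_rp);
        }

        module_init(rp_init);
        module_exit(rp_exit);
        MODULE_LICENSE("GPL");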
     

24 Jan, 2012

1 commit

  • Commit ef53d9c5e ("kprobes: improve kretprobe scalability with hashed
    locking") introduced a bug where we can potentially leak
    kretprobe_instances since we initialize a hlist head after having used
    it.

    Initialize the hlist head before using it.

    Reported-by: Jim Keniston
    Acked-by: Jim Keniston
    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Masami Hiramatsu
    Cc: Srinivasa D S
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     

13 Jan, 2012

1 commit

  • Enabling DEBUG_STRICT_USER_COPY_CHECKS causes the following warning:

    In file included from arch/x86/include/asm/uaccess.h:573,
    from kernel/kprobes.c:55:
    In function 'copy_from_user',
    inlined from 'write_enabled_file_bool' at
    kernel/kprobes.c:2191:
    arch/x86/include/asm/uaccess_64.h:65:
    warning: call to 'copy_from_user_overflow' declared with attribute warning: copy_from_user() buffer size is not provably correct

    presumably due to buf_size being signed causing GCC to fail to see that
    buf_size can't become negative.

    Signed-off-by: Stephen Boyd
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: David S. Miller
    Acked-by: Masami Hiramatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     

31 Oct, 2011

1 commit

  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include <linux/module.h>
    +#include <linux/export.h>

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

13 Sep, 2011

1 commit


16 Jul, 2011

1 commit

  • Return -ENOENT if the probe point doesn't exist, but still return
    -EINVAL if both kprobe->addr and kprobe->symbol_name are specified,
    or if neither is specified.

    Acked-by: Ananth N Mavinakayanahalli
    Signed-off-by: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Arnaldo Carvalho de Melo
    Cc: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"
    Link: http://lkml.kernel.org/r/20110627072650.6528.67329.stgit@fedora15
    Signed-off-by: Steven Rostedt

    Masami Hiramatsu
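    A caller-side sketch of the resulting error distinction (the probed
    symbol and surrounding boilerplate are illustrative):

        #include <linux/kernel.h>
        #include <linux/kprobes.h>

        static struct kprobe kp = {
                .symbol_name = "no_such_function",  /* example symbol */
        };

        static int __init probe_init(void)
        {
                int ret = register_kprobe(&kp);

                if (ret == -ENOENT)
                        pr_err("probe point does not exist\n");
                else if (ret == -EINVAL)
                        pr_err("addr and symbol_name both set, or neither set\n");

                return ret;
        }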
     

08 Jan, 2011

1 commit

  • * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
    gameport: use this_cpu_read instead of lookup
    x86: udelay: Use this_cpu_read to avoid address calculation
    x86: Use this_cpu_inc_return for nmi counter
    x86: Replace uses of current_cpu_data with this_cpu ops
    x86: Use this_cpu_ops to optimize code
    vmstat: User per cpu atomics to avoid interrupt disable / enable
    irq_work: Use per cpu atomics instead of regular atomics
    cpuops: Use cmpxchg for xchg to avoid lock semantics
    x86: this_cpu_cmpxchg and this_cpu_xchg operations
    percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
    percpu,x86: relocate this_cpu_add_return() and friends
    connector: Use this_cpu operations
    xen: Use this_cpu_inc_return
    taskstats: Use this_cpu_ops
    random: Use this_cpu_inc_return
    fs: Use this_cpu_inc_return in buffer.c
    highmem: Use this_cpu_xx_return() operations
    vmstat: Use this_cpu_inc_return for vm statistics
    x86: Support for this_cpu_add, sub, dec, inc_return
    percpu: Generic support for this_cpu_add, sub, dec, inc_return
    ...

    Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
    as per Tejun.

    Linus Torvalds
     

17 Dec, 2010

1 commit


07 Dec, 2010

7 commits

  • Use text_poke_smp_batch() on the unoptimization path to reduce the
    number of stop_machine() invocations. If the number of unoptimizing
    probes is more than MAX_OPTIMIZE_PROBES (=256), kprobes unoptimizes
    the first MAX_OPTIMIZE_PROBES probes and kicks the optimizer for the
    remaining probes (a sketch of the batching idea follows after this
    list).

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Use text_poke_smp_batch() in the optimization path to reduce the
    number of stop_machine() invocations. If the number of optimizing
    probes is more than MAX_OPTIMIZE_PROBES (=256), kprobes optimizes the
    first MAX_OPTIMIZE_PROBES probes and kicks the optimizer for the
    remaining probes.

    Changes in v5:
    - Use kick_kprobe_optimizer() instead of directly calling
    schedule_delayed_work().
    - Rescheduling optimizer outside of kprobe mutex lock.

    Changes in v2:
    - Allocate the code buffer and parameters in arch_init_kprobes()
      instead of using static arrays.
    - Merge the previous max-optimization-limit patch into this patch.
      So, this patch introduces an upper limit on the number of probes
      optimized at once.

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Reuse an unused kprobe (waiting for unoptimization and with no user
    handler) on a given address instead of returning -EBUSY when
    registering a new kprobe.

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Unoptimization occurs when a probe is unregistered or disabled,
    and is heavy because it recovers instructions by using
    stop_machine(). This patch delays unoptimization operations and
    unoptimize several probes at once by using
    text_poke_smp_batch(). This can avoid unexpected system slowdown
    coming from stop_machine().

    Changes in v5:
    - Split this work into several cleanup patches and this patch.
    - Fix some missed text_mutex locking.
    - Use bool instead of int for behavior flags.
    - Add additional comments for the (un)optimizing path.

    Changes in v2:
    - Use dynamic allocated buffers and params.

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Separate the kprobe optimizing code from the optimizer; this will make
    it easier to introduce unoptimizing code into the optimizer.

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Merge kprobe disabling into the kprobe unregistering function and add
    comments for the disabling/unregistering process.

    The current unregistering code disables (disarms) kprobes after
    checking the target kprobe's status. This patch changes it to disable
    the kprobe first and change the kprobe's state after that. This allows
    the probe disabling code to be shared between disable_kprobe() and
    unregister_kprobe().

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • Rename irrelevant uses of "old_p" to more appropriate names.
    Originally, "old_p" just meant "the old kprobe on given address"
    but current code uses that name as "just another kprobe" or
    something like that. This patch renames those pointers to more
    appropriate names for maintainability.

    Signed-off-by: Masami Hiramatsu
    Cc: Rusty Russell
    Cc: Frederic Weisbecker
    Cc: Ananth N Mavinakayanahalli
    Cc: Jason Baron
    Cc: Mathieu Desnoyers
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
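    As referenced above, the batching idea in the first two commits of
    this series boils down to the following self-contained sketch; the
    struct and parameter names are placeholders, and the actual collection
    into the text_poke_smp_batch() parameter array is elided:

        #include <linux/list.h>

        #define MAX_OPTIMIZE_PROBES 256   /* per text_poke_smp_batch() call */

        struct opt_probe {                /* stand-in for struct optimized_kprobe */
                struct list_head list;
        };

        static void optimize_batch_sketch(struct list_head *pending,
                                          void (*kick_optimizer)(void))
        {
                struct opt_probe *op, *tmp;
                int count = 0;

                list_for_each_entry_safe(op, tmp, pending, list) {
                        if (count >= MAX_OPTIMIZE_PROBES)
                                break;          /* leave the rest for the next pass */
                        list_del_init(&op->list);
                        /* ...collect op into the batch for text_poke_smp_batch()... */
                        count++;
                }

                /* one stop_machine() window covers the whole batch */

                if (!list_empty(pending))
                        kick_optimizer();       /* reschedule for the remaining probes */
        }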
     

30 Oct, 2010

1 commit

  • Kprobes and jump label had a race between mutexes that was fixed by
    reordering the jump label code. But this reordering moved the jump
    label mutex into a preempt-disabled location.

    This patch does a little fiddling to move the grabbing of the jump
    label mutex out of the preempt-disabled section while still keeping
    the order correct between the mutex and the kprobes lock.

    Reported-by: Ingo Molnar
    Acked-by: Masami Hiramatsu
    Cc: Jason Baron
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

28 Oct, 2010

2 commits

  • register_kprobe() downs the 'text_mutex' and then calls
    jump_label_text_reserved(), which downs the 'jump_label_mutex'.
    However, the jump label code takes those mutexes in the reverse
    order.

    Fix by requiring the caller of jump_label_text_reserved() to do
    the jump label locking via the newly added: jump_label_lock(),
    jump_label_unlock(). Currently, kprobes is the only user
    of jump_label_text_reserved().

    Reported-by: Ingo Molnar
    Acked-by: Masami Hiramatsu
    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
    perf python scripting: Add futex-contention script
    perf python scripting: Fixup cut'n'paste error in sctop script
    perf scripting: Shut up 'perf record' final status
    perf record: Remove newline character from perror() argument
    perf python scripting: Support fedora 11 (audit 1.7.17)
    perf python scripting: Improve the syscalls-by-pid script
    perf python scripting: print the syscall name on sctop
    perf python scripting: Improve the syscalls-counts script
    perf python scripting: Improve the failed-syscalls-by-pid script
    kprobes: Remove redundant text_mutex lock in optimize
    x86/oprofile: Fix uninitialized variable use in debug printk
    tracing: Fix 'faild' -> 'failed' typo
    perf probe: Fix format specified for Dwarf_Off parameter
    perf trace: Fix detection of script extension
    perf trace: Use $PERF_EXEC_PATH in canned report scripts
    perf tools: Document event modifiers
    perf tools: Remove direct slang.h include
    perf_events: Fix for transaction recovery in group_sched_in()
    perf_events: Revert: Fix transaction recovery in group_sched_in()
    perf, x86: Use NUMA aware allocations for PEBS/BTS/DS allocations
    ...

    Linus Torvalds
     

25 Oct, 2010

1 commit

  • Remove text_mutex locking in optimize_all_kprobes, because
    this function doesn't modify text. It simply queues probes on
    optimization list for kprobe_optimizer worker thread.

    Signed-off-by: Masami Hiramatsu
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: David S. Miller
    Cc: Namhyung Kim
    Cc: Jason Baron
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     

23 Oct, 2010

1 commit

  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
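    A generic illustration of the kind of hunk the semantic patch produces
    (not taken from this commit): a file that uses seq_read gains an
    explicit .llseek = seq_lseek operation.

        #include <linux/fs.h>
        #include <linux/seq_file.h>

        static int example_show(struct seq_file *m, void *v)
        {
                seq_puts(m, "hello\n");
                return 0;
        }

        static int example_open(struct inode *inode, struct file *file)
        {
                return single_open(file, example_show, NULL);
        }

        static const struct file_operations example_fops = {
                .owner   = THIS_MODULE,
                .open    = example_open,
                .read    = seq_read,
                .release = single_release,
                .llseek  = seq_lseek,   /* added: we have seq_read */
        };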
     

23 Sep, 2010

2 commits

  • Add a jump_label_text_reserved(void *start, void *end), so that other
    pieces of code that want to modify kernel text can first verify that
    jump label has not reserved the instruction.

    Acked-by: Masami Hiramatsu
    Signed-off-by: Jason Baron
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Jason Baron
     
  • Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
    assembly gcc mechanism, we can now branch to labels from an 'asm goto'
    statement. This allows us to create a 'no-op' fastpath, which can
    subsequently be patched with a jump to the slowpath code. This is useful
    for code which might be rarely used, but which we'd like to be able to
    call if needed. Tracepoints are the current use case these are being
    implemented for.

    Acked-by: David S. Miller
    Signed-off-by: Jason Baron
    LKML-Reference:

    [ cleaned up some formatting ]

    Signed-off-by: Steven Rostedt

    Jason Baron
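    To make the mechanism concrete, here is a minimal user-space
    illustration of the 'asm goto' pattern (GCC-specific). No runtime
    patching happens in this sketch, so the slow path is never taken; in
    the kernel, the nop is patched into a jump to the slow-path label when
    the tracepoint is enabled:

        #include <stdio.h>

        static inline int rarely_needed(void)
        {
                asm goto("nop"          /* no-op fast path; patch target */
                         :              /* no outputs allowed with asm goto */
                         :              /* no inputs */
                         :              /* no clobbers */
                         : slow_path);
                return 0;               /* fast path: branch not taken */
        slow_path:
                return 1;               /* only reached once the nop is patched */
        }

        int main(void)
        {
                if (rarely_needed())
                        printf("slow path (tracepoint enabled)\n");
                else
                        printf("fast path (no-op)\n");
                return 0;
        }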