31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not write to the free software foundation inc
    59 temple place suite 330 boston ma 02111 1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1334 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

30 Apr, 2019

1 commit

  • In order to have a separate address space for text poking, we need to
    duplicate init_mm early during start_kernel(). This, however, introduces
    a problem since uprobes functions are called from dup_mmap(), but
    uprobes is still not initialized in this early stage.

    Since uprobes initialization is necessary for fork, and since all the
    dependent initialization has been done when fork is initialized (percpu
    and vmalloc), move uprobes initialization to fork_init(). Uprobes does
    not seem to introduce any security problem for the poking_mm.

    Crash and burn if uprobes initialization fails, similarly to other
    early initializations. Rename init_uprobes() to uprobes_init() to
    match the naming convention of other early initialization functions.

    Reported-by: kernel test robot
    Signed-off-by: Nadav Amit
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Rick Edgecombe
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: akpm@linux-foundation.org
    Cc: ard.biesheuvel@linaro.org
    Cc: deneen.t.dock@intel.com
    Cc: kernel-hardening@lists.openwall.com
    Cc: kristen@linux.intel.com
    Cc: linux_dti@icloud.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190426232303.28381-6-nadav.amit@gmail.com
    Signed-off-by: Ingo Molnar

    Nadav Amit
     

24 Sep, 2018

1 commit

  • Userspace Statically Defined Tracepoints[1] are dtrace-style markers
    inside userspace applications. Applications like PostgreSQL, MySQL,
    Pthread, Perl, Python, Java, Ruby, Node.js, libvirt, QEMU, glib, etc.
    have these markers embedded in them. The markers are added by developers
    at important places in the code. Each marker source expands to a single
    nop instruction in the compiled code, but there may be additional
    overhead for computing the marker arguments, which expands to a couple
    of instructions. When that overhead matters, execution of the marker
    can be skipped via a runtime if() condition when no one is tracing it:

    if (reference_counter > 0) {
            Execute marker instructions;
    }

    The default value of the reference counter is 0. A tracer has to
    increment the reference counter before tracing a marker and decrement
    it when done with the tracing.

    Implement the reference counter logic in the core uprobe code. Users
    will be able to use it from trace_uprobe as well as from kernel
    modules. The new trace_uprobe definition with a reference counter is:

    <path>:<offset>[(ref_ctr_offset)]

    where ref_ctr_offset is an optional field. For kernel modules, a new
    variant of uprobe_register() has been introduced:

    uprobe_register_refctr(inode, offset, ref_ctr_offset, consumer)

    There is no new variant of uprobe_unregister(), because one uprobe is
    assumed to have only one reference counter.

    [1] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

    Note: the 'reference counter' is called a 'semaphore' in the original
    Dtrace (as well as Systemtap, bcc and even ELF) documentation and code.
    But the term 'semaphore' is misleading in this context: it is just a
    counter holding the number of tracers tracing a marker, and it is not
    used for any synchronization. So we call it a 'reference counter' in
    kernel / perf code.

    Link: http://lkml.kernel.org/r/20180820044250.11659-2-ravi.bangoria@linux.ibm.com

    Reviewed-by: Masami Hiramatsu
    [Only trace_uprobe.c]
    Reviewed-by: Oleg Nesterov
    Reviewed-by: Song Liu
    Tested-by: Song Liu
    Signed-off-by: Ravi Bangoria
    Signed-off-by: Steven Rostedt (VMware)

    Ravi Bangoria
     

14 Aug, 2018

1 commit


12 Dec, 2016

1 commit


23 Nov, 2015

1 commit

  • There were still a number of references to my old Red Hat email
    address in the kernel source. Remove these while keeping the
    Red Hat copyright notices intact.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

31 Jul, 2015

4 commits

  • The previous change documents that cleanup_return_instances()
    can't always detect dead frames, since the stack can grow. But
    there is one special case which imho is worth fixing:
    arch_uretprobe_is_alive() can return true when the stack didn't
    actually grow, but the next "call" insn uses the already
    invalidated frame.

    Test-case:

    #include <stdio.h>
    #include <setjmp.h>

    jmp_buf jmp;
    int nr = 1024;

    void func_2(void)
    {
            if (--nr == 0)
                    return;
            longjmp(jmp, 1);
    }

    void func_1(void)
    {
            setjmp(jmp);
            func_2();
    }

    int main(void)
    {
            func_1();
            return 0;
    }

    If you ret-probe func_1() and func_2(), prepare_uretprobe() hits
    the MAX_URETPROBE_DEPTH limit and the "return" from func_2() is
    not reported.

    When we know that the new call is not chained, we can do a more
    strict check. In this case "sp" points to the new ret-addr, so
    every frame which uses the same "sp" must be dead. The only
    complication is that arch_uretprobe_is_alive() needs to know
    whether the call was chained or not, so we add the new
    RP_CHECK_CHAIN_CALL enum value and change prepare_uretprobe() to
    pass RP_CHECK_CALL only if !chained.

    Note: arch_uretprobe_is_alive() could also re-read *sp and check
    if this word is still trampoline_vaddr. This could obviously
    improve the logic, but I would like to avoid another
    copy_from_user() especially in the case when we can't avoid the
    false "alive == T" positives.

    Tested-by: Pratyush Anand
    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Anton Arapov
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150721134028.GA4786@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • arch/x86 doesn't care (so far), but as Pratyush Anand pointed
    out, other architectures might want to know why
    arch_uretprobe_is_alive() was called and use different checks
    depending on the context. Add a new argument to distinguish the
    2 callers.

    Tested-by: Pratyush Anand
    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Anton Arapov
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150721134026.GA4779@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • Add the x86 specific version of arch_uretprobe_is_alive()
    helper. It returns true if the stack frame mangled by
    prepare_uretprobe() is still on stack. So if it returns false,
    we know that the probed function has already returned.

    We add the new return_instance->stack member and change the
    generic code to initialize it in prepare_uretprobe, but it
    should be equally useful for other architectures.

    TODO: this assumes that the probed application can't use
    multiple stacks (say sigaltstack). We will try to improve
    this logic later.

    Tested-by: Pratyush Anand
    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Anton Arapov
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150721134018.GA4766@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • Add the new "weak" helper, arch_uretprobe_is_alive(), used by
    the next patches. It should return true if this return_instance
    is still valid. The arch agnostic version just always returns
    true.

    The patch exports "struct return_instance" for the architectures
    which want to override this hook. We can also cleanup
    prepare_uretprobe() if we pass the new return_instance to
    arch_uretprobe_hijack_return_addr().

    Tested-by: Pratyush Anand
    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Anton Arapov
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150721134016.GA4762@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

23 Oct, 2014

1 commit

  • For the following interfaces:

    set_swbp()
    set_orig_insn()
    is_swbp_insn()
    is_trap_insn()
    uprobe_get_swbp_addr()
    arch_uprobe_ignore()
    arch_uprobe_copy_ixol()

    kernel/events/uprobes.c provides default definitions explicitly marked
    "weak". Some architectures provide their own definitions intended to
    override the defaults, but the "weak" attribute on the declarations applied
    to the arch definitions as well, so the linker chose one based on link
    order (see 10629d711ed7 ("PCI: Remove __weak annotation from
    pcibios_get_phb_of_node decl")).

    Remove the "weak" attribute from the declarations so we always prefer a
    non-weak definition over the weak one, independent of link order.

    Signed-off-by: Bjorn Helgaas
    Acked-by: Ingo Molnar
    Acked-by: Srikar Dronamraju
    CC: Victor Kamensky
    CC: Oleg Nesterov
    CC: David A. Long
    CC: Ananth N Mavinakayanahalli

    Bjorn Helgaas
     

13 Jun, 2014

1 commit

  • Pull more perf updates from Ingo Molnar:
    "A second round of perf updates:

    - wide reaching kprobes sanitization and robustization, with the hope
    of fixing all 'probe this function crashes the kernel' bugs, by
    Masami Hiramatsu.

    - uprobes updates from Oleg Nesterov: tmpfs support, corner case
    fixes and robustization work.

    - perf tooling updates and fixes from Jiri Olsa, Namhyung Kim,
    Arnaldo et al:
    * Add support to accumulate hist periods (Namhyung Kim)
    * various fixes, refactorings and enhancements"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
    perf: Differentiate exec() and non-exec() comm events
    perf: Fix perf_event_comm() vs. exec() assumption
    uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
    perf/documentation: Add description for conditional branch filter
    perf/x86: Add conditional branch filtering support
    perf/tool: Add conditional branch filter 'cond' to perf record
    perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
    uprobes: Teach copy_insn() to support tmpfs
    uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
    perf/x86: Use common PMU interrupt disabled code
    perf/ARM: Use common PMU interrupt disabled code
    perf: Disable sampled events if no PMU interrupt
    perf: Fix use after free in perf_remove_from_context()
    perf tools: Fix 'make help' message error
    perf record: Fix poll return value propagation
    perf tools: Move elide bool into perf_hpp_fmt struct
    perf tools: Remove elide setup for SORT_MODE__MEMORY mode
    perf tools: Fix "==" into "=" in ui_browser__warning assignment
    perf tools: Allow overriding sysfs and proc finding with env var
    perf tools: Consider header files outside perf directory in tags target
    ...

    Linus Torvalds
     

26 May, 2014

1 commit

  • After an instruction write into the xol area, on the ARMv7
    architecture the code needs to flush the dcache and icache to
    sync them up for the given set of addresses. Having just a
    'flush_dcache_page(page)' call is not enough - it is possible
    to have a stale instruction sitting in the icache for the given
    xol area slot address.

    Introduce an arch_uprobe_copy_ixol() weak function that by
    default calls the uprobes copy_to_page() function and then
    flush_dcache_page(), and on ARM define a new one that handles
    the xol slot copy in an ARM-specific way.

    The flush_uprobe_xol_access() function shares/reuses the
    implementation of flush_ptrace_access() and takes care of
    writing the instruction to the user-land address space on the
    given variety of different cache types on ARM CPUs. Because
    flush_uprobe_xol_access() does not have a vma around,
    flush_ptrace_access() was split into two parts: a first one
    that retrieves the set of conditions from the vma, and a common
    one that receives those conditions as flags.

    Note that the ARM cache flush functions need the kernel address
    through which the instruction write happened, so instead of
    using the uprobes copy_to_page() function the code was changed
    to explicitly map the page and do the memcpy.

    Note that the arch_uprobe_copy_ixol() function, in a similar way
    to copy_to_user_page(), uses preempt_disable/preempt_enable.

    Signed-off-by: Victor Kamensky
    Acked-by: Oleg Nesterov
    Reviewed-by: David A. Long
    Signed-off-by: Russell King

    Victor Kamensky
     

14 May, 2014

1 commit

  • If the probed insn triggers a trap, ->si_addr = regs->ip is technically
    correct, but this is not what the signal handler wants; we need to pass
    the address of the probed insn, not the address of xol slot.

    Add the new arch-agnostic helper, uprobe_get_trap_addr(), and change
    fill_trap_info() and math_error() to use it. !CONFIG_UPROBES case in
    uprobes.h uses a macro to avoid include hell and ensure that it can be
    compiled even if an architecture doesn't define instruction_pointer().

    Test-case:

    #include <stdio.h>
    #include <signal.h>
    #include <unistd.h>

    extern void probe_div(void);

    void sigh(int sig, siginfo_t *info, void *c)
    {
            int passed = (info->si_addr == probe_div);
            printf(passed ? "PASS\n" : "FAIL\n");
            _exit(!passed);
    }

    int main(void)
    {
            struct sigaction sa = {
                    .sa_sigaction = sigh,
                    .sa_flags = SA_SIGINFO,
            };

            sigaction(SIGFPE, &sa, NULL);

            asm (
                    "xor %ecx,%ecx\n"
                    ".globl probe_div; probe_div:\n"
                    "idiv %ecx\n"
            );

            return 0;
    }

    it fails if probe_div() is probed.

    Note: show_unhandled_signals users should probably use this helper too,
    but we need to cleanup them first.

    Signed-off-by: Oleg Nesterov
    Reviewed-by: Masami Hiramatsu

    Oleg Nesterov
     

19 Mar, 2014

1 commit


20 Nov, 2013

2 commits

  • 1. Don't include asm/uprobes.h unconditionally, we only need
    it if CONFIG_UPROBES.

    2. Move the definition of "struct xol_area" into uprobes.c.

    Perhaps we should simply kill struct uprobes_state, it buys
    nothing.

    3. Kill the dummy definition of uprobe_get_swbp_addr(), nobody
    except handle_swbp() needs it.

    4. Purely cosmetic, but move the decl of uprobe_get_swbp_addr()
    up, close to other __weak helpers.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • uprobe_task->vaddr is a bit strange. The generic code uses it only
    to pass the additional argument to arch_uprobe_pre_xol(), and since
    it is always equal to instruction_pointer() this looks even more
    strange.

    And both utask->vaddr and utask->autask have the same scope:
    they only have meaning while the task executes the probed insn
    out-of-line, so it is safe to reuse both in UTASK_RUNNING state.

    This all means that logically ->vaddr belongs to arch_uprobe_task
    and we should probably move it there, arch_uprobe_pre_xol() can
    record instruction_pointer() itself.

    OTOH, it is also used by uprobe_copy_process() and dup_xol_work()
    for another purpose; this doesn't look clean and doesn't allow
    moving this member into arch_uprobe_task.

    This patch adds the union with 2 anonymous structs into uprobe_task.

    The first struct is autask + vaddr, this way we "almost" move vaddr
    into autask.

    The second struct has 2 new members for the uprobe_copy_process()
    paths: ->dup_xol_addr, which can be used instead of ->vaddr, and
    ->dup_xol_work, which can be used to avoid kmalloc() and simplify
    the code.

    Note that this union will likely have another member(s), we need
    something like "private_data_for_handlers" so that the tracing
    handlers could use it to communicate with call_fetch() methods.

    Signed-off-by: Oleg Nesterov
    Reviewed-by: Masami Hiramatsu
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     

07 Nov, 2013

2 commits

  • set_swbp() and set_orig_insn() are __weak, but this is pointless
    because write_opcode() is static.

    Export write_opcode() as uprobe_write_opcode() for the upcoming
    arm port; this way it can actually override set_swbp() and use
    __opcode_to_mem_arm(bpinsn) instead of UPROBE_SWBP_INSN.

    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     
  • Move the function declarations from the arch headers to the common
    header, since only the function bodies are architecture-specific.
    These changes are from Vincent Rabin's uprobes patch.

    [ oleg: update arch/powerpc/include/asm/uprobes.h ]

    Signed-off-by: Rabin Vincent
    Signed-off-by: David A. Long
    Signed-off-by: Oleg Nesterov

    David A. Long
     

30 Oct, 2013

2 commits

  • uprobe_copy_process() does nothing if the child shares ->mm with
    the forking process, but there is a special case: CLONE_VFORK.
    In this case it would be more correct to do dup_utask() but avoid
    dup_xol(). This is not that important, the child should not unwind
    its stack too much, this can corrupt the parent's stack, but at
    least we need this to allow to ret-probe __vfork() itself.

    Note: in theory, it would be better to check task_pt_regs(p)->sp
    instead of CLONE_VFORK; we need to dup_utask() if and only if the
    child can return from the function called by the parent. But this
    needs an arch-dependent helper, and I think that nobody actually
    does clone(same_stack, CLONE_VM).

    Reported-by: Martin Cermak
    Reported-by: David Smith
    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     
  • linux/uprobes.h declares arch_uprobe_skip_sstep() as a weak function,
    but there is no definition of a generic version, so when trying to
    build uprobes for an architecture that doesn't yet have an
    arch_uprobe_skip_sstep() implementation, the vmlinux will try to call
    arch_uprobe_skip_sstep() somewhere in Stupidhistan, leading to a
    system crash. We rather want a proper link error, so remove
    arch_uprobe_skip_sstep().

    Signed-off-by: Ralf Baechle
    Signed-off-by: Oleg Nesterov

    Ralf Baechle
     

13 Apr, 2013

3 commits

  • Unlike with kretprobes, we can't trust userspace and thus must have
    protection from user-space attacks. User space has an "unlimited"
    stack, and this patch limits the return-probe nesting depth as a
    simple remedy for that.

    Note that this implementation leaks return_instance on siglongjmp
    until exit()/exec().

    The intention is to have a KISS, bare-minimum solution for the
    initial implementation, in order not to complicate the uretprobes
    code.

    In the future we may come up with a more sophisticated solution
    that removes this depth limitation. That is not an easy task and
    lies beyond this patchset.

    Signed-off-by: Anton Arapov
    Acked-by: Srikar Dronamraju
    Signed-off-by: Oleg Nesterov

    Anton Arapov
     
  • When a uprobe with a return-probe consumer is hit, the
    prepare_uretprobe() function is invoked. It creates a
    return_instance, hijacks the return address and replaces it with
    the trampoline.

    * Return instances are kept as a stack, per uprobed task.
    * A return instance is chained when the original return address is
    the trampoline page's vaddr (e.g. on a recursive call of the
    probed function).

    Signed-off-by: Anton Arapov
    Acked-by: Srikar Dronamraju
    Signed-off-by: Oleg Nesterov

    Anton Arapov
     
  • Enclose return probes implementation, introduce ->ret_handler() and update
    existing code to rely on ->handler() *and* ->ret_handler() for uprobe and
    uretprobe respectively.

    Signed-off-by: Anton Arapov
    Acked-by: Srikar Dronamraju
    Signed-off-by: Oleg Nesterov

    Anton Arapov
     

04 Apr, 2013

1 commit

  • Some architectures like powerpc have multiple variants of the trap
    instruction. Introduce an additional helper is_trap_insn() for run-time
    handling of non-uprobe traps on such architectures.

    While there, change is_swbp_at_addr() to is_trap_at_addr() for reading
    clarity.

    With this change, the uprobe registration path will supersede any
    trap instruction inserted at the requested location, while taking
    care of delivering the SIGTRAP for cases where the trap notification
    came in for an address without a uprobe. See [1] for a more detailed
    explanation.

    [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-March/104771.html

    This change was suggested by Oleg Nesterov.

    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Srikar Dronamraju
    Signed-off-by: Oleg Nesterov

    Ananth N Mavinakayanahalli
     

09 Feb, 2013

4 commits

  • Currently it is not possible to change the filtering constraints after
    uprobe_register(), so a consumer can not, say, start to trace a task/mm
    which was previously filtered out, or remove the no longer needed bp's.

    Introduce uprobe_apply() which simply does register_for_each_vma() again
    to consult uprobe_consumer->filter() and install/remove the breakpoints.
    The only complication is that register_for_each_vma() can no longer
    assume that uprobe->consumers should be consulted if is_register == T,
    so we change it to accept "struct uprobe_consumer *new" instead.

    Unlike uprobe_register(), uprobe_apply(true) doesn't do "unregister" if
    register_for_each_vma() fails, it is up to caller to handle the error.

    Note: we probably need to cleanup the current interface, it is strange
    that uprobe_apply/unregister need inode/offset. We should either change
    uprobe_register() to return "struct uprobe *", or add a private ->uprobe
    member in uprobe_consumer. And in the long term uprobe_apply() should
    take a single argument, uprobe or consumer, even "bool add" should go
    away.

    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     
  • Currently there are 2 problems with pre-filtering:

    1. It is not possible to add/remove a task (mm) after uprobe_register()

    2. A forked child inherits all breakpoints and uprobe_consumer can not
    control this.

    This patch does the first step to improve the filtering. handler_chain()
    removes the breakpoints installed by this uprobe from current->mm if all
    handlers return UPROBE_HANDLER_REMOVE.

    Note that handler_chain() relies on ->register_rwsem to avoid the race
    with uprobe_register/unregister which can add/del a consumer, or even
    remove and then insert the new uprobe at the same address.

    Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach
    copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes
    sense anyway.

    Note: instead of checking the return code from uc->handler, we could
    add uc->filter(UPROBE_FILTER_BPHIT). But it is not optimal to call
    2 hooks in a row: that buys nothing, and if handler/filter do
    something nontrivial they will probably do the same work twice.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • Finally add uprobe_consumer->filter() and change consumer_filter()
    to actually call this method.

    Note that ->filter() accepts mm_struct, not task_struct. Because:

    1. We do not have for_each_mm_user(mm, task).

    2. Even if we implement for_each_mm_user(), ->filter() can
    use it itself.

    3. It is not clear who will actually need this interface to
    do the "nontrivial" filtering.

    Another argument is "enum uprobe_filter_ctx", consumer->filter() can
    use it to figure out why/where it was called. For example, perhaps
    we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to
    quickly "nack" the unwanted mm's. In this case consumer should know
    that it is called under ->i_mmap_mutex.

    See the previous discussion at http://marc.info/?t=135214229700002
    Perhaps we should pass more arguments, vma/vaddr?

    Note: this patch obviously can't help to filter out the child created
    by fork(), this will be addressed later.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • uprobe_consumer->filter() is pointless in its current form, kill it.

    We will add it back, but with the different signature/semantics. Perhaps
    we will even re-introduce the callsite in handler_chain(), but not to
    just skip uc->handler().

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     

16 Nov, 2012

1 commit

  • This was always racy, but commit 268720903f87 ("uprobes: Rework
    register_for_each_vma() to make it O(n)") should be blamed anyway;
    it made everything worse and I didn't notice.

    register/unregister call build_map_info() and then do install/remove
    breakpoint for every mm which mmaps inode/offset. This can obviously
    race with fork()->dup_mmap() in between and we can miss the child.

    uprobe_register() could be easily fixed, but unregister is much
    worse: the new mm inherits "int3" from the parent and there is no
    way to detect this if the uprobe goes away.

    So this patch simply adds percpu_down_read/up_read around dup_mmap(),
    and percpu_down_write/up_write into register_for_each_vma().

    This adds 2 new hooks into dup_mmap() but we can kill uprobe_dup_mmap()
    and fold it into uprobe_end_dup_mmap().

    Reported-by: Srikar Dronamraju
    Acked-by: Srikar Dronamraju
    Signed-off-by: Oleg Nesterov

    Oleg Nesterov
     

04 Nov, 2012

1 commit


08 Oct, 2012

1 commit

  • Preparation. Extract the copy_insn/arch_uprobe_analyze_insn code
    from install_breakpoint() into the new helper, prepare_uprobe().

    And move uprobe->flags defines from uprobes.h to uprobes.c, nobody
    else can use them anyway.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     

30 Sep, 2012

1 commit

  • Kill UTASK_BP_HIT state, it buys nothing but complicates the code.
    It is only used in uprobe_notify_resume() to decide who should be
    called, we can check utask->active_uprobe != NULL instead. And this
    allows us to simplify handle_swbp(), no need to clear utask->state.

    Likewise we could kill UTASK_SSTEP, but UTASK_BP_HIT is worse and
    imho should die. The problem is, it creates the special case when
    task->utask is NULL, we can't distinguish RUNNING and BP_HIT. With
    this patch utask == NULL always means RUNNING.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     

15 Sep, 2012

1 commit

  • As Oleg pointed out in [0] uprobe should not use the ptrace interface
    for enabling/disabling single stepping.

    [0] http://lkml.kernel.org/r/20120730141638.GA5306@redhat.com

    Add the new "__weak arch" helpers which simply call user_*_single_step()
    as a preparation. This is only needed to not break the powerpc port, we
    will fold this logic into arch_uprobe_pre/post_xol() hooks later.

    We should also change handle_singlestep():
    arch_uprobe_disable_step(&uprobe->arch) should be called before
    put_uprobe().

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Sebastian Andrzej Siewior
     

29 Aug, 2012

4 commits

  • Nobody does set_orig_insn(verify => false), and I think nobody will.
    Remove this argument. IIUC set_orig_insn(verify => false) was needed
    to single-step without xol area.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • Now that we have uprobe_dup_mmap() we can fold uprobe_reset_state()
    into the new hook and remove it. mmput()->uprobe_clear_state() can't
    be called before dup_mmap().

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • Add the new MMF_HAS_UPROBES flag. It is set by install_breakpoint()
    and it is copied by dup_mmap(), uprobe_pre_sstep_notifier() checks
    it to avoid the slow path if the task was never probed. Perhaps it
    makes sense to check it in valid_vma(is_register => false) as well.

    This needs the new dup_mmap()->uprobe_dup_mmap() hook. We can't use
    uprobe_reset_state() or put MMF_HAS_UPROBES into MMF_INIT_MASK, we
    need oldmm->mmap_sem to avoid the race with uprobe_register() or
    mmap() from another thread.

    Currently we never clear this bit, it can be false-positive after
    uprobe_unregister() or uprobe_munmap() or if dup_mmap() hits the
    probed VM_DONTCOPY vma. But this is fine correctness-wise and has
    no effect unless the task hits the non-uprobe breakpoint.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     
  • uprobes_state->count is only needed to avoid the slow path in
    uprobe_pre_sstep_notifier(). It is also checked in uprobe_munmap(),
    but ironically its only goal there is to decrement this counter.
    However, it is very broken. Just some examples:

    - uprobe_mmap() can race with uprobe_unregister() and wrongly
    increment the counter if it hits the non-uprobe "int3". Note
    that install_breakpoint() checks ->consumers first and returns
    -EEXIST if it is NULL.

    "atomic_sub() if error" in uprobe_mmap() looks obviously wrong
    too.

    - uprobe_munmap() can race with uprobe_register() and wrongly
    decrement the counter by the same reason.

    - Suppose an application tries to increase the mmapped area via
    sys_mremap(). vma_adjust() does uprobe_munmap(whole_vma) first;
    this can nullify the counter temporarily and race with another
    thread which can hit the bp, and the application will be killed
    by SIGTRAP.

    - Suppose an application mmaps 2 consecutive areas in the same file
    and one (or both) of these areas has uprobes. In the likely case
    mmap_region()->vma_merge() succeeds. Like above, this leads to
    uprobe_munmap/uprobe_mmap from vma_merge()->vma_adjust(), but then
    mmap_region() does another uprobe_mmap(resulting_vma) and doubles
    the counter.

    This patch only removes this counter and fixes the compile errors,
    then we will try to cleanup the changed code and add something else
    instead.

    Signed-off-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju

    Oleg Nesterov
     

14 Apr, 2012

1 commit

  • Uprobes has a callback (uprobe_munmap()) in the unmap path to
    maintain the uprobes count.

    In the exit path this callback gets called in unlink_file_vma().
    However by the time unlink_file_vma() is called, the pages would
    have been unmapped (in unmap_vmas()) and the task->rss_stat counts
    accounted (in zap_pte_range()).

    If the exiting process has probepoints, uprobe_munmap() checks if
    the breakpoint instruction was around before decrementing the probe
    count.

    This results in a file backed page being reread by uprobe_munmap()
    and hence it does not find the breakpoint.

    This patch fixes this problem by moving the callback to
    unmap_single_vma(). Since unmap_single_vma() may not unmap the
    complete vma, add start and end parameters to uprobe_munmap().

    This bug became apparent courtesy of commit c3f0327f8e9d
    ("mm: add rss counters consistency check").

    Signed-off-by: Srikar Dronamraju
    Cc: Linus Torvalds
    Cc: Ananth N Mavinakayanahalli
    Cc: Jim Keniston
    Cc: Linux-mm
    Cc: Oleg Nesterov
    Cc: Andi Kleen
    Cc: Christoph Hellwig
    Cc: Steven Rostedt
    Cc: Arnaldo Carvalho de Melo
    Cc: Masami Hiramatsu
    Cc: Anton Arapov
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120411103527.23245.9835.sendpatchset@srdronam.in.ibm.com
    Signed-off-by: Ingo Molnar

    Srikar Dronamraju
     

31 Mar, 2012

1 commit

  • Maintain a per-mm counter: number of uprobes that are inserted
    on this process address space.

    This counter can be used at probe hit time to determine if we
    need a lookup in the uprobes rbtree. Every time a probe gets
    inserted successfully, the probe count is incremented, and
    every time a probe gets removed, the probe count is decremented.

    The new uprobe_munmap hook ensures the count is correct on an
    unmap or remap of a region. We expect that once
    uprobe_munmap() is called, the vma goes away. So
    uprobe_unregister() finding a probe to unregister would either
    mean the unmap event hasn't occurred yet, or a mmap event on the
    same executable file occurred after an unmap event.

    Additionally, the uprobe_mmap hook now also gets called:

    a. on every executable vma that is COWed at fork.
    b. when a vma of interest is newly mapped; breakpoint insertion
    also happens at the required address.

    On process creation, make sure the probes count in the child is
    set correctly.

    Special cases that are taken care include:

    a. mremap
    b. VM_DONTCOPY vmas on fork()
    c. insertion/removal races in the parent during fork().

    Signed-off-by: Srikar Dronamraju
    Cc: Linus Torvalds
    Cc: Ananth N Mavinakayanahalli
    Cc: Jim Keniston
    Cc: Linux-mm
    Cc: Oleg Nesterov
    Cc: Andi Kleen
    Cc: Christoph Hellwig
    Cc: Steven Rostedt
    Cc: Arnaldo Carvalho de Melo
    Cc: Masami Hiramatsu
    Cc: Anton Arapov
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120330182646.10018.85805.sendpatchset@srdronam.in.ibm.com
    Signed-off-by: Ingo Molnar

    Srikar Dronamraju