05 Sep, 2005

40 commits

  • "extern inline" doesn't make sense.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Chris Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • We were leaking pmd pages when 3_LEVEL_PGTABLES was enabled. This fixes that.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Clean up and fix the check for advanced sysemu (the PTRACE_SYSEMU_SINGLESTEP
    option).

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • Add new cmdline setups:
    - noprocmm
    - noptracefaultinfo
    For testing, they can be used to switch off the use of
    /proc/mm and PTRACE_FAULTINFO independently.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
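
The two test switches described above could be handled roughly as follows. This is a minimal sketch, not the code from the patch: `struct host_features` and `apply_test_switches()` are hypothetical names, and only the switch names `noprocmm` and `noptracefaultinfo` come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the host-feature variables; the real names
 * and types in UML may differ. */
struct host_features {
    bool proc_mm;           /* host offers /proc/mm */
    bool ptrace_faultinfo;  /* host offers PTRACE_FAULTINFO */
};

/* Clear a feature flag for every matching switch on the command line.
 * Unknown arguments are ignored. Returns the number of switches seen. */
static int apply_test_switches(int argc, const char *const argv[],
                               struct host_features *hf)
{
    int seen = 0;

    assert(hf != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "noprocmm") == 0) {
            hf->proc_mm = false;
            seen++;
        } else if (strcmp(argv[i], "noptracefaultinfo") == 0) {
            hf->ptrace_faultinfo = false;
            seen++;
        }
    }
    return seen;
}
```

Because each switch only resets its own flag, the two features can be disabled independently or together.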
     
  • Change syscall-stub's data to include an "expected retval".

    The stub now checks each syscall's retval and aborts execution of the
    syscall list if retval != expected retval.

    run_syscall_stub prints the data of the failed syscall, using the data pointer
    and retval written by the stub to the beginning of the stack.

    one_syscall_stub is removed to simplify the code, because it saved only a
    few instructions and no host syscall.

    Using the stub with additional data (modify_ldt via stub) is also prepared.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
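
The "expected retval" check described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the entry layout `struct stub_call`, the result record `struct stub_result`, and the `do_call` callback (which stands in for the real host syscall) are all hypothetical and do not reflect the stub's actual data format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of one stub entry; the real stub data differs. */
struct stub_call {
    long nr;        /* syscall number */
    long arg;       /* a single argument, for brevity */
    long expected;  /* expected return value */
};

/* What the stub would leave at the start of its stack on failure. */
struct stub_result {
    const struct stub_call *failed;  /* NULL when every call matched */
    long retval;                     /* return value of the failed call */
};

/* Run the list in order; `do_call` stands in for the host syscall. */
static struct stub_result run_stub(const struct stub_call *calls, size_t n,
                                   long (*do_call)(long nr, long arg))
{
    struct stub_result res = { NULL, 0 };

    assert(calls != NULL && do_call != NULL);
    for (size_t i = 0; i < n; i++) {
        long ret = do_call(calls[i].nr, calls[i].arg);
        if (ret != calls[i].expected) {
            /* Abort the rest of the list and record what went wrong. */
            res.failed = &calls[i];
            res.retval = ret;
            break;
        }
    }
    return res;
}

/* Toy executor for demonstration: "syscall" nr returns nr + arg. */
static long toy_call(long nr, long arg)
{
    return nr + arg;
}
```

The caller can then print the failing entry and its return value from the result record, which mirrors the role the commit message gives to run_syscall_stub.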
     
  • This change enables SKAS0/SKAS3 to work with all combinations of /proc/mm and
    PTRACE_FAULTINFO being available or not.

    Also it changes the initialization of proc_mm and ptrace_faultinfo slightly,
    to ease forcing SKAS0 on a patched host. Forcing UML to run without /proc/mm
    or PTRACE_FAULTINFO by cmdline parameter can be implemented with a setup
    resetting the related variable.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • The serial UML OS-abstraction layer patch (um/kernel dir).

    This moves all system calls from the process.c file to the os-Linux dir and
    joins the process.c and process_kern.c files.

    Signed-off-by: Gennady Sharapov
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gennady Sharapov
     
  • This adds AIO support to the ubd driver.

    The driver breaks a struct request into IO requests to the host, based on the
    hardware segments in the request and on any COW blocks covered by the request.

    The ubd IO thread is gone, since there is now an equivalent thread in the AIO
    module.

    There is now provision for multiple outstanding requests. A request isn't
    retired until all of its pieces have been completed. The AIO requests have a
    shared count, which is decremented as IO completions come in until it reaches
    0. This could possibly be moved to the request struct - I haven't looked at
    this yet.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
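
The shared completion count described above might look like the following in isolation. This is a simplified sketch, not the ubd driver's code: `struct split_request` and both functions are hypothetical, and a real driver would also need locking or atomic operations around the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-request bookkeeping; not the ubd driver's real struct. */
struct split_request {
    int pending;   /* pieces still outstanding */
    int error;     /* first error seen, 0 if none */
    bool retired;  /* set once the whole request has been completed */
};

static void split_request_init(struct split_request *req, int pieces)
{
    assert(req != NULL && pieces > 0);
    req->pending = pieces;
    req->error = 0;
    req->retired = false;
}

/* Called once per completed piece. Returns true when this completion
 * was the last one and the request has been retired. */
static bool piece_done(struct split_request *req, int err)
{
    assert(req != NULL && req->pending > 0);
    if (err != 0 && req->error == 0)
        req->error = err;        /* remember the first failure */
    if (--req->pending == 0)
        req->retired = true;     /* all pieces are in */
    return req->retired;
}
```

Pieces may complete in any order; only the completion that brings the count to 0 retires the request.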
     
  • This patch makes UML use host AIO support when it (and
    /usr/include/linux/aio_abi.h) are present. This is only the support, with no
    consumers - a consumer is coming in the next patch.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Added missing include list to uml AFLAGS

    Killed magic for stubs.[So] - it was needed only because of messed AFLAGS
    Switched segv_stubs.c to kernel CFLAGS sans profile, instead of user ones
    Killed STUBS_CFLAGS - it's not needed and the only remaining use had been
    gratuitous - it only polluted CFLAGS

    Signed-off-by: Al Viro
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • This merges two sets of files which had no business being split apart in the
    first place.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • There is a lot of code which is duplicated between the 2 and 3 level
    implementations, the only difference being that the 3-level implementation is
    a bit more generalized (instead of accessing pte_t.pte directly, it uses the
    appropriate access macros).

    So this code is joined together.

    Obviously, a "core code nice cleanup" is not a "stability-friendly patch", so
    the usual care applies.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • This adds VM op batching to skas0. Rather than having a context switch to and
    from the userspace stub for each address space change, we write a number of
    operations to the stub data page and invoke a different stub which loops over
    them and executes them all in one go.

    The operations are stored as [ system call number, arg1, arg2, ... ] tuples.

    The set is terminated by a system call number of 0. Single operations, i.e.
    page faults, are handled in the old way, since that is slightly more
    efficient.

    For a kernel build, a minority (~1/4) of the operations are part of a set.
    These sets averaged ~100 in length, so for this quarter, the context switching
    overhead is greatly reduced.

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
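
The batch format described above (tuples of a system call number followed by its arguments, terminated by a system call number of 0) can be modelled in user space as shown here. The sketch makes several assumptions that are not in the commit message: a fixed width of six argument slots per tuple, the helper names `batch_add()` and `batch_run()`, and an `exec` callback that stands in for the stub actually issuing the system call.

```c
#include <assert.h>
#include <stddef.h>

#define OP_ARGS  6               /* assumption: fixed number of argument slots */
#define OP_WORDS (1 + OP_ARGS)   /* syscall number plus arguments */

/* Append one [nr, arg1..arg6] tuple at word offset `used` and keep the
 * buffer terminated by a syscall number of 0. Returns the new offset,
 * or `used` unchanged if the tuple plus terminator would not fit. */
static size_t batch_add(long *buf, size_t cap, size_t used,
                        long nr, const long args[OP_ARGS])
{
    assert(buf != NULL && args != NULL && nr != 0);
    if (used + OP_WORDS + 1 > cap)
        return used;
    buf[used] = nr;
    for (size_t i = 0; i < OP_ARGS; i++)
        buf[used + 1 + i] = args[i];
    buf[used + OP_WORDS] = 0;    /* terminator */
    return used + OP_WORDS;
}

/* Walk the buffer and execute every tuple until the 0 terminator.
 * An empty batch must start with 0. Returns the number of operations run. */
static size_t batch_run(const long *buf,
                        long (*exec)(long nr, const long *args))
{
    size_t done = 0;

    assert(buf != NULL && exec != NULL);
    for (const long *p = buf; p[0] != 0; p += OP_WORDS) {
        exec(p[0], p + 1);
        done++;
    }
    return done;
}

/* Toy executor: sums the syscall numbers it was asked to run. */
static long seen_sum;
static long toy_exec(long nr, const long *args)
{
    (void)args;
    seen_sum += nr;
    return 0;
}
```

The point of the layout is that the kernel side fills the buffer with many operations and then switches to the stub once, instead of once per operation.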
     
  • Al Viro spotted a bunch of duplicated exports - this removes them.

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Noticed by Al Viro - SMP on x86_64 is fundamentally broken due to UML's reuse
    of the host arch's percpu stuff. This is OK on x86, but the x86_64 pda stuff
    just won't work for UML.

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Remove an unneeded reference to libc.

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • Build cleanups

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • This cleans up the error path in ubd_open, causing it now to call ubd_close
    appropriately when something fails.

    Signed-off-by: Jeff Dike
    Cc: Paolo Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Fix a macro typo which could break if the macro is passed arguments with
    side-effects.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • The copy_user stuff in the signal frame code was broken.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • Avoid chomping the low bits of the address for functions that do it
    themselves, fix whitespace, and add a correctness check.

    I did this for remap-file-pages protection support, but it is useful on its
    own too.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Normally, activate_mm() is called from exec(), and thus it used to be a
    no-op because we use a completely new "MM context" on the host (for
    instance, a new process), and so we didn't need to flush any "TLB entries"
    (which for us are the set of memory mappings for the host process from the
    virtual "RAM" file).

    Kernel threads, instead, are usually handled in a different way. So, when
    for AIO we call use_mm(), things used to break and so Benjamin implemented
    activate_mm(). However, that is only needed for AIO, and could slow down
    exec() inside UML, so be smart: detect being called for AIO (via
    PF_BORROWED_MM) and do the full flush only in that situation.

    Also comment the caller so that people won't go breaking UML without
    noticing. I also rely on the caller's locks for testing current->flags.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    CC: Benjamin LaHaise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • If a SIGWINCH comes in while winch_thread() isn't waiting in wait(),
    winch_thread could miss signals. It isn't very probable that anyone will
    see this causing trouble, as it would take very special timing for a
    missed SIGWINCH to result in a wrong window size.

    So this is a minor problem. But why not fix it, as it can be done so easily?

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
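
The commit message does not say how the race was closed. One common way to avoid losing a signal that arrives while a thread is not yet waiting is to keep the signal blocked, so that it stays pending, and to collect it with sigwait(). The sketch below shows that general technique only; it is not taken from the patch, and `winch_block()` and `winch_wait()` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Block SIGWINCH so that it stays pending instead of being delivered
 * while the thread is busy elsewhere. Returns 0 on success, -1 on failure. */
static int winch_block(sigset_t *set)
{
    assert(set != NULL);
    if (sigemptyset(set) != 0 || sigaddset(set, SIGWINCH) != 0)
        return -1;
    return sigprocmask(SIG_BLOCK, set, NULL);
}

/* Wait for the next SIGWINCH; returns at once if one is already pending.
 * Returns the signal number, or -1 on failure. */
static int winch_wait(const sigset_t *set)
{
    int sig = 0;

    assert(set != NULL);
    if (sigwait(set, &sig) != 0)
        return -1;
    return sig;
}
```

With the signal blocked, a SIGWINCH raised before the call to winch_wait() is not lost: it remains pending and is returned by the next wait.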
     
  • Apparently, GDB gets confused when we do an execvp() on ourselves.

    Since it's simply done to allocate further space for command line arguments
    (which we'll use to allow gathering the startup command line for guest
    processes through the host), allow the user to disable that to get a
    debuggable UML binary.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • As a follow-up to "UML Support - Ptrace: adds the host SYSEMU support, for
    UML and general usage" (i.e. uml-support-* in current mm).

    Avoid unconditionally jumping to work_pending and copying code; just reuse
    the already existing resume_userspace path.

    One interesting note, from Charles P. Wright, suggested that the API is
    improvable with no downsides for UML (except that it will have to support
    yet another host API, since dropping support for the current API, for UML,
    is not reasonable from users' point of view).

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    CC: Charles P. Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Paolo 'Blaisorblade' Giarrusso

    This is simply an adjustment for "Ptrace - i386: fix Syscall Audit interaction
    with singlestep" to work on top of SYSEMU patches, too. I have some doubts
    about this patch: I wonder why we need to alter ptrace_disable() that way.

    I left the patch this way because it has been extensively tested, but I don't
    understand the reason.

    The current PTRACE_DETACH handling simply clears child->ptrace; actually this
    is not enough because entry.S just looks at the thread_flags; actually,
    do_syscall_trace checks current->ptrace but I don't think depending on that is
    good, at least for performance, so I think the clearing is done elsewhere.
    For instance, on PTRACE_CONT it's done, but doing PTRACE_DETACH without
    PTRACE_CONT is possible (and happens when gdb crashes and one kills it
    manually).

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    CC: Roland McGrath
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
    can be used by UML to singlestep a process: it will receive SINGLESTEP
    interceptions for normal instructions and syscalls, but syscall execution will
    be skipped just like with PTRACE_SYSEMU.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • With this patch, we change the way we handle switching from PTRACE_SYSEMU to
    PTRACE_{SINGLESTEP,SYSCALL}, to free TIF_SYSCALL_EMU from double use as a
    preparation for PTRACE_SYSEMU_SINGLESTEP extension, without changing the
    behavior of the host kernel.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • Jeff Dike,
    Paolo 'Blaisorblade' Giarrusso,
    Bodo Stroesser

    Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
    except that the kernel does not execute the requested syscall; this is useful
    to improve performance for virtual environments, like UML, which want to run
    the syscall on their own.

    In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
    and on exit, and each time you also have two context switches; with SYSEMU you
    avoid the 2nd stop and so save two context switches per syscall.

    Also, some architectures don't have support in the host for changing the
    syscall number via ptrace(), which is currently needed to skip syscall
    execution (UML turns any syscall into getpid() to avoid it being executed on
    the host). Fixing that is hard, while SYSEMU is easier to implement.

    * This version of the patch includes some suggestions of Jeff Dike to avoid
    adding any instructions to the syscall fast path, plus some other little
    changes, by myself, to make it work even when the syscall is executed with
    SYSENTER (but I'm unsure about them). It has been widely tested for quite a
    lot of time.

    * Various fixes were included to handle the switches between the
    various states, i.e. when for instance a syscall entry is traced with one of
    PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
    Basically, this is done by remembering which one of them was used even after
    the call to ptrace_notify().

    * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
    to make do_syscall_trace() notice that the current syscall was started with
    SYSEMU on entry, so that no notification ought to be done in the exit path;
    this is a bit of a hack, so this problem is solved in another way in next
    patches.

    * Also, the effects of the patch:
    "Ptrace - i386: fix Syscall Audit interaction with singlestep"
    are cancelled; they are restored back in the last patch of this series.

    Detailed descriptions of the patches doing this kind of processing follow (but
    I've already summed everything up).

    * Fix behaviour when changing interception kind #1.

    In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
    only after doing the debugger notification; but the debugger might have
    changed the status of this flag because it continued execution with
    PTRACE_SYSCALL, so this is wrong. This patch fixes it by saving the flag
    status before calling ptrace_notify().

    * Fix behaviour when changing interception kind #2:
    avoid intercepting syscall on return when using SYSCALL again.

    A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
    crashes.

    The problem is in arch/i386/kernel/entry.S. The current SYSEMU patch
    prevents the syscall handler from being called, but does not prevent
    do_syscall_trace() from being called afterwards for syscall completion
    interception.

    The appended patch fixes this. It reuses the flag TIF_SYSCALL_EMU to
    remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
    the flag is unused in the depicted situation.

    * Fix behaviour when changing interception kind #3:
    avoid intercepting syscall on return when using SINGLESTEP.

    When testing 2.6.9 and the skas3.v6 patch with my latest patch, I had
    problems with singlestepping on UML in SKAS with SYSEMU. It looped
    receiving SIGTRAPs without moving forward. EIP of the traced process was
    the same for all SIGTRAPs.

    What's missing is to handle switching from PTRACE_SYSCALL_EMU to
    PTRACE_SINGLESTEP in a way very similar to what is done for the change from
    PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.

    I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
    notified and then wakes up the process; the syscall is executed (or skipped,
    when do_syscall_trace() returns 0, i.e. when using PTRACE_SYSEMU), and
    do_syscall_trace() is called again. Since we are on the return path of a
    SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
    we must still avoid notifying the parent of the syscall exit. Now, this
    behaviour is extended even to resuming with PTRACE_SINGLESTEP.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laurent Vivier
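
Fix #1 above (sample the flag before notifying the debugger) can be shown with a small user-space model. Everything here is a stand-in and an assumption: `struct task`, `FLAG_SYSCALL_EMU`, the `notify` callback playing the role of ptrace_notify(), and the convention that the function returns true when the syscall should be skipped. The real do_syscall_trace() has a different signature and return convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_SYSCALL_EMU 0x1u   /* stand-in for TIF_SYSCALL_EMU */

struct task {
    unsigned int flags;
};

/* Stand-in for ptrace_notify(): the debugger runs here and may change
 * the interception mode, and with it the flags. */
typedef void (*notify_fn)(struct task *t);

/* Returns true when the syscall must be skipped, i.e. when SYSEMU was in
 * effect at the moment the syscall was entered. */
static bool trace_syscall_entry(struct task *t, notify_fn notify)
{
    assert(t != NULL && notify != NULL);

    /* Sample the flag before the debugger gets a chance to change it. */
    bool was_sysemu = (t->flags & FLAG_SYSCALL_EMU) != 0;

    notify(t);
    return was_sysemu;
}

/* Debugger that resumes with a different mode, clearing the flag. */
static void debugger_switches_mode(struct task *t)
{
    t->flags &= ~FLAG_SYSCALL_EMU;
}
```

Had the flag been read after the notification, the decision would have reflected the mode the debugger switched to rather than the mode the syscall was entered with.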
     
  • Paolo 'Blaisorblade' Giarrusso

    Avoid giving two traps for singlestep instead of one, when syscall auditing is
    enabled.

    In fact no singlestep trap is sent on syscall entry, only on syscall exit, as
    can be seen in entry.S:

    # Note that in this mask _TIF_SINGLESTEP is not tested !!! <<<<<<<<<<<<<<
    testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),TI_flags(%ebp)
    jnz syscall_trace_entry
    ...
    syscall_trace_entry:
    ...
    call do_syscall_trace

    But auditing a SINGLESTEP'ed process causes do_syscall_trace to be called, so
    the tracer will get one more trap on the syscall entry path, which it
    shouldn't.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    CC: Roland McGrath
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • To the extent that sub-Kconfig files exist elsewhere in the tree, they are
    named Kconfig.foo, rather than the Kconfig_foo that UML has. This patch
    brings the names in line with the rest of the tree.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • This eliminates the segfault info ring buffer, which added a system call to
    each page fault, and which hadn't been useful for debugging in ages.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • This patch converts arch/cris/Kconfig.debug to using lib/Kconfig.debug.

    This should fix a compile error in 2.6.13-rc4 caused by a missing
    CONFIG_LOG_BUF_SHIFT definition.

    While I was editing this file, I also converted some spaces to tabs.

    Signed-off-by: Adrian Bunk
    Acked-by: Mikael Starvik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Use the builtin functions for memset/memclr/memcpy; special optimizations for
    page operations have dedicated functions now. Uninline memmove/memchr,
    move all functions into a single file, and clean it up a little.

    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     
  • Move a few cache functions into their own file and fix flush_icache_range() so
    it can handle both kernel and user addresses correctly (assuming context is
    set correctly).

    Turn copy_to_user_page/copy_from_user_page into inline functions and add a
    missing cache flush.

    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     
  • - create helper function singlestep_disable()
    - move variable definitions to the top of the function
    - use "out_eio" label as common error destination
    - don't clear failure value for PTRACE_SETREGS/PTRACE_GETREGS

    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     
  • This reformats and properly indents sys_ptrace (only whitespace changes).

    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     
  • The timers lack .suspend/.resume methods. Because of this, jiffies got a
    big compensation after an S3 resume, and then the softlockup watchdog
    reported an oops. This occurred with HPET enabled, but it's also possible
    with other timers.

    Signed-off-by: Shaohua Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • Clean code up a bit, and only show suspend to disk as available when
    it is configured in.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • If process freezing fails, some processes are frozen, and the rest are left
    in a "were asked to be frozen" state. That's wrong; we should leave them in
    a consistent state.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek