08 May, 2007

40 commits

  • Convert an arch that does not currently implement sub-jiffy timekeeping to
    use the generic timekeeping code.

    v850 looks like it has some intent to implement sub-jiffy timekeeping, so
    it may not yet be appropriate to try to convert, but I figured I'd get the
    maintainer's input and submit the patch for comment.

    Signed-off-by: John Stultz
    Cc: Miles Bader
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Declare strlcpy and strlcat more correctly.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • With the current timekeeping, !CONFIG_UML_REAL_TIME_CLOCK has
    inconsistent behavior. Previously, gettimeofday could be (and was)
    isolated from the clock ticking. Now, it's not, so when
    CONFIG_UML_REAL_TIME_CLOCK is disabled, gettimeofday must progress in
    lockstep with the clock, making it fully virtual.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
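
    As a rough illustration of what "fully virtual" means in the message
    above, the sketch below advances time only when a tick is delivered, so a
    gettimeofday-style read never consults the host clock. All names and the
    10 ms tick period are assumptions made for the example; this is not the
    UML code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed tick period: 10 ms, i.e. HZ = 100. Hypothetical value. */
#define TICK_USEC 10000

/* Hypothetical virtual clock: time exists only as a tick-driven counter. */
struct virtual_clock {
    int64_t usec;   /* virtual time in microseconds */
};

/* Called once per clock tick; the only place where virtual time advances. */
void virtual_clock_tick(struct virtual_clock *clk)
{
    assert(clk != NULL);
    clk->usec += TICK_USEC;
}

/* gettimeofday()-style read that moves in lockstep with the ticks. */
void virtual_gettimeofday(const struct virtual_clock *clk,
                          int64_t *sec, int64_t *usec)
{
    assert(clk != NULL && sec != NULL && usec != NULL);
    *sec = clk->usec / 1000000;
    *usec = clk->usec % 1000000;
}
```

    If no ticks arrive, two successive reads return the same time, which is
    the behavior the message describes for a disabled real-time clock.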
     
  • It turns out that the message complaining about a lack of tmpfs space
    on the host can be misunderstood as referring to the UML.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • When doing a full address space flush, only look at areas covered by a VMA.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • More trimming of the page fault path.

    Permissions are passed around in a single int rather than one bit per
    int. The permission values are copied from libc so that they can be
    passed to mmap and mprotect without any further conversion.

    The register sets used by do_syscall_stub and copy_context_skas0 are
    initialized once, at boot time, rather than once per call.

    wait_stub_done checks whether it is getting the signals it expects by
    comparing the wait status to a mask containing bits for the signals of
    interest rather than comparing individually to the signal numbers. It
    also has one check for a wait failure instead of two. The caller is
    expected to do the initial continue of the stub. This gets rid of an
    argument and some logic. The fname argument is gone, as that can be
    had from a stack trace.

    user_signal() is collapsed into userspace() as it is basically one or
    two lines of code afterwards.

    The physical memory remapping stuff is gone, as it is unused.

    flush_tlb_page is inlined.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
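
    Two of the ideas in the message above can be sketched in a few lines:
    carrying permissions in one int built from libc's PROT_* values, and
    checking a wait status against a mask of expected signals. The helper
    names are hypothetical and the code is a minimal sketch, not the patch
    itself.

```c
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>

/* Hypothetical helper: fold three flags into one int using libc's PROT_*
 * values, so the result can be passed to mmap() or mprotect() unchanged. */
int make_prot(int r, int w, int x)
{
    return (r ? PROT_READ : 0) | (w ? PROT_WRITE : 0) | (x ? PROT_EXEC : 0);
}

/* Hypothetical helper: one bit per signal number, used to build a mask. */
unsigned long sig_bit(int sig)
{
    assert(sig > 0 && sig < (int)(8 * sizeof(unsigned long)));
    return 1UL << sig;
}

/* Returns 1 if the status reports a stop caused by a signal in mask,
 * 0 otherwise. One mask test replaces a chain of per-signal comparisons. */
int stopped_with_expected_signal(int status, unsigned long mask)
{
    int sig;

    if (!WIFSTOPPED(status))
        return 0;
    sig = WSTOPSIG(status);
    if (sig <= 0 || sig >= (int)(8 * sizeof(unsigned long)))
        return 0;   /* outside the range the mask can represent */
    return ((1UL << sig) & mask) != 0;
}
```

    A caller would build the mask once, for example
    sig_bit(SIGUSR1) | sig_bit(SIGTRAP), and reuse it for every wait.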
     
  • I missed removing another piece of debugging in an earlier patch.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Give the page fault code a specialized path. There is only one page to look
    at, so there's no point in going into the general page table walking code.
    There's only going to be one host operation, so there are no opportunities for
    merging. So, we go straight to the pte we want, figure out what needs doing,
    and do it.

    While I was in here, I fixed the wart where the address passed to unmap was a
    void *, but an unsigned long to map and protect.

    This gives me just under 10% on a kernel build.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Allow deadlocks to be avoided in the AIO code by setting the pipe to the I/O
    thread non-blocking.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
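
    For readers unfamiliar with the technique, the sketch below shows the
    usual way to make a pipe descriptor non-blocking with fcntl(), so that a
    write to a full pipe fails with EAGAIN instead of sleeping. The helper
    names are hypothetical, and this is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -errno on failure. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -errno;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -errno;
    return 0;
}

/* Try to queue one byte. Returns 1 if written, 0 if the pipe is full
 * (the caller can retry later rather than block), -errno otherwise. */
int try_send_byte(int fd, char c)
{
    ssize_t n = write(fd, &c, 1);

    if (n == 1)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return n < 0 ? -errno : -EIO;
}
```

    With the descriptor left in blocking mode, the final write to a full pipe
    would sleep until the reader drained it, which is how a writer and a
    reader waiting on each other can deadlock.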
     
  • Rename os_{read,write}_file_k back to os_{read,write}_file, delete
    the originals and their bogus infrastructure, and fix all the callers.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • I accidentally left the remnants of some debugging in an earlier patch.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Formatting fixes ahead of renaming os_{read,write}_file_k to
    os_{read,write}_file and fixing all the callers.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Convert all remaining os_{read,write}_file users to use the simple
    {read,write} wrappers, os_{read,write}_file_k.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Code running on the initial UML stack can't receive or process signals since
    current must be valid when IRQs are handled, and there is no current for this
    stack.

    So, instead of using UML_LONGJMP and UML_SETJMP, which are careful to save and
    restore signal state, and, as a side-effect, handle any deferred signals,
    start_idle_thread must use the bare equivalents, which don't do anything with
    signals.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Dump core after a panic. This will provide better debugging information than
    is currently available.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Sanitise gfp flags; it actually is an atomic context, so drop the
    GFP_KERNEL part.

    Signed-off-by: Peter Zijlstra
    Acked-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Instead of writing entire structures between UML and the I/O thread, we send
    pointers. This cuts down on the amount of data being copied and possibly
    allows more requests to be pending between the two.

    This requires that the requests be kmalloced and freed instead of living on
    the stack.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
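
    The pointer-passing idea above can be sketched as follows. The example
    assumes both ends of the pipe share one address space (otherwise the
    pointer would be meaningless on the receiving side), uses malloc-style
    heap allocation as a stand-in for kmalloc, and invents the request type
    and function names for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical request type; a real driver's structure would be larger. */
struct io_request {
    long long offset;
    size_t length;
    int op;
};

/* Sender: write only the pointer, not the structure. The request must be
 * heap-allocated because it outlives the sender's stack frame.
 * Returns 0 on success, -errno on failure. */
int send_request(int fd, struct io_request *req)
{
    ssize_t n;

    assert(req != NULL);
    n = write(fd, &req, sizeof(req));
    if (n < 0)
        return -errno;
    return n == (ssize_t)sizeof(req) ? 0 : -EIO;
}

/* Receiver: read one pointer back. Returns NULL on a short read or error.
 * Ownership passes to the caller, which frees the request when done. */
struct io_request *receive_request(int fd)
{
    struct io_request *req = NULL;
    ssize_t n = read(fd, &req, sizeof(req));

    return n == (ssize_t)sizeof(req) ? req : NULL;
}
```

    Each message is sizeof(void *) bytes regardless of how large the request
    structure is, so more requests fit in the pipe before it fills.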
     
  • Send as many I/O requests to the I/O thread as possible, even though it will
    still only handle one at a time. This provides an opportunity to reduce
    latency by starting one request before the previous one has been finished in
    the driver.

    Request handling is somewhat modernized by requesting sg pieces of a request
    and handling them separately, finishing off the entire request after all the
    pieces are done.

    When a request queue stalls, normally because its pipe to the I/O thread is
    full, it is put on the restart list. This list is processed by starting up
    the queues on it whenever there is some indication that progress might be
    possible again. Currently, this happens in the driver interrupt routine.
    Some requests have been finished, so there is likely to be room in the pipe
    again.

    This almost doubles throughput when copying data between devices, but made no
    noticeable difference on anything else I tried.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • This patch converts calls in the os layer to os_{read,write}_file to calls
    directly to libc read() and write() where it is clear that the I/O buffer is
    in the kernel.

    We can do that here instead of calling os_{read,write}_file_k since we are in
    libc code and can call libc directly.

    With the change in the calls, error handling needs to be changed to refer to
    errno directly rather than the return value of the call.

    CATCH_EINTR wrappers were also added where needed.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
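
    A minimal sketch of the pattern described above: call libc read() and
    write() directly, retry when the call is interrupted, and take the error
    from errno rather than from the return value. The function names are
    hypothetical stand-ins and do not reproduce the CATCH_EINTR macro.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: retry read() while it fails with EINTR.
 * Returns the byte count on success, -errno on any other failure. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n < 0 ? -errno : n;
}

/* Same pattern for write(). */
ssize_t write_retry(int fd, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = write(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n < 0 ? -errno : n;
}
```

    libc reports failure as -1 with the reason in errno, so a caller that
    previously tested for a negative error code in the return value has to
    be updated, which is the error-handling change the message mentions.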
     
  • This patch lays some groundwork for the next one, which converts calls to
    os_{read,write}_file into {read,write}, by doing some tidying in the affected
    areas.

    do_not_aio gets restructured to make the final result a bit cleaner.

    There are also whitespace and other formatting fixes, fixes in error messages,
    and a typo fix.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • This patch starts the removal of a very old, very broken piece of code. This
    stems from the problem of passing a userspace buffer into read() or write() on
    the host. If that buffer had not yet been faulted in, read and write will
    return -EFAULT.

    To avoid this problem, the solution was to fault the buffer in before the
    system call by touching the pages that hold the buffer by doing a copy-user of
    a byte to each page. This is obviously bogus, but it does usually work, in tt
    mode, since the kernel and process are in the same address space and userspace
    addresses can be accessed directly in the kernel.

    In skas mode, where the kernel and process are in separate address spaces, it
    is completely bogus because the userspace address, which is invalid in the
    kernel, is passed into the system call instead of the corresponding physical
    address, which would be valid. Here, it appears that this code, on every host
    read() or write(), tries to fault in a random process page. This doesn't seem
    to cause any correctness problems, but there is a performance impact. This
    patch, and the ones following, result in a 10-15% performance gain on a kernel
    build.

    This code can't be immediately tossed out because when it is, you can't log
    in. Apparently, there is some code in the console driver which depends on
    this somehow.

    However, we can start removing it by switching the code which does I/O using
    kernel addresses to using plain read() and write(). This patch introduces
    os_read_file_k and os_write_file_k for use with kernel buffers and converts
    all call locations which use obvious kernel buffers to use them. These
    include I/O using buffers which are local variables which are on the stack or
    kmalloc-ed. Later patches will handle the less obvious cases, followed by a
    mass conversion back to the original interface.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
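
    To make the old workaround concrete, the sketch below touches one byte in
    every page that overlaps a buffer, which forces each page to be faulted
    in before the buffer is handed to a system call. It reads a byte rather
    than doing a copy-user, and the function name is hypothetical; it only
    illustrates the per-page touching described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: read one byte from every page overlapping
 * [buf, buf + len). Returns the number of pages touched. */
size_t touch_pages(const void *buf, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    uintptr_t first = (uintptr_t)buf;
    uintptr_t page, end;
    size_t touched = 0;

    assert(buf != NULL && ps > 0);
    if (len == 0)
        return 0;

    end = first + len;
    /* Start at the page boundary at or below the buffer. */
    for (page = first & ~((uintptr_t)ps - 1); page < end;
         page += (uintptr_t)ps) {
        /* Stay inside the buffer when it starts mid-page. */
        uintptr_t addr = page < first ? first : page;

        (void)*(const volatile unsigned char *)addr;
        touched++;
    }
    return touched;
}
```

    The cost is one memory access per page on every I/O call, which is the
    overhead the message attributes to this code.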
     
  • It turns out that essentially none of the x86_64 bugs.c is needed. So, we can
    delete most of it.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • The previous page table walking code was horribly inefficient. This patch
    replaces it with code taken from elsewhere in the kernel.

    Forking from bash is now ~5% faster and page faults are handled ~10% faster.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Provide a register dump if handle_trap fails. Abstract out ptrace_dump_regs
    since it now has two callers.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Define release methods for the ubd and net drivers. They contain as much of
    the remove methods as makes sense. All error checking must have already been
    done as well as anything else that might be holding a reference on the device
    kobject.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • HOST_FRAME_SIZE isn't used any more. It has been replaced with MAX_REG_NR.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Locking commentary.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Commentary about missing locking.

    Also got rid of uml_start because it was pointless.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • If there's a segfault inside the kernel, we want a dump of the registers at
    the point of the segfault, not the registers at the point of calling panic or
    the last userspace registers.

    sig_handler_common_skas now uses a static register set in the case of a
    SIGSEGV to avoid messing up the process registers if the segfault turns out to
    be non-fatal.

    The architecture sigcontext-to-pt_regs copying code was repurposed to copy
    data out of the SEGV stack frame.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Tidying in preparation for the segfault register dumping patch which follows.

    void * pointers are changed to union uml_pt_regs *. This makes the types
    match reality, except in arch_fixup, which is changed to operate on a union
    uml_pt_regs. This fixes a bug in the call from segv_handler, which passes a
    union uml_pt_regs, to segv, which expects to pass a struct sigcontext to
    arch_fixup.

    Whitespace and other style fixes.

    There's also an errno printk fix.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • kernel_thread() should just return an error value on do_fork failure, not
    panic.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • userspace code used to have to call the kernelspace function page_size() in
    order to determine the value of the kernel's PAGE_SIZE. Since this is now
    available directly from kern_constants.h as UM_KERN_PAGE_SIZE, page_size() can
    be deleted and calls changed to use the constant.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Clean up arch/um/kernel/process.c:

    - lots of return(x); -> return x; conversions

    - a number of the small functions are either unused, in which case they are
    gone, along with any declarations in a header, or could be made static.

    - current_pid is ifdefed on CONFIG_MODE_TT and its declaration is ifdefed on
    both CONFIG_MODE_TT and UML_CONFIG_MODE_TT because we don't know whether
    it's being used in a userspace or kernel file.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Comment the lack of locking on a couple of globals.

    Also fix the formatting of __setup_host_supports_tls.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • flush_thread doesn't need to do a full page table walk in order to clear the
    address space. It knows what the end result needs to be, so it can call unmap
    directly.

    This results in a 10-20% speedup in an exec from bash.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Call lines_init() *after* xterm_title is modified to include umid.

    Signed-off-by: Davide Brini
    Signed-off-by: Jeff Dike
    Acked-by: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Brini
     
  • To look at users I did:
    $ find arch/um/ include/asm-um -name '*.[ch]'|xargs grep -r 'net_kern\.h' \
      -l|xargs grep '\'

    Most users just cast user to the appropriate pointer, the remaining ones are
    fixed here. In net_kern.c, I'm almost sure that save trick is not needed
    anymore, but I've not verified it.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Avoid using the temporary buffer introduced by the previous patch to hold
    the device name.

    Btw, avoid leaking device on an error path. Other error paths may need
    cleanup.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Improve checking and diagnostics for broadcast and multicast Ethernet MAC
    addresses, and distinguish between those cases in output; also make sure the
    device is assigned a MAC address valid only locally to avoid collisions.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
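
    The address classes named above are distinguished by bits in the first
    octet of an Ethernet MAC address. The following is a minimal sketch of
    those checks with hypothetical function names; it is not the code from
    the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Broadcast: all six octets are 0xff. */
int mac_is_broadcast(const uint8_t mac[MAC_LEN])
{
    static const uint8_t bcast[MAC_LEN] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff
    };

    return memcmp(mac, bcast, MAC_LEN) == 0;
}

/* Multicast: least significant bit of the first octet is set.
 * Broadcast also satisfies this test, so check broadcast first when
 * the two cases need different diagnostics. */
int mac_is_multicast(const uint8_t mac[MAC_LEN])
{
    return (mac[0] & 0x01) != 0;
}

/* Locally administered: second least significant bit of the first octet. */
int mac_is_local(const uint8_t mac[MAC_LEN])
{
    return (mac[0] & 0x02) != 0;
}

/* Mark an address as locally administered so that it cannot collide with
 * a globally assigned (vendor) address. */
void mac_make_local(uint8_t mac[MAC_LEN])
{
    assert(mac != NULL);
    mac[0] |= 0x02;
}
```

    For example, 00:11:22:33:44:55 becomes 02:11:22:33:44:55 after
    mac_make_local(), while 01:00:5e:00:00:01 is reported as multicast but
    not broadcast.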
     
  • Signed-off-by: Robert P. J. Day
    Acked-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day