31 Mar, 2006

3 commits

  • Woe be unto he who builds their filesystems as modules.

    Signed-off-by: Jeff Garzik
    [ Obscure quote from the infamous geek bible? ]
    Signed-off-by: Linus Torvalds

    Jeff Garzik
     
  • This enables the caller to migrate pages from one address space page
    cache to another. In buzz word marketing, you can do zero-copy file
    copies!

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     
  • This adds support for the sys_splice system call. Using a pipe as a
    transport, it can connect to files or sockets (latter as output only).

    From the splice.c comments:

    "splice": joining two ropes together by interweaving their strands.

    This is the "extended pipe" functionality, where a pipe is used as
    an arbitrary in-memory buffer. Think of a pipe as a small kernel
    buffer that you can use to transfer data from one end to the other.

    The traditional unix read/write is extended with a "splice()" operation
    that transfers data buffers to or from a pipe buffer.

    Named by Larry McVoy, original implementation from Linus, extended by
    Jens to support splicing to files and fixing the initial implementation
    bugs.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
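
    As an illustration of the pipe-as-transport idea described above, here is
    a minimal userspace sketch. It assumes the present-day glibc splice()
    wrapper, whose signature may differ from the interface first merged in
    this commit; the helper name splice_copy() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy up to len bytes from in_fd to out_fd, using a pipe as the
 * in-kernel buffer between them. Returns bytes copied, or -1 on error.
 * Hypothetical helper; not taken from the kernel tree.
 */
static ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (len > 0 && total >= 0) {
        /* Source file -> write end of the pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0)
            total = -1;
        if (n <= 0)
            break; /* error or end of input */
        len -= (size_t)n;

        /* Read end of the pipe -> destination; drain all n bytes. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m <= 0) {
                total = -1;
                break;
            }
            n -= m;
            total += m;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```

    Passing NULL for both offsets makes each call use and advance the file
    positions of the descriptors, much as read() and write() would.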
     

30 Mar, 2006

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (67 commits)
    [PATCH] powerpc: Remove oprofile spinlock backtrace code
    [PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
    [PATCH] powerpc: Add oprofile calltrace support
    [PATCH] for_each_possible_cpu: ppc
    [PATCH] for_each_possible_cpu: powerpc
    [PATCH] lock PTE before updating it in 440/BookE page fault handler
    [PATCH] powerpc: Kill _machine and hard-coded platform numbers
    ppc: Fix compile error in arch/ppc/lib/strcase.c
    [PATCH] git-powerpc: WARN was a dumb idea
    [PATCH] powerpc: a couple of trivial compile warning fixes
    powerpc: remove OCP references
    powerpc: Make uImage default build output for MPC8540 ADS
    powerpc: move math-emu over to arch/powerpc
    powerpc: use memparse() for mem= command line parsing
    ppc: fix strncasecmp prototype
    [PATCH] powerpc: make ISA floppies work again
    [PATCH] powerpc: Fix some initcall return values
    [PATCH] powerpc: Workaround for pSeries RTAS bug
    [PATCH] spufs: fix __init/__exit annotations
    [PATCH] powerpc: add hvc backend for rtas
    ...

    Linus Torvalds
     
  • * git://oss.sgi.com:8090/oss/git/xfs-2.6:
    [XFS] Cleanup in XFS after recent get_block_t interface tweaks.
    [XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
    [XFS] A change to inode chunk allocation to try allocating the new chunk
    Fixes a regression from the recent "remove ->get_blocks() support"
    [XFS] Fix compiler warning and small code inconsistencies in compat
    [XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all

    Linus Torvalds
     

29 Mar, 2006

21 commits

  • This patch borrows Hugh's clever 'struct anon_vma' trick.

    Without tasklist_lock held we can't trust task->sighand until we locked it
    and re-checked that it is still the same.

    But this means we don't need to defer 'kmem_cache_free(sighand)'. We can
    return the memory to slab immediately, all we need is to be sure that
    sighand->siglock can't disappear inside rcu protected section.

    To do so we need to initialize ->siglock inside ctor function,
    SLAB_DESTROY_BY_RCU does the rest.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
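
    The lock-then-recheck step described above can be sketched in userspace
    C. This is only an analogue under stated assumptions: a C11 atomic_flag
    spinlock stands in for ->siglock, all type and function names are
    hypothetical, and the memory holding the lock is assumed to stay a valid
    lock for as long as a reader may touch it (the property the message
    attributes to SLAB_DESTROY_BY_RCU plus initializing the lock in the ctor).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sighand {
    atomic_flag lock; /* plays the role of ->siglock */
    int count;
};

struct task {
    _Atomic(struct sighand *) sighand;
};

/*
 * Lock the task's sighand without holding any outer lock.
 * Returns the locked sighand, or NULL if the task has none.
 */
static struct sighand *lock_sighand(struct task *t)
{
    for (;;) {
        struct sighand *sh = atomic_load(&t->sighand);

        if (sh == NULL)
            return NULL;

        /* Take the lock; this is safe only because the lock memory
         * is assumed to remain a valid lock even if sh was freed and
         * reused in the meantime. */
        while (atomic_flag_test_and_set(&sh->lock))
            ; /* spin */

        /* Re-check: is this still the task's sighand? */
        if (sh == atomic_load(&t->sighand))
            return sh;

        /* It changed under us; drop the lock and retry. */
        atomic_flag_clear(&sh->lock);
    }
}

static void unlock_sighand(struct sighand *sh)
{
    atomic_flag_clear(&sh->lock);
}
```

    The pointer is trusted only after the second load confirms it is
    unchanged while the lock is held.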
     
  • add_parent(p, parent) is always called with parent == p->parent, and it makes
    no sense to do it differently. This patch removes this argument.

    No changes in affected .o files.

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • switch_exec_pids is only called from de_thread by way of exec, and it is
    only called when we are exec'ing from a non thread group leader.

    Currently switch_exec_pids gives the leader the pid of the thread and
    unhashes and rehashes all of the process groups. The leader is already in
    the EXIT_DEAD state so no one cares about its pids. The only concern for
    the leader is that __unhash_process called from release_task will function
    correctly. If we don't touch the leader at all we know that
    __unhash_process will work fine so there is no need to touch the leader.

    For the task becoming the thread group leader, we just need to give it the
    pid of the old thread group leader, add it to the task list, and attach it
    to the session and the process group of the thread group.

    Currently de_thread is also adding the task to the task list which is just
    silly.

    Currently the only caller of __detach_pid besides detach_pid is
    switch_exec_pids because of the ugly extra work that was being
    performed.

    So this patch removes switch_exec_pids because it is doing too much, it is
    creating an unnecessary special case in pid.c, doing work duplicated in
    de_thread, and generally obscuring what is going on.

    The necessary work is added to de_thread, and it seems to be a little
    clearer there what is going on.

    Signed-off-by: Eric W. Biederman
    Cc: Oleg Nesterov
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • I think it is enough to take tasklist_lock for reading while changing
    child_reaper:

    Reparenting needs write_lock(tasklist_lock)

    Only one thread in a thread group can do exec()

    sighand->siglock guarantees that get_signal_to_deliver()
    will not see a stale value of child_reaper.

    This means that we can change child_reaper earlier, without calling
    zap_other_threads() twice.

    "child_reaper = current" is a NOOP when init does exec from main thread, we
    don't care.

    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • After looking at the problem of init calling exec some more I figured out
    an easy way to make the code work.

    The actual symptom without this patch is that all threads will die
    except pid == 1, and the thread calling exec. The thread calling exec will
    wait forever for pid == 1 to die.

    Since pid == 1 does not install a handler for SIGKILL it will never die.

    This modifies the tests for init from current->pid == 1 to the equivalent
    current == child_reaper. And then it causes exec in the ugly case to
    modify child_reaper.

    The only weird symptom is that you wind up with an init process that
    doesn't have the oldest start time on the box.

    Signed-off-by: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Paul Mackerras
     
  • Signed-off-by: Nathan Scott

    Nathan Scott
     
  • SGI-PV: 951415
    SGI-Modid: xfs-linux-melb:xfs-kern:208490a

    Signed-off-by: Mandy Kirkconnell
    Signed-off-by: Nathan Scott

    Mandy Kirkconnell
     
  • contiguous with the most recently allocated chunk. On a striped
    filesystem, this will fill a stripe unit with inodes before allocating new
    inodes in another stripe unit.

    SGI-PV: 951416
    SGI-Modid: xfs-linux-melb:xfs-kern:208488a

    Signed-off-by: Glen Overby
    Signed-off-by: Nathan Scott

    Glen Overby
     
  • change. inode->i_blkbits should be used when making a get_block_t
    request of a filesystem instead of dio->blkbits, as that does not
    indicate the filesystem block size all the time (depends on request
    alignment - see start of __blockdev_direct_IO).

    Signed-off-by: Nathan Scott
    Acked-by: Badari Pulavarty

    Nathan Scott
     
  • ioctl32 land.

    SGI-PV: 904196
    SGI-Modid: xfs-linux-melb:xfs-kern:25590a

    Signed-off-by: Nathan Scott

    Nathan Scott
     
  • these typos.

    SGI-PV: 904196
    SGI-Modid: xfs-linux-melb:xfs-kern:25539a

    Signed-off-by: Nathan Scott

    Nathan Scott
     
  • Fix a lot of typos. Eyeballed by jmc@ in OpenBSD.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This is a conversion to make the various file_operations structs in fs/
    const. Basically a regexp job, with a few manual fixups.

    The goal is both to increase correctness (harder to accidentally write to
    shared data structures) and to reduce the false sharing of cachelines with
    things that get dirty in .data (while .rodata is nicely read only and thus
    cache clean).

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
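
    The effect of the conversion can be illustrated with a small userspace
    analogue. The struct and function names below are hypothetical and much
    simpler than the kernel's file_operations; the point is only that a const
    table of function pointers can be placed in read-only data and cannot be
    written through by accident.

```c
#include <stddef.h>

/* Hypothetical, simplified operations table. */
struct file_ops {
    long (*read)(char *buf, size_t len);
};

/* A trivial read method that always reports end of file. */
static long null_read(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0;
}

/* const lets the toolchain place the table in .rodata. */
static const struct file_ops null_fops = {
    .read = null_read,
};

/* Callers take a pointer to const, so they cannot modify the table. */
static long do_read(const struct file_ops *ops, char *buf, size_t len)
{
    return ops->read ? ops->read(buf, len) : -1;
}
```

    An assignment such as null_fops.read = NULL is rejected at compile time.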
     
  • Mark the f_ops members of inodes as const, as well as fix the
    ripple-through this causes by places that copy this f_ops and then "do
    stuff" with it.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     
  • replaces for_each_cpu with for_each_possible_cpu().

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • Remove an unnecessary level of indirection in allocating and freeing select
    bits, as per the select_bits_alloc() and select_bits_free() functions.
    Both select.c and compat.c are updated.

    Signed-off-by: Vadim Lobanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vadim Lobanov
     
  • Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • Optimize select and poll by using stack space for small fd sets

    This brings back an old optimization from Linux 2.0. Using the stack is
    faster than kmalloc. On an Intel P4 system it speeds up a select of a
    single pty fd by about 13% (~4000 cycles -> ~3500)

    It also saves memory because a daemon hanging in select or poll will
    usually use one or two fewer pages. This can add up - e.g. if you have 10
    daemons blocking in poll/select you save 40KB of memory.

    I did a patch for this long ago, but it was never applied. This version is
    a reimplementation of the old patch that tries to be less intrusive. I
    only did the minimal changes needed for the stack allocation.

    The cut off point before external memory is allocated is currently at
    832 bytes. The system calls always allocate this much memory on the stack.

    These 832 bytes are divided into 256 bytes frontend data (for the select
    bitmaps or the pollfds) and the rest of the space for the wait queues used
    by the low level drivers. There are some extreme cases where this won't
    work out for select and it falls back to allocating memory too early -
    especially with very sparse large select bitmaps - but the majority of
    processes that only have a small number of file descriptors should be ok.
    [TBD: 832/256 might not be the best split for select or poll]

    I suspect more optimizations might be possible, but they would be more
    complicated. One way would be to cache the select/poll context over
    multiple system calls because typically the input values should be similar.
    Problem is when to flush the file descriptors out though.

    Signed-off-by: Andi Kleen
    Cc: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
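
    The stack-first, heap-fallback pattern described above can be sketched in
    userspace C. This is not the kernel code: malloc() stands in for kmalloc,
    the function name count_set_bits() is hypothetical, and the 256-byte
    cutoff simply mirrors the frontend figure quoted in the message.

```c
#include <stdlib.h>
#include <string.h>

#define STACK_BITMAP_BYTES 256 /* cutoff mirrored from the message */

/*
 * Copy an fd bitmap into a private working buffer and count the set
 * bits. Small bitmaps use the on-stack buffer; larger ones fall back
 * to the heap. *used_heap reports which path was taken.
 * Returns the bit count, or -1 if allocation fails.
 */
static int count_set_bits(const unsigned char *bits, size_t nbytes,
                          int *used_heap)
{
    unsigned char stack_buf[STACK_BITMAP_BYTES];
    unsigned char *buf = stack_buf;
    int count = 0;

    *used_heap = 0;
    if (nbytes > sizeof(stack_buf)) {
        buf = malloc(nbytes); /* slow path: too big for the stack */
        if (buf == NULL)
            return -1;
        *used_heap = 1;
    }

    memcpy(buf, bits, nbytes);
    for (size_t i = 0; i < nbytes; i++) {
        /* Clear the lowest set bit until the byte is zero. */
        for (unsigned char b = buf[i]; b; b &= (unsigned char)(b - 1))
            count++;
    }

    if (buf != stack_buf)
        free(buf);
    return count;
}
```

    Callers with few descriptors never touch the allocator, which is where
    the saving described in the message comes from.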
     
  • Add a proper prototype for autofs4_dentry_release() to autofs_i.h.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Add proper prototypes for fat_cache_init() and fat_cache_destroy() in
    msdos_fs.h.

    Signed-off-by: Adrian Bunk
    Acked-by: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

28 Mar, 2006

14 commits