09 May, 2008

1 commit


04 May, 2008

1 commit

  • This replaces the duplicated arch-specific versions of "sys_pipe()" with
    one unified implementation. This removes almost 250 lines of duplicated
    code.

    It's marked __weak, so that *if* an architecture wants to override the
    default implementation it can do so by simply having its own replacement
    version. Many architectures use alternate calling conventions for the
    'pipe()' system call for legacy reasons (i.e. traditional UNIX
    implementations often return the two file descriptors in registers).

    I still haven't changed the cris version even though Linus says the BKL
    isn't needed. The arch maintainer can easily do it if there are really
    no obstacles.

    Signed-off-by: Ulrich Drepper
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
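
The override mechanism described above can be sketched in userspace C. This is a minimal illustration, assuming a GCC- or Clang-compatible toolchain; the kernel's __weak expands to the same attribute. The names demo_pipe_impl(), GENERIC_IMPL and ARCH_IMPL are hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdio.h>

#define GENERIC_IMPL 0   /* hypothetical id for the shared default */
#define ARCH_IMPL    1   /* hypothetical id for an arch override */

/*
 * Weak default: if any other object file provides a strong
 * definition of demo_pipe_impl(), the linker uses that one instead.
 */
__attribute__((weak)) int demo_pipe_impl(void)
{
    return GENERIC_IMPL;
}

/*
 * An architecture would override the default by defining the same
 * symbol, without the weak attribute, in its own source file:
 *
 *     int demo_pipe_impl(void) { return ARCH_IMPL; }
 */

/* Print which implementation the linker selected. */
void demo_report_impl(void)
{
    assert(demo_pipe_impl() == GENERIC_IMPL || demo_pipe_impl() == ARCH_IMPL);
    printf("using %s implementation\n",
           demo_pipe_impl() == ARCH_IMPL ? "arch-specific" : "generic");
}
```

Linked on its own, this file resolves demo_pipe_impl() to the weak default. Adding a second file with the strong definition changes the result without touching the generic code.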
     

23 Apr, 2008

1 commit


19 Mar, 2008

1 commit

  • Some new uses of get_empty_filp() have crept in; switched
    to alloc_file() to make sure that pieces of initialization
    won't be missing.

    We really need to kill get_empty_filp().

    [AV] fixed dentry leak on failure exit in anon_inode_getfd()

    Cc: Erez Zadok
    Cc: Trond Myklebust
    Cc: "J Bruce Fields"
    Acked-by: Al Viro
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Dave Hansen
    Signed-off-by: Al Viro

    Dave Hansen
     

14 Feb, 2008

1 commit


09 Feb, 2008

1 commit


15 Oct, 2007

2 commits


27 Jul, 2007

1 commit


10 Jul, 2007

2 commits


09 May, 2007

1 commit

  • 1) Introduces a new method in 'struct dentry_operations'. This method,
    d_dname(), may be called from d_path() to build a pathname for
    special filesystems. It is called without locks.

    Future patches (if we succeed in having one common dentry for all
    pipes/sockets) may need to change the prototype of this method, but for
    now we use: char *d_dname(struct dentry *dentry, char *buffer, int buflen);

    2) Adds a dynamic_dname() helper function that eases d_dname() implementations.

    3) Defines a d_dname method for sockets: no more sprintf() at socket
    creation. This is delayed until someone accesses
    /proc/pid/fd/...

    4) Defines a d_dname method for pipes: no more sprintf() at pipe
    creation. This is delayed until someone accesses
    /proc/pid/fd/...

    A benchmark consisting of 1,000,000 calls to pipe()/close()/close() gives a
    *nice* speedup on my Pentium(M) 1.6 GHz:

    3.090 s instead of 3.450 s

    Signed-off-by: Eric Dumazet
    Acked-by: Christoph Hellwig
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
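
The lazy naming scheme in points 1) to 4) can be sketched as a userspace analogy. All demo_* names are hypothetical stand-ins, not kernel code. The callback keeps the prototype quoted in the message, and the helper writes the name at the end of the caller's buffer in the spirit of dynamic_dname(). Returning NULL on overflow is a simplification made for this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct demo_dentry;

/* Stand-in for the new method in struct dentry_operations. */
struct demo_dentry_ops {
    char *(*d_dname)(struct demo_dentry *dentry, char *buffer, int buflen);
};

struct demo_dentry {
    unsigned long ino;                 /* inode number, the only state needed */
    const struct demo_dentry_ops *ops;
};

/*
 * Format a name into a temporary buffer, then copy it to the END of
 * the caller's buffer. Returns a pointer to the start of the name,
 * or NULL if it does not fit (assumption of this sketch).
 */
char *demo_dynamic_dname(char *buffer, int buflen, const char *fmt, ...)
{
    char tmp[64];
    va_list args;
    int len;

    va_start(args, fmt);
    len = vsnprintf(tmp, sizeof(tmp), fmt, args);
    va_end(args);
    if (len < 0)
        return NULL;
    len += 1;                          /* account for the trailing NUL */
    if (len > (int)sizeof(tmp) || len > buflen)
        return NULL;
    buffer += buflen - len;
    return memcpy(buffer, tmp, (size_t)len);
}

/* The name is built only when somebody asks for it. */
char *demo_pipe_dname(struct demo_dentry *dentry, char *buffer, int buflen)
{
    return demo_dynamic_dname(buffer, buflen, "pipe:[%lu]", dentry->ino);
}

const struct demo_dentry_ops demo_pipe_ops = { .d_dname = demo_pipe_dname };

/* Stand-in for d_path(): defer to d_dname() when the method exists. */
char *demo_path(struct demo_dentry *dentry, char *buffer, int buflen)
{
    if (dentry->ops && dentry->ops->d_dname)
        return dentry->ops->d_dname(dentry, buffer, buflen);
    return NULL;
}
```

Creating a demo_dentry costs no string formatting at all. The sprintf-style work happens inside demo_path(), which mirrors the saving the benchmark above measures.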
     

18 Feb, 2007

1 commit

  • Provide an audit record of the descriptor pair returned by pipe() and
    socketpair(). Rewritten from the original posted to linux-audit by
    John D. Ramsdell.

    Signed-off-by: Al Viro

    Al Viro
     

21 Dec, 2006

1 commit


14 Dec, 2006

1 commit

  • - pipe/splice should use const pipe_buf_operations and file_operations

    - struct pipe_inode_info has an unused field "start" : get rid of it.

    Signed-off-by: Eric Dumazet
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

09 Dec, 2006

1 commit

  • This patch changes struct file to use struct path instead of having
    independent pointers to struct dentry and struct vfsmount, and converts all
    users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.

    Additionally, it adds two #defines to make the transition easier for users
    of f_dentry and f_vfsmnt.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
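
A small userspace mock of the layout change, offered as a sketch only. The demo_* types are hypothetical; the member names f_path, dentry and mnt and the two compatibility macros follow the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel types. */
struct demo_dentry   { const char *name; };
struct demo_vfsmount { const char *devname; };

/* The pair that used to be two independent pointers in struct file. */
struct demo_path {
    struct demo_vfsmount *mnt;
    struct demo_dentry   *dentry;
};

struct demo_file {
    struct demo_path f_path;
/* Transition aids: the old spellings expand to the new members. */
#define f_dentry f_path.dentry
#define f_vfsmnt f_path.mnt
};

/* Code written against the old field name still compiles unchanged. */
const char *demo_file_name(const struct demo_file *file)
{
    return file->f_dentry ? file->f_dentry->name : NULL;
}
```

Because the macros are plain textual aliases, converted and unconverted callers can coexist until the old names are removed.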
     

08 Dec, 2006

1 commit

  • We currently insert pipe dentries into the global dentry hashtable. This
    is suboptimal because there is currently no way these entries can be used
    for a lookup() (/proc/xxx/fd/xxx uses a different mechanism). Inserting
    them in the dentry hashtable slows dcache lookups.

    To let __d_path() still work correctly (i.e. not append " (deleted)" to
    the dentry name), we do the following:

    - Right after d_alloc(), pretend they are hashed by clearing the
    DCACHE_UNHASHED bit.

    - Call d_instantiate() instead of d_add(): the dentry is not inserted in
    the hash table.

    __d_path() & friends work as intended during the dentry lifetime.

    - At dismantle time, once dput() must clear the dentry, set the
    DCACHE_UNHASHED bit again inside the custom d_delete() function provided
    by the pipe code, so that dput() can just kill it.

    This patch, combined with "avoid RCU for never hashed dentries", reduced
    the time of { pipe(p); close(p[0]); close(p[1]); } on my UP machine (1.6GHz
    Pentium-M) from 3.23 us to 2.86 us. (This patch does not depend on other
    patches; only the benchmark results do.)

    Signed-off-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
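
The micro-benchmark quoted above can be approximated with a few lines of POSIX C. This is a sketch, not the author's original test program: bench_pipe_cycle() is a hypothetical helper, and the choice of CLOCK_MONOTONIC is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Time `iterations` rounds of pipe(); close(); close(); and return
 * the average cost of one round in microseconds, or -1.0 on error.
 */
double bench_pipe_cycle(long iterations)
{
    struct timespec start, end;
    int p[2];
    long i;
    double ns;

    assert(iterations > 0);
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (i = 0; i < iterations; i++) {
        if (pipe(p) != 0)
            return -1.0;
        close(p[0]);
        close(p[1]);
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    ns = (double)(end.tv_sec - start.tv_sec) * 1e9
       + (double)(end.tv_nsec - start.tv_nsec);
    return ns / 1e3 / (double)iterations;
}
```

Calling bench_pipe_cycle(1000000) and printing the result gives a per-round figure comparable in kind to the numbers in the message. Absolute values depend on the machine and kernel.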
     

01 Oct, 2006

2 commits

  • Split the big and hard to read do_pipe function into smaller pieces.

    This creates new create_write_pipe/free_write_pipe/create_read_pipe
    functions. These functions are made global so that they can be used by
    other parts of the kernel.

    The resulting code is more generic and easier to read, with cleaner error
    handling and fewer gotos.

    [akpm@osdl.org: cleanup]
    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • This patch removes readv() and writev() methods and replaces them with
    aio_read()/aio_write() methods.

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

27 Sep, 2006

1 commit

  • This eliminates the i_blksize field from struct inode. Filesystems that want
    to provide a per-inode st_blksize can do so by providing their own getattr
    routine instead of using the generic_fillattr() function.

    Note that some filesystems were providing pretty much random (and incorrect)
    values for i_blksize.

    [bunk@stusta.de: cleanup]
    [akpm@osdl.org: generic_fillattr() fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     

23 Jun, 2006

1 commit

  • Extend the get_sb() filesystem operation to take an extra argument that
    permits the VFS to pass in the target vfsmount that defines the mountpoint.

    The filesystem is then required to manually set the superblock and root dentry
    pointers. For most filesystems, this should be done with simple_set_mnt()
    which will set the superblock pointer and then set the root dentry to the
    superblock's s_root (as per the old default behaviour).

    The get_sb() op now returns an integer as there's now no need to return the
    superblock pointer.

    This patch permits a superblock to be implicitly shared amongst several mount
    points, such as can be done with NFS to avoid potential inode aliasing. In
    such a case, simple_set_mnt() would not be called, and instead the mnt_root
    and mnt_sb would be set directly.

    The patch also makes the following changes:

    (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
    pointer argument and return an integer, so most filesystems have to change
    very little.

    (*) If one of the convenience functions is not used, then get_sb() should
    normally call simple_set_mnt() to instantiate the vfsmount. This will
    always return 0, and so can be tail-called from get_sb().

    (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
    dcache upon superblock destruction rather than shrink_dcache_anon().

    This is required because the superblock may now have multiple trees that
    aren't actually bound to s_root, but that still need to be cleaned up. The
    currently called functions assume that the whole tree is rooted at s_root,
    and that anonymous dentries are not the roots of trees which results in
    dentries being left unculled.

    However, with the way NFS superblock sharing is currently set to be
    implemented, these assumptions are violated: the root of the filesystem is
    simply a dummy dentry and inode (the real inode for '/' may well be
    inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
    with child trees.

    [*] Anonymous until discovered from another tree.

    (*) The documentation has been adjusted, including the additional bit of
    changing ext2_* into foo_* in the documentation.

    [akpm@osdl.org: convert ipath_fs, do other stuff]
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
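
The new calling convention can be mocked in userspace as follows. This is a sketch under stated assumptions: every demo_* name is hypothetical, the structures are reduced to the fields the message mentions, and the reference taken on the root dentry is an assumption of the sketch rather than something the message states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel objects. */
struct demo_dentry      { int refcount; };
struct demo_super_block { struct demo_dentry *s_root; };
struct demo_vfsmount {
    struct demo_super_block *mnt_sb;
    struct demo_dentry      *mnt_root;
};

/* Stand-in for taking a dentry reference (assumed behaviour). */
static struct demo_dentry *demo_dget(struct demo_dentry *dentry)
{
    dentry->refcount++;
    return dentry;
}

/*
 * What the message says simple_set_mnt() does: set the superblock
 * pointer, then root the mount at the superblock's s_root.
 * It always returns 0.
 */
int demo_simple_set_mnt(struct demo_vfsmount *mnt,
                        struct demo_super_block *sb)
{
    mnt->mnt_sb = sb;
    mnt->mnt_root = demo_dget(sb->s_root);
    return 0;
}

/*
 * New-style get_sb(): returns an integer status and fills in the
 * vfsmount instead of returning the superblock pointer.
 */
int demo_get_sb(struct demo_super_block *sb, struct demo_vfsmount *mnt)
{
    if (sb == NULL || sb->s_root == NULL)
        return -1;
    return demo_simple_set_mnt(mnt, sb);   /* tail call, always 0 */
}
```

Two mounts can point at the same superblock here. A filesystem that shares a superblock the way the message describes for NFS would skip demo_simple_set_mnt() and set mnt_root to a different dentry directly.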
     

02 May, 2006

4 commits

  • Apply the same rules as the anon pipe pages, only allow stealing
    if no one else is using the page.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • The pipe ->map() method uses kmap() to virtually map the pages, which
    is both slow and has known scalability issues on SMP. This patch enables
    atomic copying of pipe pages, by pre-faulting data and using kmap_atomic()
    instead.

    lmbench bw_pipe and lat_pipe measurements agree this is a Good Thing. Here
    are results from that on a UP machine with highmem (1.5GiB of RAM), running
    first a UP kernel, then an SMP kernel, and finally a patched SMP kernel.

    Vanilla-UP:
    Pipe bandwidth: 1622.28 MB/sec
    Pipe bandwidth: 1610.59 MB/sec
    Pipe bandwidth: 1608.30 MB/sec
    Pipe latency: 7.3275 microseconds
    Pipe latency: 7.2995 microseconds
    Pipe latency: 7.3097 microseconds

    Vanilla-SMP:
    Pipe bandwidth: 1382.19 MB/sec
    Pipe bandwidth: 1317.27 MB/sec
    Pipe bandwidth: 1355.61 MB/sec
    Pipe latency: 9.6402 microseconds
    Pipe latency: 9.6696 microseconds
    Pipe latency: 9.6153 microseconds

    Patched-SMP:
    Pipe bandwidth: 1578.70 MB/sec
    Pipe bandwidth: 1579.95 MB/sec
    Pipe bandwidth: 1578.63 MB/sec
    Pipe latency: 9.1654 microseconds
    Pipe latency: 9.2266 microseconds
    Pipe latency: 9.1527 microseconds

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • The ->map() function is really expensive on highmem machines right now,
    since it has to use the slower kmap() instead of kmap_atomic(). Splice
    rarely needs to access the virtual address of a page, so it's a waste
    of time doing it.

    Introduce ->pin() to take over the responsibility of making sure the
    page data is valid. ->map() is then reduced to just kmap(). That way we
    can also share most of the pipe buffer ops between pipe.c and splice.c.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Found by Oleg Nesterov, fixed by me.

    - Only allow full pages to go to the page cache.
    - Check page != buf->page instead of using PIPE_BUF_FLAG_STOLEN.
    - Remember to clear 'stolen' if add_to_page_cache() fails.

    And as a cleanup on that:

    - Make the bottom fall-through logic a little less convoluted. Also make
    the steal path hold an extra reference to the page, so we don't have
    to differentiate between stolen and non-stolen at the end.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

30 Apr, 2006

1 commit


11 Apr, 2006

5 commits


10 Apr, 2006

1 commit

  • Separate out the 'internal pipe object' abstraction, and make it
    usable by splice. This cleans up and fixes several aspects of the
    internal splice APIs and the pipe code:

    - pipes: the allocation and freeing of pipe_inode_info is now more symmetric
    and more streamlined with existing kernel practices.

    - splice: micro-optimization: less pointer dereferencing in splice
    methods.

    Signed-off-by: Ingo Molnar

    Update XFS for the ->splice_read/->splice_write changes.

    Signed-off-by: Jens Axboe

    Ingo Molnar
     

03 Apr, 2006

2 commits

  • Originally from Nick Piggin, just adapted to the newer branch.

    You can't check PageLRU without holding zone->lru_lock. The page
    release code can get away with it only because the page refcount is 0 at
    that point. Also, you can't reliably remove pages from the LRU unless
    the refcount is 0. Ever.

    Signed-off-by: Nick Piggin
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • By cleaning up the writeback logic (killing write_one_page() and the manual
    set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
    just keep it local in pipe_to_file().

    This also adds dirty page balancing logic and O_SYNC handling.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

31 Mar, 2006

2 commits

  • This enables the caller to migrate pages from one address space page
    cache to another. In buzzword marketing terms, you can do zero-copy file
    copies!

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     
  • This adds support for the sys_splice system call. Using a pipe as a
    transport, it can connect to files or sockets (latter as output only).

    From the splice.c comments:

    "splice": joining two ropes together by interweaving their strands.

    This is the "extended pipe" functionality, where a pipe is used as
    an arbitrary in-memory buffer. Think of a pipe as a small kernel
    buffer that you can use to transfer data from one end to the other.

    The traditional unix read/write is extended with a "splice()" operation
    that transfers data buffers to or from a pipe buffer.

    Named by Larry McVoy, original implementation from Linus, extended by
    Jens to support splicing to files and fixing the initial implementation
    bugs.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
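
From userspace, the "pipe as in-memory buffer" idea looks roughly like the sketch below. It uses the splice() wrapper as documented for glibc on Linux (requires _GNU_SOURCE). The helper splice_copy() is a hypothetical name, and passing 0 for the flags is a choice made for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to `len` bytes from in_fd to out_fd through an
 * intermediate pipe, without copying the data through a userspace
 * buffer. Returns the number of bytes moved, or -1 on error.
 */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) != 0)
        return -1;

    while (len > 0) {
        /* Source -> pipe: the pipe acts as the kernel buffer. */
        ssize_t in = splice(in_fd, NULL, p[1], NULL, len, 0);
        if (in < 0) {
            total = -1;
            break;
        }
        if (in == 0)            /* end of input */
            break;

        /* Pipe -> destination: drain everything just queued. */
        ssize_t left = in;
        while (left > 0) {
            ssize_t out = splice(p[0], NULL, out_fd, NULL,
                                 (size_t)left, 0);
            if (out <= 0) {
                total = -1;
                goto done;
            }
            left -= out;
        }
        total += in;
        len -= (size_t)in;
    }
done:
    close(p[0]);
    close(p[1]);
    return total;
}
```

At least one side of every splice() call must be a pipe, which is why the helper creates its own. Which descriptor types are accepted on the other side depends on the kernel version.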
     

29 Mar, 2006

1 commit

  • This is a conversion to make the various file_operations structs in fs/
    const. Basically a regexp job, with a few manual fixups.

    The goal is both to increase correctness (harder to accidentally write to
    shared data structures) and to reduce the false sharing of cachelines with
    things that get dirty in .data (while .rodata is nicely read only and thus
    cache clean).

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
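
The effect of the conversion can be illustrated with a reduced operations table. This is a sketch; the demo_* names are hypothetical and the struct is far smaller than the kernel's file_operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced operations table. */
struct demo_file_operations {
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

/* Pretend to fill the buffer completely. */
static long demo_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = 0;
    return (long)len;
}

/* Pretend to consume the whole buffer. */
static long demo_write(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;
}

/*
 * Declared const: typical toolchains place this object in read-only
 * data, and an assignment such as demo_pipe_fops.read = NULL is
 * rejected at compile time.
 */
const struct demo_file_operations demo_pipe_fops = {
    .read  = demo_read,
    .write = demo_write,
};

/* Callers only ever need a pointer-to-const to dispatch. */
long demo_do_read(const struct demo_file_operations *fops,
                  char *buf, size_t len)
{
    if (fops == NULL || fops->read == NULL)
        return -1;
    return fops->read(buf, len);
}
```

Since the table is never written after build time, keeping it out of writable data costs nothing at the call sites.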
     

27 Mar, 2006

1 commit

  • While hunting with oprofile on an SMP platform, I discovered that dentry
    lookups were slowed down because d_hash_mask, d_hash_shift and
    dentry_hashtable were in a cache line that contained inodes_stat. So each
    time inodes_stat is changed by a CPU, other CPUs have to refill their
    cache line.

    This patch moves some variables to the __read_mostly section, in order to
    avoid false sharing. RCU dentry lookups can go full speed.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
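
The false-sharing problem can be illustrated in userspace. Note the substitution: the patch groups variables into a dedicated __read_mostly section, whereas this sketch separates them with cache-line alignment, which is a different technique with a similar goal. The 64-byte line size and all demo_* names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64   /* assumed cache line size in bytes */

/* Frequently written counters (the role inodes_stat plays above). */
struct demo_inodes_stat {
    long nr_inodes;
    long nr_unused;
} __attribute__((aligned(DEMO_CACHE_LINE)));

/* Set once at boot, then only read on every lookup. */
struct demo_dcache_params {
    unsigned int d_hash_mask;
    unsigned int d_hash_shift;
    void *dentry_hashtable;
} __attribute__((aligned(DEMO_CACHE_LINE)));

struct demo_inodes_stat   demo_inodes_stat;
struct demo_dcache_params demo_dcache_params;

/* Return 1 if both addresses fall in the same cache line. */
int demo_same_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / DEMO_CACHE_LINE) ==
           ((uintptr_t)b / DEMO_CACHE_LINE);
}
```

With the hot counters on their own line, a CPU updating demo_inodes_stat no longer invalidates the line that holds the lookup parameters on other CPUs.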
     

26 Mar, 2006

1 commit


09 Mar, 2006

1 commit