27 Jun, 2006

1 commit


20 Jun, 2006

1 commit

  • The following series of patches introduces a kernel API for inotify,
    making it possible for kernel modules to benefit from inotify's
    mechanism for watching inodes. With these patches, inotify will
    maintain for each caller a list of watches (via an embedded struct
    inotify_watch), where each inotify_watch is associated with a
    corresponding struct inode. The caller registers an event handler and
    specifies for which filesystem events their event handler should be
    called per inotify_watch.

    Signed-off-by: Amy Griffis
    Acked-by: Robert Love
    Acked-by: John McCutchan
    Signed-off-by: Al Viro

    Amy Griffis
     

18 May, 2006

1 commit


01 Apr, 2006

1 commit

  • Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
    fadvise() additions and provide the same functionality in a new
    sys_sync_file_range() syscall instead.
    Reasons:

    - It's more flexible. Things which would require two or three syscalls with
    fadvise() can be done in a single syscall.

    - Using fadvise() in this manner is something not covered by POSIX.

    The patch wires up the syscall for x86.

    The syscall is implemented in the new fs/sync.c. The intention is that we can
    move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

    Documentation for the syscall is in fs/sync.c.

    A test app (sync_file_range.c) is in
    http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

    The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
    say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for
    NFS_DATA_SYNC which is hopefully the more common."

    Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
    the queue is congested. This is trivial to fix: add a new flag bit, set
    wbc->nonblocking. But I'm not sure that we want to expose implementation
    details down to that level.

    Note: we can sync an fd which wasn't opened for writing. (The same is true
    of fsync() and fdatasync().)

    Note: the code takes some care to handle attempts to sync file contents
    outside the 16TB offset on 32-bit machines. It makes such attempts appear to
    succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such
    requests fail...
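
    As a rough user-space illustration of the "single syscall" point above,
    the sketch below writes a buffer and then asks the kernel to write out and
    wait on just that byte range. It assumes the glibc wrapper
    sync_file_range() and the flag names SYNC_FILE_RANGE_WAIT_BEFORE and
    SYNC_FILE_RANGE_WAIT_AFTER from present-day headers (only
    SYNC_FILE_RANGE_WRITE is named in this message); write_and_sync() is a
    hypothetical helper, not part of the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create/truncate `path`, write `len` bytes from `buf`
 * at offset 0, then flush that byte range with one sync_file_range() call.
 * Returns 0 on success, -1 on error.
 */
int write_and_sync(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    int rc = -1;
    if (pwrite(fd, buf, len, 0) == (ssize_t)len) {
        /*
         * One call combines three steps: wait for writeout already in
         * flight, start writeout of dirty pages in the range, and wait
         * for that writeout to finish. File metadata is not flushed, so
         * this is not a replacement for fsync().
         */
        rc = sync_file_range(fd, 0, (off64_t)len,
                             SYNC_FILE_RANGE_WAIT_BEFORE |
                             SYNC_FILE_RANGE_WRITE |
                             SYNC_FILE_RANGE_WAIT_AFTER);
    }

    close(fd);
    return rc;
}
```

    Passing only SYNC_FILE_RANGE_WRITE would start writeout without waiting,
    which is the `async' mode discussed in the first note above.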

    Cc: Nick Piggin
    Cc: Michael Kerrisk
    Cc: Ulrich Drepper
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

31 Mar, 2006

1 commit

  • This adds support for the sys_splice system call. Using a pipe as a
    transport, it can connect to files or sockets (latter as output only).

    From the splice.c comments:

    "splice": joining two ropes together by interweaving their strands.

    This is the "extended pipe" functionality, where a pipe is used as
    an arbitrary in-memory buffer. Think of a pipe as a small kernel
    buffer that you can use to transfer data from one end to the other.

    The traditional unix read/write is extended with a "splice()" operation
    that transfers data buffers to or from a pipe buffer.

    Named by Larry McVoy, original implementation from Linus, extended by
    Jens to support splicing to files and fixing the initial implementation
    bugs.
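
    To make the "pipe as a transport" idea concrete, here is a minimal
    user-space sketch that copies bytes between two descriptors by splicing
    into a pipe and back out again. It assumes the glibc splice() wrapper and
    the SPLICE_F_MOVE flag from present-day headers; splice_copy() is a
    hypothetical helper name, and the descriptors are assumed to refer to
    objects that support splicing (regular files here).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to `len` bytes from in_fd to out_fd using a
 * pipe as the intermediate in-kernel buffer. Returns the number of bytes
 * copied, or -1 on error.
 */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) != 0)
        return -1;

    while (len > 0 && total >= 0) {
        /* Source -> pipe: at most one pipe's worth of data per call. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len, SPLICE_F_MOVE);
        if (n < 0)
            total = -1;
        if (n <= 0)
            break;              /* error or end of input */
        len -= (size_t)n;

        /* Pipe -> destination: drain everything we just queued. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m <= 0) {
                total = -1;
                break;
            }
            n -= m;
            total += m;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```

    The data never passes through a user-space buffer; the pipe is the only
    staging area between the two descriptors.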

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

24 Mar, 2006

1 commit


11 Jan, 2006

1 commit

  • Now that all these entries in the arch ioctl32.c files are gone [1], we can
    build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a
    special do_ioctl32_pointer handler for s390 so the compat_ptr call is done.
    This is not needed but harmless on all other architectures. Also remove some
    superfluous includes in fs/compat_ioctl.c.

    Tested on ppc64.

    [1] parisc still had its PPP handler left, which is not fully correct
    for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd
    kick in for all netdevice users. We can introduce a proper handler
    in one of the next patch series by adding a compat_ioctl method to
    struct net_device but for now let's just kill it - parisc doesn't
    compile in mainline anyway and I don't want this to block this
    patchset.

    Signed-off-by: Christoph Hellwig
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

09 Jan, 2006

1 commit

  • Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to
    discard as much pagecache and/or reclaimable slab objects as it can. This
    operation requires root permissions.

    It won't drop dirty data, so the user should run `sync' first.

    Caveats:

    a) Holds inode_lock for exorbitant amounts of time.

    b) Needs to be taught about NUMA nodes: propagate these all the way through
    so the discarding can be controlled on a per-node basis.

    This is a debugging feature: useful for getting consistent results between
    filesystem benchmarks. We could possibly put it under a config option, but
    it's less than 300 bytes.
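
    A small sketch of how a benchmark harness might use this from C. The
    message above does not spell out the values to write; the 1/2/3 encoding
    used here (pagecache, slab objects, both) is taken from the kernel's
    sysctl documentation and should be treated as an assumption. The helper
    name drop_caches() and its path parameter are hypothetical; the path is a
    parameter so the routine can be exercised against an ordinary file
    without root.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush dirty data with sync(), then write `mode`
 * (assumed: 1 = pagecache, 2 = reclaimable slab objects, 3 = both) to the
 * control file at `ctl_path`, normally /proc/sys/vm/drop_caches.
 * Returns 0 on success, -1 on error.
 */
int drop_caches(const char *ctl_path, int mode)
{
    assert(ctl_path != NULL);
    if (mode < 1 || mode > 3)
        return -1;

    char text[2] = { (char)('0' + mode), '\n' };

    /* Dirty pages are not dropped, so write them back first. */
    sync();

    int fd = open(ctl_path, O_WRONLY);
    if (fd < 0)
        return -1;      /* on the real file this usually means "not root" */

    ssize_t n = write(fd, text, sizeof text);
    close(fd);
    return n == (ssize_t)sizeof text ? 0 : -1;
}
```

    A benchmark would call drop_caches("/proc/sys/vm/drop_caches", 3) as root
    before each run so that every run starts from a cold cache.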

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

04 Jan, 2006

2 commits


08 Nov, 2005

1 commit


10 Sep, 2005

2 commits

  • This patch adds the FUSE filesystem to MAINTAINERS, fs/Kconfig and
    fs/Makefile.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • OVERVIEW

    V9FS is a distributed file system for Linux which provides an
    implementation of the Plan 9 resource sharing protocol 9P. It can be
    used to share all sorts of resources: static files, synthetic file servers
    (such as /proc or /sys), devices, and application file servers (such as
    FUSE).

    BACKGROUND

    Plan 9 (http://plan9.bell-labs.com/plan9) is a research operating
    system and associated applications suite developed by the Computing
    Science Research Center of AT&T Bell Laboratories (now a part of
    Lucent Technologies), the same group that developed UNIX, C, and C++.
    Plan 9 was initially released in 1993 to universities, and then made
    generally available in 1995. Its core operating systems code laid the
    foundation for the Inferno Operating System released as a product by
    Lucent Bell-Labs in 1997. The Inferno venture was the only commercial
    embodiment of Plan 9 and is currently maintained as a product by Vita
    Nuova (http://www.vitanuova.com). After updated releases in 2000 and
    2002, Plan 9 was open-sourced under the OSI approved Lucent Public
    License in 2003.

    The Plan 9 project was started by Ken Thompson and Rob Pike in 1985.
    Their intent was to explore potential solutions to some of the
    shortcomings of UNIX in the face of the widespread use of high-speed
    networks to connect machines. In UNIX, networking was an afterthought
    and UNIX clusters became little more than a network of stand-alone
    systems. Plan 9 was designed from first principles as a seamless
    distributed system with integrated secure network resource sharing.
    Applications and services were architected in such a way as to allow
    for implicit distribution across a cluster of systems. Configuring an
    environment to use remote application components or services in place
    of their local equivalent could be achieved with a few simple command
    line instructions. For the most part, application implementations
    operated independent of the location of their actual resources.

    Commercial operating systems haven't changed much in the 20 years
    since Plan 9 was conceived. Network and distributed systems support is
    provided by a patchwork of middle-ware, with an endless number of
    packages supplying pieces of the puzzle. Matters are complicated by
    the use of different complicated protocols for individual services,
    and separate implementations for kernel and application resources.
    The V9FS project (http://v9fs.sourceforge.net) is an attempt to bring
    Plan 9's unified approach to resource sharing to Linux and other
    operating systems via support for the 9P2000 resource sharing
    protocol.

    V9FS HISTORY

    V9FS was originally developed by Ron Minnich and Maya Gokhale at Los
    Alamos National Labs (LANL) in 1997. In November of 2001, Greg Watson
    set up a SourceForge project as a public repository for the code which
    supported the Linux 2.4 kernel.

    About a year ago, I picked up the initial attempt Ron Minnich had
    made to provide 2.6 support and got the code integrated into a 2.6.5
    kernel. I then went through a line-for-line re-write attempting to
    clean-up the code while more closely following the Linux Kernel style
    guidelines. I co-authored a paper with Ron Minnich on the V9FS Linux
    support including performance comparisons to NFSv3 using Bonnie and
    PostMark - this paper appeared at the USENIX/FREENIX 2005
    conference in April 2005:
    ( http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html ).

    CALL FOR PARTICIPATION/REQUEST FOR COMMENTS

    Our 2.6 kernel support is stabilizing and we'd like to begin pursuing
    its integration into the official kernel tree. We would appreciate any
    review, comments, critiques, and additions from this community and are
    actively seeking people to join our project and help us produce
    something that would be acceptable and useful to the Linux community.

    STATUS

    The code is reasonably stable, although there are no doubt corner cases
    our regression tests haven't discovered yet. It is in regular use by several
    of the developers and has been tested on x86 and PowerPC
    (32-bit and 64-bit) in both small and large (LANL cluster) deployments.
    Our current regression tests include fsx, bonnie, and postmark.

    It was our intention to keep things as simple as possible for this
    release -- trying to focus on correctness within the core of the
    protocol support versus a rich set of features. For example: a more
    complete security model and cache layer are in the road map, but
    excluded from this release. Additionally, we have removed support for
    mmap operations at Al Viro's request.

    PERFORMANCE

    Detailed performance numbers and analysis are included in the FREENIX
    paper, but we show comparable performance to NFSv3 for large file
    operations based on the Bonnie benchmark, and superior performance for
    many small file operations based on the PostMark benchmark. Somewhat
    preliminary graphs (from the FREENIX paper) are available
    (http://v9fs.sourceforge.net/perf/index.html).

    RESOURCES

    The source code is available in a few different forms:

    tarballs: http://v9fs.sf.net
    CVSweb: http://cvs.sourceforge.net/viewcvs.py/v9fs/linux-9p/
    CVS: :pserver:anonymous@cvs.sourceforge.net:/cvsroot/v9fs/linux-9p
    Git: rsync://v9fs.graverobber.org/v9fs (webgit: http://v9fs.graverobber.org)
    9P: tcp!v9fs.graverobber.org!6564

    The user-level server is available from either the Plan 9 distribution
    or from http://v9fs.sf.net
    Other support applications are still being developed, but preliminary
    versions can be downloaded from SourceForge.

    Documentation on the protocol has historically been the Plan 9 Man
    pages (http://plan9.bell-labs.com/sys/man/5/INDEX.html), but there is
    an effort under way to write a more complete Internet-Draft style
    specification (http://v9fs.sf.net/rfc).

    There are a couple of mailing lists supporting v9fs, but the most used
    is v9fs-developer@lists.sourceforge.net -- please direct/cc your
    comments there so the other v9fs contributors can participate in the
    conversation. There is also an IRC channel: irc://freenode.net/#v9fs

    This part of the patch contains Documentation, Makefiles, and configuration
    file changes.

    Signed-off-by: Eric Van Hensbergen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Van Hensbergen
     

08 Sep, 2005

1 commit

  • Here's the latest version of relayfs, against linux-2.6.11-mm2. I'm hoping
    you'll consider putting this version back into your tree - the previous
    rounds of comment seem to have shaken out all the API issues and the number
    of comments on the code itself has also steadily dwindled.

    This patch is essentially the same as the relayfs redux part 5 patch, with
    some minor changes based on reviewer comments. Thanks again to Pekka
    Enberg for those. The patch size without documentation is now a little
    smaller at just over 40k. Here's a detailed list of the changes:

    - removed the attribute_flags in relay open and changed it to a
    boolean specifying either overwrite or no-overwrite mode, and removed
    everything referencing the attribute flags.
    - added a check for NULL names in relayfs_create_entry()
    - got rid of the unnecessary multiple labels in relay_create_buf()
    - some minor simplification of relay_alloc_buf() which got rid of a
    couple params
    - updated the Documentation

    In addition, this version (through code contained in the relay-apps tarball
    linked to below, not as part of the relayfs patch) tries to make it as easy
    as possible to create the cooperating kernel/user pieces of a typical and
    common type of logging application, one where kernel logging is kicked off
    when a user space data collection app starts and stops when the collection
    app exits, with the data being automatically logged to disk in between. To
    create this type of application, you basically just include a header file
    (relay-app.h, included in the relay-apps tarball) in your kernel module,
    define a couple of callbacks and call an initialization function, and on
    the user side call a single function that sets up and continuously monitors
    the buffers, and writes data to files as it becomes available. Channels
    are created when the collection app is started and destroyed when it exits,
    not when the kernel module is inserted, so different channel buffer sizes
    can be specified for each separate run via command-line options. See the
    README in the relay-apps tarball for details.

    Also included in the relay-apps tarball are a couple examples
    demonstrating how you can use this to create quick and dirty kernel
    logging/debugging applications. They are:

    - tprintk, short for 'tee printk', which temporarily puts a kprobe on
    printk() and writes a duplicate stream of printk output to a relayfs
    channel. This could be used anywhere there's printk() debugging code
    in the kernel which you'd like to exercise, but would rather not have
    your system logs cluttered with debugging junk. You'd probably want
    to kill klogd while you do this, otherwise there wouldn't be much
    point (since putting a kprobe on printk() doesn't change the output
    of printk()). I've used this method to temporarily divert the packet
    logging output of the iptables LOG target from the system logs to
    relayfs files instead, for instance.

    - klog, which just provides a printk-like formatted logging function
    on top of relayfs. Again, you can use this to keep stuff out of your
    system logs if used in place of printk.

    The example applications can be found here:

    http://prdownloads.sourceforge.net/dprobes/relay-apps.tar.gz?download

    From: Christoph Hellwig

    avoid lookup_hash usage in relayfs

    Signed-off-by: Tom Zanussi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tom Zanussi
     

13 Jul, 2005

1 commit

  • inotify is intended to correct the deficiencies of dnotify, particularly
    its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per directory
    that you intend to watch. This quickly results in too many
    open files and pins removable media, preventing unmount.
    * dnotify is directory-based. You only learn about changes to
    directories. Sure, a change to a file in a directory affects
    the directory, but you are then forced to keep a cache of
    stat structures.
    * dnotify's interface to user-space is awful. Signals?

    inotify provides a more usable, simple, powerful solution to file change
    notification:

    * inotify's interface is a system call that returns a fd, not SIGIO.
    You get a single fd, which is select()-able.
    * inotify has an event that says "the filesystem that the item
    you were watching is on was unmounted."
    * inotify can watch directories or files.

    Inotify is currently used by Beagle (a desktop search infrastructure),
    Gamin (a FAM replacement), and other projects.

    See Documentation/filesystems/inotify.txt.
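
    For readers who have not used the interface, the following is a minimal
    user-space sketch of the model described above: one fd, a watch on a
    directory, and events delivered through read(). It uses the
    inotify_init() and inotify_add_watch() calls as exposed by present-day
    glibc in <sys/inotify.h>; watch_dir() and saw_event() are hypothetical
    helper names, and the choice of event mask is an assumption made for the
    example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an inotify instance and watch `dir` for
 * entries being created, deleted or modified. Returns the inotify fd
 * (usable with select()/poll()), or -1 on error.
 */
int watch_dir(const char *dir)
{
    assert(dir != NULL);

    int ifd = inotify_init();
    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, dir, IN_CREATE | IN_DELETE | IN_MODIFY) < 0) {
        close(ifd);
        return -1;
    }
    return ifd;
}

/*
 * Hypothetical helper: do one blocking read() on the inotify fd and scan
 * the returned batch of events. Returns 1 if an event matching `mask` for
 * the entry called `name` is present, 0 if not, -1 on read error.
 */
int saw_event(int ifd, uint32_t mask, const char *name)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));

    ssize_t n = read(ifd, buf, sizeof buf);
    if (n <= 0)
        return -1;

    /* Events are variable-length: a fixed header followed by `len` bytes
     * of NUL-padded name. */
    for (char *p = buf; p < buf + n; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        if ((ev->mask & mask) && ev->len > 0 && strcmp(ev->name, name) == 0)
            return 1;
        p += sizeof(struct inotify_event) + ev->len;
    }
    return 0;
}
```

    Unlike dnotify, the watched directory itself is not held open here; only
    the single inotify fd is, however many watches are added to it.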

    Signed-off-by: Robert Love
    Cc: John McCutchan
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Love
     

28 Jun, 2005

1 commit

  • This updates the CFQ io scheduler to the new time sliced design (cfq
    v3). It provides full process fairness, while giving excellent
    aggregate system throughput even for many competing processes. It
    supports io priorities, either inherited from the cpu nice value or set
    directly with the ioprio_get/set syscalls. The latter closely mimic
    set/getpriority.
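
    A hedged sketch of setting an io priority directly from user space.
    glibc does not provide wrappers for these syscalls, so the example goes
    through syscall(); the constants below mirror the kernel's ioprio header
    as I understand it (class in the upper bits, shifted by 13) and are
    assumptions rather than something stated in this message. The helper
    names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to mirror the kernel's ioprio definitions. */
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))
#define IOPRIO_PRIO_CLASS(v) ((v) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_DATA(v)  ((v) & ((1 << IOPRIO_CLASS_SHIFT) - 1))

enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };
enum { IOPRIO_WHO_PROCESS = 1, IOPRIO_WHO_PGRP, IOPRIO_WHO_USER };

/*
 * Hypothetical helper: set the calling process's io priority.
 * `level` runs from 0 (highest) to 7 (lowest) within the class.
 * Returns 0 on success, -1 on error.
 */
int set_own_io_priority(int cls, int level)
{
    assert(level >= 0 && level <= 7);
    /* who = 0 means "the calling process", as with setpriority(). */
    return (int)syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                        IOPRIO_PRIO_VALUE(cls, level));
}

/* Hypothetical helper: return the caller's raw io priority value,
 * or -1 on error. */
int get_own_io_priority(void)
{
    return (int)syscall(SYS_ioprio_get, IOPRIO_WHO_PROCESS, 0);
}
```

    For example, a background job could call
    set_own_io_priority(IOPRIO_CLASS_BE, 7) to request the lowest
    best-effort priority, which does not need special privileges.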

    This import is based on my latest from -mm.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

23 Jun, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds