08 Dec, 2010

2 commits

  • In kernel ABI version 7.16 and later, a FUSE_IOCTL_RETRY reply from an
    unrestricted IOCTL request shall return with an array of 'struct
    fuse_ioctl_iovec' instead of 'struct iovec'. This fixes the ABI
    ambiguity of 32bit vs. 64bit.
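
The ambiguity comes from 'struct iovec' holding a pointer and a size_t, so its size depends on the word size of the process that built it. A minimal sketch of the fixed-width idea follows. The names `ioctl_iovec64` and `iovec_to_fixed` are hypothetical and chosen for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical fixed-width vector entry: 16 bytes on every ABI. */
struct ioctl_iovec64 {
    uint64_t base;  /* address in the caller's address space */
    uint64_t len;   /* length of the region in bytes */
};

/* Convert native iovecs to the fixed-width form.
 * Returns the number of entries written (at most out_cap). */
size_t iovec_to_fixed(const struct iovec *in, size_t n,
                      struct ioctl_iovec64 *out, size_t out_cap)
{
    size_t i;

    assert(in != NULL && out != NULL);
    if (n > out_cap)
        n = out_cap;
    for (i = 0; i < n; i++) {
        out[i].base = (uint64_t)(uintptr_t)in[i].iov_base;
        out[i].len = (uint64_t)in[i].iov_len;
    }
    return n;
}
```

A 32-bit and a 64-bit server filling this structure produce the same byte layout, which is the property the commit is after.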

    Reported-by: "ccmail111"
    Signed-off-by: Miklos Szeredi
    CC: Tejun Heo

    Miklos Szeredi
     
  • Terje Malmedal reports that a fuse filesystem with 32 million inodes
    on a machine with lots of memory can take up to 30 minutes to process
    FORGET requests when all those inodes are evicted from the icache.

    To solve this, create a BATCH_FORGET request that allows up to about
    8000 FORGET requests to be sent in a single message.

    This request is only sent if userspace supports interface version 7.16
    or later, otherwise fall back to sending individual FORGET messages.
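
As a rough illustration of how a userspace server might consume such a batched message, the sketch below assumes a small count header followed by (nodeid, nlookup) pairs. The struct names, the `inode_ref` table and both functions are hypothetical; the real wire layout should be taken from the FUSE header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout for this sketch: header, then 'count' entries. */
struct batch_forget_hdr { uint32_t count; uint32_t dummy; };
struct forget_one { uint64_t nodeid; uint64_t nlookup; };

/* Hypothetical per-inode lookup counter kept by the filesystem. */
struct inode_ref { uint64_t nodeid; uint64_t nlookup; };

/* Drop 'nlookup' references from one node; unknown nodes are ignored. */
void forget_node(struct inode_ref *tbl, size_t n,
                 uint64_t nodeid, uint64_t nlookup)
{
    assert(tbl != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].nodeid == nodeid) {
            tbl[i].nlookup = tbl[i].nlookup > nlookup
                           ? tbl[i].nlookup - nlookup : 0;
            return;
        }
    }
}

/* Handle one batched message. Returns the number of entries
 * processed, or -1 if the buffer is too short for the stated count. */
int handle_batch_forget(const unsigned char *buf, size_t len,
                        struct inode_ref *tbl, size_t n)
{
    struct batch_forget_hdr hdr;

    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);
    if ((len - sizeof hdr) / sizeof(struct forget_one) < hdr.count)
        return -1;
    for (uint32_t i = 0; i < hdr.count; i++) {
        struct forget_one one;
        memcpy(&one, buf + sizeof hdr + (size_t)i * sizeof one, sizeof one);
        forget_node(tbl, n, one.nodeid, one.nlookup);
    }
    return (int)hdr.count;
}
```

The length check runs before any entry is applied, so a truncated message leaves the table untouched.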

    Reported-by: Terje Malmedal
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

12 Jul, 2010

2 commits

  • Userspace filesystem can request data to be retrieved from the inode's
    mapping. This request is synchronous and the retrieved data is queued
    as a new request. If the write to the fuse device returns an error
    then the retrieve request was not completed and a reply will not be
    sent.

    Only present pages are returned in the retrieve reply. Retrieving
    stops when it finds a non-present page and only data prior to that is
    returned.

    This request doesn't change the dirty state of pages.
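
The "stop at the first non-present page" rule can be sketched as a small model. The presence array, the 4096-byte page size and the function name `retrieve_len` are assumptions made for the example, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the example */

/* Model of a retrieve over [offset, offset + size): present[i] is true
 * when page i is cached. Returns the number of bytes that would be
 * returned, stopping at the first page that is not present. */
size_t retrieve_len(const bool *present, size_t npages,
                    size_t offset, size_t size)
{
    size_t done = 0;

    assert(present != NULL);
    while (done < size) {
        size_t pos = offset + done;
        size_t page = pos / SKETCH_PAGE_SIZE;
        size_t chunk = SKETCH_PAGE_SIZE - pos % SKETCH_PAGE_SIZE;

        if (page >= npages || !present[page])
            break;              /* hole: return only what came before */
        if (chunk > size - done)
            chunk = size - done;
        done += chunk;
    }
    return done;
}
```

With pages 0 and 1 present and page 2 missing, a request starting at offset 100 yields the remainder of page 0 plus all of page 1 and nothing beyond.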

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Userspace filesystem can request data to be stored in the inode's
    mapping. This request is synchronous and has no reply. If the write
    to the fuse device returns an error then the store request was not
    fully completed (but may have updated some pages).

    If the stored data overflows the current file size, then the size is
    extended, similarly to a write(2) on the filesystem.

    Pages which have been completely stored are marked uptodate.
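
A toy model of the store semantics described above, assuming a four-page cache and 4096-byte pages. `cached_file` and `store_data` are hypothetical names; the sketch only shows which pages end up uptodate and how the size grows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NPAGES 4u

/* Hypothetical stand-in for an inode's cached pages. */
struct cached_file {
    unsigned char data[SKETCH_NPAGES * SKETCH_PAGE_SIZE];
    bool uptodate[SKETCH_NPAGES];
    size_t size;
};

/* Copy 'len' bytes into the cache at 'offset'. Returns the number of
 * bytes stored, which is short if the range runs past the cache. */
size_t store_data(struct cached_file *f, size_t offset,
                  const void *buf, size_t len)
{
    size_t cap = sizeof f->data;

    assert(f != NULL && buf != NULL);
    if (offset >= cap)
        return 0;
    if (len > cap - offset)
        len = cap - offset;
    memcpy(f->data + offset, buf, len);

    /* Only pages covered from first to last byte become uptodate. */
    size_t first = (offset + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    size_t end = (offset + len) / SKETCH_PAGE_SIZE;  /* exclusive */
    for (size_t p = first; p < end; p++)
        f->uptodate[p] = true;

    /* Storing past the end extends the file, as write(2) would. */
    if (offset + len > f->size)
        f->size = offset + len;
    return len;
}
```

Storing 6000 bytes at offset 4096 fully covers page 1 and partially covers page 2, so only page 1 is marked uptodate while the size grows to 10096.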

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

25 May, 2010

1 commit

  • Allow userspace filesystem implementation to use splice() to write to
    the fuse device. The semantics of using splice() are:

    1) buffer the message header and data in a temporary pipe
    2) with a *single* splice() call move the message from the temporary pipe
    to the fuse device

    The READ reply message has the most interesting use for this, since
    now the data from an arbitrary file descriptor (which could be a
    regular file, a block device or a socket) can be transferred into the
    fuse device without having to go through a userspace buffer. It will
    also allow zero copy moving of pages.

    One caveat is that the protocol on the fuse device requires the length
    of the whole message to be written into the header. But the length of
    the data transferred into the temporary pipe may not be known in
    advance. The current library implementation works around this by
    using vmsplice to write the header and modifying the header after
    splicing the data into the pipe (error handling omitted):

    struct fuse_out_header out;
    struct iovec iov;

    iov.iov_base = &out;
    iov.iov_len = sizeof(struct fuse_out_header);
    vmsplice(pip[1], &iov, 1, 0);
    len = splice(input_fd, input_offset, pip[1], NULL, len, 0);
    /* retrospectively modify the header: */
    out.len = len + sizeof(struct fuse_out_header);
    splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags);

    This works because vmsplice only saves a pointer to the data; it does
    not copy the data itself.

    Since pipes are currently limited to 16 pages and messages need to be
    spliced atomically, the length of the data is limited to 15 pages (or
    60kB for 4k pages).
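
The limit follows from reserving one pipe buffer for the header. A short worked version of that arithmetic, with `max_spliced_data` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* One pipe buffer carries the message header, so the data may
 * occupy at most (pipe_pages - 1) pages. */
size_t max_spliced_data(size_t pipe_pages, size_t page_size)
{
    assert(pipe_pages >= 1);
    return (pipe_pages - 1) * page_size;
}
```

For a 16-page pipe and 4096-byte pages this gives 15 * 4096 = 61440 bytes, i.e. 60kB.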

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

09 Jul, 2009

1 commit


07 Jul, 2009

1 commit


01 Jul, 2009

2 commits

  • Add notification messages that allow the filesystem to invalidate VFS
    caches.

    Two notifications are added:

    1) inode invalidation

    - invalidate cached attributes
    - invalidate a range of pages in the page cache (this is optional)

    2) dentry invalidation

    - try to invalidate a subtree in the dentry cache

    Care must be taken while accessing the 'struct super_block' for the
    mount, as it can go away while an invalidation is in progress. To
    prevent this, introduce a rw-semaphore, that is taken for read during
    the invalidation and taken for write in the ->kill_sb callback.

    Cc: Csaba Henk
    Cc: Anand Avati
    Signed-off-by: Miklos Szeredi

    John Muir
     
  • This patch lets filesystems handle masking the file mode on creation.
    This is needed if the filesystem is using ACLs.

    - The CREATE, MKDIR and MKNOD requests are extended with a "umask"
    parameter.

    - A new FUSE_DONT_MASK flag is added to the INIT request/reply. With
    this the filesystem may request that the create mode is not masked.
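
A minimal sketch of what a filesystem might do with the unmasked mode and the new umask parameter, assuming the usual POSIX ACL rule that a default ACL on the parent directory takes the place of the umask. The function name and the `parent_has_default_acl` input are hypothetical; applying the ACL itself is out of scope here.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* mode:       create mode as requested by the caller, not yet masked
 * umask_bits: the umask passed along with CREATE, MKDIR or MKNOD
 * Returns the mode the filesystem should start from. */
mode_t effective_create_mode(mode_t mode, mode_t umask_bits,
                             bool parent_has_default_acl)
{
    if (parent_has_default_acl)
        return mode;            /* the default ACL decides, not the umask */
    return mode & ~umask_bits;  /* no ACL: mask as the kernel would have */
}
```

Without the flag the kernel masks the mode before the filesystem sees it, and the information needed for the ACL case is lost.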

    CC: Jean-Pierre André
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

09 Jun, 2009

1 commit

  • CUSE enables implementing character devices in userspace. With recent
    additions of ioctl and poll support, FUSE already has most of what's
    necessary to implement character devices. All CUSE has to do is
    bond all those components - FUSE, chardev and the driver model -
    nicely.

    When a client opens /dev/cuse, the kernel starts a conversation with
    CUSE_INIT. The client tells CUSE which device it wants to create. As
    the previous patch made fuse_file usable without an associated
    fuse_inode, CUSE doesn't create a super block or inodes. It attaches
    fuse_file to cdev file->private_data during open and sets ff->fi to
    NULL. The rest of the operation is almost identical to the FUSE
    direct IO case.

    Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME
    (which is a symlink to /sys/devices/virtual/class/DEVNAME if
    SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort"
    among other things. Those two files have the same meaning as the FUSE
    control files.

    The only notable feature lacking compared to an in-kernel
    implementation is mmap support.

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     

02 Dec, 2008

1 commit

  • Change interface version to 7.11 after adding the IOCTL and POLL
    messages.

    Also clean up the header a bit:
    - update copyright date to 2008
    - fix checkpatch warning:
    WARNING: Use #include instead of
    - remove FUSE_MAJOR define, which is not being used any more

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

26 Nov, 2008

4 commits

  • Implement poll support. Polled files are indexed using kh in a RB
    tree rooted at fuse_conn->polled_files.

    Client should send FUSE_NOTIFY_POLL notification once after processing
    FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending
    notification unconditionally after the latest poll or every time file
    content might have changed is inefficient but won't cause malfunction.

    fuse_file_poll() can sleep and requires patches from the following
    thread which allow f_op->poll() to sleep.

    http://thread.gmane.org/gmane.linux.kernel/726176

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     
  • Clients always used to write only in response to read requests. To
    implement poll efficiently, clients should be able to issue
    unsolicited notifications. This patch implements basic notification
    support.

    Zero fuse_out_header.unique is now accepted and considered unsolicited
    notification and the error field contains notification code. This
    patch doesn't implement any actual notification.

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     
  • Generic ioctl support is tricky to implement because only the ioctl
    implementation itself knows which memory regions need to be read
    and/or written. To support this, fuse client can request retry of
    ioctl specifying memory regions to read and write. Deep copying
    (nested pointers) can be implemented by retrying multiple times
    resolving one depth of dereference at a time.

    For security and cleanliness considerations, ioctl implementation has
    restricted mode where the kernel determines data transfer directions
    and sizes using the _IOC_*() macros on the ioctl command. In this
    mode, retry is not allowed.
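
To make the restricted mode concrete, the sketch below derives transfer sizes from an ioctl command with the standard _IOC_DIR() and _IOC_SIZE() macros, assuming a Linux system where <sys/ioctl.h> provides them. `ioctl_xfer` and `restricted_ioctl_xfer` are hypothetical names, and the kernel's own logic may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Bytes to copy in from the caller and out to the caller. */
struct ioctl_xfer {
    size_t in_size;
    size_t out_size;
};

/* Direction and size are encoded in the command number itself:
 * _IOC_WRITE means the caller writes data to the driver (input),
 * _IOC_READ means the caller reads data back (output). */
struct ioctl_xfer restricted_ioctl_xfer(unsigned int cmd)
{
    struct ioctl_xfer x = { 0, 0 };

    if (_IOC_DIR(cmd) & _IOC_WRITE)
        x.in_size = _IOC_SIZE(cmd);
    if (_IOC_DIR(cmd) & _IOC_READ)
        x.out_size = _IOC_SIZE(cmd);
    return x;
}
```

Because everything is derived from the command, the server never gets to name arbitrary memory regions, which is why retry is unnecessary in this mode.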

    For all FUSE servers, restricted mode is enforced. Unrestricted ioctl
    will be used by CUSE.

    Please read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl()
    for more information.

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     
  • Move FUSE_MINOR to miscdevice.h. While at it, de-uglify the file.

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     

16 Oct, 2008

2 commits


26 Jul, 2008

1 commit

  • Implement the get_parent export operation by sending a LOOKUP request with
    ".." as the name.

    Implement looking up an inode by node ID after it has been evicted from
    the cache. This is done by sending a LOOKUP request with "." as the name
    (for all file types, not just directories).

    The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to
    indicate that it supports these special lookups.

    Thanks to John Muir for the original implementation of this feature.

    Signed-off-by: Miklos Szeredi
    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Cc: Matthew Wilcox
    Cc: David Teigland
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

13 May, 2008

1 commit

  • Prior to 2.6.26 fuse only supported single page write requests. In theory all
    fuse filesystems should be able to support bigger than 4k writes, as there's
    nothing in the API to prevent it. Unfortunately there's a known case in
    NTFS-3G where big writes cause filesystem corruption. There could also be
    other filesystems, where the lack of testing with big write requests would
    result in bugs.

    To prevent such problems on a kernel upgrade, disable big writes by default,
    but let filesystems set a flag to turn it on.

    Signed-off-by: Miklos Szeredi
    Cc: Szabolcs Szakacsits
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

30 Nov, 2007

1 commit

  • Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...)
    after open, but fuse currently only sends the flags to userspace in open.

    To make it possible to correctly handle changing flags, send the
    current value to userspace in each read and write.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

19 Oct, 2007

7 commits

  • There are cases when the filesystem will be passed the buffer from a single
    read or write call, namely:

    1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go
    through the page cache, but go directly to the userspace fs

    2) currently buffered writes are done with single page requests, but
    if Nick's ->perform_write() patch goes in, it will be possible to
    do larger write requests. But only if the original write() was
    also bigger than a page.

    In these cases the filesystem might want to give a hint to the app
    about the optimal I/O size.

    Allow the userspace filesystem to supply a blksize value to be returned by
    stat() and friends. If the field is zero, it defaults to the old
    PAGE_CACHE_SIZE value.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • For mandatory locking the userspace filesystem needs to know the lock
    ownership for read, write and truncate operations.

    This patch adds the necessary fields to the protocol.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch adds a new helper function fuse_write_fill() which makes it
    possible to send WRITE requests asynchronously.

    A new flag for WRITE requests is also added which indicates that this is a write
    from the page cache, and not a "normal" file write.

    This patch is in preparation for writable mmap support.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • It is trivial to add support for flock(2) semantics to the existing protocol,
    by setting the lock owner field to the file pointer, and passing a new
    FUSE_LK_FLOCK flag with the locking request.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single
    request, instead of separate truncate and open requests.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add two new flags for setattr: FATTR_ATIME_NOW and FATTR_MTIME_NOW. These
    mean that atime or mtime should be changed to the current time.

    Also it is now possible to update atime or mtime individually, not just
    together.
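
A sketch of how a filesystem could interpret these flags, assuming each *_NOW flag is sent together with its base flag. The SK_* bit values and the `file_times` struct are defined locally for illustration and should not be read as the protocol's constants.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Illustrative bit assignments for this sketch only. */
#define SK_ATIME     (1u << 4)
#define SK_MTIME     (1u << 5)
#define SK_ATIME_NOW (1u << 7)
#define SK_MTIME_NOW (1u << 8)

struct file_times {
    time_t atime;
    time_t mtime;
};

/* Apply a setattr-style request: each timestamp is updated only if its
 * own bit is set, and takes 'now' when the matching *_NOW bit is set. */
void apply_times(struct file_times *ft, uint32_t valid,
                 time_t req_atime, time_t req_mtime, time_t now)
{
    assert(ft != NULL);
    if (valid & SK_ATIME)
        ft->atime = (valid & SK_ATIME_NOW) ? now : req_atime;
    if (valid & SK_MTIME)
        ft->mtime = (valid & SK_MTIME_NOW) ? now : req_mtime;
}
```

Passing "now" as a flag lets the filesystem use its own clock, which matters when the server's time differs from the kernel's.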

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add necessary protocol changes for supplying a file handle with the getattr
    operation. Step the API version to 7.9.

    This patch doesn't actually supply the file handle, because that needs some
    kind of VFS support, which we haven't yet been able to agree upon.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

17 Jul, 2007

1 commit

  • gcc-4.3:

    fs/fuse/dir.c: In function 'parse_dirfile':
    fs/fuse/dir.c:833: warning: cast from pointer to integer of different size
    fs/fuse/dir.c:835: warning: cast from pointer to integer of different size

    [miklos@szeredi.hu: use offsetof]
    Acked-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

08 Dec, 2006

3 commits

  • Add a DESTROY operation for block device based filesystems. With the help of
    this operation, such a filesystem can flush dirty data to the device
    synchronously before the umount returns.

    This is needed in situations where the filesystem is assumed to be clean
    immediately after unmount (e.g. ejecting removable media).

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add support for the BMAP operation for block device based filesystems. This
    is needed to support swap-files and lilo.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add a flag to the RELEASE message which specifies that a FLUSH operation
    should be performed as well. This interface update is needed for the FreeBSD
    port, and doesn't actually touch the Linux implementation at all.

    Also rename the unused 'flush_flags' in the FLUSH message to 'unused'.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

26 Jun, 2006

3 commits

  • Add synchronous request interruption. This is needed for file locking
    operations which have to be interruptible. However, the filesystem may also
    implement interruptibility of other operations (e.g. like NFS 'intr' mount option).

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch adds POSIX file locking support to the fuse interface.

    This implementation doesn't keep any locking state in kernel. Unlocking on
    close() is handled by the FLUSH message, which now contains the lock owner id.

    Mandatory locking is not supported. The filesystem may enforce mandatory
    locking in userspace if needed.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • The following patches add POSIX file locking to the fuse interface.

    Additional changes related to this are:

    - asynchronous interrupt of requests by SIGKILL no longer supported

    - separate control filesystem, instead of using sysfs objects

    - add support for synchronously interrupting requests

    Details are documented in Documentation/filesystems/fuse.txt throughout the
    patches.

    This patch:

    Have fuse.h use MISC_MAJOR rather than a hardcoded '10'.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Engelhardt
     

02 Feb, 2006

1 commit

  • While asynchronous reads mean a performance improvement in most cases, if
    the filesystem assumed that reads are synchronous, then async reads may
    degrade performance (filesystem may receive reads out of order, which can
    confuse its own readahead logic).

    With sshfs a 1.5 to 4 times slowdown can be measured.

    There's also a need for userspace filesystems to know whether asynchronous
    reads are supported by the kernel or not.

    To achieve these, negotiate in the INIT request whether async reads will be
    used and the maximum readahead value. Update interface version to 7.6.

    If userspace uses a version earlier than 7.6, then disable async reads, and
    set maximum readahead value to the maximum read size, as done in previous
    versions.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

07 Jan, 2006

4 commits

  • Make the maximum size of write data configurable by the filesystem. The
    previous fixed 4096 limit only worked on architectures where the page size is
    less than or equal to this. This change makes writing work on other architectures
    too, and also lets the filesystem receive bigger write requests in direct_io
    mode.

    Normal writes which go through the page cache are still limited to a page
    sized chunk per request.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Change the way a too large request is handled. Until now in this case the
    device read returned -EINVAL and the operation returned -EIO.

    Make it more flexible by not returning -EINVAL from the read, but restarting
    it instead.

    Also remove the fixed limit on setxattr data and let the filesystem provide as
    large a read buffer as it needs to handle the extended attribute data.

    The symbolic link length is already checked by VFS to be less than PATH_MAX,
    so the extra check against FUSE_SYMLINK_MAX is not needed.

    The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the
    dentry has already been looked up, and hence the name already checked.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add 'frsize' member to the statfs reply.

    I'm not sure if sending f_fsid will ever be needed, but just in case leave
    some space at the end of the structure, so less compatibility mess would be
    required.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Change interface version to 7.4.

    Following changes will need backward compatibility support, so store the minor
    version returned by userspace.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

07 Nov, 2005

1 commit

  • This patch passes the file handle supplied in iattr to userspace, in case the
    ->setattr() was invoked from sys_ftruncate(). This solves the permission
    checking (or lack thereof) in ftruncate() for the class of filesystems served
    by an unprivileged userspace process.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi