09 Apr, 2007

1 commit


08 Dec, 2006

2 commits

  • Add a DESTROY operation for block device based filesystems. With the help of
    this operation, such a filesystem can flush dirty data to the device
    synchronously before the umount returns.

    This is needed in situations where the filesystem is assumed to be clean
    immediately after unmount (e.g. ejecting removable media).

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add support for the BMAP operation for block device based filesystems. This
    is needed to support swap-files and lilo.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

17 Oct, 2006

1 commit

  • Fuse considered it an error (EIO) if lookup returned a directory inode, to
    which a dentry already referred. This is because directory aliases are not
    allowed.

    But in a network filesystem this could happen legitimately, if a directory is
    moved on a remote client. This patch attempts to relax the restriction by
    trying to first evict the offending alias from the cache. If this fails, it
    still returns an error (EBUSY).

    A rarer situation is if an mkdir races with an independent lookup, which
    finds the newly created directory already moved. In this situation the mkdir
    should return success, but that would be incorrect, since the dentry cannot be
    instantiated, so return EBUSY.

    Previously checking for a directory alias and instantiation of the dentry
    weren't done atomically in lookup/mkdir, hence two such calls racing with each
    other could create aliased directories. To prevent this introduce a new
    per-connection mutex: fuse_conn->inst_mutex, which is taken for instantiations
    with a directory inode.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
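
The check-and-instantiate step described in this commit can be modelled in userspace as below. This is only a sketch under stated assumptions: `struct dir_inode`, its two fields and `instantiate_dir()` are hypothetical names, and a single global pthread mutex stands in for the per-connection `fuse_conn->inst_mutex`.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical model of a directory inode as seen by lookup/mkdir. */
struct dir_inode {
    int has_alias;   /* a dentry already refers to this directory */
    int alias_busy;  /* that dentry cannot be evicted from the cache */
};

/* Stand-in for fuse_conn->inst_mutex (global here only for brevity). */
static pthread_mutex_t inst_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Check for an alias and attach the new dentry in one critical section,
 * so two racing callers cannot both instantiate the same directory.
 * Returns 0 on success, or -EBUSY if the old alias cannot be evicted.
 */
static int instantiate_dir(struct dir_inode *inode)
{
    int err = 0;

    pthread_mutex_lock(&inst_mutex);
    if (inode->has_alias && inode->alias_busy)
        err = -EBUSY;          /* eviction failed: give up */
    else
        inode->has_alias = 1;  /* old alias (if any) evicted, new one attached */
    pthread_mutex_unlock(&inst_mutex);

    return err;
}
```

Holding the mutex across both the check and the attach is the point of the sketch: without it, two callers could each see "no alias" and both proceed.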
     

01 Aug, 2006

1 commit

  • It is entirely possible (though rare) that jiffies half-wraps around, while a
    dentry/inode remains in the cache. This could mean that the dentry/inode is
    not invalidated for another half wraparound-time.

    To get around this problem, use 64-bit jiffies. The only problem with this is
    that dentry->d_time is 32 bits on 32-bit archs. So use d_fsdata as the high
    32 bits. This is an ugly hack, but far simpler than having to allocate
    private data just for this purpose.

    Since 64-bit jiffies can be assumed never to wrap around, simple comparison
    can be used, and a zero time value can represent "invalid".

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
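
A minimal userspace sketch of the split described in this commit, assuming a 32-bit `d_time` and a 32-bit pointer-sized `d_fsdata`. The struct and helper names are hypothetical stand-ins, not the kernel's.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two dentry fields on a 32-bit arch. */
struct fake_dentry {
    uint32_t d_time;    /* low 32 bits of the expiry time */
    uint32_t d_fsdata;  /* reused to hold the high 32 bits */
};

/* Store a 64-bit expiry time across the two 32-bit fields. */
static void set_expiry(struct fake_dentry *d, uint64_t t)
{
    d->d_time = (uint32_t)t;
    d->d_fsdata = (uint32_t)(t >> 32);
}

/* Reassemble the 64-bit expiry time from the two halves. */
static uint64_t get_expiry(const struct fake_dentry *d)
{
    return ((uint64_t)d->d_fsdata << 32) | d->d_time;
}

/*
 * Zero means "invalid". Otherwise a plain comparison is enough, because
 * the 64-bit counter is assumed never to wrap around.
 */
static int entry_valid(const struct fake_dentry *d, uint64_t now)
{
    uint64_t expiry = get_expiry(d);

    return expiry != 0 && now <= expiry;
}
```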
     

26 Jun, 2006

7 commits

  • VFS uses current->files pointer as lock owner ID, and it wouldn't be
    prudent to expose this value to userspace. So scramble it with XTEA using
    a per connection random key, known only to the kernel. Only one direction
    needs to be implemented, since the ID is never sent in the reverse
    direction.

    The XTEA algorithm is implemented inline since it's simple enough to do so,
    and this adds less complexity than if the crypto API were used.

    Thanks to Jesper Juhl for the idea.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add synchronous request interruption. This is needed for file locking
    operations, which have to be interruptible. However, a filesystem may also
    implement interruptibility of other operations (e.g. like the NFS 'intr'
    mount option).

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Rename the 'interrupted' flag to 'aborted', since it indicates exactly that,
    and next patch will introduce an 'interrupted' flag for a

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • All POSIX locks owned by the current task are removed on close(). If the
    FLUSH request initiated by close() fails to reach userspace, there might be
    locks remaining, which cannot be removed.

    The only reason it could fail is if allocating the request fails. In this
    case use the request reserved for RELEASE, or if that is currently used by
    another FLUSH, wait for it to become available.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch adds POSIX file locking support to the fuse interface.

    This implementation doesn't keep any locking state in kernel. Unlocking on
    close() is handled by the FLUSH message, which now contains the lock owner id.

    Mandatory locking is not supported. The filesystem may enforce mandatory
    locking in userspace if needed.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add a control filesystem to fuse, replacing the attributes currently exported
    through sysfs. An empty directory '/sys/fs/fuse/connections' is still created
    in sysfs, and mounting the control filesystem here provides backward
    compatibility.

    Advantages of the control filesystem over the previous solution:

    - allows the object directory and the attributes to be owned by the
    filesystem owner, hence letting unprivileged users abort the
    filesystem connection

    - does not suffer from module unload race

    [akpm@osdl.org: fix this fs for recent dhowells depredations]
    [akpm@osdl.org: fix 64-bit printk warnings]
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Don't put requests into the background when a fatal interrupt occurs while the
    request is in userspace. This removes a major wart from the implementation.

    Backgrounding of requests was introduced to allow breaking of deadlocks.
    However now the same can be achieved by aborting the filesystem through the
    'abort' sysfs attribute.

    This is a change in the interface, but should not cause problems, since these
    kinds of deadlocks never happen during normal operation.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
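
The lock owner scrambling described in the XTEA commit of this group can be sketched as follows. It uses the standard XTEA encipher loop on a 64-bit value split into two 32-bit halves. The function name, the round count of 32 and the way the halves are split are assumptions made for illustration, not taken from the patch. Only the encipher direction is shown, matching the commit's remark that the ID never travels back.

```c
#include <stdint.h>

/* Number of XTEA rounds; 32 is the usual choice (an assumption here). */
#define XTEA_ROUNDS 32

/*
 * Scramble a 64-bit owner ID with a 128-bit key known only to the kernel.
 * Hypothetical helper: low half goes in v0, high half in v1.
 */
static uint64_t scramble_owner_id(uint64_t id, const uint32_t key[4])
{
    uint32_t v0 = (uint32_t)id;
    uint32_t v1 = (uint32_t)(id >> 32);
    uint32_t sum = 0;
    const uint32_t delta = 0x9E3779B9u;
    int i;

    /* Standard XTEA encipher loop. */
    for (i = 0; i < XTEA_ROUNDS; i++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += delta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }

    return ((uint64_t)v1 << 32) | v0;
}
```

Because the cipher is a keyed permutation, distinct owners still map to distinct IDs, which is all a userspace filesystem needs in order to tell lock owners apart.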
     

26 Apr, 2006

1 commit

  • This reverts 73ce8355c243a434524a34c05cc417dd0467996e commit.

    It was wrong, because it didn't take into account the requirement,
    that iput() for background requests must be performed synchronously
    with ->put_super(), otherwise active inodes may remain after unmount.

    The right solution is to keep the sbput_sem and perform iput() within
    the locked region, but move fput() outside sbput_sem.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

12 Apr, 2006

2 commits

  • Properly accounting the number of waiting requests was forgotten in
    the "clean up request accounting" patch.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • A deadlock was possible, when the last reference to the superblock was
    held due to a background request containing a file reference.

    Releasing the file would release the vfsmount which in turn would
    release the superblock. Since sbput_sem is held during the fput() and
    fuse_put_super() tries to acquire this same semaphore, a deadlock
    results.

    The chosen solution is to get rid of sbput_sem, and instead use the
    spinlock to ensure the referenced inodes/file are released only once.
    Since the actual release may sleep, defer these outside the locked
    region, but using local variables instead of the structure members.

    This is a much more robust solution.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
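
The "release only once, but outside the locked region" pattern from the deadlock fix in this group (which the 26 Apr, 2006 entry later reverts) might look like the sketch below. It is a userspace model only: a C11 `atomic_flag` stands in for the kernel spinlock, and `struct resource`, `struct request`, `put_resource()` and `release_refs()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode or file reference. */
struct resource {
    int released;  /* counts how many times the reference was dropped */
};

struct request {
    atomic_flag lock;        /* models the spinlock */
    struct resource *inode;
    struct resource *file;
};

static void spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* busy-wait */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* May "sleep" in the real code, so it must not run under the spinlock. */
static void put_resource(struct resource *r)
{
    if (r)
        r->released++;
}

static void release_refs(struct request *req)
{
    struct resource *inode, *file;

    /* Under the lock: take ownership via locals and clear the members. */
    spin_lock(&req->lock);
    inode = req->inode;
    file = req->file;
    req->inode = NULL;
    req->file = NULL;
    spin_unlock(&req->lock);

    /* Outside the lock: do the actual release through the locals. */
    put_resource(inode);
    put_resource(file);
}
```

A second call to `release_refs()` finds both members already cleared and drops nothing, which is what guarantees the single release.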
     

11 Apr, 2006

4 commits

  • The previous patch removed limiting the number of outstanding requests. This
    patch adds a much simpler limiting, that is also compatible with file locking
    operations.

    A task may have at most one synchronous request allocated. So these requests
    need not be otherwise limited.

    However the number of background requests (release, forget, asynchronous
    reads, interrupted requests) can grow indefinitely. This can be used by a
    malicious user to cause FUSE to allocate arbitrary amounts of unswappable
    kernel memory, denying service.

    For this reason add a limit for the number of background requests, and block
    allocations of new requests until the number goes below the limit.

    Also use this mechanism to block all requests until the INIT reply is
    received.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • FUSE allocated most requests from a fixed size pool filled at mount time.
    However in some cases (release/forget) non-pool requests were used. File
    locking operations aren't well served by the request pool, since they may
    block indefinitely thus exhausting the pool.

    This patch removes the request pool and always allocates requests on demand.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Remove the global spinlock in favor of a per-mount one.

    This patch is basically find & replace. The difficult part has already been
    done by the previous patch.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This adds asynchronous notification to FUSE - a FUSE server can request
    O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
    available.

    One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
    does no locking, unlike the other methods. I think it's unnecessary, as the
    fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
    which provide their own locking. It would also be wrong to use the fuse_lock,
    as it's a spin lock and fasync_helper can sleep. My one concern with this is
    the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
    reference on the file struct, so this seems not to be a problem.

    Signed-off-by: Jeff Dike
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
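
The background request limit described in the first commit of this group can be modelled with a counter and a blocking predicate. The sketch below is an assumption-laden simplification: the real code sleeps until the condition clears, whereas this model only computes whether a caller would have to wait. All names (`struct conn`, `must_block()`, `background_start()`, `background_end()`) are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-connection accounting state. */
struct conn {
    unsigned num_background;  /* background requests currently in flight */
    unsigned max_background;  /* limit on background requests */
    int initialized;          /* set once the INIT reply has arrived */
};

/*
 * Nonzero if allocating a new request would have to wait: either the
 * INIT reply is still outstanding, or the background limit is reached.
 */
static int must_block(const struct conn *c)
{
    return !c->initialized || c->num_background >= c->max_background;
}

/* A request has been put into the background. */
static void background_start(struct conn *c)
{
    c->num_background++;
}

/* A background request has completed; waiters may now proceed. */
static void background_end(struct conn *c)
{
    assert(c->num_background > 0);
    c->num_background--;
}
```

Reusing one predicate for both conditions mirrors the commit's note that the same mechanism also holds back all requests until INIT completes.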
     

29 Mar, 2006

1 commit

  • This is a conversion to make the various file_operations structs in fs/
    const. Basically a regexp job, with a few manual fixups.

    The goal is both to increase correctness (harder to accidentally write to
    shared data structures) and to reduce the false sharing of cachelines with
    things that get dirty in .data (while .rodata is nicely read-only and thus
    cache clean).

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

02 Feb, 2006

1 commit

  • While asynchronous reads mean a performance improvement in most cases, if
    the filesystem assumed that reads are synchronous, then async reads may
    degrade performance (filesystem may receive reads out of order, which can
    confuse its own readahead logic).

    With sshfs a 1.5 to 4 times slowdown can be measured.

    There's also a need for userspace filesystems to know whether asynchronous
    reads are supported by the kernel or not.

    To achieve these, negotiate in the INIT request whether async reads will be
    used and the maximum readahead value. Update interface version to 7.6.

    If userspace uses a version earlier than 7.6, then disable async reads, and
    set maximum readahead value to the maximum read size, as done in previous
    versions.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
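
The negotiation rule in this commit can be sketched as a small pure function. The struct layouts, the flag bit `ASYNC_READ_FLAG` and the choice to clamp the readahead to the kernel's own value are assumptions made for illustration. Only the 7.6 cutoff and the fallback to the maximum read size come from the commit message.

```c
#include <stdint.h>

/* Hypothetical flag bit meaning "userspace accepts async reads". */
#define ASYNC_READ_FLAG (1u << 0)

/* Hypothetical view of the fields userspace returns in its INIT reply. */
struct init_reply {
    uint32_t minor;          /* minor protocol version (major assumed 7) */
    uint32_t flags;
    uint32_t max_readahead;  /* bytes */
};

struct conn_settings {
    int async_read;
    uint32_t max_readahead;  /* bytes */
};

static struct conn_settings negotiate(const struct init_reply *r,
                                      uint32_t kernel_readahead,
                                      uint32_t max_read)
{
    struct conn_settings s;

    if (r->minor >= 6) {
        /* 7.6 or later: honour the flag and the requested readahead. */
        s.async_read = (r->flags & ASYNC_READ_FLAG) != 0;
        s.max_readahead = r->max_readahead < kernel_readahead
                              ? r->max_readahead
                              : kernel_readahead;
    } else {
        /* Older userspace: synchronous reads, readahead = max read size. */
        s.async_read = 0;
        s.max_readahead = max_read;
    }
    return s;
}
```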
     

17 Jan, 2006

11 commits

  • Fix race in setting bitfields of fuse_conn. Spotted by Andrew Morton.

    The two fields ->connected and ->mounted were always changed with the
    fuse_lock held. But other bitfields in the same structure were changed
    without the lock. In theory this could lead to losing the assignment of
    even the ones under lock. The chosen solution is to change these two
    fields to be a full unsigned type. The other bitfields aren't "important"
    enough to warrant the extra complexity of full locking or changing them to
    bitops.

    For all bitfields document why they are safe wrt. concurrent
    assignments.

    Also make the initialization of the 'num_waiting' atomic counter explicit.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add a separate function for filling in the READ request. This will make it
    possible to send asynchronous READ requests as well as synchronous ones.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Now the INIT requests can be completely handled in inode.c and the
    fuse_send_init() function need not be global any more.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add possibility for requests to run asynchronously and call an 'end' callback
    when finished.

    With this, the special handling of the INIT and RELEASE requests can be
    cleaned up too.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add ability to abort a filesystem connection.

    With the introduction of asynchronous reads, the ability to interrupt any
    request is not enough to dissolve deadlocks, since now waiting for the request
    completion (page unlocked) is independent of the actual request, so in a
    deadlock all threads will be uninterruptible.

    The solution is to make it possible to abort all requests, even those
    currently undergoing I/O to/from userspace. The natural interface for this is
    'umount -f mountpoint', but that only works as long as the filesystem is
    attached. So also add an 'abort' attribute to the sysfs view of the
    connection.

    Signed-off-by: Miklos Szeredi
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch adds the 'waiting' attribute which indicates how many filesystem
    requests are currently waiting to be completed. A non-zero value without any
    filesystem activity indicates a hung or deadlocked filesystem.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Kobjectify fuse_conn, and make it visible under /sys/fs/fuse/connections.

    Lacking any natural naming, connections are numbered.

    This patch doesn't add any attributes, just the infrastructure.

    Signed-off-by: Miklos Szeredi
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • The ->connected flag for a fuse_conn object previously only indicated whether
    the device file for this connection is currently open or not.

    Change its meaning so that it indicates whether the connection is active or
    not: now either umount or device release will clear the flag.

    The separate ->mounted flag is still needed for handling background requests.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Create a new list for requests in the process of being transferred to/from
    userspace. This will be needed to be able to abort all requests, even those
    currently under I/O.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • The state of request was made up of 2 bitfields (->sent and ->finished) and of
    the fact that the request was on a list or not.

    Unify this into a single state field.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • - remove some unneeded assignments

    - use kzalloc instead of kmalloc + memset

    - simplify setting sb->s_fs_info

    - in fuse_send_init() use fuse_get_request() instead of
    do_get_request() helper

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
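
The commit in this group that unifies the request state into a single field could be pictured as below. This is a hedged sketch: the enum constants, `struct request` and `advance()` are hypothetical names, and the forward-only transition rule is an assumption chosen to keep the example small.

```c
/*
 * One field replaces the ->sent bit, the ->finished bit and the implicit
 * "is it on a list?" test. Hypothetical names; order matters below.
 */
enum req_state {
    REQ_INIT,      /* allocated, not yet queued */
    REQ_PENDING,   /* queued, waiting to be read by userspace */
    REQ_SENT,      /* handed to userspace, reply outstanding */
    REQ_FINISHED   /* reply received or request aborted */
};

struct request {
    enum req_state state;
};

/*
 * Move a request to a later state. Returns 1 on success and 0 if the
 * transition would go backwards or stay in place (assumed to be illegal).
 */
static int advance(struct request *req, enum req_state next)
{
    if (next <= req->state)
        return 0;
    req->state = next;
    return 1;
}
```

With a single ordered field, questions such as "has this request been sent yet?" become one comparison instead of a combination of flag tests and list checks.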
     

07 Jan, 2006

3 commits

  • Make the maximum size of write data configurable by the filesystem. The
    previous fixed 4096 limit only worked on architectures where the page size is
    less than or equal to this. This change makes writing work on other architectures
    too, and also lets the filesystem receive bigger write requests in direct_io
    mode.

    Normal writes which go through the page cache are still limited to a page
    sized chunk per request.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Change the way a too large request is handled. Until now in this case the
    device read returned -EINVAL and the operation returned -EIO.

    Make it more flexible by not returning -EINVAL from the read, but restarting
    it instead.

    Also remove the fixed limit on setxattr data and let the filesystem provide as
    large a read buffer as it needs to handle the extended attribute data.

    The symbolic link length is already checked by VFS to be less than PATH_MAX,
    so the extra check against FUSE_SYMLINK_MAX is not needed.

    The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the
    dentry has already been looked up, and hence the name already checked.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Change interface version to 7.4.

    Following changes will need backward compatibility support, so store the minor
    version returned by userspace.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
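
The effect of the configurable write size from the first commit of this group can be illustrated with a chunking helper: a write of `len` bytes is sent as a series of requests, none larger than the negotiated maximum. Both function names are hypothetical, and the sketch ignores the separate page-sized limit that applies to writes going through the page cache.

```c
#include <assert.h>
#include <stddef.h>

/* Size of the next write request: at most max_write bytes. */
static size_t next_chunk(size_t remaining, size_t max_write)
{
    return remaining < max_write ? remaining : max_write;
}

/* Number of write requests needed to send len bytes. */
static size_t count_requests(size_t len, size_t max_write)
{
    size_t n = 0;

    assert(max_write > 0);
    while (len > 0) {
        len -= next_chunk(len, max_write);
        n++;
    }
    return n;
}
```

For example, with the old fixed limit of 4096 a 10000-byte direct_io write takes three requests, while a filesystem that sets a larger maximum could receive it in one.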
     

07 Nov, 2005

2 commits

  • This patch adds an atomic create+open operation. This does not yet work if
    the file type changes between lookup and create+open, but solves the
    permission checking problems for the separate create and open methods.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Add a new access call, which will only be called if ->permission is invoked
    from sys_access(). In all other cases permission checking is delayed until
    the actual filesystem operation.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

31 Oct, 2005

1 commit


10 Sep, 2005

2 commits

  • This patch removes ability to interrupt and restart operations while there
    hasn't been any side-effect.

    The reason: applications. There are some apps, it seems, that generate
    signals at a fast rate. This means that if the operation cannot make
    enough progress between two signals, it will be restarted forever. This
    bug actually manifested itself with 'krusader' trying to open a file for
    writing under sshfs. Thanks to Eduard Czimbalmos for the report.

    The problem can be solved just by making open() uninterruptible, because in
    this case it was the truncate operation that slowed down the progress. But
    it's better to solve this by simply not allowing interrupts at all (except
    SIGKILL), because applications don't expect file operations to be
    interruptible anyway. As an added bonus the code is simplified somewhat.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • This patch adds a new FSYNCDIR request, which is sent when fsync is called
    on directories. This operation is available in libfuse 2.3-pre1 or
    greater.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi