27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


07 Feb, 2013

1 commit

  • For some filesystems (e.g. GlusterFS), the costs of performing a
    normal readdir and a readdirplus are identical. Since adaptively
    using readdirplus has no benefit for those systems, give
    users/filesystems the option to control adaptive readdirplus use.

    v2 of this patch incorporates Miklos's suggestion to simplify the code,
    as well as improving consistency of macro names and documentation.

    Signed-off-by: Eric Wong
    Signed-off-by: Miklos Szeredi

    Eric Wong
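The control described in this commit can be pictured as a small decision function. The following is only a user-space sketch: the struct, its field names, and the adaptive heuristic (use readdirplus at the start of a directory or when a hint says its entries were recently looked up) are assumptions made for illustration, not code taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-connection settings; names are illustrative only. */
struct conn_opts {
    bool do_readdirplus;    /* filesystem offers readdirplus at all */
    bool readdirplus_auto;  /* let the kernel choose adaptively */
};

/*
 * Decide whether the next directory read should use readdirplus.
 * 'advised' models a hint that entries of this directory were recently
 * looked up; 'pos' is the current offset within the directory.
 */
static bool use_readdirplus(const struct conn_opts *c, bool advised, long pos)
{
    if (!c->do_readdirplus)
        return false;              /* readdirplus disabled outright */
    if (!c->readdirplus_auto)
        return true;               /* always on, no adaptation */
    return advised || pos == 0;    /* adaptive: only when likely useful */
}
```

A filesystem for which both operations cost the same would, in this model, set do_readdirplus and clear readdirplus_auto so that every directory read takes the readdirplus path.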
     

04 Feb, 2013

1 commit


01 Feb, 2013

2 commits

  • Use the same adaptive readdirplus mechanism as NFS:

    http://permalink.gmane.org/gmane.linux.nfs/49299

    If the user space implementation wants to disable readdirplus
    temporarily, it can simply return ENOTSUPP. The kernel will then
    call it again with a plain readdir.

    Signed-off-by: Feng Shuo
    Signed-off-by: Miklos Szeredi

    Feng Shuo
     
  • Commit c69e8d9c0 added an RCU lock to fuse/dir.c. It assumed that
    'task' is some other process, but in fact this parameter is always
    equal to 'current'. Inline this parameter to make the code more
    readable and remove the RCU lock, as it is not needed when accessing
    the current process's credentials.

    Signed-off-by: Anatol Pomozov
    Signed-off-by: Miklos Szeredi

    Anatol Pomozov
     

24 Jan, 2013

3 commits

  • Previously, anyone who set flag 'argpages' only filled req->pages[] and set
    per-request page_offset. This patch re-works all cases where argpages=1 to
    fill req->page_descs[] properly.

    Having req->page_descs[] filled properly makes it possible to re-work
    fuse_copy_pages() to copy page fragments described by req->page_descs[].
    This will be useful for the next patches optimizing direct_IO.

    Signed-off-by: Maxim Patlasov
    Signed-off-by: Miklos Szeredi

    Maxim Patlasov
     
  • The patch sorts all fuse_get_req() invocations into two categories:
    - fuse_get_req_nopages(fc) - when the caller doesn't care about req->pages
    - fuse_get_req(fc, n) - when the caller needs n page pointers (n > 0)

    Adding fuse_get_req_nopages() helps to avoid numerous fuse_get_req(fc, 0)
    calls scattered over the code. Now it's clear at first glance when a caller
    needs a fuse_req with page pointers.

    The patch doesn't make any logic changes. In the multi-page case, it
    wastefully allocates an array of FUSE_MAX_PAGES_PER_REQ page pointers. This
    will be amended by future patches.

    Signed-off-by: Maxim Patlasov
    Signed-off-by: Miklos Szeredi

    Maxim Patlasov
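The split between the two call forms can be sketched in user space as follows. The struct layouts, the allocator body, and the free_req() helper are stand-ins invented for this illustration; only the two function names and the "fuse_get_req(fc, 0)" equivalence come from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types: the real kernel structures are far larger. */
struct fuse_conn { int unused; };
struct fuse_req {
    unsigned npages;   /* number of page-pointer slots requested */
    void **pages;      /* NULL when the caller needs no pages */
};

/* Mock allocator: a request with room for npages page pointers. */
static struct fuse_req *fuse_get_req(struct fuse_conn *fc, unsigned npages)
{
    struct fuse_req *req = calloc(1, sizeof(*req));

    (void)fc;
    if (!req)
        return NULL;
    if (npages) {
        req->pages = calloc(npages, sizeof(*req->pages));
        if (!req->pages) {
            free(req);
            return NULL;
        }
    }
    req->npages = npages;
    return req;
}

/* Named wrapper that replaces scattered fuse_get_req(fc, 0) calls. */
static struct fuse_req *fuse_get_req_nopages(struct fuse_conn *fc)
{
    return fuse_get_req(fc, 0);
}

/* Hypothetical cleanup helper for this sketch only. */
static void free_req(struct fuse_req *req)
{
    if (req) {
        free(req->pages);
        free(req);
    }
}
```

The wrapper adds no behaviour of its own; its value is that a reader can tell from the call site alone whether page pointers are involved.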
     
  • This patch implements readdirplus support in FUSE, similar to NFS.
    The payload returned in the readdirplus call contains
    'fuse_entry_out' structure thereby providing all the necessary inputs
    for 'faking' a lookup() operation on the spot.

    If the dentry and inode already exist (e.g. in a re-run of ls -l),
    then only the inode attribute timeout and dentry timeout are refreshed.

    With a simple client->network->server implementation of a FUSE based
    filesystem, the following performance observations were made:

    Test: Performing a filesystem crawl over 20,000 files with

    sh# time ls -lR /mnt

    Without readdirplus:
    Run 1: 18.1s
    Run 2: 16.0s
    Run 3: 16.2s

    With readdirplus:
    Run 1: 4.1s
    Run 2: 3.8s
    Run 3: 3.8s

    The performance improvement is significant, as it avoided 20,000 lookup
    upcalls. Cache consistency is no worse than it already was.

    Signed-off-by: Anand V. Avati
    Signed-off-by: Miklos Szeredi

    Anand V. Avati
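The six timings quoted in this commit can be reduced to a single speedup figure. The sketch below only averages the three runs of each configuration and divides the means; the function and array names are invented for illustration. The sums are 50.3s without and 11.7s with readdirplus, so the ratio comes out at roughly 4.3.

```c
#include <assert.h>
#include <stddef.h>

/* Wall-clock seconds copied from the commit message. */
static const double without_rdplus[] = { 18.1, 16.0, 16.2 };
static const double with_rdplus[]    = {  4.1,  3.8,  3.8 };

/* Arithmetic mean of n samples; returns 0 for an empty set. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Ratio of mean run time without readdirplus to mean run time with it. */
static double readdirplus_speedup(void)
{
    return mean(without_rdplus, 3) / mean(with_rdplus, 3);
}
```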
     

15 Nov, 2012

1 commit

  • Use kuid_t and kgid_t in struct fuse_conn and struct fuse_mount_data.

    The connection between a fuse filesystem and a fuse daemon is
    established when a fuse filesystem is mounted and provided with a file
    descriptor the fuse daemon created by opening /dev/fuse.

    For now restrict the communication of uids and gids between the fuse
    filesystem and the fuse daemon to the initial user namespace. Enforce
    this by verifying the file descriptor passed to the mount of fuse was
    opened in the initial user namespace. Ensuring the mount happens in
    the initial user namespace is not necessary as mounts from non-initial
    user namespaces are not yet allowed.

    In fuse_req_init_context convert the current fsuid and fsgid into the
    initial user namespace for the request that will be sent to the fuse
    daemon.

    In fuse_fill_attr convert the uid and gid passed from the fuse daemon
    from the initial user namespace into kuids and kgids.

    In iattr_to_fattr called from fuse_setattr convert kuids and kgids
    into the uids and gids in the initial user namespace before passing
    them to the fuse filesystem.

    In fuse_change_attributes_common called from fuse_dentry_revalidate,
    fuse_permission, fuse_getattr, and fuse_setattr, and fuse_iget convert
    the uid and gid from the fuse daemon into a kuid and a kgid to store
    on the fuse inode.

    By default fuse mounts are restricted to tasks whose uid, suid, and
    euid match the fuse user_id and whose gid, sgid, and egid match
    the fuse group_id. Convert the user_id and group_id mount options
    into kuids and kgids at mount time, and use uid_eq and gid_eq to
    compare them in fuse_allow_task.

    Cc: Miklos Szeredi
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

15 Aug, 2012

1 commit


14 Jul, 2012

9 commits


14 May, 2012

2 commits

  • Don't use inode->i_blkbits which might be stale, instead calculate the blksize
    information from the freshly obtained attributes.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Currently we store attr->ino in inode->i_ino, return attr->ino the
    first time, and then return inode->i_ino as long as the attribute
    timeout has not expired. That's wrong on 32 bit platforms because
    attr->ino is 64 bit and inode->i_ino is 32 bit in this case.

    Fix this by saving 64 bit ino in fuse_inode structure and returning
    it every time we call getattr. Also squash attr->ino into inode->i_ino
    explicitly.

    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Miklos Szeredi

    Pavel Shilovsky
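The idea of keeping the full 64 bit number while squashing it for a 32 bit i_ino can be sketched in user space as below. Folding the high half into the low half with XOR is an assumption about what "squash" means here, and the struct and function names are hypothetical; the commit message itself does not spell out the method.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 64-bit inode number into 32 bits by XOR-ing the high half
 * into the low half, so the high-order bits still affect the result.
 */
static uint32_t squash_ino32(uint64_t ino64)
{
    return (uint32_t)ino64 ^ (uint32_t)(ino64 >> 32);
}

/* Model of an inode on a platform where i_ino is only 32 bits wide. */
struct model_inode {
    uint32_t i_ino;     /* squashed value visible to the VFS */
    uint64_t orig_ino;  /* full value, reported back by getattr */
};

/* Store both forms when attributes arrive from the filesystem. */
static void model_set_ino(struct model_inode *in, uint64_t attr_ino)
{
    in->orig_ino = attr_ino;
    in->i_ino = squash_ino32(attr_ino);
}
```

With this layout, getattr can always report orig_ino, so the value seen by user space no longer depends on whether the attribute timeout has expired.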
     

05 Mar, 2012

2 commits

  • Implement ->direct_IO() method in aops. The ->direct_IO() method combines
    the existing fuse_direct_read/fuse_direct_write methods to implement
    O_DIRECT functionality.

    Reaching ->direct_IO() in the read path via generic_file_aio_read ensures
    proper synchronization with page cache with its existing framework.

    Reaching ->direct_IO() in the write path via fuse_file_aio_write is made
    to come via generic_file_direct_write() which makes it play nice with
    the page cache w.r.t other mmap pages etc.

    On files marked 'direct_io' by the filesystem server, IO always follows
    the fuse_direct_read/write path. There is no effect of fcntl(O_DIRECT)
    and it always succeeds.

    On files not marked with 'direct_io' by the filesystem server, the IO
    path depends on O_DIRECT flag by the application. This can be passed
    at the time of open() as well as via fcntl().

    Note that asynchronous O_DIRECT iocb jobs are always completed
    synchronously (this has been the case with FUSE even before this patch).

    Signed-off-by: Anand Avati
    Reviewed-by: Jeff Moyer
    Signed-off-by: Miklos Szeredi

    Anand Avati
     
  • Anand Avati reports that the following sequence of system calls fails on a
    fuse filesystem:

    create("filename") => 0
    link("filename", "linkname") => 0
    unlink("filename") => 0
    link("linkname", "filename") => -ENOENT ### BUG ###

    vfs_link() fails with ENOENT if i_nlink is zero; this is done to prevent
    resurrecting already deleted files.

    Fuse clears i_nlink on unlink even if there are other links pointing to the
    file.

    Reported-by: Anand Avati
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
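The failing sequence can be replayed against a tiny link-count model. This is a user-space sketch with invented names; it assumes the remedy is to drop one link on unlink rather than clearing the count, which the commit message implies but does not state.

```c
#include <assert.h>
#include <errno.h>

struct model_inode { unsigned nlink; };

/* Mirrors the check described above: refuse to link a deleted inode. */
static int model_link(struct model_inode *in)
{
    if (in->nlink == 0)
        return -ENOENT;
    in->nlink++;
    return 0;
}

/* Behaviour described as the bug: any unlink zeroes the link count. */
static void unlink_clearing(struct model_inode *in)
{
    in->nlink = 0;
}

/* Assumed fix: an unlink removes exactly one link. */
static void unlink_dropping(struct model_inode *in)
{
    if (in->nlink > 0)
        in->nlink--;
}

/* Replay create, link, unlink, link; return the final link result. */
static int replay(void (*do_unlink)(struct model_inode *))
{
    struct model_inode in = { 1 };   /* create("filename") */

    if (model_link(&in) != 0)        /* link("filename", "linkname") */
        return -1;
    do_unlink(&in);                  /* unlink("filename") */
    return model_link(&in);          /* link("linkname", "filename") */
}
```

Calling replay(unlink_clearing) reproduces the -ENOENT from the report, while replay(unlink_dropping) lets the final link succeed because one link still remains.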
     

13 Jan, 2012

1 commit


04 Jan, 2012

4 commits


13 Dec, 2011

2 commits

  • Allows a FUSE file-system to tell the kernel when a file or directory is
    deleted. If the specified dentry has the specified inode number, the kernel will
    unhash it.

    The current 'fuse_notify_inval_entry' does not cause the kernel to clean up
    directories that are in use properly, and as a result the users of those
    directories see incorrect semantics from the file-system. The error condition
    seen when 'fuse_notify_inval_entry' is used to notify of a deleted directory is
    avoided when 'fuse_notify_delete' is used instead.

    The following scenario demonstrates the difference:
    1. User A chdirs into 'testdir' and starts reading 'testfile'.
    2. User B rm -rf 'testdir'.
    3. User B creates 'testdir'.
    4. User C chdirs into 'testdir'.

    If you run the above within the same machine on any file-system (including fuse
    file-systems), there is no problem: user C is able to chdir into the new
    testdir. The old testdir is removed from the dentry tree, but still open by user
    A.

    If operations 2 and 3 are performed via the network such that the fuse
    file-system uses one of the notify functions to tell the kernel that the nodes
    are gone, then the following error occurs for user C while user A holds the
    original directory open:

    muirj@empacher:~> ls /test/testdir
    ls: cannot access /test/testdir: No such file or directory

    The issue here is that the kernel still has a dentry for testdir, and so it is
    requesting the attributes for the old directory, while the file-system is
    responding that the directory no longer exists.

    If, on the other hand, the file-system can notify the kernel that the
    directory is deleted using the new 'fuse_notify_delete' function, then the above
    ls will find the new directory as expected.

    Signed-off-by: John Muir
    Signed-off-by: Miklos Szeredi

    John Muir
     
  • Multiplexing filesystems may want to support ioctls on the underlying
    files and directories (e.g. FS_IOC_{GET,SET}FLAGS).

    Ioctl support on directories was missing so add it now.

    Reported-by: Antonio SJ Musumeci
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

21 Jul, 2011

1 commit

  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push down taking i_mutex and
    the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems can drop taking the i_mutex altogether it seems, like ext3 and
    ocfs2. For correctness' sake I just pushed everything down in all cases to make
    sure that we keep the current behavior the same for everybody, and then each
    individual fs maintainer can make up their mind about what to do from there.
    Thanks,

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     

20 Jul, 2011

5 commits


28 May, 2011

1 commit


27 May, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits)
    cifs: remove unnecessary dentry_unhash on rmdir/rename_dir
    ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir
    exofs: remove unnecessary dentry_unhash on rmdir/rename_dir
    nfs: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext2: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext3: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext4: remove unnecessary dentry_unhash on rmdir/rename_dir
    btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir
    ceph: remove unnecessary dentry_unhash calls
    vfs: clean up vfs_rename_other
    vfs: clean up vfs_rename_dir
    vfs: clean up vfs_rmdir
    vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
    libfs: drop unneeded dentry_unhash
    vfs: update dentry_unhash() comment
    vfs: push dentry_unhash on rename_dir into file systems
    vfs: push dentry_unhash on rmdir into file systems
    vfs: remove dget() from dentry_unhash()
    vfs: dentry_unhash immediately prior to rmdir
    vfs: Block mmapped writes while the fs is frozen
    ...

    Linus Torvalds
     

26 May, 2011

1 commit