08 Apr, 2016

1 commit

  • Pull ext4 bugfixes from Ted Ts'o:
    "These changes contain a fix for overlayfs interacting with some
    (badly behaved) dentry code in various file systems. These have been
    reviewed by Al and the respective file system maintainers and are going
    through the ext4 tree for convenience.

    This also has a few ext4 encryption bug fixes that were discovered in
    Android testing (yes, we will need to get these sync'ed up with the
    fs/crypto code; I'll take care of that). It also has some bug fixes
    and a change to ignore the legacy quota options to allow for xfstests
    regression testing of ext4's internal quota feature and to be more
    consistent with how xfs handles this case"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: ignore quota mount options if the quota feature is enabled
    ext4 crypto: fix some error handling
    ext4: avoid calling dquot_get_next_id() if quota is not enabled
    ext4: retry block allocation for failed DIO and DAX writes
    ext4: add lockdep annotations for i_data_sem
    ext4: allow readdir()'s of large empty directories to be interrupted
    btrfs: fix crash/invalid memory access on fsync when using overlayfs
    ext4 crypto: use dget_parent() in ext4_d_revalidate()
    ext4: use file_dentry()
    ext4: use dget_parent() in ext4_file_open()
    nfs: use file_dentry()
    fs: add file_dentry()
    ext4 crypto: don't let data integrity writebacks fail with ENOMEM
    ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()

    Linus Torvalds
     

27 Mar, 2016

1 commit

  • NFS may be used as lower layer of overlayfs and accessing f_path.dentry can
    lead to a crash.

    Fix by replacing direct access of file->f_path.dentry with the
    file_dentry() accessor, which will always return a native object.

    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
    Signed-off-by: Miklos Szeredi
    Tested-by: Goldwyn Rodrigues
    Acked-by: Trond Myklebust
    Signed-off-by: Theodore Ts'o
    Cc: # v4.2
    Cc: David Howells
    Cc: Al Viro

    Miklos Szeredi
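
A minimal userspace sketch of the accessor idea described in the commit above. The struct layouts, the `lower` field, and the `*_sketch` names are simplified stand-ins invented for illustration; they are not the kernel definitions. The point is only that callers ask an accessor for the dentry instead of reading the field directly, so a stacking filesystem gets a chance to hand back the native object.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel types (assumptions, not kernel code). */
struct inode { int ino; };

struct dentry {
    struct inode *d_inode;
    /* Set only by a stacking filesystem such as overlayfs. */
    struct dentry *(*d_real)(struct dentry *dentry, const struct inode *inode);
    struct dentry *lower;     /* hypothetical: dentry on the underlying fs */
};

struct file {
    struct dentry *f_dentry;  /* stands in for f_path.dentry; may be the overlay dentry */
    struct inode *f_inode;    /* the underlying (native) inode */
};

/* Overlay-style hook: return the lower dentry that owns the given inode. */
static struct dentry *ovl_d_real_sketch(struct dentry *dentry,
                                        const struct inode *inode)
{
    if (dentry->lower && dentry->lower->d_inode == inode)
        return dentry->lower;
    return dentry;
}

/* Accessor: return a dentry on the same filesystem as file->f_inode. */
static struct dentry *file_dentry_sketch(const struct file *file)
{
    struct dentry *dentry = file->f_dentry;

    if (dentry->d_real)
        return dentry->d_real(dentry, file->f_inode);
    return dentry;            /* not stacked: the dentry is already native */
}
```

With this shape, code such as an NFS fsync path would call the accessor and never see the overlay dentry, while files on a non-stacked filesystem pay only for one pointer test.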
     

17 Mar, 2016

1 commit

  • The only difference between nfs4_file_fsync and nfs_file_fsync is the
    call to pnfs_sync_inode. But
    pnfs_sync_inode is just an inline that calls a pNFS layout driver method
    if CONFIG_PNFS is defined, and thus can be called just fine from the core
    NFS module.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     

23 Jan, 2016

1 commit

  • Wrappers for ->i_mutex access, parallel to
    mutex_{lock,unlock,trylock,is_locked,lock_nested}:
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
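
A small sketch of the wrapper pattern from the commit above, under the assumption of a toy, single-threaded `struct mutex` (a boolean flag) standing in for the kernel's real mutex. Only the shape of the wrappers follows the commit text: each inode_foo(inode) forwards to mutex_foo(&inode->i_mutex).

```c
#include <stdbool.h>

/* Toy stand-in for the kernel mutex: a flag, no blocking (assumption). */
struct mutex { bool locked; };

static void mutex_lock(struct mutex *m)    { m->locked = true; }
static void mutex_unlock(struct mutex *m)  { m->locked = false; }
static int  mutex_is_locked(struct mutex *m) { return m->locked; }

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int mutex_trylock(struct mutex *m)
{
    if (m->locked)
        return 0;
    m->locked = true;
    return 1;
}

/* The subclass only matters to lock validation, so the toy ignores it. */
static void mutex_lock_nested(struct mutex *m, unsigned int subclass)
{
    (void)subclass;
    mutex_lock(m);
}

struct inode { struct mutex i_mutex; };

/* Wrappers: callers stop naming ->i_mutex directly. */
static inline void inode_lock(struct inode *inode)
{
    mutex_lock(&inode->i_mutex);
}

static inline void inode_unlock(struct inode *inode)
{
    mutex_unlock(&inode->i_mutex);
}

static inline int inode_trylock(struct inode *inode)
{
    return mutex_trylock(&inode->i_mutex);
}

static inline int inode_is_locked(struct inode *inode)
{
    return mutex_is_locked(&inode->i_mutex);
}

static inline void inode_lock_nested(struct inode *inode, unsigned int subclass)
{
    mutex_lock_nested(&inode->i_mutex, subclass);
}
```

Because callers go through the wrappers, the lock type behind ->i_mutex can later change (the commit mentions an rwsem) by editing the wrapper bodies rather than every call site.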
     

08 Dec, 2015

1 commit

  • The btrfs clone ioctls are now adopted by other file systems, with NFS
    and CIFS already having support for them, and XFS being under active
    development. To avoid growth of various slightly incompatible
    implementations, add one to the VFS. Note that clones are different from
    file copies in several ways:

    - they are atomic vs other writers
    - they support whole file clones
    - they support 64-bit length clones
    - they do not allow partial success (aka short writes)
    - clones are expected to be a fast metadata operation

    Because of that it would be rather cumbersome to try to piggyback them on
    top of the recent copy_file_range infrastructure. The converse isn't
    true and the copy_file_range system call could try clone file range as
    a first attempt to copy, something that further patches will enable.

    Based on earlier work from Peng Tao.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

24 Nov, 2015

4 commits


16 Oct, 2015

4 commits


08 Sep, 2015

1 commit

  • The NFSv4 delegation spec allows the server to tell a client to limit how
    much data it caches after the file is closed. In return, the server
    guarantees enough free space to avoid ENOSPC situations, etc.
    Prior to this patch, we assumed we could always cache aggressively after
    close. Unfortunately, this causes problems with servers that set the
    limit to 0 and therefore do not offer any ENOSPC guarantees.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

28 Aug, 2015

1 commit


26 Jun, 2015

1 commit

  • Commit 9597c13b forbade opens with O_APPEND|O_DIRECT for NFSv4:

    nfs: verify open flags before allowing an atomic open

    Currently, you can open an NFSv4 file with O_APPEND|O_DIRECT, but cannot
    fcntl(F_SETFL,...) with those flags. This flag combination is explicitly
    forbidden on NFSv3 opens, and it seems like it should also be on NFSv4.

    However, you can still open a file with O_DIRECT|O_APPEND if there exists a
    cached dentry for the file because nfs4_file_open() is used instead of
    nfs_atomic_open() and the check is bypassed. Add the check in
    nfs4_file_open() as well.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Benjamin Coddington
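
A sketch of the flag check the commit above describes, written as plain userspace C. The function name is hypothetical and the fallback value for O_DIRECT is an assumption used only if the C library hides the constant. The idea is that every open path (atomic open and the cached-dentry path alike) runs the same test, so the combination cannot slip through one of them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes O_DIRECT only with this defined */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 040000      /* assumption: x86 Linux value, fallback only */
#endif

/*
 * Reject opens that ask for both O_APPEND and O_DIRECT.
 * Returns 0 if the flags are acceptable, -EINVAL otherwise.
 */
static int check_open_flags_sketch(int open_flags)
{
    const int forbidden = O_APPEND | O_DIRECT;

    if ((open_flags & forbidden) == forbidden)
        return -EINVAL;
    return 0;
}
```

Either flag alone passes; only the combination is refused, matching the behaviour the message describes for fcntl(F_SETFL, ...).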
     

27 Apr, 2015

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "Another set of mainly bugfixes and a couple of cleanups. No new
    functionality in this round.

    Highlights include:

    Stable patches:
    - Fix a regression in /proc/self/mountstats
    - Fix the pNFS flexfiles O_DIRECT support
    - Fix high load average due to callback thread sleeping

    Bugfixes:
    - Various patches to fix the pNFS layoutcommit support
    - Do not cache pNFS deviceids unless server notifications are enabled
    - Fix a SUNRPC transport reconnection regression
    - make debugfs file creation failure non-fatal in SUNRPC
    - Another fix for circular directory warnings on NFSv4 "junctioned"
    mountpoints
    - Fix locking around NFSv4.2 fallocate() support
    - Truncating NFSv4 file opens should also sync O_DIRECT writes
    - Prevent infinite loop in rpcrdma_ep_create()

    Features:
    - Various improvements to the RDMA transport code's handling of
    memory registration
    - Various code cleanups"

    * tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits)
    fs/nfs: fix new compiler warning about boolean in switch
    nfs: Remove unneeded casts in nfs
    NFS: Don't attempt to decode missing directory entries
    Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"
    NFS: Rename idmap.c to nfs4idmap.c
    NFS: Move nfs_idmap.h into fs/nfs/
    NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h
    NFS: Add a stub for GETDEVICELIST
    nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes
    nfs: fix DIO good bytes calculation
    nfs: Fetch MOUNTED_ON_FILEID when updating an inode
    sunrpc: make debugfs file creation failure non-fatal
    nfs: fix high load average due to callback thread sleeping
    NFS: Reduce time spent holding the i_mutex during fallocate()
    NFS: Don't zap caches on fallocate()
    xprtrdma: Make rpcrdma_{un}map_one() into inline functions
    xprtrdma: Handle non-SEND completions via a callout
    xprtrdma: Add "open" memreg op
    xprtrdma: Add "destroy MRs" memreg op
    xprtrdma: Add "reset MRs" memreg op
    ...

    Linus Torvalds
     

24 Apr, 2015

2 commits

  • At the very least, we should not be taking the i_mutex until after
    checking if the server even supports ALLOCATE or DEALLOCATE, allowing
    v4.0 or v4.1 to exit without potentially waiting on a lock.

    Signed-off-by: Anna Schumaker
    Signed-off-by: Trond Myklebust

    Anna Schumaker
     
  • This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE
    operations so we can set the updated inode size and change attribute
    directly. DEALLOCATE will still need to release pagecache pages, so
    nfs42_proc_deallocate() now calls truncate_pagecache_range() before
    contacting the server.

    Signed-off-by: Anna Schumaker
    Signed-off-by: Trond Myklebust

    Anna Schumaker
     

16 Apr, 2015

1 commit


12 Apr, 2015

1 commit

  • All places outside of core VFS that checked ->read and ->write for being NULL or
    called the methods directly are gone now, so NULL {read,write} with non-NULL
    {read,write}_iter will do the right thing in all cases.

    Signed-off-by: Al Viro

    Al Viro
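
A simplified sketch of the dispatch rule stated in the commit above: a NULL ->read with a non-NULL ->read_iter still results in a working read. The types and signatures here are reduced stand-ins (the kernel's read_iter takes an iocb and an iterator, not a plain buffer), and `demo_read_iter` is a hypothetical filesystem method.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

struct file;

/* Reduced method table: both methods share one toy signature (assumption). */
struct file_operations {
    ssize_t (*read)(struct file *file, char *buf, size_t count);
    ssize_t (*read_iter)(struct file *file, char *buf, size_t count);
};

struct file {
    const struct file_operations *f_op;
    const char *data;     /* toy backing store */
};

/* Core dispatch: prefer ->read, fall back to ->read_iter, else fail. */
static ssize_t vfs_read_sketch(struct file *file, char *buf, size_t count)
{
    if (file->f_op->read)
        return file->f_op->read(file, buf, count);
    if (file->f_op->read_iter)
        return file->f_op->read_iter(file, buf, count);
    return -EINVAL;
}

/* Hypothetical filesystem that implements only the iterator method. */
static ssize_t demo_read_iter(struct file *file, char *buf, size_t count)
{
    size_t n = strlen(file->data);

    if (n > count)
        n = count;
    memcpy(buf, file->data, n);
    return (ssize_t)n;
}
```

Since the fallback lives in one core function, a filesystem can leave ->read NULL as long as nothing outside the core calls the method pointer directly, which is the precondition the commit says now holds.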
     

28 Mar, 2015

4 commits


26 Nov, 2014

2 commits


19 Oct, 2014

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "Highlights include:

    Stable fixes:
    - fix an uninitialised pointer Oops in the writeback error path
    - fix a bogus warning (and early exit from the loop) in nfs_generic_pgio()

    Features:
    - Add NFSv4.2 SEEK feature and client support for lseek(SEEK_HOLE/SEEK_DATA)

    Other fixes:
    - pnfs: replace broken pnfs_put_lseg_async
    - Remove dead prototype for nfs4_insert_deviceid_node"

    * tag 'nfs-for-3.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFS: Fix a bogus warning in nfs_generic_pgio
    NFS: Fix an uninitialised pointer Oops in the writeback error path
    NFSv4.1/pnfs: replace broken pnfs_put_lseg_async
    NFSv4: Remove dead prototype for nfs4_insert_deviceid_node()
    NFS: Implement SEEK

    Linus Torvalds
     

01 Oct, 2014

1 commit

  • The SEEK operation is used when an application makes an lseek call with
    either the SEEK_HOLE or SEEK_DATA flags set. I fall back on
    nfs_file_llseek() if the server does not have SEEK support.

    Signed-off-by: Anna Schumaker
    Signed-off-by: Trond Myklebust

    Anna Schumaker
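
A sketch of the fallback described in the commit above, assuming a toy file model with data in [0, hole_start) and one hole up to the file size. All names ending in `_sketch` and the `server_has_seek` flag are hypothetical, and -EOPNOTSUPP is used here as the "server lacks SEEK" signal. Other whence values are left out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for SEEK_DATA / SEEK_HOLE in glibc */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3          /* Linux values, fallback only */
#define SEEK_HOLE 4
#endif

struct file_sketch {
    int   server_has_seek;   /* hypothetical capability flag */
    off_t size;              /* file size */
    off_t hole_start;        /* data in [0, hole_start), hole after it */
};

/* Stand-in for asking the server, which knows where the hole is. */
static off_t server_seek_sketch(const struct file_sketch *f, off_t off, int whence)
{
    if (!f->server_has_seek)
        return -EOPNOTSUPP;
    if (off >= f->size)
        return -ENXIO;
    if (whence == SEEK_HOLE)
        return off < f->hole_start ? f->hole_start : off;
    return off < f->hole_start ? off : -ENXIO;   /* SEEK_DATA */
}

/* Generic fallback: the whole file counts as data, with a hole at EOF. */
static off_t generic_llseek_sketch(const struct file_sketch *f, off_t off, int whence)
{
    if (whence != SEEK_HOLE && whence != SEEK_DATA)
        return -EINVAL;      /* other whence values omitted from the sketch */
    if (off >= f->size)
        return -ENXIO;
    return whence == SEEK_HOLE ? f->size : off;
}

/* Try the server first; fall back only when it lacks SEEK support. */
static off_t llseek_sketch(const struct file_sketch *f, off_t off, int whence)
{
    if (whence == SEEK_HOLE || whence == SEEK_DATA) {
        off_t ret = server_seek_sketch(f, off, whence);

        if (ret != -EOPNOTSUPP)
            return ret;
    }
    return generic_llseek_sketch(f, off, whence);
}
```

The fallback is still a valid answer, only a less precise one: without server support the caller sees no holes before end of file.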
     

10 Sep, 2014

1 commit

  • GFS2 and NFS have setlease routines that always just return -EINVAL.
    Turn that into a generic routine that can live in fs/libfs.c.

    Cc:
    Cc: Steven Whitehouse
    Cc:
    Signed-off-by: Jeff Layton
    Acked-by: Trond Myklebust
    Reviewed-by: Christoph Hellwig

    Jeff Layton
     

13 Jun, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "This is the bunch that sat in -next + lock_parent() fix. This is the
    minimal set; there's more pending stuff.

    In particular, I really hope to get acct.c fixes merged this cycle -
    we need that to deal sanely with delayed-mntput stuff. In the next
    pile, hopefully - that series is fairly short and localized
    (kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
    iov_iter work. Most of prereqs for ->splice_write with sane locking
    order are there and Kent's dio rewrite would also fit nicely on top of
    this pile"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
    lock_parent: don't step on stale ->d_parent of all-but-freed one
    kill generic_file_splice_write()
    ceph: switch to iter_file_splice_write()
    shmem: switch to iter_file_splice_write()
    nfs: switch to iter_splice_write_file()
    fs/splice.c: remove unneeded exports
    ocfs2: switch to iter_file_splice_write()
    ->splice_write() via ->write_iter()
    bio_vec-backed iov_iter
    optimize copy_page_{to,from}_iter()
    bury generic_file_aio_{read,write}
    lustre: get rid of messing with iovecs
    ceph: switch to ->write_iter()
    ceph_sync_direct_write: stop poking into iov_iter guts
    ceph_sync_read: stop poking into iov_iter guts
    new helper: copy_page_from_iter()
    fuse: switch to ->write_iter()
    btrfs: switch to ->write_iter()
    ocfs2: switch to ->write_iter()
    xfs: switch to ->write_iter()
    ...

    Linus Torvalds
     

12 Jun, 2014

1 commit


29 May, 2014

1 commit

  • "fdatasync() is similar to fsync(), but does not flush modified metadata
    unless that metadata is needed in order to allow a subsequent data
    retrieval to be correctly handled."

    We absolutely need to commit the layouts to be able to retrieve the data
    in case either the client, the server or the storage subsystem goes down.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     

07 May, 2014

2 commits


13 Nov, 2013

1 commit

  • Pull vfs updates from Al Viro:
    "All kinds of stuff this time around; some more notable parts:

    - RCU'd vfsmounts handling
    - new primitives for coredump handling
    - files_lock is gone
    - Bruce's delegations handling series
    - exportfs fixes

    plus misc stuff all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
    ecryptfs: ->f_op is never NULL
    locks: break delegations on any attribute modification
    locks: break delegations on link
    locks: break delegations on rename
    locks: helper functions for delegation breaking
    locks: break delegations on unlink
    namei: minor vfs_unlink cleanup
    locks: implement delegations
    locks: introduce new FL_DELEG lock flag
    vfs: take i_mutex on renamed file
    vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
    vfs: don't use PARENT/CHILD lock classes for non-directories
    vfs: pull ext4's double-i_mutex-locking into common code
    exportfs: fix quadratic behavior in filehandle lookup
    exportfs: better variable name
    exportfs: move most of reconnect_path to helper function
    exportfs: eliminate unused "noprogress" counter
    exportfs: stop retrying once we race with rename/remove
    exportfs: clear DISCONNECTED on all parents sooner
    exportfs: more detailed comment for path_reconnect
    ...

    Linus Torvalds
     

29 Oct, 2013

1 commit

  • …/linux-fs into linux-next

    Pull fs-cache fixes from David Howells:

    Can you pull these commits to fix an issue with NFS whereby caching can be
    enabled on a file that is open for writing by subsequently opening it for
    reading. This can be made to crash by opening it for writing again if you're
    quick enough.

    The gist of the patchset is that the cookie should be acquired at inode
    creation only and subsequently enabled and disabled as appropriate (which
    dispenses with the backing objects when they're not needed).

    The extra synchronisation that NFS does can then be dispensed with as it is
    thenceforth managed by FS-Cache.

    Could you send these on to Linus?

    This will likely also need fixing in CIFS and 9P once the FS-Cache
    changes are upstream. AFS and Ceph are probably safe.

    * 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open()
    FS-Cache: Provide the ability to enable/disable cookies
    FS-Cache: Add use/unuse/wake cookie wrappers

    Trond Myklebust
     

25 Oct, 2013

1 commit


28 Sep, 2013

1 commit

  • Use i_writecount to control whether to get an fscache cookie in nfs_open() as
    NFS does not do write caching yet. I *think* this is the cause of a problem
    encountered by Mark Moseley whereby __fscache_uncache_page() gets a NULL
    pointer dereference because cookie->def is NULL:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: [] __fscache_uncache_page+0x23/0x160
    PGD 0
    Thread overran stack, or stack corrupted
    Oops: 0000 [#1] SMP
    Modules linked in: ...
    CPU: 7 PID: 18993 Comm: php Not tainted 3.11.1 #1
    Hardware name: Dell Inc. PowerEdge R420/072XWF, BIOS 1.3.5 08/21/2012
    task: ffff8804203460c0 ti: ffff880420346640
    RIP: 0010:[] __fscache_uncache_page+0x23/0x160
    RSP: 0018:ffff8801053af878 EFLAGS: 00210286
    RAX: 0000000000000000 RBX: ffff8800be2f8780 RCX: ffff88022ffae5e8
    RDX: 0000000000004c66 RSI: ffffea00055ff440 RDI: ffff8800be2f8780
    RBP: ffff8801053af898 R08: 0000000000000001 R09: 0000000000000003
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00055ff440
    R13: 0000000000001000 R14: ffff8800c50be538 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff88042fc60000(0063) knlGS:00000000e439c700
    CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 0000000001d8f000 CR4: 00000000000607f0
    Stack:
    ...
    Call Trace:
    [] __nfs_fscache_invalidate_page+0x42/0x70
    [] nfs_invalidate_page+0x75/0x90
    [] truncate_inode_page+0x8e/0x90
    [] truncate_inode_pages_range.part.12+0x14d/0x620
    [] ? __mutex_lock_slowpath+0x1fd/0x2e0
    [] truncate_inode_pages_range+0x53/0x70
    [] truncate_inode_pages+0x2d/0x40
    [] truncate_pagecache+0x4f/0x70
    [] nfs_setattr_update_inode+0xa0/0x120
    [] nfs3_proc_setattr+0xc4/0xe0
    [] nfs_setattr+0xc8/0x150
    [] notify_change+0x1cb/0x390
    [] do_truncate+0x7b/0xc0
    [] do_last+0xa4c/0xfd0
    [] path_openat+0xcc/0x670
    [] do_filp_open+0x4e/0xb0
    [] do_sys_open+0x13f/0x2b0
    [] compat_SyS_open+0x36/0x50
    [] sysenter_dispatch+0x7/0x24

    The code at the instruction pointer was disassembled:

    > (gdb) disas __fscache_uncache_page
    > Dump of assembler code for function __fscache_uncache_page:
    > ...
    > 0xffffffff812a18ff : mov 0x48(%rbx),%rax
    > 0xffffffff812a1903 : cmpb $0x0,0x10(%rax)
    > 0xffffffff812a1907 : je 0xffffffff812a19cd

    These instructions make up:

    ASSERTCMP(cookie->def->type, !=, FSCACHE_COOKIE_TYPE_INDEX);

    That cmpb is the faulting instruction (%rax is 0). So cookie->def is NULL -
    which presumably means that the cookie has already been at least partway
    through __fscache_relinquish_cookie().

    What I think may be happening is something like a three-way race on the same
    file:

    PROCESS 1                           PROCESS 2                           PROCESS 3
    ===============                     ===============                     ===============
    open(O_TRUNC|O_WRONLY)
                                        open(O_RDONLY)
                                                                            open(O_WRONLY)
    -->nfs_open()
    -->nfs_fscache_set_inode_cookie()
    nfs_fscache_inode_lock()
    nfs_fscache_disable_inode_cookie()
    __fscache_relinquish_cookie()
    nfs_inode->fscache = NULL
                                        -->nfs_open()
                                        -->nfs_fscache_set_inode_cookie()
                                        nfs_fscache_inode_lock()
                                        nfs_fscache_enable_inode_cookie()
                                        __fscache_acquire_cookie()
                                        nfs_inode->fscache = cookie
    -->nfs_setattr()
    ...
    ...
    -->nfs_invalidate_page()
    -->__nfs_fscache_invalidate_page()
    cookie = nfsi->fscache
                                                                            -->nfs_open()
                                                                            -->nfs_fscache_set_inode_cookie()
                                                                            nfs_fscache_inode_lock()
                                                                            nfs_fscache_disable_inode_cookie()
                                                                            -->__fscache_relinquish_cookie()
    -->__fscache_uncache_page(cookie)

                                                                            nfs_inode->fscache = NULL

    Signed-off-by: David Howells

    David Howells
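
A toy model of the policy named in the commit above: decide whether the cache cookie is usable from the inode's writer count at open time. The struct and function names are hypothetical and nothing here is the FS-Cache API; the sketch assumes the cookie exists for the inode's whole lifetime and is only switched on or off.

```c
#include <stdbool.h>

/* Hypothetical cookie: acquired once with the inode, then only toggled. */
struct cookie_sketch { bool enabled; };

struct inode_sketch {
    int i_writecount;               /* how many opens currently hold it for write */
    struct cookie_sketch cookie;
};

/* On every open, caching is allowed only while nobody has the file open for write. */
static void open_sketch(struct inode_sketch *inode, bool for_write)
{
    if (for_write)
        inode->i_writecount++;
    inode->cookie.enabled = (inode->i_writecount == 0);
}

/* Closing a writer drops the count; the next open re-evaluates the cookie. */
static void release_sketch(struct inode_sketch *inode, bool for_write)
{
    if (for_write)
        inode->i_writecount--;
}
```

In this model a read-only open that arrives while a writer is present leaves caching off, which is the case the race diagram shows going wrong when the decision depended on the flags of the most recent open instead.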
     

26 Sep, 2013

1 commit