25 Dec, 2016

1 commit


23 Dec, 2016

1 commit


18 Dec, 2016

1 commit

  • Pull more vfs updates from Al Viro:
    "In this pile:

    - autofs-namespace series
    - dedupe stuff
    - more struct path constification"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
    ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
    ocfs2: charge quota for reflinked blocks
    ocfs2: fix bad pointer cast
    ocfs2: always unlock when completing dio writes
    ocfs2: don't eat io errors during _dio_end_io_write
    ocfs2: budget for extent tree splits when adding refcount flag
    ocfs2: prohibit refcounted swapfiles
    ocfs2: add newlines to some error messages
    ocfs2: convert inode refcount test to a helper
    simple_write_end(): don't zero in short copy into uptodate
    exofs: don't mess with simple_write_{begin,end}
    9p: saner ->write_end() on failing copy into non-uptodate page
    fix gfs2_stuffed_write_end() on short copies
    fix ceph_write_end()
    nfs_write_end(): fix handling of short copies
    vfs: refactor clone/dedupe_file_range common functions
    fs: try to clone files first in vfs_copy_file_range
    vfs: misc struct path constification
    namespace.c: constify struct path passed to a bunch of primitives
    quota: constify struct path in quota_on
    ...

    Linus Torvalds
     

16 Dec, 2016

4 commits

  • With overlayfs, it is wrong to compare file_inode(inode)->i_sb
    of regular files with those of non-regular files, because the
    former reference the real (upper/lower) sb and the latter reference
    the overlayfs sb.

    Move the test for same super block after the sanity tests for
    clone range of directory and non-regular file.

    This change fixes xfstest generic/157, which returned EXDEV instead
    of EISDIR/EINVAL in the following test cases over overlayfs:

    echo "Try to reflink a dir"
    _reflink_range $testdir1/dir1 0 $testdir1/file2 0 $blksz

    echo "Try to reflink a device"
    _reflink_range $testdir1/dev1 0 $testdir1/file2 0 $blksz

    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
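
    The check ordering described above can be modelled in a few lines. This is
    only a userspace sketch, not the kernel code: clone_precheck() and its
    same_sb parameter are hypothetical names introduced for illustration, and
    the errno values are taken from the commit message.

```python
import errno
import stat


def clone_precheck(src_mode: int, dst_mode: int, same_sb: bool) -> int:
    """Return 0 or a negative errno, checking in the order the commit describes."""
    # Directories are rejected first, so reflinking a dir yields EISDIR.
    if stat.S_ISDIR(src_mode) or stat.S_ISDIR(dst_mode):
        return -errno.EISDIR
    # Any other non-regular file (device, fifo, ...) yields EINVAL.
    if not (stat.S_ISREG(src_mode) and stat.S_ISREG(dst_mode)):
        return -errno.EINVAL
    # Only now compare super blocks; both sides are known to be regular files.
    if not same_sb:
        return -errno.EXDEV
    return 0
```

    With the super block test placed last, a directory or device source reports
    EISDIR or EINVAL even when same_sb is False, which is the behaviour
    generic/157 expects.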
     
  • Move sb_start_write()/sb_end_write() out of the vfs helper and up into the
    ioctl handler.

    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
     
  • FICLONE/FICLONERANGE ioctls return -EXDEV if src and dest
    files are not on the same mount point.
    Practically, clone only requires that src and dest files
    are on the same file system.

    Move the check for same mount point to ioctl handler and keep
    only the check for same super block in the vfs helper.

    A following patch is going to use the vfs_clone_file_range()
    helper in overlayfs to copy up between lower and upper
    mount points on the same file system.

    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
     
  • We've checked for file_out being opened for write. This ensures that we
    already have mnt_want_write() on target.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

10 Dec, 2016

2 commits

  • Hoist both the XFS reflink inode state and preparation code and the XFS
    file blocks compare functions into the VFS so that ocfs2 can take
    advantage of it for reflink and dedupe.

    Signed-off-by: Darrick J. Wong

    Darrick J. Wong
     
  • A clone is a perfectly fine implementation of a file copy, so most
    file systems just implement the copy that way. Instead of duplicating
    this logic, move it to the VFS. Currently btrfs and XFS implement copies
    the same way as clones and there is no behavior change for them; cifs
    only implements clones and grows support for copy_file_range with this
    patch. NFS implements both, so this will allow copy_file_range to work
    on servers that only implement CLONE and be a lot more efficient on servers
    that implement CLONE and COPY.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
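
    The "clone first, copy otherwise" idea can be illustrated from userspace.
    This is a rough analogue under stated assumptions, not what the patch does
    inside the kernel: clone_or_copy() is a hypothetical helper, the FICLONE
    value is the Linux ioctl number from <linux/fs.h>, and any OSError from
    the ioctl is treated as "clone not available here".

```python
import fcntl
import os
import shutil
import tempfile

# FICLONE is _IOW(0x94, 9, int) in <linux/fs.h>.
FICLONE = 0x40049409


def clone_or_copy(src_path: str, dst_path: str) -> str:
    """Try a whole-file clone; fall back to a plain byte copy.

    Returns "clone" or "copy" to say which path was taken.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the file system to share extents between src and dst.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "clone"
        except OSError:
            # File system (or platform) cannot clone: copy the data instead.
            shutil.copyfileobj(src, dst)
            return "copy"
```

    Either way the destination ends up with the same contents; only the cost
    differs, which is why a clone is an acceptable way to implement a copy.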
     

15 Oct, 2016

1 commit

  • Both import_iovec() and rw_copy_check_uvector() take an array
    (typically small and on-stack) which is used to hold an iovec array copy
    from userspace. This is to avoid an expensive memory allocation in the
    fast path (i.e. few iovec elements).

    The caller may have to check whether these functions actually used
    the provided buffer or allocated a new one -- but this differs between
    the two. Let's just add a kernel doc to clarify what the semantics are
    for each function.

    Signed-off-by: Vegard Nossum
    Signed-off-by: Al Viro

    Vegard Nossum
     

15 Jul, 2016

1 commit

  • Don't use the same syscall numbers for 2 different syscalls:

    534 x32 preadv compat_sys_preadv64
    535 x32 pwritev compat_sys_pwritev64
    534 x32 preadv2 compat_sys_preadv2
    535 x32 pwritev2 compat_sys_pwritev2

    Add compat_sys_preadv64v2() and compat_sys_pwritev64v2() so that 64-bit offset
    is passed in one 64-bit register on x32, similar to compat_sys_preadv64()
    and compat_sys_pwritev64().

    Signed-off-by: H.J. Lu
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Christoph Hellwig
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/CAMe9rOovCMf-RQfx_n1U_Tu_DX1BYkjtFr%3DQ4-_PFVSj9BCzUA@mail.gmail.com
    Signed-off-by: Ingo Molnar

    H.J. Lu
     

19 May, 2016

1 commit


18 May, 2016

1 commit

  • Pull vfs cleanups from Al Viro:
    "More cleanups from Christoph"

    * 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    nfsd: use RWF_SYNC
    fs: add RWF_DSYNC and RWF_SYNC
    ceph: use generic_write_sync
    fs: simplify the generic_write_sync prototype
    fs: add IOCB_SYNC and IOCB_DSYNC
    direct-io: remove the offset argument to dio_complete
    direct-io: eliminate the offset argument to ->direct_IO
    xfs: eliminate the pos variable in xfs_file_dio_aio_write
    filemap: remove the pos argument to generic_file_direct_write
    filemap: remove pos variables in generic_file_read_iter

    Linus Torvalds
     

03 May, 2016

1 commit


02 May, 2016

1 commit


04 Apr, 2016

1 commit


19 Mar, 2016

1 commit


05 Mar, 2016

3 commits

  • This adds a flag that tells the file system that this is a high priority
    request for which it is worth polling the hardware. The flag is purely
    advisory and can be ignored if not supported.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Stephen Bates
    Tested-by: Stephen Bates
    Acked-by: Jeff Moyer
    Signed-off-by: Al Viro

    Christoph Hellwig
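
    From userspace, a per-call flag of this kind is passed through preadv2().
    The sketch below assumes Linux and Python 3.7+, where os.preadv() takes a
    flags argument and os.RWF_HIPRI is defined; read_hipri() is a hypothetical
    helper, and retrying without the flag is an illustration of "advisory",
    not something the commit prescribes.

```python
import os
import tempfile


def read_hipri(fd: int, length: int, offset: int = 0) -> bytes:
    """Read up to `length` bytes at `offset`, hinting high priority if possible."""
    if not hasattr(os, "preadv"):
        # Platform without preadv(): plain positional read.
        return os.pread(fd, length, offset)
    buf = bytearray(length)
    # RWF_HIPRI may be missing on older Pythons or non-Linux systems.
    flags = getattr(os, "RWF_HIPRI", 0)
    try:
        n = os.preadv(fd, [buf], offset, flags)
    except OSError:
        # The hint is advisory, so retry as an ordinary read.
        n = os.preadv(fd, [buf], offset)
    return bytes(buf[:n])
```

    The data returned is the same with or without the flag; only the way the
    request is serviced may differ.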
     
  • New syscalls that take a flags argument. No flags are added yet in this
    patch.

    Signed-off-by: Milosz Tanski
    [hch: rebased on top of my kiocb changes]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Stephen Bates
    Tested-by: Stephen Bates
    Acked-by: Jeff Moyer
    Signed-off-by: Al Viro

    Milosz Tanski
     
  • This way we can set kiocb flags also from the sync read/write path for
    the read_iter/write_iter operations. For now there is no way to pass
    flags to plain read/write operations as there is no real need for that,
    and all flags passed are explicitly rejected for these files.

    Signed-off-by: Milosz Tanski
    [hch: rebased on top of my kiocb changes]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Stephen Bates
    Tested-by: Stephen Bates
    Acked-by: Jeff Moyer
    Signed-off-by: Al Viro

    Christoph Hellwig
     

28 Feb, 2016

1 commit


20 Feb, 2016

1 commit

  • The user-visible impact of the issue is for example that without this
    patch sensors-detect breaks when trying to seek in /dev/cpu/0/cpuid.

    '~0ULL' is an 'unsigned long long' that, when converted to a loff_t,
    which is signed, gets turned into -1. Later, in vfs_setpos, we have
    'if (offset > maxsize)', which makes it always return EINVAL.

    Fixes: b25472f9b961 ("new helpers: no_seek_end_llseek{,_size}()")
    Signed-off-by: Wouter van Kesteren
    Reviewed-by: Andreas Dilger
    Signed-off-by: Al Viro

    Wouter van Kesteren
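
    The conversion problem described above is easy to reproduce outside the
    kernel. The sketch uses ctypes to mimic C integer conversion; as_loff_t()
    and offset_rejected() are hypothetical names, and loff_t is assumed to be
    a signed 64-bit integer.

```python
import ctypes


def as_loff_t(value: int) -> int:
    """Reinterpret `value` as a signed 64-bit integer, like a C loff_t."""
    return ctypes.c_longlong(value).value


def offset_rejected(offset: int, maxsize: int) -> bool:
    """Model of the 'if (offset > maxsize)' test quoted in the commit."""
    return offset > maxsize


# ~0ULL: all 64 bits set as an unsigned long long.
ALL_ONES = ctypes.c_ulonglong(~0).value

# Passed where a loff_t is expected, the same bits read as -1.
maxsize = as_loff_t(ALL_ONES)
```

    With maxsize equal to -1, offset_rejected(0, maxsize) is true, so every
    non-negative seek target fails the comparison.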
     

23 Jan, 2016

2 commits


13 Jan, 2016

1 commit

  • Pull misc vfs updates from Al Viro:
    "All kinds of stuff. That probably should've been 5 or 6 separate
    branches, but by the time I'd realized how large and mixed that bag
    had become it had been too close to -final to play with rebasing.

    Some fs/namei.c cleanups there, memdup_user_nul() introduction and
    switching open-coded instances, burying long-dead code, whack-a-mole
    of various kinds, several new helpers for ->llseek(), assorted
    cleanups and fixes from various people, etc.

    One piece probably deserves special mention - Neil's
    lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
    called without ->i_mutex and tries to avoid ever taking it. That, of
    course, means that it's not useful for any directory modifications,
    but things like getting inode attributes in nfsd readdirplus are fine
    with that. I really should've asked for moratorium on lookup-related
    changes this cycle, but since I hadn't done that early enough... I
    *am* asking for that for the coming cycle, though - I'm going to try
    and get conversion of i_mutex to rwsem with ->lookup() done under lock
    taken shared.

    There will be a patch closer to the end of the window, along the lines
    of the one Linus had posted last May - mechanical conversion of
    ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
    inode_is_locked()/inode_lock_nested(). To quote Linus back then:

    -----
    | This is an automated patch using
    |
    | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    | with a very few manual fixups
    -----

    I'm going to send that once the ->i_mutex-affecting stuff in -next
    gets mostly merged (or when Linus says he's about to stop taking
    merges)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    nfsd: don't hold i_mutex over userspace upcalls
    fs:affs:Replace time_t with time64_t
    fs/9p: use fscache mutex rather than spinlock
    proc: add a reschedule point in proc_readfd_common()
    logfs: constify logfs_block_ops structures
    fcntl: allow to set O_DIRECT flag on pipe
    fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
    fs: xattr: Use kvfree()
    [s390] page_to_phys() always returns a multiple of PAGE_SIZE
    nbd: use ->compat_ioctl()
    fs: use block_device name vsprintf helper
    lib/vsprintf: add %*pg format specifier
    fs: use gendisk->disk_name where possible
    poll: plug an unused argument to do_poll
    amdkfd: don't open-code memdup_user()
    cdrom: don't open-code memdup_user()
    rsxx: don't open-code memdup_user()
    mtip32xx: don't open-code memdup_user()
    [um] mconsole: don't open-code memdup_user_nul()
    [um] hostaudio: don't open-code memdup_user()
    ...

    Linus Torvalds
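
    The sed expressions quoted above can be tried on sample lines with an
    equivalent set of Python regular expressions. This is a sketch for
    experimenting with the patterns, not the script used for the actual
    conversion; RULES and convert_line() are hypothetical names, and like sed
    without the g flag each rule replaces at most one match per line.

```python
import re

# One (pattern, replacement) pair per quoted sed expression, in the same order.
RULES = [
    (r"mutex_lock\(&(.*)->i_mutex\)", r"inode_lock(\1)"),
    (r"mutex_unlock\(&(.*)->i_mutex\)", r"inode_unlock(\1)"),
    (r"mutex_lock_nested\(&(.*)->i_mutex,[ ]*I_MUTEX_([A-Z0-9_]*)\)",
     r"inode_lock_nested(\1, I_MUTEX_\2)"),
    (r"mutex_is_locked\(&(.*)->i_mutex\)", r"inode_is_locked(\1)"),
    (r"mutex_trylock\(&(.*)->i_mutex\)", r"inode_trylock(\1)"),
]


def convert_line(line: str) -> str:
    """Apply every rule once to a single source line."""
    for pattern, replacement in RULES:
        line = re.sub(pattern, replacement, line, count=1)
    return line
```

    For example, convert_line("mutex_lock(&inode->i_mutex);") gives
    "inode_lock(inode);". Lines the patterns do not cover are the ones that
    would need the manual fixups mentioned in the quote.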
     

01 Jan, 2016

2 commits


23 Dec, 2015

1 commit


08 Dec, 2015

2 commits

  • The btrfs clone ioctls are now adopted by other file systems, with NFS
    and CIFS already having support for them, and XFS being under active
    development. To avoid growth of various slightly incompatible
    implementations, add one to the VFS. Note that clones are different from
    file copies in several ways:

    - they are atomic vs other writers
    - they support whole file clones
    - they support 64-bit length clones
    - they do not allow partial success (aka short writes)
    - clones are expected to be a fast metadata operation

    Because of that it would be rather cumbersome to try to piggyback them on
    top of the recent copy_file_range infrastructure. The converse isn't
    true and the copy_file_range system call could try clone file range as
    a first attempt to copy, something that further patches will enable.

    Based on earlier work from Peng Tao.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Pass a loff_t end for the last byte instead of the 32-bit count
    parameter to allow full file clones even on 32-bit architectures.
    While we're at it also simplify the read/write selection.

    Signed-off-by: Christoph Hellwig
    Acked-by: J. Bruce Fields
    Signed-off-by: Al Viro

    Christoph Hellwig
     

02 Dec, 2015

2 commits

  • This allows us to have an in-kernel copy mechanism that avoids frequent
    switches between kernel and user space. This is especially useful so
    NFSD can support server-side copies.

    The default (flags=0) means to first attempt copy acceleration, but use
    the pagecache if that fails.

    Signed-off-by: Anna Schumaker
    Reviewed-by: Darrick J. Wong
    Reviewed-by: Padraig Brady
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Anna Schumaker
     
  • Add a copy_file_range() system call for offloading copies between
    regular files.

    This gives an interface to underlying layers of the storage stack which
    can copy without reading and writing all the data. There are a few
    candidates that should support copy offloading in the nearer term:

    - btrfs shares extent references with its clone ioctl
    - NFS has patches to add a COPY command which copies on the server
    - SCSI has a family of XCOPY commands which copy in the device

    This system call avoids the complexity of also accelerating the creation
    of the destination file by operating on an existing destination file
    descriptor, not a path.

    Currently the high level vfs entry point limits copy offloading to files
    on the same mount and super (and not in the same file). This can be
    relaxed if we get implementations which can copy between file systems
    safely.

    Signed-off-by: Zach Brown
    [Anna Schumaker: Change -EINVAL to -EBADF during file verification,
    Change flags parameter from int to unsigned int,
    Add function to include/linux/syscalls.h,
    Check copy len after file open mode,
    Don't forbid ranges inside the same file,
    Use rw_verify_area() to verify ranges,
    Use file_out rather than file_in,
    Add COPY_FR_REFLINK flag]
    Signed-off-by: Anna Schumaker
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Zach Brown
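
    A caller of the system call described above typically loops, because a
    single call may copy fewer bytes than requested. The sketch assumes Linux
    and Python 3.8+, where os.copy_file_range() wraps the syscall; copy_range()
    is a hypothetical helper, and the read/write fallback is an illustration
    for systems where the call is missing or fails.

```python
import os
import tempfile


def copy_range(src_fd: int, dst_fd: int, count: int) -> int:
    """Copy up to `count` bytes between the current file offsets of two fds."""
    copied = 0
    while copied < count:
        try:
            # Let the kernel (and file system) move the data if it can.
            n = os.copy_file_range(src_fd, dst_fd, count - copied)
        except (AttributeError, OSError):
            # No offload available: move one chunk through userspace.
            data = os.read(src_fd, min(count - copied, 1 << 16))
            view = memoryview(data)
            while view:
                view = view[os.write(dst_fd, view):]
            n = len(data)
        if n == 0:
            break  # end of source file
        copied += n
    return copied
```

    Both file descriptors refer to already-open files, matching the note above
    that the interface operates on an existing destination descriptor rather
    than a path.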
     

12 Apr, 2015

8 commits