11 Oct, 2016

4 commits

  • Pull more vfs updates from Al Viro:
    ">rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     
  • Pull vfs xattr updates from Al Viro:
    "xattr stuff from Andreas

    This completes the switch to xattr_handler ->get()/->set() from
    ->getxattr/->setxattr/->removexattr"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Remove {get,set,remove}xattr inode operations
    xattr: Stop calling {get,set,remove}xattr inode operations
    vfs: Check for the IOP_XATTR flag in listxattr
    xattr: Add __vfs_{get,set,remove}xattr helpers
    libfs: Use IOP_XATTR flag for empty directory handling
    vfs: Use IOP_XATTR flag for bad-inode handling
    vfs: Add IOP_XATTR inode operations flag
    vfs: Move xattr_resolve_name to the front of fs/xattr.c
    ecryptfs: Switch to generic xattr handlers
    sockfs: Get rid of getxattr iop
    sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    kernfs: Switch to generic xattr handlers
    hfs: Switch to generic xattr handlers
    jffs2: Remove jffs2_{get,set,remove}xattr macros
    xattr: Remove unnecessary NULL attribute name check

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "Assorted misc bits and pieces.

    There are several single-topic branches left after this (rename2
    series from Miklos, current_time series from Deepa Dinamani, xattr
    series from Andreas, uaccess stuff from from me) and I'd prefer to
    send those separately"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
    proc: switch auxv to use of __mem_open()
    hpfs: support FIEMAP
    cifs: get rid of unused arguments of CIFSSMBWrite()
    posix_acl: uapi header split
    posix_acl: xattr representation cleanups
    fs/aio.c: eliminate redundant loads in put_aio_ring_file
    fs/internal.h: add const to ns_dentry_operations declaration
    compat: remove compat_printk()
    fs/buffer.c: make __getblk_slow() static
    proc: unsigned file descriptors
    fs/file: more unsigned file descriptors
    fs: compat: remove redundant check of nr_segs
    cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
    cifs: don't use memcpy() to copy struct iov_iter
    get rid of separate multipage fault-in primitives
    fs: Avoid premature clearing of capabilities
    fs: Give dentry to inode_change_ok() instead of inode
    fuse: Propagate dentry down to inode_change_ok()
    ceph: Propagate dentry down to inode_change_ok()
    xfs: Propagate dentry down to inode_change_ok()
    ...

    Linus Torvalds
     

08 Oct, 2016

3 commits

  • Al Viro
     
  • These inode operations are no longer used; remove them.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • Pull VFS splice updates from Al Viro:
    "There's a bunch of branches this cycle, both mine and from other folks
    and I'd rather send pull requests separately.

    This one is the conversion of ->splice_read() to ITER_PIPE iov_iter
    (and introduction of such). Gets rid of a lot of code in fs/splice.c
    and elsewhere; there will be followups, but these are for the next
    cycle... Some pipe/splice-related cleanups from Miklos in the same
    branch as well"

    * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    pipe: fix comment in pipe_buf_operations
    pipe: add pipe_buf_steal() helper
    pipe: add pipe_buf_confirm() helper
    pipe: add pipe_buf_release() helper
    pipe: add pipe_buf_get() helper
    relay: simplify relay_file_read()
    switch default_file_splice_read() to use of pipe-backed iov_iter
    switch generic_file_splice_read() to use of ->read_iter()
    new iov_iter flavour: pipe-backed
    fuse_dev_splice_read(): switch to add_to_pipe()
    skb_splice_bits(): get rid of callback
    new helper: add_to_pipe()
    splice: lift pipe_lock out of splice_to_pipe()
    splice: switch get_iovec_page_array() to iov_iter
    splice_to_pipe(): don't open-code wakeup_pipe_readers()
    consistent treatment of EFAULT on O_DIRECT read/write

    Linus Torvalds
     

06 Oct, 2016

4 commits


04 Oct, 2016

2 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • * splice_to_pipe() stops at pipe overflow and does *not* take pipe_lock
    * ->splice_read() instances do the same
    * vmsplice_to_pipe() and do_splice() (ultimate callers of splice_to_pipe())
    arrange for waiting, looping, etc. themselves.

    That should make pipe_lock the outermost one.

    Unfortunately, existing rules for the amount passed by vmsplice_to_pipe()
    and do_splice() are quite ugly _and_ userland code can be easily broken
    by changing those. It's not even "no more than the maximal capacity of
    this pipe" - it's "once we'd fed pipe->nr_buffers pages into the pipe,
    leave instead of waiting".

    Considering how poorly these rules are documented, let's try "wait for some
    space to appear, unless given SPLICE_F_NONBLOCK, then push into pipe
    and if we run into overflow, we are done".

    Signed-off-by: Al Viro

    Al Viro
     

03 Oct, 2016

1 commit


01 Oct, 2016

12 commits

  • Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • The two invocations share little code.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Signed-off-by: Al Viro
    Signed-off-by: Miklos Szeredi

    Al Viro
     
  • In preparation for posix acl support, rework fuse to use xattr handlers and
    the generic setxattr/getxattr/listxattr callbacks. Split the xattr code
    out into it's own file, and promote symbols to module-global scope as
    needed.

    Functionally these changes have no impact, as fuse still uses a single
    handler for all xattrs which uses the old callbacks.

    Signed-off-by: Seth Forshee
    Signed-off-by: Miklos Szeredi

    Seth Forshee
     
  • Only two flags: "default_permissions" and "allow_other". All other flags
    are handled via bitfields. So convert these two as well. They don't
    change during the lifetime of the filesystem, so this is quite safe.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Make sure userspace filesystem is returning a well formed list of xattr
    names (zero or more nonzero length, null terminated strings).

    [Michael Theall: only verify in the nonzero size case]

    Signed-off-by: Miklos Szeredi
    Cc:

    Miklos Szeredi
     
  • And check for valid nsec value before passing into timespec64_to_jiffies().

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Store in memory pointed to by ->d_fsdata. Use ->d_init() to allocate the
    storage. Need to use RCU freeing because the data is used in RCU lookup
    mode.

    We could cast ->d_fsdata directly on 64bit archs, but I don't think this is
    worth the extra complexity.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Add a new INIT flag, FUSE_POSIX_ACL, for negotiating ACL support with
    userspace. When it is set in the INIT response, ACL support will be
    enabled. ACL support also implies "default_permissions".

    When ACL support is enabled, the kernel will cache and have responsibility
    for enforcing ACLs. ACL xattrs will be passed to userspace, which is
    responsible for updating the ACLs in the filesystem, keeping the file mode
    in sync, and inheritance of default ACLs when new filesystem nodes are
    created.

    Signed-off-by: Seth Forshee
    Signed-off-by: Miklos Szeredi

    Seth Forshee
     
  • Only userspace filesystem can do the killing of suid/sgid without races.
    So introduce an INIT flag and negotiate support for this.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on
    chown and truncate, and (since writeback_cache) write. The problem with
    this is that it'll potentially restore a stale mode.

    The poper fix would be to let the filesystems do the suid/sgid clearing on
    the relevant operations. Possibly some are already doing it but there's no
    way we can detect this.

    So fix this by refreshing and recalculating the mode. Do this only if
    ATTR_KILL_S[UG]ID is set to not destroy performance for writes. This is
    still racy but the size of the window is reduced.

    Signed-off-by: Miklos Szeredi
    Cc:

    Miklos Szeredi
     
  • Without "default_permissions" the userspace filesystem's lookup operation
    needs to perform the check for search permission on the directory.

    If directory does not allow search for everyone (this is quite rare) then
    userspace filesystem has to set entry timeout to zero to make sure
    permissions are always performed.

    Changing the mode bits of the directory should also invalidate the
    (previously cached) dentry to make sure the next lookup will have a chance
    of updating the timeout, if needed.

    Reported-by: Jean-Pierre André
    Signed-off-by: Miklos Szeredi
    Cc:

    Miklos Szeredi
     

28 Sep, 2016

2 commits

  • current_fs_time() uses struct super_block* as an argument.
    As per Linus's suggestion, this is changed to take struct
    inode* as a parameter instead. This is because the function
    is primarily meant for vfs inode timestamps.
    Also the function was renamed as per Arnd's suggestion.

    Change all calls to current_fs_time() to use the new
    current_time() function instead. current_fs_time() will be
    deleted.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Al Viro

    Deepa Dinamani
     
  • CURRENT_TIME macro is not appropriate for filesystems as it
    doesn't use the right granularity for filesystem timestamps.
    Use current_time() instead.

    CURRENT_TIME is also not y2038 safe.

    This is also in preparation for the patch that transitions
    vfs timestamps to use 64 bit time and hence make them
    y2038 safe. As part of the effort current_time() will be
    extended to do range checks. Hence, it is necessary for all
    file system timestamps to use current_time(). Also,
    current_time() will be transitioned along with vfs to be
    y2038 safe.

    Note that whenever a single call to current_time() is used
    to change timestamps in different inodes, it is because they
    share the same time granularity.

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Felipe Balbi
    Acked-by: Steven Whitehouse
    Acked-by: Ryusuke Konishi
    Acked-by: David Sterba
    Signed-off-by: Al Viro

    Deepa Dinamani
     

27 Sep, 2016

1 commit


22 Sep, 2016

2 commits

  • inode_change_ok() will be resposible for clearing capabilities and IMA
    extended attributes and as such will need dentry. Give it as an argument
    to inode_change_ok() instead of an inode. Also rename inode_change_ok()
    to setattr_prepare() to better relect that it does also some
    modifications in addition to checks.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     
  • To avoid clearing of capabilities or security related extended
    attributes too early, inode_change_ok() will need to take dentry instead
    of inode. Propagate it down to fuse_do_setattr().

    Acked-by: Miklos Szeredi
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     

25 Aug, 2016

1 commit

  • When reading from a loop device backed by a fuse file it deadlocks on
    lock_page().

    This is because the page is already locked by the read() operation done on
    the loop device. In this case we don't want to either lock the page or
    dirty it.

    So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.

    Reported-by: Sheng Yang
    Fixes: aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")
    Signed-off-by: Miklos Szeredi
    Cc: # v4.1+
    Reviewed-by: Sheng Yang
    Reviewed-by: Ashish Samant
    Tested-by: Sheng Yang
    Tested-by: Ashish Samant

    Miklos Szeredi
     

06 Aug, 2016

1 commit

  • Pull qstr constification updates from Al Viro:
    "Fairly self-contained bunch - surprising lot of places passes struct
    qstr * as an argument when const struct qstr * would suffice; it
    complicates analysis for no good reason.

    I'd prefer to feed that separately from the assorted fixes (those are
    in #for-linus and with somewhat trickier topology)"

    * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    qstr: constify instances in adfs
    qstr: constify instances in lustre
    qstr: constify instances in f2fs
    qstr: constify instances in ext2
    qstr: constify instances in vfat
    qstr: constify instances in procfs
    qstr: constify instances in fuse
    qstr constify instances in fs/dcache.c
    qstr: constify instances in nfs
    qstr: constify instances in ocfs2
    qstr: constify instances in autofs4
    qstr: constify instances in hfs
    qstr: constify instances in hfsplus
    qstr: constify instances in logfs
    qstr: constify dentry_init_security

    Linus Torvalds
     

31 Jul, 2016

1 commit


30 Jul, 2016

1 commit

  • Pull fuse updates from Miklos Szeredi:
    "This fixes error propagation from writeback to fsync/close for
    writeback cache mode as well as adding a missing capability flag to
    the INIT message. The rest are cleanups.

    (The commits are recent but all the code actually sat in -next for a
    while now. The recommits are due to conflict avoidance and the
    addition of Cc: stable@...)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: use filemap_check_errors()
    mm: export filemap_check_errors() to modules
    fuse: fix wrong assignment of ->flags in fuse_send_init()
    fuse: fuse_flush must check mapping->flags for errors
    fuse: fsync() did not return IO errors
    fuse: don't mess with blocking signals
    new helper: wait_event_killable_exclusive()
    fuse: improve aio directIO write performance for size extending writes

    Linus Torvalds
     

29 Jul, 2016

5 commits

  • Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • FUSE_HAS_IOCTL_DIR should be assigned to ->flags, it may be a typo.

    Signed-off-by: Wei Fang
    Signed-off-by: Miklos Szeredi
    Fixes: 69fe05c90ed5 ("fuse: add missing INIT flags")
    Cc:

    Wei Fang
     
  • fuse_flush() calls write_inode_now() that triggers writeback, but actual
    writeback will happen later, on fuse_sync_writes(). If an error happens,
    fuse_writepage_end() will set error bit in mapping->flags. So, we have to
    check mapping->flags after fuse_sync_writes().

    Signed-off-by: Maxim Patlasov
    Signed-off-by: Miklos Szeredi
    Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on")
    Cc: # v3.15+

    Maxim Patlasov
     
  • Due to implementation of fuse writeback filemap_write_and_wait_range() does
    not catch errors. We have to do this directly after fuse_sync_writes()

    Signed-off-by: Alexey Kuznetsov
    Signed-off-by: Maxim Patlasov
    Signed-off-by: Miklos Szeredi
    Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on")
    Cc: # v3.15+

    Alexey Kuznetsov
     
  • Merge more updates from Andrew Morton:
    "The rest of MM"

    * emailed patches from Andrew Morton : (101 commits)
    mm, compaction: simplify contended compaction handling
    mm, compaction: introduce direct compaction priority
    mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
    mm, page_alloc: make THP-specific decisions more generic
    mm, page_alloc: restructure direct compaction handling in slowpath
    mm, page_alloc: don't retry initial attempt in slowpath
    mm, page_alloc: set alloc_flags only once in slowpath
    lib/stackdepot.c: use __GFP_NOWARN for stack allocations
    mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
    mm, kasan: account for object redzone in SLUB's nearest_obj()
    mm: fix use-after-free if memory allocation failed in vma_adjust()
    zsmalloc: Delete an unnecessary check before the function call "iput"
    mm/memblock.c: fix index adjustment error in __next_mem_range_rev()
    mem-hotplug: alloc new page from a nearest neighbor node when mem-offline
    mm: optimize copy_page_to/from_iter_iovec
    mm: add cond_resched() to generic_swapfile_activate()
    Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"
    mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
    mm: hwpoison: remove incorrect comments
    make __section_nr() more efficient
    ...

    Linus Torvalds