18 Oct, 2016

1 commit

  • On ARM, we get this false-positive warning since the rework of
    the ext2_get_blocks interface:

    fs/ext2/inode.c: In function 'ext2_get_block':
    include/linux/buffer_head.h:340:16: error: 'bno' may be used uninitialized in this function [-Werror=maybe-uninitialized]

    The calling conventions for this function are rather complex, and it's
    not surprising that the compiler gets this wrong; I spent a long time
    trying to understand how it all fits together myself.

    To avoid the warning, this change makes sure the compiler sees that we
    always set the 'bno' pointer whenever we have a positive return code.
    The transformation is correct because we always arrive at the 'got_it'
    label with a positive count that gets used as the return value, while
    any branch to the 'cleanup' label has a negative or zero 'err'.

    Fixes: 6750ad71986d ("ext2: stop passing buffer_head to ext2_get_blocks")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Signed-off-by: Jan Kara

    Arnd Bergmann
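
The invariant described in this commit (the out-parameter is written on every path that returns a positive count) can be illustrated with a small userspace sketch. This is not the ext2 code: toy_get_blocks(), TOY_FILE_BLOCKS and TOY_BASE_BLOCK are hypothetical names invented for the example, and the mapping is a trivial linear one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_FILE_BLOCKS 8UL   /* hypothetical: number of mapped logical blocks */
#define TOY_BASE_BLOCK  100UL /* hypothetical: disk block of logical block 0 */

/*
 * Returns a positive count of mapped blocks and stores the first block
 * number in *bno, or returns 0 (hole) or a negative errno without
 * touching *bno.
 */
static int toy_get_blocks(unsigned long iblock, unsigned long maxblocks,
                          unsigned long *bno)
{
    unsigned long first = 0;
    unsigned long remaining;
    int err;

    assert(bno != NULL);

    if (maxblocks == 0) {
        err = -EINVAL;          /* error: negative return */
        goto cleanup;
    }
    if (iblock >= TOY_FILE_BLOCKS) {
        err = 0;                /* hole: zero return */
        goto cleanup;
    }

    /* "got_it": a mapping exists, so the count is positive. */
    first = TOY_BASE_BLOCK + iblock;
    remaining = TOY_FILE_BLOCKS - iblock;
    err = (int)(remaining < maxblocks ? remaining : maxblocks);

cleanup:
    /* Single exit: the store is tied directly to the positive return. */
    if (err > 0)
        *bno = first;
    return err;
}
```

Tying the store to the `err > 0` test at the single exit is one way to make the relationship visible to the compiler's flow analysis; callers must still only read the block number after a positive return.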
     

11 Oct, 2016

4 commits

  • Pull more vfs updates from Al Viro:
    "->rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     
  • Pull vfs xattr updates from Al Viro:
    "xattr stuff from Andreas

    This completes the switch to xattr_handler ->get()/->set() from
    ->getxattr/->setxattr/->removexattr"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Remove {get,set,remove}xattr inode operations
    xattr: Stop calling {get,set,remove}xattr inode operations
    vfs: Check for the IOP_XATTR flag in listxattr
    xattr: Add __vfs_{get,set,remove}xattr helpers
    libfs: Use IOP_XATTR flag for empty directory handling
    vfs: Use IOP_XATTR flag for bad-inode handling
    vfs: Add IOP_XATTR inode operations flag
    vfs: Move xattr_resolve_name to the front of fs/xattr.c
    ecryptfs: Switch to generic xattr handlers
    sockfs: Get rid of getxattr iop
    sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    kernfs: Switch to generic xattr handlers
    hfs: Switch to generic xattr handlers
    jffs2: Remove jffs2_{get,set,remove}xattr macros
    xattr: Remove unnecessary NULL attribute name check

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "Assorted misc bits and pieces.

    There are several single-topic branches left after this (rename2
    series from Miklos, current_time series from Deepa Dinamani, xattr
    series from Andreas, uaccess stuff from me) and I'd prefer to
    send those separately"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
    proc: switch auxv to use of __mem_open()
    hpfs: support FIEMAP
    cifs: get rid of unused arguments of CIFSSMBWrite()
    posix_acl: uapi header split
    posix_acl: xattr representation cleanups
    fs/aio.c: eliminate redundant loads in put_aio_ring_file
    fs/internal.h: add const to ns_dentry_operations declaration
    compat: remove compat_printk()
    fs/buffer.c: make __getblk_slow() static
    proc: unsigned file descriptors
    fs/file: more unsigned file descriptors
    fs: compat: remove redundant check of nr_segs
    cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
    cifs: don't use memcpy() to copy struct iov_iter
    get rid of separate multipage fault-in primitives
    fs: Avoid premature clearing of capabilities
    fs: Give dentry to inode_change_ok() instead of inode
    fuse: Propagate dentry down to inode_change_ok()
    ceph: Propagate dentry down to inode_change_ok()
    xfs: Propagate dentry down to inode_change_ok()
    ...

    Linus Torvalds
     

08 Oct, 2016

2 commits

  • These inode operations are no longer used; remove them.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • To support DAX pmd mappings with unmodified applications, filesystems
    need to align an mmap address by the pmd size.

    Call thp_get_unmapped_area() from f_op->get_unmapped_area.

    Note, there is no change in behavior for a non-DAX file.

    Link: http://lkml.kernel.org/r/1472497881-9323-3-git-send-email-toshi.kani@hpe.com
    Signed-off-by: Toshi Kani
    Cc: Dan Williams
    Cc: Matthew Wilcox
    Cc: Ross Zwisler
    Cc: Kirill A. Shutemov
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Theodore Ts'o
    Cc: Andreas Dilger
    Cc: Mike Kravetz
    Cc: "Kirill A. Shutemov"
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshi Kani
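
As a rough illustration of what "align an mmap address by the pmd size" means, the sketch below rounds a candidate address up to a PMD boundary. It is a userspace example, not the logic of thp_get_unmapped_area(); TOY_PMD_SIZE assumes a 2 MiB PMD (as on x86-64 with 4 KiB pages) and toy_align_up() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD mappings; other architectures differ. */
#define TOY_PMD_SIZE (2ULL * 1024 * 1024)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t toy_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

A mapping that starts on such a boundary (and whose file offset is equally aligned) can be backed by a single PMD entry instead of many PTEs.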
     

06 Oct, 2016

1 commit

  • Pull xfs and iomap updates from Dave Chinner:
    "The main things in this update are the iomap-based DAX infrastructure,
    an XFS delalloc rework, and a chunk of fixes to how log recovery
    schedules writeback to prevent spurious corruption detections when
    recovery of certain items was not required.

    The other main chunk of code is some preparation for the upcoming
    reflink functionality. Most of it is generic and cleanups that stand
    alone, but they were ready and reviewed so are in this pull request.

    Speaking of reflink, I'm currently planning to send you another pull
    request next week containing all the new reflink functionality. I'm
    working through a similar process to the last cycle, where I sent the
    reverse mapping code in a separate request because of how large it
    was. The reflink code merge is even bigger than reverse mapping, so
    I'll be doing the same thing again....

    Summary for this update:

    - change of XFS mailing list to linux-xfs@vger.kernel.org

    - iomap-based DAX infrastructure w/ XFS and ext2 support

    - small iomap fixes and additions

    - more efficient XFS delayed allocation infrastructure based on iomap

    - a rework of log recovery writeback scheduling to ensure we don't
    fail recovery when trying to replay items that are already on disk

    - some preparation patches for upcoming reflink support

    - configurable error handling fixes and documentation

    - aio access time update race fixes for XFS and
    generic_file_read_iter"

    * tag 'xfs-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (40 commits)
    fs: update atime before I/O in generic_file_read_iter
    xfs: update atime before I/O in xfs_file_dio_aio_read
    ext2: fix possible integer truncation in ext2_iomap_begin
    xfs: log recovery tracepoints to track current lsn and buffer submission
    xfs: update metadata LSN in buffers during log recovery
    xfs: don't warn on buffers not being recovered due to LSN
    xfs: pass current lsn to log recovery buffer validation
    xfs: rework log recovery to submit buffers on LSN boundaries
    xfs: quiesce the filesystem after recovery on readonly mount
    xfs: remote attribute blocks aren't really userdata
    ext2: use iomap to implement DAX
    ext2: stop passing buffer_head to ext2_get_blocks
    xfs: use iomap to implement DAX
    xfs: refactor xfs_setfilesize
    xfs: take the ilock shared if possible in xfs_file_iomap_begin
    xfs: fix locking for DAX writes
    dax: provide an iomap based fault handler
    dax: provide an iomap based dax read/write path
    dax: don't pass buffer_head to copy_user_dax
    dax: don't pass buffer_head to dax_insert_mapping
    ...

    Linus Torvalds
     

03 Oct, 2016

1 commit


28 Sep, 2016

3 commits

  • CURRENT_TIME_SEC is not y2038 safe. current_time() will
    be transitioned to use 64 bit time along with vfs in a
    separate patch.
    There is no plan to transition CURRENT_TIME_SEC to use
    y2038 safe time interfaces.

    current_time() will also be extended to use superblock
    range checking parameters when range checking is introduced.

    This works because alloc_super() sets s_time_gran in the
    superblock to NSEC_PER_SEC.

    Signed-off-by: Deepa Dinamani
    Acked-by: Jan Kara
    Signed-off-by: Al Viro

    Deepa Dinamani
     
  • The CURRENT_TIME macro is not appropriate for filesystems as it
    doesn't use the right granularity for filesystem timestamps.
    Use current_time() instead.

    CURRENT_TIME is also not y2038 safe.

    This is also in preparation for the patch that transitions
    vfs timestamps to use 64 bit time and hence make them
    y2038 safe. As part of the effort current_time() will be
    extended to do range checks. Hence, it is necessary for all
    file system timestamps to use current_time(). Also,
    current_time() will be transitioned along with vfs to be
    y2038 safe.

    Note that whenever a single call to current_time() is used
    to change timestamps in different inodes, it is because they
    share the same time granularity.

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Acked-by: Felipe Balbi
    Acked-by: Steven Whitehouse
    Acked-by: Ryusuke Konishi
    Acked-by: David Sterba
    Signed-off-by: Al Viro

    Deepa Dinamani
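
To make the granularity point concrete, here is a minimal userspace sketch of truncating a timestamp to a filesystem's time granularity. The helper toy_trunc() and the constant TOY_NSEC_PER_SEC are names made up for this example; the sketch only illustrates the idea and is not the kernel implementation of current_time().

```c
#include <assert.h>
#include <time.h>

#define TOY_NSEC_PER_SEC 1000000000L

/*
 * Truncate t to a granularity given in nanoseconds.
 * gran == 1 keeps full nanosecond resolution;
 * gran == TOY_NSEC_PER_SEC keeps whole seconds only.
 */
static struct timespec toy_trunc(struct timespec t, long gran)
{
    assert(gran >= 1 && gran <= TOY_NSEC_PER_SEC);

    if (gran == TOY_NSEC_PER_SEC)
        t.tv_nsec = 0;
    else if (gran > 1)
        t.tv_nsec -= t.tv_nsec % gran;
    return t;
}
```

With a one-second granularity the result matches what a seconds-only timestamp would have stored, which is why a single granularity-aware helper can stand in for both older macros.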
     
  • When zeroing blocks for DAX allocations, we also have to unmap aliases
    in the block device mappings. Otherwise writeback can overwrite zeros
    with stale data from block device page cache.

    Signed-off-by: Jan Kara

    Jan Kara
     

27 Sep, 2016

2 commits

  • Generated patch:

    sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
    sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • This is trivial to do:

    - add flags argument to foo_rename()
    - check that flags contains nothing other than RENAME_NOREPLACE
    - assign foo_rename() to .rename2 instead of .rename

    Filesystems converted:

    affs, bfs, exofs, ext2, hfs, hfsplus, jffs2, jfs, logfs, minix, msdos,
    nilfs2, omfs, reiserfs, sysvfs, ubifs, udf, ufs, vfat.

    Signed-off-by: Miklos Szeredi
    Acked-by: Boaz Harrosh
    Acked-by: Richard Weinberger
    Acked-by: Bob Copeland
    Acked-by: Jan Kara
    Cc: Theodore Ts'o
    Cc: Jaegeuk Kim
    Cc: OGAWA Hirofumi
    Cc: Mikulas Patocka
    Cc: David Woodhouse
    Cc: Dave Kleikamp
    Cc: Ryusuke Konishi
    Cc: Christoph Hellwig

    Miklos Szeredi
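
The flag check listed in the conversion steps can be sketched as follows. This is a standalone userspace example: the TOY_ constants mirror the values of RENAME_NOREPLACE and RENAME_EXCHANGE from the renameat2(2) interface, and foo_rename_check_flags() is a hypothetical helper standing in for the start of a filesystem's foo_rename().

```c
#include <errno.h>

/* Local copies of the flag bits so the sketch builds without kernel headers. */
#define TOY_RENAME_NOREPLACE (1u << 0)
#define TOY_RENAME_EXCHANGE  (1u << 1)

/*
 * Accept no flags or RENAME_NOREPLACE only; reject everything else,
 * since the filesystem does not implement the other rename modes.
 */
static int foo_rename_check_flags(unsigned int flags)
{
    if (flags & ~TOY_RENAME_NOREPLACE)
        return -EINVAL;
    return 0;
}
```

Masking out the one supported bit and testing the remainder means any flag added in the future is rejected by default rather than silently ignored.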
     

22 Sep, 2016

2 commits

  • inode_change_ok() will be responsible for clearing capabilities and IMA
    extended attributes and as such will need the dentry. Pass it as an
    argument to inode_change_ok() instead of an inode. Also rename
    inode_change_ok() to setattr_prepare() to better reflect that it also
    makes some modifications in addition to checks.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     
  • When file permissions are modified via chmod(2) and the user is not in
    the owning group or capable of CAP_FSETID, the setgid bit is cleared in
    inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file
    permissions as well as the new ACL, but doesn't clear the setgid bit in
    a similar way; this allows bypassing the check in chmod(2). Fix that.

    References: CVE-2016-7097
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Jeff Layton
    Signed-off-by: Jan Kara
    Signed-off-by: Andreas Gruenbacher

    Jan Kara
     

19 Sep, 2016

2 commits


09 Aug, 2016

1 commit


06 Aug, 2016

1 commit

  • Pull qstr constification updates from Al Viro:
    "Fairly self-contained bunch - a surprising lot of places pass struct
    qstr * as an argument when const struct qstr * would suffice; it
    complicates analysis for no good reason.

    I'd prefer to feed that separately from the assorted fixes (those are
    in #for-linus and with somewhat trickier topology)"

    * 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    qstr: constify instances in adfs
    qstr: constify instances in lustre
    qstr: constify instances in f2fs
    qstr: constify instances in ext2
    qstr: constify instances in vfat
    qstr: constify instances in procfs
    qstr: constify instances in fuse
    qstr constify instances in fs/dcache.c
    qstr: constify instances in nfs
    qstr: constify instances in ocfs2
    qstr: constify instances in autofs4
    qstr: constify instances in hfs
    qstr: constify instances in hfsplus
    qstr: constify instances in logfs
    qstr: constify dentry_init_security

    Linus Torvalds
     

31 Jul, 2016

1 commit


27 Jul, 2016

2 commits

  • Merge updates from Andrew Morton:

    - a few misc bits

    - ocfs2

    - most(?) of MM

    * emailed patches from Andrew Morton: (125 commits)
    thp: fix comments of __pmd_trans_huge_lock()
    cgroup: remove unnecessary 0 check from css_from_id()
    cgroup: fix idr leak for the first cgroup root
    mm: memcontrol: fix documentation for compound parameter
    mm: memcontrol: remove BUG_ON in uncharge_list
    mm: fix build warnings in
    mm, thp: convert from optimistic swapin collapsing to conservative
    mm, thp: fix comment inconsistency for swapin readahead functions
    thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
    shmem: split huge pages beyond i_size under memory pressure
    thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
    khugepaged: add support of collapse for tmpfs/shmem pages
    shmem: make shmem_inode_info::lock irq-safe
    khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
    thp: extract khugepaged from mm/huge_memory.c
    shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
    shmem: add huge pages support
    shmem: get_unmapped_area align huge page
    shmem: prepare huge= mount option and sysfs knob
    mm, rmap: account shmem thp pages
    ...

    Linus Torvalds
     
  • Remove the unused wrappers dax_fault() and dax_pmd_fault(). After this
    removal, rename __dax_fault() and __dax_pmd_fault() to dax_fault() and
    dax_pmd_fault() respectively, and update all callers.

    The dax_fault() and dax_pmd_fault() wrappers were initially intended to
    capture some filesystem independent functionality around page faults
    (calling sb_start_pagefault() & sb_end_pagefault(), updating file mtime
    and ctime).

    However, the following commits:

    5726b27b09cc ("ext2: Add locking for DAX faults")
    ea3d7209ca01 ("ext4: fix races between page faults and hole punching")

    added locking to the ext2 and ext4 filesystems after these common
    operations but before __dax_fault() and __dax_pmd_fault() were called.
    This means that these wrappers are no longer used, and are unlikely to
    be used in the future.

    XFS has had locking analogous to what was recently added to ext2 and
    ext4 since DAX support was initially introduced by:

    6b698edeeef0 ("xfs: add DAX file operations support")

    Link: http://lkml.kernel.org/r/20160714214049.20075-2-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Cc: "Theodore Ts'o"
    Cc: Alexander Viro
    Cc: Andreas Dilger
    Cc: Dan Williams
    Cc: Dave Chinner
    Reviewed-by: Jan Kara
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     

06 Jul, 2016

1 commit

  • This bug is reproducible with fsfuzzer. Although I couldn't reproduce it
    on every attempt, it is quite easy to trigger.

    During the deletion of an inode, ext2_xattr_delete_inode() does not check
    whether the block pointed to by EXT2_I(inode)->i_file_acl is a valid data
    block. This can lead to a deadlock when i_file_acl == 1 and the filesystem
    block size is 1024.

    In that situation, ext2_xattr_delete_inode() loads the superblock's buffer
    head (instead of a valid i_file_acl block) and then locks that buffer head,
    which ext2_sync_super() also tries to lock, deadlocking the filesystem with
    the following stack trace:

    root 17180 0.0 0.0 113660 660 pts/0 D+ 07:08 0:00 rmdir /media/test/dir1

    __sync_dirty_buffer+0xaf/0x100
    sync_dirty_buffer+0x13/0x20
    ext2_sync_super+0xb7/0xc0 [ext2]
    ext2_error+0x119/0x130 [ext2]
    ext2_free_blocks+0x83/0x350 [ext2]
    ext2_xattr_delete_inode+0x173/0x190 [ext2]
    ext2_evict_inode+0xc9/0x130 [ext2]
    evict+0xb3/0x180
    iput+0x1b8/0x240
    d_delete+0x11c/0x150
    vfs_rmdir+0xfe/0x120
    do_rmdir+0x17e/0x1f0
    SyS_rmdir+0x16/0x20
    entry_SYSCALL_64_fastpath+0x1a/0xa4
    0xffffffffffffffff

    Fix this by using the same approach ext4 uses to test data block validity,
    implementing ext2_data_block_valid().

    Another possibility, when the superblock is badly corrupted, is that
    i_file_acl is 1, block_count is 1 and first_data_block is 0. In such
    situations, i_file_acl might point to a 'valid' block but still step over
    the superblock. The approach I used was to also test that the superblock is
    not in the range described by the ext2_data_block_valid() arguments.

    Signed-off-by: Carlos Maiolino
    Signed-off-by: Theodore Ts'o

    Carlos Maiolino
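
The two checks described in this commit (the range must lie within the data area, and it must not cover the superblock) can be sketched in userspace as below. struct toy_sb and toy_data_block_valid() are hypothetical, and the boundary conventions are chosen for clarity rather than copied from the kernel's ext2_data_block_valid().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the superblock fields involved. */
struct toy_sb {
    uint32_t first_data_block; /* 1 when the block size is 1024 */
    uint32_t blocks_count;     /* total blocks in the filesystem */
    uint32_t sb_block;         /* block holding the superblock */
};

/* Is [start, start + count) a plausible range of data blocks? */
static bool toy_data_block_valid(const struct toy_sb *sb,
                                 uint32_t start, uint32_t count)
{
    uint64_t end = (uint64_t)start + count; /* exclusive; cannot wrap */

    if (count == 0)
        return false;
    /* Must lie strictly after the first data block and inside the fs. */
    if (start <= sb->first_data_block || end > sb->blocks_count)
        return false;
    /* Must not step over the superblock, even if the fields above lie. */
    if (sb->sb_block >= start && sb->sb_block < end)
        return false;
    return true;
}
```

In this sketch the i_file_acl == 1 case from the commit message is rejected both for a sane 1024-byte-block layout and for a corrupted superblock that claims first_data_block is 0.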
     

28 May, 2016

2 commits

  • Pull vfs fixes from Al Viro:
    "Followups to the parallel lookup work:

    - update docs

    - restore killability of the places that used to take ->i_mutex
    killably now that we have down_write_killable() merged

    - Additionally, it turns out that I missed a prerequisite for
    security_d_instantiate() stuff - ->getxattr() wasn't the only thing
    that could be called before dentry is attached to inode; with smack
    we needed the same treatment applied to ->setxattr() as well"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    switch ->setxattr() to passing dentry and inode separately
    switch xattr_handler->set() to passing dentry and inode separately
    restore killability of old mutex_lock_killable(&inode->i_mutex) users
    add down_write_killable_nested()
    update D/f/directory-locking

    Linus Torvalds
     
  • preparation for similar switch in ->setxattr() (see the next commit for
    rationale).

    Signed-off-by: Al Viro

    Al Viro
     

27 May, 2016

1 commit

  • Pull misc DAX updates from Vishal Verma:
    "DAX error handling for 4.7

    - Until now, dax has been disabled if media errors were found on any
    device. This enables the use of DAX in the presence of these
    errors by making all sector-aligned zeroing go through the driver.

    - The driver (already) has the ability to clear errors on writes that
    are sent through the block layer using 'DSMs' defined in ACPI 6.1.

    Other misc changes:

    - When mounting DAX filesystems, check to make sure the partition is
    page aligned. This is a requirement for DAX, and previously, we
    allowed such unaligned mounts to succeed, but subsequent
    reads/writes would fail.

    - Misc/cleanup fixes from Jan that remove unused code from DAX
    related to zeroing, writeback, and some size checks"

    * tag 'dax-misc-for-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    dax: fix a comment in dax_zero_page_range and dax_truncate_page
    dax: for truncate/hole-punch, do zeroing through the driver if possible
    dax: export a low-level __dax_zero_page_range helper
    dax: use sb_issue_zeroout instead of calling dax_clear_sectors
    dax: enable dax in the presence of known media errors (badblocks)
    dax: fallback from pmd to pte on error
    block: Update blkdev_dax_capable() for consistency
    xfs: Add alignment check for DAX mount
    ext2: Add alignment check for DAX mount
    ext4: Add alignment check for DAX mount
    block: Add bdev_dax_supported() for dax mount checks
    block: Add vfs_msg() interface
    dax: Remove redundant inode size checks
    dax: Remove pointless writeback from dax_do_io()
    dax: Remove zeroing from dax_io()
    dax: Remove dead zeroing code from fault handlers
    ext2: Avoid DAX zeroing to corrupt data
    ext2: Fix block zeroing in ext2_get_blocks() for DAX
    dax: Remove complete_unwritten argument
    DAX: move RADIX_DAX_ definitions to dax.c

    Linus Torvalds
     

19 May, 2016

1 commit

  • dax_clear_sectors() cannot handle poisoned blocks. These must be
    zeroed using the BIO interface instead. Convert ext2 and XFS to use
    only sb_issue_zeroout().

    Reviewed-by: Jeff Moyer
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Jan Kara
    Signed-off-by: Matthew Wilcox
    [vishal: Also remove the dax_clear_sectors function entirely]
    Signed-off-by: Vishal Verma

    Matthew Wilcox
     

18 May, 2016

1 commit

  • Pull vfs cleanups from Al Viro:
    "More cleanups from Christoph"

    * 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    nfsd: use RWF_SYNC
    fs: add RWF_DSYNC and RWF_SYNC
    ceph: use generic_write_sync
    fs: simplify the generic_write_sync prototype
    fs: add IOCB_SYNC and IOCB_DSYNC
    direct-io: remove the offset argument to dio_complete
    direct-io: eliminate the offset argument to ->direct_IO
    xfs: eliminate the pos variable in xfs_file_dio_aio_write
    filemap: remove the pos argument to generic_file_direct_write
    filemap: remove pos variables in generic_file_read_iter

    Linus Torvalds
     

17 May, 2016

4 commits

  • When a partition is not aligned to 4KB, mount -o dax succeeds,
    but any read/write access to the filesystem fails, except for
    metadata updates.

    Call bdev_dax_supported() to perform proper precondition checks,
    which include this partition alignment check.

    Signed-off-by: Toshi Kani
    Reviewed-by: Jan Kara
    Cc: Jan Kara
    Cc: Dan Williams
    Cc: Ross Zwisler
    Cc: Christoph Hellwig
    Cc: Boaz Harrosh
    Signed-off-by: Vishal Verma

    Toshi Kani
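
The alignment precondition mentioned here can be illustrated with a small userspace sketch. It assumes 512-byte sectors and takes the page size as a parameter; toy_partition_aligned() is a hypothetical helper and covers only the alignment part, not the other checks bdev_dax_supported() performs.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SECTOR_SIZE 512ULL /* assumption: 512-byte sectors */

/* Does the partition start on a page boundary? */
static bool toy_partition_aligned(uint64_t start_sector, uint64_t page_size)
{
    return (start_sector * TOY_SECTOR_SIZE) % page_size == 0;
}
```

For example, a partition starting at sector 2048 (1 MiB) passes with 4 KiB pages, while the legacy start at sector 63 does not, which is the kind of layout that previously mounted with -o dax and then failed on access.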
     
  • Currently ext2 zeroes any data blocks allocated for a DAX inode, but it
    still returns them as BH_New. Thus the DAX code zeroes them again in
    dax_insert_mapping(), which can overwrite data that has already been
    stored to those blocks by a racing dax_io(). Avoid marking
    pre-zeroed buffers as new.

    Reviewed-by: Ross Zwisler
    Signed-off-by: Jan Kara
    Signed-off-by: Vishal Verma

    Jan Kara
     
  • When zeroing allocated blocks for DAX, we accidentally zeroed only the
    first allocated block instead of all of them. So far this problem is
    hidden by the fact that page faults always need only a single block and
    DAX write code zeroes blocks again. But the zeroing in DAX code is racy
    and needs to be removed, so fix the zeroing in ext2 to zero all allocated
    blocks.

    Reported-by: Ross Zwisler
    Signed-off-by: Jan Kara
    Signed-off-by: Vishal Verma

    Jan Kara
     
  • Fault handlers currently take a complete_unwritten argument to convert
    unwritten extents after PTEs are updated. However, no filesystem uses
    this anymore as the code is racy. Remove the unused argument.

    Reviewed-by: Ross Zwisler
    Signed-off-by: Jan Kara
    Signed-off-by: Vishal Verma

    Jan Kara
     

03 May, 2016

3 commits


02 May, 2016

1 commit


11 Apr, 2016

2 commits


05 Apr, 2016

1 commit