07 Aug, 2017

1 commit

  • Pull ext4 fixes from Ted Ts'o:
    "A large number of ext4 bug fixes and cleanups for v4.13"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix copy paste error in ext4_swap_extents()
    ext4: fix overflow caused by missing cast in ext4_resize_fs()
    ext4, project: expand inode extra size if possible
    ext4: cleanup ext4_expand_extra_isize_ea()
    ext4: restructure ext4_expand_extra_isize
    ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
    ext4: make xattr inode reads faster
    ext4: inplace xattr block update fails to deduplicate blocks
    ext4: remove unused mode parameter
    ext4: fix warning about stack corruption
    ext4: fix dir_nlink behaviour
    ext4: silence array overflow warning
    ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
    ext4: release discard bio after sending discard commands
    ext4: convert swap_inode_data() over to use swap() on most of the fields
    ext4: error should be cleared if ea_inode isn't added to the cache
    ext4: Don't clear SGID when inheriting ACLs
    ext4: preserve i_mode if __ext4_set_acl() fails
    ext4: remove unused metadata accounting variables
    ext4: correct comment references to ext4_ext_direct_IO()

    Linus Torvalds
     

06 Aug, 2017

14 commits

  • This bug was found by a static code checker tool for copy paste
    problems.

    Signed-off-by: Maninder Singh
    Signed-off-by: Vaneet Narang
    Signed-off-by: Theodore Ts'o

    Maninder Singh
     
  • On a 32-bit platform, the value of n_blocks_count may be wrong when
    the file system is resized to more than 2^32 blocks. This may
    cause the superblock to be corrupted with a zero block count.

    Fixes: 1c6bd7173d66
    Signed-off-by: Jerry Lee
    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org # 3.7+

    Jerry Lee
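
The overflow described above can be reproduced in user space. The following is a minimal sketch, not the kernel code: the helper names are hypothetical, and it assumes a platform where unsigned int is 32 bits wide, so that the product of two uint32_t values wraps before it is widened.

```c
#include <stdint.h>

/* Hypothetical helper: the product is computed in 32 bits and wraps
 * modulo 2^32 before being widened to 64 bits (the missing-cast case). */
uint64_t blocks_count_truncated(uint32_t groups, uint32_t blocks_per_group)
{
    return groups * blocks_per_group;
}

/* Hypothetical helper: casting one operand first makes the whole
 * multiplication happen in 64 bits, so nothing is lost. */
uint64_t blocks_count_widened(uint32_t groups, uint32_t blocks_per_group)
{
    return (uint64_t)groups * blocks_per_group;
}
```

With 131072 groups of 32768 blocks each (exactly 2^32 blocks), the first helper yields a block count of zero, while the second yields 4294967296.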
     
  • When upgrading from the old format, the first attempt to set a project
    ID on an old file returns EOVERFLOW, but once that file has been
    dirtied (by touch, etc.), changing the project ID is allowed. This
    might be confusing for users, so try to expand @i_extra_isize here
    too.

    Reported-by: Zhang Yi
    Signed-off-by: Miao Xie
    Signed-off-by: Wang Shilong
    Signed-off-by: Theodore Ts'o

    Miao Xie
     
  • Clean up some goto statements to make ext4_expand_extra_isize_ea() clearer.

    Signed-off-by: Miao Xie
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Wang Shilong

    Miao Xie
     
  • The current ext4_expand_extra_isize only tries to expand the extra
    isize; if someone is holding the xattr lock or some check fails, it
    gives up. So rename it to ext4_try_to_expand_extra_isize.

    Besides that, we clean up an unnecessary check and move some related
    checks into it.

    Signed-off-by: Miao Xie
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Wang Shilong

    Miao Xie
     
  • We should avoid the contention between the i_extra_isize update and
    the inline data insertion, so move the xattr trylock in front of
    i_extra_isize update.

    Signed-off-by: Miao Xie
    Reviewed-by: Wang Shilong

    Miao Xie
     
  • ext4_xattr_inode_read() currently reads each block sequentially,
    waiting for each I/O operation to complete before moving on to the next
    block. This prevents request merging in the block layer.

    Add an ext4_bread_batch() function that starts reads for all blocks
    and then optionally waits for them to complete. Similar logic is used
    in ext4_find_entry(), so update that code to use the new function.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     
  • When an xattr block has a single reference, the block is updated in
    place and reinserted into the cache. Later, a cache lookup is performed
    to see whether an existing block has the same contents. This cache
    lookup will usually return the just-inserted entry, so
    deduplication is not achieved.

    Running the following test script will produce two xattr blocks which
    can be observed in "File ACL: " line of debugfs output:

    mke2fs -b 1024 -I 128 -F -O extent /dev/sdb 1G
    mount /dev/sdb /mnt/sdb

    touch /mnt/sdb/{x,y}

    setfattr -n user.1 -v aaa /mnt/sdb/x
    setfattr -n user.2 -v bbb /mnt/sdb/x

    setfattr -n user.1 -v aaa /mnt/sdb/y
    setfattr -n user.2 -v bbb /mnt/sdb/y

    debugfs -R 'stat x' /dev/sdb | cat
    debugfs -R 'stat y' /dev/sdb | cat

    This patch defers the reinsertion to the cache so that we can locate
    other blocks with the same contents.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger

    Tahsin Erdogan
     
  • ext4_alloc_file_blocks() does not use its mode parameter. Remove it.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     
  • After commit 62d1034f53e3 ("fortify: use WARN instead of BUG for now"),
    we get a warning about possible stack overflow from a memcpy that
    was not strictly bounded to the size of the local variable:

    inlined from 'ext4_mb_seq_groups_show' at fs/ext4/mballoc.c:2322:2:
    include/linux/string.h:309:9: error: '__builtin_memcpy': writing between 161 and 1116 bytes into a region of size 160 overflows the destination [-Werror=stringop-overflow=]

    We actually had a bug here that would have been found by the warning,
    but it was already fixed last year in commit 30a9d7afe70e ("ext4: fix
    stack memory corruption with 64k block size").

    This replaces the fixed-length structure on the stack with a variable-length
    structure, using the correct upper bound that tells the compiler that
    everything is really fine here. I also change the loop count to check
    for the same upper bound for consistency, but the existing code is
    already correct here.

    Note that while clang won't allow certain kinds of variable-length arrays
    in structures, this particular instance is fine, as the array is at the
    end of the structure, and the size is strictly bounded.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Theodore Ts'o

    Arnd Bergmann
     
  • The dir_nlink feature has been enabled by default for new ext4
    filesystems since e2fsprogs-1.41 in 2008, and was automatically
    enabled by the kernel for older ext4 filesystems since the
    dir_nlink feature was added with ext4 in kernel 2.6.28+ when
    the subdirectory count exceeded EXT4_LINK_MAX-1.

    Automatically adding file system features such as dir_nlink is
    generally frowned upon, since it could cause the file system to not be
    mountable on older kernels, thus preventing the administrator from
    rolling back to an older kernel if necessary.

    In this case, the administrator might also want to disable the feature
    because glibc's fts_read() function does not correctly optimize
    directory traversal for directories that use an st_nlink value of 1 to
    indicate that the number of links in the directory is not tracked by
    the file system, and could fail to traverse the full directory
    hierarchy. Fortunately, in the past ten years very few users have
    complained about incomplete file system traversal by glibc's
    fts_read().

    This commit also changes ext4_inc_count() to allow i_nlink to reach
    the full EXT4_LINK_MAX links on the parent directory (including "."
    and "..") before changing i_links_count to be 1.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196405
    Signed-off-by: Andreas Dilger
    Signed-off-by: Theodore Ts'o

    Andreas Dilger
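
The link-count rule in the last paragraph can be sketched as a small model. This is a simplified illustration, not the kernel's ext4_inc_count(): the function name is hypothetical, and EXT4_LINK_MAX is assumed to be 65000 as in the kernel's ext4.h.

```c
/* Assumed to match the kernel's definition in fs/ext4/ext4.h. */
#define EXT4_LINK_MAX 65000U

/* Hypothetical model: returns the directory's link count after one more
 * subdirectory is created. A count of 1 means "links are not tracked". */
unsigned int dir_nlink_after_mkdir(unsigned int nlink)
{
    if (nlink == 1)
        return 1;                 /* already untracked; stays at 1 */
    if (nlink + 1 > EXT4_LINK_MAX)
        return 1;                 /* would exceed the limit: stop tracking */
    return nlink + 1;             /* may reach EXT4_LINK_MAX itself */
}
```

In this model the count is allowed to reach the full EXT4_LINK_MAX, and only the next subdirectory switches the directory to the untracked value of 1.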
     
  • I get a static checker warning:

    fs/ext4/ext4.h:3091 ext4_set_de_type()
    error: buffer overflow 'ext4_type_by_mode' 15

    Dan Carpenter
     
  • ext4_find_unwritten_pgoff() does not properly handle a situation when
    starting index is in the middle of a page and blocksize < pagesize. The
    following command shows the bug on filesystem with 1k blocksize:

    xfs_io -f -c "falloc 0 4k" \
    -c "pwrite 1k 1k" \
    -c "pwrite 3k 1k" \
    -c "seek -a -r 0" foo

    In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
    SEEK_DATA) will return the correct result.

    Fix the problem by neglecting buffers in a page before starting offset.

    Reported-by: Andreas Gruenbacher
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Jan Kara
    CC: stable@vger.kernel.org # 3.8+

    Jan Kara
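
The idea of ignoring buffers that lie before the starting offset can be illustrated with a toy model of one page split into blocks. This is a sketch under stated assumptions, not ext4_find_unwritten_pgoff(): the types and the function name are hypothetical, and each block is simply marked as data or hole.

```c
#include <stddef.h>

enum blk_state { BLK_HOLE = 0, BLK_DATA = 1 };

/* Hypothetical model: returns the first offset >= start that lies in a
 * block of state `want`. If no such block exists, a hole search returns
 * the end of the file and a data search returns -1 (a real lseek() with
 * SEEK_DATA fails with ENXIO in that case). */
long next_offset(const enum blk_state *blocks, size_t nblocks,
                 long bsize, long start, enum blk_state want)
{
    for (size_t i = 0; i < nblocks; i++) {
        long blk_start = (long)i * bsize;
        long blk_end = blk_start + bsize;

        if (blk_end <= start)
            continue;   /* block ends before the start offset: ignore it */
        if (blocks[i] == want)
            return blk_start > start ? blk_start : start;
    }
    return want == BLK_HOLE ? (long)nblocks * bsize : -1;
}
```

For the xfs_io example above, with 1k blocks laid out as hole, data, hole, data, the model returns 2048 for a hole search starting at 1024 and 3072 for a data search starting at 2048.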
     
  • We've changed the discard command handling to run in parallel.
    But, in that change, I forgot to decrease the usage count of the bio
    which was used to send the discard request. I'm sorry about that.

    Fixes: a015434480dc ("ext4: send parallel discards on commit completions")
    Signed-off-by: Daeho Jeong
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Daeho Jeong
     

04 Aug, 2017

1 commit

  • Merge misc fixes from Andrew Morton:
    "15 fixes"

    [ This does not merge the "fortify: use WARN instead of BUG for now"
    patch, which needs a bit of extra work to build cleanly with all
    configurations. Arnd is on it. - Linus ]

    * emailed patches from Andrew Morton:
    ocfs2: don't clear SGID when inheriting ACLs
    mm: allow page_cache_get_speculative in interrupt context
    userfaultfd: non-cooperative: flush event_wqh at release time
    ipc: add missing container_of()s for randstruct
    cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
    userfaultfd_zeropage: return -ENOSPC in case mm has gone
    mm: take memory hotplug lock within numa_zonelist_order_handler()
    mm/page_io.c: fix oops during block io poll in swapin path
    zram: do not free pool->size_class
    kthread: fix documentation build warning
    kasan: avoid -Wmaybe-uninitialized warning
    userfaultfd: non-cooperative: notify about unmap of destination during mremap
    mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
    pid: kill pidhash_size in pidhash_init()
    mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors

    Linus Torvalds
     

03 Aug, 2017

4 commits

  • Pull NFS client fixes from Anna Schumaker:
    "Two fixes from Trond this time, now that he's back from his vacation.
    The first is a stable fix for the EXCHANGE_ID issue on the mailing
    list, and the other fixes a double-free situation that he found at the
    same time.

    Stable fix:
    - Fix EXCHANGE_ID corrupt verifier issue

    Other fix:
    - Fix double frees in nfs4_test_session_trunk()"

    * tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
    NFSv4: Fix double frees in nfs4_test_session_trunk()
    NFSv4: Fix EXCHANGE_ID corrupt verifier issue

    Linus Torvalds
     
  • When a new directory 'DIR1' is created in a directory 'DIR0' with the
    SGID bit set, DIR1 is expected to have the SGID bit set (and owning group
    equal to the owning group of 'DIR0'). However, when 'DIR0' also has some
    default ACLs that 'DIR1' inherits, setting these ACLs will result in the
    SGID bit on 'DIR1' being cleared if the user is not a member of the
    owning group.

    Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl()
    into ocfs2_iop_set_acl(). That way the function will not be called when
    inheriting ACLs which is what we want as it prevents SGID bit clearing
    and the mode has been properly set by posix_acl_create() anyway. Also
    posix_acl_chmod() that is calling ocfs2_set_acl() takes care of updating
    mode itself.

    Fixes: 073931017b4 ("posix_acl: Clear SGID bit when setting file permissions")
    Link: http://lkml.kernel.org/r/20170801141252.19675-3-jack@suse.cz
    Signed-off-by: Jan Kara
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • There may still be threads waiting on event_wqh at the time the
    userfault file descriptor is closed. Flush the events wait-queue to
    prevent waiting threads from hanging.

    Link: http://lkml.kernel.org/r/1501398127-30419-1-git-send-email-rppt@linux.vnet.ibm.com
    Fixes: 9cd75c3cd4c3d ("userfaultfd: non-cooperative: add ability to report
    non-PF events from uffd descriptor")
    Signed-off-by: Mike Rapoport
    Cc: Andrea Arcangeli
    Cc: "Dr. David Alan Gilbert"
    Cc: Pavel Emelyanov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • In the non-cooperative userfaultfd case, the process exit may race with
    outstanding mcopy_atomic called by the uffd monitor. Returning -ENOSPC
    instead of -EINVAL when mm is already gone will allow uffd monitor to
    distinguish this case from other error conditions.

    Unfortunately I overlooked userfaultfd_zeropage when updating
    userfaultfd_copy().

    Link: http://lkml.kernel.org/r/1501136819-21857-1-git-send-email-rppt@linux.vnet.ibm.com
    Fixes: 96333187ab162 ("userfaultfd_copy: return -ENOSPC in case mm has gone")
    Signed-off-by: Mike Rapoport
    Cc: Andrea Arcangeli
    Cc: "Dr. David Alan Gilbert"
    Cc: Pavel Emelyanov
    Cc: Michal Hocko
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

02 Aug, 2017

2 commits

  • rpc_clnt_add_xprt() expects the callback function to be synchronous, and
    expects to release the transport and switch references itself.

    Fixes: 04fa2c6bb51b1 ("NFS pnfs data server multipath session trunking")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     
  • The verifier is allocated on the stack, but the EXCHANGE_ID RPC call was
    changed to be asynchronous by commit 8d89bd70bc939. If we interrupt
    the call to rpc_wait_for_completion_task(), we can therefore end up
    transmitting random stack contents in lieu of the verifier.

    Fixes: 8d89bd70bc939 ("NFS setup async exchange_id")
    Cc: stable@vger.kernel.org # v4.9+
    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     

31 Jul, 2017

6 commits

  • For some odd reason, it forces a byte-by-byte copy of each field. A
    plain old swap() on most of these fields would be more efficient. We
    do need to retain the memswap of i_data however as that field is an array.

    Signed-off-by: Theodore Ts'o
    Signed-off-by: Jeff Layton
    Reviewed-by: Jan Kara

    Jeff Layton
     
  • For Lustre, if ea_inode fails hash validation but passes the parent
    inode and generation checks, it won't be added to the cache. In that
    case the error "-EFSCORRUPTED" should be cleared as well; otherwise it
    will cause "Structure needs cleaning" when running the getfattr command.

    Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9723

    Cc: stable@vger.kernel.org
    Fixes: dec214d00e0d78a08b947d7dccdfdb84407a9f4d
    Signed-off-by: Emoly Liu
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger
    Reviewed-by: tahsin@google.com

    Emoly Liu
     
  • When a new directory 'DIR1' is created in a directory 'DIR0' with the
    SGID bit set, DIR1 is expected to have the SGID bit set (and owning group
    equal to the owning group of 'DIR0'). However, when 'DIR0' also has some
    default ACLs that 'DIR1' inherits, setting these ACLs will result in the
    SGID bit on 'DIR1' being cleared if the user is not a member of the
    owning group.

    Fix the problem by moving posix_acl_update_mode() out of
    __ext4_set_acl() into ext4_set_acl(). That way the function will not be
    called when inheriting ACLs which is what we want as it prevents SGID
    bit clearing and the mode has been properly set by posix_acl_create()
    anyway.

    Fixes: 073931017b49d9458aa351605b43a7e34598caef
    CC: stable@vger.kernel.org
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Jan Kara
    Reviewed-by: Andreas Gruenbacher

    Jan Kara
     
  • When changing a file's acl mask, __ext4_set_acl() will first set the group
    bits of i_mode to the value of the mask, and only then set the actual
    extended attribute representing the new acl.

    If the second part fails (due to lack of space, for example) and the file
    had no acl attribute to begin with, the system will from now on assume
    that the mask permission bits are actual group permission bits, potentially
    granting access to the wrong users.

    Prevent this by only changing the inode mode after the acl has been set.

    Signed-off-by: Ernesto A. Fernández
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Ernesto A. Fernández
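
The ordering fix described above (store the ACL first, change the mode only on success) can be sketched with a stand-in inode. Everything here is hypothetical illustration: struct fake_inode, fake_store_acl() and set_acl_then_mode() are not kernel interfaces, and the "xattr space" counter only simulates the out-of-space failure.

```c
#include <errno.h>

/* Hypothetical stand-in for an inode; not the kernel structure. */
struct fake_inode {
    unsigned int mode;   /* permission bits */
    int xattr_space;     /* bytes still free for extended attributes */
    int has_acl;         /* nonzero once an ACL attribute is stored */
};

/* Simulates storing the ACL attribute; fails when there is no room. */
static int fake_store_acl(struct fake_inode *inode, int acl_size)
{
    if (acl_size > inode->xattr_space)
        return -ENOSPC;
    inode->xattr_space -= acl_size;
    inode->has_acl = 1;
    return 0;
}

/* Store the ACL first; touch the mode only after that succeeded. */
int set_acl_then_mode(struct fake_inode *inode, unsigned int new_mode,
                      int acl_size)
{
    int err = fake_store_acl(inode, acl_size);

    if (err)
        return err;      /* mode is left exactly as it was */
    inode->mode = new_mode;
    return 0;
}
```

If the store step fails, the caller sees the error and the permission bits are unchanged, so the mask bits are never mistaken for real group permissions.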
     
  • Two variables in ext4_inode_info, i_reserved_meta_blocks and
    i_allocated_meta_blocks, are unused. Removing them saves a little
    memory per in-memory inode and cleans up clutter in several tracepoints.
    Adjust tracepoint output from ext4_alloc_da_blocks() for consistency
    and fix a typo and whitespace near these changes.

    Signed-off-by: Eric Whitney
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Eric Whitney
     
  • Commit 914f82a32d0268847 "ext4: refactor direct IO code" deleted
    ext4_ext_direct_IO(), but references to that function remain in
    comments. Update them to refer to ext4_direct_IO_write().

    Signed-off-by: Eric Whitney
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger
    Reviewed-by: Jan Kara

    Eric Whitney
     

29 Jul, 2017

4 commits

  • Pull NFS client fixes from Anna Schumaker:
    "More NFS client bugfixes for 4.13.

    Most of these fix locking bugs that Ben and Neil noticed, but I also
    have a patch to fix one more access bug that was reported after last
    week.

    Stable fixes:
    - Fix a race where CB_NOTIFY_LOCK fails to wake a waiter
    - Invalidate file size when taking a lock to prevent corruption

    Other fixes:
    - Don't excessively generate tiny writes with fallocate
    - Use the raw NFS access mask in nfs4_opendata_access()"

    * tag 'nfs-for-4.13-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
    NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter
    NFS: Optimize fallocate by refreshing mapping when needed.
    NFS: invalidate file size when taking a lock.
    NFS: Use raw NFS access mask in nfs4_opendata_access()

    Linus Torvalds
     
  • Pull xfs fixes from Darrick Wong:

    - fix firstfsb variables that we left uninitialized, which could lead
    to locking problems.

    - check for NULL metadata buffer pointers before using them.

    - don't allow btree cursor manipulation if the btree block is corrupt.
    Better to just shut down.

    - fix infinite loop problems in quotacheck.

    - fix buffer overrun when validating directory blocks.

    - fix deadlock problem in bunmapi.

    * tag 'xfs-4.13-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    xfs: fix multi-AG deadlock in xfs_bunmapi
    xfs: check that dir block entries don't off the end of the buffer
    xfs: fix quotacheck dquot id overflow infinite loop
    xfs: check _alloc_read_agf buffer pointer before using
    xfs: set firstfsb to NULLFSBLOCK before feeding it to _bmapi_write
    xfs: check _btree_check_block value

    Linus Torvalds
     
  • nfs4_retry_setlk() sets the task's state to TASK_INTERRUPTIBLE within the
    same region protected by the wait_queue's lock after checking for a
    notification from CB_NOTIFY_LOCK callback. However, after releasing that
    lock, a wakeup for that task may race in before the call to
    freezable_schedule_timeout_interruptible() and set TASK_WAKING, then
    freezable_schedule_timeout_interruptible() will set the state back to
    TASK_INTERRUPTIBLE before the task will sleep. The result is that the task
    will sleep for the entire duration of the timeout.

    Since we've already set TASK_INTERRUPTIBLE in the locked section, just use
    freezable_schedule_timeout() instead.

    Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks")
    Signed-off-by: Benjamin Coddington
    Reviewed-by: Jeff Layton
    Cc: stable@vger.kernel.org # v4.9+
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
     
  • Pull btrfs fixes from David Sterba:
    "Fixes addressing problems reported by users, and there's one more
    regression fix"

    * 'for-4.13-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
    btrfs: round down size diff when shrinking/growing device
    Btrfs: fix early ENOSPC due to delalloc
    btrfs: fix lockup in find_free_extent with read-only block groups
    Btrfs: fix dir item validation when replaying xattr deletes

    Linus Torvalds
     

27 Jul, 2017

3 commits

  • posix_fallocate() will allocate space in an NFS file by considering
    the last byte of every 4K block. If it is before EOF, it will read
    the byte and if it is zero, a zero is written out. If it is after EOF,
    the zero is unconditionally written.

    For the blocks beyond EOF, if NFS believes its cache is valid, it will
    expand these writes to write full pages, and then will merge the pages.
    This results in (typically) 1MB writes. If NFS believes its cache is
    not valid (particularly if NFS_INO_INVALID_DATA or
    NFS_INO_REVAL_PAGECACHE are set - see nfs_write_pageuptodate()), it will
    send the individual 1-byte writes. This results in (typically) 256 times
    as many RPC requests, and can be substantially slower.

    Currently nfs_revalidate_mapping() is only used when reading a file or
    mmapping a file, as these are times when the content needs to be
    up-to-date. Writes don't generally need the cache to be up-to-date, but
    writes beyond EOF can benefit, particularly in the posix_fallocate()
    case.

    So this patch calls nfs_revalidate_mapping() when writing beyond EOF -
    i.e. when there is a gap between the end of the file and the start of
    the write. If the cache is thought to be out of date (as happens after
    taking a file lock), this will cause a GETATTR, and the two flags
    mentioned above will be cleared. With this, posix_fallocate() on a
    newly locked file does not generate excessive tiny writes.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
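
The factor of 256 quoted above follows from the two write sizes. A minimal arithmetic sketch, assuming the typical figures from the message (one write per 4K block versus merged 1MB writes); the helper name is hypothetical.

```c
/* Hypothetical helper: number of writes needed to cover region_bytes
 * when each write covers write_bytes, rounded up. */
unsigned long writes_needed(unsigned long region_bytes,
                            unsigned long write_bytes)
{
    return (region_bytes + write_bytes - 1) / write_bytes;
}
```

Covering 1MB (1048576 bytes) takes 256 writes at one write per 4096-byte block, but a single write when the pages are merged.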
     
  • Prior to commit ca0daa277aca ("NFS: Cache aggressively when file is open
    for writing"), NFS would revalidate, or invalidate, the file size when
    taking a lock. Since that commit it only invalidates the file content.

    If the file size is changed on the server while waiting for the lock, the
    client will have an incorrect understanding of the file size and could
    corrupt data. This particularly happens when writing beyond the
    (supposed) end of file and can easily be demonstrated with
    posix_fallocate().

    If an application opens an empty file, waits for a write lock, and then
    calls posix_fallocate(), glibc will determine that the underlying
    filesystem doesn't support fallocate (assuming version 4.1 or earlier)
    and will write out a '0' byte at the end of each 4K page in the region
    being fallocated that is after the end of the file.
    NFS will (usually) detect that these writes are beyond EOF and will
    expand them to cover the whole page, and then will merge the pages.
    Consequently, NFS will write out large blocks of zeroes beyond where it
    thought EOF was. If EOF had moved, the pre-existing part of the file
    will be over-written. Locking should have protected against this,
    but it doesn't.

    This patch restores the use of nfs_zap_caches() which invalidated the
    cached attributes. When posix_fallocate() asks for the file size, the
    request will go to the server and get a correct answer.

    cc: stable@vger.kernel.org (v4.8+)
    Fixes: ca0daa277aca ("NFS: Cache aggressively when file is open for writing")
    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • Commit bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's
    access cache") changed how the access results are stored after an
    access() call. An NFS v4 OPEN might have access bits returned with the
    opendata, so we should use the NFS4_ACCESS values when determining the
    return value in nfs4_opendata_access().

    Fixes: bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's
    access cache")
    Reported-by: Eryu Guan
    Signed-off-by: Anna Schumaker
    Tested-by: Takashi Iwai

    Anna Schumaker
     

26 Jul, 2017

1 commit

  • Just like in the allocator we must avoid touching multiple AGs out of
    order when freeing blocks, as freeing still locks the AGF and can cause
    the same AB-BA deadlocks as in the allocation path.

    Signed-off-by: Christoph Hellwig
    Reported-by: Nikolay Borisov
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Christoph Hellwig
     

25 Jul, 2017

2 commits


24 Jul, 2017

2 commits

  • If a dquot has an id of U32_MAX, the next lookup index increment
    overflows the uint32_t back to 0. This starts the lookup sequence
    over from the beginning, repeats indefinitely and results in a
    livelock.

    Update xfs_qm_dquot_walk() to explicitly check for the lookup
    overflow and exit the loop.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
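
The wraparound can be modelled in user space with a sorted array of ids standing in for the dquot lookup structure. This is a sketch under stated assumptions, not xfs_qm_dquot_walk(): the function names are hypothetical, and max_visits exists only so that the unguarded variant terminates in a demonstration.

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the first id >= start in a sorted array, or n if none. */
static size_t lookup_from(const uint32_t *ids, size_t n, uint32_t start)
{
    size_t i = 0;

    while (i < n && ids[i] < start)
        i++;
    return i;
}

/* Walks the ids in order and returns how many visits were made.
 * With check_overflow set, the walk stops once the next lookup index
 * wraps back to 0; without it, the walk restarts from the beginning. */
size_t walk_ids(const uint32_t *ids, size_t n, int check_overflow,
                size_t max_visits)
{
    uint32_t next_index = 0;
    size_t visits = 0;

    while (visits < max_visits) {
        size_t pos = lookup_from(ids, n, next_index);

        if (pos == n)
            break;                      /* nothing left to visit */
        visits++;
        next_index = ids[pos] + 1;      /* wraps to 0 for UINT32_MAX */
        if (check_overflow && next_index == 0)
            break;                      /* explicit overflow check */
    }
    return visits;
}
```

With ids 0, 5 and UINT32_MAX, the guarded walk visits each id once, while the unguarded walk keeps restarting until it hits the artificial cap.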
     
  • Further testing showed that the fix introduced in 7dfb8be11b5d ("btrfs:
    Round down values which are written for total_bytes_size") was
    insufficient and it could still lead to discrepancies between the
    total_bytes in the super block and the device total bytes. So this patch
    also ensures that the difference between old/new sizes when
    shrinking/growing is also rounded down. This ensures that we won't be
    subtracting/adding non-sectorsize multiples to the superblock/device
    total sizes.

    Fixes: 7dfb8be11b5d ("btrfs: Round down values which are written for total_bytes_size")
    Signed-off-by: Nikolay Borisov
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba

    Nikolay Borisov
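
Rounding the size difference down before applying it can be sketched as follows. This is an illustration for the grow case only, not the btrfs code: the helper names are hypothetical, and the caller is assumed to pass new_size >= old_size and a nonzero sectorsize.

```c
#include <stdint.h>

/* Rounds value down to a multiple of `multiple` (must be nonzero). */
static uint64_t round_down_to(uint64_t value, uint64_t multiple)
{
    return value - (value % multiple);
}

/* Hypothetical helper: applies a device grow to a running total.
 * Only a sectorsize multiple of the difference is added, so a total
 * that was aligned before the call stays aligned afterwards. */
uint64_t grow_total_bytes(uint64_t total_bytes, uint64_t old_size,
                          uint64_t new_size, uint64_t sectorsize)
{
    uint64_t diff = round_down_to(new_size - old_size, sectorsize);

    return total_bytes + diff;
}
```

For example, growing a device from 4096 to 9096 bytes with a 4096-byte sector size adds 4096 rather than 5000 to the total, so the total remains a multiple of the sector size.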