29 Jun, 2023

1 commit

  • Pull mm updates from Andrew Morton:

    - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

    - Yosry has also eliminated cgroup's atomic rstat flushing

    - Nhat Pham adds the new cachestat() syscall. It provides userspace
    with the ability to query pagecache status - a similar concept to
    mincore() but more powerful and with improved usability

    - Mel Gorman provides more optimizations for compaction, reducing the
    prevalence of page rescanning

    - Lorenzo Stoakes has done some maintenance work on the
    get_user_pages() interface

    - Liam Howlett continues with cleanups and maintenance work to the
    maple tree code. Peng Zhang also does some work on maple tree

    - Johannes Weiner has done some cleanup work on the compaction code

    - David Hildenbrand has contributed additional selftests for
    get_user_pages()

    - Thomas Gleixner has contributed some maintenance and optimization
    work for the vmalloc code

    - Baolin Wang has provided some compaction cleanups

    - SeongJae Park continues maintenance work on the DAMON code

    - Huang Ying has done some maintenance on the swap code's usage of
    device refcounting

    - Christoph Hellwig has some cleanups for the filemap/directio code

    - Ryan Roberts provides two patch series which yield some
    rationalization of the kernel's access to pte entries - use the
    provided APIs rather than open-coding accesses

    - Lorenzo Stoakes has some fixes to the interaction between pagecache
    and directio access to file mappings

    - John Hubbard has a series of fixes to the MM selftesting code

    - ZhangPeng continues the folio conversion campaign

    - Hugh Dickins has been working on the pagetable handling code, mainly
    with a view to reducing the load on the mmap_lock

    - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
    from 128 to 8

    - Domenico Cerasuolo has improved the zswap reclaim mechanism by
    reorganizing the LRU management

    - Matthew Wilcox provides some fixups to make gfs2 work better with the
    buffer_head code

    - Vishal Moola also has done some folio conversion work

    - Matthew Wilcox has removed the remnants of the pagevec code - their
    functionality is migrated over to struct folio_batch

    * tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
    mm/hugetlb: remove hugetlb_set_page_subpool()
    mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
    hugetlb: revert use of page_cache_next_miss()
    Revert "page cache: fix page_cache_next/prev_miss off by one"
    mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
    mm: memcg: rename and document global_reclaim()
    mm: kill [add|del]_page_to_lru_list()
    mm: compaction: convert to use a folio in isolate_migratepages_block()
    mm: zswap: fix double invalidate with exclusive loads
    mm: remove unnecessary pagevec includes
    mm: remove references to pagevec
    mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
    mm: remove struct pagevec
    net: convert sunrpc from pagevec to folio_batch
    i915: convert i915_gpu_error to use a folio_batch
    pagevec: rename fbatch_count()
    mm: remove check_move_unevictable_pages()
    drm: convert drm_gem_put_pages() to use a folio_batch
    i915: convert shmem_sg_free_table() to use a folio_batch
    scatterlist: add sg_set_folio()
    ...

    Linus Torvalds
     

14 Jun, 2023

1 commit

  • Fix dio_bio_cleanup() to advance the head index into the list of pages past
    the pages it has released, as __blockdev_direct_IO() will call it twice if
    do_direct_IO() fails.
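
    The failure mode can be modelled in userspace. The sketch below is a toy
    under stated assumptions, not the kernel code: `struct page_queue`,
    `release_page()`, `queue_cleanup()` and `queue_balanced()` are
    hypothetical names standing in for the dio page list and its cleanup
    routine. It illustrates why the head index has to move past the released
    pages when cleanup may run twice.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_PAGES 8

/* Hypothetical stand-in for the list of extracted pages kept by the dio code. */
struct page_queue {
	int pins[QUEUE_PAGES];	/* pin count per page; 1 means a pin is held */
	size_t head;		/* first page not yet released */
	size_t tail;		/* one past the last valid page */
};

/* Drop one pin; a negative count would model the double-release warning. */
static void release_page(struct page_queue *q, size_t idx)
{
	assert(idx < QUEUE_PAGES);
	q->pins[idx]--;
}

/*
 * Release every page in [head, tail) and advance head past each one,
 * so that a second call finds nothing left to release.
 */
static void queue_cleanup(struct page_queue *q)
{
	while (q->head < q->tail)
		release_page(q, q->head++);
}

/* Returns 1 if no page was released more often than it was pinned. */
static int queue_balanced(const struct page_queue *q)
{
	for (size_t i = 0; i < q->tail; i++)
		if (q->pins[i] != 0)
			return 0;
	return 1;
}
```

    Calling `queue_cleanup()` twice on the same queue leaves every pin count
    at zero. If the loop released pages without advancing `head`, the second
    call would drive the counts negative.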

    The issue was causing:

    WARNING: CPU: 6 PID: 2220 at mm/gup.c:76 try_get_folio

    This can be triggered by setting up a clean pair of UDF filesystems on
    loopback devices and running the generic/451 xfstest with them as the
    scratch and test partitions. Something like the following:

    fallocate /mnt2/udf_scratch -l 1G
    fallocate /mnt2/udf_test -l 1G
    mknod /dev/lo0 b 7 0
    mknod /dev/lo1 b 7 1
    losetup lo0 /mnt2/udf_scratch
    losetup lo1 /mnt2/udf_test
    mkfs -t udf /dev/lo0
    mkfs -t udf /dev/lo1
    cd xfstests
    ./check generic/451

    with xfstests configured by putting the following into local.config:

    export FSTYP=udf
    export DISABLE_UDF_TEST=1
    export TEST_DEV=/dev/lo1
    export TEST_DIR=/xfstest.test
    export SCRATCH_DEV=/dev/lo0
    export SCRATCH_MNT=/xfstest.scratch

    Fixes: 1ccf164ec866 ("block: Use iov_iter_extract_pages() and page pinning in direct-io.c")
    Reported-by: kernel test robot
    Closes: https://lore.kernel.org/oe-lkp/202306120931.a9606b88-oliver.sang@intel.com
    Signed-off-by: David Howells
    cc: Christoph Hellwig
    cc: David Hildenbrand
    cc: Andrew Morton
    cc: Jens Axboe
    cc: Al Viro
    cc: Matthew Wilcox
    cc: Jan Kara
    cc: Jeff Layton
    cc: Jason Gunthorpe
    cc: Logan Gunthorpe
    cc: Hillf Danton
    cc: Christian Brauner
    cc: Linus Torvalds
    cc: linux-fsdevel@vger.kernel.org
    cc: linux-block@vger.kernel.org
    cc: linux-kernel@vger.kernel.org
    cc: linux-mm@kvack.org
    Reviewed-by: David Hildenbrand
    Reviewed-by: Christoph Hellwig
    Link: https://lore.kernel.org/r/1193485.1686693279@warthog.procyon.org.uk
    Signed-off-by: Jens Axboe

    David Howells
     

10 Jun, 2023

1 commit

  • Add a helper to invalidate page cache after a dio write.

    Link: https://lkml.kernel.org/r/20230601145904.1385409-7-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Damien Le Moal
    Reviewed-by: Hannes Reinecke
    Acked-by: Darrick J. Wong
    Cc: Al Viro
    Cc: Andreas Gruenbacher
    Cc: Anna Schumaker
    Cc: Chao Yu
    Cc: Christian Brauner
    Cc: Ilya Dryomov
    Cc: Jaegeuk Kim
    Cc: Jens Axboe
    Cc: Johannes Thumshirn
    Cc: Matthew Wilcox
    Cc: Miklos Szeredi
    Cc: Theodore Ts'o
    Cc: Trond Myklebust
    Cc: Xiubo Li
    Signed-off-by: Andrew Morton

    Christoph Hellwig
     

31 May, 2023

1 commit

  • Change the old block-based direct-I/O code to use iov_iter_extract_pages()
    to pin user pages or leave kernel pages unpinned rather than taking refs
    when submitting bios.

    This makes use of the preceding patches to not take pins on the zero page
    (thereby allowing insertion of zero pages in with pinned pages) and to get
    additional pins on pages, allowing an extracted page to be used in multiple
    bios without having to re-extract it.

    Signed-off-by: David Howells
    cc: Christoph Hellwig
    cc: David Hildenbrand
    cc: Lorenzo Stoakes
    cc: Andrew Morton
    cc: Jens Axboe
    cc: Al Viro
    cc: Matthew Wilcox
    cc: Jan Kara
    cc: Jeff Layton
    cc: Jason Gunthorpe
    cc: Logan Gunthorpe
    cc: Hillf Danton
    cc: Christian Brauner
    cc: Linus Torvalds
    cc: linux-fsdevel@vger.kernel.org
    cc: linux-block@vger.kernel.org
    cc: linux-kernel@vger.kernel.org
    cc: linux-mm@kvack.org
    Reviewed-by: Christoph Hellwig
    Link: https://lore.kernel.org/r/20230526214142.958751-4-dhowells@redhat.com
    Signed-off-by: Jens Axboe

    David Howells
     

24 May, 2023

1 commit

  • Replace BIO_NO_PAGE_REF with a BIO_PAGE_REFFED flag that has the inverted
    meaning and is only set when a page reference has been acquired that needs
    to be released by bio_release_pages().
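
    A minimal userspace sketch of the inverted-flag idea follows. Everything
    in it is an assumption made for illustration: `struct toy_bio`, the
    `TOY_BIO_PAGE_REFFED` bit value and both helpers are hypothetical, and
    clearing the flag on release is a choice of this toy rather than
    something the commit states.

```c
#include <assert.h>

/* Hypothetical flag bit; the kernel's actual value is not taken from the source. */
#define TOY_BIO_PAGE_REFFED (1u << 0)

struct toy_bio {
	unsigned int flags;
	int page_refs;	/* references held on the bio's pages */
};

/* Take a reference and record that it must be dropped later. */
static void toy_bio_get_pages(struct toy_bio *bio)
{
	bio->page_refs++;
	bio->flags |= TOY_BIO_PAGE_REFFED;
}

/* Loosely models the release path: drop references only if the flag is set. */
static void toy_bio_release_pages(struct toy_bio *bio)
{
	if (!(bio->flags & TOY_BIO_PAGE_REFFED))
		return;		/* no reference was taken, nothing to drop */
	assert(bio->page_refs > 0);
	bio->page_refs--;
	bio->flags &= ~TOY_BIO_PAGE_REFFED;
}
```

    With the positive flag, a zero-initialised bio defaults to "nothing to
    release", so only the paths that actually take a reference have to set
    anything.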

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David Howells
    Reviewed-by: John Hubbard
    cc: Al Viro
    cc: Jens Axboe
    cc: Jan Kara
    cc: Matthew Wilcox
    cc: Logan Gunthorpe
    cc: linux-block@vger.kernel.org
    Reviewed-by: Jan Kara
    Link: https://lore.kernel.org/r/20230522205744.2825689-4-dhowells@redhat.com
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

06 Mar, 2023

1 commit


27 Jan, 2023

1 commit

  • sb_init_dio_done_wq is also used by the iomap code, so move it to
    super.c in preparation for building direct-io.c conditionally.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Eric Biggers
    Reviewed-by: Jan Kara
    Link: https://lore.kernel.org/r/20230125065839.191256-2-hch@lst.de
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

20 Sep, 2022

1 commit


09 Aug, 2022

2 commits

  • Most of the users immediately follow successful iov_iter_get_pages()
    with advancing by the amount it had returned.

    Provide inline wrappers doing that, convert trivial open-coded
    uses of those.

    BTW, iov_iter_get_pages() never returns more than it had been asked
    to; such checks in cifs ought to be removed someday...
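
    The "get, then advance by what was returned" pattern can be sketched in
    userspace as below. This is a toy, not the kernel API: `struct toy_iter`
    and the `toy_*` functions are hypothetical, and the error returns of the
    real call are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy iterator over a flat byte range; not the kernel's struct iov_iter. */
struct toy_iter {
	size_t pos;	/* current offset */
	size_t count;	/* bytes remaining */
};

/* Toy "get pages": reports how many bytes it covered, never more than maxsize. */
static size_t toy_get_pages(const struct toy_iter *it, size_t maxsize)
{
	return it->count < maxsize ? it->count : maxsize;
}

static void toy_advance(struct toy_iter *it, size_t bytes)
{
	assert(bytes <= it->count);
	it->pos += bytes;
	it->count -= bytes;
}

/* The wrapper pattern: fetch, then advance by exactly what was returned. */
static size_t toy_get_pages_and_advance(struct toy_iter *it, size_t maxsize)
{
	size_t got = toy_get_pages(it, maxsize);

	if (got > 0)
		toy_advance(it, got);
	return got;
}
```

    Because the toy getter never returns more than `maxsize`, callers of the
    wrapper need no extra bounds check on the result.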

    Reviewed-by: Jeff Layton
    Signed-off-by: Al Viro

    Al Viro
     
  • Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(),
    checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
    ones.

    We are going to expose the things like ->write_iter() et al. to those
    in subsequent commits.

    New predicate (user_backed_iter()) that is true for ITER_IOVEC and
    ITER_UBUF; places like direct-IO handling should use that for
    checking that pages we modify after getting them from iov_iter_get_pages()
    would need to be dirtied.

    DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
    will solve all problems - there's code that uses iter_is_iovec() to
    decide how to poke around in iov_iter guts and for that the predicate
    replacement obviously won't suffice.
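
    As a rough illustration of the new predicate, the sketch below models
    iterator kinds with an enum. It is a simplification under stated
    assumptions: `struct toy_uiter`, the `TOY_ITER_*` values and the `toy_*`
    helpers are hypothetical and do not reproduce how the kernel stores or
    tests the iterator type.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy subset of iterator kinds; the values are arbitrary. */
enum toy_iter_type { TOY_ITER_IOVEC, TOY_ITER_UBUF, TOY_ITER_KVEC, TOY_ITER_BVEC };

struct toy_uiter {
	enum toy_iter_type type;
};

static bool toy_iter_is_iovec(const struct toy_uiter *i)
{
	return i->type == TOY_ITER_IOVEC;
}

static bool toy_iter_is_ubuf(const struct toy_uiter *i)
{
	return i->type == TOY_ITER_UBUF;
}

/* True for both kinds of iterator that point at userspace memory. */
static bool toy_user_backed_iter(const struct toy_uiter *i)
{
	return toy_iter_is_iovec(i) || toy_iter_is_ubuf(i);
}

/* Direct-I/O style decision: user pages filled by a read need dirtying. */
static bool toy_should_dirty(const struct toy_uiter *i, bool is_read)
{
	assert(i);
	return is_read && toy_user_backed_iter(i);
}
```

    Code that only asks "is this user memory?" can switch to the broader
    predicate, while code that walks the iovec array still has to test for
    the iovec kind specifically.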

    Signed-off-by: Al Viro

    Al Viro
     

04 Aug, 2022

1 commit

  • Pull vfs iov_iter updates from Al Viro:
    "Part 1 - isolated cleanups and optimizations.

    One of the goals is to reduce the overhead of using ->read_iter() and
    ->write_iter() instead of ->read()/->write().

    new_sync_{read,write}() has a surprising amount of overhead, in
    particular inside iocb_flags(). That's why the beginning of the series
    is in this pile; it's not directly iov_iter-related, but it's a part
    of the same work..."

    * tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    first_iovec_segment(): just return address
    iov_iter: massage calling conventions for first_{iovec,bvec}_segment()
    iov_iter: first_{iovec,bvec}_segment() - simplify a bit
    iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()
    iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
    iov_iter_bvec_advance(): don't bother with bvec_iter
    copy_page_{to,from}_iter(): switch iovec variants to generic
    keep iocb_flags() result cached in struct file
    iocb: delay evaluation of IS_SYNC(...) until we want to check IOCB_DSYNC
    struct file: use anonymous union member for rcuhead and llist
    btrfs: use IOMAP_DIO_NOSYNC
    teach iomap_dio_rw() to suppress dsync
    No need of likely/unlikely on calls of check_copy_size()

    Linus Torvalds
     

15 Jul, 2022

1 commit

  • Reduce the size of struct dio by combining the 'op' and 'op_flags' into
    the new 'opf' member. Use the new blk_opf_t type to improve static type
    checking. This patch does not change any functionality.
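
    The idea of folding an operation and its flags into one word can be
    sketched as follows. The layout (operation in the low 8 bits, flags
    above) and every `toy_`/`TOY_` name are assumptions for illustration. A
    plain typedef like the one below does not provide the static type
    checking the commit mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: operation in the low 8 bits, flags above. */
#define TOY_OP_BITS	8
#define TOY_OP_MASK	((1u << TOY_OP_BITS) - 1)

typedef uint32_t toy_opf_t;	/* stand-in for blk_opf_t */

#define TOY_OP_READ	0u
#define TOY_OP_WRITE	1u
#define TOY_FLAG_SYNC	(1u << TOY_OP_BITS)	/* hypothetical flag bit */

/* Combine an operation and its flags into a single word. */
static toy_opf_t toy_make_opf(unsigned int op, unsigned int flags)
{
	assert((op & ~TOY_OP_MASK) == 0);	/* op must fit in the low bits */
	assert((flags & TOY_OP_MASK) == 0);	/* flags must not overlap the op */
	return op | flags;
}

static unsigned int toy_opf_op(toy_opf_t opf)
{
	return opf & TOY_OP_MASK;
}

static unsigned int toy_opf_flags(toy_opf_t opf)
{
	return opf & ~TOY_OP_MASK;
}
```

    One 32-bit member replaces two separate integers, and both halves remain
    recoverable with a mask.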

    Reviewed-by: Jan Kara
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Darrick J. Wong
    Signed-off-by: Bart Van Assche
    Link: https://lore.kernel.org/r/20220714180729.1065367-49-bvanassche@acm.org
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

11 Jun, 2022

1 commit


18 Apr, 2022

1 commit

  • Randomly poking into block device internals for manual prefetches isn't
    exactly a very maintainable thing to do. And none of the performance
    critical direct I/O implementations still use this library function
    anyway, so just drop it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Damien Le Moal
    Link: https://lore.kernel.org/r/20220415045258.199825-28-hch@lst.de
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Mar, 2022

1 commit


02 Feb, 2022

1 commit

  • Pass the block_device and operation that we plan to use this bio for to
    bio_alloc to optimize the assignment. NULL/0 can be passed, both for the
    passthrough case on a raw request_queue and to temporarily avoid
    refactoring some nasty code.

    Also move the gfp_mask argument after the nr_vecs argument for a much
    more logical calling convention matching what most of the kernel does.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Chaitanya Kulkarni
    Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

26 Oct, 2021

1 commit

  • The second argument was only used by the USB gadget code, yet everyone
    pays the overhead of passing a zero to be passed into aio, where it
    ends up being part of the aio res2 value.

    Now that everybody is passing in zero, kill off the extra argument.

    Reviewed-by: Darrick J. Wong
    Signed-off-by: Jens Axboe

    Jens Axboe
     

18 Oct, 2021

1 commit

  • The polling support in the legacy direct-io support is a little crufty.
    It already doesn't support the asynchronous polling needed for io_uring
    polling, and is hard to adapt to upcoming changes in the polling
    interfaces. Given that all the major file systems already use the iomap
    direct I/O code, just drop the polling support.

    Signed-off-by: Christoph Hellwig
    Tested-by: Mark Wunderlich
    Link: https://lore.kernel.org/r/20211012111226.760968-2-hch@lst.de
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

10 Apr, 2021

1 commit

  • I encountered a hung task issue, not a performance one. When running DIO
    on a device that requires contiguous LBAs (for example an open-channel
    SSD), a task can hang in the following case:

    DIO:                                       Checkpoint:
    get addr A (at boundary), merge into BIO,
    no submit because boundary missing
                                               flush dirty data (get addr A+1), wait IO(A+1)
                                               writeback timeout, because DIO(A) didn't submit
    get addr A+2 fails, because the checkpoint
    is in progress

    dio_send_cur_page() may clear sdio->boundary, so prevent it from missing
    a boundary.

    Link: https://lkml.kernel.org/r/20210322042253.38312-1-jack.qiu@huawei.com
    Fixes: b1058b981272 ("direct-io: submit bio after boundary buffer is added to it")
    Signed-off-by: Jack Qiu
    Reviewed-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jack Qiu
     

01 Mar, 2021

1 commit

  • Pull more block updates from Jens Axboe:
    "A few stragglers (and one due to me missing it originally), and fixes
    for changes in this merge window mostly. In particular:

    - blktrace cleanups (Chaitanya, Greg)

    - Kill dead blk_pm_* functions (Bart)

    - Fixes for the bio alloc changes (Christoph)

    - Fix for the partition changes (Christoph, Ming)

    - Fix for turning off iopoll with polled IO inflight (Jeffle)

    - nbd disconnect fix (Josef)

    - loop fsync error fix (Mauricio)

    - kyber update depth fix (Yang)

    - max_sectors alignment fix (Mikulas)

    - Add bio_max_segs helper (Matthew)"

    * tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
    block: Add bio_max_segs
    blktrace: fix documentation for blk_fill_rw()
    block: memory allocations in bounce_clone_bio must not fail
    block: remove the gfp_mask argument to bounce_clone_bio
    block: fix bounce_clone_bio for passthrough bios
    block-crypto-fallback: use a bio_set for splitting bios
    block: fix logging on capacity change
    blk-settings: align max_sectors on "logical_block_size" boundary
    block: reopen the device in blkdev_reread_part
    block: don't skip empty device in in disk_uevent
    blktrace: remove debugfs file dentries from struct blk_trace
    nbd: handle device refs for DESTROY_ON_DISCONNECT properly
    kyber: introduce kyber_depth_updated()
    loop: fix I/O error on fsync() in detached loop devices
    block: fix potential IO hang when turning off io_poll
    block: get rid of the trace rq insert wrapper
    blktrace: fix blk_rq_merge documentation
    blktrace: fix blk_rq_issue documentation
    blktrace: add blk_fill_rwbs documentation comment
    block: remove superfluous param in blk_fill_rwbs()
    ...

    Linus Torvalds
     

27 Feb, 2021

1 commit

  • It's often inconvenient to use BIO_MAX_PAGES due to min() requiring the
    sign to be the same. Introduce bio_max_segs() and change BIO_MAX_PAGES to
    be unsigned to make it easier for the users.
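
    A minimal sketch of such a clamp helper, assuming an unsigned cap of 256:
    `TOY_BIO_MAX_PAGES`, its value, and the `toy_` function names are
    illustrative assumptions, not taken from the commit.

```c
#include <assert.h>

#define TOY_BIO_MAX_PAGES 256U	/* assumed cap; unsigned, as the commit describes */

/* Clamp a requested segment count to the cap without mixing signedness. */
static inline unsigned int toy_bio_max_segs(unsigned int nr_segs)
{
	return nr_segs < TOY_BIO_MAX_PAGES ? nr_segs : TOY_BIO_MAX_PAGES;
}

/* Small self-check showing both sides of the clamp. */
static inline void toy_bio_max_segs_selfcheck(void)
{
	assert(toy_bio_max_segs(3) == 3);
	assert(toy_bio_max_segs(100000) == TOY_BIO_MAX_PAGES);
}
```

    Callers pass their own count and get back a value that is safe to use as
    a vector count, with no cast needed to satisfy a type-checked min().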

    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Jens Axboe

    Matthew Wilcox (Oracle)
     

25 Feb, 2021

1 commit

  • Delete duplicate words in fs/*.c.
    The doubled words that are being dropped are:
    that, be, the, in, and, for

    Link: https://lkml.kernel.org/r/20201224052810.25315-1-rdunlap@infradead.org
    Signed-off-by: Randy Dunlap
    Reviewed-by: Matthew Wilcox (Oracle)
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

25 Jan, 2021

2 commits

  • Direct IO does not operate on the current working set of pages managed
    by the kernel, so it should not be accounted as memory stall to PSI
    infrastructure.

    The block layer and iomap direct IO use bio_iov_iter_get_pages()
    to build bios, and they are the only users of it, so to avoid PSI
    tracking for them, clear the BIO_WORKINGSET flag. Do the same for
    dio_bio_submit() because fs/direct_io constructs bios by hand directly
    calling bio_add_page().

    Reported-by: Christoph Hellwig
    Suggested-by: Christoph Hellwig
    Suggested-by: Johannes Weiner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Pavel Begunkov
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Pavel Begunkov
     
  • Replace the gendisk pointer in struct bio with a pointer to the newly
    improved struct block device. From that the gendisk can be trivially
    accessed with an extra indirection, but it also allows all information
    related to partition remapping to be looked up directly.

    Signed-off-by: Christoph Hellwig
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

16 Oct, 2020

1 commit


09 Oct, 2020

3 commits

  • Prior to commit 9fe55eea7e4b ("Fix race when checking i_size on direct
    i/o read"), an unaligned direct read past end of file would trigger EOF,
    since generic_file_aio_read detected this read-at-EOF condition and
    skipped the direct IO read entirely, returning 0. After that change, the
    read now reaches dio_generic, which detects the misalignment and returns
    EINVAL.

    This consolidates the generic direct-io to follow the same behavior of
    filesystems. Apparently, this fix will only affect ocfs2 since other
    filesystems do this verification before calling do_blockdev_direct_IO,
    with the exception of f2fs, which has the same bug, but is fixed in the
    next patch.

    It can be verified by a read loop on a file that does a partial read
    before EOF (on a file that doesn't end at an aligned address). Without
    this patch, the following code fails on an unaligned file on filesystems
    that lack prior validation, but not on btrfs, ext4, and xfs.

    while (done < total) {
            ssize_t delta = pread(fd, buf + done, total - done, off + done);
            if (!delta)
                    break;
            ...
    }

    Fix this regression by moving the misalignment check to after the EOF
    check added by commit 74cedf9b6c60 ("direct-io: Fix negative return from
    dio read beyond eof").
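
    The effect of the reordering can be modelled with a small userspace
    function. This is a sketch under stated assumptions: the function name,
    its parameters and its 0 / -EINVAL / 1 return convention are
    hypothetical, and only offset and length alignment are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Toy model of the read-side checks: returns 0 for a read that starts at or
 * past EOF, -EINVAL for a misaligned request, and 1 if the I/O may proceed.
 */
static int toy_dio_read_checks(uint64_t offset, uint64_t len,
			       uint64_t i_size, uint64_t blocksize)
{
	/* blocksize must be a non-zero power of two for the mask below. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	/* EOF check first: nothing to read, so alignment does not matter. */
	if (offset >= i_size)
		return 0;

	/* Only then reject requests not aligned to the block size. */
	if ((offset | len) & (blocksize - 1))
		return -EINVAL;

	return 1;
}
```

    For a 5000-byte file and 512-byte blocks, a read starting at offset 5000
    yields 0 (EOF) in this model even though the offset is unaligned, which
    is what lets a read loop like the one above terminate.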

    Based on a patch by Jamie Liu.

    Link: https://lore.kernel.org/r/20201008062620.2928326-4-krisman@collabora.com
    Reported-by: Jamie Liu
    Reviewed-by: Jan Kara
    Reviewed-by: Jens Axboe
    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Jan Kara

    Gabriel Krisman Bertazi
     
  • If a DIO read starts past EOF, the kernel won't attempt it, so we don't
    need to flush dirty pages before failing the syscall.

    Link: https://lore.kernel.org/r/20201008062620.2928326-3-krisman@collabora.com
    Suggested-by: Jan Kara
    Reviewed-by: Jan Kara
    Reviewed-by: Jens Axboe
    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Jan Kara

    Gabriel Krisman Bertazi
     
  • In preparation for reordering the DIO checks, reduce code duplication of error
    handling in do_blockdev_direct_IO.

    Link: https://lore.kernel.org/r/20201008062620.2928326-2-krisman@collabora.com
    Reviewed-by: Jan Kara
    Reviewed-by: Jens Axboe
    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Jan Kara

    Gabriel Krisman Bertazi
     

07 Oct, 2020

1 commit

  • Since we removed the last user of dio_end_io() when btrfs got converted
    to iomap infrastructure ("btrfs: switch to iomap for direct IO"), remove
    the helper function dio_end_io().

    Reviewed-by: Nikolay Borisov
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Josef Bacik
    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba

    Goldwyn Rodrigues
     

01 Jul, 2020

1 commit


15 Jun, 2020

1 commit

  • Pull btrfs updates from David Sterba:
    "This reverts the direct io port to iomap infrastructure of btrfs
    merged in the first pull request. We found problems in invalidate page
    that don't seem to be fixable as regressions or without changing iomap
    code that would not affect other filesystems.

    There are four reverts in total, but three of them are followup
    cleanups needed to revert a43a67a2d715 cleanly. The result is the
    buffer head based implementation of direct io.

    Reverts are not great, but under current circumstances I don't see
    better options"

    * tag 'for-5.8-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
    Revert "btrfs: switch to iomap_dio_rw() for dio"
    Revert "fs: remove dio_end_io()"
    Revert "btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK"
    Revert "btrfs: split btrfs_direct_IO to read and write part"

    Linus Torvalds
     

10 Jun, 2020

1 commit


03 Jun, 2020

1 commit

  • Pull btrfs updates from David Sterba:
    "Highlights:

    - speedup dead root detection during orphan cleanup, eg. when there
    are many deleted subvolumes waiting to be cleaned, the trees are
    now looked up in radix tree instead of a O(N^2) search

    - snapshot creation with inherited qgroup will mark the qgroup
    inconsistent, requires a rescan

    - send will emit file capabilities after chown, this produces a
    stream that does not need postprocessing to set the capabilities
    again

    - direct io ported to iomap infrastructure, cleaned up and simplified
    code, notably removing last use of struct buffer_head in btrfs code

    Core changes:

    - factor out backreference iteration, to be used by ordinary
    backreferences and relocation code

    - improved global block reserve utilization
    * better logic to serialize requests
    * increased maximum available for unlink
    * improved handling on large pages (64K)

    - direct io cleanups and fixes
    * simplify layering, where cloned bios were unnecessarily created
    for some cases
    * error handling fixes (submit, endio)
    * remove repair worker thread, used to avoid deadlocks during
    repair

    - refactored block group reading code, preparatory work for new type
    of block group storage that should improve mount time on large
    filesystems

    Cleanups:

    - cleaned up (and slightly sped up) set/get helpers for metadata data
    structure members

    - root bit REF_COWS got renamed to SHAREABLE to reflect that the
    blocks of the tree get shared either among subvolumes or with the
    relocation trees

    Fixes:

    - when subvolume deletion fails due to ENOSPC, the filesystem is not
    turned read-only

    - device scan deals with devices from other filesystems that changed
    ownership due to overwrite (mkfs)

    - fix a race between scrub and block group removal/allocation

    - fix long standing bug of a runaway balance operation, printing the
    same line to the syslog, caused by a stale status bit on a reloc
    tree that prevented progress

    - fix corrupt log due to concurrent fsync of inodes with shared
    extents

    - fix space underflow for NODATACOW and buffered writes when it for
    some reason needs to fall back to COW mode"

    * tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (133 commits)
    btrfs: fix space_info bytes_may_use underflow during space cache writeout
    btrfs: fix space_info bytes_may_use underflow after nocow buffered write
    btrfs: fix wrong file range cleanup after an error filling dealloc range
    btrfs: remove redundant local variable in read_block_for_search
    btrfs: open code key_search
    btrfs: split btrfs_direct_IO to read and write part
    btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK
    fs: remove dio_end_io()
    btrfs: switch to iomap_dio_rw() for dio
    iomap: remove lockdep_assert_held()
    iomap: add a filesystem hook for direct I/O bio submission
    fs: export generic_file_buffered_read()
    btrfs: turn space cache writeout failure messages into debug messages
    btrfs: include error on messages about failure to write space/inode caches
    btrfs: remove useless 'fail_unlock' label from btrfs_csum_file_blocks()
    btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
    btrfs: make checksum item extension more efficient
    btrfs: fix corrupt log due to concurrent fsync of inodes with shared extents
    btrfs: unexport btrfs_compress_set_level()
    btrfs: simplify iget helpers
    ...

    Linus Torvalds
     

28 May, 2020

1 commit

  • Since we removed the last user of dio_end_io(), remove the helper
    function dio_end_io().

    Reviewed-by: Nikolay Borisov
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba

    Goldwyn Rodrigues
     

13 May, 2020

1 commit

  • A sync dio can be large, or can take a long time during discard or in
    case of I/O failure.

    We already prevent hung-task warnings in submit_bio_wait() and
    blk_execute_rq(), so apply the same trick to sync dio.

    Add a blk_io_schedule() helper that uses io_schedule_timeout() to avoid
    the hung-task warning.

    Signed-off-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Cc: Salman Qazi
    Cc: Jesse Barnes
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Ming Lei
     

05 Jan, 2020

1 commit

  • Include fs/internal.h to address the following 'sparse' warning:

    fs/direct-io.c:591:5: warning: symbol 'sb_init_dio_done_wq' was not declared. Should it be static?

    Link: http://lkml.kernel.org/r/20191209234544.128302-1-ebiggers@kernel.org
    Signed-off-by: Eric Biggers
    Reviewed-by: Jan Kara
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Biggers
     

01 Dec, 2019

1 commit

  • This helper prints a warning if a direct I/O write failed to invalidate
    the cache, and sets EIO on the inode to warn userspace about possible data
    corruption.

    See also commit 5a9d929d6e13 ("iomap: report collisions between directio
    and buffered writes to userspace").

    Direct I/O is supported by non-disk filesystems, for example NFS. Thus
    generic code needs this even in kernel without CONFIG_BLOCK.

    Link: http://lkml.kernel.org/r/157270038074.4812.7980855544557488880.stgit@buzz
    Signed-off-by: Konstantin Khlebnikov
    Reviewed-by: Andrew Morton
    Reviewed-by: Jan Kara
    Cc: Jens Axboe
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

15 Oct, 2019

1 commit

  • Fix kernel-doc warning in fs/direct-io.c:

    fs/direct-io.c:258: warning: Excess function parameter 'offset' description in 'dio_complete'

    Also, don't mark this function as having kernel-doc notation since it is
    not exported.

    Link: http://lkml.kernel.org/r/97908511-4328-4a56-17fe-f43a1d7aa470@infradead.org
    Fixes: 6d544bb4d901 ("dio: centralize completion in dio_complete()")
    Signed-off-by: Randy Dunlap
    Cc: Zach Brown
    Cc: Alexander Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

29 Jun, 2019

1 commit


21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner