14 Oct, 2020

1 commit

  • Pull block updates from Jens Axboe:

    - Series of merge handling cleanups (Baolin, Christoph)

    - Series of blk-throttle fixes and cleanups (Baolin)

    - Series cleaning up BDI, separating the block device from the
    backing_dev_info (Christoph)

    - Removal of bdget() as a generic API (Christoph)

    - Removal of blkdev_get() as a generic API (Christoph)

    - Cleanup of is-partition checks (Christoph)

    - Series reworking disk revalidation (Christoph)

    - Series cleaning up bio flags (Christoph)

    - bio crypt fixes (Eric)

    - IO stats inflight tweak (Gabriel)

    - blk-mq tags fixes (Hannes)

    - Buffer invalidation fixes (Jan)

    - Allow soft limits for zone append (Johannes)

    - Shared tag set improvements (John, Kashyap)

    - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

    - DM no-wait support (Mike, Konstantin)

    - Request allocation improvements (Ming)

    - Allow md/dm/bcache to use IO stat helpers (Song)

    - Series improving blk-iocost (Tejun)

    - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
    Xianting, Yang, Yufen, yangerkun)

    * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
    block: fix uapi blkzoned.h comments
    blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
    blk-mq: get rid of the dead flush handle code path
    block: get rid of unnecessary local variable
    block: fix comment and add lockdep assert
    blk-mq: use helper function to test hw stopped
    block: use helper function to test queue register
    block: remove redundant mq check
    block: invoke blk_mq_exit_sched no matter whether have .exit_sched
    percpu_ref: don't refer to ref->data if it isn't allocated
    block: ratelimit handle_bad_sector() message
    blk-throttle: Re-use the throtl_set_slice_end()
    blk-throttle: Open code __throtl_de/enqueue_tg()
    blk-throttle: Move service tree validation out of the throtl_rb_first()
    blk-throttle: Move the list operation after list validation
    blk-throttle: Fix IO hang for a corner case
    blk-throttle: Avoid tracking latency if low limit is invalid
    blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
    blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
    block: Remove redundant 'return' statement
    ...

    Linus Torvalds
     

07 Oct, 2020

1 commit

  • Don't error out if the dasd_biodasdinfo symbol is not available.

    Cc: stable@vger.kernel.org
    Fixes: 26d7e28e3820 ("s390/dasd: remove ioctl_by_bdev calls")
    Reported-by: Christian Borntraeger
    Signed-off-by: Christoph Hellwig
    Tested-by: Christian Borntraeger
    Reviewed-by: Stefan Haberland
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

06 Oct, 2020

1 commit

  • All remaining callers of bdget() outside of fs/block_dev.c want to get a
    reference to the struct block_device for a given struct hd_struct. Add
    a helper just for that and then mark bdget static.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Sep, 2020

1 commit


02 Sep, 2020

4 commits


01 Sep, 2020

2 commits

  • We need to hold the whole device's bd_mutex to protect against
    another thread concurrently deleting our partition before we get
    to it, and thus causing a use-after-free.

    Fixes: cddae808aeb7 ("block: pass a hd_struct to delete_partition")
    Reported-by: syzbot+6448f3c229bc52b82f69@syzkaller.appspotmail.com
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Commit e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal")
    stopped releasing the request queue from workqueue context, because that
    commit assumed that every blk_put_queue() call is made from a context that
    is allowed to sleep. However, this assumption does not hold: the disk's
    reference is released in the ->release() handler of the partition's
    percpu_ref, which must not sleep because ->release() is run via call_rcu().

    Fix this issue by moving the put of the disk reference into hd_struct_free_work().

    Fixes: e8c7d14ac6c3 ("block: revert back to synchronous request_queue removal")
    Reported-by: Ilya Dryomov
    Signed-off-by: Ming Lei
    Tested-by: Ilya Dryomov
    Reviewed-by: Christoph Hellwig
    Cc: Luis Chamberlain
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Signed-off-by: Jens Axboe

    Ming Lei
     

15 Jul, 2020

1 commit

  • In theory, when GENHD_FL_NO_PART_SCAN is set, no partitions can be created
    on a disk. However, ioctl(BLKPG, BLKPG_ADD_PARTITION) doesn't check
    GENHD_FL_NO_PART_SCAN, so partitions can still be added even though
    GENHD_FL_NO_PART_SCAN is set.

    So far blk_drop_partitions() only removes partitions when disk_part_scan_enabled()
    returns true. This can leave ghost partitions on a loop device after its FD is
    changed or cleared while PARTSCAN is disabled, for example when partitions were
    added via 'parted' on a loop disk even though GENHD_FL_NO_PART_SCAN is set.

    Fix this issue by always removing partitions in blk_drop_partitions(). This
    is correct because the current code assumes that no partitions can be
    added when GENHD_FL_NO_PART_SCAN is set.

    Signed-off-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Cc: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Ming Lei
     

20 Jun, 2020

1 commit

  • Pull block fixes from Jens Axboe:

    - Use import_uuid() where appropriate (Andy)

    - bcache fixes (Coly, Mauricio, Zhiqiang)

    - blktrace sparse warnings fix (Jan)

    - blktrace concurrent setup fix (Luis)

    - blkdev_get use-after-free fix (Jason)

    - Ensure all blk-mq maps are updated (Weiping)

    - Loop invalidate bdev fix (Zheng)

    * tag 'block-5.8-2020-06-19' of git://git.kernel.dk/linux-block:
    block: make function 'kill_bdev' static
    loop: replace kill_bdev with invalidate_bdev
    partitions/ldm: Replace uuid_copy() with import_uuid() where it makes sense
    block: update hctx map when use multiple maps
    blktrace: Avoid sparse warnings when assigning q->blk_trace
    blktrace: break out of blktrace setup on concurrent calls
    block: Fix use-after-free in blkdev_get()
    trace/events/block.h: drop kernel-doc for dropped function parameter
    blk-mq: Remove redundant 'return' statement
    bcache: pr_info() format clean up in bcache_device_init()
    bcache: use delayed kworker fo asynchronous devices registration
    bcache: check and adjust logical block size for backing devices
    bcache: fix potential deadlock problem in btree_gc_coalesce

    Linus Torvalds
     

18 Jun, 2020

1 commit


16 Jun, 2020

1 commit

  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://github.com/KSPP/linux/issues/21
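
    As an illustration of the flexible array member style described above,
    here is a minimal userspace sketch. The names struct sample_buf and
    sample_buf_alloc() are hypothetical and do not come from the patch; the
    sketch only shows the C99 language feature, not any kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for illustration; not a kernel structure. */
struct sample_buf {
	size_t len;
	unsigned char data[];	/* flexible array member: last field, no size */
};

/* Allocate the header and len trailing bytes as a single block. */
static struct sample_buf *sample_buf_alloc(const void *src, size_t len)
{
	struct sample_buf *b;

	assert(src != NULL || len == 0);

	/* sizeof(*b) does not count any elements of the trailing array. */
	b = malloc(sizeof(*b) + len);
	if (!b)
		return NULL;
	b->len = len;
	if (len)
		memcpy(b->data, src, len);
	return b;
}
```

    With the older "data[0]" or "data[1]" declarations the compiler cannot
    tell that the array is meant to be variable-length. In kernel code the
    allocation size would normally be computed with the struct_size()
    helper rather than by open-coded addition as in this sketch.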

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

14 Jun, 2020

1 commit

  • Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
    '---help---'"), the number of '---help---' has been gradually
    decreasing, but there are still more than 2400 instances.

    This commit finishes the conversion. While I touched the lines,
    I also fixed the indentation.

    There are a variety of indentation styles found.

    a) 4 spaces + '---help---'
    b) 7 spaces + '---help---'
    c) 8 spaces + '---help---'
    d) 1 space + 1 tab + '---help---'
    e) 1 tab + '---help---' (correct indentation)
    f) 1 tab + 1 space + '---help---'
    g) 1 tab + 2 spaces + '---help---'

    In order to convert all of them to 1 tab + 'help', I ran the
    following command:

    $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
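
    To make the effect of the sed expression concrete, the following sketch
    applies the same per-line rewrite in C. The function name
    fix_help_line() is hypothetical and was not part of the conversion; it
    only mirrors what the substitution does to a single line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite one Kconfig line the way the sed expression does: if the line
 * starts with optional whitespace followed by "---help---", replace that
 * prefix with one tab and "help". Any other line is copied unchanged.
 * Returns 1 if the line was rewritten, 0 otherwise.
 */
static int fix_help_line(const char *in, char *out, size_t outsz)
{
	static const char old[] = "---help---";
	const char *p = in;

	assert(in != NULL && out != NULL && outsz > 0);

	/* Skip leading whitespace, like ^[[:space:]]* in the pattern. */
	while (isspace((unsigned char)*p))
		p++;

	if (strncmp(p, old, sizeof(old) - 1) != 0) {
		snprintf(out, outsz, "%s", in);
		return 0;
	}

	/* Keep whatever followed the old keyword on the same line. */
	snprintf(out, outsz, "\thelp%s", p + sizeof(old) - 1);
	return 1;
}
```

    All seven indentation variants listed above collapse to the same
    output, because the leading whitespace is discarded before the keyword
    is replaced.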

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

03 Jun, 2020

1 commit

  • Pull block driver updates from Jens Axboe:
    "On top of the core changes, here are the block driver changes for this
    merge window:

    - NVMe changes:
        - NVMe over Fibre Channel protocol updates, which also reach
        over to drivers/scsi/lpfc (James Smart)
        - namespace revalidation support on the target (Anthony
        Iliopoulos)
        - gcc zero length array fix (Arnd Bergmann)
        - nvmet cleanups (Chaitanya Kulkarni)
        - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg)
        - use a SRQ per completion vector (Max Gurtovoy)
        - fix handling of runtime changes to the queue count (Weiping
        Zhang)
        - t10 protection information support for nvme-rdma and
        nvmet-rdma (Israel Rukshin and Max Gurtovoy)
        - target side AEN improvements (Chaitanya Kulkarni)
        - various fixes and minor improvements all over, including the
        nvme part of the lpfc driver

    - Floppy code cleanup series (Willy, Denis)

    - Floppy contention fix (Jiri)

    - Loop CONFIGURE support (Martijn)

    - bcache fixes/improvements (Coly, Joe, Colin)

    - q->queuedata cleanups (Christoph)

    - Get rid of ioctl_by_bdev (Christoph, Stefan)

    - md/raid5 allocation fixes (Coly)

    - zero length array fixes (Gustavo)

    - swim3 task state fix (Xu)"

    * tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits)
    bcache: configure the asynchronous registertion to be experimental
    bcache: asynchronous devices registration
    bcache: fix refcount underflow in bcache_device_free()
    bcache: Convert pr_ uses to a more typical style
    bcache: remove redundant variables i and n
    lpfc: Fix return value in __lpfc_nvme_ls_abort
    lpfc: fix axchg pointer reference after free and double frees
    lpfc: Fix pointer checks and comments in LS receive refactoring
    nvme: set dma alignment to qword
    nvmet: cleanups the loop in nvmet_async_events_process
    nvmet: fix memory leak when removing namespaces and controllers concurrently
    nvmet-rdma: add metadata/T10-PI support
    nvmet: add metadata support for block devices
    nvmet: add metadata/T10-PI support
    nvme: add Metadata Capabilities enumerations
    nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
    nvmet: rename nvmet_rw_len to nvmet_rw_data_len
    nvmet: add metadata characteristics for a namespace
    nvme-rdma: add metadata/T10-PI support
    nvme-rdma: introduce nvme_rdma_sgl structure
    ...

    Linus Torvalds
     

27 May, 2020

1 commit


21 May, 2020

1 commit

  • The IBM partition parser requires device type specific information only
    available to the DASD driver to correctly register partitions. The
    current approach of using ioctl_by_bdev with a fake user space pointer
    is discouraged.

    Fix this by replacing IOCTL calls with direct in-kernel function calls.

    Suggested-by: Christoph Hellwig
    Signed-off-by: Stefan Haberland
    Reviewed-by: Jan Hoeppner
    Reviewed-by: Peter Oberparleiter
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Stefan Haberland
     

13 May, 2020

2 commits

  • The seqcount of 'nr_sects_seq' is only needed in case of 32bit SMP,
    so define it just for 32bit SMP.

    Signed-off-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Cc: Yufen Yu
    Cc: Christoph Hellwig
    Cc: Hou Tao
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • delete_partition() clears the cached last_lookup partition. However, the
    .last_lookup cache may be overwritten by one IO path after it is cleared
    in delete_partition(). Another IO path may then use the cached partition,
    which is being deleted, after hd_struct_free() is called, triggering a
    use-after-free on the cached partition.

    Fix the issue with the following approach:

    1) always get the partition's refcount via hd_struct_try_get() before
    setting .last_lookup

    2) move clearing .last_lookup from delete_partition() to hd_struct_free(),
    which is the release handler of the partition's percpu-refcount, so that no
    IO path can cache a partition that is being deleted via .last_lookup.

    This is an alternative to Yufen's patch[1], which adds overhead to the
    fast path through an indirect lookup that may touch one extra cacheline
    in the IO path. This patch instead relies on the percpu-refcount's
    protection, and it is easier to understand and verify.

    [1] https://lore.kernel.org/linux-block/20200109013551.GB9655@ming.t460p/T/#t
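
    The two steps of the fix can be sketched in userspace C as follows.
    This is a simplified model under stated assumptions: struct part,
    struct part_cache, part_try_get(), lookup_and_cache() and part_put()
    are hypothetical names, a plain C11 atomic counter stands in for the
    percpu-refcount, and RCU is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a partition; not the kernel's hd_struct. */
struct part {
	atomic_int refs;	/* 0 means the partition is being freed */
};

/* Hypothetical stand-in for the table holding the .last_lookup cache. */
struct part_cache {
	struct part *last_lookup;
};

/* Take a reference only while the count is still non-zero. */
static bool part_try_get(struct part *p)
{
	int old = atomic_load(&p->refs);

	while (old > 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Step 1: cache a partition only after a reference was obtained. */
static struct part *lookup_and_cache(struct part_cache *c, struct part *p)
{
	assert(c != NULL && p != NULL);

	if (!part_try_get(p))
		return NULL;	/* being deleted: never cache it */
	c->last_lookup = p;
	return p;
}

/* Step 2: the cache is cleared by the release path, not by deletion. */
static void part_put(struct part_cache *c, struct part *p)
{
	if (atomic_fetch_sub(&p->refs, 1) == 1) {
		/* Last reference dropped: forget the cached pointer. */
		if (c->last_lookup == p)
			c->last_lookup = NULL;
	}
}
```

    Because a partition whose count has reached zero can no longer be
    referenced through part_try_get(), nothing can put it back into the
    cache once the release path has cleared it.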

    Reported-by: Yufen Yu
    Signed-off-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Cc: Christoph Hellwig
    Cc: Hou Tao
    Signed-off-by: Jens Axboe

    Ming Lei
     

10 May, 2020

1 commit

  • Pull in block-5.7 fixes for 5.8. Mostly to resolve a conflict with
    the blk-iocost changes, but we also need the bdi use-after-free fix
    as a base, since we build on top of it.

    * block-5.7:
    nvme: fix possible hang when ns scanning fails during error recovery
    nvme-pci: fix "slimmer CQ head update"
    bdi: add a ->dev_name field to struct backing_dev_info
    bdi: use bdi_dev_name() to get device name
    bdi: move bdi_dev_name out of line
    vboxsf: don't use the source name in the bdi name
    iocost: protect iocg->abs_vdebt with iocg->waitq.lock
    block: remove the bd_openers checks in blk_drop_partitions
    nvme: prevent double free in nvme_alloc_ns() error handling
    null_blk: Cleanup zoned device initialization
    null_blk: Fix zoned command handling
    block: remove unused header
    blk-iocost: Fix error on iocost_ioc_vrate_adj
    bdev: Reduce time holding bd_mutex in sync in blkdev_close()
    buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

01 May, 2020

1 commit

  • When replacing the bd_super check with a bd_openers check, I followed a
    logical conclusion, which turns out to be utterly wrong. When a block
    device has bd_super set, it has a mounted file system on it (although not
    every mounted file system sets bd_super), but that also implies it doesn't
    even have partitions to start with.

    So instead of trying to come up with a logical check for all openers,
    just remove the check entirely.

    Fixes: d3ef5536274f ("block: fix busy device checking in blk_drop_partitions")
    Fixes: cb6b771b05c3 ("block: fix busy device checking in blk_drop_partitions again")
    Reported-by: Michal Koutný
    Reported-by: Yang Xu
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

21 Apr, 2020

8 commits


10 Apr, 2020

1 commit


08 Apr, 2020

1 commit

  • bd_super is only set by get_tree_bdev and mount_bdev, and thus not by
    other openers like btrfs or the XFS realtime and log devices, as well as
    block devices directly opened from user space. Check bd_openers
    instead.

    Fixes: 77032ca66f86 ("Return EBUSY from BLKRRPART for mounted whole-dev fs")
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

31 Mar, 2020

1 commit

  • Pull EFI updates from Ingo Molnar:
    "The EFI changes in this cycle are much larger than usual, for two
    (positive) reasons:

    - The GRUB project is showing signs of life again, resulting in the
    introduction of the generic Linux/UEFI boot protocol, instead of
    x86 specific hacks which are increasingly difficult to maintain.
    There's hope that all future extensions will now go through that
    boot protocol.

    - Preparatory work for RISC-V EFI support.

    The main changes are:

    - Boot time GDT handling changes

    - Simplify handling of EFI properties table on arm64

    - Generic EFI stub cleanups, to improve command line handling, file
    I/O, memory allocation, etc.

    - Introduce a generic initrd loading method based on calling back
    into the firmware, instead of relying on the x86 EFI handover
    protocol or device tree.

    - Introduce a mixed mode boot method that does not rely on the x86
    EFI handover protocol either, and could potentially be adopted by
    other architectures (if another one ever surfaces where one
    execution mode is a superset of another)

    - Clean up the contents of 'struct efi', and move out everything that
    doesn't need to be stored there.

    - Incorporate support for UEFI spec v2.8A changes that permit
    firmware implementations to return EFI_UNSUPPORTED from UEFI
    runtime services at OS runtime, and expose a mask of which ones are
    supported or unsupported via a configuration table.

    - Partial fix for the lack of by-VA cache maintenance in the
    decompressor on 32-bit ARM.

    - Changes to load device firmware from EFI boot service memory
    regions

    - Various documentation updates and minor code cleanups and fixes"

    * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
    efi/libstub/arm: Fix spurious message that an initrd was loaded
    efi/libstub/arm64: Avoid image_base value from efi_loaded_image
    partitions/efi: Fix partition name parsing in GUID partition entry
    efi/x86: Fix cast of image argument
    efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations
    efi: Fix a mistype in comments mentioning efivar_entry_iter_begin()
    efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux
    efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map()
    efi/x86: Ignore the memory attributes table on i386
    efi/x86: Don't relocate the kernel unless necessary
    efi/x86: Remove extra headroom for setup block
    efi/x86: Add kernel preferred address to PE header
    efi/x86: Decompress at start of PE image load address
    x86/boot/compressed/32: Save the output address instead of recalculating it
    efi/libstub/x86: Deal with exit() boot service returning
    x86/boot: Use unsigned comparison for addresses
    efi/x86: Avoid using code32_start
    efi/x86: Make efi32_pe_entry() more readable
    efi/x86: Respect 32-bit ABI in efi32_pe_entry()
    efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA
    ...

    Linus Torvalds
     

25 Mar, 2020

1 commit


24 Mar, 2020

6 commits