14 Dec, 2020

1 commit

  • errata:
    When a read command returns less data than specified in the PRDs (for
    example, there are two PRDs for this command, but the device returns a
    number of bytes which is less than in the first PRD), the second PRD of
    this command is not read out of the PRD FIFO, causing the next command
    to use this PRD erroneously.

    Workaround:
    - Force sg_tablesize = 1.
    - Modify the sg_io function in block/scsi_ioctl.c to use a 64k buffer
      allocated with dma_alloc_coherent during the probe in ahci_imx.
    - To fix the SCSI/SATA hang that occurs when a CD-ROM and an HDD are
      accessed simultaneously after the workaround is applied, do not go
      to sleep in scsi_eh_handler when the host reports a failure.

    Signed-off-by: Richard Zhu

    Richard Zhu
     

14 Oct, 2020

2 commits

  • Pull block driver updates from Jens Axboe:
    "Here are the driver updates for 5.10.

    A few SCSI updates in here too, in coordination with Martin as they
    depend on core block changes for the shared tag bitmap.

    This contains:

    - NVMe pull requests via Christoph:
    - fix keep alive timer modification (Amit Engel)
    - order the PCI ID list more sensibly (Andy Shevchenko)
    - cleanup the open by controller helper (Chaitanya Kulkarni)
    - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
    - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
    - fix nvme_ns_report_zones (Christoph Hellwig)
    - add a sanity check to nvmet-fc (James Smart)
    - fix interrupt allocation when too many polled queues are
    specified (Jeffle Xu)
    - small nvmet-tcp optimization (Mark Wunderlich)
    - fix a controller refcount leak on init failure (Chaitanya
    Kulkarni)
    - misc cleanups (Chaitanya Kulkarni)
    - major refactoring of the scanning code (Christoph Hellwig)

    - MD updates via Song:
    - Bug fixes in bitmap code, from Zhao Heming
    - Fix a work queue check, from Guoqing Jiang
    - Fix raid5 oops with reshape, from Song Liu
    - Clean up unused code, from Jason Yan
    - Discard improvements, from Xiao Ni
    - raid5/6 page offset support, from Yufen Yu

    - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap,
    Hannes)

    - null_blk open/active zone limit support (Niklas)

    - Set of bcache updates (Coly, Dongsheng, Qinglang)"

    * tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits)
    md/raid5: fix oops during stripe resizing
    md/bitmap: fix memory leak of temporary bitmap
    md: fix the checking of wrong work queue
    md/bitmap: md_bitmap_get_counter returns wrong blocks
    md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks
    md/raid0: remove unused function is_io_in_chunk_boundary()
    nvme-core: remove extra condition for vwc
    nvme-core: remove extra variable
    nvme: remove nvme_identify_ns_list
    nvme: refactor nvme_validate_ns
    nvme: move nvme_validate_ns
    nvme: query namespace identifiers before adding the namespace
    nvme: revalidate zone bitmaps in nvme_update_ns_info
    nvme: remove nvme_update_formats
    nvme: update the known admin effects
    nvme: set the queue limits in nvme_update_ns_info
    nvme: remove the 0 lba_shift check in nvme_update_ns_info
    nvme: clean up the check for too large logic block sizes
    nvme: freeze the queue over ->lba_shift updates
    nvme: factor out a nvme_configure_metadata helper
    ...

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:

    - Series of merge handling cleanups (Baolin, Christoph)

    - Series of blk-throttle fixes and cleanups (Baolin)

    - Series cleaning up BDI, separating the block device from the
    backing_dev_info (Christoph)

    - Removal of bdget() as a generic API (Christoph)

    - Removal of blkdev_get() as a generic API (Christoph)

    - Cleanup of is-partition checks (Christoph)

    - Series reworking disk revalidation (Christoph)

    - Series cleaning up bio flags (Christoph)

    - bio crypt fixes (Eric)

    - IO stats inflight tweak (Gabriel)

    - blk-mq tags fixes (Hannes)

    - Buffer invalidation fixes (Jan)

    - Allow soft limits for zone append (Johannes)

    - Shared tag set improvements (John, Kashyap)

    - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

    - DM no-wait support (Mike, Konstantin)

    - Request allocation improvements (Ming)

    - Allow md/dm/bcache to use IO stat helpers (Song)

    - Series improving blk-iocost (Tejun)

    - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
    Xianting, Yang, Yufen, yangerkun)

    * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
    block: fix uapi blkzoned.h comments
    blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
    blk-mq: get rid of the dead flush handle code path
    block: get rid of unnecessary local variable
    block: fix comment and add lockdep assert
    blk-mq: use helper function to test hw stopped
    block: use helper function to test queue register
    block: remove redundant mq check
    block: invoke blk_mq_exit_sched no matter whether have .exit_sched
    percpu_ref: don't refer to ref->data if it isn't allocated
    block: ratelimit handle_bad_sector() message
    blk-throttle: Re-use the throtl_set_slice_end()
    blk-throttle: Open code __throtl_de/enqueue_tg()
    blk-throttle: Move service tree validation out of the throtl_rb_first()
    blk-throttle: Move the list operation after list validation
    blk-throttle: Fix IO hang for a corner case
    blk-throttle: Avoid tracking latency if low limit is invalid
    blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
    blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
    block: Remove redundant 'return' statement
    ...

    Linus Torvalds
     

13 Oct, 2020

1 commit

  • Pull compat iovec cleanups from Al Viro:
    "Christoph's series around import_iovec() and compat variant thereof"

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    security/keys: remove compat_keyctl_instantiate_key_iov
    mm: remove compat_process_vm_{readv,writev}
    fs: remove compat_sys_vmsplice
    fs: remove the compat readv/writev syscalls
    fs: remove various compat readv/writev helpers
    iov_iter: transparently handle compat iovecs in import_iovec
    iov_iter: refactor rw_copy_check_uvector and import_iovec
    iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
    compat.h: fix a spelling error in

    Linus Torvalds
     

03 Oct, 2020

3 commits

  • Use in_compat_syscall() to import either native or the compat iovecs, and
    remove the now superfluous compat_import_iovec.

    This removes the need for special compat logic in most callers, and
    the remaining ones can still be simplified by using __import_iovec
    with a bool compat parameter.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • One-element arrays are being deprecated[1]. Replace the one-element array
    with a simple object of type compat_caddr_t: 'compat_caddr_t unused'[2],
    once it seems this field is actually never used.

    Also, update struct cdrom_generic_command in UAPI by adding an
    anonymous union to avoid using the one-element array _reserved_.
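
The anonymous-union approach can be illustrated in user space. This is only a sketch under assumptions: `struct cmd_old` and `struct cmd_new` are hypothetical stand-ins, not the real UAPI structure, and plain `void *` replaces the kernel's user-pointer type. The point is that wrapping the one-element array in an anonymous union gives code a scalar member to use while leaving size and offsets unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: a trailing one-element array. */
struct cmd_old {
	int quiet;
	void *reserved[1];
};

/* Hypothetical "after" layout: the array stays available under its old
 * name for existing users, while new code can refer to the scalar
 * member. Anonymous unions require C11 or a GNU dialect. */
struct cmd_new {
	int quiet;
	union {
		void *reserved[1];
		void *unused;
	};
};

/* Returns 1 when the new layout is binary-compatible with the old one. */
static int layouts_match(void)
{
	return sizeof(struct cmd_old) == sizeof(struct cmd_new) &&
	       offsetof(struct cmd_old, reserved) ==
	       offsetof(struct cmd_new, unused);
}
```

Because both union members start at the same offset and have the same size, existing binaries that were built against the array member would see an identical layout.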

    [1] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays
    [2] https://github.com/KSPP/linux/issues/86

    Signed-off-by: Gustavo A. R. Silva
    Link: https://lore.kernel.org/lkml/5f76f5d0.qJ4t%2FHWuRzSW7bTa%25lkp@intel.com/
    Build-tested-by: kernel test robot
    Signed-off-by: Jens Axboe

    Gustavo A. R. Silva
     
  • scsi_put_cdrom_generic_arg() is copying uninitialized stack memory to
    userspace, since the compiler may leave a 3-byte hole in the middle of
    `cgc32`. Fix it by adding a padding field to `struct
    compat_cdrom_generic_command`.
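
The kind of hole described above can be sketched in user space. The structures below are hypothetical, shortened stand-ins (not the kernel's `struct compat_cdrom_generic_command`), with the 32-bit user pointer modeled as `uint32_t`. Naming the padding explicitly means an initializer zeroes it, so copying the whole structure out no longer exposes stale stack bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout with an implicit hole: after the 1-byte member the
 * compiler typically inserts 3 bytes of padding to align the next 32-bit
 * member. Those bytes are not guaranteed to be initialized. */
struct cgc_holey {
	uint32_t buffer;
	uint8_t  data_direction;
	uint32_t timeout;
};

/* Same layout with the hole turned into a named member. */
struct cgc_padded {
	uint32_t buffer;
	uint8_t  data_direction;
	uint8_t  pad[3];
	uint32_t timeout;
};

/* Members omitted from a designated initializer are zero-initialized,
 * and pad is now a real member, so it is covered by that rule. */
static struct cgc_padded make_padded(uint32_t buffer, uint8_t dir,
				     uint32_t timeout)
{
	struct cgc_padded c = {
		.buffer = buffer,
		.data_direction = dir,
		.timeout = timeout,
	};
	return c;
}

/* Returns 1 if every padding byte is zero. */
static int pad_is_zero(const struct cgc_padded *c)
{
	static const uint8_t zero[3];

	return memcmp(c->pad, zero, sizeof(zero)) == 0;
}
```

On common ABIs the padded structure has the same size and member offsets as the implicit layout, so the fix would not change what userspace sees apart from the formerly undefined bytes.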

    Cc: stable@vger.kernel.org
    Fixes: f3ee6e63a9df ("compat_ioctl: move CDROM_SEND_PACKET handling into scsi")
    Suggested-by: Dan Carpenter
    Suggested-by: Arnd Bergmann
    Reported-by: syzbot+85433a479a646a064ab3@syzkaller.appspotmail.com
    Signed-off-by: Peilin Ye
    Signed-off-by: Jens Axboe

    Peilin Ye
     

25 Sep, 2020

1 commit


11 Sep, 2020

1 commit


17 Mar, 2020

1 commit

  • Allow users with read permissions to issue REPORT ZONE commands and users
    with write permissions to manage zones on block devices supporting the ZBC
    specification.

    Link: https://lore.kernel.org/r/20200226170518.92963-2-ryanattard@ryanattard.info
    Signed-off-by: Ryan Attard
    Reviewed-by: Damien Le Moal
    Signed-off-by: Martin K. Petersen

    Ryan Attard
     

03 Jan, 2020

2 commits

  • There is only one implementation of this ioctl, so move the handling out
    of the common block layer code into the place where it's actually needed.

    It also gets called indirectly through pktcdvd, which needs to be aware
    of this change.

    As I noticed, the old implementation of the compat handler failed to
    convert the structure on the way out, so the updated fields never got
    written back to user space. This is either not important, or it has
    never worked and should be fixed now.

    Reviewed-by: Ben Hutchings
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • In the v5.4 merge window, a cleanup patch from Al Viro conflicted
    with my rework of the compat handling for sg.c read(). Linus Torvalds
    did a correct merge but pointed out that the resulting code is still
    unsatisfactory.

    I later noticed that the sg_new_read() function still gets the compat
    mode wrong, when the 'count' argument is large enough to pass a
    compat_sg_io_hdr object, but not a native sg_io_hdr.

    To address both of these, move the definition of compat_sg_io_hdr
    into a scsi/sg.h to make it visible to sg.c and rewrite the logic
    for reading req_pack_id as well as the size check to a simpler
    version that gets the expected results.

    Fixes: c35a5cfb4150 ("scsi: sg: sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t")
    Fixes: 98aaaec4a150 ("compat_ioctl: reimplement SG_IO handling")
    Reviewed-by: Ben Hutchings
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

23 Oct, 2019

1 commit

  • There are two code locations that implement the SG_IO ioctl: the old
    sg.c driver, and the generic scsi_ioctl helper that is in turn used by
    multiple drivers.

    To eradicate the old compat_ioctl conversion handler for the SG_IO
    command, I implement a readable pair of put_sg_io_hdr() / get_sg_io_hdr()
    helper functions that can be used for both compat and native mode,
    and then I call this from both drivers.

    For the iovec handling, there is already a compat_import_iovec() function
    that can simply be called in place of import_iovec().

    To avoid having to pass the compat/native state through multiple
    indirections, I mark the SG_IO command itself as compatible in
    fs/compat_ioctl.c and use in_compat_syscall() to figure out where
    we are called from.
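
The idea of one helper serving both modes can be sketched as follows. Everything here is a simplified assumption: the two structures carry only three of the real header's fields, `get_hdr()` is a hypothetical name, `memcpy()` stands in for copying from user space, and the `is_compat` argument takes the place of the in_compat_syscall() check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily shortened native header. */
struct hdr_native {
	int          interface_id;
	unsigned int dxfer_len;
	void        *dxferp;
};

/* Hypothetical 32-bit counterpart: the pointer is a 32-bit integer. */
struct hdr_compat {
	int32_t  interface_id;
	uint32_t dxfer_len;
	uint32_t dxferp;
};

/* Read a header from src in either layout and produce the native form.
 * Callers never need to know which layout the submitter used. */
static int get_hdr(struct hdr_native *out, const void *src, int is_compat)
{
	if (is_compat) {
		struct hdr_compat c;

		memcpy(&c, src, sizeof(c));
		out->interface_id = c.interface_id;
		out->dxfer_len = c.dxfer_len;
		/* Widen the 32-bit pointer value to a native pointer. */
		out->dxferp = (void *)(uintptr_t)c.dxferp;
		return 0;
	}

	memcpy(out, src, sizeof(*out));
	return 0;
}
```

A matching put helper would perform the reverse conversion on the way out, so that updated fields reach the 32-bit caller in the layout it expects.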

    As a side-effect of this, the sg.c driver now also accepts the 32-bit
    sg_io_hdr format in compat mode using the read/write interface, not
    just ioctl. This should improve compatibility with old 32-bit binaries,
    but it would break if any application intentionally passes the 64-bit
    data structure in compat mode here.

    Steffen Maier helped debug an issue in an earlier version of this patch.

    Cc: Steffen Maier
    Cc: linux-scsi@vger.kernel.org
    Cc: Doug Gilbert
    Cc: "James E.J. Bottomley"
    Cc: "Martin K. Petersen"
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

01 May, 2019

1 commit


14 May, 2018

2 commits


11 Jan, 2018

1 commit

  • After the first few months, the message has not led to many bug reports.
    It's been almost five years now, and in practice the main source of
    it seems to be MTIOCGET that someone is using to detect tape devices.
    While we could whitelist it just like CDROM_GET_CAPABILITY, this patch
    just removes the message altogether.

    The patch also removes the "safe but not very useful" ioctl whitelist,
    as suggested by Christoph. I doubt anything is using most of those
    ioctls _in general_, let alone on a partition.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Jens Axboe

    Paolo Bonzini
     

10 Jan, 2018

1 commit

  • Commit 3a025e1d1c2e ("Add optional check for bad kernel-doc comments")
    causes the kernel-doc script to be run when the kernel is built with
    W=1, which makes several new warnings appear. Fix the block layer
    kernel-doc headers such that the block layer again builds cleanly
    with W=1.

    Signed-off-by: Bart Van Assche
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

11 Nov, 2017

1 commit


21 Jun, 2017

2 commits

  • Since scsi_req_init() works on a struct scsi_request, change the
    argument type into struct scsi_request *.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Instead of explicitly calling scsi_req_init() after blk_get_request(),
    call that function from inside blk_get_request(). Add an
    .initialize_rq_fn() callback function to the block drivers that need
    it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn()
    because it is too small to keep it as a separate function. Keep the
    scsi_req_init() call in ide_prep_sense() because it follows a
    blk_rq_init() call.

    References: commit 82ed4db499b8 ("block: split scsi_request out of struct request")
    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Omar Sandoval
    Cc: Nicholas Bellinger
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

21 Apr, 2017

2 commits

  • This passes on the scsi_cmnd result field to users of passthrough
    requests. Currently we abuse req->errors for this purpose, but that
    field will go away in its current form.

    Note that the old IDE code abuses the errors field in very creative
    ways and stores all kinds of different values in it. I didn't dare
    to touch this magic, so the abuses are brought forward 1:1.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • The function only returns -EIO if rq->errors is non-zero, which is not
    very useful and lets a large number of callers ignore the return value.

    Just let the callers figure out their error themselves.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

06 Apr, 2017

1 commit


01 Feb, 2017

1 commit

  • Instead of keeping two levels of indirection for requests types, fold it
    all into the operations. The little caveat here is that previously
    cmd_type only applied to struct request, while the request and bio op
    fields were set to plain REQ_OP_READ/WRITE even for passthrough
    operations.

    Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
    private requests, although it has to add two for each so that we
    can communicate the data in/out nature of the request.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

28 Jan, 2017

1 commit

  • And require all drivers that want to support BLOCK_PC to allocate it
    as the first thing of their private data. To support this the legacy
    IDE and BSG code is switched to set cmd_size on their queues to let
    the block layer allocate the additional space.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

25 Dec, 2016

1 commit


19 Dec, 2016

1 commit

  • The WRITE_SAME commands are not present in the blk_default_cmd_filter
    write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
    is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
    [ sg_io() -> blk_fill_sghdr_rq() -> blk_verify_command() -> -EPERM ]

    The problem can be reproduced with the sg_write_same command

    # sg_write_same --num 1 --xferlen 512 /dev/sda
    #

    # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
    Write same: pass through os error: Operation not permitted
    #

    For comparison, the WRITE_VERIFY command does not observe this problem,
    since it is in that list:

    # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
    #

    So, this patch adds the WRITE_SAME commands to the list, in order
    for the SG_IO ioctl to finish successfully:

    # capsh --drop=cap_sys_rawio -- -c \
    'sg_write_same --num 1 --xferlen 512 /dev/sda'
    #
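
The allow-list check behind this behaviour can be sketched in user space. The code below is an illustration under assumptions, not the kernel's implementation: `struct cmd_filter`, `filter_allow()`, and `verify_command()` are hypothetical names, and capability and permission checks are reduced to plain integer flags. The opcode 0x41 for Write Same(10) matches the CDB shown in the guest log of this commit message.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define OP_WRITE_VERIFY_10 0x2e
#define OP_WRITE_SAME_10   0x41

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define FILTER_WORDS  (256 / BITS_PER_WORD)

/* One bit per possible opcode, split by the access the command needs. */
struct cmd_filter {
	unsigned long read_ok[FILTER_WORDS];
	unsigned long write_ok[FILTER_WORDS];
};

static void filter_allow(unsigned long *map, unsigned char opcode)
{
	map[opcode / BITS_PER_WORD] |= 1UL << (opcode % BITS_PER_WORD);
}

static int filter_test(const unsigned long *map, unsigned char opcode)
{
	return (map[opcode / BITS_PER_WORD] >> (opcode % BITS_PER_WORD)) & 1;
}

/* Returns 0 if the command may be sent, -EPERM otherwise. */
static int verify_command(const struct cmd_filter *f, unsigned char opcode,
			  int has_write_perm, int has_rawio)
{
	if (has_rawio)			/* privileged: no filtering */
		return 0;
	if (filter_test(f->read_ok, opcode))
		return 0;
	if (has_write_perm && filter_test(f->write_ok, opcode))
		return 0;
	return -EPERM;
}
```

With only WRITE_VERIFY in the write list, an unprivileged WRITE_SAME is rejected; adding the opcode to the list lets the same request through for a caller that opened the device for writing.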

    That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
    (qemu "-device scsi-block" [1], libvirt "" [2]),
    which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu).

    In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
    which are translated to write-same calls in the guest kernel, and then into
    SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:

    [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
    [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
    [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
    [...] blk_update_request: I/O error, dev sda, sector 17096824

    Links:
    [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
    [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')

    Signed-off-by: Mauricio Faria de Oliveira
    Signed-off-by: Brahadambal Srinivasan
    Reported-by: Manjunatha H R
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Mauricio Faria de Oliveira
     

07 Nov, 2015

1 commit

  • __GFP_WAIT was used to signal that the caller was in atomic context and
    could not sleep. Now it is possible to distinguish between true atomic
    context and callers that are not willing to sleep. The latter should
    clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
    __GFP_WAIT behaves differently, there is a risk that people will clear the
    wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
    indicate what it does -- setting it allows all reclaim activity, clearing
    it prevents it.
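
The distinction between the combined flag and its direct-reclaim part can be sketched with plain bit masks. The bit values and the `SK_` names below are hypothetical and chosen only for illustration; they are not the kernel's definitions. The sketch assumes the combined flag is the union of a direct-reclaim bit and a kswapd-wakeup bit.

```c
#include <assert.h>

typedef unsigned int sk_gfp_t;

/* Hypothetical bit assignments for illustration only. */
#define SK_DIRECT_RECLAIM 0x1u	/* caller may sleep and reclaim itself */
#define SK_KSWAPD_RECLAIM 0x2u	/* caller may wake background reclaim */
#define SK_RECLAIM	  (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)

/* A caller may block only if direct reclaim is allowed. */
static int may_block(sk_gfp_t flags)
{
	return (flags & SK_DIRECT_RECLAIM) != 0;
}

static int wakes_kswapd(sk_gfp_t flags)
{
	return (flags & SK_KSWAPD_RECLAIM) != 0;
}

/* What a caller that is merely unwilling to sleep should do: drop only
 * the direct-reclaim bit so background reclaim is still triggered. */
static sk_gfp_t without_sleeping(sk_gfp_t flags)
{
	return flags & ~SK_DIRECT_RECLAIM;
}
```

Clearing the whole combined mask instead would also suppress the kswapd wakeup, which is the mistake the rename is meant to make less likely.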

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mel Gorman
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Acked-by: Johannes Weiner
    Cc: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Vitaly Wool
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

28 Jun, 2015

1 commit

  • Whenever blk_fill_sghdr_rq fails, its errno code is ignored and changed to
    EFAULT. This can cause very confusing errors:

    $ sg_persist -k /dev/sda
    persistent reservation in: pass through os error: Bad address

    The fix is trivial, just propagate the return value from
    blk_fill_sghdr_rq.
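
The difference between masking and propagating the error can be sketched as below. The functions are hypothetical user-space stand-ins: `fill_request()` plays the role of blk_fill_sghdr_rq, and its failure conditions are invented for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the request setup step. */
static int fill_request(int cmd_len, int permitted)
{
	if (cmd_len <= 0 || cmd_len > 32)
		return -EINVAL;
	if (!permitted)
		return -EPERM;
	return 0;
}

/* Old behaviour: every failure is reported as -EFAULT. */
static int submit_masking(int cmd_len, int permitted)
{
	if (fill_request(cmd_len, permitted))
		return -EFAULT;
	return 0;
}

/* Fixed behaviour: hand the callee's error code to the caller. */
static int submit_propagating(int cmd_len, int permitted)
{
	int ret = fill_request(cmd_len, permitted);

	if (ret < 0)
		return ret;
	return 0;
}
```

With propagation, a permission problem surfaces as "Operation not permitted" rather than the misleading "Bad address".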

    Signed-off-by: Paolo Bonzini
    Acked-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Paolo Bonzini
     

12 Apr, 2015

1 commit


06 Feb, 2015

1 commit

  • Make use of a new interface provided by iov_iter, backed by
    scatter-gather list of iovec, instead of the old interface based on
    sg_iovec. Also use iov_iter_advance() instead of manual iteration.

    This commit should contain only literal replacements, without
    functional changes.

    Cc: Christoph Hellwig
    Cc: Jens Axboe
    Cc: Doug Gilbert
    Cc: "James E.J. Bottomley"
    Signed-off-by: Kent Overstreet
    [dpark: add more description in commit message]
    Signed-off-by: Dongsu Park
    [hch: fixed to do a deep clone of the iov_iter, and to properly use
    the iov_iter direction]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Kent Overstreet
     

08 Dec, 2014

1 commit


25 Nov, 2014

1 commit


11 Nov, 2014

1 commit


23 Oct, 2014

1 commit

  • When sg_scsi_ioctl() fails to prepare the request for submission in
    blk_rq_map_kern(), we jump to a label where we just end up copying a
    (luckily zeroed-out) kernel buffer to userspace instead of reporting
    the error. Fix the problem by jumping to the right label.

    CC: Jens Axboe
    CC: linux-scsi@vger.kernel.org
    CC: stable@vger.kernel.org
    Coverity-id: 1226871
    Signed-off-by: Jan Kara

    Fixed up the, now unused, out label.

    Signed-off-by: Jens Axboe

    Jan Kara
     

11 Sep, 2014

1 commit


29 Aug, 2014

1 commit

  • The blk_get_request function may fail in low-memory conditions or during
    device removal (even if __GFP_WAIT is set). To distinguish between these
    errors, modify the blk_get_request call stack to return the appropriate
    ERR_PTR. Verify that all callers check the return status and consider
    IS_ERR instead of a simple NULL pointer check.

    For consistency, make a similar change to the blk_mq_alloc_request leg
    of blk_get_request. It may fail if the queue is dead, or the caller was
    unwilling to wait.
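
The error-pointer convention can be sketched in user space. The helpers below are lowercase re-implementations written for this example, not the kernel macros themselves, and `get_request()` with its two flags is a hypothetical allocator. The error codes chosen for the two failure cases are illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* The top MAX_ERRNO addresses are never valid objects, so a pointer in
 * that range is treated as an encoded error. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_request {
	int tag;
};

/* Hypothetical allocator that reports why it failed. */
static struct fake_request *get_request(int queue_dead, int have_memory)
{
	static struct fake_request rq;

	if (queue_dead)
		return err_ptr(-ENODEV);
	if (!have_memory)
		return err_ptr(-ENOMEM);
	return &rq;
}
```

A caller that only tested for NULL would treat an encoded error as a valid request, which is why every call site has to switch to the is_err-style check when the return convention changes.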

    Signed-off-by: Joe Lawrence
    Acked-by: Jiri Kosina [for pktdvd]
    Acked-by: Boaz Harrosh [for osd]
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Joe Lawrence
     

27 Aug, 2014

1 commit

  • The blk-core dead queue checks introduce an error scenario to
    blk_get_request that returns NULL if the request queue has been
    shut down. This affects the behavior for __GFP_WAIT callers, who should
    verify the return value before dereferencing.

    Signed-off-by: Joe Lawrence
    Acked-by: Jiri Kosina [for pktdvd]
    Reviewed-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Joe Lawrence
     

26 Aug, 2014

1 commit

  • Before commit 2cada584b200 ("block: cleanup error handling in sg_io"),
    we had ret = 0 before entering the last big if block of sg_io.

    Since 2cada584b200, ret = -EFAULT, which breaks hdparm:

    /dev/sda:
    setting Advanced Power Management level to 0xc8 (200)
    HDIO_DRIVE_CMD failed: Bad address
    APM_level = 128

    Signed-off-by: Sabrina Dubroca
    Fixes: 2cada584b200 ("block: cleanup error handling in sg_io")
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Sabrina Dubroca