20 Jan, 2021

6 commits

  • commit ada831772188192243f9ea437c46e37e97a5975d upstream.

    We shouldn't call smp_processor_id() in a preemptible
    context; since the value is advisory at best here, call
    __smp_processor_id() instead.

    Fixes: db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context")
    Reported-by: Or Gerlitz
    Reported-by: Yi Zhang
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Sagi Grimberg
     
  • commit ca1ff67d0fb14f39cf0cc5102b1fbcc3b14f6fb9 upstream.

    When bios merge, we can get a request that spans multiple
    bios, and the overall request payload size is the sum of
    all of them. When we calculate how much we need to send
    from the existing bio (and bvec), we did not take into
    account the iov_iter byte count cap.

    Since the introduction of multipage bvec support, bvecs can be split
    in the middle, which means that when we account for the last bvec
    send we should also apply the iov_iter byte count cap, as it might
    be lower than the last bvec size.

    Reported-by: Hao Wang
    Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
    Tested-by: Hao Wang
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Sagi Grimberg
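The capping arithmetic can be modeled with a small sketch (Python used for illustration; `nvme_tcp_send_len`, `bvec_len`, `bvec_offset` and `iter_count` are illustrative names, not the driver's actual identifiers):

```python
def nvme_tcp_send_len(bvec_len, bvec_offset, iter_count):
    """Bytes to send from the current bvec.

    The naive answer is whatever is left of the bvec, but with merged
    bios the iov_iter byte count may cap the request below the last
    bvec's size, so the cap must be applied as well.
    """
    remaining = bvec_len - bvec_offset
    # Take the iov_iter byte count cap into account: it can be lower
    # than what is left in the (possibly split, multipage) bvec.
    return min(remaining, iter_count)

# Without the cap, a merged request could over-send from the last bvec.
assert nvme_tcp_send_len(bvec_len=8192, bvec_offset=0, iter_count=4096) == 4096
assert nvme_tcp_send_len(bvec_len=8192, bvec_offset=4096, iter_count=8192) == 4096
```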
     
  • commit 5ab25a32cd90ce561ac28b9302766e565d61304c upstream.

    Discovery controllers usually don't support the smart log page command,
    so when we connect to a discovery controller we see this warning:
    nvme nvme0: Failed to read smart log (error 24577)
    nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.123.1:8009
    nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

    Introduce a new helper to determine whether the controller is a
    discovery controller, and use it to skip nvme_init_hwmon (also use it
    in other places where we check if the controller is a discovery
    controller).

    Fixes: 400b6a7b13a3 ("nvme: Add hardware monitoring support")
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Sagi Grimberg
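The helper amounts to a comparison against the well-known discovery NQN (the one in the log above); a minimal Python model, where `nvme_init_ctrl_finish` and `configure_apst` are illustrative stand-ins for the surrounding init path:

```python
NVME_DISC_SUBSYS_NAME = "nqn.2014-08.org.nvmexpress.discovery"

def nvme_discovery_ctrl(subsysnqn: str) -> bool:
    """Is this controller a discovery controller?"""
    return subsysnqn == NVME_DISC_SUBSYS_NAME

def nvme_init_ctrl_finish(subsysnqn: str) -> list:
    steps = ["configure_apst"]
    # Discovery controllers usually don't implement the smart log page,
    # so don't even try to set up hardware monitoring for them.
    if not nvme_discovery_ctrl(subsysnqn):
        steps.append("init_hwmon")
    return steps

assert nvme_init_ctrl_finish(NVME_DISC_SUBSYS_NAME) == ["configure_apst"]
assert nvme_init_ctrl_finish("nqn.2019-01.example:subsys1") == [
    "configure_apst", "init_hwmon"]
```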
     
  • [ Upstream commit 19fce0470f05031e6af36e49ce222d0f0050d432 ]

    Recent patches changed calling sequences. nvme_fc_abort_outstanding_ios
    used to be called from a timeout or work context. Now it is being called
    in an io completion context, which can be an interrupt handler.
    Unfortunately, the abort outstanding ios routine attempts to stop nvme
    queues and calls nested routines that may try to sleep, which conflicts
    with running in an interrupt handler.

    Correct this by replacing the direct call with scheduling of a work
    element; the abort outstanding ios routine is then invoked from the
    work element.

    Fixes: 95ced8a2c72d ("nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery")
    Signed-off-by: James Smart
    Reported-by: Daniel Wagner
    Tested-by: Daniel Wagner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    James Smart
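The shape of the fix, deferring the sleeping abort routine from a completion (interrupt-like) context to a work element, can be sketched in Python (all names illustrative):

```python
import queue
import threading

work_queue = queue.Queue()
aborted = []

def abort_outstanding_ios():
    # Stops nvme queues and may sleep, so it must never run from an
    # interrupt-style completion context.
    aborted.append("done")

def io_completion_handler():
    # Interrupt-like context: don't call the abort routine directly;
    # schedule a work element instead.
    work_queue.put(abort_outstanding_ios)

def worker():
    # Work-queue context: allowed to sleep.
    while True:
        fn = work_queue.get()
        fn()
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()
io_completion_handler()
work_queue.join()     # wait for the deferred abort to run
assert aborted == ["done"]
```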
     
  • [ Upstream commit 62df80165d7f197c9c0652e7416164f294a96661 ]

    While handling the completion queue, keep a local copy of the command id
    from the DMA-accessible completion entry. This silences a time-of-check
    to time-of-use (TOCTOU) warning from KF/x[1], with respect to a
    Thunderclap[2] vulnerability analysis. The double-read impact appears
    benign.

    There may be a theoretical window for @command_id to be used as an
    adversary-controlled array-index-value for mounting a speculative
    execution attack, but that mitigation is saved for a potential follow-on.
    A man-in-the-middle attack on the data payload is out of scope for this
    analysis and is hopefully mitigated by filesystem integrity mechanisms.

    [1] https://github.com/intel/kernel-fuzzer-for-xen-project
    [2] http://thunderclap.io/thunderclap-paper-ndss2019.pdf
    Signed-off-by: Lalithambika Krishna Kumar
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Lalithambika Krishnakumar
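The pattern, reading the DMA-visible field once into a local and then both validating and using that local copy, can be modeled as follows (illustrative Python, not the driver's code):

```python
def nvme_handle_cqe(cqe, tags):
    """Process one completion entry that lives in DMA-visible memory.

    Read command_id exactly once into a local copy, so a device write
    between the bounds check and the array lookup cannot redirect us
    (the TOCTOU double-read the commit silences).
    """
    command_id = cqe["command_id"]    # single read, kept local
    if command_id >= len(tags):       # validate the local copy...
        return None
    return tags[command_id]           # ...and index with the same value

tags = ["req0", "req1"]
assert nvme_handle_cqe({"command_id": 1}, tags) == "req1"
assert nvme_handle_cqe({"command_id": 7}, tags) is None
```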
     
  • [ Upstream commit 7ee5c78ca3895d44e918c38332921983ed678be0 ]

    A system with more than one of these SSDs will only have one usable:
    the kernel rejects the other nvme devices due to duplicate cntlids.

    [ 6.274554] nvme nvme1: Duplicate cntlid 33 with nvme0, rejecting
    [ 6.274566] nvme nvme1: Removing after probe failure status: -22

    Adding the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk resolves the issue.

    Signed-off-by: Gopal Tiwari
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Gopal Tiwari
     

17 Jan, 2021

1 commit

  • commit 5c11f7d9f843bdd24cd29b95401938bc3f168070 upstream.

    We may send a request (with or without its data) from two paths:

    1. From our I/O context nvme_tcp_io_work which is triggered from:
    - queue_rq
    - r2t reception
    - socket data_ready and write_space callbacks
    2. Directly from queue_rq if the send_list is empty (because we want to
    save the context switch associated with scheduling our io_work).

    However, given that now we have the send_mutex, we may run into a race
    condition where none of these contexts will send the pending payload to
    the controller. Both io_work send path and queue_rq send path
    opportunistically attempt to acquire the send_mutex however queue_rq only
    attempts to send a single request, and if io_work context fails to
    acquire the send_mutex it will complete without rescheduling itself.

    The race can trigger with the following sequence:

    1. queue_rq sends a request (no in-capsule data) and blocks
    2. RX path receives r2t - prepares data PDU to send, adds h2cdata PDU
    to the send_list and schedules io_work
    3. io_work triggers and cannot acquire the send_mutex - because of (1),
    it ends without rescheduling itself
    4. queue_rq finishes the send, and completes

    ==> no context will send the h2cdata - timeout.

    Fix this by having queue_rq send as much as it can from the send_list,
    so that if anything is left over, it's because the socket buffer is
    full and the socket write_space callback will trigger, thus guaranteeing
    that a context will be scheduled to send the h2cdata PDU.

    Fixes: db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context")
    Reported-by: Potnuri Bharat Teja
    Reported-by: Samuel Jones
    Signed-off-by: Sagi Grimberg
    Tested-by: Potnuri Bharat Teja
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Sagi Grimberg
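The fixed queue_rq send path can be modeled like this (Python sketch; the trylock and list names are illustrative):

```python
from collections import deque
import threading

send_mutex = threading.Lock()
send_list = deque()

def nvme_tcp_queue_rq_send():
    """Model of the fix: queue_rq drains the whole send_list, not just
    one request, while it opportunistically holds the send_mutex."""
    sent = []
    if send_mutex.acquire(blocking=False):    # opportunistic trylock
        try:
            while send_list:                  # send as much as we can
                sent.append(send_list.popleft())
        finally:
            send_mutex.release()
    # Anything still queued here means the socket buffer was full; the
    # write_space callback will schedule io_work, so some context is
    # guaranteed to send the pending h2cdata PDU.
    return sent

send_list.extend(["request-pdu", "h2cdata-pdu"])
assert nvme_tcp_queue_rq_send() == ["request-pdu", "h2cdata-pdu"]
assert nvme_tcp_queue_rq_send() == []
```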
     

14 Nov, 2020

3 commits

    xa_destroy() frees only internal data. The caller is responsible for
    freeing the external objects referenced by an xarray.

    Fixes: 1cf7a12e09aa4 ("nvme: use an xarray to lookup the Commands Supported and Effects log")
    Signed-off-by: Keith Busch
    Signed-off-by: Christoph Hellwig

    Keith Busch
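The required pattern, freeing the referenced objects before destroying the container, looks roughly like this (Python stands in for the kernel xarray API; the log names are illustrative):

```python
class XArray(dict):
    """Toy stand-in for the kernel xarray (index -> object)."""
    def destroy(self):
        # Like xa_destroy(): releases only the container's internal
        # state; it knows nothing about the objects the entries point to.
        self.clear()

freed = []
effects_logs = XArray({0x101: "csi0-log", 0x102: "csi1-log"})

# Caller's responsibility: free every external object *before* destroy.
for index in list(effects_logs):
    freed.append(effects_logs.pop(index))
effects_logs.destroy()

assert freed == ["csi0-log", "csi1-log"]
assert len(effects_logs) == 0
```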
     
    Remove the struct used for tracking known command effects logs in a
    list. This is now saved in an xarray that doesn't use these elements.
    Store the log directly rather than the wrapper struct.

    Signed-off-by: Keith Busch
    Signed-off-by: Christoph Hellwig

    Keith Busch
     
    If the Doorbell Buffer Config command fails, 'dev->dbbuf_dbs != NULL'
    still holds (meaning OACS indicates that NVME_CTRL_OACS_DBBUF_SUPP is
    set), so nvme_dbbuf_update_and_check_event() will check the event even
    though the doorbell buffer was never successfully configured.

    This patch fixes the mismatch among the dbbuf entries for the sq/cqs
    in case the dbbuf command fails.

    Signed-off-by: Minwoo Im
    Signed-off-by: Christoph Hellwig

    Minwoo Im
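A minimal model of the fix: don't leave dbbuf state behind when the Doorbell Buffer Config command fails, so later event checks see the buffer as unconfigured (Python sketch; the real fix frees the DMA buffers, and all names here are illustrative):

```python
class NvmeDev:
    def __init__(self):
        self.dbbuf_dbs = None   # shadow doorbell buffer

def nvme_dbbuf_set(dev, cmd_ok):
    """Issue the Doorbell Buffer Config command (modeled by cmd_ok)."""
    dev.dbbuf_dbs = "allocated"
    if not cmd_ok:
        # On failure, tear the buffers down again so that later checks
        # do not treat an unconfigured doorbell buffer as active.
        dev.dbbuf_dbs = None

def dbbuf_update_and_check_event(dev):
    # Only meaningful if the doorbell buffer was successfully configured.
    return dev.dbbuf_dbs is not None

dev = NvmeDev()
nvme_dbbuf_set(dev, cmd_ok=False)
assert dbbuf_update_and_check_event(dev) is False

dev = NvmeDev()
nvme_dbbuf_set(dev, cmd_ok=True)
assert dbbuf_update_and_check_event(dev) is True
```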
     

05 Nov, 2020

1 commit

  • Pull NVMe fixes from Christoph:

    "nvme fixes for 5.10:

    - revert a nvme_queue size optimization (Keith Busch)
    - fabrics timeout races fixes (Chao Leng and Sagi Grimberg)"

    * tag 'nvme-5.10-2020-11-05' of git://git.infradead.org/nvme:
    nvme-tcp: avoid repeated request completion
    nvme-rdma: avoid repeated request completion
    nvme-tcp: avoid race between time out and tear down
    nvme-rdma: avoid race between time out and tear down
    nvme: introduce nvme_sync_io_queues
    Revert "nvme-pci: remove last_sq_tail"

    Jens Axboe
     

03 Nov, 2020

6 commits

    The request may be executed asynchronously, and rq->state may be
    changed to IDLE. To avoid repeated request completion,
    nvme_tcp_complete_timed_out checked rq->state only for MQ_RQ_COMPLETE.
    That is not safe, so a check for IDLE must be added as well.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Chao Leng
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     
    The request may be executed asynchronously, and rq->state may be
    changed to IDLE. To avoid repeated request completion,
    nvme_rdma_complete_timed_out checked rq->state only for MQ_RQ_COMPLETE.
    That is not safe, so a check for IDLE must be added as well.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Chao Leng
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
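The added state check, applied in both the TCP and RDMA variants, can be modeled as follows (the constant names mirror the blk-mq request states; everything else is illustrative):

```python
# blk-mq request states (values illustrative, names mirror the kernel's)
MQ_RQ_IDLE, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE = range(3)

def may_complete_timed_out(rq_state):
    """Should the timeout handler complete the request itself?

    Checking only MQ_RQ_COMPLETE is not enough: an asynchronously
    executed request may already have moved back to IDLE, and completing
    it again would be a double completion.
    """
    return rq_state not in (MQ_RQ_COMPLETE, MQ_RQ_IDLE)

assert may_complete_timed_out(MQ_RQ_IN_FLIGHT)
assert not may_complete_timed_out(MQ_RQ_COMPLETE)
assert not may_complete_timed_out(MQ_RQ_IDLE)   # the added check
```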
     
    Currently teardown_lock is used to serialize timeout and teardown.
    This can misbehave: teardown first cancels all requests, but a timeout
    may then complete a request again, even though the request may already
    have been freed or restarted.

    To avoid the race between timeout and teardown, in the teardown
    process we first quiesce the queue, and then delete the timer and
    cancel the timeout work for the queue. With that, teardown_lock can
    be removed.

    Signed-off-by: Chao Leng
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chao Leng
     
    Currently teardown_lock is used to serialize timeout and teardown.
    This can misbehave: teardown first cancels all requests, but a timeout
    may then complete a request again, even though the request may already
    have been freed or restarted.

    To avoid the race between timeout and teardown, in the teardown
    process we first quiesce the queue, and then delete the timer and
    cancel the timeout work for the queue. With that, teardown_lock can
    be removed.

    Signed-off-by: Chao Leng
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chao Leng
     
    Introduce nvme_sync_io_queues for scenarios that only need to sync
    the I/O queues rather than all queues.

    Signed-off-by: Chao Leng
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chao Leng
     
    Multiple CPUs may be mapped to the same hctx, allowing multiple
    submission contexts to attempt commit_rqs(). We need to verify we're
    not writing the same doorbell value multiple times, since that's a
    spec violation.

    Revert commit 54b2fcee1db041a83b52b51752dade6090cf952f.

    Link: https://bugzilla.redhat.com/show_bug.cgi?id=1878596
    Reported-by: "B.L. Jones"
    Signed-off-by: Keith Busch

    Keith Busch
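The reverted last_sq_tail bookkeeping exists precisely to suppress duplicate doorbell writes; a sketch of the idea (Python, illustrative class and field names):

```python
class NvmeSQ:
    """Submission queue with doorbell write deduplication."""
    def __init__(self):
        self.tail = 0          # next free slot, advanced on submission
        self.last_sq_tail = 0  # last value actually written to the doorbell
        self.doorbell_writes = 0

    def submit(self):
        self.tail += 1

    def write_sq_db(self):
        # Multiple CPUs mapped to one hctx may call commit_rqs() back to
        # back; only ring the doorbell when the tail actually moved, since
        # writing the same value twice violates the NVMe spec.
        if self.tail != self.last_sq_tail:
            self.doorbell_writes += 1
            self.last_sq_tail = self.tail

sq = NvmeSQ()
sq.submit(); sq.submit()
sq.write_sq_db()   # rings once for both submissions
sq.write_sq_db()   # concurrent commit_rqs with nothing new: no write
assert sq.doorbell_writes == 1 and sq.last_sq_tail == 2
```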
     

31 Oct, 2020

1 commit

  • Pull block fixes from Jens Axboe:

    - null_blk zone fixes (Damien, Kanchan)

    - NVMe pull request from Christoph:
    - improve zone revalidation (Keith Busch)
    - gracefully handle zero length messages in nvme-rdma (zhenwei pi)
    - nvme-fc error handling fixes (James Smart)
    - nvmet tracing NULL pointer dereference fix (Chaitanya Kulkarni)

    - xsysace platform fixes (Andy)

    - scatterlist type cleanup (David)

    - blk-cgroup memory fixes (Gabriel)

    - nbd block size update fix (Ming)

    - Flush completion state fix (Ming)

    - bio_add_hw_page() iteration fix (Naohiro)

    * tag 'block-5.10-2020-10-30' of git://git.kernel.dk/linux-block:
    blk-mq: mark flush request as IDLE in flush_end_io()
    lib/scatterlist: use consistent sg_copy_buffer() return type
    xsysace: use platform_get_resource() and platform_get_irq_optional()
    null_blk: Fix locking in zoned mode
    null_blk: Fix zone reset all tracing
    nbd: don't update block size after device is started
    block: advance iov_iter on bio_add_hw_page failure
    null_blk: synchronization fix for zoned device
    nvmet: fix a NULL pointer dereference when tracing the flush command
    nvme-fc: remove nvme_fc_terminate_io()
    nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery
    nvme-fc: remove err_work work item
    nvme-fc: track error_recovery while connecting
    nvme-rdma: handle unexpected nvme completion data length
    nvme: ignore zone validate errors on subsequent scans
    blk-cgroup: Pre-allocate tree node on blkg_conf_prep
    blk-cgroup: Fix memleak on error path

    Linus Torvalds
     

28 Oct, 2020

1 commit

  • There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
    handler triggers a completion and another thread does rdma_connect() or
    the handler directly calls rdma_connect().

    In all cases rdma_connect() needs to hold the handler_mutex, but when
    handlers are invoked this is already held by the core code. This causes
    ULPs using the 2nd method to deadlock.

    Provide a rdma_connect_locked() and have all ULPs call it from their
    handlers.

    Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
    Reported-and-tested-by: Guoqing Jiang
    Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
    Acked-by: Santosh Shilimkar
    Acked-by: Jack Wang
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Max Gurtovoy
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
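With a non-reentrant lock, a handler calling the plain variant would deadlock; the _locked variant assumes the caller already holds the mutex. A Python model (the two function names follow the commit, the rest is illustrative):

```python
import threading

handler_mutex = threading.Lock()   # non-reentrant, like a kernel mutex

def rdma_connect():
    # Fine from a separate thread woken by a completion, but would
    # deadlock if called from inside a cm event handler.
    with handler_mutex:
        return "connected"

def rdma_connect_locked():
    # Variant for callers that already hold handler_mutex.
    assert handler_mutex.locked()
    return "connected"

def route_resolved_handler():
    # The core invokes handlers with handler_mutex already held, so the
    # handler must use the _locked variant.
    with handler_mutex:                 # models the core code's acquire
        return rdma_connect_locked()

assert route_resolved_handler() == "connected"
assert rdma_connect() == "connected"    # the completion-thread path
```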
     

27 Oct, 2020

6 commits

    __nvme_fc_terminate_io() is now called from only one place, reset_work.
    Consolidate and move the functionality of terminate_io into reset_work.

    In reset_work, rather than calling the create_association directly,
    schedule the connect work element to do its thing. After scheduling,
    flush the connect work element to continue with semantic of not
    returning until connect has been attempted at least once.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • nvme_fc_error_recovery() special cases handling when in CONNECTING state
    and calls __nvme_fc_terminate_io(). __nvme_fc_terminate_io() itself
    special cases CONNECTING state and calls the routine to abort outstanding
    ios.

    Simplify the sequence by putting the call to abort outstanding I/Os
    directly in nvme_fc_error_recovery.

    Move the location of __nvme_fc_abort_outstanding_ios(), and
    nvme_fc_terminate_exchange() which is called by it, to avoid adding
    function prototypes for nvme_fc_error_recovery().

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • err_work was created to handle errors (mainly I/O timeouts) while in
    CONNECTING state. The flag for err_work_active is also unneeded.

    Remove err_work_active and err_work. The actions to abort I/Os are moved
    inline to nvme_error_recovery().

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
    Whenever there are errors during CONNECTING, the driver recovers by
    aborting all outstanding ios and counting on the io completions to fail
    them, and thus the connection/association they are on. However, the
    connection failure depends on a failure state from the core routines.
    Not all commands issued by the core routines are guaranteed to fail the
    core routine itself: a command may complete with a failure status that
    is then ignored.

    As such, whenever the transport enters error_recovery while CONNECTING,
    it will set a new flag indicating an association failed. The
    create_association routine which creates and initializes the controller,
    will monitor the state of the flag as well as the core routine error
    status and ensure the association fails if there was an error.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • Receiving a zero length message leads to the following warnings because
    the CQE is processed twice:

    refcount_t: underflow; use-after-free.
    WARNING: CPU: 0 PID: 0 at lib/refcount.c:28

    RIP: 0010:refcount_warn_saturate+0xd9/0xe0
    Call Trace:

    nvme_rdma_recv_done+0xf3/0x280 [nvme_rdma]
    __ib_process_cq+0x76/0x150 [ib_core]
    ...

    Sanity check the received data length to avoid this.

    Thanks to Chao Leng & Sagi for suggestions.

    Signed-off-by: zhenwei pi
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    zhenwei pi
     
  • Revalidating nvme zoned namespaces requires IO commands, and there are
    controller states that prevent IO. For example, a sanitize in progress
    is required to fail all IO, but we don't want to remove a namespace
    we've previously added just because the controller is in such a state.
    Suppress the error in this case.

    Reported-by: Michael Nguyen
    Signed-off-by: Keith Busch
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Keith Busch
     

23 Oct, 2020

4 commits

  • We've had several complaints about a 10s reconnect delay (the default)
    when there was an error while there is connectivity to a subsystem.
    The max_reconnects and reconnect_delay are set in common code prior to
    calling the transport to create the controller.

    This change checks if the default reconnect delay is being used, and if
    so, it adjusts it to a shorter period (2s) for the nvme-fc transport.
    It does so by calculating the controller loss tmo window, changing the
    value of the reconnect delay, and then recalculating the maximum number
    of reconnect attempts allowed.

    Signed-off-by: James Smart
    Reviewed-by: Himanshu Madhani
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig

    James Smart
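The recalculation keeps the controller-loss window constant while shortening the delay; a sketch under assumed defaults (the 10s default and the constant names here are assumptions for illustration):

```python
NVMF_DEF_RECONNECT_DELAY = 10      # assumed common default, in seconds
NVME_FC_DEFAULT_RECONNECT_TMO = 2  # the shorter FC delay (name illustrative)

def nvme_fc_adjust_reconnect(reconnect_delay, max_reconnects):
    """Shrink the delay while preserving the controller loss tmo window."""
    if reconnect_delay == NVMF_DEF_RECONNECT_DELAY:
        ctrl_loss_tmo = reconnect_delay * max_reconnects   # current window
        reconnect_delay = NVME_FC_DEFAULT_RECONNECT_TMO
        max_reconnects = ctrl_loss_tmo // reconnect_delay  # keep the window
    return reconnect_delay, max_reconnects

# A 600s loss window: 60 tries at 10s becomes 300 tries at 2s.
assert nvme_fc_adjust_reconnect(10, 60) == (2, 300)
# A user-specified (non-default) delay is left untouched.
assert nvme_fc_adjust_reconnect(5, 120) == (5, 120)
```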
     
    On reconnect, the code currently does not freeze the controller before
    possibly updating the number of hw queues for the controller.

    Add the freeze before updating the number of hw queues. Note: the queues
    are already started and remain started through the reconnect.

    Signed-off-by: James Smart
    Reviewed-by: Himanshu Madhani
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • The loop that backs out of hw io queue creation continues through index
    0, which corresponds to the admin queue as well.

    Fix the loop so it only proceeds through indexes 1..n which correspond to
    I/O queues.

    Signed-off-by: James Smart
    Reviewed-by: Himanshu Madhani
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig

    James Smart
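The corrected backout loop walks only the I/O queue indexes; sketched in Python (function name illustrative):

```python
def delete_io_queues(nr_queues):
    """Back out of hw I/O queue creation.

    Index 0 is the admin queue, so the backout loop must only walk the
    I/O queue indexes 1..n-1, never down through 0.
    """
    return [f"free_queue({i})" for i in range(nr_queues - 1, 0, -1)]

# With 4 queues total (admin + 3 I/O), only queues 3, 2, 1 are freed.
assert delete_io_queues(4) == ["free_queue(3)", "free_queue(2)", "free_queue(1)"]
```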
     
  • Currently, an I/O timeout unconditionally invokes
    nvme_fc_error_recovery() which checks for LIVE or CONNECTING state. If
    live, the routine resets the controller which initiates a reconnect -
    which is valid. If CONNECTING, err_work is scheduled. Err_work then
    calls the terminate_io routine, which also checks for CONNECTING and
    noops any further action on outstanding I/O. The result is that nothing
    happens to the timed-out io. As such, if the command was dropped on
    the wire, it will never timeout / complete, and the connect process
    will hang.

    Change the behavior of the io timeout routine to unconditionally abort
    the I/O. I/O completion handling will note that an io failed due to an
    abort and will terminate the connection / association as needed. If the
    abort was unable to happen, continue with a call to
    nvme_fc_error_recovery(). To ensure something different happens in
    nvme_fc_error_recovery(), rework it so that it will abort all I/Os on
    the association to force a failure.

    As I/O aborts now may occur outside of delete_association, counting for
    completion must be wary and only count those aborted during
    delete_association when TERMIO is set on the controller.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     

22 Oct, 2020

4 commits

  • Like commit 5611ec2b9814 ("nvme-pci: prevent SK hynix PC400 from using
    Write Zeroes command"), Sandisk Skyhawk has the same issue:
    [ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0

    So also disable Write Zeroes command on Sandisk Skyhawk.

    BugLink: https://bugs.launchpad.net/bugs/1899503
    Signed-off-by: Kai-Heng Feng
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Kai-Heng Feng
     
  • The request's rq_disk isn't set for passthrough IO commands, so tracing
    uses qid 0 for these which incorrectly decodes as an admin command. Use
    the request_queue's queuedata instead since that value is always set for
    the IO queues, and never set for the admin queue.

    Signed-off-by: Keith Busch
    Signed-off-by: Christoph Hellwig

    Keith Busch
     
    A crash happened during error injection testing: when a CQE carries an
    incorrect command id due to the injected error, the host may look up a
    request which has already been freed. Dereferencing req->mr->rkey then
    crashes in nvme_rdma_process_nvme_rsp because the mr is already freed.

    Add a check for the mr to fix it.

    Signed-off-by: Chao Leng
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chao Leng
     
    A crash can happen when a connect is rejected. The host establishes
    the connection after receiving the ConnectReply, and then continues to
    send the fabrics Connect command. If the controller does not receive
    the ReadyToUse capsule, the host may receive a ConnectReject reply.

    nvme_rdma_destroy_queue_ib is called after the host receives the
    RDMA_CM_EVENT_REJECTED event. Then, when the fabrics Connect command
    times out, nvme_rdma_timeout calls nvme_rdma_complete_rq to fail the
    request, and a crash happens due to a use after free in
    nvme_rdma_complete_rq.

    nvme_rdma_destroy_queue_ib is redundant when handling the
    RDMA_CM_EVENT_REJECTED event as nvme_rdma_destroy_queue_ib is already
    called in connection failure handler.

    Signed-off-by: Chao Leng
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chao Leng
     

14 Oct, 2020

3 commits

  • Translate zoned resource errors to the appropriate blk_status_t.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Damien Le Moal
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Keith Busch
    Signed-off-by: Jens Axboe

    Keith Busch
     
  • Pull block driver updates from Jens Axboe:
    "Here are the driver updates for 5.10.

    A few SCSI updates in here too, in coordination with Martin as they
    depend on core block changes for the shared tag bitmap.

    This contains:

    - NVMe pull requests via Christoph:
    - fix keep alive timer modification (Amit Engel)
    - order the PCI ID list more sensibly (Andy Shevchenko)
    - cleanup the open by controller helper (Chaitanya Kulkarni)
    - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
    - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
    - fix nvme_ns_report_zones (Christoph Hellwig)
    - add a sanity check to nvmet-fc (James Smart)
    - fix interrupt allocation when too many polled queues are
    specified (Jeffle Xu)
    - small nvmet-tcp optimization (Mark Wunderlich)
    - fix a controller refcount leak on init failure (Chaitanya
    Kulkarni)
    - misc cleanups (Chaitanya Kulkarni)
    - major refactoring of the scanning code (Christoph Hellwig)

    - MD updates via Song:
    - Bug fixes in bitmap code, from Zhao Heming
    - Fix a work queue check, from Guoqing Jiang
    - Fix raid5 oops with reshape, from Song Liu
    - Clean up unused code, from Jason Yan
    - Discard improvements, from Xiao Ni
    - raid5/6 page offset support, from Yufen Yu

    - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap,
    Hannes)

    - null_blk open/active zone limit support (Niklas)

    - Set of bcache updates (Coly, Dongsheng, Qinglang)"

    * tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits)
    md/raid5: fix oops during stripe resizing
    md/bitmap: fix memory leak of temporary bitmap
    md: fix the checking of wrong work queue
    md/bitmap: md_bitmap_get_counter returns wrong blocks
    md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks
    md/raid0: remove unused function is_io_in_chunk_boundary()
    nvme-core: remove extra condition for vwc
    nvme-core: remove extra variable
    nvme: remove nvme_identify_ns_list
    nvme: refactor nvme_validate_ns
    nvme: move nvme_validate_ns
    nvme: query namespace identifiers before adding the namespace
    nvme: revalidate zone bitmaps in nvme_update_ns_info
    nvme: remove nvme_update_formats
    nvme: update the known admin effects
    nvme: set the queue limits in nvme_update_ns_info
    nvme: remove the 0 lba_shift check in nvme_update_ns_info
    nvme: clean up the check for too large logic block sizes
    nvme: freeze the queue over ->lba_shift updates
    nvme: factor out a nvme_configure_metadata helper
    ...

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:

    - Series of merge handling cleanups (Baolin, Christoph)

    - Series of blk-throttle fixes and cleanups (Baolin)

    - Series cleaning up BDI, separating the block device from the
    backing_dev_info (Christoph)

    - Removal of bdget() as a generic API (Christoph)

    - Removal of blkdev_get() as a generic API (Christoph)

    - Cleanup of is-partition checks (Christoph)

    - Series reworking disk revalidation (Christoph)

    - Series cleaning up bio flags (Christoph)

    - bio crypt fixes (Eric)

    - IO stats inflight tweak (Gabriel)

    - blk-mq tags fixes (Hannes)

    - Buffer invalidation fixes (Jan)

    - Allow soft limits for zone append (Johannes)

    - Shared tag set improvements (John, Kashyap)

    - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

    - DM no-wait support (Mike, Konstantin)

    - Request allocation improvements (Ming)

    - Allow md/dm/bcache to use IO stat helpers (Song)

    - Series improving blk-iocost (Tejun)

    - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
    Xianting, Yang, Yufen, yangerkun)

    * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
    block: fix uapi blkzoned.h comments
    blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
    blk-mq: get rid of the dead flush handle code path
    block: get rid of unnecessary local variable
    block: fix comment and add lockdep assert
    blk-mq: use helper function to test hw stopped
    block: use helper function to test queue register
    block: remove redundant mq check
    block: invoke blk_mq_exit_sched no matter whether have .exit_sched
    percpu_ref: don't refer to ref->data if it isn't allocated
    block: ratelimit handle_bad_sector() message
    blk-throttle: Re-use the throtl_set_slice_end()
    blk-throttle: Open code __throtl_de/enqueue_tg()
    blk-throttle: Move service tree validation out of the throtl_rb_first()
    blk-throttle: Move the list operation after list validation
    blk-throttle: Fix IO hang for a corner case
    blk-throttle: Avoid tracking latency if low limit is invalid
    blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
    blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
    block: Remove redundant 'return' statement
    ...

    Linus Torvalds
     

09 Oct, 2020

1 commit

  • Pull block fixes from Jens Axboe:
    "A few fixes that should go into this release:

    - NVMe controller error path reference fix (Chaitanya)

    - Fix regression with IBM partitions on non-dasd devices (Christoph)

    - Fix a missing clear in the compat CDROM packet structure (Peilin)"

    * tag 'block5.9-2020-10-08' of git://git.kernel.dk/linux-block:
    partitions/ibm: fix non-DASD devices
    nvme-core: put ctrl ref when module ref get fail
    block/scsi-ioctl: Fix kernel-infoleak in scsi_put_cdrom_generic_arg()

    Linus Torvalds
     

07 Oct, 2020

2 commits

    In nvme_set_queue_limits() we initialize vwc to false and later add
    a condition to set vwc to true. The value of vwc can instead be
    declared and initialized in one statement, which makes all the
    blk_queue_XXX() calls uniform.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Keith Busch
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
    In nvme_validate_ns() the extra variable ctrl is used only twice.
    Using ns->ctrl directly still maintains the readability and original
    length of the lines in the code. Get rid of the extra variable ctrl
    and use ns->ctrl directly.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni