14 Mar, 2019

3 commits

  • [ Upstream commit 4726bcf30fad37cc555cd9dcd6c73f2b2668c879 ]

    The reset work holds a mutex to prevent races with removal modifying the
    same resources, but was unlocking only on success. Unlock on failure
    too.
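    The pattern of the fix can be sketched in plain C, with a pthread
    mutex standing in for the kernel mutex (names here are illustrative,
    not the driver's):

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t shutdown_lock = PTHREAD_MUTEX_INITIALIZER;

/* Sketch: release the lock on every exit path, not just on success. */
static int reset_work(bool fail_early)
{
    pthread_mutex_lock(&shutdown_lock);
    if (fail_early) {
        /* the unlock that was missing on the failure path */
        pthread_mutex_unlock(&shutdown_lock);
        return -1;
    }
    /* ... bring the controller back up ... */
    pthread_mutex_unlock(&shutdown_lock);
    return 0;
}
```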

    Fixes: 5c959d73dba64 ("nvme-pci: fix rapid add remove sequence")
    Signed-off-by: Keith Busch
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Keith Busch
     
  • [ Upstream commit 5c959d73dba6495ec01d04c206ee679d61ccb2b0 ]

    A surprise removal may fail to tear down request queues if it is racing
    with the initial asynchronous probe. If that happens, the remove path
    won't see the queue resources to tear down, and the controller reset
    path may create a new request queue on a removed device, but will not
    be able to make forward progress, deadlocking the pci removal.

    Protect setting up non-blocking resources from a shutdown by holding the
    same mutex, and transition to the CONNECTING state after these resources
    are initialized so the probe path may see the dead controller state
    before dispatching new IO.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081
    Reported-by: Alex Gagniuc
    Signed-off-by: Keith Busch
    Tested-by: Alex Gagniuc
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Keith Busch
     
  • [ Upstream commit e7ad43c3eda6a1690c4c3c341f95dc1c6898da83 ]

    If a controller supports the NS Change Notification, the namespace
    scan_work is automatically triggered after attaching a new namespace.

    Occasionally the namespace scan_work may append the new namespace to the
    list before the admin command effects handling is completed. The effects
    handling unfreezes namespaces, but if it unfreezes the newly attached
    namespace, its request_queue freeze depth will be off and we'll hit the
    warning in blk_mq_unfreeze_queue().

    On the next namespace add, we will fail to freeze that queue due to the
    previous bad accounting and deadlock waiting for frozen.

    Fix that by preventing scan work from altering the namespace list while
    command effects handling needs to pair freeze with unfreeze.
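    The accounting hazard can be sketched with a bare freeze-depth
    counter (a heavy simplification of blk-mq's request_queue freezing;
    names are illustrative):

```c
/* Simplified freeze-depth accounting: every unfreeze must pair with a
 * prior freeze, or the depth underflows and later freeze/unfreeze
 * cycles are unbalanced. */
struct queue { int freeze_depth; };

static void freeze(struct queue *q)
{
    q->freeze_depth++;
}

static int unfreeze(struct queue *q)
{
    if (q->freeze_depth <= 0)
        return -1; /* would trip the WARN in blk_mq_unfreeze_queue() */
    q->freeze_depth--;
    return 0;
}
```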

    Reported-by: Wen Xiong
    Tested-by: Wen Xiong
    Signed-off-by: Keith Busch
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Keith Busch
     

06 Mar, 2019

2 commits

  • [ Upstream commit 78a61cd42a64f3587862b372a79e1d6aaf131fd7 ]

    Bit 6 in the ANACAP field is used to indicate that the ANA group ID
    doesn't change while the namespace is attached to the controller.
    There is an optimisation in the code to only allocate space
    for the ANA group header, as the namespace list won't change and
    hence would not need to be refreshed.
    However, this optimisation was never carried over to the actual
    workflow, which always assumes that the buffer is large enough
    to hold the ANA header _and_ the namespace list.
    So drop this optimisation and always allocate enough space.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Hannes Reinecke
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Hannes Reinecke
     
  • [ Upstream commit 4c174e6366746ae8d49f9cc409f728eebb7a9ac9 ]

    Currently, we have several problems with the timeout
    handler:
    1. If we time out on the controller establishment flow, we will hang
    because we don't execute the error recovery (and we shouldn't,
    because the create_ctrl flow needs to fail and clean up on its own).
    2. We might also hang if we get a disconnect on a queue while the
    controller is already deleting. This racy flow can cause the
    controller disable/shutdown admin command to hang.

    We cannot complete a timed out request from the timeout handler without
    mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
    So we serialize it in the timeout handler and teardown io and admin
    queues to guarantee that no one races with us from completing the
    request.

    Reported-by: Jaesoo Lee
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Sagi Grimberg
     

20 Feb, 2019

4 commits

  • [ Upstream commit 3da584f57133e51aeb84aaefae5e3d69531a1e4f ]

    We need to preserve the leading zeros in the vid and ssvid when generating
    a unique NQN. Truncating these may lead to naming collisions.
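    A minimal sketch of why the padding matters; the format string here
    only illustrates the vid/ssvid portion of the generated NQN, not the
    full string the driver builds:

```c
#include <stdio.h>

/* Sketch: zero-padded %04x keeps distinct (vid, ssvid) pairs distinct
 * in the generated NQN; unpadded %x lets leading zeros vanish and
 * different pairs collapse to the same string. */
static void format_ids(char *buf, size_t len, unsigned vid, unsigned ssvid)
{
    snprintf(buf, len, "nqn.2014.08.org.nvmexpress:%04x%04x", vid, ssvid);
}
```

With `%x`, the pairs (0x0a1, 0x1b2) and (0xa11, 0xb2) would both
produce `...:a11b2`; with `%04x` they stay distinct.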

    Signed-off-by: Keith Busch
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Keith Busch
     
  • [ Upstream commit c7055fd15ff46d92eb0dd1c16a4fe010d58224c8 ]

    When nvme_init_identify() fails, the ANA log buffer is deallocated
    but _not_ set to NULL. This can cause a double-free oops when this
    controller is deleted without ever being reconnected.
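    The shape of the fix, sketched in userspace C (free() standing in
    for the kernel's kfree(); the structure name is illustrative):

```c
#include <stdlib.h>

struct ctrl { void *ana_log_buf; };

/* Sketch: NULL the pointer after freeing, so a later teardown cannot
 * free the same buffer twice. */
static void free_ana_log(struct ctrl *c)
{
    free(c->ana_log_buf);
    c->ana_log_buf = NULL; /* the assignment the bug was missing */
}
```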

    Signed-off-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Hannes Reinecke
     
  • [ Upstream commit dcca1662727220d18fa351097ddff33f95f516c5 ]

    There is an out-of-bounds array access in nvme_cqe_pending().

    When irq_thread is enabled for the nvme interrupt, there is a race
    between updating and reading nvmeq->cq_head.

    nvmeq->cq_head is updated in nvme_update_cq_head(). If nvmeq->cq_head
    equals nvmeq->q_depth, then in the window before its value is reset to
    zero, nvme_cqe_pending() may use it as an array index, and that index
    will be out of bounds.
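    One way to close that window is to compute the wrapped value first
    and publish it with a single store, roughly like this sketch (not
    the driver's exact code):

```c
struct nvme_queue { unsigned q_depth; unsigned cq_head; };

/* Sketch: never store an intermediate cq_head == q_depth that a
 * concurrent nvme_cqe_pending() could use as an array index. */
static void update_cq_head(struct nvme_queue *q)
{
    unsigned next = q->cq_head + 1;

    if (next == q->q_depth)
        next = 0;
    q->cq_head = next; /* single store, always within [0, q_depth) */
}
```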

    Signed-off-by: Hongbo Yao
    [hch: slight coding style update]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Hongbo Yao
     
  • [ Upstream commit cc667f6d5de023ee131e96bb88e5cddca23272bd ]

    When using HMB the PCIe host driver allocates host_mem_desc_bufs using
    dma_alloc_attrs() but frees them using dma_free_coherent(). Use the
    correct dma_free_attrs() function to free the buffers.

    Signed-off-by: Liviu Dudau
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Liviu Dudau
     

31 Jan, 2019

2 commits

  • commit 5cbab6303b4791a3e6713dfe2c5fda6a867f9adc upstream.

    Under heavy load, if we don't have any pre-allocated rsps left, we
    dynamically allocate a rsp, but we are not actually allocating memory
    for nvme_completion (rsp->req.rsp). In such a case, accessing pointer
    fields (req->rsp->status) in nvmet_req_init() will result in a crash.

    To fix this, allocate the memory for nvme_completion by calling
    nvmet_rdma_alloc_rsp().

    Fixes: 8407879c ("nvmet-rdma: fix possible bogus dereference under heavy load")

    Cc:
    Reviewed-by: Max Gurtovoy
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Raju Rangoju
    Signed-off-by: Sagi Grimberg
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Raju Rangoju
     
  • commit ad1f824948e4ed886529219cf7cd717d078c630d upstream.

    Signed-off-by: Israel Rukshin
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Max Gurtovoy
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe
    Cc: Raju Rangoju
    Signed-off-by: Greg Kroah-Hartman

    Israel Rukshin
     

21 Dec, 2018

2 commits

  • [ Upstream commit d7dcdf9d4e15189ecfda24cc87339a3425448d5c ]

    nvmet_rdma_release_rsp() may free the response before it is used on
    the error path.

    Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load")
    Signed-off-by: Israel Rukshin
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Max Gurtovoy
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Israel Rukshin
     
  • [ Upstream commit 86880d646122240596d6719b642fee3213239994 ]

    Delete operations are seeing NULL pointer references in call_timer_fn.
    Tracking these back, the timer appears to be the keep alive timer.

    nvme_keep_alive_work(), which is tied to the timer that is cancelled
    by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
    wait for its completion. So nvme_stop_keep_alive() only stops the timer
    when it is pending; when a keep alive is in flight, there is no timer
    running and nvme_stop_keep_alive() has no effect on the keep alive io.
    Thus, if the io completes successfully, the keep alive timer will be
    rescheduled. In the failure case, delete is called, the controller
    state is changed, nvme_stop_keep_alive() is called while the io is
    outstanding, and the delete path continues on. The keep alive happens
    to complete successfully before the delete paths mark it as aborted as
    part of the queue termination, so the timer is restarted. The delete
    paths then tear down the controller, and later on the timer code fires
    and the timer entry is now corrupt.

    Fix by validating the controller state before rescheduling the keep
    alive. Testing with the fix has confirmed the condition above was hit.
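    The guard can be sketched as follows (states and names are
    simplified stand-ins for the nvme core's controller states and
    delayed work):

```c
#include <stdbool.h>

enum ctrl_state { CTRL_LIVE, CTRL_DELETING, CTRL_DEAD };

struct ctrl {
    enum ctrl_state state;
    bool ka_timer_armed; /* stand-in for the delayed-work timer */
};

/* Sketch: the keep-alive completion re-arms the timer only while the
 * controller is still live, so a concurrent delete cannot be left with
 * a rescheduled (and soon dangling) timer. */
static void keep_alive_done(struct ctrl *c)
{
    if (c->state != CTRL_LIVE)
        return; /* delete in progress: leave the timer dead */
    c->ka_timer_armed = true;
}
```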

    Signed-off-by: James Smart
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    James Smart
     

17 Dec, 2018

3 commits

  • [ Upstream commit 6344d02dc8f886b6bbcd922ae1a17e4a41500f2d ]

    Some error paths in the configuration of the admin queue free the
    data buffer associated with the async request SQE without resetting
    the buffer pointer to NULL. This buffer is then freed again if the
    controller is shut down or reset.

    Signed-off-by: Prabhath Sajeepa
    Reviewed-by: Roland Dreier
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Prabhath Sajeepa
     
  • [ Upstream commit f6c8e432cb0479255322c5d0335b9f1699a0270c ]

    nvme_stop_ctrl can also be called for the reset flow, where there is
    no need to flush the scan_work as namespaces are not being removed.
    Doing so can deadlock the rdma, fc and loop drivers, since
    nvme_stop_ctrl barriers before controller teardown (and specifically
    before I/O cancellation of the scan_work itself) takes place; the
    scan_work will be blocked anyway, so there is no need to flush it.

    Instead, move scan_work flush to nvme_remove_namespaces() where it really
    needs to flush.

    Reported-by: Ming Lei
    Signed-off-by: Sagi Grimberg
    Reviewed-by: Keith Busch
    Reviewed-by: James Smart
    Tested-by: Ewan D. Milne
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Sagi Grimberg
     
  • [ Upstream commit 14a1336e6fff47dd1028b484d6c802105c58e2ee ]

    Without CONFIG_NVME_MULTIPATH enabled, a multi-port subsystem might
    show up as individual devices and cause problems; warn about it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Sasha Levin

    Christoph Hellwig
     

13 Dec, 2018

1 commit

  • [ Upstream commit 4cff280a5fccf6513ed9e895bb3a4e7ad8b0cedc ]

    If an io error occurs on an io issued while connecting, recovery
    of the io falls flat as the state checking ends up nooping the error
    handler.

    Create an err_work work item that is scheduled upon an io error while
    connecting. The work thread terminates all io on all queues and marks
    the queues as not connected. The termination of the io will return
    back to the caller, which will then back out of the connection attempt
    and, if possible, reschedule it.

    The changes:
    - in case there are several commands hitting the error handler, a
    state flag is kept so that the error work is only scheduled once,
    on the first error. The subsequent errors can be ignored.
    - The calling sequence to stop keep alive and terminate the queues
    and their io is lifted from the reset routine into a small service
    routine used by both reset and err_work.
    - During debugging, found that the teardown path can reference
    an uninitialized pointer, resulting in a NULL pointer oops.
    The aen_ops weren't initialized yet. Add validation on their
    initialization before calling the teardown routine.

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    James Smart
     

27 Nov, 2018

1 commit

  • [ Upstream commit 8f676b8508c250bbe255096522fdefb73f1ea0b9 ]

    Whenever we update ns_head info, we need to make sure it is still
    compatible with all underlying backing devices: although nvme
    multipath doesn't make any explicit use of these limits, other
    devices can still be stacked on top of it and may rely on the
    underlying limits. Start with unlimited stacking limits, and on
    every info update iterate over the siblings and adjust the queue
    limits.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Sagi Grimberg
     

14 Nov, 2018

1 commit

  • [ Upstream commit 783f4a4408e1251d17f333ad56abac24dde988b9 ]

    When an io is rejected by nvmf_check_ready() due to validation of the
    controller state, the nvmf_fail_nonready_command() will normally return
    BLK_STS_RESOURCE to requeue and retry. However, if the controller is
    dying or the I/O is marked for NVMe multipath, the I/O is failed so that
    the controller can terminate or so that the io can be issued on a
    different path. Unfortunately, as this reject point is before the
    transport has accepted the command, blk-mq ends up completing the I/O
    and never calls nvme_complete_rq(), which is where multipath may preserve
    or re-route the I/O. The end result is that the device user sees an
    EIO error.

    Example: single path connectivity, controller is under load, and a reset
    is induced. An I/O is received:

    a) while the reset state has been set but the queues have yet to be
    stopped; or
    b) after queues are started (at end of reset) but before the reconnect
    has completed.

    The I/O finishes with an EIO status.

    This patch makes the following changes:

    - Adds the HOST_PATH_ERROR pathing status from TP4028
    - Modifies the reject point such that it appears to queue successfully,
    but actually completes the io with the new pathing status and calls
    nvme_complete_rq().
    - nvme_complete_rq() recognizes the new status, avoids resetting the
    controller (likely was already done in order to get this new status),
    and calls the multipather to clear the current path that errored.
    This allows the next command (retry or new command) to select a new
    path if there is one.

    Signed-off-by: James Smart
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Smart
     

08 Oct, 2018

1 commit

  • The code had been clearing a namespace being deleted as the current
    path while that namespace was still in the path siblings list. It is
    possible a new IO could set that namespace back to the current path
    since it appeared to be an eligible path to select, which may result in
    a use-after-free error.

    This patch ensures a namespace being removed is not eligible to be reset
    as a current path prior to clearing it as the current path.

    Signed-off-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Keith Busch
     

06 Sep, 2018

1 commit

    Currently we always repost the recv buffer before we send a response
    capsule back to the host. Since ordering is not guaranteed for send
    and recv completions, it is possible that we will receive a new
    request from the host before we get a send completion for the
    response capsule.

    Today, we pre-allocate twice as many rsps as the queue depth, but in
    reality, under heavy load there is nothing really preventing the gap
    from expanding until we exhaust all our rsps.

    To fix this, if we don't have any pre-allocated rsps left, we
    dynamically allocate an rsp and make sure to free it when we are
    done. If, under memory pressure, we fail to allocate an rsp, we
    silently drop the command and wait for the host to retry.
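    The allocation strategy can be sketched as a fixed pool with a heap
    fallback (illustrative names; the real code also has to return
    pooled rsps to the pool, which this sketch omits):

```c
#include <stdlib.h>
#include <stdbool.h>

struct rsp { bool dynamic; /* ... command/response state ... */ };

struct rsp_pool {
    struct rsp *slots;
    int nused, nslots;
};

/* Sketch: prefer the pre-allocated pool; fall back to the heap; a NULL
 * return under memory pressure means "drop and let the host retry". */
static struct rsp *get_rsp(struct rsp_pool *p)
{
    if (p->nused < p->nslots) {
        struct rsp *r = &p->slots[p->nused++];
        r->dynamic = false;
        return r;
    }
    struct rsp *r = malloc(sizeof(*r));
    if (r)
        r->dynamic = true;
    return r;
}

static void put_rsp(struct rsp *r)
{
    if (r && r->dynamic)
        free(r); /* pooled rsps are recycled elsewhere in this sketch */
}
```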

    Reported-by: Steve Wise
    Tested-by: Steve Wise
    Signed-off-by: Sagi Grimberg
    [hch: dropped a superfluous assignment]
    Signed-off-by: Christoph Hellwig

    Sagi Grimberg
     

28 Aug, 2018

3 commits

  • Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • When a targetport is removed from the config, fcloop will avoid calling
    the LS done() routine thinking the targetport is gone. This leaves the
    initiator reset/reconnect hanging as it waits for a status on the
    Create_Association LS for the reconnect.

    Change the filter in the LS callback path: if tport is NULL (set when
    validation failed before "sending to remote port"), be sure to call
    done. This was the main bug. But continue the logic that only calls
    done if tport was set but there is no remoteport (e.g. the case where
    the remoteport has been removed, and thus the host doesn't expect a
    completion).

    Signed-off-by: James Smart
    Signed-off-by: Christoph Hellwig

    James Smart
     
  • In many architectures loads may be reordered with older stores to
    different locations. In the nvme driver the following two operations
    could be reordered:

    - Write shadow doorbell (dbbuf_db) into memory.
    - Read EventIdx (dbbuf_ei) from memory.

    This can result in a potential race condition between the driver and
    the VM host processing requests (if the given virtual NVMe controller
    has support for the shadow doorbell). If that occurs, the NVMe
    controller may decide to wait for an MMIO doorbell from the guest
    operating system, and the guest driver may decide not to issue an
    MMIO doorbell on any subsequent commands.
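    The ordering requirement can be sketched with C11 atomics;
    dbbuf_need_event below mirrors the driver's nvme_dbbuf_need_event()
    comparison, while the fence stands in for the mb() the fix adds:

```c
#include <stdatomic.h>
#include <stdint.h>

/* EventIdx test used by the shadow-doorbell protocol: ring the MMIO
 * doorbell iff the controller's EventIdx falls within (old, new]. */
static int dbbuf_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

/* Sketch of the fix: a full barrier between the shadow-doorbell store
 * and the EventIdx load, so the CPU cannot reorder the load before the
 * store and miss the controller's EventIdx update. */
static int update_and_check(_Atomic uint16_t *db, _Atomic uint16_t *ei,
                            uint16_t new_idx)
{
    uint16_t old = atomic_exchange_explicit(db, new_idx,
                                            memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* the added mb() */
    return dbbuf_need_event(atomic_load_explicit(ei, memory_order_relaxed),
                            new_idx, old);
}
```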

    This issue is purely a timing-dependent one, so there is no easy way to
    reproduce it. Currently the easiest known approach is to run "Oracle IO
    Numbers" (orion) that is shipped with Oracle DB:

    orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
    concat -write 40 -duration 120 -matrix row -testname nvme_test

    Where nvme_test is a .lun file that contains a list of NVMe block
    devices to run the test against. Limiting the number of vCPUs
    assigned to a given VM instance seems to increase the chances of
    this bug occurring. On a test environment with a VM that had 4 NVMe
    drives and 1 vCPU assigned, the virtual NVMe controller hang could
    be observed within 10-20 minutes. That corresponds to about 400-500k
    IO operations processed (or about
    100GB of IO read/writes).

    The Orion tool was used as validation and set to run in a loop for
    36 hours (equivalent to pushing 550M IO operations). No issues were
    observed, which suggests that the patch fixes the issue.

    Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices")
    Signed-off-by: Michal Wnukowski
    Reviewed-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    [hch: updated changelog and comment a bit]
    Signed-off-by: Christoph Hellwig

    Michal Wnukowski
     

17 Aug, 2018

2 commits

  • rdma.git merge resolution for the 4.19 merge window

    Conflicts:
    drivers/infiniband/core/rdma_core.c
    - Use the rdma code and revise with the new spelling for
    atomic_fetch_add_unless
    drivers/nvme/host/rdma.c
    - Replace max_sge with max_send_sge in new blk code
    drivers/nvme/target/rdma.c
    - Use the blk code and revise to use NULL for ib_post_recv when
    appropriate
    - Replace max_sge with max_recv_sge in new blk code
    net/rds/ib_send.c
    - Use the net code and revise to use NULL for ib_post_recv when
    appropriate

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • Resolve merge conflicts from the -rc cycle against the rdma.git tree:

    Conflicts:
    drivers/infiniband/core/uverbs_cmd.c
    - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
    - Merge removal of file->ucontext in for-next with new code in -rc
    drivers/infiniband/core/uverbs_main.c
    - for-next removed code from ib_uverbs_write() that was modified
    in for-rc

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

15 Aug, 2018

1 commit

  • Pull block updates from Jens Axboe:
    "First pull request for this merge window, there will also be a
    followup request with some stragglers.

    This pull request contains:

    - Fix for a thundering herd issue in the wbt block code (Anchal
    Agarwal)

    - A few NVMe pull requests:
    * Improved tracepoints (Keith)
    * Larger inline data support for RDMA (Steve Wise)
    * RDMA setup/teardown fixes (Sagi)
    * Effects log support for NVMe target (Chaitanya Kulkarni)
    * Buffered IO support for NVMe target (Chaitanya Kulkarni)
    * TP4004 (ANA) support (Christoph)
    * Various NVMe fixes

    - Block io-latency controller support. Much needed support for
    properly containing block devices. (Josef)

    - Series improving how we handle sense information on the stack
    (Kees)

    - Lightnvm fixes and updates/improvements (Mathias/Javier et al)

    - Zoned device support for null_blk (Matias)

    - AIX partition fixes (Mauricio Faria de Oliveira)

    - DIF checksum code made generic (Max Gurtovoy)

    - Add support for discard in iostats (Michael Callahan / Tejun)

    - Set of updates for BFQ (Paolo)

    - Removal of async write support for bsg (Christoph)

    - Bio page dirtying and clone fixups (Christoph)

    - Set of bcache fix/changes (via Coly)

    - Series improving blk-mq queue setup/teardown speed (Ming)

    - Series improving merging performance on blk-mq (Ming)

    - Lots of other fixes and cleanups from a slew of folks"

    * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits)
    blkcg: Make blkg_root_lookup() work for queues in bypass mode
    bcache: fix error setting writeback_rate through sysfs interface
    null_blk: add lock drop/acquire annotation
    Blk-throttle: reduce tail io latency when iops limit is enforced
    block: paride: pd: mark expected switch fall-throughs
    block: Ensure that a request queue is dissociated from the cgroup controller
    block: Introduce blk_exit_queue()
    blkcg: Introduce blkg_root_lookup()
    block: Remove two superfluous #include directives
    blk-mq: count the hctx as active before allocating tag
    block: bvec_nr_vecs() returns value for wrong slab
    bcache: trivial - remove tailing backslash in macro BTREE_FLAG
    bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section
    bcache: set max writeback rate when I/O request is idle
    bcache: add code comments for bset.c
    bcache: fix mistaken comments in request.c
    bcache: fix mistaken code comments in bcache.h
    bcache: add a comment in super.c
    bcache: avoid unncessary cache prefetch bch_btree_node_get()
    bcache: display rate debug parameters to 0 when writeback is not running
    ...

    Linus Torvalds
     

08 Aug, 2018

3 commits

  • When the user supplies a ctrl_loss_tmo < 0, we warn them that this will
    cause the fabrics layer to attempt reconnection forever. However, in
    reality the fabrics layer never attempts to reconnect because the
    condition to test whether we should reconnect is backwards in this case.
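    The intended semantics can be sketched as follows (parameter names
    are illustrative; the driver expresses this via its reconnect
    bookkeeping rather than raw seconds):

```c
#include <stdbool.h>

/* Sketch: a negative ctrl_loss_tmo means "reconnect forever", so it
 * must short-circuit the elapsed-time comparison rather than make it
 * trivially false. */
static bool should_reconnect(int ctrl_loss_tmo, int elapsed)
{
    return ctrl_loss_tmo < 0 || elapsed < ctrl_loss_tmo;
}
```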

    Signed-off-by: Tal Shorer
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Tal Shorer
     
  • This patch implements the Namespace Write Protect feature described in
    "NVMe TP 4005a Namespace Write Protect". In this version, we implement
    No Write Protect and Write Protect states for target ns which can be
    toggled by set-features commands from the host side.

    For write-protect state transition, we need to flush the ns specified
    as a part of command so we also add helpers for carrying out synchronous
    flush operations.

    Signed-off-by: Chaitanya Kulkarni
    [hch: fixed an incorrect endianess conversion, minor cleanups]
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
    NVMe 1.3 TP 4005 introduces a new field (NSATTR) that indicates
    whether a given namespace is write protected. This patch sets the
    gendisk associated with the namespace to read-only based on the
    identify namespace nsattr field.
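    The mapping can be sketched as below; NVME_NS_ATTR_RO is bit 0 of
    the identify-namespace NSATTR field, and the flag variable stands in
    for set_disk_ro():

```c
#include <stdint.h>
#include <stdbool.h>

#define NVME_NS_ATTR_RO (1 << 0) /* namespace is write protected */

static bool disk_ro;

/* Sketch: mirror NSATTR bit 0 onto the gendisk read-only flag. */
static void nvme_update_disk_ro(uint8_t nsattr)
{
    disk_ro = (nsattr & NVME_NS_ATTR_RO) != 0;
}
```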

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     

06 Aug, 2018

3 commits

  • A minor version number increase should not break backwards
    compatibility.

    Fixes: 3cb98f84d368b ("lightnvm: add minor version to generic geometry")
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Matias Bjørling
     
  • Pull NVMe changes from Christoph:

    "This contains the support for TP4004, Asymmetric Namespace Access,
    which makes NVMe multipathing usable in practice."

    * 'nvme-4.19' of git://git.infradead.org/nvme:
    nvmet: use Retain Async Event bit to clear AEN
    nvmet: support configuring ANA groups
    nvmet: add minimal ANA support
    nvmet: track and limit the number of namespaces per subsystem
    nvmet: keep a port pointer in nvmet_ctrl
    nvme: add ANA support
    nvme: remove nvme_req_needs_failover
    nvme: simplify the API for getting log pages
    nvme.h: add ANA definitions
    nvme.h: add support for the log specific field

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a
    merge conflict down the line.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

30 Jul, 2018

2 commits

    Also move the remapping logic to the nvme core driver instead of
    implementing it in the nvme pci driver. This way all the other nvme
    transport drivers will benefit from it (in case they implement
    metadata support).

    Suggested-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Acked-by: Keith Busch
    Signed-off-by: Max Gurtovoy
    Signed-off-by: Jens Axboe

    Max Gurtovoy
     
    Currently this function is implemented in the scsi layer, but its
    proper place is the block layer, since T10-PI is a general data
    integrity feature that is used in the nvme protocol as well.

    Suggested-by: Christoph Hellwig
    Cc: Martin K. Petersen
    Signed-off-by: Max Gurtovoy
    Signed-off-by: Jens Axboe

    Max Gurtovoy
     

28 Jul, 2018

2 commits

  • Pull block fixes from Jens Axboe:
    "Bigger than usual at this time, mostly due to the O_DIRECT corruption
    issue and the fact that I was on vacation last week. This contains:

    - NVMe pull request with two fixes for the FC code, and two target
    fixes (Christoph)

    - a DIF bio reset iteration fix (Greg Edwards)

    - two nbd reply and requeue fixes (Josef)

    - SCSI timeout fixup (Keith)

    - a small series that fixes an issue with bio_iov_iter_get_pages(),
    which ended up causing corruption for larger sized O_DIRECT writes
    that ended up racing with buffered writes (Martin Wilck)"

    * tag 'for-linus-20180727' of git://git.kernel.dk/linux-block:
    block: reset bi_iter.bi_done after splitting bio
    block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
    blkdev: __blkdev_direct_IO_simple: fix leak in error case
    block: bio_iov_iter_get_pages: fix size of last iovec
    nvmet: only check for filebacking on -ENOTBLK
    nvmet: fixup crash on NULL device path
    scsi: set timed out out mq requests to complete
    blk-mq: export setting request completion state
    nvme: if_ready checks to fail io to deleting controller
    nvmet-fc: fix target sgl list on large transfers
    nbd: handle unexpected replies better
    nbd: don't requeue the same request twice.

    Linus Torvalds
     
    In the current implementation, we clear the AEN bit when we get the
    "get log page" command if the given log page is associated with an
    AEN. This patch optionally retains the AEN for the ctrl under
    consideration when the Retain Asynchronous Event (RAE) bit is set as
    part of the "get log page" command.

    This allows the host to read the log page and optionally retain the
    AEN associated with it when using userspace tools like nvme-cli.

    Signed-off-by: Chaitanya Kulkarni
    [hch: also use the new helper in the just merged ANA code]
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni