04 Sep, 2020

5 commits

  • When using a shared sbitmap, the number of active request queues per
    hctx should no longer be relied on when judging how to share the tag
    bitmap.

    Instead, maintain the number of active request queues per tag_set, and
    make the judgement based on that.

    Originally-from: Kashyap Desai
    Signed-off-by: John Garry
    Tested-by: Don Brace #SCSI resv cmds patches used
    Tested-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    John Garry
     
  • The per-hctx nr_active value can no longer be used to fairly assign a
    share of tag depth per request queue when using a shared sbitmap, as it
    does not account for the tags being shared across all hctx's.

    For this case, record nr_active_requests per request_queue, and make
    the judgement based on that value (see the sketch below).

    Co-developed-by: Kashyap Desai
    Signed-off-by: John Garry
    Tested-by: Don Brace #SCSI resv cmds patches used
    Tested-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    John Garry
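
    A minimal sketch of the resulting fairness check, loosely modelled on
    hctx_may_queue(); the field names match this series, but the helper
    itself is illustrative rather than the exact kernel code:

        /*
         * Sketch: may this queue allocate another tag when the sbitmap
         * is shared across all hctx's of the tag_set?
         */
        static inline bool queue_may_alloc_tag(struct request_queue *q,
                                               unsigned int depth)
        {
                /* active queues are now counted per tag_set, not per hctx */
                unsigned int users =
                        atomic_read(&q->tag_set->active_queues_shared_sbitmap);

                if (!users)
                        return true;

                /* nr_active_requests is accounted per request_queue */
                return atomic_read(&q->nr_active_requests_shared_sbitmap) <
                       max((depth + users - 1) / users, 4U);
        }

    Each active queue gets roughly an equal share of the total depth, with
    a small floor so that low-depth devices still make progress.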
     
  • blk-mq.h and blk-mq-tag.h include each other, which is less than ideal.

    Move hctx_may_queue() to blk-mq.h, as it is not really tag-specific code.

    This way, we can drop the blk-mq-tag.h include of blk-mq.h (the pattern
    is sketched below).

    Signed-off-by: John Garry
    Tested-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    John Garry
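
    The generic pattern used here, sketched with hypothetical headers: the
    header that only handles pointers forward-declares the type instead of
    including the other header, which breaks the cycle.

        /* tag.h - sketch: no longer includes core.h */
        #ifndef TAG_H
        #define TAG_H

        struct hw_ctx;                  /* forward declaration is enough */

        void tag_wakeup_all(struct hw_ctx *hctx);

        #endif

    hctx_may_queue() itself moves next to the hctx definition it operates
    on, since it is queueing logic that merely happens to look at tags.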
     
  • Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support
    multiple reply queues with single hostwide tags.

    In addition, these drivers want to use interrupt assignment in
    pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0],
    CPU hotplug may cause in-flight IO completion to not be serviced when an
    interrupt is shut down. That problem is solved in commit bf0beec0607d
    ("blk-mq: drain I/O when all CPUs in a hctx are offline").

    However, to take advantage of that blk-mq feature, the HBA HW queues are
    required to be mapped to the blk-mq hctx's; to do that, the HBA HW
    queues need to be exposed to the upper layer.

    In making that transition, the per-SCSI command request tags are no
    longer unique per Scsi host - they are just unique per hctx. As such, the
    HBA LLDD would have to generate this tag internally, which has a certain
    performance overhead.

    Another problem is that blk-mq assumes the host may accept
    (Scsi_host.can_queue * #hw queues) commands. In commit 6eb045e092ef
    ("scsi: core: avoid host-wide host_busy counter for scsi_mq"), the Scsi
    host busy counter was removed; it had been what stopped the LLDD from
    being sent more than .can_queue commands. However, it should still be
    ensured that the block layer does not issue more than .can_queue
    commands to the Scsi host.

    To solve this problem, introduce a shared sbitmap per blk_mq_tag_set,
    which may be requested at init time.

    The new flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the
    tagset to indicate that the shared sbitmap should be used (see the
    sketch below).

    Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and
    requests is still allocated per hctx; the reason for this is that if
    tags and requests were only allocated for a single hctx - like hctx0 -
    it may break block drivers which expect a request to be associated with
    a specific hctx, i.e. not always hctx0. This does introduce extra
    memory usage.

    This change is based on work originally from Ming Lei in [1] and from
    Bart's suggestion in [2].

    [0] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/
    [1] https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@redhat.com/
    [2] https://lore.kernel.org/linux-block/ff77beff-5fd9-9f05-12b6-826922bace1f@huawei.com/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be

    Signed-off-by: John Garry
    Tested-by: Don Brace #SCSI resv cmds patches used
    Tested-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    John Garry
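
    A hedged sketch of an LLDD opting in at tag set allocation time; the
    driver function and fields are hypothetical, but BLK_MQ_F_TAG_HCTX_SHARED
    and blk_mq_alloc_tag_set() are the interfaces this commit targets:

        #include <linux/blk-mq.h>
        #include <linux/string.h>

        static int hypo_setup_tags(struct blk_mq_tag_set *set,
                                   const struct blk_mq_ops *ops,
                                   unsigned int nr_hw_queues,
                                   unsigned int can_queue)
        {
                memset(set, 0, sizeof(*set));
                set->ops = ops;
                set->nr_hw_queues = nr_hw_queues;  /* expose HBA HW queues */
                set->queue_depth = can_queue;      /* hostwide tag space */
                set->numa_node = NUMA_NO_NODE;
                set->flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_TAG_HCTX_SHARED;

                return blk_mq_alloc_tag_set(set);
        }

    With the flag set, the tags of all hctx's share one sbitmap, so the
    block layer itself enforces the .can_queue limit across HW queues.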
     
  • Pass hctx/tagset flags argument down to blk_mq_init_tags() and
    blk_mq_free_tags() for selective init/free.

    For now, make it include the alloc policy flag, which can be evaluated
    when needed (in blk_mq_init_tags()).

    Signed-off-by: John Garry
    Tested-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    John Garry
     

02 Jul, 2020

1 commit

  • This reverts the following commits:

    37f4a24c2469a10a4c16c641671bd766e276cf9f
    723bf178f158abd1ce6069cb049581b3cb003aab
    36a3df5a4574d5ddf59804fcd0c4e9654c514d9a

    The last one is the culprit, but we have to go a bit deeper to get this
    to revert cleanly. There's been a report that this breaks some MMC
    setups [1], and also causes an issue with swap [2]. Until this can be
    figured out, revert the offending commits.

    [1] https://lore.kernel.org/linux-block/57fb09b1-54ba-f3aa-f82c-d709b0e6b281@samsung.com/
    [2] https://lore.kernel.org/linux-block/20200702043721.GA1087@lca.pw/

    Reported-by: Marek Szyprowski
    Reported-by: Qian Cai
    Signed-off-by: Jens Axboe

    Jens Axboe
     

01 Jul, 2020

1 commit


30 Jun, 2020

3 commits

  • Pass the obtained budget count to blk_mq_dispatch_rq_list(), and prepare
    for supporting fully batched submission.

    With the obtained budget count, it is easier to put back extra budgets
    in case of .queue_rq failure (see the sketch below).

    Meanwhile, remove the old 'got_budget' parameter.

    Signed-off-by: Ming Lei
    Tested-by: Baolin Wang
    Reviewed-by: Christoph Hellwig
    Cc: Sagi Grimberg
    Cc: Baolin Wang
    Cc: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Ming Lei
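
    The idea, sketched (the helper name is illustrative; in the kernel the
    put side goes through the queue's .put_budget callback):

        /* hand back budgets that were obtained but never consumed,
         * e.g. because .queue_rq returned BLK_STS_RESOURCE part-way */
        static void put_unused_budgets(struct request_queue *q,
                                       unsigned int nr_budgets)
        {
                while (nr_budgets--)
                        blk_mq_put_dispatch_budget(q);
        }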
     
  • All requests in the 'list' of blk_mq_dispatch_rq_list() belong to the
    same hctx, so it is better to pass the hctx instead of the request
    queue, because blk-mq's dispatch target is the hctx rather than the
    request queue.

    Signed-off-by: Ming Lei
    Tested-by: Baolin Wang
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Johannes Thumshirn
    Cc: Sagi Grimberg
    Cc: Baolin Wang
    Cc: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • The blk-mq budget is abstracted from scsi's device queue depth, and it
    is always per-request-queue rather than per-hctx.

    It can be quite absurd to get a budget from one hctx, then dequeue a
    request from the scheduler queue, only for that request to not belong
    to this hctx, at least for bfq and deadline.

    So fix the mess and always pass the request queue to the get/put budget
    callbacks (see the sketch below).

    Signed-off-by: Ming Lei
    Tested-by: Baolin Wang
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Douglas Anderson
    Reviewed-by: Sagi Grimberg
    Cc: Sagi Grimberg
    Cc: Baolin Wang
    Cc: Christoph Hellwig
    Cc: Douglas Anderson
    Signed-off-by: Jens Axboe

    Ming Lei
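
    A sketch of what the reworked hooks look like for a hypothetical driver
    whose budget is a per-device queue depth (as with SCSI); struct hypo_dev
    and both callbacks are illustrative:

        struct hypo_dev {
                atomic_t        busy;
                int             queue_depth;
        };

        static bool hypo_get_budget(struct request_queue *q)
        {
                struct hypo_dev *dev = q->queuedata;

                if (atomic_inc_return(&dev->busy) > dev->queue_depth) {
                        atomic_dec(&dev->busy);
                        return false;   /* over depth: no budget */
                }
                return true;
        }

        static void hypo_put_budget(struct request_queue *q)
        {
                struct hypo_dev *dev = q->queuedata;

                atomic_dec(&dev->busy);
        }

    Both hooks now key off the request queue alone, so it no longer matters
    which hctx the budget was obtained from.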
     

29 Jun, 2020

1 commit


07 Jun, 2020

1 commit

  • Allocation of the driver tag in the case of using a scheduler shares very
    little code with the "normal" tag allocation. Split out a new helper to
    streamline this path, and untangle it from the complex normal tag
    allocation.

    This also avoids failing driver tag allocation because of an inactive
    hctx during CPU hotplug, and fixes a potential hang risk.

    Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline")
    Signed-off-by: Ming Lei
    Signed-off-by: Christoph Hellwig
    Tested-by: John Garry
    Cc: Dongli Zhang
    Cc: Hannes Reinecke
    Cc: Daniel Wagner
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

30 May, 2020

1 commit


27 Feb, 2020

1 commit

  • The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(),
    blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove
    it.

    Overall object code size shows a minor reduction. Before:

      text    data    bss    dec      hex     filename
      27306   1312    0      28618    6fca    block/blk-mq.o
      4303    272     0      4575     11df    block/blk-mq-tag.o

    After:

      text    data    bss    dec      hex     filename
      27282   1312    0      28594    6fb2    block/blk-mq.o
      4311    272     0      4583     11e7    block/blk-mq-tag.o

    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Hannes Reinecke
    Signed-off-by: John Garry
    --
    This minor patch had been carried as part of the blk-mq shared tags RFC,
    I'd rather not carry it anymore as it required rebasing, so now or never..
    Signed-off-by: Jens Axboe

    John Garry
     

25 Feb, 2020

1 commit

  • For some reason, a device may get into a state where it can't handle
    FS requests: STS_RESOURCE is always returned and the FS request is
    added to hctx->dispatch. However, a passthrough request may be required
    at that time to fix the problem. If the passthrough request is added to
    the scheduler queue, there isn't any chance for blk-mq to dispatch it,
    given that we prioritize requests in hctx->dispatch. The FS IO request
    may then never be completed, and an IO hang results.

    So the passthrough request has to be added to hctx->dispatch directly
    to fix the IO hang.

    Fix this issue by inserting passthrough requests into hctx->dispatch
    directly, together with adding FS requests to the tail of
    hctx->dispatch in blk_mq_dispatch_rq_list(). We already add FS requests
    to the tail of hctx->dispatch by default; see
    blk_mq_request_bypass_insert() and the sketch below.

    This then becomes consistent with the original legacy IO request
    path, in which passthrough requests are always added to q->queue_head.

    Cc: Dongli Zhang
    Cc: Christoph Hellwig
    Cc: Ewan D. Milne
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
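
    The gist as a sketch; in the kernel the check sits inside the insert
    paths themselves, and the wrapper below is illustrative:

        /* passthrough requests bypass the elevator and go straight to
         * hctx->dispatch, where they are dispatched with priority */
        static void insert_rq(struct request *rq, bool at_head, bool run_queue)
        {
                if (blk_rq_is_passthrough(rq)) {
                        blk_mq_request_bypass_insert(rq, at_head, run_queue);
                        return;
                }

                /* FS requests keep going through the scheduler queue */
                blk_mq_sched_insert_request(rq, at_head, run_queue, false);
        }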
     

07 Oct, 2019

1 commit


11 Jul, 2019

1 commit

  • Simultaneously writing to a sequential zone of a zoned block device
    from multiple contexts requires mutual exclusion for BIO issuing to
    ensure that writes happen sequentially. However, even for a well
    behaved user correctly implementing such synchronization, BIO plugging
    may interfere and result in BIOs from the different contexts being
    reordered if plugging is done outside of the mutual exclusion section,
    e.g. if the plug was started by a function higher in the call chain
    than the function issuing BIOs, as in the sequence below (a sketch of
    a plugging rule that avoids this follows):

    Context A                                    Context B

    blk_start_plug()
    ...
    seq_write_zone()
      mutex_lock(zone)
      bio-0->bi_iter.bi_sector = zone->wp
      zone->wp += bio_sectors(bio-0)
      submit_bio(bio-0)
      bio-1->bi_iter.bi_sector = zone->wp
      zone->wp += bio_sectors(bio-1)
      submit_bio(bio-1)
      mutex_unlock(zone)
      return
    -------------------------------------------> seq_write_zone()
                                                    mutex_lock(zone)
                                                    bio-2->bi_iter.bi_sector = zone->wp
                                                    zone->wp += bio_sectors(bio-2)
                                                    submit_bio(bio-2)
                                                    mutex_unlock(zone)
    Signed-off-by: Jens Axboe

    Damien Le Moal
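
    One way to avoid the interference, sketched below: simply refuse to
    plug writes aimed at a zoned device, so a plug started higher in the
    call chain can never hold back and reorder sequential zone writes. The
    helper name is illustrative; blk_queue_is_zoned() and op_is_write()
    are existing block layer primitives.

        #include <linux/blkdev.h>
        #include <linux/sched.h>

        static inline struct blk_plug *zoned_safe_plug(struct request_queue *q,
                                                       struct bio *bio)
        {
                /* regular devices and reads: plugging stays beneficial */
                if (!blk_queue_is_zoned(q) || !op_is_write(bio_op(bio)))
                        return current->plug;

                /* zoned write: submit immediately, preserving order */
                return NULL;
        }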
     

03 Jul, 2019

1 commit

  • No code that occurs between blk_mq_get_ctx() and blk_mq_put_ctx() depends
    on preemption being disabled for its correctness. Since removing the CPU
    preemption calls does not measurably affect performance, simplify the
    blk-mq code by removing the blk_mq_put_ctx() function and also by not
    disabling preemption in blk_mq_get_ctx().

    Cc: Hannes Reinecke
    Cc: Omar Sandoval
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

04 May, 2019

1 commit

  • Once blk_cleanup_queue() returns, tags shouldn't be used any more,
    because blk_mq_free_tag_set() may be called. Commit 45a9c9d909b2
    ("blk-mq: Fix a use-after-free") fixes exactly this issue.

    However, that commit introduces another issue. Before 45a9c9d909b2,
    we were allowed to run the queue during queue cleanup if the queue's
    kobj refcount was held. After that commit, the queue can't be run
    during queue cleanup, otherwise an oops can be triggered easily,
    because some fields of the hctx are freed by blk_mq_free_queue() in
    blk_cleanup_queue().

    We have invented ways of addressing this kind of issue before, such as:

    8dc765d438f1 ("SCSI: fix queue cleanup race before queue initialization is done")
    c2856ae2f315 ("blk-mq: quiesce queue before freeing queue")

    But these still can't cover all cases; recently James reported another
    such issue:

    https://marc.info/?l=linux-scsi&m=155389088124782&w=2

    This issue can be quite hard to address via the previous approaches,
    given scsi_run_queue() may run requeues for other LUNs.

    Fix the above issue by freeing the hctx's resources in its release
    handler; this is safe because tags aren't needed for freeing such hctx
    resources.

    This approach follows the typical design pattern for kobject release
    handlers (see the sketch below).

    Cc: Dongli Zhang
    Cc: James Smart
    Cc: Bart Van Assche
    Cc: linux-scsi@vger.kernel.org
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: James E. J. Bottomley
    Reported-by: James Smart
    Fixes: 45a9c9d909b2 ("blk-mq: Fix a use-after-free")
    Cc: stable@vger.kernel.org
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Tested-by: James Smart
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
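
    The pattern, sketched and simplified: the release handler is the only
    place the hctx resources are freed, and it cannot run while sysfs (or
    anyone else) still holds a reference, so a concurrent queue run can no
    longer trip over freed fields.

        static void hctx_release(struct kobject *kobj)
        {
                struct blk_mq_hw_ctx *hctx =
                        container_of(kobj, struct blk_mq_hw_ctx, kobj);

                free_cpumask_var(hctx->cpumask);
                kfree(hctx->ctxs);
                kfree(hctx);
        }

        static struct kobj_type hctx_ktype = {
                .release        = hctx_release,
        };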
     

05 Apr, 2019

1 commit

  • blk_mq_try_issue_directly() can return BLK_STS*_RESOURCE for requests that
    have been queued. If that happens when blk_mq_try_issue_directly() is called
    by the dm-mpath driver then dm-mpath will try to resubmit a request that is
    already queued and a kernel crash follows. Since it is nontrivial to fix
    blk_mq_request_issue_directly(), revert the blk_mq_request_issue_directly()
    changes that went into kernel v5.0.

    This patch reverts the following commits:
    * d6a51a97c0b2 ("blk-mq: replace and kill blk_mq_request_issue_directly") # v5.0.
    * 5b7a6f128aad ("blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests") # v5.0.
    * 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.

    Cc: Christoph Hellwig
    Cc: Ming Lei
    Cc: Jianchao Wang
    Cc: Hannes Reinecke
    Cc: Johannes Thumshirn
    Cc: James Smart
    Cc: Dongli Zhang
    Cc: Laurence Oberman
    Reported-by: Laurence Oberman
    Tested-by: Laurence Oberman
    Fixes: 7f556a44e61d ("blk-mq: refactor the code of issue request directly") # v5.0.
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

25 Mar, 2019

1 commit

  • Except for their arguments, blk_mq_put_driver_tag_hctx() and
    blk_mq_put_driver_tag() are the same. We can just use the 'request'
    argument to put the tag via blk_mq_put_driver_tag(), and then remove
    the unused blk_mq_put_driver_tag_hctx().

    Signed-off-by: Yufen Yu
    Signed-off-by: Jens Axboe

    Yufen Yu
     

21 Mar, 2019

1 commit


15 Feb, 2019

1 commit

  • Pull in 5.0-rc6 to avoid a dumb merge conflict with fs/iomap.c.
    This is needed since io_uring is now based on the block branch,
    to avoid a conflict between the multi-page bvecs and the bits
    of io_uring that touch the core block parts.

    * tag 'v5.0-rc6': (525 commits)
    Linux 5.0-rc6
    x86/mm: Make set_pmd_at() paravirt aware
    MAINTAINERS: Update the ocores i2c bus driver maintainer, etc
    blk-mq: remove duplicated definition of blk_mq_freeze_queue
    Blk-iolatency: warn on negative inflight IO counter
    blk-iolatency: fix IO hang due to negative inflight counter
    MAINTAINERS: unify reference to xen-devel list
    x86/mm/cpa: Fix set_mce_nospec()
    futex: Handle early deadlock return correctly
    futex: Fix barrier comment
    net: dsa: b53: Fix for failure when irq is not defined in dt
    blktrace: Show requests without sector
    mips: cm: reprime error cause
    mips: loongson64: remove unreachable(), fix loongson_poweroff().
    sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
    geneve: should not call rt6_lookup() when ipv6 was disabled
    KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
    KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
    kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
    signal: Better detection of synchronous signals
    ...

    Jens Axboe
     

09 Feb, 2019

1 commit


01 Feb, 2019

2 commits

  • Currently, we check whether the hctx type is supported every time
    in the hot path. Actually, this is not necessary: we can save the
    default hctx into ctx->hctxs when mapping the sw queues if the type
    is not supported, and then use ctx->hctxs[type] directly.

    We also needn't check whether poll is enabled or not, because the
    caller will have cleared REQ_HIPRI in that case.

    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
     
  • Currently, the queue mapping result is saved in a two-dimensional
    array. In the hot path, getting a hctx requires the following lookup:

    q->queue_hw_ctx[q->tag_set->map[type].mq_map[cpu]]

    This isn't very efficient. We can instead save the queue mapping result
    into the ctx directly, per hctx type (see the sketch below), like:

    ctx->hctxs[type]

    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
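
    Both patches boil down to the lookup below; a simplified sketch of
    blk_mq_map_queue() after the change:

        static inline struct blk_mq_hw_ctx *map_queue(struct request_queue *q,
                                                      unsigned int flags,
                                                      struct blk_mq_ctx *ctx)
        {
                enum hctx_type type = HCTX_TYPE_DEFAULT;

                if (flags & REQ_HIPRI)
                        type = HCTX_TYPE_POLL;
                else if ((flags & REQ_OP_MASK) == REQ_OP_READ)
                        type = HCTX_TYPE_READ;

                /* one load; unsupported types already point at the
                 * default hctx, installed when sw queues were mapped */
                return ctx->hctxs[type];
        }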
     

18 Dec, 2018

1 commit

  • When a request is added to the rq list of a sw queue (ctx), the rq may
    be for a different type of hctx, especially after multiple queue
    mapping is introduced.

    So when dispatching requests from a sw queue via blk_mq_flush_busy_ctxs()
    or blk_mq_dequeue_from_ctx(), a request belonging to a different hctx
    queue type can be dispatched to the current hctx when the read queue or
    poll queue is enabled.

    This patch fixes the issue by introducing per-queue-type lists
    (sketched below).

    Cc: Christoph Hellwig
    Signed-off-by: Ming Lei

    Changed by me to not use separately cacheline aligned lists, just
    place them all in the same cacheline where we had just the one list
    and lock before.

    Signed-off-by: Jens Axboe

    Ming Lei
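
    The resulting layout, sketched; per Jens' note above, the per-type
    lists share the lock and cacheline that the single list occupied
    before:

        struct blk_mq_ctx {
                struct {
                        spinlock_t              lock;
                        struct list_head        rq_lists[HCTX_MAX_TYPES];
                } ____cacheline_aligned_in_smp;

                /* ... remaining fields unchanged ... */
        };

    Flushing a ctx then walks only the list matching the current hctx's
    type, so requests can no longer leak across queue types.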
     

17 Dec, 2018

1 commit


16 Dec, 2018

1 commit


10 Dec, 2018

1 commit

  • The previous patches deleted all the code that needed the second value
    returned from part_in_flight - now the kernel only uses the first value.

    Consequently, part_in_flight (and blk_mq_in_flight) may be changed so that
    it only returns one value.

    This patch just refactors the code, there's no functional change.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mikulas Patocka
     

05 Dec, 2018

1 commit

  • Having another indirect call in the fast path doesn't really help
    in our post-spectre world. Also, having too many queue types is just
    going to create confusion, so I'd rather manage them centrally.

    Note that the queue type naming and ordering changes a bit - the
    first index now is the default queue for everything not explicitly
    marked; the optional ones are the read and poll queues (see the sketch
    below).

    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
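
    The centrally managed types, sketched from the naming and ordering
    described above:

        enum hctx_type {
                HCTX_TYPE_DEFAULT,      /* all I/O not otherwise accounted */
                HCTX_TYPE_READ,         /* just for READ I/O */
                HCTX_TYPE_POLL,         /* polled I/O of any kind */

                HCTX_MAX_TYPES,
        };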
     

30 Nov, 2018

1 commit

  • If we are issuing a list of requests, we know when we're at the last
    one. If we fail issuing, ensure that we call ->commit_rqs() to flush
    any potential previous requests (see the sketch below).

    Reviewed-by: Omar Sandoval
    Reviewed-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
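
    The safety net, sketched from the list-issue path: if issuing stops
    early after requests were queued with bd->last == false, ring the
    doorbell explicitly.

        /* partial issue: anything already queued was told more was
         * coming, so ask the driver to commit what it has */
        if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
                hctx->queue->mq_ops->commit_rqs(hctx);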
     

21 Nov, 2018

1 commit

  • Even though .mq_kobj, ctx->kobj and q->kobj share the same lifetime
    from the block layer's view, they actually don't, because userspace may
    grab a kobject at any time via sysfs.

    This patch fixes the issue by the following approach (sketched below):

    1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
    all ctxs

    2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
    handler of .mq_kobj

    3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
    .mq_kobj is always released after all ctxs are freed.

    This patch fixes kernel panic issue during booting when DEBUG_KOBJECT_RELEASE
    is enabled.

    Reported-by: Guenter Roeck
    Cc: "jianchao.wang"
    Tested-by: Guenter Roeck
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
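
    The resulting arrangement, sketched and simplified:

        struct blk_mq_ctxs {
                struct kobject          kobj;
                struct blk_mq_ctx __percpu *queue_ctx;
        };

        /* runs only after every ctx->kobj (each holding a ref on
         * .mq_kobj) and any sysfs reference has been dropped */
        static void blk_mq_ctxs_release(struct kobject *kobj)
        {
                struct blk_mq_ctxs *ctxs =
                        container_of(kobj, struct blk_mq_ctxs, kobj);

                free_percpu(ctxs->queue_ctx);
                kfree(ctxs);
        }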
     

08 Nov, 2018

7 commits

  • We call blk_mq_map_queue() a lot, at least two times per request for
    each IO, and sometimes more. Since we now have an indirect call as well
    in that function, cache the mapping so we don't have to re-call
    blk_mq_map_queue() for the same request multiple times.

    Reviewed-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Add support for the tag set carrying multiple queue maps, and
    for the driver to inform blk-mq how many it wishes to support
    through setting set->nr_maps.

    This adds an mq_ops helper for drivers that support more than 1
    map, mq_ops->rq_flags_to_type(). The function takes request/bio
    flags and CPU, and returns a queue map index for that. We then
    use the type information in blk_mq_map_queue() to index the map
    set.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • The mapping used to be dependent on just the CPU location, but
    now it's a tuple of (type, cpu) instead. This is a prep patch
    for allowing a single software queue to map to multiple hardware
    queues. No functional changes in this patch.

    This changes the software queue count to an unsigned short
    to save a bit of space. We can still support 64K-1 CPUs,
    which should be enough. Add a check to catch a wrap.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Prep patch for being able to place requests based not just on
    CPU location, but also on the type of request.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Doesn't do anything right now, but it's needed as a prep patch
    to get the interfaces right.

    While in there, correct the blk_mq_map_queue() CPU type to an unsigned
    int.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This is in preparation for allowing multiple sets of maps per
    queue, if so desired.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bart Van Assche
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • It's just a pointer to set->mq_map, use that instead. Move the
    assignment a bit earlier, so we always know it's valid.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bart Van Assche
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Jens Axboe