Eric Lee / smarc-fsl-linux-kernel

04 Sep, 2020

5 commits

a0235d230 blk-mq: Relocate hctx_may_queue()

blk-mq.h and blk-mq-tag.h include each other, which is less than ideal.

Relocate hctx_may_queue() to blk-mq.h, as it is not really tag-specific code.

In this way, we can drop the blk-mq-tag.h include of blk-mq.h.

Signed-off-by: John Garry
Tested-by: Douglas Gilbert
Signed-off-by: Jens Axboe

John Garry
2020-09-04 05:20:47 +0800
32bc15afe blk-mq: Facilitate a shared sbitmap per tagset

Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support
multiple reply queues with single hostwide tags.

In addition, these drivers want to use interrupt assignment in
pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0],
CPU hotplug may cause in-flight IO completion to not be serviced when an
interrupt is shutdown. That problem is solved in commit bf0beec0607d
("blk-mq: drain I/O when all CPUs in a hctx are offline").

However, to take advantage of that blk-mq feature, the HBA HW queues are
required to be mapped to the blk-mq hctxs; to do that, the HBA HW
queues need to be exposed to the upper layer.

In making that transition, the per-SCSI command request tags are no
longer unique per Scsi host - they are just unique per hctx. As such, the
HBA LLDD would have to generate this tag internally, which has a certain
performance overhead.

However, another problem is that blk-mq assumes the host may accept
(Scsi_host.can_queue * #hw queue) commands. Commit 6eb045e092ef ("scsi:
core: avoid host-wide host_busy counter for scsi_mq") removed the Scsi host
busy counter, which had stopped the LLDD being sent more than .can_queue
commands; however, it should still be ensured that the block layer does not
issue more than .can_queue commands to the Scsi host.

To solve this problem, introduce a shared sbitmap per blk_mq_tag_set,
which may be requested at init time.

The new flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the
tagset to indicate that the shared sbitmap should be used.

Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and requests
are still allocated per hctx; the reason for this is that if tags and
requests were only allocated for a single hctx - like hctx0 - it may break
block drivers which expect a request be associated with a specific hctx,
i.e. not always hctx0. This will introduce extra memory usage.
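
The idea can be illustrated with a small userspace model. This is only a
sketch, not the kernel implementation: struct tag_bitmap, struct hw_queue,
struct tag_set and the tag_* helpers are hypothetical names, and a single
64-bit word stands in for the sbitmap. Each hardware queue keeps its own
bitmap, but in shared mode its pointer refers to one set-wide bitmap, so the
total number of tags handed out never exceeds the set depth.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_TAG (-1)
#define MAX_QUEUES 4

/* Hypothetical model types; these are not kernel structures. */
struct tag_bitmap {
	uint64_t bits;       /* one bit per tag, set means in use */
	unsigned int depth;  /* number of valid tags, at most 64 */
};

struct hw_queue {
	struct tag_bitmap own;   /* per-queue bitmap, used in non-shared mode */
	struct tag_bitmap *tags; /* points at own, or at the set-wide bitmap */
};

struct tag_set {
	struct tag_bitmap shared;          /* used only in shared mode */
	struct hw_queue queues[MAX_QUEUES];
	unsigned int nr_queues;
};

static void tag_set_init(struct tag_set *set, unsigned int nr_queues,
			 unsigned int depth, bool shared_mode)
{
	assert(nr_queues <= MAX_QUEUES && depth <= 64);
	set->nr_queues = nr_queues;
	set->shared.bits = 0;
	set->shared.depth = depth;
	for (unsigned int i = 0; i < nr_queues; i++) {
		struct hw_queue *q = &set->queues[i];

		q->own.bits = 0;
		q->own.depth = depth;
		/* Shared mode: every queue allocates from the same bitmap. */
		q->tags = shared_mode ? &set->shared : &q->own;
	}
}

/* Return the lowest free tag visible to this queue, or NO_TAG. */
static int tag_get(struct hw_queue *q)
{
	for (unsigned int t = 0; t < q->tags->depth; t++) {
		uint64_t mask = UINT64_C(1) << t;

		if (!(q->tags->bits & mask)) {
			q->tags->bits |= mask;
			return (int)t;
		}
	}
	return NO_TAG;
}

static void tag_put(struct hw_queue *q, int tag)
{
	assert(tag >= 0 && (unsigned int)tag < q->tags->depth);
	q->tags->bits &= ~(UINT64_C(1) << tag);
}
```

With two queues and a depth of 3 in shared mode, the fourth tag_get() fails
regardless of which queue asks; in non-shared mode each queue could hand out
3 tags on its own, which is the (can_queue * #hw queue) behaviour described
above.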

This change is based on work originally from Ming Lei in [1] and from
Bart's suggestion in [2].

[0] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/
[1] https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@redhat.com/
[2] https://lore.kernel.org/linux-block/ff77beff-5fd9-9f05-12b6-826922bace1f@huawei.com/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be

Signed-off-by: John Garry
Tested-by: Don Brace #SCSI resv cmds patches used
Tested-by: Douglas Gilbert
Signed-off-by: Jens Axboe

John Garry
2020-09-04 05:20:47 +0800
222a5ae03 blk-mq: Use pointers for blk_mq_tags bitmap tags

Introduce pointers for the blk_mq_tags regular and reserved bitmap tags,
with the goal of later being able to use a common shared tag bitmap across
all HW contexts in a set.

Signed-off-by: John Garry
Tested-by: Don Brace #SCSI resv cmds patches used
Tested-by: Douglas Gilbert
Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe

John Garry
2020-09-04 05:20:47 +0800
1c0706a70 blk-mq: Pass flags for tag init/free

Pass hctx/tagset flags argument down to blk_mq_init_tags() and
blk_mq_free_tags() for selective init/free.

For now, make it include the alloc policy flag, which can be evaluated
when needed (in blk_mq_init_tags()).

Signed-off-by: John Garry
Tested-by: Douglas Gilbert
Signed-off-by: Jens Axboe

John Garry
2020-09-04 05:20:46 +0800
51db1c37e blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHARED

BLK_MQ_F_TAG_SHARED actually means that tags are shared among request
queues, all of which should belong to LUNs attached to the same HBA.

So rename it to make the point explicit.

[jpg: rebase a few times, add rnbd-clt.c change]

Suggested-by: Bart Van Assche
Signed-off-by: Ming Lei
Signed-off-by: John Garry
Tested-by: Douglas Gilbert
Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe

Ming Lei
2020-09-04 05:20:46 +0800

09 Jul, 2020

1 commit

568f27006 blk-mq: centralise related handling into blk_mq_get_driver_tag

Move the .nr_active update and request assignment into blk_mq_get_driver_tag();
both are natural to do while getting the driver tag.

Meanwhile, the blk-flush related code is simplified, and a flush request no
longer needs to update the request table manually.

Signed-off-by: Ming Lei
Cc: Christoph Hellwig
Signed-off-by: Jens Axboe

Ming Lei
2020-07-09 06:06:42 +0800

02 Jul, 2020

1 commit

4e2f62e56 Revert "blk-mq: put driver tag when this request is completed"

This reverts the following commits:

37f4a24c2469a10a4c16c641671bd766e276cf9f
723bf178f158abd1ce6069cb049581b3cb003aab
36a3df5a4574d5ddf59804fcd0c4e9654c514d9a

The last one is the culprit, but we have to go a bit deeper to get this
to revert cleanly. There's been a report that this breaks some MMC
setups [1], and also causes an issue with swap [2]. Until this can be
figured out, revert the offending commits.

[1] https://lore.kernel.org/linux-block/57fb09b1-54ba-f3aa-f82c-d709b0e6b281@samsung.com/
[2] https://lore.kernel.org/linux-block/20200702043721.GA1087@lca.pw/

Reported-by: Marek Szyprowski
Reported-by: Qian Cai
Signed-off-by: Jens Axboe

Jens Axboe
2020-07-02 12:58:32 +0800

01 Jul, 2020

2 commits

37f4a24c2 blk-mq: centralise related handling into blk_mq_get_driver_tag

Move the .nr_active update and request assignment into blk_mq_get_driver_tag();
both are natural to do while getting the driver tag.

Meanwhile, the blk-flush related code is simplified, and a flush request no
longer needs to update the request table manually.

Signed-off-by: Ming Lei
Cc: Christoph Hellwig
Signed-off-by: Jens Axboe

Ming Lei
2020-07-01 02:57:59 +0800
570e9b73b blk-mq: move blk_mq_get_driver_tag into blk-mq.c

blk_mq_get_driver_tag() is only used by blk-mq.c and is supposed to
stay in blk-mq.c, so move it there in preparation for cleaning up the
get/put driver tag code.

Meanwhile, hctx_may_queue() is moved to a header file, which is fine
since it is always defined as inline.

No functional change.

Signed-off-by: Ming Lei
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Cc: Christoph Hellwig
Signed-off-by: Jens Axboe

Ming Lei
2020-07-01 02:57:59 +0800

07 Jun, 2020

1 commit

d94ecfc39 blk-mq: split out a __blk_mq_get_driver_tag helper

Allocation of the driver tag in the case of using a scheduler shares very
little code with the "normal" tag allocation. Split out a new helper to
streamline this path, and untangle it from the complex normal tag
allocation.

This also avoids failing driver tag allocation because of an inactive hctx
during CPU hotplug, and fixes a potential hang.

Fixes: bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline")
Signed-off-by: Ming Lei
Signed-off-by: Christoph Hellwig
Tested-by: John Garry
Cc: Dongli Zhang
Cc: Hannes Reinecke
Cc: Daniel Wagner
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-06-07 22:56:50 +0800

30 May, 2020

2 commits

602380d28 blk-mq: add blk_mq_all_tag_iter

Add a new blk_mq_all_tag_iter function to iterate over all allocated
scheduler tags and driver tags. This is more flexible than the existing
blk_mq_all_tag_busy_iter function as it allows the callers to do whatever
they want with allocated requests instead of being limited to started
requests.

It will be used to implement draining allocated requests on specified
hctx in this patchset.

[hch: switch from the two booleans to a more readable flags field and
consolidate the tags iter functions]
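
As a rough illustration of the difference between iterating all allocated
requests and only started ones, here is a minimal userspace sketch. It is
not the kernel code: struct request, tags_iter(), count_rq() and the
ITER_STARTED_ONLY flag are hypothetical names chosen for this example, and
a plain array stands in for the tag maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: restrict the walk to requests that were started. */
#define ITER_STARTED_ONLY (1u << 0)

/* Hypothetical request state for the model. */
struct request {
	bool allocated; /* a tag has been assigned */
	bool started;   /* the request was issued to the driver */
};

typedef void (*iter_fn)(struct request *rq, void *data);

/*
 * Call fn on every allocated request; with ITER_STARTED_ONLY set,
 * skip requests that have been allocated but not yet started.
 */
static void tags_iter(struct request *rqs, unsigned int depth,
		      iter_fn fn, void *data, unsigned int flags)
{
	assert(fn != NULL);
	for (unsigned int i = 0; i < depth; i++) {
		if (!rqs[i].allocated)
			continue;
		if ((flags & ITER_STARTED_ONLY) && !rqs[i].started)
			continue;
		fn(&rqs[i], data);
	}
}

/* Example callback: count the requests visited. */
static void count_rq(struct request *rq, void *data)
{
	(void)rq;
	(*(unsigned int *)data)++;
}
```

A single flags argument keeps one walker for both cases, which mirrors the
note above about replacing two booleans with a flags field.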

Signed-off-by: Ming Lei
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Daniel Wagner
Reviewed-by: Bart van Assche
Signed-off-by: Jens Axboe

Ming Lei
2020-05-30 00:23:25 +0800
419c3d5e8 blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG

To prepare for wider use of this constant give it a more applicable name.

Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Johannes Thumshirn
Reviewed-by: Bart Van Assche
Reviewed-by: Daniel Wagner
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-05-30 00:23:25 +0800

27 Feb, 2020

1 commit

cae740a04 blk-mq: Remove some unused function arguments

The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(),
blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove
it.

Overall obj code size shows a minor reduction, before:
   text    data     bss     dec     hex filename
  27306    1312       0   28618    6fca block/blk-mq.o
   4303     272       0    4575    11df block/blk-mq-tag.o

after:
  27282    1312       0   28594    6fb2 block/blk-mq.o
   4311     272       0    4583    11e7 block/blk-mq-tag.o

Reviewed-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
Signed-off-by: John Garry
--
This minor patch had been carried as part of the blk-mq shared tags RFC,
I'd rather not carry it anymore as it required rebasing, so now or never..
Signed-off-by: Jens Axboe

John Garry
2020-02-27 01:34:41 +0800

14 Nov, 2019

1 commit

cb711b91a blk-mq: Delete blk_mq_has_free_tags() and blk_mq_can_queue()

These functions are not referenced, so delete them.

Signed-off-by: John Garry
Signed-off-by: Jens Axboe

John Garry
2019-11-14 03:50:38 +0800

15 Nov, 2017

1 commit

e2c5923c3 Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block

Pull core block layer updates from Jens Axboe:
"This is the main pull request for block storage for 4.15-rc1.

Nothing out of the ordinary in here, and no API changes or anything
like that. Just various new features for drivers, core changes, etc.
In particular, this pull request contains:

- A patch series from Bart, closing the hole on blk/scsi-mq queue
quiescing.

- A series from Christoph, building towards hidden gendisks (for
multipath) and ability to move bio chains around.

- NVMe
- Support for native multipath for NVMe (Christoph).
- Userspace notifications for AENs (Keith).
- Command side-effects support (Keith).
- SGL support (Chaitanya Kulkarni)
- FC fixes and improvements (James Smart)
- Lots of fixes and tweaks (Various)

- bcache
- New maintainer (Michael Lyle)
- Writeback control improvements (Michael)
- Various fixes (Coly, Elena, Eric, Liang, et al)

- lightnvm updates, mostly centered around the pblk interface
(Javier, Hans, and Rakesh).

- Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

- Writeback series that fix the much discussed hundreds of millions
of sync-all units. This goes all the way, as discussed previously
(me).

- Fix for missing wakeup on writeback timer adjustments (Yafang
Shao).

- Fix laptop mode on blk-mq (me).

- {mq,name} tuple lookup for IO schedulers, allowing us to have
alias names. This means you can use 'deadline' on both !mq and on
mq (where it's called mq-deadline). (me).

- blktrace race fix, oopsing on sg load (me).

- blk-mq optimizations (me).

- Obscure waitqueue race fix for kyber (Omar).

- NBD fixes (Josef).

- Disable writeback throttling by default on bfq, like we do on cfq
(Luca Miccio).

- Series from Ming that enable us to treat flush requests on blk-mq
like any other request. This is a really nice cleanup.

- Series from Ming that improves merging on blk-mq with schedulers,
getting us closer to flipping the switch on scsi-mq again.

- BFQ updates (Paolo).

- blk-mq atomic flags memory ordering fixes (Peter Z).

- Loop cgroup support (Shaohua).

- Lots of minor fixes from lots of different folks, both for core and
driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
nvme: fix visibility of "uuid" ns attribute
blk-mq: fixup some comment typos and lengths
ide: ide-atapi: fix compile error with defining macro DEBUG
blk-mq: improve tag waiting setup for non-shared tags
brd: remove unused brd_mutex
blk-mq: only run the hardware queue if IO is pending
block: avoid null pointer dereference on null disk
fs: guard_bio_eod() needs to consider partitions
xtensa/simdisk: fix compile error
nvme: expose subsys attribute to sysfs
nvme: create 'slaves' and 'holders' entries for hidden controllers
block: create 'slaves' and 'holders' entries for hidden gendisks
nvme: also expose the namespace identification sysfs files for mpath nodes
nvme: implement multipath access to nvme subsystems
nvme: track shared namespaces
nvme: introduce a nvme_ns_ids structure
nvme: track subsystems
block, nvme: Introduce blk_mq_req_flags_t
block, scsi: Make SCSI quiesce and resume work reliably
block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
...

Linus Torvalds
2017-11-15 07:32:19 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

01 Oct, 2017

1 commit

5385fa47d blk-mq-tag: kill unused tag enums

We don't have any notion of a tagging cache anymore, and haven't
for a long time. Kill off the unused enums.

Signed-off-by: Jens Axboe

Jens Axboe
2017-10-01 15:26:21 +0800

02 Mar, 2017

1 commit

415b806de blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset

Signed-off-by: Sagi Grimberg

Modified by me to also check at driver tag allocation time if the
original request was reserved, so we can be sure to allocate a
properly reserved tag at that point in time, too.

Signed-off-by: Jens Axboe

Sagi Grimberg
2017-03-02 23:56:04 +0800

27 Jan, 2017

1 commit

d96b37c0a blk-mq: move tags and sched_tags info from sysfs to debugfs

These are very tied to the blk-mq tag implementation, so exposing them
to sysfs isn't a great idea. Move the debugging information to debugfs
and add basic entries for the number of tags and the number of reserved
tags to sysfs.

Reviewed-by: Hannes Reinecke
Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2017-01-27 23:17:44 +0800

21 Jan, 2017

1 commit

70f36b600 blk-mq: allow resize of scheduler requests

Add support for growing the tags associated with a hardware queue, for
the scheduler tags. Currently we only support resizing within the
limits of the original depth, change that so we can grow it as well by
allocating and replacing the existing scheduler tag set.

This is similar to how we could increase the software queue depth with
the legacy IO stack and schedulers.

Signed-off-by: Jens Axboe
Reviewed-by: Omar Sandoval

Jens Axboe
2017-01-21 00:05:53 +0800

18 Jan, 2017

2 commits

2af8cbe30 blk-mq: split tag ->rqs[] into two

This is in preparation for having two sets of tags available. For
that we need a static index, and a dynamically assignable one.
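
One way to picture a static index next to a dynamically assignable one is
the userspace sketch below. It is an illustration under assumptions, not the
kernel layout: struct tag_table, the static_rqs[] and rqs[] field names, and
the table_* helpers are hypothetical here. Each request has a fixed slot
for its whole lifetime, while a second table maps a dynamically assigned tag
to whichever request currently holds it.

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH 4

/* Hypothetical request for the model. */
struct request {
	int sched_tag;  /* fixed slot in static_rqs[], never changes */
	int driver_tag; /* slot in rqs[] while dispatched, otherwise -1 */
};

struct tag_table {
	struct request storage[DEPTH];
	struct request *static_rqs[DEPTH]; /* static index: slot -> request */
	struct request *rqs[DEPTH];        /* dynamic index: tag -> active request */
};

static void table_init(struct tag_table *t)
{
	for (int i = 0; i < DEPTH; i++) {
		t->storage[i].sched_tag = i;
		t->storage[i].driver_tag = -1;
		t->static_rqs[i] = &t->storage[i];
		t->rqs[i] = NULL;
	}
}

/* Bind a request to a dynamically chosen tag. */
static void table_assign(struct tag_table *t, struct request *rq, int driver_tag)
{
	assert(driver_tag >= 0 && driver_tag < DEPTH);
	assert(t->rqs[driver_tag] == NULL);
	rq->driver_tag = driver_tag;
	t->rqs[driver_tag] = rq;
}

/* Drop the dynamic binding; the static slot is untouched. */
static void table_release(struct tag_table *t, struct request *rq)
{
	assert(rq->driver_tag >= 0 && t->rqs[rq->driver_tag] == rq);
	t->rqs[rq->driver_tag] = NULL;
	rq->driver_tag = -1;
}
```

In this model a request in static slot 3 can be dispatched under tag 0 and
later under a different tag, without ever moving in static_rqs[].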

Signed-off-by: Jens Axboe
Reviewed-by: Omar Sandoval

Jens Axboe
2017-01-18 01:04:15 +0800
4941115be blk-mq-tag: cleanup the normal/reserved tag allocation

This is in preparation for having another tag set available. Cleanup
the parameters, and allow passing in of tags for blk_mq_put_tag().

Signed-off-by: Jens Axboe
[hch: even more cleanups]
Signed-off-by: Christoph Hellwig
Reviewed-by: Omar Sandoval

Jens Axboe
2017-01-18 01:03:59 +0800

10 Oct, 2016

1 commit

12e3d3cdd Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block

Pull blk-mq irq/cpu mapping updates from Jens Axboe:
"This is the block-irq topic branch for 4.9-rc. It's mostly from
Christoph, and it allows drivers to specify their own mappings, and
more importantly, to share the blk-mq mappings with the IRQ affinity
mappings. It's a good step towards making this work better out of the
box"

* 'for-4.9/block-irq' of git://git.kernel.dk/linux-block:
blk_mq: linux/blk-mq.h does not include all the headers it depends on
blk-mq: kill unused blk_mq_create_mq_map()
blk-mq: get rid of the cpumask in struct blk_mq_tags
nvme: remove the post_scan callout
nvme: switch to use pci_alloc_irq_vectors
blk-mq: provide a default queue mapping for PCI device
blk-mq: allow the driver to pass in a queue mapping
blk-mq: remove ->map_queue
blk-mq: only allocate a single mq_map per tag_set
blk-mq: don't redistribute hardware queues on a CPU hotplug event

Linus Torvalds
2016-10-10 08:29:33 +0800

17 Sep, 2016

4 commits

98d95416d sbitmap: randomize initial alloc_hint values

In order to get good cache behavior from a sbitmap, we want each CPU to
stick to its own cacheline(s) as much as possible. This might happen
naturally as the bitmap gets filled up and the alloc_hint values spread
out, but we really want this behavior from the start. blk-mq apparently
intended to do this, but the code to do this was never wired up. Get rid
of the dead code and make it part of the sbitmap library.
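
The effect of randomized starting hints can be sketched in userspace as
follows. This is a simplified model and not the sbitmap code: struct
hinted_bitmap, hinted_init(), hinted_get(), the fixed NR_CPUS value and the
use of rand() are all assumptions made for the example. Each CPU starts its
search at its own pseudo-random offset, so CPUs tend to touch different
parts of the bitmap from the first allocation onward.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4    /* hypothetical CPU count for the model */
#define MAX_DEPTH 64
#define NO_BIT (-1)

struct hinted_bitmap {
	bool used[MAX_DEPTH];
	unsigned int depth;
	unsigned int alloc_hint[NR_CPUS]; /* per-CPU search start */
};

/* Give every CPU a pseudo-random starting hint within the bitmap. */
static void hinted_init(struct hinted_bitmap *b, unsigned int depth,
			unsigned int seed)
{
	assert(depth > 0 && depth <= MAX_DEPTH);
	b->depth = depth;
	for (unsigned int i = 0; i < depth; i++)
		b->used[i] = false;
	srand(seed);
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		b->alloc_hint[cpu] = (unsigned int)rand() % depth;
}

/* Search from this CPU's hint, wrapping around once. */
static int hinted_get(struct hinted_bitmap *b, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	for (unsigned int i = 0; i < b->depth; i++) {
		unsigned int bit = (b->alloc_hint[cpu] + i) % b->depth;

		if (!b->used[bit]) {
			b->used[bit] = true;
			/* Continue after this bit on the next allocation. */
			b->alloc_hint[cpu] = (bit + 1) % b->depth;
			return (int)bit;
		}
	}
	return NO_BIT;
}
```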

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2016-09-17 22:39:14 +0800
f4a644db8 sbitmap: push alloc policy into sbitmap_queue

Again, there's no point in passing this in every time. Make it part of
struct sbitmap_queue and clean up the API.

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2016-09-17 22:39:12 +0800
40aabb674 sbitmap: push per-cpu last_tag into sbitmap_queue

Allocating your own per-cpu allocation hint separately makes for an
awkward API. Instead, allocate the per-cpu hint as part of the struct
sbitmap_queue. There's no point for a struct sbitmap_queue without the
cache, but you can still use a bare struct sbitmap.

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2016-09-17 22:39:10 +0800
88459642c blk-mq: abstract tag allocation out into sbitmap library

This is a generally useful data structure, so make it available to
anyone else who might want to use it. It's also a nice cleanup
separating the allocation logic from the rest of the tag handling logic.

The code is behind a new Kconfig option, CONFIG_SBITMAP, which is only
selected by CONFIG_BLOCK for now.

This should be a complete noop functionality-wise.

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2016-09-17 22:38:44 +0800

15 Sep, 2016

1 commit

1b157939f blk-mq: get rid of the cpumask in struct blk_mq_tags

Unused now that NVMe sets up irq affinity before calling into blk-mq.

Signed-off-by: Christoph Hellwig
Reviewed-by: Keith Busch
Signed-off-by: Jens Axboe

Christoph Hellwig
2016-09-15 22:42:03 +0800

01 Oct, 2015

1 commit

0bf6cd5b9 blk-mq: factor out a helper to iterate all tags for a request_queue

And replace the blk_mq_tag_busy_iter with it - the driver use has been
replaced with a new helper a while ago, and internal to the block we
only need the new version.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2015-10-01 16:10:57 +0800

15 Aug, 2015

1 commit

0048b4837 blk-mq: fix race between timeout and freeing request

Inside the timeout handler, blk_mq_tag_to_rq() is called
to retrieve the request from a tag. This is obviously
wrong because the request can be freed at any time and some
fields of the request can't be trusted, so a kernel oops
might be triggered[1].

Currently wrt. blk_mq_tag_to_rq(), the only special case is
that the flush request can share same tag with the request
cloned from, and the two requests can't be active at the same
time, so this patch fixes the above issue by updating tags->rqs[tag]
with the active request(either flush rq or the request cloned
from) of the tag.

Also blk_mq_tag_to_rq() gets much simplified with this patch.

Given blk_mq_tag_to_rq() is mainly for drivers and the caller must
make sure the request can't be freed, so in bt_for_each() this
helper is replaced with tags->rqs[tag].

[1] kernel oops log
[ 439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158
[ 439.697162] IP: [] blk_mq_tag_to_rq+0x21/0x6e
[ 439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0
[ 439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 439.700653] Dumping ftrace buffer:
[ 439.700653] (ftrace buffer empty)
[ 439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw
[ 439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265
[ 439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000
[ 439.730500] RIP: 0010:[] [] blk_mq_tag_to_rq+0x21/0x6e
[ 439.730500] RSP: 0018:ffff880819203da0 EFLAGS: 00010283
[ 439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002
[ 439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000
[ 439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000
[ 439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202
[ 439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00
[ 439.730500] FS: 00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000
[ 439.730500] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0
[ 439.730500] Stack:
[ 439.730500] 0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104
[ 439.755663] ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80
[ 439.755663] Call Trace:
[ 439.755663]
[ 439.755663] [] bt_for_each+0x6e/0xc8
[ 439.755663] [] ? blk_mq_rq_timed_out+0x6a/0x6a
[ 439.755663] [] ? blk_mq_rq_timed_out+0x6a/0x6a
[ 439.755663] [] blk_mq_tag_busy_iter+0x55/0x5e
[ 439.755663] [] ? blk_mq_bio_to_request+0x38/0x38
[ 439.755663] [] blk_mq_rq_timer+0x5d/0xd4
[ 439.755663] [] call_timer_fn+0xf7/0x284
[ 439.755663] [] ? call_timer_fn+0x5/0x284
[ 439.755663] [] ? blk_mq_bio_to_request+0x38/0x38
[ 439.755663] [] run_timer_softirq+0x1ce/0x1f8
[ 439.755663] [] __do_softirq+0x181/0x3a4
[ 439.755663] [] irq_exit+0x40/0x94
[ 439.755663] [] smp_apic_timer_interrupt+0x33/0x3e
[ 439.755663] [] apic_timer_interrupt+0x84/0x90
[ 439.755663]
[ 439.755663] [] ? _raw_spin_unlock_irq+0x32/0x4a
[ 439.755663] [] finish_task_switch+0xe0/0x163
[ 439.755663] [] ? finish_task_switch+0xa2/0x163
[ 439.755663] [] __schedule+0x469/0x6cd
[ 439.755663] [] schedule+0x82/0x9a
[ 439.789267] [] signalfd_read+0x186/0x49a
[ 439.790911] [] ? wake_up_q+0x47/0x47
[ 439.790911] [] __vfs_read+0x28/0x9f
[ 439.790911] [] ? __fget_light+0x4d/0x74
[ 439.790911] [] vfs_read+0x7a/0xc6
[ 439.790911] [] SyS_read+0x49/0x7f
[ 439.790911] [] entry_SYSCALL_64_fastpath+0x12/0x6f
[ 439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89
f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b
53 38 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10
[ 439.790911] RIP [] blk_mq_tag_to_rq+0x21/0x6e
[ 439.790911] RSP
[ 439.790911] CR2: 0000000000000158
[ 439.790911] ---[ end trace d40af58949325661 ]---

Cc:
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe

Ming Lei
2015-08-15 23:45:21 +0800

02 Jun, 2015

1 commit

f26cdc853 blk-mq: Shared tag enhancements

Storage controllers may expose multiple block devices that share hardware
resources managed by blk-mq. This patch enhances the shared tags so a
low-level driver can access the shared resources not tied to the unshared
h/w contexts. This way the LLD can dynamically add and delete disks and
request queues without having to track all the request_queue hctx's to
iterate outstanding tags.

Signed-off-by: Keith Busch
Signed-off-by: Jens Axboe

Keith Busch
2015-06-02 04:35:56 +0800

24 Jan, 2015

1 commit

24391c0dc blk-mq: add tag allocation policy

This is the blk-mq part to support tag allocation policy. The default
allocation policy isn't changed (though it's not a strict FIFO). The new
policy is round-robin for libata, but it is a best-effort implementation: if
multiple tasks are competing, the tags returned will be mixed (which is
unavoidable even with !mq, as requests from different tasks can be
mixed in the queue).
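
The difference between the two policies can be sketched with a small
userspace model. This is not the blk-mq implementation: struct tag_pool,
pool_get() and pool_put() are hypothetical names, and the default policy is
simplified here to "lowest free tag", which is only an approximation of the
real behaviour.

```c
#include <assert.h>
#include <stdbool.h>

#define DEPTH 4
#define NO_TAG (-1)

/* Hypothetical tag pool for the model. */
struct tag_pool {
	bool used[DEPTH];
	unsigned int next; /* where the next round-robin search starts */
	bool round_robin;  /* false: simplified default, lowest free tag */
};

static int pool_get(struct tag_pool *p)
{
	/* Round-robin continues after the last tag handed out. */
	unsigned int start = p->round_robin ? p->next : 0;

	for (unsigned int i = 0; i < DEPTH; i++) {
		unsigned int tag = (start + i) % DEPTH;

		if (!p->used[tag]) {
			p->used[tag] = true;
			p->next = (tag + 1) % DEPTH;
			return (int)tag;
		}
	}
	return NO_TAG;
}

static void pool_put(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < DEPTH && p->used[tag]);
	p->used[tag] = false;
}
```

In this model, allocating, freeing, and allocating again returns tag 0 twice
under the simplified default, but tags 0 and then 1 under round-robin, so a
just-freed tag is not reused immediately.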

Cc: Jens Axboe
Cc: Tejun Heo
Cc: Christoph Hellwig
Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe

Shaohua Li
2015-01-24 05:18:00 +0800

01 Jan, 2015

1 commit

aed3ea94b block: wake up waiters when a queue is marked dying

If it's dying, we can't expect new requests to complete and come
in and wake up other tasks waiting for requests. So after we
have marked it as dying, wake up everybody currently waiting
for a request. Once they wake, they will retry their allocation
and fail appropriately due to the state of the queue.

Tested-by: Keith Busch
Signed-off-by: Jens Axboe

Jens Axboe
2015-01-01 00:39:16 +0800

18 Jun, 2014

1 commit

8537b1203 blk-mq: bitmap tag: fix races on shared ::wake_index fields

Fix racy updates of shared blk_mq_bitmap_tags::wake_index
and blk_mq_hw_ctx::wake_index fields.

Cc: Ming Lei
Signed-off-by: Alexander Gordeev
Signed-off-by: Jens Axboe

Alexander Gordeev
2014-06-18 13:12:35 +0800

04 Jun, 2014

1 commit

cb96a42cc blk-mq: fix schedule from atomic context

blk_mq_put_ctx() has to be called before io_schedule() in
bt_get().

This patch fixes the problem by taking similar approach from
percpu_ida allocation for the situation.

Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe

Ming Lei
2014-06-04 11:04:39 +0800

28 May, 2014

1 commit

a3bd77567 blk-mq: remove blk_mq_wait_for_tags

The current logic for blocking tag allocation is rather confusing, as we
first allocate and then free a tag again in blk_mq_wait_for_tags, just
to attempt a non-blocking allocation and then repeat if someone else
managed to grab the tag before us.

Instead change blk_mq_alloc_request_pinned to simply do a blocking tag
allocation itself and use the request we get back from it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2014-05-28 23:49:23 +0800

24 May, 2014

1 commit

edf866b38 blk-mq: export blk_mq_tag_busy_iter

Export the blk-mq in-flight tag iterator for driver consumption.
This is particularly useful in exception paths or SRSI where
in-flight IOs need to be cancelled and/or reissued. The NVMe driver
conversion will use this.

Signed-off-by: Sam Bradshaw
Signed-off-by: Matias Bjørling
Signed-off-by: Jens Axboe

Sam Bradshaw
2014-05-24 03:30:16 +0800

21 May, 2014

1 commit

e3a2b3f93 blk-mq: allow changing of queue depth through sysfs

For request_fn based devices, the block layer exports a 'nr_requests'
file through sysfs to allow adjusting of queue depth on the fly.
Currently this returns -EINVAL for blk-mq, since it's not wired up.
Wire this up for blk-mq, so that it now also allows dynamic
adjustments of the allowed queue depth for any given block device
managed by blk-mq.

Signed-off-by: Jens Axboe

Jens Axboe
2014-05-21 01:49:02 +0800

20 May, 2014

2 commits

39a9f97e5 Merge branch 'for-3.16/blk-mq-tagging' into for-3.16/core

Signed-off-by: Jens Axboe

Conflicts:
block/blk-mq-tag.c

Jens Axboe
2014-05-20 01:52:35 +0800
e93ecf602 blk-mq: move the cache friendly bitmap type out of blk-mq-tag

We will use it for the pending list in blk-mq core as well.

Signed-off-by: Jens Axboe

Jens Axboe
2014-05-20 01:02:47 +0800