13 Jan, 2019

1 commit

  • commit 7211aef86f79583e59b88a0aba0bc830566f7e8e upstream.

    For a zoned block device using mq-deadline, if a write request for a
    zone is received while another write was already dispatched for the same
    zone, dd_dispatch_request() will return NULL and the newly inserted
    write request is kept in the scheduler queue waiting for the ongoing
    zone write to complete. With this behavior, when no other request has
    been dispatched, rq_list in blk_mq_sched_dispatch_requests() is empty
    and blk_mq_sched_mark_restart_hctx() is not called. This in turn
    causes the call of blk_mq_sched_restart() from __blk_mq_free_request()
    to not run the queue when the already dispatched write request
    completes. The newly inserted write request stays stuck in the
    scheduler queue until eventually another request is submitted.

    This problem does not affect SCSI disks, as the SCSI stack handles
    queue restart on request completion. However, this problem can be
    triggered with the null_blk driver with zoned mode enabled.

    Fix this by always requesting a queue restart in dd_dispatch_request()
    if no request was dispatched while WRITE requests are queued.

    Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
    Cc:
    Signed-off-by: Damien Le Moal
    Signed-off-by: Greg Kroah-Hartman

    Add missing export of blk_mq_sched_restart()

    Signed-off-by: Jens Axboe

    Damien Le Moal
     

21 Aug, 2018

1 commit

  • Currently, when updating nr_hw_queues, the IO scheduler's init_hctx
    will be invoked before the mapping between ctx and hctx is adapted
    correctly by blk_mq_map_swqueue. The IO scheduler's init_hctx (kyber)
    may depend on this mapping, get a wrong result, and finally panic.
    A simple way to fix this is to switch the IO scheduler to 'none'
    before updating nr_hw_queues, and then switch it back afterwards.
    blk_mq_sched_init_/exit_hctx are removed since nobody uses them
    any more.

    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
     

18 Jul, 2018

1 commit

  • In the case of the 'none' io scheduler, when the hw queue isn't busy,
    it isn't necessary to enqueue a request to the sw queue and dequeue
    it again, because the request can be submitted to the hw queue right
    away without extra cost. Meanwhile there shouldn't be many requests
    in the sw queue, so we don't need to worry about the effect on IO
    merging.

    There are still some single hw queue SCSI HBAs (HPSA, megaraid_sas,
    ...) which may connect high performance devices, so 'none' is often
    required for obtaining good performance.

    This patch improves IOPS and decreases CPU utilization on
    megaraid_sas, per Kashyap's test.

    Cc: Kashyap Desai
    Cc: Laurence Oberman
    Cc: Omar Sandoval
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Hannes Reinecke
    Reported-by: Kashyap Desai
    Tested-by: Kashyap Desai
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

09 Jul, 2018

3 commits

  • It isn't efficient to dequeue requests one by one from the sw queue,
    but we have to do that when the queue is busy, for better merge
    performance.

    This patch uses an Exponential Weighted Moving Average (EWMA) to
    figure out if the queue is busy, and then only dequeues requests one
    by one from the sw queue when the queue is busy.

    Fixes: b347689ffbca ("blk-mq-sched: improve dispatching from sw queue")
    Cc: Kashyap Desai
    Cc: Laurence Oberman
    Cc: Omar Sandoval
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Hannes Reinecke
    Reported-by: Kashyap Desai
    Tested-by: Kashyap Desai
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Only attempt to merge a bio if ctx->rq_list isn't empty, because:

    1) for a high-performance SSD, dispatch will succeed most of the
    time, so there may be nothing left in ctx->rq_list; by not trying to
    merge over the sw queue when it is empty, we save one acquisition of
    ctx->lock

    2) we can't expect good merge performance on the per-cpu sw queue
    anyway, and missing one merge on the sw queue won't be a big deal
    since tasks can be scheduled from one CPU to another.

    Cc: Laurence Oberman
    Cc: Omar Sandoval
    Cc: Bart Van Assche
    Tested-by: Kashyap Desai
    Reported-by: Kashyap Desai
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • We have to remove synchronize_rcu() from blk_queue_cleanup(),
    otherwise a long delay can be caused during lun probe. To remove
    it, we have to avoid iterating set->tag_list in the IO path, e.g.
    in blk_mq_sched_restart().

    This patch reverts 5b79413946d (Revert "blk-mq: don't handle
    TAG_SHARED in restart"). We have fixed enough IO hang issues, and
    there isn't any reason to restart all queues sharing one tag set
    any more, for the following reasons:

    1) blk-mq core can deal with shared-tags case well via blk_mq_get_driver_tag(),
    which can wake up queues waiting for driver tag.

    2) SCSI is a bit special because it may return BLK_STS_RESOURCE if
    the queue, target or host isn't ready, but SCSI's built-in restart
    covers all these cases well; see scsi_end_request(): the queue will
    be rerun after any request initiated from this host/target is
    completed.

    In my test on scsi_debug(8 luns), this patch may improve IOPS by 20% ~ 30%
    when running I/O on these 8 luns concurrently.

    Fixes: 705cda97ee3a ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
    Cc: Omar Sandoval
    Cc: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Martin K. Petersen
    Cc: linux-scsi@vger.kernel.org
    Reported-by: Andrew Jones
    Tested-by: Andrew Jones
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

03 Jun, 2018

1 commit

  • We set up q->nr_requests when switching to a new scheduler, but we
    don't do it for 'none', so q->nr_requests may not be correct for
    'none'.

    This patch fixes this issue by always updating 'nr_requests' when
    switching to 'none'.

    Cc: Marco Patalano
    Cc: "Ewan D. Milne"
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

01 Jun, 2018

2 commits


31 May, 2018

1 commit


02 Feb, 2018

1 commit


18 Jan, 2018

1 commit


05 Jan, 2018

1 commit

  • Commit de1482974080
    ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")
    changes the function to return bool type, and then commit 1f460b63d4b3
    ("blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE")
    changes it back to void, but the comment remains.

    Signed-off-by: Liu Bo
    Signed-off-by: Jens Axboe

    Liu Bo
     

11 Nov, 2017

2 commits

  • Currently we are inconsistent in when we decide to run the queue. Using
    blk_mq_run_hw_queues() we check if the hctx has pending IO before
    running it, but we don't do that from the individual queue run function,
    blk_mq_run_hw_queue(). This results in a lot of extra and pointless
    queue runs, potentially, on flush requests and (much worse) on tag
    starvation situations. This is observable just looking at top output,
    with lots of kworkers active. For the !async runs, it just adds to the
    CPU overhead of blk-mq.

    Move the has-pending check into the run function instead of having
    callers do it.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This reverts commit 358a3a6bccb74da9d63a26b2dd5f09f1e9970e0b.

    We have cases that aren't covered 100% in the drivers, so for now
    we have to retain the shared tag restart loops.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

05 Nov, 2017

3 commits

  • The idea behind it is simple:

    1) for the none scheduler, a driver tag has to be borrowed for the
    flush rq, otherwise we may run out of tags and cause an IO hang. And
    get/put driver tag is actually a noop for none, so reordering tags
    isn't necessary at all.

    2) for a real I/O scheduler, we need not allocate a driver tag upfront
    for flush rq. It works just fine to follow the same approach as
    normal requests: allocate driver tag for each rq just before calling
    ->queue_rq().

    One driver visible change is that the driver tag isn't shared in the
    flush request sequence. That won't be a problem, since we always do that
    in legacy path.

    Then flush rq need not be treated specially wrt. get/put driver tag.
    This cleans up the code - for instance, reorder_tags_to_front() can be
    removed, and we needn't worry about request ordering in dispatch list
    for avoiding I/O deadlock.

    Also we have to put the driver tag before requeueing.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • In case of IO scheduler we always pre-allocate one driver tag before
    calling blk_insert_flush(), and flush request will be marked as
    RQF_FLUSH_SEQ once it is in flush machinery.

    So if RQF_FLUSH_SEQ isn't set, we call blk_insert_flush() to handle
    the request, otherwise the flush request is dispatched to ->dispatch
    list directly.

    This is a preparation patch for not preallocating a driver tag for flush
    requests, and for not treating flush requests as a special case. This is
    similar to what the legacy path does.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • It is enough to just check if we can get the budget via .get_budget().
    We don't need to deal with device state changes in .get_budget().

    For SCSI, one issue to be fixed is that we have to call
    scsi_mq_uninit_cmd() to free allocated resources if the SCSI device
    fails to handle the request. It isn't enough to simply call
    blk_mq_end_request() to do that if this request is marked as
    RQF_DONTPREP.

    Fixes: 0df21c86bdbf ("scsi: implement .get_budget and .put_budget for blk-mq")
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

01 Nov, 2017

6 commits

  • SCSI restarts its queue in scsi_end_request() automatically, so we don't
    need to handle this case in blk-mq.

    Especially, no request will be dequeued in this case, so we needn't
    worry about IO hangs caused by restart vs. dispatch.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Now restart is used in the following cases, and TAG_SHARED is for
    SCSI only.

    1) .get_budget() returns BLK_STS_RESOURCE
    - if resources at the target/host level aren't satisfied, this SCSI
    device will be added to shost->starved_list, and the whole queue will
    be rerun (via SCSI's built-in RESTART) in scsi_end_request() after
    any request initiated from this host/target is completed. Note that
    host level resources can't be an issue for blk-mq at all.

    - the same is true if resource in the queue level isn't satisfied.

    - if there isn't an outstanding request on this queue, then SCSI's
    RESTART can't work (blk-mq's can't work either); the queue will be
    run after SCSI_QUEUE_DELAY, and finally all starved sdevs will be
    handled by SCSI's RESTART when this request is finished

    2) scsi_dispatch_cmd() returns BLK_STS_RESOURCE
    - if there isn't an in-progress request on this queue, the queue
    will be run after SCSI_QUEUE_DELAY

    - otherwise, SCSI's RESTART covers the rerun.

    3) blk_mq_get_driver_tag() fails
    - BLK_MQ_S_TAG_WAITING covers the cross-queue RESTART for driver tag
    allocation.

    In a word, SCSI's built-in RESTART is enough to cover the queue
    rerun, and we don't need to pay special attention to TAG_SHARED
    wrt. restart.

    In my test on scsi_debug(8 luns), this patch improves IOPS by 20% ~ 30% when
    running I/O on these 8 luns concurrently.

    Also, Roman Pen reported that the current RESTART is very expensive,
    especially when there are lots of LUNs attached to one host; in his
    test, RESTART cut IOPS in half.

    Fixes: https://marc.info/?l=linux-kernel&m=150832216727524&w=2
    Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared")
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • SCSI devices use a host-wide tagset, and the shared driver tag space
    is often quite big. However, there is also a queue depth for each lun
    (.cmd_per_lun), which is often small; for example, on both lpfc and
    qla2xxx, .cmd_per_lun is just 3.

    So lots of requests may stay in the sw queue, and we always flush all
    of those belonging to the same hw queue and dispatch them all to the
    driver. Unfortunately it is easy to cause the queue to become busy
    because of the small .cmd_per_lun. Once these requests are flushed
    out, they have to stay in hctx->dispatch, no bio merging can happen
    on them, and sequential IO performance is harmed.

    This patch introduces blk_mq_dequeue_from_ctx for dequeuing a request
    from a sw queue, so that we can dispatch them in scheduler's way. We can
    then avoid dequeueing too many requests from sw queue, since we don't
    flush ->dispatch completely.

    This patch improves dispatching from sw queue by using the .get_budget
    and .put_budget callbacks.

    Reviewed-by: Omar Sandoval
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • For SCSI devices, there is often a per-request-queue depth, which needs
    to be respected before queuing one request.

    Currently blk-mq always dequeues the request first, then calls
    .queue_rq() to dispatch the request to lld. One obvious issue with this
    approach is that I/O merging may not be successful, because when the
    per-request-queue depth can't be respected, .queue_rq() has to return
    BLK_STS_RESOURCE, and then this request has to stay in hctx->dispatch
    list. This means it never gets a chance to be merged with other IO.

    This patch introduces .get_budget and .put_budget callback in blk_mq_ops,
    then we can try to get reserved budget first before dequeuing request.
    If the budget for queueing I/O can't be satisfied, we don't need to
    dequeue request at all. Hence the request can be left in the IO
    scheduler queue, for more merging opportunities.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • So that it becomes easy to support dispatching from the sw queue in
    the following patch.

    No functional change.

    Reviewed-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Suggested-by: Christoph Hellwig # for simplifying dispatch logic
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • When the hw queue is busy, we shouldn't take requests from the scheduler
    queue any more, otherwise it is difficult to do IO merge.

    This patch fixes the awful IO performance on some SCSI devices (lpfc,
    qla2xxx, ...) when mq-deadline/kyber is used, by not taking requests
    if the hw queue is busy.

    Reviewed-by: Omar Sandoval
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

04 Jul, 2017

1 commit

  • When mq-deadline is used, IOPS of sequential read and sequential
    write is observed to drop by more than 20% on sata (scsi-mq)
    devices, compared with using the 'none' scheduler.

    The reason is that the default nr_requests for the scheduler is
    too big for small queue-depth devices, and latency is increased
    a lot.

    Since the principle of taking 256 requests for the mq scheduler
    is based on a queue depth of 128, this patch changes it to
    double the size of min(hw queue_depth, 128).

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

23 Jun, 2017

1 commit


22 Jun, 2017

1 commit

  • If we have shared tags enabled, then every IO completion will trigger
    a full loop of every queue belonging to a tag set, and every hardware
    queue for each of those queues, even if nothing needs to be done.
    This causes a massive performance regression if you have a lot of
    shared devices.

    Instead of doing this huge full scan on every IO, add an atomic
    counter to the main queue that tracks how many hardware queues have
    been marked as needing a restart. With that, we can avoid looking for
    restartable queues, if we don't have to.

    Max reports that this restores performance. Before this patch, 4K
    IOPS was limited to 22-23K IOPS. With the patch, we are running at
    950-970K IOPS.

    Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared")
    Reported-by: Max Gurtovoy
    Tested-by: Max Gurtovoy
    Reviewed-by: Bart Van Assche
    Tested-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     

21 Jun, 2017

1 commit

  • Document the locking assumptions in functions that modify
    blk_mq_ctx.rq_list to make it easier for humans to verify
    this code.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Omar Sandoval
    Cc: Ming Lei
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

19 Jun, 2017

4 commits

  • It is required that no dispatch can happen any more once
    blk_mq_quiesce_queue() returns, and we don't have such a requirement
    on the APIs for stopping a queue.

    But blk_mq_quiesce_queue() still may not block/drain dispatch in
    the case of BLK_MQ_S_START_ON_RUN, so use the newly introduced
    QUEUE_FLAG_QUIESCED flag and evaluate it inside RCU read-side
    critical sections to fix this issue.

    Also, blk_mq_quiesce_queue() is implemented via stopping the queue,
    which limits its uses and makes races easy to cause, because any
    queue restart in other paths may break blk_mq_quiesce_queue(). With
    the introduced QUEUE_FLAG_QUIESCED flag, we don't need to depend on
    stopping the queue for quiescing any more.

    Signed-off-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • blk_mq_sched_assign_ioc now only handles the assignment of the ioc
    if the scheduler needs it (bfq only at the moment). The call to the
    per-request initializer is moved out so that it can be merged with
    a similar call for the kyber I/O scheduler.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Having these as separate helpers in a header really does not help
    readability, or my chances to refactor this code sanely.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Having them out of line in blk-mq-sched.c just makes the code flow
    unnecessarily complicated.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

27 May, 2017

1 commit


04 May, 2017

1 commit

  • This provides the infrastructure for schedulers to expose their internal
    state through debugfs. We add a list of queue attributes and a list of
    hctx attributes to struct elevator_type and wire them up when switching
    schedulers.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke

    Add missing seq_file.h header in blk-mq-debugfs.h

    Signed-off-by: Jens Axboe

    Omar Sandoval
     

02 May, 2017

1 commit


27 Apr, 2017

1 commit

  • At least one driver, mtip32xx, has a hard-coded dependency on the
    value of the reserved tag used for internal commands. While that
    should really be fixed up, for now let's ensure that we just bypass
    the scheduler tags for an allocation marked as reserved. They are
    used for housekeeping or error handling, so we can safely ignore
    them in the scheduler.

    Tested-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
     

21 Apr, 2017

1 commit


08 Apr, 2017

2 commits

  • Schedulers need to be informed when a hardware queue is added or removed
    at runtime so they can allocate/free per-hardware queue data. So,
    replace the blk_mq_sched_init_hctx_data() helper, which only makes sense
    at init time, with .init_hctx() and .exit_hctx() hooks.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • To improve scalability, if hardware queues are shared, restart
    a single hardware queue in round-robin fashion. Rename
    blk_mq_sched_restart_queues() to reflect the new semantics.
    Remove blk_mq_sched_mark_restart_queue() because this function
    has no callers. Remove flag QUEUE_FLAG_RESTART because this
    patch removes the code that uses this flag.

    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

07 Apr, 2017

1 commit

  • In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall
    back to the original scheduler. However, at this point, we've already
    torn down the original scheduler's tags, so this causes a crash. Doing
    the fallback like the legacy elevator path is much harder for mq, so fix
    it by just falling back to none, instead.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval