12 Feb, 2014

2 commits

  • Make sure we have a proper pairing between starting and requeueing
    requests. Move the DMA drain and REQ_END setup into blk_mq_start_request,
    and make sure blk_mq_requeue_request properly undoes them, giving us
    a pair of functions to prepare and unprepare a request without leaving
    side effects.

    Together this ensures we always clean up properly after
    BLK_MQ_RQ_QUEUE_BUSY returns from ->queue_rq.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
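
    A simplified sketch of the dispatch step this pairing enables (not the
    verbatim kernel code; dispatch_one is an illustrative name):

        static void dispatch_one(struct blk_mq_hw_ctx *hctx, struct request *rq)
        {
                int ret;

                blk_mq_start_request(rq);       /* prepare: DMA drain, REQ_END */

                ret = hctx->queue->mq_ops->queue_rq(hctx, rq);
                if (ret == BLK_MQ_RQ_QUEUE_BUSY) {
                        /*
                         * Unprepare: undoes everything blk_mq_start_request()
                         * set up, so the request can be prepared again cleanly
                         * on the next dispatch.
                         */
                        blk_mq_requeue_request(rq);
                }
        }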
     
  • rq->errors has never been part of the communication protocol between drivers
    and the block stack, and most drivers will not have initialized it.

    Return -EIO to upper layers when the driver returns BLK_MQ_RQ_QUEUE_ERROR
    unconditionally. If a driver wants to return a different error, it can easily
    do so by returning success after calling blk_mq_end_io itself.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
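
    For example, a driver preferring -ENODEV over the default -EIO could
    complete the request itself, roughly like this sketch (my_queue_rq,
    struct my_dev, device_gone and my_submit are hypothetical):

        static int my_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *rq)
        {
                struct my_dev *dev = hctx->driver_data;  /* hypothetical */

                if (device_gone(dev)) {
                        /* report our own error, tell blk-mq we handled it */
                        blk_mq_end_io(rq, -ENODEV);
                        return BLK_MQ_RQ_QUEUE_OK;
                }

                return my_submit(dev, rq);      /* hypothetical submission */
        }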
     

11 Feb, 2014

2 commits

  • Switch to using a preallocated flush_rq for blk-mq, similar to what's done
    with the old request path. This allows us to set up the request properly
    with a tag from the actually allowed range and ->rq_disk as needed by
    some drivers. To make life easier we also switch to dynamic allocation
    of ->flush_rq for the old path.

    This effectively reverts most of

    "blk-mq: fix for flush deadlock"

    and

    "blk-mq: Don't reserve a tag for flush request"

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
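
    The init-time allocation amounts to something like the following sketch
    (simplified; the real code sizes and initializes the request per the
    queue's needs):

        /* one request set aside at queue init, reused for every flush */
        q->flush_rq = kzalloc(sizeof(struct request), GFP_KERNEL);
        if (!q->flush_rq)
                goto fail;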
     
  • Rework I/O completions to work more like the old code path. blk_mq_end_io
    now stays out of the business of deferring completions to other CPUs
    and calling blk_mark_rq_complete. The latter is very important to allow
    completing requests that have timed out and thus are already marked complete;
    the former allows using the IPI callout even for driver-specific completions
    instead of having to reimplement them.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
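
    The resulting entry point is roughly (simplified from the patch):

        void blk_mq_complete_request(struct request *rq)
        {
                /*
                 * blk_mark_rq_complete() wins the race against the timeout
                 * handler: whoever marks the request first gets to complete
                 * it, so a timed-out request cannot be completed twice.
                 */
                if (!blk_mark_rq_complete(rq))
                        __blk_mq_complete_request(rq);  /* may IPI submit CPU */
        }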
     

31 Jan, 2014

1 commit

  • Reserving a tag (request) for flush to avoid deadlock is overkill; a
    tag is a valuable resource. We can instead track the number of flush
    requests and disallow having too many pending flush requests allocated.
    With this patch, blk_mq_alloc_request_pinned() may do a busy nop (but
    not spin in a dead loop) if too many flush requests are already pending
    when a new one is allocated. This should not be a problem, since having
    too many pending flush requests is a very rare case.

    I verified this can fix the deadlock caused by too many pending flush
    requests.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
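
    The idea in sketch form (pending_flushes, MAX_PENDING_FLUSH and
    alloc_flush_request are illustrative names, not the patch's exact code):

        struct request *rq = NULL;

        do {
                /* cap pending flushes with a counter, not a reserved tag */
                if (atomic_read(&q->pending_flushes) < MAX_PENDING_FLUSH)
                        rq = alloc_flush_request(q);
                /*
                 * Busy nop, not a dead loop: in-flight flushes complete
                 * and drop the counter, letting the allocation proceed.
                 */
        } while (!rq);
        atomic_inc(&q->pending_flushes);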
     

09 Jan, 2014

2 commits

  • 'struct page' has two list_head fields: 'lru' and 'list'. Conveniently,
    they are unioned together. This means that code can use them
    interchangeably, which gets horribly confusing.

    The blk-mq code made the logical decision to try to use page->list. But that
    field was actually introduced just for the slub code. ->lru is the right
    field to use outside of slab/slub.

    Signed-off-by: Dave Hansen
    Acked-by: David Rientjes
    Acked-by: Kirill A. Shutemov
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dave Hansen
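
    The fix boils down to threading blk-mq's backing pages onto the field
    meant for non-slab users, roughly:

        /* track rq-map backing pages via the field for non-slab users */
        list_add_tail(&page->lru, &hctx->page_list);    /* was &page->list */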
     
  • __smp_call_function_single already avoids multiple IPIs by internally
    queueing up the items, and is now also available for non-SMP builds as
    a trivially correct stub, so there is no need to wrap it. If the
    additional lock roundtrip causes problems, my patch converting the
    generic IPI code to llists (currently waiting to get merged) will fix it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
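
    After this change blk-mq can invoke the generic helper directly with the
    request's call_single_data, along these lines (simplified):

        rq->csd.func = __blk_mq_complete_request_remote;
        rq->csd.info = rq;
        rq->csd.flags = 0;
        __smp_call_function_single(ctx->cpu, &rq->csd, 0);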
     

04 Jan, 2014

1 commit

  • blk_rq_init() is called in the request's completion handler to initialize
    the request, so the start_time and start_time_ns members might be
    inaccurate by the time the request is allocated again in the future.

    This patch initializes the two members in blk_mq_rq_ctx_init() to
    fix the problem.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
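
    The fix in essence (sketch of the added initialization):

        /* stamp times when the request is handed out, not only at completion */
        rq->start_time = jiffies;
        set_start_time_ns(rq);  /* compiled out unless CONFIG_BLK_CGROUP */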
     

24 Nov, 2013

1 commit

  • Immutable biovecs are going to require an explicit iterator. To
    implement immutable bvecs, a later patch is going to add a bi_bvec_done
    member to this struct; for now, this patch effectively just renames
    things.

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Geert Uytterhoeven
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Ed L. Cashin"
    Cc: Nick Piggin
    Cc: Lars Ellenberg
    Cc: Jiri Kosina
    Cc: Matthew Wilcox
    Cc: Geoff Levand
    Cc: Yehuda Sadeh
    Cc: Sage Weil
    Cc: Alex Elder
    Cc: ceph-devel@vger.kernel.org
    Cc: Joshua Morris
    Cc: Philip Kelleher
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Konrad Rzeszutek Wilk
    Cc: Jeremy Fitzhardinge
    Cc: Neil Brown
    Cc: Alasdair Kergon
    Cc: Mike Snitzer
    Cc: dm-devel@redhat.com
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: Boaz Harrosh
    Cc: Benny Halevy
    Cc: "James E.J. Bottomley"
    Cc: Greg Kroah-Hartman
    Cc: "Nicholas A. Bellinger"
    Cc: Alexander Viro
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Cc: Jaegeuk Kim
    Cc: Steven Whitehouse
    Cc: Dave Kleikamp
    Cc: Joern Engel
    Cc: Prasad Joshi
    Cc: Trond Myklebust
    Cc: KONISHI Ryusuke
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Ben Myers
    Cc: xfs@oss.sgi.com
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: Herton Ronaldo Krzesinski
    Cc: Ben Hutchings
    Cc: Andrew Morton
    Cc: Guo Chao
    Cc: Tejun Heo
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Wei Yongjun
    Cc: "Roger Pau Monné"
    Cc: Jan Beulich
    Cc: Stefano Stabellini
    Cc: Ian Campbell
    Cc: Sebastian Ott
    Cc: Christian Borntraeger
    Cc: Minchan Kim
    Cc: Jiang Liu
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Joe Perches
    Cc: Peng Tao
    Cc: Andy Adamson
    Cc: fanchaoting
    Cc: Jie Liu
    Cc: Sunil Mushran
    Cc: "Martin K. Petersen"
    Cc: Namjae Jeon
    Cc: Pankaj Kumar
    Cc: Dan Magenheimer
    Cc: Mel Gorman

    Kent Overstreet
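
    The iterator this introduces looks roughly like the following
    (bi_bvec_done is the member the later patch adds):

        struct bvec_iter {
                sector_t        bi_sector;      /* device address in 512
                                                   byte sectors */
                unsigned int    bi_size;        /* residual I/O count */
                unsigned int    bi_idx;         /* current index into
                                                   bi_io_vec */
                unsigned int    bi_bvec_done;   /* bytes completed in the
                                                   current bvec (added by
                                                   the later patch) */
        };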
     

16 Nov, 2013

1 commit

  • Pull second round of block driver updates from Jens Axboe:
    "As mentioned in the original pull request, the bcache bits were pulled
    because of their dependency on the immutable bio vecs. Kent re-did
    this part and resubmitted it, so here's the 2nd round of (mostly)
    driver updates for 3.13. It contains:

    - The bcache work from Kent.

    - Conversion of virtio-blk to blk-mq. This removes the bio and request
    path, and substitutes the blk-mq path instead. The end result is
    almost 200 deleted lines. The patch is acked by Asias and Christoph,
    who both did a bunch of testing.

    - A removal of bootmem.h include from Grygorii Strashko, part of a
    larger series of his killing the dependency on that header file.

    - Removal of __cpuinit from blk-mq from Paul Gortmaker"

    * 'for-linus' of git://git.kernel.dk/linux-block: (56 commits)
    virtio_blk: blk-mq support
    blk-mq: remove newly added instances of __cpuinit
    bcache: defensively handle format strings
    bcache: Bypass torture test
    bcache: Delete some slower inline asm
    bcache: Use ida for bcache block dev minor
    bcache: Fix sysfs splat on shutdown with flash only devs
    bcache: Better full stripe scanning
    bcache: Have btree_split() insert into parent directly
    bcache: Move spinlock into struct time_stats
    bcache: Kill sequential_merge option
    bcache: Kill bch_next_recurse_key()
    bcache: Avoid deadlocking in garbage collection
    bcache: Incremental gc
    bcache: Add make_btree_freeing_key()
    bcache: Add btree_node_write_sync()
    bcache: PRECEDING_KEY()
    bcache: bch_(btree|extent)_ptr_invalid()
    bcache: Don't bother with bucket refcount for btree node allocations
    bcache: Debug code improvements
    ...

    Linus Torvalds
     

14 Nov, 2013

1 commit

  • The new blk-mq code added new instances of __cpuinit usage.
    We removed this a couple versions ago; we now want to remove
    the compat no-op stubs. Introducing new users is not what
    we want to see at this point in time, as it will break once
    the stubs are gone.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Jens Axboe

    Paul Gortmaker
     

29 Oct, 2013

1 commit

  • The flush state machine takes in a struct request, which is then
    submitted multiple times to the underlying driver. The old block code
    reuses the same request for each of those, so it does not have an
    issue with tapping into the request pool. The new one on the other hand
    allocates a new request for each of the actual steps of the flush
    sequence. If we have already allocated all of the tags for IO, we will
    fail allocating the flush request.

    Set aside a reserved request just for flushes.

    Signed-off-by: Jens Axboe

    Christoph Hellwig
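
    Sketch of the approach (simplified): one tag is set aside at queue
    registration and the flush request is allocated from the reserved pool,
    so normal I/O can never starve it.

        reg->reserved_tags = 1;         /* at queue registration */

        /* flush path: allocate from the reserved pool, may sleep */
        rq = blk_mq_alloc_reserved_request(q, WRITE_FLUSH, __GFP_WAIT);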
     

25 Oct, 2013

2 commits

  • Add a helper to iterate over all hw queues and stop them. This is useful
    for drivers that implement PM suspend functionality.

    Signed-off-by: Christoph Hellwig

    Modified to just call blk_mq_stop_hw_queue() by Jens.

    Signed-off-by: Jens Axboe

    Christoph Hellwig
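
    The helper is essentially an iteration over the queue's hardware
    contexts:

        void blk_mq_stop_hw_queues(struct request_queue *q)
        {
                struct blk_mq_hw_ctx *hctx;
                int i;

                queue_for_each_hw_ctx(q, hctx, i)
                        blk_mq_stop_hw_queue(hctx);
        }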
     
  • Linux currently has two models for block devices:

    - The classic request_fn based approach, where drivers use struct
    request units for IO. The block layer provides various helper
    functionalities to let drivers share code, things like tag
    management, timeout handling, queueing, etc.

    - The "stacked" approach, where a driver squeezes in between the
    block layer and IO submitter. Since this bypasses the IO stack,
    driver generally have to manage everything themselves.

    With drivers being written for new high IOPS devices, the classic
    request_fn based driver doesn't work well enough. The design dates
    back to when both SMP and high IOPS were rare. It has problems with
    scaling to bigger machines, and runs into scaling issues even on
    smaller machines when you have IOPS in the hundreds of thousands
    per device.

    The stacked approach is then most often selected as the model
    for the driver. But this means that everybody has to re-invent
    everything, and along with that we get all the problems again
    that the shared approach solved.

    This commit introduces blk-mq, block multi queue support. The
    design is centered around per-cpu queues for queueing IO, which
    then funnel down into x number of hardware submission queues.
    We might have a 1:1 mapping between the two, or it might be
    an N:M mapping. That all depends on what the hardware supports.

    blk-mq provides various helper functions, which include:

    - Scalable support for request tagging. Most devices need to
    be able to uniquely identify a request both in the driver and
    to the hardware. The tagging uses per-cpu caches for freed
    tags, to enable cache hot reuse.

    - Timeout handling without tracking requests on a per-device
    basis. Basically the driver should be able to get a notification
    if a request happens to fail.

    - Optional support for non 1:1 mappings between issue and
    submission queues. blk-mq can redirect IO completions to the
    desired location.

    - Support for per-request payloads. Drivers almost always need
    to associate a request structure with some driver private
    command structure. Drivers can tell blk-mq this at init time,
    and then any request handed to the driver will have the
    required size of memory associated with it.

    - Support for merging of IO, and plugging. The stacked model
    gets neither of these. Even for high IOPS devices, merging
    sequential IO reduces per-command overhead and thus
    increases bandwidth.

    For now, this is provided as a potential 3rd queueing model, with
    the hope being that, as it matures, it can replace both the classic
    and stacked model. That would get us back to having just 1 real
    model for block devices, leaving the stacked approach to dm/md
    devices (as it was originally intended).

    Contributions in this patch from the following people:

    Shaohua Li
    Alexander Gordeev
    Christoph Hellwig
    Mike Christie
    Matias Bjorling
    Jeff Moyer

    Acked-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
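
    A minimal driver registration against this initial interface might look
    like the sketch below (my_mq_ops, my_queue_rq, struct my_cmd and
    my_driver_data are hypothetical; blk_mq_reg was the registration
    structure at the time):

        static struct blk_mq_ops my_mq_ops = {
                .queue_rq       = my_queue_rq,          /* hypothetical */
                .map_queue      = blk_mq_map_queue,     /* default ctx->hctx
                                                           mapping */
        };

        static struct blk_mq_reg my_mq_reg = {
                .ops            = &my_mq_ops,
                .nr_hw_queues   = 1,                    /* hw submission
                                                           queues */
                .queue_depth    = 64,                   /* tags per hw queue */
                .cmd_size       = sizeof(struct my_cmd),/* per-request
                                                           payload */
                .numa_node      = NUMA_NO_NODE,
                .flags          = BLK_MQ_F_SHOULD_MERGE,
        };

        /* in the driver's probe path: */
        struct request_queue *q = blk_mq_init_queue(&my_mq_reg, my_driver_data);
        if (IS_ERR(q))
                return PTR_ERR(q);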