25 Apr, 2018

2 commits

  • This reverts commit 37c7c6c76d431dd7ef9c29d95f6052bd425f004c.

    Turns out some drivers (mostly FC drivers) may not use managed
    IRQ affinity and may have their own customized .map_queues at the
    same time, so keep this code to avoid a regression.

    Reported-by: Laurence Oberman
    Tested-by: Laurence Oberman
    Tested-by: Christian Borntraeger
    Tested-by: Stefan Haberland
    Cc: Ewan Milne
    Cc: Christoph Hellwig
    Cc: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • As it came up in discussion on the mailing list that the semantic
    meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely
    obvious to everyone, let's add some minimal kerneldoc for a
    starter.

    Signed-off-by: Linus Walleij
    Signed-off-by: Jens Axboe

    Linus Walleij
     

19 Apr, 2018

2 commits

    The initialization of q->root_blkg currently happens outside of the
    queue lock and RCU, so the blkg may be destroyed before the
    initialization completes, which may cause dangling/NULL references.
    On the other hand, blkg destruction is protected by the queue lock
    or RCU. Put the initialization inside the queue lock and RCU to
    make it safer.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Jiang Biao
     
    The comment before blkg_create() in blkcg_init_queue() was moved
    from blkcg_activate_policy() by commit ec13b1d6f0a0457312e615, but
    it no longer suits the new context.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Jiang Biao
     

18 Apr, 2018

2 commits

    As described in the comment of blkcg_activate_policy():
    *Update of each blkg is protected by both queue and blkcg locks so
    that holding either lock and testing blkcg_policy_enabled() is
    always enough for dereferencing policy data.*
    With the queue lock held, there is no need to hold the blkcg lock in
    blkcg_deactivate_policy(). A similar case is blkcg_activate_policy(),
    where holding of the blkcg lock was removed by
    commit 4c55f4f9ad3001ac1fefdd8d8ca7641d18558e23.

    Signed-off-by: Jiang Biao
    Signed-off-by: Wen Yang
    CC: Tejun Heo
    Signed-off-by: Jens Axboe

    Jiang Biao
     
  • Even if we don't have an IO context attached to a request, we still
    need to clear the priv[0..1] pointers, as they could be pointing
    to previously used bic/bfqq structures. If we don't do so, we'll
    either corrupt memory on dispatching a request, or cause an
    imbalance in counters.

    Inspired by a fix from Kees.

    Reported-by: Oleksandr Natalenko
    Reported-by: Kees Cook
    Cc: stable@vger.kernel.org
    Fixes: aee69d78dec0 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
    Signed-off-by: Jens Axboe

    Jens Axboe
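    The fix described above can be sketched in userspace C (the struct and function names here are invented for illustration, not the actual BFQ/blk-mq code):

```c
#include <stddef.h>

/* Hypothetical sketch: always clear the per-request elevator private
 * pointers when preparing a request, even when no io context is
 * attached, so stale bic/bfqq pointers left by a previous user of the
 * same tag slot can never be dereferenced. */
struct request_sketch {
    void *elv_priv0;    /* would cache the bic in BFQ */
    void *elv_priv1;    /* would cache the bfqq in BFQ */
};

void prepare_request(struct request_sketch *rq, void *ioc)
{
    if (!ioc) {
        /* no io context: the slots may still hold pointers from the
         * request that previously occupied this tag; clear them */
        rq->elv_priv0 = NULL;
        rq->elv_priv1 = NULL;
        return;
    }
    /* ... with an io context, look up and attach bic/bfqq here ... */
}
```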
     

17 Apr, 2018

1 commit

    rq->gstate and rq->aborted_gstate are both zero before rqs are
    allocated. If we have a small timeout, then when the timer fires
    there could be rqs that have never been allocated, and also rqs
    that have been allocated but not yet initialized and started. At
    that moment rq->gstate and rq->aborted_gstate are both 0, so
    blk_mq_terminate_expired() will identify the rq as timed out and
    invoke .timeout early.

    For SCSI, this causes scsi_times_out() to be invoked before the
    scsi_cmnd is initialized; scsi_cmnd->device is still NULL at that
    point, and we crash.

    Cc: Bart Van Assche
    Cc: Tejun Heo
    Cc: Ming Lei
    Cc: Martin Steigerwald
    Cc: stable@vger.kernel.org
    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
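    A minimal userspace sketch of the failure mode (all names invented; this is an illustration of the zero-initialized generation-counter comparison, not the blk-mq code):

```c
#include <stdbool.h>

/* Comparing two zero-initialized generation counters makes a
 * never-started request look expired, so an expiry check must also
 * require that the request was actually started. */
struct rq_state {
    unsigned int gstate;          /* bumped when the rq is started */
    unsigned int aborted_gstate;  /* snapshot taken by the timeout path */
    bool started;
};

bool rq_expired(const struct rq_state *rq)
{
    /* unallocated/uninitialized rq: gstate == aborted_gstate == 0,
     * which would falsely compare as expired without this guard */
    if (!rq->started)
        return false;
    return rq->gstate == rq->aborted_gstate;
}
```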
     

15 Apr, 2018

1 commit

  • When blk_queue_enter() waits for a queue to unfreeze, or unset the
    PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.

    The PREEMPT_ONLY flag was introduced later in commit 3a0a529971ec
    ("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI
    device is resumed asynchronously, i.e. after un-freezing userspace tasks.

    So that commit exposed the bug as a regression in v4.15. A mysterious
    SIGBUS (or -EIO) sometimes happened during the time the device was being
    resumed. Most frequently, there was no kernel log message, and we saw Xorg
    or Xwayland killed by SIGBUS.[1]

    [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979

    Without this fix, I get an IO error in this test:

    # dd if=/dev/sda of=/dev/null iflag=direct & \
    while killall -SIGUSR1 dd; do sleep 0.1; done & \
    echo mem > /sys/power/state ; \
    sleep 5; killall dd # stop after 5 seconds

    The interruptible wait was added to blk_queue_enter in
    commit 3ef28e83ab15 ("block: generic request_queue reference counting").
    Before then, the interruptible wait was only in blk-mq, but I don't think
    it could ever have been correct.

    Reviewed-by: Bart Van Assche
    Cc: stable@vger.kernel.org
    Signed-off-by: Alan Jenkins
    Signed-off-by: Jens Axboe

    Alan Jenkins
     

11 Apr, 2018

2 commits

  • This reverts commit 127276c6ce5a30fcc806b7fe53015f4f89b62956.

    When all CPUs of one hw queue become offline, there may still be IOs
    from this hctx that have not completed. But blk_mq_hw_queue_mapped() is
    checked in blk_mq_queue_tag_busy_iter(), which is used for iterating
    requests in the timeout handler, so timeout events will be missed on the
    inactive hctx and requests may never be completed.

    Also, the reimplementation of blk_mq_hw_queue_mapped() no longer matches
    the helper's name; it should have been named blk_mq_hw_queue_active().

    Other callers would also need further verification against this reimplementation.

    So revert this patch now; the hw queue activate/deactivate events can be
    improved after adequate research and testing.

    Cc: Stefan Haberland
    Cc: Christian Borntraeger
    Cc: Christoph Hellwig
    Reported-by: Jens Axboe
    Fixes: 127276c6ce5a30fcc ("blk-mq: reimplement blk_mq_hw_queue_mapped")
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
    it is no longer safe to access cgroup information during or after the
    blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
    call with blk_queue_enter() / blk_queue_exit().

    Reported-by: Ming Lei
    Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller")
    Signed-off-by: Bart Van Assche
    Cc: Ming Lei
    Cc: Joseph Qi
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

10 Apr, 2018

9 commits

    Firstly, from commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
    blk-mq doesn't remap queues any more after the CPU topology is changed.

    Secondly, set->nr_hw_queues can't be bigger than nr_cpu_ids, and now we map
    all possible CPUs to hw queues, so at least one CPU is mapped to each hctx.

    So the queue mapping has become static and fixed, just like a percpu
    variable, and we don't need to handle queue remapping any more.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    Now 'queue mapped' actually means that at least one online CPU is
    mapped to this hctx, so implement blk_mq_hw_queue_mapped() in this
    way.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • There are several reasons for removing the check:

    1) blk_mq_hw_queue_mapped() now always returns true, since each hctx
    is mapped to at least one CPU

    2) when there isn't any online CPU mapped to this hctx, there won't
    be any IO queued to it; blk_mq_run_hw_queue() only runs the queue
    if there is IO queued to this hctx

    3) if __blk_mq_delay_run_hw_queue() is called from blk_mq_delay_run_hw_queue(),
    which is run from blk_mq_dispatch_rq_list() or scsi_mq_get_budget(),
    the hctx to be handled has to be mapped.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • No driver uses this interface any more, so remove it.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    This patch introduces the helper blk_mq_hw_queue_first_cpu() for
    figuring out the hctx's first CPU, so that code duplication can be
    avoided.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    This patch figures out the final selected CPU first, then writes
    it to hctx->next_cpu once, so that no intermediate next_cpu value
    can be observed from other dispatch paths.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    From commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
    blk-mq doesn't remap queues after the CPU topology is changed, which
    means that when some of these offline CPUs become online, they are
    still mapped to hctx 0, and hctx 0 may then become the bottleneck of
    IO dispatch and completion.

    This patch sets up the mapping from the beginning, and aligns to
    queue mapping for PCI device (blk_mq_pci_map_queues()).

    Cc: Stefan Haberland
    Cc: Keith Busch
    Cc: stable@vger.kernel.org
    Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU")
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
    From commit 20e4d81393196 ("blk-mq: simplify queue mapping & schedule
    with each possible CPU"), one hctx can be mapped from all offline CPUs,
    and then hctx->next_cpu can be set wrongly.

    This patch fixes the issue by making hctx->next_cpu point to the
    first CPU in hctx->cpumask if all CPUs in hctx->cpumask are offline.

    Cc: Stefan Haberland
    Tested-by: Christian Borntraeger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Fixes: 20e4d81393196 ("blk-mq: simplify queue mapping & schedule with each possible CPU")
    Cc: stable@vger.kernel.org
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
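    The fallback described above can be sketched with a plain bitmask (a userspace illustration with invented names, not the kernel's cpumask API):

```c
#include <stdint.h>

/* Pick the next CPU for a hctx after 'prev'. Only CPUs that are both
 * in the hctx's mask and online are candidates; if every CPU in the
 * mask is offline, fall back to the first CPU in the mask instead of
 * leaving next_cpu pointing somewhere wrong. */
int next_cpu_from_mask(uint32_t cpumask, uint32_t online, int prev)
{
    uint32_t candidates = cpumask & online;

    if (!candidates) {
        /* all CPUs in the mask are offline: first CPU in the mask */
        for (int cpu = 0; cpu < 32; cpu++)
            if (cpumask & (1u << cpu))
                return cpu;
        return -1;  /* empty mask */
    }
    /* round-robin: first candidate strictly after 'prev', wrapping */
    for (int i = 1; i <= 32; i++) {
        int cpu = (prev + i) % 32;
        if (candidates & (1u << cpu))
            return cpu;
    }
    return -1;
}
```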
     
    This patch orders getting the budget and the driver tag by making
    sure the driver tag is acquired after the budget, which helps avoid
    the following race:

    1) before dispatching a request from the scheduler queue, get one budget
    first, then dequeue a request; call it request A.

    2) in another IO path, for dispatching request B, which is from hctx->dispatch,
    the driver tag is acquired, then blk_mq_dispatch_rq_list() tries to get the
    budget; unfortunately the budget is held by request A.

    3) meanwhile blk_mq_dispatch_rq_list() is called to dispatch request
    A and tries to get the driver tag first; unfortunately no driver tag is
    available because it is held by request B.

    4) neither IO path can make progress, and an IO stall results.

    This issue can be observed when running dbench on USB storage.

    This patch fixes this issue by always getting budget before getting
    driver tag.

    Cc: stable@vger.kernel.org
    Fixes: de1482974080ec9e ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")
    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Omar Sandoval
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
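    The acquisition order can be illustrated with a small userspace analogue (all names hypothetical; real budget/tag accounting lives in blk-mq and the SCSI midlayer). The key point is that the budget is always taken first, and backed out if the tag is unavailable, so no path can sit on a tag while waiting for the budget:

```c
#include <stdbool.h>

/* Single budget and single driver tag, to make contention easy to show. */
int budget = 1;
int tags = 1;

bool get_budget(void)  { if (budget) { budget--; return true; } return false; }
void put_budget(void)  { budget++; }
bool get_tag(void)     { if (tags)   { tags--;   return true; } return false; }
void put_tag(void)     { tags++; }

/* Dispatch path: budget first, then driver tag; release the budget on
 * tag failure so the other IO path can make progress. */
bool dispatch_one(void)
{
    if (!get_budget())
        return false;
    if (!get_tag()) {
        put_budget();   /* back out: avoids the cross-holding stall */
        return false;
    }
    /* ... issue the request to the driver here ... */
    put_tag();
    put_budget();
    return true;
}
```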
     

06 Apr, 2018

1 commit

  • Pull block layer updates from Jens Axboe:
    "It's a pretty quiet round this time, which is nice. This contains:

    - series from Bart, cleaning up the way we set/test/clear atomic
    queue flags.

    - series from Bart, fixing races between gendisk and queue
    registration and removal.

    - set of bcache fixes and improvements from various folks, by way of
    Michael Lyle.

    - set of lightnvm updates from Matias, most of it being the 1.2 to
    2.0 transition.

    - removal of unused DIO flags from Nikolay.

    - blk-mq/sbitmap memory ordering fixes from Omar.

    - divide-by-zero fix for BFQ from Paolo.

    - minor documentation patches from Randy.

    - timeout fix from Tejun.

    - Alpha "can't write a char atomically" fix from Mikulas.

    - set of NVMe fixes by way of Keith.

    - bsg and bsg-lib improvements from Christoph.

    - a few sed-opal fixes from Jonas.

    - cdrom check-disk-change deadlock fix from Maurizio.

    - various little fixes, comment fixes, etc from various folks"

    * tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
    blk-mq: Directly schedule q->timeout_work when aborting a request
    blktrace: fix comment in blktrace_api.h
    lightnvm: remove function name in strings
    lightnvm: pblk: remove some unnecessary NULL checks
    lightnvm: pblk: don't recover unwritten lines
    lightnvm: pblk: implement 2.0 support
    lightnvm: pblk: implement get log report chunk
    lightnvm: pblk: rename ppaf* to addrf*
    lightnvm: pblk: check for supported version
    lightnvm: implement get log report chunk helpers
    lightnvm: make address conversions depend on generic device
    lightnvm: add support for 2.0 address format
    lightnvm: normalize geometry nomenclature
    lightnvm: complete geo structure with maxoc*
    lightnvm: add shorten OCSSD version in geo
    lightnvm: add minor version to generic geometry
    lightnvm: simplify geometry structure
    lightnvm: pblk: refactor init/exit sequences
    lightnvm: Avoid validation of default op value
    lightnvm: centralize permission check for lightnvm ioctl
    ...

    Linus Torvalds
     

05 Apr, 2018

1 commit

  • Pull char/misc updates from Greg KH:
    "Here is the big set of char/misc driver patches for 4.17-rc1.

    There are a lot of little things in here, nothing huge, but all
    important to the different hardware types involved:

    - thunderbolt driver updates

    - parport updates (people still care...)

    - nvmem driver updates

    - mei updates (as always)

    - hwtracing driver updates

    - hyperv driver updates

    - extcon driver updates

    - ... and a handful of even smaller driver subsystem and individual
    driver updates

    All of these have been in linux-next with no reported issues"

    * tag 'char-misc-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (149 commits)
    hwtracing: Add HW tracing support menu
    intel_th: Add ACPI glue layer
    intel_th: Allow forcing host mode through drvdata
    intel_th: Pick up irq number from resources
    intel_th: Don't touch switch routing in host mode
    intel_th: Use correct method of finding hub
    intel_th: Add SPDX GPL-2.0 header to replace GPLv2 boilerplate
    stm class: Make dummy's master/channel ranges configurable
    stm class: Add SPDX GPL-2.0 header to replace GPLv2 boilerplate
    MAINTAINERS: Bestow upon myself the care for drivers/hwtracing
    hv: add SPDX license id to Kconfig
    hv: add SPDX license to trace
    Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
    Drivers: hv: vmbus: respect what we get from hv_get_synint_state()
    /dev/mem: Avoid overwriting "err" in read_mem()
    eeprom: at24: use SPDX identifier instead of GPL boiler-plate
    eeprom: at24: simplify the i2c functionality checking
    eeprom: at24: fix a line break
    eeprom: at24: tweak newlines
    eeprom: at24: refactor at24_probe()
    ...

    Linus Torvalds
     

03 Apr, 2018

1 commit

  • Request abortion is performed by overriding deadline to now and
    scheduling timeout handling immediately. For the latter part, the
    code was using mod_timer(timeout, 0) which can't guarantee that the
    timer runs afterwards. Let's schedule the underlying work item
    directly instead.

    This fixes the hangs during probing reported by Sitsofe but it isn't
    yet clear to me how the failure can happen reliably if it's just the
    above described race condition.

    Signed-off-by: Tejun Heo
    Reported-by: Sitsofe Wheeler
    Reported-by: Meelis Roos
    Fixes: 358f70da49d7 ("blk-mq: make blk_abort_request() trigger timeout path")
    Cc: stable@vger.kernel.org # v4.16
    Link: http://lkml.kernel.org/r/CALjAwxh-PVYFnYFCJpGOja+m5SzZ8Sa4J7ohxdK=r8NyOF-EMA@mail.gmail.com
    Link: http://lkml.kernel.org/r/alpine.LRH.2.21.1802261049140.4893@math.ut.ee
    Signed-off-by: Jens Axboe

    Tejun Heo
     

28 Mar, 2018

1 commit

  • The PCI interrupt vectors intended to be associated with a queue may
    not start at 0; a driver may allocate pre_vectors for special use. This
    patch adds an offset parameter so blk-mq may find the intended affinity
    mask and updates all drivers using this API accordingly.

    Cc: Don Brace
    Signed-off-by: Keith Busch
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Keith Busch
     

27 Mar, 2018

1 commit

  • If a storage device handled by BFQ happens to be slower than 7.5 KB/s
    for a certain amount of time (in the order of a second), then the
    estimated peak rate of the device, maintained in BFQ, becomes equal to
    0. The reason is the limited precision with which the rate is
    represented (details on the range of representable values in the
    comments introduced by this commit). This leads to a division-by-zero
    error where the estimated peak rate is used as divisor. Such a type of
    failure has been reported in [1].

    This commit addresses this issue by:
    1. Lower-bounding the estimated peak rate to 1
    2. Adding and improving comments on the range of rates representable

    [1] https://www.spinics.net/lists/kernel/msg2739205.html

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Paolo Valente
    Signed-off-by: Jens Axboe

    Paolo Valente
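    The lower-bounding in point 1 amounts to something like the following (a simplified sketch; BFQ's actual fixed-point rate representation is more involved):

```c
#include <stdint.h>

/* Lower-bound the estimated peak rate so it can never be used as a
 * zero divisor, even after a long stretch of very slow IO. */
uint64_t clamp_peak_rate(uint64_t estimated_rate)
{
    return estimated_rate ? estimated_rate : 1;
}

/* Example user of the rate as a divisor, as in the reported crash path. */
uint64_t service_time(uint64_t bytes, uint64_t peak_rate)
{
    return bytes / clamp_peak_rate(peak_rate);
}
```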
     

26 Mar, 2018

1 commit


22 Mar, 2018

1 commit


20 Mar, 2018

1 commit

  • scsi_device_quiesce() uses synchronize_rcu() to guarantee that the
    effect of blk_set_preempt_only() will be visible for percpu_ref_tryget()
    calls that occur after the queue unfreeze by using the approach
    explained in https://lwn.net/Articles/573497/. The rcu read lock and
    unlock calls in blk_queue_enter() form a pair with the synchronize_rcu()
    call in scsi_device_quiesce(). Both scsi_device_quiesce() and
    blk_queue_enter() must either use regular RCU or RCU-sched.
    Since neither the RCU-protected code in blk_queue_enter() nor
    blk_queue_usage_counter_release() sleeps, regular RCU protection
    is sufficient. Note: scsi_device_quiesce() does not have to be
    modified since it already uses synchronize_rcu().

    Reported-by: Tejun Heo
    Fixes: 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably")
    Signed-off-by: Bart Van Assche
    Acked-by: Tejun Heo
    Cc: Tejun Heo
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Cc: Christoph Hellwig
    Cc: Johannes Thumshirn
    Cc: Oleksandr Natalenko
    Cc: Martin Steigerwald
    Cc: stable@vger.kernel.org # v4.15
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

18 Mar, 2018

2 commits

  • bio_check_eod() should check partition size not the whole disk if
    bio->bi_partno is non-zero. Do this by moving the call
    to bio_check_eod() into blk_partition_remap().

    Based on an earlier patch from Jiufei Xue.

    Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
    Reported-by: Jiufei Xue
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Since commit 634f9e4631a8 ("blk-mq: remove REQ_ATOM_COMPLETE usages
    from blk-mq") blk_rq_is_complete() only reports whether or not a
    request has completed for legacy queues. Hence modify the
    blk-mq-debugfs code such that it shows the blk-mq request state
    again.

    Fixes: 634f9e4631a8 ("blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq")
    Signed-off-by: Bart Van Assche
    Cc: Tejun Heo
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

17 Mar, 2018

2 commits

  • We've triggered a WARNING in blk_throtl_bio() when throttling writeback
    io, which complains blkg->refcnt is already 0 when calling blkg_get(),
    and then kernel crashes with invalid page request.
    After investigating this issue, we've found it is caused by a race
    between blkcg_bio_issue_check() and cgroup_rmdir(), which is described
    below:

    writeback kworker                    cgroup_rmdir
                                           cgroup_destroy_locked
                                             kill_css
                                               css_killed_ref_fn
                                                 css_killed_work_fn
                                                   offline_css
                                                     blkcg_css_offline
    blkcg_bio_issue_check
      rcu_read_lock
      blkg_lookup
                                                       spin_trylock(q->queue_lock)
                                                       blkg_destroy
                                                       spin_unlock(q->queue_lock)
      blk_throtl_bio
        spin_lock_irq(q->queue_lock)
        ...
        spin_unlock_irq(q->queue_lock)
      rcu_read_unlock

    Since RCU can only prevent the blkg from being released while it is in
    use, blkg->refcnt can be decreased to 0 during blkg_destroy(), which
    schedules the blkg release.
    Calling blkg_get() in blk_throtl_bio() then triggers the WARNING,
    and the corresponding blkg_put() schedules the blkg release again,
    resulting in a double free.
    This race was introduced by commit ae1188963611 ("blkcg: consolidate blkg
    creation in blkcg_bio_issue_check()"). Before that commit, the code would
    look up first and then try to lookup/create again with the queue_lock
    held. Since reviving that logic would be a bit drastic, fix the race by
    only offlining the pd during blkcg_css_offline(), and move the rest of
    the destruction (especially blkg_put()) into blkcg_css_free(), which
    should be the right approach as discussed.

    Fixes: ae1188963611 ("blkcg: consolidate blkg creation in blkcg_bio_issue_check()")
    Reported-by: Jiufei Xue
    Signed-off-by: Joseph Qi
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Joseph Qi
     
  • The length must be given as bytes and not as 4 bit tuples.

    Reviewed-by: Scott Bauer
    Signed-off-by: Jonas Rabenstein
    Signed-off-by: Jens Axboe

    Jonas Rabenstein
     

16 Mar, 2018

1 commit

  • register_blkdev() and __register_chrdev_region() treat the major
    number as an unsigned int. So print it the same way to avoid
    absurd error statements such as:
    "... major requested (-1) is greater than the maximum (511) ..."
    (and also fix off-by-one bugs in the error prints).

    While at it, also update the comment describing register_blkdev().

    Signed-off-by: Srivatsa S. Bhat
    Reviewed-by: Logan Gunthorpe
    Signed-off-by: Greg Kroah-Hartman

    Srivatsa S. Bhat
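    The message fix can be sketched as follows (the function name and the BLKDEV_MAJOR_MAX value here are assumptions for illustration, not the kernel's exact code):

```c
#include <stdio.h>

#define BLKDEV_MAJOR_MAX 512  /* assumed limit, for illustration */

/* Print the requested major as an unsigned int, matching how
 * register_blkdev() treats it, so a huge value is shown instead of a
 * nonsensical negative number; also report the maximum inclusively,
 * fixing the off-by-one in the old message. */
int format_major_error(char *buf, size_t n, unsigned int major)
{
    return snprintf(buf, n,
                    "major requested (%u) is greater than the maximum (%u)",
                    major, (unsigned int)(BLKDEV_MAJOR_MAX - 1));
}
```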
     

14 Mar, 2018

3 commits

  • The current BSG design tries to shoe-horn the transport-specific
    passthrough commands into the overall framework for SCSI passthrough
    requests. This has a couple of problems:

    - each passthrough queue has to set the QUEUE_FLAG_SCSI_PASSTHROUGH flag
    despite not dealing with SCSI commands at all. Because of that these
    queues could also incorrectly accept SCSI commands from in-kernel
    users or through the legacy SCSI_IOCTL_SEND_COMMAND ioctl.
    - the real SCSI bsg queues also incorrectly accept bsg requests of the
    BSG_SUB_PROTOCOL_SCSI_TRANSPORT type
    - the bsg transport code is almost unreadable because it tries to reuse
    different SCSI concepts for its own purpose.

    This patch instead adds a new bsg_ops structure to handle the two cases
    differently, and thus solves all of the above problems. Another side
    effect is that the bsg-lib queues also don't need to embed a
    struct scsi_request anymore.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Users of the bsg-lib interface should only use the bsg_job data structure
    and not know about implementation details of it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Benjamin Block
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • The zfcp driver wants to know the timeout for a bsg job, so add a field
    to struct bsg_job for it in preparation of not exposing the request
    to the bsg-lib users.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Benjamin Block
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

09 Mar, 2018

5 commits

  • Avoid that building with W=1 causes the kernel-doc tool to complain
    about undocumented function arguments for the blk-zoned.c source file.

    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Damien Le Moal
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch helps to avoid that new code gets introduced in block drivers
    that manipulates queue flags without holding the queue lock when that
    lock should be held.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch has been generated as follows:

    for verb in set_unlocked clear_unlocked set clear; do
        replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \
            $(git grep -lw queue_flag_${verb} drivers block/bsg*)
    done

    Except for protecting all queue flag changes with the queue lock
    this patch does not change any functionality.

    Cc: Mike Snitzer
    Cc: Shaohua Li
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Signed-off-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Johannes Thumshirn
    Acked-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Since the queue flags may be changed concurrently from multiple
    contexts after a queue becomes visible in sysfs, make these changes
    safe by protecting these with the queue lock.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Introduce functions that modify the queue flags and that protect
    these modifications with the request queue lock. Except for moving
    one wake_up_all() call from inside to outside a critical section,
    this patch does not change any functionality.

    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
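    A userspace analogue of the new lock-protected helpers (names invented; the kernel takes the queue spinlock rather than a pthread mutex):

```c
#include <pthread.h>

/* Every queue-flag update goes through the queue lock, so concurrent
 * set/clear calls cannot lose updates once the queue is visible in
 * sysfs. */
struct queue_sketch {
    pthread_mutex_t lock;
    unsigned long flags;
};

void queue_flag_set(struct queue_sketch *q, unsigned int flag)
{
    pthread_mutex_lock(&q->lock);
    q->flags |= 1UL << flag;
    pthread_mutex_unlock(&q->lock);
}

void queue_flag_clear(struct queue_sketch *q, unsigned int flag)
{
    pthread_mutex_lock(&q->lock);
    q->flags &= ~(1UL << flag);
    pthread_mutex_unlock(&q->lock);
}
```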