Eric Lee / smarc-fsl-linux-kernel

06 Feb, 2020

1 commit

ed535f2c9 Merge tag 'block-5.6-2020-02-05' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
"Some later arrivals, but all fixes at this point:

- bcache fix series (Coly)

- Series of BFQ fixes (Paolo)

- NVMe pull request from Keith with a few minor NVMe fixes

- Various little tweaks"

* tag 'block-5.6-2020-02-05' of git://git.kernel.dk/linux-block: (23 commits)
nvmet: update AEN list and array at one place
nvmet: Fix controller use after free
nvmet: Fix error print message at nvmet_install_queue function
brd: check and limit max_part par
nvme-pci: remove nvmeq->tags
nvmet: fix dsm failure when payload does not match sgl descriptor
nvmet: Pass lockdep expression to RCU lists
block, bfq: clarify the goal of bfq_split_bfqq()
block, bfq: get a ref to a group when adding it to a service tree
block, bfq: remove ifdefs from around gets/puts of bfq groups
block, bfq: extend incomplete name of field on_st
block, bfq: get extra ref to prevent a queue from being freed during a group move
block, bfq: do not insert oom queue into position tree
block, bfq: do not plug I/O for bfq_queues with no proc refs
bcache: check return value of prio_read()
bcache: fix incorrect data type usage in btree_flush_write()
bcache: add readahead cache policy options via sysfs interface
bcache: explicity type cast in bset_bkey_last()
bcache: fix memory corruption in bch_cache_accounting_clear()
xen/blkfront: limit allocated memory size to actual use case
...

Linus Torvalds
2020-02-06 14:15:23 +0800

03 Feb, 2020

7 commits

c92bddee7 block, bfq: clarify the goal of bfq_split_bfqq()

The exact, general goal of the function bfq_split_bfqq() is not that
apparent. Add a comment to make it clear.

Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
db37a34c5 block, bfq: get a ref to a group when adding it to a service tree

BFQ schedules generic entities, which may represent either bfq_queues
or groups of bfq_queues. When an entity is inserted into a service
tree, a reference must be taken, to make sure that the entity does not
disappear while still referred to in the tree. Unfortunately, such a
reference is mistakenly taken only if the entity represents a
bfq_queue. This commit also takes a reference when the entity
represents a group.

Tested-by: Oleksandr Natalenko
Tested-by: Chris Evich
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
4d8340d0d block, bfq: remove ifdefs from around gets/puts of bfq groups

ifdefs around gets and puts of bfq groups reduce readability; remove them.

Tested-by: Oleksandr Natalenko
Reported-by: Jens Axboe
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
33a16a980 block, bfq: extend incomplete name of field on_st

The flag on_st in the bfq_entity data structure is true if the entity
is on a service tree or is in service. Yet the name of the field,
confusingly, does not mention the second, very important case. Extend
the name to mention the second case too.

Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
ecedd3d7e block, bfq: get extra ref to prevent a queue from being freed during a group move

In bfq_bfqq_move(), the bfq_queue, say Q, to be moved to a new group
may happen to be deactivated in the scheduling data structures of the
source group (and then activated in the destination group). If Q is
referred to only by the data structures in the source group when the
deactivation happens, then Q is freed upon the deactivation.

This commit addresses this issue by getting an extra reference before
the possible deactivation, and releasing this extra reference after Q
has been moved.
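
The effect of the extra reference can be illustrated with a toy reference counter. This is a minimal user-space sketch, not the BFQ code: `struct queue`, `queue_get()`, `queue_put()`, `move_unguarded()` and `move_guarded()` are hypothetical names, and "freed" is modeled as a flag so that the state can still be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reference-counted queue (hypothetical, not bfq_queue). */
struct queue {
    int ref;
    bool freed; /* set instead of actually freeing, so it can be inspected */
};

void queue_get(struct queue *q)
{
    assert(!q->freed);
    q->ref++;
}

void queue_put(struct queue *q)
{
    assert(q->ref > 0);
    if (--q->ref == 0)
        q->freed = true;
}

/* Move without a guard: deactivation may drop the last reference. */
bool move_unguarded(struct queue *q)
{
    queue_put(q);          /* deactivate in the source group */
    if (q->freed)
        return false;      /* activating now would be a use-after-free */
    queue_get(q);          /* activate in the destination group */
    return true;
}

/* Move with an extra reference held across the deactivation. */
bool move_guarded(struct queue *q)
{
    queue_get(q);          /* extra reference */
    queue_put(q);          /* deactivate in the source group */
    queue_get(q);          /* activate in the destination group */
    queue_put(q);          /* release the extra reference */
    return !q->freed;
}
```

Starting from a single reference, the unguarded move reports failure because the queue is marked freed halfway through, while the guarded move ends with the queue alive and its count back at one.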

Tested-by: Chris Evich
Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
32c59e3a9 block, bfq: do not insert oom queue into position tree

BFQ maintains an ordered list, implemented with an RB tree, of
head-request positions of non-empty bfq_queues. This position tree,
inherited from CFQ, is used to find bfq_queues that contain I/O close
to each other. BFQ merges these bfq_queues into a single shared queue,
if this boosts throughput on the device at hand.

There is, however, a special-purpose bfq_queue that does not participate
in queue merging: the oom bfq_queue. Yet this bfq_queue, too, could be
wrongly added to the position tree, so bfqq_find_close() could return
the oom bfq_queue, which is a source of further trouble in an
out-of-memory situation. This commit prevents the oom bfq_queue from
being inserted into the position tree.

Tested-by: Patrick Dung
Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:15 +0800
f718b0932 block, bfq: do not plug I/O for bfq_queues with no proc refs

Commit 478de3380c1c ("block, bfq: deschedule empty bfq_queues not
referred by any process") fixed commit 3726112ec731 ("block, bfq:
re-schedule empty queues if they deserve I/O plugging") by
descheduling an empty bfq_queue when it remains with no process
reference. Yet, this still left a case uncovered: an empty bfq_queue
with no process reference that remains in service. This happens for
an in-service sync bfq_queue that is deemed to deserve I/O-dispatch
plugging when it remains empty. Yet no new requests will arrive for
such a bfq_queue if no process sends requests to it any longer. Even
worse, the bfq_queue may happen to be prematurely freed while still in
service (because there may remain no reference to it any longer).

This commit solves this problem by preventing I/O dispatch from being
plugged for the in-service bfq_queue, if the latter has no process
reference (the bfq_queue is then prevented from remaining in service).

Fixes: 3726112ec731 ("block, bfq: re-schedule empty queues if they deserve I/O plugging")
Tested-by: Oleksandr Natalenko
Reported-by: Patrick Dung
Tested-by: Patrick Dung
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2020-02-03 21:58:14 +0800

30 Jan, 2020

1 commit

33c84e89a Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
"This series is slightly unusual because it includes Arnd's compat
ioctl tree here:

1c46a2cf2dbd Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue

Excluding Arnd's changes, this is mostly an update of the usual
drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas.

There are a couple of core and base updates around error propagation
and atomicity in the attribute container base we use for the SCSI
transport classes.

The rest is minor changes and updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (149 commits)
scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask
scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity
scsi: hisi_sas: Modify the file permissions of trigger_dump to write only
scsi: hisi_sas: Replace magic number when handle channel interrupt
scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock
scsi: hisi_sas: use threaded irq to process CQ interrupts
scsi: ufs: Use UFS device indicated maximum LU number
scsi: ufs: Add max_lu_supported in struct ufs_dev_info
scsi: ufs: Delete is_init_prefetch from struct ufs_hba
scsi: ufs: Inline two functions into their callers
scsi: ufs: Move ufshcd_get_max_pwr_mode() to ufshcd_device_params_init()
scsi: ufs: Split ufshcd_probe_hba() based on its called flow
scsi: ufs: Delete struct ufs_dev_desc
scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails
scsi: ufs-mediatek: enable low-power mode for hibern8 state
scsi: ufs: export some functions for vendor usage
scsi: ufs-mediatek: add dbg_register_dump implementation
scsi: qla2xxx: Fix a NULL pointer dereference in an error path
scsi: qla1280: Make checking for 64bit support consistent
scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1
...

Linus Torvalds
2020-01-30 10:16:16 +0800

28 Jan, 2020

1 commit

48b4b4ff1 Merge tag 'for-5.6/block-2020-01-27' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
"This may be the most quiet round we've had in years. I'm not
complaining. Really not a lot to detail here, outside of spelling and
documentation improvements/fixes, we have:

- Allow t10-pi to be modular (Herbert)

- Remove dead code in bfq (Alex)

- Mark zone management requests with REQ_SYNC (Chaitanya)

- BFQ division improvement (Wen)

- Small series improving plugging (Pavel)"

* tag 'for-5.6/block-2020-01-27' of git://git.kernel.dk/linux-block:
partitions/ldm: fix spelling mistake "to" -> "too"
block, bfq: improve arithmetic division in bfq_delta()
block/bfq: remove unused bfq_class_rt which never used
block: mark zone-mgmt bios with REQ_SYNC
blk-mq: Document functions for sending request
block: Allow t10-pi to be modular
blk-mq: optimise blk_mq_flush_plug_list()
list: introduce list_for_each_continue()
blk-mq: optimise rq sort function

Linus Torvalds
2020-01-28 04:38:25 +0800

27 Jan, 2020

1 commit

b72053072 block: allow partitions on host aware zone devices

Host-aware SMR drives can be used with the commands to explicitly manage
zone state, but they can also be used as normal disks. In the former
case it makes perfect sense to allow partitions on them, in the latter
it does not, just like for host managed devices. Add a check to
add_partition to allow partitions on host aware devices, but give
up any zone management capabilities in that case, which also catches
the previously missed case of adding a partition vs just scanning it.

Because sd can rescan the attribute at runtime it needs to check if
a disk has partitions, for which a new helper is added to genhd.h.

Fixes: 5eac3eb30c9a ("block: Remove partition support for zoned block devices")
Reported-by: Borislav Petkov
Signed-off-by: Christoph Hellwig
Tested-by: Damien Le Moal
Reviewed-by: Damien Le Moal
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-01-27 00:59:08 +0800

24 Jan, 2020

1 commit

5336da37a partitions/ldm: fix spelling mistake "to" -> "too"

There is a spelling mistake in a ldm_error message. Fix it.

Signed-off-by: Colin Ian King
Signed-off-by: Jens Axboe

Colin Ian King
2020-01-24 02:41:45 +0800

23 Jan, 2020

2 commits

554d21efb block, bfq: improve arithmetic division in bfq_delta()

do_div() does a 64-by-32 division. Use div64_ul() instead
if the divisor is unsigned long, to avoid truncating it to 32 bits.
As a nice side effect, this also cleans up the function a bit.
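
The truncation can be reproduced in user space. This is a minimal sketch, not the kernel implementation: `div_64_by_32()` and `div_64_by_64()` are hypothetical stand-ins that only mimic the divisor width of a 64-by-32 division and of a full-width division on a 64-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-by-32 division such as do_div():
 * the divisor is narrowed to 32 bits before dividing.
 */
uint64_t div_64_by_32(uint64_t dividend, uint64_t divisor)
{
    uint32_t narrow = (uint32_t)divisor; /* upper 32 bits are dropped */

    assert(narrow != 0);
    return dividend / narrow;
}

/* Hypothetical stand-in for a full-width division such as div64_ul(). */
uint64_t div_64_by_64(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

With a divisor of 2^32 + 2 the narrowed divisor is 2, so dividing 2^40 yields 2^39 in the first helper, whereas the full-width division yields 255.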

Signed-off-by: Wen Yang
Cc: Paolo Valente
Cc: Jens Axboe
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe

Wen Yang
2020-01-23 01:34:11 +0800
b7f22d993 block/bfq: remove unused bfq_class_rt which never used

This macro has never been used since it was introduced by commit aee69d78dec0
("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler").

Remove it.

Signed-off-by: Alex Shi
Cc: Paolo Valente
Cc: Jens Axboe
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe

Alex Shi
2020-01-23 01:31:20 +0800

16 Jan, 2020

1 commit

ad6bf88a6 block: fix an integer overflow in logical block size

Logical block size has type unsigned short. Since block sizes are powers
of two, that means it can be at most 32768. However, there are architectures
that can run with 64k pages (for example arm64) and on these architectures,
it may be possible to create block devices with 64k block size.

For example (run this on an architecture with 64k pages):

Mount will fail with this error because it tries to read the superblock using 2-sector
access:
device-mapper: writecache: I/O is not aligned, sector 2, size 1024, block size 65536
EXT4-fs (dm-0): unable to read superblock

This patch changes the logical block size from unsigned short to unsigned
int to avoid the overflow.
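
The wrap-around is easy to see in isolation. This is a minimal sketch, assuming a 16-bit unsigned short (modeled here with `uint16_t`); `is_power_of_two()`, `store_as_u16()` and `store_as_u32()` are hypothetical helpers, not block layer functions.

```c
#include <assert.h>
#include <stdint.h>

/* Returns non-zero if bytes is a power of two. */
int is_power_of_two(uint32_t bytes)
{
    return bytes != 0 && (bytes & (bytes - 1)) == 0;
}

/* Old field width (assumed 16 bits): the value is truncated on store. */
uint16_t store_as_u16(uint32_t bytes)
{
    assert(is_power_of_two(bytes));
    return (uint16_t)bytes; /* 65536 becomes 0 */
}

/* New field width: a 32-bit field holds the value unchanged. */
uint32_t store_as_u32(uint32_t bytes)
{
    assert(is_power_of_two(bytes));
    return bytes;
}
```

A 32768-byte block size survives the 16-bit store, but 65536 is stored as 0; the 32-bit field keeps 65536 intact.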

Cc: stable@vger.kernel.org
Reviewed-by: Martin K. Petersen
Reviewed-by: Ming Lei
Signed-off-by: Mikulas Patocka
Signed-off-by: Jens Axboe

Mikulas Patocka
2020-01-16 12:43:09 +0800

15 Jan, 2020

1 commit

4a2f704eb block: fix get_max_segment_size() overflow on 32bit arch

Commit 429120f3df2d started to take the segment's start dma address into
account when computing the max segment size, and uses the data type
'unsigned long' to do that. However, the segment mask may be 0xffffffff, so
the computed segment size may overflow in case of a zero physical
address on a 32bit arch.

Fix the issue by returning queue_max_segment_size() directly when that
happens.
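
The overflow and the fallback can be sketched with explicit 32-bit arithmetic. This is an illustration under stated assumptions, not the kernel function: `bytes_to_boundary()` and `max_segment_size()` are hypothetical names, and `uint32_t` stands in for unsigned long on a 32-bit architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes from addr up to the segment boundary, computed in 32 bits. */
uint32_t bytes_to_boundary(uint32_t boundary_mask, uint32_t addr)
{
    uint32_t offset = addr & boundary_mask;

    /* Wraps to 0 when the mask is 0xffffffff and the offset is 0. */
    return (uint32_t)(boundary_mask - offset + 1u);
}

/* Fall back to the queue limit when the computation above wrapped. */
uint32_t max_segment_size(uint32_t boundary_mask, uint32_t addr,
                          uint32_t queue_max)
{
    uint32_t room = bytes_to_boundary(boundary_mask, addr);

    assert(queue_max != 0);
    if (room == 0)
        return queue_max;
    return room < queue_max ? room : queue_max;
}
```

For a mask of 0xffffffff and a start address of 0, the raw computation yields 0, and the guarded version returns the queue limit instead.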

Fixes: 429120f3df2d ("block: fix splitting segments on boundary masks")
Reported-by: Guenter Roeck
Tested-by: Guenter Roeck
Cc: Christoph Hellwig
Tested-by: Steven Rostedt (VMware)
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe

Ming Lei
2020-01-15 04:37:40 +0800

09 Jan, 2020

2 commits

83c9c5471 fs: move guard_bio_eod() after bio_set_op_attrs

Commit 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
adds bio_truncate() for handling bio EOD. However, bio_truncate()
doesn't use the passed 'op' parameter from guard_bio_eod's callers.

So bio_truncate() may retrieve the wrong 'op', and zeroing pages may
not be done for a READ bio.

Fix this issue by moving guard_bio_eod() after bio_set_op_attrs()
in submit_bh_wbc() so that bio_truncate() can always retrieve the correct
op info.

In the meantime, remove the 'op' parameter from guard_bio_eod() because it
isn't used any more.

Cc: Carlos Maiolino
Cc: linux-fsdevel@vger.kernel.org
Fixes: 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
Signed-off-by: Ming Lei

Fold in kerneldoc and bio_op() change.

Signed-off-by: Jens Axboe

Ming Lei
2020-01-09 23:16:12 +0800
8e42d239c block: mark zone-mgmt bios with REQ_SYNC

In the current implementation, the final zone-mgmt request is issued with
submit_bio_wait(), which marks the bio REQ_SYNC. This is needed since
immediate action is expected for zone-mgmt requests, as these are
blocking operations. This also bypasses the scheduler in
blk_mq_make_request() and dispatches the request directly into the
hw ctx.

This patch marks all the chained bios REQ_SYNC so that the non-final
bios get the above-mentioned behavior as well.

Reviewed-by: Damien Le Moal
Reviewed-by: Bob Liu
Signed-off-by: Chaitanya Kulkarni
Signed-off-by: Jens Axboe

Chaitanya Kulkarni
2020-01-09 22:59:12 +0800

07 Jan, 2020

2 commits

105663f73 blk-mq: Document functions for sending request

Add or improve documentation for the functions that create and send
IO requests to the hardware.

Signed-off-by: André Almeida
Signed-off-by: Jens Axboe

André Almeida
2020-01-07 12:00:27 +0800
a754bd5f1 block: Allow t10-pi to be modular

Currently t10-pi can only be built into the block layer, which via
crc-t10dif pulls in a whole chunk of the Crypto API. In fact, all
users of t10-pi work as modules and there is no reason for it to
always be built-in.

This patch adds a new hidden option for t10-pi that is selected
automatically based on BLK_DEV_INTEGRITY and whether the users
of t10-pi are built-in or not.

Signed-off-by: Herbert Xu
Signed-off-by: Jens Axboe

Herbert Xu
2020-01-07 11:59:04 +0800

03 Jan, 2020

10 commits

9b81648cb compat_ioctl: simplify up block/ioctl.c

Having separate implementations of blkdev_ioctl() often leads to these
getting out of sync, despite the comment at the top.

Since most of the ioctl commands are compatible, and we try very hard
not to add any new incompatible ones, move all the common bits into a
shared function and leave only the ones that are historically different
in separate functions for native/compat mode.

To deal with the compat_ptr() conversion, pass both the integer
argument and the pointer argument into the new blkdev_common_ioctl()
and make sure to always use the correct one of these.

blkdev_ioctl() is now only kept as a separate exported interface
for drivers/char/raw.c, which lacks a compat_ioctl variant.
We should probably either move raw.c to staging if there are no
more users, or export blkdev_compat_ioctl() as well.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:42:52 +0800
5fb889f58 compat_ioctl: block: simplify compat_blkpg_ioctl()

There is no need to go through a compat_alloc_user_space()
copy any more, just wrap the function in a small helper that
works the same way for native and compat mode.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:42:52 +0800
bdc1ddad3 compat_ioctl: block: move blkdev_compat_ioctl() into ioctl.c

Having both in the same file allows a number of simplifications
to the compat path, and makes it more likely that changes to
the native path get applied to the compat version as well.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:42:52 +0800
1df23c6fe compat_ioctl: move HDIO ioctl handling into drivers/ide

Most of the HDIO ioctls are only used by the obsolete drivers/ide
subsystem. These can be handled by making ide_cmd_ioctl() aware
of compat mode, doing the correct transformations in place, and using
it as both the native and the compat handler for all drivers.

The SCSI drivers implementing the same commands are already doing
this in the drivers, so the compat_blkdev_driver_ioctl() function
is no longer needed now.

The BLKSECTSET and HDIO_GETGEO_BIG ioctls are not implemented
in any driver any more and no longer need any conversion.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:42:52 +0800
64cbfa965 compat_ioctl: move cdrom commands into cdrom.c

There is no need for the special cases for the cdrom ioctls any more now,
so make sure that each cdrom driver has a .compat_ioctl() callback and
calls cdrom_compat_ioctl() directly there.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:42:52 +0800
fe0da4e5e compat_ioctl: bsg: add handler

bsg_ioctl() calls into scsi_cmd_ioctl() for a couple of generic commands
and relies on fs/compat_ioctl.c to handle it correctly in compat mode.

Adding a private compat_ioctl() handler avoids that round-trip and lets
us get rid of the generic emulation once this is done.

Note that bsg implements an SG_IO command that is different from the
other drivers and does not need emulation.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:33:21 +0800
8f8f56203 compat_ioctl: move CDROMREADADIO to cdrom.c

Again, there is only one file that needs this, so move the conversion
handler into the native implementation.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:33:08 +0800
f3ee6e63a compat_ioctl: move CDROM_SEND_PACKET handling into scsi

There is only one implementation of this ioctl, so move the handling out
of the common block layer code into the place where it's actually needed.

It also gets called indirectly through pktcdvd, which needs to be aware
of this change.

As I noticed, the old implementation of the compat handler failed to
convert the structure on the way out, so the updated fields never got
written back to user space. This is either not important, or it has
never worked and should be fixed now.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:33:05 +0800
ee6a129df compat_ioctl: block: add blkdev_compat_ptr_ioctl

A lot of block drivers need only a trivial .compat_ioctl callback.

Add a helper function that can be set as the callback pointer
to only convert the argument using the compat_ptr() conversion
and otherwise assume all input and output data is compatible,
or handled using in_compat_syscall() checks.

This mirrors the compat_ptr_ioctl() helper function used in
character devices.

Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:32:59 +0800
78ed001d9 compat: scsi: sg: fix v3 compat read/write interface

In the v5.4 merge window, a cleanup patch from Al Viro conflicted
with my rework of the compat handling for sg.c read(). Linus Torvalds
did a correct merge but pointed out that the resulting code is still
unsatisfactory.

I later noticed that the sg_new_read() function still gets the compat
mode wrong when the 'count' argument is large enough to pass a
compat_sg_io_hdr object, but not a native sg_io_hdr.

To address both of these, move the definition of compat_sg_io_hdr
into scsi/sg.h to make it visible to sg.c, and rewrite the logic
for reading req_pack_id as well as the size check to a simpler
version that gets the expected results.

Fixes: c35a5cfb4150 ("scsi: sg: sg_read(): simplify reading ->pack_id of userland sg_io_hdr_t")
Fixes: 98aaaec4a150 ("compat_ioctl: reimplement SG_IO handling")
Reviewed-by: Ben Hutchings
Signed-off-by: Arnd Bergmann

Arnd Bergmann
2020-01-03 16:32:54 +0800

30 Dec, 2019

1 commit

429120f3d block: fix splitting segments on boundary masks

We ran into a problem with an mpt3sas based controller, where we would
see random (and hard to reproduce) file corruption. The issue seemed
specific to this controller, but wasn't specific to the file system.
After a lot of debugging, we found out that it's caused by segments
spanning a 4G memory boundary. This shouldn't happen, as the default
setting for segment boundary masks is 4G.

Turns out there are two issues in get_max_segment_size():

1) The default segment boundary mask is bypassed

2) The segment start address isn't taken into account when checking
segment boundary limit

Fix these two issues by removing the bypass of the segment boundary
check even if the mask is set to the default value, and taking into
account the actual start address of the request when checking if a
segment needs splitting.
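
Why the start address matters can be shown with plain 64-bit arithmetic. This is a minimal sketch, not the kernel code: `crosses_boundary()` and `clamp_to_boundary()` are hypothetical helpers, and the mask is assumed to be smaller than the full 64-bit range so that the subtraction cannot wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero if [start, start + len) spans a boundary described by mask. */
int crosses_boundary(uint64_t mask, uint64_t start, uint64_t len)
{
    assert(len != 0);
    return (start & ~mask) != ((start + len - 1) & ~mask);
}

/* Limit len so that the segment ends at the boundary at the latest. */
uint64_t clamp_to_boundary(uint64_t mask, uint64_t start, uint64_t len)
{
    uint64_t room;

    assert(mask != UINT64_MAX); /* assumption: keeps the sum below from wrapping */
    room = mask - (start & mask) + 1; /* bytes left before the boundary */
    return len < room ? len : room;
}
```

With the default 4G mask (0xffffffff), an 8192-byte segment starting 4096 bytes below the boundary crosses it; once the start address is taken into account, the segment is limited to 4096 bytes and no longer does.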

Cc: stable@vger.kernel.org # v5.1+
Reviewed-by: Chris Mason
Tested-by: Chris Mason
Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
Signed-off-by: Ming Lei

Dropped const on the page pointer, ppc page_to_phys() doesn't mark the
page as const...

Signed-off-by: Jens Axboe

Ming Lei
2019-12-30 23:51:18 +0800

29 Dec, 2019

1 commit

85a8ce62c block: add bio_truncate to fix guard_bio_eod

Some filesystems, such as vfat, may send a bio that crosses the device
boundary. Worse, an IO request starting within the device boundaries
can contain more than one segment past EOD.

Commit dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
tried to fix this issue by returning -EIO in this situation. However,
this way the fs user code loses the chance to handle -EIO, and
sync_inodes_sb() may then hang forever.

Also, the current truncation of the last segment is dangerous because it
updates the last bvec: the bvec table is then no longer immutable, and fs bio
users may not retrieve the truncated pages via bio_for_each_segment_all() in
their .end_io callback.

Fix this issue by supporting multi-segment truncation. The
approach is also simpler:

- just update bio size since block layer can make correct bvec with
the updated bio size. Then bvec table becomes really immutable.

- zero all truncated segments for read bio
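
The two steps above can be sketched on a flat buffer. This is a simplified user-space model, not bio_truncate() itself: `struct fake_bio` and `fake_bio_truncate()` are hypothetical, and a single contiguous buffer stands in for the multi-segment bvec table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat model of a bio: one buffer instead of a bvec table. */
struct fake_bio {
    unsigned char *data;
    size_t size;
    int is_read;
};

/* Shrink the bio to new_size; for reads, zero the part that is cut off. */
void fake_bio_truncate(struct fake_bio *bio, size_t new_size)
{
    assert(bio->data != NULL);
    if (new_size >= bio->size)
        return; /* nothing to truncate */
    if (bio->is_read)
        memset(bio->data + new_size, 0, bio->size - new_size);
    bio->size = new_size; /* only the size changes, not the layout */
}
```

Only the size field is updated, which mirrors the point that the underlying table stays untouched; the zeroing keeps a reader from seeing stale data in the truncated range.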

Cc: Carlos Maiolino
Cc: linux-fsdevel@vger.kernel.org
Fixed-by: dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe

Ming Lei
2019-12-29 00:44:56 +0800

21 Dec, 2019

7 commits

b2c0fcd28 compat_ioctl: block: handle Persistent Reservations

These were added to blkdev_ioctl() in v4.4 but not
blkdev_compat_ioctl, so add them now.

Cc: stable@vger.kernel.org # v4.4+
Fixes: bbd3e064362e ("block: add an API for Persistent Reservations")
Signed-off-by: Arnd Bergmann

Fold in followup patch from Arnd with missing pr.h header include.

Signed-off-by: Jens Axboe

Arnd Bergmann
2019-12-21 22:26:56 +0800
4b43f31d6 compat_ioctl: block: handle add zone open, close and finish ioctl

These were added to blkdev_ioctl() in linux-5.5 but not
blkdev_compat_ioctl, so add them now.

Fixes: e876df1fe0ad ("block: add zone open, close and finish ioctl support")
Reviewed-by: Damien Le Moal
Signed-off-by: Arnd Bergmann
Signed-off-by: Jens Axboe

Arnd Bergmann
2019-12-21 22:26:41 +0800
21d373409 compat_ioctl: block: handle BLKGETZONESZ/BLKGETNRZONES

These were added to blkdev_ioctl() in v4.20 but not blkdev_compat_ioctl,
so add them now.

Cc: stable@vger.kernel.org # v4.20+
Fixes: 72cd87576d1d ("block: Introduce BLKGETZONESZ ioctl")
Fixes: 65e4e3eee83d ("block: Introduce BLKGETNRZONES ioctl")
Reviewed-by: Damien Le Moal
Signed-off-by: Arnd Bergmann
Signed-off-by: Jens Axboe

Arnd Bergmann
2019-12-21 22:26:41 +0800
673bdf8ce compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE

These were added to blkdev_ioctl() but not blkdev_compat_ioctl,
so add them now.

Cc: stable@vger.kernel.org # v4.10+
Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
Reviewed-by: Damien Le Moal
Signed-off-by: Arnd Bergmann
Signed-off-by: Jens Axboe

Arnd Bergmann
2019-12-21 22:26:40 +0800
3b7995a98 block: fix memleak when __blk_rq_map_user_iov() is failed

While doing fuzz testing, I got this memleak report:

BUG: memory leak
unreferenced object 0xffff88837af80000 (size 4096):
comm "memleak", pid 3557, jiffies 4294817681 (age 112.499s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
20 00 00 00 10 01 00 00 00 00 00 00 01 00 00 00 ...............
backtrace:
[] bio_alloc_bioset+0x393/0x590
[] bio_copy_user_iov+0x300/0xcd0
[] blk_rq_map_user_iov+0x2f1/0x5f0
[] blk_rq_map_user+0xf2/0x160
[] sg_common_write.isra.21+0x1094/0x1870
[] sg_write.part.25+0x5d9/0x950
[] sg_write+0x5f/0x8c
[] __vfs_write+0x7c/0x100
[] vfs_write+0x1c3/0x500
[] ksys_write+0xf9/0x200
[] do_syscall_64+0x9f/0x4f0
[] entry_SYSCALL_64_after_hwframe+0x49/0xbe

If __blk_rq_map_user_iov() fails in blk_rq_map_user_iov(),
the bio(s) allocated before the failure will leak. The
refcount of each bio is initialized to 1 and increased to 2 by calling
bio_get(), but __blk_rq_unmap_user() only decreases it to 1, so
the bio cannot be freed. Fix it by calling blk_rq_unmap_user().

Reviewed-by: Bob Liu
Reported-by: Hulk Robot
Signed-off-by: Yang Yingliang
Signed-off-by: Jens Axboe

Yang Yingliang
2019-12-21 02:52:01 +0800
b3c6a5997 block: Fix a lockdep complaint triggered by request queue flushing

Prevent test nvme/012 from the blktests suite from triggering the
following false positive lockdep complaint:

============================================
WARNING: possible recursive locking detected
5.0.0-rc3-xfstests-00015-g1236f7d60242 #841 Not tainted
--------------------------------------------
ksoftirqd/1/16 is trying to acquire lock:
000000000282032e (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0

but task is already holding lock:
00000000cbadcbc2 (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&fq->mq_flush_lock)->rlock);
lock(&(&fq->mq_flush_lock)->rlock);

*** DEADLOCK ***

May be due to missing lock nesting notation

1 lock held by ksoftirqd/1/16:
#0: 00000000cbadcbc2 (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0

stack backtrace:
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.0.0-rc3-xfstests-00015-g1236f7d60242 #841
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
dump_stack+0x67/0x90
__lock_acquire.cold.45+0x2b4/0x313
lock_acquire+0x98/0x160
_raw_spin_lock_irqsave+0x3b/0x80
flush_end_io+0x4e/0x1d0
blk_mq_complete_request+0x76/0x110
nvmet_req_complete+0x15/0x110 [nvmet]
nvmet_bio_done+0x27/0x50 [nvmet]
blk_update_request+0xd7/0x2d0
blk_mq_end_request+0x1a/0x100
blk_flush_complete_seq+0xe5/0x350
flush_end_io+0x12f/0x1d0
blk_done_softirq+0x9f/0xd0
__do_softirq+0xca/0x440
run_ksoftirqd+0x24/0x50
smpboot_thread_fn+0x113/0x1e0
kthread+0x121/0x140
ret_from_fork+0x3a/0x50

Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Hannes Reinecke
Signed-off-by: Bart Van Assche
Signed-off-by: Jens Axboe

Bart Van Assche
2019-12-21 02:52:01 +0800
c44a4edb2 block: Fix the type of 'sts' in bsg_queue_rq()

This patch fixes the following sparse warnings:

block/bsg-lib.c:269:19: warning: incorrect type in initializer (different base types)
block/bsg-lib.c:269:19: expected int sts
block/bsg-lib.c:269:19: got restricted blk_status_t [usertype]
block/bsg-lib.c:286:16: warning: incorrect type in return expression (different base types)
block/bsg-lib.c:286:16: expected restricted blk_status_t
block/bsg-lib.c:286:16: got int [assigned] sts

Cc: Martin Wilck
Fixes: d46fe2cb2dce ("block: drop device references in bsg_queue_rq()")
Signed-off-by: Bart Van Assche
Signed-off-by: Jens Axboe

Bart Van Assche
2019-12-21 02:52:01 +0800

19 Dec, 2019

1 commit

95ed0c5b1 blk-mq: optimise blk_mq_flush_plug_list()

Instead of using list_del_init() in a loop, which generates a lot of
unnecessary memory reads and writes, iterate from the first request of a
batch and cut out a sublist with list_cut_before().

Apart from removing the list node initialisation part, this is more
register-friendly, and the assembly uses the stack less intensively.

The list_empty() check at the beginning is done in the hope that the
compiler can optimise out the same check in the following list_splice_init().
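
The idea of cutting out a sublist can be sketched in user space. This is a minimal reimplementation modeled on the kernel's list helpers, not the kernel code: `struct node`, `list_init()`, `list_add_tail()`, `cut_before()` and `list_length()` are names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct node {
    struct node *next, *prev;
};

void list_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct node *n, struct node *head)
{
    assert(n != head);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Move every node from the front of head up to, but excluding, entry
 * onto out. Interior nodes are not touched; only the links at the two
 * ends of the sublist are rewritten.
 */
void cut_before(struct node *out, struct node *head, struct node *entry)
{
    if (head->next == entry) { /* nothing in front of entry */
        list_init(out);
        return;
    }
    out->next = head->next;
    out->next->prev = out;
    out->prev = entry->prev;
    out->prev->next = out;
    head->next = entry;
    entry->prev = head;
}

/* Count the nodes on a list, excluding the head itself. */
size_t list_length(const struct node *head)
{
    size_t n = 0;

    for (const struct node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Cutting a three-node list before its last node leaves two nodes on the output list and one on the original, with a constant number of pointer writes regardless of how many nodes are moved.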

Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe

Pavel Begunkov
2019-12-19 21:08:50 +0800