Eric Lee / smarc-fsl-linux-kernel

12 Feb, 2019

1 commit

f615c9afe block: fix a typo

Fix a typo in pkt_start_recovery.

Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reviewed-by: Christoph Hellwig
Signed-off-by: Jiufei Xue
Signed-off-by: Jens Axboe
(cherry picked from commit 158e61865a31ef7abf39629c37285810504d60b5)

Jiufei Xue
2019-02-12 10:33:24 +0800

23 Jan, 2019

7 commits

c2912ca3f nbd: Use set_blocksize() to set device blocksize

commit c8a83a6b54d0ca078de036aafb3f6af58c1dc5eb upstream.

NBD can update the block device block size implicitly through
bd_set_size(). Make it explicitly set the block size with set_blocksize(), as
this behavior of bd_set_size() is going away.

CC: Josef Bacik
Signed-off-by: Jan Kara
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2019-01-23 15:09:51 +0800
45662e4b7 loop: drop caches if offset or block_size are changed

commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3 upstream.

If we don't drop the caches used for the old offset or block_size, we can read
old data at the new offset/block_size, which returns unexpected data to the user.

For example, Martijn found a loopback bug in the scenario below.
1) LOOP_SET_FD loads the first two pages of the loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount fails because the cached pages hold the wrong superblock

Cc: Jens Axboe
Cc: linux-block@vger.kernel.org
Reported-by: Martijn Coenen
Reviewed-by: Bart Van Assche
Signed-off-by: Jaegeuk Kim
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jaegeuk Kim
2019-01-23 15:09:51 +0800
d2762edcb loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()

commit 628bd85947091830a8c4872adfd5ed1d515a9cf2 upstream.

Commit 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to
remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when
replacing loop_index_mutex with loop_ctl_mutex.

Fixes: 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex")
Reported-by: syzbot
Reviewed-by: Ming Lei
Reviewed-by: Jan Kara
Signed-off-by: Tetsuo Handa
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2019-01-23 15:09:51 +0800
c1e63df4f loop: Get rid of loop_index_mutex

commit 0a42e99b58a208839626465af194cfe640ef9493 upstream.

Now that loop_ctl_mutex is global, just get rid of loop_index_mutex as
there is no good reason to keep these two separate and it just
complicates the locking.

Signed-off-by: Jan Kara
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2019-01-23 15:09:51 +0800
f1e81ba8a loop: Fold __loop_release into loop_release

commit 967d1dc144b50ad005e5eecdfadfbcfb399ffff6 upstream.

__loop_release() has a single call site. Fold it there. This is
currently not a huge win but it will make following replacement of
loop_index_mutex more obvious.

Signed-off-by: Jan Kara
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2019-01-23 15:09:51 +0800
57da9a974 block/loop: Use global lock for ioctl() operation.

commit 310ca162d779efee8a2dc3731439680f3e9c1e86 upstream.

syzbot is reporting a NULL pointer dereference [1] which is caused by
a race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) and
ioctl(other_loop_fd, LOOP_SET_FD, loop_fd), due to traversing other
loop devices in loop_validate_file() without holding the corresponding
lo->lo_ctl_mutex locks.

Since ioctl() requests on loop devices are not a frequent operation, we don't
need fine-grained locking. Let's use a global lock in order to allow safe
traversal in loop_validate_file().

Note that syzbot is also reporting circular locking dependency between
bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling
blkdev_reread_part() with lock held. This patch does not address it.

[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

Signed-off-by: Tetsuo Handa
Reported-by: syzbot
Reviewed-by: Jan Kara
Signed-off-by: Jan Kara
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2019-01-23 15:09:51 +0800
06ee6e217 block/loop: Don't grab "struct file" for vfs_getattr() operation.

commit b1ab5fa309e6c49e4e06270ec67dd7b3e9971d04 upstream.

vfs_getattr() needs "struct path" rather than "struct file".
Let's use path_get()/path_put() rather than get_file()/fput().

Signed-off-by: Tetsuo Handa
Reviewed-by: Jan Kara
Signed-off-by: Jan Kara
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2019-01-23 15:09:51 +0800

17 Jan, 2019

1 commit

022ce60cc rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set

commit 85f5a4d666fd9be73856ed16bb36c5af5b406b29 upstream.

There is a window between when RBD_DEV_FLAG_REMOVING is set and when
the device is removed from rbd_dev_list. During this window, we set
"already" and return 0.

Returning 0 from write(2) can confuse userspace tools because
0 indicates that nothing was written. In particular, "rbd unmap"
will retry the write multiple times a second:

10:28:05.463299 write(4, "0", 1) = 0
10:28:05.463509 write(4, "0", 1) = 0
10:28:05.463720 write(4, "0", 1) = 0
10:28:05.463942 write(4, "0", 1) = 0
10:28:05.464155 write(4, "0", 1) = 0

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov
Tested-by: Dongsheng Yang
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2019-01-17 05:07:12 +0800

13 Jan, 2019

1 commit

ce8daa28a zram: fix double free backing device

commit 5547932dc67a48713eece4fa4703bfdf0cfcb818 upstream.

If blkdev_get fails, we shouldn't call blkdev_put. Otherwise, the kernel emits
the log below. This patch fixes it.

WARNING: CPU: 0 PID: 1893 at fs/block_dev.c:1828 blkdev_put+0x105/0x120
Modules linked in:
CPU: 0 PID: 1893 Comm: swapoff Not tainted 4.19.0+ #453
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
RIP: 0010:blkdev_put+0x105/0x120
Call Trace:
__x64_sys_swapoff+0x46d/0x490
do_syscall_64+0x5a/0x190
entry_SYSCALL_64_after_hwframe+0x49/0xbe
irq event stamp: 4466
hardirqs last enabled at (4465): __free_pages_ok+0x1e3/0x490
hardirqs last disabled at (4466): trace_hardirqs_off_thunk+0x1a/0x1c
softirqs last enabled at (3420): __do_softirq+0x333/0x446
softirqs last disabled at (3407): irq_exit+0xd1/0xe0

Link: http://lkml.kernel.org/r/20181127055429.251614-3-minchan@kernel.org
Signed-off-by: Minchan Kim
Reviewed-by: Sergey Senozhatsky
Reviewed-by: Joey Pabalinas
Cc: [4.14+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Minchan Kim
2019-01-13 17:01:02 +0800

01 Dec, 2018

1 commit

5565c30a0 floppy: fix race condition in __floppy_read_block_0()

[ Upstream commit de7b75d82f70c5469675b99ad632983c50b6f7e7 ]

LKP recently reported a hang at bootup in the floppy code:

[ 245.678853] INFO: task mount:580 blocked for more than 120 seconds.
[ 245.679906] Tainted: G T 4.19.0-rc6-00172-ga9f38e1 #1
[ 245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 245.682181] mount D 6372 580 1 0x00000004
[ 245.683023] Call Trace:
[ 245.683425] __schedule+0x2df/0x570
[ 245.683975] schedule+0x2d/0x80
[ 245.684476] schedule_timeout+0x19d/0x330
[ 245.685090] ? wait_for_common+0xa5/0x170
[ 245.685735] wait_for_common+0xac/0x170
[ 245.686339] ? do_sched_yield+0x90/0x90
[ 245.686935] wait_for_completion+0x12/0x20
[ 245.687571] __floppy_read_block_0+0xfb/0x150
[ 245.688244] ? floppy_resume+0x40/0x40
[ 245.688844] floppy_revalidate+0x20f/0x240
[ 245.689486] check_disk_change+0x43/0x60
[ 245.690087] floppy_open+0x1ea/0x360
[ 245.690653] __blkdev_get+0xb4/0x4d0
[ 245.691212] ? blkdev_get+0x1db/0x370
[ 245.691777] blkdev_get+0x1f3/0x370
[ 245.692351] ? path_put+0x15/0x20
[ 245.692871] ? lookup_bdev+0x4b/0x90
[ 245.693539] blkdev_get_by_path+0x3d/0x80
[ 245.694165] mount_bdev+0x2a/0x190
[ 245.694695] squashfs_mount+0x10/0x20
[ 245.695271] ? squashfs_alloc_inode+0x30/0x30
[ 245.695960] mount_fs+0xf/0x90
[ 245.696451] vfs_kern_mount+0x43/0x130
[ 245.697036] do_mount+0x187/0xc40
[ 245.697563] ? memdup_user+0x28/0x50
[ 245.698124] ksys_mount+0x60/0xc0
[ 245.698639] sys_mount+0x19/0x20
[ 245.699167] do_int80_syscall_32+0x61/0x130
[ 245.699813] entry_INT80_32+0xc7/0xc7

showing that we never complete that read request. The reason is that
the completion setup is racy - it initializes the completion event
AFTER submitting the IO, which means that the IO could complete
before/during the init. If it does, we are passing garbage to
complete() and we may sleep forever waiting for the event to
occur.

Fixes: 7b7b68bba5ef ("floppy: bail out in open() if drive is not responding to block0 read")
Reviewed-by: Omar Sandoval
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin

Jens Axboe
2018-12-01 16:42:53 +0800

27 Nov, 2018

1 commit

c887029c1 zram: close udev startup race condition as default groups

commit fef912bf860e upstream.
commit 98af4d4df889 upstream.

I got a report from Howard Chen that he saw a race between zram and sysfs
(i.e., the zram block device file is created but its sysfs entries aren't yet)
when he tried to create new zram devices via the hotadd knob.

The v4.20 kernel fixes it with [1, 2], but those changes are too large to merge
into -stable, so this patch fixes the problem by registering the default
group using Greg KH's approach [3].

This patch should be applied to every stable tree [3.16+] currently
available on kernel.org, because the problem was introduced in 2.6.37
by [4].

[1] fef912bf860e, block: genhd: add 'groups' argument to device_add_disk
[2] 98af4d4df889, zram: register default groups with device_add_disk()
[3] http://kroah.com/log/blog/2013/06/26/how-to-create-a-sysfs-file-correctly/
[4] 33863c21e69e9, Staging: zram: Replace ioctls with sysfs interface

Cc: Sergey Senozhatsky
Cc: Hannes Reinecke
Tested-by: Howard Chen
Signed-off-by: Minchan Kim
Signed-off-by: Sasha Levin

Minchan Kim
2018-11-27 23:10:49 +0800

14 Nov, 2018

4 commits

4e6d30de2 xen-blkfront: fix kernel panic with negotiate_mq error path

commit 6cc4a0863c9709c512280c64e698d68443ac8053 upstream.

info->nr_rings isn't adjusted in case of ENOMEM error from
negotiate_mq(). This leads to kernel panic in error path.

Typical call stack involving panic -
#8 page_fault at ffffffff8175936f
[exception RIP: blkif_free_ring+33]
RIP: ffffffffa0149491 RSP: ffff8804f7673c08 RFLAGS: 00010292
...
#9 blkif_free at ffffffffa0149aaa [xen_blkfront]
#10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront]
#11 blkback_changed at ffffffffa014ea8b [xen_blkfront]
#12 xenbus_otherend_changed at ffffffff81424670
#13 backend_changed at ffffffff81426dc3
#14 xenwatch_thread at ffffffff81422f29
#15 kthread at ffffffff810abe6a
#16 ret_from_fork at ffffffff81754078

Cc: stable@vger.kernel.org
Fixes: 7ed8ce1c5fc7 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs")
Signed-off-by: Manjunath Patil
Acked-by: Roger Pau Monné
Signed-off-by: Juergen Gross
Signed-off-by: Greg Kroah-Hartman

Manjunath Patil
2018-11-14 03:15:11 +0800
57cd3a096 xen/blkfront: avoid NULL blkfront_info dereference on device removal

commit f92898e7f32e3533bfd95be174044bc349d416ca upstream.

If a block device is hot-added when we are out of grants,
gnttab_grant_foreign_access fails with -ENOSPC (log message "28
granting access to ring page") in this code path:

talk_to_blkback ->
setup_blkring ->
xenbus_grant_ring ->
gnttab_grant_foreign_access

and the failing path in talk_to_blkback sets the driver_data to NULL:

destroy_blkring:
blkif_free(info, 0);

mutex_lock(&blkfront_mutex);
free_info(info);
mutex_unlock(&blkfront_mutex);

dev_set_drvdata(&dev->dev, NULL);

This results in a NULL pointer BUG when blkfront_remove and blkif_free
try to access the failing device's NULL struct blkfront_info.

Cc: stable@vger.kernel.org # 4.5 and later
Signed-off-by: Vasilis Liaskovitis
Reviewed-by: Roger Pau Monné
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Vasilis Liaskovitis
2018-11-14 03:15:02 +0800
8ac8e0fec swim: fix cleanup on setup error

[ Upstream commit 1448a2a5360ae06f25e2edc61ae070dff5c0beb4 ]

If we fail to allocate the request queue for a disk, we still need to
free that disk, not just the previous ones. Additionally, we need to
cleanup the previous request queues.

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Omar Sandoval
2018-11-14 03:14:52 +0800
7084b74ff ataflop: fix error handling during setup

[ Upstream commit 71327f547ee3a46ec5c39fdbbd268401b2578d0e ]

Move queue allocation next to disk allocation to fix a couple of issues:

- If add_disk() hasn't been called, we should clear disk->queue before
calling put_disk().
- If we fail to allocate a request queue, we still need to put all of
the disks, not just the ones that we allocated queues for.

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Omar Sandoval
2018-11-14 03:14:51 +0800

04 Oct, 2018

1 commit

04bc4dd86 floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl

commit 65eea8edc315589d6c993cf12dbb5d0e9ef1fe4e upstream.

The final field of a floppy_struct is the field "name", which is a pointer
to a string in kernel memory. The kernel pointer should not be copied to
user memory. The FDGETPRM ioctl copies a floppy_struct to user memory,
including this "name" field. This pointer cannot be used by the user
and it will leak a kernel address to user-space, which will reveal the
location of kernel code and data and undermine KASLR protection.

Model this code after the compat ioctl which copies the returned data
to a previously cleared temporary structure on the stack (excluding the
name pointer) and copy out to userspace from there. As we already have
an inparam union with an appropriate member and that memory is already
cleared even for read only calls make use of that as a temporary store.
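
As a rough user-space illustration of the approach described above (not the kernel code itself), the copy can stop just before the trailing pointer by using offsetof(). The struct demo_floppy layout and the demo_export_params() helper are hypothetical stand-ins; only the idea of a cleared temporary plus a bounded copy comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's floppy_struct: plain geometry
 * fields followed by a trailing pointer that must not be exported. */
struct demo_floppy {
    unsigned int size;
    unsigned int sect;
    unsigned int head;
    unsigned int track;
    const char *name;   /* kernel-only pointer, deliberately the last member */
};

/* Fill *out with every field except the trailing name pointer. */
static void demo_export_params(struct demo_floppy *out,
                               const struct demo_floppy *in)
{
    /* Clear the temporary first so the pointer slot and padding are zero. */
    memset(out, 0, sizeof(*out));
    /* Copy only the bytes that precede the name member. */
    memcpy(out, in, offsetof(struct demo_floppy, name));
}
```

After demo_export_params(&dst, &src), dst carries the geometry of src while dst.name stays NULL, so nothing address-like would reach the caller.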

Based on an initial patch by Brian Belleville.

CVE-2018-7755
Signed-off-by: Andy Whitcroft
Broke up long line.
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Andy Whitcroft
2018-10-04 08:00:54 +0800

20 Sep, 2018

2 commits

7141f97cd pktcdvd: Fix possible Spectre-v1 for pkt_devs

[ Upstream commit 55690c07b44a82cc3359ce0c233f4ba7d80ba145 ]

The user controls @dev_minor, which is used as an index into pkt_devs,
so it can be exploited via a Spectre-like attack (speculative execution).

This kind of attack leaks the address of pkt_devs [1],
which lets an attacker bypass security mechanisms such as KASLR.

So sanitize @dev_minor before using it to prevent attack.
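
The kernel provides array_index_nospec() for this kind of sanitizing; the message does not say which helper the patch uses. The sketch below is a user-space approximation of the branch-free masking idea, with hypothetical demo_* names. It assumes size <= LONG_MAX and an arithmetic right shift of negative values (implementation-defined in ISO C, the behavior of gcc and clang), and it does not by itself guarantee that a compiler emits no branch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Returns all ones when index < size and zero otherwise, without an
 * explicit comparison. Assumes size <= LONG_MAX and an arithmetic right
 * shift for negative long values. */
static unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (DEMO_BITS_PER_LONG - 1));
}

/* Clamp an out-of-range index to 0 so it can never address past the array. */
static unsigned long demo_index_nospec(unsigned long index, unsigned long size)
{
    return index & demo_index_mask(index, size);
}
```

An in-range index passes through unchanged, while an out-of-range one collapses to 0. The ordinary bounds check is still needed for correctness; the mask only limits what a mispredicted path can reach.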

[1] https://github.com/jinb-park/linux-exploit/tree/master/exploit-remaining-spectre-gadget/leak_pkt_devs.c

Signed-off-by: Jinbum Park
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Jinbum Park
2018-09-20 04:43:43 +0800
23ecbbad7 nbd: don't allow invalid blocksize settings

commit bc811f05d77f47059c197a98b6ad242eb03999cb upstream.

syzbot reports a divide-by-zero off the NBD_SET_BLKSIZE ioctl.
We need proper validation of the input here. Not just if it's
zero, but also if the value is a power-of-2 and in a valid
range. Add that.
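
A minimal sketch of the three checks named above (non-zero, power of two, within range), written as plain user-space C. The bounds DEMO_MIN_BLKSIZE and DEMO_MAX_BLKSIZE and the helper names are assumptions for illustration; the message does not state the range the driver enforces.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MIN_BLKSIZE 512UL   /* assumed lower bound */
#define DEMO_MAX_BLKSIZE 4096UL  /* assumed upper bound, a stand-in for the page size */

/* True only for 1, 2, 4, 8, ...; zero is rejected explicitly. */
static bool demo_is_power_of_2(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Reject zero, non-powers-of-two, and sizes outside the assumed range. */
static bool demo_blksize_valid(unsigned long blksize)
{
    return demo_is_power_of_2(blksize) &&
           blksize >= DEMO_MIN_BLKSIZE &&
           blksize <= DEMO_MAX_BLKSIZE;
}
```

Rejecting zero up front is what removes the divide-by-zero; the other two checks keep later size arithmetic well defined.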

Cc: stable@vger.kernel.org
Reported-by: syzbot
Reviewed-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jens Axboe
2018-09-20 04:43:35 +0800

10 Sep, 2018

1 commit

256f63f52 drivers/block/zram/zram_drv.c: fix bug storing backing_dev

commit c8bd134a4bddafe5917d163eea73873932c15e83 upstream.

The call to strlcpy in backing_dev_store is incorrect. It should take
the size of the destination buffer instead of the size of the source
buffer. Additionally, ignore the newline character (\n) when reading
the new file_name buffer. This makes it possible to set the backing_dev
as follows:

echo /dev/sdX > /sys/block/zram0/backing_dev

The reason it worked before was the fact that strlcpy() copies 'len - 1'
bytes, which is strlen(buf) - 1 in our case, so it accidentally didn't
copy the trailing new line symbol. Which also means that "echo -n
/dev/sdX" most likely was broken.
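
To make the two points concrete, here is a hedged user-space sketch: the copy is bounded by the destination size, and one trailing newline is stripped afterwards. demo_strlcpy() is a local reimplementation of strlcpy-style semantics (not the kernel function), and demo_store_name() is a hypothetical stand-in for the store handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strlcpy-style copy: writes at most size - 1 bytes plus a terminating
 * NUL into dst and returns strlen(src). */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Copy a sysfs-style write into dst, bounded by the size of dst (not of
 * the source), then drop one trailing newline if present. */
static void demo_store_name(char *dst, size_t dst_size, const char *buf)
{
    size_t n;

    if (dst_size == 0)
        return;
    demo_strlcpy(dst, buf, dst_size);
    n = strlen(dst);
    if (n > 0 && dst[n - 1] == '\n')
        dst[n - 1] = '\0';
}
```

With this shape, input with and without a trailing newline ("echo" versus "echo -n") ends up as the same stored name.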

Signed-off-by: Peter Kalauskas
Link: http://lkml.kernel.org/r/20180813061623.GC64836@rodete-desktop-imager.corp.google.com
Acked-by: Minchan Kim
Reviewed-by: Sergey Senozhatsky
Cc: [4.14+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Peter Kalauskas
2018-09-10 01:55:58 +0800

05 Sep, 2018

2 commits

05ee6166d nbd: handle unexpected replies better

[ Upstream commit 8f3ea35929a0806ad1397db99a89ffee0140822a ]

If the server or network is misbehaving and we get an unexpected reply
we can sometimes miss the request not being started and wait on a
request and never get a response, or even double complete the same
request. Fix this by replacing the send_complete completion with just a
per command lock. Add a per command cookie as well so that we can know
if we're getting a double completion for a previous event. Also check
to make sure we don't have REQUEUED set as that means we raced with the
timeout handler and need to just let the retry occur.

Signed-off-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2018-09-05 15:26:25 +0800
ced413c5e nbd: don't requeue the same request twice.

[ Upstream commit d7d94d48a272fd7583dc3c83acb8f5ed4ef456a4 ]

We can race with the snd timeout and the per-request timeout and end up
requeuing the same request twice. We can't use the send_complete
completion to tell if everything is ok because we hold the tx_lock
during send, so the timeout stuff will block waiting to mark the socket
dead, and we could be marked complete and still requeue. Instead add a
flag to the socket so we know whether we've been requeued yet.

Signed-off-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2018-09-05 15:26:25 +0800

24 Aug, 2018

2 commits

a56749343 drbd: Fix drbd_request_prepare() discard handling

[ Upstream commit fad2d4ef636654e926d374ef038f4cd4286661f6 ]

Fix the test that verifies whether bio_op(bio) represents a discard
or write zeroes operation. Compile-tested only.

Cc: Philipp Reisner
Cc: Lars Ellenberg
Fixes: 7435e9018f91 ("drbd: zero-out partial unaligned discards on local backend")
Signed-off-by: Bart Van Assche
Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Bart Van Assche
2018-08-24 19:09:09 +0800
9b0b62584 nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.

[ Upstream commit 08ba91ee6e2c1c08d3f0648f978cbb5dbf3491d8 ]

If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will
issue a disconnect from nbd_release if the device has no remaining
bdev->bd_openers.

Fix ret val so reconfigure with only setting the flag succeeds.

Reviewed-by: Josef Bacik
Signed-off-by: Doron Roberts-Kedes
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Doron Roberts-Kedes
2018-08-24 19:09:03 +0800

17 Jul, 2018

2 commits

d2c18ad18 loop: remember whether sysfs_create_group() was done

commit d3349b6b3c373ac1fbfb040b810fcee5e2adc7e0 upstream.

syzbot is hitting WARN() triggered by memory allocation fault
injection [1] because loop module is calling sysfs_remove_group()
when sysfs_create_group() failed.
Fix this by remembering whether sysfs_create_group() succeeded.
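
The pattern can be sketched in user-space C as a flag that records whether setup succeeded, so teardown only undoes what was actually done. All demo_* names and the simulate_failure switch are hypothetical; the counter merely stands in for real sysfs state.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_dev {
    bool sysfs_inited;       /* set only when group creation succeeded */
    int groups_registered;   /* stand-in for real sysfs state */
};

/* Pretend to create a group; fails on request, as fault injection would. */
static int demo_create_group(struct demo_dev *d, bool simulate_failure)
{
    if (simulate_failure)
        return -1;
    d->groups_registered++;
    return 0;
}

/* Remember the outcome instead of assuming success. */
static void demo_sysfs_init(struct demo_dev *d, bool simulate_failure)
{
    d->sysfs_inited = (demo_create_group(d, simulate_failure) == 0);
}

/* Remove the group only if it was really created. */
static void demo_sysfs_exit(struct demo_dev *d)
{
    if (d->sysfs_inited) {
        d->groups_registered--;
        d->sysfs_inited = false;
    }
}
```

If creation fails, demo_sysfs_exit() becomes a no-op rather than removing something that never existed, which is the situation that triggered the WARN().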

[1] https://syzkaller.appspot.com/bug?id=3f86c0edf75c86d2633aeb9dd69eccc70bc7e90b

Signed-off-by: Tetsuo Handa
Reported-by: syzbot
Reviewed-by: Greg Kroah-Hartman

Renamed sysfs_ready -> sysfs_inited.

Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2018-07-17 17:39:33 +0800
6f9f5797f loop: add recursion validation to LOOP_CHANGE_FD

commit d2ac838e4cd7e5e9891ecc094d626734b0245c99 upstream.

Refactor the validation code used in LOOP_SET_FD so it is also used in
LOOP_CHANGE_FD. Otherwise it is possible to construct a set of loop
devices that all refer to each other. This can lead to an infinite
loop starting with "while (is_loop_device(f)) .." in loop_set_fd().

Fix this by refactoring out the validation code and using it for
LOOP_CHANGE_FD as well as LOOP_SET_FD.
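
One way to picture the shared validation is a walk along the chain of backing devices that refuses any target leading back to the device being configured. The sketch below models devices as array indices; demo_backing_ok(), DEMO_NDEV and DEMO_REGULAR_FILE are hypothetical names and the data model is an assumption, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NDEV 4
#define DEMO_REGULAR_FILE (-1)   /* chain ends at an ordinary file */

/* backing[i] is the index of the loop device backing device i, or
 * DEMO_REGULAR_FILE. Returns true if 'self' may use 'target' as its
 * backing store without forming a cycle. */
static bool demo_backing_ok(const int backing[DEMO_NDEV], int self, int target)
{
    int cur = target;
    int steps = 0;

    while (cur != DEMO_REGULAR_FILE) {
        if (cur == self)
            return false;        /* the chain leads back to us */
        if (++steps > DEMO_NDEV)
            return false;        /* an existing cycle elsewhere; refuse */
        cur = backing[cur];
    }
    return true;
}
```

Running the same check on both the initial assignment and a later change is the point of the refactoring: a cycle can be introduced by either path.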

Reported-by: syzbot+4349872271ece473a7c91190b68b4bac7c5dbc87@syzkaller.appspotmail.com
Reported-by: syzbot+40bd32c4d9a3cc12a339@syzkaller.appspotmail.com
Reported-by: syzbot+769c54e66f994b041be7@syzkaller.appspotmail.com
Reported-by: syzbot+0a89a9ce473936c57065@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Theodore Ts'o
2018-07-17 17:39:32 +0800

11 Jul, 2018

1 commit

0ce6c4646 drbd: fix access after free

commit 64dafbc9530c10300acffc57fae3269d95fa8f93 upstream.

We have
struct drbd_requests { ... struct bio *private_bio; ... }
to hold a bio clone for local submission.

On local IO completion, we put that bio, and in case we want to use the
result later, we overload that member to hold the ERR_PTR() of the
completion result,

Which, before v4.3, used to be the passed in "int error",
so we could first bio_put(), then assign.

v4.3-rc1~100^2~21 4246a0b63bd8 block: add a bi_error field to struct bio
changed that:
bio_put(req->private_bio);
- req->private_bio = ERR_PTR(error);
+ req->private_bio = ERR_PTR(bio->bi_error);

Which introduces an access after free,
because it was non obvious that req->private_bio == bio.
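
The safe ordering can be shown with a small user-space model: read the status while the reference is still held, and only then drop it. The demo_* types are hypothetical, and storing the status in a separate saved_error field (rather than overloading the pointer with ERR_PTR()) is a simplification made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_bio {
    int refcount;
    int error;
};

/* Drop one reference; the object is freed when the last one goes away. */
static void demo_bio_put(struct demo_bio *bio)
{
    if (--bio->refcount == 0)
        free(bio);
}

struct demo_request {
    struct demo_bio *private_bio;
    int saved_error;
};

/* Completion handler. Note that bio and req->private_bio refer to the
 * same object, so bio must not be touched after the put. */
static void demo_complete(struct demo_request *req, struct demo_bio *bio)
{
    int error = bio->error;          /* read while we still hold the reference */

    demo_bio_put(req->private_bio);  /* may free the object bio points to */
    req->private_bio = NULL;
    req->saved_error = error;
}
```

Swapping the first two statements of demo_complete() would reproduce the bug: the read of bio->error would then touch memory that may already have been freed.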

Impact of that was mostly unnoticeable, because we only use that value
in a multiple-failure case, and even then map any "unexpected" error
code to EIO, so worst case we could potentially mask a more specific
error with EIO in a multiple failure case.

Unless the pointed to memory region was unmapped, as is the case with
CONFIG_DEBUG_PAGEALLOC, in which case this results in

BUG: unable to handle kernel paging request

v4.13-rc1~70^2~75 4e4cbee93d56 block: switch bios to blk_status_t
changes it further to
bio_put(req->private_bio);
req->private_bio = ERR_PTR(blk_status_to_errno(bio->bi_status));

And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected
values, which catches this "sometimes", if the memory has been reused
quickly enough for other things.

Should also go into stable since 4.3, with the trivial change around 4.13.

Cc: stable@vger.kernel.org
Fixes: 4246a0b63bd8 block: add a bi_error field to struct bio
Reported-by: Sarah Newman
Signed-off-by: Lars Ellenberg
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Lars Ellenberg
2018-07-11 22:29:14 +0800

03 Jul, 2018

1 commit

76022230a rbd: flush rbd_dev->watch_dwork after watch is unregistered

commit 23edca864951250af845a11da86bb3ea63522ed2 upstream.

There is a problem if we are going to unmap a rbd device and the
watch_dwork is going to queue delayed work for watch:

unmap Thread                           watch Thread                   timer
do_rbd_remove
  cancel_tasks_sync(rbd_dev)
                                       queue_delayed_work for watch
  destroy_workqueue(rbd_dev->task_wq)
    drain_workqueue(wq)
    destroy other resources in wq
                                                                      call_timer_fn
                                                                        __queue_work()

The delayed work then escapes cancel_tasks_sync() and
destroy_workqueue(), and we get a use-after-free call trace:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
Modules linked in:
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 4.17.0-rc6+ #13
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__queue_work+0x6a/0x3b0
RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
RAX: ffff9427deca8400 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI: 0000000000000000
RBP: ffff942783e39e00 R08: ffff9427deca8400 R09: ffff9427df1c3f00
R10: 0000000000000004 R11: 0000000000000005 R12: ffff9427cfb85970
R13: 0000000000002000 R14: 000000000001eca0 R15: 0000000000000007
FS: 0000000000000000(0000) GS:ffff9427df1c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004c900a005 CR4: 00000000000206e0
Call Trace:

? __queue_work+0x3b0/0x3b0
call_timer_fn+0x2d/0x130
run_timer_softirq+0x16e/0x430
? tick_sched_timer+0x37/0x70
__do_softirq+0xd2/0x280
irq_exit+0xd5/0xe0
smp_apic_timer_interrupt+0x6c/0x130
apic_timer_interrupt+0xf/0x20

[ Move rbd_dev->watch_dwork cancellation so that rbd_reregister_watch()
either bails out early because the watch is UNREGISTERED at that point
or just gets cancelled. ]

Cc: stable@vger.kernel.org
Fixes: 99d1694310df ("rbd: retry watch re-registration periodically")
Signed-off-by: Dongsheng Yang
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov
Signed-off-by: Greg Kroah-Hartman

Dongsheng Yang
2018-07-03 17:25:03 +0800

26 Jun, 2018

3 commits

00946218f nbd: use bd_set_size when updating disk size

commit 9e2b19675d1338d2a38e99194756f2db44a081df upstream.

When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on. We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.

Fixes: 29eaadc ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2018-06-26 08:06:32 +0800
a477d0055 nbd: update size when connected

commit c3f7c9397609705ef848cc98a5fb429b3e90c3c4 upstream.

I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent. Fix this by
updating everything if we're connected and we change the size.

cc: stable@vger.kernel.org
Fixes: 639812a ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2018-06-26 08:06:31 +0800
edee2e826 nbd: fix nbd device deletion

commit 8364da4751cf22201d74933d5e634176f44ed407 upstream.

This fixes a use-after-free bug: we shouldn't be accessing disk->queue right
after we do del_gendisk(disk). Save the queue and do the cleanup after
the del_gendisk.

Fixes: c6a4759ea0c9 ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2018-06-26 08:06:31 +0800

30 May, 2018

4 commits

a64948842 block: null_blk: fix 'Invalid parameters' when loading module

[ Upstream commit 66231ad3e2886ba99fbf440cea44cab547e5163f ]

On ARM64, some distributions use a default page size of 64K, and
we should allow ARM64 users to use null_blk.

This patch fixes the issue by extending the page bitmap size to support
PAGE_SIZE values other than 4KB.

Cc: Bart Van Assche
Cc: Shaohua Li
Cc: Kyungchan Koh
Cc: weiping zhang
Cc: Yi Zhang
Reported-by: Yi Zhang
Signed-off-by: Ming Lei
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Ming Lei
2018-05-30 13:52:36 +0800
e6e5de324 cdrom: do not call check_disk_change() inside cdrom_open()

[ Upstream commit 2bbea6e117357d17842114c65e9a9cf2d13ae8a3 ]

When mounting an ISO filesystem, the system sometimes (very rarely)
hangs because of a race condition between two tasks.

PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount"
#0 [ffff880078447ae0] __schedule at ffffffff8168d605
#1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
#2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
#3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
#4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
#5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
#6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
#7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
#8 [ffff880078447da8] mount_bdev at ffffffff81202570
#9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246
RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010
RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30
RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010
R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040
R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30
ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b

This task was trying to mount the cdrom. It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.

PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd"
#0 [ffff880078417898] __schedule at ffffffff8168d605
#1 [ffff880078417900] schedule at ffffffff8168dc59
#2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
#3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
#4 [ffff8800784179d0] down_read at ffffffff8168cde0
#5 [ffff8800784179e8] get_super at ffffffff81201cc7
#6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
#7 [ffff880078417a40] flush_disk at ffffffff8123a94b
#8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
#9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
#20 [ffff880078417f70] sys_open at ffffffff811fde4e
#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246
RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000
RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70
RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020
R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e
R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b

This task tried to open the cdrom device; the sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.

The first task acquires the s_umount lock and then the sr_mutex lock;
the second task acquires the sr_mutex lock and then the s_umount lock.

This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and letting the caller take care of it.

Signed-off-by: Maurizio Lombardi
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Maurizio Lombardi
2018-05-30 13:52:34 +0800
9238d1fa3 xen-blkfront: move negotiate_mq to cover all cases of new VBDs

[ Upstream commit 7ed8ce1c5fc7cf25b3602c73bef897a3466a6645 ]

negotiate_mq should happen in all cases of a new VBD being discovered by
xen-blkfront, whether called through _probe() or a hot-attached new VBD
from dom-0 via xenstore. Otherwise, hot-attached new VBDs are left
configured without multi-queue.

Signed-off-by: Bhavesh Davda
Reviewed-by: Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Bhavesh Davda
2018-05-30 13:52:13 +0800
615bf75c4 nbd: fix return value in error handling path

[ Upstream commit 0979962f5490abe75b3e2befb07a564fa0cf631b ]

It seems that the proper value to return in this particular case is the
one held in the variable new_index, not the one in ret.
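
A minimal sketch of this kind of copy-paste error, assuming a helper
that returns either a device index or a negative errno. The toy_*
names are hypothetical, and the sketch assumes ret still holds 0 at
the point of failure.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: returns the new index (>= 0) or a negative errno. */
static int toy_dev_add(int index)
{
    return index < 0 ? -ENOMEM : index;
}

/* Buggy shape: the error path returns an unrelated variable. */
static int toy_connect_buggy(int index)
{
    int ret = 0;                        /* assumed stale value */
    int new_index = toy_dev_add(index);

    if (new_index < 0)
        return ret;                     /* reports success on failure */
    return 0;
}

/* Fixed shape: the error path returns the code the helper produced. */
static int toy_connect_fixed(int index)
{
    int new_index = toy_dev_add(index);

    if (new_index < 0)
        return new_index;
    return 0;
}
```

With a failing helper, toy_connect_buggy() returns 0 and hides the
error, while toy_connect_fixed() propagates -ENOMEM to the caller.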

Addresses-Coverity-ID: 1465148 ("Copy-paste error")
Fixes: e46c7287b1c2 ("nbd: add a basic netlink interface")
Reviewed-by: Omar Sandoval
Signed-off-by: Gustavo A. R. Silva
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Gustavo A. R. Silva
2018-05-30 13:52:06 +0800

25 May, 2018

2 commits

360964411 loop: fix LOOP_GET_STATUS lock imbalance

commit bdac616db9bbadb90b7d6a406144571015e138f7 upstream.

Commit 2d1d4c1e591f made loop_get_status() drop lo_ctl_mutex before
returning, but the loop_get_status_old(), loop_get_status64(), and
loop_get_status_compat() wrappers don't call loop_get_status() if the
passed argument is NULL. The callers expect that the lock is dropped, so
make sure we drop it in that case, too.
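
The imbalance can be sketched in user space as follows. The toy_*
types and the ctl_locked flag are hypothetical stand-ins for the loop
device and lo_ctl_mutex, and the -EINVAL return for a NULL argument is
an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_loop { bool ctl_locked; };   /* models lo_ctl_mutex held or not */
struct toy_info { int number; };

/* Entered with the lock held; always returns with it dropped. */
static int toy_get_status(struct toy_loop *lo, struct toy_info *info)
{
    lo->ctl_locked = false;
    info->number = 7;                   /* arbitrary payload for the sketch */
    return 0;
}

/* Wrapper entered with the lock held; callers expect it dropped on return. */
static int toy_get_status_wrapper(struct toy_loop *lo, struct toy_info *arg)
{
    if (arg == NULL) {
        lo->ctl_locked = false;         /* the fix: drop the lock here too */
        return -EINVAL;
    }
    return toy_get_status(lo, arg);
}
```

Without the unlock on the NULL path, the wrapper would return with the
lock still held, and the next caller would block on it.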

Reported-by: syzbot+31e8daa8b3fc129e75f2@syzkaller.appspotmail.com
Fixes: 2d1d4c1e591f ("loop: don't call into filesystem while holding lo_ctl_mutex")
Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Omar Sandoval
2018-05-25 22:17:35 +0800
c18270ac9 loop: don't call into filesystem while holding lo_ctl_mutex

commit 2d1d4c1e591fd40bd7dafd868a249d7d00e215d5 upstream.

We hit an issue where a loop device on NFS was stuck in
loop_get_status() doing vfs_getattr() after the NFS server died, which
caused a pile-up of uninterruptible processes waiting on lo_ctl_mutex.
There's no reason to hold this lock while we wait on the filesystem;
let's drop it so that other processes can do their thing. We need to
grab a reference on lo_backing_file while we use it, and we can get rid
of the check on lo_device, which has been unnecessary since commit
a34c0ae9ebd6 ("[PATCH] loop: remove the bio remapping capability") in
the linux-history tree.
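
The pattern described here (pin the backing file, drop the lock, then
make the slow call) might look roughly like this in a user-space
sketch. The toy_* names, the refcount field and the slow call are all
hypothetical; the real code uses the kernel's file reference helpers.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_file { int refcount; long size; };

struct toy_loop {
    bool ctl_locked;                /* models lo_ctl_mutex held or not */
    struct toy_file *backing;       /* models lo_backing_file */
};

/* Stand-in for a filesystem call that may block for a long time. */
static long toy_slow_getattr(const struct toy_loop *lo, const struct toy_file *f)
{
    assert(!lo->ctl_locked);        /* must not run under the lock */
    return f->size;
}

/* Entered with the lock held; returns with it dropped. */
static long toy_loop_get_status(struct toy_loop *lo)
{
    struct toy_file *f = lo->backing;
    long size;

    f->refcount++;                  /* pin the file while it is in use */
    lo->ctl_locked = false;         /* drop the lock before blocking */
    size = toy_slow_getattr(lo, f);
    f->refcount--;                  /* release the pin */
    return size;
}
```

The reference keeps the backing file alive after the lock is dropped,
so other tasks can take the lock while the slow call is still pending.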

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Omar Sandoval
2018-05-25 22:17:34 +0800

29 Apr, 2018

3 commits

d82923c01 block/swim: Fix IO error at end of medium

commit 5a13388d7aa1177b98d7168330ecbeeac52f844d upstream.

Reading to the end of a 720K disk results in an IO error instead of EOF
because the block layer thinks the disk has 2880 sectors. (Partly this
is a result of inverted logic of the ONEMEG_MEDIA bit that's now fixed.)

Initialize the density and head count in swim_add_floppy() to agree
with the device size passed to set_capacity() during drive probe.

Call set_capacity() again upon device open, after refreshing the density
and head count values.
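
The sector counts involved can be checked with a short calculation.
This sketch assumes the standard 3.5" geometries (2 heads, 80 tracks,
9 or 18 sectors per track, 512-byte sectors); the helper name is
hypothetical and not taken from swim.c.

```c
#include <assert.h>

#define TOY_SECTOR_SIZE 512u

/* Total number of 512-byte sectors for a given floppy geometry. */
static unsigned int toy_capacity_sectors(unsigned int heads,
                                         unsigned int tracks,
                                         unsigned int sectors_per_track)
{
    return heads * tracks * sectors_per_track;
}
```

For a 720K disk, toy_capacity_sectors(2, 80, 9) gives 1440 sectors,
while toy_capacity_sectors(2, 80, 18) gives the 2880 sectors of a
1440K disk. A capacity left at 2880 therefore lets reads run past the
real end of a 720K medium.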

Cc: Laurent Vivier
Cc: Jens Axboe
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson
Signed-off-by: Finn Thain
Acked-by: Laurent Vivier
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Finn Thain
2018-04-29 17:33:17 +0800
06dc2e919 block/swim: Fix array bounds check

commit 7ae6a2b6cc058005ee3d0d2b9ce27688e51afa4b upstream.

The floppy_find() function in swim.c calls
get_disk(swd->unit[drive].disk). The argument to this call
can be a NULL pointer when drive == swd->floppy_count, which causes
an oops in get_disk().

Data read fault at 0x00000198 in Super Data (pc=0x1be5b6)
BAD KERNEL BUSERR
Oops: 00000000
Modules linked in: swim_mod ipv6 mac8390
PC: [] get_disk+0xc/0x76
SR: 2004 SP: 9a078bc1 a2: 0213ed90
d0: 00000000 d1: 00000000 d2: 00000000 d3: 000000ff
d4: 00000002 d5: 02983590 a0: 02332e00 a1: 022dfd64
Process dd (pid: 285, task=020ab25b)
Frame format=B ssw=074d isc=4a88 isb=6732 daddr=00000198 dobuf=00000000
baddr=001be5bc dibuf=bfffffff ver=f
Stack from 022dfca4:
00000000 0203fc00 0213ed90 022dfcc0 02982936 00000000 00200000 022dfd08
0020f85a 00200000 022dfd64 02332e00 004040fc 00000014 001be77e 022dfd64
00334e4a 001be3f8 0800001d 022dfd64 01c04b60 01c04b70 022aba80 029828f8
02332e00 022dfd2c 001be7ac 0203fc00 00200000 022dfd64 02103a00 01c04b60
01c04b60 0200e400 022dfd68 000e191a 00200000 022dfd64 02103a00 0800001d
00000000 00000003 000b89de 00500000 02103a00 01c04b60 02103a08 01c04c2e
Call Trace: [] floppy_find+0x3e/0x4a [swim_mod]
[] uart_remove_one_port+0x1a2/0x260
[] kobj_lookup+0xde/0x132
[] uart_remove_one_port+0x1a2/0x260
[] get_gendisk+0x0/0x130
[] mutex_lock+0x0/0x2e
[] disk_block_events+0x0/0x6c
[] floppy_find+0x0/0x4a [swim_mod]
[] get_gendisk+0x2e/0x130
[] uart_remove_one_port+0x1a2/0x260
[] __blkdev_get+0x32/0x45a
[] uart_remove_one_port+0x1a2/0x260
[] complete_walk+0x0/0x8a
[] blkdev_get+0xe0/0x29a
[] blkdev_open+0x0/0xb0
[] complete_walk+0x0/0x8a
[] blkdev_open+0x0/0xb0
[] bd_acquire+0x74/0x8a
[] blkdev_open+0x80/0xb0
[] blkdev_open+0x0/0xb0
[] do_dentry_open+0x1a4/0x322
[] __do_proc_douintvec+0x22/0x27e
[] complete_walk+0x0/0x8a
[] link_path_walk+0x0/0x48e
[] inode_permission+0x20/0x54
[] vfs_open+0x42/0x78
[] path_openat+0x2b2/0xeaa
[] path_openat+0x0/0xeaa
[] __irq_wake_thread+0x0/0x4e
[] task_tick_fair+0x18/0xc8
[] do_filp_open+0xa0/0xea
[] do_sys_open+0x11a/0x1ee
[] __do_proc_douintvec+0x22/0x27e
[] SyS_open+0x1e/0x22
[] __do_proc_douintvec+0x22/0x27e
[] syscall+0x8/0xc
[] __do_proc_douintvec+0x22/0x27e
[] dyadic+0x1/0x28
Code: 4e5e 4e75 4e56 fffc 2f0b 2f02 266e 0008 0198 4a88 6732 2428 002c 661e 486b 0058 4eb9 0032 0b96 588f 4a88 672c 2008
Disabling lock debugging due to kernel taint

Fix the array index bounds check to avoid this.
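
A sketch of the off-by-one, assuming the faulty check compared with
">" where ">=" was needed. The exact comparison is not shown in this
log, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_UNITS 2

struct toy_swd {
    int floppy_count;                   /* number of populated slots */
    const char *disk[TOY_MAX_UNITS];    /* NULL for unpopulated slots */
};

/* Off-by-one check: lets drive == floppy_count through. */
static int toy_find_buggy(const struct toy_swd *swd, int drive,
                          const char **out)
{
    if (drive > swd->floppy_count)
        return -1;
    *out = swd->disk[drive];            /* may be an unpopulated slot */
    return 0;
}

/* Corrected check: valid indices are 0 .. floppy_count - 1. */
static int toy_find_fixed(const struct toy_swd *swd, int drive,
                          const char **out)
{
    if (drive >= swd->floppy_count)
        return -1;
    *out = swd->disk[drive];
    return 0;
}
```

With one populated slot, toy_find_buggy() accepts drive 1 and hands
back a NULL entry, which is the value that reaches get_disk() in the
oops above. toy_find_fixed() rejects the same index.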

Cc: Laurent Vivier
Cc: Jens Axboe
Cc: stable@vger.kernel.org # v4.14+
Fixes: 8852ecd97488 ("[PATCH] m68k: mac - Add SWIM floppy support")
Tested-by: Stan Johnson
Signed-off-by: Finn Thain
Acked-by: Laurent Vivier
Reviewed-by: Geert Uytterhoeven
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Finn Thain
2018-04-29 17:33:17 +0800
8c37ac3c0 block/swim: Select appropriate drive on device open

commit b3906535ccc6cd04c42f9b1c7e31d1947b3ebc74 upstream.

The driver supports internal and external FDD units so the floppy_open
function must not hard-code the drive location.

Cc: Laurent Vivier
Cc: Jens Axboe
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson
Signed-off-by: Finn Thain
Acked-by: Laurent Vivier
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Finn Thain
2018-04-29 17:33:17 +0800