Eric Lee / smarc-fsl-linux-kernel

06 Feb, 2020

4 commits

761a10b67 reiserfs: Fix memory leak of journal device string

commit 5474ca7da6f34fa95e82edc747d5faa19cbdfb5c upstream.

When a filesystem is mounted with jdev mount option, we store the
journal device name in an allocated string in superblock. However we
fail to ever free that string. Fix it.

Reported-by: syzbot+1c6756baf4b16b94d2a6@syzkaller.appspotmail.com
Fixes: c3aa077648e1 ("reiserfs: Properly display mount options in /proc/mounts")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2020-02-06 05:22:40 +0800
73774def7 gfs2: Another gfs2_find_jhead fix

commit eed0f953b90e86e765197a1dad06bb48aedc27fe upstream.

On filesystems with a block size smaller than the page size,
gfs2_find_jhead can split a page across two bios (for example, when
blocks are not allocated consecutively). When that happens, the first
bio that completes will unlock the page in its bi_end_io handler even
though the page hasn't been read completely yet. Fix that by using a
chained bio for the rest of the page.

While at it, clean up the sector calculation logic in
gfs2_log_alloc_bio. In gfs2_find_jhead, simplify the disk block and
offset calculation logic and fix a variable name.

Fixes: f4686c26ecc3 ("gfs2: read journal in large chunks")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Andreas Gruenbacher
Signed-off-by: Greg Kroah-Hartman

Andreas Gruenbacher
2020-02-06 05:22:40 +0800
cd0826719 cifs: fix soft mounts hanging in the reconnect code

commit c54849ddd832ae0a45cab16bcd1ed2db7da090d7 upstream.

RHBZ: 1795429

Recent DFS updates added a new variable that controls how many times we
retry reconnecting to the share.
If DFS is not used, this variable is initialized to 0 in:

    static inline int
    dfs_cache_get_nr_tgts(const struct dfs_cache_tgt_list *tl)
    {
            return tl ? tl->tl_numtgts : 0;
    }

This means that in the reconnect loop in smb2_reconnect() we will immediately wrap retries to -1
and never actually get to pass this conditional:

    if (--retries)
            continue;

The effect is that we no longer reach the point where we fail the commands with -EHOSTDOWN
and basically the kernel threads are virtually hung and unkillable.
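
The wrap-around described above can be reproduced in a minimal userspace sketch. This is not the kernel's code: `demo_reconnect_loop`, its parameters, and the `guarded` variant (testing `retries` before decrementing it) are assumptions made for illustration, and every reconnect attempt is assumed to time out.

```c
/*
 * Userspace model of the retry loop; not the kernel's code.
 * Returns the iteration on which the loop gives up (the point where the
 * command would be failed with -EHOSTDOWN), or -1 if it is still retrying
 * after max_iterations.
 */
int demo_reconnect_loop(int retries, int guarded, int max_iterations)
{
	int i;

	for (i = 1; i <= max_iterations; i++) {
		/* Assume the wait for the reconnect times out on every pass. */
		if (guarded) {
			/* Only decrement (and retry) when retries are configured. */
			if (retries && --retries)
				continue;
		} else {
			/* Unguarded test: 0 becomes -1, which is still "true". */
			if (--retries)
				continue;
		}
		return i;
	}
	return -1;
}
```

With `retries` starting at 0, the unguarded form never reaches the `return`, which matches the hang described in the commit message, while the guarded form gives up on the first pass.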

Fixes: a3a53b7603798fd8 (cifs: Add support for failover in smb2_reconnect())
Signed-off-by: Ronnie Sahlberg
Signed-off-by: Steve French
Reviewed-by: Paulo Alcantara (SUSE)
CC: Stable
Signed-off-by: Greg Kroah-Hartman

Ronnie Sahlberg
2020-02-06 05:22:39 +0800
2c38e6140 vfs: fix do_last() regression

commit 6404674acd596de41fd3ad5f267b4525494a891a upstream.

Brown paperbag time: fetching ->i_uid/->i_mode really should've been
done from nd->inode. I even suggested that, but the reason for that has
slipped through the cracks and I went for dir->d_inode instead - made
for more "obvious" patch.

Analysis:

- at the entry into do_last() and all the way to step_into(): dir (aka
nd->path.dentry) is known not to have been freed; so's nd->inode and
it's equal to dir->d_inode unless we are already doomed to -ECHILD.
inode of the file to get opened is not known.

- after step_into(): inode of the file to get opened is known; dir
might be pointing to freed memory/be negative/etc.

- at the call of may_create_in_sticky(): guaranteed to be out of RCU
mode; inode of the file to get opened is known and pinned; dir might
be garbage.

The last was the reason for the original patch. Except that at the
do_last() entry we can be in RCU mode and it is possible that
nd->path.dentry->d_inode has already changed under us.

In that case we are going to fail with -ECHILD, but we need to be
careful; nd->inode is pointing to valid struct inode and it's the same
as nd->path.dentry->d_inode in "won't fail with -ECHILD" case, so we
should use that.

Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)"
Reported-by: syzbot+190005201ced78a74ad6@syzkaller.appspotmail.com
Wearing-brown-paperbag: Al Viro
Cc: stable@kernel.org
Fixes: d0cb50185ae9 ("do_last(): fetch directory ->i_mode and ->i_uid before it's too late")
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Al Viro
2020-02-06 05:22:39 +0800

01 Feb, 2020

4 commits

e08884197 cifs: Fix memory allocation in __smb2_handle_cancelled_cmd()

commit 0a5a98863c9debc02387b3d23c46d187756f5e2b upstream.

__smb2_handle_cancelled_cmd() is called under a spin lock held in
cifs_mid_q_entry_release(), so make its memory allocation GFP_ATOMIC.

This issue was observed when running xfstests generic/028:

[ 1722.589204] CIFS VFS: \\192.168.30.26 Cancelling wait for mid 72064 cmd: 5
[ 1722.590687] CIFS VFS: \\192.168.30.26 Cancelling wait for mid 72065 cmd: 17
[ 1722.593529] CIFS VFS: \\192.168.30.26 Cancelling wait for mid 72066 cmd: 6
[ 1723.039014] BUG: sleeping function called from invalid context at mm/slab.h:565
[ 1723.040710] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 30877, name: cifsd
[ 1723.045098] CPU: 3 PID: 30877 Comm: cifsd Not tainted 5.5.0-rc4+ #313
[ 1723.046256] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 1723.048221] Call Trace:
[ 1723.048689] dump_stack+0x97/0xe0
[ 1723.049268] ___might_sleep.cold+0xd1/0xe1
[ 1723.050069] kmem_cache_alloc_trace+0x204/0x2b0
[ 1723.051051] __smb2_handle_cancelled_cmd+0x40/0x140 [cifs]
[ 1723.052137] smb2_handle_cancelled_mid+0xf6/0x120 [cifs]
[ 1723.053247] cifs_mid_q_entry_release+0x44d/0x630 [cifs]
[ 1723.054351] ? cifs_reconnect+0x26a/0x1620 [cifs]
[ 1723.055325] cifs_demultiplex_thread+0xad4/0x14a0 [cifs]
[ 1723.056458] ? cifs_handle_standard+0x2c0/0x2c0 [cifs]
[ 1723.057365] ? kvm_sched_clock_read+0x14/0x30
[ 1723.058197] ? sched_clock+0x5/0x10
[ 1723.058838] ? sched_clock_cpu+0x18/0x110
[ 1723.059629] ? lockdep_hardirqs_on+0x17d/0x250
[ 1723.060456] kthread+0x1ab/0x200
[ 1723.061149] ? cifs_handle_standard+0x2c0/0x2c0 [cifs]
[ 1723.062078] ? kthread_create_on_node+0xd0/0xd0
[ 1723.062897] ret_from_fork+0x3a/0x50

Signed-off-by: Paulo Alcantara (SUSE)
Fixes: 9150c3adbf24 ("CIFS: Close open handle after interrupted close")
Cc: Stable
Signed-off-by: Steve French
Reviewed-by: Pavel Shilovsky
Signed-off-by: Greg Kroah-Hartman

Paulo Alcantara (SUSE)
2020-02-01 17:34:37 +0800
b396ec724 cifs: set correct max-buffer-size for smb2_ioctl_init()

commit 731b82bb1750a906c1e7f070aedf5505995ebea7 upstream.

Fix two places where we need to adjust down the max response size for
ioctl when it is used together with compounding.

Signed-off-by: Ronnie Sahlberg
Signed-off-by: Steve French
Reviewed-by: Pavel Shilovsky
CC: Stable
Signed-off-by: Greg Kroah-Hartman

Ronnie Sahlberg
2020-02-01 17:34:37 +0800
d65b067c2 CIFS: Fix task struct use-after-free on reconnect

commit f1f27ad74557e39f67a8331a808b860f89254f2d upstream.

The task which created the MID may be gone by the time cifsd attempts to
call the callbacks on MIDs from cifs_reconnect().

This leads to a use-after-free of the task struct in cifs_wake_up_task:

==================================================================
BUG: KASAN: use-after-free in __lock_acquire+0x31a0/0x3270
Read of size 8 at addr ffff8880103e3a68 by task cifsd/630

CPU: 0 PID: 630 Comm: cifsd Not tainted 5.5.0-rc6+ #119
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x8e/0xcb
print_address_description.constprop.5+0x1d3/0x3c0
? __lock_acquire+0x31a0/0x3270
__kasan_report+0x152/0x1aa
? __lock_acquire+0x31a0/0x3270
? __lock_acquire+0x31a0/0x3270
kasan_report+0xe/0x20
__lock_acquire+0x31a0/0x3270
? __wake_up_common+0x1dc/0x630
? _raw_spin_unlock_irqrestore+0x4c/0x60
? mark_held_locks+0xf0/0xf0
? _raw_spin_unlock_irqrestore+0x39/0x60
? __wake_up_common_lock+0xd5/0x130
? __wake_up_common+0x630/0x630
lock_acquire+0x13f/0x330
? try_to_wake_up+0xa3/0x19e0
_raw_spin_lock_irqsave+0x38/0x50
? try_to_wake_up+0xa3/0x19e0
try_to_wake_up+0xa3/0x19e0
? cifs_compound_callback+0x178/0x210
? set_cpus_allowed_ptr+0x10/0x10
cifs_reconnect+0xa1c/0x15d0
? generic_ip_connect+0x1860/0x1860
? rwlock_bug.part.0+0x90/0x90
cifs_readv_from_socket+0x479/0x690
cifs_read_from_socket+0x9d/0xe0
? cifs_readv_from_socket+0x690/0x690
? mempool_resize+0x690/0x690
? rwlock_bug.part.0+0x90/0x90
? memset+0x1f/0x40
? allocate_buffers+0xff/0x340
cifs_demultiplex_thread+0x388/0x2a50
? cifs_handle_standard+0x610/0x610
? rcu_read_lock_held_common+0x120/0x120
? mark_lock+0x11b/0xc00
? __lock_acquire+0x14ed/0x3270
? __kthread_parkme+0x78/0x100
? lockdep_hardirqs_on+0x3e8/0x560
? lock_downgrade+0x6a0/0x6a0
? lockdep_hardirqs_on+0x3e8/0x560
? _raw_spin_unlock_irqrestore+0x39/0x60
? cifs_handle_standard+0x610/0x610
kthread+0x2bb/0x3a0
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x3a/0x50

Allocated by task 649:
save_stack+0x19/0x70
__kasan_kmalloc.constprop.5+0xa6/0xf0
kmem_cache_alloc+0x107/0x320
copy_process+0x17bc/0x5370
_do_fork+0x103/0xbf0
__x64_sys_clone+0x168/0x1e0
do_syscall_64+0x9b/0xec0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 0:
save_stack+0x19/0x70
__kasan_slab_free+0x11d/0x160
kmem_cache_free+0xb5/0x3d0
rcu_core+0x52f/0x1230
__do_softirq+0x24d/0x962

The buggy address belongs to the object at ffff8880103e32c0
which belongs to the cache task_struct of size 6016
The buggy address is located 1960 bytes inside of
6016-byte region [ffff8880103e32c0, ffff8880103e4a40)
The buggy address belongs to the page:
page:ffffea000040f800 refcount:1 mapcount:0 mapping:ffff8880108da5c0
index:0xffff8880103e4c00 compound_mapcount: 0
raw: 4000000000010200 ffffea00001f2208 ffffea00001e3408 ffff8880108da5c0
raw: ffff8880103e4c00 0000000000050003 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff8880103e3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880103e3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880103e3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880103e3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880103e3b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

This can be reliably reproduced by adding the below delay to
cifs_reconnect(), running find(1) on the mount, restarting the samba
server while find is running, and killing find during the delay:

spin_unlock(&GlobalMid_Lock);
mutex_unlock(&server->srv_mutex);

+ msleep(10000);
+
cifs_dbg(FYI, "%s: issuing mid callbacks\n", __func__);
list_for_each_safe(tmp, tmp2, &retry_list) {
mid_entry = list_entry(tmp, struct mid_q_entry, qhead);

Fix this by holding a reference to the task struct until the MID is
freed.

Signed-off-by: Vincent Whitchurch
Signed-off-by: Steve French
CC: Stable
Reviewed-by: Paulo Alcantara (SUSE)
Reviewed-by: Pavel Shilovsky
Signed-off-by: Greg Kroah-Hartman

Vincent Whitchurch
2020-02-01 17:34:37 +0800
6826af9a5 debugfs: Return -EPERM when locked down

commit a37f4958f7b63d2b3cd17a76151fdfc29ce1da5f upstream.

When lockdown is enabled, debugfs_is_locked_down returns 1. It will then
trigger the following:

WARNING: CPU: 48 PID: 3747
CPU: 48 PID: 3743 Comm: bash Not tainted 5.4.0-1946.x86_64 #1
Hardware name: Oracle Corporation ORACLE SERVER X7-2/ASM, MB, X7-2, BIOS 41060400 05/20/2019
RIP: 0010:do_dentry_open+0x343/0x3a0
Code: 00 40 08 00 45 31 ff 48 c7 43 28 40 5b e7 89 e9 02 ff ff ff 48 8b 53 28 4c 8b 72 70 4d 85 f6 0f 84 10 fe ff ff e9 f5 fd ff ff 0b 41 bf ea ff ff ff e9 3b ff ff ff 41 bf e6 ff ff ff e9 b4 fe
RSP: 0018:ffffb8740dde7ca0 EFLAGS: 00010202
RAX: ffffffff89e88a40 RBX: ffff928c8e6b6f00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff928dbfd97778 RDI: ffff9285cff685c0
RBP: ffffb8740dde7cc8 R08: 0000000000000821 R09: 0000000000000030
R10: 0000000000000057 R11: ffffb8740dde7a98 R12: ffff926ec781c900
R13: ffff928c8e6b6f10 R14: ffffffff8936e190 R15: 0000000000000001
FS: 00007f45f6777740(0000) GS:ffff928dbfd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff95e0d5d8 CR3: 0000001ece562006 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
vfs_open+0x2d/0x30
path_openat+0x2d4/0x1680
? tty_mode_ioctl+0x298/0x4c0
do_filp_open+0x93/0x100
? strncpy_from_user+0x57/0x1b0
? __alloc_fd+0x46/0x150
do_sys_open+0x182/0x230
__x64_sys_openat+0x20/0x30
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x170/0x1d5
RIP: 0033:0x7f45f5e5ce02
Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 25 59 2d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007fff95e0d2e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000561178c069b0 RCX: 00007f45f5e5ce02
RDX: 0000000000000241 RSI: 0000561178c08800 RDI: 00000000ffffff9c
RBP: 00007fff95e0d3e0 R08: 0000000000000020 R09: 0000000000000005
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000000001 R15: 0000561178c08800

Change the return type to int and return -EPERM when lockdown is enabled
to remove the warning above. Also rename debugfs_is_locked_down to
debugfs_locked_down to make it sound less like it returns a boolean.
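
A minimal sketch of the convention this change follows, assuming (as is usual for open handlers) that the result must be 0 or a negative errno. The `demo_` functions and the `lockdown_enabled` parameter are hypothetical stand-ins, not the debugfs code.

```c
#include <errno.h>

/* Hypothetical stand-in for the lockdown check; 0 means "allowed". */
int demo_debugfs_locked_down(int lockdown_enabled)
{
	return lockdown_enabled ? -EPERM : 0;
}

/* Hypothetical open path: pass any non-zero result straight back. */
int demo_debugfs_open(int lockdown_enabled)
{
	int r = demo_debugfs_locked_down(lockdown_enabled);

	if (r)
		return r;	/* a negative errno, never a bare 1 */
	return 0;
}
```

Returning an int errno from the helper means the caller can forward the value unchanged, which a boolean "is locked down" result does not allow.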

Fixes: 5496197f9b08 ("debugfs: Restrict debugfs when the kernel is locked down")
Signed-off-by: Eric Snowberg
Reviewed-by: Matthew Wilcox (Oracle)
Cc: stable
Acked-by: James Morris
Link: https://lore.kernel.org/r/20191207161603.35907-1-eric.snowberg@oracle.com
Signed-off-by: Greg Kroah-Hartman

Eric Snowberg
2020-02-01 17:34:35 +0800

29 Jan, 2020

6 commits

ab94448be readdir: be more conservative with directory entry names

commit 2c6b7bcd747201441923a0d3062577a8d1fbd8f8 upstream.

Commit 8a23eb804ca4 ("Make filldir[64]() verify the directory entry
filename is valid") added some minimal validity checks on the directory
entries passed to filldir[64](). But they really were pretty minimal.

This fleshes out at least the name length check: we used to disallow
zero-length names, but really, negative lengths or over-long names
aren't ok either. Both could happen if there is some filesystem
corruption going on.

Now, most filesystems tend to use just an "unsigned char" or similar for
the length of a directory entry name, so even with a corrupt filesystem
you should never see anything odd like that. But since we then use the
name length to create the directory entry record length, let's make sure
it actually is half-way sensible.

Note how POSIX states that the size of a path component is limited by
NAME_MAX, but we actually use PATH_MAX for the check here. That's
because while NAME_MAX is generally the correct maximum name length
(it's 255, for the same old "name length is usually just a byte on
disk"), there's nothing in the VFS layer that really cares.

So the real limitation at a VFS layer is the total pathname length you
can pass as a filename: PATH_MAX.
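
The check described above can be sketched in userspace as follows. `demo_verify_dirent_name` is a hypothetical name, the `-EIO` return value is an assumption, and the '/' test stands in for the "minimal validity checks" attributed to the earlier commit.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096	/* fallback for systems that do not define it */
#endif

/* Reject empty, negative-length, or over-long names, and embedded '/'. */
int demo_verify_dirent_name(const char *name, int len)
{
	if (len <= 0 || len >= PATH_MAX)
		return -EIO;
	if (memchr(name, '/', (size_t)len))
		return -EIO;
	return 0;
}
```

The length test runs first, so a corrupt length is never used to scan the name buffer.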

Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Linus Torvalds
2020-01-29 23:45:31 +0800
454759886 do_last(): fetch directory ->i_mode and ->i_uid before it's too late

commit d0cb50185ae942b03c4327be322055d622dc79f6 upstream.

may_create_in_sticky() call is done when we already have dropped the
reference to dir.

Fixes: 30aba6656f61e (namei: allow restricted O_CREAT of FIFOs and regular files)
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

Al Viro
2020-01-29 23:45:31 +0800
376b86033 readdir: make user_access_begin() use the real access range

commit 3c2659bd1db81ed6a264a9fc6262d51667d655ad upstream.

In commit 9f79b78ef744 ("Convert filldir[64]() from __put_user() to
unsafe_put_user()") I changed filldir to not do individual __put_user()
accesses, but instead use unsafe_put_user() surrounded by the proper
user_access_begin/end() pair.

That makes them enormously faster on modern x86, where the STAC/CLAC
games make individual user accesses fairly heavy-weight.

However, the user_access_begin() range was not really the exact right
one, since filldir() has the unfortunate problem that it needs to not
only fill out the new directory entry, it also needs to fix up the
previous one to contain the proper file offset.

It's unfortunate, but the "d_off" field in "struct dirent" is _not_ the
file offset of the directory entry itself - it's the offset of the next
one. So we end up backfilling the offset in the previous entry as we
walk along.

But since x86 didn't really care about the exact range, and used to be
the only architecture that did anything fancy in user_access_begin() to
begin with, the filldir[64]() changes did something lazy, and even
commented on it:

    /*
     * Note! This range-checks 'previous' (which may be NULL).
     * The real range was checked in getdents
     */
    if (!user_access_begin(dirent, sizeof(*dirent)))
            goto efault;

and it all worked fine.

But now 32-bit ppc is starting to also implement user_access_begin(),
and the fact that we faked the range to only be the (possibly not even
valid) previous directory entry becomes a problem, because ppc32 will
actually be using the range that is passed in for more than just "check
that it's user space".

This is a complete rewrite of Christophe's original patch.

By saving off the record length of the previous entry instead of a
pointer to it in the filldir data structures, we can simplify the range
check and the writing of the previous entry d_off field. No need for
any conditionals in the user accesses themselves, although we retain the
conditional EINTR checking for the "was this the first directory entry"
signal handling latency logic.
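
The "save the previous record length" idea can be sketched in userspace as below. This is a simplified model, not the kernel's filldir(): the `demo_` types use fixed-size records, there is no user-access checking, and the caller is assumed to supply a buffer suitably aligned for `struct demo_dirent`.

```c
#include <string.h>

/* Simplified, fixed-size record; real dirents are variable length. */
struct demo_dirent {
	long d_off;		/* offset of the NEXT entry, filled in later */
	unsigned short d_reclen;
	char d_name[16];
};

struct demo_getdents {
	char *cur;		/* where the next record goes */
	int prev_reclen;	/* 0 until the first record is written */
	int count;		/* bytes left in the buffer */
};

/* Append one entry; 'offset' is the directory position of this entry. */
int demo_filldir(struct demo_getdents *buf, const char *name, long offset)
{
	int reclen = (int)sizeof(struct demo_dirent);
	struct demo_dirent *dirent, *prev;

	if (reclen > buf->count)
		return -1;

	dirent = (struct demo_dirent *)buf->cur;
	/* With prev_reclen == 0 this is 'dirent' itself: a harmless store. */
	prev = (struct demo_dirent *)(buf->cur - buf->prev_reclen);
	prev->d_off = offset;

	dirent->d_reclen = (unsigned short)reclen;
	strncpy(dirent->d_name, name, sizeof(dirent->d_name) - 1);
	dirent->d_name[sizeof(dirent->d_name) - 1] = '\0';

	buf->prev_reclen = reclen;
	buf->cur += reclen;
	buf->count -= reclen;
	return 0;
}
```

Because the previous entry is always found at `cur - prev_reclen`, the store to `d_off` needs no "is there a previous entry" conditional, and the range touched by one call is a single contiguous span.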

Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
Link: https://lore.kernel.org/lkml/a02d3426f93f7eb04960a4d9140902d278cab0bb.1579697910.git.christophe.leroy@c-s.fr/
Link: https://lore.kernel.org/lkml/408c90c4068b00ea8f1c41cca45b84ec23d4946b.1579783936.git.christophe.leroy@c-s.fr/
Reported-and-tested-by: Christophe Leroy
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Linus Torvalds
2020-01-29 23:45:29 +0800
fdd0f3b0e ceph: hold extra reference to r_parent over life of request

commit 9c1c2b35f1d94de8325344c2777d7ee67492db3b upstream.

Currently, we just assume that it will stick around by virtue of the
submitter's reference, but later patches will allow the syscall to
return early and we can't rely on that reference at that point.

While I'm not aware of any reports of it, Xiubo pointed out that this
may fix a use-after-free. If the wait for a reply times out or is
canceled via signal, and then the reply comes in after the syscall
returns, the client can end up trying to access r_parent without a
reference.

Take an extra reference to the inode when setting r_parent and release
it when releasing the request.
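
A minimal sketch of that take-and-release pairing, using a plain counter in place of the kernel's inode reference counting. The `demo_` types and functions are hypothetical and only model the lifetime rule.

```c
#include <stddef.h>

struct demo_inode {
	int refcount;
};

struct demo_request {
	struct demo_inode *r_parent;
};

/* Setting the parent takes a reference owned by the request itself. */
void demo_set_parent(struct demo_request *req, struct demo_inode *dir)
{
	dir->refcount++;
	req->r_parent = dir;
}

/* Releasing the request drops the reference taken in demo_set_parent(). */
void demo_release_request(struct demo_request *req)
{
	if (req->r_parent) {
		req->r_parent->refcount--;
		req->r_parent = NULL;
	}
}
```

With this pairing the parent stays pinned even after the submitter drops its own reference, so a late reply can still dereference `r_parent` safely.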

Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
Signed-off-by: Greg Kroah-Hartman

Jeff Layton
2020-01-29 23:45:24 +0800
2d00fec60 afs: Fix characters allowed into cell names

commit a45ea48e2bcd92c1f678b794f488ca0bda9835b8 upstream.

The afs filesystem needs to prohibit certain characters from cell names,
such as '/', as these are used to form filenames in procfs, leading to
the following warning being generated:

WARNING: CPU: 0 PID: 3489 at fs/proc/generic.c:178

Fix afs_alloc_cell() to disallow nonprintable characters, '/', '@' and
names that begin with a dot.

Remove the check for "@cell" as that is then redundant.
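
The name rules listed above can be sketched as a small userspace validator. `demo_check_cell_name` is a hypothetical name, the `-EINVAL` return value and the rejection of empty names are assumptions, and `isprint()` is evaluated in the default "C" locale.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/* Reject names that start with '.', or contain '/', '@' or nonprintables. */
int demo_check_cell_name(const char *name, size_t namelen)
{
	size_t i;

	if (namelen == 0)
		return -EINVAL;	/* assumption: empty names are invalid too */
	if (name[0] == '.')
		return -EINVAL;

	for (i = 0; i < namelen; i++) {
		unsigned char ch = (unsigned char)name[i];

		if (!isprint(ch) || ch == '/' || ch == '@')
			return -EINVAL;
	}
	return 0;
}
```

Since '@' is rejected anywhere in the name, a separate test for "@cell" would never trigger, which is consistent with dropping that check.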

This can be tested by running:

echo add foo/.bar 1.2.3.4 >/proc/fs/afs/cells

Note that we will also need to deal with:

- Names ending in ".invalid" shouldn't be passed to the DNS.

- Names that contain non-valid domainname chars shouldn't be passed to
the DNS.

- DNS replies that say "your-dns-needs-immediate-attention." and
replies containing A records that say 127.0.53.53 should be
considered invalid.
[https://www.icann.org/en/system/files/files/name-collision-mitigation-01aug14-en.pdf]

but these need to be dealt with by the kafs-client DNS program rather
than the kernel.

Reported-by: syzbot+b904ba7c947a37b4b291@syzkaller.appspotmail.com
Cc: stable@kernel.org
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

David Howells
2020-01-29 23:45:24 +0800
b29d143a6 Revert "io_uring: only allow submit from owning task"

commit 73e08e711d9c1d79fae01daed4b0e1fee5f8a275 upstream.

This ends up being too restrictive for tasks that willingly fork and
share the ring between forks. Andres reports that this breaks his
postgresql work. Since we're close to 5.5 release, revert this change
for now.

Cc: stable@vger.kernel.org
Fixes: 44d282796f81 ("io_uring: only allow submit from owning task")
Reported-by: Andres Freund
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jens Axboe
2020-01-29 23:45:24 +0800

26 Jan, 2020

3 commits

c1b9854f3 afs: Remove set but not used variables 'before', 'after'

[ Upstream commit 51590df4f3306cb1f43dca54e3ccdd121ab89594 ]

Fixes gcc '-Wunused-but-set-variable' warning:

fs/afs/dir_edit.c: In function afs_set_contig_bits:
fs/afs/dir_edit.c:75:20: warning: variable after set but not used [-Wunused-but-set-variable]
fs/afs/dir_edit.c: In function afs_set_contig_bits:
fs/afs/dir_edit.c:75:12: warning: variable before set but not used [-Wunused-but-set-variable]
fs/afs/dir_edit.c: In function afs_clear_contig_bits:
fs/afs/dir_edit.c:100:20: warning: variable after set but not used [-Wunused-but-set-variable]
fs/afs/dir_edit.c: In function afs_clear_contig_bits:
fs/afs/dir_edit.c:100:12: warning: variable before set but not used [-Wunused-but-set-variable]

They are never used since commit 63a4681ff39c.

Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...")
Reported-by: Hulk Robot
Signed-off-by: zhengbin
Signed-off-by: David Howells
Signed-off-by: Sasha Levin

zhengbin
2020-01-26 17:01:08 +0800
cdac80457 nfsd: depend on CRYPTO_MD5 for legacy client tracking

commit 38a2204f5298620e8a1c3b1dc7b831425106dbc0 upstream.

The legacy client tracking infrastructure of nfsd makes use of MD5 to
derive a client's recovery directory name. As the nfsd module doesn't
declare any dependency on CRYPTO_MD5, though, it may fail to allocate
the hash if the kernel was compiled without it. As a result, generation
of client recovery directories will fail with the following error:

NFSD: unable to generate recoverydir name

The explicit dependency on CRYPTO_MD5 was removed as redundant back in
6aaa67b5f3b9 (NFSD: Remove redundant "select" clauses in fs/Kconfig
2008-02-11) as it was already implicitly selected via RPCSEC_GSS_KRB5.
This broke when RPCSEC_GSS_KRB5 was made optional for NFSv4 in commit
df486a25900f (NFS: Fix the selection of security flavours in Kconfig) at
a later point.

Fix the issue by adding back an explicit dependency on CRYPTO_MD5.

Fixes: df486a25900f (NFS: Fix the selection of security flavours in Kconfig)
Signed-off-by: Patrick Steinhardt
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Patrick Steinhardt
2020-01-26 17:01:01 +0800
dc08e4455 xfs: Sanity check flags of Q_XQUOTARM call

commit 3dd4d40b420846dd35869ccc8f8627feef2cff32 upstream.

Flags passed to Q_XQUOTARM were not sanity checked for invalid values.
Fix that.
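
A flag sanity check of this kind is typically a mask test, sketched below. The `DEMO_` flag values, the function name, and the `-EINVAL` return value are assumptions for illustration and are not taken from the XFS sources.

```c
#include <errno.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_USER_QUOTA		(1u << 0)
#define DEMO_PROJ_QUOTA		(1u << 1)
#define DEMO_GROUP_QUOTA	(1u << 2)
#define DEMO_ALL_QUOTA \
	(DEMO_USER_QUOTA | DEMO_PROJ_QUOTA | DEMO_GROUP_QUOTA)

/* Reject any bit outside the set of known quota types. */
int demo_check_quotarm_flags(unsigned int flags)
{
	if (flags & ~DEMO_ALL_QUOTA)
		return -EINVAL;
	return 0;
}
```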

Fixes: 9da93f9b7cdf ("xfs: fix Q_XQUOTARM ioctl")
Reported-by: Yang Xu
Signed-off-by: Jan Kara
Reviewed-by: Eric Sandeen
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2020-01-26 17:00:59 +0800

23 Jan, 2020

10 commits

5090afc7d reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr

commit 394440d469413fa9b74f88a11f144d76017221f2 upstream.

Commit 60e4cf67a58 (reiserfs: fix extended attributes on the root
directory) introduced a regression: open_xa_root started returning
-EOPNOTSUPP, but it was not handled properly in reiserfs_for_each_xattr.

When the reiserfs module is built without CONFIG_REISERFS_FS_XATTR,
deleting an inode would result in a warning and chowning an inode
would also result in a warning and then fail to complete.

With CONFIG_REISERFS_FS_XATTR enabled, the xattr root would always be
present for read-write operations.

This commit handles -EOPNOTSUPP in the same way -ENODATA is handled.
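
A minimal sketch of that handling, assuming -ENODATA is treated as "nothing to iterate over" rather than as a failure. `demo_xattr_walk_result` is a hypothetical helper, not the reiserfs function.

```c
#include <errno.h>

/*
 * Map the "no xattrs here" results to success:
 *   -ENODATA:    this object has no xattrs
 *   -EOPNOTSUPP: xattrs are not available at all
 * Any other error is passed through unchanged.
 */
int demo_xattr_walk_result(int err)
{
	if (err == -ENODATA || err == -EOPNOTSUPP)
		return 0;
	return err;
}
```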

Fixes: 60e4cf67a582 ("reiserfs: fix extended attributes on the root directory")
CC: stable@vger.kernel.org # Commit 60e4cf67a58 was picked up by stable
Link: https://lore.kernel.org/r/20200115180059.6935-1-jeffm@suse.com
Reported-by: Michael Brunnbauer
Signed-off-by: Jeff Mahoney
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman

Jeff Mahoney
2020-01-23 15:22:57 +0800
cef6f2aed Btrfs: always copy scrub arguments back to user space

commit 5afe6ce748c1ea99e0d648153c05075e1ab93afb upstream.

If scrub returns an error we do not copy the scrub arguments structure
back to user space. This prevents user space from knowing how much
progress scrub has made if an error happened. This includes -ECANCELED,
which is returned when users ask for scrub to stop. A particular use
case, found in btrfs-progs, is resuming scrub after it is canceled: it
relies on reading the progress from the scrub arguments structure and
then passing that progress in a call to resume scrub.

So fix this by always copying the scrub arguments structure to user
space, overwriting the value returned to user space with -EFAULT only if
copying the structure failed. That tells user space that either the
copy did not happen, and therefore the structure is stale, or it
happened partially and the structure is probably invalid and corrupt
due to the partial copy.
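
The resulting control flow can be sketched in userspace as follows. The `demo_` names are hypothetical, and `demo_copy_to_user()` is a stand-in that fails only when the destination pointer is NULL.

```c
#include <errno.h>
#include <stddef.h>

struct demo_scrub_args {
	unsigned long long progress;
};

/* Stand-in for copy_to_user(): non-zero means the copy failed. */
static int demo_copy_to_user(struct demo_scrub_args *dst,
			     const struct demo_scrub_args *src)
{
	if (!dst)
		return 1;
	*dst = *src;
	return 0;
}

/* Always copy the arguments back; only a failed copy replaces the result. */
int demo_scrub_ioctl(struct demo_scrub_args *user,
		     unsigned long long progress, int scrub_ret)
{
	struct demo_scrub_args kargs = { .progress = progress };
	int ret = scrub_ret;

	if (demo_copy_to_user(user, &kargs))
		ret = -EFAULT;
	return ret;
}
```

A canceled scrub therefore still reports -ECANCELED while handing the progress back, which is what a later resume depends on.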

Reported-by: Graham Cobb
Link: https://lore.kernel.org/linux-btrfs/d0a97688-78be-08de-ca7d-bcb4c7fb397e@cobb.uk.net/
Fixes: 06fe39ab15a6a4 ("Btrfs: do not overwrite scrub error with fault error in scrub ioctl")
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Johannes Thumshirn
Reviewed-by: Qu Wenruo
Tested-by: Graham Cobb
Signed-off-by: Filipe Manana
Reviewed-by: David Sterba
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Filipe Manana
2020-01-23 15:22:41 +0800
2f7050c2b btrfs: check rw_devices, not num_devices for balance

commit b35cf1f0bf1f2b0b193093338414b9bd63b29015 upstream.

The fstest btrfs/154 reports

[ 8675.381709] BTRFS: Transaction aborted (error -28)
[ 8675.383302] WARNING: CPU: 1 PID: 31900 at fs/btrfs/block-group.c:2038 btrfs_create_pending_block_groups+0x1e0/0x1f0 [btrfs]
[ 8675.390925] CPU: 1 PID: 31900 Comm: btrfs Not tainted 5.5.0-rc6-default+ #935
[ 8675.392780] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 8675.395452] RIP: 0010:btrfs_create_pending_block_groups+0x1e0/0x1f0 [btrfs]
[ 8675.402672] RSP: 0018:ffffb2090888fb00 EFLAGS: 00010286
[ 8675.404413] RAX: 0000000000000000 RBX: ffff92026dfa91c8 RCX: 0000000000000001
[ 8675.406609] RDX: 0000000000000000 RSI: ffffffff8e100899 RDI: ffffffff8e100971
[ 8675.408775] RBP: ffff920247c61660 R08: 0000000000000000 R09: 0000000000000000
[ 8675.410978] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffe4
[ 8675.412647] R13: ffff92026db74000 R14: ffff920247c616b8 R15: ffff92026dfbc000
[ 8675.413994] FS: 00007fd5e57248c0(0000) GS:ffff92027d800000(0000) knlGS:0000000000000000
[ 8675.416146] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8675.417833] CR2: 0000564aa51682d8 CR3: 000000006dcbc004 CR4: 0000000000160ee0
[ 8675.419801] Call Trace:
[ 8675.420742] btrfs_start_dirty_block_groups+0x355/0x480 [btrfs]
[ 8675.422600] btrfs_commit_transaction+0xc8/0xaf0 [btrfs]
[ 8675.424335] reset_balance_state+0x14a/0x190 [btrfs]
[ 8675.425824] btrfs_balance.cold+0xe7/0x154 [btrfs]
[ 8675.427313] ? kmem_cache_alloc_trace+0x235/0x2c0
[ 8675.428663] btrfs_ioctl_balance+0x298/0x350 [btrfs]
[ 8675.430285] btrfs_ioctl+0x466/0x2550 [btrfs]
[ 8675.431788] ? mem_cgroup_charge_statistics+0x51/0xf0
[ 8675.433487] ? mem_cgroup_commit_charge+0x56/0x400
[ 8675.435122] ? do_raw_spin_unlock+0x4b/0xc0
[ 8675.436618] ? _raw_spin_unlock+0x1f/0x30
[ 8675.438093] ? __handle_mm_fault+0x499/0x740
[ 8675.439619] ? do_vfs_ioctl+0x56e/0x770
[ 8675.441034] do_vfs_ioctl+0x56e/0x770
[ 8675.442411] ksys_ioctl+0x3a/0x70
[ 8675.443718] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 8675.445333] __x64_sys_ioctl+0x16/0x20
[ 8675.446705] do_syscall_64+0x50/0x210
[ 8675.448059] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 8675.479187] BTRFS: error (device vdb) in btrfs_create_pending_block_groups:2038: errno=-28 No space left

We now use btrfs_can_overcommit() to see if we can flip a block group
read only. Before this would fail because we weren't taking into
account the usable un-allocated space for allocating chunks. With my
patches we were allowed to do the balance, which is technically correct.

The test is trying to start balance on degraded mount. So now we're
trying to allocate a chunk and cannot because we want to allocate a
RAID1 chunk, but there's only 1 device that's available for usage. This
results in an ENOSPC.

But we shouldn't even be making it this far, we don't have enough
devices to restripe. The problem is we're using btrfs_num_devices(),
that also includes missing devices. That's not actually what we want, we
need to use rw_devices.
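
A minimal sketch of the difference between the two counters, assuming a profile that needs a minimum number of writable devices. The `demo_` struct, the `min_devs` parameter, and the `-EINVAL` return value are hypothetical.

```c
#include <errno.h>

struct demo_fs_devices {
	int num_devices;	/* includes missing devices */
	int rw_devices;		/* devices that can actually be written */
};

/* Refuse the balance up front when too few writable devices exist. */
int demo_check_balance_devices(const struct demo_fs_devices *devs,
			       int min_devs)
{
	if (devs->rw_devices < min_devs)
		return -EINVAL;
	return 0;
}
```

On a degraded two-device RAID1 mount, `num_devices` is 2 but `rw_devices` is 1, so only the second counter rejects the request before any chunk allocation is attempted.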

The chunk_mutex is not needed here, rw_devices changes only in device
add, remove or replace, all are excluded by EXCL_OP mechanism.

Fixes: e4d8ec0f65b9 ("Btrfs: implement online profile changing")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik
Reviewed-by: David Sterba
[ add stacktrace, update changelog, drop chunk_mutex ]
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2020-01-23 15:22:41 +0800
b25e68dd5 btrfs: fix memory leak in qgroup accounting

commit 26ef8493e1ab771cb01d27defca2fa1315dc3980 upstream.

When running xfstests on the current btrfs I get the following splat from
kmemleak:

unreferenced object 0xffff88821b2404e0 (size 32):
comm "kworker/u4:7", pid 26663, jiffies 4295283698 (age 8.776s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 10 ff fd 26 82 88 ff ff ...........&....
10 ff fd 26 82 88 ff ff 20 ff fd 26 82 88 ff ff ...&.... ..&....
backtrace:
[] ulist_alloc+0x25/0x60 [btrfs]
[] btrfs_find_all_roots_safe+0x41/0x100 [btrfs]
[] btrfs_find_all_roots+0x52/0x70 [btrfs]
[] btrfs_qgroup_rescan_worker+0x343/0x680 [btrfs]
[] btrfs_work_helper+0xac/0x1e0 [btrfs]
[] process_one_work+0x1cf/0x350
[] worker_thread+0x28/0x3c0
[] kthread+0x109/0x120
[] ret_from_fork+0x35/0x40

This corresponds to:

(gdb) l *(btrfs_find_all_roots_safe+0x41)
0x8d7e1 is in btrfs_find_all_roots_safe (fs/btrfs/backref.c:1413).
1408
1409 tmp = ulist_alloc(GFP_NOFS);
1410 if (!tmp)
1411 return -ENOMEM;
1412 *roots = ulist_alloc(GFP_NOFS);
1413 if (!*roots) {
1414 ulist_free(tmp);
1415 return -ENOMEM;
1416 }
1417

Following the lifetime of the allocated 'roots' ulist, it gets freed
again in btrfs_qgroup_account_extent().

But this does not happen if the function is called with the
'BTRFS_FS_QUOTA_ENABLED' flag cleared: btrfs_qgroup_account_extent()
then takes an early exit and returns directly.

Instead of directly returning we should jump to the 'out_free' label in
order to free all resources as expected.
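
The early-exit pattern can be sketched in userspace as below. The `demo_` names, the two list parameters, and the live-allocation counter are assumptions used to make the leak observable, and do not reflect the btrfs data structures.

```c
#include <stdlib.h>

struct demo_ulist {
	int unused;
};

/* Number of lists currently allocated; lets a caller observe leaks. */
int demo_live_ulists;

struct demo_ulist *demo_ulist_alloc(void)
{
	struct demo_ulist *u = malloc(sizeof(*u));

	if (u)
		demo_live_ulists++;
	return u;
}

void demo_ulist_free(struct demo_ulist *u)
{
	if (!u)
		return;
	demo_live_ulists--;
	free(u);
}

/* Takes ownership of both lists on every path, including the early exit. */
int demo_account_extent(int quota_enabled, struct demo_ulist *old_roots,
			struct demo_ulist *new_roots)
{
	int ret = 0;

	if (!quota_enabled)
		goto out_free;	/* a bare "return 0" here would leak */

	/* ... the actual accounting would happen here ... */

out_free:
	demo_ulist_free(old_roots);
	demo_ulist_free(new_roots);
	return ret;
}
```

Routing the disabled-quota case through the same label keeps a single place responsible for freeing the lists.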

CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Qu Wenruo
Signed-off-by: Johannes Thumshirn
[ add comment ]
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Johannes Thumshirn
2020-01-23 15:22:41 +0800
707de9c08 btrfs: relocation: fix reloc_root lifespan and access

commit 6282675e6708ec78518cc0e9ad1f1f73d7c5c53d upstream.

[BUG]
There are several different KASAN reports for balance + snapshot
workloads. Involved call paths include:

should_ignore_root+0x54/0xb0 [btrfs]
build_backref_tree+0x11af/0x2280 [btrfs]
relocate_tree_blocks+0x391/0xb80 [btrfs]
relocate_block_group+0x3e5/0xa00 [btrfs]
btrfs_relocate_block_group+0x240/0x4d0 [btrfs]
btrfs_relocate_chunk+0x53/0xf0 [btrfs]
btrfs_balance+0xc91/0x1840 [btrfs]
btrfs_ioctl_balance+0x416/0x4e0 [btrfs]
btrfs_ioctl+0x8af/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10

create_reloc_root+0x9f/0x460 [btrfs]
btrfs_reloc_post_snapshot+0xff/0x6c0 [btrfs]
create_pending_snapshot+0xa9b/0x15f0 [btrfs]
create_pending_snapshots+0x111/0x140 [btrfs]
btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
btrfs_mksubvol+0x915/0x960 [btrfs]
btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
btrfs_ioctl+0x241b/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10

btrfs_reloc_pre_snapshot+0x85/0xc0 [btrfs]
create_pending_snapshot+0x209/0x15f0 [btrfs]
create_pending_snapshots+0x111/0x140 [btrfs]
btrfs_commit_transaction+0x7a6/0x1360 [btrfs]
btrfs_mksubvol+0x915/0x960 [btrfs]
btrfs_ioctl_snap_create_transid+0x1d5/0x1e0 [btrfs]
btrfs_ioctl_snap_create_v2+0x1d3/0x270 [btrfs]
btrfs_ioctl+0x241b/0x3e60 [btrfs]
do_vfs_ioctl+0x831/0xb10

[CAUSE]
All these call sites rely only on root->reloc_root, which can
undergo btrfs_drop_snapshot(). Since we don't have real refcount
based protection for reloc roots, we can reach an already dropped reloc
root, triggering KASAN.

[FIX]
To avoid such access to unstable root->reloc_root, we should check
BTRFS_ROOT_DEAD_RELOC_TREE bit first.

This patch introduces wrappers that provide the correct way to check the
bit with memory barriers protection.

Most callers don't distinguish merged reloc tree and no reloc tree. The
only exception is should_ignore_root(), as merged reloc tree can be
ignored, while no reloc tree shouldn't.

[CRITICAL SECTION ANALYSIS]
Although test_bit()/set_bit()/clear_bit() don't imply a barrier, the
DEAD_RELOC_TREE bit gets extra help from the transaction, which acts as a
higher level barrier. The lifespan of root::reloc_root and the
DEAD_RELOC_TREE bit is:

NULL: reloc_root is NULL              PTR: reloc_root is not NULL
0: DEAD_RELOC_ROOT bit not set        DEAD: DEAD_RELOC_ROOT bit set

(NULL, 0)    Initial state                      __
  |                                             /\ Section A
btrfs_init_reloc_root()                         \/
  |                                             __
(PTR, 0)     reloc_root initialized             /\
  |                                             |
btrfs_update_reloc_root()                       |  Section B
  |                                             |
(PTR, DEAD)  reloc_root has been merged         \/
  |                                             __
=== btrfs_commit_transaction() ====================
  |                                             /\
clean_dirty_subvols()                           |
  |                                             |  Section C
(NULL, DEAD) reloc_root cleanup starts          \/
  |                                             __
btrfs_drop_snapshot()                           /\
  |                                             |  Section D
(NULL, 0)    Back to initial state              \/

Every have_reloc_root() or test_bit(DEAD_RELOC_ROOT) caller holds a
transaction handle, so no such caller can cross a transaction boundary.

In Section A, every caller finds the DEAD bit clear and grabs reloc_root.

At the A-B boundary, a caller may see the DEAD bit clear, but reloc_root
is still completely valid, so accessing it is safe.

No test_bit() caller can cross the boundary between Section B and Section C.

In Section C, every caller finds the DEAD bit set, so no one accesses
reloc_root.

At the C-D boundary, a caller either sees the DEAD bit set and avoids
accessing reloc_root whether or not that would be safe, or sees the DEAD
bit cleared and accesses reloc_root, which is already NULL, so nothing
goes wrong.

The memory write barriers are between the reloc_root updates and bit
set/clear, the pairing read side is before test_bit.
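
The state machine above can be approximated in user-space C. This is a
rough sketch under stated assumptions, not the kernel wrappers: C11
acquire/release atomics are swapped in for the kernel's bit operations and
explicit memory barriers, and struct root_sketch, root_init(),
init_reloc_root(), mark_merged(), cleanup_reloc_root() and get_reloc_root()
are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct reloc_root_sketch { int id; };

struct root_sketch {
	atomic_bool dead;				/* models the DEAD bit */
	_Atomic(struct reloc_root_sketch *) reloc_root;	/* NULL or PTR */
};

static void root_init(struct root_sketch *root)		/* (NULL, 0) */
{
	atomic_init(&root->dead, false);
	atomic_init(&root->reloc_root, (struct reloc_root_sketch *)NULL);
}

static void init_reloc_root(struct root_sketch *root,
			    struct reloc_root_sketch *rr)	/* -> (PTR, 0) */
{
	assert(rr != NULL);
	atomic_store_explicit(&root->reloc_root, rr, memory_order_release);
}

static void mark_merged(struct root_sketch *root)	/* -> (PTR, DEAD) */
{
	atomic_store_explicit(&root->dead, true, memory_order_release);
}

static void cleanup_reloc_root(struct root_sketch *root)
{
	/* -> (NULL, DEAD): the pointer is cleared first ... */
	atomic_store_explicit(&root->reloc_root,
			      (struct reloc_root_sketch *)NULL,
			      memory_order_relaxed);
	/* ... -> (NULL, 0): the release store orders the bit clear after it. */
	atomic_store_explicit(&root->dead, false, memory_order_release);
}

/* Reader side: check the DEAD flag first, then look at the pointer. */
static struct reloc_root_sketch *get_reloc_root(struct root_sketch *root)
{
	if (atomic_load_explicit(&root->dead, memory_order_acquire))
		return NULL;	/* merged: treat as absent */
	return atomic_load_explicit(&root->reloc_root, memory_order_relaxed);
}
```

A reader that observes the flag cleared after cleanup also observes the
NULL pointer, which mirrors the C-D boundary argument above.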

Reported-by: Zygo Blaxell
Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik
Signed-off-by: Qu Wenruo
Reviewed-by: David Sterba
[ barriers ]
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Qu Wenruo
2020-01-23 15:22:40 +0800
4c281ce51 btrfs: do not delete mismatched root refs

commit 423a716cd7be16fb08690760691befe3be97d3fc upstream.

btrfs_del_root_ref() will simply WARN_ON() if the ref doesn't match in
any way, and then continue to delete the reference. This shouldn't
happen, we have these values because there's more to the reference than
the original root and the sub root. If any of these checks fail, return
-ENOENT.
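
The changed behaviour can be sketched in user-space C as "verify every
field, and refuse to delete on any mismatch". This is an illustration
only: struct root_ref_sketch and del_root_ref_sketch() are hypothetical,
and the real function compares fields read from a tree leaf.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory model of one root ref item. */
struct root_ref_sketch {
	uint64_t dirid;		/* directory that holds the link */
	const char *name;	/* link name */
	bool present;		/* false once deleted */
};

/*
 * Delete the ref only when every identifying field matches.
 * Before the fix, a mismatch merely warned and the delete went ahead.
 */
static int del_root_ref_sketch(struct root_ref_sketch *ref,
			       uint64_t dirid, const char *name)
{
	assert(ref != NULL && name != NULL);

	if (!ref->present)
		return -ENOENT;
	if (ref->dirid != dirid || strcmp(ref->name, name) != 0)
		return -ENOENT;	/* mismatch: leave the ref in place */

	ref->present = false;
	return 0;
}
```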

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik
Reviewed-by: David Sterba
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2020-01-23 15:22:40 +0800
d5e34783c btrfs: fix invalid removal of root ref

commit d49d3287e74ffe55ae7430d1e795e5f9bf7359ea upstream.

If we have the following sequence of events

btrfs sub create A
btrfs sub create A/B
btrfs sub snap A C
mkdir C/foo
mv A/B C/foo
rm -rf *

We will end up with a transaction abort.

This happens because we create a root ref for B pointing to A.
When we create the snapshot C we still have B in our tree, but because
the root ref points to A and not C we will make it appear to be empty.

The problem happens when we move B into C. This removes the root ref
for B pointing to A and adds a ref of B pointing to C. When we rmdir C
we'll see that we have a ref to our root and remove the root ref,
despite not actually matching our reference name.

Now btrfs_del_root_ref() allowing this to work is a bug as well. However,
we know that this inode does not actually point to a root ref, so we
shouldn't be calling btrfs_del_root_ref() in the first place. Instead we
simply look up our dir index for this item and do the rest of the removal.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik
Reviewed-by: David Sterba
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Josef Bacik
2020-01-23 15:22:40 +0800
a8ac2da72 btrfs: rework arguments of btrfs_unlink_subvol

[ Upstream commit 045d3967b6920b663fc010ad414ade1b24143bd1 ]

btrfs_unlink_subvol takes the name of the dentry and the root objectid
based on what kind of inode this is, either a real subvolume link or an
empty one that we inherited as a snapshot. We need to fix how we unlink
in the case for BTRFS_EMPTY_SUBVOL_DIR_OBJECTID in the future, so rework
btrfs_unlink_subvol to just take the dentry and handle getting the right
objectid given the type of inode this is. There is no functional change
here, simply pushing the work into btrfs_unlink_subvol() proper.

Signed-off-by: Josef Bacik
Reviewed-by: David Sterba
Signed-off-by: David Sterba
Signed-off-by: Sasha Levin

Josef Bacik
2020-01-23 15:22:40 +0800
af2e7c923 io_uring: only allow submit from owning task

commit 44d282796f81eb1debc1d7cb53245b4cb3214cb5 upstream.

If the credentials or the mm doesn't match, don't allow the task to
submit anything on behalf of this ring. The task that owns the ring can
pass the file descriptor to another task, but we don't want to allow
that task to submit an SQE that then assumes the ring mm and creds if
it needs to go async.

Cc: stable@vger.kernel.org
Suggested-by: Stefan Metzmacher
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Jens Axboe
2020-01-23 15:22:32 +0800
7e7f29200 fuse: fix fuse_send_readpages() in the syncronous read case

commit 7df1e988c723a066754090b22d047c3225342152 upstream.

Buffered read in fuse normally goes via:

  -> generic_file_buffered_read()
    -> fuse_readpages()
      -> fuse_send_readpages()
        -> fuse_simple_request() [called since v5.4]

In the case of a read request, fuse_simple_request() will return a
non-negative bytecount on success or a negative error value. A positive
bytecount was taken to be an error and the PG_error flag set on the page.
This resulted in generic_file_buffered_read() falling back to ->readpage(),
which would repeat the read request and succeed. Because of the repeated
read succeeding the bug was not detected with regression tests or other use
cases.

The FTP module in GVFS however fails the second read due to the
non-seekable nature of FTP downloads.

Fix by checking and ignoring positive return value from
fuse_simple_request().

Reported-by: Ondrej Holy
Link: https://gitlab.gnome.org/GNOME/gvfs/issues/441
Fixes: 134831e36bbd ("fuse: convert readpages to simple api")
Cc: # v5.4
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman

Miklos Szeredi
2020-01-23 15:22:32 +0800

18 Jan, 2020

13 commits

dae87141c ocfs2: call journal flush to mark journal as empty after journal recovery when mount

[ Upstream commit 397eac17f86f404f5ba31d8c3e39ec3124b39fd3 ]

If the journal is dirty at mount time, it will be replayed, but the jbd2
sb log tail cannot be updated to mark a new start because journal->j_flag
has already been set with JBD2_ABORT first in journal_init_common.

When a new transaction is committed, it will be recorded in block 1
first (journal->j_tail is set to 1 in journal_reset). If an emergency
restart happens again before the journal super block is updated, the
newly recorded transaction will not be replayed on the next mount.

The following steps describe this procedure in detail.
1. mount and touch some files
2. these transactions are committed to journal area but not checkpointed
3. emergency restart
4. mount again and its journals are replayed
5. journal super block's first s_start is 1, but its s_seq is not updated
6. touch a new file and its trans is committed but not checkpointed
7. emergency restart again
8. mount and journal is dirty, but trans committed in 6 will not be
replayed.

This exception is easy to hit when the LUN is used by only one node.
If it is used by multiple nodes, another node will replay its journal and
its journal super block will be updated after recovery, as this patch
does.

ocfs2_recover_node->ocfs2_replay_journal.

The following jbd2 journal can be generated by touching a new file after
journal is replayed, and seq 15 is the first valid commit, but first seq
is 13 in journal super block.

logdump:
Block 0: Journal Superblock
Seq: 0 Type: 4 (JBD2_SUPERBLOCK_V2)
Blocksize: 4096 Total Blocks: 32768 First Block: 1
First Commit ID: 13 Start Log Blknum: 1
Error: 0
Feature Compat: 0
Feature Incompat: 2 block64
Feature RO compat: 0
Journal UUID: 4ED3822C54294467A4F8E87D2BA4BC36
FS Share Cnt: 1 Dynamic Superblk Blknum: 0
Per Txn Block Limit Journal: 0 Data: 0

Block 1: Journal Commit Block
Seq: 14 Type: 2 (JBD2_COMMIT_BLOCK)

Block 2: Journal Descriptor
Seq: 15 Type: 1 (JBD2_DESCRIPTOR_BLOCK)
No. Blocknum Flags
0. 587 none
UUID: 00000000000000000000000000000000
1. 8257792 JBD2_FLAG_SAME_UUID
2. 619 JBD2_FLAG_SAME_UUID
3. 24772864 JBD2_FLAG_SAME_UUID
4. 8257802 JBD2_FLAG_SAME_UUID
5. 513 JBD2_FLAG_SAME_UUID JBD2_FLAG_LAST_TAG
...
Block 7: Inode
Inode: 8257802 Mode: 0640 Generation: 57157641 (0x3682809)
FS Generation: 2839773110 (0xa9437fb6)
CRC32: 00000000 ECC: 0000
Type: Regular Attr: 0x0 Flags: Valid
Dynamic Features: (0x1) InlineData
User: 0 (root) Group: 0 (root) Size: 7
Links: 1 Clusters: 0
ctime: 0x5de5d870 0x11104c61 -- Tue Dec 3 11:37:20.286280801 2019
atime: 0x5de5d870 0x113181a1 -- Tue Dec 3 11:37:20.288457121 2019
mtime: 0x5de5d870 0x11104c61 -- Tue Dec 3 11:37:20.286280801 2019
dtime: 0x0 -- Thu Jan 1 08:00:00 1970
...
Block 9: Journal Commit Block
Seq: 15 Type: 2 (JBD2_COMMIT_BLOCK)

The following is the journal recovery log produced when the jbd2 journal
above is recovered on the next mount.

syslog:
ocfs2: File system on device (252,1) was not unmounted cleanly, recovering it.
fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 0
fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 1
fs/jbd2/recovery.c:(do_one_pass, 449): Starting recovery pass 2
fs/jbd2/recovery.c:(jbd2_journal_recover, 278): JBD2: recovery, exit status 0, recovered transactions 13 to 13

Because the first commit seq 13 recorded in the journal super block is not
consistent with the value recorded in block 1 (seq 14), journal recovery
terminates before seq 15 even though it is an unbroken commit. Inode
8257802 is a new file and it will be lost.
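
Why recovery stops early can be sketched with a heavily simplified
user-space model. This is not the jbd2 recovery code: struct jblock_sketch
and replay_sketch() are hypothetical, only descriptor and commit blocks are
modeled, and the scan simply stops at the first block whose sequence number
differs from the one expected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a journal block header. */
struct jblock_sketch {
	bool is_commit;	/* commit block, otherwise descriptor block */
	uint32_t seq;	/* transaction sequence number */
};

/*
 * Walk the log from its first block, expecting first_commit_id.
 * Returns how many transactions were found complete and replayed.
 */
static unsigned int replay_sketch(const struct jblock_sketch *log, size_t n,
				  uint32_t first_commit_id)
{
	uint32_t expected = first_commit_id;
	unsigned int replayed = 0;

	assert(log != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (log[i].seq != expected)
			break;		/* stale or unexpected block: stop */
		if (log[i].is_commit) {
			replayed++;
			expected++;
		}
	}
	return replayed;
}
```

Feeding it the blocks from the logdump (commit 14, descriptor 15, commit 15)
with an expected first commit ID of 13 replays nothing, which matches the
lost inode described above.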

Link: http://lkml.kernel.org/r/20191217020140.2197-1-li.kai4@h3c.com
Signed-off-by: Kai Li
Reviewed-by: Joseph Qi
Reviewed-by: Changwei Ge
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Junxiao Bi
Cc: Gang He
Cc: Jun Piao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin

Kai Li
2020-01-18 02:49:08 +0800
4ef359320 NFSD fixing possible null pointer derefering in copy offload

commit 18f428d4e2f7eff162d80b2b21689496c4e82afd upstream.

Static checker revealed possible error path leading to possible
NULL pointer dereferencing.

Reported-by: Dan Carpenter
Fixes: e0639dc5805a: ("NFSD introduce async copy feature")
Signed-off-by: Olga Kornievskaia
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Olga Kornievskaia
2020-01-18 02:49:02 +0800
382e63a56 f2fs: fix potential overflow

commit 1f0d5c911b64165c9754139a26c8c2fad352c132 upstream.

We expect a 64-bit result from the statement below. However, on 32-bit
machines a looped left shift on a pgoff_t variable may overflow.
Fix it by forcing a type cast.

page->index << PAGE_SHIFT;
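
The overflow can be reproduced in user-space C on any machine by modeling
the 32-bit pgoff_t explicitly. This is a small sketch under stated
assumptions: pgoff32_t, PAGE_SHIFT_SKETCH and both helper functions are
hypothetical, and uint64_t is used here as the wide type.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumes 4 KiB pages */

/* Models pgoff_t (unsigned long) as seen on a 32-bit machine. */
typedef uint32_t pgoff32_t;
static_assert(sizeof(pgoff32_t) == 4, "models a 32-bit pgoff_t");

/* Buggy form: the shift happens in 32 bits, then the result is widened. */
static uint64_t offset_truncated(pgoff32_t index)
{
	return index << PAGE_SHIFT_SKETCH;
}

/* Fixed form: widen first, then shift in 64 bits. */
static uint64_t offset_widened(pgoff32_t index)
{
	return (uint64_t)index << PAGE_SHIFT_SKETCH;
}
```

For page index 0x100000 (the page at the 4 GiB boundary) the first helper
wraps to 0 while the second returns 0x100000000.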

Fixes: 26de9b117130 ("f2fs: avoid unnecessary updating inode during fsync")
Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up")
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
Signed-off-by: Greg Kroah-Hartman

Chao Yu
2020-01-18 02:49:02 +0800
2d657e3ac ubifs: do_kill_orphans: Fix a memory leak bug

commit 10256f000932f12596dc043cf880ecf488a32510 upstream.

If there is more than one valid snod on the sleb->nodes list,
do_kill_orphans will allocate ino more than once without releasing the
previous ino's memory, which results in a memory leak.

Fixes: ee1438ce5dc4 ("ubifs: Check link count of inodes when...")
Signed-off-by: Zhihao Cheng
Signed-off-by: zhangyi (F)
Signed-off-by: Richard Weinberger
Signed-off-by: Greg Kroah-Hartman

Zhihao Cheng
2020-01-18 02:49:00 +0800
c7e5f0942 ubifs: Fixed missed le64_to_cpu() in journal

commit df22b5b3ecc6233e33bd27f67f14c0cd1b5a5897 upstream.

In the ubifs_jnl_write_inode() function, it calls ubifs_iget()
with xent->inum. The xent->inum is __le64, but ubifs_iget()
takes a native CPU-endian value.

I think that this should be changed to passing le64_to_cpu(xent->inum)
to fix the following sparse warning:

fs/ubifs/journal.c:902:58: warning: incorrect type in argument 2 (different base types)
fs/ubifs/journal.c:902:58: expected unsigned long inum
fs/ubifs/journal.c:902:58: got restricted __le64 [usertype] inum

Fixes: 7959cf3a7506 ("ubifs: journal: Handle xattrs like files")
Signed-off-by: Ben Dooks
Signed-off-by: Richard Weinberger
Signed-off-by: Greg Kroah-Hartman

Ben Dooks (Codethink)
2020-01-18 02:49:00 +0800
e071addac Revert "ubifs: Fix memory leak bug in alloc_ubifs_info() error path"

commit 91cbf01178c37086b32148c53e24b04cb77557cf upstream.

This reverts commit 9163e0184bd7d5f779934d34581843f699ad2ffd.

At the point when ubifs_fill_super() runs, we already have a reference
to the super block. So upon deactivate_locked_super() c will get
free()'ed via ->kill_sb().

Cc: Wenwen Wang
Fixes: 9163e0184bd7 ("ubifs: Fix memory leak bug in alloc_ubifs_info() error path")
Reported-by: https://twitter.com/grsecurity/status/1180609139359277056
Signed-off-by: Richard Weinberger
Tested-by: Romain Izard
Signed-off-by: Richard Weinberger
Signed-off-by: Greg Kroah-Hartman

Richard Weinberger
2020-01-18 02:48:59 +0800
de1605c60 gfs2: add compat_ioctl support

commit 8d0980704842e8a68df2c3164c1c165e5c7ebc08 upstream.

Out of the four ioctl commands supported on gfs2, only FITRIM
works in compat mode.

Add a proper handler based on the ext4 implementation.

Fixes: 6ddc5c3ddf25 ("gfs2: getlabel support")
Reviewed-by: Bob Peterson
Cc: Andreas Gruenbacher
Signed-off-by: Arnd Bergmann
Signed-off-by: Greg Kroah-Hartman

Arnd Bergmann
2020-01-18 02:48:52 +0800
6bdc0eab8 affs: fix a memory leak in affs_remount

commit 450c3d4166837c496ebce03650c08800991f2150 upstream.

In affs_remount if data is provided it is duplicated into new_opts. The
allocated memory for new_opts is only released if parse_options fails.

There's a bit of history behind new_opts: originally there was
save/replace options on the VFS layer, so the 'data' passed must not
change (thus strdup). This got cleaned up in later patches, but not
completely.

There's no reason to do the strdup in cases where the filesystem does
not need to reuse the 'data' again, because strsep would modify it
directly.

Fixes: c8f33d0bec99 ("affs: kstrdup() memory handling")
Signed-off-by: Navid Emamdoost
[ update changelog ]
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

Navid Emamdoost
2020-01-18 02:48:50 +0800
64a549fa9 NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturn

commit 5326de9e94bedcf7366e7e7625d4deb8c1f1ca8a upstream.

If nfs4_delegreturn_prepare needs to wait for a layoutreturn to complete
then make sure we drop the sequence slot if we hold it.

Fixes: 1c5bd76d17cc ("pNFS: Enable layoutreturn operation for return-on-close")
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

Trond Myklebust
2020-01-18 02:48:48 +0800
92f31482e NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()

commit 5c441544f045e679afd6c3c6d9f7aaf5fa5f37b0 upstream.

If the server returns a bad or dead session error, then we don't want
to update the session slot number, but just immediately schedule
recovery and allow it to proceed.

We can/should then remove handling in other places.

Fixes: 3453d5708b33 ("NFSv4.1: Avoid false retries when RPC calls are interrupted")
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

Trond Myklebust
2020-01-18 02:48:47 +0800
b09ed8142 nfsd: v4 support requires CRYPTO_SHA256

commit a2e2f2dc77a18d2b0f450fb7fcb4871c9f697822 upstream.

The new nfsdcld client tracking operations use sha256 to compute hashes
of the kerberos principals, so make sure CRYPTO_SHA256 is enabled.

Fixes: 6ee95d1c8991 ("nfsd: add support for upcall version 2")
Reported-by: Jamie Heilman
Signed-off-by: Scott Mayhew
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Scott Mayhew
2020-01-18 02:48:47 +0800
0efb7388f nfsd: Fix cld_net->cn_tfm initialization

commit 18b9a895e652979b70f9c20565394a69354dfebc upstream.

Don't assign an error pointer to cld_net->cn_tfm, otherwise an oops will
occur in nfsd4_remove_cld_pipe().

Also, move the initialization of cld_net->cn_tfm so that it occurs after
the check to see if nfsdcld is running. This is necessary because
nfsd4_client_tracking_init() looks for -ETIMEDOUT to determine whether
to use the "old" nfsdcld tracking ops.
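
The first point can be sketched in user-space C by re-creating the
error-pointer convention. This is an illustration under stated
assumptions: err_ptr(), ptr_err(), is_err(), struct cld_net_sketch,
alloc_tfm_sketch(), init_sketch() and remove_sketch() are all hypothetical
user-space stand-ins, and malloc()/free() replace the crypto allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in the top of the pointer range, and back. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct cld_net_sketch { void *tfm; };

/* Hypothetical allocator that can be told to fail. */
static void *alloc_tfm_sketch(bool fail)
{
	void *p = fail ? NULL : malloc(16);

	return p ? p : err_ptr(-ENOMEM);
}

/* Use a local first, so the struct field never holds an error pointer. */
static int init_sketch(struct cld_net_sketch *cn, bool fail)
{
	void *tfm = alloc_tfm_sketch(fail);

	if (is_err(tfm))
		return (int)ptr_err(tfm);	/* cn->tfm stays NULL */
	cn->tfm = tfm;
	return 0;
}

/* Teardown may now assume the field is either NULL or a valid pointer. */
static void remove_sketch(struct cld_net_sketch *cn)
{
	assert(!is_err(cn->tfm));
	free(cn->tfm);
	cn->tfm = NULL;
}
```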

Fixes: 6ee95d1c8991 ("nfsd: add support for upcall version 2")
Reported-by: Jamie Heilman
Signed-off-by: Scott Mayhew
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Scott Mayhew
2020-01-18 02:48:47 +0800
2455e1b81 NFSv2: Fix a typo in encode_sattr()

commit ad97a995d8edff820d4238bd0dfc69f440031ae6 upstream.

Encode the mtime correctly.

Fixes: 95582b0083883 ("vfs: change inode times to use struct timespec64")
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

Trond Myklebust
2020-01-18 02:48:47 +0800