Eric Lee / smarc-fsl-linux-kernel

08 Sep, 2021

1 commit

3754707bc Revert "memcg: enable accounting for file lock caches" ... Browse Code »

This reverts commit 0f12156dff2862ac54235fc72703f18770769042.

The kernel test robot reports a sizeable performance regression for this
commit, and while it clearly does the rigth thing in theory, we'll need
to look at just how to avoid or minimize the performance overhead of the
memcg accounting.

People already have suggestions on how to do that, but it's "future
work".

So revert it for now.

Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/
Acked-by: Jens Axboe
Acked-by: Shakeel Butt
Acked-by: Roman Gushchin
Cc: Tejun Heo
Signed-off-by: Linus Torvalds

Linus Torvalds
2021-09-08 02:21:48 +0800

04 Sep, 2021

2 commits

14726903c Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge misc updates from Andrew Morton:
"173 patches.

Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton : (173 commits)
mm/madvise: add MADV_WILLNEED to process_madvise()
mm/vmstat: remove unneeded return value
mm/vmstat: simplify the array size calculation
mm/vmstat: correct some wrong comments
mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
selftests: vm: add COW time test for KSM pages
selftests: vm: add KSM merging time test
mm: KSM: fix data type
selftests: vm: add KSM merging across nodes test
selftests: vm: add KSM zero page merging test
selftests: vm: add KSM unmerge test
selftests: vm: add KSM merge test
mm/migrate: correct kernel-doc notation
mm: wire up syscall process_mrelease
mm: introduce process_mrelease system call
memblock: make memblock_find_in_range method private
mm/mempolicy.c: use in_task() in mempolicy_slab_node()
mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
mm/mempolicy: advertise new MPOL_PREFERRED_MANY
mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
...

Linus Torvalds
2021-09-04 01:08:28 +0800
0f12156df memcg: enable accounting for file lock caches ... Browse Code »

User can create file locks for each open file and force kernel to allocate
small but long-living objects per each open file.

It makes sense to account for these objects to limit the host's memory
consumption from inside the memcg-limited container.

Link: https://lkml.kernel.org/r/b009f4c7-f0ab-c0ec-8e83-918f47d677da@virtuozzo.com
Signed-off-by: Vasily Averin
Reviewed-by: Shakeel Butt
Cc: Alexander Viro
Cc: Alexey Dobriyan
Cc: Andrei Vagin
Cc: Borislav Petkov
Cc: Borislav Petkov
Cc: Christian Brauner
Cc: Dmitry Safonov
Cc: "Eric W. Biederman"
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: "J. Bruce Fields"
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Jiri Slaby
Cc: Johannes Weiner
Cc: Kirill Tkhai
Cc: Michal Hocko
Cc: Oleg Nesterov
Cc: Roman Gushchin
Cc: Serge Hallyn
Cc: Tejun Heo
Cc: Thomas Gleixner
Cc: Vladimir Davydov
Cc: Yutian Yang
Cc: Zefan Li
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasily Averin
2021-09-04 00:58:12 +0800

23 Aug, 2021

1 commit

f7e33bdbd fs: remove mandatory file locking support ... Browse Code »

We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
off in Fedora and RHEL8. Several other distros have followed suit.

I've heard of one problem in all that time: Someone migrated from an
older distro that supported "-o mand" to one that didn't, and the host
had a fstab entry with "mand" in it which broke on reboot. They didn't
actually _use_ mandatory locking so they just removed the mount option
and moved on.

This patch rips out mandatory locking support wholesale from the kernel,
along with the Kconfig option and the Documentation file. It also
changes the mount code to ignore the "mand" mount option instead of
erroring out, and to throw a big, ugly warning.

Signed-off-by: Jeff Layton

Jeff Layton
2021-08-23 18:15:36 +0800

06 May, 2021

1 commit

a79cdfba6 Merge tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux ... Browse Code »

Pull more nfsd updates from Chuck Lever:
"Additional fixes and clean-ups for NFSD since tags/nfsd-5.13,
including a fix to grant read delegations for files open for writing"

* tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Fix null pointer dereference in svc_rqst_free()
SUNRPC: fix ternary sign expansion bug in tracing
nfsd: Fix fall-through warnings for Clang
nfsd: grant read delegations to clients holding writes
nfsd: reshuffle some code
nfsd: track filehandle aliasing in nfs4_files
nfsd: hash nfs4_files by inode number
nfsd: ensure new clients break delegations
nfsd: removed unused argument in nfsd_startup_generic()
nfsd: remove unused function
svcrdma: Pass a useful error code to the send_err tracepoint
svcrdma: Rename goto labels in svc_rdma_sendto()
svcrdma: Don't leak send_ctxt on Send errors

Linus Torvalds
2021-05-06 04:44:19 +0800

27 Apr, 2021

1 commit

befbfe07e Merge tag 'locks-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux ... Browse Code »

Pull file locking updates from Jeff Layton:
"When we reworked the blocked locks into a tree structure instead of a
flat list a few releases ago, we lost the ability to see all of the
file locks in /proc/locks. Luo's patch fixes it to dump out all of the
blocked locks instead, which restores the full output.

This changes the format of /proc/locks as the blocked locks are shown
at multiple levels of indentation now, but lslocks (the only common
program I've ID'ed that scrapes this info) seems to be OK with that.

Tian also contributed a small patch to remove a useless assignment"

* tag 'locks-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs/locks: remove useless assignment in fcntl_getlk
fs/locks: print full locks information

Linus Torvalds
2021-04-27 04:24:39 +0800

20 Apr, 2021

1 commit

aba2072f4 nfsd: grant read delegations to clients holding writes ... Browse Code »

It's OK to grant a read delegation to a client that holds a write,
as long as it's the only client holding the write.

We originally tried to do this in commit 94415b06eb8a ("nfsd4: a
client's own opens needn't prevent delegations"), which had to be
reverted in commit 6ee65a773096 ("Revert "nfsd4: a client's own
opens needn't prevent delegations"").

Signed-off-by: J. Bruce Fields
Signed-off-by: Chuck Lever

J. Bruce Fields
2021-04-20 04:41:36 +0800

13 Apr, 2021

1 commit

cbe6fc4e0 fs/locks: remove useless assignment in fcntl_getlk ... Browse Code »

Function parameter 'cmd' is rewritten with unused value at locks.c

Signed-off-by: Tian Tao
Signed-off-by: Jeff Layton

Tian Tao
2021-04-13 19:26:38 +0800

11 Mar, 2021

1 commit

b8da9b10e fs/locks: print full locks information ... Browse Code »

Commit fd7732e033e3 ("fs/locks: create a tree of dependent requests.")
has put blocked locks into a tree.

So, with a for loop, we can't check all locks information.

To solve this problem, we should traverse the tree.

Signed-off-by: Luo Longjun
Signed-off-by: Jeff Layton

Luo Longjun
2021-03-11 20:48:11 +0800

09 Mar, 2021

1 commit

6ee65a773 Revert "nfsd4: a client's own opens needn't prevent delegations" ... Browse Code »

This reverts commit 94415b06eb8aed13481646026dc995f04a3a534a.

That commit claimed to allow a client to get a read delegation when it
was the only writer. Actually it allowed a client to get a read
delegation when *any* client has a write open!

The main problem is that it's depending on nfs4_clnt_odstate structures
that are actually only maintained for pnfs exports.

This causes clients to miss writes performed by other clients, even when
there have been intervening closes and opens, violating close-to-open
cache consistency.

We can do this a different way, but first we should just revert this.

I've added pynfs 4.1 test DELEG19 to test for this, as I should have
done originally!

Cc: stable@vger.kernel.org
Reported-by: Timo Rothenpieler
Signed-off-by: J. Bruce Fields
Signed-off-by: Chuck Lever

J. Bruce Fields
2021-03-09 23:37:34 +0800

16 Dec, 2020

1 commit

faf145d6f Merge branch 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/e… ... Browse Code »

…biederm/user-namespace

Pull execve updates from Eric Biederman:
"This set of changes ultimately fixes the interaction of posix file
lock and exec. Fundamentally most of the change is just moving where
unshare_files is called during exec, and tweaking the users of
files_struct so that the count of files_struct is not unnecessarily
played with.

Along the way fcheck and related helpers were renamed to more
accurately reflect what they do.

There were also many other small changes that fell out, as this is the
first time in a long time much of this code has been touched.

Benchmarks haven't turned up any practical issues but Al Viro has
observed a possibility for a lot of pounding on task_lock. So I have
some changes in progress to convert put_files_struct to always rcu
free files_struct. That wasn't ready for the merge window so that will
have to wait until next time"

* 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
exec: Move io_uring_task_cancel after the point of no return
coredump: Document coredump code exclusively used by cell spufs
file: Remove get_files_struct
file: Rename __close_fd_get_file close_fd_get_file
file: Replace ksys_close with close_fd
file: Rename __close_fd to close_fd and remove the files parameter
file: Merge __alloc_fd into alloc_fd
file: In f_dupfd read RLIMIT_NOFILE once.
file: Merge __fd_install into fd_install
proc/fd: In fdinfo seq_show don't use get_files_struct
bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu
proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
file: Implement task_lookup_next_fd_rcu
kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
proc/fd: In tid_fd_mode use task_lookup_fd_rcu
file: Implement task_lookup_fd_rcu
file: Rename fcheck lookup_fd_rcu
file: Replace fcheck_files with files_lookup_fd_rcu
file: Factor files_lookup_fd_locked out of fcheck_files
file: Rename __fcheck_files to files_lookup_fd_raw
...

Linus Torvalds
2020-12-16 11:29:43 +0800

11 Dec, 2020

1 commit

120ce2b0c file: Factor files_lookup_fd_locked out of fcheck_files ... Browse Code »

To make it easy to tell where files->file_lock protection is being
used when looking up a file create files_lookup_fd_locked. Only allow
this function to be called with the file_lock held.

Update the callers of fcheck and fcheck_files that are called with the
files->file_lock held to call files_lookup_fd_locked instead.

Hopefully this makes it easier to quickly understand what is going on.

The need for better names became apparent in the last round of
discussion of this set of changes[1].

[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-8-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2020-12-11 02:39:59 +0800

26 Oct, 2020

2 commits

529adfe8f locks: fix a typo at a kernel-doc markup ... Browse Code »

locks_delete_lock -> locks_delete_block

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Jeff Layton

Mauro Carvalho Chehab
2020-10-26 20:00:39 +0800
16238415e locks: Fix UBSAN undefined behaviour in flock64_to_posix_lock ... Browse Code »

When the sum of fl->fl_start and l->l_len overflows,
UBSAN shows the following warning:

UBSAN: Undefined behaviour in fs/locks.c:482:29
signed integer overflow: 2 + 9223372036854775806
cannot be represented in type 'long long int'
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xe4/0x14e lib/dump_stack.c:118
ubsan_epilogue+0xe/0x81 lib/ubsan.c:161
handle_overflow+0x193/0x1e2 lib/ubsan.c:192
flock64_to_posix_lock fs/locks.c:482 [inline]
flock_to_posix_lock+0x595/0x690 fs/locks.c:515
fcntl_setlk+0xf3/0xa90 fs/locks.c:2262
do_fcntl+0x456/0xf60 fs/fcntl.c:387
__do_sys_fcntl fs/fcntl.c:483 [inline]
__se_sys_fcntl fs/fcntl.c:468 [inline]
__x64_sys_fcntl+0x12d/0x180 fs/fcntl.c:468
do_syscall_64+0xc8/0x5a0 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix it by parenthesizing 'l->l_len - 1'.

Signed-off-by: Luo Meng
Signed-off-by: Jeff Layton

Luo Meng
2020-10-26 19:59:29 +0800

24 Aug, 2020

1 commit

df561f668 treewide: Use fallthrough pseudo-keyword ... Browse Code »

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva

Gustavo A. R. Silva
2020-08-24 06:36:59 +0800

10 Aug, 2020

1 commit

7a6b60441 Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6 ... Browse Code »

Pull NFS server updates from Chuck Lever:
"Highlights:
- Support for user extended attributes on NFS (RFC 8276)
- Further reduce unnecessary NFSv4 delegation recalls

Notable fixes:
- Fix recent krb5p regression
- Address a few resource leaks and a rare NULL dereference

Other:
- De-duplicate RPC/RDMA error handling and other utility functions
- Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
svcrdma: CM event handler clean up
svcrdma: Remove transport reference counting
svcrdma: Fix another Receive buffer leak
SUNRPC: Refresh the show_rqstp_flags() macro
nfsd: netns.h: delete a duplicated word
SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
nfsd: avoid a NULL dereference in __cld_pipe_upcall()
nfsd4: a client's own opens needn't prevent delegations
nfsd: Use seq_putc() in two functions
svcrdma: Display chunk completion ID when posting a rw_ctxt
svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
svcrdma: Introduce Send completion IDs
svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
svcrdma: Introduce Receive completion IDs
svcrdma: Introduce infrastructure to support completion IDs
svcrdma: Add common XDR encoders for RDMA and Read segments
svcrdma: Add common XDR decoders for RDMA and Read segments
SUNRPC: Add helpers for decoding list discriminators symbolically
svcrdma: Remove declarations for functions long removed
svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
...

Linus Torvalds
2020-08-10 04:58:04 +0800

04 Aug, 2020

1 commit

3208167a8 Merge tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux ... Browse Code »

Pull file locking fix from Jeff Layton:
"Just a single, one-line patch to fix an inefficiency in the posix
locking code that can lead to it doing more wakeups than necessary"

* tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: add locks_move_blocks in posix_lock_inode

Linus Torvalds
2020-08-04 01:46:41 +0800

14 Jul, 2020

1 commit

94415b06e nfsd4: a client's own opens needn't prevent delegations ... Browse Code »

We recently fixed lease breaking so that a client's actions won't break
its own delegations.

But we still have an unnecessary self-conflict when granting
delegations: a client's own write opens will prevent us from handing out
a read delegation even when no other client has the file open for write.

Fix that by turning off the checks for conflicting opens under
vfs_setlease, and instead performing those checks in the nfsd code.

We don't depend much on locks here: instead we acquire the delegation,
then check for conflicts, and drop the delegation again if we find any.

The check beforehand is an optimization of sorts, just to avoid
acquiring the delegation unnecessarily. There's a race where the first
check could cause us to deny the delegation when we could have granted
it. But, that's OK, delegation grants are optional (and probably not
even a good idea in that case).

Signed-off-by: J. Bruce Fields
Signed-off-by: Chuck Lever

J. Bruce Fields
2020-07-14 05:28:46 +0800

12 Jun, 2020

1 commit

c742b6347 Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Keep nfsd clients from unnecessarily breaking their own
delegations.

Note this requires a small kthreadd addition. The result is Tejun
Heo's suggestion (see link), and he was OK with this going through
my tree.

- Patch nfsd/clients/ to display filenames, and to fix byte-order
when displaying stateid's.

- fix a module loading/unloading bug, from Neil Brown.

- A big series from Chuck Lever with RPC/RDMA and tracing
improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
sunrpc: use kmemdup_nul() in gssp_stringify()
nfsd: safer handling of corrupted c_type
nfsd4: make drc_slab global, not per-net
SUNRPC: Remove unreachable error condition in rpcb_getport_async()
nfsd: Fix svc_xprt refcnt leak when setup callback client failed
sunrpc: clean up properly in gss_mech_unregister()
sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
sunrpc: check that domain table is empty at module unload.
NFSD: Fix improperly-formatted Doxygen comments
NFSD: Squash an annoying compiler warning
SUNRPC: Clean up request deferral tracepoints
NFSD: Add tracepoints for monitoring NFSD callbacks
NFSD: Add tracepoints to the NFSD state management code
NFSD: Add tracepoints to NFSD's duplicate reply cache
SUNRPC: svc_show_status() macro should have enum definitions
SUNRPC: Restructure svc_udp_recvfrom()
SUNRPC: Refactor svc_recvfrom()
SUNRPC: Clean up svc_release_skb() functions
SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
SUNRPC: Replace dprintk() call sites in TCP receive path
...

Linus Torvalds
2020-06-12 01:33:13 +0800

05 Jun, 2020

1 commit

9ff725857 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull proc updates from Eric Biederman:
"This has four sets of changes:

- modernize proc to support multiple private instances

- ensure we see the exit of each process tid exactly

- remove has_group_leader_pid

- use pids not tasks in posix-cpu-timers lookup

Alexey updated proc so each mount of proc uses a new superblock. This
allows people to actually use mount options with proc with no fear of
messing up another mount of proc. Given the kernel's internal mounts
of proc for things like uml this was a real problem, and resulted in
Android's hidepid mount options being ignored and introducing security
issues.

The rest of the changes are small cleanups and fixes that came out of
my work to allow this change to proc. In essence it is swapping the
pids in de_thread during exec which removes a special case the code
had to handle. Then updating the code to stop handling that special
case"

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: proc_pid_ns takes super_block as an argument
remove the no longer needed pid_alive() check in __task_pid_nr_ns()
posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
posix-cpu-timers: Extend rcu_read_lock removing task_struct references
signal: Remove has_group_leader_pid
exec: Remove BUG_ON(has_group_leader_pid)
posix-cpu-timer: Unify the now redundant code in lookup_task
posix-cpu-timer: Tidy up group_leader logic in lookup_task
proc: Ensure we see the exit of each process tid exactly once
rculist: Add hlists_swap_heads_rcu
proc: Use PIDTYPE_TGID in next_tgid
Use proc_pid_ns() to get pid_namespace from the proc superblock
proc: use named enums for better readability
proc: use human-readable values for hidepid
docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
proc: add option to mount only a pids subset
proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
proc: allow to mount many instances of proc in one pid namespace
proc: rename struct proc_fs_info to proc_fs_opts

Linus Torvalds
2020-06-05 04:54:34 +0800

03 Jun, 2020

1 commit

5ef159681 locks: add locks_move_blocks in posix_lock_inode ... Browse Code »

We forget to call locks_move_blocks in posix_lock_inode when try to
process same owner and different types.

Signed-off-by: yangerkun
Signed-off-by: Jeff Layton

yangerkun
2020-06-03 00:08:25 +0800

19 May, 2020

1 commit

9d78edeae proc: proc_pid_ns takes super_block as an argument ... Browse Code »

syzbot found that

touch /proc/testfile

causes NULL pointer dereference at tomoyo_get_local_path()
because inode of the dentry is NULL.

Before c59f415a7cb6, Tomoyo received pid_ns from proc's s_fs_info
directly. Since proc_pid_ns() can only work with inode, using it in
the tomoyo_get_local_path() was wrong.

To avoid creating more functions for getting proc_ns, change the
argument type of the proc_pid_ns() function. Then, Tomoyo can use
the existing super_block to get pid_ns.

Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
Fixes: c59f415a7cb6 ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
Signed-off-by: Alexey Gladkov
Signed-off-by: Eric W. Biederman

Alexey Gladkov
2020-05-19 20:07:50 +0800

09 May, 2020

1 commit

28df3d153 nfsd: clients don't need to break their own delegations ... Browse Code »

We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink). But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from. (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client. But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.

To do that we need to track "who" is performing a given (possibly
lease-breaking) file operation. We're doing that by storing the
information in the svc_rqst and using kthread_data() to map the current
task back to a svc_rqst.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2020-05-09 09:23:10 +0800

05 May, 2020

1 commit

a02dcdf65 docs: filesystems: convert mandatory-locking.txt to ReST ... Browse Code »

- Add a SPDX header;
- Adjust document title;
- Some whitespace fixes and new line breaks;
- Use notes markups;
- Add it to filesystems/index.rst.

Signed-off-by: Mauro Carvalho Chehab
Link: https://lore.kernel.org/r/aecd6259fe9f99b2c2b3440eab6a2b989125e00d.1588021877.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet

Mauro Carvalho Chehab
2020-05-05 23:22:22 +0800

25 Apr, 2020

1 commit

c59f415a7 Use proc_pid_ns() to get pid_namespace from the proc superblock ... Browse Code »

To get pid_namespace from the procfs superblock should be used a special
helper. This will avoid errors when s_fs_info will change the type.

Link: https://lore.kernel.org/lkml/20200423200316.164518-3-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/20200423112858.95820-1-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/06B50A1C-406F-4057-BFA8-3A7729EA7469@lca.pw/
Signed-off-by: Alexey Gladkov
Signed-off-by: Eric W. Biederman

Alexey Gladkov
2020-04-25 05:38:30 +0800

19 Mar, 2020

1 commit

dcf23ac3e locks: reinstate locks_delete_block optimization ... Browse Code »

There is measurable performance impact in some synthetic tests due to
commit 6d390e4b5d48 (locks: fix a potential use-after-free problem when
wakeup a waiter). Fix the race condition instead by clearing the
fl_blocker pointer after the wake_up, using explicit acquire/release
semantics.

This does mean that we can no longer use the clearing of fl_blocker as
the wait condition, so switch the waiters over to checking whether the
fl_blocked_member list_head is empty.

Reviewed-by: yangerkun
Reviewed-by: NeilBrown
Fixes: 6d390e4b5d48 (locks: fix a potential use-after-free problem when wakeup a waiter)
Signed-off-by: Jeff Layton
Signed-off-by: Linus Torvalds

Linus Torvalds
2020-03-19 04:03:38 +0800

07 Mar, 2020

1 commit

6d390e4b5 locks: fix a potential use-after-free problem when wakeup a waiter ... Browse Code »

'16306a61d3b7 ("fs/locks: always delete_block after waiting.")' add the
logic to check waiter->fl_blocker without blocked_lock_lock. And it will
trigger a UAF when we try to wakeup some waiter：

Thread 1 has create a write flock a on file, and now thread 2 try to
unlock and delete flock a, thread 3 try to add flock b on the same file.

Thread2 Thread3
flock syscall(create flock b)
...flock_lock_inode_wait
flock_lock_inode(will insert
our fl_blocked_member list
to flock a's fl_blocked_requests)
sleep
flock syscall(unlock)
...flock_lock_inode_wait
locks_delete_lock_ctx
...__locks_wake_up_blocks
__locks_delete_blocks(
b->fl_blocker = NULL)
...
break by a signal
locks_delete_block
b->fl_blocker == NULL &&
list_empty(&b->fl_blocked_requests)
success, return directly
locks_free_lock b
wake_up(&b->fl_waiter)
trigger UAF

Fix it by remove this logic, and this patch may also fix CVE-2019-19769.

Cc: stable@vger.kernel.org
Fixes: 16306a61d3b7 ("fs/locks: always delete_block after waiting.")
Signed-off-by: yangerkun
Signed-off-by: Jeff Layton

yangerkun
2020-03-07 00:54:13 +0800

29 Dec, 2019

1 commit

98ca480a8 locks: print unsigned ino in /proc/locks ... Browse Code »

An ino is unsigned, so display it as such in /proc/locks.

Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein
Signed-off-by: Jeff Layton

Amir Goldstein
2019-12-29 22:00:58 +0800

28 Sep, 2019

1 commit

298fb76a5 Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Add a new knfsd file cache, so that we don't have to open and close
on each (NFSv2/v3) READ or WRITE. This can speed up read and write
in some cases. It also replaces our readahead cache.

- Prevent silent data loss on write errors, by treating write errors
like server reboots for the purposes of write caching, thus forcing
clients to resend their writes.

- Tweak the code that allocates sessions to be more forgiving, so
that NFSv4.1 mounts are less likely to hang when a server already
has a lot of clients.

- Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be
limited only by the backend filesystem and the maximum RPC size.

- Allow the server to enforce use of the correct kerberos credentials
when a client reclaims state after a reboot.

And some miscellaneous smaller bugfixes and cleanup"

* tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits)
sunrpc: clean up indentation issue
nfsd: fix nfs read eof detection
nfsd: Make nfsd_reset_boot_verifier_locked static
nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
nfsd: handle drc over-allocation gracefully.
nfsd: add support for upcall version 2
nfsd: add a "GetVersion" upcall for nfsdcld
nfsd: Reset the boot verifier on all write I/O errors
nfsd: Don't garbage collect files that might contain write errors
nfsd: Support the server resetting the boot verifier
nfsd: nfsd_file cache entries should be per net namespace
nfsd: eliminate an unnecessary acl size limit
Deprecate nfsd fault injection
nfsd: remove duplicated include from filecache.c
nfsd: Fix the documentation for svcxdr_tmpalloc()
nfsd: Fix up some unused variable warnings
nfsd: close cached files prior to a REMOVE or RENAME that would replace target
nfsd: rip out the raparms cache
nfsd: have nfsd_test_lock use the nfsd_file cache
nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
...

Linus Torvalds
2019-09-28 08:00:27 +0800

20 Aug, 2019

1 commit

cfddf9f4c locks: fix a memory leak bug in __break_lease() ... Browse Code »

In __break_lease(), the file lock 'new_fl' is allocated in lease_alloc().
However, it is not deallocated in the following execution if
smp_load_acquire() fails, leading to a memory leak bug. To fix this issue,
free 'new_fl' before returning the error.

Signed-off-by: Wenwen Wang
Signed-off-by: Jeff Layton

Wenwen Wang
2019-08-20 17:48:52 +0800

19 Aug, 2019

2 commits

eb82dd393 nfsd: convert fi_deleg_file and ls_file fields to nfsd_file ... Browse Code »

Have them keep an nfsd_file reference instead of a struct file.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Jeff Layton
2019-08-19 23:09:09 +0800
18f6622eb locks: create a new notifier chain for lease attempts ... Browse Code »

With the new file caching infrastructure in nfsd, we can end up holding
files open for an indefinite period of time, even when they are still
idle. This may prevent the kernel from handing out leases on the file,
which is something we don't want to block.

Fix this by running a SRCU notifier call chain whenever on any
lease attempt. nfsd can then purge the cache for that inode before
returning.

Since SRCU is only conditionally compiled in, we must only define the
new chain if it's enabled, and users of the chain must ensure that
SRCU is enabled.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Jeff Layton
2019-08-19 23:00:39 +0800

25 Jul, 2019

1 commit

43e4cb942 locks: Fix procfs output for file leases ... Browse Code »

Since commit 778fc546f749c588aa2f ("locks: fix tracking of inprogress
lease breaks"), leases break don't change @fl_type but modifies
@fl_flags. However, procfs's part haven't been updated.

Previously, for a breaking lease the target type was printed (see
target_leasetype()), as returns fcntl(F_GETLEASE). But now it's always
"READ", as F_UNLCK no longer means "breaking". Unlike the previous
one, this behaviour don't provide a complete description of the lease.

There are /proc/pid/fdinfo/ outputs for a lease (the same for READ and
WRITE) breaked by O_WRONLY.
-- before:
lock: 1: LEASE BREAKING READ 2558 08:03:815793 0 EOF
-- after:
lock: 1: LEASE BREAKING UNLCK 2558 08:03:815793 0 EOF

Signed-off-by: Pavel Begunkov
Signed-off-by: Jeff Layton

Pavel Begunkov
2019-07-25 19:49:44 +0800

11 Jul, 2019

1 commit

d2b6b4c83 Merge tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Add a new /proc/fs/nfsd/clients/ directory which exposes some
long-requested information about NFSv4 clients (like open files)
and allows forced revocation of client state.

- Replace the global duplicate reply cache by a cache per network
namespace; previously, a request in one network namespace could
incorrectly match an entry from another, though we haven't seen
this in production. This is the last remaining container bug that
I'm aware of; at this point you should be able to run separate
nfsd's in each network namespace, each with their own set of
exports, and everything should work.

- Cleanup and modify lock code to show the pid of lockd as the owner
of NLM locks. This is the correct version of the bugfix originally
attempted in b8eee0e90f97 ("lockd: Show pid of lockd for remote
locks")"

* tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits)
nfsd: Make __get_nfsdfs_client() static
nfsd: Make two functions static
nfsd: Fix misuse of strlcpy
sunrpc/cache: remove the exporting of cache_seq_next
nfsd: decode implementation id
nfsd: create xdr_netobj_dup helper
nfsd: allow forced expiration of NFSv4 clients
nfsd: create get_nfsdfs_clp helper
nfsd4: show layout stateids
nfsd: show lock and deleg stateids
nfsd4: add file to display list of client's opens
nfsd: add more information to client info file
nfsd: escape high characters in binary data
nfsd: copy client's address including port number to cl_addr
nfsd4: add a client info file
nfsd: make client/ directory names small ints
nfsd: add nfsd/clients directory
nfsd4: use reference count to free client
nfsd: rename cl_refcount
nfsd: persist nfsd filesystem across mounts
...

Linus Torvalds
2019-07-11 12:22:43 +0800

04 Jul, 2019

1 commit

f85d93385 locks: Cleanup lm_compare_owner and lm_owner_key ... Browse Code »

After the update to use nlm_lockowners for the NLM server, there are no
more users of lm_compare_owner and lm_owner_key.

Signed-off-by: Benjamin Coddington
Signed-off-by: J. Bruce Fields

Benjamin Coddington
2019-07-04 05:52:09 +0800

19 Jun, 2019

2 commits

387e3746d locks: eliminate false positive conflicts for write lease ... Browse Code »

check_conflicting_open() is checking for existing fd's open for read or
for write before allowing to take a write lease. The check that was
implemented using i_count and d_count is an approximation that has
several false positives. For example, overlayfs since v4.19, takes an
extra reference on the dentry; An open with O_PATH takes a reference on
the dentry although the file cannot be read nor written.

Change the implementation to use i_readcount and i_writecount to
eliminate the false positive conflicts and allow a write lease to be
taken on an overlayfs file.

The change of behavior with existing fd's open with O_PATH is symmetric
w.r.t. current behavior of lease breakers - an open with O_PATH currently
does not break a write lease.

This increases the size of struct inode by 4 bytes on 32bit archs when
CONFIG_FILE_LOCKING is defined and CONFIG_IMA was not already
defined.

Signed-off-by: Amir Goldstein
Signed-off-by: Jeff Layton

Amir Goldstein
2019-06-19 20:49:38 +0800
d51f527f4 locks: Add trace_leases_conflict ... Browse Code »

Signed-off-by: Ira Weiny
Signed-off-by: Jeff Layton

Ira Weiny
2019-06-19 20:49:37 +0800

21 May, 2019

1 commit

457c89965 treewide: Add SPDX license identifier for missed files ... Browse Code »

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-21 16:50:45 +0800

16 May, 2019

1 commit

700a800a9 Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"This consists mostly of nfsd container work:

Scott Mayhew revived an old api that communicates with a userspace
daemon to manage some on-disk state that's used to track clients
across server reboots. We've been using a usermode_helper upcall for
that, but it's tough to run those with the right namespaces, so a
daemon is much friendlier to container use cases.

Trond fixed nfsd's handling of user credentials in user namespaces. He
also contributed patches that allow containers to support different
sets of NFS protocol versions.

The only remaining container bug I'm aware of is that the NFS reply
cache is shared between all containers. If anyone's aware of other
gaps in our container support, let me know.

The rest of this is miscellaneous bugfixes"

* tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: update callback done processing
locks: move checks from locks_free_lock() to locks_release_private()
nfsd: fh_drop_write in nfsd_unlink
nfsd: allow fh_want_write to be called twice
nfsd: knfsd must use the container user namespace
SUNRPC: rsi_parse() should use the current user namespace
SUNRPC: Fix the server AUTH_UNIX userspace mappings
lockd: Pass the user cred from knfsd when starting the lockd server
SUNRPC: Temporary sockets should inherit the cred from their parent
SUNRPC: Cache the process user cred in the RPC server listener
nfsd: Allow containers to set supported nfs versions
nfsd: Add custom rpcbind callbacks for knfsd
SUNRPC: Allow further customisation of RPC program registration
SUNRPC: Clean up generic dispatcher code
SUNRPC: Add a callback to initialise server requests
SUNRPC/nfs: Fix return value for nfs4_callback_compound()
nfsd: handle legacy client tracking records sent by nfsdcld
nfsd: re-order client tracking method selection
nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
nfsd: un-deprecate nfsdcld
...

Linus Torvalds
2019-05-16 09:21:43 +0800

08 May, 2019

1 commit

b4b52b881 Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/… ... Browse Code »

…kernel/git/gustavoars/linux

Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.

This is part of the ongoing efforts to enable -Wimplicit-fallthrough.

Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.

We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:

https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/

While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.

Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"

* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...

Linus Torvalds
2019-05-08 03:48:10 +0800