Eric Lee / smarc-fsl-linux-kernel

24 Aug, 2020

1 commit

df561f668 treewide: Use fallthrough pseudo-keyword ... Browse Code »

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva

Gustavo A. R. Silva
2020-08-24 06:36:59 +0800

10 Aug, 2020

1 commit

7a6b60441 Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6 ... Browse Code »

Pull NFS server updates from Chuck Lever:
"Highlights:
- Support for user extended attributes on NFS (RFC 8276)
- Further reduce unnecessary NFSv4 delegation recalls

Notable fixes:
- Fix recent krb5p regression
- Address a few resource leaks and a rare NULL dereference

Other:
- De-duplicate RPC/RDMA error handling and other utility functions
- Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
svcrdma: CM event handler clean up
svcrdma: Remove transport reference counting
svcrdma: Fix another Receive buffer leak
SUNRPC: Refresh the show_rqstp_flags() macro
nfsd: netns.h: delete a duplicated word
SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
nfsd: avoid a NULL dereference in __cld_pipe_upcall()
nfsd4: a client's own opens needn't prevent delegations
nfsd: Use seq_putc() in two functions
svcrdma: Display chunk completion ID when posting a rw_ctxt
svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
svcrdma: Introduce Send completion IDs
svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
svcrdma: Introduce Receive completion IDs
svcrdma: Introduce infrastructure to support completion IDs
svcrdma: Add common XDR encoders for RDMA and Read segments
svcrdma: Add common XDR decoders for RDMA and Read segments
SUNRPC: Add helpers for decoding list discriminators symbolically
svcrdma: Remove declarations for functions long removed
svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
...

Linus Torvalds
2020-08-10 04:58:04 +0800

04 Aug, 2020

1 commit

3208167a8 Merge tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux ... Browse Code »

Pull file locking fix from Jeff Layton:
"Just a single, one-line patch to fix an inefficiency in the posix
locking code that can lead to it doing more wakeups than necessary"

* tag 'filelock-v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: add locks_move_blocks in posix_lock_inode

Linus Torvalds
2020-08-04 01:46:41 +0800

14 Jul, 2020

1 commit

94415b06e nfsd4: a client's own opens needn't prevent delegations ... Browse Code »

We recently fixed lease breaking so that a client's actions won't break
its own delegations.

But we still have an unnecessary self-conflict when granting
delegations: a client's own write opens will prevent us from handing out
a read delegation even when no other client has the file open for write.

Fix that by turning off the checks for conflicting opens under
vfs_setlease, and instead performing those checks in the nfsd code.

We don't depend much on locks here: instead we acquire the delegation,
then check for conflicts, and drop the delegation again if we find any.

The check beforehand is an optimization of sorts, just to avoid
acquiring the delegation unnecessarily. There's a race where the first
check could cause us to deny the delegation when we could have granted
it. But, that's OK, delegation grants are optional (and probably not
even a good idea in that case).

Signed-off-by: J. Bruce Fields
Signed-off-by: Chuck Lever

J. Bruce Fields
2020-07-14 05:28:46 +0800

12 Jun, 2020

1 commit

c742b6347 Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Keep nfsd clients from unnecessarily breaking their own
delegations.

Note this requires a small kthreadd addition. The result is Tejun
Heo's suggestion (see link), and he was OK with this going through
my tree.

- Patch nfsd/clients/ to display filenames, and to fix byte-order
when displaying stateid's.

- fix a module loading/unloading bug, from Neil Brown.

- A big series from Chuck Lever with RPC/RDMA and tracing
improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
sunrpc: use kmemdup_nul() in gssp_stringify()
nfsd: safer handling of corrupted c_type
nfsd4: make drc_slab global, not per-net
SUNRPC: Remove unreachable error condition in rpcb_getport_async()
nfsd: Fix svc_xprt refcnt leak when setup callback client failed
sunrpc: clean up properly in gss_mech_unregister()
sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
sunrpc: check that domain table is empty at module unload.
NFSD: Fix improperly-formatted Doxygen comments
NFSD: Squash an annoying compiler warning
SUNRPC: Clean up request deferral tracepoints
NFSD: Add tracepoints for monitoring NFSD callbacks
NFSD: Add tracepoints to the NFSD state management code
NFSD: Add tracepoints to NFSD's duplicate reply cache
SUNRPC: svc_show_status() macro should have enum definitions
SUNRPC: Restructure svc_udp_recvfrom()
SUNRPC: Refactor svc_recvfrom()
SUNRPC: Clean up svc_release_skb() functions
SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
SUNRPC: Replace dprintk() call sites in TCP receive path
...

Linus Torvalds
2020-06-12 01:33:13 +0800

05 Jun, 2020

1 commit

9ff725857 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull proc updates from Eric Biederman:
"This has four sets of changes:

- modernize proc to support multiple private instances

- ensure we see the exit of each process tid exactly

- remove has_group_leader_pid

- use pids not tasks in posix-cpu-timers lookup

Alexey updated proc so each mount of proc uses a new superblock. This
allows people to actually use mount options with proc with no fear of
messing up another mount of proc. Given the kernel's internal mounts
of proc for things like uml this was a real problem, and resulted in
Android's hidepid mount options being ignored and introducing security
issues.

The rest of the changes are small cleanups and fixes that came out of
my work to allow this change to proc. In essence it is swapping the
pids in de_thread during exec which removes a special case the code
had to handle. Then updating the code to stop handling that special
case"

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: proc_pid_ns takes super_block as an argument
remove the no longer needed pid_alive() check in __task_pid_nr_ns()
posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
posix-cpu-timers: Extend rcu_read_lock removing task_struct references
signal: Remove has_group_leader_pid
exec: Remove BUG_ON(has_group_leader_pid)
posix-cpu-timer: Unify the now redundant code in lookup_task
posix-cpu-timer: Tidy up group_leader logic in lookup_task
proc: Ensure we see the exit of each process tid exactly once
rculist: Add hlists_swap_heads_rcu
proc: Use PIDTYPE_TGID in next_tgid
Use proc_pid_ns() to get pid_namespace from the proc superblock
proc: use named enums for better readability
proc: use human-readable values for hidepid
docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
proc: add option to mount only a pids subset
proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
proc: allow to mount many instances of proc in one pid namespace
proc: rename struct proc_fs_info to proc_fs_opts

Linus Torvalds
2020-06-05 04:54:34 +0800

03 Jun, 2020

1 commit

5ef159681 locks: add locks_move_blocks in posix_lock_inode ... Browse Code »

We forget to call locks_move_blocks in posix_lock_inode when try to
process same owner and different types.

Signed-off-by: yangerkun
Signed-off-by: Jeff Layton

yangerkun
2020-06-03 00:08:25 +0800

19 May, 2020

1 commit

9d78edeae proc: proc_pid_ns takes super_block as an argument ... Browse Code »

syzbot found that

touch /proc/testfile

causes NULL pointer dereference at tomoyo_get_local_path()
because inode of the dentry is NULL.

Before c59f415a7cb6, Tomoyo received pid_ns from proc's s_fs_info
directly. Since proc_pid_ns() can only work with inode, using it in
the tomoyo_get_local_path() was wrong.

To avoid creating more functions for getting proc_ns, change the
argument type of the proc_pid_ns() function. Then, Tomoyo can use
the existing super_block to get pid_ns.

Link: https://lkml.kernel.org/r/0000000000002f0c7505a5b0e04c@google.com
Link: https://lkml.kernel.org/r/20200518180738.2939611-1-gladkov.alexey@gmail.com
Reported-by: syzbot+c1af344512918c61362c@syzkaller.appspotmail.com
Fixes: c59f415a7cb6 ("Use proc_pid_ns() to get pid_namespace from the proc superblock")
Signed-off-by: Alexey Gladkov
Signed-off-by: Eric W. Biederman

Alexey Gladkov
2020-05-19 20:07:50 +0800

09 May, 2020

1 commit

28df3d153 nfsd: clients don't need to break their own delegations ... Browse Code »

We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink). But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from. (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client. But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.

To do that we need to track "who" is performing a given (possibly
lease-breaking) file operation. We're doing that by storing the
information in the svc_rqst and using kthread_data() to map the current
task back to a svc_rqst.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2020-05-09 09:23:10 +0800

05 May, 2020

1 commit

a02dcdf65 docs: filesystems: convert mandatory-locking.txt to ReST ... Browse Code »

- Add a SPDX header;
- Adjust document title;
- Some whitespace fixes and new line breaks;
- Use notes markups;
- Add it to filesystems/index.rst.

Signed-off-by: Mauro Carvalho Chehab
Link: https://lore.kernel.org/r/aecd6259fe9f99b2c2b3440eab6a2b989125e00d.1588021877.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet

Mauro Carvalho Chehab
2020-05-05 23:22:22 +0800

25 Apr, 2020

1 commit

c59f415a7 Use proc_pid_ns() to get pid_namespace from the proc superblock ... Browse Code »

To get pid_namespace from the procfs superblock should be used a special
helper. This will avoid errors when s_fs_info will change the type.

Link: https://lore.kernel.org/lkml/20200423200316.164518-3-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/20200423112858.95820-1-gladkov.alexey@gmail.com/
Link: https://lore.kernel.org/lkml/06B50A1C-406F-4057-BFA8-3A7729EA7469@lca.pw/
Signed-off-by: Alexey Gladkov
Signed-off-by: Eric W. Biederman

Alexey Gladkov
2020-04-25 05:38:30 +0800

19 Mar, 2020

1 commit

dcf23ac3e locks: reinstate locks_delete_block optimization ... Browse Code »

There is measurable performance impact in some synthetic tests due to
commit 6d390e4b5d48 (locks: fix a potential use-after-free problem when
wakeup a waiter). Fix the race condition instead by clearing the
fl_blocker pointer after the wake_up, using explicit acquire/release
semantics.

This does mean that we can no longer use the clearing of fl_blocker as
the wait condition, so switch the waiters over to checking whether the
fl_blocked_member list_head is empty.

Reviewed-by: yangerkun
Reviewed-by: NeilBrown
Fixes: 6d390e4b5d48 (locks: fix a potential use-after-free problem when wakeup a waiter)
Signed-off-by: Jeff Layton
Signed-off-by: Linus Torvalds

Linus Torvalds
2020-03-19 04:03:38 +0800

07 Mar, 2020

1 commit

6d390e4b5 locks: fix a potential use-after-free problem when wakeup a waiter ... Browse Code »

'16306a61d3b7 ("fs/locks: always delete_block after waiting.")' add the
logic to check waiter->fl_blocker without blocked_lock_lock. And it will
trigger a UAF when we try to wakeup some waiter：

Thread 1 has create a write flock a on file, and now thread 2 try to
unlock and delete flock a, thread 3 try to add flock b on the same file.

Thread2 Thread3
flock syscall(create flock b)
...flock_lock_inode_wait
flock_lock_inode(will insert
our fl_blocked_member list
to flock a's fl_blocked_requests)
sleep
flock syscall(unlock)
...flock_lock_inode_wait
locks_delete_lock_ctx
...__locks_wake_up_blocks
__locks_delete_blocks(
b->fl_blocker = NULL)
...
break by a signal
locks_delete_block
b->fl_blocker == NULL &&
list_empty(&b->fl_blocked_requests)
success, return directly
locks_free_lock b
wake_up(&b->fl_waiter)
trigger UAF

Fix it by remove this logic, and this patch may also fix CVE-2019-19769.

Cc: stable@vger.kernel.org
Fixes: 16306a61d3b7 ("fs/locks: always delete_block after waiting.")
Signed-off-by: yangerkun
Signed-off-by: Jeff Layton

yangerkun
2020-03-07 00:54:13 +0800

29 Dec, 2019

1 commit

98ca480a8 locks: print unsigned ino in /proc/locks ... Browse Code »

An ino is unsigned, so display it as such in /proc/locks.

Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein
Signed-off-by: Jeff Layton

Amir Goldstein
2019-12-29 22:00:58 +0800

28 Sep, 2019

1 commit

298fb76a5 Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Add a new knfsd file cache, so that we don't have to open and close
on each (NFSv2/v3) READ or WRITE. This can speed up read and write
in some cases. It also replaces our readahead cache.

- Prevent silent data loss on write errors, by treating write errors
like server reboots for the purposes of write caching, thus forcing
clients to resend their writes.

- Tweak the code that allocates sessions to be more forgiving, so
that NFSv4.1 mounts are less likely to hang when a server already
has a lot of clients.

- Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be
limited only by the backend filesystem and the maximum RPC size.

- Allow the server to enforce use of the correct kerberos credentials
when a client reclaims state after a reboot.

And some miscellaneous smaller bugfixes and cleanup"

* tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits)
sunrpc: clean up indentation issue
nfsd: fix nfs read eof detection
nfsd: Make nfsd_reset_boot_verifier_locked static
nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
nfsd: handle drc over-allocation gracefully.
nfsd: add support for upcall version 2
nfsd: add a "GetVersion" upcall for nfsdcld
nfsd: Reset the boot verifier on all write I/O errors
nfsd: Don't garbage collect files that might contain write errors
nfsd: Support the server resetting the boot verifier
nfsd: nfsd_file cache entries should be per net namespace
nfsd: eliminate an unnecessary acl size limit
Deprecate nfsd fault injection
nfsd: remove duplicated include from filecache.c
nfsd: Fix the documentation for svcxdr_tmpalloc()
nfsd: Fix up some unused variable warnings
nfsd: close cached files prior to a REMOVE or RENAME that would replace target
nfsd: rip out the raparms cache
nfsd: have nfsd_test_lock use the nfsd_file cache
nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
...

Linus Torvalds
2019-09-28 08:00:27 +0800

20 Aug, 2019

1 commit

cfddf9f4c locks: fix a memory leak bug in __break_lease() ... Browse Code »

In __break_lease(), the file lock 'new_fl' is allocated in lease_alloc().
However, it is not deallocated in the following execution if
smp_load_acquire() fails, leading to a memory leak bug. To fix this issue,
free 'new_fl' before returning the error.

Signed-off-by: Wenwen Wang
Signed-off-by: Jeff Layton

Wenwen Wang
2019-08-20 17:48:52 +0800

19 Aug, 2019

2 commits

eb82dd393 nfsd: convert fi_deleg_file and ls_file fields to nfsd_file ... Browse Code »

Have them keep an nfsd_file reference instead of a struct file.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Jeff Layton
2019-08-19 23:09:09 +0800
18f6622eb locks: create a new notifier chain for lease attempts ... Browse Code »

With the new file caching infrastructure in nfsd, we can end up holding
files open for an indefinite period of time, even when they are still
idle. This may prevent the kernel from handing out leases on the file,
which is something we don't want to block.

Fix this by running a SRCU notifier call chain whenever on any
lease attempt. nfsd can then purge the cache for that inode before
returning.

Since SRCU is only conditionally compiled in, we must only define the
new chain if it's enabled, and users of the chain must ensure that
SRCU is enabled.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Jeff Layton
2019-08-19 23:00:39 +0800

25 Jul, 2019

1 commit

43e4cb942 locks: Fix procfs output for file leases ... Browse Code »

Since commit 778fc546f749c588aa2f ("locks: fix tracking of inprogress
lease breaks"), leases break don't change @fl_type but modifies
@fl_flags. However, procfs's part haven't been updated.

Previously, for a breaking lease the target type was printed (see
target_leasetype()), as returns fcntl(F_GETLEASE). But now it's always
"READ", as F_UNLCK no longer means "breaking". Unlike the previous
one, this behaviour don't provide a complete description of the lease.

There are /proc/pid/fdinfo/ outputs for a lease (the same for READ and
WRITE) breaked by O_WRONLY.
-- before:
lock: 1: LEASE BREAKING READ 2558 08:03:815793 0 EOF
-- after:
lock: 1: LEASE BREAKING UNLCK 2558 08:03:815793 0 EOF

Signed-off-by: Pavel Begunkov
Signed-off-by: Jeff Layton

Pavel Begunkov
2019-07-25 19:49:44 +0800

11 Jul, 2019

1 commit

d2b6b4c83 Merge tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Add a new /proc/fs/nfsd/clients/ directory which exposes some
long-requested information about NFSv4 clients (like open files)
and allows forced revocation of client state.

- Replace the global duplicate reply cache by a cache per network
namespace; previously, a request in one network namespace could
incorrectly match an entry from another, though we haven't seen
this in production. This is the last remaining container bug that
I'm aware of; at this point you should be able to run separate
nfsd's in each network namespace, each with their own set of
exports, and everything should work.

- Cleanup and modify lock code to show the pid of lockd as the owner
of NLM locks. This is the correct version of the bugfix originally
attempted in b8eee0e90f97 ("lockd: Show pid of lockd for remote
locks")"

* tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits)
nfsd: Make __get_nfsdfs_client() static
nfsd: Make two functions static
nfsd: Fix misuse of strlcpy
sunrpc/cache: remove the exporting of cache_seq_next
nfsd: decode implementation id
nfsd: create xdr_netobj_dup helper
nfsd: allow forced expiration of NFSv4 clients
nfsd: create get_nfsdfs_clp helper
nfsd4: show layout stateids
nfsd: show lock and deleg stateids
nfsd4: add file to display list of client's opens
nfsd: add more information to client info file
nfsd: escape high characters in binary data
nfsd: copy client's address including port number to cl_addr
nfsd4: add a client info file
nfsd: make client/ directory names small ints
nfsd: add nfsd/clients directory
nfsd4: use reference count to free client
nfsd: rename cl_refcount
nfsd: persist nfsd filesystem across mounts
...

Linus Torvalds
2019-07-11 12:22:43 +0800

04 Jul, 2019

1 commit

f85d93385 locks: Cleanup lm_compare_owner and lm_owner_key ... Browse Code »

After the update to use nlm_lockowners for the NLM server, there are no
more users of lm_compare_owner and lm_owner_key.

Signed-off-by: Benjamin Coddington
Signed-off-by: J. Bruce Fields

Benjamin Coddington
2019-07-04 05:52:09 +0800

19 Jun, 2019

2 commits

387e3746d locks: eliminate false positive conflicts for write lease ... Browse Code »

check_conflicting_open() is checking for existing fd's open for read or
for write before allowing to take a write lease. The check that was
implemented using i_count and d_count is an approximation that has
several false positives. For example, overlayfs since v4.19, takes an
extra reference on the dentry; An open with O_PATH takes a reference on
the dentry although the file cannot be read nor written.

Change the implementation to use i_readcount and i_writecount to
eliminate the false positive conflicts and allow a write lease to be
taken on an overlayfs file.

The change of behavior with existing fd's open with O_PATH is symmetric
w.r.t. current behavior of lease breakers - an open with O_PATH currently
does not break a write lease.

This increases the size of struct inode by 4 bytes on 32bit archs when
CONFIG_FILE_LOCKING is defined and CONFIG_IMA was not already
defined.

Signed-off-by: Amir Goldstein
Signed-off-by: Jeff Layton

Amir Goldstein
2019-06-19 20:49:38 +0800
d51f527f4 locks: Add trace_leases_conflict ... Browse Code »

Signed-off-by: Ira Weiny
Signed-off-by: Jeff Layton

Ira Weiny
2019-06-19 20:49:37 +0800

21 May, 2019

1 commit

457c89965 treewide: Add SPDX license identifier for missed files ... Browse Code »

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-21 16:50:45 +0800

16 May, 2019

1 commit

700a800a9 Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"This consists mostly of nfsd container work:

Scott Mayhew revived an old api that communicates with a userspace
daemon to manage some on-disk state that's used to track clients
across server reboots. We've been using a usermode_helper upcall for
that, but it's tough to run those with the right namespaces, so a
daemon is much friendlier to container use cases.

Trond fixed nfsd's handling of user credentials in user namespaces. He
also contributed patches that allow containers to support different
sets of NFS protocol versions.

The only remaining container bug I'm aware of is that the NFS reply
cache is shared between all containers. If anyone's aware of other
gaps in our container support, let me know.

The rest of this is miscellaneous bugfixes"

* tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: update callback done processing
locks: move checks from locks_free_lock() to locks_release_private()
nfsd: fh_drop_write in nfsd_unlink
nfsd: allow fh_want_write to be called twice
nfsd: knfsd must use the container user namespace
SUNRPC: rsi_parse() should use the current user namespace
SUNRPC: Fix the server AUTH_UNIX userspace mappings
lockd: Pass the user cred from knfsd when starting the lockd server
SUNRPC: Temporary sockets should inherit the cred from their parent
SUNRPC: Cache the process user cred in the RPC server listener
nfsd: Allow containers to set supported nfs versions
nfsd: Add custom rpcbind callbacks for knfsd
SUNRPC: Allow further customisation of RPC program registration
SUNRPC: Clean up generic dispatcher code
SUNRPC: Add a callback to initialise server requests
SUNRPC/nfs: Fix return value for nfs4_callback_compound()
nfsd: handle legacy client tracking records sent by nfsdcld
nfsd: re-order client tracking method selection
nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
nfsd: un-deprecate nfsdcld
...

Linus Torvalds
2019-05-16 09:21:43 +0800

08 May, 2019

1 commit

b4b52b881 Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/… ... Browse Code »

…kernel/git/gustavoars/linux

Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.

This is part of the ongoing efforts to enable -Wimplicit-fallthrough.

Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.

We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:

https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/

While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.

Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"

* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...

Linus Torvalds
2019-05-08 03:48:10 +0800

24 Apr, 2019

1 commit

5926459e7 locks: move checks from locks_free_lock() to locks_release_private() ... Browse Code »

Code that allocates locks using locks_alloc_lock() will free it
using locks_free_lock(), and will benefit from the BUG_ON()
consistency checks therein.

However some code (nfsd and lockd) allocate a lock embedded in
some other data structure, and so free the lock themselves after
calling locks_release_private(). This path does not benefit from
the consistency checks.

To help catch future errors, move the BUG_ON() checks to
locks_release_private() - which locks_free_lock() already calls.
This ensures that all users for locks will find out if the lock
isn't detached properly before being free.

Signed-off-by: NeilBrown
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

NeilBrown
2019-04-24 21:51:48 +0800

09 Apr, 2019

1 commit

0a4c92657 fs: mark expected switch fall-throughs ... Browse Code »

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

This patch fixes the following warnings:

fs/affs/affs.h:124:38: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/configfs/dir.c:1692:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/configfs/dir.c:1694:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ceph/file.c:249:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/hash.c:233:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/hash.c:246:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext2/inode.c:1237:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext2/inode.c:1244:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1182:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1188:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1432:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ext4/indirect.c:1440:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/f2fs/node.c:618:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/f2fs/node.c:620:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/btrfs/ref-verify.c:522:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/gfs2/bmap.c:711:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/gfs2/bmap.c:722:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/jffs2/fs.c:339:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/nfsd/nfs4proc.c:429:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ufs/util.h:62:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/ufs/util.h:43:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/fcntl.c:770:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/seq_file.c:319:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/libfs.c:148:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/libfs.c:150:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/signalfd.c:178:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
fs/locks.c:1473:16: warning: this statement may fall through [-Wimplicit-fallthrough=]

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enabling
-Wimplicit-fallthrough.

Reviewed-by: Kees Cook
Signed-off-by: Gustavo A. R. Silva

Gustavo A. R. Silva
2019-04-09 07:21:02 +0800

25 Mar, 2019

1 commit

945ab8f6d locks: wake any locks blocked on request before deadlock check ... Browse Code »

Andreas reported that he was seeing the tdbtorture test fail in some
cases with -EDEADLCK when it wasn't before. Some debugging showed that
deadlock detection was sometimes discovering the caller's lock request
itself in a dependency chain.

While we remove the request from the blocked_lock_hash prior to
reattempting to acquire it, any locks that are blocked on that request
will still be present in the hash and will still have their fl_blocker
pointer set to the current request.

This causes posix_locks_deadlock to find a deadlock dependency chain
when it shouldn't, as a lock request cannot block itself.

We are going to end up waking all of those blocked locks anyway when we
go to reinsert the request back into the blocked_lock_hash, so just do
it prior to checking for deadlocks. This ensures that any lock blocked
on the current request will no longer be part of any blocked request
chain.

URL: https://bugzilla.kernel.org/show_bug.cgi?id=202975
Fixes: 5946c4319ebb ("fs/locks: allow a lock request to block other requests.")
Cc: stable@vger.kernel.org
Reported-by: Andreas Schneider
Signed-off-by: Neil Brown
Signed-off-by: Jeff Layton

Jeff Layton
2019-03-25 20:36:24 +0800

28 Feb, 2019

1 commit

02e525b2a locking/percpu-rwsem: Remove preempt_disable variants ... Browse Code »

Effective revert commit:

87709e28dc7c ("fs/locks: Use percpu_down_read_preempt_disable()")

This is causing major pain for PREEMPT_RT.

Sebastian did a lot of lockperf runs on 2 and 4 node machines with all
preemption modes (PREEMPT=n should be an obvious NOP for this patch
and thus serves as a good control) and no results showed significance
over 2-sigma (the PREEMPT=n results were almost empty at 1-sigma).

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Signed-off-by: Ingo Molnar

Peter Zijlstra
2019-02-28 14:55:37 +0800

03 Jan, 2019

1 commit

bf77ae4c9 locks: fix error in locks_move_blocks() ... Browse Code »

After moving all requests from
fl->fl_blocked_requests
to
new->fl_blocked_requests

it is nonsensical to do anything to all the remaining elements, there
aren't any. This should do something to all the requests that have been
moved. For simplicity, it does it to all requests in the target list.

Setting "f->fl_blocker = new" to all members of new->fl_blocked_requests
is "obviously correct" as it preserves the invariant of the linkage
among requests.

Reported-by: syzbot+239d99847eb49ecb3899@syzkaller.appspotmail.com
Fixes: 5946c4319ebb ("fs/locks: allow a lock request to block other requests.")
Signed-off-by: NeilBrown
Signed-off-by: Jeff Layton

NeilBrown
2019-01-03 09:14:50 +0800

17 Dec, 2018

1 commit

052b8cfa4 locks: Use inode_is_open_for_write ... Browse Code »

Use the aptly named function rather than open coding it. No functional
changes.

Signed-off-by: Nikolay Borisov
Signed-off-by: Jeff Layton

Nikolay Borisov
2018-12-17 20:19:46 +0800

07 Dec, 2018

5 commits

7bbd1fc0e fs/locks: remove unnecessary white space. ... Browse Code »

- spaces before tabs,
- spaces at the end of lines,
- multiple blank lines,
- blank lines before EXPORT_SYMBOL,
can all go.

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-07 19:51:00 +0800
cb03f94ff fs/locks: merge posix_unblock_lock() and locks_delete_block() ... Browse Code »

posix_unblock_lock() is not specific to posix locks, and behaves
nearly identically to locks_delete_block() - the former returning a
status while the later doesn't.

So discard posix_unblock_lock() and use locks_delete_block() instead,
after giving that function an appropriate return value.

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-07 19:50:56 +0800
fd7732e03 fs/locks: create a tree of dependent requests. ... Browse Code »

When we find an existing lock which conflicts with a request,
and the request wants to wait, we currently add the request
to a list. When the lock is removed, the whole list is woken.
This can cause the thundering-herd problem.
To reduce the problem, we make use of the (new) fact that
a pending request can itself have a list of blocked requests.
When we find a conflict, we look through the existing blocked requests.
If any one of them blocks the new request, the new request is attached
below that request, otherwise it is added to the list of blocked
requests, which are now known to be mutually non-conflicting.

This way, when the lock is released, only a set of non-conflicting
locks will be woken, the rest can stay asleep.
If the lock request cannot be granted and the request needs to be
requeued, all the other requests it blocks will then be woken

To make this more concrete:

If you have a many-core machine, and have many threads all wanting to
briefly lock a give file (udev is known to do this), you can get quite
poor performance.

When one thread releases a lock, it wakes up all other threads that
are waiting (classic thundering-herd) - one will get the lock and the
others go to sleep.
When you have few cores, this is not very noticeable: by the time the
4th or 5th thread gets enough CPU time to try to claim the lock, the
earlier threads have claimed it, done what was needed, and released.
So with few cores, many of the threads don't end up contending.
With 50+ cores, lost of threads can get the CPU at the same time,
and the contention can easily be measured.

This patchset creates a tree of pending lock requests in which siblings
don't conflict and each lock request does conflict with its parent.
When a lock is released, only requests which don't conflict with each
other a woken.

Testing shows that lock-acquisitions-per-second is now fairly stable
even as the number of contending process goes to 1000. Without this
patch, locks-per-second drops off steeply after a few 10s of
processes.

There is a small cost to this extra complexity.
At 20 processes running a particular test on 72 cores, the lock
acquisitions per second drops from 1.8 million to 1.4 million with
this patch. For 100 processes, this patch still provides 1.4 million
while without this patch there are about 700,000.

Reported-and-tested-by: Martin Wilck
Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-07 19:49:24 +0800
c0e159089 fs/locks: change all *_conflict() functions to return bool. ... Browse Code »

posix_locks_conflict() and flock_locks_conflict() both return int.
leases_conflict() returns bool.

This inconsistency will cause problems for the next patch if not
fixed.

So change posix_locks_conflict() and flock_locks_conflict() to return
bool.
Also change the locks_conflict() helper.

And convert some
return (foo);
to
return foo;

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-07 19:49:24 +0800
16306a61d fs/locks: always delete_block after waiting. ... Browse Code »

Now that requests can block other requests, we
need to be careful to always clean up those blocked
requests.
Any time that we wait for a request, we might have
other requests attached, and when we stop waiting,
we must clean them up.
If the lock was granted, the requests might have been
moved to the new lock, though when merged with a
pre-exiting lock, this might not happen.
In all cases we don't want blocked locks to remain
attached, so we remove them to be safe.

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Tested-by: syzbot+a4a3d526b4157113ec6a@syzkaller.appspotmail.com
Tested-by: kernel test robot
Signed-off-by: Jeff Layton

NeilBrown
2018-12-07 19:49:17 +0800

01 Dec, 2018

3 commits

5946c4319 fs/locks: allow a lock request to block other requests. ... Browse Code »

Currently, a lock can block pending requests, but all pending
requests are equal. If lots of pending requests are
mutually exclusive, this means they will all be woken up
and all but one will fail. This can hurt performance.

So we will allow pending requests to block other requests.
Only the first request will be woken, and it will wake the others.

This patch doesn't implement this fully, but prepares the way.

- It acknowledges that a request might be blocking other requests,
and when the request is converted to a lock, those blocked
requests are moved across.
- When a request is requeued or discarded, all blocked requests are
woken.
- When deadlock-detection looks for the lock which blocks a
given request, we follow the chain of ->fl_blocker all
the way to the top.

Tested-by: kernel test robot
Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-01 00:26:12 +0800
d6367d624 fs/locks: use properly initialized file_lock when unlocking. ... Browse Code »

Both locks_remove_posix() and locks_remove_flock() use a
struct file_lock without calling locks_init_lock() on it.
This means the various list_heads are not initialized, which
will become a problem with a later patch.

So change them both to initialize properly. For flock locks,
this involves using flock_make_lock(), and changing it to
allow a file_lock to be passed in, so memory allocation isn't
always needed.

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-01 00:26:12 +0800
ad6bbd8b1 fs/locks: split out __locks_wake_up_blocks(). ... Browse Code »

This functionality will be useful in future patches, so
split it out from locks_wake_up_blocks().

Signed-off-by: NeilBrown
Reviewed-by: J. Bruce Fields
Signed-off-by: Jeff Layton

NeilBrown
2018-12-01 00:26:12 +0800