Eric Lee / smarc-fsl-linux-kernel

14 Jun, 2020

6 commits

96144c58a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix cfg80211 deadlock, from Johannes Berg.

2) RXRPC fails to send norigications, from David Howells.

3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
Geliang Tang.

4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.

5) The ucc_geth driver needs __netdev_watchdog_up exported, from
Valentin Longchamp.

6) Fix hashtable memory leak in dccp, from Wang Hai.

7) Fix how nexthops are marked as FDB nexthops, from David Ahern.

8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.

9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.

10) Fix link speed reporting in iavf driver, from Brett Creeley.

11) When a channel is used for XSK and then reused again later for XSK,
we forget to clear out the relevant data structures in mlx5 which
causes all kinds of problems. Fix from Maxim Mikityanskiy.

12) Fix memory leak in genetlink, from Cong Wang.

13) Disallow sockmap attachments to UDP sockets, it simply won't work.
From Lorenz Bauer.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: ethernet: ti: ale: fix allmulti for nu type ale
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
net: atm: Remove the error message according to the atomic context
bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
libbpf: Support pre-initializing .bss global variables
tools/bpftool: Fix skeleton codegen
bpf: Fix memlock accounting for sock_hash
bpf: sockmap: Don't attach programs to UDP sockets
bpf: tcp: Recv() should return 0 when the peer socket is closed
ibmvnic: Flush existing work items before device removal
genetlink: clean up family attributes allocations
net: ipa: header pad field only valid for AP->modem endpoint
net: ipa: program upper nibbles of sequencer type
net: ipa: fix modem LAN RX endpoint id
net: ipa: program metadata mask differently
ionic: add pcie_print_link_status
rxrpc: Fix race between incoming ACK parser and retransmitter
net/mlx5: E-Switch, Fix some error pointer dereferences
net/mlx5: Don't fail driver on failure to create debugfs
net/mlx5e: CT: Fix ipv6 nat header rewrite actions
...

Linus Torvalds
2020-06-14 07:27:13 +0800
fa7566a0d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf ... Browse Code »

Alexei Starovoitov says:

====================
pull-request: bpf 2020-06-12

The following pull-request contains BPF updates for your *net* tree.

We've added 26 non-merge commits during the last 10 day(s) which contain
a total of 27 files changed, 348 insertions(+), 93 deletions(-).

The main changes are:

1) sock_hash accounting fix, from Andrey.

2) libbpf fix and probe_mem sanitizing, from Andrii.

3) sock_hash fixes, from Jakub.

4) devmap_val fix, from Jesper.

5) load_bytes_relative fix, from YiFei.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-06-14 06:28:08 +0800
bf97bac9d net: atm: Remove the error message according to the atomic context ... Browse Code »

Looking into the context (atomic!) and the error message should be dropped.

Signed-off-by: Liao Pingfang
Signed-off-by: David S. Miller

Liao Pingfang
2020-06-14 06:27:06 +0800
6adc19fd1 Merge tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild ... Browse Code »

Pull more Kbuild updates from Masahiro Yamada:

- fix build rules in binderfs sample

- fix build errors when Kbuild recurses to the top Makefile

- covert '---help---' in Kconfig to 'help'

* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
treewide: replace '---help---' in Kconfig files with 'help'
kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
samples: binderfs: really compile this sample and fix build issues

Linus Torvalds
2020-06-14 04:29:16 +0800
61f3e825b Merge tag '9p-for-5.8' of git://github.com/martinetd/linux ... Browse Code »

Pull 9p update from Dominique Martinet:
"Another very quiet cycle... Only one commit: increase the size of the
ring used for xen transport"

* tag '9p-for-5.8' of git://github.com/martinetd/linux:
9p/xen: increase XEN_9PFS_RING_ORDER

Linus Torvalds
2020-06-14 03:38:57 +0800
a7f7f6248 treewide: replace '---help---' in Kconfig files with 'help' ... Browse Code »

Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada

Masahiro Yamada
2020-06-14 00:57:21 +0800

13 Jun, 2020

4 commits

60e5ca8a6 bpf: Fix memlock accounting for sock_hash ... Browse Code »

Add missed bpf_map_charge_init() in sock_hash_alloc() and
correspondingly bpf_map_charge_finish() on ENOMEM.

It was found accidentally while working on unrelated selftest that
checks "map->memory.pages > 0" is true for all map types.

Before:
# bpftool m l
...
3692: sockhash name m_sockhash flags 0x0
key 4B value 4B max_entries 8 memlock 0B

After:
# bpftool m l
...
84: sockmap name m_sockmap flags 0x0
key 4B value 4B max_entries 8 memlock 4096B

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Andrey Ignatov
Signed-off-by: Alexei Starovoitov
Link: https://lore.kernel.org/bpf/20200612000857.2881453-1-rdna@fb.com

Andrey Ignatov
2020-06-13 06:21:29 +0800
f6fede856 bpf: sockmap: Don't attach programs to UDP sockets ... Browse Code »

The stream parser infrastructure isn't set up to deal with UDP
sockets, so we mustn't try to attach programs to them.

I remember making this change at some point, but I must have lost
it while rebasing or something similar.

Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support")
Signed-off-by: Lorenz Bauer
Signed-off-by: Alexei Starovoitov
Acked-by: Jakub Sitnicki
Link: https://lore.kernel.org/bpf/20200611172520.327602-1-lmb@cloudflare.com

Lorenz Bauer
2020-06-13 06:13:43 +0800
2c7269b23 bpf: tcp: Recv() should return 0 when the peer socket is closed ... Browse Code »

If the peer is closed, we will never get more data, so
tcp_bpf_wait_data will get stuck forever. In case we passed
MSG_DONTWAIT to recv(), we get EAGAIN but we should actually get
0.

>From man 2 recv:

RETURN VALUE

When a stream socket peer has performed an orderly shutdown, the
return value will be 0 (the traditional "end-of-file" return).

This patch makes tcp_bpf_wait_data always return 1 when the peer
socket has been shutdown. Either we have data available, and it would
have returned 1 anyway, or there isn't, in which case we'll call
tcp_recvmsg which does the right thing in this situation.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Sabrina Dubroca
Signed-off-by: Alexei Starovoitov
Acked-by: Jakub Sitnicki
Link: https://lore.kernel.org/bpf/26038a28c21fea5d04d4bd4744c5686d3f2e5504.1591784177.git.sd@queasysnail.net

Sabrina Dubroca
2020-06-13 06:10:12 +0800
b65ce380b genetlink: clean up family attributes allocations ... Browse Code »

genl_family_rcv_msg_attrs_parse() and genl_family_rcv_msg_attrs_free()
take a boolean parameter to determine whether allocate/free the family
attrs. This is unnecessary as we can just check family->parallel_ops.
More importantly, callers would not need to worry about pairing these
parameters correctly after this patch.

And this fixes a memory leak, as after commit c36f05559104
("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()")
we call genl_family_rcv_msg_attrs_parse() for both parallel and
non-parallel cases.

Fixes: c36f05559104 ("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()")
Reported-by: Ido Schimmel
Signed-off-by: Cong Wang
Reviewed-by: Ido Schimmel
Tested-by: Ido Schimmel
Signed-off-by: David S. Miller

Cong Wang
2020-06-13 05:05:08 +0800

12 Jun, 2020

17 commits

2ad6691d9 rxrpc: Fix race between incoming ACK parser and retransmitter ... Browse Code »

There's a race between the retransmission code and the received ACK parser.
The problem is that the retransmission loop has to drop the lock under
which it is iterating through the transmission buffer in order to transmit
a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
window round and discard the packets from the buffer.

The retransmission code then updated the annotations for the wrong packet
and a later retransmission thought it had to retransmit a packet that
wasn't there, leading to a NULL pointer dereference.

Fix this by:

(1) Moving the annotation change to before we drop the lock prior to
transmission. This means we can't vary the annotation depending on
the outcome of the transmission, but that's fine - we'll retransmit
again later if it failed now.

(2) Skipping the packet if the skb pointer is NULL.

The following oops was seen:

BUG: kernel NULL pointer dereference, address: 000000000000002d
Workqueue: krxrpcd rxrpc_process_call
RIP: 0010:rxrpc_get_skb+0x14/0x8a
...
Call Trace:
rxrpc_resend+0x331/0x41e
? get_vtime_delta+0x13/0x20
rxrpc_process_call+0x3c0/0x4ac
process_one_work+0x18f/0x27f
worker_thread+0x1a3/0x247
? create_worker+0x17d/0x17d
kthread+0xe6/0xeb
? kthread_delayed_work_timer_fn+0x83/0x83
ret_from_fork+0x1f/0x30

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2020-06-12 09:18:22 +0800
aa2cad060 xdp: Fix xsk_generic_xmit errno ... Browse Code »

Propagate sock_alloc_send_skb error code, not set it to
EAGAIN unconditionally, when fail to allocate skb, which
might cause that user space unnecessary loops.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Li RongQing
Signed-off-by: Daniel Borkmann
Acked-by: Björn Töpel
Link: https://lore.kernel.org/bpf/1591852266-24017-1-git-send-email-lirongqing@baidu.com

Li RongQing
2020-06-12 05:44:33 +0800
979827826 tipc: fix NULL pointer dereference in tipc_disc_rcv() ... Browse Code »

When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer related data along with a timer and a preformatted discovery
message buffer for later probing... However, this is only carried after
the bearer was set 'up', that left a race condition resulting in kernel
panic.

It occurs when a discovery message from a peer node is received and
processed in bottom half (since the bearer is 'up' already) just before
the discoverer object is created but is now accessed in order to update
the preformatted buffer (with a new trial address, ...) so leads to the
NULL pointer dereference.

We solve the problem by simply moving the bearer 'up' setting to later,
so make sure everything is ready prior to any message receiving.

Acked-by: Jon Maloy
Signed-off-by: Tuong Lien
Signed-off-by: David S. Miller

Tuong Lien
2020-06-12 03:48:08 +0800
c9aa81faf tipc: fix kernel WARNING in tipc_msg_append() ... Browse Code »

syzbot found the following issue:

WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...

This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send
Signed-off-by: Tuong Lien
Signed-off-by: David S. Miller

Tuong Lien
2020-06-12 03:47:23 +0800
a53956829 Merge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs ... Browse Code »

Pull NFS client updates from Anna Schumaker:
"New features and improvements:
- Sunrpc receive buffer sizes only change when establishing a GSS credentials
- Add more sunrpc tracepoints
- Improve on tracepoints to capture internal NFS I/O errors

Other bugfixes and cleanups:
- Move a dprintk() to after a call to nfs_alloc_fattr()
- Fix off-by-one issues in rpc_ntop6
- Fix a few coccicheck warnings
- Use the correct SPDX license identifiers
- Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
- Replace zero-length array with flexible array
- Remove duplicate headers
- Set invalid blocks after NFSv4 writes to update space_used attribute
- Fix direct WRITE throughput regression"

* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFS: Fix direct WRITE throughput regression
SUNRPC: rpc_xprt lifetime events should record xprt->state
xprtrdma: Make xprt_rdma_slot_table_entries static
nfs: set invalid blocks after NFSv4 writes
NFS: remove redundant initialization of variable result
sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
NFS: Add a tracepoint in nfs_set_pgio_error()
NFS: Trace short NFS READs
NFS: nfs_xdr_status should record the procedure name
SUNRPC: Set SOFTCONN when destroying GSS contexts
SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
SUNRPC: trace RPC client lifetime events
SUNRPC: Trace transport lifetime events
SUNRPC: Split the xdr_buf event class
SUNRPC: Add tracepoint to rpc_call_rpcerror()
SUNRPC: Update the RPC_SHOW_SOCKET() macro
SUNRPC: Update the rpc_show_task_flags() macro
SUNRPC: Trace GSS context lifetimes
SUNRPC: receive buffer size estimation values almost never change
...

Linus Torvalds
2020-06-12 03:22:41 +0800
5bffb0062 xprtrdma: Make xprt_rdma_slot_table_entries static ... Browse Code »

Fix the following sparse warning:

net/sunrpc/xprtrdma/transport.c:71:14: warning: symbol 'xprt_rdma_slot_table_entries'
was not declared. Should it be static?

Reported-by: Hulk Robot
Signed-off-by: Zou Wei
Reviewed-by: Chuck Lever
Signed-off-by: Anna Schumaker

Zou Wei
2020-06-12 01:33:48 +0800
2ac3ddc72 sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs ... Browse Code »

When I cat parameter
'/sys/module/sunrpc/parameters/auth_hashtable_size', it displays as
follows. It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/auth_hashtable_size
16[root@hulk-202 ~]#

Signed-off-by: Xiongfeng Wang
Signed-off-by: Anna Schumaker

Xiongfeng Wang
2020-06-12 01:33:48 +0800
841a2ed9a SUNRPC: Set SOFTCONN when destroying GSS contexts ... Browse Code »

Move the RPC_TASK_SOFTCONN flag into rpc_call_null_helper(). The
only minor behavior change is that it is now also set when
destroying GSS contexts.

This gives a better guarantee that gss_send_destroy_context() will
not hang for long if a connection cannot be established.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
6fc3737aa SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT ... Browse Code »

Clean up.

All of rpc_call_null_helper() call sites assert RPC_TASK_SOFT, so
move that setting into rpc_call_null_helper() itself.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
eefc536db SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS ... Browse Code »

Clean up.

Commit a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with
'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS
unconditionally. Therefore there's no need for
rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
42aad0d7f SUNRPC: trace RPC client lifetime events ... Browse Code »

The "create" tracepoint records parts of the rpc_create arguments,
and the shutdown tracepoint records when the rpc_clnt is about to
signal pending tasks and destroy auths.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
911813d7a SUNRPC: Trace transport lifetime events ... Browse Code »

Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:

- Enable tracing of xprt lifetime events for the socket transport
types

- Expose the different types of disconnect to help run down
issues with lingering connections

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
c509f15a5 SUNRPC: Split the xdr_buf event class ... Browse Code »

To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
0125ecbb5 SUNRPC: Add tracepoint to rpc_call_rpcerror() ... Browse Code »

Add a tracepoint in another common exit point for failing RPCs.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:48 +0800
74fb8fece SUNRPC: Trace GSS context lifetimes ... Browse Code »

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:47 +0800
53bc19f17 SUNRPC: receive buffer size estimation values almost never change ... Browse Code »

Avoid unnecessary cache sloshing by placing the buffer size
estimation update logic behind an atomic bit flag.

The size of GSS information included in each wrapped Reply does
not change during the lifetime of a GSS context. Therefore, the
au_rslack and au_ralign fields need to be updated only once after
establishing a fresh GSS credential.

Thus a slack size update must occur after a cred is created,
duplicated, renewed, or expires. I'm not sure I have this exactly
right. A trace point is introduced to track updates to these
variables to enable troubleshooting the problem if I missed a spot.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-12 01:33:47 +0800
c742b6347 Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Highlights:

- Keep nfsd clients from unnecessarily breaking their own
delegations.

Note this requires a small kthreadd addition. The result is Tejun
Heo's suggestion (see link), and he was OK with this going through
my tree.

- Patch nfsd/clients/ to display filenames, and to fix byte-order
when displaying stateid's.

- fix a module loading/unloading bug, from Neil Brown.

- A big series from Chuck Lever with RPC/RDMA and tracing
improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
sunrpc: use kmemdup_nul() in gssp_stringify()
nfsd: safer handling of corrupted c_type
nfsd4: make drc_slab global, not per-net
SUNRPC: Remove unreachable error condition in rpcb_getport_async()
nfsd: Fix svc_xprt refcnt leak when setup callback client failed
sunrpc: clean up properly in gss_mech_unregister()
sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
sunrpc: check that domain table is empty at module unload.
NFSD: Fix improperly-formatted Doxygen comments
NFSD: Squash an annoying compiler warning
SUNRPC: Clean up request deferral tracepoints
NFSD: Add tracepoints for monitoring NFSD callbacks
NFSD: Add tracepoints to the NFSD state management code
NFSD: Add tracepoints to NFSD's duplicate reply cache
SUNRPC: svc_show_status() macro should have enum definitions
SUNRPC: Restructure svc_udp_recvfrom()
SUNRPC: Refactor svc_recvfrom()
SUNRPC: Clean up svc_release_skb() functions
SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
SUNRPC: Replace dprintk() call sites in TCP receive path
...

Linus Torvalds
2020-06-12 01:33:13 +0800

11 Jun, 2020

6 commits

0f5d82f18 net/filter: Permit reading NET in load_bytes_relative when MAC not set ... Browse Code »

Added a check in the switch case on start_header that checks for
the existence of the header, and in the case that MAC is not set
and the caller requests for MAC, -EFAULT. If the caller requests
for NET then MAC's existence is completely ignored.

There is no function to check NET header's existence and as far
as cgroup_skb/egress is concerned it should always be set.

Removed for ptr >= the start of header, considering offset is
bounded unsigned and should always be true. len
Signed-off-by: Daniel Borkmann
Reviewed-by: Stanislav Fomichev
Link: https://lore.kernel.org/bpf/76bb820ddb6a95f59a772ecbd8c8a336f646b362.1591812755.git.zhuyifei@google.com

YiFei Zhu
2020-06-11 22:05:56 +0800
4b5af4412 mptcp: don't leak msk in token container ... Browse Code »

If a listening MPTCP socket has unaccepted sockets at close
time, the related msks are freed via mptcp_sock_destruct(),
which in turn does not invoke the proto->destroy() method
nor the mptcp_token_destroy() function.

Due to the above, the child msk socket is not removed from
the token container, leading to later UaF.

Address the issue explicitly removing the token even in the
above error path.

Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree")
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller

Paolo Abeni
2020-06-11 07:07:00 +0800
1c3837266 Merge branch 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull sysctl fixes from Al Viro:
"Fixups to regressions in sysctl series"

* 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysctl: reject gigantic reads/write to sysctl files
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
trace: fix an incorrect __user annotation on stack_trace_sysctl
random: fix an incorrect __user annotation on proc_do_entropy
net/sysctl: remove leftover __user annotations on neigh_proc_dointvec*
net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl

Linus Torvalds
2020-06-11 07:05:54 +0800
4152d146e Merge branch 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux ... Browse Code »

Pull READ/WRITE_ONCE rework from Will Deacon:
"This the READ_ONCE rework I've been working on for a while, which
bumps the minimum GCC version and improves code-gen on arm64 when
stack protector is enabled"

[ Side note: I'm _really_ tempted to raise the minimum gcc version to
4.9, so that we can just say that we require _Generic() support.

That would allow us to more cleanly handle a lot of the cases where we
depend on very complex macros with 'sizeof' or __builtin_choose_expr()
with __builtin_types_compatible_p() etc.

This branch has a workaround for sparse not handling _Generic(),
either, but that was already fixed in the sparse development branch,
so it's really just gcc-4.9 that we'd require. - Linus ]

* 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse
compiler_types.h: Optimize __unqual_scalar_typeof compilation time
compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long)
compiler-types.h: Include naked type in __pick_integer_type() match
READ_ONCE: Fix comment describing 2x32-bit atomicity
gcov: Remove old GCC 3.4 support
arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
READ_ONCE: Drop pointer qualifiers when reading from scalar types
READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
arm64: csum: Disable KASAN for do_csum()
fault_inject: Don't rely on "return value" from WRITE_ONCE()
net: tls: Avoid assigning 'const' pointer to non-const pointer
netfilter: Avoid assigning 'const' pointer to non-const pointer
compiler/gcc: Raise minimum GCC version for kernel builds to 4.8

Linus Torvalds
2020-06-11 05:46:54 +0800
5969856ae mptcp: fix races between shutdown and recvmsg ... Browse Code »

The msk sk_shutdown flag is set by a workqueue, possibly
introducing some delay in user-space notification. If the last
subflow carries some data with the fin packet, the user space
can wake-up before RCV_SHUTDOWN is set. If it executes unblocking
recvmsg(), it may return with an error instead of eof.

Address the issue explicitly checking for eof in recvmsg(), when
no data is found.

Fixes: 59832e246515 ("mptcp: subflow: check parent mptcp socket on subflow state change")
Signed-off-by: Paolo Abeni
Reviewed-by: Matthieu Baerts
Signed-off-by: David S. Miller

Paolo Abeni
2020-06-11 04:34:14 +0800
ce9ac056d nexthop: Fix fdb labeling for groups ... Browse Code »

fdb nexthops are marked with a flag. For standalone nexthops, a flag was
added to the nh_info struct. For groups that flag was added to struct
nexthop when it should have been added to the group information. Fix
by removing the flag from the nexthop struct and adding a flag to nh_group
that mirrors nh_info and is really only a caching of the individual types.
Add a helper, nexthop_is_fdb, for use by the vxlan code and fixup the
internal code to use the flag from either nh_info or nh_group.

v2
- propagate fdb_nh in remove_nh_grp_entry

Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops")
Cc: Roopa Prabhu
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2020-06-11 04:18:40 +0800

10 Jun, 2020

7 commits

c96b6acc8 dccp: Fix possible memleak in dccp_init and dccp_fini ... Browse Code »

There are some memory leaks in dccp_init() and dccp_fini().

In dccp_fini() and the error handling path in dccp_init(), free lhash2
is missing. Add inet_hashinfo2_free_mod() to do it.

If inet_hashinfo2_init_mod() failed in dccp_init(),
percpu_counter_destroy() should be called to destroy dccp_orphan_count.
It need to goto out_free_percpu when inet_hashinfo2_init_mod() failed.

Fixes: c92c81df93df ("net: dccp: fix kernel crash on module load")
Reported-by: Hulk Robot
Signed-off-by: Wang Hai
Signed-off-by: David S. Miller

Wang Hai
2020-06-10 04:26:23 +0800
1a3db27ad net: sched: export __netdev_watchdog_up() ... Browse Code »

Since the quiesce/activate rework, __netdev_watchdog_up() is directly
called in the ucc_geth driver.

Unfortunately, this function is not available for modules and thus
ucc_geth cannot be built as a module anymore. Fix it by exporting
__netdev_watchdog_up().

Since the commit introducing the regression was backported to stable
branches, this one should ideally be as well.

Fixes: 79dde73cf9bc ("net/ethernet/freescale: rework quiesce/activate for ucc_geth")
Signed-off-by: Valentin Longchamp
Signed-off-by: David S. Miller

Valentin Longchamp
2020-06-10 04:14:31 +0800
845e0ebb4 net: change addr_list_lock back to static key ... Browse Code »

The dynamic key update for addr_list_lock still causes troubles,
for example the following race condition still exists:

CPU 0: CPU 1:
(RCU read lock) (RTNL lock)
dev_mc_seq_show() netdev_update_lockdep_key()
-> lockdep_unregister_key()
-> netif_addr_lock_bh()

because lockdep doesn't provide an API to update it atomically.
Therefore, we have to move it back to static keys and use subclass
for nest locking like before.

In commit 1a33e10e4a95 ("net: partially revert dynamic lockdep key
changes"), I already reverted most parts of commit ab92d68fc22f
("net: core: add generic lockdep keys").

This patch reverts the rest and also part of commit f3b0a18bb6cb
("net: remove unnecessary variables and callback"). After this
patch, addr_list_lock changes back to using static keys and
subclasses to satisfy lockdep. Thanks to dev->lower_level, we do
not have to change back to ->ndo_get_lock_subclass().

And hopefully this reduces some syzbot lockdep noises too.

Reported-by: syzbot+f3a0e80c34b3fc28ac5e@syzkaller.appspotmail.com
Cc: Taehee Yoo
Cc: Dmitry Vyukov
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2020-06-10 03:59:45 +0800
75e68e5bf bpf, sockhash: Synchronize delete from bucket list on map free ... Browse Code »

We can end up modifying the sockhash bucket list from two CPUs when a
sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
that is in the sockhash is unlinking itself from it on another CPU
it (sock_hash_delete_from_link).

This results in accessing a list element that is in an undefined state as
reported by KASAN:

| ==================================================================
| BUG: KASAN: wild-memory-access in sock_hash_free+0x13c/0x280
| Write of size 8 at addr dead000000000122 by task kworker/2:1/95
|
| CPU: 2 PID: 95 Comm: kworker/2:1 Not tainted 5.7.0-rc7-02961-ge22c35ab0038-dirty #691
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
| Workqueue: events bpf_map_free_deferred
| Call Trace:
| dump_stack+0x97/0xe0
| ? sock_hash_free+0x13c/0x280
| __kasan_report.cold+0x5/0x40
| ? mark_lock+0xbc1/0xc00
| ? sock_hash_free+0x13c/0x280
| kasan_report+0x38/0x50
| ? sock_hash_free+0x152/0x280
| sock_hash_free+0x13c/0x280
| bpf_map_free_deferred+0xb2/0xd0
| ? bpf_map_charge_finish+0x50/0x50
| ? rcu_read_lock_sched_held+0x81/0xb0
| ? rcu_read_lock_bh_held+0x90/0x90
| process_one_work+0x59a/0xac0
| ? lock_release+0x3b0/0x3b0
| ? pwq_dec_nr_in_flight+0x110/0x110
| ? rwlock_bug.part.0+0x60/0x60
| worker_thread+0x7a/0x680
| ? _raw_spin_unlock_irqrestore+0x4c/0x60
| kthread+0x1cc/0x220
| ? process_one_work+0xac0/0xac0
| ? kthread_create_on_node+0xa0/0xa0
| ret_from_fork+0x24/0x30
| ==================================================================

Fix it by reintroducing spin-lock protected critical section around the
code that removes the elements from the bucket on sockhash free.

To do that we also need to defer processing of removed elements, until out
of atomic context so that we can unlink the socket from the map when
holding the sock lock.

Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free")
Reported-by: Eric Dumazet
Signed-off-by: Jakub Sitnicki
Signed-off-by: Alexei Starovoitov
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200607205229.2389672-3-jakub@cloudflare.com

Jakub Sitnicki
2020-06-10 01:59:04 +0800
33a7c8315 bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free ... Browse Code »

When sockhash gets destroyed while sockets are still linked to it, we will
walk the bucket lists and delete the links. However, we are not freeing the
list elements after processing them, leaking the memory.

The leak can be triggered by close()'ing a sockhash map when it still
contains sockets, and observed with kmemleak:

unreferenced object 0xffff888116e86f00 (size 64):
comm "race_sock_unlin", pid 223, jiffies 4294731063 (age 217.404s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
81 de e8 41 00 00 00 00 c0 69 2f 15 81 88 ff ff ...A.....i/.....
backtrace:
[] sock_hash_update_common+0x4ca/0x760
[] sock_hash_update_elem+0x1d2/0x200
[] __do_sys_bpf+0x2046/0x2990
[] do_syscall_64+0xad/0x9a0
[] entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fix it by freeing the list element when we're done with it.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Jakub Sitnicki
Signed-off-by: Alexei Starovoitov
Acked-by: John Fastabend
Link: https://lore.kernel.org/bpf/20200607205229.2389672-2-jakub@cloudflare.com

Jakub Sitnicki
2020-06-10 01:59:04 +0800
487082fb7 bpf/sockmap: Fix kernel panic at __tcp_bpf_recvmsg ... Browse Code »

When user application calls read() with MSG_PEEK flag to read data
of bpf sockmap socket, kernel panic happens at
__tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
queue after read out under MSG_PEEK flag is set. Because it's not
judged whether sk_msg is the last msg of ingress_msg queue, the next
sk_msg may be the head of ingress_msg queue, whose memory address of
sg page is invalid. So it's necessary to add check codes to prevent
this problem.

[20759.125457] BUG: kernel NULL pointer dereference, address:
0000000000000008
[20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G E
5.4.32 #1
[20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
4.1.12 06/18/2017
[20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
[20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
[20759.276099] tcp_bpf_recvmsg+0x113/0x370
[20759.281137] inet_recvmsg+0x55/0xc0
[20759.285734] __sys_recvfrom+0xc8/0x130
[20759.290566] ? __audit_syscall_entry+0x103/0x130
[20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
[20759.301700] ? __audit_syscall_exit+0x1e4/0x290
[20759.307235] __x64_sys_recvfrom+0x24/0x30
[20759.312226] do_syscall_64+0x55/0x1b0
[20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: dihu
Signed-off-by: Alexei Starovoitov
Acked-by: John Fastabend
Acked-by: Jakub Sitnicki
Link: https://lore.kernel.org/bpf/20200605084625.9783-1-anny.hu@linux.alibaba.com

dihu
2020-06-10 01:56:36 +0800
3e4e28c5a mmap locking API: convert mmap_sem API comments ... Browse Code »

Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.

Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Davidlohr Bueso
Reviewed-by: Daniel Jordan
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
Signed-off-by: Linus Torvalds

Michel Lespinasse
2020-06-10 00:39:14 +0800