Eric Lee / smarc-fsl-linux-kernel

30 Dec, 2020

6 commits

7941ee42d net: sunrpc: Fix 'snprintf' return value check in 'do_xprt_debugfs'

[ Upstream commit 35a6d396721e28ba161595b0fc9e8896c00399bb ]

'snprintf' returns the number of characters which would have been written
if enough space had been available, excluding the terminating null byte.
Thus, the return value of 'sizeof(buf)' means that the last character
has been dropped.
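
As a rough illustration of the rule described above (not the actual patch), a truncation check could be written as follows. The helper name `format_xprt_name` and the "xprt-%u" format are hypothetical and only serve the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "xprt-<id>" into buf.
 * Returns 0 on success, -1 if the output was truncated.
 */
static int format_xprt_name(char *buf, size_t size, unsigned int id)
{
    int len;

    assert(buf != NULL && size > 0);
    len = snprintf(buf, size, "xprt-%u", id);
    /* len == size already means the last character was dropped. */
    if (len < 0 || (size_t)len >= size)
        return -1;
    return 0;
}
```

The comparison uses `>=` rather than `>`, because a return value equal to the buffer size leaves no room for the terminating null byte.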

Signed-off-by: Fedor Tokarev
Fixes: 2f34b8bfae19 ("SUNRPC: add links for all client xprts to debugfs")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin

Fedor Tokarev
2020-12-30 18:53:30 +0800
c1e628f91 SUNRPC: xprt_load_transport() needs to support the netid "rdma6"

[ Upstream commit d5aa6b22e2258f05317313ecc02efbb988ed6d38 ]

According to RFC5666, the correct netid for an IPv6 addressed RDMA
transport is "rdma6", which we've supported as a mount option since
Linux-4.7. The problem is that loading the module "xprtrdma6" fails,
since there is no module alias of that name.

Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin

Trond Myklebust
2020-12-30 18:53:30 +0800
d1296acac SUNRPC: rpc_wake_up() should wake up tasks in the correct order

[ Upstream commit e4c72201b6ec3173dfe13fa2e2335a3ad78d4921 ]

Currently, we wake up the tasks by priority queue ordering, which means
that we ignore the batching that is supposed to help with QoS issues.

Fixes: c049f8ea9a0d ("SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin

Trond Myklebust
2020-12-30 18:53:30 +0800
c98d33579 Bluetooth: Fix: LL PRivacy BLE device fails to connect

[ Upstream commit 1fb17dfc258ff6208f7873cc7b8e40e27515d2d5 ]

When adding a device to the white list, the device is also added to the
resolving list. It has to be added only when the HCI_ENABLE_LL_PRIVACY flag
is set. The HCI_ENABLE_LL_PRIVACY flag has to be tested before adding devices
to or deleting devices from the resolving list. The use_ll_privacy macro is
used only to check whether the controller supports LL_Privacy.

https://bugzilla.kernel.org/show_bug.cgi?id=209745

Fixes: 0eee35bdfa3b ("Bluetooth: Update resolving list when updating whitelist")
Signed-off-by: Sathish Narasimman
Signed-off-by: Marcel Holtmann
Signed-off-by: Sasha Levin

Sathish Narasimman
2020-12-30 18:53:05 +0800
147cdf5f3 Bluetooth: Fix null pointer dereference in hci_event_packet()

[ Upstream commit 6dfccd13db2ff2b709ef60a50163925d477549aa ]

AMP_MGR is getting dereferenced in hci_phy_link_complete_evt() when called
from hci_event_packet(), and there is a possibility that hcon->amp_mgr may
not be found when accessed after initialization of hcon.

- net/bluetooth/hci_event.c:4945
The bug seems to get triggered in this line:

bredr_hcon = hcon->amp_mgr->l2cap_conn->hcon;

Fix it by adding a NULL check for hcon->amp_mgr before checking ev->status.

Fixes: d5e911928bd8 ("Bluetooth: AMP: Process Physical Link Complete evt")
Reported-and-tested-by: syzbot+0bef568258653cff272f@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=0bef568258653cff272f
Signed-off-by: Anmol Karn
Signed-off-by: Marcel Holtmann
Signed-off-by: Sasha Levin

Anmol Karn
2020-12-30 18:53:05 +0800
615bc1ba5 nl80211/cfg80211: fix potential infinite loop

[ Upstream commit ba5c25236bc3d399df82ebe923490ea8d2d35cf2 ]

The for-loop iterates with a u8 loop counter and compares this
with the loop upper limit of request->n_ssids which is an int type.
There is a potential infinite loop if n_ssids is larger than the
u8 loop counter, so fix this by making the loop counter an int.
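
The wraparound described above can be reproduced in user space. The sketch below is not the kernel code: the function names are hypothetical, and the `cap` parameter exists only so that the demonstration itself terminates.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count loop iterations when the counter is a u8. Returns -1 once the loop
 * exceeds `cap` iterations, which stands in for "never terminates".
 */
static long count_with_u8_counter(int n_ssids, long cap)
{
    long steps = 0;
    uint8_t i;

    assert(cap > 0);
    for (i = 0; i < n_ssids; i++) {  /* i wraps from 255 back to 0 */
        if (++steps > cap)
            return -1;
    }
    return steps;
}

/* Same loop with an int counter: terminates for any non-negative limit. */
static long count_with_int_counter(int n_ssids)
{
    long steps = 0;
    int i;

    for (i = 0; i < n_ssids; i++)
        steps++;
    return steps;
}
```

With a limit of 200 both variants finish after 200 iterations; with a limit of 300 only the int variant finishes.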

Addresses-Coverity: ("Infinite loop")
Fixes: c8cb5b854b40 ("nl80211/cfg80211: support 6 GHz scanning")
Signed-off-by: Colin Ian King
Link: https://lore.kernel.org/r/20201029222407.390218-1-colin.king@canonical.com
Signed-off-by: Johannes Berg
Signed-off-by: Sasha Levin

Colin Ian King
2020-12-30 18:53:03 +0800

26 Dec, 2020

3 commits

05725b40b nl80211: validate key indexes for cfg80211_registered_device

commit 2d9463083ce92636a1bdd3e30d1236e3e95d859e upstream.

syzbot discovered a bug in which an OOB access was being made because
an unsuitable key_idx value was wrongly considered to be acceptable
while deleting a key in nl80211_del_key().

Since we don't know the cipher at the time of deletion, if
cfg80211_validate_key_settings() were to be called directly in
nl80211_del_key(), even valid keys would be wrongly determined invalid,
and deletion wouldn't occur correctly.
For this reason, a new function - cfg80211_valid_key_idx(), has been
created, to determine if the key_idx value provided is valid or not.
cfg80211_valid_key_idx() is directly called in 2 places -
nl80211_del_key(), and cfg80211_validate_key_settings().

Reported-by: syzbot+49d4cab497c2142ee170@syzkaller.appspotmail.com
Tested-by: syzbot+49d4cab497c2142ee170@syzkaller.appspotmail.com
Suggested-by: Johannes Berg
Signed-off-by: Anant Thazhemadam
Link: https://lore.kernel.org/r/20201204215825.129879-1-anant.thazhemadam@gmail.com
Cc: stable@vger.kernel.org
[also disallow IGTK key IDs if no IGTK cipher is supported]
Signed-off-by: Johannes Berg
Signed-off-by: Greg Kroah-Hartman

Anant Thazhemadam
2020-12-26 23:02:45 +0800
b260e4a68 Bluetooth: Fix slab-out-of-bounds read in hci_le_direct_adv_report_evt()

commit f7e0e8b2f1b0a09b527885babda3e912ba820798 upstream.

`num_reports` is not being properly checked. A malformed event packet with
a large `num_reports` number makes hci_le_direct_adv_report_evt() read out
of bounds. Fix it.
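
A length check of the kind this description calls for might look like the sketch below. It is an illustration under assumptions, not the upstream fix: the packet layout (a one-byte count followed by fixed-size entries), `REPORT_SIZE`, and `reports_fit` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REPORT_SIZE 16u  /* hypothetical fixed size of one report entry */

/*
 * Returns true only if the packet really contains the number of reports
 * announced in its first byte.
 */
static bool reports_fit(const uint8_t *pkt, size_t len)
{
    size_t num_reports;

    assert(pkt != NULL);
    if (len < 1)
        return false;
    num_reports = pkt[0];
    if (num_reports == 0)
        return false;
    /* Payload after the count byte must hold every announced report. */
    return len - 1 >= num_reports * REPORT_SIZE;
}
```

The point of the check is that the count comes from the wire and must be validated against the received length before any entry is read.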

Cc: stable@vger.kernel.org
Fixes: 2f010b55884e ("Bluetooth: Add support for handling LE Direct Advertising Report events")
Reported-and-tested-by: syzbot+24ebd650e20bd263ca01@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=24ebd650e20bd263ca01
Signed-off-by: Peilin Ye
Signed-off-by: Marcel Holtmann
Signed-off-by: Greg Kroah-Hartman

Peilin Ye
2020-12-26 23:02:44 +0800
eadec7f53 net: ipconfig: Avoid spurious blank lines in boot log

commit c9f64d1fc101c64ea2be1b2e562b4395127befc9 upstream.

When dumping the name and NTP servers advertised by DHCP, a blank line
is emitted if either of the lists is empty. This can lead to confusing
issues such as the blank line getting flagged as a warning. This happens
because the blank line is the result of pr_cont("\n") and that may see
its level corrupted by some other driver concurrently writing to the
console.

Fix this by making sure that the terminating newline is only emitted
if at least one entry in the lists was printed before.
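
The idea of emitting the newline only when something was printed can be sketched in user space as follows. This is not the kernel implementation: `dump_list` is a hypothetical helper that writes into a buffer instead of the kernel log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write the non-empty entries, comma-separated, into out. The terminating
 * newline is added only if at least one entry was printed.
 * Returns the number of entries printed.
 */
static size_t dump_list(char *out, size_t size,
                        const char *const *entries, size_t n)
{
    size_t used = 0, printed = 0, i;

    assert(out != NULL && size > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        int len;

        if (entries[i] == NULL || entries[i][0] == '\0')
            continue;  /* unset slot: print nothing */
        len = snprintf(out + used, size - used, "%s%s",
                       printed ? ", " : "", entries[i]);
        if (len < 0 || (size_t)len >= size - used) {
            out[used] = '\0';  /* drop the partial entry */
            break;
        }
        used += (size_t)len;
        printed++;
    }
    if (printed > 0 && used + 1 < size) {
        out[used++] = '\n';
        out[used] = '\0';
    }
    return printed;
}
```

For an all-empty list the output stays empty, so no stray blank line is produced.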

Reported-by: Jon Hunter
Signed-off-by: Thierry Reding
Link: https://lore.kernel.org/r/20201110073757.1284594-1-thierry.reding@gmail.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Thierry Reding
2020-12-26 23:02:38 +0800

11 Dec, 2020

1 commit

d9838b1d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Alexei Starovoitov says:

====================
pull-request: bpf 2020-12-10

The following pull-request contains BPF updates for your *net* tree.

We've added 21 non-merge commits during the last 12 day(s) which contain
a total of 21 files changed, 163 insertions(+), 88 deletions(-).

The main changes are:

1) Fix propagation of 32-bit signed bounds from 64-bit bounds, from Alexei.

2) Fix ring_buffer__poll() return value, from Andrii.

3) Fix race in lwt_bpf, from Cong.

4) Fix test_offload, from Toke.

5) Various xsk fixes.

Please consider pulling these changes from:

git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

Also thanks to reporters, reviewers and testers of commits in this pull-request:

Cong Wang, Hulk Robot, Jakub Kicinski, Jean-Philippe Brucker, John
Fastabend, Magnus Karlsson, Maxim Mikityanskiy, Yonghong Song
====================

Signed-off-by: David S. Miller

David S. Miller
2020-12-11 06:29:30 +0800

10 Dec, 2020

7 commits

7fdd375e3 net: sched: Fix dump of MPLS_OPT_LSE_LABEL attribute in cls_flower

TCA_FLOWER_KEY_MPLS_OPT_LSE_LABEL is a u32 attribute (MPLS label is
20 bits long).

Fixes the following bug:

$ tc filter add dev ethX ingress protocol mpls_uc \
    flower mpls lse depth 2 label 256 \
    action drop

$ tc filter show dev ethX ingress
filter protocol mpls_uc pref 49152 flower chain 0
filter protocol mpls_uc pref 49152 flower chain 0 handle 0x1
  eth_type 8847
  mpls
    lse depth 2 label 0
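
The commit text does not say which narrower type the dump used. The sketch below assumes an 8-bit store, which would be consistent with label 256 reading back as 0 in the output above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MPLS_LABEL_MASK 0xFFFFFu  /* 20-bit label field */

/* Assumed buggy shape: the label is squeezed through an 8-bit value. */
static uint32_t label_as_u8(uint32_t label)
{
    assert(label <= MPLS_LABEL_MASK);
    return (uint8_t)label;  /* keeps only the low 8 bits */
}

/* A 32-bit attribute carries the full 20-bit label unchanged. */
static uint32_t label_as_u32(uint32_t label)
{
    assert(label <= MPLS_LABEL_MASK);
    return label;
}
```

Under that assumption, any label that is a multiple of 256 would be dumped as 0, while labels below 256 would appear correct.
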
Signed-off-by: David S. Miller

Guillaume Nault
2020-12-10 12:39:38 +0800
b7e4ba9a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Switch to RCU in x_tables to fix possible NULL pointer dereference,
from Subash Abhinov Kasiviswanathan.

2) Fix netlink dump of dynset timeouts later than 23 days.

3) Add comment for the indirect serialization of the nft commit mutex
with rtnl_mutex.

4) Remove bogus check for confirmed conntrack when matching on the
conntrack ID, from Brett Mastbergen.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-12-10 10:55:46 +0800
299bcb55e tcp: fix cwnd-limited bug for TSO deferral where we send nothing

When cwnd is not a multiple of the TSO skb size of N*MSS, we can get
into persistent scenarios where we have the following sequence:

(1) ACK for full-sized skb of N*MSS arrives
-> tcp_write_xmit() transmit full-sized skb with N*MSS
-> move pacing release time forward
-> exit tcp_write_xmit() because pacing time is in the future

(2) TSQ callback or TCP internal pacing timer fires
-> try to transmit next skb, but TSO deferral finds remainder of
available cwnd is not big enough to trigger an immediate send
now, so we defer sending until the next ACK.

(3) repeat...

So we can get into a case where we never mark ourselves as
cwnd-limited for many seconds at a time, even with
bulk/infinite-backlog senders, because:

o In case (1) above, every time in tcp_write_xmit() we have enough
cwnd to send a full-sized skb, we are not fully using the cwnd
(because cwnd is not a multiple of the TSO skb size). So every time we
send data, we are not cwnd limited, and so in the cwnd-limited
tracking code in tcp_cwnd_validate() we mark ourselves as not
cwnd-limited.

o In case (2) above, every time in tcp_write_xmit() that we try to
transmit the "remainder" of the cwnd but defer, we set the local
variable is_cwnd_limited to true, but we do not send any packets, so
sent_pkts is zero, so we don't call the cwnd-limited logic to update
tp->is_cwnd_limited.

Fixes: ca8a22634381 ("tcp: make cwnd-limited checks measurement-based, and gentler")
Reported-by: Ingemar Johansson
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: Eric Dumazet
Link: https://lore.kernel.org/r/20201209035759.1225145-1-ncardwell.kernel@gmail.com
Signed-off-by: Jakub Kicinski

Neal Cardwell
2020-12-10 08:15:54 +0800
5137d3036 net: flow_offload: Fix memory leak for indirect flow block

The offending commit introduces a cleanup callback that is invoked
when the driver module is removed to clean up the tunnel device
flow block. But it returns on the first iteration of the for loop.
The remaining indirect flow blocks will never be freed.

Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
CC: Pablo Neira Ayuso
Signed-off-by: Chris Mi
Reviewed-by: Roi Dayan

Chris Mi
2020-12-10 08:08:33 +0800
8ef44b6fe tcp: Retain ECT bits for tos reflection

For DCTCP, we have to retain the ECT bits set by the congestion control
algorithm on the socket when reflecting syn TOS in syn-ack, in order to
make ECN work properly.
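
One way to express "reflect the SYN TOS but keep the socket's ECT bits" is a simple bit merge. The sketch below is a user-space illustration, not the kernel code; `ECN_MASK` and `synack_tos` are names chosen here, and the only fact relied on is that ECN occupies the low two bits of the TOS byte.

```c
#include <assert.h>
#include <stdint.h>

#define ECN_MASK 0x03u  /* low two bits of the TOS byte carry ECN */

/*
 * Take the DSCP bits from the TOS seen in the SYN, and the ECN bits from
 * the value the socket (i.e. the congestion control) has set.
 */
static uint8_t synack_tos(uint8_t syn_tos, uint8_t sk_tos)
{
    uint8_t tos = (uint8_t)((syn_tos & ~ECN_MASK) | (sk_tos & ECN_MASK));

    /* The ECN field must be exactly what the socket asked for. */
    assert((tos & ECN_MASK) == (sk_tos & ECN_MASK));
    return tos;
}
```

For example, a SYN TOS of 0xba combined with a socket ECN field of 0x01 yields 0xb9: the upper six bits are reflected and the lower two come from the socket.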

Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket")
Reported-by: Alexander Duyck
Signed-off-by: Wei Wang
Reviewed-by: Eric Dumazet
Signed-off-by: David S. Miller

Wei Wang
2020-12-10 08:08:23 +0800
a770bf515 ethtool: fix stack overflow in ethnl_parse_bitset()

Syzbot reported a stack overflow in bitmap_from_arr32() called from
ethnl_parse_bitset() when bitset from netlink message is longer than
target bitmap length. While ethnl_compact_sanity_checks() makes sure that
trailing part is all zeros (i.e. the request does not try to touch bits
kernel does not recognize), we also need to cap change_bits to nbits so
that we don't try to write past the prepared bitmaps.

Fixes: 88db6d1e4f62 ("ethtool: add ethnl_parse_bitset() helper")
Reported-by: syzbot+9d39fa49d4df294aab93@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek
Link: https://lore.kernel.org/r/3487ee3a98e14cd526f55b6caaa959d2dcbcad9f.1607465316.git.mkubecek@suse.cz
Signed-off-by: Jakub Kicinski

Michal Kubecek
2020-12-10 07:50:38 +0800
323a391a2 can: isotp: isotp_setsockopt(): block setsockopt on bound sockets

The isotp socket can be widely configured in its behaviour regarding addressing
types, fill-ups, receive pattern tests and link layer length. Usually all
these settings need to be fixed before bind() and can not be changed
afterwards.

This patch adds a check to enforce the common usage pattern.
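
The shape of such a check can be sketched as below. This is an illustration only: `struct demo_sock`, `demo_setsockopt`, and the choice of `-EISCONN` as the error code are assumptions made for the sketch, not taken from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_sock {  /* hypothetical stand-in for the socket state */
    bool bound;
    int option;
};

/* Refuse configuration changes once the socket has been bound. */
static int demo_setsockopt(struct demo_sock *so, int value)
{
    assert(so != NULL);
    if (so->bound)
        return -EISCONN;  /* error code chosen for the sketch */
    so->option = value;
    return 0;
}
```

Rejecting the call outright, rather than silently ignoring it, lets applications that violate the usage pattern notice the mistake.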

Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Signed-off-by: Oliver Hartkopp
Tested-by: Thomas Wagner
Link: https://lore.kernel.org/r/20201203140604.25488-2-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde
Link: https://lore.kernel.org/r/20201204133508.742120-3-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski

Oliver Hartkopp
2020-12-10 00:44:15 +0800

09 Dec, 2020

6 commits

998f17296 xdp: Remove the xdp_attachment_flags_ok() callback

Since commit 7f0a838254bd ("bpf, xdp: Maintain info on attached XDP BPF
programs in net_device"), the XDP program attachment info is now maintained
in the core code. This interacts badly with the xdp_attachment_flags_ok()
check that prevents unloading an XDP program with different load flags than
it was loaded with. In practice, two kinds of failures are seen:

- An XDP program loaded without specifying a mode (and which then ends up
in driver mode) cannot be unloaded if the program mode is specified on
unload.

- The dev_xdp_uninstall() hook always calls the driver callback with the
mode set to the type of the program but an empty flags argument, which
means the flags_ok() check prevents the program from being removed,
leading to bpf prog reference leaks.

The original reason this check was added was to avoid ambiguity when
multiple programs were loaded. With the way the checks are done in the core
now, this is quite simple to enforce in the core code, so let's add a check
there and get rid of the xdp_attachment_flags_ok() callback entirely.

Fixes: 7f0a838254bd ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device")
Signed-off-by: Toke Høiland-Jørgensen
Signed-off-by: Daniel Borkmann
Acked-by: Jakub Kicinski
Link: https://lore.kernel.org/bpf/160752225751.110217.10267659521308669050.stgit@toke.dk

Toke Høiland-Jørgensen
2020-12-09 23:27:42 +0800
2d94b20b9 netfilter: nft_ct: Remove confirmation check for NFT_CT_ID

Since commit 656c8e9cc1ba ("netfilter: conntrack: Use consistent ct id
hash calculation") the ct id will not change from initialization to
confirmation. Removing the confirmation check allows for things like
adding an element to a 'typeof ct id' set in prerouting upon reception
of the first packet of a new connection, and then being able to
reference that set consistently both before and after the connection
is confirmed.

Fixes: 656c8e9cc1ba ("netfilter: conntrack: Use consistent ct id hash calculation")
Signed-off-by: Brett Mastbergen
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Brett Mastbergen
2020-12-09 17:31:58 +0800
72d05c00d tcp: select sane initial rcvq_space.space for big MSS

Before commit a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
small tcp_rmem[1] values were overridden by tcp_fixup_rcvbuf() to accommodate various MSS.

This is no longer the case, and Hazem Mohamed Abuelfotoh reported
that DRS would not work for MTU 9000 endpoints receiving regular (1500 bytes) frames.

Root cause is that tcp_init_buffer_space() uses tp->rcv_wnd for upper limit
of rcvq_space.space computation, while it can select later a smaller
value for tp->rcv_ssthresh and tp->window_clamp.

ss -temoi on receiver would show :

skmem:(r0,rb131072,t0,tb46080,f0,w0,o0,bl0,d0) rcv_space:62496 rcv_ssthresh:56596

This means that TCP can not increase its window in tcp_grow_window(),
and that DRS can never kick in.

Fix this by making sure that rcvq_space.space is not bigger than number of bytes
that can be held in TCP receive queue.
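
The clamp described above amounts to taking a minimum. The sketch below is a simplification under assumptions: `clamp_rcvq_space` and its two parameters are hypothetical, and the exact set of limits the kernel compares is not spelled out in this message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Keep the initial receive-space estimate from exceeding what the receive
 * queue can actually hold (queue_limit is a caller-supplied bound).
 */
static uint32_t clamp_rcvq_space(uint32_t candidate, uint32_t queue_limit)
{
    uint32_t space = candidate < queue_limit ? candidate : queue_limit;

    assert(space <= queue_limit);
    return space;
}
```

Feeding in the two numbers from the ss output above (62496 as the candidate, 56596 as the limit) gives 56596, so the estimate no longer starts above the window TCP is able to grow into.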

People unable/unwilling to change their kernel can work around this issue by
selecting a bigger tcp_rmem[1] value as in :

echo "4096 196608 6291456" >/proc/sys/net/ipv4/tcp_rmem

Based on an initial report and patch from Hazem Mohamed Abuelfotoh
https://lore.kernel.org/netdev/20201204180622.14285-1-abuehaze@amazon.com/

Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Fixes: 041a14d26715 ("tcp: start receiver buffer autotuning sooner")
Reported-by: Hazem Mohamed Abuelfotoh
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Eric Dumazet
2020-12-09 08:27:48 +0800
0398ba9e5 net: tipc: prevent possible null deref of link

`tipc_node_apply_property` does a null check on a `tipc_link_entry`
pointer but also accesses the same pointer outside the null check block.

This triggers a warning on Coverity Static Analyzer because we're
implying that `e->link` can be null.

Move the "Update MTU for node link entry" line into the if block to make
sure that `e->link` is not null when it is accessed.

Signed-off-by: Cengiz Can
Signed-off-by: David S. Miller

Cengiz Can
2020-12-09 07:53:41 +0800
42f1c2712 netfilter: nftables: comment indirect serialization of commit_mutex with rtnl_mutex

Add an explicit comment in the code to describe the indirect
serialization of the holders of the commit_mutex with the rtnl_mutex.
Commit 90d2723c6d4c ("netfilter: nf_tables: do not hold reference on
netdevice from preparation phase") already describes this, but a comment
in this case is better for reference.

Reported-by: Vladimir Oltean
Reviewed-by: Vladimir Oltean
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2020-12-09 04:53:48 +0800
917d80d37 netfilter: nft_dynset: fix timeouts later than 23 days

Use nf_msecs_to_jiffies64 and nf_jiffies64_to_msecs as provided by
8e1102d5a159 ("netfilter: nf_tables: support timeouts larger than 23
days"), otherwise ruleset listing breaks.

Fixes: a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2020-12-09 03:42:11 +0800

08 Dec, 2020

6 commits

cc00bcaa5 netfilter: x_tables: Switch synchronization to RCU

When running concurrent iptables rules replacement with data, the per CPU
sequence count is checked after the assignment of the new information.
The sequence count is used to synchronize with the packet path without the
use of any explicit locking. If there are any packets in the packet path using
the table information, the sequence count is incremented to an odd value and
is incremented to an even value after packet processing completes.

The new table value assignment is followed by a write memory barrier so every
CPU should see the latest value. If the packet path has started with the old
table information, the sequence counter will be odd and the iptables
replacement will wait till the sequence count is even prior to freeing the
old table info.

However, this assumes that the new table information assignment and the memory
barrier are actually executed prior to the counter check in the replacement
thread. If the CPU decides to execute the assignment later, as there is no user
of the table information prior to the sequence check, the packet path in another
CPU may use the old table information. The replacement thread would then free
the table information under it, leading to a use after free in the packet
processing context:

Unable to handle kernel NULL pointer dereference at virtual
address 000000000000008e
pc : ip6t_do_table+0x5d0/0x89c
lr : ip6t_do_table+0x5b8/0x89c
ip6t_do_table+0x5d0/0x89c
ip6table_filter_hook+0x24/0x30
nf_hook_slow+0x84/0x120
ip6_input+0x74/0xe0
ip6_rcv_finish+0x7c/0x128
ipv6_rcv+0xac/0xe4
__netif_receive_skb+0x84/0x17c
process_backlog+0x15c/0x1b8
napi_poll+0x88/0x284
net_rx_action+0xbc/0x23c
__do_softirq+0x20c/0x48c

This could be fixed by forcing instruction order after the new table
information assignment or by switching to RCU for the synchronization.

Fixes: 80055dab5de0 ("netfilter: x_tables: make xt_replace_table wait until old rules are not used anymore")
Reported-by: Sean Tranchetti
Reported-by: kernel test robot
Suggested-by: Florian Westphal
Signed-off-by: Subash Abhinov Kasiviswanathan
Signed-off-by: Pablo Neira Ayuso

Subash Abhinov Kasiviswanathan
2020-12-08 19:57:39 +0800
819f56bad Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2020-12-07

1) Syzbot reported fixes for the new 64/32 bit compat layer.
From Dmitry Safonov.

2) Fix a memory leak in xfrm_user_policy that was introduced
by adding the 64/32 bit compat layer. From Yu Kuai.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
net: xfrm: fix memory leak in xfrm_user_policy()
xfrm/compat: Don't allocate memory with __GFP_ZERO
xfrm/compat: memset(0) 64-bit padding at right place
xfrm/compat: Translate by copying XFRMA_UNSPEC attribute
====================

Link: https://lore.kernel.org/r/20201207093937.2874932-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski

Jakub Kicinski
2020-12-08 10:29:54 +0800
f55628b3e mptcp: print new line in mptcp_seq_show() if mptcp isn't in use

When running cat /proc/net/netstat, the output is not terminated with a newline; it looks like this:
[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0[root@localhost ~]#

This is because in mptcp_seq_show(), if mptcp isn't in use, net->mib.mptcp_statistics is NULL,
so it just prints all zeros after "MPTcpExt:" and returns, forgetting the '\n'.

After this patch:

[root@localhost ~]# cat /proc/net/netstat
...
MPTcpExt: 0 0 0 0 0 0 0 0 0 0 0 0 0
[root@localhost ~]#
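
The fixed behaviour of the "not in use" path can be sketched in user space as below. This is not the kernel function: `show_zero_counters` is a hypothetical helper that writes into a caller-supplied buffer, and the counter count is passed in rather than derived from the MIB table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Emit "MPTcpExt:" followed by n zero counters and a terminating newline.
 * Returns the number of characters written, or -1 if the buffer is too small.
 */
static int show_zero_counters(char *out, size_t size, unsigned int n)
{
    static const char prefix[] = "MPTcpExt:";
    /* prefix + " 0" per counter + '\n' + NUL */
    size_t need = (sizeof(prefix) - 1) + 2u * n + 1 + 1;
    size_t pos;
    unsigned int i;

    assert(out != NULL);
    if (size < need)
        return -1;
    memcpy(out, prefix, sizeof(prefix) - 1);
    pos = sizeof(prefix) - 1;
    for (i = 0; i < n; i++) {
        out[pos++] = ' ';
        out[pos++] = '0';
    }
    out[pos++] = '\n';  /* the terminator the early-return path forgot */
    out[pos] = '\0';
    return (int)pos;
}
```

With 13 counters, as in the output above, the line is 36 characters long including the final newline.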

Fixes: fc518953bc9c8d7d ("mptcp: add and use MIB counter infrastructure")
Signed-off-by: Jianguo Wu
Acked-by: Florian Westphal
Link: https://lore.kernel.org/r/142e2fd9-58d9-bb13-fb75-951cccc2331e@163.com
Signed-off-by: Jakub Kicinski

Jianguo Wu
2020-12-08 09:45:29 +0800
851d0a73c bridge: Fix a deadlock when enabling multicast snooping

When enabling multicast snooping, bridge module deadlocks on multicast_lock
if 1) IPv6 is enabled, and 2) there is an existing querier on the same L2
network.

The deadlock was caused by the following sequence: While holding the lock,
br_multicast_open calls br_multicast_join_snoopers, which eventually causes
IP stack to (attempt to) send out a Listener Report (in igmp6_join_group).
Since the destination Ethernet address is a multicast address, br_dev_xmit
feeds the packet back to the bridge via br_multicast_rcv, which in turn
calls br_multicast_add_group, which then deadlocks on multicast_lock.

The fix is to move the call br_multicast_join_snoopers outside of the
critical section. This works since br_multicast_join_snoopers only deals
with IP and does not modify any multicast data structures of the bridge,
so there's no need to hold the lock.

Steps to reproduce:
1. sysctl net.ipv6.conf.all.force_mld_version=1
2. have another querier
3. ip link set dev bridge type bridge mcast_snooping 0 && \
ip link set dev bridge type bridge mcast_snooping 1 < deadlock >

A typical call trace looks like the following:

[ 936.251495] _raw_spin_lock+0x5c/0x68
[ 936.255221] br_multicast_add_group+0x40/0x170 [bridge]
[ 936.260491] br_multicast_rcv+0x7ac/0xe30 [bridge]
[ 936.265322] br_dev_xmit+0x140/0x368 [bridge]
[ 936.269689] dev_hard_start_xmit+0x94/0x158
[ 936.273876] __dev_queue_xmit+0x5ac/0x7f8
[ 936.277890] dev_queue_xmit+0x10/0x18
[ 936.281563] neigh_resolve_output+0xec/0x198
[ 936.285845] ip6_finish_output2+0x240/0x710
[ 936.290039] __ip6_finish_output+0x130/0x170
[ 936.294318] ip6_output+0x6c/0x1c8
[ 936.297731] NF_HOOK.constprop.0+0xd8/0xe8
[ 936.301834] igmp6_send+0x358/0x558
[ 936.305326] igmp6_join_group.part.0+0x30/0xf0
[ 936.309774] igmp6_group_added+0xfc/0x110
[ 936.313787] __ipv6_dev_mc_inc+0x1a4/0x290
[ 936.317885] ipv6_dev_mc_inc+0x10/0x18
[ 936.321677] br_multicast_open+0xbc/0x110 [bridge]
[ 936.326506] br_multicast_toggle+0xec/0x140 [bridge]

Fixes: 4effd28c1245 ("bridge: join all-snoopers multicast address")
Signed-off-by: Joseph Huang
Acked-by: Nikolay Aleksandrov
Link: https://lore.kernel.org/r/20201204235628.50653-1-Joseph.Huang@garmin.com
Signed-off-by: Jakub Kicinski

Joseph Huang
2020-12-08 09:14:43 +0800
e3366884b lwt_bpf: Replace preempt_disable() with migrate_disable()

migrate_disable() is just a wrapper for preempt_disable() in
non-RT kernel. It is safe to replace it, and RT kernel will
benefit.

Note that migrate_disable() was introduced in Feb 2020.

Suggested-by: Alexei Starovoitov
Signed-off-by: Cong Wang
Signed-off-by: Alexei Starovoitov
Link: https://lore.kernel.org/bpf/20201205075946.497763-2-xiyou.wangcong@gmail.com

Cong Wang
2020-12-08 03:53:40 +0800
d9054a1ff lwt: Disable BH too in run_lwt_bpf()

The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
and BPF redirect helpers. Callers on the RX path are all in BH context;
disabling preemption is not sufficient to prevent BH interruption.

In production, we observed strange packet drops because of the race
condition between LWT xmit and TC ingress, and we verified this issue
is fixed after we disable BH.

Although this bug was technically introduced from the beginning, that
is commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure"),
at that time call_rcu() had to be call_rcu_bh() to match the RCU context.
So this patch may not work well before RCU flavor consolidation has been
completed around v5.0.

Update the comments above the code too, as call_rcu() is now BH friendly.

Signed-off-by: Dongdong Wang
Signed-off-by: Alexei Starovoitov
Reviewed-by: Cong Wang
Link: https://lore.kernel.org/bpf/20201205075946.497763-1-xiyou.wangcong@gmail.com

Dongdong Wang
2020-12-08 03:53:39 +0800

07 Dec, 2020

1 commit

10c678bd0 udp: fix the proto value passed to ip_protocol_deliver_rcu for the segments

Guillaume noticed that: for segments udp_queue_rcv_one_skb() returns the
proto, and it should pass "ret" unmodified to ip_protocol_deliver_rcu().
Otherwise, with a negative value passed, it will underflow inet_protos.

This can be reproduced with IPIP FOU:

# ip fou add port 5555 ipproto 4
# ethtool -K eth1 rx-gro-list on

Fixes: cf329aa42b66 ("udp: cope with UDP GRO packet misdirection")
Reported-by: Guillaume Nault
Signed-off-by: Xin Long
Signed-off-by: David S. Miller

Xin Long
2020-12-07 16:32:11 +0800

05 Dec, 2020

5 commits

905b2032f mac80211: mesh: fix mesh_pathtbl_init() error path

If tbl_mpp can not be allocated, we call mesh_table_free(tbl_path)
while tbl_path rhashtable has not yet been initialized, which causes
panics.

Simply factor the rhashtable_init() call into mesh_table_alloc().

WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __flush_work kernel/workqueue.c:3040 [inline]
WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Modules linked in:
CPU: 1 PID: 8474 Comm: syz-executor663 Not tainted 5.10.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__flush_work kernel/workqueue.c:3040 [inline]
RIP: 0010:__cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136
Code: 5d c3 e8 bf ae 29 00 0f 0b e9 f0 fd ff ff e8 b3 ae 29 00 0f 0b 43 80 3c 3e 00 0f 85 31 ff ff ff e9 34 ff ff ff e8 9c ae 29 00 0b e9 dc fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7d fd ff
RSP: 0018:ffffc9000165f5a0 EFLAGS: 00010293
RAX: ffffffff814b7064 RBX: 0000000000000001 RCX: ffff888021c80000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888024039ca0 R08: dffffc0000000000 R09: fffffbfff1dd3e64
R10: fffffbfff1dd3e64 R11: 0000000000000000 R12: 1ffff920002cbebd
R13: ffff888024039c88 R14: 1ffff11004807391 R15: dffffc0000000000
FS: 0000000001347880(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000140 CR3: 000000002cc0a000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
rhashtable_free_and_destroy+0x25/0x9c0 lib/rhashtable.c:1137
mesh_table_free net/mac80211/mesh_pathtbl.c:69 [inline]
mesh_pathtbl_init+0x287/0x2e0 net/mac80211/mesh_pathtbl.c:785
ieee80211_mesh_init_sdata+0x2ee/0x530 net/mac80211/mesh.c:1591
ieee80211_setup_sdata+0x733/0xc40 net/mac80211/iface.c:1569
ieee80211_if_add+0xd5c/0x1cd0 net/mac80211/iface.c:1987
ieee80211_add_iface+0x59/0x130 net/mac80211/cfg.c:125
rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline]
nl80211_new_interface+0x563/0xb40 net/wireless/nl80211.c:3855
genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800
netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2494
genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x780/0x930 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x9a8/0xd40 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg net/socket.c:671 [inline]
____sys_sendmsg+0x519/0x800 net/socket.c:2353
___sys_sendmsg net/socket.c:2407 [inline]
__sys_sendmsg+0x2b1/0x360 net/socket.c:2440
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 60854fd94573 ("mac80211: mesh: convert path table to rhashtable")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Reviewed-by: Johannes Berg
Link: https://lore.kernel.org/r/20201204162428.2583119-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski

Eric Dumazet
2020-12-05 09:34:25 +0800
bb2da7651 openvswitch: fix error return code in validate_and_copy_dec_ttl()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Changing 'return start' to 'return action_start' can fix this bug.

Fixes: 69929d4c49e1 ("net: openvswitch: fix TTL decrement action netlink message format")
Reported-by: Hulk Robot
Signed-off-by: Wang Hai
Reviewed-by: Eelco Chaudron
Link: https://lore.kernel.org/r/20201204114314.1596-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski

Wang Hai
2020-12-05 07:43:14 +0800
ee4f52a8d net: bridge: vlan: fix error return code in __vlan_add()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
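
The general pattern behind this class of fix can be sketched as follows. The functions are hypothetical and do not mirror __vlan_add(); the `-ENOMEM` value is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Buggy shape: `err` is still 0 when the lookup fails. */
static int add_entry_buggy(const int *master)
{
    int err = 0;

    if (master == NULL)
        goto out;  /* reports success although nothing was added */
    /* ... real work would go here ... */
out:
    return err;
}

/* Fixed shape: the error path sets a negative error code first. */
static int add_entry_fixed(const int *master)
{
    int err = 0;

    if (master == NULL) {
        err = -ENOMEM;  /* error code picked for the sketch */
        goto out;
    }
    /* ... real work would go here ... */
out:
    assert(err <= 0);
    return err;
}
```

Callers that test the return value see the failure only in the fixed variant; the buggy one makes a failed call look like a success.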

Fixes: f8ed289fab84 ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts")
Reported-by: Hulk Robot
Signed-off-by: Zhang Changzhong
Acked-by: Nikolay Aleksandrov
Link: https://lore.kernel.org/r/1607071737-33875-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski

Zhang Changzhong
2020-12-05 07:41:06 +0800
b410f04eb ipv4: fix error return code in rtm_to_fib_config()

Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes")
Reported-by: Hulk Robot
Signed-off-by: Zhang Changzhong
Reviewed-by: David Ahern
Link: https://lore.kernel.org/r/1607071695-33740-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski

Zhang Changzhong
2020-12-05 07:38:16 +0800
4eef8b1f3 net/sched: fq_pie: initialize timer earlier in fq_pie_init()

with the following tdc testcase:

83be: (qdisc, fq_pie) Create FQ-PIE with invalid number of flows

as fq_pie_init() fails, fq_pie_destroy() is called to clean up. Since the
timer is not yet initialized, it's possible to observe a splat like this:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 975 Comm: tc Not tainted 5.10.0-rc4+ #298
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
Call Trace:
dump_stack+0x99/0xcb
register_lock_class+0x12dd/0x1750
__lock_acquire+0xfe/0x3970
lock_acquire+0x1c8/0x7f0
del_timer_sync+0x49/0xd0
fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
qdisc_create+0x916/0x1160
tc_modify_qdisc+0x3c4/0x1630
rtnetlink_rcv_msg+0x346/0x8e0
netlink_unicast+0x439/0x630
netlink_sendmsg+0x719/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5ba/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
WARNING: CPU: 0 PID: 975 at lib/debugobjects.c:508 debug_print_object+0x162/0x210
[...]
Call Trace:
debug_object_assert_init+0x268/0x380
try_to_del_timer_sync+0x6a/0x100
del_timer_sync+0x9e/0xd0
fq_pie_destroy+0x3f/0x80 [sch_fq_pie]
qdisc_create+0x916/0x1160
tc_modify_qdisc+0x3c4/0x1630
rtnetlink_rcv_msg+0x346/0x8e0
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x719/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5ba/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix it by moving timer_setup() before any failure, like it was done on 'red'
with former commit 608b4adab178 ("net_sched: initialize timer earlier in
red_init()").
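
The ordering problem can be modelled in userspace as follows. This is only a sketch under stated assumptions: the toy_* types and functions are hypothetical, the "timer" is a flag rather than a real timer_list, and a flows value of 0 stands in for the invalid number of flows from the testcase.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy timer: only tracks whether setup ran (hypothetical model). */
struct toy_timer {
	bool initialized;
};

struct toy_qdisc {
	struct toy_timer adapt_timer;
};

static void toy_timer_setup(struct toy_timer *t)
{
	t->initialized = true;
}

/* Returns 0 when the timer was set up, -1 when asked to delete a
 * timer that was never initialized (the case the splat reports). */
static int toy_del_timer_sync(struct toy_timer *t)
{
	return t->initialized ? 0 : -1;
}

/* Late setup: validation can fail before the timer is initialized. */
static int toy_init_late(struct toy_qdisc *q, unsigned int flows)
{
	if (flows == 0)
		return -EINVAL;
	toy_timer_setup(&q->adapt_timer);
	return 0;
}

/* Early setup: the timer is valid no matter which check fails later. */
static int toy_init_early(struct toy_qdisc *q, unsigned int flows)
{
	toy_timer_setup(&q->adapt_timer);
	if (flows == 0)
		return -EINVAL;
	return 0;
}

/* Destroy runs after a failed init as well, so it must be safe then. */
static int toy_destroy(struct toy_qdisc *q)
{
	return toy_del_timer_sync(&q->adapt_timer);
}
```

In this model, destroy after a failed toy_init_late() touches an uninitialized timer, while destroy after a failed toy_init_early() is safe.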

Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti
Reviewed-by: Cong Wang
Link: https://lore.kernel.org/r/2e78e01c504c633ebdff18d041833cf2e079a3a4.1607020450.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski

Davide Caratti
2020-12-05 06:15:01 +0800

04 Dec, 2020

5 commits

12c8a8ca1 xsk: Return error code if force_zc is set

If force_zc is set, we should exit out with an error, not fall back to
copy mode.
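
The intended behaviour can be sketched as below. This is an illustrative userspace model, not the xsk code: toy_setup_zc(), toy_bind() and enum toy_mode are hypothetical names, and -EOPNOTSUPP is an assumed error value.

```c
#include <errno.h>
#include <stdbool.h>

enum toy_mode {
	TOY_MODE_ZC,
	TOY_MODE_COPY,
};

/* Hypothetical zero-copy setup: fails when the device cannot do it. */
static int toy_setup_zc(bool dev_supports_zc)
{
	return dev_supports_zc ? 0 : -EOPNOTSUPP;
}

/* Try zero-copy first. Fall back to copy mode only when the caller
 * did not force zero-copy; otherwise report the setup error. */
static int toy_bind(bool force_zc, bool dev_supports_zc, enum toy_mode *mode)
{
	int err = toy_setup_zc(dev_supports_zc);

	if (!err) {
		*mode = TOY_MODE_ZC;
		return 0;
	}
	if (force_zc)
		return err;	/* no silent fallback when zero-copy is forced */

	*mode = TOY_MODE_COPY;
	return 0;
}
```

A caller that forces zero-copy on a device without support gets the error back, while a caller that did not force it ends up in copy mode.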

Fixes: 921b68692abb ("xsk: Enable sharing of dma mappings")
Reported-by: Hulk Robot
Signed-off-by: Zhang Changzhong
Signed-off-by: Daniel Borkmann
Acked-by: Magnus Karlsson
Link: https://lore.kernel.org/bpf/1607077277-41995-1-git-send-email-zhangchangzhong@huawei.com

Zhang Changzhong
2020-12-04 23:48:31 +0800
bdeca45a0 mac80211: set SDATA_STATE_RUNNING for monitor interfaces

During restart, mac80211 is supposed to reconfigure the driver.
When there's a monitor interface, the interface is added and the
channel context for it was created, but not assigned to it as it
was not considered running during the restart.

Fix this by setting SDATA_STATE_RUNNING while adding monitor
interfaces.

Signed-off-by: Borwankar, Antara
Signed-off-by: Luca Coelho
Link: https://lore.kernel.org/r/iwlwifi.20201129172929.e1df99693a4c.I494579f28018c2d0b9d4083a664cf872c28405ae@changeid
[reword commit log]
Signed-off-by: Johannes Berg

Borwankar, Antara
2020-12-04 19:45:25 +0800
f495acd88 cfg80211: initialize rekey_data

If we have an old supplicant, the akm field is left uninitialized.
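
A minimal sketch of the idea, assuming the fix is to zero-initialize the structure before the optional attribute is parsed. The struct layout, the toy_parse_rekey() helper and the kek_len value are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the rekey data. */
struct toy_rekey_data {
	uint8_t kek_len;
	uint32_t akm;
};

/* Zero-initialize first, then fill in what the caller supplied. An
 * older caller that never sends the AKM leaves the field at 0 rather
 * than at whatever happened to be on the stack. */
static struct toy_rekey_data toy_parse_rekey(bool has_akm, uint32_t akm_attr)
{
	struct toy_rekey_data d = { 0 };

	d.kek_len = 16;
	if (has_akm)
		d.akm = akm_attr;

	return d;
}
```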

Signed-off-by: Sara Sharon
Signed-off-by: Luca Coelho
Link: https://lore.kernel.org/r/iwlwifi.20201129172929.930f0ab7ebee.Ic546e384efab3f4a89f318eafddc3eb7d556aecb@changeid
Signed-off-by: Johannes Berg

Sara Sharon
2020-12-04 19:35:58 +0800
8fca2b870 mac80211: fix return value of ieee80211_chandef_he_6ghz_oper

ieee80211_chandef_he_6ghz_oper() needs to return true if it
determined a valid 6 GHz chandef, fix that.

Fixes: 1d00ce807efa ("mac80211: support S1G association")
Signed-off-by: Wen Gong
Link: https://lore.kernel.org/r/1606121152-3452-1-git-send-email-wgong@codeaurora.org
[rewrite commit message]
Signed-off-by: Johannes Berg

Wen Gong
2020-12-04 19:35:58 +0800
9608fa653 net/sched: act_mpls: ensure LSE is pullable before reading it

when 'act_mpls' is used to mangle the LSE, the current value is read from
the packet dereferencing 4 bytes at mpls_hdr(): ensure that the label is
contained in the skb "linear" area.
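
The check can be illustrated with a toy packet model. This is a sketch only: struct toy_pkt, toy_may_pull() and toy_read_lse() are hypothetical, the toy check merely compares lengths (it does not pull data from fragments), and -EINVAL is an assumed error value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MPLS_HLEN 4	/* one label stack entry is 4 bytes */

/* Toy packet: only the directly addressable ("linear") bytes. */
struct toy_pkt {
	const uint8_t *linear;
	size_t linear_len;
};

/* Toy check: are 'len' bytes available in the linear area? */
static bool toy_may_pull(const struct toy_pkt *p, size_t len)
{
	return len <= p->linear_len;
}

/* Read the label stack entry only after confirming that all 4 bytes
 * are really there; otherwise the read would run past the buffer. */
static int toy_read_lse(const struct toy_pkt *p, uint32_t *lse)
{
	if (!toy_may_pull(p, TOY_MPLS_HLEN))
		return -EINVAL;

	/* The entry is stored in network (big-endian) byte order. */
	*lse = ((uint32_t)p->linear[0] << 24) |
	       ((uint32_t)p->linear[1] << 16) |
	       ((uint32_t)p->linear[2] << 8) |
	       (uint32_t)p->linear[3];
	return 0;
}
```

A packet with fewer than 4 linear bytes is rejected before any byte of the entry is dereferenced.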

Found by code inspection.

v2:
- use MPLS_HLEN instead of sizeof(new_lse), thanks to Jakub Kicinski

Fixes: 2a2ea50870ba ("net: sched: add mpls manipulation actions to TC")
Signed-off-by: Davide Caratti
Acked-by: Guillaume Nault
Link: https://lore.kernel.org/r/3243506cba43d14858f3bd21ee0994160e44d64a.1606987058.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski

Davide Caratti
2020-12-04 03:13:37 +0800