Eric Lee / smarc-fsl-linux-kernel

20 Jan, 2021

10 commits

68023ebbb Merge tag 'v5.10.9' into lf-5.10.y ... Browse Code »

This is the 5.10.9 stable release

* tag 'v5.10.9': (153 commits)
Linux 5.10.9
netfilter: nf_nat: Fix memleak in nf_nat_init
netfilter: conntrack: fix reading nf_conntrack_buckets
...

Signed-off-by: Jason Liu

Jason Liu
2021-01-20 19:46:08 +0800
f181a2461 Merge tag 'v5.10.8' into lf-5.10.y ... Browse Code »

This is the 5.10.8 stable release

* tag 'v5.10.8': (104 commits)
Linux 5.10.8
tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
drm/panfrost: Remove unused variables in panfrost_job_close()
...

Signed-off-by: Jason Liu

Jason Liu
2021-01-20 19:46:06 +0800
b7397449a Merge tag 'v5.10.7' into lf-5.10.y ... Browse Code »

This is the 5.10.7 stable release

* tag 'v5.10.7': (144 commits)
Linux 5.10.7
scsi: target: Fix XCOPY NAA identifier lookup
rtlwifi: rise completion at the last step of firmware callback
...

Signed-off-by: Jason Liu

Jason Liu
2021-01-20 19:46:05 +0800
c93f257cd Merge tag 'v5.10.6' into lf-5.10.y ... Browse Code »

This is the 5.10.6 stable release

* tag 'v5.10.6': (21 commits)
Linux 5.10.6
mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
exec: Transform exec_update_mutex into a rw_semaphore
...

Signed-off-by: Jason Liu

Conflicts:
drivers/rtc/rtc-pcf2127.c

Jason Liu
2021-01-20 19:45:45 +0800
f35fe82c7 Merge tag 'v5.10.5' into lf-5.10.y ... Browse Code »

This is the 5.10.5 stable release

* tag 'v5.10.5': (63 commits)
Linux 5.10.5
device-dax: Fix range release
ext4: avoid s_mb_prefetch to be zero in individual scenarios
...

Signed-off-by: Jason Liu

Jason Liu
2021-01-20 16:17:26 +0800
88a5c90f3 netfilter: nf_nat: Fix memleak in nf_nat_init ... Browse Code »

commit 869f4fdaf4ca7bb6e0d05caf6fa1108dddc346a7 upstream.

When register_pernet_subsys() fails, nf_nat_bysource
should be freed just like when nf_ct_extend_register()
fails.

Fixes: 1cd472bf036ca ("netfilter: nf_nat: add nat hook register functions to nf_nat")
Signed-off-by: Dinghao Liu
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Dinghao Liu
2021-01-20 01:27:33 +0800
f14e31c16 netfilter: conntrack: fix reading nf_conntrack_buckets ... Browse Code »

commit f6351c3f1c27c80535d76cac2299aec44c36291e upstream.

The old way of changing the conntrack hashsize runtime was through changing
the module param via file /sys/module/nf_conntrack/parameters/hashsize. This
was extended to sysctl change in commit 3183ab8997a4 ("netfilter: conntrack:
allow increasing bucket size via sysctl too").

The commit introduced second "user" variable nf_conntrack_htable_size_user
which shadow actual variable nf_conntrack_htable_size. When hashsize is
changed via module param this "user" variable isn't updated. This results in
sysctl net/netfilter/nf_conntrack_buckets shows the wrong value when users
update via the old way.

This patch fix the issue by always updating "user" variable when reading the
proc file. This will take care of changes to the actual variable without
sysctl need to be aware.

Fixes: 3183ab8997a4 ("netfilter: conntrack: allow increasing bucket size via sysctl too")
Reported-by: Yoel Caspersen
Signed-off-by: Jesper Dangaard Brouer
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Jesper Dangaard Brouer
2021-01-20 01:27:33 +0800
f0cd3fba4 net: sunrpc: interpret the return value of kstrtou32 correctly ... Browse Code »

commit 86b53fbf08f48d353a86a06aef537e78e82ba721 upstream.

A return value of 0 means success. This is documented in lib/kstrtox.c.

This was found by trying to mount an NFS share from a link-local IPv6
address with the interface specified by its index:

mount("[fe80::1%1]:/srv/nfs", "/mnt", "nfs", 0, "nolock,addr=fe80::1%1")

Before this commit this failed with EINVAL and also caused the following
message in dmesg:

[...] NFS: bad IP address specified: addr=fe80::1%1

The syscall using the same address based on the interface name instead
of its index succeeds.

Credits for this patch go to my colleague Christian Speich, who traced
the origin of this bug to this line of code.

Signed-off-by: Johannes Nixdorf
Fixes: 00cfaa943ec3 ("replace strict_strto calls")
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

j.nixdorf@avm.de
2021-01-20 01:27:33 +0800
4ac5d2018 cfg80211: select CONFIG_CRC32 ... Browse Code »

[ Upstream commit 152a8a6c017bfdeda7f6d052fbc6e151891bd9b6 ]

Without crc32 support, this fails to link:

arm-linux-gnueabi-ld: net/wireless/scan.o: in function `cfg80211_scan_6ghz':
scan.c:(.text+0x928): undefined reference to `crc32_le'

Fixes: c8cb5b854b40 ("nl80211/cfg80211: support 6 GHz scanning")
Signed-off-by: Arnd Bergmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Arnd Bergmann
2021-01-20 01:27:28 +0800
cc77e4a02 netfilter: ipset: fixes possible oops in mtype_resize ... Browse Code »

[ Upstream commit 2b33d6ffa9e38f344418976b06057e2fc2aa9e2a ]

currently mtype_resize() can cause oops

t = ip_set_alloc(htable_size(htable_bits));
if (!t) {
ret = -ENOMEM;
goto out;
}
t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));

Increased htable_bits can force htable_size() to return 0.
In own turn ip_set_alloc(0) returns not 0 but ZERO_SIZE_PTR,
so follwoing access to t->hregion should trigger an OOPS.

Signed-off-by: Vasily Averin
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Sasha Levin

Vasily Averin
2021-01-20 01:27:23 +0800

17 Jan, 2021

11 commits

43f6ea414 net: drop bogus skb with CHECKSUM_PARTIAL and offset beyond end of trimmed packet ... Browse Code »

commit 54970a2fbb673f090b7f02d7f57b10b2e0707155 upstream.

syzbot reproduces BUG_ON in skb_checksum_help():
tun creates (bogus) skb with huge partial-checksummed area and
small ip packet inside. Then ip_rcv trims the skb based on size
of internal ip packet, after that csum offset points beyond of
trimmed skb. Then checksum_tg() called via netfilter hook
triggers BUG_ON:

offset = skb_checksum_start_offset(skb);
BUG_ON(offset >= skb_headlen(skb));

To work around the problem this patch forces pskb_trim_rcsum_slow()
to return -EINVAL in described scenario. It allows its callers to
drop such kind of packets.

Link: https://syzkaller.appspot.com/bug?id=b419a5ca95062664fe1a60b764621eb4526e2cd0
Reported-by: syzbot+7010af67ced6105e5ab6@syzkaller.appspotmail.com
Signed-off-by: Vasily Averin
Acked-by: Willem de Bruijn
Link: https://lore.kernel.org/r/1b2494af-2c56-8ee2-7bc0-923fcad1cdf8@virtuozzo.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Vasily Averin
2021-01-17 21:17:06 +0800
61e8c02ae can: isotp: isotp_getname(): fix kernel information leak ... Browse Code »

commit b42b3a2744b3e8f427de79896720c72823af91ad upstream.

Initialize the sockaddr_can structure to prevent a data leak to user space.

Suggested-by: Cong Wang
Reported-by: syzbot+057884e2f453e8afebc8@syzkaller.appspotmail.com
Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol")
Signed-off-by: Oliver Hartkopp
Link: https://lore.kernel.org/r/20210112091643.11789-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde
Signed-off-by: Greg Kroah-Hartman

Oliver Hartkopp
2021-01-17 21:17:05 +0800
be6657273 xsk: Rollback reservation at NETDEV_TX_BUSY ... Browse Code »

commit b1b95cb5c0a9694d47d5f845ba97e226cfda957d upstream.

Rollback the reservation in the completion ring when we get a
NETDEV_TX_BUSY. When this error is received from the driver, we are
supposed to let the user application retry the transmit again. And in
order to do this, we need to roll back the failed send so it can be
retried. Unfortunately, we did not cancel the reservation we had made
in the completion ring. By not doing this, we actually make the
completion ring one entry smaller per NETDEV_TX_BUSY error we get, and
after enough of these errors the completion ring will be of size zero
and transmit will stop working.

Fix this by cancelling the reservation when we get a NETDEV_TX_BUSY
error.

Fixes: 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
Reported-by: Xuan Zhuo
Signed-off-by: Magnus Karlsson
Signed-off-by: Daniel Borkmann
Acked-by: Björn Töpel
Link: https://lore.kernel.org/bpf/20201218134525.13119-3-magnus.karlsson@gmail.com
Signed-off-by: Greg Kroah-Hartman

Magnus Karlsson
2021-01-17 21:17:05 +0800
9ad0375ed xsk: Fix race in SKB mode transmit with shared cq ... Browse Code »

commit f09ced4053bc0a2094a12b60b646114c966ef4c6 upstream.

Fix a race when multiple sockets are simultaneously calling sendto()
when the completion ring is shared in the SKB case. This is the case
when you share the same netdev and queue id through the
XDP_SHARED_UMEM bind flag. The problem is that multiple processes can
be in xsk_generic_xmit() and call the backpressure mechanism in
xskq_prod_reserve(xs->pool->cq). As this is a shared resource in this
specific scenario, a race might occur since the rings are
single-producer single-consumer.

Fix this by moving the tx_completion_lock from the socket to the pool
as the pool is shared between the sockets that share the completion
ring. (The pool is not shared when this is not the case.) And then
protect the accesses to xskq_prod_reserve() with this lock. The
tx_completion_lock is renamed cq_lock to better reflect that it
protects accesses to the potentially shared completion ring.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Reported-by: Xuan Zhuo
Signed-off-by: Magnus Karlsson
Signed-off-by: Daniel Borkmann
Acked-by: Björn Töpel
Link: https://lore.kernel.org/bpf/20201218134525.13119-2-magnus.karlsson@gmail.com
Signed-off-by: Greg Kroah-Hartman

Magnus Karlsson
2021-01-17 21:17:05 +0800
5fb8a3116 nexthop: Bounce NHA_GATEWAY in FDB nexthop groups ... Browse Code »

[ Upstream commit b19218b27f3477316d296e8bcf4446aaf017aa69 ]

The function nh_check_attr_group() is called to validate nexthop groups.
The intention of that code seems to have been to bounce all attributes
above NHA_GROUP_TYPE except for NHA_FDB. However instead it bounces all
these attributes except when NHA_FDB attribute is present--then it accepts
them.

NHA_FDB validation that takes place before, in rtm_to_nh_config(), already
bounces NHA_OIF, NHA_BLACKHOLE, NHA_ENCAP and NHA_ENCAP_TYPE. Yet further
back, NHA_GROUPS and NHA_MASTER are bounced unconditionally.

But that still leaves NHA_GATEWAY as an attribute that would be accepted in
FDB nexthop groups (with no meaning), so long as it keeps the address
family as unspecified:

# ip nexthop add id 1 fdb via 127.0.0.1
# ip nexthop add id 10 fdb via default group 1

The nexthop code is still relatively new and likely not used very broadly,
and the FDB bits are newer still. Even though there is a reproducer out
there, it relies on an improbable gateway arguments "via default", "via
all" or "via any". Given all this, I believe it is OK to reformulate the
condition to do the right thing and bounce NHA_GATEWAY.

Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops")
Signed-off-by: Petr Machata
Signed-off-by: Ido Schimmel
Reviewed-by: David Ahern
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Petr Machata
2021-01-17 21:16:57 +0800
eaa7a6c39 nexthop: Unlink nexthop group entry in error path ... Browse Code »

[ Upstream commit 7b01e53eee6dce7a8a6736e06b99b68cd0cc7a27 ]

In case of error, remove the nexthop group entry from the list to which
it was previously added.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
Reviewed-by: David Ahern
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Ido Schimmel
2021-01-17 21:16:57 +0800
6486bc0a3 nexthop: Fix off-by-one error in error path ... Browse Code »

[ Upstream commit 07e61a979ca4dddb3661f59328b3cd109f6b0070 ]

A reference was not taken for the current nexthop entry, so do not try
to put it in the error path.

Fixes: 430a049190de ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel
Reviewed-by: Petr Machata
Reviewed-by: David Ahern
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Ido Schimmel
2021-01-17 21:16:57 +0800
b0ff6d00e net: ip: always refragment ip defragmented packets ... Browse Code »

[ Upstream commit bb4cc1a18856a73f0ff5137df0c2a31f4c50f6cf ]

Conntrack reassembly records the largest fragment size seen in IPCB.
However, when this gets forwarded/transmitted, fragmentation will only
be forced if one of the fragmented packets had the DF bit set.

In that case, a flag in IPCB will force fragmentation even if the
MTU is large enough.

This should work fine, but this breaks with ip tunnels.
Consider client that sends a UDP datagram of size X to another host.

The client fragments the datagram, so two packets, of size y and z, are
sent. DF bit is not set on any of these packets.

Middlebox netfilter reassembles those packets back to single size-X
packet, before routing decision.

packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit
isn't set. At output time, ip refragmentation is skipped as well
because x is still smaller than the mtu of the output device.

If ttransmit device is an ip tunnel, the packet size increases to
x+overhead.

Also, tunnel might be configured to force DF bit on outer header.

In this case, packet will be dropped (exceeds MTU) and an ICMP error is
generated back to sender.

But sender already respects the announced MTU, all the packets that
it sent did fit the announced mtu.

Force refragmentation as per original sizes unconditionally so ip tunnel
will encapsulate the fragments instead.

The only other solution I see is to place ip refragmentation in
the ip_tunnel code to handle this case.

Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet")
Reported-by: Christian Perle
Signed-off-by: Florian Westphal
Acked-by: Pablo Neira Ayuso
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2021-01-17 21:16:56 +0800
d5fc41ebe net: fix pmtu check in nopmtudisc mode ... Browse Code »

[ Upstream commit 50c661670f6a3908c273503dfa206dfc7aa54c07 ]

For some reason ip_tunnel insist on setting the DF bit anyway when the
inner header has the DF bit set, EVEN if the tunnel was configured with
'nopmtudisc'.

This means that the script added in the previous commit
cannot be made to work by adding the 'nopmtudisc' flag to the
ip tunnel configuration. Doing so breaks connectivity even for the
without-conntrack/netfilter scenario.

When nopmtudisc is set, the tunnel will skip the mtu check, so no
icmp error is sent to client. Then, because inner header has DF set,
the outer header gets added with DF bit set as well.

IP stack then sends an error to itself because the packet exceeds
the device MTU.

Fixes: 23a3647bc4f93 ("ip_tunnels: Use skb-len to PMTU check.")
Cc: Stefano Brivio
Signed-off-by: Florian Westphal
Acked-by: Pablo Neira Ayuso
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2021-01-17 21:16:56 +0800
69363e37d net: ipv6: fib: flush exceptions when purging route ... Browse Code »

[ Upstream commit d8f5c29653c3f6995e8979be5623d263e92f6b86 ]

Route removal is handled by two code paths. The main removal path is via
fib6_del_route() which will handle purging any PMTU exceptions from the
cache, removing all per-cpu copies of the DST entry used by the route, and
releasing the fib6_info struct.

The second removal location is during fib6_add_rt2node() during a route
replacement operation. This path also calls fib6_purge_rt() to handle
cleaning up the per-cpu copies of the DST entries and releasing the
fib6_info associated with the older route, but it does not flush any PMTU
exceptions that the older route had. Since the older route is removed from
the tree during the replacement, we lose any way of accessing it again.

As these lingering DSTs and the fib6_info struct are holding references to
the underlying netdevice struct as well, unregistering that device from the
kernel can never complete.

Fixes: 2b760fcf5cfb3 ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Sean Tranchetti
Reviewed-by: David Ahern
Link: https://lore.kernel.org/r/1609892546-11389-1-git-send-email-stranche@quicinc.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Sean Tranchetti
2021-01-17 21:16:56 +0800
9591f32a6 net: vlan: avoid leaks on register_vlan_dev() failures ... Browse Code »

[ Upstream commit 55b7ab1178cbf41f979ff83236d3321ad35ed2ad ]

VLAN checks for NETREG_UNINITIALIZED to distinguish between
registration failure and unregistration in progress.

Since commit cb626bf566eb ("net-sysfs: Fix reference count leak")
registration failure may, however, result in NETREG_UNREGISTERED
as well as NETREG_UNINITIALIZED.

This fix is similer to cebb69754f37 ("rtnetlink: Fix
memory(net_device) leak when ->newlink fails")

Fixes: cb626bf566eb ("net-sysfs: Fix reference count leak")
Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jakub Kicinski
2021-01-17 21:16:55 +0800

13 Jan, 2021

14 commits

0fae7d269 xsk: Fix memory leak for failed bind ... Browse Code »

commit 8bee683384087a6275c9183a483435225f7bb209 upstream.

Fix a possible memory leak when a bind of an AF_XDP socket fails. When
the fill and completion rings are created, they are tied to the
socket. But when the buffer pool is later created at bind time, the
ownership of these two rings are transferred to the buffer pool as
they might be shared between sockets (and the buffer pool cannot be
created until we know what we are binding to). So, before the buffer
pool is created, these two rings are cleaned up with the socket, and
after they have been transferred they are cleaned up together with
the buffer pool.

The problem is that ownership was transferred before it was absolutely
certain that the buffer pool could be created and initialized
correctly and when one of these errors occurred, the fill and
completion rings did neither belong to the socket nor the pool and
where therefore leaked. Solve this by moving the ownership transfer
to the point where the buffer pool has been completely set up and
there is no way it can fail.

Fixes: 7361f9c3d719 ("xsk: Move fill and completion rings to buffer pool")
Reported-by: syzbot+cfa88ddd0655afa88763@syzkaller.appspotmail.com
Signed-off-by: Magnus Karlsson
Signed-off-by: Daniel Borkmann
Acked-by: Björn Töpel
Link: https://lore.kernel.org/bpf/20201214085127.3960-1-magnus.karlsson@gmail.com
Signed-off-by: Greg Kroah-Hartman

Magnus Karlsson
2021-01-13 03:18:26 +0800
8b109f4cd netfilter: nft_dynset: report EOPNOTSUPP on missing set feature ... Browse Code »

commit 95cd4bca7b1f4a25810f3ddfc5e767fb46931789 upstream.

If userspace requests a feature which is not available the original set
definition, then bail out with EOPNOTSUPP. If userspace sends
unsupported dynset flags (new feature not supported by this kernel),
then report EOPNOTSUPP to userspace. EINVAL should be only used to
report malformed netlink messages from userspace.

Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Pablo Neira Ayuso
2021-01-13 03:18:26 +0800
810bc977f netfilter: xt_RATEEST: reject non-null terminated string from userspace ... Browse Code »

commit 6cb56218ad9e580e519dcd23bfb3db08d8692e5a upstream.

syzbot reports:
detected buffer overflow in strlen
[..]
Call Trace:
strlen include/linux/string.h:325 [inline]
strlcpy include/linux/string.h:348 [inline]
xt_rateest_tg_checkentry+0x2a5/0x6b0 net/netfilter/xt_RATEEST.c:143

strlcpy assumes src is a c-string. Check info->name before its used.

Reported-by: syzbot+e86f7c428c8c50db65b4@syzkaller.appspotmail.com
Fixes: 5859034d7eb8793 ("[NETFILTER]: x_tables: add RATEEST target")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2021-01-13 03:18:26 +0800
d17f2ccf6 netfilter: ipset: fix shift-out-of-bounds in htable_bits() ... Browse Code »

commit 5c8193f568ae16f3242abad6518dc2ca6c8eef86 upstream.

htable_bits() can call jhash_size(32) and trigger shift-out-of-bounds

UBSAN: shift-out-of-bounds in net/netfilter/ipset/ip_set_hash_gen.h:151:6
shift exponent 32 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 8498 Comm: syz-executor519
Not tainted 5.10.0-rc7-next-20201208-syzkaller #0
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
htable_bits net/netfilter/ipset/ip_set_hash_gen.h:151 [inline]
hash_mac_create.cold+0x58/0x9b net/netfilter/ipset/ip_set_hash_gen.h:1524
ip_set_create+0x610/0x1380 net/netfilter/ipset/ip_set_core.c:1115
nfnetlink_rcv_msg+0xecc/0x1180 net/netfilter/nfnetlink.c:252
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:600
netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6e8/0x810 net/socket.c:2345
___sys_sendmsg+0xf3/0x170 net/socket.c:2399
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

This patch replaces htable_bits() by simple fls(hashsize - 1) call:
it alone returns valid nbits both for round and non-round hashsizes.
It is normal to set any nbits here because it is validated inside
following htable_size() call which returns 0 for nbits>31.

Fixes: 1feab10d7e6d("netfilter: ipset: Unified hash type generation")
Reported-by: syzbot+d66bfadebca46cf61a2b@syzkaller.appspotmail.com
Signed-off-by: Vasily Averin
Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Vasily Averin
2021-01-13 03:18:26 +0800
27bc60d96 netfilter: x_tables: Update remaining dereference to RCU ... Browse Code »

commit 443d6e86f821a165fae3fc3fc13086d27ac140b1 upstream.

This fixes the dereference to fetch the RCU pointer when holding
the appropriate xtables lock.

Reported-by: kernel test robot
Fixes: cc00bcaa5899 ("netfilter: x_tables: Switch synchronization to RCU")
Signed-off-by: Subash Abhinov Kasiviswanathan
Reviewed-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Subash Abhinov Kasiviswanathan
2021-01-13 03:18:25 +0800
0d6eeee3b erspan: fix version 1 check in gre_parse_header() ... Browse Code »

[ Upstream commit 085c7c4e1c0e50d90b7d90f61a12e12b317a91e2 ]

Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not
have an erspan header. So the check in gre_parse_header() is wrong,
we have to distinguish version 1 from version 0.

We can just check the gre header length like is_erspan_type1().

Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com
Cc: William Tu
Cc: Lorenzo Bianconi
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2021-01-13 03:18:12 +0800
7a20969b8 net: sched: prevent invalid Scell_log shift count ... Browse Code »

[ Upstream commit bd1248f1ddbc48b0c30565fce897a3b6423313b8 ]

Check Scell_log shift size in red_check_params() and modify all callers
of red_check_params() to pass Scell_log.

This prevents a shift out-of-bounds as detected by UBSAN:
UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
shift exponent 72 is too large for 32-bit type 'int'

Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Randy Dunlap
Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
Cc: Nogah Frankel
Cc: Jamal Hadi Salim
Cc: Cong Wang
Cc: Jiri Pirko
Cc: netdev@vger.kernel.org
Cc: "David S. Miller"
Cc: Jakub Kicinski
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Randy Dunlap
2021-01-13 03:18:12 +0800
5e87eabce ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst() ... Browse Code »

[ Upstream commit 21fdca22eb7df2a1e194b8adb812ce370748b733 ]

RT_TOS() only clears one of the ECN bits. Therefore, when
fib_compute_spec_dst() resorts to a fib lookup, it can return
different results depending on the value of the second ECN bit.

For example, ECT(0) and ECT(1) packets could be treated differently.

$ ip netns add ns0
$ ip netns add ns1
$ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
$ ip -netns ns0 link set dev lo up
$ ip -netns ns1 link set dev lo up
$ ip -netns ns0 link set dev veth01 up
$ ip -netns ns1 link set dev veth10 up

$ ip -netns ns0 address add 192.0.2.10/24 dev veth01
$ ip -netns ns1 address add 192.0.2.11/24 dev veth10

$ ip -netns ns1 address add 192.0.2.21/32 dev lo
$ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21
$ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0

With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21
(ping uses -Q to set all TOS and ECN bits):

$ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms

But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11
because the "tos 4" route isn't matched:

$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms

After this patch the ECN bits don't affect the result anymore:

$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms

Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.")
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Guillaume Nault
2021-01-13 03:18:12 +0800
e4535dbb7 net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc ... Browse Code »

[ Upstream commit 4ae2bb81649dc03dfc95875f02126b14b773f7ab ]

Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.

Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart
Reviewed-by: Alexander Duyck
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Antoine Tenart
2021-01-13 03:18:11 +0800
90297553d net-sysfs: take the rtnl lock when storing xps_rxqs ... Browse Code »

[ Upstream commit 2d57b4f142e0b03e854612b8e28978935414bced ]

Two race conditions can be triggered when storing xps rxqs, resulting in
various oops and invalid memory accesses:

1. Calling netdev_set_num_tc while netif_set_xps_queue:

- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.

- netdev_set_num_tc sets dev->tc_num.

If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.

2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.

2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.

2.3. netdev_set_num_tc updates dev->tc_num.

2.4. Later accesses to the map lead to out of bound accesses and
oops.

A similar issue can be found with netdev_reset_tc.

One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_rxqs in a concurrent thread. With the right timing an oops is
triggered.

Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_rxqs_store.

Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart
Reviewed-by: Alexander Duyck
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Antoine Tenart
2021-01-13 03:18:11 +0800
0ca897c1e net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc ... Browse Code »

[ Upstream commit fb25038586d0064123e393cadf1fadd70a9df97a ]

Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart
Reviewed-by: Alexander Duyck
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Antoine Tenart
2021-01-13 03:18:11 +0800
4da25d83b net-sysfs: take the rtnl lock when storing xps_cpus ... Browse Code »

[ Upstream commit 1ad58225dba3f2f598d2c6daed4323f24547168f ]

Two race conditions can be triggered when storing xps cpus, resulting in
various oops and invalid memory accesses:

1. Calling netdev_set_num_tc while netif_set_xps_queue:

- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.

- netdev_set_num_tc sets dev->tc_num.

If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.

2. Calling netif_set_xps_queue while netdev_set_num_tc is running:

2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.

2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.

2.3. netdev_set_num_tc updates dev->tc_num.

2.4. Later accesses to the map lead to out of bound accesses and
oops.

A similar issue can be found with netdev_reset_tc.

One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_cpus in a concurrent thread. With the right timing an oops is
triggered.

Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_cpus_store.

Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart
Reviewed-by: Alexander Duyck
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Antoine Tenart
2021-01-13 03:18:10 +0800
2cdf8c274 net/ncsi: Use real net-device for response handler ... Browse Code »

[ Upstream commit 427c940558560bff2583d07fc119a21094675982 ]

When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the original net device to fix it.

Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang
Link: https://lore.kernel.org/r/20201223055523.2069-1-wangzhiqiang.bj@bytedance.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

John Wang
2021-01-13 03:18:10 +0800
db8895aa5 net/sched: sch_taprio: ensure to reset/destroy all child qdiscs ... Browse Code »

[ Upstream commit 698285da79f5b0b099db15a37ac661ac408c80eb ]

taprio_graft() can insert a NULL element in the array of child qdiscs. As
a consquence, taprio_reset() might not reset child qdiscs completely, and
taprio_destroy() might leak resources. Fix it by ensuring that loops that
iterate over q->qdiscs[] don't end when they find the first NULL item.

Fixes: 44d4775ca518 ("net/sched: sch_taprio: reset child qdiscs before freeing them")
Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Suggested-by: Jakub Kicinski
Signed-off-by: Davide Caratti
Link: https://lore.kernel.org/r/13edef6778fef03adc751582562fba4a13e06d6a.1608240532.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Davide Caratti
2021-01-13 03:18:08 +0800

09 Jan, 2021

1 commit

ce9163cf7 Bluetooth: Fix attempting to set RPA timeout when unsupported ... Browse Code »

commit a31489d2a368d2f9225ed6a6f595c63bc7d10de8 upstream.

During controller initialization, an LE Set RPA Timeout command is sent
to the controller if supported. However, the value checked to determine
if the command is supported is incorrect. Page 1921 of the Bluetooth
Core Spec v5.2 shows that bit 2 of octet 35 of the Supported_Commands
field corresponds to the LE Set RPA Timeout command, but currently
bit 6 of octet 35 is checked. This patch checks the correct value
instead.

This issue led to the error seen in the following btmon output during
initialization of an adapter (rtl8761b) and prevented initialization
from completing.

< HCI Command: LE Set Resolvable Private Address Timeout (0x08|0x002e) plen 2
Timeout: 900 seconds
> HCI Event: Command Complete (0x0e) plen 4
LE Set Resolvable Private Address Timeout (0x08|0x002e) ncmd 2
Status: Unsupported Remote Feature / Unsupported LMP Feature (0x1a)
= Close Index: 00:E0:4C:6B:E5:03

The error did not appear when running with this patch.

Signed-off-by: Edward Vear
Signed-off-by: Marcel Holtmann
Signed-off-by: Johan Hedberg
Cc: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman

Edward Vear
2021-01-09 20:46:23 +0800

06 Jan, 2021

4 commits

62162b322 ethtool: fix string set id check ... Browse Code »

[ Upstream commit efb796f5571f030743e1d9c662cdebdad724f8c5 ]

Syzbot reported a shift of a u32 by more than 31 in strset_parse_request()
which is undefined behavior. This is caused by range check of string set id
using variable ret (which is always 0 at this point) instead of id (string
set id from request).

Fixes: 71921690f974 ("ethtool: provide string sets with STRSET_GET request")
Reported-by: syzbot+96523fb438937cd01220@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek
Link: https://lore.kernel.org/r/b54ed5c5fd972a59afea3e1badfb36d86df68799.1607952208.git.mkubecek@suse.cz
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Michal Kubecek
2021-01-06 21:56:48 +0800
95fcb69c4 ethtool: fix error paths in ethnl_set_channels() ... Browse Code »

[ Upstream commit ef72cd3c5ce168829c6684ecb2cae047d3493690 ]

Fix two error paths in ethnl_set_channels() to avoid lock-up caused
but unreleased RTNL.

Fixes: e19c591eafad ("ethtool: set device channel counts with CHANNELS_SET request")
Reported-by: LiLiang
Signed-off-by: Ivan Vecera
Reviewed-by: Michal Kubecek
Link: https://lore.kernel.org/r/20201215090810.801777-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Ivan Vecera
2021-01-06 21:56:48 +0800
aeab3d7a0 mptcp: fix security context on server socket ... Browse Code »

[ Upstream commit 0c14846032f2c0a3b63234e1fc2759f4155b6067 ]

Currently MPTCP is not propagating the security context
from the ingress request socket to newly created msk
at clone time.

Address the issue invoking the missing security helper.

Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Paolo Abeni
2021-01-06 21:56:48 +0800
a969a632c net/sched: sch_taprio: reset child qdiscs before freeing them ... Browse Code »

[ Upstream commit 44d4775ca51805b376a8db5b34f650434a08e556 ]

syzkaller shows that packets can still be dequeued while taprio_destroy()
is running. Let sch_taprio use the reset() function to cancel the advance
timer and drop all skbs from the child qdiscs.

Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Link: https://syzkaller.appspot.com/bug?id=f362872379bf8f0017fb667c1ab158f2d1e764ae
Reported-by: syzbot+8971da381fb5a31f542d@syzkaller.appspotmail.com
Signed-off-by: Davide Caratti
Acked-by: Vinicius Costa Gomes
Link: https://lore.kernel.org/r/63b6d79b0e830ebb0283e020db4df3cdfdfb2b94.1608142843.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Greg Kroah-Hartman

Davide Caratti
2021-01-06 21:56:48 +0800