Eric Lee / smarc-fsl-linux-kernel

20 Jan, 2020

1 commit

11a827294 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix non-blocking connect() in x25, from Martin Schiller.

2) Fix spurious decryption errors in kTLS, from Jakub Kicinski.

3) Netfilter use-after-free in mtype_destroy(), from Cong Wang.

4) Limit size of TSO packets properly in lan78xx driver, from Eric
Dumazet.

5) r8152 probe needs an endpoint sanity check, from Johan Hovold.

6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from
John Fastabend.

7) hns3 needs short frames padded on transmit, from Yunsheng Lin.

8) Fix netfilter ICMP header corruption, from Eyal Birger.

9) Fix soft lockup when low on memory in hns3, from Yonglong Liu.

10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan.

11) Fix memory leak in act_ctinfo, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
cxgb4: reject overlapped queues in TC-MQPRIO offload
cxgb4: fix Tx multi channel port rate limit
net: sched: act_ctinfo: fix memory leak
bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.
bnxt_en: Fix ipv6 RFS filter matching logic.
bnxt_en: Fix NTUPLE firmware command failures.
net: systemport: Fixed queue mapping in internal ring map
net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec
net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset
net: hns: fix soft lockup when there is not enough memory
net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()
net/sched: act_ife: initalize ife->metalist earlier
netfilter: nat: fix ICMP header corruption on ICMP errors
net: wan: lapbether.c: Use built-in RCU list checking
netfilter: nf_tables: fix flowtable list del corruption
netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()
netfilter: nf_tables: remove WARN and add NLA_STRING upper limits
netfilter: nft_tunnel: ERSPAN_VERSION must not be null
netfilter: nft_tunnel: fix null-attribute check
...

Linus Torvalds
2020-01-20 04:03:53 +0800

19 Jan, 2020

1 commit

09d4f10a5 net: sched: act_ctinfo: fix memory leak ... Browse Code »

Implement a cleanup method to properly free ci->params

BUG: memory leak
unreferenced object 0xffff88811746e2c0 (size 64):
comm "syz-executor617", pid 7106, jiffies 4294943055 (age 14.250s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
c0 34 60 84 ff ff ff ff 00 00 00 00 00 00 00 00 .4`.............
backtrace:
[] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[] slab_post_alloc_hook mm/slab.h:586 [inline]
[] slab_alloc mm/slab.c:3320 [inline]
[] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549
[] kmalloc include/linux/slab.h:556 [inline]
[] kzalloc include/linux/slab.h:670 [inline]
[] tcf_ctinfo_init+0x21a/0x530 net/sched/act_ctinfo.c:236
[] tcf_action_init_1+0x400/0x5b0 net/sched/act_api.c:944
[] tcf_action_init+0x135/0x1c0 net/sched/act_api.c:1000
[] tcf_action_add+0x9a/0x200 net/sched/act_api.c:1410
[] tc_ctl_action+0x14d/0x1bb net/sched/act_api.c:1465
[] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424
[] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
[] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
[] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
[] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328
[] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917
[] sock_sendmsg_nosec net/socket.c:639 [inline]
[] sock_sendmsg+0x54/0x70 net/socket.c:659
[] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330
[] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384
[] __sys_sendmsg+0x80/0xf0 net/socket.c:2417
[] __do_sys_sendmsg net/socket.c:2426 [inline]
[] __se_sys_sendmsg net/socket.c:2424 [inline]
[] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424

Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Cc: Kevin 'ldir' Darbyshire-Bryant
Cc: Cong Wang
Cc: Toke Høiland-Jørgensen
Acked-by: Kevin 'ldir' Darbyshire-Bryant
Signed-off-by: David S. Miller

Eric Dumazet
2020-01-19 23:02:15 +0800

17 Jan, 2020

3 commits

53d374979 net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key() ... Browse Code »

syzbot reported some bogus lockdep warnings, for example bad unlock
balance in sch_direct_xmit(). They are due to a race condition between
slow path and fast path, that is qdisc_xmit_lock_key gets re-registered
in netdev_update_lockdep_key() on slow path, while we could still
acquire the queue->_xmit_lock on fast path in this small window:

CPU A CPU B
__netif_tx_lock();
lockdep_unregister_key(qdisc_xmit_lock_key);
__netif_tx_unlock();
lockdep_register_key(qdisc_xmit_lock_key);

In fact, unlike the addr_list_lock which has to be reordered when
the master/slave device relationship changes, queue->_xmit_lock is
only acquired on fast path and only when NETIF_F_LLTX is not set,
so there is likely no nested locking for it.

Therefore, we can just get rid of re-registration of
qdisc_xmit_lock_key.

Reported-by: syzbot+4ec99438ed7450da6272@syzkaller.appspotmail.com
Fixes: ab92d68fc22f ("net: core: add generic lockdep keys")
Cc: Taehee Yoo
Signed-off-by: Cong Wang
Acked-by: Taehee Yoo
Signed-off-by: David S. Miller

Cong Wang
2020-01-17 18:02:44 +0800
44c23d715 net/sched: act_ife: initalize ife->metalist earlier ... Browse Code »

It seems better to init ife->metalist earlier in tcf_ife_init()
to avoid the following crash :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10483 Comm: syz-executor216 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:_tcf_ife_cleanup net/sched/act_ife.c:412 [inline]
RIP: 0010:tcf_ife_cleanup+0x6e/0x400 net/sched/act_ife.c:431
Code: 48 c1 ea 03 80 3c 02 00 0f 85 94 03 00 00 49 8b bd f8 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 67 e8 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 5c 03 00 00 48 bb 00 00 00 00 00 fc ff df 48 8b
RSP: 0018:ffffc90001dc6d00 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffffffff864619c0 RCX: ffffffff815bfa09
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000000
RBP: ffffc90001dc6d50 R08: 0000000000000004 R09: fffff520003b8d8e
R10: fffff520003b8d8d R11: 0000000000000003 R12: ffffffffffffffe8
R13: ffff8880a79fc000 R14: ffff88809aba0e00 R15: 0000000000000000
FS: 0000000001b51880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563f52cce140 CR3: 0000000093541000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
tcf_action_cleanup+0x62/0x1b0 net/sched/act_api.c:119
__tcf_action_put+0xfa/0x130 net/sched/act_api.c:135
__tcf_idr_release net/sched/act_api.c:165 [inline]
__tcf_idr_release+0x59/0xf0 net/sched/act_api.c:145
tcf_idr_release include/net/act_api.h:171 [inline]
tcf_ife_init+0x97c/0x1870 net/sched/act_ife.c:616
tcf_action_init_1+0x6b6/0xa40 net/sched/act_api.c:944
tcf_action_init+0x21a/0x330 net/sched/act_api.c:1000
tcf_action_add+0xf5/0x3b0 net/sched/act_api.c:1410
tc_ctl_action+0x390/0x488 net/sched/act_api.c:1465
rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5424
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x58c/0x7d0 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:639 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:659
____sys_sendmsg+0x753/0x880 net/socket.c:2330
___sys_sendmsg+0x100/0x170 net/socket.c:2384
__sys_sendmsg+0x105/0x1d0 net/socket.c:2417
__do_sys_sendmsg net/socket.c:2426 [inline]
__se_sys_sendmsg net/socket.c:2424 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2424
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 11a94d7fd80f ("net/sched: act_ife: validate the control action inside init()")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Cc: Davide Caratti
Reviewed-by: Davide Caratti
Acked-by: Cong Wang
Signed-off-by: David S. Miller

Eric Dumazet
2020-01-17 17:58:15 +0800
a72b6a1ee Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter updates for net

The following patchset contains Netfilter fixes for net:

1) Fix use-after-free in ipset bitmap destroy path, from Cong Wang.

2) Missing init netns in entry cleanup path of arp_tables,
from Florian Westphal.

3) Fix WARN_ON in set destroy path due to missing cleanup on
transaction error.

4) Incorrect netlink sanity check in tunnel, from Florian Westphal.

5) Missing sanity check for erspan version netlink attribute, also
from Florian.

6) Remove WARN in nft_request_module() that can be triggered from
userspace, from Florian Westphal.

7) Memleak in NFTA_HOOK_DEVS netlink parser, from Dan Carpenter.

8) List poison from commit path for flowtables that are added and
deleted in the same batch, from Florian Westphal.

9) Fix NAT ICMP packet corruption, from Eyal Birger.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-01-17 17:36:07 +0800

16 Jan, 2020

19 commits

61177e911 netfilter: nat: fix ICMP header corruption on ICMP errors ... Browse Code »

Commit 8303b7e8f018 ("netfilter: nat: fix spurious connection timeouts")
made nf_nat_icmp_reply_translation() use icmp_manip_pkt() as the l4
manipulation function for the outer packet on ICMP errors.

However, icmp_manip_pkt() assumes the packet has an 'id' field which
is not correct for all types of ICMP messages.

This is not correct for ICMP error packets, and leads to bogus bytes
being written the ICMP header, which can be wrongfully regarded as
'length' bytes by RFC 4884 compliant receivers.

Fix by assigning the 'id' field only for ICMP messages that have this
semantic.

Reported-by: Shmulik Ladkani
Fixes: 8303b7e8f018 ("netfilter: nat: fix spurious connection timeouts")
Signed-off-by: Eyal Birger
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Eyal Birger
2020-01-16 22:08:25 +0800
335178d54 netfilter: nf_tables: fix flowtable list del corruption ... Browse Code »

syzbot reported following crash:

list_del corruption, ffff88808c9bb000->prev is LIST_POISON2 (dead000000000122)
[..]
Call Trace:
__list_del_entry include/linux/list.h:131 [inline]
list_del_rcu include/linux/rculist.h:148 [inline]
nf_tables_commit+0x1068/0x3b30 net/netfilter/nf_tables_api.c:7183
[..]

The commit transaction list has:

NFT_MSG_NEWTABLE
NFT_MSG_NEWFLOWTABLE
NFT_MSG_DELFLOWTABLE
NFT_MSG_DELTABLE

A missing generation check during DELTABLE processing causes it to queue
the DELFLOWTABLE operation a second time, so we corrupt the list here:

case NFT_MSG_DELFLOWTABLE:
list_del_rcu(&nft_trans_flowtable(trans)->list);
nf_tables_flowtable_notify(&trans->ctx,

because we have two different DELFLOWTABLE transactions for the same
flowtable. We then call list_del_rcu() twice for the same flowtable->list.

The object handling seems to suffer from the same bug so add a generation
check too and only queue delete transactions for flowtables/objects that
are still active in the next generation.

Reported-by: syzbot+37a6804945a3a13b1572@syzkaller.appspotmail.com
Fixes: 3b49e2e94e6eb ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-01-16 21:22:33 +0800
cd77e75b5 netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks() ... Browse Code »

Syzbot detected a leak in nf_tables_parse_netdev_hooks(). If the hook
already exists, then the error handling doesn't free the newest "hook".

Reported-by: syzbot+f9d4095107fc8749c69c@syzkaller.appspotmail.com
Fixes: b75a3e8371bc ("netfilter: nf_tables: allow netdevice to be used only once per flowtable")
Signed-off-by: Dan Carpenter
Reviewed-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Dan Carpenter
2020-01-16 21:22:32 +0800
9332d27d7 netfilter: nf_tables: remove WARN and add NLA_STRING upper limits ... Browse Code »

This WARN can trigger because some of the names fed to the module
autoload function can be of arbitrary length.

Remove the WARN and add limits for all NLA_STRING attributes.

Reported-by: syzbot+0e63ae76d117ae1c3a01@syzkaller.appspotmail.com
Fixes: 452238e8d5ffd8 ("netfilter: nf_tables: add and use helper for module autoload")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-01-16 21:22:32 +0800
9ec22d7c6 netfilter: nft_tunnel: ERSPAN_VERSION must not be null ... Browse Code »

Fixes: af308b94a2a4a5 ("netfilter: nf_tables: add tunnel support")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-01-16 21:22:32 +0800
1c702bf90 netfilter: nft_tunnel: fix null-attribute check ... Browse Code »

else we get null deref when one of the attributes is missing, both
must be non-null.

Reported-by: syzbot+76d0b80493ac881ff77b@syzkaller.appspotmail.com
Fixes: aaecfdb5c5dd8ba ("netfilter: nf_tables: match on tunnel metadata")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-01-16 21:22:32 +0800
ec7470b83 netfilter: nf_tables: store transaction list locally while requesting module ... Browse Code »

This patch fixes a WARN_ON in nft_set_destroy() due to missing
set reference count drop from the preparation phase. This is triggered
by the module autoload path. Do not exercise the abort path from
nft_request_module() while preparation phase cleaning up is still
pending.

WARNING: CPU: 3 PID: 3456 at net/netfilter/nf_tables_api.c:3740 nft_set_destroy+0x45/0x50 [nf_tables]
[...]
CPU: 3 PID: 3456 Comm: nft Not tainted 5.4.6-arch3-1 #1
RIP: 0010:nft_set_destroy+0x45/0x50 [nf_tables]
Code: e8 30 eb 83 c6 48 8b 85 80 00 00 00 48 8b b8 90 00 00 00 e8 dd 6b d7 c5 48 8b 7d 30 e8 24 dd eb c5 48 89 ef 5d e9 6b c6 e5 c5 0b c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 7f 10 e9 52
RSP: 0018:ffffac4f43e53700 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff99d63a154d80 RCX: 0000000001f88e03
RDX: 0000000001f88c03 RSI: ffff99d6560ef0c0 RDI: ffff99d63a101200
RBP: ffff99d617721de0 R08: 0000000000000000 R09: 0000000000000318
R10: 00000000f0000000 R11: 0000000000000001 R12: ffffffff880fabf0
R13: dead000000000122 R14: dead000000000100 R15: ffff99d63a154d80
FS: 00007ff3dbd5b740(0000) GS:ffff99d6560c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00001cb5de6a9000 CR3: 000000016eb6a004 CR4: 00000000001606e0
Call Trace:
__nf_tables_abort+0x3e3/0x6d0 [nf_tables]
nft_request_module+0x6f/0x110 [nf_tables]
nft_expr_type_request_module+0x28/0x50 [nf_tables]
nf_tables_expr_parse+0x198/0x1f0 [nf_tables]
nft_expr_init+0x3b/0xf0 [nf_tables]
nft_dynset_init+0x1e2/0x410 [nf_tables]
nf_tables_newrule+0x30a/0x930 [nf_tables]
nfnetlink_rcv_batch+0x2a0/0x640 [nfnetlink]
nfnetlink_rcv+0x125/0x171 [nfnetlink]
netlink_unicast+0x179/0x210
netlink_sendmsg+0x208/0x3d0
sock_sendmsg+0x5e/0x60
____sys_sendmsg+0x21b/0x290

Update comment on the code to describe the new behaviour.

Reported-by: Marco Oliverio
Fixes: 452238e8d5ff ("netfilter: nf_tables: add and use helper for module autoload")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2020-01-16 21:21:51 +0800
bd5874da5 net: dsa: tag_qca: fix doubled Tx statistics ... Browse Code »

DSA subsystem takes care of netdev statistics since commit 4ed70ce9f01c
("net: dsa: Refactor transmit path to eliminate duplication"), so
any accounting inside tagger callbacks is redundant and can lead to
messing up the stats.
This bug is present in Qualcomm tagger since day 0.

Fixes: cafdc45c949b ("net-next: dsa: add Qualcomm tag RX/TX handler")
Reviewed-by: Andrew Lunn
Signed-off-by: Alexander Lobakin
Reviewed-by: Florian Fainelli
Signed-off-by: David S. Miller

Alexander Lobakin
2020-01-16 20:59:40 +0800
ad3220547 net: dsa: tag_gswip: fix typo in tagger name ... Browse Code »

The correct name is GSWIP (Gigabit Switch IP). Typo was introduced in
875138f81d71a ("dsa: Move tagger name into its ops structure") while
moving tagger names to their structures.

Fixes: 875138f81d71a ("dsa: Move tagger name into its ops structure")
Reviewed-by: Andrew Lunn
Signed-off-by: Alexander Lobakin
Reviewed-by: Florian Fainelli
Acked-by: Hauke Mehrtens
Signed-off-by: David S. Miller

Alexander Lobakin
2020-01-16 20:58:26 +0800
3981f955e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf ... Browse Code »

Daniel Borkmann says:

====================
pull-request: bpf 2020-01-15

The following pull-request contains BPF updates for your *net* tree.

We've added 12 non-merge commits during the last 9 day(s) which contain
a total of 13 files changed, 95 insertions(+), 43 deletions(-).

The main changes are:

1) Fix refcount leak for TCP time wait and request sockets for socket lookup
related BPF helpers, from Lorenz Bauer.

2) Fix wrong verification of ARSH instruction under ALU32, from Daniel Borkmann.

3) Batch of several sockmap and related TLS fixes found while operating
more complex BPF programs with Cilium and OpenSSL, from John Fastabend.

4) Fix sockmap to read psock's ingress_msg queue before regular sk_receive_queue()
to avoid purging data upon teardown, from Lingpeng Chen.

5) Fix printing incorrect pointer in bpftool's btf_dump_ptr() in order to properly
dump a BPF map's value with BTF, from Martin KaFai Lau.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-01-16 17:04:40 +0800
7361d4489 bpf: Sockmap/tls, fix pop data with SK_DROP return code ... Browse Code »

When user returns SK_DROP we need to reset the number of copied bytes
to indicate to the user the bytes were dropped and not sent. If we
don't reset the copied arg sendmsg will return as if those bytes were
copied giving the user a positive return value.

This works as expected today except in the case where the user also
pops bytes. In the pop case the sg.size is reduced but we don't correctly
account for this when copied bytes is reset. The popped bytes are not
accounted for and we return a small positive value potentially confusing
the user.

The reason this happens is due to a typo where we do the wrong comparison
when accounting for pop bytes. In this fix notice the if/else is not
needed and that we have a similar problem if we push data except its not
visible to the user because if delta is larger the sg.size we return a
negative value so it appears as an error regardless.

Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Jonathan Lemon
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-9-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
9aaaa5684 bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining ... Browse Code »

Its possible through a set of push, pop, apply helper calls to construct
a skmsg, which is just a ring of scatterlist elements, with the start
value larger than the end value. For example,

end start
|_0_|_1_| ... |_n_|_n+1_|

Where end points at 1 and start points and n so that valid elements is
the set {n, n+1, 0, 1}.

Currently, because we don't build the correct chain only {n, n+1} will
be sent. This adds a check and sg_chain call to correctly submit the
above to the crypto and tls send path.

Fixes: d3b18ad31f93d ("tls: add bpf support to sk_msg handling")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Jonathan Lemon
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-8-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
d468e4775 bpf: Sockmap/tls, tls_sw can create a plaintext buf > encrypt buf ... Browse Code »

It is possible to build a plaintext buffer using push helper that is larger
than the allocated encrypt buffer. When this record is pushed to crypto
layers this can result in a NULL pointer dereference because the crypto
API expects the encrypt buffer is large enough to fit the plaintext
buffer. Kernel splat below.

To resolve catch the cases this can happen and split the buffer into two
records to send individually. Unfortunately, there is still one case to
handle where the split creates a zero sized buffer. In this case we merge
the buffers and unmark the split. This happens when apply is zero and user
pushed data beyond encrypt buffer. This fixes the original case as well
because the split allocated an encrypt buffer larger than the plaintext
buffer and the merge simply moves the pointers around so we now have
a reference to the new (larger) encrypt buffer.

Perhaps its not ideal but it seems the best solution for a fixes branch
and avoids handling these two cases, (a) apply that needs split and (b)
non apply case. The are edge cases anyways so optimizing them seems not
necessary unless someone wants later in next branches.

[ 306.719107] BUG: kernel NULL pointer dereference, address: 0000000000000008
[...]
[ 306.747260] RIP: 0010:scatterwalk_copychunks+0x12f/0x1b0
[...]
[ 306.770350] Call Trace:
[ 306.770956] scatterwalk_map_and_copy+0x6c/0x80
[ 306.772026] gcm_enc_copy_hash+0x4b/0x50
[ 306.772925] gcm_hash_crypt_remain_continue+0xef/0x110
[ 306.774138] gcm_hash_crypt_continue+0xa1/0xb0
[ 306.775103] ? gcm_hash_crypt_continue+0xa1/0xb0
[ 306.776103] gcm_hash_assoc_remain_continue+0x94/0xa0
[ 306.777170] gcm_hash_assoc_continue+0x9d/0xb0
[ 306.778239] gcm_hash_init_continue+0x8f/0xa0
[ 306.779121] gcm_hash+0x73/0x80
[ 306.779762] gcm_encrypt_continue+0x6d/0x80
[ 306.780582] crypto_gcm_encrypt+0xcb/0xe0
[ 306.781474] crypto_aead_encrypt+0x1f/0x30
[ 306.782353] tls_push_record+0x3b9/0xb20 [tls]
[ 306.783314] ? sk_psock_msg_verdict+0x199/0x300
[ 306.784287] bpf_exec_tx_verdict+0x3f2/0x680 [tls]
[ 306.785357] tls_sw_sendmsg+0x4a3/0x6a0 [tls]

test_sockmap test signature to trigger bug,

[TEST]: (1, 1, 1, sendmsg, pass,redir,start 1,end 2,pop (1,2),ktls,):

Fixes: d3b18ad31f93d ("tls: add bpf support to sk_msg handling")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Jonathan Lemon
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-7-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
cf21e9ba1 bpf: Sockmap/tls, msg_push_data may leave end mark in place ... Browse Code »

Leaving an incorrect end mark in place when passing to crypto
layer will cause crypto layer to stop processing data before
all data is encrypted. To fix clear the end mark on push
data instead of expecting users of the helper to clear the
mark value after the fact.

This happens when we push data into the middle of a skmsg and
have room for it so we don't do a set of copies that already
clear the end flag.

Fixes: 6fff607e2f14b ("bpf: sk_msg program helper bpf_msg_push_data")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Song Liu
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-6-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
6562e29cf bpf: Sockmap, skmsg helper overestimates push, pull, and pop bounds ... Browse Code »

In the push, pull, and pop helpers operating on skmsg objects to make
data writable or insert/remove data we use this bounds check to ensure
specified data is valid,

/* Bounds checks: start and pop must be inside message */
if (start >= offset + l || last >= msg->sg.size)
return -EINVAL;

The problem here is offset has already included the length of the
current element the 'l' above. So start could be past the end of
the scatterlist element in the case where start also points into an
offset on the last skmsg element.

To fix do the accounting slightly different by adding the length of
the previous entry to offset at the start of the iteration. And
ensure its initialized to zero so that the first iteration does
nothing.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Fixes: 6fff607e2f14b ("bpf: sk_msg program helper bpf_msg_push_data")
Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Song Liu
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-5-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
33bfe20dd bpf: Sockmap/tls, push write_space updates through ulp updates ... Browse Code »

When sockmap sock with TLS enabled is removed we cleanup bpf/psock state
and call tcp_update_ulp() to push updates to TLS ULP on top. However, we
don't push the write_space callback up and instead simply overwrite the
op with the psock stored previous op. This may or may not be correct so
to ensure we don't overwrite the TLS write space hook pass this field to
the ULP and have it fixup the ctx.

This completes a previous fix that pushed the ops through to the ULP
but at the time missed doing this for write_space, presumably because
write_space TLS hook was added around the same time.

Fixes: 95fa145479fbc ("bpf: sockmap/tls, close can race with map free")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Reviewed-by: Jakub Sitnicki
Acked-by: Jonathan Lemon
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
7e81a3530 bpf: Sockmap, ensure sock lock held during tear down ... Browse Code »

The sock_map_free() and sock_hash_free() paths used to delete sockmap
and sockhash maps walk the maps and destroy psock and bpf state associated
with the socks in the map. When done the socks no longer have BPF programs
attached and will function normally. This can happen while the socks in
the map are still "live" meaning data may be sent/received during the walk.

Currently, though we don't take the sock_lock when the psock and bpf state
is removed through this path. Specifically, this means we can be writing
into the ops structure pointers such as sendmsg, sendpage, recvmsg, etc.
while they are also being called from the networking side. This is not
safe, we never used proper READ_ONCE/WRITE_ONCE semantics here if we
believed it was safe. Further its not clear to me its even a good idea
to try and do this on "live" sockets while networking side might also
be using the socket. Instead of trying to reason about using the socks
from both sides lets realize that every use case I'm aware of rarely
deletes maps, in fact kubernetes/Cilium case builds map at init and
never tears it down except on errors. So lets do the simple fix and
grab sock lock.

This patch wraps sock deletes from maps in sock lock and adds some
annotations so we catch any other cases easier.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend
Signed-off-by: Daniel Borkmann
Acked-by: Song Liu
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20200111061206.8028-3-john.fastabend@gmail.com

John Fastabend
2020-01-16 06:26:13 +0800
5a40420e0 Merge tag 'batadv-net-for-davem-20200114' of git://git.open-mesh.org/linux-merge ... Browse Code »

Simon Wunderlich says:

====================
Here is a batman-adv bugfix:

- Fix DAT candidate selection on little endian systems,
by Sven Eckelmann
====================

Signed-off-by: David S. Miller

David S. Miller
2020-01-16 05:44:23 +0800
e176b1ba4 tcp: fix marked lost packets not being retransmitted ... Browse Code »

When the packet pointed to by retransmit_skb_hint is unlinked by ACK,
retransmit_skb_hint will be set to NULL in tcp_clean_rtx_queue().
If packet loss is detected at this time, retransmit_skb_hint will be set
to point to the current packet loss in tcp_verify_retransmit_hint(),
then the packets that were previously marked lost but not retransmitted
due to the restriction of cwnd will be skipped and cannot be
retransmitted.

To fix this, when retransmit_skb_hint is NULL, retransmit_skb_hint can
be reset only after all marked lost packets are retransmitted
(retrans_out >= lost_out), otherwise we need to traverse from
tcp_rtx_queue_head in tcp_xmit_retransmit_queue().

Packetdrill to demonstrate:

// Disable RACK and set max_reordering to keep things simple
0 `sysctl -q net.ipv4.tcp_recovery=0`
+0 `sysctl -q net.ipv4.tcp_max_reordering=3`

// Establish a connection
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+.1 < S 0:0(0) win 32792
+0 > S. 0:0(0) ack 1
+.01 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4

// Send 8 data segments
+0 write(4, ..., 8000) = 8000
+0 > P. 1:8001(8000) ack 1

// Enter recovery and 1:3001 is marked lost
+.01 < . 1:1(0) ack 1 win 257
+0 < . 1:1(0) ack 1 win 257
+0 < . 1:1(0) ack 1 win 257

// Retransmit 1:1001, now retransmit_skb_hint points to 1001:2001
+0 > . 1:1001(1000) ack 1

// 1001:2001 was ACKed causing retransmit_skb_hint to be set to NULL
+.01 < . 1:1(0) ack 2001 win 257
// Now retransmit_skb_hint points to 4001:5001 which is now marked lost

// BUG: 2001:3001 was not retransmitted
+0 > . 2001:3001(1000) ack 1

Signed-off-by: Pengcheng Yang
Acked-by: Neal Cardwell
Tested-by: Neal Cardwell
Signed-off-by: David S. Miller

Pengcheng Yang
2020-01-16 05:34:31 +0800

15 Jan, 2020

14 commits

eb507906f Merge tag 'mac80211-for-net-2020-01-15' of git://git.kernel.org/pub/scm/linux/ke… ... Browse Code »

…rnel/git/jberg/mac80211

Johannes Berg says:

====================
A few fixes:
* -O3 enablement fallout, thanks to Arnd who ran this
* fixes for a few leaks, thanks to Felix
* channel 12 regulatory fix for custom regdomains
* check for a crash reported by syzbot
(NULL function is called on drivers that don't have it)
* fix TKIP replay protection after setup with some APs
(from Jouni)
* restrict obtaining some mesh data to avoid WARN_ONs
* fix deadlocks with auto-disconnect (socket owner)
* fix radar detection events with multiple devices
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

David S. Miller
2020-01-15 20:12:00 +0800
81c044fc3 cfg80211: fix page refcount issue in A-MSDU decap ... Browse Code »

The fragments attached to a skb can be part of a compound page. In that case,
page_ref_inc will increment the refcount for the wrong page. Fix this by
using get_page instead, which calls page_ref_inc on the compound head and
also checks for overflow.

Fixes: 2b67f944f88c ("cfg80211: reuse existing page fragments in A-MSDU rx")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau
Link: https://lore.kernel.org/r/20200113182107.20461-1-nbd@nbd.name
Signed-off-by: Johannes Berg

Felix Fietkau
2020-01-15 16:53:35 +0800
24953de0a cfg80211: check for set_wiphy_params ... Browse Code »

Check if set_wiphy_params is assigned and return an error if not,
some drivers (e.g. virt_wifi where syzbot reported it) don't have
it.

Reported-by: syzbot+e8a797964a4180eb57d5@syzkaller.appspotmail.com
Reported-by: syzbot+34b582cf32c1db008f8e@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg
Link: https://lore.kernel.org/r/20200113125358.ac07f276efff.Ibd85ee1b12e47b9efb00a2adc5cd3fac50da791a@changeid
Signed-off-by: Johannes Berg

Johannes Berg
2020-01-15 16:53:24 +0800
df16737d4 cfg80211: fix memory leak in cfg80211_cqm_rssi_update ... Browse Code »

The per-tid statistics need to be released after the call to rdev_get_station

Cc: stable@vger.kernel.org
Fixes: 8689c051a201 ("cfg80211: dynamically allocate per-tid stats for station info")
Signed-off-by: Felix Fietkau
Link: https://lore.kernel.org/r/20200108170630.33680-2-nbd@nbd.name
Signed-off-by: Johannes Berg

Felix Fietkau
2020-01-15 16:53:12 +0800
2a279b341 cfg80211: fix memory leak in nl80211_probe_mesh_link ... Browse Code »

The per-tid statistics need to be released after the call to rdev_get_station

Cc: stable@vger.kernel.org
Fixes: 5ab92e7fe49a ("cfg80211: add support to probe unexercised mesh link")
Signed-off-by: Felix Fietkau
Link: https://lore.kernel.org/r/20200108170630.33680-1-nbd@nbd.name
Signed-off-by: Johannes Berg

Felix Fietkau
2020-01-15 16:53:02 +0800
5a128a088 cfg80211: fix deadlocks in autodisconnect work ... Browse Code »

Use methods which do not try to acquire the wdev lock themselves.

Cc: stable@vger.kernel.org
Fixes: 37b1c004685a3 ("cfg80211: Support all iftypes in autodisconnect_wk")
Signed-off-by: Markus Theil
Link: https://lore.kernel.org/r/20200108115536.2262-1-markus.theil@tu-ilmenau.de
Signed-off-by: Johannes Berg

Markus Theil
2020-01-15 16:52:49 +0800
e16119655 wireless: wext: avoid gcc -O3 warning ... Browse Code »

After the introduction of CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3,
the wext code produces a bogus warning:

In function 'iw_handler_get_iwstats',
inlined from 'ioctl_standard_call' at net/wireless/wext-core.c:1015:9,
inlined from 'wireless_process_ioctl' at net/wireless/wext-core.c:935:10,
inlined from 'wext_ioctl_dispatch.part.8' at net/wireless/wext-core.c:986:8,
inlined from 'wext_handle_ioctl':
net/wireless/wext-core.c:671:3: error: argument 1 null where non-null expected [-Werror=nonnull]
memcpy(extra, stats, sizeof(struct iw_statistics));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from arch/x86/include/asm/string.h:5,
net/wireless/wext-core.c: In function 'wext_handle_ioctl':
arch/x86/include/asm/string_64.h:14:14: note: in a call to function 'memcpy' declared here

The problem is that ioctl_standard_call() sometimes calls the handler
with a NULL argument that would cause a problem for iw_handler_get_iwstats.
However, iw_handler_get_iwstats never actually gets called that way.

Marking that function as noinline avoids the warning and leads
to slightly smaller object code as well.

Signed-off-by: Arnd Bergmann
Link: https://lore.kernel.org/r/20200107200741.3588770-1-arnd@arndb.de
Signed-off-by: Johannes Berg

Arnd Bergmann
2020-01-15 16:52:20 +0800
6f6012652 mac80211: Fix TKIP replay protection immediately after key setup ... Browse Code »

TKIP replay protection was skipped for the very first frame received
after a new key is configured. While this is potentially needed to avoid
dropping a frame in some cases, this does leave a window for replay
attacks with group-addressed frames at the station side. Any earlier
frame sent by the AP using the same key would be accepted as a valid
frame and the internal RSC would then be updated to the TSC from that
frame. This would allow multiple previously transmitted group-addressed
frames to be replayed until the next valid new group-addressed frame
from the AP is received by the station.

Fix this by limiting the no-replay-protection exception to apply only
for the case where TSC=0, i.e., when this is for the very first frame
protected using the new key, and the local RSC had not been set to a
higher value when configuring the key (which may happen with GTK).

Signed-off-by: Jouni Malinen
Link: https://lore.kernel.org/r/20200107153545.10934-1-j@w1.fi
Signed-off-by: Johannes Berg

Jouni Malinen
2020-01-15 16:52:12 +0800
26ec17a1d cfg80211: Fix radar event during another phy CAC ... Browse Code »

In case a radar event of CAC_FINISHED or RADAR_DETECTED
happens during another phy is during CAC we might need
to cancel that CAC.

If we got a radar in a channel that another phy is now
doing CAC on then the CAC should be canceled there.

If, for example, 2 phys doing CAC on the same channels,
or on comptable channels, once on of them will finish his
CAC the other might need to cancel his CAC, since it is no
longer relevant.

To fix that the commit adds an callback and implement it in
mac80211 to end CAC.
This commit also adds a call to said callback if after a radar
event we see the CAC is no longer relevant

Signed-off-by: Orr Mazor
Reviewed-by: Sergey Matyukevich
Link: https://lore.kernel.org/r/20191222145449.15792-1-Orr.Mazor@tandemg.com
[slightly reformat/reword commit message]
Signed-off-by: Johannes Berg

Orr Mazor
2020-01-15 16:50:48 +0800
c4b9d655e wireless: fix enabling channel 12 for custom regulatory domain ... Browse Code »

Commit e33e2241e272 ("Revert "cfg80211: Use 5MHz bandwidth by
default when checking usable channels"") fixed a broken
regulatory (leaving channel 12 open for AP where not permitted).
Apply a similar fix to custom regulatory domain processing.

Signed-off-by: Cathy Luo
Signed-off-by: Ganapathi Bhat
Link: https://lore.kernel.org/r/1576836859-8945-1-git-send-email-ganapathi.bhat@nxp.com
[reword commit message, fix coding style, add a comment]
Signed-off-by: Johannes Berg

Ganapathi Bhat
2020-01-15 16:47:52 +0800
c742c59e1 hv_sock: Remove the accept port restriction ... Browse Code »

Currently, hv_sock restricts the port the guest socket can accept
connections on. hv_sock divides the socket port namespace into two parts
for server side (listening socket), 0-0x7FFFFFFF & 0x80000000-0xFFFFFFFF
(there are no restrictions on client port namespace). The first part
(0-0x7FFFFFFF) is reserved for sockets where connections can be accepted.
The second part (0x80000000-0xFFFFFFFF) is reserved for allocating ports
for the peer (host) socket, once a connection is accepted.
This reservation of the port namespace is specific to hv_sock and not
known by the generic vsock library (ex: af_vsock). This is problematic
because auto-binds/ephemeral ports are handled by the generic vsock
library and it has no knowledge of this port reservation and could
allocate a port that is not compatible with hv_sock (and legitimately so).
The issue hasn't surfaced so far because the auto-bind code of vsock
(__vsock_bind_stream) prior to the change 'VSOCK: bind to random port for
VMADDR_PORT_ANY' would start walking up from LAST_RESERVED_PORT (1023) and
start assigning ports. That will take a large number of iterations to hit
0x7FFFFFFF. But, after the above change to randomize port selection, the
issue has started coming up more frequently.
There has really been no good reason to have this port reservation logic
in hv_sock from the get go. Reserving a local port for peer ports is not
how things are handled generally. Peer ports should reflect the peer port.
This fixes the issue by lifting the port reservation, and also returns the
right peer port. Since the code converts the GUID to the peer port (by
using the first 4 bytes), there is a possibility of conflicts, but that
seems like a reasonable risk to take, given this is limited to vsock and
that only applies to all local sockets.

Signed-off-by: Sunil Muthuswamy
Signed-off-by: David S. Miller

Sunil Muthuswamy
2020-01-15 03:50:53 +0800
671c450b6 xprtrdma: Fix oops in Receive handler after device removal ... Browse Code »

Since v5.4, a device removal occasionally triggered this oops:

Dec 2 17:13:53 manet kernel: BUG: unable to handle page fault for address: 0000000c00000219
Dec 2 17:13:53 manet kernel: #PF: supervisor read access in kernel mode
Dec 2 17:13:53 manet kernel: #PF: error_code(0x0000) - not-present page
Dec 2 17:13:53 manet kernel: PGD 0 P4D 0
Dec 2 17:13:53 manet kernel: Oops: 0000 [#1] SMP
Dec 2 17:13:53 manet kernel: CPU: 2 PID: 468 Comm: kworker/2:1H Tainted: G W 5.4.0-00050-g53717e43af61 #883
Dec 2 17:13:53 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Dec 2 17:13:53 manet kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Dec 2 17:13:53 manet kernel: RIP: 0010:rpcrdma_wc_receive+0x7c/0xf6 [rpcrdma]
Dec 2 17:13:53 manet kernel: Code: 6d 8b 43 14 89 c1 89 45 78 48 89 4d 40 8b 43 2c 89 45 14 8b 43 20 89 45 18 48 8b 45 20 8b 53 14 48 8b 30 48 8b 40 10 48 8b 38 8b 87 18 02 00 00 48 85 c0 75 18 48 8b 05 1e 24 c4 e1 48 85 c0
Dec 2 17:13:53 manet kernel: RSP: 0018:ffffc900035dfe00 EFLAGS: 00010246
Dec 2 17:13:53 manet kernel: RAX: ffff888467290000 RBX: ffff88846c638400 RCX: 0000000000000048
Dec 2 17:13:53 manet kernel: RDX: 0000000000000048 RSI: 00000000f942e000 RDI: 0000000c00000001
Dec 2 17:13:53 manet kernel: RBP: ffff888467611b00 R08: ffff888464e4a3c4 R09: 0000000000000000
Dec 2 17:13:53 manet kernel: R10: ffffc900035dfc88 R11: fefefefefefefeff R12: ffff888865af4428
Dec 2 17:13:53 manet kernel: R13: ffff888466023000 R14: ffff88846c63f000 R15: 0000000000000010
Dec 2 17:13:53 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
Dec 2 17:13:53 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 2 17:13:53 manet kernel: CR2: 0000000c00000219 CR3: 0000000002009002 CR4: 00000000001606e0
Dec 2 17:13:53 manet kernel: Call Trace:
Dec 2 17:13:53 manet kernel: __ib_process_cq+0x5c/0x14e [ib_core]
Dec 2 17:13:53 manet kernel: ib_cq_poll_work+0x26/0x70 [ib_core]
Dec 2 17:13:53 manet kernel: process_one_work+0x19d/0x2cd
Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf
Dec 2 17:13:53 manet kernel: worker_thread+0x1a6/0x25a
Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf
Dec 2 17:13:53 manet kernel: kthread+0xf4/0xf9
Dec 2 17:13:53 manet kernel: ? kthread_queue_delayed_work+0x74/0x74
Dec 2 17:13:53 manet kernel: ret_from_fork+0x24/0x30

The proximal cause is that this rpcrdma_rep has a rr_rdmabuf that
is still pointing to the old ib_device, which has been freed. The
only way that is possible is if this rpcrdma_rep was not destroyed
by rpcrdma_ia_remove.

Debugging showed that was indeed the case: this rpcrdma_rep was
still in use by a completing RPC at the time of the device removal,
and thus wasn't on the rep free list. So, it was not found by
rpcrdma_reps_destroy().

The fix is to introduce a list of all rpcrdma_reps so that they all
can be found when a device is removed. That list is used to perform
only regbuf DMA unmapping, replacing that call to
rpcrdma_reps_destroy().

Meanwhile, to prevent corruption of this list, I've moved the
destruction of temp rpcrdma_rep objects to rpcrdma_post_recvs().
rpcrdma_xprt_drain() ensures that post_recvs (and thus rep_destroy) is
not invoked while rpcrdma_reps_unmap is walking rb_all_reps, thus
protecting the rb_all_reps list.

Fixes: b0b227f071a0 ("xprtrdma: Use an llist to manage free rpcrdma_reps")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-01-15 02:30:24 +0800
13cb886c5 xprtrdma: Fix completion wait during device removal ... Browse Code »

I've found that on occasion, "rmmod " will hang while if an NFS
is under load.

Ensure that ri_remove_done is initialized only just before the
transport is woken up to force a close. This avoids the completion
possibly getting initialized again while the CM event handler is
waiting for a wake-up.

Fixes: bebd031866ca ("xprtrdma: Support unplugging an HCA from under an NFS mount")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-01-15 02:30:24 +0800
b32b9ed49 xprtrdma: Fix create_qp crash on device unload ... Browse Code »

On device re-insertion, the RDMA device driver crashes trying to set
up a new QP:

Nov 27 16:32:06 manet kernel: BUG: kernel NULL pointer dereference, address: 00000000000001c0
Nov 27 16:32:06 manet kernel: #PF: supervisor write access in kernel mode
Nov 27 16:32:06 manet kernel: #PF: error_code(0x0002) - not-present page
Nov 27 16:32:06 manet kernel: PGD 0 P4D 0
Nov 27 16:32:06 manet kernel: Oops: 0002 [#1] SMP
Nov 27 16:32:06 manet kernel: CPU: 1 PID: 345 Comm: kworker/u28:0 Tainted: G W 5.4.0 #852
Nov 27 16:32:06 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015
Nov 27 16:32:06 manet kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma]
Nov 27 16:32:06 manet kernel: RIP: 0010:atomic_try_cmpxchg+0x2/0x12
Nov 27 16:32:06 manet kernel: Code: ff ff 48 8b 04 24 5a c3 c6 07 00 0f 1f 40 00 c3 31 c0 48 81 ff 08 09 68 81 72 0c 31 c0 48 81 ff 83 0c 68 81 0f 92 c0 c3 8b 06 0f b1 17 0f 94 c2 84 d2 75 02 89 06 88 d0 c3 53 ba 01 00 00 00
Nov 27 16:32:06 manet kernel: RSP: 0018:ffffc900035abbf0 EFLAGS: 00010046
Nov 27 16:32:06 manet kernel: RAX: 0000000000000000 RBX: 00000000000001c0 RCX: 0000000000000000
Nov 27 16:32:06 manet kernel: RDX: 0000000000000001 RSI: ffffc900035abbfc RDI: 00000000000001c0
Nov 27 16:32:06 manet kernel: RBP: ffffc900035abde0 R08: 000000000000000e R09: ffffffffffffc000
Nov 27 16:32:06 manet kernel: R10: 0000000000000000 R11: 000000000002e800 R12: ffff88886169d9f8
Nov 27 16:32:06 manet kernel: R13: ffff88886169d9f4 R14: 0000000000000246 R15: 0000000000000000
Nov 27 16:32:06 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa40000(0000) knlGS:0000000000000000
Nov 27 16:32:06 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 27 16:32:06 manet kernel: CR2: 00000000000001c0 CR3: 0000000002009006 CR4: 00000000001606e0
Nov 27 16:32:06 manet kernel: Call Trace:
Nov 27 16:32:06 manet kernel: do_raw_spin_lock+0x2f/0x5a
Nov 27 16:32:06 manet kernel: create_qp_common.isra.47+0x856/0xadf [mlx4_ib]
Nov 27 16:32:06 manet kernel: ? slab_post_alloc_hook.isra.60+0xa/0x1a
Nov 27 16:32:06 manet kernel: ? __kmalloc+0x125/0x139
Nov 27 16:32:06 manet kernel: mlx4_ib_create_qp+0x57f/0x972 [mlx4_ib]

The fix is to copy the qp_init_attr struct that was just created by
rpcrdma_ep_create() instead of using the one from the previous
connection instance.

Fixes: 98ef77d1aaa7 ("xprtrdma: Send Queue size grows after a reconnect")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-01-15 02:30:24 +0800

14 Jan, 2020

2 commits

212e7f566 netfilter: arp_tables: init netns pointer in xt_tgdtor_param struct ... Browse Code »

An earlier commit (1b789577f655060d98d20e,
"netfilter: arp_tables: init netns pointer in xt_tgchk_param struct")
fixed missing net initialization for arptables, but turns out it was
incomplete. We can get a very similar struct net NULL deref during
error unwinding:

general protection fault: 0000 [#1] PREEMPT SMP KASAN
RIP: 0010:xt_rateest_put+0xa1/0x440 net/netfilter/xt_RATEEST.c:77
xt_rateest_tg_destroy+0x72/0xa0 net/netfilter/xt_RATEEST.c:175
cleanup_entry net/ipv4/netfilter/arp_tables.c:509 [inline]
translate_table+0x11f4/0x1d80 net/ipv4/netfilter/arp_tables.c:587
do_replace net/ipv4/netfilter/arp_tables.c:981 [inline]
do_arpt_set_ctl+0x317/0x650 net/ipv4/netfilter/arp_tables.c:1461

Also init the netns pointer in xt_tgdtor_param struct.

Fixes: add67461240c1d ("netfilter: add struct net * to target parameters")
Reported-by: syzbot+91bdd8eece0f6629ec8b@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-01-14 02:22:10 +0800
c12095938 netfilter: fix a use-after-free in mtype_destroy() ... Browse Code »

map->members is freed by ip_set_free() right before using it in
mtype_ext_cleanup() again. So we just have to move it down.

Reported-by: syzbot+4c3cc6dbe7259dbf9054@syzkaller.appspotmail.com
Fixes: 40cd63bf33b2 ("netfilter: ipset: Support extensions which need a per data destroy function")
Acked-by: Jozsef Kadlecsik
Signed-off-by: Cong Wang
Signed-off-by: Pablo Neira Ayuso

Cong Wang
2020-01-14 01:53:59 +0800