Eric Lee / smarc-fsl-linux-kernel

01 Apr, 2020

1 commit

7f80ccfe9 net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline

In case memory resources for buf were allocated, release them before
return.

Addresses-Coverity-ID: 1492011 ("Resource leak")
Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel")
Signed-off-by: Gustavo A. R. Silva
Signed-off-by: David S. Miller

Gustavo A. R. Silva
2020-04-01 01:12:51 +0800
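
The leak pattern fixed in 7f80ccfe9 can be illustrated in user space. The following is a minimal sketch, not the kernel patch itself: `copy_header_via_tmp()` and `expand_headroom()` are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocator. It shows one common way to make sure a temporary buffer is released on every return path after it was allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a step that can fail after the allocation. */
static int expand_headroom(size_t needed, size_t available)
{
	return needed > available ? -ENOMEM : 0;
}

/*
 * Copy a header through a temporary buffer. Every exit taken after the
 * allocation goes through the "out" label, so buf is always freed.
 */
int copy_header_via_tmp(unsigned char *dst, size_t dst_len,
			const unsigned char *src, size_t src_len)
{
	unsigned char *buf;
	int err;

	assert(dst != NULL && src != NULL);

	buf = malloc(src_len ? src_len : 1);
	if (!buf)
		return -ENOMEM;

	memcpy(buf, src, src_len);

	err = expand_headroom(src_len, dst_len);
	if (err)
		goto out;	/* a leaky version would return here directly */

	memcpy(dst, buf, src_len);
out:
	free(buf);
	return err;
}
```

Whether the release is done with a shared label, as here, or with an explicit free before the early return is a style choice; the point is only that no path skips it.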

31 Mar, 2020

4 commits

ed52f2c60 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Signed-off-by: David S. Miller

David S. Miller
2020-03-31 10:52:37 +0800
71489e21d net: Track socket refcounts in skb_steal_sock()

Refactor the UDP/TCP handlers slightly to allow skb_steal_sock() to make
the determination of whether the socket is reference counted in the case
where it is prefetched by earlier logic such as early_demux.

Signed-off-by: Joe Stringer
Signed-off-by: Alexei Starovoitov
Acked-by: Martin KaFai Lau
Link: https://lore.kernel.org/bpf/20200329225342.16317-3-joe@wand.net.nz

Joe Stringer
2020-03-31 04:45:04 +0800
cf7fbe660 bpf: Add socket assign support

Add support for TPROXY via a new bpf helper, bpf_sk_assign().

This helper requires the BPF program to discover the socket via a call
to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
helper takes its own reference to the socket in addition to any existing
reference that may or may not currently be obtained for the duration of
BPF processing. For the destination socket to receive the traffic, the
traffic must be routed towards that socket via local route. The
simplest example route is below, but in practice you may want to route
traffic more narrowly (eg by CIDR):

$ ip route add local default dev lo

This patch avoids trying to introduce an extra bit into the skb->sk, as
that would require more invasive changes to all code interacting with
the socket to ensure that the bit is handled correctly, such as all
error-handling cases along the path from the helper in BPF through to
the orphan path in the input. Instead, we opt to use the destructor
variable to switch on the prefetch of the socket.

Signed-off-by: Joe Stringer
Signed-off-by: Alexei Starovoitov
Acked-by: Martin KaFai Lau
Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz

Joe Stringer
2020-03-31 04:45:04 +0800
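
The idea shared by cf7fbe660 and 71489e21d, using the destructor pointer as the marker for a prefetched socket and letting the steal helper report whether a reference is held, can be modelled in user space. This is a simplified sketch under stated assumptions: every type and function name below is hypothetical, and the `rcu_free` flag is only a stand-in for whatever the kernel uses to decide that no reference was taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for struct sock and struct sk_buff (hypothetical). */
struct sock_model {
	int refcnt;
	bool rcu_free;	/* true: lifetime covered elsewhere, no reference held */
};

struct pkt_model {
	struct sock_model *sk;
	void (*destructor)(struct pkt_model *pkt);
};

/* Destructor used when the socket was attached with a reference held. */
void model_sock_put(struct pkt_model *pkt)
{
	pkt->sk->refcnt--;
}

/* Destructor used for a prefetched socket; it doubles as the marker. */
void model_sock_prefetch_free(struct pkt_model *pkt)
{
	if (!pkt->sk->rcu_free)
		pkt->sk->refcnt--;
}

/* No extra bit in pkt->sk is needed: the destructor identifies the case. */
bool pkt_sk_is_prefetched(const struct pkt_model *pkt)
{
	return pkt->destructor == model_sock_prefetch_free;
}

/* Detach the socket and tell the caller whether it owns a reference. */
struct sock_model *pkt_steal_sock(struct pkt_model *pkt, bool *refcounted)
{
	struct sock_model *sk = pkt->sk;

	assert(refcounted != NULL);
	if (!sk) {
		*refcounted = false;
		return NULL;
	}
	*refcounted = true;
	if (pkt_sk_is_prefetched(pkt))
		*refcounted = !sk->rcu_free;
	pkt->destructor = NULL;
	pkt->sk = NULL;
	return sk;
}
```

A caller would drop the reference after processing only when `refcounted` comes back true.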
acc086bfb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2020-03-28

1) Use kmem_cache_zalloc() instead of kmem_cache_alloc()
in xfrm_state_alloc(). From Huang Zijiang.

2) esp_output_fill_trailer() is the same in IPv4 and IPv6,
so share this function to avoid code duplication.
From Raed Salem.

3) Add offload support for esp beet mode.
From Xin Long.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-03-31 01:59:20 +0800

30 Mar, 2020

5 commits

a7a29f9c3 net: ipv6: add rpl sr tunnel

This patch adds functionality to configure routes for RPL source
routing. IPIP functionality is not implemented yet; it can be added
later, once the cases that call for IPv6 encapsulation become
clearer.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2020-03-30 13:30:57 +0800
faee67694 net: add net available in build_state

The build_state callback of lwtunnel doesn't contain the net namespace
structure yet. This patch will add it so we can check on specific
address configuration at creation time of rpl source routes.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2020-03-30 13:30:57 +0800
8610c7c6e net: ipv6: add support for rpl sr exthdr

This patch adds RPL source routing receive handling. It works only if
the sysctl "rpl_seg_enabled" is set and source routing is enabled. The
behaviour is mostly the same as for IPv6 segment routing. To handle
compression and decompression, a rpl.c file is created which contains
the necessary functionality. The receive handling also takes care of
IPv6 encapsulation, as far as it is specified as a possible nexthdr in
RFC 6554.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2020-03-30 13:30:57 +0800
f37c60593 addrconf: add functionality to check on rpl requirements

This patch adds a functionality to addrconf to check on a specific RPL
address configuration. According to RFC 6554:

To detect loops in the SRH, a router MUST determine if the SRH
includes multiple addresses assigned to any interface on that
router. If such addresses appear more than once and are separated by
at least one address not assigned to that router.

Signed-off-by: Alexander Aring
Signed-off-by: David S. Miller

Alexander Aring
2020-03-30 13:30:57 +0800
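
The loop condition quoted from RFC 6554 in f37c60593 can be written as a small state machine. This is a user-space sketch under stated assumptions, not the addrconf code: `srh_has_loop()` and `addr_is_local()` are hypothetical names, and the router's own addresses are passed in as a plain array.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if addr is one of the n_local addresses assigned to this router. */
static bool addr_is_local(const struct in6_addr *addr,
			  const struct in6_addr *local, size_t n_local)
{
	for (size_t i = 0; i < n_local; i++)
		if (memcmp(addr, &local[i], sizeof(*addr)) == 0)
			return true;
	return false;
}

/*
 * Report a loop when addresses of this router appear more than once in
 * the segment list and are separated by at least one foreign address.
 */
bool srh_has_loop(const struct in6_addr *segs, size_t n_segs,
		  const struct in6_addr *local, size_t n_local)
{
	enum { NONE_SEEN, LAST_WAS_LOCAL, FOREIGN_AFTER_LOCAL } state = NONE_SEEN;

	assert(n_segs == 0 || segs != NULL);

	for (size_t i = 0; i < n_segs; i++) {
		if (addr_is_local(&segs[i], local, n_local)) {
			if (state == FOREIGN_AFTER_LOCAL)
				return true;
			state = LAST_WAS_LOCAL;
		} else if (state == LAST_WAS_LOCAL) {
			state = FOREIGN_AFTER_LOCAL;
		}
	}
	return false;
}
```

Two adjacent local addresses do not count as a loop in this sketch, because the quoted text requires a separating address that is not assigned to the router.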
f0b598974 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Minor comment conflict in mac80211.

Signed-off-by: David S. Miller

David S. Miller
2020-03-30 12:25:29 +0800

28 Mar, 2020

1 commit

e00dd941f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2020-03-27

1) Handle NETDEV_UNREGISTER for xfrm device to handle asynchronous
unregister events cleanly. From Raed Salem.

2) Fix vti6 tunnel inter address family TX through bpf_redirect().
From Nicolas Dichtel.

3) Fix length check in verify_sec_ctx_len() to avoid a
slab-out-of-bounds. From Xin Long.

4) Add a missing verify_sec_ctx_len check in xfrm_add_acquire
to avoid a possible out-of-bounds access. From Xin Long.

5) Use built-in RCU list checking of hlist_for_each_entry_rcu
to silence false lockdep warning in __xfrm6_tunnel_spi_lookup
when CONFIG_PROVE_RCU_LIST is enabled. From Madhuparna Bhowmik.

6) Fix a panic on esp offload when crypto is done asynchronously.
From Xin Long.

7) Fix a skb memory leak in an error path of vti6_rcv.
From Torsten Hilbrich.

8) Fix a race that can lead to a double free in xfrm_policy_timer.
From Xin Long.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-03-28 05:56:55 +0800

27 Mar, 2020

1 commit

c24a77edc ipv6: ndisc: add support for 'PREF64' dns64 prefix identifier

This is trivial since we already have support for the entirely
identical (from the kernel's point of view) RDNSS, DNSSL, etc. that
also contain opaque data that needs to be passed down to userspace
for further processing.

As specified in draft-ietf-6man-ra-pref64-09 (while it is still a draft,
it is purely waiting on the RFC Editor for cleanups and publishing):
PREF64 option contains lifetime and a (up to) 96-bit IPv6 prefix.

The 8-bit identifier of the option type as assigned by the IANA is 38.

Since we lack DNS64/NAT64/CLAT support in the kernel at the moment,
this option should also be passed on to userland.

See:
https://tools.ietf.org/html/draft-ietf-6man-ra-pref64-09
https://www.iana.org/assignments/icmpv6-parameters/icmpv6-parameters.xhtml#icmpv6-parameters-5

Cc: Erik Kline
Cc: Jen Linkova
Cc: Lorenzo Colitti
Cc: Michael Haro
Signed-off-by: Maciej Żenczykowski
Acked-By: Lorenzo Colitti
Signed-off-by: David S. Miller

Maciej Żenczykowski
2020-03-27 11:05:58 +0800
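
Since c24a77edc only hands the option to userland, a consumer there has to decode it. The sketch below is an assumption-laden illustration: the bit layout (13-bit scaled lifetime in units of 8 seconds, 3-bit prefix length code, 96 prefix bits) follows my reading of the PREF64 specification and should be checked against it, and `parse_pref64()`, `struct pref64_info` and `ND_OPT_PREF64_TYPE` are hypothetical names. Only the option type 38 comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ND_OPT_PREF64_TYPE 38	/* IANA-assigned type, per the commit message */

struct pref64_info {
	uint32_t lifetime_secs;
	uint8_t prefix_len;	/* in bits */
	uint8_t prefix[12];	/* highest 96 bits of the prefix */
};

/* Decode one 16-byte PREF64 option; returns 0 on success, -1 otherwise. */
int parse_pref64(const uint8_t *opt, size_t len, struct pref64_info *out)
{
	/* Assumed mapping from prefix length code to prefix length. */
	static const uint8_t plc_to_len[] = { 96, 64, 56, 48, 40, 32 };
	uint16_t word;
	uint8_t plc;

	assert(opt != NULL && out != NULL);

	/* Option length is counted in units of 8 octets, so 2 means 16. */
	if (len < 16 || opt[0] != ND_OPT_PREF64_TYPE || opt[1] != 2)
		return -1;

	word = (uint16_t)((opt[2] << 8) | opt[3]);
	plc = word & 0x7;
	if (plc >= sizeof(plc_to_len))
		return -1;

	out->lifetime_secs = (uint32_t)(word >> 3) * 8;
	out->prefix_len = plc_to_len[plc];
	memcpy(out->prefix, opt + 4, sizeof(out->prefix));
	return 0;
}
```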

26 Mar, 2020

1 commit

7f9e40eb1 esp6: add gso_segment for esp6 beet mode

Similar to xfrm6_tunnel/transport_gso_segment(), _gso_segment()
is added to do gso_segment for esp6 beet mode. Before calling
inet6_offloads[proto]->callbacks.gso_segment, it needs to do:

- Get the upper proto from ph header to get its gso_segment
when xo->proto is IPPROTO_BEETPH.

- Add SKB_GSO_TCPV6 to gso_type if x->sel.family != AF_INET6
and the proto == IPPROTO_TCP, so that the current tcp ipv6
packet can be segmented.

- Calculate a right value for skb->transport_header and move
skb->data to the transport header position.

Signed-off-by: Xin Long
Signed-off-by: Steffen Klassert

Xin Long
2020-03-26 21:51:07 +0800

24 Mar, 2020

1 commit

af13b3c33 Remove DST_HOST

Previous changes to the IP routing code have removed all the
tests for the DST_HOST route flag.
Remove the flag and all the code that sets it.

Signed-off-by: David Laight
Acked-by: David Ahern
Signed-off-by: David S. Miller

David Laight
2020-03-24 12:57:44 +0800

16 Mar, 2020

1 commit

2a9de3af2 vti6: Fix memory leak of skb if input policy check fails

The vti6_rcv function performs some tests on the retrieved tunnel
including checking the IP protocol, the XFRM input policy, the
source and destination address.

In all but one place the skb is released in the error case. When
the input policy check fails, the network packet is leaked.

Use the same goto label, discard, in this case to fix the problem.

Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces")
Signed-off-by: Torsten Hilbrich
Reviewed-by: Nicolas Dichtel
Signed-off-by: Steffen Klassert

Torsten Hilbrich
2020-03-16 18:13:48 +0800

15 Mar, 2020

1 commit

6daf14140 netfilter: Replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

Lastly, fix checkpatch.pl warning
WARNING: __aligned(size) is preferred over __attribute__((aligned(size)))
in net/bridge/netfilter/ebtables.c

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva
Signed-off-by: Pablo Neira Ayuso

Gustavo A. R. Silva
2020-03-15 22:20:16 +0800
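
To make the allocation remark in 6daf14140 concrete, here is a small user-space sketch. It reuses the `struct foo` / `struct boo` shape from the commit message; the `value` field and the `foo_alloc()` helper are hypothetical additions for illustration. The size computation is the same whether the trailing member is declared as a flexible array or as a zero-length array.

```c
#include <assert.h>
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo array[];	/* flexible array member: must be last */
};

/* Allocate a struct foo with room for n trailing struct boo elements. */
struct foo *foo_alloc(int n)
{
	struct foo *f;

	assert(n >= 0);

	/* sizeof(*f) covers the header only; the array adds n elements. */
	f = malloc(sizeof(*f) + (size_t)n * sizeof(f->array[0]));
	if (!f)
		return NULL;

	f->stuff = n;
	for (int i = 0; i < n; i++)
		f->array[i].value = i * i;
	return f;
}
```

The caller releases the whole object, header and array together, with a single `free()`.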

13 Mar, 2020

2 commits

1d3435793 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Minor overlapping changes, nothing serious.

Signed-off-by: David S. Miller

David S. Miller
2020-03-13 13:34:48 +0800
a8eceea84 inet: Use fallthrough;

Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/

And by hand:

net/ipv6/ip6_fib.c has a fallthrough comment outside of an #ifdef block
that causes gcc to emit a warning if converted in-place.

So move the new fallthrough; inside the containing #ifdef/#endif too.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller

Joe Perches
2020-03-13 06:55:00 +0800
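
The conversion in a8eceea84 replaces comments with a `fallthrough;` pseudo-keyword. The sketch below shows how such a macro can be defined and used outside the kernel; the definition is modelled on the kernel's as I understand it, and `stages_from()` is a hypothetical example function.

```c
#include <assert.h>

/* Use the statement attribute where the compiler offers it. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Count how many stages run when starting at first_stage. */
int stages_from(int first_stage)
{
	int ran = 0;

	switch (first_stage) {
	case 0:
		ran++;
		fallthrough;
	case 1:
		ran++;
		fallthrough;
	case 2:
		ran++;
		break;
	default:
		break;
	}
	return ran;
}
```

Because `fallthrough;` is a statement, it has to sit inside the same preprocessor block as the code it follows, which is the situation the commit message describes for net/ipv6/ip6_fib.c.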

12 Mar, 2020

1 commit

267762538 seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number

The Internet Assigned Numbers Authority (IANA) has recently assigned
a protocol number value of 143 for Ethernet [1].

Before this assignment, encapsulation mechanisms such as Segment Routing
used the IPv6-NoNxt protocol number (59) to indicate that the encapsulated
payload is an Ethernet frame.

In this patch, we add the definition of the Ethernet protocol number to the
kernel headers and update the SRv6 L2 tunnels to use it.

[1] https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml

Signed-off-by: Paolo Lungaroni
Reviewed-by: Andrea Mayer
Acked-by: Ahmed Abdelsalam
Signed-off-by: David S. Miller

Paolo Lungaroni
2020-03-12 14:49:30 +0800

11 Mar, 2020

1 commit

60380488e ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface

Rafał found that on a non-Ethernet interface, bringing the link down
and up frequently slowly consumes memory.

The reason is that we add the allnodes/allrouters addresses to the
multicast list in ipv6_add_dev(). When the link goes down, we call
ipv6_mc_down() and store all multicast addresses via mld_add_delrec().
But when the link comes up, we don't call ipv6_mc_up() for a
non-Ethernet interface to remove the addresses. This makes
idev->mc_tomb grow bigger and bigger. The call stack looks like:

addrconf_notify(NETDEV_REGISTER)
        ipv6_add_dev
                ipv6_dev_mc_inc(ff01::1)
                ipv6_dev_mc_inc(ff02::1)
                ipv6_dev_mc_inc(ff02::2)

addrconf_notify(NETDEV_UP)
        addrconf_dev_config
                /* Alas, we support only Ethernet autoconfiguration. */
                return;

addrconf_notify(NETDEV_DOWN)
        addrconf_ifdown
                ipv6_mc_down
                        igmp6_group_dropped(ff02::2)
                                mld_add_delrec(ff02::2)
                        igmp6_group_dropped(ff02::1)
                        igmp6_group_dropped(ff01::1)

After investigating, I couldn't find a rule to disable multicast on a
non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM,
tunnels, etc. IPv4 doesn't check the dev type when calling ip_mc_up()
in inetdev_event(). Even for IPv6, we don't check the dev type and call
ipv6_add_dev(), ipv6_dev_mc_inc() after registering the device.

So I think it's OK to fix this memory consumption by calling
ipv6_mc_up() for non-Ethernet interfaces.

v2: Also check IFF_MULTICAST flag to make sure the interface supports
multicast

Reported-by: Rafał Miłecki
Tested-by: Rafał Miłecki
Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller

Hangbin Liu
2020-03-11 06:37:49 +0800

04 Mar, 2020

3 commits

d2f7e56d1 ipv6: Use math to point per net sysctls into the appropriate struct net

The data pointers of ipv6 sysctl are set one by one which is hard to
maintain, especially with kconfig. This patch simplifies it by using
math to point the per net sysctls into the appropriate struct net,
just like what we did for ipv4.

Signed-off-by: Cambda Zhu
Reviewed-by: Eric Dumazet
Signed-off-by: David S. Miller

Cambda Zhu
2020-03-04 06:50:08 +0800
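
The "math" in d2f7e56d1 is pointer rebasing: a template table points into a default structure, and each per-instance copy is shifted by the offset of the field. The following user-space sketch illustrates the idea under stated assumptions. `struct net_model`, `struct ctl_entry`, `init_net_model` and `ctl_table_for_net()` are hypothetical stand-ins, and the three fields are just examples.

```c
#include <assert.h>
#include <stddef.h>

struct net_model {
	int bindv6only;
	int flowlabel_consistency;
	int auto_flowlabels;
};

struct ctl_entry {
	const char *name;
	void *data;
};

/* Default instance that the template table points into. */
struct net_model init_net_model;

static const struct ctl_entry ipv6_table_template[] = {
	{ "bindv6only", &init_net_model.bindv6only },
	{ "flowlabel_consistency", &init_net_model.flowlabel_consistency },
	{ "auto_flowlabels", &init_net_model.auto_flowlabels },
};

#define CTL_ENTRIES \
	(sizeof(ipv6_table_template) / sizeof(ipv6_table_template[0]))

/* Fill table (CTL_ENTRIES slots) so every data pointer refers to net. */
void ctl_table_for_net(struct ctl_entry *table, struct net_model *net)
{
	assert(table != NULL && net != NULL);

	for (size_t i = 0; i < CTL_ENTRIES; i++) {
		/* Offset of the field inside the default instance ... */
		ptrdiff_t off = (char *)ipv6_table_template[i].data -
				(char *)&init_net_model;

		table[i] = ipv6_table_template[i];
		/* ... applied to the target instance. */
		table[i].data = (char *)net + off;
	}
}
```

Adding a field then only requires a new template row; no per-entry assignment code has to be kept in sync, including for entries that are compiled in conditionally.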
d0098e4c6 net/ipv6: remove the old peer route if change it to a new one

When we modify the peer route and change it to a new one, we should
remove the old route first. Before the fix:

+ ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 256 pref medium
2001:db8::2 proto kernel metric 256 pref medium
+ ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 256 pref medium
2001:db8::2 proto kernel metric 256 pref medium

After the fix:
+ ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 256 pref medium
2001:db8::3 proto kernel metric 256 pref medium

This patch depends on the previous patch "net/ipv6: need update peer route
when modify metric" to update the new peer route after deleting the old one.

Signed-off-by: Hangbin Liu
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Hangbin Liu
2020-03-04 06:43:16 +0800
617940123 net/ipv6: need update peer route when modify metric

When we modify the route metric, the peer address's route also needs
to be updated. Before the fix:

+ ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2 metric 60
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 60 pref medium
2001:db8::2 proto kernel metric 60 pref medium
+ ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 61 pref medium
2001:db8::2 proto kernel metric 60 pref medium

After the fix:
+ ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61
+ ip -6 route show dev dummy1
2001:db8::1 proto kernel metric 61 pref medium
2001:db8::2 proto kernel metric 61 pref medium

Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes")
Signed-off-by: Hangbin Liu
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Hangbin Liu
2020-03-04 06:43:16 +0800

01 Mar, 2020

1 commit

07758eb9f net/ipv6: use configured metric when add peer route

When we add a peer address with a metric configured, IPv4 sets the dest
metric correctly, but IPv6 does not. E.g.:

]# ip addr add 192.0.2.1 peer 192.0.2.2/32 dev eth1 metric 20
]# ip route show dev eth1
192.0.2.2 proto kernel scope link src 192.0.2.1 metric 20
]# ip addr add 2001:db8::1 peer 2001:db8::2/128 dev eth1 metric 20
]# ip -6 route show dev eth1
2001:db8::1 proto kernel metric 20 pref medium
2001:db8::2 proto kernel metric 256 pref medium

Fix this by using configured metric instead of default one.

Reported-by: Jianlin Shi
Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes")
Reviewed-by: David Ahern
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller

Hangbin Liu
2020-03-01 13:55:55 +0800

29 Feb, 2020

1 commit

b0c9a2d9a ipv6: Replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva
Signed-off-by: David S. Miller

Gustavo A. R. Silva
2020-02-29 04:08:37 +0800

28 Feb, 2020

1 commit

9f6e05590 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

The mptcp conflict was overlapping additions.

The SMC conflict was an addition and a removal happening at the same
time.

Signed-off-by: David S. Miller

David S. Miller
2020-02-28 10:31:39 +0800

27 Feb, 2020

2 commits

edf0d283d ipv6: xfrm6_tunnel.c: Use built-in RCU list checking

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik
Signed-off-by: Steffen Klassert

Madhuparna Bhowmik
2020-02-27 17:17:41 +0800
b6f611890 ipv6: restrict IPV6_ADDRFORM operation

IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one.
While this operation sounds illogical, we have to support it.

One of the things it does for TCP socket is to switch sk->sk_prot
to tcp_prot.

We now have other layers playing with sk->sk_prot, so we should make
sure to not interfere with them.

This patch makes sure sk_prot is the default pointer for TCP IPv6 socket.

syzbot reported :
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD a0113067 P4D a0113067 PUD a8771067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427
__sock_release net/socket.c:605 [inline]
sock_close+0xe1/0x260 net/socket.c:1283
__fput+0x2e4/0x740 fs/file_table.c:280
____fput+0x15/0x20 fs/file_table.c:313
task_work_run+0x176/0x1b0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_usermode_loop arch/x86/entry/common.c:164 [inline]
prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195
syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278
do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c429
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429
RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004
RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000
R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c
Modules linked in:
CR2: 0000000000000000
---[ end trace 82567b5207e87bae ]---
RIP: 0010:0x0
Code: Bad RIP value.
RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246
RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40
RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5
R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098
R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000
FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Eric Dumazet
Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com
Cc: Daniel Borkmann
Signed-off-by: David S. Miller

Eric Dumazet
2020-02-27 12:20:58 +0800

25 Feb, 2020

2 commits

571912c69 net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.

The Bareudp tunnel module provides a generic L3 encapsulation
tunnelling module for tunnelling different protocols like MPLS,
IP, NSH, etc. inside a UDP tunnel.

Signed-off-by: Martin Varghese
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller

Martin Varghese
2020-02-25 05:31:42 +0800
28b380e28 ip6mr: Fix RCU list debugging warning

ip6mr_for_each_table() macro uses list_for_each_entry_rcu()
for traversing outside an RCU read side critical section
but under the protection of rtnl_mutex. Hence add the
corresponding lockdep expression to silence the following
false-positive warnings:

[ 4.319479] =============================
[ 4.319480] WARNING: suspicious RCU usage
[ 4.319482] 5.5.4-stable #17 Tainted: G E
[ 4.319483] -----------------------------
[ 4.319485] net/ipv6/ip6mr.c:1243 RCU-list traversed in non-reader section!!

[ 4.456831] =============================
[ 4.456832] WARNING: suspicious RCU usage
[ 4.456834] 5.5.4-stable #17 Tainted: G E
[ 4.456835] -----------------------------
[ 4.456837] net/ipv6/ip6mr.c:1582 RCU-list traversed in non-reader section!!

Signed-off-by: Amol Grover
Signed-off-by: David S. Miller

Amol Grover
2020-02-25 05:19:21 +0800

21 Feb, 2020

1 commit

46d30cb10 net: ip6_gre: Distribute switch variables for initialization

Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.

To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.

net/ipv6/ip6_gre.c: In function ‘ip6gre_err’:
net/ipv6/ip6_gre.c:440:32: warning: statement will never be executed [-Wswitch-unreachable]
440 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~

net/ipv6/ip6_tunnel.c: In function ‘ip6_tnl_err’:
net/ipv6/ip6_tunnel.c:520:32: warning: statement will never be executed [-Wswitch-unreachable]
520 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~

[1] https://bugs.llvm.org/show_bug.cgi?id=44916

Signed-off-by: Kees Cook
Signed-off-by: David S. Miller

Kees Cook
2020-02-21 02:00:19 +0800
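
The two remedies named in 46d30cb10, lifting the variable into the function body or moving it into the case that uses it, look like this in a small user-space sketch. `payload_limit()` and `enum msg_type` are hypothetical; the problematic shape is shown only as a comment.

```c
#include <assert.h>

enum msg_type { MSG_SHORT, MSG_LONG, MSG_OTHER };

/*
 * Problematic shape, for comparison:
 *
 *	switch (type) {
 *		int limit;	// declared before any case label
 *	case MSG_SHORT:
 *		...
 *
 * No execution path passes through that declaration, so compiler
 * instrumentation cannot attach an automatic initializer to it.
 */
int payload_limit(enum msg_type type, int mtu)
{
	int limit;	/* remedy 1: lifted into the function body */

	assert(mtu >= 0);

	switch (type) {
	case MSG_SHORT:
		limit = mtu / 4;
		break;
	case MSG_LONG: {
		int overhead = 40;	/* remedy 2: scoped to its case */

		limit = mtu - overhead;
		break;
	}
	default:
		limit = 0;
		break;
	}
	return limit;
}
```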

19 Feb, 2020

1 commit

dda520c4d ESP: Export esp_output_fill_trailer function

The esp fill trailer method is identical for both
IPv6 and IPv4.

Share the implementation between esp6 and esp to avoid
code duplication. In addition, it can also be used
in various drivers' code.

Signed-off-by: Raed Salem
Reviewed-by: Boris Pismenny
Reviewed-by: Saeed Mahameed
Signed-off-by: Steffen Klassert

Raed Salem
2020-02-19 20:52:32 +0800

17 Feb, 2020

2 commits

afecdb376 ipv6: Fix nlmsg_flags when splitting a multipath route

When splitting an RTA_MULTIPATH request into multiple routes and adding the
second and later components, we must not simply remove NLM_F_REPLACE but
instead replace it by NLM_F_CREATE. Otherwise, it may look like the netlink
message was malformed.

For example,
ip route add 2001:db8::1/128 dev dummy0
ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0 \
nexthop via fe80::30:2 dev dummy0
results in the following warnings:
[ 1035.057019] IPv6: RTM_NEWROUTE with no NLM_F_CREATE or NLM_F_REPLACE
[ 1035.057517] IPv6: NLM_F_CREATE should be set when creating new route

This patch makes the nlmsg sequence look equivalent for __ip6_ins_rt() to
what it would get if the multipath route had been added in multiple netlink
operations:
ip route add 2001:db8::1/128 dev dummy0
ip route change 2001:db8::1/128 nexthop via fe80::30:1 dev dummy0
ip route append 2001:db8::1/128 nexthop via fe80::30:2 dev dummy0

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Benjamin Poirier
Reviewed-by: Michal Kubecek
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Benjamin Poirier
2020-02-17 10:34:31 +0800
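
The flag handling described in afecdb376 amounts to a small bit manipulation per route component. The sketch below is an illustration, not the kernel code: `flags_for_component()` is a hypothetical helper, and the `SKETCH_NLM_F_*` constants are local copies that I believe mirror the values of `NLM_F_REPLACE` and `NLM_F_CREATE` in the netlink uapi header.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies; assumed to match NLM_F_REPLACE and NLM_F_CREATE. */
#define SKETCH_NLM_F_REPLACE 0x100
#define SKETCH_NLM_F_CREATE  0x400

/*
 * Flags to use for the index-th component (0-based) of a split
 * multipath request. The first component keeps the request flags;
 * later ones have REPLACE swapped for CREATE instead of just removed.
 */
uint16_t flags_for_component(uint16_t request_flags, unsigned int index)
{
	if (index == 0)
		return request_flags;

	return (uint16_t)((request_flags & ~SKETCH_NLM_F_REPLACE) |
			  SKETCH_NLM_F_CREATE);
}
```

With only the removal, a request carrying just the replace flag would reach the second component with neither flag set, which matches the first warning quoted in the commit message.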
e404b8c7c ipv6: Fix route replacement with dev-only route

After commit 27596472473a ("ipv6: fix ECMP route replacement") it is no
longer possible to replace an ECMP-able route by a non ECMP-able route.
For example,
ip route add 2001:db8::1/128 via fe80::1 dev dummy0
ip route replace 2001:db8::1/128 dev dummy0
does not work as expected.

Tweak the replacement logic so that point 3 in the log of the above commit
becomes:
3. If the new route is not ECMP-able, and no matching non-ECMP-able route
exists, replace matching ECMP-able route (if any) or add the new route.

We can now summarize the entire replace semantics to:
When doing a replace, prefer replacing a matching route of the same
"ECMP-able-ness" as the replace argument. If there is no such candidate,
fallback to the first route found.

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Benjamin Poirier
Reviewed-by: Michal Kubecek
Signed-off-by: David S. Miller

Benjamin Poirier
2020-02-17 10:34:31 +0800
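
The replace semantics summarized in e404b8c7c can be expressed as a short selection routine. This is a sketch of the stated rule only, with hypothetical names (`struct route_model`, `pick_replace_candidate()`); it does not model how the kernel walks the FIB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct route_model {
	int id;
	bool ecmp_able;	/* true for "via ... dev ...", false for dev-only */
};

/*
 * Pick which of the n matching routes a replace should overwrite.
 * Returns an index into matches, or -1 if there is nothing to replace
 * and the new route should simply be added.
 */
int pick_replace_candidate(const struct route_model *matches, size_t n,
			   bool new_is_ecmp_able)
{
	assert(n == 0 || matches != NULL);

	/* Prefer a match with the same "ECMP-able-ness" as the new route. */
	for (size_t i = 0; i < n; i++)
		if (matches[i].ecmp_able == new_is_ecmp_able)
			return (int)i;

	/* Otherwise fall back to the first route found, if any. */
	return n > 0 ? 0 : -1;
}
```

In the example from the commit message, the only match is the ECMP-able route via fe80::1, so a dev-only replace falls back to index 0 and overwrites it.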

14 Feb, 2020

2 commits

5fdcce211 net, ip6_tunnel: enhance tunnel locate with link check

With ipip, it is possible to create an extra interface explicitly
attached to a given physical interface:

# ip link show tunl0
4: tunl0@NONE: mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
# ip link add tunl1 type ipip dev eth0
# ip link show tunl1
6: tunl1@eth0: mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0

But it is not possible with ip6tnl:

# ip link show ip6tnl0
5: ip6tnl0@NONE: mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/tunnel6 :: brd ::
# ip link add ip6tnl1 type ip6tnl dev eth0
RTNETLINK answers: File exists

This patch aims to make it possible by adding a link comparison in both
the tunnel locate and lookup functions; we also modify the mtu calculation
when attached to an interface with a lower mtu.

This makes it possible to use x-netns communication by moving the newly
created tunnel into a given netns.

Signed-off-by: William Dauchy
Reviewed-by: Nicolas Dichtel
Signed-off-by: David S. Miller

William Dauchy
2020-02-14 23:31:48 +0800
0b41713b6 icmp: introduce helper for nat'd source address in network device context

This introduces a helper function to be called only by network drivers
that wraps calls to icmp[v6]_send in a conntrack transformation, in case
NAT has been used. We don't want to pollute the non-driver path, though,
so we introduce this as a helper to be called by places that actually
make use of this, as suggested by Florian.

Signed-off-by: Jason A. Donenfeld
Cc: Florian Westphal
Signed-off-by: David S. Miller

Jason A. Donenfeld
2020-02-14 06:19:00 +0800

08 Feb, 2020

1 commit

db3fa2710 ipv6/addrconf: fix potential NULL deref in inet6_set_link_af()

__in6_dev_get(dev) called from inet6_set_link_af() can return NULL.

The needed check has been recently removed, let's add it back.

While do_setlink() does call validate_linkmsg() :
...
err = validate_linkmsg(dev, tb); /* OK at this point */
...

It is possible that the following call happening before the
->set_link_af() removes IPv6 if MTU is less than 1280 :

if (tb[IFLA_MTU]) {
        err = dev_set_mtu_ext(dev, nla_get_u32(tb[IFLA_MTU]), extack);
        if (err < 0)
                goto errout;
        status |= DO_SETLINK_MODIFIED;
}
...

if (tb[IFLA_AF_SPEC]) {
...
err = af_ops->set_link_af(dev, af);
->inet6_set_link_af() // CRASH because idev is NULL

Please note that IPv4 is immune to the bug since inet_set_link_af() does :

struct in_device *in_dev = __in_dev_get_rcu(dev);
if (!in_dev)
        return -EAFNOSUPPORT;

This problem has been mentioned in commit cf7afbfeb8ce ("rtnl: make
link af-specific updates atomic") changelog :

This method is not fail proof, while it is currently sufficient
to make set_link_af() inerrable and thus 100% atomic, the
validation function method will not be able to detect all error
scenarios in the future, there will likely always be errors
depending on states which are f.e. not protected by rtnl_mutex
and thus may change between validation and setting.

IPv6: ADDRCONF(NETDEV_CHANGE): lo: link becomes ready
general protection fault, probably for non-canonical address 0xdffffc0000000056: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000002b0-0x00000000000002b7]
CPU: 0 PID: 9698 Comm: syz-executor712 Not tainted 5.5.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
FS: 0000000000c46880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f0494ca0d0 CR3: 000000009e4ac000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
do_setlink+0x2a9f/0x3720 net/core/rtnetlink.c:2754
rtnl_group_changelink net/core/rtnetlink.c:3103 [inline]
__rtnl_newlink+0xdd1/0x1790 net/core/rtnetlink.c:3257
rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__do_sys_sendmsg net/socket.c:2439 [inline]
__se_sys_sendmsg net/socket.c:2437 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4402e9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffd62fbcf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004402e9
RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8
R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000401b70
R13: 0000000000401c00 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace cfa7664b8fdcdff3 ]---
RIP: 0010:inet6_set_link_af+0x66e/0xae0 net/ipv6/addrconf.c:5733
Code: 38 d0 7f 08 84 c0 0f 85 20 03 00 00 48 8d bb b0 02 00 00 45 0f b6 64 24 04 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 44 89 a3 b0 02 00
RSP: 0018:ffffc90005b06d40 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff86df39a6
RDX: 0000000000000056 RSI: ffffffff86df3e74 RDI: 00000000000002b0
RBP: ffffc90005b06e70 R08: ffff8880a2ac0380 R09: ffffc90005b06db0
R10: fffff52000b60dbe R11: ffffc90005b06df7 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8880a1fcc424 R15: dffffc0000000000
FS: 0000000000c46880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000004 CR3: 000000009e4ac000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 7dc2bccab0ee ("Validate required parameters in inet6_validate_link_af")
Signed-off-by: Eric Dumazet
Bisected-and-reported-by: syzbot
Cc: Maxim Mikityanskiy
Signed-off-by: David S. Miller

Eric Dumazet
2020-02-08 01:43:23 +0800

06 Feb, 2020

1 commit

f1ed10264 vti[6]: fix packet tx through bpf_redirect() in XinY cases

I forgot the 4in6/6in4 cases in my previous patch. Let's fix them.

Fixes: 95224166a903 ("vti[6]: fix packet tx through bpf_redirect()")
Signed-off-by: Nicolas Dichtel
Signed-off-by: Steffen Klassert

Nicolas Dichtel
2020-02-06 20:27:30 +0800

30 Jan, 2020

2 commits

31484d56c mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6

If CONFIG_MPTCP=y, CONFIG_MPTCP_IPV6=n, and CONFIG_IPV6=m:

ERROR: "mptcp_handle_ipv6_mapped" [net/ipv6/ipv6.ko] undefined!

This does not happen if CONFIG_MPTCP_IPV6=y, as CONFIG_MPTCP_IPV6
selects CONFIG_IPV6, and thus forces CONFIG_IPV6 builtin.

As exporting a symbol for an empty function would be a bit wasteful, fix
this by providing a dummy version of mptcp_handle_ipv6_mapped() for the
CONFIG_MPTCP_IPV6=n case.

Rename mptcp_handle_ipv6_mapped() to mptcpv6_handle_mapped(), to make it
clear this is a pure-IPV6 function, just like mptcpv6_init().
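
The dummy-version pattern can be sketched in user space as follows. This is an
illustration only: MODEL_CONFIG_MPTCP_IPV6, struct model_sock and the model_*
functions are hypothetical stand-ins, and the body of the "real" variant is
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct model_sock {
	bool mapped_ops;
};

/* Define this to model the option being enabled. */
/* #define MODEL_CONFIG_MPTCP_IPV6 1 */

#ifdef MODEL_CONFIG_MPTCP_IPV6
/* Variant built only when the option is on (body invented for the sketch). */
static void model_mptcpv6_handle_mapped(struct model_sock *sk, bool mapped)
{
	sk->mapped_ops = mapped;
}
#else
/* Dummy version: empty inline body, so no symbol has to be exported. */
static inline void model_mptcpv6_handle_mapped(struct model_sock *sk,
					       bool mapped)
{
	(void)sk;
	(void)mapped;
}
#endif

/* The caller needs no #ifdef of its own; it compiles either way. */
static void model_connect_mapped(struct model_sock *sk)
{
	model_mptcpv6_handle_mapped(sk, true);
}

/* Usage: with the option left undefined, the call is a no-op. */
static void model_stub_demo(void)
{
	struct model_sock sk = { .mapped_ops = false };

	model_connect_mapped(&sk);
#ifdef MODEL_CONFIG_MPTCP_IPV6
	assert(sk.mapped_ops);
#else
	assert(!sk.mapped_ops);
#endif
}
```

Because the dummy is an inline function in the same header as the declaration,
the caller links in every configuration without a separately exported symbol.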

Fixes: cec37a6e41aae7bf ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Geert Uytterhoeven
Signed-off-by: David S. Miller

Geert Uytterhoeven
2020-01-30 17:55:54 +0800
ae2dd7164 mptcp: handle tcp fallback when using syn cookies

We can't deal with syncookie mode yet: the syncookie rx path creates a plain
tcp reqsk, so we get an OOB access because we treat that tcp reqsk as an mptcp reqsk:

TCP: SYN flooding on port 20002. Sending cookies.
BUG: KASAN: slab-out-of-bounds in subflow_syn_recv_sock+0x451/0x4d0 net/mptcp/subflow.c:191
Read of size 1 at addr ffff8881167bc148 by task syz-executor099/2120
subflow_syn_recv_sock+0x451/0x4d0 net/mptcp/subflow.c:191
tcp_get_cookie_sock+0xcf/0x520 net/ipv4/syncookies.c:209
cookie_v6_check+0x15a5/0x1e90 net/ipv6/syncookies.c:252
tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:1123 [inline]
[..]

Bug can be reproduced via "sysctl net.ipv4.tcp_syncookies=2".

Note that MPTCP should work with syncookies (4th ack would carry needed
state), but it appears better to sort that out in -next so do tcp
fallback for now.

I removed the MPTCP ifdef for tcp_rsk "is_mptcp" member because
if (IS_ENABLED()) is easier to read than "#ifdef IS_ENABLED()/#endif" pair.
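
The readability argument can be illustrated with a user-space imitation of the
macro. This is a simplified sketch, not the kernel's definition: it only
recognises options defined to 1, it passes one extra dummy argument so that it
stays within standard C, and every MODEL_*/model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified imitation of IS_ENABLED(): 1 if the option is defined to 1. */
#define MODEL_ARG_PLACEHOLDER_1 0,
#define model_take_second_arg(ignored, val, ...) val
#define model_is_defined(x) model_is_defined_2(x)
#define model_is_defined_2(val) model_is_defined_3(MODEL_ARG_PLACEHOLDER_##val)
#define model_is_defined_3(arg1_or_junk) \
	model_take_second_arg(arg1_or_junk 1, 0, 0)
#define MODEL_IS_ENABLED(option) model_is_defined(option)

#define MODEL_CONFIG_MPTCP 1	/* models the option being enabled */
/* MODEL_CONFIG_OTHER is deliberately left undefined. */

/* The member is always present, so no #ifdef is needed around it. */
struct model_reqsk {
	bool is_mptcp;
};

/* Ordinary C condition; the compiler drops the test when the option is off. */
static bool model_req_is_mptcp(const struct model_reqsk *req)
{
	return MODEL_IS_ENABLED(MODEL_CONFIG_MPTCP) && req->is_mptcp;
}

/* Same shape for an option that is not set: always false. */
static bool model_req_is_other(const struct model_reqsk *req)
{
	return MODEL_IS_ENABLED(MODEL_CONFIG_OTHER) && req->is_mptcp;
}

/* Usage: only the enabled option lets the member decide the result. */
static void model_enabled_demo(void)
{
	struct model_reqsk req = { .is_mptcp = true };

	assert(model_req_is_mptcp(&req));
	assert(!model_req_is_other(&req));
}
```

Both branches are still parsed and type-checked in every configuration, which
is one reason the if () form tends to be preferred over a preprocessor block.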

Cc: Eric Dumazet
Fixes: cec37a6e41aae7bf ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Reported-by: Christoph Paasch
Tested-by: Christoph Paasch
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2020-01-30 00:45:20 +0800