Eric Lee / smarc-fsl-linux-kernel

25 Aug, 2020

1 commit

fdf1923bf net: Remove duplicated midx check against 0

Checking midx against 0 is always equivalent to checking midx against
sk_bound_dev_if when sk_bound_dev_if is known to be nonzero, as in these cases.
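
The equivalence can be checked by brute force. This is a minimal userspace sketch, not the kernel code: the shape of the "old" condition and the function names are assumptions made for illustration.

```c
#include <stdbool.h>

/* Condition with the extra !midx test (shape assumed for illustration). */
static bool mismatch_old(int midx, int bound_dev_if)
{
    return bound_dev_if && (!midx || midx != bound_dev_if);
}

/* Simplified condition: the !midx test is dropped. */
static bool mismatch_new(int midx, int bound_dev_if)
{
    return bound_dev_if && midx != bound_dev_if;
}

/* Returns true if both forms agree for every pair in [-limit, limit]. */
static bool forms_agree(int limit)
{
    for (int bound = -limit; bound <= limit; bound++)
        for (int midx = -limit; midx <= limit; midx++)
            if (mismatch_old(midx, bound) != mismatch_new(midx, bound))
                return false;
    return true;
}
```

When bound_dev_if is nonzero, midx == 0 already implies midx != bound_dev_if, so the extra test never changes the result.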

Signed-off-by: Miaohe Lin
Signed-off-by: David S. Miller

Miaohe Lin
2020-08-25 21:23:59 +0800

25 Jul, 2020

8 commits

178c49d9f icmp: prepare rfc 4884 for ipv6

The RFC 4884 spec is largely the same between IPv4 and IPv6.
Factor out the IPv4 specific parts in preparation for IPv6 support:

- icmp types supported

- icmp header size, and thus offset to original datagram start

- datagram length field offset in icmp(6)hdr.

- datagram length field word size: 4B for IPv4, 8B for IPv6.
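
The last item can be sketched as a simple conversion from the length field, which counts words, to bytes. The enum and function names below are hypothetical and do not come from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* Word size of the RFC 4884 length field, as stated in the list above. */
enum {
    SKETCH_RFC4884_WORD_IPV4 = 4,
    SKETCH_RFC4884_WORD_IPV6 = 8,
};

/* Convert the length field (in words) to a length in bytes. */
static size_t sketch_rfc4884_len_bytes(uint8_t len_words, size_t word_size)
{
    return (size_t)len_words * word_size;
}
```

The same 128-byte original datagram is therefore encoded as 32 words for IPv4 and 16 words for IPv6.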

Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2020-07-25 08:12:41 +0800
a7b75c5a8 net: pass a sockptr_t into ->setsockopt

Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
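
The idea behind a sockptr_t-style argument can be sketched in userspace as a pointer tagged with its origin. This models the concept only: the sketch_ names are hypothetical, and both branches use memcpy() here, whereas the kernel would use its user-copy helper for the user branch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A pointer that records whether it refers to kernel or user memory. */
typedef struct {
    union {
        void *kernel;
        void *user; /* would be a __user pointer in the kernel */
    };
    bool is_kernel;
} sketch_sockptr_t;

static sketch_sockptr_t sketch_kernel_sockptr(void *p)
{
    sketch_sockptr_t sp;
    sp.kernel = p;
    sp.is_kernel = true;
    return sp;
}

static sketch_sockptr_t sketch_user_sockptr(void *p)
{
    sketch_sockptr_t sp;
    sp.user = p;
    sp.is_kernel = false;
    return sp;
}

/* One copy routine serves both callers; the tag selects the copy method. */
static int sketch_copy_from_sockptr(void *dst, sketch_sockptr_t src, size_t size)
{
    if (src.is_kernel) {
        memcpy(dst, src.kernel, size);
        return 0;
    }
    /* In the kernel this branch would do a checked copy from user space. */
    memcpy(dst, src.user, size);
    return 0;
}
```

Because the tag travels with the pointer, the callee no longer needs set_fs(KERNEL_DS) to make a kernel buffer pass as a user one.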

Signed-off-by: Christoph Hellwig
Acked-by: Stefan Schmidt [ieee802154]
Acked-by: Matthieu Baerts
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
89654c5fc net/ipv4: switch do_ip_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
de40a3e88 net/ipv4: merge ip_options_get and ip_options_get_from_user

Use the sockptr_t type to merge the versions.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
01ccb5b48 net/ipv4: switch ip_mroute_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
b03afaa82 bpfilter: switch bpfilter_ip_set_sockopt to sockptr_t

This is mostly to prepare for cleaning up the callers, as bpfilter by
design can't handle kernel pointers.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
c2f12630c netfilter: switch nf_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800
c6d1b26a8 net/xfrm: switch xfrm_user_policy to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:53 +0800

20 Jul, 2020

6 commits

eba75c587 icmp: support rfc 4884

Add setsockopt SOL_IP/IP_RECVERR_RFC4884 to return the offset to an
extension struct if present.

ICMP messages may include an extension structure after the original
datagram. RFC 4884 standardized this behavior. It stores the offset
in words to the extension header in u8 icmphdr.un.reserved[1].

The field is valid only for ICMP types destination unreachable, time
exceeded and parameter problem, if length is at least 128 bytes and
entire packet does not exceed 576 bytes.

Return the offset to the start of the extension struct when reading an
ICMP error from the error queue, if it matches the above constraints.

Do not return the raw u8 field. Return the offset from the start of
the user buffer, in bytes. The kernel does not return the network and
transport headers, so subtract those.

Also validate the headers. Return the offset regardless of validation,
as an invalid extension must still not be misinterpreted as part of
the original datagram. Note that !invalid does not imply valid. If
the extension version does not match, no validation can take place,
for instance.

For backward compatibility, make this optional, set by setsockopt
SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see
github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c

For forward compatibility, reserve only setsockopt value 1, leaving
other bits for additional icmp extensions.
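
The offset arithmetic described above can be sketched as follows. This follows the constraints as worded in this message for IPv4, not the kernel's actual code; the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the byte offset of the extension struct relative to the start of
 * the user buffer, or -1 if the constraints in the message are not met.
 *
 * len_words:        raw u8 length field from the ICMP header (4-byte words)
 * stripped_hdr_len: network + transport header bytes not returned to the user
 * packet_len:       length of the entire packet in bytes
 */
static int sketch_ext_offset_in_user_buf(uint8_t len_words,
                                         size_t stripped_hdr_len,
                                         size_t packet_len)
{
    size_t len_bytes = (size_t)len_words * 4;

    /* Length must be at least 128 bytes; packet must not exceed 576. */
    if (len_bytes < 128 || packet_len > 576)
        return -1;

    /* The stripped headers cannot be longer than the original datagram. */
    if (len_bytes < stripped_hdr_len)
        return -1;

    return (int)(len_bytes - stripped_hdr_len);
}
```

With a length field of 32 words and 28 bytes of stripped headers, the extension would start 100 bytes into the user buffer.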

Changes
v1->v2:
- convert word offset to byte offset from start of user buffer
- return in ee_data as u8 may be insufficient
- define extension struct and object header structs
- return len only if constraints met
- if returning len, also validate

Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2020-07-20 10:20:22 +0800
b6238c04c net/ipv4: remove compat_ip_{get,set}sockopt

Handle the few cases that need special treatment in-line using
in_compat_syscall().

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-20 09:16:41 +0800
02caad7cc net/ipv4: factor out mcast join/leave setsockopt helpers

Factor out one helper each for setting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-20 09:16:41 +0800
d62c38f6a net/ipv4: factor out MCAST_MSFILTER setsockopt helpers

Factor out one helper each for setting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-20 09:16:41 +0800
49e74c24f net/ipv4: factor out MCAST_MSFILTER getsockopt helpers

Factor out one helper each for getting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-20 09:16:41 +0800
77d4df41d netfilter: remove the compat_{get,set} methods

All instances handle compat sockopts via in_compat_syscall() now, so
remove the compat_{get,set} methods as well as the
compat_nf_{get,set}sockopt wrappers.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-20 09:16:40 +0800

29 May, 2020

5 commits

c1f9ec577 ipv4: add ip_sock_set_pktinfo

Add a helper to directly set the IP_PKTINFO sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800
2de569bda ipv4: add ip_sock_set_mtu_discover

Add a helper to directly set the IP_MTU_DISCOVER sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells [rxrpc bits]
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800
db45c0ef2 ipv4: add ip_sock_set_recverr

Add a helper to directly set the IP_RECVERR sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800
c4e446bf5 ipv4: add ip_sock_set_freebind

Add a helper to directly set the IP_FREEBIND sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800
6ebf71bab ipv4: add ip_sock_set_tos

Add a helper to directly set the IP_TOS sockopt from kernel space without
going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Acked-by: Sagi Grimberg
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800

26 May, 2020

1 commit

6a1015b0b ipv4: potential underflow in compat_ip_setsockopt()

The value of "n" is capped at 0x1ffffff but it is not checked for negative
values. I don't think this causes a problem but I'm not certain and
it's harmless to prevent it.
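
The gap can be illustrated in userspace: an upper-bound-only check accepts a negative signed count, while the same check on an unsigned count rejects it. The function names are hypothetical, and this is one common way to close such a gap rather than a statement of what the patch changed.

```c
#include <stdbool.h>

/* Signed count: a negative n slips under an upper-bound-only check. */
static bool sketch_cap_ok_signed(int n)
{
    return !(n > 0x1ffffff);
}

/* Unsigned count: a negative input wraps to a large value and is rejected. */
static bool sketch_cap_ok_unsigned(unsigned int n)
{
    return !(n > 0x1ffffff);
}
```

Passing -1 to the signed variant is accepted, while the same bit pattern passed to the unsigned variant exceeds the cap.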

Fixes: 2e04172875c9 ("ipv4: do compat setsockopt for MCAST_MSFILTER directly")
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2020-05-26 08:51:28 +0800

21 May, 2020

8 commits

b212c322c handle the group_source_req options directly

Native ->setsockopt() handling of these options (MCAST_..._SOURCE_GROUP
and MCAST_{,UN}BLOCK_SOURCE) consists of copyin + call of a helper that
does the actual work. The only change needed for ->compat_setsockopt()
is a slightly different copyin - the helpers can be reused as-is.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:32 +0800
2bbf8c1ea ipv4: take handling of group_source_req options into a helper

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:31 +0800
2f984f11f ipv[46]: do compat setsockopt for MCAST_{JOIN,LEAVE}_GROUP directly

direct parallel to the way these two are handled in the native
->setsockopt() instances - the helpers that do the real work
are already separated and can be reused as-is in this case.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:30 +0800
2e0417287 ipv4: do compat setsockopt for MCAST_MSFILTER directly

Parallel to what the native setsockopt() does, except that unlike
the native setsockopt() we do not use memdup_user() - we want
the sockaddr_storage fields properly aligned, so we allocate
4 bytes more and copy compat_group_filter at the offset 4,
which yields the proper alignments.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:28 +0800
e986d4dab set_mcast_msfilter(): take the guts of setsockopt(MCAST_MSFILTER) into a helper

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:28 +0800
0dfe6581a get rid of compat_mc_getsockopt()

now we can do MCAST_MSFILTER in compat ->getsockopt() without
playing silly buggers with copying things back and forth.
We can form a native struct group_filter (sans the variable-length
tail) on stack, pass that + pointer to the tail of original request
to the helper doing the bulk of the work, then do the rest of
copyout - same as the native getsockopt() does.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:27 +0800
931ca7ab7 ip*_mc_gsfget(): lift copyout of struct group_filter into callers

pass the userland pointer to the array in its tail, so that part
gets copied out by our functions; copyout of everything else is
done in the callers. Rationale: reuse for compat; the array
is the same in native and compat, the layout of parts before it
is different for compat.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:27 +0800
e9c375fb5 compat_ip{,v6}_setsockopt(): enumerate MCAST_... options explicitly

We want to check if optname is among the MCAST_... ones; do that as
an explicit switch.

Signed-off-by: Al Viro

Al Viro
2020-05-21 08:31:26 +0800

12 May, 2020

1 commit

1f466e1f1 net: cleanly handle kernel vs user buffers for ->msg_control

The msg_control field in struct msghdr can either contain a user
pointer when used with the recvmsg system call, or a kernel pointer
when used with sendmsg. To complicate things further kernel_recvmsg
can stuff a kernel pointer in and then use set_fs to make the uaccess
helpers accept it.

Replace it with a union of a kernel pointer msg_control field, and
a user pointer msg_control_user one, and allow kernel_recvmsg to operate
on a proper kernel pointer using a bitfield to override the normal
choice of a user pointer for recvmsg.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-12 07:59:16 +0800

26 May, 2019

1 commit

425aa0e1d ip_sockglue: Fix missing-check bug in ip_ra_control()

In ip_ra_control(), the pointer new_ra is allocated with kmalloc() and
then used in the code that follows. If the allocation fails, kmalloc()
returns NULL, so the later use of new_ra may dereference a null pointer
and crash the kernel.
Therefore, check the return value and handle the error.
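
The pattern can be sketched in userspace with malloc() standing in for kmalloc(). The struct, the function, and the fail_alloc switch (used only to simulate an allocation failure) are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the entry being allocated. */
struct sketch_ra_entry {
    int on;
};

/* Allocate an entry; report -ENOMEM instead of using a NULL pointer. */
static int sketch_ra_entry_alloc(struct sketch_ra_entry **out, int fail_alloc)
{
    struct sketch_ra_entry *e = fail_alloc ? NULL : malloc(sizeof(*e));

    if (!e)
        return -ENOMEM; /* checked before any dereference */

    e->on = 1;
    *out = e;
    return 0;
}
```

The caller sees an error code on failure and the output pointer is left untouched.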

Signed-off-by: Gen Zhang
Signed-off-by: David S. Miller

Gen Zhang
2019-05-26 02:00:50 +0800

10 Jan, 2019

1 commit

4a06fa67c ip: on queued skb use skb_header_pointer instead of pskb_may_pull

Commit 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call
pskb_may_pull") avoided a read beyond the end of the skb linear
segment by calling pskb_may_pull.

That function can trigger a BUG_ON in pskb_expand_head if the skb is
shared, which it is when peeking. It can also return ENOMEM.

Avoid both by switching to safer skb_header_pointer.
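
The difference in approach can be sketched with a userspace model of a buffer that has a linear head and one fragment. The reader below never modifies the buffer: it returns a pointer into the head when the bytes are already there and otherwise copies them into a caller-supplied scratch area. The struct and function are hypothetical and only mimic the read-only behaviour the message relies on.

```c
#include <stddef.h>

/* Simplified packet: a linear head followed by a single fragment. */
struct sketch_buf {
    const unsigned char *head;
    size_t head_len;
    const unsigned char *frag;
    size_t frag_len;
};

/*
 * Return a pointer to len bytes at offset off without changing the buffer.
 * Bytes that are not fully in the head are copied into scratch.
 * Returns NULL if the range lies beyond the packet.
 */
static const void *sketch_header_pointer(const struct sketch_buf *b,
                                         size_t off, size_t len,
                                         void *scratch)
{
    unsigned char *dst = scratch;

    if (off + len <= b->head_len)
        return b->head + off; /* already linear: no copy needed */

    if (off + len > b->head_len + b->frag_len)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        size_t pos = off + i;

        dst[i] = pos < b->head_len ? b->head[pos]
                                   : b->frag[pos - b->head_len];
    }
    return scratch;
}
```

Since nothing is reallocated or pulled, a shared buffer stays valid for other holders and no allocation can fail.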

Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
Reported-by: syzbot
Suggested-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2019-01-10 22:27:20 +0800

06 Nov, 2018

1 commit

97adaddaa net: bpfilter: fix iptables failure if bpfilter_umh is disabled

When an iptables command is executed, ip_{set/get}sockopt() tries to load
bpfilter.ko if bpfilter is enabled. If it cannot find bpfilter.ko, the
command fails.
bpfilter.ko is generated only if CONFIG_BPFILTER_UMH is enabled, but
ip_{set/get}sockopt() only checks CONFIG_BPFILTER.
So if CONFIG_BPFILTER is enabled and CONFIG_BPFILTER_UMH is disabled,
the iptables command always fails.

test config:
CONFIG_BPFILTER=y
# CONFIG_BPFILTER_UMH is not set

test command:
%iptables -L
iptables: No chain/target/match by that name.

Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Taehee Yoo
Signed-off-by: David S. Miller

Taehee Yoo
2018-11-06 09:12:18 +0800

03 Oct, 2018

1 commit

64199fc0a ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()

Caching ip_hdr(skb) before a call to pskb_may_pull() is buggy,
do not do it.
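
The hazard is the same as holding a pointer across realloc(). A minimal userspace sketch, with hypothetical names, keeps an offset instead and derives the pointer only after the operation that may move the data:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified packet with a single growable data area. */
struct sketch_pkt {
    unsigned char *data;
    size_t len;
};

/* Stand-in for an operation that may reallocate the data area. */
static int sketch_may_pull(struct sketch_pkt *p, size_t need)
{
    unsigned char *n;

    if (need <= p->len)
        return 1;

    n = realloc(p->data, need);
    if (!n)
        return 0;

    memset(n + p->len, 0, need - p->len);
    p->data = n;
    p->len = need;
    return 1;
}

/* Read one header byte; the pointer is derived after the pull, not before. */
static int sketch_read_after_pull(struct sketch_pkt *p, size_t hdr_off,
                                  size_t need, unsigned char *out)
{
    if (!sketch_may_pull(p, need))
        return -1;
    if (hdr_off >= p->len)
        return -1;

    *out = p->data[hdr_off]; /* p->data may differ from its earlier value */
    return 0;
}
```

A pointer computed before sketch_may_pull() could refer to freed memory afterwards, which is the class of bug this commit removes.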

Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
Signed-off-by: Eric Dumazet
Cc: Willem de Bruijn
Reported-by: syzbot
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller

Eric Dumazet
2018-10-03 13:32:05 +0800

25 Jul, 2018

1 commit

2efd4fca7 ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull

Syzbot reported a read beyond the end of the skb head when returning
IPV6_ORIGDSTADDR:

BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
copy_to_user include/linux/uaccess.h:184 [inline]
put_cmsg+0x5ef/0x860 net/core/scm.c:242
ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
[..]

This logic and its ipv4 counterpart read the destination port from
the packet at skb_transport_offset(skb) + 4.

With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
packet that stores headers exactly up to skb_transport_offset(skb) in
the head and the remainder in a frag.

Call pskb_may_pull before accessing the pointer to ensure that it lies
in skb head.

Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2018-07-25 07:35:58 +0800

17 Jul, 2018

1 commit

6e2059b53 ipv4/igmp: init group mode as INCLUDE when join source group

Based on RFC3376 5.1
If no interface
state existed for that multicast address before the change (i.e., the
change consisted of creating a new per-interface record), or if no
state exists after the change (i.e., the change consisted of deleting
a per-interface record), then the "non-existent" state is considered
to have a filter mode of INCLUDE and an empty source list.

Which means a new multicast group should start with state IN().

Function ip_mc_join_group() works correctly for IGMP ASM(Any-Source Multicast)
mode. It adds a group with state EX() and inits crcount to mc_qrv,
so the kernel will send a TO_EX() report message after adding group.

But for IGMPv3 SSM(Source-specific multicast) JOIN_SOURCE_GROUP mode, we
split the group joining into two steps. First we join the group like ASM,
i.e. via ip_mc_join_group(). So the state changes from IN() to EX().

Then we add the source-specific address with INCLUDE mode. So the state
changes from EX() to IN(A).

Before the first step sends a group change record, we finished the second
step. So we will only send the second change record. i.e. TO_IN(A).

According to the RFC, we should actually send an ALLOW(A) message for
SSM JOIN_SOURCE_GROUP as the state should mimic the 'IN() to IN(A)'
transition.

The issue was exposed by commit a052517a8ff65 ("net/multicast: should not
send source list records when have filter mode change"). Before this change,
we used to send both ALLOW(A) and TO_IN(A). After this change we only send
TO_IN(A).

Fix it by adding a new parameter to init group mode. Also add new wrapper
functions so we don't need to change too much code.

v1 -> v2:
In my first version I only cleared the group change record, but this is not
enough. When a new group joins, it is initialized as EXCLUDE and triggers
a filter mode change in ip/ip6_mc_add_src(), which clears all source
addresses' sf_crcount. This prevents earlier joined addresses from sending
state change records if multiple source addresses are joined at the same time.

In v2 patch, I fixed it by directly initializing the mode to INCLUDE for SSM
JOIN_SOURCE_GROUP. I also split the original patch into two separated patches
for IPv4 and IPv6.

Fixes: a052517a8ff65 ("net/multicast: should not send source list records when have filter mode change")
Reviewed-by: Stefano Brivio
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller

Hangbin Liu
2018-07-17 02:20:06 +0800

27 May, 2018

1 commit

5b79c2af6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Lots of easy overlapping changes in the conflict
resolutions here.

Signed-off-by: David S. Miller

David S. Miller
2018-05-27 07:46:15 +0800

25 May, 2018

1 commit

730c54d59 ipv4: remove warning in ip_recv_error

A precondition check in ip_recv_error triggered on an otherwise benign
race. Remove the warning.

The warning triggers when passing an ipv6 socket to this ipv4 error
handling function. RaceFuzzer was able to trigger it due to a race
in setsockopt IPV6_ADDRFORM.

---
CPU0
do_ipv6_setsockopt
sk->sk_socket->ops = &inet_dgram_ops;

---
CPU1
sk->sk_prot->recvmsg
udp_recvmsg
ip_recv_error
WARN_ON_ONCE(sk->sk_family == AF_INET6);

---
CPU0
do_ipv6_setsockopt
sk->sk_family = PF_INET;

This socket option converts a v6 socket that is connected to a v4 peer
to a v4 socket. It updates the socket on the fly, changing fields in
sk as well as other structs. This is inherently non-atomic. It races
with the lockless udp_recvmsg path.

No other code makes an assumption that these fields are updated
atomically. It is benign here, too, as ip_recv_error cares only about
the protocol of the skbs enqueued on the error queue, for which
sk_family is not a precise predictor (thanks to another issue with
IPV6_ADDRFORM).

Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
Fixes: 7ce875e5ecb8 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
Reported-by: DaeRyong Jeong
Suggested-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2018-05-25 10:16:57 +0800

24 May, 2018

1 commit

d2ba09c17 net: add skeleton of bpfilter kernel module

bpfilter.ko consists of bpfilter_kern.c (normal kernel module code)
and user mode helper code that is embedded into bpfilter.ko

The steps to build bpfilter.ko are the following:
- main.c is compiled by HOSTCC into the bpfilter_umh elf executable file
- with quite a bit of objcopy and Makefile magic the bpfilter_umh elf file
is converted into bpfilter_umh.o object file
with _binary_net_bpfilter_bpfilter_umh_start and _end symbols
Example:
$ nm ./bld_x64/net/bpfilter/bpfilter_umh.o
0000000000004cf8 T _binary_net_bpfilter_bpfilter_umh_end
0000000000004cf8 A _binary_net_bpfilter_bpfilter_umh_size
0000000000000000 T _binary_net_bpfilter_bpfilter_umh_start
- bpfilter_umh.o and bpfilter_kern.o are linked together into bpfilter.ko

bpfilter_kern.c is a normal kernel module code that calls
the fork_usermode_blob() helper to execute part of its own data
as a user mode process.

Notice that _binary_net_bpfilter_bpfilter_umh_start - end
is placed into .init.rodata section, so it's freed as soon as __init
function of bpfilter.ko is finished.
As part of __init the bpfilter.ko does first request/reply action
via two unix pipes provided by the fork_usermode_blob() helper to
make sure that umh is healthy. If not it will kill it via pid.

Later bpfilter_process_sockopt() will be called from bpfilter hooks
in get/setsockopt() to pass iptable commands into umh via bpfilter.ko

If the admin does 'rmmod bpfilter', the __exit code of bpfilter.ko will
kill umh as well.

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2018-05-24 01:23:40 +0800

23 Mar, 2018

1 commit

d9ff30497 net: Replace ip_ra_lock with per-net mutex

Since ra_chain is per-net, we may use per-net mutexes
to protect them in ip_ra_control(). This improves
scalability.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-23 03:12:56 +0800