23 Jan, 2019
2 commits
-
[ Upstream commit 4a06fa67c4da20148803525151845276cdb995c1 ]
Commit 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call
pskb_may_pull") avoided a read beyond the end of the skb linear
segment by calling pskb_may_pull.That function can trigger a BUG_ON in pskb_expand_head if the skb is
shared, which it is when when peeking. It can also return ENOMEM.Avoid both by switching to safer skb_header_pointer.
Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
Reported-by: syzbot
Suggested-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 7d033c9f6a7fd3821af75620a0257db87c2b552a ]
This patch makes sure the flow label in the IPv6 header
forged in ipv6_local_error() is initialized.BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675
kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
_copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
copy_to_user include/linux/uaccess.h:177 [inline]
move_addr_to_user+0x2e9/0x4f0 net/socket.c:227
___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284
__sys_recvmsg net/socket.c:2327 [inline]
__do_sys_recvmsg net/socket.c:2337 [inline]
__se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
__x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4
R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffffUninit was stored to memory at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
kmsan_save_stack mm/kmsan/kmsan.c:219 [inline]
kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439
__msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200
ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475
udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335
inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830
sock_recvmsg_nosec net/socket.c:794 [inline]
sock_recvmsg+0x1d1/0x230 net/socket.c:801
___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278
__sys_recvmsg net/socket.c:2327 [inline]
__do_sys_recvmsg net/socket.c:2337 [inline]
__se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
__x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
slab_post_alloc_hook mm/slab.h:446 [inline]
slab_alloc_node mm/slub.c:2759 [inline]
__kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
__kmalloc_reserve net/core/skbuff.c:137 [inline]
__alloc_skb+0x309/0xa20 net/core/skbuff.c:205
alloc_skb include/linux/skbuff.h:998 [inline]
ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334
__ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311
ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775
udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384
inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg net/socket.c:631 [inline]
__sys_sendto+0x8c4/0xac0 net/socket.c:1788
__do_sys_sendto net/socket.c:1800 [inline]
__se_sys_sendto+0x107/0x130 net/socket.c:1796
__x64_sys_sendto+0x6e/0x90 net/socket.c:1796
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7Bytes 4-7 of 28 are uninitialized
Memory access of size 28 starts at ffff8881937bfce0
Data copied to user address 0000000020000000Signed-off-by: Eric Dumazet
Reported-by: syzbot
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
28 Jul, 2018
1 commit
-
[ Upstream commit 2efd4fca703a6707cad16ab486eaab8fc7f0fd49 ]
Syzbot reported a read beyond the end of the skb head when returning
IPV6_ORIGDSTADDR:BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
copy_to_user include/linux/uaccess.h:184 [inline]
put_cmsg+0x5ef/0x860 net/core/scm.c:242
ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
[..]This logic and its ipv4 counterpart read the destination port from
the packet at skb_transport_offset(skb) + 4.With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
packet that stores headers exactly up to skb_transport_offset(skb) in
the head and the remainder in a frag.Call pskb_may_pull before accessing the pointer to ensure that it lies
in skb head.Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
26 Jun, 2018
1 commit
-
[ Upstream commit 6c206b20092a3623184cff9470dba75d21507874 ]
After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
the sk_rmem_alloc field does not measure exactly anymore the
receive queue length, because we batch the rmem release. The issue
is really apparent only after commit 0d4a6608f68c ("udp: do rmem bulk
free even if the rx sk queue is empty"): the user space can easily
check for an empty socket with not-0 queue length reported by the 'ss'
tool or the procfs interface.We need to use a custom UDP helper to report the correct queue length,
taking into account the forward allocation deficit.Reported-by: trevor.francis@46labs.com
Fixes: 6b229cf77d68 ("UDP: add batching to udp_rmem_release()")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
01 Apr, 2018
2 commits
-
[ Upstream commit 5f2fb802eee1df0810b47ea251942fe3fd36589a ]
Fixes: 2f987a76a977 ("net: ipv6: keep sk status consistent after datagram connect failure")
Signed-off-by: Stefano Brivio
Acked-by: Paolo Abeni
Acked-by: Guillaume Nault
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 2f987a76a97773beafbc615b9c4d8fe79129a7f4 ]
On unsuccesful ip6_datagram_connect(), if the failure is caused by
ip6_datagram_dst_update(), the sk peer information are cleared, but
the sk->sk_state is preserved.If the socket was already in an established status, the overall sk
status is inconsistent and fouls later checks in datagram code.Fix this saving the old peer information and restoring them in
case of failure. This also aligns ipv6 datagram connect() behavior
with ipv4.v1 -> v2:
- added missing Fixes tagFixes: 85cb73ff9b74 ("net: ipv6: reset daddr and dport in sk if connect() fails")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
01 Jul, 2017
1 commit
-
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller
25 Jun, 2017
1 commit
-
In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
error occurs.
In udp_v6_early_demux(), check for sk_state to make sure it is in
TCP_ESTABLISHED state.
Together, it makes sure unconnected UDP socket won't be considered as a
valid candidate for early demux.v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
v2: fix compilation errorFixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
Signed-off-by: Wei Wang
Acked-by: Maciej Żenczykowski
Signed-off-by: David S. Miller
18 Apr, 2017
1 commit
-
Syzkaller reported a use-after-free in ip_recv_error at line
info->ipi_ifindex = skb->dev->ifindex;
This function is called on dequeue from the error queue, at which
point the device pointer may no longer be valid.Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
pointer is valid or NULL. Store it in temporary storage skb->cb.It is safe to reference skb->dev here, as called from device drivers
or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
in that case it is NULL and ifindex is set to 0 (invalid).Do not return a pktinfo cmsg if ifindex is 0. This maintains the
current behavior of not returning a cmsg if skb->dev was NULL.On dequeue, the ipv4 path will cast from sock_exterr_skb to
in_pktinfo. Both have ifindex as their first element, so no explicit
conversion is needed. This is by design, introduced in commit
0b922b7a829c ("net: original ingress device index in PKTINFO"). For
ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.Fixes: 829ae9d61165 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
Reported-by: Andrey Konovalov
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
15 Feb, 2017
1 commit
-
This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.Signed-off-by: Jonathan T. Leighton
Signed-off-by: David S. Miller
25 Dec, 2016
1 commit
-
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)to do the replacement at the end of the merge window.
Requested-by: Al Viro
Signed-off-by: Linus Torvalds
24 Dec, 2016
1 commit
-
Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
the packet. For sockets that have transport headers pulled, transport
offset can be negative. Use signed comparison to avoid overflow.Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Reported-by: Nisar Jagabar
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
04 Dec, 2016
1 commit
-
Couple conflicts resolved here:
1) In the MACB driver, a bug fix to properly initialize the
RX tail pointer properly overlapped with some changes
to support variable sized rings.2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
overlapping with a reorganization of the driver to support
ACPI, OF, as well as PCI variants of the chip.3) In 'net' we had several probe error path bug fixes to the
stmmac driver, meanwhile a lot of this code was cleaned up
and reorganized in 'net-next'.4) The cls_flower classifier obtained a helper function in
'net-next' called __fl_delete() and this overlapped with
Daniel Borkamann's bug fix to use RCU for object destruction
in 'net'. It also overlapped with Jiri's change to guard
the rhashtable_remove_fast() call with a check against
tc_skip_sw().5) In mlx4, a revert bug fix in 'net' overlapped with some
unrelated changes in 'net-next'.6) In geneve, a stale header pointer after pskb_expand_head()
bug fix in 'net' overlapped with a large reorganization of
the same code in 'net-next'. Since the 'net-next' code no
longer had the bug in question, there was nothing to do
other than to simply take the 'net-next' hunks.Signed-off-by: David S. Miller
01 Dec, 2016
1 commit
-
Socket flags aren't updated atomically, so the socket must be locked
while reading the SOCK_ZAPPED flag.This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch
also brings error handling for __ip6_datagram_connect() failures.Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller
05 Nov, 2016
1 commit
-
- Use the UID in routing lookups made by protocol connect() and
sendmsg() functions.
- Make sure that routing lookups triggered by incoming packets
(e.g., Path MTU discovery) take the UID of the socket into
account.
- For packets not associated with a userspace socket, (e.g., ping
replies) use UID 0 inside the user namespace corresponding to
the network namespace the socket belongs to. This allows
all namespaces to apply routing and iptables rules to
kernel-originated traffic in that namespaces by matching UID 0.
This is better than using the UID of the kernel socket that is
sending the traffic, because the UID of kernel sockets created
at namespace creation time (e.g., the per-processor ICMP and
TCP sockets) is the UID of the user that created the socket,
which might not be mapped in the namespace.Tested: compiles allnoconfig, allyesconfig, allmodconfig
Tested: https://android-review.googlesource.com/253302
Signed-off-by: Lorenzo Colitti
Signed-off-by: David S. Miller
04 Nov, 2016
1 commit
-
When reading a datagram or raw packet that arrived fragmented, expose
the maximum fragment size if recorded to allow applications to
estimate receive path MTU.At this point, the field is only recorded when ipv6 connection
tracking is enabled. A follow-up patch will record this field also
in the ipv6 input path.Tested using the test for IP_RECVFRAGSIZE plus
ip netns exec to ip addr add dev veth1 fc07::1/64
ip netns exec from ip addr add dev veth0 fc07::2/64ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 &
ip netns exec from nc -q 1 -u fc07::1 6000 < payloadBoth with and without enabling connection tracking
ip6tables -A INPUT -m state --state NEW -p udp -j LOG
Signed-off-by: Willem de Bruijn
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
17 May, 2016
1 commit
-
__sock_cmsg_send() might return different error codes, not only -EINVAL.
Fixes: 24025c465f77 ("ipv4: process socket-level control messages in IPv4")
Fixes: ad1e46a83716 ("ipv6: process socket-level control messages in IPv6")
Signed-off-by: Eric Dumazet
Cc: Soheil Hassas Yeganeh
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller
04 May, 2016
1 commit
-
In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
This is not a good practice and makes it hard to add new parameters.
This fix introduces a new struct ipcm6_cookie similar to ipcm_cookie in
ipv4 and include the above mentioned variables. And we only pass the
pointer to this structure to corresponding functions. This makes it easier
to add new parameters in the future and makes the function cleaner.Signed-off-by: Wei Wang
Signed-off-by: David S. Miller
26 Apr, 2016
1 commit
-
We should call consume_skb(skb) when skb is properly consumed,
or kfree_skb(skb) when skb must be dropped in error case.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
24 Apr, 2016
1 commit
-
Conflicts were two cases of simple overlapping changes,
nothing serious.In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.Signed-off-by: David S. Miller
15 Apr, 2016
4 commits
-
This patch adds a release_cb for UDPv6. It does a route lookup
and updates sk->sk_dst_cache if it is needed. It picks up the
left-over job from ip6_sk_update_pmtu() if the sk was owned
by user during the pmtu update.It takes a rcu_read_lock to protect the __sk_dst_get() operations
because another thread may do ip6_dst_store() without taking the
sk lock (e.g. sendmsg).Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Martin KaFai Lau
Reported-by: Wei Wang
Cc: Cong Wang
Cc: Eric Dumazet
Cc: Wei Wang
Signed-off-by: David S. Miller -
There is a case in connected UDP socket such that
getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
sequence could be the following:
1. Create a connected UDP socket
2. Send some datagrams out
3. Receive a ICMPV6_PKT_TOOBIG
4. No new outgoing datagrams to trigger the sk_dst_check()
logic to update the sk->sk_dst_cache.
5. getsockopt(IPV6_MTU) returns the mtu from the invalid
sk->sk_dst_cache instead of the newly created RTF_CACHE clone.This patch updates the sk->sk_dst_cache for a connected datagram sk
during pmtu-update code path.Note that the sk->sk_v6_daddr is used to do the route lookup
instead of skb->data (i.e. iph). It is because a UDP socket can become
connected after sending out some datagrams in un-connected state. or
It can be connected multiple times to different destinations. Hence,
iph may not be related to where sk is currently connected to.It is done under '!sock_owned_by_user(sk)' condition because
the user may make another ip6_datagram_connect() (i.e changing
the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
code path.For the sock_owned_by_user(sk) == true case, the next patch will
introduce a release_cb() which will update the sk->sk_dst_cache.Test:
Server (Connected UDP Socket):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Route Details:
[root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
2fac::/64 dev eth0 proto kernel metric 256 pref medium
2fac:face::/64 via 2fac::face dev eth0 metric 1024 pref mediumA simple python code to create a connected UDP socket:
import socket
import errnoHOST = '2fac::1'
PORT = 8080s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
s.bind((HOST, PORT))
s.connect(('2fac:face::face', 53))
print("connected")
while True:
try:
data = s.recv(1024)
except socket.error as se:
if se.errno == errno.EMSGSIZE:
pmtu = s.getsockopt(41, 24)
print("PMTU:%d" % pmtu)
break
s.close()Python program output after getting a ICMPV6_PKT_TOOBIG:
[root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
connected
PMTU:1300Cache routes after recieving TOOBIG:
[root@arch-fb-vm1 ~]# ip -6 r show table cache
2fac:face::face via 2fac::face dev eth0 metric 0
cache expires 463sec mtu 1300 pref mediumClient (Send the ICMPV6_PKT_TOOBIG):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
scapy is used to generate the TOOBIG message. Here is the scapy script I have
used:>>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::
1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
>>> sendp(p, iface='qemubr0')Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Martin KaFai Lau
Reported-by: Wei Wang
Cc: Cong Wang
Cc: Eric Dumazet
Cc: Wei Wang
Signed-off-by: David S. Miller -
This patch moves the route lookup and update codes for connected
datagram sk to a newly created function ip6_datagram_dst_update()It will be reused during the pmtu update in the later patch.
Signed-off-by: Martin KaFai Lau
Cc: Cong Wang
Cc: Eric Dumazet
Cc: Wei Wang
Signed-off-by: David S. Miller -
Move flowi6 init codes for connected datagram sk to a newly created
function ip6_datagram_flow_key_init().Notes:
1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
(addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()This new function will be reused during pmtu update in the later patch.
Signed-off-by: Martin KaFai Lau
Cc: Cong Wang
Cc: Eric Dumazet
Cc: Wei Wang
Signed-off-by: David S. Miller
05 Apr, 2016
1 commit
-
Process socket-level control messages by invoking
__sock_cmsg_send in ip6_datagram_send_ctl for control messages on
the SOL_SOCKET layer.This makes sure whenever ip6_datagram_send_ctl is called for
udp and raw, we also process socket-level control messages.This is a bit uglier than IPv4, since IPv6 does not have
something like ipcm_cookie. Perhaps we can later create
a control message cookie for IPv6?Note that this commit interprets new control messages that
were ignored before. As such, this commit does not change
the behavior of IPv6 control messages.Signed-off-by: Soheil Hassas Yeganeh
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller
30 Jan, 2016
1 commit
-
Currently, the egress interface index specified via IPV6_PKTINFO
is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
can be subverted when the user space application calls connect()
before sendmsg().
Fix it by initializing properly flowi6_oif in connect() before
performing the route lookup.Signed-off-by: Paolo Abeni
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
03 Dec, 2015
1 commit
-
This patch addresses multiple problems :
UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
while socket is not locked : Other threads can change np->opt
concurrently. Dmitry posted a syzkaller
(http://github.com/google/syzkaller) program desmonstrating
use-after-free.Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
and dccp_v6_request_recv_sock() also need to use RCU protection
to dereference np->opt once (before calling ipv6_dup_options())This patch adds full RCU protection to np->opt
Reported-by: Dmitry Vyukov
Signed-off-by: Eric Dumazet
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
26 Sep, 2015
1 commit
-
This is to document that socket lock might not be held at this point.
skb_set_owner_w() and ipv6_local_error() are using proper atomic ops
or spinlocks, so we promote the socket to non const when calling them.netfilter hooks should never assume socket lock is held,
we also promote the socket to non const.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
30 Jul, 2015
1 commit
-
This patch creates sk_set_txhash and eliminates protocol specific
inet_set_txhash and ip6_set_txhash. sk_set_txhash simply sets a
random number instead of performing flow dissection. sk_set_txash
is also allowed to be called multiple times for the same socket,
we'll need this when redoing the hash for negative routing advice.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller
23 Jul, 2015
1 commit
-
Conflicts:
net/bridge/br_mdb.cbr_mdb.c conflict was a function call being removed to fix a bug in
'net' but whose signature was changed in 'net-next'.Signed-off-by: David S. Miller
16 Jul, 2015
1 commit
-
ip6_datagram_connect() is doing a lot of socket changes without
socket being locked.This looks wrong, at least for udp_lib_rehash() which could corrupt
lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.Signed-off-by: Eric Dumazet
Acked-by: Herbert Xu
Signed-off-by: David S. Miller
10 Jul, 2015
1 commit
-
Hop was always either 0 or sizeof(struct ipv6hdr).
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
24 Jun, 2015
1 commit
-
ICMP messages can trigger ICMP and local errors. In this case
serr->port is 0 and starting from Linux 4.0 we do not return
the original target address to the error queue readers.
Add function to define which errors provide addr_offset.
With this fix my ping command is not silent anymore.Fixes: c247f0534cc5 ("ip: fix error queue empty skb handling")
Signed-off-by: Julian Anastasov
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller
01 Apr, 2015
1 commit
-
The ipv6 code uses a mixture of coding styles. In some instances check for NULL
pointer is done as x == NULL and sometimes as !x. !x is preferred according to
checkpatch and this patch makes the code consistent by adopting the latter
form.No changes detected by objdiff.
Signed-off-by: Ian Morris
Signed-off-by: David S. Miller
09 Mar, 2015
1 commit
-
When reading from the error queue, msg_name and msg_control are only
populated for some errors. A new exception for empty timestamp skbs
added a false positive on icmp errors without payload.`traceroute -M udpconn` only displayed gateways that return payload
with the icmp error: the embedded network headers are pulled before
sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise.Fix this regression by refining when msg_name and msg_control
branches are taken. The solutions for the two fields are independent.msg_name only makes sense for errors that configure serr->port and
serr->addr_offset. Test the first instead of skb->len. This also fixes
another issue. saddr could hold the wrong data, as serr->addr_offset
is not initialized in some code paths, pointing to the start of the
network header. It is only valid when serr->port is set (non-zero).msg_control support differs between IPv4 and IPv6. IPv4 only honors
requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The
skb->len test can simply be removed, because skb->dev is also tested
and never true for empty skbs. IPv6 honors requests for all errors
aside from local errors and timestamps on empty skbs.In both cases, make the policy more explicit by moving this logic to
a new function that decides whether to process msg_control and that
optionally prepares the necessary fields in skb->cb[]. After this
change, the IPv4 and IPv6 paths are more similar.The last case is rxrpc. Here, simply refine to only match timestamps.
Fixes: 49ca0d8bfaf3 ("net-timestamp: no-payload option")
Reported-by: Jan Niehusmann
Signed-off-by: Willem de Bruijn----
Changes
v1->v2
- fix local origin test inversion in ip6_datagram_support_cmsg
- make v4 and v6 code paths more similar by introducing analogous
ipv4_datagram_support_cmsg
- fix compile bug in rxrpc
Signed-off-by: David S. Miller
03 Feb, 2015
1 commit
-
Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit
timestamps, this loops timestamps on top of empty packets.Doing so reduces the pressure on SO_RCVBUF. Payload inspection and
cmsg reception (aside from timestamps) are no longer possible. This
works together with a follow on patch that allows administrators to
only allow tx timestamping if it does not loop payload or metadata.Signed-off-by: Willem de Bruijn
----
Changes (rfc -> v1)
- add documentation
- remove unnecessary skb->len test (thanks to Richard Cochran)
Signed-off-by: David S. Miller
16 Jan, 2015
1 commit
-
The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
structure is defined and allocated on the stack asstruct {
struct sock_extended_err ee;
struct sockaddr_in(6) offender;
} errhdr;The second part is only initialized for certain SO_EE_ORIGIN values.
Always initialize it completely.An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
would return uninitialized bytes.Signed-off-by: Willem de Bruijn
----
Also verified that there is no padding between errhdr.ee and
errhdr.offender that could leak additional kernel data.
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
11 Dec, 2014
1 commit
-
Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating
cmsghdr from msghdr, just cleanup.Signed-off-by: Gu Zheng
Signed-off-by: David S. Miller
09 Dec, 2014
1 commit
-
Allow reading of timestamps and cmsg at the same time on all relevant
socket families. One use is to correlate timestamps with egress
device, by asking for cmsg IP_PKTINFO.on AF_INET sockets, call the relevant function (ip_cmsg_recv). To
avoid changing legacy expectations, only do so if the caller sets a
new timestamping flag SOF_TIMESTAMPING_OPT_CMSG.on AF_INET6 sockets, IPV6_PKTINFO and all other recv cmsg are already
returned for all origins. only change is to set ifindex, which is
not initialized for all error origins.In both cases, only generate the pktinfo message if an ifindex is
known. This is not the case for ACK timestamps.The difference between the protocol families is probably a historical
accident as a result of the different conditions for generating cmsg
in the relevant ip(v6)_recv_error function:ipv4: if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) {
ipv6: if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) {At one time, this was the same test bar for the ICMP/ICMP6
distinction. This is no longer true.Signed-off-by: Willem de Bruijn
----
Changes
v1 -> v2
large rewrite
- integrate with existing pktinfo cmsg generation code
- on ipv4: only send with new flag, to maintain legacy behavior
- on ipv6: send at most a single pktinfo cmsg
- on ipv6: initialize fields if not yet initializedThe recv cmsg interfaces are also relevant to the discussion of
whether looping packet headers is problematic. For v6, cmsgs that
identify many headers are already returned. This patch expands
that to v4. If it sounds reasonable, I will follow with patches1. request timestamps without payload with SOF_TIMESTAMPING_OPT_TSONLY
(http://patchwork.ozlabs.org/patch/366967/)
2. sysctl to conditionally drop all timestamps that have payload or
cmsg from users without CAP_NET_RAW.
Signed-off-by: David S. Miller
12 Nov, 2014
1 commit
-
Use the more common dynamic_debug capable net_dbg_ratelimited
and remove the LIMIT_NETDEBUG macro.All messages are still ratelimited.
Some KERN_ uses are changed to KERN_DEBUG.
This may have some negative impact on messages that were
emitted at KERN_INFO that are not not enabled at all unless
DEBUG is defined or dynamic_debug is enabled. Even so,
these messages are now _not_ emitted by default.This also eliminates the use of the net_msg_warn sysctl
"/proc/sys/net/core/warnings". For backward compatibility,
the sysctl is not removed, but it has no function. The extern
declaration of net_msg_warn is removed from sock.h and made
static in net/core/sysctl_net_core.cMiscellanea:
o Update the sysctl documentation
o Remove the embedded uses of pr_fmt
o Coalesce format fragments
o Realign argumentsSigned-off-by: Joe Perches
Signed-off-by: David S. Miller