18 Dec, 2019

1 commit

  • [ Upstream commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e ]

    This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
    as some modules currently pass a net argument without a socket to
    ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change
    ipv6_stub_impl.ipv6_dst_lookup to take net argument").

    Signed-off-by: Sabrina Dubroca
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sabrina Dubroca
     

16 Sep, 2019

1 commit

  • UDP reuseport groups can hold a mix of unconnected and connected
    sockets. Ensure that connections only receive all traffic to their
    4-tuple.

    Fast reuseport returns on the first reuseport match on the assumption
    that all matches are equal. Only if connections are present, return to
    the previous behavior of scoring all sockets.

    Record if connections are present and if so (1) treat such connected
    sockets as an independent match from the group, (2) only return
    2-tuple matches from reuseport and (3) do not return on the first
    2-tuple reuseport match to allow for a higher scoring match later.

    New field has_conns is set without locks. No other fields in the
    bitmap are modified at runtime and the field is only ever set
    unconditionally, so an RMW cannot miss a change.

    Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
    Link: http://lkml.kernel.org/r/CA+FuTSfRP09aJNYRt04SS6qj22ViiOEWaWmLAwX0psk8-PGNxw@mail.gmail.com
    Signed-off-by: Willem de Bruijn
    Acked-by: Paolo Abeni
    Acked-by: Craig Gallek
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
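
The selection rule in the commit above can be sketched in plain Python. This is a toy model, not the kernel code: the `Sock` class and the `score` and `lookup` functions are hypothetical names, the `has_conns` variable only stands in for the new field, and returning the first reuseport match is a simplification of how a socket is really picked from the group.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Sock:
    name: str
    local: Tuple[str, int]                   # bound (address, port): the 2-tuple
    peer: Optional[Tuple[str, int]] = None   # set once the socket is connected
    reuseport: bool = True

def score(sk: Sock, local, remote) -> int:
    """Return -1 for no match, 1 for a 2-tuple match, 2 for a 4-tuple match."""
    if sk.local != local:
        return -1
    if sk.peer is None:
        return 1
    return 2 if sk.peer == remote else -1

def lookup(socks: List[Sock], local, remote) -> Optional[Sock]:
    # Stand-in for the has_conns flag recorded on the reuseport group.
    has_conns = any(s.peer is not None for s in socks)
    best, best_score = None, -1
    for sk in socks:
        sc = score(sk, local, remote)
        if sc < 0:
            continue
        # Fast path: with no connected sockets, every match is equal.
        if sk.reuseport and not has_conns:
            return sk
        # Otherwise keep scoring so a later 4-tuple match can win.
        if sc > best_score:
            best, best_score = sk, sc
    return best
```

With one unconnected and one connected socket on the same port, traffic from the connected peer reaches the connected socket and everything else falls back to the unconnected one.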
     

12 Jul, 2019

1 commit

  • Willem forgot to change one of the calls to fl6_sock_lookup(),
    which can now return an error or NULL.

    syzbot reported :

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 31763 Comm: syz-executor.0 Not tainted 5.2.0-rc6+ #63
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83
    Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee
    RSP: 0018:ffff88809ba97ae0 EFLAGS: 00010207
    RAX: dffffc0000000000 RBX: ffff8880a81254b0 RCX: ffffc90008118000
    RDX: 0000000000000003 RSI: ffffffff86319a84 RDI: 000000000000001e
    RBP: ffff88809ba97c10 R08: ffff888065e9e700 R09: ffffed1015d26c80
    R10: ffffed1015d26c7f R11: ffff8880ae9363fb R12: ffff8880a8124f40
    R13: 0000000000000001 R14: fffffffffffffffe R15: ffff88809ba97b40
    FS: 00007f38e606a700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000202c0140 CR3: 00000000a026a000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    __ip6_datagram_connect+0x5e9/0x1390 net/ipv6/datagram.c:246
    ip6_datagram_connect+0x30/0x50 net/ipv6/datagram.c:269
    ip6_datagram_connect_v6_only+0x69/0x90 net/ipv6/datagram.c:281
    inet_dgram_connect+0x14a/0x2d0 net/ipv4/af_inet.c:571
    __sys_connect+0x264/0x330 net/socket.c:1824
    __do_sys_connect net/socket.c:1835 [inline]
    __se_sys_connect net/socket.c:1832 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1832
    do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x4597c9
    Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f38e6069c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004597c9
    RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003
    RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f38e606a6d4
    R13: 00000000004bfd07 R14: 00000000004d1838 R15: 00000000ffffffff
    Modules linked in:
    RIP: 0010:ip6_datagram_dst_update+0x559/0xc30 net/ipv6/datagram.c:83
    Code: 00 00 e8 ea 29 3f fb 4d 85 f6 0f 84 96 04 00 00 e8 dc 29 3f fb 49 8d 7e 20 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 16 06 00 00 4d 8b 6e 20 e8 b4 29 3f fb 4c 89 ee

    Fixes: 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist")
    Signed-off-by: Eric Dumazet
    Acked-by: Willem de Bruijn
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

20 May, 2019

1 commit


10 Jan, 2019

2 commits

  • This patch makes sure the flow label in the IPv6 header
    forged in ipv6_local_error() is initialized.

    BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x173/0x1d0 lib/dump_stack.c:113
    kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
    kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675
    kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
    _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
    copy_to_user include/linux/uaccess.h:177 [inline]
    move_addr_to_user+0x2e9/0x4f0 net/socket.c:227
    ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284
    __sys_recvmsg net/socket.c:2327 [inline]
    __do_sys_recvmsg net/socket.c:2337 [inline]
    __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
    __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
    do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7
    RIP: 0033:0x457ec9
    Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
    RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
    RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005
    RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4
    R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff

    Uninit was stored to memory at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
    kmsan_save_stack mm/kmsan/kmsan.c:219 [inline]
    kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439
    __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200
    ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475
    udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335
    inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830
    sock_recvmsg_nosec net/socket.c:794 [inline]
    sock_recvmsg+0x1d1/0x230 net/socket.c:801
    ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278
    __sys_recvmsg net/socket.c:2327 [inline]
    __do_sys_recvmsg net/socket.c:2337 [inline]
    __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
    __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
    do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
    kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
    kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
    kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
    slab_post_alloc_hook mm/slab.h:446 [inline]
    slab_alloc_node mm/slub.c:2759 [inline]
    __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
    __kmalloc_reserve net/core/skbuff.c:137 [inline]
    __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
    alloc_skb include/linux/skbuff.h:998 [inline]
    ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334
    __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311
    ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775
    udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384
    inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798
    sock_sendmsg_nosec net/socket.c:621 [inline]
    sock_sendmsg net/socket.c:631 [inline]
    __sys_sendto+0x8c4/0xac0 net/socket.c:1788
    __do_sys_sendto net/socket.c:1800 [inline]
    __se_sys_sendto+0x107/0x130 net/socket.c:1796
    __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
    do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    Bytes 4-7 of 28 are uninitialized
    Memory access of size 28 starts at ffff8881937bfce0
    Data copied to user address 0000000020000000

    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Commit 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call
    pskb_may_pull") avoided a read beyond the end of the skb linear
    segment by calling pskb_may_pull.

    That function can trigger a BUG_ON in pskb_expand_head if the skb is
    shared, which it is when peeking. It can also return ENOMEM.

    Avoid both by switching to safer skb_header_pointer.

    Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
    Reported-by: syzbot
    Suggested-by: Eric Dumazet
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
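
The difference between pulling data into the linear area and copying it out can be illustrated with a small Python model. The `header_pointer` function below is a hypothetical stand-in for the idea behind skb_header_pointer, not its real signature: it reads bytes that may straddle the head and the fragments, never modifies the buffer, and reports a short packet by returning None.

```python
from typing import List, Optional

def header_pointer(head: bytes, frags: List[bytes],
                   offset: int, length: int) -> Optional[bytes]:
    """Copy `length` bytes starting at `offset` without modifying the buffer.

    Returns None when the packet is too short, instead of reallocating
    the linear area the way a pull operation would.
    """
    if offset < 0 or length < 0:
        return None
    total = len(head) + sum(len(f) for f in frags)
    if offset + length > total:
        return None
    if offset + length <= len(head):
        return head[offset:offset + length]   # fast path: already linear
    flat = head + b"".join(frags)             # slow path: copy out
    return flat[offset:offset + length]
```

Because nothing is reallocated, this style of access is safe on a shared buffer and has no allocation failure to report.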
     

08 Nov, 2018

1 commit

  • Ensure an unbound datagram skt is chosen when not in a VRF. The check
    for a device match in compute_score() for UDP must be performed when
    there is no device match. For this, a failure is returned when there is
    no device match. This ensures that bound sockets are never selected,
    even if there is no unbound socket.

    Allow IPv6 packets to be sent over a datagram skt bound to a VRF. These
    packets are currently blocked, as flowi6_oif was set to that of the
    master vrf device, and the ipi6_ifindex is that of the slave device.
    Allow these packets to be sent by checking the device with ipi6_ifindex
    has the same L3 scope as that of the bound device of the skt, which is
    the master vrf device. Note that this check always succeeds if the skt
    is unbound.

    Even though the right datagram skt is now selected by compute_score(),
    a different skt is being returned that is bound to the wrong vrf. The
    difference between these and stream sockets is the handling of the skt
    option for SO_REUSEPORT. While the handling when adding a skt for reuse
    correctly checks that the bound device of the skt is a match, the skts
    in the hashslot are already incorrect. So for the same hash, a skt for
    the wrong vrf may be selected for the required port. The root cause is
    that the skt is immediately placed into a slot when it is created,
    but when the skt is then bound using SO_BINDTODEVICE, it remains in the
    same slot. The solution is to move the skt to the correct slot by
    forcing a rehash.

    Signed-off-by: Mike Manning
    Reviewed-by: David Ahern
    Tested-by: David Ahern
    Signed-off-by: David S. Miller

    Mike Manning
     

02 Aug, 2018

1 commit


30 Jul, 2018

1 commit


25 Jul, 2018

2 commits

  • David S. Miller
     
  • Syzbot reported a read beyond the end of the skb head when returning
    IPV6_ORIGDSTADDR:

    BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
    CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:113
    kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
    kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
    kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
    copy_to_user include/linux/uaccess.h:184 [inline]
    put_cmsg+0x5ef/0x860 net/core/scm.c:242
    ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
    ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
    rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
    [..]

    This logic and its ipv4 counterpart read the destination port from
    the packet at skb_transport_offset(skb) + 4.

    With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
    packet that stores headers exactly up to skb_transport_offset(skb) in
    the head and the remainder in a frag.

    Call pskb_may_pull before accessing the pointer to ensure that it lies
    in skb head.

    Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
    Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

07 Jul, 2018

1 commit

  • ipcm_cookie includes sockcm_cookie. Do the same for ipcm6_cookie.

    This reduces the number of arguments that need to be passed around,
    applies ipcm6_init to all cookie fields at once and reduces code
    differentiation between ipv4 and ipv6.

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

09 Jun, 2018

1 commit

  • After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
    the sk_rmem_alloc field no longer measures the receive queue length
    exactly, because the rmem release is batched. The issue only becomes
    apparent after commit 0d4a6608f68c ("udp: do rmem bulk free even if
    the rx sk queue is empty"): user space can easily observe an empty
    socket with a non-zero queue length reported by the 'ss' tool or the
    procfs interface.

    We need to use a custom UDP helper to report the correct queue length,
    taking into account the forward allocation deficit.

    Reported-by: trevor.francis@46labs.com
    Fixes: 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
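
A toy Python model of why the raw counter over-reports once release is batched, and how subtracting the deficit corrects it. The class, its method names and the threshold value are illustrative assumptions; only the idea of a forward allocation deficit comes from the commit message.

```python
class UdpSockModel:
    """Toy model of batched receive-memory release (names are illustrative)."""

    def __init__(self, batch_threshold: int):
        self.rmem_alloc = 0        # memory charged to the receive queue
        self.forward_deficit = 0   # dequeued bytes not yet released
        self.batch_threshold = batch_threshold

    def enqueue(self, size: int) -> None:
        self.rmem_alloc += size

    def dequeue(self, size: int) -> None:
        # Release is deferred until enough bytes have accumulated.
        self.forward_deficit += size
        if self.forward_deficit >= self.batch_threshold:
            self.rmem_alloc -= self.forward_deficit
            self.forward_deficit = 0

    def rqueue_len(self) -> int:
        # Corrected report: subtract what was read but not yet released.
        return self.rmem_alloc - self.forward_deficit
```

After a single small datagram is queued and read, `rmem_alloc` is still non-zero while `rqueue_len()` reports an empty queue.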
     

04 Apr, 2018

1 commit

  • Move commonly used pattern of ip6_dst_store() usage to a separate
    function - ip6_sk_dst_store_flow(), which will check the addresses
    for equality using the flow information, before saving them.

    There are no functional changes in this patch. In addition, it will
    be used in the next patch, in ip6_sk_dst_lookup_flow().

    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller

    Alexey Kodanev
     

23 Mar, 2018

1 commit

  • Fun set of conflict resolutions here...

    For the mac80211 stuff, these were fortunately just parallel
    adds. Trivially resolved.

    In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
    function phy_disable_interrupts() earlier in the file, whilst in
    'net-next' the phy_error() call from this function was removed.

    In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
    'rt_table_id' member of rtable collided with a bug fix in 'net' that
    added a new struct member "rt_mtu_locked" which needs to be copied
    over here.

    The mlxsw driver conflict consisted of net-next separating
    the span code and definitions into separate files, whilst
    a 'net' bug fix made some changes to that moved code.

    The mlx5 infiniband conflict resolution was quite non-trivial,
    the RDMA tree's merge commit was used as a guide here, and
    here are their notes:

    ====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch. This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
    drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
    (IB/mlx5: Fix cleanup order on unload) added to for-rc and
    commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
    add as part of the devel cycle both needed to modify the
    init/de-init functions used by mlx5. To support the new
    representors, the new functions added by the cleanup patch
    needed to be made non-static, and the init/de-init list
    added by the representors patch needed to be modified to
    match the init/de-init list changes made by the cleanup
    patch.
    Updates:
    drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
    prototypes added by representors patch to reflect new function
    names as changed by cleanup patch
    drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
    stage list to match new order from cleanup patch
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

21 Mar, 2018

1 commit


16 Mar, 2018

1 commit

  • ipv6_chk_addr_and_flags determines if an address is a local address and
    optionally if it is an address on a specific device. For example, it is
    called by ip6_route_info_create to determine if a given gateway address
    is a local address. The address check currently does not consider L3
    domains and as a result does not allow a route to be added in one VRF
    if the nexthop points to an address in a second VRF. e.g.,

    $ ip route add 2001:db8:1::/64 vrf r2 via 2001:db8:102::23
    Error: Invalid gateway address.

    where 2001:db8:102::23 is an address on an interface in vrf r1.

    ipv6_chk_addr_and_flags needs to allow callers to always pass in a device
    with a separate argument to not limit the address to the specific device.
    The device is used to determine the L3 domain of interest.

    To that end add an argument to skip the device check and update callers
    to always pass a device where possible and use the new argument to mean
    any address in the domain.

    Update a handful of users of ipv6_chk_addr with a NULL dev argument. This
    patch handles the change to these callers without adding the domain check.

    ip6_validate_gw needs to handle 2 cases - one where the device is given
    as part of the nexthop spec and the other where the device is resolved.
    There is at least 1 VRF case where deferring the check to only after
    the route lookup has resolved the device fails with an unintuitive error
    "RTNETLINK answers: No route to host" as opposed to the preferred
    "Error: Gateway can not be a local address." The 'no route to host'
    error is because of the fallback to a full lookup. The check is done
    twice to avoid this error.

    Signed-off-by: David Ahern
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    David Ahern
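
The domain-aware address check can be sketched as follows. This is a simplified Python model under stated assumptions: `chk_addr`, the `ifaddrs` list and the `l3_master` map are hypothetical, flags are left out entirely, and a call with no device skips the domain check, mirroring the note about callers that pass a NULL dev.

```python
from typing import Dict, List, Optional, Tuple

def chk_addr(addr: str,
             dev: Optional[str],
             skip_dev_check: bool,
             ifaddrs: List[Tuple[str, str]],
             l3_master: Dict[str, Optional[str]]) -> bool:
    """Return True if `addr` is local within the L3 domain implied by `dev`.

    ifaddrs   -- (address, device) pairs configured on the host
    l3_master -- device -> VRF master device, or None for the default domain
    """
    domain = l3_master.get(dev) if dev is not None else None
    for ifa_addr, ifa_dev in ifaddrs:
        if ifa_addr != addr:
            continue
        if dev is not None and l3_master.get(ifa_dev) != domain:
            continue   # the address lives in a different L3 domain
        if dev is None or skip_dev_check or ifa_dev == dev:
            return True
    return False
```

In the example from the commit message, the gateway address sits on an interface in vrf r1, so a check made from a device in vrf r2 does not treat it as local and the route can be added.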
     

13 Mar, 2018

1 commit

  • On unsuccessful ip6_datagram_connect(), if the failure is caused by
    ip6_datagram_dst_update(), the sk peer information is cleared, but
    the sk->sk_state is preserved.

    If the socket was already in an established status, the overall sk
    status is inconsistent and fouls later checks in datagram code.

    Fix this by saving the old peer information and restoring it in
    case of failure. This also aligns ipv6 datagram connect() behavior
    with ipv4.

    v1 -> v2:
    - added missing Fixes tag

    Fixes: 85cb73ff9b74 ("net: ipv6: reset daddr and dport in sk if connect() fails")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
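
The save-and-restore pattern described above, as a minimal Python sketch. `DgramSock`, `datagram_connect` and the `dst_update` callback are hypothetical names; the callback stands in for the route lookup that may fail.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class DgramSock:
    daddr: Optional[str] = None
    dport: int = 0
    state: str = "CLOSE"

def datagram_connect(sk: DgramSock, peer: Tuple[str, int],
                     dst_update: Callable[[DgramSock], int]) -> int:
    """Connect `sk` to `peer`; on failure leave the old peer untouched."""
    old_daddr, old_dport = sk.daddr, sk.dport    # save before overwriting
    sk.daddr, sk.dport = peer
    err = dst_update(sk)                         # stands in for the route lookup
    if err:
        sk.daddr, sk.dport = old_daddr, old_dport   # restore, do not clear
        return err
    sk.state = "ESTABLISHED"
    return 0
```

A failed second connect() then leaves the socket established towards its first peer, so the peer fields and the state stay consistent.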
     

09 Jan, 2018

1 commit

  • Allow a process bound to a VRF to connect to a linklocal address.
    Currently, this fails because of a mismatch between the scope of the
    linklocal address and the sk_bound_dev_if inherited by the VRF binding:
    $ ssh -6 fe80::70b8:cff:fedd:ead8%eth1
    ssh: connect to host fe80::70b8:cff:fedd:ead8%eth1 port 22: Invalid argument

    Relax the scope check to allow the socket to be bound to the same L3
    device as the scope id.

    This makes ipv6 linklocal consistent with other relaxed checks enabled
    by commits 1ff23beebdd3 ("net: l3mdev: Allow send on enslaved interface")
    and 7bb387c5ab12a ("net: Allow IP_MULTICAST_IF to set index to L3 slave").

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
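
The relaxed scope check can be modelled in a few lines of Python. The `scope_ok` function and the `l3_master` map are assumptions made for illustration; device names such as "vrf-red" are invented.

```python
from typing import Dict, Optional

def scope_ok(bound_dev: Optional[str], scope_dev: str,
             l3_master: Dict[str, str]) -> bool:
    """Decide whether a link-local scope id is acceptable for a socket.

    bound_dev -- device the socket is bound to, or None if unbound
    scope_dev -- device named by the address scope id (e.g. "eth1")
    l3_master -- hypothetical map from enslaved device to its VRF master
    """
    if bound_dev is None or bound_dev == scope_dev:
        return True
    # Relaxed check: accept when the socket is bound to the L3 master
    # device that the scope device is enslaved to.
    return l3_master.get(scope_dev) == bound_dev
```

A socket bound to the VRF that owns eth1 passes the check, while a socket bound to an unrelated VRF is still rejected.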
     

01 Jul, 2017

1 commit

  • The refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    This patch uses refcount_inc_not_zero() instead of
    atomic_inc_not_zero_hint() due to the absence of a _hint()
    version of the refcount API. If the hint() version must
    be used, we might need to revisit the API.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
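
The inc-not-zero semantics mentioned above can be illustrated with a toy Python counter. This is only a model: the `RefCount` class is hypothetical, a lock stands in for the atomic operation, and the overflow protection of the real refcount type is not reproduced.

```python
import threading

class RefCount:
    """Toy reference counter with an inc-not-zero operation."""

    def __init__(self, initial: int = 1):
        self._value = initial
        self._lock = threading.Lock()   # stands in for an atomic compare-and-swap

    def inc_not_zero(self) -> bool:
        """Take a reference unless the object is already being freed."""
        with self._lock:
            if self._value == 0:
                return False
            self._value += 1
            return True

    def dec_and_test(self) -> bool:
        """Drop a reference; return True when the count reaches zero."""
        with self._lock:
            if self._value == 0:
                raise RuntimeError("refcount underflow")
            self._value -= 1
            return self._value == 0
```

Once the count has dropped to zero, a late lookup can no longer take a reference, which is the property that prevents use-after-free.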
     

25 Jun, 2017

1 commit

  • In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
    an error occurs.
    In udp_v6_early_demux(), check for sk_state to make sure it is in
    TCP_ESTABLISHED state.
    Together, this makes sure an unconnected UDP socket won't be considered
    a valid candidate for early demux.

    v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
    v2: fix compilation error

    Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
    Signed-off-by: Wei Wang
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Wei Wang
     

18 Apr, 2017

1 commit

  • Syzkaller reported a use-after-free in ip_recv_error at line

    info->ipi_ifindex = skb->dev->ifindex;

    This function is called on dequeue from the error queue, at which
    point the device pointer may no longer be valid.

    Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
    pointer is valid or NULL. Store it in temporary storage skb->cb.

    It is safe to reference skb->dev here, as called from device drivers
    or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
    in that case it is NULL and ifindex is set to 0 (invalid).

    Do not return a pktinfo cmsg if ifindex is 0. This maintains the
    current behavior of not returning a cmsg if skb->dev was NULL.

    On dequeue, the ipv4 path will cast from sock_exterr_skb to
    in_pktinfo. Both have ifindex as their first element, so no explicit
    conversion is needed. This is by design, introduced in commit
    0b922b7a829c ("net: original ingress device index in PKTINFO"). For
    ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.

    Fixes: 829ae9d61165 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
    Reported-by: Andrey Konovalov
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

15 Feb, 2017

1 commit

  • This patch adds a check on the type of the source address for the case
    where the destination address is in6addr_any. If the source is an
    IPv4-mapped IPv6 source address, the destination is changed to
    ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
    is done in three locations to handle UDP calls to either connect() or
    sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
    handling an in6addr_any destination until very late, so the patch only
    needs to handle the case where the source is an IPv4-mapped IPv6
    address.

    Signed-off-by: Jonathan T. Leighton
    Signed-off-by: David S. Miller

    Jonathan T. Leighton
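
The destination rewrite rule described above, sketched with Python's standard `ipaddress` module. The function name `rewrite_any_destination` is hypothetical and the sketch works on address strings only; it does not touch real sockets.

```python
import ipaddress

def rewrite_any_destination(src: str, dst: str) -> str:
    """Map an unspecified destination (::) to the matching loopback address."""
    dst_addr = ipaddress.IPv6Address(dst)
    if not dst_addr.is_unspecified:
        return str(dst_addr)                 # ordinary destination: unchanged
    if ipaddress.IPv6Address(src).ipv4_mapped is not None:
        return "::ffff:127.0.0.1"            # IPv4-mapped source: IPv4 loopback
    return "::1"                             # otherwise: IPv6 loopback
```

The choice of loopback follows the address family the source really belongs to, so an IPv4-mapped socket is never paired with a native IPv6 destination.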
     

25 Dec, 2016

1 commit


24 Dec, 2016

1 commit

  • Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
    the packet. For sockets that have transport headers pulled, transport
    offset can be negative. Use signed comparison to avoid overflow.

    Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
    Reported-by: Nisar Jagabar
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
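
Why the signedness of the comparison matters can be shown with plain integer arithmetic. In this Python sketch the 32-bit mask imitates converting a negative offset to an unsigned value; the function names are hypothetical, and the `+ 4` is an assumption standing for the four bytes of the port range, not a quote of the kernel expression.

```python
U32 = 0xFFFFFFFF

def in_range_unsigned(transport_offset: int, skb_len: int) -> bool:
    """Mimic a comparison where the left side becomes a 32-bit unsigned value."""
    return ((transport_offset + 4) & U32) <= (skb_len & U32)

def in_range_signed(transport_offset: int, skb_len: int) -> bool:
    """Compare as signed integers, so a negative offset stays negative."""
    return transport_offset + 4 <= skb_len
```

With a transport offset of -8, the unsigned form wraps to a huge value and wrongly reports the range as lying outside the packet, while the signed form accepts it.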
     

04 Dec, 2016

1 commit

  • Couple conflicts resolved here:

    1) In the MACB driver, a bug fix to properly initialize the
    RX tail pointer overlapped with some changes
    to support variable sized rings.

    2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
    overlapping with a reorganization of the driver to support
    ACPI, OF, as well as PCI variants of the chip.

    3) In 'net' we had several probe error path bug fixes to the
    stmmac driver, meanwhile a lot of this code was cleaned up
    and reorganized in 'net-next'.

    4) The cls_flower classifier obtained a helper function in
    'net-next' called __fl_delete() and this overlapped with
    Daniel Borkmann's bug fix to use RCU for object destruction
    in 'net'. It also overlapped with Jiri's change to guard
    the rhashtable_remove_fast() call with a check against
    tc_skip_sw().

    5) In mlx4, a revert bug fix in 'net' overlapped with some
    unrelated changes in 'net-next'.

    6) In geneve, a stale header pointer after pskb_expand_head()
    bug fix in 'net' overlapped with a large reorganization of
    the same code in 'net-next'. Since the 'net-next' code no
    longer had the bug in question, there was nothing to do
    other than to simply take the 'net-next' hunks.

    Signed-off-by: David S. Miller

    David S. Miller
     

01 Dec, 2016

1 commit


05 Nov, 2016

1 commit

  • - Use the UID in routing lookups made by protocol connect() and
    sendmsg() functions.
    - Make sure that routing lookups triggered by incoming packets
    (e.g., Path MTU discovery) take the UID of the socket into
    account.
    - For packets not associated with a userspace socket, (e.g., ping
    replies) use UID 0 inside the user namespace corresponding to
    the network namespace the socket belongs to. This allows
    all namespaces to apply routing and iptables rules to
    kernel-originated traffic in that namespace by matching UID 0.
    This is better than using the UID of the kernel socket that is
    sending the traffic, because the UID of kernel sockets created
    at namespace creation time (e.g., the per-processor ICMP and
    TCP sockets) is the UID of the user that created the socket,
    which might not be mapped in the namespace.

    Tested: compiles allnoconfig, allyesconfig, allmodconfig
    Tested: https://android-review.googlesource.com/253302
    Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
     

04 Nov, 2016

1 commit

  • When reading a datagram or raw packet that arrived fragmented, expose
    the maximum fragment size if recorded to allow applications to
    estimate receive path MTU.

    At this point, the field is only recorded when ipv6 connection
    tracking is enabled. A follow-up patch will record this field also
    in the ipv6 input path.

    Tested using the test for IP_RECVFRAGSIZE plus

    ip netns exec to ip addr add dev veth1 fc07::1/64
    ip netns exec from ip addr add dev veth0 fc07::2/64

    ip netns exec to ./recv_cmsg_recvfragsize -6 -u -p 6000 &
    ip netns exec from nc -q 1 -u fc07::1 6000 < payload

    Both with and without enabling connection tracking

    ip6tables -A INPUT -m state --state NEW -p udp -j LOG

    Signed-off-by: Willem de Bruijn
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

17 May, 2016

1 commit

  • __sock_cmsg_send() might return different error codes, not only -EINVAL.

    Fixes: 24025c465f77 ("ipv4: process socket-level control messages in IPv4")
    Fixes: ad1e46a83716 ("ipv6: process socket-level control messages in IPv6")
    Signed-off-by: Eric Dumazet
    Cc: Soheil Hassas Yeganeh
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 May, 2016

1 commit

  • In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
    variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
    functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
    This is not a good practice and makes it hard to add new parameters.
    This fix introduces a new struct ipcm6_cookie, similar to ipcm_cookie in
    ipv4, that includes the above mentioned variables. Only a pointer to this
    structure is passed to the corresponding functions. This makes it easier
    to add new parameters in the future and makes the functions cleaner.

    Signed-off-by: Wei Wang
    Signed-off-by: David S. Miller

    Wei Wang
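
The refactoring idea, bundling per-send parameters into one cookie, can be sketched in Python. The `Ipcm6Cookie` dataclass, its field names, the -1 defaults and the `make_skb` helper are illustrative assumptions, not the kernel definitions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Ipcm6Cookie:
    """Illustrative bundle of per-send IPv6 parameters."""
    hlimit: int = -1       # -1 here means "use the socket default"
    tclass: int = -1
    dontfrag: int = -1
    opt: Optional[dict] = None

def make_skb(payload: bytes, ipc6: Ipcm6Cookie) -> dict:
    # One cookie argument replaces four separate parameters.
    return {"len": len(payload), "hlimit": ipc6.hlimit,
            "tclass": ipc6.tclass, "dontfrag": ipc6.dontfrag}
```

Adding a new per-send parameter then means adding one field to the cookie rather than changing the signature of every function along the send path.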
     

26 Apr, 2016

1 commit


24 Apr, 2016

1 commit


15 Apr, 2016

4 commits

  • This patch adds a release_cb for UDPv6. It does a route lookup
    and updates sk->sk_dst_cache if it is needed. It picks up the
    left-over job from ip6_sk_update_pmtu() if the sk was owned
    by user during the pmtu update.

    It takes a rcu_read_lock to protect the __sk_dst_get() operations
    because another thread may do ip6_dst_store() without taking the
    sk lock (e.g. sendmsg).

    Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
    Signed-off-by: Martin KaFai Lau
    Reported-by: Wei Wang
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • There is a case with a connected UDP socket in which
    getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
    sequence could be the following:
    1. Create a connected UDP socket
    2. Send some datagrams out
    3. Receive an ICMPV6_PKT_TOOBIG
    4. No new outgoing datagrams to trigger the sk_dst_check()
    logic to update the sk->sk_dst_cache.
    5. getsockopt(IPV6_MTU) returns the mtu from the invalid
    sk->sk_dst_cache instead of the newly created RTF_CACHE clone.

    This patch updates the sk->sk_dst_cache for a connected datagram sk
    during pmtu-update code path.

    Note that the sk->sk_v6_daddr is used to do the route lookup
    instead of skb->data (i.e. iph). This is because a UDP socket can become
    connected after sending out some datagrams in an unconnected state, or
    it can be connected multiple times to different destinations. Hence,
    iph may not be related to where sk is currently connected to.

    It is done under the '!sock_owned_by_user(sk)' condition because
    the user may make another ip6_datagram_connect() call (i.e. changing
    the sk->sk_v6_daddr) while the dst lookup is happening in the pmtu-update
    code path.

    For the sock_owned_by_user(sk) == true case, the next patch will
    introduce a release_cb() which will update the sk->sk_dst_cache.

    Test:

    Server (Connected UDP Socket):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Route Details:
    [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
    2fac::/64 dev eth0 proto kernel metric 256 pref medium
    2fac:face::/64 via 2fac::face dev eth0 metric 1024 pref medium

    A simple python code to create a connected UDP socket:

    import socket
    import errno

    HOST = '2fac::1'
    PORT = 8080

    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.bind((HOST, PORT))
    s.connect(('2fac:face::face', 53))
    print("connected")
    while True:
        try:
            data = s.recv(1024)
        except socket.error as se:
            if se.errno == errno.EMSGSIZE:
                # 41 is IPPROTO_IPV6 and 24 is IPV6_MTU on Linux
                pmtu = s.getsockopt(41, 24)
                print("PMTU:%d" % pmtu)
                break
    s.close()

    Python program output after getting an ICMPV6_PKT_TOOBIG:
    [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
    connected
    PMTU:1300

    Cache routes after receiving TOOBIG:
    [root@arch-fb-vm1 ~]# ip -6 r show table cache
    2fac:face::face via 2fac::face dev eth0 metric 0
    cache expires 463sec mtu 1300 pref medium

    Client (Send the ICMPV6_PKT_TOOBIG):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    scapy is used to generate the TOOBIG message. Here is the scapy script I have
    used:

    >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
    >>> sendp(p, iface='qemubr0')

    Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
    Signed-off-by: Martin KaFai Lau
    Reported-by: Wei Wang
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • This patch moves the route lookup and update code for a connected
    datagram sk to a newly created function, ip6_datagram_dst_update().

    It will be reused during the pmtu update in the later patch.

    Signed-off-by: Martin KaFai Lau
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • Move the flowi6 init code for a connected datagram sk to a newly created
    function, ip6_datagram_flow_key_init().

    Notes:
    1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
    2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
    (addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()

    This new function will be reused during pmtu update in the later patch.

    Signed-off-by: Martin KaFai Lau
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

05 Apr, 2016

1 commit

  • Process socket-level control messages by invoking
    __sock_cmsg_send in ip6_datagram_send_ctl for control messages on
    the SOL_SOCKET layer.

    This makes sure whenever ip6_datagram_send_ctl is called for
    udp and raw, we also process socket-level control messages.

    This is a bit uglier than IPv4, since IPv6 does not have
    something like ipcm_cookie. Perhaps we can later create
    a control message cookie for IPv6?

    Note that this commit interprets new control messages that
    were ignored before. As such, this commit does not change
    the behavior of IPv6 control messages.

    Signed-off-by: Soheil Hassas Yeganeh
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Soheil Hassas Yeganeh
     

30 Jan, 2016

1 commit

  • Currently, the egress interface index specified via IPV6_PKTINFO
    is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
    can be subverted when the user space application calls connect()
    before sendmsg().
    Fix it by properly initializing flowi6_oif in connect() before
    performing the route lookup.

    Signed-off-by: Paolo Abeni
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Paolo Abeni