04 Nov, 2018

1 commit

  • [ Upstream commit 84dad55951b0d009372ec21760b650634246e144 ]

    The commit eb63f2964dbe ("udp6: add missing checks on edumux packet
    processing") used the same return code convention of the ipv4 counterpart,
    but ipv6 uses the opposite one: positive values mean resubmit.

    This change addresses the issue, using positive return value for
    resubmitting. Also update the related comment, which was broken, too.

    Fixes: eb63f2964dbe ("udp6: add missing checks on edumux packet processing")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
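
The sign mismatch described above can be illustrated with a toy model. This is a sketch only, not kernel code: the function names and the literal return values are assumptions chosen for illustration, and only the sign convention is taken from the commit message.

```python
# Toy model of the resubmit convention; all names here are hypothetical.

def ipv4_should_resubmit(ret: int) -> bool:
    """IPv4 convention: a negative handler return asks for resubmission."""
    return ret < 0


def ipv6_should_resubmit(ret: int) -> bool:
    """IPv6 convention: a positive handler return asks for resubmission."""
    return ret > 0


def buggy_udp6_handler(needs_resubmit: bool) -> int:
    # Stand-in for the pre-fix behaviour: the IPv4 sign was borrowed.
    return -1 if needs_resubmit else 0


def fixed_udp6_handler(needs_resubmit: bool) -> int:
    # Stand-in for the fixed behaviour: a positive value means resubmit.
    return 1 if needs_resubmit else 0
```

With the borrowed IPv4 sign, an IPv6 caller that tests for a positive value never sees the resubmit request, which is the failure mode the commit describes.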
     

29 Sep, 2018

1 commit

  • [ Upstream commit eb63f2964dbe36f26deac77d3016791675821ded ]

    Currently the UDPv6 early demux rx code path lacks some mandatory
    checks already implemented in the normal RX code path, namely
    the checksum conversion and the no_check6_rx check.

    Similar to the previous commit, we move the common processing to
    a UDPv6-specific helper and call it from both the edemux code path
    and the normal code path. Unlike UDPv4, we need to add an
    explicit check for a non-zero csum according to the no_check6_rx value.

    Reported-by: Jianlin Shi
    Suggested-by: Xin Long
    Fixes: c9f2c1ae123a ("udp6: fix socket leak on early demux")
    Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     

26 Jun, 2018

1 commit

  • [ Upstream commit 6c206b20092a3623184cff9470dba75d21507874 ]

    After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
    the sk_rmem_alloc field no longer measures the receive queue length
    exactly, because we batch the rmem release. The issue
    is really apparent only after commit 0d4a6608f68c ("udp: do rmem bulk
    free even if the rx sk queue is empty"): user space can easily
    observe an empty socket with a non-zero queue length reported by the 'ss'
    tool or the procfs interface.

    We need to use a custom UDP helper to report the correct queue length,
    taking into account the forward allocation deficit.

    Reported-by: trevor.francis@46labs.com
    Fixes: 6b229cf77d68 ("udp: add batching to udp_rmem_release()")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
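
A minimal sketch of why the raw counter overstates the queue length once releases are batched. The class, its field names and the batching threshold are assumptions made for illustration; only the idea of subtracting the forward allocation deficit comes from the commit message.

```python
class UdpSockModel:
    """Toy accounting model; not the kernel data structure."""

    BATCH_THRESHOLD = 4096  # hypothetical value, for illustration only

    def __init__(self) -> None:
        self.rmem_alloc = 0        # memory charged to the receive queue
        self.forward_deficit = 0   # dequeued bytes not yet released

    def enqueue(self, size: int) -> None:
        self.rmem_alloc += size

    def dequeue(self, size: int) -> None:
        # The release is batched: only flush once the deficit is large enough.
        self.forward_deficit += size
        if self.forward_deficit >= self.BATCH_THRESHOLD:
            self.rmem_alloc -= self.forward_deficit
            self.forward_deficit = 0

    def naive_rqueue(self) -> int:
        # What a reader sees when it reports the raw counter.
        return self.rmem_alloc

    def rqueue(self) -> int:
        # Corrected length: subtract the not-yet-released deficit.
        return self.rmem_alloc - self.forward_deficit
```

After a single small datagram is enqueued and dequeued, `naive_rqueue()` still reports its size while `rqueue()` reports zero.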
     

19 May, 2018

1 commit

  • [ Upstream commit 69678bcd4d2dedbc3e8fcd6d7d99f283d83c531a ]

    Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
    In absence of VRF devices, after commit fb74c27735f0 ("net:
    ipv4: add second dif to udp socket lookups") the dif mismatch
    isn't fatal anymore for UDP socket lookup with non null
    sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.

    This changeset addresses the issue by making the dif match mandatory
    again in the above scenario.

    Reported-by: Damir Mansurov
    Fixes: fb74c27735f0 ("net: ipv4: add second dif to udp socket lookups")
    Fixes: 1801b570dd2a ("net: ipv6: add second dif to udp socket lookups")
    Signed-off-by: Paolo Abeni
    Acked-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
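
The intended SO_BINDTODEVICE semantics can be sketched as a predicate over device indexes. This is a simplification under stated assumptions: the function name is hypothetical, 0 stands for "not bound" or "no second index", and the real lookup also scores candidates, which is omitted here.

```python
def dev_match(bound_dev_if: int, dif: int, sdif: int) -> bool:
    """Return True if a socket may receive a packet from this device.

    bound_dev_if: device index the socket is bound to (0 = unbound)
    dif:          ingress device index of the packet
    sdif:         second device index (0 when there is none)
    """
    if bound_dev_if == 0:
        # Unbound sockets are not restricted by this check.
        return True
    # A bound socket must match one of the two indexes; a mismatch is fatal.
    return bound_dev_if in (dif, sdif)
```

For example, a socket bound to index 3 must not receive a packet arriving on index 2 when no second index applies.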
     

19 Sep, 2017

1 commit

  • While testing ESP transport mode encryption for UDPv6 packets with a
    datagram size of 1436 and an MTU of 1500, a checksum error was observed
    in the secondary fragment.

    This error occurs because the UDP payload checksum is left out
    when computing the full checksum for these packets in
    udp6_hwcsum_outgoing().

    Fixes: d39d938c8228 ("ipv6: Introduce udpv6_send_skb()")
    Signed-off-by: Subash Abhinov Kasiviswanathan
    Signed-off-by: David S. Miller

    Subash Abhinov Kasiviswanathan
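
Why leaving out part of the payload corrupts the result can be seen from the ones' complement sum that the UDP checksum is built on. The sketch below is not udp6_hwcsum_outgoing(); the function names and the split point are assumptions, and the pieces are assumed to have even length so the partial sums can be chained.

```python
def ones_complement_sum(data: bytes, acc: int = 0) -> int:
    """Add 16-bit big-endian words of data to acc with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte, as the checksum rules do
    for i in range(0, len(data), 2):
        acc += (data[i] << 8) | data[i + 1]
    while acc >> 16:
        acc = (acc & 0xFFFF) + (acc >> 16)
    return acc


def finish_checksum(acc: int) -> int:
    """Final step: the checksum is the complement of the folded sum."""
    return ~acc & 0xFFFF


payload = bytes(range(200))
first, second = payload[:96], payload[96:]  # illustrative split

full = ones_complement_sum(payload)
chained = ones_complement_sum(second, ones_complement_sum(first))
first_only = ones_complement_sum(first)
```

Here `chained` equals `full`, while `first_only` does not, so a checksum that skips the bytes carried by a later fragment will not verify at the receiver.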
     

07 Sep, 2017

1 commit

  • Pull networking updates from David Miller:

    1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

    2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

    3) Allow generic XDP to work on virtual devices, from John Fastabend.

    4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

    5) Remove UFO offloads from the tree, gave us little other than bugs.

    6) Remove the IPSEC flow cache, from Florian Westphal.

    7) Support ipv6 route offload in mlxsw driver.

    8) Support VF representors in bnxt_en, from Sathya Perla.

    9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

    10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

    11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

    12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

    13) Add new jump instructions to eBPF, from Daniel Borkmann.

    14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

    15) Support XDP in tap driver, from Jason Wang.

    16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

    17) Add Huawei hinic ethernet driver.

    18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
    i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
    i40e: avoid NVM acquire deadlock during NVM update
    drivers: net: xgene: Remove return statement from void function
    drivers: net: xgene: Configure tx/rx delay for ACPI
    drivers: net: xgene: Read tx/rx delay for ACPI
    rocker: fix kcalloc parameter order
    rds: Fix non-atomic operation on shared flag variable
    net: sched: don't use GFP_KERNEL under spin lock
    vhost_net: correctly check tx avail during rx busy polling
    net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
    rxrpc: Make service connection lookup always check for retry
    net: stmmac: Delete dead code for MDIO registration
    gianfar: Fix Tx flow control deactivation
    cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
    cxgb4: Fix pause frame count in t4_get_port_stats
    cxgb4: fix memory leak
    tun: rename generic_xdp to skb_xdp
    tun: reserve extra headroom only when XDP is set
    net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
    net: dsa: bcm_sf2: Advertise number of egress queues
    ...

    Linus Torvalds
     

04 Sep, 2017

1 commit


02 Sep, 2017

1 commit


29 Aug, 2017

1 commit

  • Twice, patches trying to constify inet{6}_protocol have been reverted:
    39294c3df2a8 ("Revert "ipv6: constify inet6_protocol structures"") to
    revert 3a3a4e3054137 and then 03157937fe0b5 ("Revert "ipv4: make
    net_protocol const"") to revert aa8db499ea67.

    Add a comment that the structures cannot be const because the
    early_demux field can change based on a sysctl.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

26 Aug, 2017

1 commit

  • Currently, in the udp6 code, the dst cookie is not initialized/updated
    concurrently with the RX dst used by early demux.

    As a result, the dst_check() in the early_demux path always fails,
    the rx dst cache is always invalidated, and we can't really
    leverage significant gain from the demux lookup.

    Fix it by adding a udp6-specific variant of sk_rx_dst_set() and using it
    to set the dst cookie when the dst entry is really changed.

    The issue has been there since the introduction of early demux for ipv6.

    Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

25 Aug, 2017

1 commit


22 Aug, 2017

1 commit


19 Aug, 2017

1 commit

  • Due to commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac ("udp: remove
    headers from UDP packets before queueing"), when udp packets are being
    peeked the requested extra offset is always 0 as there is no need to skip
    the udp header. However, when the offset is 0 and the next skb is
    of length 0, it is only returned once. The behaviour can be seen with
    the following python script:

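    from socket import *
    # f receives; g sends to it over the IPv6 loopback.
    f = socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0)
    g = socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0)
    f.bind(('::', 0))
    addr = ('::1', f.getsockname()[1])
    g.sendto(b'', addr)   # zero-length datagram first
    g.sendto(b'b', addr)
    # Peeking twice should return the same (empty) datagram both times.
    print(f.recvfrom(10, MSG_PEEK))
    print(f.recvfrom(10, MSG_PEEK))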

    Where the expected output should be the empty string twice.

    Instead, make sk_peek_offset return negative values, and pass those values
    to __skb_try_recv_datagram/__skb_try_recv_from_queue. If the passed offset
    to __skb_try_recv_from_queue is negative, the checked skb is never skipped.
    __skb_try_recv_from_queue will then ensure the offset is reset back to 0
    if a peek is requested without an offset, unless no packets are found.

    Also simplify the if condition in __skb_try_recv_from_queue. If _off is
    greater than 0, and off is greater than or equal to skb->len, then
    (_off || skb->len) must always be true assuming skb->len >= 0 is always
    true.

    Also remove a redundant check around a call to sk_peek_offset in af_unix.c,
    as it double checked if MSG_PEEK was set in the flags.

    V2:
    - Moved the negative fixup into __skb_try_recv_from_queue, and remove now
    redundant checks
    - Fix peeking in udp{,v6}_recvmsg to report the right value when the
    offset is 0

    V3:
    - Marked new branch in __skb_try_recv_from_queue as unlikely.

    Signed-off-by: Matthew Dawson
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Matthew Dawson
     

10 Aug, 2017

1 commit

  • Any use of key->enabled (that is static_key_enabled and static_key_count)
    outside jump_label_lock should handle its own serialization. The only
    two that are not doing so are the UDP encapsulation static keys. Change
    them to use static_key_enable, which now correctly tests key->enabled under
    the jump label lock.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Eric Dumazet
    Cc: Jason Baron
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1501601046-35683-3-git-send-email-pbonzini@redhat.com
    Signed-off-by: Ingo Molnar

    Paolo Bonzini
     

08 Aug, 2017

2 commits

  • Add a second device index, sdif, to inet6 socket lookups. sdif is the
    index for ingress devices enslaved to an l3mdev. It allows the lookups
    to consider the enslaved device as well as the L3 domain when searching
    for a socket.

    TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
    ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv
    tcp_v4_sdif is used.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Add a second device index, sdif, to udp socket lookups. sdif is the
    index for ingress devices enslaved to an l3mdev. It allows the lookups
    to consider the enslaved device as well as the L3 domain when searching
    for a socket.

    Early demux lookups are handled in the next patch as part of INET_MATCH
    changes.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

01 Aug, 2017

1 commit

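  • Since commit 67a51780aebb ("ipv6: udp: leverage scratch area
    helpers") udp6_recvmsg() reads the skb len from the scratch area,
    to avoid a cache miss.
    But the UDP6 rx path supports RFC 2675 UDPv6 jumbograms, and their
    length exceeds the 16 bits available in the scratch area. As a side
    effect the length returned by recvmsg() is:
    <ingress datagram len> % (1<<16)

    This commit addresses the issue by allocating one more bit in the
    IP6CB flags field and setting it for incoming jumbograms. Such
    field is still in the first cacheline, so at recvmsg() time we can
    check it and fall back to accessing skb->len if
    required, without a measurable overhead.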

    Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
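
The truncation is plain modular arithmetic on a 16-bit field. In the sketch below the function names are hypothetical, and the `is_jumbogram` flag is an assumption standing in for whatever marker the receive path uses to decide that the full length must be read instead.

```python
SCRATCH_LEN_BITS = 16  # width of the length field described in the commit


def len_from_scratch(datagram_len: int) -> int:
    """Length as seen through a 16-bit field: the value wraps modulo 2**16."""
    return datagram_len % (1 << SCRATCH_LEN_BITS)


def reported_len(datagram_len: int, is_jumbogram: bool) -> int:
    """Use the full length for jumbograms, the cached 16-bit value otherwise."""
    if is_jumbogram:
        return datagram_len
    return len_from_scratch(datagram_len)
```

A 70000-byte jumbogram read through the 16-bit field would be reported as 70000 - 65536 = 4464 bytes, while ordinary datagrams are unaffected.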
     

30 Jul, 2017

1 commit

  • When an early demuxed packet reaches __udp6_lib_lookup_skb(), the
    sk reference is retrieved and used, but the relevant reference
    count is leaked and the socket destructor is never called.
    Beyond leaking the sk memory, if there are pending UDP packets
    in the receive queue, even the related accounted memory is leaked.

    In the long run, this will cause persistent forward allocation errors
    and no UDP skbs (both ipv4 and ipv6) will be able to reach the
    user-space.

    Fix this by explicitly accessing the early demux reference before
    the lookup, and properly decreasing the socket reference count
    after usage.

    Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and
    the now obsoleted comment about "socket cache".

    The newly added code is derived from the current ipv4 code for the
    similar path.

    v1 -> v2:
    fixed the __udp6_lib_rcv() return code for resubmission,
    as suggested by Eric

    Reported-by: Sam Edwards
    Reported-by: Marc Haber
    Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
    Signed-off-by: Paolo Abeni
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Paolo Abeni
     

01 Jul, 2017

2 commits

  • The refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This avoids accidental
    refcounter overflows that might lead to use-after-free
    situations.

    This patch uses refcount_inc_not_zero() instead of
    atomic_inc_not_zero_hint() due to the absence of a _hint()
    version of the refcount API. If the hint() version must
    be used, we might need to revisit the API.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
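
The "increment unless zero" idea behind refcount_inc_not_zero() can be sketched as follows. This is a single-threaded model with hypothetical names; the real primitive is atomic and also guards against overflow, neither of which is modelled here.

```python
class RefcountModel:
    """Toy reference counter; not thread-safe and not the kernel type."""

    def __init__(self, value: int = 1) -> None:
        self.value = value

    def inc_not_zero(self) -> bool:
        # An object whose count already hit zero is being torn down:
        # refuse to take a new reference instead of resurrecting it.
        if self.value == 0:
            return False
        self.value += 1
        return True

    def dec_and_test(self) -> bool:
        # Drop one reference; True means the caller released the last one.
        self.value -= 1
        return self.value == 0
```

A lookup that finds an object with a zero count therefore fails cleanly rather than handing out a pointer to memory that is about to be freed.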
     
  • A set of overlapping changes in macvlan and the rocker
    driver, nothing serious.

    Signed-off-by: David S. Miller

    David S. Miller
     

28 Jun, 2017

1 commit

  • The commit b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
    leveraged the scratch area helpers for UDP v4 but I forgot to
    update the IPv6 code path accordingly.

    This change extends the scratch area usage to the IPv6 code, syncing
    the two implementations and giving some performance benefit.
    IPv6 is again almost on the same level as IPv4, performance-wise.

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

25 Jun, 2017

1 commit

  • In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if
    an error occurs.
    In udp_v6_early_demux(), check sk_state to make sure it is in
    TCP_ESTABLISHED state.
    Together, these make sure an unconnected UDP socket won't be considered
    a valid candidate for early demux.

    v3: add TCP_ESTABLISHED state check in udp_v6_early_demux()
    v2: fix compilation error

    Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast")
    Signed-off-by: Wei Wang
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Wei Wang
     

23 Jun, 2017

1 commit

  • Very similar to commit dd99e425be23 ("udp: prefetch
    rmem_alloc in udp_queue_rcv_skb()"), this allows saving a cache
    miss when the BH is the bottleneck for UDP over ipv6 packet
    processing, e.g. for small packets when a single RX NIC ingress
    queue is in use.

    Performance under flood when multiple NIC RX queues are used is
    unaffected, but when a single NIC rx queue is in use, this
    gives ~8% performance improvement.

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

18 Jun, 2017

1 commit

  • In udp_v4/6_early_demux() code, we try to hold dst->__refcnt for
    dst with DST_NOCACHE flag. This is because later in udp_sk_rx_dst_set()
    function, we will try to cache this dst in sk for connected case.
    However, a better way to achieve this is to not try to hold dst in
    early_demux(), but in udp_sk_rx_dst_set(), call dst_hold_safe(). This
    approach is also more consistent with how tcp is handling it. And it
    will make later changes simpler.

    Signed-off-by: Wei Wang
    Acked-by: Martin KaFai Lau
    Signed-off-by: David S. Miller

    Wei Wang
     

19 May, 2017

1 commit


18 May, 2017

1 commit

  • Since the udp memory accounting refactor, we no longer need
    to export the *udp*_queue_rcv_skb(). Make them static and fix
    a couple of sparse warnings:

    net/ipv4/udp.c:1615:5: warning: symbol 'udp_queue_rcv_skb' was not
    declared. Should it be static?
    net/ipv6/udp.c:572:5: warning: symbol 'udpv6_queue_rcv_skb' was not
    declared. Should it be static?

    Fixes: 850cbaddb52d ("udp: use it's own memory accounting schema")
    Fixes: c915fe13cbaa ("udplite: fix NULL pointer dereference")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

17 May, 2017

1 commit

  • Under UDP flood the sk_receive_queue spinlock is heavily contended.
    This patch tries to reduce the contention on that lock by adding a
    second receive queue to the udp sockets; recvmsg() looks first
    in that queue and, only if it is empty, tries to fetch the data from
    sk_receive_queue. The latter is spliced into the newly added
    queue every time the receive path has to acquire the
    sk_receive_queue lock.

    The accounting of forward allocated memory is still protected with
    the sk_receive_queue lock, so udp_rmem_release() needs to acquire
    both locks when the forward deficit is flushed.

    In specific scenarios we can end up acquiring and releasing the
    sk_receive_queue lock multiple times; that will be covered by
    the next patch.

    Suggested-by: Eric Dumazet
    Signed-off-by: Paolo Abeni
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Paolo Abeni
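
The two-queue scheme can be sketched in user space as below. This is a model under stated assumptions, not the kernel implementation: the class and attribute names are hypothetical, Python locks stand in for spinlocks, and memory accounting is left out.

```python
import threading
from collections import deque


class TwoQueueSocket:
    """Toy model: producers fill receive_queue, the reader drains reader_queue."""

    def __init__(self) -> None:
        self.receive_queue = deque()           # contended: shared with producers
        self.receive_lock = threading.Lock()
        self.reader_queue = deque()            # private to the receive path
        self.reader_lock = threading.Lock()
        self.reader_side_acquisitions = 0      # contended-lock takes by recv()

    def enqueue(self, pkt) -> None:
        with self.receive_lock:
            self.receive_queue.append(pkt)

    def recv(self):
        with self.reader_lock:
            if self.reader_queue:
                # Fast path: no need to touch the contended lock.
                return self.reader_queue.popleft()
            # Slow path: take the contended lock once and splice everything.
            with self.receive_lock:
                self.reader_side_acquisitions += 1
                self.reader_queue.extend(self.receive_queue)
                self.receive_queue.clear()
            return self.reader_queue.popleft() if self.reader_queue else None
```

Draining three queued packets takes the contended lock once on the reader side instead of three times, which is where the reduced contention comes from.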
     

21 Apr, 2017

1 commit

  • David Ahern reported that 5425077d73e0c ("net: ipv6: Add early demux
    handler for UDP unicast") breaks udp_l3mdev_accept=0 since early
    demux for IPv6 UDP was doing a generic socket lookup which does not
    require an exact match. Fix this by making UDPv6 early demux match
    connected sockets only.

    v1->v2: Take reference to socket after match as suggested by Eric
    v2->v3: Add comment before break

    Fixes: 5425077d73e0c ("net: ipv6: Add early demux handler for UDP unicast")
    Reported-by: David Ahern
    Signed-off-by: Subash Abhinov Kasiviswanathan
    Cc: Eric Dumazet
    Acked-by: David Ahern
    Tested-by: David Ahern
    Signed-off-by: David S. Miller

    subashab@codeaurora.org
     

25 Mar, 2017

1 commit

  • Certain systems process significant unconnected UDP workloads.
    It would be preferable to disable UDP early demux for those systems
    and enable it for TCP only.

    By disabling UDP demux, we see these slight gains on an ARM64 system:
    782 -> 788Mbps unconnected single stream UDPv4
    633 -> 654Mbps unconnected UDPv4 different sources

    The performance impact can change based on CPU architecture and cache
    sizes. Not much difference will be seen if the entire UDP hash table
    is in cache.

    Both sysctls are enabled by default to preserve existing behavior.

    v1->v2: Change function pointer instead of adding conditional as
    suggested by Stephen.

    v2->v3: Read once in callers to avoid issues due to compiler
    optimizations. Also update commit message with the tests.

    v3->v4: Store and use read once result instead of querying pointer
    again incorrectly.

    v4->v5: Refactor to avoid errors due to compilation with IPV6={m,n}

    Signed-off-by: Subash Abhinov Kasiviswanathan
    Suggested-by: Eric Dumazet
    Cc: Stephen Hemminger
    Cc: Tom Herbert
    Cc: David Miller
    Signed-off-by: David S. Miller

    subashab@codeaurora.org
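
The revision notes (swap a function pointer rather than add a conditional, and read it once in callers) can be modelled as below. The class and attribute names are assumptions for illustration, and Python has no equivalent of a compiler re-reading a pointer, so the single read is shown only as a coding pattern.

```python
class ProtocolModel:
    """Toy protocol descriptor whose early demux hook can be toggled."""

    def __init__(self, handler) -> None:
        self.early_demux_handler = handler  # kept so the hook can be restored
        self.early_demux = handler          # the pointer callers actually use

    def set_enabled(self, enabled: bool) -> None:
        # Toggling swaps the pointer; the hot path needs no extra conditional.
        self.early_demux = self.early_demux_handler if enabled else None


def run_early_demux(proto: ProtocolModel, skb) -> bool:
    handler = proto.early_demux  # read once, then use only the local copy
    if handler is None:
        return False
    handler(skb)
    return True
```

Using the local copy means the check and the call always see the same value, even if the setting is flipped in between.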
     

24 Mar, 2017

1 commit


23 Mar, 2017

1 commit

  • In the case udp_sk(sk)->pending is AF_INET6, udpv6_sendmsg() would
    jump to do_append_data, skipping the initialization of sockc.tsflags.
    Fix the problem by moving sockc.tsflags initialization earlier.

    The bug was detected with KMSAN.

    Fixes: c14ac9451c34 ("sock: enable timestamping using control messages")
    Signed-off-by: Alexander Potapenko
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller

    Alexander Potapenko
     

13 Mar, 2017

1 commit

  • While running a single stream UDPv6 test, we observed that the amount
    of CPU time spent in NET_RX softirq was much greater than UDPv4 for an
    equivalent receive rate. The test here was run on an ARM64 based
    Android system. On further analysis with perf, we found that UDPv6
    was spending significant time in the statistics netfilter targets
    which did socket lookup per packet. These statistics rules perform
    a lookup when there is no socket associated with the skb. Since
    there are multiple instances of these rules based on UID, there
    will be an equal number of lookups per skb.

    By introducing early demux for UDPv6, we avoid the redundant lookups.
    This also helped to improve the performance (800Mbps -> 870Mbps) on a
    CPU limited system in a single stream UDPv6 receive test with 1450
    byte sized datagrams using iperf.

    v1->v2: Use IPv6 cookie to validate dst instead of 0 as suggested
    by Eric

    Signed-off-by: Subash Abhinov Kasiviswanathan
    Signed-off-by: David S. Miller

    subashab@codeaurora.org
     

17 Feb, 2017

1 commit


15 Feb, 2017

1 commit

  • This patch adds a check on the type of the source address for the case
    where the destination address is in6addr_any. If the source is an
    IPv4-mapped IPv6 source address, the destination is changed to
    ::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
    is done in three locations to handle UDP calls to either connect() or
    sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
    handling an in6addr_any destination until very late, so the patch only
    needs to handle the case where the source is an IPv4-mapped IPv6
    address.

    Signed-off-by: Jonathan T. Leighton
    Signed-off-by: David S. Miller

    Jonathan T. Leighton
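
The address selection rule described above can be sketched with the standard `ipaddress` module. The function name is hypothetical and the sketch covers only the rule itself, not where in connect() or sendmsg() it is applied.

```python
import ipaddress

LOOPBACK_V6 = ipaddress.IPv6Address("::1")
LOOPBACK_V4_MAPPED = ipaddress.IPv6Address("::ffff:127.0.0.1")


def effective_destination(
    src: ipaddress.IPv6Address, dst: ipaddress.IPv6Address
) -> ipaddress.IPv6Address:
    """Pick the loopback flavour that matches the source address family."""
    if not dst.is_unspecified:
        return dst  # only an in6addr_any (::) destination is rewritten
    if src.ipv4_mapped is not None:
        return LOOPBACK_V4_MAPPED  # IPv4-mapped source: IPv4 loopback
    return LOOPBACK_V6
```

A source of `::ffff:10.0.0.1` with destination `::` therefore maps to `::ffff:127.0.0.1`, while a native IPv6 source maps to `::1`.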
     

08 Feb, 2017

3 commits

  • The conflict was an interaction between a bug fix in the
    netvsc driver in 'net' and an optimization of the RX path
    in 'net-next'.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • When the same struct dst_entry can be used for many different
    neighbours we cannot use it for pending confirmations.

    The datagram protocols can use MSG_CONFIRM to confirm the
    neighbour. When used with MSG_PROBE we do not reach the
    code where neighbour is confirmed, so we have to do the
    same slow lookup by using the dst_confirm_neigh() helper.
    When MSG_PROBE is not used, ip_append_data/ip6_append_data
    will set the skb flag dst_pending_confirm.

    Reported-by: YueHaibing
    Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
    Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
    Signed-off-by: Julian Anastasov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Julian Anastasov
     
  • Dmitry reported that UDP sockets being destroyed would trigger the
    WARN_ON(atomic_read(&sk->sk_rmem_alloc)); in inet_sock_destruct()

    It turns out we do not properly destroy skb(s) that have wrong UDP
    checksum.

    Thanks again to syzkaller team.

    Fixes: 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")
    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Cc: Paolo Abeni
    Cc: Hannes Frederic Sowa
    Acked-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Eric Dumazet
     

31 Jan, 2017

1 commit

  • Packets arriving in a VRF currently are delivered to UDP sockets that
    aren't bound to any interface. TCP defaults to not delivering packets
    arriving in a VRF to unbound sockets. IP route lookup and socket
    transmit both assume that unbound means using the default table and
    UDP applications that haven't been changed to be aware of VRFs may not
    function correctly in this case since they may not be able to handle
    overlapping IP address ranges, or be able to send packets back to the
    original sender if required.

    So add a sysctl, udp_l3mdev_accept, to control this behaviour, with it
    being analogous to the existing tcp_l3mdev_accept, namely to allow a
    process to have a VRF-global listen socket. Have this default to off
    as this is the behaviour that users will expect, given that there is
    no explicit mechanism to place an unmodified VRF-unaware application into a
    default VRF.

    Signed-off-by: Robert Shearman
    Acked-by: David Ahern
    Tested-by: David Ahern
    Signed-off-by: David S. Miller

    Robert Shearman
     

19 Jan, 2017

1 commit

  • We pass these per-protocol equal functions around in various places, but
    we can just have one function that checks sk->sk_family and then calls
    the right comparison function. I've also changed the ipv4 version to
    not cast to inet_sock since it is unneeded.

    Signed-off-by: Josef Bacik
    Signed-off-by: David S. Miller

    Josef Bacik
     

25 Dec, 2016

1 commit