22 Jul, 2018

1 commit

  • [ Upstream commit 603d4cf8fe095b1ee78f423d514427be507fb513 ]

    Since the addition of GRO for ESP, gro_receive can consume the skb and
    return -EINPROGRESS. In that case, the lower layer GRO handler cannot
    touch the skb anymore.

    Commit 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.") converted
    some of the gro_receive handlers that can lead to ESP's gro_receive so
    that they wouldn't access the skb when -EINPROGRESS is returned, but
    missed other spots, mainly in tunneling protocols.

    This patch finishes the conversion to using skb_gro_flush_final(), and
    adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and
    GUE.

    Fixes: 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.")
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sabrina Dubroca
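
The rule this commit enforces can be pictured with a small userspace sketch. It is a toy model, not the kernel helpers: `struct toy_skb`, both `toy_*` functions, and the use of a plain `int` return code (the kernel carries the error in a pointer) are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the per-skb GRO state; not the kernel's layout. */
struct toy_skb {
    int flush;
    int remcsum_offload;
};

/*
 * inner_ret models what the inner gro_receive handler reported:
 * 0 for "not consumed", -EINPROGRESS for "consumed, hands off".
 */
static void toy_gro_flush_final(struct toy_skb *skb, int inner_ret, int flush)
{
    if (inner_ret != -EINPROGRESS)
        skb->flush |= flush;        /* safe: we still own the skb */
}

/* Variant for handlers that also carry remote-checksum state. */
static void toy_gro_flush_final_remcsum(struct toy_skb *skb, int inner_ret,
                                        int flush)
{
    if (inner_ret != -EINPROGRESS) {
        skb->flush |= flush;
        skb->remcsum_offload = 0;   /* cleanup only while we own the skb */
    }
}
```

In this model, calling either function with `-EINPROGRESS` never dereferences `skb`, which is the property the lower-layer handlers need once the skb has been consumed.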
     

02 Aug, 2017

1 commit

  • In the case that GRO is turned on and the original received packet is
    CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last
    csum-unnecessary point, which for instance could occur if the packet
    comes from another Linux guest on the same Linux host, we have to do
    either remcsum_adjust or set up CHECKSUM_PARTIAL again with its
    csum_start properly reset considering RCO.

    However, since b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags
    in GRO"), that barrier can be skipped in this case when GRO is turned on,
    so we pass over it and the inner L4 validation mistakenly treats the
    packet as having a bad csum.

    This patch resets remcsum_offload at the same time as the GRO remcsum
    cleanup, so that this case works as it did before.

    Fixes: b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO")
    Signed-off-by: Koichiro Den
    Signed-off-by: David S. Miller

    K. Den
     

22 May, 2017

1 commit

  • The build header functions are not used by any other code.

    net/ipv6/fou6.c:36:5: warning: no previous prototype for ‘fou6_build_header’ [-Wmissing-prototypes]
    net/ipv6/fou6.c:54:5: warning: no previous prototype for ‘gue6_build_header’ [-Wmissing-prototypes]

    Need to do some code rearranging to satisfy different Kconfig possibilities.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

31 Oct, 2016

1 commit


28 Oct, 2016

3 commits

  • Now genl_register_family() is the only thing (other than the
    users themselves, perhaps, but I didn't find any doing that)
    writing to the family struct.

    In all families that I found, genl_register_family() is only
    called from __init functions (some indirectly, in which case
    I've added __init annotations to clarify things), so all can
    actually be marked __ro_after_init.

    This protects the data structure from accidental corruption.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Instead of providing macros/inline functions to initialize
    the families, make all users initialize them statically and
    get rid of the macros.

    This reduces the kernel code size by about 1.6k on x86-64
    (with allyesconfig).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Static family IDs have never really been used, the only
    use case was the workaround I introduced for those users
    that assumed their family ID was also their multicast
    group ID.

    Additionally, because static family IDs would never be
    reserved by the generic netlink code, using a relatively
    low ID would only work for built-in families that can be
    registered immediately after generic netlink is started,
    which is basically only the control family (apart from
    the workaround code, which I also had to add code for so
    it would reserve those IDs)

    Thus, anything other than GENL_ID_GENERATE is flawed and
    luckily not used except in the cases I mentioned. Move
    those workarounds into a few lines of code, and then get
    rid of GENL_ID_GENERATE entirely, making it more robust.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

21 Oct, 2016

1 commit

  • Currently, GRO can do unlimited recursion through the gro_receive
    handlers. This was fixed for tunneling protocols by limiting tunnel GRO
    to one level with encap_mark, but both VLAN and TEB still have this
    problem. Thus, the kernel is vulnerable to a stack overflow, if we
    receive a packet composed entirely of VLAN headers.

    This patch adds a recursion counter to the GRO layer to prevent stack
    overflow. When a gro_receive function hits the recursion limit, GRO is
    aborted for this skb and it is processed normally. This recursion
    counter is put in the GRO CB, but could be turned into a percpu counter
    if we run out of space in the CB.

    Thanks to Vladimír Beneš for the initial bug report.

    Fixes: CVE-2016-7039
    Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
    Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: Jiri Benc
    Acked-by: Hannes Frederic Sowa
    Acked-by: Tom Herbert
    Signed-off-by: David S. Miller

    Sabrina Dubroca
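
A minimal userspace sketch of the recursion counter described above. The packet model, the `toy_*` names, and the limit value of 15 are assumptions for illustration; the real counter lives in the GRO CB and the real handlers parse actual headers.

```c
#include <assert.h>

#define TOY_RECURSION_LIMIT 15  /* assumed value, for illustration only */

struct toy_pkt {
    int nested_headers;     /* encapsulation headers still to parse */
    int recursion_counter;  /* how many handler levels were entered */
    int flush;              /* set when aggregation is abandoned */
};

/* Every handler calls the next level through this wrapper. */
static int toy_call_gro_receive(int (*cb)(struct toy_pkt *),
                                struct toy_pkt *pkt)
{
    if (++pkt->recursion_counter > TOY_RECURSION_LIMIT) {
        pkt->flush = 1;     /* give up on GRO; process the packet normally */
        return 0;
    }
    return cb(pkt);
}

/* Toy VLAN handler: strips one header, then recurses into the next one. */
static int toy_vlan_gro_receive(struct toy_pkt *pkt)
{
    if (pkt->nested_headers == 0)
        return 1;           /* reached the payload: eligible for aggregation */
    pkt->nested_headers--;
    return toy_call_gro_receive(toy_vlan_gro_receive, pkt);
}
```

With this structure, a packet made of arbitrarily many nested headers stops recursing after a bounded number of levels instead of exhausting the stack.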
     

02 Sep, 2016

1 commit


08 Jun, 2016

1 commit

  • This patch implements direct encapsulation of IPv4 and IPv6 packets
    in UDP. This is done as version "1" of GUE, as explained in I-D
    draft-ietf-nvo3-gue-03.

    Changes here are only in the receive path; fou with IPxIPx already
    supports the transmit side. Both the normal receive path and
    GRO path are modified to check for GUE version and check for
    IP version in the case that GUE version is "1".

    Tested:

    IPIP with direct GUE encap
    1 TCP_STREAM
    4530 Mbps
    200 TCP_RR
    1297625 tps
    135/232/444 90/95/99% latencies

    IP4IP6 with direct GUE encap
    1 TCP_STREAM
    4903 Mbps
    200 TCP_RR
    1184481 tps
    149/253/473 90/95/99% latencies

    IP6IP6 direct GUE encap
    1 TCP_STREAM
    5146 Mbps
    200 TCP_RR
    1202879 tps
    146/251/472 90/95/99% latencies

    SIT with direct GUE encap
    1 TCP_STREAM
    6111 Mbps
    200 TCP_RR
    1250337 tps
    139/241/467 90/95/99% latencies

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
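
The version check in the receive path can be sketched as a classifier on the first byte after the UDP header. This is an illustrative userspace model: the enum and `toy_gue_classify` are hypothetical names, and the sketch relies on the assumption that the GUE version occupies the top two bits of that byte, so that an IPv4 (0100) or IPv6 (0110) version nibble reads as GUE version 1.

```c
#include <assert.h>
#include <stdint.h>

enum toy_inner {
    TOY_GUE_HEADER,   /* version 0: a regular GUE header follows */
    TOY_INNER_IPV4,   /* version 1: payload is a bare IPv4 packet */
    TOY_INNER_IPV6,   /* version 1: payload is a bare IPv6 packet */
    TOY_DROP          /* anything else */
};

static enum toy_inner toy_gue_classify(uint8_t first_byte)
{
    unsigned int gue_version = first_byte >> 6;   /* top two bits */

    switch (gue_version) {
    case 0:
        return TOY_GUE_HEADER;
    case 1:
        /* Direct encapsulation: the same byte starts the IP header. */
        switch (first_byte >> 4) {                /* IP version nibble */
        case 4:
            return TOY_INNER_IPV4;
        case 6:
            return TOY_INNER_IPV6;
        default:
            return TOY_DROP;
        }
    default:
        return TOY_DROP;
    }
}
```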
     

21 May, 2016

3 commits

  • This patch adds receive path support for IPv6 with fou.

    - Add address family to fou structure for open sockets. This supports
    AF_INET and AF_INET6. Lookups for fou ports are performed on both the
    port number and family.
    - In fou and gue receive, adjust tot_len in the IPv4 header or
    payload_len in the IPv6 header, based on address family.
    - Allow AF_INET6 in FOU_ATTR_AF netlink attribute.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • Create __fou_build_header and __gue_build_header. These implement the
    protocol generic parts of building the fou and gue header.
    fou_build_header and gue_build_header implement the IPv4 specific
    functions and call the __*_build_header functions.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • Use helper function to set up UDP tunnel related information for a fou
    socket.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
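
For the IPv6 receive-path commit in this group, the per-family length adjustment can be pictured as follows. This is a userspace sketch under stated assumptions: `struct toy_outer_hdr` and `toy_recv_pull` are hypothetical, and a single struct stands in for the real IPv4 and IPv6 headers.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>

/* Toy outer header: only the fields the sketch needs. */
struct toy_outer_hdr {
    int family;             /* AF_INET or AF_INET6 */
    uint16_t tot_len;       /* IPv4 total length, network byte order */
    uint16_t payload_len;   /* IPv6 payload length, network byte order */
};

/*
 * After removing len bytes of encapsulation (UDP plus any GUE header),
 * shrink the length field that belongs to the outer address family.
 */
static void toy_recv_pull(struct toy_outer_hdr *h, uint16_t len)
{
    if (h->family == AF_INET)
        h->tot_len = htons((uint16_t)(ntohs(h->tot_len) - len));
    else
        h->payload_len = htons((uint16_t)(ntohs(h->payload_len) - len));
}
```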
     

10 May, 2016

1 commit


07 May, 2016

2 commits

  • UDP tunnel segmentation code relies on the inner offsets being set for
    a UDP tunnel GSO packet, but the inner *_complete() functions will
    set the inner offsets only if 'encapsulation' is set before calling
    them. Currently, udp_gro_complete() sets 'encapsulation' only after
    the inner *_complete() functions are done. This leaves the inner
    offsets with invalid values after udp_gro_complete() returns, which
    in turn makes it impossible to properly segment the packet in case
    it needs to be forwarded. The user would see this either as
    invalid packets being sent or as packet loss.

    This patch fixes this by setting skb's 'encapsulation' in
    udp_gro_complete() before calling into the inner complete functions,
    and by making each possible UDP tunnel gro_complete() callback set the
    inner_mac_header to the beginning of the tunnel payload.

    Signed-off-by: Jarno Rajahalme
    Reviewed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Jarno Rajahalme
     
  • The setting of the UDP tunnel GSO type is already performed by
    udp[46]_gro_complete().

    Signed-off-by: Jarno Rajahalme
    Signed-off-by: David S. Miller

    Jarno Rajahalme
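
The ordering problem described in the first commit of this group can be reproduced in a toy model. Everything here is hypothetical scaffolding (`struct toy_skb`, the `toy_*` functions); it only shows why the flag has to be set before the inner callback runs.

```c
#include <assert.h>

struct toy_skb {
    int encapsulation;        /* marks the skb as carrying a tunnel */
    int inner_offsets_valid;  /* set once inner header offsets are recorded */
};

/* Inner completion records inner offsets only for encapsulated skbs. */
static void toy_inner_complete(struct toy_skb *skb)
{
    if (skb->encapsulation)
        skb->inner_offsets_valid = 1;
}

/* Old ordering: the flag is set too late for the inner callback to see it. */
static void toy_udp_gro_complete_old(struct toy_skb *skb)
{
    toy_inner_complete(skb);
    skb->encapsulation = 1;
}

/* Fixed ordering: set the flag first, then call the inner callback. */
static void toy_udp_gro_complete_fixed(struct toy_skb *skb)
{
    skb->encapsulation = 1;
    toy_inner_complete(skb);
}
```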
     

17 Apr, 2016

1 commit

  • This patch updates the IP tunnel core function iptunnel_handle_offloads so
    that we return an int and do not free the skb inside the function. This
    actually allows us to clean up several paths in several tunnels so that we
    can free the skb at one point in the path without having to have a
    secondary path if we are supporting tunnel offloads.

    In addition it should resolve some double-free issues I have found in the
    tunnels paths as I believe it is possible for us to end up triggering such
    an event in the case of fou or gue.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
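
A rough userspace sketch of the ownership change: the offload helper reports an error but never frees, and the caller frees at a single point. The names (`toy_handle_offloads`, `toy_tunnel_xmit`, `toy_free_count`) and the `unsupported_offload` flag are invented for the example, and freeing on success merely stands in for the transmit path consuming the skb.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_skb {
    int unsupported_offload;  /* makes the offload helper fail */
};

static int toy_free_count;    /* counts frees so double frees are visible */

static void toy_kfree_skb(struct toy_skb *skb)
{
    free(skb);
    toy_free_count++;
}

/* Returns 0 or a negative errno; never frees the skb. */
static int toy_handle_offloads(const struct toy_skb *skb)
{
    return skb->unsupported_offload ? -EINVAL : 0;
}

static int toy_tunnel_xmit(struct toy_skb *skb)
{
    int err = toy_handle_offloads(skb);

    if (err)
        goto tx_error;

    toy_kfree_skb(skb);       /* stand-in for "transmitted and consumed" */
    return 0;

tx_error:
    toy_kfree_skb(skb);       /* the only free on the error path */
    return err;
}
```

Because the helper no longer frees on failure, the error path frees exactly once, which is how a double free of this kind is ruled out.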
     

10 Apr, 2016

1 commit


08 Apr, 2016

2 commits

  • This patch fixes an issue I found in which we were dropping frames if we
    had enabled checksums on GRE headers that were encapsulated by either FOU
    or GUE. Without this patch I was barely able to get 1 Gb/s of throughput.
    With this patch applied I am now at least getting around 6 Gb/s.

    The issue is due to the fact that with FOU or GUE applied we do not provide
    a transport offset pointing to the GRE header, nor do we offload it in
    software as the GRE header is completely skipped by GSO and treated like a
    VXLAN or GENEVE type header. As such we need to prevent the stack from
    generating it and also prevent GRE from generating it via any interface we
    create.

    Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • Adapt gue_gro_receive, gue_gro_complete to take a socket argument.
    Don't set udp_offloads any more.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

31 Mar, 2016

1 commit

  • This patch should fix the issues seen with a recent fix to prevent
    tunnel-in-tunnel frames from being generated with GRO. The fix itself is
    correct for now as long as we do not add any devices that support
    NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the
    potential to mess things up due to the fact that the outer transport header
    points to the outer UDP header and not the GRE header as would be expected.

    Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

21 Mar, 2016

1 commit

  • If a packet is either locally encapsulated or processed through GRO
    it is marked with the offloads that it requires. However, when it is
    decapsulated these tunnel offload indications are not removed. This
    means that if we receive an encapsulated TCP packet, aggregate it with
    GRO, decapsulate, and retransmit the resulting frame on a NIC that does
    not support encapsulation, we won't be able to take advantage of hardware
    offloads even though it is just a simple TCP packet at this point.

    This fixes the problem by stripping off encapsulation offload indications
    when packets are decapsulated.

    The performance impacts of this bug are significant. In a test where a
    Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
    and bridged to a VM, performance is improved by 60% (5Gbps->8Gbps) as a
    result of avoiding unnecessary segmentation at the VM tap interface.

    Reported-by: Ramu Ramamurthy
    Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
    Signed-off-by: Jesse Gross
    Signed-off-by: David S. Miller

    Jesse Gross
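
Stripping the tunnel indications on decapsulation amounts to masking bits out of the GSO type. The sketch below uses made-up flag values and names (`TOY_GSO_*`, `toy_pull_offloads`); the kernel's actual flag set is larger and defined elsewhere.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
#define TOY_GSO_TCPV4       (1u << 0)
#define TOY_GSO_GRE         (1u << 1)
#define TOY_GSO_UDP_TUNNEL  (1u << 2)
#define TOY_GSO_ENCAP_ALL   (TOY_GSO_GRE | TOY_GSO_UDP_TUNNEL)

struct toy_skb {
    unsigned int gso_type;
    int encapsulation;
};

/* On decapsulation, drop every tunnel-related offload indication. */
static void toy_pull_offloads(struct toy_skb *skb)
{
    skb->gso_type &= ~TOY_GSO_ENCAP_ALL;
    skb->encapsulation = 0;
}
```

After the call, a frame that was TCP inside a UDP tunnel is marked as plain TCP, so a NIC without encapsulation support can still segment it in hardware.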
     

14 Mar, 2016

1 commit

  • This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that
    we do not clear the flush bit until after we have called the next level GRO
    handler. Previously this was being cleared before parsing through the list
    of frames, however this resulted in several paths where either the bit
    needed to be reset but wasn't as in the case of FOU, or cases where it was
    being set as in GENEVE. By just deferring the clearing of the bit until
    after the next level protocol has been parsed we can avoid any unnecessary
    bit twiddling and avoid bugs.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

12 Feb, 2016

2 commits


11 Jan, 2016

1 commit

  • udp tunnel offloads tend to aggregate datagrams based on inner
    headers. gro engine gets notified by tunnel implementations about
    possible offloads. The match is solely based on the port number.

    Imagine a tunnel bound to port 53: the offloading will look into all
    DNS packets and try to aggregate them based on the inner data found
    within. This could lead to data corruption and malformed DNS packets.

    While this patch minimizes the problem and helps an administrator to find
    the issue by querying ip tunnel/fou, a better way would be to match on
    the specific destination ip address so if a user space socket is bound
    to the same address it will conflict.

    Cc: Tom Herbert
    Cc: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

17 Dec, 2015

1 commit

  • fou->udp_offloads is managed by RCU. As it is actually included inside
    the fou sockets, we cannot let the memory go out of scope before a grace
    period. We can either call synchronize_rcu or switch over to kfree_rcu to
    manage the sockets. kfree_rcu seems appropriate as it is used by vxlan
    and geneve.

    Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path")
    Cc: Tom Herbert
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

30 Aug, 2015

1 commit

  • fou does not really support IPv6 encapsulation. After a UDP socket is
    created in fou_create, the encap_rcv callback is set either to fou_udp_recv
    or to gue_udp_recv. Both of those unconditionally assume that the received
    packet has an IPv4 header and access the data at network_header as if it
    were an IPv4 header. This leads to the IPv6 flow label being interpreted
    as the IP packet length, etc.

    Disallow fou tunnel to be configured as IPv6 until real IPv6 support is
    added to fou.

    CC: Tom Herbert
    Signed-off-by: Jiri Benc
    Signed-off-by: David S. Miller

    Jiri Benc
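
The misinterpretation is easy to see on raw bytes: IPv4 keeps its total length in bytes 2-3 of the header, while in IPv6 those bytes hold the low 16 bits of the flow label. The two reader functions below are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Read bytes 2-3 the way IPv4 code would: as the total length field. */
static uint16_t toy_read_as_ipv4_tot_len(const uint8_t *hdr)
{
    return (uint16_t)((hdr[2] << 8) | hdr[3]);
}

/* Read bytes 4-5 the way IPv6 code would: as the payload length field. */
static uint16_t toy_read_ipv6_payload_len(const uint8_t *hdr)
{
    return (uint16_t)((hdr[4] << 8) | hdr[5]);
}
```

For an IPv6 header starting `60 01 23 45 00 28` (flow label 0x12345, payload length 40), the IPv4-style read returns 0x2345 rather than anything related to the real length.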
     

24 Aug, 2015

2 commits


17 Apr, 2015

1 commit


15 Apr, 2015

1 commit


13 Apr, 2015

5 commits


09 Apr, 2015

1 commit

  • const __read_mostly is a senseless combination. If something
    is already const it cannot be __read_mostly. Remove the bogus
    __read_mostly in the fou driver.

    This fixes section conflicts with LTO.

    Signed-off-by: Andi Kleen
    Signed-off-by: David S. Miller

    Andi Kleen
     

12 Feb, 2015

2 commits

  • Change remote checksum handling to set checksum partial as default
    behavior. Added an iflink parameter to configure not using
    checksum partial (calling csum_partial to update checksum).

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • This patch adds infrastructure so that remote checksum offload can
    set CHECKSUM_PARTIAL instead of calling csum_partial and writing
    the modified checksum field.

    Add skb_remcsum_adjust_partial function to set an skb for using
    CHECKSUM_PARTIAL with remote checksum offload. Changed
    skb_remcsum_process and skb_gro_remcsum_process to take a boolean
    argument to indicate if checksum partial can be set or the
    checksum needs to be modified using the normal algorithm.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
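
The two behaviours selected by the boolean argument can be contrasted in a small userspace sketch. It is a simplification under stated assumptions: the `toy_*` names and struct are hypothetical, and the software branch recomputes the checksum from scratch, whereas the kernel adjusts an already computed packet checksum incrementally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_ip_summed { TOY_CHECKSUM_NONE, TOY_CHECKSUM_PARTIAL };

struct toy_skb {
    uint8_t *data;
    size_t len;
    enum toy_ip_summed ip_summed;
    size_t csum_start;   /* where checksumming begins */
    size_t csum_offset;  /* checksum field position, relative to csum_start */
};

/* 16-bit ones' complement sum over data[start..len), folded and inverted. */
static uint16_t toy_csum(const uint8_t *data, size_t start, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = start; i + 1 < len; i += 2)
        sum += (uint32_t)data[i] << 8 | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;   /* pad an odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

static void toy_remcsum_process(struct toy_skb *skb, size_t start,
                                size_t offset, int nopartial)
{
    uint16_t csum;

    if (!nopartial) {
        /* Defer: record where the checksum goes; a later stage fills it in. */
        skb->ip_summed = TOY_CHECKSUM_PARTIAL;
        skb->csum_start = start;
        skb->csum_offset = offset;
        return;
    }

    /* Compute now and write the result into the checksum field. */
    csum = toy_csum(skb->data, start, skb->len);
    skb->data[start + offset] = (uint8_t)(csum >> 8);
    skb->data[start + offset + 1] = (uint8_t)(csum & 0xff);
}
```

The deferred branch leaves the packet bytes untouched, which is what lets a device or a later software stage complete the checksum instead.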