04 Jun, 2016

1 commit

  • SCTP has this peculiarity that its packets cannot be just segmented to
    (P)MTU. Its chunks must be contained in IP segments, padding respected.
    So we can't just generate a big skb, set gso_size to the fragmentation
    point and deliver it to IP layer.

    This patch takes a different approach. SCTP will now build a skb as it
    would be if it was received using GRO. That is, there will be a cover
    skb with protocol headers and children ones containing the actual
    segments, already segmented in a way that respects SCTP RFCs.

    With that, we can tell skb_segment() to just split based on frag_list,
    trusting its sizes are already in accordance.

    This way SCTP can benefit from GSO and instead of passing several
    packets through the stack, it can pass a single large packet.

    v2:
    - Added support for receiving GSO frames, as requested by Dave Miller.
    - Clear skb->cb if packet is GSO (otherwise it's not used by SCTP)
    - Added heuristics similar to what we have in TCP for not generating
    single GSO packets that fill cwnd.
    v3:
    - consider sctphdr size in skb_gso_transport_seglen()
    - rebased due to 5c7cdf339af5 ("gso: Remove arbitrary checks for
    unsupported GSO")

    Signed-off-by: Marcelo Ricardo Leitner
    Tested-by: Xin Long
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
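
    The packing rule this commit relies on can be illustrated with a small
    user-space sketch. It is not the kernel code: pack_chunks() and its
    parameters are hypothetical names, and max_payload stands for whatever
    room remains for chunks in one packet after the IP and SCTP headers.
    Chunks are padded to 4 bytes and placed whole into segments, so a later
    split along segment boundaries never cuts a chunk.

```c
#include <assert.h>
#include <stddef.h>

/* SCTP chunks are padded to a multiple of 4 bytes on the wire. */
static size_t sctp_pad4(size_t len)
{
    return (len + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: assign each chunk to a segment so that no chunk
 * straddles two segments. seg_of[i] receives the segment index of chunk i.
 * Returns the number of segments used.
 */
static size_t pack_chunks(const size_t *chunk_len, size_t n,
                          size_t max_payload, size_t *seg_of)
{
    size_t segs = 0, used = 0;

    for (size_t i = 0; i < n; i++) {
        size_t padded = sctp_pad4(chunk_len[i]);

        /* This sketch does not fragment oversized chunks. */
        assert(padded <= max_payload);

        if (segs == 0 || used + padded > max_payload) {
            segs++;        /* start a new segment */
            used = 0;
        }
        used += padded;
        seg_of[i] = segs - 1;
    }
    return segs;
}
```

    With chunk lengths of 100, 1398, 50 and 1400 bytes and a 1452-byte
    limit, the second and third chunks share a segment and three segments
    are produced in total.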
     

21 May, 2016

1 commit

  • This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
    SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
    NETIF_F_GSO_IPXIP6. These are used to describe IP in IP
    tunnel and what the outer protocol is. The inner protocol
    can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
    SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
    are removed (these are both instances of SKB_GSO_IPXIP4).
    SKB_GSO_IPXIP6 will be used when support for GSO with IP
    encapsulation over IPv6 is added.

    Signed-off-by: Tom Herbert
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Tom Herbert
     

22 Apr, 2016

1 commit


17 Apr, 2016

1 commit

  • I realized that when I added NETIF_F_TSO_MANGLEID as a TSO type I forgot to
    add it to NETIF_F_ALL_TSO. This patch corrects that so the flag will be
    included correctly.

    The result should be minor as it was only used by a few drivers and in a
    few specific cases such as when NETIF_F_SG was not supported on a device so
    the TSO flags were cleared.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
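
    The effect of a flag missing from a mask can be shown with a small
    sketch. The bit positions and the fix_features() helper below are made
    up for illustration and are not the kernel's; only the idea (TSO flags
    are cleared through the ALL_TSO mask when scatter-gather is absent)
    comes from the commit message.

```c
#include <stdint.h>

typedef uint64_t features_t;

/* Bit positions are invented for this sketch; they are not the kernel's. */
#define F_SG            (1ULL << 0)
#define F_TSO           (1ULL << 1)
#define F_TSO6          (1ULL << 2)
#define F_TSO_ECN       (1ULL << 3)
#define F_TSO_MANGLEID  (1ULL << 4)

#define ALL_TSO_BEFORE  (F_TSO | F_TSO6 | F_TSO_ECN)
#define ALL_TSO_AFTER   (ALL_TSO_BEFORE | F_TSO_MANGLEID)

/* Without scatter-gather, drop every flag contained in all_tso. */
static features_t fix_features(features_t f, features_t all_tso)
{
    if (!(f & F_SG))
        f &= ~all_tso;
    return f;
}
```

    With the old mask, a device without F_SG keeps F_TSO_MANGLEID set even
    though every other TSO flag is cleared; with the corrected mask it is
    cleared along with the rest.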
     

15 Apr, 2016

2 commits

  • This patch adds support for something I am referring to as GSO partial.
    The basic idea is that we can support a broader range of devices for
    segmentation if we use fixed outer headers and have the hardware only
    really deal with segmenting the inner header. The idea behind the naming
    is due to the fact that everything before csum_start will be fixed headers,
    and everything after will be the region that is handled by hardware.

    With the current implementation it allows us to add support for the
    following GSO types with an inner TSO_MANGLEID or TSO6 offload:
    NETIF_F_GSO_GRE
    NETIF_F_GSO_GRE_CSUM
    NETIF_F_GSO_IPIP
    NETIF_F_GSO_SIT
    NETIF_F_UDP_TUNNEL
    NETIF_F_UDP_TUNNEL_CSUM

    In the case of hardware that already supports tunneling we may be able to
    extend this further to support TSO_TCPV4 without TSO_MANGLEID if the
    hardware can support updating inner IPv4 headers.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • This patch adds support for TSO using IPv4 headers with a fixed IP ID
    field. This is meant to allow us to do a lossless GRO in the case of TCP
    flows that use a fixed IP ID such as those that convert IPv6 headers to IPv4
    headers.

    In addition I am adding a feature that for now I am referring to as TSO with
    IP ID mangling. Basically when this flag is enabled the device has the
    option to either output the flow with incrementing IP IDs or with a fixed
    IP ID regardless of what the original IP ID ordering was. This is useful
    in cases where the DF bit is set and we do not care if the original IP ID
    value is maintained.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
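
    The two IP ID behaviours described in this commit can be sketched as
    follows. segment_ip_ids() is a hypothetical helper, not a kernel
    function; it only shows which ID each segment of one large send would
    carry in either mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fill ids[0..n-1] with the IP ID each segment would carry. */
static void segment_ip_ids(uint16_t first_id, bool fixed_id,
                           uint16_t *ids, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ids[i] = fixed_id ? first_id : (uint16_t)(first_id + i);
}
```

    The IP ID is a 16-bit field, so in the incrementing mode the value
    wraps from 0xFFFF to 0.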
     

17 Feb, 2016

1 commit

  • It's useful to turn off the qdisc offload feature at a per device
    level. This gives us a big hammer to enable/disable offloading.
    More fine grained control (i.e. per rule) may be supported later.

    Signed-off-by: John Fastabend
    Acked-by: Jiri Pirko
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    John Fastabend
     

16 Dec, 2015

3 commits

  • These netif flags are unnecessary convolutions. It is more
    straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM,
    and NETIF_F_IPV6_CSUM directly.

    This patch also:
    - Cleans up can_checksum_protocol
    - Simplifies netdev_intersect_features

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
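
    A simplified sketch of a check built directly on the three flags named
    above. The bit values and the can_checksum() name are illustrative
    assumptions, and the real can_checksum_protocol() may handle more
    cases than this.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t features_t;

/* Illustrative bit values, not the kernel's. */
#define F_IP_CSUM    (1ULL << 0)  /* can checksum TCP/UDP over IPv4 only */
#define F_HW_CSUM    (1ULL << 1)  /* can checksum any protocol */
#define F_IPV6_CSUM  (1ULL << 2)  /* can checksum TCP/UDP over IPv6 only */

#define ETH_P_IP    0x0800
#define ETH_P_IPV6  0x86DD

/* Can a device with feature set f checksum a packet of this EtherType? */
static bool can_checksum(features_t f, uint16_t ethertype)
{
    if (f & F_HW_CSUM)
        return true;

    switch (ethertype) {
    case ETH_P_IP:
        return (f & F_IP_CSUM) != 0;
    case ETH_P_IPV6:
        return (f & F_IPV6_CSUM) != 0;
    default:
        return false;
    }
}
```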
     
  • The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
    set of features for offloading all checksums. This is a mask of the
    checksum offload related features bits. It is incorrect to set both
    NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same time for
    features of a device.

    This patch:
    - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
    - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • The SCTP checksum is really a CRC and is very different from the
    standard 1's complement checksum that serves as the checksum
    for IP protocols. This offload interface is also very different.
    Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
    differences. The term CSUM should be reserved in the stack to refer
    to the standard 1's complement IP checksum.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
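
    To make the difference concrete, here is a user-space sketch of both
    algorithms: the 1's complement Internet checksum (RFC 1071) and a
    bitwise CRC32c, the CRC that SCTP uses. These are plain reference
    implementations written for illustration; they are not the kernel's
    optimized routines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum: the 1's complement of the 1's complement
 * sum of the data, taken as big-endian 16-bit words.
 */
static uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;   /* pad an odd trailing byte with zero */

    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Bitwise CRC32c (Castagnoli polynomial, reflected form 0x82F63B78). */
static uint32_t crc32c(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1)));
    }
    return ~crc;
}
```

    The first is an additive sum with end-around carry, the second is
    polynomial division, which is why the two offloads cannot share one
    feature flag.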
     

04 Nov, 2015

1 commit

  • As pointed out by Nikolay and further explained by Geert, the initial
    for_each_netdev_feature macro was broken, as feature would get set outside
    of the block of code it was intended to run in, thus only ever working for
    the first feature bit in the mask. While less pretty this way, this is
    tested and confirmed functional with multiple feature bits set in
    NETIF_F_UPPER_DISABLES.

    [root@dell-per730-01 ~]# ethtool -K bond0 lro off
    ...
    [ 242.761394] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
    [ 243.552178] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
    [ 244.353978] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
    [ 245.147420] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71

    [root@dell-per730-01 ~]# ethtool -K bond0 gro off
    ...
    [ 251.925645] bond0: Disabling feature 0x0000000000004000 on lower dev p5p2.
    [ 252.713693] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
    [ 253.499085] bond0: Disabling feature 0x0000000000004000 on lower dev p5p1.
    [ 254.290922] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71

    Fixes: fd867d51f ("net/core: generic support for disabling netdev features down stack")
    CC: "David S. Miller"
    CC: Eric Dumazet
    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    CC: Jiri Pirko
    CC: Nikolay Aleksandrov
    CC: Michal Kubecek
    CC: Alexander Duyck
    CC: Geert Uytterhoeven
    CC: netdev@vger.kernel.org
    Signed-off-by: Jarod Wilson
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Jarod Wilson
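
    The behaviour the macro is meant to have, visiting every set bit in a
    feature mask rather than only the first, can be sketched in user space
    as below. for_each_feature_bit() and collect() are hypothetical names
    and this is a plain function, not the kernel macro.

```c
#include <stdint.h>

#define FEATURE_COUNT 64

/*
 * Call fn(bit, ctx) once for every bit set in mask, lowest bit first.
 * Returns the number of bits visited.
 */
static int for_each_feature_bit(uint64_t mask,
                                void (*fn)(int bit, void *ctx), void *ctx)
{
    int visited = 0;

    for (int bit = 0; bit < FEATURE_COUNT; bit++) {
        if (mask & (1ULL << bit)) {
            fn(bit, ctx);
            visited++;
        }
    }
    return visited;
}

/* Example callback: accumulate the visited bits back into a mask. */
static void collect(int bit, void *ctx)
{
    *(uint64_t *)ctx |= 1ULL << bit;
}
```

    Called with a mask of 0x8000 | 0x4000, the two feature values seen in
    the log output above, the callback runs twice, once per bit.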
     

03 Nov, 2015

1 commit

  • There are some netdev features, which when disabled on an upper device,
    such as a bonding master or a bridge, must be disabled and cannot be
    re-enabled on underlying devices.

    This is a rework of an earlier more heavy-handed approach, which simply
    disables and prevents re-enabling of netdev features listed in a new
    define in include/linux/netdev_features.h, NETIF_F_UPPER_DISABLES. When any
    upper device disables a flag in that feature mask, the change propagates
    down the stack, and any lower device that has any upper device
    with one of those flags disabled should not be able to enable said flag.

    Initially, only LRO is included for proof of concept, and because this
    code effectively does the same thing as dev_disable_lro(), though it will
    also activate from the ethtool path, which was one of the goals here.

    [root@dell-per730-01 ~]# ethtool -k bond0 |grep large
    large-receive-offload: on
    [root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
    large-receive-offload: on
    [root@dell-per730-01 ~]# ethtool -K bond0 lro off
    [root@dell-per730-01 ~]# ethtool -k bond0 |grep large
    large-receive-offload: off
    [root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
    large-receive-offload: off

    dmesg dump:

    [ 1033.277986] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
    [ 1034.067949] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
    [ 1034.753612] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
    [ 1035.591019] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71

    This has been successfully tested with bnx2x, qlcnic and netxen network
    cards as slaves in a bond interface. Turning LRO on or off on the master
    also turns it on or off on each of the slaves, new slaves are added with
    LRO in the same state as the master, and LRO can't be toggled on the
    slaves.

    Also, this should largely remove the need for dev_disable_lro(), and most,
    if not all, of its call sites can be replaced by simply making sure
    NETIF_F_LRO isn't included in the relevant device's feature flags.

    Note that this patch is driven by bug reports from users saying it was
    confusing that bonds and slaves had different settings for the same
    features, and while it won't be 100% in sync if a lower device doesn't
    support a feature like LRO, I think this is a good step in the right
    direction.

    CC: "David S. Miller"
    CC: Eric Dumazet
    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    CC: Jiri Pirko
    CC: Nikolay Aleksandrov
    CC: Michal Kubecek
    CC: Alexander Duyck
    CC: netdev@vger.kernel.org
    Signed-off-by: Jarod Wilson
    Signed-off-by: David S. Miller

    Jarod Wilson
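
    The propagation rule can be sketched with a toy device model. struct
    toy_dev and toy_fix_features() are hypothetical and much simpler than
    the kernel's device lists; the LRO value 0x8000 is taken from the
    dmesg output above, and GRO is included only as an example of a flag
    outside the mask.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t features_t;

#define F_GRO (1ULL << 14)
#define F_LRO (1ULL << 15)   /* 0x8000, as in the dmesg output */

/* Features whose disabling on an upper device is forced onto lower ones. */
#define UPPER_DISABLES F_LRO

struct toy_dev {
    features_t features;
    struct toy_dev *upper;   /* NULL when there is no upper device */
};

/* Return the features 'dev' may actually enable when 'wanted' is requested. */
static features_t toy_fix_features(const struct toy_dev *dev, features_t wanted)
{
    for (const struct toy_dev *up = dev->upper; up; up = up->upper) {
        /* Flags in the mask that this upper device has turned off. */
        features_t off_above = UPPER_DISABLES & ~up->features;

        wanted &= ~off_above;
    }
    return wanted;
}
```

    A slave asking for LRO and GRO under a bond that has LRO off gets GRO
    only; once the bond enables LRO again, the same request succeeds in
    full.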
     

13 May, 2015

1 commit


02 Feb, 2015

1 commit


06 Nov, 2014

2 commits


24 Jul, 2014

1 commit


15 Jun, 2014

1 commit

  • Joseph Gasparakis reported that VXLAN GSO offload stopped working with
    i40e device after recent UDP changes. The problem is that the
    SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This
    patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several
    GSO constants that were missing to avoid the problem in the future.

    Reported-by: Joseph Gasparakis
    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
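
    The kind of compile-time guard described here can be sketched with
    C11 _Static_assert in place of BUILD_BUG_ON. All names and values
    below (the TOY_ prefix, GSO_SHIFT of 16) are invented for the example;
    the real constants live in the kernel headers.

```c
#include <stdint.h>

/* Illustrative values only. */
#define GSO_SHIFT 16

enum {
    TOY_GSO_TCPV4      = 1 << 0,
    TOY_GSO_UDP_TUNNEL = 1 << 1,
};

#define TOY_F_TSO            (1ULL << (GSO_SHIFT + 0))
#define TOY_F_GSO_UDP_TUNNEL (1ULL << (GSO_SHIFT + 1))

/* Compilation fails if a GSO type and its feature flag drift apart. */
_Static_assert(TOY_GSO_TCPV4 == (TOY_F_TSO >> GSO_SHIFT),
               "TCPV4 out of sync");
_Static_assert(TOY_GSO_UDP_TUNNEL == (TOY_F_GSO_UDP_TUNNEL >> GSO_SHIFT),
               "UDP_TUNNEL out of sync");

/* With the bits in sync, a GSO type maps to a feature flag by shifting. */
static int toy_gso_ok(uint64_t features, int gso_type)
{
    uint64_t need = (uint64_t)gso_type << GSO_SHIFT;

    return (features & need) == need;
}
```

    If someone inserts a new GSO type in the middle of the enum without
    moving the matching feature flag, the build breaks instead of the
    offload silently failing at run time.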
     

05 Jun, 2014

2 commits

  • Call gso_make_checksum. This should have the benefit of using a
    checksum that may have been previously computed for the packet.

    This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that
    offload GRE GSO with and without the GRE checksum offloaded.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates
    that a device is capable of computing the UDP checksum in the
    encapsulating header of a UDP tunnel.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

04 Apr, 2014

1 commit


29 Mar, 2014

1 commit


08 Nov, 2013

1 commit

  • Add an operations structure that allows a network interface to export
    the fact that it supports packet forwarding in hardware between
    physical interfaces and other mac layer devices assigned to it (such
    as macvlans). This operations structure can be used by virtual mac
    devices to bypass software switching so that forwarding can be done
    in hardware more efficiently.

    Signed-off-by: John Fastabend
    Signed-off-by: Neil Horman
    CC: Andy Gospodarek
    CC: "David S. Miller"
    Signed-off-by: David S. Miller

    John Fastabend
     

22 Oct, 2013

1 commit

  • Now ipv6_gso_segment() is stackable, it's relatively easy to
    implement GSO/TSO support for SIT tunnels

    Performance results, when segmentation is done after tunnel
    device (as no NIC is yet enabled for TSO SIT support) :

    Before patch :

    lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
    MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
    Recv Send Send Utilization Service Demand
    Socket Socket Message Elapsed Send Recv Send Recv
    Size Size Size Time Throughput local remote local remote
    bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

    87380 16384 16384 10.00 3168.31 4.81 4.64 2.988 2.877

    After patch :

    lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
    MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
    Recv Send Send Utilization Service Demand
    Socket Socket Message Elapsed Send Recv Send Recv
    Size Size Size Time Throughput local remote local remote
    bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

    87380 16384 16384 10.00 5525.00 7.76 5.17 2.763 1.840

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Oct, 2013

1 commit

  • Now inet_gso_segment() is stackable, it's relatively easy to
    implement GSO/TSO support for IPIP

    Performance results, when segmentation is done after tunnel
    device (as no NIC is yet enabled for TSO IPIP support) :

    Before patch :

    lpq83:~# ./netperf -H 7.7.9.84 -Cc
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET
    Recv Send Send Utilization Service Demand
    Socket Socket Message Elapsed Send Recv Send Recv
    Size Size Size Time Throughput local remote local remote
    bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

    87380 16384 16384 10.00 3357.88 5.09 3.70 2.983 2.167

    After patch :

    lpq83:~# ./netperf -H 7.7.9.84 -Cc
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET
    Recv Send Send Utilization Service Demand
    Socket Socket Message Elapsed Send Recv Send Recv
    Size Size Size Time Throughput local remote local remote
    bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB

    87380 16384 16384 10.00 7710.19 4.52 6.62 1.152 1.687

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

28 May, 2013

1 commit

  • In the case where a non-MPLS packet is received and an MPLS stack is
    added it may well be the case that the original skb is GSO but the
    NIC used for transmit does not support GSO of MPLS packets.

    The aim of this code is to provide GSO in software for MPLS packets
    whose skbs are GSO.

    SKB Usage:

    When an implementation adds an MPLS stack to a non-MPLS packet it should do
    the following to skb metadata:

    * Set skb->inner_protocol to the old non-MPLS ethertype of the packet.
    skb->inner_protocol is added by this patch.

    * Set skb->protocol to the new MPLS ethertype of the packet.

    * Set skb->network_header to correspond to the
    end of the L3 header, including the MPLS label stack.

    I have posted a patch, "[PATCH v3.29] datapath: Add basic MPLS support to
    kernel" which adds MPLS support to the kernel datapath of Open vSwitch.
    That patch sets the above requirements in datapath/actions.c:push_mpls()
    and was used to exercise this code. The datapath patch is against the Open
    vSwitch tree but it is intended that it be added to the Open vSwitch code
    present in the mainline Linux kernel at some point.

    Features:

    I believe that the approach that I have taken is at least partially
    consistent with the handling of other protocols. Jesse, I understand that
    you have some ideas here. I am more than happy to change my implementation.

    This patch adds dev->mpls_features which may be used by devices
    to advertise features supported for MPLS packets.

    A new NETIF_F_MPLS_GSO feature is added for devices which support
    hardware MPLS GSO offload. Currently no devices support this
    and MPLS GSO always falls back to software.

    Alternate Implementation:

    One possible alternate implementation is to teach netif_skb_features()
    and skb_network_protocol() about MPLS, in a similar way to their
    understanding of VLANs. I believe this would avoid the need
    for net/mpls/mpls_gso.c and in particular the calls to
    __skb_pull() and __skb_push() in mpls_gso_segment().

    I have decided on the implementation in this patch as it should
    not introduce any overhead in the case where mpls_gso is not compiled
    into the kernel or inserted as a module.

    MPLS GSO suggested by Jesse Gross.
    Based in part on "v4 GRE: Add TCP segmentation offload for GRE"
    by Pravin B Shelar.

    Cc: Jesse Gross
    Cc: Pravin B Shelar
    Signed-off-by: Simon Horman
    Signed-off-by: David S. Miller

    Simon Horman
     

02 May, 2013

1 commit

  • Commit 8ad227ff89a7 ("net: vlan: add 802.1ad support") added some new
    NETIF_F_* features bits, but it added them in the middle of existing
    values.

    Userland depends upon the flag bits via the per-netdevice 'flags' sysfs
    file.

    So restore the previous ordering by adding the new flags at the end.

    Reported-by: Linus Torvalds
    Signed-off-by: David S. Miller
    Signed-off-by: Linus Torvalds

    David Miller
     

20 Apr, 2013

2 commits

  • Add support for 802.1ad VLAN devices. This mainly consists of checking for
    ETH_P_8021AD in addition to ETH_P_8021Q in a couple of places and checking
    offloading capabilities based on the used protocol.

    Configuration is done using "ip link":

    # ip link add link eth0 eth0.1000 \
    type vlan proto 802.1ad id 1000
    # ip link add link eth0.1000 eth0.1000.1000 \
    type vlan proto 802.1q id 1000

    52:54:00:12:34:56 > 92:b1:54:28:e4:8c, ethertype 802.1Q (0x8100), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    20.1.0.2 > 20.1.0.1: ICMP echo request, id 3003, seq 8, length 64
    92:b1:54:28:e4:8c > 52:54:00:12:34:56, ethertype 802.1Q-QinQ (0x88a8), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 47944, offset 0, flags [none], proto ICMP (1), length 84)
    20.1.0.1 > 20.1.0.2: ICMP echo reply, id 3003, seq 8, length 64

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
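
    The "check for ETH_P_8021AD in addition to ETH_P_8021Q" idea can be
    sketched in user space as below. The two EtherType values are the
    standard ones; is_vlan_ethertype() and count_vlan_tags() are
    hypothetical helpers written for this example, not functions from the
    patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q   0x8100  /* customer tag (802.1Q) */
#define ETH_P_8021AD  0x88A8  /* service tag (802.1ad) */

static bool is_vlan_ethertype(uint16_t proto)
{
    return proto == ETH_P_8021Q || proto == ETH_P_8021AD;
}

/*
 * Count the VLAN tags at the start of an Ethernet frame.
 * The EtherType sits at offset 12; each tag is 4 bytes (TPID + TCI).
 */
static int count_vlan_tags(const uint8_t *frame, size_t len)
{
    size_t off = 12;
    int tags = 0;

    while (off + 2 <= len) {
        uint16_t proto = (uint16_t)(frame[off] << 8 | frame[off + 1]);

        if (!is_vlan_ethertype(proto))
            break;
        tags++;
        off += 4;
    }
    return tags;
}
```

    A frame stacked like the "ip link" example above (an 802.1ad tag with
    id 1000 followed by an 802.1Q tag with id 1000) yields two tags.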
     
  • Rename the hardware VLAN acceleration features to include "CTAG" to indicate
    that they only support CTAGs. Follow up patches will introduce 802.1ad
    service provider tagging (STAGs) and require the distinction for hardware not
    supporting accelerating both.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

18 Mar, 2013

1 commit


10 Mar, 2013

1 commit

  • Adds generic tunneling offloading support for IPv4-UDP based
    tunnels.
    A GSO type is added to request this offload for a skb.
    netdev feature NETIF_F_UDP_TUNNEL is added for hardware offloaded
    udp-tunnel support. Currently no device supports this feature, so
    software offload is used.

    This can be used by tunneling protocols like VXLAN.

    CC: Jesse Gross
    Signed-off-by: Pravin B Shelar
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Pravin B Shelar
     

16 Feb, 2013

1 commit

  • The following patch adds a GRE protocol offload handler so that
    skb_gso_segment() can segment GRE packets.
    SKB GSO CB is added to keep track of total header length so that
    skb_segment can push the entire header, e.g. in the case of GRE, skb_segment
    needs to push inner and outer headers to every segment.
    A new NETIF_F_GSO_GRE feature is added for devices which support HW
    GRE TSO offload. Currently no devices support it, therefore GRE GSO
    always falls back to software GSO.

    [ Compute pkt_len before ip_local_out() invocation. -DaveM ]

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     

24 Feb, 2012

2 commits

  • This flag requests that network devices pass all
    received frames up the stack, even ones with errors
    such as invalid FCS (frame check sequence). This will
    allow sniffers to see bad packets and perhaps
    give the user some idea how to fix the problem.

    Signed-off-by: Ben Greear
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Ben Greear
     
  • When set on hardware that supports the feature,
    this causes the Ethernet FCS to be appended
    to the end of the skb.

    Useful for sniffing packets.

    Signed-off-by: Ben Greear
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Ben Greear
     

17 Nov, 2011

5 commits