  .. SPDX-License-Identifier: GPL-2.0

  =====================
  Segmentation Offloads
  =====================
  
  Introduction
  ============
  
  This document describes a set of techniques in the Linux networking stack
  that take advantage of the segmentation offload capabilities of various NICs.
  
  The following technologies are described:
   * TCP Segmentation Offload - TSO
   * UDP Fragmentation Offload - UFO
   * IPIP, SIT, GRE, and UDP Tunnel Offloads
   * Generic Segmentation Offload - GSO
   * Generic Receive Offload - GRO
   * Partial Generic Segmentation Offload - GSO_PARTIAL
   * SCTP acceleration with GSO - GSO_BY_FRAGS

  TCP Segmentation Offload
  ========================
  
  TCP segmentation allows a device to segment a single frame into multiple
  frames with a data payload size specified in skb_shinfo()->gso_size.
  When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
  SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
  skb_shinfo()->gso_size should be set to a non-zero value.
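
As a rough illustration of what gso_size implies, the sketch below computes how
many frames a device would emit for a given TCP payload.  The function name
tso_segment_count is hypothetical and is not a kernel API; it only models the
arithmetic, under the assumption that every frame except possibly the last
carries exactly gso_size bytes of payload.

```c
#include <stddef.h>

/* Hypothetical helper, not a kernel API: number of frames produced when
 * payload_len bytes of TCP payload are cut into pieces of at most
 * gso_size bytes.  Returns 0 when gso_size is 0, which in this model
 * means that no segmentation was requested. */
static size_t tso_segment_count(size_t payload_len, size_t gso_size)
{
    if (gso_size == 0)
        return 0;
    /* Round up so that a short final segment is counted. */
    return (payload_len + gso_size - 1) / gso_size;
}
```

For example, a 4000 byte payload with a gso_size of 1448 would be sent as two
full segments followed by one shorter segment.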
  
  TCP segmentation is dependent on support for the use of partial checksum
  offload.  For this reason TSO is normally disabled if the Tx checksum
  offload for a given device is disabled.
  
  In order to support TCP segmentation offload it is necessary to populate
  the network and transport header offsets of the skbuff so that the device
  drivers will be able to determine the offsets of the IP or IPv6 header and
  the TCP header.  In addition, as CHECKSUM_PARTIAL is required, csum_start
  should also point to the TCP header of the packet.
  
  For IPv4 segmentation we support one of two types in terms of the IP ID.
  The default behavior is to increment the IP ID with every segment.  If the
  GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
  ID and all segments will use the same IP ID.  If a device has
  NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
  and we will either increment the IP ID for all frames, or leave it at a
  static value based on driver preference.
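
The two IP ID behaviours can be modelled in a few lines.  This is only a
sketch: segment_ip_id and its parameters are hypothetical names, and the
fixed_id flag stands in for the SKB_GSO_TCP_FIXEDID bit described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: IPv4 ID of the index-th segment
 * (0-based) cut from a frame whose first segment uses first_id.
 * fixed_id mirrors the SKB_GSO_TCP_FIXEDID behaviour. */
static uint16_t segment_ip_id(uint16_t first_id, unsigned int index,
                              bool fixed_id)
{
    if (fixed_id)
        return first_id;                 /* every segment shares one ID */
    return (uint16_t)(first_id + index); /* increments, wrapping at 2^16 */
}
```

A device with NETIF_F_TSO_MANGLEID set is free to behave like either branch,
which is why the stack cannot rely on a particular ID sequence in that case.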

  UDP Fragmentation Offload
  =========================
  
  UDP fragmentation offload allows a device to fragment an oversized UDP
  datagram into multiple IPv4 fragments.  Many of the requirements for UDP
  fragmentation offload are the same as TSO.  However the IPv4 ID for
  fragments should not increment as a single IPv4 datagram is fragmented.
  UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
  still receive them from tuntap and similar devices. Offload of UDP-based
  tunnel protocols is still supported.
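
For readers unfamiliar with IPv4 fragmentation, the sketch below shows the
arithmetic a fragmenting device has to perform.  The helper names are
hypothetical.  It relies only on the IPv4 rule that fragment offsets are
expressed in units of 8 bytes, so every fragment except the last must carry a
payload that is a multiple of 8.

```c
#include <stddef.h>

/* Hypothetical helper: payload bytes carried by each non-final IPv4
 * fragment, i.e. the room left after the IP header rounded down to a
 * multiple of 8.  Returns 0 if the MTU cannot hold any payload. */
static size_t frag_payload(size_t mtu, size_t ip_hdr_len)
{
    if (mtu <= ip_hdr_len)
        return 0;
    return (mtu - ip_hdr_len) & ~(size_t)7;
}

/* Hypothetical helper: number of fragments needed for datagram_len bytes
 * of IP payload (UDP header plus data).  Returns 0 if nothing fits. */
static size_t frag_count(size_t datagram_len, size_t mtu, size_t ip_hdr_len)
{
    size_t per_frag = frag_payload(mtu, ip_hdr_len);

    if (per_frag == 0)
        return 0;
    return (datagram_len + per_frag - 1) / per_frag;
}
```

All fragments produced this way belong to one datagram and therefore share a
single IPv4 ID, in contrast to the per-segment IDs used for TSO.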

  IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
  ========================================================
  
  In addition to the offloads described above, it is possible for a frame to
  contain additional headers such as an outer tunnel.  In order to account
  for such instances an additional set of segmentation offload types was
  introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
  SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
  cases where there is more than one set of headers.  For example, in the
  case of IPIP and SIT the network and transport headers are moved from the
  standard list of headers to the "inner" header offsets.
  
  Currently only two levels of headers are supported.  The convention is to
  refer to the tunnel headers as the outer headers, while the encapsulated
  data is normally referred to as the inner headers.  Below is the list of
  calls to access the given headers:

  IPIP/SIT Tunnel::
  
               Outer                  Inner
    MAC        skb_mac_header
    Network    skb_network_header     skb_inner_network_header
    Transport  skb_transport_header

  UDP/GRE Tunnel::
  
               Outer                  Inner
    MAC        skb_mac_header         skb_inner_mac_header
    Network    skb_network_header     skb_inner_network_header
    Transport  skb_transport_header   skb_inner_transport_header
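
The relationship between the two columns can be sketched with a small
userspace model.  Everything here is hypothetical: struct hdr_offsets is not
struct sk_buff, and the two functions only illustrate the idea that the
existing headers are recorded as "inner" before the tunnel headers take over
the standard slots.

```c
#include <stdint.h>

/* Hypothetical model of the header offsets tracked for one packet.
 * Offsets are byte positions from the start of the buffer. */
struct hdr_offsets {
    uint16_t mac, network, transport;                   /* outer headers */
    uint16_t inner_mac, inner_network, inner_transport; /* inner headers */
};

/* Record the current headers as the inner headers.  Intended to be
 * called before the tunnel headers are prepended. */
static void save_inner_headers(struct hdr_offsets *h)
{
    h->inner_mac = h->mac;
    h->inner_network = h->network;
    h->inner_transport = h->transport;
}

/* Point the standard slots at the newly prepended tunnel headers. */
static void set_outer_headers(struct hdr_offsets *h, uint16_t mac,
                              uint16_t network, uint16_t transport)
{
    h->mac = mac;
    h->network = network;
    h->transport = transport;
}
```

After both calls the standard accessors describe the tunnel and the inner
accessors describe the encapsulated packet, which matches the two columns in
the tables above.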
  
  In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
  SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types indicate that
  the outer header also requests a non-zero checksum.
  Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
  header has requested a remote checksum offload.  In this case the inner
  headers will be left with a partial checksum and only the outer header
  checksum will be computed.

  Generic Segmentation Offload
  ============================
  
  Generic segmentation offload is a pure software offload that is meant to
  deal with cases where device drivers cannot perform the offloads described
  above.  What occurs in GSO is that a given skbuff will have its data broken
  out over multiple skbuffs that have been resized to match the MSS provided
  via skb_shinfo()->gso_size.
  
  Before enabling any hardware segmentation offload a corresponding software
  offload is required in GSO.  Otherwise a frame could be re-routed between
  devices and end up on one that is unable to transmit it.
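
The core of software segmentation is a loop that walks the payload in
MSS-sized steps.  The sketch below models only that loop: struct segment and
gso_split are hypothetical names, and the header copying and checksum fix-ups
that real GSO performs for every output skbuff are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical descriptor for one output segment: a view into the
 * original payload rather than a real skbuff. */
struct segment {
    size_t offset; /* where this segment starts in the payload */
    size_t len;    /* number of payload bytes in this segment */
};

/* Split payload_len bytes into pieces of at most mss bytes.  Fills out[]
 * and returns the number of segments, or 0 if mss is 0 or out[] has
 * fewer than the required number of entries. */
static size_t gso_split(size_t payload_len, size_t mss,
                        struct segment *out, size_t max_segs)
{
    size_t n = 0, off = 0;

    if (mss == 0)
        return 0;
    while (off < payload_len) {
        size_t left = payload_len - off;

        if (n == max_segs)
            return 0; /* caller did not provide enough room */
        out[n].offset = off;
        out[n].len = left < mss ? left : mss;
        off += out[n].len;
        n++;
    }
    return n;
}
```

Only the final segment may be shorter than the MSS, which is the same shape
of output a TSO-capable device would produce in hardware.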

  Generic Receive Offload
  =======================
  
  Generic receive offload is the complement to GSO.  Ideally any frame
  assembled by GRO should be segmented to create an identical sequence of
  frames using GSO, and any sequence of frames segmented by GSO should be
  able to be reassembled back to the original by GRO.  The only exception to
  this is IPv4 ID in the case that the DF bit is set for a given IP header.
  If the value of the IPv4 ID is not sequentially incrementing it will be
  altered so that it is when a frame assembled via GRO is segmented via GSO.

  Partial Generic Segmentation Offload
  ====================================
  
  Partial generic segmentation offload is a hybrid between TSO and GSO.  What
  it effectively does is take advantage of certain traits of TCP and tunnels
  so that instead of having to rewrite the packet headers for each segment
  only the inner-most transport header and possibly the outer-most network
  header need to be updated.  This allows devices that do not support tunnel
  offloads or tunnel offloads with checksum to still make use of segmentation.
  
  With the partial offload, all headers excluding the inner transport
  header are updated so that they contain the values that would be correct
  if the header were simply duplicated.  The one exception to this is the
  outer IPv4 ID field.  It is up to the device drivers to guarantee that the
  IPv4 ID field is incremented in the case that a given header does not have
  the DF bit set.

  SCTP acceleration with GSO
  ===========================
  
  SCTP - despite the lack of hardware support - can still take advantage of
  GSO to pass one large packet through the network stack, rather than
  multiple small packets.
  
  This requires a different approach to other offloads, as SCTP packets
  cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
  IP segments, padding respected. So unlike regular GSO, SCTP can't just
  generate a big skb, set gso_size to the fragmentation point and deliver it
  to the IP layer.
  
  Instead, the SCTP protocol layer builds an skb with the segments correctly
  padded and stored as chained skbs, and skb_segment() splits based on those.
  To signal this, gso_size is set to the special value GSO_BY_FRAGS.
  
  Therefore, any code in the core networking stack must be aware of the
  possibility that gso_size will be GSO_BY_FRAGS and handle that case
  appropriately.

  There are some helpers to make this easier:

  - skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
    an skb is an SCTP GSO skb.

  - For size checks, the skb_gso_validate_*_len family of helpers correctly
    considers GSO_BY_FRAGS.

  - For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
    will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.
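
To show why size checks need a special case, here is a userspace sketch of a
length validation that handles GSO_BY_FRAGS.  The structs and the function
gso_segments_fit are hypothetical stand-ins, not the kernel helpers named
above, and the value 0xFFFF for GSO_BY_FRAGS is an assumption that should be
checked against the kernel headers in use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sentinel value; verify against the kernel headers. */
#define GSO_BY_FRAGS 0xFFFF

/* Hypothetical stand-in for one chained skb built by the SCTP layer. */
struct frag {
    size_t len;
    const struct frag *next;
};

/* Hypothetical stand-in for a GSO packet. */
struct gso_pkt {
    unsigned int gso_size;
    size_t hdr_len;               /* headers repeated on every segment */
    const struct frag *frag_list; /* only used with GSO_BY_FRAGS */
};

/* Return true if every resulting segment, headers included, fits within
 * max_len bytes. */
static bool gso_segments_fit(const struct gso_pkt *p, size_t max_len)
{
    const struct frag *f;

    /* Regular GSO: every segment is at most hdr_len + gso_size bytes. */
    if (p->gso_size != GSO_BY_FRAGS)
        return p->hdr_len + p->gso_size <= max_len;

    /* GSO_BY_FRAGS: gso_size is a marker, so each chained piece has to
     * be measured individually. */
    for (f = p->frag_list; f; f = f->next)
        if (p->hdr_len + f->len > max_len)
            return false;
    return true;
}
```

Code that treats gso_size as a byte count without the first test would
compare 65535 against the limit and reject every SCTP GSO packet, which is
the class of bug the helpers listed above are meant to prevent.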
  
  This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
  set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.