02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
20 Jul, 2017
1 commit
-
This patch adjusts the timeout formula to schedule the TCP loss probe
(TLP). The previous formula uses 2*SRTT or 1.5*RTT + DelayACKMax if
only one packet is in flight. It keeps a lower bound of 10 msec which
is too large for short RTT connections (e.g. within a data-center).
The new formula = 2*RTT + (inflight == 1 ? 200ms : 2ticks) which
performs better for short and fast connections.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Signed-off-by: David S. Miller
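The new TLP formula quoted in the commit above can be sketched as a small function. This is an illustration only, not the kernel code: the helper name tlp_timeout_us, the microsecond units and the tick_us parameter (one timer tick, i.e. 1000000 / HZ) are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper, not the kernel function: TLP timeout in usec,
 * following "2*RTT + (inflight == 1 ? 200ms : 2ticks)" from the
 * commit message. tick_us is the assumed length of one timer tick. */
static uint64_t tlp_timeout_us(uint64_t rtt_us, unsigned int inflight,
                               uint64_t tick_us)
{
    const uint64_t single_packet_extra_us = 200 * 1000; /* 200 ms */
    uint64_t extra = (inflight == 1) ? single_packet_extra_us
                                     : 2 * tick_us;

    return 2 * rtt_us + extra;
}
```

With a 100 usec RTT, two packets in flight and 1 ms ticks, this yields 2.2 ms, well below the 10 msec lower bound that the message says the previous formula kept.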
18 May, 2017
2 commits
-
TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4ms or 10ms (HZ=250 or HZ=100 respectively).

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
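The resolution figures in this message follow from simple arithmetic, sketched below. Both helper names are hypothetical and do not appear in the kernel; the second one only illustrates how a 1 ms timestamp value could be derived from a 64-bit microsecond clock, assuming truncation to the 32-bit width of the timestamp field in RFC 7323.

```c
#include <stdint.h>

/* Granularity of a jiffies-based timestamp clock, in milliseconds. */
static unsigned int jiffies_tick_ms(unsigned int hz)
{
    return 1000 / hz;
}

/* Hypothetical helper: derive a 1 ms timestamp value from a 64-bit
 * usec clock. The cast truncates to 32 bits, the width of the TSval
 * field in the TCP Timestamps option. */
static uint32_t usec_clock_to_ts_ms(uint64_t clock_us)
{
    return (uint32_t)(clock_us / 1000);
}
```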
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller
-
The idea is to later convert tp->tcp_mstamp to a full u64 counter
using usec resolution, so that we can later have a fine-grained
TCP TS clock (RFC 7323), regardless of HZ value.

We try to refresh tp->tcp_mstamp only when necessary.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller
27 Apr, 2017
4 commits
-
I wrongly assumed tp->tcp_mstamp was up to date at the time
tcp_rack_reo_timeout() was called.

It is not true, since we only update tp->tcp_mstamp when receiving
a packet (as initially done in commit 69e996c58a35 ("tcp: add
tp->tcp_mstamp field")).

Since tcp_rack_reo_timeout() is called by a timer and not by an incoming
packet, we need to refresh tp->tcp_mstamp.

Fixes: 7c1c7308592f ("tcp: do not pass timestamp to tcp_rack_detect_loss()")
Signed-off-by: Eric Dumazet
Cc: Soheil Hassas Yeganeh
Cc: Neal Cardwell
Cc: Yuchung Cheng
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
-
No longer needed, since tp->tcp_mstamp holds the information.
This is needed to remove sack_state.ack_time in a following patch.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
-
This is no longer used, since tcp_rack_detect_loss() takes
the timestamp from tp->tcp_mstamp.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
-
We can use tp->tcp_mstamp as it contains a recent timestamp.
This removes a call to skb_mstamp_get() from tcp_rack_reo_timeout()
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
06 Apr, 2017
1 commit
-
The lost retransmit SNMP stat is under-counting retransmissions
that use segment offloading. This patch fixes that so all
retransmission related SNMP counters are consistent.

Fixes: 10d3be569243 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Signed-off-by: Neal Cardwell
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller
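The under-counting described above can be illustrated with a toy model. It assumes that the problem is a counter incremented once per buffer instead of once per packet carried by that buffer; the commit message implies this but does not spell it out. The struct and both function names are hypothetical stand-ins, not kernel definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for an skb: one buffer that may carry several
 * packets when segmentation offload is in use. */
struct fake_skb {
    unsigned int pcount; /* number of packets carried by this buffer */
};

/* One event per buffer: under-counts offloaded retransmissions. */
static void count_per_skb(uint64_t *stat, const struct fake_skb *skb)
{
    (void)skb;
    *stat += 1;
}

/* One event per packet carried by the buffer. */
static void count_per_packet(uint64_t *stat, const struct fake_skb *skb)
{
    *stat += skb->pcount;
}
```

For a buffer carrying four packets, the first counter advances by 1 and the second by 4, which is the kind of inconsistency between retransmission counters that the message refers to.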
14 Jan, 2017
6 commits
-
This patch changes two things:

1. Start fast recovery with RACK in addition to other heuristics
   (e.g., DUPACK threshold, FACK). Prior to this change RACK
   is enabled to detect losses only after the recovery has
   been started by other algorithms.

2. Disable TCP early retransmit. RACK subsumes the early retransmit
   with the new reordering timer feature. A later patch in this
   series removes the early retransmit code.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
The packets inside a jumbo skb (e.g., TSO) share the same skb
timestamp, even though they are sent sequentially on the wire. Since
RACK is based on time, it cannot detect that some packets inside the
same skb are lost. However, we can leverage the packet sequence
numbers as extended timestamps to detect losses. Therefore, when
RACK timestamp is identical to skb's timestamp (i.e., one of the
packets of the skb is acked or sacked), we use the sequence numbers
of the acked and unacked packets to break ties.

We can use the same sequence logic to advance RACK xmit time as
well to detect more losses and avoid timeout.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
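The tie-break from the preceding commit (packets of one TSO skb sharing a timestamp) can be sketched as a comparison that falls back to sequence numbers when the timestamps are equal. The helper names seq_after and rack_sent_after are chosen for this illustration and are assumptions, as is the use of plain integers for timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "seq1 comes after seq2" for 32-bit sequence numbers. */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq2 - seq1) < 0;
}

/* Hypothetical helper: was packet 1 sent after packet 2?
 * Timestamps decide first; sequence numbers break ties. */
static bool rack_sent_after(uint64_t t1, uint64_t t2,
                            uint32_t seq1, uint32_t seq2)
{
    return t1 > t2 || (t1 == t2 && seq_after(seq1, seq2));
}
```

Two packets with the same timestamp are then ordered by sequence number, so an unacked packet with a lower sequence than an acked one from the same skb can still be recognised as sent earlier.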
This patch makes RACK install a reordering timer when it suspects
some packets might be lost, but wants to delay the decision
a little bit to accommodate reordering.

It does not create a new timer but instead repurposes the existing
RTO timer, because both are meant to retransmit packets.
Specifically it arms a timer ICSK_TIME_REO_TIMEOUT when
the RACK timing check fails. The wait time is set to

  RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time) + fudge

This translates to expecting that a packet (Packet) should take
(RACK.RTT + RACK.reo_wnd + fudge) to deliver after it was sent.

When there are multiple packets that need a timer, we use one timer
with the maximum timeout. Therefore the timer conservatively uses
the maximum window to expire N packets by one timeout, instead of
N timeouts to expire N packets sent at different times.

The fudge factor is 2 jiffies to ensure that when the timer fires, all
the suspected packets would exceed the deadline and be marked lost
by tcp_rack_detect_loss(). It has to be at least 1 jiffy because the
clock may tick between calling icsk_reset_xmit_timer(timeout) and
actually hanging the timer. The next jiffy is to lower-bound the timeout
to 2 jiffies when reo_wnd is < 1ms.

When the reordering timer fires (tcp_rack_reo_timeout): If we aren't
in Recovery we'll enter fast recovery and force fast retransmit.
This is very similar to the early retransmit (RFC5827) except RACK
is not constrained to only enter recovery for small outstanding
flights.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
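The wait-time formula from the preceding reordering timer commit, and the rule of arming one timer with the maximum timeout, can be sketched as follows. This is a simplified model under stated assumptions: all values are in microseconds, NOW is never earlier than a transmit time, and a packet already past its deadline simply gets a wait of 0. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: remaining wait for one packet, following
 *   RACK.RTT + RACK.reo_wnd - (NOW - Packet.xmit_time) + fudge
 * Returns 0 when the deadline has already passed. */
static uint64_t rack_reo_wait_us(uint64_t rack_rtt, uint64_t reo_wnd,
                                 uint64_t now, uint64_t xmit_time,
                                 uint64_t fudge)
{
    uint64_t budget = rack_rtt + reo_wnd + fudge;
    uint64_t elapsed = now - xmit_time;

    return elapsed >= budget ? 0 : budget - elapsed;
}

/* One timer for N packets: use the maximum of the per-packet waits. */
static uint64_t rack_reo_timer_us(const uint64_t *xmit_times, size_t n,
                                  uint64_t rack_rtt, uint64_t reo_wnd,
                                  uint64_t now, uint64_t fudge)
{
    uint64_t timeout = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t wait = rack_reo_wait_us(rack_rtt, reo_wnd, now,
                                         xmit_times[i], fudge);
        if (wait > timeout)
            timeout = wait;
    }
    return timeout;
}
```

Taking the maximum means the most recently sent suspect packet determines when the single timer fires, so every suspect has exceeded its deadline by then.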
Record the most recent RTT in RACK. It is often identical to the
"ca_rtt_us" values in tcp_clean_rtx_queue. But when the packet has
been retransmitted, RACK chooses to believe the ACK is for the
(latest) retransmitted packet if the RTT is over minimum RTT.

This requires passing the arrival time of the most recent ACK to
RACK routines. The timestamp is now recorded in the "ack_time"
in tcp_sacktag_state during the ACK processing.

This patch does not change the RACK algorithm itself. It only adds
the RTT variable to prepare the next main patch.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Create a new helper tcp_rack_detect_loss to prepare the upcoming
RACK reordering timer patch.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Create a new helper tcp_rack_mark_skb_lost to prepare the
upcoming RACK reordering timer support.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
03 May, 2016
1 commit
-
We want to make TCP stack preemptible, as draining prequeue
and backlog queues can take a lot of time.

Many SNMP updates were assuming that BH (and preemption) was disabled.

Need to convert some __NET_INC_STATS() calls to NET_INC_STATS()
and some __TCP_INC_STATS() to TCP_INC_STATS()

Before using this_cpu_ptr(net->ipv4.tcp_sk) in tcp_v4_send_reset()
and tcp_v4_send_ack(), we add an explicit preempt disabled section.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller
28 Apr, 2016
1 commit
-
Rename NET_INC_STATS_BH() to __NET_INC_STATS()
and NET_ADD_STATS_BH() to __NET_ADD_STATS()
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
21 Oct, 2015
2 commits
-
This patch implements the second half of RACK that uses the most
recent transmit time among all delivered packets to detect losses.

tcp_rack_mark_lost() is called upon receiving a dubious ACK.
It then checks if a not-yet-sacked packet was sent at least
"reo_wnd" prior to the sent time of the most recently delivered packet.
If so the packet is deemed lost.

The "reo_wnd" reordering window starts with 1msec for fast loss
detection and changes to min-RTT/4 when reordering is observed.
We found 1msec accommodates well on tiny degree of reordering
(
Signed-off-by: Neal Cardwell
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
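The loss check and the reordering window from the preceding commit (the second half of RACK) can be sketched as two small functions. The names, the microsecond units and the "at least" comparison are taken from the wording of the message or assumed for the example; they are not copied from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define REO_WND_INITIAL_US 1000 /* 1 msec, per the commit message */

/* Reordering window: 1 msec until reordering is seen, then min-RTT/4. */
static uint64_t rack_reo_wnd_us(bool reordering_seen, uint64_t min_rtt_us)
{
    return reordering_seen ? min_rtt_us / 4 : REO_WND_INITIAL_US;
}

/* Hypothetical helper: a not-yet-sacked packet is deemed lost if it was
 * sent at least reo_wnd before the most recently delivered packet. */
static bool rack_pkt_lost(uint64_t pkt_xmit_us, uint64_t rack_xmit_us,
                          uint64_t reo_wnd_us)
{
    return rack_xmit_us >= pkt_xmit_us &&
           rack_xmit_us - pkt_xmit_us >= reo_wnd_us;
}
```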
This patch is the first half of the RACK loss recovery.
RACK loss recovery uses the notion of time instead
of packet sequence (FACK) or counts (dupthresh). It's inspired by the
previous FACK heuristic in tcp_mark_lost_retrans(): when a limited
transmit (new data packet) is sacked, then the current retransmitted
sequence below the newly sacked sequence must have been lost,
since at least one round trip time has elapsed.

But that heuristic has several limitations:
1) can't detect tail drops since it depends on limited transmit
2) is disabled upon reordering (assumes no reordering)
3) is only enabled in fast recovery but not timeout recovery

RACK (Recently ACK) addresses these limitations with the notion
of time instead: a packet P1 is lost if a later packet P2 is s/acked,
as at least one round trip has passed.

Since RACK cares about the time sequence instead of the data sequence
of packets, it can detect tail drops when a later retransmission is
s/acked while FACK or dupthresh can't. For reordering RACK uses a
dynamically adjusted reordering window ("reo_wnd") to reduce false
positives on every (small) degree of reordering.

This patch implements tcp_rack_advance() which tracks the
most recent transmission time among the packets that have been
delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp
is the key to determine which packet has been lost.

Consider an example that the sender sends six packets:
T1: P1 (lost)
T2: P2
T3: P3
T4: P4
T100: sack of P2. rack.mstamp = T2
T101: retransmit P1
T102: sack of P2,P3,P4. rack.mstamp = T4
T205: ACK of P4 since the hole is repaired. rack.mstamp = T101

We need to be careful about spurious retransmission because it may
falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK
to falsely mark all packets lost, just like a spurious timeout.

We identify spurious retransmission by the ACK's TS echo value.
If TS option is not applicable but the retransmission is acknowledged
less than min-RTT ago, it is likely to be spurious. We refrain from
using the transmission time of these spurious retransmissions.

The second half is implemented in the next patch that marks packet
lost using RACK timestamp.
Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
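The timeline in the message above can be replayed with a minimal model of the tracking step. It is a sketch, not the kernel implementation: the struct, the function name rack_advance and the abstract T time units are assumptions, and only the min-RTT test for spurious retransmissions is modelled (the TS echo check is left out).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the RACK state described in the message. */
struct rack_state {
    uint64_t mstamp; /* most recent xmit time among delivered packets */
};

/* Record a delivered (ACKed or SACKed) packet. Times use the abstract
 * T units of the example timeline. */
static void rack_advance(struct rack_state *rack, uint64_t xmit_time,
                         uint64_t ack_time, bool retransmitted,
                         uint64_t min_rtt)
{
    /* A retransmission acknowledged in less than min-RTT is likely
     * spurious, so its transmit time is not used. */
    if (retransmitted && ack_time - xmit_time < min_rtt)
        return;
    if (xmit_time > rack->mstamp)
        rack->mstamp = xmit_time;
}
```

Feeding it the events of the example with a min-RTT of 98 gives mstamp = 2 after the first SACK, 4 after the SACK of P2, P3 and P4, and 101 once the retransmission sent at T101 is acknowledged at T205.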