Eric Lee / smarc-fsl-linux-kernel

21 May, 2019

1 commit

457c89965 treewide: Add SPDX license identifier for missed files ... Browse Code »

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-21 16:50:45 +0800

22 Sep, 2018

1 commit

2fd66ffba tcp: introduce tcp_skb_timestamp_us() helper ... Browse Code »

There are few places where TCP reads skb->skb_mstamp expecting
a value in usec unit.

skb->tstamp (aka skb->skb_mstamp) will soon store CLOCK_TAI nsec value.

Add tcp_skb_timestamp_us() to provide proper conversion when needed.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2018-09-22 10:37:59 +0800

12 Jul, 2018

1 commit

4929c9428 tcp: expose both send and receive intervals for rate sample ... Browse Code »

Congestion control algorithms, which access the rate sample
through the tcp_cong_control function, only have access to the maximum
of the send and receive interval, for cases where the acknowledgment
rate may be inaccurate due to ACK compression or decimation. Algorithms
may want to use send rates and receive rates as separate signals.

Signed-off-by: Deepti Raghavan
Acked-by: Neal Cardwell
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Deepti Raghavan
2018-07-12 14:01:56 +0800

08 Dec, 2017

1 commit

d4761754b tcp: invalidate rate samples during SACK reneging ... Browse Code »

Mark tcp_sock during a SACK reneging event and invalidate rate samples
while marked. Such rate samples may overestimate bw by including packets
that were SACKed before reneging.

< ack 6001 win 10000 sack 7001:38001
< ack 7001 win 0 sack 8001:38001 // Reneg detected
> seq 7001:8001 // RTO, SACK cleared.
< ack 38001 win 10000

In above example the rate sample taken after the last ack will count
7001-38001 as delivered while the actual delivery rate likely could
be much lower i.e. 7001-8001.

This patch adds a new field tcp_sock.sack_reneg and marks it when we
declare SACK reneging and entering TCP_CA_Loss, and unmarks it after
the last rate sample was taken before moving back to TCP_CA_Open. This
patch also invalidates rate samples taken while tcp_sock.is_sack_reneg
is set.

Fixes: b9f64820fb22 ("tcp: track data delivery rate for a TCP connection")
Signed-off-by: Yousuk Seung
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Acked-by: Soheil Hassas Yeganeh
Acked-by: Eric Dumazet
Acked-by: Priyaranjan Jha
Signed-off-by: David S. Miller

Yousuk Seung
2017-12-08 23:07:02 +0800

16 Jun, 2017

1 commit

e3b5616a3 tcp: export do_tcp_sendpages and tcp_rate_check_app_limited functions ... Browse Code »

Export do_tcp_sendpages and tcp_rate_check_app_limited, since tls will need to
sendpages while the socket is already locked.

tcp_sendpage is exported, but requires the socket lock to not be held already.

Signed-off-by: Aviad Yehezkel
Signed-off-by: Ilya Lesokhin
Signed-off-by: Boris Pismenny
Signed-off-by: Dave Watson
Signed-off-by: David S. Miller

Dave Watson
2017-06-16 00:12:40 +0800

18 May, 2017

1 commit

9a568de48 tcp: switch TCP TS option (RFC 7323) to 1ms clock ... Browse Code »

TCP Timestamps option is defined in RFC 7323

Traditionally on linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4ms or 10ms (HZ=250 or HZ=100 respectively)

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1]

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Eric Dumazet
2017-05-18 04:06:01 +0800

27 Apr, 2017

1 commit

88d5c6509 tcp: do not pass timestamp to tcp_rate_gen() ... Browse Code »

No longer needed, since tp->tcp_mstamp holds the information.

This is needed to remove sack_state.ack_time in a following patch.

Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Eric Dumazet
2017-04-27 02:44:38 +0800

21 Sep, 2016

3 commits

eb8329e0a tcp: export data delivery rate ... Browse Code »

This commit export two new fields in struct tcp_info:

tcpi_delivery_rate: The most recent goodput, as measured by
tcp_rate_gen(). If the socket is limited by the sending
application (e.g., no data to send), it reports the highest
measurement instead of the most recent. The unit is bytes per
second (like other rate fields in tcp_info).

tcpi_delivery_rate_app_limited: A boolean indicating if the goodput
was measured when the socket's throughput was limited by the
sending application.

This delivery rate information can be useful for applications that
want to know the current throughput the TCP connection is seeing,
e.g. adaptive bitrate video streaming. It can also be very useful for
debugging or troubleshooting.

Signed-off-by: Van Jacobson
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Nandita Dukkipati
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Yuchung Cheng
2016-09-21 12:23:00 +0800
d7722e857 tcp: track application-limited rate samples ... Browse Code »

This commit adds code to track whether the delivery rate represented
by each rate_sample was limited by the application.

Upon each transmit, we store in the is_app_limited field in the skb a
boolean bit indicating whether there is a known "bubble in the pipe":
a point in the rate sample interval where the sender was
application-limited, and did not transmit even though the cwnd and
pacing rate allowed it.

This logic marks the flow app-limited on a write if *all* of the
following are true:

1) There is less than 1 MSS of unsent data in the write queue
available to transmit.

2) There is no packet in the sender's queues (e.g. in fq or the NIC
tx queue).

3) The connection is not limited by cwnd.

4) There are no lost packets to retransmit.

The tcp_rate_check_app_limited() code in tcp_rate.c determines whether
the connection is application-limited at the moment. If the flow is
application-limited, it sets the tp->app_limited field. If the flow is
application-limited then that means there is effectively a "bubble" of
silence in the pipe now, and this silence will be reflected in a lower
bandwidth sample for any rate samples from now until we get an ACK
indicating this bubble has exited the pipe: specifically, until we get
an ACK for the next packet we transmit.

When we send every skb we record in scb->tx.is_app_limited whether the
resulting rate sample will be application-limited.

The code in tcp_rate_gen() checks to see when it is safe to mark all
known application-limited bubbles of silence as having exited the
pipe. It does this by checking to see when the delivered count moves
past the tp->app_limited marker. At this point it zeroes the
tp->app_limited marker, as all known bubbles are out of the pipe.

We make room for the tx.is_app_limited bit in the skb by borrowing a
bit from the in_flight field used by NV to record the number of bytes
in flight. The receive window in the TCP header is 16 bits, and the
max receive window scaling shift factor is 14 (RFC 1323). So the max
receive window offered by the TCP protocol is 2^(16+14) = 2^30. So we
only need 30 bits for the tx.in_flight used by NV.

Signed-off-by: Van Jacobson
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Nandita Dukkipati
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Soheil Hassas Yeganeh
2016-09-21 12:23:00 +0800
b9f64820f tcp: track data delivery rate for a TCP connection ... Browse Code »

This patch generates data delivery rate (throughput) samples on a
per-ACK basis. These rate samples can be used by congestion control
modules, and specifically will be used by TCP BBR in later patches in
this series.

Key state:

tp->delivered: Tracks the total number of data packets (original or not)
delivered so far. This is an already-existing field.

tp->delivered_mstamp: the last time tp->delivered was updated.

Algorithm:

A rate sample is calculated as (d1 - d0)/(t1 - t0) on a per-ACK basis:

d1: the current tp->delivered after processing the ACK
t1: the current time after processing the ACK

d0: the prior tp->delivered when the acked skb was transmitted
t0: the prior tp->delivered_mstamp when the acked skb was transmitted

When an skb is transmitted, we snapshot d0 and t0 in its control
block in tcp_rate_skb_sent().

When an ACK arrives, it may SACK and ACK some skbs. For each SACKed
or ACKed skb, tcp_rate_skb_delivered() updates the rate_sample struct
to reflect the latest (d0, t0).

Finally, tcp_rate_gen() generates a rate sample by storing
(d1 - d0) in rs->delivered and (t1 - t0) in rs->interval_us.

One caveat: if an skb was sent with no packets in flight, then
tp->delivered_mstamp may be either invalid (if the connection is
starting) or outdated (if the connection was idle). In that case,
we'll re-stamp tp->delivered_mstamp.

At first glance it seems t0 should always be the time when an skb was
transmitted, but actually this could over-estimate the rate due to
phase mismatch between transmit and ACK events. To track the delivery
rate, we ensure that if packets are in flight then t0 and and t1 are
times at which packets were marked delivered.

If the initial and final RTTs are different then one may be corrupted
by some sort of noise. The noise we see most often is sending gaps
caused by delayed, compressed, or stretched acks. This either affects
both RTTs equally or artificially reduces the final RTT. We approach
this by recording the info we need to compute the initial RTT
(duration of the "send phase" of the window) when we recorded the
associated inflight. Then, for a filter to avoid bandwidth
overestimates, we generalize the per-sample bandwidth computation
from:

bw = delivered / ack_phase_rtt

to the following:

bw = delivered / max(send_phase_rtt, ack_phase_rtt)

In large-scale experiments, this filtering approach incorporating
send_phase_rtt is effective at avoiding bandwidth overestimates due to
ACK compression or stretched ACKs.

Signed-off-by: Van Jacobson
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Signed-off-by: Nandita Dukkipati
Signed-off-by: Eric Dumazet
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: David S. Miller

Yuchung Cheng
2016-09-21 12:23:00 +0800