07 Aug, 2017

1 commit

  • Most TCP congestion controls, BBR excepted, use identical logic to
    undo cwnd. This patch consolidates these similar functions into the
    one currently used by Reno and others.

    Suggested-by: Neal Cardwell
    Signed-off-by: Yuchung Cheng
    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Yuchung Cheng
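
As a rough illustration of what a shared undo helper can look like, here is a minimal sketch. The struct, its field names, and the function name are hypothetical stand-ins, and the "return the larger of the current and the saved window" rule is an assumption about the Reno-style logic, not a copy of the kernel code.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-socket TCP state. */
struct undo_state {
    uint32_t snd_cwnd;   /* current congestion window, in packets */
    uint32_t prior_cwnd; /* window saved just before the reduction (assumed field) */
};

/* Shared undo rule (assumed): never shrink the window on undo,
 * and restore at least the value saved before the reduction. */
static uint32_t reno_style_undo_cwnd(const struct undo_state *tp)
{
    return tp->snd_cwnd > tp->prior_cwnd ? tp->snd_cwnd : tp->prior_cwnd;
}
```

With one helper of this shape, each module that has no special undo needs can point its undo hook at the shared function instead of carrying its own copy.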
     

18 May, 2017

1 commit


12 May, 2016

1 commit

  • Replace 2 arguments (cnt and rtt) in the congestion control modules'
    pkts_acked() function with a struct. This will allow adding more
    information without having to modify existing congestion control
    modules (tcp_nv in particular needs bytes in flight when packet
    was sent).

    As proposed by Neal Cardwell in his comments to the tcp_nv patch.

    Signed-off-by: Lawrence Brakmo
    Acked-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Lawrence Brakmo
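
The benefit of passing a struct instead of separate arguments can be sketched as follows. All names here (the sample struct, its fields, and the module state) are hypothetical and only illustrate the pattern: a new field such as the bytes or packets in flight can be added to the struct without touching modules that ignore it.

```c
#include <stdint.h>

/* Hypothetical sample passed to the per-ACK callback. */
struct ack_sample_sketch {
    uint32_t pkts_acked; /* formerly the "cnt" argument */
    int32_t  rtt_us;     /* formerly the "rtt" argument; negative if unknown */
    uint32_t in_flight;  /* example of a field added later */
};

/* Hypothetical private state of one congestion control module. */
struct cc_state_sketch {
    uint32_t min_rtt_us;  /* 0 means "no sample yet" */
    uint32_t acked_total;
};

/* A module that only cares about the count and the RTT simply
 * ignores any fields it does not know about. */
static void sketch_pkts_acked(struct cc_state_sketch *ca,
                              const struct ack_sample_sketch *s)
{
    ca->acked_total += s->pkts_acked;
    if (s->rtt_us > 0 &&
        (ca->min_rtt_us == 0 || (uint32_t)s->rtt_us < ca->min_rtt_us))
        ca->min_rtt_us = (uint32_t)s->rtt_us;
}
```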
     

10 Jul, 2015

1 commit

  • Add a helper to test the slow start condition in various congestion
    control modules and other places. This is to prepare a slight improvement
    in policy as to exactly when to slow start.

    Signed-off-by: Yuchung Cheng
    Signed-off-by: Neal Cardwell
    Signed-off-by: Eric Dumazet
    Signed-off-by: Nandita Dukkipati
    Signed-off-by: David S. Miller

    Yuchung Cheng
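
A helper of this kind might look like the sketch below. The struct and function names are hypothetical, and the exact boundary (`<=` here) is an assumption; the point is that once every caller goes through one helper, the policy can be adjusted in a single place.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-socket TCP state. */
struct ss_state {
    uint32_t snd_cwnd;     /* congestion window, in packets */
    uint32_t snd_ssthresh; /* slow start threshold, in packets */
};

/* Single place that decides whether the connection is in slow start.
 * The comparison operator is the policy detail (assumed here). */
static bool in_slow_start_sketch(const struct ss_state *tp)
{
    return tp->snd_cwnd <= tp->snd_ssthresh;
}
```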
     

29 Jan, 2015

1 commit

  • LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that
    cover more than the RFC-specified maximum of 2 packets. These stretch
    ACKs can cause serious performance shortfalls in common congestion
    control algorithms that were designed and tuned years ago with
    receiver hosts that were not using LRO or GRO, and were instead
    politely ACKing every other packet.

    This patch series fixes Reno and CUBIC to handle stretch ACKs.

    This patch prepares for the upcoming stretch ACK bug fix patches. It
    adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future
    fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and
    changes all congestion control algorithms to pass in 1 for the ACKed
    count. It also changes tcp_slow_start() to return the number of packet
    ACK "credits" that were not processed in slow start mode, and can be
    processed by the congestion control module in additive increase mode.

    In future patches we will fix tcp_cong_avoid_ai() to handle stretch
    ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start
    and additive increase mode.

    Reported-by: Eyal Perry
    Signed-off-by: Neal Cardwell
    Signed-off-by: Yuchung Cheng
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Neal Cardwell
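
The patch above only adds the parameter and passes 1 for it. As a hedged sketch of how an "acked" count could later be used in additive increase, the function below credits all ACKed packets at once and grows cwnd by one packet per `w` packets ACKed. The struct, the names, and the exact arithmetic are assumptions for illustration, not the kernel implementation.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-socket TCP state. */
struct ai_state {
    uint32_t snd_cwnd;       /* congestion window, in packets */
    uint32_t snd_cwnd_cnt;   /* ACK credits accumulated toward the next increase */
    uint32_t snd_cwnd_clamp; /* upper bound on cwnd */
};

/* Additive increase that tolerates stretch ACKs: "acked" packets are
 * credited together, and cwnd grows by one for every w credits.
 * The caller must pass w > 0. */
static void cong_avoid_ai_sketch(struct ai_state *tp, uint32_t w, uint32_t acked)
{
    tp->snd_cwnd_cnt += acked;
    if (tp->snd_cwnd_cnt >= w) {
        uint32_t delta = tp->snd_cwnd_cnt / w;

        tp->snd_cwnd_cnt -= delta * w; /* keep the leftover credits */
        tp->snd_cwnd += delta;
    }
    if (tp->snd_cwnd > tp->snd_cwnd_clamp)
        tp->snd_cwnd = tp->snd_cwnd_clamp;
}
```

Calling it with `acked = 1` on every ACK reproduces the per-ACK behaviour; a stretch ACK covering 25 packets with `w = 10` grows cwnd by 2 and leaves 5 credits.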
     

02 Sep, 2014

1 commit

  • Fix places where there is space before tab, long lines, and
    awkward if(){, double spacing etc. Add blank line after
    declaration/initialization.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

04 May, 2014

1 commit


05 Nov, 2013

1 commit

  • Slow start currently increases cwnd by 1 if an ACK acknowledges some
    packets, regardless of the number of packets. Consequently slow start
    performance is highly dependent on the degree of the stretch ACKs caused
    by receiver or network ACK compression mechanisms (e.g., delayed-ACK,
    GRO, etc). But the slow start algorithm is meant to send twice as many
    packets as have left the network, so it should process a stretch ACK of
    degree N as if it were N ACKs of degree 1, then exit when cwnd exceeds
    ssthresh. A follow-up patch will use the remainder of the N (if greater
    than 1) to adjust cwnd in the congestion avoidance phase.

    In addition this patch retires the experimental limited slow start
    (LSS) feature. LSS has multiple drawbacks but questionable benefit. The
    fractional cwnd increase in LSS requires a loop in slow start even
    though it's rarely used. Configuring such an increase step via a global
    sysctl for different BDPs seems hard. Finally and most importantly the
    slow start overshoot concern is now better covered by the Hybrid slow
    start (hystart) enabled by default.

    Signed-off-by: Yuchung Cheng
    Signed-off-by: Neal Cardwell
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Yuchung Cheng
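
The behaviour described above (treat a stretch ACK of degree N like N single ACKs, leave slow start once cwnd exceeds ssthresh, and hand back the unused remainder) can be sketched like this. The struct and names are hypothetical, and the code is an illustration under those assumptions rather than the patch itself.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-socket TCP state. */
struct slow_start_state {
    uint32_t snd_cwnd;       /* congestion window, in packets */
    uint32_t snd_ssthresh;   /* slow start threshold, in packets */
    uint32_t snd_cwnd_clamp; /* upper bound on cwnd */
};

/* Grow cwnd by up to "acked" packets, stopping just past ssthresh.
 * Returns the ACK credits that were not consumed in slow start. */
static uint32_t slow_start_sketch(struct slow_start_state *tp, uint32_t acked)
{
    uint32_t cwnd, used;

    if (tp->snd_cwnd > tp->snd_ssthresh)
        return acked; /* not in slow start: nothing consumed */

    cwnd = tp->snd_cwnd + acked;
    if (cwnd > tp->snd_ssthresh)
        cwnd = tp->snd_ssthresh + 1; /* exit once cwnd exceeds ssthresh */

    used = cwnd - tp->snd_cwnd;
    tp->snd_cwnd = cwnd < tp->snd_cwnd_clamp ? cwnd : tp->snd_cwnd_clamp;
    return acked - used;
}
```

The returned remainder is what a congestion avoidance routine could then process in additive increase mode.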
     

21 Jan, 2012

1 commit

  • This patch fixes BIC so that cwnd reductions made during RTOs can be
    undone (just as they already can be undone when using the default/Reno
    behavior).

    When undoing cwnd reductions, BIC-derived congestion control modules
    were restoring the cwnd from last_max_cwnd. There were two problems
    with using last_max_cwnd to restore a cwnd during undo:

    (a) last_max_cwnd was set to 0 on state transitions into TCP_CA_Loss
    (by calling the module's reset() functions), so cwnd reductions from
    RTOs could not be undone.

    (b) when fast_convergence is enabled (which it is by default)
    last_max_cwnd does not actually hold the value of snd_cwnd before the
    loss; instead, it holds a scaled-down version of snd_cwnd.

    This patch makes the following changes:

    (1) upon undo, revert snd_cwnd to ca->loss_cwnd, which is already, as
    the existing comment notes, the "congestion window at last loss"

    (2) stop forgetting ca->loss_cwnd on TCP_CA_Loss events

    (3) use ca->last_max_cwnd to check if we're in slow start

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
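
Problem (b) and change (1) can be illustrated with a small sketch. The struct, the function names, and the constants (a beta of 819 on a scale of 1024) are assumptions made for this example; the point is only that `last_max_cwnd` may hold a scaled-down value after a loss, while `loss_cwnd` keeps the actual window to restore.

```c
#include <stdint.h>

#define SKETCH_BETA_SCALE 1024u /* assumed scaling factor */
#define SKETCH_BETA       819u  /* assumed multiplicative decrease, ~0.8 */

/* Hypothetical subset of a BIC-style module's private state. */
struct bic_sketch {
    uint32_t last_max_cwnd; /* last maximum window, possibly scaled down */
    uint32_t loss_cwnd;     /* congestion window at last loss */
};

/* On loss: remember the real cwnd, update last_max_cwnd, return ssthresh. */
static uint32_t bic_on_loss_sketch(struct bic_sketch *ca, uint32_t snd_cwnd,
                                   int fast_convergence)
{
    uint32_t ssthresh;

    ca->loss_cwnd = snd_cwnd;
    if (fast_convergence && snd_cwnd < ca->last_max_cwnd)
        ca->last_max_cwnd = snd_cwnd * (SKETCH_BETA_SCALE + SKETCH_BETA)
                            / (2 * SKETCH_BETA_SCALE);
    else
        ca->last_max_cwnd = snd_cwnd;

    ssthresh = snd_cwnd * SKETCH_BETA / SKETCH_BETA_SCALE;
    return ssthresh > 2 ? ssthresh : 2;
}

/* On undo: restore from loss_cwnd, never from last_max_cwnd. */
static uint32_t bic_undo_cwnd_sketch(const struct bic_sketch *ca, uint32_t snd_cwnd)
{
    return snd_cwnd > ca->loss_cwnd ? snd_cwnd : ca->loss_cwnd;
}
```

With a previous maximum of 200 and a loss at cwnd 100, this sketch leaves `last_max_cwnd` at 89, so restoring from it would undershoot the pre-loss window, whereas `loss_cwnd` still holds 100.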
     

10 Mar, 2011

1 commit


02 Mar, 2009

1 commit

  • It seems that the implementation in YeAH was inconsistent with what
    the others did, as it would increase cwnd one ACK earlier than the
    others do.

    Size benefits:

    Function                | Size change (bytes)
    bictcp_cong_avoid       | -36
    tcp_cong_avoid_ai       | +52
    bictcp_cong_avoid       | -34
    tcp_scalable_cong_avoid | -36
    tcp_veno_cong_avoid     | -12
    tcp_yeah_cong_avoid     | -38

    Total                   | -104

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
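
How an ordering difference can shift the increase by exactly one ACK is easy to see in a toy model. The two step functions below are hypothetical; which ordering any particular module used is an assumption made only to illustrate the off-by-one effect.

```c
#include <stdint.h>

/* Hypothetical minimal window state. */
struct win_sketch {
    uint32_t cwnd; /* congestion window, in packets */
    uint32_t cnt;  /* ACKs counted since the last increase */
};

/* Variant A: test the counter first, count the ACK otherwise. */
static void ai_test_then_count(struct win_sketch *s, uint32_t w)
{
    if (s->cnt >= w) {
        s->cwnd++;
        s->cnt = 0;
    } else {
        s->cnt++;
    }
}

/* Variant B: count the ACK first, then test the counter. */
static void ai_count_then_test(struct win_sketch *s, uint32_t w)
{
    s->cnt++;
    if (s->cnt >= w) {
        s->cwnd++;
        s->cnt = 0;
    }
}

/* Number of ACKs needed before cwnd grows for the first time. */
static uint32_t acks_until_growth(void (*step)(struct win_sketch *, uint32_t),
                                  uint32_t w)
{
    struct win_sketch s = { .cwnd = w, .cnt = 0 };
    uint32_t acks = 0;

    while (s.cwnd == w) {
        step(&s, w);
        acks++;
    }
    return acks;
}
```

For `w = 4`, variant A needs 5 ACKs and variant B needs 4, which is the kind of one-ACK discrepancy a single shared helper removes.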
     

29 Feb, 2008

1 commit


29 Jan, 2008

1 commit


11 Oct, 2007

1 commit


31 Jul, 2007

1 commit

  • This patch changes the API for the callback that is done after an ACK is
    received. It solves a couple of issues:

    * Some congestion controls want a higher-resolution RTT value
    (controlled by the TCP_CONG_RTT_SAMPLE flag). These don't really want a
    ktime; they all compute an RTT in microseconds.

    * Other congestion controls could use RTT at jiffies resolution.

    To keep the API consistent, the units should be the same in both cases;
    only the resolution should change.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
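
The "same units, different resolution" idea can be sketched with two conversion helpers that both yield microseconds. The tick rate and the function names are hypothetical; the real conversion helpers and the configured tick rate may differ.

```c
#include <stdint.h>

#define SKETCH_HZ 250 /* hypothetical tick rate: one tick is 4000 us */

/* High-resolution path: a nanosecond timestamp delta becomes microseconds. */
static int32_t rtt_us_from_ns(int64_t delta_ns)
{
    return (int32_t)(delta_ns / 1000);
}

/* Low-resolution path: a tick delta also becomes microseconds,
 * but can only change in steps of one tick. */
static int32_t rtt_us_from_ticks(uint32_t ticks)
{
    return (int32_t)(ticks * (1000000u / SKETCH_HZ));
}
```

A callback receiving either value can treat it as microseconds without knowing which path produced it.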
     

18 Jul, 2007

1 commit


13 Jun, 2007

1 commit


26 Apr, 2007

1 commit

  • Do some simple changes to make congestion control API faster/cleaner.
    * use ktime_t rather than timeval
    * merge rtt sampling into existing ack callback;
      this means one indirect call versus two per ack.
    * use flags bits to store options/settings

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

23 Sep, 2006

1 commit


01 Jul, 2006

1 commit


18 Jun, 2006

1 commit

  • Many of the TCP congestion control methods just use ssthresh
    as the minimum congestion window on decrease. Rather than
    duplicating the code, just have that be the default if that
    handler in the ops structure is not set.

    Minor behaviour change to TCP compound. It probably wants
    to use this (ssthresh) as the lower bound, rather than ssthresh/2,
    because the latter causes undershoot on loss.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
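
The "default if the handler is not set" pattern can be sketched as below. The ops struct, the socket struct, and the function names are hypothetical stand-ins for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified socket state. */
struct sock_sketch {
    uint32_t snd_ssthresh; /* slow start threshold, in packets */
};

/* Hypothetical ops table with an optional min_cwnd handler. */
struct cong_ops_sketch {
    uint32_t (*min_cwnd)(const struct sock_sketch *sk); /* may be NULL */
};

/* Use the module's handler when present, otherwise fall back to ssthresh. */
static uint32_t min_cwnd_or_default(const struct cong_ops_sketch *ops,
                                    const struct sock_sketch *sk)
{
    if (ops->min_cwnd)
        return ops->min_cwnd(sk);
    return sk->snd_ssthresh;
}

/* Example of a module-specific override (illustrative only). */
static uint32_t half_ssthresh(const struct sock_sketch *sk)
{
    return sk->snd_ssthresh / 2;
}
```

Modules that are happy with the default simply leave the handler unset, so the duplicated one-line functions disappear.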
     

04 Jan, 2006

2 commits


11 Nov, 2005

2 commits

  • Move all the code that does linear TCP slowstart to one
    inline function to ease a later patch that adds ABC support.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • TCP performance with TSO over networks with delay is awful.
    On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
    50Mbits/sec without TSO.

    The problem is that with TSO, we intentionally do not keep the maximum
    number of packets in flight to fill the window; we hold out until
    we can send a MSS chunk. But we also don't update the congestion window
    unless we have filled it, as per RFC2861.

    This patch replaces the check for the congestion window being full
    with something smarter that accounts for TSO.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
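
A TSO-aware "is the sender cwnd-limited?" test might look like the sketch below. The function name, the parameters, and the two thresholds (a window divisor and a maximum burst) are assumptions for illustration; they show the idea of treating a nearly full window as full when TSO is deliberately holding packets back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when cwnd, rather than the application, limits sending.
 * tso_win_divisor and max_burst are hypothetical tuning knobs. */
static bool is_cwnd_limited_sketch(uint32_t cwnd, uint32_t in_flight,
                                   bool tso_enabled,
                                   uint32_t tso_win_divisor,
                                   uint32_t max_burst)
{
    uint32_t left;

    if (in_flight >= cwnd)
        return true;  /* window is completely full */
    if (!tso_enabled)
        return false; /* without TSO, "full" means exactly full */

    left = cwnd - in_flight; /* unused part of the window */
    if (tso_win_divisor)
        return left * tso_win_divisor < cwnd; /* small fraction left */
    return left <= max_burst;                 /* only a small burst left */
}
```

With a window of 100 and 90 packets in flight, the strict check says "not limited", while the TSO-aware check with a divisor of 3 says "limited", so the window can keep growing.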
     

03 Nov, 2005

1 commit

  • The max growth of BIC TCP is too large. Original code was based on
    BIC 1.0 and the default there was 32. Later code (2.6.13) included
    compensation for delayed acks, and should have reduced the default
    value to 16, since normally TCP gets one ack for every two packets sent.

    The current value of 32 makes BIC too aggressive and unfair to other
    flows.

    Submitted-by: Injong Rhee
    Signed-off-by: Stephen Hemminger
    Acked-by: Ian McDonald
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephen Hemminger
     

06 Oct, 2005

1 commit


30 Aug, 2005

1 commit

  • This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
    minimal renaming/moving done in this changeset to ease review.

    Most of it is just changes of struct tcp_sock * to struct sock * parameters.

    With this we move to a state closer to two interesting goals:

    1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
    for any INET transport protocol that has struct inet_hashinfo and is
    derived from struct inet_connection_sock. Keeps the userspace API, that will
    just not display DCCP sockets, while newer versions of tools can support
    DCCP.

    2. INET generic transport pluggable Congestion Avoidance infrastructure, using
    the current TCP CA infrastructure with DCCP.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     

24 Jun, 2005

1 commit

  • TCP BIC congestion control reworked to use the new congestion control
    infrastructure. This version is more up to date than the BIC
    code in 2.6.12; it incorporates enhancements from BICTCP 1.1,
    to handle low latency links.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger