04 May, 2014

1 commit


27 Feb, 2014

1 commit

  • Upcoming congestion controls for TCP require usec resolution for RTT
    estimations. Millisecond resolution is simply not enough these days.

    FQ/pacing in DC environments also requires this change for finer control
    and removal of bimodal behavior due to the current hack in
    tcp_update_pacing_rate() for 'small rtt'.

    TCP_CONG_RTT_STAMP is no longer needed.

    As Julian Anastasov pointed out, we need to keep user compatibility:
    tcp_metrics used to export RTT and RTTVAR in msec resolution,
    so we added RTT_US and RTTVAR_US. An iproute2 patch is needed
    to use the new attributes if provided by the kernel.

    In this example, the ss command displays an srtt of 32 usecs (10Gbit link):

    lpk51:~# ./ss -i dst lpk52
    Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
    tcp ESTAB 0 1 10.246.11.51:42959 10.246.11.52:64614
    cubic wscale:6,6 rto:201 rtt:0.032/0.001 ato:40 mss:1448 cwnd:10 send 3620.0Mbps pacing_rate 7240.0Mbps unacked:1 rcv_rtt:993 rcv_space:29559

    Updated iproute2 ip command displays :

    lpk51:~# ./ip tcp_metrics | grep 10.246.11.52
    10.246.11.52 age 561.914sec cwnd 10 rtt 274us rttvar 213us source 10.246.11.51

    Old binary displays :

    lpk51:~# ip tcp_metrics | grep 10.246.11.52
    10.246.11.52 age 561.914sec cwnd 10 rtt 250us rttvar 125us source 10.246.11.51

    With help from Julian Anastasov, Stephen Hemminger and Yuchung Cheng

    Signed-off-by: Eric Dumazet
    Acked-by: Neal Cardwell
    Cc: Stephen Hemminger
    Cc: Yuchung Cheng
    Cc: Larry Brakmo
    Cc: Julian Anastasov
    Signed-off-by: David S. Miller

    Eric Dumazet
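
The figures in the ss output above can be checked by hand, which also shows why millisecond resolution falls short on such a link. The following is a minimal sketch, not kernel code: send_rate_bps() is a hypothetical helper that assumes the "send" figure is simply cwnd segments of mss bytes delivered once per smoothed RTT.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): throughput in bits per
 * second when cwnd segments of mss bytes are sent once per smoothed RTT.
 * srtt_us is the smoothed RTT in microseconds.
 */
static uint64_t send_rate_bps(uint32_t mss, uint32_t cwnd, uint32_t srtt_us)
{
	if (srtt_us == 0)
		return 0;	/* no usable RTT sample */

	/* bytes per RTT -> bits per RTT -> bits per second */
	return (uint64_t)mss * cwnd * 8 * 1000000ULL / srtt_us;
}
```

With mss 1448, cwnd 10 and an srtt of 32 usecs this gives 3620000000 bit/s, matching the 3620.0Mbps in the output; the pacing_rate shown is twice that figure. A 32 usec RTT expressed in whole milliseconds would round to 0, so the same calculation could not be done at all.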
     

14 Feb, 2014

1 commit


05 Nov, 2013

1 commit

  • Slow start currently increases cwnd by 1 if an ACK acknowledges some
    packets, regardless of the number of packets. Consequently slow start
    performance is highly dependent on the degree of the stretch ACKs caused
    by receiver or network ACK compression mechanisms (e.g., delayed-ACK,
    GRO, etc). But the slow start algorithm is meant to send twice the number
    of packets that have left the network, so it should process a stretch ACK
    of degree N as if it were N ACKs of degree 1, then exit when cwnd exceeds
    ssthresh. A follow-up patch will use the remainder of the N (if greater
    than 1) to adjust cwnd in the congestion avoidance phase.

    In addition this patch retires the experimental limited slow start
    (LSS) feature. LSS has multiple drawbacks but questionable benefit. The
    fractional cwnd increase in LSS requires a loop in slow start even
    though it's rarely used. Configuring such an increase step via a global
    sysctl on different BDPs seems hard. Finally and most importantly the
    slow start overshoot concern is now better covered by the Hybrid slow
    start (hystart) enabled by default.

    Signed-off-by: Yuchung Cheng
    Signed-off-by: Neal Cardwell
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Yuchung Cheng
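
The stretch-ACK handling described in the commit above can be sketched roughly as follows. This is an illustrative userspace version under stated assumptions, not the kernel's implementation: struct ss_state and slow_start_stretch() are hypothetical names, and the only behavior taken from the commit message is that an ACK covering N packets counts as N single ACKs, that slow start ends once cwnd exceeds ssthresh, and that the unused remainder is left for congestion avoidance.

```c
#include <stdint.h>

/* Hypothetical, simplified connection state for the sketch. */
struct ss_state {
	uint32_t cwnd;		/* congestion window, in packets */
	uint32_t ssthresh;	/* slow start threshold, in packets */
};

/*
 * Apply an ACK that covers `acked` packets while in slow start.
 * Returns how many of the acked packets were NOT consumed by slow start,
 * so the caller can feed them to congestion avoidance.
 */
static uint32_t slow_start_stretch(struct ss_state *s, uint32_t acked)
{
	uint32_t cwnd;

	if (s->cwnd > s->ssthresh)
		return acked;	/* already out of slow start */

	cwnd = s->cwnd + acked;			/* one increment per acked packet */
	if (cwnd > s->ssthresh)
		cwnd = s->ssthresh + 1;		/* stop just past ssthresh */

	acked -= cwnd - s->cwnd;		/* remainder after the exit */
	s->cwnd = cwnd;
	return acked;
}
```

For example, with cwnd 10 and ssthresh 12, an ACK covering 5 packets raises cwnd to 13 and leaves a remainder of 2.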
     

10 Mar, 2011

1 commit


26 May, 2009

1 commit


09 Dec, 2008

1 commit

  • This patch addresses a book-keeping issue in tcp_vegas.c. At present
    tcp_vegas does separate book-keeping of cwnd based on packet sequence
    numbers. A mismatch can develop between this book-keeping and
    tp->snd_cwnd due, for example, to delayed acks acking multiple
    packets. When vegas transitions to reno operation (e.g. following
    loss), then this mismatch leads to incorrect behaviour (akin to a cwnd
    backoff). This seems mostly to affect operation at low cwnds where
    delayed acking can lead to a significant fraction of cwnd being
    covered by a single ack, leading to the book-keeping mismatch. This
    patch modifies the congestion avoidance update to avoid the need for
    separate book-keeping while leaving vegas congestion avoidance
    functionally unchanged. A secondary advantage of this modification is
    that the use of fixed-point (via V_PARAM_SHIFT) and 64 bit arithmetic
    is no longer necessary, simplifying the code.

    Some example test measurements with the patched code (confirming no functional
    change in the congestion avoidance algorithm) can be seen at:

    http://www.hamilton.ie/doug/vegaspatch/

    Signed-off-by: Doug Leith
    Signed-off-by: David S. Miller

    Doug Leith
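
For readers unfamiliar with Vegas, the congestion avoidance decision can be expressed directly in terms of the current cwnd, which is the kind of update that needs no separate book-keeping. The sketch below is an assumption-laden illustration, not the patched tcp_vegas.c: the function name, the alpha and beta thresholds of 2 and 4 packets, and the microsecond units are all assumptions made for the example.

```c
#include <stdint.h>

/* Assumed thresholds, in packets queued in the network. */
enum { VEGAS_ALPHA = 2, VEGAS_BETA = 4 };

/*
 * Hypothetical one-step Vegas update. base_rtt_us is the minimum RTT seen
 * (propagation delay estimate), rtt_us the current RTT, both in usec.
 * Returns the new cwnd.
 */
static uint32_t vegas_ca_update(uint32_t cwnd, uint32_t base_rtt_us,
				uint32_t rtt_us)
{
	uint64_t diff;

	if (base_rtt_us == 0 || rtt_us < base_rtt_us)
		return cwnd;	/* no meaningful estimate */

	/*
	 * Estimated number of this flow's packets sitting in queues:
	 * cwnd * (rtt - baseRTT) / baseRTT. A 64-bit intermediate is used
	 * here only to keep the sketch safe for any input.
	 */
	diff = (uint64_t)cwnd * (rtt_us - base_rtt_us) / base_rtt_us;

	if (diff > VEGAS_BETA)
		return cwnd > 2 ? cwnd - 1 : cwnd;	/* too much queued */
	if (diff < VEGAS_ALPHA)
		return cwnd + 1;			/* room to grow */
	return cwnd;					/* within target band */
}
```

Because the estimate is derived from the live cwnd on every update, there is no second counter that delayed ACKs could push out of step.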
     

05 Dec, 2008

1 commit

  • This patch fixes a bug in tcp_vegas.c. At the moment this code leaves
    ssthresh untouched. However, this means that the vegas congestion
    control algorithm is effectively unable to reduce cwnd below the
    ssthresh value (if the vegas update lowers the cwnd below ssthresh,
    then slow start is activated to raise it back up). One example where
    this matters is when during slow start cwnd overshoots the link
    capacity and a flow then exits slow start with ssthresh set to a value
    above where congestion avoidance would like to adjust it.

    Signed-off-by: Doug Leith
    Signed-off-by: David S. Miller

    Doug Leith
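
One way to picture the fix described above: whenever the Vegas update lowers cwnd, ssthresh is pulled down to just below the new cwnd so that slow start does not immediately push the window back up. The helper below is a hypothetical sketch of that idea, not the function added by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return an ssthresh that is never above cwnd - 1,
 * so a flow whose cwnd was just reduced stays in congestion avoidance.
 */
static uint32_t ssthresh_below_cwnd(uint32_t ssthresh, uint32_t cwnd)
{
	uint32_t limit = cwnd > 0 ? cwnd - 1 : 0;

	return ssthresh < limit ? ssthresh : limit;
}
```

With ssthresh 100 and a reduced cwnd of 50, the result is 49, so the "cwnd at or below ssthresh" condition for slow start no longer holds.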
     

01 May, 2008

1 commit

  • drivers/net/8390.c:37:2: warning: returning void-valued expression
    drivers/net/bnx2.c:1635:3: warning: returning void-valued expression
    drivers/net/xen-netfront.c:1806:2: warning: returning void-valued expression
    net/ipv4/tcp_hybla.c:105:3: warning: returning void-valued expression
    net/ipv4/tcp_vegas.c:171:3: warning: returning void-valued expression
    net/ipv4/tcp_veno.c:123:3: warning: returning void-valued expression
    net/sysctl_net.c:85:2: warning: returning void-valued expression

    Signed-off-by: Harvey Harrison
    Acked-by: Alan Cox
    Signed-off-by: David S. Miller

    Harvey Harrison
     

30 Apr, 2008

1 commit

  • From: Lachlan Andrew

    There is an overflow bug in net/ipv4/tcp_vegas.c for large BDPs
    (e.g. 400Mbit/s, 400ms). The multiplication (old_wnd *
    vegas->baseRTT) << V_PARAM_SHIFT overflows a u32.

    [ Fix tcp_veno.c too, it has similar calculations. -DaveM ]

    Signed-off-by: David S. Miller

    Lachlan Andrew
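
The overflow in the commit above is easy to reproduce in userspace. The sketch below rests on several assumptions: baseRTT is held in microseconds, V_PARAM_SHIFT is 1, and a 400Mbit/s, 400ms path with 1500-byte packets gives a window of roughly 13333 packets. The function names are hypothetical.

```c
#include <stdint.h>

#define V_PARAM_SHIFT 1	/* assumed value for the sketch */

/* 32-bit version: the product wraps modulo 2^32 for large BDPs. */
static uint32_t scaled_u32(uint32_t old_wnd, uint32_t base_rtt_us)
{
	return (old_wnd * base_rtt_us) << V_PARAM_SHIFT;
}

/* 64-bit version: widen before multiplying so nothing is lost. */
static uint64_t scaled_u64(uint32_t old_wnd, uint32_t base_rtt_us)
{
	return ((uint64_t)old_wnd * base_rtt_us) << V_PARAM_SHIFT;
}
```

For old_wnd 13333 and a baseRTT of 400000 usec, the true product is 5333200000, which already exceeds the u32 maximum of 4294967295 before the shift is applied, so the two functions disagree. For small windows and RTTs they return the same value.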
     

29 Jan, 2008

1 commit


30 Oct, 2007

1 commit

  • TCP Vegas implementation has a bug in the process of disabling
    slow-start with gamma parameter. The bug may lead to extreme
    unfairness in the presence of early packet loss. See details in:
    http://www.cs.caltech.edu/~weixl/technical/ns2linux/known_linux/index.html#vegas

    Switch the order of "if (tp->snd_cwnd <= tp->snd_ssthresh)" statement
    and "if (diff > gamma)" statement to eliminate the problem.

    Signed-off-by: Xiaoliang (David) Wei
    Signed-off-by: David S. Miller

    Xiaoliang (David) Wei
     

31 Jul, 2007

1 commit

  • This patch changes the API for the callback that is done after an ACK is
    received. It solves a couple of issues:

    * Some congestion controls want a higher-resolution value of RTT
    (controlled by the TCP_CONG_RTT_STAMP flag). These don't really want a ktime, but
    all compute an RTT in microseconds.

    * Other congestion controls could use RTT at jiffies resolution.

    To keep the API consistent, the units should be the same for both cases; just the
    resolution should change.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
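
The "same units, different resolution" idea in the commit above can be illustrated with a small conversion: a coarse RTT measured in timer ticks is reported in microseconds, just like a high-resolution sample, but it only ever takes values that are multiples of the tick length. This is a userspace sketch; ticks_to_usecs() and its hz parameter are hypothetical stand-ins, not the kernel's helpers.

```c
#include <stdint.h>

/*
 * Hypothetical conversion from timer ticks to microseconds.
 * hz is the tick frequency (ticks per second).
 */
static uint32_t ticks_to_usecs(uint32_t ticks, uint32_t hz)
{
	if (hz == 0)
		return 0;	/* invalid frequency, nothing to report */

	return (uint32_t)((uint64_t)ticks * 1000000ULL / hz);
}
```

At 250 ticks per second, an RTT of 3 ticks is reported as 12000 usec; a congestion control module receiving that value does not need to know which clock produced it.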
     

18 Jul, 2007

1 commit


16 Jun, 2007

1 commit

  • Commit 164891aadf1721fca4dce473bb0e0998181537c6 broke RTT
    sampling of congestion control modules. Inaccurate timestamps
    could be fed to them without providing any way for them to
    identify such cases. Previously RTT sampler was called only if
    FLAG_RETRANS_DATA_ACKED was not set filtering inaccurate
    timestamps nicely. In addition, the new behavior could give an
    invalid timestamp (zero) to RTT sampler if only skbs with
    TCPCB_RETRANS were ACKed. This solves both problems.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

26 Apr, 2007

4 commits


11 Feb, 2007

1 commit


03 Dec, 2006

1 commit


23 Sep, 2006

1 commit


01 Jul, 2006

1 commit


05 Jan, 2006

1 commit


07 Dec, 2005

2 commits


12 Nov, 2005

1 commit


11 Nov, 2005

1 commit


30 Aug, 2005

3 commits

  • Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
    transitionally in inet_diag.c.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].

    I'm taking this longer route so as to ease review, making clear the changes
    made all along the way.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
    minimal renaming/moving done in this changeset to ease review.

    Most of it is just changes of struct tcp_sock * to struct sock * parameters.

    With this we move to a state closer to two interesting goals:

    1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
    for any INET transport protocol that has struct inet_hashinfo and is
    derived from struct inet_connection_sock. Keeps the userspace API, that will
    just not display DCCP sockets, while newer versions of tools can support
    DCCP.

    2. INET generic transport pluggable Congestion Avoidance infrastructure, using
    the current TCP CA infrastructure with DCCP.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     

24 Jun, 2005

1 commit