20 Mar, 2012

2 commits

  • With increasing receive window sizes, but the speed of light not improved
    that much, the out of order queue can contain a huge number of skbs,
    waiting to be moved to the receive_queue when missing packets can fill
    the holes.

    Some devices happen to use fat skbs (truesize of 4096 + sizeof(struct
    sk_buff)) to store regular (MTU <= 1500) frames.
    Cc: Neal Cardwell
    Cc: Yuchung Cheng
    Cc: H.K. Jerry Chu
    Cc: Tom Herbert
    Cc: Ilpo Järvinen
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Split tcp_data_queue() into two parts for better readability.

    tcp_data_queue_ofo() is responsible for queueing incoming skb into out
    of order queue.

    Change the code layout so that skb_set_owner_r() is performed only if
    the skb is not dropped.

    This is a preliminary patch for the following "reduce out_of_order
    memory use" patch.

    Signed-off-by: Eric Dumazet
    Cc: Neal Cardwell
    Cc: Yuchung Cheng
    Cc: H.K. Jerry Chu
    Cc: Tom Herbert
    Cc: Ilpo Järvinen
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
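
    A minimal sketch of the layout change described above, using stand-in
    types and hypothetical helper names rather than the real kernel
    structures:

        #include <stdbool.h>
        #include <stddef.h>

        struct pkt  { size_t truesize; };   /* stand-in for sk_buff   */
        struct conn { size_t rmem_alloc; }; /* stand-in for the socket */

        /* Stand-in for skb_set_owner_r(): charge the buffer's memory
         * to the socket. */
        static void set_owner(struct conn *c, struct pkt *p)
        {
                c->rmem_alloc += p->truesize;
        }

        /* The out-of-order queueing path now decides whether the packet
         * is kept *before* charging it, so dropped packets are never
         * accounted against the socket. */
        static bool data_queue_ofo(struct conn *c, struct pkt *p,
                                   bool would_drop)
        {
                if (would_drop)
                        return false;    /* dropped: never charged */
                set_owner(c, p);         /* charged only once kept */
                return true;
        }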
     

13 Mar, 2012

1 commit

  • Add #define pr_fmt(fmt) as appropriate.

    Add "IPv4: ", "TCP: ", and "IPsec: " to appropriate files.
    Standardize on "UDPLite: " for appropriate uses.
    Some prefixes were previously "UDPLITE: " and "UDP-Lite: ".

    Add KBUILD_MODNAME ": " to icmp and gre.
    Remove embedded prefixes as appropriate.

    Add missing "\n" to pr_info in gre.c.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
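
    For reference, a sketch of the pr_fmt mechanism this relies on; the
    message text here is made up, but the macro and the pr_info() behavior
    are the real kernel idiom:

        /* Must be defined before printk.h is pulled in (e.g. via
         * kernel.h), since the pr_* macros expand pr_fmt at that point. */
        #define pr_fmt(fmt) "IPv4: " fmt

        #include <linux/kernel.h>

        /* pr_info("route cache flushed\n") now emits
         *     IPv4: route cache flushed
         * so the prefix no longer needs to be embedded in each message. */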
     

12 Mar, 2012

1 commit

  • Use a more current kernel messaging style.

    Convert a printk block to print_hex_dump.
    Coalesce formats, align arguments.
    Use %s, __func__ instead of embedding function names.

    Some messages that were prefixed with _close are
    now prefixed with _fini. Some ah4 and esp messages
    are now not prefixed with "ip ".

    The intent of this patch is to later add something like
    #define pr_fmt(fmt) "IPv4: " fmt
    to standardize the output messages.

    Text size is trivially reduced. (x86-32 allyesconfig)

    $ size net/ipv4/built-in.o*
       text    data     bss      dec     hex  filename
     887888   31558  249696  1169142  11d6f6  net/ipv4/built-in.o.new
     887934   31558  249800  1169292  11d78c  net/ipv4/built-in.o.old

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
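
    A sketch of the two conversions described above; print_hex_dump() and
    the %s/__func__ idiom are the real kernel interfaces, while the message
    text and buffer here are made up:

        #include <linux/kernel.h>
        #include <linux/printk.h>

        static void dump_options(const unsigned char *ptr, int len)
        {
                /* Before: a hand-rolled printk loop, one byte at a time.
                 * After: one call that hex-dumps the buffer, 16 bytes
                 * per row, with an ASCII column. */
                print_hex_dump(KERN_DEBUG, "options: ", DUMP_PREFIX_OFFSET,
                               16, 1, ptr, len, true);

                /* Embedded function names are replaced by %s/__func__,
                 * so the message stays correct across renames. */
                pr_debug("%s: %d option bytes\n", __func__, len);
        }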
     

07 Mar, 2012

1 commit

  • This commit fixes tcp_shift_skb_data() so that it does not shift
    SACKed data below snd_una.

    This fixes an issue whose symptoms exactly match reports showing
    tp->sacked_out going negative since 3.3.0-rc4 (see "WARNING: at
    net/ipv4/tcp_input.c:3418" thread on netdev).

    Since 2008 (832d11c5cd076abc0aa1eaf7be96c81d1a59ce41)
    tcp_shift_skb_data() had been shifting SACKed ranges that were below
    snd_una. It checked that the *end* of the skb it was about to shift
    from was above snd_una, but did not check that the end of the actual
    shifted range was above snd_una; this commit adds that check.

    Shifting SACKed ranges below snd_una is problematic because for such
    ranges tcp_sacktag_one() short-circuits: it does not declare anything
    as SACKed and does not increase sacked_out.

    Before the fixes in commits cc9a672ee522d4805495b98680f4a3db5d0a0af9
    and daef52bab1fd26e24e8e9578f8fb33ba1d0cb412, shifting SACKed ranges
    below snd_una happened to work because tcp_shifted_skb() was always
    (incorrectly) passing in to tcp_sacktag_one() an skb whose end_seq
    tcp_shift_skb_data() had already guaranteed was beyond snd_una. Hence
    tcp_sacktag_one() never short-circuited and always increased
    tp->sacked_out in this case.

    After those two fixes, my testing has verified that shifting SACKed
    ranges below snd_una could cause tp->sacked_out to go negative with
    the following sequence of events:

    (1) tcp_shift_skb_data() sees an skb whose end_seq is beyond snd_una,
    then shifts a prefix of that skb that is below snd_una

    (2) tcp_shifted_skb() increments the packet count of the
    already-SACKed prev sk_buff

    (3) tcp_sacktag_one() sees the end of the new SACKed range is below
    snd_una, so it short-circuits and doesn't increase tp->sacked_out

    (4) tcp_clean_rtx_queue() sees the SACKed skb has been ACKed,
    decrements tp->sacked_out by this "inflated" pcount that was
    missing a matching increase in tp->sacked_out, and hence
    tp->sacked_out underflows to a u32 like 0xFFFFFFFF, which cast
    to s32 is negative.

    (5) this leads to the warnings seen in the recent "WARNING: at
    net/ipv4/tcp_input.c:3418" thread on the netdev list; e.g.:
    tcp_input.c:3418 WARN_ON((int)tp->sacked_out < 0);

    More generally, I think this bug can be tickled in some cases where
    two or more ACKs from the receiver are lost and then a DSACK arrives
    that is immediately above an existing SACKed skb in the write queue.

    This fix changes tcp_shift_skb_data() to abort this sequence at step
    (1) in the scenario above by noticing that the bytes are below snd_una
    and not shifting them.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
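
    A simplified sketch of the added check; variable names are stand-ins,
    and the real code uses the kernel's wrap-safe before()/after() helpers
    inside tcp_shift_skb_data():

        #include <stdbool.h>
        #include <stdint.h>

        /* Wrap-safe sequence comparison: s1 > s2 modulo 2^32. */
        static bool seq_after(uint32_t s1, uint32_t s2)
        {
                return (int32_t)(s1 - s2) > 0;
        }

        /* Abort the shift unless the end of the *shifted range* (not
         * merely the end of the source skb) is above snd_una: bytes at
         * or below snd_una are already cumulatively ACKed, and
         * tcp_sacktag_one() would short-circuit on them without
         * increasing sacked_out. */
        static bool range_ok_to_shift(uint32_t range_end_seq,
                                      uint32_t snd_una)
        {
                return seq_after(range_end_seq, snd_una);
        }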
     

04 Mar, 2012

1 commit

  • In tcp_mark_head_lost() we should not attempt to fragment a SACKed skb
    to mark the first portion as lost. This is for two primary reasons:

    (1) tcp_shifted_skb() coalesces adjacent regions of SACKed skbs. When
    doing this, it preserves the sum of their packet counts in order to
    reflect the real-world dynamics on the wire. But given that skbs can
    have remainders that do not align to MSS boundaries, this packet count
    preservation means that for SACKed skbs there is not necessarily a
    direct linear relationship between tcp_skb_pcount(skb) and
    skb->len. Thus tcp_mark_head_lost()'s previous attempts to fragment
    off and mark as lost a prefix of length (packets - oldcnt)*mss from
    SACKed skbs were leading to occasional failures of the WARN_ON(len >
    skb->len) in tcp_fragment() (which used to be a BUG_ON(); see the
    recent "crash in tcp_fragment" thread on netdev).

    (2) there is no real point in fragmenting off part of a SACKed skb and
    calling tcp_skb_mark_lost() on it, since tcp_skb_mark_lost() is a NOP
    for SACKed skbs.

    Signed-off-by: Neal Cardwell
    Acked-by: Ilpo Järvinen
    Acked-by: Yuchung Cheng
    Acked-by: Nandita Dukkipati
    Signed-off-by: David S. Miller

    Neal Cardwell
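
    A sketch of the rule this commit enforces, with simplified stand-ins
    for the real skb flags:

        #include <stdbool.h>

        enum { SACKED = 1, LOST = 2 };

        struct seg { unsigned flags; };

        /* tcp_skb_mark_lost() is a no-op for SACKed segments, and a
         * SACKed segment's pcount need not map linearly onto its byte
         * length, so fragmenting a prefix off of it is both pointless
         * and unsafe. */
        static bool should_fragment_and_mark(const struct seg *s)
        {
                if (s->flags & SACKED)
                        return false;  /* never fragment SACKed data */
                return !(s->flags & LOST);
        }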
     

29 Feb, 2012

1 commit

  • When tcp_shifted_skb() shifts bytes from the skb that is currently
    pointed to by 'highest_sack' then the increment of
    TCP_SKB_CB(skb)->seq implicitly advances tcp_highest_sack_seq(). This
    implicit advancement, combined with the recent fix to pass the correct
    SACKed range into tcp_sacktag_one(), caused tcp_sacktag_one() to think
    that the newly SACKed range was before the tcp_highest_sack_seq(),
    leading to a call to tcp_update_reordering() with a degree of
    reordering matching the size of the newly SACKed range (typically just
    1 packet, which is a NOP, but potentially larger).

    This commit fixes this by simply calling tcp_sacktag_one() before the
    TCP_SKB_CB(skb)->seq advancement that can advance our notion of the
    highest SACKed sequence.

    Correspondingly, we can simplify the code a little now that
    tcp_shifted_skb() should update the lost_cnt_hint in all cases where
    skb == tp->lost_skb_hint.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
     

15 Feb, 2012

1 commit

  • This commit ensures that lost_cnt_hint is correctly updated in
    tcp_shifted_skb() for FACK TCP senders. The lost_cnt_hint adjustment
    in tcp_sacktag_one() only applies to non-FACK senders, so FACK senders
    need their own adjustment.

    This applies the spirit of 1e5289e121372a3494402b1b131b41bfe1cf9b7f -
    except now that the sequence range passed into tcp_sacktag_one() is
    correct we need only have a special case adjustment for FACK.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
     

13 Feb, 2012

2 commits

  • Fix the newly-SACKed range to be the range of newly-shifted bytes.

    Previously - since 832d11c5cd076abc0aa1eaf7be96c81d1a59ce41 -
    tcp_shifted_skb() incorrectly called tcp_sacktag_one() with the start
    and end sequence numbers of the skb it passes in set to the range just
    beyond the range that is newly-SACKed.

    This commit also removes a special-case adjustment to lost_cnt_hint in
    tcp_shifted_skb(), since the pre-existing adjustment of lost_cnt_hint
    in tcp_sacktag_one() now properly handles this now that the correct
    start sequence number is passed in.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
     
  • This commit allows callers of tcp_sacktag_one() to pass in sequence
    ranges that do not align with skb boundaries, as tcp_shifted_skb()
    needs to do in an upcoming fix in this patch series.

    In fact, now tcp_sacktag_one() does not need to depend on an input skb
    at all, which makes its semantics and dependencies more clear.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
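
    A sketch of the interface change, with simplified hypothetical types:
    the tagging function now takes an explicit sequence range instead of
    deriving it from an skb:

        #include <stdint.h>

        struct tp_state { uint32_t sacked_out; };

        /* Before: the function read the skb's own seq/end_seq, so
         * callers could only tag whole-skb ranges. After: the caller
         * passes start_seq/end_seq explicitly, so a shift that moves a
         * partial range can tag exactly the shifted bytes. */
        static void sacktag_one(struct tp_state *tp,
                                uint32_t start_seq, uint32_t end_seq,
                                uint32_t pcount)
        {
                (void)start_seq;          /* range used for validity   */
                (void)end_seq;            /* checks in the real code   */
                tp->sacked_out += pcount; /* simplified bookkeeping    */
        }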
     

23 Jan, 2012

1 commit

  • Correctly implement a loss detection heuristic: new sequences (above
    high_seq) sent during fast recovery are deemed lost when higher
    sequences are SACKed.

    Current code does not catch these losses, because tcp_mark_head_lost()
    does not check packets beyond high_seq. The fix is straightforward:
    check packets up to the highest SACKed packet. In addition, all the
    FLAG_DATA_LOST logic is ineffective and redundant, so it can be removed.

    Update the loss heuristic comments. The algorithm above is documented
    as heuristic B, but it is redundant too because heuristic A already
    covers B.

    Note that this change only marks some forward-retransmitted packets LOST.
    It does NOT forbid TCP from performing further CWR on new losses. A
    potential follow-up patch under preparation is to perform another CWR
    on "new" losses, such as
    1) a sequence above high_seq is lost (by resetting high_seq to snd_nxt)
    2) a retransmission is lost.

    Signed-off-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Yuchung Cheng
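
    A sketch of the changed scan bound, with stand-in names (the real loop
    walks the write queue in tcp_mark_head_lost()):

        #include <stdint.h>

        /* Wrap-safe signed distance between two sequence numbers. */
        static int32_t seq_diff(uint32_t a, uint32_t b)
        {
                return (int32_t)(a - b);
        }

        /* Before: the scan stopped at high_seq, so new data sent during
         * recovery could never be marked lost. After: scan up to the
         * highest SACKed sequence, so sequences above high_seq that
         * have been "SACKed over" are caught as well. */
        static int keep_scanning(uint32_t skb_seq,
                                 uint32_t highest_sack_seq)
        {
                return seq_diff(skb_seq, highest_sack_seq) < 0;
        }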
     

13 Dec, 2011

1 commit

  • This patch replaces all uses of the struct sock fields memory_pressure,
    memory_allocated, sockets_allocated, and sysctl_mem with accessor
    macros. Those macros can receive either a socket argument or a
    mem_cgroup argument, depending on the context they live in.

    Since we're only doing macro wrapping here, no performance impact at
    all is expected in the case where cgroups are disabled.

    Signed-off-by: Glauber Costa
    Reviewed-by: Hiroyuki Kamezawa
    CC: David S. Miller
    CC: Eric W. Biederman
    CC: Eric Dumazet
    Signed-off-by: David S. Miller

    Glauber Costa
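
    A sketch of the accessor-macro pattern described above; the names and
    the config symbol are hypothetical, while the real macros live in the
    networking headers and dispatch on whether memcg support is configured:

        struct mem_cgroup { int memory_pressure; };

        struct sock_like {
                int memory_pressure;
                struct mem_cgroup *memcg;
        };

        /* One accessor, two possible owners: with cgroup accounting the
         * field lives in the mem_cgroup, otherwise it stays where it
         * always was. Pure macro wrapping, so the disabled case compiles
         * to exactly the old field access. */
        #ifdef CONFIG_CGROUP_MEM_PRESSURE      /* hypothetical symbol */
        #define sk_memory_pressure(sk)  ((sk)->memcg->memory_pressure)
        #else
        #define sk_memory_pressure(sk)  ((sk)->memory_pressure)
        #endif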
     

04 Dec, 2011

1 commit

  • Denys Fedoryshchenko reported that SYN+FIN attacks were bringing his
    linux machines to their limits.

    Don't call conn_request() if the TCP flags include the SYN flag.

    Reported-by: Denys Fedoryshchenko
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
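
    A sketch of one plausible listen-state guard matching the description
    above (simplified types; the actual patch differs in detail):

        #include <stdbool.h>

        struct hdr_flags { unsigned syn:1, fin:1; };

        /* A legitimate connection attempt carries SYN alone; SYN+FIN is
         * never valid, so discard it early instead of creating request
         * state that an attacker can use to exhaust the listener. */
        static bool listen_should_discard(struct hdr_flags th)
        {
                return th.syn && th.fin;
        }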
     

28 Nov, 2011

5 commits

  • The problem: Senders were overriding cwnd values picked during an undo
    by calling tcp_moderate_cwnd() in tcp_try_to_open().

    The fix: Don't moderate cwnd in tcp_try_to_open() if we're in
    TCP_CA_Open, since doing so is generally unnecessary and specifically
    would override a DSACK-based undo of a cwnd reduction made in fast
    recovery.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
     
  • Previously, SACK-enabled connections hung around in TCP_CA_Disorder
    state while snd_una==high_seq, just waiting to accumulate DSACKs and
    hopefully undo a cwnd reduction. This could and did lead to the
    following unfortunate scenario: if some incoming ACKs advance snd_una
    beyond high_seq then we were setting undo_marker to 0 and moving to
    TCP_CA_Open, so if (due to reordering in the ACK return path) we
    shortly thereafter received a DSACK then we were no longer able to
    undo the cwnd reduction.

    The change: Simplify the congestion avoidance state machine by
    removing the behavior where SACK-enabled connections hung around in
    the TCP_CA_Disorder state just waiting for DSACKs. Instead, when
    snd_una advances to high_seq or beyond we typically move to
    TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or
    TCP_CA_Disorder if we later receive enough DSACKs.

    Other patches in this series will provide other changes that are
    necessary to fully fix this problem.

    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Neal Cardwell
     
  • The bug: When the ACK field is below snd_una (which can happen when
    ACKs are reordered), senders ignored DSACKs (preventing undo) and did
    not call tcp_fastretrans_alert, so they did not increment
    prr_delivered to reflect newly-SACKed sequence ranges, and did not
    call tcp_xmit_retransmit_queue, thus passing up chances to send out
    more retransmitted and new packets based on any newly-SACKed packets.

    The change: When the ACK field is below snd_una (the "old_ack" goto
    label), call tcp_fastretrans_alert to allow undo based on any
    newly-arrived DSACKs and try to send out more packets based on
    newly-SACKed packets.

    Other patches in this series will provide other changes that are
    necessary to fully fix this problem.

    Signed-off-by: Neal Cardwell
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Neal Cardwell
     
  • The bug: Senders ignored DSACKs after recovery when there were no
    outstanding packets (a common scenario for HTTP servers).

    The change: when there are no outstanding packets (the "no_queue" goto
    label), call tcp_fastretrans_alert() in order to use DSACKs to undo
    congestion window reductions.

    Other patches in this series will provide other changes that are
    necessary to fully fix this problem.

    Signed-off-by: Neal Cardwell
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Neal Cardwell
     
  • Allow callers to decide whether an ACK is a duplicate ACK. This is a
    prerequisite to allowing fastretrans_alert to be called from new
    contexts, such as the no_queue and old_ack code paths, from which we
    have extra info that tells us whether an ACK is a dupack.

    Signed-off-by: Neal Cardwell
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Neal Cardwell
     

21 Oct, 2011

3 commits

  • Adding const qualifiers to pointers can ease code review and help spot
    some bugs. It might also allow the compiler to optimize the code
    further.

    For example, is it legal to temporarily write a null cksum into the
    tcphdr in tcp_md5_hash_header()? I am afraid a sniffer could catch the
    temporary null value...

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • tcp_fin() only needs the socket pointer, so we can remove the skb and
    th parameters.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Since commit 356f039822b (TCP: increase default initial receive
    window.), we allow the sender to send 10 (TCP_DEFAULT_INIT_RCVWND)
    segments.

    Change tcp_fixup_rcvbuf() to reflect this change, even though no real
    change is expected, since sysctl_tcp_rmem[1] = 87380 and this value
    is bigger than the rcvmem computed by tcp_fixup_rcvbuf() (~23720).

    Note: since commit 356f039822b limited the default window to the
    maximum of 10*1460 and 2*MSS, we use the same heuristic in this patch.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
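
    A small runnable sketch of the heuristic's arithmetic, with the
    constants from the message above:

        #include <stdio.h>

        int main(void)
        {
                const unsigned int mss = 1460;       /* typical MSS */
                const unsigned int init_segs = 10;   /* TCP_DEFAULT_INIT_RCVWND */

                /* Default initial window: max(10 * 1460, 2 * MSS). */
                unsigned int wnd = init_segs * 1460;
                if (wnd < 2 * mss)
                        wnd = 2 * mss;

                /* Even scaled up for skb overhead this stays well under
                 * the sysctl_tcp_rmem[1] default of 87380 bytes. */
                printf("initial window: %u bytes\n", wnd);
                return 0;
        }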
     

14 Oct, 2011

1 commit

  • skb truesize currently accounts for sk_buff struct and part of skb head.
    kmalloc() roundings are also ignored.

    Considering that skb_shared_info is larger than sk_buff, it's time to
    take it into account for better memory accounting.

    This patch introduces SKB_TRUESIZE(X) macro to centralize various
    assumptions into a single place.

    At skb alloc phase, we put skb_shared_info struct at the exact end of
    skb head, to allow a better use of memory (lowering number of
    reallocations), since kmalloc() gives us power-of-two memory blocks.

    Unless SLAB/SLUB debug is active, both skb->head and skb_shared_info are
    aligned to cache lines, as before.

    Note: This patch might trigger performance regressions because of
    misconfigured protocol stacks, hitting per socket or global memory
    limits that were previously not reached. But it's a necessary step for
    more accurate memory accounting.

    Signed-off-by: Eric Dumazet
    CC: Andi Kleen
    CC: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
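
    A sketch of the macro's shape, modelled on the definition this commit
    introduces (SKB_DATA_ALIGN rounds each part up to a cache-line
    multiple):

        /* Truesize of an skb with X bytes of head room: the data itself
         * plus the struct sk_buff and the skb_shared_info that now sits
         * at the exact end of the head. */
        #define SKB_TRUESIZE(X) ((X) +                                   \
                SKB_DATA_ALIGN(sizeof(struct sk_buff)) +                 \
                SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))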
     

05 Oct, 2011

1 commit

  • lost_skb_hint is used by tcp_mark_head_lost() to mark the first
    unhandled skb. lost_cnt_hint is the number of packets or sacked packets
    before the lost_skb_hint.

    When shifting an skb that is before the lost_skb_hint: if tcp_is_fack()
    is true, the skb has already been counted in the lost_cnt_hint; if
    tcp_is_fack() is false, tcp_sacktag_one() will increase the
    lost_cnt_hint. So tcp_shifted_skb() does not need to adjust the
    lost_cnt_hint by itself.

    When shifting an skb that is equal to the lost_skb_hint, the shifted
    packets will not be counted by tcp_mark_head_lost(). So
    tcp_shifted_skb() should adjust the lost_cnt_hint even if
    tcp_is_fack(tp) is true.

    Signed-off-by: Zheng Yan
    Signed-off-by: David S. Miller

    Yan, Zheng
     

28 Sep, 2011

1 commit

  • Rename struct tcp_skb_cb "flags" to "tcp_flags" to ease code review and
    maintenance.

    Its content is a combination of the FIN/SYN/RST/PSH/ACK/URG/ECE/CWR
    flags.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Sep, 2011

2 commits

  • struct tcp_skb_cb contains a "flags" field holding either TCP flags
    or the IP dsfield, depending on context (input or output path).

    Introduce ip_dsfield to make the difference clear and ease maintenance.
    If we later want to save space, we can union flags/ip_dsfield.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • While playing with a new ADSL box at home, I discovered that an ECN
    blackhole can trigger suboptimal quickack mode on linux: we send one
    ACK for each incoming data frame, without any delay or eventual
    piggybacking.

    This is because TCP_ECN_check_ce() considers that if no ECT is seen on
    a segment, it is because the segment was a retransmit.

    Refine this heuristic and apply it only if we have seen ECT in a
    previous segment, to detect an ECN blackhole at the IP level.

    Signed-off-by: Eric Dumazet
    CC: Jamal Hadi Salim
    CC: Jerry Chu
    CC: Ilpo Järvinen
    CC: Jim Gettys
    CC: Dave Taht
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Eric Dumazet
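
    A sketch of the refined heuristic, with a stand-in flag for the
    per-connection ECN state:

        #include <stdbool.h>

        struct ecn_state {
                bool ect_seen;  /* has ECT been observed on this flow? */
        };

        /* Only treat a non-ECT segment as a suspected retransmit (and
         * enter quickack mode) if ECT was observed earlier on the flow;
         * otherwise the path is probably an ECN blackhole clearing ECT
         * at the IP level. */
        static bool non_ect_means_retransmit(struct ecn_state *s,
                                             bool pkt_ect)
        {
                if (pkt_ect) {
                        s->ect_seen = true;
                        return false;
                }
                return s->ect_seen;
        }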
     

22 Sep, 2011

1 commit

  • Conflicts:
    MAINTAINERS
    drivers/net/Kconfig
    drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
    drivers/net/ethernet/broadcom/tg3.c
    drivers/net/wireless/iwlwifi/iwl-pci.c
    drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
    drivers/net/wireless/rt2x00/rt2800usb.c
    drivers/net/wireless/wl12xx/main.c

    David S. Miller
     

19 Sep, 2011

1 commit

  • D-SACK is allowed to reside below snd_una. But the corresponding check
    in tcp_is_sackblock_valid() is the exact opposite. It looks like a typo.

    Signed-off-by: Zheng Yan
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zheng Yan
     

25 Aug, 2011

1 commit

  • This patch implements Proportional Rate Reduction (PRR) for TCP.
    PRR is an algorithm that determines TCP's sending rate in fast
    recovery. PRR avoids excessive window reductions and aims for
    the actual congestion window size at the end of recovery to be as
    close as possible to the window determined by the congestion control
    algorithm. PRR also improves accuracy of the amount of data sent
    during loss recovery.

    The patch implements the recommended flavor of PRR called PRR-SSRB
    (Proportional rate reduction with slow start reduction bound) and
    replaces the existing rate halving algorithm. PRR improves upon the
    existing Linux fast recovery under a number of conditions including:
    1) burst losses where the losses implicitly reduce the amount of
    outstanding data (pipe) below the ssthresh value selected by the
    congestion control algorithm and,
    2) losses near the end of short flows where the application runs out
    of data to send.

    As an example, with the existing rate halving implementation a single
    loss event can cause a connection carrying short Web transactions to
    go into the slow start mode after the recovery. This is because during
    recovery Linux pulls the congestion window down to packets_in_flight+1
    on every ACK. A short Web response often runs out of new data to send
    and its pipe reduces to zero by the end of recovery when all its packets
    are drained from the network. Subsequent HTTP responses using the same
    connection will have to slow start to raise cwnd to ssthresh. PRR on
    the other hand aims for the cwnd to be as close as possible to ssthresh
    by the end of recovery.

    A description of PRR and a discussion of its performance can be found at
    the following links:
    - IETF Draft:
    http://tools.ietf.org/html/draft-mathis-tcpm-proportional-rate-reduction-01
    - IETF Slides:
    http://www.ietf.org/proceedings/80/slides/tcpm-6.pdf
    http://tools.ietf.org/agenda/81/slides/tcpm-2.pdf
    - Paper to appear in the Internet Measurement Conference (IMC) 2011:
    Improving TCP Loss Recovery
    Nandita Dukkipati, Matt Mathis, Yuchung Cheng

    Signed-off-by: Nandita Dukkipati
    Signed-off-by: David S. Miller

    Nandita Dukkipati
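
    A sketch of the PRR-SSRB send quota following the draft's formulation,
    with simplified names and bookkeeping:

        /* Proportional part: spread the cwnd reduction over recovery,
         * in proportion to the data actually delivered. Slow-start
         * reduction bound: once pipe has fallen below ssthresh, grow
         * back toward it by at most one packet per packet delivered
         * (plus one), never beyond ssthresh. */
        static int prr_sndcnt(int pipe, int ssthresh, int prior_cwnd,
                              int prr_delivered, int prr_out,
                              int newly_acked)
        {
                int sndcnt;

                if (pipe > ssthresh) {
                        /* ceil(prr_delivered * ssthresh / prior_cwnd)
                         * minus what recovery has already sent. */
                        sndcnt = (prr_delivered * ssthresh
                                  + prior_cwnd - 1) / prior_cwnd
                                 - prr_out;
                } else {
                        int limit = prr_delivered - prr_out;
                        if (limit < newly_acked)
                                limit = newly_acked;
                        sndcnt = ssthresh - pipe;  /* room to ssthresh */
                        if (sndcnt > limit + 1)
                                sndcnt = limit + 1;
                }
                return sndcnt > 0 ? sndcnt : 0;
        }
        /* cwnd on each ACK becomes pipe + sndcnt, so recovery ends with
         * cwnd close to ssthresh. */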
     

09 Jun, 2011

1 commit

  • This patch lowers the default initRTO from 3secs to 1sec per
    RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
    has been retransmitted, AND the TCP timestamp option is not on.

    It also adds support for taking an RTT sample during the 3WHS on the
    passive open side, just like its active open counterpart, and uses it,
    if valid, to seed the initRTO for the data transmission phase.

    The patch also resets ssthresh to its initial default at the
    beginning of the data transmission phase, and reduces cwnd to 1 if
    there has been MORE THAN ONE retransmission during the 3WHS, per
    RFC5681.

    Signed-off-by: H.K. Jerry Chu
    Signed-off-by: David S. Miller

    Jerry Chu
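
    A small runnable sketch of the timing rule described above (the kernel
    expresses these timeouts in jiffies):

        #include <stdio.h>

        int main(void)
        {
                /* Per RFC2988bis the initial RTO drops from 3 s to 1 s;
                 * fall back to 3 s when the SYN or SYN-ACK was
                 * retransmitted AND no timestamp option is available to
                 * take an RTT sample from the 3WHS. */
                const int rto_init = 1, rto_fallback = 3;  /* seconds */
                int syn_retransmitted = 1, have_timestamps = 0;

                int rto = (syn_retransmitted && !have_timestamps)
                          ? rto_fallback : rto_init;
                printf("initial RTO: %d s\n", rto);
                return 0;
        }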
     

23 Mar, 2011

2 commits

  • Signed-off-by: David S. Miller

    David S. Miller
     
  • In the current undo logic, cwnd is moderated after it is restored
    to the value it had prior to entering fast recovery. It is moderated
    first in tcp_try_undo_recovery, then again in tcp_complete_cwr.

    Since the undo indicates recovery was false, these moderations
    are not necessary. If the undo is triggered when most of the
    outstanding data have been acknowledged, the (restored) cwnd is
    falsely pulled down to a small value.

    This patch removes these cwnd moderations if cwnd is undone
    a) during fast-recovery
    b) by receiving DSACKs past fast-recovery

    Signed-off-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Yuchung Cheng
     

15 Mar, 2011

1 commit

  • In the congestion control interface, the callback for each ACK
    includes an estimated round trip time in microseconds.
    Some algorithms need high resolution (Vegas style) but most only
    need jiffie resolution. If the RTT is not accurate (as with a
    retransmission), -1 is used as a flag value.

    When doing coarse resolution, if the RTT is less than a jiffie
    then 0 should be returned rather than no estimate. Otherwise algorithms
    that expect good acks to trigger slow start (like CUBIC Hystart)
    will be confused.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
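
    A sketch of the reporting convention being fixed, with hypothetical
    names:

        /* Convert a measured RTT in microseconds to the value reported
         * to the congestion-control ACK hook. -1 stays the "no estimate"
         * flag for inaccurate samples (e.g. retransmissions); an RTT
         * that is merely smaller than one jiffie is a *good* sample and
         * must be reported as 0, or Hystart-style heuristics never see
         * any good ACKs. */
        static long rtt_to_report(long rtt_us, long usecs_per_jiffy,
                                  int accurate)
        {
                if (!accurate)
                        return -1;    /* flag value: no estimate */
                if (rtt_us < usecs_per_jiffy)
                        return 0;     /* valid, just sub-jiffie */
                return rtt_us;
        }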
     

22 Feb, 2011

1 commit

  • Fix a bug where undo_retrans is incorrectly decremented when undo_marker
    is not set or undo_retrans is already 0. This happens when the sender
    receives more DSACK ACKs than packets retransmitted during the current
    undo phase. This may also happen when the sender receives a DSACK after
    the undo operation is completed or cancelled.

    Fix another bug where undo_retrans is incorrectly incremented when the
    sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
    is rare but not impossible.

    Signed-off-by: Yuchung Cheng
    Acked-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Yuchung Cheng
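
    A sketch of the two guards, with simplified stand-in state (the real
    fields live in struct tcp_sock):

        #include <stdint.h>

        struct undo_state {
                uint32_t undo_marker; /* nonzero while undo phase open  */
                int undo_retrans;     /* retransmits unmatched by DSACK */
        };

        /* Bug 1: only decrement while an undo phase is active and the
         * count is still positive, so surplus or late DSACKs cannot
         * underflow it. */
        static void on_dsack(struct undo_state *s)
        {
                if (s->undo_marker && s->undo_retrans > 0)
                        s->undo_retrans--;
        }

        /* Bug 2: a TSO skb retransmits tcp_skb_pcount(skb) packets, so
         * count all of them, not just one. */
        static void on_retransmit(struct undo_state *s, int pcount)
        {
                if (s->undo_marker)
                        s->undo_retrans += pcount;
        }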