Eric Lee / smarc-fsl-linux-kernel

25 Apr, 2017

2 commits

cf1ef3f07 net/tcp_fastopen: Disable active side TFO in certain scenarios

Middlebox firewall issues can potentially cause the server's data to be
blackholed after a successful 3WHS using TFO. The following are the related
reports from Apple:
https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
Slide 31 identifies an issue where the client ACK to the server's data
sent during a TFO'd handshake is dropped.
C ---> syn-data ---> S
C X S
[retry and timeout]

https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
Slide 5 shows a similar situation in which the server's data gets dropped
after 3WHS.
C ---- syn-data ---> S
C S
S (accept & write)
C? X
Acked-by: Yuchung Cheng
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Wei Wang
2017-04-25 02:27:17 +0800
58c4c6a3f net: add rcu locking when changing early demux

systemd-sysctl is triggering a suspicious RCU usage message when
net.ipv4.tcp_early_demux or net.ipv4.udp_early_demux is changed via
a sysctl config file:

[ 33.896184] ===============================
[ 33.899558] [ ERR: suspicious RCU usage. ]
[ 33.900624] 4.11.0-rc7+ #104 Not tainted
[ 33.901698] -------------------------------
[ 33.903059] /home/dsa/kernel-2.git/net/ipv4/sysctl_net_ipv4.c:305 suspicious rcu_dereference_check() usage!
[ 33.905724]
other info that might help us debug this:

[ 33.907656]
rcu_scheduler_active = 2, debug_locks = 0
[ 33.909288] 1 lock held by systemd-sysctl/143:
[ 33.910373] #0: (sb_writers#5){.+.+.+}, at: [] file_start_write+0x45/0x48
[ 33.912407]
stack backtrace:
[ 33.914018] CPU: 0 PID: 143 Comm: systemd-sysctl Not tainted 4.11.0-rc7+ #104
[ 33.915631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 33.917870] Call Trace:
[ 33.918431] dump_stack+0x81/0xb6
[ 33.919241] lockdep_rcu_suspicious+0x10f/0x118
[ 33.920263] proc_configure_early_demux+0x65/0x10a
[ 33.921391] proc_udp_early_demux+0x3a/0x41

Add RCU locking to proc_configure_early_demux.

Fixes: dddb64bcb3461 ("net: Add sysctl to toggle early demux for tcp and udp")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2017-04-25 02:08:19 +0800

25 Mar, 2017

1 commit

dddb64bcb net: Add sysctl to toggle early demux for tcp and udp

Certain systems process a significant unconnected UDP workload.
It would be preferable to disable UDP early demux for those systems
and enable it for TCP only.

By disabling UDP demux, we see these slight gains on an ARM64 system:
782 -> 788Mbps unconnected single stream UDPv4
633 -> 654Mbps unconnected UDPv4 different sources

The performance impact can change based on CPU architecture and cache
sizes. There will not be much difference seen if the entire UDP hash table
is in cache.

Both sysctls are enabled by default to preserve existing behavior.

v1->v2: Change function pointer instead of adding conditional as
suggested by Stephen.

v2->v3: Read once in callers to avoid issues due to compiler
optimizations. Also update commit message with the tests.

v3->v4: Store and use read once result instead of querying pointer
again incorrectly.

v4->v5: Refactor to avoid errors due to compilation with IPV6={m,n}
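
The revision notes above describe toggling a function pointer rather than adding a conditional, and reading that pointer once in the caller. A minimal userspace sketch of that pattern follows, in Python rather than kernel C. The class and function names are illustrative assumptions, not the kernel's definitions.

```python
class Protocol:
    """Illustrative stand-in for a per-protocol handler table (assumed layout)."""

    def __init__(self, early_demux):
        self.early_demux_handler = early_demux  # original handler, always kept
        self.early_demux = early_demux          # active pointer, None when disabled


def set_early_demux(proto, enabled):
    # The toggle swaps the pointer instead of adding a check inside the handler.
    proto.early_demux = proto.early_demux_handler if enabled else None


def receive(proto, packet):
    # Read the pointer once and use the stored result, so a concurrent
    # toggle cannot change it between the check and the call.
    handler = proto.early_demux
    if handler is not None:
        return handler(packet)
    return None
```

With this shape, the receive path pays only for a single pointer read when the feature is disabled.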

Signed-off-by: Subash Abhinov Kasiviswanathan
Suggested-by: Eric Dumazet
Cc: Stephen Hemminger
Cc: Tom Herbert
Cc: David Miller
Signed-off-by: David S. Miller

subashab@codeaurora.org
2017-03-25 04:17:07 +0800

22 Mar, 2017

1 commit

bf4e0a3db net: ipv4: add support for ECMP hash policy choice

This patch adds support for ECMP hash policy choice via a new sysctl
called fib_multipath_hash_policy and also adds support for L4 hashes.
The current values for fib_multipath_hash_policy are:
0 - layer 3 (default)
1 - layer 4
If there's an skb hash already set and it matches the chosen policy then it
will be used instead of being calculated (currently only for L4).
In L3 mode we always calculate the hash due to the ICMP error special
case. The flow dissector's field consistentification should handle the
address order, so we can remove the address reversals.
If the skb is provided we always use it for the hash calculation,
otherwise we fall back to fl4, that is, if skb is NULL fl4 has to be set.
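
The difference between the two policy values can be illustrated with a small userspace sketch. This is not the kernel implementation: `zlib.crc32` stands in for the kernel's flow hash, and the function names are assumptions made for illustration.

```python
import socket
import struct
import zlib

L3_POLICY = 0  # hash on source and destination address only (default)
L4_POLICY = 1  # hash on addresses, protocol and ports


def multipath_hash(policy, src, dst, proto=0, sport=0, dport=0):
    """Return a flow hash for the given policy (crc32 is a stand-in hash)."""
    key = socket.inet_aton(src) + socket.inet_aton(dst)
    if policy == L4_POLICY:
        key += struct.pack("!BHH", proto, sport, dport)
    return zlib.crc32(key)


def select_nexthop(nexthops, flow_hash):
    """Map a flow hash onto one of the equal-cost next hops."""
    return nexthops[flow_hash % len(nexthops)]
```

Under the layer 3 policy every flow between the same pair of hosts maps to the same next hop, while the layer 4 policy lets separate connections between those hosts spread across paths.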

Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Nikolay Aleksandrov
2017-03-22 06:27:19 +0800

17 Mar, 2017

1 commit

4396e4618 tcp: remove tcp_tw_recycle

The tcp_tw_recycle was already broken for connections
behind NAT, since the per-destination timestamp is not
monotonically increasing for multiple machines behind
a single destination address.

After the randomization of TCP timestamp offsets
in commit 8a5bd45f6616 (tcp: randomize tcp timestamp offsets
for each connection), the tcp_tw_recycle is broken for all
types of connections for the same reason: the timestamps
received from a single machine are no longer monotonically
increasing.
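
The failure mode can be sketched with a simplified per-destination timestamp check. This is a userspace illustration under stated assumptions: it ignores timestamp wraparound and any time window, and the function name is hypothetical.

```python
def accept_syn(last_ts_by_addr, addr, tsval):
    """Accept a SYN only if its timestamp is not older than the last one
    recorded for the same remote address (simplified check)."""
    last = last_ts_by_addr.get(addr)
    if last is not None and tsval < last:
        return False  # looks like an old duplicate, so it is dropped
    last_ts_by_addr[addr] = tsval
    return True


# Two machines behind one NAT address, each with its own timestamp clock.
seen = {}
first = accept_syn(seen, "203.0.113.7", 5000)   # machine A
second = accept_syn(seen, "203.0.113.7", 1000)  # machine B, lower clock
```

Here the second connection attempt is rejected even though it is legitimate. Per-connection random timestamp offsets produce the same effect for a single machine.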

Remove tcp_tw_recycle, since it is not functional. Also, remove
the PAWSPassive SNMP counter since it is only used for
tcp_tw_recycle, and simplify tcp_v4_route_req and tcp_v6_route_req
since the strict argument is only set when tcp_tw_recycle is
enabled.

Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: Eric Dumazet
Signed-off-by: Neal Cardwell
Signed-off-by: Yuchung Cheng
Cc: Lutz Vieweg
Cc: Florian Westphal
Signed-off-by: David S. Miller

Soheil Hassas Yeganeh
2017-03-17 11:33:56 +0800

31 Jan, 2017

1 commit

63a6fff35 net: Avoid receiving packets with an l3mdev on unbound UDP sockets

Packets arriving in a VRF currently are delivered to UDP sockets that
aren't bound to any interface. TCP defaults to not delivering packets
arriving in a VRF to unbound sockets. IP route lookup and socket
transmit both assume that unbound means using the default table and
UDP applications that haven't been changed to be aware of VRFs may not
function correctly in this case since they may not be able to handle
overlapping IP address ranges, or be able to send packets back to the
original sender if required.

So add a sysctl, udp_l3mdev_accept, to control this behaviour with it
being analogous to the existing tcp_l3mdev_accept, namely to allow a
process to have a VRF-global listen socket. Have this default to off
as this is the behaviour that users will expect, given that there is
no explicit mechanism to set an unmodified VRF-unaware application into a
default VRF.

Signed-off-by: Robert Shearman
Acked-by: David Ahern
Tested-by: David Ahern
Signed-off-by: David S. Miller

Robert Shearman
2017-01-31 04:00:58 +0800

25 Jan, 2017

1 commit

4548b683b Introduce a sysctl that modifies the value of PROT_SOCK.

Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
that denotes the first unprivileged inet port in the namespace. To
disable all privileged ports set this to zero. It also checks for
overlap with the local port range. The privileged and local range may
not overlap.
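
The overlap rule can be sketched as follows. This is a userspace illustration, not the kernel code. The function names are hypothetical, and the numeric values in the usage note are example inputs.

```python
def is_privileged(port, unprivileged_start):
    """Ports below the first unprivileged port are privileged."""
    return port < unprivileged_start


def overlaps_local_range(unprivileged_start, local_range):
    """True if the privileged range [0, unprivileged_start - 1] reaches
    into the local port range (low, high)."""
    low, _high = local_range
    return low < unprivileged_start


def set_unprivileged_start(new_start, local_range):
    """Validate a new setting; reject it if the ranges would overlap."""
    if overlaps_local_range(new_start, local_range):
        raise ValueError("privileged and local port ranges may not overlap")
    return new_start
```

For example, with a local port range of (32768, 60999), a value of 1024 is accepted and 40000 is rejected, while a value of 0 leaves no port privileged.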

The use case for this change is to allow containerized processes to bind
to privileged ports, but prevent them from ever being allowed to modify
their container's network configuration. The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace. This modification was needed to allow the container manager
to disable a namespace's privileged port restrictions without exposing
control of the network namespace to processes in the user namespace.

Signed-off-by: Krister Johansen
Signed-off-by: David S. Miller

Krister Johansen
2017-01-25 01:10:51 +0800

14 Jan, 2017

1 commit

4a7f60094 tcp: remove thin_dupack feature

Thin stream DUPACK is to start fast recovery on only one DUPACK
provided the connection is a thin stream (i.e., low inflight). But
this older feature is now subsumed by RACK. If a connection
receives only a single DUPACK, RACK would arm a reordering timer
and soon start fast recovery instead of timeout if no further
ACKs are received.

The socket option (THIN_DUPACK) is kept as a nop for compatibility.
Note that this patch does not change another thin-stream feature
which enables linear RTO. Although it might be good to generalize
that in the future (i.e., linear RTO for the first say 3 retries).

Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Yuchung Cheng
2017-01-14 11:37:16 +0800

12 Jan, 2017

1 commit

02ac5d148 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Two AF_* families adding entries to the lockdep tables
at the same time.

Signed-off-by: David S. Miller

David S. Miller
2017-01-12 03:43:39 +0800

10 Jan, 2017

1 commit

b007f0907 ipv4: make tcp_notsent_lowat sysctl knob behave as true unsigned int

> cat /proc/sys/net/ipv4/tcp_notsent_lowat
-1
> echo 4294967295 > /proc/sys/net/ipv4/tcp_notsent_lowat
-bash: echo: write error: Invalid argument
> echo -2147483648 > /proc/sys/net/ipv4/tcp_notsent_lowat
> cat /proc/sys/net/ipv4/tcp_notsent_lowat
-2147483648

but in documentation we have "tcp_notsent_lowat - UNSIGNED INTEGER"
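
The transcript above is consistent with the value being parsed and printed as a signed 32-bit integer. A small userspace sketch shows the two interpretations of the same 32 bits and what an unsigned parser would accept. The helper names are hypothetical.

```python
import struct


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def parse_uint32(text):
    """Parse text as an unsigned 32-bit integer, rejecting anything else."""
    value = int(text)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("Invalid argument")
    return value
```

`as_signed32(4294967295)` yields -1, matching the first `cat` output, while `parse_uint32` accepts 4294967295 and rejects -2147483648, which is the behaviour the documentation's "UNSIGNED INTEGER" implies.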

v2: simplify to just proc_douintvec
Signed-off-by: Pavel Tikhomirov
Signed-off-by: David S. Miller

Pavel Tikhomirov
2017-01-10 05:34:38 +0800

30 Dec, 2016

2 commits

fee83d097 ipv4: Namespaceify tcp_max_syn_backlog knob

Applications in different namespaces might require a different maximum
number of remembered connection requests.

Signed-off-by: Haishuang Yan
Signed-off-by: David S. Miller

Haishuang Yan
2016-12-30 00:38:31 +0800
1946e672c ipv4: Namespaceify tcp_tw_recycle and tcp_max_tw_buckets knob

Applications in different namespaces might require fast recycling of
TIME-WAIT sockets independently of the host.

Signed-off-by: Haishuang Yan
Signed-off-by: David S. Miller

Haishuang Yan
2016-12-30 00:38:31 +0800

28 Dec, 2016

1 commit

56ab6b930 ipv4: Namespaceify tcp_tw_reuse knob

Different namespaces might have different requirements to reuse
TIME-WAIT sockets for new connections. This might be required in
cases where different namespace applications are in place which
require TIME_WAIT socket connections to be reduced independently
of the host.

Signed-off-by: Haishuang Yan
Signed-off-by: David S. Miller

Haishuang Yan
2016-12-28 01:28:07 +0800

23 Oct, 2016

1 commit

396a30cce ipv4: use the right lock for ping_group_range

This reverts commit a681574c99be23e4d20b769bf0e543239c364af5
("ipv4: disable BH in set_ping_group_range()") because we never
read ping_group_range in BH context (unlike local_port_range).

Then, since we already have a lock for ping_group_range, those
using ip_local_ports.lock for ping_group_range are clearly typos.

We might consider sharing the same lock for both ping_group_range
and local_port_range to save space, but that should be for
net-next.

Fixes: a681574c99be ("ipv4: disable BH in set_ping_group_range()")
Fixes: ba6b918ab234 ("ping: move ping_group_range out of CONFIG_SYSCTL")
Cc: Eric Dumazet
Cc: Eric Salo
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2016-10-23 04:23:12 +0800

21 Oct, 2016

1 commit

a681574c9 ipv4: disable BH in set_ping_group_range()

In commit 4ee3bd4a8c746 ("ipv4: disable BH when changing ip local port
range") Cong added BH protection in set_local_port_range() but missed
that the same fix was needed in set_ping_group_range().

Fixes: b8f1a55639e6 ("udp: Add function to make source port for UDP tunnels")
Signed-off-by: Eric Dumazet
Reported-by: Eric Salo
Signed-off-by: David S. Miller

Eric Dumazet
2016-10-21 02:49:32 +0800

24 May, 2016

1 commit

049bbf589 ipv4: Fix non-initialized TTL when CONFIG_SYSCTL=n

Commit fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
moves the default TTL assignment, and as side-effect IPv4 TTL now
has a default value only if sysctl support is enabled (CONFIG_SYSCTL=y).

The sysctl_ip_default_ttl is fundamental for IP to work properly,
as it provides the TTL to be used as default. The default TTL may be
used in ip_select_ttl, through the following flow:

ip_select_ttl
ip4_dst_hoplimit
net->ipv4.sysctl_ip_default_ttl

This commit fixes the issue by assigning net->ipv4.sysctl_ip_default_ttl
in net_init_net, called during ipv4's initialization.

Without this commit, a kernel built without sysctl support will send
all IP packets with zero TTL (unless a TTL is explicitly set, e.g.
with setsockopt).

Given a similar issue might appear on the other knobs that were
namespaceified, this commit also moves them.

Fixes: fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
Signed-off-by: Ezequiel Garcia
Signed-off-by: David S. Miller

Ezequiel Garcia
2016-05-24 05:32:06 +0800

12 Apr, 2016

1 commit

a6db4494d net: ipv4: Consider failed nexthops in multipath routes

Multipath route lookups should consider knowledge about next hops and not
select a hop that is known to be failed.

Example:

[h2] [h3] 15.0.0.5
| |
3| 3|
[SP1] [SP2]--+
1 2 1 2
| | /-------------+ |
| \ / |
| X |
| / \ |
| / \---------------\ |
1 2 1 2
12.0.0.2 [TOR1] 3-----------------3 [TOR2] 12.0.0.3
4 4
\ /
\ /
\ /
-------| |-----/
1 2
[TOR3]
3|
|
[h1] 12.0.0.1

host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5:

root@h1:~# ip ro ls
...
12.0.0.0/24 dev swp1 proto kernel scope link src 12.0.0.1
15.0.0.0/16
nexthop via 12.0.0.2 dev swp1 weight 1
nexthop via 12.0.0.3 dev swp1 weight 1
...

If the link between tor3 and tor1 is down, along with the link between tor1
and tor2, then tor1 is effectively cut off from h1. Yet the route lookups
in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and
ssh 15.0.0.5 gets the other. Connections that attempt to use the
12.0.0.2 nexthop fail since that neighbor is not reachable:

root@h1:~# ip neigh show
...
12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE
12.0.0.2 dev swp1 FAILED
...

The failed path can be avoided by considering known neighbor information
when selecting next hops. If the neighbor lookup fails we have no
knowledge about the nexthop, so give it a shot. If there is an entry
then only select the nexthop if the state is sane. This is similar to
what fib_detect_death does.

To maintain backward compatibility use of the neighbor information is
based on a new sysctl, fib_multipath_use_neigh.
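
The selection rule described above can be sketched in userspace as follows. This is a simplification and not the kernel's algorithm: the set of "sane" neighbor states and the walk to the next candidate are assumptions made for illustration, and the function names are hypothetical.

```python
# Assumed set of neighbor states treated as usable in this sketch.
USABLE_STATES = {"REACHABLE", "STALE", "DELAY", "PROBE", "PERMANENT", "NOARP"}


def nexthop_usable(nexthop, neigh_table):
    state = neigh_table.get(nexthop)
    if state is None:
        return True  # no neighbor entry: nothing known, so give it a shot
    return state in USABLE_STATES


def select_nexthop(nexthops, neigh_table, flow_hash, use_neigh=True):
    start = flow_hash % len(nexthops)
    if not use_neigh:
        return nexthops[start]  # previous behaviour: hash only
    for offset in range(len(nexthops)):
        candidate = nexthops[(start + offset) % len(nexthops)]
        if nexthop_usable(candidate, neigh_table):
            return candidate
    return nexthops[start]  # every next hop looks bad: keep the hashed choice


neigh = {"12.0.0.3": "REACHABLE", "12.0.0.2": "FAILED"}
hops = ["12.0.0.2", "12.0.0.3"]
```

With the neighbor table from the example, a flow that hashes to 12.0.0.2 is steered to 12.0.0.3 when the check is enabled and stays on the failed next hop when it is not.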

Signed-off-by: David Ahern
Reviewed-by: Julian Anastasov
Signed-off-by: David S. Miller

David Ahern
2016-04-12 03:16:13 +0800

17 Feb, 2016

3 commits

e21145a98 ipv4: namespacify ip_early_demux sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-17 09:42:54 +0800
287b7f38f ipv4: Namespacify ip_dynaddr sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-17 09:42:54 +0800
fa50d974d ipv4: Namespaceify ip_default_ttl sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-17 09:42:54 +0800

11 Feb, 2016

4 commits

165094afc igmp: Namespacify igmp_qrv sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-11 22:59:22 +0800
87a8a2ae6 igmp: Namespaceify igmp_llm_reports sysctl knob

This was initially introduced in df2cf4a78e488d26 ("IGMP: Inhibit
reports for local multicast groups") by defining the sysctl in the
ipv4_net_table array, however it was never implemented to be
namespace aware. Fix this by changing the code accordingly.

Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-11 22:59:22 +0800
166b6b2d6 igmp: Namespaceify igmp_max_msf sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-11 22:59:22 +0800
815c52700 igmp: Namespaceify igmp_max_memberships sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-11 22:59:22 +0800

08 Feb, 2016

9 commits

4979f2d9f ipv4: Namespaceify tcp_notsent_lowat sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:36:11 +0800
1e579caa1 ipv4: Namespaceify tcp_fin_timeout sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:36:11 +0800
c402d9bef ipv4: Namespaceify tcp_orphan_retries sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:11 +0800
c6214a97c ipv4: Namespaceify tcp_retries2 sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:11 +0800
ae5c3f406 ipv4: Namespaceify tcp_retries1 sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:10 +0800
1043e25ff ipv4: Namespaceify tcp reordering sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:10 +0800
12ed8244e ipv4: Namespaceify tcp syncookies sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:10 +0800
7c083ecb3 ipv4: Namespaceify tcp synack retries sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:10 +0800
6fa251663 ipv4: Namespaceify tcp syn retries sysctl knob

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-02-08 03:35:10 +0800

21 Jan, 2016

1 commit

d55f90bfa net: drop tcp_memcontrol.c

tcp_memcontrol.c only contains legacy memory.tcp.kmem.* file definitions
and mem_cgroup->tcp_mem init/destroy stuff. This doesn't belong to
network subsys. Let's move it to memcontrol.c. This also allows us to
reuse generic code for handling legacy memcg files.

Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Cc: "David S. Miller"
Acked-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2016-01-21 09:09:18 +0800

11 Jan, 2016

3 commits

b840d15d3 ipv4: Namespecify the tcp_keepalive_intvl sysctl knob

This is the final part required to namespaceify the tcp
keep alive mechanism.

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-01-11 06:32:09 +0800
9bd6861bd ipv4: Namespecify tcp_keepalive_probes sysctl knob

This is required to have full tcp keepalive mechanism namespace
support.

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-01-11 06:32:09 +0800
13b287e8d ipv4: Namespaceify tcp_keepalive_time sysctl knob

Different net namespaces might have different requirements as to
the keepalive time of tcp sockets. This might be required in cases
where different firewall rules are in place which require tcp
timeout sockets to be increased/decreased independently of the host.

Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller

Nikolay Borisov
2016-01-11 06:32:09 +0800

19 Dec, 2015

1 commit

6dd9a14e9 net: Allow accepted sockets to be bound to l3mdev domain

Allow accepted sockets to derive their sk_bound_dev_if setting from the
l3mdev domain in which the packets originated. A sysctl setting is added
to control the behavior which is similar to sk_mark and
sysctl_tcp_fwmark_accept.

This effectively allows a process to have a "VRF-global" listen socket,
with child sockets bound to the VRF device in which the packet originated.
A similar behavior can be achieved using sk_mark, but a solution using marks
is incomplete as it does not handle duplicate addresses in different L3
domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev
domain provides a complete solution.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2015-12-19 03:43:38 +0800

05 Nov, 2015

1 commit

4ee3bd4a8 ipv4: disable BH when changing ip local port range

This fixes the following lockdep warning:

[ INFO: inconsistent lock state ]
4.3.0-rc7+ #1197 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
sysctl/1019 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&net->ipv4.ip_local_ports.lock)->seqcount){+.+-..}, at: [] ipv4_local_port_range+0xb4/0x12a
{IN-SOFTIRQ-R} state was registered at:
[] __lock_acquire+0x2f6/0xdf0
[] lock_acquire+0x11c/0x1a4
[] inet_get_local_port_range+0x4e/0xae
[] udp_flow_src_port.constprop.40+0x23/0x116
[] vxlan_xmit_one+0x219/0xa6a
[] vxlan_xmit+0xa6b/0xaa5
[] dev_hard_start_xmit+0x2ae/0x465
[] __dev_queue_xmit+0x531/0x633
[] dev_queue_xmit_sk+0x13/0x15
[] neigh_resolve_output+0x12f/0x14d
[] ip6_finish_output2+0x344/0x39f
[] ip6_finish_output+0x88/0x8e
[] ip6_output+0x91/0xe5
[] dst_output_sk+0x47/0x4c
[] NF_HOOK_THRESH.constprop.30+0x38/0x82
[] mld_sendpack+0x189/0x266
[] mld_ifc_timer_expire+0x1ef/0x223
[] call_timer_fn+0xfb/0x28c
[] run_timer_softirq+0x1c7/0x1f1

Fixes: b8f1a55639e6 ("udp: Add function to make source port for UDP tunnels")
Cc: Tom Herbert
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2015-11-05 10:29:06 +0800

21 Oct, 2015

1 commit

4f41b1c58 tcp: use RACK to detect losses

This patch implements the second half of RACK that uses the most
recent transmit time among all delivered packets to detect losses.

tcp_rack_mark_lost() is called upon receiving a dubious ACK.
It then checks if a not-yet-sacked packet was sent at least
"reo_wnd" prior to the sent time of the most recently delivered.
If so the packet is deemed lost.

The "reo_wnd" reordering window starts with 1msec for fast loss
detection and changes to min-RTT/4 when reordering is observed.
We found 1msec accommodates well on tiny degree of reordering
(
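
The loss rule and the reordering window described above can be sketched in userspace as follows. This follows the wording of the message and is not the kernel code. The packet fields, function names, and microsecond units are assumptions made for illustration.

```python
FAST_REO_WND_US = 1000  # 1 msec window used before any reordering is seen


def reo_wnd_us(min_rtt_us, reordering_seen):
    """Reordering window: 1 msec at first, min-RTT/4 once reordering is seen."""
    return min_rtt_us // 4 if reordering_seen else FAST_REO_WND_US


def rack_mark_lost(packets, rack_sent_us, reo_wnd):
    """Return sequence numbers of packets deemed lost.

    rack_sent_us is the send time of the most recently delivered packet.
    """
    lost = []
    for pkt in packets:
        if pkt["sacked"]:
            continue  # already delivered, cannot be lost
        if rack_sent_us - pkt["sent_us"] >= reo_wnd:
            lost.append(pkt["seq"])
    return lost


packets = [
    {"seq": 1, "sent_us": 0, "sacked": False},
    {"seq": 2, "sent_us": 9500, "sacked": False},
    {"seq": 3, "sent_us": 2000, "sacked": True},
]
```

With the most recently delivered packet sent at 10000 us and the 1 msec window, only packet 1 is marked lost. Packet 2 was sent too recently to judge, and packet 3 has already been SACKed.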
Signed-off-by: Neal Cardwell
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Yuchung Cheng
2015-10-21 22:00:53 +0800