Doug / smarc-fsl-linux-kernel | Embedian Git Server

02 Nov, 2011

1 commit

73cb88ecb net: make the tcp and udp file_operations for the /proc stuff const

the tcp and udp code creates a set of struct file_operations at runtime
while it can also be done at compile time, with the added benefit of then
having these file operations be const.

the trickiest part was to get the "THIS_MODULE" reference right; the naive
method of declaring a struct in the place of registration would not work
for this reason.

Signed-off-by: Arjan van de Ven
Signed-off-by: David S. Miller

Arjan van de Ven
2011-11-02 05:56:14 +0800
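
A minimal userspace sketch of the idea in 73cb88ecb, assuming a made-up `struct demo_file_ops` in place of the kernel's `struct file_operations` and a plain string in place of `THIS_MODULE`. All `demo_*` names are hypothetical. It only illustrates that a table filled in with designated initializers at compile time can be declared const, while registration merely stores a pointer to it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct file_operations. */
struct demo_file_ops {
    const void *owner;                 /* plays the role of THIS_MODULE */
    int (*open)(const char *name);
    int (*release)(void);
};

int demo_open(const char *name) { return name != NULL ? 0 : -1; }
int demo_release(void) { return 0; }

/* Stand-in for the module object an owner field would point at. */
const char demo_module[] = "demo";

/* Initialized by the compiler, so the table can live in read-only data. */
const struct demo_file_ops demo_seq_fops = {
    .owner   = demo_module,
    .open    = demo_open,
    .release = demo_release,
};

/* Registration keeps a pointer to the constant table; nothing is copied
 * or patched at runtime. */
const struct demo_file_ops *registered_ops;

void demo_register(const struct demo_file_ops *ops)
{
    registered_ops = ops;
}
```

Because the owner is part of the initializer, each table has to be defined where the right owner symbol is visible, which is one way to read the remark about getting the "THIS_MODULE" reference right.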

25 Oct, 2011

1 commit

78d81d15b TCP: remove TCP_DEBUG

It was enabled by default and the messages guarded
by the define are useful.

Signed-off-by: Flavio Leitner
Signed-off-by: David S. Miller

Flavio Leitner
2011-10-25 05:36:08 +0800

24 Oct, 2011

2 commits

318cf7aaa tcp: md5: add more const attributes

Now tcp_md5_hash_header() has a const tcphdr argument, we can add more
const attributes to callers.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-24 14:46:04 +0800
ca35a0ef8 tcp: md5: dont write skb head in tcp_md5_hash_header()

tcp_md5_hash_header() writes into skb header a temporary zero value,
this might confuse other users of this area.

Since tcphdr is small (20 bytes), copy it in a temporary variable and
make the change in the copy.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-24 13:52:35 +0800
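
The approach described in ca35a0ef8 can be sketched in userspace as follows. This is an illustration only: `struct demo_tcphdr` is a simplified 20-byte header, and `demo_digest()` is a toy checksum standing in for the real MD5 update step. Both names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified 20-byte TCP header; hypothetical stand-in for struct tcphdr. */
struct demo_tcphdr {
    uint16_t source;
    uint16_t dest;
    uint32_t seq;
    uint32_t ack_seq;
    uint16_t flags;    /* data offset and flag bits */
    uint16_t window;
    uint16_t check;
    uint16_t urg_ptr;
};

/* Toy digest standing in for the real hash update. */
uint32_t demo_digest(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = sum * 31 + p[i];
    return sum;
}

/* Hash the header as if its checksum were zero, without writing to the
 * caller's header: the zero goes into a local copy instead. */
uint32_t demo_hash_header(const struct demo_tcphdr *th)
{
    struct demo_tcphdr copy;

    memcpy(&copy, th, sizeof(copy));
    copy.check = 0;
    return demo_digest(&copy, sizeof(copy));
}
```

Taking a const pointer makes the "no write into the original" property visible in the signature, which is what allows callers to gain const qualifiers as in 318cf7aaa.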

21 Oct, 2011

1 commit

cf533ea53 tcp: add const qualifiers where possible

Adding const qualifiers to pointers can ease code review, and spot some
bugs. It might allow the compiler to optimize code further.

For example, is it legal to temporarily write a null cksum into tcphdr
in tcp_md5_hash_header() ? I am afraid a sniffer could catch the
temporary null value...

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-10-21 17:22:42 +0800

28 Sep, 2011

1 commit

4de075e04 tcp: rename tcp_skb_cb flags

Rename struct tcp_skb_cb "flags" to "tcp_flags" to ease code review and
maintenance.

Its content is a combination of FIN/SYN/RST/PSH/ACK/URG/ECE/CWR flags

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-28 01:25:05 +0800

27 Sep, 2011

2 commits

b82d1bb4f tcp: unalias tcp_skb_cb flags and ip_dsfield

struct tcp_skb_cb contains a "flags" field containing either tcp flags
or IP dsfield depending on context (input or output path)

Introduce ip_dsfield to make the difference clear and ease maintenance.
If later we want to save space, we can union flags/ip_dsfield

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-27 14:20:08 +0800
7a269ffad tcp: ECN blackhole should not force quickack mode

While playing with a new ADSL box at home, I discovered that ECN
blackhole can trigger suboptimal quickack mode on linux : We send one
ACK for each incoming data frame, without any delay and eventual
piggyback.

This is because TCP_ECN_check_ce() assumes that if no ECT is seen on a
segment, the segment was a retransmit.

Refine this heuristic and apply it only if we have seen ECT in a previous
segment, to detect ECN blackhole at IP level.

Signed-off-by: Eric Dumazet
CC: Jamal Hadi Salim
CC: Jerry Chu
CC: Ilpo Järvinen
CC: Jim Gettys
CC: Dave Taht
Acked-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-27 12:58:44 +0800
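
A rough sketch of the refined heuristic from 7a269ffad, assuming a hypothetical `struct demo_conn` with three booleans in place of the kernel's per-socket ECN flags. The ECN codepoint values in the low two bits of the IP DS field follow RFC 3168; everything named `demo_*` or `DEMO_*` is made up for illustration.

```c
#include <stdbool.h>

/* ECN codepoints in the low two bits of the IP DS field (RFC 3168). */
enum {
    DEMO_NOT_ECT  = 0,
    DEMO_ECT_1    = 1,
    DEMO_ECT_0    = 2,
    DEMO_CE       = 3,
    DEMO_ECN_MASK = 3
};

/* Hypothetical per-connection state. */
struct demo_conn {
    bool ecn_seen;    /* an ECT or CE codepoint was seen earlier */
    bool quickack;    /* quickack mode has been entered */
    bool demand_cwr;  /* a CE mark still has to be echoed */
};

void demo_check_ce(struct demo_conn *c, unsigned char dsfield)
{
    switch (dsfield & DEMO_ECN_MASK) {
    case DEMO_NOT_ECT:
        /* Treat a missing ECT as a probable retransmit only if ECT
         * was seen before; otherwise the path may strip ECN bits. */
        if (c->ecn_seen)
            c->quickack = true;
        break;
    case DEMO_CE:
        c->demand_cwr = true;
        /* fall through */
    default:
        c->ecn_seen = true;
    }
}
```

On a path that clears the ECN bits of every packet, `ecn_seen` never becomes true, so a stream of Not-ECT segments no longer forces quickack mode.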

22 Sep, 2011

1 commit

8decf8687 Merge branch 'master' of github.com:davem330/net

Conflicts:
MAINTAINERS
drivers/net/Kconfig
drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
drivers/net/ethernet/broadcom/tg3.c
drivers/net/wireless/iwlwifi/iwl-pci.c
drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
drivers/net/wireless/rt2x00/rt2800usb.c
drivers/net/wireless/wl12xx/main.c

David S. Miller
2011-09-22 15:23:13 +0800

19 Sep, 2011

1 commit

e05c82d36 tcp: fix build error if !CONFIG_SYN_COOKIES

commit 946cedccbd7387 (tcp: Change possible SYN flooding messages)
added a build error if CONFIG_SYN_COOKIES=n

Reported-by: Markus Trippelsdorf
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-19 09:48:01 +0800

17 Sep, 2011

1 commit

765cf9976 tcp: md5: remove one indirection level in tcp_md5sig_pool

tcp_md5sig_pool is currently an 'array' (a percpu object) of pointers to
struct tcp_md5sig_pool. Only the pointers are NUMA aware, but objects
themselves are all allocated on a single node.

Remove this extra indirection to get proper percpu memory (NUMA aware)
and make code simpler.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-17 13:15:46 +0800

16 Sep, 2011

1 commit

946cedccb tcp: Change possible SYN flooding messages

"Possible SYN flooding on port xxxx " messages can fill logs on servers.

Change logic to log the message only once per listener, and add two new
SNMP counters to track :

TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client

TCPReqQFullDrop : number of times a SYN request was dropped because
syncookies were not enabled.

Based on a prior patch from Tom Herbert, and suggestions from David.

Signed-off-by: Eric Dumazet
CC: Tom Herbert
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-16 02:49:43 +0800
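
The "log once per listener, count every event" logic of 946cedccb might look roughly like this in userspace. The structures and the function name are hypothetical, the `warnings` field exists only so the sketch can be checked, and the message text is illustrative rather than the kernel's exact wording. Only the two counter names come from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-listener state; the real kernel structures differ. */
struct demo_listener {
    unsigned short port;
    bool synflood_warned;   /* set once the warning has been printed */
    unsigned int warnings;  /* demo only: number of lines logged */
};

/* The two counters named in the commit message. */
struct demo_snmp {
    unsigned long TCPReqQFullDoCookies;
    unsigned long TCPReqQFullDrop;
};

/* Called when the request queue is full.
 * Returns true if a SYN cookie should be sent. */
bool demo_syn_flood_action(struct demo_listener *l, struct demo_snmp *mib,
                           bool syncookies_enabled)
{
    /* Every event is counted, whether or not it is logged. */
    if (syncookies_enabled)
        mib->TCPReqQFullDoCookies++;
    else
        mib->TCPReqQFullDrop++;

    /* The message itself is printed only once per listener. */
    if (!l->synflood_warned) {
        l->synflood_warned = true;
        l->warnings++;
        fprintf(stderr, "Possible SYN flooding on port %u. %s.\n",
                (unsigned int)l->port,
                syncookies_enabled ? "Sending cookies" : "Dropping request");
    }
    return syncookies_enabled;
}
```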

09 Jun, 2011

1 commit

9ad7c049f tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side

This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.

It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.

The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

Signed-off-by: H.K. Jerry Chu
Signed-off-by: David S. Miller

Jerry Chu
2011-06-09 08:05:30 +0800
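
The two decision rules stated in 9ad7c049f can be written down as small functions. This is a sketch of the rules as worded in the commit message, not of the kernel code: the function and macro names are hypothetical, times are in milliseconds, and seeding the RTO from a handshake RTT sample is left out.

```c
#include <stdbool.h>

#define DEMO_INIT_RTO_MS     1000u  /* new default initRTO: 1 sec */
#define DEMO_FALLBACK_RTO_MS 3000u  /* old default, kept as fallback: 3 secs */

/* Initial RTO for the data phase when no valid RTT sample is available. */
unsigned int demo_initial_rto_ms(bool syn_or_synack_retransmitted,
                                 bool timestamps_on)
{
    /* Fall back only if a handshake packet was retransmitted AND the
     * TCP timestamp option is not on. */
    if (syn_or_synack_retransmitted && !timestamps_on)
        return DEMO_FALLBACK_RTO_MS;
    return DEMO_INIT_RTO_MS;
}

/* cwnd at the start of the data phase: reduced to 1 only after MORE
 * THAN ONE retransmission during the three-way handshake. */
unsigned int demo_initial_cwnd(unsigned int handshake_retransmits,
                               unsigned int default_cwnd)
{
    return handshake_retransmits > 1 ? 1u : default_cwnd;
}
```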

21 Feb, 2011

1 commit

089c34827 tcp: Remove debug macro of TCP_CHECK_TIMER

TCP_CHECK_TIMER is no longer used for debugging; it does nothing.
It has been that way for several years, maybe 6 years.

Remove it to keep the code clearer.

Signed-off-by: Shan Wei
Signed-off-by: David S. Miller

Shan Wei
2011-02-21 03:10:14 +0800

06 Feb, 2011

1 commit

7eb38527c tcp: Add reference to initial CWND ietf draft.

Suggested by Alexander Zimmermann

Signed-off-by: David S. Miller

David S. Miller
2011-02-06 10:13:45 +0800

03 Feb, 2011

1 commit

442b9635c tcp: Increase the initial congestion window to 10.

Signed-off-by: David S. Miller
Acked-by: Nandita Dukkipati

David S. Miller
2011-02-03 12:48:47 +0800

25 Jan, 2011

1 commit

04ed3e741 net: change netdev->features to u32

Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurrences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
struct netdev, as per Eric Dumazet's suggestion -DaveM ]

Signed-off-by: Michał Mirosław
Signed-off-by: David S. Miller

Michał Mirosław
2011-01-25 07:32:47 +0800

21 Dec, 2010

1 commit

356f03982 TCP: increase default initial receive window.

This patch changes the default initial receive window to 10 mss
(defined constant). The default window is limited to the maximum
of 10*1460 and 2*mss (when mss > 1460).

draft-ietf-tcpm-initcwnd-00 is a proposal to the IETF that recommends
increasing TCP's initial congestion window to 10 mss or about 15KB.
Leading up to this proposal were several large-scale live Internet
experiments with an initial congestion window of 10 mss (IW10), where
we showed that the average latency of HTTP responses improved by
approximately 10%. This was accompanied by a slight increase in
retransmission rate (0.5%), most of which is coming from applications
opening multiple simultaneous connections. To understand the extreme
worst case scenarios, and fairness issues (IW10 versus IW3), we further
conducted controlled testbed experiments. We came away finding minimal
negative impact even under low link bandwidths (dial-ups) and small
buffers. These results are extremely encouraging to adopting IW10.

However, an initial congestion window of 10 mss is useless unless a TCP
receiver advertises an initial receive window of at least 10 mss.
Fortunately, in the large-scale Internet experiments we found that most
widely used operating systems advertised large initial receive windows
of 64KB, allowing us to experiment with a wide range of initial
congestion windows. Linux systems were among the few exceptions that
advertised a small receive window of 6KB. The purpose of this patch is
to fix this shortcoming.

References:
1. A comprehensive list of all IW10 references to date.
http://code.google.com/speed/protocols/tcpm-IW10.html

2. Paper describing results from large-scale Internet experiments with IW10.
http://ccr.sigcomm.org/drupal/?q=node/621

3. Controlled testbed experiments under worst case scenarios and a
fairness study.
http://www.ietf.org/proceedings/79/slides/tcpm-0.pdf

4. Raw test data from testbed experiments (Linux senders/receivers)
with initial congestion and receive windows of both 10 mss.
http://research.csc.ncsu.edu/netsrv/?q=content/iw10

5. Internet-Draft. Increasing TCP's Initial Window.
https://datatracker.ietf.org/doc/draft-ietf-tcpm-initcwnd/

Signed-off-by: Nandita Dukkipati
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Nandita Dukkipati
2010-12-21 13:33:00 +0800
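
One possible reading of the window rule in the first paragraph of 356f03982, expressed in bytes. This is an interpretation of the commit text, not the kernel's code: `demo_init_rcv_wnd()` and the `DEMO_*` constants are hypothetical, and the kernel works in segments, so its rounding may differ.

```c
#define DEMO_INIT_RCVWND_SEGS 10u    /* default initial receive window, in mss */
#define DEMO_ETH_MSS          1460u  /* the 1460 used in the commit text */

/* Default initial receive window in bytes for a given mss. */
unsigned int demo_init_rcv_wnd(unsigned int mss)
{
    unsigned int cap, two_mss;

    /* Up to an mss of 1460, the default is simply 10 mss. */
    if (mss <= DEMO_ETH_MSS)
        return DEMO_INIT_RCVWND_SEGS * mss;

    /* For larger mss values, limit the window to the maximum of
     * 10*1460 bytes and 2*mss. */
    cap = DEMO_INIT_RCVWND_SEGS * DEMO_ETH_MSS;
    two_mss = 2u * mss;
    return two_mss > cap ? two_mss : cap;
}
```

Under this reading, an mss of 1460 gives 14600 bytes (about the 15KB mentioned above), and a jumbo-frame mss of 9000 gives 2*9000 = 18000 bytes rather than 90000.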

20 Dec, 2010

1 commit

4c306a929 net: kill unused macros

These macros are never used, so remove them.

Signed-off-by: Shan Wei
Signed-off-by: David S. Miller

Shan Wei
2010-12-20 13:59:35 +0800

17 Dec, 2010

1 commit

bc2ce894e tcp: relax tcp_paws_check()

Some Windows versions have wrong RFC1323 implementations, with SYN and
SYNACK messages containing zero tcp timestamps.

We relaxed in commit fc1ad92dfc4e363 the passive connection case
(Windows connects to a linux machine), but the reverse case (linux
connects to a Windows machine) has an analogous problem when tsvals from
the Windows machine are 'negative' (high order bit set) : PAWS triggers and
we drop incoming messages.

Fix this by making zero ts_recent value special, allowing frame to be
processed.

Based on a report and initial patch from Dmitriy Balakin

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=24842

Reported-by: dmitriy.balakin@nicneiron.ru
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-12-17 06:08:34 +0800
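
A simplified sketch of the relaxed check from bc2ce894e, assuming a hypothetical `demo_paws_check()` that takes the timestamps directly. Any other acceptance conditions the real tcp_paws_check() applies are omitted here, and the signed conversion relies on the usual two's-complement behaviour.

```c
#include <stdint.h>

/* Returns 1 if the segment passes the PAWS test, 0 if it should be
 * rejected.  ts_recent is the last timestamp accepted from the peer,
 * rcv_tsval the timestamp carried by the incoming segment. */
int demo_paws_check(uint32_t ts_recent, uint32_t rcv_tsval, int32_t paws_win)
{
    /* Normal case: the new timestamp is not older than ts_recent,
     * using wrap-around (serial number) comparison. */
    if ((int32_t)(ts_recent - rcv_tsval) <= paws_win)
        return 1;

    /* Relaxed case: a zero ts_recent comes from a SYN or SYNACK with a
     * zero timestamp, so it says nothing about the peer's clock. */
    if (ts_recent == 0)
        return 1;

    return 0;
}
```

With `ts_recent == 0` and a peer tsval such as 0x80000001, the signed difference is large and positive, so the first test alone would reject the segment. The zero special case is what lets it through.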

03 Dec, 2010

1 commit

dca9b2404 net: kill unused macros from head file

These macros have been defined for several years, since v2.6.12-rc2 (traced with git),
but have never been used. So remove them.

Signed-off-by: Shan Wei
Signed-off-by: David S. Miller

Shan Wei
2010-12-03 05:27:33 +0800

02 Dec, 2010

1 commit

ccb7c410d timewait_sock: Create and use getpeer op.

The only thing AF-specific about remembering the timestamp
for a time-wait TCP socket is getting the peer.

Abstract that behind a new timewait_sock_ops vector.

Support for real IPV6 sockets is not filled in yet, but
curiously this makes timewait recycling start to work
for v4-mapped ipv6 sockets.

Signed-off-by: David S. Miller

David S. Miller
2010-12-02 10:09:13 +0800

01 Dec, 2010

1 commit

3f419d2d4 inet: Turn ->remember_stamp into ->get_peer in connection AF ops.

Then we can make a completely generic tcp_remember_stamp()
that uses ->get_peer() as a helper, minimizing the AF specific
code and minimizing the eventual code duplication when we implement
the ipv6 side of TW recycling.

Signed-off-by: David S. Miller

David S. Miller
2010-12-01 04:28:06 +0800

11 Nov, 2010

1 commit

8d987e5c7 net: avoid limits overflow

Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use "long" instead of "int", now that
atomic_long_t primitives are available for free.

Signed-off-by: Eric Dumazet
Reported-by: Robin Holt
Reviewed-by: Robin Holt
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Eric Dumazet
2010-11-11 04:12:00 +0800

30 Sep, 2010

1 commit

1b9f40929 tcp: tcp_enter_quickack_mode can be static

Function only used in tcp_input.c

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

stephen hemminger
2010-09-30 10:45:36 +0800

27 Sep, 2010

1 commit

e40051d13 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/qlcnic/qlcnic_init.c
net/ipv4/ip_output.c

David S. Miller
2010-09-27 16:03:03 +0800

16 Sep, 2010

1 commit

01f83d698 tcp: Prevent overzealous packetization by SWS logic.

If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised
window, the SWS logic will packetize to half the MSS unnecessarily.

This causes problems with some embedded devices.

However for large MSS devices we do want to half-MSS packetize
otherwise we never get enough packets into the pipe for things
like fast retransmit and recovery to work.

Be careful also to handle the case where MSS > window, otherwise
we'll never send until the probe timer.

Reported-by: ツ Leandro Melo de Sales
Signed-off-by: David S. Miller

Alexey Kuznetsov
2010-09-16 03:01:44 +0800
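
A sketch of the packet-size bound that 01f83d698 describes. The function name is hypothetical, and the 512-byte threshold separating "tiny" from normal peer windows is an assumption chosen for illustration; the commit message does not state a value. Any lower bound the kernel applies to the result is also left out.

```c
/* Assumed threshold below which the peer's window counts as "tiny". */
#define DEMO_TINY_WINDOW 512

/* Bound a candidate packet size using the largest window the peer has
 * advertised (max_window). */
int demo_bound_to_half_wnd(int max_window, int pktsize)
{
    int cutoff;

    /* Large windows: packetize to half the window so that enough
     * packets are in flight for fast retransmit and recovery.
     * Tiny windows: do not halve, only stay within the window. */
    if (max_window >= DEMO_TINY_WINDOW)
        cutoff = max_window / 2;
    else
        cutoff = max_window;

    /* Also covers MSS > window: the size is clamped to the cutoff
     * instead of waiting for the probe timer. */
    if (cutoff > 0 && pktsize > cutoff)
        return cutoff;
    return pktsize;
}
```

For example, a peer with a 75-byte MSS and a 100-byte window keeps full 75-byte packets, while a 40000-byte candidate against a 65535-byte window is cut to 32767 bytes.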

10 Sep, 2010

1 commit

e548833df Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
net/mac80211/main.c

David S. Miller
2010-09-10 13:27:33 +0800

31 Aug, 2010

2 commits

3d5b99ae8 TCP: update initial windows according to RFC 5681

This updates the use of larger initial windows, as originally specified in
RFC 3390, to use the newer IW values specified in RFC 5681, section 3.1.

The changes made in RFC 5681 are:
a) the setting now is more clearly specified in units of segments (as the
comments by John Heffner emphasized, this was not very clear in RFC 3390);
b) for connections with 1095 < SMSS
Signed-off-by: David S. Miller

Gerrit Renker
2010-08-31 04:50:44 +0800
22b71c8f4 tcp/dccp: Consolidate common code for RFC 3390 conversion

This patch consolidates initial-window code common to TCP and CCID-2:
* TCP uses RFC 3390 in a packet-oriented manner (tcp_input.c) and
* CCID-2 uses RFC 3390 in packet-oriented manner (RFC 4341).

Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller

Gerrit Renker
2010-08-31 04:45:26 +0800

25 Aug, 2010

1 commit

ad1af0fed tcp: Combat per-cpu skew in orphan tests.

As reported by Anton Blanchard when we use
percpu_counter_read_positive() to make our orphan socket limit checks,
the check can be off by up to num_cpus_online() * batch (which is 32
by default) which on a 128 cpu machine can be as large as the default
orphan limit itself.

Fix this by doing the full expensive sum check if the optimized check
triggers.

Reported-by: Anton Blanchard
Signed-off-by: David S. Miller
Acked-by: Eric Dumazet

David S. Miller
2010-08-25 17:27:49 +0800
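
The two-step check in ad1af0fed can be modelled with a toy per-cpu counter. Everything here is a hypothetical userspace stand-in: a real percpu_counter batches per-cpu deltas into the global value and needs locking, which this sketch ignores.

```c
#include <stdbool.h>

#define DEMO_NCPUS 4

/* Toy counter: a cheap global value plus per-cpu deltas that have not
 * been folded into it yet. */
struct demo_percpu_counter {
    long global;
    long local[DEMO_NCPUS];
};

/* Cheap read: looks at the global value only, so it can be off by up
 * to DEMO_NCPUS times the batch size. */
long demo_read_positive(const struct demo_percpu_counter *c)
{
    return c->global > 0 ? c->global : 0;
}

/* Expensive read: folds in every per-cpu delta. */
long demo_sum_positive(const struct demo_percpu_counter *c)
{
    long sum = c->global;

    for (int i = 0; i < DEMO_NCPUS; i++)
        sum += c->local[i];
    return sum > 0 ? sum : 0;
}

/* Do the full sum only when the cheap check already says "over limit". */
bool demo_too_many_orphans(const struct demo_percpu_counter *c, long limit)
{
    if (demo_read_positive(c) <= limit)
        return false;
    return demo_sum_positive(c) > limit;
}
```

The common case stays cheap, and the expensive sum is paid only near the limit, where a skewed estimate would otherwise cause a false positive.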

16 Jul, 2010

1 commit

f86586fa4 tcp: sizeof struct tcp_skb_cb is 44

Correct the comment stating sizeof(struct tcp_skb_cb) is 36 or 40: it has been
44 bytes since commit 951dbc8ac714b04 ([IPV6]: Move nextheader offset
to the IP6CB).

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-07-16 12:41:00 +0800

13 Jul, 2010

2 commits

7ba429100 inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage()

A new boolean flag no_autobind is added to struct proto to avoid the autobind
calls when the protocol is TCP. Then sock_rps_record_flow() is called in
TCP's sendmsg() and sendpage() paths.

Signed-off-by: Changli Gao
----
include/net/inet_common.h | 4 ++++
include/net/sock.h | 1 +
include/net/tcp.h | 8 ++++----
net/ipv4/af_inet.c | 15 +++++++++------
net/ipv4/tcp.c | 11 +++++------
net/ipv4/tcp_ipv4.c | 3 +++
net/ipv6/af_inet6.c | 8 ++++----
net/ipv6/tcp_ipv6.c | 3 +++
8 files changed, 33 insertions(+), 20 deletions(-)
Signed-off-by: David S. Miller

Changli Gao
2010-07-13 11:21:46 +0800
53d3176b2 net: cleanups

remove useless blanks.

Signed-off-by: Changli Gao
----
include/net/inet_common.h | 55 ++++-------
include/net/tcp.h | 222 +++++++++++++++++-----------------------------
include/net/udp.h | 38 +++----
3 files changed, 123 insertions(+), 192 deletions(-)
Signed-off-by: David S. Miller

Changli Gao
2010-07-13 11:21:45 +0800

27 Jun, 2010

1 commit

172d69e63 syncookies: add support for ECN

Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=n, want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".

Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2010-06-27 13:00:03 +0800

17 Jun, 2010

1 commit

8c7636817 syncookies: check decoded options against sysctl settings

Discard the ACK if we find options that do not match current sysctl
settings.

Previously it was possible to create a connection with sack, wscale,
etc. enabled even if the feature was disabled via sysctl.

Also remove an unneeded call to tcp_sack_reset() in
cookie_check_timestamp: Both call sites (cookie_v4_check,
cookie_v6_check) zero "struct tcp_options_received", hand it to
tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
and then call cookie_check_timestamp().

Even if num_sacks/dsacks were changed, the structure is allocated on
the stack and after cookie_check_timestamp returns only a few selected
members are copied to the inet_request_sock.

Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2010-06-17 05:42:15 +0800

16 Jun, 2010

1 commit

a3433f35a tcp: unify tcp flag macros

Unify tcp flag macros: TCPHDR_FIN, TCPHDR_SYN, TCPHDR_RST, TCPHDR_PSH,
TCPHDR_ACK, TCPHDR_URG, TCPHDR_ECE and TCPHDR_CWR. TCPCB_FLAG_* are replaced
with the corresponding TCPHDR_*.

Signed-off-by: Changli Gao
----
include/net/tcp.h | 24 ++++++-------
net/ipv4/tcp.c | 8 ++--
net/ipv4/tcp_input.c | 2 -
net/ipv4/tcp_output.c | 59 ++++++++++++++++-----------------
net/netfilter/nf_conntrack_proto_tcp.c | 32 ++++++-----------
net/netfilter/xt_TCPMSS.c | 4 --
6 files changed, 58 insertions(+), 71 deletions(-)
Signed-off-by: David S. Miller

Changli Gao
2010-06-16 02:56:19 +0800
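
For reference, the unified macros from a3433f35a map onto the bit positions of the flags byte in the TCP header. The values below follow the standard TCP header layout; `demo_is_synack()` is a hypothetical helper added only to show how such masks are typically combined.

```c
#include <stdint.h>

/* One bit per flag, in the order they appear in the TCP flags byte. */
#define TCPHDR_FIN 0x01
#define TCPHDR_SYN 0x02
#define TCPHDR_RST 0x04
#define TCPHDR_PSH 0x08
#define TCPHDR_ACK 0x10
#define TCPHDR_URG 0x20
#define TCPHDR_ECE 0x40
#define TCPHDR_CWR 0x80

/* Hypothetical helper: true if both SYN and ACK are set. */
int demo_is_synack(uint8_t tcp_flags)
{
    const uint8_t mask = TCPHDR_SYN | TCPHDR_ACK;

    return (tcp_flags & mask) == mask;
}
```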

07 Jun, 2010

1 commit

a8b690f98 tcp: Fix slowness in read /proc/net/tcp

This patch addresses a serious performance issue in reading the
TCP sockets table (/proc/net/tcp).

Reading the full table is done by a number of sequential read
operations. At each read operation, a seek is done to find the
last socket that was previously read. This seek operation requires
that the sockets in the table need to be counted up to the current
file position, and to count each of these requires taking a lock for
each non-empty bucket. The whole algorithm is O(n^2).

The fix is to cache the last bucket value, offset within the bucket,
and the file position returned by the last read operation. On the
next sequential read, the bucket and offset are used to find the
last read socket immediately without needing to scan the previous
buckets of the table. This algorithm to read the whole table is O(n).

The improvement offered by this patch is easily shown by
cat'ing /proc/net/tcp on a machine with a lot of connections. With
about 182K connections in the table, I see the following:

- Without patch
time cat /proc/net/tcp > /dev/null

real 1m56.729s
user 0m0.214s
sys 1m56.344s

- With patch
time cat /proc/net/tcp > /dev/null

real 0m0.894s
user 0m0.290s
sys 0m0.594s

Signed-off-by: Tom Herbert
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Tom Herbert
2010-06-07 15:43:42 +0800
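
The difference between rescanning and resuming, as described in a8b690f98, can be shown with a toy table. This is a userspace sketch under simplifying assumptions: buckets are reduced to entry counts, there is no locking, and all `demo_*` names are hypothetical. The `steps` field only counts how many entries were visited.

```c
#define DEMO_BUCKETS 8

/* Toy hash table: only the number of entries in each bucket matters. */
struct demo_table {
    int count[DEMO_BUCKETS];
};

/* Iterator state cached between reads. Start with offset = -1, pos = -1. */
struct demo_iter {
    int bucket;   /* bucket of the last entry returned */
    int offset;   /* index of that entry within its bucket */
    long pos;     /* file position of that entry */
    long steps;   /* entries visited so far (for comparison only) */
};

/* Old behaviour: count entries from bucket 0 up to position pos. */
int demo_seek_scan(const struct demo_table *t, struct demo_iter *it, long pos)
{
    long seen = 0;

    for (int b = 0; b < DEMO_BUCKETS; b++) {
        for (int i = 0; i < t->count[b]; i++) {
            it->steps++;
            if (seen == pos) {
                it->bucket = b;
                it->offset = i;
                it->pos = pos;
                return 1;
            }
            seen++;
        }
    }
    return 0;
}

/* New behaviour: if pos directly follows the cached position, resume
 * from the cached bucket and offset instead of rescanning. */
int demo_seek_cached(const struct demo_table *t, struct demo_iter *it, long pos)
{
    int b, i;

    if (pos != it->pos + 1)
        return demo_seek_scan(t, it, pos);

    b = it->bucket;
    i = it->offset + 1;
    while (b < DEMO_BUCKETS && i >= t->count[b]) {
        b++;            /* skip exhausted or empty buckets */
        i = 0;
    }
    if (b == DEMO_BUCKETS)
        return 0;       /* end of table */

    it->steps++;
    it->bucket = b;
    it->offset = i;
    it->pos = pos;
    return 1;
}
```

Reading n entries one position at a time visits 1 + 2 + ... + n entries with `demo_seek_scan()`, but only n entries with `demo_seek_cached()`, which is the O(n^2) versus O(n) difference the commit describes.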

17 May, 2010

1 commit

6811d58fc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
include/linux/if_link.h

David S. Miller
2010-05-17 13:26:58 +0800

16 May, 2010

1 commit

35790c042 tcp: fix MD5 (RFC2385) support

TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that the same storage cannot be reclaimed by another
thread on the same cpu.

We also have to make sure a softirq handler won't try to use the same
context. Various bug reports demonstrated corruptions.

The fix is to disable both preemption and BH.

Reported-by: Bhaskar Dutta
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-05-16 15:34:04 +0800