Doug / smarc-fsl-linux-kernel | Embedian Git Server

26 Apr, 2007

4 commits

aa8223c7b [SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th ... Browse Code »

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2007-04-26 13:25:26 +0800
2de979bd7 [TCP]: whitespace cleanup ... Browse Code »

Add whitespace around keywords.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2007-04-26 13:24:13 +0800
9d729f72d [NET]: Convert xtime.tv_sec to get_seconds() ... Browse Code »

Where appropriate, convert references to xtime.tv_sec to the
get_seconds() helper function.

Signed-off-by: James Morris
Signed-off-by: David S. Miller

James Morris
2007-04-26 13:23:32 +0800
54287cc17 [TCP]: Keep copied_seq, rcv_wup and rcv_next together. ... Browse Code »

I noticed in oprofile study a cache miss in tcp_rcv_established() to read
copied_seq.

ffffffff80400a80 : /* tcp_rcv_established total: 4034293
2.0400 */

55493 0.0281 :ffffffff80400bc9: mov 0x4c8(%r12),%eax copied_seq
543103 0.2746 :ffffffff80400bd1: cmp 0x3e0(%r12),%eax rcv_nxt

if (tp->copied_seq == tp->rcv_nxt &&
len - tcp_header_len ucopy.len) {

In this function, the cache line 0x4c0 -> 0x500 is used only for this
reading 'copied_seq' field.

rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of
active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...)

As you suggested, I changed tcp_create_openreq_child() so that these fields
are changed together, to avoid adding a new store buffer stall.

Patch is 64bit friendly (no new hole because of alignment constraints)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2007-04-26 13:23:21 +0800

01 Mar, 2007

1 commit

a9948a7e1 [TCP]: Fix minisock tcp_create_openreq_child() typo. ... Browse Code »

On 2/28/07, KOVACS Krisztian wrote:
>
> Hi,
>
> While reading TCP minisock code I've found this suspiciously looking
> code fragment:
>
> - 8< -
> struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req, struct sk_buff *skb)
> {
> struct sock *newsk = inet_csk_clone(sk, req, GFP_ATOMIC);
>
> if (newsk != NULL) {
> const struct inet_request_sock *ireq = inet_rsk(req);
> struct tcp_request_sock *treq = tcp_rsk(req);
> struct inet_connection_sock *newicsk = inet_csk(sk);
> struct tcp_sock *newtp;
> - 8< -
>
> The above code initializes newicsk to inet_csk(sk), isn't that supposed
> to be inet_csk(newsk)? As far as I can tell this might leave
> icsk_ack.last_seg_size zero even if we do have received data.

Good catch!

David, please apply the attached patch.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2007-03-01 03:05:56 +0800

11 Feb, 2007

1 commit

e905a9eda [NET] IPV4: Fix whitespace errors. ... Browse Code »

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-02-11 15:19:39 +0800

05 Dec, 2006

1 commit

4c1ac1b49 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 ... Browse Code »

Conflicts:

drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c

Fix up merge failures with Linus's head and fix new compilation failures.

Signed-Off-By: David Howells

David Howells
2006-12-05 22:37:56 +0800

03 Dec, 2006

4 commits

c67862403 [TCP] minisocks: Use kmemdup and LIMIT_NETDEBUG ... Browse Code »

Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_minisocks.o.before /tmp/tcp_minisocks.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_minisocks.c:
tcp_check_req | -44
1 function changed, 44 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2006-12-03 13:23:57 +0800
714e85be3 [IPV6]: Assorted trivial endianness annotations. ... Browse Code »

Signed-off-by: Al Viro
Signed-off-by: David S. Miller

Al Viro
2006-12-03 13:22:50 +0800
a928630a2 [TCP]: Fix some warning when MD5 is disabled. ... Browse Code »

Just some mis-placed ifdefs:

net/ipv4/tcp_minisocks.c: In function ‘tcp_twsk_destructor’:
net/ipv4/tcp_minisocks.c:364: warning: unused variable ‘twsk’
net/ipv6/tcp_ipv6.c:1846: warning: ‘tcp_sock_ipv6_specific’ defined but not used
net/ipv6/tcp_ipv6.c:1877: warning: ‘tcp_sock_ipv6_mapped_specific’ defined but not used

Signed-off-by: David S. Miller

David S. Miller
2006-12-03 13:22:43 +0800
cfb6eeb4c [TCP]: MD5 Signature Option (RFC2385) support. ... Browse Code »

Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2006-12-03 13:22:39 +0800

22 Nov, 2006

1 commit

65f27f384 WorkStruct: Pass the work_struct pointer instead of context data ... Browse Code »

Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated.. This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).

Signed-Off-By: David Howells

David Howells
2006-11-22 22:55:48 +0800

23 Sep, 2006

1 commit

ab32ea5d8 [NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly ... Browse Code »

Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly.

Couldn't actually measure any performance increase while testing (.3%
I consider noise), but seems like the right thing to do.

Signed-off-by: Brian Haley
Signed-off-by: David S. Miller

Brian Haley
2006-09-23 05:55:03 +0800

03 Aug, 2006

1 commit

3687b1dc6 [TCP]: SNMPv2 tcpAttemptFails counter error ... Browse Code »

Refer to RFC2012, tcpAttemptFails is defined as following:
tcpAttemptFails OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of times TCP connections have made a direct
transition to the CLOSED state from either the SYN-SENT
state or the SYN-RCVD state, plus the number of times TCP
connections have made a direct transition to the LISTEN
state from the SYN-RCVD state."
::= { tcp 7 }

When I lookup into RFC793, I found that the state change should occured
under following condition:
1. SYN-SENT -> CLOSED
a) Received ACK,RST segment when SYN-SENT state.

2. SYN-RCVD -> CLOSED
b) Received SYN segment when SYN-RCVD state(came from LISTEN).
c) Received RST segment when SYN-RCVD state(came from SYN-SENT).
d) Received SYN segment when SYN-RCVD state(came from SYN-SENT).

3. SYN-RCVD -> LISTEN
e) Received RST segment when SYN-RCVD state(came from LISTEN).

In my test, those direct state transition can not be counted to
tcpAttemptFails.

Signed-off-by: Wei Yongjun
Signed-off-by: David S. Miller

Wei Yongjun
2006-08-03 04:38:19 +0800

04 Jul, 2006

1 commit

e4d919188 [PATCH] lockdep: locking init debugging improvement ... Browse Code »

Locking init improvement:

- introduce and use __SPIN_LOCK_UNLOCKED for array initializations,
to pass in the name string of locks, used by debugging

Signed-off-by: Ingo Molnar
Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-07-04 06:27:02 +0800

01 Jul, 2006

1 commit

6ab3d5624 Remove obsolete #include <linux/config.h> ... Browse Code »

Signed-off-by: Jörn Engel
Signed-off-by: Adrian Bunk

Jörn Engel
2006-07-01 01:25:36 +0800

30 Jun, 2006

1 commit

b0da85370 [NET]: Add ECN support for TSO ... Browse Code »

In the current TSO implementation, NETIF_F_TSO and ECN cannot be
turned on together in a TCP connection. The problem is that most
hardware that supports TSO does not handle CWR correctly if it is set
in the TSO packet. Correct handling requires CWR to be set in the
first packet only if it is set in the TSO header.

This patch adds the ability to turn on NETIF_F_TSO and ECN using
GSO if necessary to handle TSO packets with CWR set. Hardware
that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev->
features flag.

All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If
the output device does not have the NETIF_F_TSO_ECN feature set, GSO
will split the packet up correctly with CWR only set in the first
segment.

With help from Herbert Xu .

Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
flag is completely removed.

Signed-off-by: Michael Chan
Signed-off-by: David S. Miller

Michael Chan
2006-06-30 07:58:08 +0800

04 Jan, 2006

2 commits

0fa1a53e1 [IPV6]: Introduce inet6_timewait_sock ... Browse Code »

Out of tcp6_timewait_sock, that now is just an aggregation of
inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
inet_timewait_sock, that is common to the IPv6 transport protocols that use
timewait sockets, like DCCP and TCP.

tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
code to find the IPv6 area in a timewait sock.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2006-01-04 05:10:47 +0800
8292a17a3 [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops ... Browse Code »

And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2006-01-04 05:10:38 +0800

11 Nov, 2005

2 commits

caa20d9ab [TCP]: spelling fixes ... Browse Code »

Minor spelling fixes for TCP code.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2005-11-11 09:13:47 +0800
9772efb97 [TCP]: Appropriate Byte Count support ... Browse Code »

This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.

The orignal ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2005-11-11 09:09:53 +0800

21 Sep, 2005

1 commit

7957aed72 [TCP]: Set default congestion control correctly for incoming connections. ... Browse Code »

Patch from Joel Sing to fix the default congestion control algorithm
for incoming connections. If a new congestion control handler is added
(via module), it should become the default for new
connections. Instead, the incoming connections use reno. The cause is
incorrect initialisation causes the tcp_init_congestion_control()
function to return after the initial if test fails.

Signed-off-by: Stephen Hemminger
Acked-by: Ian McDonald
Signed-off-by: David S. Miller

Stephen Hemminger
2005-09-21 15:19:46 +0800

30 Aug, 2005

15 commits

6687e988d [ICSK]: Move TCP congestion avoidance members to icsk ... Browse Code »

This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
minimal renaming/moving done in this changeset to ease review.

Most of it is just changes of struct tcp_sock * to struct sock * parameters.

With this we move to a state closer to two interesting goals:

1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
for any INET transport protocol that has struct inet_hashinfo and are
derived from struct inet_connection_sock. Keeps the userspace API, that will
just not display DCCP sockets, while newer versions of tools can support
DCCP.

2. INET generic transport pluggable Congestion Avoidance infrastructure, using
the current TCP CA infrastructure with DCCP.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:56:18 +0800
696ab2d3b [TIMEWAIT]: Move inet_timewait_death_row routines to net/ipv4/inet_timewait_sock.c ... Browse Code »

Also export the ones that will be used in the next changeset, when
DCCP uses this infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:55:58 +0800
295ff7edb [TIMEWAIT]: Introduce inet_timewait_death_row ... Browse Code »

That groups all of the tables and variables associated to the TCP timewait
schedulling/recycling/killing code, that now can be isolated from the TCP
specific code and used by other transport protocols, such as DCCP.

Next changeset will move this code to net/ipv4/inet_timewait_sock.c

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:55:48 +0800
295f7324f [ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer ... Browse Code »

With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:49:29 +0800
9f1d2604c [ICSK]: Introduce inet_csk_clone ... Browse Code »

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:49:20 +0800
463c84b97 [NET]: Introduce inet_connection_sock ... Browse Code »

This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using a
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:43:19 +0800
87d11ceb9 [SOCK]: Introduce sk_clone ... Browse Code »

Out of tcp_create_openreq_child, will be used in
dccp_create_openreq_child, and is a nice sock function anyway.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:42:36 +0800
c676270bc [INET_TWSK]: Introduce inet_twsk_alloc ... Browse Code »

With the parts of tcp_time_wait that are not TCP specific, tcp_time_wait uses
it and so will dccp_time_wait.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:42:26 +0800
e48c414ee [INET]: Generalise the TCP sock ID lookup routines ... Browse Code »

And also some TIME_WAIT functions.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 282955 13122 9312 305389 4a8ed net/ipv4/built-in.o
/tmp/after.size: 281566 13122 9312 304000 4a380 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

I kept them still inlined, will uninline at some point to see what
would be the performance difference.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:42:18 +0800
8feaf0c0a [INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets ... Browse Code »

This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):

[root@qemu ~]# grep tw_sock /proc/slabinfo
tw_sock_TCPv6 0 0 128 31 1
tw_sock_TCP 0 0 96 41 1
[root@qemu ~]#

Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab, for now its only for INET sockets, but we can
introduce timewait_sock later if some non INET transport protocolo
wants to use this stuff.

Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 188646 11764 5068 205478 322a6 net/ipv4/built-in.o
/tmp/after.size: 188144 11764 5068 204976 320b0 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1
(qemu host)).

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:42:13 +0800
6e04e0216 [INET]: Move tcp_port_rover to inet_hashinfo ... Browse Code »

Also expose all of the tcp_hashinfo members, i.e. killing those
tcp_ehash, etc macros, this will more clearly expose already generic
functions and some that need just a bit of work to become generic, as
we'll see in the upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:41:44 +0800
a55ebcc4c [INET]: Move bind_hash from tcp_sk to inet_sk ... Browse Code »

This should really be in a inet_connection_sock, but I'm leaving it
for a later optimization, when some more fields common to INET
transport protocols now in tcp_sk or inet_sk will be chunked out into
inet_connection_sock, for now its better to concentrate on getting the
changes in the core merged to leave the DCCP tree with only DCCP
specific code.

Next changesets will take advantage of this move to generalise things
like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the later
receive a inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in
the future, when tcp_tw_bucket gets transformed into the struct
timewait_sock hierarchy.

tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets
moved to sk_prot.

A cascade of incremental changes will ultimately make the tcp_lookup
functions be fully generic.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:38:48 +0800
0f7ff9274 [INET]: Just rename the TCP hashtable functions/structs to inet_ ... Browse Code »

This is to break down the complexity of the series of patches,
making it very clear that this one just does:

1. renames tcp_ prefixed hashtable functions and data structures that
were already mostly generic to inet_ to share it with DCCP and
other INET transport protocols.

2. Removes not used functions (__tb_head & tb_head)

3. Removes some leftover prototypes in the headers (tcp_bucket_unlock &
tcp_v4_build_header)

Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can
make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port,
__tcp_put_port, generic and get others like tcp_destroy_sock closer to generic
(tcp_orphan_count will go to sk->sk_prot to allow this).

Eventually most of these functions will be used passing the transport protocol
inet_hashinfo structure.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:38:32 +0800
6cbb0df78 [SOCK]: Introduce sk_setup_caps ... Browse Code »

From tcp_v4_setup_caps, that always is preceded by a call to
__sk_dst_set, so coalesce this sequence into sk_setup_caps, removing
one call to a TCP function in the IP layer.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:37:48 +0800
e6848976b [NET]: Cleanup INET_REFCNT_DEBUG code ... Browse Code »

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:37:29 +0800

24 Jun, 2005

1 commit

317a76f9a [TCP]: Add pluggable congestion control algorithm infrastructure. ... Browse Code »

Allow TCP to have multiple pluggable congestion control algorithms.
Algorithms are defined by a set of operations and can be built in
or modules. The legacy "new RENO" algorithm is used as a starting
point and fallback.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2005-06-24 03:19:55 +0800

19 Jun, 2005

2 commits

0e87506fc [NET] Generalise tcp_listen_opt ... Browse Code »

This chunks out the accept_queue and tcp_listen_opt code and moves
them to net/core/request_sock.c and include/net/request_sock.h, to
make it useful for other transport protocols, DCCP being the first one
to use it.

Next patches will rename tcp_listen_opt to accept_sock and remove the
inline tcp functions that just call a reqsk_queue_ function.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:59 +0800
60236fdd0 [NET] Rename open_request to request_sock ... Browse Code »

Ok, this one just renames some stuff to have a better namespace and to
dissassociate it from TCP:

struct open_request -> struct request_sock
tcp_openreq_alloc -> reqsk_alloc
tcp_openreq_free -> reqsk_free
tcp_openreq_fastfree -> __reqsk_free

With this most of the infrastructure closely resembles a struct
sock methods subset.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:21 +0800