Eric Lee / smarc-fsl-linux-kernel

16 Sep, 2011

1 commit

946cedccb tcp: Change possible SYN flooding messages ... Browse Code »

"Possible SYN flooding on port xxxx " messages can fill logs on servers.

Change logic to log the message only once per listener, and add two new
SNMP counters to track :

TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client

TCPReqQFullDrop : number of times a SYN request was dropped because
syncookies were not enabled.

Based on a prior patch from Tom Herbert, and suggestions from David.

Signed-off-by: Eric Dumazet
CC: Tom Herbert
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-16 02:49:43 +0800

18 Jan, 2010

1 commit

72659ecce tcp: account SYN-ACK timeouts & retransmissions ... Browse Code »

Currently we don't increment SYN-ACK timeouts & retransmissions
although we do increment the same stats for SYN. We seem to have lost
the SYN-ACK accounting with the introduction of tcp_syn_recv_timer
(commit 2248761e in the netdev-vger-cvs tree).

This patch fixes this issue. In the process we also rename the v4/v6
syn/ack retransmit functions for clarity. We also add a new
request_socket operations (syn_ack_timeout) so we can keep code in
inet_connection_sock.c protocol agnostic.

Signed-off-by: Octavian Purdila
Signed-off-by: David S. Miller

Octavian Purdila
2010-01-18 11:09:39 +0800

03 Dec, 2009

1 commit

e6b4d1136 TCPCT part 1a: add request_values parameter for sending SYNACK ... Browse Code »

Add optional function parameters associated with sending SYNACK.
These parameters are not needed after sending SYNACK, and are not
used for retransmission. Avoids extending struct tcp_request_sock,
and avoids allocating kernel memory.

Also affects DCCP as it uses common struct request_sock_ops,
but this parameter is currently reserved for future use.

Signed-off-by: William.Allen.Simpson@gmail.com
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

William Allen Simpson
2009-12-03 14:07:23 +0800

22 Nov, 2008

1 commit

7e56b5d69 net: Fix memory leak in the proto_register function ... Browse Code »

If the slub allocator is used, kmem_cache_create() may merge two or more
kmem_cache's into one but the cache name pointer is not updated and
kmem_cache_name() is no longer guaranteed to return the pointer passed
to the former function. This patch stores the kmalloc'ed pointers in the
corresponding request_sock_ops and timewait_sock_ops structures.

Signed-off-by: Catalin Marinas
Acked-by: Arnaldo Carvalho de Melo
Reviewed-by: Christoph Lameter
Signed-off-by: David S. Miller

Catalin Marinas
2008-11-22 08:45:22 +0800

07 Aug, 2008

1 commit

6edafaaf6 tcp: Fix kernel panic when calling tcp_v(4/6)_md5_do_lookup ... Browse Code »

If the following packet flow happen, kernel will panic.
MathineA MathineB
SYN
---------------------->
SYN+ACK

When a bad seq ACK is received, tcp_v4_md5_do_lookup(skb->sk, ip_hdr(skb)->daddr))
is finally called by tcp_v4_reqsk_send_ack(), but the first parameter(skb->sk) is
NULL at that moment, so kernel panic happens.
This patch fixes this bug.

OOPS output is as following:
[ 302.812793] IP: [] tcp_v4_md5_do_lookup+0x12/0x42
[ 302.817075] Oops: 0000 [#1] SMP
[ 302.819815] Modules linked in: ipv6 loop dm_multipath rtc_cmos rtc_core rtc_lib pcspkr pcnet32 mii i2c_piix4 parport_pc i2c_core parport ac button ata_piix libata dm_mod mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
[ 302.849946]
[ 302.851198] Pid: 0, comm: swapper Not tainted (2.6.27-rc1-guijf #5)
[ 302.855184] EIP: 0060:[] EFLAGS: 00010296 CPU: 0
[ 302.858296] EIP is at tcp_v4_md5_do_lookup+0x12/0x42
[ 302.861027] EAX: 0000001e EBX: 00000000 ECX: 00000046 EDX: 00000046
[ 302.864867] ESI: ceb69e00 EDI: 1467a8c0 EBP: cf75f180 ESP: c0792e54
[ 302.868333] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 302.871287] Process swapper (pid: 0, ti=c0792000 task=c0712340 task.ti=c0746000)
[ 302.875592] Stack: c06f413a 00000000 cf75f180 ceb69e00 00000000 c05d0d86 000016d0 ceac5400
[ 302.883275] c05d28f8 000016d0 ceb69e00 ceb69e20 681bf6e3 00001000 00000000 0a67a8c0
[ 302.890971] ceac5400 c04250a3 c06f413a c0792eb0 c0792edc cf59a620 cf59a620 cf59a634
[ 302.900140] Call Trace:
[ 302.902392] [] tcp_v4_reqsk_send_ack+0x17/0x35
[ 302.907060] [] tcp_check_req+0x156/0x372
[ 302.910082] [] printk+0x14/0x18
[ 302.912868] [] tcp_v4_do_rcv+0x1d3/0x2bf
[ 302.917423] [] tcp_v4_rcv+0x563/0x5b9
[ 302.920453] [] ip_local_deliver_finish+0xe8/0x183
[ 302.923865] [] ip_rcv_finish+0x286/0x2a3
[ 302.928569] [] dev_alloc_skb+0x11/0x25
[ 302.931563] [] netif_receive_skb+0x2d6/0x33a
[ 302.934914] [] pcnet32_poll+0x333/0x680 [pcnet32]
[ 302.938735] [] net_rx_action+0x5c/0xfe
[ 302.941792] [] __do_softirq+0x5d/0xc1
[ 302.944788] [] __do_softirq+0x0/0xc1
[ 302.948999] [] do_softirq+0x55/0x88
[ 302.951870] [] handle_fasteoi_irq+0x0/0xa4
[ 302.954986] [] irq_exit+0x35/0x69
[ 302.959081] [] do_IRQ+0x99/0xae
[ 302.961896] [] common_interrupt+0x23/0x28
[ 302.966279] [] default_idle+0x2a/0x3d
[ 302.969212] [] cpu_idle+0xb2/0xd2
[ 302.972169] =======================
[ 302.974274] Code: fc ff 84 d2 0f 84 df fd ff ff e9 34 fe ff ff 83 c4 0c 5b 5e 5f 5d c3 90 90 57 89 d7 56 53 89 c3 50 68 3a 41 6f c0 e8 e9 55 e5 ff 93 9c 04 00 00 58 85 d2 59 74 1e 8b 72 10 31 db 31 c9 85 f6
[ 303.011610] EIP: [] tcp_v4_md5_do_lookup+0x12/0x42 SS:ESP 0068:c0792e54
[ 303.018360] Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Gui Jianfeng
Signed-off-by: David S. Miller

Gui Jianfeng
2008-08-07 14:50:04 +0800

26 Jul, 2008

1 commit

547b792ca net: convert BUG_TRAP to generic WARN_ON ... Browse Code »

Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.

I could make at least one BUILD_BUG_ON conversion.

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2008-07-26 12:43:18 +0800

13 Jun, 2008

1 commit

ec0a19662 tcp: Revert 'process defer accept as established' changes. ... Browse Code »

This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems. The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock. The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> goto death;
> }
>
>+ if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+ tcp_send_active_reset(sk, GFP_ATOMIC);
>+ goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller

David S. Miller
2008-06-13 07:34:35 +0800

10 Apr, 2008

1 commit

4dfc28170 [Syncookies]: Add support for TCP options via timestamps. ... Browse Code »

Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.

Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.

The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.

changes since v1:
don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
and have ipv6/syncookies.c use it.
Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()

Reviewed-by: Hagen Paul Pfeifer
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller

Florian Westphal
2008-04-10 18:12:40 +0800

22 Mar, 2008

1 commit

ec3c0982a [TCP]: TCP_DEFER_ACCEPT updates - process as established ... Browse Code »

Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
- established connection is now reset if it never makes it
to the accept queue

- diagnostic state of established matches with the packet traces
showing completed handshake

- TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
enforced with reasonable accuracy instead of rounding up to next
exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus
Signed-off-by: David S. Miller

Patrick McManus
2008-03-22 07:33:01 +0800

01 Mar, 2008

1 commit

fd80eb942 [INET]: Remove struct dst_entry *dst from request_sock_ops.rtx_syn_ack. ... Browse Code »

It looks like dst parameter is used in this API due to historical
reasons. Actually, it is really used in the direct call to
tcp_v4_send_synack only. So, create a wrapper for tcp_v4_send_synack
and remove dst from rtx_syn_ack.

Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller

Denis V. Lunev
2008-03-01 03:43:03 +0800

15 Nov, 2007

1 commit

dab6ba368 [INET]: Fix potential kfree on vmalloc-ed area of request_sock_queue ... Browse Code »

The request_sock_queue's listen_opt is either vmalloc-ed or
kmalloc-ed depending on the number of table entries. Thus it
is expected to be handled properly on free, which is done in
the reqsk_queue_destroy().

However the error path in inet_csk_listen_start() calls
the lite version of reqsk_queue_destroy, called
__reqsk_queue_destroy, which calls the kfree unconditionally.

Fix this and move the __reqsk_queue_destroy into a .c file as
it looks too big to be inline.

As David also noticed, this is an error recovery path only,
so no locking is required and the lopt is known to be not NULL.

reqsk_queue_yank_listen_sk is also now only used in
net/core/request_sock.c so we should move it there too.

Signed-off-by: Pavel Emelyanov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-15 18:57:06 +0800

08 Dec, 2006

2 commits

e18b890bb [PATCH] slab: remove kmem_cache_t ... Browse Code »

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-12-08 00:39:25 +0800
54e6ecb23 [PATCH] slab: remove SLAB_ATOMIC ... Browse Code »

SLAB_ATOMIC is an alias of GFP_ATOMIC

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-12-08 00:39:24 +0800

03 Dec, 2006

3 commits

cfb6eeb4c [TCP]: MD5 Signature Option (RFC2385) support. ... Browse Code »

Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2006-12-03 13:22:39 +0800
72a3effaf [NET]: Size listen hash tables using backlog hint ... Browse Code »

We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2006-12-03 13:21:44 +0800
6b877699c SELinux: Return correct context for SO_PEERSEC ... Browse Code »

Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.

Signed-off-by: Venkat Yekkirala
Signed-off-by: James Morris

Venkat Yekkirala
2006-12-03 13:21:33 +0800

23 Sep, 2006

1 commit

4237c75c0 [MLSXFRM]: Auto-labeling of child sockets ... Browse Code »

This automatically labels the TCP, Unix stream, and dccp child sockets
as well as openreqs to be at the same MLS level as the peer. This will
result in the selection of appropriately labeled IPSec Security
Associations.

This also uses the sock's sid (as opposed to the isec sid) in SELinux
enforcement of secmark in rcv_skb and postroute_last hooks.

Signed-off-by: Venkat Yekkirala
Signed-off-by: David S. Miller

Venkat Yekkirala
2006-09-23 05:53:29 +0800

27 Mar, 2006

1 commit

3eb4801d7 [NET]: drop duplicate assignment in request_sock ... Browse Code »

Just noticed that request_sock.[ch] contain a useless assignment of
rskq_accept_head to itself. I assume this is a typo and the 2nd one
was supposed to be _tail. However, setting _tail to NULL is not
needed, so the patch below just drops the 2nd assignment.

Signed-off-By: Norbert Kiesel
Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller

Norbert Kiesel
2006-03-27 09:39:55 +0800

04 Jan, 2006

1 commit

8129765ac [IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add ... Browse Code »

More work is needed tho to introduce inet6_request_sock from
tcp6_request_sock, in the same layout considerations as ipv6_pinfo in
inet_sock, next changeset will do that.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2006-01-04 05:10:36 +0800

30 Aug, 2005

4 commits

a019d6fe2 [ICSK]: Move generalised functions from tcp to inet_connection_sock ... Browse Code »

This also improves reqsk_queue_prune and renames it to
inet_csk_reqsk_queue_prune, as it deals with both inet_connection_sock
and inet_request_sock objects, not just with request_sock ones thus
belonging to inet_request_sock.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:49:50 +0800
295f7324f [ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer ... Browse Code »

With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:49:29 +0800
463c84b97 [NET]: Introduce inet_connection_sock ... Browse Code »

This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using a
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:43:19 +0800
83e3609eb [REQSK]: Move the syn_table destroy from tcp_listen_stop to reqsk_queue_destroy ... Browse Code »

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-08-30 06:32:11 +0800

19 Jun, 2005

4 commits

2ad69c55a [NET] rename struct tcp_listen_opt to struct listen_sock ... Browse Code »

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:48:55 +0800
0e87506fc [NET] Generalise tcp_listen_opt ... Browse Code »

This chunks out the accept_queue and tcp_listen_opt code and moves
them to net/core/request_sock.c and include/net/request_sock.h, to
make it useful for other transport protocols, DCCP being the first one
to use it.

Next patches will rename tcp_listen_opt to accept_sock and remove the
inline tcp functions that just call a reqsk_queue_ function.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:59 +0800
60236fdd0 [NET] Rename open_request to request_sock ... Browse Code »

Ok, this one just renames some stuff to have a better namespace and to
dissassociate it from TCP:

struct open_request -> struct request_sock
tcp_openreq_alloc -> reqsk_alloc
tcp_openreq_free -> reqsk_free
tcp_openreq_fastfree -> __reqsk_free

With this most of the infrastructure closely resembles a struct
sock methods subset.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:47:21 +0800
2e6599cb8 [NET] Generalise TCP's struct open_request minisock infrastructure ... Browse Code »

Kept this first changeset minimal, without changing existing names to
ease peer review.

Basicaly tcp_openreq_alloc now receives the or_calltable, that in turn
has two new members:

->slab, that replaces tcp_openreq_cachep
->obj_size, to inform the size of the openreq descendant for
a specific protocol

The protocol specific fields in struct open_request were moved to a
class hierarchy, with the things that are common to all connection
oriented PF_INET protocols in struct inet_request_sock, the TCP ones
in tcp_request_sock, that is an inet_request_sock, that is an
open_request.

I.e. this uses the same approach used for the struct sock class
hierarchy, with sk_prot indicating if the protocol wants to use the
open_request infrastructure by filling in sk_prot->rsk_prot with an
or_calltable.

Results? Performance is improved and TCP v4 now uses only 64 bytes per
open request minisock, down from 96 without this patch :-)

Next changeset will rename some of the structs, fields and functions
mentioned above, struct or_calltable is way unclear, better name it
struct request_sock_ops, s/struct open_request/struct request_sock/g,
etc.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Arnaldo Carvalho de Melo
2005-06-19 13:46:52 +0800