Eric Lee / smarc-fsl-linux-kernel

22 Nov, 2011

1 commit

70e9942f1 netfilter: nf_conntrack: make event callback registration per-netns

This patch fixes an oops that can be triggered following this recipe:

0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded.
1) container is started.
2) connect to it via lxc-console.
3) generate some traffic with the container to create some conntrack
entries in its table.
4) stop the container: you hit an oops because the conntrack table
cleanup tries to report the destroy event to user-space, but the
per-netns nfnetlink socket is already gone (the nfnetlink socket
is per-netns, but event callback registration is global).

To fix this situation, we make the ctnl_notifier per-netns so the
callback is registered when the container is created and
unregistered when it is destroyed.
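The idea can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct netns, struct ct_notifier, the ct_* functions and count_events are hypothetical stand-ins, not the kernel's actual types or API.

```c
#include <stddef.h>

/* Hypothetical stand-in for the notifier: one callback pointer. */
struct ct_notifier {
    int (*fcn)(unsigned int events, void *item);
};

/* Hypothetical stand-in for a network namespace. The callback lives
 * here, per namespace, instead of in one global variable. */
struct netns {
    struct ct_notifier *ct_notifier;
};

/* Register a callback for one namespace; refuse if one is already set. */
static int ct_register_notifier(struct netns *net, struct ct_notifier *nb)
{
    if (net->ct_notifier != NULL)
        return -1;
    net->ct_notifier = nb;
    return 0;
}

/* Unregister when the namespace goes away. */
static void ct_unregister_notifier(struct netns *net)
{
    net->ct_notifier = NULL;
}

/* Deliver only if this namespace still has a callback registered. */
static int ct_deliver(struct netns *net, unsigned int events, void *item)
{
    struct ct_notifier *nb = net->ct_notifier;

    if (nb == NULL)
        return 0;
    return nb->fcn(events, item);
}

/* Example callback: counts deliveries through the item pointer. */
static int count_events(unsigned int events, void *item)
{
    (void)events;
    ++*(int *)item;
    return 1;
}
```

With this layout, tearing down one namespace clears only its own callback, so a late destroy event in that namespace is simply not delivered rather than reaching a socket that no longer exists.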

Alex Bligh and Alexey Dobriyan originally proposed a small patch to
check whether the nfnetlink socket is gone in nfnetlink_has_listeners,
but this is a frequently visited path for events, so the check may
reduce performance, and it looks a bit hackish to test for the
nfnetlink socket only to work around this situation. As a result, I
decided to take the larger approach, which looks nicer to me.

Cc: Alexey Dobriyan
Reported-by: Alex Bligh
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2011-11-22 07:34:47 +0800

05 Feb, 2011

1 commit

bd4a6974c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

David S. Miller
2011-02-05 06:28:58 +0800

01 Feb, 2011

1 commit

3db7e93d3 netfilter: ecache: always set events bits, filter them later

For the following rule:

iptables -I PREROUTING -t raw -j CT --ctevents assured

The event delivered looks like the following:

[UPDATE] tcp 6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]

Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.

To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.

Thus, the event delivered looks like the following:

[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
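A minimal sketch of "set always, filter late", in userspace C. The names (struct ecache, event_cache, events_to_deliver, the EV_* values) are hypothetical, and the sketch assumes that the whole cached set is reported as soon as any cached bit matches the mask.

```c
/* Hypothetical event numbers, for illustration only. */
enum ct_event { EV_NEW, EV_PROTOINFO, EV_ASSURED };

struct ecache {
    unsigned int cache;   /* every event seen since the last delivery */
    unsigned int ctmask;  /* events the user asked to be notified about */
};

/* Always record the event, regardless of the mask. */
static void event_cache(struct ecache *e, enum ct_event ev)
{
    e->cache |= 1u << ev;
}

/* Filter just before delivery: if none of the cached events matches
 * the mask, report nothing; otherwise report the full cached set. */
static unsigned int events_to_deliver(struct ecache *e)
{
    unsigned int events = e->cache;

    e->cache = 0;
    if ((events & e->ctmask) == 0)
        return 0;
    return events;
}
```

Under that assumption, a mask containing only the assured event still lets the protocol-state bit travel with the message, which matches the second event shown above.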

Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Patrick McHardy

Pablo Neira Ayuso
2011-02-01 23:06:30 +0800

16 Nov, 2010

1 commit

0e60ebe04 netfilter: add __rcu annotations

Add some __rcu annotations and use helpers to reduce number of sparse
warnings (CONFIG_SPARSE_RCU_POINTER=y)

Signed-off-by: Eric Dumazet
Signed-off-by: Patrick McHardy

Eric Dumazet
2010-11-16 01:17:21 +0800

15 Nov, 2010

1 commit

e0e76c83b netfilter: ct_extend: define NF_CT_EXT_* as needed

Fewer IDs make nf_ct_ext smaller.

Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy

Changli Gao
2010-11-15 19:23:24 +0800

03 Feb, 2010

2 commits

0cebe4b41 netfilter: ctnetlink: support selective event delivery

Add two masks for conntrack and expectation events to struct nf_conntrack_ecache
and use them to filter events. Their default value is "all events" when the
event sysctl is on and "no events" when it is off. A following patch will add
specific initializations. Expectation events depend on the ecache struct of
their master conntrack.
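As a rough userspace illustration of the two masks (all types and function names here are hypothetical simplifications, and the 16-bit mask width is an assumption):

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ecache {
    unsigned short ctmask;   /* conntrack events to deliver */
    unsigned short expmask;  /* expectation events to deliver */
};

struct conntrack {
    struct ecache *ecache;
};

struct expectation {
    struct conntrack *master;
};

/* Default: all events when the sysctl is on, none when it is off. */
static void ecache_set_defaults(struct ecache *e, int events_sysctl_on)
{
    e->ctmask  = events_sysctl_on ? 0xffff : 0;
    e->expmask = events_sysctl_on ? 0xffff : 0;
}

/* A conntrack event is wanted if its bit is set in ctmask. */
static int ct_event_wanted(const struct conntrack *ct, unsigned int ev)
{
    return ct->ecache != NULL && (ct->ecache->ctmask & (1u << ev)) != 0;
}

/* An expectation has no ecache of its own: consult the master's. */
static int exp_event_wanted(const struct expectation *exp, unsigned int ev)
{
    const struct ecache *e = exp->master->ecache;

    return e != NULL && (e->expmask & (1u << ev)) != 0;
}
```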

Signed-off-by: Patrick McHardy

Patrick McHardy
2010-02-03 20:51:51 +0800

858b31330 netfilter: nf_conntrack: split up IPCT_STATUS event

Split up the IPCT_STATUS event into an IPCT_REPLY event, which is generated
when the IPS_SEEN_REPLY bit is set, and an IPCT_ASSURED event, which is
generated when the IPS_ASSURED bit is set.

In combination with a following patch to support selective event delivery,
this can be used for "sparse" conntrack replication: start replicating the
conntrack entry only after it has reached the ASSURED state, which makes
replication SYN-flood resistant.
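The split can be pictured with a small userspace sketch. The bit values, event numbers, struct conntrack layout and function names below are all hypothetical; only the behaviour described above is modelled: each status bit raises its own event the first time it is set.

```c
/* Hypothetical status bits and event numbers, for illustration only. */
#define ST_SEEN_REPLY (1u << 1)
#define ST_ASSURED    (1u << 2)

enum { EV_REPLY, EV_ASSURED };

struct conntrack {
    unsigned int status;  /* status bits */
    unsigned int events;  /* pending events */
};

/* Set a status bit; raise the matching event only on the first set. */
static void ct_set_status(struct conntrack *ct, unsigned int bit)
{
    if (ct->status & bit)
        return;           /* already set: no new event */
    ct->status |= bit;
    if (bit == ST_SEEN_REPLY)
        ct->events |= 1u << EV_REPLY;
    else if (bit == ST_ASSURED)
        ct->events |= 1u << EV_ASSURED;
}

/* "Sparse" replication: act only once the assured event has fired. */
static int replication_should_start(const struct conntrack *ct)
{
    return (ct->events & (1u << EV_ASSURED)) != 0;
}
```

A flood of SYN packets never sets the assured bit, so in this model it never triggers replication.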

Signed-off-by: Patrick McHardy

Patrick McHardy
2010-02-03 20:48:53 +0800

04 Nov, 2009

1 commit

fd2c3ef76 net: cleanup include/net

This cleanup patch puts struct/union/enum opening braces
on the first line to ease grep games.

struct something
{

becomes:

struct something {

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-11-04 21:06:25 +0800

13 Jun, 2009

2 commits

dd7669a92 netfilter: conntrack: optional reliable conntrack event delivery

This patch improves ctnetlink event reliability if one broadcast
listener has set the NETLINK_BROADCAST_ERROR socket option.

The logic is the following: if an event delivery fails, we keep
the undelivered events in the missed event cache. Once the next
packet arrives, we add the new events (if any) to the missed
events in the cache and we try a new delivery, and so on. Thus,
if ctnetlink fails to deliver an event, we try to deliver it again
once we see a new packet. Therefore, we may lose state
transitions but the userspace process gets in sync at some point.
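The retry logic above can be sketched in userspace C as follows. struct ecache, deliver_cached, sample_deliver and the sketch_* variables are hypothetical names; the real delivery path is replaced by a function that fails while a flag is clear.

```c
struct ecache {
    unsigned int cache;   /* events generated by the current packet */
    unsigned int missed;  /* events whose delivery failed earlier */
};

/* Test hooks standing in for the netlink path (hypothetical). */
static int sketch_link_up;
static unsigned int sketch_last_delivered;

static int sample_deliver(unsigned int events)
{
    if (!sketch_link_up)
        return -1;        /* e.g. the listener's buffer is full */
    sketch_last_delivered = events;
    return 0;
}

/* Try to deliver new plus previously missed events.
 * Returns 1 on delivery, 0 if there was nothing to do, -1 on failure. */
static int deliver_cached(struct ecache *e)
{
    unsigned int pending = e->cache | e->missed;

    e->cache = 0;
    if (pending == 0)
        return 0;
    if (sample_deliver(pending) < 0) {
        e->missed = pending;   /* keep for the next packet */
        return -1;
    }
    e->missed = 0;
    return 1;
}
```

Because missed events are merged into a single bitmask, intermediate state transitions can collapse into one later message, which is the loss the message above accepts.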

In the worst case, if no events were delivered to userspace, we make
sure that destroy events are successfully delivered. Basically,
if ctnetlink fails to deliver the destroy event, we remove the
conntrack entry from the hashes and insert it in the dying
list, which contains inactive entries. Then, the conntrack timer
is added with an extra grace timeout of random32() % 15 seconds
to trigger the event again (this grace timeout is tunable via
/proc). The use of a limited random timeout value allows
distributing the "destroy" resends, thus avoiding the accumulation
of lots of "destroy" events at the same time. Events may be
delivered out of order, but we can identify them by means of the
tuple plus the conntrack ID.
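The grace timeout computation is simple enough to show directly. destroy_retry_delay is a hypothetical helper, and the random draw is passed in as a parameter so that the sketch stays deterministic:

```c
/* Pick a resend delay in [0, max_secs) from a random draw, so that
 * many failed "destroy" events do not all retry at the same moment.
 * A max_secs of 0 is treated as "no delay" to avoid dividing by zero. */
static unsigned int destroy_retry_delay(unsigned int rnd, unsigned int max_secs)
{
    return max_secs ? rnd % max_secs : 0;
}
```

With the value from the text, a call such as destroy_retry_delay(r, 15) yields a delay between 0 and 14 seconds for any random r.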

The maximum number of conntrack entries (active or inactive) is
still handled by nf_conntrack_max. Thus, we may start dropping
packets at some point if we accumulate a lot of inactive conntrack
entries that did not successfully report the destroy event to
userspace.

During my stress tests, which consisted of setting a very small buffer
of 2048 bytes for conntrackd together with the NETLINK_BROADCAST_ERROR
socket flag and generating lots of very small connections, I noticed
very few destroy entries on the fly waiting to be resent.

A simple way to test this patch consists of creating a lot of
entries, setting a very small Netlink buffer in conntrackd (plus a
patch which is not in the git tree to set the BROADCAST_ERROR flag)
and invoking `conntrack -F'.

For expectations, no changes are introduced in this patch.
Currently, event delivery is only done for new expectations (no
events from expectation expiration, removal and confirmation).
In that case, they need a per-expectation event cache to implement
the same idea that is exposed in this patch.

This patch can be useful to provide reliable flow-accounting. We
still have to add a new conntrack extension to store the creation
and destroy time.

Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Patrick McHardy

Pablo Neira Ayuso
2009-06-13 18:30:52 +0800

a0891aa6a netfilter: conntrack: move event caching to conntrack extension infrastructure

This patch reworks the per-cpu event caching to use the conntrack
extension infrastructure.

The main drawback is that we consume more memory per conntrack
if event delivery is enabled. This patch is required by the
reliable event delivery patch that follows.

BTW, this patch allows you to enable/disable event delivery via
/proc/sys/net/netfilter/nf_conntrack_events at runtime, although
you can still disable event caching as a compile-time option.

Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Patrick McHardy

Pablo Neira Ayuso
2009-06-13 18:26:29 +0800

03 Jun, 2009

3 commits

e34d5c1a4 netfilter: conntrack: replace notify chain by function pointer

This patch removes the notify chain infrastructure and replaces it
with a simple function pointer. This issue has been mentioned on the
mailing list several times: the use of the notify chain adds
too much overhead for something that is only used by ctnetlink.

This patch also changes nfnetlink_send(). It seems that gfp_any()
returns GFP_KERNEL for user-context requests, like those via
ctnetlink, inside the RCU read-side section, which is not valid.
Using GFP_KERNEL is also evil since netlink may schedule();
this leads to "scheduling while atomic" bug reports.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2009-06-03 16:32:06 +0800

17e6e4eac netfilter: conntrack: simplify event caching system

This patch simplifies the conntrack event caching system by removing
several events:

* IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO, which have been
deleted since they have no clients.
* IPCT_COUNTER_FILLING, which is a leftover of the 32-bit counter
days.
* IPCT_REFRESH, which is not of any use since we always include the
timeout in the messages.

After this patch, the existing events are:

* IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
addition and deletion of entries.
* IPCT_STATUS, that notes that the status bits have changed,
e.g. IPS_SEEN_REPLY and IPS_ASSURED.
* IPCT_PROTOINFO, that reports that internal protocol information has
changed, e.g. the TCP, DCCP and SCTP protocol state.
* IPCT_HELPER, that reports that a helper has been assigned to or
unassigned from this entry.
* IPCT_MARK and IPCT_SECMARK, that report that the mark has changed; this
covers the case when a mark is set to zero.
* IPCT_NATSEQADJ, that reports that there are updates in the NAT
sequence adjustment.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2009-06-03 02:08:46 +0800

6bfea1984 netfilter: conntrack: remove events flags from userspace exposed file

This patch moves the event flags from linux/netfilter/nf_conntrack_common.h
to net/netfilter/nf_conntrack_ecache.h. These flags are not of any use
from userspace.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2009-06-03 02:08:44 +0800

18 Nov, 2008

1 commit

19abb7b09 netfilter: ctnetlink: deliver events for conntracks changed from userspace

As of now, the creation and update of conntracks via ctnetlink do not
propagate an event to userspace. This can result in inconsistent situations
if several userspace processes modify the connection tracking table by means
of ctnetlink at the same time. Specifically, using the conntrack command
line tool and conntrackd at the same time can trigger inconsistencies.

This patch also modifies the event cache infrastructure to pass the
process PID and the ECHO flag to nfnetlink_send(), so that the event is
reported back to the process that triggered the change if it asked for it.
Based on a suggestion from Patrick McHardy.

Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Patrick McHardy

Pablo Neira Ayuso
2008-11-18 18:56:20 +0800

12 Oct, 2008

1 commit

64f1b6538 net: fix dummy 'nf_conntrack_event_cache()'

The dummy version of 'nf_conntrack_event_cache()' (used when the
NF_CONNTRACK_EVENTS config option is not enabled) had not been updated
when the calling convention changed.

This was introduced by commit a71996fccce4b2086a26036aa3c915365ca36926
("netfilter: netns nf_conntrack: pass conntrack to
nf_conntrack_event_cache() not skb")
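The class of bug is easy to reproduce in a reduced form. In the sketch below, SKETCH_CT_EVENTS is a hypothetical stand-in for the config option, and struct conntrack and ct_event_cache are simplified, made-up names; the point is only that the stub must carry the same signature as the real function, or callers stop compiling when the option is off.

```c
struct conntrack {
    unsigned int cached_events;
};

#ifdef SKETCH_CT_EVENTS
/* Real version: record the event on the conntrack. */
static inline void ct_event_cache(unsigned int event, struct conntrack *ct)
{
    ct->cached_events |= 1u << event;
}
#else
/* Dummy version: same parameters as the real one, does nothing.
 * If the real signature changes, this one must change with it. */
static inline void ct_event_cache(unsigned int event, struct conntrack *ct)
{
    (void)event;
    (void)ct;
}
#endif
```

Building the callers once with and once without the option defined is a cheap way to catch a stub that has drifted.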

Tssk.

Cc: Alexey Dobriyan
Cc: Patrick McHardy
Cc: David Miller
Signed-off-by: Linus Torvalds

Linus Torvalds
2008-10-12 00:46:24 +0800

10 Oct, 2008

1 commit

bb21c95e2 nf_conntrack_ecache.h: Fix missing braces

This patch adds the braces missing from today's net-next-2.6:
include/net/netfilter/nf_conntrack_ecache.h

Signed-off-by: Guo-Fu Tseng
Signed-off-by: David S. Miller

Guo-Fu Tseng
2008-10-10 12:10:36 +0800

08 Oct, 2008

2 commits

6058fa6bb netfilter: netns nf_conntrack: per-netns event cache

Heh, last minute proof-reading of this patch made me think,
that this is actually unneeded, simply because "ct" pointers will be
different for different conntracks in different netns, just like they
are different in one netns.

Not so sure anymore.

[Patrick: pointers will be different, flushing can only be done while
inactive though and thus it needs to be per netns]

Signed-off-by: Alexey Dobriyan
Signed-off-by: Patrick McHardy

Alexey Dobriyan
2008-10-08 17:35:07 +0800

a71996fcc netfilter: netns nf_conntrack: pass conntrack to nf_conntrack_event_cache() not skb

This is cleaner: we already know the conntrack to which the event is relevant.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Patrick McHardy

Alexey Dobriyan
2008-10-08 17:35:07 +0800

11 Jul, 2007

1 commit

6823645d6 [NETFILTER]: nf_conntrack_expect: function naming unification

Currently there is a wild mix of nf_conntrack_expect_, nf_ct_exp_,
expect_, exp_, ...

Consistently use nf_ct_ as prefix for exported functions.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:17:53 +0800

26 Apr, 2007

1 commit

010c7d6f8 [NETFILTER]: nf_conntrack: uninline notifier registration functions

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-04-26 13:25:46 +0800

03 Dec, 2006

1 commit

f61801218 [NETFILTER]: nf_conntrack: split out the event cache

This patch splits out the event cache into its own file,
nf_conntrack_ecache.c.

Signed-off-by: Martin Josefsson
Signed-off-by: Patrick McHardy

Martin Josefsson
2006-12-03 13:31:06 +0800