Eric Lee / smarc-fsl-linux-kernel

19 Jun, 2019

1 commit

d2912cb15 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 ... Browse Code »

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Enrico Weigelt
Reviewed-by: Kate Stewart
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-06-19 23:09:55 +0800

29 Mar, 2019

1 commit

717700d18 netfilter: Export nf_ct_{set,destroy}_timeout() ... Browse Code »

This patch exports nf_ct_set_timeout() and nf_ct_destroy_timeout().
The two functions are derived from xt_ct_destroy_timeout() and
xt_ct_set_timeout() in xt_CT.c, and moved to nf_conntrack_timeout.c
without any functional change.
It would be useful for other users (i.e. OVS) that utilizes the
finer-grain conntrack timeout feature.

CC: Pablo Neira Ayuso
CC: Pravin Shelar
Signed-off-by: Yi-Hung Wei
Signed-off-by: David S. Miller

Yi-Hung Wei
2019-03-29 07:53:29 +0800

18 Jan, 2019

1 commit

4a60dc748 netfilter: conntrack: remove nf_ct_l4proto_find_get ... Browse Code »

Its now same as __nf_ct_l4proto_find(), so rename that to
nf_ct_l4proto_find and use it everywhere.

It never returns NULL and doesn't need locks or reference counts.

Before this series:
302824 net/netfilter/nf_conntrack.ko
21504 net/netfilter/nf_conntrack_proto_gre.ko

text data bss dec hex filename
6281 1732 4 8017 1f51 nf_conntrack_proto_gre.ko
108356 20613 236 129205 1f8b5 nf_conntrack.ko

After:
294864 net/netfilter/nf_conntrack.ko
text data bss dec hex filename
106979 19557 240 126776 1ef38 nf_conntrack.ko

so, even with builtin gre, total size got reduced.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-01-18 22:02:34 +0800

21 Sep, 2018

1 commit

dd2934a95 netfilter: conntrack: remove l3->l4 mapping information ... Browse Code »

l4 protocols are demuxed by l3num, l4num pair.

However, almost all l4 trackers are l3 agnostic.

Only exceptions are:
- gre, icmp (ipv4 only)
- icmpv6 (ipv6 only)

This commit gets rid of the l3 mapping, l4 trackers can now be looked up
by their IPPROTO_XXX value alone, which gets rid of the additional l3
indirection.

For icmp, ipcmp6 and gre, add a check on state->pf and
return -NF_ACCEPT in case we're asked to track e.g. icmpv6-in-ipv4,
this seems more fitting than using the generic tracker.

Additionally we can kill the 2nd l4proto definitions that were needed
for v4/v6 split -- they are now the same so we can use single l4proto
struct for each protocol, rather than two.

The EXPORT_SYMBOLs can be removed as all these object files are
part of nf_conntrack with no external references.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-09-21 00:07:35 +0800

07 Aug, 2018

1 commit

6c1fd7dc4 netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout object ... Browse Code »

The timeout policy is currently embedded into the nfnetlink_cttimeout
object, move the policy into an independent object. This allows us to
reuse part of the existing conntrack timeout extension from nf_tables
without adding dependencies with the nfnetlink_cttimeout object layout.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2018-08-07 23:14:15 +0800

18 Jul, 2018

1 commit

440534d3c netfilter: Remove useless param helper of nf_ct_helper_ext_add ... Browse Code »

The param helper of nf_ct_helper_ext_add is useless now, then remove
it now.

Signed-off-by: Gao Feng
Signed-off-by: Pablo Neira Ayuso

Gao Feng
2018-07-18 17:26:42 +0800

01 Jun, 2018

1 commit

8f4d19aac netfilter: xt_CT: Reject the non-null terminated string from user space ... Browse Code »

The helper and timeout strings are from user-space, we need to make
sure they are null terminated. If not, evil user could make kernel
read the unexpected memory, even print it when fail to find by the
following codes.

pr_info_ratelimited("No such helper \"%s\"\n", helper_name);

Signed-off-by: Gao Feng
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Gao Feng
2018-06-01 16:14:51 +0800

15 Feb, 2018

1 commit

11f7aee23 netfilter: xt_CT: use pr ratelimiting ... Browse Code »

checkpatch complains about line > 80 but this would require splitting
"literal" over two lines which is worse.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-02-15 04:05:34 +0800

25 Aug, 2017

1 commit

b3480fe05 netfilter: conntrack: make protocol tracker pointers const ... Browse Code »

Doesn't change generated code, but will make it easier to eventually
make the actual trackers themselvers const.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2017-08-25 00:52:33 +0800

15 May, 2017

1 commit

d91fc59cd netfilter: introduce nf_conntrack_helper_put helper function ... Browse Code »

And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.

Signed-off-by: Liping Zhang
Signed-off-by: Pablo Neira Ayuso

Liping Zhang
2017-05-15 18:42:29 +0800

03 May, 2017

1 commit

4d89ac2dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter/IPVS/OVS fixes for net

The following patchset contains a rather large batch of Netfilter, IPVS
and OVS fixes for your net tree. This includes fixes for ctnetlink, the
userspace conntrack helper infrastructure, conntrack OVS support,
ebtables DNAT target, several leaks in error path among other. More
specifically, they are:

1) Fix reference count leak in the CT target error path, from Gao Feng.

2) Remove conntrack entry clashing with a matching expectation, patch
from Jarno Rajahalme.

3) Fix bogus EEXIST when registering two different userspace helpers,
from Liping Zhang.

4) Don't leak dummy elements in the new bitmap set type in nf_tables,
from Liping Zhang.

5) Get rid of module autoload from conntrack update path in ctnetlink,
we don't need autoload at this late stage and it is happening with
rcu read lock held which is not good. From Liping Zhang.

6) Fix deadlock due to double-acquire of the expect_lock from conntrack
update path, this fixes a bug that was introduced when the central
spinlock got removed. Again from Liping Zhang.

7) Safe ct->status update from ctnetlink path, from Liping. The expect_lock
protection that was selected when the central spinlock was removed was
not really protecting anything at all.

8) Protect sequence adjustment under ct->lock.

9) Missing socket match with IPv6, from Peter Tirsek.

10) Adjust skb->pkt_type of DNAT'ed frames from ebtables, from
Linus Luessing.

11) Don't give up on evaluating the expression on new entries added via
dynset expression in nf_tables, from Liping Zhang.

12) Use skb_checksum() when mangling icmpv6 in IPv6 NAT as this deals
with non-linear skbuffs.

13) Don't allow IPv6 service in IPVS if no IPv6 support is available,
from Paolo Abeni.

14) Missing mutex release in error path of xt_find_table_lock(), from
Dan Carpenter.

15) Update maintainers files, Netfilter section. Add Florian to the
file, refer to nftables.org and change project status from Supported
to Maintained.

16) Bail out on mismatching extensions in element updates in nf_tables.
====================

Signed-off-by: David S. Miller

David S. Miller
2017-05-03 22:11:26 +0800

25 Apr, 2017

1 commit

470acf55a netfilter: xt_CT: fix refcnt leak on error path ... Browse Code »

There are two cases which causes refcnt leak.

1. When nf_ct_timeout_ext_add failed in xt_ct_set_timeout, it should
free the timeout refcnt.
Now goto the err_put_timeout error handler instead of going ahead.

2. When the time policy is not found, we should call module_put.
Otherwise, the related cthelper module cannot be removed anymore.
It is easy to reproduce by typing the following command:
# iptables -t raw -A OUTPUT -p tcp -j CT --helper ftp --timeout xxx

Signed-off-by: Gao Feng
Signed-off-by: Liping Zhang
Signed-off-by: Pablo Neira Ayuso

Gao Feng
2017-04-25 02:03:01 +0800

15 Apr, 2017

1 commit

cc41c84b7 netfilter: kill the fake untracked conntrack objects ... Browse Code »

resurrect an old patch from Pablo Neira to remove the untracked objects.

Currently, there are four possible states of an skb wrt. conntrack.

1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tables at any point in time
4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
IPS_UNTRACKED_BIT in ct->status.

Untracked is supposed to be identical to case 1. It exists only
so users can check

-m conntrack --ctstate UNTRACKED vs.
-m conntrack --ctstate INVALID

e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
supposed to be a no-op.

Thus currently we need to check
ct == NULL || nf_ct_is_untracked(ct)

in a lot of places in order to avoid altering untracked objects.

The other consequence of the percpu untracked object is that all
-j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
(inc/dec the untracked conntracks refcount).

This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
make the distinction instead.

The (few) places that care about packet invalid (ct is NULL) vs.
packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
but all other places can omit the nf_ct_is_untracked() check.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2017-04-15 17:47:57 +0800

02 Feb, 2017

3 commits

a9e419dc7 netfilter: merge ctinfo into nfct pointer storage area ... Browse Code »

After this change conntrack operations (lookup, creation, matching from
ruleset) only access one instead of two sk_buff cache lines.

This works for normal conntracks because those are allocated from a slab
that guarantees hw cacheline or 8byte alignment (whatever is larger)
so the 3 bits needed for ctinfo won't overlap with nf_conn addresses.

Template allocation now does manual address alignment (see previous change)
on arches that don't have sufficent kmalloc min alignment.

Some spots intentionally use skb->_nfct instead of skb_nfct() helpers,
this is to avoid undoing the skb_nfct() use when we remove untracked
conntrack object in the future.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2017-02-02 21:31:56 +0800
c74454fad netfilter: add and use nf_ct_set helper ... Browse Code »

Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff.
This avoids changing code in followup patch that merges skb->nfct and
skb->nfctinfo into skb->_nfct.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2017-02-02 21:31:54 +0800
cb9c68363 skbuff: add and use skb_nfct helper ... Browse Code »

Followup patch renames skb->nfct and changes its type so add a helper to
avoid intrusive rename change later.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2017-02-02 21:31:53 +0800

10 Jan, 2017

1 commit

ec2318904 xtables: extend matches and targets with .usersize ... Browse Code »

In matches and targets that define a kernel-only tail to their
xt_match and xt_target data structs, add a field .usersize that
specifies up to where data is to be shared with userspace.

Performed a search for comment "Used internally by the kernel" to find
relevant matches and targets. Manually inspected the structs to derive
a valid offsetof.

Signed-off-by: Willem de Bruijn
Signed-off-by: Pablo Neira Ayuso

Willem de Bruijn
2017-01-10 00:24:55 +0800

05 Dec, 2016

1 commit

ecb2421b5 netfilter: add and use nf_ct_netns_get/put ... Browse Code »

currently aliased to try_module_get/_put.
Will be changed in next patch when we add functions to make use of ->net
argument to store usercount per l3proto tracker.

This is needed to avoid registering the conntrack hooks in all netns and
later only enable connection tracking in those that need conntrack.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2016-12-05 04:16:50 +0800

14 Dec, 2015

1 commit

19576c947 netfilter: cttimeout: add netns support ... Browse Code »

Add a per-netns list of timeout objects and adjust code to use it.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira
2015-12-14 19:48:58 +0800

12 Oct, 2015

2 commits

ae2d708ed netfilter: conntrack: fix crash on timeout object removal ... Browse Code »

The object and module refcounts are updated for each conntrack template,
however, if we delete the iptables rules and we flush the timeout
database, we may end up with invalid references to timeout object that
are just gone.

Resolve this problem by setting the timeout reference to NULL when the
custom timeout entry is removed from our base. This patch requires some
RCU trickery to ensure safe pointer handling.

This handling is similar to what we already do with conntrack helpers,
the idea is to avoid bumping the timeout object reference counter from
the packet path to avoid the cost of atomic ops.

Reported-by: Stephen Hemminger
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-10-12 23:04:34 +0800
403d89ad9 netfilter: xt_CT: don't put back reference to timeout policy object ... Browse Code »

On success, this shouldn't put back the timeout policy object, otherwise
we may have module refcount overflow and we allow deletion of timeout
that are still in use.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-10-12 22:54:45 +0800

06 Sep, 2015

1 commit

53cfd053e Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Conflicts:
include/net/netfilter/nf_conntrack.h

The conflict was an overlap between changing the type of the zone
argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free.

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net, they are:

1) Oneliner to restore maps in nf_tables since we support addressing registers
at 32 bits level.

2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
oneliner from Bernhard Thaler.

3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
KASan utility, patch from Jozsef Kadlecsik.

4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
unnamed unions, patch from Elad Raz.

5) Add a workaround to address inconsistent endianess in the res_id field of
nfnetlink batch messages, reported by Florian Westphal.

6) Fix error paths of CT/synproxy since the conntrack template was moved to use
kmalloc, patch from Daniel Borkmann.

All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.
====================

Signed-off-by: David S. Miller

David S. Miller
2015-09-06 12:57:42 +0800

01 Sep, 2015

1 commit

9cf94eab8 netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths ... Browse Code »

Commit 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack
templates") migrated templates to the new allocator api, but forgot to
update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
instead of nf_conntrack_free().

Due to that, memory is being freed into the wrong kmemcache, but also
we drop the per net reference count of ct objects causing an imbalance.

In Brad's case, this leads to a wrap-around of net->ct.count and thus
lets __nf_conntrack_alloc() refuse to create a new ct object:

[ 10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
[ 10.810168] nf_conntrack: table full, dropping packet
[ 11.917416] r8169 0000:07:00.0 eth0: link up
[ 11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 12.815902] nf_conntrack: table full, dropping packet
[ 15.688561] nf_conntrack: table full, dropping packet
[ 15.689365] nf_conntrack: table full, dropping packet
[ 15.690169] nf_conntrack: table full, dropping packet
[ 15.690967] nf_conntrack: table full, dropping packet
[...]

With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
to fix the problem, export and use nf_ct_tmpl_free() instead.

Fixes: 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack templates")
Reported-by: Brad Jackson
Signed-off-by: Daniel Borkmann
Signed-off-by: Pablo Neira Ayuso

Daniel Borkmann
2015-09-01 18:15:08 +0800

21 Aug, 2015

1 commit

81bf1c64e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Resolve conflicts with conntrack template fixes.

Conflicts:
net/netfilter/nf_conntrack_core.c
net/netfilter/nf_synproxy_core.c
net/netfilter/xt_CT.c

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-08-21 12:09:05 +0800

18 Aug, 2015

2 commits

5e8018fc6 netfilter: nf_conntrack: add efficient mark to zone mapping ... Browse Code »

This work adds the possibility of deriving the zone id from the skb->mark
field in a scalable manner. This allows for having only a single template
serving hundreds/thousands of different zones, for example, instead of the
need to have one match for each zone as an extra CT jump target.

Note that we'd need to have this information attached to the template as at
the time when we're trying to lookup a possible ct object, we already need
to know zone information for a possible match when going into
__nf_conntrack_find_get(). This work provides a minimal implementation for
a possible mapping.

In order to not add/expose an extra ct->status bit, the zone structure has
been extended to carry a flag for deriving the mark.

Signed-off-by: Daniel Borkmann
Signed-off-by: Pablo Neira Ayuso

Daniel Borkmann
2015-08-18 07:24:05 +0800
deedb5903 netfilter: nf_conntrack: add direction support for zones ... Browse Code »

This work adds a direction parameter to netfilter zones, so identity
separation can be performed only in original/reply or both directions
(default). This basically opens up the possibility of doing NAT with
conflicting IP address/port tuples from multiple, isolated tenants
on a host (e.g. from a netns) without requiring each tenant to NAT
twice resp. to use its own dedicated IP address to SNAT to, meaning
overlapping tuples can be made unique with the zone identifier in
original direction, where the NAT engine will then allocate a unique
tuple in the commonly shared default zone for the reply direction.
In some restricted, local DNAT cases, also port redirection could be
used for making the reply traffic unique w/o requiring SNAT.

The consensus we've reached and discussed at NFWS and since the initial
implementation [1] was to directly integrate the direction meta data
into the existing zones infrastructure, as opposed to the ct->mark
approach we proposed initially.

As we pass the nf_conntrack_zone object directly around, we don't have
to touch all call-sites, but only those, that contain equality checks
of zones. Thus, based on the current direction (original or reply),
we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID.
CT expectations are direction-agnostic entities when expectations are
being compared among themselves, so we can only use the identifier
in this case.

Note that zone identifiers can not be included into the hash mix
anymore as they don't contain a "stable" value that would be equal
for both directions at all times, f.e. if only zone->id would
unconditionally be xor'ed into the table slot hash, then replies won't
find the corresponding conntracking entry anymore.

If no particular direction is specified when configuring zones, the
behaviour is exactly as we expect currently (both directions).

Support has been added for the CT netlink interface as well as the
x_tables raw CT target, which both already offer existing interfaces
to user space for the configuration of zones.

Below a minimal, simplified collision example (script in [2]) with
netperf sessions:

+--- tenant-1 ---+ mark := 1
| netperf |--+
+----------------+ | CT zone := mark [ORIGINAL]
[ip,sport] := X +--------------+ +--- gateway ---+
| mark routing |--| SNAT |-- ... +
+--------------+ +---------------+ |
+--- tenant-2 ---+ | ~~~|~~~
| netperf |--+ +-----------+ |
+----------------+ mark := 2 | netserver |------ ... +
[ip,sport] := X +-----------+
[ip,port] := Y
On the gateway netns, example:

iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL
iptables -t nat -A POSTROUTING -o -j SNAT --to-source --random-fully

iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark
iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark

conntrack dump from gateway netns:

netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns

tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1
src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438
[ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1

tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2
src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889
[ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2

Taking this further, test script in [2] creates 200 tenants and runs
original-tuple colliding netperf sessions each. A conntrack -L dump in
the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED
state as expected.

I also did run various other tests with some permutations of the script,
to mention some: SNAT in random/random-fully/persistent mode, no zones (no
overlaps), static zones (original, reply, both directions), etc.

[1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/
[2] https://paste.fedoraproject.org/242835/65657871/

Signed-off-by: Daniel Borkmann
Signed-off-by: Pablo Neira Ayuso

Daniel Borkmann
2015-08-18 07:22:50 +0800

11 Aug, 2015

1 commit

308ac9143 netfilter: nf_conntrack: push zone object into functions ... Browse Code »

This patch replaces the zone id which is pushed down into functions
with the actual zone object. It's a bigger one-time change, but
needed for later on extending zones with a direction parameter, and
thus decoupling this additional information from all call-sites.

No functional changes in this patch.

The default zone becomes a global const object, namely nf_ct_zone_dflt
and will be returned directly in various cases, one being, when there's
f.e. no zoning support.

Signed-off-by: Daniel Borkmann
Signed-off-by: Pablo Neira Ayuso

Daniel Borkmann
2015-08-11 18:29:01 +0800

30 Jul, 2015

1 commit

1a727c636 netfilter: nf_conntrack: checking for IS_ERR() instead of NULL ... Browse Code »

We recently changed this from nf_conntrack_alloc() to nf_ct_tmpl_alloc()
so the error handling needs to changed to check for NULL instead of
IS_ERR().

Fixes: 0838aa7fcfcd ('netfilter: fix netns dependencies with conntrack templates')
Signed-off-by: Dan Carpenter
Signed-off-by: Pablo Neira Ayuso

Dan Carpenter
2015-07-30 20:04:19 +0800

20 Jul, 2015

1 commit

0838aa7fc netfilter: fix netns dependencies with conntrack templates ... Browse Code »

Quoting Daniel Borkmann:

"When adding connection tracking template rules to a netns, f.e. to
configure netfilter zones, the kernel will endlessly busy-loop as soon
as we try to delete the given netns in case there's at least one
template present, which is problematic i.e. if there is such bravery that
the priviledged user inside the netns is assumed untrusted.

Minimal example:

ip netns add foo
ip netns exec foo iptables -t raw -A PREROUTING -d 1.2.3.4 -j CT --zone 1
ip netns del foo

What happens is that when nf_ct_iterate_cleanup() is being called from
nf_conntrack_cleanup_net_list() for a provided netns, we always end up
with a net->ct.count > 0 and thus jump back to i_see_dead_people. We
don't get a soft-lockup as we still have a schedule() point, but the
serving CPU spins on 100% from that point onwards.

Since templates are normally allocated with nf_conntrack_alloc(), we
also bump net->ct.count. The issue why they are not yet nf_ct_put() is
because the per netns .exit() handler from x_tables (which would eventually
invoke xt_CT's xt_ct_tg_destroy() that drops reference on info->ct) is
called in the dependency chain at a *later* point in time than the per
netns .exit() handler for the connection tracker.

This is clearly a chicken'n'egg problem: after the connection tracker
.exit() handler, we've teared down all the connection tracking
infrastructure already, so rightfully, xt_ct_tg_destroy() cannot be
invoked at a later point in time during the netns cleanup, as that would
lead to a use-after-free. At the same time, we cannot make x_tables depend
on the connection tracker module, so that the xt_ct_tg_destroy() would
be invoked earlier in the cleanup chain."

Daniel confirms this has to do with the order in which modules are loaded or
having compiled nf_conntrack as modules while x_tables built-in. So we have no
guarantees regarding the order in which netns callbacks are executed.

Fix this by allocating the templates through kmalloc() from the respective
SYNPROXY and CT targets, so they don't depend on the conntrack kmem cache.
Then, release then via nf_ct_tmpl_free() from destroy_conntrack(). This branch
is marked as unlikely since conntrack templates are rarely allocated and only
from the configuration plane path.

Note that templates are not kept in any list to avoid further dependencies with
nf_conntrack anymore, thus, the tmpl larval list is removed.

Reported-by: Daniel Borkmann
Signed-off-by: Pablo Neira Ayuso
Tested-by: Daniel Borkmann

Pablo Neira Ayuso
2015-07-20 20:58:19 +0800

06 Feb, 2014

1 commit

e53376bef netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt ... Browse Code »

With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1 CPU2
____nf_conntrack_find()
nf_ct_put()
destroy_conntrack()
...
init_conntrack
__nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&ct->use) (use = 2)
if (!l4proto->new(ct, skb, dataoff, timeouts))
nf_conntrack_free(ct); (use = 2 !!!)
...
__nf_conntrack_alloc (set use = 1)
if (!nf_ct_key_equal(h, tuple, zone))
nf_ct_put(ct); (use = 0)
destroy_conntrack()
/* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

[67096.759334] ------------[ cut here ]------------
[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G C --------------- 2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
[67096.759932] RIP: 0010:[] [] destroy_conntrack+0x15c/0x190 [nf_conntrack]
[67096.760255] Call Trace:
[67096.760255] [] nf_conntrack_destroy+0x17/0x30
[67096.760255] [] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
[67096.760255] [] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
[67096.760255] [] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
[67096.760255] [] nf_iterate+0x69/0xb0
[67096.760255] [] ? dst_output+0x0/0x20
[67096.760255] [] nf_hook_slow+0x74/0x110
[67096.760255] [] ? dst_output+0x0/0x20
[67096.760255] [] raw_sendmsg+0x775/0x910
[67096.760255] [] ? flush_tlb_others_ipi+0x128/0x130
[67096.760255] [] ? apic_timer_interrupt+0xe/0x20
[67096.760255] [] ? apic_timer_interrupt+0xe/0x20
[67096.760255] [] inet_sendmsg+0x4a/0xb0
[67096.760255] [] ? sock_sendmsg+0x13/0x140
[67096.760255] [] sock_sendmsg+0x117/0x140
[67096.760255] [] ? native_smp_send_reschedule+0x49/0x60
[67096.760255] [] ? _spin_unlock_bh+0x1b/0x20
[67096.760255] [] ? autoremove_wake_function+0x0/0x40
[67096.760255] [] ? do_ip_setsockopt+0x90/0xd80
[67096.760255] [] ? apic_timer_interrupt+0xe/0x20
[67096.760255] [] ? apic_timer_interrupt+0xe/0x20
[67096.760255] [] sys_sendto+0x139/0x190
[67096.760255] [] ? audit_syscall_entry+0x1d7/0x200
[67096.760255] [] ? __audit_syscall_exit+0x265/0x290
[67096.760255] [] compat_sys_socketcall+0x13f/0x210
[67096.760255] [] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet
Cc: Andrew Vagin
Cc: Florian Westphal
Reported-by: Andrew Vagin
Signed-off-by: Pablo Neira Ayuso
Reviewed-by: Eric Dumazet
Acked-by: Andrew Vagin

Pablo Neira Ayuso
2014-02-06 00:46:06 +0800

04 Jan, 2014

1 commit

14abfa161 netfilter: xt_CT: fix error value in xt_ct_tg_check() ... Browse Code »

If setting event mask fails then we were returning 0 for success.
This patch updates return code to -EINVAL in case of problem.

Signed-off-by: Eric Leblond
Signed-off-by: Pablo Neira Ayuso

Eric Leblond
2014-01-04 06:41:39 +0800

23 May, 2013

1 commit

27e7190ef netfilter: xt_CT: optimize XT_CT_NOTRACK ... Browse Code »

The percpu untracked ct are not currently used for XT_CT_NOTRACK.

xt_ct_tg_check()/xt_ct_target() provides a single ct.

Thats not optimal as the ct->ct_general.use cache line will bounce among
cpus.

Use the intended [1] thing : xt_ct_target() should select the percpu
object.

[1] Refs :
commit 5bfddbd46a95c97 ("netfilter: nf_conntrack: IPS_UNTRACKED bit")
commit b3c5163fe0193a7 ("netfilter: nf_conntrack: per_cpu untracking")

Signed-off-by: Eric Dumazet
Signed-off-by: Pablo Neira Ayuso

Eric Dumazet
2013-05-23 17:09:29 +0800

05 Feb, 2013

2 commits

5474f57f7 netfilter: xt_CT: add alias flag ... Browse Code »

This patch adds the alias flag to support full NOTRACK target
aliasing.

Based on initial patch from Jozsef Kadlecsik.

Acked-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2013-02-05 08:49:26 +0800
d52ed4379 netfilter: xt_CT: merge common code of revision 0 and 1 ... Browse Code »

This patch merges the common code for revision 0 and 1.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2013-02-05 08:49:16 +0800

10 Jan, 2013

1 commit

4610476d8 netfilter: xt_CT: fix unset return value if conntrack zone are disabled ... Browse Code »

net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v1’:
net/netfilter/xt_CT.c:250:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
net/netfilter/xt_CT.c: In function ‘xt_ct_tg_check_v0’:
net/netfilter/xt_CT.c:112:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Reported-by: Borislav Petkov
Acked-by: Borislav Petkov
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2013-01-10 20:11:00 +0800

24 Dec, 2012

1 commit

10db9069e netfilter: xt_CT: recover NOTRACK target support ... Browse Code »

Florian Westphal reported that the removal of the NOTRACK target
(9655050 netfilter: remove xt_NOTRACK) is breaking some existing
setups.

That removal was scheduled for removal since long time ago as
described in Documentation/feature-removal-schedule.txt

What: xt_NOTRACK
Files: net/netfilter/xt_NOTRACK.c
When: April 2011
Why: Superseded by xt_CT

Still, people may have not notice / may have decided to stick to an
old iptables version. I agree with him in that some more conservative
approach by spotting some printk to warn users for some time is less
agressive.

Current iptables 1.4.16.3 already contains the aliasing support
that makes it point to the CT target, so upgrading would fix it.
Still, the policy so far has been to avoid pushing our users to
upgrade.

As a solution, this patch recovers the NOTRACK target inside the CT
target and it now spots a warning.

Reported-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2012-12-24 19:55:09 +0800

17 Dec, 2012

1 commit

252b3e8c1 netfilter: xt_CT: fix crash while destroy ct templates ... Browse Code »

In (d871bef netfilter: ctnetlink: dump entries from the dying and
unconfirmed lists), we assume that all conntrack objects are
inserted in any of the existing lists. However, template conntrack
objects were not. This results in hitting BUG_ON in the
destroy_conntrack path while removing a rule that uses the CT target.

This patch fixes the situation by adding the template lists, which
is where template conntrack objects reside now.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2012-12-17 06:44:12 +0800

15 Oct, 2012

1 commit

0153d5a81 netfilter: xt_CT: fix timeout setting with IPv6 ... Browse Code »

This patch fixes ip6tables and the CT target if it is used to set
some custom conntrack timeout policy for IPv6.

Use xt_ct_find_proto which already handles the ip6tables case for us.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2012-10-15 19:38:58 +0800

03 Sep, 2012

1 commit

236df0056 netfilter: xt_CT: refactorize xt_ct_tg_check ... Browse Code »

This patch adds xt_ct_set_helper and xt_ct_set_timeout to reduce
the size of xt_ct_tg_check.

This aims to improve code mantainability by splitting xt_ct_tg_check
in smaller chunks.

Suggested by Eric Dumazet.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2012-09-03 19:32:48 +0800

16 Jun, 2012

1 commit

1afc56794 netfilter: nf_ct_helper: implement variable length helper private data ... Browse Code »
43

This patch uses the new variable length conntrack extensions.

Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.

This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2012-06-16 21:08:55 +0800