10 Jan, 2019
2 commits
-
commit 21ba8847f857028dc83a0f341e16ecc616e34740 upstream.
Currently, we use check_hlist() for garbage collection. However, we
use the ‘zone’ from the counted entry to query the existence of
existing entries in the hlist. This could be wrong when they are in
different zones, and this patch fixes this issue.
Fixes: e59ea3df3fc2 ("netfilter: xt_connlimit: honor conntrack zone if available")
Signed-off-by: Yi-Hung Wei
Signed-off-by: Pablo Neira Ayuso
[mfo: backport: refresh context lines and use older symbol/file names, note hunk 5:
- nf_conncount.c -> xt_connlimit.c
- nf_conncount_rb -> xt_connlimit_rb
- nf_conncount_tuple -> xt_connlimit_conn
- hunk 5: remove check for non-NULL 'tuple', that isn't required as it's introduced
by upstream commit 35d8deb80 ("netfilter: conncount: Support count only use case")
which addresses nf_conncount_count() that does not exist yet -- it's introduced by
upstream commit 625c556118f3 ("netfilter: connlimit: split xt_connlimit into front
and backend"), a refactor change.
- nft_connlimit.c -> removed, not used/doesn't exist yet.]
Signed-off-by: Mauricio Faria de Oliveira
Signed-off-by: Sasha Levin
-
commit 5e5cbc7b23eaf13e18652c03efbad5be6995de6a upstream.
This patch provides an interface to maintain the list of connections and
the lookup function to obtain the number of connections in the list.
Signed-off-by: Pablo Neira Ayuso
[mfo: backport: refresh context lines and use older symbol/file names:
- nf_conntrack_count.h: new file, add include guards.
- nf_conncount.c -> xt_connlimit.c.
- nf_conncount_rb -> xt_connlimit_rb
- nf_conncount_tuple -> xt_connlimit_conn
- conncount_rb_cachep -> connlimit_rb_cachep
- conncount_conn_cachep -> connlimit_conn_cachep]
Signed-off-by: Mauricio Faria de Oliveira
Signed-off-by: Sasha Levin
08 Jul, 2018
1 commit
-
commit bb7b40aecbf778c0c83a5bd62b0f03ca9f49a618 upstream.
When removing a rule that jumps to a chain, and that chain itself, in the
same batch, this bogusly hits EBUSY. Add activate and deactivate operations
to expressions that can be called from the preparation and the
commit/abort phases.
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman
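The effect of dropping references already in the preparation phase can be
sketched in userspace C. This is a simplified model, not the nf_tables
code: struct chain, struct rule and every function name below are
hypothetical, and the "batch" is reduced to one rule deletion followed by
one chain deletion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: 'use' counts the rules that jump to this chain. */
struct chain {
    int use;
    bool deleted;
};

struct rule {
    struct chain *target;
    bool active;
};

/* Preparation phase: a deactivated rule no longer pins its target chain. */
static void rule_deactivate(struct rule *r)
{
    if (r->active) {
        r->active = false;
        r->target->use--;
    }
}

/* Abort path: restore the reference that deactivation dropped. */
static void rule_activate(struct rule *r)
{
    if (!r->active) {
        r->active = true;
        r->target->use++;
    }
}

/* A chain that is still referenced cannot be deleted. */
static int chain_delete(struct chain *c)
{
    if (c->use > 0)
        return -EBUSY;
    c->deleted = true;
    return 0;
}

/* One batch: delete rule r, then delete chain c.
 * With deactivate_in_prep == false the reference is only dropped at
 * commit time, so the chain deletion in the same batch sees it. */
static int batch_del_rule_and_chain(struct rule *r, struct chain *c,
                                    bool deactivate_in_prep)
{
    int err;

    if (deactivate_in_prep)
        rule_deactivate(r);

    err = chain_delete(c);
    if (err) {
        rule_activate(r);   /* abort: undo the preparation step, if any */
        return err;
    }

    rule_deactivate(r);     /* commit: no-op if already deactivated */
    return 0;
}
```

Calling batch_del_rule_and_chain() with deactivate_in_prep set to false
returns -EBUSY and leaves both objects untouched; with it set to true the
same batch succeeds.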
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
09 Sep, 2017
1 commit
-
This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.
It was not a good idea. The custom hash table was a much better
fit for this purpose.
A fast lookup is not essential, in fact for most cases there is no lookup
at all because original tuple is not taken and can be used as-is.
What needs to be fast is insertion and deletion.
rhlist removal however requires a rhlist walk.
We can have thousands of entries in such a list if source port/addresses
are reused for multiple flows, if this happens removal requests are so
expensive that deletions of a few thousand flows can take several
seconds(!).
The advantages that we got from rhashtable are:
1) table auto-sizing
2) multiple locks
1) would be nice to have, but it is not essential as we have at
most one lookup per new flow, so even a million flows in the bysource
table are not a problem compared to current deletion cost.
2) is easy to add to custom hash table.
I tried to add hlist_node to rhlist to speed up rhltable_remove but this
isn't doable without changing semantics. rhltable_remove_fast will
check that the to-be-deleted object is part of the table and that
requires a list walk that we want to avoid.
Furthermore, using hlist_node increases size of struct rhlist_head, which
in turn increases nf_conn size.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821
Reported-by: Ivan Babrou
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
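The cost difference between the two removal strategies can be illustrated
with a small userspace model. This is a sketch, not the kernel code: struct
hnode imitates the next/pprev layout of the kernel's hlist_node, while
struct snode stands in for any chain that has to be walked to unlink an
entry. All type and function names here are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Node with a back-pointer, modelled on the hlist_node layout. */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;   /* address of the pointer that points at us */
};

struct hhead {
    struct hnode *first;
};

/* O(1) insertion at the head of a bucket. */
static void hnode_add_head(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* O(1) removal: the node already knows which pointer to patch. */
static void hnode_del(struct hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* Singly linked chain for contrast. */
struct snode {
    struct snode *next;
};

/* Removal has to walk from the head until it finds the node.
 * Returns the number of nodes visited, i.e. the cost of the removal. */
static size_t snode_del(struct snode **head, struct snode *n)
{
    size_t visited = 0;

    for (struct snode **pp = head; *pp; pp = &(*pp)->next) {
        visited++;
        if (*pp == n) {
            *pp = n->next;
            n->next = NULL;
            break;
        }
    }
    return visited;
}
```

With thousands of entries sharing one chain, snode_del() visits on the
order of the chain length per deletion, while hnode_del() touches a fixed
number of pointers regardless of chain length.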
04 Sep, 2017
4 commits
-
This patch removes CONFIG_NETFILTER_DEBUG and _ASSERT() macros as they
are no longer required. Replace _ASSERT() macros with WARN_ON().
Signed-off-by: Varsha Rao
Signed-off-by: Pablo Neira Ayuso
-
This patch removes NF_CT_ASSERT() and instead uses WARN_ON().
Signed-off-by: Varsha Rao
-
tested with allmodconfig build.
Signed-off-by: Florian Westphal
-
This patch adds support for overloading stateful objects operations
through the select_ops() callback, just as it is implemented for
expressions.
This change is needed for upcoming additions to the stateful objects
infrastructure.
Signed-off-by: Pablo M. Bermudo Garay
Signed-off-by: Pablo Neira Ayuso
28 Aug, 2017
1 commit
-
This converts the storage and layout of netfilter hook entries from a
linked list to an array. After this commit, hook entries will be
stored adjacent in memory. The next pointer is no longer required.
The ops pointers are stored at the end of the array as they are only
used in the register/unregister path and in the legacy br_netfilter code.
nf_unregister_net_hooks() is slower than needed as it just calls
nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
calls), this will be addressed in followup patch.
Test setup:
- ixgbe 10gbit
- netperf UDP_STREAM, 64 byte packets
- 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
empty mangle and raw prerouting, mangle and filter input hooks:
353.9
this patch:
364.2
Signed-off-by: Aaron Conole
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
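The layout described in this commit (hot-path entries stored back to back,
with the ops pointers appended as a trailer) can be sketched in userspace C.
This is an illustration under assumptions, not the kernel structures: the
type names, the two sample hooks and the int "packet" are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*hook_fn)(void *priv, int pkt);

/* Hot-path data: one entry per hook, stored contiguously. */
struct hook_entry {
    hook_fn fn;
    void *priv;
};

/* Registration-time data, only needed on register/unregister. */
struct hook_ops {
    hook_fn fn;
    void *priv;
    int priority;
};

struct hook_entries {
    unsigned int num;
    struct hook_entry hooks[];  /* followed by num 'const struct hook_ops *' */
};

/* The ops pointers live directly behind the last hook entry. */
static const struct hook_ops **entries_ops(struct hook_entries *e)
{
    return (const struct hook_ops **)&e->hooks[e->num];
}

/* Build one allocation holding the entries and the trailing ops pointers. */
static struct hook_entries *entries_alloc(const struct hook_ops *const *ops,
                                          unsigned int num)
{
    size_t size = sizeof(struct hook_entries)
                + num * sizeof(struct hook_entry)
                + num * sizeof(const struct hook_ops *);
    struct hook_entries *e = calloc(1, size);

    if (!e)
        return NULL;
    e->num = num;
    for (unsigned int i = 0; i < num; i++) {
        e->hooks[i].fn = ops[i]->fn;
        e->hooks[i].priv = ops[i]->priv;
        entries_ops(e)[i] = ops[i];
    }
    return e;
}

/* Hot path: a linear scan over adjacent memory, no next pointers.
 * Returns 1 if every hook accepted the packet, 0 on the first drop. */
static int run_hooks(const struct hook_entries *e, int pkt)
{
    for (unsigned int i = 0; i < e->num; i++)
        if (!e->hooks[i].fn(e->hooks[i].priv, pkt))
            return 0;
    return 1;
}

/* Two sample hooks for demonstration. */
static int hook_accept(void *priv, int pkt)
{
    (void)priv;
    (void)pkt;
    return 1;
}

static int hook_drop_negative(void *priv, int pkt)
{
    (void)priv;
    return pkt >= 0;
}
```

Only run_hooks() touches the entry array on the per-packet path; the
trailer returned by entries_ops() is consulted when hooks are added or
removed, which keeps rarely used data out of the cache lines the hot path
reads.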
25 Aug, 2017
7 commits
-
Doesn't change generated code, but will make it easier to eventually
make the actual trackers themselves const.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
CONFIG_NF_CONNTRACK_PROCFS is deprecated, no need to use a function
pointer in the trackers for this. Place the printf formatting in
the one place that uses it.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
can use u16 for both, shrinks size by another 8 bytes.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
no need to waste storage for something that is only needed
in one place and can be deduced from protocol number.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
no need to waste storage for something that is only needed
in one place and can be deduced from protocol number.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
avoids a pointer and allows struct to be const later on.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
02 Aug, 2017
1 commit
-
When a nf_conntrack_l3/4proto parameter is not on the left hand side
of an assignment, its address is not taken, and it is not passed to a
function that may modify its fields, then it can be declared as const.
This change is useful from a documentation point of view, and can
possibly facilitate making some nf_conntrack_l3/4proto structures const
subsequently.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall
Signed-off-by: Pablo Neira Ayuso
01 Aug, 2017
7 commits
-
Discussion during NFWS 2017 in Faro has shown that the current
conntrack behaviour is unreasonable.
Even if conntrack module is loaded on behalf of a single net namespace,
it's turned on for all namespaces, which is expensive. Commit
481fa373476 ("netfilter: conntrack: add nf_conntrack_default_on sysctl")
attempted to provide an alternative to the 'default on' behaviour by
adding a sysctl to change it.
However, as Eric points out, the sysctl only becomes available
once the module is loaded, and then it's too late.
So we either have to move the sysctl to the core, or, alternatively,
change conntrack to become active only once the rule set requires this.
This does the latter, conntrack is only enabled when a rule needs it.
Reported-by: Eric Dumazet
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter
Signed-off-by: Pablo Neira Ayuso
-
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter
Signed-off-by: Pablo Neira Ayuso
-
Same conversion as for table names, use NFT_NAME_MAXLEN as upper
boundary as well.
Signed-off-by: Phil Sutter
Signed-off-by: Pablo Neira Ayuso
-
Allocate all table names dynamically to allow for arbitrary lengths but
introduce NFT_NAME_MAXLEN as an upper sanity boundary. Its value was
chosen to allow using a domain name as per RFC 1035.
Signed-off-by: Phil Sutter
Signed-off-by: Pablo Neira Ayuso
-
This also removes __nf_ct_unconfirmed_destroy() call from
nf_ct_iterate_cleanup_net, so that function can be used only
when missing conntracks from unconfirmed list isn't a problem.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
We have several spots that open-code an expect walk, add a helper
that is similar to nf_ct_iterate_destroy/nf_ct_iterate_cleanup.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
24 Jul, 2017
1 commit
-
These chain counters are only used by the iptables-compat tool, which
allows users to use the x_tables extensions from the existing nf_tables
framework. This patch makes nf_tables ~5% faster for the general use case,
i.e. native nft users, where no chain counters are used at all.
Signed-off-by: Pablo Neira Ayuso
01 Jul, 2017
1 commit
-
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller
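Why a dedicated reference-counter type helps can be shown with a userspace
sketch built on C11 atomics. It is a simplified model, not the kernel's
refcount_t implementation: struct ref and the ref_*() functions are
hypothetical names, loosely mirroring the idea behind
refcount_inc_not_zero() and refcount_dec_and_test(). The counter refuses to
resurrect a dead object and sticks at a maximum instead of wrapping to 0.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_SATURATED UINT_MAX

struct ref {
    atomic_uint count;
};

static void ref_set(struct ref *r, unsigned int n)
{
    atomic_store(&r->count, n);
}

static unsigned int ref_read(struct ref *r)
{
    return atomic_load(&r->count);
}

/* Take a reference. Fails if the count already dropped to 0, and stays
 * at REF_SATURATED instead of wrapping around. */
static bool ref_inc_not_zero(struct ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0)
            return false;   /* object is dead, do not revive it */
        if (old == REF_SATURATED)
            return true;    /* saturated: leak rather than overflow */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

    return true;
}

/* Drop a reference. Returns true only for the caller that dropped the
 * last one. A saturated or already-zero counter is never decremented,
 * so it can never trigger a free. */
static bool ref_dec_and_test(struct ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == REF_SATURATED || old == 0)
            return false;
    } while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));

    return old == 1;
}
```

A plain atomic counter incremented past its maximum wraps to 0, after which
a single put frees an object that still has users. In this model the
counter pins at REF_SATURATED, which trades a possible memory leak for the
absence of a use-after-free.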
20 Jun, 2017
1 commit
-
We don't support anything larger than NFPROTO_MAX, so we can shrink this a bit:
     text data  dec  hex filename
old: 8259 1096 9355 248b net/netfilter/nf_conntrack_proto.o
new: 8259  624 8883 22b3 net/netfilter/nf_conntrack_proto.o
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
29 May, 2017
4 commits
-
The new non-resizable hashtable variant needs this to calculate the
size of the bucket array.
Signed-off-by: Pablo Neira Ayuso
-
This patch adds the infrastructure to support several implementations of
the same set type. This selection will be based on the set description
and the features available for this set. This allows us to select the set
backend implementation that will result in better performance numbers.
Signed-off-by: Pablo Neira Ayuso
-
sledgehammer to be used on module unload (to remove affected conntracks
from all namespaces).
It will also flag all unconfirmed conntracks as dying, i.e. they will
not be committed to main table.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
There are several places where we needlessly call nf_ct_iterate_cleanup,
we should instead iterate the full table at module unload time.
This is a leftover from back when the conntrack table got duplicated
per net namespace.
So rename nf_ct_iterate_cleanup to nf_ct_iterate_cleanup_net.
A later patch will then add a non-net variant.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
15 May, 2017
3 commits
-
Andreas reports that the following incremental update using our commit
protocol doesn't work.
# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy
The existing code is not well-integrated into the commit phase protocol,
since element deletions do not result in refcount decrement from the
preparation phase. This results in bogus EBUSY errors like the one
above.
Two new functions come with this patch:
* nft_set_elem_activate() function is used from the abort path, to
restore the set element refcounting on objects that occurred from
the preparation phase.
* nft_set_elem_deactivate() that is called from nft_del_setelem() to
decrement set element refcounting on objects from the preparation
phase in the commit protocol.
The nft_data_uninit() has been renamed to nft_data_release() since this
function does not uninitialize any data stored in the data register,
instead just releases the references to objects. Moreover, a new
function nft_data_hold() has been introduced to be used from
nft_set_elem_activate().
Reported-by: Andreas Schultz
Signed-off-by: Pablo Neira Ayuso
-
We can still delete the ct helper even if it is in use, this will cause
a use-after-free error. In more detail, I mean:
# nfct helper add ssdp inet udp
# iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
# nfct helper delete ssdp //--> oops, succeed!
BUG: unable to handle kernel paging request at 000026ca
IP: 0x26ca
[...]
Call Trace:
? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
nf_hook_slow+0x21/0xb0
ip_output+0xe9/0x100
? ip_fragment.constprop.54+0xc0/0xc0
ip_local_out+0x33/0x40
ip_send_skb+0x16/0x80
udp_send_skb+0x84/0x240
udp_sendmsg+0x35d/0xa50
So add reference count to fix this issue, if ct helper is used by
others, reject the delete request.
Apply this patch:
# nfct helper delete ssdp
nfct v1.4.3: netlink error: Device or resource busy
Signed-off-by: Liping Zhang
Signed-off-by: Pablo Neira Ayuso
-
And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.
Signed-off-by: Liping Zhang
Signed-off-by: Pablo Neira Ayuso
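The get/put pairing that the two cthelper commits above rely on can be
modelled in a few lines of userspace C. This is a sketch only: struct
helper and the helper_*() functions are hypothetical and are not the
nfnetlink_cthelper code; helper_put() merely plays the role that the
commit text assigns to nf_conntrack_helper_put.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a user-defined conntrack helper. */
struct helper {
    const char *name;
    unsigned int refcnt;    /* users currently attached, e.g. CT rules */
    bool registered;
};

/* A user (such as a rule naming the helper) takes a reference. */
static struct helper *helper_get(struct helper *h)
{
    if (!h->registered)
        return NULL;
    h->refcnt++;
    return h;
}

/* Counterpart called when that user goes away. */
static void helper_put(struct helper *h)
{
    assert(h->refcnt > 0);
    h->refcnt--;
}

/* Deletion is refused while anyone still holds a reference. */
static int helper_delete(struct helper *h)
{
    if (h->refcnt > 0)
        return -EBUSY;
    h->registered = false;
    return 0;
}
```

In this model, deleting a helper while a rule still holds it returns
-EBUSY, matching the "Device or resource busy" message shown in the commit;
once the last user calls helper_put(), the deletion goes through.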
01 May, 2017
2 commits
-
For NF_NAT_MANIP_SRC, we will insert the ct to the nat_bysource_table,
then remove it from the nat_bysource_table via nat_extend->destroy.
But now, the nat extension is attached on demand, so if the nat extension
is not attached, we will not be notified when the ct is destroyed, i.e.
we may fail to remove ct from the nat_bysource_table.
So just keep it simple, even if the extension is not attached, we will
still invoke the related ext->destroy. And this will also preserve the
flexibility for the future extension.
Fixes: 9a08ecfe74d7 ("netfilter: don't attach a nat extension by default")
Signed-off-by: Liping Zhang
Signed-off-by: Pablo Neira Ayuso
-
nf_unregister_net_hook(s) can avoid a second call to synchronize_net,
provided there is no nfqueue active in that net namespace (which is
the common case).
This also gets rid of the extra arg to nf_queue_nf_hook_drop(), normally
this gets called during netns cleanup so no packets should be queued.
For the rare case of base chain being unregistered or module removal
while nfqueue is in use the extra hiccup due to the packet drops isn't
a big deal.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
26 Apr, 2017
3 commits
-
nowadays the NAT extension only stores the interface index
(used to purge connections that got masqueraded when interface goes down)
and pptp nat information.
Previous patches moved nf_ct_nat_ext_add to those places that need it.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
It was used by the nat extension, but since commit
7c9664351980 ("netfilter: move nat hlist_head to nf_conn") it's only needed
for connections that use MASQUERADE target or a nat helper.
Also it seems a lot easier to preallocate a fixed size instead.
With default settings, conntrack first adds ecache extension (sysctl
defaults to 1), so we get 40(ct extension header) + 24 (ecache) == 64 byte
on x86_64 for initial allocation.
Followup patches can constify the extension structs and avoid
the initial zeroing of the entire extension area.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso