12 Oct, 2020
3 commits
-
This patch adds the NF_INET_INGRESS pseudohook for the NFPROTO_INET
family. It maps the new hook to the existing NFPROTO_NETDEV family and
its NF_NETDEV_INGRESS hook. The hook does not guarantee that packets are
inet only; users must filter out non-IP traffic explicitly.

This infrastructure makes it easier to support this new hook in nf_tables.
Signed-off-by: Pablo Neira Ayuso
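The mapping described above can be sketched in plain userspace C. This is only an illustration: the `sketch_*` names and the enum values are hypothetical stand-ins, not the kernel's definitions, and the real registration path is not shown. The last helper reflects the caveat that users must filter non-IP traffic themselves, here by EtherType.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; names and values are NOT the kernel's definitions. */
enum sketch_family { SK_PROTO_INET, SK_PROTO_NETDEV, SK_PROTO_IPV4 };
enum sketch_hook { SK_NETDEV_INGRESS, SK_INET_PRE_ROUTING, SK_INET_INGRESS };

struct sketch_hook_key {
	enum sketch_family pf;
	enum sketch_hook hooknum;
};

/* Hypothetical helper: is this the inet ingress pseudohook? */
static bool sketch_is_inet_ingress(struct sketch_hook_key k)
{
	return k.pf == SK_PROTO_INET && k.hooknum == SK_INET_INGRESS;
}

/* Map the pseudohook onto the netdev ingress hook; other keys pass through. */
static struct sketch_hook_key sketch_map_hook(struct sketch_hook_key k)
{
	if (sketch_is_inet_ingress(k)) {
		k.pf = SK_PROTO_NETDEV;
		k.hooknum = SK_NETDEV_INGRESS;
	}
	assert(!sketch_is_inet_ingress(k)); /* the pseudohook never survives mapping */
	return k;
}

/* The hook sees all ingress traffic, so a user rule has to check for
 * IPv4 (EtherType 0x0800) or IPv6 (EtherType 0x86DD) explicitly. */
static bool sketch_is_ip(uint16_t ethertype)
{
	return ethertype == 0x0800 || ethertype == 0x86DD;
}
```

A caller would pass every registration through `sketch_map_hook()` first and apply `sketch_is_ip()` inside the hook body before treating a packet as inet traffic.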
-
Add helper function to check if this is an ingress hook.
Signed-off-by: Pablo Neira Ayuso
-
Add helper functions to increment and decrement the hook static keys.
Signed-off-by: Pablo Neira Ayuso
16 Apr, 2020
1 commit
-
nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

| In file included from ./include/linux/export.h:43,
| from ./include/linux/linkage.h:7,
| from ./include/linux/kernel.h:8,
| from net/netfilter/core.c:9:
| net/netfilter/core.c: In function ‘nf_remove_net_hook’:
| ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| *(volatile typeof(x) *)&(x) = (val); \
| ^
| net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
| WRITE_ONCE(orig_ops[i], &dummy_ops);
| ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Florian Westphal
Cc: "David S. Miller"
Reviewed-by: Nick Desaulniers
Signed-off-by: Will Deacon
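A minimal userspace sketch of the pattern this commit describes follows. `SKETCH_WRITE_ONCE` is a simplified stand-in for the kernel macro, the `sketch_*` names are hypothetical, and `__typeof__` assumes a GNU-compatible compiler such as gcc or clang. Without the `(void *)` cast, the assignment inside the macro would discard the `const` qualifier and draw the warning quoted above.

```c
#include <assert.h>

/* Simplified stand-in for the kernel macro, for illustration only. */
#define SKETCH_WRITE_ONCE(x, val) \
	do { *(volatile __typeof__(x) *)&(x) = (val); } while (0)

struct sketch_ops {
	int priority;
};

/* Const placeholder that takes over the slot of a removed hook. */
static const struct sketch_ops sketch_dummy_ops = { 0 };

/* Point slot i at the const dummy. The explicit cast to 'void *' drops the
 * 'const' qualifier on purpose, so the compiler has nothing to warn about. */
static void sketch_remove_slot(struct sketch_ops **orig_ops,
			       unsigned int n, unsigned int i)
{
	assert(i < n);
	SKETCH_WRITE_ONCE(orig_ops[i], (void *)&sketch_dummy_ops);
}
```

The slot still has a non-const pointer type afterwards, so code that reads it must treat the dummy as read-only by convention.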
17 Oct, 2019
1 commit
-
At this time, the NF_HOOK_LIST() macro iterates the list and calls
nf_hook() for each individual skb.

This makes it so the entire list is passed into the netfilter core.
The advantage is that we only need to fetch the rule blob once per list
instead of per-skb.

NF_HOOK_LIST now only works for ipv4 and ipv6, as those are the only
callers.

v2: use skb_list_del_init() instead of list_del (Edward Cree)
Signed-off-by: Florian Westphal
Acked-by: Edward Cree
Signed-off-by: Pablo Neira Ayuso
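The per-list approach can be sketched in userspace C as below. Everything here is a hypothetical stand-in (the packet type, the blob counter, the verdict function); the point is only that the rule blob is fetched once for the whole list and that each packet is unlinked before the hooks run, in the spirit of skb_list_del_init().

```c
#include <assert.h>
#include <stddef.h>

struct sketch_pkt {
	struct sketch_pkt *next;
	int drop;	/* verdict the hypothetical hooks will reach */
};

static int sketch_blob_fetches;	/* counts rule blob lookups */

static int sketch_fetch_blob(void)
{
	return ++sketch_blob_fetches;
}

/* Hypothetical verdict function: nonzero means "accept". */
static int sketch_run_hooks(int blob, const struct sketch_pkt *p)
{
	assert(p != NULL);
	(void)blob;
	return !p->drop;
}

/* Pass the whole list in, fetch the blob once, return the accepted packets. */
static struct sketch_pkt *sketch_hook_list(struct sketch_pkt *head)
{
	int blob = sketch_fetch_blob();
	struct sketch_pkt *accepted = NULL, **tail = &accepted;

	while (head) {
		struct sketch_pkt *p = head;

		head = p->next;
		p->next = NULL;		/* unlink before running the hooks */
		if (sketch_run_hooks(blob, p)) {
			*tail = p;
			tail = &p->next;
		}
	}
	return accepted;
}
```

With a per-packet design, `sketch_fetch_blob()` would run once per list element; here it runs once per call regardless of list length.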
04 Jul, 2019
1 commit
-
It's not used anywhere, so remove it.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
01 Jun, 2019
1 commit
-
This converts all remaining users and then removes skb_make_writable.
Suggested-by: Daniel Borkmann
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
15 May, 2019
1 commit
-
CONFIG_DEBUG_KERNEL should not impact code generation. Use the newly
defined CONFIG_DEBUG_MISC instead to keep the current code.
Link: http://lkml.kernel.org/r/20190413224438.10802-6-okaya@kernel.org
Signed-off-by: Sinan Kaya
Acked-by: Florian Westphal
Reviewed-by: Josh Triplett
Reviewed-by: Kees Cook
Cc: Florian Westphal
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: "David S. Miller"
Cc: Anders Roxell
Cc: Benjamin Herrenschmidt
Cc: Christophe Leroy
Cc: Chris Zankel
Cc: Greg Kroah-Hartman
Cc: James Hogan
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Mike Rapoport
Cc: Paul Burton
Cc: Paul Mackerras
Cc: Ralf Baechle
Cc: Thomas Bogendoerfer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Apr, 2019
1 commit
-
Replace NF_HOOK() based invocation of the netfilter hooks with a private
copy of nf_hook_slow().

This copy has one difference: it can return the rx handler value expected
by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS.

This is needed by the next patch to invoke the ebtables
"broute" table via the standard netfilter hooks rather than the custom
"br_should_route_hook" indirection that is used now.

When the skb is to be "brouted", we must return RX_HANDLER_PASS from the
bridge rx input handler, but there is no way to indicate this via
NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the
netfilter core or a percpu flag.

   text    data     bss     dec filename
   3369      56       0    3425 net/bridge/br_input.o.before
   3458      40       0    3498 net/bridge/br_input.o.after

This allows removal of the "br_should_route_hook" in the next patch.
Signed-off-by: Florian Westphal
Acked-by: David S. Miller
Acked-by: Nikolay Aleksandrov
Signed-off-by: Pablo Neira Ayuso
06 Jan, 2019
1 commit
-
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".
The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:

#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
# define HAVE_JUMP_LABEL
#endif

We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.

Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match the real kernel capability.
Signed-off-by: Masahiro Yamada
Acked-by: Michael Ellerman (powerpc)
Tested-by: Sedat Dilek
11 Jul, 2018
1 commit
-
This adds a global netfilter function to extract a conntrack tuple from an
skb. The function uses a new function added to nf_ct_hook, which will try
to get the tuple from skb->_nfct, and do a full lookup if that fails. This
makes it possible to use the lookup function before the skb has passed
through the conntrack init hooks (e.g., in an ingress qdisc). The tuple is
copied to the caller to avoid issues with reference counting.

The function returns false if conntrack is not loaded, allowing it to be
used without incurring a module dependency on conntrack. This is used by
the NAT mode in sch_cake.
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Toke Høiland-Jørgensen
Signed-off-by: David S. Miller
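The "returns false if conntrack is not loaded" behaviour comes from calling through an optional hook pointer. A rough userspace sketch is shown below; all `sketch_*` names are hypothetical, the RCU protection the kernel would use around the pointer is omitted, and the "full lookup" is replaced by a trivial stand-in that just copies fields from the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_tuple {
	unsigned int src, dst;
};

struct sketch_pkt {
	const struct sketch_tuple *cached;	/* set once conntrack has seen the packet */
	unsigned int src, dst;
};

struct sketch_ct_hook {
	bool (*get_tuple)(struct sketch_tuple *dst, const struct sketch_pkt *pkt);
};

/* Stays NULL until the (hypothetical) conntrack module registers itself. */
static const struct sketch_ct_hook *sketch_ct_hook_ptr;

/* Core-side entry point: usable without linking against conntrack. */
static bool sketch_get_tuple(struct sketch_tuple *dst, const struct sketch_pkt *pkt)
{
	const struct sketch_ct_hook *h = sketch_ct_hook_ptr;

	assert(dst != NULL && pkt != NULL);
	if (!h)
		return false;	/* conntrack not loaded */
	return h->get_tuple(dst, pkt);
}

/* Module-side implementation: use the attached entry, else do a lookup. */
static bool sketch_ct_get_tuple(struct sketch_tuple *dst, const struct sketch_pkt *pkt)
{
	if (pkt->cached) {
		*dst = *pkt->cached;	/* copy out, so no reference is held */
		return true;
	}
	dst->src = pkt->src;		/* stand-in for the full lookup */
	dst->dst = pkt->dst;
	return true;
}

static const struct sketch_ct_hook sketch_ct_impl = {
	.get_tuple = sketch_ct_get_tuple,
};
```

Because the tuple is copied into caller-owned storage, the caller never holds a reference to conntrack state, which matches the reference-counting rationale given in the commit message.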
24 May, 2018
1 commit
-
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, they are:

1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.

2) Add support for map lookups to numgen, random and hash expressions,
   from Laura Garcia.

3) Allow to register nat hooks for iptables and nftables at the same
   time. Patchset from Florian Westphal.

4) Timeout support for rbtree sets.

5) ip6_rpfilter needs interface for link-local addresses, from
   Vincent Bernat.

6) Add nf_ct_hook and nf_nat_hook structures and use them.

7) Do not drop packets racing to insert conntrack entries
   into hashes, this is particularly a problem in nfqueue setups.

8) Address fallout from xt_osf separation to nf_osf, patches
   from Florian Westphal and Fernando Mancera.

9) Remove reference to struct nft_af_info, which doesn't exist anymore.
   From Taehee Yoo.

This batch comes with a conflict between 25fd386e0bc0 ("netfilter:
core: add missing __rcu annotation") in your tree and 2c205dd3981f
("netfilter: add struct nf_nat_hook and use it") coming in this batch.
This conflict can be solved by leaving the __rcu tag on
__netfilter_net_init() - added by 25fd386e0bc0 - and remove all code
related to nf_nat_decode_session_hook - which is gone after
2c205dd3981f, as described by:

diff --cc net/netfilter/core.c
index e0ae4aae96f5,206fb2c4c319..168af54db975
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
  EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
  #endif /* CONFIG_NF_CONNTRACK */

- static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
 -#ifdef CONFIG_NF_NAT_NEEDED
 -void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
 -EXPORT_SYMBOL(nf_nat_decode_session_hook);
 -#endif
 -
+ static void __net_init
+ __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
  {
  	int h;

I can also merge your net-next tree into nf-next, solve the conflict and
resend the pull request if you prefer so.
====================
Signed-off-by: David S. Miller
23 May, 2018
4 commits
-
Move decode_session() and parse_nat_setup_hook() indirections to struct
nf_nat_hook.
Signed-off-by: Pablo Neira Ayuso
-
Move the nf_ct_destroy indirection to the struct nf_ct_hook.
Signed-off-by: Pablo Neira Ayuso
-
This reverts commit f92b40a8b2645
("netfilter: core: only allow one nat hook per hook point"), this
limitation is no longer needed. The nat core now invokes these
functions and makes sure that hook evaluation stops after a mapping is
created and a null binding is created otherwise.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
This will allow the nat core to reuse the nf_hook infrastructure
to maintain nat lookup functions.

The raw versions don't assume a particular hook location, the
functions get added/deleted from the hook blob that is passed to the
functions.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
08 May, 2018
1 commit
-
Removes the following sparse error:
net/netfilter/core.c:598:30: warning: incorrect type in argument 1 (different address spaces)
net/netfilter/core.c:598:30: expected struct nf_hook_entries **e
net/netfilter/core.c:598:30: got struct nf_hook_entries [noderef] **
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
10 Jan, 2018
2 commits
-
EEXIST is used for an object that already exists, with the same
name/handle. However, there is no such object here; instead there is an
object that is using the single slot that is available for NAT hooks
since patch f92b40a8b264 ("netfilter: core: only allow one nat hook per
hook point"). Let's change this return value before this behaviour gets
exposed in the first -rc.
Signed-off-by: Pablo Neira Ayuso
-
Fixes the following sparse warning:
net/netfilter/core.c:380:6: warning:
symbol '__nf_unregister_net_hook' was not declared. Should it be static?
Signed-off-by: Wei Yongjun
Signed-off-by: Pablo Neira Ayuso
09 Jan, 2018
13 commits
-
This abstraction has no clients anymore, remove it.
This is what remains from previous authors, so correct copyright
statement after recent modifications and code removal.
Signed-off-by: Pablo Neira Ayuso
-
Expand NFPROTO_INET in two hook registrations, one for NFPROTO_IPV4 and
another for NFPROTO_IPV6. Hence, we handle NFPROTO_INET from the core.
Signed-off-by: Pablo Neira Ayuso
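The expansion of one inet registration into two can be sketched as follows. This is a userspace illustration with hypothetical `sketch_*` names and a per-family counter in place of real hook storage; the rollback of the IPv4 half when the IPv6 half fails is an assumption of this sketch rather than something stated in the commit message.

```c
#include <assert.h>

enum sketch_family { SK_IPV4, SK_IPV6, SK_INET, SK_FAMILY_MAX };

/* Number of hooks registered per concrete family (stand-in for hook storage). */
static int sketch_registered[SK_FAMILY_MAX];

static int sketch_register_one(enum sketch_family pf)
{
	assert(pf == SK_IPV4 || pf == SK_IPV6);	/* only real families are stored */
	sketch_registered[pf]++;
	return 0;
}

static void sketch_unregister_one(enum sketch_family pf)
{
	assert(sketch_registered[pf] > 0);
	sketch_registered[pf]--;
}

/* The core expands the inet pseudo-family into IPv4 plus IPv6. */
static int sketch_register(enum sketch_family pf)
{
	int err;

	if (pf != SK_INET)
		return sketch_register_one(pf);

	err = sketch_register_one(SK_IPV4);
	if (err < 0)
		return err;

	err = sketch_register_one(SK_IPV6);
	if (err < 0)
		sketch_unregister_one(SK_IPV4);	/* assumed rollback of the first half */
	return err;
}
```

Nothing is ever stored under the inet family itself, which is why later bookkeeping, such as the static key handling in the neighbouring commits, has to be applied to the families behind it.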
-
So static_key_slow_dec applies to the family behind NFPROTO_INET.
Signed-off-by: Pablo Neira Ayuso
-
Instead of passing struct nf_hook_ops, this is needed by follow up
patches to handle NFPROTO_INET from the core.
Signed-off-by: Pablo Neira Ayuso
-
Just a cleanup, __nf_unregister_net_hook() is used by a follow up patch
when handling NFPROTO_INET as a real family from the core.
Signed-off-by: Pablo Neira Ayuso
-
The netfilter NAT core cannot deal with more than one NAT hook per hook
location (prerouting, input ...), because the NAT hooks install a NAT null
binding in case the iptables nat table (iptable_nat hooks) or the
corresponding nftables chain (nft nat hooks) doesn't specify a nat
transformation.

Null bindings are needed to detect port collisions between NAT-ed and
non-NAT-ed connections.

This causes nftables NAT rules to not work when iptable_nat module is
loaded, and vice versa because nat binding has already been attached
when the second nat hook is consulted.

The netfilter core is not really the correct location to handle this
(hooks are just hooks, the core has no notion of what kinds of side
effects a hook implements), but it's the only place where we can check
for conflicts between both iptables hooks and nftables hooks without
adding dependencies.

So add nat annotation to hook_ops to describe those hooks that will
add NAT bindings and then make core reject if such a hook already exists.
The annotation fills a padding hole, in case further restrictions appear
we might change this to a 'u8 type' instead of bool.

iptables error if nft nat hook active:
iptables -t nat -A POSTROUTING -j MASQUERADE
iptables v1.4.21: can't initialize iptables table `nat': File exists
Perhaps iptables or your kernel needs to be upgraded.

nftables error if iptables nat table present:
nft -f /etc/nftables/ipv4-nat
/usr/etc/nftables/ipv4-nat:3:1-2: Error: Could not process rule: File exists
table nat {
^^
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
no need to define hook points if the family isn't supported.
Because we need these hooks for either nftables, arp/ebtables
or the 'call-iptables' hack we have in the bridge layer add two
new dependencies, NETFILTER_FAMILY_{ARP,BRIDGE}, and have the
users select them.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
no need to define hook points if the family isn't supported.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Not all families share the same hook count, adjust sizes to what is
needed.

struct net before:
/* size: 6592, cachelines: 103, members: 46 */
after:
/* size: 5952, cachelines: 93, members: 46 */
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
struct net contains:
struct nf_hook_entries __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];
which stores the hook entry point locations for the various protocol
families and the hooks.

Using array results in compact c code when doing accesses, i.e.
x = rcu_dereference(net->nf.hooks[pf][hook]);

but it's also wasting a lot of memory, as most families are
not used.

So split the array into those families that are used, which
are only 5 (instead of 13). In most cases, the 'pf' argument is
constant, i.e. gcc removes switch statement.

struct net before:
/* size: 5184, cachelines: 81, members: 46 */
after:
/* size: 4672, cachelines: 73, members: 46 */
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Giuseppe Scrivano says:
"SELinux, if enabled, registers for each new network namespace 6
netfilter hooks."

Cost for this is high. With synchronize_net() removed:
"The net benefit on an SMP machine with two cores is that creating a
new network namespace takes -40% of the original time."

This patch replaces synchronize_net+kvfree with call_rcu().
We store rcu_head at the tail of a structure that has no fixed layout,
i.e. we cannot use offsetof() to compute the start of the original
allocation. Thus store this information right after the rcu head.

We could simplify this by just placing the rcu_head at the start
of struct nf_hook_entries. However, this structure is used in
packet processing hotpath, so only place what is needed for that
at the beginning of the struct.
Reported-by: Giuseppe Scrivano
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
since commit 960632ece6949b ("netfilter: convert hook list to an array")
nfqueue no longer stores a pointer to the hook that caused the packet
to be queued. Therefore no extra synchronize_net() call is needed after
dropping the packets enqueued by the old rule blob.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
This reverts commit d3ad2c17b4047
("netfilter: core: batch nf_unregister_net_hooks synchronize_net calls").

Nothing wrong with it. However, followup patch will delay freeing of hooks
with call_rcu, so all synchronize_net() calls become obsolete and there
is no need anymore for this batching.

This revert causes a temporary performance degradation when destroying
network namespace, but it's resolved with the upcoming call_rcu conversion.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
09 Sep, 2017
1 commit
-
kernel test robot reported:
WARNING: CPU: 0 PID: 1244 at net/netfilter/core.c:218 __nf_hook_entries_try_shrink+0x49/0xcd
[..]

After allowing batching in nf_unregister_net_hooks it's possible that an earlier
call to __nf_hook_entries_try_shrink already compacted the list.
If this happens we don't need to do anything.
Fixes: d3ad2c17b4047 ("netfilter: core: batch nf_unregister_net_hooks synchronize_net calls")
Reported-by: kernel test robot
Signed-off-by: Florian Westphal
Acked-by: Aaron Conole
Signed-off-by: Pablo Neira Ayuso
28 Aug, 2017
3 commits
-
re-add batching in nf_unregister_net_hooks().
Similar as before, just store an array with to-be-free'd rule arrays
on stack, then call synchronize_net once per batch.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
Make sure our grow/shrink routine places them in the correct order.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
This converts the storage and layout of netfilter hook entries from a
linked list to an array. After this commit, hook entries will be
stored adjacent in memory. The next pointer is no longer required.

The ops pointers are stored at the end of the array as they are only
used in the register/unregister path and in the legacy br_netfilter code.

nf_unregister_net_hooks() is slower than needed as it just calls
nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
calls), this will be addressed in followup patch.

Test setup:
- ixgbe 10gbit
- netperf UDP_STREAM, 64 byte packets
- 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
empty mangle and raw prerouting, mangle and filter input hooks:
353.9
this patch:
364.2
Signed-off-by: Aaron Conole
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
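The layout this commit describes, with hook entries adjacent in memory and the ops pointers parked behind them, can be sketched in userspace C like this. The `sketch_*` types and helpers are hypothetical and much simpler than the kernel structures; they only show how a single allocation holds both arrays and how the cold ops pointers are located from the end of the hot array.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_ops {
	int priority;
};

typedef int (*sketch_hookfn)(void *priv);

/* Hot-path data: only what packet processing needs. */
struct sketch_entry {
	sketch_hookfn hook;
	void *priv;
};

struct sketch_entries {
	unsigned int num;
	struct sketch_entry hooks[];	/* followed in memory by 'num' ops pointers */
};

/* One allocation: header, 'num' entries, then 'num' ops pointers. */
static struct sketch_entries *sketch_alloc(unsigned int num)
{
	size_t sz = sizeof(struct sketch_entries)
		  + num * sizeof(struct sketch_entry)
		  + num * sizeof(struct sketch_ops *);
	struct sketch_entries *e = calloc(1, sz);

	if (e)
		e->num = num;
	return e;
}

/* The ops pointers start right after the last hook entry. */
static struct sketch_ops **sketch_get_ops(struct sketch_entries *e)
{
	assert(e != NULL);
	return (struct sketch_ops **)&e->hooks[e->num];
}

/* Hot path: a linear walk over adjacent entries, no 'next' pointer chasing. */
static int sketch_run(const struct sketch_entries *e)
{
	for (unsigned int i = 0; i < e->num; i++)
		if (!e->hooks[i].hook(e->hooks[i].priv))
			return 0;	/* a hook rejected the packet */
	return 1;
}

/* Example hook: counts invocations through 'priv' and accepts. */
static int sketch_count(void *priv)
{
	++*(int *)priv;
	return 1;
}
```

Keeping the ops pointers out of `struct sketch_entry` means the traversal touches only the function pointer and its private data, while register/unregister code can still find the owning ops through `sketch_get_ops()`.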
19 Jul, 2017
1 commit
-
We accidentally return an uninitialized variable.
Fixes: cf56c2f892a8 ("netfilter: remove old pre-netns era hook api")
Signed-off-by: Dan Carpenter
Acked-by: Pablo Neira Ayuso
Signed-off-by: David S. Miller
17 Jul, 2017
1 commit
-
no more users in the tree, remove this.
The old api is racy wrt. module removal, all users have been converted
to the netns-aware api.

The old api pretended we still have global hooks but that has not been
true for a long time.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
01 May, 2017
2 commits
-
nf_unregister_net_hook(s) can avoid a second call to synchronize_net,
provided there is no nfqueue active in that net namespace (which is
the common case).

This also gets rid of the extra arg to nf_queue_nf_hook_drop(), normally
this gets called during netns cleanup so no packets should be queued.

For the rare case of base chain being unregistered or module removal
while nfqueue is in use the extra hiccup due to the packet drops isn't
a big deal.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
-
synchronize_net is expensive and slows down netns cleanup a lot.
We have two APIs to unregister a hook:
nf_unregister_net_hook (which calls synchronize_net())
and
nf_unregister_net_hooks (calls nf_unregister_net_hook in a loop)

Make nf_unregister_net_hook a wrapper around new helper
__nf_unregister_net_hook, which unlinks the hook but does not free it.

Then, we can call that helper in nf_unregister_net_hooks and then
call synchronize_net() only once.

Andrey Konovalov reports this change improves syzkaller fuzzing speed at
least twice.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
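The split into "unlink" and "wait, then free" can be sketched in userspace C as below. All names are hypothetical and `sketch_synchronize()` merely counts calls where the kernel would wait for a grace period, so the sketch shows only the call pattern, not real synchronization.

```c
#include <assert.h>
#include <stddef.h>

static int sketch_sync_calls;	/* counts the expensive waits */

/* Stand-in for the grace-period wait; here it only counts invocations. */
static void sketch_synchronize(void)
{
	sketch_sync_calls++;
}

struct sketch_hook {
	int linked;
	int freed;
};

/* Helper: unlink the hook but do not free it. */
static void sketch_unlink(struct sketch_hook *h)
{
	assert(h->linked);
	h->linked = 0;
}

static void sketch_free(struct sketch_hook *h)
{
	assert(!h->linked);	/* must be unlinked before it is freed */
	h->freed = 1;
}

/* Single-hook API: a thin wrapper around the unlink helper. */
static void sketch_unregister_hook(struct sketch_hook *h)
{
	sketch_unlink(h);
	sketch_synchronize();
	sketch_free(h);
}

/* Batch API: unlink everything, wait once, then free everything. */
static void sketch_unregister_hooks(struct sketch_hook *hooks, size_t n)
{
	size_t i;

	if (n == 0)
		return;
	for (i = 0; i < n; i++)
		sketch_unlink(&hooks[i]);
	sketch_synchronize();
	for (i = 0; i < n; i++)
		sketch_free(&hooks[i]);
}
```

Unregistering n hooks through the single-hook wrapper costs n waits, while the batch path costs one, which is the saving the commit message attributes to netns cleanup.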