28 May, 2020
1 commit
-
Conntrack dump does not support kernel-side filtering (only get exists,
but it returns only one entry, and the user has to give a full valid tuple).

It means that userspace has to implement filtering after receiving many
irrelevant entries, consuming resources (the conntrack table is sometimes
very large, much larger than a routing table for example).

This patch adds filtering on the kernel side. To achieve this goal, we:
* Add a new CTA_FILTER netlink attribute, actually a flag list to
  parameterize filtering
* Convert some *nlattr_to_tuple() functions to allow partial parsing
  of CTA_TUPLE_ORIG and CTA_TUPLE_REPLY (so nf_conntrack_tuple is not
  fully set)

Filtering is now possible on:
* IP SRC/DST values
* Ports for TCP and UDP flows
* ICMP(v6) codes, types and IDs

Filtering is done as an "AND" operator. For example, when flags
PROTO_SRC_PORT, PROTO_NUM and IP_SRC are set, only entries matching all
values are dumped.

Changes since v1:
Set NLM_F_DUMP_FILTERED in nlm flags if entries are filtered

Changes since v2:
Move several constants to nf_internals.h
Move a fix on netlink values check in a separate patch
Add a check on not-supported flags
Return EOPNOTSUPP if CTA_FILTER is set in ctnetlink_flush_conntrack
(not yet implemented)
Code style issues

Changes since v3:
Fix compilation warning reported by kbuild test robot

Changes since v4:
Fix a regression introduced in v3 (returned EINVAL for valid netlink
messages without CTA_MARK)

Changes since v5:
Change definition of CTA_FILTER_F_ALL
Fix a regression when CTA_TUPLE_ZONE is not set

Signed-off-by: Romain Bellan
Signed-off-by: Florent Fourcot
Signed-off-by: Pablo Neira Ayuso
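
The "AND" semantics described in this commit can be illustrated with a
minimal userspace sketch. The flag names, bit values and the simplified
tuple structure below are assumptions made for illustration; they are not
the kernel's CTA_FILTER_F_* definitions or struct nf_conntrack_tuple.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real filter flags are defined by the kernel. */
enum {
	FILTER_IP_SRC         = 1u << 0,
	FILTER_PROTO_NUM      = 1u << 1,
	FILTER_PROTO_SRC_PORT = 1u << 2,
};

/* Simplified stand-in for a conntrack tuple (assumption, not the kernel type). */
struct tuple {
	uint32_t ip_src;
	uint8_t  proto_num;
	uint16_t src_port;
};

/*
 * Return true only if every field selected in `flags` matches.
 * Fields whose flag is not set are ignored, so a partially filled
 * filter tuple is enough.
 */
static bool tuple_matches(const struct tuple *entry,
			  const struct tuple *filter, uint32_t flags)
{
	if ((flags & FILTER_IP_SRC) && entry->ip_src != filter->ip_src)
		return false;
	if ((flags & FILTER_PROTO_NUM) && entry->proto_num != filter->proto_num)
		return false;
	if ((flags & FILTER_PROTO_SRC_PORT) && entry->src_port != filter->src_port)
		return false;
	return true;
}
```

With only FILTER_IP_SRC and FILTER_PROTO_NUM set, an entry whose source
port differs from the filter still matches; adding FILTER_PROTO_SRC_PORT
makes the same entry fail.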
12 Apr, 2019
1 commit
-
Replace NF_HOOK() based invocation of the netfilter hooks with a private
copy of nf_hook_slow().

This copy has one difference: it can return the rx handler value expected
by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS.

This is needed by the next patch to invoke the ebtables
"broute" table via the standard netfilter hooks rather than the custom
"br_should_route_hook" indirection that is used now.

When the skb is to be "brouted", we must return RX_HANDLER_PASS from the
bridge rx input handler, but there is no way to indicate this via
NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the
netfilter core or a percpu flag.

   text    data     bss     dec filename
   3369      56       0    3425 net/bridge/br_input.o.before
   3458      40       0    3498 net/bridge/br_input.o.after

This allows removal of the "br_should_route_hook" in the next patch.
Signed-off-by: Florian Westphal
Acked-by: David S. Miller
Acked-by: Nikolay Aleksandrov
Signed-off-by: Pablo Neira Ayuso
23 May, 2018
1 commit
-
This will allow the nat core to reuse the nf_hook infrastructure
to maintain nat lookup functions.

The raw versions don't assume a particular hook location, the
functions get added/deleted from the hook blob that is passed to the
functions.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
09 Jan, 2018
1 commit
-
Since commit 960632ece6949b ("netfilter: convert hook list to an array")
nfqueue no longer stores a pointer to the hook that caused the packet
to be queued. Therefore no extra synchronize_net() call is needed after
dropping the packets enqueued by the old rule blob.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
29 Aug, 2017
1 commit
-
Remove NFDEBUG and use pr_debug() instead.
Signed-off-by: Varsha Rao
Signed-off-by: Pablo Neira Ayuso
28 Aug, 2017
1 commit
-
This converts the storage and layout of netfilter hook entries from a
linked list to an array. After this commit, hook entries will be
stored adjacent in memory. The next pointer is no longer required.

The ops pointers are stored at the end of the array as they are only
used in the register/unregister path and in the legacy br_netfilter code.

nf_unregister_net_hooks() is slower than needed as it just calls
nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
calls); this will be addressed in a followup patch.

Test setup:
- ixgbe 10gbit
- netperf UDP_STREAM, 64 byte packets
- 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
empty mangle and raw prerouting, mangle and filter input hooks:
353.9
this patch:
364.2

Signed-off-by: Aaron Conole
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
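
As a rough illustration of why an array layout helps traversal, the sketch
below walks hook entries stored contiguously, with no next pointer to
follow. All type and function names here are hypothetical and the fixed
capacity is an assumption for brevity; the kernel's actual hook entry
structures differ.

```c
#include <stddef.h>

enum verdict { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

typedef enum verdict (*hook_fn)(void *priv, int *pkt);

/* One hook entry: function plus private data, stored by value. */
struct hook_entry {
	hook_fn hook;
	void *priv;
};

/* Entries sit next to each other in memory; capacity is fixed for the sketch. */
struct hook_entries {
	size_t num_entries;
	struct hook_entry hooks[4];
};

/* Run hooks in order and stop at the first one that does not accept. */
static enum verdict run_hooks(const struct hook_entries *e, int *pkt)
{
	for (size_t i = 0; i < e->num_entries; i++) {
		if (e->hooks[i].hook(e->hooks[i].priv, pkt) != VERDICT_ACCEPT)
			return VERDICT_DROP;
	}
	return VERDICT_ACCEPT;
}

/* Example hooks: one counts visits in *pkt and accepts, one always drops. */
static enum verdict count_hook(void *priv, int *pkt)
{
	(void)priv;
	(*pkt)++;
	return VERDICT_ACCEPT;
}

static enum verdict drop_hook(void *priv, int *pkt)
{
	(void)priv;
	(void)pkt;
	return VERDICT_DROP;
}
```

Registering count_hook, drop_hook, count_hook in that order means the
traversal stops at the second entry, so the third hook never runs.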
19 Aug, 2017
1 commit
-
netfilter_queue_init() has been removed,
so we can remove its prototype as well.

Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso
01 May, 2017
1 commit
-
nf_unregister_net_hook(s) can avoid a second call to synchronize_net,
provided there is no nfqueue active in that net namespace (which is
the common case).

This also gets rid of the extra arg to nf_queue_nf_hook_drop(), normally
this gets called during netns cleanup so no packets should be queued.

For the rare case of base chain being unregistered or module removal
while nfqueue is in use the extra hiccup due to the packet drops isn't
a big deal.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
03 Nov, 2016
1 commit
-
nf_iterate() has become rather simple, we can integrate this code into
nf_hook_slow() to reduce the amount of LOC in the core path.

However, we still need nf_iterate() around for nf_queue packet handling,
so move this function there where we only need it. I think it should be
possible to refactor nf_queue code to get rid of it definitely, but
given this is slow path anyway, let's have a look at this later.

Signed-off-by: Pablo Neira Ayuso
21 Oct, 2016
1 commit
-
nf_queue handling is broken since e3b37f11e6e4 ("netfilter: replace
list_head with single linked list") for two reasons:

1) If the bypass flag is set on, there are no userspace listeners and
   we still have more hook entries to iterate over, then jump to the
   next hook. Otherwise accept the packet. On nf_reinject() path, the
   okfn() needs to be invoked.

2) We should not re-enter the same hook on packet reinjection. If the
   packet is accepted, we have to skip the current hook from where the
   packet was enqueued, otherwise the packet gets enqueued over and
   over again.

This restores the previous list_for_each_entry_continue() behaviour
happening from nf_iterate() that was dealing with these two cases.
This patch introduces a new nf_queue() wrapper function so this fix
becomes simpler.

Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
Signed-off-by: Pablo Neira Ayuso
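
Point 2 of this commit (do not re-enter the hook that queued the packet)
can be sketched in userspace as follows. This is only an illustration
under simplified assumptions: the names are hypothetical, hooks live in a
plain array, and the bypass handling from point 1 is left out. It is not
the kernel's nf_reinject() code.

```c
#include <stddef.h>

enum q_verdict { Q_ACCEPT, Q_DROP, Q_QUEUE };

typedef enum q_verdict (*q_hook_fn)(int *visits);

/*
 * Walk the hooks starting at index `start`. When a hook asks for the
 * packet to be queued, remember its index so the walk can resume later.
 */
static enum q_verdict iterate(q_hook_fn *hooks, size_t n, size_t start,
			      int *visits, size_t *queued_at)
{
	for (size_t i = start; i < n; i++) {
		enum q_verdict v = hooks[i](visits);

		if (v == Q_QUEUE) {
			*queued_at = i;
			return Q_QUEUE;
		}
		if (v != Q_ACCEPT)
			return v;
	}
	return Q_ACCEPT;
}

/*
 * Reinjection after userspace accepted the packet: resume at the hook
 * AFTER the one that queued it. Resuming at the same index would queue
 * the packet again, over and over.
 */
static enum q_verdict reinject_accepted(q_hook_fn *hooks, size_t n,
					size_t *queued_at, int *visits)
{
	return iterate(hooks, n, *queued_at + 1, visits, queued_at);
}

/* Example hooks: both count a visit; one accepts, one queues. */
static enum q_verdict pass_hook(int *visits)
{
	(*visits)++;
	return Q_ACCEPT;
}

static enum q_verdict queue_hook(int *visits)
{
	(*visits)++;
	return Q_QUEUE;
}
```

With hooks pass_hook, queue_hook, pass_hook, the first walk stops at index
1 after two visits, and the reinjection visits only the third hook.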
25 Sep, 2016
1 commit
-
The netfilter hook list never uses the prev pointer, and so can be trimmed to
be a simple singly-linked list.

In addition to having a more light weight structure for hook traversal,
struct net becomes 5568 bytes (down from 6400) and struct net_device becomes
2176 bytes (down from 2240).

Signed-off-by: Aaron Conole
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
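
The struct net saving quoted in this commit is 6400 - 5568 = 832 bytes,
which equals 104 pointers on a platform with 8-byte pointers. The sketch
below shows where such a saving comes from when each list head drops its
prev pointer. The head count of 104 is inferred from that arithmetic and is
an assumption, as are all type names.

```c
#include <stddef.h>

/* Doubly linked head: two pointers per list. */
struct dlist_head {
	struct dlist_head *next, *prev;
};

/* Singly linked head: one pointer per list. */
struct slist_head {
	struct slist_head *next;
};

/* Assumed number of hook lists, inferred from 832 / 8. */
#define NUM_HOOK_LISTS 104

struct hooks_doubly {
	struct dlist_head heads[NUM_HOOK_LISTS];
};

struct hooks_singly {
	struct slist_head heads[NUM_HOOK_LISTS];
};

/* Bytes saved by dropping the prev pointer from every list head. */
static size_t bytes_saved(void)
{
	return sizeof(struct hooks_doubly) - sizeof(struct hooks_singly);
}
```

The result is one pointer per list head, so the exact byte count depends
on the pointer size of the target.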
23 Jul, 2015
1 commit
-
This function reacquires the rtnl_lock() which is already held by
nf_unregister_hook().

This can be triggered via: modprobe nf_conntrack_ipv4 && rmmod nf_conntrack_ipv4
[ 720.628746] INFO: task rmmod:3578 blocked for more than 120 seconds.
[ 720.628749] Not tainted 4.2.0-rc2+ #113
[ 720.628752] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.628754] rmmod D ffff8800ca46fd58 0 3578 3571 0x00000080
[...]
[ 720.628783] Call Trace:
[ 720.628790] [] schedule+0x6b/0x90
[ 720.628795] [] schedule_preempt_disabled+0x13/0x20
[ 720.628799] [] mutex_lock_nested+0x1f5/0x380
[ 720.628803] [] ? rtnl_lock+0x12/0x20
[ 720.628807] [] ? rtnl_lock+0x12/0x20
[ 720.628812] [] rtnl_lock+0x12/0x20
[ 720.628817] [] nf_queue_nf_hook_drop+0x15/0x160
[ 720.628825] [] nf_unregister_net_hook+0x168/0x190
[ 720.628831] [] nf_unregister_hook+0x64/0x80
[ 720.628837] [] nf_unregister_hooks+0x20/0x30
[...]

Moreover, nf_unregister_net_hook() should only destroy the queue for this
netns, not for every netns.

Reported-by: Fengguang Wu
Fixes: 085db2c04557 ("netfilter: Per network namespace netfilter hooks.")
Signed-off-by: Pablo Neira Ayuso
Acked-by: "Eric W. Biederman"
23 Jun, 2015
1 commit
-
Add code to nf_unregister_hook to flush the nf_queue when a hook is
unregistered. This guarantees that the pointer that the nf_queue code
retains into the nf_hook list will remain valid while a packet is
queued.

I tested what would happen if we do not flush queued packets and was
trivially able to obtain the oops below. All that was required was
to stop the nf_queue listening process, to delete all of the nf_tables,
and to awaken the nf_queue listening process.

> BUG: unable to handle kernel paging request at 0000000100000001
> IP: [] 0x100000001
> PGD b9c35067 PUD 0
> Oops: 0010 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 519 Comm: lt-nfqnl_test Not tainted
> task: ffff8800b9c8c050 ti: ffff8800ba9d8000 task.ti: ffff8800ba9d8000
> RIP: 0010:[] [] 0x100000001
> RSP: 0018:ffff8800ba9dba40 EFLAGS: 00010a16
> RAX: ffff8800bab48a00 RBX: ffff8800ba9dba90 RCX: ffff8800ba9dba90
> RDX: ffff8800b9c10128 RSI: ffff8800ba940900 RDI: ffff8800bab48a00
> RBP: ffff8800b9c10128 R08: ffffffff82976660 R09: ffff8800ba9dbb28
> R10: dead000000100100 R11: dead000000200200 R12: ffff8800ba940900
> R13: ffffffff8313fd50 R14: ffff8800b9c95200 R15: 0000000000000000
> FS: 00007fb91fc34700(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000100000001 CR3: 00000000babfb000 CR4: 00000000000007f0
> Stack:
> ffffffff8206ab0f ffffffff82982240 ffff8800bab48a00 ffff8800b9c100a8
> ffff8800b9c10100 0000000000000001 ffff8800ba940900 ffff8800b9c10128
> ffffffff8206bd65 ffff8800bfb0d5e0 ffff8800bab48a00 0000000000014dc0
> Call Trace:
> [] ? nf_iterate+0x4f/0xa0
> [] ? nf_reinject+0x125/0x190
> [] ? nfqnl_recv_verdict+0x255/0x360
> [] ? nla_parse+0x80/0xf0
> [] ? nfnetlink_rcv_msg+0x13c/0x240
> [] ? __memcg_kmem_get_cache+0x4c/0x150
> [] ? nfnl_lock+0x20/0x20
> [] ? netlink_rcv_skb+0xa9/0xc0
> [] ? netlink_unicast+0x12f/0x1c0
> [] ? netlink_sendmsg+0x28e/0x650
> [] ? sock_sendmsg+0x44/0x50
> [] ? ___sys_sendmsg+0x2ab/0x2c0
> [] ? __wake_up+0x43/0x70
> [] ? tty_write+0x1c4/0x2a0
> [] ? __sys_sendmsg+0x44/0x80
> [] ? system_call_fastpath+0x12/0x6a
> Code: Bad RIP value.
> RIP [] 0x100000001
> RSP
> CR2: 0000000100000001
> ---[ end trace 08eb65d42362793f ]---

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
05 Apr, 2015
1 commit
-
Instead of passing a large number of arguments down into the nf_hook()
entry points, create a structure which carries this state down through
the hook processing layers.

This makes it so that if we want to change the types or signatures of
any of these pieces of state, there are fewer places that need to be
changed.

Signed-off-by: David S. Miller
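
The idea of carrying hook arguments in one structure can be sketched as
below. The structure, its fields and the helper names are hypothetical
stand-ins chosen for illustration; the kernel's real state structure has
its own kernel-specific members.

```c
#include <stddef.h>

/* Hypothetical stand-in for a network device. */
struct device {
	const char *name;
};

/* State bundled once by the caller and passed down by pointer. */
struct hook_state {
	unsigned int hook;
	int pf;
	const struct device *in;
	const struct device *out;
};

/* Fill the state in a single place. */
static void hook_state_init(struct hook_state *s, unsigned int hook, int pf,
			    const struct device *in, const struct device *out)
{
	s->hook = hook;
	s->pf = pf;
	s->in = in;
	s->out = out;
}

/*
 * A hook takes only the packet and the state. Adding a field to
 * struct hook_state later does not change this signature.
 * Returns 1 (accept) only when the packet has an input device.
 */
static int example_hook(const char *pkt, const struct hook_state *state)
{
	(void)pkt;
	return state->in != NULL;
}
```

Only hook_state_init() and the hooks that read a new field need to change
when the state grows; every intermediate layer keeps passing the same
pointer.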
20 Oct, 2013
1 commit
-
There is a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
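
A small example of the equivalence this commit relies on: in C, a function
declaration at file scope has external linkage whether or not extern is
written. The function name add_one is made up for the example.

```c
/* These two prototypes declare the same function with the same linkage. */
extern int add_one(int x);
int add_one(int x);

/* Single definition matching both declarations. */
int add_one(int x)
{
	return x + 1;
}
```

Both prototypes may appear in the same translation unit without conflict,
which is why dropping extern from prototypes changes nothing for callers.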
03 Sep, 2012
2 commits
-
Since 'list_for_each_continue_rcu' has already been replaced by
'list_for_each_entry_continue_rcu', passing 'list_head' to nf_queue() as a
parameter no longer benefits us.

This patch replaces 'list_head' with 'nf_hook_ops' as the parameter of
nf_queue() and __nf_queue() to save code.

Signed-off-by: Michael Wang
Signed-off-by: Pablo Neira Ayuso
-
Since 'list_for_each_continue_rcu' has already been replaced by
'list_for_each_entry_continue_rcu', passing 'list_head' to nf_iterate() as a
parameter no longer benefits us.

This patch replaces 'list_head' with 'nf_hook_ops' as the parameter of
nf_iterate() to save code.

Signed-off-by: Michael Wang
Signed-off-by: Pablo Neira Ayuso
13 May, 2010
1 commit
-
Make sure all printk messages have a severity level.
Signed-off-by: Stephen Hemminger
Signed-off-by: Patrick McHardy
08 Oct, 2008
1 commit
-
and (try to) consistently use u_int8_t for the L3 family.
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy
16 Oct, 2007
1 commit
-
With all the users of the double pointers removed, this patch mops up by
finally replacing all occurrences of sk_buff ** in the netfilter API by
sk_buff *.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
13 Feb, 2007
1 commit
-
Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller
23 Sep, 2006
1 commit
-
Handle GSO packets in nf_queue by segmenting them before queueing to
avoid breaking GSO in case they get mangled.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller
01 Jul, 2006
1 commit
-
Signed-off-by: Jörn Engel
Signed-off-by: Adrian Bunk
30 Aug, 2005
1 commit
-
This patch doesn't introduce any code changes, but merely splits the
core netfilter code into four separate files. It also moves it from
its old location in net/core/ to the recently-created net/netfilter/
directory.

Signed-off-by: Harald Welte
Signed-off-by: David S. Miller