Eric Lee / smarc-fsl-linux-kernel

13 May, 2019

1 commit

3ebb41bf4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Postpone chain policy update to drop after transaction is complete,
from Florian Westphal.

2) Add entry to flowtable after confirmation to fix UDP flows with
packets going in one single direction.

3) Reference count leak in dst object, from Taehee Yoo.

4) Check for TTL field in flowtable datapath, from Taehee Yoo.

5) Fix h323 conntrack helper due to incorrect boundary check,
from Jakub Jankowski.

6) Fix incorrect rcu dereference when fetching basechain stats,
from Florian Westphal.

7) Missing error check when adding new entries to flowtable,
from Taehee Yoo.

8) Use version field in nfnetlink message to honor the nfgen_family
field, from Kristian Evensen.

9) Remove incorrect configuration check for CONFIG_NF_CONNTRACK_IPV6,
from Subash Abhinov Kasiviswanathan.

10) Prevent dying entries from being added to the flowtable,
from Taehee Yoo.

11) Don't hit WARN_ON() with malformed blob in ebtables with
trailing data after last rule, reported by syzbot, patch
from Florian Westphal.

12) Remove NFT_CT_TIMEOUT enumeration, never used in the kernel
code.

13) Fix incorrect definition for NFT_LOGLEVEL_MAX, from Florian
Westphal.

This batch comes with a conflict that can be fixed with this patch:

diff --cc include/uapi/linux/netfilter/nf_tables.h
index 7bdb234f3d8c,f0cf7b0f4f35..505393c6e959
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@@ -966,6 -966,8 +966,7 @@@ enum nft_socket_keys
* @NFT_CT_DST_IP: conntrack layer 3 protocol destination (IPv4 address)
* @NFT_CT_SRC_IP6: conntrack layer 3 protocol source (IPv6 address)
* @NFT_CT_DST_IP6: conntrack layer 3 protocol destination (IPv6 address)
- * @NFT_CT_TIMEOUT: connection tracking timeout policy assigned to conntrack
+ * @NFT_CT_ID: conntrack id
*/
enum nft_ct_keys {
NFT_CT_STATE,
@@@ -991,6 -993,8 +992,7 @@@
NFT_CT_DST_IP,
NFT_CT_SRC_IP6,
NFT_CT_DST_IP6,
- NFT_CT_TIMEOUT,
+ NFT_CT_ID,
__NFT_CT_MAX
};
#define NFT_CT_MAX (__NFT_CT_MAX - 1)

That replaces the unused NFT_CT_TIMEOUT definition by NFT_CT_ID. If you prefer,
I can also solve this conflict here, just let me know.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-05-13 23:55:15 +0800

06 May, 2019

7 commits

8cd2bc981 netfilter: nf_flow_table: do not flow offload deleted conntrack entries ... Browse Code »

Conntrack entries can be deleted by the masquerade module. In that case,
flow offload should be deleted too, but GC and data-path of flow offload
do not check for conntrack status bits, hence flow offload entries will
be removed only by the timeout.

Update garbage collector and data-path to check for ct->status. If
IPS_DYING_BIT is set, garbage collector removes flow offload entries and
data-path routine ignores them.

Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso

Taehee Yoo
2019-05-06 21:15:09 +0800
b33c448c4 netfilter: nf_conntrack_h323: Remove deprecated config check ... Browse Code »

CONFIG_NF_CONNTRACK_IPV6 has been deprecated so replace it with a check
for IPV6 instead.

Use nf_ip6_route6() instead of v6ops->route() and keep the IS_MODULE()
in nf_ipv6_ops as mentioned by Florian so that direct calls are used
when IPV6 is builtin and indirect calls are used only when IPV6 is a
module.

Fixes: a0ae2562c6c4b2 ("netfilter: conntrack: remove l3proto abstraction")
Signed-off-by: Subash Abhinov Kasiviswanathan
Reviewed-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Subash Abhinov Kasiviswanathan
2019-05-06 21:15:09 +0800
f8e608982 netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression ... Browse Code »

Commit 59c08c69c278 ("netfilter: ctnetlink: Support L3 protocol-filter
on flush") introduced a user-space regression when flushing connection
track entries. Before this commit, the nfgen_family field was not used
by the kernel and all entries were removed. Since this commit,
nfgen_family is used to filter out entries that should not be removed.
One example a broken tool is conntrack. conntrack always sets
nfgen_family to AF_INET, so after 59c08c69c278 only IPv4 entries were
removed with the -F parameter.

Pablo Neira Ayuso suggested using nfgenmsg->version to resolve the
regression, and this commit implements his suggestion. nfgenmsg->version
is so far set to zero, so it is well-suited to be used as a flag for
selecting old or new flush behavior. If version is 0, nfgen_family is
ignored and all entries are used. If user-space sets the version to one
(or any other value than 0), then the new behavior is used. As version
only can have two valid values, I chose not to add a new
NFNETLINK_VERSION-constant.

Fixes: 59c08c69c278 ("netfilter: ctnetlink: Support L3 protocol-filter on flush")
Reported-by: Nicolas Dichtel
Suggested-by: Pablo Neira Ayuso
Signed-off-by: Kristian Evensen
Tested-by: Nicolas Dichtel
Signed-off-by: Pablo Neira Ayuso

Kristian Evensen
2019-05-06 21:15:02 +0800
eabb47821 netfilter: xt_hashlimit: use struct_size() helper ... Browse Code »

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace code of the following form:

sizeof(struct xt_hashlimit_htable) + sizeof(struct hlist_head) * size

with:

struct_size(hinfo, hash, size)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva
Signed-off-by: Pablo Neira Ayuso

Gustavo A. R. Silva
2019-05-06 07:03:04 +0800
43c8f1311 netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast ... Browse Code »

rhashtable_insert_fast() may return an error value when memory
allocation fails, but flow_offload_add() does not check for errors.
This patch just adds missing error checking.

Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure")
Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso

Taehee Yoo
2019-05-06 06:36:30 +0800
edbd82c5f netfilter: nf_tables: fix base chain stat rcu_dereference usage ... Browse Code »

Following splat gets triggered when nfnetlink monitor is running while
xtables-nft selftests are running:

net/netfilter/nf_tables_api.c:1272 suspicious rcu_dereference_check() usage!
other info that might help us debug this:

1 lock held by xtables-nft-mul/27006:
#0: 00000000e0f85be9 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1a/0x50
Call Trace:
nf_tables_fill_chain_info.isra.45+0x6cc/0x6e0
nf_tables_chain_notify+0xf8/0x1a0
nf_tables_commit+0x165c/0x1740

nf_tables_fill_chain_info() can be called both from dumps (rcu read locked)
or from the transaction path if a userspace process subscribed to nftables
notifications.

In the 'table dump' case, rcu_access_pointer() cannot be used: We do not
hold transaction mutex so the pointer can be NULLed right after the check.
Just unconditionally fetch the value, then have the helper return
immediately if its NULL.

In the notification case we don't hold the rcu read lock, but updates are
prevented due to transaction mutex. Use rcu_dereference_check() to make lockdep
aware of this.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-05-06 06:36:29 +0800
f5e85ce8e netfilter: nf_conntrack_h323: restore boundary check correctness ... Browse Code »

Since commit bc7d811ace4a ("netfilter: nf_ct_h323: Convert
CHECK_BOUND macro to function"), NAT traversal for H.323
doesn't work, failing to parse H323-UserInformation.
nf_h323_error_boundary() compares contents of the bitstring,
not the addresses, preventing valid H.323 packets from being
conntrack'd.

This looks like an oversight from when CHECK_BOUND macro was
converted to a function.

To fix it, stop dereferencing bs->cur and bs->end.

Fixes: bc7d811ace4a ("netfilter: nf_ct_h323: Convert CHECK_BOUND macro to function")
Signed-off-by: Jakub Jankowski
Signed-off-by: Pablo Neira Ayuso

Jakub Jankowski
2019-05-06 06:36:17 +0800

30 Apr, 2019

12 commits

1de6f3342 netfilter: connlabels: fix spelling mistake "trackling" -> "tracking" ... Browse Code »

There is a spelling mistake in the module description. Fix this.

Signed-off-by: Colin Ian King
Reviewed-by: Mukesh Ojha
Signed-off-by: Pablo Neira Ayuso

Colin Ian King
2019-04-30 20:19:57 +0800
3087c3f7c netfilter: nft_ct: Add ct id support ... Browse Code »

The 'id' key returns the unique id of the conntrack entry as returned
by nf_ct_get_id().

Signed-off-by: Brett Mastbergen
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Brett Mastbergen
2019-04-30 20:19:57 +0800
53b11308a netfilter: nf_nat: register NAT helpers. ... Browse Code »

Register amanda, ftp, irc, sip and tftp NAT helpers.

Signed-off-by: Flavio Leitner
Signed-off-by: Pablo Neira Ayuso

Flavio Leitner
2019-04-30 20:19:55 +0800
08010a216 netfilter: add API to manage NAT helpers. ... Browse Code »

The API allows a conntrack helper to indicate its corresponding
NAT helper which then can be loaded and reference counted.

Signed-off-by: Flavio Leitner
Signed-off-by: Pablo Neira Ayuso

Flavio Leitner
2019-04-30 20:19:55 +0800
e1f172e16 netfilter: use macros to create module aliases. ... Browse Code »

Each NAT helper creates a module alias which follows a pattern.
Use macros for consistency.

Signed-off-by: Flavio Leitner
Signed-off-by: Pablo Neira Ayuso

Flavio Leitner
2019-04-30 20:19:54 +0800
8f14c99c7 netfilter: conntrack: limit sysctl setting for boolean options ... Browse Code »

We use the zero and one to limit the boolean options setting.
After this patch we only set 0 or 1 to boolean options for nf
conntrack sysctl.

Signed-off-by: Tonghao Zhang
Signed-off-by: Pablo Neira Ayuso

Tonghao Zhang
2019-04-30 20:18:56 +0800
33cc3c0cf netfilter: nf_flow_table: check ttl value in flow offload data path ... Browse Code »

nf_flow_offload_ip_hook() and nf_flow_offload_ipv6_hook() do not check
ttl value. So, ttl value overflow may occur.

Fixes: 97add9f0d66d ("netfilter: flow table support for IPv4")
Fixes: 0995210753a2 ("netfilter: flow table support for IPv6")
Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso

Taehee Yoo
2019-04-30 19:56:19 +0800
26a302afb netfilter: nf_flow_table: fix netdev refcnt leak ... Browse Code »

flow_offload_alloc() calls nf_route() to get a dst_entry. Internally,
nf_route() calls ip_route_output_key() that allocates a dst_entry and
holds it. So, a dst_entry should be released by dst_release() if
nf_route() is successful.

Otherwise, netns exit routine cannot be finished and the following
message is printed:

[ 257.490952] unregister_netdevice: waiting for lo to become free. Usage count = 1

Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure")
Signed-off-by: Taehee Yoo
Signed-off-by: Pablo Neira Ayuso

Taehee Yoo
2019-04-30 19:55:00 +0800
270a8a297 netfilter: nft_flow_offload: add entry to flowtable after confirmation ... Browse Code »

This is fixing flow offload for UDP traffic where packets only follow
one single direction.

The flow_offload_fixup_tcp() mechanism works fine in case that the
offloaded entry remains in SYN_RECV state, given sequence tracking is
reset and that conntrack handles syn+ack packets as a retransmission, ie.

sES + synack => sIG

for reply traffic.

Fixes: a3c90f7a2323 ("netfilter: nf_tables: flow offload expression")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2019-04-30 19:48:23 +0800
66293c46c netfilter: nf_tables: delay chain policy update until transaction is complete ... Browse Code »

When we process a long ruleset of the form

chain input {
type filter hook input priority filter; policy drop;
...
}

Then the base chain gets registered early on, we then continue to
process/validate the next messages coming in the same transaction.

Problem is that if the base chain policy is 'drop', it will take effect
immediately, which causes all traffic to get blocked until the
transaction completes or is aborted.

Fix this by deferring the policy until the transaction has been
processed and all of the rules have been flagged as active.

Reported-by: Jann Haber
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-30 19:47:32 +0800
c5f1931f6 netfilter: nf_tables: fix implicit include of module.h ... Browse Code »

This file clearly uses modular infrastructure but does not call
out the inclusion of explicitly. We add that
include explicitly here, so we can tidy up some header usage
elsewhere w/o causing build breakage.

Signed-off-by: Paul Gortmaker
Signed-off-by: Pablo Neira Ayuso

Paul Gortmaker
2019-04-30 19:35:26 +0800
f1f86d09c netfilter: nf_tables: relocate header content to consumer ... Browse Code »

The nf_tables.h header is used in a lot of files, but it turns out
that there is only one actual user of nft_expr_clone().

Hence we relocate that function to be with the one consumer of it
and avoid having to process it with CPP for all the other files.

This will also enable a reduction in the other headers that the
nf_tables.h itself has to include just to be stand-alone, hence
a pending further significant reduction in the CPP content that
needs to get processed for each netfilter file.

Note that the explicit "inline" has been dropped as part of this
relocation. In similar changes to this, I believe Dave has asked
this be done, so we free up gcc to make the choice of whether to
inline or not.

Signed-off-by: Paul Gortmaker
Signed-off-by: Pablo Neira Ayuso

Paul Gortmaker
2019-04-30 19:35:17 +0800

28 Apr, 2019

4 commits

ef6243acb genetlink: optionally validate strictly/dumps ... Browse Code »

Add options to strictly validate messages and dump messages,
sometimes perhaps validating dump messages non-strictly may
be required, so add an option for that as well.

Since none of this can really be applied to existing commands,
set the options everwhere using the following spatch:

@@
identifier ops;
expression X;
@@
struct genl_ops ops[] = {
...,
{
.cmd = X,
+ .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
...
},
...
};

For new commands one should just not copy the .validate 'opt-out'
flags and thus get strict validation.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2019-04-28 05:07:22 +0800
8cb081746 netlink: make validation more configurable for future strictness ... Browse Code »

We currently have two levels of strict validation:

1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted

Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size

The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().

Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.

We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated

Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)

@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.

Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.

Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.

In effect then, this adds fully strict validation for any new command.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2019-04-28 05:07:21 +0800
12ad5f65f ipset: drop ipset_nest_start() and ipset_nest_end() ... Browse Code »

After the previous commit, both ipset_nest_start() and ipset_nest_end() are
just aliases for nla_nest_start() and nla_nest_end() so that there is no
need to keep them.

Signed-off-by: Michal Kubecek
Acked-by: Jozsef Kadlecsik
Signed-off-by: David S. Miller

Michal Kubecek
2019-04-28 05:03:44 +0800
ae0be8de9 netlink: make nla_nest_start() add NLA_F_NESTED flag ... Browse Code »

Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.

Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().

Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:

@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)

@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)

Signed-off-by: Michal Kubecek
Acked-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller

Michal Kubecek
2019-04-28 05:03:44 +0800

26 Apr, 2019

1 commit

8b4483658 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Two easy cases of overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2019-04-26 11:52:29 +0800

22 Apr, 2019

2 commits

d48668052 netfilter: fix nf_l4proto_log_invalid to log invalid packets ... Browse Code »

It doesn't log a packet if sysctl_log_invalid isn't equal to protonum
OR sysctl_log_invalid isn't equal to IPPROTO_RAW. This sentence is
always true. I believe we need to replace OR to AND.

Cc: Florian Westphal
Fixes: c4f3db1595827 ("netfilter: conntrack: add and use nf_l4proto_log_invalid")
Signed-off-by: Andrei Vagin
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Andrei Vagin
2019-04-22 16:38:50 +0800
916f6efae netfilter: never get/set skb->tstamp ... Browse Code »

setting net.netfilter.nf_conntrack_timestamp=1 breaks xmit with fq
scheduler. skb->tstamp might be "refreshed" using ktime_get_real(),
but fq expects CLOCK_MONOTONIC.

This patch removes all places in netfilter that check/set skb->tstamp:

1. To fix the bogus "start" time seen with conntrack timestamping for
outgoing packets, never use skb->tstamp and always use current time.
2. In nfqueue and nflog, only use skb->tstamp for incoming packets,
as determined by current hook (prerouting, input, forward).
3. xt_time has to use system clock as well rather than skb->tstamp.
We could still use skb->tstamp for prerouting/input/foward, but
I see no advantage to make this conditional.

Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Cc: Eric Dumazet
Reported-by: Michal Soltys
Signed-off-by: Florian Westphal
Acked-by: Eric Dumazet
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-22 16:34:30 +0800

15 Apr, 2019

3 commits

5bdac418f netfilter: nat: fix icmp id randomization ... Browse Code »

Sven Auhagen reported that a 2nd ping request will fail if 'fully-random'
mode is used.

Reason is that if no proto information is given, min/max are both 0,
so we set the icmp id to 0 instead of chosing a random value between
0 and 65535.

Update test case as well to catch this, without fix this yields:
[..]
ERROR: cannot ping ns1 from ns2 with ip masquerade fully-random (attempt 2)
ERROR: cannot ping ns1 from ns2 with ipv6 masquerade fully-random (attempt 2)

... becaus 2nd ping clashes with existing 'id 0' icmp conntrack and gets
dropped.

Fixes: 203f2e78200c27e ("netfilter: nat: remove l4proto->unique_tuple")
Reported-by: Sven Auhagen
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-15 13:31:50 +0800
33d1c0181 netfilter: nf_tables: prevent shift wrap in nft_chain_parse_hook() ... Browse Code »

I believe that "hook->num" can be up to UINT_MAX. Shifting more than
31 bits would is undefined in C but in practice it would lead to shift
wrapping. That would lead to an array overflow in nf_tables_addchain():

ops->hook = hook.type->hooks[ops->hooknum];

Fixes: fe19c04ca137 ("netfilter: nf_tables: remove nhooks field from struct nft_af_info")
Signed-off-by: Dan Carpenter
Signed-off-by: Pablo Neira Ayuso

Dan Carpenter
2019-04-15 13:31:50 +0800
3c7910763 netfilter: ctnetlink: don't use conntrack/expect object addresses as id ... Browse Code »

else, we leak the addresses to userspace via ctnetlink events
and dumps.

Compute an ID on demand based on the immutable parts of nf_conn struct.

Another advantage compared to using an address is that there is no
immediate re-use of the same ID in case the conntrack entry is freed and
reallocated again immediately.

Fixes: 3583240249ef ("[NETFILTER]: nf_conntrack_expect: kill unique ID")
Fixes: 7f85f914721f ("[NETFILTER]: nf_conntrack: kill unique ID")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-15 13:31:44 +0800

13 Apr, 2019

3 commits

0261ea1bd ipvs: do not schedule icmp errors from tunnels ... Browse Code »

We can receive ICMP errors from client or from
tunneling real server. While the former can be
scheduled to real server, the latter should
not be scheduled, they are decapsulated only when
existing connection is found.

Fixes: 6044eeffafbe ("ipvs: attempt to schedule icmp packets")
Signed-off-by: Julian Anastasov
Signed-off-by: Simon Horman
Signed-off-by: Pablo Neira Ayuso

Julian Anastasov
2019-04-13 20:52:57 +0800
8176c8332 netfilter: conntrack: initialize ct->timeout ... Browse Code »

KMSAN started reporting an error when accessing ct->timeout for the
first time without initialization:

BUG: KMSAN: uninit-value in __nf_ct_refresh_acct+0x1ae/0x470 net/netfilter/nf_conntrack_core.c:1765
...
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x131/0x2a0 mm/kmsan/kmsan.c:624
__msan_warning+0x7a/0xf0 mm/kmsan/kmsan_instr.c:310
__nf_ct_refresh_acct+0x1ae/0x470 net/netfilter/nf_conntrack_core.c:1765
nf_ct_refresh_acct ./include/net/netfilter/nf_conntrack.h:201
nf_conntrack_udp_packet+0xb44/0x1040 net/netfilter/nf_conntrack_proto_udp.c:122
nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1605
nf_conntrack_in+0x1250/0x26c9 net/netfilter/nf_conntrack_core.c:1696
...
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:205
kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:159
kmsan_kmalloc+0xa9/0x130 mm/kmsan/kmsan_hooks.c:173
kmem_cache_alloc+0x554/0xb10 mm/slub.c:2789
__nf_conntrack_alloc+0x16f/0x690 net/netfilter/nf_conntrack_core.c:1342
init_conntrack+0x6cb/0x2490 net/netfilter/nf_conntrack_core.c:1421

Signed-off-by: Alexander Potapenko
Fixes: cc16921351d8ba1 ("netfilter: conntrack: avoid same-timeout update")
Cc: Florian Westphal
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Alexander Potapenko
2019-04-13 20:52:57 +0800
1025ce752 netfilter: conntrack: don't set related state for different outer address ... Browse Code »

Luca Moro says:
------
The issue lies in the filtering of ICMP and ICMPv6 errors that include an
inner IP datagram.
For these packets, icmp_error_message() extract the ICMP error and inner
layer to search of a known state.
If a state is found the packet is tagged as related (IP_CT_RELATED).

The problem is that there is no correlation check between the inner and
outer layer of the packet.
So one can encapsulate an error with an inner layer matching a known state,
while its outer layer is directed to a filtered host.
In this case the whole packet will be tagged as related.
This has various implications from a rule bypass (if a rule to related
trafic is allow), to a known state oracle.

Unfortunately, we could not find a real statement in a RFC on how this case
should be filtered.
The closest we found is RFC5927 (Section 4.3) but it is not very clear.

A possible fix would be to check that the inner IP source is the same than
the outer destination.

We believed this kind of attack was not documented yet, so we started to
write a blog post about it.
You can find it attached to this mail (sorry for the extract quality).
It contains more technical details, PoC and discussion about the identified
behavior.
We discovered later that
https://www.gont.com.ar/papers/filtering-of-icmp-error-messages.pdf
described a similar attack concept in 2004 but without the stateful
filtering in mind.
-----

This implements above suggested fix:
In icmp(v6) error handler, take outer destination address, then pass
that into the common function that does the "related" association.

After obtaining the nf_conn of the matching inner-headers connection,
check that the destination address of the opposite direction tuple
is the same as the outer address and only set RELATED if thats the case.

Reported-by: Luca Moro
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-13 20:52:57 +0800

12 Apr, 2019

4 commits

971502d77 bridge: netfilter: unroll NF_HOOK helper in bridge input path ... Browse Code »

Replace NF_HOOK() based invocation of the netfilter hooks with a private
copy of nf_hook_slow().

This copy has one difference: it can return the rx handler value expected
by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS.

This is needed by the next patch to invoke the ebtables
"broute" table via the standard netfilter hooks rather than the custom
"br_should_route_hook" indirection that is used now.

When the skb is to be "brouted", we must return RX_HANDLER_PASS from the
bridge rx input handler, but there is no way to indicate this via
NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the
netfilter core or a percpu flag.

text data bss dec filename
3369 56 0 3425 net/bridge/br_input.o.before
3458 40 0 3498 net/bridge/br_input.o.after

This allows removal of the "br_should_route_hook" in the next patch.

Signed-off-by: Florian Westphal
Acked-by: David S. Miller
Acked-by: Nikolay Aleksandrov
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-12 07:47:39 +0800
610a43149 netfilter: nf_nat_masquerade: unify ipv4/6 notifier registration ... Browse Code »

Only reason for having two different register functions was because of
ipt_MASQUERADE and ip6t_MASQUERADE being two different modules.

Previous patch merged those into xt_MASQUERADE, so we can merge this too.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-12 02:59:34 +0800
adf82accc netfilter: x_tables: merge ip and ipv6 masquerade modules ... Browse Code »

No need to have separate modules for this.
before:
text data bss dec filename
2038 1168 0 3206 net/ipv4/netfilter/ipt_MASQUERADE.ko
1526 1024 0 2550 net/ipv6/netfilter/ip6t_MASQUERADE.ko
after:
text data bss dec filename
2521 1296 0 3817 net/netfilter/xt_MASQUERADE.ko

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-12 02:59:29 +0800
bf8981a2a netfilter: nf_nat: merge ip/ip6 masquerade headers ... Browse Code »

Both are now implemented by nf_nat_masquerade.c, so no need to keep
different headers.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-12 02:59:21 +0800

09 Apr, 2019

3 commits

3b0a081db netfilter: make two functions static ... Browse Code »

They have no external callers anymore.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-09 05:28:33 +0800
22c7652cd netfilter: nft_osf: Add version option support ... Browse Code »

Add version option support to the nftables "osf" expression.

Signed-off-by: Fernando Fernandez Mancera
Signed-off-by: Pablo Neira Ayuso

Fernando Fernandez Mancera
2019-04-09 05:27:12 +0800
63ce3940f netfilter: nft_redir: add inet support ... Browse Code »

allows to redirect both ipv4 and ipv6 with a single rule in an
inet nat table.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-09 05:03:01 +0800