Eric Lee / smarc-fsl-linux-kernel

23 Sep, 2020

1 commit

3ab0a7a0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
moving another local variable and removing it's
initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
One pretty prints the port mode differently, whilst another
changes the driver to try and obtain the port mode from
the port node rather than the switch node.

Signed-off-by: David S. Miller

David S. Miller
2020-09-23 07:45:34 +0800

08 Sep, 2020

2 commits

6c0d95d12 netfilter: ctnetlink: fix mark based dump filtering regression ... Browse Code »

conntrack mark based dump filtering may falsely skip entries if a mask
is given: If the mask-based check does not filter out the entry, the
else-if check is always true and compares the mark without considering
the mask. The if/else-if logic seems wrong.

Given that the mask during filter setup is implicitly set to 0xffffffff
if not specified explicitly, the mark filtering flags seem to just
complicate things. Restore the previously used approach by always
matching against a zero mask is no filter mark is given.

Fixes: cb8aa9a3affb ("netfilter: ctnetlink: add kernel side filtering for dump")
Signed-off-by: Martin Willi
Signed-off-by: Pablo Neira Ayuso

Martin Willi
2020-09-08 19:04:51 +0800
1cc5ef91d netfilter: ctnetlink: add a range check for l3/l4 protonum ... Browse Code »

The indexes to the nf_nat_l[34]protos arrays come from userspace. So
check the tuple's family, e.g. l3num, when creating the conntrack in
order to prevent an OOB memory access during setup. Here is an example
kernel panic on 4.14.180 when userspace passes in an index greater than
NFPROTO_NUMPROTO.

Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:...
Process poc (pid: 5614, stack limit = 0x00000000a3933121)
CPU: 4 PID: 5614 Comm: poc Tainted: G S W O 4.14.180-g051355490483
Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM
task: 000000002a3dfffe task.stack: 00000000a3933121
pc : __cfi_check_fail+0x1c/0x24
lr : __cfi_check_fail+0x1c/0x24
...
Call trace:
__cfi_check_fail+0x1c/0x24
name_to_dev_t+0x0/0x468
nfnetlink_parse_nat_setup+0x234/0x258
ctnetlink_parse_nat_setup+0x4c/0x228
ctnetlink_new_conntrack+0x590/0xc40
nfnetlink_rcv_msg+0x31c/0x4d4
netlink_rcv_skb+0x100/0x184
nfnetlink_rcv+0xf4/0x180
netlink_unicast+0x360/0x770
netlink_sendmsg+0x5a0/0x6a4
___sys_sendmsg+0x314/0x46c
SyS_sendmsg+0xb4/0x108
el0_svc_naked+0x34/0x38

This crash is not happening since 5.4+, however, ctnetlink still
allows for creating entries with unsupported layer 3 protocol number.

Fixes: c1d10adb4a521 ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
Signed-off-by: Will McVicker
[pablo@netfilter.org: rebased original patch on top of nf.git]
Signed-off-by: Pablo Neira Ayuso

Will McVicker
2020-09-08 17:41:43 +0800

29 Aug, 2020

2 commits

bc9247041 netfilter: conntrack: add clash resolution stat counter ... Browse Code »

There is a misconception about what "insert_failed" means.

We increment this even when a clash got resolved, so it might not indicate
a problem.

Add a dedicated counter for clash resolution and only increment
insert_failed if a clash cannot be resolved.

For the old /proc interface, export this in place of an older stat
that got removed a while back.
For ctnetlink, export this with a new attribute.

Also correct an outdated comment that implies we add a duplicate tuple --
we only add the (unique) reply direction.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-08-29 01:51:26 +0800
4afc41dfa netfilter: conntrack: remove ignore stats ... Browse Code »

This counter increments when nf_conntrack_in sees a packet that already
has a conntrack attached or when the packet is marked as UNTRACKED.
Neither is an error.

The former is normal for loopback traffic. The second happens for
certain ICMPv6 packets or when nftables/ip(6)tables rules are in place.

In case someone needs to count UNTRACKED packets, or packets
that are marked as untracked before conntrack_in this can be done with
both nftables and ip(6)tables rules.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2020-08-29 01:51:26 +0800

11 Jun, 2020

1 commit

6c2d2176a netfilter: ctnetlink: memleak in filter initialization error path ... Browse Code »

Release the filter object in case of error.

Fixes: cb8aa9a3affb ("netfilter: ctnetlink: add kernel side filtering for dump")
Reported-by: syzbot+38b8b548a851a01793c5@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2020-06-11 01:33:34 +0800

28 May, 2020

1 commit

cb8aa9a3a netfilter: ctnetlink: add kernel side filtering for dump ... Browse Code »

Conntrack dump does not support kernel side filtering (only get exists,
but it returns only one entry. And user has to give a full valid tuple)

It means that userspace has to implement filtering after receiving many
irrelevant entries, consuming resources (conntrack table is sometimes
very huge, much more than a routing table for example).

This patch adds filtering in kernel side. To achieve this goal, we:

* Add a new CTA_FILTER netlink attributes, actually a flag list to
parametize filtering
* Convert some *nlattr_to_tuple() functions, to allow a partial parsing
of CTA_TUPLE_ORIG and CTA_TUPLE_REPLY (so nf_conntrack_tuple it not
fully set)

Filtering is now possible on:
* IP SRC/DST values
* Ports for TCP and UDP flows
* IMCP(v6) codes types and IDs

Filtering is done as an "AND" operator. For example, when flags
PROTO_SRC_PORT, PROTO_NUM and IP_SRC are sets, only entries matching all
values are dumped.

Changes since v1:
Set NLM_F_DUMP_FILTERED in nlm flags if entries are filtered

Changes since v2:
Move several constants to nf_internals.h
Move a fix on netlink values check in a separate patch
Add a check on not-supported flags
Return EOPNOTSUPP if CDA_FILTER is set in ctnetlink_flush_conntrack
(not yet implemented)
Code style issues

Changes since v3:
Fix compilation warning reported by kbuild test robot

Changes since v4:
Fix a regression introduced in v3 (returned EINVAL for valid netlink
messages without CTA_MARK)

Changes since v5:
Change definition of CTA_FILTER_F_ALL
Fix a regression when CTA_TUPLE_ZONE is not set

Signed-off-by: Romain Bellan
Signed-off-by: Florent Fourcot
Signed-off-by: Pablo Neira Ayuso

Romain Bellan
2020-05-28 04:20:34 +0800

30 Mar, 2020

1 commit

7c6b41216 netfilter: ctnetlink: be more strict when NF_CONNTRACK_MARK is not set ... Browse Code »

When CONFIG_NF_CONNTRACK_MARK is not set, any CTA_MARK or CTA_MARK_MASK
in netlink message are not supported. We should return an error when one
of them is set, not both

Fixes: 9306425b70bf ("netfilter: ctnetlink: must check mark attributes vs NULL")
Signed-off-by: Romain Bellan
Signed-off-by: Florent Fourcot
Signed-off-by: Pablo Neira Ayuso

Romain Bellan
2020-03-30 08:05:36 +0800

28 Mar, 2020

1 commit

19f8f717f netfilter: ctnetlink: Add missing annotation for ctnetlink_parse_nat_setup() ... Browse Code »

Sparse reports a warning at ctnetlink_parse_nat_setup()

warning: context imbalance in ctnetlink_parse_nat_setup()
- unexpected unlock

The root cause is the missing annotation at ctnetlink_parse_nat_setup()
Add the missing __must_hold(RCU) annotation

Signed-off-by: Jules Irenge
Signed-off-by: Pablo Neira Ayuso

Jules Irenge
2020-03-28 01:21:09 +0800

29 Nov, 2019

1 commit

18a110b02 netfilter: ctnetlink: netns exit must wait for callbacks ... Browse Code »

Curtis Taylor and Jon Maxwell reported and debugged a crash on 3.10
based kernel.

Crash occurs in ctnetlink_conntrack_events because net->nfnl socket is
NULL. The nfnl socket was set to NULL by netns destruction running on
another cpu.

The exiting network namespace calls the relevant destructors in the
following order:

1. ctnetlink_net_exit_batch

This nulls out the event callback pointer in struct netns.

2. nfnetlink_net_exit_batch

This nulls net->nfnl socket and frees it.

3. nf_conntrack_cleanup_net_list

This removes all remaining conntrack entries.

This is order is correct. The only explanation for the crash so ar is:

cpu1: conntrack is dying, eviction occurs:
-> nf_ct_delete()
-> nf_conntrack_event_report \
-> nf_conntrack_eventmask_report
-> notify->fcn() (== ctnetlink_conntrack_events).

cpu1: a. fetches rcu protected pointer to obtain ctnetlink event callback.
b. gets interrupted.
cpu2: runs netns exit handlers:
a runs ctnetlink destructor, event cb pointer set to NULL.
b runs nfnetlink destructor, nfnl socket is closed and set to NULL.
cpu1: c. resumes and trips over NULL net->nfnl.

Problem appears to be that ctnetlink_net_exit_batch only prevents future
callers of nf_conntrack_eventmask_report() from obtaining the callback.
It doesn't wait of other cpus that might have already obtained the
callbacks address.

I don't see anything in upstream kernels that would prevent similar
crash: We need to wait for all cpus to have exited the event callback.

Fixes: 9592a5c01e79dbc59eb56fa ("netfilter: ctnetlink: netns support")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-11-29 15:59:34 +0800

17 Oct, 2019

1 commit

49ca022bc netfilter: ctnetlink: don't dump ct extensions of unconfirmed conntracks ... Browse Code »

When dumping the unconfirmed lists, the cpu that is processing the ct
entry can reallocate ct->ext at any time.

Right now accessing the extensions from another CPU is ok provided
we're holding rcu read lock: extension reallocation does use rcu.

Once RCU isn't used anymore this becomes unsafe, so skip extensions for
the unconfirmed list.

Dumping the extension area for confirmed or dying conntracks is fine:
no reallocations are allowed and list iteration holds appropriate
locks that prevent ct (and this ct->ext) from getting free'd.

v2: fix compiler warnings due to misue of 'const' and missing return
statement (kbuild robot).

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-10-17 17:46:51 +0800

04 Sep, 2019

1 commit

b067fa009 netfilter: ctnetlink: honor IPS_OFFLOAD flag ... Browse Code »

If this flag is set, timeout and state are irrelevant to userspace.

Fixes: 90964016e5d3 ("netfilter: nf_conntrack: add IPS_OFFLOAD status bit")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2019-09-04 04:55:41 +0800

16 Jul, 2019

1 commit

3c00fb0bf netfilter: nf_conntrack_sip: fix expectation clash ... Browse Code »

When conntracks change during a dialog, SDP messages may be sent from
different conntracks to establish expects with identical tuples. In this
case expects conflict may be detected for the 2nd SDP message and end up
with a process failure.

The fixing here is to reuse an existing expect who has the same tuple for a
different conntrack if any.

Here are two scenarios for the case.

1)
SERVER CPE

| INVITE SDP |
5060 ||5060
| 183 SDP |
5060 |---------------------->|5060 ===> Conntrack 1
| PRACK |
50601 ||5060
| 200 OK (INVITE) |
5060 |---------------------->|5060
| ACK |
50601 ||
| |
| INVITE SDP (t38) |
50601 |---------------------->|5060 ===> Conntrack 2

With a certain configuration in the CPE, SIP messages "183 with SDP" and
"re-INVITE with SDP t38" will go through the sip helper to create
expects for RTP and RTCP.

It is okay to create RTP and RTCP expects for "183", whose master
connection source port is 5060, and destination port is 5060.

In the "183" message, port in Contact header changes to 50601 (from the
original 5060). So the following requests e.g. PRACK and ACK are sent to
port 50601. It is a different conntrack (let call Conntrack 2) from the
original INVITE (let call Conntrack 1) due to the port difference.

In this example, after the call is established, there is RTP stream but no
RTCP stream for Conntrack 1, so the RTP expect created upon "183" is
cleared, and RTCP expect created for Conntrack 1 retains.

When "re-INVITE with SDP t38" arrives to create RTP&RTCP expects, current
ALG implementation will call nf_ct_expect_related() for RTP and RTCP. The
expects tuples are identical to those for Conntrack 1. RTP expect for
Conntrack 2 succeeds in creation as the one for Conntrack 1 has been
removed. RTCP expect for Conntrack 2 fails in creation because it has
idential tuples and 'conflict' with the one retained for Conntrack 1. And
then result in a failure in processing of the re-INVITE.

2)

SERVER A CPE

| REGISTER |
5060 | CT1
| 200 |
5060 |------------------>| 5060
| |
| INVITE SDP(1) |
5060 || 5060 SERVER B
| ACK |
5060 || 5060 ==> CT2
| 100 |
5060 || 50601 ==> CT3
| |
||
| |
| BYE |
5060 || 50601
| INVITE SDP(3) |
5060 | CT1

CPE sends an INVITE request(1) to Server A, and creates a RTP&RTCP expect
pair for this Conntrack 1 (CT1). Server A responds 300 to redirect to
Server B. The RTP&RTCP expect pairs created on CT1 are removed upon 300
response.

CPE sends the INVITE request(2) to Server B, and creates an expect pair
for the new conntrack (due to destination address difference), let call
CT2. Server B changes the port to 50601 in 200 OK response, and the
following requests ACK and BYE from CPE are sent to 50601. The call is
established. There is RTP stream and no RTCP stream. So RTP expect is
removed and RTCP expect for CT2 retains.

As BYE request is sent from port 50601, it is another conntrack, let call
CT3, different from CT2 due to the port difference. So the BYE request will
not remove the RTCP expect for CT2.

Then another outgoing call is made, with the same RTP port being used (not
definitely but possibly). CPE firstly sends the INVITE request(3) to Server
A, and tries to create a RTP&RTCP expect pairs for this CT1. In current ALG
implementation, the RTCP expect for CT1 fails in creation because it
'conflicts' with the residual one for CT2. As a result the INVITE request
fails to send.

Signed-off-by: xiao ruizhu
Signed-off-by: Pablo Neira Ayuso

xiao ruizhu
2019-07-16 19:16:59 +0800

26 Jun, 2019

1 commit

e7600865d netfilter: ctnetlink: Fix regression in conntrack entry deletion ... Browse Code »

Commit f8e608982022 ("netfilter: ctnetlink: Resolve conntrack
L3-protocol flush regression") introduced a regression in which deletion
of conntrack entries would fail because the L3 protocol information
is replaced by AF_UNSPEC. As a result the search for the entry to be
deleted would turn up empty due to the tuple used to perform the search
is now different from the tuple used to initially set up the entry.

For flushing the conntrack table we do however want to keep the option
for nfgenmsg->version to have a non-zero value to allow for newer
user-space tools to request treatment under the new behavior. With that
it is possible to independently flush tables for a defined L3 protocol.
This was introduced with the enhancements in in commit 59c08c69c278
("netfilter: ctnetlink: Support L3 protocol-filter on flush").

Older user-space tools will retain the behavior of flushing all tables
regardless of defined L3 protocol.

Fixes: f8e608982022 ("netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression")
Suggested-by: Pablo Neira Ayuso
Signed-off-by: Felix Kaechele
Signed-off-by: Pablo Neira Ayuso

Felix Kaechele
2019-06-26 17:07:50 +0800

13 May, 2019

1 commit

3ebb41bf4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Postpone chain policy update to drop after transaction is complete,
from Florian Westphal.

2) Add entry to flowtable after confirmation to fix UDP flows with
packets going in one single direction.

3) Reference count leak in dst object, from Taehee Yoo.

4) Check for TTL field in flowtable datapath, from Taehee Yoo.

5) Fix h323 conntrack helper due to incorrect boundary check,
from Jakub Jankowski.

6) Fix incorrect rcu dereference when fetching basechain stats,
from Florian Westphal.

7) Missing error check when adding new entries to flowtable,
from Taehee Yoo.

8) Use version field in nfnetlink message to honor the nfgen_family
field, from Kristian Evensen.

9) Remove incorrect configuration check for CONFIG_NF_CONNTRACK_IPV6,
from Subash Abhinov Kasiviswanathan.

10) Prevent dying entries from being added to the flowtable,
from Taehee Yoo.

11) Don't hit WARN_ON() with malformed blob in ebtables with
trailing data after last rule, reported by syzbot, patch
from Florian Westphal.

12) Remove NFT_CT_TIMEOUT enumeration, never used in the kernel
code.

13) Fix incorrect definition for NFT_LOGLEVEL_MAX, from Florian
Westphal.

This batch comes with a conflict that can be fixed with this patch:

diff --cc include/uapi/linux/netfilter/nf_tables.h
index 7bdb234f3d8c,f0cf7b0f4f35..505393c6e959
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@@ -966,6 -966,8 +966,7 @@@ enum nft_socket_keys
* @NFT_CT_DST_IP: conntrack layer 3 protocol destination (IPv4 address)
* @NFT_CT_SRC_IP6: conntrack layer 3 protocol source (IPv6 address)
* @NFT_CT_DST_IP6: conntrack layer 3 protocol destination (IPv6 address)
- * @NFT_CT_TIMEOUT: connection tracking timeout policy assigned to conntrack
+ * @NFT_CT_ID: conntrack id
*/
enum nft_ct_keys {
NFT_CT_STATE,
@@@ -991,6 -993,8 +992,7 @@@
NFT_CT_DST_IP,
NFT_CT_SRC_IP6,
NFT_CT_DST_IP6,
- NFT_CT_TIMEOUT,
+ NFT_CT_ID,
__NFT_CT_MAX
};
#define NFT_CT_MAX (__NFT_CT_MAX - 1)

That replaces the unused NFT_CT_TIMEOUT definition by NFT_CT_ID. If you prefer,
I can also solve this conflict here, just let me know.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-05-13 23:55:15 +0800

06 May, 2019

1 commit

f8e608982 netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression ... Browse Code »

Commit 59c08c69c278 ("netfilter: ctnetlink: Support L3 protocol-filter
on flush") introduced a user-space regression when flushing connection
track entries. Before this commit, the nfgen_family field was not used
by the kernel and all entries were removed. Since this commit,
nfgen_family is used to filter out entries that should not be removed.
One example a broken tool is conntrack. conntrack always sets
nfgen_family to AF_INET, so after 59c08c69c278 only IPv4 entries were
removed with the -F parameter.

Pablo Neira Ayuso suggested using nfgenmsg->version to resolve the
regression, and this commit implements his suggestion. nfgenmsg->version
is so far set to zero, so it is well-suited to be used as a flag for
selecting old or new flush behavior. If version is 0, nfgen_family is
ignored and all entries are used. If user-space sets the version to one
(or any other value than 0), then the new behavior is used. As version
only can have two valid values, I chose not to add a new
NFNETLINK_VERSION-constant.

Fixes: 59c08c69c278 ("netfilter: ctnetlink: Support L3 protocol-filter on flush")
Reported-by: Nicolas Dichtel
Suggested-by: Pablo Neira Ayuso
Signed-off-by: Kristian Evensen
Tested-by: Nicolas Dichtel
Signed-off-by: Pablo Neira Ayuso

Kristian Evensen
2019-05-06 21:15:02 +0800

28 Apr, 2019

2 commits

8cb081746 netlink: make validation more configurable for future strictness ... Browse Code »

We currently have two levels of strict validation:

1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted

Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size

The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().

Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.

We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated

Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)

@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.

Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.

Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.

In effect then, this adds fully strict validation for any new command.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2019-04-28 05:07:21 +0800
ae0be8de9 netlink: make nla_nest_start() add NLA_F_NESTED flag ... Browse Code »

Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.

Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().

Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:

@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)

@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)

Signed-off-by: Michal Kubecek
Acked-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller

Michal Kubecek
2019-04-28 05:03:44 +0800

26 Apr, 2019

1 commit

8b4483658 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Two easy cases of overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2019-04-26 11:52:29 +0800

15 Apr, 2019

1 commit

3c7910763 netfilter: ctnetlink: don't use conntrack/expect object addresses as id ... Browse Code »

else, we leak the addresses to userspace via ctnetlink events
and dumps.

Compute an ID on demand based on the immutable parts of nf_conn struct.

Another advantage compared to using an address is that there is no
immediate re-use of the same ID in case the conntrack entry is freed and
reallocated again immediately.

Fixes: 3583240249ef ("[NETFILTER]: nf_conntrack_expect: kill unique ID")
Fixes: 7f85f914721f ("[NETFILTER]: nf_conntrack: kill unique ID")
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-15 13:31:44 +0800

09 Apr, 2019

1 commit

4806e9757 netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT) ... Browse Code »

NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is
enabled. Now that the af-specific nat configuration switches have been
removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-04-09 05:02:52 +0800

27 Feb, 2019

1 commit

d2c5c103b netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h ... Browse Code »

The l3proto name is gone, its header file is the last trace.
While at it, also remove nf_nat_core.h, its very small and all users
include nf_nat.h too.

before:
text data bss dec hex filename
22948 1612 4136 28696 7018 nf_nat.ko

after removal of l3proto register/unregister functions:
text data bss dec hex filename
22196 1516 4136 27848 6cc8 nf_nat.ko

checkpatch complains about overly long lines, but line breaks
do not make things more readable and the line length gets smaller
here, not larger.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-02-27 17:54:08 +0800

12 Feb, 2019

1 commit

48ab807c7 netfilter: conntrack: fix indentation issue ... Browse Code »

A statement in an if block is not indented correctly. Fix this.

Signed-off-by: Colin Ian King
Signed-off-by: Pablo Neira Ayuso

Colin Ian King
2019-02-12 07:39:39 +0800

18 Jan, 2019

1 commit

4a60dc748 netfilter: conntrack: remove nf_ct_l4proto_find_get ... Browse Code »

Its now same as __nf_ct_l4proto_find(), so rename that to
nf_ct_l4proto_find and use it everywhere.

It never returns NULL and doesn't need locks or reference counts.

Before this series:
302824 net/netfilter/nf_conntrack.ko
21504 net/netfilter/nf_conntrack_proto_gre.ko

text data bss dec hex filename
6281 1732 4 8017 1f51 nf_conntrack_proto_gre.ko
108356 20613 236 129205 1f8b5 nf_conntrack.ko

After:
294864 net/netfilter/nf_conntrack.ko
text data bss dec hex filename
106979 19557 240 126776 1ef38 nf_conntrack.ko

so, even with builtin gre, total size got reduced.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-01-18 22:02:34 +0800

18 Dec, 2018

1 commit

5cbabeec1 netfilter: nat: remove nf_nat_l4proto struct ... Browse Code »

This removes the (now empty) nf_nat_l4proto struct, all its instances
and all the no longer needed runtime (un)register functionality.

nf_nat_need_gre() can be axed as well: the module that calls it (to
load the no-longer-existing nat_gre module) also calls other nat core
functions. GRE nat is now always available if kernel is built with it.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-12-18 06:33:31 +0800

12 Nov, 2018

1 commit

58fc419be netfilter: ctnetlink: always honor CTA_MARK_MASK ... Browse Code »

Useful to only set a particular range of the conntrack mark while
leaving existing parts of the value alone, e.g. when updating
conntrack marks via netlink from userspace.

For NFQUEUE it was already implemented in commit 534473c6080e
("netfilter: ctnetlink: honor CTA_MARK_MASK when setting ctmark").

This now adds the same functionality also for the other netlink
conntrack mark changes.

Signed-off-by: Andreas Jaggi
Signed-off-by: Pablo Neira Ayuso

Andreas Jaggi
2018-11-12 17:25:00 +0800

21 Sep, 2018

2 commits

9306425b7 netfilter: ctnetlink: must check mark attributes vs NULL ... Browse Code »

else we will oops (null deref) when the attributes aren't present.

Also add back the EOPNOTSUPP in case MARK filtering is requested but
kernel doesn't support it.

Fixes: 59c08c69c2788 ("netfilter: ctnetlink: Support L3 protocol-filter on flush")
Reported-by: syzbot+e45eda8eda6e93a03959@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-09-21 16:14:46 +0800
dd2934a95 netfilter: conntrack: remove l3->l4 mapping information ... Browse Code »

l4 protocols are demuxed by l3num, l4num pair.

However, almost all l4 trackers are l3 agnostic.

Only exceptions are:
- gre, icmp (ipv4 only)
- icmpv6 (ipv6 only)

This commit gets rid of the l3 mapping, l4 trackers can now be looked up
by their IPPROTO_XXX value alone, which gets rid of the additional l3
indirection.

For icmp, ipcmp6 and gre, add a check on state->pf and
return -NF_ACCEPT in case we're asked to track e.g. icmpv6-in-ipv4,
this seems more fitting than using the generic tracker.

Additionally we can kill the 2nd l4proto definitions that were needed
for v4/v6 split -- they are now the same so we can use single l4proto
struct for each protocol, rather than two.

The EXPORT_SYMBOLs can be removed as all these object files are
part of nf_conntrack with no external references.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-09-21 00:07:35 +0800

17 Sep, 2018

1 commit

59c08c69c netfilter: ctnetlink: Support L3 protocol-filter on flush ... Browse Code »

The same connection mark can be set on flows belonging to different
address families. This commit adds support for filtering on the L3
protocol when flushing connection track entries. If no protocol is
specified, then all L3 protocols match.

In order to avoid code duplication and a redundant check, the protocol
comparison in ctnetlink_dump_table() has been removed. Instead, a filter
is created if the GET-message triggering the dump contains an address
family. ctnetlink_filter_match() is then used to compare the L3
protocols.

Signed-off-by: Kristian Evensen
Signed-off-by: Pablo Neira Ayuso

Kristian Evensen
2018-09-17 18:04:14 +0800

17 Aug, 2018

1 commit

3e673b23b netfilter: fix memory leaks on netlink_dump_start error ... Browse Code »

Shaochun Chen points out we leak dumper filter state allocations
stored in dump_control->data in case there is an error before netlink sets
cb_running (after which ->done will be called at some point).

In order to fix this, add .start functions and move allocations there.

Same pattern as used in commit 90fd131afc565159c9e0ea742f082b337e10f8c6
("netfilter: nf_tables: move dumper state allocation into ->start").

Reported-by: shaochun chen
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-08-17 01:37:00 +0800

18 Jul, 2018

1 commit

440534d3c netfilter: Remove useless param helper of nf_ct_helper_ext_add ... Browse Code »

The param helper of nf_ct_helper_ext_add is useless now, then remove
it now.

Signed-off-by: Gao Feng
Signed-off-by: Pablo Neira Ayuso

Gao Feng
2018-07-18 17:26:42 +0800

16 Jul, 2018

1 commit

f957be9d3 netfilter: conntrack: remove ctnetlink callbacks from l3 protocol trackers ... Browse Code »

handle everything from ctnetlink directly.

After all these years we still only support ipv4 and ipv6, so it
seems reasonable to remove l3 protocol tracker support and instead
handle ipv4/ipv6 from a common, always builtin inet tracker.

Step 1: Get rid of all the l3proto->func() calls.

Start with ctnetlink, then move on to packet-path ones.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-07-16 23:54:58 +0800

13 Jun, 2018

1 commit

c05a45c08 netfilter: ctnetlink: avoid null pointer dereference ... Browse Code »

Dan Carpenter points out that deref occurs after NULL check, we should
re-fetch the pointer and check that instead.

Fixes: 2c205dd3981f7 ("netfilter: add struct nf_nat_hook and use it")
Reported-by: Dan Carpenter
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2018-06-13 01:31:07 +0800

23 May, 2018

1 commit

2c205dd39 netfilter: add struct nf_nat_hook and use it ... Browse Code »

Move decode_session() and parse_nat_setup_hook() indirections to struct
nf_nat_hook structure.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2018-05-23 15:26:07 +0800

07 May, 2018

1 commit

538c5672b netfilter: ctnetlink: export nf_conntrack_max ... Browse Code »

IPCTNL_MSG_CT_GET_STATS netlink command allow to monitor current number
of conntrack entries. However, if one wants to compare it with the
maximum (and detect exhaustion), the only solution is currently to read
sysctl value.

This patch add nf_conntrack_max value in netlink message, and simplify
monitoring for application built on netlink API.

Signed-off-by: Florent Fourcot
Signed-off-by: Pablo Neira Ayuso

Florent Fourcot
2018-05-07 06:04:02 +0800

30 Mar, 2018

2 commits

d162190bd Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next
tree. This batch comes with more input sanitization for xtables to
address bug reports from fuzzers, preparation works to the flowtable
infrastructure and assorted updates. In no particular order, they are:

1) Make sure userspace provides a valid standard target verdict, from
Florian Westphal.

2) Sanitize error target size, also from Florian.

3) Validate that last rule in basechain matches underflow/policy since
userspace assumes this when decoding the ruleset blob that comes
from the kernel, from Florian.

4) Consolidate hook entry checks through xt_check_table_hooks(),
patch from Florian.

5) Cap ruleset allocations at 512 mbytes, 134217728 rules and reject
very large compat offset arrays, so we have a reasonable upper limit
and fuzzers don't exercise the oom-killer. Patches from Florian.

6) Several WARN_ON checks on xtables mutex helper, from Florian.

7) xt_rateest now has a hashtable per net, from Cong Wang.

8) Consolidate counter allocation in xt_counters_alloc(), from Florian.

9) Earlier xt_table_unlock() call in {ip,ip6,arp,eb}tables, patch
from Xin Long.

10) Set FLOW_OFFLOAD_DIR_* to IP_CT_DIR_* definitions, patch from
Felix Fietkau.

11) Consolidate code through flow_offload_fill_dir(), also from Felix.

12) Inline ip6_dst_mtu_forward() just like ip_dst_mtu_maybe_forward()
to remove a dependency with flowtable and ipv6.ko, from Felix.

13) Cache mtu size in flow_offload_tuple object, this is safe for
forwarding as f87c10a8aa1e describes, from Felix.

14) Rename nf_flow_table.c to nf_flow_table_core.o, to simplify too
modular infrastructure, from Felix.

15) Add rt0, rt2 and rt4 IPv6 routing extension support, patch from
Ahmed Abdelsalam.

16) Remove unused parameter in nf_conncount_count(), from Yi-Hung Wei.

17) Support for counting only to nf_conncount infrastructure, patch
from Yi-Hung Wei.

18) Add strict NFT_CT_{SRC_IP,DST_IP,SRC_IP6,DST_IP6} key datatypes
to nft_ct.

19) Use boolean as return value from ipt_ah and from IPVS too, patch
from Gustavo A. R. Silva.

20) Remove useless parameters in nfnl_acct_overquota() and
nf_conntrack_broadcast_help(), from Taehee Yoo.

21) Use ipv6_addr_is_multicast() from xt_cluster, also from Taehee Yoo.

22) Statify nf_tables_obj_lookup_byhandle, patch from Fengguang Wu.

23) Fix typo in xt_limit, from Geert Uytterhoeven.

24) Do no use VLAs in Netfilter code, again from Gustavo.

25) Use ADD_COUNTER from ebtables, from Taehee Yoo.

26) Bitshift support for CONNMARK and MARK targets, from Jack Ma.

27) Use pr_*() and add pr_fmt(), from Arushi Singhal.

28) Add synproxy support to ctnetlink.

29) ICMP type and IGMP matching support for ebtables, patches from
Matthias Schiffer.

30) Support for the revision infrastructure to ebtables, from
Bernie Harris.

31) String match support for ebtables, also from Bernie.

32) Documentation for the new flowtable infrastructure.

33) Use generic comparison functions in ebt_stp, from Joe Perches.

34) Demodularize filter chains in nftables.

35) Register conntrack hooks in case nftables NAT chain is added.

36) Merge assignments with return in a couple of spots in the
Netfilter codebase, also from Arushi.

37) Document that xtables percpu counters are stored in the same
memory area, from Ben Hutchings.

38) Revert mark_source_chains() sanity checks that break existing
rulesets, from Florian Westphal.

39) Use is_zero_ether_addr() in the ipset codebase, from Joe Perches.
====================

Signed-off-by: David S. Miller

David S. Miller
2018-03-30 23:41:18 +0800
c47d36b38 netfilter: Merge assignment with return ... Browse Code »

Merge assignment with return statement to directly return the value.

Signed-off-by: Arushi Singhal
Signed-off-by: Pablo Neira Ayuso

Arushi Singhal
2018-03-30 17:44:27 +0800

28 Mar, 2018

1 commit

2f635ceeb net: Drop pernet_operations::async ... Browse Code »

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-28 01:18:09 +0800

20 Mar, 2018

1 commit

20710b3b8 netfilter: ctnetlink: synproxy support ... Browse Code »

This patch exposes synproxy information per-conntrack. Moreover, send
sequence adjustment events once server sends us the SYN,ACK packet, so
we can synchronize the sequence adjustment too for packets going as
reply from the server, as part of the synproxy logic.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2018-03-20 21:39:31 +0800

05 Mar, 2018

1 commit

b04a3d098 net: Convert ctnetlink_net_ops ... Browse Code »

These pernet_operations register and unregister
two conntrack notifiers, and they seem to be safe
to be executed in parallel.

General/not related to async pernet_operations JFI:
ctnetlink_net_exit_batch() actions are grouped in batch,
and this could look like there is synchronize_rcu()
is forgotten. But there is synchronize_rcu() on module
exit patch (in ctnetlink_exit()), so this batch may
be reworked as simple .exit method.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-05 23:48:28 +0800