28 Apr, 2019
2 commits
-
We currently have two levels of strict validation:
1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected acceptedSplit out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute sizeThe default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecatedUsing spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.In effect then, this adds fully strict validation for any new command.
Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller -
Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)Signed-off-by: Michal Kubecek
Acked-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller
23 Apr, 2019
1 commit
-
The header contains rtnh_ macros so rename the file accordingly.
Allows a later patch to use the nexthop.h name for the new
nexthop code.Signed-off-by: David Ahern
Signed-off-by: David S. Miller
09 Apr, 2019
2 commits
-
Add support for an IPv6 gateway to rtable. Since a gateway is either
IPv4 or IPv6, make it a union with rt_gw4 where rt_gw_family decides
which address is in use.When dumping the route data, encode an ipv6 nexthop using RTA_VIA.
Signed-off-by: David Ahern
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller -
To allow the gateway to be either an IPv4 or IPv6 address, remove
rt_uses_gateway from rtable and replace with rt_gw_family. If
rt_gw_family is set it implies rt_uses_gateway. Rename rt_gateway
to rt_gw4 to represent the IPv4 version.Signed-off-by: David Ahern
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller
30 Mar, 2019
1 commit
-
The number of stubs is growing and has nothing to do with addrconf.
Move the definition of the stubs to a separate header file and update
users. In the move, drop the vxlan specific comment before ipv6_stub.Code move only; no functional change intended.
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
20 Mar, 2019
1 commit
-
This patch adds support for 6PE (RFC 4798) which uses IPv4-mapped IPv6
nexthop to connect IPv6 islands over IPv4 only MPLS network core.Prior to this fix, to find the link-layer destination mac address, 6PE
enabled host/router was sending IPv6 ND requests for IPv4-mapped IPv6
nexthop address over the interface facing the IPv4 only core which
wouldn't success as the core is IPv6 free.This fix changes that behavior on 6PE host to treat the nexthop as IPv4
address and send ARP requests whenever the next-hop address is an
IPv4-mapped IPv6 address.Below topology illustrates the issue and how the patch addresses it.
abcd::1.1.1.1 (lo) abcd::2.2.2.2 (lo)
R0 (PE/host)------------------------R1--------------------------------R2 (PE/host)
eth1 eth2 eth3 eth4
172.18.0.10 172.18.0.11 172.19.0.11 172.19.0.12
ffff::172.18.0.10 ffff::172.19.0.12
R0 and R2 act as 6PE routers of IPv6 islands. R1 is IPv4 only with MPLS tunnels
between R0,R1 and R1,R2.docker exec r0 ip -f inet6 route add abcd::2.2.2.2/128 nexthop encap mpls 100 via ::ffff:172.18.0.11 dev eth1
docker exec r2 ip -f inet6 route add abcd::1.1.1.1/128 nexthop encap mpls 200 via ::ffff:172.19.0.11 dev eth4docker exec r1 ip -f mpls route add 100 via inet 172.19.0.12 dev eth3
docker exec r1 ip -f mpls route add 200 via inet 172.18.0.10 dev eth2With the change, when R0 sends an IPv6 packet over MPLS tunnel to abcd::2.2.2.2,
using ::ffff:172.18.0.11 as the nexthop, it does neighbor discovery for
172.18.18.0.11.Signed-off-by: Vinay K Nallamothu
Tested-by: Avinash Lingala
Tested-by: Aravind Srinivas Srinivasa Prabhakar
Signed-off-by: David S. Miller
03 Mar, 2019
1 commit
27 Feb, 2019
1 commit
-
MPLS does not support nexthops with an MPLS address family.
Specifically, it does not handle RTA_GATEWAY attribute. Make it
clear by returning an error.Fixes: 03c0566542f4c ("mpls: Netlink commands to add, remove, and dump routes")
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
09 Feb, 2019
1 commit
-
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:struct foo {
int stuff;
struct boo entry[];
};instance = alloc(sizeof(struct foo) + count * sizeof(struct boo));
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:instance = alloc(struct_size(instance, entry, count));
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva
Signed-off-by: David S. Miller
20 Jan, 2019
2 commits
-
Make RTM_GETNETCONF's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller -
Make RTM_GETROUTE's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller
16 Oct, 2018
4 commits
-
Update the dump request parsing in MPLS for the non-INET case to
enable kernel side filtering. If INET is disabled the only filters
that make sense for MPLS are protocol and nexthop device.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
Update parsing of route dump request to enable kernel side filtering.
Allow filtering results by protocol (e.g., which routing daemon installed
the route), route type (e.g., unicast), table id and nexthop device. These
amount to the low hanging fruit, yet a huge improvement, for dumping
routes.ip_valid_fib_dump_req is called with RTNL held, so __dev_get_by_index can
be used to look up the device index without taking a reference. From
there filter->dev is only used during dump loops with the lock still held.Set NLM_F_DUMP_FILTERED in the answer_flags so the user knows the results
have been filtered should no entries be returned.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
Implement kernel side filtering of routes by egress device index and
protocol. MPLS uses only a single table and route type.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
Add struct fib_dump_filter for options on limiting which routes are
returned in a dump request. The current list is table id, protocol,
route type, rtm_flags and nexthop device index. struct net is needed
to lookup the net_device from the index.Declare the filter for each route dump handler and plumb the new
arguments from dump handlers to ip_valid_fib_dump_req.Signed-off-by: David Ahern
Signed-off-by: David S. Miller
11 Oct, 2018
1 commit
-
Without CONFIG_INET enabled compiles fail with:
net/mpls/af_mpls.o: In function `mpls_dump_routes':
af_mpls.c:(.text+0xed0): undefined reference to `ip_valid_fib_dump_req'The preference is for MPLS to use the same handler as ipv4 and ipv6
to allow consistency when doing a dump for AF_UNSPEC which walks
all address families invoking the route dump handler. If INET is
disabled then fallback to an MPLS version which can be tighter on
the data checks.Fixes: e8ba330ac0c5 ("rtnetlink: Update fib dumps for strict data checking")
Reported-by: Randy Dunlap
Reported-by: Arnd Bergmann
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
09 Oct, 2018
3 commits
-
Update inet_netconf_dump_devconf, inet6_netconf_dump_devconf, and
mpls_netconf_dump_devconf for strict data checking. If the flag is set,
the dump request is expected to have an netconfmsg struct as the header.
The struct only has the family member and no attributes can be appended.Signed-off-by: David Ahern
Acked-by: Christian Brauner
Signed-off-by: David S. Miller -
Add helper to check netlink message for route dumps. If the strict flag
is set the dump request is expected to have an rtmsg struct as the header.
All elements of the struct are expected to be 0 with the exception of
rtm_flags (which is used by both ipv4 and ipv6 dumps) and no attributes
can be appended. rtm_flags can only have RTM_F_CLONED and RTM_F_PREFIX
set.Update inet_dump_fib, inet6_dump_fib, mpls_dump_routes, ipmr_rtm_dumproute,
and ip6mr_rtm_dumproute to call this helper if strict data checking is
enabled.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.Signed-off-by: David Ahern
Acked-by: Christian Brauner
Signed-off-by: David S. Miller
25 Sep, 2018
1 commit
-
Summary:
This appears to be necessary and sufficient change to enable `MPLS` on
`ip6gre` tunnels (RFC4023).This diff allows IP6GRE devices to be recognized by MPLS kernel module
and hence user can configure interface to accept packets with mpls
headers as well setup mpls routes on them.Test Plan:
Test plan consists of multiple containers connected via GRE-V6 tunnel.
Then carrying out testing steps as below.- Carry out necessary sysctl settings on all containers
```
sysctl -w net.mpls.platform_labels=65536
sysctl -w net.mpls.ip_ttl_propagate=1
sysctl -w net.mpls.conf.lo.input=1
```- Establish IP6GRE tunnels
```
ip -6 tunnel add name if_1_2_1 mode ip6gre \
local 2401:db00:21:6048:feed:0::1 \
remote 2401:db00:21:6048:feed:0::2 key 1
ip link set dev if_1_2_1 up
sysctl -w net.mpls.conf.if_1_2_1.input=1
ip -4 addr add 169.254.0.2/31 dev if_1_2_1 scope linkip -6 tunnel add name if_1_3_1 mode ip6gre \
local 2401:db00:21:6048:feed:0::1 \
remote 2401:db00:21:6048:feed:0::3 key 1
ip link set dev if_1_3_1 up
sysctl -w net.mpls.conf.if_1_3_1.input=1
ip -4 addr add 169.254.0.4/31 dev if_1_3_1 scope link
```- Install MPLS encap rules on node-1 towards node-2
```
ip route add 192.168.0.11/32 nexthop encap mpls 32/64 \
via inet 169.254.0.3 dev if_1_2_1
```- Install MPLS forwarding rules on node-2 and node-3
```
// node2
ip -f mpls route add 32 via inet 169.254.0.7 dev if_2_4_1// node3
ip -f mpls route add 64 via inet 169.254.0.12 dev if_4_3_1
```- Ping 192.168.0.11 (node4) from 192.168.0.1 (node1) (where routing
towards 192.168.0.1 is via IP route directly towards node1 from node4)
```
ping 192.168.0.11
```- tcpdump on interface to capture ping packets wrapped within MPLS
header which inturn wrapped within IP6GRE header```
16:43:41.121073 IP6
2401:db00:21:6048:feed::1 > 2401:db00:21:6048:feed::2:
DSTOPT GREv0, key=0x1, length 100:
MPLS (label 32, exp 0, ttl 255) (label 64, exp 0, [S], ttl 255)
IP 192.168.0.1 > 192.168.0.11:
ICMP echo request, id 1208, seq 45, length 640x0000: 6000 2cdb 006c 3c3f 2401 db00 0021 6048 `.,..l
Signed-off-by: David S. Miller
25 Jul, 2018
1 commit
-
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
05 Mar, 2018
1 commit
-
If you take a GSO skb, and split it into packets, will the network
length (L3 headers + L4 headers + payload) of those packets be small
enough to fit within a given MTU?skb_gso_validate_mtu gives you the answer to that question. However,
we recently added to add a way to validate the MAC length of a split GSO
skb (L2+L3+L4+payload), and the names get confusing, so rename
skb_gso_validate_mtu to skb_gso_validate_network_lenSigned-off-by: Daniel Axtens
Reviewed-by: Marcelo Ricardo Leitner
Signed-off-by: David S. Miller
09 Feb, 2018
1 commit
-
mpls_label_ok() validates that the 'platform_label' array index from a
userspace netlink message payload is valid. Under speculation the
mpls_label_ok() result may not resolve in the CPU pipeline until after
the index is used to access an array element. Sanitize the index to zero
to prevent userspace-controlled arbitrary out-of-bounds speculation, a
precursor for a speculative execution side channel vulnerability.Cc:
Cc: "David S. Miller"
Cc: Eric W. Biederman
Signed-off-by: Dan Williams
Signed-off-by: David S. Miller
05 Dec, 2017
1 commit
-
all of these can be compiled as a module, so use new
_module version to make sure module can no longer be removed
while callback/dump is in use.Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
04 Nov, 2017
1 commit
-
Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.Signed-off-by: David S. Miller
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
13 Oct, 2017
1 commit
-
When af_mpls is built-in but the tunnel support is a module,
we get a link failure:net/mpls/af_mpls.o: In function `mpls_init':
af_mpls.c:(.init.text+0xdc): undefined reference to `ip_tunnel_encap_add_ops'This adds a Kconfig statement to prevent the broken
configuration and force mpls to be a module as well in
this case.Fixes: bdc476413dcd ("ip_tunnel: add mpls over gre support")
Signed-off-by: Arnd Bergmann
Acked-by: Amine Kherbouche
Signed-off-by: David S. Miller
12 Oct, 2017
1 commit
-
The function ipgre_mpls_encap_hlen is local to the source and
does not need to be in global scope, so make it static.Cleans up sparse warning:
symbol 'ipgre_mpls_encap_hlen' was not declared. Should it be static?Fixes: bdc476413dcdb ("ip_tunnel: add mpls over gre support")
Signed-off-by: Colin Ian King
Acked-by: David Ahern
Signed-off-by: David S. Miller
08 Oct, 2017
1 commit
-
This commit introduces the MPLSoGRE support (RFC 4023), using ip tunnel
API by simply adding ipgre_tunnel_encap_(add|del)_mpls_ops() and the new
tunnel type TUNNEL_ENCAP_MPLS.Signed-off-by: Amine Kherbouche
Signed-off-by: David S. Miller
10 Aug, 2017
1 commit
-
This change allows us to later indicate to rtnetlink core that certain
doit functions should be called without acquiring rtnl_mutex.This change should have no effect, we simply replace the last (now
unused) calcit argument with the new flag.Signed-off-by: Florian Westphal
Reviewed-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
08 Jul, 2017
1 commit
-
Fix the below warning generated by static checker:
net/mpls/af_mpls.c:2111 mpls_getroute()
error: uninitialized symbol 'in_label'."Fixes: 397fc9e5cefe ("mpls: route get support")
Reported-by: Dan Carpenter
Signed-off-by: Roopa Prabhu
Signed-off-by: David S. Miller
05 Jul, 2017
1 commit
-
fix rtm policy name typo in mpls_getroute and also remove
export of rtm_ipv4_policyFixes: 397fc9e5cefe ("mpls: route get support")
Reported-by: David S. Miller
Signed-off-by: Roopa Prabhu
Signed-off-by: David S. Miller
04 Jul, 2017
1 commit
-
This patch adds RTM_GETROUTE doit handler for mpls routes.
Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selectionBy default the getroute handler returns matched
nexthop label, via and oifWith RTM_F_FIB_MATCH flag, full matched route is
returned.example (with patched iproute2):
$ip -f mpls route show
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable$ip -f mpls route get fibmatch 101
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12Signed-off-by: Roopa Prabhu
Signed-off-by: David S. Miller
07 Jun, 2017
1 commit
-
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.Signed-off-by: David S. Miller
01 Jun, 2017
1 commit
-
recent fixes to use WRITE_ONCE for nh_flags on link up,
accidently ended up leaving the deadflags on a nh. This patch
fixes the WRITE_ONCE to use freshly evaluated nh_flags.Fixes: 39eb8cd17588 ("net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE")
Reported-by: Satish Ashok
Signed-off-by: Roopa Prabhu
Acked-by: David Ahern
Signed-off-by: David S. Miller
30 May, 2017
4 commits
-
err is initialized to EINVAL and not used before it is set again.
Remove the unnecessary initialization.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
nla_get_via is only used in af_mpls.c. Remove declaration from internal.h
and move up in af_mpls.c before first use. Code move only; no
functional change intended.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
Add error messages for failures in adding and deleting mpls routes.
This covers most of the annoying EINVAL errors.Signed-off-by: David Ahern
Signed-off-by: David S. Miller -
mpls_route_add and mpls_route_del have the same checks on the label.
Move to a helper. Avoid duplicate extack messages in the next patch.Signed-off-by: David Ahern
Signed-off-by: David S. Miller