Eric Lee / smarc-fsl-linux-kernel

25 Jul, 2020

1 commit

b43c61531 net/ipv6: switch ip6_mroute_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2020-07-25 06:41:54 +0800

22 May, 2020

1 commit

41b4bd986 net: don't return invalid table id error when we fall back to PF_UNSPEC

In case we can't find a ->dumpit callback for the requested
(family,type) pair, we fall back to (PF_UNSPEC,type). In effect, we're
in the same situation as if userspace had requested a PF_UNSPEC
dump. For RTM_GETROUTE, that handler is rtnl_dump_all, which calls all
the registered RTM_GETROUTE handlers.

The requested table id may or may not exist for all of those
families. commit ae677bbb4441 ("net: Don't return invalid table id
error when dumping all families") fixed the problem when userspace
explicitly requests a PF_UNSPEC dump, but missed the fallback case.

For example, when we pass ipv6.disable=1 to a kernel with
CONFIG_IP_MROUTE=y and CONFIG_IP_MROUTE_MULTIPLE_TABLES=y,
the (PF_INET6, RTM_GETROUTE) handler isn't registered, so we end up in
rtnl_dump_all, and listing IPv6 routes will unexpectedly print:

# ip -6 r
Error: ipv4: MR table does not exist.
Dump terminated

commit ae677bbb4441 introduced the dump_all_families variable, which
gets set when userspace requests a PF_UNSPEC dump. However, we can't
simply set the family to PF_UNSPEC in rtnetlink_rcv_msg in the
fallback case to get dump_all_families == true, because some message
types (for example RTM_GETRULE and RTM_GETNEIGH) only register the
PF_UNSPEC handler and use the family to filter in the kernel what is
dumped to userspace. We would then export more entries, that userspace
would have to filter. iproute does that, but other programs may not.

Instead, this patch removes dump_all_families and updates the
RTM_GETROUTE handlers to check if the family that is being dumped is
their own. When it's not, which covers both the intentional PF_UNSPEC
dumps (as dump_all_families did) and the fallback case, ignore the
missing table id error.
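The per-handler check described above can be modelled in userspace roughly as follows. This is a minimal sketch, not the kernel code: the `demo_` names, the enum values, and the return convention (0 for "nothing to dump", 1 for "dumped something") are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the address families involved. */
enum demo_family { DEMO_PF_UNSPEC = 0, DEMO_PF_INET = 2, DEMO_PF_INET6 = 10 };

/*
 * Model of one family's RTM_GETROUTE dump handler.
 * own_family:    the family this handler was registered for
 * dumped_family: the family found in the dump request
 * table_exists:  whether the requested table id exists for this handler
 */
int demo_dump_route(enum demo_family own_family,
                    enum demo_family dumped_family,
                    bool table_exists)
{
    if (!table_exists) {
        /* Request was for another family (PF_UNSPEC or the fallback
         * case): the missing table is not an error, dump nothing. */
        if (dumped_family != own_family)
            return 0;
        /* Request was explicitly for this family: report the error. */
        return -ENOENT;
    }
    return 1; /* pretend one route was dumped */
}
```

With this shape, an IPv4 handler reached through a PF_INET6 or PF_UNSPEC request stays silent about a missing table, while a direct PF_INET request still gets the error.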

Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps")
Signed-off-by: Sabrina Dubroca
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Sabrina Dubroca
2020-05-22 08:25:50 +0800

17 May, 2020

1 commit

b6dd5acde ipv6: Fix suspicious RCU usage warning in ip6mr

This patch fixes the following warning:

=============================
WARNING: suspicious RCU usage
5.7.0-rc4-next-20200507-syzkaller #0 Not tainted
-----------------------------
net/ipv6/ip6mr.c:124 RCU-list traversed in non-reader section!!

ipmr_new_table() returns an existing table, but there is no table at
init. Therefore the lockdep condition used is: either rtnl is held or
the list is empty.

Fixes: d1db275dd3f6e ("ipv6: ip6mr: support multiple tables")
Reported-by: kernel test robot
Suggested-by: Jakub Kicinski
Signed-off-by: Madhuparna Bhowmik
Signed-off-by: David S. Miller

Madhuparna Bhowmik
2020-05-17 04:41:53 +0800

13 Mar, 2020

1 commit

a8eceea84 inet: Use fallthrough;

Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/

And by hand:

net/ipv6/ip6_fib.c has a fallthrough comment outside of an #ifdef block
that causes gcc to emit a warning if converted in-place.

So move the new fallthrough; inside the containing #ifdef/#endif too.
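As an illustration of the comment-to-statement conversion, here is a small userspace sketch. The macro name `demo_fallthrough` and the function `demo_header_len` are hypothetical; the macro assumes a compiler that supports the `__fallthrough__` statement attribute and degrades to an empty statement otherwise.

```c
/* Hypothetical stand-in for a fallthrough pseudo-keyword. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define demo_fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef demo_fallthrough
# define demo_fallthrough do {} while (0) /* fallthrough */
#endif

/* Each case adds its own contribution and then continues into the
 * next one; the explicit statement replaces a fallthrough comment. */
int demo_header_len(int kind)
{
    int len = 0;

    switch (kind) {
    case 2:
        len += 8;
        demo_fallthrough;
    case 1:
        len += 4;
        demo_fallthrough;
    default:
        len += 1;
        break;
    }
    return len;
}
```

Unlike a comment, the statement is visible to the compiler after preprocessing, which is also why its placement relative to an #ifdef block matters.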

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller

Joe Perches
2020-03-13 06:55:00 +0800

25 Feb, 2020

1 commit

28b380e28 ip6mr: Fix RCU list debugging warning

ip6mr_for_each_table() macro uses list_for_each_entry_rcu()
for traversing outside an RCU read side critical section
but under the protection of rtnl_mutex. Hence add the
corresponding lockdep expression to silence the following
false-positive warnings:

[ 4.319479] =============================
[ 4.319480] WARNING: suspicious RCU usage
[ 4.319482] 5.5.4-stable #17 Tainted: G E
[ 4.319483] -----------------------------
[ 4.319485] net/ipv6/ip6mr.c:1243 RCU-list traversed in non-reader section!!

[ 4.456831] =============================
[ 4.456832] WARNING: suspicious RCU usage
[ 4.456834] 5.5.4-stable #17 Tainted: G E
[ 4.456835] -----------------------------
[ 4.456837] net/ipv6/ip6mr.c:1582 RCU-list traversed in non-reader section!!

Signed-off-by: Amol Grover
Signed-off-by: David S. Miller

Amol Grover
2020-02-25 05:19:21 +0800

05 Oct, 2019

1 commit

b7a595577 net: fib_notifier: propagate extack down to the notifier block callback

Since errors are propagated all the way up to the caller, propagate
possible extack of the caller all the way down to the notifier block
callback.

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2019-10-05 02:10:56 +0800

07 Sep, 2019

1 commit

0079ad8e8 ipmr: remove hard code cache_resolve_queue_len limit

This is a re-post of a previous patch written by David Miller[1].

Phil Karn reported[2] that on busy networks with lots of unresolved
multicast routing entries, the creation of new multicast group routes
can be extremely slow and unreliable.

The reason is that we hard-coded the limit on multicast route entries with
unresolved source addresses (cache_resolve_queue_len) to 10. If some
multicast routes never resolve and the number of unresolved source addresses
grows, it becomes impossible to create new multicast route cache entries.

To resolve this issue, we need to either add a sysctl entry to make
cache_resolve_queue_len configurable, or just remove the cache_resolve_queue_len
limit directly, as we already have the socket receive queue limits of the
mrouted socket, as pointed out by David.

From my side, I'd prefer to remove the cache_resolve_queue_len limit instead
of creating two more (IPv4 and IPv6 version) sysctl entries.

[1] https://lkml.org/lkml/2018/7/22/11
[2] https://lkml.org/lkml/2018/7/21/343

v3: instead of removing cache_resolve_queue_len entirely, only remove
the hard-coded limit when allocating the unresolved cache, as Eric Dumazet
suggested, so we don't need to re-count it in other places.

v2: hold the mfc_unres_lock while walking the unresolved list in
queue_count(), as Nikolay Aleksandrov reminded.

Reported-by: Phil Karn
Signed-off-by: Hangbin Liu
Reviewed-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Hangbin Liu
2019-09-07 23:49:00 +0800

31 May, 2019

1 commit

2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

08 Apr, 2019

1 commit

8f0db0180 rhashtable: use bit_spin_locks to protect hash bucket.

This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
bucket pointer to lock the hash chain for that bucket.

The benefits of a bit spin_lock are:
- no need to allocate a separate array of locks.
- no need to have a configuration option to guide the
choice of the size of this array
- locking cost is often a single test-and-set in a cache line
that will have to be loaded anyway. When inserting at, or removing
from, the head of the chain, the unlock is free - writing the new
address in the bucket head implicitly clears the lock bit.
For __rhashtable_insert_fast() we ensure this always happens
when adding a new key.
- even when locking costs 2 updates (lock and unlock), they are
in a cacheline that needs to be read anyway.
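The idea of keeping the lock in a bit of the bucket pointer can be sketched in userspace with C11 atomics. This is an illustrative model only, not the rhashtable implementation: all `demo_` names are hypothetical, and it assumes nodes are at least 4-byte aligned so that bit 1 of a valid pointer is always zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LOCK_BIT ((uintptr_t)1 << 1) /* BIT(1) of the bucket word */

struct demo_node {
    struct demo_node *next;
    int key;
};

/* One word holds both the chain head pointer and the lock bit. */
typedef _Atomic uintptr_t demo_bucket;

/* Try to take the lock with a single atomic OR on the bucket word. */
bool demo_bucket_trylock(demo_bucket *bkt)
{
    uintptr_t old = atomic_fetch_or_explicit(bkt, DEMO_LOCK_BIT,
                                             memory_order_acquire);
    return (old & DEMO_LOCK_BIT) == 0;
}

/* Explicit unlock: clear the lock bit, keep the pointer bits. */
void demo_bucket_unlock(demo_bucket *bkt)
{
    atomic_fetch_and_explicit(bkt, ~DEMO_LOCK_BIT, memory_order_release);
}

/* Read the chain head with the lock bit masked off. */
struct demo_node *demo_bucket_head(demo_bucket *bkt)
{
    return (struct demo_node *)(atomic_load(bkt) & ~DEMO_LOCK_BIT);
}

/* Caller must hold the lock. Storing the new head pointer, whose
 * bit 1 is zero, publishes the node and releases the lock at once. */
void demo_insert_head_and_unlock(demo_bucket *bkt, struct demo_node *n)
{
    n->next = demo_bucket_head(bkt);
    atomic_store_explicit(bkt, (uintptr_t)n, memory_order_release);
}
```

The last function shows the "free unlock" mentioned in the list: inserting at the head needs no separate unlock operation because writing the new address clears the lock bit implicitly.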

The cost of using a bit spin_lock is a little bit of code complexity,
which I think is quite manageable.

Bit spin_locks are sometimes inappropriate because they are not fair -
if multiple CPUs repeatedly contend for the same lock, one CPU can
easily be starved. This is not a credible situation with rhashtable.
Multiple CPUs may want to repeatedly add or remove objects, but they
will typically do so at different buckets, so they will attempt to
acquire different locks.

As we have more bit-locks than we previously had spinlocks (by at
least a factor of two) we can expect slightly less contention to
go with the slightly better cache behavior and reduced memory
consumption.

To enhance type checking, a new struct is introduced to represent the
pointer plus lock-bit that is stored in the bucket-table. This is
"struct rhash_lock_head" and is empty. A pointer to this needs to be
cast to either an unsigned long, or a "struct rhash_head *" to be useful.
Variables of this type are most often called "bkt".

Previously "pprev" would sometimes point to a bucket, and sometimes a
->next pointer in an rhash_head. As these are now different types,
pprev is NULL when it would have pointed to the bucket. In that case,
'bkt' is used, together with correct locking protocol.

Signed-off-by: NeilBrown
Signed-off-by: David S. Miller

NeilBrown
2019-04-08 10:12:12 +0800

05 Mar, 2019

1 commit

87c11f1dd ip6mr: Do not call __IP6_INC_STATS() from preemptible context

Similar to commit 44f49dd8b5a6 ("ipmr: fix possible race resulting from
improper usage of IP_INC_STATS_BH() in preemptible context."), we cannot
assume preemption is disabled when incrementing the counter and
accessing a per-CPU variable.

Preemption can be enabled when we add a route in process context that
corresponds to packets stored in the unresolved queue, which are then
forwarded using this route [1].

Fix this by using IP6_INC_STATS() which takes care of disabling
preemption on architectures where it is needed.

[1]
[ 157.451447] BUG: using __this_cpu_add() in preemptible [00000000] code: smcrouted/2314
[ 157.460409] caller is ip6mr_forward2+0x73e/0x10e0
[ 157.460434] CPU: 3 PID: 2314 Comm: smcrouted Not tainted 5.0.0-rc7-custom-03635-g22f2712113f1 #1336
[ 157.460449] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ 157.460461] Call Trace:
[ 157.460486] dump_stack+0xf9/0x1be
[ 157.460553] check_preemption_disabled+0x1d6/0x200
[ 157.460576] ip6mr_forward2+0x73e/0x10e0
[ 157.460705] ip6_mr_forward+0x9a0/0x1510
[ 157.460771] ip6mr_mfc_add+0x16b3/0x1e00
[ 157.461155] ip6_mroute_setsockopt+0x3cb/0x13c0
[ 157.461384] do_ipv6_setsockopt.isra.8+0x348/0x4060
[ 157.462013] ipv6_setsockopt+0x90/0x110
[ 157.462036] rawv6_setsockopt+0x4a/0x120
[ 157.462058] __sys_setsockopt+0x16b/0x340
[ 157.462198] __x64_sys_setsockopt+0xbf/0x160
[ 157.462220] do_syscall_64+0x14d/0x610
[ 157.462349] entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 0912ea38de61 ("[IPV6] MROUTE: Add stats in multicast routing module method ip6_mr_forward().")
Signed-off-by: Ido Schimmel
Reported-by: Amit Cohen
Signed-off-by: David S. Miller

Ido Schimmel
2019-03-05 02:55:48 +0800

22 Feb, 2019

1 commit

ca8d4794f ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs

Currently the only way to clear the forwarding cache was to delete the
entries one by one using the MRT_DEL_MFC socket option or to destroy and
recreate the socket.

Create a new socket option which with the use of optional flags can
clear any combination of multicast entries (static or not static) and
multicast vifs (static or not static).

Calling the new socket option MRT_FLUSH with the flags MRT_FLUSH_MFC and
MRT_FLUSH_VIFS will clear all entries and vifs on the socket except for
static entries.
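The flag semantics can be modelled as a small predicate. This is a sketch under assumptions: the `DEMO_` constants and their numeric values are hypothetical stand-ins, and the two "static" flags are named here only to illustrate the "static or not static" combinations the message describes.

```c
#include <stdbool.h>

/* Hypothetical flag bits, one per category of object to clear. */
#define DEMO_FLUSH_MFC         1u /* non-static multicast entries */
#define DEMO_FLUSH_MFC_STATIC  2u /* static multicast entries */
#define DEMO_FLUSH_VIFS        4u /* non-static vifs */
#define DEMO_FLUSH_VIFS_STATIC 8u /* static vifs */

/* Should a multicast entry be cleared for this flag combination? */
bool demo_should_flush_mfc(unsigned int flags, bool is_static)
{
    unsigned int mask = is_static ? DEMO_FLUSH_MFC_STATIC : DEMO_FLUSH_MFC;

    return (flags & mask) != 0;
}

/* Should a vif be cleared for this flag combination? */
bool demo_should_flush_vif(unsigned int flags, bool is_static)
{
    unsigned int mask = is_static ? DEMO_FLUSH_VIFS_STATIC : DEMO_FLUSH_VIFS;

    return (flags & mask) != 0;
}
```

Passing only the two non-static flags matches the case in the message: everything is cleared except static entries.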

Signed-off-by: Callum Sinclair
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Callum Sinclair
2019-02-22 05:05:05 +0800

28 Jan, 2019

1 commit

146820cc2 ip6mr: Fix notifiers call on mroute_clean_tables()

When the MC route socket is closed, mroute_clean_tables() is called to
clean up existing routes. The notifiers call was mistakenly placed in the
cleanup of the unresolved MC route entries cache.
In a case where the MC socket closes before an unresolved route expires,
the notifier call leads to a crash, caused by the driver trying to
increment a non initialized refcount_t object [1] and then when handling
is done, to decrement it [2]. This was detected by a test recently added in
commit 6d4efada3b82 ("selftests: forwarding: Add multicast routing test").

Fix that by putting notifiers call on the resolved entries traversal,
instead of on the unresolved entries traversal.

[1]

[ 245.748967] refcount_t: increment on 0; use-after-free.
[ 245.754829] WARNING: CPU: 3 PID: 3223 at lib/refcount.c:153 refcount_inc_checked+0x2b/0x30
...
[ 245.802357] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
[ 245.811873] RIP: 0010:refcount_inc_checked+0x2b/0x30
...
[ 245.907487] Call Trace:
[ 245.910231] mlxsw_sp_router_fib_event.cold.181+0x42/0x47 [mlxsw_spectrum]
[ 245.917913] notifier_call_chain+0x45/0x7
[ 245.922484] atomic_notifier_call_chain+0x15/0x20
[ 245.927729] call_fib_notifiers+0x15/0x30
[ 245.932205] mroute_clean_tables+0x372/0x3f
[ 245.936971] ip6mr_sk_done+0xb1/0xc0
[ 245.940960] ip6_mroute_setsockopt+0x1da/0x5f0
...

[2]

[ 246.128487] refcount_t: underflow; use-after-free.
[ 246.133859] WARNING: CPU: 0 PID: 7 at lib/refcount.c:187 refcount_sub_and_test_checked+0x4c/0x60
[ 246.183521] Hardware name: Mellanox Technologies Ltd. MSN2740/SA001237, BIOS 5.6.5 06/07/2016
...
[ 246.193062] Workqueue: mlxsw_core_ordered mlxsw_sp_router_fibmr_event_work [mlxsw_spectrum]
[ 246.202394] RIP: 0010:refcount_sub_and_test_checked+0x4c/0x60
...
[ 246.298889] Call Trace:
[ 246.301617] refcount_dec_and_test_checked+0x11/0x20
[ 246.307170] mlxsw_sp_router_fibmr_event_work.cold.196+0x47/0x78 [mlxsw_spectrum]
[ 246.315531] process_one_work+0x1fa/0x3f0
[ 246.320005] worker_thread+0x2f/0x3e0
[ 246.324083] kthread+0x118/0x130
[ 246.327683] ? wq_update_unbound_numa+0x1b0/0x1b0
[ 246.332926] ? kthread_park+0x80/0x80
[ 246.337013] ret_from_fork+0x1f/0x30

Fixes: 088aa3eec2ce ("ip6mr: Support fib notifications")
Signed-off-by: Nir Dotan
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller

Nir Dotan
2019-01-28 15:16:07 +0800

02 Jan, 2019

1 commit

cb9f1b783 ip: validate header length on virtual device xmit

KMSAN detected read beyond end of buffer in vti and sit devices when
passing truncated packets with PF_PACKET. The issue affects additional
ip tunnel devices.

Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the
inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when
accessing the inner header").

Move the check to a separate helper and call at the start of each
ndo_start_xmit function in net/ipv4 and net/ipv6.

Minor changes:
- convert dev_kfree_skb to kfree_skb on error path,
as dev_kfree_skb calls consume_skb which is not for error paths.
- use pskb_network_may_pull even though that is pedantic here,
as it is the same as pskb_may_pull for devices without llheaders.
- do not cache ipv6 hdrs if used only once
(unsafe across pskb_may_pull, was more relevant to earlier patch)

Reported-by: syzbot
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller

Willem de Bruijn
2019-01-02 04:05:02 +0800

21 Dec, 2018

1 commit

2be09de7d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Lots of conflicts, but happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller

David S. Miller
2018-12-21 03:53:36 +0800

18 Dec, 2018

1 commit

f5c6dfdef ip6mr: Drop mfc6_cache argument to ip6mr_forward2

mfc6_cache is not needed by ip6mr_forward2 so drop it from the input
argument list.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-12-18 15:31:14 +0800

15 Dec, 2018

1 commit

69d2c8676 ip6mr: Fix potential Spectre v1 vulnerability

vr.mifi is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

net/ipv6/ip6mr.c:1845 ip6mr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)
net/ipv6/ip6mr.c:1919 ip6mr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap)

Fix this by sanitizing vr.mifi before using it to index mrt->vif_table.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
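Index sanitizing of this kind is usually done with a branch-free mask, so that even a mispredicted bounds check cannot produce an out-of-range load. The following is a userspace sketch in the spirit of the kernel's array_index_nospec() helper, not a copy of it; the `demo_` names are hypothetical, and it assumes the array size does not exceed LONG_MAX.

```c
#include <limits.h>

/*
 * Return all ones when index < size and zero otherwise, using only
 * arithmetic (no conditional branch). When index < size, neither
 * index nor (size - 1 - index) has its top bit set; when
 * index >= size, the subtraction wraps and sets the top bit.
 */
unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
    unsigned long v = ~(index | (size - 1UL - index));

    /* Spread the top bit of v across the whole word. */
    return 0UL - (v >> (sizeof(unsigned long) * CHAR_BIT - 1));
}

/* Bounds-checked lookup that also clamps the index with the mask. */
int demo_lookup(const int *table, unsigned long size,
                unsigned long index, int *out)
{
    if (index >= size)
        return -1;
    /* In-range indexes are unchanged; out-of-range ones become 0. */
    index &= demo_index_mask(index, size);
    *out = table[index];
    return 0;
}
```

The mask is applied right after the bounds check and before the first load, which corresponds to the "kill the speculation on the first load" policy quoted above.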

Signed-off-by: Gustavo A. R. Silva
Signed-off-by: David S. Miller

Gustavo A. R. Silva
2018-12-15 07:34:28 +0800

07 Dec, 2018

1 commit

00f54e689 net: core: dev: Add extack argument to dev_open()

In order to pass extack together with NETDEV_PRE_UP notifications, it's
necessary to route the extack to __dev_open() from diverse (possibly
indirect) callers. One prominent API through which the notification is
invoked is dev_open().

Therefore extend dev_open() with an extra extack argument and update
all users. Most of the calls end up just encoding NULL, but bond and
team drivers have the extack readily available.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Reviewed-by: Ido Schimmel
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Petr Machata
2018-12-07 05:26:06 +0800

25 Oct, 2018

1 commit

ae677bbb4 net: Don't return invalid table id error when dumping all families

When doing a route dump across all address families, do not error out
if the table does not exist. This allows a route dump for AF_UNSPEC
with a table id that may only exist for some of the families.

Do return the table does not exist error if dumping routes for a
specific family and the table does not exist.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-25 05:06:25 +0800

16 Oct, 2018

3 commits

effe67926 net: Enable kernel side filtering of route dumps

Update parsing of route dump request to enable kernel side filtering.
Allow filtering results by protocol (e.g., which routing daemon installed
the route), route type (e.g., unicast), table id and nexthop device. These
amount to the low hanging fruit, yet a huge improvement, for dumping
routes.

ip_valid_fib_dump_req is called with RTNL held, so __dev_get_by_index can
be used to look up the device index without taking a reference. From
there filter->dev is only used during dump loops with the lock still held.

Set NLM_F_DUMP_FILTERED in the answer_flags so the user knows the results
have been filtered should no entries be returned.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-16 15:14:07 +0800
cb167893f net: Plumb support for filtering ipv4 and ipv6 multicast route dumps

Implement kernel side filtering of routes by egress device index and
table id. If the table id is given in the filter, lookup table and
call mr_table_dump directly for it.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-16 15:13:39 +0800
4724676d5 net: Add struct for fib dump filter

Add struct fib_dump_filter for options on limiting which routes are
returned in a dump request. The current list is table id, protocol,
route type, rtm_flags and nexthop device index. struct net is needed
to lookup the net_device from the index.

Declare the filter for each route dump handler and plumb the new
arguments from dump handlers to ip_valid_fib_dump_req.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-16 15:13:12 +0800

09 Oct, 2018

1 commit

e8ba330ac rtnetlink: Update fib dumps for strict data checking

Add helper to check netlink message for route dumps. If the strict flag
is set the dump request is expected to have an rtmsg struct as the header.
All elements of the struct are expected to be 0 with the exception of
rtm_flags (which is used by both ipv4 and ipv6 dumps) and no attributes
can be appended. rtm_flags can only have RTM_F_CLONED and RTM_F_PREFIX
set.
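The strict check described above amounts to a header validation of roughly this shape. This is a sketch with hypothetical names: `struct demo_rtmsg` only mirrors the general layout of a route message header, the `DEMO_` flag values are stand-ins, and leaving the family field out of the zero check is an assumption made here because the family selects the handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two flags allowed in a strict dump. */
#define DEMO_RTM_F_CLONED 0x200u
#define DEMO_RTM_F_PREFIX 0x800u

/* Simplified model of the route message header. */
struct demo_rtmsg {
    uint8_t family;
    uint8_t dst_len;
    uint8_t src_len;
    uint8_t tos;
    uint8_t table;
    uint8_t protocol;
    uint8_t scope;
    uint8_t type;
    uint32_t flags;
};

/* Validate a dump request header under strict checking.
 * nattrs is the number of attributes appended to the request. */
bool demo_valid_strict_dump_req(const struct demo_rtmsg *rtm,
                                unsigned int nattrs)
{
    /* Every field other than the flags must be zero. */
    if (rtm->dst_len || rtm->src_len || rtm->tos || rtm->table ||
        rtm->protocol || rtm->scope || rtm->type)
        return false;
    /* Only the two permitted flag bits may be set. */
    if (rtm->flags & ~(DEMO_RTM_F_CLONED | DEMO_RTM_F_PREFIX))
        return false;
    /* No attributes may follow the header. */
    return nattrs == 0;
}
```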

Update inet_dump_fib, inet6_dump_fib, mpls_dump_routes, ipmr_rtm_dumproute,
and ip6mr_rtm_dumproute to call this helper if strict data checking is
enabled.

Signed-off-by: David Ahern
Signed-off-by: David S. Miller

David Ahern
2018-10-09 01:39:05 +0800

03 Oct, 2018

1 commit

e4a38c0c4 ipv6: add vrf table handling code for ipv6 mcast

The code to obtain the correct table for the incoming interface was
missing for IPv6. This has been added along with the table creation
notification to fib rules for the RTNL_FAMILY_IP6MR address family.

Signed-off-by: Patrick Ruddy
Signed-off-by: Mike Manning
Reviewed-by: David Ahern
Signed-off-by: David S. Miller

Patrick Ruddy
2018-10-03 13:29:08 +0800

22 Jun, 2018

1 commit

0eb71a9da rhashtable: split rhashtable.h

Due to the use of rhashtables in net namespaces,
rhashtable.h is included in lots of the kernel,
so a small change can require a large recompilation.
This makes development painful.

This patch splits out rhashtable-types.h which just includes
the major type declarations, and does not include (non-trivial)
inline code. rhashtable.h is no longer included by anything
in the include/ directory.
Common include files only include rhashtable-types.h so a large
recompilation is only triggered when that changes.

Acked-by: Herbert Xu
Signed-off-by: NeilBrown
Signed-off-by: David S. Miller

NeilBrown
2018-06-22 12:43:27 +0800

07 Jun, 2018

1 commit

1c8c5a9d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:

1) Add Maglev hashing scheduler to IPVS, from Inju Song.

2) Lots of new TC subsystem tests from Roman Mashak.

3) Add TCP zero copy receive and fix delayed acks and autotuning with
SO_RCVLOWAT, from Eric Dumazet.

4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
Brouer.

5) Add ttl inherit support to vxlan, from Hangbin Liu.

6) Properly separate ipv6 routes into their logically independent
components. fib6_info for the routing table, and fib6_nh for sets of
nexthops, which thus can be shared. From David Ahern.

7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
messages from XDP programs. From Nikita V. Shirokov.

8) Lots of long overdue cleanups to the r8169 driver, from Heiner
Kallweit.

9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

11) Plumb extack down into fib_rules, from Roopa Prabhu.

12) Add Flower classifier offload support to igb, from Vinicius Costa
Gomes.

13) Add UDP GSO support, from Willem de Bruijn.

14) Add documentation for eBPF helpers, from Quentin Monnet.

15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

16) Allow applications to be given the number of bytes available to read
on a socket via a control message returned from recvmsg(), from
Soheil Hassas Yeganeh.

17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
From Björn Töpel.

19) Remove indirect load support from all of the BPF JITs and handle
these operations in the verifier by translating them into native BPF
instead. From Daniel Borkmann.

20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

21) Allow XDP programs to do lookups in the main kernel routing tables
for forwarding. From David Ahern.

22) Allow drivers to store hardware state into an ELF section of kernel
dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

23) Various RACK and loss detection improvements in TCP, from Yuchung
Cheng.

24) Add TCP SACK compression, from Eric Dumazet.

25) Add User Mode Helper support and basic bpfilter infrastructure, from
Alexei Starovoitov.

26) Support ports and protocol values in RTM_GETROUTE, from Roopa
Prabhu.

27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
Brouer.

28) Add lots of forwarding selftests, from Petr Machata.

29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
strparser: Add __strp_unpause and use it in ktls.
rxrpc: Fix terminal retransmission connection ID to include the channel
net: hns3: Optimize PF CMDQ interrupt switching process
net: hns3: Fix for VF mailbox receiving unknown message
net: hns3: Fix for VF mailbox cannot receiving PF response
bnx2x: use the right constant
Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
enic: fix UDP rss bits
netdev-FAQ: clarify DaveM's position for stable backports
rtnetlink: validate attributes in do_setlink()
mlxsw: Add extack messages for port_{un, }split failures
netdevsim: Add extack error message for devlink reload
devlink: Add extack to reload and port_{un, }split operations
net: metrics: add proper netlink validation
ipmr: fix error path when ipmr_new_table fails
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
net: hns3: remove unused hclgevf_cfg_func_mta_filter
netfilter: provide udp*_lib_lookup for nf_tproxy
qed*: Utilize FW 8.37.2.0
...

Linus Torvalds
2018-06-07 09:39:49 +0800

06 Jun, 2018

2 commits

e783bb00a ipmr: fix error path when ipmr_new_table fails

commit 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")
refactored ipmr_new_table, so that it now returns NULL when
mr_table_alloc fails. Unfortunately, all callers of ipmr_new_table
expect an ERR_PTR.

This can result in NULL deref, for example when ipmr_rules_exit calls
ipmr_free_table with NULL net->ipv4.mrt in the
!CONFIG_IP_MROUTE_MULTIPLE_TABLES version.

This patch makes mr_table_alloc return errors, and changes
ip6mr_new_table and its callers to return/expect error pointers as
well. It also removes the version of mr_table_alloc defined under
!CONFIG_IP_MROUTE_COMMON, since it is never used.
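The mismatch between "returns NULL" and "expects an ERR_PTR" is easier to see with a userspace model of the error-pointer convention. This is a sketch only: the `demo_` helpers imitate the idea of encoding a negative errno in a pointer value, and `demo_table_alloc` with its `simulate_oom` parameter is a hypothetical allocator for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *demo_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True when ptr lies in the top DEMO_MAX_ERRNO addresses.
 * Note that NULL is NOT an error pointer under this test. */
int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_table {
    unsigned int id;
};

/* Allocator that reports failure as an error pointer, never NULL. */
struct demo_table *demo_table_alloc(unsigned int id, int simulate_oom)
{
    struct demo_table *t = simulate_oom ? NULL : malloc(sizeof(*t));

    if (!t)
        return demo_err_ptr(-ENOMEM);
    t->id = id;
    return t;
}
```

A caller that only tests `demo_is_err()` treats NULL as a valid table, which is the kind of NULL dereference the message describes when an allocator returns NULL instead of an error pointer.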

Fixes: 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table")
Signed-off-by: Sabrina Dubroca
Signed-off-by: David S. Miller

Sabrina Dubroca
2018-06-06 00:26:41 +0800
848235edb ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds

Currently, raw6_sk(sk)->ip6mr_table is set unconditionally during
ip6_mroute_setsockopt(MRT6_TABLE). A subsequent attempt at the same
setsockopt will fail with -ENOENT, since we haven't actually created
that table.

A similar fix for ipv4 was included in commit 5e1859fbcc3c ("ipv4: ipmr:
various fixes and cleanups").

Fixes: d1db275dd3f6 ("ipv6: ip6mr: support multiple tables")
Signed-off-by: Sabrina Dubroca
Signed-off-by: David S. Miller

Sabrina Dubroca
2018-06-06 00:26:39 +0800

16 May, 2018

1 commit

c35063722 proc: introduce proc_create_net{,_data}

Variants of proc_create{,_data} that directly take a struct seq_operations
and deal with network namespaces in ->open and ->release. All callers of
proc_create + seq_open_net converted over, and seq_{open,release}_net are
removed entirely.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2018-05-16 13:24:30 +0800

23 Apr, 2018

1 commit

b16fb418b net: fib_rules: add extack support

Signed-off-by: Roopa Prabhu
Signed-off-by: David S. Miller

Roopa Prabhu
2018-04-23 22:21:24 +0800

28 Mar, 2018

1 commit

2f635ceeb net: Drop pernet_operations::async

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-28 01:18:09 +0800

27 Mar, 2018

3 commits

8c13af2a2 ip6mr: Add refcounting to mfc

Since ipmr and ip6mr are using the same mr_mfc struct at their core, we
can now refactor the ipmr_cache_{hold,put} logic and apply refcounting
to both ipmr and ip6mr.

Signed-off-by: Yuval Mintz
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-27 01:14:43 +0800
d3c07e5b9 ip6mr: Add API for default_rule fib

Add the ability to discern whether a given FIB rule notification relates
to the default rule inserted when registering ip6mr or a different one.

Would later be used by drivers wishing to offload ipv6 multicast routes
but unable to offload rules other than the default one.

Signed-off-by: Yuval Mintz
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-27 01:14:43 +0800
088aa3eec ip6mr: Support fib notifications

In similar fashion to ipmr, support fib notifications for ip6mr mfc and
vif related events. This would later allow drivers to react to said
notifications and offload the IPv6 mroutes.

Signed-off-by: Yuval Mintz
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-27 01:14:43 +0800

08 Mar, 2018

1 commit

a366e300a ip6mr: remove synchronize_rcu() in favor of SOCK_RCU_FREE

Kirill found that the recently added synchronize_rcu() call in
ip6mr_sk_done() was slowing down netns dismantle and posted a patch
to use it only if the socket was found.

I instead suggested getting rid of this call and using
SOCK_RCU_FREE instead.

We might later change IPv4 side to use the same technique and unify
both stacks. IPv4 does not use synchronize_rcu() but has a call_rcu()
that could be replaced by SOCK_RCU_FREE.

Tested:
time for i in {1..1000}; do unshare -n /bin/false;done

Before : real 7m18.911s
After : real 10.187s

Fixes: 8571ab479a6e ("ip6mr: Make mroute_sk rcu-based")
Signed-off-by: Eric Dumazet
Reported-by: Kirill Tkhai
Cc: Yuval Mintz
Reviewed-by: Kirill Tkhai
Signed-off-by: David S. Miller

Eric Dumazet
2018-03-08 07:13:41 +0800

02 Mar, 2018

6 commits

7b0db8573 ipmr, ip6mr: Unite dumproute flows

The various MFC entries are being held in the same kind of mr_tables
for both ipmr and ip6mr, and their traversal logic is identical.
Also, with the exception of the addresses [and other small tidbits]
the major bulk of the nla setting is identical.

Unite as much of the dumping as possible between the two.
Notice this requires creating an mr_table iterator for each, as the
for-each preprocessor macro can't be used by the common logic.

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800
889cd83cb ip6mr: Remove MFC_NOTIFY and refactor flags

MFC_NOTIFY exists in ip6mr, probably as some legacy code
[it was already removed for ipmr in commit
06bd6c0370bb ("net: ipmr: remove unused MFC_NOTIFY flag and make the flags enum")].
Remove it from ip6mr as well, and move the enum into a common file;
Notice MFC_OFFLOAD is currently only used by ipmr.

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800
3feda6b46 ipmr, ip6mr: Unite vif seq functions

Same as previously done with the mfc seq, the logic for the vif seq is
refactored to be shared between ipmr and ip6mr.

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800
c8d619680 ipmr, ip6mr: Unite mfc seq logic

With the exception of the final dump, ipmr and ip6mr have the exact same
seq logic for traversing a given mr_table. Refactor that code and make
it common.

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800
845c9a7ae ipmr, ip6mr: Unite logic for searching in MFC cache

ipmr and ip6mr utilize the exact same methods for searching the
hashed resolved connections, difference being only in the construction
of the hash comparison key.

In order to unite the flow, introduce an mr_table operation set that
would contain the protocol specific information required for common
flows, in this case - the hash parameters and a comparison key
representing a (*,*) route.

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800
494fff563 ipmr, ip6mr: Make mfc_cache a common structure

mfc_cache and mfc6_cache are almost identical - the main difference is
in the origin/group addresses and comparison-key. Make a common
structure encapsulating most of the multicast routing logic - mr_mfc
and convert both ipmr and ip6mr into using it.

For easy conversion [casting, in this case] mr_mfc has to be the first
field inside every multicast routing abstraction utilizing it.
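The "first field" requirement can be illustrated with a small userspace sketch. The struct and function names below are hypothetical and heavily simplified; the point is only that C allows a pointer to a struct to be converted to a pointer to its first member and back, which is what makes the cast-based conversion safe.

```c
#include <stddef.h>

/* Common part shared by both protocol-specific cache entries. */
struct demo_mr_mfc {
    int parent;
    unsigned long refcount;
};

/* IPv4-style entry: common part first, then the addresses. */
struct demo_mfc_cache {
    struct demo_mr_mfc _c; /* must stay the first field */
    unsigned int origin;
    unsigned int group;
};

/* IPv6-style entry: same common part, wider addresses. */
struct demo_mfc6_cache {
    struct demo_mr_mfc _c; /* must stay the first field */
    unsigned char origin[16];
    unsigned char group[16];
};

/* Common code works on the shared part only. */
void demo_mr_cache_hold(struct demo_mr_mfc *c)
{
    c->refcount++;
}

/* Protocol code converts back with a plain cast, which is valid
 * only because _c sits at offset 0 of the containing struct. */
struct demo_mfc6_cache *demo_to_mfc6(struct demo_mr_mfc *c)
{
    return (struct demo_mfc6_cache *)c;
}
```

If the common struct were moved away from offset 0, the cast in `demo_to_mfc6()` would silently point at the wrong address, so a container_of-style offset computation would be needed instead.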

Signed-off-by: Yuval Mintz
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Yuval Mintz
2018-03-02 02:13:23 +0800