16 Nov, 2019

1 commit

  • Add a few kernel functions with varying numbers of arguments
    and varying argument types and sizes for BPF trampoline testing,
    to cover different calling conventions.
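
    A sketch of the pattern (reconstructed from the series, so treat
    the exact bodies as approximate): each function is noinline so the
    trampoline has a stable attach point, and the argument lists grow
    in count and mix types of different sizes.

    int noinline bpf_fentry_test1(int a)
    {
            return a + 1;
    }

    int noinline bpf_fentry_test2(int a, u64 b)
    {
            return a + b;
    }

    int noinline bpf_fentry_test3(char a, int b, u64 c)
    {
            return a + b + c;
    }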

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Acked-by: Song Liu
    Link: https://lore.kernel.org/bpf/20191114185720.1641606-9-ast@kernel.org

    Alexei Starovoitov
     

04 Nov, 2019

13 commits

  • traceroute6 output can be confusing, in that it shows the address
    that a router would use to reach the sender, rather than the address
    the packet used to reach the router.
    Consider this case:

     ------------------------ N2
      |                   |
    ------              ------  N3  ----
    | R1 |              | R2 |------|H2|
    ------              ------      ----
      |                   |
     ------------------------ N1
              |
             ----
             |H1|
             ----

    where H1's default route is through R1, and R1's default route is
    through R2 over N2.
    traceroute6 from H1 to H2 shows R2's address on N1 rather than on N2.

    The script below can be used to reproduce this scenario.

    traceroute6 output without this patch:

    traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
    1 2000:101::1 (2000:101::1) 0.036 ms 0.008 ms 0.006 ms
    2 2000:101::2 (2000:101::2) 0.011 ms 0.008 ms 0.007 ms
    3 2000:103::4 (2000:103::4) 0.013 ms 0.010 ms 0.009 ms

    traceroute6 output with this patch:

    traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets
    1 2000:101::1 (2000:101::1) 0.056 ms 0.019 ms 0.006 ms
    2 2000:102::2 (2000:102::2) 0.013 ms 0.008 ms 0.008 ms
    3 2000:103::4 (2000:103::4) 0.013 ms 0.009 ms 0.009 ms

    #!/bin/bash
    #
    #  ------------------------ N2
    #   |                   |
    # ------              ------  N3  ----
    # | R1 |              | R2 |------|H2|
    # ------              ------      ----
    #   |                   |
    #  ------------------------ N1
    #           |
    #          ----
    #          |H1|
    #          ----
    #
    # N1: 2000:101::/64
    # N2: 2000:102::/64
    # N3: 2000:103::/64
    #
    # R1's host part of address: 1
    # R2's host part of address: 2
    # H1's host part of address: 3
    # H2's host part of address: 4
    #
    # For example:
    # the IPv6 address of R1's interface on N2 is 2000:102::1/64
    #
    # Nets are implemented by macvlan interfaces (bridge mode) over
    # dummy interfaces.
    #

    # Create net namespaces
    ip netns add host1
    ip netns add host2
    ip netns add rtr1
    ip netns add rtr2

    # Create nets
    ip link add net1 type dummy; ip link set net1 up
    ip link add net2 type dummy; ip link set net2 up
    ip link add net3 type dummy; ip link set net3 up

    # Add interfaces to net1, move them to their namespaces
    ip link add link net1 dev host1net1 type macvlan mode bridge
    ip link set host1net1 netns host1
    ip link add link net1 dev rtr1net1 type macvlan mode bridge
    ip link set rtr1net1 netns rtr1
    ip link add link net1 dev rtr2net1 type macvlan mode bridge
    ip link set rtr2net1 netns rtr2

    # Add interfaces to net2, move them to their namespaces
    ip link add link net2 dev rtr1net2 type macvlan mode bridge
    ip link set rtr1net2 netns rtr1
    ip link add link net2 dev rtr2net2 type macvlan mode bridge
    ip link set rtr2net2 netns rtr2

    # Add interfaces to net3, move them to their namespaces
    ip link add link net3 dev rtr2net3 type macvlan mode bridge
    ip link set rtr2net3 netns rtr2
    ip link add link net3 dev host2net3 type macvlan mode bridge
    ip link set host2net3 netns host2

    # Configure interfaces and routes in host1
    ip netns exec host1 ip link set lo up
    ip netns exec host1 ip link set host1net1 up
    ip netns exec host1 ip -6 addr add 2000:101::3/64 dev host1net1
    ip netns exec host1 ip -6 route add default via 2000:101::1

    # Configure interfaces and routes in rtr1
    ip netns exec rtr1 ip link set lo up
    ip netns exec rtr1 ip link set rtr1net1 up
    ip netns exec rtr1 ip -6 addr add 2000:101::1/64 dev rtr1net1
    ip netns exec rtr1 ip link set rtr1net2 up
    ip netns exec rtr1 ip -6 addr add 2000:102::1/64 dev rtr1net2
    ip netns exec rtr1 ip -6 route add default via 2000:102::2
    ip netns exec rtr1 sysctl net.ipv6.conf.all.forwarding=1

    # Configure interfaces and routes in rtr2
    ip netns exec rtr2 ip link set lo up
    ip netns exec rtr2 ip link set rtr2net1 up
    ip netns exec rtr2 ip -6 addr add 2000:101::2/64 dev rtr2net1
    ip netns exec rtr2 ip link set rtr2net2 up
    ip netns exec rtr2 ip -6 addr add 2000:102::2/64 dev rtr2net2
    ip netns exec rtr2 ip link set rtr2net3 up
    ip netns exec rtr2 ip -6 addr add 2000:103::2/64 dev rtr2net3
    ip netns exec rtr2 sysctl net.ipv6.conf.all.forwarding=1

    # Configure interfaces and routes in host2
    ip netns exec host2 ip link set lo up
    ip netns exec host2 ip link set host2net3 up
    ip netns exec host2 ip -6 addr add 2000:103::4/64 dev host2net3
    ip netns exec host2 ip -6 route add default via 2000:103::2

    # Ping host2 from host1
    ip netns exec host1 ping6 -c5 2000:103::4

    # Traceroute host2 from host1
    ip netns exec host1 traceroute6 2000:103::4

    # Delete nets
    ip link del net3
    ip link del net2
    ip link del net1

    # Delete namespaces
    ip netns del rtr2
    ip netns del rtr1
    ip netns del host2
    ip netns del host1

    Signed-off-by: Francesco Ruggeri
    Original-patch-by: Honggang Xu
    Signed-off-by: David S. Miller

    Francesco Ruggeri
     
  • As mentioned in commit e95584a889e1 ("tipc: fix unlimited bundling of
    small messages"), the current message bundling algorithm is
    inefficient: it can generate bundles containing only one payload
    message, which causes unnecessary overhead for both the sender and
    receiver.

    This commit re-designs the 'tipc_msg_make_bundle()' function (now
    named 'tipc_msg_try_bundle()') so that when the first message
    arrives, we just check it and keep a reference to it if it is
    suitable for bundling. The message buffer is put into the link
    backlog queue and processed as normal. Later, when another message
    arrives, we make a bundle with the first one if possible, and so on.
    This way, a bundle, if really needed, always consists of at least
    two payload messages. Otherwise, we let the first buffer go its own
    way without any bundling, reducing the overhead to zero.

    Moreover, since we now have both messages in hand, we can even
    optimize the 'tipc_msg_bundle()' function to bundle a very large
    message (size close to the MSS) with a small one, which the current
    algorithm cannot do, e.g. a 1400-byte message plus a 10-byte message
    with an MTU of 1500.
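
    A hedged sketch of the new decision flow (the names below are
    illustrative, not the actual TIPC symbols):

    /* First suitable message: only remember it; it still goes to the
     * backlog queue as normal. A bundle is created only when a second
     * message fits, so every bundle holds at least two payloads.
     */
    static bool try_bundle(struct sk_buff **held, struct sk_buff *skb,
                           u32 mss)
    {
            if (!*held) {
                    if (msg_size(buf_msg(skb)) < mss)
                            *held = skb;   /* candidate for bundling */
                    return false;          /* no bundle made yet */
            }
            if (msg_size(buf_msg(*held)) + msg_size(buf_msg(skb)) > mss)
                    return false;          /* would exceed MSS */
            bundle_append(*held, skb);     /* hypothetical helper */
            return true;
    }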

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tuong Lien
    Signed-off-by: David S. Miller

    Tuong Lien
     
  • Even with icmp_errors_use_inbound_ifaddr set, traceroute returns the
    primary address of the interface the packet was received on, even if
    the path goes through a secondary address. In the example:

                                  1.0.3.1/24
    ---- 1.0.1.3/24    1.0.1.1/24 ---- 1.0.2.1/24    1.0.2.4/24 ----
    |H1|--------------------------|R1|--------------------------|H2|
    ----            N1            ----            N2            ----

    where 1.0.3.1/24 is R1's primary address on N1, traceroute from
    H1 to H2 returns:

    traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
    1 1.0.3.1 (1.0.3.1) 0.018 ms 0.006 ms 0.006 ms
    2 1.0.2.4 (1.0.2.4) 0.021 ms 0.007 ms 0.007 ms

    After applying this patch, it returns:

    traceroute to 1.0.2.4 (1.0.2.4), 30 hops max, 60 byte packets
    1 1.0.1.1 (1.0.1.1) 0.033 ms 0.007 ms 0.006 ms
    2 1.0.2.4 (1.0.2.4) 0.011 ms 0.007 ms 0.007 ms

    Original-patch-by: Bill Fenner
    Signed-off-by: Francesco Ruggeri
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Francesco Ruggeri
     
  • Use the specified functions to initialize resources.

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • Unlocking a mutex that is not locked is not allowed. Another
    kernel thread may be inside its critical section while we unlock
    it after setting user_feature fails.

    Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
    Cc: Paul Blakey
    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • When we destroy the flow tables, they may still contain flow_mask
    entries, so release the flow-mask structs as well.

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • In most cases *index < ma->max and the flow-mask is not NULL.
    Add likely()/unlikely() annotations for performance, as illustrated
    below.
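
    A hedged illustration of the annotated fast path (field and helper
    names follow OVS conventions but are assumptions here):

    struct sw_flow_mask *mask;

    /* the common case: a valid cached index pointing at a live mask */
    if (likely(*index < ma->max)) {
            mask = rcu_dereference(ma->masks[*index]);
            if (likely(mask))
                    return masked_flow_lookup(ti, key, mask, n_mask_hit);
    }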

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • Simplify the code and remove the unnecessary BUILD_BUG_ON.

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • A full lookup on the flow table traverses the whole mask array.
    If the mask array is too large, the number of invalid flow-masks
    increases and performance drops.

    One bad case, for example (M means the flow-mask is valid and NULL
    means it has been deleted):

    +-------------------------------------------+
    | M | NULL | ...                 | NULL | M |
    +-------------------------------------------+

    In that case, without this patch, openvswitch traverses the whole
    mask array, because a valid flow-mask sits at the tail. This patch
    changes the way flow-masks are inserted and deleted, so that the
    mask array is kept as below, with no NULL holes. In the fast path,
    flow_lookup can then "break" out of the "for" loop (rather than
    "continue") when it hits a NULL flow-mask.

    "break"
    v
    +-------------------------------------------+
    | M | M | NULL |... | NULL | NULL|
    +-------------------------------------------+

    This patch doesn't optimize the slow or control path, which still
    uses ma->max to traverse. Slow path:
    * tbl_mask_array_realloc
    * ovs_flow_tbl_lookup_exact
    * flow_mask_find
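
    A hedged sketch of the resulting fast path (not the exact code):

    /* With a compacted mask array, the first NULL entry guarantees
     * that no valid mask follows it, so we can break instead of
     * scanning all ma->max slots.
     */
    for (i = 0; i < ma->max; i++) {
            struct sw_flow_mask *mask = rcu_dereference(ma->masks[i]);

            if (!mask)
                    break;    /* no holes: nothing valid after this */

            flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
            if (flow)         /* matched with this mask */
                    return flow;
    }
    return NULL;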

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • Port the code to Linux upstream with small changes.

    Pravin B Shelar, says:
    | In case hash collision on mask cache, OVS does extra flow
    | lookup. Following patch avoid it.

    Link: https://github.com/openvswitch/ovs/commit/0e6efbe2712da03522532dc5e84806a96f6a0dd1
    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • When creating and inserting a flow-mask, if there is no available
    slot, we realloc the mask array. When removing a flow-mask, we
    shrink the mask array if necessary.

    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • Port the code to Linux upstream with small changes.

    Pravin B Shelar, says:
    | mask caches index of mask in mask_list. On packet recv OVS
    | need to traverse mask-list to get cached mask. Therefore array
    | is better for retrieving cached mask. This also allows better
    | cache replacement algorithm by directly checking mask's existence.

    Link: https://github.com/openvswitch/ovs/commit/d49fc3ff53c65e4eca9cabd52ac63396746a7ef5
    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     
  • The idea of this optimization comes from a patch committed to the
    openvswitch community in 2014 by Pravin B Shelar. In order to get
    high performance, I implement it again. Later patches will use it.

    Pravin B Shelar, says:
    | On every packet OVS needs to lookup flow-table with every
    | mask until it finds a match. The packet flow-key is first
    | masked with mask in the list and then the masked key is
    | looked up in flow-table. Therefore number of masks can
    | affect packet processing performance.
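
    A hedged sketch of the cache idea (the structure below is an
    assumption, not necessarily the exact upstream layout):

    /* Per-CPU cache mapping a packet's skb hash to the index of the
     * mask that matched it last time, so most packets try a single
     * mask before falling back to the full scan.
     */
    struct mask_cache_entry {
            u32 skb_hash;     /* flow hash of the packet */
            u32 mask_index;   /* index into the mask array */
    };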

    Link: https://github.com/openvswitch/ovs/commit/5604935e4e1cbc16611d2d97f50b717aa31e8ec5
    Signed-off-by: Tonghao Zhang
    Tested-by: Greg Rose
    Acked-by: William Tu
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Tonghao Zhang
     

03 Nov, 2019

2 commits

  • Alexei Starovoitov says:

    ====================
    pull-request: bpf-next 2019-11-02

    The following pull-request contains BPF updates for your *net-next* tree.

    We've added 30 non-merge commits during the last 7 day(s) which contain
    a total of 41 files changed, 1864 insertions(+), 474 deletions(-).

    The main changes are:

    1) Fix long standing user vs kernel access issue by introducing
    bpf_probe_read_user() and bpf_probe_read_kernel() helpers, from Daniel.

    2) Accelerated xskmap lookup, from Björn and Maciej.

    3) Support for automatic map pinning in libbpf, from Toke.

    4) Cleanup of BTF-enabled raw tracepoints, from Alexei.

    5) Various fixes to libbpf and selftests.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The only slightly tricky merge conflict was the netdevsim because the
    mutex locking fix overlapped a lot of driver reload reorganization.

    The rest were (relatively) trivial in nature.

    Signed-off-by: David S. Miller

    David S. Miller
     

02 Nov, 2019

8 commits

  • Pull networking fixes from David Miller:

    1) Fix free/alloc races in batmanadv, from Sven Eckelmann.

    2) Several leaks and other fixes in kTLS support of mlx5 driver, from
    Tariq Toukan.

    3) BPF devmap_hash cost calculation can overflow on 32-bit, from Toke
    Høiland-Jørgensen.

    4) Add an r8152 device ID, from Kazutoshi Noguchi.

    5) Missing include in ipv6's addrconf.c, from Ben Dooks.

    6) Use siphash in flow dissector, from Eric Dumazet. Attackers can
    easily infer the 32-bit secret otherwise etc.

    7) Several netdevice nesting depth fixes from Taehee Yoo.

    8) Fix several KCSAN reported errors, from Eric Dumazet. For example,
    when doing lockless skb_queue_empty() checks, and accessing
    sk_napi_id/sk_incoming_cpu lockless as well.

    9) Fix jumbo packet handling in RXRPC, from David Howells.

    10) Bump SOMAXCONN and tcp_max_syn_backlog values, from Eric Dumazet.

    11) Fix DMA synchronization in gve driver, from Yangchun Fu.

    12) Several bpf offload fixes, from Jakub Kicinski.

    13) Fix sk_page_frag() recursion during memory reclaim, from Tejun Heo.

    14) Fix ping latency during high traffic rates in hisilicon driver, from
    Jiangfeng Xiao.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
    net: fix installing orphaned programs
    net: cls_bpf: fix NULL deref on offload filter removal
    selftests: bpf: Skip write only files in debugfs
    selftests: net: reuseport_dualstack: fix uninitalized parameter
    r8169: fix wrong PHY ID issue with RTL8168dp
    net: dsa: bcm_sf2: Fix IMP setup for port different than 8
    net: phylink: Fix phylink_dbg() macro
    gve: Fixes DMA synchronization.
    inet: stop leaking jiffies on the wire
    ixgbe: Remove duplicate clear_bit() call
    Documentation: networking: device drivers: Remove stray asterisks
    e1000: fix memory leaks
    i40e: Fix receive buffer starvation for AF_XDP
    igb: Fix constant media auto sense switching when no cable is connected
    net: ethernet: arc: add the missed clk_disable_unprepare
    igb: Enable media autosense for the i350.
    igb/igc: Don't warn on fatal read failures when the device is removed
    tcp: increase tcp_max_syn_backlog max value
    net: increase SOMAXCONN to 4096
    netdevsim: Fix use-after-free during device dismantle
    ...

    Linus Torvalds
     
  • In this commit the XSKMAP entry lookup function used by the XDP
    redirect code is moved from the xskmap.c file to the xdp_sock.h
    header, so the lookup can be inlined from, e.g., the
    bpf_xdp_redirect_map() function.

    Further, __xsk_map_redirect() and __xsk_map_flush() are moved to
    xsk.c, which lets the compiler inline the xsk_rcv() and xsk_flush()
    functions.

    Finally, all the XDP socket functions were moved from linux/bpf.h to
    net/xdp_sock.h, where most of the XDP sockets functions are anyway.

    This yields a ~2% performance boost for the xdpsock "rx_drop"
    scenario.
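
    The moved lookup is small enough to live in the header; a sketch of
    its shape (close to, but not guaranteed to match, the final code):

    static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map,
                                                         u32 key)
    {
            struct xsk_map *m = container_of(map, struct xsk_map, map);

            if (key >= map->max_entries)  /* out of range: no socket */
                    return NULL;

            return READ_ONCE(m->xsk_map[key]);  /* lockless read */
    }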

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191101110346.15004-4-bjorn.topel@gmail.com

    Björn Töpel
     
  • When a netdevice with offloaded BPF programs is destroyed,
    the programs are orphaned and removed from the program
    IDA - their IDs get released (the programs may remain
    accessible via existing open file descriptors and pinned
    files). After IDs are released they are set to 0.

    This confuses dev_change_xdp_fd() because it compares
    the __dev_xdp_query() result, where 0 means no program,
    with prog->aux->id, where 0 means orphaned.

    dev_change_xdp_fd() would have incorrectly returned success
    even though it had not installed the program.

    Since drivers already catch this case via bpf_offload_dev_match(),
    let them handle it. The error message drivers produce in this case
    ("program loaded for a different device") is in fact correct, as the
    orphaned program must indeed have been loaded for a different
    device.

    Fixes: c14a9f633d9e ("net: Don't call XDP_SETUP_PROG when nothing is changed")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • Commit 401192113730 ("net: sched: refactor block offloads counter
    usage") missed the fact that either new prog or old prog may be
    NULL.

    Fixes: 401192113730 ("net: sched: refactor block offloads counter usage")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • Historically, Linux tried to stick to RFCs 791, 1122, and 2003
    for IPv4 ID field generation.

    RFC 6864 made clear that no matter how hard we try,
    we cannot ensure uniqueness of the IP ID within the maximum
    lifetime for all datagrams with a given source
    address/destination address/protocol tuple.

    Linux uses a per-socket inet generator (inet_id), initialized
    at connection startup with an XOR of 'jiffies' and other
    fields that appear in the clear on the wire.

    Thiemo Nagel pointed out that this strategy is a privacy
    concern, as it provides 16 bits of entropy to fingerprint
    devices.

    Let's switch to a random starting point; this is just as
    good as far as RFC 6864 is concerned and does not leak
    anything critical.
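
    The exact call sites vary, but the shape of the change is simply
    (a hedged sketch):

    /* before: seeded from jiffies and fields visible on the wire */
    inet->inet_id = tp->write_seq ^ jiffies;

    /* after: a random starting point, equally valid per RFC 6864 */
    inet->inet_id = prandom_u32();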

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet
    Reported-by: Thiemo Nagel
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Taking over hw-learned entries is not a likely scenario, so restore
    the unlikely() use for the case of SW taking over externally learned
    entries.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • If we setup the fdb flags prior to calling fdb_create() we can avoid
    two atomic bitops when learning a new entry.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • If we modify br_fdb_update() to take flags directly we can get rid of
    one test and one atomic bitop in the learning path.
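
    A hedged sketch of the combined idea behind these fdb patches (the
    flag name is an assumption):

    unsigned long flags = 0;

    /* decide the flags up front... */
    if (added_by_user)
            flags |= BIT(BR_FDB_ADDED_BY_USER);

    /* ...so br_fdb_update() and fdb_create() can store them at entry
     * creation time instead of flipping bits atomically afterwards */
    br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);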

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Nov, 2019

10 commits

  • Now that there's no restriction from the DSA core side regarding
    the switch IDs and port numbers, only tag_8021q, which currently
    reserves 3 bits for the switch ID and 4 bits for the port number,
    imposes limits on these values. Update their descriptions to
    reflect that, as modeled below.
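
    A rough model of where the limits come from (the bit positions are
    illustrative assumptions, not the exact tag_8021q layout): packing
    both a switch ID and a port number into the 12-bit VLAN ID leaves
    3 bits for the switch (at most 8 switches) and 4 bits for the port
    (at most 16 ports).

    /* illustrative carving of a 12-bit VLAN ID */
    #define EX_8021Q_PORT(vid)    ((vid) & 0xf)         /* 4 bits: port   */
    #define EX_8021Q_SWITCH(vid)  (((vid) >> 4) & 0x7)  /* 3 bits: switch */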

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Because there is no longer a static array describing the links
    between switches, we have no reason to limit the index value
    set by the device tree.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The DSA fabric setup code has been simplified a lot so get rid of
    the dsa_tree_remove_switch, dsa_tree_add_switch and dsa_switch_add
    helpers, and keep the code simple with only the dsa_switch_probe and
    dsa_switch_remove functions.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Now that the DSA ports are listed in the switch fabric, there is
    no need to store the dsa_switch structures from the drivers in the
    fabric anymore. So get rid of the dst->ds static array.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The dsa_switch structure has no routing table specific data to setup,
    so the switch fabric can directly walk its ports and initialize its
    routing table from them.

    This allows us to remove the dsa_switch_setup_routing_table function.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Drivers do not use the ds->rtable static arrays anymore, so get rid
    of them.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Implement a new list of DSA links in the switch fabric itself, to
    provide an alternative to the ds->rtable static arrays.

    At the same time, provide a new dsa_routing_port() helper to abstract
    the usage of ds->rtable in drivers. If there's no port to reach a
    given device, return the first invalid port, ds->num_ports. This avoids
    potential signedness errors or the need to define special values.
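
    A sketch of the helper's contract (the link-list field names are
    assumptions):

    /* Return the local port used to reach 'device', or the first
     * invalid port (ds->num_ports) if no route exists.
     */
    static inline unsigned int dsa_routing_port(struct dsa_switch *ds,
                                                int device)
    {
            struct dsa_link *dl;

            list_for_each_entry(dl, &ds->dst->rtable, list)
                    if (dl->dp->ds == ds && dl->link_dp->ds->index == device)
                            return dl->dp->index;

            return ds->num_ports;
    }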

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • tcp_max_syn_backlog default value depends on memory size
    and TCP ehash size. Before this patch, the max value
    was 2048 [1], which is considered too small nowadays.

    Increase it to 4096 to match the recent SOMAXCONN change.

    [1] This is with TCP ehash size being capped to 524288 buckets.

    Signed-off-by: Eric Dumazet
    Cc: Willy Tarreau
    Cc: Yue Cao
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • When rxrpc_recvmsg_data() sets the return value to 1 because it's drained
    all the data for the last packet, it checks the last-packet flag on the
    whole packet - but this is wrong, since the last-packet flag is only set on
    the final subpacket of the last jumbo packet. This means that a call that
    receives its last packet in a jumbo packet won't complete properly.

    Fix this by having rxrpc_locate_data() determine the last-packet state of
    the subpacket it's looking at and passing that back to the caller rather
    than having the caller look in the packet header. The caller then needs to
    cache this in the rxrpc_call struct as rxrpc_locate_data() isn't then
    called again for this packet.

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Fixes: e2de6c404898 ("rxrpc: Use info in skbuff instead of reparsing a jumbo packet")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     
  • …rnel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    Just two fixes:
    * HT operation is not allowed on channel 14 (Japan only)
    * netlink policy for nexthop attribute was wrong
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     

31 Oct, 2019

6 commits

  • Extend struct tc_action with a new "tcfa_flags" field. Set the field
    in the tcf_idr_create() function and provide a new helper,
    tcf_idr_create_from_flags(), that derives the 'cpustats' boolean
    from the flags value. Update the individual hardware-offloaded
    actions' init() to pass their "flags" argument to the new helper in
    order to skip percpu stats allocation when the user requested that
    through flags, as sketched below.
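
    The new helper can be a thin wrapper; a sketch of its likely shape
    (treat the exact signature as an assumption):

    int tcf_idr_create_from_flags(struct tc_action_net *tn, u32 index,
                                  struct nlattr *est, struct tc_action **a,
                                  const struct tc_action_ops *ops, int bind,
                                  u32 flags)
    {
            /* percpu stats are allocated unless the flag opts out */
            return tcf_idr_create(tn, index, est, a, ops, bind,
                                  !(flags & TCA_ACT_FLAGS_NO_PERCPU_STATS),
                                  flags);
    }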

    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extend TCA_ACT space with nla_bitfield32 flags. Add
    TCA_ACT_FLAGS_NO_PERCPU_STATS as the only allowed flag. Parse the flags in
    tcf_action_init_1() and pass resulting value as additional argument to
    a_o->init().

    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Modify the stats update helper functions introduced in previous
    patches in this series to fall back to the regular
    tc_action->tcfa_{b|q}stats if cpu stats are not allocated for the
    action argument. If the regular non-percpu counters are in use,
    obtain the action's tcfa_lock while modifying them, as sketched
    below.
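
    A sketch of the fallback pattern for the bytes/packets counter (the
    qstats helpers follow the same shape):

    void tcf_action_update_bstats(struct tc_action *a, struct sk_buff *skb)
    {
            /* fast path: percpu counters need no lock */
            if (likely(a->cpu_bstats)) {
                    bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb);
                    return;
            }
            /* fallback: regular counters are protected by tcfa_lock */
            spin_lock(&a->tcfa_lock);
            bstats_update(&a->tcfa_bstats, skb);
            spin_unlock(&a->tcfa_lock);
    }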

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • A previous commit introduced helper functions for updating qstats
    and refactored a set of actions to use the helpers instead of
    modifying qstats directly. However, one of the affected actions
    exposes its qstats to skb_tc_reinsert(), which then modifies it.

    Refactor skb_tc_reinsert() to return an integer error code and not
    increment the overlimit qstats in case of error, and use the
    returned error code in tcf_mirred_act() to manually increment the
    overlimit counter with the new helper function.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extract common code that increments cpu_qstats counters into standalone act
    API functions. Change hardware offloaded actions that use percpu counter
    allocation to use the new functions instead of accessing cpu_qstats
    directly.

    This commit doesn't change functionality.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extract common code that increments cpu_bstats counter into standalone act
    API function. Change hardware offloaded actions that use percpu counter
    allocation to use the new function instead of incrementing cpu_bstats
    directly.

    This commit doesn't change functionality.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov