Eric Lee / smarc-fsl-linux-kernel

26 Apr, 2020

3 commits

659d4587f net: openvswitch: use div_u64() for 64-by-32 divisions ... Browse Code »

Compile the kernel for arm 32 platform, the build warning found.
To fix that, should use div_u64() for divisions.
| net/openvswitch/meter.c:396: undefined reference to `__udivdi3'

[add more commit msg, change reported tag, and use div_u64 instead
of do_div by Tonghao]

Fixes: e57358873bb5d6ca ("net: openvswitch: use u64 for meter bucket")
Reported-by: kbuild test robot
Signed-off-by: Tonghao Zhang
Tested-by: Tonghao Zhang
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-26 11:48:21 +0800
4b36a0dff net: openvswitch: suitable access to the dp_meters ... Browse Code »

To fix the following sparse warning:
| net/openvswitch/meter.c:109:38: sparse: sparse: incorrect type
| in assignment (different address spaces) ...
| net/openvswitch/meter.c:720:45: sparse: sparse: incorrect type
| in argument 1 (different address spaces) ...

Reported-by: kbuild test robot
Signed-off-by: Tonghao Zhang
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-26 11:48:21 +0800
d48338967 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Simple overlapping changes to linux/vermagic.h

Signed-off-by: David S. Miller

David S. Miller
2020-04-26 11:18:53 +0800

24 Apr, 2020

5 commits

e57358873 net: openvswitch: use u64 for meter bucket ... Browse Code »

When setting the meter rate to 4+Gbps, there is an
overflow, the meters don't work as expected.

Cc: Pravin B Shelar
Cc: Andy Zhou
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-24 09:26:11 +0800
c77350089 net: openvswitch: make EINVAL return value more obvious ... Browse Code »

Cc: Pravin B Shelar
Cc: Andy Zhou
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-24 09:26:11 +0800
a8e387384 net: openvswitch: remove the unnecessary check ... Browse Code »

Before invoking the ovs_meter_cmd_reply_stats, "meter"
was checked, so don't check it agin in that function.

Cc: Pravin B Shelar
Cc: Andy Zhou
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-24 09:26:11 +0800
eb58eebc7 net: openvswitch: set max limitation to meters ... Browse Code »

Don't allow user to create meter unlimitedly, which may cause
to consume a large amount of kernel memory. The max number
supported is decided by physical memory and 20K meters as default.

Cc: Pravin B Shelar
Cc: Andy Zhou
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-24 09:26:11 +0800
c7c4c44c9 net: openvswitch: expand the meters supported number ... Browse Code »

In kernel datapath of Open vSwitch, there are only 1024
buckets of meter in one datapath. If installing more than
1024 (e.g. 8192) meters, it may lead to the performance drop.
But in some case, for example, Open vSwitch used as edge
gateway, there should be 20K at least, where meters used for
IP address bandwidth limitation.

[Open vSwitch userspace datapath has this issue too.]

For more scalable meter, this patch use meter array instead of
hash tables, and expand/shrink the array when necessary. So we
can install more meters than before in the datapath.
Introducing the struct *dp_meter_instance, it's easy to
expand meter though changing the *ti point in the struct
*dp_meter_table.

Cc: Pravin B Shelar
Cc: Andy Zhou
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-24 09:26:11 +0800

21 Apr, 2020

1 commit

27de77cec net: openvswitch: ovs_ct_exit to be done under ovs_lock ... Browse Code »

syzbot wrote:
| =============================
| WARNING: suspicious RCU usage
| 5.7.0-rc1+ #45 Not tainted
| -----------------------------
| net/openvswitch/conntrack.c:1898 RCU-list traversed in non-reader section!!
|
| other info that might help us debug this:
| rcu_scheduler_active = 2, debug_locks = 1
| ...
|
| stack backtrace:
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
| Workqueue: netns cleanup_net
| Call Trace:
| ...
| ovs_ct_exit
| ovs_exit_net
| ops_exit_list.isra.7
| cleanup_net
| process_one_work
| worker_thread

To avoid that warning, invoke the ovs_ct_exit under ovs_lock and add
lockdep_ovsl_is_held as optional lockdep expression.

Link: https://lore.kernel.org/lkml/000000000000e642a905a0cbee6e@google.com
Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Cc: Pravin B Shelar
Cc: Yi-Hung Wei
Reported-by: syzbot+7ef50afd3a211f879112@syzkaller.appspotmail.com
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-21 01:53:54 +0800

03 Apr, 2020

1 commit

64948427a net: openvswitch: use hlist_for_each_entry_rcu instead of hlist_for_each_entry ... Browse Code »

The struct sw_flow is protected by RCU, when traversing them,
use hlist_for_each_entry_rcu.

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Reviewed-by: Greg Rose
Signed-off-by: David S. Miller

Tonghao Zhang
2020-04-03 08:48:12 +0800

30 Mar, 2020

1 commit

a08e7fd91 net: Fix typo of SKB_SGO_CB_OFFSET ... Browse Code »

The SKB_SGO_CB_OFFSET should be SKB_GSO_CB_OFFSET which means the
offset of the GSO in skb cb. This patch fixes the typo.

Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation")
Signed-off-by: Cambda Zhu
Signed-off-by: David S. Miller

Cambda Zhu
2020-03-30 12:53:18 +0800

13 Mar, 2020

1 commit

1d3435793 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Minor overlapping changes, nothing serious.

Signed-off-by: David S. Miller

David S. Miller
2020-03-13 13:34:48 +0800

04 Mar, 2020

1 commit

b5ab1f1be openvswitch: add missing attribute validation for hash ... Browse Code »

Add missing attribute validation for OVS_PACKET_ATTR_HASH
to the netlink policy.

Fixes: bd1903b7c459 ("net: openvswitch: add hash info to upcall")
Signed-off-by: Jakub Kicinski
Reviewed-by: Greg Rose
Signed-off-by: David S. Miller

Jakub Kicinski
2020-03-04 05:28:48 +0800

22 Feb, 2020

1 commit

e65ee2fb5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Conflict resolution of ice_virtchnl_pf.c based upon work by
Stephen Rothwell.

Signed-off-by: David S. Miller

David S. Miller
2020-02-22 05:39:34 +0800

21 Feb, 2020

1 commit

16a556eeb openvswitch: Distribute switch variables for initialization ... Browse Code »

Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.

To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.

net/openvswitch/flow_netlink.c: In function ‘validate_set’:
net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be executed [-Wswitch-unreachable]
2711 | const struct ovs_key_ipv4 *ipv4_key;
| ^~~~~~~~

[1] https://bugs.llvm.org/show_bug.cgi?id=44916

Signed-off-by: Kees Cook
Signed-off-by: David S. Miller

Kees Cook
2020-02-21 02:00:19 +0800

19 Feb, 2020

4 commits

a2cfb96cc flow_table.c: Use built-in RCU list checking ... Browse Code »

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik
Signed-off-by: David S. Miller

Madhuparna Bhowmik
2020-02-19 04:46:27 +0800
53742e69e datapath.c: Use built-in RCU list checking ... Browse Code »

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik
Signed-off-by: David S. Miller

Madhuparna Bhowmik
2020-02-19 04:46:27 +0800
fed48423f vport.c: Use built-in RCU list checking ... Browse Code »

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik
Signed-off-by: David S. Miller

Madhuparna Bhowmik
2020-02-19 04:46:27 +0800
779061461 meter.c: Use built-in RCU list checking ... Browse Code »

hlist_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik
Signed-off-by: David S. Miller

Madhuparna Bhowmik
2020-02-19 04:46:26 +0800

17 Feb, 2020

1 commit

744676e77 openvswitch: add TTL decrement action ... Browse Code »

New action to decrement TTL instead of setting it to a fixed value.
This action will decrement the TTL and, in case of expired TTL, drop it
or execute an action passed via a nested attribute.
The default TTL expired action is to drop the packet.

Supports both IPv4 and IPv6 via the ttl and hop_limit fields, respectively.

Tested with a corresponding change in the userspace:

# ovs-dpctl dump-flows
in_port(2),eth(),eth_type(0x0800), packets:0, bytes:0, used:never, actions:dec_ttl{ttl 192.168.0.2: ICMP echo request, id 386, seq 1, length 64
# ping -c1 192.168.0.2 -t 120
IP (tos 0x0, ttl 119, id 62070, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.1 > 192.168.0.2: ICMP echo request, id 388, seq 1, length 64
# ping -c1 192.168.0.2 -t 1
#

Co-developed-by: Bindiya Kurle
Signed-off-by: Bindiya Kurle
Signed-off-by: Matteo Croce
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Matteo Croce
2020-02-17 11:34:44 +0800

15 Jan, 2020

1 commit

2cec4448d net: openvswitch: use skb_list_walk_safe helper for gso segments ... Browse Code »

This is a straight-forward conversion case for the new function, keeping
the flow of the existing code as intact as possible.

Signed-off-by: Jason A. Donenfeld
Signed-off-by: David S. Miller

Jason A. Donenfeld
2020-01-15 03:48:41 +0800

25 Dec, 2019

1 commit

f66b53fdb openvswitch: New MPLS actions for layer 2 tunnelling ... Browse Code »

The existing PUSH MPLS action inserts MPLS header between ethernet header
and the IP header. Though this behaviour is fine for L3 VPN where an IP
packet is encapsulated inside a MPLS tunnel, it does not suffice the L2
VPN (l2 tunnelling) requirements. In L2 VPN the MPLS header should
encapsulate the ethernet packet.

The new mpls action ADD_MPLS inserts MPLS header at the start of the
packet or at the start of the l3 header depending on the value of l3 tunnel
flag in the ADD_MPLS arguments.

POP_MPLS action is extended to support ethertype 0x6558.

Signed-off-by: Martin Varghese
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Martin Varghese
2019-12-25 14:24:45 +0800

10 Dec, 2019

1 commit

c593642c8 treewide: Use sizeof_field() macro ... Browse Code »

Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().

This patch is generated using following script:

EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do

if [[ "$file" =~ $EXCLUDE_FILES ]]; then
continue
fi
sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done

Signed-off-by: Pankaj Bharadiya
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook
Signed-off-by: Kees Cook
Acked-by: David Miller # for net

Pankaj Bharadiya
2019-12-10 02:36:44 +0800

05 Dec, 2019

2 commits

d04ac224b net: Fixed updating of ethertype in skb_mpls_push() ... Browse Code »

The skb_mpls_push was not updating ethertype of an ethernet packet if
the packet was originally received from a non ARPHRD_ETHER device.

In the below OVS data path flow, since the device corresponding to
port 7 is an l3 device (ARPHRD_NONE) the skb_mpls_push function does
not update the ethertype of the packet even though the previous
push_eth action had added an ethernet header to the packet.

recirc_id(0),in_port(7),eth_type(0x0800),ipv4(tos=0/0xfc,ttl=64,frag=no),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
push_mpls(label=13,tc=0,ttl=64,bos=1,eth_type=0x8847),4

Fixes: 8822e270d697 ("net: core: move push MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese
Signed-off-by: David S. Miller

Martin Varghese
2019-12-05 09:11:25 +0800
5d50aa83e openvswitch: support asymmetric conntrack ... Browse Code »

The openvswitch module shares a common conntrack and NAT infrastructure
exposed via netfilter. It's possible that a packet needs both SNAT and
DNAT manipulation, due to e.g. tuple collision. Netfilter can support
this because it runs through the NAT table twice - once on ingress and
again after egress. The openvswitch module doesn't have such capability.

Like netfilter hook infrastructure, we should run through NAT twice to
keep the symmetry.

Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Signed-off-by: Aaron Conole
Signed-off-by: David S. Miller

Aaron Conole
2019-12-05 08:31:15 +0800

03 Dec, 2019

1 commit

040b5cfbc Fixed updating of ethertype in function skb_mpls_pop ... Browse Code »

The skb_mpls_pop was not updating ethertype of an ethernet packet if the
packet was originally received from a non ARPHRD_ETHER device.

In the below OVS data path flow, since the device corresponding to port 7
is an l3 device (ARPHRD_NONE) the skb_mpls_pop function does not update
the ethertype of the packet even though the previous push_eth action had
added an ethernet header to the packet.

recirc_id(0),in_port(7),eth_type(0x8847),
mpls(label=12/0xfffff,tc=0/0,ttl=0/0x0,bos=1/1),
actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),
pop_mpls(eth_type=0x800),4

Fixes: ed246cee09b9 ("net: core: move pop MPLS functionality from OvS to core helper")
Signed-off-by: Martin Varghese
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Martin Varghese
2019-12-03 05:03:50 +0800

02 Dec, 2019

2 commits

8a574f866 openvswitch: remove another BUG_ON() ... Browse Code »

If we can't build the flow del notification, we can simply delete
the flow, no need to crash the kernel. Still keep a WARN_ON to
preserve debuggability.

Note: the BUG_ON() predates the Fixes tag, but this change
can be applied only after the mentioned commit.

v1 -> v2:
- do not leak an skb on error

Fixes: aed067783e50 ("openvswitch: Minimize ovs_flow_cmd_del critical section.")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2019-12-02 05:21:24 +0800
8ffeb03fb openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info() ... Browse Code »

All the callers of ovs_flow_cmd_build_info() already deal with
error return code correctly, so we can handle the error condition
in a more gracefull way. Still dump a warning to preserve
debuggability.

v1 -> v2:
- clarify the commit message
- clean the skb and report the error (DaveM)

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2019-12-02 05:21:24 +0800

27 Nov, 2019

1 commit

4e81c0b3f openvswitch: fix flow command message size ... Browse Code »

When user-space sets the OVS_UFID_F_OMIT_* flags, and the relevant
flow has no UFID, we can exceed the computed size, as
ovs_nla_put_identifier() will always dump an OVS_FLOW_ATTR_KEY
attribute.
Take the above in account when computing the flow command message
size.

Fixes: 74ed7ab9264c ("openvswitch: Add support for unique flow IDs.")
Reported-by: Qi Jun Ding
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller

Paolo Abeni
2019-11-27 07:13:46 +0800

16 Nov, 2019

1 commit

61ca533c0 net: openvswitch: don't call pad_packet if not necessary ... Browse Code »

The nla_put_u16/nla_put_u32 makes sure that
*attrlen is align. The call tree is that:

nla_put_u16/nla_put_u32
-> nla_put attrlen = sizeof(u16) or sizeof(u32)
-> __nla_put attrlen
-> __nla_reserve attrlen
-> skb_put(skb, nla_total_size(attrlen))

nla_total_size returns the total length of attribute
including padding.

Cc: Joe Stringer
Cc: William Tu
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-16 04:43:27 +0800

15 Nov, 2019

1 commit

bd1903b7c net: openvswitch: add hash info to upcall ... Browse Code »

When using the kernel datapath, the upcall don't
include skb hash info relatived. That will introduce
some problem, because the hash of skb is important
in kernel stack. For example, VXLAN module uses
it to select UDP src port. The tx queue selection
may also use the hash in stack.

Hash is computed in different ways. Hash is random
for a TCP socket, and hash may be computed in hardware,
or software stack. Recalculation hash is not easy.

Hash of TCP socket is computed:
tcp_v4_connect
-> sk_set_txhash (is random)

__tcp_transmit_skb
-> skb_set_hash_from_sk

There will be one upcall, without information of skb
hash, to ovs-vswitchd, for the first packet of a TCP
session. The rest packets will be processed in Open vSwitch
modules, hash kept. If this tcp session is forward to
VXLAN module, then the UDP src port of first tcp packet
is different from rest packets.

TCP packets may come from the host or dockers, to Open vSwitch.
To fix it, we store the hash info to upcall, and restore hash
when packets sent back.

+---------------+ +-------------------------+
| Docker/VMs | | ovs-vswitchd |
+----+----------+ +-+--------------------+--+
| ^ |
| | |
| | upcall v restore packet hash (not recalculate)
| +-+--------------------+--+
| tap netdev | | vxlan module
+---------------> +--> Open vSwitch ko +-->
or internal type | |
+-------------------------+

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-15 09:29:46 +0800

07 Nov, 2019

1 commit

90ce9f23a net: openvswitch: select vport upcall portid directly ... Browse Code »

The commit 69c51582ff786 ("dpif-netlink: don't allocate per
thread netlink sockets"), in Open vSwitch ovs-vswitchd, has
changed the number of allocated sockets to just one per port
by moving the socket array from a per handler structure to
a per datapath one. In the kernel datapath, a vport will have
only one socket in most case, if so select it directly in
fast-path.

Signed-off-by: Tonghao Zhang
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-07 13:25:19 +0800

06 Nov, 2019

1 commit

fbdcdd78d Change in Openvswitch to support MPLS label depth of 3 in ingress direction ... Browse Code »

The openvswitch was supporting a MPLS label depth of 1 in the ingress
direction though the userspace OVS supports a max depth of 3 labels.
This change enables openvswitch module to support a max depth of
3 labels in the ingress.

Signed-off-by: Martin Varghese
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Martin Varghese
2019-11-06 10:02:29 +0800

04 Nov, 2019

7 commits

eec62eadd net: openvswitch: simplify the ovs_dp_cmd_new ... Browse Code »

use the specified functions to init resource.

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:04 +0800
4c76bf696 net: openvswitch: don't unlock mutex when changing the user_features fails ... Browse Code »

Unlocking of a not locked mutex is not allowed.
Other kernel thread may be in critical section while
we unlock it because of setting user_feature fail.

Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
Cc: Paul Blakey
Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: William Tu
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:04 +0800
50b0e61b3 net: openvswitch: fix possible memleak on destroy flow-table ... Browse Code »

When we destroy the flow tables which may contain the flow_mask,
so release the flow mask struct.

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:03 +0800
0a3e01371 net: openvswitch: add likely in flow_lookup ... Browse Code »

The most case *index < ma->max, and flow-mask is not NULL.
We add un/likely for performance.

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: William Tu
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:03 +0800
515b65a4b net: openvswitch: simplify the flow_hash ... Browse Code »

Simplify the code and remove the unnecessary BUILD_BUG_ON.

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: William Tu
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:03 +0800
57f7d7b91 net: openvswitch: optimize flow-mask looking up ... Browse Code »

The full looking up on flow table traverses all mask array.
If mask-array is too large, the number of invalid flow-mask
increase, performance will be drop.

One bad case, for example: M means flow-mask is valid and NULL
of flow-mask means deleted.

+-------------------------------------------+
| M | NULL | ... | NULL | M|
+-------------------------------------------+

In that case, without this patch, openvswitch will traverses all
mask array, because there will be one flow-mask in the tail. This
patch changes the way of flow-mask inserting and deleting, and the
mask array will be keep as below: there is not a NULL hole. In the
fast path, we can "break" "for" (not "continue") in flow_lookup
when we get a NULL flow-mask.

"break"
v
+-------------------------------------------+
| M | M | NULL |... | NULL | NULL|
+-------------------------------------------+

This patch don't optimize slow or control path, still using ma->max
to traverse. Slow path:
* tbl_mask_array_realloc
* ovs_flow_tbl_lookup_exact
* flow_mask_find

Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:03 +0800
a7f35e78e net: openvswitch: optimize flow mask cache hash collision ... Browse Code »

Port the codes to linux upstream and with little changes.

Pravin B Shelar, says:
| In case hash collision on mask cache, OVS does extra flow
| lookup. Following patch avoid it.

Link: https://github.com/openvswitch/ovs/commit/0e6efbe2712da03522532dc5e84806a96f6a0dd1
Signed-off-by: Tonghao Zhang
Tested-by: Greg Rose
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller

Tonghao Zhang
2019-11-04 09:18:03 +0800