30 Jun, 2017
18 commits
-
Checks are added to the existing sockex3 and test_map_in_map tests.
Signed-off-by: Martin KaFai Lau
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller
-
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS.

The lookup returns a prog-id or map-id to the userspace.
The userspace can then use the BPF_PROG_GET_FD_BY_ID
or BPF_MAP_GET_FD_BY_ID to get a fd.

Signed-off-by: Martin KaFai Lau
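
For illustration, the lookup-then-get-fd sequence could be sketched from
userspace as below. This is a minimal sketch, not code from the patch: the
helper names sys_bpf() and prog_fd_from_prog_array() are made up for the
example, and it assumes kernel headers new enough to define
BPF_PROG_GET_FD_BY_ID and the prog_id member of union bpf_attr.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw bpf(2) syscall (hypothetical helper name). */
static int sys_bpf(int cmd, union bpf_attr *attr)
{
	return (int)syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/*
 * Look up slot `index` of a BPF_MAP_TYPE_PROG_ARRAY and turn the returned
 * prog-id into a new fd. Returns the fd, or -1 with errno set.
 * (Hypothetical helper, named for this example only.)
 */
static int prog_fd_from_prog_array(int map_fd, uint32_t index)
{
	union bpf_attr attr;
	uint32_t prog_id = 0;

	/* Step 1: the lookup stores a prog-id in the value buffer. */
	memset(&attr, 0, sizeof(attr));
	attr.map_fd = map_fd;
	attr.key = (uint64_t)(uintptr_t)&index;
	attr.value = (uint64_t)(uintptr_t)&prog_id;
	if (sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr) < 0)
		return -1;

	/* Step 2: exchange the prog-id for a file descriptor. */
	memset(&attr, 0, sizeof(attr));
	attr.prog_id = prog_id;
	return sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr);
}
```

For the map-in-map types the same pattern would apply with the map_id
member and BPF_MAP_GET_FD_BY_ID.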
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller
-
Version 3.70a of the Designware has additional DMA registers so
add those to the ethtool DMA Register dump.
Offset 9 - Receive Interrupt Watchdog Timer Register
Offset 10 - AXI Bus Mode Register
Offset 11 - AHB or AXI Status Register
Offset 22 - HW Feature Register

Signed-off-by: Thor Thayer
Acked-by: Giuseppe Cavallaro
Signed-off-by: David S. Miller
-
Saeed Mahameed says:
====================
mlx5-updates-2017-06-27 (Innova IPsec offload support)

This patchset adds support for Innova IPSec network interface card.
About Innova device:
--------------------
Innova is a network card with a ConnectX chip and an FPGA chip as a
bump-on-the-wire.

               Internal
+----------+     Link     +-----------------+
|          +--------------+  FPGA           |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+      +-------+  |  | Port |
+----------+     I2C      |      |  SBU  |  |  +------+
                          |      +-------+  |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   | Flash |
                          +-----+   +-------+

The FPGA synthesized logic is loaded from dedicated flash storage and has
access to its own dedicated DDR RAM.
The ConnectX chip firmware programs the FPGA by accessing its configuration
space over either the slow internal I2C link or the high-speed internal link.

The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU).
mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality,
while other components may handle the various SBU functionalities.

The driver opens high-speed reliable communication channels with the shell and
the SBU over the internal link.
These channels may be used for high-bandwidth configuration or for SBU-specific
out-of-band data paths.

About Innova IPSec device:
--------------------------
Innova IPSec is a network card that allows offloading IPSec cryptography operations
from the host CPU to the NIC. It is an Innova card with an IPSec SBU.
The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's
DDR memory.

               Internal
+----------+     Link     +-----------------+
|          +--------------+  FPGA           |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+      +-------+  |  | Port |
+----------+ Internal I2C |      | IPSec |  |  +------+
                          |      |  SBU  |  |
                          |      +-------+  |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   |       |
                          |     |   | Flash |
                          |SADB |   |       |
                          +-----+   +-------+

Modes and ciphers:
Currently the following modes and ciphers are supported:
* IPv4 and IPv6
* ESP tunnel and transport modes
* AES 128 and 256 bit encryption, with GCM authentication (RFC4106)

IV is generated using seqiv, in sync with Linux's geniv.
More modes and ciphers may be added later.

Notes:
In the future similar functionality will be included in a single-chip NIC.

About the driver:
-----------------
Patches 1-4 prepare some existing driver code for the new feature:
* Add support for reserved GIDs in the hardware GID table
* Allow multiple modules to enable hardware RoCE support independently
Patches 5-6 define structs and helper functions for QP work-queues.
Patches 7-11 add various FPGA-related features required for Innova
IPSec.
Patch 12 adds an abstraction layer for Mellanox IPSec-offload capable devices.
Patches 13-16 add IPSec offload support to the mlx5 netdevice.

This driver services the new IPSec offload API introduced in commit
d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")

Configuration Path:
If Innova IPSec device is detected, the mlx5e netdevice gets the new
NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload
capabilities, and also the matching TX checksum and GSO features.

The driver configures offloaded Security Associations (SAs) by sending
an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR.
These messages and their responses are sent over a high-speed channel.
Counters for ethtool are retrieved by the driver from the SBU.

Data path:
On receive path, the SBU decrypts ESP packets which match the offloaded SADB,
but keeps them encapsulated.
The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload
has taken place, the SA with which it was done, and the authentication result.

The ConnectX chip performs RX checksum offload on the packet, and RSS using the
ESP SPI value. The driver detects the special ethertype, and attaches a struct
secpath to the RX SKB, including flags to indicate that crypto offload took place,
the authentication result, and which xfrm_state was used for decryption, in the
olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate
patchset will add support for that in the xfrm stack.

On transmit path, the stack encapsulates the packet but does not encrypt it, and
indicates in the SKB's secpath that crypto offload is to be performed and the SA
to use to do so.
The driver avoids performing crypto-offload for ESP fragments, and packets with
IP options, as the SBU cannot currently do that. For eligible packets, the driver
prepends a special ethertype with metadata instructing the hardware to perform crypto offload.
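
The eligibility rule above (no fragments, no IP options) can be sketched
for IPv4 as follows. This is a userspace illustration over a raw header
buffer, not the driver's code: the function name ipv4_offload_eligible()
and the macro names are invented for the example, and IPv6 is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_IHL     5       /* header length in 32-bit words, no options */
#define IPV4_FLAG_MF     0x2000  /* "more fragments" bit */
#define IPV4_FRAG_OFFSET 0x1fff  /* fragment offset mask */

/*
 * Return true when the IPv4 header at `hdr` carries no IP options and
 * does not belong to a fragment. Hypothetical helper for illustration.
 */
static bool ipv4_offload_eligible(const uint8_t *hdr, size_t len)
{
	unsigned int ihl;
	uint16_t frag;

	if (len < 20 || (hdr[0] >> 4) != 4)
		return false;	/* too short or not IPv4 */

	ihl = hdr[0] & 0x0f;
	/* Flags and fragment offset are stored in network byte order. */
	frag = (uint16_t)((hdr[6] << 8) | hdr[7]);

	if (ihl != IPV4_MIN_IHL)
		return false;	/* IP options present, or malformed header */
	if (frag & (IPV4_FLAG_MF | IPV4_FRAG_OFFSET))
		return false;	/* first, middle or last fragment */
	return true;
}
```

A packet with only the "don't fragment" bit set still passes, since that
bit is outside both masks.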
The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer.
The driver trims it off, because the SBU automatically appends the trailer for offloaded packets.
The ConnectX chip performs TX checksum offload on inner UDP or TCP packets,
and GSO for TCP packets (duplicating the prepended metadata).
The segmented packets then undergo encryption in the SBU before going on the wire.

Performance:
We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
Using AES-NI with ESP GSO we get constant 4.1 Gbps.
Using crypto offload we get constant 18 Gbps.

Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately.
- Ilan Tayari
====================

Signed-off-by: David S. Miller
-
Ivan Khoronzhuk says:
====================
net: fix sw timestamping for non PTP packets

This series contains several corrections connected with timestamping
for cpsw and netcp drivers based on same cpts module.

Based on net/next
====================

Reviewed-by: Grygorii Strashko
Signed-off-by: David S. Miller
-
There is a cpts function to check if a packet can be timestamped with cpts.
It seems that ptp_classify_raw covers all cases listed with "case".

Signed-off-by: Ivan Khoronzhuk
Signed-off-by: David S. Miller
-
The cpts can timestamp only PTP packets at this moment, so the driver
cannot mark every packet as though it's going to be timestamped
only because h/w timestamping for the given skb is enabled with
SKBTX_HW_TSTAMP. Doing so prevents sw timestamping from being used; as a
result, an outgoing packet is not timestamped at all if it's not PTP and
h/w timestamping is enabled. So, fix it by setting SKBTX_IN_PROGRESS
only for PTP packets.

Signed-off-by: Ivan Khoronzhuk
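
The effect of the fix can be modelled in userspace as below. This is a
sketch under stated assumptions, not the driver code: struct tx_pkt and
both functions are hypothetical stand-ins, the flag values are chosen for
the example, and the rule that an in-progress h/w timestamp suppresses
the s/w one is modelled in sw_timestamp_allowed().

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits named after the kernel's tx_flags; values are for the sketch. */
#define SKBTX_HW_TSTAMP   (1u << 0)
#define SKBTX_SW_TSTAMP   (1u << 1)
#define SKBTX_IN_PROGRESS (1u << 2)

/* Hypothetical stand-in for an skb: only what the sketch needs. */
struct tx_pkt {
	uint8_t tx_flags;
	bool is_ptp;	/* result of a PTP classifier */
};

/* After the fix: claim the packet for h/w timestamping only if it is PTP. */
static void mark_tx_timestamp(struct tx_pkt *pkt)
{
	if ((pkt->tx_flags & SKBTX_HW_TSTAMP) && pkt->is_ptp)
		pkt->tx_flags |= SKBTX_IN_PROGRESS;
}

/* A s/w timestamp is taken only when no h/w timestamp is in progress. */
static bool sw_timestamp_allowed(const struct tx_pkt *pkt)
{
	return (pkt->tx_flags & SKBTX_SW_TSTAMP) &&
	       !(pkt->tx_flags & SKBTX_IN_PROGRESS);
}
```

With both timestamping flags requested, a non-PTP packet keeps its s/w
timestamp, while a PTP packet is left to the hardware.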
Signed-off-by: David S. Miller
-
Move sw timestamp function close to channel submit function.
Signed-off-by: Ivan Khoronzhuk
Signed-off-by: David S. Miller
-
Using netdev_<level>(netdev, "%s: ...", netdev->name) duplicates the
name in the output. Remove those uses.

Miscellanea:
o Use the netif_<level> convenience macros at the same time
Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
-
Trivial fix to spelling mistake in mlx4_dbg debug message
Signed-off-by: Colin Ian King
Acked-by: Tariq Toukan
Signed-off-by: David S. Miller
-
Trivial fix to spelling mistake in netif_info message
Signed-off-by: Colin Ian King
Signed-off-by: David S. Miller
-
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
Since the PHY used is internal, simply set phy-mode as internal.
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
The current way to find if the PHY is internal is to compare DT phy-mode
and emac_variant/internal_phy.
But it will negate a possible future SoC where an external PHY uses the
same phy mode as the internal one.

By using phy-mode = "internal" we permit an external PHY to have
the same mode as the internal one.

Reported-by: André Przywara
Signed-off-by: Corentin Labbe
Signed-off-by: David S. Miller
-
The bond_options.c file contains multiple netdev_info statements that clutter kernel output.
This patch replaces all netdev_info with netdev_dbg and adds a netdev_dbg statement for the
packets per slave parameter. Also fixes misalignment at line 467.

Suggested-by: Joe Perches
Signed-off-by: Michael J Dilmore
Signed-off-by: David S. Miller
28 Jun, 2017
21 commits
-
Jakub Kicinski says:
====================
nfp: get_phys_port_name for representors and SR-IOV reorder

This series starts by making the error message if FW cannot be located
easier to understand. Then I move some functions from PCI probe files
into library code (nfpcore) where they belong, and remove one function
which is never used.

Next few patches equip representors with nfp_port structure and make
their NDOs fully shared (not defined in apps), thanks to which we can
easily determine which netdevs are NFP's by comparing the NDO pointers.

10th patch makes use of the shared NDOs and nfp_ports to deliver
netdev-type independent .ndo_get_phys_port_name() implementation.

Patches 11 and 12 reorder the nfp_app SR-IOV callbacks with enabling
SR-IOV VFs. Unfortunately due to how PCI subsystem works we can't
guarantee being able to disable SR-IOV at exit or that it will be
disabled when we first probe... We must therefore make sure FW is
able to deal with being loaded while SR-IOV is already on.

Patch 13 fixes potential deadlock when enabling SR-IOV happens at
the same time as port state refresh. Note that this can't happen
at this point, since Flower doesn't refresh ports... but lockdep
doesn't know about such details and we will have to deal with this
sooner or later anyway.

Last but not least a new Kconfig is added to make sure those who
don't care about flower offloads have a way of not including the
code in their kernels. Thanks to nfp_app separation this costs us
a single ifdef and excluding flower files from the build.
====================

Signed-off-by: David S. Miller
-
Give users an option not to build the flower-offload related code.
Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Since we grab pf->lock around pci_enable_sriov() we can no longer
safely queue work which may also grab that lock onto system workqueue.
pci_enable_sriov() will flush system workqueue as part of waiting for VF
probing.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
We previously assumed that app callback can be guaranteed to be
executed before SR-IOV is actually enabled. Given that we can't
guarantee that SR-IOV will be disabled during probe or that we
will be able to disable it on remove, we should reorder the callbacks.
We should also call the app's sriov_enable if SR-IOV was enabled
during probe.

Application FW must be able to disable VFs internally and not depend
on them being removed at PCIe level.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
We assumed that when we probe, the number of enabled VFs will be 0.
This doesn't have to be the case, for example if a previous driver left
SR-IOV enabled due to some VFs being assigned. Read the number of VFs
enabled. Fail probe if it's above the current FW's limit.

Signed-off-by: Jakub Kicinski
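
The probe-time check described above might look roughly like the sketch
below. It is a userspace model only: struct pf_state and
pf_read_enabled_vfs() are hypothetical names, the enabled VF count is
passed in rather than read from the PCI device, and the -EINVAL return
value is an assumption made for the example.

```c
#include <errno.h>

/* Hypothetical per-PF state, reduced to the two fields the sketch uses. */
struct pf_state {
	unsigned int limit_vfs;	/* maximum number of VFs the FW supports */
	unsigned int num_vfs;	/* VFs currently enabled */
};

/*
 * Record how many VFs were already enabled when the driver probes.
 * Fails (without touching the state) when the count exceeds the FW limit.
 */
static int pf_read_enabled_vfs(struct pf_state *pf, unsigned int enabled_vfs)
{
	if (enabled_vfs > pf->limit_vfs)
		return -EINVAL;	/* assumed error code */

	pf->num_vfs = enabled_vfs;
	return 0;
}
```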
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Make nfp_port_get_phys_port_name() support new port types and
wire it up to representors' struct net_device_ops.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Based on struct net_device_ops figure out if netdev is a nfp_repr.
Use this knowledge to convert netdev directly to nfp_port.

Signed-off-by: Jakub Kicinski
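
The idea of identifying a netdev type by its ops pointer can be shown
with a small userspace model. All types below are simplified stand-ins
for the kernel structures, and is_repr_netdev() and port_from_netdev()
are names made up for this sketch; they are not the driver's functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct net_device_ops { int (*ndo_open)(void *dev); };
struct nfp_port { int id; };
struct nfp_repr { struct nfp_port *port; };
struct net_device {
	const struct net_device_ops *netdev_ops;
	void *priv;	/* points at a struct nfp_repr for representors */
};

/* One ops table shared by every representor, plus an unrelated one. */
static const struct net_device_ops repr_netdev_ops = { NULL };
static const struct net_device_ops other_netdev_ops = { NULL };

/* A netdev is a representor exactly when it uses the shared ops table. */
static bool is_repr_netdev(const struct net_device *netdev)
{
	return netdev->netdev_ops == &repr_netdev_ops;
}

/* Convert a netdev to its port, or NULL if it is not a representor. */
static struct nfp_port *port_from_netdev(const struct net_device *netdev)
{
	const struct nfp_repr *repr;

	if (!is_repr_netdev(netdev))
		return NULL;
	repr = netdev->priv;
	return repr->port;
}
```

The comparison only works because the ops table is defined once in
common code, so every representor points at the same object.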
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Apps shouldn't declare their own struct net_device_ops for
representors, this makes sharing code harder. Add necessary
nfp_app callbacks and move the definition of representors'
struct net_device_ops to common code.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Thanks to the fact that all representors will now have an nfp_port,
we can depend on information there to provide an app-independent
.ndo_get_stats64().

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
nfp_port is an abstraction which is supposed to allow us to share
code between different netdev types (vNIC vs repr). Spawn ports
for PFs and VFs to enable this sharing.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Add a cleanup callback for undoing what app init callback did.
Make flower allocate its private structure on init and free
it from the new callback.

While at it remember to set the app pointer to NULL on the
error path to avoid any races while probe path unwinds.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Remove unused nfp_cpp_area_check_range() function.
Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Move most of the helper for mapping RTsyms from nfp_net_main.c
to nfpcore. Use the new helper directly for mapping MAC statistics,
since they don't need to include the PCIe interface ID in the symbol
name.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
nfp_net_map_area() is a helper for mapping areas of NFP memory
defined in nfp_net_main.c. Move it to nfpcore to allow reuse
and rename accordingly. Create an additional helper -
nfp_cpp_area_alloc_acquire() the opposite of already existing
nfp_cpp_area_release_free().

Signed-off-by: Jakub Kicinski
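
The pairing of a combined alloc+acquire helper with a combined
release+free helper can be sketched generically as below. This is a
userspace model of the pattern only: struct area and every function name
are hypothetical, and the fail_acquire parameter exists just to exercise
the error path.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical resource with a two-step lifecycle. */
struct area {
	bool acquired;
};

static struct area *area_alloc(void)
{
	return calloc(1, sizeof(struct area));
}

static void area_free(struct area *a)
{
	free(a);
}

static int area_acquire(struct area *a, bool fail)
{
	if (fail)
		return -1;
	a->acquired = true;
	return 0;
}

static void area_release(struct area *a)
{
	a->acquired = false;
}

/* Allocate and acquire in one call; undo the allocation if acquire fails. */
static struct area *area_alloc_acquire(bool fail_acquire)
{
	struct area *a = area_alloc();

	if (!a)
		return NULL;
	if (area_acquire(a, fail_acquire)) {
		area_free(a);
		return NULL;
	}
	return a;
}

/* The opposite: release, then free. */
static void area_release_free(struct area *a)
{
	area_release(a);
	area_free(a);
}
```

Callers then deal with a single success or failure result instead of
unwinding the two steps themselves.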
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
We support application FW being either loaded automatically at
boot from flash or (more commonly) by the driver from disk.
If FW is not found on disk and nothing is preloaded users are
faced with this unintuitive error:

nfp 0000:04:00.0: nfp: Failed to find PF symbol _pf0_net_bar0

We can do better. Since we rely on symbol table being present -
check early if it could be correctly read out of the device
and if not print a more informative message.

Signed-off-by: Jakub Kicinski
Reviewed-by: Simon Horman
Signed-off-by: David S. Miller
-
Paolo Abeni says:
====================
ipv6: udp: exploit dev_scratch helpers

When bringing in the recent cache optimization for the UDP protocol, I forgot
to leverage the newly introduced scratch area helpers in the UDPv6 code path.
As a result, the UDPv6 implementation suffers some unnecessary performance
penalty when compared to v4.

This series aims to bring UDPv6 back on equal footing with v4.
The first patch moves the shared helpers to the common include files, while
the second uses them in the UDPv6 code.

This gives 5-8% performance improvement for a system under flood with small
UDPv6 packets. The performance delta is less than the one reported on the
original patch set because the UDPv6 code path already leveraged some of the
optimization.
====================

Signed-off-by: David S. Miller
-
The commit b65ac44674dd ("udp: try to avoid 2 cache miss on dequeue")
leveraged the scratch area helpers for UDP v4 but I forgot to
update the IPv6 code path accordingly.

This change extends the scratch area usage to the IPv6 code, synching
the two implementations and giving some performance benefit.
IPv6 is again almost on the same level as IPv4, performance-wise.

Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
So that they can be later used by the IPv6 code, too.
Also lift the comments a bit.

Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
-
If icsk_ulp_ops is unset, it dereferences a null ptr.
Add a null ptr check.

BUG: KASAN: null-ptr-deref in copy_to_user include/linux/uaccess.h:168 [inline]
BUG: KASAN: null-ptr-deref in do_tcp_getsockopt.isra.33+0x24f/0x1e30 net/ipv4/tcp.c:3057
Read of size 4 at addr 0000000000000020 by task syz-executor1/15452

Signed-off-by: Dave Watson
Reported-by: "Levin, Alexander (Sasha Levin)"
Signed-off-by: David S. Miller
-
The access to the wrong variable could lead to a NULL dereference and
possibly other invalid memory reads in vxlan newlink/changelink requests
with an IFLA_MTU attribute.

Fixes: a985343ba906 "vxlan: refactor verification and application of configuration"
Signed-off-by: David S. Miller
-
It dates back from 2.1.16 and is obsolete since 2.1.68 when the current
rule system was introduced.

Signed-off-by: Vincent Bernat
Signed-off-by: David S. Miller
27 Jun, 2017
1 commit
-
Add Innova IPSec SBU counters to the ethtool -S stats.
Add IPSec offload error counters to the ethtool -S stats.

Signed-off-by: Ilan Tayari
Reviewed-by: Boris Pismenny
Reviewed-by: Gal Pressman
Signed-off-by: Saeed Mahameed