02 Jul, 2019
13 commits
-
Add two ptp_ops: init and fini, to initialize and finalize the PTP
subsystem. Call as appropriate from mlxsw_sp_init() and _fini().

Lay the groundwork for Spectrum-1 support. On Spectrum-1, the received
timestamped packets and their corresponding timestamps arrive
independently, and need to be matched up. Introduce the related data types
and add to struct mlxsw_sp_ptp_state the hash table that will keep the
unmatched entries.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
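The matching problem described in the PTP init/fini commit above (packets and timestamps arriving independently on Spectrum-1) can be pictured with a small Python model. This is only a sketch, not the driver's code: the class name, the method names and the key layout are assumptions made for illustration. The commit itself says only that a hash table keeps the unmatched entries.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Optional, Tuple


@dataclass
class UnmatchedTable:
    """Toy model of a table holding whichever half of a pair arrived first."""

    # key -> (kind, payload), where kind is "packet" or "timestamp".
    # The key layout is hypothetical; the commit does not spell it out.
    entries: Dict[Hashable, Tuple[str, object]] = field(default_factory=dict)

    def _offer(self, kind: str, key: Hashable,
               payload: object) -> Optional[Tuple[object, object]]:
        other = self.entries.get(key)
        if other is not None and other[0] != kind:
            # The counterpart is already waiting: pair them, drop the entry.
            del self.entries[key]
            if kind == "packet":
                return payload, other[1]
            return other[1], payload
        # Nothing to pair with yet: remember this half for later.
        # (A second arrival of the same kind simply overwrites the first.)
        self.entries[key] = (kind, payload)
        return None

    def got_packet(self, key: Hashable, packet: object):
        """Returns (packet, timestamp) once both halves are known."""
        return self._offer("packet", key, packet)

    def got_timestamp(self, key: Hashable, timestamp: object):
        """Returns (packet, timestamp) once both halves are known."""
        return self._offer("timestamp", key, timestamp)
```

Whichever half arrives first is stored and the call returns None; the second arrival returns the completed (packet, timestamp) pair and removes the entry, so the table only ever holds unmatched halves.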
On Spectrum-1, timestamps are delivered separately from the packets, and
need to be paired up. Therefore, at some point after mlxsw_sp_port_xmit()
is invoked, it is necessary to involve the chip-specific driver code to
allow it to do the necessary bookkeeping and matching.

On Spectrum-2, timestamps are delivered in CQE. For that reason,
position the point of driver involvement into mlxsw_pci_cqe_sdq_handle()
to make it hopefully easier to extend for Spectrum-2 in the future.

To tell the driver what port the packet was sent on, keep tx_info
in SKB control buffer.

Introduce a new driver core interface mlxsw_core_ptp_transmitted(), a
driver callback ptp_transmitted, and a PTP op transmitted. The callee is
responsible for taking care of releasing the SKB passed to the new
interfaces, and correspondingly have the new stub callbacks just call
dev_kfree_skb_any().

Follow-up patches will introduce the actual content into
mlxsw_sp1_ptp_transmitted() in particular.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
The SKB control buffer is useful (and used) for bookkeeping of information
related to that SKB. Add helpers so that the mlxsw driver(s) can safely use
the buffer as well. The structure is currently empty; individual users will
add members to it as necessary.

Note that SKB allocation functions already clear the buffer, so the cleanup
is only necessary when ndo_start_xmit is called.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
When configured, the Spectrum hardware can recognize PTP packets and
trap them to the CPU using dedicated traps, PTP0 and PTP1.

One reason to get PTP packets under dedicated traps is to have a
separate policer suitable for the amount of PTP traffic expected when
the switch is operated as a boundary clock. For this, add two new trap
groups, MLXSW_REG_HTGT_TRAP_GROUP_SP_PTP0 and _PTP1, and associate the
two PTP traps with these two groups.

In the driver, specifically for Spectrum-1, event PTP packets will need
to be paired up with their timestamps. Those arrive through a different
set of traps, added later in the patch set. To support this future use,
introduce a new PTP op, ptp_receive.

It is possible to configure which PTP messages should be trapped under
which PTP trap. On Spectrum systems, we will use PTP0 for event
packets (which need timestamping), and PTP1 for control packets (which
do not). Thus configure PTP0 trap with a custom callback that defers to
the ptp_receive op.

Additionally, L2 PTP packets are actually trapped through the LLDP trap,
not through any of the PTP traps. So treat the LLDP trap the same way as
the PTP0 trap. Unlike PTP traps, which are currently still disabled,
LLDP trap is active. Correspondingly, have all the implementations of
the ptp_receive op return true, which the handler treats as a signal to
forward the packet immediately.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
On Spectrum-1, timestamps for PTP packets are delivered through queues
of ingress and egress timestamps. There are two event traps
corresponding to activity on each of those queues. This mechanism is
absent on Spectrum-2, and therefore the traps should only be registered
on Spectrum-1.

Carry a chip-specific listener array in mlxsw_sp->listeners and
listeners_count. Register listeners from that array in
mlxsw_sp_traps_init(). Add a new listener array for Spectrum-1 traps and
configure the newly-added mlxsw_sp->listeners with this array.

The listener array is empty for now; the events will be added in a later
patch.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
On Spectrum-1, timestamps for PTP packets are delivered through queues
of ingress and egress timestamps. There are two event traps
corresponding to activity on each of those queues. This mechanism is
absent on Spectrum-2, and therefore the traps should only be registered
on Spectrum-1.

Extract out of mlxsw_sp_traps_init() a generic helper,
mlxsw_sp_traps_register(), and likewise with _unregister(). The new helpers
will later be called with Spectrum-1-specific traps.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
This register serves to configure global parameters of certain
monitoring operations. The following patches will use it to configure
that when PTP timestamps are delivered through the PTP FIFO traps, the
FIFO in question is cleared as well.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
The MTPPTR is used for reading the per port PTP timestamp FIFO.
Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
This register is used for configuring under which trap to deliver PTP
packets depending on type of the packet.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
This register serves for configuration of which PTP messages should be
timestamped. This is a global configuration, despite the register name.

Signed-off-by: Petr Machata
Acked-by: Jiri Pirko
Signed-off-by: Ido Schimmel
Signed-off-by: David S. Miller
-
Currently, kernel pktgen can specify the UDP destination port for the
packets it sends (e.g. pgset "udp_dst_min 9").

But none of the sample scripts has an option to set it.
This commit adds the DST_PORT option to specify the target port(s) in the script.
-p : ($DST_PORT) destination PORT range (e.g. 433-444) is also allowed
Signed-off-by: Daniel T. Lee
Acked-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller
-
This commit adds port parsing and port validation helper functions that
parse a single port or a range of ports from a given string (e.g. 1234,
443-444).

The helpers will be used before setting the target port(s) in samples/pktgen.
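As an illustration of what such helpers do, here is a minimal Python model of parsing and validating a port specification. The real helpers are part of the pktgen sample shell scripts; the function names, the error handling and the accepted range below are assumptions made for this sketch.

```python
from typing import Tuple


def validate_port(port: int) -> int:
    """Return the port if it fits in 16 bits, else raise ValueError."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_ports(spec: str) -> Tuple[int, int]:
    """Parse "1234" or "443-444" into an inclusive (min, max) pair."""
    first, sep, last = spec.partition("-")
    low = validate_port(int(first))
    # A single port is treated as a range of length one.
    high = validate_port(int(last)) if sep else low
    if low > high:
        raise ValueError(f"empty port range: {spec}")
    return low, high
```

The resulting pair corresponds to the minimum and maximum destination port that a script would then hand to pktgen (for example through udp_dst_min and udp_dst_max).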
Signed-off-by: Daniel T. Lee
Acked-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller
-
Extend flowlabel_reflect bitmask to allow conditional
reflection of incoming flowlabels in echo replies.

Note this takes precedence over auto flowlabels.
Add flowlabel_reflect enum to replace hard coded
values.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
01 Jul, 2019
2 commits
-
Saeed Mahameed says:
====================
mlx5e-updates-2019-06-28

This series adds some misc updates for mlx5e driver
1) Allow adding the same mac more than once in MPFS table
2) Move to HW checksumming advertising
3) Report netdevice MPLS features
4) Correct physical port name of the PF representor
5) Reduce stack usage in mlx5_eswitch_termtbl_create
6) Refresh TIR improvement for representors
7) Expose same physical switch_id for all representors
====================

Signed-off-by: David S. Miller
-
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2019-06-28

This series contains a smorgasbord of updates to many of the Intel
drivers.

Gustavo A. R. Silva updates the ice and iavf drivers to use the
struct_size() helper where possible.

Miguel increases the pause and refresh time for flow control in the
e1000e driver during reset for certain devices.

Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver
when using non-IPSec enabled devices.

Colin Ian King fixes a potential overflow during a shift in the ixgbe
driver. Also fixes a potential NULL pointer dereference in the iavf
driver by adding a check.

Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of
wmb() for doorbell writes to avoid SFENCEs in the transmit and receive
paths.

Arjan updates the e1000e driver to improve boot time by over 100 msec by
reducing the usleep ranges during system startup.

Artem updates the igb driver register dump in ethtool: first prepares
the register dump for future additions of registers in the dump, then
adds the RR2DCDELAY register to the dump. When dealing with
time-sensitive networks, this register is helpful in determining your
latency from the device to the ring.

Alex fixes the ixgbevf driver to use the current cached link state,
rather than trying to re-check the value from the PF.

Harshitha adds support for MACVLAN offloads in i40e by using channels as
MACVLAN interfaces.

Detlev Casanova updates the e1000e driver to use delayed work instead of
timers to run the watchdog.

Vitaly fixes an issue in e1000e, where when disconnecting and
reconnecting the physical cable connection, the NIC enters a DMoff
state. This state causes a mismatch in link and duplexing, so check the
PCIm function state and perform a PHY reset when in this state to
resolve the issue.
====================

Signed-off-by: David S. Miller
30 Jun, 2019
10 commits
-
DMA-API-HOWTO.txt includes an example explaining when
dma_sync_single_for_device() is not needed, and that example matches
our use case. The buffer isn't changed by the CPU and direction is
DMA_FROM_DEVICE, so we can remove the call to
dma_sync_single_for_device().

Signed-off-by: Heiner Kallweit
Signed-off-by: David S. Miller
-
Documentation/DMA-API-HOWTO.txt states:
By default, the kernel assumes that your device can address 32-bits of
DMA addressing. For a 64-bit capable device, this needs to be increased,
and for a device with limitations, it needs to be decreased.

Therefore we don't need the 32-bit DMA fallback configuration and can
remove it.

Signed-off-by: Heiner Kallweit
Signed-off-by: David S. Miller
-
The VLAN tag is stored in the descriptor in network byte order.
Using swab16 works on little endian host systems only. Better play safe
and use ntohs or htons respectively.

Signed-off-by: Heiner Kallweit
Signed-off-by: David S. Miller
-
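The byte-order point made in the VLAN tag commit above can be checked with a short Python model. It is a sketch of the reasoning only, not of the driver: the helper names and the sample tag value are assumptions, and the host byte order is passed in explicitly so that both cases can be shown on any machine.

```python
def swab16(value: int) -> int:
    """Unconditionally swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def load16(raw: bytes, host_byteorder: str) -> int:
    """What a plain 16-bit load yields on a host of the given byte order."""
    return int.from_bytes(raw, host_byteorder)


def net16_to_host(raw: bytes) -> int:
    """Interpret two bytes as network (big-endian) order, on any host."""
    return int.from_bytes(raw, "big")


# Hypothetical VLAN tag 0x0123 as it sits in the descriptor (network order).
wire = bytes([0x01, 0x23])

# The endian-aware conversion is correct everywhere.
assert net16_to_host(wire) == 0x0123

# Swapping a raw load happens to be correct on a little-endian host ...
assert swab16(load16(wire, "little")) == 0x0123

# ... but corrupts the tag on a big-endian host, where no swap is needed.
assert swab16(load16(wire, "big")) == 0x2301
```

An unconditional swap assumes the host is little endian, whereas ntohs and htons swap only when the host byte order differs from network order.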
Running the script on systems without netdevsim now prints:
SKIP: ipsec_offload can't load netdevsim
instead of an error message and a failed status.
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
-
Nikolay Aleksandrov says:
====================
em_ipt: add support for addrtype

We would like to be able to use the addrtype from tc for ACL rules and
em_ipt seems the best place to add support for the already existing xt
match. The biggest issue is that addrtype revision 1 (with ipv6 support)
is NFPROTO_UNSPEC and currently em_ipt can't differentiate between v4/v6
if such xt match is used because it passes the match's family instead of
the packet one. The first 3 patches make em_ipt match only on IP
traffic (currently both policy and addrtype recognize such traffic
only) and make it pass the actual packet's protocol instead of the xt
match family when it's unspecified. They also add support for NFPROTO_UNSPEC
xt matches. The last patch allows adding addrtype rules via em_ipt.
We need to keep the user-specified nfproto for dumping in order to be
compatible with libxtables; we cannot dump NFPROTO_UNSPEC as the nfproto
or we'll get an error from libxtables, thus the nfproto is limited to
ipv4/ipv6 in patch 03 and is recorded.

v3: don't use the user nfproto for matching, only for dumping, more
information is available in the commit message in patch 03
v2: change patch 02 to set the nfproto only when unspecified and drop
patch 04 from v1 (Eyal Birger)
====================

Signed-off-by: David S. Miller
-
Allow em_ipt to use addrtype for matching. Restrict the use only to
revision 1 which has IPv6 support. Since it's a NFPROTO_UNSPEC xt match
we use the user-specified nfproto for matching, in case it's unspecified
both v4/v6 will be matched by the rule.

v2: no changes, was patch 5 in v1
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
-
If we dump NFPROTO_UNSPEC as nfproto user-space libxtables can't handle
it and would exit with an error like:
"libxtables: unhandled NFPROTO in xtables_set_nfproto"
In order to avoid the error return the user-specified nfproto. If we
don't record it then the match family is used which can be
NFPROTO_UNSPEC. Even if we add support to mask NFPROTO_UNSPEC in
iproute2 we have to be compatible with older versions which would
also be allowed to add NFPROTO_UNSPEC matches (e.g. addrtype after the
last patch).

v3: don't use the user nfproto for matching, only for dumping the rule,
also don't allow the nfproto to be unspecified (explained above)
v2: adjust changes to missing patch, was patch 04 in v1

Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
-
Set the family based on the packet if it's unspecified; otherwise
protocol-neutral matches will have wrong information (e.g. NFPROTO_UNSPEC).
In preparation for using NFPROTO_UNSPEC xt matches.

v2: set the nfproto only when unspecified
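The rule "use the packet's protocol only when the match family is unspecified" can be modelled in a few lines of Python. This is a sketch of the decision, not the em_ipt code: the function name is hypothetical, while the numeric constants mirror the kernel's NFPROTO_* and ETH_P_* values.

```python
NFPROTO_UNSPEC = 0
NFPROTO_IPV4 = 2
NFPROTO_IPV6 = 10

ETH_P_IP = 0x0800
ETH_P_IPV6 = 0x86DD


def effective_family(match_family: int, skb_protocol: int) -> int:
    """Pick the family to hand to the xt match (hypothetical helper)."""
    if match_family != NFPROTO_UNSPEC:
        # A family-specific match keeps its own family.
        return match_family
    # Protocol-neutral match: derive the family from the packet itself.
    if skb_protocol == ETH_P_IP:
        return NFPROTO_IPV4
    if skb_protocol == ETH_P_IPV6:
        return NFPROTO_IPV6
    # Other traffic is not expected here, since the series restricts
    # matching to IPv4/IPv6 packets.
    return NFPROTO_UNSPEC
```

With this rule a protocol-neutral match such as addrtype sees NFPROTO_IPV4 or NFPROTO_IPV6 depending on the packet, rather than NFPROTO_UNSPEC.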
Suggested-by: Eyal Birger
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
-
Restrict matching only to ip/ipv6 traffic and make sure we can use the
headers, otherwise matches will be attempted on any protocol which can
be unexpected by the xt matches. Currently policy supports only ipv4/6.

Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
-
This patch adds vlan offload support for the HINIC driver.
Signed-off-by: Xue Chaojing
Signed-off-by: David S. Miller
29 Jun, 2019
15 commits
-
After changing the parent_id to be the same for both NICs of the same
hardware device, netdev_port_same_parent_id now returns true for
more cases (all the lower devices in the hierarchy are on the same
hardware device).

If merged eswitch isn't enabled, these cases aren't supported, so disallow
them.

Signed-off-by: Paul Blakey
Reviewed-by: Roi Dayan
Signed-off-by: Saeed Mahameed
-
Report system_image_guid as the E-Switch switch_id. This ensures
that when a NIC contains multiple PCI functions and
has merged eswitch capability, all representors from
multiple PFs publish the same switch_id.

Signed-off-by: Paul Blakey
Reviewed-by: Parav Pandit
Reviewed-by: Roi Dayan
Signed-off-by: Saeed Mahameed
-
Refreshing TIRs is done in order to update the TIRs with the current
state of SQs in the transport domain, so that the TIRs can filter out
undesired self-loopback packets based on the source SQ of the packet.

Representor TIRs will only receive packets that originate from their
associated vport, due to dedicated steering, and therefore will never
receive self-loopback packets, whose source vport will be the vport of
the E-Switch manager, and therefore not the vport associated with the
representor. As such, it is not necessary to refresh the representors'
TIRs, since self-loopback packets can't reach them.

Since representors only exist in switchdev mode, and there is no
scenario in which a representor will exist in the transport domain
alongside a non-representor, it is not necessary to refresh the
transport domain's TIRs upon changing the state of a representor's
queues. Therefore, do not refresh TIRs upon such a change. Achieve
this by adding an update_rx callback to the mlx5e_profile, which
refreshes TIRs for non-representors and does nothing for representors,
and replace instances of mlx5e_refresh_tirs() upon changing the state
of the queues with update_rx().

Signed-off-by: Gavi Teitz
Reviewed-by: Roi Dayan
Reviewed-by: Tariq Toukan
Signed-off-by: Saeed Mahameed
-
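The update_rx approach in the TIR refresh commit above is a per-profile callback: the code that changes queue state calls the callback, and each profile decides what that means. A minimal Python model of that pattern follows. Every name here is an assumption made for illustration, and the "refresh" is reduced to appending to a log so the behaviour can be observed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Profile:
    """Toy stand-in for a device profile carrying an update_rx callback."""
    name: str
    update_rx: Callable[[List[str]], None]


def refresh_tirs(log: List[str]) -> None:
    # Stand-in for the real refresh; here it only records that it ran.
    log.append("refresh_tirs")


def update_rx_noop(log: List[str]) -> None:
    # Representors never see self-loopback packets, so nothing to do.
    return None


NIC_PROFILE = Profile("nic", update_rx=refresh_tirs)
REP_PROFILE = Profile("representor", update_rx=update_rx_noop)


def on_queue_state_change(profile: Profile, log: List[str]) -> None:
    """Caller no longer refreshes TIRs directly; it defers to the profile."""
    profile.update_rx(log)
```

The caller stays identical for both device types; only the profile's callback differs.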
Putting an empty 'mlx5_flow_spec' structure on the stack is a bit
wasteful and causes a warning on 32-bit architectures when building
with clang -fsanitize-coverage:

drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c: In function 'mlx5_eswitch_termtbl_create':
drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c:90:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

Since the structure is never written to, we can statically allocate
it to avoid the stack usage. To be on the safe side, mark all
subsequent function arguments that we pass it into as 'const'
as well.

Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions")
Signed-off-by: Arnd Bergmann
Acked-by: Saeed Mahameed
Acked-by: Mark Bloch
Signed-off-by: Saeed Mahameed
-
Consider PCI and non-PCI device types while setting device name
in get_drvinfo() callback using existing generic device.

Signed-off-by: Parav Pandit
Reviewed-by: Vu Pham
Signed-off-by: Saeed Mahameed
-
Currently the PF phys_port_name is pfNvf-1, as the vport number for the PF
vport is 65535.
Correct the PF's phys_port_name to the agreed-upon name, pfN.

Signed-off-by: Parav Pandit
Reviewed-by: Vu Pham
Signed-off-by: Saeed Mahameed
-
Set supported device features in the netdevice MPLS features mask.
This will enable HW checksumming and TSO for MPLS tagged traffic.

Signed-off-by: Ariel Levkovich
Signed-off-by: Saeed Mahameed
-
This patch changes the way the driver advertises its checksum offload
capabilities within the net device features bit mask.

Instead of advertising protocol specific checksumming capabilities
which are limited today to IPv4 and IPv6, we move to reporting
generic HW checksumming capabilities.

This will allow the network stack to let mlx5 device offload checksum
for cases where the IP header is encapsulated within another protocol
and the skb->protocol doesn't indicate one of the IP versions protocol,
specifically in the case of MPLS label encapsulating the IP header and
the skb->protocol indicates MPLS ethertype rather than IP.

Moving the HW_CSUM reporting is required in the basic net device hw
features mask and also in the extensions (vlan and encapsulation
features) since the extensions are always multiplied by the basic
features set during the packet's traversal through the stack's tx flow.

Signed-off-by: Ariel Levkovich
Signed-off-by: Saeed Mahameed
-
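The reasoning in the checksum advertising commit above can be modelled with feature bits in Python. This is a sketch under stated assumptions: the flag values are arbitrary rather than the kernel's, the protocol is a plain string, and "multiplied" is read as a bitwise AND of the extension mask with the basic feature set.

```python
from enum import IntFlag


class Feature(IntFlag):
    """Illustrative feature bits; the values do not match the kernel's."""
    IP_CSUM = 1
    IPV6_CSUM = 2
    HW_CSUM = 4


def can_offload_csum(features: Feature, protocol: str) -> bool:
    """Decide whether checksum offload applies for a given skb protocol."""
    if features & Feature.HW_CSUM:
        return True  # generic checksumming: the protocol does not matter
    if protocol == "ipv4":
        return bool(features & Feature.IP_CSUM)
    if protocol == "ipv6":
        return bool(features & Feature.IPV6_CSUM)
    return False  # e.g. "mpls": no protocol-specific bit covers it


def effective(basic: Feature, extension: Feature) -> Feature:
    """Extension masks (VLAN, encapsulation) are intersected with the basic set."""
    return basic & extension
```

With only the protocol-specific bits set, can_offload_csum(..., "mpls") is false, which is the limitation the commit removes. The intersection also shows why HW_CSUM has to be set in both the basic mask and the extension masks: a bit present in only one of them is lost.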
Remove the limitation preventing adding a vport's MAC address to the
Multi-Physical Function Switch (MPFS) more than once per E-switch, as
there is no difference in the MPFS if an address is being used by an
E-switch more than once.

This allows the E-switch to have multiple vports with the same MAC
address, allowing vports to be classified by VLAN id instead of by MAC
if desired.

Signed-off-by: Gavi Teitz
Signed-off-by: Saeed Mahameed
-
Unify and isolate the error handling flow in mlx5_mpfs_add_mac(),
removing code duplication.

Signed-off-by: Gavi Teitz
Signed-off-by: Saeed Mahameed
-
Misc updates from mlx5-next branch:
1) E-Switch vport metadata support for source vport matching
2) Convert mkey_table to XArray
3) Shared IRQs, and use of a single IRQ for all async EQs

Signed-off-by: Saeed Mahameed
-
Due to commit 5d8682588605 ("[misc] mei: me: allow runtime
pm for platform with D0i3"), the NIC enters the DMoff state when
the cable is disconnected and then reconnected. This caused a wrong
link indication and a duplex mismatch. This bug is described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1689436

Checking PCIm function state and performing PHY reset after a
timeout in watchdog task solves this issue.

Signed-off-by: Vitaly Lifshits
Acked-by: Sasha Neftin
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
-
Use delayed work instead of timers to run the watchdog of the e1000e
driver.

Simplify the code with one less middle function.
Signed-off-by: Detlev Casanova
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
-
This patch enables macvlan offloads for i40e. The idea is to use
channels as macvlan interfaces. The channels are VSIs of
type VMDQ. When the first macvlan is created, the maximum number of
channels possible are created. From then on, as a macvlan interface
is created, a macvlan filter is added to these already created
channels (VSIs).

This patch utilizes subordinate device traffic classes to make queue
groups (channels) available for an upper device like a macvlan.

Steps to configure macvlan offloads:
1. ethtool -K ethx l2-fwd-offload on
2. ip link add link ethx name macvlan1 type macvlan
3. ip addr add <address> dev macvlan1
4. ip link set macvlan1 up

Signed-off-by: Harshitha Ramamurthy
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
-
Change the ethtool link settings call to just read the cached state out of
the adapter structure instead of trying to recheck the value from the PF.
Doing this should prevent excessive reading of the mailbox.

Signed-off-by: Alexander Duyck
Reviewed-by: "Guilherme G. Piccoli"
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher