Eric Lee / smarc-fsl-linux-kernel

02 Apr, 2020

1 commit

919dce247 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
"The majority of the patches are cleanups, refactorings and clarity
improvements.

This cycle saw some more activity from Syzkaller, I think we are now
clean on all but one of those bugs, including the long standing and
obnoxious rdma_cm locking design defect. Continue to see many drivers
getting cleanups, with a few new user visible features.

Summary:

- Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1

- Lots of cleanup patches for hns

- Convert more places to use refcount

- Aggressively lock the RDMA CM code that syzkaller says isn't
working

- Work to clarify ib_cm

- Use the new ib_device lifecycle model in bnxt_re

- Fix mlx5's MR cache which seems to be failing more often with the
new ODP code

- mlx5 'dynamic uar' and 'tx steering' user interfaces"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (144 commits)
RDMA/bnxt_re: make bnxt_re_ib_init static
IB/qib: Delete struct qib_ivdev.qp_rnd
RDMA/hns: Fix uninitialized variable bug
RDMA/hns: Modify the mask of QP number for CQE of hip08
RDMA/hns: Reduce the maximum number of extend SGE per WQE
RDMA/hns: Reduce PFC frames in congestion scenarios
RDMA/mlx5: Add support for RDMA TX flow table
net/mlx5: Add support for RDMA TX steering
IB/hfi1: Call kobject_put() when kobject_init_and_add() fails
IB/hfi1: Fix memory leaks in sysfs registration and unregistration
IB/mlx5: Move to fully dynamic UAR mode once user space supports it
IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib
IB/mlx5: Extend QP creation to get uar page index from user space
IB/mlx5: Extend CQ creation to get uar page index from user space
IB/mlx5: Expose UAR object and its alloc/destroy commands
IB/hfi1: Get rid of a warning
RDMA/hns: Remove redundant judgment of qp_type
RDMA/hns: Remove redundant assignment of wc->smac when polling cq
RDMA/hns: Remove redundant qpc setup operations
RDMA/hns: Remove meaningless prints
...

Linus Torvalds
2020-04-02 09:18:18 +0800

30 Mar, 2020

1 commit

e999a7343 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
mlx5: Remove uninitialized use of key in mlx5_core_create_mkey
{IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib
{IB,net}/mlx5: Assign mkey variant in mlx5_ib only
{IB,net}/mlx5: Setup mkey variant before mr create command invocation

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2020-03-30 14:42:11 +0800

27 Mar, 2020

1 commit

215286229 IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib

struct mlx5_bfreg_info is used by mlx5_ib only but is exposed to both RDMA
and netdev parts of mlx5 driver. Move that struct to mlx5_ib namespace,
clean vertical space alignment and convert lib_uar_4k from bool to
bitfield.

Link: https://lore.kernel.org/r/20200324060143.1569116-5-leon@kernel.org
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Leon Romanovsky
2020-03-27 23:59:04 +0800

13 Mar, 2020

2 commits

a3cfdd392 {IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib

As mlx5_ib is the only user of the mlx5_core_create_mkey_cb, move the
logic inside mlx5_ib and cleanup the code in mlx5_core.

Signed-off-by: Michael Guralnik
Signed-off-by: Leon Romanovsky

Michael Guralnik
2020-03-13 21:48:10 +0800

fc6a9f86f {IB,net}/mlx5: Assign mkey variant in mlx5_ib only

mkey variant is not required for mlx5_core use, move the mkey variant
counter to mlx5_ib.

Signed-off-by: Saeed Mahameed
Signed-off-by: Leon Romanovsky

Saeed Mahameed
2020-03-13 21:48:04 +0800

10 Mar, 2020

1 commit

a70ed9d8e Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

This series adds some HW bits and definitions for mlx5 driver, to be
used by downstream features in both rdma and netdev branches.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: HW bit for goto chain offload support
net/mlx5: Expose link speed directly
net/mlx5: Introduce TLS and IPSec objects enums
net/mlx5: Introduce egress acl forward-to-vport capability
net/mlx5: Expose raw packet pacing APIs
net/mlx5e: Replace zero-length array with flexible-array member
net/mlx5: fix spelling mistake "reserverd" -> "reserved"

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2020-03-10 07:58:26 +0800

05 Mar, 2020

1 commit

1326034b3 net/mlx5: Expose raw packet pacing APIs

Expose raw packet pacing APIs to be used by DEVX based applications.
The existing code was refactored to have a single flow with the new raw
APIs.

The new raw APIs consider 'pp_rate_limit_context', uid, and 'dedicated'
when looking for an existing entry.

This raw mode enables future device specification data in the raw
context without changing the existing logic and code.

The ability to ask for a dedicated entry lets an application
allocate entries according to its needs.

A dedicated entry may not be used by any other process. It also
lets a process spread its resources across different entries
in order to use different hardware resources when enforcing the rate.

The per-entry counter was changed to u64 to prevent any possibility
of overflow.
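
The lookup rule described above can be modelled in a few lines. This is only a sketch under stated assumptions, not the driver code: `RateTable`, `RateEntry`, `add_rate` and `remove_rate` are hypothetical names, the context is represented as opaque bytes, and a refcount of 0 is assumed to mark a free slot.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RateEntry:
    context: bytes = b""   # stands in for 'pp_rate_limit_context'
    uid: int = 0
    dedicated: bool = False
    refcount: int = 0      # 0 means the slot is free (assumption of this sketch)


class RateTable:
    """Hypothetical model of a rate-limit table with shared and dedicated entries."""

    def __init__(self, size: int) -> None:
        self.entries: List[RateEntry] = [RateEntry() for _ in range(size)]

    def add_rate(self, context: bytes, uid: int, dedicated: bool) -> Optional[int]:
        """Return the index of the entry holding this rate, or None if the table is full."""
        free_index = None
        for i, entry in enumerate(self.entries):
            if entry.refcount == 0:
                if free_index is None:
                    free_index = i
                continue
            # A shared request may reuse a live entry only if that entry is
            # not dedicated and matches on both context and uid.
            if (not dedicated and not entry.dedicated
                    and entry.uid == uid and entry.context == context):
                entry.refcount += 1
                return i
        if free_index is None:
            return None
        self.entries[free_index] = RateEntry(context, uid, dedicated, 1)
        return free_index

    def remove_rate(self, index: int) -> None:
        entry = self.entries[index]
        if entry.refcount == 0:
            raise ValueError("entry is not in use")
        entry.refcount -= 1
        if entry.refcount == 0:
            self.entries[index] = RateEntry()  # slot becomes free again
```

In this model a dedicated request always takes a fresh slot, even when an identical shared entry exists, which is what keeps other processes off that entry.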

Signed-off-by: Yishai Hadas
Acked-by: Saeed Mahameed
Signed-off-by: Leon Romanovsky

Yishai Hadas
2020-03-05 20:18:09 +0800

19 Feb, 2020

1 commit

12206b172 net/mlx5: Add support for resource dump

On driver load:
- Initialize resource dump data structure and memory access tools (mkey
& pd).
- Read the resource dump's menu which contains the FW segment
identifier. Each record is identified by the segment name (ASCII).

During the driver's course of life, users (like reporters) may request
dumps per segment. The user should create a command providing the
segment identifier (SW enumeration) and command keys. In return, the
user receives a command context. In order to receive the dump, the user
should supply the command context and a page-aligned memory buffer to
which the dump content will be written. Since the dump may be larger
than the given buffer, the user may resubmit the command until it
receives an end-of-dump indication. It is the user's responsibility to destroy
the command.
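
The create / resubmit / destroy sequence above can be sketched as follows. This is a user-space model under assumptions, not the mlx5 API: `DumpCommand`, `create_command`, `next_chunk`, `destroy_command` and `read_full_dump` are hypothetical names, the firmware is replaced by a dictionary from segment identifier to dump content, and a 4096-byte page is assumed.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class DumpCommand:
    """Hypothetical command context: remembers how far the dump has been read."""

    def __init__(self, segment: int, data: bytes) -> None:
        self.segment = segment
        self.data = data
        self.offset = 0
        self.destroyed = False


def create_command(segment: int, source: dict) -> DumpCommand:
    # 'source' stands in for the firmware: segment id -> full dump content.
    return DumpCommand(segment, source[segment])


def next_chunk(cmd: DumpCommand, page: bytearray) -> tuple:
    """Write the next part of the dump into 'page'.

    Returns (bytes_written, more); 'more' is False once end-of-dump is reached.
    """
    chunk = cmd.data[cmd.offset:cmd.offset + len(page)]
    page[:len(chunk)] = chunk
    cmd.offset += len(chunk)
    return len(chunk), cmd.offset < len(cmd.data)


def destroy_command(cmd: DumpCommand) -> None:
    cmd.destroyed = True


def read_full_dump(segment: int, source: dict) -> bytes:
    cmd = create_command(segment, source)
    page = bytearray(PAGE_SIZE)
    out = bytearray()
    try:
        more = True
        while more:  # resubmit until the end-of-dump indication
            written, more = next_chunk(cmd, page)
            out += page[:written]
    finally:
        destroy_command(cmd)  # the caller owns the command and must destroy it
    return bytes(out)
```

The `try`/`finally` reflects the last sentence of the description: whoever created the command is responsible for destroying it, even if reading fails midway.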

Signed-off-by: Aya Levin
Reviewed-by: Moshe Shemesh
Acked-by: Jiri Pirko
Signed-off-by: Saeed Mahameed

Aya Levin
2020-02-19 11:17:29 +0800

01 Feb, 2020

1 commit

8fdd4019b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
"A very quiet cycle with few notable changes. Mostly the usual list of
one or two patches to drivers changing something that isn't quite rc
worthy. The subsystem seems to be seeing a larger number of rework and
cleanup style patches right now, I feel that several vendors are
prepping their drivers for new silicon.

Summary:

- Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4,
rxe, i40iw

- Larger series doing cleanup and rework for hns and hfi1.

- Some general reworking of the CM code to make it a little more
understandable

- Unify the different code paths connected to the uverbs FD scheme

- New UAPI ioctls conversions for get context and get async fd

- Trace points for CQ and CM portions of the RDMA stack

- mlx5 driver support for virtio-net formatted rings as RDMA raw
ethernet QPs

- verbs support for setting the PCI-E relaxed ordering bit on DMA
traffic connected to a MR

- A couple of bug fixes that came too late to make rc7"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits)
RDMA/core: Make the entire API tree static
RDMA/efa: Mask access flags with the correct optional range
RDMA/cma: Fix unbalanced cm_id reference count during address resolve
RDMA/umem: Fix ib_umem_find_best_pgsz()
IB/mlx4: Fix leak in id_map_find_del
IB/opa_vnic: Spelling correction of 'erorr' to 'error'
IB/hfi1: Fix logical condition in msix_request_irq
RDMA/cm: Remove CM message structs
RDMA/cm: Use IBA functions for complex structure members
RDMA/cm: Use IBA functions for simple structure members
RDMA/cm: Use IBA functions for swapping get/set acessors
RDMA/cm: Use IBA functions for simple get/set acessors
RDMA/cm: Add SET/GET implementations to hide IBA wire format
RDMA/cm: Add accessors for CM_REQ transport_type
IB/mlx5: Return the administrative GUID if exists
RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence
IB/mlx4: Fix memory leak in add_gid error flow
IB/mlx5: Expose RoCE accelerator counters
RDMA/mlx5: Set relaxed ordering when requested
RDMA/core: Add the core support field to METHOD_GET_CONTEXT
...

Linus Torvalds
2020-02-01 06:40:36 +0800

26 Jan, 2020

1 commit

4bbd4923d IB/mlx5: Return the administrative GUID if exists

A user can change the operational GUID (a.k.a. effective GUID) through
link/infiniband. Therefore it is preferred to return the currently set
GUID if it exists instead of the operational one.

This way the PF can query which VF GUID will be set in the next bind. In
order to align with MAC address, zero is returned if administrative GUID
is not set.

For example, before setting administrative GUID:
$ ip link show
ib0: mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 00:00:00:00:00:00:00:00, PORT_GUID 00:00:00:00:00:00:00:00, link-state auto, trust off, query_rss off

Then:

$ ip link set ib0 vf 0 node_guid 11:00:af:21:cb:05:11:00
$ ip link set ib0 vf 0 port_guid 22:11:af:21:cb:05:11:00

After setting administrative GUID:
$ ip link show
ib0: mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 11:00:af:21:cb:05:11:00, PORT_GUID 22:11:af:21:cb:05:11:00, link-state auto, trust off, query_rss off

Fixes: 9c0015ef0928 ("IB/mlx5: Implement callbacks for getting VFs GUID attributes")
Link: https://lore.kernel.org/r/20200116120048.12744-1-leon@kernel.org
Signed-off-by: Danit Goldberg
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Danit Goldberg
2020-01-26 02:54:39 +0800

17 Jan, 2020

4 commits

12e9e0d0d Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

This merge syncs with the latest mlx5-next HW bits and layout updates for
upcoming features, plus one patch that improves the
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
net/mlx5e: Add discard counters per priority
net/mlx5e: Expose FEC feilds and related capability bit
net/mlx5: Add mlx5_ifc definitions for connection tracking support
net/mlx5: Add copy header action struct layout
net/mlx5: Expose resource dump register mapping
net/mlx5: Add structures and defines for MIRC register
net/mlx5: Read MCAM register groups 1 and 2
net/mlx5: Add structures layout for new MCAM access reg groups
net/mlx5: Expose vDPA emulation device capabilities
net/mlx5: Add Virtio Emulation related device capabilities

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2020-01-17 07:48:24 +0800

609b82727 net/mlx5: Expose resource dump register mapping

Add new register enumeration for resource dump. Add layout mapping for
resource dump: access command and response.

Signed-off-by: Aya Levin
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed

Aya Levin
2020-01-17 06:11:23 +0800

bab58ba10 net/mlx5: Add structures and defines for MIRC register

Add needed structures, layouts and defines for MIRC (Management Image
Re-activation Control) register. This structure will be used for the FSM
reactivation flow in the downstream patches.

Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed

Eran Ben Elisha
2020-01-17 06:11:21 +0800

932ef1551 net/mlx5: Read MCAM register groups 1 and 2

On load, the driver caches the MCAM (Management Capabilities Mask Register)
registers. In addition to MCAM register group 0, the only group the driver
already reads, add support for reading groups 1 and 2.

Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed

Eran Ben Elisha
2020-01-17 06:11:19 +0800

08 Jan, 2020

1 commit

8007880a2 net/mlx5: limit the function in local scope

The function mlx5_buf_alloc_node is used only within the local scope,
so it is appropriate to limit the function to that scope.

Signed-off-by: Zhu Yanjun
Signed-off-by: Saeed Mahameed

Zhu Yanjun
2020-01-08 02:40:22 +0800

25 Nov, 2019

1 commit

3694e41e4 Merge branch 'ib-guids' into rdma.git for-next

Danit Goldberg says:

====================
This series extends RTNETLINK to provide IB port and node GUIDs, which
were configured for Infiniband VFs.

The functionality to set VF GUIDs already existed for a long time, and
here we are adding the missing "get" so that netlink will be symmetric and
various cloud orchestration tools will be able to manage such VFs more
naturally.

The iproute2 was extended too to present those GUIDs.

- ip link show

For example:
- ip link set ib4 vf 0 node_guid 22:44:33:00:33:11:00:33
- ip link set ib4 vf 0 port_guid 10:21:33:12:00:11:22:10
- ip link show ib4
ib4: mtu 4092 qdisc noop state DOWN mode DEFAULT group default qlen 256
link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 22:44:33:00:33:11:00:33, PORT_GUID 10:21:33:12:00:11:22:10, link-state disable, trust off, query_rss off
====================

Based on the mlx5-next branch from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for
dependencies

* branch 'ib-guids': (35 commits)
IB/mlx5: Implement callbacks for getting VFs GUID attributes
IB/ipoib: Add ndo operation for getting VFs GUID attributes
IB/core: Add interfaces to get VF node and port GUIDs
net/core: Add support for getting VF GUIDs

net/mlx5: Add new chain for netfilter flow table offload
net/mlx5: Refactor creating fast path prio chains
net/mlx5: Accumulate levels for chains prio namespaces
net/mlx5: Define fdb tc levels per prio
net/mlx5: Rename FDB_* tc related defines to FDB_TC_* defines
net/mlx5: Simplify fdb chain and prio eswitch defines
IB/mlx5: Load profile according to RoCE enablement state
IB/mlx5: Rename profile and init methods
net/mlx5: Handle "enable_roce" devlink param
net/mlx5: Document flow_steering_mode devlink param
devlink: Add new "enable_roce" generic device param
net/mlx5: fix spelling mistake "metdata" -> "metadata"
net/mlx5: fix kvfree of uninitialized pointer spec
IB/mlx5: Introduce and use mlx5_core_is_vf()
net/mlx5: E-switch, Enable metadata on own vport
net/mlx5: Refactor ingress acl configuration
...

Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2019-11-25 22:31:47 +0800

12 Nov, 2019

1 commit

cc9defcbb net/mlx5: Handle "enable_roce" devlink param

Register "enable_roce" param, default value is RoCE enabled.
Current configuration is stored on mlx5_core_dev and exposed to user
through the cmode runtime devlink param.
Changing configuration requires changing the cmode driverinit devlink
param and calling devlink reload.

Signed-off-by: Michael Guralnik
Acked-by: Jiri Pirko
Signed-off-by: Saeed Mahameed

Michael Guralnik
2019-11-12 04:15:29 +0800

02 Nov, 2019

1 commit

e53a9d26c IB/mlx5: Introduce and use mlx5_core_is_vf()

Instead of deciding whether a given device is a virtual function
based on whether it is a PF, use the already defined
MLX5_COREDEV_VF by introducing a helper API, mlx5_core_is_vf().

This makes it possible to clearly identify PF, VF and non-virtual functions.

Signed-off-by: Parav Pandit
Reviewed-by: Vu Pham
Signed-off-by: Saeed Mahameed

Parav Pandit
2019-11-02 05:40:27 +0800

29 Oct, 2019

1 commit

74bddb368 RDMA/mlx5: Delete struct mlx5_priv->mkey_table

No users are left, delete it.

Link: https://lore.kernel.org/r/20191009160934.3143-5-jgg@ziepe.ca
Reviewed-by: Artemy Kovalyov
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2019-10-29 03:41:13 +0800

02 Sep, 2019

2 commits

a06ebb8d9 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Merge mlx5-next patches needed for upcoming mlx5 software steering.

1) Alex adds HW bits and definitions required for SW steering
2) Ariel moves device memory management to mlx5_core (From mlx5_ib)
3) Maor, Cleanups and fixups for eswitch mode and RoCE
4) Mark, Set only stag for match untagged packets

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2019-09-02 15:16:05 +0800

c9b9dcb43 net/mlx5: Move device memory management to mlx5_core

Move the allocation and deallocation commands for SW ICM
device memory to mlx5_core to expose this API for all
mlx5_core users.

This comes as preparation for supporting SW steering in kernel
where it will be required to allocate and register device
memory for direct rule insertion.

In addition, an API to register this device memory for future
remote access operations is introduced using the create_mkey
commands.

Signed-off-by: Ariel Levkovich
Reviewed-by: Mark Bloch
Signed-off-by: Saeed Mahameed

Ariel Levkovich
2019-09-02 14:44:41 +0800

29 Aug, 2019

1 commit

537f32109 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

mlx5 HW spec and bits updates:
1) Aya exposes IP-in-IP capability in mlx5_core.
2) Maxim exposes lag tx port affinity capabilities.
3) Moshe adds VNIC_ENV internal rq counter bits.
4) ODP capabilities for DC transport

Misc updates:
5) Saeed, two compiler warnings cleanups
6) Add XRQ legacy commands opcodes
7) Use refcount_t for refcount
8) fix a -Wstringop-truncation warning

Saeed Mahameed
2019-08-29 02:48:56 +0800

22 Aug, 2019

1 commit

87175120d net/mlx5: Add HV VHCA infrastructure

HV VHCA is a layer which provides PF to VF communication channel based on
HyperV PCI config channel. It implements Mellanox's Inter VHCA control
communication protocol. The protocol contains control block in order to
pass messages between the PF and VF drivers, and data blocks in order to
pass actual data.

The infrastructure is agent based. Each agent will be responsible for
contiguous buffer blocks in the VHCA config space. This infrastructure will
bind agents to their blocks, and those agents can only read/write
the buffer blocks assigned to them. Each agent will provide three
callbacks (control, invalidate, cleanup). Control will be invoked when
block-0 is invalidated with a command that concerns this agent. Invalidate
callback will be invoked if one of the blocks assigned to this agent was
invalidated. Cleanup will be invoked before the agent is freed in
order to clean all of its open resources or deferred works.

Block-0 serves as the control block. All execution commands from the PF
will be written by the PF over this block. VF will ack on those by
writing on block-0 as well. Its format is described by struct
mlx5_hv_vhca_control_block layout.
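
The agent model above can be illustrated with a small dispatcher. This is a sketch under assumptions and does not follow the driver's structures: `Agent`, `Registry` and `block_invalidated` are hypothetical names, and a control command is assumed to name its target agent with a string.

```python
from typing import Callable, List, Optional

Callback = Optional[Callable[..., None]]


class Agent:
    """Hypothetical agent owning a contiguous range of buffer blocks."""

    def __init__(self, name: str, first_block: int, num_blocks: int,
                 control: Callback = None, invalidate: Callback = None,
                 cleanup: Callback = None) -> None:
        self.name = name
        self.blocks = range(first_block, first_block + num_blocks)
        self.control = control
        self.invalidate = invalidate
        self.cleanup = cleanup


class Registry:
    CONTROL_BLOCK = 0  # block-0 is the control block and belongs to no agent

    def __init__(self) -> None:
        self.agents: List[Agent] = []

    def register(self, agent: Agent) -> None:
        if len(agent.blocks) == 0 or agent.blocks.start <= self.CONTROL_BLOCK:
            raise ValueError("agents need at least one block above block-0")
        for other in self.agents:
            if (agent.blocks.start < other.blocks.stop
                    and other.blocks.start < agent.blocks.stop):
                raise ValueError("blocks already assigned to " + other.name)
        self.agents.append(agent)

    def block_invalidated(self, block: int, target: Optional[str] = None) -> None:
        for agent in self.agents:
            if block == self.CONTROL_BLOCK:
                # Control runs only for the agent the command concerns.
                if agent.name == target and agent.control:
                    agent.control(target)
            elif block in agent.blocks and agent.invalidate:
                agent.invalidate(block)

    def unregister(self, agent: Agent) -> None:
        if agent.cleanup:
            agent.cleanup()  # cleanup runs before the agent is dropped
        self.agents.remove(agent)
```

Rejecting overlapping ranges at registration time is one simple way to guarantee that an agent can only ever be notified about blocks assigned to it.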

Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed
Signed-off-by: Haiyang Zhang
Signed-off-by: David S. Miller

Eran Ben Elisha
2019-08-22 15:25:12 +0800

11 Aug, 2019

1 commit

9f818c8a7 mlx5: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

This cleans up a lot of unneeded code and logic around the debugfs
files, making all of this much simpler and easier to understand as we
don't need to keep the dentries saved anymore.

Cc: Saeed Mahameed
Cc: Leon Romanovsky
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: David S. Miller

Greg Kroah-Hartman
2019-08-11 06:25:47 +0800

08 Aug, 2019

1 commit

94f3e14e0 mlx5: Use refcount_t for refcount

refcount_t is preferred over atomic_t for reference counters.
This is because the implementation of refcount_t can prevent
overflows and detect possible use-after-free.
So convert atomic_t ref counters to refcount_t.
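
The two protections mentioned above can be shown with a toy counter. This is a simplified model, not the kernel's refcount_t implementation: the class name, the tiny `SATURATED` sentinel and the `warnings` list are assumptions made for illustration (the kernel emits warnings rather than collecting them in a list).

```python
class Refcount:
    """Toy reference counter that saturates and flags suspicious operations."""

    SATURATED = 4  # deliberately tiny sentinel so saturation is easy to reach

    def __init__(self, value: int = 1) -> None:
        self.value = value
        self.warnings = []

    def inc(self) -> None:
        if self.value == self.SATURATED:
            return  # stays pinned at the sentinel
        if self.value == 0:
            # The object was already released; taking a new reference would
            # be a use-after-free, so flag it instead of reviving the count.
            self.warnings.append("increment on zero")
        elif self.value + 1 >= self.SATURATED:
            self.value = self.SATURATED  # saturate instead of wrapping around
            self.warnings.append("saturated")
        else:
            self.value += 1

    def dec_and_test(self) -> bool:
        """Drop one reference; return True when the last one is gone."""
        if self.value == self.SATURATED:
            return False  # a saturated counter never reaches zero
        if self.value == 0:
            self.warnings.append("decrement on zero")
            return False
        self.value -= 1
        return self.value == 0
```

A plain integer counter would wrap on overflow and could reach zero while references still exist; pinning the value at a sentinel trades a possible leak for the absence of a premature free.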

Signed-off-by: Chuhong Yuan
Acked-by: Leon Romanovsky
Signed-off-by: Saeed Mahameed

Chuhong Yuan
2019-08-08 02:01:48 +0800

02 Aug, 2019

2 commits

558101f1b net/mlx5: Add flow counter pool

Add a pool of flow counters, based on flow counter bulks, removing the
need to allocate a new counter via a costly FW command during the flow
creation process. The time it takes to acquire/release a flow counter
is cut from ~50 [us] to ~50 [ns].

The pool is part of the mlx5 driver instance, and provides flow
counters for aging flows. mlx5_fc_create() was modified to provide
counters for aging flows from the pool by default, and
mlx5_destroy_fc() was modified to release counters back to the pool
for later reuse. If bulk allocation is not supported or fails, and for
non-aging flows, the fallback behavior is to allocate and free
individual counters.

The pool is comprised of three lists of flow counter bulks, one of
fully used bulks, one of partially used bulks, and one of unused
bulks. Counters are provided from the partially used bulks first, to
help limit bulk fragmentation.

The pool maintains a threshold, and strives to keep the number of
available counters below it. The pool is increased in size when a
counter acquisition request is made and there are no available
counters, and it is decreased in size when the last counter in a bulk
is released and there are more available counters than the threshold.
All pool size changes are done in the context of the
acquiring/releasing process.

The value of the threshold is directly correlated to the amount of
used counters the pool is providing, while constrained by a hard
maximum, and is recalculated every time a bulk is allocated/freed.
This ensures that the pool only consumes large amounts of memory for
available counters if the pool is being used heavily. When fully
populated and at the hard maximum, the buffer of available counters
consumes ~40 [MB].
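
The pool policy described above can be modelled as follows. This is a sketch under assumptions rather than the driver's implementation: `Bulk`, `CounterPool` and their fields are hypothetical names, and the threshold formula (used counters divided by a ratio, capped at a maximum) is only one plausible reading of "directly correlated ... while constrained by a hard maximum".

```python
from typing import List


class Bulk:
    """A group of counters allocated together."""

    def __init__(self, base_id: int, size: int) -> None:
        self.base_id = base_id
        self.size = size
        self.free: List[int] = list(range(base_id, base_id + size))


class CounterPool:
    def __init__(self, bulk_size: int = 4, max_threshold: int = 8,
                 ratio: int = 2) -> None:
        self.bulk_size = bulk_size
        self.max_threshold = max_threshold
        self.ratio = ratio  # assumed divisor tying the threshold to usage
        self.full: List[Bulk] = []
        self.partial: List[Bulk] = []
        self.unused: List[Bulk] = []
        self.used = 0
        self.available = 0
        self.threshold = 0
        self._next_base = 0

    def _update_threshold(self) -> None:
        # Recalculated whenever a bulk is allocated or freed.
        self.threshold = min(self.max_threshold, self.used // self.ratio)

    def acquire(self) -> int:
        if self.partial:            # partially used bulks first
            bulk = self.partial.pop()
        elif self.unused:
            bulk = self.unused.pop()
        else:                       # nothing available: grow the pool
            bulk = Bulk(self._next_base, self.bulk_size)
            self._next_base += self.bulk_size
            self.available += self.bulk_size
            self._update_threshold()
        counter = bulk.free.pop()
        self.used += 1
        self.available -= 1
        (self.partial if bulk.free else self.full).append(bulk)
        return counter

    def release(self, counter: int) -> None:
        for bulks in (self.full, self.partial):
            for bulk in bulks:
                if bulk.base_id <= counter < bulk.base_id + bulk.size:
                    bulks.remove(bulk)
                    bulk.free.append(counter)
                    self.used -= 1
                    self.available += 1
                    if len(bulk.free) < bulk.size:
                        self.partial.append(bulk)
                    elif self.available > self.threshold:
                        # Last counter of the bulk returned and too many
                        # counters are idle: shrink the pool.
                        self.available -= bulk.size
                        self._update_threshold()
                    else:
                        self.unused.append(bulk)
                    return
        raise ValueError("counter does not belong to this pool")
```

Serving from partially used bulks first means a bulk tends to either fill up or drain completely, which is what makes whole-bulk frees possible.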

Signed-off-by: Gavi Teitz
Reviewed-by: Vlad Buslov
Signed-off-by: Saeed Mahameed

Gavi Teitz
2019-08-02 03:33:30 +0800

6f06e04b6 net/mlx5: Refactor and optimize flow counter bulk query

Towards introducing the ability to allocate bulks of flow counters,
refactor the flow counter bulk query process, removing functions and
structs whose names indicated being used for flow counter bulk
allocation FW commands, despite them actually only being used to
support bulk querying, and migrate their functionality to correctly
named functions in their natural location, fs_counters.c.

Additionally, optimize the bulk query process by:
* Extracting the memory used for the query to mlx5_fc_stats so
that it is only allocated once, and not for each bulk query.
* Querying all the counters in one function call.

Signed-off-by: Gavi Teitz
Reviewed-by: Vlad Buslov
Signed-off-by: Saeed Mahameed

Gavi Teitz
2019-08-02 02:14:24 +0800

05 Jul, 2019

1 commit

e08a976a1 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Misc updates from mlx5-next branch:

1) Add the required HW definitions and structures for upcoming TLS
support.
2) Add support for MCQI and MCQS hardware registers for fw version query.
3) Added hardware bits and structures definitions for sub-functions
4) Small code cleanup and improvement for PF pci driver.
5) Bluefield (ECPF) updates and refactoring for better E-Switch
management on ECPF embedded CPU NIC:
5.1) Consolidate querying eswitch number of VFs
5.2) Register event handler at the correct E-Switch init stage
5.3) Setup PF's inline mode and vlan pop when the ECPF is the
E-Switch manager (the host PF is basically a VF).
5.4) Handle Vport UC address changes in switchdev mode.

6) Cleanup the rep and netdev reference when unloading IB rep.

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2019-07-05 04:42:59 +0800

04 Jul, 2019

2 commits

2752b8231 net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()

Instead of MLX5_TOTAL_VPORTS, use mlx5_eswitch_get_total_vports().
In a subsequent patch, mlx5_eswitch_get_total_vports() accounts for SF
vports as well.
Expanding the MLX5_TOTAL_VPORTS macro would require exposing SF internals to
the more generic vport.h header file. Such exposure is not desired.
Hence mlx5_eswitch_get_total_vports() is introduced.

Given that mlx5_eswitch_get_total_vports() API wants to work on const
mlx5_core_dev*, change its helper functions also to accept const *dev.

Signed-off-by: Parav Pandit
Signed-off-by: Saeed Mahameed

Parav Pandit
2019-07-04 03:50:42 +0800

c0670781f net/mlx5: Expose the API to register for ANY event

Expose the API to register for ANY event, mlx5_ib will be able to use
this functionality for its needs.

Signed-off-by: Yishai Hadas
Acked-by: Saeed Mahameed
Signed-off-by: Leon Romanovsky

Yishai Hadas
2019-07-04 01:56:29 +0800

02 Jul, 2019

3 commits

d886aba67 net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs

When enabling SR-IOV, the PCI core already returns a failure error code if
SR-IOV is already enabled.
Hence, remove the duplicate check from the mlx5_core driver.

While at it, make mlx5_device_disable_sriov() perform cleanup of VFs in
reverse order of mlx5_device_enable_sriov().

Signed-off-by: Parav Pandit
Signed-off-by: Saeed Mahameed

Parav Pandit
2019-07-02 07:40:30 +0800

386e75af9 net/mlx5: Rename mlx5_pci_dev_type to mlx5_coredev_type

Rename mlx5_pci_dev_type to mlx5_coredev_type to distinguish different mlx5
device types.

mlx5_coredev_type represents mlx5_core_dev instance type. Hence keep
mlx5_coredev_type in mlx5_core_dev structure.

Signed-off-by: Huy Nguyen
Signed-off-by: Vu Pham
Signed-off-by: Parav Pandit
Reviewed-by: Parav Pandit
Signed-off-by: Saeed Mahameed

Huy Nguyen
2019-07-02 07:40:30 +0800

a82e0b5bd net/mlx5: Added MCQI and MCQS registers' description to ifc

Given a fw component index, the MCQI register allows us to query
this component's information (e.g. its version and capabilities).

Given a fw component index, the MCQS register allows us to query the
status of a fw component, including its type and state
(e.g. PRESET/IN_USE).
It can be used to find the index of a component of a specific type, by
sequentially increasing the component index, and querying each time the
type of the returned component.
If max component index is reached, 'last_index_flag' is set by the HCA.

These registers' description was added to query the running and pending
fw version of the HCA.
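
The MCQS search described above amounts to a linear scan that stops at 'last_index_flag'. A minimal sketch, assuming a hypothetical `query` callable that stands in for one register read; `McqsReply` and `find_component` are illustrative names, not driver functions:

```python
from typing import Callable, NamedTuple, Optional


class McqsReply(NamedTuple):
    component_type: int
    last_index_flag: bool  # set when the queried index is the last one


def find_component(query: Callable[[int], McqsReply],
                   wanted_type: int) -> Optional[int]:
    """Return the index of the first component of 'wanted_type', or None."""
    index = 0
    while True:
        reply = query(index)
        if reply.component_type == wanted_type:
            return index
        if reply.last_index_flag:
            return None  # scanned every component without a match
        index += 1
```

The match is checked before the flag so that a component sitting at the last index is still found.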

Signed-off-by: Shay Agroskin
Signed-off-by: Saeed Mahameed

Shay Agroskin
2019-07-02 07:40:30 +0800

29 Jun, 2019

1 commit

4f5d1bead Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Misc updates from mlx5-next branch:

1) E-Switch vport metadata support for source vport matching
2) Convert mkey_table to XArray
3) Shared IRQs and to use single IRQ for all async EQs

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2019-06-29 07:03:54 +0800

25 Jun, 2019

1 commit

792c4e9d0 net/mlx5: Convert mkey_table to XArray

The lock protecting the data structure does not need to be an rwlock. The
only read access to the lock is in an error path, and if that's limiting
your scalability, you have bigger performance problems.

Eliminate mlx5_mkey_table in favour of using the xarray directly.
reg_mr_callback must use GFP_ATOMIC for allocating XArray nodes as it may
be called in interrupt context.

This also fixes a minor bug where SRCU locking was being used on the radix
tree read side, when RCU was needed too.

Signed-off-by: Matthew Wilcox
Signed-off-by: Jason Gunthorpe
Signed-off-by: Saeed Mahameed

Matthew Wilcox
2019-06-25 07:44:40 +0800

14 Jun, 2019

5 commits

b3bd076f7 net/mlx5: Report devlink health on FW fatal issues

Report devlink health on FW fatal issues via fw_fatal_reporter. The
driver's recovery flow for FW fatal errors is now handled by
devlink health.

With recovery controlled by devlink health, the user can
disable auto-recovery for a debug session and run recovery
manually.

Call mlx5_enter_error_state() before calling devlink_health_report() to
ensure entering device error state even if auto-recovery is off.
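
The ordering constraint above can be illustrated with a toy model. This is a sketch under assumptions, not driver or devlink code: `Device`, `FatalReporter` and `handle_fatal_error` are hypothetical names, and the reporter is assumed to run its recover step during a report only when auto-recovery is enabled.

```python
class Device:
    def __init__(self) -> None:
        self.in_error_state = False
        self.events = []

    def enter_error_state(self) -> None:
        self.in_error_state = True
        self.events.append("enter_error_state")


class FatalReporter:
    """Toy stand-in for a health reporter with a recover callback."""

    def __init__(self, device: Device, auto_recover: bool = True) -> None:
        self.device = device
        self.auto_recover = auto_recover

    def recover(self) -> None:
        self.device.in_error_state = False
        self.device.events.append("recover")

    def report(self) -> None:
        self.device.events.append("report")
        if self.auto_recover:
            self.recover()


def handle_fatal_error(device: Device, reporter: FatalReporter) -> None:
    # Enter the error state first, so the device is marked as failed even
    # when the report does not trigger a recovery.
    device.enter_error_state()
    reporter.report()
```

With auto-recovery disabled, the device stays in the error state after the report until `recover()` is run by hand, which matches the debug-session use case in the description.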

Signed-off-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed

Moshe Shemesh
2019-06-14 04:23:19 +0800

96c82cdfe net/mlx5: Add fw fatal devlink_health_reporter

Create mlx5_devlink_health_reporter for fw fatal reporter.
The fw fatal reporter is added in addition to the fw reporter and
implements the recover callback.
The point of having two reporters for FW issues is that we
don't want to run FW recovery on every issue, only on fatal ones.

Signed-off-by: Moshe Shemesh
Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed

Moshe Shemesh
2019-06-14 04:23:19 +0800

d1bf0e2cc net/mlx5: Report devlink health on FW issues

Use devlink_health_report() to report any symptom of an FW issue, such as
an FW counter miss or a new health syndrome.
FW issues are detected in mlx5 during poll_health, which is called in
timer atomic context, so the health work queue is used to schedule the
reports.

Signed-off-by: Moshe Shemesh
Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed

Moshe Shemesh
2019-06-14 04:23:19 +0800

1e34f3efd net/mlx5: Create FW devlink_health_reporter

Create mlx5_devlink_health_reporter for FW reporter. The FW reporter
implements devlink_health_reporter diagnose callback.

The fw reporter diagnose command can be triggered any time by the user
to check current fw status.
In healthy status, it will return a clear syndrome. Otherwise it will
return the syndrome and a description of the error type.

Command example and output on healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 0

Command example and output on non healthy status:
$ devlink health diagnose pci/0000:82:00.0 reporter fw
Syndrome: 8 Description: unrecoverable hardware error

Signed-off-by: Moshe Shemesh
Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed

Moshe Shemesh
2019-06-14 04:23:18 +0800

3e5b72ac2 net/mlx5: Issue SW reset on FW assert

If a FW assert is considered fatal, indicated by a new bit in the health
buffer, reset the FW. After the reset go through the normal recovery
flow. Only one PF needs to issue the reset, so an attempt is made to
prevent the 2nd function from also issuing the reset.
It's not an error if that happens, it just slows recovery.

Signed-off-by: Feras Daoud
Signed-off-by: Alex Vesker
Signed-off-by: Moshe Shemesh
Signed-off-by: Daniel Jurgens
Signed-off-by: Saeed Mahameed

Feras Daoud
2019-06-14 04:23:18 +0800