Eric Lee / smarc-fsl-linux-kernel

01 Feb, 2020

4 commits

8fdd4019b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
"A very quiet cycle with few notable changes. Mostly the usual list of
one or two patches to drivers changing something that isn't quite rc
worthy. The subsystem seems to be seeing a larger number of rework and
cleanup style patches right now, I feel that several vendors are
prepping their drivers for new silicon.

Summary:

- Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4,
rxe, i40iw

- Larger series doing cleanup and rework for hns and hfi1.

- Some general reworking of the CM code to make it a little more
understandable

- Unify the different code paths connected to the uverbs FD scheme

- New UAPI ioctls conversions for get context and get async fd

- Trace points for CQ and CM portions of the RDMA stack

- mlx5 driver support for virtio-net formatted rings as RDMA raw
ethernet QPs

- verbs support for setting the PCI-E relaxed ordering bit on DMA
traffic connected to a MR

- A couple of bug fixes that came too late to make rc7"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits)
RDMA/core: Make the entire API tree static
RDMA/efa: Mask access flags with the correct optional range
RDMA/cma: Fix unbalanced cm_id reference count during address resolve
RDMA/umem: Fix ib_umem_find_best_pgsz()
IB/mlx4: Fix leak in id_map_find_del
IB/opa_vnic: Spelling correction of 'erorr' to 'error'
IB/hfi1: Fix logical condition in msix_request_irq
RDMA/cm: Remove CM message structs
RDMA/cm: Use IBA functions for complex structure members
RDMA/cm: Use IBA functions for simple structure members
RDMA/cm: Use IBA functions for swapping get/set acessors
RDMA/cm: Use IBA functions for simple get/set acessors
RDMA/cm: Add SET/GET implementations to hide IBA wire format
RDMA/cm: Add accessors for CM_REQ transport_type
IB/mlx5: Return the administrative GUID if exists
RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence
IB/mlx4: Fix memory leak in add_gid error flow
IB/mlx5: Expose RoCE accelerator counters
RDMA/mlx5: Set relaxed ordering when requested
RDMA/core: Add the core support field to METHOD_GET_CONTEXT
...

Linus Torvalds
2020-02-01 06:40:36 +0800
f1f6a7dd9 mm, tree-wide: rename put_user_page*() to unpin_user_page*()

Rename put_user_page*() to unpin_user_page*() in order to provide a
clearer, more symmetric API for pinning and unpinning DMA pages. This
way, pin_user_pages*() calls match up with unpin_user_pages*() calls,
and the API is a lot closer to being self-explanatory.

Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
Signed-off-by: John Hubbard
Reviewed-by: Jan Kara
Cc: Alex Williamson
Cc: Aneesh Kumar K.V
Cc: Björn Töpel
Cc: Christoph Hellwig
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Hans Verkuil
Cc: Ira Weiny
Cc: Jason Gunthorpe
Cc: Jens Axboe
Cc: Jerome Glisse
Cc: Jonathan Corbet
Cc: Kirill A. Shutemov
Cc: Leon Romanovsky
Cc: Mauro Carvalho Chehab
Cc: Mike Rapoport
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John Hubbard
2020-02-01 02:30:38 +0800
dfa0a4fff IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP

Convert infiniband to use the new pin_user_pages*() calls.

Also, revert earlier changes to Infiniband ODP that had it using
put_user_page(). ODP is "Case 3" in
Documentation/core-api/pin_user_pages.rst, which is to say, normal
get_user_pages() and put_page() is the API to use there.

The new pin_user_pages*() calls replace corresponding get_user_pages*()
calls, and set the FOLL_PIN flag. The FOLL_PIN flag requires that the
caller must return the pages via put_user_page*() calls, but infiniband
was already doing that as part of an earlier commit.

Link: http://lkml.kernel.org/r/20200107224558.2362728-14-jhubbard@nvidia.com
Signed-off-by: John Hubbard
Reviewed-by: Jason Gunthorpe
Cc: Alex Williamson
Cc: Aneesh Kumar K.V
Cc: Björn Töpel
Cc: Christoph Hellwig
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Hans Verkuil
Cc: Ira Weiny
Cc: Jan Kara
Cc: Jason Gunthorpe
Cc: Jens Axboe
Cc: Jerome Glisse
Cc: Jonathan Corbet
Cc: Kirill A. Shutemov
Cc: Leon Romanovsky
Cc: Mauro Carvalho Chehab
Cc: Mike Rapoport
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John Hubbard
2020-02-01 02:30:37 +0800
4789fcdd1 IB/umem: use get_user_pages_fast() to pin DMA pages

And get rid of the mmap_sem calls, as part of that. Note that
get_user_pages_fast() will, if necessary, fall back to
__gup_longterm_unlocked(), which takes the mmap_sem as needed.

Link: http://lkml.kernel.org/r/20200107224558.2362728-10-jhubbard@nvidia.com
Signed-off-by: John Hubbard
Reviewed-by: Leon Romanovsky
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Reviewed-by: Jason Gunthorpe
Reviewed-by: Ira Weiny
Cc: Alex Williamson
Cc: Aneesh Kumar K.V
Cc: Björn Töpel
Cc: Daniel Vetter
Cc: Dan Williams
Cc: Hans Verkuil
Cc: Jason Gunthorpe
Cc: Jens Axboe
Cc: Jerome Glisse
Cc: Jonathan Corbet
Cc: Kirill A. Shutemov
Cc: Mauro Carvalho Chehab
Cc: Mike Rapoport
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John Hubbard
2020-02-01 02:30:37 +0800

31 Jan, 2020

1 commit

8889f6fa3 RDMA/core: Make the entire API tree static

Compilation of mlx5 driver without CONFIG_INFINIBAND_USER_ACCESS generates
the following error.

on x86_64:

ld: drivers/infiniband/hw/mlx5/main.o: in function `mlx5_ib_handler_MLX5_IB_METHOD_VAR_OBJ_ALLOC':
main.c:(.text+0x186d): undefined reference to `ib_uverbs_get_ucontext_file'
ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x2480): undefined reference to `uverbs_idr_class'
ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x24d8): undefined reference to `uverbs_destroy_def_handler'

This is happening because some parts of the UAPI description are not
static. This is a holdover from earlier code that relied on struct
pointers to refer to object types; now object types are referenced by
number. Remove the unused globals and add statics to the remaining UAPI
description elements.

Remove the redundant #ifdefs around mlx5_ib_*defs and obsolete
mlx5_ib_get_devx_tree().

The compiler now trims a lot more unused code, including the above
problematic definitions when !CONFIG_INFINIBAND_USER_ACCESS.

Fixes: 7be76bef320b ("IB/mlx5: Introduce VAR object and its alloc/destroy methods")
Reported-by: Randy Dunlap
Acked-by: Randy Dunlap
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-31 04:28:52 +0800

30 Jan, 2020

1 commit

ba19e1665 RDMA/efa: Mask access flags with the correct optional range

The uapi value IB_UVERBS_ACCESS_OPTIONAL_RANGE shouldn't be used inside
the driver, use IB_ACCESS_OPTIONAL instead.

Fixes: 86dd738cf20c ("RDMA/efa: Allow passing of optional access flags for MR registration")
Link: https://lore.kernel.org/r/20200129071803.40117-1-galpress@amazon.com
Signed-off-by: Gal Pressman
Signed-off-by: Jason Gunthorpe

Gal Pressman
2020-01-30 04:41:05 +0800

29 Jan, 2020

5 commits

bd2463ac7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from David Miller:

1) Add WireGuard

2) Add HE and TWT support to ath11k driver, from John Crispin.

3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.

4) Add variable window congestion control to TIPC, from Jon Maloy.

5) Add BCM84881 PHY driver, from Russell King.

6) Start adding netlink support for ethtool operations, from Michal
Kubecek.

7) Add XDP drop and TX action support to ena driver, from Sameeh
Jubran.

8) Add new ipv4 route notifications so that mlxsw driver does not have
to handle identical routes itself. From Ido Schimmel.

9) Add BPF dynamic program extensions, from Alexei Starovoitov.

10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.

11) Add support for macsec HW offloading, from Antoine Tenart.

12) Add initial support for MPTCP protocol, from Christoph Paasch,
Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.

13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
Cherian, and others.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
udp: segment looped gso packets correctly
netem: change mailing list
qed: FW 8.42.2.0 debug features
qed: rt init valid initialization changed
qed: Debug feature: ilt and mdump
qed: FW 8.42.2.0 Add fw overlay feature
qed: FW 8.42.2.0 HSI changes
qed: FW 8.42.2.0 iscsi/fcoe changes
qed: Add abstraction for different hsi values per chip
qed: FW 8.42.2.0 Additional ll2 type
qed: Use dmae to write to widebus registers in fw_funcs
qed: FW 8.42.2.0 Parser offsets modified
qed: FW 8.42.2.0 Queue Manager changes
qed: FW 8.42.2.0 Expose new registers and change windows
qed: FW 8.42.2.0 Internal ram offsets modifications
MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
Documentation: net: octeontx2: Add RVU HW and drivers overview
octeontx2-pf: ethtool RSS config support
octeontx2-pf: Add basic ethtool support
...

Linus Torvalds
2020-01-29 08:02:33 +0800
b4fb4cc5b RDMA/cma: Fix unbalanced cm_id reference count during address resolve

The commit below missed the AF_IB and loopback code flow in
rdma_resolve_addr(). This leads to an unbalanced cm_id refcount in
cma_work_handler(), which puts a reference that was not taken prior
to queuing the work.

A call trace is observed with such code flow:

BUG: unable to handle kernel NULL pointer dereference at (null)
__mutex_lock_slowpath+0x166/0x1d0
mutex_lock+0x1f/0x2f
cma_work_handler+0x25/0xa0
process_one_work+0x17f/0x440
worker_thread+0x126/0x3c0

Hence, hold the cm_id reference when scheduling the resolve work item.
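
The reference rule behind this fix can be sketched in plain C. This is a
minimal userspace illustration, not the kernel code: struct cm_id_sketch
and the *_sketch helpers are hypothetical names, and a plain int stands in
for the kernel's atomic reference counter.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted cm_id. */
struct cm_id_sketch {
	int refcount;
};

static void id_get_sketch(struct cm_id_sketch *id) { id->refcount++; }
static void id_put_sketch(struct cm_id_sketch *id) { id->refcount--; }

/* Queue the resolve work: take a reference on behalf of the work item. */
static void queue_resolve_sketch(struct cm_id_sketch *id)
{
	id_get_sketch(id);
	/* A real implementation would hand a work item to a workqueue here. */
}

/* The handler always drops the reference that was taken for the work item. */
static void work_handler_sketch(struct cm_id_sketch *id)
{
	/* ... the address resolution work would happen here ... */
	id_put_sketch(id);
}
```

If any path queues the work without the matching get, the handler's put
drops a reference that belongs to someone else, which is the imbalance
described above.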

Fixes: 722c7b2bfead ("RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()")
Link: https://lore.kernel.org/r/20200126142652.104803-2-leon@kernel.org
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Reviewed-by: Jason Gunthorpe
Signed-off-by: Jason Gunthorpe

Parav Pandit
2020-01-29 02:15:23 +0800
36798d5ae RDMA/umem: Fix ib_umem_find_best_pgsz()

Except for the last entry, the ending iova alignment sets the maximum
possible page size as the low bits of the iova must be zero when starting
the next chunk.
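
The alignment rule can be illustrated with a deliberately simplified model
in C. This is a sketch under stated assumptions, not the body of
ib_umem_find_best_pgsz(): it looks only at iova values and ignores DMA
addresses, and struct chunk_sketch and best_pgsz_sketch() are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous chunk of the mapping. */
struct chunk_sketch {
	uint64_t iova;	/* starting I/O virtual address */
	uint64_t len;	/* length in bytes */
};

/*
 * Return the largest page size in 'supported' (bit n set means 2^n bytes)
 * that keeps every interior chunk boundary page aligned, or 0 if none fits.
 */
static uint64_t best_pgsz_sketch(const struct chunk_sketch *c, size_t n,
				 uint64_t supported)
{
	uint64_t mask = 0;
	size_t i;
	int bit;

	for (i = 0; i < n; i++) {
		if (i != 0)
			mask |= c[i].iova;		/* start of all but the first */
		if (i != n - 1)
			mask |= c[i].iova + c[i].len;	/* end of all but the last */
	}

	if (mask) {
		uint64_t lowest = mask & (~mask + 1);	/* lowest set bit */

		/* Drop every page size larger than that alignment. */
		supported &= (lowest << 1) - 1;
	}

	for (bit = 63; bit >= 0; bit--)
		if (supported & ((uint64_t)1 << bit))
			return (uint64_t)1 << bit;
	return 0;
}
```

The last chunk's end is left out of the mask because no further chunk has
to start there, which matches the "except for the last entry" wording.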

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Link: https://lore.kernel.org/r/20200128135612.174820-1-leon@kernel.org
Signed-off-by: Artemy Kovalyov
Signed-off-by: Leon Romanovsky
Tested-by: Gal Pressman
Reviewed-by: Jason Gunthorpe
Signed-off-by: Jason Gunthorpe

Artemy Kovalyov
2020-01-29 02:10:54 +0800
c0e809e24 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
"Kernel side changes:

- Ftrace is one of the last W^X violators (after this only KLP is
left). These patches move it over to the generic text_poke()
interface and thereby get rid of this oddity. This requires a
surprising amount of surgery, by Peter Zijlstra.

- x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to
count certain types of events that have a special, quirky hw ABI
(by Kim Phillips)

- kprobes fixes by Masami Hiramatsu

Lots of tooling updates as well, the following subcommands were
updated: annotate/report/top, c2c, clang, record, report/top TUI,
sched timehist, tests; plus updates were done to the gtk ui, libperf,
headers and the parser"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
perf/x86/amd: Add support for Large Increment per Cycle Events
perf/x86/amd: Constrain Large Increment per Cycle events
perf/x86/intel/rapl: Add Comet Lake support
tracing: Initialize ret in syscall_enter_define_fields()
perf header: Use last modification time for timestamp
perf c2c: Fix return type for histogram sorting comparision functions
perf beauty sockaddr: Fix augmented syscall format warning
perf/ui/gtk: Fix gtk2 build
perf ui gtk: Add missing zalloc object
perf tools: Use %define api.pure full instead of %pure-parser
libperf: Setup initial evlist::all_cpus value
perf report: Fix no libunwind compiled warning break s390 issue
perf tools: Support --prefix/--prefix-strip
perf report: Clarify in help that --children is default
tools build: Fix test-clang.cpp with Clang 8+
perf clang: Fix build with Clang 9
kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic
tools lib: Fix builds when glibc contains strlcpy()
perf report/top: Make 'e' visible in the help and make it toggle showing callchains
perf report/top: Do not offer annotation for symbols without samples
...

Linus Torvalds
2020-01-29 01:44:15 +0800
634cd4b6a Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:

- Cleanup of the GOP [graphics output] handling code in the EFI stub

- Complete refactoring of the mixed mode handling in the x86 EFI stub

- Overhaul of the x86 EFI boot/runtime code

- Increase robustness for mixed mode code

- Add the ability to disable DMA at the root port level in the EFI
stub

- Get rid of RWX mappings in the EFI memory map and page tables,
where possible

- Move the support code for the old EFI memory mapping style into its
only user, the SGI UV1+ support code.

- plus misc fixes, updates, smaller cleanups.

... and due to interactions with the RWX changes, another round of PAT
cleanups make a guest appearance via the EFI tree - with no side
effects intended"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
efi/x86: Disable instrumentation in the EFI runtime handling code
efi/libstub/x86: Fix EFI server boot failure
efi/x86: Disallow efi=old_map in mixed mode
x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld
efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping
efi: Fix handling of multiple efi_fake_mem= entries
efi: Fix efi_memmap_alloc() leaks
efi: Add tracking for dynamically allocated memmaps
efi: Add a flags parameter to efi_memory_map
efi: Fix comment for efi_mem_type() wrt absent physical addresses
efi/arm: Defer probe of PCIe backed efifb on DT systems
efi/x86: Limit EFI old memory map to SGI UV machines
efi/x86: Avoid RWX mappings for all of DRAM
efi/x86: Don't map the entire kernel text RW for mixed mode
x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
efi/libstub/x86: Fix unused-variable warning
efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode
efi/libstub/x86: Use const attribute for efi_is_64bit()
efi: Allow disabling PCI busmastering on bridges during boot
efi/x86: Allow translating 64-bit arguments for mixed mode calls
...

Linus Torvalds
2020-01-29 01:03:40 +0800

28 Jan, 2020

2 commits

6a1000bd2 Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap

Pull ioremap updates from Christoph Hellwig:
"Remove the ioremap_nocache API (plus wrappers) that are always
identical to ioremap"

* tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap:
remove ioremap_nocache and devm_ioremap_nocache
MIPS: define ioremap_nocache to ioremap

Linus Torvalds
2020-01-28 05:03:00 +0800
ea660ad7c IB/mlx4: Fix leak in id_map_find_del

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the bookkeeping in
a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when an
entry has to be evicted from the cache. When a DREP is sent from
mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similarly, when
a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is
called.

This function wipes out the TID in use from the IDR or XArray and removes
the id_map_entry from the table.

In short, it does everything except the topping of the cake, which is to
remove the entry from the list and free it. In other words, for the REJ
case enumerated above, one id_map_entry will be leaked.

For the other case above, a DREQ has been received first. The reception of
the DREQ will trigger queuing of a delayed work to delete the
id_map_entry, for the case where the VM doesn't send back a DREP.

In the normal case, the VM _will_ send back a DREP, and id_map_find_del()
will be called.

But this scenario introduces a secondary leak. First, when the DREQ is
received, a delayed work is queued. The VM will then return a DREP, which
will call id_map_find_del(). As stated above, this will free the TID used
from the XArray or IDR. Now, there is a window where that particular TID can
be re-allocated, let's say by an outgoing REQ. This TID will later be wiped
out by the delayed work, when the function id_map_ent_timeout() is
called. But the id_map_entry allocated by the outgoing REQ will not be
de-allocated, and we have a leak.

Both leaks are fixed by removing the id_map_find_del() function and only
using schedule_delayed(). Of course, a check has been added in
schedule_delayed() to see if the work has already been queued.

Another benefit of always using the delayed version for deleting entries,
is that we do get a TimeWait effect; a TID no longer in use, will occupy
the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability
of being re-used for that time period.

Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization")
Link: https://lore.kernel.org/r/20200123155521.1212288-1-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge
Signed-off-by: Manjunath Patil
Reviewed-by: Rama Nichanamatlu
Reviewed-by: Jack Morgenstein
Signed-off-by: Jason Gunthorpe

Håkon Bugge
2020-01-28 04:46:53 +0800

27 Jan, 2020

1 commit

54343d951 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Two last minute fixes, both in drivers.

The fnic one is a highly unlikely condition, but the RDMA one is a
recently introduced regression that causes a kernel warning to trigger
in every RDMA logon, which would be unsightly if it got into the final
release"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: RDMA/isert: Fix a recently introduced regression related to logout
scsi: fnic: do not queue commands during fwreset

Linus Torvalds
2020-01-27 02:39:09 +0800

26 Jan, 2020

11 commits

7f04c71f1 IB/opa_vnic: Spelling correction of 'erorr' to 'error'

Correcting a minor spelling mistake in the comments.

Link: https://lore.kernel.org/r/20200118162542.15188-1-dab9861@gmail.com
Signed-off-by: Dillon Brock
Acked-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe

Dillon Brock
2020-01-26 03:37:56 +0800
79ba4f931 IB/hfi1: Fix logical condition in msix_request_irq

Clang warns:

drivers/infiniband/hw/hfi1/msix.c:136:22: warning: overlapping
comparisons always evaluate to false [-Wtautological-overlap-compare]
if (type < IRQ_SDMA && type >= IRQ_OTHER)
~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
1 warning generated.

It is impossible for something to be less than 0 (IRQ_SDMA) and greater
than or equal to 3 (IRQ_OTHER) at the same time. A logical OR should
have been used to keep the same logic as before.
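
The difference between the two conditions can be shown with a small
standalone C sketch. Only the bounds IRQ_SDMA = 0 and IRQ_OTHER = 3 come
from the text above; the middle enumerator names and the helper functions
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in enum: the two middle names are placeholders, not driver names. */
enum irq_type_sketch {
	IRQ_SDMA = 0,
	IRQ_MIDDLE_A_SKETCH,
	IRQ_MIDDLE_B_SKETCH,
	IRQ_OTHER,		/* == 3 */
};

/* The warned-about form: no value is both below 0 and at least 3. */
static bool type_rejected_and(int type)
{
	return type < IRQ_SDMA && type >= IRQ_OTHER;
}

/* The corrected form: rejects anything outside [IRQ_SDMA, IRQ_OTHER). */
static bool type_rejected_or(int type)
{
	return type < IRQ_SDMA || type >= IRQ_OTHER;
}
```

With the AND form the range check never fires, so out-of-range types pass
through; the OR form restores the intended validation.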

Link: https://lore.kernel.org/r/20200116222658.5285-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/841
Fixes: 13d2a8384bd9 ("IB/hfi1: Decouple IRQ name from type")
Signed-off-by: Nathan Chancellor
Reviewed-by: Nick Desaulniers
Acked-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe

Nathan Chancellor
2020-01-26 03:33:53 +0800
13e0af180 RDMA/cm: Remove CM message structs

All accesses now use the new IBA accessor scheme, so delete the structs
entirely and generate the structures from the schema file.

Link: https://lore.kernel.org/r/20200116170037.30109-8-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:11:37 +0800
4ca662a30 RDMA/cm: Use IBA functions for complex structure members

Use a Coccinelle spatch to replace CM structure members used as
structures, arrays, or pointers with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression src;
expression len;
{struct} *msg;
@@
- memcpy(msg->{old_name}, src, len)
+ IBA_SET_MEM({new_name}, msg, src, len)
@@
{struct} *msg;
identifier x;
@@
- msg->{old_name}.x
+ IBA_GET_MEM_PTR({new_name}, msg)->x
@@
{struct} *msg;
@@
- &msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

For GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ *IBA_GET_MEM_PTR({new_name}, msg)

For non-GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

Iterated for every remaining IBA_CHECK_OFF()/IBA_CHECK_GET()
pairing. Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-7-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:06:01 +0800
91b60a712 RDMA/cm: Use IBA functions for simple structure members

Use a Coccinelle spatch script to replace use of simple CM structure
members with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- msg->{old_name} = val
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- msg->{old_name}
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_OFF that isn't a CM_FIELD_MLOC.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh variants)
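
Why the doubled swap can be dropped is easy to check in userspace. The
sketch below uses the POSIX htonl()/ntohl() pair as an analogue of
cpu_to_be32()/be32_to_cpu(); swap_round_trip() is a hypothetical helper
name, not something from the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Userspace analogue of be32_to_cpu(cpu_to_be32(val)). */
static uint32_t swap_round_trip(uint32_t val)
{
	/* Convert to big-endian and straight back: the two swaps cancel. */
	return ntohl(htonl(val));
}
```

Since the round trip returns its input on both little-endian and
big-endian hosts, replacing the nested call with plain val does not change
behaviour.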

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-6-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:06:00 +0800
01adb7f46 RDMA/cm: Use IBA functions for swapping get/set acessors

Use a Coccinelle spatch script to replace CM helper functions that
return/accept BE values with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}(msg, val)
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- {old_getter}(msg)
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_GET_BE()/IBA_CHECK_SET_BE() pairing.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh variants)

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-5-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:06:00 +0800
b6bbee688 RDMA/cm: Use IBA functions for simple get/set acessors

Use a Coccinelle spatch to replace CM helper functions with IBA_GET/SET
versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}
+ IBA_SET({new_name}, msg, val)
@@
{struct} *msg;
@@
- {old_getter}
+ IBA_GET({new_name}, msg)

Iterated for every IBA_CHECK_GET()/IBA_CHECK_GET() pairing. Touched up
with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-4-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:05:59 +0800
d05d4ac4c RDMA/cm: Add SET/GET implementations to hide IBA wire format

There is no separation between the RDMA-CM wire format as it is declared
in IBTA and the kernel logic which implements the needed support. This
situation causes many mistakes in conversion between big-endian (wire
format) and the CPU format used by the kernel. It also mixes RDMA core
code with a combination of uXX and beXX variables.

The idea is that all accesses to IBA definitions will go through special
GET/SET macros to ensure that no conversion mistakes are made. The
shifting and masking required to read the value is automatically deduced
using the field offset description from the tables in the IBA
specification.
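
The general technique of deriving the shift and mask from a field
description can be sketched in userspace C. This is not the kernel's
IBA_GET()/IBA_SET() implementation: struct wire_field_sketch, wire_get()
and wire_set() are hypothetical names, and the sketch assumes fields of at
most 32 bits laid out most significant bit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field description: position and size within the message. */
struct wire_field_sketch {
	size_t bit_off;		/* offset from the start of the message, in bits */
	unsigned int width;	/* field width in bits, at most 32 */
};

/* Read a big-endian bit field and return it in CPU byte order. */
static uint32_t wire_get(const uint8_t *msg, struct wire_field_sketch f)
{
	uint32_t val = 0;
	unsigned int i;

	for (i = 0; i < f.width; i++) {
		size_t bit = f.bit_off + i;

		/* Bit 0 of each byte is its most significant bit on the wire. */
		val = (val << 1) | ((msg[bit / 8] >> (7 - bit % 8)) & 1u);
	}
	return val;
}

/* Store a CPU byte order value into a big-endian bit field. */
static void wire_set(uint8_t *msg, struct wire_field_sketch f, uint32_t val)
{
	unsigned int i;

	for (i = 0; i < f.width; i++) {
		size_t bit = f.bit_off + i;
		uint8_t mask = (uint8_t)(1u << (7 - bit % 8));

		if ((val >> (f.width - 1 - i)) & 1u)
			msg[bit / 8] |= mask;
		else
			msg[bit / 8] &= (uint8_t)~mask;
	}
}
```

Callers only ever see CPU-order integers, so no explicit byte swap appears
at the call site; the bit-by-bit loop trades speed for clarity in this
sketch.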

This starts with the CM MADs described in IBTA release 1.3 volume 1.

To confirm that the new macros behave the same as the old accessors a
self-test is included in this patch.

Each macro replacing a straightforward struct field compile-time tests
that the new field has the same offsetof() and width as the old field.

For the fields with accessor functions there is a runtime test: the 'all
ones' value is placed in a dummy message and read back in several ways to
confirm that both approaches give identical results.

Later patches in this series delete the self test.

This creates a tested table of new field name, old field name(s) and some
meta information like BE coding for the functions which will be used in
the next patches.

Link: https://lore.kernel.org/r/20200116170037.30109-3-jgg@ziepe.ca
Link: https://lore.kernel.org/r/20191212093830.316934-5-leon@kernel.org
Signed-off-by: Leon Romanovsky
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Leon Romanovsky
2020-01-26 03:05:59 +0800
792a7c1f2 RDMA/cm: Add accessors for CM_REQ transport_type

Access the two fields through wrappers, like all other fields, to make it
clearer what is happening.

Link: https://lore.kernel.org/r/20200116170037.30109-2-jgg@ziepe.ca
Tested-by: Leon Romanovsky
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 03:05:59 +0800
4bbd4923d IB/mlx5: Return the administrative GUID if exists

A user can change the operational GUID (a.k.a. effective GUID) through
link/infiniband. Therefore it is preferred to return the currently set
GUID if it exists instead of the operational.

This way the PF can query which VF GUID will be set in the next bind. In
order to align with MAC address, zero is returned if administrative GUID
is not set.

For example, before setting administrative GUID:
$ ip link show
ib0: mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 00:00:00:00:00:00:00:00, PORT_GUID 00:00:00:00:00:00:00:00, link-state auto, trust off, query_rss off

Then:

$ ip link set ib0 vf 0 node_guid 11:00:af:21:cb:05:11:00
$ ip link set ib0 vf 0 port_guid 22:11:af:21:cb:05:11:00

After setting administrative GUID:
$ ip link show
ib0: mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256
link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 11:00:af:21:cb:05:11:00, PORT_GUID 22:11:af:21:cb:05:11:00, link-state auto, trust off, query_rss off

Fixes: 9c0015ef0928 ("IB/mlx5: Implement callbacks for getting VFs GUID attributes")
Link: https://lore.kernel.org/r/20200116120048.12744-1-leon@kernel.org
Signed-off-by: Danit Goldberg
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Danit Goldberg
2020-01-26 02:54:39 +0800
6b3712c02 RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence

The set of entry->driver_removed is missing locking, protect it with
xa_lock() which is held by the only reader.

Otherwise readers may continue to see driver_removed = false after
rdma_user_mmap_entry_remove() returns and may continue to try and
establish new mmaps.

Fixes: 3411f9f01b76 ("RDMA/core: Create mmap database and cookie helper functions")
Link: https://lore.kernel.org/r/20200115202041.GA17199@ziepe.ca
Reviewed-by: Gal Pressman
Acked-by: Michal Kalderon
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-26 02:48:33 +0800

21 Jan, 2020

3 commits

e8b3a426f Merge tag 'rds-odp-for-5.5' into rdma.git for-next

From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created, and such MRs were not possible to create prior to
this series.

The first part of this patchset extends RDMA to have a special verb,
ib_reg_user_mr(). The common use case for this function is a
userspace application that allocates memory for HCA access while the
responsibility to register the memory at the HCA is on a kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is an
integral part of ODP flows and is used to trigger page faults in advance
to prepare memory before running the working set.

The third part is the actual user of those in-kernel APIs.
====================

* tag 'rds-odp-for-5.5':
net/rds: Use prefetch for On-Demand-Paging MR
net/rds: Handle ODP mr registration/unregistration
net/rds: Detect need of On-Demand-Paging memory registration
RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
RDMA/mlx5: Don't fake udata for kernel path
IB/mlx5: Add ODP WQE handlers for kernel QPs
IB/core: Add interface to advise_mr for kernel users
IB/core: Introduce ib_reg_user_mr
IB: Allow calls to ib_umem_get from kernel ULPs

Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-21 21:55:04 +0800
ad063075d Merge tag 'rds-odp-for-5.5' of https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created, and such MRs were not possible to create prior to
this series.

The first part of this patchset extends RDMA to have a special verb,
ib_reg_user_mr(). The common use case for this function is a
userspace application that allocates memory for HCA access while the
responsibility to register the memory at the HCA is on a kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is an
integral part of ODP flows and is used to trigger page faults in advance
to prepare memory before running the working set.

The third part is the actual user of those in-kernel APIs.
====================

Signed-off-by: David S. Miller

David S. Miller
2020-01-21 17:22:51 +0800
04060db41 scsi: RDMA/isert: Fix a recently introduced regression related to logout

iscsit_close_connection() calls isert_wait_conn(). Due to commit
e9d3009cb936 both functions call target_wait_for_sess_cmds() although that
last function should be called only once. Fix this by removing the
target_wait_for_sess_cmds() call from isert_wait_conn() and by only calling
isert_wait_conn() after target_wait_for_sess_cmds().

Fixes: e9d3009cb936 ("scsi: target: iscsi: Wait for all commands to finish before freeing a session").
Link: https://lore.kernel.org/r/20200116044737.19507-1-bvanassche@acm.org
Reported-by: Rahul Kundu
Signed-off-by: Bart Van Assche
Tested-by: Mike Marciniszyn
Acked-by: Sagi Grimberg
Signed-off-by: Martin K. Petersen

Bart Van Assche
2020-01-21 13:24:46 +0800

20 Jan, 2020

2 commits

cb6c82df6 Merge tag 'v5.5-rc7' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar

Ingo Molnar
2020-01-20 15:43:44 +0800
b3f7e3f23 Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net

David S. Miller
2020-01-20 05:10:04 +0800

17 Jan, 2020

10 commits

12e9e0d0d Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

This merge syncs with the latest mlx5-next HW bits and layout updates for
upcoming features, plus one patch that improves the
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
net/mlx5e: Add discard counters per priority
net/mlx5e: Expose FEC fields and related capability bit
net/mlx5: Add mlx5_ifc definitions for connection tracking support
net/mlx5: Add copy header action struct layout
net/mlx5: Expose resource dump register mapping
net/mlx5: Add structures and defines for MIRC register
net/mlx5: Read MCAM register groups 1 and 2
net/mlx5: Add structures layout for new MCAM access reg groups
net/mlx5: Expose vDPA emulation device capabilities
net/mlx5: Add Virtio Emulation related device capabilities

Signed-off-by: Saeed Mahameed

Saeed Mahameed
2020-01-17 07:48:24 +0800
61dc7b014 net/mlx5: Refactor mlx5_create_auto_grouped_flow_table

Refactor mlx5_create_auto_grouped_flow_table() to use the ft_attr param,
which already carries the max_fte, prio and flags members and is
used the same way in the similar mlx5_create_flow_table() function.

Signed-off-by: Paul Blakey
Reviewed-by: Roi Dayan
Reviewed-by: Oz Shlomo
Reviewed-by: Mark Bloch
Signed-off-by: Saeed Mahameed

Paul Blakey
2020-01-17 07:41:59 +0800
eaad647e5 IB/mlx4: Fix memory leak in add_gid error flow

In procedure mlx4_ib_add_gid(), if the driver is unable to update the FW
gid table, there is a memory leak in the driver's copy of the gid table:
the gid entry's context buffer is not freed.

If such an error occurs, free the entry's context buffer, and mark the
entry as available (by setting its context pointer to NULL).
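
The error-path pattern described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the driver's code:
struct gid_entry_sketch, add_gid_sketch() and the two fw_update_* stubs are
hypothetical names, and malloc()/free() stand in for the kernel allocators.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one slot in the driver's copy of the GID table. */
struct gid_entry_sketch {
    void *ctx;   /* NULL means the slot is available */
};

/* Hypothetical firmware-update hook: 0 on success, nonzero on failure. */
typedef int (*fw_update_fn)(const struct gid_entry_sketch *entry);

/* Stubs that simulate a successful and a failing firmware update. */
int fw_update_ok(const struct gid_entry_sketch *entry)   { (void)entry; return 0; }
int fw_update_fail(const struct gid_entry_sketch *entry) { (void)entry; return -1; }

/*
 * Claim a slot by allocating its context buffer, then try to update the
 * firmware table. If the update fails, free the buffer and set the pointer
 * back to NULL so the slot is available again and nothing is leaked.
 */
int add_gid_sketch(struct gid_entry_sketch *entry, size_t ctx_size,
                   fw_update_fn update_fw)
{
    int ret;

    entry->ctx = malloc(ctx_size);
    if (!entry->ctx)
        return -1;

    ret = update_fw(entry);
    if (ret) {
        free(entry->ctx);
        entry->ctx = NULL;
    }
    return ret;
}
```

Calling add_gid_sketch() with fw_update_fail leaves entry->ctx set to NULL,
which is the "available" marking the commit message refers to.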

Fixes: e26be1bfef81 ("IB/mlx4: Implement ib_device callbacks")
Link: https://lore.kernel.org/r/20200115085050.73746-1-leon@kernel.org
Signed-off-by: Jack Morgenstein
Reviewed-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Jack Morgenstein
2020-01-17 04:13:22 +0800
d7fab9163 IB/mlx5: Expose RoCE accelerator counters

Introduce the following RoCE accelerator counters:
* roce_adp_retrans - number of adaptive retransmissions for RoCE traffic.
* roce_adp_retrans_to - number of times RoCE traffic timed out
due to adaptive retransmission.
* roce_slow_restart - number of times RoCE slow restart was used.
* roce_slow_restart_cnps - number of times RoCE slow restart
generated CNP packets.
* roce_slow_restart_trans - number of times RoCE slow restart changed
state to slow restart.

Link: https://lore.kernel.org/r/20200115145459.83280-3-leon@kernel.org
Signed-off-by: Avihai Horon
Reviewed-by: Maor Gottlieb
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe

Avihai Horon
2020-01-17 04:11:52 +0800
d6de0bb18 RDMA/mlx5: Set relaxed ordering when requested

Enable relaxed ordering in the mkey context when requested. As relaxed
ordering is not currently supported in UMR, disable UMR usage for relaxed
ordering MRs.

Link: https://lore.kernel.org/r/1578506740-22188-11-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Michael Guralnik
2020-01-17 03:55:47 +0800
811646998 RDMA/core: Add the core support field to METHOD_GET_CONTEXT

Add the core support field to METHOD_GET_CONTEXT; this field should
represent capabilities that are not device-specific.

Return support for optional access flags for memory regions. User-space
will use this capability to mask the optional access flags on kernels
that do not support them.

Link: https://lore.kernel.org/r/1578506740-22188-10-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Michael Guralnik
2020-01-17 03:55:46 +0800
86dd738cf RDMA/efa: Allow passing of optional access flags for MR registration

As part of adding a range of optional access flags that drivers need to be
able to accept, mask this range inside the efa driver. This will prevent the
driver from failing when an access flag from that range is passed.
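
The masking described above can be sketched as a small userspace C check.
This is a hedged illustration rather than the efa code: the function name,
the supported-flags mask, and the assumption that the optional range covers
bits 20 through 29 are all hypothetical choices made for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption for illustration: optional access flags occupy bits 20..29. */
#define SKETCH_ACCESS_OPTIONAL_RANGE UINT32_C(0x3FF00000)

/* Hypothetical set of mandatory access flags this sketch driver implements. */
#define SKETCH_SUPPORTED_FLAGS UINT32_C(0x0000000F)

/*
 * Validate MR access flags. Bits in the optional range are cleared first,
 * so an optional flag the driver does not implement never causes a failure;
 * any other unsupported bit is still rejected.
 */
int sketch_check_mr_access(uint32_t access_flags)
{
    access_flags &= ~SKETCH_ACCESS_OPTIONAL_RANGE;

    if (access_flags & ~SKETCH_SUPPORTED_FLAGS)
        return -EOPNOTSUPP;

    return 0;
}
```

With this ordering, a request that sets a bit inside the optional range is
accepted, while an unknown bit outside that range still fails.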

Link: https://lore.kernel.org/r/1578506740-22188-8-git-send-email-yishaih@mellanox.com
Signed-off-by: Michael Guralnik
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Michael Guralnik
2020-01-17 03:55:46 +0800
a1123418b RDMA/uverbs: Add ioctl command to get a device context

Allow future extensions of the get context command through the uverbs
ioctl kabi.

Unlike the uverbs version, this does not also return an async_fd; that
has to be done with another command.

Link: https://lore.kernel.org/r/1578506740-22188-5-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-17 03:55:45 +0800
da57db256 RDMA/core: Remove ucontext_lock from the uverbs_destry_ufile_hw() path

This lock only serializes ucontext creation. Instead of checking the
ucontext_lock during destruction, hold the existing hw_destroy_rwsem during
creation, which is the standard pattern for object creation.

The simplification of locking is needed for the next patch.

Link: https://lore.kernel.org/r/1578506740-22188-4-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-17 03:55:45 +0800
d680e88e2 RDMA/core: Add UVERBS_METHOD_ASYNC_EVENT_ALLOC

Allow the async FD to be allocated separately from the context.

This is necessary to introduce the ioctl to create a context, as an ioctl
should only ever create a single uobject at a time.

If multiple async FDs are created then the first one is used to deliver
affiliated events from any ib_uevent_object, while all subsequent ones
receive only unaffiliated events.
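
The delivery rule above can be modelled with a short userspace C sketch. All
names here (sketch_ufile, sketch_async_file, sketch_async_alloc,
sketch_dispatch) are hypothetical, events are reduced to counters, and the
sketch assumes the first async FD also receives unaffiliated events, which is
one reading of "only" in the sentence above.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ASYNC_FDS 4

/* Hypothetical async event file: just counts what it was handed. */
struct sketch_async_file {
    int affiliated;
    int unaffiliated;
};

/* Hypothetical per-context state: async files in creation order. */
struct sketch_ufile {
    struct sketch_async_file *files[SKETCH_MAX_ASYNC_FDS];
    size_t count;
};

/* Register one more async file; fails once the table is full. */
bool sketch_async_alloc(struct sketch_ufile *uf, struct sketch_async_file *f)
{
    if (uf->count >= SKETCH_MAX_ASYNC_FDS)
        return false;
    uf->files[uf->count++] = f;
    return true;
}

/*
 * Deliver one event. Affiliated events go only to the first async file;
 * unaffiliated events go to every registered async file.
 */
void sketch_dispatch(struct sketch_ufile *uf, bool affiliated)
{
    size_t i;

    if (affiliated) {
        if (uf->count > 0)
            uf->files[0]->affiliated++;
        return;
    }
    for (i = 0; i < uf->count; i++)
        uf->files[i]->unaffiliated++;
}
```

After registering two files and dispatching one event of each kind, only the
first file has counted an affiliated event, while both have counted the
unaffiliated one.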

Link: https://lore.kernel.org/r/1578506740-22188-3-git-send-email-yishaih@mellanox.com
Signed-off-by: Yishai Hadas
Signed-off-by: Jason Gunthorpe

Jason Gunthorpe
2020-01-17 03:55:45 +0800