11 Sep, 2020

2 commits

  • For the calls linked to mlx4_ib_umem_calc_optimal_mtt_size(), use
    ib_umem_num_dma_blocks() inside the function; it is just some weird
    static default.

    All other places are just using it with PAGE_SIZE, switch to
    ib_umem_num_dma_blocks().

    As this is the last call site, remove ib_umem_num_count().

    Link: https://lore.kernel.org/r/15-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
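
    A rough sketch of what these PAGE_SIZE conversions look like in a driver
    (illustrative pattern only, not the exact hunks from this patch):

        /* Before: counting CPU pages in the SGL */
        n = ib_umem_num_pages(umem);

        /* After: counting PAGE_SIZE-sized DMA blocks, which is what the
         * device mapping code actually consumes.
         */
        n = ib_umem_num_dma_blocks(umem, PAGE_SIZE);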
     
  • ib_umem_num_pages() should only be used by things working with the SGL in
    CPU pages directly.

    Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
    which returns the number of blocks rdma_umem_for_each_block() will return.

    Making this general for DMA drivers requires a different implementation.
    Computing DMA block count based on umem->address only works if the
    requested page size is < PAGE_SIZE and/or the IOVA == umem->address.

    Instead the number of DMA pages should be computed in the IOVA address
    space, not umem->address. Thus the IOVA has to be stored inside the umem
    so it can be used for these calculations.

    For now set it to umem->address by default and fix it up if
    ib_umem_find_best_pgsz() was called. This allows drivers to be converted
    to ib_umem_num_dma_blocks() safely.

    Link: https://lore.kernel.org/r/6-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
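
    A minimal sketch of the idea, assuming the umem now carries an iova field
    next to address and length; the block count is taken over the aligned IOVA
    range rather than over umem->address:

        static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
                                                    unsigned long pgsz)
        {
                /* round the IOVA range out to pgsz and count whole blocks */
                return (size_t)(ALIGN(umem->iova + umem->length, pgsz) -
                                ALIGN_DOWN(umem->iova, pgsz)) / pgsz;
        }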
     

31 Aug, 2020

1 commit

  • The original function returns unsigned long and 0 on failure; fix the
    compiled-out stub of ib_umem_find_best_pgsz() to match.

    Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
    Link: https://lore.kernel.org/r/0-v1-982a13cc5c6d+501ae-fix_best_pgsz_stub_jgg@nvidia.com
    Reviewed-by: Gal Pressman
    Acked-by: Shiraz Saleem
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
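
    The compiled-out stub then presumably ends up along these lines (a sketch
    matching the described return type and failure value; parameter names are
    assumptions):

        static inline unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
                                                            unsigned long pgsz_bitmap,
                                                            unsigned long virt)
        {
                return 0;       /* 0 = no supported page size found */
        }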
     

30 Jul, 2020

1 commit

  • The header files in the RDMA subsystem are dual licensed and can be
    described by a simple SPDX tag, so replace all of them at once and,
    while at it, make them use the same coding style for header guard
    defines.

    Link: https://lore.kernel.org/r/20200719072521.135260-1-leon@kernel.org
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Jason Gunthorpe

    Leon Romanovsky
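
    For illustration, the resulting boilerplate looks roughly like this, using
    the common RDMA dual-license expression and the ib_umem.h guard as an
    example (the exact tag per file is whatever its old license text mapped to):

        /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
        #ifndef IB_UMEM_H
        #define IB_UMEM_H

        /* ... declarations ... */

        #endif /* IB_UMEM_H */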
     

16 Jan, 2020

1 commit

  • So far the assumption was that ib_umem_get() and ib_umem_odp_get()
    are called from flows that start in UVERBS and therefore have a user
    context. This assumption restricts flows that are initiated by ULPs
    and need the service that ib_umem_get() provides.

    This patch changes ib_umem_get() and ib_umem_odp_get() to take the IB
    device directly, relying on the fact that both UVERBS and ULPs set that
    field correctly.

    Reviewed-by: Guy Levi
    Signed-off-by: Moni Shoua
    Signed-off-by: Leon Romanovsky

    Moni Shoua
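
    The caller-side effect is roughly the following (a sketch; argument names
    are illustrative):

        /* Before: only callable where a uverbs udata/user context exists */
        umem = ib_umem_get(udata, start, length, access_flags);

        /* After: kernel ULPs can call it too, passing the ib_device directly */
        umem = ib_umem_get(ibdev, start, length, access_flags);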
     

22 Aug, 2019

1 commit

  • At this point the ucontext is only being stored to access the ib_device,
    so just store the ib_device directly instead. This is more natural and
    logical as the umem has nothing to do with the ucontext.

    Link: https://lore.kernel.org/r/20190806231548.25242-8-jgg@ziepe.ca
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

22 May, 2019

1 commit

  • This value has always been set to PAGE_SHIFT in the core code; the only
    place that did things differently was the ODP path. Move the value into the ODP
    struct and still use it for ODP, but change all the non-ODP things to just
    use PAGE_SHIFT/PAGE_SIZE/PAGE_MASK directly.

    Reviewed-by: Shiraz Saleem
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Leon Romanovsky

    Jason Gunthorpe
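
    Concretely, the non-ODP conversions amount to something like this
    (illustrative pattern, not a literal hunk):

        /* Before: pretending the core SGL could use a per-umem shift */
        npages = sg_dma_len(sg) >> umem->page_shift;

        /* After: the core SGL is always built from CPU pages */
        npages = sg_dma_len(sg) >> PAGE_SHIFT;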
     

09 Apr, 2019

1 commit

  • Combine contiguous regions of PAGE_SIZE pages into single scatter list
    entry while building the scatter table for a umem. This minimizes the
    number of the entries in the scatter list and reduces the DMA mapping
    overhead, particularly with the IOMMU.

    Set default max_seg_size in core for IB devices to 2G and do not combine
    if we exceed this limit.

    Also, purge npages from struct ib_umem: since we now DMA map the umem SGL
    with sg_nents, the npages computation is not needed. Drivers should now be
    using ib_umem_num_pages(), so fix the last stragglers.

    Move npages tracking to ib_umem_odp as ODP drivers still need it.

    Suggested-by: Jason Gunthorpe
    Reviewed-by: Michael J. Ruhl
    Reviewed-by: Ira Weiny
    Acked-by: Adit Ranadive
    Signed-off-by: Shiraz Saleem
    Tested-by: Gal Pressman
    Tested-by: Selvin Xavier
    Signed-off-by: Jason Gunthorpe

    Shiraz Saleem
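
    In spirit the coalescing works like the sketch below: contiguous pages
    returned by get_user_pages are merged into a single scatterlist entry as
    long as the segment stays within the device's max_seg_size. This is a
    simplification under assumed variable names, not the helper the patch
    actually adds:

        struct scatterlist *sg = sgl;
        unsigned long run_pfn = page_to_pfn(pages[0]);
        unsigned long run_len = 1;
        unsigned long i;

        for (i = 1; i < npages; i++) {
                bool contig = page_to_pfn(pages[i]) == run_pfn + run_len;

                if (contig &&
                    (run_len + 1) * PAGE_SIZE <= dma_get_max_seg_size(dev)) {
                        run_len++;              /* extend the current segment */
                        continue;
                }
                sg_set_page(sg, pfn_to_page(run_pfn), run_len * PAGE_SIZE, 0);
                sg = sg_next(sg);
                run_pfn = page_to_pfn(pages[i]);
                run_len = 1;
        }
        sg_set_page(sg, pfn_to_page(run_pfn), run_len * PAGE_SIZE, 0);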
     

21 Sep, 2018

2 commits

  • This no longer has any use, we can use container_of to get to the
    umem_odp, and a simple flag to indicate if this is an odp MR. Remove the
    few remaining references to it.

    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Jason Gunthorpe
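
    The container_of accessor referred to is presumably along these lines
    (assuming the ib_umem is embedded in ib_umem_odp as a field named umem):

        static inline struct ib_umem_odp *to_ib_umem_odp(struct ib_umem *umem)
        {
                return container_of(umem, struct ib_umem_odp, umem);
        }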
     
  • This is just wrong, the process that calls into the reg_mr is the process
    associated with the umem, and that does not have to be the same process
    that created the context.

    When this code was first written mmgrab() didn't exist, however these days
    we can just directly hold the mm_struct pointer in the umem and have no
    ambiguity when it comes to releasing the umem as to which mm it was
    associated with.

    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Jason Gunthorpe
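
    A minimal sketch of the ownership scheme, assuming the pointer lives in a
    field called owning_mm (the field name is illustrative):

        /* at registration, in the context of the process doing reg_mr */
        umem->owning_mm = current->mm;
        mmgrab(umem->owning_mm);        /* keep the mm_struct itself alive */

        /* at release, from whichever process ends up freeing the umem */
        mmdrop(umem->owning_mm);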
     

16 May, 2018

1 commit

  • User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.

    If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
    exited, get_pid_task will return NULL and ib_umem_release will not
    decrease mm->pinned_vm.

    Instead of using threads to locate the mm, use the overall tgid from the
    ib_ucontext struct. This matches the behavior of ODP and disassociate in
    handling the mm of the process that called ibv_reg_mr.

    Cc:
    Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
    Signed-off-by: Lidong Chen
    Signed-off-by: Jason Gunthorpe

    Lidong Chen
     

26 Apr, 2017

2 commits

  • Currently ODP supports only regular MMU pages.
    Add ODP support for regions consisting of physically contiguous chunks
    of arbitrary order (huge pages for instance) to improve performance.

    Signed-off-by: Artemy Kovalyov
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Artemy Kovalyov
     
  • The page size is held by struct ib_umem in the page_size field.

    It is better to store it as an exponent, because a page size is by nature
    always a power of two and is used as a factor, divisor or ilog2's argument.

    Converting page_size into page_shift makes the code portable and avoids
    the following error while compiling on ARM:

    ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined!

    CC: Selvin Xavier
    CC: Steve Wise
    CC: Lijun Ou
    CC: Shiraz Saleem
    CC: Adit Ranadive
    CC: Dennis Dalessandro
    CC: Ram Amrani
    Signed-off-by: Artemy Kovalyov
    Signed-off-by: Leon Romanovsky
    Acked-by: Ram Amrani
    Acked-by: Shiraz Saleem
    Acked-by: Selvin Xavier
    Acked-by: Adit Ranadive
    Signed-off-by: Doug Ledford

    Artemy Kovalyov
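
    The point about the exponent, in before/after form (illustrative; a 64-bit
    divide needs a libgcc helper on 32-bit ARM, a shift does not):

        /* Before: 64-bit division, pulls in __aeabi_uldivmod on 32-bit ARM */
        npages = umem->length / umem->page_size;

        /* After: plain shift; the size itself is BIT(umem->page_shift) */
        npages = umem->length >> umem->page_shift;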
     

16 Dec, 2014

4 commits

  • * Extend the umem struct to keep the ODP related data.
    * Allocate and initialize the ODP related information in the umem
    (page_list, dma_list) and free it as needed at the end of the run.
    * Store a reference to the process PID struct in the ucontext. Used to
    safely obtain the task_struct and the mm during fault handling,
    without preventing the task destruction if needed.
    * Add 2 helper functions: ib_umem_odp_map_dma_pages and
    ib_umem_odp_unmap_dma_pages. These functions get the DMA addresses
    of specific pages of the umem (and, currently, pin them).
    * Support for page faults only - IB core will keep the reference on
    the pages used and call put_page when freeing an ODP umem
    area. Invalidations support will be added in a later patch.

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Shachar Raindel
    Signed-off-by: Haggai Eran
    Signed-off-by: Majd Dibbiny
    Signed-off-by: Roland Dreier

    Shachar Raindel
     
  • Add a helper function mlx5_ib_read_user_wqe to read information from
    user-space owned work queues. The function will be used in a later
    patch by the page-fault handling code in mlx5_ib.

    Signed-off-by: Haggai Eran

    [ Add stub for ib_umem_copy_from() for CONFIG_INFINIBAND_USER_MEM=n
    - Roland ]

    Signed-off-by: Roland Dreier

    Haggai Eran
     
  • In some drivers there's a need to read data from a user space area
    that was pinned using ib_umem when running from a different process
    context.

    The ib_umem_copy_from function allows reading data from the physical
    pages pinned in the ib_umem struct.

    Signed-off-by: Haggai Eran
    Signed-off-by: Roland Dreier

    Haggai Eran
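
    Usage is along these lines (a sketch; the destination buffer must be large
    enough and offset/length must stay within the umem):

        char buf[64];
        size_t offset = 0;
        int ret;

        /* read sizeof(buf) bytes starting 'offset' bytes into the pinned area */
        ret = ib_umem_copy_from(buf, umem, offset, sizeof(buf));
        if (ret)
                pr_err("umem copy failed: %d\n", ret);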
     
  • In order to allow umems that do not pin memory, we need the umem to
    keep track of its region's address.

    This makes the offset field redundant, and so this patch removes it.

    Signed-off-by: Haggai Eran
    Signed-off-by: Roland Dreier

    Haggai Eran
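
    With the full address stored, the old offset is trivially recomputed; a
    sketch of the kind of helper that replaces the field (shown with the CPU
    page size for simplicity):

        /* offset of the start of the region within its first page */
        static inline int ib_umem_offset(struct ib_umem *umem)
        {
                return umem->address & (PAGE_SIZE - 1);
        }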
     

20 Sep, 2014

1 commit

  • In debugging an application that receives -ENOMEM from ib_reg_mr(), I
    found that ib_umem_get() can fail because the pinned_vm count has
    wrapped causing it to always be larger than the lock limit even with
    RLIMIT_MEMLOCK set to RLIM_INFINITY.

    The wrapping of pinned_vm occurs because the process that calls
    ib_reg_mr() will have its mm->pinned_vm count incremented. Later a
    different process with a different mm_struct than the one that
    allocated the ib_umem struct ends up releasing it, which results in
    decrementing the new process's mm->pinned_vm count past zero and
    wrapping.

    I'm not entirely sure what circumstances cause a different process to
    release the ib_umem than the one that allocated it but the kernel
    stack trace of the freeing process from my situation looks like the
    following:

    Call Trace:
    [] dump_stack+0x19/0x1b
    [] ib_umem_release+0x1f5/0x200 [ib_core]
    [] mlx4_ib_destroy_qp+0x241/0x440 [mlx4_ib]
    [] ib_destroy_qp+0x12c/0x170 [ib_core]
    [] ib_uverbs_close+0x259/0x4e0 [ib_uverbs]
    [] __fput+0xba/0x240
    [] ____fput+0xe/0x10
    [] task_work_run+0xc4/0xe0
    [] do_notify_resume+0x95/0xa0
    [] int_signal+0x12/0x17

    The following patch fixes the issue by storing the pid struct of the
    process that calls ib_umem_get() so that ib_umem_release and/or
    ib_umem_account() can properly decrement the pinned_vm count of the
    correct mm_struct.

    Signed-off-by: Shawn Bohrer
    Reviewed-by: Shachar Raindel
    Signed-off-by: Roland Dreier

    Shawn Bohrer
     

05 Mar, 2014

1 commit

  • This patch refactors the IB core umem code and vendor drivers to use a
    linear (chained) SG table instead of a chunk list. With this change the
    relevant code becomes clearer: there is no need for nested loops to build
    and use the umem.

    Signed-off-by: Shachar Raindel
    Signed-off-by: Yishai Hadas
    Signed-off-by: Roland Dreier

    Yishai Hadas
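
    After the refactor, walking the mapping is one flat loop; a sketch of the
    consumer side, assuming sg_head/nmap fields on the umem:

        struct scatterlist *sg;
        unsigned int i;

        /* one walk over the DMA-mapped entries, no per-chunk inner loop */
        for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i) {
                dma_addr_t addr = sg_dma_address(sg);
                unsigned int len = sg_dma_len(sg);

                /* program addr/len into the device's translation table */
        }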
     

29 Apr, 2008

1 commit

  • Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1
    when mapping user-allocated CQs with ib_umem_get().

    Signed-off-by: Arthur Kepner
    Cc: Tony Luck
    Cc: Jesse Barnes
    Cc: Jes Sorensen
    Cc: Randy Dunlap
    Cc: Roland Dreier
    Cc: James Bottomley
    Cc: David Miller
    Cc: Benjamin Herrenschmidt
    Cc: Grant Grundler
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arthur Kepner
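
    The prototype of that era then presumably reads as follows, with dmasync as
    the new trailing flag (a sketch; the CQ registration line is illustrative):

        struct ib_umem *ib_umem_get(struct ib_ucontext *context,
                                    unsigned long addr, size_t size,
                                    int access, int dmasync);

        /* e.g. when mapping a user-allocated CQ buffer */
        cq->umem = ib_umem_get(context, buf_addr, buf_size,
                               IB_ACCESS_LOCAL_WRITE, 1);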
     

22 May, 2007

1 commit

  • The first thing mm.h does is include sched.h, solely for the can_do_mlock()
    inline function, which dereferences "current" inside. By dealing with
    can_do_mlock(), mm.h can be detached from sched.h, which is good. See
    below for why.

    This patch
    a) removes unconditional inclusion of sched.h from mm.h
    b) makes can_do_mlock() a normal function in mm/mlock.c
    c) exports can_do_mlock() to not break compilation
    d) adds sched.h inclusions back to files that were getting it indirectly.
    e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
    getting them indirectly

    Net result is:
    a) mm.h users would get less code to open, read, preprocess, parse, ... if
    they don't need sched.h
    b) sched.h stops being dependency for significant number of files:
    on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
    after patch it's only 3744 (-8.3%).

    Cross-compile tested on

    all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
    alpha alpha-up
    arm
    i386 i386-up i386-defconfig i386-allnoconfig
    ia64 ia64-up
    m68k
    mips
    parisc parisc-up
    powerpc powerpc-up
    s390 s390-up
    sparc sparc-up
    sparc64 sparc64-up
    um-x86_64
    x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

    as well as my two usual configs.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 May, 2007

2 commits

  • When memory pinned with ib_umem_get() is released, ib_umem_release()
    needs to subtract the amount of memory being unpinned from
    mm->locked_vm. However, ib_umem_release() may be called with
    mm->mmap_sem already held for writing if the memory is being released
    as part of an munmap() call, so it is sometimes necessary to defer
    this accounting into a workqueue.

    However, the work struct used to defer this accounting is dynamically
    allocated before it is queued, so there is the possibility of failing
    that allocation. If the allocation fails, then ib_umem_release has no
    choice except to bail out and leave the process with a permanently
    elevated locked_vm.

    Fix this by allocating the structure to defer accounting as part of
    the original struct ib_umem, so there's no possibility of failing a
    later allocation if creating the struct ib_umem and pinning memory
    succeeds.

    Signed-off-by: Roland Dreier

    Roland Dreier
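
    The fix amounts to embedding the deferred-accounting state in the umem
    itself; a sketch of the shape, with field and function names as they
    plausibly were at the time:

        struct ib_umem {
                /* ... pinned-pages bookkeeping ... */
                struct work_struct      work;   /* pre-allocated: deferral cannot fail */
                struct mm_struct       *mm;     /* whose locked_vm to decrement */
                unsigned long           diff;   /* number of pages to subtract */
        };

        /* in ib_umem_release(), when mmap_sem may already be held for writing */
        umem->mm   = mm;                /* mm captured when the umem was created */
        umem->diff = npages;            /* illustrative: pages being unpinned */
        INIT_WORK(&umem->work, ib_umem_account);
        schedule_work(&umem->work);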
     
  • Export ib_umem_get()/ib_umem_release() and put low-level drivers in
    control of when to call ib_umem_get() to pin and DMA map userspace,
    rather than always calling it in ib_uverbs_reg_mr() before calling the
    low-level driver's reg_user_mr method.

    Also move these functions to be in the ib_core module instead of
    ib_uverbs, so that driver modules using them do not depend on
    ib_uverbs.

    This has a number of advantages:
    - It is better design from the standpoint of making generic code a
    library that can be used or overridden by device-specific code as
    the details of specific devices dictate.
    - Drivers that do not need to pin userspace memory regions do not
    need to take the performance hit of calling ib_umem_get(). For
    example, although I have not tried to implement it in this patch,
    the ipath driver should be able to avoid pinning memory and just
    use copy_{to,from}_user() to access userspace memory regions.
    - Buffers that need special mapping treatment can be identified by
    the low-level driver. For example, it may be possible to solve
    some Altix-specific memory ordering issues with mthca CQs in
    userspace by mapping CQ buffers with extra flags.
    - Drivers that need to pin and DMA map userspace memory for things
    other than memory regions can use ib_umem_get() directly, instead
    of hacks using extra parameters to their reg_phys_mr method. For
    example, the mlx4 driver that is pending being merged needs to pin
    and DMA map QP and CQ buffers, but it does not need to create a
    memory key for these buffers. So the cleanest solution is for mlx4
    to call ib_umem_get() in the create_qp and create_cq methods.

    Signed-off-by: Roland Dreier

    Roland Dreier
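
    The resulting driver-side pattern looks roughly like the sketch below: the
    low-level driver's reg_user_mr (or create_cq/create_qp) pins the memory
    itself, when and if it needs to. The foo_* names are hypothetical and the
    ib_umem_get() signature is the one from before the later dmasync addition:

        struct foo_mr {
                struct ib_mr    ibmr;
                struct ib_umem *umem;
        };

        static struct ib_mr *foo_reg_user_mr(struct ib_pd *pd, u64 start,
                                             u64 length, u64 virt_addr,
                                             int access_flags,
                                             struct ib_udata *udata)
        {
                struct foo_mr *mr = kzalloc(sizeof(*mr), GFP_KERNEL);

                if (!mr)
                        return ERR_PTR(-ENOMEM);

                /* the driver, not ib_uverbs, decides to pin and DMA map here */
                mr->umem = ib_umem_get(pd->uobject->context, start, length,
                                       access_flags);
                if (IS_ERR(mr->umem)) {
                        int err = PTR_ERR(mr->umem);

                        kfree(mr);
                        return ERR_PTR(err);
                }

                /* ... build the HW translation table from mr->umem ... */
                return &mr->ibmr;
        }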