25 Dec, 2016
1 commit
-
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro
Signed-off-by: Linus Torvalds
23 Dec, 2016
1 commit
-
Code that dereferences the struct net_device ip_ptr member must be
protected with an in_dev_get() / in_dev_put() pair. Hence insert
calls to these functions.
Fixes: commit 7b85627b9f02 ("IB/cma: IBoE (RoCE) IP-based GID addressing")
Signed-off-by: Bart Van Assche
Reviewed-by: Moni Shoua
Cc: Or Gerlitz
Cc: Roland Dreier
Signed-off-by: Doug Ledford
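The get/put pairing described in this commit can be modelled in userspace C as a rough sketch. Every name below (mock_in_device, mock_net_device, mock_in_dev_get, mock_in_dev_put, count_addrs) is a hypothetical stand-in, not the kernel definition; the real helpers also rely on RCU and an atomic reference count, which this model leaves out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_in_device {
	int refcnt;     /* references held on the IPv4 configuration */
	int ifa_count;  /* illustrative payload: number of addresses */
};

struct mock_net_device {
	struct mock_in_device *ip_ptr;
};

/* Models in_dev_get(): take a reference before touching ip_ptr. */
static struct mock_in_device *mock_in_dev_get(struct mock_net_device *dev)
{
	struct mock_in_device *in_dev = dev->ip_ptr;

	if (in_dev)
		in_dev->refcnt++;
	return in_dev;
}

/* Models in_dev_put(): drop the reference taken by the matching get. */
static void mock_in_dev_put(struct mock_in_device *in_dev)
{
	in_dev->refcnt--;
}

/* Returns the address count, or -1 if the device has no IPv4 config. */
static int count_addrs(struct mock_net_device *dev)
{
	struct mock_in_device *in_dev = mock_in_dev_get(dev);
	int n;

	if (!in_dev)
		return -1;
	n = in_dev->ifa_count;   /* dereference only while the reference is held */
	mock_in_dev_put(in_dev);
	return n;
}
```

The point of the pattern is that every successful get is balanced by exactly one put on every exit path.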
15 Dec, 2016
6 commits
-
rdma_consumer_reject_data() will return the private data pointer
and length if any is available.
Reviewed-by: Sagi Grimberg
Reviewed-by: Christoph Hellwig
Signed-off-by: Steve Wise
Signed-off-by: Doug Ledford
-
Return true if the peer consumer application rejected the
connection attempt.
Reviewed-by: Sagi Grimberg
Reviewed-by: Christoph Hellwig
Signed-off-by: Steve Wise
Reviewed-by: Bart Van Assche
Signed-off-by: Doug Ledford
-
rdma_reject_msg() returns a pointer to a string message associated with
the transport reject reason codes.
Reviewed-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Signed-off-by: Steve Wise
Reviewed-by: Bart Van Assche
Signed-off-by: Doug Ledford
-
and rename class version define to indicate SM rather than SMP or SMI
Signed-off-by: Hal Rosenstock
Reviewed-by: Ira Weiny
Signed-off-by: Doug Ledford
14 Dec, 2016
6 commits
-
Add a new member, rate_limit, to ib_qp_attr. It holds the packet pacing rate
in kbps; 0 means unlimited.
IB_QP_RATE_LIMIT is added to ib_attr_mask and can be used by RAW
QPs when changing the QP state from RTR to RTS or from RTS to RTS.
Signed-off-by: Bodong Wang
Reviewed-by: Matan Barak
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
Add struct ib_udata to the signature of create_ah callback that is
implemented by IB device drivers. This allows HW drivers to return extra
data to the userspace library.
This patch prepares the ground for mlx5 driver to resolve destination
mac address for a given GID and return it to userspace.
This patch was previously submitted by Knut Omang as a part of the
patch set to support Oracle's Infiniband HCA (SIF).
Signed-off-by: Knut Omang
Signed-off-by: Moni Shoua
Reviewed-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
The function ib_resolve_eth_dmac() requires struct qp_attr * and
qp_attr_mask as parameters while the function might be useful to resolve
dmac for address handles. This patch changes the signature of the
function so it can be used in the flow of creating an address handle.
Signed-off-by: Moni Shoua
Reviewed-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
For a tunneled packet which contains external and internal headers,
we refer to the external headers as "outer fields" and the internal
headers as "inner fields".
Example of a tunneled packet:
{ L2 | L3 | L4 | tunnel header | L2 | L3 | L4 | data }
  |    |    |          |         |    |    |
{ outer fields                }{ inner fields        }
This patch introduces a new flag for flow steering rules
- IB_FLOW_SPEC_INNER - which specifies that the rule applies
to the inner fields, rather than to the outer fields of the packet.
Signed-off-by: Moses Reuben
Reviewed-by: Maor Gottlieb
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
Aligned the structure ib_flow_spec_type indentation,
after adding a new definition.
Signed-off-by: Moses Reuben
Reviewed-by: Maor Gottlieb
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
To support tunneling that can be used by the QP,
both struct ib_flow_spec_tunnel and struct ib_flow_tunnel_filter can be
used for more IP- or UDP-based tunneling protocols (e.g. NVGRE, GRE, etc.).
The IB_FLOW_SPEC_VXLAN_TUNNEL flow specification type is added to use this
functionality and match specific VXLAN packets.
As with IPv6, we check for overflow of the VNI value by
comparing it with the maximum size.
Signed-off-by: Moses Reuben
Reviewed-by: Maor Gottlieb
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
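The overflow check mentioned in this commit can be sketched in plain C. The sketch assumes the tunnel id arrives as a 32-bit value in host byte order and that the limit is the 24-bit VNI width defined by VXLAN; check_vni and VXLAN_VNI_BITS are hypothetical names, not the kernel's.

```c
#include <stdint.h>

#define VXLAN_VNI_BITS 24  /* VNI width; the name is illustrative */

/* Returns 0 if tunnel_id fits in a VNI, -1 if it overflows the field. */
static int check_vni(uint32_t tunnel_id)
{
	if (tunnel_id >= (UINT32_C(1) << VXLAN_VNI_BITS))
		return -1;
	return 0;
}
```

A caller would reject the flow specification when check_vni() fails, before any rule is installed.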
12 Dec, 2016
3 commits
-
Add rvt_div_round_up_mtu() and rvt_div_mtu() routines to
do the computation based on the pmtu and the log_pmtu.
Change divides in qib, hfi1 to use the new inlines.
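As a rough illustration of why carrying both pmtu and log_pmtu helps, the division by a power-of-two MTU can be done with a shift. The struct and function names below are hypothetical stand-ins, and the sketch assumes pmtu is always a power of two with log_pmtu equal to its base-2 logarithm.

```c
#include <stdint.h>

/* Hypothetical stand-in holding only the two fields the text names. */
struct mtu_info {
	uint32_t pmtu;      /* path MTU in bytes, assumed a power of two */
	uint32_t log_pmtu;  /* log2(pmtu), computed once */
};

/* Number of MTU-sized packets needed for len bytes, rounding up. */
static uint32_t div_round_up_mtu(const struct mtu_info *m, uint32_t len)
{
	return (len + m->pmtu - 1) >> m->log_pmtu;
}

/* Number of whole MTU-sized packets in len bytes, rounding down. */
static uint32_t div_mtu(const struct mtu_info *m, uint32_t len)
{
	return len >> m->log_pmtu;
}
```

The shift replaces an integer divide on a hot path, which is the motivation for keeping log_pmtu alongside pmtu.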
Reviewed-by: Kaike Wan
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
-
Add a helper to release mr references held by
an swqe.
Reviewed-by: Brian Welty
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
-
This is for use by client drivers to drive
send completions into a CQ.
A new exported table allows for the mapping
of ib_wr_opcode into an ib_wc_opcode.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
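A lookup table of the kind this commit describes might look like the following sketch. The enums are abbreviated, hypothetical versions of the work-request and work-completion opcode sets, and map_wr_to_wc is an illustrative helper, not an exported kernel symbol.

```c
/* Hypothetical, abbreviated opcode sets; the real enums have more entries. */
enum wr_opcode { WR_RDMA_WRITE, WR_SEND, WR_RDMA_READ, WR_OPCODE_COUNT };
enum wc_opcode { WC_SEND, WC_RDMA_WRITE, WC_RDMA_READ };

/* Table indexed by work-request opcode, yielding the completion opcode. */
static const enum wc_opcode wr_to_wc[WR_OPCODE_COUNT] = {
	[WR_RDMA_WRITE] = WC_RDMA_WRITE,
	[WR_SEND]       = WC_SEND,
	[WR_RDMA_READ]  = WC_RDMA_READ,
};

/* Returns 0 and stores the mapped opcode, or -1 if wr is out of range. */
static int map_wr_to_wc(int wr, enum wc_opcode *out)
{
	if (wr < 0 || wr >= WR_OPCODE_COUNT)
		return -1;
	*out = wr_to_wc[wr];
	return 0;
}
```

Keeping the mapping in one table lets every client driver fill in the completion opcode the same way.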
17 Nov, 2016
1 commit
-
When a MAD arrives at the hypervisor, we need to identify which slave it
should be sent to by its destination GID. When the L3 protocol is IPv4, the
GRH is replaced by an IPv4 header. This patch detects when an IPv4 header
needs to be parsed instead of a GRH.
Fixes: b6ffaeffaea4 ('mlx4: In RoCE allow guests to have multiple GIDS')
Signed-off-by: Moni Shoua
Signed-off-by: Daniel Jurgens
Reviewed-by: Mark Bloch
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
16 Nov, 2016
2 commits
-
Save a cacheline by having hot path calldowns together.
Reviewed-by: Sebastian Sanchez
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
-
Profiling shows that the key validation is susceptible
to cache line trading when accessing the lkey table.
Fix by separating out the read mostly fields from the write
fields. In addition, the shift amount, which is a function
of the lkey table size, is precomputed and stored with the
table pointer. Since both the shift and table pointer
are in the same read mostly cacheline, this saves a cache
line in this hot path.
Reviewed-by: Sebastian Sanchez
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
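The layout change described in this commit can be sketched as below. The struct, its field names and the 64-byte cache line size are assumptions made for illustration; the sketch also assumes the slot index is taken from the top bits of the key.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64  /* assumed cache line size */

struct lkey_table_sketch {
	/* Read-mostly fields, set at init time and only read afterwards. */
	void **table;     /* slot array */
	uint32_t shift;   /* precomputed from the table size */

	/* Frequently written fields start on their own cache line. */
	alignas(CACHELINE) uint32_t next;  /* next slot to try on allocation */
	uint32_t gen;                      /* generation counter */
};

/* Slot index comes from the top bits of the key, using the stored shift. */
static uint32_t lkey_index(const struct lkey_table_sketch *t, uint32_t lkey)
{
	return lkey >> t->shift;
}
```

Because table and shift share a line that is never written on the hot path, a lookup touches one cache line that no allocator write invalidates.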
10 Oct, 2016
1 commit
-
Pull main rdma updates from Doug Ledford:
"This is the main pull request for the rdma stack this release. The
code has been through 0day and I had it tagged for linux-next testing
for a couple days.
Summary:
- updates to mlx5
- updates to mlx4 (two conflicts, both minor and easily resolved)
- updates to iw_cxgb4 (one conflict, not so obvious to resolve,
proper resolution is to keep the code in cxgb4_main.c as it is in
Linus' tree as attach_uld was refactored and moved into
cxgb4_uld.c)
- improvements to uAPI (moved vendor specific API elements to uAPI
area)
- add hns-roce driver and hns and hns-roce ACPI reset support
- conversion of all rdma code away from deprecated
create_singlethread_workqueue
- security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
staging)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits)
staging/lustre: Disable InfiniBand support
iw_cxgb4: add fast-path for small REG_MR operations
cxgb4: advertise support for FR_NSMR_TPTE_WR
IB/core: correctly handle rdma_rw_init_mrs() failure
IB/srp: Fix infinite loop when FMR sg[0].offset != 0
IB/srp: Remove an unused argument
IB/core: Improve ib_map_mr_sg() documentation
IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets
IB/mthca: Move user vendor structures
IB/nes: Move user vendor structures
IB/ocrdma: Move user vendor structures
IB/mlx4: Move user vendor structures
IB/cxgb4: Move user vendor structures
IB/cxgb3: Move user vendor structures
IB/mlx5: Move and decouple user vendor structures
IB/{core,hw}: Add constant for node_desc
ipoib: Make ipoib_warn ratelimited
IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue
IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue
IB/ipoib: Remove deprecated create_singlethread_workqueue
...
08 Oct, 2016
5 commits
-
Signed-off-by: Yuval Shaia
Signed-off-by: Doug Ledford
-
Add the following fields to IPv6 flow filter specification:
1. Traffic Class
2. Flow Label
3. Next Header
4. Hop Limit
Signed-off-by: Maor Gottlieb
Reviewed-by: Sagi Grimberg
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
Add the following fields to IPv4 flow filter specification:
1. Type of Service
2. Time to Live
3. Flags
4. Protocol
Signed-off-by: Maor Gottlieb
Reviewed-by: Sagi Grimberg
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
Flow steering specification structures were implemented in an
extensible way that allows one to add new filters and new fields
to existing filters.
These specifications have never been extended, so the
kernel flow specification size and the user flow specification size
had to be equal.
In a downstream patch, the IPv4 flow specification type is extended to
support TOS and TTL fields.
To support an extension, we change the flow specification size
condition test as follows:
* If the user flow specification is bigger than the kernel
specification, we verify that all the bits which are not in the kernel
specification are zeros, and the flow is added only with the kernel
specification fields.
* Otherwise, we add the flow rule only with the user specification fields.
User space filters must be aligned to 32 bits.
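The two size cases above can be sketched in userspace C. copy_filter and all_zero are hypothetical helpers, and the sketch reads "aligned to 32 bits" as "the user filter size is a multiple of 4 bytes", which is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if every byte in buf[0..len) is zero. */
static int all_zero(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}

/*
 * Copies a user filter into a kernel filter of kern_size bytes.
 * Returns the number of bytes used, or -1 if the user filter is
 * misaligned or sets bits the kernel does not know about.
 */
static long copy_filter(void *kern, size_t kern_size,
			const void *user, size_t user_size)
{
	size_t used = user_size < kern_size ? user_size : kern_size;

	if (user_size % 4)  /* assumed meaning of the 32-bit alignment rule */
		return -1;
	/* Larger user filter: the tail beyond the kernel size must be zero. */
	if (user_size > kern_size &&
	    !all_zero((const unsigned char *)user + kern_size,
		      user_size - kern_size))
		return -1;

	memset(kern, 0, kern_size);  /* fields the user did not supply stay zero */
	memcpy(kern, user, used);
	return (long)used;
}
```

With this rule, an older kernel accepts a newer, larger user structure as long as the fields it does not understand are unused.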
Signed-off-by: Maor Gottlieb
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
-
Expose RSS related capabilities. This includes both direct ones (i.e.
struct ib_rss_caps) and max_wq_type_rq, which may be used in both
RSS and non RSS flows.
Specifically,
supported_qpts:
- QP types that support RSS on the device.
max_rwq_indirection_tables:
- Max number of receive work queue indirection tables that
could be opened on the device.
max_rwq_indirection_table_size:
- Max size of a receive work queue indirection table.
max_wq_type_rq:
- Max number of work queues of receive type that
could be opened on the device.
Signed-off-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
07 Oct, 2016
1 commit
-
This patch fixes the kernel crash below on memory registration for rxe
and other transport drivers which have a dma_ops extension.
IB/core invokes ib_map_sg_attrs() in a generic manner with dma attributes,
which are used by the mlx5 and mthca adapters. However, in doing so it
ignored the dma_ops extension of software based transports for
sg map/unmap operations. This results in calling dma_map_sg_attrs of the
hardware virtual device, resulting in a crash on a null reference.
We extend the core to support sg_map/unmap_attrs and transport drivers
to implement those dma_ops callback functions.
Verified using perftest applications.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] check_addr+0x35/0x60
...
Call Trace:
[] ? nommu_map_sg+0x99/0xd0
[] ib_umem_get+0x3d6/0x470 [ib_core]
[] rxe_mem_init_user+0x49/0x270 [rdma_rxe]
[] ? rxe_add_index+0xca/0x100 [rdma_rxe]
[] rxe_reg_user_mr+0x9f/0x130 [rdma_rxe]
[] ib_uverbs_reg_mr+0x14e/0x2c0 [ib_uverbs]
[] ib_uverbs_write+0x15b/0x3b0 [ib_uverbs]
[] ? mem_cgroup_commit_charge+0x76/0xe0
[] ? page_add_new_anon_rmap+0x89/0xc0
[] ? lru_cache_add_active_or_unevictable+0x39/0xc0
[] __vfs_write+0x28/0x120
[] ? rw_verify_area+0x49/0xb0
[] vfs_write+0xb2/0x1b0
[] SyS_write+0x46/0xa0
[] entry_SYSCALL_64_fastpath+0x1a/0xa4
Signed-off-by: Parav Pandit
Signed-off-by: Doug Ledford
02 Oct, 2016
1 commit
-
Add IB headers, defines, and accessors that are identical
in both qib and hfi1 into the core includes.
The accessors are for maintenance of __be64 fields, since
alignment is potentially invalid and can differ based on
the presence of the GRH.
{hfi1,qib}_ib_headers will be ib_headers.
{hfi1,qib}_other_headers will be ib_other_headers.
Reviewed-by: Dennis Dalessandro
Reviewed-by: Don Hiatt
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
24 Sep, 2016
3 commits
-
We now only use it from ib_alloc_pd to create a local DMA lkey if the
device doesn't provide one, or a global rkey if the ULP requests it.
This patch removes ib_get_dma_mr and open codes the functionality in
ib_alloc_pd so that we can simplify the code and prevent abuse of the
functionality. As a side effect we can also simplify things by removing
the valid access bit check, and the PD refcounting.
In the future I hope to also remove the per-PD global MR entirely by
shifting this work into the HW drivers, as one step towards avoiding
the struct ib_mr overload for various different use cases.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Jason Gunthorpe
Reviewed-by: Steve Wise
Signed-off-by: Doug Ledford
-
Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or
less unchecked, this moves the capability of creating a global rkey into
the RDMA core, where it can be easily audited. It also prints a warning
every time this feature is used.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Jason Gunthorpe
Reviewed-by: Steve Wise
Signed-off-by: Doug Ledford
-
This has two reasons: a) to clearly mark that drivers don't have any
business using it, and b) because we're going to use it for the
(dangerous) global rkey soon, so that drivers don't create one themselves.
Signed-off-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
Reviewed-by: Jason Gunthorpe
Reviewed-by: Steve Wise
Signed-off-by: Doug Ledford
17 Sep, 2016
1 commit
-
This centralizes the function and improves code readability.
Reviewed-by: Dennis Dalessandro
Signed-off-by: Mike Marciniszyn
Signed-off-by: Dennis Dalessandro
Signed-off-by: Doug Ledford
24 Aug, 2016
1 commit
-
* Reuse existing functionality from memdup_user() instead of keeping
duplicate source code.
This issue was detected by using the Coccinelle software.
* The local variable "ret" will be set to an appropriate value a bit later.
Thus omit the explicit initialisation at the beginning.
Signed-off-by: Markus Elfring
Signed-off-by: Doug Ledford
05 Aug, 2016
2 commits
-
Pull second round of rdma updates from Doug Ledford:
"This can be split out into just two categories:
- fixes to the RDMA R/W API in regards to SG list length limits
(about 5 patches)
- fixes/features for the Intel hfi1 driver (everything else)
The hfi1 driver is still being brought to full feature support by
Intel, and they have a lot of people working on it, so that amounts to
almost the entirety of this pull request"
* tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (84 commits)
IB/hfi1: Add cache evict LRU list
IB/hfi1: Fix memory leak during unexpected shutdown
IB/hfi1: Remove unneeded mm argument in remove function
IB/hfi1: Consistently call ops->remove outside spinlock
IB/hfi1: Use evict mmu rb operation
IB/hfi1: Add evict operation to the mmu rb handler
IB/hfi1: Fix TID caching actions
IB/hfi1: Make the cache handler own its rb tree root
IB/hfi1: Make use of mm consistent
IB/hfi1: Fix user SDMA racy user request claim
IB/hfi1: Fix error condition that needs to clean up
IB/hfi1: Release node on insert failure
IB/hfi1: Validate SDMA user iovector count
IB/hfi1: Validate SDMA user request index
IB/hfi1: Use the same capability state for all shared contexts
IB/hfi1: Prevent null pointer dereference
IB/hfi1: Rename TID mmu_rb_* functions
IB/hfi1: Remove unneeded empty check in hfi1_mmu_rb_unregister()
IB/hfi1: Restructure hfi1_file_open
IB/hfi1: Make iovec loop index easy to understand
...
-
Pull base rdma updates from Doug Ledford:
"Round one of 4.8 code: while this is mostly normal, there is a new
driver in here (the driver was hosted outside the kernel for several
years and is actually a fairly mature and well coded driver). It
amounts to 13,000 of the 16,000 lines of added code in here.
Summary:
- Updates/fixes for iw_cxgb4 driver
- Updates/fixes for mlx5 driver
- Add flow steering and RSS API
- Add hardware stats to mlx4 and mlx5 drivers
- Add firmware version API for RDMA driver use
- Add the rxe driver (this is a software RoCE driver that makes any
Ethernet device a RoCE device)
- Fixes for i40iw driver
- Support for send only multicast joins in the cma layer
- Other minor fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits)
Soft RoCE driver
IB/core: Support for CMA multicast join flags
IB/sa: Add cached attribute containing SM information to SA port
IB/uverbs: Fix race between uverbs_close and remove_one
IB/mthca: Clean up error unwind flow in mthca_reset()
IB/mthca: NULL arg to pci_dev_put is OK
IB/hfi1: NULL arg to sc_return_credits is OK
IB/mlx4: Add diagnostic hardware counters
net/mlx4: Query performance and diagnostics counters
net/mlx4: Add diagnostic counters capability bit
Use smaller 512 byte messages for portmapper messages
IB/ipoib: Report SG feature regardless of HW UD CSUM capability
IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct
IB/hfi1: Disable by default
IB/rdmavt: Disable by default
IB/mlx5: Fix port counter ID association to QP offset
IB/mlx5: Fix iteration overrun in GSI qps
i40iw: Add NULL check for puda buffer
i40iw: Change dup_ack_thresh to u8
i40iw: Remove unnecessary check for moving CQ head
...
04 Aug, 2016
4 commits
-
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski
Acked-by: Vineet Gupta
Acked-by: Robin Murphy
Acked-by: Hans-Christian Noren Egtvedt
Acked-by: Mark Salter [c6x]
Acked-by: Jesper Nilsson [cris]
Acked-by: Daniel Vetter [drm]
Reviewed-by: Bart Van Assche
Acked-by: Joerg Roedel [iommu]
Acked-by: Fabien Dessenne [bdisp]
Reviewed-by: Marek Szyprowski [vb2-core]
Acked-by: David Vrabel [xen]
Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
Acked-by: Joerg Roedel [iommu]
Acked-by: Richard Kuo [hexagon]
Acked-by: Geert Uytterhoeven [m68k]
Acked-by: Gerald Schaefer [s390]
Acked-by: Bjorn Andersson
Acked-by: Hans-Christian Noren Egtvedt [avr32]
Acked-by: Vineet Gupta [arc]
Acked-by: Robin Murphy [arm64 and dma-iommu]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Added UCMA and CMA support for multicast join flags. Flags are
passed using UCMA CM join command previously reserved fields.
Currently supporting two join flags indicating two different
multicast JoinStates:
1. Full Member:
The initiator creates the Multicast group(MCG) if it wasn't
previously created, can send Multicast messages to the group
and receive messages from the MCG.
2. Send Only Full Member:
The initiator creates the Multicast group(MCG) if it wasn't
previously created, can send Multicast messages to the group
but doesn't receive any messages from the MCG.
IB: Send Only Full Member requires a query of ClassPortInfo
to determine if SM/SA supports this option. If SM/SA
doesn't support Send-Only there will be no join request
sent and an error will be returned.
ETH: When Send Only Full Member is requested no IGMP join
will be sent.
Signed-off-by: Alex Vesker
Reviewed-by: Hal Rosenstock
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
03 Aug, 2016
1 commit
-
The use of the specific opcode test is redundant since
all ack entry users correctly manipulate the mr pointer
to selectively trigger the reference clearing.
The overly specific test hinders the use of implementation
specific operations.
The change needs to get rid of the union to ensure that
an atomic value is not seen as an MR pointer.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Mike Marciniszyn
Signed-off-by: Ira Weiny
Signed-off-by: Doug Ledford
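The reasoning about the union can be illustrated with a small sketch. The struct and helper names are hypothetical and the layout is simplified; the only point shown is that with separate members an atomic result cannot be mistaken for an MR pointer, so the release path can test the pointer alone.

```c
#include <stddef.h>
#include <stdint.h>

struct mr_sketch { int refcount; };  /* hypothetical memory region */

/* Separate members: atomic_data can never alias the mr pointer. */
struct ack_entry_sketch {
	struct mr_sketch *mr;   /* non-NULL only when a reference is held */
	uint64_t atomic_data;   /* result of an atomic operation */
};

/* Drops the reference if one is held; no opcode test is needed. */
static void release_ack_entry(struct ack_entry_sketch *e)
{
	if (e->mr) {
		e->mr->refcount--;
		e->mr = NULL;
	}
}
```

Had the two members shared storage, a non-zero atomic value would look like a live pointer to the test above.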