13 Oct, 2018
1 commit
-
commit 5fe23f262e0548ca7f19fb79f89059a60d087d22 upstream.
There is a race condition between ucma_close() and ucma_resolve_ip():
CPU0 CPU1
ucma_resolve_ip(): ucma_close():ctx = ucma_get_ctx(file, cmd.id);
list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
mutex_lock(&mut);
idr_remove(&ctx_idr, ctx->id);
mutex_unlock(&mut);
...
mutex_lock(&mut);
if (!ctx->closing) {
mutex_unlock(&mut);
rdma_destroy_id(ctx->cm_id);
...
ucma_free_ctx(ctx);ret = rdma_resolve_addr();
ucma_put_ctx(ctx);Before idr_remove(), ucma_get_ctx() could still find the ctx
and after rdma_destroy_id(), rdma_resolve_addr() may still
access id_priv pointer. Also, ucma_put_ctx() may use ctx after
ucma_free_ctx() too.ucma_close() should call ucma_put_ctx() too which tests the
refcnt and waits for the last one releasing it. The similar
pattern is already used by ucma_destroy_id().Reported-and-tested-by: syzbot+da2591e115d57a9cbb8b@syzkaller.appspotmail.com
Reported-by: syzbot+cfe3c1e8ef634ba8964b@syzkaller.appspotmail.com
Cc: Jason Gunthorpe
Cc: Doug Ledford
Cc: Leon Romanovsky
Signed-off-by: Cong Wang
Reviewed-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Greg Kroah-Hartman
10 Oct, 2018
1 commit
-
[ Upstream commit 0d23ba6034b9cf48b8918404367506da3e4b3ee5 ]
The current code grabs the private_data of whatever file descriptor
userspace has supplied and implicitly casts it to a `struct ucma_file *`,
potentially causing a type confusion.This is probably fine in practice because the pointer is only used for
comparisons, it is never actually dereferenced; and even in the
comparisons, it is unlikely that a file from another filesystem would have
a ->private_data pointer that happens to also be valid in this context.
But ->private_data is not always guaranteed to be a valid pointer to an
object owned by the file's filesystem; for example, some filesystems just
cram numbers in there.Check the type of the supplied file descriptor to be safe, analogous to how
other places in the kernel do it.Fixes: 88314e4dda1e ("RDMA/cma: add support for rdma_migrate_id()")
Signed-off-by: Jann Horn
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
04 Oct, 2018
2 commits
-
commit 67e3816842fe6414d629c7515b955952ec40c7d7 upstream.
Currently a uverbs completion event queue is flushed of events in
ib_uverbs_comp_event_close() with the queue spinlock held and then
released. Yet setting ev_queue->is_closed is not set until later in
uverbs_hot_unplug_completion_event_file().In between the time ib_uverbs_comp_event_close() releases the lock and
uverbs_hot_unplug_completion_event_file() acquires the lock, a completion
event can arrive and be inserted into the event queue by
ib_uverbs_comp_handler().This can cause a "double add" list_add warning or crash depending on the
kernel configuration, or a memory leak because the event is never dequeued
since the queue is already closed down.So add setting ev_queue->is_closed = 1 to ib_uverbs_comp_event_close().
Cc: stable@vger.kernel.org
Fixes: 1e7710f3f656 ("IB/core: Change completion channel to use the reworked objects schema")
Signed-off-by: Steve Wise
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit c2d7c8ff89b22ddefb1ac2986c0d48444a667689 ]
"nents" is an unsigned int, so if ib_map_mr_sg() returns a negative
error code then it's type promoted to a high unsigned int which is
treated as success.Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API")
Signed-off-by: Dan Carpenter
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
26 Sep, 2018
1 commit
-
commit 954a8e3aea87e896e320cf648c1a5bbe47de443e upstream.
When AF_IB addresses are used during rdma_resolve_addr() a lock is not
held. A cma device can get removed while list traversal is in progress
which may lead to crash. ieCPU0 CPU1
==== ====
rdma_resolve_addr()
cma_resolve_ib_dev()
list_for_each() cma_remove_one()
cur_dev->device mutex_lock(&lock)
list_del();
mutex_unlock(&lock);
cma_process_remove();Therefore, hold a lock while traversing the list which avoids such
situation.Cc: # 3.10
Fixes: f17df3b0dede ("RDMA/cma: Add support for AF_IB to rdma_resolve_addr()")
Signed-off-by: Parav Pandit
Reviewed-by: Daniel Jurgens
Signed-off-by: Leon Romanovsky
Reviewed-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
20 Sep, 2018
1 commit
-
[ Upstream commit 643d213a9a034fa04f5575a40dfc8548e33ce04f ]
Currently if the cm_id is not bound to any netdevice, than for such cm_id,
net namespace is ignored; which is incorrect.Regardless of cm_id bound to a netdevice or not, net namespace must
match. When a cm_id is bound to a netdevice, in such case net namespace
and netdevice both must match.Fixes: 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM")
Signed-off-by: Parav Pandit
Reviewed-by: Daniel Jurgens
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
06 Aug, 2018
1 commit
-
commit addb8a6559f0f8b5a37582b7ca698358445a55bf upstream.
The commit cited below checked that the port numbers provided in the
primary and alt AVs are legal.That is sufficient to prevent a kernel panic. However, it is not
sufficient for correct operation.In Linux, AVs (both primary and alt) must be completely self-described.
We do not accept an AV from userspace without an embedded port number.
(This has been the case since kernel 3.14 commit dbf727de7440
("IB/core: Use GID table in AH creation and dmac resolution")).For the primary AV, this embedded port number must match the port number
specified with IB_QP_PORT.We also expect the port number embedded in the alt AV to match the
alt_port_num value passed by the userspace driver in the modify_qp command
base structure.Add these checks to modify_qp.
Cc: # 4.16
Fixes: 5d4c05c3ee36 ("RDMA/uverbs: Sanitize user entered port numbers prior to access it")
Signed-off-by: Jack Morgenstein
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
03 Aug, 2018
5 commits
-
commit 940efcc8889f0d15567eb07fc9fd69b06e366aa5 upstream.
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.Cc: # 3.11
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller
Reported-by: Noa Osherovich
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 2468b82d69e3a53d024f28d79ba0fdb8bf43dfbf ]
Let's perform checks in-place instead of BUG_ONs.
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit cb2595c1393b4a5211534e6f0a0fbad369e21ad8 ]
ucma_process_join() will free the new allocated "mc" struct,
if there is any error after that, especially the copy_to_user().But in parallel, ucma_leave_multicast() could find this "mc"
through idr_find() before ucma_process_join() frees it, since it
is already published.So "mc" could be used in ucma_leave_multicast() after it is been
allocated and freed in ucma_process_join(), since we don't refcnt
it.Fix this by separating "publish" from ID allocation, so that we
can get an ID first and publish it later after copy_to_user().Fixes: c8f6a362bf3e ("RDMA/cma: Add multicast communication support")
Reported-by: Noam Rathaus
Signed-off-by: Cong Wang
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
commit 6ee687735e745eafae9e6b93d1ea70bc52e7ad07 upstream.
gcc-4.4.4 has issues with initialization of anonymous unions.
drivers/infiniband/core/verbs.c: In function '__ib_drain_sq':
drivers/infiniband/core/verbs.c:2204: error: unknown field 'wr_cqe' specified in initializer
drivers/infiniband/core/verbs.c:2204: warning: initialization makes integer from pointer without a castWork around this.
Fixes: a1ae7d0345edd5 ("RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access")
Cc: Bart Van Assche
Cc: Steve Wise
Cc: Sagi Grimberg
Cc: Jason Gunthorpe
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Doug Ledford
Signed-off-by: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman -
commit a1ae7d0345edd593d6725d3218434d903a0af95d upstream.
This patch fixes the following KASAN complaint:
==================================================================
BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x77d/0x9b0 [rdma_rxe]
Read of size 8 at addr ffff880061aef860 by task 01/1080CPU: 2 PID: 1080 Comm: 01 Not tainted 4.16.0-rc3-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
rxe_post_send+0x77d/0x9b0 [rdma_rxe]
__ib_drain_sq+0x1ad/0x250 [ib_core]
ib_drain_qp+0x9/0x30 [ib_core]
srp_destroy_qp+0x51/0x70 [ib_srp]
srp_free_ch_ib+0xfc/0x380 [ib_srp]
srp_create_target+0x1071/0x19e0 [ib_srp]
kernfs_fop_write+0x180/0x210
__vfs_write+0xb1/0x2e0
vfs_write+0xf6/0x250
SyS_write+0x99/0x110
do_syscall_64+0xee/0x2b0
entry_SYSCALL_64_after_hwframe+0x42/0xb7The buggy address belongs to the page:
page:ffffea000186bbc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x4000000000000000()
raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: 0000000000000000 ffffea000186bbe0 0000000000000000 0000000000000000
page dumped because: kasan: bad access detectedMemory state around the buggy address:
ffff880061aef700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880061aef780: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
>ffff880061aef800: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 f2 f2 f2 f2
^
ffff880061aef880: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 f2 f2
ffff880061aef900: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================Fixes: 765d67748bcf ("IB: new common API for draining queues")
Signed-off-by: Bart Van Assche
Cc: Steve Wise
Cc: Sagi Grimberg
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman
17 Jul, 2018
1 commit
-
commit 7a8690ed6f5346f6738971892205e91d39b6b901 upstream.
In commit 357d23c811a7 ("Remove the obsolete libibcm library")
in rdma-core [1], we removed obsolete library which used the
/dev/infiniband/ucmX interface.Following multiple syzkaller reports about non-sanitized
user input in the UCMA module, the short audit reveals the same
issues in UCM module too.It is better to disable this interface in the kernel,
before syzkaller team invests time and energy to harden
this unused interface.[1] https://github.com/linux-rdma/rdma-core/pull/279
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
03 Jul, 2018
1 commit
-
commit 08bb558ac11ab944e0539e78619d7b4c356278bd upstream.
Make the MR writability flags check, which is performed in umem.c,
a static inline function in file ib_verbs.hThis allows the function to be used by low-level infiniband drivers.
Cc:
Signed-off-by: Jason Gunthorpe
Signed-off-by: Jack Morgenstein
Signed-off-by: Leon Romanovsky
Signed-off-by: Greg Kroah-Hartman
21 Jun, 2018
5 commits
-
[ Upstream commit 9aa169213d1166d30ae357a44abbeae93459339d ]
When commit [1] was added, SGID was queried to derive the SMAC address.
Then, later on during a refactor [2], SMAC was no longer needed. However,
the now useless GID query remained. Then during additional code changes
later on, the GID query was being done in such a way that it caused iWARP
queries to start breaking. Remove the useless GID query and resolve the
iWARP breakage at the same time.This is discussed in [3].
[1] commit dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
[2] commit 5c266b2304fb ("IB/cm: Remove the usage of smac and vid of qp_attr and cm_av")
[3] https://www.spinics.net/lists/linux-rdma/msg63951.htmlSuggested-by: Shiraz Saleem
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit db82476f37413eaeff5f836a9d8b022d6544accf ]
Currently, the kernel protects access to the agent ID allocator on a per
port basis using a spinlock, so it is impossible for two apps/threads on
the same port to get the same TID, but it is entirely possible for two
threads on different ports to end up with the same TID.As this can be confusing (regardless of it being legal according to the
IB Spec 1.3, C13-18.1.1, in section 13.4.6.4 - TransactionID usage),
and as the rdma-core user space API for /dev/umad devices implies unique
TIDs even across ports, make the TID an atomic type so that no two
allocations, regardless of port number, will be the same.Signed-off-by: Håkon Bugge
Reviewed-by: Jack Morgenstein
Reviewed-by: Ira Weiny
Reviewed-by: Zhu Yanjun
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit f96416cea7bce9afe619c15e87fced70f93f9098 ]
In the cases where iwpm_hash_bucket is NULL and where function
get_mapinfo_hash_bucket returns NULL then the map_info is never added
to hash_bucket_head and hence there is a leak of map_info. Fix this
by nullifying hash_bucket_head and if that is null we know that
that map_info was not added to hash_bucket_head and hence map_info
should be free'd.Detected by CoverityScan, CID#1222481 ("Resource Leak")
Fixes: 30dc5e63d6a5 ("RDMA/core: Add support for iWARP Port Mapper user space service")
Signed-off-by: Colin Ian King
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 2918c1a900252b4a0c730715ec205437c7daf79d ]
There are few issues with validation of netdevice and listen id lookup
for IB (IPoIB) while processing incoming CM request as below.1. While performing lookup of bind_list in cma_ps_find(), net namespace
of the netdevice can get deleted in cma_exit_net(), resulting in use
after free access of idr and/or net namespace structures.
This lookup occurs from the workqueue context (and not userspace
context where net namespace is always valid).CPU0 CPU1
==== ====bind_list = cma_ps_find();
move netdevice to new namespace
delete net namespace
cma_exit_net()
idr_destroy(idr);[..]
cma_find_listener(bind_list, ..);2. While netdevice is validated for IP address in given net namespace,
netdevice's net namespace and/or ifindex can change in
cma_get_net_dev() and cma_match_net_dev().Above issues are overcome by using rcu lock along with netdevice
UP/DOWN state as described below.
When a net namespace is getting deleted, netdevice is closed and
shutdown before moving it back to init_net namespace.
change_net_namespace() synchronizes with any existing use of netdevice
before changing the netdev properties such as net or ifindex.
Once netdevice IFF_UP flags is cleared, such fields are not guaranteed
to be valid.
Therefore, rcu lock along with netdevice state check ensures that,
while route lookup and cm_id lookup is in progress, netdevice of
interest won't migrate to any other net namespace.
This ensures that associated net namespace of netdevice won't get
deleted while rcu lock is held for netdevice which is in IFF_UP state.Fixes: fa20105e09e9 ("IB/cma: Add support for network namespaces")
Fixes: 4be74b42a6d0 ("IB/cma: Separate port allocation to network namespaces")
Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests")
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit f604db645a66b7ba4f21c426fe73253928dada41 ]
Previously, if a method contained mandatory attributes in a namespace
that wasn't given by the user, these attributes weren't validated.
Fixing this by iterating over all specification namespaces.Fixes: fac9658cabb9 ("IB/core: Add new ioctl interface")
Signed-off-by: Matan Barak
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
05 Jun, 2018
1 commit
-
commit a840c93ca7582bb6c88df2345a33f979b7a67874 upstream.
When a GID entry is invalid EAGAIN is returned. This is an incorrect error
code, there is nothing that will make this GID entry valid again in
bounded time.Some user space tools fail incorrectly if EAGAIN is returned here, and
this represents a small ABI change from earlier kernels.The first patch in the Fixes list makes entries that were valid before
to become invalid, allowing this code to trigger, while the second patch
in the Fixes list introduced the wrong EAGAIN.Therefore revert the return result to EINVAL which matches the historical
expectations of the ibv_query_gid_type() API of the libibverbs user space
library.Cc:
Fixes: 598ff6bae689 ("IB/core: Refactor GID modify code for RoCE")
Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Reviewed-by: Daniel Jurgens
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
30 May, 2018
7 commits
-
[ Upstream commit 563c4ba3bd2b8b0b21c65669ec2226b1cfa1138b ]
ah_attr contains the port number to which cm_id is bound. However, while
searching for GID table for matching GID entry, the port number is
ignored.This could cause the wrong GID to be used when the ah_attr is converted to
an AH.Reviewed-by: Daniel Jurgens
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 5f3e3b85cc0a5eae1c46d72e47d3de7bf208d9e2 ]
The option size check is using optval instead of optlen
causing the set option call to fail. Use the correct
field, optlen, for size check.Fixes: 6a21dfc0d0db ("RDMA/ucma: Limit possible option size")
Signed-off-by: Chien Tin Tung
Signed-off-by: Shiraz Saleem
Reviewed-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit bb7f8f199c354c4cf155b1d6d55f86eaaed7fa5a ]
resolved_dev returned might be NULL as ifindex is transient number.
Ignoring NULL check of resolved_dev might crash the kernel.
Therefore perform NULL check before accessing resolved_dev.Additionally rdma_resolve_ip_route() invokes addr_resolve() which
performs check and address translation for loopback ifindex.
Therefore, checking it again in rdma_resolve_ip_route() is not helpful.
Therefore, the code is simplified to avoid IFF_LOOPBACK check.Fixes: 200298326b27 ("IB/core: Validate route when we init ah")
Reviewed-by: Daniel Jurgens
Signed-off-by: Parav Pandit
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit ec6f8401c48a86809237e86878a6fac6b281118f ]
If remove_commit fails then the lock is left locked while the uobj still
exists. Eventually the kernel will deadlock.lockdep detects this and says:
test/4221 is leaving the kernel with locks still held!
1 lock held by test/4221:
#0: (&ucontext->cleanup_rwsem){.+.+}, at: [] rdma_explicit_destroy+0x37/0x120 [ib_uverbs]Fixes: 4da70da23e9b ("IB/core: Explicitly destroy an object while keeping uobject")
Signed-off-by: Leon Romanovsky
Reviewed-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 4d39a959bc1f3d164b5a54147fdeb19f84b1ed58 ]
If the same attribute is listed twice by the user in the ioctl attribute
list then error unwind can cause the kernel to deref garbage.This happens when an object with WRITE access is sent twice. The second
parse properly fails but corrupts the state required for the error unwind
it triggers.Fixing this by making duplicates in the attribute list invalid. This is
not something we need to support.The ioctl interface is currently recommended to be disabled in kConfig.
Signed-off-by: Matan Barak
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 3d89459e2ef92cc0e5a50dde868780ccda9786c1 ]
Fix a bug in uverbs_ioctl_merge that looked at the object's iterator
number instead of the method's iterator number when merging methods.While we're at it, make the uverbs_ioctl_merge code a bit more clear
and faster.Fixes: 118620d3686b ('IB/core: Add uverbs merge trees functionality')
Signed-off-by: Matan Barak
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
commit 8e907ed4882714fd13cfe670681fc6cb5284c780 upstream.
User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.
If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
exited, get_pid_task will return NULL and ib_umem_release will not
decrease mm->pinned_vm.Instead of using threads to locate the mm, use the overall tgid from the
ib_ucontext struct instead. This matches the behavior of ODP and
disassociate in handling the mm of the process that called ibv_reg_mr.Cc:
Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Signed-off-by: Lidong Chen
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
09 May, 2018
1 commit
-
commit 09abfe7b5b2f442a85f4c4d59ecf582ad76088d7 upstream.
The RDMA CM will select a source device and address by consulting
the routing table if no source address is passed into
rdma_resolve_address(). Userspace will ask for this by passing an
all-zero source address in the RESOLVE_IP command. Unfortunately
the new check for non-zero address size rejects this with EINVAL,
which breaks valid userspace applications.Fix this by explicitly allowing a zero address family for the source.
Fixes: 2975d5de6428 ("RDMA/ucma: Check AF family prior resolving address")
Cc:
Signed-off-by: Roland Dreier
Signed-off-by: Doug Ledford
Signed-off-by: Greg Kroah-Hartman
26 Apr, 2018
4 commits
-
[ Upstream commit d3b9e8ad425cfd5b9116732e057f1b48e4d3bcb8 ]
Fix warning limit for kernel stack consumption:
drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct':
drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes
is larger than 1024 bytes [-Werror=frame-larger-than=]Using smaller ib_wc array on the stack brings us comfortably below that
limit again.Fixes: 246d8b184c10 ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct")
Reported-by: Arnd Bergmann
Reviewed-by: Sergey Gorenko
Signed-off-by: Max Gurtovoy
Signed-off-by: Leon Romanovsky
Reviewed-by: Bart Van Assche
Acked-by: Arnd Bergmann
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 3624a8f02568f08aef299d3b117f2226f621177d ]
Returning EOPNOTSUPP is problematic because it can also be
returned by the method function, and we use it in quite a few
places in drivers these days.Instead, dedicate EPROTONOSUPPORT to indicate that the ioctl framework
is enabled but the requested object and method are not supported by
the kernel. No other case will return this code, and it lets userspace
know to fall back to write().grep says we do not use it today in drivers/infiniband subsystem.
Signed-off-by: Jason Gunthorpe
Reviewed-by: Matan Barak
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 00db63c128dd3daf38f481371976c24d32678142 ]
If valid netdevice is not found for RoCE, GID table should not be
searched with NULL netdevice.Doing so causes the search routines to ignore the netdev argument and may
match the wrong GID table entry if the netdev is deleted.Fixes: abae1b71dd37 ("IB/cma: cma_validate_port should verify the port and netdevice")
Signed-off-by: Parav Pandit
Reviewed-by: Mark Bloch
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 246d8b184c100e8eb6b4e8c88f232c2ed2a4e672 ]
polling the completion queue directly does not interfere
with the existing polling logic, hence drop the requirement.
Be aware that running ib_process_cq_direct with non IB_POLL_DIRECT
CQ may trigger concurrent CQ processing.This can be used for polling mode ULPs.
Cc: Bart Van Assche
Reported-by: Steve Wise
Signed-off-by: Sagi Grimberg
[maxg: added wcs array argument to __ib_process_cq]
Signed-off-by: Max Gurtovoy
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
24 Apr, 2018
1 commit
-
commit 8435168d50e66fa5eae01852769d20a36f9e5e83 upstream.
Check to make sure that ctx->cm_id->device is set before we use it.
Otherwise userspace can trigger a NULL dereference by doing
RDMA_USER_CM_CMD_SET_OPTION on an ID that is not bound to a device.Cc:
Reported-by:
Signed-off-by: Roland Dreier
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman
12 Apr, 2018
2 commits
-
[ Upstream commit 89838118a515847d3e5c904d2e022779a7173bec ]
The 'if' logic in ucma_query_path was broken with OPA was introduced
and started to treat RoCE paths as as OPA paths. Invert the logic
of the 'if' so only OPA paths are treated as OPA paths.Otherwise the path records returned to rdma_cma users are mangled
when in RoCE mode.Fixes: 57520751445b ("IB/SA: Add OPA path record type")
Signed-off-by: Parav Pandit
Reviewed-by: Mark Bloch
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit e48e5e198fb6ec77c91047a694022f0fefa45292 ]
The commit 1a1c116f3dcf ("RDMA/netlink: Simplify the put_msg and put_attr")
removes nlmsg_len calculation in ibnl_put_attr causing netlink messages and
caused to miss source and destination addresses.Fixes: 1a1c116f3dcf ("RDMA/netlink: Simplify the put_msg and put_attr")
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
08 Apr, 2018
5 commits
-
commit 84652aefb347297aa08e91e283adf7b18f77c2d5 upstream.
There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB. When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.Fix this by introducing new variants
int rdma_addr_size_in6(struct sockaddr_in6 *addr);
int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in. We can use
these new variants to check what size userspace has passed in before
copying any addresses.Reported-by:
Signed-off-by: Roland Dreier
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman -
commit c8d3bcbfc5eab3f01cf373d039af725f3b488813 upstream.
Ensure that device exists prior to accessing its properties.
Reported-by:
Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman -
commit 4b658d1bbc16605330694bb3ef2570c465ef383d upstream.
Add missing check that device is connected prior to access it.
[ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0
[ 55.359389] Read of size 8 at addr 00000000000000b0 by task qp/618
[ 55.360255]
[ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b8b88 #91
[ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 55.363264] Call Trace:
[ 55.363833] dump_stack+0x5c/0x77
[ 55.364215] kasan_report+0x163/0x380
[ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0
[ 55.365238] rdma_init_qp_attr+0x4a/0x2c0
[ 55.366410] ucma_init_qp_attr+0x111/0x200
[ 55.366846] ? ucma_notify+0xf0/0xf0
[ 55.367405] ? _get_random_bytes+0xea/0x1b0
[ 55.367846] ? urandom_read+0x2f0/0x2f0
[ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0
[ 55.369104] ? refcount_inc_not_zero+0x9/0x60
[ 55.369583] ? refcount_inc+0x5/0x30
[ 55.370155] ? rdma_create_id+0x215/0x240
[ 55.370937] ? _copy_to_user+0x4f/0x60
[ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290
[ 55.372127] ? _copy_from_user+0x5e/0x90
[ 55.372720] ucma_write+0x174/0x1f0
[ 55.373090] ? ucma_close_id+0x40/0x40
[ 55.373805] ? __lru_cache_add+0xa8/0xd0
[ 55.374403] __vfs_write+0xc4/0x350
[ 55.374774] ? kernel_read+0xa0/0xa0
[ 55.375173] ? fsnotify+0x899/0x8f0
[ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170
[ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30
[ 55.377522] ? handle_mm_fault+0x174/0x320
[ 55.378169] vfs_write+0xf7/0x280
[ 55.378864] SyS_write+0xa1/0x120
[ 55.379270] ? SyS_read+0x120/0x120
[ 55.379643] ? mm_fault_error+0x180/0x180
[ 55.380071] ? task_work_run+0x7d/0xd0
[ 55.380910] ? __task_pid_nr_ns+0x120/0x140
[ 55.381366] ? SyS_read+0x120/0x120
[ 55.381739] do_syscall_64+0xeb/0x250
[ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 55.382841] RIP: 0033:0x7fc2ef803e99
[ 55.383227] RSP: 002b:00007fffcc5f3be8 EFLAGS: 00000217 ORIG_RAX: 0000000000000001
[ 55.384173] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc2ef803e99
[ 55.386145] RDX: 0000000000000057 RSI: 0000000020000080 RDI: 0000000000000003
[ 55.388418] RBP: 00007fffcc5f3c00 R08: 0000000000000000 R09: 0000000000000000
[ 55.390542] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400480
[ 55.392916] R13: 00007fffcc5f3cf0 R14: 0000000000000000 R15: 0000000000000000
[ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49
8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 03 85 b0 00 00 00 48 8d 78 08
48 89 04 24 e8 3a 4f 1e ff 48
[ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP: ffff8801e2c2f9d8
[ 55.532648] CR2: 00000000000000b0
[ 55.534396] ---[ end trace 70cee64090251c0b ]---Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Fixes: d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user")
Reported-by:
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman -
commit 9137108cc3d64ade13e753108ec611a0daed16a0 upstream.
process_one_req() can race with rdma_addr_cancel():
CPU0 CPU1
==== ====
process_one_work()
debug_work_deactivate(work);
process_one_req()
rdma_addr_cancel()
mutex_lock(&lock);
set_timeout(&req->work,..);
__queue_work()
debug_work_activate(work);
mutex_unlock(&lock);mutex_lock(&lock);
[..]
list_del(&req->list);
mutex_unlock(&lock);
[..]// ODEBUG explodes since the work is still queued.
kfree(req);Causing ODEBUG to detect the use after free:
ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165
WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288
kvm: emulating exchange as write
Kernel panic - not syncing: panic_on_warn set ...CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x1f4/0x2b0 lib/bug.c:186
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
RSP: 0000:ffff8801d966f210 EFLAGS: 00010086
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815acd6e
RDX: 0000000000000000 RSI: 1ffff1003b2cddf2 RDI: 0000000000000000
RBP: ffff8801d966f250 R08: 0000000000000000 R09: 1ffff1003b2cddc8
R10: ffffed003b2cde71 R11: ffffffff86f39a98 R12: 0000000000000001
R13: ffffffff86f15540 R14: ffffffff86408700 R15: ffffffff8147c0a0
__debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
kfree+0xc7/0x260 mm/slab.c:3799
process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC")
Reported-by:
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman -
commit e8980d67d6017c8eee8f9c35f782c4bd68e004c9 upstream.
Prior to access UCMA commands, the context should be initialized
and connected to CM_ID with ucma_create_id(). In case user skips
this step, he can provide non-valid ctx without CM_ID and cause
to multiple NULL dereferences.Also there are situations where the create_id can be raced with
other user access, ensure that the context is only shared to
other threads once it is fully initialized to avoid the races.[ 109.088108] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 109.090315] IP: ucma_connect+0x138/0x1d0
[ 109.092595] PGD 80000001dc02d067 P4D 80000001dc02d067 PUD 1da9ef067 PMD 0
[ 109.095384] Oops: 0000 [#1] SMP KASAN PTI
[ 109.097834] CPU: 0 PID: 663 Comm: uclose Tainted: G B 4.16.0-rc1-00062-g2975d5de6428 #45
[ 109.100816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 109.105943] RIP: 0010:ucma_connect+0x138/0x1d0
[ 109.108850] RSP: 0018:ffff8801c8567a80 EFLAGS: 00010246
[ 109.111484] RAX: 0000000000000000 RBX: 1ffff100390acf50 RCX: ffffffff9d7812e2
[ 109.114496] RDX: 1ffffffff3f507a5 RSI: 0000000000000297 RDI: 0000000000000297
[ 109.117490] RBP: ffff8801daa15600 R08: 0000000000000000 R09: ffffed00390aceeb
[ 109.120429] R10: 0000000000000001 R11: ffffed00390aceea R12: 0000000000000000
[ 109.123318] R13: 0000000000000120 R14: ffff8801de6459c0 R15: 0000000000000118
[ 109.126221] FS: 00007fabb68d6700(0000) GS:ffff8801e5c00000(0000) knlGS:0000000000000000
[ 109.129468] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 109.132523] CR2: 0000000000000020 CR3: 00000001d45d8003 CR4: 00000000003606b0
[ 109.135573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 109.138716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 109.142057] Call Trace:
[ 109.144160] ? ucma_listen+0x110/0x110
[ 109.146386] ? wake_up_q+0x59/0x90
[ 109.148853] ? futex_wake+0x10b/0x2a0
[ 109.151297] ? save_stack+0x89/0xb0
[ 109.153489] ? _copy_from_user+0x5e/0x90
[ 109.155500] ucma_write+0x174/0x1f0
[ 109.157933] ? ucma_resolve_route+0xf0/0xf0
[ 109.160389] ? __mod_node_page_state+0x1d/0x80
[ 109.162706] __vfs_write+0xc4/0x350
[ 109.164911] ? kernel_read+0xa0/0xa0
[ 109.167121] ? path_openat+0x1b10/0x1b10
[ 109.169355] ? fsnotify+0x899/0x8f0
[ 109.171567] ? fsnotify_unmount_inodes+0x170/0x170
[ 109.174145] ? __fget+0xa8/0xf0
[ 109.177110] vfs_write+0xf7/0x280
[ 109.179532] SyS_write+0xa1/0x120
[ 109.181885] ? SyS_read+0x120/0x120
[ 109.184482] ? compat_start_thread+0x60/0x60
[ 109.187124] ? SyS_read+0x120/0x120
[ 109.189548] do_syscall_64+0xeb/0x250
[ 109.192178] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 109.194725] RIP: 0033:0x7fabb61ebe99
[ 109.197040] RSP: 002b:00007fabb68d5e98 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 109.200294] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fabb61ebe99
[ 109.203399] RDX: 0000000000000120 RSI: 00000000200001c0 RDI: 0000000000000004
[ 109.206548] RBP: 00007fabb68d5ec0 R08: 0000000000000000 R09: 0000000000000000
[ 109.209902] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fabb68d5fc0
[ 109.213327] R13: 0000000000000000 R14: 00007fff40ab2430 R15: 00007fabb68d69c0
[ 109.216613] Code: 88 44 24 2c 0f b6 84 24 6e 01 00 00 88 44 24 2d 0f
b6 84 24 69 01 00 00 88 44 24 2e 8b 44 24 60 89 44 24 30 e8 da f6 06 ff
31 c0 41 83 7c 24 20 1b 75 04 8b 44 24 64 48 8d 74 24 20 4c 89 e7
[ 109.223602] RIP: ucma_connect+0x138/0x1d0 RSP: ffff8801c8567a80
[ 109.226256] CR2: 0000000000000020Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by:
Signed-off-by: Leon Romanovsky
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman