19 Mar, 2019
1 commit
-
[ Upstream commit d7cf4a3bf3a83c977a29055e1c4ffada7697b31f ]
smc_poll() returns with mask bit EPOLLPRI if the connection urg_state
is SMC_URG_VALID. Since SMC_URG_VALID is zero, smc_poll() signals
EPOLLPRI erroneously if called in state SMC_INIT before the connection
is created, for instance in a non-blocking connect scenario.
This patch switches to non-zero values for the urg states.
Reviewed-by: Karsten Graul
Fixes: de8474eb9d50 ("net/smc: urgent data support")
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
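The zero-value pitfall described in this commit can be sketched in plain userspace C. This is only an illustration, not the kernel code: every demo_* name, the enum members, and the DEMO_EPOLLPRI constant are hypothetical stand-ins, and the concrete non-zero values are an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the EPOLLPRI mask bit. */
#define DEMO_EPOLLPRI 0x002u

/* Old layout (assumed): the "valid" state has the value zero. */
enum demo_urg_old { DEMO_OLD_URG_VALID = 0, DEMO_OLD_URG_NOTYET, DEMO_OLD_URG_READ };

/* New layout (assumed): every state is non-zero, so zero means "unset". */
enum demo_urg_new { DEMO_NEW_URG_VALID = 1, DEMO_NEW_URG_NOTYET, DEMO_NEW_URG_READ };

/* Hypothetical connection; zero-filled until the connection is created. */
struct demo_conn {
    int urg_state;
};

/* Reports urgent data whenever urg_state equals the old VALID value (0). */
unsigned int demo_poll_old(const struct demo_conn *conn)
{
    return conn->urg_state == DEMO_OLD_URG_VALID ? DEMO_EPOLLPRI : 0;
}

/* Same check against the non-zero VALID value. */
unsigned int demo_poll_new(const struct demo_conn *conn)
{
    return conn->urg_state == DEMO_NEW_URG_VALID ? DEMO_EPOLLPRI : 0;
}
```

With a zero-filled struct demo_conn, demo_poll_old() reports DEMO_EPOLLPRI although no urgent data exists, while demo_poll_new() reports nothing until urg_state is set explicitly.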
23 Jan, 2019
1 commit
-
[ Upstream commit 26d92e951fe0a44ee4aec157cabb65a818cc8151 ]
In smc_release() we release smc->clcsock before unhashing the smc
sock, but a parallel smc_diag_dump() may still be reading
smc->clcsock; this can cause a use-after-free, as
reported by syzbot.
Reported-and-tested-by: syzbot+fbd1e5476e4c94c7b34e@syzkaller.appspotmail.com
Fixes: 51f1de79ad8e ("net/smc: replace sock_put worker by socket refcounting")
Cc: Ursula Braun
Signed-off-by: Cong Wang
Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
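The ordering problem can be sketched in userspace C, assuming the remedy is to remove the socket from the lookup table before freeing the resource that a dump walker dereferences. All demo_* names are hypothetical; the sketch is single-threaded and shows only the ordering, not the locking a concurrent walker would also need.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the internal TCP socket and the SMC socket. */
struct demo_clcsock { int fd; };
struct demo_sock { struct demo_clcsock *clcsock; };

#define DEMO_TABLE_SIZE 4
static struct demo_sock *demo_table[DEMO_TABLE_SIZE];

/* Make the socket visible to dump walkers; returns -1 if the table is full. */
int demo_hash(struct demo_sock *sk)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (!demo_table[i]) {
            demo_table[i] = sk;
            return 0;
        }
    }
    return -1;
}

/* Hide the socket from dump walkers. */
void demo_unhash(struct demo_sock *sk)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++)
        if (demo_table[i] == sk)
            demo_table[i] = NULL;
}

/* Walk all visible sockets and read through clcsock, as a dump would. */
int demo_dump(void)
{
    int visited = 0;
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        struct demo_sock *sk = demo_table[i];
        if (!sk)
            continue;
        (void)sk->clcsock->fd; /* invalid if clcsock were freed while still hashed */
        visited++;
    }
    return visited;
}

/* Unhash first, then free: a later dump can no longer reach clcsock. */
void demo_release(struct demo_sock *sk)
{
    demo_unhash(sk);
    free(sk->clcsock);
    sk->clcsock = NULL;
}
```

Swapping the first two steps of demo_release() would reopen the window in which demo_dump() reads freed memory.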
10 Jan, 2019
1 commit
-
[ Upstream commit 78abe3d0dfad196959b1246003366e2610775ea6 ]
clcsock can be released while kernel_accept() references it in the TCP
listen worker. Also, clcsock needs to be woken up before it is released if TCP
fallback is used and the clcsock is blocked in accept. Add a lock to
safely release clcsock and call kernel_sock_shutdown() to wake up
clcsock from accept in smc_release().
Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
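The "add a lock to safely release clcsock" part can be illustrated with a POSIX mutex in userspace. This is a rough sketch under assumptions: the demo_* names and the pending field are hypothetical, and the wake-up of a blocked accept via kernel_sock_shutdown() is not modelled.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the internal TCP socket. */
struct demo_clcsock { int pending; };

struct demo_smc {
    pthread_mutex_t release_lock; /* serialises use and release of clcsock */
    struct demo_clcsock *clcsock;
};

/* Allocate the inner socket and set up the lock; returns 0 on success. */
int demo_smc_init(struct demo_smc *smc)
{
    if (pthread_mutex_init(&smc->release_lock, NULL) != 0)
        return -1;
    smc->clcsock = calloc(1, sizeof(*smc->clcsock));
    return smc->clcsock ? 0 : -1;
}

/* Listen-worker side: touch clcsock only while holding the lock. */
int demo_accept(struct demo_smc *smc)
{
    int rc;

    pthread_mutex_lock(&smc->release_lock);
    rc = smc->clcsock ? smc->clcsock->pending : -1;
    pthread_mutex_unlock(&smc->release_lock);
    return rc;
}

/* Release side: free and clear the pointer under the same lock. */
void demo_release(struct demo_smc *smc)
{
    pthread_mutex_lock(&smc->release_lock);
    free(smc->clcsock);
    smc->clcsock = NULL;
    pthread_mutex_unlock(&smc->release_lock);
}
```

Because both paths take the same lock and the accept path re-checks the pointer, demo_accept() observes either a live clcsock or NULL, never a freed one.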
04 Nov, 2018
2 commits
-
[ Upstream commit fb692ec4117f6fd25044cfb5720d6b79d400dc65 ]
The pointer to the link group is unset in the smc connection structure
right before the call to smc_buf_unuse. Provide the lgr pointer to
smc_buf_unuse explicitly.
And move the call to smc_lgr_schedule_free_work to the end of
smc_conn_free.
Fixes: a6920d1d130c ("net/smc: handle unregistered buffers")
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit 89ab066d4229acd32e323f1569833302544a4186 ]
This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.
This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.
Signed-off-by: Karsten Graul
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
20 Sep, 2018
1 commit
-
The generic netlink family is only initialized during module init,
so it should be __ro_after_init like all other generic netlink
families.
Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller
19 Sep, 2018
5 commits
-
Comparing an int to a size, which is unsigned, causes the int to become
unsigned, giving the wrong result. kernel_sendmsg can return a negative
error code.
Signed-off-by: YueHaibing
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Don't check a listen socket for pending urgent data in smc_poll().
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
If a linkgroup is terminated abnormally already due to failing
LLC CONFIRM LINK or LLC ADD LINK, fallback to TCP is still possible.
In this case do not switch to state SMC_PEERABORTWAIT and do not set
sk_err.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
For a failing smc_listen_rdma_finish() smc_listen_decline() is
called. If fallback is possible, the new socket is already enqueued
to be accepted in smc_listen_decline(). Avoid enqueuing a second time
afterwards in this case, otherwise the smc_create_lgr_pending lock
is released twice:
[ 373.463976] WARNING: bad unlock balance detected!
[ 373.463978] 4.18.0-rc7+ #123 Tainted: G O
[ 373.463979] -------------------------------------
[ 373.463980] kworker/1:1/30 is trying to release lock (smc_create_lgr_pending) at:
[ 373.463990] [] smc_listen_work+0x22c/0x5d0 [smc]
[ 373.463991] but there are no more locks to release!
[ 373.463991]
other info that might help us debug this:
[ 373.463993] 2 locks held by kworker/1:1/30:
[ 373.463994] #0: 00000000772cbaed ((wq_completion)"events"){+.+.}, at: process_one_work+0x1ec/0x6b0
[ 373.464000] #1: 000000003ad0894a ((work_completion)(&new_smc->smc_listen_work)){+.+.}, at: process_one_work+0x1ec/0x6b0
[ 373.464003]
stack backtrace:
[ 373.464005] CPU: 1 PID: 30 Comm: kworker/1:1 Kdump: loaded Tainted: G O 4.18.0-rc7uschi+ #123
[ 373.464007] Hardware name: IBM 2827 H43 738 (LPAR)
[ 373.464010] Workqueue: events smc_listen_work [smc]
[ 373.464011] Call Trace:
[ 373.464015] ([] show_stack+0x60/0xd8)
[ 373.464019] [] dump_stack+0x9c/0xd8
[ 373.464021] [] print_unlock_imbalance_bug+0xf8/0x108
[ 373.464022] [] lock_release+0x114/0x4f8
[ 373.464025] [] __mutex_unlock_slowpath+0x4a/0x300
[ 373.464027] [] smc_listen_work+0x22c/0x5d0 [smc]
[ 373.464029] [] process_one_work+0x2a8/0x6b0
[ 373.464030] [] worker_thread+0x52/0x410
[ 373.464033] [] kthread+0x15e/0x178
[ 373.464035] [] kernel_thread_starter+0x6/0xc
[ 373.464052] [] kernel_thread_starter+0x0/0xc
[ 373.464054] INFO: lockdep is turned off.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
In state SMC_INIT smc_poll() delegates polling to the internal
CLC socket. This means, once the connect worker has finished
its kernel_connect() step, the poll wake-up may occur. This is not
intended. The wake-up should occur from the wake up call in
smc_connect_work() after __smc_connect() has finished.
Thus in state SMC_INIT this patch now calls sock_poll_wait() on the
main SMC socket.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
18 Aug, 2018
1 commit
-
All RDMA ULPs should be using rdma_get_gid_attr instead of
ib_query_gid. Convert SMC to use the new API.
In the process correct some confusion with gid_type - if attr->ndev is
!NULL then gid_type can never be IB_GID_TYPE_IB by
definition. IB_GID_TYPE_ROCE shares the same enum value and is probably
what was intended here.
Reviewed-by: Parav Pandit
Signed-off-by: Jason Gunthorpe
17 Aug, 2018
3 commits
-
rdma.git merge resolution for the 4.19 merge window
Conflicts:
drivers/infiniband/core/rdma_core.c
- Use the rdma code and revise with the new spelling for
atomic_fetch_add_unless
drivers/nvme/host/rdma.c
- Replace max_sge with max_send_sge in new blk code
drivers/nvme/target/rdma.c
- Use the blk code and revise to use NULL for ib_post_recv when
appropriate
- Replace max_sge with max_recv_sge in new blk code
net/rds/ib_send.c
- Use the net code and revise to use NULL for ib_post_recv when
appropriate
Signed-off-by: Jason Gunthorpe
-
This reverts commit ddb457c6993babbcdd41fca638b870d2a2fc3941.
The include rdma/ib_cache.h is kept, and we have to add a memset
to the compat wrapper to avoid compiler warnings in gcc-7.
This revert is done to avoid extensive merge conflicts with SMC
changes in netdev during the 4.19 merge window.
Signed-off-by: Jason Gunthorpe
-
Resolve merge conflicts from the -rc cycle against the rdma.git tree:
Conflicts:
drivers/infiniband/core/uverbs_cmd.c
- New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
- Merge removal of file->ucontext in for-next with new code in -rc
drivers/infiniband/core/uverbs_main.c
- for-next removed code from ib_uverbs_write() that was modified
in for-rc
Signed-off-by: Jason Gunthorpe
11 Aug, 2018
1 commit
-
With SMC-D z/OS sends a test link signal every 10 seconds. Linux is
supposed to answer, otherwise the SMC-D connection breaks.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
10 Aug, 2018
1 commit
-
Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst
adding some tracepoints.
Signed-off-by: David S. Miller
09 Aug, 2018
3 commits
-
When an SMC socket is connecting it is decided whether fallback to
TCP is needed. To avoid races between connect and ioctl move the
sock lock before the use_fallback check.
Reported-by: syzbot+5b2cece1a8ecb2ca77d8@syzkaller.appspotmail.com
Reported-by: syzbot+19557374321ca3710990@syzkaller.appspotmail.com
Fixes: 1992d99882af ("net/smc: take sock lock in smc_ioctl()")
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Without setsockopt SO_SNDBUF and SO_RCVBUF settings, the sysctl
defaults net.ipv4.tcp_wmem and net.ipv4.tcp_rmem should be the base
for the sizes of the SMC sndbuf and rcvbuf. Any TCP buffer size
optimizations for servers should be ignored.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Invoking shutdown for a socket in state SMC_LISTEN does not make
sense. Nevertheless programs like syzbot fuzzing the kernel may
try to do this. For SMC this means a socket refcounting problem.
This patch makes sure a shutdown call for an SMC socket in state
SMC_LISTEN simply returns with -ENOTCONN.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
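The state check described in this commit can be sketched as follows. This is a minimal userspace illustration: the demo_* states, the shutdown_mask field, and the mapping of the how argument onto mask bits are assumptions, not the kernel implementation.

```c
#include <errno.h>

/* Hypothetical socket states, loosely mirroring the SMC state names. */
enum demo_state { DEMO_INIT, DEMO_ACTIVE, DEMO_LISTEN, DEMO_CLOSED };

struct demo_sock {
    enum demo_state state;
    int shutdown_mask; /* bit 0: receive side closed, bit 1: send side closed */
};

/*
 * how: 0 = stop receiving, 1 = stop sending, 2 = both.
 * A listening socket has no connection to shut down, so the call is
 * rejected before any state or reference count is touched.
 */
int demo_shutdown(struct demo_sock *sk, int how)
{
    if (how < 0 || how > 2)
        return -EINVAL;
    if (sk->state == DEMO_LISTEN)
        return -ENOTCONN;
    sk->shutdown_mask |= how + 1;
    return 0;
}
```

Returning early keeps the listening socket untouched, which in this sketch stands in for avoiding the refcounting problem mentioned above.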
06 Aug, 2018
1 commit
-
Lots of overlapping changes, mostly trivial in nature.
The mlxsw conflict was resolved using the example
resolution at:
https://github.com/jpirko/linux_mlxsw/blob/combined_queue/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_actions.c
Signed-off-by: David S. Miller
05 Aug, 2018
1 commit
-
If a writer blocked condition is received without data, the current
consumer cursor is immediately sent. Servers could already receive this
condition in state SMC_INIT without finished tx-setup. This patch
avoids sending a consumer cursor update in this case.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
31 Jul, 2018
1 commit
-
The wait_address argument is always directly derived from the filp
argument, so remove it.
Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller
26 Jul, 2018
4 commits
-
Send an orderly DELETE LINK request before termination of a link group,
add support for client triggered DELETE LINK processing. And send a
disorderly DELETE LINK before module is unloaded.
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Remember the fallback reason code and the peer diagnosis code for
smc sockets, and provide them in smc_diag.c to the netlink interface.
And add more detailed reason codes.
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
SMC code uses the base gid for VLAN traffic. The gids exchanged in
the CLC handshake and the gid index used for the QP have to switch
from the base gid to the appropriate vlan gid.
When searching for a matching IB device port for a certain vlan
device, it does not make sense to return an IB device port, which
is not enabled for the used vlan_id. Add another check whether a
vlan gid exists for a certain IB device port.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Link confirmation will always be sent across the new link being
confirmed. This allows shrinking the parameter list.
No functional change.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
25 Jul, 2018
2 commits
-
Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL
as the third argument to ib_post_(send|recv|srq_recv)().
Signed-off-by: Bart Van Assche
Acked-by: Ursula Braun
Signed-off-by: Jason Gunthorpe
-
Remove a WARN_ON() statement that verifies something that is guaranteed
by the RDMA API, namely that the failed_wr pointer is not touched if an
ib_post_send() call succeeds and that it points at the failed wr if an
ib_post_send() call fails.
Signed-off-by: Bart Van Assche
Acked-by: Ursula Braun
Signed-off-by: Jason Gunthorpe
24 Jul, 2018
5 commits
-
The page map address is already stored in the RMB descriptor.
There is no need to derive it from the cpu_addr value.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Link group field tokens_used_mask is a bitmap. Use macro
DECLARE_BITMAP for its definition.
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Replace a frequently used construct with a more readable variant,
reducing the code. It might also come in handy when we start to support
more than a single link per link group.
Signed-off-by: Stefan Raspl
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
The functions to read and write cursors are exclusively used to copy
cursors. Therefore switch to a dedicated copy function instead.
Signed-off-by: Stefan Raspl
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
Rename field diag_fallback into diag_mode and set the smc mode of a
connection explicitly.
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
21 Jul, 2018
1 commit
-
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.
Signed-off-by: David S. Miller
19 Jul, 2018
4 commits
-
Pull networking fixes from David Miller:
"Lots of fixes, here goes:
1) NULL deref in qtnfmac, from Gustavo A. R. Silva.
2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.
3) Lost completion messages in AF_XDP, from Magnus Karlsson.
4) Correct bogus self-assignment in rhashtable, from Rishabh
Bhatnagar.
5) Fix regression in ipv6 route append handling, from David Ahern.
6) Fix masking in __set_phy_supported(), from Heiner Kallweit.
7) Missing module owner set in x_tables icmp, from Florian Westphal.
8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.
9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.
10) Fix NULL deref when using chains in act_csum, from Davide Caratti.
11) XDP_REDIRECT needs to check if the interface is up and whether the
MTU is sufficient. From Toshiaki Makita.
12) Net diag can do a double free when killing TCP_NEW_SYN_RECV
connections, from Lorenzo Colitti.
13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a
full minute, delaying device unregister. From Eric Dumazet.
14) Update MAC entries in the correct order in ixgbe, from Alexander
Duyck.
15) Don't leave partial mangled bpf program in jit_subprogs, from
Daniel Borkmann.
16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.
17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.
18) Use after free in tun XDP_TX, from Toshiaki Makita.
19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.
20) Don't reuse remainder of RX page when XDP is set in mlx4, from
Saeed Mahameed.
21) Fix window probe handling of TCP repair sockets, from Stefan
Baranoff.
22) Missing socket locking in smc_ioctl(), from Ursula Braun.
23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.
24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.
25) Two spots in ipv6 do a rol32() on a hash value but ignore the
result. Fixes from Colin Ian King"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
tcp: identify cryptic messages as TCP seq # bugs
ptp: fix missing break in switch
hv_netvsc: Fix napi reschedule while receive completion is busy
MAINTAINERS: Drop inactive Vitaly Bordug's email
net: cavium: Add fine-granular dependencies on PCI
net: qca_spi: Fix log level if probe fails
net: qca_spi: Make sure the QCA7000 reset is triggered
net: qca_spi: Avoid packet drop during initial sync
ipv6: fix useless rol32 call on hash
ipv6: sr: fix useless rol32 call on hash
net: sched: Using NULL instead of plain integer
net: usb: asix: replace mii_nway_restart in resume path
net: cxgb3_main: fix potential Spectre v1
lib/rhashtable: consider param->min_size when setting initial table size
net/smc: reset recv timeout after clc handshake
net/smc: add error handling for get_user()
net/smc: optimize consumer cursor updates
net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
ipv6: ila: select CONFIG_DST_CACHE
net: usb: rtl8150: demote allmulti message to dev_dbg()
...
-
During clc handshake the receive timeout is set to CLC_WAIT_TIME.
Remember and reset the original timeout value after the receive calls,
and remove a duplicate assignment of CLC_WAIT_TIME.
Signed-off-by: Karsten Graul
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
For security reasons the return code of get_user() should always be
checked.
Fixes: 01d2f7e2cdd31 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
Reported-by: Heiko Carstens
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
-
The SMC protocol requires sending a separate consumer cursor update
if it cannot be piggybacked onto updates of the producer cursor.
Currently the decision to send a separate consumer cursor update
only considers the amount of data already received by the socket
program. It does not consider the amount of data that has already arrived but
has not yet been consumed by the receiver. Basing the decision on the
difference between already confirmed and already arrived data
(instead of the difference between already confirmed and already consumed
data) may lead to a somewhat earlier consumer cursor update in
fast unidirectional traffic scenarios, and thus to better throughput.
Signed-off-by: Ursula Braun
Suggested-by: Thomas Richter
Signed-off-by: David S. Miller
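The change in the decision basis can be sketched with plain byte counters. This is a simplified illustration: the demo_* names and the threshold parameter are hypothetical, and the real code works on wrapping cursors in a ring buffer rather than on monotonically growing counters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receive-side counters; confirmed <= consumed <= arrived. */
struct demo_rx {
    size_t confirmed; /* bytes already confirmed to the peer */
    size_t consumed;  /* bytes already read by the socket program */
    size_t arrived;   /* bytes the peer has written into the receive buffer */
};

/* Old basis: only data the application has already consumed counts. */
bool demo_update_needed_old(const struct demo_rx *rx, size_t threshold)
{
    return rx->consumed - rx->confirmed > threshold;
}

/* New basis: data that has arrived but is not yet consumed counts as well. */
bool demo_update_needed_new(const struct demo_rx *rx, size_t threshold)
{
    return rx->arrived - rx->confirmed > threshold;
}
```

For a receiver that has consumed 10 bytes while 100 bytes have arrived, a threshold of 50 triggers the update only under the new basis, which is the "somewhat earlier" send the message refers to.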
17 Jul, 2018
1 commit
-
SMC ioctl processing requires the sock lock to work properly in
all conceivable scenarios.
The problem was found with RaceFuzzer; this fixes:
KASAN: null-ptr-deref Read in smc_ioctl
Reported-by: Byoungyoung Lee
Reported-by: syzbot+35b2c5aa76fd398b9fd4@syzkaller.appspotmail.com
Signed-off-by: Ursula Braun
Reviewed-by: Stefano Brivio
Signed-off-by: David S. Miller