Eric Lee / smarc-fsl-linux-kernel

30 Dec, 2020

2 commits

35f71f3cb xprtrdma: Fix XDRBUF_SPARSE_PAGES support

commit 15261b9126cd5bb2ad8521da49d8f5c042d904c7 upstream.

Olga K. observed that rpcrdma_marshal_req() allocates sparse pages
only when it has determined that a Reply chunk is necessary. There
are plenty of cases where no Reply chunk is needed, but the
XDRBUF_SPARSE_PAGES flag is set. The result would be a crash in
rpcrdma_inline_fixup() when it tries to copy parts of the received
Reply into a missing page.

To avoid crashing, handle sparse page allocation up front.

Until XATTR support was added, this issue did not appear often
because the only SPARSE_PAGES consumer always expected a reply large
enough to always require a Reply chunk.
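
As a rough userspace illustration of "handle sparse page allocation up front", the sketch below fills every hole in a page array before any reply data could be copied into it. The names `sparse_buf`, `sparse_buf_prepare()` and `PAGE_SIZE_BYTES` are hypothetical and are not the kernel's data structures or the code of this patch.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES 4096u   /* hypothetical page size for this sketch */

/* Hypothetical stand-in for a receive buffer whose page array may have holes. */
struct sparse_buf {
    void **pages;     /* entries may be NULL until populated */
    size_t npages;
    int sparse;       /* plays the role of a SPARSE_PAGES-style flag */
};

/*
 * Populate every missing page before a reply can be copied in, regardless
 * of whether a Reply chunk will be used. Returns 0 on success, -1 if an
 * allocation fails.
 */
int sparse_buf_prepare(struct sparse_buf *buf)
{
    assert(buf != NULL);

    if (!buf->sparse)
        return 0;

    for (size_t i = 0; i < buf->npages; i++) {
        if (buf->pages[i] != NULL)
            continue;
        buf->pages[i] = calloc(1, PAGE_SIZE_BYTES);
        if (buf->pages[i] == NULL)
            return -1;
    }
    return 0;
}
```

Doing the allocation unconditionally means a later copy step never has to check for a missing page.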

Reported-by: Olga Kornievskaia
Signed-off-by: Chuck Lever
Cc:
Signed-off-by: Trond Myklebust
Signed-off-by: Greg Kroah-Hartman

Chuck Lever
2020-12-30 18:54:15 +0800

c1e628f91 SUNRPC: xprt_load_transport() needs to support the netid "rdma6"

[ Upstream commit d5aa6b22e2258f05317313ecc02efbb988ed6d38 ]

According to RFC5666, the correct netid for an IPv6 addressed RDMA
transport is "rdma6", which we've supported as a mount option since
Linux-4.7. The problem is when we try to load the module "xprtrdma6",
that will fail, since there is no modulealias of that name.

Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin

Trond Myklebust
2020-12-30 18:53:30 +0800

23 Oct, 2020

1 commit

24717cfbb Merge tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"The one new feature this time, from Anna Schumaker, is READ_PLUS,
which has the same arguments as READ but allows the server to return
an array of data and hole extents.

Otherwise it's a lot of cleanup and bugfixes"

* tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
sunrpc: raise kernel RPC channel buffer size
svcrdma: fix bounce buffers for unaligned offsets and multiple pages
nfsd: remove unneeded break
net/sunrpc: Fix return value for sysctl sunrpc.transports
NFSD: Encode a full READ_PLUS reply
NFSD: Return both a hole and a data segment
NFSD: Add READ_PLUS hole segment encoding
NFSD: Add READ_PLUS data support
NFSD: Hoist status code encoding into XDR encoder functions
NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
NFSD: Remove the RETURN_STATUS() macro
NFSD: Call NFSv2 encoders on error returns
NFSD: Fix .pc_release method for NFSv2
NFSD: Remove vestigial typedefs
NFSD: Refactor nfsd_dispatch() error paths
NFSD: Clean up nfsd_dispatch() variables
NFSD: Clean up stale comments in nfsd_dispatch()
NFSD: Clean up switch statement in nfsd_dispatch()
...

Linus Torvalds
2020-10-23 00:44:27 +0800

17 Oct, 2020

1 commit

c327a310e svcrdma: fix bounce buffers for unaligned offsets and multiple pages

This was discovered using O_DIRECT at the client side, with small
unaligned file offsets or IOs that span multiple file pages.

Fixes: e248aa7be86 ("svcrdma: Remove max_sge check at connect time")
Signed-off-by: Dan Aloni
Signed-off-by: J. Bruce Fields

Dan Aloni
2020-10-17 03:15:04 +0800

26 Sep, 2020

1 commit

1cc5213ba net: sunrpc: delete repeated words

Drop duplicate words in net/sunrpc/.
Also fix "Anyone" to be "Any one".

Signed-off-by: Randy Dunlap
Cc: "J. Bruce Fields"
Cc: Chuck Lever
Cc: linux-nfs@vger.kernel.org
Signed-off-by: J. Bruce Fields

Randy Dunlap
2020-09-26 06:01:26 +0800

22 Sep, 2020

1 commit

ed38c33f1 xprtrdma: drop double zeroing

sg_init_table zeroes its first argument, so the allocation of that argument
doesn't have to.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

//
@@
expression x,n,flags;
@@

x =
- kcalloc
+ kmalloc_array
(n,sizeof(*x),flags)
...
sg_init_table(x,n)
//
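
The same idea can be sketched in userspace C. Here `table_init()` is a hypothetical analogue of sg_init_table() that clears the whole array itself, and `alloc_array()` is a hypothetical overflow-checked, non-zeroing allocator standing in for kmalloc_array(); neither is a kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    unsigned long link;
    unsigned int offset;
    unsigned int length;
};

/* Hypothetical analogue of sg_init_table(): zeroes the table it is given. */
void table_init(struct entry *tbl, size_t n)
{
    assert(tbl != NULL);
    memset(tbl, 0, n * sizeof(*tbl));
}

/* Overflow-checked array allocation that does NOT zero the memory. */
void *alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    return malloc(n * size);
}

/* The initializer zeroes the array, so a zeroing allocator is redundant. */
struct entry *table_create(size_t n)
{
    struct entry *tbl = alloc_array(n, sizeof(*tbl));

    if (tbl == NULL)
        return NULL;
    table_init(tbl, n);
    return tbl;
}
```

The table is still fully zeroed exactly once, by the initializer rather than by the allocator.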

Signed-off-by: Julia Lawall
Acked-by: Chuck Lever
Signed-off-by: Anna Schumaker

Julia Lawall
2020-09-22 00:15:25 +0800

21 Sep, 2020

3 commits

ac1ae5342 SUNRPC: Hoist trace_xprtrdma_op_setport into generic code

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-09-21 22:21:09 +0800

780694875 SUNRPC: Remove debugging instrumentation from xprt_release

These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384
iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-09-21 22:21:08 +0800

06e234c61 SUNRPC: Hoist trace_xprtrdma_op_allocate into generic code

Introduce a tracepoint in call_allocate that reports the exact
sizes in the RPC buffer allocation request and the status of the
result. This helps catch problems with XDR buffer provisioning,
and replaces transport-specific debugging instrumentation.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-09-21 22:21:08 +0800

10 Sep, 2020

1 commit

ab29a807a Merge tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

- Fix an NFS/RDMA resource leak

- Fix the error handling during delegation recall

- NFSv4.0 needs to return the delegation on a zero-stateid SETATTR

- Stop printk reading past end of string

* tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: stop printk reading past end of string
NFS: Zero-stateid SETATTR should first return delegation
NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
xprtrdma: Release in-flight MRs on disconnect

Linus Torvalds
2020-09-10 02:14:20 +0800

27 Aug, 2020

1 commit

5de55ce95 xprtrdma: Release in-flight MRs on disconnect

Dan Aloni reports that when a server disconnects abruptly, a few
memory regions are left DMA mapped. Over time this leak could pin
enough I/O resources to slow or even deadlock an NFS/RDMA client.

I found that if a transport disconnects before pending Send and
FastReg WRs can be posted, the to-be-registered MRs are stranded on
the req's rl_registered list and never released -- since they
weren't posted, there's no Send completion to DMA unmap them.

Reported-by: Dan Aloni
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-08-27 03:29:21 +0800

24 Aug, 2020

1 commit

df561f668 treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
markings where they are unnecessary.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
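
For readers outside the kernel tree, a minimal sketch of the pattern follows. The `fallthrough` definition here is an assumption written for this example (an attribute where the compiler supports it, an empty statement otherwise); the kernel supplies its own definition, and `count_steps()` is a hypothetical function.

```c
#include <assert.h>

/* Assumed definition for this sketch only; the kernel provides its own. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)   /* fallback: plain empty statement */
#endif

/* Each case deliberately continues into the next one. */
int count_steps(int level)
{
    int steps = 0;

    assert(level >= 0);
    switch (level) {
    case 2:
        steps++;
        fallthrough;    /* previously: a fall through comment */
    case 1:
        steps++;
        fallthrough;
    case 0:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Unlike a comment, the statement form is visible to the compiler, so an unintended fall-through can be diagnosed while the intended ones stay quiet.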

Signed-off-by: Gustavo A. R. Silva

Gustavo A. R. Silva
2020-08-24 06:36:59 +0800

10 Aug, 2020

1 commit

7a6b60441 Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull NFS server updates from Chuck Lever:
"Highlights:
- Support for user extended attributes on NFS (RFC 8276)
- Further reduce unnecessary NFSv4 delegation recalls

Notable fixes:
- Fix recent krb5p regression
- Address a few resource leaks and a rare NULL dereference

Other:
- De-duplicate RPC/RDMA error handling and other utility functions
- Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
svcrdma: CM event handler clean up
svcrdma: Remove transport reference counting
svcrdma: Fix another Receive buffer leak
SUNRPC: Refresh the show_rqstp_flags() macro
nfsd: netns.h: delete a duplicated word
SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
nfsd: avoid a NULL dereference in __cld_pipe_upcall()
nfsd4: a client's own opens needn't prevent delegations
nfsd: Use seq_putc() in two functions
svcrdma: Display chunk completion ID when posting a rw_ctxt
svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
svcrdma: Introduce Send completion IDs
svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
svcrdma: Introduce Receive completion IDs
svcrdma: Introduce infrastructure to support completion IDs
svcrdma: Add common XDR encoders for RDMA and Read segments
svcrdma: Add common XDR decoders for RDMA and Read segments
SUNRPC: Add helpers for decoding list discriminators symbolically
svcrdma: Remove declarations for functions long removed
svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
...

Linus Torvalds
2020-08-10 04:58:04 +0800

28 Jul, 2020

3 commits

b297fed69 svcrdma: CM event handler clean up

Now that there's a core tracepoint that reports these events, there's
no need to maintain dprintk() call sites in each arm of the switch
statements.

We also refresh the documenting comments.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-28 22:18:15 +0800

365e9992b svcrdma: Remove transport reference counting

Jason tells me that a ULP cannot rely on getting an ESTABLISHED
and DISCONNECTED event pair for each connection, so transport
reference counting in the CM event handler will never be reliable.

Now that we have ib_drain_qp(), svcrdma should no longer need to
hold transport references while Sends and Receives are posted. So
remove the get/put call sites in the CM event handlers.

This eliminates a significant source of locked memory bus traffic.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-28 22:18:14 +0800

64d264225 svcrdma: Fix another Receive buffer leak

During a connection tear down, the Receive queue is flushed before
the device resources are freed. Typically, all the Receives flush
with IB_WR_FLUSH_ERR.

However, any pending successful Receives flush with IB_WR_SUCCESS,
and the server automatically posts a fresh Receive to replace the
completing one. This happens even after the connection has closed
and the RQ is drained. Receives that are posted after the RQ is
drained appear never to complete, causing a Receive resource leak.
The leaked Receive buffer is left DMA-mapped.

To prevent these late-posted recv_ctxt's from leaking, block new
Receive posting after XPT_CLOSE is set.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-28 22:18:13 +0800

16 Jul, 2020

1 commit

912288442 xprtrdma: fix incorrect header size calculations

Currently the header size calculations use an assignment operator
instead of a += operator when accumulating the header size, which
leads to incorrect sizes. Fix this by using the correct operator.
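
The class of bug can be shown in a few lines. The sizes and function names below (`FIXED_HDR`, `SEGMENT`, `hdr_size_buggy()`, `hdr_size_fixed()`) are hypothetical values chosen for the illustration, not the real RPC/RDMA header layout or the patched code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-part sizes in bytes, for illustration only. */
enum { FIXED_HDR = 28, SEGMENT = 16, LIST_TERMINATOR = 4, MAX_SEGS = 8 };

/* Buggy: the second statement overwrites the running total. */
size_t hdr_size_buggy(unsigned int nsegs)
{
    size_t size;

    assert(nsegs <= MAX_SEGS);
    size = FIXED_HDR;
    size = nsegs * SEGMENT;     /* "=" discards FIXED_HDR */
    size += LIST_TERMINATOR;
    return size;
}

/* Fixed: every part is accumulated with "+=". */
size_t hdr_size_fixed(unsigned int nsegs)
{
    size_t size;

    assert(nsegs <= MAX_SEGS);
    size = FIXED_HDR;
    size += nsegs * SEGMENT;
    size += LIST_TERMINATOR;
    return size;
}
```

With two segments the buggy version reports 36 bytes and the fixed version 64; the value stored by the first assignment in the buggy version is never read, which is the kind of dead store an "Unused value" report points at.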

Addresses-Coverity: ("Unused value")
Fixes: 302d3deb2068 ("xprtrdma: Prevent inline overflow")
Signed-off-by: Colin Ian King
Reviewed-by: Chuck Lever
Signed-off-by: Anna Schumaker

Colin Ian King
2020-07-16 01:01:01 +0800

14 Jul, 2020

16 commits

6787f0bea svcrdma: Display chunk completion ID when posting a rw_ctxt

Re-use the post_rw tracepoint (safely) to trace cc_info lifetime
events, including completion IDs.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

17f70f8dd svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()

First, refactor: Dereference the svc_rdma_send_ctxt inside
svc_rdma_send() instead of at every call site.

Then, it can be passed into trace_svcrdma_post_send() to get the
proper completion ID.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

3ac56c2fb svcrdma: Introduce Send completion IDs

Set up a completion ID in each svc_rdma_send_ctxt. The ID is used
to match an incoming Send completion to a transport and to a
previous ib_post_send().

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

007140ee9 svcrdma: Record Receive completion ID in svc_rdma_decode_rqst

When recording a trace event in the Receive path, tie decoding
results and errors to an incoming Receive completion.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

9b3bcf8c5 svcrdma: Introduce Receive completion IDs

Set up a completion ID in each svc_rdma_recv_ctxt. The ID is used
to match an incoming Receive completion to a transport and to a
previous ib_post_recv().

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

379c3bc6b svcrdma: Add common XDR encoders for RDMA and Read segments

Clean up: De-duplicate some code.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

f60a08697 svcrdma: Add common XDR decoders for RDMA and Read segments

Clean up: De-duplicate some code.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

07e9a6325 SUNRPC: Add helpers for decoding list discriminators symbolically

Use these helpers in a few spots to demonstrate their use.

The remaining open-coded discriminator checks in rpcrdma will be
addressed in subsequent patches.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

3f8f25c69 svcrdma: Clean up trace_svcrdma_send_failed() tracepoint

- Use the _err naming convention instead
- Remove display of kernel memory address of the controlling xprt

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

ba6cc9773 svcrdma: Consolidate send_error helper functions

Final refactor: Replace internals of svc_rdma_send_error() with a
simple call to svc_rdma_send_error_msg().

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

c65b326b1 svcrdma: Make svc_rdma_send_error_msg() a global function

Prepare for svc_rdma_send_error_msg() to be invoked from another
source file.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

605c61bee svcrdma: Eliminate return value for svc_rdma_send_error_msg()

Like svc_rdma_send_error(), have svc_rdma_send_error_msg() handle
any error conditions internally, rather than duplicating that
recovery logic at every call site.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

4f200bd8a svcrdma: Add a @status parameter to svc_rdma_send_error_msg()

The common "send RDMA_ERR" function should be in svc_rdma_sendto.c,
since that is where the other Send-related functions are located.
So from here, I will beef up svc_rdma_send_error_msg() and deprecate
svc_rdma_send_error().

A generic svc_rdma_send_error_msg() will need to handle both
ERR_CHUNK and ERR_VERS. Copy that logic from svc_rdma_send_error()
to svc_rdma_send_error_msg().

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

d1f6e2369 svcrdma: Add @rctxt parameter to svc_rdma_send_error() functions

Another step towards making svc_rdma_send_error_msg() and
svc_rdma_send_error() similar enough to eliminate one of them.

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

6e9fab707 svcrdma: Remove save_io_pages() call from send_error_msg()

Commit 4757d90b15d8 ("svcrdma: Report Write/Reply chunk overruns")
made an effort to preserve I/O pages until RDMA Write completion.

In a subsequent patch, I intend to de-duplicate the two functions
that send ERR_CHUNK responses. Pull the save_io_pages() call out of
svc_rdma_send_error_msg() to make it more like
svc_rdma_send_error().

Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

e814eecbe svcrdma: Fix page leak in svc_rdma_recv_read_chunk()

Commit 07d0ff3b0cd2 ("svcrdma: Clean up Read chunk path") moved the
page saver logic so that it gets executed even when an error occurs.
In that case, the I/O is never posted, and those pages are then
leaked. Errors in this path, however, are quite rare.

Fixes: 07d0ff3b0cd2 ("svcrdma: Clean up Read chunk path")
Signed-off-by: Chuck Lever

Chuck Lever
2020-07-14 05:28:24 +0800

13 Jul, 2020

4 commits

af667527b xprtrdma: Fix handling of connect errors

Ensure that the connect worker is awoken if an attempt to establish
a connection is unsuccessful. Otherwise the worker waits forever
and the transport workload hangs.

Connect errors should not attempt to destroy the ep, since the
connect worker continues to use it after the handler runs, so these
errors are now handled independently of DISCONNECTED events.

Reported-by: Dan Aloni
Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-07-13 22:50:41 +0800

dda9a951d xprtrdma: Fix return code from rpcrdma_xprt_connect()

I noticed that when rpcrdma_xprt_connect() returns -ENOMEM,
instead of retrying the connect, the RPC client kills the
RPC task that requested the connection. We want a retry
here.

Fixes: cb586decbb88 ("xprtrdma: Make sendctx queue lifetime the same as connection lifetime")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-07-13 22:50:41 +0800

4cf44be6f xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()

Both Dan and I have observed two processes invoking
rpcrdma_xprt_disconnect() concurrently. In my case:

1. The connect worker invokes rpcrdma_xprt_disconnect(), which
drains the QP and waits for the final completion
2. This causes the newly posted Receive to flush and invoke
xprt_force_disconnect()
3. xprt_force_disconnect() sets CLOSE_WAIT and wakes up the RPC task
that is holding the transport lock
4. The RPC task invokes xprt_connect(), which calls ->ops->close
5. xprt_rdma_close() invokes rpcrdma_xprt_disconnect(), which tries
to destroy the QP.

Deadlock.

To prevent xprt_force_disconnect() from waking anything, handle the
clean up after a failed connection attempt in the xprt's sndtask.

The retry loop is removed from rpcrdma_xprt_connect() to ensure
that the newly allocated ep and id are properly released before
a REJECTED connection attempt can be retried.

Reported-by: Dan Aloni
Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-07-13 22:50:41 +0800

85bfd71bc xprtrdma: Fix double-free in rpcrdma_ep_create()

In the error paths, there's no need to call kfree(ep) after calling
rpcrdma_ep_put(ep).
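
Why the extra kfree() is wrong can be seen in a small reference-counting sketch. Everything here (`endpoint`, `endpoint_put()`, `endpoint_setup()`, the `endpoints_freed` counter) is hypothetical userspace code written for the illustration, assuming a "put" that frees the object when the last reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

struct endpoint {
    unsigned int refcount;
};

static unsigned int endpoints_freed;   /* instrumentation for this sketch */

struct endpoint *endpoint_create(void)
{
    struct endpoint *ep = calloc(1, sizeof(*ep));

    if (ep != NULL)
        ep->refcount = 1;
    return ep;
}

/* Drop one reference; the final put frees the object. Returns 1 if freed. */
int endpoint_put(struct endpoint *ep)
{
    assert(ep != NULL && ep->refcount > 0);
    if (--ep->refcount > 0)
        return 0;
    free(ep);
    endpoints_freed++;
    return 1;
}

/* Setup with an error path: the put alone is the whole cleanup. */
int endpoint_setup(int simulate_failure)
{
    struct endpoint *ep = endpoint_create();

    if (ep == NULL)
        return -1;
    if (simulate_failure) {
        endpoint_put(ep);   /* releases the last reference and frees ep */
        /* No free(ep) here: that would free the same memory twice. */
        return -1;
    }
    endpoint_put(ep);       /* normal teardown, kept trivial for the sketch */
    return 0;
}
```

Once the put has released the last reference, the pointer is dangling, so any further free on it is a double free.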

Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-07-13 22:50:41 +0800

22 Jun, 2020

3 commits

7b2182ec3 xprtrdma: Fix handling of RDMA_ERROR replies

The RPC client currently doesn't handle ERR_CHUNK replies correctly.
rpcrdma_complete_rqst() incorrectly passes a negative number to
xprt_complete_rqst() as the number of bytes copied. Instead, set
task->tk_status to the error value, and return zero bytes copied.

In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
finite state machine doesn't know what to do with -EREMOTEIO.
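
The shape of the fix can be sketched as follows: a byte count is never allowed to carry an error, and the error travels in the task status instead. `task` and `finish_reply()` are hypothetical stand-ins for this example, not the RPC client's structures or functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an RPC task with a status field. */
struct task {
    int status;
};

/*
 * Turn a decode result into (status, bytes copied). A negative result is
 * an error: it is reported as -EIO in the task status, and the number of
 * bytes copied is zero rather than a negative value.
 */
size_t finish_reply(struct task *task, int decode_result)
{
    assert(task != NULL);

    if (decode_result < 0) {
        task->status = -EIO;
        return 0;
    }
    task->status = 0;
    return (size_t)decode_result;
}
```

Keeping the count non-negative matters because a negative value interpreted as a length would look like an enormous number of bytes copied.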

Additional clean ups:
- Don't double-count RDMA_ERROR replies
- Remove a stale comment

Signed-off-by: Chuck Lever
Cc:
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-22 21:34:35 +0800

c487eb7d8 xprtrdma: Clean up disconnect

1. Ensure that only rpcrdma_cm_event_handler() modifies
ep->re_connect_status to avoid racy changes to that field.

2. Ensure that xprt_force_disconnect() is invoked only once as a
transport is closed or destroyed.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-22 21:34:35 +0800

f423f755f xprtrdma: Clean up synopsis of rpcrdma_flush_disconnect()

Refactor: Pass struct rpcrdma_xprt instead of an IB layer object.

Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2020-06-22 21:34:35 +0800