30 Dec, 2020

2 commits

  • commit 15261b9126cd5bb2ad8521da49d8f5c042d904c7 upstream.

    Olga K. observed that rpcrdma_marsh_req() allocates sparse pages
    only when it has determined that a Reply chunk is necessary. There
    are plenty of cases where no Reply chunk is needed, but the
    XDRBUF_SPARSE_PAGES flag is set. The result would be a crash in
    rpcrdma_inline_fixup() when it tries to copy parts of the received
    Reply into a missing page.

    To avoid crashing, handle sparse page allocation up front.

    Until XATTR support was added, this issue did not appear often
    because the only SPARSE_PAGES consumer always expected a reply large
    enough to require a Reply chunk.

    Reported-by: Olga Kornievskaia
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Chuck Lever
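
The "allocate up front" idea in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: `fill_sparse_pages()` and `PAGE_BYTES` are hypothetical names, and `calloc()` stands in for the kernel page allocator. The point is only that every empty slot is filled before any later step copies into the page array.

```c
#include <errno.h>
#include <stdlib.h>

#define PAGE_BYTES 4096	/* hypothetical page size for this sketch */

/*
 * Fill every empty slot of a sparse page array before it is used, so a
 * later copy into pages[i] never finds a missing page. Slots that are
 * already populated are left untouched.
 * Returns 0 on success or -ENOMEM if an allocation fails.
 */
int fill_sparse_pages(void **pages, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		if (pages[i])
			continue;
		pages[i] = calloc(1, PAGE_BYTES);
		if (!pages[i])
			return -ENOMEM;
	}
	return 0;
}
```

A caller would run this once, before deciding whether a Reply chunk is needed, rather than only on the Reply chunk path.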
     
  • [ Upstream commit d5aa6b22e2258f05317313ecc02efbb988ed6d38 ]

    According to RFC 5666, the correct netid for an IPv6 addressed RDMA
    transport is "rdma6", which we've supported as a mount option since
    Linux-4.7. The problem is that the attempt to load the module
    "xprtrdma6" fails, since there is no module alias of that name.

    Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Sasha Levin

    Trond Myklebust
     

23 Oct, 2020

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The one new feature this time, from Anna Schumaker, is READ_PLUS,
    which has the same arguments as READ but allows the server to return
    an array of data and hole extents.

    Otherwise it's a lot of cleanup and bugfixes"

    * tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
    NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
    SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
    sunrpc: raise kernel RPC channel buffer size
    svcrdma: fix bounce buffers for unaligned offsets and multiple pages
    nfsd: remove unneeded break
    net/sunrpc: Fix return value for sysctl sunrpc.transports
    NFSD: Encode a full READ_PLUS reply
    NFSD: Return both a hole and a data segment
    NFSD: Add READ_PLUS hole segment encoding
    NFSD: Add READ_PLUS data support
    NFSD: Hoist status code encoding into XDR encoder functions
    NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
    NFSD: Remove the RETURN_STATUS() macro
    NFSD: Call NFSv2 encoders on error returns
    NFSD: Fix .pc_release method for NFSv2
    NFSD: Remove vestigial typedefs
    NFSD: Refactor nfsd_dispatch() error paths
    NFSD: Clean up nfsd_dispatch() variables
    NFSD: Clean up stale comments in nfsd_dispatch()
    NFSD: Clean up switch statement in nfsd_dispatch()
    ...

    Linus Torvalds
     

17 Oct, 2020

1 commit


26 Sep, 2020

1 commit


22 Sep, 2020

1 commit

  • sg_init_table zeroes its first argument, so the allocation of that argument
    doesn't have to.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @@
    expression x,n,flags;
    @@

    x =
    - kcalloc
    + kmalloc_array
    (n,sizeof(*x),flags)
    ...
    sg_init_table(x,n)
    //

    Signed-off-by: Julia Lawall
    Acked-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Julia Lawall
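
A user-space analogue of the transformation above, as a rough sketch: `struct entry`, `entry_table_init()`, and `alloc_array()` are hypothetical stand-ins for `struct scatterlist`, `sg_init_table()`, and `kmalloc_array()`. Because the init helper zeroes the table itself, the allocator does not also need to zero it.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct scatterlist. */
struct entry {
	void *page;
	unsigned int offset;
	unsigned int length;
};

/* Stand-in for sg_init_table(): zeroes the whole table itself. */
void entry_table_init(struct entry *tbl, size_t n)
{
	memset(tbl, 0, n * sizeof(*tbl));
}

/* Stand-in for kmalloc_array(): overflow-checked, but NOT zeroed. */
void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

/* Allocate without zeroing, then let the init helper zero once. */
struct entry *entry_table_alloc(size_t n)
{
	struct entry *tbl = alloc_array(n, sizeof(*tbl));

	if (!tbl)
		return NULL;
	entry_table_init(tbl, n);
	return tbl;
}
```

The saving is one redundant pass over the memory per allocation; the overflow check that a zeroing array allocator provides is kept.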
     

21 Sep, 2020

3 commits


10 Sep, 2020

1 commit

  • Pull NFS client bugfixes from Trond Myklebust:

    - Fix an NFS/RDMA resource leak

    - Fix the error handling during delegation recall

    - NFSv4.0 needs to return the delegation on a zero-stateid SETATTR

    - Stop printk reading past end of string

    * tag 'nfs-for-5.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: stop printk reading past end of string
    NFS: Zero-stateid SETATTR should first return delegation
    NFSv4.1 handle ERR_DELAY error reclaiming locking state on delegation recall
    xprtrdma: Release in-flight MRs on disconnect

    Linus Torvalds
     

27 Aug, 2020

1 commit

  • Dan Aloni reports that when a server disconnects abruptly, a few
    memory regions are left DMA mapped. Over time this leak could pin
    enough I/O resources to slow or even deadlock an NFS/RDMA client.

    I found that if a transport disconnects before pending Send and
    FastReg WRs can be posted, the to-be-registered MRs are stranded on
    the req's rl_registered list and never released -- since they
    weren't posted, there's no Send completion to DMA unmap them.

    Reported-by: Dan Aloni
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
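
For readers unfamiliar with the pseudo-keyword, here is a minimal stand-alone sketch. The macro definition approximates the kernel's (an attribute where the compiler supports it, an empty statement otherwise); `stages_remaining()` is a hypothetical function invented only to show the usage.

```c
/* Compilers without __has_attribute get the empty-statement fallback. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(__fallthrough__)
#define fallthrough __attribute__((__fallthrough__))
#else
#define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count the stages left to run from "stage". */
int stages_remaining(int stage)
{
	int n = 0;

	switch (stage) {
	case 0:
		n++;
		fallthrough;	/* was: fall through comment */
	case 1:
		n++;
		fallthrough;
	case 2:
		n++;
		break;
	default:
		break;
	}
	return n;
}
```

Unlike a comment, the attribute form is visible to the compiler, so implicit fall-through warnings can tell a deliberate fall-through from a forgotten break.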
     

10 Aug, 2020

1 commit

  • Pull NFS server updates from Chuck Lever:
    "Highlights:
    - Support for user extended attributes on NFS (RFC 8276)
    - Further reduce unnecessary NFSv4 delegation recalls

    Notable fixes:
    - Fix recent krb5p regression
    - Address a few resource leaks and a rare NULL dereference

    Other:
    - De-duplicate RPC/RDMA error handling and other utility functions
    - Replace storage and display of kernel memory addresses by tracepoints"

    * tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
    svcrdma: CM event handler clean up
    svcrdma: Remove transport reference counting
    svcrdma: Fix another Receive buffer leak
    SUNRPC: Refresh the show_rqstp_flags() macro
    nfsd: netns.h: delete a duplicated word
    SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
    nfsd: avoid a NULL dereference in __cld_pipe_upcall()
    nfsd4: a client's own opens needn't prevent delegations
    nfsd: Use seq_putc() in two functions
    svcrdma: Display chunk completion ID when posting a rw_ctxt
    svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
    svcrdma: Introduce Send completion IDs
    svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
    svcrdma: Introduce Receive completion IDs
    svcrdma: Introduce infrastructure to support completion IDs
    svcrdma: Add common XDR encoders for RDMA and Read segments
    svcrdma: Add common XDR decoders for RDMA and Read segments
    SUNRPC: Add helpers for decoding list discriminators symbolically
    svcrdma: Remove declarations for functions long removed
    svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
    ...

    Linus Torvalds
     

28 Jul, 2020

3 commits

  • Now that there's a core tracepoint that reports these events, there's
    no need to maintain dprintk() call sites in each arm of the switch
    statements.

    We also refresh the documenting comments.

    Signed-off-by: Chuck Lever

    Chuck Lever
     
  • Jason tells me that a ULP cannot rely on getting an ESTABLISHED
    and DISCONNECTED event pair for each connection, so transport
    reference counting in the CM event handler will never be reliable.

    Now that we have ib_drain_qp(), svcrdma should no longer need to
    hold transport references while Sends and Receives are posted. So
    remove the get/put call sites in the CM event handlers.

    This eliminates a significant source of locked memory bus traffic.

    Signed-off-by: Chuck Lever

    Chuck Lever
     
  • During a connection tear down, the Receive queue is flushed before
    the device resources are freed. Typically, all the Receives flush
    with IB_WR_FLUSH_ERR.

    However, any pending successful Receives flush with IB_WR_SUCCESS,
    and the server automatically posts a fresh Receive to replace the
    completing one. This happens even after the connection has closed
    and the RQ is drained. Receives that are posted after the RQ is
    drained appear never to complete, causing a Receive resource leak.
    The leaked Receive buffer is left DMA-mapped.

    To prevent these late-posted recv_ctxt's from leaking, block new
    Receive posting after XPT_CLOSE is set.

    Signed-off-by: Chuck Lever

    Chuck Lever
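
The guard described in the last commit above can be sketched as follows. This is a simplified user-space model, not svcrdma code: `struct transport`, its `closing` flag (standing in for XPT_CLOSE), and `post_receive()` are all hypothetical, and the real transport needs stronger ordering between the close path and the posting path than this check alone provides.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical transport state; "closing" models the XPT_CLOSE flag. */
struct transport {
	atomic_bool closing;
	int posted;	/* Receives currently posted */
};

/*
 * Refuse to post a replacement Receive once the transport is closing,
 * so nothing is posted after the queue has been drained.
 * Returns true if a Receive was posted.
 */
bool post_receive(struct transport *xprt)
{
	if (atomic_load(&xprt->closing))
		return false;
	xprt->posted++;
	return true;
}
```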
     

16 Jul, 2020

1 commit

  • Currently the header size calculations use an assignment operator
    instead of a += operator when accumulating the header size, which
    leads to incorrect sizes. Fix this by using the correct operator.

    Addresses-Coverity: ("Unused value")
    Fixes: 302d3deb2068 ("xprtrdma: Prevent inline overflow")
    Signed-off-by: Colin Ian King
    Reviewed-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Colin Ian King
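
The class of bug fixed above is easy to reproduce in isolation. In this sketch the constants and function names are hypothetical, not the real RPC/RDMA header sizes; it only contrasts plain assignment with accumulation.

```c
#include <stddef.h>

/* Hypothetical sizes in bytes; not the real RPC/RDMA constants. */
enum {
	FIXED_HDR_BYTES = 28,
	SEGMENT_BYTES = 16,
};

/* Buggy pattern: "=" throws away the size accumulated so far. */
size_t hdr_size_buggy(unsigned int nsegs)
{
	size_t size = FIXED_HDR_BYTES;

	size = nsegs * SEGMENT_BYTES;	/* fixed header part is lost */
	return size;
}

/* Fixed pattern: "+=" adds the segments to the running total. */
size_t hdr_size_fixed(unsigned int nsegs)
{
	size_t size = FIXED_HDR_BYTES;

	size += nsegs * SEGMENT_BYTES;
	return size;
}
```

With two segments the buggy version reports 32 bytes where the fixed version reports 60, so any buffer sized from the former would be too small.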
     

14 Jul, 2020

16 commits


13 Jul, 2020

4 commits

  • Ensure that the connect worker is awoken if an attempt to establish
    a connection is unsuccessful. Otherwise the worker waits forever
    and the transport workload hangs.

    Connect errors should not attempt to destroy the ep, since the
    connect worker continues to use it after the handler runs, so these
    errors are now handled independently of DISCONNECTED events.

    Reported-by: Dan Aloni
    Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • I noticed that when rpcrdma_xprt_connect() returns -ENOMEM,
    instead of retrying the connect, the RPC client kills the
    RPC task that requested the connection. We want a retry
    here.

    Fixes: cb586decbb88 ("xprtrdma: Make sendctx queue lifetime the same as connection lifetime")
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Both Dan and I have observed two processes invoking
    rpcrdma_xprt_disconnect() concurrently. In my case:

    1. The connect worker invokes rpcrdma_xprt_disconnect(), which
    drains the QP and waits for the final completion
    2. This causes the newly posted Receive to flush and invoke
    xprt_force_disconnect()
    3. xprt_force_disconnect() sets CLOSE_WAIT and wakes up the RPC task
    that is holding the transport lock
    4. The RPC task invokes xprt_connect(), which calls ->ops->close
    5. xprt_rdma_close() invokes rpcrdma_xprt_disconnect(), which tries
    to destroy the QP.

    Deadlock.

    To prevent xprt_force_disconnect() from waking anything, handle the
    clean up after a failed connection attempt in the xprt's sndtask.

    The retry loop is removed from rpcrdma_xprt_connect() to ensure
    that the newly allocated ep and id are properly released before
    a REJECTED connection attempt can be retried.

    Reported-by: Dan Aloni
    Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • In the error paths, there's no need to call kfree(ep) after calling
    rpcrdma_ep_put(ep).

    Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

22 Jun, 2020

3 commits

  • The RPC client currently doesn't handle ERR_CHUNK replies correctly.
    rpcrdma_complete_rqst() incorrectly passes a negative number to
    xprt_complete_rqst() as the number of bytes copied. Instead, set
    task->tk_status to the error value, and return zero bytes copied.

    In these cases, return -EIO rather than -EREMOTEIO. The RPC client's
    finite state machine doesn't know what to do with -EREMOTEIO.

    Additional clean ups:
    - Don't double-count RDMA_ERROR replies
    - Remove a stale comment

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
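
The convention the commit above describes (errors go in the task status, never in the byte count) might look like this in a reduced form. `struct task` and `reply_bytes_copied()` are hypothetical simplifications, not the kernel's `struct rpc_task` or `rpcrdma_complete_rqst()`.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for an RPC task. */
struct task {
	int tk_status;
};

/*
 * Translate a reply-processing result into a "bytes copied" count.
 * A negative result is an error: record it in the task status and
 * report zero bytes copied, never a negative byte count.
 */
int reply_bytes_copied(struct task *task, int result)
{
	if (result < 0) {
		task->tk_status = result;
		return 0;
	}
	return result;
}
```

A caller handling an error reply would pass -EIO here, which the commit message says the RPC client's state machine can handle, unlike -EREMOTEIO.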
     
  • 1. Ensure that only rpcrdma_cm_event_handler() modifies
    ep->re_connect_status to avoid racy changes to that field.

    2. Ensure that xprt_force_disconnect() is invoked only once as a
    transport is closed or destroyed.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
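
One common way to get the "invoke only once" behavior from point 2 above is an atomic test-and-set guard. The sketch below uses C11 atomics in user space; `struct endpoint` and its helpers are hypothetical, and the actual patch may achieve the same effect differently.

```c
#include <stdatomic.h>

/* Hypothetical endpoint; the counter stands in for the real side effect. */
struct endpoint {
	atomic_flag disconnect_requested;
	int force_disconnect_calls;
};

void endpoint_init(struct endpoint *ep)
{
	atomic_flag_clear(&ep->disconnect_requested);
	ep->force_disconnect_calls = 0;
}

/*
 * Only the first caller sees the flag clear and performs the
 * disconnect; every later caller returns without doing anything.
 */
void request_disconnect(struct endpoint *ep)
{
	if (atomic_flag_test_and_set(&ep->disconnect_requested))
		return;
	ep->force_disconnect_calls++;
}
```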
     
  • Refactor: Pass struct rpcrdma_xprt instead of an IB layer object.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever