02 Apr, 2020

3 commits


01 Apr, 2020

1 commit


29 Mar, 2020

1 commit


28 Mar, 2020

20 commits


27 Mar, 2020

12 commits

  • Change the rpcrdma_xprt_disconnect() function so that it no longer
    waits for the DISCONNECTED event. This prevents blocking if the
    remote is unresponsive.

    In rpcrdma_xprt_disconnect(), the transport's rpcrdma_ep is
    detached. Upon return from rpcrdma_xprt_disconnect(), the transport
    (r_xprt) is ready immediately for a new connection.

    The RDMA_CM_EVENT_DEVICE_REMOVAL and RDMA_CM_EVENT_DISCONNECTED
    events are now handled almost identically.

    However, because the lifetimes of rpcrdma_xprt structures and
    rpcrdma_ep structures are now independent, creating an rpcrdma_ep
    needs to take a module ref count. The ep now owns most of the
    hardware resources for a transport.

    Also, a kref is needed to ensure that rpcrdma_ep sticks around
    long enough for the cm_event_handler to finish.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • rpcrdma_cm_event_handler() is always passed an @id pointer that is
    valid. However, in a subsequent patch, we won't be able to extract
    an r_xprt in every case. So instead of using the r_xprt's
    presentation address strings, extract them from struct rdma_cm_id.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • I eventually want to allocate rpcrdma_ep separately from struct
    rpcrdma_xprt so that on occasion there can be more than one ep per
    xprt.

    The new struct rpcrdma_ep will contain all the fields currently in
    rpcrdma_ia and in rpcrdma_ep. This is all the device and CM settings
    for the connection, in addition to per-connection settings
    negotiated with the remote.

    Take this opportunity to rename the existing ep fields from rep_* to
    re_* to disambiguate these from struct rpcrdma_rep.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Completion errors after a disconnect often occur much sooner than a
    CM_DISCONNECT event. Use this to try to detect connection loss more
    quickly.

    Note that other kernel ULPs do take care to disconnect explicitly
    when a WR is flushed.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up:
    The upper layer serializes calls to xprt_rdma_close, so there is no
    need for an atomic bit operation, saving 8 bytes in rpcrdma_ia.

    This enables merging rpcrdma_ia_remove directly into the disconnect
    logic.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Move rdma_cm_id creation into rpcrdma_ep_create() so that it is now
    responsible for allocating all per-connection hardware resources.

    With this clean-up, all three arms of the switch statement in
    rpcrdma_ep_connect() are now identical, so the switch can be
    removed.

    Because device removal behaves a little differently than
    disconnection, there is a little more work to be done before
    rpcrdma_ep_destroy() can release the connection's rdma_cm_id. So
    it is not quite symmetrical with rpcrdma_ep_create() yet.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Make a Protection Domain (PD) a per-connection resource rather than
    a per-transport resource. In other words, when the connection
    terminates, the PD is destroyed.

    Thus there is one less HW resource that remains allocated to a
    transport after a connection is closed.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up: Simplify the synopses of functions in the connect and
    disconnect paths in preparation for combining struct rpcrdma_ia
    and struct rpcrdma_ep.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up: Simplify the synopses of functions in the post_send path
    by combining the struct rpcrdma_ia and struct rpcrdma_ep arguments.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up: prepare for combining the rpcrdma_ia and rpcrdma_ep
    structures. Take the opportunity to rename the function to be
    consistent with the "subsystem _ object _ verb" naming scheme.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Refactor rpcrdma_ep_create(), rpcrdma_ep_disconnect(), and
    rpcrdma_ep_destroy().

    rpcrdma_ep_create will be invoked at connect time instead of at
    transport set-up time. It will be responsible for allocating per-
    connection resources. In this patch it allocates the CQs and
    creates a QP. More to come.

    rpcrdma_ep_destroy() is the inverse operation, invoked at
    disconnect time. It will be responsible for releasing the CQs
    and QP.

    These changes should be safe because both connect and disconnect
    are guaranteed to be serialized by the transport send lock.

    This takes us another step closer to resolving the address and route
    only at connect time so that connection failover to another device
    will work correctly.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Two changes:
    - Show the number of SG entries that were mapped. This helps debug
      DMA-related problems.
    - Record the MR's resource ID instead of its memory address. This
      groups each MR with its associated rdma-tool output, and reduces
      needless exposure of memory addresses.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
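
The rpcrdma_xprt_disconnect() entry in this series notes that a kref keeps rpcrdma_ep around until the CM event handler finishes, even though the transport detaches the ep immediately. The following is a minimal userspace sketch of that lifetime rule, assuming C11 atomics as a stand-in for the kernel's struct kref. The names ep_sketch, xprt_sketch, ep_create, ep_get, ep_put, and xprt_disconnect are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an endpoint with an embedded reference count. */
struct ep_sketch {
	atomic_int refcount;
	bool *destroyed;	/* set to true when the last reference is dropped */
};

/* Hypothetical stand-in for a transport that points at its current ep. */
struct xprt_sketch {
	struct ep_sketch *ep;
};

/* Allocate an ep holding one reference, owned by the caller (the transport). */
static struct ep_sketch *ep_create(bool *destroyed)
{
	struct ep_sketch *ep = malloc(sizeof(*ep));

	if (!ep)
		return NULL;
	atomic_init(&ep->refcount, 1);
	ep->destroyed = destroyed;
	*destroyed = false;
	return ep;
}

/* Take an extra reference, e.g. on behalf of an event handler. */
static void ep_get(struct ep_sketch *ep)
{
	atomic_fetch_add(&ep->refcount, 1);
}

/* Drop a reference; free the ep and return true if it was the last one. */
static bool ep_put(struct ep_sketch *ep)
{
	int old = atomic_fetch_sub(&ep->refcount, 1);

	assert(old > 0);
	if (old != 1)
		return false;
	*ep->destroyed = true;
	free(ep);
	return true;
}

/* Detach the ep without waiting; the transport is reusable on return. */
static void xprt_disconnect(struct xprt_sketch *xprt)
{
	struct ep_sketch *ep = xprt->ep;

	xprt->ep = NULL;
	if (ep)
		ep_put(ep);
}
```

In this sketch, a handler that called ep_get() before xprt_disconnect() ran still holds a valid ep afterwards, and the memory is released only when that handler calls ep_put(). The real code also has to release hardware resources and a module reference at that point, which the sketch leaves out.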
     
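
The rpcrdma_cm_event_handler() entry in this series describes building presentation address strings from struct rdma_cm_id rather than from the r_xprt. As a rough userspace illustration of formatting a peer address straight from a stored socket address, the sketch below uses inet_ntop(). The cm_id_sketch structure and the cm_id_format_peer() helper are hypothetical; in the kernel the addresses live inside the rdma_cm_id's route information, and kernel code would typically use a printk address format specifier rather than inet_ntop().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the destination address kept in a CM ID. */
struct cm_id_sketch {
	struct sockaddr_storage dst_addr;
};

/*
 * Format the peer as "addr:port" (IPv4) or "[addr]:port" (IPv6).
 * Returns 0 on success, -1 on an unsupported family or a short buffer.
 */
static int cm_id_format_peer(const struct cm_id_sketch *id,
			     char *buf, size_t len)
{
	const struct sockaddr *sa = (const struct sockaddr *)&id->dst_addr;
	char addr[INET6_ADDRSTRLEN];
	int n;

	assert(buf != NULL && len > 0);

	switch (sa->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *sin =
			(const struct sockaddr_in *)sa;

		if (!inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr)))
			return -1;
		n = snprintf(buf, len, "%s:%u", addr,
			     (unsigned int)ntohs(sin->sin_port));
		break;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *sin6 =
			(const struct sockaddr_in6 *)sa;

		if (!inet_ntop(AF_INET6, &sin6->sin6_addr, addr, sizeof(addr)))
			return -1;
		n = snprintf(buf, len, "[%s]:%u", addr,
			     (unsigned int)ntohs(sin6->sin6_port));
		break;
	}
	default:
		return -1;
	}

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The point of the design is that the helper needs only the connection identifier, so it keeps working in code paths where no transport structure can be recovered.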

26 Mar, 2020

3 commits