18 Jul, 2011

4 commits


15 Jul, 2011

1 commit


28 May, 2011

1 commit

  • TI-RPC introduces the capability of performing RPC over AF_LOCAL
    sockets. It uses this mainly for registering and unregistering
    local RPC services securely with the local rpcbind, but we could
    also conceivably use it as a generic upcall mechanism.

    This patch provides a client-side only implementation for the moment.
    We might also consider a server-side implementation to provide
    AF_LOCAL access to NLM (for statd downcalls, and such like).

    Autobinding is not supported on kernel AF_LOCAL transports at this
    time. Kernel ULPs must specify the pathname of the remote endpoint
    when an AF_LOCAL transport is created. rpcbind supports registering
    services available via AF_LOCAL, so the kernel could handle it with
    some adjustment to ->rpcbind and ->set_port. But we don't need this
    feature for doing upcalls via well-known named sockets.

    This has not been tested with ULPs that move a substantial amount of
    data. Thus, I can't attest to how robust the write_space and
    congestion management logic is.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
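
Because autobinding is not supported, the caller has to name the remote endpoint itself. The following is a minimal userspace sketch of that step, not the kernel code from this patch. The helper name local_endpoint_init() is hypothetical, and the pathname is whatever well-known socket the caller chooses.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill in an AF_LOCAL address for a caller-supplied
 * pathname. Returns the address length to pass to connect(), or 0 if the
 * pathname is empty or does not fit in sun_path.
 */
static socklen_t local_endpoint_init(struct sockaddr_un *sun, const char *path)
{
	size_t len = strlen(path);

	if (len == 0 || len >= sizeof(sun->sun_path))
		return 0;

	memset(sun, 0, sizeof(*sun));
	sun->sun_family = AF_LOCAL;
	memcpy(sun->sun_path, path, len + 1);	/* include the trailing NUL */

	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + len + 1);
}
```

A caller would pass the returned length, together with the filled-in address, to connect() on a socket created with socket(AF_LOCAL, SOCK_STREAM, 0).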
     

18 Mar, 2011

1 commit


12 Jan, 2011

1 commit


02 Oct, 2010

4 commits


04 Aug, 2010

1 commit


15 May, 2010

4 commits

  • It seems strange to maintain stats for bytes_sent in one structure, and
    bytes received in another. Try to assemble all the RPC request-related
    stats in struct rpc_rqst

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Currently RPC performance metrics that tabulate elapsed time use
    jiffies time values. This is problematic on systems that use slow
    jiffies (for instance 100HZ systems built for paravirtualized
    environments). It is also a problem for computing precise latency
    statistics for advanced network transports, such as InfiniBand,
    that can have round-trip latencies significantly faster than a single
    clock tick.

    For the RPC client, adopt the high resolution time stamp mechanism
    already used by the network layer and blktrace: ktime.

    We use ktime format time stamps for all internal computations, and
    convert to milliseconds for presentation. As a result, we need only
    addition operations in the performance critical paths; multiply/divide
    is required only for presentation.

    We could report RTT metrics in microseconds. In fact the mountstats
    format is versioned to accommodate exactly this kind of interface
    improvement.

    For now, however, we'll stay with millisecond precision for
    presentation to maintain backwards compatibility with the handful of
    currently deployed user space tools. At a later point, we'll move to
    an API such as BDI_STATS where a finer timestamp precision can be
    reported.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
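
The split between the hot path and the presentation path can be sketched in plain C as follows. This is an illustration only: the structure and function names are hypothetical, and 64-bit nanosecond counts stand in for the kernel's ktime values.

```c
#include <stdint.h>

/* Hypothetical per-operation counters; names are illustrative only. */
struct op_metrics {
	uint64_t ops;		/* completed requests */
	int64_t rtt_ns;		/* accumulated round-trip time, nanoseconds */
};

/* Hot path: one subtraction and one addition, no multiply or divide. */
static void op_metrics_add(struct op_metrics *m, int64_t sent_ns, int64_t recv_ns)
{
	m->ops++;
	m->rtt_ns += recv_ns - sent_ns;
}

/* Presentation path: convert to milliseconds only when reporting. */
static uint64_t op_metrics_rtt_ms(const struct op_metrics *m)
{
	return (uint64_t)(m->rtt_ns / 1000000);
}
```

Two round trips of 250 microseconds and 1750 microseconds accumulate to 2000000 ns and are reported as 2 ms, whereas the first one on its own would be lost entirely at a 10 ms clock tick.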
     
  • Compute an RPC request's RTT once, and use that value both for reporting
    RPC metrics, and for adjusting the RTT context used by the RPC client's RTT
    estimator algorithm.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • We should not allow soft tasks to wait for longer than the major timeout
    period when waiting for a reconnect to occur.

    Remove the field xprt->connect_timeout since it has been obsoleted by
    xprt->reestablish_timeout.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

14 Sep, 2009

1 commit


12 Sep, 2009

1 commit

  • When the call direction is a reply, copy the xid and call direction into the
    req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
    rpc_garbage.

    Signed-off-by: Rahul Iyer
    Signed-off-by: Mike Sager
    Signed-off-by: Marc Eshel
    Signed-off-by: Benny Halevy
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy
    [get rid of CONFIG_NFSD_V4_1]
    [sunrpc: refactoring of svc_tcp_recvfrom]
    [nfsd41: sunrpc: create common send routine for the fore and the back channels]
    [nfsd41: sunrpc: Use free_page() to free server backchannel pages]
    [nfsd41: sunrpc: Document server backchannel locking]
    [nfsd41: sunrpc: remove bc_connect_worker()]
    [nfsd41: sunrpc: Define xprt_server_backchannel()]
    [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
    [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
    [nfsd41: sunrpc: Don't auto close the server backchannel connection]
    [nfsd41: sunrpc: Remove unused functions]
    Signed-off-by: Alexandros Batsakis
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfsd41: change bc_sock to bc_xprt]
    [nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
    [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
    [removed cosmetic changes]
    Signed-off-by: Benny Halevy
    [sunrpc: add new xprt class for nfsv4.1 backchannel]
    [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel]
    Signed-off-by: Alexandros Batsakis
    [reverted more cosmetic leftovers]
    [got rid of xprt_server_backchannel]
    [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"]
    Signed-off-by: Benny Halevy
    Cc: Trond Myklebust
    [sunrpc: change idle timeout value for the backchannel]
    Signed-off-by: Alexandros Batsakis
    Signed-off-by: Benny Halevy
    Acked-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Rahul Iyer
     

10 Aug, 2009

2 commits

  • At some point, I recall that rpc_pipe_fs used RPC_DISPLAY_ALL.
    Currently there are no uses of RPC_DISPLAY_ALL outside the transport
    modules themselves, so we can safely get rid of it.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • RPC universal address generation is currently done in several places:
    rpcb_clnt.c, nfs4proc.c, xprtsock.c, and xprtrdma.c. Remove the
    redundant cases that convert a socket address to a universal
    address. The nfs4proc.c case takes a pre-formatted presentation
    address string, not a socket address, so we'll leave that one.

    Because the new uaddr constructor uses the recently introduced
    rpc_ntop(), it now supports proper "::" shorthanding for IPv6
    addresses. This allows the kernel to register properly formed
    universal addresses with the local rpcbind service, in _all_ cases.

    The kernel can now also send properly formed universal addresses in
    RPCB_GETADDR requests, and support link-local properly when
    encoding and decoding IPv6 addresses.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
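
For reference, a universal address is the presentation address followed by the two bytes of the port number as decimal fields. The sketch below is a userspace approximation that uses inet_ntop() as a stand-in for the kernel's rpc_ntop(); the helper name sockaddr_to_uaddr() is hypothetical, and link-local scope IDs are deliberately left out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace helper: format a socket address as an RPC
 * universal address, "<presentation address>.<port hi>.<port lo>".
 * Returns 0 on success, -1 on an unsupported family or a short buffer.
 * IPv6 scope IDs (link-local) are not handled in this sketch.
 */
static int sockaddr_to_uaddr(const struct sockaddr *sap, char *buf, size_t buflen)
{
	char addr[INET6_ADDRSTRLEN];
	unsigned int port;
	int n;

	switch (sap->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;

		if (!inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr)))
			return -1;
		port = ntohs(sin->sin_port);
		break;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;

		/* inet_ntop() applies the "::" shorthand for runs of zeroes */
		if (!inet_ntop(AF_INET6, &sin6->sin6_addr, addr, sizeof(addr)))
			return -1;
		port = ntohs(sin6->sin6_port);
		break;
	}
	default:
		return -1;
	}

	n = snprintf(buf, buflen, "%s.%u.%u", addr, port >> 8, port & 0xff);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this helper, 192.0.2.1 port 2049 becomes "192.0.2.1.8.1" and the IPv6 loopback address on port 111 becomes "::1.0.111".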
     

18 Jun, 2009

4 commits

  • The 'rq_received' member of 'struct rpc_rqst' is used to track when we
    have received a reply to our request. With v4.1, the backchannel
    can now accept callback requests over the existing connection. Rename
    this field to make it clear that it is only used for tracking reply bytes
    and not all bytes received on the connection.

    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Adds rpc_run_bc_task() which is called by the NFS callback service to
    process backchannel requests. It performs similar work to rpc_run_task()
    though "schedules" the backchannel task to be executed starting at the
    call_transmit state in the RPC state machine.

    It also introduces some miscellaneous updates to the argument validation,
    call_transmit, and transport cleanup functions to take into account
    that there are now forechannel and backchannel tasks.

    Backchannel requests do not carry an RPC message structure, since the
    payload has already been XDR encoded using the existing NFSv4 callback
    mechanism.

    Introduce a new transmit state for the client to reply to backchannel
    requests. This new state simply reserves the transport and issues the
    reply. In case of a connection related error, disconnects the transport and
    drops the reply. It requires the forechannel to re-establish the connection
    and the server to retransmit the request, as stated in NFSv4.1 section
    2.9.2 "Client and Server Transport Behavior".

    Note: There is no need to loop attempting to reserve the transport. If EAGAIN
    is returned by xprt_prepare_transmit(), return with tk_status == 0,
    setting tk_action to call_bc_transmit. rpc_execute() will invoke it again
    after the task is taken off the sleep queue.

    [nfs41: rpc_run_bc_task() need not be exported outside RPC module]
    [nfs41: New call_bc_transmit RPC state]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: Backchannel: No need to loop in call_bc_transmit()]
    Signed-off-by: Andy Adamson
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [rpc_count_iostats incorrectly exits early]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Convert rpc_reply_expected() to inline function]
    [Remove unnecessary BUG_ON()]
    [Rename variable]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
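
The "no need to loop" note can be illustrated with a toy state machine. Everything here is hypothetical (the struct, the transport_busy knob, and the function names), and the sleep queue is left out; the sketch only shows an action that re-arms itself instead of spinning when the transport cannot be reserved.

```c
#include <errno.h>
#include <stddef.h>

struct task;
typedef void (*action_fn)(struct task *);

/* Hypothetical, heavily reduced task; not the kernel's struct rpc_task. */
struct task {
	action_fn action;	/* next state to run; NULL when finished */
	int status;
	int transport_busy;	/* test knob: nonzero makes reservation fail */
	int sent;
};

/* Stand-in for reserving the transport before transmitting. */
static int reserve_transport(struct task *t)
{
	return t->transport_busy ? -EAGAIN : 0;
}

static void bc_transmit(struct task *t)
{
	if (reserve_transport(t) == -EAGAIN) {
		/* No loop here: leave status at 0 and ask to be run again. */
		t->status = 0;
		t->action = bc_transmit;
		return;
	}
	t->sent = 1;		/* the reply goes out */
	t->action = NULL;	/* nothing further to do */
}
```

An executor that keeps calling t->action until it is NULL will retry the transmit state on its own, so the state function never needs an internal retry loop.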
     
  • This patch introduces support for setting up the callback xprt on the
    client side. It allocates/destroys the preallocated memory structures
    used to process backchannel requests.

    At setup time, xprt_setup_backchannel() is invoked to allocate one or
    more rpc_rqst structures and substructures. This ensures that they
    are available when an RPC callback arrives. The rpc_rqst structures
    are maintained in a linked list attached to the rpc_xprt structure.
    We keep track of the number of allocations so that they can be correctly
    removed when the channel is destroyed.

    When an RPC callback arrives, xprt_alloc_bc_request() is invoked to
    obtain a preallocated rpc_rqst structure. An rpc_xprt structure is
    returned, and its RPC_BC_PREALLOC_IN_USE bit is set in
    rpc_xprt->bc_flags. The structure is removed from the list
    since it is now in use, and it will be later added back when its
    user is done with it.

    After the RPC callback replies, the rpc_rqst structure is returned
    by invoking xprt_free_bc_request(). This clears the
    RPC_BC_PREALLOC_IN_USE bit and adds it back to the list, allowing it
    to be reused by a subsequent RPC callback request.

    To be consistent with the reception of RPC messages, the backchannel requests
    should be placed into the 'struct rpc_rqst' rq_rcv_buf, which is then in turn
    copied to the 'struct rpc_rqst' rq_private_buf.

    [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Update copyright notice and explain page allocation]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
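
The setup, allocate, free, and destroy life cycle described above can be sketched as a small preallocated pool. This is a simplified userspace illustration, not the patch itself: the names are hypothetical, the in-use flag is kept on the request as an assumption of the sketch, and the locking a real transport needs is omitted.

```c
#include <stdlib.h>

#define BC_PREALLOC_IN_USE 0x1u	/* illustrative flag value */

struct bc_req {
	struct bc_req *next;
	unsigned int flags;
};

struct bc_pool {
	struct bc_req *free_list;	/* preallocated requests that are idle */
	unsigned int alloc_count;	/* preallocated requests not yet freed */
};

/* Preallocate min_reqs requests; on failure the caller should destroy the pool. */
static int bc_pool_setup(struct bc_pool *pool, unsigned int min_reqs)
{
	pool->free_list = NULL;
	pool->alloc_count = 0;
	for (unsigned int i = 0; i < min_reqs; i++) {
		struct bc_req *req = calloc(1, sizeof(*req));

		if (!req)
			return -1;
		req->next = pool->free_list;
		pool->free_list = req;
		pool->alloc_count++;
	}
	return 0;
}

/* Take an idle request off the list and mark it in use; NULL if none is idle. */
static struct bc_req *bc_pool_get(struct bc_pool *pool)
{
	struct bc_req *req = pool->free_list;

	if (!req)
		return NULL;
	pool->free_list = req->next;
	req->next = NULL;
	req->flags |= BC_PREALLOC_IN_USE;
	return req;
}

/* Clear the in-use mark and put the request back for the next callback. */
static void bc_pool_put(struct bc_pool *pool, struct bc_req *req)
{
	req->flags &= ~BC_PREALLOC_IN_USE;
	req->next = pool->free_list;
	pool->free_list = req;
}

/* Free every idle request; requests still in use must be put back first. */
static void bc_pool_destroy(struct bc_pool *pool)
{
	while (pool->free_list) {
		struct bc_req *req = pool->free_list;

		pool->free_list = req->next;
		free(req);
		pool->alloc_count--;
	}
}
```

With a pool of one request, a second bc_pool_get() returns NULL until the first request has been returned with bc_pool_put().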
     
  • Adds new list of rpc_xprt structures, and a readers/writers lock to
    protect the list. The list is used to preallocate resources for
    the backchannel during backchannel requests. Callbacks are not
    expected to cause significant latency, so only one callback will
    be allowed at this time.

    It also adds a pointer to the NFS callback service so that
    requests can be directed to it for processing.

    New callback members added to svc_serv. The NFSv4.1 callback service will
    sleep on the svc_serv->svc_cb_waitq until new callback requests arrive.
    The request will be queued in svc_serv->svc_cb_list. This patch adds this
    list, the sleep queue and spinlock to svc_serv.

    [nfs41: NFSv4.1 callback support]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     

03 May, 2009

1 commit


20 Mar, 2009

1 commit

  • This fixes a regression against FreeBSD servers as reported by Tomas
    Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
    don't ever react to the client closing the socket, and so commit
    e06799f958bf7f9f8fae15f0c6f519953fb0257c (SUNRPC: Use shutdown() instead of
    close() when disconnecting a TCP socket) causes the setup to hang forever
    whenever the client attempts to close and then reconnect.

    We break the deadlock by adding a 'linger2' style timeout to the socket,
    after which, the client will abort the connection using a TCP 'RST'.

    The default timeout is set to 15 seconds. A subsequent patch will put it
    under user control by means of a sysctl.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

12 Mar, 2009

1 commit

  • Provide an api to attempt to load any necessary kernel RPC
    client transport module automatically. By convention, the
    desired module name is "xprt"+"transport name". For example,
    when NFS mounting with "-o proto=rdma", attempt to load the
    "xprtrdma" module.

    Signed-off-by: Tom Talpey
    Cc: Chuck Lever
    Signed-off-by: Trond Myklebust

    Tom Talpey
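
The naming convention is simple enough to show directly. The helper below is a hypothetical userspace sketch that only builds the module name; how the name is then handed to the kernel's module loader is outside its scope.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "xprt" + transport name, for example
 * "rdma" -> "xprtrdma". Returns 0 on success, -1 if buf is too small.
 */
static int xprt_module_name(const char *transport, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "xprt%s", transport);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Calling xprt_module_name("rdma", buf, sizeof(buf)) with a 16-byte buffer leaves "xprtrdma" in buf.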
     

24 Dec, 2008

1 commit


20 Apr, 2008

2 commits

  • NFSv4 requires us to ensure that we break the TCP connection before we're
    allowed to retransmit a request. However in the case where we're
    retransmitting several requests that have been sent on the same
    connection, we need to ensure that we don't interfere with the attempt to
    reconnect and/or break the connection again once it has been established.

    We therefore introduce a 'connection' cookie that is bumped every time a
    connection is broken. This allows requests to track if they need to force a
    disconnection.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
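
The cookie idea can be sketched in a few lines. The structure and function names are hypothetical and there is no locking; the point is only that a request remembers the cookie of the connection it was sent on, and forces a disconnect only if that connection is still the current one.

```c
/* Hypothetical, reduced connection state; not the kernel's struct rpc_xprt. */
struct conn {
	unsigned int cookie;	/* bumped every time the connection is broken */
	int connected;
};

static void conn_break(struct conn *c)
{
	c->connected = 0;
	c->cookie++;
}

static void conn_establish(struct conn *c)
{
	c->connected = 1;
}

/*
 * Disconnect only if the connection the request was sent on is still
 * current. Returns 1 if this call broke the connection, 0 if it had
 * already been broken (and possibly re-established) by someone else.
 */
static int conn_conditional_disconnect(struct conn *c, unsigned int sent_cookie)
{
	if (sent_cookie != c->cookie || !c->connected)
		return 0;
	conn_break(c);
	return 1;
}
```

If two requests were both sent with cookie 0, the first retransmission breaks the connection and bumps the cookie to 1; once the connection is re-established, the second request's stale cookie no longer matches, so it leaves the new connection alone.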
     
  • The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
    or not we have someone waiting for buffer memory. Convert the SUNRPC layer
    to use the same idiom.
    Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
    fact, the most common case will be that there is nobody waiting for buffer
    space.

    SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
    limited by the application window. Ensure that we follow the same idiom as
    the rest of the networking layer here too.

    Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
    write_space() doesn't keep waking things up on xprt->pending.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

30 Jan, 2008

6 commits


10 Oct, 2007

3 commits