10 Jul, 2008

1 commit

  • The RPC client uses the rq_xtime field in each RPC request to determine the
    round-trip time of the request. Currently, the rq_xtime field is
    initialized by each transport just before it starts enqueuing a request to
    be sent. However, transports do not handle initializing this value
    consistently; sometimes they don't initialize it at all.

    To make the measurement of request round-trip time consistent for all
    RPC client transport capabilities, pull rq_xtime initialization into the
    RPC client's generic transport logic. Now all transports will get a
    standardized RTT measure automatically, from:

    xprt_transmit()

    to

    xprt_complete_rqst()

    This makes round-trip time calculation more accurate for the TCP transport.
    The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
    buffer is full, so the TCP transport's ->send_request() method may call
    the ->sendmsg() method repeatedly until it gets all of the request's bytes
    queued in the socket's buffer.

    Currently, the TCP transport sets the rq_xtime field every time through
    that loop so the final value is the timestamp just before the *last* call
    to the underlying socket's ->sendmsg() method. After this patch, the
    rq_xtime field contains a timestamp that reflects the time just before the
    *first* call to ->sendmsg().

    This is consequential under heavy workloads because large requests often
    take multiple ->sendmsg() calls to get all the bytes of a request queued.
    The TCP transport causes the request to sleep until the remote end of the
    socket has received enough bytes to clear space in the socket's local
    output buffer. This delay can be quite significant.

    The method introduced by this patch is a more accurate measure of RTT
    for stream transports, since the server can cause enough back pressure
    to delay (i.e. increase the latency of) requests from the client.

    Additionally, this patch corrects the behavior of the RDMA transport, which
    entirely neglected to initialize the rq_xtime field. RPC performance
    metrics for RDMA transports now display correct RPC request round-trip
    times.

    Signed-off-by: Chuck Lever
    Acked-by: Tom Talpey
    Signed-off-by: Trond Myklebust

    Chuck Lever
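
The difference between stamping before the *last* and before the *first* ->sendmsg() call can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `sim_clock`, `sim_rqst`, `sim_sendmsg()` and both `send_stamp_*()` helpers are hypothetical names invented for the illustration, not kernel code, and the "clock" simply advances by one tick per send attempt to stand in for the time spent waiting for socket buffer space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's structures. */
struct sim_clock {
    long now;               /* simulated time, in ticks */
};

struct sim_rqst {
    size_t len;             /* total bytes in the request */
    size_t sent;            /* bytes queued in the socket buffer so far */
    long xtime;             /* models rq_xtime: start of the RTT interval */
};

/*
 * Models one ->sendmsg() call: queues at most `room` bytes, then lets one
 * tick pass to represent the wait for more buffer space.
 */
static size_t sim_sendmsg(struct sim_clock *clk, struct sim_rqst *req,
                          size_t room)
{
    size_t left = req->len - req->sent;
    size_t n = left < room ? left : room;

    req->sent += n;
    clk->now += 1;
    return n;
}

/* Old behaviour: the transport re-stamps xtime on every pass of the loop. */
static long send_stamp_in_transport(struct sim_clock *clk,
                                    struct sim_rqst *req, size_t room)
{
    assert(room > 0);       /* otherwise the loop could never finish */
    while (req->sent < req->len) {
        req->xtime = clk->now;
        sim_sendmsg(clk, req, room);
    }
    return req->xtime;
}

/* New behaviour: generic code stamps xtime once, before any send attempt. */
static long send_stamp_in_generic(struct sim_clock *clk,
                                  struct sim_rqst *req, size_t room)
{
    assert(room > 0);
    req->xtime = clk->now;
    while (req->sent < req->len)
        sim_sendmsg(clk, req, room);
    return req->xtime;
}
```

With a 10-byte request, 4 bytes of room per call and a clock starting at tick 100, the per-pass variant ends with the stamp taken before the third send attempt, while the stamp-once variant keeps tick 100. Any RTT computed from the first is therefore short by the time spent waiting on the socket buffer.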
     

25 Apr, 2008

1 commit


20 Apr, 2008

3 commits

  • NFSv4 requires us to ensure that we break the TCP connection before we're
    allowed to retransmit a request. However in the case where we're
    retransmitting several requests that have been sent on the same
    connection, we need to ensure that we don't interfere with the attempt to
    reconnect and/or break the connection again once it has been established.

    We therefore introduce a 'connection' cookie that is bumped every time a
    connection is broken. This allows requests to track if they need to force a
    disconnection.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The condition for exiting from the loop in xs_tcp_send_request() should be
    that we find we're not making progress (i.e. number of bytes sent is 0).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
    or not we have someone waiting for buffer memory. Convert the SUNRPC layer
    to use the same idiom.
    Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
    fact, the most common case will be that there is nobody waiting for buffer
    space.

    SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
    limited by the application window. Ensure that we follow the same idiom as
    the rest of the networking layer here too.

    Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
    write_space() doesn't keep waking things up on xprt->pending.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
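
The 'connection' cookie idea from the first commit in this group can be sketched in user space as follows. This is a minimal model, not the kernel implementation: `ck_xprt`, `ck_rqst` and all `ck_*()` functions are hypothetical names, and the rule "disconnect only if the cookie still matches" is an assumption drawn from the commit message's description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a transport; NOT the kernel's struct rpc_xprt. */
struct ck_xprt {
    unsigned int cookie;    /* bumped every time the connection is broken */
    bool connected;
};

/* Hypothetical model of a request that remembers which connection it used. */
struct ck_rqst {
    unsigned int cookie;    /* transport cookie at the time of transmission */
};

static void ck_connect(struct ck_xprt *x)
{
    x->connected = true;
}

static void ck_disconnect(struct ck_xprt *x)
{
    x->connected = false;
    x->cookie++;            /* invalidates every request sent before now */
}

static void ck_transmit(const struct ck_xprt *x, struct ck_rqst *r)
{
    r->cookie = x->cookie;  /* record the connection this request went out on */
}

/*
 * Called before retransmitting: break the connection only if it is still
 * the one the request was originally sent on. Returns true if this call
 * forced a disconnect.
 */
static bool ck_conditional_disconnect(struct ck_xprt *x,
                                      const struct ck_rqst *r)
{
    if (r->cookie != x->cookie)
        return false;       /* someone already broke that connection */
    if (!x->connected)
        return false;       /* nothing to break */
    ck_disconnect(x);
    return true;
}
```

If two requests were sent on the same connection and both time out, the first one to retransmit forces the disconnect and bumps the cookie. The second then sees a mismatched cookie and leaves the newly established connection alone.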
     

06 Mar, 2008

1 commit


29 Feb, 2008

1 commit


30 Jan, 2008

13 commits


29 Jan, 2008

1 commit

  • The previous move of the UDP inDatagrams counter caused the
    counting of encapsulated packets, SUNRPC data (as opposed to call)
    packets and RXRPC packets to go missing.

    This patch restores all of these.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

27 Nov, 2007

1 commit


16 Oct, 2007

1 commit

  • * git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits)
    NFSv4: Fix a typo in nfs_inode_reclaim_delegation
    NFS: Add a boot parameter to disable 64 bit inode numbers
    NFS: nfs_refresh_inode should clear cache_validity flags on success
    NFS: Fix a connectathon regression in NFSv3 and NFSv4
    NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
    SUNRPC: Don't call xprt_release in call refresh
    SUNRPC: Don't call xprt_release() if call_allocate fails
    SUNRPC: Fix buggy UDP transmission
    [23/37] Clean up duplicate includes in
    [2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static
    SUNRPC: Use correct type in buffer length calculations
    SUNRPC: Fix default hostname created in rpc_create()
    nfs: add server port to rpc_pipe info file
    NFS: Get rid of some obsolete macros
    NFS: Simplify filehandle revalidation
    NFS: Ensure that nfs_link() returns a hashed dentry
    NFS: Be strict about dentry revalidation when doing exclusive create
    NFS: Don't zap the readdir caches upon error
    NFS: Remove the redundant nfs_reval_fsid()
    NFSv3: Always use directory post-op attributes in nfs3_proc_lookup
    ...

    Fix up trivial conflict due to sock_owned_by_user() cleanup manually in
    net/sunrpc/xprtsock.c

    Linus Torvalds
     

11 Oct, 2007

1 commit


10 Oct, 2007

16 commits