19 Jun, 2009

1 commit


18 Jun, 2009

6 commits

  • Trond Myklebust
     
  • Executes the backchannel task on the RPC state machine using
    the existing open connection previously established by the client.

    Signed-off-by: Ricardo Labiaga

    nfs41: Add bc_svc.o to sunrpc Makefile.

    [nfs41: bc_send() does not need to be exported outside RPC module]
    [nfs41: xprt_free_bc_request() need not be exported outside RPC module]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Update copyright]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want
    to get rid of the existing socket and retry immediately, just as the
    comment says. Currently we end up sleeping for a minute, due to the missing
    "break" statement.
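The effect of a missing "break" can be sketched in isolation. The enum, the function names, and the choice of -ECONNREFUSED as the neighbouring case are hypothetical; the sketch only models the control flow described above and is not the actual xprtsock.c code.

```c
#include <errno.h>

/* Hypothetical retry policies standing in for the transport's actions. */
enum retry_action {
	RETRY_IMMEDIATELY,	/* drop the socket and reconnect right away */
	RETRY_AFTER_DELAY	/* keep the socket and back off first */
};

/* Buggy variant: the first group falls through into the delayed path. */
static enum retry_action classify_buggy(int err)
{
	enum retry_action action = RETRY_AFTER_DELAY;

	switch (err) {
	case -EADDRNOTAVAIL:
	default:
		action = RETRY_IMMEDIATELY;
		/* fall through: the break is missing here */
	case -ECONNREFUSED:
		action = RETRY_AFTER_DELAY;
	}
	return action;
}

/* Fixed variant: the break keeps the immediate-retry decision. */
static enum retry_action classify_fixed(int err)
{
	enum retry_action action = RETRY_AFTER_DELAY;

	switch (err) {
	case -EADDRNOTAVAIL:
	default:
		action = RETRY_IMMEDIATELY;
		break;
	case -ECONNREFUSED:
		action = RETRY_AFTER_DELAY;
	}
	return action;
}
```

In the buggy variant every error ends up on the delayed path, which matches the symptom of sleeping instead of retrying at once.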

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Handles RPC replies and backchannel callbacks. Traditionally the NFS
    client has expected only RPC replies on its open connections. With
    NFSv4.1, callbacks can arrive over an existing open connection.

    This patch refactors the old xs_tcp_read_request() into an RPC reply handler:
    xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(),
    and a common routine to read the data off the transport: xs_tcp_read_common().
    The new xs_tcp_read_callback() queues callback requests onto a queue where
    the callback service (a separate thread) is listening for the processing.
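The split into a reply handler, a callback handler, and a shared read routine might look roughly like the following userspace sketch. All type and function names (rx_demo, rx_read_common, and so on), the buffer sizes, and the flag value are assumptions made for illustration; only the overall shape follows the description above.

```c
#include <stddef.h>
#include <string.h>

#define RX_MSG_IS_REPLY 0x1u	/* hypothetical flag: set for replies */
#define RX_CB_SLOTS 4		/* hypothetical number of preallocated callback buffers */

struct rx_buf {
	unsigned char data[64];
	size_t len;
};

struct rx_demo {
	unsigned int flags;
	struct rx_buf reply;			/* buffer of the request awaiting a reply */
	struct rx_buf cb_queue[RX_CB_SLOTS];	/* queue read by the callback service */
	size_t cb_count;
};

/* Common routine: copy bytes off the "wire" into the chosen buffer. */
static size_t rx_read_common(struct rx_buf *dst, const unsigned char *src, size_t n)
{
	if (n > sizeof(dst->data))
		n = sizeof(dst->data);
	memcpy(dst->data, src, n);
	dst->len = n;
	return n;
}

/* Reply path: data goes to the buffer of the outstanding request. */
static int rx_read_reply(struct rx_demo *x, const unsigned char *src, size_t n)
{
	rx_read_common(&x->reply, src, n);
	return 0;
}

/* Callback path: returns -1 when no preallocated slot is left. */
static int rx_read_callback(struct rx_demo *x, const unsigned char *src, size_t n)
{
	if (x->cb_count == RX_CB_SLOTS)
		return -1;
	rx_read_common(&x->cb_queue[x->cb_count++], src, n);
	return 0;
}

/* Dispatcher: pick the handler from the direction flag. */
static int rx_read_message(struct rx_demo *x, const unsigned char *src, size_t n)
{
	return (x->flags & RX_MSG_IS_REPLY) ? rx_read_reply(x, src, n)
					    : rx_read_callback(x, src, n);
}
```

The -1 return in the callback path is where a caller would decide how to react to more callbacks than it has buffers for.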

    This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com)
    and Benny Halevy (bhalevy@panasas.com).

    xs_tcp_read_callback() drops the connection when the number of expected
    callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on
    the pending queue are awakened on disconnect.

    [nfs41: Keep track of RPC call/reply direction with a flag]
    [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Moves embedded #ifdefs into #ifdef function blocks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Reading and storing the RPC direction is a three-step process.

    1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it
    in the XDR buffer since the 'struct rpc_rqst' is not yet available.

    2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state.
    This state need not necessarily be preceded by TCP_RCV_READ_CALLDIR.
    For example, we may be reading a continuation packet to a large reply.
    Therefore, we can't simply obtain the 'struct rpc_rqst' during the
    TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA.

    This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to
    read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the
    RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated.

    3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper
    functions. xs_tcp_read_common() then saves the RPC direction in the XDR
    buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading
    the data immediately after the direction was read. xs_tcp_read_common()
    then clears this flag.
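The three steps can be modelled with a small flag-driven sketch. The flag names follow the text above, but their bit values, the demo_transport structure, and the helper names are assumptions; the real transport keeps more state than this.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Flag names come from the commit text; the bit values are assumed. */
#define TCP_RCV_READ_CALLDIR	(1u << 0)
#define TCP_RCV_COPY_CALLDIR	(1u << 1)
#define TCP_RCV_COPY_DATA	(1u << 2)

struct demo_transport {
	unsigned int flags;
	uint32_t calldir;	/* direction word, held until a buffer exists */
};

/* Step 1: read the direction word, but only remember it for later. */
static void demo_read_calldir(struct demo_transport *t, uint32_t wire_word)
{
	t->calldir = wire_word;
	t->flags &= ~TCP_RCV_READ_CALLDIR;
	t->flags |= TCP_RCV_COPY_CALLDIR | TCP_RCV_COPY_DATA;
}

/*
 * Steps 2 and 3: the destination buffer is now known. Store the saved
 * direction word if it is still pending, then clear the flag so that a
 * continuation read does not store it again. Returns the new offset.
 */
static size_t demo_read_common(struct demo_transport *t,
			       unsigned char *buf, size_t offset)
{
	if (t->flags & TCP_RCV_COPY_CALLDIR) {
		memcpy(buf + offset, &t->calldir, sizeof(t->calldir));
		offset += sizeof(t->calldir);
		t->flags &= ~TCP_RCV_COPY_CALLDIR;
	}
	return offset;
}
```

A second call to demo_read_common() for the same message leaves the offset unchanged, which corresponds to reading a continuation packet.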

    [was nfs41: Skip past the RPC call direction]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: Add RPC direction back into the XDR buffer]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: Don't skip past the RPC call direction]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • NFSv4.1 callbacks can arrive over an existing connection. This patch adds
    the logic to read the RPC call direction (call or reply). It does this by
    updating the state machine to look for the call direction invoking
    xs_tcp_read_calldir(...) after reading the XID.

    [nfs41: Keep track of RPC call/reply direction with a flag]

    As per 11/14/08 review of RFC 53/85.

    Add a new flag to track whether the incoming message is an RPC call or an
    RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in
    xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel.
    It is cleared if the message is an RPC request sent on the back channel.
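As a rough illustration of reading the direction word that follows the XID, the sketch below decodes both fields from a byte buffer. The message type values (0 for a call, 1 for a reply) are those of the ONC RPC protocol; the bit value of TCP_RPC_REPLY and the demo_* helper names are assumptions.

```c
#include <stdint.h>

#define DEMO_RPC_CALL	0u	/* ONC RPC msg_type: CALL */
#define DEMO_RPC_REPLY	1u	/* ONC RPC msg_type: REPLY */
#define TCP_RPC_REPLY	(1u << 0)	/* flag name from the commit; bit value assumed */

/* Decode a big-endian 32-bit word from the wire. */
static uint32_t demo_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * hdr points at the start of an RPC message: the XID, then the direction.
 * Sets or clears TCP_RPC_REPLY in *flags. Returns 0 on success and -1 if
 * the direction word is neither a call nor a reply.
 */
static int demo_parse_header(const unsigned char *hdr, uint32_t *xid,
			     unsigned int *flags)
{
	*xid = demo_be32(hdr);

	switch (demo_be32(hdr + 4)) {
	case DEMO_RPC_REPLY:
		*flags |= TCP_RPC_REPLY;	/* reply on the forechannel */
		return 0;
	case DEMO_RPC_CALL:
		*flags &= ~TCP_RPC_REPLY;	/* request on the backchannel */
		return 0;
	default:
		return -1;
	}
}
```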

    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     

03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of:
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field
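A userspace mock can show how the three accessors relate to the open-coded pattern they replace. The struct layouts, the private field name _dst, and the reference-count mock of dst_release() are assumptions for illustration; only the accessor signatures are taken from the text above.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel types; only what the sketch needs. */
struct dst_entry {
	int refcnt;
};

struct sk_buff {
	struct dst_entry *_dst;	/* hypothetical private field name */
};

/* Mock of dst_release(): drop one reference, tolerating NULL. */
static void dst_release(struct dst_entry *dst)
{
	if (dst)
		dst->refcnt--;
}

static struct dst_entry *skb_dst(const struct sk_buff *skb)
{
	return skb->_dst;
}

static void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
	skb->_dst = dst;
}

/* Replaces the pair "dst_release(skb->dst); skb->dst = NULL;". */
static void skb_dst_drop(struct sk_buff *skb)
{
	dst_release(skb_dst(skb));
	skb_dst_set(skb, NULL);
}
```

Routing every access through accessors is what allows the field itself to be renamed or re-encoded later without touching callers.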

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

03 May, 2009

1 commit


02 Apr, 2009

1 commit


27 Mar, 2009

1 commit


20 Mar, 2009

4 commits


12 Mar, 2009

10 commits


07 Feb, 2009

1 commit

  • $ diff-funcs xs_udp_write_space net/sunrpc/xprtsock.c
    net/sunrpc/xprtsock.c xs_tcp_write_space
    --- net/sunrpc/xprtsock.c:xs_udp_write_space()
    +++ net/sunrpc/xprtsock.c:xs_tcp_write_space()
    @@ -1,4 +1,4 @@
    - * xs_udp_write_space - callback invoked when socket buffer space
    + * xs_tcp_write_space - callback invoked when socket buffer space
      * becomes available
      * @sk: socket whose state has changed
      *
    @@ -7,12 +7,12 @@
      * progress, otherwise we'll waste resources thrashing kernel_sendmsg
      * with a bunch of small requests.
      */
    -static void xs_udp_write_space(struct sock *sk)
    +static void xs_tcp_write_space(struct sock *sk)
     {
     	read_lock(&sk->sk_callback_lock);

    -	/* from net/core/sock.c:sock_def_write_space */
    -	if (sock_writeable(sk)) {
    +	/* from net/core/stream.c:sk_stream_write_space */
    +	if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
     		struct socket *sock;
     		struct rpc_xprt *xprt;

    $ codiff net/sunrpc/xprtsock.o net/sunrpc/xprtsock.o.new
    net/sunrpc/xprtsock.c:
    xs_tcp_write_space | -163
    xs_udp_write_space | -163
    2 functions changed, 326 bytes removed

    net/sunrpc/xprtsock.c:
    xs_write_space | +179
    1 function changed, 179 bytes added

    net/sunrpc/xprtsock.o.new:
    3 functions changed, 179 bytes added, 326 bytes removed, diff: -147

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

03 Nov, 2008

1 commit


31 Oct, 2008

2 commits


30 Oct, 2008

2 commits


29 Oct, 2008

3 commits


14 Oct, 2008

1 commit

  • Clean up the various different email addresses of mine listed in the code
    to a single current and valid address. As Dave says his network merges
    for 2.6.28 are now done, this seems a good point to send them in, where
    they won't risk disrupting real changes.

    Signed-off-by: Alan Cox
    Signed-off-by: David S. Miller

    Alan Cox
     

10 Jul, 2008

1 commit

  • The RPC client uses the rq_xtime field in each RPC request to determine the
    round-trip time of the request. Currently, the rq_xtime field is
    initialized by each transport just before it starts enqueuing a request to
    be sent. However, transports do not handle initializing this value
    consistently; sometimes they don't initialize it at all.

    To make the measurement of request round-trip time consistent for all
    RPC client transport capabilities, pull rq_xtime initialization into the
    RPC client's generic transport logic. Now all transports will get a
    standardized RTT measure automatically, from:

    xprt_transmit()

    to

    xprt_complete_rqst()

    This makes round-trip time calculation more accurate for the TCP transport.
    The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
    buffer is full, so the TCP transport's ->send_request() method may call
    the ->sendmsg() method repeatedly until it gets all of the request's bytes
    queued in the socket's buffer.

    Currently, the TCP transport sets the rq_xtime field every time through
    that loop so the final value is the timestamp just before the *last* call
    to the underlying socket's ->sendmsg() method. After this patch, the
    rq_xtime field contains a timestamp that reflects the time just before the
    *first* call to ->sendmsg().

    This is consequential under heavy workloads because large requests often
    take multiple ->sendmsg() calls to get all the bytes of a request queued.
    The TCP transport causes the request to sleep until the remote end of the
    socket has received enough bytes to clear space in the socket's local
    output buffer. This delay can be quite significant.

    The method introduced by this patch is a more accurate measure of RTT
    for stream transports, since the server can cause enough back pressure
    to delay (i.e., increase the latency of) requests from the client.

    Additionally, this patch corrects the behavior of the RDMA transport, which
    entirely neglected to initialize the rq_xtime field. RPC performance
    metrics for RDMA transports now display correct RPC request round trip
    times.
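The difference between stamping before the last send and stamping before the first send can be shown with a toy model. The tick counter used as a clock, the xt_* names, and the mock send routine are all hypothetical; the sketch only contrasts where the timestamp is taken.

```c
#include <stddef.h>

struct xt_rqst {
	unsigned long xtime;	/* timestamp used as the start of the RTT */
	size_t bytes_left;
};

/* Hypothetical clock: a counter that advances on every send attempt. */
static unsigned long xt_clock;

/* Mock sendmsg: queues at most `chunk` bytes per call and takes one tick. */
static size_t xt_sendmsg(struct xt_rqst *req, size_t chunk)
{
	size_t n = req->bytes_left < chunk ? req->bytes_left : chunk;

	req->bytes_left -= n;
	xt_clock++;
	return n;
}

/* Old behaviour: restamp on every pass, so the last pass wins. */
static void xt_send_old(struct xt_rqst *req, size_t chunk)
{
	while (req->bytes_left) {	/* assumes chunk > 0 */
		req->xtime = xt_clock;
		xt_sendmsg(req, chunk);
	}
}

/* New behaviour: stamp once in generic code, before the first send. */
static void xt_send_new(struct xt_rqst *req, size_t chunk)
{
	req->xtime = xt_clock;
	while (req->bytes_left)		/* assumes chunk > 0 */
		xt_sendmsg(req, chunk);
}
```

With a 10-byte request sent in chunks of 4 from tick 100, the old scheme records tick 102 while the new one records tick 100, so the time spent waiting for buffer space is counted in the RTT.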

    Signed-off-by: Chuck Lever
    Acked-by: Tom Talpey
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

25 Apr, 2008

1 commit


20 Apr, 2008

3 commits

  • NFSv4 requires us to ensure that we break the TCP connection before we're
    allowed to retransmit a request. However in the case where we're
    retransmitting several requests that have been sent on the same
    connection, we need to ensure that we don't interfere with the attempt to
    reconnect and/or break the connection again once it has been established.

    We therefore introduce a 'connection' cookie that is bumped every time a
    connection is broken. This allows requests to track if they need to force a
    disconnection.
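One way to picture the cookie is as a generation counter compared before forcing a disconnect. The ck_* structures and function names below are assumptions for illustration, not the kernel's API.

```c
struct ck_xprt {
	unsigned int connect_cookie;	/* bumped each time the connection breaks */
	int connected;
};

struct ck_req {
	unsigned int cookie;	/* cookie value seen when the request was sent */
};

/* Record which connection "generation" carried this request. */
static void ck_transmit(const struct ck_xprt *x, struct ck_req *r)
{
	r->cookie = x->connect_cookie;
}

static void ck_disconnect(struct ck_xprt *x)
{
	x->connected = 0;
	x->connect_cookie++;
}

static void ck_reconnect(struct ck_xprt *x)
{
	x->connected = 1;
}

/*
 * Break the connection only if it is still the one the request was sent
 * on. Returns 1 if it disconnected, 0 if the connection had already been
 * broken (and possibly re-established) since the request went out.
 */
static int ck_conditional_disconnect(struct ck_xprt *x, const struct ck_req *r)
{
	if (r->cookie != x->connect_cookie)
		return 0;
	ck_disconnect(x);
	return 1;
}
```

Two requests sent on the same connection then behave as described: the first to time out breaks the connection, and the second sees a stale cookie and leaves the new connection alone.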

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The condition for exiting from the loop in xs_tcp_send_request() should be
    that we find we're not making progress (i.e. number of bytes sent is 0).
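A send loop with this exit condition might be sketched as follows. The callback type, the mock sender, and the choice of -EAGAIN as the "no progress" result are assumptions; the point is only that the loop stops when a pass sends zero bytes.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical send callback: bytes queued (>= 0) or a negative errno. */
typedef long (*sp_send_fn)(void *ctx, size_t remaining);

static int sp_send_request(sp_send_fn send, void *ctx, size_t len)
{
	size_t sent = 0;

	while (sent < len) {
		long n = send(ctx, len - sent);

		if (n < 0)
			return (int)n;	/* hard error from the socket */
		if (n == 0)
			return -EAGAIN;	/* no progress: stop and wait for space */
		sent += (size_t)n;
	}
	return 0;
}

/* Mock sender: accepts `budget` bytes in total, at most 3 per call. */
struct sp_budget {
	size_t budget;
};

static long sp_budget_send(void *ctx, size_t remaining)
{
	struct sp_budget *b = ctx;
	size_t n = remaining < 3 ? remaining : 3;

	if (n > b->budget)
		n = b->budget;
	b->budget -= n;
	return (long)n;
}
```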

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
    or not we have someone waiting for buffer memory. Convert the SUNRPC layer
    to use the same idiom.
    Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
    fact, the most common case will be that there is nobody waiting for buffer
    space.

    SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
    limited by the application window. Ensure that we follow the same idiom as
    the rest of the networking layer here too.

    Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
    write_space() doesn't keep waking things up on xprt->pending.
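The wake-once behaviour can be modelled with a single flag. The sketch below is a userspace approximation: NS_ASYNC_NOSPACE stands in for SOCK_ASYNC_NOSPACE, and the structure and function names are hypothetical.

```c
#define NS_ASYNC_NOSPACE (1u << 0)	/* stand-in: someone waits for buffer memory */

struct ns_sock {
	unsigned int flags;
	int wakeups;	/* counts wake-ups delivered to the waiter */
};

/* Sender side: out of buffer space, so record that a waiter exists. */
static void ns_wait_for_space(struct ns_sock *s)
{
	s->flags |= NS_ASYNC_NOSPACE;
}

/* Callback side: wake the waiter at most once per wait. */
static void ns_write_space(struct ns_sock *s)
{
	if (!(s->flags & NS_ASYNC_NOSPACE))
		return;		/* common case: nobody is waiting */
	s->flags &= ~NS_ASYNC_NOSPACE;
	s->wakeups++;
}
```

Because the flag is cleared on the first wake-up, repeated write-space notifications do not keep waking the same waiter.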

    Signed-off-by: Trond Myklebust

    Trond Myklebust