27 Jun, 2009

1 commit

  • The sunrpc module uses call_rcu(), so it should call rcu_barrier() on
    module unload.

    I have not verified that queuing of new call_rcu() callbacks has been
    disabled. As a hint for checking, the functions calling
    call_rcu() (unx_destroy_cred and generic_destroy_cred) are
    registered as the crdestroy function pointer in struct rpc_credops.

    Acked-by: Paul E. McKenney
    Acked-by: Trond Myklebust
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
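
    The ordering this commit asks for can be illustrated with a small
    userspace model. This is only a sketch: every model_* name below is
    hypothetical and none of it is kernel code. It assumes deferred
    callbacks sit in a queue until a barrier runs them, and that running
    one after the "module" is gone would be a bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
#define MODEL_MAX_PENDING 8

typedef void (*model_deferred_fn)(void *arg);

struct model_deferred_cb {
    model_deferred_fn fn;
    void *arg;
};

static struct model_deferred_cb model_pending[MODEL_MAX_PENDING];
static size_t model_npending;
static bool model_text_present = true;   /* false once the "module" is gone */
static int model_creds_destroyed;

/* Stands in for a crdestroy-style callback that lives in module text. */
void model_free_cred(void *arg)
{
    (void)arg;
    /* Running after unload would be a jump into freed code. */
    assert(model_text_present);
    model_creds_destroyed++;
}

/* Models call_rcu(): queue the callback so it runs later. */
bool model_call_rcu(model_deferred_fn fn, void *arg)
{
    if (model_npending == MODEL_MAX_PENDING)
        return false;
    model_pending[model_npending].fn = fn;
    model_pending[model_npending].arg = arg;
    model_npending++;
    return true;
}

/* Models rcu_barrier(): return only after every queued callback has run. */
void model_rcu_barrier(void)
{
    for (size_t i = 0; i < model_npending; i++)
        model_pending[i].fn(model_pending[i].arg);
    model_npending = 0;
}

/* Models the module exit path: drain first, then let the text go away. */
void model_module_exit(void)
{
    model_rcu_barrier();
    model_text_present = false;
}
```

    Swapping the two statements in model_module_exit() makes the assert
    in model_free_cred() fire, which is the failure the barrier prevents.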
     

23 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     

21 Jun, 2009

1 commit


19 Jun, 2009

2 commits

  • Currently, the sunrpc server is refusing to allow us to process new RPC
    calls if the TCP send buffer is 2/3 full, even if we do actually have
    enough free space to guarantee that we can send another request.
    The following patch fixes svc_tcp_has_wspace() so that we only stop
    processing requests if we know that the socket buffer cannot possibly fit
    another reply.

    It also fixes the tcp write_space() callback so that we only clear the
    SOCK_NOSPACE flag when the TCP send buffer is less than 2/3 full.
    This should ensure that the send window will grow as per the standard TCP
    socket code.

    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Trond Myklebust
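
    The difference between the two admission rules can be sketched as
    follows. This is a simplified model, not the patched function: the
    struct, its field names, and the "reserved plus largest message"
    formula are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a server TCP socket's send side. */
struct model_sock {
    long sndbuf;     /* total send buffer size, in bytes */
    long queued;     /* bytes already queued for transmission */
    long reserved;   /* bytes promised to requests still in progress */
    long max_mesg;   /* largest reply a single request may produce */
};

/* Free space left in the send buffer. */
long model_wspace(const struct model_sock *sk)
{
    return sk->sndbuf - sk->queued;
}

/* Old rule as the commit describes it: refuse once the buffer is 2/3 full. */
bool model_has_wspace_old(const struct model_sock *sk)
{
    return 3 * sk->queued < 2 * sk->sndbuf;
}

/* New rule: refuse only if another worst-case reply cannot fit. */
bool model_has_wspace_new(const struct model_sock *sk)
{
    long required = sk->reserved + sk->max_mesg;

    return model_wspace(sk) >= required;
}
```

    With a 300000-byte buffer holding 210000 queued bytes and a
    65536-byte maximum message, the old rule refuses (the buffer is 70%
    full) while the new rule accepts (90000 bytes free).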
     
  • Conflicts:
    fs/nfs/client.c
    fs/nfs/super.c

    Trond Myklebust
     

18 Jun, 2009

17 commits

  • Trond Myklebust
     
  • The 'rq_received' member of 'struct rpc_rqst' is used to track when we
    have received a reply to our request. With v4.1, the backchannel
    can now accept callback requests over the existing connection. Rename
    this field to make it clear that it is only used for tracking reply bytes
    and not all bytes received on the connection.

    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Obtain the rpc_xprt from the rpc_rqst so that calls and callback replies
    can both use the same code path. A client needs the rpc_xprt in order
    to reply to a callback.

    Signed-off-by: Rahul Iyer
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Rahul Iyer
     
  • Signed-off-by: Benny Halevy

    Benny Halevy
     
  • This svc_xprt is passed on to the callback service thread to be later used
    to process incoming svc_rqst's.

    Signed-off-by: Benny Halevy

    Andy Adamson
     
  • For nfs41 callbacks we need an svc_xprt to process requests coming up the
    backchannel socket as rpc_rqst's that are transformed into svc_rqst's that
    need a rq_xprt to be processed.

    The svc_{udp,tcp}_create methods are too heavy for this job as svc_create_socket
    creates an actual socket to listen on while for nfs41 we're "reusing" the
    fore channel's socket.

    Signed-off-by: Benny Halevy

    Benny Halevy
     
  • Implement the NFSv4.1 backchannel service. Invokes the common callback
    processing logic svc_process_common() to authenticate the call and
    dispatch the appropriate NFSv4.1 XDR decoder and operation procedure.
    It then invokes bc_send() to send the reply over the same connection.
    bc_send() is implemented in a separate patch.

    At this time there is no slot validation or reply cache handling.

    [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Move bc_svc_process() declaration to correct patch]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • net/sunrpc/svc.c:svc_process() is used by the NFSv4 callback service
    to process RPC requests arriving over connections initiated by the
    server. NFSv4.1 supports callbacks over the backchannel on connections
    initiated by the client. This patch refactors svc_process() so that
    common code can also be used by the backchannel.

    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Executes the backchannel task on the RPC state machine using
    the existing open connection previously established by the client.

    Signed-off-by: Ricardo Labiaga

    nfs41: Add bc_svc.o to sunrpc Makefile.

    [nfs41: bc_send() does not need to be exported outside RPC module]
    [nfs41: xprt_free_bc_request() need not be exported outside RPC module]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Update copyright]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Adds rpc_run_bc_task() which is called by the NFS callback service to
    process backchannel requests. It performs similar work to rpc_run_task()
    though "schedules" the backchannel task to be executed starting at the
    call_transmit state in the RPC state machine.

    It also introduces some miscellaneous updates to the argument validation,
    call_transmit, and transport cleanup functions to take into account
    that there are now forechannel and backchannel tasks.

    Backchannel requests do not carry an RPC message structure, since the
    payload has already been XDR encoded using the existing NFSv4 callback
    mechanism.

    Introduce a new transmit state for the client to reply to backchannel
    requests. This new state simply reserves the transport and issues the
    reply. On a connection-related error, it disconnects the transport and
    drops the reply. It requires the forechannel to re-establish the connection
    and the server to retransmit the request, as stated in NFSv4.1 section
    2.9.2 "Client and Server Transport Behavior".

    Note: There is no need to loop attempting to reserve the transport. If EAGAIN
    is returned by xprt_prepare_transmit(), return with tk_status == 0,
    setting tk_action to call_bc_transmit. rpc_execute() will invoke it again
    after the task is taken off the sleep queue.

    [nfs41: rpc_run_bc_task() need not be exported outside RPC module]
    [nfs41: New call_bc_transmit RPC state]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: Backchannel: No need to loop in call_bc_transmit()]
    Signed-off-by: Andy Adamson
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [rpc_count_iostats incorrectly exits early]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Convert rpc_reply_expected() to inline function]
    [Remove unnecessary BUG_ON()]
    [Rename variable]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
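
    The "no need to loop" note can be modelled in userspace. The sketch
    below is an assumption-laden toy: the model_* names, the asleep flag,
    and the busy counter are all hypothetical, and only the tk_action and
    tk_status field names are borrowed from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_task;
typedef void (*model_action)(struct model_task *);

struct model_task {
    model_action tk_action;   /* next state to run, NULL when finished */
    int tk_status;
    bool asleep;              /* parked on a wait queue */
    int transport_busy;       /* how many more reservations will fail */
    int replies_sent;
};

/* Stand-in for reserving the transport: -EAGAIN while it is busy. */
int model_prepare_transmit(struct model_task *task)
{
    if (task->transport_busy > 0) {
        task->transport_busy--;
        task->asleep = true;
        return -EAGAIN;
    }
    return 0;
}

/* The backchannel transmit state: try once, never spin. */
void model_call_bc_transmit(struct model_task *task)
{
    if (model_prepare_transmit(task) == -EAGAIN) {
        /* Stay in this state and let the executor run it again. */
        task->tk_status = 0;
        task->tk_action = model_call_bc_transmit;
        return;
    }
    task->replies_sent++;
    task->tk_action = NULL;
}

/* Executor: runs states until none is left; returns the invocation count. */
int model_execute(struct model_task *task)
{
    int steps = 0;

    while (task->tk_action != NULL) {
        task->asleep = false;   /* models being taken off the sleep queue */
        task->tk_action(task);
        steps++;
    }
    return steps;
}
```

    The retry lives in the executor, so the state function itself stays
    a single straight-line attempt.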
     
  • In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want
    to get rid of the existing socket and retry immediately, just as the
    comment says. Currently we end up sleeping for a minute, due to the missing
    "break" statement.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
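
    The effect of a missing "break" can be shown with a minimal switch.
    This is not the patched function: the enum, the function names, and
    the choice of -ECONNREFUSED as the neighbouring case are hypothetical
    and only illustrate the fall-through.

```c
#include <assert.h>
#include <errno.h>

enum model_retry_plan { MODEL_RETRY_NOW, MODEL_RETRY_AFTER_DELAY };

/* Buggy shape: the first case falls into the delayed-retry case. */
enum model_retry_plan model_plan_without_break(int status)
{
    enum model_retry_plan plan = MODEL_RETRY_AFTER_DELAY;

    switch (status) {
    case -EADDRNOTAVAIL:
        plan = MODEL_RETRY_NOW;
        /* fall through - bug: the break is missing */
    case -ECONNREFUSED:
        plan = MODEL_RETRY_AFTER_DELAY;
        break;
    default:
        break;
    }
    return plan;
}

/* Fixed shape: the immediate-retry decision survives. */
enum model_retry_plan model_plan_with_break(int status)
{
    enum model_retry_plan plan = MODEL_RETRY_AFTER_DELAY;

    switch (status) {
    case -EADDRNOTAVAIL:
        plan = MODEL_RETRY_NOW;
        break;
    case -ECONNREFUSED:
        plan = MODEL_RETRY_AFTER_DELAY;
        break;
    default:
        break;
    }
    return plan;
}
```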
     
  • Handles RPC replies and backchannel callbacks. Traditionally the NFS
    client has expected only RPC replies on its open connections. With
    NFSv4.1, callbacks can arrive over an existing open connection.

    This patch refactors the old xs_tcp_read_request() into an RPC reply handler:
    xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(),
    and a common routine to read the data off the transport: xs_tcp_read_common().
    The new xs_tcp_read_callback() queues callback requests onto a queue where
    the callback service (a separate thread) is listening for the processing.

    This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com)
    and Benny Halevy (bhalevy@panasas.com).

    xs_tcp_read_callback() drops the connection when the number of expected
    callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on
    the pending queue are woken on disconnect.

    [nfs41: Keep track of RPC call/reply direction with a flag]
    [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Moves embedded #ifdefs into #ifdef function blocks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • This patch introduces support to set up the callback xprt on the client side.
    It allocates and destroys the preallocated memory structures used to process
    backchannel requests.

    At setup time, xprt_setup_backchannel() is invoked to allocate one or
    more rpc_rqst structures and substructures. This ensures that they
    are available when an RPC callback arrives. The rpc_rqst structures
    are maintained in a linked list attached to the rpc_xprt structure.
    We keep track of the number of allocations so that they can be correctly
    removed when the channel is destroyed.

    When an RPC callback arrives, xprt_alloc_bc_request() is invoked to
    obtain a preallocated rpc_rqst structure. An rpc_xprt structure is
    returned, and its RPC_BC_PREALLOC_IN_USE bit is set in
    rpc_xprt->bc_flags. The structure is removed from the list
    since it is now in use, and it will be later added back when its
    user is done with it.

    After the RPC callback replies, the rpc_rqst structure is returned
    by invoking xprt_free_bc_request(). This clears the
    RPC_BC_PREALLOC_IN_USE bit and adds it back to the list, allowing it
    to be reused by a subsequent RPC callback request.

    To be consistent with the reception of RPC messages, the backchannel requests
    should be placed into the 'struct rpc_rqst' rq_rcv_buf, which is then in turn
    copied to the 'struct rpc_rqst' rq_private_buf.

    [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [Update copyright notice and explain page allocation]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
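
    The preallocate, take, return, and destroy cycle described above can
    be sketched as a free list. This is a userspace model with
    hypothetical model_* names; for simplicity the in-use bit is kept on
    each request here, and malloc-style allocation stands in for whatever
    the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PREALLOC_IN_USE 0x1UL   /* stands in for the in-use bit */

struct model_rqst {
    unsigned long flags;
    struct model_rqst *next;       /* link in the transport's free list */
};

struct model_xprt {
    struct model_rqst *free_list;  /* preallocated, currently idle requests */
    unsigned int nr_alloc;         /* how many were preallocated */
};

/* Setup: allocate min_reqs requests up front; returns 0 or -1. */
int model_setup_backchannel(struct model_xprt *xprt, unsigned int min_reqs)
{
    xprt->free_list = NULL;
    xprt->nr_alloc = 0;
    for (unsigned int i = 0; i < min_reqs; i++) {
        struct model_rqst *req = calloc(1, sizeof(*req));

        if (req == NULL)
            return -1;   /* caller should run the destroy helper */
        req->next = xprt->free_list;
        xprt->free_list = req;
        xprt->nr_alloc++;
    }
    return 0;
}

/* Callback arrival: take an idle request off the list and mark it in use. */
struct model_rqst *model_alloc_bc_request(struct model_xprt *xprt)
{
    struct model_rqst *req = xprt->free_list;

    if (req == NULL)
        return NULL;     /* more callbacks than preallocated slots */
    xprt->free_list = req->next;
    req->next = NULL;
    req->flags |= MODEL_PREALLOC_IN_USE;
    return req;
}

/* Reply sent: clear the bit and put the request back on the list. */
void model_free_bc_request(struct model_xprt *xprt, struct model_rqst *req)
{
    assert(req->flags & MODEL_PREALLOC_IN_USE);
    req->flags &= ~MODEL_PREALLOC_IN_USE;
    req->next = xprt->free_list;
    xprt->free_list = req;
}

/* Teardown: release every idle request; returns how many were freed. */
unsigned int model_destroy_backchannel(struct model_xprt *xprt)
{
    unsigned int freed = 0;

    while (xprt->free_list != NULL) {
        struct model_rqst *req = xprt->free_list;

        xprt->free_list = req->next;
        free(req);
        freed++;
    }
    xprt->nr_alloc -= freed;
    return freed;
}
```

    Because nothing is allocated when a callback arrives, running out of
    slots shows up as a NULL return rather than as an allocation failure
    in the receive path.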
     
  • Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
     
  • Reading and storing the RPC direction is a three-step process.

    1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it
    in the XDR buffer since the 'struct rpc_rqst' is not yet available.

    2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state.
    This state need not necessarily be preceded by the TCP_RCV_READ_CALLDIR.
    For example, we may be reading a continuation packet to a large reply.
    Therefore, we can't simply obtain the 'struct rpc_rqst' during the
    TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA.

    This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to
    read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the
    RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated.

    3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper
    functions. xs_tcp_read_common() then saves the RPC direction in the XDR
    buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading
    the data immediately after the direction was read. xs_tcp_read_common()
    then clears this flag.

    [was nfs41: Skip past the RPC call direction]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: Add RPC direction back into the XDR buffer]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfs41: sunrpc: Don't skip past the RPC call direction]
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
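
    The two-flag handoff can be sketched in userspace. The flag values,
    struct layouts, and model_* function names below are hypothetical;
    only the idea (park the direction word until a request buffer exists,
    then save it once) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real definitions are not shown in this log. */
#define MODEL_RCV_READ_CALLDIR (1u << 0)   /* direction word still to be read */
#define MODEL_RCV_COPY_CALLDIR (1u << 1)   /* direction read, not yet saved */

struct model_transport {
    unsigned int flags;
    unsigned char calldir[4];   /* raw direction word, parked here */
};

struct model_rqst_buf {
    unsigned char data[64];
    size_t len;
};

/* Step 1: read the direction, but only park it in the transport. */
void model_read_calldir(struct model_transport *t, const unsigned char word[4])
{
    assert(t->flags & MODEL_RCV_READ_CALLDIR);
    memcpy(t->calldir, word, 4);
    t->flags &= ~MODEL_RCV_READ_CALLDIR;
    t->flags |= MODEL_RCV_COPY_CALLDIR;
}

/* Step 3: the buffer now exists; save the parked word once, then the data. */
void model_read_common(struct model_transport *t, struct model_rqst_buf *buf,
                       const unsigned char *payload, size_t n)
{
    if (t->flags & MODEL_RCV_COPY_CALLDIR) {
        assert(buf->len + 4 <= sizeof(buf->data));
        memcpy(buf->data + buf->len, t->calldir, 4);
        buf->len += 4;
        t->flags &= ~MODEL_RCV_COPY_CALLDIR;
    }
    assert(buf->len + n <= sizeof(buf->data));
    memcpy(buf->data + buf->len, payload, n);
    buf->len += n;
}
```

    A second call to model_read_common() models a continuation fragment:
    the copy flag is already clear, so only payload bytes are appended.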
     
  • NFSv4.1 callbacks can arrive over an existing connection. This patch adds
    the logic to read the RPC call direction (call or reply). It does this by
    updating the state machine to look for the call direction invoking
    xs_tcp_read_calldir(...) after reading the XID.

    [nfs41: Keep track of RPC call/reply direction with a flag]

    As per 11/14/08 review of RFC 53/85.

    Add a new flag to track whether the incoming message is an RPC call or an
    RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in
    xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel.
    It is cleared if the message is an RPC request sent on the back channel.

    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy

    Ricardo Labiaga
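
    For reference, an ONC RPC message begins with a 32-bit XID followed
    by a 32-bit message type, where 0 means CALL and 1 means REPLY, both
    in big-endian XDR encoding. A minimal userspace sketch of reading
    those two words follows; the model_* names are hypothetical and the
    buffer is assumed to start just after the TCP record marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_rpc_dir { MODEL_RPC_CALL = 0, MODEL_RPC_REPLY = 1 };

/* Decode one big-endian 32-bit XDR word. */
uint32_t model_xdr_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Parse the XID and direction from the start of an RPC message.
 * Returns 0 on success, or -1 if the buffer is too short or the
 * direction word is neither CALL nor REPLY.
 */
int model_read_xid_and_calldir(const unsigned char *buf, size_t len,
                               uint32_t *xid, enum model_rpc_dir *dir)
{
    uint32_t word;

    if (len < 8)
        return -1;
    *xid = model_xdr_u32(buf);
    word = model_xdr_u32(buf + 4);
    if (word != MODEL_RPC_CALL && word != MODEL_RPC_REPLY)
        return -1;
    *dir = (enum model_rpc_dir)word;
    return 0;
}
```

    A receiver could then route CALL messages to a callback handler and
    REPLY messages to the request matching the XID.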
     
  • Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

    Andy Adamson
     

17 Jun, 2009

1 commit

  • num_online_nodes() is called in a number of places but most often by the
    page allocator when deciding whether the zonelist needs to be filtered
    based on cpusets or the zonelist cache. This is actually a heavy function
    and touches a number of cache lines.

    This patch stores the number of online nodes at boot time and updates the
    value when nodes get onlined and offlined. The value is then used in a
    number of important paths in place of num_online_nodes().

    [rientjes@google.com: do not override definition of node_set_online() with macro]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Cc: Nick Piggin
    Cc: Dave Hansen
    Cc: Lee Schermerhorn
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
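
    The caching idea can be sketched in userspace as a counter kept in
    step with a node bitmask. All model_* names are hypothetical, and
    the incremental update shown here is an assumption; the patch may
    well recompute the value instead.

```c
#include <assert.h>

#define MODEL_MAX_NODES 64

static unsigned long long model_online_mask;   /* one bit per node */
static int model_nr_online;                    /* cached count of set bits */

/* Slow path: walk the whole mask on every call. */
int model_count_online_slow(void)
{
    int n = 0;

    for (int i = 0; i < MODEL_MAX_NODES; i++)
        if (model_online_mask & (1ULL << i))
            n++;
    return n;
}

/* Bring a node online and keep the cached count in step. */
void model_node_set_online(int nid)
{
    assert(nid >= 0 && nid < MODEL_MAX_NODES);
    if (!(model_online_mask & (1ULL << nid))) {
        model_online_mask |= 1ULL << nid;
        model_nr_online++;
    }
}

/* Take a node offline and keep the cached count in step. */
void model_node_set_offline(int nid)
{
    assert(nid >= 0 && nid < MODEL_MAX_NODES);
    if (model_online_mask & (1ULL << nid)) {
        model_online_mask &= ~(1ULL << nid);
        model_nr_online--;
    }
}

/* Fast path: hot callers read a single integer. */
int model_count_online_fast(void)
{
    return model_nr_online;
}
```

    The cost moves from every reader to the rare online and offline
    events, which is the trade the commit describes.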
     

16 Jun, 2009

3 commits


15 Jun, 2009

1 commit


10 Jun, 2009

1 commit


03 Jun, 2009

1 commit

  • Define three accessors to get/set dst attached to a skb

    struct dst_entry *skb_dst(const struct sk_buff *skb)

    void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

    void skb_dst_drop(struct sk_buff *skb)
    This one should replace occurrences of:
    dst_release(skb->dst)
    skb->dst = NULL;

    Delete skb->dst field

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
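
    How the three accessors fit together can be sketched with cut-down
    userspace types. Everything here is a model: the struct layouts, the
    plain integer refcount, and the model_ prefixes are assumptions, not
    the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct model_dst {
    int refcnt;
};

struct model_skb {
    struct model_dst *dst_private;   /* callers go through the accessors */
};

/* Drop one reference; NULL is accepted and ignored. */
void model_dst_release(struct model_dst *dst)
{
    if (dst != NULL) {
        assert(dst->refcnt > 0);
        dst->refcnt--;
    }
}

/* Getter: the only way callers read the attached dst. */
struct model_dst *model_skb_dst(const struct model_skb *skb)
{
    return skb->dst_private;
}

/* Setter: the only way callers attach a dst. */
void model_skb_dst_set(struct model_skb *skb, struct model_dst *dst)
{
    skb->dst_private = dst;
}

/* Replaces the open-coded "release, then set to NULL" pair. */
void model_skb_dst_drop(struct model_skb *skb)
{
    model_dst_release(model_skb_dst(skb));
    model_skb_dst_set(skb, NULL);
}
```

    Once every caller uses the accessors, the underlying field can be
    renamed or re-encoded without touching call sites.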
     

29 May, 2009

1 commit


28 May, 2009

2 commits

  • The svcrdma module was incorrectly unmapping the RPCRDMA header page.
    On IBM pserver systems this causes a resource leak that results in
    running out of bus address space (10 cthon iterations will reproduce it).
    The code was mapping the full page but only unmapping the actual header
    length. The fix is to only map the header length.

    I also cleaned up the use of ib_dma_map_page() calls since the unmap
    logic always uses ib_dma_unmap_single(). I made these symmetrical.

    Signed-off-by: Steve Wise
    Signed-off-by: Tom Tucker
    Signed-off-by: J. Bruce Fields

    Steve Wise
     
  • This reverts commit 47a14ef1af48c696b214ac168f056ddc79793d0e "svcrpc:
    take advantage of tcp autotuning", which uncovered some further problems
    in the server rpc code, causing significant performance regressions in
    common cases.

    We will likely reinstate this patch after releasing 2.6.30 and applying
    some work on the underlying fixes to the problem (developed by Trond).

    Reported-by: Jeff Moyer
    Cc: Olga Kornievskaia
    Cc: Jim Rees
    Cc: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

27 May, 2009

1 commit


13 May, 2009

1 commit


04 May, 2009

1 commit

  • These fixes resolved crashes due to resource leak BUG_ON checks. The
    resource leaks were detected by introducing asynchronous transport errors.

    Signed-off-by: Steve Wise
    Signed-off-by: Tom Tucker
    Signed-off-by: J. Bruce Fields

    Steve Wise
     

03 May, 2009

1 commit


29 Apr, 2009

4 commits