01 Mar, 2010

1 commit

  • One of the changes in commit d7979ae4a "svc: Move close processing to a
    single place" is:

    err_delete:
    - svc_delete_socket(svsk);
    + set_bit(SK_CLOSE, &svsk->sk_flags);
    return -EAGAIN;

    This is insufficient. The recvfrom methods must always call
    svc_xprt_received on completion so that the socket gets re-queued if
    there is any more work to do. This particular path did not make that
    call because it actually destroyed the svsk, making requeue pointless.
    When the svc_delete_socket was changed to just set a bit, we should have
    added a call to svc_xprt_received.

    This is the problem that b0401d7253 attempted to fix, incorrectly.

    Signed-off-by: J. Bruce Fields

    Neil Brown
     

27 Jan, 2010

1 commit


06 Nov, 2009

1 commit


31 Oct, 2009

1 commit

  • On UDP sockets, we must call skb_free_datagram() with socket locked,
    or risk sk_forward_alloc corruption. This requirement is not respected
    in SUNRPC.

    Add a convenient helper, skb_free_datagram_locked(), and use it in SUNRPC.

    Reported-by: Francis Moreau
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    Goal is to transfer fields used at lookup time in the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by rx path)

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Sep, 2009

1 commit

  • When the call direction is a reply, copy the xid and call direction into the
    req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns
    rpc_garbage.

    Signed-off-by: Rahul Iyer
    Signed-off-by: Mike Sager
    Signed-off-by: Marc Eshel
    Signed-off-by: Benny Halevy
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Andy Adamson
    Signed-off-by: Benny Halevy
    [get rid of CONFIG_NFSD_V4_1]
    [sunrpc: refactoring of svc_tcp_recvfrom]
    [nfsd41: sunrpc: create common send routine for the fore and the back channels]
    [nfsd41: sunrpc: Use free_page() to free server backchannel pages]
    [nfsd41: sunrpc: Document server backchannel locking]
    [nfsd41: sunrpc: remove bc_connect_worker()]
    [nfsd41: sunrpc: Define xprt_server_backchannel()]
    [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions]
    [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()]
    [nfsd41: sunrpc: Don't auto close the server backchannel connection]
    [nfsd41: sunrpc: Remove unused functions]
    Signed-off-by: Alexandros Batsakis
    Signed-off-by: Ricardo Labiaga
    Signed-off-by: Benny Halevy
    [nfsd41: change bc_sock to bc_xprt]
    [nfsd41: sunrpc: move struct rpc_buffer def into a common header file]
    [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex]
    [removed cosmetic changes]
    Signed-off-by: Benny Halevy
    [sunrpc: add new xprt class for nfsv4.1 backchannel]
    [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel]
    Signed-off-by: Alexandros Batsakis
    [reverted more cosmetic leftovers]
    [got rid of xprt_server_backchannel]
    [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"]
    Signed-off-by: Benny Halevy
    Cc: Trond Myklebust
    [sunrpc: change idle timeout value for the backchannel]
    Signed-off-by: Alexandros Batsakis
    Signed-off-by: Benny Halevy
    Acked-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Rahul Iyer
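
The message above says that for a reply, the xid and call direction must be present at the start of the request's head buffer, otherwise rpc_verify_header returns rpc_garbage. A minimal userspace sketch of writing those two leading words follows. The helper name, the `SKETCH_` constants and the flat byte buffer are assumptions for illustration, not the kernel's code. The only fact relied on is the ONC RPC wire layout: a 32-bit XID followed by a 32-bit message type (0 for call, 1 for reply), both in network byte order.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* ONC RPC message types (call = 0, reply = 1); names are illustrative. */
enum { SKETCH_RPC_CALL = 0, SKETCH_RPC_REPLY = 1 };

/*
 * Hypothetical helper: place the XID and the direction word at the front
 * of buf in network byte order. Returns the number of bytes written, or
 * 0 if buf is too small to hold both words.
 */
static size_t sketch_put_rpc_prefix(unsigned char *buf, size_t len,
                                    uint32_t xid, uint32_t direction)
{
    uint32_t words[2];

    if (len < sizeof(words))
        return 0;
    words[0] = htonl(xid);
    words[1] = htonl(direction);
    memcpy(buf, words, sizeof(words));
    return sizeof(words);
}
```

A caller would invoke `sketch_put_rpc_prefix(buf, sizeof(buf), xid, SKETCH_RPC_REPLY)` before handing the buffer to whatever code validates the RPC header.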
     

25 Aug, 2009

1 commit


15 Jul, 2009

1 commit

  • PKTINFO is needed to scrape the caller's IP address off the socket so
    RPC datagram replies are routed correctly. Fill in missing pieces in
    the kernel RPC server's UDP receive path to request IPv6 PKTINFO and
    correctly parse the IPv6 cmsg header.

    Without this patch, kernel RPC services drop all incoming requests on
    UDP on IPv6.

    Related commit: 7a37f5787e76bf1765c1add3a9a7163f841a28bb

    Signed-off-by: Chuck Lever
    Cc: Neil Brown
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

23 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     

19 Jun, 2009

1 commit

  • Currently, the sunrpc server is refusing to allow us to process new RPC
    calls if the TCP send buffer is 2/3 full, even if we do actually have
    enough free space to guarantee that we can send another request.
    The following patch fixes svc_tcp_has_wspace() so that we only stop
    processing requests if we know that the socket buffer cannot possibly fit
    another reply.

    It also fixes the tcp write_space() callback so that we only clear the
    SOCK_NOSPACE flag when the TCP send buffer is less than 2/3 full.
    This should ensure that the send window will grow as per the standard TCP
    socket code.

    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Trond Myklebust
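
The difference between the two admission policies described above can be sketched in plain C. This is an illustration only, not the body of svc_tcp_has_wspace(): the function names, the `reserved` parameter (space already promised to replies in progress) and the byte-count arguments are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old policy as described: refuse new work once the buffer is 2/3 full. */
static bool sketch_has_wspace_two_thirds(size_t sndbuf, size_t queued)
{
    return queued * 3 < sndbuf * 2;
}

/*
 * New policy as described: refuse only when the free space cannot hold
 * the space already reserved plus one more maximum-size reply.
 */
static bool sketch_has_wspace_fits_reply(size_t sndbuf, size_t queued,
                                         size_t reserved, size_t max_reply)
{
    size_t free_space = queued < sndbuf ? sndbuf - queued : 0;

    return free_space >= reserved + max_reply;
}
```

With a 90000-byte buffer holding 70000 queued bytes, the first check refuses new requests even though 20000 bytes are free, while the second still admits a request whose largest possible reply is 10000 bytes.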
     

18 Jun, 2009

1 commit


16 Jun, 2009

1 commit


28 May, 2009

1 commit

  • This reverts commit 47a14ef1af48c696b214ac168f056ddc79793d0e "svcrpc:
    take advantage of tcp autotuning", which uncovered some further problems
    in the server rpc code, causing significant performance regressions in
    common cases.

    We will likely reinstate this patch after releasing 2.6.30 and applying
    some work on the underlying fixes to the problem (developed by Trond).

    Reported-by: Jeff Moyer
    Cc: Olga Kornievskaia
    Cc: Jim Rees
    Cc: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

29 Apr, 2009

6 commits

  • Clean up svc_one_sock_name() by setting up automatic variables for
    frequently used expressions.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Add an arm to the switch statement in svc_one_sock_name() so it can
    construct the name of PF_INET6 sockets properly.

    Signed-off-by: Chuck Lever
    Cc: Aime Le Rouzic
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Use snprintf() in one_sock_name() to prevent overflowing the output
    buffer. If the name doesn't fit in the buffer, the buffer is filled
    in with an empty string, and -ENAMETOOLONG is returned.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Adjust the synopsis of svc_sock_names() to pass in the size of the
    output buffer. Add a documenting comment.

    This is a cosmetic change for now. A subsequent patch will make sure
    the buffer length is passed to one_sock_name(), where the length will
    actually be useful.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Adjust the synopsis of svc_addsock() to pass in the size of the output
    buffer. Add a documenting comment.

    This is a cosmetic change for now. A subsequent patch will make sure
    the buffer length is passed to one_sock_name(), where the length will
    actually be useful.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The svc_addr_len() helper function returns -EAFNOSUPPORT if it doesn't
    recognize the address family of the passed-in socket address. However,
    the return type of this function is size_t, which means -EAFNOSUPPORT
    is turned into a very large positive value in this case.

    The check in svc_udp_recvfrom() to see if the return value is less
    than zero therefore won't work at all.

    Additionally, handle_connect_req() passes this value directly to
    memset(). This could cause memset() to clobber a large chunk of memory
    if svc_addr_len() has returned an error. Currently the address family
    of these addresses, however, is known to be supported long before
    handle_connect_req() is called, so this isn't a real risk.

    Change the error return value of svc_addr_len() to zero, which fits in
    the range of size_t, and is safer to pass to memset() directly.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
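
The size_t problem described in the last commit of this group can be reproduced in userspace. The sketch below is not the kernel's svc_addr_len(); both function names are hypothetical. It only shows why a negative errno returned through an unsigned type can never be caught by a "less than zero" check, and why zero is a safer error value.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Buggy shape: the negative errno wraps to a huge positive size_t. */
static size_t sketch_addr_len_buggy(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    }
    return (size_t)-EAFNOSUPPORT; /* cast made explicit for illustration */
}

/* Fixed shape: zero fits in size_t and is harmless as a memset() length. */
static size_t sketch_addr_len_fixed(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    }
    return 0;
}
```

A caller of the fixed variant tests for `len == 0` instead of `len < 0`; the latter comparison is always false for an unsigned type.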
     

07 Apr, 2009

1 commit

  • * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
    nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
    nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
    nfsd41: Documentation/filesystems/nfs41-server.txt
    nfsd41: CREATE_EXCLUSIVE4_1
    nfsd41: SUPPATTR_EXCLCREAT attribute
    nfsd41: support for 3-word long attribute bitmask
    nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
    nfsd41: pass writable attrs mask to nfsd4_decode_fattr
    nfsd41: provide support for minor version 1 at rpc level
    nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
    nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
    nfsd41: access_valid
    nfsd41: clientid handling
    nfsd41: check encode size for sessions maxresponse cached
    nfsd41: stateid handling
    nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
    nfsd41: destroy_session operation
    nfsd41: non-page DRC for solo sequence responses
    nfsd41: Add a create session replay cache
    nfsd41: create_session operation
    ...

    Linus Torvalds
     

02 Apr, 2009

1 commit


29 Mar, 2009

3 commits

  • We are about to convert to using separate RPC listener sockets for
    PF_INET and PF_INET6. This echoes the way IPv6 is handled in user
    space by TI-RPC, and eliminates the need for ULPs to worry about
    mapped IPv4 AF_INET6 addresses when doing address comparisons.

    Start by setting the IPV6ONLY flag on PF_INET6 RPC listener sockets.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Since the sv_family field is going away, modify svc_setup_socket() to
    extract the protocol family from the passed-in socket instead of from
    the passed-in svc_serv struct.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The sv_family field is going away. Instead of using sv_family, have
    the svc_register() function take a protocol family argument.

    Since this argument represents a protocol family, and not an address
    family, this argument takes an int, as this is what is passed to
    sock_create_kern(). Also make sure svc_register's helpers are
    checking for PF_FOO instead of AF_FOO. The values of [AP]F_FOO are
    equivalent; this is simply a symbolic change to reflect the semantics
    of the value stored in that variable.

    sock_create_kern() should return EPFNOSUPPORT if the passed-in
    protocol family isn't supported, but it uses EAFNOSUPPORT for this
    case. We will stick with that tradition here, as svc_register()
    is called by the RPC server in the same path as sock_create_kern().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
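
The convention described in the last commit of this group (take the protocol family as an int, switch on PF_ constants, and report unsupported families with EAFNOSUPPORT) might look like this in userspace C. The helper name is hypothetical and the set of accepted families is an assumption; this is not svc_register() itself.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical check: accept only the two IP protocol families and
 * follow the sock_create_kern() tradition of returning -EAFNOSUPPORT
 * (rather than -EPFNOSUPPORT) for anything else.
 */
static int sketch_check_protocol_family(int family)
{
    switch (family) {
    case PF_INET:
    case PF_INET6:
        return 0;
    }
    return -EAFNOSUPPORT;
}
```

Because PF_INET and AF_INET have the same value, passing either constant gives the same result; the PF_ spelling only documents what the argument means.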
     

19 Mar, 2009

1 commit

  • Allow the NFSv4 server to make use of TCP autotuning behaviour, which
    was previously disabled by setting the sk_userlocks variable.

    Set the receive buffers to be big enough to receive the whole RPC
    request, and set this for the listening socket, not the accept socket.

    Remove the code that readjusts the receive/send buffer sizes for the
    accepted socket. Previously this code was used to influence the TCP
    window management behaviour, which is no longer needed when autotuning
    is enabled.

    This can improve IO bandwidth on networks with high bandwidth-delay
    products, where a large tcp window is required. It also simplifies
    performance tuning, since getting adequate tcp buffers previously
    required increasing the number of nfsd threads.

    Signed-off-by: Olga Kornievskaia
    Cc: Jim Rees
    Signed-off-by: J. Bruce Fields

    Olga Kornievskaia
     

08 Jan, 2009

2 commits


07 Jan, 2009

1 commit


16 Dec, 2008

1 commit


25 Nov, 2008

1 commit

  • The svc_addsock function adds transport instances without taking a
    reference on the sunrpc.ko module, however, the generic transport
    destruction code drops a reference when a transport instance
    is destroyed.

    Add a try_module_get call to the svc_addsock function for transport
    instances added by this function.

    Signed-off-by: Tom Tucker
    Signed-off-by: J. Bruce Fields
    Tested-by: Jeff Moyer

    Tom Tucker
     

31 Oct, 2008

1 commit


05 Oct, 2008

1 commit


30 Sep, 2008

1 commit

  • My plan is to use an AF_INET listener on systems that support only IPv4,
    and an AF_INET6 listener on systems that can support IPv6. Incoming
    IPv4 packets will be posted to an AF_INET6 listener with a mapped IPv4
    address.

    Max Matveev says:
    Creating a single listener can be dangerous - if net.ipv6.bindv6only
    is enabled then it's possible to create another listener in v4
    namespace on the same port and steal the traffic from the "unified"
    listener. You need to disable V6ONLY explicitly via a sockopt to stop
    that.

    Set appropriate socket option on RPC server listener sockets to prevent
    this.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
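
In userspace terms, the explicit sockopt that Max Matveev refers to is IPV6_V6ONLY at level IPPROTO_IPV6. A minimal sketch, assuming a plain TCP socket and a hypothetical helper name (the kernel RPC server sets the option through its own in-kernel path, which is not shown here):

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_INET6 TCP socket and clear
 * IPV6_V6ONLY explicitly, so the result does not depend on the
 * net.ipv6.bindv6only sysctl. Returns the descriptor, or -1 on error.
 */
static int sketch_open_v6_socket_accepting_v4(void)
{
    int off = 0;
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The option has to be set before bind(); after that, a listener on the returned socket also receives IPv4 connections, which appear with IPv4-mapped IPv6 addresses.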
     

24 Apr, 2008

3 commits


22 Feb, 2008

1 commit

  • Sorry for the noise, but here's the v3 of this compilation fix :)

    There are some places, which declare the char buf[...] on the stack
    to push it later into dprintk(). Since the dprintk sometimes (if the
    CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
    cause gcc to produce appropriate warnings.

    Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
    compile them out when not needed.

    Signed-off-by: Pavel Emelyanov
    Acked-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Pavel Emelyanov
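
The pattern described above, compiling out a stack buffer that exists only to feed a debug print, can be sketched as follows. The macro and function names are stand-ins prefixed with `SKETCH_`/`sketch_`, not the kernel's definitions, and the debug switch is an assumed compile-time flag.

```c
#include <stdio.h>

#ifdef SKETCH_DEBUG
# define SKETCH_IFDEBUG(x) x
# define sketch_dprintk(...) printf(__VA_ARGS__)
#else
# define SKETCH_IFDEBUG(x)
# define sketch_dprintk(...) do { } while (0)
#endif

/* Returns 1 for a usable port, 0 otherwise; the buffer is debug-only. */
static int sketch_describe_port(unsigned short port)
{
    /* Declared only in debug builds, so no unused-variable warning. */
    SKETCH_IFDEBUG(char buf[32]);

    SKETCH_IFDEBUG(snprintf(buf, sizeof(buf), "port %u", (unsigned)port));
    sketch_dprintk("listening on %s\n", buf);
    return port != 0;
}
```

Built without `-DSKETCH_DEBUG`, the declaration, the formatting call and the print all vanish, and the reference to `buf` disappears along with the macro arguments.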
     

02 Feb, 2008

3 commits

  • Some transports have a header in front of the RPC header. The current
    defer/revisit processing considers only the iov_len and arg_len to
    determine how much to back up when saving the original request
    to revisit. Add a field to the rqstp structure to save the size
    of the transport header so svc_defer can correctly compute
    the start of a request.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This functionally trivial patch moves all of the transport independent
    functions from the svcsock.c file to the transport independent svc_xprt.c
    file.

    In addition the following formatting changes were made:
    - White space cleanup
    - Function signatures on single line
    - The inline directive was removed
    - Lines over 80 columns were reformatted
    - The term 'socket' was changed to 'transport' in comments
    - The SMP comment was moved and updated.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • The svc_check_conn_limits function only manipulates xprt fields. Change references
    to svc_sock->sk_xprt to svc_xprt directly.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker