23 Nov, 2011

1 commit

  • By returning '0' instead of 'EAGAIN' when the tests in xs_nospace() fail
    to find evidence of socket congestion, we are making the RPC engine believe
    that the message was incorrectly sent and so it disconnects the socket
    instead of just retrying.

    The bug appears to have been introduced by commit
    5e3771ce2d6a69e10fcc870cdf226d121d868491 (SUNRPC: Ensure that xs_nospace
    return values are propagated).

    Reported-by: Andrew Cooper
    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org [>= 2.6.30]
    Tested-by: Andrew Cooper

    Trond Myklebust
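
The return-value distinction in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct fake_sock`, `nospace_status()`, `classify_send_status()` and the `STEP_*` values are hypothetical names, and the mapping from status to action simply mirrors what the commit message describes (a status of 0 for an incomplete send leads to a disconnect, while EAGAIN leads to a retry).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket state inspected after a short send. */
struct fake_sock {
    bool connected;         /* is the transport still connected? */
    bool async_nospace;     /* evidence of send-buffer congestion */
    bool waiting_for_space; /* set when we queue up for a write-space wakeup */
};

/* Hypothetical actions the caller may take after a send attempt. */
enum next_step { STEP_RETRY, STEP_RECONNECT, STEP_DISCONNECT };

/*
 * Model of the fixed behaviour: the default status is -EAGAIN, so the
 * caller retries even when no evidence of congestion is found.
 */
int nospace_status(struct fake_sock *sk)
{
    int ret = -EAGAIN;

    if (!sk->connected)
        return -ENOTCONN;
    if (sk->async_nospace)
        sk->waiting_for_space = true; /* wait for buffer space, then retry */
    return ret;
}

/* Model of the caller: only -EAGAIN means "just try again". */
enum next_step classify_send_status(int status)
{
    switch (status) {
    case -EAGAIN:
        return STEP_RETRY;
    case -ENOTCONN:
        return STEP_RECONNECT;
    default:
        /* Includes 0 for an incomplete send: treated as a failed send. */
        return STEP_DISCONNECT;
    }
}
```

In this model, returning 0 from `nospace_status()` would land in the default branch, which corresponds to the unwanted disconnect the commit fixes.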
     

11 Nov, 2011

1 commit


18 Jul, 2011

2 commits


15 Jul, 2011

1 commit


28 May, 2011

3 commits

  • TI-RPC introduces the capability of performing RPC over AF_LOCAL
    sockets. It uses this mainly for registering and unregistering
    local RPC services securely with the local rpcbind, but we could
    also conceivably use it as a generic upcall mechanism.

    This patch provides a client-side only implementation for the moment.
    We might also consider a server-side implementation to provide
    AF_LOCAL access to NLM (for statd downcalls, and such like).

    Autobinding is not supported on kernel AF_LOCAL transports at this
    time. Kernel ULPs must specify the pathname of the remote endpoint
    when an AF_LOCAL transport is created. rpcbind supports registering
    services available via AF_LOCAL, so the kernel could handle it with
    some adjustment to ->rpcbind and ->set_port. But we don't need this
    feature for doing upcalls via well-known named sockets.

    This has not been tested with ULPs that move a substantial amount of
    data. Thus, I can't attest to how robust the write_space and
    congestion management logic is.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
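
Because autobinding is not supported, the remote endpoint of an AF_LOCAL transport is identified purely by a pathname. The user-space sketch below shows how such an address is typically built. It is not the kernel implementation: `local_addr_from_path()` is a hypothetical helper, and the path used in any call is whatever well-known socket name the caller chooses.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fill an AF_LOCAL socket address from a pathname supplied by the caller.
 * Returns 0 on success, or a negative errno value if the path is missing
 * or does not fit in sun_path (including its terminating NUL).
 */
int local_addr_from_path(struct sockaddr_un *addr, const char *path)
{
    size_t len;

    if (path == NULL || path[0] == '\0')
        return -EINVAL;

    len = strlen(path);
    if (len >= sizeof(addr->sun_path))
        return -ENAMETOOLONG;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_LOCAL;
    memcpy(addr->sun_path, path, len + 1); /* copy the NUL as well */
    return 0;
}
```

The resulting structure could then be passed to connect() on a socket created with socket(AF_LOCAL, SOCK_STREAM, 0).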
     
  • Clean up: Use a more generic name for xs_encode_tcp_fragment_header();
    it's appropriate to use for all stream transport types. We're about
    to add a new stream transport.

    Also, move it to a place where it is more easily shared amongst the
    various send_request methods. And finally, replace the "htonl" macro
    invocation with its modern equivalent.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
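
For readers unfamiliar with the header this helper writes: RPC over a stream transport prefixes each fragment with a 4-byte big-endian record marker, whose top bit flags the last fragment and whose low 31 bits carry the fragment length. The sketch below encodes such a marker in user space. It is an illustration, not the kernel function: `encode_record_marker()` and the two macros are hypothetical names, and explicit byte shifts are used because the kernel's byte-order helpers are not available here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LAST_STREAM_FRAGMENT 0x80000000u /* top bit: final fragment of a record */
#define FRAGMENT_SIZE_MASK   0x7fffffffu /* low 31 bits: fragment length */
#define MARKER_SIZE          4u

/*
 * Write the record marker into the first four bytes of buf. buf_len is
 * the total length of the buffer, marker included, so the encoded length
 * is buf_len - 4. Returns 0 on success, -1 if the length is out of range.
 */
int encode_record_marker(uint8_t *buf, size_t buf_len)
{
    size_t reclen;
    uint32_t marker;

    if (buf_len < MARKER_SIZE)
        return -1;
    reclen = buf_len - MARKER_SIZE;
    if (reclen > FRAGMENT_SIZE_MASK)
        return -1;

    marker = LAST_STREAM_FRAGMENT | (uint32_t)reclen;

    /* Store in network (big-endian) byte order, independent of the host. */
    buf[0] = (uint8_t)(marker >> 24);
    buf[1] = (uint8_t)(marker >> 16);
    buf[2] = (uint8_t)(marker >> 8);
    buf[3] = (uint8_t)marker;
    return 0;
}
```

Because the marker format does not depend on the underlying socket type, the same encoding applies to any stream transport, which is the motivation for the more generic name.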
     
  • The TCP connection state code depends on the state_change() callback
    being called when the SYN_SENT state is set. However, the networking
    layer doesn't actually call us back in that case.

    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     

31 Mar, 2011

1 commit


23 Mar, 2011

1 commit


11 Mar, 2011

1 commit


15 Jan, 2011

1 commit

  • * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
    nfsd4: fix callback restarting
    nfsd: break lease on unlink, link, and rename
    nfsd4: break lease on nfsd setattr
    nfsd: don't support msnfs export option
    nfsd4: initialize cb_per_client
    nfsd4: allow restarting callbacks
    nfsd4: simplify nfsd4_cb_prepare
    nfsd4: give out delegations more quickly in 4.1 case
    nfsd4: add helper function to run callbacks
    nfsd4: make sure sequence flags are set after destroy_session
    nfsd4: re-probe callback on connection loss
    nfsd4: set sequence flag when backchannel is down
    nfsd4: keep finer-grained callback status
    rpc: allow xprt_class->setup to return a preexisting xprt
    rpc: keep backchannel xprt as long as server connection
    rpc: move sk_bc_xprt to svc_xprt
    nfsd4: allow backchannel recovery
    nfsd4: support BIND_CONN_TO_SESSION
    nfsd4: modify session list under cl_lock
    Documentation: fl_mylease no longer exists
    ...

    Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The
    vfs-scale work touched some msnfs cases, and this merge removes support
    for that entirely, so the conflict was trivial to resolve.

    Linus Torvalds
     

12 Jan, 2011

3 commits

  • This allows us to reuse the xprt associated with a server connection if
    one has already been set up.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Multiple backchannels can share the same TCP connection; from RFC 5661,
    section 2.10.3.1:

    A connection's association with a session is not exclusive. A
    connection associated with the channel(s) of one session may be
    simultaneously associated with the channel(s) of other sessions
    including sessions associated with other client IDs.

    However, if multiple backchannels share a connection, they must all share
    the same xid stream (hence the same rpc_xprt); the only way we have to
    match replies with calls at the rpc layer is using the xid.

    So, keep the rpc_xprt around as long as the connection lasts, in case
    we're asked to use the connection as a backchannel again.

    Requests to create new backchannel clients over a given server
    connection should result in creating new clients that reuse the
    existing rpc_xprt.

    But to start, just reject attempts to associate multiple rpc_xprt's with
    the same underlying bc_xprt.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
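
The requirement that all backchannels on one connection share a single xid stream can be shown with a toy model. This is a sketch only: `struct shared_xprt`, `demo_send_call()` and `demo_match_reply()` are hypothetical names and bear no relation to the kernel data structures. The point is that replies carry nothing but an xid, so xids must be unique per connection rather than per client.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One outstanding call, remembered so that its reply can be matched. */
struct pending_call {
    uint32_t xid;
    int client_id;
    int in_use;
};

/* Hypothetical transport shared by every client using one connection. */
struct shared_xprt {
    uint32_t next_xid; /* a single xid stream for the whole connection */
    struct pending_call pending[MAX_PENDING];
};

/* Record a call from client_id; stores the assigned xid in *xid. */
int demo_send_call(struct shared_xprt *xprt, int client_id, uint32_t *xid)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *call = &xprt->pending[i];

        if (!call->in_use) {
            call->xid = xprt->next_xid++;
            call->client_id = client_id;
            call->in_use = 1;
            *xid = call->xid;
            return 0;
        }
    }
    return -1; /* no free slot */
}

/* Return the client that owns the reply with this xid, or -1 if none. */
int demo_match_reply(struct shared_xprt *xprt, uint32_t xid)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *call = &xprt->pending[i];

        if (call->in_use && call->xid == xid) {
            call->in_use = 0;
            return call->client_id;
        }
    }
    return -1;
}
```

If each client kept its own xid counter instead, two clients could issue the same xid on the same connection and a reply could not be attributed to either of them.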
     
  • The backchannel xprt pointer (sk_bc_xprt) seems obviously transport-level
    information even if it's currently used only by the server socket code.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

15 Dec, 2010

1 commit

  • cancel_rearming_delayed_work[queue]() was superseded by
    cancel_delayed_work_sync() quite some time ago. Convert all the
    in-kernel users. The conversions are completely equivalent and
    trivial.

    Signed-off-by: Tejun Heo
    Acked-by: "David S. Miller"
    Acked-by: Greg Kroah-Hartman
    Acked-by: Evgeniy Polyakov
    Cc: Jeff Garzik
    Cc: Benjamin Herrenschmidt
    Cc: Mauro Carvalho Chehab
    Cc: netdev@vger.kernel.org
    Cc: Anton Vorontsov
    Cc: David Woodhouse
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Cc: Alex Elder
    Cc: xfs-masters@oss.sgi.com
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Andrew Morton
    Cc: netfilter-devel@vger.kernel.org
    Cc: Trond Myklebust
    Cc: linux-nfs@vger.kernel.org

    Tejun Heo
     

27 Oct, 2010

1 commit

  • * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits)
    svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
    svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
    svcrpc: assume svc_delete_xprt() called only once
    svcrpc: never clear XPT_BUSY on dead xprt
    nfsd4: fix connection allocation in sequence()
    nfsd4: only require krb5 principal for NFSv4.0 callbacks
    nfsd4: move minorversion to client
    nfsd4: delay session removal till free_client
    nfsd4: separate callback change and callback probe
    nfsd4: callback program number is per-session
    nfsd4: track backchannel connections
    nfsd4: confirm only on succesful create_session
    nfsd4: make backchannel sequence number per-session
    nfsd4: use client pointer to backchannel session
    nfsd4: move callback setup into session init code
    nfsd4: don't cache seq_misordered replies
    SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
    SUNRPC: Use conventional switch statement when reclassifying sockets
    sunrpc/xprtrdma: clean up workqueue usage
    sunrpc: Turn list_for_each-s into the ..._entry-s
    ...

    Fix up trivial conflicts (two different deprecation notices added in
    separate branches) in Documentation/feature-removal-schedule.txt

    Linus Torvalds
     

21 Oct, 2010

2 commits

  • The source address field in the transport's sock_xprt is initialized
    ONLY IF the RPC application passed a pointer to a source address
    during the call to rpc_create(). However, xs_bind() subsequently uses
    the value of this field without regard to whether the source address
    was initialized during transport creation or not.

    So far we've been lucky: the uninitialized value of this field is
    zeroes. xs_bind(), until recently, used only the sin[6]_addr field in
    this sockaddr, and all zeroes is a valid value for this: it means
    ANYADDR. This is a happy coincidence.

    However, xs_bind() now wants to use the sa_family field as well, and
    expects it to be initialized to something other than zero.

    Therefore, the source address sockaddr field should be fully
    initialized at transport create time in _every_ case, not just when
    the RPC application wants to use a specific bind address.

    Bruce added a workaround for this missing initialization by adjusting
    commit 6bc9638a, but the "right" way to do this is to ensure that the
    source address sockaddr is always correctly initialized from the
    get-go.

    This patch doesn't introduce a behavior change. It's simply a
    clean-up of Bruce's fix, to prevent future problems of this kind. It
    may look like overkill, but

    a) it clearly documents the default initial value of this field,

    b) it doesn't assume that the sockaddr_storage memory is first
    initialized to any particular value, and

    c) it will fail verbosely if some unknown address family is passed
    in

    Originally introduced by commit d3bc9a1d.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
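
Points (a) through (c) above can be sketched as a user-space helper that fully initializes a wildcard source address for each supported family. This is an illustration under assumptions, not the patch itself: `init_any_srcaddr()` is a hypothetical name, and returning an error code stands in for whatever verbose failure the kernel code reports.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Initialize *ss to the "any" address of the given family. Nothing is
 * assumed about the previous contents of *ss. Returns 0 on success or
 * -EAFNOSUPPORT for an unknown address family.
 */
int init_any_srcaddr(struct sockaddr_storage *ss, int family)
{
    struct sockaddr_in sin;
    struct sockaddr_in6 sin6;

    memset(ss, 0, sizeof(*ss));

    switch (family) {
    case AF_INET:
        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl(INADDR_ANY); /* documented default */
        memcpy(ss, &sin, sizeof(sin));
        return 0;
    case AF_INET6:
        memset(&sin6, 0, sizeof(sin6));
        sin6.sin6_family = AF_INET6;
        sin6.sin6_addr = in6addr_any;            /* documented default */
        memcpy(ss, &sin6, sizeof(sin6));
        return 0;
    default:
        return -EAFNOSUPPORT;                    /* fail loudly, don't guess */
    }
}
```

With this shape, later code that reads the family field never sees zero for a supported family, regardless of whether the caller asked for a specific bind address.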
     
  • Clean up.

    Defensive coding: If "family" is ever something that is neither
    AF_INET nor AF_INET6, xs_reclassify_socket6() is not the appropriate
    default action. Choose to do nothing in that case.

    Introduced by commit 6bc9638a.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

19 Oct, 2010

14 commits


02 Oct, 2010

4 commits


25 Sep, 2010

1 commit

  • Each socket has:

    One spinlock (sk_slock.slock)
    One rwlock (sk_callback_lock)

    Possible scenarios are:

    (A) (this is used in net/sunrpc/xprtsock.c)
    read_lock(&sk->sk_callback_lock) (without blocking BH)

    spin_lock(&sk->sk_slock.slock);
    ...
    read_lock(&sk->sk_callback_lock);
    ...

    (B)
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)

    (C)
    spin_lock_bh(&sk->sk_slock)
    ...
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)
    spin_unlock_bh(&sk->sk_slock)

    Case (C) conflicts with (A):

    CPU1 [A]                      CPU2 [C]
    read_lock(callback_lock)
                                  spin_lock_bh(slock)

    We have one problematic (C) use case in inet_csk_listen_stop() :

    local_bh_disable();
    bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
    WARN_ON(sock_owned_by_user(child));
    ...
    sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)

    lockdep is not happy with this, as reported by Tetsuo Handa.

    It seems the only way to deal with this is to use
    read_lock_bh(&sk->sk_callback_lock) everywhere.

    Thanks to Jarek for pointing out a bug in my first attempt and suggesting
    this solution.

    Reported-by: Tetsuo Handa
    Tested-by: Tetsuo Handa
    Signed-off-by: Eric Dumazet
    CC: Jarek Poplawski
    Tested-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Aug, 2010

1 commit

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    NFS: Fix an Oops in the NFSv4 atomic open code
    NFS: Fix the selection of security flavours in Kconfig
    NFS: fix the return value of nfs_file_fsync()
    rpcrdma: Fix SQ size calculation when memreg is FRMR
    xprtrdma: Do not truncate iova_start values in frmr registrations.
    nfs: Remove redundant NULL check upon kfree()
    nfs: Add "lookupcache" to displayed mount options
    NFS: allow close-to-open cache semantics to apply to root of NFS filesystem
    SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)

    Linus Torvalds
     

11 Aug, 2010

1 commit

  • Putting the get and set functions in an ops struct is more kernel-ish,
    saves some space, and also allows us to expand the ops without breaking
    all the callers who are happy for the new members to be NULL.

    The few places which defined their own param types are changed to the
    new scheme (more, which crept in recently, are fixed in following patches).

    Since we're touching them anyway, we change get() and set() to take a
    const struct kernel_param (which they really are). This causes some
    harmless warnings until we fix them (in following patches).

    To reduce churn, module_param_call creates the ops struct so the callers
    don't have to change (and casts the functions to reduce warnings).
    The modern version which takes an ops struct is called module_param_cb.

    Signed-off-by: Rusty Russell
    Reviewed-by: Takashi Iwai
    Tested-by: Phil Carmody
    Cc: "David S. Miller"
    Cc: Ville Syrjala
    Cc: Dmitry Torokhov
    Cc: Alessandro Rubini
    Cc: Michal Januszewski
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-input@vger.kernel.org
    Cc: linux-fbdev-devel@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: netdev@vger.kernel.org

    Rusty Russell
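
The benefit of an ops struct whose new members may be NULL can be shown with a user-space analogue. This is only a sketch: `struct demo_param`, `struct demo_param_ops`, the `release` member and the helper functions are all hypothetical names loosely modeled on the description above, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

struct demo_param;

/* Operations shared by every parameter of one type. */
struct demo_param_ops {
    int (*set)(const char *val, const struct demo_param *p);
    int (*get)(char *buf, size_t len, const struct demo_param *p);
    void (*release)(void *arg); /* optional member added later; may be NULL */
};

struct demo_param {
    const char *name;
    const struct demo_param_ops *ops;
    void *arg; /* storage for the parameter value */
};

/* Parse a decimal integer into the parameter's storage. */
int demo_set_int(const char *val, const struct demo_param *p)
{
    char *end;
    long v = strtol(val, &end, 10);

    if (end == val || *end != '\0')
        return -EINVAL;
    *(int *)p->arg = (int)v;
    return 0;
}

/* Format the parameter's value; returns the number of characters needed. */
int demo_get_int(char *buf, size_t len, const struct demo_param *p)
{
    return snprintf(buf, len, "%d", *(const int *)p->arg);
}

/* Existing users name only the members they need; the rest stay NULL. */
const struct demo_param_ops demo_int_ops = {
    .set = demo_set_int,
    .get = demo_get_int,
};

/* Core code checks optional members before calling them. */
void demo_param_release(const struct demo_param *p)
{
    if (p->ops->release)
        p->ops->release(p->arg);
}
```

Because `demo_int_ops` uses designated initializers, adding the `release` member did not require touching it, which is the kind of expansion the commit message has in mind.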