Doug / smarc-fsl-linux-kernel | Embedian Git Server

10 Jul, 2008

1 commit

b22602a67 SUNRPC: Ensure all transports set rq_xtime consistently

The RPC client uses the rq_xtime field in each RPC request to determine the
round-trip time of the request. Currently, the rq_xtime field is
initialized by each transport just before it starts enqueuing a request to
be sent. However, transports do not handle initializing this value
consistently; sometimes they don't initialize it at all.

To make the measurement of request round-trip time consistent for all
RPC client transport capabilities, pull rq_xtime initialization into the
RPC client's generic transport logic. Now all transports will get a
standardized RTT measure automatically, from:

xprt_transmit()

to

xprt_complete_rqst()

This makes round-trip time calculation more accurate for the TCP transport.
The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
buffer is full, so the TCP transport's ->send_request() method may call
the ->sendmsg() method repeatedly until it gets all of the request's bytes
queued in the socket's buffer.

Currently, the TCP transport sets the rq_xtime field every time through
that loop so the final value is the timestamp just before the *last* call
to the underlying socket's ->sendmsg() method. After this patch, the
rq_xtime field contains a timestamp that reflects the time just before the
*first* call to ->sendmsg().

This is consequential under heavy workloads because large requests often
take multiple ->sendmsg() calls to get all the bytes of a request queued.
The TCP transport causes the request to sleep until the remote end of the
socket has received enough bytes to clear space in the socket's local
output buffer. This delay can be quite significant.

The method introduced by this patch is a more accurate measure of RTT
for stream transports, since the server can cause enough back pressure
to delay (i.e., increase the latency of) requests from the client.

Additionally, this patch corrects the behavior of the RDMA transport, which
entirely neglected to initialize the rq_xtime field. RPC performance
metrics for RDMA transports now display correct RPC request round-trip
times.
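
The difference between stamping before the first and before the last ->sendmsg() call can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: every demo_* name, the 10-tick cost per send, and the simulated clock are hypothetical; only the rq_xtime field name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an RPC request; only rq_xtime mirrors the commit. */
struct demo_rqst {
    long rq_xtime;   /* timestamp used as the start of the RTT interval */
    size_t len;      /* total bytes to queue */
    size_t sent;     /* bytes queued so far */
};

/* Simulated sendmsg(): queues up to chunk bytes and costs 10 clock ticks. */
static void demo_sendmsg(long *now, struct demo_rqst *req, size_t chunk)
{
    size_t left = req->len - req->sent;

    req->sent += left < chunk ? left : chunk;
    *now += 10;
}

/* Old TCP behaviour: restamp on every pass through the send loop. */
static long demo_rtt_stamp_each_pass(size_t len, size_t chunk, long reply_delay)
{
    struct demo_rqst req = { 0, len, 0 };
    long now = 0;

    assert(chunk > 0);
    while (req.sent < req.len) {
        req.rq_xtime = now;          /* overwritten on each pass */
        demo_sendmsg(&now, &req, chunk);
    }
    now += reply_delay;              /* the reply arrives */
    return now - req.rq_xtime;
}

/* New behaviour: stamp once in generic code, before the first send. */
static long demo_rtt_stamp_once(size_t len, size_t chunk, long reply_delay)
{
    struct demo_rqst req = { 0, len, 0 };
    long now = 0;

    assert(chunk > 0);
    req.rq_xtime = now;              /* set exactly once */
    while (req.sent < req.len)
        demo_sendmsg(&now, &req, chunk);
    now += reply_delay;
    return now - req.rq_xtime;
}
```

With a request that needs three passes, the per-pass stamp hides the time spent waiting for buffer space, while the single stamp includes it.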

Signed-off-by: Chuck Lever
Acked-by: Tom Talpey
Signed-off-by: Trond Myklebust

Chuck Lever
2008-07-10 00:09:15 +0800

25 Apr, 2008

1 commit

233607dbb Merge branch 'devel'

Trond Myklebust
2008-04-25 02:01:02 +0800

20 Apr, 2008

3 commits

7c1d71cf5 SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests

NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.

We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.
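
A minimal sketch of the cookie idea, assuming hypothetical demo_* types and helpers (they are not the kernel's actual structures or functions): each request remembers the cookie value of the connection it was sent on, and a forced disconnect only happens if that connection has not already been broken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport state. */
struct demo_xprt {
    unsigned int connect_cookie;   /* bumped every time the connection breaks */
    bool connected;
};

/* Hypothetical request state. */
struct demo_rqst {
    unsigned int rq_connect_cookie; /* cookie seen when the request was sent */
};

/* Record which connection "generation" carried this request. */
static void demo_send(const struct demo_xprt *xprt, struct demo_rqst *req)
{
    req->rq_connect_cookie = xprt->connect_cookie;
}

static void demo_reconnect(struct demo_xprt *xprt)
{
    xprt->connected = true;
}

/*
 * Break the connection only if it is still the one the request was sent on.
 * Returns true if this call actually forced a disconnect.
 */
static bool demo_conditional_disconnect(struct demo_xprt *xprt,
                                        unsigned int cookie)
{
    assert(xprt != NULL);
    if (cookie != xprt->connect_cookie)
        return false;              /* already broken since the send */
    if (!xprt->connected)
        return false;              /* nothing to break */
    xprt->connected = false;
    xprt->connect_cookie++;
    return true;
}
```

When two requests sent on the same connection are retransmitted, only the first one breaks it; the second sees a stale cookie and leaves the new connection alone.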

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:55:12 +0800

06b4b681a SUNRPC: remove XS_SENDMSG_RETRY

The condition for exiting from the loop in xs_tcp_send_request() should be
that we find we're not making progress (i.e. number of bytes sent is 0).
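
The exit condition described above can be sketched as a loop that stops on an error or as soon as a send attempt queues zero bytes. The demo_sock type and both functions are hypothetical illustrations, not the code of xs_tcp_send_request().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical socket model: limited buffer space, limited bytes per call. */
struct demo_sock {
    size_t space;   /* free bytes left in the output buffer */
    size_t chunk;   /* most bytes accepted by a single send call */
};

/* Simulated send: returns the number of bytes queued, possibly 0. */
static long demo_sock_send(struct demo_sock *sk, size_t len)
{
    size_t n = len < sk->chunk ? len : sk->chunk;

    if (n > sk->space)
        n = sk->space;
    sk->space -= n;
    return (long)n;
}

/*
 * Keep sending while progress is made. Returns 0 when everything was
 * queued, -EAGAIN when a call made no progress, or a negative error.
 */
static long demo_send_request(struct demo_sock *sk, size_t total, size_t *sent)
{
    long status = 0;

    assert(sk != NULL && sent != NULL);
    *sent = 0;
    while (*sent < total) {
        status = demo_sock_send(sk, total - *sent);
        if (status < 0)
            break;                 /* hard error */
        if (status == 0) {
            status = -EAGAIN;      /* no progress: stop looping */
            break;
        }
        *sent += (size_t)status;
        status = 0;
    }
    return status;
}
```

No retry counter is needed: the loop is bounded because each pass either queues at least one byte or terminates.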

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:55:05 +0800

b6ddf64ff SUNRPC: Fix up xprt_write_space()

The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
or not we have someone waiting for buffer memory. Convert the SUNRPC layer
to use the same idiom.
Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
fact, the most common case will be that there is nobody waiting for buffer
space.

SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
limited by the application window. Ensure that we follow the same idiom as
the rest of the networking layer here too.

Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
write_space() doesn't keep waking things up on xprt->pending.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:52:44 +0800

06 Mar, 2008

1 commit

0dc47877a net: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__
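
For illustration, __func__ is the predefined identifier standardized in C99, while __FUNCTION__ is an older GCC spelling of the same string. The helper names below are hypothetical; the sketch only shows that __func__ yields the enclosing function's name and, under GCC-compatible compilers, matches __FUNCTION__.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a debug prefix the portable way, using C99 __func__. */
static int demo_log_prefix(char *buf, size_t len)
{
    assert(buf != NULL);
    return snprintf(buf, len, "RPC: %s: ", __func__);
}

/* On GCC-compatible compilers both spellings name the same function. */
static int demo_names_agree(void)
{
#ifdef __GNUC__
    return strcmp(__func__, __FUNCTION__) == 0;
#else
    return 1;   /* __FUNCTION__ may not exist here; nothing to compare */
#endif
}
```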

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-03-06 12:47:47 +0800

29 Feb, 2008

1 commit

ff2d7db84 SUNRPC: Ensure that we read all available tcp data

Don't stop until we run out of data, or we hit an error.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-02-29 15:26:27 +0800

30 Jan, 2008

13 commits

33e01dc7f SUNRPC: Clean up functions that free address_strings array

Clean up: document the rule (kfree) and the exceptions
(RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in
a transport's address_strings array.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2008-01-30 15:06:08 +0800

b454ae906 SUNRPC: fewer conditionals in the format_ip_address routines

Clean up: have the set up routines explicitly pass the strings to be used
for the transport name and NETID. This removes a number of conditionals
and dependencies on rpc_xprt.prot, which is overloaded.

Tighten up type checking on the address_strings array while we're at it.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2008-01-30 15:06:04 +0800

ba7392bb3 SUNRPC: Add support for per-client timeout values

In order to be able to support setting the timeo and retrans parameters on
a per-mountpoint basis, we move the rpc_timeout structure into the
rpc_clnt.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:59 +0800

2881ae74e SUNRPC: Clean up the transport timeout initialisation

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:58 +0800

663b8858d SUNRPC: Reconnect immediately whenever the server isn't refusing it.

If we've disconnected from the server, rather than the other way round,
then it makes little sense to wait 3 seconds before reconnecting.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:27 +0800

62da3b248 SUNRPC: Rename xprt_disconnect()

xprt_disconnect() should really only be called when the transport shutdown
is completed, and it is time to wake up any pending tasks. Rename it to
xprt_disconnect_done() in order to reflect the semantic change.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:27 +0800

3ebb067d9 SUNRPC: Make call_status()/call_decode() call xprt_force_disconnect()

Move the calls to xprt_disconnect() over to xprt_force_disconnect() in
order to enable the transport layer to manage the state of the
XPRT_CONNECTED flag.
Ditto in xs_tcp_read_fraghdr().

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:26 +0800

7272dcd31 SUNRPC: xprt_autoclose() should not call xprt_disconnect()

The transport layer should do that itself whenever appropriate.

Note that the RDMA transport already assumes that it needs to call
xprt_disconnect in xprt_rdma_close().
For TCP sockets, we want to call xprt_disconnect() only after the
connection has been closed by both ends.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:26 +0800

e06799f95 SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket

By using shutdown() rather than close() we allow the RPC client to wait
for the TCP close handshake to complete before we start trying to reconnect
using the same port.
We use only shutdown(SHUT_WR) rather than shutting down both directions;
however, we wait until the server has closed the connection on its side.
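
The half-close behaviour of shutdown(SHUT_WR) can be demonstrated in user space. This sketch assumes a POSIX system and uses an AF_UNIX socketpair() in place of a real TCP connection, so it shows only the read/write semantics, not the TCP close handshake or port reuse; demo_half_close() is a hypothetical name.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 0 if a half-closed socket behaves as expected:
 * the peer sees end-of-file, yet can still send data back.
 */
static int demo_half_close(void)
{
    int sv[2];
    char buf[8];
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], "call", 4) == 4 &&
        shutdown(sv[0], SHUT_WR) == 0 &&          /* close write side only */
        read(sv[1], buf, sizeof(buf)) == 4 &&     /* peer gets the data */
        read(sv[1], buf, sizeof(buf)) == 0 &&     /* then end-of-file */
        write(sv[1], "reply", 5) == 5 &&          /* peer can still write */
        read(sv[0], buf, sizeof(buf)) == 5)       /* and we can still read */
        ok = 1;

    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Because the read side stays open, the caller can keep waiting on the socket until the other end closes too, which close() would not allow.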

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:26 +0800

ef8036707 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:25 +0800

3b948ae5b SUNRPC: Allow the client to detect if the TCP connection is closed

Add an xprt->state bit to enable the TCP ->state_change() method to signal
whether or not the TCP connection is in the process of closing down.
This will be used by the reconnection logic in a separate patch.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:25 +0800

67a391d72 SUNRPC: Fix TCP rebinding logic

Currently the TCP rebinding logic assumes that if we're not using a
reserved port, then we don't need to reconnect on the same port if a
disconnection event occurs. This breaks most RPC duplicate reply cache
implementations.

Also take into account the fact that xprt_min_resvport and
xprt_max_resvport may change while we're reconnecting, since the user may
change them at any time via the sysctls. Ensure that we check the port
boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the
boundaries change, we only scan the ports a maximum of 2 times.
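
The bounded port scan can be sketched as follows. All demo_* names are hypothetical, the bind attempt is simulated, and the two globals stand in for the sysctl-controlled bounds; the point is that the bounds are re-read on every step and that the scan gives up after wrapping around twice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tunable reserved-port bounds. */
static unsigned short demo_min_resvport = 665;
static unsigned short demo_max_resvport = 1023;

/* Simulated bind(): only demo_free_port (if non-zero) can be bound. */
static unsigned short demo_free_port;
static int demo_attempts;

static int demo_try_bind(unsigned short port)
{
    demo_attempts++;
    return port == demo_free_port ? 0 : -EADDRINUSE;
}

/* Pick the next candidate, checking the current bounds every time. */
static unsigned short demo_next_port(unsigned short port)
{
    if (port <= demo_min_resvport || port > demo_max_resvport)
        return demo_max_resvport;   /* wrap to the top of the range */
    return (unsigned short)(port - 1);
}

/* Scan downwards from start; stop after two wrap-arounds. */
static int demo_bind_resvport(unsigned short start, unsigned short *bound)
{
    unsigned short port = start, last;
    int err, nloop = 0;

    assert(bound != NULL);
    do {
        err = demo_try_bind(port);
        if (err == 0) {
            *bound = port;
            break;
        }
        last = port;
        port = demo_next_port(port);
        if (port > last)
            nloop++;                /* we wrapped around */
    } while (err == -EADDRINUSE && nloop != 2);
    return err;
}
```

Counting wrap-arounds rather than ports keeps the loop finite even if the bounds are changed while the scan is in progress.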

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:25 +0800

66af1e558 SUNRPC: Fix a race in xs_tcp_state_change()

When scheduling the autoclose RPC call, we want to ensure that we don't
race against the test_bit() call in xprt_clear_locked().

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:24 +0800

29 Jan, 2008

1 commit

1781f7f58 [UDP]: Restore missing inDatagrams increments

The previous move of the UDP inDatagrams counter caused the
counting of encapsulated packets, SUNRPC data (as opposed to call)
packets and RXRPC packets to go missing.

This patch restores all of these.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2008-01-29 06:56:33 +0800

27 Nov, 2007

1 commit

483066d62 SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static

xs_setup_{udp,tcp}() can now become static.

Signed-off-by: Adrian Bunk
Signed-off-by: Trond Myklebust

Adrian Bunk
2007-11-27 05:24:50 +0800

16 Oct, 2007

1 commit

f4921aff5 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6

* git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits)
NFSv4: Fix a typo in nfs_inode_reclaim_delegation
NFS: Add a boot parameter to disable 64 bit inode numbers
NFS: nfs_refresh_inode should clear cache_validity flags on success
NFS: Fix a connectathon regression in NFSv3 and NFSv4
NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
SUNRPC: Don't call xprt_release in call refresh
SUNRPC: Don't call xprt_release() if call_allocate fails
SUNRPC: Fix buggy UDP transmission
[23/37] Clean up duplicate includes in
[2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static
SUNRPC: Use correct type in buffer length calculations
SUNRPC: Fix default hostname created in rpc_create()
nfs: add server port to rpc_pipe info file
NFS: Get rid of some obsolete macros
NFS: Simplify filehandle revalidation
NFS: Ensure that nfs_link() returns a hashed dentry
NFS: Be strict about dentry revalidation when doing exclusive create
NFS: Don't zap the readdir caches upon error
NFS: Remove the redundant nfs_reval_fsid()
NFSv3: Always use directory post-op attributes in nfs3_proc_lookup
...

Fix up trivial conflict due to sock_owned_by_user() cleanup manually in
net/sunrpc/xprtsock.c

Linus Torvalds
2007-10-16 01:47:35 +0800

11 Oct, 2007

1 commit

02b3d3463 [NET] Cleanup: Use sock_owned_by_user() macro

Changes asserts in sunrpc to use sock_owned_by_user() macro instead of
referencing sock_lock.owner directly.

Signed-off-by: John Heffner
Signed-off-by: David S. Miller

John Heffner
2007-10-11 07:49:00 +0800

10 Oct, 2007

16 commits

2199700f1 SUNRPC: Fix buggy UDP transmission

xs_sendpages() may return a negative result. We sure as hell don't want to
add that to the 'tk_bytes_sent' tally...
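
A minimal sketch of the guard, assuming a hypothetical demo_task type (only the tk_bytes_sent name comes from the commit message): a send result is added to the tally only when it is a byte count, never when it is a negative error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical task state; tk_bytes_sent is the running byte tally. */
struct demo_task {
    unsigned long tk_bytes_sent;
};

/*
 * Account for the result of one send attempt. A negative status is an
 * error code and must not be mixed into the unsigned byte counter.
 */
static int demo_account_send(struct demo_task *task, int status)
{
    assert(task != NULL);
    if (status >= 0)
        task->tk_bytes_sent += (unsigned long)status;
    return status;
}
```

Without the check, adding a value such as -EAGAIN to an unsigned counter would wrap it to a huge number and corrupt the statistics.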

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-10-10 05:20:37 +0800

1321d8d97 SUNRPC: Fix bytes-per-op accounting for RPC over UDP

NFS performance metrics reported zero bytes sent per op when mounting with
UDP. The UDP socket transport wasn't properly counting the number of bytes
sent.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:18:19 +0800

4fa016eb2 NFS/SUNRPC: support transport protocol naming

To prepare for including non-sockets-based RPC transports, select
RPC transports by an identifier (to be used in following patches).

Signed-off-by: Tom Talpey
Signed-off-by: Trond Myklebust

Talpey, Thomas
2007-10-10 05:17:50 +0800

49c36fcc4 SUNRPC: rearrange RPC sockets definitions

To prepare for including non-sockets-based RPC transports, move the
sockets-dependent definitions into their own file.

Signed-off-by: Tom Talpey
Signed-off-by: Trond Myklebust

Talpey, Thomas
2007-10-10 05:17:48 +0800

3c341b0b9 SUNRPC: rename the rpc_xprtsock_create structure

To prepare for including non-sockets-based RPC transports, change the
overly suggestive name of the transport creation arguments struct.

Signed-off-by: Tom Talpey
Signed-off-by: Trond Myklebust

Talpey, Thomas
2007-10-10 05:17:45 +0800

bc25571e2 SUNRPC: Finish API to load RPC transport implementations dynamically

Allow RPC client transport implementations to be loaded as needed, or
as they become available from distributors or third-party vendors.

Note that we leave the IP sockets implementation in sunrpc.o
permanently, as IP functionality is always available in any
kernel that runs NFS.

Signed-off-by: Chuck Lever
Signed-off-by: Tom Talpey
Signed-off-by: Trond Myklebust

Talpey, Thomas
2007-10-10 05:17:42 +0800

4417c8c41 SUNRPC: export per-transport rpcbind netid's

The rpcbind (v3+) netid is provided by each RPC client transport. This fixes
an omission in IPv6 rpcbind client support, and enables future extension.

Signed-off-by: Tom Talpey
Signed-off-by: Trond Myklebust

Talpey, Thomas
2007-10-10 05:17:20 +0800

756805e7a SUNRPC: Add support for formatted universal addresses

"Universal addresses" are a string representation of an IP address and
port. They are described fully in RFC 3530, section 2.2. Add support
for generating them in the RPC client's socket transport module.
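
As an illustration of the format for IPv4, a universal address is the dotted-quad address followed by the high and low bytes of the port as two more decimal fields (h1.h2.h3.h4.p1.p2). The function below is a hypothetical user-space sketch, not the helper added by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an IPv4 address and port as a universal address string.
 * Returns the snprintf() result, i.e. the length the string needs.
 */
static int demo_format_uaddr4(const unsigned char ip[4], unsigned short port,
                              char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%u.%u.%u.%u.%u.%u",
                    (unsigned)ip[0], (unsigned)ip[1],
                    (unsigned)ip[2], (unsigned)ip[3],
                    (unsigned)(port >> 8),      /* p1: high byte of port */
                    (unsigned)(port & 0xff));   /* p2: low byte of port */
}
```

For example, port 2049 is 8 * 256 + 1, so 192.168.1.10 port 2049 becomes 192.168.1.10.8.1.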

Signed-off-by: Chuck Lever

Chuck Lever
2007-10-10 05:16:29 +0800

8945ee5e2 SUNRPC: Split xs_reclassify_socket into an IPv4 and IPv6 version

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:26 +0800

95392c593 SUNRPC: Add a helper for extracting the address using the correct type

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:24 +0800

8f9d5b1a2 SUNRPC: Add IPv6 address support to net/sunrpc/xprtsock.c

Finalize support for setting up RPC client transports to remote RPC
services addressed via IPv6.

Based on work done by Gilles Quillard at Bull Open Source.

Signed-off-by: Chuck Lever
Cc: Aurelien Charbon
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:21 +0800

68e220bd5 SUNRPC: create connect workers for IPv6

Clone separate connect worker functions for connecting AF_INET6 sockets.

Signed-off-by: Chuck Lever
Cc: Aurelien Charbon
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:18 +0800

9c3d72de2 SUNRPC: Rename IPv4 connect workers

Prepare for introduction of IPv6 versions of same.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:15 +0800

16be2d20d SUNRPC: Refactor a part of socket connect logic into a helper function

Finishing a socket connect is the same for IPv4 and IPv6, so split it out
into a helper.

Signed-off-by: Chuck Lever
Cc: Aurelien Charbon
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:13 +0800

90058d37c SUNRPC: create an IPv6-savvy mechanism for binding to a reserved port

Clone xs_bindresvport into two functions, one that can handle IPv4
addresses, and one that can handle IPv6 addresses.

Signed-off-by: Chuck Lever
Cc: Aurelien Charbon
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:10 +0800

7dc753f03 SUNRPC: Rename xs_bind() to prepare for IPv6-specific bind method

Prepare for introduction of IPv6-specific socket bind function.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-10-10 05:16:08 +0800