Eric Lee / smarc-fsl-linux-kernel

19 Jun, 2009

1 commit

1f84603c0 Merge branch 'devel-for-2.6.31' into for-2.6.31

Conflicts:
fs/nfs/client.c
fs/nfs/super.c

Trond Myklebust
2009-06-19 09:13:44 +0800

18 Jun, 2009

6 commits

301933a0a Merge commit 'linux-pnfs/nfs41-for-2.6.31' into nfsv41-for-2.6.31

Trond Myklebust
2009-06-18 08:59:58 +0800
0d90ba1cd nfs41: Backchannel callback service helper routines

Executes the backchannel task on the RPC state machine using
the existing open connection previously established by the client.

Signed-off-by: Ricardo Labiaga

nfs41: Add bc_svc.o to sunrpc Makefile.

[nfs41: bc_send() does not need to be exported outside RPC module]
[nfs41: xprt_free_bc_request() need not be exported outside RPC module]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy
[Update copyright]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy

Ricardo Labiaga
2009-06-18 05:11:28 +0800
88b5ed73b SUNRPC: Fix a missing "break" option in xs_tcp_setup_socket()

In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want
to get rid of the existing socket and retry immediately, just as the
comment says. Currently we end up sleeping for a minute, due to the missing
"break" statement.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-06-18 04:22:57 +0800
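
The effect of the missing "break" described in 88b5ed73b can be sketched in user-space C. This is only an illustration: the enum, the function names, and the choice of -ECONNREFUSED as the "retry after a delay" case are assumptions, not the actual contents of xs_tcp_setup_socket().

```c
#include <errno.h>

/* Hypothetical retry policies; these are not kernel identifiers. */
enum sketch_action { SKETCH_NONE, SKETCH_RETRY_NOW, SKETCH_RETRY_AFTER_DELAY };

/* Before the fix: the first case falls through into the delayed-retry case. */
enum sketch_action sketch_before_fix(int status)
{
    enum sketch_action action = SKETCH_NONE;

    switch (status) {
    case -EADDRNOTAVAIL:
        action = SKETCH_RETRY_NOW;
        /* fall through */
    case -ECONNREFUSED:
        action = SKETCH_RETRY_AFTER_DELAY;
        break;
    default:
        break;
    }
    return action;
}

/* After the fix: the added "break" keeps the immediate retry. */
enum sketch_action sketch_after_fix(int status)
{
    enum sketch_action action = SKETCH_NONE;

    switch (status) {
    case -EADDRNOTAVAIL:
        action = SKETCH_RETRY_NOW;
        break;
    case -ECONNREFUSED:
        action = SKETCH_RETRY_AFTER_DELAY;
        break;
    default:
        break;
    }
    return action;
}
```

With status -EADDRNOTAVAIL, the first function ends up choosing the delayed retry, which corresponds to the unwanted sleep the commit message mentions; the second chooses the immediate retry.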
44b98efdd nfs41: New xs_tcp_read_data()

Handles RPC replies and backchannel callbacks. Traditionally the NFS
client has expected only RPC replies on its open connections. With
NFSv4.1, callbacks can arrive over an existing open connection.

This patch refactors the old xs_tcp_read_request() into an RPC reply handler:
xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(),
and a common routine to read the data off the transport: xs_tcp_read_common().
The new xs_tcp_read_callback() queues callback requests onto a queue where
the callback service (a separate thread) is listening for the processing.

This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com)
and Benny Halevy (bhalevy@panasas.com).

xs_tcp_read_callback() drops the connection when the number of expected
callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on
the pending queue are awakened on disconnect.

[nfs41: Keep track of RPC call/reply direction with a flag]
[nfs41: Preallocate rpc_rqst receive buffer for handling callbacks]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy
[nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy
[Moves embedded #ifdefs into #ifdef function blocks]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy

Ricardo Labiaga
2009-06-18 04:06:16 +0800
f4a2e418b nfs41: Process the RPC call direction

Reading and storing the RPC direction is a three-step process.

1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it
in the XDR buffer since the 'struct rpc_rqst' is not yet available.

2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state.
This state need not necessarily be preceded by TCP_RCV_READ_CALLDIR.
For example, we may be reading a continuation packet to a large reply.
Therefore, we can't simply obtain the 'struct rpc_rqst' during the
TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA.

This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to
read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the
RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated.

3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper
functions. xs_tcp_read_common() then saves the RPC direction in the XDR
buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading
the data immediately after the direction was read. xs_tcp_read_common()
then clears this flag.

[was nfs41: Skip past the RPC call direction]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy
[nfs41: sunrpc: Add RPC direction back into the XDR buffer]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy
[nfs41: sunrpc: Don't skip past the RPC call direction]
Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy

Ricardo Labiaga
2009-06-18 03:43:46 +0800
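
The three steps described in f4a2e418b amount to a deferred copy guarded by a flag. The sketch below shows that pattern in user-space C under simplifying assumptions: the struct layouts, the 64-byte buffer, and the SKETCH_ and sketch_ names are hypothetical stand-ins, not the kernel's 'struct sock_xprt' or 'struct rpc_rqst'.

```c
#include <string.h>

/* Stands in for TCP_RCV_COPY_CALLDIR; the bit value is an assumption. */
#define SKETCH_RCV_COPY_CALLDIR (1u << 0)

struct sketch_transport {
    unsigned int flags;
    unsigned char calldir[4];   /* direction word, kept in wire order */
};

struct sketch_rqst {
    unsigned char buf[64];      /* hypothetical receive buffer */
    size_t copied;
};

/* Step 1: the direction is read before any request structure exists,
 * so it is parked in the transport and a flag records the pending copy. */
void sketch_read_calldir(struct sketch_transport *t, const unsigned char dir[4])
{
    memcpy(t->calldir, dir, sizeof(t->calldir));
    t->flags |= SKETCH_RCV_COPY_CALLDIR;
}

/* Step 3: once the request is available, save the direction first
 * (only if the flag is set), clear the flag, then copy the payload. */
void sketch_read_common(struct sketch_transport *t, struct sketch_rqst *req,
                        const unsigned char *data, size_t len)
{
    size_t room = sizeof(req->buf) - req->copied;

    if ((t->flags & SKETCH_RCV_COPY_CALLDIR) && room >= sizeof(t->calldir)) {
        memcpy(req->buf + req->copied, t->calldir, sizeof(t->calldir));
        req->copied += sizeof(t->calldir);
        room -= sizeof(t->calldir);
        t->flags &= ~SKETCH_RCV_COPY_CALLDIR;
    }
    if (len > room)
        len = room;             /* truncate rather than overflow */
    memcpy(req->buf + req->copied, data, len);
    req->copied += len;
}
```

A continuation packet reaches sketch_read_common() with the flag already cleared, so no direction word is inserted a second time.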
18dca02ae nfs41: Add ability to read RPC call direction on TCP stream.

NFSv4.1 callbacks can arrive over an existing connection. This patch adds
the logic to read the RPC call direction (call or reply). It does this by
updating the state machine to look for the call direction invoking
xs_tcp_read_calldir(...) after reading the XID.

[nfs41: Keep track of RPC call/reply direction with a flag]

As per 11/14/08 review of RFC 53/85.

Add a new flag to track whether the incoming message is an RPC call or an
RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in
xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel.
It is cleared if the message is an RPC request sent on the back channel.

Signed-off-by: Ricardo Labiaga
Signed-off-by: Benny Halevy

Ricardo Labiaga
2009-06-18 03:43:45 +0800
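
As a rough illustration of what 18dca02ae describes, the sketch below parses the XID and the call direction from the first eight bytes of an RPC message and tracks the direction in a flag. In ONC RPC the message type is 0 for a call and 1 for a reply; everything else here (the struct, the flag value, the function names) is a hypothetical user-space stand-in for the kernel state machine.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RPC_CALL   0u    /* ONC RPC msg_type CALL */
#define SKETCH_RPC_REPLY  1u    /* ONC RPC msg_type REPLY */

/* Stands in for TCP_RPC_REPLY in tcp_flags; the bit value is an assumption. */
#define SKETCH_TCP_RPC_REPLY (1u << 0)

struct sketch_xprt {
    uint32_t xid;
    unsigned int tcp_flags;
};

/* Decode a 32-bit big-endian (XDR) word. */
static uint32_t sketch_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read the XID, then the direction that follows it.
 * Returns 0 on success, -1 on short input or an unknown direction. */
int sketch_read_xid_and_calldir(struct sketch_xprt *x,
                                const unsigned char *buf, size_t len)
{
    uint32_t calldir;

    if (len < 8)
        return -1;
    x->xid = sketch_be32(buf);
    calldir = sketch_be32(buf + 4);

    if (calldir == SKETCH_RPC_REPLY)
        x->tcp_flags |= SKETCH_TCP_RPC_REPLY;   /* reply on the forechannel */
    else if (calldir == SKETCH_RPC_CALL)
        x->tcp_flags &= ~SKETCH_TCP_RPC_REPLY;  /* request on the backchannel */
    else
        return -1;
    return 0;
}
```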

03 Jun, 2009

1 commit

adf30907d net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of:
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2009-06-03 17:51:04 +0800
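
A toy user-space model of the three accessors from adf30907d may help show why call sites stop touching the field directly. The accessor signatures are taken from the commit message; the struct layouts, the plain integer reference count, and this simplified dst_release() are assumptions made only for the sketch.

```c
#include <stddef.h>

/* Toy stand-ins: the real structures carry far more state. */
struct dst_entry {
    int refcnt;                 /* hypothetical plain counter */
};

struct sk_buff {
    struct dst_entry *dst_priv; /* hypothetical private field */
};

/* Simplified release: drop one reference if there is an entry. */
static void dst_release(struct dst_entry *dst)
{
    if (dst)
        dst->refcnt--;
}

struct dst_entry *skb_dst(const struct sk_buff *skb)
{
    return skb->dst_priv;
}

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
    skb->dst_priv = dst;
}

/* Replaces the open-coded pair: dst_release(skb->dst); skb->dst = NULL; */
void skb_dst_drop(struct sk_buff *skb)
{
    dst_release(skb->dst_priv);
    skb->dst_priv = NULL;
}
```

Because callers only go through the accessors, the storage behind them can later change without another tree-wide edit.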

03 May, 2009

1 commit

f75e6745a SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect

See http://bugzilla.kernel.org/show_bug.cgi?id=13034

If the port gets into a TIME_WAIT state, then we cannot reconnect without
binding to a new port.

Tested-by: Petr Vandrovec
Tested-by: Jean Delvare
Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2009-05-03 07:35:08 +0800

02 Apr, 2009

1 commit

cc8590611 Merge branch 'devel' into for-linus

Trond Myklebust
2009-04-02 01:28:15 +0800

27 Mar, 2009

1 commit

08abe18af Merge branch 'master' of /home/davem/src/GIT/linux-2.6/

Conflicts:
drivers/net/wimax/i2400m/usb-notif.c

David S. Miller
2009-03-27 06:23:24 +0800

20 Mar, 2009

4 commits

55420c24a SUNRPC: Ensure we close the socket on EPIPE errors too...

As long as one task is holding the socket lock, then calls to
xprt_force_disconnect(xprt) will not succeed in shutting down the socket.
In particular, this would mean that a server initiated shutdown will not
succeed until the lock is relinquished.
In order to avoid the deadlock, we should ensure that xs_tcp_send_request()
closes the socket on EPIPE errors too.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-20 03:17:36 +0800
b61d59fff SUNRPC: xs_tcp_connect_worker{4,6}: merge common code

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-20 03:17:35 +0800
25fe6142a SUNRPC: Add a sysctl to control the duration of the socket linger timeout

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-20 03:17:34 +0800
7d1e8255c SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets

This fixes a regression against FreeBSD servers as reported by Tomas
Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
don't ever react to the client closing the socket, and so commit
e06799f958bf7f9f8fae15f0c6f519953fb0257c (SUNRPC: Use shutdown() instead of
close() when disconnecting a TCP socket) causes the setup to hang forever
whenever the client attempts to close and then reconnect.

We break the deadlock by adding a 'linger2' style timeout to the socket,
after which, the client will abort the connection using a TCP 'RST'.

The default timeout is set to 15 seconds. A subsequent patch will put it
under user control by means of a sysctl.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-20 03:17:34 +0800

12 Mar, 2009

10 commits

5e3771ce2 SUNRPC: Ensure that xs_nospace return values are propagated

If xs_nospace() finds that the socket has disconnected, it attempts to
return ENOTCONN, however that value is then squashed by the callers.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:38:01 +0800
8a2cec295 SUNRPC: Delay, then retry on connection errors.

Enforce the comment in xs_tcp_connect_worker4/xs_tcp_connect_worker6 that
we should delay, then retry on certain connection errors.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:38:01 +0800
2a4919919 SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending

While we should definitely return socket errors to the task that is
currently trying to send data, there is no need to propagate the same error
to all the other tasks on xprt->pending. Doing so actually slows down
recovery, since it causes more than one task to attempt socket recovery.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:38:00 +0800
482f32e65 SUNRPC: Handle socket errors correctly

Ensure that we pick up and handle socket errors as they occur.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:38:00 +0800
c8485e4d6 SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit()

If we get an ECONNREFUSED error, we currently go to sleep on the
'xprt->sending' wait queue. The problem is that no timeout is set there,
and there is nothing else that will wake the task up later.

We should deal with ECONNREFUSED in call_status, given that is where we
also deal with -EHOSTDOWN, and friends.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:37:59 +0800
40d2549db SUNRPC: Don't disconnect if a connection is still in progress.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:37:58 +0800
670f94573 SUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN...

...so that we can distinguish between when we need to shutdown and when we
don't. Also remove the call to xs_tcp_shutdown() from xs_tcp_connect(),
since xprt_connect() makes the same test.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:37:58 +0800
fe315e76f SUNRPC: Avoid spurious wake-up during UDP connect processing

To clear out old state, the UDP connect workers unconditionally invoke
xs_close() before proceeding with a new connect. Nowadays this causes
a spurious wake-up of the task waiting for the connect to complete.

This is a little racy, but usually harmless. The waiting task
immediately retries the connect via a call_bind/call_connect sequence,
which usually finds the transport already in the connected state
because the connect worker has finished in the background.

To avoid a spurious wake-up, factor the xs_close() logic that resets
the underlying socket into a helper, and have the UDP connect workers
call that helper instead of xs_close().

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2009-03-12 02:10:21 +0800
01d37c428 SUNRPC: xprt_connect() don't abort the task if the transport isn't bound

If the transport isn't bound, then we should just return ENOTCONN, letting
call_connect_status() and/or call_status() deal with retrying. Currently,
we appear to abort all pending tasks with an EIO error.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:09:39 +0800
fba91afbe SUNRPC: Fix an Oops due to socket not set up yet...

We can Oops in both xs_udp_send_request() and xs_tcp_send_request() if the
call to xs_sendpages() returns an error due to the socket not yet being
set up.
Deal with that situation by returning a new error: ENOTSOCK, so that we
know to avoid dereferencing transport->sock.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:06:41 +0800

07 Feb, 2009

1 commit

1f0fa1543 net/sunrpc/xprtsock.c: some common code found

$ diff-funcs xs_udp_write_space net/sunrpc/xprtsock.c
net/sunrpc/xprtsock.c xs_tcp_write_space
--- net/sunrpc/xprtsock.c:xs_udp_write_space()
+++ net/sunrpc/xprtsock.c:xs_tcp_write_space()
@@ -1,4 +1,4 @@
- * xs_udp_write_space - callback invoked when socket buffer space
+ * xs_tcp_write_space - callback invoked when socket buffer space
  *                      becomes available
  * @sk: socket whose state has changed
  *
@@ -7,12 +7,12 @@
  * progress, otherwise we'll waste resources thrashing kernel_sendmsg
  * with a bunch of small requests.
  */
-static void xs_udp_write_space(struct sock *sk)
+static void xs_tcp_write_space(struct sock *sk)
 {
     read_lock(&sk->sk_callback_lock);

-    /* from net/core/sock.c:sock_def_write_space */
-    if (sock_writeable(sk)) {
+    /* from net/core/stream.c:sk_stream_write_space */
+    if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
         struct socket *sock;
         struct rpc_xprt *xprt;

$ codiff net/sunrpc/xprtsock.o net/sunrpc/xprtsock.o.new
net/sunrpc/xprtsock.c:
xs_tcp_write_space | -163
xs_udp_write_space | -163
2 functions changed, 326 bytes removed

net/sunrpc/xprtsock.c:
xs_write_space | +179
1 function changed, 179 bytes added

net/sunrpc/xprtsock.o.new:
3 functions changed, 179 bytes added, 326 bytes removed, diff: -147

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2009-02-07 15:48:33 +0800
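
The diff in 1f0fa1543 shows that the two callbacks differ only in the test that decides whether buffer space is available. One way to picture the merge is a shared helper plus two thin wrappers, as in the user-space sketch below. The toy struct, its fields, the "less than half the send buffer in use" test for the datagram case, and all sketch_ names are assumptions; only the stream test mirrors the comparison visible in the diff.

```c
/* Toy socket: the fields are hypothetical stand-ins for real socket state. */
struct sketch_sock {
    int wmem_alloc;     /* bytes queued for sending (datagram test) */
    int sndbuf;         /* send buffer size (datagram test) */
    int wspace;         /* free stream write space */
    int min_wspace;     /* minimum free space worth waking for */
    int wakeups;        /* counts how often the shared path ran */
};

/* Shared body: whatever both callbacks used to duplicate. */
static void sketch_write_space(struct sketch_sock *sk)
{
    sk->wakeups++;
}

/* Datagram flavour: assumed to wake when under half the buffer is in use. */
void sketch_udp_write_space(struct sketch_sock *sk)
{
    if (sk->wmem_alloc < sk->sndbuf / 2)
        sketch_write_space(sk);
}

/* Stream flavour: same comparison shape as in the diff above. */
void sketch_tcp_write_space(struct sketch_sock *sk)
{
    if (sk->wspace >= sk->min_wspace)
        sketch_write_space(sk);
}
```

Keeping the predicate in the wrapper and the wake-up logic in one place is consistent with the codiff summary, where two functions shrink and one new function appears.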

03 Nov, 2008

1 commit

e0db4a786 sunrpc: Fix build warning due to typo in %pI4 format changes.

Noticed by Stephen Hemminger.

Signed-off-by: David S. Miller

David S. Miller
2008-11-03 15:57:06 +0800

31 Oct, 2008

2 commits

21454aaad net: replace NIPQUAD() in net/*/

Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-10-31 15:54:56 +0800
a1744d3be Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:

drivers/net/wireless/p54/p54common.c

David S. Miller
2008-10-31 15:17:34 +0800

30 Oct, 2008

2 commits

5b095d989 net: replace %p6 with %pI6

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-10-30 03:52:50 +0800
4b7a4274c net: replace %#p6 format specifier with %pi6

gcc warns when using the # modifier with the %p format specifier,
so we can't use it to omit the colons when needed. Introduce
%pi6 instead.

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-10-30 03:50:24 +0800

29 Oct, 2008

3 commits

fdb46ee75 net, misc: replace uses of NIP6_FMT with %p6

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-10-29 14:02:32 +0800
b071195de net: replace all current users of NIP6_SEQFMT with %#p6

The define in kernel.h can be done away with at a later time.

Signed-off-by: Harvey Harrison
Signed-off-by: David S. Miller

Harvey Harrison
2008-10-29 07:05:40 +0800
2a9e1cfa2 SUNRPC: Respond promptly to server TCP resets

If the server sends us an RST error while we're in the TCP_ESTABLISHED
state, then that will not result in a state change, and so the RPC client
ends up hanging forever (see
http://bugzilla.kernel.org/show_bug.cgi?id=11154)

We can intercept the reset by setting up an sk->sk_error_report callback,
which will then allow us to initiate a proper shutdown and retry...

We also make sure that if the send request receives an ECONNRESET, then we
shutdown too...

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-10-29 03:21:39 +0800

14 Oct, 2008

1 commit

113aa838e net: Rationalise email address: Network Specific Parts

Clean up the various different email addresses of mine listed in the code
to a single current and valid address. As Dave says his network merges
for 2.6.28 are now done this seems a good point to send them in where
they won't risk disrupting real changes.

Signed-off-by: Alan Cox
Signed-off-by: David S. Miller

Alan Cox
2008-10-14 10:01:08 +0800

10 Jul, 2008

1 commit

b22602a67 SUNRPC: Ensure all transports set rq_xtime consistently

The RPC client uses the rq_xtime field in each RPC request to determine the
round-trip time of the request. Currently, the rq_xtime field is
initialized by each transport just before it starts enqueuing a request to
be sent. However, transports do not handle initializing this value
consistently; sometimes they don't initialize it at all.

To make the measurement of request round-trip time consistent for all
RPC client transport capabilities, pull rq_xtime initialization into the
RPC client's generic transport logic. Now all transports will get a
standardized RTT measure automatically, from:

xprt_transmit()

to

xprt_complete_rqst()

This makes round-trip time calculation more accurate for the TCP transport.
The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
buffer is full, so the TCP transport's ->send_request() method may call
the ->sendmsg() method repeatedly until it gets all of the request's bytes
queued in the socket's buffer.

Currently, the TCP transport sets the rq_xtime field every time through
that loop so the final value is the timestamp just before the *last* call
to the underlying socket's ->sendmsg() method. After this patch, the
rq_xtime field contains a timestamp that reflects the time just before the
*first* call to ->sendmsg().

This is consequential under heavy workloads because large requests often
take multiple ->sendmsg() calls to get all the bytes of a request queued.
The TCP transport causes the request to sleep until the remote end of the
socket has received enough bytes to clear space in the socket's local
output buffer. This delay can be quite significant.

The method introduced by this patch is a more accurate measure of RTT
for stream transports, since the server can cause enough back pressure
to delay (i.e. increase the latency of) requests from the client.

Additionally, this patch corrects the behavior of the RDMA transport, which
entirely neglected to initialize the rq_xtime field. RPC performance
metrics for RDMA transports now display correct RPC request round trip
times.

Signed-off-by: Chuck Lever
Acked-by: Tom Talpey
Signed-off-by: Trond Myklebust

Chuck Lever
2008-07-10 00:09:15 +0800
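
The difference b22602a67 describes between stamping before the *last* and before the *first* ->sendmsg() call can be shown with a fake clock. The sketch below is a user-space model only: the structs, the 10-tick cost per send call, and the function names are assumptions, and chunk must be greater than zero for the loops to terminate.

```c
/* Hypothetical clock and request; only rq_xtime is named after the source. */
struct sketch_clock {
    long now;
};

struct sketch_rqst {
    long rq_xtime;      /* timestamp used as the start of the RTT sample */
    long bytes_left;    /* bytes not yet queued in the socket buffer */
};

/* Queue up to chunk bytes; each call is assumed to cost 10 ticks,
 * modelling the wait for space in the socket's output buffer. */
static void sketch_sendmsg(struct sketch_clock *clk, struct sketch_rqst *req,
                           long chunk)
{
    long sent = req->bytes_left < chunk ? req->bytes_left : chunk;

    req->bytes_left -= sent;
    clk->now += 10;
}

/* Old TCP behaviour: the stamp is refreshed on every pass through the loop. */
void sketch_transmit_stamp_each_call(struct sketch_clock *clk,
                                     struct sketch_rqst *req, long chunk)
{
    while (req->bytes_left > 0) {
        req->rq_xtime = clk->now;
        sketch_sendmsg(clk, req, chunk);
    }
}

/* New behaviour: generic code stamps once, before the first send call. */
void sketch_transmit_stamp_once(struct sketch_clock *clk,
                                struct sketch_rqst *req, long chunk)
{
    req->rq_xtime = clk->now;
    while (req->bytes_left > 0)
        sketch_sendmsg(clk, req, chunk);
}
```

For a 250-byte request sent in 100-byte chunks, the first variant leaves rq_xtime at tick 20 and the second at tick 0, so the first one hides 20 ticks of queueing delay from the round-trip measurement.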

25 Apr, 2008

1 commit

233607dbb Merge branch 'devel'

Trond Myklebust
2008-04-25 02:01:02 +0800

20 Apr, 2008

3 commits

7c1d71cf5 SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests

NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.

We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:55:12 +0800
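
The 'connection' cookie from 7c1d71cf5 is essentially a generation counter. The sketch below models it in user-space C: a request remembers the cookie of the connection it was sent on, and a forced disconnect only happens if that connection is still the current one. The struct layouts and sketch_ function names are hypothetical, not the kernel's.

```c
/* Hypothetical transport: the cookie is bumped on every disconnect. */
struct sketch_xprt {
    unsigned int connect_cookie;
    int disconnects;            /* counts real disconnects, for illustration */
};

struct sketch_rqst {
    unsigned int connect_cookie; /* cookie of the connection used to send */
};

/* Remember which connection generation carried this request. */
void sketch_record_send(const struct sketch_xprt *xprt, struct sketch_rqst *req)
{
    req->connect_cookie = xprt->connect_cookie;
}

/* Break the connection if the request's connection is still current.
 * Returns 1 if a disconnect happened, 0 if it was already broken. */
int sketch_conditional_disconnect(struct sketch_xprt *xprt,
                                  const struct sketch_rqst *req)
{
    if (req->connect_cookie != xprt->connect_cookie)
        return 0;               /* someone already broke that connection */
    xprt->connect_cookie++;
    xprt->disconnects++;
    return 1;
}
```

If two requests were sent on the same connection and both time out, the first retransmit breaks the connection and bumps the cookie; the second sees a stale cookie and leaves the new connection attempt alone.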
06b4b681a SUNRPC: remove XS_SENDMSG_RETRY

The condition for exiting from the loop in xs_tcp_send_request() should be
that we find we're not making progress (i.e. number of bytes sent is 0).

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:55:05 +0800
b6ddf64ff SUNRPC: Fix up xprt_write_space()

The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
or not we have someone waiting for buffer memory. Convert the SUNRPC layer
to use the same idiom.
Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
fact, the most common case will be that there is nobody waiting for buffer
space.

SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
limited by the application window. Ensure that we follow the same idiom as
the rest of the networking layer here too.

Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
write_space() doesn't keep waking things up on xprt->pending.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:52:44 +0800