Eric Lee / smarc-fsl-linux-kernel

11 Jul, 2007

2 commits

96802a095 SUNRPC: cleanup transport creation argument passing

Cleanup argument passing to functions for creating an RPC transport.

Signed-off-by: Frank van Maarseveen
Signed-off-by: Trond Myklebust

Frank van Maarseveen
2007-07-11 11:40:49 +0800
c1384c9c4 SUNRPC: fix hang due to eventd deadlock...

Brian Behlendorf writes:

The root cause of the NFS hang we were observing appears to be a rare
deadlock between the kernel-provided usermodehelper API and the Linux NFS
client. The deadlock can arise because both of these services use the
generic Linux work queues. The usermodehelper API runs the specified user
application in the context of the work queue. And NFS submits both cleanup
and reconnect work to the generic work queue for handling. Normally this
is fine but a deadlock can result in the following situation.

- NFS client is in a disconnected state
- [events/0] runs a usermodehelper app with an NFS dependent operation,
  this triggers an NFS reconnect.
- NFS reconnect happens to be submitted to [events/0] work queue.
- Deadlock, the [events/0] work queue will never process the
  reconnect because it is blocked on the previous NFS dependent
  operation which will not complete.

The solution is simply to run reconnect requests on rpciod.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-11 11:40:31 +0800

01 May, 2007

3 commits

a509050bd SUNRPC: introduce rpcbind: replacement for in-kernel portmapper

Introduce a replacement for the in-kernel portmapper client that supports
all 3 versions of the rpcbind protocol. This code is not used yet.

Original code by Groupe Bull updated for the latest kernel, with multiple
bug fixes.

Note that rpcb_clnt.c does not yet support registering via versions 3 and
4 of the rpcbind protocol. That is planned for a later patch.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-01 13:17:12 +0800
c5a4dd8b7 SUNRPC: Eliminate side effects from rpc_malloc

Currently rpc_malloc sets req->rq_buffer internally. Make this a more
generic interface: return a pointer to the new buffer (or NULL) and
make the caller set req->rq_buffer and req->rq_bufsize. This looks much
more like kmalloc and eliminates the side effects.

To fix a potential deadlock, this patch also replaces GFP_NOFS with
GFP_NOWAIT in rpc_malloc. This prevents async RPCs from sleeping outside
the RPC's task scheduler while allocating their buffer.
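
The interface change described above can be sketched in userspace C. This is
only an illustration: struct demo_rqst and the demo_* functions are
hypothetical stand-ins, and malloc() takes the place of the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the request structure; only the two fields
 * named in the commit message are modelled. */
struct demo_rqst {
	void   *rq_buffer;
	size_t  rq_bufsize;
};

/* Side-effect-free allocator: returns the new buffer or NULL and never
 * touches a request. */
void *demo_rpc_malloc(size_t size)
{
	return size ? malloc(size) : NULL;
}

/* The caller, not the allocator, records the buffer in the request. */
int demo_prepare_buffer(struct demo_rqst *req, size_t size)
{
	void *buf;

	assert(req != NULL);
	buf = demo_rpc_malloc(size);
	if (buf == NULL)
		return -1;
	req->rq_buffer = buf;
	req->rq_bufsize = size;
	return 0;
}

/* Release the buffer and clear the request fields. */
void demo_release_buffer(struct demo_rqst *req)
{
	assert(req != NULL);
	free(req->rq_buffer);
	req->rq_buffer = NULL;
	req->rq_bufsize = 0;
}
```

Keeping the allocator free of side effects means it can be swapped or tested
without constructing a request first.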

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-01 13:17:11 +0800
2bea90d43 SUNRPC: RPC buffer size estimates are too large

The RPC buffer size estimation logic in net/sunrpc/clnt.c always
significantly overestimates the requirements for the buffer size.
A little instrumentation demonstrated that in fact rpc_malloc was never
allocating the buffer from the mempool, but almost always called kmalloc.

To compute the size of the RPC buffer more precisely, split p_bufsiz into
two fields; one for the argument size, and one for the result size.

Then, compute the sum of the exact call and reply header sizes, and split
the RPC buffer precisely between the two. That should keep almost all RPC
buffers within the 2KiB buffer mempool limit.
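
A rough sketch of the size computation described above, assuming sizes are
kept in 32-bit XDR words. The structure, field, and function names are
hypothetical; only the 2KiB limit comes from the commit text.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MEMPOOL_LIMIT 2048u  /* the 2KiB limit named in the commit text */

/* Hypothetical per-procedure sizes, in 32-bit XDR words. */
struct demo_procinfo {
	size_t arg_words;    /* maximum encoded argument size */
	size_t reply_words;  /* maximum encoded result size */
};

struct demo_bufsizes {
	size_t callsize;  /* bytes reserved for the call */
	size_t rcvsize;   /* bytes reserved for the reply */
};

/* Add the exact header size to each half instead of padding with slack. */
struct demo_bufsizes demo_buffer_sizes(const struct demo_procinfo *proc,
				       size_t call_hdr_words,
				       size_t reply_hdr_words)
{
	struct demo_bufsizes sizes;

	assert(proc != NULL);
	sizes.callsize = (call_hdr_words + proc->arg_words) * 4;
	sizes.rcvsize = (reply_hdr_words + proc->reply_words) * 4;
	return sizes;
}

/* True when both halves together fit in one mempool buffer. */
int demo_fits_mempool(struct demo_bufsizes sizes)
{
	return sizes.callsize + sizes.rcvsize <= DEMO_MEMPOOL_LIMIT;
}
```

For example, a procedure with 10 argument words and 20 result words, plus
headers of 12 and 6 words, needs 88 + 104 bytes and fits comfortably.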

And, we can finally be rid of RPC_SLACK_SPACE!

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-01 13:17:10 +0800

21 Apr, 2007

1 commit

241c39b9a RPC: Fix the TCP resend semantics for NFSv4

Fix a regression due to the patch "NFS: disconnect before retrying NFSv4
requests over TCP"

The assumption made in xprt_transmit() that the condition
"req->rq_bytes_sent == 0 and request is on the receive list"
should imply that we're dealing with a retransmission is false.
Firstly, it may simply happen that the socket send queue was full
at the time the request was initially sent through xprt_transmit().
Secondly, doing this for each request that was retransmitted implies
that we disconnect and reconnect for _every_ request that happened to
be retransmitted irrespective of whether or not a disconnection has
already occurred.

Fix is to move this logic into the call_status request timeout handler.

Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-04-21 13:56:30 +0800

13 Feb, 2007

2 commits

d9bc125ca Merge branch 'master' of /home/trondmy/kernel/linux-2.6/

Conflicts:

net/sunrpc/auth_gss/gss_krb5_crypto.c
net/sunrpc/auth_gss/gss_spkm3_token.c
net/sunrpc/clnt.c

Merge with mainline and fix conflicts.

Trond Myklebust
2007-02-13 14:43:25 +0800
43d78ef2b NFS: disconnect before retrying NFSv4 requests over TCP

RFC3530 section 3.1.1 states an NFSv4 client MUST NOT send a request
twice on the same connection unless it is the NULL procedure. Section
3.1.1 suggests that the client should disconnect and reconnect if it
wants to retry a request.

Implement this by adding an rpc_clnt flag that a ULP can use to
specify that the underlying transport should be disconnected on a
major timeout. The NFSv4 client asserts this new flag, and requests
no retries after a minor retransmit timeout.

Note that disconnecting on a retransmit is in general not safe to do
if the RPC client does not reuse the TCP port number when reconnecting.

See http://bugzilla.linux-nfs.org/show_bug.cgi?id=6

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-02-13 14:40:45 +0800

11 Feb, 2007

1 commit

cca5172a7 [NET] SUNRPC: Fix whitespace errors.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-02-11 15:20:13 +0800

04 Feb, 2007

1 commit

46121cf7d SUNRPC: fix print format for tk_pid

The tk_pid field is an unsigned short. The proper print format specifier for
that type is %5u, not %4d.
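
A small userspace illustration of the format change, assuming a 16-bit
unsigned short: values up to 65535 need five columns and are never negative,
so %5u keeps log columns aligned where %4d would not. The function name and
the "RPC: " prefix are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a task id the way the commit describes: unsigned, width 5.
 * Returns the number of characters snprintf() reports. */
int demo_format_pid(char *buf, size_t len, unsigned short tk_pid)
{
	assert(buf != NULL);
	return snprintf(buf, len, "RPC: %5u", tk_pid);
}
```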

Also clean up some miscellaneous print formatting nits.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-02-04 07:35:10 +0800

08 Dec, 2006

1 commit

34161db6b Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ into merge_linus

Conflicts:

include/linux/sunrpc/xprt.h
net/sunrpc/xprtsock.c

Fix up conflicts with the workqueue changes.

Trond Myklebust
2006-12-08 04:48:15 +0800

06 Dec, 2006

2 commits

5847e1f4d SUNRPC: Remove pprintk() from net/sunrpc/xprt.c

These appear to be deprecated. Removing them also gets rid of some sparse
noise.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-12-06 23:46:55 +0800
c8541ecdd SUNRPC: Make the transport-specific setup routine allocate rpc_xprt

Change the location where the rpc_xprt structure is allocated so each
transport implementation can allocate a private area from the same
chunk of memory.

Note also that xprt->ops->destroy, rather than xprt_destroy, is now
responsible for freeing rpc_xprt when the transport is destroyed.
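
The allocation pattern described above might look roughly like the following
userspace sketch. All demo_* names are hypothetical, and calloc()/free()
stand in for the kernel allocator; the point is that the generic structure
and the private area come from one chunk, and that the transport's own
destroy operation frees that chunk.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_xprt;

struct demo_xprt_ops {
	void (*destroy)(struct demo_xprt *xprt);
};

/* Generic part, shared by every transport implementation. */
struct demo_xprt {
	const struct demo_xprt_ops *ops;
};

/* Transport-specific wrapper: generic part plus private area, one chunk. */
struct demo_sock_xprt {
	struct demo_xprt xprt;  /* must stay the first member */
	int sock_fd;            /* private data */
};

int demo_live_transports;  /* bookkeeping for the illustration only */

/* The transport's destroy op frees the whole chunk it allocated. */
void demo_sock_destroy(struct demo_xprt *xprt)
{
	assert(xprt != NULL);
	free((struct demo_sock_xprt *)xprt);
	demo_live_transports--;
}

const struct demo_xprt_ops demo_sock_ops = { demo_sock_destroy };

/* Transport-specific setup routine: allocates generic + private together. */
struct demo_xprt *demo_sock_setup(int sock_fd)
{
	struct demo_sock_xprt *transport = calloc(1, sizeof(*transport));

	if (transport == NULL)
		return NULL;
	transport->xprt.ops = &demo_sock_ops;
	transport->sock_fd = sock_fd;
	demo_live_transports++;
	return &transport->xprt;
}
```

Generic code only ever calls xprt->ops->destroy(xprt), so it never needs to
know how large the transport's private area is.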

Test plan:
Connectathon.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-12-06 23:46:34 +0800

22 Nov, 2006

1 commit

65f27f384 WorkStruct: Pass the work_struct pointer instead of context data

Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.
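
As an illustration of the container_of() idiom mentioned above, here is a
userspace sketch. The macro is re-created with offsetof(), and struct
demo_work and struct demo_xprt are hypothetical stand-ins rather than the
kernel's types.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-creation of the container_of() idea. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical work item: the function receives the work pointer itself. */
struct demo_work {
	void (*func)(struct demo_work *work);
};

/* Hypothetical object that embeds a work item. */
struct demo_xprt {
	int connect_count;
	struct demo_work connect_worker;
};

/* Work function: recovers its container from the embedded member. */
void demo_connect_worker(struct demo_work *work)
{
	struct demo_xprt *xprt =
		demo_container_of(work, struct demo_xprt, connect_worker);

	xprt->connect_count++;
}

/* Minimal "executor": passes the work pointer, not separate context data. */
void demo_run_work(struct demo_work *work)
{
	assert(work != NULL && work->func != NULL);
	work->func(work);
}
```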

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated. This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).

Signed-Off-By: David Howells

David Howells
2006-11-22 22:55:48 +0800

29 Sep, 2006

1 commit

d8ed029d6 [SUNRPC]: trivial endianness annotations

pure s/u32/__be32/

[AV: large part based on Alexey's patches]
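
For readers unfamiliar with the annotation, the idea is that values in wire
(big-endian) byte order get their own type name so they are not mixed up with
host-order integers. A userspace sketch, with hypothetical demo_* names and
no static checker behind the typedef:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Distinct name for 32-bit values stored in big-endian (wire) order. */
typedef uint32_t demo_be32;

static_assert(sizeof(demo_be32) == 4, "wire words are 32 bits");

/* Host order to wire order, independent of the host's endianness. */
demo_be32 demo_cpu_to_be32(uint32_t host)
{
	unsigned char bytes[4];
	demo_be32 wire;

	bytes[0] = (unsigned char)(host >> 24);
	bytes[1] = (unsigned char)(host >> 16);
	bytes[2] = (unsigned char)(host >> 8);
	bytes[3] = (unsigned char)host;
	memcpy(&wire, bytes, sizeof(wire));
	return wire;
}

/* Wire order back to host order. */
uint32_t demo_be32_to_cpu(demo_be32 wire)
{
	unsigned char bytes[4];

	memcpy(bytes, &wire, sizeof(bytes));
	return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
	       ((uint32_t)bytes[2] << 8) | (uint32_t)bytes[3];
}
```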

Signed-off-by: Alexey Dobriyan
Signed-off-by: Al Viro
Signed-off-by: David S. Miller

Alexey Dobriyan
2006-09-29 09:01:21 +0800

23 Sep, 2006

7 commits

6b6ca86b7 SUNRPC: Add refcounting to the struct rpc_xprt

In a subsequent patch, this will allow the portmapper to take a reference
to the rpc_xprt for which it is updating the port number, fixing an Oops.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-09-23 11:25:01 +0800
da45828e2 SUNRPC: Clean up soft task error handling

- Ensure that the task aborts the RPC call only when it has actually timed out.
- Ensure that req->rq_majortimeo is initialised correctly.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-09-23 11:25:00 +0800
ff9aa5e56 SUNRPC: Eliminate xprt_create_proto and rpc_create_client

The two function call API for creating a new RPC client is now obsolete.
Remove it.

Also, remove an unnecessary check to see whether the caller is capable of
using privileged network services. The kernel RPC client always uses a
privileged ephemeral port by default; callers are responsible for checking
the authority of users to make use of any RPC service, or for specifying
that a nonprivileged port is acceptable.

Test plan:
Repeated runs of Connectathon locking suite. Check network trace to ensure
correctness of NLM requests and replies.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:51 +0800
c2866763b SUNRPC: use sockaddr + size when creating remote transport endpoints

Prepare for more generic transport endpoint handling needed by transports
that might use different forms of addressing, such as IPv6.

Introduce a single function call to replace the two-call
xprt_create_proto/rpc_create_client API. Define a new rpc_create_args
structure that allows callers to pass in remote endpoint addresses of
varying length.

Test-plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:49 +0800
c4efcb1d3 SUNRPC: Use "sockaddr_storage" for storing RPC client's remote peer address

IPv6 addresses are big (128 bits). Now that all RPC client consumers treat
the addr field in rpc_xprt structs as opaque, and access it only via the
API calls, we can safely widen the field in the rpc_xprt struct to
accommodate larger addresses.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:48 +0800
4a68179d3 SUNRPC: Make RPC portmapper use per-transport storage

Move connection and bind state that was maintained in the rpc_clnt
structure to the rpc_xprt structure. This will allow the creation of
a clean API for plugging in different types of bind mechanisms.

This brings improvements such as the elimination of a single spin lock to
control serialization for all in-kernel RPC binding. A set of per-xprt
bitops is used to serialize tasks during RPC binding, just like it now
works for making RPC transport connections.

Test-plan:
Destructive testing (unplugging the network temporarily). Connectathon
with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked.
Probably need to rig a server where certain services aren't running, or
that returns an error for some typical operation.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:39 +0800
ec739ef03 SUNRPC: Create a helper to tell whether a transport is bound

Hide the contents and format of xprt->addr by eliminating direct uses
of the xprt->addr.sin_port field. This change is required to support
alternate RPC host address formats (e.g. IPv6).
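
One way such a helper could look, sketched in userspace C. The demo_* names
are hypothetical and the IPv4-only layout is an assumption; the point is
that callers ask "is it bound?" through a function and never read sin_port
themselves, so the address format can change behind the helper.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical transport holding an IPv4 peer address. */
struct demo_xprt {
	struct sockaddr_in addr;
};

void demo_xprt_init(struct demo_xprt *xprt)
{
	assert(xprt != NULL);
	memset(xprt, 0, sizeof(*xprt));
	xprt->addr.sin_family = AF_INET;
}

/* The only place that knows a zero port means "not bound yet". */
int demo_xprt_bound(const struct demo_xprt *xprt)
{
	assert(xprt != NULL);
	return xprt->addr.sin_port != 0;
}

/* Record the remote port (host byte order in, network order stored). */
void demo_xprt_set_port(struct demo_xprt *xprt, unsigned short port)
{
	assert(xprt != NULL);
	xprt->addr.sin_port = htons(port);
}
```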

Test-plan:
Destructive testing (unplugging the network temporarily). Repeated runs of
Connectathon locking suite with UDP and TCP.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:39 +0800

04 Aug, 2006

1 commit

e0ab53dea RPC: Ensure that we disconnect TCP socket when client requests error out

If we're part way through transmitting a TCP request, and the client
errors, then we need to disconnect and reconnect the TCP socket in order to
avoid confusing the server.

Signed-off-by: Trond Myklebust
(cherry picked from 031a50c8b9ea82616abd4a4e18021a25848941ce commit)

Trond Myklebust
2006-08-04 04:56:55 +0800

22 Jul, 2006

1 commit

0da974f4f [NET]: Conversions from kmalloc+memset to k(z|c)alloc.

Signed-off-by: Panagiotis Issaris
Signed-off-by: David S. Miller

Panagiotis Issaris
2006-07-22 05:51:30 +0800

09 Jun, 2006

1 commit

bf3fcf895 SUNRPC: NFS_ROOT always uses the same XIDs

The XID generator uses get_random_bytes to generate an initial XID.
NFS_ROOT starts up before the random driver, though, so get_random_bytes
doesn't set a random XID for NFS_ROOT. This causes NFS_ROOT mount points
to reuse XIDs every time the client is booted. If the client boots often
enough, the server will start serving old replies out of its duplicate
reply cache (DRC).

Use net_random() instead.

Test plan:
I/O intensive workloads should perform well and generate no errors. Traces
taken during client reboots should show that NFS_ROOT mounts use unique
XIDs after every reboot.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-06-09 21:34:06 +0800

21 Mar, 2006

5 commits

43ac3f296 SUNRPC: Fix memory barriers for req->rq_received

We need to ensure that all writes to the XDR buffers are done before
req->rq_received is visible to other processors.
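
The ordering requirement can be illustrated in portable C11. Note the
substitution: this sketch uses C11 release/acquire atomics rather than the
kernel's own barrier primitives, and struct demo_rqst and the demo_*
functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request: a reply buffer plus a "reply length" flag. */
struct demo_rqst {
	char       reply[64];    /* stands in for the XDR receive buffer */
	atomic_int rq_received;  /* non-zero once the reply is complete */
};

void demo_rqst_init(struct demo_rqst *req)
{
	memset(req->reply, 0, sizeof(req->reply));
	atomic_init(&req->rq_received, 0);
}

/* Writer: fill the buffer first, then publish the length with release
 * ordering so the buffer writes cannot be reordered after the flag. */
void demo_complete_rqst(struct demo_rqst *req, const char *data, int copied)
{
	assert(copied > 0 && (size_t)copied <= sizeof(req->reply));
	memcpy(req->reply, data, (size_t)copied);
	atomic_store_explicit(&req->rq_received, copied, memory_order_release);
}

/* Reader: an acquire load; a non-zero result means the buffer contents
 * written before the matching release store are visible. */
int demo_reply_ready(struct demo_rqst *req)
{
	return atomic_load_explicit(&req->rq_received, memory_order_acquire);
}
```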

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-03-21 02:44:51 +0800
e95b85ec9 SUNRPC: minor cleanup

RPC_DEBUG_DATA no longer needed in net/sunrpc/xprt.c.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-03-21 02:44:23 +0800
11c556b3d SUNRPC: provide a mechanism for collecting stats in the RPC client

Add a simple mechanism for collecting stats in the RPC client. Stats are
tabulated during xprt_release. Note that per_cpu shenanigans are not
required here because the RPC client already serializes on the transport
write lock.

Test plan:
Compile kernel with CONFIG_NFS enabled. Basic performance regression
testing with high-speed networking and high performance server.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-03-21 02:44:22 +0800
ef759a2e5 SUNRPC: introduce per-task RPC iostats

Account for various things that occur while an RPC task is executed.
Separate timers for RPC round trip and RPC execution time show how
long RPC requests wait in queue before being sent. Eventually these
will be accumulated at xprt_release time in one place where they can
be viewed from userland.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-03-21 02:44:17 +0800
262ca07de SUNRPC: add a handful of per-xprt counters

Monitor generic transport events. Add a transport switch callout to
format transport counters for export to user-land.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-03-21 02:44:16 +0800

07 Jan, 2006

3 commits

0065db328 SUNRPC: Clean up xprt_destroy()

We ought never to be calling xprt_destroy() if there are still active
rpc_tasks. Optimise away the broken code that attempts to "fix" that case.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-01-07 03:58:58 +0800
632e3bdc5 SUNRPC: Ensure client closes the socket when server initiates a close

If the server decides to close the RPC socket, we currently don't actually
respond until either another RPC call is scheduled, or until xprt_autoclose()
gets called by the socket expiry timer (which may be up to 5 minutes
later).

This patch ensures that xprt_autoclose() is called much sooner if the
server closes the socket.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-01-07 03:58:57 +0800
021071483 SUNRPC: switchable buffer allocation

Add RPC client transport switch support for replacing buffer management
on a per-transport basis.

In the current IPv4 socket transport implementation, RPC buffers are
allocated as needed for each RPC message that is sent. Some transport
implementations may choose to use pre-allocated buffers for encoding,
sending, receiving, and unmarshalling RPC messages, however. For
transports capable of direct data placement, the buffers can be carved
out of a pre-registered area of memory rather than from a slab cache.
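
A rough userspace sketch of per-transport buffer management through an
operations table. Every demo_* name is hypothetical: one transport takes
buffers from the heap as needed, another hands out a single pre-allocated
region, and generic code calls through the table without knowing which.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-transport buffer operations. */
struct demo_xprt_ops {
	void *(*buf_alloc)(size_t size);
	void  (*buf_free)(void *buf);
};

struct demo_xprt {
	const struct demo_xprt_ops *ops;
};

/* Alternative strategy: one pre-allocated region, handed out whole. */
unsigned char demo_pool[2048];
int demo_pool_busy;

void *demo_pool_alloc(size_t size)
{
	if (demo_pool_busy || size > sizeof(demo_pool))
		return NULL;
	demo_pool_busy = 1;
	return demo_pool;
}

void demo_pool_free(void *buf)
{
	if (buf == demo_pool)
		demo_pool_busy = 0;
}

/* Default strategy: allocate as needed from the heap. */
const struct demo_xprt_ops demo_heap_ops = { malloc, free };
const struct demo_xprt_ops demo_pool_ops = { demo_pool_alloc, demo_pool_free };

/* Generic code calls through the switch and never picks a strategy. */
void *demo_xprt_buf_alloc(const struct demo_xprt *xprt, size_t size)
{
	assert(xprt != NULL && xprt->ops != NULL);
	return xprt->ops->buf_alloc(size);
}

void demo_xprt_buf_free(const struct demo_xprt *xprt, void *buf)
{
	assert(xprt != NULL && xprt->ops != NULL);
	xprt->ops->buf_free(buf);
}
```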

Test-plan:
Millions of fsx operations. Performance characterization with "sio" and
"iozone". Use oprofile and other tools to look for significant regression
in CPU utilization.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-01-07 03:58:55 +0800

19 Oct, 2005

2 commits

ead5e1c26 SUNRPC: Provide a callback to allow free pages allocated during xdr encoding

For privacy, we need to allocate pages to store the encrypted data (passed-in
pages can't be used without the risk of corrupting data in the page cache).
So we need a way to free that memory after the request has been transmitted.

Signed-off-by: J. Bruce Fields
Signed-off-by: Trond Myklebust

J. Bruce Fields
2005-10-19 14:19:43 +0800
5e5ce5be6 RPC: allow call_encode() to delay transmission of an RPC call.

Currently, call_encode will cause the entire RPC call to abort if it returns
an error. This is unnecessarily rigid, and gets in the way of attempts
to allow the NFSv4 layer to order RPC calls that carry sequence ids.

Signed-off-by: Trond Myklebust

Trond Myklebust
2005-10-19 05:20:11 +0800

24 Sep, 2005

5 commits

03bf4b707 [PATCH] RPC: parametrize various transport connect timeouts

Each transport implementation can now set unique bind, connect,
reestablishment, and idle timeout values. These are variables,
allowing the values to be modified dynamically. This permits
exponential backoff of any of these values, for instance.

As an example, we implement exponential backoff for the connection
reestablishment timeout.
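
Exponential backoff of a reconnect delay can be sketched as below. The
function name and the 3-second and 300-second bounds are assumptions made
for the illustration, not values taken from this commit.

```c
#include <assert.h>

#define DEMO_INIT_REEST_TO 3ul    /* assumed initial delay, in seconds */
#define DEMO_MAX_REEST_TO  300ul  /* assumed ceiling, in seconds */

static_assert(DEMO_INIT_REEST_TO <= DEMO_MAX_REEST_TO,
	      "initial delay must not exceed the ceiling");

/* Double the delay after each failed attempt, up to the ceiling.
 * A current value of 0 means "no attempt yet" and yields the initial delay. */
unsigned long demo_next_reestablish_timeout(unsigned long current)
{
	if (current == 0)
		return DEMO_INIT_REEST_TO;
	if (current >= DEMO_MAX_REEST_TO / 2)
		return DEMO_MAX_REEST_TO;
	return current * 2;
}
```

After a successful connection, a caller would typically reset the stored
delay to 0 so the next failure starts again from the initial value.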

Test-plan:
Destructive testing (unplugging the network temporarily). Connectathon
with UDP and TCP.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2005-09-24 00:38:53 +0800
555ee3af1 [PATCH] RPC: clean up after nocong was removed

Clean-up: Move some macros that are specific to the Van Jacobson
implementation into xprt.c. Get rid of the cong_wait field in
rpc_xprt, which is no longer used. Get rid of xprt_clear_backlog.

Test-plan:
Compile with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2005-09-24 00:38:48 +0800
a58dd398f [PATCH] RPC: add a release_rqst callout to the RPC transport switch

The final place where congestion control state is adjusted is in
xprt_release, where each request is finally released. Add a callout
there to allow transports to perform additional processing when a
request is about to be released.

Test-plan:
Use WAN simulation to cause sporadic bursty packet loss. Look for significant
regression in performance or client stability.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2005-09-24 00:38:45 +0800
1570c1e41 [PATCH] RPC: add generic interface for adjusting the congestion window

A new interface that allows transports to adjust their congestion window
using the Van Jacobson implementation in xprt.c is provided.
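
For orientation, a Van Jacobson style window adjustment (additive increase
on success, halving on timeout) might be sketched as follows. This is a
generic illustration with hypothetical names and a fixed-point scale chosen
for the example; it is not a copy of the code in xprt.c.

```c
#include <assert.h>

#define DEMO_CWNDSCALE 256u  /* assumed fixed-point unit: one request slot */

/* Return the new congestion window.
 * cwnd and max_cwnd are in fixed-point units of DEMO_CWNDSCALE. */
unsigned int demo_adjust_cwnd(unsigned int cwnd, unsigned int max_cwnd,
			      int timed_out)
{
	assert(cwnd > 0);
	if (timed_out) {
		/* Multiplicative decrease, never below one slot. */
		cwnd >>= 1;
		if (cwnd < DEMO_CWNDSCALE)
			cwnd = DEMO_CWNDSCALE;
	} else {
		/* Additive increase: about one slot per window of replies. */
		cwnd += (DEMO_CWNDSCALE * DEMO_CWNDSCALE + (cwnd >> 1)) / cwnd;
		if (cwnd > max_cwnd)
			cwnd = max_cwnd;
	}
	return cwnd;
}
```

With a window of one slot (256), a successful reply grows it to 512; a
timeout at 512 halves it back to 256, and it never drops below one slot.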

Test-plan:
Use WAN simulation to cause sporadic bursty packet loss. Look for
significant regression in performance or client stability.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2005-09-24 00:38:43 +0800
46c0ee8bc [PATCH] RPC: separate xprt_timer implementations

Allow transports to hook the retransmit timer interrupt. Some transports
calculate their congestion window here so that a retransmit timeout has
immediate effect on the congestion window.

Test-plan:
Use WAN simulation to cause sporadic bursty packet loss. Look for significant
regression in performance or client stability.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2005-09-24 00:38:41 +0800