Doug / smarc-fsl-linux-kernel | Embedian Git Server

10 Jul, 2008

2 commits

46cb650c2 NFS: Remove the redundant file_open entry from struct nfs_rpc_ops

All instances are set to nfs_open(), so we should just remove the redundant
indirection. Ditto for the file_release op.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-07-10 00:09:16 +0800
2116271a3 NFS: Add correct bounds checking to NFSv2 locks

NFSv2 file locking currently fails the Connectathon tests, because the
calls to the VFS locking code do not return an EINVAL error if the
struct file_lock overflows the 32-bit boundaries.

The problem is due to the fact that we occasionally call helpers from
fs/locks.c in order to avoid RPC calls to the server when we know that a
local process holds the lock. These helpers are, of course, always
64-bit enabled, so EINVAL is not returned in cases where it would have been
if the call had gone to the NLM code.

For consistency, we therefore add support for a bounds-checking helper.
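
The check described above can be modelled roughly as follows. This is a Python sketch, not the kernel's C helper: the name `lock_check_bounds`, the 32-bit signed limit, and the treatment of a "to end of file" lock are assumptions made for illustration.

```python
import errno

# Assumption: the NFSv2 lock protocol carries offsets as 32-bit signed values.
NFSV2_OFFSET_MAX = 2**31 - 1
# Largest offset a 64-bit VFS lock can describe; used here to mean "to end of file".
OFFSET_MAX = 2**63 - 1


def lock_check_bounds(start, end, limit=NFSV2_OFFSET_MAX):
    """Return 0 if the lock range fits in the protocol's offsets, else -EINVAL."""
    if start < 0 or start > limit:
        return -errno.EINVAL
    if end == OFFSET_MAX:
        # A lock that runs to end of file is clamped to the protocol maximum.
        end = limit
    if end > limit or start > end:
        return -errno.EINVAL
    return 0
```

Calling such a check before the local fs/locks.c helpers would make the locally satisfied path reject the same ranges that the RPC path rejects.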

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-07-10 00:08:40 +0800

20 Apr, 2008

1 commit

c1d519312 NFSv4: Only increment the sequence id if the server saw it

It is quite possible that the OPEN, CLOSE, LOCK, LOCKU,... compounds fail
before the actual stateful operation has been executed (for instance in the
PUTFH call). There is no way to tell from the overall status result which
operations in the COMPOUND were actually executed.

The fix is to move incrementing of the sequence id into the XDR layer,
so that we do it as we process the results from the stateful operation.
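
A toy Python model of the idea, under simplifying assumptions: the server stops at the first failing operation and returns results only for the operations it executed, and the client bumps the sequence id while decoding a stateful operation's result. The names are hypothetical, and the real client's handling of specific error codes is not modelled.

```python
from dataclasses import dataclass

STATEFUL_OPS = {"OPEN", "CLOSE", "LOCK", "LOCKU"}


@dataclass
class SeqId:
    counter: int = 0


def run_compound(ops, failing_op=None):
    """Model server: execute ops in order and stop at the first failure.

    Returns (op, status) pairs only for the ops that were executed.
    """
    results = []
    for op in ops:
        status = "ERR" if op == failing_op else "OK"
        results.append((op, status))
        if status != "OK":
            break
    return results


def decode_results(results, seqid):
    """Model XDR decode: bump the seqid only when a stateful op's result is seen."""
    for op, _status in results:
        if op in STATEFUL_OPS:
            seqid.counter += 1
    # The overall status is the status of the last executed op.
    return results[-1][1] if results else "OK"
```

If PUTFH fails, the results never contain an OPEN entry, so the counter stays put even though the overall status is an error.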

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-04-20 04:53:15 +0800

30 Jan, 2008

4 commits

c0e07cb68 NFS: NFS version number is unsigned

RPC protocol version numbers are unsigned.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2008-01-30 15:06:08 +0800
69dd716c5 NFSv4: Add socket proto argument to setclientid

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:58 +0800
cc38bac3a NFS: Ensure NFSv4 SETCLIENTID send buffer is large enough

Ensure that the RPC buffer size specified for NFSv4 SETCLIENTID procedures
matches what we are encoding into the buffer. See the definition of
struct nfs4_setclientid {} and the encode_setclientid() function.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2008-01-30 15:05:51 +0800
bdc7f021f NFS: Clean up the (commit|read|write)_setup() callback routines

Move the common code for setting up the nfs_write_data and nfs_read_data
structures into fs/nfs/read.c, fs/nfs/write.c and fs/nfs/direct.c.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:32 +0800

10 Oct, 2007

2 commits

70ca88521 NFS: Fake up 'wcc' attributes to prevent cache invalidation after write

NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.
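
One way to picture the "fake up" step is the Python model below. It assumes that a weak cache consistency update keeps cached data only when the pre-operation attributes match what the client has cached, and that missing pre-operation attributes are filled in from the cache. All names are hypothetical; this is not the kernel code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attrs:
    size: int
    mtime: int
    ctime: int


@dataclass
class CachedInode:
    attrs: Attrs
    data_valid: bool = True


def wcc_update(inode: CachedInode, pre: Optional[Attrs], post: Attrs) -> None:
    """Apply post-op attributes; drop cached data unless pre-op attrs match the cache."""
    if pre is None or pre != inode.attrs:
        inode.data_valid = False
    inode.attrs = post


def write_done(inode: CachedInode, pre_from_server: Optional[Attrs], post: Attrs) -> None:
    """If the server sent no pre-op attributes, fake them from the cached values."""
    pre = pre_from_server if pre_from_server is not None else inode.attrs
    wcc_update(inode, pre, post)
```

With the faked pre-operation attributes, a WRITE reply that carries only post-operation attributes updates the inode without discarding the cached data.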

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-10-10 05:19:15 +0800
76b32999d NFSv4: Make NFSv4 ACCESS calls return attributes too...

It doesn't really make sense to cache an access call without also
revalidating the attributes.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-10-10 05:18:38 +0800

20 Jul, 2007

2 commits

e4eff1a62 SUNRPC: Clean up the sillyrename code

Fix a couple of bugs:
- Don't rely on the parent dentry still being valid when the call completes.
  Fixes a race with shrink_dcache_for_umount_subtree()
- Don't remove the file if the filehandle has been labelled as stale.

Fix a couple of inefficiencies:
- Remove the global list of sillyrenamed files. Instead we can cache the
  sillyrename information in the dentry->d_fsdata
- Move common code from unlink_setup/unlink_done into fs/nfs/unlink.c

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:21:39 +0800
4fdc17b2a NFS: Introduce struct nfs_removeargs+nfs_removeres

We need a common structure for setting up an unlink() rpc call in order to
fix the asynchronous unlink code.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:21:39 +0800

11 Jul, 2007

2 commits

9f958ab88 NFSv4: Reduce the chances of an open_owner identifier collision

Currently we just use a 32-bit counter.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-11 11:40:39 +0800
aa53ed541 NFS4: on a O_EXCL OPEN make sure SETATTR sets the fields holding the verifier

The Linux NFS4 client simply skips over the bitmask in an O_EXCL open
call and so it doesn't bother to reset any fields that may be holding
the verifier. This patch has us save the first two words of the bitmask
(which is all the current client has #defines for). The client then
later checks this bitmask and turns on the appropriate flags in the
sattr->ia_verify field for the following SETATTR call.

This patch only currently checks to see if the server used the atime
and mtime slots for the verifier (which is what the Linux server uses
for this). I'm not sure of what other fields the server could
reasonably use, but adding checks for others should be trivial.
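
The bitmask check can be sketched in Python as below. The bit positions follow the NFSv4 attribute numbering (time_access is attribute 47, time_modify is attribute 53, and attributes 32 to 63 live in the second bitmap word). The function names and the flag values standing in for the setattr flags are assumptions for illustration.

```python
# Second bitmap word covers attributes 32..63.
FATTR4_WORD1_TIME_ACCESS = 1 << (47 - 32)
FATTR4_WORD1_TIME_MODIFY = 1 << (53 - 32)

# Stand-ins for the "set atime" / "set mtime" flags passed to SETATTR.
ATTR_ATIME = 1 << 4
ATTR_MTIME = 1 << 5


def save_attrset(bitmap_words):
    """Keep only the first two words of the bitmask returned by OPEN."""
    words = list(bitmap_words[:2])
    return tuple(words + [0] * (2 - len(words)))


def exclusive_attrset(attrset, flags):
    """Turn on the SETATTR flags for fields the server used to hold the verifier."""
    if attrset[1] & FATTR4_WORD1_TIME_ACCESS:
        flags |= ATTR_ATIME
    if attrset[1] & FATTR4_WORD1_TIME_MODIFY:
        flags |= ATTR_MTIME
    return flags
```

Checks for other fields a server might use would follow the same pattern: one more bit test and one more flag.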

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2007-07-11 11:40:25 +0800

13 Feb, 2007

2 commits

d9bc125ca Merge branch 'master' of /home/trondmy/kernel/linux-2.6/

Conflicts:

net/sunrpc/auth_gss/gss_krb5_crypto.c
net/sunrpc/auth_gss/gss_spkm3_token.c
net/sunrpc/clnt.c

Merge with mainline and fix conflicts.

Trond Myklebust
2007-02-13 14:43:25 +0800
c5ef1c42c [PATCH] mark struct inode_operations const 3

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2007-02-13 01:48:46 +0800

04 Feb, 2007

1 commit

8e0969f04 NFS: Remove nfs_readpage_sync()

It makes no sense to maintain 2 parallel systems for reading in pages.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-02-04 07:35:06 +0800

06 Dec, 2006

1 commit

200baa211 NFS: Remove nfs_writepage_sync()

Maintaining two parallel ways of doing synchronous writes is rather
pointless. This patch gets rid of the legacy nfs_writepage_sync(), and
replaces it with the faster asynchronous writes.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-12-06 23:46:38 +0800

21 Oct, 2006

2 commits

bc4785cd4 [PATCH] nfs: verifier is network-endian

Signed-off-by: Al Viro
Acked-by: Trond Myklebust
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-10-21 01:26:40 +0800
0dbb4c679 [PATCH] xdr annotations: NFS readdir entries

on-the-wire data is big-endian

[in large part pulled from Alexey's patch]

Signed-off-by: Al Viro
Acked-by: Trond Myklebust
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Viro
2006-10-21 01:26:40 +0800

23 Sep, 2006

6 commits

94a6d7532 NFS: Use cached page as buffer for NFS symlink requests

Now that we have a copy of the symlink path in the page cache, we can pass
a struct page down to the XDR routines instead of a string buffer.

Test plan:
Connectathon, all NFS versions.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:53 +0800
4f390c152 NFS: Fix double d_drop in nfs_instantiate() error path

If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a
d_drop before returning. But some callers already do a d_drop in the case
of an error return. Make certain we do only one d_drop in all error paths.

This issue was introduced because over time, the symlink proc API diverged
slightly from the create/mkdir/mknod proc API. To prevent other coding
mistakes of this type, change the symlink proc API to be more like
create/mkdir/mknod and move the nfs_instantiate call into the symlink proc
routines so it is used in exactly the same way for create, mkdir, mknod,
and symlink.

Test plan:
Connectathon, all versions of NFS.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:52 +0800
39d7bbcb5 SUNRPC: remove extraneous header inclusions

include/linux/sunrpc/clnt.h already includes include/linux/sunrpc/xprt.h.
We can remove xprt.h from source files that already include clnt.h.
Likewise include/linux/sunrpc/timer.h.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-09-23 11:24:47 +0800
509de8111 NFS: Add extra const qualifiers

Add some extra const qualifiers into NFS.

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:34 +0800
e9326dcab NFS: Add a server capabilities NFS RPC op

Add a set_capabilities NFS RPC op so that the server capabilities can be set.

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:33 +0800
2b3de4411 NFS: Add a lookupfh NFS RPC op

Add a lookup filehandle NFS RPC op so that a file handle can be looked up
without requiring dentries and inodes and other VFS stuff when doing an NFS4
pathwalk during mounting.

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:32 +0800

09 Sep, 2006

1 commit

e9f7bee1d [PATCH] NFS: large non-page-aligned direct I/O clobbers memory

The logic in nfs_direct_read_schedule and nfs_direct_write_schedule can
allow data->npages to be one larger than rpages. This causes a page
pointer to be written beyond the end of the pagevec in nfs_read_data (or
nfs_write_data).

Fix this by making nfs_(read|write)_alloc() calculate the size of the
pagevec array, and initialise data->npages.
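
The arithmetic behind the extra page is easy to check. The Python sketch below assumes 4096-byte pages and uses hypothetical function names; it shows how a buffer that starts part-way into a page spans one more page than its length alone suggests.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_for_length(count):
    """Pages needed if the buffer were page-aligned."""
    return (count + PAGE_SIZE - 1) // PAGE_SIZE


def pages_spanned(pgbase, count):
    """Pages touched by `count` bytes starting `pgbase` bytes into the first page."""
    pgbase %= PAGE_SIZE
    return (pgbase + count + PAGE_SIZE - 1) // PAGE_SIZE
```

For a 32768-byte request, `pages_for_length` gives 8, while `pages_spanned(100, 32768)` gives 9. Sizing the page array from the first value and filling it according to the second is the overrun described above.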

Also get rid of the redundant argument to nfs_commit_alloc().

Signed-off-by: Trond Myklebust
Cc: Chuck Lever
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Trond Myklebust
2006-09-09 01:22:51 +0800

25 Aug, 2006

1 commit

3cedf13af NFSv4: increase client-provided nfs4 clientid size

Neil Brown observed that the current limit of 32 bytes isn't enough to hold two
ip addresses and the rest of the stuff we're putting in it, so it's often
truncated to the point where it's unlikely to be unique. This can cause
spurious CLID_INUSE's from the server.
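
A small Python illustration of the truncation problem. The identifier layout used here (client address, server address, protocol, and a trailing uniquifier) and the larger limit of 48 are assumptions for the example, not the exact format or size the client uses.

```python
def make_clientid(client_ip, server_ip, proto, uniquifier, limit):
    """Build a client identifier string and truncate it to `limit` bytes."""
    full = f"{client_ip}/{server_ip} {proto} {uniquifier}"
    return full[:limit]


# Two identifiers that differ only in the trailing uniquifier.
a32 = make_clientid("192.168.100.101", "192.168.200.201", "tcp", 0, 32)
b32 = make_clientid("192.168.100.101", "192.168.200.201", "tcp", 1, 32)
a48 = make_clientid("192.168.100.101", "192.168.200.201", "tcp", 0, 48)
b48 = make_clientid("192.168.100.101", "192.168.200.201", "tcp", 1, 48)
```

With a 32-byte limit the two addresses and separators already fill the buffer, so `a32 == b32`. With the larger limit the distinguishing suffix survives and the identifiers differ.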

Signed-off-by: J. Bruce Fields
Signed-off-by: Trond Myklebust
(cherry picked from fc8c17ec251e984ab3df9182ed097aa5b577c915 commit)

J. Bruce Fields
2006-08-25 03:51:59 +0800

29 Jun, 2006

1 commit

607f31e80 Revert "Merge branch 'odirect'"

This reverts ccf01ef7aa9c6c293a1c64c27331a2ce227916ec commit.

No idea how git managed this one: when I asked it to merge the odirect
topic branch it actually generated a patch which reverted the change.

Reverting the 'merge' will once again reveal Chuck's recent NFS/O_DIRECT
work to the world.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-06-29 04:52:45 +0800

25 Jun, 2006

2 commits

ccf01ef7a Merge branch 'odirect'

Trond Myklebust
2006-06-25 18:27:31 +0800
06cf6f2ed NFS: Eliminate nfs_get_user_pages()

Neil Brown observed that the kmalloc() in nfs_get_user_pages() is more
likely to fail if the I/O is large enough to require the allocation of more
than a single page to keep track of all the pinned pages in the user's
buffer.

Instead of tracking one large page array per dreq/iocb, track pages per
nfs_read/write_data, just like the cached I/O path does. An array for
pages is already allocated for us by nfs_readdata_alloc() (and the write
and commit equivalents).

This is also required for adding support for vectored I/O to the NFS direct
I/O path.

The original reason to pin the user buffer and allocate all the NFS data
structures before trying to schedule I/O was to ensure all needed resources
are allocated on the client before starting to send requests. This reduces
the chance that resource exhaustion on the client will cause a short read
or write.

On the other hand, for an application making very large I/O requests,
this means that it will be nearly impossible for the application to make
forward progress on a resource-limited client.

Thus, moving the buffer pinning functionality into the I/O scheduling
loops should be good for scalability. The next patch will do the same for
NFS data structure allocation.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-06-25 01:11:39 +0800

09 Jun, 2006

6 commits

6b97fd3da NFSv4: Follow a referral

Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik
Signed-off-by: Trond Myklebust

Manoj Naik
2006-06-09 21:34:29 +0800
7aaa0b3bd NFSv4: convert fs-locations-components to conform to RFC3530

Use component4-style formats for decoding list of servers and pathnames in
fs_locations.

Signed-off-by: Manoj Naik
Signed-off-by: Trond Myklebust

Manoj Naik
2006-06-09 21:34:23 +0800
683b57b43 NFSv4: Implement the fs_locations function call

NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.

Based on an initial implementation by Jiaying Zhang
Signed-off-by: Trond Myklebust

Trond Myklebust
2006-06-09 21:34:22 +0800
8b4bdcf89 NFS: Store the file system "fsid" value in the NFS super block.

This should enable us to detect if we are crossing a mountpoint in the
case where the server is exporting "nohide" mounts.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-06-09 21:34:19 +0800
0d0b5cb36 NFS: Optimize allocation of nfs_read/write_data structures

Clean up use of page_array, and fix an off-by-one error noticed by Tom
Talpey which causes kmalloc calls in cases where using the page_array
is sufficient.
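
The off-by-one can be pictured with the Python sketch below. It assumes the embedded page_array holds 8 entries and that the old test used a strict comparison; both are assumptions for illustration, and the function names are hypothetical.

```python
INLINE_PAGES = 8  # assumption: capacity of the embedded page_array


def needs_kmalloc_buggy(npages):
    """Strict comparison: a request that exactly fills the array still allocates."""
    return not (npages < INLINE_PAGES)


def needs_kmalloc_fixed(npages):
    """Allocate only when the request really exceeds the embedded array."""
    return npages > INLINE_PAGES
```

Under these assumptions, a 32768-byte transfer with 4096-byte pages needs exactly 8 pages, which is the boundary case where the two versions disagree.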

Test plan:
Normal client functional testing with r/wsize=32768.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-06-09 21:34:07 +0800
73a3d07c1 NFS: Clean up inode metadata updates

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-06-09 21:34:04 +0800

21 Mar, 2006

2 commits

ec06c096e NFS: Cleanup of NFS read code

Same callback hierarchy inversion as for the NFS write calls. This patch is
not strictly speaking needed by the O_DIRECT code, but avoids confusing
differences between the asynchronous read and write code.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-03-21 02:44:27 +0800
788e7a89a NFS: Cleanup of NFS write code in preparation for asynchronous o_direct

This patch inverts the callback hierarchy for NFS write calls.

Instead of having the NFSv2/v3/v4-specific code set up the RPC callback
ops, we allow the original caller to do so. This allows for more
flexibility w.r.t. how to set up and tear down the nfs_write_data
structure while still allowing the NFSv3/v4 code to perform error
handling.

The greater flexibility is needed by the asynchronous O_DIRECT code, which
wants to be able to hold on to the original nfs_write_data structures after
the WRITE RPC call has completed in order to be able to replay them if the
COMMIT call determines that the server has rebooted.
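
The inversion can be modelled in Python as below. The caller supplies the callback operations, the caller's completion callback invokes the protocol-specific hook for error handling, and the caller's release callback decides whether the request is freed or kept for a possible replay. Every name here is a hypothetical stand-in, not the kernel's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WriteData:
    offset: int
    count: int
    status: int = 0
    log: List[str] = field(default_factory=list)


@dataclass
class CallOps:
    done: Callable[[WriteData, int], None]   # runs when the WRITE reply arrives
    release: Callable[[WriteData], None]     # decides the request's fate afterwards


def proto_write_done(data: WriteData, status: int) -> None:
    """Stand-in for the version-specific hook: protocol error handling only."""
    data.status = status
    data.log.append("proto_done")


def run_write(data: WriteData, call_ops: CallOps, status: int = 0) -> None:
    """Stand-in for the RPC layer: it runs whatever ops the caller provided."""
    call_ops.done(data, status)
    call_ops.release(data)


freed: List[WriteData] = []           # buffered path: request is finished with
pending_commit: List[WriteData] = []  # direct path: kept until COMMIT succeeds


def buffered_done(data: WriteData, status: int) -> None:
    proto_write_done(data, status)
    data.log.append("buffered_done")


def direct_done(data: WriteData, status: int) -> None:
    proto_write_done(data, status)
    data.log.append("direct_done")


buffered_ops = CallOps(done=buffered_done, release=freed.append)
direct_ops = CallOps(done=direct_done, release=pending_commit.append)
```

Passing `direct_ops` to `run_write` leaves the request on `pending_commit`, where it could be replayed, while `buffered_ops` hands it to `freed`. In both cases the protocol hook still runs.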

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-03-21 02:44:27 +0800

07 Jan, 2006

2 commits

fa178f29c NFSv4: Ensure DELEGRETURN returns attributes

Upon return of a write delegation, the server will almost always bump the
change attribute. Ensure that we pick up that change so that we don't
invalidate our data cache unnecessarily.

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-01-07 03:58:51 +0800
40859d7ee NFS: support large reads and writes on the wire

Most NFS server implementations allow up to 64KB reads and writes on the
wire. The Solaris NFS server allows up to a megabyte, for instance.

Now the Linux NFS client supports transfer sizes up to 1MB, too. This will
help reduce protocol and context switch overhead on read/write intensive NFS
workloads, and support larger atomic read and write operations on servers
that support them.
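
A back-of-the-envelope Python sketch of the saving. It assumes the effective transfer size is the smallest of the requested size, the server's limit, and a 1 MB client limit; the function names are hypothetical.

```python
CLIENT_MAX = 1 << 20  # the 1MB client limit mentioned above


def effective_size(requested, server_max):
    """Transfer size actually usable on the wire under the stated assumption."""
    return min(requested, server_max, CLIENT_MAX)


def rpc_count(total_bytes, xfer_size):
    """Number of READ or WRITE calls needed to move total_bytes."""
    return -(-total_bytes // xfer_size)  # ceiling division
```

Moving 1 MB takes 32 calls at a 32768-byte transfer size, 16 calls at 64 KB, and a single call when both sides allow 1 MB.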

Test plan:
Connectathon and iozone on mount point with wsize=rsize>32768 over TCP.
Tests with NFS over UDP to verify the maximum RPC payload size cap.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2006-01-07 03:58:49 +0800