Eric Lee / smarc-fsl-linux-kernel

06 Jan, 2012

1 commit

0aaaf5c42 NFS: Cache state owners after files are closed

Servers have a finite amount of memory to store NFSv4 open and lock
owners. Moreover, servers may have a difficult time determining when
they can reap their state owner table, thanks to gray areas in the
NFSv4 protocol specification. Thus clients should be careful to reuse
state owners when possible.

Currently Linux is not too careful. When a user has closed all her
files on one mount point, the state owner's reference count goes to
zero, and it is released. The next OPEN allocates a new one. A
workload that serially opens and closes files can run through a large
number of open owners this way.

When a state owner's reference count goes to zero, slap it onto a free
list for that nfs_server, with an expiry time. Garbage collect before
looking for a state owner. This makes state owners for active users
available for re-use.
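The free-list scheme described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct owner`, `struct owner_cache`, `owner_put()`, `owner_gc()` and `owner_get()` are hypothetical names, the list is a plain singly linked list, and locking (the role `clp->cl_lock` plays in the kernel) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-in for a state owner. */
struct owner {
    int uid;             /* user this owner belongs to */
    int refcount;        /* number of active users */
    time_t expires;      /* meaningful only while on the free list */
    struct owner *next;  /* free-list link */
};

struct owner_cache {
    struct owner *free_list;  /* idle owners, most recently released first */
    time_t lifetime;          /* how long an idle owner stays cached */
};

/* Drop a reference; at zero, park the owner on the free list with an expiry. */
static void owner_put(struct owner_cache *c, struct owner *o, time_t now)
{
    assert(o->refcount > 0);
    if (--o->refcount > 0)
        return;
    o->expires = now + c->lifetime;
    o->next = c->free_list;
    c->free_list = o;
}

/* Unlink every expired owner; returns how many were reaped.
 * The caller owns the storage, so nothing is freed here. */
static int owner_gc(struct owner_cache *c, time_t now)
{
    int reaped = 0;
    struct owner **pp = &c->free_list;

    while (*pp) {
        struct owner *o = *pp;
        if (o->expires <= now) {
            *pp = o->next;
            o->next = NULL;
            reaped++;
        } else {
            pp = &o->next;
        }
    }
    return reaped;
}

/* Garbage collect first, then try to reuse an idle owner for this uid.
 * Returns NULL when the caller has to allocate a fresh owner. */
static struct owner *owner_get(struct owner_cache *c, int uid, time_t now)
{
    owner_gc(c, now);
    for (struct owner **pp = &c->free_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->uid == uid) {
            struct owner *o = *pp;
            *pp = o->next;
            o->next = NULL;
            o->refcount = 1;
            return o;
        }
    }
    return NULL;
}
```

With a 10-second lifetime, an owner released at t=100 is handed back by `owner_get()` at t=105, while a lookup at t=200 finds the list empty because the collection pass has already reaped it.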

Now that there can be unused state owners remaining at umount time,
purge the state owner free list when a server is destroyed. Also be
sure not to reclaim unused state owners during state recovery.

This change has benefits for the client as well. For some workloads,
this approach drops the number of OPEN_CONFIRM calls from the same as
the number of OPEN calls, down to just one. This reduces wire traffic
and thus open(2) latency. Before this patch, untarring a kernel
source tarball shows the OPEN_CONFIRM call counter steadily increasing
through the test. With the patch, the OPEN_CONFIRM count remains at 1
throughout the entire untar.

As long as the expiry time is kept short, I don't think garbage
collection should be terribly expensive, although it does bounce the
clp->cl_lock around a bit.

[ At some point we should rationalize the use of the nfs_server
->destroy method. ]

Signed-off-by: Chuck Lever
[Trond: Fixed a garbage collection race and a few efficiency issues]
Signed-off-by: Trond Myklebust

Chuck Lever
2012-01-06 00:59:18 +0800

25 Oct, 2011

1 commit

1442d1678 Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linux

* 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits)
nfs41: implement DESTROY_CLIENTID operation
nfsd4: typo logical vs bitwise negate for want_mask
nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED
nfsd4: seq->status_flags may be used unitialized
nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid
nfsd4: implement new 4.1 open reclaim types
nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround
nfsd4: warn on open failure after create
nfsd4: preallocate open stateid in process_open1()
nfsd4: do idr preallocation with stateid allocation
nfsd4: preallocate nfs4_file in process_open1()
nfsd4: clean up open owners on OPEN failure
nfsd4: simplify process_open1 logic
nfsd4: make is_open_owner boolean
nfsd4: centralize renew_client() calls
nfsd4: typo logical vs bitwise negate
nfs: fix bug about IPv6 address scope checking
nfsd4: more robust ignoring of WANT bits in OPEN
nfsd4: move name-length checks to xdr
nfsd4: move access/deny validity checks to xdr code
...

Linus Torvalds
2011-10-25 21:42:01 +0800

28 Aug, 2011

1 commit

a9004abc3 nfsd4: cleanup and consolidate seqid_mutating_err

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2011-08-28 02:21:26 +0800

25 Aug, 2011

3 commits

042b60beb NFSv4: renewd needs to be able to handle the NFS4ERR_CB_PATH_DOWN error

The NFSv4 spec does not specify that the server must repeat that error,
so in order to avoid having the delegations revoked, we should handle
it immediately.

Also note that NFS4ERR_CB_PATH_DOWN does in fact renew the lease...

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-08-25 03:07:37 +0800
2f60ea6b8 NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation

RFC3530 states that if the client holds a delegation, then it is obliged
to continue to send RENEW calls once every lease period in order to allow
the server to return NFS4ERR_CB_PATH_DOWN if the callback path is
unreachable.

This is not required for NFSv4.1, since the server can at any time set
the SEQ4_STATUS_CB_PATH_DOWN_SESSION in any SEQUENCE operation.
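The rule stated above, together with the NFS4ERR_CB_PATH_DOWN handling from 042b60beb, can be sketched as follows. This is only an illustration under assumptions: `struct lease`, `renew_is_mandatory()` and `renew_done()` are hypothetical names, and the only protocol detail relied on is the NFS4ERR_CB_PATH_DOWN error code from RFC 3530.

```c
#include <assert.h>
#include <stdbool.h>

#define NFS4_OK               0
#define NFS4ERR_CB_PATH_DOWN  10048   /* error code from RFC 3530 */

/* Hypothetical per-client lease bookkeeping. */
struct lease {
    int  minor_version;         /* 0 for NFSv4.0, 1 for NFSv4.1 */
    bool holds_delegation;
    long last_renewed;          /* time the lease was last renewed */
    bool cb_path_needs_repair;  /* set when the callback path is reported down */
};

/* An NFSv4.0 client holding a delegation must keep sending RENEW so the
 * server has a chance to report a broken callback path. */
static bool renew_is_mandatory(const struct lease *l)
{
    return l->minor_version == 0 && l->holds_delegation;
}

/* Process the result of a RENEW that was sent at time sent_at. */
static void renew_done(struct lease *l, int status, long sent_at)
{
    if (status == NFS4ERR_CB_PATH_DOWN) {
        /* The server need not repeat this error, so act on it now. */
        l->cb_path_needs_repair = true;
    }
    if (status == NFS4_OK || status == NFS4ERR_CB_PATH_DOWN) {
        /* NFS4ERR_CB_PATH_DOWN still renews the lease. */
        l->last_renewed = sent_at;
    }
}
```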

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-08-25 03:07:37 +0800
8534d4ec0 NFSv4: nfs4_proc_renew should be declared static

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-08-25 03:07:37 +0800

01 Aug, 2011

1 commit

dae100c2b pnfs: ask for layout_blksize and save it in nfs_server

Block layout needs it to determine IO size.

Signed-off-by: Fred Isaman
Signed-off-by: Tao Guo
Signed-off-by: Benny Halevy
Signed-off-by: Benny Halevy
Signed-off-by: Jim Rees
Signed-off-by: Trond Myklebust

Fred Isaman
2011-08-01 00:18:15 +0800

26 Jul, 2011

1 commit

5f00bcb38 Merge branch 'master' into devel and apply fixup from Stephen Rothwell:

vfs/nfs: fixup for nfs_open_context change

Signed-off-by: Stephen Rothwell
Signed-off-by: Trond Myklebust

Stephen Rothwell
2011-07-26 02:53:52 +0800

20 Jul, 2011

1 commit

643168c2d nfs4_closedata doesn't need to mess with struct path

instead of path_get()/path_put(), we can just use nfs_sb_{,de}active()
to pin the superblock down.

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:41 +0800

13 Jul, 2011

2 commits

fca78d6d2 NFS: Add SECINFO_NO_NAME procedure

If the client is using NFS v4.1, then we can use SECINFO_NO_NAME to find
the secflavor for the initial mount. If the server doesn't support
SECINFO_NO_NAME then I fall back on the "guess and check" method used
for v4.0 mounts.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-07-13 01:40:27 +0800
78fe0f41d NFS: use scope from exchange_id to skip reclaim

State reclaim can be skipped if the "eir_server_scope" from the exchange_id proc
differs from previous calls.

Also, in the future server_scope will be useful for determining whether client
trunking is available.

Signed-off-by: Weston Andros Adamson
Signed-off-by: Trond Myklebust

Weston Andros Adamson
2011-07-13 01:40:27 +0800

25 Apr, 2011

1 commit

fd954ae12 NFSv4.1: Don't loop forever in nfs4_proc_create_session

If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end
up looping forever inside nfs4_proc_create_session, and so the usual
mechanisms for detecting if the nfs_client is dead don't work.

Fix this by ensuring that we loop inside the nfs4_state_manager thread
instead.
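The shape of the fix can be sketched in user-space C: the procedure makes a single attempt, and the retry loop lives in the caller, which rechecks whether the client is dead on every pass. This is a simplified illustration, not the kernel code; `struct fake_server`, `struct client`, `create_session_once()` and `state_manager_run()` are hypothetical, and the real state manager would sleep between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NFS4_OK        0
#define NFS4ERR_DELAY  10008   /* error code from RFC 3530 */

/* Hypothetical server that answers NFS4ERR_DELAY a fixed number of times. */
struct fake_server {
    int delays_left;
};

/* Hypothetical client state visible to the state manager. */
struct client {
    bool dead;      /* set when the client is being torn down */
    int  attempts;  /* how many CREATE_SESSION attempts were made */
};

/* One attempt only: never loops, so control always returns to the caller. */
static int create_session_once(struct fake_server *srv)
{
    if (srv->delays_left > 0) {
        srv->delays_left--;
        return NFS4ERR_DELAY;
    }
    return NFS4_OK;
}

/* State-manager-style loop: retries on NFS4ERR_DELAY, but checks on every
 * pass whether the client has died, so a stalling server cannot trap it. */
static int state_manager_run(struct client *clp, struct fake_server *srv)
{
    for (;;) {
        if (clp->dead)
            return -EIO;
        clp->attempts++;
        int status = create_session_once(srv);
        if (status != NFS4ERR_DELAY)
            return status;
        /* A real implementation would back off here before retrying. */
    }
}
```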

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-04-25 02:28:18 +0800

25 Mar, 2011

3 commits

0acd22019 Merge branch 'nfs-for-2.6.39' into nfs-for-next

Trond Myklebust
2011-03-25 05:03:14 +0800
ef3115378 NFSv4.1 convert layoutcommit sync to boolean

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust

Andy Adamson
2011-03-25 03:49:48 +0800
7c5130588 NFS: lookup supports alternate client

A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-03-25 01:52:41 +0800

24 Mar, 2011

1 commit

863a3c6c6 NFSv4.1: layoutcommit

The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.

Because only whole-file layouts are supported, there is only one IOMODE_RW
layout segment.

Signed-off-by: Andy Adamson
Signed-off-by: Alexandros Batsakis
Signed-off-by: Boaz Harrosh
Signed-off-by: Dean Hildebrand
Signed-off-by: Fred Isaman
Signed-off-by: Mingyang Guo
Signed-off-by: Tao Guo
Signed-off-by: Zhang Jingwang
Tested-by: Boaz Harrosh
Signed-off-by: Benny Halevy
Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust

Andy Adamson
2011-03-24 03:29:04 +0800

12 Mar, 2011

5 commits

dc70d7b31 NFSv4.1: filelayout read

Attempt a pNFS file layout read by setting up the nfs_read_data struct and
calling nfs_initiate_read with the data server rpc client and the
filelayout rpc call ops.

Error handling is implemented in a subsequent patch.

Signed-off-by: Andy Adamson
Signed-off-by: Dean Hildebrand
Signed-off-by: Fred Isaman
Signed-off-by: Fred Isaman
Signed-off-by: Mingyang Guo
Signed-off-by: Oleg Drokin
Signed-off-by: Ricardo Labiaga
Tested-by: Guo Mingyang
Signed-off-by: Andy Adamson
Signed-off-by: Benny Halevy
Signed-off-by: Trond Myklebust

Andy Adamson
2011-03-12 04:38:43 +0800
d83217c13 NFSv4.1: data server connection

Introduce a data server set_client and init session following the
nfs4_set_client and nfs4_init_session convention.

Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.

Use the new nfs_get_client() that initializes new clients.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust

Andy Adamson
2011-03-12 04:38:42 +0800
94de8b27d NFSv4.1: add MDS mount DS only check

The DS only role cannot be used to mount.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust

Andy Adamson
2011-03-12 04:38:41 +0800
f9feab1e1 NFSv4: nfs4_state_mark_reclaim_nograce() should be static

There are no more external users of nfs4_state_mark_reclaim_nograce() or
nfs4_state_mark_reclaim_reboot(), so mark them as static.

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-03-12 04:18:36 +0800
0400a6b0c NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses

nfs4_schedule_state_recovery() should only be used when we need to force
the state manager to check the lease. If we just want to start the
state manager in order to handle a state recovery situation, we should be
using nfs4_schedule_state_manager().
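The distinction between the two entry points might look roughly like this. The sketch is a guess at the shape, not the kernel implementation: the flag names, `struct client`, `schedule_state_manager()` and `schedule_lease_check()` are all hypothetical.

```c
#include <assert.h>

/* Hypothetical state bits. */
enum {
    CLNT_MANAGER_RUNNING = 1u << 0,  /* a state manager pass is pending */
    CLNT_CHECK_LEASE     = 1u << 1   /* the manager must verify the lease */
};

struct client {
    unsigned flags;
    int manager_wakeups;  /* counts how often the manager was started */
};

/* Just start the state manager, if it is not already pending. */
static void schedule_state_manager(struct client *clp)
{
    if (clp->flags & CLNT_MANAGER_RUNNING)
        return;
    clp->flags |= CLNT_MANAGER_RUNNING;
    clp->manager_wakeups++;
}

/* Force a lease check: heavier, so only for callers that really need it. */
static void schedule_lease_check(struct client *clp)
{
    clp->flags |= CLNT_CHECK_LEASE;
    schedule_state_manager(clp);
}
```

A caller that only wants recovery work done uses the first function and leaves the lease-check bit untouched; only the second one asks for the lease to be verified.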

This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
its use with a set of helper functions that do the right thing.

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-03-12 04:18:22 +0800

07 Jan, 2011

3 commits

24d292b89 NFS: Move cl_state_owners and related fields to the nfs_server struct

NFSv4 migration needs to reassociate state owners from the source to
the destination nfs_server data structures. To make that easier, move
the cl_state_owners field to the nfs_server struct. cl_openowner_id
and cl_lockowner_id accompany this move, as they are used in
conjunction with cl_state_owners.

The cl_lock field in the parent nfs_client continues to protect all
three of these fields.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2011-01-07 03:47:57 +0800
f7e8917a6 pnfs: layout roc code

A layout can request return-on-close. How this interacts with the
forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
We forget any layouts marked roc, and wait for them to be completely
forgotten before continuing with the close. In addition, to compensate
for races with any inflight LAYOUTGETs, and the fact that we do not get
any layout stateid back from the server, we set the barrier to the worst
case scenario of current_seqid + number of outstanding LAYOUTGETS.

Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust

Fred Isaman
2011-01-07 03:46:32 +0800
43f1b3da8 pnfs: add CB_LAYOUTRECALL handling

This is the heart of the wave 2 submission. Add the code to trigger
drain and forget of any affected layouts. In addition, we set a
"barrier", below which any LAYOUTGET reply is ignored. This is to
compensate for the fact that we do not wait for outstanding LAYOUTGETs
to complete as per section 12.5.5.2.1 of RFC 5661.

Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust

Fred Isaman
2011-01-07 03:46:32 +0800

05 Jan, 2011

1 commit

64c2ce8b7 nfsv4: Switch to generic xattr handling code

This patch makes nfsv4 use the generic xattr handling code
to get the nfsv4 acl. This will help us to add richacl
support to nfsv4 in later patches.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Trond Myklebust

Aneesh Kumar K.V
2011-01-05 02:10:41 +0800

17 Dec, 2010

1 commit

573c4e1ef NFS: Simplify ->decode_dirent() calling sequence

Clean up.

The pointer returned by ->decode_dirent() is no longer used as a
pointer. The only call site (xdr_decode() in fs/nfs/dir.c) simply
extracts the errno value encoded in the pointer. Replace the
returned pointer with a standard integer errno return value.
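The change of calling convention can be illustrated in user-space C. The helpers `err_ptr()`, `ptr_err()` and `is_err()` below imitate the kernel's pointer-encoded errno idiom; `decode_old()` and `decode_new()` are hypothetical decoders over an array of 32-bit words, not the real ->decode_dirent() signatures, and -EAGAIN is just an example error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitations of the pointer-encoded errno idiom. */
static void *err_ptr(long err)      { return (void *)(intptr_t)err; }
static long  ptr_err(const void *p) { return (long)(intptr_t)p; }
static int   is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO);
}

/* Old style: returns the advanced cursor, or an errno disguised as a
 * pointer, which the caller must unpack before using the result. */
static const uint32_t *decode_old(const uint32_t *p, const uint32_t *end,
                                  uint32_t *out)
{
    if (p >= end)
        return err_ptr(-EAGAIN);
    *out = *p;
    return p + 1;
}

/* New style: a plain integer status; the cursor advances via a parameter. */
static int decode_new(const uint32_t **pp, const uint32_t *end, uint32_t *out)
{
    if (*pp >= end)
        return -EAGAIN;
    *out = **pp;
    (*pp)++;
    return 0;
}
```

When the only caller immediately converts the pointer back into an errno, the integer return removes a round trip and makes the error path explicit.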

Also, pass the "server" argument as part of the nfs_entry instead of
as a separate parameter. It's faster to derive "server" in
nfs_readdir_xdr_to_array() since we already have the directory's inode
handy. "server" ought to be invariant for a set of entries in the
same directory, right?

The legacy versions of decode_dirent() don't use "server" anyway, so
it's wasted work for them to derive and pass "server" for each entry.

Signed-off-by: Chuck Lever
Tested-by: J. Bruce Fields
Signed-off-by: Trond Myklebust

Chuck Lever
2010-12-17 01:37:24 +0800

24 Oct, 2010

2 commits

82f2e5472 NFS: Readdir plus in v4

By requesting more attributes during a readdir, we can mimic the readdir plus
operation that was in NFSv3.

To test, I ran the command `ls -lU --color=none` on directories with various
numbers of files. Without readdir plus, I see this:

n files |       100 |     1,000 |    10,000 |   100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
access  |         3 |         1 |         1 |         4 |        31
getattr |         2 |         1 |         1 |         1 |         1
lookup  |       104 |     1,003 |    10,003 |   100,003 | 1,000,003
readdir |         2 |        16 |       158 |     1,575 |    15,749
total   |       111 |     1,021 |    10,163 |   101,583 | 1,015,784

With readdir plus enabled, I see this:

n files |       100 |     1,000 |    10,000 |   100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
access  |         3 |         1 |         1 |         1 |         7
getattr |         2 |         1 |         1 |         1 |         1
lookup  |         4 |         3 |         3 |         3 |         3
readdir |         6 |        62 |       630 |     6,300 |    62,993
total   |        15 |        67 |       635 |     6,305 |    63,004

With readdir plus disabled, there are about 16x more RPC calls and listing is
4 to 5 times slower on large directories.
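The summary figures follow directly from the two tables. A small C sketch of the arithmetic, using the "total" and "real" rows for the 1,000,000-file and 100,000-file columns (the helper names `seconds()` and `ratio()` are made up for this example):

```c
#include <assert.h>

/* Convert a "XmYY.YYYs" wall-clock reading into seconds. */
static double seconds(int minutes, double secs)
{
    return minutes * 60.0 + secs;
}

/* How many times larger the figure without readdir plus is. */
static double ratio(double without_plus, double with_plus)
{
    assert(with_plus > 0.0);
    return without_plus / with_plus;
}
```

For 1,000,000 files, `ratio(1015784, 63004)` is a little over 16 and `ratio(seconds(9, 59.128), seconds(2, 7.528))` is about 4.7; for 100,000 files the wall-clock ratio is about 4.5.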

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2010-10-24 03:27:37 +0800
babddc72a NFS: decode_dirent should use an xdr_stream

Convert nfs*xdr.c to use an xdr stream in decode_dirent. This will prevent a
kernel oops that has been occurring when reading a vmapped page.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2010-10-24 03:27:33 +0800

17 Sep, 2010

5 commits

2b484297e NFS: Add an 'open_context' element to struct nfs_rpc_ops

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-09-17 22:56:51 +0800
535918f14 NFSv4: Further cleanups for nfs4_open_revalidate()

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-09-17 22:56:51 +0800
b8d4caddd NFSv4: Clean up nfs4_open_revalidate

Remove references to 'struct nameidata' from the low-level open_revalidate
code, and replace them with a struct nfs_open_context which will be
correctly initialised upon success.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-09-17 22:56:51 +0800
f46e0bd34 NFSv4: Further minor cleanups for nfs4_atomic_open()

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-09-17 22:56:50 +0800
cd9a1c0e5 NFSv4: Clean up nfs4_atomic_open

Start moving the 'struct nameidata' dependent code out of the lower level
NFS code in preparation for the removal of open intents.

Instead of the struct nameidata, we pass down a partially initialised
struct nfs_open_context that will be fully initialised by the atomic open
upon success.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-09-17 22:56:50 +0800

31 Jul, 2010

2 commits

77041ed9b NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid

flock locks want to be labelled using the process pid, while posix locks
want to be labelled using the fl_owner.
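A minimal sketch of that labelling rule, with hypothetical types: `struct lk_request` stands in for the lock request, and `lock_owner_key()` picks the identifier that names the lock owner.

```c
#include <assert.h>
#include <stdint.h>

enum lk_type { LK_FLOCK, LK_POSIX };

/* Hypothetical slice of a lock request. */
struct lk_request {
    enum lk_type type;
    const void  *fl_owner;  /* opaque owner pointer (used for posix locks) */
    int          fl_pid;    /* process id (used for flock locks) */
};

/* Hypothetical key identifying a lock owner. */
struct lk_owner_key {
    enum lk_type type;
    uint64_t     id;
};

/* flock locks are labelled by pid, posix locks by fl_owner. */
static struct lk_owner_key lock_owner_key(const struct lk_request *req)
{
    struct lk_owner_key key = { req->type, 0 };

    if (req->type == LK_FLOCK)
        key.id = (uint64_t)(unsigned int)req->fl_pid;
    else
        key.id = (uint64_t)(uintptr_t)req->fl_owner;
    return key;
}

static int same_lock_owner(struct lk_owner_key a, struct lk_owner_key b)
{
    return a.type == b.type && a.id == b.id;
}
```

Two posix requests sharing an `fl_owner` map to the same owner even when their pids differ, and two flock requests from the same pid map to the same owner regardless of `fl_owner`.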

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-07-31 02:46:10 +0800
d3c7b7ccc NFSv4: Add support for the RELEASE_LOCKOWNER operation

This is needed by NFSv4.0 servers in order to keep the number of locking
stateids at a manageable level.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-07-31 02:46:10 +0800

25 Jun, 2010

1 commit

1f0e890db NFSv4: Clean up struct nfs4_state_owner

The 'so_delegations' list appears to be unused.

Also eliminate so_client. If we already have so_server, we can get to the
nfs_client structure.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-06-25 03:11:43 +0800

23 Jun, 2010

4 commits

1055d76d9 NFSv4.1: There is no need to init the session more than once...

Set up a flag to ensure that is indeed the case.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-06-23 01:24:03 +0800
e047a10c1 NFSv41: Fix nfs_async_inode_return_delegation() ugliness

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-06-23 01:24:02 +0800
c48f4f354 NFSv41: Convert the various reboot recovery ops etc to minor version ops

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-06-23 01:24:02 +0800
97dc13594 NFSv41: Clean up the NFSv4.1 minor version specific operations

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-06-23 01:24:02 +0800