13 Jan, 2012
1 commit
-
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab
Signed-off-by: Rusty Russell
10 Jan, 2012
1 commit
-
Now that the use of numeric uids/gids is officially sanctioned in
RFC3530bis, it is time to change the default here to 'enabled'.

By doing so, we ensure that NFSv4 copies the behaviour of NFSv3 when we're
using the default AUTH_SYS authentication (i.e. when the client uses the
numeric uids/gids as authentication tokens), so that when new files are
created, they will appear to have the correct user/group.
It also fixes a number of backward compatibility issues when migrating
from NFSv3 to NFSv4 on a platform where the server uses different uid/gid
mappings than the client.

Note also that this setting has been successfully tested against servers
that do not support numeric uids/gids at several Connectathon/Bakeathon
events at this point, and the fall back to using string names/groups has
been shown to work well in all those test cases.

Signed-off-by: Trond Myklebust
06 Jan, 2012
1 commit
-
Servers have a finite amount of memory to store NFSv4 open and lock
owners. Moreover, servers may have a difficult time determining when
they can reap their state owner table, thanks to gray areas in the
NFSv4 protocol specification. Thus clients should be careful to reuse
state owners when possible.

Currently Linux is not too careful. When a user has closed all her
files on one mount point, the state owner's reference count goes to
zero, and it is released. The next OPEN allocates a new one. A
workload that serially opens and closes files can run through a large
number of open owners this way.

When a state owner's reference count goes to zero, slap it onto a free
list for that nfs_server, with an expiry time. Garbage collect before
looking for a state owner. This makes state owners for active users
available for re-use.

Now that there can be unused state owners remaining at umount time,
purge the state owner free list when a server is destroyed. Also be
sure not to reclaim unused state owners during state recovery.

This change has benefits for the client as well. For some workloads,
this approach drops the number of OPEN_CONFIRM calls from the same as
the number of OPEN calls, down to just one. This reduces wire traffic
and thus open(2) latency. Before this patch, untarring a kernel
source tarball shows the OPEN_CONFIRM call counter steadily increasing
through the test. With the patch, the OPEN_CONFIRM count remains at 1
throughout the entire untar.

As long as the expiry time is kept short, I don't think garbage
collection should be terribly expensive, although it does bounce the
clp->cl_lock around a bit.

[ At some point we should rationalize the use of the nfs_server
->destroy method. ]

Signed-off-by: Chuck Lever
[Trond: Fixed a garbage collection race and a few efficiency issues]
Signed-off-by: Trond Myklebust
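The free-list scheme described in this commit can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not the kernel code: the names `owner`, `server`, `owner_put`, `owner_gc` and `owner_get` are hypothetical, time is a plain integer, owners are allocated by the caller, and all locking is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a state owner. */
struct owner {
	int uid;              /* user the owner belongs to */
	int refcount;         /* number of active users */
	long expires;         /* meaningful only while on the free list */
	struct owner *next;   /* free-list link */
};

/* Hypothetical per-server state: the free list and the owner lifetime. */
struct server {
	struct owner *free_list;  /* unused owners, most recent first */
	long lifetime;            /* how long an unused owner is kept */
};

/* Drop a reference; park the owner on the free list when it reaches zero. */
static void owner_put(struct server *srv, struct owner *o, long now)
{
	if (--o->refcount > 0)
		return;
	o->expires = now + srv->lifetime;
	o->next = srv->free_list;
	srv->free_list = o;
}

/*
 * Unlink every owner whose expiry time has passed and return how many
 * were dropped. Real code would also release the owner here; in this
 * sketch the caller owns the memory.
 */
static int owner_gc(struct server *srv, long now)
{
	struct owner **pp = &srv->free_list;
	int dropped = 0;

	while (*pp) {
		struct owner *o = *pp;

		if (o->expires <= now) {
			*pp = o->next;
			o->next = NULL;
			dropped++;
		} else {
			pp = &o->next;
		}
	}
	return dropped;
}

/* Garbage collect first, then try to reuse a parked owner for this uid. */
static struct owner *owner_get(struct server *srv, int uid, long now)
{
	struct owner **pp;

	owner_gc(srv, now);
	for (pp = &srv->free_list; *pp; pp = &(*pp)->next) {
		struct owner *o = *pp;

		if (o->uid == uid) {
			*pp = o->next;
			o->next = NULL;
			o->refcount = 1;
			return o;
		}
	}
	return NULL;  /* nothing reusable: the caller allocates a new owner */
}
```

A NULL return from `owner_get` corresponds to the case where a fresh owner has to be allocated, which is the only case the pre-patch behaviour described above ever took.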
05 Jan, 2012
1 commit
-
Instead of hacking specific service names into gss_encode_v1_msg, we should
just allow the caller to specify the service name explicitly.

Signed-off-by: Trond Myklebust
Acked-by: J. Bruce Fields
21 Oct, 2011
1 commit
-
As soon as the nfs_client gets created, its cl_rpcclient is set to
ERR_PTR(-EINVAL). The rpc client structure is allocated later. Check
if the client is ready before using the cl_rpcclient pointer.

Signed-off-by: Malahal Naineni
Signed-off-by: Trond Myklebust
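For readers unfamiliar with the error-pointer idiom this commit relies on, here is a small userspace sketch. The `ERR_PTR`, `PTR_ERR` and `IS_ERR` helpers below are re-created by hand to mirror the kernel helpers of the same names; the `client` and `rpc_clnt` structs and the `client_init` and `client_rpc_ready` functions are hypothetical stand-ins, not the actual NFS code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for the RPC client and the NFS client. */
struct rpc_clnt {
	int unused;
};

struct client {
	struct rpc_clnt *rpcclient;
};

/* A freshly created client carries an error pointer, not a usable one. */
static void client_init(struct client *c)
{
	c->rpcclient = ERR_PTR(-EINVAL);
}

/* Callers should test readiness before dereferencing rpcclient. */
static bool client_rpc_ready(const struct client *c)
{
	return !IS_ERR(c->rpcclient);
}
```

Dereferencing the field between `client_init` and the later allocation would follow a pointer near the top of the address space, which is why a readiness test of this kind comes first.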
19 Oct, 2011
1 commit
-
The result from ipv6_addr_scope() is not always a single scope value,
so we can't use an equality test to compare it with IPV6_ADDR_SCOPE_LINKLOCAL
in nfs_sockaddr_match_ipaddr6.

This patch fixes the problem, and checks the address before the scope_id.
Signed-off-by: Mi Jinlong
Signed-off-by: Trond Myklebust
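The ordering described in this commit (compare the addresses first, then the scope id, and only for link-local addresses) can be sketched in userspace as below. This is an assumption-laden illustration rather than the patched kernel function: it uses the POSIX `IN6_IS_ADDR_LINKLOCAL` macro instead of the kernel's scope helpers, and `match_ipaddr6` is a hypothetical name.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when two IPv6 socket addresses refer to the same peer.
 * The scope id is only meaningful for link-local addresses, so it is
 * consulted only after the addresses themselves are known to match.
 */
static bool match_ipaddr6(const struct sockaddr_in6 *a,
			  const struct sockaddr_in6 *b)
{
	if (memcmp(&a->sin6_addr, &b->sin6_addr, sizeof(a->sin6_addr)) != 0)
		return false;
	if (IN6_IS_ADDR_LINKLOCAL(&a->sin6_addr))
		return a->sin6_scope_id == b->sin6_scope_id;
	return true;
}
```

With this shape, two identical global addresses match regardless of scope id, while two identical fe80::/10 addresses match only when they are reached through the same interface.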
01 Aug, 2011
3 commits
-
Signed-off-by: Jim Rees
Signed-off-by: Fred Isaman
Signed-off-by: Benny Halevy
Signed-off-by: Benny Halevy
[upcall bugfixes]
Signed-off-by: Peng Tao
Signed-off-by: Trond Myklebust
-
Block layout needs it to determine IO size.
Signed-off-by: Fred Isaman
Signed-off-by: Tao Guo
Signed-off-by: Benny Halevy
Signed-off-by: Benny Halevy
Signed-off-by: Jim Rees
Signed-off-by: Trond Myklebust
-
To allow layout driver to issue getdevicelist at mount time, and clean up
at umount time.

[fixup non NFS_V4_1 set_pnfs_layoutdriver definition]
[pnfs: pass mntfh down the init_pnfs path]
Signed-off-by: Benny Halevy
Signed-off-by: Benny Halevy
Signed-off-by: Jim Rees
Signed-off-by: Trond Myklebust
15 Jul, 2011
1 commit
-
This is not part of an external ABI...
Signed-off-by: Trond Myklebust
13 Jul, 2011
2 commits
-
Layouts should be tracked per nfs_server (aka superblock)
instead of per struct nfs_client, which may have multiple FSIDs associated
with it.

Signed-off-by: Weston Andros Adamson
Signed-off-by: Trond Myklebust
-
can be skipped if the "eir_server_scope" from the exchange_id proc differs from
previous calls.

Also, in the future server_scope will be useful for determining whether client
trunking is available.

Signed-off-by: Weston Andros Adamson
Signed-off-by: Trond Myklebust
30 May, 2011
1 commit
-
Use the pnfs_layoutdriver_type both as a qualifier for the deviceid,
distinguishing deviceid from different layout types on the server,
and for freeing the layout-driver allocated structure containing the
nfs4_deviceid_node.

[BUG in _deviceid_purge_client]
[layout_driver MUST set free_deviceid_node if using dev-cache]
[let ver < 4.1 compile]
Signed-off-by: Boaz Harrosh
[removed EXPORT_SYMBOL_GPL(nfs4_deviceid_purge_client)]
Signed-off-by: Benny Halevy
12 Mar, 2011
6 commits
-
The new behaviour is enabled using the new module parameter
'nfs4_disable_idmapping'.

Note that if the server rejects an unmapped uid or gid, then
the client will automatically switch back to using the idmapper.

Signed-off-by: Trond Myklebust
-
Introduce a data server set_client and init session following the
nfs4_set_client and nfs4_init_session convention.

Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.

Use the new nfs_get_client() that initializes new clients.
Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
The DS only role cannot be used to mount.
Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
Data servers cannot send nfs4_proc_get_lease_time, but still need to set up
state renewal. Add the NFS_CS_CHECK_LEASE_TIME bit to indicate if the lease
time can be checked.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
Data servers not sharing a session with the mount MDS always have an empty
cl_superblocks list.
Replace the cl_superblocks empty list check to see if it is time to shut down
renewd with the NFS_CS_STOP_RENEW bit which is not set by such a data server.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
Now nfs_get_client returns an nfs_client ready to be used no matter if it was
found or created.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
26 Jan, 2011
1 commit
-
The information required to find the nfs_client corresponding to the incoming
back channel request is contained in the NFS layer. Perform minimal checking
in the RPC layer pg_authenticate method, and push more detailed checking into
the NFS layer where the nfs_client can be found.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
07 Jan, 2011
6 commits
-
Delegations are per-inode, not per-nfs_client. When a server file
system is migrated, delegations on the client must be moved from the
source to the destination nfs_server. Make it easier to manage a
mount point's delegation list across a migration event by moving the
list to the nfs_server struct.

Clean up: I added documenting comments to public functions I changed
in this patch. For consistency I added comments to all the other
public functions in fs/nfs/delegation.c.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust
-
We're about to move some fields from struct nfs_client to struct
nfs_server. There is a many-to-one relationship between nfs_servers
and nfs_clients. After these fields are moved to the nfs_server
struct, to visit all of the data in these fields that is owned by one
nfs_client, code will need to visit each nfs_server on the
cl_superblocks list for that nfs_client.

To serialize changes to the cl_superblocks list during these little
expeditions, protect the list with RCU.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust
-
A layout can request return-on-close. How this interacts with the
forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
We forget any layouts marked roc, and wait for them to be completely
forgotten before continuing with the close. In addition, to compensate
for races with any inflight LAYOUTGETs, and the fact that we do not get
any layout stateid back from the server, we set the barrier to the worst
case scenario of current_seqid + number of outstanding LAYOUTGETS.

Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust
-
Fixes a bug where the nfs_client could be freed during callback processing.
Refactor nfs_find_client to use minorversion specific means to locate the
correct nfs_client structure.

In the NFS layer, V4.0 clients are found using the callback_ident field in the
CB_COMPOUND header. V4.1 clients are found using the sessionID in the
CB_SEQUENCE operation which is also compared against the sessionID associated
with the back channel thread after a successful CREATE_SESSION.

Each of these methods finds the one and only nfs_client associated
with the incoming callback request - so nfs_find_client_next is not needed.

In the RPC layer, the pg_authenticate call needs to find the nfs_client. For
the v4.0 callback service, the callback identifier has not been decoded so a
search by address, version, and minorversion is used. The sessionid for the
sessions based callback service has (usually) not been set for the
pg_authenticate on a CB_NULL call which can be sent prior to the return
of a CREATE_SESSION call, so the sessionid associated with the back channel
thread is not used to find the client in pg_authenticate for CB_NULL calls.

Pass the referenced nfs_client to each CB_COMPOUND operation being processed
via the new cb_process_state structure. The reference is held across
cb_compound processing.

Use the new cb_process_state struct to move the NFS4ERR_RETRY_UNCACHED_REP
processing from process_op into nfs4_callback_sequence where it belongs.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
Use the small id to pointer translator service to provide a unique callback
identifier per SETCLIENTID call used to identify the v4.0 callback service
associated with the clientid.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
-
Resetting the client minor version operations causes nfs4_destroy_callback
to fail to shut down the NFSv4.1 callback service.

There is no reason to reset the client minorversion operations when the
nfs_client struct is being freed.

Remove the minorversion reset and rename the function.
Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
27 Oct, 2010
2 commits
-
* 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits)
svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
svcrpc: assume svc_delete_xprt() called only once
svcrpc: never clear XPT_BUSY on dead xprt
nfsd4: fix connection allocation in sequence()
nfsd4: only require krb5 principal for NFSv4.0 callbacks
nfsd4: move minorversion to client
nfsd4: delay session removal till free_client
nfsd4: separate callback change and callback probe
nfsd4: callback program number is per-session
nfsd4: track backchannel connections
nfsd4: confirm only on succesful create_session
nfsd4: make backchannel sequence number per-session
nfsd4: use client pointer to backchannel session
nfsd4: move callback setup into session init code
nfsd4: don't cache seq_misordered replies
SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
SUNRPC: Use conventional switch statement when reclassifying sockets
sunrpc/xprtrdma: clean up workqueue usage
sunrpc: Turn list_for_each-s into the ..._entry-s
...

Fix up trivial conflicts (two different deprecation notices added in
separate branches) in Documentation/feature-removal-schedule.txt
-
* 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
net/sunrpc: Use static const char arrays
nfs4: fix channel attribute sanity-checks
NFSv4.1: Use more sensible names for 'initialize_mountpoint'
NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure
NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure
NFS: client needs to maintain list of inodes with active layouts
NFS: create and destroy inode's layout cache
NFSv4.1: pnfs: filelayout: introduce minimal file layout driver
NFSv4.1: pnfs: full mount/umount infrastructure
NFS: set layout driver
NFS: ask for layouttypes during v4 fsinfo call
NFS: change stateid to be a union
NFSv4.1: pnfsd, pnfs: protocol level pnfs constants
SUNRPC: define xdr_decode_opaque_fixed
NFSD: remove duplicate NFS4_STATEID_SIZE
26 Oct, 2010
1 commit
-
* 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits)
SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred
nfs: fix unchecked value
Ask for time_delta during fsinfo probe
Revalidate caches on lock
SUNRPC: After calling xprt_release(), we must restart from call_reserve
NFSv4: Fix up the 'dircount' hint in encode_readdir
NFSv4: Clean up nfs4_decode_dirent
NFSv4: nfs4_decode_dirent must clear entry->fattr->valid
NFSv4: Fix a regression in decode_getfattr
NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer
NFS: Ensure we check all allocation return values in new readdir code
NFS: Readdir plus in v4
NFS: introduce generic decode_getattr function
NFS: check xdr_decode for errors
NFS: nfs_readdir_filler catch all errors
NFS: readdir with vmapped pages
NFS: remove page size checking code
NFS: decode_dirent should use an xdr_stream
SUNRPC: Add a helper function xdr_inline_peek
NFS: remove readdir plus limit
...
25 Oct, 2010
4 commits
-
Implement the driver's io_ops->alloc_lseg and free_lseg functions,
which integrate into the deviceid cache and call out to
nfs4_proc_getdeviceinfo when necessary.

Signed-off-by: Andy Adamson
Signed-off-by: Dean Hildebrand
Signed-off-by: Marc Eshel
Signed-off-by: Mike Sager
Signed-off-by: Oleg Drokin
Signed-off-by: Ricardo Labiaga
Signed-off-by: Tao Guo
Signed-off-by: Boaz Harrosh
Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust
-
In particular, server reboot will invalidate all layouts.
Note that in order to have an active layout, we must get a successful response
from the server. To avoid adding that machinery, this patch just includes a
stub that fakes up a successful return. Since the layout is never referenced
for io, this is not a problem.

Signed-off-by: Andy Adamson
Signed-off-by: Benny Halevy
Signed-off-by: Dean Hildebrand
Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust
-
Put in the infrastructure that uses information returned from the
server at mount to select a layout driver module.

In this patch, a stub is used that always returns "no driver found".
Signed-off-by: Ricardo Labiaga
Signed-off-by: Dean Hildebrand
Signed-off-by: Marc Eshel
Signed-off-by: Andy Adamson
Signed-off-by: Benny Halevy
Signed-off-by: Fred Isaman
Signed-off-by: Trond Myklebust
-
Instead of blindly zapping the caches, attempt to revalidate them if
the server has indicated that it uses high resolution timestamps.

NFSv4 should be able to always revalidate the cache since the
protocol requires the update of the change attribute on modification of
the data. In reality, there are servers (the Linux NFS server
for example) that do not obey this requirement and use ctime as the
basis for change attribute. Long term, the server needs to be fixed.
At this time, and to be on the safe side, continue zapping caches if
the server indicates that it does not have a high resolution timestamp.

Signed-off-by: Ricardo Labiaga
Signed-off-by: Trond Myklebust
24 Oct, 2010
2 commits
-
By requesting more attributes during a readdir, we can mimic the readdir plus
operation that was in NFSv3.

To test, I ran the command `ls -lU --color=none` on directories with various
numbers of files. Without readdir plus, I see this:

n files |       100 |     1,000 |    10,000 |   100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
access  |         3 |         1 |         1 |         4 |        31
getattr |         2 |         1 |         1 |         1 |         1
lookup  |       104 |     1,003 |    10,003 |   100,003 | 1,000,003
readdir |         2 |        16 |       158 |     1,575 |    15,749
total   |       111 |     1,021 |    10,163 |   101,583 | 1,015,784

With readdir plus enabled, I see this:

n files |       100 |     1,000 |    10,000 |   100,000 | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
access  |         3 |         1 |         1 |         1 |         7
getattr |         2 |         1 |         1 |         1 |         1
lookup  |         4 |         3 |         3 |         3 |         3
readdir |         6 |        62 |       630 |     6,300 |    62,993
total   |        15 |        67 |       635 |     6,305 |    63,004

Readdir plus disabled has about a 16x increase in the number of rpc calls and
is 4 - 5 times slower on large directories.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust
-
We can use vmapped pages to read more information from the network at once.
This will reduce the number of calls needed to complete a readdir.

Signed-off-by: Bryan Schumaker
[trondmy: Added #include for <linux/vmalloc.h> in fs/nfs/dir.c]
Signed-off-by: Trond Myklebust
02 Oct, 2010
1 commit
-
Signed-off-by: Pavel Emelyanov
Signed-off-by: J. Bruce Fields
23 Sep, 2010
1 commit
-
NFS clients since 2.6.12 support flock locks by emulating fcntl byte-range
locks. Due to this, some Windows applications which seem to use both flock
(share mode lock mapped as flock by Samba) and fcntl locks sequentially on
the same file, can't lock as they falsely assume the file is already locked.
The problem was reported on a setup with Windows clients accessing Excel files
on a Samba exported share which is originally an NFS mount from a NetApp filer.

Older NFS clients (< 2.6.12) did not see this problem as flock locks were
considered local. To support legacy flock behavior, this patch adds a mount
option "-olocal_lock=" which can take the following values:

'none'  - Neither flock locks nor POSIX locks are local
'flock' - flock locks are local
'posix' - fcntl/POSIX locks are local
'all'   - Both flock locks and POSIX locks are local

Testing:

- This patch was tested by using -olocal_lock option with different values
  and the NLM calls were noted from the network packet captured.

  'none'  - NLM calls were seen during both flock() and fcntl(), flock lock
            was granted, fcntl was denied
  'flock' - no NLM calls for flock(), NLM call was seen for fcntl(),
            granted
  'posix' - NLM call was seen for flock() - granted, no NLM call for fcntl()
  'all'   - no NLM calls were seen during both flock() and fcntl()

- No bugs were seen during NFSv4 locking/unlocking in general and NFSv4
  reboot recovery.

Cc: Neil Brown
Signed-off-by: Suresh Jayaraman
Signed-off-by: Trond Myklebust
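The four local_lock values listed in this commit map naturally onto two independent flag bits. The sketch below shows one way such an option string could be parsed; the flag names `LOCAL_FLOCK` and `LOCAL_POSIX` and the helper functions are hypothetical and are not taken from the kernel source.

```c
#include <string.h>

/* Hypothetical flag bits: one per lock family that is handled locally. */
#define LOCAL_FLOCK 0x1u
#define LOCAL_POSIX 0x2u

/*
 * Translate the value of a "local_lock=" option into flag bits.
 * Returns 0 on success and -1 for an unrecognised value, in which
 * case *flags is left untouched.
 */
static int parse_local_lock(const char *value, unsigned int *flags)
{
	if (strcmp(value, "none") == 0)
		*flags = 0;
	else if (strcmp(value, "flock") == 0)
		*flags = LOCAL_FLOCK;
	else if (strcmp(value, "posix") == 0)
		*flags = LOCAL_POSIX;
	else if (strcmp(value, "all") == 0)
		*flags = LOCAL_FLOCK | LOCAL_POSIX;
	else
		return -1;
	return 0;
}

/* A lock request goes to the server unless its family is marked local. */
static int flock_goes_to_server(unsigned int flags)
{
	return !(flags & LOCAL_FLOCK);
}

static int posix_goes_to_server(unsigned int flags)
{
	return !(flags & LOCAL_POSIX);
}
```

The two predicates line up with the test notes in the commit message: with 'flock' only fcntl() requests reach the server, with 'posix' only flock() requests do, and with 'all' neither does.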
13 Sep, 2010
1 commit
-
Reported-by: Ben Greear
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org
23 Jun, 2010
2 commits
-
There is no reason to change the nfs_client state every time we allocate a
new session. Move that line into nfs4_init_client_minor_version.

Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust