Eric Lee / smarc-fsl-linux-kernel

13 Jul, 2011

1 commit

fca78d6d2 NFS: Add SECINFO_NO_NAME procedure

If the client is using NFS v4.1, then we can use SECINFO_NO_NAME to find
the secflavor for the initial mount. If the server doesn't support
SECINFO_NO_NAME then I fall back on the "guess and check" method used
for v4.0 mounts.
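A rough userspace sketch of the fallback described above. Every name here (`fake_server`, `secinfo_no_name_sketch`, `try_flavor_sketch`, `find_root_flavor`) and the use of -ENOTSUP as the "not supported" code are assumptions made for illustration, not the kernel's actual code:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's rpc_authflavor_t. */
typedef unsigned int flavor_t;

/* Hypothetical model of a server, used only to drive the sketch. */
struct fake_server {
    bool has_secinfo_no_name;   /* does it implement SECINFO_NO_NAME? */
    flavor_t required_flavor;   /* flavor its root export insists on */
};

/* Ask the server directly. -ENOTSUP is an assumed "not supported" code. */
static int secinfo_no_name_sketch(const struct fake_server *srv, flavor_t *out)
{
    if (!srv->has_secinfo_no_name)
        return -ENOTSUP;
    *out = srv->required_flavor;
    return 0;
}

/* "Guess and check": try one flavor and see whether the server accepts it. */
static int try_flavor_sketch(const struct fake_server *srv, flavor_t flavor)
{
    return flavor == srv->required_flavor ? 0 : -EPERM;
}

int find_root_flavor(const struct fake_server *srv, int minor_version,
                     const flavor_t *candidates, size_t n, flavor_t *out)
{
    if (minor_version >= 1) {
        int err = secinfo_no_name_sketch(srv, out);
        if (err != -ENOTSUP)
            return err;         /* success, or a real error */
    }
    /* v4.0, or a v4.1 server without SECINFO_NO_NAME: guess and check. */
    for (size_t i = 0; i < n; i++) {
        if (try_flavor_sketch(srv, candidates[i]) == 0) {
            *out = candidates[i];
            return 0;
        }
    }
    return -EPERM;
}
```

The point of the shape is that only the "not supported" result triggers the fallback; any other error from the direct query is returned as-is.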

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-07-13 01:40:27 +0800

13 Apr, 2011

1 commit

561f0b0ad NFS: Remove unused argument from nfs_find_best_sec()

The inode was used in an earlier version of the code, but it isn't
used anymore.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-04-13 07:34:23 +0800

07 Apr, 2011

1 commit

418875900 NFS: Fix a signed vs. unsigned secinfo bug

rpc_authflavor_t is cast from an unsigned int, but the
initial code tried to use it as a signed int. I fix
this by passing an rpc_authflavor_t pointer around, and
returning signed integers from functions.
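The class of bug can be illustrated with a small userspace sketch. `flavor_t` stands in for an unsigned flavor type, and both functions are hypothetical; this is not the code the patch changed:

```c
#include <errno.h>

/* Stand-in for an unsigned flavor type such as rpc_authflavor_t. */
typedef unsigned int flavor_t;

/* Buggy shape: one unsigned return value carries both the flavor and a
 * negative errno, so a caller's "ret < 0" check can never be true. */
flavor_t pick_flavor_buggy(int server_ok)
{
    if (!server_ok)
        return (flavor_t)-EPERM;    /* wraps to a large positive value */
    return 1u;
}

/* Fixed shape: status is a signed int, the flavor travels via a pointer. */
int pick_flavor(int server_ok, flavor_t *flavor)
{
    if (!server_ok)
        return -EPERM;
    *flavor = 1u;
    return 0;
}
```

With the second shape the caller tests the signed return value and reads `*flavor` only on success.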

Signed-off-by: Bryan Schumaker
Reported-by: Dan Carpenter
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-04-07 04:25:04 +0800

27 Mar, 2011

1 commit

a0e7e3cf7 NFS: Don't leak RPC clients in NFSv4 secinfo negotiation

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-03-27 23:48:17 +0800

25 Mar, 2011

2 commits

7ebb93159 NFS: use secinfo when crossing mountpoints

A submount may use different security than the parent
mount does. We should figure out what sec flavor the
submount uses at mount time.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-03-25 01:52:42 +0800
7c5130588 NFS: lookup supports alternate client

A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2011-03-25 01:52:41 +0800

21 Mar, 2011

1 commit

1c34092ad nfs: lock() vs unlock() typo

These should be spin_unlock() instead of spin_lock(). It's a typo.

Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro

Dan Carpenter
2011-03-21 12:45:50 +0800

17 Mar, 2011

2 commits

f8ad9c4ba nfs: nfs_do_{ref,sub}mount() superblock argument is redundant

It's always equal to dentry->d_sb

Signed-off-by: Al Viro

Al Viro
2011-03-17 04:48:06 +0800
b514f872f nfs: make nfs_path() work without vfsmount

part 3: now we have everything to get nfs_path() just by dentry -
just follow to (disconnected) root and pick the rest of the thing
there.

Start killing propagation of struct vfsmount * on the paths that
used to bring it to nfs_path().

Signed-off-by: Al Viro

Al Viro
2011-03-17 04:47:55 +0800

16 Jan, 2011

3 commits

ea5b778a8 Unexport do_add_mount() and add in follow_automount(), not ->d_automount()

Unexport do_add_mount() and make ->d_automount() return the vfsmount to be
added rather than calling do_add_mount() itself. follow_automount() will then
do the addition.

This slightly complicates things as ->d_automount() normally wants to add the
new vfsmount to an expiration list and start an expiration timer. The problem
with that is that the vfsmount will be deleted if it has a refcount of 1 and
the timer will not repeat if the expiration list is empty.

To this end, we require the vfsmount to be returned from d_automount() with a
refcount of (at least) 2. One of these refs will be dropped unconditionally.
In addition, follow_automount() must get a 3rd ref around the call to
do_add_mount() lest it eat a ref and return an error, leaving the mount we
have open to being expired as we would otherwise have only 1 ref on it.

d_automount() should also add the vfsmount to the expiration list (by
calling mnt_set_expiry()) and start the expiration timer before returning, if
this mechanism is to be used. The vfsmount will be unlinked from the
expiration list by follow_automount() if do_add_mount() fails.

This patch also fixes the call to do_add_mount() for AFS to propagate the mount
flags from the parent vfsmount.

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2011-01-16 09:07:48 +0800
36d43a437 NFS: Use d_automount() rather than abusing follow_link()

Make NFS use the new d_automount() dentry operation rather than abusing
follow_link() on directories.

Signed-off-by: David Howells
Acked-by: Trond Myklebust
Acked-by: Ian Kent
Signed-off-by: Al Viro

David Howells
2011-01-16 09:07:34 +0800
cc53ce53c Add a dentry op to allow processes to be held during pathwalk transit

Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
sleep when it tries to transit away from one of that filesystem's directories
during a pathwalk. The operation is keyed off a new dentry flag
(DCACHE_MANAGE_TRANSIT).

The filesystem is allowed to be selective about which processes it holds and
which it permits to continue on or prohibits from transiting from each flagged
directory. This will allow autofs to hold up client processes whilst letting
its userspace daemon through to maintain the directory or the stuff behind it
or mounted upon it.

The ->d_manage() dentry operation:

int (*d_manage)(struct path *path, bool mounting_here);

takes a pointer to the directory about to be transited away from and a flag
indicating whether the transit is undertaken by do_add_mount() or
do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.

It should return 0 if successful, to let the process continue on its way;
-EISDIR to prohibit the caller from skipping to overmounted filesystems or
automounting, and to use this directory; or some other error code to return to
the user.
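The three outcomes listed above can be summarised in a small sketch. The enum and the function name are hypothetical and only classify the return value; they are not part of fs/namei.c:

```c
#include <errno.h>

/* Hypothetical labels for what a pathwalk does after calling d_manage(). */
enum transit_action {
    TRANSIT_CONTINUE,   /* 0: let the process continue on its way */
    TRANSIT_STAY_HERE,  /* -EISDIR: use this directory, no automount */
    TRANSIT_FAIL        /* anything else: return the error to the user */
};

enum transit_action interpret_d_manage(int ret)
{
    if (ret == 0)
        return TRANSIT_CONTINUE;
    if (ret == -EISDIR)
        return TRANSIT_STAY_HERE;
    return TRANSIT_FAIL;
}
```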

->d_manage() is called with namespace_sem writelocked if mounting_here is true
and no other locks held, so it may sleep. However, if mounting_here is true,
it may not initiate or wait for a mount or unmount upon the parameter
directory, even if the act is actually performed by userspace.

Within fs/namei.c, follow_managed() is extended to check with d_manage() first
on each managed directory, before transiting away from it or attempting to
automount upon it.

follow_down() is renamed follow_down_one() and should only be used where the
filesystem deliberately intends to avoid management steps (e.g. autofs).

A new follow_down() is added that incorporates the loop done by all other
callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
and CIFS do use it, their use is removed by converting them to use
d_automount()). The new follow_down() calls d_manage() as appropriate. It
also takes an extra parameter to indicate if it is being called from mount code
(with namespace_sem writelocked) which it passes to d_manage(). follow_down()
ignores automount points so that it can be used to mount on them.

__follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have
that determine whether to abort or not itself. That would allow the autofs
daemon to continue on in rcu-walk mode.

Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
required as every transit from that directory will cause d_manage() to be
invoked. It can always be set again when necessary.

==========================
WHAT THIS MEANS FOR AUTOFS
==========================

Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
trigger the automounting of indirect mounts, and both of these can be called
with i_mutex held.

autofs knows that the i_mutex will be held by the caller in lookup(), and so
can drop it before invoking the daemon - but this isn't so for d_revalidate(),
since the lock is only held on _some_ of the code paths that call it. This
means that autofs can't risk dropping i_mutex from its d_revalidate() function
before it calls the daemon.

The bug could manifest itself as, for example, a process that's trying to
validate an automount dentry that gets made to wait because that dentry is
expired and needs cleaning up:

mkdir S ffffffff8014e05a 0 32580 24956
Call Trace:
[] :autofs4:autofs4_wait+0x674/0x897
[] avc_has_perm+0x46/0x58
[] autoremove_wake_function+0x0/0x2e
[] :autofs4:autofs4_expire_wait+0x41/0x6b
[] :autofs4:autofs4_revalidate+0x91/0x149
[] __lookup_hash+0xa0/0x12f
[] lookup_create+0x46/0x80
[] sys_mkdirat+0x56/0xe4

versus the automount daemon which wants to remove that dentry, but can't
because the normal process is holding the i_mutex lock:

automount D ffffffff8014e05a 0 32581 1 32561
Call Trace:
[] __mutex_lock_slowpath+0x60/0x9b
[] do_path_lookup+0x2ca/0x2f1
[] .text.lock.mutex+0xf/0x14
[] do_rmdir+0x77/0xde
[] tracesys+0x71/0xe0
[] tracesys+0xd5/0xe0

which means that the system is deadlocked.

This patch allows autofs to hold up normal processes whilst the daemon goes
ahead and does things to the dentry tree behind the automounter point without
risking a deadlock as almost no locks are held in d_manage() and none in
d_automount().

Signed-off-by: David Howells
Was-Acked-by: Ian Kent
Signed-off-by: Al Viro

David Howells
2011-01-16 09:07:31 +0800

07 Jan, 2011

2 commits

b5c84bf6f fs: dcache remove dcache_lock

dcache_lock no longer protects anything. remove it.

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:23 +0800
949854d02 fs: Use rename lock and RCU for multi-step operations

The remaining uses of dcache_lock are to allow atomic, multi-step read-side
operations over the directory tree by excluding modifications to the tree.
Also, to walk in the leaf->root direction in the tree where we don't have
a natural d_lock ordering.

This could be accomplished by taking every d_lock, but this would mean a
huge number of locks and actually gets very tricky.

Solve this instead by using the rename seqlock for multi-step read-side
operations, retry in case of a rename so we don't walk up the wrong parent.
Concurrent dentry insertions are not serialised against. Concurrent deletes
are tricky when walking up the directory: our parent might have been deleted
when dropping locks, so we also need to check and retry for that.
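The retry idea can be sketched in userspace C11. This is a simplified model, not the kernel's seqlock: `seqcount_sketch`, `node`, `rename_begin()`, `rename_end()` and `depth_of()` are hypothetical names, the root is marked by a NULL parent, default (sequentially consistent) atomics are used, and concurrent deletes are not handled:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Minimal sequence counter: odd while a writer (a "rename") is active. */
struct seqcount_sketch { atomic_uint seq; };

/* Simplified tree node; the root has a NULL parent in this sketch. */
struct node { struct node *_Atomic parent; };

static unsigned read_begin(struct seqcount_sketch *s)
{
    unsigned v;

    while ((v = atomic_load(&s->seq)) & 1u)
        ;                       /* writer in progress, wait */
    return v;
}

static int read_retry(struct seqcount_sketch *s, unsigned start)
{
    return atomic_load(&s->seq) != start;
}

/* A writer brackets its tree modification with these two calls. */
void rename_begin(struct seqcount_sketch *s) { atomic_fetch_add(&s->seq, 1u); }
void rename_end(struct seqcount_sketch *s)   { atomic_fetch_add(&s->seq, 1u); }

/* Walk leaf -> root; redo the whole walk if a rename happened meanwhile. */
int depth_of(struct seqcount_sketch *rename_seq, struct node *n)
{
    unsigned start;
    int depth;

    do {
        start = read_begin(rename_seq);
        depth = 0;
        for (struct node *p = atomic_load(&n->parent); p != NULL;
             p = atomic_load(&p->parent))
            depth++;
    } while (read_retry(rename_seq, start));
    return depth;
}
```

The reader takes no per-node locks; it only compares the counter before and after the walk and repeats the walk when they differ.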

We can also use the rename lock in cases where livelock is a worry (and it
is introduced in a subsequent patch).

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:22 +0800

15 May, 2010

1 commit

a4d7f1680 NFS: Reduce the stack footprint of nfs_follow_mountpoint()

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-05-15 03:09:22 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have a fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

23 Jun, 2009

1 commit

0b75b35c7 NFS: Fix nfs_path() to always return a '/' at the beginning of the path

Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2009-06-23 12:28:25 +0800

12 Jun, 2009

1 commit

9393bd07c switch follow_down()

Signed-off-by: Al Viro

Al Viro
2009-06-12 09:36:01 +0800

08 Oct, 2008

2 commits

44d5759d3 nfs: BUG_ON in nfs_follow_mountpoint

Unfortunately, BUG_ON(IS_ROOT(dentry)) can happen inside
nfs_follow_mountpoint with NFS running Fedora 8 using a
specific setup.
https://bugzilla.redhat.com/show_bug.cgi?id=458622

So the NFS client should handle this situation gracefully.

Signed-off-by: Denis V. Lunev
CC: Trond Myklebust
CC: J. Bruce Fields
Signed-off-by: Trond Myklebust

Denis V. Lunev
2008-10-08 06:15:16 +0800
fd08d7e9d nfs: ERR_PTR is expected on failure from nfs_do_clone_mount

Replace NULL with ERR_PTR(-EINVAL).
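Why this matters can be shown with a userspace re-creation of the error-pointer convention. The three helpers below imitate the kernel macros of the same names but are written here from scratch, and `mount_sketch` and `clone_mount_sketch()` are hypothetical:

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the last MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct mount_sketch { int id; };
static struct mount_sketch the_mount = { 1 };

/* Hypothetical clone function: callers test the result with IS_ERR(),
 * so failure has to be an encoded errno rather than NULL. */
struct mount_sketch *clone_mount_sketch(int ok)
{
    if (!ok)
        return ERR_PTR(-EINVAL);
    return &the_mount;
}
```

A caller that checks only `IS_ERR()` treats NULL as a valid pointer, which is why a function following this convention should not return NULL on failure.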

Signed-off-by: Denis V. Lunev
Signed-off-by: Trond Myklebust

Denis V. Lunev
2008-10-08 06:14:34 +0800

01 Aug, 2008

1 commit

8d66bf548 [PATCH] pass struct path * to do_add_mount()

Signed-off-by: Al Viro

Al Viro
2008-08-01 23:25:32 +0800

17 May, 2008

2 commits

31f31db1a nfs: path_{get,put}() cleanups

Here are some more places where path_{get,put}() can be used instead of
dput()/mntput() pair.

Signed-off-by: Jan Blunck
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust

Jan Blunck
2008-05-17 00:43:30 +0800
3110ff804 nfs: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__
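For reference, `__func__` is the identifier defined by C99. A tiny example, with a made-up function name:

```c
/* __func__ behaves like a static const char array holding the name of
 * the enclosing function, so returning a pointer to it is safe. */
const char *whoami(void)
{
    return __func__;
}
```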

Signed-off-by: Harvey Harrison
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust

Harvey Harrison
2008-05-17 00:43:29 +0800

20 Apr, 2008

1 commit

a3dab2935 make nfs_automount_list static

nfs_automount_list can now become static.

Signed-off-by: Adrian Bunk
Signed-off-by: Trond Myklebust

Adrian Bunk
2008-04-20 04:55:29 +0800

15 Feb, 2008

2 commits

1d957f9bf Introduce path_put()

* Add path_put() functions for releasing a reference to the dentry and
vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()
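A minimal userspace model of the get/put pairing, assuming plain integer refcounts. All `*_sketch` names and the release log are hypothetical and exist only to make the ordering visible:

```c
/* Simplified stand-ins; the real structures carry far more state. */
struct dentry_sketch   { int refs; };
struct vfsmount_sketch { int refs; };

struct path_sketch {
    struct vfsmount_sketch *mnt;
    struct dentry_sketch *dentry;
};

/* Records the order of releases so the ordering can be inspected. */
char release_log[8];
int release_len;

static void log_release(char c)
{
    if (release_len < (int)sizeof(release_log) - 1)
        release_log[release_len++] = c;
}

static void dput_sketch(struct dentry_sketch *d)     { d->refs--; log_release('d'); }
static void mntput_sketch(struct vfsmount_sketch *m) { m->refs--; log_release('m'); }

/* Take a reference on both halves of the pair. */
void path_get_sketch(struct path_sketch *path)
{
    path->mnt->refs++;
    path->dentry->refs++;
}

/* Drop both references: dentry first, then the mount it lives on. */
void path_put_sketch(struct path_sketch *path)
{
    dput_sketch(path->dentry);
    mntput_sketch(path->mnt);
}
```

Keeping the two releases in one helper means no caller can get the order wrong.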

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck
Signed-off-by: Andreas Gruenbacher
Acked-by: Christoph Hellwig
Cc:
Cc: Al Viro
Cc: Steven French
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Blunck
2008-02-15 13:13:33 +0800
4ac913785 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}

This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
<dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck
Signed-off-by: Andreas Gruenbacher
Acked-by: Christoph Hellwig
Cc: Al Viro
Cc: Casey Schaufler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Blunck
2008-02-15 13:13:33 +0800

30 Jan, 2008

1 commit

40c553193 NFS: Remove the redundant nfs_client->cl_nfsversion

We can get the same information from the rpc_ops structure instead.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-01-30 15:05:49 +0800

01 Sep, 2007

1 commit

560aef745 NFS: Fix use of cancel_delayed_work_sync in nfs_release_automount_timer

Doh! We can't use cancel_delayed_work_sync because we may have been called
from an unmount that was being performed by nfs_automount_task.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-09-01 22:14:36 +0800

08 Aug, 2007

1 commit

3d39c691f NFS: Replace flush_scheduled_work with cancel_work_sync() and friends

This will avoid deadlocks of the form:

stack backtrace:
[] show_trace_log_lvl+0x1a/0x30
[] show_trace+0x12/0x20
[] dump_stack+0x15/0x20
[] __lock_acquire+0xc22/0x1030
[] lock_acquire+0x61/0x80
[] flush_workqueue+0x49/0x70
[] flush_scheduled_work+0xd/0x10
[] nfs_release_automount_timer+0x2c/0x30 [nfs]
[] nfs_free_server+0x9e/0xd0 [nfs]
[] nfs_kill_super+0x16/0x20 [nfs]
[] deactivate_super+0x7d/0xa0
[] mntput_no_expire+0x4b/0x80
[] expire_mount_list+0xe4/0x140
[] mark_mounts_for_expiry+0x99/0xb0
[] nfs_expire_automounts+0xd/0x40 [nfs]
[] run_workqueue+0x12b/0x1e0
[] worker_thread+0x9b/0x100
[] kthread+0x42/0x70
[] kernel_thread_helper+0x7/0x18
=======================

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 04:12:50 +0800

13 Feb, 2007

1 commit

92e1d5be9 [PATCH] mark struct inode_operations const 2

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
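A small sketch of a const operations table. `inode_ops_sketch`, `demo_getattr` and `demo_ops` are made-up names, and whether the table really lands in .rodata depends on the toolchain:

```c
/* Hypothetical, heavily reduced operations table. */
struct inode_ops_sketch {
    int (*getattr)(int ino);
};

static int demo_getattr(int ino)
{
    return ino * 2;
}

/* const lets the compiler place the table in read-only data. */
const struct inode_ops_sketch demo_ops = {
    .getattr = demo_getattr,
};

/* An accidental write such as
 *     demo_ops.getattr = 0;
 * is rejected at compile time because demo_ops is const. */
```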

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2007-02-13 01:48:46 +0800

22 Nov, 2006

2 commits

65f27f384 WorkStruct: Pass the work_struct pointer instead of context data

Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.
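A userspace sketch of that pattern. The `container_of` macro is a simplified re-creation without the kernel's type checking, and `work_struct` here is a one-field stand-in; `automount_ctx` and `expire_work()` are hypothetical:

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One-field stand-in for the kernel structure of the same name. */
struct work_struct {
    void (*func)(struct work_struct *work);
};

/* Hypothetical container that embeds the work item next to its data. */
struct automount_ctx {
    int expired;
    struct work_struct work;
};

/* The work function receives only the work_struct pointer and recovers
 * its context from it instead of taking a separate data argument. */
void expire_work(struct work_struct *work)
{
    struct automount_ctx *ctx =
        container_of(work, struct automount_ctx, work);

    ctx->expired++;
}
```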

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated. This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).

Signed-Off-By: David Howells

David Howells
2006-11-22 22:55:48 +0800
52bad64d9 WorkStruct: Separate delayable and non-delayable events.

Separate delayable work items from non-delayable work items by splitting them
into a separate structure (delayed_work), which incorporates a work_struct and
the timer_list removed from work_struct.

The work_struct struct is huge, and this limits its usefulness. On a 64-bit
architecture it's nearly 100 bytes in size. This reduces that by half for the
non-delayable type of event.
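The layout change can be pictured as follows. The field lists are invented for illustration and do not match the real kernel structures; only the nesting is the point:

```c
#include <stddef.h>

/* Invented stand-in for a timer; the real timer_list differs. */
struct timer_sketch {
    unsigned long expires;
    void (*function)(unsigned long);
    unsigned long data;
};

/* Non-delayable work: no timer, so it stays small. */
struct work_sketch {
    unsigned long pending;
    void (*func)(struct work_sketch *work);
};

/* Delayable work: the plain work item plus the timer it needs. */
struct delayed_work_sketch {
    struct work_sketch work;
    struct timer_sketch timer;
};

/* Recover the delayed item from a pointer to its embedded work item. */
struct delayed_work_sketch *to_delayed_work_sketch(struct work_sketch *work)
{
    return (struct delayed_work_sketch *)
        ((char *)work - offsetof(struct delayed_work_sketch, work));
}
```

Only users that actually need a delay pay for the timer.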

Signed-Off-By: David Howells

David Howells
2006-11-22 22:54:01 +0800

04 Oct, 2006

1 commit

038b0a6d8 Remove all inclusions of <linux/config.h>

kbuild explicitly includes this at build time.

Signed-off-by: Dave Jones

Dave Jones
2006-10-04 15:38:54 +0800

27 Sep, 2006

1 commit

66f37509f [PATCH] fs/nfs/: make code static

Signed-off-by: Adrian Bunk
Acked-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-09-27 23:26:20 +0800

23 Sep, 2006

3 commits

54ceac451 NFS: Share NFS superblocks per-protocol per-server per-FSID

The attached patch makes NFS share superblocks between mounts from the same
server and FSID over the same protocol.

It does this by creating each superblock with a false root and returning the
real root dentry in the vfsmount presented by get_sb(). The root dentry set
starts off as an anonymous dentry if we don't already have the dentry for its
inode, otherwise it simply returns the dentry we already have.

We may thus end up with several trees of dentries in the superblock, and if at
some later point one of the anonymous tree roots is discovered by normal filesystem
activity to be located in another tree within the superblock, the anonymous
root is named and materialises attached to the second tree at the appropriate
point.

Why do it this way? Why not pass an extra argument to the mount() syscall to
indicate the subpath and then pathwalk from the server root to the desired
directory? You can't guarantee this will work for two reasons:

(1) The root and intervening nodes may not be accessible to the client.

With NFS2 and NFS3, for instance, mountd is called on the server to get
the filehandle for the tip of a path. mountd won't give us handles for
anything we don't have permission to access, and so we can't set up NFS
inodes for such nodes, and so can't easily set up dentries (we'd have to
have ghost inodes or something).

With this patch we don't actually create dentries until we get handles
from the server that we can use to set up their inodes, and we don't
actually bind them into the tree until we know for sure where they go.

(2) Inaccessible symbolic links.

If we're asked to mount two exports from the server, eg:

mount warthog:/warthog/aaa/xxx /mmm
mount warthog:/warthog/bbb/yyy /nnn

We may not be able to access anything nearer the root than xxx and yyy,
but we may find out later that /mmm/www/yyy, say, is actually the same
directory as the one mounted on /nnn. What we might then find out, for
example, is that /warthog/bbb was actually a symbolic link to
/warthog/aaa/xxx/www, but we can't actually determine that by talking to
the server until /warthog is made available by NFS.

This would lead to having constructed an erroneous dentry tree which we
can't easily fix. We can end up with a dentry marked as a directory when
it should actually be a symlink, or we could end up with an apparently
hardlinked directory.

With this patch we need not make assumptions about the type of a dentry
for which we can't retrieve information, nor need we assume we know its
place in the grand scheme of things until we actually see that place.

This patch reduces the possibility of aliasing in the inode and page caches for
inodes that may be accessed by more than one NFS export. It also reduces the
number of superblocks required for NFS where there are many NFS exports being
used from a server (home directory server + autofs for example).

This in turn makes it simpler to do local caching of network filesystems, as it
can then be guaranteed that there won't be links from multiple inodes in
separate superblocks to the same cache file.

Obviously, cache aliasing between different levels of NFS protocol could still
be a problem, but at least that gives us another key to use when indexing the
cache.

This patch makes the following changes:

(1) The server record construction/destruction has been abstracted out into
its own set of functions to make things easier to get right. These have
been moved into fs/nfs/client.c.

All the code in fs/nfs/client.c has to do with the management of
connections to servers, and doesn't touch superblocks in any way; the
remaining code in fs/nfs/super.c has to do with VFS superblock management.

(2) The sequence of events undertaken by NFS mount is now reordered:

(a) A volume representation (struct nfs_server) is allocated.

(b) A server representation (struct nfs_client) is acquired. This may be
allocated or shared, and is keyed on server address, port and NFS
version.

(c) If allocated, the client representation is initialised. The state
member variable of nfs_client is used to prevent a race during
initialisation from two mounts.

(d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find
the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we
are given the root FH in advance.

(e) The volume FSID is probed for on the root FH.

(f) The volume representation is initialised from the FSINFO record
retrieved on the root FH.

(g) sget() is called to acquire a superblock. This may be allocated or
shared, keyed on client pointer and FSID.

(h) If allocated, the superblock is initialised.

(i) If the superblock is shared, then the new nfs_server record is
discarded.

(j) The root dentry for this mount is looked up from the root FH.

(k) The root dentry for this mount is assigned to the vfsmount.

(3) nfs_readdir_lookup() creates dentries for each of the entries readdir()
returns; this function now attaches disconnected trees from alternate
roots that happen to be discovered attached to a directory being read (in
the same way nfs_lookup() is made to do for lookup ops).

The new d_materialise_unique() function is now used to do this, thus
permitting the whole thing to be done under one set of locks, and thus
avoiding any race between mount and lookup operations on the same
directory.

(4) The client management code uses a new debug facility: NFSDBG_CLIENT which
is set by echoing 1024 to /proc/net/sunrpc/nfs_debug.

(5) Clone mounts are now called xdev mounts.

(6) Use the dentry passed to the statfs() op as the handle for retrieving fs
statistics rather than the root dentry of the superblock (which is now a
dummy).

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:37 +0800
8fa5c000d NFS: Move rpc_ops from nfs_server to nfs_client

Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're
common to all server records of a particular NFS protocol version.

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:35 +0800
509de8111 NFS: Add extra const qualifiers

Add some extra const qualifiers into NFS.

Signed-Off-By: David Howells
Signed-off-by: Trond Myklebust

David Howells
2006-09-23 11:24:34 +0800

04 Aug, 2006

1 commit

ce5101932 NFS: Release dcache_lock in an error path of nfs_path

In one of the error paths of nfs_path, it may return with dcache_lock still
held; fix this by adding and using a new error path Elong_unlock which unlocks
dcache_lock.
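The shape of such a fix, sketched in userspace C. The lock is modelled by a counter, and `build_path()`, `lock_sketch()`, `unlock_sketch()` and `lock_depth` are hypothetical; only the label name `Elong_unlock` is taken from the commit message:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a lock: depth must be back to 0 on every return path. */
int lock_depth;
static void lock_sketch(void)   { lock_depth++; }
static void unlock_sketch(void) { lock_depth--; }

/* Build "/a/b" backwards from the end of buf, walking leaf to root.
 * names[] is ordered root to leaf; *out points at the finished path. */
int build_path(const char *const *names, size_t n,
               char *buf, size_t buflen, char **out)
{
    char *end;

    if (buflen == 0)
        return -ENAMETOOLONG;   /* lock not taken yet, plain return */
    end = buf + buflen - 1;
    *end = '\0';

    lock_sketch();
    for (size_t i = n; i-- > 0; ) {
        size_t len = strlen(names[i]);

        if ((size_t)(end - buf) < len + 1)
            goto Elong_unlock;  /* must not return with the lock held */
        end -= len;
        memcpy(end, names[i], len);
        *--end = '/';
    }
    unlock_sketch();
    *out = end;
    return 0;

Elong_unlock:
    unlock_sketch();
    return -ENAMETOOLONG;
}
```

A dedicated label for "error while locked" keeps the unlock in one place instead of repeating it at each failing check.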

Signed-off-by: Josh Triplett
Signed-off-by: Trond Myklebust
(cherry picked from f4b90b43677fb23297c56802c3056fc304f988d9 commit)

Josh Triplett
2006-08-04 04:55:01 +0800

09 Jun, 2006

2 commits

f7b422b17 NFS: Split fs/nfs/inode.c

As fs/nfs/inode.c is rather large, heterogeneous and unwieldy, the attached
patch splits it up into a number of files:

(*) fs/nfs/inode.c

Strictly inode specific functions.

(*) fs/nfs/super.c

Superblock management functions for NFS and NFS4, normal access, clones
and referrals. The NFS4 superblock functions _could_ move out into a
separate conditionally compiled file, but it's probably not worth it as
there're so many common bits.

(*) fs/nfs/namespace.c

Some namespace-specific functions have been moved here.

(*) fs/nfs/nfs4namespace.c

NFS4-specific namespace functions (this could be merged into the previous
file). This file is conditionally compiled.

(*) fs/nfs/internal.h

Inter-file declarations, plus a few simple utility functions moved from
fs/nfs/inode.c.

Additionally, all the in-.c-file externs have been moved here, and those
files they were moved from now include this file.

For the most part, the functions have not been changed, only some multiplexor
functions have changed significantly.

I've also:

(*) Added some extra banner comments above some functions.

(*) Rearranged the function order within the files to be more logical and
better grouped (IMO), though someone may prefer a different order.

(*) Reduced the number of #ifdefs in .c files.

(*) Added missing __init and __exit directives.

Signed-Off-By: David Howells

David Howells
2006-06-09 21:34:33 +0800
6b97fd3da NFSv4: Follow a referral

Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik
Signed-off-by: Trond Myklebust

Manoj Naik
2006-06-09 21:34:29 +0800