11 Aug, 2010
1 commit
-
Commit d0adde574b8487ef30f69e2d08bba769e4be513f added MNT_STRICTATIME
but it isn't actually used (MS_STRICTATIME clears MNT_RELATIME and
MNT_NOATIME rather than setting any mount flag).
Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro
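A minimal sketch of the flag handling this commit message describes, in which a strict-atime request clears the relatime and noatime mount bits instead of setting a bit of its own. The numeric flag values, the relatime default and the helper name atime_mnt_flags() are assumptions made for this illustration, not the kernel's definitions.

```c
/* Hypothetical bit values, chosen for this sketch only. */
#define MS_NOATIME      0x01u   /* request bits, as passed in by the caller */
#define MS_RELATIME     0x02u
#define MS_STRICTATIME  0x04u

#define MNT_NOATIME     0x10u   /* per-mount state bits */
#define MNT_RELATIME    0x20u

/*
 * Translate requested MS_* bits into per-mount MNT_* bits.
 * Relatime is assumed to be the default. MS_STRICTATIME has no
 * MNT_* bit of its own: it only clears MNT_RELATIME and MNT_NOATIME.
 */
unsigned int atime_mnt_flags(unsigned int ms_flags)
{
	unsigned int mnt_flags = MNT_RELATIME;	/* assumed default */

	if (ms_flags & MS_NOATIME)
		mnt_flags |= MNT_NOATIME;
	if (ms_flags & MS_RELATIME)
		mnt_flags |= MNT_RELATIME;
	if (ms_flags & MS_STRICTATIME)
		mnt_flags &= ~(MNT_RELATIME | MNT_NOATIME);

	return mnt_flags;
}
```

With this shape a separate "strict" mount bit would never be read, which is the situation the commit above cleans up.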
28 Jul, 2010
1 commit
-
This patch adds the list and mask fields needed to support vfsmount marks.
These are the same fields fsnotify needs on an inode. They are not used,
just declared and we note where the cleanup hook should be (the function is
not yet defined).
Signed-off-by: Andreas Gruenbacher
Signed-off-by: Eric Paris
05 Mar, 2010
1 commit
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
init: Open /dev/console from rootfs
mqueue: fix typo "failues" -> "failures"
mqueue: only set error codes if they are really necessary
mqueue: simplify do_open() error handling
mqueue: apply mathematics distributivity on mq_bytes calculation
mqueue: remove unneeded info->messages initialization
mqueue: fix mq_open() file descriptor leak on user-space processes
fix race in d_splice_alias()
set S_DEAD on unlink() and non-directory rename() victims
vfs: add NOFOLLOW flag to umount(2)
get rid of ->mnt_parent in tomoyo/realpath
hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Mirror MS_KERNMOUNT in ->mnt_flags
get rid of useless vfsmount_lock use in put_mnt_ns()
Take vfsmount_lock to fs/internal.h
get rid of insanity with namespace roots in tomoyo
take check for new events in namespace (guts of mounts_poll()) to namespace.c
Don't mess with generic_permission() under ->d_lock in hpfs
sanitize const/signedness for udf
nilfs: sanitize const/signedness in dealing with ->d_name.name
...

Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c
04 Mar, 2010
3 commits
-
Signed-off-by: Al Viro
-
no more users left outside of fs/*.c (and very few outside of
fs/namespace.c, actually)
Signed-off-by: Al Viro
-
The handling of mount flags in set_mnt_shared() got a little tangled
up during previous cleanups, with the following problems:

* MNT_PNODE_MASK is defined as a literal constant when it should be a
  bitwise xor of other MNT_* flags
* set_mnt_shared() clears and then sets MNT_SHARED (part of MNT_PNODE_MASK)
* MNT_PNODE_MASK could use a comment in mount.h
* MNT_PNODE_MASK is a terrible name, change to MNT_SHARED_MASK

This patch fixes these problems.
Signed-off-by: Al Viro
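The cleanup listed above can be sketched as follows: the mask is composed from named flags instead of a literal constant, and it leaves out MNT_SHARED so that set_mnt_shared() does not clear a bit it is about to set. The bit values, the choice of which flags go into the mask, and the struct name are assumptions for illustration only.

```c
/* Hypothetical bit values, for illustration only. */
#define MNT_SHARED      0x1000u
#define MNT_UNBINDABLE  0x2000u

/*
 * Propagation bits that must be cleared when a mount becomes shared.
 * Built from named flags, and deliberately excluding MNT_SHARED.
 */
#define MNT_SHARED_MASK (MNT_UNBINDABLE)

struct vfsmount_sketch {
	unsigned int mnt_flags;
};

void set_mnt_shared(struct vfsmount_sketch *mnt)
{
	mnt->mnt_flags &= ~MNT_SHARED_MASK;	/* drop conflicting propagation bits */
	mnt->mnt_flags |= MNT_SHARED;		/* then mark the mount shared */
}
```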
17 Feb, 2010
1 commit
-
Add __percpu sparse annotations to fs.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
Signed-off-by: Tejun Heo
Cc: "Theodore Ts'o"
Cc: Trond Myklebust
Cc: Alex Elder
Cc: Christoph Hellwig
Cc: Alexander Viro
12 Jun, 2009
2 commits
-
This patch speeds up lmbench lat_mmap test by about another 2% after the
first patch.

Before:
avg = 462.286
std = 5.46106

After:
avg = 453.12
std = 9.58257

(50 runs of each, stddev gives a reasonable confidence)
It does this by introducing mnt_clone_write, which avoids some heavyweight
operations of mnt_want_write if called on a vfsmount which we know already
has a write count; and mnt_want_write_file, which can call mnt_clone_write
if the file is open for write.

After these two patches, mnt_want_write and mnt_drop_write go from 7% on
the profile down to 1.3% (including mnt_clone_write).

[AV: mnt_want_write_file() should take file alone and derive mnt from it;
not only all callers have that form, but that's the only mnt about which
we know that it's already held for write if file is opened for write]

Cc: Dave Hansen
Signed-off-by: Nick Piggin
Signed-off-by: Al Viro
-
This patch speeds up lmbench lat_mmap test by about 8%. lat_mmap is set up
basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it.
A microbenchmark yes, but it exercises some important paths in the mm.

Before:
avg = 501.9
std = 14.7773

After:
avg = 462.286
std = 5.46106

(50 runs of each, stddev gives a reasonable confidence, but there is quite
a bit of variation there still)

It does this by removing the complex per-cpu locking and counter-cache and
replacing them with a percpu counter in struct vfsmount. This makes the code
much simpler, and avoids spinlocks (although the msync is still pretty
costly, unfortunately). It results in about 900 bytes smaller code too. It
does increase the size of a vfsmount, however.

It should also give a speedup on large systems if CPUs are frequently operating
on different mounts (because the existing scheme has to operate on an atomic in
the struct vfsmount when switching between mounts). But I'm most interested in
the single threaded path performance for the moment.

[AV: minor cleanup]
Cc: Dave Hansen
Signed-off-by: Nick Piggin
Signed-off-by: Al Viro
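As a rough userspace illustration of the per-mount percpu counter idea described above: each CPU bumps only its own slot, and the slots are summed only when a read-only transition is requested. The fixed CPU count, the struct layout and all function names are hypothetical; real percpu data, preemption control and memory barriers are left out.

```c
#define NR_CPUS_SKETCH 4	/* hypothetical CPU count */

struct mount_counters {
	int readonly;
	long writers[NR_CPUS_SKETCH];	/* one slot per CPU; a slot may go negative */
};

/* Fast path: touch only the slot of the CPU we are running on. */
void want_write_on(struct mount_counters *m, int cpu)
{
	m->writers[cpu]++;
}

void drop_write_on(struct mount_counters *m, int cpu)
{
	m->writers[cpu]--;
}

/* Slow path: only the sum over all slots is meaningful. */
long count_writers(const struct mount_counters *m)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += m->writers[cpu];
	return sum;
}

/* Refuse the read-only transition while any writer is outstanding. */
int try_make_readonly(struct mount_counters *m)
{
	if (count_writers(m) > 0)
		return -1;
	m->readonly = 1;
	return 0;
}
```

A write taken on one CPU and released on another leaves one slot at +1 and another at -1, which is why individual slots cannot be checked in isolation.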
27 Mar, 2009
1 commit
-
Add support for explicitly requesting full atime updates. This makes it
possible for kernels to default to relatime but still allow userspace to
override it.
Signed-off-by: Matthew Garrett
Signed-off-by: Linus Torvalds
17 Oct, 2008
1 commit
-
Remove a CVS keyword that wasn't updated for a long time from a comment.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Aug, 2008
1 commit
-
Signed-off-by: Al Viro
27 Jul, 2008
1 commit
-
- use kstrdup() instead of kmalloc() + memcpy()
- return NULL if allocating ->mnt_devname failed
- mnt_devname should be const
Signed-off-by: Li Zefan
Acked-by: Cyrill Gorcunov
Signed-off-by: Al Viro
30 Apr, 2008
1 commit
-
Remove the "#ifdef __KERNEL__" tests from unexported header files in
linux/include whose entire contents are wrapped in that preprocessor
test.
Signed-off-by: Robert P. J. Day
Cc: David Woodhouse
Cc: Sam Ravnborg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Apr, 2008
2 commits
-
Add a unique ID to each peer group using the IDR infrastructure. The
identifiers are reused after the peer group dissolves.

The IDR structures are protected by holding namespace_sem for write
while allocating or deallocating IDs.

IDs are allocated when a previously unshared vfsmount becomes the
first member of a peer group. When a new member is added to an
existing group, the ID is copied from one of the old members.

IDs are freed when the last member of a peer group is unshared.
Setting the MNT_SHARED flag on members of a subtree is done as a
separate step, after all the IDs have been allocated. This way an
allocation failure can be cleaned up easily, without affecting the
propagation state.

Based on design sketch by Al Viro.
Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro
-
Add a unique ID to each vfsmount using the IDR infrastructure. The
identifiers are reused after the vfsmount is freed.
Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro
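The "identifiers are reused" behaviour can be pictured with a tiny lowest-free-ID allocator. This is not the kernel's IDR API; the pool size, the struct and the function names are made up for the sketch, and the locking the commits mention is omitted.

```c
#define MAX_IDS 64	/* hypothetical pool size */

struct id_pool {
	unsigned char used[MAX_IDS];	/* 1 if the ID is currently handed out */
};

/* Hand out the lowest free ID, or -1 if the pool is exhausted. */
int id_alloc(struct id_pool *pool)
{
	for (int id = 0; id < MAX_IDS; id++) {
		if (!pool->used[id]) {
			pool->used[id] = 1;
			return id;
		}
	}
	return -1;
}

/* Return an ID to the pool so that a later allocation can reuse it. */
void id_free(struct id_pool *pool, int id)
{
	if (id >= 0 && id < MAX_IDS)
		pool->used[id] = 0;
}
```

Freeing an ID and allocating again returns the same number, so an ID identifies a mount or peer group only while that object is alive.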
22 Apr, 2008
1 commit
-
Signed-off-by: Al Viro
19 Apr, 2008
3 commits
-
Originally from: Herbert Poetzl
This is the core of the read-only bind mount patch set.
Note that this does _not_ add a "ro" option directly to the bind mount
operation. If you require such a mount, you must first do the bind, then
follow it up with a 'mount -o remount,ro' operation:

If you wish to have a r/o bind mount of /foo on /bar:
mount --bind /foo /bar
mount -o remount,ro /bar

Acked-by: Al Viro
Signed-off-by: Christoph Hellwig
Signed-off-by: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
This is the real meat of the entire series. It actually
implements the tracking of the number of writers to a mount.
However, it causes scalability problems because there can be
hundreds of cpus doing open()/close() on files on the same mnt at
the same time. Even an atomic_t in the mnt has massive scaling
problems because the cacheline gets so terribly contended.

This uses a statically-allocated percpu variable. All want/drop
operations are local to a cpu as long as that cpu operates on the same
mount, and there are no writer count imbalances. Writer count
imbalances happen when a write is taken on one cpu, and released
on another, like when an open/close pair is performed on two

Upon a remount,ro request, all of the data from the percpu
variables is collected (expensive, but very rare) and we determine
if there are any outstanding writers to the mount.

I've written a little benchmark to sit in a loop for a couple of
seconds in several cpus in parallel doing open/write/close loops.

http://sr71.net/~dave/linux/openbench.c
The code in here is a worst-possible case for this patch. It
does opens on a _pair_ of files in two different mounts in parallel.
This should cause my code to lose its "operate on the same mount"
optimization completely. This worst-case scenario causes a 3%
degradation in the benchmark.

I could probably get rid of even this 3%, but it would be more
complex than what I have here, and I think this is getting into
acceptable territory. In practice, I expect writing more than 3
bytes to a file, as well as disk I/O to mask any effects that this
has.

(To get rid of that 3%, we could have an #defined number of mounts
in the percpu variable. So, instead of a CPU getting to operate only
on percpu data when it accesses only one mount, it could stay on
percpu data when it only accesses N or fewer mounts.)

[AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount
Acked-by: Al Viro
Signed-off-by: Christoph Hellwig
Signed-off-by: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
This patch adds two functions, mnt_want_write() and mnt_drop_write(). These are
used like a lock pair around any fs operations that might cause a write to the
filesystem.

Before these can become useful, we must first cover each place in the VFS
where writes are performed with a want/drop pair. When that is complete, we
can actually introduce code that will safely check the counts before allowing
r/w->r/o transitions to occur.
Acked-by: Serge Hallyn
Acked-by: Al Viro
Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Dave Hansen
Signed-off-by: Al Viro
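To make the "lock pair" usage concrete, here is a small userspace sketch of a want/drop pair wrapped around a modifying operation, together with a read-only transition that checks the writer count. The struct, the single non-atomic counter and every function name are hypothetical stand-ins and not the kernel's mnt_want_write() or mnt_drop_write().

```c
#include <errno.h>

struct wmount {
	int readonly;
	int writers;	/* outstanding want_write() calls */
	int data;	/* stands in for filesystem state */
};

/* Ask for write access; fails on a read-only mount. */
int want_write(struct wmount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Give write access back; must pair with a successful want_write(). */
void drop_write(struct wmount *m)
{
	m->writers--;
}

/* A modifying operation, bracketed by the want/drop pair. */
int set_data(struct wmount *m, int value)
{
	int err = want_write(m);

	if (err)
		return err;
	m->data = value;
	drop_write(m);
	return 0;
}

/* The r/w to r/o transition is refused while writers are outstanding. */
int remount_readonly(struct wmount *m)
{
	if (m->writers > 0)
		return -EBUSY;
	m->readonly = 1;
	return 0;
}
```

The count is only trustworthy once every write path is bracketed this way, which is the ordering constraint the commit message spells out.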
28 Mar, 2008
2 commits
-
... and take it out of ->umount_begin() instances. Call with all locks
already taken (by do_umount()) and leave calling release_mounts() to
caller (it will do release_mounts() anyway, so we can just put into
the same list).
Signed-off-by: Al Viro
-
make propagate_mount_busy() exclude references from the vfsmounts
that had been isolated by umount_tree() and are just waiting for
release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint.
Signed-off-by: Al Viro
09 May, 2007
1 commit
-
Fix the misspellings of "propogate", "writting" and (oh, the shame
:-) "kenrel" in the source tree.
Signed-off-by: Robert P. J. Day
Signed-off-by: Adrian Bunk
12 Feb, 2007
1 commit
-
I noticed cache misses in touch_atime() that can be avoided if we keep
mnt_count & mnt_expiry_mark in a different cache line than mnt_flags
(mostly read).

mnt_count & mnt_expiry_mark are modified each time a file is opened/closed
in a file system.

touch_atime() is called each time a file is read, and generally needs to
read mnt_flags.

Other fields of struct vfsmount are mostly read so I chose to move
mnt_count & mnt_expiry_mark at the end of struct vfsmount. And adding a
comment so that nobody tries to re-arrange fields to fill the holes :)

On 64bits platforms, the new offsetof(mnt_count) is 0xC0
On 32bits platforms, it is 0x60, so I did not add a
____cacheline_aligned_in_smp because it would have a too big impact on the
size of this object (in particular if CONFIG_X86_L1_CACHE_SHIFT=7)
Signed-off-by: Eric Dumazet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Dec, 2006
1 commit
-
Add "relatime" (relative atime) support. Relative atime only updates the
atime if the previous atime is older than the mtime or ctime. Like
noatime, but useful for applications like mutt that need to know when a
file has been read since it was last modified.

A corresponding patch against mount(8) is available at
http://userweb.kernel.org/~akpm/mount-relative-atime.txt
Signed-off-by: Valerie Henson
Cc: Mark Fasheh
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Karel Zak
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
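The relatime rule described above fits in a single predicate. This is a sketch with plain integer timestamps; the struct and function names are hypothetical, and treating equal timestamps as needing an update is an assumption of this example rather than something the commit message states.

```c
/* Timestamps in seconds; a hypothetical stand-in for inode times. */
struct file_times {
	long atime;	/* last access */
	long mtime;	/* last data modification */
	long ctime;	/* last status change */
};

/*
 * Relative atime: update atime only if the stored atime is not newer
 * than mtime or ctime, i.e. the file changed since it was last read.
 * Equal timestamps count as "needs update" (assumption of this sketch).
 */
int relatime_needs_update(const struct file_times *t)
{
	return t->atime <= t->mtime || t->atime <= t->ctime;
}
```

A mail reader can still compare atime with mtime to tell whether a mailbox has been read since new mail arrived, while repeated reads of an unchanged file cause no further atime writes.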
09 Dec, 2006
1 commit
-
Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with
other namespaces being developed for the containers: pid, uts, ipc, etc.
'namespace' variables and attributes are also renamed to 'mnt_ns'
Signed-off-by: Kirill Korotaev
Signed-off-by: Cedric Le Goater
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Jun, 2006
1 commit
-
Conflicts:
fs/nfs/inode.c
fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'
23 Jun, 2006
1 commit
-
Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.
Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Jun, 2006
2 commits
-
Allow a submount to be marked as being 'shrinkable' by means of the
vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which
attempts to recursively unmount these submounts.
Signed-off-by: Trond Myklebust
-
do_kern_mount() does not allow the kernel to use private mount interfaces
without exposing the same interfaces to userland. The problem is that the
filesystem is referenced by name, thus meaning that it and its mount
interface must be registered in the global filesystem list.

vfs_kern_mount() passes the struct file_system_type as an explicit
parameter in order to overcome this limitation.
Signed-off-by: Trond Myklebust
11 Jan, 2006
1 commit
-
Turn noatime and nodiratime into per-mount instead of per-sb flags.
After all the preparations this is a rather trivial patch. The mount code
needs to treat the two options as per-mount instead of per-superblock, and
touch_atime needs to be changed to check the new MNT_ flags in addition to
the MS_ flags that are kept for filesystems that are always
noatime/nodiratime but not user settable anymore. Besides that core code
only nfs needed an update because it's leaving atime updates to the server
and thus sets the S_NOATIME flag on every inode, but needs to know whether
it's a real noatime mount for a getattr optimization.

While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were
only used by touch_atime.
Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
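A sketch of the check this commit message describes for touch_atime: the per-mount MNT_ flags are consulted in addition to the per-superblock MS_ flags, so two mounts of the same filesystem can behave differently. The flag values, struct layouts and the atime_enabled() helper are assumptions for illustration only.

```c
/* Hypothetical bit values, for illustration only. */
#define MS_NOATIME      0x01u	/* per-superblock: filesystem never updates atime */
#define MS_NODIRATIME   0x02u
#define MNT_NOATIME     0x10u	/* per-mount: chosen by the user for this mount */
#define MNT_NODIRATIME  0x20u

struct sb_sketch {
	unsigned int s_flags;
};

struct mnt_sketch {
	unsigned int mnt_flags;
	const struct sb_sketch *sb;
};

/* Return 1 if an access through this mount should update atime. */
int atime_enabled(const struct mnt_sketch *mnt, int is_dir)
{
	if (mnt->mnt_flags & MNT_NOATIME)
		return 0;
	if (is_dir && (mnt->mnt_flags & MNT_NODIRATIME))
		return 0;
	if (mnt->sb->s_flags & MS_NOATIME)
		return 0;
	if (is_dir && (mnt->sb->s_flags & MS_NODIRATIME))
		return 0;
	return 1;
}
```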
09 Jan, 2006
1 commit
-
Small cleanups in shared mounts code.
Signed-off-by: Miklos Szeredi
Cc: Ram Pai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Nov, 2005
5 commits
-
An unbindable mount does not forward or receive propagation. Also,
an unbindable mount disallows bind mounts. The semantics are as follows.

Bind semantics:
It is invalid to bind mount an unbindable mount.

Move semantics:
It is invalid to move an unbindable mount under a shared mount.

Clone-namespace semantics:
If a mount is unbindable in the parent namespace, the corresponding
cloned mount in the child namespace becomes unbindable too. Note:
there is a subtle difference: unbindable mounts cannot be bind mounted
but can be cloned during clone-namespace.
Signed-off-by: Ram Pai
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
-
A slave mount always has a master mount from which it receives
mount/umount events. Unlike a shared mount, the event propagation does not
flow from the slave mount to the master.
Signed-off-by: Ram Pai
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
-
This creates shared mounts. A shared mount when bind-mounted to some
mountpoint, propagates mount/umount events to each other. All the
shared mounts that propagate events to each other belong to the same
peer-group.
Signed-off-by: Ram Pai
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
-
A private mount does not forward or receive propagation. This patch
provides the user the ability to convert any mount to private.
Signed-off-by: Ram Pai
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
-
The way we currently deal with quota and process accounting that might
keep vfsmount busy at umount time is inherently broken; we try to turn
them off just in case (not quite correctly, at that) and

a) pray umount doesn't fail (otherwise they'll stay turned off)
b) pray nobody does anything funny just as we turn quota off

Moreover, LSM provides hooks for doing the same sort of broken logic.
The proper way to deal with that is to introduce the second kind of
reference to vfsmount. Semantics:

- when the last normal reference is dropped, all special ones are
  converted to normal ones and if there had been any, cleanup is done.
- normal reference can be cloned into a special one
- special reference can be converted to normal one; that's a no-op if
  we'd already passed the point of no return (i.e. mntput() had
  converted special references to normal and started cleanup).

The way it works: e.g. starting process accounting converts the vfsmount
reference pinned by the opened file into special one and turns it back
to normal when it gets shut down; acct_auto_close() is done when no
normal references are left. That way it does *not* obstruct umount(2)
and it silently gets turned off when the last normal reference to
vfsmount is gone. Which is exactly what we want...

The same should be done by LSM module that holds some internal
references to vfsmount and wants to shut them down on umount - it should
make them special and security_sb_umount_close() will be called exactly
when the last normal reference to vfsmount is gone.

quota handling is even simpler - we don't use normal file IO anymore, so
there's no need to hold vfsmounts at all. DQUOT_OFF() is done from
deactivate_super(), where it really belongs.
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
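The two kinds of reference described above can be modelled with a pair of counters. This is a simplified single-threaded sketch: the struct, the function names and the way the cleanup hook releases the converted references are assumptions made for illustration, and locking is omitted entirely.

```c
struct mount_refs {
	int normal;		/* ordinary references; they keep the mount busy */
	int pinned;		/* special references; they do not block umount */
	int cleanup_runs;	/* how many times the cleanup hook has run */
	int freed;		/* set once the mount is really gone */
};

/* Clone a normal reference into a special one (caller then drops its normal one). */
void ref_pin(struct mount_refs *m)
{
	m->pinned++;
}

/* Convert a special reference back to normal; a no-op once cleanup has begun. */
void ref_unpin(struct mount_refs *m)
{
	if (m->pinned) {
		m->pinned--;
		m->normal++;
	}
}

/* Drop a normal reference. */
void ref_put(struct mount_refs *m)
{
	if (--m->normal > 0)
		return;

	if (m->pinned) {
		/* Last normal reference gone: convert special ones and clean up. */
		int converted = m->pinned;

		m->normal += converted;
		m->pinned = 0;
		m->cleanup_runs++;
		/* The cleanup hook shuts its users down, dropping each converted ref. */
		while (converted--)
			ref_put(m);
		return;
	}
	m->freed = 1;
}
```

In this model a pinned reference alone never keeps the mount alive: once the last normal reference is dropped, cleanup runs and the mount is freed.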
13 Jul, 2005
1 commit
-
kernel/power/disk.c needs a declaration of name_to_dev_t() in scope. mount.h
seems like an appropriate choice.
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Jul, 2005
2 commits
-
This patch renames _mntput() to something a little more descriptive:
mntput_no_expire().
Signed-off-by: Miklos Szeredi
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch renames vfsmount->mnt_fslink to something a little more
descriptive: vfsmount->mnt_expire.
Signed-off-by: Mike Waychison
Signed-off-by: Miklos Szeredi
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds