20 Jul, 2007
3 commits
-
Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Stackable file systems, among others, frequently need to lookup paths or
path components starting from an arbitrary point in the namespace
(identified by a dentry and a vfsmount). Currently, such file systems use
lookup_one_len, which is frowned upon [1] as it does not pass the lookup
intent along; not passing a lookup intent, for example, can trigger BUG_ON's
when stacking on top of NFSv4.The first patch introduces a new lookup function to allow lookup starting
from an arbitrary point in the namespace. This approach has been suggested
by Christoph Hellwig [2].The second patch changes sunrpc to use vfs_path_lookup.
The third patch changes nfsctl.c to use vfs_path_lookup.
The fourth patch marks link_path_walk static.
The fifth, and last patch, unexports path_walk because it is no longer
unnecessary to call it directly, and using the new vfs_path_lookup is
cleaner.For example, the following snippet of code, looks up "some/path/component"
in a directory pointed to by parent_{dentry,vfsmnt}:err = vfs_path_lookup(parent_dentry, parent_vfsmnt,
"some/path/component", 0, &nd);
if (!err) {
/* exits */...
/* once done, release the references */
path_release(&nd);
} else if (err == -ENOENT) {
/* doesn't exist */
} else {
/* other error */
}VFS functions such as lookup_create can be used on the nameidata structure
to pass the create intent to the file system.Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Jul, 2007
1 commit
-
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.Signed-off-by: Satyam Sharma
Cc: Al Viro
Acked-by: Serge E. Hallyn
Signed-off-by: Linus Torvalds
11 May, 2007
1 commit
-
Handle the edge cases for POSIX message queue auditing. Collect inode
info when opening an existing mq, and for send/receive operations. Remove
audit_inode_update() as it has really evolved into the equivalent of
audit_inode().Signed-off-by: Amy Griffis
Signed-off-by: Al Viro
10 May, 2007
2 commits
-
Since path_walk sets the total_link_count to 0 and calls link_path_walk, we
can just call path_walk directly.Signed-off-by: Josef 'Jeff' Sipek
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Josef 'Jeff' Sipek
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 May, 2007
2 commits
-
Remove includes of where it is not used/needed.
Suggested by Al Viro.Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We don't have a routine called namei() anymore since at least 2.3.x, and
the comment is just totally out of sync with the current lookup logic.Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 May, 2007
1 commit
-
Ensure pages are uptodate after returning from read_cache_page, which allows
us to cut out most of the filesystem-internal PageUptodate calls.I didn't have a great look down the call chains, but this appears to fixes 7
possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in
ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in
block2mtd. All depending on whether the filler is async and/or can return
with a !uptodate page.Signed-off-by: Nick Piggin
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Apr, 2007
1 commit
-
Prevent permission checking from being performed when the kernel wants to
unconditionally remove a sysfs group, by introducing an kernel-only variant
of lookup_one_len(), lookup_one_len_kern().Additionally, as sysfs_remove_group() does not check the return value of
the lookup before using it, a BUG_ON has been added to pinpoint the cause
of any problems potentially caused by this (and as a form of annotation).Signed-off-by: James Morris
Cc: Nagendra Singh Tomar
Cc: Tejun Heo
Cc: Stephen Smalley
Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: Greg Kroah-Hartman
17 Feb, 2007
1 commit
-
If prepare_write or commit_write return AOP_TRUNCATED_PAGE we jump to
"retry" label and than if find_or_create_page() failed function return
incorrect error code.Signed-off-by: Dmitriy Monakhov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Feb, 2007
1 commit
-
Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Dec, 2006
2 commits
-
This patch changes struct file to use struct path instead of having
independent pointers to struct dentry and struct vfsmount, and converts all
users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.Additionally, it adds two #define's to make the transition easier for users of
the f_dentry and f_vfsmnt.Signed-off-by: Josef "Jeff" Sipek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Moved struct path from fs/namei.c to include/linux/namei.h. This allows many
places in the VFS, as well as any stackable filesystem to easily keep track of
dentry-vfsmount pairs.Signed-off-by: Josef "Jeff" Sipek
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Dec, 2006
2 commits
-
d_count check after dget() is always true.
Signed-off-by: Vasily Averin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make access(X_OK) take the "noexec" mount option into account.
Signed-off-by: Stas Sergeev
Cc: Jakub Jelinek
Cc: Arjan van de Ven
Cc: Alan Cox
Cc: Hugh Dickins
Cc: Ulrich Drepper
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Oct, 2006
2 commits
-
The code around vfs_create() in open_namei() is getting a bit too complex.
Right now, there is at least the reference count on the dentry, and the
i_mutex to worry about. Soon, we'll also have mnt_writecount.So, break the vfs_create() call out of open_namei(), and into a helper
function. This duplicates the call to may_open(), but that isn't such a bad
thing since the arguments (acc_mode and flag) were being heavily massaged
anyway.Signed-off-by: Dave Hansen
Acked-by: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We're shortly going to be adding a bunch more permission checks in these
functions. That requires adding either a bunch of new if() conditions, or
some gotos. This patch collapses existing if()s and uses gotos instead to
prepare for the upcoming changes.Signed-off-by: Dave Hansen
Acked-by: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Sep, 2006
1 commit
-
Replace current->fs by fs helper variable to reduce some indirection
overhead and (at least at the moment, before the current_thread_info() %gs
PDA improvement is available) get rid of more costly current references.
Reduces fs/namei.o from 37786 to 37082 Bytes (704 Bytes saved).[akpm@osdl.org: cleanup]
Signed-off-by: Andreas Mohr
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Sep, 2006
1 commit
-
For a long time now I have had a problem with not being able to return a
lookup failure on an existsing directory. In autofs this corresponds to a
mount failure on a autofs managed mount entry that is browsable (and so the
mount point directory exists).While this problem has been present for a long time I've avoided resolving
it because it was not very visible. But now that autofs v5 has "mount and
expire on demand" of nested multiple mounts, such as is found when mounting
an export list from a server, solving the problem cannot be avoided any
longer.I've tried very hard to find a way to do this entirely within the autofs4
module but have not been able to find a satisfactory way to achieve it.So, I need to propose a change to the VFS.
Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Sep, 2006
1 commit
-
Some file systems want to manually d_move() the dentries involved in a
rename. We can do this by making use of the FS_ODD_RENAME flag if we just
have nfs_rename() unconditionally do the d_move(). While there, we rename
the flag to be more descriptive.OCFS2 uses this to protect that part of the rename operation with a cluster
lock.Signed-off-by: Mark Fasheh
Cc: Trond Myklebust
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
25 Aug, 2006
2 commits
-
Currently, the access() call will return incorrect information on NFS if
there exists an ACL that grants execute access to the user on a regular
file. The reason the information is incorrect is that the VFS overrides
this execute access in open_exec() by checking (inode->i_mode & 0111).This patch propagates the VFS execute bit check back into the generic
permission() call.Signed-off-by: Trond Myklebust
(cherry picked from 64cbae98848c4c99851cb0a405f0b4982cd76c1e commit) -
I'm trying to speeding up mkdir(2) for network file systems. A typical
mkdir(2) calls two inode_operations: lookup and mkdir. The lookup
operation would fail with ENOENT in common case. I think it is unnecessary
because the subsequent mkdir operation can check it. In case of creat(2),
lookup operation is called with the LOOKUP_CREATE flag, so individual
filesystem can omit real lookup. e.g. nfs_lookup().Here is a sample patch which uses LOOKUP_CREATE and O_EXCL on mkdir,
symlink and mknod. This uses the gadget for creat(2).And here is the result of a benchmark on NFSv3.
mkdir(2) 10,000 times:
original 50.5 sec
patched 29.0 secSigned-off-by: ASANO Masahiro
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust
(cherry picked from fab7bf44449b29f9d5572a5dd8adcf7c91d5bf0f commit)
03 Aug, 2006
3 commits
-
Signed-off-by: Al Viro
-
When an object is created via a symlink into an audited directory, audit misses
the event due to not having collected the inode data for the directory. Modify
__audit_inode_child() to copy the parent inode data if a parent wasn't found in
audit_names[].Signed-off-by: Amy Griffis
Signed-off-by: Al Viro -
When the specified path is an existing file or when it is a symlink, audit
collects the wrong inode number, which causes it to miss the open() event.
Adding a second hook to the open() path fixes this.Also add audit_copy_inode() to consolidate some code.
Signed-off-by: Amy Griffis
Signed-off-by: Al Viro
15 Jul, 2006
1 commit
-
2.6.16 leaks like hell. While testing, I found massive leakage
(reproduced in openvz) in:*filp
*size-4096And 1 object leaks in
*size-32
*size-64
*size-128It is the fix for the first one. filp leaks in the bowels of namei.c.
Seems, size-4096 is file table leaking in expand_fdtables.
I have no idea what are the rest and why they show only accompanying
another leaks. Some debugging structs?[akpm@osdl.org, Trond: remove the IS_ERR() check]
Signed-off-by: Alexey Kuznetsov
Cc: Kirill Korotaev
Cc:
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Jul, 2006
1 commit
-
Teach special (recursive) locking code to the lock validator. Has no effect
on non-lockdep kernels.Signed-off-by: Ingo Molnar
Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Jun, 2006
1 commit
-
When the linkat() syscall was added the flag parameter was added in the
last minute but it wasn't used so far. The following patch should change
that. My tests show that this is all that's needed.If OLDNAME is a symlink setting the flag causes linkat to follow the
symlink and create a hardlink with the target. This is actually the
behavior POSIX demands for link() as well but Linux wisely does not do
this. With this flag (which will most likely be in the next POSIX
revision) the programmer can choose the behavior, defaulting to the safe
variant. As a side effect it is now possible to implement a
POSIX-compliant link(2) function for those who are interested.touch file
ln -s file symlinklinkat(fd, "symlink", fd, "newlink", 0)
-> newlink is hardlink of symlinklinkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW)
-> newlink is hardlink of fileThe value of AT_SYMLINK_FOLLOW is determined by the definition we already
use in glibc.Signed-off-by: Ulrich Drepper
Acked-by: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Jun, 2006
1 commit
-
Add read_mapping_page() which is used for callers that pass
mapping->a_ops->readpage as the filler for read_cache_page. This removes
some duplication from filesystem code.Signed-off-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Jun, 2006
1 commit
-
When an audit event involves changes to a directory entry, include
a PATH record for the directory itself. A few other notable changes:- fixed audit_inode_child() hooks in fsnotify_move()
- removed unused flags arg from audit_inode()
- added audit log routines for logging a portion of a stringHere's some sample output.
before patch:
type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149821605.320:26): cwd="/root"
type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0after patch:
type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149822032.332:24): cwd="/root"
type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0
type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0Signed-off-by: Amy Griffis
Signed-off-by: Al Viro
06 Jun, 2006
1 commit
-
From: Trond Myklebust
We're presently running lock_kernel() under fs_lock via nfs's ->permission
handler. That's a ranking bug and sometimes a sleep-in-spinlock bug. This
problem was introduced in the openat() patchset.We should not need to hold the current->fs->lock for a codepath that doesn't
use current->fs.[vsu@altlinux.ru: fix error path]
Signed-off-by: Trond Myklebust
Cc: Al Viro
Signed-off-by: Sergey Vlasov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Apr, 2006
1 commit
-
As announced, lookup_hash() can now become static.
Signed-off-by: Adrian Bunk
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Mar, 2006
1 commit
-
In order to be able to trigger a mount using the follow_link inode method the
nameidata struct that is passed in needs to have the vfsmount of the autofs
trigger not its parent.During a path walk if an autofs trigger is mounted on a dentry, when the
follow_link method is called, the nameidata struct contains the vfsmount and
mountpoint dentry of the parent mount while the dentry that is passed in is
the root of the autofs trigger mount. I believe it is impossible to get the
vfsmount of the trigger mount, within the follow_link method, when only the
parent vfsmount and the root dentry of the trigger mount are known.This patch updates the nameidata struct on entry to __do_follow_link if it
detects that it is out of date. It moves the path_to_nameidata to above
__do_follow_link to facilitate calling it from there. The dput_path is moved
as well as that seemed sensible. No changes are made to these two functions.Signed-off-by: Ian Kent
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Mar, 2006
3 commits
-
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
[PATCH] fix audit_init failure path
[PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
[PATCH] sem2mutex: audit_netlink_sem
[PATCH] simplify audit_free() locking
[PATCH] Fix audit operators
[PATCH] promiscuous mode
[PATCH] Add tty to syscall audit records
[PATCH] add/remove rule update
[PATCH] audit string fields interface + consumer
[PATCH] SE Linux audit events
[PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
[PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
[PATCH] Fix IA64 success/failure indication in syscall auditing.
[PATCH] Miscellaneous bug and warning fixes
[PATCH] Capture selinux subject/object context information.
[PATCH] Exclude messages by message type
[PATCH] Collect more inode information during syscall processing.
[PATCH] Pass dentry, not just name, in fsnotify creation hooks.
[PATCH] Define new range of userspace messages.
[PATCH] Filter rule comparators
...Fixed trivial conflict in security/selinux/hooks.c
-
As prepare_write, commit_write and readpage are allowed to return
AOP_TRUNCATE_PAGE, page_symlink should respond to them.Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It seems there is error check missing in open_namei for errors returned
through intent.open.file (from lookup_instantiate_filp).If there is plain open performed, then such a check done inside
__path_lookup_intent_open called from path_lookup_open(), but when the open
is performed with O_CREAT flag set, then __path_lookup_intent_open is only
called with LOOKUP_PARENT set where no file opening can occur yet.Later on lookup_hash is called where exact opening might take place and
intent.open.file may be filled. If it is filled with error value of some
sort, then we get kernel attempting to dereference this error value as
address (and corresponding oops) in nameidata_to_filp() called from
filp_open().While this is relatively simple to workaround in ->lookup() method by just
checking lookup_instantiate_filp() return value and returning error as
needed, this is not so easy in ->d_revalidate(), where we can only return
"yes, dentry is valid" or "no, dentry is invalid, perform full lookup
again", and just returning 0 on error would cause extra lookup (with
potential extra costly RPCs).So in short, I believe that there should be no difference in error handling
for opening a file and creating a file in open_namei() and propose this
simple patch as a solution.Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Mar, 2006
1 commit
-
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Mar, 2006
1 commit
-
This patch augments the collection of inode info during syscall
processing. It represents part of the functionality that was provided
by the auditfs patch included in RHEL4.Specifically, it:
- Collects information for target inodes created or removed during
syscalls. Previous code only collects information for the target
inode's parent.- Adds the audit_inode() hook to syscalls that operate on a file
descriptor (e.g. fchown), enabling audit to do inode filtering for
these calls.- Modifies filtering code to check audit context for either an inode #
or a parent inode # matching a given rule.- Modifies logging to provide inode # for both parent and child.
- Protect debug info from NULL audit_names.name.
[AV: folded a later typo fix from the same author]
Signed-off-by: Amy Griffis
Signed-off-by: David Woodhouse
Signed-off-by: Al Viro