07 Jan, 2012

1 commit


04 Jan, 2012

3 commits


28 Oct, 2011

1 commit

  • In setlease, we use i_writecount to decide whether we can give out a
    read lease.

    In open, we break leases before incrementing i_writecount.

    There is therefore a window between the break lease and the i_writecount
    increment when setlease could add a new read lease.

    This would leave us with a simultaneous write open and read lease, which
    shouldn't happen.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Christoph Hellwig

    J. Bruce Fields
     

27 Jul, 2011

1 commit

  • The kludge in question is undocumented and doesn't work for 32bit
    binaries on amd64, sparc64 and s390. Passing (mode_t)-1 as
    mode had (since 0.99.14v and contrary to behaviour of any
    other Unix, prescriptions of POSIX, SuS and our own manpages)
    was kinda-sorta no-op. Note that any software relying on
    that (and looking for examples shows none) would be visibly
    broken on sparc64, where practically all userland is built
    32bit. No such complaints noticed...

    Signed-off-by: Al Viro

    Al Viro
     

23 Jul, 2011

1 commit


21 Mar, 2011

1 commit


17 Mar, 2011

1 commit

  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
    AppArmor: kill unused macros in lsm.c
    AppArmor: cleanup generated files correctly
    KEYS: Add an iovec version of KEYCTL_INSTANTIATE
    KEYS: Add a new keyctl op to reject a key with a specified error code
    KEYS: Add a key type op to permit the key description to be vetted
    KEYS: Add an RCU payload dereference macro
    AppArmor: Cleanup make file to remove cruft and make it easier to read
    SELinux: implement the new sb_remount LSM hook
    LSM: Pass -o remount options to the LSM
    SELinux: Compute SID for the newly created socket
    SELinux: Socket retains creator role and MLS attribute
    SELinux: Auto-generate security_is_socket_class
    TOMOYO: Fix memory leak upon file open.
    Revert "selinux: simplify ioctl checking"
    selinux: drop unused packet flow permissions
    selinux: Fix packet forwarding checks on postrouting
    selinux: Fix wrong checks for selinux_policycap_netpeer
    selinux: Fix check for xfrm selinux context algorithm
    ima: remove unnecessary call to ima_must_measure
    IMA: remove IMA imbalance checking
    ...

    Linus Torvalds
     

16 Mar, 2011

1 commit


15 Mar, 2011

2 commits

  • For readlinkat() we simply allow empty pathname; it will fail unless
    we have dfd equal to O_PATH-opened symlink, so we are outside of
    POSIX scope here. For fchownat() and fstatat() we allow AT_EMPTY_PATH;
    let the caller explicitly ask for such behaviour.

    Signed-off-by: Al Viro

    Al Viro
     
  • New flag for open(2) - O_PATH. Semantics:
    * pathname is resolved, but the file itself is _NOT_ opened
    as far as filesystem is concerned.
    * almost all operations on the resulting descriptors shall
    fail with -EBADF. Exceptions are:
    1) operations on descriptors themselves (i.e.
    close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
    fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
    fcntl(fd, F_SETFD, ...))
    2) fcntl(fd, F_GETFL), for a common non-destructive way to
    check if descriptor is open
    3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
    points of pathname resolution
    * closing such descriptor does *NOT* affect dnotify or
    posix locks.
    * permissions are checked as usual along the way to file;
    no permission checks are applied to the file itself. Of course,
    giving such thing to syscall will result in permission checks (at
    the moment it means checking that starting point of ....at() is
    a directory and caller has exec permissions on it).

    fget() and fget_light() return NULL on such descriptors; use of
    fget_raw() and fget_raw_light() is needed to get them. That protects
    existing code from dealing with those things.

    There are two things still missing (they come in the next commits):
    one is handling of symlinks (right now we refuse to open them that
    way; see the next commit for semantics related to those) and another
    is descriptor passing via SCM_RIGHTS datagrams.

    Signed-off-by: Al Viro

    Al Viro
     

14 Mar, 2011

2 commits

  • new function: file_open_root(dentry, mnt, name, flags) opens the file
    vfs_path_lookup would arrive to.

    Note that name can be empty; in that case the usual requirement that
    dentry should be a directory is lifted.

    open-coded equivalents switched to it, may_open() got down exactly
    one caller and became static.

    Signed-off-by: Al Viro

    Al Viro
     
  • take calculation of open_flags by open(2) arguments into new helper
    in fs/open.c, move filp_open() over there, have it and do_sys_open()
    use that helper, switch exec.c callers of do_filp_open() to explicit
    (and constant) struct open_flags.

    Signed-off-by: Al Viro

    Al Viro
     

10 Mar, 2011

1 commit

  • In the fallocate path the kernel doesn't check for the immutable/append
    flag. It's possible to have a race condition in this scenario: an
    application open a file in read/write and it does something, meanwhile
    root set the immutable flag on the file, the application at that point
    can call fallocate with success. In addition, we don't allow to do any
    unreserve operation on an append only file but only the reserve one.

    Signed-off-by: Marco Stornelli
    Signed-off-by: Al Viro

    Marco Stornelli
     

08 Mar, 2011

1 commit


12 Feb, 2011

1 commit

  • In commit 31e6b01f4183 ("fs: rcu-walk for path lookup") we started doing
    path lookup using RCU, which then falls back to a careful non-RCU lookup
    in case of problems (LOOKUP_REVAL). So do_filp_open() has this "re-do
    the lookup carefully" looping case.

    However, that means that we must not release the open-intent file data
    if we are going to loop around and use it once more!

    Fix this by moving the release of the open-intent data to the function
    that allocates it (do_filp_open() itself) rather than the helper
    functions that can get called multiple times (finish_open() and
    do_last()). This makes the logic for the lifetime of that field much
    more obvious, and avoids the possible double free.

    Reported-by: J. R. Okajima
    Acked-by: Al Viro
    Cc: Nick Piggin
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

10 Feb, 2011

1 commit

  • ima_counts_get() updated the readcount and invalidated the PCR,
    as necessary. Only update the i_readcount in the VFS layer.
    Move the PCR invalidation checks to ima_file_check(), where it
    belongs.

    Maintaining the i_readcount in the VFS layer, will allow other
    subsystems to use i_readcount.

    Signed-off-by: Mimi Zohar
    Acked-by: Eric Paris

    Mimi Zohar
     

17 Jan, 2011

1 commit

  • Currently all filesystems except XFS implement fallocate asynchronously,
    while XFS forced a commit. Both of these are suboptimal - in case of O_SYNC
    I/O we really want our allocation on disk, especially for the !KEEP_SIZE
    case where we actually grow the file with user-visible zeroes. On the
    other hand always commiting the transaction is a bad idea for fast-path
    uses of fallocate like for example in recent Samba versions. Given
    that block allocation is a data plane operation anyway change it from
    an inode operation to a file operation so that we have the file structure
    available that lets us check for O_SYNC.

    This also includes moving the code around for a few of the filesystems,
    and remove the already unnedded S_ISDIR checks given that we only wire
    up fallocate for regular files.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

13 Jan, 2011

1 commit

  • Hole punching has already been implemented by XFS and OCFS2, and has the
    potential to be implemented on both BTRFS and EXT4 so we need a generic way to
    get to this feature. The simplest way in my mind is to add FALLOC_FL_PUNCH_HOLE
    to fallocate() since it already looks like the normal fallocate() operation.
    I've tested this patch with XFS and BTRFS to make sure XFS did what it's
    supposed to do and that BTRFS failed like it was supposed to. Thank you,

    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     

29 Oct, 2010

1 commit

  • nameidata_to_filp() drops nd->path or transfers it to opened
    file. In the former case it's a Bad Idea(tm) to do mnt_drop_write()
    on nd->path.mnt, since we might race with umount and vfsmount in
    question might be gone already.

    Fix: don't drop it, then... IOW, have nameidata_to_filp() grab nd->path
    in case it transfers it to file and do path_drop() in callers. After
    they are through with accessing nd->path...

    Reported-by: Miklos Szeredi
    Signed-off-by: Al Viro

    Al Viro
     

18 Aug, 2010

1 commit

  • fs: cleanup files_lock locking

    Lock tty_files with a new spinlock, tty_files_lock; provide helpers to
    manipulate the per-sb files list; unexport the files_lock spinlock.

    Cc: linux-kernel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Alan Cox
    Acked-by: Andi Kleen
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
     

11 Aug, 2010

2 commits

  • Signed-off-by: Dmitry Torokhov
    Acked-by: Arnd Bergmann
    Acked-by: John Kacur
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Torokhov
     
  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

02 Aug, 2010

2 commits

  • Currently MAY_ACCESS means that filesystems must check the permissions
    right then and not rely on cached results or the results of future
    operations on the object. This can be because of a call to sys_access() or
    because of a call to chdir() which needs to check search without relying on
    any future operations inside that dir. I plan to use MAY_ACCESS for other
    purposes in the security system, so I split the MAY_ACCESS and the
    MAY_CHDIR cases.

    Signed-off-by: Eric Paris
    Acked-by: Stephen D. Smalley
    Signed-off-by: James Morris

    Eric Paris
     
  • When commit be6d3e56a6b9b3a4ee44a0685e39e595073c6f0d "introduce new LSM hooks
    where vfsmount is available." was proposed, regarding security_path_truncate(),
    only "struct file *" argument (which AppArmor wanted to use) was removed.
    But length and time_attrs arguments are not used by TOMOYO nor AppArmor.
    Thus, let's remove these arguments.

    Signed-off-by: Tetsuo Handa
    Acked-by: Nick Piggin
    Signed-off-by: James Morris

    Tetsuo Handa
     

28 Jul, 2010

2 commits


22 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

06 Mar, 2010

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
    quota: stop using QUOTA_OK / NO_QUOTA
    dquot: cleanup dquot initialize routine
    dquot: move dquot initialization responsibility into the filesystem
    dquot: cleanup dquot drop routine
    dquot: move dquot drop responsibility into the filesystem
    dquot: cleanup dquot transfer routine
    dquot: move dquot transfer responsibility into the filesystem
    dquot: cleanup inode allocation / freeing routines
    dquot: cleanup space allocation / freeing routines
    ext3: add writepage sanity checks
    ext3: Truncate allocated blocks if direct IO write fails to update i_size
    quota: Properly invalidate caches even for filesystems with blocksize < pagesize
    quota: generalize quota transfer interface
    quota: sb_quota state flags cleanup
    jbd: Delay discarding buffers in journal_unmap_buffer
    ext3: quota_write cross block boundary behaviour
    quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
    quota: split out compat_sys_quotactl support from quota.c
    quota: split out netlink notification support from quota.c
    quota: remove invalid optimization from quota_sync_all
    ...

    Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

    Linus Torvalds
     

05 Mar, 2010

1 commit

  • Currently various places in the VFS call vfs_dq_init directly. This means
    we tie the quota code into the VFS. Get rid of that and make the
    filesystem responsible for the initialization. For most metadata operations
    this is a straight forward move into the methods, but for truncate and
    open it's a bit more complicated.

    For truncate we currently only call vfs_dq_init for the sys_truncate case
    because open already takes care of it for ftruncate and open(O_TRUNC) - the
    new code causes an additional vfs_dq_init for those which is harmless.

    For open the initialization is moved from do_filp_open into the open method,
    which means it happens slightly earlier now, and only for regular files.
    The latter is fine because we don't need to initialize it for operations
    on special files, and we already do it as part of the namespace operations
    for directories.

    Add a dquot_file_open helper that filesystems that support generic quotas
    can use to fill in ->open.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

04 Mar, 2010

1 commit


23 Dec, 2009

2 commits


17 Dec, 2009

2 commits

  • * do ima_get_count() in __dentry_open()
    * stop doing that in followups
    * move ima_path_check() to right after nameidata_to_filp()
    * don't bump counters on it

    Signed-off-by: Al Viro

    Al Viro
     
  • All users outside of fs/ of get_empty_filp() have been removed. This patch
    moves the definition from the include/ directory to internal.h so no new
    users crop up and removes the EXPORT_SYMBOL. I'd love to see open intents
    stop using it too, but that's a problem for another day and a smarter
    developer!

    Signed-off-by: Eric Paris
    Acked-by: Miklos Szeredi
    Signed-off-by: Al Viro

    Eric Paris
     

24 Nov, 2009

1 commit


12 Oct, 2009

2 commits