18 Aug, 2010

4 commits

  • fs: brlock vfsmount_lock

    Use a brlock for the vfsmount lock. It must be taken for write whenever
    modifying the mount hash or associated fields, and may be taken for read when
    performing mount hash lookups.

    A new lock is added for the mnt-id allocator, so it doesn't need to take
    the heavy vfsmount write-lock.

    The number of atomics should remain the same for fastpath rlock cases, though
    code would be slightly slower due to per-cpu access. Scalability will not be
    much improved in common cases yet, due to other locks (i.e. dcache_lock) getting
    in the way. However path lookups crossing mountpoints should be one case where
    scalability is improved (currently requiring the global lock).

    The slowpath is slower due to use of brlock. On a 64 core, 64 socket, 32 node
    Altix system (high latency to remote nodes), a simple umount microbenchmark
    (mount --bind mnt mnt2 ; umount mnt2, looped 1000 times) took 6.8s before this
    patch and 7.1s afterwards, about 5% slower.

    Cc: Al Viro
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
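
The read/write asymmetry described in the brlock commit above can be sketched in user space. This is only an illustration under stated assumptions, not the kernel's brlock code: `brlock_sketch`, `NR_SLOTS`, and the `br_*` helpers are hypothetical names, and pthread mutexes stand in for per-CPU spinlocks.

```c
#include <assert.h>
#include <pthread.h>

#define NR_SLOTS 4 /* hypothetical stand-in for the number of CPUs */

/* One lock per slot; in the kernel each would be a per-CPU lock. */
struct brlock_sketch {
    pthread_mutex_t slot[NR_SLOTS];
};

static void br_init(struct brlock_sketch *l)
{
    for (int i = 0; i < NR_SLOTS; i++)
        pthread_mutex_init(&l->slot[i], NULL);
}

/* Read side: take only the caller's own slot, so readers on
 * different slots never contend with each other. */
static void br_read_lock(struct brlock_sketch *l, int slot)
{
    assert(slot >= 0 && slot < NR_SLOTS);
    pthread_mutex_lock(&l->slot[slot]);
}

static void br_read_unlock(struct brlock_sketch *l, int slot)
{
    assert(slot >= 0 && slot < NR_SLOTS);
    pthread_mutex_unlock(&l->slot[slot]);
}

/* Write side: take every slot in a fixed order, which excludes
 * all readers. This loop is why the slowpath gets more expensive. */
static void br_write_lock(struct brlock_sketch *l)
{
    for (int i = 0; i < NR_SLOTS; i++)
        pthread_mutex_lock(&l->slot[i]);
}

static void br_write_unlock(struct brlock_sketch *l)
{
    for (int i = NR_SLOTS - 1; i >= 0; i--)
        pthread_mutex_unlock(&l->slot[i]);
}
```

In this sketch a lookup would bracket its hash walk with `br_read_lock()`/`br_read_unlock()` on its own slot, while a mount or umount would use `br_write_lock()`, paying a cost proportional to `NR_SLOTS`.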
     
  • fs: remove extra lookup in __lookup_hash

    Optimize lookup for create operations, where finding no dentry should often
    be the common case. In cases where it is not, such as unlink, the added
    overhead is much smaller than the overhead removed.

    Also, move comments about __d_lookup raciness to the __d_lookup call site.
    d_lookup is intuitive; __d_lookup is what needs commenting. So in that same
    vein, add kerneldoc comments to __d_lookup and clean up some of the comments:

    - We are interested in how the RCU lookup works here, particularly with
    renames. Make that explicit, and point to the document where it is explained
    in more detail.
    - RCU is pretty standard now, and macros make implementations pretty mindless.
    If we want to know about RCU barrier details, we look in RCU code.
    - Delete some boring legacy comments because we don't care much about how the
    code used to work, more about the interesting parts of how it works now. So
    comments about lazy LRU may be interesting, but would better be done in the
    LRU or refcount management code.

    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
     
  • fs: dentry allocation consolidation

    There are 2 duplicate copies of code in dentry allocation in path lookup.
    Consolidate them into a single function.

    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
     
  • fs: fix do_lookup false negative

    In do_lookup, if we initially find no dentry, we take the directory i_mutex and
    re-check the lookup. If we find a dentry there, then we revalidate it if
    needed. However if that revalidate asks for the dentry to be invalidated, we
    return -ENOENT from do_lookup. What should happen instead is an attempt to
    allocate and lookup a new dentry.

    This is probably not noticed because it is rare. It is only reached if a
    concurrent create races in first (in which case, the dentry probably won't be
    invalidated anyway), or if the racy __d_lookup has failed due to a
    false-negative (which is very rare).

    Fix this by removing code and having it use the normal reval path.

    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
     

11 Aug, 2010

2 commits

  • Add three helpers that retrieve a refcounted copy of the root and cwd
    from the supplied fs_struct.

    get_fs_root()
    get_fs_pwd()
    get_fs_root_and_pwd()

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Al Viro

    Miklos Szeredi
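
The idea behind these helpers, copying a pointer and taking a reference while the owning structure is locked, can be sketched as follows. All names here (`node_sketch`, `fs_sketch`, and the `*_sketch` functions) are hypothetical, and a pthread mutex replaces whatever lock the real `fs_struct` uses; the actual kernel signatures are not reproduced.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical refcounted object standing in for a path. */
struct node_sketch {
    atomic_int refcount;
};

/* Hypothetical stand-in for fs_struct: a lock plus root and cwd. */
struct fs_sketch {
    pthread_mutex_t lock;
    struct node_sketch *root;
    struct node_sketch *pwd;
};

static struct node_sketch *node_get(struct node_sketch *n)
{
    assert(n != NULL);
    atomic_fetch_add(&n->refcount, 1);
    return n;
}

/* Take the reference while the lock is held, so the pointer cannot
 * be swapped and released between the read and the increment. */
static struct node_sketch *get_fs_root_sketch(struct fs_sketch *fs)
{
    pthread_mutex_lock(&fs->lock);
    struct node_sketch *root = node_get(fs->root);
    pthread_mutex_unlock(&fs->lock);
    return root;
}

static struct node_sketch *get_fs_pwd_sketch(struct fs_sketch *fs)
{
    pthread_mutex_lock(&fs->lock);
    struct node_sketch *pwd = node_get(fs->pwd);
    pthread_mutex_unlock(&fs->lock);
    return pwd;
}

/* Fetch both under a single lock acquisition, giving a consistent pair. */
static void get_fs_root_and_pwd_sketch(struct fs_sketch *fs,
                                       struct node_sketch **root,
                                       struct node_sketch **pwd)
{
    pthread_mutex_lock(&fs->lock);
    *root = node_get(fs->root);
    *pwd = node_get(fs->pwd);
    pthread_mutex_unlock(&fs->lock);
}
```

The caller owns the returned reference and is expected to drop it when done; the release side is omitted from this sketch.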
     
  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

02 Aug, 2010

2 commits

  • SELinux needs to pass the MAY_ACCESS flag so it can handle auditing
    correctly. Presently the masking of MAY_* flags is done in the VFS. In
    order to allow LSMs to decide what flags they care about and what flags
    they don't, just pass them all and have each LSM mask off what it doesn't
    need. This patch should contain no functional changes to either the VFS or
    any LSM.

    Signed-off-by: Eric Paris
    Acked-by: Stephen D. Smalley
    Signed-off-by: James Morris

    Eric Paris
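
A rough sketch of the "pass everything, let each LSM mask" arrangement described above. The two hook functions and `MAY_ALL_SKETCH` are hypothetical; the `MAY_*` values are written as I recall them from `include/linux/fs.h` and should be treated as an assumption rather than a reference.

```c
#include <assert.h>

/* Permission bits (values assumed, see the note above). */
#define MAY_EXEC   0x01
#define MAY_WRITE  0x02
#define MAY_READ   0x04
#define MAY_APPEND 0x08
#define MAY_ACCESS 0x10
#define MAY_OPEN   0x20

#define MAY_ALL_SKETCH \
    (MAY_EXEC | MAY_WRITE | MAY_READ | MAY_APPEND | MAY_ACCESS | MAY_OPEN)

/* Hypothetical LSM that wants MAY_ACCESS so it can tell access(2)
 * probes apart from real accesses when auditing. */
static int lsm_with_audit_mask(int mask)
{
    assert((mask & ~MAY_ALL_SKETCH) == 0);
    return mask & (MAY_READ | MAY_WRITE | MAY_EXEC | MAY_APPEND | MAY_ACCESS);
}

/* Hypothetical LSM that only cares about the basic access bits. */
static int lsm_basic_mask(int mask)
{
    assert((mask & ~MAY_ALL_SKETCH) == 0);
    return mask & (MAY_READ | MAY_WRITE | MAY_EXEC | MAY_APPEND);
}
```

With the masking moved out of the caller, the same unfiltered `mask` reaches both hooks, and each one discards the bits it has no use for.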
     
  • When commit be6d3e56a6b9b3a4ee44a0685e39e595073c6f0d "introduce new LSM hooks
    where vfsmount is available." was proposed, regarding security_path_truncate(),
    only the "struct file *" argument (which AppArmor wanted to use) was removed.
    But the length and time_attrs arguments are not used by either TOMOYO or
    AppArmor. Thus, let's remove these arguments.

    Signed-off-by: Tetsuo Handa
    Acked-by: Nick Piggin
    Signed-off-by: James Morris

    Tetsuo Handa
     

28 Jul, 2010

1 commit

  • fsnotify was using char * when it passed around the d_name.name string
    internally but it is actually an unsigned char *. This patch switches
    fsnotify to use unsigned and should silence some pointer signedness warnings
    which have popped out of xfs. I do not add -Wpointer-sign to the fsnotify
    code as there are still issues with kstrdup and strlen which would pop
    out needless warnings.

    Signed-off-by: Eric Paris

    Eric Paris
     

28 May, 2010

1 commit

  • Commit 1f36f774b22a0ceb7dd33eca626746c81a97b6a5 broke FS_REVAL_DOT semantics.

    In particular, before this patch, the command
    ls -l
    in an NFS mounted directory would always check if the directory on the server
    had changed and if so would flush and refill the pagecache for the dir.
    After this patch, the same "ls -l" will repeatedly return stale data until
    the cached attributes for the directory time out.

    The following patch fixes this by ensuring the d_revalidate is called by
    do_last when "." is being looked-up.
    link_path_walk has already called d_revalidate, but in that case LOOKUP_OPEN
    is not set so nfs_lookup_verify_inode chooses not to do any validation.

    The following patch restores the original behaviour.

    Cc: stable@kernel.org
    Signed-off-by: NeilBrown
    Signed-off-by: Al Viro

    Neil Brown
     

22 May, 2010

1 commit


15 May, 2010

1 commit

  • 1) i_flags simply doesn't work for mount/unlink race prevention;
    we may have many links to a file and rm on one of those obviously
    shouldn't prevent bind on top of another later on. To fix it the
    right way we need to mark _dentry_ as unsuitable for mounting
    upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and
    i_mutex on the inode in question. Set it (with dont_mount(dentry))
    in unlink/rmdir/etc., check (with cant_mount(dentry)) in places
    in namespace.c that used to check for S_DEAD. Setting S_DEAD
    is still needed in places where we used to set it (for directories
    getting killed), since we rely on it for readdir/rmdir race
    prevention.

    2) rename()/mount() protection has another bogosity - we unhash
    the target before we'd checked that it's not a mountpoint. Fixed.

    3) ancient bogosity in pivot_root() - we locked i_mutex on the
    right directory, but checked S_DEAD on a different (and wrong)
    one. Noticed and fixed.

    Signed-off-by: Al Viro

    Al Viro
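
Point 1) above hinges on the flag living on the dentry rather than the inode. A user-space sketch of that distinction follows; the struct, the flag value, the mutex, and the `-ENOENT` return are assumptions made for illustration, and only the names `dont_mount`/`cant_mount` are borrowed from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define CANT_MOUNT_SKETCH 0x0100u /* hypothetical flag value */

/* Hypothetical dentry: one name (link) pointing at a shared inode. */
struct dentry_sketch {
    pthread_mutex_t lock; /* stand-in for the per-dentry lock */
    unsigned int flags;
    int inode_id;         /* several dentries may share this */
};

/* Called on the unlink/rmdir side: mark this name only. */
static void dont_mount_sketch(struct dentry_sketch *d)
{
    assert(d != NULL);
    pthread_mutex_lock(&d->lock);
    d->flags |= CANT_MOUNT_SKETCH;
    pthread_mutex_unlock(&d->lock);
}

static bool cant_mount_sketch(struct dentry_sketch *d)
{
    pthread_mutex_lock(&d->lock);
    bool dead = (d->flags & CANT_MOUNT_SKETCH) != 0;
    pthread_mutex_unlock(&d->lock);
    return dead;
}

/* Mount side: refuse only if this particular name was removed. */
static int try_mount_on_sketch(struct dentry_sketch *d)
{
    return cant_mount_sketch(d) ? -ENOENT : 0;
}
```

Two `dentry_sketch` objects with the same `inode_id` model two hard links: calling `dont_mount_sketch()` on one leaves the other mountable, which a per-inode flag could not express.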
     

13 May, 2010

1 commit

  • According to the specification

    mkdir d; ln -s d a; open("a/", O_NOFOLLOW | O_RDONLY)

    should return success but currently it returns ELOOP. This is a
    regression caused by the path lookup cleanup patch series.

    Fix the code to ignore O_NOFOLLOW in case the provided path has trailing
    slashes.

    Cc: Andrew Morton
    Cc: Al Viro
    Reported-by: Marius Tolzmann
    Acked-by: Miklos Szeredi
    Signed-off-by: Jan Kara
    Signed-off-by: Linus Torvalds

    Jan Kara
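
The reproducer in the commit message above can be turned into a small C helper. This is a sketch assuming a POSIX system with a writable `/tmp`; the function name and the temporary directory layout are made up for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Recreates: mkdir d; ln -s d a; open("a/", O_NOFOLLOW | O_RDONLY)
 * Returns 0 if the open succeeds, the errno value if it fails,
 * or -1 if the test setup itself could not be created. */
static int open_symlink_with_trailing_slash(void)
{
    char base[] = "/tmp/nofollow-XXXXXX";
    char dir[64], lnk[64], lnk_slash[64];
    int result;

    if (mkdtemp(base) == NULL)
        return -1;

    int n = snprintf(lnk_slash, sizeof(lnk_slash), "%s/a/", base);
    assert(n > 0 && (size_t)n < sizeof(lnk_slash));
    snprintf(dir, sizeof(dir), "%s/d", base);
    snprintf(lnk, sizeof(lnk), "%s/a", base);

    if (mkdir(dir, 0700) != 0 || symlink("d", lnk) != 0) {
        result = -1;
    } else {
        /* The trailing slash asks for the directory behind the link. */
        int fd = open(lnk_slash, O_NOFOLLOW | O_RDONLY);
        result = (fd >= 0) ? 0 : errno;
        if (fd >= 0)
            close(fd);
    }

    /* Best-effort cleanup; failures here are ignored. */
    unlink(lnk);
    rmdir(dir);
    rmdir(base);
    return result;
}
```

On a kernel carrying this fix the helper should return 0; on a kernel with the regression it would return `ELOOP`.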
     

27 Mar, 2010

1 commit


08 Mar, 2010

1 commit


07 Mar, 2010

1 commit


06 Mar, 2010

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
    quota: stop using QUOTA_OK / NO_QUOTA
    dquot: cleanup dquot initialize routine
    dquot: move dquot initialization responsibility into the filesystem
    dquot: cleanup dquot drop routine
    dquot: move dquot drop responsibility into the filesystem
    dquot: cleanup dquot transfer routine
    dquot: move dquot transfer responsibility into the filesystem
    dquot: cleanup inode allocation / freeing routines
    dquot: cleanup space allocation / freeing routines
    ext3: add writepage sanity checks
    ext3: Truncate allocated blocks if direct IO write fails to update i_size
    quota: Properly invalidate caches even for filesystems with blocksize < pagesize
    quota: generalize quota transfer interface
    quota: sb_quota state flags cleanup
    jbd: Delay discarding buffers in journal_unmap_buffer
    ext3: quota_write cross block boundary behaviour
    quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
    quota: split out compat_sys_quotactl support from quota.c
    quota: split out netlink notification support from quota.c
    quota: remove invalid optimization from quota_sync_all
    ...

    Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

    Linus Torvalds
     

05 Mar, 2010

19 commits


04 Mar, 2010

4 commits