03 Jan, 2009

9 commits


01 Jan, 2009

31 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (34 commits)
    nfsd race fixes: jfs
    nfsd race fixes: reiserfs
    nfsd race fixes: ext4
    nfsd race fixes: ext3
    nfsd race fixes: ext2
    nfsd/create race fixes, infrastructure
    filesystem notification: create fs/notify to contain all fs notification
    fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization
    kill ->dir_notify()
    filp_cachep can be static in fs/file_table.c
    fix f_count description in Documentation/filesystems/files.txt
    make INIT_FS use the __RW_LOCK_UNLOCKED initialization
    take init_fs to saner place
    kill vfs_permission
    pass a struct path * to may_open
    kill walk_init_root
    remove incorrect comment in inode_permission
    expand some comments (d_path / seq_path)
    correct wrong function name of d_put in kernel document and source comment
    fix switch_names() breakage in short-to-short case
    ...

    Linus Torvalds
     
  • jfs version of Al Viro's nfsd race patches

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Al Viro

    Dave Kleikamp
     
  • ... and the same for reiserfs. The difference here is that we need
    insert_inode_locked4() to match iget5_locked().

    Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • ext3 analog of the previous patch

    Signed-off-by: Al Viro

    Al Viro
     
  • * make ext2_new_inode() put the inode into icache in locked state
    * do not unlock until the inode is fully set up; otherwise nfsd
    might pick it in half-baked state.
    * make sure that ext2_new_inode() does *not* lead to two inodes with the
    same inumber hashed at the same time; otherwise a bogus fhandle coming
    from nfsd might race with inode creation:

    nfsd: iget_locked() creates inode
    nfsd: try to read from disk, block on that.
    ext2_new_inode(): allocate inode with that inumber
    ext2_new_inode(): insert it into icache, set it up and dirty
    ext2_write_inode(): get the relevant part of inode table in cache,
    set the entry for our inode (and start writing to disk)
    nfsd: get CPU again, look into inode table, see nice and sane on-disk
    inode, set the in-core inode from it

    oops - we have two in-core inodes with the same inumber live in icache,
    both used for IO. Welcome to fs corruption...

    Signed-off-by: Al Viro

    Al Viro
     
  • new helpers - insert_inode_locked() and insert_inode_locked4().
    Hash new inode, making sure that there's no such inode in icache
    already. If there is and it does not end up unhashed (as would
    happen if we have nfsd trying to resolve a bogus fhandle), fail.
    Otherwise insert our inode into hash and succeed.

    In either case have i_state set to new+locked; cleanup ends up
    being simpler with such calling conventions.

    Signed-off-by: Al Viro

    Al Viro
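
The calling convention described above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `toy_inode`, `toy_icache` and `toy_insert_inode_locked()` are hypothetical names invented for illustration, the real helpers operate on the kernel's inode hash under its own locking, and the step where the kernel waits to see whether the old inode ends up unhashed is reduced here to a simple `hashed` flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NEW     0x1u   /* hypothetical stand-in for the "new" state bit */
#define TOY_LOCKED  0x2u   /* hypothetical stand-in for the "locked" state bit */
#define TOY_BUCKETS 16

struct toy_inode {
    unsigned long ino;        /* inode number */
    unsigned int state;       /* TOY_NEW | TOY_LOCKED while being set up */
    bool hashed;              /* false once the entry counts as unhashed */
    struct toy_inode *next;   /* hash chain */
};

struct toy_icache {
    struct toy_inode *bucket[TOY_BUCKETS];
};

/*
 * Insert a freshly allocated inode into the toy cache.
 * Returns 0 on success, -EBUSY if a live inode with the same number
 * is already hashed. In either case the inode is left new+locked,
 * so the caller's cleanup path is the same.
 */
static int toy_insert_inode_locked(struct toy_icache *c, struct toy_inode *inode)
{
    struct toy_inode **head = &c->bucket[inode->ino % TOY_BUCKETS];
    struct toy_inode *old;

    inode->state |= TOY_NEW | TOY_LOCKED;

    for (old = *head; old; old = old->next)
        if (old->ino == inode->ino && old->hashed)
            return -EBUSY;

    inode->hashed = true;
    inode->next = *head;
    *head = inode;
    return 0;
}
```

A caller in this model would keep the inode locked until it is fully set up and only then clear the state bits, which is the property the ext2 race description relies on.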
     
  • Creating a generic filesystem notification interface, fsnotify, which will be
    used by inotify, dnotify, and eventually fanotify is really starting to
    clutter the fs directory. This patch simply moves inotify and dnotify into
    fs/notify/inotify and fs/notify/dnotify respectively to make both current fs/
    and future notification tidier.

    Signed-off-by: Eric Paris
    Signed-off-by: Al Viro

    Eric Paris
     
  • - iget5_locked in bdget really needs blockdev_superblock, instead of
    bd_mnt, so bd_mnt could be just a local variable;

    - blockdev_superblock really needs __read_mostly, while the local
    variable bd_mnt does not;

    - make use of sb_is_blkdev_sb in bd_forget, instead of direct reference
    to blockdev_superblock.

    Signed-off-by: Denis ChengRq
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Denis ChengRq
     
  • Remove the hopelessly misguided ->dir_notify(). The only instance (cifs)
    has been broken by design from the very beginning; the objects it creates
    are never destroyed, keep references to struct file they can outlive, nothing
    that could possibly evict them exists on close(2) path *and* no locking
    whatsoever is done to prevent races with close(), should the previous, er,
    deficiencies someday be dealt with.

    Signed-off-by: Al Viro

    Al Viro
     
  • Instead of creating the "filp" kmem_cache in vfs_caches_init(),
    we can do it a little bit later in files_init(), so that filp_cachep
    is static to fs/file_table.c

    Acked-by: Paul E. McKenney

    Signed-off-by: Eric Dumazet
    Signed-off-by: Al Viro

    Eric Dumazet
     
  • Documentation/filesystems/files.txt was not updated when
    f_count became an atomic_long_t.
    atomic_long_inc_not_zero() is now used instead of atomic_inc_not_zero()

    Signed-off-by: Al Viro

    Eric Dumazet
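
For readers unfamiliar with the "increment unless zero" idiom mentioned above, its semantics can be sketched with C11 atomics. This is a userspace illustration only, not the kernel implementation; `toy_long_inc_not_zero()` is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count has not already dropped to zero.
 * Returns true if the counter was incremented, false if it was zero.
 */
static bool toy_long_inc_not_zero(atomic_long *v)
{
    long old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}
```

The point of the idiom is that a lookup racing with the final release of an object never resurrects a count that has already reached zero.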
     
  • [AV: rediffed on top of unification of init_fs]
    Initialization of init_fs still uses the deprecated RW_LOCK_UNLOCKED macro.
    This patch updates it to use the __RW_LOCK_UNLOCKED(lock) macro.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Al Viro

    Steven Rostedt
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • With all the nameidata removal there's no point anymore for this helper.
    Of the three callers left two will go away with the next lookup series
    anyway.

    Also add proper kerneldoc to inode_permission as this is the main
    permission check routine now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • No need for the nameidata in may_open - a struct path is enough.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • walk_init_root is a tiny helper that is marked __always_inline, has just
    one caller and an unused argument. Just merge it into the caller.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • We now pass on all MAY_ flags to the filesystems' permission routines,
    so remove the comment stating the contrary.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Explain that you really need to use the return value of d_path rather than
    the buffer you passed into it.

    Also fix the comment for seq_path(): the function arguments changed
    recently, but the comment had not been updated to match.

    Signed-off-by: Arjan van de Ven
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Arjan van de Ven
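
Why the return value matters can be shown with a small userspace model. The sketch below assumes a path builder that, like d_path, fills the buffer from its end and returns a pointer into it; `toy_build_path()` and its arguments are hypothetical and do not reflect the kernel signature.

```c
#include <stddef.h>
#include <string.h>

/*
 * Build "/a/b/c" from components given leaf first, writing backwards
 * from the end of buf. Returns a pointer to the start of the string
 * inside buf (usually NOT buf itself), or NULL if it does not fit.
 */
static char *toy_build_path(const char *const *leaf_to_root, size_t n,
                            char *buf, size_t buflen)
{
    char *p;
    size_t i;

    if (buflen == 0)
        return NULL;

    p = buf + buflen;
    *--p = '\0';

    for (i = 0; i < n; i++) {
        size_t len = strlen(leaf_to_root[i]);

        /* Need room for the component plus its leading '/'. */
        if ((size_t)(p - buf) < len + 1)
            return NULL;
        p -= len;
        memcpy(p, leaf_to_root[i], len);
        *--p = '/';
    }
    return p;
}
```

In this model the start of the buffer is left untouched, so a caller that prints the buffer instead of the returned pointer sees whatever happened to be there before the call.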
     
  • There is no function named d_put(); it should be dput().

    Impact: fix document and comment, no functionality changed

    Signed-off-by: Zhao Lei
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Zhaolei
     
  • We want ->name.len to match the resulting name on *both*
    source and target

    Signed-off-by: Al Viro

    Al Viro
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Sergey S. Kostyliov
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Christoph Hellwig
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Andrew Morton
    Cc: Theodore Ts'o
    Cc: adilger@sun.com
    Cc: linux-ext4@vger.kernel.org
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Andrew Morton
    Cc: Stephen Tweedie
    Cc: linux-ext4@vger.kernel.org
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • Ensure fast symlink targets are NUL-terminated, even if corrupted
    on-disk.

    Cc: Andrew Morton
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • On-disk data corruption could cause a page link to have its i_size set
    to PAGE_SIZE (or a multiple thereof) and its contents all non-NUL.
    NUL-terminate the link name to ensure this doesn't cause further
    problems for the kernel.

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
     
  • A number of filesystems were potentially triggering kernel bugs due to
    corrupted symlink names on disk. This function helps safely terminate
    the names.

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Duane Griffin
    Signed-off-by: Al Viro

    Duane Griffin
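
The idea behind such a helper can be sketched in a few lines of userspace C. The name `toy_terminate_link()` and its parameters are assumptions made for illustration; the commit message above does not spell out the kernel helper's name or signature.

```c
#include <stddef.h>

/*
 * Terminate a symlink name read from disk. 'len' is the length the
 * on-disk metadata claims, which may be corrupted; 'maxlen' is the
 * largest index that is safe to write in the buffer. The NUL is
 * placed at whichever of the two is smaller, so a bogus length can
 * never push the write outside the buffer.
 */
static void toy_terminate_link(char *name, size_t len, size_t maxlen)
{
    name[len < maxlen ? len : maxlen] = '\0';
}
```

A filesystem-style caller in this model would pass the inode's recorded size as `len` and the buffer size minus one as `maxlen`.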
     
  • The result from readlink is being used to index into the link name
    buffer without checking whether it is a valid length. If readlink
    returns an error this will fault or cause memory corruption.

    Cc: Tyler Hicks
    Cc: Dustin Kirkland
    Cc: ecryptfs-devel@lists.launchpad.net
    Signed-off-by: Duane Griffin
    Acked-by: Michael Halcrow
    Acked-by: Tyler Hicks
    Signed-off-by: Al Viro

    Duane Griffin
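
The same class of bug exists in userspace with POSIX readlink(2), which returns -1 on error and does not NUL-terminate its output. The following sketch shows the checked pattern; `toy_read_link()` is a hypothetical wrapper, and the in-kernel code deals with negative errno values rather than -1.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a symlink target into buf and NUL-terminate it.
 * Returns 0 on success, -1 on failure. The result of readlink() is
 * checked BEFORE it is used as an index, so an error return never
 * turns into a write at buf[-1].
 */
static int toy_read_link(const char *path, char *buf, size_t buflen)
{
    ssize_t rc;

    if (buflen == 0)
        return -1;

    /* Leave one byte for the terminator. */
    rc = readlink(path, buf, buflen - 1);
    if (rc < 0)
        return -1;

    buf[rc] = '\0';
    return 0;
}
```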
     
  • The extra semicolon serves no purpose.

    Signed-off-by: Julia Lawall
    Reviewed-by: Richard Genoud
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Julia Lawall