30 Nov, 2010

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (24 commits)
    Btrfs: don't use migrate page without CONFIG_MIGRATION
    Btrfs: deal with DIO bios that span more than one ordered extent
    Btrfs: setup blank root and fs_info for mount time
    Btrfs: fix fiemap
    Btrfs - fix race between btrfs_get_sb() and umount
    Btrfs: update inode ctime when using links
    Btrfs: make sure new inode size is ok in fallocate
    Btrfs: fix typo in fallocate to make it honor actual size
    Btrfs: avoid NULL pointer deref in try_release_extent_buffer
    Btrfs: make btrfs_add_nondir take parent inode as an argument
    Btrfs: hold i_mutex when calling btrfs_log_dentry_safe
    Btrfs: use dget_parent where we can UPDATED
    Btrfs: fix more ESTALE problems with NFS
    Btrfs: handle NFS lookups properly
    btrfs: make 1-bit signed fields unsigned
    btrfs: Show device attr correctly for symlinks
    btrfs: Set file size correctly in file clone
    btrfs: Check if dest_offset is block-size aligned before cloning file
    Btrfs: handle the space_cache option properly
    btrfs: Fix early enospc because 'unused' calculated with wrong sign.
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
    GFS2: Userland expects quota limit/warn/usage in 512b blocks

    Linus Torvalds
     

29 Nov, 2010

5 commits

  • Fixes compile error

    Signed-off-by: Chris Mason

    Chris Mason
     
  • The new DIO bio splitting code has problems when the bio
    spans more than one ordered extent. This will happen as the
    generic DIO code merges our get_blocks calls together into
    a bigger single bio.

    This fixes things by walking forward in the ordered extent
    code finding all the overlapping ordered extents and completing them
    all at once.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • This avoids some include-file hell, and the function isn't really
    important enough to be inlined anyway.

    Reported-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • And in particular, use it in 'pipe_fcntl()'.

    The other pipe functions do not need to use the 'careful' version, since
    they are only ever called for things that are already known to be pipes.

    The normal read/write/ioctl functions are called through the file
    operations structures, so if a file isn't a pipe, they'd never get
    called. But pipe_fcntl() is special, and called directly from the
    generic fcntl code, and needs to use the same careful function that the
    splice code is using.

    Cc: Jens Axboe
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Dave Jones
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • .. and change it to take the 'file' pointer instead of an inode, since
    that's what all users want anyway.

    The renaming is preparatory to exporting it to other users. The old
    'pipe_info()' name was too generic and is already used elsewhere, so
    before making the function public we need to use a more specific name.

    Cc: Jens Axboe
    Cc: Andrew Morton
    Cc: Al Viro
    Cc: Dave Jones
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Nov, 2010

6 commits

  • There is a problem with how we use sget: it searches through the list of
    supers attached to the fs_type, looking for a super with the same fs_devices
    as what we're trying to mount. This depends on sb->s_fs_info being filled, but
    we don't fill that in until we get to btrfs_fill_super, so we could hit supers
    on the fs_type super list that have a NULL s_fs_info. To fix that, we set up a
    blank root with a blank fs_info to hold fs_devices. That way our test will
    work out right, we can set s_fs_info in btrfs_set_super, and open_ctree will
    simply use our pre-allocated root and fs_info when setting everything up.
    Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • There are two big problems currently with FIEMAP:

    1) We return extents for holes. This isn't supposed to happen: we should
    simply not return extents for holes, and userspace interprets the lack of
    an extent as a hole.

    2) We sometimes don't set FIEMAP_EXTENT_LAST properly. This is because we wait
    to see an EXTENT_FLAG_VACANCY flag on the em, but this won't happen if, say, we
    ask fiemap to map up to the last extent in a file and there is nothing but
    holes up to the i_size. To fix this we need to look up the last extent in the
    file and save its logical offset, so that if we happen to map that extent we
    can be sure to set FIEMAP_EXTENT_LAST.

    With this patch we now pass xfstest 225, which we never have before.

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
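As a rough illustration of point 1 above (userspace infers holes from the gaps between returned extents), here is a minimal Python sketch. It is not the btrfs code; the `(logical_offset, length)` tuples and the function name `holes_from_extents` are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation of a mapped extent: (logical_offset, length) in bytes.
Extent = Tuple[int, int]


def holes_from_extents(extents: List[Extent], file_size: int) -> List[Extent]:
    """Return the gaps (holes) that are not covered by any extent."""
    holes = []
    pos = 0  # first byte not yet accounted for
    for start, length in sorted(extents):
        if start > pos:
            # Nothing was reported for [pos, start): treat it as a hole.
            holes.append((pos, start - pos))
        pos = max(pos, start + length)
    if pos < file_size:
        # Trailing hole between the last extent and i_size.
        holes.append((pos, file_size - pos))
    return holes
```

A file with data only in its first and third 4 KiB blocks and a size of 16 KiB would yield two holes here, including the trailing one that matters for the FIEMAP_EXTENT_LAST case described in point 2.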
     
  • When mounting a btrfs file system btrfs_test_super() may attempt to
    use sb->s_fs_info, the btrfs root, of a super block that is going away
    and that has had the btrfs root set to NULL in its ->put_super(). But
    if the super block is going away it cannot be an existing super block
    so we can return false in this case.

    Signed-off-by: Ian Kent
    Signed-off-by: Chris Mason

    Ian Kent
     
  • Currently we fail xfstest 236 because we're not updating the inode ctime on
    link. This is a simple fix, and makes it so we pass 236 now.

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
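The behaviour being fixed can be observed from userspace. The following Python sketch is only an approximation of that kind of check, not the xfstest itself; the helper name `link_and_stat` and the file names are made up for the example.

```python
import os
import tempfile


def link_and_stat(directory):
    """Create a file, hard-link it, and return its stat before and after."""
    target = os.path.join(directory, "target")
    alias = os.path.join(directory, "alias")
    with open(target, "w"):
        pass  # create an empty file
    before = os.stat(target)
    os.link(target, alias)  # link(2) should bump st_nlink and update st_ctime
    after = os.stat(target)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = link_and_stat(tmp)
        print("nlink:", before.st_nlink, "->", after.st_nlink)
        print("ctime moved forward:", after.st_ctime_ns > before.st_ctime_ns)
```

On a fast filesystem the two timestamps can land in the same clock tick, so a real test would typically sleep between the create and the link.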
     
  • We have been failing xfstest 228 forever, because we don't check to make sure
    the new inode size is acceptable as far as RLIMIT is concerned. Just check to
    make sure it's ok to create an inode with this new size and error out if not.
    With this patch we now pass 228.

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
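The check described above amounts to comparing the requested size against the process file-size limit. A minimal Python sketch of that idea follows; the function name `new_size_ok` and the choice of returning `-EFBIG` are assumptions for illustration, not a copy of the kernel helper.

```python
import errno
import resource


def new_size_ok(current_size, new_size, limit=None):
    """Return 0 if growing to new_size is allowed, else a negative errno."""
    if limit is None:
        # Soft RLIMIT_FSIZE of the calling process.
        limit, _hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    if new_size <= current_size:
        return 0  # shrinking or no change never exceeds the limit
    if limit != resource.RLIM_INFINITY and new_size > limit:
        return -errno.EFBIG
    return 0
```

For example, `new_size_ok(0, 4096, limit=1024)` rejects the request, which is the case fallocate previously let through.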
     
  • There is a typo in __btrfs_prealloc_file_range() where we set the i_size to
    actual_len/cur_offset, and then just set it to cur_offset again, and do the same
    with btrfs_ordered_update_i_size(). This fixes it back to keeping i_size in a
    local variable and then updating i_size properly. Tested this with

    xfs_io -F -f -c "falloc 0 1" -c "pwrite 0 1" foo

    stat'ing foo gives us a size of 1 instead of 4096 like it was. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
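To make the 1 versus 4096 difference concrete, here is a small Python sketch of the arithmetic. It assumes a 4096-byte block size and that preallocation rounds the allocated range up to a block boundary; the function name `prealloc_sizes` is hypothetical.

```python
def prealloc_sizes(offset, length, blocksize=4096):
    """Return (allocated_end, i_size) for a preallocation request."""
    actual_len = offset + length  # what the caller asked for
    # The allocation itself is rounded up to whole blocks.
    allocated_end = -(-actual_len // blocksize) * blocksize
    # i_size must honour the requested length, not the rounded allocation.
    i_size = min(actual_len, allocated_end)
    return allocated_end, i_size
```

For `falloc 0 1` this gives an allocation ending at 4096 but an i_size of 1, matching the stat result quoted above.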
     

27 Nov, 2010

3 commits

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    NFS: Ensure we return the dirent->d_type when it is known
    NFS: Correct the array bound calculation in nfs_readdir_add_to_array
    NFS: Don't ignore errors from nfs_do_filldir()
    NFS: Fix the error handling in "uncached_readdir()"
    NFS: Fix a page leak in uncached_readdir()
    NFS: Fix a page leak in nfs_do_filldir()
    NFS: Assume eof if the server returns no readdir records
    NFS: Buffer overflow in ->decode_dirent() should not be fatal
    Pure nfs client performance using odirect.
    SUNRPC: Fix an infinite loop in call_refresh/call_refreshresult

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    cciss: fix build for PROC_FS disabled
    block: fix amiga and atari floppy driver compile warning
    blk-throttle: Fix calculation of max number of WRITES to be dispatched
    ioprio: grab rcu_read_lock in sys_ioprio_{set,get}()
    xen/blkfront: cope with backend that fail empty BLKIF_OP_WRITE_BARRIER requests
    xen/blkfront: Implement FUA with BLKIF_OP_WRITE_BARRIER
    xen/blkfront: change blk_shadow.request to proper pointer
    xen/blkfront: map REQ_FLUSH into a full barrier

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
    nilfs2: fix typo in comment of nilfs_dat_move function
    nilfs2: nilfs_iget_for_gc() returns ERR_PTR

    Linus Torvalds
     

25 Nov, 2010

3 commits

  • reiserfs_unpack() locks the inode mutex with reiserfs_mutex_lock_safe()
    to protect against reiserfs lock dependency. However, this protection
    requires the reiserfs lock to be held.

    This is the case if reiserfs_unpack() is called by reiserfs_ioctl but
    not from reiserfs_quota_on() when it tries to unpack tails of quota
    files.

    Fix the ordering of the two locks in reiserfs_unpack() to fix this
    issue.

    Signed-off-by: Frederic Weisbecker
    Reported-by: Markus Gapp
    Reported-by: Jan Kara
    Cc: Jeff Mahoney
    Cc: [2.6.36.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • Currently one pagemap_read() call walks in PAGEMAP_WALK_SIZE bytes (== 512
    pages.) But there is a corner case where walk_pmd_range() accidentally
    runs over a VMA associated with a hugetlbfs file.

    For example, when a process has mappings to VMAs as shown below:

    # cat /proc/<pid>/maps
    ...
    3a58f6d000-3a58f72000 rw-p 00000000 00:00 0
    7fbd51853000-7fbd51855000 rw-p 00000000 00:00 0
    7fbd5186c000-7fbd5186e000 rw-p 00000000 00:00 0
    7fbd51a00000-7fbd51c00000 rw-s 00000000 00:12 8614 /hugepages/test

    then pagemap_read() goes into walk_pmd_range() path and walks in the range
    0x7fbd51853000-0x7fbd51a53000, but the hugetlbfs VMA should be handled by
    walk_hugetlb_range(). Otherwise PMD for the hugepage is considered bad
    and cleared, which causes undesirable results.

    This patch fixes it by limiting each pagemap walk range to a single PMD.

    Signed-off-by: Naoya Horiguchi
    Cc: Jun'ichi Nomura
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
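The effect of walking one PMD at a time can be sketched with the addresses from the example above. This Python snippet assumes a 2 MiB PMD size (x86-64 with 4 KiB pages); `split_at_pmd` is a name invented for the illustration.

```python
PMD_SIZE = 2 * 1024 * 1024  # assumed: x86-64 with 4 KiB pages


def split_at_pmd(start, end, pmd_size=PMD_SIZE):
    """Yield (chunk_start, chunk_end) pieces that never cross a PMD boundary."""
    while start < end:
        boundary = (start // pmd_size + 1) * pmd_size  # next PMD boundary
        chunk_end = min(boundary, end)
        yield (start, chunk_end)
        start = chunk_end


if __name__ == "__main__":
    for lo, hi in split_at_pmd(0x7fbd51853000, 0x7fbd51a53000):
        print(hex(lo), hex(hi))
```

With these inputs the range is cut at 0x7fbd51a00000, which is where the hugetlbfs VMA begins, so the second piece can be handled separately from the normal mappings.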
     
  • The attribute cache for a file was not being cleared when a file is opened
    with O_TRUNC.

    If the filesystem's open operation truncates the file ("atomic_o_trunc"
    feature flag is set) then the kernel should invalidate the cached st_mtime
    and st_ctime attributes.

    Also i_size should explicitly be set to zero as it is used sometimes
    without refreshing the cache.

    Signed-off-by: Ken Sumrall
    Cc: Anfei
    Cc: "Anand V. Avati"
    Signed-off-by: Miklos Szeredi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken Sumrall
     

24 Nov, 2010

1 commit


23 Nov, 2010

10 commits


22 Nov, 2010

10 commits

  • If we fail to find a pointer in the radix tree, don't try
    to deref the NULL one we do have.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Everybody who calls btrfs_add_nondir just passes in the dentry of the new file
    and then dereferences dentry->d_parent->d_inode, but every caller of
    btrfs_add_nondir() is already passed the parent's inode. So instead of
    dereferencing dentry->d_parent, just make btrfs_add_nondir take the dir inode as
    an argument and pass that along so we don't have to worry about d_parent.
    Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • Since we walk up the path logging all of the parts of the inode's path, we need
    to hold i_mutex to make sure that the inode is not renamed while we're logging
    everything. btrfs_log_dentry_safe does dget_parent and all of that jazz, but we
    may get unexpected results if the rename changes the inode's location while
    we're higher up the path logging those dentries, so do this for safety reasons.
    Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • There are lots of places where we do dentry->d_parent->d_inode without holding
    the dentry->d_lock. This could cause problems with rename. So instead we need
    to use dget_parent() and hold the reference to the parent as long as we are
    going to use its inode and then dput it at the end.

    Signed-off-by: Josef Bacik
    Cc: raven@themaw.net
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • When creating new inodes we don't set up inode->i_generation. So if we generate
    an fh with a newly created inode we save the generation of 0, but if we flush
    the inode to disk and have to read it back when getting the inode on the server
    we'll have the right i_generation, so gens won't match and we get ESTALE. This
    patch properly sets inode->i_generation when we create the new inode and now I'm
    no longer getting ESTALE. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
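The generation mismatch described above can be modelled in a few lines. This Python sketch is a toy model only: the `Inode` class, the `(ino, generation)` handle layout and the function names are assumptions for illustration and do not mirror the btrfs export code.

```python
import errno
from dataclasses import dataclass


@dataclass
class Inode:
    ino: int
    generation: int


def encode_fh(inode):
    """Build a toy file handle from the in-memory inode."""
    return (inode.ino, inode.generation)


def decode_fh(fh, on_disk):
    """Look the handle up again; a generation mismatch means a stale handle."""
    ino, generation = fh
    inode = on_disk.get(ino)
    if inode is None or inode.generation != generation:
        return -errno.ESTALE
    return inode
```

If the in-memory inode still has generation 0 when the handle is encoded, while the copy read back from disk has the real generation, `decode_fh` returns `-ESTALE`, which is the failure the patch removes.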
     
  • People kept reporting NFS issues, specifically getting ESTALE a lot. I figured
    out how to reproduce the problem:

    SERVER
    mkfs.btrfs /dev/sda1
    mount /dev/sda1 /mnt/btrfs-test

    btrfs subvol create /mnt/btrfs-test/foo
    service nfs start

    CLIENT
    mount server:/mnt/btrfs /mnt/test
    cd /mnt/test/foo
    ls

    SERVER
    echo 3 > /proc/sys/vm/drop_caches

    CLIENT
    ls

    Signed-off-by: Chris Mason

    Josef Bacik
     
  • Fixes these sparse warnings:
    fs/btrfs/ctree.h:811:17: error: dubious one-bit signed bitfield
    fs/btrfs/ctree.h:812:20: error: dubious one-bit signed bitfield
    fs/btrfs/ctree.h:813:19: error: dubious one-bit signed bitfield

    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Chris Mason

    Mariusz Kozlowski
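Why sparse calls a one-bit signed bitfield dubious: in two's complement a single bit can only represent 0 and -1, never 1. The Python sketch below shows the interpretation; `as_signed` is a helper written for this example.

```python
def as_signed(raw, bits):
    """Interpret the low `bits` bits of raw as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    # Value bits count positively, the sign bit counts negatively.
    return (raw & (sign_bit - 1)) - (raw & sign_bit)


if __name__ == "__main__":
    print(as_signed(1, 1))  # a set 1-bit signed field reads back as -1
    print(as_signed(1, 8))  # with more bits the same raw value is 1
```

Code that stores 1 in such a field and later compares it against 1 would therefore misbehave, which is why making the fields unsigned is the usual fix.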
     
  • Symlinks and files of other types show different device numbers, though
    they are on the same partition:

    $ touch tmp; ln -s tmp tmp2; stat tmp tmp2
    File: `tmp'
    Size: 0 Blocks: 0 IO Block: 4096 regular empty file
    Device: 15h/21d Inode: 984027 Links: 1
    --- snip ---
    File: `tmp2' -> `tmp'
    Size: 3 Blocks: 0 IO Block: 4096 symbolic link
    Device: 13h/19d Inode: 984028 Links: 1

    Reported-by: Toke Høiland-Jørgensen
    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
     
  • Set src_offset = 0, src_length = 20K, dest_offset = 20K. And the
    original filesize of the dest file 'file2' is 30K:

    # ls -l /mnt/file2
    -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2

    Now clone file1 to file2, the dest file should be 40K, but it
    still shows 30K:

    # ls -l /mnt/file2
    -rw-r--r-- 1 root root 30720 Nov 18 16:42 /mnt/file2

    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
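The expected size in the example above follows from a simple rule: the destination grows only if the cloned range ends past its current end. A minimal Python sketch, with `cloned_file_size` as a hypothetical name:

```python
def cloned_file_size(dest_size, dest_offset, length):
    """Size of the destination file after cloning `length` bytes at dest_offset."""
    return max(dest_size, dest_offset + length)


if __name__ == "__main__":
    K = 1024
    # 30K destination, 20K cloned at offset 20K.
    print(cloned_file_size(30 * K, 20 * K, 20 * K))
```

With the values from the report this evaluates to 40960 bytes (40K), not the 30720 that `ls -l` showed before the fix.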
     
  • We've done the check for src_offset and src_length, and we should
    also check dest_offset, otherwise we'll corrupt the destination
    file:

    (After cloning file1 to file2 with unaligned dest_offset)
    # cat /mnt/file2
    cat: /mnt/file2: Input/output error

    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
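The added validation can be pictured as one alignment test applied to all three values. This Python sketch assumes a 4096-byte block size and an `-EINVAL` result for unaligned input; both are assumptions for the example, as is the name `check_clone_offsets`.

```python
import errno


def check_clone_offsets(src_offset, src_length, dest_offset, blocksize=4096):
    """Return 0 if every value is block-size aligned, else -EINVAL."""
    for value in (src_offset, src_length, dest_offset):
        if value % blocksize != 0:
            return -errno.EINVAL
    return 0
```

Rejecting an unaligned dest_offset up front means the request fails cleanly instead of leaving a destination file that later returns an I/O error on read.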