14 Jun, 2011

3 commits


13 Jun, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
    Btrfs: use join_transaction in btrfs_evict_inode()
    Btrfs - use %pU to print fsid
    Btrfs: fix extent state leak on failed nodatasum reads
    btrfs: fix unlocked access of delalloc_inodes
    Btrfs: avoid stack bloat in btrfs_ioctl_fs_info()
    btrfs: remove 64bit alignment padding to allow extent_buffer to fit into one fewer cacheline
    Btrfs: clear current->journal_info on async transaction commit
    Btrfs: make sure to recheck for bitmaps in clusters
    btrfs: remove unneeded includes from scrub.c
    btrfs: reinitialize scrub workers
    btrfs: scrub: errors in tree enumeration
    Btrfs: don't map extent buffer if path->skip_locking is set
    Btrfs: unlock the trans lock properly
    Btrfs: don't map extent buffer if path->skip_locking is set
    Btrfs: fix duplicate checking logic
    Btrfs: fix the allocator loop logic
    Btrfs: fix bitmap regression
    Btrfs: don't commit the transaction if we dont have enough pinned bytes
    Btrfs: noinline the cluster searching functions
    Btrfs: cache bitmaps when searching for a cluster

    Linus Torvalds
     

11 Jun, 2011

12 commits

  • The WARN_ON() in start_transaction() was triggered while balancing.

    The cause is that btrfs_relocate_chunk() started a transaction and
    then called iput() on the inode that stores the free space cache,
    and iput() called btrfs_start_transaction() again.

    Reported-by: Tsutomu Itoh
    Signed-off-by: Li Zefan
    Reviewed-by: Josef Bacik
    Signed-off-by: Chris Mason

    Li Zefan
     
  • The checkpoint generation interval of nilfs goes wrong after the user
    has changed the interval parameter with the nilfs-tune tool.

    segctord starting. Construction interval = 5 seconds,
    CP frequency < 30 seconds
    segctord starting. Construction interval = 0 seconds,
    CP frequency < 30 seconds

    This turned out to be caused by a trivial bug in the initialization
    code of the log writer. This patch fixes it.

    Reported-by: Andrea Gelmini
    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • The nilfs_btree_delete function does not terminate part of the virtual
    block addresses when shrinking the last remaining child node into the
    root node. The missing address termination causes dead btree node
    blocks to persist and chip away at free disk space.

    This fixes the leak bug on the btree node deletion.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • The nilfs_btree_delete function wrongly terminates the virtual block
    address of the btree node held by its parent at index 0. When
    concatenating the index-0 node with its right sibling node,
    nilfs_btree_delete terminates the block address of the index-0 node
    instead of the right sibling node, which is the one that should be
    deleted.

    This bug not only wastes disk space in the long run, but also causes
    file system corruption. This patch fixes it.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • Get rid of FIXME comment. Uuids from dmesg are now the same as uuids
    given by btrfs-progs.

    Signed-off-by: Ilya Dryomov
    Signed-off-by: Chris Mason

    Ilya Dryomov
     
  • When encountering an EIO while reading from a nodatasum extent, we
    insert an error record into the inode's failure tree.
    btrfs_readpage_end_io_hook returns early for nodatasum inodes. We'd
    better clear the failure tree in that case, otherwise the kernel
    complains about

    BUG extent_state: Objects remaining on kmem_cache_close()

    on rmmod.

    Signed-off-by: Jan Schmidt
    Signed-off-by: Chris Mason

    Jan Schmidt
     
  • …trfs-unstable-arne into for-linus

    Chris Mason
     
  • list_splice_init will make delalloc_inodes empty, but without a spinlock
    held around it, this may produce a corrupted list head, which is accessed
    in many places. The race window is very tight and nobody seems to have
    hit it so far.

    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    David Sterba
     
  • The size of struct btrfs_ioctl_fs_info_args is as big as 1KB, so
    don't declare the variable on stack.

    Signed-off-by: Li Zefan
    Reviewed-by: Josef Bacik
    Signed-off-by: Chris Mason

    Li Zefan
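
As a rough user-space illustration of the stack-bloat fix described above (not the actual btrfs patch), the sketch below keeps a structure of about 1KB off the stack by allocating it on the heap. `struct fs_info_args`, its fields, and `fill_fs_info()` are hypothetical stand-ins; kernel code would typically use kzalloc()/kfree() where this sketch uses calloc()/free().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ~1KB ioctl argument structure. */
struct fs_info_args {
    uint64_t max_id;
    uint64_t num_devices;
    uint8_t  fsid[16];
    uint64_t reserved[124];
};

/*
 * Fill in the structure without placing it on the stack.
 * Returns 0 on success or -ENOMEM if the allocation fails.
 */
static int fill_fs_info(uint64_t num_devices, uint64_t *out_num_devices)
{
    struct fs_info_args *args;

    assert(out_num_devices != NULL);

    args = calloc(1, sizeof(*args));    /* kzalloc() in kernel code */
    if (!args)
        return -ENOMEM;

    args->num_devices = num_devices;
    *out_num_devices = args->num_devices;

    free(args);                         /* kfree() in kernel code */
    return 0;
}
```

The trade-off is one extra allocation and an error path per call in exchange for a much smaller stack frame, which matters on the kernel's small fixed-size stacks.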
     
  • Reorder extent_buffer to remove 8 bytes of alignment padding on 64 bit
    builds. This shrinks its size to 128 bytes, allowing it to fit into one
    fewer cache line and allowing more objects per slab in its kmem_cache.

    slabinfo extent_buffer reports :-

    before:-
    Sizes (bytes)     Slabs
    ----------------------------------
    Object :     136  Total  :     123
    SlabObj:     136  Full   :     121
    SlabSiz:    4096  Partial:       0
    Loss   :       0  CpuSlab:       2
    Align  :       8  Objects:      30

    after :-
    Object :     128  Total  :       4
    SlabObj:     128  Full   :       2
    SlabSiz:    4096  Partial:       0
    Loss   :       0  CpuSlab:       2
    Align  :       8  Objects:      32

    Signed-off-by: Richard Kennedy
    Signed-off-by: Chris Mason

    richard kennedy
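
The effect of reordering members can be sketched with two toy structures. These are hypothetical layouts, not the real extent_buffer, and the exact sizes depend on the build ABI. On a typical x86-64 ABI the first is 24 bytes and the second 16. The slab numbers above follow the same arithmetic: a 4096-byte slab holds 4096 / 136 = 30 objects (rounded down) before the change and 4096 / 128 = 32 after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: an 8-byte member between two small ones. */
struct padded {
    char     flag_a;    /* followed by padding up to the alignment of 'start' */
    uint64_t start;
    char     flag_b;    /* followed by tail padding */
};

/* Same members, largest first, so the small ones share one slot. */
struct reordered {
    uint64_t start;
    char     flag_a;
    char     flag_b;
};

/* Bytes saved per object by the reordering on the build ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct padded) >= sizeof(struct reordered));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```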
     
  • Normally current->journal_info is cleared by commit_transaction. For an
    async snap or subvol creation, though, it runs in a work queue. Clear
    it in btrfs_commit_transaction_async() to avoid leaking a non-NULL
    journal_info when we return to userspace. When the actual commit runs in
    the other thread it won't care that its current->journal_info is already
    NULL.

    Signed-off-by: Sage Weil
    Tested-by: Jim Schutt
    Signed-off-by: Chris Mason

    Sage Weil
     
  • Josef recently changed the free extent cache to look in
    the block group cluster for any bitmaps before trying to
    add a new bitmap for the same offset. This avoids BUG_ON()s due
    to covering duplicate ranges.

    But it didn't go quite far enough. A given free range might span
    one or more bitmaps or free space entries. The code has
    looping to cover this, but it doesn't check for clustered bitmaps
    every time.

    This shuffles our gotos to check for a bitmap in the cluster
    for every new bitmap entry we try to add.

    Signed-off-by: Chris Mason

    Chris Mason
     

10 Jun, 2011

5 commits

  • Signed-off-by: Arne Jansen

    Arne Jansen
     
  • Scrub starts the workers each time a scrub starts and stops them after it
    finishes. This patch adds an initialization of the workers before each
    start; otherwise the workers behave strangely.

    Signed-off-by: Arne Jansen

    Arne Jansen
     
  • Due to the semantics of btrfs_search_slot, the path can point to an
    invalid slot when ret > 0. This condition went unnoticed, which in
    turn could have led to an incomplete scrub.

    Signed-off-by: Arne Jansen

    Arne Jansen
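
The general hazard can be sketched with a plain sorted array instead of a btree leaf. This is a simplified analogy, not btrfs code: `search_slot()` and `first_key_at_or_after()` are hypothetical helpers. A search that finds no exact match returns a positive value and an insertion position, and that position may be one past the last valid slot, so the caller has to check it before reading.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Binary search over sorted keys. Returns 0 and sets *slot on an exact
 * match; otherwise returns 1 and sets *slot to the insertion position,
 * which may equal n (one past the last valid slot).
 */
static int search_slot(const unsigned long *keys, size_t n,
                       unsigned long key, size_t *slot)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    *slot = lo;
    return (lo < n && keys[lo] == key) ? 0 : 1;
}

/* Returns 1 and stores the first key >= 'key', or 0 if there is none. */
static int first_key_at_or_after(const unsigned long *keys, size_t n,
                                 unsigned long key, unsigned long *found)
{
    size_t slot;
    int ret = search_slot(keys, n, key, &slot);

    assert(ret == 0 || ret == 1);
    if (slot >= n)      /* no exact match and slot is past the end */
        return 0;
    *found = keys[slot];
    return 1;
}
```

In a real btree the "past the end" case would usually mean advancing to the next leaf rather than giving up; the sketch only shows why the slot must be validated first.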
     
  • Arne's scrub stuff exposed a problem with mapping the extent buffer in
    reada_for_search. He searches the commit root with multiple threads and with
    skip_locking set, so we can race and overwrite node->map_token since node isn't
    locked. So fix this so that we only map the extent buffer if we don't already
    have a map_token and skip_locking isn't set. Without this patch scrub would
    panic almost immediately, with the patch it doesn't panic anymore. Thanks,

    Reported-by: Arne Jansen
    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • Unconditionally changing the address limit to USER_DS and not restoring
    it to its old value in the error path is wrong because it prevents us
    using kernel memory on repeated calls to this function. This, in fact,
    breaks the fallback of hard coded paths to the init program from being
    ever successful if the first candidate fails to load.

    With this patch applied switching to USER_DS is delayed until the point
    of no return is reached which makes it possible to have a multi-arch
    rootfs with one arch specific init binary for each of the (hard coded)
    probed paths.

    Since the address limit is already set to USER_DS when start_thread()
    will be invoked, this redundancy can be safely removed.

    Signed-off-by: Mathias Krause
    Cc: Al Viro
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Mathias Krause
     

09 Jun, 2011

11 commits

  • In btrfs_wait_for_commit if we came upon a transaction that had committed we
    just exited, but that's bad since we are holding the trans_lock. So break
    instead so that the lock is dropped. Thanks,

    Reported-by: David Sterba
    Signed-off-by: Josef Bacik

    Josef Bacik
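
The pattern behind this fix can be sketched in user space with a toy lock that only records whether it is held. All names here (`struct toy_lock`, `struct txn`, `needs_wait()`) are hypothetical and the logic is much simpler than btrfs_wait_for_commit. The point is that every exit from the loop uses break, so control always reaches the single unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that only records whether it is held (stand-in for a spinlock). */
struct toy_lock { bool held; };

static void lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void lock_release(struct toy_lock *l) { assert(l->held);  l->held = false; }

struct txn { unsigned long id; bool committed; };

/*
 * Look up a transaction by id while holding the lock.
 * Returns 1 if it is found and still needs waiting on, 0 otherwise.
 */
static int needs_wait(struct toy_lock *lock, const struct txn *txns,
                      size_t n, unsigned long id)
{
    int ret = 0;
    size_t i;

    lock_acquire(lock);
    for (i = 0; i < n; i++) {
        if (txns[i].id != id)
            continue;
        if (txns[i].committed)
            break;      /* not "return 0": that would leave the lock held */
        ret = 1;
        break;
    }
    lock_release(lock);
    return ret;
}
```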
     
  • Arne's scrub stuff exposed a problem with mapping the extent buffer in
    reada_for_search. He searches the commit root with multiple threads and with
    skip_locking set, so we can race and overwrite node->map_token since node isn't
    locked. So fix this so that we only map the extent buffer if we don't already
    have a map_token and skip_locking isn't set. Without this patch scrub would
    panic almost immediately, with the patch it doesn't panic anymore. Thanks,

    Reported-by: Arne Jansen
    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • …file into perf/urgent

    Ingo Molnar
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: trivial: add space in fsc error message
    cifs: silence printk when establishing first session on socket
    CIFS ACL support needs CONFIG_KEYS, so depend on it
    possible memory corruption in cifs_parse_mount_options()
    cifs: make CIFS depend on CRYPTO_ECB
    cifs: fix the kernel release version in the default security warning message

    Linus Torvalds
     
  • When merging my code into the integration test the second check for duplicate
    entries got screwed up. This patch fixes it by dropping ret2 and just using ret
    for the return value, and checking if we got an error before adding the bitmap
    to the local list. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • I was testing with empty_cluster = 0 to try and reproduce a problem and kept
    hitting early enospc panics. This was because our loop logic was a little
    confused. So this is what I did

    1) Make the loop variable the ultimate decider on whether we should loop again
    instead of checking to see if we had an uncached bg, empty size or empty
    cluster.

    2) Increment loop before checking to see what we are on to make the loop
    definitions make more sense.

    3) If we are on the chunk alloc loop don't set empty_size/empty_cluster to 0
    unless we didn't actually allocate a chunk. If we did allocate a chunk we
    should be able to easily setup a new cluster so clearing
    empty_size/empty_cluster makes us less efficient.

    This kept me from hitting panics while trying to reproduce the other problem.
    Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • In cleaning up the clustering code I accidentally introduced a regression by
    adding bitmap entries to the cluster rb tree. The problem is if we've maxed out
    the number of bitmaps we can have for the block group we can only add free space
    to the bitmaps, but since the bitmap is on the cluster we can't find it and we
    try to create another one. This would result in a panic because the total
    bitmaps was bigger than the max bitmaps that were allowed. This patch fixes
    this by checking to see if we have a cluster, and then looking at the cluster rb
    tree to see if it has a bitmap entry and if it does and that space belongs to
    that bitmap, go ahead and add it to that bitmap.

    I could hit this panic every time with an fs_mark test within a couple of
    minutes. With this patch I no longer hit the panic and fs_mark goes to
    completion. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • I noticed when running an enospc test that we would get stuck committing the
    transaction in check_data_space even though we truly didn't have enough space.
    So check to see if bytes_pinned is bigger than num_bytes, if it's not don't
    commit the transaction. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
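
The check described above can be sketched as a small predicate. `struct space_info` and `commit_could_help()` are hypothetical simplifications, and treating "at least num_bytes" as sufficient (>= rather than a strict >) is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the space accounting involved. */
struct space_info {
    uint64_t bytes_pinned;
};

/*
 * Committing the transaction only helps if the pinned bytes it would
 * release can cover the request (assumption: ">=" counts as enough).
 */
static bool commit_could_help(const struct space_info *info, uint64_t num_bytes)
{
    assert(info != NULL);
    return info->bytes_pinned >= num_bytes;
}
```

A caller would skip the commit and report ENOSPC directly when this returns false, instead of repeatedly committing with no chance of freeing enough space.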
     
  • When profiling the find cluster code it's hard to tell where we are spending our
    time because the bitmap and non-bitmap functions get inlined by the compiler, so
    make that not happen. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • If we are looking for a cluster in a particularly sparse or fragmented block
    group, we will do a lot of looping through the free space tree looking for
    various things, and if we need to look at bitmaps we will end up doing the whole
    dance twice. So instead add the bitmap entries to a temporary list so if we
    have to do the bitmap search we can just look through the list of entries we've
    found quickly instead of having to loop through the entire tree again. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
     

08 Jun, 2011

7 commits


07 Jun, 2011

1 commit