17 Feb, 2011

4 commits

  • When decompressing a chunk of data, we'll copy the data out to
    a working buffer if the data is stored in more than one page,
    otherwise we'll use the mapped page directly to avoid memory
    copy.

    In the latter case, we'll end up accessing the kernel address
    after we've unmapped the page in a corner case.

    Reported-by: Juan Francisco Cantero Hurtado
    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
     
  • - Check user-specified flags correctly
    - Check the inode ownership
    - Search root item in root tree but not fs tree

    Reported-by: Dan Rosenberg
    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
     
  • Btrfs device shrinking and balancing ends up reallocating all the blocks
    in order to allow COW to move them to new destinations. It is somewhat
    awkward in terms of ENOSPC because most of the enospc code is built
    around the idea that some operation on a reference counted tree triggers
    allocations in the non-reference counted trees.

    This commit changes the balancing code to deal with enospc by trying to
    allocate a new chunk. If that allocation succeeds, we go ahead and
    retry whatever failed due to enospc.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • ENOSPC in btrfs is getting to the point where the extra debugging isn't
    required. I've put it under mount -o enospc_debug just in case someone
    is having difficult problems.

    Signed-off-by: Chris Mason

    Chris Mason
     

15 Feb, 2011

6 commits

  • Add checks on the return value of alloc_extent_map() in several places.
    In addition, alloc_extent_map() returns only a valid address or NULL,
    so checking it with IS_ERR() is unnecessary. Remove the IS_ERR() checks.

    Signed-off-by: Tsutomu Itoh
    Signed-off-by: Chris Mason

    Tsutomu Itoh
     
  • Memory allocated by calling kstrdup() should be freed.

    Signed-off-by: Ilya Dryomov
    Signed-off-by: Chris Mason

    Ilya Dryomov
     
  • Commit bf5fc093c5b625e4259203f1cee7ca73488a5620 refactored
    btrfs_ioctl_space_info() and introduced several security issues.

    space_args.space_slots is an unsigned 64-bit type controlled by a
    possibly unprivileged caller. The comparison as a signed int type
    allows providing values that are treated as negative and cause the
    subsequent allocation size calculation to wrap, or be truncated to 0.
    By providing a size that's truncated to 0, kmalloc() will return
    ZERO_SIZE_PTR. It's also possible to provide a value smaller than the
    slot count. The subsequent loop ignores the allocation size when
    copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.

    The fix changes the slot count type and comparison typecast to u64,
    which prevents truncation or signedness errors, and also ensures that we
    don't copy more data than we've allocated in the subsequent loop. Note
    that zero-size allocations are no longer possible since there is already
    an explicit check for space_args.space_slots being 0 and truncation of
    this value is no longer an issue.

    Signed-off-by: Dan Rosenberg
    Signed-off-by: Josef Bacik
    Reviewed-by: Josef Bacik
    Signed-off-by: Chris Mason

    Dan Rosenberg
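
    The truncation described above can be reproduced outside the kernel.
    The following is a minimal user-space sketch, not the btrfs code: the
    struct layout and all function names are assumptions made for
    illustration. It contrasts clamping the requested slot count as an int
    with clamping it as an unsigned 64-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slot record; the real structure is not shown in the log. */
struct slot_info {
    uint64_t flags;
    uint64_t total_bytes;
    uint64_t used_bytes;
};

/* Buggy pattern: take the minimum after casting the request to int. */
static int clamp_slots_as_int(uint64_t requested, int available)
{
    int r = (int)requested;   /* upper 32 bits are dropped on typical ABIs */
    return r < available ? r : available;
}

/* Fixed pattern: take the minimum in the full 64-bit unsigned type. */
static uint64_t clamp_slots_as_u64(uint64_t requested, uint64_t available)
{
    return requested < available ? requested : available;
}

/* Allocation size that follows from a clamped slot count. */
static size_t alloc_size_for(uint64_t slots)
{
    return (size_t)(slots * sizeof(struct slot_info));
}
```

    With a request of 0x100000000 and four real slots, the int variant
    yields 0 on typical compilers (the conversion is implementation-defined
    in C), so the allocation size is 0 while a later loop would still copy
    four records. The u64 variant yields 4.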
     
  • Mark the cloned backref_node as checked in clone_backref_node()

    Signed-off-by: Yan, Zheng
    Signed-off-by: Chris Mason

    Yan, Zheng
     
  • Btrfs tracks uptodate state in an rbtree as well as in the
    page bits. This is supposed to enable us to use block sizes other than
    the page size, but there are a few parts still missing before that
    completely works.

    But, our readpage routine trusts this additional range based tracking
    of uptodateness, much in the same way the buffer head up to date bits
    are trusted for the other filesystems.

    The problem is that sometimes we need to allocate memory in order to
    split records in the rbtree, even when we are just clearing bits. This
    can be difficult when our clearing function is called with GFP_ATOMIC, which
    can happen in the releasepage path.

    So, what happens today looks like this:

    releasepage called with GFP_ATOMIC
    btrfs_releasepage calls clear_extent_bit
    clear_extent_bit fails to allocate ram, leaving the up to date bit set
    btrfs_releasepage returns success

    The end result is the page being gone, but btrfs thinking the range is
    up to date. Later on if someone tries to read that same page, the
    btrfs readpage code will return immediately thinking the page is already
    up to date.

    This commit fixes things to fail the releasepage when we can't clear the
    extent state bits. It covers both data pages and metadata tree blocks.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • There is a race where btrfs_releasepage can drop the
    page->private contents just as alloc_extent_buffer is setting
    up pages for metadata. Because of how the Btrfs page flags work,
    this results in us skipping the crc on the page during IO.

    This patch solves the race by waiting until after the extent buffer
    is inserted into the radix tree before it sets page private.

    Signed-off-by: Chris Mason

    Chris Mason
     

08 Feb, 2011

1 commit


06 Feb, 2011

4 commits

  • As this function is called in some error paths while not
    removing the module, the __exit attribute prevents the kernel
    image from linking when btrfs is compiled in statically.

    Signed-off-by: Alexey Charkov
    Signed-off-by: Chris Mason

    Alexey Charkov
     
  • When btrfs_alloc_path() fails, btrfs_free_path() need not be called.
    Therefore, the branch taken on that failure is changed to skip it.

    Signed-off-by: Tsutomu Itoh
    Signed-off-by: Chris Mason

    Tsutomu Itoh
     
  • This has been resulting in a BUG_ON(ret) after btrfs_reserve_extent in
    btrfs_cow_file_range. The reason is we don't actually calculate the bytes_super
    for a block group until we go to cache it, which means that the space_info can
    hand out reservations for space that it doesn't actually have, and we can run
    out of data space. This is also a problem if you are using space caching since
    we don't ever calculate bytes_super for the block groups. So instead every time
    we read a block group call exclude_super_stripes, which calculates the
    bytes_super for the block group so it can be left out of the space_info. Then
    whenever caching completes we just call free_excluded_extents so that the super
    excluded extents are freed up. Also if we are unmounting and we hit any block
    groups that haven't been cached we still need to call free_excluded_extents to
    make sure things are cleaned up properly. Thanks,

    Reported-by: Arne Jansen
    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • When we're cleaning up the tree log we need to be able to remove free space from
    the block group. The problem is if that free space spans bitmaps we would not
    find the space since we're looking for too many bytes. So make sure the amount
    of bytes we search for is limited to either the number of bytes we want, or the
    number of bytes left in the bitmap. This was tested by a user who was hitting
    the BUG() after search_bitmap. With this patch he can now mount his fs.
    Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
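
    The length limit described above amounts to taking a minimum. A minimal
    sketch, assuming a bitmap that covers the byte range [bitmap_start,
    bitmap_start + bitmap_span); the function and parameter names are
    hypothetical and do not come from the btrfs source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes to search for inside one bitmap: either everything we
 * want, or only what is left in this bitmap from 'offset' onward.
 */
static uint64_t bitmap_search_len(uint64_t offset, uint64_t bytes,
                                  uint64_t bitmap_start, uint64_t bitmap_span)
{
    uint64_t bitmap_end = bitmap_start + bitmap_span;

    /* The caller is assumed to pass an offset that lies inside the bitmap. */
    assert(offset >= bitmap_start && offset < bitmap_end);

    uint64_t left = bitmap_end - offset;
    return bytes < left ? bytes : left;
}
```

    A request for 300 bytes starting 100 bytes before the end of a bitmap
    searches for only 100 bytes there; the remainder would be looked up in
    the next bitmap.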
     

01 Feb, 2011

5 commits


29 Jan, 2011

12 commits

  • Instead of doing a BUG_ON(1) in prepare_pages if grab_cache_page() fails, just
    loop through the pages we've already grabbed and unlock and release them, then
    return -ENOMEM like we should. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
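
    The unwind-on-failure pattern described above can be sketched in user
    space. This is an illustration under stated assumptions, not the
    prepare_pages() code: heap buffers stand in for page cache pages, and
    grab_buffer(), release_buffer(), prepare_buffers() and the fail_at
    parameter are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_buffers;   /* buffers currently held, for demonstration */

/* Stand-in for grabbing a page; fails at index fail_at (-1 means never). */
static void *grab_buffer(int index, int fail_at)
{
    if (index == fail_at)
        return NULL;
    void *p = malloc(16);
    if (p)
        live_buffers++;
    return p;
}

/* Stand-in for unlocking and releasing a page. */
static void release_buffer(void *p)
{
    free(p);
    live_buffers--;
}

/* Grab n buffers; on failure release the ones already held. */
static int prepare_buffers(void **bufs, int n, int fail_at)
{
    for (int i = 0; i < n; i++) {
        bufs[i] = grab_buffer(i, fail_at);
        if (!bufs[i]) {
            /* Walk back over what we already grabbed and drop it. */
            while (--i >= 0) {
                release_buffer(bufs[i]);
                bufs[i] = NULL;
            }
            return -ENOMEM;
        }
    }
    return 0;
}
```

    The caller sees either all n buffers or none, plus an error code it can
    propagate instead of a crash.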
     
  • Got a report of a box panicking because we got a NULL eb in read_extent_buffer.
    His fs was borked and btrfs_search_path returned EIO, but we don't check for
    errors so the box panicked. Yes I know this will just make something higher up
    the stack panic, but that's a problem for future Josef. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • We call use_block_rsv right before we make an allocation in order to make sure
    we have enough space. Now normally people have called btrfs_start_transaction()
    with the appropriate amount of space that we need, so we just use some of that
    pre-reserved space and move along happily. The problem is where people use
    btrfs_join_transaction(), which doesn't actually reserve any space. So we try
    and reserve space here, but we cannot flush delalloc, so this forces us to
    return -ENOSPC when in reality we have plenty of space. The most common symptom
    is seeing a bunch of "couldn't dirty inode" messages in syslog. With
    xfstests 224 we end up falling back to start_transaction and then doing all the
    flush delalloc stuff which causes it to hang for a very long time.

    So instead steal from the global reserve, which is what this is meant for
    anyway. With this patch and the other 2 I have sent xfstests 224 now passes
    successfully. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • When we do btrfs_block_rsv_release, if global_block_rsv is not full we will
    release all the extra bytes to global_block_rsv, even if it's only a little
    short of the amount of space that we need to reserve. This causes us to starve
    ourselves of reservable space during the transaction which will force us to
    shrink delalloc bytes and commit the transaction more often than we should. So
    instead just add the amount of bytes we need to add to the global reserve so
    reserved == size, and then add the rest back into the space_info for general
    use. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
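
    The split described above is simple arithmetic. A minimal sketch,
    assuming a reserve tracked by a target size and a currently reserved
    amount; struct rsv and release_to_global() are hypothetical names, not
    the btrfs structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reserve: 'size' is the target, 'reserved' is what it holds. */
struct rsv {
    uint64_t size;
    uint64_t reserved;
};

/*
 * Top the global reserve up to its target with the released bytes and
 * return whatever is left over for general use.
 */
static uint64_t release_to_global(struct rsv *global, uint64_t released)
{
    uint64_t shortfall = global->size > global->reserved
                             ? global->size - global->reserved
                             : 0;
    uint64_t topup = released < shortfall ? released : shortfall;

    global->reserved += topup;
    return released - topup;   /* goes back to the general pool */
}
```

    With a target of 100 and 90 already reserved, releasing 50 bytes puts 10
    into the global reserve and returns 40 for general use, rather than
    handing all 50 to the global reserve.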
     
  • When running xfstests 224 I kept getting ENOSPC when trying to remove the files,
    and this is because we were returning ret from check_path_shared while it was
    uninitialized, which isn't right. Fix this to return 0 properly, and now
    xfstests 224 doesn't freak out when it tries to clean itself up. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • btrfs_start_ioctl_transaction() returns ERR_PTR(), not NULL.
    So, it is necessary to use IS_ERR() to check the return value.

    Signed-off-by: Tsutomu Itoh
    Signed-off-by: Chris Mason

    Tsutomu Itoh
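
    The difference between a NULL check and an IS_ERR() check can be shown
    with a user-space imitation of the kernel's error-pointer helpers. The
    three helpers below mirror the names in include/linux/err.h but are
    reimplemented here as a sketch; start_txn() and use_txn() are
    hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_txn;   /* stands in for a transaction handle */

/* Hypothetical starter that reports failure as ERR_PTR(), never NULL. */
static void *start_txn(int fail)
{
    return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_txn;
}

/* Correct caller: test with IS_ERR() and propagate the errno. */
static long use_txn(int fail)
{
    void *trans = start_txn(fail);

    if (IS_ERR(trans))
        return PTR_ERR(trans);
    return 0;
}
```

    A caller that tested only for NULL would treat the failed return as a
    valid handle, because an error pointer is never NULL.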
     
  • Add error checks for btrfs_join_transaction() and
    btrfs_join_transaction_nolock(), and correct mistaken error checks in
    several places.

    To make Btrfs more stable, I think we should reduce the use of BUG_ON(),
    but that will take a long time.
    So, I propose this patch as a short-term solution.

    With this patch:
    - The parts that must be corrected to make Btrfs more stable are made clear.
    - NULL pointer dereferences and similar faults no longer cause a panic
    (even if the number of BUG_ON() calls increases temporarily).
    - The error code is returned wherever it can easily be returned.

    As a long-term plan:
    - BUG_ON() is reduced by using the forced-readonly framework, etc.

    Signed-off-by: Tsutomu Itoh
    Signed-off-by: Chris Mason

    Tsutomu Itoh
     
  • After the conditional that precedes the following code, inode may be an
    ERR_PTR value. This can, e.g., result from a memory allocation failure via the
    call to btrfs_iget, and thus does not imply that root is different than
    sub_root. Thus, an IS_ERR check is added to ensure that there is no
    dereference of inode in this case.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r@
    identifier f;
    @@
    f(...) { ... return ERR_PTR(...); }

    @@
    identifier r.f, fld;
    expression x;
    statement S1,S2;
    @@
    x = f(...)
    ... when != IS_ERR(x)
    (
    if (IS_ERR(x) ||...) S1 else S2
    |
    *x->fld
    )
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: Chris Mason

    Julia Lawall
     
  • There is a missing break in switch, fix it.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason

    liubo
     
  • To make btrfs more stable, add several missing memory allocation
    checks and return the proper errno when memory is unavailable.

    We've checked that some of those -ENOMEM errors will be returned to
    userspace, some will be caught by BUG_ON() in the upper callers,
    and none will be silently ignored.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason

    liubo
     
  • btrfs_submit_compressed_read() lacks memory allocation checks and the
    corresponding error path.

    After this fix, in the "no memory" case the errno is propagated back
    to userland step by step, telling users the operation cannot continue.

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason

    liubo
     
  • Chris Mason
     

27 Jan, 2011

8 commits