08 Dec, 2018

1 commit

  • commit 10950929e994c5ecee149ff0873388d3c98f12b5 upstream.

    [BUG]
    A completely valid btrfs will refuse to mount, with error message like:
    BTRFS critical (device sdb2): corrupt leaf: root=2 block=239681536 slot=172 \
    bg_start=12018974720 bg_len=10888413184, invalid block group size, \
    have 10888413184 expect (0, 10737418240]

    This has been reported several times as the 4.19 kernel is now being
    used. The filesystem refuses to mount, but is otherwise ok and booting
    4.18 is a workaround.

    Btrfs check returns no error, and all kernels used on this fs is later
    than 2011, which should all have the 10G size limit commit.

    [CAUSE]
    For a 12 devices btrfs, we could allocate a chunk larger than 10G due to
    stripe stripe bump up.

    __btrfs_alloc_chunk()
    |- max_stripe_size = 1G
    |- max_chunk_size = 10G
    |- data_stripe = 11
    |- if (1G * 11 > 10G) {
    stripe_size = 976128930;
    stripe_size = round_up(976128930, SZ_16M) = 989855744

    However the final stripe_size (989855744) * 11 = 10888413184, which is
    still larger than 10G.

    [FIX]
    For the comprehensive check, we need to do the full check at chunk read
    time, and rely on bg chunk mapping to do the check.

    We could just skip the length check for now.

    Fixes: fce466eab7ac ("btrfs: tree-checker: Verify block_group_item")
    Cc: stable@vger.kernel.org # v4.19+
    Reported-by: Wang Yugui
    Signed-off-by: Qu Wenruo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Qu Wenruo
     

06 Dec, 2018

10 commits

  • commit 761333f2f50ccc887aa9957ae829300262c0d15b upstream.

    block_group_err shows the group system as a decimal value with a '0x'
    prefix, which is somewhat misleading.

    Fix it to print hexadecimal, as was intended.

    Fixes: fce466eab7ac6 ("btrfs: tree-checker: Verify block_group_item")
    Reviewed-by: Nikolay Borisov
    Reviewed-by: Qu Wenruo
    Signed-off-by: Shaokun Zhang
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Shaokun Zhang
     
  • commit f556faa46eb4e96d0d0772e74ecf66781e132f72 upstream.

    Although we have tree level check at tree read runtime, it's completely
    based on its parent level.
    We still need to do accurate level check to avoid invalid tree blocks
    sneak into kernel space.

    The check itself is simple, for leaf its level should always be 0.
    For nodes its level should be in range [1, BTRFS_MAX_LEVEL - 1].

    Signed-off-by: Qu Wenruo
    Reviewed-by: Su Yue
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    [bwh: Backported to 4.14:
    - Pass root instead of fs_info to generic_err()
    - Adjust context]
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit ba480dd4db9f1798541eb2d1c423fc95feee8d36 upstream.

    A crafted image has empty root tree block, which will later cause NULL
    pointer dereference.

    The following trees should never be empty:
    1) Tree root
    Must contain at least root items for extent tree, device tree and fs
    tree

    2) Chunk tree
    Or we can't even bootstrap as it contains the mapping.

    3) Fs tree
    At least inode item for top level inode (.).

    4) Device tree
    Dev extents for chunks

    5) Extent tree
    Must have corresponding extent for each chunk.

    If any of them is empty, we are sure the fs is corrupted and no need to
    mount it.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847
    Reported-by: Xu Wen
    Signed-off-by: Qu Wenruo
    Tested-by: Gu Jinxiang
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    [bwh: Backported to 4.14: Pass root instead of fs_info to generic_err()]
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit fce466eab7ac6baa9d2dcd88abcf945be3d4a089 upstream.

    A crafted image with invalid block group items could make free space cache
    code to cause panic.

    We could detect such invalid block group item by checking:
    1) Item size
    Known fixed value.
    2) Block group size (key.offset)
    We have an upper limit on block group item (10G)
    3) Chunk objectid
    Known fixed value.
    4) Type
    Only 4 valid type values, DATA, METADATA, SYSTEM and DATA|METADATA.
    No more than 1 bit set for profile type.
    5) Used space
    No more than the block group size.

    This should allow btrfs to detect and refuse to mount the crafted image.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=199849
    Reported-by: Xu Wen
    Signed-off-by: Qu Wenruo
    Reviewed-by: Gu Jinxiang
    Reviewed-by: Nikolay Borisov
    Tested-by: Gu Jinxiang
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    [bwh: Backported to 4.14:
    - In check_leaf_item(), pass root->fs_info to check_block_group_item()
    - Adjust context]
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit e2683fc9d219430f5b78889b50cde7f40efeba7b upstream.

    I've noticed that the updated item checker stack consumption increased
    dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker
    for dir item")

    tree-checker.c:check_leaf +552 (176 -> 728)

    The array is 255 bytes long, dynamic allocation would slow down the
    sanity checks so it's more reasonable to keep it on-stack. Moving the
    variable to the scope of use reduces the stack usage again

    tree-checker.c:check_leaf -264 (728 -> 464)

    Reviewed-by: Josef Bacik
    Reviewed-by: Qu Wenruo
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    David Sterba
     
  • commit 7cfad65297bfe0aa2996cd72d21c898aa84436d9 upstream.

    The return value of sizeof() is of type size_t, so we must print it
    using the %z format modifier rather than %l to avoid this warning
    on some architectures:

    fs/btrfs/tree-checker.c: In function 'check_dir_item':
    fs/btrfs/tree-checker.c:273:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Werror=format=]

    Fixes: 005887f2e3e0 ("btrfs: tree-checker: Add checker for dir item")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Arnd Bergmann
     
  • commit ad7b0368f33cffe67fecd302028915926e50ef7e upstream.

    Add checker for dir item, for key types DIR_ITEM, DIR_INDEX and
    XATTR_ITEM.

    This checker does comprehensive checks for:

    1) dir_item header and its data size
    Against item boundary and maximum name/xattr length.
    This part is mostly the same as old verify_dir_item().

    2) dir_type
    Against maximum file types, and against key type.
    Since XATTR key should only have FT_XATTR dir item, and normal dir
    item type should not have XATTR key.

    The check between key->type and dir_type is newly introduced by this
    patch.

    3) name hash
    For XATTR and DIR_ITEM key, key->offset is name hash (crc32c).
    Check the hash of the name against the key to ensure it's correct.

    The name hash check is only found in btrfs-progs before this patch.

    Signed-off-by: Qu Wenruo
    Reviewed-by: Nikolay Borisov
    Reviewed-by: Su Yue
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit 69fc6cbbac542c349b3d350d10f6e394c253c81d upstream.

    [BUG]
    If we run btrfs with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y, it will
    instantly cause kernel panic like:

    ------
    ...
    assertion failed: 0, file: fs/btrfs/disk-io.c, line: 3853
    ...
    Call Trace:
    btrfs_mark_buffer_dirty+0x187/0x1f0 [btrfs]
    setup_items_for_insert+0x385/0x650 [btrfs]
    __btrfs_drop_extents+0x129a/0x1870 [btrfs]
    ...
    -----

    [Cause]
    Btrfs will call btrfs_check_leaf() in btrfs_mark_buffer_dirty() to check
    if the leaf is valid with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y.

    However quite some btrfs_mark_buffer_dirty() callers(*) don't really
    initialize its item data but only initialize its item pointers, leaving
    item data uninitialized.

    This makes tree-checker catch uninitialized data as error, causing
    such panic.

    *: These callers include but not limited to
    setup_items_for_insert()
    btrfs_split_item()
    btrfs_expand_item()

    [Fix]
    Add a new parameter @check_item_data to btrfs_check_leaf().
    With @check_item_data set to false, item data check will be skipped and
    fallback to old btrfs_check_leaf() behavior.

    So we can still get early warning if we screw up item pointers, and
    avoid false panic.

    Cc: Filipe Manana
    Reported-by: Lakshmipathi.G
    Signed-off-by: Qu Wenruo
    Reviewed-by: Liu Bo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings

    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit bba4f29896c986c4cec17bc0f19f2ce644fceae1 upstream.

    Use inline function to replace macro since we don't need
    stringification.
    (Macro still exists until all callers get updated)

    And add more info about the error, and replace EIO with EUCLEAN.

    For nr_items error, report if it's too large or too small, and output
    the valid value range.

    For node block pointer, added a new alignment checker.

    For key order, also output the next key to make the problem more
    obvious.

    Signed-off-by: Qu Wenruo
    [ wording adjustments, unindented long strings ]
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo
     
  • commit 557ea5dd003d371536f6b4e8f7c8209a2b6fd4e3 upstream.

    It's no doubt the comprehensive tree block checker will become larger,
    so moving them into their own files is quite reasonable.

    Signed-off-by: Qu Wenruo
    [ wording adjustments ]
    Signed-off-by: David Sterba
    Signed-off-by: Ben Hutchings
    Signed-off-by: Sasha Levin

    Qu Wenruo