01 Jul, 2013

1 commit


17 Jun, 2013

1 commit


06 Jun, 2013

1 commit


22 Apr, 2013

2 commits


04 Apr, 2013

1 commit

  • In many places in ext4 we currently use
    ext4_get_group_no_and_offset() even though we only need the block
    group of a particular block, not the offset within the block group.
    In those cases the block group can be computed more efficiently.

    This patch introduces ext4_get_group_number(), which computes the
    block group for a given block much more efficiently. Use this function
    instead of ext4_get_group_no_and_offset() wherever we only need the
    block group.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
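
The idea behind the commit above can be illustrated with a minimal userspace sketch. It is not the kernel implementation: struct fs_geom, group_no_and_offset() and group_number() are hypothetical names, and the shortcut assumes the group size is a power of two so that a shift can replace the division.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified geometry; not the kernel's struct ext4_sb_info. */
struct fs_geom {
    uint64_t first_data_block; /* block number where group 0 starts */
    uint32_t blocks_per_group; /* blocks covered by one group */
    unsigned group_shift;      /* log2(blocks_per_group) if a power of two, else 0 */
};

/* General form: one division for the group and one modulo for the offset. */
static void group_no_and_offset(const struct fs_geom *g, uint64_t block,
                                uint32_t *group, uint32_t *offset)
{
    uint64_t rel = block - g->first_data_block;

    if (group)
        *group = (uint32_t)(rel / g->blocks_per_group);
    if (offset)
        *offset = (uint32_t)(rel % g->blocks_per_group);
}

/* Group-only form: a single shift when the group size is a power of two. */
static uint32_t group_number(const struct fs_geom *g, uint64_t block)
{
    uint32_t group;

    if (g->group_shift)
        return (uint32_t)((block - g->first_data_block) >> g->group_shift);

    /* Fall back to the general computation for other group sizes. */
    group_no_and_offset(g, block, &group, NULL);
    return group;
}
```

Callers that never look at the offset can then call group_number() directly instead of passing a dummy offset pointer.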
     

12 Mar, 2013

1 commit

  • A user with an 8TB+ file system and a very large flexbg
    size (> 65536) could cause the atomic_t used in the struct flex_groups
    to overflow. This was detected by the PaX security patchset:

    http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551

    This bug was introduced in commit 9f24e4208f7e, so it's been around
    since 2.6.30. :-(

    Fix this by using an atomic64_t for struct orlov_stats's
    free_clusters.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Lukas Czerner
    Cc: stable@vger.kernel.org

    Theodore Ts'o
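
The overflow described above comes down to simple arithmetic, sketched here in userspace C. The helper names are hypothetical, the figure of 32768 clusters per group is an assumption (4 KiB blocks without bigalloc), and plain integers stand in for the kernel's atomic types.

```c
#include <stdint.h>

/* Free clusters in a flex group if every member group is completely free. */
static int64_t flex_free_clusters(uint64_t groups_per_flex,
                                  uint64_t clusters_per_group)
{
    return (int64_t)(groups_per_flex * clusters_per_group);
}

/* Nonzero if the value can be held in a signed 32-bit counter. */
static int fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

Under these assumptions, 65536 groups of 32768 clusters give 2^31 free clusters, one more than a signed 32-bit counter can hold, while a 64-bit counter has ample room.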
     

03 Mar, 2013

1 commit


09 Feb, 2013

1 commit

  • So we can better understand what bits of ext4 are responsible for
    long-running jbd2 handles, use jbd2__journal_start() so we can pass
    context information for logging purposes.

    The recommended way for finding the longer-running handles is:

    T=/sys/kernel/debug/tracing
    EVENT=$T/events/jbd2/jbd2_handle_stats
    echo "interval > 5" > $EVENT/filter
    echo 1 > $EVENT/enable

    ./run-my-fs-benchmark

    cat $T/trace > /tmp/problem-handles

    This will list handles that were active for longer than 20ms (the
    filter value is in jiffies; 5 jiffies is 20ms if HZ is 250, which is
    what the figures here imply). Having longer-running handles is bad,
    because a commit started at the wrong time could stall for those 20+
    milliseconds, which could delay an fsync() or an O_SYNC operation.
    Here is an example line from the trace file describing a handle which
    lived on for 311 jiffies, or over 1.2 seconds:

    postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
    tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
    dirtied_blocks 0

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
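
To relate the "interval" field in a trace line to wall-clock time, a small userspace sketch such as the following could be used. The helper names are hypothetical, and HZ=250 is an assumption inferred from the commit message (311 jiffies being a little over 1.2 seconds); the real value depends on the kernel configuration.

```c
#include <stdio.h>
#include <string.h>

/* Return the value of the "interval" field of a jbd2_handle_stats line,
 * or -1 if the field is not present. */
static long parse_interval(const char *line)
{
    const char *p = strstr(line, " interval ");
    long v;

    if (!p || sscanf(p, " interval %ld", &v) != 1)
        return -1;
    return v;
}

/* Convert jiffies to milliseconds for a given tick rate (hz is assumed,
 * not read from the running kernel). */
static long jiffies_to_ms(long jiffies, long hz)
{
    return jiffies * 1000 / hz;
}
```

With hz set to 250, the filter threshold of 5 jiffies maps to 20 ms and the example handle's 311 jiffies map to 1244 ms.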
     

13 Jan, 2013

3 commits


09 Nov, 2012

1 commit

  • ext4_handle_release_buffer() was intended to remove journal
    write access from a buffer, but it does nothing at all other
    than add a BUFFER_TRACE point, and it is not reliably used
    even for that. Remove all the associated dead code.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Carlos Maiolino

    Eric Sandeen
     

22 Oct, 2012

1 commit

  • In mke2fs, we checksum only the whole bitmap block, which is correct.
    In the kernel, however, we use EXT4_BLOCKS_PER_GROUP to indicate the
    size of the checksummed bitmap, which is wrong when bigalloc is
    enabled. The right size is EXT4_CLUSTERS_PER_GROUP, and this patch
    fixes it.

    Also, as every caller of ext4_block_bitmap_csum_set and
    ext4_block_bitmap_csum_verify passes in EXT4_BLOCKS_PER_GROUP(sb)/8,
    remove this parameter and set the size in the function itself.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Lukas Czerner
    Cc: stable@vger.kernel.org

    Tao Ma
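
The size mismatch fixed above can be shown with a short sketch. bitmap_csum_size() is a hypothetical helper, not a kernel function, and the example figures (4 KiB blocks, 16 blocks per cluster, 32768 clusters per group) are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Bytes of block bitmap to checksum: one bit per cluster, not per block.
 * cluster_bits is log2(blocks per cluster); 0 means bigalloc is off. */
static uint32_t bitmap_csum_size(uint32_t blocks_per_group,
                                 unsigned cluster_bits)
{
    uint32_t clusters_per_group = blocks_per_group >> cluster_bits;

    return clusters_per_group / 8;
}
```

With 524288 blocks per group and 16 blocks per cluster, the per-cluster size is 4096 bytes, which fits in one 4 KiB bitmap block, whereas blocks_per_group / 8 would be 65536 bytes and run far past the end of the bitmap.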
     

26 Sep, 2012

2 commits

  • When performing an online resize, we add a bunch of groups at one time
    in ext4_flex_group_add, so in most cases many group descriptors
    will be in the same group descriptor block. But at the end of this
    function, update_backups is called for every group descriptor, so the
    same block is copied and journalled again and again. This is
    wasteful.

    Fix things so we only update a particular bg descriptor block once and
    skip subsequent updates of the same block.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • bh_submit_read() is responsible for unlocking bh on endio. In addition,
    we need to use bh_uptodate_or_lock() to avoid races.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"

    Dmitry Monakhov
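
For the first of the two commits above (updating each backup descriptor block only once), the skip logic can be sketched as follows. This is a userspace illustration with a hypothetical helper name; it only counts how many backup updates would be issued and does not model journalling.

```c
#include <stdint.h>

/* Count backup updates for groups [first_group, first_group + count),
 * issuing one update per distinct descriptor block instead of one per
 * group descriptor. descs_per_block must be nonzero. */
static unsigned count_backup_updates(uint32_t first_group, uint32_t count,
                                     uint32_t descs_per_block)
{
    unsigned updates = 0;
    uint32_t last_block = 0;
    int have_last = 0;
    uint32_t i;

    for (i = 0; i < count; i++) {
        uint32_t desc_block = (first_group + i) / descs_per_block;

        /* Same descriptor block as the previous group: already updated. */
        if (have_last && desc_block == last_block)
            continue;

        last_block = desc_block;
        have_last = 1;
        updates++;
    }
    return updates;
}
```

Because consecutive groups map to non-decreasing descriptor block numbers, remembering only the last updated block is enough to avoid the repeated copies.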
     

20 Sep, 2012

1 commit

  • The update_backups() function is used to back up all the metadata
    blocks, so we should not take it for granted that 'data' points to
    a superblock and use ext4_superblock_csum_set to calculate the
    checksum there. In the case where the data is a group descriptor
    block, it will corrupt the last group descriptor, and then e2fsck will
    complain about it.

    As all the metadata checksums should already be OK when we do the
    backup, remove the wrong ext4_superblock_csum_set and it should be
    just fine.

    Reported-by: "Theodore Ts'o"
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Tao Ma
     

19 Sep, 2012

1 commit

  • Commit 1c6bd7173d66b3 introduced a regression where an online resize
    operation which did not change the number of block groups would fail,
    e.g.:

    mke2fs -t /dev/vdc 60000
    mount /dev/vdc
    resize2fs /dev/vdc 60001

    This was due to a bug in the logic regarding when to try converting
    the filesystem to use meta_bg.

    Also fix up a number of other minor issues with the online resizing
    code: (a) Fix a sparse warning; (b) only check to make sure the device
    is large enough once, instead of multiple times through the resize
    loop.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

13 Sep, 2012

3 commits


05 Sep, 2012

7 commits


23 Jul, 2012

2 commits


10 Jul, 2012

1 commit

  • Commit f975d6bcc7a introduced a bug which caused ext4_statfs() to
    miscalculate the number of file system overhead blocks. This causes
    the f_blocks field in the statfs structure to be larger than it should
    be. This in turn causes the "df" output to show the number of
    data blocks in the file system and the number of data blocks used as
    larger than they should be.

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
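
The relationship between the overhead count and f_blocks described above is a single subtraction, sketched here with a hypothetical helper. The numbers used to exercise it are made up for illustration and do not come from the commit.

```c
#include <stdint.h>

/* Data blocks reported to statfs: total blocks minus metadata overhead.
 * An overhead value that is too small inflates the reported total. */
static uint64_t statfs_f_blocks(uint64_t blocks_count,
                                uint64_t overhead_blocks)
{
    return blocks_count - overhead_blocks;
}
```

An undercounted overhead therefore shows up directly as a larger f_blocks, and tools such as df that derive their totals from it report inflated figures.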
     

29 May, 2012

2 commits

  • The b_data field of the buffer_head is already a char *, so there's no
    point casting it to a char *.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • In alloc_flex_gd(), when flexbg_size is large, kmalloc size would
    overflow and flex_gd->groups would point to a buffer smaller than
    expected, causing OOB accesses when it is used.

    Note that in ext4_resize_fs(), flexbg_size is calculated using
    sbi->s_log_groups_per_flex, which is read from the disk and only bounded
    to [1, 31]. The patch returns NULL for too large flexbg_size.

    Reviewed-by: Eric Sandeen
    Signed-off-by: Haogang Chen
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Haogang Chen
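
The allocation-size overflow in the alloc_flex_gd() commit above can be guarded against with a check of the following shape. This is a userspace sketch: struct group_data and alloc_groups() are hypothetical stand-ins, malloc() replaces kmalloc(), and the exact bound used by the kernel patch is not reproduced here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-group record; the real kernel structure differs. */
struct group_data {
    uint64_t block_bitmap;
    uint64_t inode_bitmap;
    uint64_t inode_table;
    uint32_t group;
    uint32_t blocks_count;
};

/* Allocate an array of flexbg_size records, returning NULL when the
 * byte count flexbg_size * sizeof(struct group_data) would overflow. */
static struct group_data *alloc_groups(size_t flexbg_size)
{
    if (flexbg_size == 0 ||
        flexbg_size > SIZE_MAX / sizeof(struct group_data))
        return NULL;

    return malloc(flexbg_size * sizeof(struct group_data));
}
```

Checking before multiplying matters because the product wraps silently, which is how the buffer ends up smaller than the code using it expects.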
     

30 Apr, 2012

4 commits


21 Mar, 2012

1 commit


20 Mar, 2012

1 commit


21 Feb, 2012

1 commit

  • When the file system is resized such that the new size is still
    within the same group (no new groups are added), we can
    hit a BUG_ON in ext4_alloc_group_tables()

    BUG_ON(flex_gd->count == 0 || group_data == NULL);

    because flex_gd->count is zero. The reason is a missing check for this
    case: the code always extends the last group fully and then attempts
    to add more groups, but at that point n_blocks_count is actually
    smaller than o_blocks_count.

    It can be easily reproduced like this:

    mkfs.ext4 -b 4096 /dev/sda 30M
    mount /dev/sda /mnt/test
    resize2fs /dev/sda 50M

    Fix this by checking whether the resize happens within a single group
    and, if so, adding only as many blocks to the last group as are needed
    to satisfy the user's request. Then o_blocks_count == n_blocks_count
    and the resize exits successfully without any attempt to add more
    groups to the fs.

    Also fix the mixing of block numbers and block counts, which is
    confusing and can easily lead to off-by-one errors (although that is
    not the case here, since the two occurrences of this mix-up cancel
    each other out).

    Signed-off-by: Lukas Czerner
    Reported-by: Milan Broz
    Reviewed-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
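
The check described in the commit above can be sketched in userspace as follows. The helper names are hypothetical, and the figures used to exercise it assume the reproducer's 4096-byte blocks (30M is 7680 blocks, 50M is 12800 blocks) together with 32768 blocks per group and a first data block of 0.

```c
#include <stdint.h>

/* Group that contains a given block number. */
static uint32_t group_of(uint64_t block, uint64_t first_data_block,
                         uint32_t blocks_per_group)
{
    return (uint32_t)((block - first_data_block) / blocks_per_group);
}

/* Nonzero if growing from o_blocks_count to n_blocks_count keeps the last
 * block in the same group, so no new groups have to be added. Both counts
 * are assumed to be larger than first_data_block. */
static int resize_within_last_group(uint64_t o_blocks_count,
                                    uint64_t n_blocks_count,
                                    uint64_t first_data_block,
                                    uint32_t blocks_per_group)
{
    if (n_blocks_count <= o_blocks_count)
        return 0;

    /* Counts are converted to the number of the last block (count - 1). */
    return group_of(o_blocks_count - 1, first_data_block, blocks_per_group) ==
           group_of(n_blocks_count - 1, first_data_block, blocks_per_group);
}
```

When the check holds, only n_blocks_count - o_blocks_count blocks need to be added to the last group (5120 in the 30M to 50M case under these assumptions), after which the old and new counts match and no further groups are attempted.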