19 Dec, 2011

1 commit


26 Oct, 2011

4 commits


21 Oct, 2011

1 commit


06 Oct, 2011

1 commit

  • In commit 79a77c5ac, we moved ext4_mb_init_backend after the allocation
    of s_locality_group to avoid a memory leak in the error path, but there
    are still other error paths in ext4_mb_init that need the same cleanup.
    So this patch adds cleanup to all of the error paths in ext4_mb_init.
    All the pointers are also reset to NULL so that the caller cannot
    double free them.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
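
The cleanup pattern described in the entry above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `demo_sbi`, `demo_init`, `demo_release` and the `fail_at` knob are hypothetical names, and `malloc`/`free` stand in for the kernel allocators. It is not the code from the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the allocator state, not the real ext4_sb_info. */
struct demo_sbi {
    int *offsets;
    int *maxs;
    int *groups;
};

/*
 * Allocate three buffers in order. fail_at (1..3) forces that step to
 * fail so each error path can be exercised; 0 means no forced failure.
 * Returns 0 on success, -1 on failure with every pointer reset to NULL.
 */
int demo_init(struct demo_sbi *sbi, int fail_at)
{
    sbi->offsets = sbi->maxs = sbi->groups = NULL;

    sbi->offsets = (fail_at == 1) ? NULL : malloc(16 * sizeof(int));
    if (!sbi->offsets)
        goto out;

    sbi->maxs = (fail_at == 2) ? NULL : malloc(16 * sizeof(int));
    if (!sbi->maxs)
        goto out_free_offsets;

    sbi->groups = (fail_at == 3) ? NULL : malloc(16 * sizeof(int));
    if (!sbi->groups)
        goto out_free_maxs;

    return 0;

out_free_maxs:
    free(sbi->maxs);
    sbi->maxs = NULL;       /* reset so a later cleanup cannot free it again */
out_free_offsets:
    free(sbi->offsets);
    sbi->offsets = NULL;
out:
    return -1;
}

/* Caller-side teardown; safe after a failed demo_init since free(NULL) is a no-op. */
void demo_release(struct demo_sbi *sbi)
{
    free(sbi->groups);
    sbi->groups = NULL;
    free(sbi->maxs);
    sbi->maxs = NULL;
    free(sbi->offsets);
    sbi->offsets = NULL;
}
```

Because every failed step unwinds what was allocated before it and clears the pointers, calling `demo_release` after a failed `demo_init` is harmless.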
     

10 Sep, 2011

10 commits


02 Aug, 2011

3 commits


01 Aug, 2011

1 commit


27 Jul, 2011

4 commits


24 Jul, 2011

2 commits


18 Jul, 2011

1 commit

  • Previously, if a stripe width was provided, then it would be used
    as the preallocation granularity, with no sanity checking and no
    way to override this. Now, mb_prealloc_size defaults to the smallest
    multiple of stripe size that is greater than or equal to the old
    default mb_prealloc_size, and this can be overridden with the sysfs
    interface.

    Signed-off-by: Dan Ehrenberg
    Signed-off-by: "Theodore Ts'o"

    Dan Ehrenberg
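
The new default described above is a round-up to the next multiple of the stripe size. A minimal sketch, assuming a hypothetical helper `demo_prealloc_size` and treating a stripe of 0 as "no stripe configured" (that convention is an assumption made here, not taken from the commit):

```c
/*
 * Smallest multiple of stripe that is >= old_default.
 * stripe == 0 is treated as "no stripe configured" (assumption).
 */
unsigned int demo_prealloc_size(unsigned int old_default, unsigned int stripe)
{
    if (stripe == 0)
        return old_default;
    return ((old_default + stripe - 1) / stripe) * stripe;
}
```

For example, with an illustrative old default of 512 blocks and a stripe of 384 blocks, the result is 768; with a stripe of 256 it stays at 512.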
     

12 Jul, 2011

2 commits

  • If we hit an error in ext4_mb_add_groupinfo, we kfree
    sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)] but fail to
    reset it to NULL. The caller, ext4_mb_init_backend, then tries to kfree
    it again, causing a double free. Fix this by resetting it to NULL.

    Some typos in the comments of mballoc.c are also fixed.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
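
The double free described above arises when both a callee and its caller free the same table slot. A user-space sketch of the fixed shape, with hypothetical names (`demo_table`, `demo_add`, `demo_init_table`) and `malloc`/`free` standing in for `kmalloc`/`kfree`:

```c
#include <stdlib.h>

#define DEMO_SLOTS 4

/* Hypothetical stand-in for the s_group_info pointer table. */
struct demo_table {
    void *slot[DEMO_SLOTS];
};

/* Callee: on a (simulated) late error it frees the slot it allocated. */
int demo_add(struct demo_table *t, int idx, int fail)
{
    t->slot[idx] = malloc(32);
    if (!t->slot[idx])
        return -1;
    if (fail) {
        free(t->slot[idx]);
        t->slot[idx] = NULL;    /* the fix: the caller's cleanup now sees NULL */
        return -1;
    }
    return 0;
}

/* Caller: on any error it frees every slot; free(NULL) is a no-op. */
int demo_init_table(struct demo_table *t, int fail_idx)
{
    int i;

    for (i = 0; i < DEMO_SLOTS; i++)
        t->slot[i] = NULL;

    for (i = 0; i < DEMO_SLOTS; i++)
        if (demo_add(t, i, i == fail_idx) != 0)
            goto err;
    return 0;

err:
    for (i = 0; i < DEMO_SLOTS; i++) {
        free(t->slot[i]);
        t->slot[i] = NULL;
    }
    return -1;
}
```

Without the reset in `demo_add`, the cleanup loop in `demo_init_table` would free the failing slot a second time.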
     
  • In ext4_groupinfo_create_slab, we create ext4_groupinfo_caches while
    holding ext4_grpinfo_slab_create_mutex, but set it outside the lock.
    There are cases where we may create it twice and leak memory. So set
    it before we call mutex_unlock.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
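
The locking rule in the entry above (publish the shared pointer before dropping the mutex) can be sketched with POSIX threads. The names `demo_cache`, `demo_create_mutex`, `demo_create_count` and `demo_get_cache` are hypothetical, and `malloc` stands in for the slab cache creation; this is an illustration, not the kernel code.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t demo_create_mutex = PTHREAD_MUTEX_INITIALIZER;
static void *demo_cache;        /* shared pointer, created at most once */
static int demo_create_count;   /* counts how many times creation ran */

/* Return the shared cache, creating it on first use. */
void *demo_get_cache(void)
{
    void *ret;

    pthread_mutex_lock(&demo_create_mutex);
    if (!demo_cache) {
        void *c = malloc(64);
        if (c) {
            demo_create_count++;
            demo_cache = c;     /* published while the lock is still held */
        }
    }
    ret = demo_cache;
    pthread_mutex_unlock(&demo_create_mutex);
    return ret;
}
```

If the assignment to `demo_cache` happened after `pthread_mutex_unlock`, a second thread could enter, still see NULL, and create another object, and one of the two would be leaked.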
     

11 Jul, 2011

6 commits

  • In the comment for ext4_trim_all_free(), there is no longer an @e4b
    parameter; it is now @group.

    Reported-by: Andreas Dilger
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • In ext4, every time FITRIM is called, we iterate over all the
    groups and trim them one by one. This wastes time if a group
    has already been trimmed and has not changed since the last
    trim.

    So this patch adds a new flag in ext4_group_info->bb_state to
    indicate that the group has been trimmed; it is cleared when
    some blocks are freed (in release_blocks_on_commit). In addition,
    trim_minlen is added to ext4_sb_info to record the last minlen
    we used to trim the volume, so that if the caller provides a
    smaller one, we go ahead with the trim regardless of bb_state.

    A simple test with my Intel X25-M SSD:
    df -h shows:
    /dev/sdb1 40G 21G 17G 56% /mnt/ext4
    Block size: 4096

    Run FITRIM with the following parameters:
    range.start = 0;
    range.len = UINT64_MAX;
    range.minlen = 1048576;

    without the patch:
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m5.505s
    user 0m0.000s
    sys 0m1.224s
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m5.359s
    user 0m0.000s
    sys 0m1.178s
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m5.228s
    user 0m0.000s
    sys 0m1.151s

    with the patch:
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m5.625s
    user 0m0.000s
    sys 0m1.269s
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m0.002s
    user 0m0.000s
    sys 0m0.001s
    [root@boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
    real 0m0.002s
    user 0m0.000s
    sys 0m0.001s

    A big improvement for the 2nd and 3rd run.

    Even after I delete some big image files, it is still much
    faster than iterating the whole disk.

    [root@boyu-tm test]# time ./ftrim /mnt/ext4/a
    real 0m1.217s
    user 0m0.000s
    sys 0m0.196s

    Cc: Lukas Czerner
    Reviewed-by: Andreas Dilger
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
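
The skip logic described in the entry above reduces to a small decision per group. A minimal sketch, assuming hypothetical types and helpers (`demo_group`, `demo_fs`, `demo_need_trim`, `demo_mark_trimmed`, `demo_blocks_freed`) rather than the real bb_state bit handling:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-group state: one "already trimmed" flag. */
struct demo_group {
    bool was_trimmed;
};

/* Hypothetical per-filesystem state: the minlen used by the last trim. */
struct demo_fs {
    uint64_t last_trim_minlen;
};

/* A group needs scanning if it changed since the last trim, or if the
 * caller now asks for a smaller minlen than the last trim used. */
bool demo_need_trim(const struct demo_fs *fs, const struct demo_group *g,
                    uint64_t minlen)
{
    return !g->was_trimmed || minlen < fs->last_trim_minlen;
}

/* Record a completed trim of this group. */
void demo_mark_trimmed(struct demo_fs *fs, struct demo_group *g,
                       uint64_t minlen)
{
    g->was_trimmed = true;
    fs->last_trim_minlen = minlen;
}

/* Freeing blocks invalidates the "already trimmed" state. */
void demo_blocks_freed(struct demo_group *g)
{
    g->was_trimmed = false;
}
```

This is why the second and third runs in the timings above return almost immediately: no group has changed and minlen is the same, so nothing needs scanning.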
     
  • Add ext4_trim_extent and ext4_trim_all_free.

    Reviewed-by: Lukas Czerner
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • When we trim free blocks in an ext4 group, we need to track the
    free block count properly and check whether enough free blocks
    remain for a trim to be possible. The current code only subtracts
    a free extent from the count if it is large enough to be trimmed,
    which isn't appropriate.

    A small example: a group has 1.5M free, in five chunks of 300K
    each, and minblocks is 1M. With the current code, we have to
    iterate over the whole group, since these 300K chunks are never
    subtracted from the 1.5M. But we should actually exit after
    finding the first 2 free chunks: once those 600K are subtracted
    (even though they can't be trimmed), the remaining 3 chunks sum
    to only 900K.

    Reviewed-by: Andreas Dilger
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
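
The early-exit accounting in the example above can be checked with a short sketch. `demo_scan_group` is a hypothetical helper that works on a plain array of free extent sizes (here in units of K), not on the real buddy bitmap:

```c
#include <stddef.h>

/*
 * Walk the free extents of one group. Every extent seen is subtracted
 * from the remaining free count, whether or not it is large enough to
 * trim, and the scan stops once the remainder drops below minlen.
 * Returns the number of extents examined; *trimmed gets the total size
 * of the extents that were large enough to trim.
 */
size_t demo_scan_group(const unsigned int *extents, size_t n,
                       unsigned int free_total, unsigned int minlen,
                       unsigned int *trimmed)
{
    size_t visited = 0;
    size_t i;

    *trimmed = 0;
    for (i = 0; i < n; i++) {
        if (free_total < minlen)
            break;              /* nothing left could satisfy minlen */
        visited++;
        if (extents[i] >= minlen)
            *trimmed += extents[i];
        free_total -= extents[i];
    }
    return visited;
}
```

With five extents of 300K, 1500K free in total and a minlen of 1024K, the scan examines two extents and then stops, because only 900K remains.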
     
  • In 0f0a25b, we adjust 'len' by s_first_data_block - start, but
    this can underflow, e.g. with blocksize=1K, fstrim_range.len=512 and
    fstrim_range.start = 0. In this case, when we run
    len -= first_data_blk - start; len underflows to -1ULL.
    The later last_group check keeps us safe by limiting the trim to the
    whole volume, but that isn't what the user really wants.

    So this patch fixes it. It also adds a check for 'start', as in ext3,
    so that we can bail out immediately if start is invalid.

    Cc: Lukas Czerner
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
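
The underflow above is ordinary unsigned wraparound: 512 bytes is 0 blocks at a 1K block size, and subtracting 1 from an unsigned 0 wraps to the maximum value. A minimal sketch of the problem and of one way to guard against it; `demo_naive_len` and `demo_clamp_range` are hypothetical helpers working in block units, not the code from the patch:

```c
#include <stdint.h>

/* The unguarded adjustment: wraps around when len < first_data_blk - start. */
uint64_t demo_naive_len(uint64_t start, uint64_t len, uint64_t first_data_blk)
{
    return len - (first_data_blk - start);
}

/*
 * Guarded adjustment. Returns 0 and updates *start and *len on success,
 * or -1 if the range is invalid or contains no trimmable block.
 */
int demo_clamp_range(uint64_t *start, uint64_t *len,
                     uint64_t first_data_blk, uint64_t total_blocks)
{
    if (*start >= total_blocks)
        return -1;              /* start is beyond the end of the volume */

    if (*start < first_data_blk) {
        uint64_t skip = first_data_blk - *start;

        if (*len <= skip)
            return -1;          /* range ends before the first data block */
        *len -= skip;
        *start = first_data_blk;
    }
    return 0;
}
```

With start = 0, len = 0 blocks and a first data block of 1, the naive version yields `UINT64_MAX`, while the guarded version reports that there is nothing to trim.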
     
  • The current implementation of ext4_free_blocks() always calls
    dquot_free_block(). This looks quite sensible in most cases: blocks
    to be freed are associated with an inode and were accounted in quota
    and i_blocks some time ago.

    However, there is a case where the blocks to free have not yet been
    accounted by the time ext4_free_blocks() is called:

    1. delalloc is on, write_begin pre-allocated some space in quota
    2. write-back happens, ext4 allocates some blocks in ext4_ext_map_blocks()
    3. then ext4_ext_map_blocks() gets an error (e.g. ENOSPC) from
    ext4_ext_insert_extent() and calls ext4_free_blocks().

    In this scenario, ext4_free_blocks() calls dquot_free_block() which, in
    turn, decrements i_blocks for blocks which were not accounted yet (due
    to delalloc). After a clean umount, e2fsck reports something like:

    > Inode 21, i_blocks is 5080, should be 5128. Fix?

    because i_blocks was erroneously decremented as explained above.

    The patch fixes the problem by passing the new flag
    EXT4_FREE_BLOCKS_NO_QUOT_UPDATE to ext4_free_blocks(), to request
    that the dquot_free_block() call be skipped.

    Signed-off-by: Maxim Patlasov
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Maxim Patlasov
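
The effect of the new flag can be sketched as a conditional around the accounting step. Everything here is hypothetical (`demo_inode`, `demo_free_blocks`, `DEMO_FREE_NO_QUOTA_UPDATE`), and the plain subtraction only stands in for what dquot_free_block() does to i_blocks:

```c
/* Hypothetical flag mirroring the idea of EXT4_FREE_BLOCKS_NO_QUOT_UPDATE. */
#define DEMO_FREE_NO_QUOTA_UPDATE 0x1u

/* Hypothetical inode with only the accounted size. */
struct demo_inode {
    long i_blocks;
};

/* Free count units; skip the accounting update when the caller says the
 * blocks were never accounted in the first place. */
void demo_free_blocks(struct demo_inode *inode, long count,
                      unsigned int flags)
{
    /* (returning the blocks to the allocator is elided in this sketch) */
    if (!(flags & DEMO_FREE_NO_QUOTA_UPDATE))
        inode->i_blocks -= count;
}
```

Reusing the numbers from the e2fsck message purely for illustration: starting from 5128, freeing 48 unaccounted units with the flag set leaves i_blocks at 5128, whereas without the flag it drops to 5080.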
     

28 Jun, 2011

1 commit


06 Jun, 2011

2 commits


25 May, 2011

1 commit

  • This patch adds an allocation request flag to the ext4_has_free_blocks
    function which enables the use of reserved blocks. This will allow a
    punch hole to proceed even if the disk is full. Punching a hole may
    require additional blocks to first split the extents.

    Because ext4_has_free_blocks is a low level function, the flag needs
    to be passed down through several functions listed below:

    ext4_ext_insert_extent
    ext4_ext_create_new_leaf
    ext4_ext_grow_indepth
    ext4_ext_split
    ext4_ext_new_meta_block
    ext4_mb_new_blocks
    ext4_claim_free_blocks
    ext4_has_free_blocks

    [ext4 punch hole patch series 1/5 v7]

    Signed-off-by: Allison Henderson
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Mingming Cao

    Allison Henderson
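
Passing a flag down to the low-level free-space check, as the entry above describes, can be sketched as follows. The names (`demo_fs`, `demo_has_free_blocks`, `demo_new_blocks`, `DEMO_USE_RESERVED`) and the simple "free minus reserved" arithmetic are assumptions for illustration, not the real ext4 accounting:

```c
#include <stdbool.h>

/* Hypothetical request flag: allow dipping into the reserved pool. */
#define DEMO_USE_RESERVED 0x1u

/* Hypothetical counters; free_blocks includes the reserved blocks. */
struct demo_fs {
    unsigned long free_blocks;
    unsigned long reserved_blocks;
};

/* Low-level check: reserved blocks count as usable only with the flag. */
bool demo_has_free_blocks(const struct demo_fs *fs, unsigned long nblocks,
                          unsigned int flags)
{
    unsigned long usable = fs->free_blocks;

    if (!(flags & DEMO_USE_RESERVED))
        usable = usable > fs->reserved_blocks
                 ? usable - fs->reserved_blocks : 0;
    return nblocks <= usable;
}

/* Higher-level caller: forwards the flags unchanged to the check. */
int demo_new_blocks(struct demo_fs *fs, unsigned long nblocks,
                    unsigned int flags)
{
    if (!demo_has_free_blocks(fs, nblocks, flags))
        return -1;
    fs->free_blocks -= nblocks;
    return 0;
}
```

On a "full" filesystem where only reserved blocks remain, a request without the flag fails, while the same request with the flag succeeds, which is what lets an extent split proceed during a hole punch.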