10 Sep, 2011

1 commit

  • With bigalloc changes, the i_blocks value was not correctly set (it was still
    set to number of blocks being used, but in case of bigalloc, we want i_blocks
    to represent the number of clusters being used). Since the quota subsystem sets
    the i_blocks value, this patch fixes the quota accounting and makes sure that
    the i_blocks value is set correctly.

    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     

06 Jun, 2011

2 commits

  • Currently we are not marking the extent as the last one
    (FIEMAP_EXTENT_LAST) if there is a hole at the end of the file. This is
    because we just do not check for it right now and continue searching for
    next extent. But at the point we hit the hole at the end of the file, it
    is too late.

    This commit adds a check for the allocated block in the subsequent
    extent, and if there are no more extents (block = EXT_MAX_BLOCKS) it
    just flags the current one as the last one.

    This behaviour was spotted unintentionally by xfstest 252, which
    hangs because of a wrong loop condition. On other filesystems (like
    xfs) the test exits anyway, because we notice the last extent flag
    and exit.

    With this patch xfstest 252 no longer hangs. The ext4 fiemap
    implementation still reports a bad extent type in some cases, but
    this seems to be a different issue.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     
  • Kazuya Mio reported that he was able to hit BUG_ON(next == lblock)
    in ext4_ext_put_gap_in_cache() while creating a sparse file in extent
    format and filling the tail of the file up to its end. We hit the
    BUG_ON when we write the last block (2^32-1) into the sparse file.

    The root cause of the problem lies in the fact that we specifically set
    s_maxbytes so that the block at s_maxbytes fits into the on-disk extent
    format, which is 32 bits long. However, we are not storing the start and
    end block numbers, but rather the start block number and the length in
    blocks. This means that in order to cover an extent from 0 to
    EXT_MAX_BLOCK we need EXT_MAX_BLOCK+1 to fit into len (because we are
    counting block 0 as well), and it does not.

    The only way to fix it without changing the meaning of the struct
    ext4_extent members is, as Kazuya Mio suggested, to lower s_maxbytes
    by one fs block so we can cover the whole extent we can get by the
    on-disk extent format.

    Also, in many places EXT_MAX_BLOCK is used as a length instead of the
    maximum logical block number that the name suggests, which is all a bit
    messy. So this commit renames it to EXT_MAX_BLOCKS and changes its usage
    in some places to actually be the maximum number of blocks in the extent.

    The bug which this commit fixes can be reproduced as follows:

    dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-2))
    sync
    dd if=/dev/zero of=/mnt/mp1/file bs= count=1 seek=$((2**32-1))

    Reported-by: Kazuya Mio
    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
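
The length problem described in the EXT_MAX_BLOCK commit above can be checked with a little arithmetic. The following is a minimal sketch, not kernel code: DEMO_MAX_LBLK and demo_len_fits_u32() are hypothetical names, and the value 0xffffffff is assumed to stand in for the old EXT_MAX_BLOCK.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the old EXT_MAX_BLOCK (largest 32-bit block number). */
#define DEMO_MAX_LBLK 0xffffffffU

/*
 * Returns 1 if an extent covering logical blocks [first, last] (inclusive)
 * has a length that can be stored in a 32-bit field, 0 otherwise.
 */
static int demo_len_fits_u32(uint32_t first, uint32_t last)
{
    assert(first <= last);
    /* Block "first" itself counts, hence the +1; use 64 bits to avoid wrap. */
    uint64_t len = (uint64_t)last - first + 1;
    return len <= UINT32_MAX;
}
```

With these assumptions, demo_len_fits_u32(0, DEMO_MAX_LBLK) returns 0, while lowering the last block by one makes the length fit, which mirrors the idea of lowering s_maxbytes by one fs block.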
     
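
For the FIEMAP_EXTENT_LAST commit in this group, the check it describes might be sketched roughly as follows. This is an illustration only: the DEMO_ names and demo_fiemap_flags() are hypothetical, and the real ext4 code obtains the next allocated block from the extent tree rather than from a parameter.

```c
#include <stdint.h>

/* Hypothetical stand-ins for EXT_MAX_BLOCKS and FIEMAP_EXTENT_LAST. */
#define DEMO_EXT_MAX_BLOCKS     0xffffffffU
#define DEMO_FIEMAP_EXTENT_LAST 0x00000001U

/*
 * next_allocated is the logical block of the next allocated extent, or
 * DEMO_EXT_MAX_BLOCKS when nothing is allocated past the current extent.
 * Returns the flags for the current extent, with the "last" bit added
 * when only a hole follows it.
 */
static uint32_t demo_fiemap_flags(uint32_t next_allocated, uint32_t flags)
{
    if (next_allocated == DEMO_EXT_MAX_BLOCKS)
        flags |= DEMO_FIEMAP_EXTENT_LAST;
    return flags;
}
```

The point of the sketch is that the decision is made while the current extent is still in hand, rather than after the search has already run into the trailing hole.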

11 Jan, 2011

2 commits

  • We can encode the ec_type information by using ee_len == 0 to denote
    EXT4_EXT_CACHE_NO, ee_start == 0 to denote EXT4_EXT_CACHE_GAP, and if
    neither is true, then the cache type must be EXT4_EXT_CACHE_EXTENT.
    This allows us to reduce the size of ext4_ext_inode by another 8
    bytes. (ec_type is 4 bytes, plus another 4 bytes of padding)

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This fixes a number of places where we used sector_t instead of
    ext4_lblk_t for logical blocks, which for ext4 are still 32-bit data
    types. No point wasting space in the ext4_inode_info structure, and
    requiring 64-bit arithmetic on 32-bit systems, when it isn't
    necessary.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
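
The ec_type encoding described in the first commit of this group can be illustrated with a small sketch. The struct layout, the DEMO_ names and demo_classify() are assumptions made for the example; only the rule (zero length means no cache entry, zero start means a gap, otherwise an extent) comes from the commit message, and checking the length first is an assumed ordering.

```c
#include <stdint.h>

/* Hypothetical mirror of the three cache types named in the commit. */
enum demo_cache_kind { DEMO_CACHE_NO, DEMO_CACHE_GAP, DEMO_CACHE_EXTENT };

/* Assumed layout: no separate type field is stored. */
struct demo_ext_cache {
    uint64_t start; /* physical start block */
    uint32_t block; /* first logical block  */
    uint32_t len;   /* length in blocks     */
};

/* Derive the cache type from the remaining fields. */
static enum demo_cache_kind demo_classify(const struct demo_ext_cache *c)
{
    if (c->len == 0)
        return DEMO_CACHE_NO;     /* nothing cached */
    if (c->start == 0)
        return DEMO_CACHE_GAP;    /* cached range is a hole */
    return DEMO_CACHE_EXTENT;     /* cached range maps to real blocks */
}
```

Dropping the explicit type field is what saves the 4 bytes of ec_type plus the 4 bytes of padding mentioned in the commit message.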
     

28 Oct, 2010

2 commits

  • Clean up namespace leaks from fs/ext4 and inline the trivial functions
    ext4_{ext,idx}_pblock() and ext4_{ext,idx}_store_pblock(), since the
    code size actually shrinks when we make these functions inline;
    they're that trivial.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • These functions have no need to be exported beyond file context.

    No functions needed to be moved for this commit; just some function
    declarations changed to be static and removed from header files.

    (A similar patch was submitted by Eric Sandeen, but I wanted to handle
    code movement in separate patches to make sure code changes didn't
    accidentally get dropped.)

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

01 Jan, 2010

1 commit

  • In the past, ext4_calc_metadata_amount(), and its sub-functions
    ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
    badly over-estimated the number of metadata blocks that might be
    required for delayed allocation blocks. This didn't matter as much
    when functions which managed the reserved metadata blocks were more
    aggressive about dropping reserved metadata blocks as delayed
    allocation blocks were written, but unfortunately they were too
    aggressive. This was fixed in commit 0637c6f, but as a result the
    over-estimation by ext4_calc_metadata_amount() would lead to reserving
    2-3 times the number of pending delayed allocation blocks as
    potentially required metadata blocks. So if there is 1 megabyte of
    blocks which have not yet been allocated, up to 3 megabytes of
    space would get reserved out of the user's quota and from the file
    system free space pool until all of the inode's data blocks have been
    allocated.

    This commit addresses this problem by much more accurately estimating
    the number of metadata blocks that will be required. It will still
    somewhat over-estimate the number of blocks needed, since it must make
    a worst case estimate not knowing which physical blocks will be
    needed, but it is much more accurate than before.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

29 Sep, 2009

1 commit

  • When writing into an uninitialized extent via direct I/O, and the direct
    I/O doesn't exactly cover the uninitialized extent, split the extent
    into uninitialized and initialized extents before submitting the I/O.
    This avoids needing to deal with an ENOSPC error in the end_io
    callback that gets used for direct I/O.

    When the IO is complete, the written extent will be marked as initialized.

    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Mingming Cao
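
The splitting step described in this commit amounts to cutting one block range around the part that the direct I/O covers. The sketch below shows only that range arithmetic; struct demo_range and demo_split() are hypothetical, the written range is assumed to lie inside the extent, and nothing here models the on-disk extent tree or the end_io conversion.

```c
#include <stdint.h>

/* Hypothetical block range: start block and length in blocks. */
struct demo_range {
    uint32_t start;
    uint32_t len;
};

/*
 * Split extent "ext" around the written range "wr" (assumed to lie
 * inside ext and not to wrap around 32 bits).
 * out[0] = untouched head, out[1] = range the I/O covers,
 * out[2] = untouched tail. Pieces with len == 0 are empty.
 * Returns the number of non-empty pieces.
 */
static int demo_split(struct demo_range ext, struct demo_range wr,
                      struct demo_range out[3])
{
    uint32_t ext_end = ext.start + ext.len;
    uint32_t wr_end = wr.start + wr.len;

    out[0].start = ext.start;
    out[0].len = wr.start - ext.start;
    out[1] = wr;
    out[2].start = wr_end;
    out[2].len = ext_end - wr_end;

    return (out[0].len != 0) + (out[1].len != 0) + (out[2].len != 0);
}
```

Doing the split before the I/O is submitted means any allocation needed for the new extent entries happens up front, which is how the commit avoids handling ENOSPC in the end_io callback.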
     

19 Sep, 2009

1 commit

  • ext4_ext_show_leaf() will display the leaf extents when extent
    debugging is enabled.

    Printing out the unwritten bit is useful for debugging unwritten
    extents; it allows us to see the unwritten extents vs. written extents
    after the unwritten extents are split or converted.

    Signed-off-by: Mingming Cao

    Mingming
     

17 Sep, 2009

1 commit


18 Jun, 2009

1 commit

  • The EXT4_IOC_MOVE_EXT ioctl exchanges the blocks between orig_fd and
    donor_fd, and then writes the file data of orig_fd to donor_fd.
    ext4_mext_move_extent() is the main function of ext4 online defrag,
    and this patch includes all functions related to ext4 online defrag.

    Signed-off-by: Akira Fujita
    Signed-off-by: Takashi Sato
    Signed-off-by: Kazuya Mio
    Signed-off-by: "Theodore Ts'o"

    Akira Fujita
     

28 Mar, 2009

1 commit


05 Nov, 2008

1 commit


07 Oct, 2008

1 commit

  • ext4_ext_walk_space() was reinstated to be used for iterating over file
    extents with a callback; it is used by the ext4 fiemap implementation.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-fsdevel@vger.kernel.org

    Eric Sandeen
     

20 Aug, 2008

1 commit

  • This patch modifies the writepage/write_begin credit calculation for
    extent files to use the credits calculation helper function.

    The current calculation of how many index/leaf blocks should be
    accounted is too conservative: it always considers the worst case,
    where the tree level is 5, and in the case of multiple chunk
    allocations, it always assumes no blocks are dirtied in common across
    the allocations. This patch uses the accurate depth of the inode with
    some extras to calculate the index blocks, and is also less conservative
    in the case of multiple allocation accounting.

    Signed-off-by: Mingming Cao
    Reviewed-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Mingming Cao
     

15 Jul, 2008

1 commit

  • This patch does block reservation for delayed
    allocation, to avoid ENOSPC later at page flush time.

    Blocks (data and metadata) are reserved at da_write_begin()
    time, the freeblocks counter is updated by then, and the number of
    reserved blocks is stored in a per-inode counter.

    At writepage time, the unused reserved meta blocks are returned.
    At unlink/truncate time, reserved blocks are properly released.

    Updated fix from Aneesh Kumar K.V
    to fix the old allocator block reservation accounting with delalloc, add
    a lock to guard the counters, and also fix the reservation for meta blocks.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: Theodore Ts'o

    Mingming Cao
     

30 Apr, 2008

1 commit