15 Jul, 2008

1 commit

  • This patch does block reservation for delayed
    allocation, to avoid ENOSPC later at page flush time.

    Blocks (data and metadata) are reserved at da_write_begin()
    time, the free-blocks counter is updated then, and the number of
    reserved blocks is stored in a per-inode counter.

    At writepage time, the unused reserved metadata blocks are returned
    to the free pool. At unlink/truncate time, reserved blocks are
    properly released.

    Updated with a fix from Aneesh Kumar K.V
    that corrects the old allocator's block reservation accounting with
    delalloc, adds a lock to guard the counters, and also fixes the
    reservation for metadata blocks.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: Theodore Ts'o

    Mingming Cao
     

14 Jul, 2008

1 commit


12 Jul, 2008

11 commits

  • Right now i_blocks is not updated until the blocks are actually
    allocated on disk. This means that with delayed allocation, right after
    files are copied, "ls -sF" shows the files as taking 0 blocks on disk.
    "du" also shows the files taking zero space, which is highly confusing
    to the user.

    Since delayed allocation already keeps track of the per-inode total
    number of blocks that are subject to delayed allocation, this patch
    fixes the problem by using that count to adjust the value returned by
    stat(2). When real block allocation is done, i_blocks gets updated.
    Since the blocks reserved for delayed allocation are decreased at the
    same time, the value returned by stat(2) stays consistent.

    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Mingming Cao
     
  • Updated with fixes from Mingming Cao to unlock and
    release the page from page cache if the delalloc write_begin failed, and
    properly handle preallocated blocks. Also added a fix to clear
    buffer_delay in block_write_full_page() after allocating a delayed
    buffer.

    Updated with fixes from Aneesh Kumar K.V
    to update i_disksize properly and to add bmap support for delayed
    allocation.

    Updated with a fix from Valerie Clement to
    avoid filesystem corruption when the filesystem is mounted with the
    delalloc option and blocksize < pagesize.

    Signed-off-by: Alex Tomas
    Signed-off-by: Mingming Cao
    Signed-off-by: Dave Kleikamp
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Aneesh Kumar K.V

    Alex Tomas
     
  • These changes are needed to support data=ordered mode handling via
    inodes. This enables us to get rid of the journal heads and buffer
    heads for data buffers in ordered mode. With these changes, during
    transaction commit we write out the inode pages using
    writepages()/writepage(). That implies we take the page lock during
    transaction commit. This can cause a deadlock with the locking order
    page_lock -> jbd2_journal_start, since jbd2_journal_start can wait
    for the journal commit to happen, and the journal commit now needs to
    take the page lock. To avoid this deadlock, reverse the locking order.

    Signed-off-by: Jan Kara
    Signed-off-by: Mingming Cao
    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • We would like to get notified when a write is done to an mmapped
    section. This is needed with respect to preallocated areas. We split
    the preallocated area into an initialized extent and an uninitialized
    extent in the callback. This lets us handle ENOSPC better. Otherwise
    we get ENOSPC in writepage, which would result in data loss. The
    changes are also needed to handle ENOSPC when writing to an mmapped
    section of files with holes.

    Acked-by: Jan Kara
    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • Update the group info structures when updating a group's descriptor.
    Add group info structures when adding a group's descriptor.
    Refresh the cache pages used by mb_alloc when changes occur.
    This will probably need modifications once META_BG resizing is allowed.

    Signed-off-by: Frederic Bohe
    Signed-off-by: Mingming Cao

    Frederic Bohe
     
  • mballoc allocation missed the check for blocks reserved for root
    users. Add an ext4_has_free_blocks() check before allocation. Also
    modify ext4_has_free_blocks() to support multiple-block allocation
    requests.

    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Mingming Cao
     
  • Move the code related to block allocation into a single function and
    add helper functions to differentiate between allocations of data and
    metadata blocks.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • When mballoc is enabled, block allocation for old block-based
    files is done with the mballoc allocator instead of the old
    block-based allocator. The old ext3 block reservation is turned
    off when mballoc is turned on.

    However, in-core preallocation is not enabled for block-based/
    non-extent-based file block allocation. This results in a performance
    regression, as we now have neither "reservation" nor in-core
    preallocation to prevent interleaved fragmentation in workloads with
    multiple writers.

    This patch fixes this by enabling per-inode in-core preallocation
    for non-extent files when mballoc is used.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • This patch mostly controls the way inodes are allocated in order to
    make ialloc aware of flex_bg block group grouping. It achieves this
    by bypassing the Orlov allocator when block group metadata are packed
    together by mke2fs. Since the impact on the block allocator is
    minimal, this patch should have little or no effect on other block
    allocation algorithms. By controlling inode allocation, it can
    basically control where the initial search for new blocks begins and
    thus indirectly manipulate the block allocator.

    This allocator favors data and metadata locality, so the disk will
    gradually be filled from block group zero upward. This helps improve
    performance by reducing seek time. Since the group of inode tables
    within one flex_bg is treated as one giant inode table, uninitialized
    block groups do not need to partially initialize as many inode
    tables as with Orlov, which helps fsck time as filesystem usage
    goes up.

    Signed-off-by: Jose R. Santos
    Signed-off-by: Valerie Clement
    Signed-off-by: "Theodore Ts'o"

    Jose R. Santos
     
  • Change second/third to fourth.

    Signed-off-by: Shen Feng
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Shen Feng
     
  • If the orphan node list includes valid, untruncatable nodes with
    nlink > 0, the ext4_orphan_cleanup loop which attempts to delete them
    will not do so, causing it to loop forever. Fix this by checking for
    such nodes in the ext4_orphan_get function.

    This patch fixes the second case (image hdb.20000009.softlockup.gz)
    reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.

    Signed-off-by: Duane Griffin
    Signed-off-by: Theodore Ts'o

    Duane Griffin
     

30 Apr, 2008

1 commit