11 Jan, 2012

1 commit


05 Jan, 2012

3 commits

  • A couple more functions can reasonably be made static if desired.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • Reserve the ext4 feature flags EXT4_FEATURE_RO_COMPAT_METADATA_CSUM,
    EXT4_FEATURE_INCOMPAT_INLINEDATA, and EXT4_FEATURE_INCOMPAT_LARGEDIR.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This patch adds a new online resize interface, whose input argument is a
    64-bit integer indicating how many blocks there are in the resized fs.

    In the new resize implementation, all work such as allocating group tables
    is done on the kernel side, so the new resize interface can support the
    flex_bg feature and prepares the ground for supporting resize with features
    like bigalloc and exclude bitmap. Besides these, user-space tools just
    pass in the new number of blocks.

    We delay initializing the bitmaps and inode tables of added groups if
    possible and add multiple groups (a flex group) each time, so the new
    resize is very fast, like mkfs.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
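
The resize commit above describes an interface whose only input is a 64-bit block count. A minimal user-space sketch of such a call is shown here. The ioctl name and number are assumed to match the definition in fs/ext4/ext4.h and are not stated in the commit text; the wrapper name ext4_online_resize is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Assumed to match fs/ext4/ext4.h; guarded in case a header already defines it. */
#ifndef EXT4_IOC_RESIZE_FS
#define EXT4_IOC_RESIZE_FS _IOW('f', 16, uint64_t)
#endif

/*
 * Hypothetical wrapper: ask the kernel to resize the mounted ext4 file
 * system referred to by fd so that it holds new_blocks blocks in total.
 * Returns 0 on success, or -1 with errno set by ioctl().
 */
static int ext4_online_resize(int fd, uint64_t new_blocks)
{
	assert(new_blocks > 0);
	return ioctl(fd, EXT4_IOC_RESIZE_FS, &new_blocks);
}
```

A caller would typically open the mount point, pass the descriptor and the desired total block count, and check errno on failure. All group-table work stays in the kernel, which is the point the commit makes.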
     

04 Jan, 2012

2 commits


29 Dec, 2011

2 commits

  • ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for
    ext4. Only two ext4_{set,clear}_bit() calls check the return value. The
    rest of the calls ignore the return value and can be replaced with
    __{set,clear}_bit_le().

    This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le()
    to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit()
    for the two places where the old bit needs to be returned.

    This ext4_{set,clear}_bit() change is considered safe: if someone
    uses these macros without noticing the change, the new
    ext4_{set,clear}_bit() have no return value, so any use of the return
    value causes a compiler error.

    This also removes unused ext4_find_first_zero_bit().

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: "Theodore Ts'o"

    Akinobu Mita
     
  • The functions ext4_block_truncate_page() and ext4_block_zero_page_range()
    are no longer used, so remove them.

    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"

    Zheng Liu
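
The split between plain and test-and-modify bit helpers in the Akinobu Mita commit above can be illustrated with a small user-space model. This is a sketch only: the model_* names are hypothetical stand-ins, and the byte-wise addressing is an assumption chosen to mimic little-endian bit numbering, not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Byte-wise little-endian addressing: bit nr lives in byte nr / CHAR_BIT. */
static unsigned char bit_mask(size_t nr)
{
	return (unsigned char)(1u << (nr % CHAR_BIT));
}

/* Plain helpers return void, so using a "result" fails to compile. */
static void model_set_bit(size_t nr, unsigned char *map)
{
	assert(map != NULL);
	map[nr / CHAR_BIT] |= bit_mask(nr);
}

static void model_clear_bit(size_t nr, unsigned char *map)
{
	assert(map != NULL);
	map[nr / CHAR_BIT] &= (unsigned char)~bit_mask(nr);
}

/* Explicit variants for the few callers that need the old bit value. */
static int model_test_and_set_bit(size_t nr, unsigned char *map)
{
	int old = (map[nr / CHAR_BIT] & bit_mask(nr)) != 0;

	model_set_bit(nr, map);
	return old;
}

static int model_test_and_clear_bit(size_t nr, unsigned char *map)
{
	int old = (map[nr / CHAR_BIT] & bit_mask(nr)) != 0;

	model_clear_bit(nr, map);
	return old;
}
```

Writing `if (model_set_bit(3, map))` is rejected by the compiler because the function returns void, which mirrors the safety argument in the commit message.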
     

03 Nov, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits)
    jbd2: Unify log messages in jbd2 code
    jbd/jbd2: validate sb->s_first in journal_get_superblock()
    ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined
    ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled
    ext4: fix a typo in struct ext4_allocation_context
    ext4: Don't normalize an falloc request if it can fit in 1 extent.
    ext4: remove comments about extent mount option in ext4_new_inode()
    ext4: let ext4_discard_partial_buffers handle unaligned range correctly
    ext4: return ENOMEM if find_or_create_pages fails
    ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock()
    ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten
    ext4: optimize locking for end_io extent conversion
    ext4: remove unnecessary call to waitqueue_active()
    ext4: Use correct locking for ext4_end_io_nolock()
    ext4: fix race in xattr block allocation path
    ext4: trace punch_hole correctly in ext4_ext_map_blocks
    ext4: clean up AGGRESSIVE_TEST code
    ext4: move variables to their scope
    ext4: fix quota accounting during migration
    ext4: migrate cleanup
    ...

    Linus Torvalds
     

01 Nov, 2011

2 commits

  • Standardize the style for compiler based printf format verification.
    Standardized the location of __printf too.

    Done via script and a little typing.

    $ grep -rPl --include=*.[ch] -w "__attribute__" * | \
    grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \
    xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }'

    [akpm@linux-foundation.org: revert arch bits]
    Signed-off-by: Joe Perches
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
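
To show what the conversion in the Joe Perches commit above amounts to, here is a user-space sketch. The __printf macro is redefined locally as an assumption about its kernel definition, and fmt_msg is a hypothetical function used only for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Local copy of the shorthand; assumed equivalent to the kernel macro. */
#ifndef __printf
#define __printf(a, b) __attribute__((format(printf, a, b)))
#endif

/*
 * Before the cleanup this would carry the long form
 * __attribute__((format(printf, 3, 4))); after it, the short form below.
 * Argument 3 is the format string, argument 4 the first variadic argument.
 */
static __printf(3, 4) int fmt_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place, a call such as `fmt_msg(buf, sizeof(buf), "%s", 42)` draws a format warning from GCC or Clang when format checking is enabled.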
     
  • Setting the EXT4_IO_END_UNWRITTEN flag and incrementing i_aiodio_unwritten
    should be done together, since ext4_end_io_nolock always clears
    the flag and decrements the counter at the same time.

    We have found some bugs where the flag is set while
    i_aiodio_unwritten is left unchanged (commit 32c80b32c053d). So this patch
    creates a helper function that wraps both steps to avoid any future bug.
    The idea was inspired by Eric.

    Cc: Eric Sandeen
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     

29 Oct, 2011

1 commit


25 Oct, 2011

1 commit

  • EOFBLOCK_FL should be updated if called w/o FALLOCATE_FL_KEEP_SIZE.
    Currently this happens only if a new extent was allocated.

    TESTCASE:
    fallocate test_file -n -l4096
    fallocate test_file -l4096
    The last fallocate command updates the size but keeps EOFBLOCK_FL set, and
    fsck will complain about that.

    Also remove the ping-pong in ext4_fallocate() in the case of new extents,
    where ext4_ext_map_blocks() clears the EOFBLOCKS bit and
    ext4_falloc_update_inode() later restores it.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"

    Dmitry Monakhov
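
The two-step test case in the Dmitry Monakhov commit above can also be driven from C through fallocate(2). This is a sketch of the call sequence only: the function name is hypothetical, and the EOFBLOCKS flag itself is not inspected here, so fsck is still needed to observe the bug.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for fallocate() and FALLOC_FL_KEEP_SIZE */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Preallocate len bytes while keeping the file size (like "fallocate -n"),
 * then preallocate the same range again without keeping the size.
 * Returns the resulting file size, or -1 if any step fails.
 */
static off_t falloc_keep_then_extend(int fd, off_t len)
{
	struct stat st;

	assert(len > 0);
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len) != 0)
		return -1;
	if (fallocate(fd, 0, 0, len) != 0)
		return -1;
	if (fstat(fd, &st) != 0)
		return -1;
	return st.st_size;
}
```

On a file system that supports preallocation, the second call is the one that changes the size without allocating a new extent, which is the path the commit says left the flag set.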
     

09 Oct, 2011

2 commits

  • There are no users of the EXT4_IOC_WAIT_FOR_READONLY ioctl, and it is
    also broken. No one sets the set_ro_timer, no one wakes us up, and our
    state is set to TASK_INTERRUPTIBLE, not RUNNING. So remove it.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     
  • For a long time now orlov has been the default block allocator in
    ext4. It performs better than the old one and no one seems to claim
    otherwise, so we can safely drop the old allocator and deprecate the
    oldalloc and orlov mount options.

    This is part of the effort to reduce the number of ext4 options, and
    hence the test matrix.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

10 Sep, 2011

15 commits

  • Currently, there exists a race between delayed allocated writes and
    the writeback when bigalloc feature is in use. The race was because we
    wanted to determine what blocks in a cluster are under delayed
    allocation and we were using buffer_delayed(bh) check for it. But, the
    writeback codepath clears this bit without any synchronization which
    resulted in a race and an ext4 warning similar to:

    EXT4-fs (ram1): ext4_da_update_reserve_space: ino 13, used 1 with only 0
    reserved data blocks

    The race existed in two places.
    (1) between ext4_find_delalloc_range() and ext4_map_blocks() when called from
    the writeback code path.
    (2) between ext4_find_delalloc_range() and ext4_da_get_block_prep() (where
    buffer_delayed(bh) is set).

    To fix (1), this patch introduces a new buffer_head state bit -
    BH_Da_Mapped. This bit is set under the protection of
    EXT4_I(inode)->i_data_sem when we have actually mapped the delayed
    allocated blocks during the writeout time. We can now reliably check
    for this bit inside ext4_find_delalloc_range() to determine whether
    the reservation for the blocks has already been claimed or not.

    To fix (2), it was necessary to set buffer_delay(bh) under the
    protection of i_data_sem. So, I extracted the very beginning of
    ext4_map_blocks into a new function - ext4_da_map_blocks() - and
    performed the required setting of bh_delay bit and the quota
    reservation under the protection of i_data_sem. These two fixes make
    the checking of buffer_delay(bh) and buffer_da_mapped(bh) consistent,
    thus removing the race.

    Tested: I was able to reproduce the problem by running 'dd' and
    'fsync' in parallel. Also, xfstests sometimes used to reproduce this
    race. After the fix both my test and xfstests were successful and no
    race (warning message) was observed.

    Google-Bug-Id: 4997027

    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     
  • This patch adds some tracepoints in ext4/extents.c and updates a tracepoint in
    ext4/inode.c.

    Tested: Built and ran the kernel and verified that these tracepoints work.
    Also ran xfstests.

    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     
  • Rename the function so it is more clear what is going on. Also rename
    the various variables so it's clearer what's happening.

    Also fix a missing blocks to cluster conversion when reading the
    number of reserved blocks for root.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This function really claims a number of free clusters, not blocks, so
    rename it so it's clearer what's going on.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This function really returns the number of free clusters after an
    uninitialized block bitmap has been initialized.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This function really counts the free clusters reported in the block
    group descriptors, so rename it to reduce confusion.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The field bg_free_blocks_count_{lo,high} in the block group
    descriptor has been repurposed to hold the number of free clusters for
    bigalloc functions. So rename the functions to make it easier to
    read and audit the block allocation and block freeing code.

    Note: at this point in bigalloc development we don't support
    online resize, so this also makes it really obvious all of the places
    we need to fix up to add support for online resize.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Now that we have implemented all of the changes needed for bigalloc,
    we can finally enable it!

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • With bigalloc changes, the i_blocks value was not correctly set (it was still
    set to number of blocks being used, but in case of bigalloc, we want i_blocks
    to represent the number of clusters being used). Since the quota subsystem sets
    the i_blocks value, this patch fixes the quota accounting and makes sure that
    the i_blocks value is set correctly.

    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     
  • Convert the free_blocks to be free_clusters to make the final revised
    bigalloc changes easier to read/understand.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Convert the percpu counters s_dirtyblocks_counter and
    s_freeblocks_counter in struct ext4_super_info to be
    s_dirtyclusters_counter and s_freeclusters_counter.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The ext4_free_blocks() function now has two new flags that indicate
    whether a partial cluster at the beginning or the end of the block
    extents should be freed or not. That will be up to the caller (i.e.,
    truncate), who can figure out whether partial clusters at the
    beginning or the end of a block range can be freed.

    We also have to update the ext4_mb_free_metadata() and
    release_blocks_on_commit() machinery to be cluster-based, since it is
    used by ext4_free_blocks().

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Add bigalloc support to ext4_init_block_bitmap() and
    ext4_free_blocks_after_init().

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • The function ext4_free_blocks_after_init() used to be a #define of
    ext4_init_block_bitmap(). This actually made it difficult to
    understand how the function worked, and made it hard to make changes to
    support clusters. So as an initial cleanup, I've separated out the
    functionality of initializing the block bitmap from calculating the
    number of free blocks in the new block group.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This adds support for bigalloc file systems. It teaches the mount
    code just enough about bigalloc superblock fields that it will mount
    the file system without freaking out that the number of blocks per
    group is too big.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

04 Sep, 2011

1 commit

  • If the user explicitly specifies conflicting mount options for
    delalloc or dioread_nolock and data=journal, fail the mount, instead
    of printing a warning and continuing (since many users won't look at
    dmesg and notice the warning).

    Also, print a single warning that data=journal implies that delayed
    allocation is not on by default (since it's not supported), and
    furthermore that O_DIRECT is not supported. Improve the text in
    Documentation/filesystems/ext4.txt so this is clear there as well.

    Similarly, if the dioread_nolock mount option is specified when the
    file system block size != PAGE_SIZE, fail the mount instead of
    printing a warning message and ignoring the mount option.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
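
The mount-time checks described in the 04 Sep commit above boil down to rejecting a few option combinations. A simplified model is sketched here; the struct, its field names, and the choice of -EINVAL as the error code are all assumptions of this sketch, not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of the options the commit talks about. */
struct model_mount_opts {
	bool data_journal;	/* data=journal was requested */
	bool explicit_delalloc;	/* user explicitly asked for delalloc */
	bool dioread_nolock;	/* user asked for dioread_nolock */
	unsigned int block_size;
	unsigned int page_size;
};

/* Return 0 if the combination is acceptable, a negative errno otherwise. */
static int model_check_opts(const struct model_mount_opts *o)
{
	assert(o != NULL);

	/* Explicit conflicts with data=journal fail the mount. */
	if (o->data_journal && (o->explicit_delalloc || o->dioread_nolock))
		return -EINVAL;

	/* dioread_nolock requires block size == page size. */
	if (o->dioread_nolock && o->block_size != o->page_size)
		return -EINVAL;

	return 0;
}
```

The design point from the commit is that these cases return an error instead of logging a warning and continuing, since a warning in dmesg is easy to miss.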
     

03 Sep, 2011

1 commit

  • This patch adds two new routines: ext4_discard_partial_page_buffers
    and ext4_discard_partial_page_buffers_no_lock.

    The ext4_discard_partial_page_buffers routine is a wrapper
    function to ext4_discard_partial_page_buffers_no_lock.
    The wrapper function locks the page and passes it to
    ext4_discard_partial_page_buffers_no_lock.
    Calling functions that already have the page locked can call
    ext4_discard_partial_page_buffers_no_lock directly.

    The ext4_discard_partial_page_buffers_no_lock function
    zeros a specified range in a page, and unmaps the
    corresponding buffer heads. Only block-aligned regions of the
    page will have their buffer heads unmapped. Non-block-aligned regions
    will be mapped if needed so that they can be updated with the
    partial zero out. This function is meant to
    be used to update a page and its buffer heads to be zeroed
    and unmapped when the corresponding blocks have been released
    or will be released.

    This routine is used in the following scenarios:
    * A hole is punched and the non page aligned regions
    of the head and tail of the hole need to be discarded

    * The file is truncated and the partial page beyond EOF needs
    to be discarded

    * The end of a hole is in the same page as EOF. After the
    page is flushed, the partial page beyond EOF needs to be
    discarded.

    * A write operation begins or ends inside a hole and the partial
    page appearing before or after the write needs to be discarded

    * A write operation extends EOF and the partial page beyond EOF
    needs to be discarded

    This function takes a flag EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED
    which is used when a write operation begins or ends in a hole.
    When the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED flag is used, only
    buffer heads that are already unmapped will have the corresponding
    regions of the page zeroed.

    Signed-off-by: Allison Henderson
    Signed-off-by: "Theodore Ts'o"

    Allison Henderson
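
The zero-and-unmap rule in the Allison Henderson commit above can be modeled on a plain buffer. This is a user-space sketch with hypothetical names: a bool per block stands in for the buffer head's mapped state, the page size is fixed at 4096, and the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED flag is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_MIN_BLKSZ 1024u

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool mapped[MODEL_PAGE_SIZE / MODEL_MIN_BLKSZ];	/* one per block */
};

/*
 * Zero [from, from + len) in the page. Blocks fully inside the range are
 * unmapped; blocks only partly covered are marked mapped so the partial
 * zeroing can be written back.
 */
static void model_discard_partial(struct model_page *pg, size_t blksz,
				  size_t from, size_t len)
{
	size_t end = from + len;
	size_t blk;

	assert(blksz >= MODEL_MIN_BLKSZ && MODEL_PAGE_SIZE % blksz == 0);
	assert(from <= end && end <= MODEL_PAGE_SIZE);
	if (len == 0)
		return;

	memset(pg->data + from, 0, len);
	for (blk = 0; blk < MODEL_PAGE_SIZE / blksz; blk++) {
		size_t bstart = blk * blksz;
		size_t bend = bstart + blksz;

		if (from <= bstart && bend <= end)
			pg->mapped[blk] = false;	/* fully covered */
		else if (bstart < end && from < bend)
			pg->mapped[blk] = true;		/* partly covered */
	}
}
```

For a 1024-byte block size and a range of 2048 bytes starting at offset 512, only the second block is fully covered and unmapped, while the first and third blocks are zeroed in part and stay mapped.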
     

31 Aug, 2011

2 commits

  • This doesn't make much sense, and it exposes a bug in the kernel where
    attempts to create a new file in an append-only directory using
    O_CREAT will fail (but still leave a zero-length file). This was
    discovered when xfstests #79 was generalized so it could run on all
    file systems.

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
     
  • The i_mutex lock and flush_completed_IO() added by commit 2581fdc810
    in ext4_evict_inode() cause lockdep to complain about potential
    deadlocks in several places. In most/all of these LOCKDEP complaints
    it looks like it's a false positive, since many of the potential
    circular locking cases can't take place by the time the
    ext4_evict_inode() is called; but since at the very least it may mask
    real problems, we need to address this.

    This change removes the flush_completed_IO() and i_mutex lock in
    ext4_evict_inode(). Instead, we take a different approach to resolve
    the software lockup that commit 2581fdc810 intends to fix. Rather
    than having the ext4-dio-unwritten thread wait to grab the i_mutex
    lock of an inode, we use mutex_trylock() instead, and simply requeue
    the work item if we fail to grab the inode's i_mutex lock.

    This should speed up work queue processing in general and also
    prevent the following deadlock scenario: During a page fault,
    shrink_icache_memory is called, which in turn evicts another inode B.
    Inode B has some pending io_end work so it calls ext4_ioend_wait()
    that waits for inode B's i_ioend_count to become zero. However, inode
    B's ioend work was queued behind some of inode A's ioend work on the
    same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten
    thread on that cpu is processing inode A's ioend work, it tries to
    grab inode A's i_mutex lock. Since the i_mutex lock of inode A was
    already held when the page fault happened, we enter a deadlock.

    Signed-off-by: Jiaying Zhang
    Signed-off-by: "Theodore Ts'o"

    Jiaying Zhang
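
The trylock-and-requeue approach in the Jiaying Zhang commit above can be sketched with POSIX threads. This is a user-space analogy only: pthread_mutex_trylock() stands in for the kernel's mutex_trylock(), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_inode {
	pthread_mutex_t i_mutex;
	int converted;		/* number of completed conversions */
};

/*
 * Process one end-io work item. Returns true if the work was done, or
 * false if the lock was busy and the caller should requeue the item
 * instead of sleeping on the lock.
 */
static bool model_end_io_work(struct model_inode *inode)
{
	int rc;

	if (pthread_mutex_trylock(&inode->i_mutex) != 0)
		return false;

	inode->converted++;
	rc = pthread_mutex_unlock(&inode->i_mutex);
	assert(rc == 0);
	(void)rc;
	return true;
}
```

Because the worker never blocks on the inode lock, work items queued behind it for other inodes keep moving, which is what breaks the wait cycle described in the commit.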
     

02 Aug, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (60 commits)
    ext4: prevent memory leaks from ext4_mb_init_backend() on error path
    ext4: use EXT4_BAD_INO for buddy cache to avoid colliding with valid inode #
    ext4: use ext4_msg() instead of printk in mballoc
    ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info
    ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()
    ext4: use the correct error exit path in ext4_init_inode_table()
    ext4: add missing kfree() on error return path in add_new_gdb()
    ext4: change umode_t in tracepoint headers to be an explicit __u16
    ext4: fix races in ext4_sync_parent()
    ext4: Fix overflow caused by missing cast in ext4_fallocate()
    ext4: add action of moving index in ext4_ext_rm_idx for Punch Hole
    ext4: simplify parameters of reserve_backup_gdb()
    ext4: simplify parameters of add_new_gdb()
    ext4: remove lock_buffer in bclean() and setup_new_group_blocks()
    ext4: simplify journal handling in setup_new_group_blocks()
    ext4: let setup_new_group_blocks() set multiple bits at a time
    ext4: fix a typo in ext4_group_extend()
    ext4: let ext4_group_add_blocks() handle 0 blocks quickly
    ext4: let ext4_group_add_blocks() return an error code
    ext4: rename ext4_add_groupblocks() to ext4_group_add_blocks()
    ...

    Fix up conflict in fs/ext4/inode.c: commit aacfc19c626e ("fs: simplify
    the blockdev_direct_IO prototype") had changed the ext4_ind_direct_IO()
    function for the new simplified calling convention, while commit
    dae1e52cb126 ("ext4: move ext4_ind_* functions from inode.c to
    indirect.c") moved the function to another file.

    Linus Torvalds
     

01 Aug, 2011

1 commit


27 Jul, 2011

4 commits