12 Jul, 2008

3 commits

  • dx_root_limit() had some dead code which forced it to always return
    20, and dx_node_limit() to always return 22, for debugging purposes.
    Remove it.

    Acked-by: Andreas Dilger
    Signed-off-by: Li Zefan
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Li Zefan
     
  • ext4_next_entry() is used by the debugging function dx_show_leaf(), so
    it must be defined before that function.

    Signed-off-by: Li Zefan
    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Li Zefan
     
  • ext4_dx_find_entry() uses ext4_next_entry() without verifying that the entry
    is valid. If its rec_len == 0 this causes an infinite loop. Refactor the loop
    to check the validity of entries before checking whether they match and
    moving on to the next one.

    There are other uses of ext4_next_entry in this file which also look
    problematic. They should be reviewed and fixed if/when we have a test-case
    that triggers them.

    This patch fixes the first case (image hdb.25.softlockup.gz) reported in
    http://bugzilla.kernel.org/show_bug.cgi?id=10882.

    Signed-off-by: Duane Griffin
    Signed-off-by: Theodore Ts'o

    Duane Griffin
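The "validate before match and advance" ordering described in the ext4_dx_find_entry commit above can be sketched in user space as follows. This is a minimal illustration, not the kernel code: the toy_* names, the 8-byte header layout, and the return codes are assumptions made for the example, and rec_len is read in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: inode (4 bytes), rec_len (2), name_len (1),
 * file_type (1), then the name bytes. Not the kernel's struct. */
enum { TOY_HDR_LEN = 8 };

static uint16_t toy_rec_len(const unsigned char *block, size_t off)
{
    uint16_t v;
    memcpy(&v, block + off + 4, sizeof(v)); /* host byte order, for simplicity */
    return v;
}

/* Returns the offset of the entry named `name`, -1 if it is absent,
 * or -2 if a corrupt entry is found. */
static long toy_find_entry(const unsigned char *block, size_t size,
                           const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + TOY_HDR_LEN <= size) {
        uint16_t rec_len = toy_rec_len(block, off);
        uint8_t name_len = block[off + 6];

        /* Validate first: with rec_len == 0, `off` would never advance. */
        if (rec_len < TOY_HDR_LEN || rec_len % 4 != 0 ||
            rec_len > size - off ||
            (size_t)name_len + TOY_HDR_LEN > rec_len)
            return -2;

        if (name_len == want &&
            memcmp(block + off + TOY_HDR_LEN, name, want) == 0)
            return (long)off;

        off += rec_len;
    }
    return -1;
}
```

Because the check runs before the comparison and before the offset is advanced, a zero-length record ends the walk with an error instead of spinning on the same entry.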
     

30 Apr, 2008

2 commits


29 Apr, 2008

1 commit

  • This patch enables extent-formatted normal symlinks. Using extents
    format allows a symlink to refer to a block number larger than 2^32
    on large filesystems. We still don't enable extent format for fast
    symlinks, which are contained in the inode itself.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     

17 Apr, 2008

2 commits


26 Feb, 2008

1 commit

  • In addition, don't inherit EXT4_EXTENTS_FL from parent directory.
    If we have a directory with the extent flag set and later mount the file
    system with -o noextents, the files created in that directory will also
    have the extent flag set, but we would not have called ext4_ext_tree_init
    for them. This will cause an error later when we verify the extent header.

    Signed-off-by: Aneesh Kumar K.V
    Acked-by: Eric Sandeen
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     

22 Feb, 2008

1 commit


16 Feb, 2008

1 commit

  • The ext4_dec_count() function is only needed when dropping the i_nlink
    count on inodes which are (or which could be) directories. If we
    *know* that the inode in question can't possibly be a directory, use
    drop_nlink(), or clear_nlink() if we know i_nlink is 1.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

08 Feb, 2008

1 commit

  • Stop the EXT4 filesystem from using iget() and read_inode(). Replace
    ext4_read_inode() with ext4_iget(), and call that instead of iget().
    ext4_iget() then uses iget_locked() directly and returns a proper error code
    instead of an inode in the event of an error.

    ext4_fill_super() returns any error incurred when getting the root inode
    instead of EINVAL.

    Signed-off-by: David Howells
    Acked-by: "Theodore Ts'o"
    Acked-by: Jan Kara
    Acked-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
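The "return a proper error code instead of an inode" convention described above can be modelled in user space with pointer-encoded errors. The helpers below imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idea but are hypothetical stand-ins (toy_* names), as are the toy inode table and the specific error codes chosen.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* User-space stand-ins for the pointer-encoded error idiom. */
static inline void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_inode { unsigned long ino; };

static struct toy_inode toy_root = { 2 };

/* Hypothetical lookup: only inode 2 exists in this toy model. */
static struct toy_inode *toy_iget(unsigned long ino)
{
    if (ino == 0)
        return toy_err_ptr(-EINVAL);
    if (ino != toy_root.ino)
        return toy_err_ptr(-EIO);
    return &toy_root;
}

/* The caller passes on whatever error the lookup reported,
 * rather than collapsing every failure into EINVAL. */
static int toy_fill_super(unsigned long root_ino)
{
    struct toy_inode *root = toy_iget(root_ino);

    if (toy_is_err(root))
        return (int)toy_ptr_err(root);
    return 0;
}
```

With this shape the caller can tell an I/O failure apart from an invalid argument, which a NULL or "bad inode" return cannot express.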
     

05 Feb, 2008

1 commit

  • For fast symbolic links, the file content is stored in the i_block[]
    array, which is not compatible with the new file extents format.
    e2fsck reports an error on such files because EXTENTS_FL is set.
    Don't set the EXTENTS_FL flag when creating fast symlinks.

    In the case of file migration, skip fast symbolic links.

    Signed-off-by: Valerie Clement
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Valerie Clement
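A rough sketch of the fast-symlink rule described above: if the target fits in the i_block[] area, it is stored in the inode and the extents flag is left clear. The toy_* names are hypothetical, and the 60-byte limit (15 slots of 4 bytes, including the terminating NUL) is an assumption of this example rather than something the commit message states.

```c
#include <stddef.h>
#include <string.h>

#define TOY_N_BLOCKS   15                  /* assumed number of i_block[] slots */
#define TOY_FAST_MAX   (TOY_N_BLOCKS * 4)  /* 60 bytes of in-inode storage */
#define TOY_EXTENTS_FL 0x00080000u         /* stand-in for the extents flag bit */

/* A symlink is "fast" when the target, with its NUL, fits in the inode. */
static int toy_is_fast_symlink(const char *target)
{
    return strlen(target) + 1 <= TOY_FAST_MAX;
}

/* Fast symlinks keep their data in i_block[], so the extents flag
 * must not be set on them; other symlinks keep the flags as given. */
static unsigned toy_symlink_flags(const char *target, unsigned flags)
{
    if (toy_is_fast_symlink(target))
        return flags & ~TOY_EXTENTS_FL;
    return flags;
}
```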
     

29 Jan, 2008

3 commits

  • The unused code found in ext3_find_entry() is also present (and still
    unused) in the ext4_find_entry() code. This patch removes it.

    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: "Theodore Ts'o"

    Mariusz Kozlowski
     
  • This patch adds a new data type ext4_lblk_t to represent
    the logical file blocks.

    This is the preparatory patch to support large files in ext4.
    The follow-up patch will convert the ext4_inode i_blocks to
    represent the number of blocks in file system block size. This
    change makes it possible to have a block number of 2**32 - 1, which
    will result in overflow if the block number is represented by a
    signed long. This patch converts all the block numbers to type
    ext4_lblk_t, which is a typedef of __u32.

    Also remove dead code ext4_ext_walk_space

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Signed-off-by: Eric Sandeen

    Aneesh Kumar K.V
     
  • With 64KB blocksize, a directory entry can have size 64KB, which does not fit
    into the 16 bits we have for the entry length. So we store 0xffff instead and
    convert the value when it is read from / written to disk. The patch also
    converts some places to use ext4_next_entry() when we are changing them anyway.

    Signed-off-by: Jan Kara
    Signed-off-by: Mingming Cao

    Jan Kara
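The 0xffff convention from the 64KB blocksize commit above can be illustrated with a pair of conversion helpers. This is a sketch under stated assumptions: the toy_* names are hypothetical and byte-order conversion is left out.

```c
#include <stdint.h>

#define TOY_MAX_REC_LEN 0xffffu /* on-disk marker standing for a 64KB entry */

/* Disk -> memory: 0xffff means the entry spans the whole 64KB block. */
static uint32_t toy_rec_len_from_disk(uint16_t disk)
{
    return disk == TOY_MAX_REC_LEN ? (1u << 16) : disk;
}

/* Memory -> disk: 65536 does not fit in 16 bits, so store the marker. */
static uint16_t toy_rec_len_to_disk(uint32_t len)
{
    return len == (1u << 16) ? (uint16_t)TOY_MAX_REC_LEN : (uint16_t)len;
}
```

The marker is safe to reserve in this sketch on the assumption that record lengths are multiples of 4, so 0xffff never occurs as a genuine length.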
     

18 Oct, 2007

1 commit

  • CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is
    unconditionally defined in ext4_fs.h. tune2fs is already able to turn off
    dir indexing, so at this point it's just cluttering up the code. Remove
    it.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton

    Eric Sandeen
     

20 Sep, 2007

2 commits

  • The do_split() function for htree dir blocks is intended to split a leaf
    block to make room for a new entry. It sorts the entries in the original
    block by hash value, then moves the last half of the entries to the new
    block - without accounting for how much space this actually moves. (IOW,
    it moves half of the entry *count* not half of the entry *space*). If by
    chance we have both large & small entries, and we move only the smallest
    entries, and we have a large new entry to insert, we may not have created
    enough space for it.

    The patch below stores each record size when calculating the dx_map, and
    then walks the hash-sorted dx_map, calculating how many entries must be
    moved to more evenly split the existing entries between the old block and
    the new block, guaranteeing enough space for the new entry.

    The dx_map "offs" member is reduced to u16 so that the overall map size
    does not change - it is temporarily stored at the end of the new block, and
    if it grows too large it may be overwritten. By making offs and size both
    u16, we won't grow the map size.

    Also add a few comments to the functions involved.

    This fixes the testcase reported by hooanon05@yahoo.co.jp on the
    linux-ext4 list, "ext3 dir_index causes an error"

    Thanks to Andreas Dilger for discussing the problem & solution with me.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andreas Dilger
    Tested-by: Junjiro Okajima
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Convert asserts (BUGs) in dx_probe triggered by bad on-disk data into
    recoverable errors with helpful warnings. With help from Duane Griffin in
    catching other asserts.

    Signed-off-by: Eric Sandeen
    Acked-by: Duane Griffin
    Acked-by: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
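For the do_split() change in this group, splitting by bytes rather than by entry count can be sketched as below. This is one way to choose the split point, written for illustration only: the toy_map_entry layout and function name are hypothetical, and the map is assumed to be already sorted by hash.

```c
#include <stddef.h>
#include <stdint.h>

/* One record of a hash-sorted map: where the entry lives and how big it is. */
struct toy_map_entry {
    uint32_t hash;
    uint16_t offs;
    uint16_t size;
};

/* Walk the map from the highest hash downward, accumulating entry sizes,
 * and stop once the next entry would sit mostly past the half-block mark.
 * Returns the index of the first entry that moves to the new block. */
static size_t toy_split_point(const struct toy_map_entry *map, size_t count,
                              size_t blocksize)
{
    size_t size = 0;
    size_t move = 0;

    for (size_t i = count; i-- > 0; ) {
        if (size + map[i].size / 2 > blocksize / 2)
            break;
        size += map[i].size;
        move++;
    }
    return count - move;
}
```

With two 400-byte entries followed by four 16-byte entries in a 1024-byte block, a count-based split would move only the three smallest entries (48 bytes), while this size-based walk moves five entries and frees roughly half the block.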
     

18 Jul, 2007

2 commits

  • This patch adds support to ext4 for allowing more than 65000
    subdirectories. Currently the maximum number of subdirectories is capped
    at 32000.

    If we exceed 65000 subdirectories in an htree directory it sets the
    inode link count to 1 and no longer counts subdirectories. The
    directory link count is not actually used when determining if a
    directory is empty, as that only counts subdirectories and not regular
    files that might be in there.

    An EXT4_FEATURE_RO_COMPAT_DIR_NLINK flag has been added and it is set if
    the subdir count for any directory crosses 65000. A later fsck will clear
    EXT4_FEATURE_RO_COMPAT_DIR_NLINK if there are no longer any directories
    with >65000 subdirs.

    Signed-off-by: Andreas Dilger
    Signed-off-by: Kalpak Shah
    Signed-off-by: "Theodore Ts'o"

    Andreas Dilger
     
  • This patch adds nanosecond timestamps for ext4. This involves adding
    *time_extra fields to the ext4_inode to extend the timestamps to
    64-bits. Creation time is also added by this patch.

    These extended fields will fit into an inode if the filesystem was
    formatted with large inodes (-I 256 or larger) and there are currently
    no EAs consuming all of the available space. For new inodes we always
    reserve enough space for the kernel's known extended fields, but for
    inodes created with an old kernel this might not have been the case. So
    this patch also adds the EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE feature
    flag (ro-compat so that older kernels can't create inodes with a smaller
    extra_isize), which indicates whether the fields fitting inside
    s_min_extra_isize are available or not. If the expansion of inodes is
    unsuccessful then this feature will be disabled. This feature is only
    enabled if requested by the sysadmin.

    None of the extended inode fields is critical for correct filesystem
    operation.

    Signed-off-by: Andreas Dilger
    Signed-off-by: Kalpak Shah
    Signed-off-by: Eric Sandeen
    Signed-off-by: Dave Kleikamp
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Kalpak Shah
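As an illustration of how a 32-bit "extra" field can extend a timestamp, as in the nanosecond timestamp patch above, here is a minimal sketch. The bit layout is an assumption of this example (low 2 bits extend the seconds, remaining 30 bits hold nanoseconds); the commit message itself does not spell out the layout, and the toy_* names are hypothetical. Negative timestamps are not handled.

```c
#include <stdint.h>

#define TOY_EPOCH_BITS 2
#define TOY_EPOCH_MASK ((1u << TOY_EPOCH_BITS) - 1)

/* Pack bits 32..33 of the seconds and the nanoseconds (< 1e9, so they
 * fit in 30 bits) into one 32-bit extra field. */
static uint32_t toy_encode_extra(int64_t sec, uint32_t nsec)
{
    uint32_t epoch = (uint32_t)(((uint64_t)sec >> 32) & TOY_EPOCH_MASK);

    return epoch | (nsec << TOY_EPOCH_BITS);
}

/* Recover the nanoseconds from the extra field. */
static uint32_t toy_extra_nsec(uint32_t extra)
{
    return extra >> TOY_EPOCH_BITS;
}

/* Recover the extension bits of the seconds from the extra field. */
static uint32_t toy_extra_epoch(uint32_t extra)
{
    return extra & TOY_EPOCH_MASK;
}
```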
     

17 Jul, 2007

1 commit

  • After the ext3 orphan list check was added to ext3_destroy_inode()
    (please see my previous patch), the following situation was detected:

    EXT3-fs warning (device sda6): ext3_unlink: Deleting nonexistent file (37901290), 0
    Inode 00000101a15b7840: orphan list check failed!
    00000773 6f665f00 74616d72 00000573 65725f00 06737270 66000000 616d726f
    ...
    Call Trace: [] ext3_destroy_inode+0x79/0x90
    [] sys_unlink+0x126/0x1a0
    [] error_exit+0x0/0x81
    [] system_call+0x7e/0x83

    The first message says that the unlinked inode has i_nlink=0; ext3_unlink()
    then adds this inode to the orphan list.

    The second message means that this inode was not removed from the orphan list.
    The inode dump showed that i_fop = &bad_file_ops, which can be set only in
    make_bad_inode(). Then I found that ext3_read_inode() can call
    make_bad_inode() without any error/warning messages, for example in the
    following case:

    ...
    if (inode->i_nlink == 0) {
            if (inode->i_mode == 0 ||
                !(EXT3_SB(inode->i_sb)->s_mount_state & EXT3_ORPHAN_FS)) {
                    /* this inode is deleted */
                    brelse (bh);
                    goto bad_inode;
    ...

    A bad inode can live for some time, and ext3_unlink() can add it to the
    orphan list, but ext3_delete_inode() does not delete this inode from the
    orphan list. As a result we can have orphan list corruption detected in
    ext3_destroy_inode().

    However it is not clear to me how to fix this issue correctly.

    As far as I can see, is_bad_inode() is called after iget() in all places
    except ext3_lookup() and ext3_get_parent(). I believe it makes sense to
    add a bad inode check to these functions too and call iput() if a bad inode
    is detected.

    Signed-off-by: Vasily Averin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Averin
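The check proposed above (test for a bad inode right after obtaining it, and drop the reference if it is bad) can be modelled in user space like this. Everything here is a hypothetical stand-in: the toy inode table, the reference counter, and the choice of -ENOENT as the error are assumptions made for the sketch.

```c
#include <errno.h>
#include <stddef.h>

struct toy_inode {
    unsigned long ino;
    int bad;       /* set when the inode was marked bad on read */
    int refcount;
};

static int toy_is_bad_inode(const struct toy_inode *inode)
{
    return inode->bad;
}

/* Take a reference on the inode in the toy table. */
static struct toy_inode *toy_iget(struct toy_inode *table, unsigned long ino)
{
    table[ino].refcount++;
    return &table[ino];
}

/* Drop a reference. */
static void toy_iput(struct toy_inode *inode)
{
    inode->refcount--;
}

/* Lookup that refuses to hand out a bad inode: the reference is
 * dropped and an error is returned instead. */
static int toy_lookup(struct toy_inode *table, unsigned long ino,
                      struct toy_inode **out)
{
    struct toy_inode *inode = toy_iget(table, ino);

    if (toy_is_bad_inode(inode)) {
        toy_iput(inode);
        *out = NULL;
        return -ENOENT;
    }
    *out = inode;
    return 0;
}
```

Since the bad inode never reaches the caller, a later unlink cannot place it on the orphan list.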
     

01 Jun, 2007

1 commit


09 May, 2007

2 commits

  • Remove includes of where it is not used/needed.
    Suggested by Al Viro.

    Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
    sparc64, and arm (all 59 defconfigs).

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • - ext3_dx_find_entry() exits without setting a proper error pointer

    - do_split() exits without setting a proper error pointer;
    this is really painful because many callers contain the following code:

        de = do_split(handle, dir, &bh, frame, &hinfo, &retval);
        if (!(de))
                return retval;
        <<< WOW retval wasn't changed by do_split(), so caller failed
        <<< but return SUCCESS :)

    - Rearrange do_split() error path. The current error path is really ugly; all
    this up and down jump stuff doesn't make the code easy to understand.

    [dmonakhov@sw.ru: fix annoying fake error messages]
    Signed-off-by: Monakhov Dmitriy
    Cc: Andreas Dilger
    Cc: Theodore Ts'o
    Signed-off-by: Monakhov Dmitriy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitriy Monakhov
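The failure mode described in the do_split() commit above (a NULL return with the error out-parameter left untouched) can be reproduced in a few lines. The toy_* functions are hypothetical stand-ins, and -ENOSPC is an arbitrary error chosen for the example.

```c
#include <errno.h>
#include <stddef.h>

struct toy_entry { int unused; };

static struct toy_entry toy_slot;

/* Buggy variant: reports failure via NULL but never writes *err. */
static struct toy_entry *toy_split_buggy(int fail, int *err)
{
    (void)err;
    if (fail)
        return NULL;
    return &toy_slot;
}

/* Fixed variant: every exit path sets *err. */
static struct toy_entry *toy_split_fixed(int fail, int *err)
{
    if (fail) {
        *err = -ENOSPC;
        return NULL;
    }
    *err = 0;
    return &toy_slot;
}

/* Caller pattern from the commit message: trust retval when de is NULL. */
static int toy_add_entry(struct toy_entry *(*split)(int, int *), int fail)
{
    int retval = 0;
    struct toy_entry *de = split(fail, &retval);

    if (!de)
        return retval;
    return 0;
}
```

With the buggy variant the caller returns 0 even though the split failed; with the fixed variant it returns the error.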
     

13 Feb, 2007

1 commit

  • Many struct inode_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
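A small illustration of the const operations-table pattern described above. The struct and function names are hypothetical; the point is only that a table of function pointers that is never modified can be declared const.

```c
#include <stddef.h>

/* Hypothetical operations table: a struct of function pointers. */
struct toy_inode_operations {
    int (*create)(const char *name);
};

static int toy_create(const char *name)
{
    return name != NULL ? 0 : -1;
}

/* Declared const: the table can be placed in read-only data, and an
 * assignment such as toy_dir_ops.create = NULL fails to compile. */
static const struct toy_inode_operations toy_dir_ops = {
    .create = toy_create,
};
```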
     

12 Feb, 2007

2 commits

  • - Naming is confusing, ext3_inc_count manipulates i_nlink not i_count
    - handle argument passed in is not used
    - ext3 and ext4 already call inc_nlink and dec_nlink directly in other places

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Return -ENOENT from ext[34]_link if we've raced with unlink and i_nlink is
    0. Doing otherwise has the potential to corrupt the orphan inode list,
    because we'd wind up with an inode with a non-zero link count on the list,
    and it will never get properly cleaned up & removed from the orphan list
    before it is freed.

    [akpm@osdl.org: build fix]
    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
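The ext[34]_link guard described above amounts to refusing to add a link to an inode whose link count has already dropped to zero. A minimal model, with a hypothetical toy_inode type and no locking:

```c
#include <errno.h>

struct toy_inode {
    unsigned nlink;
};

/* Refuse to resurrect an inode that a racing unlink already took to
 * zero links; otherwise add one link. */
static int toy_link(struct toy_inode *inode)
{
    if (inode->nlink == 0)
        return -ENOENT;
    inode->nlink++;
    return 0;
}
```

Without the early return, the inode would be back at a non-zero link count while still sitting on the orphan list, which is the corruption the commit message describes.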
     

09 Dec, 2006

1 commit


08 Dec, 2006

1 commit

  • I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
    http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz

    Basically it makes a filesystem, splats some random bits over it, then
    tries to mount it and do some simple filesystem actions.

    At best, the filesystem catches the corruption gracefully. At worst,
    things spin out of control.

    As you might guess, we found a couple places in ext4 where things spin out
    of control :)

    First, we had a corrupted directory that was never checked for
    consistency... it was corrupt, and pointed to another bad "entry" of
    length 0. The for() loop looped forever, since the length of
    ext4_next_entry(de) was 0, and we kept looking at the same pointer over and
    over and over and over... I modeled this check and subsequent action on
    what is done for other directory types in ext4_readdir...

    (adding this check adds some computational expense; I am testing a followup
    patch to reduce the number of times we check and re-check these directory
    entries, in all cases. Thanks for the idea, Andreas).

    Next we had a root directory inode which had a corrupted size, claimed to
    be > 200M on a 4M filesystem. There was only really 1 block in the
    directory, but because the size was so large, readdir kept coming back for
    more, spewing thousands of printk's along the way.

    Per Andreas' suggestion, if we're in this read error condition and we're
    trying to read an offset which is greater than i_blocks worth of bytes,
    stop trying, and break out of the loop.

    With these two changes fsfuzz test survives quite well on ext4.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
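The second fix above (stop reading once the offset exceeds what i_blocks could possibly hold) can be sketched as follows. This is a simplified model: the toy_* names are hypothetical, i_blocks is assumed to count 512-byte units, and every block read is assumed to fail, which is the error condition the commit message is about.

```c
#include <stdint.h>

/* Assumes i_blocks counts 512-byte units, so i_blocks << 9 is the most
 * bytes the inode could really occupy. */
static int toy_offset_beyond_allocation(uint64_t offset, uint64_t i_blocks)
{
    return offset > (i_blocks << 9);
}

/* Counts how many block reads a readdir-like loop attempts when every
 * read fails and the claimed size (i_size) may be corrupt. */
static unsigned toy_readdir_attempts(uint64_t i_size, uint64_t i_blocks,
                                     uint32_t blocksize)
{
    unsigned attempts = 0;

    for (uint64_t pos = 0; pos < i_size; pos += blocksize) {
        attempts++; /* pretend the read of this block failed */
        if (toy_offset_beyond_allocation(pos, i_blocks))
            break;  /* nothing real can live past this point */
    }
    return attempts;
}
```

For a directory claiming 200MB with only one 4KB block allocated, the loop gives up after a handful of attempts instead of walking the whole claimed size.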
     

12 Oct, 2006

4 commits