05 Jan, 2012

1 commit

  • This patch adds a new online resize interface, whose input argument is
    a 64-bit integer indicating how many blocks there are in the resized fs.

    In the new resize implementation, all work such as allocating group
    tables is done on the kernel side, so the new resize interface can
    support the flex_bg feature and prepares the ground for supporting
    resize with features like bigalloc and exclude bitmap. Beyond that,
    user-space tools just pass in the new number of blocks.

    We delay initializing the bitmaps and inode tables of added groups if
    possible and add multiple groups (a flex group) each time, so the new
    resize is very fast, like mkfs.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
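
A minimal sketch of what "user-space tools just pass in the new number of blocks" could look like, assuming the generic Linux ioctl number layout from asm-generic/ioctl.h (some architectures encode the direction bits differently) and the ext4 header definition of EXT4_IOC_RESIZE_FS as _IOW('f', 16, __u64). The helper name resize_argument is hypothetical and only illustrates the 64-bit argument.

```python
import struct

# Generic ioctl encoding (asm-generic/ioctl.h). Assumption: architectures
# such as powerpc or mips use different direction bits, so this value is
# not universal.
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = 8
_IOC_SIZESHIFT = 16
_IOC_DIRSHIFT = 30
_IOC_WRITE = 1


def _IOW(type_char, nr, size):
    """Build a write-direction ioctl request number."""
    return ((_IOC_WRITE << _IOC_DIRSHIFT)
            | (size << _IOC_SIZESHIFT)
            | (ord(type_char) << _IOC_TYPESHIFT)
            | (nr << _IOC_NRSHIFT))


# _IOW('f', 16, __u64): the argument is a single 64-bit block count.
EXT4_IOC_RESIZE_FS = _IOW("f", 16, struct.calcsize("=Q"))


def resize_argument(new_block_count):
    """Pack the new total block count as a native-endian u64 (hypothetical helper)."""
    if not 0 < new_block_count < 2 ** 64:
        raise ValueError("block count must fit in an unsigned 64-bit integer")
    return struct.pack("=Q", new_block_count)
```

A caller would typically open the mount point and issue something like fcntl.ioctl(fd, EXT4_IOC_RESIZE_FS, resize_argument(n)) with sufficient privileges; that call is deliberately left out of the sketch because it modifies a live filesystem.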
     

09 Oct, 2011

2 commits

  • For a long time now orlov has been the default block allocator in
    ext4. It performs better than the old one and no one seems to claim
    otherwise, so we can safely drop the old allocator and deprecate the
    oldalloc and orlov mount options.

    This is part of the effort to reduce the number of ext4 options, and
    hence the test matrix.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     
  • The acl and user_xattr mount options are no longer needed since those
    features are enabled by default if configured in (see commit
    ea6633369458992241599c9d9ebadffaeddec164). We cannot easily deprecate
    the mount options themselves (since it is probably too early), but we
    can remove them from the documentation first.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

04 Sep, 2011

1 commit

  • If the user explicitly specifies conflicting mount options for
    delalloc or dioread_nolock and data=journal, fail the mount instead
    of printing a warning and continuing (since many users won't look at
    dmesg and notice the warning).

    Also, print a single warning that data=journal implies that delayed
    allocation is not on by default (since it's not supported), and
    furthermore that O_DIRECT is not supported. Improve the text in
    Documentation/filesystems/ext4.txt so this is clear there as well.

    Similarly, if the dioread_nolock mount option is specified when the
    file system block size != PAGE_SIZE, fail the mount instead of
    printing a warning message and ignoring the mount option.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
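
The conflict rules described in this commit can be sketched as a small validator. This is an illustration only, not the kernel's option parser: the function name, the dict representation of the options, and the 4096-byte default page size are assumptions made for the example.

```python
def check_mount_options(opts, block_size, page_size=4096):
    """Return a list of reasons the mount should fail (empty if none).

    `opts` holds only options the user specified explicitly, e.g.
    {"data": "journal", "delalloc": True}. Hypothetical helper.
    """
    errors = []
    if opts.get("data") == "journal":
        # data=journal supports neither delayed allocation nor dioread_nolock.
        for name in ("delalloc", "dioread_nolock"):
            if opts.get(name):
                errors.append("can't mount with both data=journal and " + name)
    if opts.get("dioread_nolock") and block_size != page_size:
        # dioread_nolock requires the block size to equal the page size.
        errors.append("can't mount with dioread_nolock if block size != PAGE_SIZE")
    return errors
```

Returning the reasons instead of printing them mirrors the point of the change: a conflict becomes a hard failure the caller has to handle, not a warning that is easy to miss in dmesg.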
     

25 Jun, 2011

1 commit


02 May, 2011

1 commit


31 Mar, 2011

1 commit


22 Feb, 2011

1 commit

  • Add documentation for mount options and ioctls to
    Documentation/filesystems/ext4.txt, which has not been updated for some
    time. Also add documentation for ext4 sysfs tunables to the
    Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
    typographical errors in that file.

    https://bugzilla.kernel.org/show_bug.cgi?id=9423

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

28 Oct, 2010

1 commit

  • When the lazy_itable_init extended option is passed to mke2fs, it
    considerably speeds up filesystem creation because inode tables are
    not zeroed out. The fact that parts of the inode table are
    uninitialized is not a problem so long as the block group descriptors,
    which contain information regarding how much of the inode table has
    been initialized, have not been corrupted. However, if the block group
    checksums are not valid, e2fsck must scan the entire inode table, and
    the old, uninitialized data could potentially cause e2fsck to
    report false problems.

    Hence, it is important for the inode tables to be initialized as soon
    as possible. This commit adds this feature so that mke2fs can safely
    use the lazy inode table initialization feature to speed up formatting
    file systems.

    This is done via a new kernel thread called ext4lazyinit, which is
    created on demand and destroyed when it is no longer needed. There
    is only one thread for all ext4 filesystems in the system. When the
    first filesystem with the init_itable mount option is mounted, the
    ext4lazyinit thread is created, and the filesystem can then register
    its request in the request list.

    This thread then walks through the list of requests, picking up
    scheduled requests and invoking ext4_init_inode_table(). The next
    schedule time for a request is computed by multiplying the time it took
    to zero out the last inode table by a wait multiplier, which can be set
    with the init_itable=n mount option (default is 10). We do this so that
    we do not take the whole I/O bandwidth. When the thread is no longer
    necessary (the request list is empty), it frees the appropriate
    structures and exits (and can be created again later by another
    filesystem).

    We do not disturb regular inode allocations in any way; they simply do
    not care whether the inode table is zeroed or not. But when zeroing, we
    have to skip used inodes, obviously. We should also prevent new inode
    allocations from the group while zeroing is under way. For that we
    take the alloc_sem lock for writing in ext4_init_inode_table() and for
    reading in ext4_claim_inode(), so when we are unlucky and the allocator
    hits the group which is currently being zeroed, it just has to wait.

    This can be suppressed using the mount option no_init_itable.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
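
The throttling rule above (next schedule time = time spent zeroing the last inode table, multiplied by the wait multiplier) can be sketched as follows. The function names are hypothetical, and the bandwidth estimate assumes the thread is idle for the whole computed delay, which is a simplification.

```python
DEFAULT_WAIT_MULTIPLIER = 10  # default of the init_itable=n mount option


def next_schedule_delay(last_zeroing_seconds, wait_multiplier=DEFAULT_WAIT_MULTIPLIER):
    """Seconds to wait before zeroing the next inode table (hypothetical helper)."""
    return last_zeroing_seconds * wait_multiplier


def zeroing_time_fraction(wait_multiplier=DEFAULT_WAIT_MULTIPLIER):
    """Rough share of wall-clock time spent zeroing: t / (t + t * n) = 1 / (1 + n)."""
    return 1 / (1 + wait_multiplier)
```

With the default multiplier, a table that took 0.5 seconds to zero delays the next request by 5 seconds, so under this simplified model the thread spends roughly one eleventh of its time issuing I/O.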
     

25 Dec, 2009

1 commit


20 Nov, 2009

2 commits

  • Users on the linux-ext4 list recently complained about differences
    across filesystems w.r.t. how to mount without a journal replay.

    In the discussion it was noted that xfs's "norecovery" option is
    perhaps more descriptively accurate than "noload," so let's make
    that an alias for ext4.

    Also show this status in /proc/mounts

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • It is anticipated that when sb_issue_discard starts doing
    real work on trim-capable devices, we may see issues. Make
    this mount-time optional, and default it to off until we know
    that things are working out OK.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     

03 Nov, 2009

1 commit

  • This reverts commit d0646f7b636d067d715fab52a2ba9c6f0f46b0d7, as
    requested by Eric Sandeen.

    It can basically cause an ext4 filesystem to miss recovery (and thus get
    mounted with errors) if the journal checksum does not match.

    Quoth Eric:

    "My hand-wavy hunch about what is happening is that we're finding a
    bad checksum on the last partially-written transaction, which is
    not surprising, but if we have a wrapped log and we're doing the
    initial scan for head/tail, and we abort scanning on that bad
    checksum, then we are essentially running an unrecovered filesystem.

    But that's hand-wavy and I need to go look at the code.

    We lived without journal checksums on by default until now, and at
    this point they're doing more harm than good, so we should revert
    the default-changing commit until we can fix it and do some good
    power-fail testing with the fixes in place."

    See

    http://bugzilla.kernel.org/show_bug.cgi?id=14354

    for all the gory details.

    Requested-by: Eric Sandeen
    Cc: Theodore Tso
    Cc: Alexey Fisher
    Cc: Maxim Levitsky
    Cc: Aneesh Kumar K.V
    Cc: Mathias Burén
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 Sep, 2009

1 commit


19 Sep, 2009

1 commit


06 Sep, 2009

1 commit


13 Jun, 2009

2 commits


28 Mar, 2009

1 commit

  • Add support for using the mount options "barrier" and "nobarrier", and
    "auto_da_alloc" and "noauto_da_alloc", which is more consistent than
    "barrier=" or "auto_da_alloc=". Most other ext3/ext4 mount
    options use the foo/nofoo naming convention. We allow the old forms
    of these mount options for backwards compatibility.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
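
A minimal sketch of the foo/nofoo convention with backwards-compatible "foo=" forms, assuming the old forms take the values 0 and 1. The function is hypothetical and is not the kernel's option table.

```python
def parse_bool_option(token, name):
    """Interpret one mount option token for the boolean option `name`.

    Returns True or False if the token sets the option, or None if the
    token is unrelated. Hypothetical helper for illustration.
    """
    if token == name:                  # new form, e.g. "barrier"
        return True
    if token == "no" + name:           # new form, e.g. "nobarrier"
        return False
    prefix = name + "="
    if token.startswith(prefix):       # old form, e.g. "barrier=0"
        value = token[len(prefix):]
        if value in ("0", "1"):
            return value == "1"
        raise ValueError("invalid value for " + name + ": " + value)
    return None
```

For example, parse_bool_option("noauto_da_alloc", "auto_da_alloc") and parse_bool_option("auto_da_alloc=0", "auto_da_alloc") both yield False, so either spelling keeps working.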
     

23 Feb, 2009

1 commit


07 Jan, 2009

3 commits

  • This mount option is largely superfluous, and in fact the way it was
    implemented was buggy; if a filesystem which did not have the extents
    feature flag was mounted -o extents, the filesystem would attempt to
    create and use extents-based files even though the extents feature flag
    was not enabled. The simplest thing to do is to nuke the mount option
    entirely. It's not all that useful to force the non-creation of new
    extent-based files if the filesystem can support it.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This code has been obsolete for quite some time, since the supported
    method for adding a journal inode is to use tune2fs (or to create a
    new filesystem with a journal via mke2fs or mkfs.ext4).

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Fix paragraph with recommendations on how to tune ext4 for benchmarks.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

06 Jan, 2009

1 commit


04 Jan, 2009

1 commit

  • Add new mount options, min_batch_time and max_batch_time, which
    control how long the jbd2 layer should wait for additional filesystem
    operations to get batched with a synchronous write transaction.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

17 Oct, 2008

1 commit


11 Oct, 2008

2 commits

  • If the journal doesn't abort when it gets an IO error in file data
    blocks, the file data corruption will spread silently. Because
    most applications and commands do buffered writes without fsync(),
    they don't notice the IO error. That is scary for mission-critical
    systems. On the other hand, if the journal aborts whenever it gets
    an IO error in file data blocks, the system will easily become
    inoperable. So this patch introduces a filesystem option to
    determine whether it aborts the journal or just calls printk() when
    it gets an IO error in file data.

    If you mount an ext4 fs with the data_err=abort option, it aborts on a
    file data write error. If you mount it with data_err=ignore, it doesn't
    abort and just calls printk(). data_err=ignore is the default.

    Here is the corresponding patch of the ext3 version:
    http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374

    Signed-off-by: Hidehiro Kawai
    Signed-off-by: Theodore Ts'o

    Hidehiro Kawai
     
  • The ext4 filesystem is getting stable enough that it's time to drop
    the "dev" prefix. Also remove the requirement for the TEST_FILESYS
    flag.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

10 Oct, 2008

1 commit

  • With modern hard drives, reading 64k takes roughly the same time as
    reading a 4k block. So request readahead for adjacent inode table
    blocks to reduce the time it takes when iterating over directories
    (especially when doing this in htree sort order) in a cold cache case.
    With this patch, the time it takes to run "git status" on a kernel
    tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
    is reduced by 21%.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

28 Jul, 2008

1 commit


12 Jul, 2008

2 commits


27 May, 2008

1 commit

  • I can't think of any valid reason for ext4 to not use barriers when
    they are available; I believe this is necessary for filesystem
    integrity in the face of a volatile write cache on storage.

    An administrator who trusts that the cache is sufficiently battery-
    backed (and power supplies are sufficiently redundant, etc...)
    can always turn it back off again.

    SuSE has carried such a patch for ext3 for quite some time now.

    Also document the mount option while we're at it.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Mingming Cao
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     

29 Jan, 2008

2 commits

  • Signed-off-by: Alex Tomas
    Signed-off-by: Andreas Dilger
    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Alex Tomas
     
  • The journal checksum feature adds two new flags, i.e.,
    JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and JBD2_FEATURE_COMPAT_CHECKSUM.

    The JBD2_FEATURE_COMPAT_CHECKSUM flag indicates that the commit block
    contains the checksum for the blocks described by the descriptor blocks.
    Due to checksums, writing of the commit record no longer needs to be
    synchronous. Now the commit record can be sent to disk without waiting
    for descriptor blocks to be written to disk. This behavior is controlled
    using the JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT flag. Older kernels/e2fsck
    should not be able to recover the journal with _ASYNC_COMMIT, hence it
    is made incompat.
    The commit header has been extended to hold the checksum along with the
    type of the checksum.

    During the scan pass of recovery, checksums are verified to ensure the
    sanity and completeness (in case of _ASYNC_COMMIT) of every transaction.

    Signed-off-by: Andreas Dilger
    Signed-off-by: Girish Shilamkar
    Signed-off-by: Dave Kleikamp
    Signed-off-by: Mingming Cao

    Girish Shilamkar
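
The idea of storing a checksum in the commit block and verifying it during the recovery scan can be illustrated with a toy model. This sketch uses zlib.crc32 as a stand-in and invented function names; the actual jbd2 checksum type, seed, and on-disk layout are not taken from this commit message and may differ.

```python
import zlib


def commit_checksum(blocks):
    """Running CRC32 over all blocks of a transaction (toy stand-in)."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def transaction_is_valid(blocks, stored_checksum):
    """Scan-pass check: replay a transaction only if its checksum matches."""
    return commit_checksum(blocks) == stored_checksum
```

Because the commit record carries the checksum, a commit block that reached the disk before its data blocks can be detected at recovery time as a mismatch, which is what makes the asynchronous commit safe in this model.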
     

12 Oct, 2006

1 commit

  • This file, ext4.txt, was put together with information from Andrew Morton,
    Andreas Dilger, Suparna Bhattacharya, and Ted Ts'o.

    I copied the mount options, with the exception of "extents", from ext3.txt,
    so if anyone is aware of anything out-of-date, please let me know.

    Signed-off-by: Dave Kleikamp
    Cc: Theodore Ts'o
    Cc: Suparna Bhattacharya
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp