22 Feb, 2018

1 commit


30 Mar, 2017

1 commit

  • As ftp.kernel.org is closed [0], this commit fixes dead URLs in
    documents to use www.kernel.org instead.

    [0] https://www.kernel.org/shutting-down-ftp-services.html

    Signed-off-by: SeongJae Park
    Acked-by: Theodore Ts'o
    Acked-by: David S. Miller
    Reviewed-by: Mauro Carvalho Chehab
    Signed-off-by: Jonathan Corbet

    SeongJae Park
     

04 Dec, 2016

1 commit


17 Feb, 2015

1 commit

  • This is a port of the DAX functionality found in the current version of
    ext2.

    [matthew.r.wilcox@intel.com: heavily tweaked]
    [akpm@linux-foundation.org: remap_pages went away]
    Signed-off-by: Ross Zwisler
    Reviewed-by: Andreas Dilger
    Signed-off-by: Matthew Wilcox
    Cc: Boaz Harrosh
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Jens Axboe
    Cc: Kirill A. Shutemov
    Cc: Mathieu Desnoyers
    Cc: Randy Dunlap
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     

07 Sep, 2013

1 commit

  • Pull trivial tree from Jiri Kosina:
    "The usual trivial updates all over the tree -- mostly typo fixes and
    documentation updates"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
    doc: Documentation/cputopology.txt fix typo
    treewide: Convert retrun typos to return
    Fix comment typo for init_cma_reserved_pageblock
    Documentation/trace: Correcting and extending tracepoint documentation
    mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
    power: Documentation: Update s2ram link
    doc: fix a typo in Documentation/00-INDEX
    Documentation/printk-formats.txt: No casts needed for u64/s64
    doc: Fix typo "is is" in Documentations
    treewide: Fix printks with 0x%#
    zram: doc fixes
    Documentation/kmemcheck: update kmemcheck documentation
    doc: documentation/hwspinlock.txt fix typo
    PM / Hibernate: add section for resume options
    doc: filesystems : Fix typo in Documentations/filesystems
    scsi/megaraid fixed several typos in comments
    ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
    treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
    page_isolation: Fix a comment typo in test_pages_isolated()
    doc: fix a typo about irq affinity
    ...

    Linus Torvalds
     

29 Aug, 2013

1 commit

  • It's always been a hassle that if an external journal's
    device number changes, the filesystem won't mount.
    And since boot-time enumeration can change, device number
    changes aren't unusual.

    The current mechanism to update the journal location is by
    passing in a mount option w/ a new devnum, but that's a hassle;
    it's a manual approach, fixing things after the fact.

    Adding a mount option, "-o journal_path=/dev/$DEVICE" would
    help, since then we can do e.g.

    # mount -o journal_path=/dev/disk/by-label/$JOURNAL_LABEL ...

    and it'll mount even if the devnum has changed, as shown here:

    # losetup /dev/loop0 journalfile
    # mke2fs -L mylabel-journal -O journal_dev /dev/loop0
    # mkfs.ext4 -L mylabel -J device=/dev/loop0 /dev/sdb1

    Change the journal device number:

    # losetup -d /dev/loop0
    # losetup /dev/loop1 journalfile

    And today it will fail:

    # mount /dev/sdb1 /mnt/test
    mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
    missing codepage or helper program, or other error
    In some cases useful info is found in syslog - try
    dmesg | tail or so

    # dmesg | tail -n 1
    [17343.240702] EXT4-fs (sdb1): error: couldn't read superblock of external journal

    But with this new mount option, we can specify the new path:

    # mount -o journal_path=/dev/loop1 /dev/sdb1 /mnt/test
    #

    (which does update the encoded device number, incidentally):

    # umount /dev/sdb1
    # dumpe2fs -h /dev/sdb1 | grep "Journal device"
    dumpe2fs 1.41.12 (17-May-2010)
    Journal device: 0x0701

    But best of all we can just always mount by journal-path, and
    it'll always work:

    # mount -o journal_path=/dev/disk/by-label/mylabel-journal /dev/sdb1 /mnt/test
    #

    So the journal_path option can be specified in fstab, and as long as
    the disk is available somewhere, and findable by label (or by UUID),
    we can mount.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara
    Reviewed-by: Carlos Maiolino

    Eric Sandeen
     

25 Jul, 2013

1 commit


10 Apr, 2013

1 commit

  • Currently, in an ENOSPC condition, when writing into unwritten space or
    punching a hole, we might need to split the extent and grow the extent
    tree. However, since we cannot allocate any new metadata blocks, we have
    to zero out the unwritten part of the extent or the punched-out part of
    the extent, or in the worst case return ENOSPC even though the user does
    not actually allocate any space.

    Also, in the delalloc path we reserve metadata and data blocks for the
    time we are going to write out. However, metadata block reservation is
    very tricky, especially since we expect that logical connectivity implies
    physical connectivity. That might not be the case, and hence we might
    end up allocating more metadata blocks than previously reserved. So in
    the future, metadata reservation checks should be removed, since we
    cannot ensure that we do not under-reserve.

    And this is where reserved space comes into the picture. When mounting
    the file system we slice off a little bit of the file system space (2%
    or 4096 clusters, whichever is smaller) which can be then used for the
    cases mentioned above to prevent costly zeroout, or unexpected ENOSPC.

    The number of reserved clusters can be set via sysfs; however, it can
    never be bigger than the number of free clusters in the file system.
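
    The sizing rule above can be sketched in plain C. This is an
    illustrative sketch only, not the kernel code: the helper names
    default_reserved_clusters() and set_reserved_clusters() are
    hypothetical, and the sysfs check simply follows the wording of this
    commit message (reject values larger than the free cluster count).

```c
#include <stdint.h>

/* Cap taken from the commit message: at most 4096 clusters. */
#define RESV_CLUSTERS_CAP 4096u

/*
 * Default reservation at mount time: 2% of all clusters or 4096
 * clusters, whichever is smaller. (Hypothetical helper name.)
 */
static uint64_t default_reserved_clusters(uint64_t total_clusters)
{
    uint64_t two_percent = total_clusters / 50;

    return two_percent < RESV_CLUSTERS_CAP ? two_percent : RESV_CLUSTERS_CAP;
}

/*
 * Model of a write to the sysfs tunable: the requested value is rejected
 * if it exceeds the number of free clusters. Returns 0 on success and
 * -1 if the request is rejected. (Hypothetical helper name.)
 */
static int set_reserved_clusters(uint64_t requested, uint64_t free_clusters,
                                 uint64_t *reserved)
{
    if (requested > free_clusters)
        return -1;
    *reserved = requested;
    return 0;
}
```

    For example, a file system with 100000 clusters would reserve 2000
    clusters under this rule, while one with 1000000 clusters would hit
    the 4096-cluster cap.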

    Note that this patch fixes the failure of xfstest 274 as expected.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Carlos Maiolino

    Lukas Czerner
     

09 Apr, 2013

1 commit

  • Add a new ioctl, EXT4_IOC_SWAP_BOOT which swaps i_blocks and
    associated attributes (like i_blocks, i_size, i_flags, ...) from the
    specified inode with inode EXT4_BOOT_LOADER_INO (#5). This is
    typically used to store a boot loader in a secure part of the
    filesystem, where it can't be changed by a normal user by accident.
    The data blocks of the previous boot loader will be associated with
    the given inode.

    This usercode program is a simple example of the usage:

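    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Assumed to match the definition in fs/ext4/ext4.h. */
    #ifndef EXT4_IOC_SWAP_BOOT
    #define EXT4_IOC_SWAP_BOOT _IO('f', 17)
    #endif

    int main(int argc, char *argv[])
    {
            int fd;
            int err;

            if (argc != 2) {
                    printf("usage: ext4-swap-boot-inode FILE-TO-SWAP\n");
                    exit(1);
            }

            /* The file must be opened for writing. */
            fd = open(argv[1], O_WRONLY);
            if (fd < 0) {
                    perror("open");
                    exit(1);
            }

            /* Swap the file's blocks with those of the boot loader inode. */
            err = ioctl(fd, EXT4_IOC_SWAP_BOOT);
            if (err < 0) {
                    perror("ioctl");
                    exit(1);
            }

            close(fd);
            exit(0);
    }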

    [ Modified by Theodore Ts'o to fix a number of bugs in the original code.]

    Signed-off-by: Dr. Tilmann Bubeck
    Signed-off-by: "Theodore Ts'o"

    Dr. Tilmann Bubeck
     

11 Dec, 2012

1 commit

  • Ted has sent out an RFC about removing this feature. Eric and Jan
    confirmed that both RedHat and SUSE enable this feature in all their
    products. David also said that "As far as I know, it's enabled in all
    Android kernels that use ext4." So it seems OK for us.

    What's more, inline data depends on xattr for its implementation,
    and to be frank, I don't run any tests with inline data enabled while
    xattr is disabled. So I think we should add inline data and remove this
    config option in the same release.

    [ The savings if you disable CONFIG_EXT4_FS_XATTR are only 27k, which
    isn't much in the grand scheme of things. Since no one seems to be
    testing this configuration except for some automated compile farms, on
    balance we are better off removing this config option, so that it is
    effectively always enabled. -- tytso ]

    Cc: David Brown
    Cc: Eric Sandeen
    Reviewed-by: Jan Kara
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"

    Tao Ma
     

17 Aug, 2012

1 commit

  • Very large directories can cause significant performance problems, or
    perhaps even invoke the OOM killer, if the process is running in a
    highly constrained memory environment (whether it is VMs with a small
    amount of memory or in a small memory cgroup).

    So it is useful, in cloud server/data center environments, to be able
    to set a filesystem-wide cap on the maximum size of a directory, to
    ensure that directories never get larger than a sane size. We do this
    via a new mount option, max_dir_size_kb. If there is an attempt to
    grow the directory larger than max_dir_size_kb, the system call will
    return ENOSPC instead.
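
    A minimal sketch of the limit check described above, in plain C. The
    function name check_dir_growth() is hypothetical, and treating a
    value of 0 as "no limit" is an assumption made for illustration; the
    kernel's actual check may differ in detail.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Decide whether a directory may grow further.
 *
 * dir_size_bytes:  current size of the directory in bytes
 * max_dir_size_kb: value of the max_dir_size_kb mount option
 *                  (0 is assumed here to mean "no limit")
 *
 * Returns 0 if growth is allowed, or -ENOSPC once the directory has
 * reached the configured cap.
 */
static int check_dir_growth(uint64_t dir_size_bytes, uint64_t max_dir_size_kb)
{
    if (max_dir_size_kb != 0 && (dir_size_bytes >> 10) >= max_dir_size_kb)
        return -ENOSPC;
    return 0;
}
```

    With max_dir_size_kb=1024, a directory of 1023 KiB would still be
    allowed to grow in this sketch, while one that has reached 1024 KiB
    would get ENOSPC.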

    Google-Bug-Id: 6863013

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

29 Mar, 2012

1 commit

  • Pull ext4 updates for 3.4 from Ted Ts'o:
    "Ext4 commits for 3.3 merge window; mostly cleanups and bug fixes

    The changes to export dirty_writeback_interval are from Artem's s_dirt
    cleanup patch series. The same is true of the change to remove the
    s_dirt helper functions which never got used by anyone in-tree. I've
    run these changes by Al Viro, and am carrying them so that Artem can
    more easily fix up the rest of the file systems during the next merge
    window. (Originally we had hoped to remove the use of s_dirt from
    ext4 during this merge window, but his patches had some bugs, so I
    ultimately ended up dropping them from the ext4 tree.)"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (66 commits)
    vfs: remove unused superblock helpers
    mm: export dirty_writeback_interval
    ext4: remove useless s_dirt assignment
    ext4: write superblock only once on unmount
    ext4: do not mark superblock as dirty unnecessarily
    ext4: correct ext4_punch_hole return codes
    ext4: remove restrictive checks for EOFBLOCKS_FL
    ext4: always set then trimmed blocks count into len
    ext4: fix trimmed block count accunting
    ext4: fix start and len arguments handling in ext4_trim_fs()
    ext4: update s_free_{inodes,blocks}_count during online resize
    ext4: change some printk() calls to use ext4_msg() instead
    ext4: avoid output message interleaving in ext4_error_()
    ext4: remove trailing newlines from ext4_msg() and ext4_error() messages
    ext4: add no_printk argument validation, fix fallout
    ext4: remove redundant "EXT4-fs: " from uses of ext4_msg
    ext4: give more helpful error message in ext4_ext_rm_leaf()
    ext4: remove unused code from ext4_ext_map_blocks()
    ext4: rewrite punch hole to use ext4_ext_remove_space()
    jbd2: cleanup journal tail after transaction commit
    ...

    Linus Torvalds
     

07 Mar, 2012

1 commit


21 Feb, 2012

2 commits


05 Jan, 2012

1 commit

  • This patch adds a new online resize interface, whose input argument is a
    64-bit integer indicating how many blocks there are in the resized fs.

    In the new resize implementation, all work such as allocating group
    tables is done on the kernel side, so the new resize interface can
    support the flex_bg feature and prepares the ground for supporting
    resize with features like bigalloc and exclude bitmap. Besides this,
    user-space tools just pass in the new number of blocks.

    We delay initializing the bitmaps and inode tables of added groups if
    possible and add multiple groups (a flex group) each time, so the new
    resize is very fast, like mkfs.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     

09 Oct, 2011

2 commits

  • For a long time now, orlov has been the default block allocator in
    ext4. It performs better than the old one and no one seems to claim
    otherwise, so we can safely drop the old allocator and deprecate the
    oldalloc and orlov mount options.

    This is part of the effort to reduce the number of ext4 options, and
    hence the test matrix.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     
  • The acl and user_xattr mount options are no longer needed since those
    features are enabled by default if configured in (see commit
    ea6633369458992241599c9d9ebadffaeddec164). We cannot easily deprecate
    the mount options themselves (since it is probably too early), but we
    can remove them from the documentation first.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

04 Sep, 2011

1 commit

  • If the user explicitly specifies conflicting mount options for
    delalloc or dioread_nolock and data=journal, fail the mount, instead
    of printing a warning and continuing (since many users won't look at
    dmesg and notice the warning).

    Also, print a single warning that data=journal implies that delayed
    allocation is not on by default (since it's not supported), and
    furthermore that O_DIRECT is not supported. Improve the text in
    Documentation/filesystems/ext4.txt so this is clear there as well.

    Similarly, if the dioread_nolock mount option is specified when the
    file system block size != PAGE_SIZE, fail the mount instead of
    printing a warning message and ignoring the mount option.
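
    The mount-time checks described above could be modeled roughly as
    follows. This is a sketch under stated assumptions, not the code
    from this commit: struct mount_opts, its field names, and
    validate_mount_opts() are all hypothetical, and only the three
    failure cases named in this message are covered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the options relevant to these checks. */
struct mount_opts {
    bool data_journal;      /* data=journal was requested */
    bool explicit_delalloc; /* the user explicitly asked for delalloc */
    bool dioread_nolock;    /* dioread_nolock was requested */
    unsigned block_size;    /* file system block size in bytes */
    unsigned page_size;     /* PAGE_SIZE of the running system */
};

/*
 * Returns NULL if the combination is acceptable, or a short reason
 * string if the mount should fail instead of merely warning.
 */
static const char *validate_mount_opts(const struct mount_opts *o)
{
    if (o->data_journal && o->explicit_delalloc)
        return "can't mount with both data=journal and delalloc";
    if (o->data_journal && o->dioread_nolock)
        return "can't mount with both data=journal and dioread_nolock";
    if (o->dioread_nolock && o->block_size != o->page_size)
        return "can't mount with dioread_nolock if block size != PAGE_SIZE";
    return NULL;
}
```

    Returning a reason string rather than a bare error code is only a
    choice made for this sketch; it keeps the failing rule visible to
    the caller.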

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

25 Jun, 2011

1 commit


02 May, 2011

1 commit


31 Mar, 2011

1 commit


22 Feb, 2011

1 commit

  • Add documentation for mount options and ioctls to
    Documentation/filesystems/ext4.txt, which has not been updated for some
    time. Also add documentation for ext4 sysfs tunables to the
    Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
    typographical errors in that file.

    https://bugzilla.kernel.org/show_bug.cgi?id=9423

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

28 Oct, 2010

1 commit

  • When the lazy_itable_init extended option is passed to mke2fs, it
    considerably speeds up filesystem creation because inode tables are
    not zeroed out. The fact that parts of the inode table are
    uninitialized is not a problem so long as the block group descriptors,
    which contain information regarding how much of the inode table has
    been initialized, have not been corrupted. However, if the block group
    checksums are not valid, e2fsck must scan the entire inode table, and
    the old, uninitialized data could potentially cause e2fsck to
    report false problems.

    Hence, it is important for the inode tables to be initialized as soon
    as possible. This commit adds this feature so that mke2fs can safely
    use the lazy inode table initialization feature to speed up formatting
    file systems.

    This is done via a new kernel thread called ext4lazyinit, which is
    created on demand and destroyed when it is no longer needed. There
    is only one thread for all ext4 filesystems in the system. When the
    first filesystem with the init_itable mount option is mounted, the
    ext4lazyinit thread is created; the filesystem can then register its
    request in the request list.

    This thread then walks through the list of requests, picking up
    scheduled requests and invoking ext4_init_inode_table(). The next
    schedule time for a request is computed by multiplying the time it took
    to zero out the last inode table by a wait multiplier, which can be set
    with the (init_itable=n) mount option (default is 10). We do this so
    that we do not take the whole I/O bandwidth. When the thread is no
    longer necessary (the request list is empty), it frees the appropriate
    structures and exits (and can be created later by another
    filesystem).
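
    The throttling rule above amounts to a single multiplication. The
    sketch below is illustrative only: the function name
    next_init_delay() and the use of milliseconds are assumptions, and
    the kernel's own bookkeeping is not reproduced here.

```c
#include <stdint.h>

/* Default wait multiplier, as stated in the commit message. */
#define DEFAULT_INIT_ITABLE_MULT 10u

/*
 * Compute how long to wait before zeroing the next inode table.
 *
 * last_zeroout_ms: time it took to zero out the last inode table
 * wait_mult:       value of init_itable=n (10 by default)
 *
 * With the default multiplier, zeroing takes up roughly 1/11 of the
 * elapsed time, which leaves most of the I/O bandwidth to other users.
 */
static uint64_t next_init_delay(uint64_t last_zeroout_ms, unsigned wait_mult)
{
    return last_zeroout_ms * wait_mult;
}
```

    For instance, if the last table took 30 ms to zero and the default
    multiplier is in effect, the next request would be scheduled 300 ms
    later in this model.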

    We do not disturb regular inode allocations in any way; they simply do
    not care whether the inode table is zeroed or not. But when zeroing, we
    have to skip used inodes, obviously. We should also prevent new inode
    allocations from the group while zeroing is under way. For that we
    take the alloc_sem lock for writing in ext4_init_inode_table() and for
    reading in ext4_claim_inode(), so when we are unlucky and the allocator
    hits the group which is currently being zeroed, it just has to wait.

    This can be suppressed using the mount option no_init_itable.

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"

    Lukas Czerner
     

25 Dec, 2009

1 commit


20 Nov, 2009

2 commits

  • Users on the linux-ext4 list recently complained about differences
    across filesystems w.r.t. how to mount without a journal replay.

    In the discussion it was noted that xfs's "norecovery" option is
    perhaps more descriptively accurate than "noload," so let's make
    that an alias for ext4.

    Also show this status in /proc/mounts

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • It is anticipated that when sb_issue_discard starts doing
    real work on trim-capable devices, we may see issues. Make
    this mount-time optional, and default it to off until we know
    that things are working out OK.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     

03 Nov, 2009

1 commit

  • This reverts commit d0646f7b636d067d715fab52a2ba9c6f0f46b0d7, as
    requested by Eric Sandeen.

    It can basically cause an ext4 filesystem to miss recovery (and thus get
    mounted with errors) if the journal checksum does not match.

    Quoth Eric:

    "My hand-wavy hunch about what is happening is that we're finding a
    bad checksum on the last partially-written transaction, which is
    not surprising, but if we have a wrapped log and we're doing the
    initial scan for head/tail, and we abort scanning on that bad
    checksum, then we are essentially running an unrecovered filesystem.

    But that's hand-wavy and I need to go look at the code.

    We lived without journal checksums on by default until now, and at
    this point they're doing more harm than good, so we should revert
    the default-changing commit until we can fix it and do some good
    power-fail testing with the fixes in place."

    See

    http://bugzilla.kernel.org/show_bug.cgi?id=14354

    for all the gory details.

    Requested-by: Eric Sandeen
    Cc: Theodore Tso
    Cc: Alexey Fisher
    Cc: Maxim Levitsky
    Cc: Aneesh Kumar K.V
    Cc: Mathias Burén
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 Sep, 2009

1 commit


19 Sep, 2009

1 commit


06 Sep, 2009

1 commit


13 Jun, 2009

2 commits


28 Mar, 2009

1 commit

  • Add support for using the mount options "barrier" and "nobarrier", and
    "auto_da_alloc" and "noauto_da_alloc", which is more consistent than
    "barrier=" or "auto_da_alloc=". Most other ext3/ext4 mount
    options use the foo/nofoo naming convention. We allow the old forms
    of these mount options for backwards compatibility.
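
    The naming convention above, with the old "barrier=" form kept for
    compatibility, can be illustrated with a small parser. This is a
    sketch, not the ext4 option table: parse_barrier_opt() is a
    hypothetical helper, and treating any non-zero number after
    "barrier=" as "on" is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one mount option token.
 *
 * Accepts the foo/nofoo forms "barrier" and "nobarrier" as well as the
 * older "barrier=<n>" form. On a match, stores the resulting setting in
 * *barrier and returns true; returns false for any other token.
 */
static bool parse_barrier_opt(const char *opt, bool *barrier)
{
    static const char old_form[] = "barrier=";

    if (strcmp(opt, "barrier") == 0) {
        *barrier = true;
        return true;
    }
    if (strcmp(opt, "nobarrier") == 0) {
        *barrier = false;
        return true;
    }
    if (strncmp(opt, old_form, strlen(old_form)) == 0) {
        /* Old form: any non-zero value is assumed to mean "on". */
        *barrier = strtol(opt + strlen(old_form), NULL, 10) != 0;
        return true;
    }
    return false;
}
```

    The same pattern would apply to "auto_da_alloc" and
    "noauto_da_alloc" alongside the older "auto_da_alloc=" form.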

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

23 Feb, 2009

1 commit


07 Jan, 2009

3 commits

  • This mount option is largely superfluous, and in fact the way it was
    implemented was buggy; if a filesystem which did not have the extents
    feature flag was mounted -o extents, the filesystem would attempt to
    create and use extents-based files even though the extents feature flag
    was not enabled. The simplest thing to do is to nuke the mount option
    entirely. It's not all that useful to force the non-creation of new
    extent-based files if the filesystem can support it.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • This code has been obsolete for quite some time, since the supported
    method for adding a journal inode is to use tune2fs (or to create a
    new filesystem with a journal via mke2fs or mkfs.ext4).

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     
  • Fix paragraph with recommendations on how to tune ext4 for benchmarks.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

06 Jan, 2009

1 commit


04 Jan, 2009

1 commit

  • Add new mount options, min_batch_time and max_batch_time, which
    control how long the jbd2 layer should wait for additional filesystem
    operations to get batched with a synchronous write transaction.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o