04 Jun, 2020

2 commits

  • Remove ext4_journal_free_reserved() function. It is never used.

    Signed-off-by: Jan Kara
    Reviewed-by: Andreas Dilger
    Link: https://lore.kernel.org/r/20200520133119.1383-2-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • ext4_mark_inode_dirty() can fail for real reasons. Ignoring its return
    value may lead ext4 to ignore real failures that would result in
    corruption / crashes. Harden ext4_mark_inode_dirty error paths to fail
    as soon as possible and return errors to the caller whenever
    appropriate.

    One possible scenario in which this bug could have an effect is that,
    while creating a new inode, its directory entry gets added
    successfully, but while writing the inode itself mark_inode_dirty
    returns an error which is ignored. This would result in an
    inconsistency where the directory entry points to a non-existent inode.

    Ran gce-xfstests smoke tests and verified that there were no
    regressions.
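
    The failure scenario above can be illustrated with a small userspace
    model. This is only a sketch: dirty_dir, dirty_inode,
    dirty_mark_inode() and dirty_create() are hypothetical stand-ins, not
    the ext4 functions touched by this patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are ext4 structures. */
struct dirty_dir { bool has_entry; };
struct dirty_inode { bool on_disk; int inject_error; };

/* Models a "mark inode dirty" step that can fail for real reasons. */
static int dirty_mark_inode(struct dirty_inode *inode)
{
	if (inode->inject_error)
		return inode->inject_error;
	inode->on_disk = true;
	return 0;
}

/* Adds the directory entry, then writes the inode, propagating errors. */
static int dirty_create(struct dirty_dir *dir, struct dirty_inode *inode)
{
	int err;

	dir->has_entry = true;
	err = dirty_mark_inode(inode);
	if (err) {
		/* Undo the entry so it never points at a missing inode. */
		dir->has_entry = false;
		return err;
	}
	return 0;
}
```

    Ignoring the return value of dirty_mark_inode() in this model would
    leave has_entry set while on_disk stays false, which is the
    inconsistency the commit message describes.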

    Signed-off-by: Harshad Shirwadkar
    Link: https://lore.kernel.org/r/20200427013438.219117-1-harshadshirwadkar@gmail.com
    Signed-off-by: Theodore Ts'o

    Harshad Shirwadkar
     

26 Mar, 2020

1 commit

  • The patch "ext4: make dioread_nolock the default" (244adf6426ee) causes
    generic/422 to fail when run in kvm-xfstests' ext3conv test case. This
    applies both the dioread_nolock and nodelalloc mount options, a
    combination not previously tested by kvm-xfstests. The failure occurs
    because the dioread_nolock code path splits a previously fallocated
    multiblock extent into a series of single block extents when overwriting
    a portion of that extent. That causes allocation of an extent tree leaf
    node and a reshuffling of extents. Once writeback is completed, the
    individual extents are recombined into a single extent, the extent is
    moved again, and the leaf node is deleted. The difference in block
    utilization before and after writeback due to the leaf node triggers the
    failure.

    The original reason for this behavior was to avoid ENOSPC when handling
    I/O completions during writeback in the dioread_nolock code paths when
    delayed allocation is disabled. It may no longer be necessary, because
    code was added in the past to reserve extra space to solve this problem
    when delayed allocation is enabled, and this code may also apply when
    delayed allocation is disabled. Until this can be verified, don't use
    the dioread_nolock code paths if delayed allocation is disabled.
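
    As a rough sketch of the resulting decision, the check can be
    modelled as a predicate over the mount options and the file. The
    structures and the function name below are hypothetical, and the
    other conditions (regular file, extent-mapped, not data-journalled)
    are assumptions about what such a check typically looks at.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant mount options. */
struct dio_mount {
	bool dioread_nolock;	/* dioread_nolock mount option set */
	bool delalloc;		/* delayed allocation enabled */
};

/* Hypothetical model of the relevant per-file state. */
struct dio_file {
	bool is_regular;
	bool uses_extents;
	bool journals_data;
};

/* Returns true when the dioread_nolock code paths may be used. */
static bool dio_should_use_nolock(const struct dio_mount *m,
				  const struct dio_file *f)
{
	if (!m->dioread_nolock)
		return false;
	if (!f->is_regular || !f->uses_extents || f->journals_data)
		return false;
	/* The restriction described above: require delayed allocation. */
	if (!m->delalloc)
		return false;
	return true;
}
```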

    Signed-off-by: Eric Whitney
    Link: https://lore.kernel.org/r/20200319150028.24592-1-enwlinux@gmail.com
    Signed-off-by: Theodore Ts'o

    Eric Whitney
     

18 Jan, 2020

1 commit

  • Determining an inode's journaling mode has gotten more complicated over
    time. Move ext4_inode_journal_mode() from an inline function into
    ext4_jbd2.c to reduce the compiled code size.

    Signed-off-by: Eric Biggers
    Link: https://lore.kernel.org/r/20191209233602.117778-1-ebiggers@kernel.org
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Eric Biggers
     

06 Nov, 2019

5 commits

  • So far we have reserved only a relatively high fixed amount of revoke
    credits for each transaction. We over-reserved by a large amount in most
    cases, but when freeing large directories or files with data journalling,
    the fixed amount is not enough. In fact the worst case estimate is
    inconveniently large (maximum extent size) for freeing of one extent.

    We fix this by doing a proper estimate of the number of blocks that need
    to be revoked when removing blocks from the inode due to truncate or
    hole punching, and otherwise reserve just a small amount of revoke
    credits for each transaction to accommodate freeing of an xattr block or
    so.

    Signed-off-by: Jan Kara
    Link: https://lore.kernel.org/r/20191105164437.32602-23-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Extend functions for starting, extending, and restarting transaction
    handles to take the number of revoke records the handle must be able to
    accommodate. These functions then make sure the transaction has enough
    credits to be able to store the resulting revoke descriptor blocks. Also,
    the revoke code tracks the number of revoke records created by a handle
    to catch situations where some place didn't reserve enough space for
    revoke records. Similarly to standard transaction credits, space for
    unused reserved revoke records is released when the handle is stopped.

    On the ext4 side we currently take a simplistic approach of reserving
    space for 1024 revoke records for any transaction. This grows the number
    of credits reserved for each handle only by a few and is enough for any
    normal workload so that we don't hit warnings in jbd2. We will refine
    the logic in following commits.
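
    To see why 1024 revoke records cost only a few credits, the number of
    descriptor blocks can be estimated with a ceiling division. The
    function below is an illustrative sketch; the example sizes (4096-byte
    blocks, a 16-byte descriptor header, 8-byte block numbers) are
    assumptions and are not taken from this patch.

```c
#include <stddef.h>

/*
 * Number of journal blocks needed to hold `records` revoke records.
 * All sizes are parameters because they depend on the journal format.
 */
static size_t revoke_descriptor_blocks(size_t records, size_t block_size,
				       size_t header_size, size_t record_size)
{
	size_t per_block = (block_size - header_size) / record_size;

	/* Round up: a partially filled descriptor block still costs a block. */
	return (records + per_block - 1) / per_block;
}
```

    Under those assumed sizes, each block holds 510 records, so
    revoke_descriptor_blocks(1024, 4096, 16, 8) evaluates to 3.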

    Signed-off-by: Jan Kara
    Link: https://lore.kernel.org/r/20191105164437.32602-20-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Provide an accessor function to get the number of credits available in
    a handle and use it from ext4. Later, computation of available credits
    won't be so straightforward.

    Reviewed-by: Theodore Ts'o
    Signed-off-by: Jan Kara
    Link: https://lore.kernel.org/r/20191105164437.32602-11-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Provide ext4_journal_ensure_credits_fn() function to ensure that a
    transaction has a given number of credits and to call a helper function
    to prepare for restarting a transaction. This allows us to remove some
    boilerplate code from various places, add proper error handling for the
    case where transaction extension or restart fails, and reduce the
    following changes needed for proper revoke record reservation tracking.
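
    The extend-or-restart pattern described here can be sketched in
    userspace as follows. Everything in this block (cred_handle,
    cred_extend(), cred_restart(), cred_ensure() and the two example
    callbacks) is a hypothetical model; the return convention of 0 for
    "no restart", 1 for "restarted" and a negative errno for failure is
    an assumption made for the sketch.

```c
#include <errno.h>

/* Hypothetical handle model; not the jbd2 handle_t. */
struct cred_handle {
	int credits;	/* credits currently available to the handle */
	int txn_free;	/* credits the running transaction can still give */
	int restarts;	/* how many times the handle was restarted */
};

/* Try to extend in place: 0 on success, 1 if the transaction is too full. */
static int cred_extend(struct cred_handle *h, int extend_cred)
{
	if (extend_cred > h->txn_free)
		return 1;
	h->txn_free -= extend_cred;
	h->credits += extend_cred;
	return 0;
}

/* Model a restart: the handle gets extend_cred credits afresh. */
static int cred_restart(struct cred_handle *h, int extend_cred)
{
	h->credits = extend_cred;
	h->restarts++;
	return 0;
}

/* Example "prepare for restart" callbacks. */
static int cred_prepare_ok(struct cred_handle *h) { (void)h; return 0; }
static int cred_prepare_fail(struct cred_handle *h) { (void)h; return -EIO; }

/*
 * Returns 0 if credits were already sufficient or the handle was extended,
 * 1 if the handle had to be restarted, or a negative errno on failure.
 * `prepare` runs before a restart so the caller can leave its metadata in
 * a consistent state.
 */
static int cred_ensure(struct cred_handle *h, int check_cred, int extend_cred,
		       int (*prepare)(struct cred_handle *))
{
	int err;

	if (h->credits >= check_cred)
		return 0;
	err = cred_extend(h, extend_cred);
	if (err <= 0)
		return err;
	err = prepare(h);
	if (err)
		return err;
	err = cred_restart(h, extend_cred);
	return err ? err : 1;
}
```

    Callers of such a helper only have to distinguish three outcomes,
    which is where the saving in boilerplate comes from.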

    Signed-off-by: Jan Kara
    Link: https://lore.kernel.org/r/20191105164437.32602-10-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Similarly to directories, EA inodes do only journalled modifications to
    their data. Change ext4_should_journal_data() to return true for them so
    that we don't have to special-case them during truncate.

    Signed-off-by: Jan Kara
    Link: https://lore.kernel.org/r/20191105164437.32602-7-jack@suse.cz
    Signed-off-by: Theodore Ts'o

    Jan Kara
     

21 Jun, 2019

1 commit

  • Use the newly introduced jbd2_inode dirty range scoping to prevent us
    from waiting forever when trying to complete a journal transaction.

    Signed-off-by: Ross Zwisler
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara
    Cc: stable@vger.kernel.org

    Ross Zwisler
     

25 Mar, 2019

1 commit

  • Pull ext4 fixes from Ted Ts'o:
    "Miscellaneous ext4 bug fixes for 5.1"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: prohibit fstrim in norecovery mode
    ext4: cleanup bh release code in ext4_ind_remove_space()
    ext4: brelse all indirect buffer in ext4_ind_remove_space()
    ext4: report real fs size after failed resize
    ext4: add missing brelse() in add_new_gdb_meta_bg()
    ext4: remove useless ext4_pin_inode()
    ext4: avoid panic during forced reboot
    ext4: fix data corruption caused by unaligned direct AIO
    ext4: fix NULL pointer dereference while journal is aborted

    Linus Torvalds
     

15 Mar, 2019

1 commit

  • We see the following NULL pointer dereference while running xfstests
    generic/475:
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    PGD 8000000c84bad067 P4D 8000000c84bad067 PUD c84e62067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 7 PID: 9886 Comm: fsstress Kdump: loaded Not tainted 5.0.0-rc8 #10
    RIP: 0010:ext4_do_update_inode+0x4ec/0x760
    ...
    Call Trace:
    ? jbd2_journal_get_write_access+0x42/0x50
    ? __ext4_journal_get_write_access+0x2c/0x70
    ? ext4_truncate+0x186/0x3f0
    ext4_mark_iloc_dirty+0x61/0x80
    ext4_mark_inode_dirty+0x62/0x1b0
    ext4_truncate+0x186/0x3f0
    ? unmap_mapping_pages+0x56/0x100
    ext4_setattr+0x817/0x8b0
    notify_change+0x1df/0x430
    do_truncate+0x5e/0x90
    ? generic_permission+0x12b/0x1a0

    This is triggered because the NULL pointer handle->h_transaction was
    dereferenced in function ext4_update_inode_fsync_trans().
    I found that h_transaction was set to NULL in jbd2__journal_restart()
    but the handle failed to attach to a new transaction because the journal
    was aborted.

    Fix this by checking the handle before updating the inode.
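
    A minimal userspace sketch of that kind of guard is shown below. The
    structures are simplified, hypothetical versions of the real ones,
    and fsync_handle_usable() and fsync_update_trans() are illustrative
    names rather than the helpers changed by this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified versions of the structures involved. */
struct fsync_txn { unsigned int t_tid; };

struct fsync_handle {
	struct fsync_txn *h_transaction;	/* NULL after a failed restart */
	bool h_aborted;
};

struct fsync_inode_info { unsigned int i_sync_tid; };

/* A handle is only safe to use if it is attached and not aborted. */
static bool fsync_handle_usable(const struct fsync_handle *h)
{
	return h != NULL && h->h_transaction != NULL && !h->h_aborted;
}

/* Record the transaction an fsync must wait for, if there is one. */
static void fsync_update_trans(const struct fsync_handle *h,
			       struct fsync_inode_info *ei)
{
	if (!fsync_handle_usable(h))
		return;	/* nothing to dereference; leave ei untouched */
	ei->i_sync_tid = h->h_transaction->t_tid;
}
```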

    Fixes: b436b9bef84d ("ext4: Wait for proper transaction commit on fsync")
    Signed-off-by: Jiufei Xue
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Joseph Qi
    Cc: stable@kernel.org

    Jiufei Xue
     

24 Jan, 2019

1 commit


18 Dec, 2017

1 commit

  • A number of ext4 source files were skipped because their copyright
    permission statements didn't match the expected text used by the
    automated conversion utilities. I've added SPDX tags for the rest.

    While looking at some of these files, I've noticed that we have quite
    a bit of variation on the licenses that were used --- in particular
    some of the Red Hat licenses on the jbd2 files use a GPL2+ license,
    and we have some files that have a LGPL-2.1 license (which was quite
    surprising).

    I've not attempted to do any license changes. Even if it is perfectly
    legal to relicense to GPL 2.0-only for consistency's sake, that should
    be done with ext4 developer community discussion.

    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     

06 Aug, 2017

1 commit

  • After upgrading from the old format, the first attempt to set a
    project ID on an old file returns EOVERFLOW. However, once that
    file has been dirtied (by touch, etc.), changing the project ID
    is allowed. This might be confusing for users, so try to expand
    @i_extra_isize here too.

    Reported-by: Zhang Yi
    Signed-off-by: Miao Xie
    Signed-off-by: Wang Shilong
    Signed-off-by: Theodore Ts'o

    Miao Xie
     

22 Jun, 2017

2 commits

  • Both ext4_set_acl() and ext4_set_context() need to be made aware of
    ea_inode feature when it comes to credits calculation.

    Also add a sufficient credits check in ext4_xattr_set_handle() right
    after xattr write lock is grabbed. Original credits calculation is done
    outside the lock so there is a possibility that the initially calculated
    credits are not sufficient anymore.

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Theodore Ts'o

    Tahsin Erdogan
     
  • This INCOMPAT_LARGEDIR feature allows larger directories to be created
    in ldiskfs, both with directory sizes over 2GB and a maximum htree
    depth of 3 instead of the current limit of 2. These features are needed
    in order to exceed the current limit of approximately 10M entries in a
    single directory.

    This patch was originally written by Yang Sheng to support the Lustre server.

    [ Bumped the credits needed to update an indexed directory -- tytso ]

    Signed-off-by: Liang Zhen
    Signed-off-by: Yang Sheng
    Signed-off-by: Artem Blagodarenko
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger

    Artem Blagodarenko
     

11 Dec, 2016

1 commit

  • Currently data journalling is incompatible with encryption: enabling both
    at the same time has never been supported by design, and would result in
    unpredictable behavior. However, users are not precluded from turning on
    both features simultaneously. This change programmatically replaces data
    journaling for encrypted regular files with ordered data journaling mode.

    Background:
    Journaling encrypted data has not been supported because it operates on
    buffer heads of the page in the page cache. Namely, when the commit
    happens, which could be up to five seconds after caching, the commit
    thread uses the buffer heads attached to the page to copy the contents of
    the page to the journal. With encryption, it would have been required to
    keep the bounce buffer with ciphertext for up to the aforementioned five
    seconds, since the page cache can only hold plaintext and could not be
    used for journaling. Alternatively, it would be required to set up the
    journal to initiate a callback at the commit time to perform deferred
    encryption - in this case, not only would the data have to be written
    twice, but it would also have to be encrypted twice. This level of
    complexity was not justified for a mode that in practice is very rarely
    used because of the overhead from the data journalling.

    Solution:
    If data=journal has been set as a mount option for a filesystem, or if
    journaling is enabled on a regular file, do not perform journaling if the
    file is also encrypted, instead fall back to the data=ordered mode for the
    file.

    Rationale:
    The intent is to allow seamless and proper filesystem operation when
    journaling and encryption have both been enabled, and have these two
    conflicting features gracefully resolved by the filesystem.

    Fixes: 4461471107b7
    Signed-off-by: Sergey Karamov
    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org

    Sergey Karamov
     

27 Jun, 2016

1 commit


24 Apr, 2016

2 commits

  • Currently we ask jbd2 to write all dirty allocated buffers before
    committing a transaction when doing writeback of delay allocated blocks.
    However this is unnecessary since we move all pages to writeback state
    before dropping a transaction handle and then submit all the necessary
    IO. We still need the transaction commit to wait for all the outstanding
    writeback before flushing disk caches during transaction commit to avoid
    data exposure issues though. Use the new jbd2 capability and ask it to
    only wait for outstanding writeback during transaction commit when
    writing back data in ext4_writepages().

    Tested-by: "HUANG Weller (CM/ESW12-CN)"
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Currently when filesystem needs to make sure data is on permanent
    storage before committing a transaction it adds inode to transaction's
    inode list. During transaction commit, jbd2 writes back all dirty
    buffers that have allocated underlying blocks and waits for the IO to
    finish. However when doing writeback for delayed allocated data, we
    allocate blocks and immediately submit the data. Thus asking jbd2 to
    write dirty pages just unnecessarily adds more work to jbd2 possibly
    writing back other redirtied blocks.

    Add support to jbd2 to allow filesystem to ask jbd2 to only wait for
    outstanding data writes before committing a transaction and thus avoid
    unnecessary writes.
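
    The difference between "write and wait" and "wait only" can be
    sketched with a pair of per-inode flags. The flag names, the
    commit_inode structure and commit_data() below are hypothetical and
    only model the behaviour described above; they are not the jbd2
    interface.

```c
/* Hypothetical per-inode flags describing what commit must do. */
enum {
	COMMIT_WRITE_DATA = 1 << 0,	/* commit must submit dirty data */
	COMMIT_WAIT_DATA  = 1 << 1,	/* commit must wait for data IO */
};

struct commit_inode {
	unsigned int flags;
	int dirty_pages;	/* pages not yet submitted */
	int inflight_pages;	/* pages under writeback */
};

/* Returns the number of pages the commit itself had to submit. */
static int commit_data(struct commit_inode *ci)
{
	int submitted = 0;

	if (ci->flags & COMMIT_WRITE_DATA) {
		submitted = ci->dirty_pages;
		ci->inflight_pages += ci->dirty_pages;
		ci->dirty_pages = 0;
	}
	if (ci->flags & COMMIT_WAIT_DATA)
		ci->inflight_pages = 0;	/* models waiting for IO to finish */
	return submitted;
}
```

    With only the wait flag set, redirtied pages stay dirty and the
    commit does no extra writes, but it still waits for the IO that was
    already submitted.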

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     

18 Oct, 2015

1 commit


11 Sep, 2014

1 commit

  • MAXQUOTAS value defines maximum number of quota types VFS supports.
    This isn't necessarily the number of types ext4 supports. Although
    ext4 will support project quotas, use ext4 private definition for
    consistency with other filesystems.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     

12 May, 2014

1 commit


29 Aug, 2013

1 commit


05 Jun, 2013

2 commits


10 Apr, 2013

1 commit


04 Apr, 2013

1 commit

  • It is incorrect to use list_for_each_entry_safe() for journal callback
    traversal because ->next may be removed by another task:
    ->ext4_mb_free_metadata()
    ->ext4_mb_free_metadata()
    ->ext4_journal_callback_del()

    This results in the following issue:

    WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
    Hardware name:
    list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
    Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
    Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107
    Call Trace:
    [] warn_slowpath_common+0xad/0xf0
    [] warn_slowpath_fmt+0x46/0x50
    [] ? ext4_journal_commit_callback+0x99/0xc0
    [] __list_del_entry+0x1c0/0x250
    [] ext4_journal_commit_callback+0x6f/0xc0
    [] jbd2_journal_commit_transaction+0x23a6/0x2570
    [] ? try_to_del_timer_sync+0x82/0xa0
    [] ? del_timer_sync+0x91/0x1e0
    [] kjournald2+0x19f/0x6a0
    [] ? wake_up_bit+0x40/0x40
    [] ? bit_spin_lock+0x80/0x80
    [] kthread+0x10e/0x120
    [] ? __init_kthread_worker+0x70/0x70
    [] ret_from_fork+0x7c/0xb0
    [] ? __init_kthread_worker+0x70/0x70

    This patch fixes the issue as follows:
    - ext4_journal_commit_callback() makes list traversal truly safe
    simply by always starting from list_head
    - fix race between two ext4_journal_callback_del() and
    ext4_journal_callback_try_del()
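
    The "always start from the list head" idea can be sketched with a
    minimal circular list. This is a single-threaded userspace model
    with locking omitted; cb_node and the cb_* helpers are hypothetical,
    and the `victim` field merely simulates an entry being removed while
    the list is being walked.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on a kernel-style list. */
struct cb_node {
	struct cb_node *prev, *next;
	int id;
	struct cb_node *victim;	/* entry removed while this one runs, or NULL */
};

static void cb_list_init(struct cb_node *head)
{
	head->prev = head->next = head;
}

static void cb_list_add_tail(struct cb_node *n, struct cb_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void cb_list_del(struct cb_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;	/* self-linked, so a second delete is harmless */
}

/*
 * Run every callback, always taking the first entry from the head. No
 * pointer to a "next" entry is cached across a callback, so entries
 * removed in the meantime are simply never visited.
 * Records the ids of the entries run in `order` and returns their count.
 */
static int cb_run_all(struct cb_node *head, int *order, int max)
{
	int n = 0;

	while (head->next != head && n < max) {
		struct cb_node *cur = head->next;

		cb_list_del(cur);
		order[n++] = cur->id;
		if (cur->victim)
			cb_list_del(cur->victim);
	}
	return n;
}
```

    A loop that cached the next entry before running each callback
    would, in the same situation, follow a pointer to an entry that has
    already been unlinked.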

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara
    Cc: stable@vger.kernel.com

    Dmitry Monakhov
     

10 Feb, 2013

2 commits

  • Operations which modify extended attributes may need extra journal
    credits if inline data is used, since there is a chance that some
    extended attributes may need to get pushed to an external attribute
    block.

    Changes to reflect this were made in xattr.c, but they were missed in
    fs/ext4/acl.c. To fix this, abstract the calculation of the number of
    credits needed for xattr operations to an inline function defined in
    ext4_jbd2.h, and use it in acl.c and xattr.c.
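
    The shape of such a shared helper might look like the sketch below.
    The function name, its parameters and the exact amount added for
    inline data are assumptions made for illustration; the patch itself
    defines the real calculation.

```c
#include <stdbool.h>

/*
 * Hypothetical inputs: base_credits is what an xattr update needs without
 * inline data, writepage_credits is the cost of writing one page of data.
 */
static int xattr_credits(int base_credits, bool has_inline_data,
			 int writepage_credits)
{
	int credits = base_credits;

	/* Attributes may get pushed to an external attribute block. */
	if (has_inline_data)
		credits += writepage_credits + 1;
	return credits;
}
```

    Having one function means acl.c and xattr.c cannot drift apart on
    the inline data case again.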

    Also move the function declarations used in inline.c from xattr.h
    (where they are non-obviously hidden, and caused problems since
    ext4_jbd2.h needs to use the function ext4_has_inline_data), and move
    them to ext4.h.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Tao Ma
    Reviewed-by: Jan Kara

    Theodore Ts'o
     
  • The ext4_unlink() and ext4_rmdir() don't actually release the blocks
    associated with the file/directory. This gets done in a separate jbd2
    handle called via ext4_evict_inode(). Thus, we don't need to reserve
    lots of journal credits for the truncate.

    Note that using too many journal credits is non-optimal because it can
    lead to the journal transaction getting closed too early, before it is
    strictly necessary.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Theodore Ts'o
     

09 Feb, 2013

1 commit

  • So we can better understand what bits of ext4 are responsible for
    long-running jbd2 handles, use jbd2__journal_start() so we can pass
    context information for logging purposes.

    The recommended way for finding the longer-running handles is:

    T=/sys/kernel/debug/tracing
    EVENT=$T/events/jbd2/jbd2_handle_stats
    echo "interval > 5" > $EVENT/filter
    echo 1 > $EVENT/enable

    ./run-my-fs-benchmark

    cat $T/trace > /tmp/problem-handles

    This will list handles that were active for longer than 20ms. Having
    longer-running handles is bad, because a commit started at the wrong
    time could stall for those 20+ milliseconds, which could delay an
    fsync() or an O_SYNC operation. Here is an example line from the
    trace file describing a handle which lived on for 311 jiffies, or over
    1.2 seconds:

    postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
    tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
    dirtied_blocks 0
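
    The interval field is measured in jiffies, so converting it to
    milliseconds depends on the kernel tick rate. The figures above
    (5 jiffies as 20ms, 311 jiffies as over 1.2 seconds) are consistent
    with a tick rate of 250 Hz; that rate is an inference, not something
    this commit states. A small conversion sketch, with a hypothetical
    function name:

```c
/* Convert an interval in jiffies to milliseconds for a given tick rate. */
static unsigned long interval_to_ms(unsigned long jiffies, unsigned long hz)
{
	return jiffies * 1000UL / hz;
}
```

    On a kernel with a different tick rate, the filter threshold would
    have to be scaled accordingly to keep the same 20ms cutoff.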

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

09 Nov, 2012

1 commit

  • ext4_handle_release_buffer() was intended to remove journal
    write access from a buffer, but it doesn't actually do anything
    at all other than add a BUFFER_TRACE point, but it's not reliably
    used for that either. Remove all the associated dead code.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Carlos Maiolino

    Eric Sandeen
     

23 Jul, 2012

2 commits

  • The '__ext4_handle_dirty_metadata()' does not need the 'now' argument
    anymore and we can kill it.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara

    Artem Bityutskiy
     
  • This patch adds support for quotas as a first class feature in ext4;
    which is to say, the quota files are stored in hidden inodes as file
    system metadata, instead of as separate files visible in the file system
    directory hierarchy.

    It is based on the proposal at:
    https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4

    This patch introduces a new feature - EXT4_FEATURE_RO_COMPAT_QUOTA
    which, when turned on, enables quota accounting at mount time
    itself. Also, the quota inodes are stored in two additional superblock
    fields. Some changes introduced by this patch that should be pointed
    out are:

    1) Two new ext4-superblock fields - s_usr_quota_inum and
    s_grp_quota_inum for storing the quota inodes in use.
    2) Default quota inodes are: inode#3 for tracking userquota and inode#4
    for tracking group quota. The superblock fields can be set to use
    other inodes as well.
    3) If the QUOTA feature and corresponding quota inodes are set in
    superblock, the quota usage tracking is turned on at mount time. On
    'quotaon' ioctl, the quota limits enforcement is turned
    on. 'quotaoff' ioctl turns off only the limits enforcement in this
    case.
    4) When QUOTA feature is in use, the quota mount options 'quota',
    'usrquota', 'grpquota' are ignored by the kernel.
    5) mke2fs or tune2fs can be used to set the QUOTA feature and initialize
    quota inodes. The default reserved inodes will not be visible to user
    as regular files.
    6) The quota-tools will need to be modified to support hidden quota
    files on ext4. E2fsprogs will also include support for creating and
    fixing quota files.
    7) Support is only for the new V2 quota file format.

    Tested-by: Jan Kara
    Reviewed-by: Jan Kara
    Reviewed-by: Johann Lombardi
    Signed-off-by: Aditya Kali
    Signed-off-by: "Theodore Ts'o"

    Aditya Kali
     

30 Apr, 2012

1 commit

  • Calculate and verify the superblock checksum. Since the UUID and
    block group number are embedded in each copy of the superblock, we
    need only checksum the entire block. Refactor some of the code to
    eliminate open-coding of the checksum update call.
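
    A userspace sketch of the calculate-and-verify pattern follows. It
    assumes CRC32C as the checksum algorithm and a 1024-byte superblock
    whose last four bytes hold the checksum; neither detail is spelled
    out in this commit message, and the sb_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SB_SIZE		1024		/* modelled superblock size */
#define SB_CSUM_OFFSET	(SB_SIZE - 4)	/* checksum kept in the last 4 bytes */

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t sb_crc32c(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc & 1) ? (crc >> 1) ^ 0x82F63B78u : crc >> 1;
	}
	return crc;
}

/* Checksum everything that precedes the checksum field. */
static uint32_t sb_csum(const unsigned char *sb)
{
	return sb_crc32c(0xFFFFFFFFu, sb, SB_CSUM_OFFSET);
}

/* Store the checksum (host byte order, for brevity). */
static void sb_csum_set(unsigned char *sb)
{
	uint32_t c = sb_csum(sb);

	memcpy(sb + SB_CSUM_OFFSET, &c, sizeof(c));
}

/* Recompute the checksum and compare it with the stored value. */
static bool sb_csum_verify(const unsigned char *sb)
{
	uint32_t stored;

	memcpy(&stored, sb + SB_CSUM_OFFSET, sizeof(stored));
	return stored == sb_csum(sb);
}
```

    Wrapping the update in a single sb_csum_set() style helper is the
    kind of refactoring that removes open-coded checksum updates.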

    Signed-off-by: Darrick J. Wong
    Signed-off-by: "Theodore Ts'o"

    Darrick J. Wong
     

21 Feb, 2012

2 commits

  • The per-commit callback was used by mballoc code to manage free space
    bitmaps after deleted blocks have been released. This patch expands
    it to support multiple different callbacks, to allow other things to
    be done after the commit has been completed.

    Signed-off-by: Bobi Jam
    Signed-off-by: Andreas Dilger
    Signed-off-by: "Theodore Ts'o"

    Bobi Jam
     
  • Ext4 does not support data journalling with delayed allocation enabled.
    We do not even allow mounting the file system with delayed allocation
    and data journalling enabled; however, the flag can be set via
    FS_IOC_SETFLAGS, so we can hit an inode with EXT4_INODE_JOURNAL_DATA set
    even on a file system mounted with delayed allocation (the default), and
    that's where the problem arises. The easiest way to reproduce this
    problem is with the following set of commands:

    mkfs.ext4 /dev/sdd
    mount /dev/sdd /mnt/test1
    dd if=/dev/zero of=/mnt/test1/file bs=1M count=4
    chattr +j /mnt/test1/file
    dd if=/dev/zero of=/mnt/test1/file bs=1M count=4 conv=notrunc
    chattr -j /mnt/test1/file

    Additionally it can be reproduced quite reliably with xfstests 272 and
    269. In fact the above reproducer is a part of test 272.

    To fix this we should ignore the EXT4_INODE_JOURNAL_DATA inode flag if
    the file system is mounted with delayed allocation. This can be easily
    done by fixing ext4_should_*_data() functions to ignore data journal
    flag when delalloc is set (suggested by Ted). We also have to set the
    appropriate address space operations for the inode (again, ignoring data
    journal flag if delalloc enabled).

    Additionally this commit introduces ext4_inode_journal_mode() function
    because ext4_should_*_data() has already had a lot of common code and
    this change is putting it all into one function so it is easier to
    read.
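
    A sketch of what a single mode-selection function along these lines
    could look like is given below. The jmode enum, the jmode_ctx
    structure and jmode_for_inode() are hypothetical, and the handling
    of non-regular files is an assumption; only the rule that the
    per-inode flag is ignored under delalloc comes from the description
    above.

```c
#include <stdbool.h>

enum jmode { JMODE_WRITEBACK, JMODE_ORDERED, JMODE_JOURNAL };

/* Hypothetical flattened view of the mount options and inode state. */
struct jmode_ctx {
	bool has_journal;	/* file system has a journal */
	bool delalloc;		/* mounted with delayed allocation */
	enum jmode mount_mode;	/* data= mount option */
	bool is_regular;	/* regular file */
	bool journal_data_flag;	/* per-inode flag, e.g. set by chattr +j */
};

/* One place that decides the journalling mode for an inode. */
static enum jmode jmode_for_inode(const struct jmode_ctx *c)
{
	if (!c->has_journal)
		return JMODE_WRITEBACK;
	if (!c->is_regular || c->mount_mode == JMODE_JOURNAL)
		return JMODE_JOURNAL;
	/* The fix: the per-inode flag is honoured only without delalloc. */
	if (c->journal_data_flag && !c->delalloc)
		return JMODE_JOURNAL;
	return c->mount_mode;
}
```

    The individual "should journal / order / write back" predicates can
    then each be a one-line comparison against this result.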

    Successfully tested with xfstests in following configurations:

    delalloc + data=ordered
    delalloc + data=writeback
    data=journal
    nodelalloc + data=ordered
    nodelalloc + data=writeback
    nodelalloc + data=journal

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Lukas Czerner
     

13 Aug, 2011

1 commit

  • ext4_should_writeback_data() had an incorrect sequence of
    tests to determine if it should return 0 or 1: in
    particular, even in no-journal mode, 0 was being returned
    for a non-regular-file inode.

    This meant that, in non-journal mode, we would use
    ext4_journalled_aops for directories, symlinks, and other
    non-regular files. However, calling journalled aop
    callbacks when there is no valid handle, can cause problems.

    This would cause a kernel crash with Jan Kara's commit
    2d859db3e4 ("ext4: fix data corruption in inodes with
    journalled data"), because we now dereference 'handle' in
    ext4_journalled_write_end().

    I also added BUG_ONs to check for a valid handle in the
    obviously journal-only aops callbacks.

    I tested this running xfstests with a scratch device in
    these modes:

    - no-journal
    - data=ordered
    - data=writeback
    - data=journal

    All work fine; the data=journal run has many failures and a
    crash in xfstests 074, but this is no different from a
    vanilla kernel.

    Signed-off-by: Curt Wohlgemuth
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Curt Wohlgemuth
     

09 May, 2011

1 commit

  • The block allocation code used to use jbd2_journal_get_undo_access as
    a way to make changes that wouldn't show up until the commit took
    place. The new multi-block allocation code has its own way of
    preventing newly freed blocks from getting reused until the commit
    takes place (it avoids updating the buddy bitmaps until the commit is
    done), so we don't need to use jbd2_journal_get_undo_access(), which
    has extra overhead compared to jbd2_journal_get_write_access().

    There was one last vestigial use of ext4_journal_get_undo_access() in
    ext4_add_groupblocks(); change it to use ext4_journal_get_write_access()
    and then remove the ext4_journal_get_undo_access() support.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o