24 Apr, 2018

2 commits

  • commit fb7c02445c497943e7296cd3deee04422b63acb8 upstream.

    Previously the jbd2 layer assumed that a file system check would be
    required after a journal abort. In the case of the deliberate file
    system shutdown, this should not be necessary. Allow the jbd2 layer
    to distinguish between these two cases by using the ESHUTDOWN errno.

    Also add proper locking to __journal_abort_soft().

    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
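
The distinction described in this commit can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel code: `struct shutdown_journal`, `model_journal_abort()` and `model_needs_fsck()` are hypothetical names, and the "first reason wins" rule is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a journal; not the kernel's journal_t. */
struct shutdown_journal {
    bool aborted;      /* set once the journal stops accepting transactions */
    int  abort_errno;  /* negative errno recorded at abort time, 0 if none */
};

/* Record an abort. In this simplified model the first recorded reason wins. */
void model_journal_abort(struct shutdown_journal *j, int err)
{
    assert(j != NULL);
    if (j->aborted)
        return;
    j->aborted = true;
    j->abort_errno = err;
}

/* A deliberate shutdown (-ESHUTDOWN) should not force a file system check;
 * any other abort reason still does. */
bool model_needs_fsck(const struct shutdown_journal *j)
{
    assert(j != NULL);
    return j->aborted && j->abort_errno != -ESHUTDOWN;
}
```

Aborting with `-ESHUTDOWN` leaves `model_needs_fsck()` false, while aborting with `-EIO` makes it true.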
     
  • commit 85e0c4e89c1b864e763c4e3bb15d0b6d501ad5d9 upstream.

    This updates the jbd2 superblock unnecessarily, and on an abort we
    shouldn't truncate the log.

    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     

20 Jun, 2017

1 commit


11 May, 2017

1 commit

  • Pull RCU updates from Ingo Molnar:
    "The main changes are:

    - Debloat RCU headers

    - Parallelize SRCU callback handling (plus overlapping patches)

    - Improve the performance of Tree SRCU on a CPU-hotplug stress test

    - Documentation updates

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
    rcu: Open-code the rcu_cblist_n_lazy_cbs() function
    rcu: Open-code the rcu_cblist_n_cbs() function
    rcu: Open-code the rcu_cblist_empty() function
    rcu: Separately compile large rcu_segcblist functions
    srcu: Debloat the header
    srcu: Adjust default auto-expediting holdoff
    srcu: Specify auto-expedite holdoff time
    srcu: Expedite first synchronize_srcu() when idle
    srcu: Expedited grace periods with reduced memory contention
    srcu: Make rcutorture writer stalls print SRCU GP state
    srcu: Exact tracking of srcu_data structures containing callbacks
    srcu: Make SRCU be built by default
    srcu: Fix Kconfig botch when SRCU not selected
    rcu: Make non-preemptive schedule be Tasks RCU quiescent state
    srcu: Expedite srcu_schedule_cbs_snp() callback invocation
    srcu: Parallelize callback handling
    kvm: Move srcu_struct fields to end of struct kvm
    rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
    rcu: Use true/false in assignment to bool
    rcu: Use bool value directly
    ...

    Linus Torvalds
     

09 May, 2017

1 commit

  • Pull ext4 updates from Ted Ts'o:

    - add GETFSMAP support

    - some performance improvements for very large file systems and for
    random write workloads into a preallocated file

    - bug fixes and cleanups.

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    jbd2: cleanup write flags handling from jbd2_write_superblock()
    ext4: mark superblock writes synchronous for nobarrier mounts
    ext4: inherit encryption xattr before other xattrs
    ext4: replace BUG_ON with WARN_ONCE in ext4_end_bio()
    ext4: avoid unnecessary transaction stalls during writeback
    ext4: preload block group descriptors
    ext4: make ext4_shutdown() static
    ext4: support GETFSMAP ioctls
    vfs: add common GETFSMAP ioctl definitions
    ext4: evict inline data when writing to memory map
    ext4: remove ext4_xattr_check_entry()
    ext4: rename ext4_xattr_check_names() to ext4_xattr_check_entries()
    ext4: merge ext4_xattr_list() into ext4_listxattr()
    ext4: constify static data that is never modified
    ext4: trim return value and 'dir' argument from ext4_insert_dentry()
    jbd2: fix dbench4 performance regression for 'nobarrier' mounts
    jbd2: Fix lockdep splat with generic/270 test
    mm: retry writepages() on ENOMEM when doing an data integrity writeback

    Linus Torvalds
     

04 May, 2017

2 commits

  • Currently jbd2_write_superblock() silently adds REQ_SYNC to flags with
    which journal superblock is written. Make this explicit by making flags
    passed down to jbd2_write_superblock() contain REQ_SYNC.

    CC: linux-ext4@vger.kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • kjournald2 is central to the transaction commit processing. As such any
    potential allocation from this kernel thread has to be GFP_NOFS. Make
    sure to mark the whole kernel thread GFP_NOFS by the memalloc_nofs_save.

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20170306131408.9828-8-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Suggested-by: Jan Kara
    Reviewed-by: Jan Kara
    Cc: Dave Chinner
    Cc: Theodore Ts'o
    Cc: Chris Mason
    Cc: David Sterba
    Cc: Brian Foster
    Cc: Darrick J. Wong
    Cc: Nikolay Borisov
    Cc: Peter Zijlstra
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

30 Apr, 2017

2 commits

  • Commit b685d3d65ac7 "block: treat REQ_FUA and REQ_PREFLUSH as
    synchronous" removed REQ_SYNC flag from WRITE_FUA implementation. Since
    JBD2 strips REQ_FUA and REQ_FLUSH flags from submitted IO when the
    filesystem is mounted with nobarrier mount option, journal superblock
    writes ended up being async writes after this patch and that caused
    heavy performance regression for dbench4 benchmark with high number of
    processes. In my test setup with HP RAID array with non-volatile write
    cache and 32 GB ram, dbench4 runs with 8 processes regressed by ~25%.

    Fix the problem by making sure journal superblock writes are always
    treated as synchronous since they generally block progress of the
    journalling machinery and thus the whole filesystem.

    Fixes: b685d3d65ac791406e0dfd8779cc9b3707fea5a3
    CC: stable@vger.kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
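
The flag handling behind this regression can be sketched as follows. The flag values and the function name `model_sb_write_flags()` are assumptions made for illustration; the kernel defines its own REQ_* bits.

```c
#include <assert.h>

/* Illustrative flag values only; not the kernel's REQ_* definitions. */
enum {
    MODEL_REQ_SYNC     = 1 << 0,
    MODEL_REQ_PREFLUSH = 1 << 1,
    MODEL_REQ_FUA      = 1 << 2,
    MODEL_REQ_ALL      = (1 << 3) - 1
};

/* Compute the flags for a journal superblock write. With "nobarrier" the
 * flush and FUA bits are stripped, so the sync bit must be set explicitly
 * or the write would be treated as asynchronous. */
unsigned int model_sb_write_flags(unsigned int requested, int barrier_enabled)
{
    unsigned int flags;

    assert((requested & ~(unsigned int)MODEL_REQ_ALL) == 0);
    flags = requested | (unsigned int)MODEL_REQ_SYNC;
    if (!barrier_enabled)
        flags &= ~(unsigned int)(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);
    return flags;
}
```

With barriers disabled, a request for PREFLUSH and FUA comes back as a plain synchronous write rather than an asynchronous one.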
     
  • I've hit a lockdep splat with generic/270 test complaining that:

    3216.fsstress.b/3533 is trying to acquire lock:
    (jbd2_handle){++++..}, at: [] jbd2_log_wait_commit+0x0/0x150

    but task is already holding lock:
    (jbd2_handle){++++..}, at: [] start_this_handle+0x35b/0x850

    The underlying problem is that jbd2_journal_force_commit_nested()
    (called from ext4_should_retry_alloc()) may get called while a
    transaction handle is started. In such case it takes care to not wait
    for commit of the running transaction (which would deadlock) but only
    for a commit of a transaction that is already committing (which is safe
    as that doesn't wait for any filesystem locks).

    In fact there are also other callers of jbd2_log_wait_commit() that take
    care to pass tid of a transaction that is already committing and for
    those cases, the lockdep instrumentation is too restrictive and leading
    to false positive reports. Fix the problem by calling
    jbd2_might_wait_for_commit() from jbd2_log_wait_commit() only if the
    transaction isn't already committing.

    Fixes: 1eaa566d368b214d99cbb973647c1b0b8102a9ae
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
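
The condition described above can be written as a small predicate. This is a sketch under assumptions: `model_tid_gt()` and `model_should_annotate_wait()` are hypothetical names, and the journal state is passed in as plain arguments instead of being read from a journal structure.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int model_tid_t;

/* Wraparound-safe "x is newer than y" comparison for transaction IDs. */
bool model_tid_gt(model_tid_t x, model_tid_t y)
{
    return (int)(x - y) > 0;
}

/* Decide whether waiting for `tid` should be reported to the lock
 * validator as a potentially deadlocking wait. */
bool model_should_annotate_wait(model_tid_t tid, model_tid_t commit_sequence,
                                bool has_committing, model_tid_t committing_tid)
{
    if (!model_tid_gt(tid, commit_sequence))
        return false;  /* already committed: there is nothing to wait for */
    if (has_committing && committing_tid == tid)
        return false;  /* already committing: waiting is safe */
    return true;       /* running transaction: the wait may deadlock */
}
```

Only the last case, a wait for the still-running transaction, is the one a held handle could deadlock on.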
     

23 Apr, 2017

1 commit


19 Apr, 2017

1 commit

  • A group of Linux kernel hackers reported chasing a bug that resulted
    from their assumption that SLAB_DESTROY_BY_RCU provided an existence
    guarantee, that is, that no block from such a slab would be reallocated
    during an RCU read-side critical section. Of course, that is not the
    case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
    slab of blocks.

    However, there is a phrase for this, namely "type safety". This commit
    therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
    to avoid future instances of this sort of confusion.

    Signed-off-by: Paul E. McKenney
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrew Morton
    Acked-by: Johannes Weiner
    Acked-by: Vlastimil Babka
    [ paulmck: Add comments mentioning the old name, as requested by Eric
    Dumazet, in order to help people familiar with the old name find
    the new one. ]
    Acked-by: David Rientjes

    Paul E. McKenney
     

16 Mar, 2017

1 commit

  • In journal_init_common(), if we failed to allocate the j_wbuf array, or
    if we failed to create the buffer_head for the journal superblock, we
    leaked the memory allocated for the revocation tables. Fix this.

    Cc: stable@vger.kernel.org # 4.9
    Fixes: f0c9fd5458bacf7b12a9a579a727dc740cbe047e
    Signed-off-by: Eric Biggers
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Eric Biggers
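
The error-path pattern behind this fix can be sketched in userspace C. All names here are hypothetical, `malloc()` stands in for the kernel allocators, and the allocation counter exists only to make a leak observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct init_journal {
    void *revoke_tables;  /* stands in for the revocation tables */
    void *wbuf;           /* stands in for the j_wbuf array */
};

static int model_live_allocs;  /* outstanding allocations, for demonstration */

static void *model_alloc(size_t n, bool fail)
{
    void *p = fail ? NULL : malloc(n);
    if (p)
        model_live_allocs++;
    return p;
}

static void model_free(void *p)
{
    if (p) {
        model_live_allocs--;
        free(p);
    }
}

int model_live_allocations(void)
{
    return model_live_allocs;
}

/* Each failure jumps to a label that unwinds everything allocated so far. */
struct init_journal *model_journal_init(bool fail_wbuf)
{
    struct init_journal *j = model_alloc(sizeof(*j), false);

    if (!j)
        return NULL;
    j->revoke_tables = model_alloc(64, false);
    if (!j->revoke_tables)
        goto err_free_journal;
    j->wbuf = model_alloc(64, fail_wbuf);
    if (!j->wbuf)
        goto err_free_revoke;  /* the unwinding step a leaky version skips */
    return j;

err_free_revoke:
    model_free(j->revoke_tables);
err_free_journal:
    model_free(j);
    return NULL;
}

void model_journal_destroy(struct init_journal *j)
{
    assert(j != NULL);
    model_free(j->wbuf);
    model_free(j->revoke_tables);
    model_free(j);
}
```

If the simulated `wbuf` allocation fails, `model_live_allocations()` returns to zero, which is the property the fix restores.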
     

21 Feb, 2017

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "For this cycle we add support for the shutdown ioctl, which is
    primarily used for testing, but which can be useful on production
    systems when a scratch volume is being destroyed and the data on it
    doesn't need to be saved.

    This found (and we fixed) a number of bugs with ext4's recovery to
    corrupted file system --- the bugs increased the amount of data that
    could be potentially lost, and in the case of the inline data feature,
    could cause the kernel to BUG.

    Also included are a number of other bug fixes, including in ext4's
    fscrypt, DAX, inline data support"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (26 commits)
    ext4: rename EXT4_IOC_GOINGDOWN to EXT4_IOC_SHUTDOWN
    ext4: fix fencepost in s_first_meta_bg validation
    ext4: don't BUG when truncating encrypted inodes on the orphan list
    ext4: do not use stripe_width if it is not set
    ext4: fix stripe-unaligned allocations
    dax: assert that i_rwsem is held exclusive for writes
    ext4: fix DAX write locking
    ext4: add EXT4_IOC_GOINGDOWN ioctl
    ext4: add shutdown bit and check for it
    ext4: rename s_resize_flags to s_ext4_flags
    ext4: return EROFS if device is r/o and journal replay is needed
    ext4: preserve the needs_recovery flag when the journal is aborted
    jbd2: don't leak modified metadata buffers on an aborted journal
    ext4: fix inline data error paths
    ext4: move halfmd4 into hash.c directly
    ext4: fix use-after-iput when fscrypt contexts are inconsistent
    jbd2: fix use after free in kjournald2()
    ext4: fix data corruption in data=journal mode
    ext4: trim allocation requests to group size
    ext4: replace BUG_ON with WARN_ON in mb_find_extent()
    ...

    Linus Torvalds
     

02 Feb, 2017

1 commit

    The following synchronization issue between the unmount and kjournald2
    contexts results in a use-after-free in kjournald2().
    Fix this issue by using journal->j_state_lock to synchronize the
    wait_event() done in journal_kill_thread() and the wake_up() done
    in kjournald2().

    TASK 1:
    umount cmd:
    |--jbd2_journal_destroy() {
       |--journal_kill_thread() {
            write_lock(&journal->j_state_lock);
            journal->j_flags |= JBD2_UNMOUNT;
            ...
            write_unlock(&journal->j_state_lock);
            wake_up(&journal->j_wait_commit);      TASK 2 wakes up here:
                                                   kjournald2() {
                                                     ...
                                                     checks JBD2_UNMOUNT flag and calls goto end-loop;
                                                     ...
                                                     end_loop:
                                                       write_unlock(&journal->j_state_lock);
                                                       journal->j_task = NULL; --> If this thread gets
                                                       pre-empted here, then TASK 1 wait_event will
                                                       exit even before this thread is completely
                                                       done.
            wait_event(journal->j_wait_done_commit, journal->j_task == NULL);
            ...
            write_lock(&journal->j_state_lock);
            write_unlock(&journal->j_state_lock);
          }
       |--kfree(journal);
       }
    }
                                                       wake_up(&journal->j_wait_done_commit); --> this step
                                                       now results into use after free issue.
                                                   }

    Signed-off-by: Sahitya Tummala
    Signed-off-by: Theodore Ts'o

    Sahitya Tummala
     

14 Jan, 2017

1 commit

  • When an ext4 fs is bogged down by a lot of metadata IOs (in the
    reported case, it was deletion of millions of files, but any massive
    amount of journal writes would do), after the journal is filled up,
    tasks which try to access the filesystem and aren't currently
    performing the journal writes end up waiting in
    __jbd2_log_wait_for_space() for journal->j_checkpoint_mutex.

    Because those mutex sleeps aren't marked as iowait, this condition can
    lead to misleadingly low iowait and /proc/stat:procs_blocked. While
    iowait propagation is far from strict, this condition can be triggered
    fairly easily and annotating these sleeps correctly helps initial
    diagnosis quite a bit.

    Use the new mutex_lock_io() for journal->j_checkpoint_mutex so that
    these sleeps are properly marked as iowait.

    Reported-by: Mingbo Wan
    Signed-off-by: Tejun Heo
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andreas Dilger
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: Jens Axboe
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Theodore Ts'o
    Cc: Thomas Gleixner
    Cc: kernel-team@fb.com
    Link: http://lkml.kernel.org/r/1477673892-28940-5-git-send-email-tj@kernel.org
    Signed-off-by: Ingo Molnar

    Tejun Heo
     

25 Dec, 2016

1 commit


01 Nov, 2016

1 commit


16 Sep, 2016

1 commit

    There is some repetitive code in jbd2_journal_init_dev() and
    jbd2_journal_init_inode(). This patch moves the common code into a
    journal_init_common() helper to simplify the code. It also fixes the
    coding style warnings reported by checkpatch.pl along the way.

    Signed-off-by: Geliang Tang
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara

    Geliang Tang
     

27 Jul, 2016

2 commits

  • Pull ext4 updates from Ted Ts'o:
    "The major change this cycle is deleting ext4's copy of the file system
    encryption code and switching things over to using the copies in
    fs/crypto. I've updated the MAINTAINERS file to add an entry for
    fs/crypto listing Jaeguk Kim and myself as the maintainers.

    There are also a number of bug fixes, most notably for some problems
    found by American Fuzzy Lop (AFL) courtesy of Vegard Nossum. Also
    fixed is a writeback deadlock detected by generic/130, and some
    potential races in the metadata checksum code"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
    ext4: verify extent header depth
    ext4: short-cut orphan cleanup on error
    ext4: fix reference counting bug on block allocation error
    MAINTAINRES: fs-crypto maintainers update
    ext4 crypto: migrate into vfs's crypto engine
    ext2: fix filesystem deadlock while reading corrupted xattr block
    ext4: fix project quota accounting without quota limits enabled
    ext4: validate s_reserved_gdt_blocks on mount
    ext4: remove unused page_idx
    ext4: don't call ext4_should_journal_data() on the journal inode
    ext4: Fix WARN_ON_ONCE in ext4_commit_super()
    ext4: fix deadlock during page writeback
    ext4: correct error value of function verifying dx checksum
    ext4: avoid modifying checksum fields directly during checksum verification
    ext4: check for extents that wrap around
    jbd2: make journal y2038 safe
    jbd2: track more dependencies on transaction commit
    jbd2: move lockdep tracking to journal_s
    jbd2: move lockdep instrumentation for jbd2 handles
    ext4: respect the nobarrier mount option in nojournal mode
    ...

    Linus Torvalds
     
  • Pull core block updates from Jens Axboe:

    - the big change is the cleanup from Mike Christie, cleaning up our
    uses of command types and modified flags. This is what will throw
    some merge conflicts

    - regression fix for the above for btrfs, from Vincent

    - following up to the above, better packing of struct request from
    Christoph

    - a 2038 fix for blktrace from Arnd

    - a few trivial/spelling fixes from Bart Van Assche

    - a front merge check fix from Damien, which could cause issues on
    SMR drives

    - Atari partition fix from Gabriel

    - convert cfq to highres timers, since jiffies isn't granular enough
    for some devices these days. From Jan and Jeff

    - CFQ priority boost fix idle classes, from me

    - cleanup series from Ming, improving our bio/bvec iteration

    - a direct issue fix for blk-mq from Omar

    - fix for plug merging not involving the IO scheduler, like we do for
    other types of merges. From Tahsin

    - expose DAX type internally and through sysfs. From Toshi and Yigal

    * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
    block: Fix front merge check
    block: do not merge requests without consulting with io scheduler
    block: Fix spelling in a source code comment
    block: expose QUEUE_FLAG_DAX in sysfs
    block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
    Btrfs: fix comparison in __btrfs_map_block()
    block: atari: Return early for unsupported sector size
    Doc: block: Fix a typo in queue-sysfs.txt
    cfq-iosched: Charge at least 1 jiffie instead of 1 ns
    cfq-iosched: Fix regression in bonnie++ rewrite performance
    cfq-iosched: Convert slice_resid from u64 to s64
    block: Convert fifo_time from ulong to u64
    blktrace: avoid using timespec
    block/blk-cgroup.c: Declare local symbols static
    block/bio-integrity.c: Add #include "blk.h"
    block/partition-generic.c: Remove a set-but-not-used variable
    block: bio: kill BIO_MAX_SIZE
    cfq-iosched: temporarily boost queue priority for idle classes
    block: drbd: avoid to use BIO_MAX_SIZE
    block: bio: remove BIO_MAX_SECTORS
    ...

    Linus Torvalds
     

30 Jun, 2016

2 commits

  • So far we were tracking only dependency on transaction commit due to
    starting a new handle (which may require commit to start a new
    transaction). Now add tracking also for other cases where we wait for
    transaction commit. This way lockdep can catch deadlocks, e.g. because we
    call jbd2_journal_stop() for a synchronous handle with some locks held
    which rank below transaction start.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     
  • Currently lockdep map is tracked in each journal handle. To be able to
    expand lockdep support to cover also other cases where we depend on
    transaction commit and where handle is not available, move lockdep map
    into struct journal_s. Since this makes the lockdep map shared for all
    handles, we have to use rwsem_acquire_read() for acquisitions now.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
     

25 Jun, 2016

1 commit

  • jbd2_alloc is explicit about its allocation preferences wrt. the
    allocation size. Sub page allocations go to the slab allocator and
    larger are using either the page allocator or vmalloc. This is all good
    but the logic is unnecessarily complex.

    1) as per Ted, the vmalloc fallback is a left-over:

    : jbd2_alloc is only passed in the bh->b_size, which can't be PAGE_SIZE, so
    : the code path that calls vmalloc() should never get called. When we
    : converted jbd2_alloc() to support sub-page size allocations in commit
    : d2eecb039368, there was an assumption that it could be called with a size
    : greater than PAGE_SIZE, but that's certainly not true today.

    Moreover vmalloc allocation might even lead to a deadlock because the
    callers expect GFP_NOFS context while vmalloc is GFP_KERNEL.

    2) __GFP_REPEAT for requests
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

08 Jun, 2016

3 commits


24 Apr, 2016

1 commit

  • Currently when filesystem needs to make sure data is on permanent
    storage before committing a transaction it adds inode to transaction's
    inode list. During transaction commit, jbd2 writes back all dirty
    buffers that have allocated underlying blocks and waits for the IO to
    finish. However when doing writeback for delayed allocated data, we
    allocate blocks and immediately submit the data. Thus asking jbd2 to
    write dirty pages just unnecessarily adds more work to jbd2 possibly
    writing back other redirtied blocks.

    Add support to jbd2 to allow filesystem to ask jbd2 to only wait for
    outstanding data writes before committing a transaction and thus avoid
    unnecessary writes.

    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
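
A minimal sketch of the two modes described above, assuming hypothetical flag names (`MODEL_JI_WRITE_DATA`, `MODEL_JI_WAIT_DATA`) and counters in place of real I/O:

```c
#include <assert.h>
#include <stddef.h>

enum {
    MODEL_JI_WRITE_DATA = 1 << 0,  /* commit must submit the dirty data itself */
    MODEL_JI_WAIT_DATA  = 1 << 1   /* commit must wait for in-flight data */
};

struct model_jinode {
    unsigned int flags;
    int writes_submitted;  /* counts simulated writeback submissions */
    int waits_done;        /* counts simulated waits for I/O completion */
};

/* Data phase of a commit: submit writeback only where requested, then wait
 * on every inode that asked for either mode. */
void model_commit_data(struct model_jinode *inodes, size_t n)
{
    size_t i;

    assert(inodes != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (inodes[i].flags & MODEL_JI_WRITE_DATA)
            inodes[i].writes_submitted++;
    for (i = 0; i < n; i++)
        if (inodes[i].flags & (MODEL_JI_WRITE_DATA | MODEL_JI_WAIT_DATA))
            inodes[i].waits_done++;
}
```

An inode marked wait-only is never written back by the commit, but the commit still does not finish before that inode's outstanding writes complete.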
     

05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation also
    will be addressed with the separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
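
Why the shift rewrites above are safe can be checked with a few lines of C. The `MODEL_` macros are stand-ins, and the value 12 (4 KiB pages) is an assumption for illustration only.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT        12                /* assumed 4 KiB pages */
#define MODEL_PAGE_CACHE_SHIFT  MODEL_PAGE_SHIFT  /* the two were always equal */

/* Old style: convert a page-cache index to a page index with a shift. */
unsigned long model_old_index(unsigned long index)
{
    return index << (MODEL_PAGE_CACHE_SHIFT - MODEL_PAGE_SHIFT);
}

/* New style: the shift count is zero, so the expression is just the index. */
unsigned long model_new_index(unsigned long index)
{
    return index;
}
```

Because the two shifts are equal, the shift count is zero and both functions return the same value for every input.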
     

10 Mar, 2016

1 commit

  • On umount path, jbd2_journal_destroy() writes latest transaction ID
    (->j_tail_sequence) to be used at next mount.

    The bug is that ->j_tail_sequence does not hold the latest transaction
    ID in some cases. So, at the next mount, it may conflict with the
    remaining (not yet overwritten) transactions.

    mount (id=10)
    write transaction (id=11)
    write transaction (id=12)
    umount (id=10) j_tail_sequence is not updated.
    (And another case is, __jbd2_journal_clean_checkpoint_list() is called
    with empty transaction.)

    So in the above cases, ->j_tail_sequence does not point to the latest
    transaction ID on the umount path. In addition, the REQ_FLUSH for the
    checkpoint is not issued either.

    So, to fix this problem with minimal changes, this patch updates
    ->j_tail_sequence and issues REQ_FLUSH. (With more complex changes,
    some optimizations would be possible, for example to avoid an
    unnecessary REQ_FLUSH.)

    BTW,

    journal->j_tail_sequence =
    ++journal->j_transaction_sequence;

    Increment of ->j_transaction_sequence seems to be unnecessary, but
    ext3 does this.

    Signed-off-by: OGAWA Hirofumi
    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org

    OGAWA Hirofumi
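
The ID bookkeeping in this commit message can be traced with a small model. The structure and function names are hypothetical, and `next_id` plays the role of ->j_transaction_sequence under the assumption that it holds the ID the next transaction will receive.

```c
#include <assert.h>
#include <stddef.h>

struct seq_journal {
    unsigned int tail_sequence;  /* ID recorded in the journal superblock */
    unsigned int next_id;        /* ID the next transaction will receive */
};

void model_mount(struct seq_journal *j, unsigned int recorded_id)
{
    assert(j != NULL);
    j->tail_sequence = recorded_id;
    j->next_id = recorded_id + 1;
}

/* Hand out a transaction ID without touching tail_sequence, as in the
 * problem case where the tail is never updated before umount. */
unsigned int model_write_transaction(struct seq_journal *j)
{
    assert(j != NULL);
    return j->next_id++;
}

/* Fixed umount path: record an ID newer than anything written so far. */
void model_umount_fixed(struct seq_journal *j)
{
    assert(j != NULL);
    j->tail_sequence = ++j->next_id;
}
```

After mounting at ID 10 and writing transactions 11 and 12, the fixed umount path records 14, which is newer than every transaction still present in the log.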
     

23 Feb, 2016

3 commits


19 Oct, 2015

1 commit

    If an ext4 filesystem uses JBD2 journaling and an error occurs, the
    journal is aborted first, the error number is recorded in the JBD2
    superblock and, finally, the system panics if the "errors=panic" option
    is set. In rare cases, however, this sequence is reordered as shown in
    the figure below: the system panics, which means a system reset in a
    mobile environment, before the error has been recorded in the journal
    superblock. In this case, e2fsck cannot recognize that a filesystem
    failure occurred in the previous run, and the corruption is not fixed.

    Task A                            Task B
    ext4_handle_error()
    -> jbd2_journal_abort()
      -> __journal_abort_soft()
        -> __jbd2_journal_abort_hard()
        | -> journal->j_flags |= JBD2_ABORT;
        |
        |                             __ext4_abort()
        |                             -> jbd2_journal_abort()
        |                             | -> __journal_abort_soft()
        |                             |   -> if (journal->j_flags & JBD2_ABORT)
        |                             |         return;
        |                             -> panic()
        |
        -> jbd2_journal_update_sb_errno()

    Tested-by: Hobin Woo
    Signed-off-by: Daeho Jeong
    Signed-off-by: Theodore Ts'o
    Cc: stable@vger.kernel.org

    Daeho Jeong
     

18 Oct, 2015

2 commits


15 Oct, 2015

1 commit

  • Change the journal's checksum functions to gate on whether or not the
    crc32c driver is loaded, and gate the loading on the superblock bits.
    This prevents a journal crash if someone loads a journal in no-csum
    mode and then randomizes the superblock, thus flipping on the feature
    bits.

    Tested-By: Nikolay Borisov
    Reported-by: Nikolay Borisov
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Theodore Ts'o

    Darrick J. Wong
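
The gating described above can be sketched as follows. Every name is hypothetical, the feature bit is not the on-disk value, and a static object stands in for a loaded crc32c driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FEATURE_CSUM 0x1u  /* illustrative bit, not the on-disk value */

struct csum_journal {
    unsigned int sb_features;  /* superblock feature bits; may change after load */
    const void  *csum_driver;  /* non-NULL only if the driver was loaded */
};

static const char model_crc32c_driver = 0;  /* stand-in for a loaded driver */

/* Loading the driver is gated on the superblock bits seen at load time. */
void model_journal_load(struct csum_journal *j, unsigned int sb_features)
{
    assert(j != NULL);
    j->sb_features = sb_features;
    j->csum_driver = (sb_features & MODEL_FEATURE_CSUM)
                         ? &model_crc32c_driver : NULL;
}

/* Checksum code is gated on the driver, not on the (mutable) feature bits. */
bool model_journal_has_csum(const struct csum_journal *j)
{
    assert(j != NULL);
    return j->csum_driver != NULL;
}
```

If the feature bits are flipped on after a no-checksum load, `model_journal_has_csum()` still returns false, so the checksum code never runs without a driver.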
     

29 Jul, 2015

1 commit

  • Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
    superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
    when the journal is aborted. That makes logic in
    jbd2_log_do_checkpoint() bail out which is fine, except that
    jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
    progress in cleaning the journal. Without it jbd2_journal_destroy()
    just loops forever.

    Fix jbd2_journal_destroy() to clean up the journal checkpoint lists if
    jbd2_log_do_checkpoint() fails with an error.

    Reported-by: Eryu Guan
    Tested-by: Eryu Guan
    Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o

    Jan Kara
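
The loop behavior described above can be modeled in a few lines. This is a sketch with hypothetical names: a counter replaces the checkpoint lists, and `-EIO` is returned whenever the modeled journal is aborted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct cp_journal {
    int  checkpoint_transactions;  /* transactions still to be checkpointed */
    bool aborted;
};

/* Checkpoint one transaction; fails without progress on an aborted journal. */
int model_do_checkpoint(struct cp_journal *j)
{
    assert(j != NULL);
    if (j->aborted)
        return -EIO;
    if (j->checkpoint_transactions > 0)
        j->checkpoint_transactions--;
    return 0;
}

/* Drop all remaining checkpoint state without writing anything. */
void model_destroy_checkpoint(struct cp_journal *j)
{
    assert(j != NULL);
    j->checkpoint_transactions = 0;
}

/* Destroy-time loop: on error, clean up and stop instead of retrying. */
int model_destroy_loop(struct cp_journal *j)
{
    int iterations = 0;

    assert(j != NULL);
    while (j->checkpoint_transactions > 0) {
        iterations++;
        if (model_do_checkpoint(j) != 0) {
            model_destroy_checkpoint(j);
            break;
        }
    }
    return iterations;
}
```

Without the error branch, an aborted journal would keep the loop condition true forever, because `model_do_checkpoint()` never reduces the count.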
     

23 Jul, 2015

1 commit

    When an error condition is detected, an error status should be recorded in
    the superblocks of EXT4 or JBD2. However, the write request is currently
    submitted without the REQ_FUA flag, even in "barrier=1" mode, and in
    "errors=panic" mode it is followed by a call to panic(). On mobile devices
    that reset the whole system as soon as a kernel panic occurs, this write
    request containing the error flag can be lost from the storage cache
    without ever being written to the physical cells. As a result, the error
    flag is not visible in either superblock on the next boot, or on any later
    one, and e2fsck cannot fix the filesystem problems automatically unless it
    is run in forced checking mode.

    [ Changed use test_opt(sb, BARRIER) of checking the journal flags -- TYT ]

    Signed-off-by: Daeho Jeong
    Signed-off-by: Theodore Ts'o

    Daeho Jeong
     

27 Jun, 2015

1 commit

  • Merge second patchbomb from Andrew Morton:

    - most of the rest of MM

    - lots of misc things

    - procfs updates

    - printk feature work

    - updates to get_maintainer, MAINTAINERS, checkpatch

    - lib/ updates

    * emailed patches from Andrew Morton : (96 commits)
    exit,stats: /* obey this comment */
    coredump: add __printf attribute to cn_*printf functions
    coredump: use from_kuid/kgid when formatting corename
    fs/reiserfs: remove unneeded cast
    NILFS2: support NFSv2 export
    fs/befs/btree.c: remove unneeded initializations
    fs/minix: remove unneeded cast
    init/do_mounts.c: add create_dev() failure log
    kasan: remove duplicate definition of the macro KASAN_FREE_PAGE
    fs/efs: femove unneeded cast
    checkpatch: emit "NOTE: " message only once after multiple files
    checkpatch: emit an error when there's a diff in a changelog
    checkpatch: validate MODULE_LICENSE content
    checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY
    checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
    checkpatch: fix processing of MEMSET issues
    checkpatch: suggest using ether_addr_equal*()
    checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files
    checkpatch: remove local from codespell path
    checkpatch: add --showfile to allow input via pipe to show filenames
    ...

    Linus Torvalds
     

26 Jun, 2015

1 commit