17 Aug, 2012

1 commit

  • Pull ext4 bug fixes from Ted Ts'o:
    "The following are all bug fixes and regressions. The most notable are
    the ones which cause problems for ext4 on RAID --- a performance
    problem when mounting very large filesystems, and a kernel OOPS when
    doing an rm -rf on large directory hierarchies on fast devices."

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix kernel BUG on large-scale rm -rf commands
    ext4: fix long mount times on very big file systems
    ext4: don't call ext4_error while block group is locked
    ext4: avoid kmemcheck complaint from reading uninitialized memory
    ext4: make sure the journal sb is written in ext4_clear_journal_err()

    Linus Torvalds
     

06 Aug, 2012

1 commit

  • After we transfer set the EXT4_ERROR_FS bit in the file system
    superblock, it's not enough to call jbd2_journal_clear_err() to clear
    the error indication from journal superblock --- we need to call
    jbd2_journal_update_sb_errno() as well. Otherwise, when the root file
    system is mounted read-only, the journal is replayed, and the error
    indicator is transferred to the superblock --- but the s_errno field
    in the jbd2 superblock is left set (since although we cleared it in
    memory, we never flushed it out to disk).

    This can end up confusing e2fsck. We should make e2fsck more robust
    in this case, but the kernel shouldn't be leaving things in this
    confused state, either.

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
     

04 Aug, 2012

1 commit


23 Jul, 2012

1 commit


01 Jun, 2012

1 commit


27 May, 2012

7 commits


23 May, 2012

1 commit


24 Apr, 2012

1 commit

  • flush request is issued in transaction commit code path, so looks using
    GFP_KERNEL to allocate memory for flush request bio falls into the classic
    deadlock issue. I saw btrfs and dm get it right, but ext4, xfs and md are
    using GFP.

    Signed-off-by: Shaohua Li
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara
    Cc: stable@vger.kernel.org

    Shaohua Li
     

29 Mar, 2012

3 commits

  • …m/linux/kernel/git/dhowells/linux-asm_system

    Pull "Disintegrate and delete asm/system.h" from David Howells:
    "Here are a bunch of patches to disintegrate asm/system.h into a set of
    separate bits to relieve the problem of circular inclusion
    dependencies.

    I've built all the working defconfigs from all the arches that I can
    and made sure that they don't break.

    The reason for these patches is that I recently encountered a circular
    dependency problem that came about when I produced some patches to
    optimise get_order() by rewriting it to use ilog2().

    This uses bitops - and on the SH arch asm/bitops.h drags in
    asm-generic/get_order.h by a circuituous route involving asm/system.h.

    The main difficulty seems to be asm/system.h. It holds a number of
    low level bits with no/few dependencies that are commonly used (eg.
    memory barriers) and a number of bits with more dependencies that
    aren't used in many places (eg. switch_to()).

    These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

    Move memory barriers here. This already done for MIPS and Alpha.

    (2) asm/switch_to.h

    Move switch_to() and related stuff here.

    (3) asm/exec.h

    Move arch_align_stack() here. Other process execution related bits
    could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

    Move xchg() and cmpxchg() here as they're full word atomic ops and
    frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

    Move die() and related bits.

    (6) asm/auxvec.h

    Move AT_VECTOR_SIZE_ARCH here.

    Other arch headers are created as needed on a per-arch basis."

    Fixed up some conflicts from other header file cleanups and moving code
    around that has happened in the meantime, so David's testing is somewhat
    weakened by that. We'll find out anything that got broken and fix it..

    * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
    Delete all instances of asm/system.h
    Remove all #inclusions of asm/system.h
    Add #includes needed to permit the removal of asm/system.h
    Move all declarations of free_initmem() to linux/mm.h
    Disintegrate asm/system.h for OpenRISC
    Split arch_align_stack() out from asm-generic/system.h
    Split the switch_to() wrapper out of asm-generic/system.h
    Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
    Create asm-generic/barrier.h
    Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
    Disintegrate asm/system.h for Xtensa
    Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
    Disintegrate asm/system.h for Tile
    Disintegrate asm/system.h for Sparc
    Disintegrate asm/system.h for SH
    Disintegrate asm/system.h for Score
    Disintegrate asm/system.h for S390
    Disintegrate asm/system.h for PowerPC
    Disintegrate asm/system.h for PA-RISC
    Disintegrate asm/system.h for MN10300
    ...

    Linus Torvalds
     
  • Remove all #inclusions of asm/system.h preparatory to splitting and killing
    it. Performed with the following command:

    perl -p -i -e 's!^#\s*include\s*.*\n!!' `grep -Irl '^#\s*include\s*' *`

    Signed-off-by: David Howells

    David Howells
     
  • Pull ext4 updates for 3.4 from Ted Ts'o:
    "Ext4 commits for 3.3 merge window; mostly cleanups and bug fixes

    The changes to export dirty_writeback_interval are from Artem's s_dirt
    cleanup patch series. The same is true of the change to remove the
    s_dirt helper functions which never got used by anyone in-tree. I've
    run these changes by Al Viro, and am carrying them so that Artem can
    more easily fix up the rest of the file systems during the next merge
    window. (Originally we had hopped to remove the use of s_dirt from
    ext4 during this merge window, but his patches had some bugs, so I
    ultimately ended dropping them from the ext4 tree.)"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (66 commits)
    vfs: remove unused superblock helpers
    mm: export dirty_writeback_interval
    ext4: remove useless s_dirt assignment
    ext4: write superblock only once on unmount
    ext4: do not mark superblock as dirty unnecessarily
    ext4: correct ext4_punch_hole return codes
    ext4: remove restrictive checks for EOFBLOCKS_FL
    ext4: always set then trimmed blocks count into len
    ext4: fix trimmed block count accunting
    ext4: fix start and len arguments handling in ext4_trim_fs()
    ext4: update s_free_{inodes,blocks}_count during online resize
    ext4: change some printk() calls to use ext4_msg() instead
    ext4: avoid output message interleaving in ext4_error_()
    ext4: remove trailing newlines from ext4_msg() and ext4_error() messages
    ext4: add no_printk argument validation, fix fallout
    ext4: remove redundant "EXT4-fs: " from uses of ext4_msg
    ext4: give more helpful error message in ext4_ext_rm_leaf()
    ext4: remove unused code from ext4_ext_map_blocks()
    ext4: rewrite punch hole to use ext4_ext_remove_space()
    jbd2: cleanup journal tail after transaction commit
    ...

    Linus Torvalds
     

22 Mar, 2012

1 commit

  • Pull power management updates for 3.4 from Rafael Wysocki:
    "Assorted extensions and fixes including:

    * Introduction of early/late suspend/hibernation device callbacks.
    * Generic PM domains extensions and fixes.
    * devfreq updates from Axel Lin and MyungJoo Ham.
    * Device PM QoS updates.
    * Fixes of concurrency problems with wakeup sources.
    * System suspend and hibernation fixes."

    * tag 'pm-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (43 commits)
    PM / Domains: Check domain status during hibernation restore of devices
    PM / devfreq: add relation of recommended frequency.
    PM / shmobile: Make MTU2 driver use pm_genpd_dev_always_on()
    PM / shmobile: Make CMT driver use pm_genpd_dev_always_on()
    PM / shmobile: Make TMU driver use pm_genpd_dev_always_on()
    PM / Domains: Introduce "always on" device flag
    PM / Domains: Fix hibernation restore of devices, v2
    PM / Domains: Fix handling of wakeup devices during system resume
    sh_mmcif / PM: Use PM QoS latency constraint
    tmio_mmc / PM: Use PM QoS latency constraint
    PM / QoS: Make it possible to expose PM QoS latency constraints
    PM / Sleep: JBD and JBD2 missing set_freezable()
    PM / Domains: Fix include for PM_GENERIC_DOMAINS=n case
    PM / Freezer: Remove references to TIF_FREEZE in comments
    PM / Sleep: Add more wakeup source initialization routines
    PM / Hibernate: Enable usermodehelpers in hibernate() error path
    PM / Sleep: Make __pm_stay_awake() delete wakeup source timers
    PM / Sleep: Fix race conditions related to wakeup source timer function
    PM / Sleep: Fix possible infinite loop during wakeup source destruction
    PM / Hibernate: print physical addresses consistently with other parts of kernel
    ...

    Linus Torvalds
     

20 Mar, 2012

1 commit


14 Mar, 2012

9 commits

  • Normally, we have to issue a cache flush before we can update journal tail in
    journal superblock, effectively wiping out old transactions from the journal.
    So use the fact that during transaction commit we issue cache flush anyway and
    opportunistically push journal tail as far as we can. Since update of journal
    superblock is still costly (we have to use WRITE_FUA), we update log tail only
    if we can free significant amount of space.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • All accesses to checkpointing entries in journal_head are protected
    by j_list_lock. Thus __jbd2_journal_remove_checkpoint() doesn't really
    need bh_state lock.

    Also the only part of journal head that the rest of checkpointing code
    needs to check is jh->b_transaction which is safe to read under
    j_list_lock.

    So we can safely remove bh_state lock from all of checkpointing code which
    makes it considerably prettier.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • The check b_jlist == BJ_None in __journal_try_to_free_buffer() is
    always true (__jbd2_journal_temp_unlink_buffer() also checks this in
    an assertion) so just remove it.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • BH_JWrite bit should be set when buffer is written to the journal. So
    checkpointing shouldn't set this bit when writing out buffer. This didn't
    cause any observable bug since BH_JWrite bit is used only for debugging
    purposes but it's good to have this consistent.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • When we reach jbd2_cleanup_journal_tail(), there is no guarantee that
    checkpointed buffers are on a stable storage - especially if buffers were
    written out by jbd2_log_do_checkpoint(), they are likely to be only in disk's
    caches. Thus when we update journal superblock effectively removing old
    transaction from journal, this write of superblock can get to stable storage
    before those checkpointed buffers which can result in filesystem corruption
    after a crash. Thus we must unconditionally issue a cache flush before we
    update journal superblock in these cases.

    A similar problem can also occur if journal superblock is written only in
    disk's caches, other transaction starts reusing space of the transaction
    cleaned from the log and power failure happens. Subsequent journal replay would
    still try to replay the old transaction but some of it's blocks may be already
    overwritten by the new transaction. For this reason we must use WRITE_FUA when
    updating log tail and we must first write new log tail to disk and update
    in-memory information only after that.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • With the latest and greatest changes to the freezer, I started seeing
    panics that were caused by jbd2 running post-process freezing and
    hitting the canary BUG_ON for non-TuxOnIce I/O submission. I've traced
    this back to a lack of set_freezable calls in both jbd and jbd2. Since
    they're clearly meant to be frozen (there are tests for freezing()), I
    submit the following patch to add the missing calls.

    Signed-off-by: Nigel Cunningham
    Acked-by: Jan Kara
    Signed-off-by: Rafael J. Wysocki

    Nigel Cunningham
     
  • There are some log tail updates that are not protected by j_checkpoint_mutex.
    Some of these are harmless because they happen during startup or shutdown but
    updates in jbd2_journal_commit_transaction() and jbd2_journal_flush() can
    really race with other log tail updates (e.g. someone doing
    jbd2_journal_flush() with someone running jbd2_cleanup_journal_tail()). So
    protect all log tail updates with j_checkpoint_mutex.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • There are three case of updating journal superblock. In the first case, we want
    to mark journal as empty (setting s_sequence to 0), in the second case we want
    to update log tail, in the third case we want to update s_errno. Split these
    cases into separate functions. It makes the code slightly more straightforward
    and later patches will make the distinction even more important.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

21 Feb, 2012

6 commits

  • The V2 journal format was introduced around ten years ago,
    for ext3. It seems highly unlikely that anyone will need this
    migration option for ext4.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • Use the KMEM_CACHE helper macro instead of kmem_cache_create().

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     
  • This patch renames functions initializing the slab caches for the
    journal head and handle structures to so they are consistent with the
    names of the corresponding functions which destroys those slab caches.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     
  • There is normally only a handful of these active at any one time, but
    putting them in a separate slab cache makes debugging memory
    corruption problems easier. Manish Katiyar also wanted this make it
    easier to test memory failure scenarios in the jbd2 layer.

    Cc: Manish Katiyar
    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     
  • journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
    state ala discard_buffer(), but does not touch _Delay or _Unwritten as
    discard_buffer() does.

    This can be problematic in some areas of the ext4 code which assume
    that if they have found a buffer marked unwritten or delay, then it's
    a live one. Perhaps those spots should check whether it is mapped
    as well, but if jbd2 is going to tear down a buffer, let's really
    tear it down completely.

    Without this I get some fsx failures on sub-page-block filesystems
    up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb
    and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go
    away, because buried within that large change is some more flag
    clearing. I still think it's worth doing in jbd2, since
    ->invalidatepage leads here directly, and it's the right place
    to clear away these flags.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Eric Sandeen
     
  • This patch adds trace_jbd2_drop_transaction and
    trace_jbd2_update_superblock_end because there are similar tracepoints
    in jbd and they are needed in jbd2 as well.

    Reviewed-by: Lukas Czerner
    Signed-off-by: Seiji Aguchi
    Signed-off-by: "Theodore Ts'o"

    Seiji Aguchi
     

11 Jan, 2012

1 commit


09 Jan, 2012

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

05 Jan, 2012

1 commit

  • Toshiyuki Okajima found out that when running

    for ((i=0; i < 100000; i++)); do
    if ((i%2 == 0)); then
    chattr +j /mnt/file
    else
    chattr -j /mnt/file
    fi
    echo "0" >> /mnt/file
    done

    process sometimes hangs indefinitely in jbd2_journal_lock_updates().

    Toshiyuki identified that the following race happens:

    jbd2_journal_lock_updates() |jbd2_journal_stop()
    ---------------------------------------+---------------------------------------
    write_lock(&journal->j_state_lock) | .
    ++journal->j_barrier_count | .
    spin_lock(&tran->t_handle_lock) | .
    atomic_read(&tran->t_updates) //not 0 |
    | atomic_dec_and_test(&tran->t_updates)
    | // t_updates = 0
    | wake_up(&journal->j_wait_updates)
    prepare_to_wait() | // no process is woken up.
    spin_unlock(&tran->t_handle_lock) |
    write_unlock(&journal->j_state_lock) |
    schedule() // never return |

    We fix the problem by first calling prepare_to_wait() and only after that
    checking t_updates in jbd2_journal_lock_updates().

    Reported-and-analyzed-by: Toshiyuki Okajima
    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

29 Dec, 2011

1 commit

  • Currently, we clear revoked flag only when a block is reused. However,
    this can tigger a false journal error. Consider a situation when a block
    is used as a meta block and is deleted(revoked) in ordered mode, then the
    block is allocated as a data block to a file. At this moment, user changes
    the file's journal mode from ordered to journaled and truncates the file.
    The block will be considered re-revoked by journal because it has revoked
    flag still pending from the last transaction and an assertion triggers.

    We fix the problem by keeping the revoked status more uptodate - we clear
    revoked flag when switching revoke tables to reflect there is no revoked
    buffers in current transaction any more.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     

06 Dec, 2011

1 commit


22 Nov, 2011

1 commit

  • There is no reason to export two functions for entering the
    refrigerator. Calling refrigerator() instead of try_to_freeze()
    doesn't save anything noticeable or removes any race condition.

    * Rename refrigerator() to __refrigerator() and make it return bool
    indicating whether it scheduled out for freezing.

    * Update try_to_freeze() to return bool and relay the return value of
    __refrigerator() if freezing().

    * Convert all refrigerator() users to try_to_freeze().

    * Update documentation accordingly.

    * While at it, add might_sleep() to try_to_freeze().

    Signed-off-by: Tejun Heo
    Cc: Samuel Ortiz
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: Christoph Hellwig

    Tejun Heo