11 Jan, 2012

1 commit


09 Jan, 2012

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

05 Jan, 2012

1 commit

  • Toshiyuki Okajima found out that when running

    for ((i=0; i < 100000; i++)); do
        if ((i%2 == 0)); then
            chattr +j /mnt/file
        else
            chattr -j /mnt/file
        fi
        echo "0" >> /mnt/file
    done

    the process sometimes hangs indefinitely in jbd2_journal_lock_updates().

    Toshiyuki identified that the following race happens:

    jbd2_journal_lock_updates()            | jbd2_journal_stop()
    ---------------------------------------+---------------------------------------
    write_lock(&journal->j_state_lock)     |    .
    ++journal->j_barrier_count             |    .
    spin_lock(&tran->t_handle_lock)        |    .
    atomic_read(&tran->t_updates) //not 0  |
                                           | atomic_dec_and_test(&tran->t_updates)
                                           |    // t_updates = 0
                                           | wake_up(&journal->j_wait_updates)
    prepare_to_wait()                      |    // no process is woken up.
    spin_unlock(&tran->t_handle_lock)      |
    write_unlock(&journal->j_state_lock)   |
    schedule() // never return             |

    We fix the problem by first calling prepare_to_wait() and only after that
    checking t_updates in jbd2_journal_lock_updates().
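
    The ordering problem can be illustrated with a small single-threaded
    model in C. This is a sketch under stated assumptions, not the jbd2
    code: struct wait_model and the model_* and hangs_* names are
    hypothetical, and the only property borrowed from the kernel is that a
    wake-up reaches only tasks already queued by prepare_to_wait().

```c
#include <stdbool.h>

/* Hypothetical model of one transaction and one potential sleeper. */
struct wait_model {
    int  t_updates;     /* outstanding handles on the transaction */
    bool waiter_queued; /* set once the locker has called prepare_to_wait() */
    bool woken;         /* set when a wake-up reaches the queued locker */
};

/* A wake-up only reaches tasks that are already on the wait queue. */
static void model_wake_up(struct wait_model *m)
{
    if (m->waiter_queued)
        m->woken = true;
}

/* Models jbd2_journal_stop(): drop one handle, wake waiters on zero. */
static void model_journal_stop(struct wait_model *m)
{
    if (--m->t_updates == 0)
        model_wake_up(m);
}

/* Old order: check t_updates, then queue. Returns true if the locker hangs. */
bool hangs_check_then_queue(void)
{
    struct wait_model m = { .t_updates = 1 };
    bool must_wait = (m.t_updates != 0); /* sees "not 0" */

    model_journal_stop(&m);              /* wake-up finds nobody queued */
    m.waiter_queued = true;              /* prepare_to_wait() comes too late */
    return must_wait && !m.woken;        /* schedule() with no wake-up pending */
}

/* Fixed order: queue first, then check. Returns true if the locker hangs. */
bool hangs_queue_then_check(void)
{
    struct wait_model m = { .t_updates = 1 };
    bool must_wait;

    m.waiter_queued = true;              /* prepare_to_wait() first */
    model_journal_stop(&m);              /* same interleaving as above */
    must_wait = (m.t_updates != 0);      /* now sees 0, so no sleep is needed */
    return must_wait && !m.woken;
}
```

    In this model the old order reports a hang for the interleaving shown
    in the diagram, while the fixed order does not: either the check sees
    zero, or the wake-up finds the already queued waiter.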

    Reported-and-analyzed-by: Toshiyuki Okajima
    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

29 Dec, 2011

1 commit

  • Currently, we clear the revoked flag only when a block is reused. However,
    this can trigger a false journal error. Consider a situation where a block
    is used as a metadata block and is deleted (revoked) in ordered mode, then
    the block is allocated as a data block to a file. At this moment, the user
    changes the file's journal mode from ordered to journaled and truncates the
    file. The block will be considered re-revoked by the journal because it
    still has the revoked flag pending from the last transaction, and an
    assertion triggers.

    We fix the problem by keeping the revoked status more up to date: we clear
    the revoked flag when switching revoke tables to reflect that there are no
    revoked buffers in the current transaction any more.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"

    Yongqiang Yang
     

06 Dec, 2011

1 commit


22 Nov, 2011

1 commit

  • There is no reason to export two functions for entering the
    refrigerator. Calling refrigerator() instead of try_to_freeze()
    doesn't save anything noticeable or remove any race condition.

    * Rename refrigerator() to __refrigerator() and make it return bool
    indicating whether it scheduled out for freezing.

    * Update try_to_freeze() to return bool and relay the return value of
    __refrigerator() if freezing().

    * Convert all refrigerator() users to try_to_freeze().

    * Update documentation accordingly.

    * While at it, add might_sleep() to try_to_freeze().

    Signed-off-by: Tejun Heo
    Cc: Samuel Ortiz
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: Christoph Hellwig

    Tejun Heo
     

02 Nov, 2011

2 commits

  • Some jbd2 code prints out kernel messages with "JBD2: " prefix, at the
    same time other jbd2 code prints with "JBD: " prefix. Unify the prefix
    to "JBD2: ".

    Signed-off-by: Eryu Guan
    Signed-off-by: "Theodore Ts'o"

    Eryu Guan
     
  • I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
    mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
    image has s_first = 0 in the journal superblock. The 0 is passed to
    journal->j_head in journal_reset(), then to blocknr in
    cleanup_journal_tail(), where the J_ASSERT finally fails.

    So validate s_first after reading journal superblock from disk in
    journal_get_superblock() to ensure s_first is valid.
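
    A minimal sketch of such a check, written as standalone C rather than
    the actual journal_get_superblock() change. The helper names and the
    upper bound against the journal length are assumptions made for
    illustration; the field offset follows the on-disk journal superblock
    layout, where a 12-byte header is followed by the big-endian
    s_blocksize, s_maxlen and s_first fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SB_S_FIRST_OFFSET 20 /* 12-byte header + s_blocksize + s_maxlen */

/* Decode a big-endian 32-bit field from a raw superblock buffer. */
uint32_t read_be32(const unsigned char *buf, size_t off)
{
    return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
           ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/*
 * Hypothetical validation helper: reject s_first == 0 and, as an assumed
 * extra bound, any s_first at or beyond the end of the journal.
 */
bool journal_first_block_valid(const unsigned char *sb, uint32_t journal_blocks)
{
    uint32_t s_first = read_be32(sb, SB_S_FIRST_OFFSET);

    return s_first != 0 && s_first < journal_blocks;
}
```

    Byte 23 of the superblock is the low byte of s_first, so zeroing it
    turns an s_first of 1 into 0, which a check of this kind rejects.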

    The following script could reproduce it:

    fstype=ext3
    blocksize=1024
    img=$fstype.img
    offset=0
    found=0
    magic="c0 3b 39 98"

    dd if=/dev/zero of=$img bs=1M count=8
    mkfs -t $fstype -b $blocksize -F $img
    filesize=`stat -c %s $img`
    while [ $offset -lt $filesize ]
    do
        if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
            echo "Found journal: $offset"
            found=1
            break
        fi
        offset=`echo "$offset+$blocksize" | bc`
    done

    if [ $found -ne 1 ];then
        echo "Magic \"$magic\" not found"
        exit 1
    fi

    dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1

    mkdir -p ./mnt
    mount -o loop $img ./mnt

    Cc: Jan Kara
    Signed-off-by: Eryu Guan
    Signed-off-by: "Theodore Ts'o"

    Eryu Guan
     

27 Oct, 2011

1 commit

  • Fix build error when CONFIG_BUG is not enabled:

    fs/jbd2/transaction.c:1175:3: error: implicit declaration of function '__WARN'

    by changing __WARN() to WARN_ON(), as suggested by
    Arnaud Lacombe.

    Signed-off-by: Randy Dunlap
    Signed-off-by: "Theodore Ts'o"
    Cc: Arnd Bergmann
    Cc: Arnaud Lacombe

    Randy Dunlap
     

04 Sep, 2011

2 commits

  • This silences some Sparse warnings:
    fs/jbd2/transaction.c:135:69: warning: incorrect type in argument 2 (different base types)
    fs/jbd2/transaction.c:135:69: expected restricted gfp_t [usertype] flags
    fs/jbd2/transaction.c:135:69: got int [signed] gfp_mask

    Signed-off-by: Dan Carpenter
    Signed-off-by: "Theodore Ts'o"

    Dan Carpenter
     
  • Add debugging information in case jbd2_journal_dirty_metadata() is
    called with a buffer_head which didn't have
    jbd2_journal_get_write_access() called on it, or if the journal_head
    has the wrong transaction in it. In addition, return an error code.
    This won't change anything for ocfs2, which will BUG_ON() the non-zero
    exit code.

    For ext4, the caller of this function is ext4_handle_dirty_metadata(),
    and on seeing a non-zero return code, will call __ext4_journal_stop(),
    which will print the function and line number of the (buggy) calling
    function and abort the journal. This will allow us to recover instead
    of bug halting, which is better from a robustness and reliability
    point of view.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

11 Jul, 2011

1 commit


28 Jun, 2011

1 commit

  • In journal checkpoint, we write the buffer and wait for the write to
    finish. But in cfq, the async queue has a very low priority, and in our
    test, if there are too many sync queues and every queue is filled up with
    requests, the write request will be delayed for quite a long time and
    all the tasks which are waiting for journal space will end up with errors
    like:

    INFO: task attr_set:3816 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    attr_set D ffff880028393480 0 3816 1 0x00000000
    ffff8802073fbae8 0000000000000086 ffff8802140847c8 ffff8800283934e8
    ffff8802073fb9d8 ffffffff8103e456 ffff8802140847b8 ffff8801ed728080
    ffff8801db4bc080 ffff8801ed728450 ffff880028393480 0000000000000002
    Call Trace:
    ? __dequeue_entity+0x33/0x38
    ? need_resched+0x23/0x2d
    ? thread_return+0xa2/0xbc
    ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
    ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
    __mutex_lock_common+0x14e/0x1a9
    ? brelse+0x13/0x15 [ext4]
    __mutex_lock_slowpath+0x19/0x1b
    mutex_lock+0x1b/0x32
    __jbd2_journal_insert_checkpoint+0xe3/0x20c [jbd2]
    start_this_handle+0x438/0x527 [jbd2]
    ? autoremove_wake_function+0x0/0x3e
    jbd2_journal_start+0xa1/0xcc [jbd2]
    ext4_journal_start_sb+0x57/0x81 [ext4]
    ext4_xattr_set+0x6c/0xe3 [ext4]
    ext4_xattr_user_set+0x42/0x4b [ext4]
    generic_setxattr+0x6b/0x76
    __vfs_setxattr_noperm+0x47/0xc0
    vfs_setxattr+0x7f/0x9a
    setxattr+0xb5/0xe8
    ? do_filp_open+0x571/0xa6e
    sys_fsetxattr+0x6b/0x91
    system_call_fastpath+0x16/0x1b

    So this patch uses WRITE_SYNC in __flush_batch so that the request will
    be moved into the sync queue and handled by cfq in a timely manner. We also
    use the new plug, so that all the WRITE_SYNC requests can be submitted as a
    whole when we unplug it.

    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"
    Cc: Jan Kara
    Reported-by: Robin Dong

    Tao Ma
     

14 Jun, 2011

1 commit

  • jbd2_journal_remove_journal_head() can oops when trying to access
    journal_head returned by bh2jh(). This is caused for example by the
    following race:

    TASK1                                   TASK2
    jbd2_journal_commit_transaction()
      ...
      processing t_forget list
        __jbd2_journal_refile_buffer(jh);
        if (!jh->b_transaction) {
          jbd_unlock_bh_state(bh);
                                            jbd2_journal_try_to_free_buffers()
                                              jbd2_journal_grab_journal_head(bh)
                                              jbd_lock_bh_state(bh)
                                              __journal_try_to_free_buffer()
                                              jbd2_journal_put_journal_head(jh)
          jbd2_journal_remove_journal_head(bh);

    jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and
    buffer is not part of any transaction and thus frees journal_head
    before TASK1 gets to doing so. Note that even buffer_head can be
    released by try_to_free_buffers() after
    jbd2_journal_put_journal_head() which adds even larger opportunity for
    oops (but I didn't see this happen in reality).

    Fix the problem by making transactions hold their own journal_head
    reference (in b_jcount). That way we don't have to remove journal_head
    explicitly via jbd2_journal_remove_journal_head() and instead just
    remove journal_head when b_jcount drops to zero. The result of this is
    that [__]jbd2_journal_refile_buffer(),
    [__]jbd2_journal_unfile_buffer(), and
    __jbd2_journal_remove_checkpoint() can free journal_head, which needs
    modification of a few callers. Also we have to be careful because once
    journal_head is removed, buffer_head might be freed as well. So we
    have to get our own buffer_head reference where it matters.
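
    The reference-counting idea can be sketched in a few lines of
    standalone C. This is a simplified model, not the jbd2 implementation:
    struct jh_model and the jh_* helpers are hypothetical, locking is
    omitted, and "freeing" is only recorded in a flag so that the outcome
    can be inspected.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a journal_head with its b_jcount counter. */
struct jh_model {
    int  b_jcount; /* number of holders, including the owning transaction */
    bool freed;    /* records that the last put released the object */
};

/* Take a reference, as a transaction or a temporary user would. */
void jh_get(struct jh_model *jh)
{
    jh->b_jcount++;
}

/* Drop a reference; the object is released only by the final put. */
void jh_put(struct jh_model *jh)
{
    if (--jh->b_jcount == 0)
        jh->freed = true;
}
```

    While the transaction holds its own reference, a grab/put pair in
    another task cannot bring the count to zero, so the object stays valid
    until the transaction itself drops it.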

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

13 Jun, 2011

1 commit


27 May, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
    jbd2: Add MAINTAINERS entry
    jbd2: fix a potential leak of a journal_head on an error path
    ext4: teach ext4_ext_split to calculate extents efficiently
    ext4: Convert ext4 to new truncate calling convention
    ext4: do not normalize block requests from fallocate()
    ext4: enable "punch hole" functionality
    ext4: add "punch hole" flag to ext4_map_blocks()
    ext4: punch out extents
    ext4: add new function ext4_block_zero_page_range()
    ext4: add flag to ext4_has_free_blocks
    ext4: reserve inodes and feature code for 'quota' feature
    ext4: add support for multiple mount protection
    ext4: ensure f_bfree returned by ext4_statfs() is non-negative
    ext4: protect bb_first_free in ext4_trim_all_free() with group lock
    ext4: only load buddy bitmap in ext4_trim_fs() when it is needed
    jbd2: Fix comment to match the code in jbd2__journal_start()
    ext4: fix waiting and sending of a barrier in ext4_sync_file()
    jbd2: Add function jbd2_trans_will_send_data_barrier()
    jbd2: fix sending of data flush on journal commit
    ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
    ...

    Linus Torvalds
     

26 May, 2011

1 commit


25 May, 2011

1 commit


24 May, 2011

2 commits

  • Provide a function which returns whether a transaction with given tid
    will send a flush to the filesystem device. The function will be used
    by ext4 to detect whether fsync needs to send a separate flush or not.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • In data=ordered mode, it's theoretically possible (however rare) that
    an inode is filed to transaction's t_inode_list and a flusher thread
    writes all the data and inode is reclaimed before the transaction
    starts to commit. In such a case, we could erroneously omit sending a
    flush to file system device when it is different from the journal
    device (because data can still be in disk cache only).

    Fix the problem by setting a flag in a transaction when some inode is added
    to it and then send disk flush in the commit code when the flag is set.
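
    A minimal model of the flag-based decision, in standalone C. The
    struct and function names here are hypothetical and devices are
    reduced to integer identifiers; the sketch only shows that the
    commit-time decision depends on a flag set when an inode was filed,
    not on whether the inode is still on the list at commit time.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of a transaction. */
struct txn_model {
    bool need_data_flush; /* set when any inode is filed to the transaction */
};

/* Hypothetical, reduced view of a journal and its devices. */
struct journal_model {
    int fs_dev;      /* identifier of the file system device */
    int journal_dev; /* identifier of the journal device */
};

/* Called when an inode is added to the transaction's inode list. */
void txn_file_inode(struct txn_model *t)
{
    t->need_data_flush = true;
}

/*
 * Commit-time decision: flush the file system device if data may have been
 * written for this transaction and the journal lives on a different device.
 */
bool commit_needs_fs_flush(const struct txn_model *t,
                           const struct journal_model *j)
{
    return t->need_data_flush && j->fs_dev != j->journal_dev;
}
```

    Because the flag is never cleared when an inode is reclaimed, the
    flush is still sent in the rare case described above.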

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     

23 May, 2011

1 commit

  • t_max_wait was added in commit 8e85fb3f to indicate how long we
    were waiting for a new transaction to start. In commit 6d0bf005,
    its calculation was moved to another function named update_t_max_wait
    to avoid a build warning. The problem is that 'ts' was originally
    initialized at the start of start_this_handle, so t_max_wait could be
    calculated correctly, whereas with that change ts is initialized
    within update_t_max_wait and t_max_wait can never be calculated
    correctly.

    This patch moves the initialization of ts back to the beginning
    of start_this_handle and passes it to update_t_max_wait so
    that t_max_wait is calculated correctly and the build warning is still
    avoided.
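
    The shape of the fix can be sketched in standalone C. The names below
    are hypothetical and time is an arbitrary tick count rather than the
    kernel clock; the point is only that the start time is sampled by the
    caller before waiting and handed to the helper.

```c
/* Hypothetical statistics holder; times are in arbitrary clock ticks. */
struct wait_stats {
    unsigned long max_wait;
};

/*
 * Record the longest wait seen so far. The start time is sampled by the
 * caller before it begins waiting and is passed in, so the difference
 * covers the whole wait. Sampling it inside this helper would always
 * yield a wait of roughly zero.
 */
void update_max_wait(struct wait_stats *st, unsigned long start,
                     unsigned long now)
{
    unsigned long waited = now - start;

    if (waited > st->max_wait)
        st->max_wait = waited;
}
```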

    Cc: Jan Kara
    Signed-off-by: Tao Ma
    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Eric Sandeen

    Tao Ma
     

17 May, 2011

1 commit


09 May, 2011

2 commits


02 May, 2011

1 commit

  • If an application program does not make any changes to the indirect
    blocks or extent tree, i_datasync_tid will not get updated. If there
    are enough commits (i.e., 2**31) such that tid_geq()'s calculations
    wrap, and there isn't a currently active transaction at the time of
    the fdatasync() call, this can end up triggering a BUG_ON in
    fs/jbd2/commit.c:

    J_ASSERT(journal->j_running_transaction != NULL);

    It's pretty rare that this can happen, since it requires the use of
    fdatasync() plus *very* frequent and excessive use of fsync(). But
    with the right workload, it can.

    We fix this by replacing the use of tid_geq() with an equality test,
    since there's only one transaction id that is valid for us to
    wait on until it is committed: namely, the currently running transaction
    (if it exists).
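
    The wraparound can be reproduced in a few lines of standalone C. This
    is a sketch, not the kernel source: tid_model_t, tid_geq_model() and
    tid_is_running() are hypothetical names that mirror the signed
    difference comparison and the equality test described above.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tid_model_t; /* stand-in for a 32-bit transaction id */

/*
 * Wraparound-aware "x is at or after y". The unsigned difference is
 * reinterpreted as signed (two's complement assumed), so the answer is
 * only meaningful while the ids are less than 2**31 apart.
 */
bool tid_geq_model(tid_model_t x, tid_model_t y)
{
    int32_t difference = (int32_t)(x - y);

    return difference >= 0;
}

/* The replacement test: only the running transaction's id matches. */
bool tid_is_running(tid_model_t target, tid_model_t running_tid)
{
    return target == running_tid;
}
```

    With a stale id that is more than 2**31 commits behind the current
    one, tid_geq_model() reports the stale id as being at or after the
    current id, while the equality test simply does not match.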

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

12 Apr, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix data corruption regression by reverting commit 6de9843dab3f
    ext4: Allow indirect-block file to grow the file size to max file size
    ext4: allow an active handle to be started when freezing
    ext4: sync the directory inode in ext4_sync_parent()
    ext4: init timer earlier to avoid a kernel panic in __save_error_info
    jbd2: fix potential memory leak on transaction commit
    ext4: fix a double free in ext4_register_li_request
    ext4: fix credits computing for indirect mapped files
    ext4: remove unnecessary [cm]time update of quota file
    jbd2: move bdget out of critical section

    Linus Torvalds
     

06 Apr, 2011

1 commit

  • There is a potential memory leak of a journal head in
    jbd2_journal_commit_transaction(). The problem is that JBD2 will not
    reclaim the journal head of the commit record if an error occurs or the
    journal is aborted.

    I use the following script to reproduce this issue, on a RHEL6
    system. I found it very easy to reproduce with async commit enabled.

    mount /dev/sdb /mnt -o journal_checksum,journal_async_commit
    touch /mnt/xxx
    echo offline > /sys/block/sdb/device/state
    sync
    umount /mnt
    rmmod ext4
    rmmod jbd2

    Removal of the jbd2 module will make slab complain that
    "cache `jbd2_journal_head': can't free all objects".

    Signed-off-by: Zhang Huan
    Signed-off-by: "Theodore Ts'o"

    Zhang Huan
     

05 Apr, 2011

1 commit


31 Mar, 2011

1 commit


25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     

17 Mar, 2011

1 commit


10 Mar, 2011

1 commit

  • With the plugging now being explicitly controlled by the
    submitter, callers need not pass down unplugging hints
    to the block layer. If they want to unplug, it's because they
    manually plugged on their own - in which case, they should just
    unplug at will.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

01 Mar, 2011

1 commit


12 Feb, 2011

1 commit

  • On an SMP ARM system running ext4, I've received a report that the
    first J_ASSERT in jbd2_journal_commit_transaction has been triggering:

    J_ASSERT(journal->j_running_transaction != NULL);

    While investigating possible causes for this problem, I noticed that
    __jbd2_log_start_commit() is getting called with j_state_lock only
    read-locked, in spite of the fact that it might modify
    j_commit_request. Fix this by grabbing the necessary information so
    we can test to see if we need to start a new transaction before
    dropping the read lock, and then calling jbd2_log_start_commit() which
    will grab the write lock.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

14 Jan, 2011

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     

11 Jan, 2011

1 commit


23 Dec, 2010

1 commit


19 Dec, 2010

3 commits