10 Jan, 2012

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     

09 Jan, 2012

1 commit

  • j_barrier mutex is used for serializing different journal lock operations. The
    problem with it is that e.g. FIFREEZE ioctl results in process leaving kernel
    with j_barrier mutex held which makes lockdep freak out. Also hibernation code
    wants to freeze filesystem but it cannot do so because it then cannot hibernate
    the system because of mutex being locked.

    So we remove j_barrier mutex and use direct wait on j_barrier_count instead.
    Since locking journal is a rare operation we don't have to care about fairness
    or such things.

    CC: Andrew Morton
    Acked-by: Joel Becker
    Signed-off-by: Jan Kara

    Jan Kara
     

22 Nov, 2011

1 commit

  • There is no reason to export two functions for entering the
    refrigerator. Calling refrigerator() instead of try_to_freeze()
    doesn't save anything noticeable or removes any race condition.

    * Rename refrigerator() to __refrigerator() and make it return bool
    indicating whether it scheduled out for freezing.

    * Update try_to_freeze() to return bool and relay the return value of
    __refrigerator() if freezing().

    * Convert all refrigerator() users to try_to_freeze().

    * Update documentation accordingly.

    * While at it, add might_sleep() to try_to_freeze().

    Signed-off-by: Tejun Heo
    Cc: Samuel Ortiz
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: Christoph Hellwig

    Tejun Heo
     

02 Nov, 2011

1 commit

  • I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
    mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
    image has s_first = 0 in journal superblock, and the 0 is passed to
    journal->j_head in journal_reset(), then to blocknr in
    cleanup_journal_tail(), in the end the J_ASSERT failed.

    So validate s_first after reading journal superblock from disk in
    journal_get_superblock() to ensure s_first is valid.

    The following script could reproduce it:

    fstype=ext3
    blocksize=1024
    img=$fstype.img
    offset=0
    found=0
    magic="c0 3b 39 98"

    dd if=/dev/zero of=$img bs=1M count=8
    mkfs -t $fstype -b $blocksize -F $img
    filesize=`stat -c %s $img`
    while [ $offset -lt $filesize ]
    do
    if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
    echo "Found journal: $offset"
    found=1
    break
    fi
    offset=`echo "$offset+$blocksize" | bc`
    done

    if [ $found -ne 1 ];then
    echo "Magic \"$magic\" not found"
    exit 1
    fi

    dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1

    mkdir -p ./mnt
    mount -o loop $img ./mnt

    Cc: Jan Kara
    Signed-off-by: Eryu Guan
    Signed-off-by: "Theodore Ts'o"

    Eryu Guan
     

27 Jun, 2011

1 commit

  • journal_remove_journal_head() can oops when trying to access journal_head
    returned by bh2jh(). This is caused for example by the following race:

    TASK1 TASK2
    journal_commit_transaction()
    ...
    processing t_forget list
    __journal_refile_buffer(jh);
    if (!jh->b_transaction) {
    jbd_unlock_bh_state(bh);
    journal_try_to_free_buffers()
    journal_grab_journal_head(bh)
    jbd_lock_bh_state(bh)
    __journal_try_to_free_buffer()
    journal_put_journal_head(jh)
    journal_remove_journal_head(bh);

    journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not
    part of any transaction and thus frees journal_head before TASK1 gets to doing
    so. Note that even buffer_head can be released by try_to_free_buffers() after
    journal_put_journal_head() which adds even larger opportunity for oops (but I
    didn't see this happen in reality).

    Fix the problem by making transactions hold their own journal_head reference
    (in b_jcount). That way we don't have to remove journal_head explicitely via
    journal_remove_journal_head() and instead just remove journal_head when
    b_jcount drops to zero. The result of this is that [__]journal_refile_buffer(),
    [__]journal_unfile_buffer(), and __journal_remove_checkpoint() can free
    journal_head which needs modification of a few callers. Also we have to be
    careful because once journal_head is removed, buffer_head might be freed as
    well. So we have to get our own buffer_head reference where it matters.

    Signed-off-by: Jan Kara

    Jan Kara
     

25 Jun, 2011

1 commit

  • This commit adds fixed tracepoint for jbd. It has been based on fixed
    tracepoints for jbd2, however there are missing those for collecting
    statistics, since I think that it will require more intrusive patch so I
    should have its own commit, if someone decide that it is needed. Also
    there are new tracepoints in __journal_drop_transaction() and
    journal_update_superblock().

    The list of jbd tracepoints:

    jbd_checkpoint
    jbd_start_commit
    jbd_commit_locking
    jbd_commit_flushing
    jbd_commit_logging
    jbd_drop_transaction
    jbd_end_commit
    jbd_do_submit_data
    jbd_cleanup_journal_tail
    jbd_update_superblock_end

    Signed-off-by: Lukas Czerner
    Cc: Jan Kara
    Signed-off-by: Jan Kara

    Lukas Czerner
     

17 May, 2011

1 commit

  • If an application program does not make any changes to the indirect
    blocks or extent tree, i_datasync_tid will not get updated. If there
    are enough commits (i.e., 2**31) such that tid_geq()'s calculations
    wrap, and there isn't a currently active transaction at the time of
    the fdatasync() call, this can end up triggering a BUG_ON in
    fs/jbd/commit.c:

    J_ASSERT(journal->j_running_transaction != NULL);

    It's pretty rare that this can happen, since it requires the use of
    fdatasync() plus *very* frequent and excessive use of fsync(). But
    with the right workload, it can.

    We fix this by replacing the use of tid_geq() with an equality test,
    since there's only one valid transaction id that is valid for us to
    start: namely, the currently running transaction (if it exists).

    CC: stable@kernel.org
    Reported-by: Martin_Zielinski@McAfee.com
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Jan Kara

    Ted Ts'o
     

31 Mar, 2011

1 commit


01 Mar, 2011

1 commit


28 Oct, 2010

4 commits


18 Aug, 2010

1 commit

  • These flags aren't real I/O types, but tell ll_rw_block to always
    lock the buffer instead of giving up on a failed trylock.

    Instead add a new write_dirty_buffer helper that implements this semantic
    and use it from the existing SWRITE* callers. Note that the ll_rw_block
    code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
    this patch fixes.

    In the ufs code clean up the helper that used to call ll_rw_block
    to mirror sync_dirty_buffer, which is the function it implements for
    compound buffers.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

21 Jul, 2010

1 commit


22 May, 2010

2 commits


23 Dec, 2009

1 commit


12 Nov, 2009

1 commit


11 Nov, 2009

1 commit


16 Sep, 2009

1 commit


21 Jul, 2009

1 commit

  • The function journal_write_metadata_buffer() calls jbd_unlock_bh_state(bh_in)
    too early; this could potentially allow another thread to call get_write_access
    on the buffer head, modify the data, and dirty it, and allowing the wrong data
    to be written into the journal. Fortunately, if we lose this race, the only
    time this will actually cause filesystem corruption is if there is a system
    crash or other unclean shutdown of the system before the next commit can take
    place.

    Signed-off-by: dingdinghua
    Acked-by: "Theodore Ts'o"
    Signed-off-by: Jan Kara

    dingdinghua
     

16 Jul, 2009

1 commit


03 Apr, 2009

1 commit


12 Feb, 2009

1 commit

  • journal_start_commit() returns 1 if either a transaction is committing or
    the function has queued a transaction commit. But it returns 0 if we
    raced with somebody queueing the transaction commit as well. This
    resulted in ext3_sync_fs() not functioning correctly (description from
    Arthur Jones): In the case of a data=ordered umount with pending long
    symlinks which are delayed due to a long list of other I/O on the backing
    block device, this causes the buffer associated with the long symlinks to
    not be moved to the inode dirty list in the second phase of fsync_super.
    Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT
    flag and the dirty pages are never written to the backing block device,
    causing long symlink corruption and exposing new or previously freed block
    data to userspace.

    This can be reproduced with a script created by Eric Sandeen
    :

    #!/bin/bash

    umount /mnt/test2
    mount /dev/sdb4 /mnt/test2
    rm -f /mnt/test2/*
    dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
    touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
    ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
    /mnt/test2/link
    umount /mnt/test2
    mount /dev/sdb4 /mnt/test2
    ls /mnt/test2/

    This patch fixes journal_start_commit() to always return 1 when there's
    a transaction committing or queued for commit.

    Cc: Eric Sandeen
    Cc: Mike Snitzer
    Cc:
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

23 Oct, 2008

1 commit

  • When a checkpointing IO fails, current JBD code doesn't check the error
    and continue journaling. This means latest metadata can be lost from both
    the journal and filesystem.

    This patch leaves the failed metadata blocks in the journal space and
    aborts journaling in the case of log_do_checkpoint(). To achieve this, we
    need to do:

    1. don't remove the failed buffer from the checkpoint list where in
    the case of __try_to_free_cp_buf() because it may be released or
    overwritten by a later transaction
    2. log_do_checkpoint() is the last chance, remove the failed buffer
    from the checkpoint list and abort the journal
    3. when checkpointing fails, don't update the journal super block to
    prevent the journaled contents from being cleaned. For safety,
    don't update j_tail and j_tail_sequence either
    4. when checkpointing fails, notify this error to the ext3 layer so
    that ext3 don't clear the needs_recovery flag, otherwise the
    journaled contents are ignored and cleaned in the recovery phase
    5. if the recovery fails, keep the needs_recovery flag
    6. prevent cleanup_journal_tail() from being called between
    __journal_drop_transaction() and journal_abort() (a race issue
    between journal_flush() and __log_wait_for_space()

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

26 Jul, 2008

2 commits

  • Remove the unused EXPORT_SYMBOL(journal_update_superblock).

    Signed-off-by: Adrian Bunk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • If an error occurs during jbd cache initialisation it is possible for the
    journal_head_cache to be NULL when journal_destroy_journal_head_cache is
    called. Replace the J_ASSERT with an if block to handle the situation
    correctly.

    Note that even with this fix things will break badly if jbd is statically
    compiled in and cache initialisation fails.

    Signed-off-by: Duane Griffin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     

28 Apr, 2008

1 commit


31 Mar, 2008

1 commit


20 Mar, 2008

1 commit


07 Feb, 2008

1 commit


20 Oct, 2007

2 commits

  • The jbd-debug file used to be located in /proc/sys/fs/jbd-debug, but
    create_proc_entry() does not do lookups on file names that are more that
    one directory deep. This causes the entry creation to fail and hence, no
    proc file is created.

    Instead of fixing this on procfs might as well move the jbd2-debug file to
    debugfs which would be the preferred location for this kind of tunable.
    The new location is now /sys/kernel/debug/jbd/jbd-debug.

    [akpm@linux-foundation.org: zillions of cleanups]
    Signed-off-by: Jose R. Santos
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jose R. Santos
     
  • Convert kmalloc to kzalloc() and get rid of the memset().

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     

19 Oct, 2007

1 commit

  • Get rid of sparse related warnings from places that use integer as NULL
    pointer.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Stephen Hemminger
    Cc: Andi Kleen
    Cc: Jeff Garzik
    Cc: Matt Mackall
    Cc: Ian Kent
    Cc: Arnd Bergmann
    Cc: Davide Libenzi
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     

18 Oct, 2007

2 commits


17 Oct, 2007

1 commit

  • This patch marks a number of allocations that are either short-lived such as
    network buffers or are reclaimable such as inode allocations. When something
    like updatedb is called, long-lived and unmovable kernel allocations tend to
    be spread throughout the address space which increases fragmentation.

    This patch groups these allocations together as much as possible by adding a
    new MIGRATE_TYPE. The MIGRATE_RECLAIMABLE type is for allocations that can be
    reclaimed on demand, but not moved. i.e. they can be migrated by deleting
    them and re-reading the information from elsewhere.

    Signed-off-by: Mel Gorman
    Cc: Andy Whitcroft
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

09 May, 2007

1 commit

  • If the thread failed to create the subsequent wait_event will hang forever.

    This is likely to happen if kernel hits max_threads limit.

    Will be critical for virtualization systems that limit the number of tasks
    and kernel memory usage within the container.

    (akpm: JBD should be converted fully to the kthread API: kthread_should_stop()
    and kthread_stop()).

    Cc:
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov