30 Jan, 2008

1 commit

  • The break_lock data structure and code for spinlocks is quite nasty.
    Not only does it double the size of a spinlock but it changes locking to
    a potentially less optimal trylock.

    Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a
    __raw_spin_is_contended that uses the lock data itself to determine whether
    there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is
    not set.

    Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to
    decouple it from the spinlock implementation, and make it typesafe (rwlocks
    do not have any need_lockbreak sites -- why do they even get bloated up
    with that break_lock then?).

    Signed-off-by: Nick Piggin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Nick Piggin
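
As a rough user-space illustration of a lock that can report contention from its own data, here is a minimal sketch assuming a ticket-style lock built on C11 atomics. The names ticket_lock, ticket_lock_acquire and ticket_lock_is_contended are hypothetical and chosen for this example; this is not the kernel's __raw_spin_is_contended.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical ticket lock: 'next' is the next ticket to hand out,
 * 'owner' is the ticket currently allowed to hold the lock. */
struct ticket_lock {
    atomic_uint next;
    atomic_uint owner;
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned int me = atomic_fetch_add(&l->next, 1);

    while (atomic_load(&l->owner) != me)
        ; /* spin until our ticket comes up */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}

/* More than one outstanding ticket means someone besides the holder
 * is waiting, so no separate break_lock field is needed. */
static bool ticket_lock_is_contended(struct ticket_lock *l)
{
    return atomic_load(&l->next) - atomic_load(&l->owner) > 1;
}
```

A holder running a long loop could poll ticket_lock_is_contended() and briefly drop the lock when it returns true, which is the role spin_needbreak plays in the description above.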
     

18 Jan, 2008

1 commit

  • This likely fixes the oops in __lock_acquire reported as:

    http://www.kerneloops.org/raw.php?rawid=2753&msgid=
    http://www.kerneloops.org/raw.php?rawid=2749&msgid=

    In these reported oopses, start_this_handle is returning -EROFS.

    Signed-off-by: Jonas Bonn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonas Bonn
     

06 Dec, 2007

1 commit

  • Before we start committing a transaction, we call
    __journal_clean_checkpoint_list() to clean up the transaction's
    written-back buffers.

    If this call happens to remove all of them (and there were already some
    buffers), __journal_remove_checkpoint() will decide to free the transaction
    because it isn't (yet) a committing transaction and soon we fail some
    assertion - the transaction really isn't ready to be freed :).

    We change the check in __journal_remove_checkpoint() to free only a
    transaction in T_FINISHED state. The locking there is subtle though (as
    everywhere in JBD ;(). We use j_list_lock to protect the check and a
    subsequent call to __journal_drop_transaction() and do the same in the end
    of journal_commit_transaction() which is the only place where a transaction
    can get to T_FINISHED state.

    Probably I'm too paranoid here and such locking is not really necessary -
    checkpoint lists are processed only from log_do_checkpoint() where a
    transaction must be already committed to be processed or from
    __journal_clean_checkpoint_list() where kjournald itself calls it and thus
    transaction cannot change state either. Better be safe if something
    changes in future...

    Signed-off-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

20 Oct, 2007

4 commits

  • Note from Mingming's JBD2 fix:

    Noticed that all the warnings occur when the debug level is 0. Then found
    that the "jbd2: Move jbd2-debug file to debugfs" patch
    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0f49d5d019afa4e94253bfc92f0daca3badb990b

    changed jbd2_journal_enable_debug from int to u8, which makes the
    jbd_debug comparison always true when the debugging level is 0. Thus
    the compile warning occurs.

    Thought about changing the jbd2_journal_enable_debug data type back to int,
    but can't, because jbd2-debug has moved to debugfs, where calling
    debugfs_create_u8() to create the debugfs entry requires the value to be
    of type u8.

    Even if we changed the data type back to int, the code would still be
    buggy: the kernel should not print jbd2 debug messages if
    jbd2_journal_enable_debug is set to 0, but it does.

    The fix is to change the level of debugging to 1. The same should be fixed
    in ext3/JBD, but currently ext3 jbd-debug via /proc fs is broken, so we
    probably should fix it all together.

    Signed-off-by: Jose R. Santos
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jose R. Santos
     
  • We should really call journal_abort() and not __journal_abort_hard() in
    case of errors. The latter call does not record the error in the journal
    superblock, so the filesystem won't be marked as having errors later (and
    the user could happily mount it without any warning).

    Signed-off-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • The jbd-debug file used to be located in /proc/sys/fs/jbd-debug, but
    create_proc_entry() does not do lookups on file names that are more than
    one directory deep. This causes the entry creation to fail and hence no
    proc file is created.

    Instead of fixing this in procfs, we might as well move the jbd-debug file
    to debugfs, which would be the preferred location for this kind of tunable.
    The new location is now /sys/kernel/debug/jbd/jbd-debug.

    [akpm@linux-foundation.org: zillions of cleanups]
    Signed-off-by: Jose R. Santos
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jose R. Santos
     
  • Convert kmalloc to kzalloc() and get rid of the memset().

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
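
The jbd_debug note in the first commit of this group turns on how an unsigned 8-bit level compares against 0. The following is a minimal sketch of that comparison, assuming the check has the form "print when n <= enable level" as the description implies. The function name debug_would_print is hypothetical and is not taken from the jbd sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the check: a message at level n is printed when
 * n <= the enable knob, which is a u8 after the move to debugfs. */
static bool debug_would_print(uint8_t enable, unsigned int n)
{
    return n <= enable;
}
```

With n equal to 0 the test holds for every possible u8 value, so level-0 messages print even when debugging is set to 0, and a compiler that sees a literal 0 there can warn that the comparison is always true. Moving those messages to level 1 makes them depend on the knob again.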
     

19 Oct, 2007

1 commit

  • Get rid of sparse related warnings from places that use integer as NULL
    pointer.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Stephen Hemminger
    Cc: Andi Kleen
    Cc: Jeff Garzik
    Cc: Matt Mackall
    Cc: Ian Kent
    Cc: Arnd Bergmann
    Cc: Davide Libenzi
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     

18 Oct, 2007

2 commits


17 Oct, 2007

1 commit

  • This patch marks a number of allocations that are either short-lived such as
    network buffers or are reclaimable such as inode allocations. When something
    like updatedb is called, long-lived and unmovable kernel allocations tend to
    be spread throughout the address space which increases fragmentation.

    This patch groups these allocations together as much as possible by adding a
    new MIGRATE_TYPE. The MIGRATE_RECLAIMABLE type is for allocations that can be
    reclaimed on demand, but not moved. i.e. they can be migrated by deleting
    them and re-reading the information from elsewhere.

    Signed-off-by: Mel Gorman
    Cc: Andy Whitcroft
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

12 Oct, 2007

1 commit


20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

17 Jul, 2007

2 commits

  • Replace (n & (n-1)) in the context of power of 2 checks with
    is_power_of_2().

    Signed-off-by: vignesh babu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    vignesh babu
     
  • We also have to check that the second checkpoint list is non-empty before
    dropping the transaction.

    Signed-off-by: Jan Kara
    Cc: Chuck Ebbert
    Cc: Kirill Korotaev
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
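
For the is_power_of_2() conversion in the first commit of this group, here is a small user-space sketch of the idea. The function names open_coded_pow2 and is_power_of_2_sketch are made up for this example; the second follows the usual "non-zero and only one bit set" definition rather than quoting the kernel helper.

```c
#include <stdbool.h>

/* Open-coded test: also reports 0 as a power of two. */
static bool open_coded_pow2(unsigned long n)
{
    return (n & (n - 1)) == 0;
}

/* Helper-style test: rejects 0 explicitly. */
static bool is_power_of_2_sketch(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

The two differ only for n == 0, which is one reason a named helper is easier to audit than the bare bit trick repeated at each call site.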
     

09 May, 2007

3 commits


23 Dec, 2006

1 commit

  • In the current jbd code, if a buffer on the BJ_SyncData list is dirty and
    not locked, the buffer is refiled to the BJ_Locked list, submitted for IO,
    and then waited on until the IO completes.

    But the fsstress test showed a case where a buffer that had already been
    submitted for IO just before the buffer_dirty(bh) check was not waited on
    for IO completion.

    The following patch solves this problem. If a buffer is assumed to have
    been submitted for IO before the buffer_dirty(bh) check and is still being
    written to disk, it is refiled to the BJ_Locked list.

    Signed-off-by: Hisashi Hifumi
    Cc: Jan Kara
    Cc: "Stephen C. Tweedie"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hisashi Hifumi
     

11 Dec, 2006

1 commit

  • This patch introduces a user of the round_jiffies() function: the "5 second"
    ext3/jbd wakeup.

    While "every 5 seconds" doesn't sound like a problem, there can be many of
    these (and these timers do add up over all the kernel). The "5 second" wakeup
    isn't really timing sensitive; in addition, even with rounding it'll still
    happen every 5 seconds (with the exception of the very first time, which is
    likely to be rounded up to somewhere closer to 6 seconds).

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
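
To see why only the first wakeup is stretched, here is a simplified sketch of rounding a timer expiry to a whole second. TICKS_PER_SEC and round_to_second are assumptions for this example; the kernel's round_jiffies() does more than this, so treat it as an illustration of the arithmetic only.

```c
#define TICKS_PER_SEC 1000UL /* assumed tick rate for this sketch */

/* Round expiry j to a whole second: down if it is just past a second
 * boundary, otherwise up. Never return a time that is not after 'now'. */
static unsigned long round_to_second(unsigned long j, unsigned long now)
{
    unsigned long rem = j % TICKS_PER_SEC;
    unsigned long rounded = (rem < TICKS_PER_SEC / 4)
                                ? j - rem
                                : j - rem + TICKS_PER_SEC;

    return rounded > now ? rounded : j;
}
```

Starting at tick 12345, a 5 second timer would expire at 17345 and is rounded to 18000, a little under 6 seconds away. Every later expiry is computed from a whole-second boundary, so it stays at exactly 5 seconds.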
     

08 Dec, 2006

3 commits


29 Oct, 2006

1 commit

  • When running several fsx's and other filesystem stress tests, we found
    cases where an unmapped buffer was still being sent to submit_bh by the
    ext3 dirty data journaling code.

    I saw this happen in two ways, both related to another thread doing a
    truncate which would unmap the buffer in question.

    Either we would get into journal_dirty_data with a bh which was already
    unmapped (although journal_dirty_data_fn had checked for this earlier, the
    state was not locked at that point), or it would get unmapped in the middle
    of journal_dirty_data when we dropped locks to call sync_dirty_buffer.

    By re-checking for mapped state after we've acquired the bh state lock, we
    should avoid these races. If we find a buffer which is no longer mapped,
    we essentially ignore it, because journal_unmap_buffer has already decided
    that this buffer can go away.

    I've also added tracepoints in these two cases, and made a couple other
    tracepoint changes that I found useful in debugging this.

    Signed-off-by: Eric Sandeen
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     

21 Oct, 2006

1 commit

  • After a disk generated an I/O error, I hit
    J_ASSERT(transaction->t_updates > 0) in journal_stop().

    From the stack trace, it seems to have happened on the ext3_truncate()
    path. The following sequence may then trigger
    J_ASSERT(transaction->t_updates > 0):

    ext3_truncate()
      -> ext3_free_branches()
        -> ext3_journal_test_restart()
          -> ext3_journal_restart()
            -> journal_restart()
                 transaction->t_updates--;
                 /* another process aborted journal */
              -> start_this_handle()
                   returns -EROFS without transaction->t_updates++;

      -> ext3_journal_stop()
        -> journal_stop()
             J_ASSERT(transaction->t_updates > 0)

    If the journal is aborted in the middle of journal_restart(),
    ext3_truncate() may trigger the J_ASSERT().

    Signed-off-by: OGAWA Hirofumi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
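
The counter imbalance in the trace above can be reproduced with a toy model. Everything here (toy_txn and the toy_ functions) is hypothetical and only mirrors the t_updates bookkeeping described in the commit message; it is not JBD code and does not show the actual fix.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a transaction: an update counter and an abort flag. */
struct toy_txn {
    int t_updates;
    bool aborted;
};

/* Fails with -EROFS on an aborted journal and does NOT bump the counter. */
static int toy_start_this_handle(struct toy_txn *t)
{
    if (t->aborted)
        return -EROFS;
    t->t_updates++;
    return 0;
}

/* Drops the old reference first, then tries to start again. */
static int toy_restart(struct toy_txn *t, bool abort_races_in)
{
    t->t_updates--;
    if (abort_races_in)
        t->aborted = true; /* another process aborted the journal */
    return toy_start_this_handle(t);
}

/* The condition that the assertion in the stop path checks. */
static bool toy_stop_assert_holds(const struct toy_txn *t)
{
    return t->t_updates > 0;
}
```

When the abort lands between the decrement and the restart, the handle's reference is never re-taken, so the later stop sees a count of 0 and the assertion condition is false.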
     

12 Oct, 2006

1 commit


04 Oct, 2006

1 commit


30 Sep, 2006

2 commits


27 Sep, 2006

6 commits

  • Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • More white space cleanups in preparation for cloning ext4 from ext3.
    Removing spaces that precede a tab.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • These are a few places I've found in jbd that look like they may not be
    16T-safe, or consistent with the use of unsigned longs for block
    containers. Problems here would be somewhat hard to hit, would require
    journal blocks past the 8T boundary, which would not be terribly common.
    Still, should fix.

    (some of these have come from the ext4 work on jbd as well).

    I think there's one more possibility that the wrap() function may not be
    safe IF your last block in the journal butts right up against the 2^32 block
    boundary, but that seems like a VERY remote possibility, and I'm not
    worrying about it at this point.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Stephen Tweedie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Remove whitespace from ext3 and jbd, before we clone ext4.

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • jbd_sync_bh releases journal->j_list_lock. Add a lock annotation to this
    function so that sparse can check callers for lock pairing, and so that
    sparse will not complain about this function since it intentionally uses
    the lock in this manner.

    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     

26 Sep, 2006

1 commit

  • The original commit code assumes that when a buffer on the BJ_SyncData
    list is locked, it is being written to disk. But this is not true, and
    hence it can lead to potential data loss on a crash. The code also didn't
    account for the fact that journal_dirty_data() can steal buffers from the
    committing transaction, and hence it could write buffers that no longer
    belong to the committing transaction. Finally, it could happen that we
    tried writing out one buffer several times.

    The patch below tries to solve these problems by a complete rewrite of the
    data commit code. We go through buffers on t_sync_datalist, lock buffers
    needing write out and store them in an array. Buffers are also immediately
    refiled to BJ_Locked list or unfiled (if the write out is completed). When
    the array is full or we have to block on buffer lock, we submit all
    accumulated buffers for IO.

    [suitable for 2.6.18.x around the 2.6.19-rc2 timeframe]

    Signed-off-by: Jan Kara
    Cc: Badari Pulavarty
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

02 Sep, 2006

1 commit


28 Aug, 2006

1 commit

  • JBD currently allocates commit and frozen buffers from slabs. With
    CONFIG_SLAB_DEBUG, it's possible for an allocation to cross a page
    boundary, causing IO problems.

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=200127

    So, instead of allocating these from the regular slabs, JBD manages
    allocation from its own slabs and disables slab debug for them.

    [akpm@osdl.org: cleanups]
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

28 Jun, 2006

1 commit

  • Localize poison values into one header file for better documentation and
    easier/quicker debugging and so that the same values won't be used for
    multiple purposes.

    Use these constants in core arch., mm, driver, and fs code.

    Signed-off-by: Randy Dunlap
    Acked-by: Matt Mackall
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: "David S. Miller"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

26 Jun, 2006

1 commit