07 Nov, 2008

2 commits

  • Avoid freeing the transaction in __jbd2_journal_drop_transaction() so
    the journal commit callback can run without holding j_list_lock, to
    avoid lock contention on this spinlock.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
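
The idea in the commit above (unlink under the list lock, then run the callback and free once the lock is released) can be sketched in userspace C. This is only an illustration under stated assumptions: `sk_journal`, `sk_txn`, `finish_transaction()` and the other names are hypothetical stand-ins, a pthread mutex plays the role of j_list_lock, and the `list_lock_held` flag is a single-threaded demo aid rather than anything the kernel does.

```c
#include <pthread.h>
#include <stdlib.h>

struct sk_journal;

/* Hypothetical stand-in for a transaction; not the kernel's transaction_t. */
struct sk_txn {
    int tid;
    struct sk_txn *next;
};

/* Hypothetical stand-in for the journal. */
struct sk_journal {
    pthread_mutex_t list_lock;      /* plays the role of j_list_lock */
    int list_lock_held;             /* demo aid: set while list_lock is held */
    struct sk_txn *checkpoint_list; /* singly linked for brevity */
    void (*commit_callback)(struct sk_journal *, struct sk_txn *);
};

static int commits_seen;
static int lock_held_in_callback;

/* Sample callback: records whether the list lock was held when it ran. */
static void record_commit(struct sk_journal *j, struct sk_txn *t)
{
    (void)t;
    commits_seen++;
    lock_held_in_callback |= j->list_lock_held;
}

/* Push a new transaction onto the list. Returns 0, or -1 if out of memory. */
static int add_transaction(struct sk_journal *j, int tid)
{
    struct sk_txn *t = malloc(sizeof(*t));
    if (!t)
        return -1;
    t->tid = tid;
    pthread_mutex_lock(&j->list_lock);
    t->next = j->checkpoint_list;
    j->checkpoint_list = t;
    pthread_mutex_unlock(&j->list_lock);
    return 0;
}

/* Unlink only, never free. The caller holds list_lock and owns the result. */
static struct sk_txn *drop_transaction_locked(struct sk_journal *j, int tid)
{
    struct sk_txn **pp = &j->checkpoint_list;
    struct sk_txn *t;

    while (*pp && (*pp)->tid != tid)
        pp = &(*pp)->next;
    t = *pp;
    if (t) {
        *pp = t->next;
        t->next = NULL;
    }
    return t;
}

/* Returns 1 if the transaction was found and finished, 0 otherwise. */
static int finish_transaction(struct sk_journal *j, int tid)
{
    struct sk_txn *t;

    pthread_mutex_lock(&j->list_lock);
    j->list_lock_held = 1;
    t = drop_transaction_locked(j, tid);
    j->list_lock_held = 0;
    pthread_mutex_unlock(&j->list_lock);

    if (!t)
        return 0;
    /* The lock is released here, so a slow callback does not stall others. */
    if (j->commit_callback)
        j->commit_callback(j, t);
    free(t);
    return 1;
}
```

A caller would initialise the mutex with `PTHREAD_MUTEX_INITIALIZER`, add a few transactions, and call `finish_transaction()`; the sample callback then observes that the lock is not held.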
     
  • Commit 23f8b79e introduced a regression because it assumed that if
    there were no transactions ready to be checkpointed, no progress
    could be made on making space available in the journal, and so the
    journal should be aborted. This assumption is false; it could be the
    case that simply calling jbd2_cleanup_journal_tail() will recover the
    necessary space, or, for small journals, the currently committing
    transaction could be responsible for chewing up the required space in
    the log, so we need to wait for the currently committing transaction
    to finish before trying to force a checkpoint operation.

    This patch fixes a bug reported by Mihai Harpau at:
    https://bugzilla.redhat.com/show_bug.cgi?id=469582

    This patch fixes a bug reported by François Valenduc at:
    http://bugzilla.kernel.org/show_bug.cgi?id=11840

    Signed-off-by: "Theodore Ts'o"
    Cc: Duane Griffin
    Cc: Toshiyuki Okajima

    Theodore Ts'o
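
The fallback order described in the commit above can be modelled as a small decision function. This is a minimal sketch, not the kernel code: `space_model`, `next_space_action()` and the field names are hypothetical, and the ordering (checkpoint first, then tail cleanup, then waiting for the committing transaction, abort last) is an assumption drawn from the commit text.

```c
#include <stdbool.h>

/* What the waiter should do next; names are illustrative only. */
enum space_action {
    SPACE_OK,         /* enough room already, nothing to do */
    DO_CHECKPOINT,    /* transactions are ready to be checkpointed */
    TAIL_CLEANED,     /* cleaning up the journal tail recovered space */
    WAIT_FOR_COMMIT,  /* the committing transaction holds the space */
    ABORT_JOURNAL     /* no way left to make progress */
};

/* Hypothetical, heavily simplified view of the journal state. */
struct space_model {
    int free_blocks;
    int tail_reclaimable;          /* blocks a tail cleanup would recover */
    bool has_checkpoint_txns;
    int committing_tid;            /* 0 means no committing transaction */
};

static enum space_action next_space_action(struct space_model *j, int needed)
{
    if (j->free_blocks >= needed)
        return SPACE_OK;
    if (j->has_checkpoint_txns)
        return DO_CHECKPOINT;
    if (j->tail_reclaimable > 0) {
        /* Models a successful tail cleanup: space comes back for free. */
        j->free_blocks += j->tail_reclaimable;
        j->tail_reclaimable = 0;
        return TAIL_CLEANED;
    }
    if (j->committing_tid != 0)
        return WAIT_FOR_COMMIT;
    /* Only now is aborting justified. */
    return ABORT_JOURNAL;
}
```

The point of the sketch is that an empty checkpoint list alone no longer leads to an abort; the abort branch is reached only after the two other ways of recovering space have been ruled out.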
     

05 Nov, 2008

1 commit


11 Oct, 2008

1 commit

  • When a checkpointing IO fails, current JBD2 code doesn't check the
    error and continues journaling. This means the latest metadata can be
    lost from both the journal and the filesystem.

    This patch leaves the failed metadata blocks in the journal space
    and aborts journaling in the case of jbd2_log_do_checkpoint().
    To achieve this, we need to do the following:

    1. don't remove the failed buffer from the checkpoint list in
    the case of __try_to_free_cp_buf(), because it may be released or
    overwritten by a later transaction
    2. jbd2_log_do_checkpoint() is the last chance: remove the failed
    buffer from the checkpoint list and abort the journal
    3. when checkpointing fails, don't update the journal super block to
    prevent the journaled contents from being cleaned. For safety,
    don't update j_tail and j_tail_sequence either
    4. when checkpointing fails, report this error to the ext4 layer so
    that ext4 doesn't clear the needs_recovery flag; otherwise the
    journaled contents are ignored and cleaned in the recovery phase
    5. if the recovery fails, keep the needs_recovery flag
    6. prevent jbd2_cleanup_journal_tail() from being called between
    __jbd2_journal_drop_transaction() and jbd2_journal_abort()
    (a possible race issue between jbd2_log_do_checkpoint()s called by
    jbd2_journal_flush() and __jbd2_log_wait_for_space())

    Signed-off-by: Hidehiro Kawai
    Signed-off-by: Theodore Ts'o

    Hidehiro Kawai
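
Points 2 to 4 of the commit above can be illustrated with a small model. It is a sketch under stated assumptions: `ckpt_model`, `checkpoint_finished()` and `fs_unmount()` are hypothetical names, and the real code paths in JBD2 and ext4 are considerably more involved.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the journal fields the commit message mentions. */
struct ckpt_model {
    unsigned tail;            /* stands in for j_tail */
    unsigned tail_sequence;   /* stands in for j_tail_sequence */
    bool aborted;
    bool needs_recovery;      /* flag owned by the filesystem layer */
};

/*
 * Called when a checkpoint pass ends. On an IO failure the journal is
 * aborted and the tail is left alone, so the journaled copy of the failed
 * blocks stays reachable. Returns 0 on success or -EIO on failure.
 */
static int checkpoint_finished(struct ckpt_model *j, bool io_failed,
                               unsigned new_tail, unsigned new_sequence)
{
    if (io_failed) {
        j->aborted = true;
        return -EIO;
    }
    j->tail = new_tail;
    j->tail_sequence = new_sequence;
    return 0;
}

/*
 * Filesystem side: clear needs_recovery only when checkpointing reported
 * no error, so that a later mount still replays the journal.
 */
static void fs_unmount(struct ckpt_model *j, int checkpoint_err)
{
    if (checkpoint_err == 0 && !j->aborted)
        j->needs_recovery = false;
}
```

In this model a failed checkpoint leaves `tail`, `tail_sequence` and `needs_recovery` untouched, which is the property the patch is after.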
     

09 Oct, 2008

1 commit

  • The __jbd2_log_wait_for_space function sits in a loop checkpointing
    transactions until there is sufficient space free in the journal.
    However, if there are no transactions to be processed (e.g. because the
    free space calculation is wrong due to a corrupted filesystem) it will
    never progress.

    Check for space being required when no transactions are outstanding and
    abort the journal instead of endlessly looping.

    This patch fixes the bug reported by Sami Liedes at:
    http://bugzilla.kernel.org/show_bug.cgi?id=10976

    Signed-off-by: Duane Griffin
    Cc: Sami Liedes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: "Theodore Ts'o"

    Duane Griffin
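
The endless loop and its fix, as described in the commit above, can be shown with a toy model. This is a sketch only: `log_model` and `wait_for_space()` are hypothetical, and each pending transaction is reduced to the number of blocks that checkpointing it would free.

```c
#define MAX_PENDING 8

/* Hypothetical model: pending[] holds blocks freed by each checkpoint. */
struct log_model {
    int free_blocks;
    int pending[MAX_PENDING];
    int npending;
    int aborted;
};

/*
 * Checkpoint transactions until `needed` blocks are free.
 * Returns 0 on success, -1 if the journal had to be aborted.
 */
static int wait_for_space(struct log_model *log, int needed)
{
    while (log->free_blocks < needed) {
        if (log->npending == 0) {
            /*
             * Space is still required but nothing is left to checkpoint,
             * for example because the free space calculation is wrong on
             * a corrupted filesystem. Without this check the loop would
             * never terminate.
             */
            log->aborted = 1;
            return -1;
        }
        /* Checkpoint the most recently queued transaction in the model. */
        log->free_blocks += log->pending[--log->npending];
    }
    return 0;
}
```

With an empty `pending` array and too little free space the function returns -1 immediately instead of spinning.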
     

06 Oct, 2008

1 commit


12 Jul, 2008

1 commit


30 Jan, 2008

1 commit

  • The break_lock data structure and code for spinlocks is quite nasty.
    Not only does it double the size of a spinlock but it changes locking to
    a potentially less optimal trylock.

    Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a
    __raw_spin_is_contended that uses the lock data itself to determine whether
    there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is
    not set.

    Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to
    decouple it from the spinlock implementation, and make it typesafe (rwlocks
    do not have any need_lockbreak sites -- why do they even get bloated up
    with that break_lock then?).

    Signed-off-by: Nick Piggin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Nick Piggin
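
One way a lock can report waiters from "the lock data itself", without a separate break_lock field, is a ticket lock. The commit message above does not say which lock implementation is meant, so the following is purely an illustrative sketch using C11 atomics; `ticket_lock`, `ticket_is_contended()` and `ticket_needbreak()` are hypothetical names, not the kernel's API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative ticket lock; zero-initialise it before first use. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently being served */
};

/* Take a ticket; the caller owns the lock once the ticket is served. */
static unsigned ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

static bool ticket_is_served(struct ticket_lock *l, unsigned ticket)
{
    return atomic_load(&l->owner) == ticket;
}

static void ticket_acquire(struct ticket_lock *l)
{
    unsigned ticket = ticket_take(l);

    while (!ticket_is_served(l, ticket))
        ;  /* spin until it is our turn */
}

static void ticket_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}

/*
 * More than one outstanding ticket means someone besides the holder is
 * queued. No extra field is needed; unsigned subtraction handles wraparound.
 */
static bool ticket_is_contended(struct ticket_lock *l)
{
    return atomic_load(&l->next) - atomic_load(&l->owner) > 1;
}

/* A long-running holder can poll this to decide whether to drop the lock. */
static bool ticket_needbreak(struct ticket_lock *l)
{
    return ticket_is_contended(l);
}
```

The design point mirrored here is that the "should I break?" helper is built only on the contention query, so it does not depend on how the lock is implemented.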
     

29 Jan, 2008

2 commits

  • The patch below updates the jbd stats patch to 2.6.20/jbd2.
    The initial patch was posted by Alex Tomas in December 2005
    (http://marc.info/?l=linux-ext4&m=113538565128617&w=2).
    It provides statistics via procfs such as transaction lifetime and size.

    Sometimes, when investigating performance problems, I find it useful
    to have stats from jbd about a transaction's lifetime, size, etc. Here
    is a patch for review and, possibly, inclusion.

    for example, stats after creation of 3M files in htree directory:

    [root@bob ~]# cat /proc/fs/jbd/sda/history
    R/C tid wait run lock flush log hndls block inlog ctime write drop close
    R 261 8260 2720 0 0 750 9892 8170 8187
    C 259 750 0 4885 1
    R 262 20 2200 10 0 770 9836 8170 8187
    R 263 30 2200 10 0 3070 9812 8170 8187
    R 264 0 5000 10 0 1340 0 0 0
    C 261 8240 3212 4957 0
    R 265 8260 1470 0 0 4640 9854 8170 8187
    R 266 0 5000 10 0 1460 0 0 0
    C 262 8210 2989 4868 0
    R 267 8230 1490 10 0 4440 9875 8171 8188
    R 268 0 5000 10 0 1260 0 0 0
    C 263 7710 2937 4908 0
    R 269 7730 1470 10 0 3330 9841 8170 8187
    R 270 0 5000 10 0 830 0 0 0
    C 265 8140 3234 4898 0
    C 267 720 0 4849 1
    R 271 8630 2740 20 0 740 9819 8170 8187
    C 269 800 0 4214 1
    R 272 40 2170 10 0 830 9716 8170 8187
    R 273 40 2280 0 0 3530 9799 8170 8187
    R 274 0 5000 10 0 990 0 0 0

    where,

    R - line for transaction's life from T_RUNNING to T_FINISHED
    C - line for transaction's checkpointing
    tid - transaction's id
    wait - for how long we were waiting for new transaction to start
    (the longest period journal_start() took in this transaction)
    run - real transaction's lifetime (from T_RUNNING to T_LOCKED)
    lock - how long we were waiting for all handles to close
    (time the transaction was in T_LOCKED)
    flush - how long it took to flush all data (data=ordered)
    log - how long it took to write the transaction to the log
    hndls - how many handles got to the transaction
    block - how many blocks got to the transaction
    inlog - how many blocks are written to the log (block + descriptors)
    ctime - how long it took to checkpoint the transaction
    write - how many blocks have been written during checkpointing
    drop - how many blocks have been dropped during checkpointing
    close - how many running transactions have been closed to checkpoint this one

    all times are in msec.

    [root@bob ~]# cat /proc/fs/jbd/sda/info
    280 transaction, each upto 8192 blocks
    average:
    1633ms waiting for transaction
    3616ms running transaction
    5ms transaction was being locked
    1ms flushing data (in ordered mode)
    1799ms logging transaction
    11781 handles per transaction
    5629 blocks per transaction
    5641 logged blocks per transaction

    Signed-off-by: Johann Lombardi
    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Mingming Cao
    Signed-off-by: Eric Sandeen

    Johann Lombardi
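
The R and C lines of the history file shown in the commit above can be read with plain `sscanf()`. This is a minimal sketch: the struct and function names are hypothetical, and the field mapping assumes, based on the sample output and its legend, that R lines carry the nine columns from tid to inlog and C lines carry tid followed by ctime, write, drop and close.

```c
#include <stdio.h>

/* One "R" line: a transaction's life from T_RUNNING to T_FINISHED. */
struct run_stats {
    unsigned long tid, wait, run, lock, flush, log;  /* times in msec */
    unsigned long handles, blocks, inlog;
};

/* One "C" line: checkpointing of a transaction. */
struct checkpoint_stats {
    unsigned long tid, ctime;                        /* ctime in msec */
    unsigned long written, dropped, closed;
};

/* Returns 0 if `line` is a complete R line, -1 otherwise. */
static int parse_run_line(const char *line, struct run_stats *s)
{
    int n = sscanf(line, " R %lu %lu %lu %lu %lu %lu %lu %lu %lu",
                   &s->tid, &s->wait, &s->run, &s->lock, &s->flush,
                   &s->log, &s->handles, &s->blocks, &s->inlog);
    return n == 9 ? 0 : -1;
}

/* Returns 0 if `line` is a complete C line, -1 otherwise. */
static int parse_checkpoint_line(const char *line, struct checkpoint_stats *s)
{
    int n = sscanf(line, " C %lu %lu %lu %lu %lu",
                   &s->tid, &s->ctime, &s->written, &s->dropped, &s->closed);
    return n == 5 ? 0 : -1;
}
```

For the sample line `R 261 8260 2720 0 0 750 9892 8170 8187`, `inlog - blocks` gives the 17 extra blocks written to the log beyond the transaction's own blocks, which the legend attributes to descriptors.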
     
  • Before we start committing a transaction, we call
    __journal_clean_checkpoint_list() to clean up the transaction's
    written-back buffers.

    If this call happens to remove all of them (and there were already some
    buffers), __journal_remove_checkpoint() will decide to free the transaction
    because it isn't (yet) a committing transaction and soon we fail some
    assertion - the transaction really isn't ready to be freed :).

    We change the check in __journal_remove_checkpoint() to free only a
    transaction in T_FINISHED state. The locking there is subtle though (as
    everywhere in JBD ;(). We use j_list_lock to protect the check and a
    subsequent call to __journal_drop_transaction() and do the same at the end
    of journal_commit_transaction() which is the only place where a transaction
    can get to T_FINISHED state.

    Probably I'm too paranoid here and such locking is not really necessary -
    checkpoint lists are processed only from log_do_checkpoint() where a
    transaction must be already committed to be processed or from
    __journal_clean_checkpoint_list() where kjournald itself calls it and thus
    transaction cannot change state either. Better be safe if something
    changes in future...

    Signed-off-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton

    Jan Kara
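
The changed check from the commit above can be sketched as a tiny state model. The names `txn_model`, `remove_checkpoint_buffer()` and `finish_commit()` are hypothetical, the state list is abbreviated, and the list lock that the commit message says protects both checks is assumed to be held by the caller rather than modelled.

```c
#include <stdbool.h>

/* Abbreviated, illustrative set of transaction states. */
enum txn_state { T_RUNNING, T_LOCKED, T_COMMIT, T_FINISHED };

struct txn_model {
    enum txn_state state;
    int checkpoint_buffers;   /* buffers still on the checkpoint list */
    bool freed;
};

/*
 * Remove one written-back buffer from the checkpoint list. The transaction
 * is dropped only if the list is now empty AND the transaction has reached
 * T_FINISHED. Returns true if it was dropped.
 */
static bool remove_checkpoint_buffer(struct txn_model *t)
{
    if (t->checkpoint_buffers > 0)
        t->checkpoint_buffers--;
    if (t->checkpoint_buffers > 0)
        return false;
    if (t->state != T_FINISHED)
        return false;   /* the commit has not finished with it yet */
    t->freed = true;
    return true;
}

/*
 * End of commit: the only place where the state becomes T_FINISHED. If the
 * checkpoint list was already emptied earlier, the drop happens here.
 */
static bool finish_commit(struct txn_model *t)
{
    t->state = T_FINISHED;
    if (t->checkpoint_buffers == 0) {
        t->freed = true;
        return true;
    }
    return false;
}
```

In the model, emptying the checkpoint list of a transaction that is not yet finished leaves it alive, and the later `finish_commit()` call is what releases it.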
     

09 May, 2007

1 commit


12 Oct, 2006

2 commits