09 Jan, 2009

2 commits

  • Remove excess kernel-doc from fs/jbd/transaction.c:

    Warning(linux-2.6.28-git5//fs/jbd/transaction.c:764): Excess function parameter 'credits' description in 'journal_get_write_access'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • There is a flaw with the way jbd handles fsync batching. If we fsync() a
    file and we were not the last person to run fsync() on this fs then we
    automatically sleep for 1 jiffie in order to wait for new writers to join
    into the transaction before forcing the commit. The problem with this is
    that with really fast storage (i.e. a Clariion) the time it takes to
    commit a transaction to disk is in most cases far shorter than 1 jiffie,
    so sleeping means waiting longer with nothing to do than if we had just
    committed the transaction and kept going. Ric Wheeler noticed this when using
    fs_mark with more than 1 thread, the throughput would plummet as he added
    more threads.

    This patch attempts to fix this problem by recording the average time in
    nanoseconds that it takes to commit a transaction to disk, and what time
    we started the transaction. If we run an fsync() and we have been running
    for less time than it takes to commit the transaction to disk, we sleep
    for the delta amount of time and then commit to disk. We achieve
    sub-jiffie sleeping using schedule_hrtimeout. This means that the wait
    time is auto-tuned to the speed of the underlying disk, instead of having
    this static timeout. I weighted the average according to somebody's
    comments (Andreas Dilger I think) in order to help normalize random
    outliers where we take way longer or way less time to commit than the
    average. I also have a min() check in there to make sure we don't sleep
    longer than a jiffie in case our storage is super slow, this was requested
    by Andrew.
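    The auto-tuning described above can be sketched as a small userspace
    model. The 3/4 weighting, the helper names, and the HZ=250 jiffie
    length are assumptions for illustration, not the literal jbd code:

```c
#include <stdint.h>

/* Hypothetical userspace model of the auto-tuned fsync batching delay.
 * The weighting scheme and names are illustrative assumptions. */

#define NSEC_PER_JIFFIE 4000000ULL /* 4 ms, assuming HZ=250 */

/* Weighted average: smooth out random outlier commit times. */
static inline uint64_t update_average_commit_time(uint64_t avg, uint64_t last)
{
    return (last + 3 * avg) / 4;
}

/* How long should this fsync() sleep to wait for other writers to join?
 * Sleep only for the remainder of an average commit, and never longer
 * than one jiffie in case the storage is super slow. */
static inline uint64_t fsync_batch_delay(uint64_t avg_commit_ns,
                                         uint64_t running_ns)
{
    if (running_ns >= avg_commit_ns)
        return 0; /* already ran longer than a commit takes: don't sleep */
    uint64_t delta = avg_commit_ns - running_ns;
    return delta < NSEC_PER_JIFFIE ? delta : NSEC_PER_JIFFIE;
}
```

    In the real patch the sub-jiffie sleep itself is done with
    schedule_hrtimeout().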

    I unfortunately do not have access to a Clariion, so I had to use a
    ramdisk to represent a super fast array. I tested with a SATA drive with
    barrier=1 to make sure there was no regression with local disks, I tested
    with a 4 way multipathed Apple Xserve RAID array and of course the
    ramdisk. I ran the following command

    fs_mark -d /mnt/ext3-test -s 4096 -n 2000 -D 64 -t $i

    where $i was 2, 4, 8, 16 and 32. I mkfs'ed the fs each time. Here are my
    results

    type     threads   with patch   without patch
    sata           2         24.6            26.3
    sata           4         49.2            48.1
    sata           8         70.1            67.0
    sata          16        104.0            94.1
    sata          32        153.6           142.7

    xserve         2        246.4           222.0
    xserve         4        480.0           440.8
    xserve         8        829.5           730.8
    xserve        16       1172.7          1026.9
    xserve        32       1816.3          1650.5

    ramdisk        2       2538.3          1745.6
    ramdisk        4       2942.3           661.9
    ramdisk        8       2882.5           999.8
    ramdisk       16       2738.7          1801.9
    ramdisk       32       2541.9          2394.0

    Signed-off-by: Josef Bacik
    Cc: Andreas Dilger
    Cc: Arjan van de Ven
    Cc: Ric Wheeler
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     

07 Nov, 2008

1 commit

  • Commit be07c4ed introduced a regression because it assumed that if
    there were no transactions ready to be checkpointed, that no progress
    could be made on making space available in the journal, and so the
    journal should be aborted. This assumption is false; it could be the
    case that simply calling cleanup_journal_tail() will recover the
    necessary space, or, for small journals, the currently committing
    transaction could be responsible for chewing up the required space in
    the log, so we need to wait for the currently committing transaction
    to finish before trying to force a checkpoint operation.
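    The corrected decision order can be sketched as a hedged userspace
    model; the enum and helper are illustrative stand-ins, since the real
    __log_wait_for_space() operates on journal state, not boolean flags:

```c
/* Model of the progress decision described above: running out of
 * checkpointable transactions is not by itself fatal. */
enum log_action {
    LOG_CHECKPOINT,       /* checkpoint a ready transaction */
    LOG_CLEANUP_TAIL,     /* cleanup_journal_tail() may recover space */
    LOG_WAIT_FOR_COMMIT,  /* the committing transaction holds the space */
    LOG_ABORT             /* genuinely no way to make progress */
};

static enum log_action next_action(int tail_cleanup_possible,
                                   int transaction_committing,
                                   int checkpoint_ready)
{
    if (checkpoint_ready)
        return LOG_CHECKPOINT;
    /* No checkpointable transactions: try the cheaper recoveries first. */
    if (tail_cleanup_possible)
        return LOG_CLEANUP_TAIL;
    if (transaction_committing)
        return LOG_WAIT_FOR_COMMIT;
    return LOG_ABORT;
}
```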

    This patch fixes the bug reported by Meelis Roos at:
    http://bugzilla.kernel.org/show_bug.cgi?id=11937

    Signed-off-by: "Theodore Ts'o"
    Cc: Duane Griffin
    Cc: Toshiyuki Okajima

    Theodore Ts'o
     

31 Oct, 2008

1 commit

  • Delete excess kernel-doc notation in fs/ subdirectory:

    Warning(linux-2.6.27-git10//fs/jbd/transaction.c:886): Excess function parameter or struct member 'credits' description in 'journal_get_undo_access'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

23 Oct, 2008

3 commits

  • The __log_wait_for_space function sits in a loop checkpointing
    transactions until there is sufficient space free in the journal.
    However, if there are no transactions to be processed (e.g. because the
    free space calculation is wrong due to a corrupted filesystem) it will
    never progress.

    Check for space being required when no transactions are outstanding and
    abort the journal instead of endlessly looping.

    This patch fixes the bug reported by Sami Liedes at:
    http://bugzilla.kernel.org/show_bug.cgi?id=10976

    Signed-off-by: Duane Griffin
    Tested-by: Sami Liedes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • __try_to_free_cp_buf(), __process_buffer(), and __wait_cp_io() test
    BH_Uptodate flag to detect write I/O errors on metadata buffers. But by
    commit 95450f5a7e53d5752ce1a0d0b8282e10fe745ae0 "ext3: don't read inode
    block if the buffer has a write error"(*), BH_Uptodate flag can be set to
    inode buffers with BH_Write_EIO in order to avoid reading old inode data.
    So now, we have to test BH_Write_EIO flag of checkpointing inode buffers
    instead of BH_Uptodate. This patch does it.
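    The flag-test change can be modelled in userspace; the bit positions
    below are arbitrary stand-ins, not the kernel's enum bh_state_bits:

```c
/* Hedged model of the check described above. */
#define MODEL_BH_UPTODATE  (1u << 0)
#define MODEL_BH_WRITE_EIO (1u << 1)

/* Old check: a buffer that is not uptodate is assumed to have failed.
 * But after the referenced ext3 commit, an errored inode buffer can be
 * marked uptodate, so this check misses the write error. */
static int write_error_old(unsigned flags)
{
    return !(flags & MODEL_BH_UPTODATE);
}

/* New check: test the write-error bit directly. */
static int write_error_new(unsigned flags)
{
    return !!(flags & MODEL_BH_WRITE_EIO);
}
```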

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Acked-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • When a checkpointing IO fails, current JBD code doesn't check the error
    and continue journaling. This means latest metadata can be lost from both
    the journal and filesystem.

    This patch leaves the failed metadata blocks in the journal space and
    aborts journaling in the case of log_do_checkpoint(). To achieve this, we
    need to do:

    1. don't remove the failed buffer from the checkpoint list in the
       case of __try_to_free_cp_buf(), because it may be released or
       overwritten by a later transaction
    2. log_do_checkpoint() is the last chance: remove the failed buffer
       from the checkpoint list and abort the journal
    3. when checkpointing fails, don't update the journal superblock, to
       prevent the journaled contents from being cleaned. For safety,
       don't update j_tail and j_tail_sequence either
    4. when checkpointing fails, notify the ext3 layer of the error so
       that ext3 doesn't clear the needs_recovery flag; otherwise the
       journaled contents would be ignored and cleaned in the recovery
       phase
    5. if the recovery fails, keep the needs_recovery flag
    6. prevent cleanup_journal_tail() from being called between
       __journal_drop_transaction() and journal_abort() (a race
       between journal_flush() and __log_wait_for_space())

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

21 Oct, 2008

1 commit


20 Oct, 2008

4 commits

  • In ordered mode, if a file data buffer being dirtied exists in the
    committing transaction, we write the buffer to disk, move it from the
    committing transaction to the running transaction, and then dirty it.
    But we must not remove the buffer from the committing transaction if
    the buffer couldn't be written out; otherwise the error would be
    missed and the committing transaction would not abort.

    This patch adds an error check before removing the buffer from the
    committing transaction.

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • If the journal doesn't abort when it gets an IO error in file data blocks,
    file data corruption will spread silently. Because most applications
    and commands do buffered writes without fsync(), they don't notice the
    IO error. That is scary for mission-critical systems. On the other
    hand, if the journal aborts whenever it gets an IO error in file data
    blocks, the system will easily become inoperable. So this patch
    introduces a filesystem option to determine whether the journal aborts
    or just calls printk() when it gets an IO error in file data.

    If you mount an ext3 fs with the data_err=abort option, it aborts on a
    file data write error. If you mount it with data_err=ignore, it doesn't
    abort and just calls printk(). data_err=ignore is the default.
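    The option's semantics can be sketched as a hedged model; the parsing
    helper below is illustrative, not ext3's actual option parser:

```c
#include <string.h>

/* Model of the data_err mount-option semantics described above. */
enum data_err_policy { DATA_ERR_IGNORE, DATA_ERR_ABORT };

static enum data_err_policy parse_data_err(const char *opt)
{
    if (opt && strcmp(opt, "abort") == 0)
        return DATA_ERR_ABORT;
    return DATA_ERR_IGNORE; /* the default */
}

/* On a file-data IO error: abort the journal, or just printk()? */
static int abort_on_data_err(enum data_err_policy p)
{
    return p == DATA_ERR_ABORT;
}
```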

    Signed-off-by: Hidehiro Kawai
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • Currently, original metadata buffers are dirtied when they are unfiled,
    whether or not the journal has aborted. Eventually these buffers will be
    written back to the filesystem by pdflush. This means some metadata
    buffers are written to the filesystem without journaling if the journal
    aborts, so if both a journal abort and a system crash happen at the same
    time, the filesystem would be left in an inconsistent state. Additionally,
    replaying journaled metadata can partly overwrite the latest metadata on
    the filesystem, because if the journal aborts, journaled metadata are
    preserved and replayed during the next mount so as not to lose
    uncheckpointed metadata. This would also break the consistency of the
    filesystem.

    This patch prevents original metadata buffers from being dirtied on abort
    by clearing BH_JBDDirty flag from those buffers. Thus, no metadata
    buffers are written to the filesystem without journaling.

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • If we fail to write metadata buffers to the journal space but succeed
    in writing the commit record, stale data can be written back to the
    filesystem as metadata in the recovery phase.

    To avoid this, when we failed to write out metadata buffers, abort the
    journal before writing the commit record.

    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

12 Aug, 2008

1 commit


11 Aug, 2008

2 commits


05 Aug, 2008

2 commits

  • Like the page lock change, this also requires name change, so convert the
    raw test_and_set bitop to a trylock.

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Converting page lock to new locking bitops requires a change of page flag
    operation naming, so we might as well convert it to something nicer
    (!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked).

    This also facilitates lockdeping of page lock.
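    The renamed primitive can be sketched in userspace with C11 atomics.
    This is a hedged illustration only: the kernel uses its own page-flag
    bitops, and the bit position here is arbitrary:

```c
#include <stdatomic.h>

/* Userspace sketch of trylock-style locking via a raw test-and-set
 * bitop, the operation trylock_page() wraps. */
#define PG_LOCKED_BIT 0u

static atomic_uint demo_page_flags; /* stands in for page->flags */

/* Returns 1 if the lock bit was acquired, 0 if it was already held. */
static int trylock_bit(atomic_uint *flags)
{
    return !(atomic_fetch_or(flags, 1u << PG_LOCKED_BIT) &
             (1u << PG_LOCKED_BIT));
}

static void unlock_bit(atomic_uint *flags)
{
    atomic_fetch_and(flags, ~(1u << PG_LOCKED_BIT));
}
```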

    Signed-off-by: Nick Piggin
    Acked-by: KOSAKI Motohiro
    Acked-by: Peter Zijlstra
    Acked-by: Andrew Morton
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

26 Jul, 2008

7 commits

  • In ordered mode, the current jbd aborts the journal if a file data buffer
    has an error. But this behavior is unintended, and we found that it has
    been adopted accidentally.

    This patch undoes it and just calls printk() instead of aborting the
    journal. Additionally, set AS_EIO into the address_space object of the
    failed buffer which is submitted by journal_do_submit_data() so that
    fsync() can get -EIO.

    Missing error checks are also added to report errors on file data
    buffers to the user. The following buffers are targeted.

    (a) the buffer which has already been written out by pdflush
    (b) the buffer which has been unlocked before scanned in the
    t_locked_list loop

    [akpm@linux-foundation.org: improve grammar in a printk]
    Signed-off-by: Hidehiro Kawai
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     
  • After ext3-ordered files are truncated, pages that cannot be accounted
    for may remain. Those pages can be reclaimed when the system runs
    really low on memory, so this is not a memory leak, but resource
    management software etc. may not work correctly.

    journal_unmap_buffer() may be unable to release buffers, and the pages
    they belong to, because they are attached to a committing transaction.
    To release such buffers and pages later, journal_unmap_buffer() leaves
    the job to journal_commit_transaction(): it marks the buffers BH_Freed
    so that journal_commit_transaction() can identify whether they can be
    released.

    In journalled mode and writeback mode, jbd handles only metadata
    buffers, but in ordered mode it handles data buffers as well.

    Currently, journal_commit_transaction() releases only the metadata
    buffers whose release was requested by journal_unmap_buffer(), along
    with the pages they belong to where possible. As a result, the data
    buffers whose release was requested by journal_unmap_buffer(), and the
    pages they belong to, remain after the transaction commits. Such pages
    no longer have a mapping, which is why unaccountable pages remain.

    The metadata buffers marked BH_Freed, and the pages they belong to,
    are released in 'JBD: commit phase 7'. By applying the same code in
    'JBD: commit phase 2' (where the data buffers are handled),
    journal_commit_transaction() can also release the data buffers marked
    BH_Freed and the pages they belong to. As a result, all buffers marked
    BH_Freed, and all the pages they belong to, are released in
    journal_commit_transaction(), and the unaccountable pages are gone.

    <>
    > spin_lock(&journal->j_list_lock);
    > while (commit_transaction->t_forget) {
    >         transaction_t *cp_transaction;
    >         struct buffer_head *bh;
    >
    >         jh = commit_transaction->t_forget;
    > ...
    >         if (buffer_freed(bh)) {
    >         ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 clear_buffer_freed(bh);
    >                 ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 clear_buffer_jbddirty(bh);
    >         }
    >
    >         if (buffer_jbddirty(bh)) {
    >                 JBUFFER_TRACE(jh, "add to new checkpointing trans");
    >                 __journal_insert_checkpoint(jh, commit_transaction);
    >                 JBUFFER_TRACE(jh, "refile for checkpoint writeback");
    >                 __journal_refile_buffer(jh);
    >                 jbd_unlock_bh_state(bh);
    >         } else {
    >                 J_ASSERT_BH(bh, !buffer_dirty(bh));
    >                 ...
    >                 JBUFFER_TRACE(jh, "refile or unfile freed buffer");
    >                 __journal_refile_buffer(jh);
    >                 if (!jh->b_transaction) {
    >                         jbd_unlock_bh_state(bh);
    >                         /* needs a brelse */
    >                         journal_remove_journal_head(bh);
    >                         release_buffer_page(bh);
    >                         ^^^^^^^^^^^^^^^^^^^^^^^^
    >                 } else
    > }
    ****************************************************************
    * Apply the code of "^^^^^^" lines into 'JBD: commit phase 2' *
    ****************************************************************

    In the journal_commit_transaction() code there is also one extra
    message in the series of jbd debug messages ("JBD: commit phase 2").
    This patch fixes that, too.

    Signed-off-by: Toshiyuki Okajima
    Acked-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Toshiyuki Okajima
     
  • Remove the unused EXPORT_SYMBOL(journal_update_superblock).

    Signed-off-by: Adrian Bunk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • journal_try_to_free_buffers() could race with a jbd commit transaction
    when the latter is holding the buffer reference while waiting for the
    data buffer to flush to disk. If the caller of
    journal_try_to_free_buffers() tries hard to release the buffers, it
    will treat the failure as an error and return to the caller. We have
    seen direct IO fail due to this race. Some callers of releasepage()
    also expect the buffer to be dropped when called with the GFP_KERNEL
    mask via releasepage()->journal_try_to_free_buffers().

    With this patch, if the caller passes __GFP_WAIT and __GFP_FS to
    indicate the call may wait, then when try_to_free_buffers() fails we
    wait for journal_commit_transaction() to finish committing the current
    committing transaction, and then try to free those buffers again.
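    The retry rule can be sketched as a hedged userspace model; the flag
    values are arbitrary stand-ins for the kernel's __GFP_WAIT and
    __GFP_FS bits, and the helpers are illustrative:

```c
/* Model of the gfp-gated retry described above. */
#define MODEL_GFP_WAIT (1u << 0)
#define MODEL_GFP_FS   (1u << 1)

/* May this caller sleep and re-enter the filesystem? */
static int may_wait_for_commit(unsigned gfp_mask)
{
    return (gfp_mask & MODEL_GFP_WAIT) && (gfp_mask & MODEL_GFP_FS);
}

/* Shape of the fixed path: try once; if that fails and the caller may
 * wait, wait for the committing transaction to finish and retry. */
static int free_buffers(unsigned gfp_mask,
                        int (*try_free)(void),
                        void (*wait_for_commit)(void))
{
    if (try_free())
        return 1;
    if (!may_wait_for_commit(gfp_mask))
        return 0;
    wait_for_commit();
    return try_free();
}
```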

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mingming Cao
    Reviewed-by: Badari Pulavarty
    Acked-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • Make revocation cache destruction safe to call if initialisation fails
    partially or entirely. This allows it to be used to cleanup in the case
    of initialisation failure, simplifying that code slightly.

    Signed-off-by: Duane Griffin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • The revocation table initialisation/destruction code is repeated for each
    of the two revocation tables stored in the journal. Refactoring the
    duplicated code into functions is tidier, simplifies the logic in
    initialisation in particular, and slightly reduces the code size.

    There should not be any functional change.

    Signed-off-by: Duane Griffin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • If an error occurs during jbd cache initialisation it is possible for the
    journal_head_cache to be NULL when journal_destroy_journal_head_cache is
    called. Replace the J_ASSERT with an if block to handle the situation
    correctly.

    Note that even with this fix things will break badly if jbd is statically
    compiled in and cache initialisation fails.

    Signed-off-by: Duane Griffin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     

15 May, 2008

1 commit


28 Apr, 2008

3 commits

  • __FUNCTION__ is gcc-specific, use __func__

    Signed-off-by: Harvey Harrison
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • There are several cases where the running transaction can get buffers
    added to its BJ_Metadata list which it never dirtied, which makes its
    t_nr_buffers counter end up larger than its t_outstanding_credits
    counter.

    This causes issues when starting new transactions: as we log buffers we
    decrement t_outstanding_credits, so when t_outstanding_credits goes
    negative we report that we need less space in the journal than we
    actually need, and transactions are started even though there may not
    be enough room for them. In the worst case scenario (which admittedly
    is almost impossible to reproduce) this results in the journal running
    out of space.

    The fix is to refile buffers from the committing transaction to the
    running transaction's BJ_Metadata list only when b_modified is set on
    that journal head, which is the only way to be sure the running
    transaction has modified that buffer.

    This patch also fixes an accounting error in journal_forget: it is
    possible to call journal_forget on a buffer without having modified it,
    only gotten write access to it, so we now free a credit only if the
    buffer was modified. The assert will help catch this problem if it
    occurs. Without these two patches I could hit this assert within
    minutes of running postmark; with them the issue no longer arises.
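    The two accounting rules can be sketched as a hedged model; the
    helper names are illustrative, not the jbd source:

```c
/* Model of the journal_forget() accounting fix: a credit is refunded
 * only if the buffer was actually modified under this handle. */
static int credits_after_forget(int credits, int b_modified)
{
    if (b_modified)
        return credits + 1; /* the buffer consumed a credit: refund it */
    return credits;         /* only write access was taken: no refund */
}

/* The companion rule: refile a buffer from the committing transaction
 * to the running transaction's metadata list only if it was modified. */
static int should_refile_to_running(int b_modified)
{
    return b_modified;
}
```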

    Signed-off-by: Josef Bacik
    Cc:
    Acked-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     
  • Currently, at the start of a journal commit we loop through all of the
    buffers on the committing transaction and clear the b_modified flag
    (the flag that is set when a transaction modifies the buffer) under
    j_list_lock.

    The problem is that everywhere else this flag is modified only under
    the jbd lock-buffer flag, so this races with a running transaction
    that could potentially set it, only to have it unset by the committing
    transaction.

    This is also a big waste: you may be clearing the modified flag on
    several thousand buffers when you don't need to. This patch removes
    that code and instead clears the b_modified flag upon entering
    do_get_write_access/journal_get_create_access, so if the transaction
    does indeed use the buffer it is accounted for properly, and if it
    does not then we know we didn't use it.

    That will be important for the next patch in this series. Tested
    thoroughly by myself using postmark/iozone/bonnie++.

    Signed-off-by: Josef Bacik
    Cc:
    Acked-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef Bacik
     

31 Mar, 2008

1 commit


20 Mar, 2008

3 commits

  • Fix kernel-doc notation warnings in fs/.

    Warning(mmotm-2008-0314-1449//fs/super.c:560): missing initial short description on line:
    * mark_files_ro
    Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
    * lease_get_mtime
    Warning(mmotm-2008-0314-1449//fs/locks.c:1277): missing initial short description on line:
    * lease_get_mtime
    Warning(mmotm-2008-0314-1449//fs/namei.c:1368): missing initial short description on line:
    * lookup_one_len: filesystem helper to lookup single pathname component
    Warning(mmotm-2008-0314-1449//fs/buffer.c:3221): missing initial short description on line:
    * bh_uptodate_or_lock: Test whether the buffer is uptodate
    Warning(mmotm-2008-0314-1449//fs/buffer.c:3240): missing initial short description on line:
    * bh_submit_read: Submit a locked buffer for reading
    Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:30): missing initial short description on line:
    * writeback_acquire: attempt to get exclusive writeback access to a device
    Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:47): missing initial short description on line:
    * writeback_in_progress: determine whether there is writeback in progress
    Warning(mmotm-2008-0314-1449//fs/fs-writeback.c:58): missing initial short description on line:
    * writeback_release: relinquish exclusive writeback access against a device.
    Warning(mmotm-2008-0314-1449//include/linux/jbd.h:351): contents before sections
    Warning(mmotm-2008-0314-1449//include/linux/jbd.h:561): contents before sections
    Warning(mmotm-2008-0314-1449//fs/jbd/transaction.c:1935): missing initial short description on line:
    * void journal_invalidatepage()

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Fix a long-standing typo (predating git) that will cause data corruption if a
    journal data block needs unescaping. At the moment the wrong buffer head's
    data is being unescaped.

    To test this case mount a filesystem with data=journal, start creating and
    deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
    pull the plug on the device. Without this patch the files will contain zeros
    instead of the correct data after recovery.

    Signed-off-by: Duane Griffin
    Acked-by: Jan Kara
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Duane Griffin
     
  • Fix kernel-doc notation in jbd.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

04 Mar, 2008

1 commit


09 Feb, 2008

1 commit

  • Some devices - notably dm and md - can change their behaviour in response
    to BIO_RW_BARRIER requests. They might start out accepting such requests
    but on reconfiguration, they find out that they cannot any more.

    ext3 (and other filesystems) deal with this by always testing if
    BIO_RW_BARRIER requests fail with EOPNOTSUPP, and retrying the write
    requests without the barrier (probably after waiting for any pending writes
    to complete).

    However there is a bug in the handling for this for ext3.

    When ext3 (jbd actually) decides to submit a BIO_RW_BARRIER request, it
    sets the buffer_ordered flag on the buffer head. If the request completes
    successfully, the flag STAYS SET.

    Other code might then write the same buffer_head after the device has been
    reconfigured to not accept barriers. This write will then fail, but the
    "other code" is not ready to handle EOPNOTSUPP errors and the error will be
    treated as fatal.

    This can be seen without having to reconfigure a device at exactly the
    wrong time by putting:

    if (buffer_ordered(bh))
            printk("OH DEAR, an ordered buffer\n");

    in the while loop in "commit phase 5" of journal_commit_transaction.

    If it ever prints the "OH DEAR ..." message (as it does sometimes for
    me), then that request could (in different circumstances) have failed
    with EOPNOTSUPP, but that isn't tested for.

    My proposed fix is to clear the buffer_ordered flag after it has been
    used, as in the following patch.
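    The shape of the fix can be modelled in userspace; the bit value is an
    arbitrary stand-in for the kernel's BH_Ordered flag, and the helper is
    illustrative:

```c
/* Model of the proposed fix: the ordered flag is consumed by the
 * barrier submission and must not stay set afterwards. */
#define MODEL_BH_ORDERED (1u << 0)

/* Returns the buffer flags after submission; a real implementation
 * would issue the write with BIO_RW_BARRIER when the flag is set. */
static unsigned flags_after_barrier_submit(unsigned flags)
{
    /* ...submit the write (as a barrier if MODEL_BH_ORDERED is set)... */
    return flags & ~MODEL_BH_ORDERED; /* the fix: clear the flag after use */
}
```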

    Signed-off-by: Neil Brown
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     

07 Feb, 2008

2 commits


01 Feb, 2008

1 commit


30 Jan, 2008

1 commit

  • The break_lock data structure and code for spinlocks is quite nasty.
    Not only does it double the size of a spinlock but it changes locking to
    a potentially less optimal trylock.

    Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a
    __raw_spin_is_contended that uses the lock data itself to determine whether
    there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is
    not set.

    Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to
    decouple it from the spinlock implementation, and make it typesafe (rwlocks
    do not have any need_lockbreak sites -- why do they even get bloated up
    with that break_lock then?).

    Signed-off-by: Nick Piggin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Nick Piggin
     

18 Jan, 2008

1 commit

  • This likely fixes the oops in __lock_acquire reported as:

    http://www.kerneloops.org/raw.php?rawid=2753&msgid=
    http://www.kerneloops.org/raw.php?rawid=2749&msgid=

    In these reported oopses, start_this_handle is returning -EROFS.

    Signed-off-by: Jonas Bonn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonas Bonn
     

06 Dec, 2007

1 commit

  • Before we start committing a transaction, we call
    __journal_clean_checkpoint_list() to cleanup transaction's written-back
    buffers.

    If this call happens to remove all of them (and there were already some
    buffers), __journal_remove_checkpoint() will decide to free the transaction
    because it isn't (yet) a committing transaction and soon we fail some
    assertion - the transaction really isn't ready to be freed :).

    We change the check in __journal_remove_checkpoint() to free only a
    transaction in T_FINISHED state. The locking there is subtle though (as
    everywhere in JBD ;(). We use j_list_lock to protect the check and a
    subsequent call to __journal_drop_transaction() and do the same in the end
    of journal_commit_transaction() which is the only place where a transaction
    can get to T_FINISHED state.

    Probably I'm too paranoid here and such locking is not really necessary -
    checkpoint lists are processed only from log_do_checkpoint() where a
    transaction must be already committed to be processed or from
    __journal_clean_checkpoint_list() where kjournald itself calls it and thus
    transaction cannot change state either. Better be safe if something
    changes in future...
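    The tightened check can be sketched as a hedged model. The state
    names mirror jbd's transaction states; the helper itself is
    illustrative, not the __journal_remove_checkpoint() source:

```c
/* Model of the check described above: a transaction may be freed only
 * once it has reached T_FINISHED. */
enum model_t_state { T_RUNNING, T_LOCKED, T_COMMIT, T_FINISHED };

static int may_drop_transaction(enum model_t_state state,
                                int checkpoint_list_empty)
{
    /* The old check freed any non-committing transaction whose
     * checkpoint list drained; that could free one still being
     * committed. */
    return checkpoint_list_empty && state == T_FINISHED;
}
```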

    Signed-off-by: Jan Kara
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara