Eric Lee / smarc-fsl-linux-kernel

06 Dec, 2011

1 commit

90802ed9c treewide: Fix comment and string typo 'bufer'

Signed-off-by: Paul Bolle
Signed-off-by: Jiri Kosina

Paul Bolle
2011-12-06 16:53:40 +0800

28 Jun, 2011

1 commit

d3ad8434a jbd2: use WRITE_SYNC in journal checkpoint

In journal checkpoint, we write the buffer and wait for it to finish.
But in cfq, the async queue has a very low priority, and in our test,
if there are too many sync queues and every queue is filled up with
requests, the write request will be delayed for quite a long time and
all the tasks which are waiting for journal space will end with errors like:

INFO: task attr_set:3816 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
attr_set D ffff880028393480 0 3816 1 0x00000000
ffff8802073fbae8 0000000000000086 ffff8802140847c8 ffff8800283934e8
ffff8802073fb9d8 ffffffff8103e456 ffff8802140847b8 ffff8801ed728080
ffff8801db4bc080 ffff8801ed728450 ffff880028393480 0000000000000002
Call Trace:
[] ? __dequeue_entity+0x33/0x38
[] ? need_resched+0x23/0x2d
[] ? thread_return+0xa2/0xbc
[] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
[] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
[] __mutex_lock_common+0x14e/0x1a9
[] ? brelse+0x13/0x15 [ext4]
[] __mutex_lock_slowpath+0x19/0x1b
[] mutex_lock+0x1b/0x32
[] __jbd2_journal_insert_checkpoint+0xe3/0x20c [jbd2]
[] start_this_handle+0x438/0x527 [jbd2]
[] ? autoremove_wake_function+0x0/0x3e
[] jbd2_journal_start+0xa1/0xcc [jbd2]
[] ext4_journal_start_sb+0x57/0x81 [ext4]
[] ext4_xattr_set+0x6c/0xe3 [ext4]
[] ext4_xattr_user_set+0x42/0x4b [ext4]
[] generic_setxattr+0x6b/0x76
[] __vfs_setxattr_noperm+0x47/0xc0
[] vfs_setxattr+0x7f/0x9a
[] setxattr+0xb5/0xe8
[] ? do_filp_open+0x571/0xa6e
[] sys_fsetxattr+0x6b/0x91
[] system_call_fastpath+0x16/0x1b

So this patch tries to use WRITE_SYNC in __flush_batch so that the request will
be moved into the sync queue and handled by cfq in a timely manner. We also use
the new plug, so that all the WRITE_SYNC requests can be given as a whole when
we unplug it.

Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"
Cc: Jan Kara
Reported-by: Robin Dong

Tao Ma
2011-06-28 00:36:29 +0800

14 Jun, 2011

1 commit

de1b79413 jbd2: Fix oops in jbd2_journal_remove_journal_head()

jbd2_journal_remove_journal_head() can oops when trying to access
journal_head returned by bh2jh(). This is caused for example by the
following race:

    TASK1                                   TASK2
  jbd2_journal_commit_transaction()
    ...
    processing t_forget list
      __jbd2_journal_refile_buffer(jh);
      if (!jh->b_transaction) {
        jbd_unlock_bh_state(bh);
                                        jbd2_journal_try_to_free_buffers()
                                          jbd2_journal_grab_journal_head(bh)
                                          jbd_lock_bh_state(bh)
                                          __journal_try_to_free_buffer()
                                          jbd2_journal_put_journal_head(jh)
        jbd2_journal_remove_journal_head(bh);

jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and
buffer is not part of any transaction and thus frees journal_head
before TASK1 gets to doing so. Note that even buffer_head can be
released by try_to_free_buffers() after
jbd2_journal_put_journal_head(), which adds an even larger opportunity
for an oops (but I didn't see this happen in reality).

Fix the problem by making transactions hold their own journal_head
reference (in b_jcount). That way we don't have to remove journal_head
explicitly via jbd2_journal_remove_journal_head() and instead just
remove journal_head when b_jcount drops to zero. The result of this is
that [__]jbd2_journal_refile_buffer(),
[__]jbd2_journal_unfile_buffer(), and
__jbd2_journal_remove_checkpoint() can free journal_head which needs
modification of a few callers. Also we have to be careful because once
journal_head is removed, buffer_head might be freed as well. So we
have to get our own buffer_head reference where it matters.
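
The reference-counting idea described above can be illustrated with a small userspace model. This is only a sketch: `struct jh_model`, `jh_alloc()`, `jh_get()` and `jh_put()` are hypothetical names, not the kernel's structures or functions, and the model deliberately ignores locking. It shows why an owner that holds its own count in `b_jcount` cannot have the object freed underneath it by another task dropping a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a journal_head; NOT the kernel structure. */
struct jh_model {
    int b_jcount;      /* reference count, named after the field in the text */
    bool *freed_flag;  /* set to true when the model object is released */
};

/* Allocate a model object with a zero reference count. */
static struct jh_model *jh_alloc(bool *freed_flag)
{
    struct jh_model *jh = malloc(sizeof(*jh));

    assert(jh != NULL);
    jh->b_jcount = 0;
    jh->freed_flag = freed_flag;
    *freed_flag = false;
    return jh;
}

/* Take one reference (for example, the one a transaction would hold). */
static void jh_get(struct jh_model *jh)
{
    jh->b_jcount++;
}

/*
 * Drop one reference. The object is released only when the count
 * reaches zero. Returns true if this call freed the object.
 */
static bool jh_put(struct jh_model *jh)
{
    assert(jh->b_jcount > 0);
    if (--jh->b_jcount > 0)
        return false;
    *jh->freed_flag = true;
    free(jh);
    return true;
}
```

In this model, if the "transaction" and a second task each call `jh_get()`, the second task's `jh_put()` returns false and the object stays valid until the transaction drops its own reference.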

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2011-06-14 03:38:22 +0800

28 Oct, 2010

2 commits

a107e5a3a Merge branch 'next' into upstream-merge

Conflicts:
fs/ext4/inode.c
fs/ext4/mballoc.c
include/trace/events/ext4.h

Theodore Ts'o
2010-10-28 11:44:47 +0800

5c2178e78 jbd2: Add sanity check for attempts to start handle during umount

An attempt to modify the file system during the call to
jbd2_destroy_journal() can lead to a system lockup. So add some
checking to make it much more obvious when this happens and to
determine where the offending code is located.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-10-28 09:30:04 +0800

17 Sep, 2010

1 commit

dd3932edd block: remove BLKDEV_IFL_WAIT

All the blkdev_issue_* helpers can only sanely be used for synchronous
callers. To issue cache flushes or barriers asynchronously the caller needs
to set up a bio by itself with a completion callback to move the asynchronous
state machine ahead. So drop the BLKDEV_IFL_WAIT flag that is always
specified when calling blkdev_issue_* and also remove the now unused flags
argument to blkdev_issue_flush and blkdev_issue_zeroout. For
blkdev_issue_discard we need to keep it for the secure discard flag, which
gains a more descriptive name and loses the bitops vs flag confusion.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2010-09-17 02:52:58 +0800

18 Aug, 2010

1 commit

9cb569d60 remove SWRITE* I/O types

These flags aren't real I/O types, but tell ll_rw_block to always
lock the buffer instead of giving up on a failed trylock.

Instead add a new write_dirty_buffer helper that implements this semantic
and use it from the existing SWRITE* callers. Note that the ll_rw_block
code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
this patch fixes.

In the ufs code clean up the helper that used to call ll_rw_block
to mirror sync_dirty_buffer, which is the function it implements for
compound buffers.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-08-18 13:09:01 +0800

04 Aug, 2010

1 commit

a931da6ac jbd2: Change j_state_lock to be a rwlock_t

Lockstat reports have shown that j_state_lock is a major source of
lock contention, especially on systems with more than 4 CPU cores. So
change it to be a read/write spinlock.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-08-04 09:35:12 +0800

02 Aug, 2010

1 commit

a51dca9cd jbd2: Use atomic variables to avoid taking t_handle_lock in jbd2_journal_stop

Using an atomic_t for t_updates and t_outstanding_credits should allow
us to avoid taking the transaction's t_handle_lock in
jbd2_journal_stop().
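
As a rough userspace illustration of the idea, the counters can be updated with C11 atomics instead of under a lock. This is a sketch under stated assumptions: `struct txn_model`, `handle_start()` and `handle_stop()` are hypothetical names, and C11 `<stdatomic.h>` stands in for the kernel's `atomic_t` API, which is a different interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two counters named in the commit message. */
struct txn_model {
    atomic_int t_updates;              /* handles still open */
    atomic_int t_outstanding_credits;  /* buffer credits still reserved */
};

static void txn_init(struct txn_model *t)
{
    atomic_init(&t->t_updates, 0);
    atomic_init(&t->t_outstanding_credits, 0);
}

/* Open a handle: reserve credits and count one more updater. */
static void handle_start(struct txn_model *t, int credits)
{
    atomic_fetch_add(&t->t_outstanding_credits, credits);
    atomic_fetch_add(&t->t_updates, 1);
}

/*
 * Close a handle without taking any lock: give back unused credits and
 * drop the updater count. atomic_fetch_sub() returns the previous value,
 * so a result of 1 means this caller closed the last open handle.
 */
static bool handle_stop(struct txn_model *t, int unused_credits)
{
    atomic_fetch_sub(&t->t_outstanding_credits, unused_credits);
    return atomic_fetch_sub(&t->t_updates, 1) == 1;
}
```

Because the decrement and the "was I last?" test are a single atomic operation, two handles closing concurrently cannot both observe themselves as the last one.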

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-08-02 20:43:25 +0800

29 Apr, 2010

1 commit

fbd9b09a1 blkdev: generalize flags for blkdev_issue_fn functions

The patch just converts all blkdev_issue_xxx functions to a common
set of flags. Wait/allocation semantics are preserved.

Signed-off-by: Dmitry Monakhov
Signed-off-by: Jens Axboe

Dmitry Monakhov
2010-04-29 01:47:36 +0800

23 Dec, 2009

2 commits

71f2be213 ext4: Add new tracepoint for jbd2_cleanup_journal_tail

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-23 20:45:44 +0800

cc3e1bea5 ext4, jbd2: Add barriers for file systems with external journals

This is a bit complicated because we are trying to optimize when we
send barriers to the fs data disk. We could just throw in an extra
barrier to the data disk whenever we send a barrier to the journal
disk, but that's not always strictly necessary.

We only need to send a barrier during a commit when there are data
blocks which must be written out due to an inode written in
ordered mode, or if fsync() depends on the commit to force data blocks
to disk. Finally, before we drop transactions from the beginning of
the journal during a checkpoint operation, we need to guarantee that
any blocks that were flushed out to the data disk are firmly on the
rust platter before we drop the transaction from the journal.

Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-23 19:52:08 +0800

30 Sep, 2009

1 commit

bf6993276 jbd2: Use tracepoints for history file

The /proc/fs/jbd2//history was maintained manually; by using
tracepoints, we can get all of the existing functionality of the /proc
file plus extra capabilities thanks to the ftrace infrastructure. We
save memory as a bonus.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-09-30 12:32:06 +0800

17 Jun, 2009

1 commit

879c5e6b7 jbd2: convert instrumentation from markers to tracepoints

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-06-17 23:47:48 +0800

07 Nov, 2008

2 commits

fb68407b0 jbd2: Call journal commit callback without holding j_list_lock

Avoid freeing the transaction in __jbd2_journal_drop_transaction() so
the journal commit callback can run without holding j_list_lock, to
avoid lock contention on this spinlock.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: "Theodore Ts'o"

Aneesh Kumar K.V
2008-11-07 06:50:21 +0800

8c3f25d89 jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space

Commit 23f8b79e introduced a regression because it assumed that if
there were no transactions ready to be checkpointed, that no progress
could be made on making space available in the journal, and so the
journal should be aborted. This assumption is false; it could be the
case that simply calling jbd2_cleanup_journal_tail() will recover the
necessary space, or, for small journals, the currently committing
transaction could be responsible for chewing up the required space in
the log, so we need to wait for the currently committing transaction
to finish before trying to force a checkpoint operation.
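
The decision described above can be sketched as a small pure function. This is a userspace model only: `struct journal_model`, `enum space_action` and `pick_space_action()` are hypothetical names, and the order in which the options are tried is an assumption made for illustration, not a statement of what the patched kernel function does line by line.

```c
#include <assert.h>
#include <stdbool.h>

/* Possible reactions when the journal is short on space (model only). */
enum space_action {
    ACT_CHECKPOINT,      /* transactions are ready to be checkpointed */
    ACT_CLEANUP_TAIL,    /* advancing the journal tail recovers space */
    ACT_WAIT_COMMIT,     /* wait for the committing transaction */
    ACT_ABORT,           /* nothing can free space: abort the journal */
};

/* Hypothetical summary of the journal state relevant to the decision. */
struct journal_model {
    bool has_checkpoint_txns;       /* checkpoint list is not empty */
    bool cleanup_tail_frees_space;  /* tail cleanup would make progress */
    bool has_committing_txn;        /* a commit is currently in flight */
};

/*
 * Abort only as a last resort, i.e. when none of the three ways of
 * making progress named in the commit message is available.
 */
static enum space_action pick_space_action(const struct journal_model *j)
{
    if (j->has_checkpoint_txns)
        return ACT_CHECKPOINT;
    if (j->cleanup_tail_frees_space)
        return ACT_CLEANUP_TAIL;
    if (j->has_committing_txn)
        return ACT_WAIT_COMMIT;
    return ACT_ABORT;
}
```

The regression corresponds to treating "no checkpointable transactions" as an immediate `ACT_ABORT`, skipping the two middle cases.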

This patch fixes a bug reported by Mihai Harpau at:
https://bugzilla.redhat.com/show_bug.cgi?id=469582

This patch fixes a bug reported by François Valenduc at:
http://bugzilla.kernel.org/show_bug.cgi?id=11840

Signed-off-by: "Theodore Ts'o"
Cc: Duane Griffin
Cc: Toshiyuki Okajima

Theodore Ts'o
2008-11-07 11:38:07 +0800

05 Nov, 2008

1 commit

1a0d3786d jbd2: Remove a large array of bh's from the stack of the checkpoint routine

jbd2_log_do_checkpoint() is one of the kernel's largest stack users.
Move the array of buffer heads from the stack of jbd2_log_do_checkpoint()
to the in-core journal structure.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2008-11-05 13:09:22 +0800

11 Oct, 2008

1 commit

44519faf2 jbd2: fix error handling for checkpoint io

When a checkpointing IO fails, the current JBD2 code doesn't check the
error and continues journaling. This means the latest metadata can be
lost from both the journal and the filesystem.

This patch leaves the failed metadata blocks in the journal space
and aborts journaling in the case of jbd2_log_do_checkpoint().
To achieve this, we need to do:

1. don't remove the failed buffer from the checkpoint list in
   the case of __try_to_free_cp_buf() because it may be released or
   overwritten by a later transaction
2. jbd2_log_do_checkpoint() is the last chance, remove the failed
   buffer from the checkpoint list and abort the journal
3. when checkpointing fails, don't update the journal super block to
   prevent the journaled contents from being cleaned. For safety,
   don't update j_tail and j_tail_sequence either
4. when checkpointing fails, report this error to the ext4 layer so
   that ext4 doesn't clear the needs_recovery flag, otherwise the
   journaled contents are ignored and cleaned in the recovery phase
5. if the recovery fails, keep the needs_recovery flag
6. prevent jbd2_cleanup_journal_tail() from being called between
   __jbd2_journal_drop_transaction() and jbd2_journal_abort()
   (a possible race issue between jbd2_log_do_checkpoint()s called by
   jbd2_journal_flush() and __jbd2_log_wait_for_space())

Signed-off-by: Hidehiro Kawai
Signed-off-by: Theodore Ts'o

Hidehiro Kawai
2008-10-11 08:29:13 +0800

09 Oct, 2008

1 commit

23f8b79ea jbd2: abort instead of waiting for nonexistent transaction

The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal.
However, if there are no transactions to be processed (e.g. because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976

Signed-off-by: Duane Griffin
Cc: Sami Liedes
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: "Theodore Ts'o"

Duane Griffin
2008-10-09 11:28:31 +0800

06 Oct, 2008

1 commit

ede86cc47 ext4: Add debugging markers that can be used by systemtap

These debugging markers are designed to debug problems such as the
random filesystem latency problems reported by Arjan.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2008-10-06 08:50:06 +0800

12 Jul, 2008

1 commit

87c89c232 jbd2: Remove data=ordered mode support using jbd buffer heads

Signed-off-by: Jan Kara

Jan Kara
2008-07-12 07:27:31 +0800

30 Jan, 2008

1 commit

95c354fe9 spinlock: lockbreak cleanup

The break_lock data structure and code for spinlocks is quite nasty.
Not only does it double the size of a spinlock but it changes locking to
a potentially less optimal trylock.

Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a
__raw_spin_is_contended that uses the lock data itself to determine whether
there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is
not set.

Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to
decouple it from the spinlock implementation, and make it typesafe (rwlocks
do not have any need_lockbreak sites -- why do they even get bloated up
with that break_lock then?).

Signed-off-by: Nick Piggin
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner

Nick Piggin
2008-01-30 20:31:20 +0800

29 Jan, 2008

2 commits

8e85fb3f3 jbd2: jbd2 stats through procfs

The patch below updates the jbd stats patch to 2.6.20/jbd2.
The initial patch was posted by Alex Tomas in December 2005
(http://marc.info/?l=linux-ext4&m=113538565128617&w=2).
It provides statistics via procfs such as transaction lifetime and size.

Sometimes, when investigating performance problems, I find it useful to
have stats from jbd about a transaction's lifetime, size, etc. Here is a
patch for review and possibly inclusion.

For example, stats after creation of 3M files in an htree directory:

[root@bob ~]# cat /proc/fs/jbd/sda/history
R/C  tid   wait  run   lock  flush log   hndls  block inlog ctime write drop  close
R    261   8260  2720  0     0     750   9892   8170  8187
C    259                                                    750   0     4885  1
R    262   20    2200  10    0     770   9836   8170  8187
R    263   30    2200  10    0     3070  9812   8170  8187
R    264   0     5000  10    0     1340  0      0     0
C    261                                                    8240  3212  4957  0
R    265   8260  1470  0     0     4640  9854   8170  8187
R    266   0     5000  10    0     1460  0      0     0
C    262                                                    8210  2989  4868  0
R    267   8230  1490  10    0     4440  9875   8171  8188
R    268   0     5000  10    0     1260  0      0     0
C    263                                                    7710  2937  4908  0
R    269   7730  1470  10    0     3330  9841   8170  8187
R    270   0     5000  10    0     830   0      0     0
C    265                                                    8140  3234  4898  0
C    267                                                    720   0     4849  1
R    271   8630  2740  20    0     740   9819   8170  8187
C    269                                                    800   0     4214  1
R    272   40    2170  10    0     830   9716   8170  8187
R    273   40    2280  0     0     3530  9799   8170  8187
R    274   0     5000  10    0     990   0      0     0

where,

R - line for transaction's life from T_RUNNING to T_FINISHED
C - line for transaction's checkpointing
tid - transaction's id
wait - for how long we were waiting for new transaction to start
       (the longest period journal_start() took in this transaction)
run - real transaction's lifetime (from T_RUNNING to T_LOCKED)
lock - how long we were waiting for all handles to close
       (time the transaction was in T_LOCKED)
flush - how long it took to flush all data (data=ordered)
log - how long it took to write the transaction to the log
hndls - how many handles got to the transaction
block - how many blocks got to the transaction
inlog - how many blocks are written to the log (block + descriptors)
ctime - how long it took to checkpoint the transaction
write - how many blocks have been written during checkpointing
drop - how many blocks have been dropped during checkpointing
close - how many running transactions have been closed to checkpoint this one

all times are in msec.
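
A reader of this file could split the two record types as follows. This is a sketch, assuming the whitespace-separated layout of the sample output above (nine numbers after `R`, five after `C`); the struct and function names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of an "R" line, in the order given by the header row. */
struct run_rec {
    unsigned long tid, wait, run, lock, flush, log, hndls, block, inlog;
};

/* Fields of a "C" line: only the checkpoint columns are populated. */
struct chk_rec {
    unsigned long tid, ctime, write, drop, close;
};

/*
 * Parse one line of the history output. Returns 'R' or 'C' when the
 * line matches that record type and 0 otherwise (for example, for the
 * header row). Only the struct matching the return value is valid.
 */
static char parse_history_line(const char *line, struct run_rec *r,
                               struct chk_rec *c)
{
    if (sscanf(line, "R %lu %lu %lu %lu %lu %lu %lu %lu %lu",
               &r->tid, &r->wait, &r->run, &r->lock, &r->flush,
               &r->log, &r->hndls, &r->block, &r->inlog) == 9)
        return 'R';
    if (sscanf(line, "C %lu %lu %lu %lu %lu",
               &c->tid, &c->ctime, &c->write, &c->drop, &c->close) == 5)
        return 'C';
    return 0;
}
```

Since all times are in msec, summing `wait`, `run`, `lock`, `flush` and `log` over the `R` records and dividing by their count would give per-transaction averages of the kind an info file reports.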

[root@bob ~]# cat /proc/fs/jbd/sda/info
280 transaction, each upto 8192 blocks
average:
  1633ms waiting for transaction
  3616ms running transaction
  5ms transaction was being locked
  1ms flushing data (in ordered mode)
  1799ms logging transaction
  11781 handles per transaction
  5629 blocks per transaction
  5641 logged blocks per transaction

Signed-off-by: Johann Lombardi
Signed-off-by: Mariusz Kozlowski
Signed-off-by: Mingming Cao
Signed-off-by: Eric Sandeen

Johann Lombardi
2008-01-29 12:58:27 +0800

f5a7a6b0d jbd2: Fix assertion failure in fs/jbd2/checkpoint.c

Before we start committing a transaction, we call
__journal_clean_checkpoint_list() to clean up the transaction's
written-back buffers.

If this call happens to remove all of them (and there were already some
buffers), __journal_remove_checkpoint() will decide to free the transaction
because it isn't (yet) a committing transaction and soon we fail some
assertion - the transaction really isn't ready to be freed :).

We change the check in __journal_remove_checkpoint() to free only a
transaction in T_FINISHED state. The locking there is subtle though (as
everywhere in JBD ;(). We use j_list_lock to protect the check and a
subsequent call to __journal_drop_transaction() and do the same at the end
of journal_commit_transaction() which is the only place where a transaction
can get to T_FINISHED state.
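
The changed check can be pictured with a tiny model. This is a sketch under stated assumptions: the enum lists only the three states named on this page (the kernel defines more), and `struct cp_txn_model` and `may_drop_transaction()` are hypothetical names rather than kernel code. Locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the states mentioned in this log; the real list is longer. */
enum txn_state { T_RUNNING, T_LOCKED, T_FINISHED };

/* Hypothetical model of a transaction on the checkpoint path. */
struct cp_txn_model {
    enum txn_state state;
    int checkpoint_buffers;  /* buffers still on its checkpoint lists */
};

/*
 * A transaction may be dropped only when its checkpoint lists are
 * empty AND it has reached T_FINISHED. An empty list alone is not
 * enough, because a transaction that has not finished committing
 * may still gain buffers or be in use.
 */
static bool may_drop_transaction(const struct cp_txn_model *t)
{
    return t->checkpoint_buffers == 0 && t->state == T_FINISHED;
}
```

With this rule, cleaning every written-back buffer out of a transaction that has not yet committed leaves the transaction in place instead of freeing it early.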

Probably I'm too paranoid here and such locking is not really necessary -
checkpoint lists are processed only from log_do_checkpoint() where a
transaction must be already committed to be processed or from
__journal_clean_checkpoint_list() where kjournald itself calls it and thus
transaction cannot change state either. Better to be safe if something
changes in the future...

Signed-off-by: Jan Kara
Cc:
Signed-off-by: Andrew Morton

Jan Kara
2008-01-29 12:58:27 +0800

09 May, 2007

1 commit

588626996 fix file specification in comments

Many files include the filename at the beginning; several used the wrong one.

Signed-off-by: Uwe Kleine-König
Signed-off-by: Adrian Bunk

Uwe Kleine-König
2007-05-09 14:58:16 +0800

12 Oct, 2006

2 commits

f7f4bccb7 [PATCH] jbd2: rename jbd2 symbols to avoid duplication of jbd symbols

Mingming Cao originally did this work, and Shaggy reproduced it using some
scripts from her.

Signed-off-by: Mingming Cao
Signed-off-by: Dave Kleikamp
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mingming Cao
2006-10-12 02:14:15 +0800

470decc61 [PATCH] jbd2: initial copy of files from jbd

This is a simple copy of the files in fs/jbd to fs/jbd2 and
/usr/include/linux/[ext4_]jbd.h to /usr/include/[ext4_]jbd2.h

Signed-off-by: Dave Kleikamp
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Kleikamp
2006-10-12 02:14:15 +0800