Eric Lee / smarc-fsl-linux-kernel

13 Dec, 2019

1 commit

1a6a96e0f jbd2: Fix possible overflow in jbd2_log_space_left()

commit add3efdd78b8a0478ce423bb9d4df6bd95e8b335 upstream.

When the amount of free space in the journal is very low, the arithmetic
in jbd2_log_space_left() could underflow, resulting in a very high number
of free blocks and thus triggering an assertion failure in the transaction
commit code complaining that there's not enough space in the journal:

J_ASSERT(journal->j_free > 1);

Properly check for the low number of free blocks.

CC: stable@vger.kernel.org
Reviewed-by: Theodore Ts'o
Signed-off-by: Jan Kara
Link: https://lore.kernel.org/r/20191105164437.32602-1-jack@suse.cz
Signed-off-by: Theodore Ts'o
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2019-12-13 15:42:53 +0800
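
The underflow described in 1a6a96e0f can be illustrated with a minimal userspace sketch. The struct, the constants and both function names below are hypothetical stand-ins chosen for illustration, not the kernel's definitions; the sketch only contrasts unsigned subtraction, which wraps around, with signed subtraction clamped at zero.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the journal fields involved (not journal_t). */
struct toy_journal {
    unsigned long j_free;      /* free blocks in the journal */
    unsigned long committing;  /* credits of the committing transaction, 0 if none */
};

#define TOY_ROUNDING_SLACK        32  /* assumed allowance for rounding errors */
#define TOY_CONTROL_BLOCKS_SHIFT  5   /* assumed: control blocks ~ 1/32 of credits */

/* Unsigned arithmetic: wraps to a huge value when j_free is small. */
static unsigned long space_left_unsigned(const struct toy_journal *j)
{
    unsigned long free = j->j_free - TOY_ROUNDING_SLACK;

    free -= j->committing + (j->committing >> TOY_CONTROL_BLOCKS_SHIFT);
    return free;
}

/* Signed arithmetic clamped at zero: a nearly full journal reports 0. */
static unsigned long space_left_clamped(const struct toy_journal *j)
{
    assert(j->j_free <= LONG_MAX);

    long free = (long)j->j_free - TOY_ROUNDING_SLACK;

    free -= (long)(j->committing + (j->committing >> TOY_CONTROL_BLOCKS_SHIFT));
    return free > 0 ? (unsigned long)free : 0;
}
```

With j_free = 10 and nothing committing, the unsigned variant reports an enormous amount of free space, while the clamped variant reports 0.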

25 Sep, 2019

1 commit

963abb9ae jbd2: remove jbd2_journal_inode_add_[write|wait]

Since ext4/ocfs2 are using jbd2_inode dirty range scoping APIs now,
jbd2_journal_inode_add_[write|wait] are not used any more, remove them.

Link: http://lkml.kernel.org/r/1562977611-8412-2-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi
Reviewed-by: Ross Zwisler
Acked-by: Changwei Ge
Cc: Gang He
Cc: Joel Becker
Cc: Joseph Qi
Cc: Jun Piao
Cc: Junxiao Bi
Cc: Mark Fasheh
Cc: "Theodore Ts'o"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2019-09-25 06:54:07 +0800

21 Jun, 2019

2 commits

9382cde8c jbd2: drop declaration of journal_sync_buffer()

The journal_sync_buffer() function was never carried over from jbd to
jbd2. So get rid of the vestigial declaration of this (non-existent)
function.

Signed-off-by: Theodore Ts'o
Reviewed-by: Darrick J. Wong

Theodore Ts'o
2019-06-21 05:32:21 +0800
6ba0e7dc6 jbd2: introduce jbd2_inode dirty range scoping

Currently both journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() operate on the entire address space
of each of the inodes associated with a given journal entry. The
consequence of this is that if we have an inode where we are constantly
appending dirty pages we can end up waiting for an indefinite amount of
time in journal_finish_inode_data_buffers() while we wait for all the
pages under writeback to be written out.

The easiest way to cause this type of workload is to just dd from
/dev/zero to a file until it fills the entire filesystem. This can
cause journal_finish_inode_data_buffers() to wait for the duration of
the entire dd operation.

We can improve this situation by scoping each of the inode dirty ranges
associated with a given transaction. We do this via the jbd2_inode
structure so that the scoping is contained within jbd2 and so that it
follows the lifetime and locking rules for that structure.

This allows us to limit the writeback & wait in
journal_submit_inode_data_buffers() and
journal_finish_inode_data_buffers() respectively to the dirty range for
a given struct jbd2_inode, keeping us from waiting forever if the inode
in question is still being appended to.

Signed-off-by: Ross Zwisler
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara
Cc: stable@vger.kernel.org

Ross Zwisler
2019-06-21 05:24:56 +0800
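
The dirty range scoping described in 6ba0e7dc6 amounts to remembering the smallest start and largest end of the byte ranges dirtied under a transaction. A minimal userspace sketch of that bookkeeping follows; the type and function names are hypothetical and do not mirror the actual jbd2_inode fields.

```c
#include <assert.h>
#include <stdbool.h>

typedef long long toy_off_t;

/* Hypothetical per-inode dirty range; 'end' is inclusive. */
struct toy_range {
    bool      valid;
    toy_off_t start;
    toy_off_t end;
};

/* Widen the tracked range so it also covers [start, start + len - 1]. */
static void toy_range_add(struct toy_range *r, toy_off_t start, toy_off_t len)
{
    assert(len > 0);

    toy_off_t end = start + len - 1;

    if (!r->valid) {
        r->start = start;
        r->end = end;
        r->valid = true;
        return;
    }
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}
```

Writeback and waiting can then be limited to [start, end] rather than the whole address space, which is the effect the commit message describes.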

24 May, 2019

1 commit

d69100585 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 83

Based on 1 normalized pattern(s):

this file is part of the linux kernel and is made available under
the terms of the gnu general public license version 2 or at your
option any later version incorporated herein by reference

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 18 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Richard Fontana
Reviewed-by: Allison Randal
Reviewed-by: Armijn Hemel
Reviewed-by: Kate Stewart
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520075211.321157221@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-24 23:37:52 +0800

20 May, 2019

1 commit

c4d36b63b Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
"Some bug fixes, and an update to the URL's for the final version of
Unicode 12.1.0"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: avoid panic during forced reboot due to aborted journal
ext4: fix block validity checks for journal inodes using indirect blocks
unicode: update to Unicode 12.1.0 final
unicode: add missing check for an error return from utf8lookup()
ext4: fix miscellaneous sparse warnings
ext4: unsigned int compared against zero
ext4: fix use-after-free in dx_release()
ext4: fix data corruption caused by overlapping unaligned and aligned IO
jbd2: fix potential double free
ext4: zero out the unused memory region in the extent tree block

Linus Torvalds
2019-05-20 02:43:16 +0800

11 May, 2019

1 commit

0d52154bb jbd2: fix potential double free

When creation of the jbd2_inode_cache cache fails, we destroy the
previously created jbd2_handle_cache cache twice. This patch fixes
this by moving each cache initialization/destruction to its own
separate, individual function.

Signed-off-by: Chengguang Xu
Signed-off-by: Theodore Ts'o
Cc: stable@kernel.org

Chengguang Xu
2019-05-11 09:15:47 +0800
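
The shape of the fix in 0d52154bb, one init and one destroy function per cache, can be sketched in userspace as below. This is an illustration only: malloc() stands in for slab cache creation, every name is hypothetical, and clearing the pointer in each destroy function is an assumption about how a repeated destroy is made harmless.

```c
#include <assert.h>
#include <stdlib.h>

static void *toy_handle_cache;
static void *toy_inode_cache;
static int toy_fail_inode_cache;  /* test knob: force the second init to fail */

static int toy_init_handle_cache(void)
{
    assert(toy_handle_cache == NULL);
    toy_handle_cache = malloc(64);
    return toy_handle_cache ? 0 : -1;
}

static void toy_destroy_handle_cache(void)
{
    free(toy_handle_cache);       /* free(NULL) is a no-op */
    toy_handle_cache = NULL;      /* a second destroy becomes harmless */
}

static int toy_init_inode_cache(void)
{
    assert(toy_inode_cache == NULL);
    if (toy_fail_inode_cache)
        return -1;
    toy_inode_cache = malloc(64);
    return toy_inode_cache ? 0 : -1;
}

static void toy_destroy_inode_cache(void)
{
    free(toy_inode_cache);
    toy_inode_cache = NULL;
}

static void toy_destroy_caches(void)
{
    toy_destroy_handle_cache();
    toy_destroy_inode_cache();
}

/* Initialize both caches; on any failure tear down whatever was created. */
static int toy_init_caches(void)
{
    int ret = toy_init_handle_cache();

    if (ret == 0)
        ret = toy_init_inode_cache();
    if (ret != 0)
        toy_destroy_caches();
    return ret;
}
```

Because each destroy function resets its own pointer, a caller that runs the teardown again after a failed init does not free the same memory twice.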

25 Apr, 2019

1 commit

877b5691f crypto: shash - remove shash_desc::flags

The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
actually started sleeping. For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API. However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk. It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers
Signed-off-by: Herbert Xu

Eric Biggers
2019-04-25 15:38:12 +0800
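
The chunking helper that 877b5691f mentions as a possibility (named crypto_shash_update_large() in the message) can be sketched in userspace as follows. Everything here is a hypothetical stand-in: the toy hash is not a real algorithm, toy_yield() merely marks where a preemption point such as cond_resched() would go, and the chunk size is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy incremental hash; stands in for a real shash algorithm. */
struct toy_hash {
    uint32_t state;
};

static unsigned int toy_yield_count;

static void toy_hash_update(struct toy_hash *h, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        h->state = h->state * 31u + data[i];
}

/* Placeholder for a preemption point between chunks. */
static void toy_yield(void)
{
    toy_yield_count++;
}

/* Feed the data in pieces of at most 'chunk' bytes, yielding after each. */
static void toy_hash_update_large(struct toy_hash *h, const uint8_t *data,
                                  size_t len, size_t chunk)
{
    assert(chunk > 0);

    while (len > 0) {
        size_t n = len < chunk ? len : chunk;

        toy_hash_update(h, data, n);
        toy_yield();
        data += n;
        len -= n;
    }
}
```

The chunked path produces the same state as a single update over the whole buffer, so neither the descriptor nor the algorithm needs a flag for it.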

04 Dec, 2018

2 commits

32ea27500 jbd2: update locking documentation for transaction_t

The following members of struct transaction_s aka transaction_t
were turned into lock-free variables in the past:
- t_updates
- t_outstanding_credits
- t_handle_count
However, the documentation has not been updated yet.
This commit replaces the annotated lock with [none].

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf Spinczyk)

Signed-off-by: Alexander Lochmann
Signed-off-by: Horst Schirmeier
Signed-off-by: Theodore Ts'o

Alexander Lochmann
2018-12-04 13:30:22 +0800
96f1e0974 jbd2: avoid long hold times of j_state_lock while committing a transaction

We can hold j_state_lock for writing at the beginning of
jbd2_journal_commit_transaction() for a rather long time (reportedly for
30 ms) due to cleaning revoke bits of all revoked buffers under it. The
handling of revoke tables as well as cleaning of t_reserved_list and
checkpoint lists does not need j_state_lock for anything. It is only
needed to prevent new handles from joining the transaction. Generally
the T_LOCKED transaction state prevents new handles from joining the
transaction - except for reserved handles, which have to be allowed to
join while we wait for other handles to complete.

To prevent reserved handles from joining the transaction while cleaning
up lists, add new transaction state T_SWITCH and watch for it when
starting reserved handles. With this we can just drop the lock for
operations that don't need it.

Reported-and-tested-by: Adrian Hunter
Suggested-by: "Theodore Y. Ts'o"
Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2018-12-04 12:16:07 +0800

10 Jan, 2018

1 commit

f69120ce6 jbd2: fix sphinx kernel-doc build warnings

Sphinx emits various (26) warnings when building make target 'htmldocs'.
Currently struct definitions contain duplicate documentation, some as
kernel-docs and some as standard c89 comments. We can reduce
duplication while cleaning up the kernel docs.

Move all kernel-docs to right above each struct member. Use the set of
all existing comments (kernel-doc and c89). Add documentation for
missing struct members and function arguments.

Signed-off-by: Tobin C. Harding
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org

Tobin C. Harding
2018-01-10 13:27:29 +0800

03 Nov, 2017

1 commit

b8a6176c2 ext4: Support for synchronous DAX faults

We return IOMAP_F_DIRTY flag from ext4_iomap_begin() when asked to
prepare blocks for writing and the inode has some uncommitted metadata
changes. In the fault handler ext4_dax_fault() we then detect this case
(through VM_FAULT_NEEDDSYNC return value) and call helper
dax_finish_sync_fault() to flush metadata changes and insert page table
entry. Note that this will also dirty corresponding radix tree entry
which is what we want - fsync(2) will still provide data integrity
guarantees for applications not using userspace flushing. And
applications using userspace flushing can avoid calling fsync(2) and
thus avoid the performance overhead.

Reviewed-by: Ross Zwisler
Signed-off-by: Jan Kara
Signed-off-by: Dan Williams

Jan Kara
2017-11-03 21:26:26 +0800

04 May, 2017

1 commit

81378da64 jbd2: mark the transaction context with the scope GFP_NOFS context

Now that we have the memalloc_nofs_{save,restore} API we can mark the whole
transaction context as implicitly GFP_NOFS. All allocations will
automatically inherit GFP_NOFS this way. This means that we do not have
to mark any of those requests with GFP_NOFS and moreover all the
ext4_kv[mz]alloc(GFP_NOFS) are also safe now because even the hardcoded
GFP_KERNEL allocations deep inside the vmalloc will be NOFS now.

[akpm@linux-foundation.org: tweak comments]
Link: http://lkml.kernel.org/r/20170306131408.9828-7-mhocko@kernel.org
Signed-off-by: Michal Hocko
Reviewed-by: Jan Kara
Cc: Dave Chinner
Cc: Theodore Ts'o
Cc: Chris Mason
Cc: David Sterba
Cc: Brian Foster
Cc: Darrick J. Wong
Cc: Nikolay Borisov
Cc: Peter Zijlstra
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2017-05-04 06:52:09 +0800
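
The scoped behaviour that 81378da64 relies on, where a save call marks the current context and every allocation inside the scope implicitly loses the "may enter the filesystem" bit, can be modelled in userspace roughly as below. The flag values and all toy_* names are hypothetical; only the save/restore nesting pattern is the point.

```c
#include <assert.h>

#define TOY_PF_NOFS     0x1u                   /* per-task "no FS reclaim" marker */
#define TOY_GFP_FS      0x2u                   /* allocation may recurse into the FS */
#define TOY_GFP_KERNEL  (TOY_GFP_FS | 0x4u)

static unsigned int toy_task_flags;            /* stands in for per-task state */

/* Enter a NOFS scope; returns the previous marker so scopes can nest. */
static unsigned int toy_nofs_save(void)
{
    unsigned int old = toy_task_flags & TOY_PF_NOFS;

    toy_task_flags |= TOY_PF_NOFS;
    return old;
}

/* Leave a NOFS scope, restoring whatever the matching save observed. */
static void toy_nofs_restore(unsigned int old)
{
    assert((old & ~TOY_PF_NOFS) == 0);
    toy_task_flags = (toy_task_flags & ~TOY_PF_NOFS) | old;
}

/* Mask applied to every allocation request made inside the scope. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
    if (toy_task_flags & TOY_PF_NOFS)
        gfp &= ~TOY_GFP_FS;
    return gfp;
}
```

Returning the old marker from the save call is what lets an inner scope end without prematurely re-enabling filesystem reclaim for the outer one.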

30 Jun, 2016

2 commits

1eaa566d3 jbd2: track more dependencies on transaction commit

So far we were tracking only dependency on transaction commit due to
starting a new handle (which may require commit to start a new
transaction). Now add tracking also for other cases where we wait for
transaction commit. This way lockdep can catch deadlocks, e.g. because we
call jbd2_journal_stop() for a synchronous handle with some locks held
which rank below transaction start.

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-06-30 23:40:54 +0800
ab714aff4 jbd2: move lockdep tracking to journal_s

Currently lockdep map is tracked in each journal handle. To be able to
expand lockdep support to cover also other cases where we depend on
transaction commit and where handle is not available, move lockdep map
into struct journal_s. Since this makes the lockdep map shared for all
handles, we have to use rwsem_acquire_read() for acquisitions now.

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-06-30 23:39:38 +0800

06 May, 2016

1 commit

466c3fb61 jbd2: remove excess descriptions for handle_s

Commit bf6993276f74 ("jbd2: Use tracepoints for history file")
removed the members j_history, j_history_max and j_history_cur from struct
handle_s but their descriptions lingered. Remove them.

Signed-off-by: Luis de Bethencourt
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara

Luis de Bethencourt
2016-05-06 10:35:54 +0800

24 Apr, 2016

1 commit

41617e1a8 jbd2: add support for avoiding data writes during transaction commits

Currently when filesystem needs to make sure data is on permanent
storage before committing a transaction it adds inode to transaction's
inode list. During transaction commit, jbd2 writes back all dirty
buffers that have allocated underlying blocks and waits for the IO to
finish. However when doing writeback for delayed allocated data, we
allocate blocks and immediately submit the data. Thus asking jbd2 to
write dirty pages just unnecessarily adds more work to jbd2 possibly
writing back other redirtied blocks.

Add support to jbd2 to allow filesystem to ask jbd2 to only wait for
outstanding data writes before committing a transaction and thus avoid
unnecessary writes.

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-04-24 12:56:07 +0800

23 Feb, 2016

3 commits

1101cd4d1 jbd2: unify revoke and tag block checksum handling

Revoke and tag descriptor blocks are just different kinds of descriptor
blocks and thus have checksum in the same place. Unify computation and
checking of checksums for these.

Reviewed-by: Darrick J. Wong
Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-02-23 12:19:09 +0800
32ab67159 jbd2: factor out common descriptor block initialization

Descriptor block header is initialized in several places. Factor out the
common code into jbd2_journal_get_descriptor_buffer().

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-02-23 12:17:15 +0800
9bcf976cb jbd2: remove unnecessary arguments of jbd2_journal_write_revoke_records

jbd2_journal_write_revoke_records() takes journal pointer and write_op,
although journal can be obtained from the passed transaction and
write_op is always WRITE_SYNC. Remove these superfluous arguments.

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2016-02-23 12:07:30 +0800

19 Oct, 2015

1 commit

4327ba52a ext4, jbd2: ensure entering into panic after recording an error in superblock

If an EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
journaling will be aborted first and the error number will be recorded
into the JBD2 superblock and, finally, the system will enter into the
panic state when the "errors=panic" option is set. But, in a rare case,
this sequence is a little twisted, as in the figure below, and the
system enters the panic state, which means a system reset in a mobile
environment, before the error has been recorded in the journal
superblock. In this case, e2fsck cannot recognize that the filesystem
failure occurred in the previous run and the corruption wouldn't be
fixed.

Task A                        Task B
ext4_handle_error()
-> jbd2_journal_abort()
  -> __journal_abort_soft()
    -> __jbd2_journal_abort_hard()
    | -> journal->j_flags |= JBD2_ABORT;
    |
    |                         __ext4_abort()
    |                         -> jbd2_journal_abort()
    |                         | -> __journal_abort_soft()
    |                         |   -> if (journal->j_flags & JBD2_ABORT)
    |                         |           return;
    |                         -> panic()
    |
    -> jbd2_journal_update_sb_errno()

Tested-by: Hobin Woo
Signed-off-by: Daeho Jeong
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org

Daeho Jeong
2015-10-19 05:02:56 +0800

18 Oct, 2015

2 commits

56316a0d2 jbd2: clean up feature test macros with predicate functions

Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros. Furthermore, clean out the
places where we open-coded feature tests.

Signed-off-by: Darrick J. Wong
Signed-off-by: Theodore Ts'o

Darrick J. Wong
2015-10-18 04:18:45 +0800
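
The idea in 56316a0d2, generating small test/set/clear predicate functions per feature flag instead of spelling out a wordy macro at each call site, can be sketched as below. The struct, the flag values and the toy_* names are hypothetical, and the sketch ignores on-disk endianness conversion entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical superblock with a single feature word, host-endian. */
struct toy_sb {
    uint32_t s_feature_incompat;
};

#define TOY_FEATURE_INCOMPAT_64BIT    0x02u
#define TOY_FEATURE_INCOMPAT_CSUM_V3  0x10u

/* Generate has/set/clear helpers for one incompat feature flag. */
#define TOY_FEATURE_INCOMPAT_FUNCS(name, flagname) \
static inline bool toy_has_feature_##name(const struct toy_sb *sb) \
{ \
    assert(sb != NULL); \
    return (sb->s_feature_incompat & TOY_FEATURE_INCOMPAT_##flagname) != 0; \
} \
static inline void toy_set_feature_##name(struct toy_sb *sb) \
{ \
    assert(sb != NULL); \
    sb->s_feature_incompat |= TOY_FEATURE_INCOMPAT_##flagname; \
} \
static inline void toy_clear_feature_##name(struct toy_sb *sb) \
{ \
    assert(sb != NULL); \
    sb->s_feature_incompat &= ~TOY_FEATURE_INCOMPAT_##flagname; \
}

TOY_FEATURE_INCOMPAT_FUNCS(64bit, 64BIT)
TOY_FEATURE_INCOMPAT_FUNCS(csum3, CSUM_V3)
```

A call site then reads toy_has_feature_64bit(sb) rather than an open-coded mask test, which is the readability gain the commit message refers to.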
6a797d273 ext4: call out CRC and corruption errors with specific error codes

Instead of overloading EIO for CRC errors and corrupt structures,
return the same error codes that XFS returns for the same issues.

Signed-off-by: Darrick J. Wong
Signed-off-by: Theodore Ts'o

Darrick J. Wong
2015-10-18 04:16:04 +0800

15 Oct, 2015

1 commit

8595798ca jbd2: gate checksum calculations on crc driver presence, not sb flags

Change the journal's checksum functions to gate on whether or not the
crc32c driver is loaded, and gate the loading on the superblock bits.
This prevents a journal crash if someone loads a journal in no-csum
mode and then randomizes the superblock, thus flipping on the feature
bits.

Tested-By: Nikolay Borisov
Reported-by: Nikolay Borisov
Signed-off-by: Darrick J. Wong
Signed-off-by: Theodore Ts'o

Darrick J. Wong
2015-10-15 22:30:36 +0800

04 Sep, 2015

1 commit

ea814ab9a Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"Pretty much all bug fixes and clean ups for 4.3, after a lot of
features and other churn going into 4.2"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
Revert "ext4: remove block_device_ejected"
ext4: ratelimit the file system mounted message
ext4: silence a format string false positive
ext4: simplify some code in read_mmp_block()
ext4: don't manipulate recovery flag when freezing no-journal fs
jbd2: limit number of reserved credits
ext4 crypto: remove duplicate header file
ext4: update c/mtime on truncate up
jbd2: avoid infinite loop when destroying aborted journal
ext4, jbd2: add REQ_FUA flag when recording an error in the superblock
ext4 crypto: fix spelling typo in comment
ext4 crypto: exit cleanly if ext4_derive_key_aes() fails
ext4: reject journal options for ext2 mounts
ext4: implement cgroup writeback support
ext4: replace ext4_io_submit->io_op with ->io_wbc
ext4 crypto: check for too-short encrypted file names
ext4 crypto: use a jbd2 transaction when adding a crypto policy
jbd2: speedup jbd2_journal_dirty_metadata()

Linus Torvalds
2015-09-04 03:52:19 +0800

29 Jul, 2015

1 commit

841df7df1 jbd2: avoid infinite loop when destroying aborted journal

Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
when the journal is aborted. That makes the logic in
jbd2_log_do_checkpoint() bail out, which is fine, except that
jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
progress in cleaning the journal. Without it jbd2_journal_destroy()
just loops forever.

Fix jbd2_journal_destroy() to clean up the journal checkpoint lists if
jbd2_log_do_checkpoint() fails with an error.

Reported-by: Eryu Guan
Tested-by: Eryu Guan
Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2015-07-29 02:57:14 +0800
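
The loop problem in 841df7df1 can be reproduced in miniature: a drain loop that assumes every checkpoint call makes progress never terminates once the call starts failing. The userspace sketch below uses hypothetical names and a plain counter in place of the checkpoint lists; it shows the fallback of forcibly dropping the remaining entries when checkpointing reports an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical journal: just a count of transactions awaiting checkpoint. */
struct toy_journal {
    int  checkpoint_txns;
    bool aborted;
};

/* Checkpoint one transaction; fails without progress if the journal is aborted. */
static int toy_do_checkpoint(struct toy_journal *j)
{
    if (j->aborted)
        return -1;  /* stands in for an I/O error return */
    j->checkpoint_txns--;
    return 0;
}

/* Forcibly drop everything still on the checkpoint list. */
static void toy_destroy_checkpoint(struct toy_journal *j)
{
    j->checkpoint_txns = 0;
}

/* Drain the checkpoint list; returns the number of loop iterations used. */
static int toy_journal_destroy(struct toy_journal *j)
{
    int iterations = 0;

    assert(j->checkpoint_txns >= 0);
    while (j->checkpoint_txns > 0) {
        iterations++;
        if (toy_do_checkpoint(j) != 0) {
            /* No progress is possible: clean up by force and stop. */
            toy_destroy_checkpoint(j);
            break;
        }
    }
    return iterations;
}
```

Without the error branch, an aborted journal with a non-empty list would keep the while loop spinning forever.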

24 Jul, 2015

1 commit

c290ea01a fs: Remove ext3 filesystem driver

The functionality of ext3 is fully supported by ext4 driver. Major
distributions (SUSE, RedHat) already use ext4 driver to handle ext3
filesystems for quite some time. There is some ugliness in mm resulting
from jbd cleaning buffers in a dirty page without cleaning page dirty
bit and also support for buffer bouncing in the block layer when stable
pages are required is there only because of jbd. So let's remove the
ext3 driver. This saves us some 28k lines of duplicated code.

Acked-by: Theodore Ts'o
Signed-off-by: Jan Kara

Jan Kara
2015-07-24 02:59:40 +0800

16 Jun, 2015

1 commit

6f6a6fda2 jbd2: fix ocfs2 corrupt when updating journal superblock fails

If updating the journal superblock fails after journal data has been
flushed, the error is ignored and the caller is misled into treating
this as the normal case. In ocfs2, the checkpoint will be treated as
successful and the other node can get the lock to update. Since
sb_start still points to the old log block, the other node will
rewrite the journal data during journal recovery. Thus the new updates
will be overwritten and ocfs2 is corrupted. So in the above case we
have to return the error, and ocfs2_commit_cache will take care of the
error and prevent the other node from updating first. Only after
recovering the journal can it do the new updates.

The issue discussion mail can be found at:
https://oss.oracle.com/pipermail/ocfs2-devel/2015-June/010856.html
http://comments.gmane.org/gmane.comp.file-systems.ext4/48841

[ Fixed bug in patch which allowed a non-negative error return from
jbd2_cleanup_journal_tail() to leak out of jbd2_journal_flush(); this
was causing xfstests ext4/306 to fail. -- Ted ]

Reported-by: Yiwen Jiang
Signed-off-by: Joseph Qi
Signed-off-by: Theodore Ts'o
Tested-by: Yiwen Jiang
Cc: Junxiao Bi
Cc: stable@vger.kernel.org

Joseph Qi
2015-06-16 02:36:01 +0800

15 Jan, 2015

1 commit

c38fda3fe jbd: drop jbd_ENOSYS debug

A quick search shows that there are no users, drop the
macro for both jbd and jbd2.

Signed-off-by: Davidlohr Bueso
Cc: Jan Kara
Signed-off-by: Jan Kara

Davidlohr Bueso
2015-01-15 17:34:54 +0800

18 Sep, 2014

1 commit

50849db32 jbd2: simplify calling convention around __jbd2_journal_clean_checkpoint_list

__jbd2_journal_clean_checkpoint_list() returns the number of buffers it
freed but no one was using the value so just stop doing that. This
also allows for simplifying the calling convention for
journal_clean_one_cp_list().

Signed-off-by: Jan Kara
Signed-off-by: Theodore Ts'o

Jan Kara
2014-09-18 12:58:12 +0800

29 Aug, 2014

1 commit

db9ee2203 jbd2: fix descriptor block size handling errors with journal_csum

It turns out that there are some serious problems with the on-disk
format of journal checksum v2. The foremost is that the function to
calculate descriptor tag size returns sizes that are too big. This
causes alignment issues on some architectures and is compounded by the
fact that some parts of jbd2 use the structure size (incorrectly) to
determine the presence of a 64bit journal instead of checking the
feature flags.

Therefore, introduce journal checksum v3, which enlarges the
descriptor block tag format to allow for full 32-bit checksums of
journal blocks, fix the journal tag function to return the correct
sizes, and fix the jbd2 recovery code to use feature flags to
determine 64bitness.

Add a few function helpers so we don't have to open-code quite so
many pieces.

Switching to a 16-byte block size was found to increase journal size
overhead by a maximum of 0.1%, to convert a 32-bit journal with no
checksumming to a 32-bit journal with checksum v3 enabled.

Signed-off-by: Darrick J. Wong
Reported-by: TR Reardon
Signed-off-by: Theodore Ts'o
Cc: stable@vger.kernel.org

Darrick J. Wong
2014-08-29 10:22:29 +0800

01 Jul, 2013

1 commit

41a5b9131 jbd2: invalidate handle if jbd2_journal_restart() fails

If jbd2_journal_restart() fails the handle will have been disconnected
from the current transaction. In this situation, the handle must not
be used for any jbd2 function other than jbd2_journal_stop().
Enforce this by treating a handle which has a NULL transaction
pointer as an aborted handle, and issue a kernel warning if
jbd2_journal_extend(), jbd2_journal_get_write_access(),
jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.

This commit also fixes a bug where jbd2_journal_stop() would trip over
a kernel jbd2 assertion check when trying to free an invalid handle.

Also move the responsibility of setting current->journal_info to
start_this_handle(), simplifying the three users of this function.

Signed-off-by: "Theodore Ts'o"
Reported-by: Younger Liu
Cc: Jan Kara

Theodore Ts'o
2013-07-01 20:12:41 +0800
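
The rule introduced by 41a5b9131, that a handle with a NULL transaction pointer counts as aborted, can be sketched in userspace as below. The structs and toy_* functions are hypothetical, and returning -EROFS from the guarded operation is an assumption made for the sake of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_transaction {
    int tid;
};

/* Hypothetical handle: detached from its transaction when a restart fails. */
struct toy_handle {
    struct toy_transaction *h_transaction;
    bool                    h_aborted;
};

/* A handle is unusable if it was aborted or has no transaction attached. */
static bool toy_handle_aborted(const struct toy_handle *h)
{
    assert(h != NULL);
    return h->h_aborted || h->h_transaction == NULL;
}

/* Example of an operation that must refuse to run on an invalid handle. */
static int toy_dirty_metadata(struct toy_handle *h)
{
    if (toy_handle_aborted(h))
        return -EROFS;  /* assumed error code for illustration */
    return 0;
}
```

Checking validity in one predicate means every entry point rejects a detached handle the same way, leaving the stop path as the only operation still allowed on it.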

13 Jun, 2013

4 commits

169f1a2a8 jbd2: use a single printk for jbd_debug()

Since the jbd_debug() is implemented with two separate printk()
calls, it can lead to corrupted and misleading debug output like
the following (see lines marked with "*"):

[ 290.339362] (fs/jbd2/journal.c, 203): kjournald2: kjournald2 wakes
[ 290.339365] (fs/jbd2/journal.c, 155): kjournald2: commit_sequence=42103, commit_request=42104
[ 290.339369] (fs/jbd2/journal.c, 158): kjournald2: OK, requests differ
[* 290.339376] (fs/jbd2/journal.c, 648): jbd2_log_wait_commit:
[* 290.339379] (fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: want 42104, j_commit_sequence=42103
[* 290.339382] JBD2: starting commit of transaction 42104
[ 290.339410] (fs/jbd2/revoke.c, 566): jbd2_journal_write_revoke_records: Wrote 0 revoke records
[ 290.376555] (fs/jbd2/commit.c, 1088): jbd2_journal_commit_transaction: JBD2: commit 42104 complete, head 42079

i.e. the debug output from log_wait_commit and journal_commit_transaction
have become interleaved. The output should have been:

(fs/jbd2/journal.c, 648): jbd2_log_wait_commit: JBD2: want 42104, j_commit_sequence=42103
(fs/jbd2/commit.c, 370): jbd2_journal_commit_transaction: JBD2: starting commit of transaction 42104

It is expected that this is not easy to replicate -- I was only able
to cause it on preempt-rt kernels, and even then only under heavy
I/O load.

Reported-by: Paul Gortmaker
Suggested-by: "Theodore Ts'o"
Signed-off-by: Paul Gortmaker
Signed-off-by: "Theodore Ts'o"

Paul Gortmaker
2013-06-13 11:04:04 +0800
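
The interleaving described in 169f1a2a8 goes away when the location prefix and the message body are formatted into one buffer and emitted with a single call. A userspace sketch of that approach follows; the function names are hypothetical and stderr stands in for the kernel log.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "(file, line): func: <message>" in one buffer.
 * Returns the number of characters the full line needs, or a negative
 * value on a formatting error.
 */
static int toy_format_debug(char *buf, size_t size, const char *file, int line,
                            const char *func, const char *fmt, ...)
{
    va_list args;
    int prefix, body;

    assert(buf != NULL && size > 0);

    prefix = snprintf(buf, size, "(%s, %d): %s: ", file, line, func);
    if (prefix < 0 || (size_t)prefix >= size)
        return prefix;  /* error, or the prefix alone was truncated */

    va_start(args, fmt);
    body = vsnprintf(buf + prefix, size - (size_t)prefix, fmt, args);
    va_end(args);

    return body < 0 ? body : prefix + body;
}

/* Emit the whole line with one write so other output cannot split it. */
static void toy_debug_emit(const char *msg)
{
    fwrite(msg, 1, strlen(msg), stderr);
}
```

Formatting first and writing once keeps the prefix and body adjacent in the output, in contrast to the two separate calls the commit message describes.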
c9b3a8ccb jbd/jbd2: relocate bit_spinlock header to jbd_common

The bit_spinlock functions are only used for the jbd_lock_bh_state
functions (and friends) in jbd_common.h and are not directly used
by either of jbd.h or jbd2.h content.

The jbd_common file is new as of commit 446066724c36 ("jdb/jbd2: factor
out common functions from the jbd[2] header files") but common
(and isolated) headers were not considered for factoring at that time.

Signed-off-by: Paul Gortmaker
Signed-off-by: "Theodore Ts'o"

Paul Gortmaker
2013-06-13 11:02:35 +0800
06a407f13 ext4: fix data integrity for ext4_sync_fs

Inode's data or non-journaled quota may be written w/o the journal, so we
_must_ send a barrier at the end of ext4_sync_fs. But it can be
skipped if the journal commit will do it for us.

Also fix data integrity for nojournal mode.

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2013-06-13 10:25:07 +0800
9ff864462 jbd2: optimize jbd2_journal_force_commit

The current implementation of jbd2_journal_force_commit() is suboptimal
because it results in empty and useless commits. But callers just want to
force and wait for any unfinished commits. We already have
jbd2_journal_force_commit_nested() which does exactly what we want, except
we are guaranteed that we do not hold a journal transaction open.

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2013-06-13 10:24:07 +0800

05 Jun, 2013

4 commits

8f7d89f36 jbd2: transaction reservation support

In some cases we cannot start a transaction because of locking
constraints and passing started transaction into those places is not
handy either because we could block transaction commit for too long.
Transaction reservation is designed to solve these issues. It
reserves a handle with given number of credits in the journal and the
handle can be later attached to the running transaction without
blocking on commit or checkpointing. Reserved handles do not block
transaction commit in any way, they only reduce maximum size of the
running transaction (because we have to always be prepared to
accommodate a request for attaching a reserved handle).

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:35:11 +0800
f29fad721 jbd2: remove unused waitqueues

j_wait_logspace and j_wait_checkpoint are unused. Remove them.

Reviewed-by: Zheng Liu
Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:24:11 +0800
76c399045 jbd2: cleanup needed free block estimates when starting a transaction

__jbd2_log_space_left() and jbd_space_needed() were kind of odd.
jbd_space_needed() accounted also credits needed for currently
committing transaction while it didn't account for credits needed for
control blocks. __jbd2_log_space_left() then accounted for control
blocks as a fraction of free space. Since results of these two
functions are always only compared against each other, this works
correctly but is somewhat strange. Move the estimates so that
jbd_space_needed() returns number of blocks needed for a transaction
including control blocks and __jbd2_log_space_left() returns free
space in the journal (with the committing transaction already
subtracted). Rename functions to jbd2_log_space_left() and
jbd2_space_needed() while we are changing them.

Reviewed-by: Zheng Liu
Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:12:57 +0800
b34090e5e jbd2: refine waiting for shadow buffers

Currently when we add a buffer to a transaction, we wait until the
buffer is removed from BJ_Shadow list (so that we prevent any changes
to the buffer that is just written to the journal). This can take
unnecessarily long as a lot happens between the time the buffer is
submitted to the journal and the time when we remove the buffer from
BJ_Shadow list. (e.g. We wait for all data buffers in the
transaction, we issue a cache flush, etc.) Also this creates a
dependency of do_get_write_access() on transaction commit (namely
waiting for data IO to complete) which we want to avoid when
implementing transaction reservation.

So we modify commit code to set new BH_Shadow flag when temporary
shadowing buffer is created and we clear that flag once IO on that
buffer is complete. This allows do_get_write_access() to wait only
for BH_Shadow bit and thus removes the dependency on data IO
completion.

Reviewed-by: Zheng Liu
Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:08:56 +0800