Eric Lee / smarc-fsl-linux-kernel

18 Oct, 2015

1 commit

6934da923 ext4: fix potential use after free in __ext4_journal_stop

There is a use-after-free possibility in __ext4_journal_stop() in the
case that we free the handle in the first jbd2_journal_stop() because
we're referencing handle->h_err afterwards. This was introduced in
9705acd63b125dee8b15c705216d7186daea4625 and it is wrong. Fix it by
storing the handle->h_err value beforehand and avoiding any reference
to the potentially freed handle.
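
The ordering described above can be sketched in user space. This is a minimal model, not the kernel code: struct mock_handle, mock_journal_stop() and mock_ext4_journal_stop() are hypothetical stand-ins, and only the "read h_err before the call that may free the handle" pattern is taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the jbd2 handle; only the fields the sketch needs. */
struct mock_handle {
	int h_ref;	/* nesting count */
	int h_err;	/* error recorded on the handle */
};

/* Allocate a handle with one reference and the given recorded error. */
static struct mock_handle *mock_handle_new(int err)
{
	struct mock_handle *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->h_ref = 1;
	h->h_err = err;
	return h;
}

/* Models a stop call that frees the handle once the last reference is dropped. */
static int mock_journal_stop(struct mock_handle *h)
{
	if (--h->h_ref > 0)
		return 0;
	free(h);
	return 0;
}

/* Fixed pattern: copy h_err first, then never touch the handle again. */
static int mock_ext4_journal_stop(struct mock_handle *h)
{
	int err = h->h_err;		/* read while the handle is still valid */
	int rc = mock_journal_stop(h);	/* the handle may be freed here */

	if (!err)
		err = rc;
	return err;
}
```

Reading h->h_err after mock_journal_stop() returns would be the bug this commit removes; in the sketch the handle is never dereferenced after that call.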

Fixes: 9705acd63b125dee8b15c705216d7186daea4625
Signed-off-by: Lukas Czerner
Reviewed-by: Andreas Dilger
Cc: stable@vger.kernel.org

Lukas Czerner
2015-10-18 10:57:06 +0800

15 May, 2015

1 commit

9d5065940 ext4: fix NULL pointer dereference when journal restart fails

Currently when journal restart fails, we'll have the h_transaction of
the handle set to NULL to indicate that the handle has been effectively
aborted. We handle this situation quietly in the jbd2_journal_stop() and just
free the handle and exit because everything else has been done before we
attempted (and failed) to restart the journal.

Unfortunately there are a number of problems with that approach
introduced with commit

41a5b913197c "jbd2: invalidate handle if jbd2_journal_restart()
fails"

First of all in ext4 jbd2_journal_stop() will be called through
__ext4_journal_stop() where we would try to get a hold of the superblock
by dereferencing h_transaction which in this case would lead to NULL
pointer dereference and crash.

In addition we're going to free the handle regardless of the refcount
which is bad as well, because others up the call chain will still
reference the handle so we might potentially reference already freed
memory.

Moreover, it's expected that we'll get an aborted handle as well as a
detached handle in some of the journalling functions as the error
propagates up the stack, so it's unnecessary to call WARN_ON every time
we get a detached handle.

And finally we might leak some memory by forgetting to free reserved
handle in jbd2_journal_stop() in the case where handle was detached from
the transaction (h_transaction is NULL).

Fix the NULL pointer dereference in __ext4_journal_stop() by just
calling jbd2_journal_stop() quietly as suggested by Jan Kara. Also fix
the potential memory leak in jbd2_journal_stop() and use proper
handle refcounting before we attempt to free it to avoid use-after-free
issues.

And finally remove all WARN_ON(!transaction) from the code so that we do
not get random traces when something goes wrong because when journal
restart fails we will get to some of those functions.
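
A rough user-space sketch of the detached-handle path described above, under stated assumptions: struct toy_handle, toy_free_handle() and toy_journal_stop() are hypothetical names, and the path for a handle that is still attached to a transaction is deliberately not modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the jbd2 handle. */
struct toy_handle {
	void *h_transaction;			/* NULL once the handle is detached */
	struct toy_handle *h_rsv_handle;	/* optional reserved handle */
	int h_ref;				/* nesting count */
};

static int toy_frees;	/* counts frees so the behaviour can be observed */

static void toy_free_handle(struct toy_handle *h)
{
	free(h);
	toy_frees++;
}

/* Stop a handle; only the detached case (h_transaction == NULL) is modelled. */
static int toy_journal_stop(struct toy_handle *h)
{
	if (!h->h_transaction) {
		/* Others up the call chain still hold the handle: just drop a ref. */
		if (--h->h_ref > 0)
			return 0;
		/* Last reference: free the reserved handle too, so it is not leaked. */
		if (h->h_rsv_handle)
			toy_free_handle(h->h_rsv_handle);
		toy_free_handle(h);
		return 0;
	}
	/* Attached handles are outside the scope of this sketch. */
	return 0;
}
```

With h_ref set to 2, the first stop frees nothing and the second frees both the handle and its reserved handle, with no warning emitted in either call.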

Cc: stable@vger.kernel.org
Signed-off-by: Lukas Czerner
Signed-off-by: Theodore Ts'o
Reviewed-by: Jan Kara

Lukas Czerner
2015-05-15 06:55:18 +0800

02 Oct, 2014

1 commit

c5d311926 ext4: fix over-defensive complaint after journal abort

Reviewed-by: Jan Kara
Signed-off-by: Dmitry Monakhov
Signed-off-by: Theodore Ts'o

Dmitry Monakhov
2014-10-02 10:23:15 +0800

12 May, 2014

1 commit

c197855ea ext4: make local functions static

I have been running make namespacecheck to look for unneeded globals, and
found these in ext4.

Signed-off-by: Stephen Hemminger
Signed-off-by: "Theodore Ts'o"

Stephen Hemminger
2014-05-12 22:50:23 +0800

13 Mar, 2014

1 commit

66a4cb187 jbd2: improve error messages for inconsistent journal heads

Fix up error messages printed when the transaction pointers in a
journal head are inconsistent. This improves the error messages which
are printed when running xfstests generic/068 in data=journal mode.
See the bug report at: https://bugzilla.kernel.org/show_bug.cgi?id=60786

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2014-03-13 04:38:03 +0800

02 Dec, 2013

1 commit

ae1495b12 ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() fails

While it's true that errors can only happen if there is a bug in
jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt
the kernel or remount the file system read-only in order to avoid
further data loss. The ext4_journal_abort_handle() function doesn't
do any of this, and while this call (since it doesn't adjust
refcounts) will likely result in the file system eventually
deadlocking since the current transaction will never be able to close,
it's much cleaner to let ext4's error handling system deal with
this situation.

There's a separate bug here which is that if certain jbd2 errors
occur and the file system is mounted errors=continue, the file
system will probably eventually grind to a halt as described
above. But things have been this way for a long time, and usually when
we have these sorts of errors it's pretty much a disaster --- and
that's why the jbd2 layer aggressively retries memory allocations,
which is the most likely cause of these jbd2 errors.

Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara
Cc: stable@vger.kernel.org

Theodore Ts'o
2013-12-02 22:31:36 +0800

12 Aug, 2013

1 commit

91aa11fae jbd2: Fix use after free after error in jbd2_journal_dirty_metadata()

When jbd2_journal_dirty_metadata() returns an error,
__ext4_handle_dirty_metadata() stops the handle. However, callers of this
function do not account for that and still happily use the now-freed
handle. This use after free can result in various issues, but very likely
we oops soon.

The motivation of adding __ext4_journal_stop() into
__ext4_handle_dirty_metadata() in commit 9ea7a0df seems to be only to
improve error reporting. So replace __ext4_journal_stop() with
ext4_journal_abort_handle() which was there before that commit and add
WARN_ON_ONCE() to dump stack to provide useful information.

Reported-by: Sage Weil
Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org # 3.2+

Jan Kara
2013-08-12 21:53:28 +0800

05 Jun, 2013

2 commits

5fe2fe895 ext4: provide wrappers for transaction reservation calls

Reviewed-by: Zheng Liu
Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:37:50 +0800
8f7d89f36 jbd2: transaction reservation support

In some cases we cannot start a transaction because of locking
constraints and passing started transaction into those places is not
handy either because we could block transaction commit for too long.
Transaction reservation is designed to solve these issues. It
reserves a handle with given number of credits in the journal and the
handle can be later attached to the running transaction without
blocking on commit or checkpointing. Reserved handles do not block
transaction commit in any way, they only reduce maximum size of the
running transaction (because we have to always be prepared to
accommodate a request for attaching a reserved handle).
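
The credit accounting behind this can be illustrated with a toy model. Everything here is an assumption made for illustration (struct toy_journal and the toy_* functions are not the jbd2 API, and real code would wait for a commit rather than fail): reserved credits shrink the room available to ordinary handles, so attaching a reserved handle later never has to block.

```c
#include <assert.h>

/* Toy journal; all names are hypothetical. */
struct toy_journal {
	int max_credits;	/* upper bound for one transaction */
	int reserved;		/* credits promised to reserved handles */
	int running;		/* credits used by the running transaction */
};

/* Set credits aside without attaching anything to the running transaction. */
static int toy_reserve(struct toy_journal *j, int credits)
{
	if (j->running + j->reserved + credits > j->max_credits)
		return -1;
	j->reserved += credits;
	return 0;
}

/* Ordinary start: must leave room for every outstanding reservation. */
static int toy_start(struct toy_journal *j, int credits)
{
	if (j->running + credits > j->max_credits - j->reserved)
		return -1;	/* a real journal would wait for a commit here */
	j->running += credits;
	return 0;
}

/* Attach a reserved handle: the space was guaranteed, so no check is needed. */
static void toy_start_reserved(struct toy_journal *j, int credits)
{
	j->reserved -= credits;
	j->running += credits;
}
```

With max_credits = 100 and a reservation of 30, an ordinary start asking for 80 credits is refused while one asking for 70 succeeds, and the reserved handle can still be attached afterwards.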

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2013-06-05 00:35:11 +0800

22 Apr, 2013

1 commit

13fca323e ext4: mark metadata blocks using bh flags

This allows metadata writebacks which are issued via block device
writeback to be sent with the current write request flags.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-04-22 04:45:54 +0800

04 Apr, 2013

1 commit

b10a44c36 ext4: add might_sleep() annotations

Signed-off-by: "Theodore Ts'o"
Reviewed-by: Lukas Czerner

Theodore Ts'o
2013-04-04 10:00:52 +0800

09 Feb, 2013

2 commits

9924a92a8 ext4: pass context information to jbd2__journal_start()

So we can better understand what bits of ext4 are responsible for
long-running jbd2 handles, use jbd2__journal_start() so we can pass
context information for logging purposes.

The recommended way for finding the longer-running handles is:

T=/sys/kernel/debug/tracing
EVENT=$T/events/jbd2/jbd2_handle_stats
echo "interval > 5" > $EVENT/filter
echo 1 > $EVENT/enable

./run-my-fs-benchmark

cat $T/trace > /tmp/problem-handles

This will list handles that were active for longer than 20ms. Having
longer-running handles is bad, because a commit started at the wrong
time could stall for those 20+ milliseconds, which could delay an
fsync() or an O_SYNC operation. Here is an example line from the
trace file describing a handle which lived on for 311 jiffies, or over
1.2 seconds:

postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
dirtied_blocks 0
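
The interval field is in jiffies, so the figures above depend on the kernel's tick rate. The numbers quoted (5 jiffies for 20ms, 311 jiffies for a little over 1.2 seconds) are consistent with HZ=250; that value is inferred from the text, not stated in it. A small helper, with the hypothetical name toy_jiffies_to_ms(), makes the arithmetic explicit:

```c
#include <assert.h>

/*
 * Convert a jiffies interval to milliseconds for a given tick rate.
 * The tick rate is a parameter because it is a kernel build option;
 * HZ=250 is only the value that matches the figures in this message.
 */
static unsigned long toy_jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
	return jiffies * 1000UL / hz;
}
```

At 250 ticks per second, toy_jiffies_to_ms(5, 250) gives 20 and toy_jiffies_to_ms(311, 250) gives 1244, matching the filter threshold and the example trace line. On a kernel built with a different HZ, the filter value would need to be scaled accordingly.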

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-02-09 10:59:22 +0800
722887ddc ext4: move the jbd2 wrapper functions out of super.c

Move the jbd2 wrapper functions which start and stop handles out of
super.c, where they don't really logically belong, and into
ext4_jbd2.c.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-02-09 02:00:31 +0800

10 Oct, 2012

1 commit

06db49e68 ext4: fix metadata checksum calculation for the superblock

The function ext4_handle_dirty_super() was calculating the superblock
checksum on the wrong block data. As a result, when the superblock is
modified while it is mounted (most commonly, when inodes are added or
removed from the orphan list), the superblock checksum would be wrong. We
didn't notice because the superblock checksum *was* being correctly calculated
in ext4_commit_super(), and this would get called when the file system
was unmounted. So the problem only became obvious if the system
crashed while the file system was mounted.

Fix this by removing the poorly designed function signature for
ext4_superblock_csum_set(); if it only took a single argument, the
pointer to a struct superblock, the ambiguity which caused this
mistake would have been impossible.
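
A sketch of why the single-argument form removes the ambiguity, using made-up types: struct toy_sb, struct toy_super_block_disk and the toy_* functions are hypothetical, and a toy rolling sum stands in for the real checksum algorithm. Because the setter derives the on-disk buffer from the one superblock pointer it is given, a caller cannot hand it the wrong block data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk layout; the checksum covers everything before it. */
struct toy_super_block_disk {
	uint32_t s_inodes_count;
	uint32_t s_last_orphan;
	uint32_t s_checksum;
};

/* Hypothetical in-memory superblock pointing at its on-disk copy. */
struct toy_sb {
	struct toy_super_block_disk *s_es;
};

/* Toy checksum over the bytes that precede s_checksum. */
static uint32_t toy_csum(const struct toy_sb *sb)
{
	const unsigned char *p = (const unsigned char *)sb->s_es;
	size_t n = offsetof(struct toy_super_block_disk, s_checksum);
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum = sum * 31 + p[i];
	return sum;
}

/* Single argument: the buffer to checksum is always sb->s_es. */
static void toy_superblock_csum_set(struct toy_sb *sb)
{
	sb->s_es->s_checksum = toy_csum(sb);
}

static int toy_superblock_csum_verify(const struct toy_sb *sb)
{
	return sb->s_es->s_checksum == toy_csum(sb);
}
```

Modifying s_last_orphan (as an orphan list update would) makes verification fail until the setter is called again on the same superblock.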

Reported-by: George Spelvin
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-10-10 13:06:58 +0800

23 Jul, 2012

2 commits

b50924c2c ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()

The '__ext4_handle_dirty_metadata()' does not need the 'now' argument
anymore and we can kill it.

Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara

Artem Bityutskiy
2012-07-23 08:37:31 +0800
58c5873a7 ext4: remove unnecessary superblock dirtying

This patch changes the 'ext4_handle_dirty_super()' function which
submits the superblock for I/O in the following cases:

1. When creating the first large file on a file system without
EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
2. When re-sizing the file-system.
3. When creating an xattr on a file-system without the
EXT4_FEATURE_COMPAT_EXT_ATTR feature.

If the file-system has journal enabled, the superblock is written via
the journal. We do not modify this path.

If the file-system has no journal, this function falls back to just
marking the superblock as dirty using the 's_dirt' superblock
flag. This means that it delays the actual superblock I/O submission
by 5 seconds (default setting). Namely, the 'sync_supers()' kernel
thread will call 'ext4_write_super()' later and will actually submit
the superblock for I/O.

And this is the behavior this patch modifies: we stop using 's_dirt'
and just mark the superblock buffer as dirty right away. Indeed, all 3
cases above are extremely rare and it does not add any value to delay
the I/O submission for them.

Note: 'ext4_handle_dirty_super()' executes
'__ext4_handle_dirty_super()' with 'now = 0'. This patch basically
makes the 'now' argument unneeded and it will be deleted in one of the
next patches.

This patch also removes 's_dirt' condition on the unmount path because
we never set it anymore, so we should not test it.

Tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara

Artem Bityutskiy
2012-07-23 08:33:31 +0800

30 Apr, 2012

1 commit

a9c473178 ext4: calculate and verify superblock checksum

Calculate and verify the superblock checksum. Since the UUID and
block group number are embedded in each copy of the superblock, we
need only checksum the entire block. Refactor some of the code to
eliminate open-coding of the checksum update call.

Signed-off-by: Darrick J. Wong
Signed-off-by: "Theodore Ts'o"

Darrick J. Wong
2012-04-30 06:29:10 +0800

04 Sep, 2011

1 commit

9ea7a0df6 jbd2: add debugging information to jbd2_journal_dirty_metadata()

Add debugging information in case jbd2_journal_dirty_metadata() is
called with a buffer_head which didn't have
jbd2_journal_get_write_access() called on it, or if the journal_head
has the wrong transaction in it. In addition, return an error code.
This won't change anything for ocfs2, which will BUG_ON() the non-zero
exit code.

For ext4, the caller of this function is ext4_handle_dirty_metadata(),
and on seeing a non-zero return code, will call __ext4_journal_stop(),
which will print the function and line number of the (buggy) calling
function and abort the journal. This will allow us to recover instead
of bug halting, which is better from a robustness and reliability
point of view.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-04 22:18:14 +0800

09 May, 2011

1 commit

2cd05cc39 ext4: remove unneeded ext4_journal_get_undo_access

The block allocation code used to use jbd2_journal_get_undo_access as
a way to make changes that wouldn't show up until the commit took
place. The new multi-block allocation code has its own way of
preventing newly freed blocks from getting reused until the commit
takes place (it avoids updating the buddy bitmaps until the commit is
done), so we don't need to use jbd2_journal_get_undo_access(), which
has extra overhead compared to jbd2_journal_get_write_access().

There was one last vestigial use of ext4_journal_get_undo_access() in
ext4_add_groupblocks(); change it to use ext4_journal_get_write_access()
and then remove the ext4_journal_get_undo_access() support.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-05-09 22:58:45 +0800

27 Jul, 2010

2 commits

1c13d5c08 ext4: Save error information to the superblock for analysis

Save the number of file system errors, and the time, function name, line
number, block number, and inode number of the first and most recent
errors reported on the file system in the superblock.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:03 +0800
c398eda0e ext4: Pass line numbers to ext4_error() and friends

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:40 +0800

30 Jun, 2010

1 commit

90c7201b9 ext4: Pass line number to ext4_journal_abort_handle()

This allows the error messages to include the line number

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-30 02:53:24 +0800

29 Jun, 2010

1 commit

c67d859e3 ext4: clean up ext4_abort() so __func__ is now implicit

Use a macro definition for ext4_abort() to clean up the .c files a wee
bit.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-29 23:07:07 +0800

12 Jun, 2010

1 commit

a0375156c ext4: Clean up s_dirt handling

We don't need to set s_dirt in most of the ext4 code when journaling
is enabled. In ext3/4 some of the summary statistics for # of free
inodes, blocks, and directories are calculated from the per-block
group statistics when the file system is mounted or unmounted. As a
result the superblock doesn't have to be updated, either via the
journal or by setting s_dirt. There are a few exceptions, most
notably when resizing the file system, where the superblock needs to
be modified --- and in that case it should be done as a journalled
operation if possible, and s_dirt set only in no-journal mode.

This patch will optimize out some unneeded disk writes when using ext4
with a journal.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-12 11:14:04 +0800

17 Feb, 2010

1 commit

73b50c1c9 ext4: Fix BUG_ON at fs/buffer.c:652 in no journal mode

Calls to ext4_handle_dirty_metadata should only pass in an inode
pointer for inode-specific metadata, and not for shared metadata
blocks such as inode table blocks, block group descriptors, the
superblock, etc.

The BUG_ON can get tripped when updating a special device (such as a
block device) that is opened (so that i_mapping is set in
fs/block_dev.c) and the file system is mounted in no journal mode.

Addresses-Google-Bug: #2404870

Signed-off-by: Curt Wohlgemuth
Signed-off-by: "Theodore Ts'o"

Curt Wohlgemuth
2010-02-17 04:06:29 +0800

16 Feb, 2010

1 commit

12062dddd ext4: move __func__ into a macro for ext4_warning, ext4_error

Just a pet peeve of mine; we had a mishmash of calls with either __func__
or "function_name" and the latter tends to get out of sync.

I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.
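
The idea can be shown with a small stand-alone sketch. The names demo_warning(), demo_warning_impl(), last_msg and check_block() are hypothetical and are not the ext4 definitions; the message is formatted into a buffer instead of being printed so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char last_msg[128];	/* last formatted message, kept for inspection */

/* Worker that takes the function name explicitly. */
static void demo_warning_impl(const char *func, const char *fmt, ...)
{
	va_list ap;
	int n = snprintf(last_msg, sizeof(last_msg), "%s: ", func);

	if (n < 0 || (size_t)n >= sizeof(last_msg))
		return;
	va_start(ap, fmt);
	vsnprintf(last_msg + n, sizeof(last_msg) - (size_t)n, fmt, ap);
	va_end(ap);
}

/* The macro supplies __func__, so no call site can pass a stale name. */
#define demo_warning(...) demo_warning_impl(__func__, __VA_ARGS__)

/* Example caller: the prefix in the message comes from this function's name. */
static const char *check_block(int blk)
{
	demo_warning("bad block %d", blk);
	return last_msg;
}
```

Calling check_block(7) leaves "check_block: bad block 7" in the buffer. If the function is later renamed, the prefix follows automatically, which is the consistency argument made above.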

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"

Eric Sandeen
2010-02-16 03:19:27 +0800

25 Nov, 2009

1 commit

e4684b3fb ext4: fold ext4_journal_revoke() into ext4_forget()

The only caller of ext4_journal_revoke() is ext4_forget(), so we can
fold ext4_journal_revoke() into ext4_forget() to simplify the code and
shorten the call stack.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-11-25 00:05:59 +0800

23 Nov, 2009

2 commits

b7e57e7c2 ext4: fold ext4_journal_forget() into ext4_forget()

Convert the last two callers of ext4_journal_forget() to use
ext4_forget() instead, and then fold ext4_journal_forget() into
ext4_forget(). This reduces our code complexity and shortens our call
stack.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-11-23 10:00:13 +0800
d6797d14b ext4: move ext4_forget() to ext4_jbd2.c

The ext4_forget() function better belongs in ext4_jbd2.c. This will
allow us to do some cleanup of the ext4_journal_revoke() and
ext4_journal_forget() functions, as well as giving us better error
reporting since we can report the caller of ext4_forget() when things
go wrong.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-11-23 09:52:12 +0800

13 Sep, 2009

1 commit

fe188c0e0 ext4: Assure that metadata blocks are written during fsync in no journal mode

When there is no journal present, we must attach buffer heads
associated with extent tree and indirect blocks to the inode's
mapping->private_list via mark_buffer_dirty_inode() so that
ext4_sync_file() --- which is called to service fsync() and
fdatasync() system calls --- can write out the inode's metadata blocks
by calling sync_mapping_buffers().

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-09-13 01:41:55 +0800

10 Sep, 2009

1 commit

c7acb4c16 ext4: Use bforget() in no journal mode for ext4_journal_{forget,revoke}()

When ext4 is using a journal, a metadata block which is deallocated
must be passed into the journal layer so it can be dropped from the
current transaction and/or revoked. This is done by calling the
functions ext4_journal_forget() and ext4_journal_revoke(), which call
jbd2_journal_forget(), and jbd2_journal_revoke(), respectively.

Since the jbd2_journal_forget() and jbd2_journal_revoke() call
bforget(), if ext4 is not using a journal, ext4_journal_forget() and
ext4_journal_revoke() must call bforget() to avoid a dirty metadata
block overwriting a block after it has been reallocated and reused for
another inode's data block.
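
The dispatch described above might look roughly like this in a user-space model. All names are hypothetical (struct toy_bh, struct toy_handle, toy_bforget(), toy_ext4_forget() and friends), and treating a NULL handle as "no journal" is a simplification made for this sketch, not a description of how ext4 represents that state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer head: a dirty flag and a reference count. */
struct toy_bh {
	int dirty;
	int refs;
};

/* Hypothetical journal handle; NULL models "no journal" in this sketch. */
struct toy_handle {
	int unused;
};

static int journal_forget_calls;	/* counts trips through the journal path */

/* Discard a buffer: clear the dirty state and drop the caller's reference. */
static void toy_bforget(struct toy_bh *bh)
{
	bh->dirty = 0;
	bh->refs--;
}

/* Journal path: the journal layer ends up forgetting the buffer itself. */
static int toy_journal_forget(struct toy_handle *h, struct toy_bh *bh)
{
	(void)h;
	journal_forget_calls++;
	toy_bforget(bh);
	return 0;
}

static int toy_handle_valid(const struct toy_handle *h)
{
	return h != NULL;
}

/* Wrapper: with no journal, call bforget directly so the block is not written later. */
static int toy_ext4_forget(struct toy_handle *h, struct toy_bh *bh)
{
	if (toy_handle_valid(h))
		return toy_journal_forget(h, bh);
	toy_bforget(bh);
	return 0;
}
```

Either way the buffer ends up clean and released; the only difference is whether the journal layer is involved.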

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-09-10 09:32:41 +0800

13 Jul, 2009

1 commit

e6b5d3010 ext4: Fix buffer head reference leak in no-journal mode

We found a problem with buffer head reference leaks when using an ext4
partition without a journal. In particular, calls to ext4_forget() would
not do a brelse() on the input buffer head, which will cause pages they
belong to to not be reclaimable.

Further investigation showed that all places where ext4_journal_forget() and
ext4_journal_revoke() are called are subject to the same problem. The patch
below changes __ext4_journal_forget/__ext4_journal_revoke to do an explicit
release of the buffer head when the journal handle isn't valid.

Signed-off-by: Curt Wohlgemuth
Signed-off-by: "Theodore Ts'o"

Curt Wohlgemuth
2009-07-13 21:07:20 +0800

07 Jan, 2009

1 commit

0390131ba ext4: Allow ext4 to run without a journal

A few weeks ago I posted a patch for discussion that allowed ext4 to run
without a journal. Since that time I've integrated the excellent
comments from Andreas and fixed several serious bugs. We're currently
running with this patch and generating some performance numbers against
both ext2 (with backported reservations code) and ext4 with and without
a journal. It just so happens that running without a journal is
slightly faster for most everything.

We did
iozone -T -t 4 -s 2g -r 256k -T -I -i0 -i1 -i2

which creates 4 threads, each of which creates and does reads and writes on
a 2G file, with a buffer size of 256K, using O_DIRECT for all file opens
to bypass the page cache. Results:

                  ext2        ext4, default   ext4, no journal
initial writes    13.0 MB/s   15.4 MB/s       15.7 MB/s
rewrites          13.1 MB/s   15.6 MB/s       15.9 MB/s
reads             15.2 MB/s   16.9 MB/s       17.2 MB/s
re-reads          15.3 MB/s   16.9 MB/s       17.2 MB/s
random readers     5.6 MB/s    5.6 MB/s        5.7 MB/s
random writers     5.1 MB/s    5.3 MB/s        5.4 MB/s

So it seems that, so far, this was a useful exercise.

Signed-off-by: Frank Mayhar
Signed-off-by: "Theodore Ts'o"

Frank Mayhar
2009-01-07 13:06:22 +0800

30 Apr, 2008

1 commit

3dcf54515 ext4: move headers out of include/linux

Move ext4 headers out of include/linux. This is just the trivial move;
there are some more things that could be done later.

Signed-off-by: Christoph Hellwig
Signed-off-by: Mingming Cao
Signed-off-by: "Theodore Ts'o"

Christoph Hellwig
2008-04-30 06:13:32 +0800

17 Apr, 2008

1 commit

46e665e9d ext4: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: "Theodore Ts'o"

Harvey Harrison
2008-04-17 22:38:59 +0800

08 Dec, 2006

1 commit

8984d137d [PATCH] ext4: uninline large functions

Saves nearly 4kbytes on x86.

Cc: Arnaldo Carvalho de Melo
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-12-08 00:39:35 +0800