Doug / smarc-fsl-linux-kernel | Embedian Git Server

12 Mar, 2013

1 commit

90ba983f6 ext4: use atomic64_t for the per-flexbg free_clusters count

A user with an 8TB+ file system and a very large flexbg size
(> 65536) could cause the atomic_t used in struct flex_groups to
overflow. This was detected by the PaX security patchset:

http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551

This bug was introduced in commit 9f24e4208f7e, so it's been around
since 2.6.30. :-(

Fix this by using an atomic64_t for struct orlov_stats's
free_clusters.

Signed-off-by: "Theodore Ts'o"
Reviewed-by: Lukas Czerner
Cc: stable@vger.kernel.org

Theodore Ts'o
2013-03-12 11:39:59 +0800
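As a rough check of the overflow described in 90ba983f6 above, the sketch below works through the arithmetic in standalone C. It assumes 4 KiB blocks without bigalloc, so that one block group holds 32768 clusters; that geometry and the helper names are illustrative assumptions, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: one 4096-byte bitmap block tracks 8 * 4096 clusters. */
#define CLUSTERS_PER_GROUP 32768LL

/* Worst-case free_clusters total for one flex group (hypothetical helper). */
static inline int64_t flex_group_max_free_clusters(int64_t groups_per_flex)
{
    assert(groups_per_flex > 0);
    return groups_per_flex * CLUSTERS_PER_GROUP;
}

/* Returns 1 if the total still fits in a signed 32-bit counter such as atomic_t. */
static inline int fits_in_atomic_t(int64_t total)
{
    return total <= INT32_MAX;
}
```

Under these assumptions, flex_group_max_free_clusters(65536) is 2^31, one more than INT32_MAX, and 2^31 clusters of 4 KiB each is 8 TiB, which lines up with the 8TB+ threshold in the commit message.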

15 Feb, 2013

1 commit

8de5c325b ext4: use KERN_WARNING for warning messages

Some messages printed related to a WARN_ON(1) were printed using
KERN_NOTICE. Use KERN_WARNING or ext4_warning() instead so that
context related to the WARN_ON() is printed at the same printk warning
level (and log files, etc.)

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-02-15 04:11:41 +0800

10 Feb, 2013

1 commit

1139575a9 ext4: start handle at the last possible moment when creating inodes

In ext4_{create,mknod,mkdir,symlink}(), don't start the journal handle
until the inode has been successfully allocated. In order to do this,
we need to start the handle in ext4_new_inode(). So create a new
variant of this function, ext4_new_inode_start_handle(), so the handle
can be created at the last possible minute, before we need to modify
the inode allocation bitmap block.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-02-10 05:27:09 +0800

09 Feb, 2013

1 commit

9924a92a8 ext4: pass context information to jbd2__journal_start()

So we can better understand what bits of ext4 are responsible for
long-running jbd2 handles, use jbd2__journal_start() so we can pass
context information for logging purposes.

The recommended way for finding the longer-running handles is:

T=/sys/kernel/debug/tracing
EVENT=$T/events/jbd2/jbd2_handle_stats
echo "interval > 5" > $EVENT/filter
echo 1 > $EVENT/enable

./run-my-fs-benchmark

cat $T/trace > /tmp/problem-handles

This will list handles that were active for longer than 20ms. Having
longer-running handles is bad, because a commit started at the wrong
time could stall for those 20+ milliseconds, which could delay an
fsync() or an O_SYNC operation. Here is an example line from the
trace file describing a handle which lived on for 311 jiffies, or over
1.2 seconds:

postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32
tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1
dirtied_blocks 0

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2013-02-09 10:59:22 +0800
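The interval field in the jbd2_handle_stats trace above is measured in jiffies. The 20ms and 1.2 second figures in the commit message are consistent with a tick rate of HZ=250, which is an inference rather than something the message states. A minimal conversion sketch in standalone C (the helper name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Convert an interval in jiffies to milliseconds for a given tick rate (HZ). */
static inline uint64_t interval_to_ms(uint64_t jiffies, unsigned int hz)
{
    assert(hz > 0);
    return jiffies * 1000ULL / hz;
}
```

With hz set to 250, interval_to_ms(5, 250) gives 20 and interval_to_ms(311, 250) gives 1244, matching the numbers quoted above. On a kernel built with a different HZ, the same filter value would select a different duration.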

11 Dec, 2012

1 commit

f08225d17 ext4: enable ext4 inline support

Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"

Tao Ma
2012-12-11 03:06:03 +0800

30 Nov, 2012

1 commit

aeb1e5d69 ext4: fix possible use after free with metadata csum

Commit fa77dcfafeaa introduces block bitmap checksum calculation into
ext4_new_inode() in the case that the block group was uninitialized.
However, we brelse() the bitmap buffer before we attempt to checksum
it, so we have no guarantee that the buffer is still there.

Fix this by releasing the buffer after the possible checksum
computation.

Signed-off-by: Lukas Czerner
Signed-off-by: "Theodore Ts'o"
Acked-by: Darrick J. Wong
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-11-30 10:21:22 +0800

29 Oct, 2012

1 commit

ffb5387e8 ext4: fix unjournaled inode bitmap modification

commit 119c0d4460b001e44b41dcf73dc6ee794b98bd31 changed
ext4_new_inode() such that the inode bitmap was being modified
outside a transaction, which could lead to corruption, and was
discovered when journal_checksum found a bad checksum in the
journal during log replay.

Nix ran into this when using the journal_async_commit mount
option, which enables journal checksumming. The ensuing
journal replay failures due to the bad checksums led to
filesystem corruption reported as the now infamous
"Apparent serious progressive ext4 data corruption bug"

[ Changed by tytso to call ext4_journal_get_write_access() only
when we're fairly certain that we're going to allocate the inode. ]

I've tested this by mounting with journal_checksum and
running fsstress then dropping power; I've also tested by
hacking DM to create snapshots w/o first quiescing, which
allows me to test journal replay repeatedly w/o actually
power-cycling the box. Without the patch I hit a journal
checksum error every time. With this fix it survives
many iterations.

Reported-by: Nix
Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Eric Sandeen
2012-10-29 10:24:57 +0800

22 Oct, 2012

1 commit

79f1ba495 ext4: Checksum the block bitmap properly with bigalloc enabled

In mke2fs, we checksum only the whole bitmap block, which is correct.
In the kernel, however, we use EXT4_BLOCKS_PER_GROUP to indicate the
size of the checksummed bitmap, which is wrong when bigalloc is
enabled. The right size is EXT4_CLUSTERS_PER_GROUP, and this patch
fixes it.

Also, as every caller of ext4_block_bitmap_csum_set and
ext4_block_bitmap_csum_verify passes in EXT4_BLOCKS_PER_GROUP(sb)/8,
we had better remove this parameter and set it in the function itself.

Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Lukas Czerner
Cc: stable@vger.kernel.org

Tao Ma
2012-10-22 12:34:32 +0800
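To see why the length used in 79f1ba495 above only matters with bigalloc, the following standalone C sketch compares the two sizes. It assumes 32768 clusters per group and a cluster size given as a power-of-two shift; the function names and parameters are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmap bytes covered when the length is derived from blocks per group. */
static inline uint32_t csum_len_from_blocks(uint32_t clusters_per_group,
                                            unsigned int cluster_bits)
{
    assert(cluster_bits < 16);
    /* With bigalloc, each cluster spans 2^cluster_bits blocks. */
    uint32_t blocks_per_group = clusters_per_group << cluster_bits;
    return blocks_per_group / 8;
}

/* Bitmap bytes covered when the length is derived from clusters per group. */
static inline uint32_t csum_len_from_clusters(uint32_t clusters_per_group)
{
    return clusters_per_group / 8;
}
```

Without bigalloc (cluster_bits of 0) both helpers return 4096 bytes for 32768 clusters. With 16 blocks per cluster (cluster_bits of 4) the block-based length grows to 65536 bytes, far past a single 4 KiB bitmap block, while the cluster-based length stays at 4096.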

24 Sep, 2012

1 commit

f2a09af64 ext4: check free inode count before allocating an inode

Recently, I encountered some corrupted filesystems in which some
groups' free inode counts were 65535; it seemed that the free inode
count had overflowed. This patch teaches ext4 to check the free inode
count before allocating an inode.

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2012-09-24 11:16:03 +0800
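A count of 65535 is what a 16-bit counter shows after being decremented from zero. The sketch below models the check from f2a09af64 above in standalone C; it assumes a plain 16-bit counter and a hypothetical helper name, and is not the kernel's allocation path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of a per-group free inode counter stored in 16 bits.
 * Returns 0 and decrements the counter on success, or -1 if the group
 * has no free inodes and must be skipped.
 */
static inline int claim_inode_checked(uint16_t *free_inodes)
{
    assert(free_inodes != NULL);
    if (*free_inodes == 0)
        return -1;      /* checking first prevents a wrap to 65535 */
    *free_inodes = (uint16_t)(*free_inodes - 1);
    return 0;
}
```

A caller that receives -1 would move on to the next group instead of decrementing the count, which is the behavior the commit message describes.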

23 Jul, 2012

1 commit

97a740688 ext4: remove useless marking of superblock dirty

Commit a0375156 properly notes that the superblock doesn't need to be
marked as dirty when only the number of free inodes / blocks / directories
changes, since that is recomputed on each mount anyway. However, that commit
leaves some unnecessary markings as dirty in place. Remove these.

Artem: tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Jan Kara
Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Tested-by: Artem Bityutskiy

Jan Kara
2012-07-23 08:29:31 +0800

01 Jul, 2012

1 commit

f6fb99cad ext4: pass a char * to ext4_count_free() instead of a buffer_head ptr

Make it possible for ext4_count_free to operate on buffers and not
just data in buffer_heads.

Signed-off-by: "Theodore Ts'o"
Cc: stable@kernel.org

Theodore Ts'o
2012-07-01 07:14:57 +0800
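The interface change in f6fb99cad above amounts to counting free bits over a raw byte buffer rather than a buffer_head. A minimal standalone sketch of that idea follows; the function name and the bit-by-bit loop are illustrative assumptions and not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Count the clear (free) bits in the first numchars bytes of a raw bitmap. */
static inline unsigned int count_free_bits(const char *bitmap,
                                           unsigned int numchars)
{
    unsigned int i, bit, sum = 0;

    assert(bitmap != NULL);
    for (i = 0; i < numchars; i++) {
        unsigned char byte = (unsigned char)bitmap[i];

        /* Each zero bit in the byte represents one free entry. */
        for (bit = 0; bit < 8; bit++)
            sum += !((byte >> bit) & 1u);
    }
    return sum;
}
```

Because the helper takes a plain pointer and length, it can be pointed at any in-memory copy of a bitmap, which is the flexibility the commit message asks for.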

02 Jun, 2012

1 commit

4edebed86 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull Ext4 updates from Theodore Ts'o:
"The major new feature added in this update is Darrick J Wong's
metadata checksum feature, which adds crc32 checksums to ext4's
metadata fields.

There is also the usual set of cleanups and bug fixes."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits)
ext4: hole-punch use truncate_pagecache_range
jbd2: use kmem_cache_zalloc wrapper instead of flag
ext4: remove mb_groups before tearing down the buddy_cache
ext4: add ext4_mb_unload_buddy in the error path
ext4: don't trash state flags in EXT4_IOC_SETFLAGS
ext4: let getattr report the right blocks in delalloc+bigalloc
ext4: add missing save_error_info() to ext4_error()
ext4: add debugging trigger for ext4_error()
ext4: protect group inode free counting with group lock
ext4: use consistent ssize_t type in ext4_file_write()
ext4: fix format flag in ext4_ext_binsearch_idx()
ext4: cleanup in ext4_discard_allocated_blocks()
ext4: return ENOMEM when mounts fail due to lack of memory
ext4: remove redundundant "(char *) bh->b_data" casts
ext4: disallow hard-linked directory in ext4_lookup
ext4: fix potential integer overflow in alloc_flex_gd()
ext4: remove needs_recovery in ext4_mb_init()
ext4: force ro mount if ext4_setup_super() fails
ext4: fix potential NULL dereference in ext4_free_inodes_counts()
ext4/jbd2: add metadata checksumming to the list of supported features
...

Linus Torvalds
2012-06-02 01:12:15 +0800

29 May, 2012

2 commits

6f2e9f0e7 ext4: protect group inode free counting with group lock

Currently, when we set the group inode free count, we don't hold the
group lock, so multiple threads may decrease the inode free count at
the same time, and e2fsck will complain with something like:

Free inodes count wrong for group #1 (1, counted=0).
Fix? no

Free inodes count wrong for group #2 (3, counted=0).
Fix? no

Directories count wrong for group #2 (780, counted=779).
Fix? no

Free inodes count wrong for group #3 (2272, counted=2273).
Fix? no

So this patch tries to protect it with ext4_lock_group.

By the way, this was found by xfstests test case 269 on a volume
created by mkfs with the parameter
"-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr"
and I have run it 100 times with the patch without the e2fsck error
showing up again.

Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"

Tao Ma
2012-05-29 06:20:59 +0800

bb3d132a2 ext4: fix potential NULL dereference in ext4_free_inodes_counts()

The ext4_get_group_desc() function returns NULL on error, and
ext4_free_inodes_count() function dereferences it without checking.
There is a check on the next line, but it's too late.

Reviewed-by: Jan Kara
Signed-off-by: Dan Carpenter
Signed-off-by: "Theodore Ts'o"
Cc: stable@kernel.org

Dan Carpenter
2012-05-29 02:16:57 +0800

16 May, 2012

1 commit

08cefc7ab userns: Convert ext4 to user kuid/kgid where appropriate

Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-05-16 05:59:27 +0800

30 Apr, 2012

4 commits

feb0ab32a ext4: make block group checksums use metadata_csum algorithm

metadata_csum supersedes uninit_bg. Convert the ROCOMPAT uninit_bg
flag check to a helper function that covers both, and make the
checksum calculation algorithm use either crc16 or the metadata_csum
chosen algorithm depending on which flag is set. Print a warning if
we try to mount a filesystem with both feature flags set.

Signed-off-by: Darrick J. Wong
Signed-off-by: "Theodore Ts'o"

Darrick J. Wong
2012-04-30 06:45:10 +0800

fa77dcfaf ext4: calculate and verify block bitmap checksum

Compute and verify the checksum of the block bitmap; this checksum is
stored in the block group descriptor.

Signed-off-by: Darrick J. Wong
Signed-off-by: "Theodore Ts'o"

Darrick J. Wong
2012-04-30 06:35:10 +0800

41a246d1f ext4: calculate and verify checksums for inode bitmaps

Compute and verify the checksum of the inode bitmap; the checksum is
stored in the block group descriptor.

Signed-off-by: Darrick J. Wong
Signed-off-by: "Theodore Ts'o"

Darrick J. Wong
2012-04-30 06:33:10 +0800

814525f4d ext4: calculate and verify inode checksums

This patch introduces to ext4 the ability to calculate and verify
inode checksums. This requires the use of a new ro compatibility flag
and some accompanying e2fsprogs patches to provide the relevant
features in tune2fs and e2fsck. The inode generation changes have
been integrated into this patch.

Signed-off-by: Darrick J. Wong
Signed-off-by: "Theodore Ts'o"

Darrick J. Wong
2012-04-30 06:31:10 +0800

20 Mar, 2012

2 commits

92b978165 ext4: change some printk() calls to use ext4_msg() instead

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-03-20 11:41:49 +0800

1084f252e ext4: remove trailing newlines from ext4_msg() and ext4_error() messages

The functions ext4_msg() and ext4_error() already tack on a trailing
newline, so remove the unnecessary extra newline.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-03-20 11:13:43 +0800

21 Feb, 2012

1 commit

813e57276 ext4: fix race when setting bitmap_uptodate flag

In ext4_read_{inode,block}_bitmap() we were setting bitmap_uptodate()
before submitting the buffer for read. This is bad, since we check
bitmap_uptodate() without locking the buffer, and so if another
process is racing with us, it's possible that they will think the
bitmap is uptodate even though the read has not completed yet,
resulting in inodes and blocks potentially getting allocated more than
once if we get really unlucky.

Addresses-Google-Bug: 2828254

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-02-21 06:52:46 +0800

07 Feb, 2012

1 commit

119c0d446 ext4: fold ext4_claim_inode into ext4_new_inode

The function ext4_claim_inode() is only called by one function,
ext4_new_inode(), and by folding the functionality into
ext4_new_inode(), we can remove almost 50 lines of code, and put all
of the logic of allocating a new inode into a single place.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-02-07 09:12:03 +0800

11 Jan, 2012

1 commit

ff9cb1c4e Merge branch 'for_linus' into for_linus_merged

Conflicts:
fs/ext4/ioctl.c

Theodore Ts'o
2012-01-11 00:54:07 +0800

04 Jan, 2012

1 commit

dcca3fec9 ext4: propagate umode_t

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:59 +0800

29 Dec, 2011

2 commits

597d508c1 ext4: use proper little-endian bitops

ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for
ext4. Only two ext4_{set,clear}_bit() calls check the return value. The
rest of the calls ignore the return value and can be replaced with
__{set,clear}_bit_le().

This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le()
to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit()
for the two places where the old bit needs to be returned.

This ext4_{set,clear}_bit() change is considered safe, because if someone
uses these macros without noticing the change, the new
ext4_{set,clear}_bit() have no return value, which causes compiler errors
wherever the return value is used.

This also removes unused ext4_find_first_zero_bit().

Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: "Theodore Ts'o"

Akinobu Mita
2011-12-29 09:32:07 +0800
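The distinction drawn in 597d508c1 above, between setting a bit and setting it while returning the old value, can be illustrated in standalone C. The helpers below are userspace stand-ins with hypothetical names; they are not atomic and are not the kernel's bitops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set bit nr in a little-endian bitmap; returns nothing. */
static inline void sketch_set_bit_le(unsigned int nr, uint8_t *addr)
{
    assert(addr != NULL);
    addr[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

/* Set bit nr and return its previous value (0 or 1). */
static inline int sketch_test_and_set_bit_le(unsigned int nr, uint8_t *addr)
{
    uint8_t mask = (uint8_t)(1u << (nr % 8));
    int old;

    assert(addr != NULL);
    old = (addr[nr / 8] & mask) != 0;
    addr[nr / 8] |= mask;
    return old;
}
```

Because sketch_set_bit_le() returns void, code that tries to use its result fails to compile, which mirrors the safety argument made in the commit message.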
14c83c9fd ext4: avoid counting the number of free inodes twice in find_group_orlov()

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-12-29 09:25:13 +0800

19 Dec, 2011

1 commit

acd6ad835 ext4: fix error handling on inode bitmap corruption

When insert_inode_locked() fails in ext4_new_inode() it most likely means the
inode bitmap got corrupted and we allocated an inode which is already in use.
Also, doing unlock_new_inode() during error recovery is wrong since the inode
does not have I_NEW set. Fix the problem by jumping to fail: (instead of
fail_drop:) which declares filesystem error and does not call unlock_new_inode().

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2011-12-19 06:37:02 +0800

03 Nov, 2011

1 commit

d21185883 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue:
vfs: add d_prune dentry operation
vfs: protect i_nlink
filesystems: add set_nlink()
filesystems: add missing nlink wrappers
logfs: remove unnecessary nlink setting
ocfs2: remove unnecessary nlink setting
jfs: remove unnecessary nlink setting
hypfs: remove unnecessary nlink setting
vfs: ignore error on forced remount
readlinkat: ensure we return ENOENT for the empty pathname for normal lookups
vfs: fix dentry leak in simple_fill_super()

Linus Torvalds
2011-11-03 02:41:01 +0800

02 Nov, 2011

1 commit

6d6b77f16 filesystems: add missing nlink wrappers

Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-11-02 19:53:43 +0800

01 Nov, 2011

1 commit

4af835089 ext4: remove comments about extent mount option in ext4_new_inode()

Remove comments about 'extent' mount option in ext4_new_inode(), since
it no longer exists.

Signed-off-by: Eryu Guan
Signed-off-by: "Theodore Ts'o"

Eryu Guan
2011-11-01 06:21:29 +0800

29 Oct, 2011

1 commit

5cb81dabc ext4: fix quota accounting during migration

The tmp_inode should have the same uid/gid as the original inode.
Otherwise new metadata blocks will be accounted to the wrong quota-id,
which will result in a quota leak after the inode migration is
completed.

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2011-10-29 21:05:00 +0800

18 Oct, 2011

1 commit

e0cbee3e1 ext4: functions should not be declared extern

The function declarations in ext4.h are already marked extern, so it's
not necessary to do so in the .c files.

This quiets the sparse noise:

warning: function 'ext4_flush_completed_IO' with external linkage has definition
warning: function 'ext4_init_inode_table' with external linkage has definition

Signed-off-by: H Hartley Sweeten
Signed-off-by: "Theodore Ts'o"

H Hartley Sweeten
2011-10-18 22:57:51 +0800

09 Oct, 2011

1 commit

4113c4caa ext4: remove deprecated oldalloc

For a long time now, orlov has been the default block allocator in
ext4. It performs better than the old one and no one seems to claim
otherwise, so we can safely drop the old allocator and deprecate the
oldalloc and orlov mount options.

This is part of the effort to reduce the number of ext4 options, and
hence the test matrix.

Signed-off-by: Lukas Czerner
Signed-off-by: "Theodore Ts'o"

Lukas Czerner
2011-10-09 02:34:47 +0800

10 Sep, 2011

5 commits

cff1dfd76 ext4: rename ext4_free_blocks_after_init() to ext4_free_clusters_after_init()

This function really returns the number of free clusters after an
uninitialized block bitmap has been initialized.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-10 07:12:51 +0800

021b65bb1 ext4: Rename ext4_free_blks_{count,set}() to refer to clusters

The field bg_free_blocks_count_{lo,high} in the block group
descriptor has been repurposed to hold the number of free clusters for
bigalloc functions. So rename the functions to make it easier to
read and audit the block allocation and block freeing code.

Note: at this point in bigalloc development we don't support
online resize, so this also makes it really obvious all of the places
we need to fix up to add support for online resize.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-10 07:08:51 +0800

24aaa8ef4 ext4: convert the free_blocks field in s_flex_groups to be free_clusters

Convert the free_blocks to be free_clusters to make the final revised
bigalloc changes easier to read/understand.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-10 06:58:51 +0800

570426518 ext4: convert s_{dirty,free}blocks_counter to s_{dirty,free}clusters_counter

Convert the percpu counters s_dirtyblocks_counter and
s_freeblocks_counter in struct ext4_super_info to be
s_dirtyclusters_counter and s_freeclusters_counter.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-10 06:56:51 +0800

fd034a84e ext4: split out ext4_free_blocks_after_init()

The function ext4_free_blocks_after_init() used to be a #define of
ext4_init_block_bitmap(). This actually made it difficult to
understand how the function worked, and made it hard to make changes
to support clusters. So as an initial cleanup, I've separated out the
functionality of initializing the block bitmap from calculating the
number of free blocks in the new block group.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-09-10 06:42:51 +0800

01 Aug, 2011

1 commit

33853a0dd ext4: use the correct error exit path in ext4_init_inode_table()

This patch makes ext4_init_inode_table() handle errors correctly.
On the error path, ext4_init_inode_table() should up_write() the
alloc_sem which has been down_write()ed and stop the started journal
handle.

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-08-01 18:32:19 +0800