Doug / smarc-fsl-linux-kernel | Embedian Git Server

11 Aug, 2010

1 commit

5f248c9c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
no need for list_for_each_entry_safe()/resetting with superblock list
Fix sget() race with failing mount
vfs: don't hold s_umount over close_bdev_exclusive() call
sysv: do not mark superblock dirty on remount
sysv: do not mark superblock dirty on mount
btrfs: remove junk sb_dirt change
BFS: clean up the superblock usage
AFFS: wait for sb synchronization when needed
AFFS: clean up dirty flag usage
cifs: truncate fallout
mbcache: fix shrinker function return value
mbcache: Remove unused features
add f_flags to struct statfs(64)
pass a struct path to vfs_statfs
update VFS documentation for method changes.
All filesystems that need invalidate_inode_buffers() are doing that explicitly
convert remaining ->clear_inode() to ->evict_inode()
Make ->drop_inode() just return whether inode needs to be dropped
fs/inode.c:clear_inode() is gone
fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
...

Fix up trivial conflicts in fs/nilfs2/super.c

Linus Torvalds
2010-08-11 02:26:52 +0800

10 Aug, 2010

1 commit

0930fcc1e convert ext4 to ->evict_inode()

pretty much brute-force...

Signed-off-by: Al Viro

Al Viro
2010-08-10 04:48:30 +0800

05 Aug, 2010

1 commit

0cfc9255a ext4: re-inline ext4_rec_len_(to|from)_disk functions

commit 3d0518f4, "ext4: New rec_len encoding for very
large blocksizes" made several changes to this path, but from
a perf perspective, un-inlining ext4_rec_len_from_disk() seems
most significant. This function is called from ext4_check_dir_entry(),
which on a file-creation workload is called extremely often.

I tested this with bonnie:

# bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32

(this does 200 iterations) and got this for the file creations:

ext4 stock: Average = 21206.8 files/s
ext4 inlined: Average = 22346.7 files/s (+5%)

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"

Eric Sandeen
2010-08-05 13:46:37 +0800
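
Editor's sketch: the helper being re-inlined is only a few instructions long, which is why the call overhead shows up on a hot path. The userspace version below assumes the large-blocksize encoding introduced by commit 3d0518f4 (a stored length of 0 or 65535 means "whole block", and the two low bits carry bits 16-17 of the length). The `demo_` names and the runtime blocksize test are illustrative assumptions, not the kernel's code.

```c
#include <stdint.h>

#define DEMO_MAX_REC_LEN 65535u /* assumed sentinel: (1 << 16) - 1 */

/*
 * Decode a 16-bit on-disk directory record length.
 * Marked static inline so the compiler can fold it into the caller.
 * The kernel would apply le16_to_cpu() to dlen first; omitted here.
 */
static inline unsigned int demo_rec_len_from_disk(uint16_t dlen,
						  unsigned int blocksize)
{
	unsigned int len = dlen;

	/* Small blocks: the 16-bit field holds the length directly. */
	if (blocksize < 65536)
		return len;
	/* Large blocks: 0 or the sentinel stands for the whole block. */
	if (len == DEMO_MAX_REC_LEN || len == 0)
		return blocksize;
	/* Otherwise the two low bits hold bits 16-17 of the length. */
	return (len & 65532u) | ((len & 3u) << 16);
}
```

For a 4 KiB block the function reduces to returning its argument, so inlining leaves almost nothing behind at the call site.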

02 Aug, 2010

1 commit

8b67f04ab ext4: Add mount options in superblock

Allow mount options to be stored in the superblock. Also add default
mount option bits for nobarrier, block_validity, discard, and nodelalloc.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-08-02 11:14:20 +0800

27 Jul, 2010

7 commits

79e830367 ext4: fix ext4_get_blocks references

ext4_get_blocks got renamed to ext4_map_blocks, but left stale
comments and a prototype littered around.

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"

Eric Sandeen
2010-07-27 23:56:07 +0800
5b3ff237b ext4: move aio completion after unwritten extent conversion

This patch is to be applied upon Christoph's "direct-io: move aio_complete
into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t,
so that we can call aio_complete from ext4_end_io_nolock() after the extent
conversion has finished.

I have verified with Christoph's aio-dio test that used to fail after a few
runs on an original kernel but now succeeds on the patched kernel.

See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details.

Signed-off-by: Jiaying Zhang
Signed-off-by: "Theodore Ts'o"

Jiaying Zhang
2010-07-27 23:56:06 +0800
89eeddf03 ext4: Define s_jnl_backup_type in superblock

This has been in use by e2fsprogs for a while; define it to keep the
super block fields in sync.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:04 +0800
66e61a9e9 ext4: Once a day, printk file system error information to dmesg

This allows us to grab any file system error messages by scraping
/var/log/messages. This will make it easy for us to do error analysis
across the very large number of machines as we deploy ext4 across the
fleet.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:04 +0800
1c13d5c08 ext4: Save error information to the superblock for analysis

Save the number of file system errors, and the time, function name, line
number, block number, and inode number of the first and most recent
errors reported on the file system, in the superblock.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:03 +0800
c398eda0e ext4: Pass line numbers to ext4_error() and friends

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:56:40 +0800
60fd4da34 ext4: Cleanup ext4_check_dir_entry so __func__ is now implicit

Also start passing the line number to ext4_check_dir_entry since we're
going to need it in an upcoming patch.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-07-27 23:54:40 +0800

30 Jun, 2010

1 commit

e29136f80 ext4: Enhance ext4_grp_locked_error() to take block and function numbers

Also use a macro definition so that __func__ and __LINE__ are implicit.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-30 00:54:28 +0800
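
Editor's sketch: the pattern used by this and the neighbouring error-reporting commits is a wrapper macro that supplies `__func__` and `__LINE__` at the call site, so callers cannot pass a stale function name. The version below is a minimal userspace illustration; `demo_error`, `demo_error_impl`, and `struct demo_err_record` are hypothetical names, not the ext4 interfaces.

```c
#include <stdio.h>

/* Hypothetical record of the most recent error's call site. */
struct demo_err_record {
	const char *func;
	unsigned int line;
	unsigned int count;
};

/* Backend that receives the call site explicitly. */
static void demo_error_impl(struct demo_err_record *rec, const char *func,
			    unsigned int line, const char *msg)
{
	rec->func = func;
	rec->line = line;
	rec->count++;
	fprintf(stderr, "demo error: %s:%u: %s\n", func, line, msg);
}

/* Callers use the macro, so __func__ and __LINE__ are filled in for them. */
#define demo_error(rec, msg) \
	demo_error_impl((rec), __func__, __LINE__, (msg))

/* Example caller: no function name or line number appears in its source. */
static void demo_check_dir_entry(struct demo_err_record *rec)
{
	demo_error(rec, "bad directory entry");
}
```

Because the macro expands in the caller, the recorded name is always the function that actually reported the error.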

29 Jun, 2010

2 commits

c67d859e3 ext4: clean up ext4_abort() so __func__ is now implicit

Use a macro definition for ext4_abort() to clean up the .c files a wee
bit.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-29 23:07:07 +0800
4a9cdec73 ext4: Add new superblock fields reserved for the Next3 snapshot feature

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-29 23:00:23 +0800

15 Jun, 2010

1 commit

206f7ab4f ext4: remove vestiges of nobh support

The nobh option was only supported for writeback mode, but given that all
write paths actually create buffer heads it effectively was a no-op already.

Signed-off-by: Christoph Hellwig
Signed-off-by: "Theodore Ts'o"

Christoph Hellwig
2010-06-15 02:42:49 +0800

12 Jun, 2010

1 commit

a0375156c ext4: Clean up s_dirt handling

We don't need to set s_dirt in most of the ext4 code when journaling
is enabled. In ext3/4 some of the summary statistics for # of free
inodes, blocks, and directories are calculated from the per-block
group statistics when the file system is mounted or unmounted. As a
result the superblock doesn't have to be updated, either via the
journal or by setting s_dirt. There are a few exceptions, most
notably when resizing the file system, where the superblock needs to
be modified --- and in that case it should be done as a journalled
operation if possible, and s_dirt set only in no-journal mode.

This patch will optimize out some unneeded disk writes when using ext4
with a journal.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-06-12 11:14:04 +0800

28 May, 2010

1 commit

7ea808591 drop unused dentry argument to ->fsync

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-05-28 10:05:02 +0800

17 May, 2010

8 commits

14ece1028 ext4: Make fsync sync new parent directories in no-journal mode

Add a new ext4 state to tell us when a file has been newly created; use
that state in ext4_sync_file in no-journal mode to tell us when we need
to sync the parent directory as well as the inode and data itself. This
fixes a problem in which a panic or power failure may lose the entire
file even when using fsync, since the parent directory entry is lost.

Addresses-Google-Bug: #2480057

Signed-off-by: Frank Mayhar
Signed-off-by: "Theodore Ts'o"

Frank Mayhar
2010-05-17 20:00:00 +0800
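
Editor's sketch: the failure mode described above (file data is synced but the directory entry is not) is the reason portable applications often fsync the parent directory themselves after creating a file. The userspace helper below illustrates that application-side pattern; it is not the in-kernel change made by this commit, and `demo_durable_create` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Create `path`, write `len` bytes, then flush both the file and its
 * parent directory. Returns 0 on success and -1 on any failure.
 * A short write is treated as a failure to keep the sketch small.
 */
static int demo_durable_create(const char *path, const void *data, size_t len)
{
	char dirbuf[PATH_MAX];
	int fd, dfd, ret = -1;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0)
		goto out;

	/* dirname() may modify its argument, so work on a copy. */
	strncpy(dirbuf, path, sizeof(dirbuf) - 1);
	dirbuf[sizeof(dirbuf) - 1] = '\0';
	dfd = open(dirname(dirbuf), O_RDONLY);
	if (dfd < 0)
		goto out;
	/* Flush the directory so the new entry itself reaches the disk. */
	if (fsync(dfd) == 0)
		ret = 0;
	close(dfd);
out:
	close(fd);
	return ret;
}
```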
60e6679e2 ext4: Drop whitespace at end of lines

This patch was generated using:

    #!/usr/bin/perl -i
    while (<>) {
        s/[ ]+$//;
        print;
    }

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-05-17 19:00:00 +0800
4d92dc0f0 ext4: Fix compat EXT4_IOC_ADD_GROUP

struct ext4_new_group_input needs to be converted because u64 has
only 32-bit alignment on some 32-bit architectures, notably i386.

Signed-off-by: Ben Hutchings
Signed-off-by: "Theodore Ts'o"

Ben Hutchings
2010-05-17 18:00:00 +0800
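
Editor's sketch: the layout difference behind this fix can be reproduced in userspace. A 64-bit field that only needs 4-byte alignment (as on i386) sits directly after a leading 32-bit field, while a naturally aligned one is preceded by padding on 64-bit ABIs, so the two layouts disagree and the ioctl argument has to be converted field by field. The types below are shortened, hypothetical stand-ins; `demo_compat_u64` relies on the GCC/Clang `aligned` attribute on a typedef.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer with only 4-byte alignment, mimicking the i386 ABI. */
typedef uint64_t __attribute__((aligned(4))) demo_compat_u64;

/* Layout as seen by a native 64-bit caller. */
struct demo_group_input_native {
	uint32_t group;
	uint64_t block_bitmap;
	uint64_t inode_bitmap;
};

/* Layout as seen by a 32-bit caller: no padding after `group`. */
struct demo_group_input_compat {
	uint32_t group;
	demo_compat_u64 block_bitmap;
	demo_compat_u64 inode_bitmap;
};

/* Field-by-field conversion from the 32-bit layout to the native one. */
static void demo_convert_group_input(const struct demo_group_input_compat *in,
				     struct demo_group_input_native *out)
{
	out->group = in->group;
	out->block_bitmap = in->block_bitmap;
	out->inode_bitmap = in->inode_bitmap;
}
```

Copying the raw bytes instead of converting would shift every field after `group` by four bytes on a 64-bit kernel.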
899ad0cea ext4: Conditionally define compat ioctl numbers

It is unnecessary, and in general impossible, to define the compat
ioctl numbers except when building the filesystem with CONFIG_COMPAT
defined.

Signed-off-by: Ben Hutchings
Signed-off-by: "Theodore Ts'o"

Ben Hutchings
2010-05-17 17:00:00 +0800
12e9b8920 ext4: Use bitops to read/modify i_flags in struct ext4_inode_info

At several places we modify EXT4_I(inode)->i_flags without holding
i_mutex (ext4_do_update_inode, ...). These modifications are racy and
we can lose updates to i_flags. So convert handling of i_flags to use
bitops which are atomic.

https://bugzilla.kernel.org/show_bug.cgi?id=15792

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2010-05-17 10:00:00 +0800
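
Editor's sketch: a plain `flags |= bit` is a read-modify-write sequence, so two unlocked writers can overwrite each other's update. The userspace analogue below uses C11 atomics in place of the kernel's bitops to show the same idea; `struct demo_inode_info` and the helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-inode state with an atomically updated flag word. */
struct demo_inode_info {
	atomic_ulong flags;
};

/* Atomically set one flag bit; concurrent setters cannot lose updates. */
static inline void demo_set_flag(struct demo_inode_info *ei, unsigned int bit)
{
	atomic_fetch_or(&ei->flags, 1UL << bit);
}

/* Atomically clear one flag bit. */
static inline void demo_clear_flag(struct demo_inode_info *ei, unsigned int bit)
{
	atomic_fetch_and(&ei->flags, ~(1UL << bit));
}

/* Read one flag bit. */
static inline bool demo_test_flag(struct demo_inode_info *ei, unsigned int bit)
{
	return (atomic_load(&ei->flags) & (1UL << bit)) != 0;
}
```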
24676da46 ext4: Convert calls of ext4_error() to EXT4_ERROR_INODE()

EXT4_ERROR_INODE() tends to provide better error information and in a
more consistent format. Some errors were not even identifying the inode
or directory which was corrupted, which made them not very useful.

Addresses-Google-Bug: #2507977

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-05-17 09:00:00 +0800
e35fd6609 ext4: Add new abstraction ext4_map_blocks() underneath ext4_get_blocks()

Jack up ext4_get_blocks() and add a new function, ext4_map_blocks(),
which uses a much smaller structure, struct ext4_map_blocks, which is
20 bytes, as opposed to a struct buffer_head, which is nearly 5 times
bigger on an x86_64 machine. By switching things to use
ext4_map_blocks(), we can save stack space since we can avoid
allocating a struct buffer_head on the stack.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-05-17 07:00:00 +0800
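
Editor's sketch: the saving comes from passing a small request/result structure instead of a full buffer head. The layout below is an assumption modelled on the 20 bytes quoted above (one 64-bit physical block, one 32-bit logical block, a length, and flags); the field names, the flag value, and `demo_do_map` are illustrative only. Note that `sizeof` may report 24 on ABIs that pad the structure to 8-byte alignment.

```c
#include <stdint.h>

#define DEMO_MAP_MAPPED 0x1u /* hypothetical flag value */

/* Compact mapping request: the caller fills m_lblk and m_len. */
struct demo_map_blocks {
	uint64_t m_pblk;      /* out: first physical block */
	uint32_t m_lblk;      /* in: first logical block */
	unsigned int m_len;   /* in/out: number of blocks */
	unsigned int m_flags; /* out: state of the mapping */
};

/*
 * Toy mapper: pretends the file is laid out contiguously from `base`.
 * Returns the number of blocks mapped.
 */
static int demo_do_map(struct demo_map_blocks *map, uint64_t base)
{
	map->m_pblk = base + map->m_lblk;
	map->m_flags |= DEMO_MAP_MAPPED;
	return (int)map->m_len;
}
```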
8a57d9d61 ext4: check for a good block group before loading buddy pages

This adds a new field in ext4_group_info to cache the largest available
block range in a block group; and don't load the buddy pages until *after*
we've done a sanity check on the block group.

With large allocation requests (e.g., fallocate(), 8MiB) and relatively full
partitions, it's easy to have no block groups with a block extent large
enough to satisfy the input request length. This currently causes the loop
during cr == 0 in ext4_mb_regular_allocator() to load the buddy bitmap pages
for EVERY block group. That can be a lot of pages. The patch below allows
us to call ext4_mb_good_group() BEFORE we load the buddy pages (although we
have to check again after we lock the block group).

Addresses-Google-Bug: #2578108
Addresses-Google-Bug: #2704453

Signed-off-by: Curt Wohlgemuth
Signed-off-by: "Theodore Ts'o"

Curt Wohlgemuth
2010-05-17 03:00:00 +0800
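
Editor's sketch: the reordering described above amounts to consulting a cheap cached summary before doing the expensive load, then re-checking once the group is held. The toy model below counts how many "buddy loads" happen; every name in it is hypothetical, and the locking is reduced to a comment.

```c
#include <stddef.h>

/* Hypothetical per-group summary with a cached largest free order. */
struct demo_group {
	unsigned int free_blocks;
	unsigned int largest_free_order; /* cached: largest free 2^n run */
	unsigned int buddy_loads;        /* counts expensive loads */
};

/* Cheap test that needs no buddy pages. */
static int demo_good_group(const struct demo_group *g, unsigned int order)
{
	return g->free_blocks > 0 && g->largest_free_order >= order;
}

/* Stand-in for the expensive step of reading the buddy bitmap pages. */
static void demo_load_buddy(struct demo_group *g)
{
	g->buddy_loads++;
}

/* Return the index of the first group able to hold 2^order blocks, or -1. */
static int demo_find_group(struct demo_group *groups, size_t n,
			   unsigned int order)
{
	for (size_t i = 0; i < n; i++) {
		/* Skip unsuitable groups without loading anything. */
		if (!demo_good_group(&groups[i], order))
			continue;
		demo_load_buddy(&groups[i]);
		/* Re-check; the real code would do this under the group lock. */
		if (demo_good_group(&groups[i], order))
			return (int)i;
	}
	return -1;
}
```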

06 Mar, 2010

3 commits

9467c4fdd Merge branch 'write_inode2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'write_inode2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
pass writeback_control to ->write_inode
make sure data is on disk before calling ->write_inode

Linus Torvalds
2010-03-06 03:53:53 +0800
1f63b9c15 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (36 commits)
ext4: fix up rb_root initializations to use RB_ROOT
ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctl
ext4: Fix the NULL reference in double_down_write_data_sem()
ext4: Fix insertion point of extent in mext_insert_across_blocks()
ext4: consolidate in_range() definitions
ext4: cleanup to use ext4_grp_offs_to_block()
ext4: cleanup to use ext4_group_first_block_no()
ext4: Release page references acquired in ext4_da_block_invalidatepages
ext4: Fix ext4_quota_write cross block boundary behaviour
ext4: Convert BUG_ON checks to use ext4_error() instead
ext4: Use direct_IO_no_locking in ext4 dio read
ext4: use ext4_get_block_write in buffer write
ext4: mechanical rename some of the direct I/O get_block's identifiers
ext4: make "offset" consistent in ext4_check_dir_entry()
ext4: Handle non empty on-disk orphan link
ext4: explicitly remove inode from orphan list after failed direct io
ext4: fix error handling in migrate
ext4: deprecate obsoleted mount options
ext4: Fix fencepost error in choosing group vs file preallocation.
jbd2: clean up an assertion in jbd2_journal_commit_transaction()
...

Linus Torvalds
2010-03-06 02:47:00 +0800
a9185b41a pass writeback_control to ->write_inode

This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by being able to
distinguish between the different callers in more detail.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-03-06 02:25:52 +0800
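
Editor's sketch: with a control structure instead of a bare "wait" integer, a filesystem's write_inode method can inspect the kind of writeback being performed, for example whether the caller needs synchronous completion. The mock below uses hypothetical `demo_` types in userspace; only the idea of testing a sync mode field is taken from the description above.

```c
/* Hypothetical mirror of the writeback modes a caller can request. */
enum demo_sync_mode {
	DEMO_WB_SYNC_NONE, /* opportunistic writeback, do not wait */
	DEMO_WB_SYNC_ALL   /* data integrity writeback, wait for I/O */
};

struct demo_writeback_control {
	enum demo_sync_mode sync_mode;
};

struct demo_inode {
	int dirty;
	int waited; /* set when the write was completed synchronously */
};

/* write_inode-style method that receives the full control structure. */
static int demo_write_inode(struct demo_inode *inode,
			    const struct demo_writeback_control *wbc)
{
	inode->dirty = 0;
	/* Only integrity writeback has to wait for completion. */
	inode->waited = (wbc->sync_mode == DEMO_WB_SYNC_ALL);
	return 0;
}
```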

05 Mar, 2010

1 commit

744692dc0 ext4: use ext4_get_block_write in buffer write

Allocate uninitialized extent before ext4 buffer write and
convert the extent to initialized after io completes.
The purpose is to make sure an extent can only be marked
initialized after it has been written with new data so
we can safely drop the i_mutex lock in ext4 DIO read without
exposing stale data. This helps to improve multi-thread DIO
read performance on high-speed disks.

Skip the nobh and data=journal mount cases to make things simple for now.

Signed-off-by: Jiaying Zhang
Signed-off-by: "Theodore Ts'o"

Jiaying Zhang
2010-03-05 05:14:02 +0800

04 Mar, 2010

1 commit

731eb1a03 ext4: consolidate in_range() definitions

There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c. This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.

Signed-off-by: Akinobu Mita
Signed-off-by: "Theodore Ts'o"
Cc: Andreas Dilger

Akinobu Mita
2010-03-04 12:55:01 +0800
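
Editor's sketch: a single shared definition of the range test might look like the macro below. The macro body is an assumption (an inclusive test that `b` lies in `[first, first + len - 1]`), and `demo_block_in_table` is a hypothetical caller added for illustration.

```c
/* True when block b lies within [first, first + len - 1]. */
#define in_range(b, first, len) \
	((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical caller: does `blk` fall inside a metadata table? */
static int demo_block_in_table(unsigned long long blk,
			       unsigned long long table_start,
			       unsigned long long table_len)
{
	return in_range(blk, table_start, table_len);
}
```

Keeping one definition in a shared header means every caller agrees on whether the upper bound is inclusive.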

03 Mar, 2010

2 commits

273df556b ext4: Convert BUG_ON checks to use ext4_error() instead

Convert a bunch of BUG_ONs to emit an ext4_error() message and return
EIO. This is a first pass and most notably does _not_ cover
mballoc.c, which is a morass of void functions.

Signed-off-by: Frank Mayhar
Signed-off-by: "Theodore Ts'o"

Frank Mayhar
2010-03-03 00:46:09 +0800
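
Editor's sketch: the conversion replaces a fatal assertion with a logged error and an error return, which only works when the enclosing function can return a status (hence the remark about void functions). The userspace example below shows the "after" shape; `demo_check_extent_len` and its message are hypothetical.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Validate an extent length read from disk.
 * Before: an assertion here would have brought the whole machine down.
 * After: report the corruption and let the caller handle -EIO.
 */
static int demo_check_extent_len(unsigned int len, unsigned int max_len)
{
	if (len == 0 || len > max_len) {
		fprintf(stderr, "demo error: bad extent length %u (max %u)\n",
			len, max_len);
		return -EIO;
	}
	return 0;
}
```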
c7064ef13 ext4: mechanical rename some of the direct I/O get_block's identifiers

This commit renames some of the direct I/O's block allocation flags,
variables, and functions introduced in Mingming's "Direct IO for holes
and fallocate" patches so that they can be used by ext4's buffered
write path as well. Also changed the related function comments
accordingly to cover both direct write and buffered write cases.

Signed-off-by: Jiaying Zhang
Signed-off-by: "Theodore Ts'o"

Jiaying Zhang
2010-03-03 02:28:44 +0800

24 Feb, 2010

1 commit

c8d46e41b ext4: Add flag to files with blocks intentionally past EOF

fallocate() may potentially instantiate blocks past EOF, depending
on the flags used when it is called.

e2fsck currently has a test for blocks past i_size, and it
sometimes trips up - noticeably on xfstests 013 which runs fsstress.

This patch from Jiaying does fix it up - it (along with
e2fsprogs updates and other patches recently from Aneesh) has
survived many fsstress runs in a row.

Signed-off-by: Eric Sandeen
Signed-off-by: Jiaying Zhang
Signed-off-by: "Theodore Ts'o"

Jiaying Zhang
2010-02-24 22:52:53 +0800

17 Feb, 2010

1 commit

003cb608a percpu: add __percpu sparse annotations to fs

Add __percpu sparse annotations to fs.

These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.

Signed-off-by: Tejun Heo
Cc: "Theodore Ts'o"
Cc: Trond Myklebust
Cc: Alex Elder
Cc: Christoph Hellwig
Cc: Alexander Viro

Tejun Heo
2010-02-17 10:17:38 +0800

16 Feb, 2010

1 commit

12062dddd ext4: move __func__ into a macro for ext4_warning, ext4_error

Just a pet peeve of mine; we had a mishmash of calls with either __func__
or "function_name" and the latter tends to get out of sync.

I think it's easier to just hide the __func__ in a macro, and it'll
be consistent from then on.

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"

Eric Sandeen
2010-02-16 03:19:27 +0800

25 Jan, 2010

3 commits

f710b4b96 ext4: Reserve INCOMPAT_EA_INODE and INCOMPAT_DIRDATA feature codepoints

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-01-25 16:31:32 +0800
19f5fb7ad ext4: Use bitops to read/modify EXT4_I(inode)->i_state

At several places we modify EXT4_I(inode)->i_state without holding
i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
ext4_do_update_inode, ...). These modifications are racy and we can
lose updates to i_state. So convert handling of i_state to use bitops
which are atomic.

Cc: Jan Kara
Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-01-25 03:34:07 +0800
5f634d064 ext4: Fix quota accounting error with fallocate

When we fallocate a region of the file which we had recently written,
and which is still in the page cache marked as delayed allocated blocks,
we need to make sure we don't do the quota update on the writepage path.
This is because the needed quota update would have already been done
by fallocate.

Signed-off-by: Aneesh Kumar K.V

Aneesh Kumar K.V
2010-01-25 17:00:31 +0800

15 Jan, 2010

1 commit

1296cc85c ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag

We should update the reserved space if it is a delalloc buffer,
and that is indicated by the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag.
So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of
EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE.

Signed-off-by: Aneesh Kumar K.V

Aneesh Kumar K.V
2010-01-15 14:27:59 +0800

01 Jan, 2010

1 commit

9d0be5023 ext4: Calculate metadata requirements more accurately

In the past, ext4_calc_metadata_amount(), and its sub-functions
ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
badly over-estimated the number of metadata blocks that might be
required for delayed allocation blocks. This didn't matter as much
when functions which managed the reserved metadata blocks were more
aggressive about dropping reserved metadata blocks as delayed
allocation blocks were written, but unfortunately they were too
aggressive. This was fixed in commit 0637c6f, but as a result the
over-estimation by ext4_calc_metadata_amount() would lead to reserving
2-3 times the number of pending delayed allocation blocks as
potentially required metadata blocks. So if there is 1 megabyte of
blocks which have not yet been allocated, up to 3 megabytes of
space would get reserved out of the user's quota and from the file
system free space pool until all of the inode's data blocks have been
allocated.

This commit addresses this problem by much more accurately estimating
the number of metadata blocks that will be required. It will still
somewhat over-estimate the number of blocks needed, since it must make
a worst case estimate not knowing which physical blocks will be
needed, but it is much more accurate than before.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-01-01 15:41:30 +0800