Eric Lee / smarc-fsl-linux-kernel

17 Aug, 2012

4 commits

ef824bfba Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bug fixes from Ted Ts'o:
"The following are all bug fixes and regressions. The most notable are
the ones which cause problems for ext4 on RAID --- a performance
problem when mounting very large filesystems, and a kernel OOPS when
doing an rm -rf on large directory hierarchies on fast devices."

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix kernel BUG on large-scale rm -rf commands
ext4: fix long mount times on very big file systems
ext4: don't call ext4_error while block group is locked
ext4: avoid kmemcheck complaint from reading uninitialized memory
ext4: make sure the journal sb is written in ext4_clear_journal_err()

Linus Torvalds
2012-08-17 23:04:47 +0800
89a4e48f8 ext4: fix kernel BUG on large-scale rm -rf commands

Commit 968dee7722: "ext4: fix hole punch failure when depth is greater
than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel
crashes when users ran "rm -rf" on a large directory hierarchy on
ext4 filesystems on RAID devices:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000028

Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710)
Call Trace:
[] ? __ext4_handle_dirty_metadata+0x83/0x110
[] ext4_ext_truncate+0x193/0x1d0
[] ? ext4_mark_inode_dirty+0x7f/0x1f0
[] ext4_truncate+0xf5/0x100
[] ext4_evict_inode+0x461/0x490
[] evict+0xa2/0x1a0
[] iput+0x103/0x1f0
[] do_unlinkat+0x154/0x1c0
[] ? sys_newfstatat+0x2a/0x40
[] sys_unlinkat+0x1b/0x50
[] system_call_fastpath+0x16/0x1b
Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00

RIP [] ext4_ext_remove_space+0xa34/0xdf0

This could be reproduced as follows:

The problem in commit 968dee7722 was that it caused the variable 'i' to
be left uninitialized if the truncate required more space than was
available in the journal. This resulted in the function
ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused
ext4_ext_remove_space() to restart the truncate operation after
starting a new jbd2 handle.
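
The restart pattern described above can be illustrated with a minimal user-space sketch. It is not the ext4 code: `toy_truncate`, `toy_pass()` and `toy_remove_space()` are hypothetical names, and "credits" merely stand in for journal space. The point it shows is that a loop index used by a restartable operation has to be given a defined value on every attempt, including the ones that follow an -EAGAIN.

```c
#include <assert.h>
#include <errno.h>

#define TOY_DEPTH 4

/* Hypothetical stand-in for the state of one truncate; not ext4 code. */
struct toy_truncate {
    int remaining[TOY_DEPTH]; /* units of work left at each tree level */
    int credits;              /* journal credits left in the current handle */
};

/* One pass over the levels; returns -EAGAIN when the credits run out. */
static int toy_pass(struct toy_truncate *t)
{
    int i = 0; /* set on every pass, so a restart never sees a stale value */

    while (i < TOY_DEPTH) {
        if (t->remaining[i] == 0) {
            i++;
            continue;
        }
        if (t->credits == 0)
            return -EAGAIN; /* caller must start a new handle and retry */
        t->remaining[i]--;
        t->credits--;
    }
    return 0;
}

/* Retry the pass with a fresh "handle" until it completes. */
int toy_remove_space(struct toy_truncate *t, int credits_per_handle,
                     int *restarts)
{
    int err;

    assert(credits_per_handle > 0); /* otherwise no pass could make progress */
    *restarts = 0;
    for (;;) {
        t->credits = credits_per_handle; /* models starting a new handle */
        err = toy_pass(t);
        if (err != -EAGAIN)
            return err;
        (*restarts)++;
    }
}
```

With six units of work and four credits per handle, the sketch finishes after exactly one restart.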

Reported-by: Maciej Żenczykowski
Reported-by: Marti Raudsepp
Tested-by: Fengguang Wu
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-08-17 21:42:17 +0800
0548bbb85 ext4: fix long mount times on very big file systems

Commit 8aeb00ff85a: "ext4: fix overhead calculation used by
ext4_statfs()" introduced an O(n**2) calculation which makes very large
file systems take forever to mount. Fix this with an optimization for
non-bigalloc file systems. (For bigalloc file systems the overhead
needs to be set in the superblock.)
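
As a rough illustration of why a per-group closed form avoids the nested scan, the sketch below sums each group's metadata blocks directly instead of rescanning every group for every group. Everything here is an assumption made for the example: `toy_layout` and the helpers are hypothetical, the per-group cost is modelled as two bitmaps plus the inode table, and backup superblocks and descriptors are placed using the sparse_super convention (groups 0, 1 and powers of 3, 5 and 7). It is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout parameters, not the kernel's struct ext4_sb_info. */
struct toy_layout {
    uint32_t ngroups;       /* number of block groups */
    uint32_t gdt_blocks;    /* blocks holding the group descriptors */
    uint32_t itb_per_group; /* inode table blocks per group */
};

/* True if n is base^k for some k >= 0. */
static int toy_is_power_of(uint32_t n, uint32_t base)
{
    if (n == 0)
        return 0;
    while (n % base == 0)
        n /= base;
    return n == 1;
}

/* sparse_super convention: backups live in groups 0, 1 and 3^k, 5^k, 7^k. */
static int toy_group_has_backup(uint32_t group)
{
    return group == 0 || toy_is_power_of(group, 3) ||
           toy_is_power_of(group, 5) || toy_is_power_of(group, 7);
}

/* Metadata blocks of one group, computed without looking at other groups. */
static uint64_t toy_group_overhead(const struct toy_layout *l, uint32_t group)
{
    uint64_t blocks = 2 + (uint64_t)l->itb_per_group; /* 2 bitmaps + itable */

    if (toy_group_has_backup(group))
        blocks += 1 + (uint64_t)l->gdt_blocks; /* superblock + descriptors */
    return blocks;
}

/* Single pass over the groups. */
uint64_t toy_total_overhead(const struct toy_layout *l)
{
    uint64_t total = 0;
    uint32_t g;

    assert(l != NULL);
    for (g = 0; g < l->ngroups; g++)
        total += toy_group_overhead(l, g);
    return total;
}
```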

Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-08-17 21:23:00 +0800
7a4c5de27 ext4: don't call ext4_error while block group is locked

While in ext4_validate_block_bitmap(), if a block allocation bitmap
is found to be invalid, we call ext4_error() while the block group is
still locked. This causes ext4_commit_super() to call a function
which might sleep while in an atomic context.

There's no need to keep the block group locked at this point, so hoist
the ext4_error() call up to ext4_validate_block_bitmap() and release
the block group spinlock before calling ext4_error().
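
The general shape of this fix (decide under the lock, report after dropping it) can be sketched in user space as follows. The lock is a toy flag rather than a spinlock, and `toy_validate_bitmap()` and `toy_report_error()` are hypothetical names; the reporter asserts that the lock is not held, standing in for a function that might sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: only tracks whether it is held, so misuse trips an assert. */
struct toy_group_lock {
    bool held;
};

static void toy_lock(struct toy_group_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_unlock(struct toy_group_lock *l)
{
    assert(l->held);
    l->held = false;
}

static int toy_errors_reported;

/* Stand-in for an error path that may sleep; must not run under the lock. */
static void toy_report_error(const struct toy_group_lock *l)
{
    assert(!l->held);
    toy_errors_reported++;
}

/* Check one required bit under the lock, then report with the lock dropped. */
int toy_validate_bitmap(struct toy_group_lock *l, const unsigned char *bitmap,
                        size_t required_bit)
{
    bool ok;

    toy_lock(l);
    ok = (bitmap[required_bit / 8] >> (required_bit % 8)) & 1;
    toy_unlock(l); /* release before anything that could sleep */

    if (!ok) {
        toy_report_error(l);
        return -1;
    }
    return 0;
}
```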

The reported stack trace can be found at:

http://article.gmane.org/gmane.comp.file-systems.ext4/33731

Reported-by: Dave Jones
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-08-17 21:06:06 +0800

06 Aug, 2012

2 commits

7e731bc9a ext4: avoid kmemcheck complaint from reading uninitialized memory

Commit 03179fe923 introduced a kmemcheck complaint in
ext4_da_get_block_prep() because we save and restore
ei->i_da_metadata_calc_last_lblock even though it is left
uninitialized in the case where i_da_metadata_calc_len is zero.

This doesn't hurt anything, but silencing the kmemcheck complaint
makes it easier for people to find real bugs.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=45631
(which is marked as a regression).

Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-08-06 11:28:16 +0800
d796c52ef ext4: make sure the journal sb is written in ext4_clear_journal_err()

After we set the EXT4_ERROR_FS bit in the file system
superblock, it's not enough to call jbd2_journal_clear_err() to clear
the error indication from journal superblock --- we need to call
jbd2_journal_update_sb_errno() as well. Otherwise, when the root file
system is mounted read-only, the journal is replayed, and the error
indicator is transferred to the superblock --- but the s_errno field
in the jbd2 superblock is left set (since although we cleared it in
memory, we never flushed it out to disk).

This can end up confusing e2fsck. We should make e2fsck more robust
in this case, but the kernel shouldn't be leaving things in this
confused state, either.
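
The difference between clearing the error in memory and also flushing it can be shown with a small model. `toy_journal` and its helpers are hypothetical; the two fields only imitate an in-memory value and the copy stored in the on-disk journal superblock, and the helper names loosely mirror the two jbd2 calls mentioned above without reproducing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one in-memory error code and its on-disk copy. */
struct toy_journal {
    int errno_in_memory;
    int errno_on_disk;
};

/* Clears only the in-memory indication. */
static void toy_journal_clear_err(struct toy_journal *j)
{
    j->errno_in_memory = 0;
}

/* Writes the in-memory value out to the "on-disk" superblock copy. */
static void toy_journal_update_sb_errno(struct toy_journal *j)
{
    j->errno_on_disk = j->errno_in_memory;
}

/*
 * Transfer a recorded journal error to the file system (modelled by
 * *fs_error_flag), then clear it both in memory and on disk.
 * Returns the error code that was found, or 0.
 */
int toy_clear_journal_err(struct toy_journal *j, int *fs_error_flag)
{
    int err;

    assert(j != NULL && fs_error_flag != NULL);
    err = j->errno_in_memory;
    if (err) {
        *fs_error_flag = 1;
        toy_journal_clear_err(j);
        toy_journal_update_sb_errno(j); /* without this, the disk copy stays set */
    }
    return err;
}
```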

Signed-off-by: "Theodore Ts'o"
Cc: stable@kernel.org

Theodore Ts'o
2012-08-06 07:04:57 +0800

04 Aug, 2012

2 commits

f6463b0da ext4: nuke pdflush from comments

The pdflush thread is long gone, so this patch removes references to pdflush
from ext4 comments.

Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Signed-off-by: Artem Bityutskiy
Signed-off-by: Al Viro

Artem Bityutskiy
2012-08-04 16:15:34 +0800
7652bdfcb ext4: nuke write_super from comments

The '->write_super' superblock method is gone, and this patch removes all the
references to 'write_super' from ext4.

Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Signed-off-by: Artem Bityutskiy
Signed-off-by: Al Viro

Artem Bityutskiy
2012-08-04 16:15:33 +0800

02 Aug, 2012

1 commit

a0e881b7c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull second vfs pile from Al Viro:
"The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
deadlock reproduced by xfstests 068), symlink and hardlink restriction
patches, plus assorted cleanups and fixes.

Note that another fsfreeze deadlock (emergency thaw one) is *not*
dealt with - the series by Fernando conflicts a lot with Jan's, breaks
userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
for massive vfsmount leak; this is going to be handled next cycle.
There probably will be another pull request, but that stuff won't be
in it."

Fix up trivial conflicts due to unrelated changes next to each other in
drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
delousing target_core_file a bit
Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
fs: Remove old freezing mechanism
ext2: Implement freezing
btrfs: Convert to new freezing mechanism
nilfs2: Convert to new freezing mechanism
ntfs: Convert to new freezing mechanism
fuse: Convert to new freezing mechanism
gfs2: Convert to new freezing mechanism
ocfs2: Convert to new freezing mechanism
xfs: Convert to new freezing code
ext4: Convert to new freezing mechanism
fs: Protect write paths by sb_start_write - sb_end_write
fs: Skip atime update on frozen filesystem
fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
fs: Improve filesystem freezing handling
switch the protection of percpu_counter list to spinlock
nfsd: Push mnt_want_write() outside of i_mutex
btrfs: Push mnt_want_write() outside of i_mutex
fat: Push mnt_want_write() outside of i_mutex
...

Linus Torvalds
2012-08-02 01:26:23 +0800

31 Jul, 2012

2 commits

8e8ad8a57 ext4: Convert to new freezing mechanism

We remove most of frozen checks since upper layer takes care of blocking all
writes. We have to handle protection in ext4_page_mkwrite() in a special way
because we cannot use generic block_page_mkwrite(). Also we add a freeze
protection to ext4_evict_inode() so that iput() of unlinked inode cannot modify
a frozen filesystem (we cannot easily instrument ext4_journal_start() /
ext4_journal_stop() with freeze protection because we are missing the
superblock pointer in ext4_journal_stop() in nojournal mode).

CC: linux-ext4@vger.kernel.org
CC: "Theodore Ts'o"
BugLink: https://bugs.launchpad.net/bugs/897421
Tested-by: Kamal Mostafa
Tested-by: Peter M. Petrakis
Tested-by: Dann Frazier
Tested-by: Massimo Morana
Acked-by: "Theodore Ts'o"
Signed-off-by: Jan Kara
Signed-off-by: Al Viro

Jan Kara
2012-07-31 13:45:48 +0800
6017b485c ext4: use memweight()

Convert ext4_count_free() to use memweight() instead of table lookup
based counting clear bits implementation. This change only affects the
code segments enabled by EXT4FS_DEBUG.

Note that this memweight() call can't be replaced with a single
bitmap_weight() call, even though the pointer to the memory area is aligned
to a long-word boundary: the size of the memory area may not be a
multiple of BITS_PER_LONG, in which case bitmap_weight() returns the wrong
value on big-endian architectures.
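
A byte-wise weight has no such size or endianness constraint, as the following user-space sketch suggests. `toy_memweight()` and `toy_count_free()` are hypothetical stand-ins written for this example (the kernel helpers are not reproduced here), and 8-bit bytes are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Number of set bits in a byte range; works for any length in bytes. */
static size_t toy_memweight(const unsigned char *p, size_t nbytes)
{
    size_t weight = 0;
    size_t i;

    for (i = 0; i < nbytes; i++) {
        unsigned char b = p[i];

        while (b) {
            b &= (unsigned char)(b - 1); /* clear the lowest set bit */
            weight++;
        }
    }
    return weight;
}

/* Free entries are the clear bits: total bits minus set bits. */
size_t toy_count_free(const unsigned char *bitmap, size_t nbytes)
{
    assert(bitmap != NULL);
    return nbytes * 8 - toy_memweight(bitmap, nbytes);
}
```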

This also includes the following change.

- Remove unnecessary map == NULL check in ext4_count_free() which
always takes non-null pointer as the memory area.

Signed-off-by: Akinobu Mita
Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Akinobu Mita
2012-07-31 08:25:16 +0800

28 Jul, 2012

1 commit

173f86547 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"The usual collection of bug fixes and optimizations. Perhaps of
greatest note is a speed up for parallel, non-allocating DIO writes,
since we no longer take the i_mutex lock in that case.

For bug fixes, we fix an incorrect overhead calculation which caused
slightly incorrect results for df(1) and statfs(2). We also fixed
bugs in the metadata checksum feature."

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
ext4: undo ext4_calc_metadata_amount if we fail to claim space
ext4: don't let i_reserved_meta_blocks go negative
ext4: fix hole punch failure when depth is greater than 0
ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()
ext4: weed out ext4_write_super
ext4: remove unnecessary superblock dirtying
ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super()
ext4: remove useless marking of superblock dirty
ext4: fix ext4 mismerge back in January
ext4: remove dynamic array size in ext4_chksum()
ext4: remove unused variable in ext4_update_super()
ext4: make quota as first class supported feature
ext4: don't take the i_mutex lock when doing DIO overwrites
ext4: add a new nolock flag in ext4_map_blocks
ext4: split ext4_file_write into buffered IO and direct IO
ext4: remove an unused statement in ext4_mb_get_buddy_page_lock()
ext4: fix out-of-date comments in extents.c
ext4: use s_csum_seed instead of i_csum_seed for xattr block
ext4: use proper csum calculation in ext4_rename
ext4: fix overhead calculation used by ext4_statfs()
...

Linus Torvalds
2012-07-28 11:52:25 +0800

24 Jul, 2012

1 commit

a66d2c8f7 Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull the big VFS changes from Al Viro:
"This one is *big* and changes quite a few things around VFS. What's in there:

- the first of two really major architecture changes - death to open
intents.

The former is finally there; it was very long in making, but with
Miklos getting through really hard and messy final push in
fs/namei.c, we finally have it. Unlike his variant, this one
doesn't introduce struct opendata; what we have instead is
->atomic_open() taking preallocated struct file * and passing
everything via its fields.

Instead of returning struct file *, it returns -E... on error, 0
on success and 1 in "deal with it yourself" case (e.g. symlink
found on server, etc.).

See comments before fs/namei.c:atomic_open(). That made a lot of
goodies finally possible and quite a few are in that pile:
->lookup(), ->d_revalidate() and ->create() do not get struct
nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
flags instead, ->create() gets "do we want it exclusive" flag.

With the introduction of new helper (kern_path_locked()) we are rid
of all struct nameidata instances outside of fs/namei.c; it's still
visible in namei.h, but not for long. Come the next cycle,
declaration will move either to fs/internal.h or to fs/namei.c
itself. [me, miklos, hch]

- The second major change: behaviour of final fput(). Now we have
__fput() done without any locks held by caller *and* not from deep
in call stack.

That obviously lifts a lot of constraints on the locking in there.
Moreover, it's legal now to call fput() from atomic contexts (which
has immediately simplified life for aio.c). We also don't need
anti-recursion logics in __scm_destroy() anymore.

There is a price, though - the damn thing has become partially
asynchronous. For fput() from normal process we are guaranteed
that pending __fput() will be done before the caller returns to
userland, exits or gets stopped for ptrace.

For kernel threads and atomic contexts it's done via
schedule_work(), so theoretically we might need a way to make sure
it's finished; so far only one such place had been found, but there
might be more.

There's flush_delayed_fput() (do all pending __fput()) and there's
__fput_sync() (fput() analog doing __fput() immediately). I hope
we won't need them often; see warnings in fs/file_table.c for
details. [me, based on task_work series from Oleg merged last
cycle]

- sync series from Jan

- large part of "death to sync_supers()" work from Artem; the only
bits missing here are exofs and ext4 ones. As far as I understand,
those are going via the exofs and ext4 trees resp.; once they are
in, we can put ->write_super() to the rest, along with the thread
calling it.

- preparatory bits from unionmount series (from dhowells).

- assorted cleanups and fixes all over the place, as usual.

This is not the last pile for this cycle; there's at least jlayton's
ESTALE work and fsfreeze series (the latter - in dire need of fixes,
so I'm not sure it'll make the cut this cycle). I'll probably throw
symlink/hardlink restrictions stuff from Kees into the next pile, too.
Plus there's a lot of misc patches I hadn't thrown into that one -
it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
switch dentry_open() to struct path, make it grab references itself
spufs: shift dget/mntget towards dentry_open()
zoran: don't bother with struct file * in zoran_map
ecryptfs: don't reinvent the wheels, please - use struct completion
don't expose I_NEW inodes via dentry->d_inode
tidy up namei.c a bit
unobfuscate follow_up() a bit
ext3: pass custom EOF to generic_file_llseek_size()
ext4: use core vfs llseek code for dir seeks
vfs: allow custom EOF in generic_file_llseek code
vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
vfs: Remove unnecessary flushing of block devices
vfs: Make sys_sync writeout also block device inodes
vfs: Create function for iterating over block devices
vfs: Reorder operations during sys_sync
quota: Move quota syncing to ->sync_fs method
quota: Split dquot_quota_sync() to writeback and cache flushing part
vfs: Move noop_backing_dev_info check from sync into writeback
...

Linus Torvalds
2012-07-24 03:27:27 +0800

23 Jul, 2012

18 commits

03179fe92 ext4: undo ext4_calc_metadata_amount if we fail to claim space

The function ext4_calc_metadata_amount() has side effects, although
it's not obvious from its function name. So if we fail to claim
space, regardless of whether we retry to claim the space again, or
return an error, we need to undo these side effects.

Otherwise we can end up incorrectly calculating the number of metadata
blocks needed for the operation, which was responsible for an xfstests
failure for test #271 when using an ext2 file system with delalloc
enabled.
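
The save-and-restore idea can be sketched in user space as below. The struct fields loosely echo the i_da_metadata_calc_len and i_da_metadata_calc_last_lblock fields named elsewhere in this log, but `toy_inode`, the estimate itself and `toy_reserve_space()` are hypothetical and much simpler than the real calculation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-inode state touched by the metadata estimate. */
struct toy_inode {
    unsigned calc_len;         /* length of the current run of blocks */
    unsigned calc_last_lblock; /* last logical block seen in that run */
    unsigned reserved_meta;    /* metadata blocks reserved so far */
};

/* Toy estimate with side effects: it updates the run-tracking fields. */
static unsigned toy_calc_metadata_amount(struct toy_inode *ei, unsigned lblock)
{
    if (ei->calc_len > 0 && lblock == ei->calc_last_lblock + 1) {
        ei->calc_len++;
        ei->calc_last_lblock = lblock;
        return 0; /* extends the current run, no new metadata block */
    }
    ei->calc_len = 1;
    ei->calc_last_lblock = lblock;
    return 1; /* new run, assume one metadata block */
}

/* Reserve one data block plus estimated metadata; undo the estimate on failure. */
int toy_reserve_space(struct toy_inode *ei, unsigned lblock,
                      unsigned *free_blocks)
{
    unsigned save_len, save_last, md_needed;

    assert(ei != NULL && free_blocks != NULL);
    save_len = ei->calc_len;
    save_last = ei->calc_last_lblock;

    md_needed = toy_calc_metadata_amount(ei, lblock);
    if (*free_blocks < md_needed + 1) {
        /* Failed to claim space: roll back the estimate's side effects. */
        ei->calc_len = save_len;
        ei->calc_last_lblock = save_last;
        return -ENOSPC;
    }
    *free_blocks -= md_needed + 1;
    ei->reserved_meta += md_needed;
    return 0;
}
```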

Reported-by: Brian Foster
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Theodore Ts'o
2012-07-23 12:00:20 +0800
97795d2a5 ext4: don't let i_reserved_meta_blocks go negative

If we hit a condition where we have allocated metadata blocks that
were not appropriately reserved, we risk underflow of
ei->i_reserved_meta_blocks. In turn, this can throw
sbi->s_dirtyclusters_counter significantly out of whack and undermine
the nondelalloc fallback logic in ext4_nonda_switch(). Warn if this
occurs and set i_allocated_meta_blocks to avoid this problem.
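
A guard of this kind might look like the following user-space sketch. It assumes that "set i_allocated_meta_blocks" means clamping the allocated count to what was actually reserved; that reading, along with `toy_reserve` and `toy_release_meta()`, is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; unsigned, so a naive subtraction could wrap. */
struct toy_reserve {
    unsigned reserved_meta;  /* metadata blocks reserved */
    unsigned allocated_meta; /* metadata blocks actually allocated */
    int warned;              /* set when the mismatch was detected */
};

/* Release the allocated metadata blocks from the reservation. */
void toy_release_meta(struct toy_reserve *r)
{
    assert(r != NULL);
    if (r->allocated_meta > r->reserved_meta) {
        /* Would underflow: warn and clamp instead of wrapping around. */
        r->warned = 1;
        r->allocated_meta = r->reserved_meta;
    }
    r->reserved_meta -= r->allocated_meta;
    r->allocated_meta = 0;
}
```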

This condition is reproduced by xfstests 270 against ext2 with
delalloc enabled:

Mar 28 08:58:02 localhost kernel: [ 171.526344] EXT4-fs (loop1): delayed block allocation failed for inode 14 at logical offset 64486 with max blocks 64 with error -28
Mar 28 08:58:02 localhost kernel: [ 171.526346] EXT4-fs (loop1): This should not happen!! Data will be lost

270 ultimately fails with an inconsistent filesystem and requires an
fsck to repair. The cause of the error is an underflow in
ext4_da_update_reserve_space() due to an unreserved meta block
allocation.

Signed-off-by: Brian Foster
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Brian Foster
2012-07-23 11:59:40 +0800
968dee772 ext4: fix hole punch failure when depth is greater than 0

Whether to continue removing extents or not is decided by the return
value of function ext4_ext_more_to_rm() which checks 2 conditions:
a) if there are no more indexes to process.
b) if the number of entries is decreased in the header of "depth - 1".

In case of hole punch, if the last block to be removed is not part of
the last extent index then this index will not be deleted, hence the
number of valid entries in the extent header of "depth - 1" will
remain as it is and ext4_ext_more_to_rm will return 0 although the
required blocks are not yet removed.

This patch fixes the above mentioned problem as instead of removing
the extents from the end of file, it starts removing the blocks from
the particular extent from which removing blocks is actually required
and continue backward until done.

Signed-off-by: Ashish Sangwan
Signed-off-by: Namjae Jeon
Reviewed-by: Lukas Czerner
Cc: stable@vger.kernel.org

Ashish Sangwan
2012-07-23 10:49:08 +0800
b50924c2c ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()

The '__ext4_handle_dirty_metadata()' does not need the 'now' argument
anymore and we can kill it.

Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara

Artem Bityutskiy
2012-07-23 08:37:31 +0800
4d47603d9 ext4: weed out ext4_write_super

We do not depend on VFS's '->write_super()' anymore and do not need
the 's_dirt' flag anymore, so weed out 'ext4_write_super()' and
's_dirt'.

Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara

Artem Bityutskiy
2012-07-23 08:35:31 +0800
58c5873a7 ext4: remove unnecessary superblock dirtying

This patch changes the 'ext4_handle_dirty_super()' function which
submits the superblock for I/O in the following cases:

1. When creating the first large file on a file system without
EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
2. When re-sizing the file-system.
3. When creating an xattr on a file-system without the
EXT4_FEATURE_COMPAT_EXT_ATTR feature.

If the file-system has journal enabled, the superblock is written via
the journal. We do not modify this path.

If the file-system has no journal, this function falls back to just
marking the superblock as dirty using the 's_dirt' superblock
flag. This means that it delays the actual superblock I/O submission
by 5 seconds (default setting). Namely, the 'sync_supers()' kernel
thread will call 'ext4_write_super()' later and will actually submit
the superblock for I/O.

And this is the behavior this patch modifies: we stop using 's_dirt'
and just mark the superblock buffer as dirty right away. Indeed, all 3
cases above are extremely rare and it does not add any value to delay
the I/O submission for them.

Note: 'ext4_handle_dirty_super()' executes
'__ext4_handle_dirty_super()' with 'now = 0'. This patch basically
makes the 'now' argument unneeded and it will be deleted in one of the
next patches.

This patch also removes 's_dirt' condition on the unmount path because
we never set it anymore, so we should not test it.

Tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Reviewed-by: Jan Kara

Artem Bityutskiy
2012-07-23 08:33:31 +0800
044ce47fe ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super()

The last user of ext4_mark_super_dirty() in ext4_file_open() is so
rare it can well be modifying the superblock properly by journalling
the change. Change it and get rid of ext4_mark_super_dirty() as it's
not needed anymore.

Artem: small amendments.
Artem: tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Jan Kara
Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Tested-by: Artem Bityutskiy

Jan Kara
2012-07-23 08:31:31 +0800
97a740688 ext4: remove useless marking of superblock dirty

Commit a0375156 properly notes that superblock doesn't need to be marked
as dirty when only number of free inodes / blocks / number of directories
changes since that is recomputed on each mount anyway. However that commit
leaves some unnecessary markings as dirty in place. Remove these.

Artem: tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Jan Kara
Signed-off-by: Artem Bityutskiy
Signed-off-by: "Theodore Ts'o"
Tested-by: Artem Bityutskiy

Jan Kara
2012-07-23 08:29:31 +0800
254706056 ext4: fix ext4 mismerge back in January

Duplicate caused, AFAICS, by mismerge in
ff9cb1c4eead5e4c292e75cd3170a82d66944101

Signed-off-by: Al Viro
Signed-off-by: "Theodore Ts'o"
Cc: stable@vger.kernel.org

Al Viro
2012-07-23 08:27:31 +0800
3108b54bc ext4: remove dynamic array size in ext4_chksum()

The ext4_chksum() inline function was using a dynamic array size,
which is not legal C. (It is a gcc extension).

Remove it.

Cc: "Darrick J. Wong"
Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-07-23 08:25:31 +0800
8a9918497 ext4: remove unused variable in ext4_update_super()

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2012-07-23 08:23:31 +0800
7c319d328 ext4: make quota as first class supported feature

This patch adds support for quotas as a first class feature in ext4;
which is to say, the quota files are stored in hidden inodes as file
system metadata, instead of as separate files visible in the file system
directory hierarchy.

It is based on the proposal at:
https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4

This patch introduces a new feature - EXT4_FEATURE_RO_COMPAT_QUOTA
which, when turned on, enables quota accounting at mount time
itself. Also, the quota inodes are stored in two additional superblock
fields. Some changes introduced by this patch that should be pointed
out are:

1) Two new ext4-superblock fields - s_usr_quota_inum and
s_grp_quota_inum for storing the quota inodes in use.
2) Default quota inodes are: inode#3 for tracking userquota and inode#4
for tracking group quota. The superblock fields can be set to use
other inodes as well.
3) If the QUOTA feature and corresponding quota inodes are set in
superblock, the quota usage tracking is turned on at mount time. On
'quotaon' ioctl, the quota limits enforcement is turned
on. 'quotaoff' ioctl turns off only the limits enforcement in this
case.
4) When QUOTA feature is in use, the quota mount options 'quota',
'usrquota', 'grpquota' are ignored by the kernel.
5) mke2fs or tune2fs can be used to set the QUOTA feature and initialize
quota inodes. The default reserved inodes will not be visible to user
as regular files.
6) The quota-tools will need to be modified to support hidden quota
files on ext4. E2fsprogs will also include support for creating and
fixing quota files.
7) Support is only for the new V2 quota file format.

Tested-by: Jan Kara
Reviewed-by: Jan Kara
Reviewed-by: Johann Lombardi
Signed-off-by: Aditya Kali
Signed-off-by: "Theodore Ts'o"

Aditya Kali
2012-07-23 08:21:31 +0800
4bd809dbb ext4: don't take the i_mutex lock when doing DIO overwrites

Aligned and overwrite direct I/O can be parallelized. In
ext4_file_dio_write, we first check whether these conditions are
satisfied or not. If so, we take i_data_sem and release i_mutex lock
directly. Meanwhile iocb->private is set to indicate that this is a
dio overwrite, and it will be handled in ext4_ext_direct_IO.

[ Added fix from Dan Carpenter to fix locking bug on the error path. ]

CC: Tao Ma
CC: Eric Sandeen
CC: Robin Dong
Signed-off-by: Zheng Liu
Signed-off-by: "Theodore Ts'o"
Signed-off-by: Dan Carpenter

Zheng Liu
2012-07-23 08:19:31 +0800
8cae6f715 ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()

Signed-off-by: Al Viro

Al Viro
2012-07-23 04:01:55 +0800
8fc37ec54 don't expose I_NEW inodes via dentry->d_inode

d_instantiate(dentry, inode);
unlock_new_inode(inode);

is a bad idea; do it the other way round...

Signed-off-by: Al Viro

Al Viro
2012-07-23 04:00:58 +0800
ec7268ce2 ext4: use core vfs llseek code for dir seeks

Use the new functionality in generic_file_llseek_size() to
accept a custom EOF position, and un-cut-and-paste all the
vfs llseek code from ext4.

Also fix up comments on ext4_llseek() to reflect reality.

Signed-off-by: Eric Sandeen
Signed-off-by: Al Viro

Eric Sandeen
2012-07-23 04:00:28 +0800
e8b96eb50 vfs: allow custom EOF in generic_file_llseek code

For ext3/4 htree directories, using the vfs llseek function with
SEEK_END goes to i_size like for any other file, but in reality
we want the maximum possible hash value. Recent changes
in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
but replicating this core code seems like a bad idea, especially
since the copy has already diverged from the vfs.

This patch updates generic_file_llseek_size to accept
both a custom maximum offset, and a custom EOF position. With this
in place, ext4_dir_llseek can pass in the appropriate maximum hash
position for both maxsize and eof, and get what it wants.
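
The idea of passing both limits can be sketched in user space as follows. `toy_llseek_size()` is a hypothetical function written for this example; it handles only SEEK_SET, SEEK_CUR and SEEK_END, does no locking, and does not guard against overflow of the offset arithmetic.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h> /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Compute a new file position.
 *   cur     - current position
 *   maxsize - largest offset the caller accepts
 *   eof     - position that SEEK_END is relative to (need not be the file size)
 * Returns the new position, or -EINVAL if it falls outside [0, maxsize].
 */
int64_t toy_llseek_size(int64_t cur, int64_t offset, int whence,
                        int64_t maxsize, int64_t eof)
{
    assert(maxsize >= 0 && eof >= 0);

    switch (whence) {
    case SEEK_SET:
        break;
    case SEEK_CUR:
        offset += cur;
        break;
    case SEEK_END:
        offset += eof; /* custom EOF, e.g. a maximum hash value */
        break;
    default:
        return -EINVAL;
    }

    if (offset < 0 || offset > maxsize)
        return -EINVAL;
    return offset;
}
```

A directory implementation in this model would pass the same maximum hash position for both maxsize and eof, while a regular file would pass its size as eof.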

As far as I know, this does not fix any bugs - nfs in the kernel
doesn't use SEEK_END, and I don't know of any user who does. But
some ext4 folks seem keen on doing the right thing here, and I can't
really argue.

(Patch also fixes up some comments slightly)

Signed-off-by: Eric Sandeen
Signed-off-by: Al Viro

Eric Sandeen
2012-07-23 04:00:15 +0800
a11778257 quota: Move quota syncing to ->sync_fs method

Since the moment writes to quota files are using block device page cache and
space for quota structures is reserved at the moment they are first accessed we
have no reason to sync quota before inode writeback. In fact this order is now
only harmful since quota information can easily change during inode writeback
(either because conversion of delayed-allocated extents or simply because of
allocation of new blocks for simple filesystems not using page_mkwrite).

So move syncing of quota information after writeback of inodes into ->sync_fs
method. This way we do not have to use ->quota_sync callback which is primarily
intended for use by quotactl syscall anyway and we get rid of calling
->sync_fs() twice unnecessarily. We skip quota syncing for OCFS2 since it does
proper quota journalling in all cases (unlike ext3, ext4, and reiserfs which
also support legacy non-journalled quotas) and thus there are no dirty quota
structures.

CC: "Theodore Ts'o"
CC: Joel Becker
CC: reiserfs-devel@vger.kernel.org
Acked-by: Steven Whitehouse
Acked-by: Dave Kleikamp
Reviewed-by: Christoph Hellwig
Signed-off-by: Jan Kara
Signed-off-by: Al Viro

Jan Kara
2012-07-23 03:58:34 +0800

18 Jul, 2012

1 commit

331ae4962 ext4: fix duplicated mnt_drop_write call in EXT4_IOC_MOVE_EXT

Caused, AFAICS, by mismerge in commit ff9cb1c4eead ("Merge branch
'for_linus' into for_linus_merged")

Signed-off-by: Al Viro
Cc: Theodore Ts'o
Cc: stable@vger.kernel.org # 3.3+
Signed-off-by: Linus Torvalds

Al Viro
2012-07-18 23:59:46 +0800

14 Jul, 2012

4 commits

ebfc3b49a don't pass nameidata to ->create()

A boolean "does it have to be exclusive?" flag is passed instead;
local filesystems should just ignore it - the object is guaranteed
not to be there yet.

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:47 +0800
00cd8dd3b stop passing nameidata to ->lookup()

Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:32 +0800
b3d9b7a3c vfs: switch i_dentry/d_alias to hlist

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:32:55 +0800
9f713878f ext4: get rid of open-coded d_find_any_alias()

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:32:54 +0800

10 Jul, 2012

4 commits

729f52c6b ext4: add a new nolock flag in ext4_map_blocks

EXT4_GET_BLOCKS_NO_LOCK flag is added to indicate that we don't need
to acquire i_data_sem lock in ext4_map_blocks. Meanwhile, it changes
ext4_get_block() to not start a new journal because when we do an
overwrite dio, there is no metadata that needs to be modified.

We define a new function called ext4_get_block_write_nolock, which is
used in dio overwrite nolock. In this function, it doesn't try to
acquire i_data_sem lock and doesn't start a new journal as it does a
lookup.

CC: Tao Ma
CC: Eric Sandeen
CC: Robin Dong
Signed-off-by: Zheng Liu
Signed-off-by: "Theodore Ts'o"

Zheng Liu
2012-07-10 04:29:29 +0800
fbe104942 ext4: split ext4_file_write into buffered IO and direct IO

ext4_file_dio_write is defined in order to split buffered IO and
direct IO in ext4. This patch just refactors some stuff in the write path.

CC: Tao Ma
CC: Eric Sandeen
CC: Robin Dong
Signed-off-by: Zheng Liu
Signed-off-by: "Theodore Ts'o"

Zheng Liu
2012-07-10 04:29:29 +0800
62a1391dd ext4: remove an unused statement in ext4_mb_get_buddy_page_lock()

In this patch, the statement "poff = block % blocks_per_page"
in ext4_mb_get_buddy_page_lock has no effect.

It will be optimized out by the compiler, but it's better to remove it.

Signed-off-by: Haibo Liu
Signed-off-by: "Theodore Ts'o"

Haibo Liu
2012-07-10 04:29:28 +0800
e7bcf8230 ext4: fix out-of-date comments in extents.c

In this patch, ext4_ext_try_to_merge has been changed to merge
an extent both left and right. So we need to update the comment
in here.

Signed-off-by: HaiboLiu
Signed-off-by: "Theodore Ts'o"

HaiboLiu
2012-07-10 04:29:28 +0800