Eric Lee / smarc-fsl-linux-kernel

22 Nov, 2011

2 commits

f8f5ed7c9 Merge branch 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix up a undefined error in ext4_free_blocks in debugging code
ext4: add blk_finish_plug in error case of writepages.
ext4: Remove kernel_lock annotations
ext4: ignore journalled data options on remount if fs has no journal

Linus Torvalds
2011-11-22 04:11:37 +0800
6e58ad69e ext4: fix up a undefined error in ext4_free_blocks in debugging code

sbi is not defined, so let ext4_free_blocks use EXT4_SB(sb) instead
when EXT4FS_DEBUG is defined.

Signed-off-by: Yongqiang Yang

Yongqiang Yang
2011-11-22 01:09:19 +0800

08 Nov, 2011

1 commit

3c1fcb2c2 ext4: add blk_finish_plug in error case of writepages.

blk_finish_plug is needed in the error case of writepages.

Signed-off-by: Namjae Jeon
Signed-off-by: "Theodore Ts'o"

Namjae Jeon
2011-11-08 00:01:13 +0800

07 Nov, 2011

3 commits

2397256d6 ext4: Remove kernel_lock annotations

The BKL is gone, so these annotations are useless.

Signed-off-by: Richard Weinberger
Signed-off-by: "Theodore Ts'o"

Richard Weinberger
2011-11-07 23:50:09 +0800
eb513689c ext4: ignore journalled data options on remount if fs has no journal

This avoids a confusing failure in the init scripts when the
/etc/fstab has data=writeback or data=journal but the file system does
not have a journal. So check for this case explicitly, and warn the
user that we are ignoring the (pointless, since they have no journal)
data=* mount option.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-11-07 23:47:42 +0800
208bca086 Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux

* 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: Add a 'reason' to wb_writeback_work
writeback: send work item to queue_io, move_expired_inodes
writeback: trace event balance_dirty_pages
writeback: trace event bdi_dirty_ratelimit
writeback: fix ppc compile warnings on do_div(long long, unsigned long)
writeback: per-bdi background threshold
writeback: dirty position control - bdi reserve area
writeback: control dirty pause time
writeback: limit max dirty pause time
writeback: IO-less balance_dirty_pages()
writeback: per task dirty rate limit
writeback: stabilize bdi->dirty_ratelimit
writeback: dirty rate control
writeback: add bg_threshold parameter to __bdi_update_bandwidth()
writeback: dirty position control
writeback: account per-bdi accumulated dirtied pages

Linus Torvalds
2011-11-07 11:02:23 +0800

03 Nov, 2011

2 commits

d21185883 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue:
vfs: add d_prune dentry operation
vfs: protect i_nlink
filesystems: add set_nlink()
filesystems: add missing nlink wrappers
logfs: remove unnecessary nlink setting
ocfs2: remove unnecessary nlink setting
jfs: remove unnecessary nlink setting
hypfs: remove unnecessary nlink setting
vfs: ignore error on forced remount
readlinkat: ensure we return ENOENT for the empty pathname for normal lookups
vfs: fix dentry leak in simple_fill_super()

Linus Torvalds
2011-11-03 02:41:01 +0800
f1f8935a5 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits)
jbd2: Unify log messages in jbd2 code
jbd/jbd2: validate sb->s_first in journal_get_superblock()
ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined
ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled
ext4: fix a typo in struct ext4_allocation_context
ext4: Don't normalize an falloc request if it can fit in 1 extent.
ext4: remove comments about extent mount option in ext4_new_inode()
ext4: let ext4_discard_partial_buffers handle unaligned range correctly
ext4: return ENOMEM if find_or_create_pages fails
ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock()
ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten
ext4: optimize locking for end_io extent conversion
ext4: remove unnecessary call to waitqueue_active()
ext4: Use correct locking for ext4_end_io_nolock()
ext4: fix race in xattr block allocation path
ext4: trace punch_hole correctly in ext4_ext_map_blocks
ext4: clean up AGGRESSIVE_TEST code
ext4: move variables to their scope
ext4: fix quota accounting during migration
ext4: migrate cleanup
...

Linus Torvalds
2011-11-03 01:06:20 +0800

02 Nov, 2011

4 commits

bfe868486 filesystems: add set_nlink()

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig

Miklos Szeredi
2011-11-02 19:53:43 +0800
6d6b77f16 filesystems: add missing nlink wrappers

Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-11-02 19:53:43 +0800
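
The two nlink commits above route every link-count change through an updater function instead of writing i_nlink directly. A minimal user-space sketch of that pattern follows. struct toy_inode and the toy_* helpers are hypothetical stand-ins that only mirror the wrapper names listed in the commit messages; they are not the kernel's definitions.

```c
/* Simplified stand-in for the kernel's struct inode: only the link count. */
struct toy_inode {
	unsigned int i_nlink;
};

/* Set the link count to an explicit value (models set_nlink()). */
static inline void toy_set_nlink(struct toy_inode *inode, unsigned int nlink)
{
	inode->i_nlink = nlink;
}

/* Add one link (models inc_nlink()). */
static inline void toy_inc_nlink(struct toy_inode *inode)
{
	inode->i_nlink++;
}

/* Remove one link; this toy version simply refuses to underflow
 * (models drop_nlink()). */
static inline void toy_drop_nlink(struct toy_inode *inode)
{
	if (inode->i_nlink > 0)
		inode->i_nlink--;
}

/* Drop all links at once (models clear_nlink()). */
static inline void toy_clear_nlink(struct toy_inode *inode)
{
	inode->i_nlink = 0;
}
```

Once callers stop touching the field themselves, any later change to how the count is stored or checked only needs to be made in the helpers.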
bf52c6f7a ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined

The variable 'block' is removed by commit 750c9c47, so use the
replacement ex_ee_block instead.

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-11-02 06:59:26 +0800
32de67569 ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled

This patch fixes a syntax error caused by a missing comma. In addition,
the logical block number is an unsigned 32-bit value, so printk should
use %u instead of %d.

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-11-02 06:56:41 +0800

01 Nov, 2011

9 commits

b9075fa96 treewide: use __printf not __attribute__((format(printf,...)))

Standardize the style for compiler-based printf format verification.
Standardize the location of __printf too.

Done via script and a little typing.

$ grep -rPl --include=*.[ch] -w "__attribute__" * | \
  grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \
  xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }'

[akpm@linux-foundation.org: revert arch bits]
Signed-off-by: Joe Perches
Cc: "Kirill A. Shutemov"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2011-11-01 08:30:54 +0800
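
To illustrate what the treewide conversion above standardizes on, here is a small user-space sketch. The macro is defined locally in the same shape the commit title implies, so that the example builds outside the kernel with GCC or Clang; log_msg() is a hypothetical helper, not a kernel function.

```c
#include <stdarg.h>
#include <stdio.h>

/* Local definition for illustration: argument a is the index of the format
 * string, argument b the index of the first variadic argument. */
#define __printf(a, b) __attribute__((format(printf, a, b)))

/* Hypothetical helper. The annotation lets the compiler check the format
 * string (parameter 3) against the variadic arguments (from parameter 4). */
__printf(3, 4)
int log_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the annotation in place, a call such as log_msg(buf, sizeof(buf), "%u", "text") should draw a format warning at compile time when format checking (for example -Wall) is enabled.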
966dbde2c ext4: warn if direct reclaim tries to writeback pages

Direct reclaim should never writeback pages. Warn if an attempt is made.

Signed-off-by: Mel Gorman
Cc: Dave Chinner
Cc: Christoph Hellwig
Cc: Johannes Weiner
Cc: Wu Fengguang
Cc: Jan Kara
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Alex Elder
Cc: Theodore Ts'o
Cc: Chris Mason
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2011-11-01 08:30:46 +0800
ff3fc1736 ext4: fix a typo in struct ext4_allocation_context

This patch changes "bext" to "best".

Signed-off-by: Robin Dong
Signed-off-by: "Theodore Ts'o"

Robin Dong
2011-11-01 06:55:50 +0800
3c6fe7701 ext4: Don't normalize an falloc request if it can fit in 1 extent.

If an fallocate request fits in EXT_UNINIT_MAX_LEN, then set the
EXT4_GET_BLOCKS_NO_NORMALIZE flag. For larger fallocate requests,
let mballoc.c normalize the request.

This fixes a problem where large requests were being split into
non-contiguous extents due to commit 556b27abf73: ext4: do not
normalize block requests from fallocate.

Testing:
*) Checked that 8.x MB falloc'ed files are still laid down next to
each other (contiguously).
*) Checked that the maximum size extent (127.9MB) is allocated as 1
extent.
*) Checked that a 1GB file is somewhat contiguous (often 5-6
non-contiguous extents now).
*) Checked that a 120MB file can still be falloc'ed even if there are
no single extents large enough to hold it.

Signed-off-by: Greg Harm
Signed-off-by: "Theodore Ts'o"

Greg Harm
2011-11-01 06:41:47 +0800
4af835089 ext4: remove comments about extent mount option in ext4_new_inode()

Remove comments about the 'extent' mount option in ext4_new_inode(),
since it no longer exists.

Signed-off-by: Eryu Guan
Signed-off-by: "Theodore Ts'o"

Eryu Guan
2011-11-01 06:21:29 +0800
edb5ac899 ext4: let ext4_discard_partial_buffers handle unaligned range correctly

As the comment says, we should handle an unaligned range rather than an
aligned one. This fixes a bug found by running xfstests #91.

Signed-off-by: Yongqiang Yang

Yongqiang Yang
2011-11-01 06:04:38 +0800
5129d05fd ext4: return ENOMEM if find_or_create_pages fails

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-11-01 05:56:10 +0800
e260daf27 ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock()

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-11-01 05:54:36 +0800
0edeb71dc ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten

Setting the EXT4_IO_END_UNWRITTEN flag and incrementing
i_aiodio_unwritten should be done together, since ext4_end_io_nolock
always clears the flag and decrements the counter at the same time.

We have found bugs where the flag is set while i_aiodio_unwritten is
left unchanged (commit 32c80b32c053d). So this patch creates a helper
function that wraps both steps to avoid any future bug.
The idea is inspired by Eric.

Cc: Eric Sandeen
Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"

Tao Ma
2011-11-01 05:30:44 +0800
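
The helper described in the commit above keeps a flag and a counter in step by only ever changing them together. A minimal user-space sketch of that idea follows, using C11 atomics in place of the kernel's atomic_t; every toy_* name is hypothetical and the real helper may differ in detail.

```c
#include <stdatomic.h>

#define TOY_IO_END_UNWRITTEN 0x0001u

/* Stand-in for the per-I/O completion structure. */
struct toy_io_end {
	unsigned int flag;
};

/* Stand-in for the per-inode info that holds the counter. */
struct toy_inode_info {
	atomic_int aiodio_unwritten;
};

/* Set the flag and bump the counter as one step. If the flag is already
 * set, do nothing, so the counter is incremented at most once per io_end. */
static inline void toy_set_io_unwritten_flag(struct toy_inode_info *ei,
					     struct toy_io_end *io)
{
	if (!(io->flag & TOY_IO_END_UNWRITTEN)) {
		io->flag |= TOY_IO_END_UNWRITTEN;
		atomic_fetch_add(&ei->aiodio_unwritten, 1);
	}
}
```

Calling the helper twice on the same toy_io_end leaves the counter at one, which is the invariant the completion path relies on when it clears the flag and decrements once.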

31 Oct, 2011

4 commits

b82e384c7 ext4: optimize locking for end_io extent conversion

Now that we are doing the locking correctly, we need to grab the
i_completed_io_lock() twice per end_io. We can clean this up by
removing the structure from the i_completed_io_list, and using this as
the locking mechanism to prevent ext4_flush_completed_IO() racing
against ext4_end_io_work(), instead of clearing the
EXT4_IO_END_UNWRITTEN in io->flag.

In addition, if the ext4_convert_unwritten_extents() returns an error,
we no longer keep the end_io structure on the linked list. Keeping it
there doesn't help, because it tends to lock up the file system and
wedges the system. That's one way to call attention to the problem, but
it doesn't help the overall robustness of the system.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-10-31 22:56:32 +0800
4e2980212 ext4: remove unnecessary call to waitqueue_active()

The usage of waitqueue_active() is not necessary, and introduces (I
believe) a hard-to-hit race.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2011-10-31 06:41:19 +0800
d73d5046a ext4: Use correct locking for ext4_end_io_nolock()

We must hold i_completed_io_lock when manipulating anything on the
i_completed_io_list linked list. This includes io->lock, which we
were checking in ext4_end_io_nolock().

So move this check to ext4_end_io_work(). This also has the bonus of
avoiding extra work if it is already done without needing to take the
mutex.

Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"

Tao Ma
2011-10-31 06:26:08 +0800
0e175a183 writeback: Add a 'reason' to wb_writeback_work

This creates a new 'reason' field in a wb_writeback_work
structure, which unambiguously identifies who initiates
writeback activity. A 'wb_reason' enumeration has been
added to writeback.h, to enumerate the possible reasons.

The 'writeback_work_class' and tracepoint event class and
'writeback_queue_io' tracepoints are updated to include the
symbolic 'reason' in all trace events.

And the 'writeback_inodes_sbXXX' family of routines has had
a wb_stats parameter added to them, so callers can specify
why writeback is being started.

Acked-by: Jan Kara
Signed-off-by: Curt Wohlgemuth
Signed-off-by: Wu Fengguang

Curt Wohlgemuth
2011-10-31 00:33:36 +0800
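
The 'reason' commit above tags each writeback work item with an enumerated cause that trace events can print symbolically. A rough user-space sketch of that shape is shown below. The toy_* names and the three example reasons are hypothetical; the actual wb_reason values and tracepoint plumbing are not reproduced here.

```c
/* Hypothetical subset of reasons; the kernel's enumeration is larger. */
enum toy_wb_reason {
	TOY_WB_REASON_BACKGROUND,
	TOY_WB_REASON_SYNC,
	TOY_WB_REASON_PERIODIC,
	TOY_WB_REASON_MAX
};

/* Stand-in for a writeback work item carrying the new field. */
struct toy_writeback_work {
	long nr_pages;
	enum toy_wb_reason reason;
};

/* Map a reason to the symbolic name a trace event would print. */
static inline const char *toy_wb_reason_name(enum toy_wb_reason reason)
{
	static const char *const names[] = {
		"background",
		"sync",
		"periodic",
	};

	if ((unsigned int)reason < (unsigned int)TOY_WB_REASON_MAX)
		return names[reason];
	return "unknown";
}
```

A caller would fill in the reason when queuing the work, for example { .nr_pages = 1024, .reason = TOY_WB_REASON_SYNC }, so that whoever reads the trace can tell who started the writeback.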

29 Oct, 2011

7 commits

6d6a43519 ext4: fix race in xattr block allocation path

Ceph users reported that when using Ceph on ext4, the filesystem
would often become corrupted, containing inodes with incorrect
i_blocks counters.

I managed to reproduce this with a very hacked-up "streamtest"
binary from the Ceph tree.

Ceph is doing a lot of xattr writes, to out-of-inode blocks.
There is also another thread which does sync_file_range and close,
of the same files. The problem appears to happen due to this race:

sync/flush thread               xattr-set thread
-----------------               ----------------

do_writepages                   ext4_xattr_set
ext4_da_writepages              ext4_xattr_set_handle
mpage_da_map_blocks             ext4_xattr_block_set
    set DELALLOC_RESERVE
                                ext4_new_meta_blocks
                                    ext4_mb_new_blocks
                                        if (!i_delalloc_reserved_flag)
                                            vfs_dq_alloc_block
ext4_get_blocks
    down_write(i_data_sem)
    set i_delalloc_reserved_flag
    ...
    up_write(i_data_sem)
                                    if (i_delalloc_reserved_flag)
                                        vfs_dq_alloc_block_nofail

In other words, the sync/flush thread pops in and sets
i_delalloc_reserved_flag on the inode, which makes the xattr thread
think that it's in a delalloc path in ext4_new_meta_blocks(),
and adds the block for a second time, after already having added
it once in the !i_delalloc_reserved_flag case in ext4_mb_new_blocks.

The real problem is that we shouldn't be using the DELALLOC_RESERVED
state flag; we should instead be passing
EXT4_GET_BLOCKS_DELALLOC_RESERVE down to ext4_map_blocks(). We'll fix
this for now by using i_data_sem to prevent this race, but this is
really not the right way to fix things.

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"
Cc: stable@kernel.org

Eric Sandeen
2011-10-29 22:15:35 +0800
e7b319e39 ext4: trace punch_hole correctly in ext4_ext_map_blocks

When ext4_ext_map_blocks() is called by punch_hole, the trace should
record the blocks punched out.

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-10-29 21:39:51 +0800
02dc62fba ext4: clean up AGGRESSIVE_TEST code

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-10-29 21:29:11 +0800
81fdbb4a8 ext4: move variables to their scope

Signed-off-by: Yongqiang Yang
Signed-off-by: "Theodore Ts'o"

Yongqiang Yang
2011-10-29 21:23:38 +0800
5cb81dabc ext4: fix quota accounting during migration

The tmp_inode should have the same uid/gid as the original inode.
Otherwise new metadata blocks will be accounted to the wrong quota-id,
which will result in a quota leak after the inode migration is
completed.

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2011-10-29 21:05:00 +0800
fba90ffee ext4: migrate cleanup

This patch cleans up the code a bit; the actual logic is not changed.
- Move the current block pointer to migrate_structure, so that all
walk info is in one structure.
- Get rid of useless null ind-block ptr checks; the caller already
does that check.

Signed-off-by: Dmitry Monakhov
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2011-10-29 21:03:00 +0800
f362f98e7 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
leases: fix write-open/read-lease race
nfs: drop unnecessary locking in llseek
ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
vfs: add generic_file_llseek_size
vfs: do (nearly) lockless generic_file_llseek
direct-io: merge direct_io_walker into __blockdev_direct_IO
direct-io: inline the complete submission path
direct-io: separate map_bh from dio
direct-io: use a slab cache for struct dio
direct-io: rearrange fields in dio/dio_submit to avoid holes
direct-io: fix a wrong comment
direct-io: separate fields only used in the submission path from struct dio
vfs: fix spinning prevention in prune_icache_sb
vfs: add a comment to inode_permission()
vfs: pass all mask flags check_acl and posix_acl_permission
vfs: add hex format for MAY_* flag values
vfs: indicate that the permission functions take all the MAY_* flags
compat: sync compat_stats with statfs.
vfs: add "device" tag to /proc/self/mountstats
cleanup: vfs: small comment fix for block_invalidatepage
...

Fix up trivial conflict in fs/gfs2/file.c (llseek changes)

Linus Torvalds
2011-10-29 01:49:34 +0800

28 Oct, 2011

1 commit

4cce0e28b ext4: replace cut'n'pasted llseek code with generic_file_llseek_size

This gives ext4 the benefits of unlocked llseek.

Cc: tytso@mit.edu
Signed-off-by: Andi Kleen
Signed-off-by: Christoph Hellwig

Andi Kleen
2011-10-28 20:58:59 +0800

27 Oct, 2011

2 commits

80e675f90 ext4: optimize memmmove lengths in extent/index insertions

ext4_ext_insert_extent() (respectively ext4_ext_insert_index())
was using EXT_MAX_EXTENT() (resp. EXT_MAX_INDEX()) to determine
how many entries needed to be moved beyond the insertion point.
In practice this means that (320 - I) * 24 bytes were memmove()'d
when I is the insertion point, rather than (#entries - I) * 24 bytes.

This patch uses EXT_LAST_EXTENT() (resp. EXT_LAST_INDEX()) instead
to only move existing entries. The code flow is also simplified
slightly to highlight similarities and reduce code duplication in
the insertion logic.

This patch reduces system CPU consumption by over 25% on a 4kB
synchronous append DIO write workload when used with the
pre-2.6.39 x86_64 memmove() implementation. With the much faster
2.6.39 memmove() implementation we still see a decrease in
system CPU usage between 2% and 7%.

Note that the ext_debug() output changes with this patch, splitting
some log information between entries. Users of the ext_debug() output
should note that the "move %d" units changed from reporting the number
of bytes moved to reporting the number of entries moved.

Signed-off-by: Eric Gouriou
Signed-off-by: "Theodore Ts'o"

Eric Gouriou
2011-10-27 23:52:18 +0800
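
The memmove change described above comes down to shifting only the entries that actually exist past the insertion point, not every slot up to the array's capacity. The sketch below shows that idea on a plain array in user space. struct toy_extent and toy_insert_entry() are hypothetical and much simpler than the ext4 extent code.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an extent entry. */
struct toy_extent {
	unsigned int block;
	unsigned int len;
};

/*
 * Insert ex at index pos in an array that currently holds count entries.
 * The caller must ensure pos <= count and that the array has room for one
 * more entry. Only the (count - pos) existing entries after the insertion
 * point are moved. Returns the number of entries moved.
 */
static inline size_t toy_insert_entry(struct toy_extent *arr, size_t count,
				      size_t pos, struct toy_extent ex)
{
	size_t to_move = count - pos;

	if (to_move > 0)
		memmove(&arr[pos + 1], &arr[pos], to_move * sizeof(*arr));
	arr[pos] = ex;
	return to_move;
}
```

The return value counts entries, not bytes, which matches the unit the commit message says the "move %d" debug output now reports.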
6f91bc5fd ext4: optimize ext4_ext_convert_to_initialized()

This patch introduces a fast path in ext4_ext_convert_to_initialized()
for the case when the conversion can be performed by transferring
the newly initialized blocks from the uninitialized extent into
an adjacent initialized extent. Doing so removes the expensive
invocations of memmove() which occur during extent insertion and
the subsequent merge.

In practice this should be the common case for clients performing
append writes into files pre-allocated via
fallocate(FALLOC_FL_KEEP_SIZE). In such a workload performed via
direct IO and when using a suboptimal implementation of memmove()
(x86_64 prior to the 2.6.39 rewrite), this patch reduces kernel CPU
consumption by 32%.

Two new trace points are added to ext4_ext_convert_to_initialized()
to offer visibility into its operations. No exit trace point has
been added due to the multiplicity of return points. This can be
revisited once the upstream cleanup is backported.

Signed-off-by: Eric Gouriou
Signed-off-by: "Theodore Ts'o"

Eric Gouriou
2011-10-27 23:43:23 +0800

26 Oct, 2011

5 commits

b3ff05690 ext4: don't check io->flag when setting EXT4_STATE_DIO_UNWRITTEN inode state

When we want to convert the uninitialized extent in a direct write, we
can either do it in ext4_end_io_nolock (AIO case) or in
ext4_ext_direct_IO (non-AIO case), and EXT4_I(inode)->cur_aio_dio is a
guard for ext4_ext_map_blocks to find the right case. In e9e3bcecf,
we mistakenly changed it with:

-	if (io)
+	if (io && !(io->flag & EXT4_IO_END_UNWRITTEN)) {
		io->flag = EXT4_IO_END_UNWRITTEN;
-	else
+		atomic_inc(&EXT4_I(inode)->i_aiodio_unwritten);
+	} else
		ext4_set_inode_state(inode,
				     EXT4_STATE_DIO_UNWRITTEN);

So now if we map 2 blocks, and the first one sets
EXT4_IO_END_UNWRITTEN, the 2nd mapping will set the inode state because
of the check for the flag. This is wrong.

Cc: Eric Sandeen
Signed-off-by: Tao Ma
Signed-off-by: "Theodore Ts'o"

Tao Ma
2011-10-26 23:08:39 +0800
0a10da73e ext4: fix a wrong comment in __mb_check_buddy()

The comment says the bit should be 0, but the code that follows asserts
that the bit is 1. This is confusing, so fix it.

Signed-off-by: Robin Dong
Signed-off-by: "Theodore Ts'o"

Robin Dong
2011-10-26 20:48:54 +0800
b051d8dc4 ext4: remove unused variable in mb_find_extent()

The variable 'ord' in function mb_find_extent() is redundant, so
remove it.

Signed-off-by: Robin Dong
Signed-off-by: "Theodore Ts'o"

Robin Dong
2011-10-26 17:30:30 +0800
66a83cde4 ext4: remove unused variable in ext4_mb_generate_from_pa()

The variable 'count' in function ext4_mb_generate_from_pa() looks
useless, so remove it.

Signed-off-by: Robin Dong
Signed-off-by: "Theodore Ts'o"

Robin Dong
2011-10-26 17:29:21 +0800
ebbe02779 ext4: use stream-alloc when mb_group_prealloc set to zero

The kernel will crash on

ext4_mb_mark_diskspace_used:
BUG_ON(ac->ac_b_ex.fe_len
Signed-off-by: "Theodore Ts'o"

Robin Dong
2011-10-26 17:14:27 +0800