Doug / smarc-fsl-linux-kernel | Embedian Git Server

25 Jan, 2010

1 commit

5f634d064 ext4: Fix quota accounting error with fallocate

When we fallocate a region of the file which we had recently written,
and which is still in the page cache marked as delayed allocated blocks,
we need to make sure we don't do the quota update on the writepage path.
This is because the needed quota update would have already been done
by fallocate.

Signed-off-by: Aneesh Kumar K.V

Aneesh Kumar K.V
2010-01-25 17:00:31 +0800

23 Jan, 2010

1 commit

1db913823 ext4: Handle -EDQUOT error on write

We need to release the journal before we do a write_inode. Otherwise
we could deadlock.

Signed-off-by: Aneesh Kumar K.V

Aneesh Kumar K.V
2010-01-23 06:06:20 +0800

15 Jan, 2010

1 commit

1296cc85c ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag

We should update the reserved space if it is a delalloc buffer,
and that is indicated by the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag.
So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of
EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE.

Signed-off-by: Aneesh Kumar K.V

Aneesh Kumar K.V
2010-01-15 14:27:59 +0800

01 Jan, 2010

2 commits

9d0be5023 ext4: Calculate metadata requirements more accurately

In the past, ext4_calc_metadata_amount(), and its sub-functions
ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
badly over-estimated the number of metadata blocks that might be
required for delayed allocation blocks. This didn't matter as much
when functions which managed the reserved metadata blocks were more
aggressive about dropping reserved metadata blocks as delayed
allocation blocks were written, but unfortunately they were too
aggressive. This was fixed in commit 0637c6f, but as a result the
over-estimation by ext4_calc_metadata_amount() would lead to reserving
2-3 times the number of pending delayed allocation blocks as
potentially required metadata blocks. So if there is 1 megabyte of
blocks which have not yet been allocated, up to 3 megabytes of
space would get reserved out of the user's quota and from the file
system free space pool until all of the inode's data blocks have been
allocated.

This commit addresses this problem by much more accurately estimating
the number of metadata blocks that will be required. It will still
somewhat over-estimate the number of blocks needed, since it must make
a worst case estimate not knowing which physical blocks will be
needed, but it is much more accurate than before.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-01-01 15:41:30 +0800
ee5f4d9cd ext4: Fix accounting of reserved metadata blocks

Commit 0637c6f had a typo which caused the reserved metadata blocks to
not be released correctly. Fix this.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2010-01-01 15:36:15 +0800

31 Dec, 2009

1 commit

0637c6f41 ext4: Patch up how we claim metadata blocks for quota purposes

As reported in Kernel Bugzilla #14936, commit d21cd8f triggered a BUG
in the function ext4_da_update_reserve_space() found in
fs/ext4/inode.c. The root cause of this BUG() was the fact
that ext4_calc_metadata_amount() can severely over-estimate how many
metadata blocks will be needed, especially when using direct
block-mapped files.

In addition, it can also badly *under* estimate how much space is
needed, since ext4_calc_metadata_amount() assumes that the blocks are
contiguous, and this is not always true. If the application is
writing blocks to a sparse file, the number of metadata blocks
necessary can be severely underestimated by the functions
ext4_da_reserve_space(), ext4_da_update_reserve_space() and
ext4_da_release_space(). This was the cause of the dq_claim_space
reports found on kerneloops.org.

Unfortunately, doing this right means that we need to massively
over-estimate the amount of free space needed. So in some cases we
may need to force the inode to be written to disk asynchronously in
order to avoid spurious quota failures.

http://bugzilla.kernel.org/show_bug.cgi?id=14936

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-31 03:20:45 +0800

30 Dec, 2009

1 commit

515f41c33 ext4: Ensure zeroout blocks have no dirty metadata

This fixes a bug (found by Curt Wohlgemuth) in which new blocks
returned from an extent created with ext4_ext_zeroout() can have dirty
metadata still associated with them.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Curt Wohlgemuth
Signed-off-by: "Theodore Ts'o"

Aneesh Kumar K.V
2009-12-30 12:39:06 +0800

26 Dec, 2009

1 commit

2faf2e19d ext4: return correct wbc.nr_to_write in ext4_da_writepages

When ext4_da_writepages increases the nr_to_write in writeback_control
then it must always re-base the return value. Originally there was a
(misguided) attempt to prevent wbc.nr_to_write from going negative. In
fact, it's necessary to allow nr_to_write to be negative so that
wb_writeback() can correctly calculate how many pages were actually
written.
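
The re-basing described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the kernel code:
struct wbc_model, writepages_model(), pages_written() and the bump parameter
are hypothetical names made up for illustration. Only the idea (raise
nr_to_write temporarily, subtract the same amount again on return, and let
the result go negative) is taken from the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for the one writeback_control field discussed above. */
struct wbc_model {
    long nr_to_write;   /* pages the caller still wants written; may go negative */
};

/*
 * Model of a writepages routine that temporarily raises the caller's budget
 * by `bump` pages, writes `written` pages, and then re-bases the budget so
 * the caller sees: original budget minus pages actually written.
 */
static void writepages_model(struct wbc_model *wbc, long bump, long written)
{
    assert(bump >= 0 && written >= 0);

    wbc->nr_to_write += bump;      /* work with a larger internal budget */
    wbc->nr_to_write -= written;   /* account for the pages written */
    wbc->nr_to_write -= bump;      /* always re-base; do not clamp at zero */
}

/* What a caller in the style of wb_writeback() would compute afterwards. */
static long pages_written(long budget_before, const struct wbc_model *wbc)
{
    return budget_before - wbc->nr_to_write;
}
```

If the final value were clamped at zero instead, the caller in this model
would conclude that exactly budget_before pages were written, no matter how
many actually were.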

Signed-off-by: Richard Kennedy
Signed-off-by: "Theodore Ts'o"

Richard Kennedy
2009-12-26 04:46:07 +0800

23 Dec, 2009

7 commits

c8afb4468 ext4: flush delalloc blocks when space is low

Creating many small files in rapid succession on a small
filesystem can lead to spurious ENOSPC; on a 104MB filesystem:

for i in `seq 1 22500`; do
    echo -n > $SCRATCH_MNT/$i
    echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i
done

leads to ENOSPC even though after a sync, 40% of the fs is free
again.

This is because we reserve worst-case metadata for delalloc writes,
and when data is allocated that worst-case reservation is not
usually needed.

When freespace is low, kicking off an async writeback will start
converting that worst-case space usage into something more realistic,
almost always freeing up space to continue.

This resolves the testcase for me, and survives all 4 generic
ENOSPC tests in xfstests.

We'll still need a hard synchronous sync to squeeze out the last bit,
but this fixes things up to a large degree.

Signed-off-by: Eric Sandeen
Signed-off-by: "Theodore Ts'o"

Eric Sandeen
2009-12-23 20:58:12 +0800
d3533d72e ext4: Eliminate potential double free on error path

b_entry_name and buffer are initially NULL, are initialized within a loop
to the result of calling kmalloc, and are freed at the bottom of this loop.
The loop contains gotos to cleanup, which also frees b_entry_name and
buffer. Some of these gotos are before the reinitializations of
b_entry_name and buffer. To maintain the invariant that b_entry_name and
buffer are NULL at the top of the loop, and thus acceptable arguments to
kfree, these variables are now set to NULL after the kfrees.

This seems to be the simplest solution. A more complicated solution
would be to introduce more labels in the error handling code at the end of
the function.
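
The invariant described above can be shown with a user-space sketch. It
assumes malloc()/free() as stand-ins for kmalloc()/kfree(); the function
process_entries() and its fail_at parameter are hypothetical and exist only
to trigger the early goto. It is not the patched ext4 function.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * User-space model of the pattern: `buf` is allocated inside a loop, freed at
 * the bottom of the loop, and freed again at the `cleanup` label. Resetting
 * it to NULL after each free keeps the cleanup free() harmless, because
 * free(NULL) is a no-op (as is kfree(NULL) in the kernel).
 *
 * fail_at is a hypothetical knob: the iteration at which an error occurs
 * before the buffer has been re-allocated. Returns 0 on success, -1 on error.
 */
static int process_entries(int count, int fail_at)
{
    char *buf = NULL;
    int ret = 0;

    assert(count >= 0);

    for (int i = 0; i < count; i++) {
        if (i == fail_at) {        /* error before re-initialising buf */
            ret = -1;
            goto cleanup;
        }

        buf = malloc(64);
        if (!buf) {
            ret = -1;
            goto cleanup;
        }
        buf[0] = (char)i;          /* pretend to use the buffer */

        free(buf);
        buf = NULL;                /* restore the invariant for the next pass */
    }

cleanup:
    free(buf);                     /* safe: buf is NULL or a live allocation */
    return ret;
}
```

Without the `buf = NULL;` line, a failure on the second iteration would reach
the cleanup label with buf still pointing at memory freed on the first pass.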

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

//
@r@
identifier E;
expression E1;
iterator I;
statement S;
@@

*kfree(E);
... when != E = E1
when != I(E,...) S
when != &E
*kfree(E);
//

Signed-off-by: Julia Lawall
Signed-off-by: "Theodore Ts'o"

Julia Lawall
2009-12-23 20:52:31 +0800
a6b43e382 ext4: fix unsigned long long printk warning in super.c

sparc64 allmodconfig:

fs/ext4/super.c: In function `lifetime_write_kbytes_show':
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)

Signed-off-by: Andrew Morton
Signed-off-by: "Theodore Ts'o"

Andrew Morton
2009-12-23 20:48:08 +0800
cc3e1bea5 ext4, jbd2: Add barriers for file systems with external journals

This is a bit complicated because we are trying to optimize when we
send barriers to the fs data disk. We could just throw in an extra
barrier to the data disk whenever we send a barrier to the journal
disk, but that's not always strictly necessary.

We only need to send a barrier during a commit when there are data
blocks which must be written out due to an inode written in
ordered mode, or if fsync() depends on the commit to force data blocks
to disk. Finally, before we drop transactions from the beginning of
the journal during a checkpoint operation, we need to guarantee that
any blocks that were flushed out to the data disk are firmly on the
rust platter before we drop the transaction from the journal.

Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-23 19:52:08 +0800
39bc680a8 ext4: fix sleep inside spinlock issue with quota and dealloc (#14739)

Unlock i_block_reservation_lock before vfs_dq_reserve_block().
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14739

CC: Theodore Ts'o
Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara

Dmitry Monakhov
2009-12-23 20:44:12 +0800
d21cd8f16 ext4: Fix potential quota deadlock

We have to delay vfs_dq_claim_space() until allocation context destruction.
Currently we have the following call trace:
ext4_mb_new_blocks()
  /* task is already holding ac->alloc_semp */
  ->ext4_mb_mark_diskspace_used
    ->vfs_dq_claim_space() /* acquire dqptr_sem here. Possible deadlock */
  ->ext4_mb_release_context() /* drop ac->alloc_semp here */

Let's move quota claiming to ext4_da_update_reserve_space()

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-rc7 #18
-------------------------------------------------------
write-truncate-/3465 is trying to acquire lock:
(&s->s_dquot.dqptr_sem){++++..}, at: [] dquot_claim_space+0x3b/0x1b0

but task is already holding lock:
(&meta_group_info[i]->alloc_sem){++++..}, at: [] ext4_mb_load_buddy+0xb2/0x370

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&meta_group_info[i]->alloc_sem){++++..}:
[] __lock_acquire+0xd7b/0x1260
[] lock_acquire+0xba/0xd0
[] down_read+0x51/0x90
[] ext4_mb_load_buddy+0xb2/0x370
[] ext4_mb_free_blocks+0x46c/0x870
[] ext4_free_blocks+0x73/0x130
[] ext4_ext_truncate+0x76c/0x8d0
[] ext4_truncate+0x187/0x5e0
[] vmtruncate+0x6b/0x70
[] inode_setattr+0x62/0x190
[] ext4_setattr+0x25a/0x370
[] notify_change+0x151/0x340
[] do_truncate+0x6d/0xa0
[] may_open+0x1d4/0x200
[] do_filp_open+0x1eb/0x910
[] do_sys_open+0x6d/0x140
[] sys_open+0x2e/0x40
[] sysenter_do_call+0x12/0x32

-> #2 (&ei->i_data_sem){++++..}:
[] __lock_acquire+0xd7b/0x1260
[] lock_acquire+0xba/0xd0
[] down_read+0x51/0x90
[] ext4_get_blocks+0x47/0x450
[] ext4_getblk+0x61/0x1d0
[] ext4_bread+0x1f/0xa0
[] ext4_quota_write+0x12c/0x310
[] qtree_write_dquot+0x93/0x120
[] v2_write_dquot+0x28/0x30
[] dquot_commit+0xab/0xf0
[] ext4_write_dquot+0x77/0x90
[] ext4_mark_dquot_dirty+0x2f/0x50
[] dquot_alloc_inode+0x101/0x180
[] ext4_new_inode+0x602/0xf00
[] ext4_create+0x89/0x150
[] vfs_create+0xa2/0xc0
[] do_filp_open+0x7a7/0x910
[] do_sys_open+0x6d/0x140
[] sys_open+0x2e/0x40
[] sysenter_do_call+0x12/0x32

-> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}:
[] __lock_acquire+0xd7b/0x1260
[] lock_acquire+0xba/0xd0
[] mutex_lock_nested+0x65/0x2d0
[] vfs_load_quota_inode+0x4bd/0x5a0
[] vfs_quota_on_path+0x5f/0x70
[] ext4_quota_on+0x112/0x190
[] sys_quotactl+0x44a/0x8a0
[] sysenter_do_call+0x12/0x32

-> #0 (&s->s_dquot.dqptr_sem){++++..}:
[] __lock_acquire+0x1091/0x1260
[] lock_acquire+0xba/0xd0
[] down_read+0x51/0x90
[] dquot_claim_space+0x3b/0x1b0
[] ext4_mb_mark_diskspace_used+0x36f/0x380
[] ext4_mb_new_blocks+0x34a/0x530
[] ext4_ext_get_blocks+0x122b/0x13c0
[] ext4_get_blocks+0x226/0x450
[] mpage_da_map_blocks+0xc3/0xaa0
[] ext4_da_writepages+0x506/0x790
[] do_writepages+0x22/0x50
[] __filemap_fdatawrite_range+0x6d/0x80
[] filemap_flush+0x2b/0x30
[] ext4_alloc_da_blocks+0x5c/0x60
[] ext4_release_file+0x75/0xb0
[] __fput+0xf9/0x210
[] fput+0x27/0x30
[] filp_close+0x4c/0x80
[] put_files_struct+0x6e/0xd0
[] exit_files+0x47/0x60
[] do_exit+0x144/0x710
[] do_group_exit+0x38/0xa0
[] get_signal_to_deliver+0x2ac/0x410
[] do_notify_resume+0xb9/0x890
[] work_notifysig+0x13/0x21

other info that might help us debug this:

3 locks held by write-truncate-/3465:
#0: (jbd2_handle){+.+...}, at: [] start_this_handle+0x38f/0x5c0
#1: (&ei->i_data_sem){++++..}, at: [] ext4_get_blocks+0xb6/0x450
#2: (&meta_group_info[i]->alloc_sem){++++..}, at: [] ext4_mb_load_buddy+0xb2/0x370

stack backtrace:
Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18
Call Trace:
[] ? printk+0x1d/0x22
[] print_circular_bug+0xca/0xd0
[] __lock_acquire+0x1091/0x1260
[] ? sched_clock_local+0xd2/0x170
[] ? trace_hardirqs_off_caller+0x20/0xd0
[] lock_acquire+0xba/0xd0
[] ? dquot_claim_space+0x3b/0x1b0
[] down_read+0x51/0x90
[] ? dquot_claim_space+0x3b/0x1b0
[] dquot_claim_space+0x3b/0x1b0
[] ext4_mb_mark_diskspace_used+0x36f/0x380
[] ext4_mb_new_blocks+0x34a/0x530
[] ? ext4_ext_find_extent+0x25d/0x280
[] ext4_ext_get_blocks+0x122b/0x13c0
[] ? sched_clock_local+0xd2/0x170
[] ? sched_clock_cpu+0x120/0x160
[] ? cpu_clock+0x4f/0x60
[] ? trace_hardirqs_off_caller+0x20/0xd0
[] ? down_write+0x8c/0xa0
[] ext4_get_blocks+0x226/0x450
[] ? sched_clock_cpu+0x120/0x160
[] ? cpu_clock+0x4f/0x60
[] ? trace_hardirqs_off+0xb/0x10
[] mpage_da_map_blocks+0xc3/0xaa0
[] ? find_get_pages_tag+0x16c/0x180
[] ? find_get_pages_tag+0x0/0x180
[] ? __mpage_da_writepage+0x16d/0x1a0
[] ? pagevec_lookup_tag+0x2e/0x40
[] ? write_cache_pages+0xdb/0x3d0
[] ? __mpage_da_writepage+0x0/0x1a0
[] ext4_da_writepages+0x506/0x790
[] ? cpu_clock+0x4f/0x60
[] ? sched_clock_local+0xd2/0x170
[] ? sched_clock_cpu+0x120/0x160
[] ? sched_clock_cpu+0x120/0x160
[] ? ext4_da_writepages+0x0/0x790
[] do_writepages+0x22/0x50
[] __filemap_fdatawrite_range+0x6d/0x80
[] filemap_flush+0x2b/0x30
[] ext4_alloc_da_blocks+0x5c/0x60
[] ext4_release_file+0x75/0xb0
[] __fput+0xf9/0x210
[] fput+0x27/0x30
[] filp_close+0x4c/0x80
[] put_files_struct+0x6e/0xd0
[] exit_files+0x47/0x60
[] do_exit+0x144/0x710
[] ? lock_release_holdtime+0x33/0x210
[] ? _spin_unlock_irq+0x27/0x30
[] do_group_exit+0x38/0xa0
[] ? trace_hardirqs_on+0xb/0x10
[] get_signal_to_deliver+0x2ac/0x410
[] do_notify_resume+0xb9/0x890
[] ? trace_hardirqs_off_caller+0x20/0xd0
[] ? lock_release_holdtime+0x33/0x210
[] ? autoremove_wake_function+0x0/0x50
[] ? trace_hardirqs_on_caller+0x134/0x190
[] ? trace_hardirqs_on+0xb/0x10
[] ? security_file_permission+0x14/0x20
[] ? vfs_write+0x131/0x190
[] ? do_sync_write+0x0/0x120
[] ? sysenter_do_call+0x27/0x32
[] work_notifysig+0x13/0x21

CC: Theodore Ts'o
Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara

Dmitry Monakhov
2009-12-23 20:44:12 +0800
a9e7f4472 ext4: Convert to generic reserved quota's space management.

This patch also fixes write vs chown race condition.

Acked-by: "Theodore Ts'o"
Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara

Dmitry Monakhov
2009-12-23 20:33:55 +0800

21 Dec, 2009

2 commits

51b7e3c9f ext4: add module aliases for ext2 and ext3

Add module aliases for ext2 and ext3 when CONFIG_EXT4_USE_FOR_EXT23 is
set. This lets existing user-space tools such as mkinitrd keep working
as is.

Signed-off-by: Takashi Iwai
Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-21 23:56:09 +0800
84c664730 ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured

Don't offer to build ext2/3 support into ext4 if ext4 itself is not
configured on.

Signed-off-by: David Howells
Signed-off-by: "Theodore Ts'o"

David Howells
2009-12-21 23:54:09 +0800

18 Dec, 2009

1 commit

b6e3224fb Revert "task_struct: make journal_info conditional"

This reverts commit e4c570c4cb7a95dbfafa3d016d2739bf3fdfe319, as
requested by Alexey:

"I think I gave a good enough arguments to not merge it.
To iterate:
* patch makes impossible to start using ext3 on EXT3_FS=n kernels
without reboot.
* this is done only for one pointer on task_struct"

None of config options which define task_struct are tristate directly
or effectively."

Requested-by: Alexey Dobriyan
Acked-by: Andrew Morton
Signed-off-by: Linus Torvalds

Linus Torvalds
2009-12-18 05:23:24 +0800

17 Dec, 2009

1 commit

431547b3c sanitize xattr handler prototypes

Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods. This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute. With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechanism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse]

Signed-off-by: Christoph Hellwig
Reviewed-by: James Morris
Acked-by: Joel Becker
Signed-off-by: Al Viro

Christoph Hellwig
2009-12-17 01:16:49 +0800

16 Dec, 2009

2 commits

e7d2860b6 tree-wide: convert open calls to remove spaces to skip_spaces() lib function

Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
drivers/leds/led-class.c
drivers/leds/ledtrig-timer.c
drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str && isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa
Cc: Julia Lawall
Cc: Martin Schwidefsky
Cc: Jeff Dike
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Richard Purdie
Cc: Neil Brown
Cc: Kyle McMartin
Cc: Henrique de Moraes Holschuh
Cc: David Howells
Cc:
Cc: Samuel Ortiz
Cc: Patrick McHardy
Cc: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

André Goddard Rosa
2009-12-16 00:53:32 +0800
e4c570c4c task_struct: make journal_info conditional

journal_info in task_struct is used in journaling file system only. So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.

Signed-off-by: Hiroshi Shimamoto
Cc: Chris Mason
Cc: "Theodore Ts'o"
Cc: Steven Whitehouse
Cc: KONISHI Ryusuke
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hiroshi Shimamoto
2009-12-16 00:53:27 +0800

15 Dec, 2009

1 commit

d0316554d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
m68k: rename global variable vmalloc_end to m68k_vmalloc_end
percpu: add missing per_cpu_ptr_to_phys() definition for UP
percpu: Fix kdump failure if booted with percpu_alloc=page
percpu: make misc percpu symbols unique
percpu: make percpu symbols in ia64 unique
percpu: make percpu symbols in powerpc unique
percpu: make percpu symbols in x86 unique
percpu: make percpu symbols in xen unique
percpu: make percpu symbols in cpufreq unique
percpu: make percpu symbols in oprofile unique
percpu: make percpu symbols in tracer unique
percpu: make percpu symbols under kernel/ and mm/ unique
percpu: remove some sparse warnings
percpu: make alloc_percpu() handle array types
vmalloc: fix use of non-existent percpu variable in put_cpu_var()
this_cpu: Use this_cpu_xx in trace_functions_graph.c
this_cpu: Use this_cpu_xx for ftrace
this_cpu: Use this_cpu_xx in nmi handling
this_cpu: Use this_cpu operations in RCU
this_cpu: Use this_cpu ops for VM statistics
...

Fix up trivial (famous last words) global per-cpu naming conflicts in
arch/x86/kvm/svm.c
mm/slab.c

Linus Torvalds
2009-12-15 01:58:24 +0800

14 Dec, 2009

2 commits

034fb4c95 ext4: replace BUG() with return -EIO in ext4_ext_get_blocks

This patch fixes the Kernel BZ #14286. When the address of an extent
corresponding to a valid block is corrupted, a -EIO should be reported
instead of a BUG(). This situation should not normally occur
except in the case of a corrupted filesystem. If however it does,
then the system should not panic directly but depending on the mount
time options appropriate action should be taken. If the mount options
so permit, the I/O should be gracefully aborted by returning a -EIO.

http://bugzilla.kernel.org/show_bug.cgi?id=14286

Signed-off-by: Surbhi Palande
Signed-off-by: "Theodore Ts'o"

Surbhi Palande
2009-12-14 22:53:52 +0800
149feb00d ext4: remove unused #include <linux/version.h>

Remove unused #include <linux/version.h>('s) in
fs/ext4/block_validity.c
fs/ext4/mballoc.h

Signed-off-by: Huang Weiyi
Signed-off-by: "Theodore Ts'o"

Huang Weiyi
2009-12-14 22:24:20 +0800

12 Dec, 2009

1 commit

3126c136b Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (21 commits)
ext3: PTR_ERR return of wrong pointer in setup_new_group_blocks()
ext3: Fix data / filesystem corruption when write fails to copy data
ext4: Support for 64-bit quota format
ext3: Support for vfsv1 quota format
quota: Implement quota format with 64-bit space and inode limits
quota: Move definition of QFMT_OCFS2 to linux/quota.h
ext2: fix comment in ext2_find_entry about return values
ext3: Unify log messages in ext3
ext2: clear uptodate flag on super block I/O error
ext2: Unify log messages in ext2
ext3: make "norecovery" an alias for "noload"
ext3: Don't update the superblock in ext3_statfs()
ext3: journal all modifications in ext3_xattr_set_handle
ext2: Explicitly assign values to on-disk enum of filetypes
quota: Fix WARN_ON in lookup_one_len
const: struct quota_format_ops
ubifs: remove manual O_SYNC handling
afs: remove manual O_SYNC handling
kill wait_on_page_writeback_range
vfs: Implement proper O_SYNC semantics
...

Linus Torvalds
2009-12-12 07:31:13 +0800

11 Dec, 2009

1 commit

4515c3069 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (47 commits)
ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)
ext4: Do not override ext2 or ext3 if they are built as modules
jbd2: Export jbd2_log_start_commit to fix ext4 build
ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT
ext4: Wait for proper transaction commit on fsync
ext4: fix incorrect block reservation on quota transfer.
ext4: quota macros cleanup
ext4: ext4_get_reserved_space() must return bytes instead of blocks
ext4: remove blocks from inode prealloc list on failure
ext4: wait for log to commit when umounting
ext4: Avoid data / filesystem corruption when write fails to copy data
ext4: Use ext4 file system driver for ext2/ext3 file system mounts
ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()
jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()
ext4: remove unused parameter wbc from __ext4_journalled_writepage()
ext4: remove encountered_congestion trace
ext4: move_extent_per_page() cleanup
ext4: initialize moved_len before calling ext4_move_extents()
ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT
ext4: use ext4_data_block_valid() in ext4_free_blocks()
...

Linus Torvalds
2009-12-11 01:33:29 +0800

10 Dec, 2009

3 commits

5a20bdfcd ext4: Support for 64-bit quota format

Add support for new 64-bit quota format. It is enough to add proper
mount options handling. The rest is done by the generic code.

Signed-off-by: Jan Kara

Jan Kara
2009-12-10 22:02:54 +0800
fab3a549e ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)

Fix the following potential circular locking dependency between
mm->mmap_sem and ei->i_data_sem:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-04115-gec044c5 #37
-------------------------------------------------------
ureadahead/1855 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [] might_fault+0x5c/0xac

but task is already holding lock:
(&ei->i_data_sem){++++..}, at: [] ext4_fiemap+0x11b/0x159

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ei->i_data_sem){++++..}:
[] __lock_acquire+0xb67/0xd0f
[] lock_acquire+0xdc/0x102
[] down_read+0x51/0x84
[] ext4_get_blocks+0x50/0x2a5
[] ext4_get_block+0xab/0xef
[] do_mpage_readpage+0x198/0x48d
[] mpage_readpages+0xd0/0x114
[] ext4_readpages+0x1d/0x1f
[] __do_page_cache_readahead+0x12f/0x1bc
[] ra_submit+0x21/0x25
[] filemap_fault+0x19f/0x32c
[] __do_fault+0x55/0x3a2
[] handle_mm_fault+0x327/0x734
[] do_page_fault+0x292/0x2aa
[] page_fault+0x25/0x30
[] clear_user+0x38/0x3c
[] padzero+0x20/0x31
[] load_elf_binary+0x8bc/0x17ed
[] search_binary_handler+0xc2/0x259
[] load_script+0x1b8/0x1cc
[] search_binary_handler+0xc2/0x259
[] do_execve+0x1ce/0x2cf
[] sys_execve+0x43/0x5a
[] stub_execve+0x6a/0xc0

-> #0 (&mm->mmap_sem){++++++}:
[] __lock_acquire+0xa11/0xd0f
[] lock_acquire+0xdc/0x102
[] might_fault+0x89/0xac
[] fiemap_fill_next_extent+0x95/0xda
[] ext4_ext_fiemap_cb+0x138/0x157
[] ext4_ext_walk_space+0x178/0x1f1
[] ext4_fiemap+0x13c/0x159
[] do_vfs_ioctl+0x348/0x4d6
[] sys_ioctl+0x56/0x79
[] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by ureadahead/1855:
#0: (&ei->i_data_sem){++++..}, at: [] ext4_fiemap+0x11b/0x159

stack backtrace:
Pid: 1855, comm: ureadahead Not tainted 2.6.32-04115-gec044c5 #37
Call Trace:
[] print_circular_bug+0xa8/0xb7
[] __lock_acquire+0xa11/0xd0f
[] ? sched_clock+0x9/0xd
[] lock_acquire+0xdc/0x102
[] ? might_fault+0x5c/0xac
[] might_fault+0x89/0xac
[] ? might_fault+0x5c/0xac
[] ? __kmalloc+0x13b/0x18c
[] fiemap_fill_next_extent+0x95/0xda
[] ext4_ext_fiemap_cb+0x138/0x157
[] ? ext4_ext_fiemap_cb+0x0/0x157
[] ext4_ext_walk_space+0x178/0x1f1
[] ext4_fiemap+0x13c/0x159
[] ? might_fault+0x5c/0xac
[] do_vfs_ioctl+0x348/0x4d6
[] ? __up_read+0x8d/0x95
[] ? retint_swapgs+0x13/0x1b
[] sys_ioctl+0x56/0x79
[] system_call_fastpath+0x16/0x1b

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-10 10:30:02 +0800
a214238d3 ext4: Do not override ext2 or ext3 if they are built as modules

The CONFIG_EXT4_USE_FOR_EXT23 option must not try to take over the
ext2 or ext3 file systems if those file system drivers are
configured to be built as modules.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-10 10:09:58 +0800

09 Dec, 2009

7 commits

b436b9bef ext4: Wait for proper transaction commit on fsync

We cannot rely on buffer dirty bits during fsync because pdflush can come
before fsync is called and clear dirty bits without forcing a transaction
commit. What we do is that we track which transaction has last changed
the inode and which transaction last changed allocation and force it to
disk on fsync.

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2009-12-09 12:51:10 +0800
194074aca ext4: fix incorrect block reservation on quota transfer.

Inside a ->setattr() call both ATTR_UID and ATTR_GID may be valid.
This means that we may end up transferring all quotas. And
we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
the case of QUOTA_INIT_BLOCKS.

Signed-off-by: Dmitry Monakhov
Reviewed-by: Mingming Cao
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2009-12-09 11:42:28 +0800
5aca07eb7 ext4: quota macros cleanup

Currently all quota block reservation macros contain a hard-coded "2",
aka the MAXQUOTAS value. This is no good because in some places it is not
obvious what this digit represents. Let's introduce
a new macro with a self-descriptive name.
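
A minimal sketch of this kind of cleanup is shown below. All SKETCH_* names
and the value 25 are hypothetical and chosen only for illustration; they are
not the macros or block counts used in fs/ext4. The only thing taken from the
commit message is the idea of replacing a literal 2 with a named MAXQUOTAS
multiplier.

```c
#include <assert.h>

/* Hypothetical values for illustration; the real ones live in the kernel. */
#define SKETCH_MAXQUOTAS          2   /* number of quota types (user, group) */
#define SKETCH_QUOTA_INIT_BLOCKS  25  /* made-up per-quota-type cost */

/* Before: the reader has to guess what the literal 2 stands for. */
#define SKETCH_INIT_CREDITS_OLD   (2 * SKETCH_QUOTA_INIT_BLOCKS)

/* After: the multiplier is named, so the intent is visible at the use site. */
#define SKETCH_MAXQUOTAS_INIT_BLOCKS \
    (SKETCH_MAXQUOTAS * SKETCH_QUOTA_INIT_BLOCKS)

/* Credits to reserve when every quota type may need initialising. */
static int quota_init_credits(void)
{
    /* The rename must not change the computed value. */
    assert(SKETCH_INIT_CREDITS_OLD == SKETCH_MAXQUOTAS_INIT_BLOCKS);
    return SKETCH_MAXQUOTAS_INIT_BLOCKS;
}
```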

Signed-off-by: Dmitry Monakhov
Acked-by: Mingming Cao
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2009-12-09 11:42:15 +0800
8aa6790f8 ext4: ext4_get_reserved_space() must return bytes instead of blocks

Signed-off-by: Dmitry Monakhov
Reviewed-by: Eric Sandeen
Acked-by: Mingming Cao
Signed-off-by: "Theodore Ts'o"

Dmitry Monakhov
2009-12-09 11:41:52 +0800
b844167ed ext4: remove blocks from inode prealloc list on failure

This fixes a leak of blocks in an inode prealloc list if device failures
cause ext4_mb_mark_diskspace_used() to fail.

Signed-off-by: Curt Wohlgemuth
Acked-by: Aneesh Kumar K.V
Signed-off-by: "Theodore Ts'o"

Curt Wohlgemuth
2009-12-09 11:18:25 +0800
d4edac314 ext4: wait for log to commit when umounting

There is a potential race when a transaction is committing right when
the file system is being unmounted. This could result in a race
because EXT4_SB(sb)->s_group_info could be freed in ext4_put_super
before the commit code calls a callback so the mballoc code can
release freed blocks in the transaction, resulting in a panic trying
to access the freed s_group_info.

The fix is to wait for the transaction to finish committing before we
shut down the multiblock allocator.

Signed-off-by: Josef Bacik
Signed-off-by: "Theodore Ts'o"

Josef Bacik
2009-12-09 10:48:58 +0800
b9a4207d5 ext4: Avoid data / filesystem corruption when write fails to copy data

When ext4_write_begin fails after allocating some blocks or
generic_perform_write fails to copy data to write, we truncate blocks
already instantiated beyond i_size. Although these blocks were never
inside i_size, we have to truncate the pagecache of these blocks so
that corresponding buffers get unmapped. Otherwise subsequent
__block_prepare_write (called because we are retrying the write) will
find the buffers mapped, not call ->get_block, and thus the page will
be backed by already freed blocks leading to filesystem and data
corruption.

Signed-off-by: Jan Kara
Signed-off-by: "Theodore Ts'o"

Jan Kara
2009-12-09 10:24:33 +0800

08 Dec, 2009

2 commits

d014d0438 Merge branch 'for-next' into for-linus

Conflicts:

kernel/irq/chip.c

Jiri Kosina
2009-12-08 01:36:35 +0800
24b584240 ext4: Use ext4 file system driver for ext2/ext3 file system mounts

Add a new config option, CONFIG_EXT4_USE_FOR_EXT23 which if enabled,
will cause ext4 to be used for either ext2 or ext3 file system mounts
when ext2 or ext3 is not enabled in the configuration.

This allows minimalist kernel fanatics to drop two file system drivers
from their compiled kernel without losing functionality.

Signed-off-by: "Theodore Ts'o"

Theodore Ts'o
2009-12-08 03:08:51 +0800

07 Dec, 2009

2 commits

4a58579b9 ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT

This patch fixes three problems in the handling of the
EXT4_IOC_MOVE_EXT ioctl:

1. The current EXT4_IOC_MOVE_EXT code checks only for read access on the
original and donor files, which allows an illegal write to the
donor file, since the donor file is overwritten with the original file's
data. To fix this problem, change the access mode checks of the original
(r->r/w) and donor (r->w) files.

2. Disallow the use of donor files that have the setuid or setgid bit set.

3. Call mnt_want_write() and mnt_drop_write() before and after
calling ext4_move_extents() to get write access to the mount.
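
The first two checks above can be modelled in user space roughly as follows.
This is a sketch under stated assumptions, not the ioctl code: the function
move_ext_allowed() is hypothetical, it works on open(2) flags and an st_mode
value instead of kernel file structures, and it leaves out the third item
(mount write access) entirely.

```c
#include <fcntl.h>
#include <sys/stat.h>

/*
 * Hypothetical user-space model of the first two checks. `orig_flags` and
 * `donor_flags` are open(2) flags; `donor_mode` is the donor's st_mode.
 * Returns 1 if the move would be permitted under these rules, 0 otherwise.
 */
static int move_ext_allowed(int orig_flags, int donor_flags, mode_t donor_mode)
{
    int orig_acc = orig_flags & O_ACCMODE;
    int donor_acc = donor_flags & O_ACCMODE;

    /* Check 1: original must be open read/write, donor must be writable. */
    if (orig_acc != O_RDWR)
        return 0;
    if (donor_acc != O_WRONLY && donor_acc != O_RDWR)
        return 0;

    /* Check 2: refuse donor files with the setuid or setgid bit set. */
    if (donor_mode & (S_ISUID | S_ISGID))
        return 0;

    return 1;
}
```

For example, a donor opened with O_RDONLY is rejected in this model, which is
the case the old read-only check let through.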

Signed-off-by: Akira Fujita
Signed-off-by: "Theodore Ts'o"

Akira Fujita
2009-12-07 12:38:31 +0800
c09eef305 ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()

Signed-off-by: Roel Kluin
Signed-off-by: "Theodore Ts'o"

Roel Kluin
2009-12-07 23:38:16 +0800