11 Jan, 2012

3 commits

  • Nothing requires that we lock the filesystem until the root inode is
    provided.

    Also, iget5_locked() triggers a warning because we are holding the
    filesystem lock while allocating the inode, which results in a lockdep
    suspicion that we have a lock inversion against the reclaim path:

    [ 1986.896979] =================================
    [ 1986.896990] [ INFO: inconsistent lock state ]
    [ 1986.896997] 3.1.1-main #8
    [ 1986.897001] ---------------------------------
    [ 1986.897007] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
    [ 1986.897016] kswapd0/16 [HC0[0]:SC0[0]:HE1:SE1] takes:
    [ 1986.897023] (&REISERFS_SB(s)->lock){+.+.?.}, at: [] reiserfs_write_lock+0x20/0x2a
    [ 1986.897044] {RECLAIM_FS-ON-W} state was registered at:
    [ 1986.897050] [] mark_held_locks+0xae/0xd0
    [ 1986.897060] [] lockdep_trace_alloc+0x7d/0x91
    [ 1986.897068] [] kmem_cache_alloc+0x1a/0x93
    [ 1986.897078] [] reiserfs_alloc_inode+0x13/0x3d
    [ 1986.897088] [] alloc_inode+0x14/0x5f
    [ 1986.897097] [] iget5_locked+0x62/0x13a
    [ 1986.897106] [] reiserfs_fill_super+0x410/0x8b9
    [ 1986.897114] [] mount_bdev+0x10b/0x159
    [ 1986.897123] [] get_super_block+0x10/0x12
    [ 1986.897131] [] mount_fs+0x59/0x12d
    [ 1986.897138] [] vfs_kern_mount+0x45/0x7a
    [ 1986.897147] [] do_kern_mount+0x2f/0xb0
    [ 1986.897155] [] do_mount+0x5c2/0x612
    [ 1986.897163] [] sys_mount+0x61/0x8f
    [ 1986.897170] [] sysenter_do_call+0x12/0x32
    [ 1986.897181] irq event stamp: 7509691
    [ 1986.897186] hardirqs last enabled at (7509691): [] kmem_cache_alloc+0x6e/0x93
    [ 1986.897197] hardirqs last disabled at (7509690): [] kmem_cache_alloc+0x24/0x93
    [ 1986.897209] softirqs last enabled at (7508896): [] __do_softirq+0xee/0xfd
    [ 1986.897222] softirqs last disabled at (7508859): [] do_softirq+0x50/0x9d
    [ 1986.897234]
    [ 1986.897235] other info that might help us debug this:
    [ 1986.897242] Possible unsafe locking scenario:
    [ 1986.897244]
    [ 1986.897250] CPU0
    [ 1986.897254] ----
    [ 1986.897257] lock(&REISERFS_SB(s)->lock);
    [ 1986.897265]
    [ 1986.897269] lock(&REISERFS_SB(s)->lock);
    [ 1986.897276]
    [ 1986.897277] *** DEADLOCK ***
    [ 1986.897278]
    [ 1986.897286] no locks held by kswapd0/16.
    [ 1986.897291]
    [ 1986.897292] stack backtrace:
    [ 1986.897299] Pid: 16, comm: kswapd0 Not tainted 3.1.1-main #8
    [ 1986.897306] Call Trace:
    [ 1986.897314] [] ? printk+0xf/0x11
    [ 1986.897324] [] print_usage_bug+0x20e/0x21a
    [ 1986.897332] [] ? print_irq_inversion_bug+0x172/0x172
    [ 1986.897341] [] mark_lock+0x27f/0x483
    [ 1986.897349] [] __lock_acquire+0x628/0x1472
    [ 1986.897358] [] lock_acquire+0x47/0x5e
    [ 1986.897366] [] ? reiserfs_write_lock+0x20/0x2a
    [ 1986.897384] [] ? reiserfs_write_lock+0x20/0x2a
    [ 1986.897397] [] mutex_lock_nested+0x35/0x26f
    [ 1986.897409] [] ? reiserfs_write_lock+0x20/0x2a
    [ 1986.897421] [] reiserfs_write_lock+0x20/0x2a
    [ 1986.897433] [] map_block_for_writepage+0xc9/0x590
    [ 1986.897448] [] ? create_empty_buffers+0x33/0x8f
    [ 1986.897461] [] ? get_parent_ip+0xb/0x31
    [ 1986.897472] [] ? sub_preempt_count+0x81/0x8e
    [ 1986.897485] [] ? _raw_spin_unlock+0x27/0x3d
    [ 1986.897496] [] ? get_parent_ip+0xb/0x31
    [ 1986.897508] [] reiserfs_writepage+0x1b9/0x3e7
    [ 1986.897521] [] ? clear_page_dirty_for_io+0xcb/0xde
    [ 1986.897533] [] ? trace_hardirqs_on_caller+0x108/0x138
    [ 1986.897546] [] ? trace_hardirqs_on+0xb/0xd
    [ 1986.897559] [] shrink_page_list+0x34f/0x5e2
    [ 1986.897572] [] shrink_inactive_list+0x172/0x22c
    [ 1986.897585] [] shrink_zone+0x303/0x3b1
    [ 1986.897597] [] ? _raw_spin_unlock+0x27/0x3d
    [ 1986.897611] [] kswapd+0x3b7/0x5f2

    The deadlock shouldn't happen: since we do that allocation in the mount
    path, the filesystem is not yet available for any reclaim. Still, the
    warning is annoying.

    To solve this, acquire the lock later, only where we need it: right before
    calling reiserfs_read_locked_inode(), which needs the lock to walk the tree.
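
    A sketch of the resulting shape in reiserfs_fill_super(), with arguments
    and error handling abbreviated (not the verbatim diff):

        root_inode = iget5_locked(s, REISERFS_ROOT_OBJECTID, reiserfs_find_actor,
                                  reiserfs_init_locked_inode, &args);
        /* error handling elided */

        /* the write lock is taken only now, for the tree walk below */
        reiserfs_write_lock(s);
        if (root_inode->i_state & I_NEW) {
                reiserfs_read_locked_inode(root_inode, &args);
                unlock_new_inode(root_inode);
        }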

    Reported-by: Knut Petersen
    Signed-off-by: Frederic Weisbecker
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Jeff Mahoney
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • journal_init() doesn't need the lock since no operation on the filesystem
    is involved there. journal_read() and get_list_bitmap() have yet to be
    reviewed carefully before the lock can be removed there, though. Just keep
    the lock around these two calls for safety.
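
    Schematically, the lock now brackets only those two calls inside
    journal_init(), while the rest of the setup runs unlocked. A sketch of the
    intent only (journal, bitmap and err are illustrative names, not the exact
    code):

        /* inside journal_init(): the bulk of the setup runs without the lock */
        reiserfs_write_lock(sb);
        bitmap = get_list_bitmap(sb, journal->j_current_jl);
        reiserfs_write_unlock(sb);

        /* ... more lock-free initialization ... */

        reiserfs_write_lock(sb);
        err = journal_read(sb);
        reiserfs_write_unlock(sb);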

    Signed-off-by: Frederic Weisbecker
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Jeff Mahoney
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • In the mount path, transactions that are made before journal
    initialization don't involve the filesystem. We can delay the reiserfs
    lock until we play with the journal.

    Signed-off-by: Frederic Weisbecker
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Jeff Mahoney
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     

10 Jan, 2012

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     

09 Jan, 2012

2 commits

  • This patch fixes a crash in reiserfs_delete_xattrs during umount.

    When shrink_dcache_for_umount clears the dcache from
    generic_shutdown_super, delayed evictions are forced to disk. If an
    evicted inode has extended attributes associated with it, it will
    need to walk the xattr tree to locate and remove them.

    But since shrink_dcache_for_umount will BUG if it encounters active
    dentries, the xattr tree must be released before it's called or it will
    crash during every umount.

    This patch forces the evictions to occur before generic_shutdown_super
    by calling shrink_dcache_sb first. The additional evictions caused
    by the removal of each associated xattr file and dir will be automatically
    handled as they're added to the LRU list.
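
    A condensed sketch of the shape of the fix in reiserfs_kill_sb() (details
    abbreviated, not the verbatim patch):

        static void reiserfs_kill_sb(struct super_block *s)
        {
                if (REISERFS_SB(s)) {
                        /* force delayed evictions now, while the xattr root
                         * dentries are still alive */
                        shrink_dcache_sb(s);

                        dput(REISERFS_SB(s)->xattr_root);
                        REISERFS_SB(s)->xattr_root = NULL;
                        dput(REISERFS_SB(s)->priv_root);
                        REISERFS_SB(s)->priv_root = NULL;
                }
                kill_block_super(s);
        }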

    CC: reiserfs-devel@vger.kernel.org
    CC: stable@kernel.org
    Signed-off-by: Jeff Mahoney
    Signed-off-by: Jan Kara

    Jeff Mahoney
     
  • When the jqfmt mount option is not specified on remount, we mistakenly
    clear the s_jquota_fmt value stored in the superblock. Fix the problem.
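
    The intent of the fix, schematically (not the verbatim hunk; qfmt stands
    for the format value parsed from the mount options, 0 when jqfmt was
    absent):

        /* only overwrite the stored journaled-quota format when a jqfmt=
         * option was actually given on this (re)mount */
        if (qfmt)
                REISERFS_SB(s)->s_jquota_fmt = qfmt;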

    CC: stable@kernel.org
    CC: reiserfs-devel@vger.kernel.org
    Signed-off-by: Jan Kara

    Jan Kara
     

07 Jan, 2012

2 commits


04 Jan, 2012

1 commit

  • Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
    it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
    the cost of taking it into inode_init_always() will be negligible for pipes
    and sockets and negative for everything else. Not to mention the removal of
    boilerplate code from ->destroy_inode() instances...

    Signed-off-by: Al Viro

    Al Viro
     

21 Jul, 2011

1 commit


27 May, 2011

1 commit

  • Tell the filesystem whether we just updated the timestamps (I_DIRTY_SYNC)
    or anything else, so that the filesystem can track internally whether it
    needs to push out a transaction for fdatasync or not.

    This is just the prototype change with no user for it yet. I plan
    to push large XFS changes for the next merge window, and getting
    this trivial infrastructure in this window would help a lot to avoid
    tree interdependencies.

    Also remove the incorrect comments stating that ->dirty_inode can't block.
    That was changed a long time ago, and many implementations rely on it.
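
    The prototype change amounts to an extra flags argument on the superblock
    operation, roughly:

        /* in struct super_operations */
        void (*dirty_inode)(struct inode *inode, int flags);

        /* __mark_inode_dirty() passes its I_DIRTY_* flags down; a bare
         * I_DIRTY_SYNC means only timestamps changed */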

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

31 Mar, 2011

1 commit


13 Jan, 2011

1 commit

  • As Al Viro pointed out, path resolution during Q_QUOTAON quotactl calls
    is prone to deadlocks. We hold the s_umount semaphore for reading during
    the path resolution, and the resolution itself may need to acquire the
    semaphore for writing when e.g. an autofs mountpoint is passed.

    Solve the problem by performing the resolution before we get hold of the
    superblock (and thus the s_umount semaphore). The whole thing is complicated
    by the fact that some filesystems (OCFS2) ignore the path argument. So, to
    distinguish between filesystems which want the path and those which do not,
    we introduce a new .quota_on_meta callback which does not get the path.
    OCFS2 then uses this callback instead of the old .quota_on.
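
    After the change, struct quotactl_ops carries both variants; a sketch of
    the two relevant members as I read this description (abbreviated):

        /* path-based variant: the VFS resolves the path before taking
         * s_umount and passes it down */
        int (*quota_on)(struct super_block *sb, int type, int format_id,
                        struct path *path);

        /* path-less variant for filesystems (OCFS2) that ignore the path */
        int (*quota_on_meta)(struct super_block *sb, int type, int format_id);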

    CC: Al Viro
    CC: Christoph Hellwig
    CC: Ted Ts'o
    CC: Joel Becker
    Signed-off-by: Jan Kara

    Jan Kara
     

07 Jan, 2011

1 commit

  • RCU free the struct inode. This will allow:

    - Subsequent store-free path walking patch. The inode must be consulted for
    permissions when walking, so an RCU inode reference is a must.
    - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
    to take i_lock no longer need to take sb_inode_list_lock to walk the list in
    the first place. This will simplify and optimize locking.
    - Could remove some nested trylock loops in dcache code
    - Could potentially simplify things a bit in VM land. Do not need to take the
    page lock to follow page->mapping.

    The downside of this is the performance cost of using RCU. In a simple
    creat/unlink microbenchmark, performance drops by about 10% due to inability to
    reuse cache-hot slab objects. As iterations increase and RCU freeing starts
    kicking over, this increases to about 20%.

    In cases where inode lifetimes are longer (ie. many inodes may be allocated
    during the average life span of a single inode), a lot of this cache reuse is
    not applicable, so the regression caused by this patch is smaller.

    The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
    however this adds some complexity to list walking and store-free path walking,
    so I prefer to implement this at a later date, if it is shown to be a win in
    real situations. I haven't found a regression in any non-micro benchmark so I
    doubt it will be a problem.
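
    For reference, the freeing path after this change goes through call_rcu();
    a sketch close to the generic VFS side (filesystems with their own
    destroy_inode do the equivalent with their own inode cache):

        static void i_callback(struct rcu_head *head)
        {
                struct inode *inode = container_of(head, struct inode, i_rcu);
                kmem_cache_free(inode_cachep, inode);
        }

        static void destroy_inode(struct inode *inode)
        {
                /* ... usual teardown ... */
                call_rcu(&inode->i_rcu, i_callback);
        }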

    Signed-off-by: Nick Piggin

    Nick Piggin
     

18 Nov, 2010

1 commit


29 Oct, 2010

1 commit


10 Aug, 2010

2 commits


27 May, 2010

1 commit

  • When quota has been suspended on remount-ro, finish_unfinished() will try
    to turn it on again (which fails) and will also turn the quotas off on exit.
    Fix the function to check whether quotas are already on at function entry
    and not turn them off in that case.

    CC: reiserfs-devel@vger.kernel.org
    Signed-off-by: Jan Kara

    Jan Kara
     

24 May, 2010

5 commits

  • Follow the dquot_* style used elsewhere in dquot.c.

    [Jan Kara: Fixed up missing conversion of ext2]

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Remount handling has fully moved into the filesystem, so all of this is
    superfluous now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Currently the VFS calls into the quotactl interface for unmounting
    filesystems. This means filesystems with their own quota handling
    can't easily distinguish between a user-space-originated quotaoff
    and an unmount. Instead, move the responsibility for the unmount handling
    into the filesystem to be consistent with all other dquot handling.

    Note that we do call dquot_disable a lot later now, e.g. after
    a sync_filesystem. But this is fine as the quota code does all its
    writes via blockdev's mapping and that is synced even later.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Instead of having wrappers in the VFS namespace export the dquot_suspend
    and dquot_resume helpers directly. Also rename vfs_quota_disable to
    dquot_disable while we're at it.

    [Jan Kara: Moved dquot_suspend to quotaops.h and made it inline]
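
    The exported suspend helper ends up as a thin inline over dquot_disable(),
    roughly:

        /* include/linux/quotaops.h (sketch) */
        static inline int dquot_suspend(struct super_block *sb, int type)
        {
                return dquot_disable(sb, type, DQUOT_SUSPENDED);
        }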

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Currently do_remount_sb calls into the dquot code to tell it about going
    from rw to ro and ro to rw. Move this code into the filesystem to
    not depend on the dquot code in the VFS - note ocfs2 already ignores
    these calls and handles remount by itself. This gets rid of overloading
    the quotactl calls and allows us to unify the VFS and XFS codepaths in
    that area later.
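
    The per-filesystem pattern that replaces the VFS hook then looks roughly
    like this inside a ->remount_fs implementation (a sketch; err is just a
    local status variable):

        if (*flags & MS_RDONLY)
                err = dquot_suspend(sb, -1);    /* going rw -> ro */
        else
                err = dquot_resume(sb, -1);     /* going ro -> rw */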

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

05 Apr, 2010

1 commit


31 Mar, 2010

1 commit

  • Commit 8ebc423238341b52912c7295b045a32477b33f09 (reiserfs: kill-the-BKL)
    introduced a bug in the mount failure case.

    The error label releases the lock before calling journal_release_error,
    but journal_release_error requires that the lock be held. do_journal_release
    unlocks and retakes it; when it releases the lock without it being held, we
    trigger a BUG().

    The error_alloc label skips the unlock since the lock isn't held yet,
    but none of the other conditions that need cleaning up exist yet either.

    This patch returns immediately after the kzalloc failure and moves
    the reiserfs_write_unlock after the journal_release_error call.
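
    Schematically, the two changes in reiserfs_fill_super() (a sketch of the
    ordering only, not the verbatim diff):

        sbi = kzalloc(sizeof(struct reiserfs_sb_info), GFP_KERNEL);
        if (!sbi)
                return -ENOMEM;         /* nothing allocated, nothing locked */
        /* ... */
    error:
        journal_release_error(NULL, s); /* wants the write lock held ... */
        reiserfs_write_unlock(s);       /* ... so drop it only afterwards */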

    This was reported in https://bugzilla.novell.com/show_bug.cgi?id=591807

    Reported-by: Thomas Siedentopf
    Signed-off-by: Jeff Mahoney
    Cc: Thomas Siedentopf
    Cc: Andrew Morton
    Cc: 2.6.33.x
    Signed-off-by: Frederic Weisbecker

    Jeff Mahoney
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

05 Mar, 2010

6 commits

  • Get rid of the initialize dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_initialize helper to __dquot_initialize
    and vfs_dq_init to dquot_initialize to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the drop dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_drop helper to __dquot_drop
    and vfs_dq_drop to dquot_drop to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Currently clear_inode calls vfs_dq_drop directly. This means
    we tie the quota code into the VFS. Get rid of that and make the
    filesystem responsible for the drop inside the ->clear_inode
    superblock operation.
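
    The per-filesystem pattern becomes a direct call from the superblock
    operation; a sketch using a hypothetical foofs (not code from the patch):

        static void foofs_clear_inode(struct inode *inode)
        {
                dquot_drop(inode);      /* formerly done by the VFS clear_inode() */
                /* ... filesystem-specific teardown ... */
        }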

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the transfer dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_transfer helper to __dquot_transfer
    and vfs_dq_transfer to dquot_transfer to have a consistent namespace,
    and make the new dquot_transfer return a normal negative errno value
    which all callers expect.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_inode and free_inode dquot operations - they are
    always called from the filesystem and if a filesystem really needs
    their own (which none currently does) it can just call into its
    own routine directly.

    Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
    call the low-level dquot_alloc_inode / dquot_free_inode routines
    directly, which now lose the number argument which is always 1.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_space, free_space, reserve_space, claim_space and
    release_rsv dquot operations - they are always called from the filesystem
    and if a filesystem really needs their own (which none currently does)
    it can just call into its own routine directly.

    Move shared logic into the common __dquot_alloc_space,
    dquot_claim_space_nodirty and __dquot_free_space low-level methods,
    and rationalize the wrappers around it to move as much code as possible
    into the common block for CONFIG_QUOTA vs not. Also rename
    all these helpers to be named dquot_* instead of vfs_dq_*.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

16 Dec, 2009

1 commit

  • /proc/fs/reiserfs/version is in the way of removing the ->read_proc
    interface. It's empty, however, so simply remove it instead of doing a dummy
    conversion. It's hard to see what information userspace could extract from
    an empty file.

    Signed-off-by: Alexey Dobriyan
    Cc: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 Dec, 2009

1 commit

  • …t/frederic/random-tracing

    * 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: (31 commits)
    kill-the-bkl/reiserfs: turn GFP_ATOMIC flag to GFP_NOFS in reiserfs_get_block()
    kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()
    kill-the-bkl/reiserfs: definitely drop the bkl from reiserfs_ioctl()
    kill-the-bkl/reiserfs: always lock the ioctl path
    kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
    kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
    kill-the-bkl/reiserfs: panic in case of lock imbalance
    kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write()
    kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir()
    kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency
    kill-the-bkl/reiserfs: move the concurrent tree accesses checks per superblock
    kill-the-bkl/reiserfs: acquire the inode mutex safely
    kill-the-bkl/reiserfs: unlock only when needed in search_by_key
    kill-the-bkl/reiserfs: use mutex_lock in reiserfs_mutex_lock_safe
    kill-the-bkl/reiserfs: factorize the locking in reiserfs_write_end()
    kill-the-bkl/reiserfs: reduce number of contentions in search_by_key()
    kill-the-bkl/reiserfs: don't hold the write recursively in reiserfs_lookup()
    kill-the-bkl/reiserfs: lock only once on reiserfs_get_block()
    kill-the-bkl/reiserfs: conditionaly release the write lock on fs_changed()
    kill-the-BKL/reiserfs: add reiserfs_cond_resched()
    ...

    Linus Torvalds
     

22 Sep, 2009

2 commits


14 Sep, 2009

2 commits

  • Impact: fix a deadlock

    reiserfs_dirty_inode() is the super_operations::dirty_inode() callback
    of reiserfs. It can be called from different contexts where the write
    lock may already be held.

    But this function also grabs the write lock (possibly recursively).
    A subsequent release of the lock before sleeping will not actually release
    it if the caller of mark_inode_dirty() (which in turn calls
    reiserfs_dirty_inode()) already owns the lock.

    A typical case:

    reiserfs_write_end() {
        acquire_write_lock()
        mark_inode_dirty() {
            reiserfs_dirty_inode() {
                reacquire_write_lock() {
                    journal_begin() {
                        do_journal_begin_r() {
                            /*
                             * fail to release, still
                             * one depth of lock
                             */
                            release_write_lock()
                            reiserfs_wait_on_write_block() {
                                wait_event()

    The event is usually provided by something which needs the write lock, but
    the lock hasn't been released.

    We use reiserfs_write_lock_once() here to ensure we take the write lock
    at only a single depth.
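
    A sketch of the pattern this introduces in reiserfs_dirty_inode() (the
    journal work in the middle is elided):

        int lock_depth;

        lock_depth = reiserfs_write_lock_once(inode->i_sb);
        /* ... journal_begin() / journal_end() work ... */
        reiserfs_write_unlock_once(inode->i_sb, lock_depth);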

    Signed-off-by: Frederic Weisbecker
    Cc: Frederic Weisbecker
    Cc: Alessio Igor Bogani
    Cc: Jeff Mahoney
    Cc: Chris Mason
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • This patch is an attempt to remove the Bkl-based locking scheme from
    reiserfs.

    It is a bit inspired by an old attempt from Peter Zijlstra:

    http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

    The bkl is heavily used in this filesystem to prevent concurrent write
    accesses on the filesystem.

    Reiserfs makes deep use of the specific properties of the Bkl:

    - It can be acquired recursively by the same task
    - It is released across schedule() calls and reacquired when schedule() returns

    The two properties above are a roadmap for the reiserfs write locking, so
    it's very hard to simply replace it with a common mutex.

    - We need recursion-capable locking unless we want to restructure several
    blocks of the code.
    - We need to identify the sites where the bkl was implicitly relaxed
    (schedule, wait, sync, etc...) so that we can in turn release and
    reacquire our new lock explicitly.
    Such implicit releases of the lock are often required to let other
    resource producers/consumers do their job, or we can suffer unexpected
    starvations or deadlocks.

    So the new lock that replaces the bkl here is a per-superblock mutex with a
    specific property: it can be acquired recursively by the same task, like the
    bkl.

    For this purpose, we add a lock owner and a lock depth field to the
    superblock information structure.
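
    A sketch of what the recursive wrapper looks like (simplified relative to
    the real code; field names are illustrative):

        void reiserfs_write_lock(struct super_block *s)
        {
                struct reiserfs_sb_info *sb_i = REISERFS_SB(s);

                if (sb_i->lock_owner != current) {
                        mutex_lock(&sb_i->lock);
                        sb_i->lock_owner = current;
                }
                /* only the lock owner ever touches lock_depth */
                sb_i->lock_depth++;
        }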

    The first axis of this patch is to turn the reiserfs_write_(un)lock()
    functions into wrappers that manage this mutex. Also some explicit calls to
    lock_kernel() have been converted to reiserfs_write_lock() helpers.

    The second axis is to find the important blocking sites (schedule...(),
    wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
    release of the write lock at these locations before blocking. Then we can
    safely wait for those who can give us resources or those who need some.
    Typically this is a fight between the current writer, the reiserfs workqueue
    (aka the async committer) and the pdflush threads.

    The third axis is a consequence of the second. The write lock is usually
    on top of a lock dependency chain which can include the journal lock, the
    flush lock or the commit lock. So it's dangerous to release and then try to
    reacquire the write lock while we still hold other locks.

    This is fine with the bkl:

        T1                              T2

        lock_kernel()
        mutex_lock(A)
        unlock_kernel()
        // do something
                                        lock_kernel()
                                        mutex_lock(A) -> already locked by T1
                                        schedule() (and then unlock_kernel())
        lock_kernel()
        mutex_unlock(A)
        ....

    This is not fine with a mutex:

        T1                              T2

        mutex_lock(write)
        mutex_lock(A)
        mutex_unlock(write)
        // do something
                                        mutex_lock(write)
                                        mutex_lock(A) -> already locked by T1
                                        schedule()

        mutex_lock(write) -> already locked by T2
        deadlock

    The solution in this patch is to provide a helper which releases the write
    lock and sleeps a bit if we can't lock a mutex that depends on it. It's
    another simulation of the bkl behaviour.
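
    A sketch of how such a helper can look (the exact in-tree body differs;
    reiserfs_mutex_lock_safe is the name used later in this series):

        static inline void
        reiserfs_mutex_lock_safe(struct mutex *m, struct super_block *s)
        {
                while (!mutex_trylock(m)) {
                        reiserfs_write_unlock(s);
                        schedule_timeout_uninterruptible(1);
                        reiserfs_write_lock(s);
                }
        }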

    The last axis is to locate the fs callbacks that are called with the bkl
    held, according to Documentation/filesystems/Locking.

    Those are:

    - reiserfs_remount
    - reiserfs_fill_super
    - reiserfs_put_super

    Reiserfs didn't need to lock explicitly in these callbacks before, because
    they were called with the bkl already held. But now we must take care of
    that with the new locking.

    After this patch, reiserfs suffers from a slight performance regression (for now).
    On UP, a high volume write with dd reports an average of 27 MB/s instead
    of 30 MB/s without the patch applied.

    Signed-off-by: Frederic Weisbecker
    Reviewed-by: Ingo Molnar
    Cc: Jeff Mahoney
    Cc: Peter Zijlstra
    Cc: Bron Gondwana
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Alexander Viro
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

09 Jul, 2009

1 commit

  • Fix various silly problems wrt mnt_namespace.h:

    - exit_mnt_ns() isn't used, remove it
    - done that, sched.h and nsproxy.h inclusions aren't needed
    - mount.h inclusion was needed for vfsmount_lock, but no longer
    - remove mnt_namespace.h inclusion from files which don't use anything
    from mnt_namespace.h

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan