Eric Lee / smarc-fsl-linux-kernel

04 Jan, 2012

1 commit

8e0718924 reiserfs: propagate umode_t

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:55:00 +0800

02 Nov, 2011

2 commits

bfe868486 filesystems: add set_nlink()

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig

Miklos Szeredi
2011-11-02 19:53:43 +0800
6d6b77f16 filesystems: add missing nlink wrappers

Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2011-11-02 19:53:43 +0800
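
The two nlink commits above replace direct writes to i_nlink with helper calls. As a rough userspace sketch of that idea (the `toy_` names and the one-field struct are illustrative assumptions, not the kernel's definitions), every link-count update goes through a small function so that checks can live in one place:

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct inode; only the link
 * count is modeled here. */
struct toy_inode {
    unsigned int i_nlink;
};

/* Set the link count to an explicit value (the role set_nlink() plays). */
void toy_set_nlink(struct toy_inode *inode, unsigned int nlink)
{
    inode->i_nlink = nlink;
}

/* Add one link (the role inc_nlink() plays). */
void toy_inc_nlink(struct toy_inode *inode)
{
    inode->i_nlink++;
}

/* Remove one link (the role drop_nlink() plays); dropping below zero
 * is a caller bug, so it is caught in one central place. */
void toy_drop_nlink(struct toy_inode *inode)
{
    assert(inode->i_nlink > 0);
    inode->i_nlink--;
}

/* Force the link count to zero (the role clear_nlink() plays). */
void toy_clear_nlink(struct toy_inode *inode)
{
    inode->i_nlink = 0;
}
```

With wrappers like these, a caller writes `toy_set_nlink(&ino, 2)` instead of `ino.i_nlink = 2`, which is the kind of substitution the commit messages describe.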

26 Jul, 2011

1 commit

4482a087d reiserfs: cache negative ACLs for v1 stat format

Always set up a negative ACL cache entry if the inode can't have ACLs.
That behaves much better than doing this check inside ->check_acl.

Also remove the left over MAY_NOT_BLOCK check.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2011-07-26 02:25:38 +0800

21 Jul, 2011

2 commits

aacfc19c6 fs: simplify the blockdev_direct_IO prototype

Simple filesystems always pass inode->i_sb->s_bdev as the block device
argument, and never need an end_io handler. Let's simplify things for
them and for my grepping activity by dropping these arguments. The
only thing not falling into that scheme is ext4, which passes an
end_io handler without needing special flags (yet), but given how
messy the direct I/O code there is, use of __blockdev_direct_IO
in one instead of two out of three cases isn't going to make a large
difference anyway.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2011-07-21 08:47:49 +0800
562c72aa5 fs: move inode_dio_wait calls into ->setattr

Let filesystems handle waiting for direct I/O requests themselves instead
of doing it beforehand. This means filesystem-specific locks to prevent
new dio references from appearing can be held. This is important to allow
generalizing i_dio_count to non-DIO_LOCKING filesystems.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2011-07-21 08:47:47 +0800

25 Mar, 2011

1 commit

6c5103890 Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block

* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...

Fix up conflicts in fs/{aio.c,super.c}

Linus Torvalds
2011-03-25 01:16:26 +0800

14 Mar, 2011

1 commit

5fe0c2378 exportfs: Return the minimum required handle size

The exportfs encode handle function should return the minimum required
handle size. This lets the user find out the handle size by passing a
handle size of 0 in the first call and then repeating the call with
the returned handle size value.

Acked-by: Serge Hallyn
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Al Viro

Aneesh Kumar K.V
2011-03-14 21:15:28 +0800
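
The probe-then-retry pattern described in the exportfs entry above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `toy_encode_handle`, `toy_get_handle`, and the 12-byte handle size are hypothetical and are not the exportfs interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_HANDLE_BYTES 12   /* hypothetical minimum handle size */

/* Hypothetical encoder. On entry *len is the buffer capacity; on return
 * it always holds the minimum required size. Returns 0 on success and
 * -1 if the buffer was too small (nothing is written in that case). */
int toy_encode_handle(unsigned char *buf, size_t *len)
{
    size_t capacity;

    assert(len != NULL);
    capacity = *len;
    *len = TOY_HANDLE_BYTES;
    if (capacity < TOY_HANDLE_BYTES)
        return -1;
    memset(buf, 0xAB, TOY_HANDLE_BYTES);   /* placeholder handle contents */
    return 0;
}

/* Two-step caller: probe with size 0, allocate, then call again. */
unsigned char *toy_get_handle(size_t *out_len)
{
    size_t len = 0;
    unsigned char *buf;

    /* Step 1: a zero-sized probe is expected to fail and report the size. */
    if (toy_encode_handle(NULL, &len) == 0)
        return NULL;

    buf = malloc(len);
    if (buf == NULL)
        return NULL;

    /* Step 2: repeat the call with the size returned by the probe. */
    if (toy_encode_handle(buf, &len) != 0) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

The design point is that the failing call still reports a useful size, so the caller needs exactly one retry rather than guessing and growing the buffer.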

10 Mar, 2011

1 commit

7eaceacca block: remove per-queue plugging

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe

Jens Axboe
2011-03-10 15:52:07 +0800

18 Nov, 2010

1 commit

451a3c24b BKL: remove extraneous #include <smp_lock.h>

The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann
Signed-off-by: Linus Torvalds

Arnd Bergmann
2010-11-18 00:59:32 +0800

27 Oct, 2010

2 commits

426e1f5ce Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
split invalidate_inodes()
fs: skip I_FREEING inodes in writeback_sb_inodes
fs: fold invalidate_list into invalidate_inodes
fs: do not drop inode_lock in dispose_list
fs: inode split IO and LRU lists
fs: switch bdev inode bdi's correctly
fs: fix buffer invalidation in invalidate_list
fsnotify: use dget_parent
smbfs: use dget_parent
exportfs: use dget_parent
fs: use RCU read side protection in d_validate
fs: clean up dentry lru modification
fs: split __shrink_dcache_sb
fs: improve DCACHE_REFERENCED usage
fs: use percpu counter for nr_dentry and nr_dentry_unused
fs: simplify __d_free
fs: take dcache_lock inside __d_path
fs: do not assign default i_ino in new_inode
fs: introduce a per-cpu last_ino allocator
new helper: ihold()
...

Linus Torvalds
2010-10-27 08:58:44 +0800
1b430beee writeback: remove nonblocking/encountered_congestion references

This removes more dead code that was somehow missed by commit 0d99519efef
(writeback: remove unused nonblocking and congestion checks). There is
no behavior change except for the removal of two entries from one of the
ext4 tracing interfaces.

The nonblocking checks in ->writepages are no longer used because the
flusher now prefers to block on get_request_wait() rather than skip inodes
on IO congestion. The latter would lead to more seeky IO.

The nonblocking checks in ->writepage are no longer used because they are
redundant with the WB_SYNC_NONE check.

We no longer set ->nonblocking in VM page out and page migration, because
a) it's effectively redundant with WB_SYNC_NONE in current code
b) its old semantic of "Don't get stuck on request queues" is misbehavior:
that would skip some dirty inodes on congestion and page out others, which
is unfair in terms of LRU age.

Inspired by Christoph Hellwig. Thanks!

Signed-off-by: Wu Fengguang
Cc: Theodore Ts'o
Cc: David Howells
Cc: Sage Weil
Cc: Steve French
Cc: Chris Mason
Cc: Jens Axboe
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wu Fengguang
2010-10-27 07:52:05 +0800

26 Oct, 2010

1 commit

ebdec241d fs: kill block_prepare_write

__block_write_begin and block_prepare_write are identical except for slightly
different calling conventions. Convert all callers to the __block_write_begin
calling conventions and drop block_prepare_write.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-10-26 09:18:20 +0800

18 Aug, 2010

1 commit

f4ae2faa4 fix reiserfs_evict_inode end_writeback second call

reiserfs_evict_inode calls end_writeback two times, hitting
kernel BUG at fs/inode.c:298 because inode->i_state is I_CLEAR already.

Signed-off-by: Sergey Senozhatsky
Signed-off-by: Al Viro

Sergey Senozhatsky
2010-08-18 12:58:57 +0800

11 Aug, 2010

1 commit

5f248c9c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
no need for list_for_each_entry_safe()/resetting with superblock list
Fix sget() race with failing mount
vfs: don't hold s_umount over close_bdev_exclusive() call
sysv: do not mark superblock dirty on remount
sysv: do not mark superblock dirty on mount
btrfs: remove junk sb_dirt change
BFS: clean up the superblock usage
AFFS: wait for sb synchronization when needed
AFFS: clean up dirty flag usage
cifs: truncate fallout
mbcache: fix shrinker function return value
mbcache: Remove unused features
add f_flags to struct statfs(64)
pass a struct path to vfs_statfs
update VFS documentation for method changes.
All filesystems that need invalidate_inode_buffers() are doing that explicitly
convert remaining ->clear_inode() to ->evict_inode()
Make ->drop_inode() just return whether inode needs to be dropped
fs/inode.c:clear_inode() is gone
fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
...

Fix up trivial conflicts in fs/nilfs2/super.c

Linus Torvalds
2010-08-11 02:26:52 +0800

10 Aug, 2010

6 commits

845a2cc05 convert reiserfs to ->evict_inode()

Signed-off-by: Al Viro

Al Viro
2010-08-10 04:48:23 +0800
db78b877f always call inode_change_ok early in ->setattr

Make sure we call inode_change_ok before doing any changes in ->setattr,
and make sure to call it even if our fs wants to ignore normal UNIX
permissions, but use the ATTR_FORCE to skip those.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-08-10 04:47:38 +0800
1025774ce remove inode_setattr

Replace inode_setattr with opencoded variants of it in all callers. This
moves the remaining call to vmtruncate into the filesystem methods where it
can be replaced with the proper truncate sequence.

In a few cases it was obvious that we would never end up calling vmtruncate
so it was left out in the opencoded variant:

spufs: explicitly checks for ATTR_SIZE earlier
btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above

In addition to that ncpfs called inode_setattr with handcrafted iattrs,
which allowed to trim down the opencoded variant.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-08-10 04:47:37 +0800
6e1db88d5 introduce __block_write_begin

Split up the block_write_begin implementation - __block_write_begin is a new
trivial wrapper for block_prepare_write that always takes an already
allocated page and can be either called from block_write_begin or filesystem
code that already has a page allocated. Remove the handling of already
allocated pages from block_write_begin after switching all callers that
do it to __block_write_begin.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-08-10 04:47:32 +0800
eafdc7d19 sort out blockdev_direct_IO variants

Move the call to vmtruncate to get rid of excessive blocks to the callers
in preparation of the new truncate calling sequence. This was only done
for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc variant
was not needed anyway. Get rid of blockdev_direct_IO_no_locking and
its _newtrunc variant while at it as just opencoding the two additional
parameters is shorter than the name suffix.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-08-10 04:47:29 +0800
0e4f6a791 Fix reiserfs_file_release()

a) count file openers correctly; i_count use was completely wrong
b) use new mutex for exclusion between final close/open/truncate,
to protect tailpacking logics. i_mutex use was wrong and resulted
in deadlocks.

Signed-off-by: Al Viro

Al Viro
2010-08-10 04:47:27 +0800
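
The two points in the reiserfs_file_release() entry above (a dedicated opener count, and a dedicated mutex around the final close) can be illustrated with a small pthread sketch. All names here (`toy_file`, `toy_open`, `toy_release`, `tails_packed`) are hypothetical, and the structure is a simplification rather than the reiserfs code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-file state: an explicit count of openers plus a
 * mutex that serializes open against the final close. */
struct toy_file {
    pthread_mutex_t tailpack;  /* guards openers and the tail-packing step */
    int openers;               /* number of currently open descriptors */
    int tails_packed;          /* how many times the final-close work ran */
};

#define TOY_FILE_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 0 }

/* Register one more opener. */
void toy_open(struct toy_file *f)
{
    pthread_mutex_lock(&f->tailpack);
    f->openers++;
    pthread_mutex_unlock(&f->tailpack);
}

/* Drop one opener. Only the last closer performs the final-close work,
 * and it does so while holding the mutex, so a concurrent open cannot
 * slip in halfway through. Returns true if this was the final close. */
bool toy_release(struct toy_file *f)
{
    bool last;

    pthread_mutex_lock(&f->tailpack);
    assert(f->openers > 0);
    last = (--f->openers == 0);
    if (last)
        f->tails_packed++;     /* stand-in for the tail-packing logic */
    pthread_mutex_unlock(&f->tailpack);
    return last;
}
```

Counting openers explicitly, instead of inferring them from a general-purpose reference count, is what makes "is this the last close?" a question with a reliable answer.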

17 Jun, 2010

1 commit

421f91d21 fix typos concerning "initiali[zs]e"

Signed-off-by: Uwe Kleine-König
Signed-off-by: Jiri Kosina

Uwe Kleine-König
2010-06-17 00:05:05 +0800

22 May, 2010

1 commit

12755627b quota: unify quota init condition in setattr

Quota must be initialized if size or uid/gid changes are requested.
But initialization is performed in two different places:
in the case of i_size, the file system is responsible for dquot init,
but in the case of uid/gid, init will be called internally in
dquot_transfer().
This ambiguity makes the code harder to understand.
Let's move this logic to one common helper function.

Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara

Dmitry Monakhov
2010-05-22 01:30:45 +0800
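
The "one common helper" mentioned in the quota entry above amounts to a single predicate over the requested attribute changes. A minimal sketch, assuming made-up flag values and struct layouts (`TOY_ATTR_*`, `toy_qinode`, `toy_iattr`, and `toy_needs_quota_init` are all hypothetical names, not kernel definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits saying which fields of the request are valid. */
#define TOY_ATTR_UID  (1u << 0)
#define TOY_ATTR_GID  (1u << 1)
#define TOY_ATTR_SIZE (1u << 2)

/* Hypothetical current state of a file. */
struct toy_qinode {
    unsigned int uid;
    unsigned int gid;
    long long size;
};

/* Hypothetical setattr request. */
struct toy_iattr {
    unsigned int valid;   /* bitmask of TOY_ATTR_* */
    unsigned int uid;
    unsigned int gid;
    long long size;
};

/* True if the request changes the size, owner, or group, which are the
 * cases where quota has to be initialized before the change. */
bool toy_needs_quota_init(const struct toy_qinode *inode,
                          const struct toy_iattr *ia)
{
    assert(inode != NULL && ia != NULL);
    return ((ia->valid & TOY_ATTR_SIZE) && ia->size != inode->size) ||
           ((ia->valid & TOY_ATTR_UID) && ia->uid != inode->uid) ||
           ((ia->valid & TOY_ATTR_GID) && ia->gid != inode->gid);
}
```

A setattr implementation would call the predicate once and initialize quota if it returns true, instead of handling the size case and the uid/gid case in different places.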

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

06 Mar, 2010

2 commits

e213e26ab Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...

Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

Linus Torvalds
2010-03-06 05:20:53 +0800
a9185b41a pass writeback_control to ->write_inode

This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by being able to
distinguish between the different callers in more detail.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-03-06 02:25:52 +0800

05 Mar, 2010

5 commits

871a29315 dquot: cleanup dquot initialize routine

Get rid of the initialize dquot operation - it is now always called from
the filesystem and if a filesystem really needs its own (which none
currently does) it can just call into its own routine directly.

Rename the now static low-level dquot_initialize helper to __dquot_initialize
and vfs_dq_init to dquot_initialize to have a consistent namespace.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:30 +0800
907f4554e dquot: move dquot initialization responsibility into the filesystem

Currently various places in the VFS call vfs_dq_init directly. This means
we tie the quota code into the VFS. Get rid of that and make the
filesystem responsible for the initialization. For most metadata operations
this is a straight forward move into the methods, but for truncate and
open it's a bit more complicated.

For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.

For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.

Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:30 +0800
9f7547580 dquot: cleanup dquot drop routine

Get rid of the drop dquot operation - it is now always called from
the filesystem and if a filesystem really needs its own (which none
currently does) it can just call into its own routine directly.

Rename the now static low-level dquot_drop helper to __dquot_drop
and vfs_dq_drop to dquot_drop to have a consistent namespace.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:30 +0800
b43fa8284 dquot: cleanup dquot transfer routine

Get rid of the transfer dquot operation - it is now always called from
the filesystem and if a filesystem really needs its own (which none
currently does) it can just call into its own routine directly.

Rename the now static low-level dquot_transfer helper to __dquot_transfer
and vfs_dq_transfer to dquot_transfer to have a consistent namespace,
and make the new dquot_transfer return a normal negative errno value
which all callers expect.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:29 +0800
63936ddaa dquot: cleanup inode allocation / freeing routines

Get rid of the alloc_inode and free_inode dquot operations - they are
always called from the filesystem and if a filesystem really needs
their own (which none currently does) it can just call into its
own routine directly.

Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
call the lowlevel dquot_alloc_inode / dquot_free_inode routines
directly, which now lose the number argument which is always 1.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:28 +0800

15 Feb, 2010

1 commit

175359f89 reiserfs: Fix softlockup while waiting on an inode

When we wait for an inode through reiserfs_iget(), we hold
the reiserfs lock. And waiting for an inode may imply waiting
for its writeback. But the inode writeback path may also require
the reiserfs lock, which leads to a deadlock.

We just need to release the reiserfs lock from reiserfs_iget()
to fix this.

Reported-by: Alexander Beregalov
Signed-off-by: Frederic Weisbecker
Tested-by: Christian Kujau
Cc: Chris Mason

Frederic Weisbecker
2010-02-15 02:07:56 +0800

09 Jan, 2010

1 commit

82062e7b5 Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing

* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks
reiserfs: Fix unreachable statement
reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock
reiserfs: Relax lock on xattr removing
reiserfs: Relax the lock before truncating pages
reiserfs: Fix recursive lock on lchown
reiserfs: Fix mistake in down_write() conversion

Linus Torvalds
2010-01-09 06:03:55 +0800

05 Jan, 2010

2 commits

108d3943c reiserfs: Relax the lock before truncating pages

While truncating a file, reiserfs_setattr() calls inode_setattr()
that will truncate the mapping for the given inode, but for that
it needs the pages locks.

In order to release these, the owners need the reiserfs lock to
complete their jobs. But they can't, as we don't release it before
calling inode_setattr().

We need to do that to fix the following softlockups:

INFO: task flush-8:0:2149 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0 D f51af998 0 2149 2 0x00000000
f51af9ac 00000092 00000002 f51af998 c2803304 00000000 c1894ad0 010f3000
f51af9cc c1462604 c189ef80 f51af974 c1710304 f715b450 f715b5ec c2807c40
00000000 0005bb00 c2803320 c102c55b c1710304 c2807c50 c2803304 00000246
Call Trace:
[] ? schedule+0x434/0xb20
[] ? resched_task+0x4b/0x70
[] ? mark_held_locks+0x62/0x80
[] ? mutex_lock_nested+0x1fd/0x350
[] mutex_lock_nested+0x169/0x350
[] ? reiserfs_write_lock+0x2e/0x40
[] reiserfs_write_lock+0x2e/0x40
[] do_journal_end+0xc2/0xe70
[] journal_end+0xb2/0x120
[] ? pathrelse+0x33/0xb0
[] reiserfs_end_persistent_transaction+0x64/0x70
[] reiserfs_get_block+0x12ba/0x15f0
[] ? mark_held_locks+0x62/0x80
[] reiserfs_writepage+0xa74/0xe80
[] ? _raw_spin_unlock_irq+0x27/0x50
[] ? radix_tree_gang_lookup_tag_slot+0x95/0xc0
[] ? find_get_pages_tag+0x127/0x1a0
[] ? mark_held_locks+0x62/0x80
[] ? trace_hardirqs_on_caller+0x124/0x170
[] __writepage+0x10/0x40
[] write_cache_pages+0x16b/0x320
[] ? __writepage+0x0/0x40
[] generic_writepages+0x28/0x40
[] do_writepages+0x35/0x40
[] writeback_single_inode+0xc7/0x330
[] writeback_inodes_wb+0x2c2/0x490
[] wb_writeback+0x106/0x1b0
[] wb_do_writeback+0x106/0x1e0
[] ? wb_do_writeback+0x28/0x1e0
[] bdi_writeback_task+0x3a/0xb0
[] bdi_start_fn+0x63/0xc0
[] ? bdi_start_fn+0x0/0xc0
[] kthread+0x74/0x80
[] ? kthread+0x0/0x80
[] kernel_thread_helper+0x6/0x10
3 locks held by flush-8:0/2149:
#0: (&type->s_umount_key#30){+++++.}, at: [] writeback_inodes_wb+0x27f/0x490
#1: (&journal->j_mutex){+.+...}, at: [] do_journal_end+0xba/0xe70
#2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x2e/0x40
INFO: task fstest:3813 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fstest D 00000002 0 3813 3812 0x00000000
f5103c94 00000082 f5103c40 00000002 f5ad5450 00000007 f5103c28 011f3000
00000006 f5ad5450 c10bb005 00000480 c1710304 f5ad5450 f5ad55ec c2907c40
00000001 f5ad5450 f5103c74 00000046 00000002 f5ad5450 00000007 f5103c6c
Call Trace:
[] ? free_hot_cold_page+0x1d5/0x280
[] io_schedule+0x74/0xc0
[] sync_page+0x35/0x60
[] __wait_on_bit_lock+0x4a/0x90
[] ? sync_page+0x0/0x60
[] __lock_page+0x85/0x90
[] ? wake_bit_function+0x0/0x60
[] truncate_inode_pages_range+0x1e4/0x2d0
[] truncate_inode_pages+0x1f/0x30
[] truncate_pagecache+0x5f/0xa0
[] vmtruncate+0x5a/0x70
[] inode_setattr+0x5d/0x190
[] reiserfs_setattr+0x1f7/0x2f0
[] ? down_write+0x49/0x70
[] notify_change+0x151/0x330
[] do_truncate+0x6d/0xa0
[] do_filp_open+0x9a2/0xcf0
[] ? _raw_spin_unlock+0x2c/0x50
[] ? alloc_fd+0xe0/0x100
[] do_sys_open+0x6d/0x130
[] ? sysenter_exit+0xf/0x16
[] sys_open+0x2e/0x40
[] sysenter_do_call+0x12/0x32
3 locks held by fstest/3813:
#0: (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [] do_truncate+0x63/0xa0
#1: (&sb->s_type->i_alloc_sem_key#3){+.+.+.}, at: [] notify_change+0x257/0x330
#2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock_once+0x2e/0x50

Signed-off-by: Frederic Weisbecker
Cc: Christian Kujau
Cc: Alexander Beregalov
Cc: Chris Mason
Cc: Ingo Molnar

Frederic Weisbecker
2010-01-05 15:00:29 +0800
5fe1533fd reiserfs: Fix recursive lock on lchown

On chown, reiserfs will call reiserfs_setattr() to change the owner
of the given inode, but it may also recursively call
reiserfs_setattr() to propagate the owner change to the private xattr
files for this inode.

Hence, the reiserfs lock may be acquired twice which is not wanted
as reiserfs_setattr() calls journal_begin() that is going to try to
relax the lock in order to safely acquire the journal mutex.

Using reiserfs_write_lock_once() from reiserfs_setattr() solves
the problem.

This fixes the following warning, that precedes a lockdep report.

WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3f/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 4189, comm: fsstress Not tainted 2.6.33-rc2-tip-atom+ #195
Call Trace:
[] ? reiserfs_lock_check_recursive+0x3f/0x50
[] ? reiserfs_lock_check_recursive+0x3f/0x50
[] warn_slowpath_common+0x6c/0xc0
[] ? reiserfs_lock_check_recursive+0x3f/0x50
[] warn_slowpath_fmt+0x2b/0x30
[] reiserfs_lock_check_recursive+0x3f/0x50
[] do_journal_begin_r+0x83/0x350
[] journal_begin+0x7d/0x140
[] ? in_group_p+0x2a/0x30
[] ? inode_change_ok+0x91/0x140
[] reiserfs_setattr+0x15d/0x2e0
[] ? dput+0xe3/0x140
[] ? _raw_spin_unlock+0x2c/0x50
[] chown_one_xattr+0xd/0x10
[] reiserfs_for_each_xattr+0x113/0x2c0
[] ? chown_one_xattr+0x0/0x10
[] ? mutex_lock_nested+0x2a9/0x350
[] reiserfs_chown_xattrs+0x1f/0x60
[] ? in_group_p+0x2a/0x30
[] ? inode_change_ok+0x91/0x140
[] reiserfs_setattr+0x126/0x2e0
[] ? reiserfs_getxattr+0x0/0x90
[] ? cap_inode_need_killpriv+0x37/0x50
[] notify_change+0x151/0x330
[] chown_common+0x6f/0x90
[] sys_lchown+0x6d/0x80
[] sysenter_do_call+0x12/0x32
---[ end trace 7c2b77224c1442fc ]---

Signed-off-by: Frederic Weisbecker
Cc: Christian Kujau
Cc: Alexander Beregalov
Cc: Chris Mason
Cc: Ingo Molnar

Frederic Weisbecker
2010-01-05 14:59:38 +0800
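
The "lock once" idea behind the lchown fix above can be shown with a tiny single-threaded model. This is a sketch loosely modeled on the behavior the commit message describes, not the reiserfs implementation: the `toy_` names are hypothetical, tasks are plain integers, and no real blocking takes place.

```c
#include <assert.h>

/* Hypothetical model of a lock that remembers its owner and a nesting
 * depth. owner == 0 means unlocked; depth == -1 means unlocked. */
struct toy_lock {
    int owner;
    int depth;
};

#define TOY_LOCK_INIT { 0, -1 }

/* Take the lock only if the calling task does not already own it.
 * Returns the depth observed before the call: -1 means this call really
 * acquired the lock, anything else means the task held it already. */
int toy_write_lock_once(struct toy_lock *l, int task)
{
    if (l->owner != task) {
        assert(l->owner == 0);   /* a real lock would block here instead */
        l->owner = task;
        return l->depth++;
    }
    return l->depth;
}

/* Undo toy_write_lock_once(): release only if that call acquired it. */
void toy_write_unlock_once(struct toy_lock *l, int saved_depth)
{
    if (saved_depth == -1) {
        if (--l->depth == -1)
            l->owner = 0;
    }
}
```

A nested caller passes back the value it was given, so an inner call made while the lock is already held neither deepens the recursion nor releases the lock early. That keeps the depth at a single level, which is what a later "relax the lock" step relies on.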

03 Jan, 2010

1 commit

45d28b097 Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing

* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
reiserfs: Safely acquire i_mutex from xattr_rmdir
reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
reiserfs: Fix journal mutex <-> inode mutex lock inversion
reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
reiserfs: Relax reiserfs lock while freeing the journal
reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
reiserfs: Warn on lock relax if taken recursively
reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion
reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion
reiserfs: Fix reiserfs lock and journal lock inversion dependency
reiserfs: Fix possible recursive lock

Linus Torvalds
2010-01-03 03:17:05 +0800

18 Dec, 2009

1 commit

ec8e2f746 reiserfs: truncate blocks not used by a write

It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory. We truncate
these blocks so that we don't have dangling blocks beyond i_size.

Cc: Jeff Mahoney
Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2009-12-18 07:45:30 +0800

14 Dec, 2009

1 commit

cb1c2e51c reiserfs: Fix reiserfs lock and journal lock inversion dependency

When we were using the bkl, we didn't care about dependencies against
other locks, but the mutex conversion created new ones, which is why
we have reiserfs_mutex_lock_safe(), which unlocks the reiserfs lock
before acquiring another mutex.

But this trick actually fails if we have acquired the reiserfs lock
recursively, as we try to unlock it to acquire the new mutex without
inverted dependency, but we eventually only decrease its depth.

This happens in the case of a nested inode creation/deletion.
Say we have no space left on the device, we create an inode
and take the lock but fail to create its entry, then we release the
inode using iput(), which calls reiserfs_delete_inode() that takes
the reiserfs lock recursively. The path eventually ends up in
journal_begin() where we try to take the journal safely but we
fail because of the reiserfs lock recursion:

[ INFO: possible circular locking dependency detected ]
2.6.32-06486-g053fe57 #2
-------------------------------------------------------
vi/23454 is trying to acquire lock:
(&journal->j_mutex){+.+...}, at: [] do_journal_begin_r+0x64/0x2f0

but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[] validate_chain+0xa23/0xf70
[] __lock_acquire+0x4e5/0xa70
[] lock_acquire+0x7a/0xa0
[] mutex_lock_nested+0x5f/0x2b0
[] reiserfs_write_lock+0x28/0x40
[] do_journal_begin_r+0x6b/0x2f0
[] journal_begin+0x7f/0x120
[] reiserfs_remount+0x212/0x4d0
[] do_remount_sb+0x67/0x140
[] do_mount+0x436/0x6b0
[] sys_mount+0x66/0xa0
[] sysenter_do_call+0x12/0x36

-> #0 (&journal->j_mutex){+.+...}:
[] validate_chain+0xf68/0xf70
[] __lock_acquire+0x4e5/0xa70
[] lock_acquire+0x7a/0xa0
[] mutex_lock_nested+0x5f/0x2b0
[] do_journal_begin_r+0x64/0x2f0
[] journal_begin+0x7f/0x120
[] reiserfs_delete_inode+0x9f/0x140
[] generic_delete_inode+0x9c/0x150
[] generic_drop_inode+0x3d/0x60
[] iput+0x47/0x50
[] reiserfs_create+0x16c/0x1c0
[] vfs_create+0xc1/0x130
[] do_filp_open+0x81c/0x920
[] do_sys_open+0x4f/0x110
[] sys_open+0x29/0x40
[] sysenter_do_call+0x12/0x36

other info that might help us debug this:

2 locks held by vi/23454:
#0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_filp_open+0x27e/0x920
#1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40

stack backtrace:
Pid: 23454, comm: vi Not tainted 2.6.32-06486-g053fe57 #2
Call Trace:
[] ? printk+0x18/0x1e
[] print_circular_bug+0xc0/0xd0
[] validate_chain+0xf68/0xf70
[] ? trace_hardirqs_off+0xb/0x10
[] __lock_acquire+0x4e5/0xa70
[] lock_acquire+0x7a/0xa0
[] ? do_journal_begin_r+0x64/0x2f0
[] mutex_lock_nested+0x5f/0x2b0
[] ? do_journal_begin_r+0x64/0x2f0
[] ? do_journal_begin_r+0x64/0x2f0
[] ? delete_one_xattr+0x0/0x1c0
[] do_journal_begin_r+0x64/0x2f0
[] journal_begin+0x7f/0x120
[] ? reiserfs_delete_xattrs+0x15/0x50
[] reiserfs_delete_inode+0x9f/0x140
[] ? generic_delete_inode+0x5f/0x150
[] ? reiserfs_delete_inode+0x0/0x140
[] generic_delete_inode+0x9c/0x150
[] generic_drop_inode+0x3d/0x60
[] iput+0x47/0x50
[] reiserfs_create+0x16c/0x1c0
[] ? inode_permission+0x7d/0xa0
[] vfs_create+0xc1/0x130
[] ? reiserfs_create+0x0/0x1c0
[] do_filp_open+0x81c/0x920
[] ? trace_hardirqs_off+0xb/0x10
[] ? _spin_unlock+0x1d/0x20
[] ? alloc_fd+0xba/0xf0
[] do_sys_open+0x4f/0x110
[] sys_open+0x29/0x40
[] sysenter_do_call+0x12/0x36

To fix this, use reiserfs_write_lock_once() from reiserfs_delete_inode(),
which avoids adding reiserfs lock recursion.

Reported-by: Alexander Beregalov
Signed-off-by: Frederic Weisbecker
Cc: Chris Mason
Cc: Ingo Molnar
Cc: Thomas Gleixner

Frederic Weisbecker
2009-12-14 18:47:11 +0800
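
Why the "unlock, take the other mutex, relock" trick fails under recursion, as the entry above explains, can be seen in a small single-threaded model. Everything here is a hypothetical illustration (`toy_rlock`, `toy_lock`, `toy_unlock`, `toy_lock_other_safe`); tasks are integers and nothing actually blocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical recursive lock: depth counts nested acquisitions by the
 * owner, and 0 means the lock is free. */
struct toy_rlock {
    int owner;
    int depth;
};

/* Acquire, or nest if the task already owns the lock. */
void toy_lock(struct toy_rlock *l, int task)
{
    if (l->owner == task) {
        l->depth++;
        return;
    }
    assert(l->owner == 0);       /* a real lock would block here instead */
    l->owner = task;
    l->depth = 1;
}

/* Release one level. Returns true only if the lock actually became free. */
bool toy_unlock(struct toy_rlock *l)
{
    assert(l->depth > 0);
    if (--l->depth == 0) {
        l->owner = 0;
        return true;
    }
    return false;
}

/* "Safe" acquisition of a second lock: drop the big lock, take the other
 * one, then retake the big lock. Returns true if the big lock was really
 * free while the other lock was being taken, i.e. if the ordering
 * problem was actually avoided. */
bool toy_lock_other_safe(struct toy_rlock *big, int task, int *other_held)
{
    bool released = toy_unlock(big);

    *other_held = 1;             /* stand-in for locking the other mutex */
    toy_lock(big, task);
    return released;
}
```

With the big lock held once, the helper reports that the lock was truly released. With the lock held twice by the same task, the single unlock only lowers the depth, so the other mutex is taken while the big lock is still held, which is the inversion the lockdep report in this entry points at.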

21 Nov, 2009

1 commit

1d2c6cfd4 kill-the-bkl/reiserfs: turn GFP_ATOMIC flag to GFP_NOFS in reiserfs_get_block()

GFP_ATOMIC was used in reiserfs_get_block to not lose the Bkl so that
nobody can modify the tree in the middle of its work. Now that we
kicked out the bkl, we can use a more friendly flag. We use GFP_NOFS
here because we already hold the reiserfs lock.

Signed-off-by: Frederic Weisbecker
Cc: Jeff Mahoney
Cc: Chris Mason
Cc: Ingo Molnar
Cc: Alexander Beregalov
Cc: Laurent Riffard
Cc: Thomas Gleixner

Frederic Weisbecker
2009-11-21 01:25:02 +0800

15 Oct, 2009

1 commit

27b3a5c51 kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()

We had a watchdog in _get_block_create_0() that jumped to a fixup retry
path in case the bkl got relaxed while calling kmap().
This is not necessary anymore since we now have a reiserfs lock that is
not implicitly relaxed while sleeping.

Signed-off-by: Frederic Weisbecker
Cc: Jeff Mahoney
Cc: Chris Mason
Cc: Ingo Molnar
Cc: Alexander Beregalov
Cc: Laurent Riffard
Cc: Thomas Gleixner

Frederic Weisbecker
2009-10-15 05:34:31 +0800