Eric Lee / linux-smarc-t335x-v3.2

23 Aug, 2011

2 commits

65299a3b7 block: separate priority boosting from REQ_META

Add a new REQ_PRIO to let requests preempt others in the cfq I/O scheduler,
and leave REQ_META purely for marking requests as metadata in blktrace.

All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.

Signed-off-by: Christoph Hellwig
Reviewed-by: Namhyung Kim
Signed-off-by: Jens Axboe
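
The split described in this commit can be pictured as two independent flag bits. The following is only a minimal sketch: the bit values and both helper names are hypothetical and do not match the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the kernel's real values differ. */
#define REQ_META (1u << 0)  /* request carries metadata (tracing only) */
#define REQ_PRIO (1u << 1)  /* request may preempt others in the scheduler */

/* Tracing only asks whether the request is metadata. */
static bool traced_as_metadata(unsigned int flags)
{
    return (flags & REQ_META) != 0;
}

/* The scheduler only looks at the priority bit. */
static bool may_preempt(unsigned int flags)
{
    return (flags & REQ_PRIO) != 0;
}
```

Under this model a caller that previously relied on REQ_META for boosting would pass `REQ_META | REQ_PRIO`, while a caller that wants only the trace marking passes `REQ_META` alone.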

Christoph Hellwig
2011-08-23 20:50:29 +0800

5dc06c5a7 block: remove READ_META and WRITE_META

Replace all occurrences of the undocumented READ_META with READ | REQ_META
and remove the unused WRITE_META define.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2011-08-23 20:49:55 +0800

02 Aug, 2011

1 commit

1b8e94993 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set
VFS: Reorganise shrink_dcache_for_umount_subtree() after demise of dcache_lock
VFS: Remove dentry->d_lock locking from shrink_dcache_for_umount_subtree()
VFS: Remove detached-dentry counter from shrink_dcache_for_umount_subtree()
switch posix_acl_chmod() to umode_t
switch posix_acl_from_mode() to umode_t
switch posix_acl_equiv_mode() to umode_t *
switch posix_acl_create() to umode_t *
block: initialise bd_super in bdget()
vfs: avoid call to inode_lru_list_del() if possible
vfs: avoid taking inode_hash_lock on pipes and sockets
vfs: conditionally call inode_wb_list_del()
VFS: Fix automount for negative autofs dentries
Btrfs: load the key from the dir item in readdir into a fake dentry
devtmpfs: missing initialialization in never-hit case
hppfs: missing include

Linus Torvalds
2011-08-02 07:48:31 +0800

01 Aug, 2011

2 commits

d6952123b switch posix_acl_equiv_mode() to umode_t *

... so that &inode->i_mode could be passed to it

Signed-off-by: Al Viro

Al Viro
2011-08-01 14:10:06 +0800

d3fb61207 switch posix_acl_create() to umode_t *

so we can pass &inode->i_mode to it

Signed-off-by: Al Viro

Al Viro
2011-08-01 14:09:42 +0800

28 Jul, 2011

1 commit

333c066bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Fix mount hang caused by certain access pattern to sysfs files

Linus Torvalds
2011-07-28 00:26:22 +0800

27 Jul, 2011

1 commit

60063497a atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arun Sharma
2011-07-27 07:49:47 +0800

26 Jul, 2011

5 commits

192370399 GFS2: Fix mount hang caused by certain access pattern to sysfs files

Depending upon the order of userspace/kernel during the
mount process, this can result in a hang without the
_all version of the completion.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-07-26 17:18:37 +0800

4e34e719e fs: take the ACL checks to common code

Replace the ->check_acl method with a ->get_acl method that simply reads an
ACL from disk after having a cache miss. This means we can replace the ACL
checking boilerplate code with a single implementation in namei.c.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro
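
A rough userspace sketch of the idea follows: one common check consults the cached ACL and calls the filesystem's ->get_acl only on a cache miss. All types, names, and the sentinel here are hypothetical stand-ins, and storing the result in the generic code is a simplification of this sketch rather than a statement about where the kernel populates its cache.

```c
#include <errno.h>
#include <stddef.h>

struct acl_sketch {
    int allow_mask;                 /* permission bits this ACL grants */
};

/* Hypothetical sentinel meaning "nothing cached yet". */
#define ACL_NOT_CACHED_SKETCH ((struct acl_sketch *)-1)

struct inode_sketch {
    struct acl_sketch *cached_acl;
    struct acl_sketch *(*get_acl)(struct inode_sketch *);
    int disk_reads;                 /* counts simulated reads from disk */
};

/* Example filesystem method: "reads" an ACL granting read and execute. */
static struct acl_sketch *demo_get_acl(struct inode_sketch *inode)
{
    static struct acl_sketch on_disk = { 4 | 1 };

    inode->disk_reads++;
    return &on_disk;
}

/* Single common implementation shared by every filesystem. */
static int check_acl_sketch(struct inode_sketch *inode, int mask)
{
    if (inode->cached_acl == ACL_NOT_CACHED_SKETCH) {
        if (!inode->get_acl)
            return -EAGAIN;         /* no ACL support: use the mode bits */
        inode->cached_acl = inode->get_acl(inode);
    }
    if (!inode->cached_acl)
        return -EAGAIN;             /* no ACL on this inode */
    return (inode->cached_acl->allow_mask & mask) == mask ? 0 : -EACCES;
}
```

With this shape a filesystem supplies only the disk read; the cache lookup and the permission comparison live in one place.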

Christoph Hellwig
2011-07-26 02:30:23 +0800

826cae2f2 kill boilerplates around posix_acl_create_masq()

new helper: posix_acl_create(&acl, gfp, mode_p). Replaces acl with
modified clone, on failure releases acl and replaces with NULL.
Returns 0 or -ve on error. All callers of posix_acl_create_masq()
switched.

Signed-off-by: Al Viro
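
The contract stated above (replace the ACL with a modified clone, or release it and leave NULL on failure) can be sketched in userspace as follows. The struct, the `valid` field, and `masq_sketch()` are hypothetical stand-ins; malloc()/free() take the place of the kernel's clone and release helpers.

```c
#include <errno.h>
#include <stdlib.h>

struct acl_sketch {
    unsigned short perm;   /* simplified permission bits */
    int valid;             /* hypothetical: 0 makes the masq step fail */
};

/* Stand-in for the masq step: trims perm by *mode_p and reports it back. */
static int masq_sketch(struct acl_sketch *acl, unsigned short *mode_p)
{
    if (!acl->valid)
        return -EIO;
    acl->perm &= *mode_p;
    *mode_p = acl->perm;
    return 0;
}

/* Replaces *aclp with a modified clone; on failure *aclp becomes NULL. */
static int acl_create_sketch(struct acl_sketch **aclp, unsigned short *mode_p)
{
    struct acl_sketch *clone = malloc(sizeof(*clone));
    int err = -ENOMEM;

    if (clone) {
        *clone = **aclp;
        err = masq_sketch(clone, mode_p);
        if (err < 0) {
            free(clone);
            clone = NULL;
        }
    }
    free(*aclp);           /* the original is always released */
    *aclp = clone;
    return err;
}
```

A caller then needs only one error check and never has to release the original ACL itself, which is the boilerplate the commit removes.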

Al Viro
2011-07-26 02:27:32 +0800

bc26ab5f6 kill boilerplate around posix_acl_chmod_masq()

new helper: posix_acl_chmod(&acl, gfp, mode). Replaces acl with modified
clone or with NULL if that has failed; returns 0 or -ve on error. All
callers of posix_acl_chmod_masq() switched to that - they'd been doing
exactly the same thing.

Signed-off-by: Al Viro

Al Viro
2011-07-26 02:27:30 +0800

e77819e57 vfs: move ACL cache lookup into generic code

This moves logic for checking the cached ACL values from low-level
filesystems into generic code. The end result is a streamlined ACL
check that doesn't need to load the inode->i_op->check_acl pointer at
all for the common cached case.

The filesystems also don't need to check for a non-blocking RCU walk
case in their acl_check() functions, because that is all handled at a
VFS layer.

Signed-off-by: Linus Torvalds
Signed-off-by: Al Viro

Linus Torvalds
2011-07-26 02:23:39 +0800

23 Jul, 2011

1 commit

bbd9d6f7f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
isofs: Remove global fs lock
jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
mm/truncate.c: fix build for CONFIG_BLOCK not enabled
fs:update the NOTE of the file_operations structure
Remove dead code in dget_parent()
AFS: Fix silly characters in a comment
switch d_add_ci() to d_splice_alias() in "found negative" case as well
simplify gfs2_lookup()
jfs_lookup(): don't bother with . or ..
get rid of useless dget_parent() in btrfs rename() and link()
get rid of useless dget_parent() in fs/btrfs/ioctl.c
fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
drivers: fix up various ->llseek() implementations
fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
Ext4: handle SEEK_HOLE/SEEK_DATA generically
Btrfs: implement our own ->llseek
fs: add SEEK_HOLE and SEEK_DATA flags
reiserfs: make reiserfs default to barrier=flush
...

Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.

Linus Torvalds
2011-07-23 10:02:39 +0800

21 Jul, 2011

3 commits

6c673ab39 simplify gfs2_lookup()

d_splice_alias() will DTRT when given NULL or ERR_PTR

Signed-off-by: Al Viro
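
For readers unfamiliar with the convention, the sketch below models error pointers in userspace and shows a splice-like helper that handles NULL and error pointers itself, so the caller needs no special cases. The helper names and structs are hypothetical; only the idea of encoding a small negative errno in a pointer value is taken from the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct inode_sketch { int ino; };
struct dentry_sketch { struct inode_sketch *inode; };

/*
 * Hypothetical splice helper: passes an error pointer straight through,
 * and treats a NULL inode as "negative dentry". Returns NULL on success.
 */
static struct dentry_sketch *splice_sketch(struct inode_sketch *inode,
                                           struct dentry_sketch *dentry)
{
    if (is_err(inode))
        return err_ptr(ptr_err(inode));
    dentry->inode = inode;
    return NULL;
}
```

Because the helper copes with all three inputs, a lookup routine can hand it whatever the inode lookup returned and return the result directly.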

Al Viro
2011-07-21 08:48:02 +0800

02c24a821 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers

Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2. For correctness' sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara
Signed-off-by: Josef Bacik
Signed-off-by: Al Viro

Josef Bacik
2011-07-21 08:47:59 +0800

562c72aa5 fs: move inode_dio_wait calls into ->setattr

Let filesystems handle waiting for direct I/O requests themselves instead
of doing it beforehand. This means filesystem-specific locks to prevent
new dio references from appearing can be held. This is important to allow
generalizing i_dio_count to non-DIO_LOCKING filesystems.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2011-07-21 08:47:47 +0800

20 Jul, 2011

5 commits

10556cb21 ->permission() sanitizing: don't pass flags to ->permission()

not used by the instances anymore.

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:24 +0800

2830ba7f3 ->permission() sanitizing: don't pass flags to generic_permission()

redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of
them removes that bit.

Signed-off-by: Al Viro
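
To illustrate why a separate flags argument is redundant, the sketch below carries the "must not block" request as one more bit in the mask. The constant values, the `_SKETCH` names, and the `acl_cached` parameter are assumptions made for this example only.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative values; treat them as assumptions of this sketch. */
#define MAY_EXEC_SKETCH      0x01
#define MAY_WRITE_SKETCH     0x02
#define MAY_READ_SKETCH      0x04
#define MAY_NOT_BLOCK_SKETCH 0x80

/*
 * Hypothetical permission check: if answering would require blocking
 * (here: the ACL is not cached) and the caller set the non-blocking bit,
 * bail out so the caller can retry in blocking mode.
 */
static int permission_sketch(int mask, bool acl_cached)
{
    if (!acl_cached && (mask & MAY_NOT_BLOCK_SKETCH))
        return -ECHILD;
    return 0;
}
```

Since the bit travels inside the mask, every callee already has the information and no extra parameter is needed.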

Al Viro
2011-07-20 13:43:22 +0800

7e40145eb ->permission() sanitizing: don't pass flags to ->check_acl()

not used in the instances anymore.

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:21 +0800

9c2c70392 ->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl()

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:19 +0800

178ea7352 kill check_acl callback of generic_permission()

its value depends only on inode and does not change; we might as
well store it in ->i_op->check_acl and be done with that.

Signed-off-by: Al Viro

Al Viro
2011-07-20 13:43:16 +0800

15 Jul, 2011

4 commits

46fcb2ed2 GFS2: combine duplicated block freeing routines

__gfs2_free_data and __gfs2_free_meta are almost identical, and
can be trivially combined.

[This is as per Eric's original patch minus gfs2_free_data() which had
no callers left and plus the conversion of the bmap.c calls to these
functions. All in all, a nice clean up]

Signed-off-by: Eric Sandeen
Signed-off-by: Steven Whitehouse

Eric Sandeen
2011-07-15 16:32:52 +0800

9964afbb7 GFS2: Add S_NOSEC support

This adds S_NOSEC support to GFS2. We set/reset the flag either when
a user calls setattr or when we have just regained the glock
from another node. The flag is only set if there are no xattrs
on the inode and there is no suid bit set.

Signed-off-by: Steven Whitehouse
Reviewed-by: Andi Kleen
Cc: Al Viro

Steven Whitehouse
2011-07-15 16:32:35 +0800

7cf8dcd3b GFS2: Automatically adjust glock min hold time

This patch is a performance improvement for GFS2 in a clustered
environment. It makes the glock hold time self-adjusting.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2011-07-15 16:32:11 +0800

17d539f04 GFS2: Cache dir hash table in a contiguous buffer

This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-07-15 16:31:48 +0800

14 Jul, 2011

1 commit

380f7c65a GFS2: Resolve inode eviction and ail list interaction bug

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock that's added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.

Signed-off-by: Steven Whitehouse
Tested-by: Bob Peterson
Tested-by: Abhijith Das
Reported-by: Barry J. Marson
Reported-by: David Teigland

Steven Whitehouse
2011-07-14 15:59:44 +0800

12 Jul, 2011

2 commits

3942ae531 GFS2: Fix race during filesystem mount

There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.

This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.

There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.

Signed-off-by: Steven Whitehouse
Reported-by: David Booher

Steven Whitehouse
2011-07-12 16:15:46 +0800

1ce533686 GFS2: force a log flush when invalidating the rindex glock

Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2011-07-12 16:15:24 +0800

08 Jun, 2011

1 commit

d205df995 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Processes waiting on inode glock that no processes are holding

Linus Torvalds
2011-06-08 09:44:10 +0800

27 May, 2011

1 commit

b7c2f0362 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6

* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
gfs2: Drop __TIME__ usage
isdn/diva: Drop __TIME__ usage
atm: Drop __TIME__ usage
dlm: Drop __TIME__ usage
wan/pc300: Drop __TIME__ usage
parport: Drop __TIME__ usage
hdlcdrv: Drop __TIME__ usage
baycom: Drop __TIME__ usage
pmcraid: Drop __DATE__ usage
edac: Drop __DATE__ usage
rio: Drop __DATE__ usage
scsi/wd33c93: Drop __TIME__ usage
scsi/in2000: Drop __TIME__ usage
aacraid: Drop __TIME__ usage
media/cx231xx: Drop __TIME__ usage
media/radio-maxiradio: Drop __TIME__ usage
nozomi: Drop __TIME__ usage
cyclades: Drop __TIME__ usage

Linus Torvalds
2011-05-27 04:19:00 +0800

26 May, 2011

1 commit

8d2c50e3b gfs2: Drop __TIME__ usage

The kernel already prints its build timestamp during boot, no need to
repeat it in random drivers and produce different object files each
time.

Cc: Steven Whitehouse
Cc: cluster-devel@redhat.com
Signed-off-by: Michal Marek

Michal Marek
2011-05-26 16:54:37 +0800

25 May, 2011

2 commits

1495f230f vmscan: change shrinker API by passing shrink_control struct

Change each shrinker's API by consolidating the existing parameters into
shrink_control struct. This will simplify any further features added w/o
touching each file of shrinker.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han
Cc: KOSAKI Motohiro
Cc: Minchan Kim
Acked-by: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Acked-by: Rik van Riel
Cc: Johannes Weiner
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Steven Whitehouse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
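
The API shape described here, with the parameters bundled into one struct so that new fields can be added without touching every callback, might look roughly like the sketch below. The struct layout, the cache type, and the callback's return convention are simplified assumptions of this example.

```c
/* Parameters that used to be passed individually, gathered in one place. */
struct shrink_control_sketch {
    unsigned int gfp_mask;
    unsigned long nr_to_scan;
};

struct cache_sketch {
    unsigned long nr_objects;
};

/*
 * Hypothetical shrinker callback: frees up to sc->nr_to_scan objects and
 * returns how many remain. nr_to_scan == 0 acts as a pure query.
 */
static unsigned long cache_shrink(struct cache_sketch *cache,
                                  const struct shrink_control_sketch *sc)
{
    unsigned long n = sc->nr_to_scan < cache->nr_objects
                          ? sc->nr_to_scan
                          : cache->nr_objects;

    cache->nr_objects -= n;
    return cache->nr_objects;
}
```

Adding a new field to the struct later leaves the callback signature, and therefore every existing shrinker, unchanged.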

Ying Han
2011-05-25 23:39:26 +0800

f90e5b5b1 GFS2: Processes waiting on inode glock that no processes are holding

This patch fixes a race in the GFS2 glock state machine that may
result in lockups. The symptom is that all nodes but one will
hang, waiting for a particular glock. All the holder records
will have the "W" (Waiting) bit set. The other node will
typically have the glock stuck in Exclusive mode (EX) with no
holder records, but the dinode will be cached. In other words,
an entry with "I:" will appear in the glock dump for that glock,
but nothing else.

The race has to do with the glock "Pending Demote" bit, which
can be set, then immediately reset, thus losing the fact that
another node needs the glock. The sequence of events is:

1. Something schedules the glock workqueue (e.g. glock request from fs)
2. The glock workqueue gets to the point between the test of the reply pending
bit and the spin lock:

    if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) {
        finish_xmote(gl, gl->gl_reply);
        drop_ref = 1;
    }
    down_read(&gfs2_umount_flush_sem);    <---- i.e. here
    spin_lock(&gl->gl_spin);

3. In comes (a) the reply to our EX lock request setting GLF_REPLY_PENDING and
(b) the demote request which sets GLF_PENDING_DEMOTE

4. The following test is executed:

    if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
        gl->gl_state != LM_ST_UNLOCKED &&
        gl->gl_demote_state != LM_ST_EXCLUSIVE) {

This resets the pending demote flag, and gl->gl_demote_state is not equal to
exclusive, however because the reply from the dlm arrived after we checked for
the GLF_REPLY_PENDING flag, gl->gl_state is still equal to unlocked, so
although we reset the GLF_PENDING_DEMOTE flag, we didn't then set the
GLF_DEMOTE flag or reinstate the GLF_PENDING_DEMOTE flag.

The patch closes the timing window by only transitioning the
"Pending demote" bit to the "demote" flag once we know the
other conditions (not unlocked and not exclusive) are met.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse
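
The difference between the old test and the fix can be shown with a single-threaded model. This is a sketch only: the state names, flag bits, and function names are hypothetical, and plain bit operations stand in for the kernel's atomic bitops.

```c
#include <stdbool.h>

enum { ST_UNLOCKED, ST_SHARED, ST_EXCLUSIVE };   /* hypothetical states */

#define F_PENDING_DEMOTE (1u << 0)
#define F_DEMOTE         (1u << 1)

struct glock_sketch {
    unsigned int flags;
    int state;
    int demote_state;
};

/* Old behaviour: the pending bit is cleared before the other checks. */
static void promote_demote_buggy(struct glock_sketch *gl)
{
    bool was_set = (gl->flags & F_PENDING_DEMOTE) != 0;

    gl->flags &= ~F_PENDING_DEMOTE;
    if (was_set && gl->state != ST_UNLOCKED &&
        gl->demote_state != ST_EXCLUSIVE)
        gl->flags |= F_DEMOTE;
}

/* Fixed behaviour: only consume the pending bit once all conditions hold. */
static void promote_demote_fixed(struct glock_sketch *gl)
{
    if ((gl->flags & F_PENDING_DEMOTE) && gl->state != ST_UNLOCKED &&
        gl->demote_state != ST_EXCLUSIVE) {
        gl->flags &= ~F_PENDING_DEMOTE;
        gl->flags |= F_DEMOTE;
    }
}
```

If the glock is still unlocked when the check runs, the buggy variant leaves no flag set at all, whereas the fixed variant keeps the pending bit so that a later pass can still act on the demote request.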

Bob Peterson
2011-05-25 17:37:11 +0800

22 May, 2011

1 commit

26b06a695 GFS2: Wait properly when flushing the ail list

The ail flush code has always relied upon log flushing to prevent
it from spinning needlessly. This fixes it to wait on the last
I/O request submitted (we don't need to wait for all of it)
instead of either spinning with io_schedule or sleeping.

As a result cpu usage of gfs2_logd is much reduced with certain
workloads.

Reported-by: Abhijith Das
Tested-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-22 02:21:07 +0800

21 May, 2011

2 commits

6d3117b41 GFS2: Wipe directory hash table metadata when deallocating a directory

The deallocation code for directories in GFS2 is largely divided into
two parts. The first part deallocates any directory leaf blocks and
marks the directory as being a regular file when that is complete. The
second stage was identical to deallocating regular files.

Regular files have their data blocks in a different
address space to directories, and thus what would have been normal data
blocks in a regular file (the hash table in a GFS2 directory) were
deallocated correctly. However, a reference to these blocks was left in the
journal (assuming of course that some previous activity had resulted in
those blocks being in the journal or ail list).

This patch uses the i_depth as a test of whether the inode is an
exhash directory (we cannot test the inode type as that has already
been changed to a regular file at this stage in deallocation).

The original issue was reported by Chris Hertel as an issue he encountered
running bonnie++.

Reported-by: Christopher R. Hertel
Cc: Abhijith Das
Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-21 21:05:58 +0800

6c1b8d94b Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (32 commits)
GFS2: Move all locking inside the inode creation function
GFS2: Clean up symlink creation
GFS2: Clean up mkdir
GFS2: Use UUID field in generic superblock
GFS2: Rename ops_inode.c to inode.c
GFS2: Inode.c is empty now, remove it
GFS2: Move final part of inode.c into super.c
GFS2: Move most of the remaining inode.c into ops_inode.c
GFS2: Move gfs2_refresh_inode() and friends into glops.c
GFS2: Remove gfs2_dinode_print() function
GFS2: When adding a new dir entry, inc link count if it is a subdir
GFS2: Make gfs2_dir_del update link count when required
GFS2: Don't use gfs2_change_nlink in link syscall
GFS2: Don't use a try lock when promoting to a higher mode
GFS2: Double check link count under glock
GFS2: Improve bug trap code in ->releasepage()
GFS2: Fix ail list traversal
GFS2: make sure fallocate bytes is a multiple of blksize
GFS2: Add an AIL writeback tracepoint
GFS2: Make writeback more responsive to system conditions
...

Linus Torvalds
2011-05-21 04:28:45 +0800

13 May, 2011

3 commits

f2741d989 GFS2: Move all locking inside the inode creation function

Now that there are no longer any exceptions to the normal inode
creation code path, we can move the parts of the locking code
which were duplicated in mkdir/mknod/create/symlink into the
inode create function.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-13 19:11:17 +0800

160b4026d GFS2: Clean up symlink creation

This moves the symlink specific parts of inode creation
into the function where we initialise the rest of the
dinode. As a result we have one less place where we need
to look up the inode's buffer.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-13 17:34:59 +0800

e2d0a13bb GFS2: Clean up mkdir

This moves the initialisation of the directory into the inode
creation functions to avoid having to duplicate the lookup
of the inode's buffer.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-13 16:55:55 +0800

10 May, 2011

1 commit

32e471ef1 GFS2: Use UUID field in generic superblock

The VFS superblock structure now has a UUID field, so we can use that
in preference to the UUID field in the GFS2 superblock now.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2011-05-10 22:01:59 +0800