Eric Lee / smarc-fsl-linux-kernel

08 Mar, 2010

2 commits

52cf25d0a Driver core: Constify struct sysfs_ops in struct kobj_type

Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
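
As an illustration of the pattern the list above describes, here is a minimal userspace sketch of a shared, read-only ops table. All names (`demo_ops`, `demo_type`, `demo_call_a`, and so on) are hypothetical stand-ins and are not the kernel's `struct sysfs_ops` or `struct kobj_type` definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ops table such as struct sysfs_ops. */
struct demo_ops {
    int (*show)(int value);
};

static int demo_show(int value)
{
    return value * 2;
}

/* const: one shared table that the toolchain is free to place in .rodata. */
static const struct demo_ops demo_shared_ops = {
    .show = demo_show,
};

/* Hypothetical stand-in for a type descriptor such as struct kobj_type. */
struct demo_type {
    const struct demo_ops *ops; /* pointer-to-const: no writes through it */
};

/* Two descriptors referencing the same read-only table. */
static const struct demo_type demo_type_a = { .ops = &demo_shared_ops };
static const struct demo_type demo_type_b = { .ops = &demo_shared_ops };

/* Dispatch through the shared table. */
static int demo_call(const struct demo_type *type, int value)
{
    return type->ops->show(value);
}

int demo_call_a(int value) { return demo_call(&demo_type_a, value); }
int demo_call_b(int value) { return demo_call(&demo_type_b, value); }
```

With the `const` qualifiers in place, an assignment such as `demo_type_a.ops->show = NULL;` is rejected at compile time, which is the "accidental modification" case mentioned above.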

Signed-off-by: Emese Revfy
Acked-by: David Teigland
Acked-by: Matt Domsch
Acked-by: Maciej Sosnowski
Acked-by: Hans J. Koch
Acked-by: Pekka Enberg
Acked-by: Jens Axboe
Acked-by: Stephen Hemminger
Signed-off-by: Greg Kroah-Hartman

Emese Revfy
2010-03-08 09:04:49 +0800
9cd43611c kobject: Constify struct kset_uevent_ops

Constify struct kset_uevent_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing

Signed-off-by: Emese Revfy
Signed-off-by: Greg Kroah-Hartman

Emese Revfy
2010-03-08 09:04:49 +0800

06 Mar, 2010

2 commits

e213e26ab Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...

Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

Linus Torvalds
2010-03-06 05:20:53 +0800
a9185b41a pass writeback_control to ->write_inode

This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by being able to
distinguish between the different callers in more detail.
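
A rough sketch of what "more information" can mean for a filesystem: the callback can look at the control structure and decide whether it has to wait for the write to complete. The types below are simplified, hypothetical stand-ins for `struct inode` and `struct writeback_control`; only the idea of a sync mode that separates data-integrity callers from background writeback is taken from the kernel.

```c
/* Hypothetical stand-ins; the kernel's sync modes are WB_SYNC_NONE and WB_SYNC_ALL. */
enum demo_sync_mode { DEMO_WB_SYNC_NONE, DEMO_WB_SYNC_ALL };

struct demo_writeback_control {
    enum demo_sync_mode sync_mode;
};

struct demo_inode {
    int dirty;
};

/*
 * Write the inode back. Returns 1 if the caller asked for data integrity
 * (so the filesystem would wait for completion), 0 for background writeback.
 */
int demo_write_inode(struct demo_inode *inode,
                     const struct demo_writeback_control *wbc)
{
    inode->dirty = 0;
    return wbc->sync_mode == DEMO_WB_SYNC_ALL;
}

/* Convenience wrapper: flush one dirty inode under the given mode. */
int demo_flush(enum demo_sync_mode mode)
{
    struct demo_inode inode = { .dirty = 1 };
    struct demo_writeback_control wbc = { .sync_mode = mode };

    return demo_write_inode(&inode, &wbc);
}
```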

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-03-06 02:25:52 +0800

05 Mar, 2010

2 commits

5fb324ad2 quota: move code from sync_quota_sb into vfs_quota_sync

Currently sync_quota_sb does a lot of sync and truncate action that only
applies to "VFS" style quotas and is actively harmful for the sync
performance in XFS. Move it into vfs_quota_sync and add a wait parameter
to ->quota_sync to tell if we need it or not.

My audit of the GFS2 code says it's also not needed given the way GFS2
implements quotas, but I'd be happy if this can get a detailed review.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2010-03-05 07:20:24 +0800
0f2cc4ecd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits)
init: Open /dev/console from rootfs
mqueue: fix typo "failues" -> "failures"
mqueue: only set error codes if they are really necessary
mqueue: simplify do_open() error handling
mqueue: apply mathematics distributivity on mq_bytes calculation
mqueue: remove unneeded info->messages initialization
mqueue: fix mq_open() file descriptor leak on user-space processes
fix race in d_splice_alias()
set S_DEAD on unlink() and non-directory rename() victims
vfs: add NOFOLLOW flag to umount(2)
get rid of ->mnt_parent in tomoyo/realpath
hppfs can use existing proc_mnt, no need for do_kern_mount() in there
Mirror MS_KERNMOUNT in ->mnt_flags
get rid of useless vfsmount_lock use in put_mnt_ns()
Take vfsmount_lock to fs/internal.h
get rid of insanity with namespace roots in tomoyo
take check for new events in namespace (guts of mounts_poll()) to namespace.c
Don't mess with generic_permission() under ->d_lock in hpfs
sanitize const/signedness for udf
nilfs: sanitize const/signedness in dealing with ->d_name.name
...

Fix up fairly trivial (famous last words...) conflicts in
drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c

Linus Torvalds
2010-03-05 00:15:33 +0800

04 Mar, 2010

1 commit

c177c2ac8 Switch gfs2 to nd_set_link()

Signed-off-by: Al Viro

Al Viro
2010-03-04 02:00:22 +0800

01 Mar, 2010

4 commits

4818972ef GFS2: print glock numbers in hex

This patch changes glock numbers from printing in decimal to hex.
Since DLM prints corresponding resource IDs in hex, it makes debugging
easier.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2010-03-01 22:09:04 +0800
e5884636d GFS2: ordered writes are backwards

When we queue data buffers for ordered write, the buffers are added
to the head of the ordered write list. When the log needs to push
these buffers to disk, it also walks the list from the head. The
result is that the ordered buffers are submitted to disk in
reverse order.

For large writes, this means that whenever the log flushes, large
streams of reverse sequential order buffers are pushed down into the
block layers. The elevators don't handle this particularly well, so
IO rates tend to be significantly lower than if the IO was issued in
ascending block order.

Queue new ordered buffers to the tail of the ordered buffer list to
ensure that IO is dispatched in the order it was submitted. This
should significantly improve large sequential write speeds. On a
disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for
noop and from 38MB/s to 50MB/s for cfq.
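
The effect of queueing at the head versus the tail can be sketched in userspace with a small circular doubly linked list. This is an illustrative model only: the `demo_*` names are hypothetical and the list is a simplified imitation of the kernel's `list_head`, not GFS2's actual ordered-write code.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct demo_list {
    struct demo_list *prev, *next;
};

struct demo_buf {
    struct demo_list link; /* kept as first member so demo_walk() can cast */
    unsigned long block;
};

#define DEMO_MAX 8

static void demo_list_init(struct demo_list *head)
{
    head->prev = head->next = head;
}

static void demo_insert(struct demo_list *item, struct demo_list *prev,
                        struct demo_list *next)
{
    item->prev = prev;
    item->next = next;
    prev->next = item;
    next->prev = item;
}

/* Queue at the head: the behaviour the commit message describes as the bug. */
static void demo_add_head(struct demo_list *item, struct demo_list *head)
{
    demo_insert(item, head, head->next);
}

/* Queue at the tail: the behaviour after the fix. */
static void demo_add_tail(struct demo_list *item, struct demo_list *head)
{
    demo_insert(item, head->prev, head);
}

/* Walk from the head, recording block numbers in submission order. */
static size_t demo_walk(const struct demo_list *head, unsigned long *out,
                        size_t max)
{
    size_t n = 0;

    for (const struct demo_list *p = head->next; p != head && n < max; p = p->next)
        out[n++] = ((const struct demo_buf *)p)->block;
    return n;
}

/* Queue blocks 0..DEMO_MAX-1 in ascending order; return the first one submitted. */
unsigned long demo_first_submitted(int use_tail)
{
    struct demo_list head;
    struct demo_buf bufs[DEMO_MAX];
    unsigned long order[DEMO_MAX];

    demo_list_init(&head);
    for (unsigned long i = 0; i < DEMO_MAX; i++) {
        bufs[i].block = i;
        if (use_tail)
            demo_add_tail(&bufs[i].link, &head);
        else
            demo_add_head(&bufs[i].link, &head);
    }
    demo_walk(&head, order, DEMO_MAX);
    return order[0];
}
```

With head insertion the walk starts at the most recently queued block (7 here), so the blocks go out in descending order. With tail insertion the walk starts at block 0 and proceeds in ascending order.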

Signed-off-by: Dave Chinner
Signed-off-by: Steven Whitehouse

Dave Chinner
2010-03-01 22:08:26 +0800
c1184f8ab GFS2: Remove loopy umount code

As a consequence of the previous patch, we can now remove the
loop which used to be required due to the circular dependency
between the inodes and glocks. Instead we can just invalidate
the inodes, and then clear up any glocks which are left.

Also we no longer need the rwsem since there is no longer any
danger of the inode invalidation calling back into the glock
code (and from there back into the inode code).

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-03-01 22:07:53 +0800
009d85183 GFS2: Metadata address space clean up

Since the start of GFS2, an "extra" inode has been used to store
the metadata belonging to each inode. The only reason for using
this inode was to have an extra address space, the other fields
were unused. This means that the memory usage was rather inefficient.

The reason for keeping each inode's metadata in a separate address
space is that when glocks are requested on remote nodes, we need to
be able to efficiently locate the data and metadata relating
to that glock (inode) in order to sync or sync and invalidate it
(depending on the remotely requested lock mode).

This patch adds a new type of glock which, in addition to
its normal fields, has an address space. This applies to all
inode and rgrp glocks (but to no other glock types which remain
as before). As a result, we no longer need to have the second
inode.

This results in three major improvements:
1. A saving of approx 25% of memory used in caching inodes
2. A removal of the circular dependency between inodes and glocks
3. No confusion between "normal" and "metadata" inodes in super.c

Although the first of these is the more immediately apparent, the
second is just as important as it now enables a number of clean
ups at umount time. Those will be the subject of future patches.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-03-01 22:07:37 +0800

12 Feb, 2010

2 commits

07ccb7bf2 GFS2: Fix bmap allocation corner-case bug

This patch solves a corner case during allocation which occurs if both
metadata (indirect) and data blocks are required but there is an
obstacle in the filesystem (e.g. a resource group header or another
allocated block) such that when the allocation is requested only
enough blocks for the metadata are returned.

By changing the exit condition of this loop, we ensure that a
minimum of one data block will always be returned.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-02-12 18:16:14 +0800
0e5a9fb04 GFS2: Fix error code

We need this one-liner to signal the mount helper of the 'insufficient journals' condition.

Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Abhijith Das
2010-02-12 18:15:51 +0800

03 Feb, 2010

2 commits

8f05228ee GFS2: Extend umount wait coverage to full glock lifetime

Although all glocks are, by the time of the umount glock wait,
scheduled for demotion, some of them haven't made it far
enough through the process for the original set of waiting
code to wait for them.

This extends the ref count to the whole glock lifetime in order
to ensure that the waiting does catch all glocks. It does make
it a bit more invasive, but it seems the only sensible solution
at the moment.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-02-03 17:56:21 +0800
e402746a9 GFS2: Wait for unlock completion on umount

This patch adds a wait on umount between the point at which we
dispose of all glocks and the point at which we unmount the
lock protocol. This ensures that we've received all the replies
to our unlock requests before we stop the locking.

Signed-off-by: Steven Whitehouse
Reported-by: Fabio M. Di Nitto

Steven Whitehouse
2010-02-03 17:47:04 +0800

01 Feb, 2010

3 commits

ea8d62dad GFS2: Use GFP_NOFS for alloc structure

This is called under a glock, so it's a good plan to use GFP_NOFS.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-02-01 18:01:34 +0800
7fe3ec6fe GFS2: Fix previous patch

The do_div() call needs to remain.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-02-01 18:00:23 +0800
55f0b4c54 GFS2: Don't withdraw on partial rindex entries

Since gfs2 writes the rindex file a block at a time, and releases the
exclusive lock after each block, it is possible that another process
will grab the lock in the middle of the write. Since rindex entries are
not an even divisor of blocks, that other process may see partial
entries. On grows, this is fine. The process can simply ignore the
partial entries. Previously, the code withdrew when it saw partial
entries. Now it simply ignores them.
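
The arithmetic behind "ignore the partial entries" can be sketched as follows. This is a hedged illustration with hypothetical helper names; the entry size and block size are passed in as parameters, and the 96-byte entry and 4096-byte block used in the note below are assumptions chosen for the example, not values taken from this commit.

```c
#include <stdint.h>

/* Number of whole entries contained in the bytes that are visible so far. */
uint64_t demo_complete_entries(uint64_t visible_bytes, uint64_t entry_size)
{
    if (entry_size == 0)
        return 0;
    return visible_bytes / entry_size;
}

/* Bytes of a trailing partial entry, which a reader would simply skip. */
uint64_t demo_partial_bytes(uint64_t visible_bytes, uint64_t entry_size)
{
    if (entry_size == 0)
        return 0;
    return visible_bytes % entry_size;
}
```

For example, if one 4096-byte block has been written and entries are assumed to be 96 bytes each, a reader sees 42 complete entries followed by 64 bytes of an unfinished one. Rounding down to 42 is safe because the writer appends the remainder in a later block.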

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2010-02-01 17:59:54 +0800

12 Jan, 2010

1 commit

0f585f14d GFS2: Fix refcnt leak on gfs2_follow_link() error path

If the ->follow_link handler returns an error, it should decrement
the nd->path refcnt.

This patch fixes it.

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Steven Whitehouse

OGAWA Hirofumi
2010-01-12 17:30:15 +0800

11 Jan, 2010

1 commit

ba198098a GFS2: Use MAX_LFS_FILESIZE for meta inode size

Using ~0ULL was causing sign issues in filemap_fdatawrite_range, so
use MAX_LFS_FILESIZE instead.
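
The sign issue can be reproduced in a few lines of userspace C. In this sketch `demo_loff_t` is a hypothetical stand-in for the kernel's signed 64-bit `loff_t`, and `INT64_MAX` stands in for `MAX_LFS_FILESIZE` as it would be on a 64-bit build; both are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for loff_t, a signed 64-bit file offset. */
typedef int64_t demo_loff_t;

/*
 * Returns 1 if an unsigned size becomes a negative offset once it is stored
 * in a signed offset type. Converting an out-of-range value is
 * implementation-defined in C; on common two's-complement ABIs, ~0ULL
 * becomes -1.
 */
int demo_end_is_negative(uint64_t size)
{
    demo_loff_t end = (demo_loff_t)size;

    return end < 0;
}
```

An all-ones size therefore turns into a negative "end" offset as soon as it reaches code that takes signed offsets, while the largest positive signed value does not.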

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-01-11 16:57:55 +0800

08 Jan, 2010

3 commits

e412bdb12 GFS2: Fix gfs2_xattr_acl_chmod()

The ref counting for the bh returned by gfs2_ea_find() was
wrong. This patch ensures that we always drop the ref count
to that bh correctly.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-01-08 21:42:59 +0800
24b977b5f GFS2: Fix locking bug in rename

The rename code was taking a resource group lock in cases where
it wasn't actually needed. This caused problems if the rename
resulted in an inode being unlinked. The patch ensures that
we only take the rgrp lock early if it is really needed.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-01-08 21:42:42 +0800
56aa616a0 GFS2: Ensure uptodate inode size when using O_APPEND

The VFS reads the inode size during generic_file_aio_write() but
with no locking around it. In order to get the expected result
from O_APPEND opens, this patch updates the inode size before
calling generic_file_aio_write().

There is of course still a race here, in that there is nothing to
prevent another node coming in and extending the file in the
mean time. On the other hand, when used with file locking this
will ensure that the expected results are obtained.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2010-01-08 21:42:27 +0800

18 Dec, 2009

2 commits

b6e3224fb Revert "task_struct: make journal_info conditional"

This reverts commit e4c570c4cb7a95dbfafa3d016d2739bf3fdfe319, as
requested by Alexey:

"I think I gave a good enough arguments to not merge it.
To iterate:
* patch makes impossible to start using ext3 on EXT3_FS=n kernels
without reboot.
* this is done only for one pointer on task_struct"

None of config options which define task_struct are tristate directly
or effectively."

Requested-by: Alexey Dobriyan
Acked-by: Andrew Morton
Signed-off-by: Linus Torvalds

Linus Torvalds
2009-12-18 05:23:24 +0800
eaff8079d kill I_LOCK

After I_SYNC was split from I_LOCK the leftover is always used together with
I_NEW and thus superfluous.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2009-12-18 00:03:25 +0800

17 Dec, 2009

1 commit

431547b3c sanitize xattr handler prototypes

Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods. This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute. With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.
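
A minimal sketch of the idea of sharing one method between several handlers by passing a per-handler flags value. The structure and function names here are hypothetical and much simpler than the kernel's `struct xattr_handler`; only the two POSIX ACL attribute names are real.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ACL type identifiers, used as the per-handler flags value. */
enum { DEMO_ACL_ACCESS, DEMO_ACL_DEFAULT, DEMO_ACL_COUNT };

/* Hypothetical per-inode storage: one value per ACL type. */
static const int demo_acl_values[DEMO_ACL_COUNT] = { 10, 20 };

struct demo_xattr_handler {
    const char *name;
    int flags;                     /* handed to every method call */
    int (*get)(int handler_flags);
};

/* One method serves both handlers; the flags select the underlying attribute. */
static int demo_acl_get(int handler_flags)
{
    return demo_acl_values[handler_flags];
}

static const struct demo_xattr_handler demo_handlers[] = {
    { "system.posix_acl_access",  DEMO_ACL_ACCESS,  demo_acl_get },
    { "system.posix_acl_default", DEMO_ACL_DEFAULT, demo_acl_get },
};

/* Find the handler for an attribute name and call it with its own flags. */
int demo_getxattr(const char *name)
{
    for (size_t i = 0; i < sizeof(demo_handlers) / sizeof(demo_handlers[0]); i++) {
        const struct demo_xattr_handler *h = &demo_handlers[i];

        if (strcmp(name, h->name) == 0)
            return h->get(h->flags);
    }
    return -1; /* no handler registered for this name */
}
```

Without the flags argument, each attribute would need its own near-identical `get` function, which is the duplication the commit message describes removing.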

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechanism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse]

Signed-off-by: Christoph Hellwig
Reviewed-by: James Morris
Acked-by: Joel Becker
Signed-off-by: Al Viro

Christoph Hellwig
2009-12-17 01:16:49 +0800

16 Dec, 2009

2 commits

f0b34ae63 fs/gfs2/sys.c: use %pUB to print UUIDs

Signed-off-by: Joe Perches
Cc: Steven Whitehouse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2009-12-16 00:53:33 +0800
e4c570c4c task_struct: make journal_info conditional

journal_info in task_struct is used in journaling file system only. So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.

Signed-off-by: Hiroshi Shimamoto
Cc: Chris Mason
Cc: "Theodore Ts'o"
Cc: Steven Whitehouse
Cc: KONISHI Ryusuke
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hiroshi Shimamoto
2009-12-16 00:53:27 +0800

03 Dec, 2009

12 commits

26bb7505c GFS2: Fix glock refcount issues

This patch fixes some ref counting issues. Firstly by moving
the point at which we drop the ref count after a dlm lock
operation has completed we ensure that we never call
gfs2_glock_hold() on a lock with a zero ref count.

Secondly, by using atomic_dec_and_lock() in gfs2_glock_put()
we ensure that at no time will a glock with zero ref count
appear on the lru_list. That means that we can remove the
check for this in our shrinker (which was racy).
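
The "decrement, but take the lock before the count can reach zero" pattern can be approximated in userspace with C11 atomics and a pthread mutex. This is a sketch under stated assumptions: `demo_dec_and_lock()` imitates the semantics of the kernel's `atomic_dec_and_lock()` and is not its implementation, and `demo_glock`, `demo_lru_lock` and the other names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Drop one reference. Returns true with the lock held if this was the last
 * reference; otherwise returns false without holding the lock.
 */
static bool demo_dec_and_lock(atomic_int *count, pthread_mutex_t *lock)
{
    int old = atomic_load(count);

    /* Fast path: the count stays above zero, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
    }

    /* Slow path: the count may hit zero, so take the lock first. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;
    pthread_mutex_unlock(lock);
    return false;
}

struct demo_glock {
    atomic_int ref;
    bool on_lru;
};

static pthread_mutex_t demo_lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Put a reference; on the final put, leave the LRU under the lock. */
static bool demo_glock_put(struct demo_glock *gl)
{
    if (!demo_dec_and_lock(&gl->ref, &demo_lru_lock))
        return false;
    gl->on_lru = false; /* nobody holding the lock ever sees ref == 0 on the LRU */
    pthread_mutex_unlock(&demo_lru_lock);
    return true;
}

/* Count how many puts are needed to free an object (initial_refs >= 1). */
int demo_puts_until_free(int initial_refs)
{
    struct demo_glock gl = { .on_lru = true };
    int puts = 1;

    atomic_init(&gl.ref, initial_refs);
    while (!demo_glock_put(&gl))
        puts++;
    return puts;
}
```

Because the transition to zero only happens while the lock is held, code that scans the list under the same lock does not need a separate check for zero-count entries.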

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 20:00:12 +0800
c29cd9004 writeback: remove unused nonblocking and congestion checks (gfs2)

No one is calling wb_writeback and write_cache_pages with
wbc.nonblocking=1 any more. And lumpy pageout will want to do
nonblocking writeback without the congestion wait.

Signed-off-by: Wu Fengguang
Signed-off-by: Steven Whitehouse

Wu Fengguang
2009-12-03 19:59:17 +0800
9ae3c6de6 GFS2: drop rindex glock to refresh rindex list

When a gfs2 filesystem is grown, it needs to rebuild the rindex list to be able
to use the new space. gfs2 does this when the rindex is marked not uptodate,
which happens when the rindex glock is dropped. However, on a single node
setup, there is never any reason to drop the rindex glock, so gfs2 never
invalidates the rindex. This patch makes gfs2 automatically drop the
rindex glock after filesystem grows, so it can refresh the rindex list.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2009-12-03 19:59:03 +0800
0ab7d13fc GFS2: Tag all metadata with jid

There are two spare fields in the header common to all GFS2
metadata. One is just the right size to fit a journal id
in it, and this patch updates the journal code so that each
time a metadata block is modified, we tag it with the journal
id of the node which is performing the modification.

The reason for this is that it should make it much easier to
debug issues which arise if we can tell which node was the
last to modify a particular metadata block.

Since the field is updated before the block is written into
the journal, each journal should only contain metadata which
is tagged with its own journal id. The one exception to this
is the journal header block, which might have a different node's
id in it, if that journal was recovered by another node in the
cluster.

Thus each journal will contain a record of which nodes recovered
it, via the journal header.

The other field in the metadata header could potentially be
used to hold information about what kind of operation was
performed, but for the time being we just zero it on each
transaction so that if we use it for that in future, we'll
know that the information (where it exists) is reliable.

I did consider using the other field to hold the journal
sequence number, however since in GFS2's journaling we write
the modified data into the journal and not the original
data, this gives no information as to what action caused the
modification, so I think we can probably come up with a better
use for those 64 bits in the future.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:58:47 +0800
2c7763496 GFS2: Locking order fix in gfs2_check_blk_state

In some cases we already have the rindex lock when
we enter this function.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:57:41 +0800
1579343a7 GFS2: Remove dirent_first() function

This function only had one caller left, and that caller only
called it for leaf blocks, hence one branch of the "if" was
never taken. In addition the call to get_leaf had already
verified the metadata type, so the function can be reduced
to a single line of code in its caller.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:57:23 +0800
cdcfde62d GFS2: Display nobarrier option in /proc/mounts

Since the default is barriers on, this only displays the
nobarrier option when that is active.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:57:05 +0800
f25934c5f GFS2: add barrier/nobarrier mount options

Currently gfs2 issues barriers unconditionally. There are various reasons
to disable them, be that just for testing or for stupid devices flushing
large battery backed caches. Add a nobarrier option that matches xfs and
btrfs for this. Also add a symmetric barrier option to turn it back on
at remount time.

Signed-off-by: Christoph Hellwig
Signed-off-by: Steven Whitehouse

Christoph Hellwig
2009-12-03 19:55:54 +0800
c14f5735e GFS2: remove division from new statfs code

It's not necessary to do any 64bit division for the statfs sync code, so
remove it.

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2009-12-03 19:55:32 +0800
3d3c10f2c GFS2: Improve statfs and quota usability

GFS2 now has three new mount options, statfs_quantum, quota_quantum and
statfs_percent. statfs_quantum and quota_quantum simply allow you to
set the tunables of the same name. Setting statfs_quantum to 0
will also turn on the statfs_slow tunable. statfs_percent accepts an
integer between 0 and 100. Numbers between 1 and 100 will cause GFS2 to
do an early sync when the local number of blocks free changes by at
least statfs_percent from the total number of blocks free. Setting
statfs_percent to 0 disables this.
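
One way to read the statfs_percent rule is as a threshold test on the local change relative to the total. The sketch below is an assumption-laden illustration with hypothetical names, not the GFS2 implementation. It compares `|change| * 100` against `percent * total` so that no division is needed, and it assumes the values are small enough that the products do not overflow 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the local change in free blocks is at least `percent`
 * percent of the total free blocks. A percent of 0 disables the early sync.
 */
bool demo_needs_early_sync(int64_t local_change, uint64_t total_free,
                           unsigned percent)
{
    uint64_t delta;

    if (percent == 0)
        return false;

    /* Absolute value of the change, computed in unsigned arithmetic. */
    delta = local_change < 0 ? -(uint64_t)local_change
                             : (uint64_t)local_change;

    return delta * 100 >= (uint64_t)percent * total_free;
}
```

With 1000 blocks free in total and statfs_percent set to 5, a local change of 50 blocks in either direction reaches the threshold, while a change of 49 blocks does not.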

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2009-12-03 19:55:17 +0800
2ec465052 GFS2: Use dquot_send_warning()

This adds support to GFS2 to send quota warnings via netlink.
Also it removes a stray \r which was left over from when the
code used to print warnings on the console.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:53:28 +0800
e285c1003 GFS2: Add set_xquota support

This patch adds the ability to set GFS2 quota limit and
warning levels via the XFS quota API.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2009-12-03 19:52:43 +0800