Eric Lee / smarc-fsl-linux-kernel

06 Jan, 2009

40 commits

7d8a804c5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: fs/dlm/ast.c: fix warning
dlm: add new debugfs entry
dlm: add time stamp of blocking callback
dlm: change lock time stamping
dlm: improve how bast mode handling
dlm: remove extra blocking callback check
dlm: replace schedule with cond_resched
dlm: remove kmap/kunmap
dlm: trivial annotation of be16 value
dlm: fix up memory allocation flags

Linus Torvalds
2009-01-06 11:02:09 +0800
c54febae9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (27 commits)
GFS2: Use DEFINE_SPINLOCK
GFS2: Fix use-after-free bug on umount (try #2)
Revert "GFS2: Fix use-after-free bug on umount"
GFS2: Streamline alloc calculations for writes
GFS2: Send useful information with uevent messages
GFS2: Fix use-after-free bug on umount
GFS2: Remove ancient, unused code
GFS2: Move four functions from super.c
GFS2: Fix bug in gfs2_lock_fs_check_clean()
GFS2: Send some sensible sysfs stuff
GFS2: Kill two daemons with one patch
GFS2: Move gfs2_recoverd into recovery.c
GFS2: Fix "truncate in progress" hang
GFS2: Clean up & move gfs2_quotad
GFS2: Add more detail to debugfs glock dumps
GFS2: Banish struct gfs2_rgrpd_host
GFS2: Move rg_free from gfs2_rgrpd_host to gfs2_rgrpd
GFS2: Move rg_igeneration into struct gfs2_rgrpd
GFS2: Banish struct gfs2_dinode_host
GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize
...

Linus Torvalds
2009-01-06 10:52:54 +0800
10cc04f5a Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (138 commits)
ocfs2: Access the right buffer_head in ocfs2_merge_rec_left.
ocfs2: use min_t in ocfs2_quota_read()
ocfs2: remove unneeded lvb casts
ocfs2: Add xattr support checking in init_security
ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle
ocfs2: calculate and reserve credits for xattr value in mknod
ocfs2/xattr: fix credits calculation during index create
ocfs2/xattr: Always updating ctime during xattr set.
ocfs2/xattr: Remove extend_trans call and add its credits from the beginning
ocfs2/dlm: Fix race during lockres mastery
ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list
ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating
ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler()
ocfs2/dlm: Fix a race between migrate request and exit domain
ocfs2: One more hamming code optimization.
ocfs2: Another hamming code optimization.
ocfs2: Don't hand-code xor in ocfs2_hamming_encode().
ocfs2: Enable metadata checksums.
ocfs2: Validate superblock with checksum and ecc.
ocfs2: Checksum and ECC for directory blocks.
...

Linus Torvalds
2009-01-06 10:32:43 +0800
520c85346 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
inotify: fix type errors in interfaces
fix breakage in reiserfs_new_inode()
fix the treatment of jfs special inodes
vfs: remove duplicate code in get_fs_type()
add a vfs_fsync helper
sys_execve and sys_uselib do not call into fsnotify
zero i_uid/i_gid on inode allocation
inode->i_op is never NULL
ntfs: don't NULL i_op
isofs check for NULL ->i_op in root directory is dead code
affs: do not zero ->i_op
kill suid bit only for regular files
vfs: lseek(fd, 0, SEEK_CUR) race condition

Linus Torvalds
2009-01-06 10:32:06 +0800
4ae8978cf inotify: fix type errors in interfaces

The problems lie in the types used for some inotify interfaces, both at the kernel level and at the glibc level. This mail addresses the kernel problem. I will follow up with some suggestions for glibc changes.

For the sys_inotify_rm_watch() interface, the type of the 'wd' argument is
currently 'u32'; it should be '__s32'. That is Robert's suggestion, and
is consistent with the other declarations of watch descriptors in the
kernel source, in particular, the inotify_event structure in
include/linux/inotify.h:

struct inotify_event {
        __s32 wd;       /* watch descriptor */
        __u32 mask;     /* watch mask */
        __u32 cookie;   /* cookie to synchronize two events */
        __u32 len;      /* length (including nulls) of name */
        char name[0];   /* stub for possible name */
};

The patch makes the changes needed for inotify_rm_watch().
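
As a rough illustration of why the signedness matters, the sketch below carries a watch descriptor through an unsigned and a signed 32-bit slot and then widens it. The helper names are hypothetical and this is user-space C, not the kernel code the patch touches; it only shows one way a negative descriptor can change value when the types disagree.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not kernel functions. */

/* Carry the descriptor through an unsigned 32-bit slot, then widen it. */
int64_t widen_via_u32(int32_t wd)
{
    uint32_t raw = (uint32_t)wd;   /* -1 becomes 4294967295 */
    return (int64_t)raw;           /* zero-extended: the sign is lost */
}

/* Carry the descriptor through a signed 32-bit slot, then widen it. */
int64_t widen_via_s32(int32_t wd)
{
    return (int64_t)wd;            /* sign-extended: -1 stays -1 */
}
```

Non-negative descriptors survive both paths unchanged; only negative values, such as an error return of -1, differ.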

Signed-off-by: Michael Kerrisk
Cc: Robert Love
Cc: Vegard Nossum
Cc: Ulrich Drepper
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Michael Kerrisk
2009-01-06 00:54:29 +0800
2f1169e2d fix breakage in reiserfs_new_inode()

now that we use ih.key earlier, we need to do all its setup early enough

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:29 +0800
5b45d96bf fix the treatment of jfs special inodes

We used to put them on a single list, without any locking. Racy.

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:29 +0800
d8e9650df vfs: remove duplicate code in get_fs_type()

save 14 bytes:

   text    data     bss     dec     hex filename
   1354      32       4    1390     56e fs/filesystems.o.before
   1340      32       4    1376     560 fs/filesystems.o

Signed-off-by: Li Zefan
Signed-off-by: Al Viro

Li Zefan
2009-01-06 00:54:29 +0800
4c728ef58 add a vfs_fsync helper

Fsync currently has a fdatawrite/fdatawait pair around the method call,
and a mutex_lock/unlock of the inode mutex. All callers of fsync have
to duplicate this, but we have a few and most of them don't quite get
it right. This patch adds a new vfs_fsync that takes care of this.
It's a little more complicated than usual as ->fsync might get a NULL file
pointer and just a dentry from nfsd, but otherwise gets a file and we
want to take the mapping and file operations from it when it is there.

Notes on the fsync callers:

- ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the
  lower file
- coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host
  file, and returning 0 when ->fsync was missing
- shm was neither calling filemap_fdatawrite / filemap_fdatawait nor
  taking i_mutex. Now given that shared memory doesn't have disk
  backing, not doing anything in fsync seems fine and I left it out of
  the vfs_fsync conversion for now, but in that case we might just
  not pass it through to the lower file at all but just call the no-op
  simple_sync_file directly.

[and now actually export vfs_fsync]

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2009-01-06 00:54:28 +0800
6110e3abb sys_execve and sys_uselib do not call into fsnotify

sys_execve and sys_uselib do not call into fsnotify so inotify does not get
open events for these types of syscalls. This patch simply makes the
requisite fsnotify calls.

Signed-off-by: Eric Paris
Signed-off-by: Al Viro

Eric Paris
2009-01-06 00:54:28 +0800
56ff5efad zero i_uid/i_gid on inode allocation

... and don't bother in callers. Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:28 +0800
acfa4380e inode->i_op is never NULL

We used to have a rather schizophrenic set of checks for NULL ->i_op even
though it had been eliminated years ago. You'd need to go out of your
way to set it to NULL explicitly _and_ a bunch of code would die on
such inodes anyway. After killing two remaining places that still
did that bogosity, all that crap can go away.

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:28 +0800
9742df331 ntfs: don't NULL i_op

it's already set to empty table (and no, ntfs doesn't have any explicit
checks for NULL ->i_op or NULL ->i_fop)

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:27 +0800
261964c60 isofs check for NULL ->i_op in root directory is dead code

for one thing it never happens, for another we check that inode
is a directory right after that place anyway (and we'd already
checked that reading it from disk has not failed).

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:53:38 +0800
c765d4790 affs: do not zero ->i_op

it is already set to empty table and should never be NULL

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:53:07 +0800
5b6f1eb97 vfs: lseek(fd, 0, SEEK_CUR) race condition

This patch fixes a race condition in lseek. While it is expected that
unpredictable behaviour may result while repositioning the offset of a
file descriptor concurrently with reading/writing to the same file
descriptor, this should not happen when merely *reading* the file
descriptor's offset.

Unfortunately, the only portable way in Unix to read a file
descriptor's offset is lseek(fd, 0, SEEK_CUR); however executing this
concurrently with read/write may mess up the position.
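
For reference, the offset query discussed here looks roughly like the following in user space. This is a minimal sketch: the function names and the scratch-file path are assumptions made for the example, and it shows only the idiom, not the race or the kernel fix.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the current offset of fd without moving it; (off_t)-1 on error. */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}

/*
 * Usage sketch: write three bytes to a scratch file and query the offset.
 * Returns the offset seen after the write, or (off_t)-1 on failure.
 */
off_t offset_after_three_bytes(void)
{
    char path[] = "/tmp/lseek-demo-XXXXXX";  /* hypothetical scratch path */
    int fd = mkstemp(path);
    off_t pos = (off_t)-1;

    if (fd < 0)
        return pos;
    unlink(path);                            /* file vanishes on close */
    if (write(fd, "abc", 3) == 3)
        pos = current_offset(fd);
    close(fd);
    return pos;
}
```

Because the call asks to seek by zero bytes relative to the current position, a caller reasonably expects it to be a pure read of the offset, which is the expectation the patch restores.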

[with fixes from akpm]

Signed-off-by: Alain Knaff
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Alain Knaff
2009-01-06 00:53:07 +0800
9047beabb ocfs2: Access the right buffer_head in ocfs2_merge_rec_left.

In commit "ocfs2: Use metadata-specific ocfs2_journal_access_*()
functions", the wrong buffer_head is accessed. So change it
to the right buffer_head.

Signed-off-by: Tao Ma
Acked-by: Joel Becker
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:40:37 +0800
dad7d975e ocfs2: use min_t in ocfs2_quota_read()

This is preferred to min().
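
A rough user-space sketch of the idea behind min_t(): both operands are converted to one named type before they are compared. The macro below is a simplified stand-in (the kernel's version also avoids evaluating its arguments twice), and chunk_len() with its parameters is a hypothetical caller, not code from ocfs2_quota_read().

```c
#include <stddef.h>

/* Simplified stand-in for the kernel macro; evaluates arguments twice. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical caller: clamp a read to what is left in the current block. */
size_t chunk_len(unsigned int block_left, size_t to_read)
{
    /* Operands have different types, so name the comparison type once. */
    return min_t(size_t, block_left, to_read);
}
```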

Signed-off-by: Mark Fasheh

Mark Fasheh
2009-01-06 00:40:37 +0800
a641dc2a5 ocfs2: remove unneeded lvb casts

dlmglue.c has lots of code which casts the return value of ocfs2_dlm_lvb().
This is pointless however, as ocfs2_dlm_lvb() returns void *.

Signed-off-by: Mark Fasheh

Mark Fasheh
2009-01-06 00:40:36 +0800
38d59ef61 ocfs2: Add xattr support checking in init_security

We must check whether the ocfs2 volume supports xattrs in init_security.
If it does not and security is enabled, mknod would fail.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:40:36 +0800
008aafaf0 ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle

In extreme situations, we may need an xattr bucket for setting the
security entry and acl entries during mknod. This only
happens when the block size is too small.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:40:36 +0800
0e445b6fe ocfs2: calculate and reserve credits for xattr value in mknod

Previously we extended the credits for an xattr's large value in
set_value_outside. This can give rise to a credits issue when we set one
security entry and two acl entries during mknod. As we remove extend_trans
from set_value_outside, we must calculate and reserve the credits for the
xattr's large value in mknod.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:40:36 +0800
90cb546ca ocfs2/xattr: fix credits calculation during index create

When creating an xattr index block, the old calculation forgot
to add credits for the meta change of the alloc file. So add
more credits and more comments to explain it.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:40:36 +0800
4b3f6209b ocfs2/xattr: Always updating ctime during xattr set.

In xattr set, we should always update ctime if the operation succeeds.
The old code mistakenly put the update in ocfs2_xattr_set_entry,
which is only called when we set an xattr in the inode or xattr block. A
side benefit is that this resolves bug 1052, since in that scenario
ocfs2_calc_xattr_set_need only calculates the xattr set credits while
ocfs2_xattr_set_entry also updates the inode, which isn't part of
the xattr set process.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:40:36 +0800
71d548a6a ocfs2/xattr: Remove extend_trans call and add its credits from the beginning

Actually, when setting a new xattr value, we know the credits needed from
the very beginning, unlike the extension of a bucket, in which case
we can't figure them out. So remove ocfs2_extend_trans in that function
and calculate the credits before the transaction. This also relieves acl
operations from worrying about the side effects of ocfs2_extend_trans.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:40:36 +0800
7b791d685 ocfs2/dlm: Fix race during lockres mastery

dlm_get_lock_resource() is supposed to return a lock resource with a proper
master. If multiple concurrent threads attempt to look up the lockres for the
same lockid while the lock mastery is underway, one or more threads are likely
to return a lockres without a proper master.

This patch makes the threads wait in dlm_get_lock_resource() while the mastery
is underway, ensuring all threads return the lockres with a proper master.

This issue is known to be limited to users using the flock() syscall. For all
other fs operations, the ocfs2 dlmglue layer serializes the dlm op for each
lockid.

Users encountering this bug will see flock() return EINVAL and dmesg have the
following error:
ERROR: Dlm error "DLM_BADARGS" while calling dlmlock on resource : bad api args

Reported-by: Coly Li
Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-01-06 00:40:35 +0800
b0d4f817b ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list

This patch adds a new lock, dlm->tracking_lock, to protect adding/removing
lockres' to/from the dlm->tracking_list. We were previously using dlm->spinlock
for the same, but that proved inadequate as we could be freeing a lockres from
a context that did not hold that lock. As the new lock only protects this list,
we can explicitly take it when removing the lockres from the tracking list.

This bug was exposed when testing multiple processes concurrently calling
flock() on the same file.

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-01-06 00:40:35 +0800
d4f7e650e ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating

During lockres purge, o2dlm sends a drop reference message to the lockres
master. This patch delays the message if the lockres is being migrated.

Fixes oss bugzilla#1012
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1012

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-01-06 00:40:35 +0800
57dff2676 ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler()

This patch cleans up the errors printed in dlm_proxy_ast_handler(). The errors
now include the node number that sent the (b)ast. It also reduces the number of
endian swaps of the cookie.

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-01-06 00:40:35 +0800
2b8325640 ocfs2/dlm: Fix a race between migrate request and exit domain

This patch addresses a race between a migrate request message and an exit domain message.
Instead of blocking exit domains for the duration of the migrate, we ignore
failure to deliver that message. This is because an exiting domain should
not have any active locks and thus has no role to play in the migration.

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-01-06 00:40:35 +0800
58896c4d0 ocfs2: One more hamming code optimization.

The previous optimization used a fast find-highest-bit-set operation to
give us a good starting point in calc_code_bit(). This version lets the
caller cache the previous code buffer bit offset. Thus, the next call
always starts where the last one left off.

This reduces the calculation by another 39%, for a total 80% reduction from
the original, naive implementation. At least, on my machine. This also
brings the parity calculation to within an order of magnitude of the
crc32 calculation.
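
The caching idea can be sketched as follows. This is an illustrative reconstruction under assumptions, not the ocfs2 source: the name code_bit() and its signature are made up here, and the cache is taken to hold the number of parity positions already skipped, which is only valid when successive calls use non-decreasing bit indexes.

```c
#include <stddef.h>

/*
 * Map data bit i (0-based) to its 1-based position in the Hamming code
 * word, where positions 1, 2, 4, 8, ... are reserved for parity bits.
 * If p_cache is non-NULL it holds how many parity positions the previous
 * call skipped, so a call with a larger i resumes from there.
 */
unsigned int code_bit(unsigned int i, unsigned int *p_cache)
{
    unsigned int p = p_cache ? *p_cache : 0;
    unsigned int b = i + 1 + p;

    /* Compare against b + 1: what b would be once bumped past a parity bit. */
    while ((1u << p) < b + 1) {
        b++;
        p++;
    }
    if (p_cache)
        *p_cache = p;
    return b;
}
```

With a NULL cache the function falls back to the plain loop that starts from the lowest power of two on every call.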

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:35 +0800
7bb458a58 ocfs2: Another hamming code optimization.

In the calc_code_bit() function, we must find all powers of two beneath
the code bit number, *after* it's shifted by those powers of two. This
requires a loop to see where it ends up.

We can optimize it by starting at its most significant bit. This shaves
32% off the time, for a total of 67.6% shaved off of the original, naive
implementation.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:35 +0800
e798b3f8a ocfs2: Don't hand-code xor in ocfs2_hamming_encode().

When I wrote ocfs2_hamming_encode(), I was following documentation of
the algorithm and didn't have quite the (possibly still imperfect) grasp
of it I do now. As part of this, I literally hand-coded xor. I would
test a bit, and then add that bit via xor to the parity word.

I can, of course, just do a single xor of the parity word and the source
word (the code buffer bit offset). This cuts CPU usage by 53% on a
mostly populated buffer (an inode containing utmp.h inline).
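
The difference between the two approaches can be sketched like this. The functions are hypothetical illustrations rather than the ocfs2 code: they take an array of code-bit offsets for the set bits and fold them into a parity word, first bit by bit and then with one xor per offset.

```c
#include <stdint.h>

/* Hand-coded variant: test each bit of the offset and xor it in singly. */
uint32_t parity_hand_coded(const uint32_t *offsets, int n)
{
    uint32_t parity = 0;

    for (int k = 0; k < n; k++)
        for (unsigned int j = 0; j < 32; j++)
            if (offsets[k] & (1u << j))
                parity ^= (1u << j);
    return parity;
}

/* Equivalent variant: xor the whole offset into the parity word at once. */
uint32_t parity_single_xor(const uint32_t *offsets, int n)
{
    uint32_t parity = 0;

    for (int k = 0; k < n; k++)
        parity ^= offsets[k];
    return parity;
}
```

Both functions compute the same parity word; the second simply lets the hardware xor all 32 bits in one operation.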

Joel

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:34 +0800
9d28cfb73 ocfs2: Enable metadata checksums.

Add OCFS2_FEATURE_INCOMPAT_META_ECC to the list of supported features.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:34 +0800
d030cc978 ocfs2: Validate superblock with checksum and ecc.

The superblock is read via a raw call. Validate it after we find it
from its signature.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:34 +0800
c175a518b ocfs2: Checksum and ECC for directory blocks.

Use the db_check field of ocfs2_dir_block_trailer to crc/ecc the
dirblocks.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:34 +0800
87d35a74b ocfs2: Add directory block trailers.

Future ocfs2 features metaecc and indexed directories need to store a
little bit of data in each dirblock. For compatibility, we place this
in a trailer at the end of the dirblock. The trailer presents itself as an
empty dirent, so that if the features are turned off, it can be reused
without requiring a tunefs scan.

This code adds the trailer and validates it when the block is read in.

[ Mark is the original author, but I reinserted this code before his
dir index work. -- Joel ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Mark Fasheh
2009-01-06 00:40:34 +0800
840089724 ocfs2: Use proper journal_access function in xattr.c

Change the rest of the naked ocfs2_journal_access() calls in
fs/ocfs2/xattr.c to use the appropriate ocfs2_journal_access_*() call
for their metadata type.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:34 +0800
4311901da ocfs2: Pass value buf to ocfs2_remove_value_outside().

ocfs2_remove_value_outside() needs to know the type of buffer it is
looking at. Pass in an ocfs2_xattr_value_buf.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:33 +0800
512620f44 ocfs2: Use ocfs2_xattr_value_buf in ocfs2_xattr_set_entry().

ocfs2_xattr_set_entry is the function that knows what type of block it
is setting into. This is what we wanted from ocfs2_xattr_value_buf.
Plus, moving the value buf up into ocfs2_xattr_set_entry() allows us to
pass it into ocfs2_xattr_set_value_outside() and ocfs2_xattr_cleanup().

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:33 +0800