Eric Lee / smarc-fsl-linux-kernel

20 Jul, 2007

40 commits

275afcac9 afs build fix

Bruce and David's patches clashed.

fs/afs/flock.c: In function 'afs_do_getlk':
fs/afs/flock.c:459: error: void value not ignored as it ought to be

Cc: "J. Bruce Fields"
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2007-07-20 01:04:57 +0800
c7d51402d knfsd: clean up EX_RDONLY

Share a little common code, reverse the arguments for consistency, drop the
unnecessary "inline", and lowercase the name.

Signed-off-by: "J. Bruce Fields"
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

J. Bruce Fields
2007-07-20 01:04:52 +0800
e22841c63 knfsd: move EX_RDONLY out of header

EX_RDONLY is only called in one place; just put it there.

Signed-off-by: "J. Bruce Fields"
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

J. Bruce Fields
2007-07-20 01:04:52 +0800
5d3dbbeaf nfsd: remove unnecessary NULL checks from nfsd_cross_mnt

We can now assume that rqst_exp_get_by_name() does not return NULL; so clean
up some unnecessary checks.

Signed-off-by: "J. Bruce Fields"
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

J. Bruce Fields
2007-07-20 01:04:52 +0800
9a25b96c1 nfsd: return errors, not NULL, from export functions

I converted the various export-returning functions to return -ENOENT instead
of NULL, but missed a few cases.

This particular case could cause actual bugs in the case of a krb5 client that
doesn't match any ip-based client and that is trying to access a filesystem
not exported to krb5 clients.
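
The error-pointer convention this message relies on can be sketched in userspace C. The ERR_PTR/IS_ERR/PTR_ERR helpers below are simplified re-implementations of the kernel macros of the same names, and find_export() with its table is a hypothetical stand-in for the export lookup functions, not the nfsd code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (simplified kernel idiom). */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct export {
	const char *path;
};

/* Hypothetical export table, for illustration only. */
static struct export table[] = { { "/srv/a" }, { "/srv/b" } };

/* Hypothetical lookup: never returns NULL; a miss is ERR_PTR(-ENOENT). */
static struct export *find_export(const char *path)
{
	assert(path != NULL);
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(table[i].path, path) == 0)
			return &table[i];
	return ERR_PTR(-ENOENT);
}
```

A caller written against this convention tests IS_ERR() rather than comparing with NULL; a caller that still checks only for NULL would treat the error pointer as a valid export, which is the kind of missed case the message describes.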

Signed-off-by: "J. Bruce Fields"
Acked-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

J. Bruce Fields
2007-07-20 01:04:52 +0800
a280df32d nfsd: fix possible read-ahead cache and export table corruption

The value of nperbucket calculated here is too small--we should be rounding up
instead of down--with the result that the index j in the following loop can
overflow the raparm_hash array. At least in my case, the next thing in memory
turns out to be export_table, so the symptoms I see are crashes caused by the
appearance of four zeroed-out export entries in the first bucket of the hash
table of exports (which were actually entries in the readahead cache, a
pointer to which had been written to the export table in this initialization
code).

It looks like the bug was probably introduced with commit
fce1456a19f5c08b688c29f00ef90fdfa074c79b ("knfsd: make the readahead params
cache SMP-friendly").
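
The rounding problem can be illustrated with a small standalone sketch. It assumes a loop that starts a new bucket every nperbucket entries, as the message describes; the function names and the sizes in the usage note are hypothetical, not taken from the nfsd source.

```c
#include <assert.h>

/*
 * Count how many bucket heads a loop of this shape writes: a new bucket
 * is started every nperbucket entries while walking cache_size entries.
 */
static unsigned int heads_written(unsigned int cache_size,
				  unsigned int nperbucket)
{
	unsigned int i, j = 0;

	assert(nperbucket > 0);
	for (i = 0; i < cache_size; i++)
		if (i % nperbucket == 0)
			j++;	/* stands in for a write to hash[j++] */
	return j;
}

/* Entries per bucket, rounded down (the behaviour described as buggy). */
static unsigned int per_bucket_floor(unsigned int cache_size,
				     unsigned int nbuckets)
{
	return cache_size / nbuckets;
}

/* Entries per bucket, rounded up (the behaviour described as the fix). */
static unsigned int per_bucket_ceil(unsigned int cache_size,
				    unsigned int nbuckets)
{
	return (cache_size + nbuckets - 1) / nbuckets;
}
```

With 40 entries and 16 buckets, rounding down gives 2 entries per bucket and 20 bucket heads, four more than the array holds. Rounding up gives 3 entries per bucket and 14 heads, which fits.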

Cc:
Cc: Greg Banks
Signed-off-by: "J. Bruce Fields"
Acked-by: NeilBrown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

J. Bruce Fields
2007-07-20 01:04:52 +0800
dd00cc486 some kmalloc/memset ->kzalloc (tree wide)

Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).

Here is a short excerpt of the semantic patch performing
this transformation:

@@
type T2;
expression x;
identifier f,fld;
expression E;
expression E1,E2;
expression e1,e2,e3,y;
statement S;
@@

x =
- kmalloc
+ kzalloc
(E1,E2)
... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
- memset((T2)x,0,E1);

@@
expression E1,E2,E3;
@@

- kzalloc(E1 * E2,E3)
+ kcalloc(E1,E2,E3)
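
As a rough userspace analogue of what the semantic patch does, the sketch below contrasts an allocate-then-clear pair with a single zeroing allocation. kzalloc and kcalloc are kernel-only, so malloc/memset and calloc stand in for them here; struct item and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct item {
	int id;
	char name[16];
};

/* Before: allocate, then clear by hand (the kmalloc + memset shape). */
static struct item *items_alloc_memset(size_t n)
{
	struct item *p;

	assert(n > 0);
	p = malloc(n * sizeof(*p));
	if (!p)
		return NULL;
	memset(p, 0, n * sizeof(*p));
	return p;
}

/* After: one call that returns zeroed memory (the kcalloc shape). */
static struct item *items_alloc_calloc(size_t n)
{
	assert(n > 0);
	return calloc(n, sizeof(struct item));
}
```

Both functions hand back zero-filled memory; the second also leaves the count-times-size multiplication to the allocator instead of open-coding it.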

[akpm@linux-foundation.org: get kcalloc args the right way around]
Signed-off-by: Yoann Padioleau
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Acked-by: Russell King
Cc: Bryan Wu
Acked-by: Jiri Slaby
Cc: Dave Airlie
Acked-by: Roland Dreier
Cc: Jiri Kosina
Acked-by: Dmitry Torokhov
Cc: Benjamin Herrenschmidt
Acked-by: Mauro Carvalho Chehab
Acked-by: Pierre Ossman
Cc: Jeff Garzik
Cc: "David S. Miller"
Acked-by: Greg KH
Cc: James Bottomley
Cc: "Antonino A. Daplas"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yoann Padioleau
2007-07-20 01:04:50 +0800
5b7f13bd2 coda: update module information

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:49 +0800
3cf01f28c coda: remove statistics counters from /proc/fs/coda

Similar information can easily be obtained with strace -c.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
a1b0aa876 coda: remove struct coda_sb_info

The sb_info structure only contains a single pointer to the character device,
so there is no need for the added indirection.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
5fd31e9a6 coda: cleanup downcall handler

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
ed36f7236 coda: cleanup coda_lookup, use dsplice_alias

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
970648eb0 coda: ignore returned values when upcalls return errors

Venus returns an ENOENT error on open, so we shouldn't try to grab the
filehandle for the returned fd.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
37461e195 coda: replace upc_alloc/upc_free with kmalloc/kfree

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
978752534 coda: avoid lockdep warning in coda_readdir

Signed-off-by: Jan Harkes
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
d9664c95a coda: block signals during upcall processing

We ignore signals for about 30 seconds to give userspace a chance to see the
upcall. As we did not block signals, we ended up in a busy loop for the
remainder of the period whenever a signal was received.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
fe71b5f38 coda: cleanup for upcall handling path

Making the code that processes upcall responses more straightforward uncovered
at least one bad assumption. We trusted that vc_inuse would be 0 when upcalls
are aborted; however, the device may have been reopened.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
870655196 coda: cleanup /dev/cfs open and close handling

- Make sure device index is not a negative number.
- Unlink queued requests when the device is closed to avoid passing them
to the next opener.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
ed31a7dd6 coda: use ilookup5

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
fac1f0e34 coda: coda doesn't track atime

Set MS_NOATIME flag to avoid unnecessary calls when the coda inode is
accessed.

Also, set statfs.f_bsize to 4k. 1k is obviously too small for the suggested
IO size.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
8c6d21528 coda: allow removal of busy directories

A directory without children may still be busy when it is the cwd for some
process. We can safely remove such a directory because the VFS prevents
further operations. Also we don't need to call d_delete as it is already
called in vfs_rmdir.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
d728900cd coda: fix nlink updates for directories

The Coda client sets the directory link count to 1 when it isn't sure how many
subdirectories we have. In this case we shouldn't change the link count in
the kernel when a subdirectory is created or removed.
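
A minimal sketch of the rule described above, assuming a link count of 1 means "unknown" and a count of 2 or more is a real count. struct dir and both helper names are hypothetical and only illustrate the guard, not the Coda code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory inode. */
struct dir {
	unsigned int nlink;
};

/* A subdirectory was created: bump the count only if it is a real count. */
static void dir_inc_nlink(struct dir *d)
{
	assert(d != NULL);
	if (d->nlink >= 2)
		d->nlink++;
}

/* A subdirectory was removed: never drop below the base count of 2. */
static void dir_dec_nlink(struct dir *d)
{
	assert(d != NULL);
	if (d->nlink > 2)
		d->nlink--;
}
```

A directory whose count is 1 passes through both helpers unchanged, so the "unknown" marker set by the client survives later mkdir and rmdir operations.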

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
56ee35479 coda: correctly invalidate cached access rights

Change the epoch value to force a refresh instead of clearing the cached
rights mask and blocking all further accesses to the object.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
38c2e4370 coda: do not grab an uninitialized fd when the open upcall returns an error

When open fails the fd in the response is uninitialized and we ended up taking
a reference on the file struct and never released it.

Signed-off-by: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Harkes
2007-07-20 01:04:48 +0800
b38bd33a6 fix ext4/JBD2 build warnings

Looking at the current linus-git tree jbd_debug() define in
include/linux/jbd2.h

extern u8 journal_enable_debug;

#define jbd_debug(n, f, a...) \
do { \
if ((n) ...

> fs/ext4/inode.c: In function 'ext4_write_inode':
> fs/ext4/inode.c:2906: warning: comparison is always true due to limited
> range of data type
>
> fs/jbd2/recovery.c: In function 'jbd2_journal_recover':
> fs/jbd2/recovery.c:254: warning: comparison is always true due to
> limited range of data type
> fs/jbd2/recovery.c:257: warning: comparison is always true due to
> limited range of data type
>
> fs/jbd2/recovery.c: In function 'jbd2_journal_skip_recovery':
> fs/jbd2/recovery.c:301: warning: comparison is always true due to
> limited range of data type
>
Noticed that all the warnings occur when the debug level is 0. Then found
that the "jbd2: Move jbd2-debug file to debugfs" patch
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0f49d5d019afa4e94253bfc92f0daca3badb990b

changed jbd2_journal_enable_debug from int to u8, which makes the
jbd_debug comparison always true when the debugging level is 0. Hence
the compile warning.

Thought about changing the jbd2_journal_enable_debug data type back to
int, but can't, because the jbd2-debug is moved to debug fs, where
calling debugfs_create_u8() to create the debugfs entry needs the value
to be u8 type.

Even if we changed the data type back to int, the code is still buggy,
kernel should not print jbd2 debug message if the
jbd2_journal_enable_debug is set to 0. But this is not the case.

The fix is to change the level of debugging to 1. The same should be fixed in
ext3/JBD, but currently ext3 jbd-debug via /proc fs is broken, so we
probably should fix it all together.
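
The effect described above can be reproduced with a short standalone sketch. It assumes the check has the form "(n) <= level" with an unsigned 8-bit level, as the message explains; debug_level, set_debug_level() and debug_enabled() are hypothetical names, not the jbd2 symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the u8 debug knob; 0 is meant to mean "debugging off". */
static uint8_t debug_level;

static void set_debug_level(unsigned int level)
{
	assert(level <= UINT8_MAX);
	debug_level = (uint8_t)level;
}

/* Mirrors the "(n) <= level" test: should a level-n message be printed? */
static int debug_enabled(unsigned int n)
{
	return n <= debug_level;
}
```

A message at level 0 passes the test for every possible value of an unsigned knob, including 0, which is why the compiler reports the comparison as always true and why such messages would print even with debugging switched off. A message at level 1 is suppressed when the knob is 0.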

Signed-off-by: Mingming Cao
Cc: Jeff Garzik
Cc: Theodore Tso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mingming Cao
2007-07-20 01:04:47 +0800
ee78b0a61 coredump masking: ELF-FDPIC: enable core dump filtering

This patch enables core dump filtering for ELF-FDPIC-formatted core file.

Signed-off-by: Hidehiro Kawai
Cc: Alan Cox
Cc: David Howells
Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kawai, Hidehiro
2007-07-20 01:04:47 +0800
e2e00906a coredump masking: ELF-FDPIC: remove an unused argument

This patch removes an unused argument from elf_fdpic_dump_segments().

Signed-off-by: Hidehiro Kawai
Cc: Alan Cox
Cc: David Howells
Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kawai, Hidehiro
2007-07-20 01:04:47 +0800
a1b59e802 coredump masking: ELF: enable core dump filtering

This patch enables core dump filtering for ELF-formatted core file.

Signed-off-by: Hidehiro Kawai
Cc: Alan Cox
Cc: David Howells
Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kawai, Hidehiro
2007-07-20 01:04:47 +0800
3cb4a0bb1 coredump masking: add an interface for core dump filter

This patch adds an interface to set/reset flags which determine whether each
memory segment should be dumped when a core file is generated.

The /proc/<pid>/coredump_filter file is provided to access the flags. You can
read or change the flag status for a particular process by reading from or
writing to the file.

The flag status is inherited by the child process when it is created.

Signed-off-by: Hidehiro Kawai
Cc: Alan Cox
Cc: David Howells
Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kawai, Hidehiro
2007-07-20 01:04:47 +0800
6c5d52382 coredump masking: reimplementation of dumpable using two flags

This patch changes mm_struct.dumpable to a pair of bit flags.

set_dumpable() converts three-value dumpable to two flags and stores it into
lower two bits of mm_struct.flags instead of mm_struct.dumpable.
get_dumpable() behaves in the opposite way.
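
One possible way to pack a three-value setting into the lower two bits of a flags word is sketched below. The bit names, the exact encoding and the _sketch function names are assumptions made for illustration, and the sketch ignores the atomic bit operations and ordering a kernel implementation would need.

```c
#include <assert.h>
#include <stddef.h>

#define DUMPABLE_BIT		(1UL << 0)	/* assumed name */
#define DUMP_SECURELY_BIT	(1UL << 1)	/* assumed name */
#define DUMPABLE_MASK		(DUMPABLE_BIT | DUMP_SECURELY_BIT)

/* Store value (0, 1 or 2) in the lower two bits, leaving other bits alone. */
static void set_dumpable_sketch(unsigned long *flags, int value)
{
	assert(flags != NULL);
	assert(value >= 0 && value <= 2);

	*flags &= ~DUMPABLE_MASK;
	if (value == 1)
		*flags |= DUMPABLE_BIT;
	else if (value == 2)
		*flags |= DUMPABLE_BIT | DUMP_SECURELY_BIT;
}

/* Convert the two bits back to the three-value form. */
static int get_dumpable_sketch(unsigned long flags)
{
	unsigned long v = flags & DUMPABLE_MASK;

	return v >= 2 ? 2 : (int)v;
}
```

Because only the lower two bits are touched, the remaining bits of the word stay free for other per-process flags.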

[akpm@linux-foundation.org: export set_dumpable]
Signed-off-by: Hidehiro Kawai
Cc: Alan Cox
Cc: David Howells
Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kawai, Hidehiro
2007-07-20 01:04:46 +0800
f79c20f52 fs: remove path_walk export

Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef 'Jeff' Sipek
2007-07-20 01:04:45 +0800
c4a7808fc fs: mark link_path_walk static

Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef 'Jeff' Sipek
2007-07-20 01:04:45 +0800
16b6287a5 nfsctl: use vfs_path_lookup

use vfs_path_lookup instead of open-coding the necessary functionality.

Signed-off-by: Josef 'Jeff' Sipek
Acked-by: NeilBrown
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef 'Jeff' Sipek
2007-07-20 01:04:45 +0800
16f182002 fs: introduce vfs_path_lookup

Stackable file systems, among others, frequently need to lookup paths or
path components starting from an arbitrary point in the namespace
(identified by a dentry and a vfsmount). Currently, such file systems use
lookup_one_len, which is frowned upon [1] as it does not pass the lookup
intent along; not passing a lookup intent, for example, can trigger BUG_ON's
when stacking on top of NFSv4.

The first patch introduces a new lookup function to allow lookup starting
from an arbitrary point in the namespace. This approach has been suggested
by Christoph Hellwig [2].

The second patch changes sunrpc to use vfs_path_lookup.

The third patch changes nfsctl.c to use vfs_path_lookup.

The fourth patch marks link_path_walk static.

The fifth and last patch unexports path_walk because it is no longer
necessary to call it directly, and using the new vfs_path_lookup is
cleaner.

For example, the following snippet of code looks up "some/path/component"
in a directory pointed to by parent_{dentry,vfsmnt}:

	err = vfs_path_lookup(parent_dentry, parent_vfsmnt,
			      "some/path/component", 0, &nd);
	if (!err) {
		/* exists */

		...

		/* once done, release the references */
		path_release(&nd);
	} else if (err == -ENOENT) {
		/* doesn't exist */
	} else {
		/* other error */
	}

VFS functions such as lookup_create can be used on the nameidata structure
to pass the create intent to the file system.

Signed-off-by: Josef 'Jeff' Sipek
Cc: Al Viro
Acked-by: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef 'Jeff' Sipek
2007-07-20 01:04:45 +0800
b6a2fea39 mm: variable length argument support

Remove the arg+env limit of MAX_ARG_PAGES by copying the strings directly from
the old mm into the new mm.

We create the new mm before the binfmt code runs, and place the new stack at
the very top of the address space. Once the binfmt code runs and figures out
where the stack should be, we move it downwards.

It is a bit peculiar in that we have one task with two mm's, one of which is
inactive.

[a.p.zijlstra@chello.nl: limit stack size]
Signed-off-by: Ollie Wild
Signed-off-by: Peter Zijlstra
Cc:
Cc: Hugh Dickins
[bunk@stusta.de: unexport bprm_mm_init]
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ollie Wild
2007-07-20 01:04:45 +0800
bdf4c48af audit: rework execve audit

The purpose of audit_bprm() is to log the argv array to a userspace daemon at
the end of the execve system call. Since user-space hasn't had time to run,
this array is still in pristine state on the process' stack; so no need to
copy it, we can just grab it from there.

In order to minimize the damage to audit_log_*() copy each string into a
temporary kernel buffer first.

Currently the audit code requires that the full argument vector fits in a
single packet. So currently it does clip the argv size to a (sysctl) limit,
but only when execve auditing is enabled.

If the audit protocol gets extended to allow for multiple packets this check
can be removed.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ollie Wild
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-07-20 01:04:45 +0800
cf914a7d6 readahead: split ondemand readahead interface into two functions

Split ondemand readahead interface into two functions. I think this makes it
a little clearer for non-readahead experts (like Rusty).

Internally they both call ondemand_readahead(), but the page argument is
changed to an obvious boolean flag.

Signed-off-by: Rusty Russell
Signed-off-by: Fengguang Wu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rusty Russell
2007-07-20 01:04:44 +0800
d8983910a readahead: pass real splice size

Pass real splice size to page_cache_readahead_ondemand().

The splice code works in chunks of 16 pages internally. The readahead code
should be told of the overall splice size, instead of the internal chunk size.
Otherwise bad things may happen. Imagine some 17-page random splice reads.
The code before this patch will result in two readahead calls: readahead(16);
readahead(1); That leads to one 16-page I/O and one 32-page I/O: one extra I/O
and 31 readahead miss pages.
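
The arithmetic in this example can be checked with a short sketch. The 16-page chunk size and the 16-page and 32-page I/O sizes are taken from the message; the function names are hypothetical and the code models only the chunking and the page counts, not the readahead logic itself.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_PAGES 16	/* internal splice chunk size, per the message */

/*
 * Split a splice request of total_pages into internal chunks.
 * Fills sizes[] with up to max chunk sizes and returns how many were written.
 */
static size_t split_into_chunks(size_t total_pages, size_t *sizes, size_t max)
{
	size_t n = 0;

	assert(sizes != NULL);
	while (total_pages > 0 && n < max) {
		size_t c = total_pages < CHUNK_PAGES ? total_pages : CHUNK_PAGES;

		sizes[n++] = c;
		total_pages -= c;
	}
	return n;
}

/* Pages brought in by I/O that the reader never asked for. */
static size_t wasted_pages(size_t io_pages_total, size_t wanted_pages)
{
	assert(io_pages_total >= wanted_pages);
	return io_pages_total - wanted_pages;
}
```

A 17-page request splits into chunks of 16 and 1, which is what the readahead code saw before this patch. Taking the message's figures of a 16-page and a 32-page I/O, 48 pages are read for 17 wanted, leaving 31 wasted.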

Signed-off-by: Fengguang Wu
Cc: Jens Axboe
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fengguang Wu
2007-07-20 01:04:44 +0800
431a4820b readahead: move synchronous readahead call out of splice loop

Move synchronous page_cache_readahead_ondemand() call out of splice loop.

This avoids one pointless page allocation/insertion in case of non-zero
ra_pages, or many pointless readahead calls in case of zero ra_pages.

Note that if a user sets ra_pages to less than PIPE_BUFFERS=16 pages, he will
not get expected readahead behavior anyway. The splice code works in batches
of 16 pages, which can be taken as another form of synchronous readahead.

Signed-off-by: Fengguang Wu
Cc: Jens Axboe
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fengguang Wu
2007-07-20 01:04:44 +0800
dc7868fcb readahead: convert ext3/ext4 invocations

Convert ext3/ext4 dir reads to use on-demand readahead.

Readahead for dirs operates _not_ on file level, but on blockdev level. This
makes a difference when the data blocks are not contiguous. The read routine
is also somewhat opaque: there's no handy info about the status of the current
page. So a simplified call scheme is employed: call into readahead
whenever the current page falls outside the readahead window.

Signed-off-by: Fengguang Wu
Cc: Steven Pratt
Cc: Ram Pai
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fengguang Wu
2007-07-20 01:04:44 +0800