Eric Lee / smarc-fsl-linux-kernel

22 May, 2007

1 commit

e8edc6e03 Detach sched.h from mm.h

The first thing mm.h does is include sched.h, solely for the can_do_mlock()
inline function, which dereferences "current". By dealing with can_do_mlock(),
mm.h can be detached from sched.h, which is good; see below for why.

This patch
a) removes the unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() a normal function in mm/mlock.c
c) exports can_do_mlock() so as not to break compilation
d) adds sched.h inclusions back to files that were getting it indirectly
e) adds less bloated headers (asm/signal.h, jiffies.h) to some files that were
getting them indirectly
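
The header decoupling described above can be sketched in user space. This is a minimal illustration, not the kernel code: `struct task`, `current_task`, and both field names are hypothetical stand-ins for the kernel's task structure and "current", and the check itself is simplified. The point is that a header exposing only the prototype no longer needs the full structure definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What a slimmed-down "mm.h" would expose: a prototype only. */
bool can_do_mlock(void);

/* What would stay behind the heavy header (hypothetical layout). */
struct task {
    bool has_ipc_lock_cap;        /* stand-in for a capability check */
    unsigned long memlock_limit;  /* stand-in for a memlock resource limit */
};

/* Stand-in for the kernel's "current" task pointer. */
static struct task *current_task;

/* The out-of-line definition, as it would sit in a single .c file. */
bool can_do_mlock(void)
{
    assert(current_task != NULL);
    if (current_task->has_ipc_lock_cap)
        return true;
    return current_task->memlock_limit != 0;
}
```

Only the file holding the definition has to see `struct task`; callers that include the prototype alone are not rebuilt when the structure changes.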

Net result is:
a) mm.h users get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being a dependency for a significant number of files:
on x86_64 allmodconfig, touching sched.h results in a recompile of 4083 files;
after the patch it's only 3744 (-8.3%).

Cross-compile tested on

all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2007-05-22 00:18:19 +0800

17 May, 2007

2 commits

dd504ea16 Merge branch 'master' of /home/trondmy/repositories/git/linux-2.6/

Trond Myklebust
2007-05-17 23:36:59 +0800
a35afb830 Remove SLAB_CTOR_CONSTRUCTOR

SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.

Signed-off-by: Christoph Lameter
Cc: David Howells
Cc: Jens Axboe
Cc: Steven French
Cc: Michael Halcrow
Cc: OGAWA Hirofumi
Cc: Miklos Szeredi
Cc: Steven Whitehouse
Cc: Roman Zippel
Cc: David Woodhouse
Cc: Dave Kleikamp
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Anton Altaparmakov
Cc: Mark Fasheh
Cc: Paul Mackerras
Cc: Christoph Hellwig
Cc: Jan Kara
Cc: David Chinner
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-17 20:23:04 +0800

15 May, 2007

4 commits

2e42c3e2a NFS: Fix more sparse warnings

- fs/nfs/nfs4xdr.c:2499:42: warning: incorrect type in argument 2
(different signedness)
- fs/nfs/nfs4xdr.c:2658:49: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:2683:50: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:3063:68: warning: incorrect type in argument 4
(different explicit signedness)
- fs/nfs/nfs4xdr.c:3065:68: warning: incorrect type in argument 4
(different explicit signedness)

- fs/nfs/callback_xdr.c:138:31: warning: incorrect type in argument 2
(different signedness)

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-15 07:33:46 +0800
10afec908 NFS: Fix some 'sparse' warnings...

- fs/nfs/dir.c:610:8: warning: symbol 'nfs_llseek_dir' was not declared.
Should it be static?
- fs/nfs/dir.c:636:5: warning: symbol 'nfs_fsync_dir' was not declared.
Should it be static?
- fs/nfs/write.c:925:19: warning: symbol 'req' shadows an earlier one
- fs/nfs/write.c:61:6: warning: symbol 'nfs_commit_rcu_free' was not
declared. Should it be static?
- fs/nfs/nfs4proc.c:793:5: warning: symbol 'nfs4_recover_expired_lease'
was not declared. Should it be static?

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-15 07:33:46 +0800
8ae20abdd NFS4: Fix incorrect use of sizeof() in fs/nfs/nfs4xdr.c

The XDR code should not depend on the physical allocation size of
structures like nfs4_stateid and nfs4_verifier since those may have to
change at some future date. We therefore replace all uses of
sizeof() with constants like NFS4_VERIFIER_SIZE and NFS4_STATEID_SIZE.

This also has the side-effect of fixing some warnings of the type
format ‘%u’ expects type ‘unsigned int’, but argument X has type
‘long unsigned int’
on 64-bit systems
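
The difference between a structure's allocation size and its wire size can be illustrated with a small sketch. Everything named here is hypothetical: `WIRE_VERIFIER_SIZE`, its value of 8, and `struct verifier` with its extra bookkeeping field are chosen for illustration and are not taken from the NFS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bytes the protocol puts on the wire (illustrative value). */
#define WIRE_VERIFIER_SIZE 8

/* Hypothetical in-memory layout: a bookkeeping field sits next to the data,
 * so sizeof(struct verifier) is larger than the wire size. */
struct verifier {
    unsigned char data[WIRE_VERIFIER_SIZE];
    uint64_t      last_used;
};

/* Copy exactly the wire-sized payload into buf; return bytes written. */
static size_t encode_verifier(unsigned char *buf, const struct verifier *v)
{
    assert(buf != NULL && v != NULL);
    memcpy(buf, v->data, WIRE_VERIFIER_SIZE);
    return WIRE_VERIFIER_SIZE;
}
```

Had the encoder used sizeof(*v), adding `last_used` would have silently changed what goes on the wire. sizeof also yields a size_t, which is wider than unsigned int on typical 64-bit targets and so does not match a '%u' format.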

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-15 07:33:45 +0800
60945cb7c NFS: use zero_user_page

Use zero_user_page() instead of the newly deprecated memclear_highpage_flush().

Signed-off-by: Nate Diller
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust

Nate Diller
2007-05-15 07:33:45 +0800

10 May, 2007

6 commits

7a13e9322 NFS: Kill the obsolete NFS_PARANOIA

Signed-off-by: Jesper Juhl
Acked-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust

Jesper Juhl
2007-05-10 05:58:01 +0800
fee7f23fe NFS: use __set_current_state()

Use __set_current_state(TASK_*) instead of current->state = TASK_* in fs/nfs.

Signed-off-by: Milind Arun Choudhary
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust

Milind Arun Choudhary
2007-05-10 05:58:01 +0800
e4cc6ee2e NFS: Clean up NFSv4 XDR error message

Make it more useful for debugging purposes.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-10 05:58:00 +0800
6ce7dc940 NFS: NFS client underestimates how large an NFSv4 SETATTR reply can be

The maximum size of an NFSv4 SETATTR compound reply should include the
GETATTR operation that we send.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-10 05:58:00 +0800
e70c49081 NFS: Remove redundant check in nfs_check_verifier()

The check for nfs_attribute_timeout(dir) in nfs_check_verifier is
redundant: nfs_lookup_revalidate() will already call nfs_revalidate_inode()
on the parent dir when necessary.

The only case where this is not done is the case of a negative dentry. Fix
this case by moving up the revalidation code.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-10 05:57:59 +0800
e62c2bba1 NFS: Fix a jiffie wraparound issue

dentry verifiers are always set to the parent directory's
cache_change_attribute. There is no reason to be testing for anything other
than equality when we're trying to find out if the dentry has been checked
since the last time the directory was modified.
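
Why equality is the safer test on a wrapping counter can be shown with a short sketch. The function names are hypothetical, and the ordered variant is only an assumed example of the kind of comparison that misbehaves after a wrap; it is not quoted from the NFS code.

```c
#include <assert.h>
#include <stdbool.h>

/* Ordered test: "the verifier is at least as new as the directory". */
static bool verifier_ok_ordered(unsigned long dentry_verf, unsigned long dir_change)
{
    return dentry_verf >= dir_change;
}

/* Equality test: the dentry was checked against exactly this directory state. */
static bool verifier_ok_equal(unsigned long dentry_verf, unsigned long dir_change)
{
    return dentry_verf == dir_change;
}

/* Demonstration: the directory counter has wrapped past zero. */
static void demo_wraparound(void)
{
    unsigned long stale_verf = ~0UL - 5;  /* set shortly before the wrap */
    unsigned long dir_change = 10;        /* directory changed after the wrap */

    assert(verifier_ok_ordered(stale_verf, dir_change));  /* wrongly accepted */
    assert(!verifier_ok_equal(stale_verf, dir_change));   /* correctly rejected */
    assert(verifier_ok_equal(dir_change, dir_change));    /* fresh dentry passes */
}
```

Since the verifier is always copied from the directory's value, any mismatch means the dentry is stale, and no ordering is needed.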

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-10 05:57:58 +0800

09 May, 2007

2 commits

277866a0e nfs: fix congestion control: use atomic_longs

Change the atomic_t in struct nfs_server to atomic_long_t in anticipation
of machines that can handle 8+TB of (4K) pages under writeback.

However I suspect other things in NFS will start going *bang* by then.
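
The 8TB figure follows from simple arithmetic, sketched below. The helper names are hypothetical; the only assumptions are 4096-byte pages and a counter held in a plain int, as atomic_t does.

```c
#include <assert.h>
#include <limits.h>

/* Number of whole pages of page_size bytes that fit in `bytes`. */
static unsigned long long pages_for(unsigned long long bytes,
                                    unsigned long long page_size)
{
    assert(page_size > 0);
    return bytes / page_size;
}

/* Nonzero if `count` is too large to be stored in an int. */
static int overflows_int(unsigned long long count)
{
    return count > (unsigned long long)INT_MAX;
}
```

With 8 TiB (2^43 bytes) and 4 KiB (2^12 bytes) pages, pages_for() gives 2^31 pages, one more than a 32-bit int can hold.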

Signed-off-by: Peter Zijlstra
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-05-09 02:15:21 +0800
e63340ae6 header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2007-05-09 02:15:07 +0800

08 May, 2007

3 commits

2d56d3c43 Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux

* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
gfs2: nfs lock support for gfs2
lockd: add code to handle deferred lock requests
lockd: always preallocate block in nlmsvc_lock()
lockd: handle test_lock deferrals
lockd: pass cookie in nlmsvc_testlock
lockd: handle fl_grant callbacks
lockd: save lock state on deferral
locks: add fl_grant callback for asynchronous lock return
nfsd4: Convert NFSv4 to new lock interface
locks: add lock cancel command
locks: allow {vfs,posix}_lock_file to return conflicting lock
locks: factor out generic/filesystem switch from setlock code
locks: factor out generic/filesystem switch from test_lock
locks: give posix_test_lock same interface as ->lock
locks: make ->lock release private data before returning in GETLK case
locks: create posix-to-flock helper functions
locks: trivial removal of unnecessary parentheses

Linus Torvalds
2007-05-08 03:34:24 +0800
50953fe9e slab allocators: Remove SLAB_DEBUG_INITIAL flag

I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
SLAB.

I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again? The callback is
performed before each freeing of an object.

I would think that it is much easier to check the object state manually
before the free. That also places the check near the code that
manipulates the object.

Also, the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on. If there were code in a constructor
handling SLAB_DEBUG_INITIAL, it would have to be conditional on
SLAB_DEBUG; otherwise it would just be dead code. But there is no such code
in the kernel. I think SLAB_DEBUG_INITIAL is too problematic to make real
use of and difficult to understand, and there are easier ways to accomplish the
same effect (i.e. add debug code before kfree).

There is a related flag, SLAB_CTOR_VERIFY, that is frequently checked to be
clear in fs inode caches. Remove the pointless checks (they would be
pointless even without the removal of SLAB_DEBUG_INITIAL) from the fs constructors.

This is the last slab flag that SLUB did not support. Remove the check for
unimplemented flags from SLUB.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-08 03:12:57 +0800
6fe6900e1 mm: make read_cache_page synchronous

Ensure pages are uptodate after returning from read_cache_page, which allows
us to cut out most of the filesystem-internal PageUptodate calls.

I didn't have a great look down the call chains, but this appears to fix 7
possible use-before-uptodate cases in hfs, 2 in hfsplus, 1 in jfs, a few in
ecryptfs, 1 in jffs2, and a possible case of cleared data being overwritten
by readpage in block2mtd. All depend on whether the filler is async and/or
can return with a !uptodate page.

Signed-off-by: Nick Piggin
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-05-08 03:12:51 +0800

07 May, 2007

2 commits

9d6a8c5c2 locks: give posix_test_lock same interface as ->lock

posix_test_lock() and ->lock() do the same job but have gratuitously
different interfaces. Modify posix_test_lock() so the two agree,
simplifying some code in the process.

Signed-off-by: Marc Eshel
Signed-off-by: "J. Bruce Fields"

Marc Eshel
2007-05-07 05:39:00 +0800
70cc6487a locks: make ->lock release private data before returning in GETLK case

The file_lock argument to ->lock is used to return the conflicting lock
when found. There's no reason for the filesystem to return any private
information with this conflicting lock, but nfsv4 does so.

Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
for it in this case.

Signed-off-by: "J. Bruce Fields"
Cc: "Trond Myklebust"

J. Bruce Fields
2007-05-07 05:38:19 +0800

05 May, 2007

1 commit

84dde76c4 NFS: Fix a compile glitch on 64-bit systems

fs/nfs/pagelist.c:226: error: conflicting types for 'nfs_pageio_init'
include/linux/nfs_page.h:80: error: previous declaration of 'nfs_pageio_init' was here

Thanks to Andrew for spotting this...

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-05 02:44:06 +0800

02 May, 2007

2 commits

a19b89cad NFS: Clean up nfs_create_request comments

Remove some stale comments about hard limits which went away in 2.5.

Signed-off-by: Jason Uhlenkott
Signed-off-by: Trond Myklebust

Jason Uhlenkott
2007-05-02 22:37:29 +0800
08efa202e NFS4: invalidate cached acl on setacl

The ACL that the server sets may not be exactly the one we set--for
example, it may silently turn off bits that it does not support. So we
should remove any cached ACL so that any subsequent request for the ACL
will go to the server.
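
The invalidate-on-set behaviour can be sketched as follows. This is a toy model under stated assumptions, not the NFSv4 client code: the ACL is reduced to a bit mask, and `acl_cache`, `server_setacl`, `client_setacl`, `client_getacl`, and `SERVER_SUPPORTED_BITS` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical server that silently drops bits it does not support. */
#define SERVER_SUPPORTED_BITS 0x0Fu
static unsigned int server_acl;

static void server_setacl(unsigned int bits)
{
    server_acl = bits & SERVER_SUPPORTED_BITS;
}

/* Client-side cache of the last ACL fetched from the server. */
struct acl_cache {
    bool valid;
    unsigned int cached_bits;
};

/* Send the ACL, then drop the cache instead of storing what was sent. */
static void client_setacl(struct acl_cache *c, unsigned int bits)
{
    assert(c != NULL);
    server_setacl(bits);
    c->valid = false;
}

/* Return the cached ACL, refetching from the server if the cache is empty. */
static unsigned int client_getacl(struct acl_cache *c)
{
    assert(c != NULL);
    if (!c->valid) {
        c->cached_bits = server_acl;
        c->valid = true;
    }
    return c->cached_bits;
}
```

Caching the bits that were sent would report 0x3F after setting 0x3F, whereas the refetch reports the 0x0F the toy server actually kept.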

Signed-off-by: "J. Bruce Fields"
Signed-off-by: Trond Myklebust

J. Bruce Fields
2007-05-02 22:36:09 +0800

01 May, 2007

15 commits

83672d392 NFS: Fix directory caching problem - with test case and patch.

Try running this script in an NFS mounted directory (Client relatively
recent - 2.6.18 has the problem as does 2.6.20).

------------------------------------------------------
#!/bin/bash
#
# This script will produce the following error message from tar:
#
# tar: newdir/innerdir/innerfile: file changed as we read it

# create dirs
rm -rf nfstest
mkdir -p nfstest/dir/innerdir

# create files (should not be empty)
echo "Hello World!" >nfstest/dir/file
echo "Hello World!" >nfstest/dir/innerdir/innerfile

# problem only happens if we sleep before chmod
sleep 1

# change file modes
chmod -R a+r nfstest

# rename dir
mv nfstest/dir nfstest/newdir

# tar it
tar -cf nfstest/nfstest.tar -C nfstest newdir

# restore old dir name
mv nfstest/newdir nfstest/dir
--------------------------------------------------------

What happens:

The 'chmod -R' does a readdir_plus in each directory and the results
get cached in the page cache. It then updates the ctime on each file
by one second. When this happens, the post-op attributes are used to
update the ctime stored on the client to match the value in the kernel.

The 'mv' calls shrink_dcache_parent on the directory tree which
flushes all the dentries (so a new lookup will be required) but
doesn't flush the inodes or pagecache.

The 'tar' does a readdir on each directory, but (in the case of
'innerdir' at least) satisfies it from the pagecache and uses the
READDIRPLUS data to update all the inodes. In the case of
'innerdir/innerfile', the ctime is out of date.

'tar' then calls 'lstat' on innerdir/innerfile getting an old ctime.
It then opens the file (triggering a GETATTR), reads the content, and
then calls fstat to see if anything has changed. It finds that ctime
has changed and so complains.

The problem seems to be that the cached readdirplus info is kept around
for too long.

My patch below discards pagecache data for directories when
dentry_iput is called on them. This effectively removes the symptom
which convinces me that I correctly understand the problem. However
I'm not convinced that is a proper solution, as there could easily be
other races that trigger the same problem without being affected by
this 'fix'.

One possibility would be to require that readdirplus pagecache data be
only used *once* to instantiate an inode. Somehow it should then be
invalidated so that if the dentry subsequently disappears, it will
cause a new request to the server to fill in the stat data.

Another possibility is to compare the cache_change_attribute on the
inode with something similar for the readdirplus info and reject the
info from readdirplus if it is too old.

I haven't tried to implement these and would value other opinions
before I do.

Thanks,
NeilBrown

Signed-off-by: Neil Brown
Signed-off-by: Trond Myklebust

Neil Brown
2007-05-01 13:17:19 +0800
1f4eab7e7 NFS: Set meaningful value for fattr->time_start in readdirplus results.

Don't use an uninitialised value for fattr->time_start in readdirplus results.

The 'fattr' structure filled in by nfs3_decode_dirent does not get a
value for ->time_start set.
Thus if an entry is for an inode that we already have in cache,
when nfs_readdir_lookup calls nfs_fhget, it will call nfs_refresh_inode
and may update the inode with out-of-date information.

Directories are read a page at a time, so each page could have a
different timestamp that "should" be used to set the time_start for
the fattr for info in that page. However storing the timestamp per
page is awkward. (We could stick it in the first 4 bytes and only read 4092
bytes, but that is a bigger code change than I am interested in).

This patch ignores the readdir_plus attributes if a readdir finds the
information already in cache, and otherwise sets ->time_start to the time
the readdir request was sent to the server.

It might be nice to store - in the directory inode - the time stamp for
the earliest readdir request that is still in the page cache, so that we
don't ignore attribute data that we don't have to. This patch doesn't do
that.

Signed-off-by: Neil Brown
Signed-off-by: Trond Myklebust

Neil Brown
2007-05-01 13:17:18 +0800
74dd34e6e NFS: Added support to turn off the NFSv3 READDIRPLUS RPC.

READDIRPLUS can be a performance hindrance when the client is working with
large directories. In addition, some servers still have bugs in their
implementations (e.g. Tru64 returns wrong values for the fsid).

Add a mount flag to enable users to turn it off at mount time following the
implementation in Apple's NFS client.

Signed-off-by: Steve Dickson
Signed-off-by: Trond Myklebust

Steve Dickson
2007-05-01 13:17:16 +0800
df8b172a8 NFS: switch NFSROOT to use new rpcbind client

It is arguable whether NFSROOT will support IPv6, and thus whether
rpcb_getport_external needs to support rpcbind versions greater than 2.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-01 13:17:14 +0800
2bea90d43 SUNRPC: RPC buffer size estimates are too large

The RPC buffer size estimation logic in net/sunrpc/clnt.c always
significantly overestimates the requirements for the buffer size.
A little instrumentation demonstrated that in fact rpc_malloc was never
allocating the buffer from the mempool, but almost always called kmalloc.

To compute the size of the RPC buffer more precisely, split p_bufsiz into
two fields; one for the argument size, and one for the result size.

Then, compute the sum of the exact call and reply header sizes, and split
the RPC buffer precisely between the two. That should keep almost all RPC
buffers within the 2KiB buffer mempool limit.

And, we can finally be rid of RPC_SLACK_SPACE!
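
The split described above can be sketched as a small calculation. This is an illustration under assumptions, not the code in net/sunrpc/clnt.c: `proc_sizes`, `buf_layout`, `split_buffer`, and `fits_mempool` are hypothetical names, and only the 2KiB limit comes from the text.

```c
#include <assert.h>
#include <stddef.h>

#define MEMPOOL_BUF_LIMIT 2048  /* the 2KiB mempool limit named above */

/* Hypothetical per-procedure sizes, in bytes. */
struct proc_sizes {
    size_t call_hdr;   /* RPC call header */
    size_t args;       /* encoded arguments */
    size_t reply_hdr;  /* RPC reply header */
    size_t results;    /* encoded results */
};

/* One buffer shared by the send half and the receive half. */
struct buf_layout {
    size_t send_len;   /* bytes reserved for the call */
    size_t recv_off;   /* where the reply area starts */
    size_t recv_len;   /* bytes reserved for the reply */
    size_t total;      /* size of the whole allocation */
};

/* Size each half exactly instead of padding both with slack. */
static struct buf_layout split_buffer(const struct proc_sizes *p)
{
    struct buf_layout l;

    assert(p != NULL);
    l.send_len = p->call_hdr + p->args;
    l.recv_off = l.send_len;
    l.recv_len = p->reply_hdr + p->results;
    l.total    = l.send_len + l.recv_len;
    return l;
}

/* Nonzero if the allocation can be served from the fixed-size pool. */
static int fits_mempool(const struct buf_layout *l)
{
    assert(l != NULL);
    return l->total <= MEMPOOL_BUF_LIMIT;
}
```

Keeping separate argument and result sizes is what lets the two halves be sized independently; a single combined estimate has to cover the worst case of both.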

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-05-01 13:17:10 +0800
ca52fec15 NFS: Use pgoff_t in structures and functions that pass page cache offsets

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:09 +0800
724c439c2 NFS: Clean up nfs_sync_mapping_wait()

It has no business touching wbc->pages_skipped.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:08 +0800
8d5658c94 NFS: Fix a buffer overflow in the allocation of struct nfs_read/writedata

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:07 +0800
c63c7b051 NFS: Fix a race when doing NFS write coalescing

Currently we do write coalescing in a very inefficient manner: one pass in
generic_writepages() in order to lock the pages for writing, then one pass
in nfs_flush_mapping() and/or nfs_sync_mapping_wait() in order to gather
the locked pages for coalescing into RPC requests of size "wsize".

In fact, it turns out there is actually a deadlock possible here since we
only start I/O on the second pass. If the user signals the process while
we're in nfs_sync_mapping_wait(), for instance, then we may exit before
starting I/O on all the requests that have been queued up.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:06 +0800
8b09bee30 NFS: Cleanup for nfs_readpages()

Do the coalescing of read requests into block sized requests at start of
I/O as we scan through the pages instead of going through a second pass.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:05 +0800
bcb71bba7 NFS: Another cleanup of the read/write request coalescing code

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:04 +0800
d8a5ad75c NFS: Cleanup the coalescing code

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:04 +0800
91e59c368 NFS: Don't wait for congestion in nfs_update_request()

It is redundant, and will interfere with the call to
balance_dirty_pages_ratelimited_nr in generic_file_write().

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:03 +0800
1a0ba9ae4 NFS: statfs error-handling fix

The nfs statfs function returns a success code on error, and fills the
output buffer with invalid values. The attached patch makes it return a
correct error code instead.
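
The intended behaviour can be sketched in a few lines. This is a user-space illustration, not the patch itself: `fs_stats`, `query_server`, and `my_statfs` are hypothetical names, and the failure is simulated with a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct fs_stats {
    unsigned long long blocks;
    unsigned long long bfree;
};

/* Hypothetical backend: 0 on success, negative errno on failure. */
static int query_server(int simulate_failure, struct fs_stats *out)
{
    if (simulate_failure)
        return -EIO;
    out->blocks = 1000;
    out->bfree = 250;
    return 0;
}

/* Propagate the error and leave the caller's buffer untouched on failure. */
static int my_statfs(int simulate_failure, struct fs_stats *buf)
{
    struct fs_stats tmp;
    int err;

    assert(buf != NULL);
    err = query_server(simulate_failure, &tmp);
    if (err < 0)
        return err;
    *buf = tmp;
    return 0;
}
```

Filling a temporary first means a failed call can neither report success nor leave invalid values in the output buffer.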

Signed-off-by: Amnon Aaronsohn
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Trond Myklebust
(Modified patch to reinstate the dprintk())

Amnon Aaronsohn
2007-05-01 13:17:02 +0800
d585158b6 NFS: Fix nfs_set_page_dirty()

Be more careful about testing page->mapping.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-05-01 13:17:02 +0800

21 Apr, 2007

2 commits

2b82f190c NFS: Fix race in nfs_set_page_dirty

Protect nfs_set_page_dirty() against races with nfs_inode_add_request.

Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-04-21 13:56:30 +0800
612c9384f NFS: Fix the 'desynchronized value of nfs_i.ncommit' error

Redirtying a request that is already marked for commit will screw up the
accounting for NR_UNSTABLE_NFS as well as nfs_i.ncommit.
Ensure that all requests on the commit queue are labelled with the
PG_NEED_COMMIT flag, and avoid moving them onto the dirty list inside
nfs_page_mark_flush().

Also inline nfs_mark_request_dirty() into nfs_page_mark_flush() for
atomicity reasons. Avoid dropping the spinlock until we're done marking the
request in the radix tree and have added it to the ->dirty list.

Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-04-21 13:56:29 +0800