Doug / smarc-fsl-linux-kernel | Embedian Git Server

11 Aug, 2010

1 commit

5f248c9c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
no need for list_for_each_entry_safe()/resetting with superblock list
Fix sget() race with failing mount
vfs: don't hold s_umount over close_bdev_exclusive() call
sysv: do not mark superblock dirty on remount
sysv: do not mark superblock dirty on mount
btrfs: remove junk sb_dirt change
BFS: clean up the superblock usage
AFFS: wait for sb synchronization when needed
AFFS: clean up dirty flag usage
cifs: truncate fallout
mbcache: fix shrinker function return value
mbcache: Remove unused features
add f_flags to struct statfs(64)
pass a struct path to vfs_statfs
update VFS documentation for method changes.
All filesystems that need invalidate_inode_buffers() are doing that explicitly
convert remaining ->clear_inode() to ->evict_inode()
Make ->drop_inode() just return whether inode needs to be dropped
fs/inode.c:clear_inode() is gone
fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
...

Fix up trivial conflicts in fs/nilfs2/super.c

Linus Torvalds
2010-08-11 02:26:52 +0800

10 Aug, 2010

1 commit

b57922d97 convert remaining ->clear_inode() to ->evict_inode()

Signed-off-by: Al Viro

Al Viro
2010-08-10 04:48:37 +0800

04 Aug, 2010

1 commit

1b924e5f8 NFS: Clean up the callers of nfs_wb_all()

There is no need to flush out writes before calling nfs_wb_all().

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-08-04 10:06:40 +0800

31 Jul, 2010

1 commit

f11ac8db5 NFSv4: Ensure that we track the NFSv4 lock state in read/write requests.

This patch fixes bugzilla entry 14501:
https://bugzilla.kernel.org/show_bug.cgi?id=14501

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-07-31 02:41:56 +0800

15 May, 2010

4 commits

d7cf8dd01 NFSv4: Allow attribute caching with 'noac' mounts if client holds a delegation

If the server has given us a delegation on a file, we _know_ that we can
cache the attribute information even when the user has specified 'noac'.

Reviewed-by: Chuck Lever
Signed-off-by: Trond Myklebust

Trond Myklebust
2010-05-15 03:09:30 +0800
987f8dfc9 NFS: Reduce stack footprint of nfs_setattr()

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-05-15 03:09:27 +0800
a3cba2aad NFS: Reduce stack footprint of nfs_revalidate_inode()

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-05-15 03:09:24 +0800
2d36bfde8 NFS: Add helper functions for allocating filehandles and fattr structs

NFS Filehandles and struct fattr are really too large to be allocated on
the stack. This patch adds in a couple of helper functions to allocate them
dynamically instead.
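
The idea can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the patch itself: `struct demo_fattr`, `demo_alloc_fattr()` and `demo_use_fattr()` are hypothetical names, the 512-byte payload is an arbitrary placeholder, and `calloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large attribute structure. */
struct demo_fattr {
    unsigned int valid;          /* bitmask of fields that hold real data */
    unsigned char payload[512];  /* placeholder for the remaining fields */
};

/* Allocate a zeroed structure on the heap instead of the caller's stack.
 * Returns NULL on allocation failure; the caller releases it with free(). */
struct demo_fattr *demo_alloc_fattr(void)
{
    return calloc(1, sizeof(struct demo_fattr));
}

/* Example caller: only a pointer lives on the stack. */
int demo_use_fattr(void)
{
    struct demo_fattr *fattr = demo_alloc_fattr();

    if (fattr == NULL)
        return -1;               /* report the failure to the caller */
    assert(fattr->valid == 0);   /* calloc() zeroed the structure */
    free(fattr);
    return 0;
}
```

The cost of this pattern is that every caller gains an allocation-failure path that a stack variable did not need.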

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-05-15 03:09:21 +0800

10 Apr, 2010

1 commit

1544fa0f7 NFS: Fix the mode calculation in nfs_find_open_context

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-04-10 07:08:16 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include
those headers directly instead of assuming availability. As this
conversion needs to touch a large number of source files, the following
script is used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

11 Mar, 2010

1 commit

b4d2314bb NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode()

If the NFS_INO_REVAL_FORCED flag is set, that means that we don't yet have
an up to date attribute cache. Even if we hold a delegation, we must
put a GETATTR on the wire.
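
The decision described above can be sketched as a small predicate. This is an illustrative model only, assuming a simplified set of inputs: `DEMO_REVAL_FORCED`, `demo_need_getattr()` and the boolean parameters are hypothetical and do not reproduce the kernel's actual checks.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_REVAL_FORCED 0x1u   /* hypothetical flag bit */

/* Decide whether a GETATTR must go on the wire. */
bool demo_need_getattr(unsigned int cache_flags, bool have_delegation,
                       bool cache_timed_out)
{
    if (cache_flags & DEMO_REVAL_FORCED)
        return true;             /* forced: a delegation does not help */
    if (have_delegation)
        return false;            /* delegation lets us trust the cache */
    return cache_timed_out;      /* otherwise fall back to the timeout */
}

/* Usage example: the forced flag wins over the delegation. */
void demo_getattr_example(void)
{
    assert(demo_need_getattr(DEMO_REVAL_FORCED, true, false));
    assert(!demo_need_getattr(0, true, true));
}
```

The ordering of the checks is the point: testing the delegation first would reintroduce the bug the commit message describes.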

Signed-off-by: Trond Myklebust
Cc: stable@kernel.org

Trond Myklebust
2010-03-11 04:21:44 +0800

06 Mar, 2010

8 commits

3fa04ecd7 Merge branch 'writeback-for-2.6.34' into nfs-for-2.6.34

Trond Myklebust
2010-03-06 04:46:18 +0800
1cda707d5 NFS: Remove requirement for inode->i_mutex from nfs_invalidate_mapping

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-03-06 04:44:56 +0800
5cf95214c NFS: Clean up nfs_sync_mapping

Remove the redundant call to filemap_write_and_wait().

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-03-06 04:44:56 +0800
acdc53b21 NFS: Replace __nfs_write_mapping with sync_inode()

Now that we have correct COMMIT semantics in writeback_single_inode, we can
reduce and simplify nfs_wb_all(). Also replace nfs_wb_nocommit() with a
call to filemap_write_and_wait(), which doesn't need to hold the
inode->i_mutex.

With that done, we can eliminate nfs_write_mapping() altogether.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-03-06 04:44:55 +0800
ff778d02b NFS: Add a count of the number of unstable writes carried by an inode

In order to know when we should do opportunistic commits of the unstable
writes, when the VM is doing a background flush, we add a field to count
the number of unstable writes.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-03-06 04:44:54 +0800
8fc795f70 NFS: Cleanup - move nfs_write_inode() into fs/nfs/write.c

The sole purpose of nfs_write_inode is to commit unstable writes, so
move it into fs/nfs/write.c, and make nfs_commit_inode static.

Signed-off-by: Trond Myklebust

Trond Myklebust
2010-03-06 04:44:53 +0800
a9185b41a pass writeback_control to ->write_inode

This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by being able to
distinguish between the different callers in more detail.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-03-06 02:25:52 +0800
26821ed40 make sure data is on disk before calling ->write_inode

Similar to the fsync issue fixed a while ago in commit
2daea67e966dc0c42067ebea015ddac6834cef88 we need to wait for data to
actually hit the disk before writing out the metadata to guarantee
data integrity for filesystems that modify the inode in the data I/O
completion path. Currently XFS and NFS handle this manually, and AFS
has a write_inode method that does nothing but wait for data, while
others are possibly missing out on this.

Fortunately this change has a lot less impact than the fsync change
as none of the write_inode methods starts data writeout of any form
by itself.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2010-03-06 02:25:10 +0800

04 Mar, 2010

1 commit

6eae7974d Switch alloc_nfs_open_context() to struct path

Signed-off-by: Al Viro

Al Viro
2010-03-04 03:07:56 +0800

10 Feb, 2010

1 commit

f895c53f8 NFS: Make close(2) asynchronous when closing NFS O_DIRECT files

For NFSv2 and v3:

O_DIRECT writes are always synchronous, and aren't cached, so nothing
should be flushed when closing an NFS O_DIRECT file descriptor. Thus
there are no write errors to report on close(2).

In addition, there's no cached data to verify on the next open(2),
so we don't need clean GETATTR results at close time to compare with.

Thus, there's no need for the nfs_revalidate_inode() call when closing
an NFS O_DIRECT file. This reduces the number of synchronous
on-the-wire requests for a simple open-write-close of an NFS O_DIRECT
file by roughly 20%.

For NFSv4:

Call nfs4_do_close() with wait set to zero when closing an NFS
O_DIRECT file. The CLOSE will go on the wire, but the application
won't wait for it to complete.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2010-02-10 21:31:05 +0800

03 Feb, 2010

1 commit

9b4b35134 NFS: Don't clobber the attribute type in nfs_update_inode()

If the NFS_ATTR_FATTR_TYPE field isn't set in fattr->valid, then we should
not set the S_IFMT part of inode->i_mode.

Reported-by: Al Viro
Signed-off-by: Trond Myklebust

Trond Myklebust
2010-02-03 21:27:35 +0800

24 Sep, 2009

1 commit

c08d3b0e3 truncate: use new helpers

Update some fs code to make use of new helper functions introduced
in the previous patch. Should be no significant change in behaviour
(except CIFS now calls send_sig under i_lock, via inode_newsize_ok).

Reviewed-by: Christoph Hellwig
Acked-by: Miklos Szeredi
Cc: linux-nfs@vger.kernel.org
Cc: Trond.Myklebust@netapp.com
Cc: linux-cifs-client@lists.samba.org
Cc: sfrench@samba.org
Signed-off-by: Nick Piggin
Signed-off-by: Al Viro

npiggin@suse.de
2009-09-24 20:41:47 +0800

20 Aug, 2009

1 commit

e571cbf1a NFS: Add a dns resolver for use with NFSv4 referrals and migration

The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client
from one server to another in order to support filesystem migration and
replication. For full protocol support, we need to add the ability to
convert a DNS host name into an IP address that we can feed to the RPC
client.

We'll reuse the sunrpc cache, now that it has been converted to work with
rpc_pipefs.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-08-20 06:22:15 +0800

10 Aug, 2009

1 commit

62ab460cf NFSv4: Add 'server capability' flags for NFSv4 recommended attributes

If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code
needs to know that, so that it doesn't keep trying to poll for it.

However, by the same count, if the NFSv4 server does support that
attribute, then we should ensure that the inode metadata is appropriately
labelled as being untrusted. For instance, if we don't know the correct
value of the file's uid, we should certainly not be caching ACLs or ACCESS
results.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-08-10 03:06:19 +0800

13 Jul, 2009

1 commit

405f55712 headers: smp_lock.h redux

* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-07-13 03:22:34 +0800

03 Apr, 2009

2 commits

ef79c097b NFS: Use local disk inode cache

Bind data storage objects in the local cache to NFS inodes.

Signed-off-by: David Howells
Acked-by: Steve Dickson
Acked-by: Trond Myklebust
Acked-by: Al Viro
Tested-by: Daire Byrne

David Howells
2009-04-03 23:42:43 +0800
8ec442ae4 NFS: Register NFS for caching and retrieve the top-level index

Register NFS for caching and retrieve the top-level cache index object cookie.

Signed-off-by: David Howells
Acked-by: Steve Dickson
Acked-by: Trond Myklebust
Acked-by: Al Viro
Tested-by: Daire Byrne

David Howells
2009-04-03 23:42:42 +0800

20 Mar, 2009

1 commit

7fe5c398f NFS: Optimise NFS close()

Close-to-open cache consistency rules really only require us to flush out
writes on calls to close(), and require us to revalidate attributes on the
very last close of the file.

Currently we appear to be doing a lot of extra attribute revalidation
and cache flushes.

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-20 03:35:50 +0800

12 Mar, 2009

5 commits

72cb77f4a NFS: Throttle page dirtying while we're flushing to disk

The following patch is a combination of a patch by myself and Peter
Staubach.

Trond: If we allow other processes to dirty pages while a process is doing
a consistency sync to disk, we can end up never making progress.

Peter: Attached is a patch which addresses a continuing problem with
the NFS client generating out of order WRITE requests. While
this is compliant with all of the current protocol
specifications, there are servers in the market which can not
handle out of order WRITE requests very well. Also, this may
lead to sub-optimal block allocations in the underlying file
system on the server. This may cause the read throughputs to
be reduced when reading the file from the server.

Peter: There has been a lot of work recently done to address out of
order issues on a systemic level. However, the NFS client is
still susceptible to the problem. Out of order WRITE
requests can occur when pdflush is in the middle of writing
out pages while the process dirtying the pages calls
generic_file_buffered_write which calls
generic_perform_write which calls
balance_dirty_pages_rate_limited which ends up calling
writeback_inodes which ends up calling back into the NFS
client to write out dirty pages for the same file that
pdflush happens to be working with.

Signed-off-by: Peter Staubach
[modification by Trond to merge the two similar patches]
Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:10:30 +0800
fb8a1f11b NFS: cleanup - remove struct nfs_inode->ncommit

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:10:29 +0800
9e6e70f8d NFSv4: Support NFSv4 optional attributes in the struct nfs_fattr

Currently, filling struct nfs_fattr is more or less an all or nothing
operation, since NFSv2 and NFSv3 have only mandatory attributes.
In NFSv4, some attributes are optional, and so we may simply not be able to
fill in those fields. Furthermore, NFSv4 allows you to specify which
attributes you are interested in retrieving, thus permitting you to
optimise away retrieval of attributes that you know will not change...

Signed-off-by: Trond Myklebust

Trond Myklebust
2009-03-12 02:10:24 +0800
37d9d76d8 NFS: flush cached directory information slightly more readily.

If cached directory contents become incorrect, there is no way to
flush the contents. This contrasts with files where file locking is
the recommended way to ensure cache consistency between multiple
applications (a read-lock always flushes the cache).

Also while changes to files often change the size of the file (thus
triggering a cache flush), changes to directories often do not change
the apparent size (as the size is often rounded to a block size).

So it is particularly important with directories to avoid the
possibility of an incorrect cache wherever possible.

When the link count on a directory changes it implies a change in the
number of child directories, and so a change in the contents of this
directory. So use that as a trigger to flush cached contents.

When the ctime changes but the mtime does not, there are two possible
reasons.
1/ The owner/mode information has been changed.
2/ utimes has been used to set the mtime backwards.

In the first case, a data-cache flush is not required.
In the second case it is.

So on the basis that correctness trumps performance, flush the
directory contents cache in this case also.
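
The two triggers described above can be written as a small predicate. This is a sketch under stated assumptions, not the patch: `struct demo_dir_attrs` and `demo_dir_cache_suspect()` are hypothetical, and any other invalidation trigger (such as a plain mtime change) is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical snapshot of the directory attributes we care about. */
struct demo_dir_attrs {
    unsigned long nlink;   /* link count */
    time_t ctime;          /* last status change */
    time_t mtime;          /* last modification */
};

/* Return true when the cached directory contents should be flushed. */
bool demo_dir_cache_suspect(const struct demo_dir_attrs *cached,
                            const struct demo_dir_attrs *fresh)
{
    /* A link count change implies a child directory came or went. */
    if (fresh->nlink != cached->nlink)
        return true;
    /* ctime moved but mtime did not: either a harmless owner/mode change
     * or mtime was set backwards. We cannot tell, so flush to be safe. */
    if (fresh->ctime != cached->ctime && fresh->mtime == cached->mtime)
        return true;
    return false;
}

/* Usage example covering both triggers and the unchanged case. */
void demo_dir_example(void)
{
    struct demo_dir_attrs cached = { 2, 100, 100 };
    struct demo_dir_attrs more_links = { 3, 100, 100 };
    struct demo_dir_attrs ctime_only = { 2, 150, 100 };

    assert(demo_dir_cache_suspect(&cached, &more_links));
    assert(demo_dir_cache_suspect(&cached, &ctime_only));
    assert(!demo_dir_cache_suspect(&cached, &cached));
}
```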

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2009-03-12 02:10:23 +0800
2b57dc6cf NFS: Minor __nfs_revalidate_inode cleanup

Remove redundant NFS_STALE() check, a leftover due to the commit
691beb13cdc88358334ef0ba867c080a247a760f

Signed-off-by: Suresh Jayaraman
Signed-off-by: Trond Myklebust

Suresh Jayaraman
2009-03-12 02:10:22 +0800

24 Dec, 2008

2 commits

64672d55d optimize attribute timeouts for "noac" and "actimeo=0"

Hi.

I've been looking at a bugzilla which describes a problem where
a customer was advised to use either the "noac" or "actimeo=0"
mount options to solve a consistency problem that they were
seeing in the file attributes. It turned out that this solution
did not work reliably for them because sometimes, the local
attribute cache was believed to be valid and not timed out.
(With an attribute cache timeout of 0, the cache should always
appear to be timed out.)

In looking at this situation, it appears to me that the problem
is that the attribute cache timeout code has an off-by-one
error in it. It is assuming that the cache is valid in the
region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The
cache should be considered valid only in the region,
[read_cache_jiffies, read_cache_jiffies + attrtimeo). With this
change, the options, "noac" and "actimeo=0", work as originally
expected.

This problem was previously addressed by special casing the
attrtimeo == 0 case. However, since the problem is only an
off-by-one error, the cleaner solution is to address the off-by-one
error and thus, not require the special case.
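
The difference between the two intervals is easy to see in a small userspace sketch. The function names below are hypothetical and the timestamps are plain `unsigned long` tick counts rather than real jiffies; the only assumption is that `now` is never earlier than `start`, which makes the unsigned subtraction safe across counter wraparound.

```c
#include <assert.h>
#include <stdbool.h>

/* Closed interval [start, start + timeo]: the behaviour described as buggy. */
bool demo_valid_closed(unsigned long start, unsigned long now,
                       unsigned long timeo)
{
    return now - start <= timeo;
}

/* Half-open interval [start, start + timeo): the intended behaviour. */
bool demo_valid_half_open(unsigned long start, unsigned long now,
                          unsigned long timeo)
{
    return now - start < timeo;
}

/* Usage example: with a timeout of 0 the cache must never look valid. */
void demo_timeout_example(void)
{
    assert(demo_valid_closed(100, 100, 0));      /* wrongly valid */
    assert(!demo_valid_half_open(100, 100, 0));  /* always timed out */
}
```

With the half-open test, a timeout of 0 needs no special case because `now - start < 0` can never hold for an unsigned difference.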

Thanx...

ps

Signed-off-by: Peter Staubach
Signed-off-by: Trond Myklebust

Peter Staubach
2008-12-24 04:21:56 +0800
dc0b027df NFSv4: Convert the open and close ops to use fmode

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-12-24 04:21:56 +0800

29 Oct, 2008

1 commit

ae05f2694 NFS: Convert nfs_attr_generation_counter into an atomic_long

The most important property we need from nfs_attr_generation_counter is
monotonicity, which is not guaranteed by the current system of smp memory
barriers. We should convert it to an atomic_long_t, and drop the memory
barriers.
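
As a rough userspace analogue, the same property can be had from a C11 atomic counter. This sketch uses `<stdatomic.h>` rather than the kernel's `atomic_long_t` API, and `demo_attr_generation` and `demo_inc_generation()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Shared generation counter; static storage starts it at zero. */
static atomic_ulong demo_attr_generation;

/* Atomically bump the counter and return the new value. Every caller,
 * on any thread, receives a distinct value, so no two callers can
 * observe the same generation (ignoring wraparound). */
unsigned long demo_inc_generation(void)
{
    return atomic_fetch_add(&demo_attr_generation, 1UL) + 1UL;
}

/* Usage example: successive calls yield strictly increasing values. */
void demo_generation_example(void)
{
    unsigned long first = demo_inc_generation();
    unsigned long second = demo_inc_generation();

    assert(second == first + 1);
}
```

A single read-modify-write operation replaces the separate load, increment, store and barrier steps, which is what makes the ordering guarantee simple to reason about.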

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-10-29 03:21:40 +0800

27 Oct, 2008

1 commit

526719ba5 Switch to a valid email address...

Signed-off-by: Alan Cox
Signed-off-by: Linus Torvalds

Alan Cox
2008-10-27 23:40:17 +0800

15 Oct, 2008

2 commits

011935a0a NFS: Fix a resolution problem with nfs_inode->cache_change_attribute

The cache_change_attribute is used to decide whether or not a directory has
changed, in which case we may need to look it up again. Again, the use of
'jiffies' leads to an issue of resolution.

Once again, the fix is to change nfs_inode->cache_change_attribute, and
just make it a simple counter.

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-10-15 07:24:50 +0800
4704f0e27 NFS: Fix the resolution problem with nfs_inode_attrs_need_update()

It appears that 'jiffies' timestamps do not have high enough resolution for
nfs_inode_attrs_need_update(). One problem is that a GETATTR can be
launched within < 1 jiffy of the last operation that updated the attribute.
Another problem is that RPC calls can take < 1 jiffy to execute.

We can fix this by switching the variables to use a simple global counter
that gets incremented every time we start another GETATTR call.
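
To illustrate why a counter succeeds where a coarse timestamp fails, here is a minimal single-threaded sketch. The names `demo_getattr_generation`, `demo_start_getattr()` and `demo_reply_is_newer()` are hypothetical, and the comparison shown is one plausible way to order generations, not necessarily the one the patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global counter, bumped once per GETATTR that is started.
 * Not thread-safe as written; a real implementation needs atomicity. */
static unsigned long demo_getattr_generation;

/* Tag a new GETATTR call with the next generation number. */
unsigned long demo_start_getattr(void)
{
    return ++demo_getattr_generation;
}

/* A reply is newer than the cached attributes only if its generation
 * is strictly ahead. The signed difference tolerates counter wraparound. */
bool demo_reply_is_newer(unsigned long cached_gen, unsigned long reply_gen)
{
    return (long)(reply_gen - cached_gen) > 0;
}

/* Usage example: two calls issued within the same clock tick share a
 * timestamp but still receive distinct, ordered generation numbers. */
void demo_counter_example(void)
{
    unsigned long tick_a = 1000, tick_b = 1000;   /* same tick */
    unsigned long gen_a = demo_start_getattr();
    unsigned long gen_b = demo_start_getattr();

    assert(tick_a == tick_b);                     /* timestamps cannot order them */
    assert(demo_reply_is_newer(gen_a, gen_b));    /* generations can */
    assert(!demo_reply_is_newer(gen_b, gen_a));
}
```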

Signed-off-by: Trond Myklebust

Trond Myklebust
2008-10-15 07:23:17 +0800