08 Oct, 2008

1 commit


16 Jul, 2008

1 commit

  • The main problem is dealing with inode->i_size: we need to hold
    inode->i_lock across all attribute updates, so vmtruncate() won't cut it.
    Make an NFS-private version of vmtruncate() with the necessary locking
    semantics.

    The result should be that the following inode attribute updates are
    protected by inode->i_lock:
    nfsi->cache_validity
    nfsi->read_cache_jiffies
    nfsi->attrtimeo
    nfsi->attrtimeo_timestamp
    nfsi->change_attr
    nfsi->last_updated
    nfsi->cache_change_attribute
    nfsi->access_cache
    nfsi->access_cache_entry_lru
    nfsi->access_cache_inode_lru
    nfsi->acl_access
    nfsi->acl_default
    nfsi->nfs_page_tree
    nfsi->ncommit
    nfsi->npages
    nfsi->open_files
    nfsi->silly_list
    nfsi->acl
    nfsi->open_states
    inode->i_size
    inode->i_atime
    inode->i_mtime
    inode->i_ctime
    inode->i_nlink
    inode->i_uid
    inode->i_gid

    The following is protected by dir->i_mutex:
    nfsi->cookieverf
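
    A hedged sketch of the idea (not the literal patch; the exact call
    sequence is illustrative, assuming <linux/fs.h> and <linux/mm.h>):

        /* Update i_size under inode->i_lock, then unmap and truncate the
         * page cache; plain vmtruncate() cannot provide this locking. */
        static int nfs_vmtruncate(struct inode *inode, loff_t offset)
        {
                spin_lock(&inode->i_lock);
                i_size_write(inode, offset);
                spin_unlock(&inode->i_lock);

                unmap_mapping_range(inode->i_mapping,
                                    offset + PAGE_SIZE - 1, 0, 1);
                truncate_inode_pages(inode->i_mapping, offset);
                return 0;
        }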

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

10 Jul, 2008

6 commits

  • Currently, once an unstable write completes, we cannot redirty the page
    to reflect a new change to its data until after we've sent a COMMIT
    request.

    This patch allows a page rewrite to proceed without the unnecessary COMMIT
    step: the page goes immediately back onto the dirty page list, the VM's
    unstable-write accounting is undone, and the NFS_PAGE_TAG_COMMIT tag is
    removed from the NFS radix tree.
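
    A hedged sketch of the redirty step (the helper name is hypothetical;
    assumes the usual fs/nfs/write.c context):

        /* Put an unstable-write page straight back on the dirty list:
         * drop the commit tag, undo the unstable accounting, redirty. */
        static void nfs_redirty_unstable_page(struct nfs_page *req)
        {
                struct page *page = req->wb_page;

                radix_tree_tag_clear(&NFS_I(page->mapping->host)->nfs_page_tree,
                                     req->wb_index, NFS_PAGE_TAG_COMMIT);
                dec_zone_page_state(page, NR_UNSTABLE_NFS);
                __set_page_dirty_nobuffers(page);
        }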

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Simplify the loop in nfs_update_request() by moving the code that
    attempts to update an existing cached NFS write into a separate function.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Clean up: fix a few dprintk messages that still need to show the RPC task ID
    correctly, and be sure we use the preferred %lld or %llu instead of %Ld or
    %Lu.
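
    For illustration (a hedged sketch; the message text and variables are
    hypothetical):

        dprintk("NFS: %5u nfs_commit_done (status %d)\n",
                task->tk_pid, task->tk_status);
        dprintk("NFS: write at offset %lld\n", (long long)offset);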

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Revert commit 44dd151d "NFS: Don't mark a written page as uptodate until it
    is on disk". While it is true that the write may fail, that is true of any
    write. There is no reason to treat data on pages that are not already
    marked as PG_uptodate as special. The only thing we gain is a noticeable
    slowdown when re-reading these pages.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If a file is being extended, and we're creating a hole, we might as well
    declare the entire page to be up to date.

    This patch significantly improves the write performance for sparse files
    in the case where lseek(SEEK_END) is used to append several non-contiguous
    writes at intervals of < PAGE_SIZE.
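
    A sketch of the idea (simplified, not the literal patch; the helper name
    is hypothetical, and 'offset'/'count' describe the write within the page):

        static void demo_mark_hole_uptodate(struct inode *inode,
                                            struct page *page,
                                            unsigned int offset,
                                            unsigned int count)
        {
                loff_t base = (loff_t)page->index << PAGE_CACHE_SHIFT;

                /* Page lies entirely beyond the old EOF: every byte we are
                 * not writing belongs to the new hole, i.e. reads as zero,
                 * so the whole page can be declared up to date. */
                if (base >= i_size_read(inode)) {
                        zero_user_segments(page, 0, offset,
                                           offset + count, PAGE_CACHE_SIZE);
                        SetPageUptodate(page);
                }
        }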

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Commit 2785259631697ebb0749a3782cca206e2e542939 (nfs: use GFP_NOFS
    preloads for radix-tree insertion) appears to have introduced a bug:
    we only want to call radix_tree_preload() once, after creating a request.
    Calling it on every pass through the loop after the request has been
    created leaks the preemption count, because each successful
    radix_tree_preload() disables preemption and only a single
    radix_tree_preload_end() re-enables it.
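
    A sketch of the corrected pattern, as it might look in the
    nfs_update_request() loop (simplified; the labels are illustrative):

        /* Preload once, right after creating the request; a successful
         * radix_tree_preload() disables preemption. */
        error = radix_tree_preload(GFP_NOFS);
        if (error < 0)
                goto out_release_request;
        for (;;) {
                spin_lock(&inode->i_lock);
                error = radix_tree_insert(&NFS_I(inode)->nfs_page_tree,
                                          req->wb_index, req);
                spin_unlock(&inode->i_lock);
                if (error != -EEXIST)
                        break;
                /* resolve the conflicting request and retry, WITHOUT
                 * calling radix_tree_preload() again */
        }
        /* exactly one matching call re-enables preemption */
        radix_tree_preload_end();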

    Signed-off-by: Trond Myklebust
    Cc: Nick Piggin

    Trond Myklebust
     

24 Jun, 2008

1 commit


17 May, 2008

1 commit

  • When called from nfs_flush_incompatible(), the request is not locked, so
    req->wb_page might be set to NULL before PageWriteback() uses it.

    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Trond Myklebust

    Fred Isaman
     

20 Apr, 2008

3 commits


20 Mar, 2008

3 commits

  • Both flush functions have the same error handling routine. Pull
    it out as a function.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred
     
  • Trond Myklebust
     
  • Ignoring the return value from nfs_pageio_add_request can cause deadlocks.

    In the read path:
    1. readpage_async_filler calls nfs_pageio_add_request.
    2. Assume desc already holds requests that cannot be merged with the
       current request, so nfs_pageio_doio is fired up to clear out desc.
    3. Assume something goes wrong in setting up the I/O, so desc->pg_error
       is set.
    4. nfs_pageio_add_request then returns 0 *WITHOUT* adding the original
       request.
    5. Since the return code is ignored, readpage_async_filler assumes the
       request has been added and does nothing further, leaving the page
       locked.
    6. do_generic_mapping_read eventually calls lock_page, resulting in
       deadlock.

    In the write path:
    1. The page is marked dirty by generic_perform_write.
    2. nfs_writepages is called, which calls nfs_pageio_add_request from
       nfs_page_async_flush.
    3. Assume desc already holds requests that cannot be merged with the
       current request, so nfs_pageio_doio is fired up to clear out desc.
    4. Assume something goes wrong in setting up the I/O, so desc->pg_error
       is set.
    5. nfs_page_async_flush then returns 0 *WITHOUT* adding the original
       request, yet it has marked the request as locked (PG_BUSY) and in
       writeback, and has cleared the dirty marks.
    6. The next time a write is done to the page, a deadlock results when
       nfs_write_end calls nfs_update_request.
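
    A hedged sketch of the corresponding check in the read path (simplified;
    the error label is illustrative):

        if (!nfs_pageio_add_request(desc, new)) {
                error = desc->pg_error;   /* the request was NOT added */
                goto out_unlock;          /* unlock the page and report it */
        }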

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred Isaman
     

08 Mar, 2008

1 commit


29 Feb, 2008

2 commits


26 Feb, 2008

3 commits

  • We want to ensure that rpc_call_ops that involve mntput() are run on nfsiod
    rather than on rpciod, so that they don't deadlock when the resulting
    umount calls rpc_shutdown_client(). Hence we specify that read, write and
    commit calls must complete on nfsiod.
    Ditto for NFSv4 open, lock, locku and close asynchronous calls.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • We can't allow rpc callback functions like task->tk_ops->rpc_call_prepare()
    and task->tk_ops->rpc_call_done() to call mntput() in any way, since
    that will cause a deadlock when the call to rpc_shutdown_client() attempts
    to wait on 'task' to complete.

    We can avoid the above deadlock by moving calls to mntput to
    task->tk_ops->rpc_release() callback, since at that time the task will be
    marked as completed, and so rpc_shutdown_client won't attempt to wait on
    it.
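
    A sketch of the resulting shape (names and the calldata layout are
    hypothetical; assumes the usual sunrpc and slab headers):

        struct demo_calldata {
                struct vfsmount *mnt;
        };

        static void demo_rpc_call_done(struct rpc_task *task, void *calldata)
        {
                /* must NOT call mntput() here: rpc_shutdown_client() may
                 * still be waiting for this task to complete */
        }

        static void demo_rpc_release(void *calldata)
        {
                struct demo_calldata *data = calldata;

                mntput(data->mnt);      /* task already marked complete: safe */
                kfree(data);
        }

        static const struct rpc_call_ops demo_call_ops = {
                .rpc_call_done  = demo_rpc_call_done,
                .rpc_release    = demo_rpc_release,
        };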

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • O_SYNC is stored in filp->f_flags.
    Thanks to Al Viro for pointing out the bug.
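
    In other words (a trivial sketch):

        /* The flag must be tested on the struct file, not on the inode. */
        if (filp->f_flags & O_SYNC) {
                /* ... handle the synchronous write case ... */
        }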

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

14 Feb, 2008

1 commit

  • NFS should use GFP_NOFS-mode radix tree preloads rather than GFP_ATOMIC
    allocations at radix-tree insertion time. This is important to reduce the
    atomic memory requirement.

    Signed-off-by: Nick Piggin
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Signed-off-by: Andrew Morton
    Signed-off-by: Trond Myklebust

    Nick Piggin
     

08 Feb, 2008

1 commit

  • If the inode is flagged as having an invalid mapping, then we can't rely on
    the PageUptodate() flag. Ensure that we don't use the "anti-fragmentation"
    write optimisation in nfs_updatepage(), since that will cause NFS to write
    out areas of the page that are no longer guaranteed to be up to date.

    A potential corruption could occur in the following scenario:

    client 1                              client 2
    ===============                       ===============
                                          fd=open("f",O_CREAT|O_WRONLY,0644);
                                          write(fd,"fubar\n",6); // cache last page
                                          close(fd);
    fd=open("f",O_WRONLY|O_APPEND);
    write(fd,"foo\n",4);
    close(fd);
                                          fd=open("f",O_WRONLY|O_APPEND);
                                          write(fd,"bar\n",4);
                                          close(fd);
    -----
    The bug may lead to the file "f" reading 'fubar\n\0\0\0\nbar\n' because
    client 2 does not update the cached page after re-opening the file for
    write. Instead it keeps it marked as PageUptodate() until someone calls
    invalidate_inode_pages2() (typically by calling read()).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

06 Feb, 2008

1 commit

  • Simplify page cache zeroing of segments of pages through three functions:

    zero_user_segments(page, start1, end1, start2, end2)

    Zeros two segments of the page. It takes the position where to
    start and end the zeroing which avoids length calculations and
    makes code clearer.

    zero_user_segment(page, start, end)

    Same for a single segment.

    zero_user(page, start, length)

    Length variant for the case where we know the length.

    We remove the zero_user_page macro. Issues:

    1. It's a macro. Inline functions are preferable.

    2. The KM_USER0 macro is only defined for HIGHMEM.

    Having to treat this special case everywhere makes the
    code needlessly complex. The parameter for zeroing is always
    KM_USER0 except in one single case that we open code.

    Avoiding KM_USER0 means a lot of code no longer has to deal with the
    special casing for HIGHMEM. Dealing with kmap is only necessary for
    HIGHMEM configurations; in those configurations we use KM_USER0 as we
    do for a series of other functions defined in highmem.h.

    Since KM_USER0 depends on HIGHMEM, the existing zero_user_page could not
    be an inline function and had to remain a macro. The zero_user_* functions
    introduced here can be inline because that constant is not used when these
    functions are called.

    Also extract the flushing of the caches to be outside of the kmap.
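
    For illustration, the three variants side by side (a sketch; the helper
    and its 'offset'/'count' parameters are hypothetical, assuming
    <linux/highmem.h>):

        static void demo_zero(struct page *page, unsigned int offset,
                              unsigned int count)
        {
                /* zero [0, offset) and [offset + count, PAGE_SIZE) */
                zero_user_segments(page, 0, offset,
                                   offset + count, PAGE_SIZE);
                /* zero the single segment [offset + count, PAGE_SIZE) */
                zero_user_segment(page, offset + count, PAGE_SIZE);
                /* zero 'count' bytes starting at 'offset' */
                zero_user(page, offset, count);
        }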

    [akpm@linux-foundation.org: fix nfs and ntfs build]
    [akpm@linux-foundation.org: fix ntfs build some more]
    Signed-off-by: Christoph Lameter
    Cc: Steven French
    Cc: Michael Halcrow
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: David Chinner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

01 Feb, 2008

1 commit

  • * 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
    Remove commented-out code copied from NFS
    NFS: Switch from intr mount option to TASK_KILLABLE
    Add wait_for_completion_killable
    Add wait_event_killable
    Add schedule_timeout_killable
    Use mutex_lock_killable in vfs_readdir
    Add mutex_lock_killable
    Use lock_page_killable
    Add lock_page_killable
    Add fatal_signal_pending
    Add TASK_WAKEKILL
    exit: Use task_is_*
    signal: Use task_is_*
    sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
    ptrace: Use task_is_*
    power: Use task_is_*
    wait: Use TASK_NORMAL
    proc/base.c: Use task_is_*
    proc/array.c: Use TASK_REPORT
    perfmon: Use task_is_*
    ...

    Fixed up conflicts in NFS/sunrpc manually.
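
    As a sketch of the new style (the helper is illustrative): a killable
    wait is interrupted only by fatal signals such as SIGKILL, unlike the
    TASK_INTERRUPTIBLE waits used under the old "intr" mount option:

        static int demo_lock_page(struct page *page)
        {
                if (lock_page_killable(page))
                        return -EIO;    /* fatal_signal_pending() was true */
                /* ... critical section ... */
                unlock_page(page);
                return 0;
        }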

    Linus Torvalds
     

30 Jan, 2008

6 commits


07 Dec, 2007

1 commit


27 Nov, 2007

1 commit


20 Oct, 2007

1 commit

  • This patch fixes a regression that was introduced by commit
    44dd151d5c21234cc534c47d7382f5c28c3143cd

    We cannot zero the user page in nfs_mark_uptodate() any more, since

    a) We'd be modifying the page without holding the page lock
    b) We can race with other updates of the page, most notably
    because of the call to nfs_wb_page() in nfs_writepage_setup().

    Instead, we do the zeroing in nfs_update_request() if we see that we're
    creating a request that might potentially be marked as up to date.

    Thanks to Olivier Paquet for reporting the bug and providing a test-case.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

17 Oct, 2007

2 commits

  • Count per-BDI reclaimable pages: nr_reclaimable = nr_dirty + nr_unstable.
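
    A sketch of the accounting (the helper name is hypothetical;
    inc_bdi_stat() and BDI_RECLAIMABLE come from this patch series):

        /* Count an NFS unstable page against the per-BDI reclaimable
         * counter as well as the zone counter. */
        static void demo_account_unstable(struct page *page)
        {
                inc_zone_page_state(page, NR_UNSTABLE_NFS);
                inc_bdi_stat(page->mapping->backing_dev_info,
                             BDI_RECLAIMABLE);
        }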

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • These patches aim to improve balance_dirty_pages() and directly address
    three issues:
    1) inter-device starvation
    2) stacked-device deadlocks
    3) inter-process starvation

    Issues 1 and 2 are a direct result of removing the global dirty limit and
    using per-device dirty limits. By giving each device its own dirty limit,
    one device will no longer starve another, and the cyclic dependency on the
    dirty limit is broken.

    In order to distribute the dirty limit efficiently across the independent
    devices, a floating proportion is used; this allocates each device a share
    of the total limit proportional to its recent activity.

    Issue 3 is addressed by also scaling the dirty limit in proportion to the
    current task's recent dirty rate.

    This patch:

    nfs: remove congestion_end(). It's redundant; clear_bdi_congested()
    already wakes the waiters.
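
    A sketch of what remains after the removal (close to the fs/nfs/write.c
    congestion logic, but simplified):

        static void demo_end_page_writeback(struct inode *inode)
        {
                struct nfs_server *nfss = NFS_SERVER(inode);

                /* clear_bdi_congested() wakes the waiters itself, so no
                 * separate congestion_end() call is needed */
                if (atomic_long_dec_return(&nfss->writeback) <
                                NFS_CONGESTION_OFF_THRESH)
                        clear_bdi_congested(&nfss->backing_dev_info, WRITE);
        }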

    Signed-off-by: Peter Zijlstra
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

10 Oct, 2007

3 commits