Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

29 Jan, 2014

1 commit

4db72b40f nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING

If the setting of NFS_INO_INVALIDATING gets reordered to before the
clearing of NFS_INO_INVALID_DATA, then another task may hit a race
window where both appear to be clear, even though the inode's pages are
still in need of invalidation. Fix this by adding the appropriate memory
barriers.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2014-01-29 03:48:18 +0800

28 Jan, 2014

1 commit

d529ef83c NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping

There is a possible race in how the nfs_invalidate_mapping function is
handled. Currently, we go and invalidate the pages in the file and then
clear NFS_INO_INVALID_DATA.

The problem is that it's possible for a stale page to creep into the
mapping after the page was invalidated (i.e., via readahead). If another
writer comes along and sets the flag after that happens but before
invalidate_inode_pages2 returns then we could clear the flag
without the cache having been properly invalidated.

So, we must clear the flag first and then invalidate the pages. Doing
this, however, opens another race:

It's possible to have two concurrent read() calls that end up in
nfs_revalidate_mapping at the same time. The first one clears the
NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping.

Just before calling that though, the other task races in, checks the
flag and finds it cleared. At that point, it trusts that the mapping is
good and gets the lock on the page, allowing the read() to be satisfied
from the cache even though the data is no longer valid.

These effects are easily manifested by running diotest3 from the LTP
test suite on NFS. That program does a series of DIO writes and buffered
reads. The operations are serialized and page-aligned but the existing
code fails the test since it occasionally allows a read to come out of
the cache incorrectly. While mixing direct and buffered I/O isn't
recommended, I believe it's possible to hit this in other ways that just
use buffered I/O, though that situation is much harder to reproduce.

The problem is that the checking/clearing of that flag and the
invalidation of the mapping really need to be atomic. Fix this by
serializing concurrent invalidations with a bitlock.

At the same time, we also need to allow other places that check
NFS_INO_INVALID_DATA to check whether we might be in the middle of
invalidating the file, so fix up a couple of places that do that
to look for the new NFS_INO_INVALIDATING flag.

Doing this requires us to be careful not to set the bitlock
unnecessarily, so this code only does that if it believes it will
be doing an invalidation.
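
The scheme described in this message can be modelled in userspace roughly as follows. This is a toy Python sketch, not the kernel code: `ToyInode`, `revalidate_mapping` and `cache_is_trustworthy` are hypothetical names, a `threading.Lock` stands in for the bitlock, and a plain boolean stands in for the NFS_INO_INVALIDATING flag.

```python
import threading


class ToyInode:
    """Userspace stand-in for an NFS inode's cache state (hypothetical)."""

    def __init__(self):
        self.invalid_data = False          # models NFS_INO_INVALID_DATA
        self.invalidating = False          # models NFS_INO_INVALIDATING
        self.pages = {}                    # models the page cache mapping
        self._bitlock = threading.Lock()   # stands in for the bitlock

    def revalidate_mapping(self):
        # Avoid taking the lock unless an invalidation looks necessary.
        if not self.invalid_data:
            return False
        with self._bitlock:
            # Re-check under the lock: another task may have done the work.
            if not self.invalid_data:
                return False
            self.invalidating = True
            self.invalid_data = False      # clear the flag first ...
            self.pages.clear()             # ... then invalidate the pages
            self.invalidating = False
            return True

    def cache_is_trustworthy(self):
        # An in-progress invalidation must be treated like a set flag.
        return not (self.invalid_data or self.invalidating)
```

The point of the sketch is that checking the flag, clearing it and dropping the pages all happen under one lock, while readers of the flag also look at the "invalidating" marker.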

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2014-01-28 04:35:56 +0800

18 Jan, 2014

1 commit

263b4509e nfs: always make sure page is up-to-date before extending a write to cover the entire page

We should always make sure the cached page is up-to-date when we're
determining whether we can extend a write to cover the full page -- even
if we've received a write delegation from the server.

Commit c7559663 added logic to skip this check if we have a write
delegation, which can lead to data corruption such as the following
scenario if client B receives a write delegation from the NFS server:

Client A:
# echo 123456789 > /mnt/file

Client B:
# echo abcdefghi >> /mnt/file
# cat /mnt/file
0�D0�abcdefghi

Just because we hold a write delegation doesn't mean that we've read in
the entire page contents.
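
The rule this message argues for can be written down as a small decision function. The sketch below is a simplified Python model under stated assumptions: `WriteState` and `can_extend_write` are hypothetical names, and the lock-related conditions are taken from the description of commit c7559663 rather than from the kernel source.

```python
from dataclasses import dataclass


@dataclass
class WriteState:
    """Hypothetical snapshot of what the client knows about one page."""
    page_uptodate: bool            # cached page holds the complete contents
    have_write_delegation: bool    # server granted a write delegation
    file_has_locks: bool           # byte-range locks exist on the file
    whole_file_write_locked: bool = False


def can_extend_write(s: WriteState) -> bool:
    # The up-to-date check comes first and is never skipped.
    if not s.page_uptodate:
        return False
    if s.have_write_delegation:
        return True
    # Without a delegation, fall back to the lock-based conditions.
    return (not s.file_has_locks) or s.whole_file_write_locked
```

With this ordering, a write delegation can only widen the set of cases in which an already up-to-date page is extended; it never substitutes for the page being up to date.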

Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Scott Mayhew
Signed-off-by: Trond Myklebust

Scott Mayhew
2014-01-18 04:37:15 +0800

06 Jan, 2014

1 commit

1e8968c5b NFS: dprintk() should not print negative fileids and inode numbers

A fileid in NFS is a uint64. There are some occurrences where dprintk()
outputs a signed fileid. This leads to confusion and makes the debugging
output harder to read (negative fileids matching positive inode numbers).
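
To illustrate the confusion, here is a small Python sketch of what happens when an unsigned 64-bit value is read back as signed. The helper name `as_signed64` and the sample fileid are made up for the example.

```python
def as_signed64(fileid: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed one."""
    fileid &= 0xFFFFFFFFFFFFFFFF
    return fileid - (1 << 64) if fileid >= (1 << 63) else fileid


fileid = 0xFFFFFFFFFFFFFFFE                # a legal uint64 fileid
print("signed:  ", as_signed64(fileid))    # prints -2
print("unsigned:", fileid)                 # the value the server meant
```

Any fileid with the top bit set shows up as a negative number under a signed format, while small values print identically either way, which is why the problem is easy to miss.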

Signed-off-by: Niels de Vos
CC: Santosh Pradhan
Signed-off-by: Trond Myklebust

Niels de Vos
2014-01-06 04:51:23 +0800

25 Oct, 2013

1 commit

6de1472f1 nfs: use %p[dD] instead of open-coded (and often racy) equivalents

Signed-off-by: Al Viro

Al Viro
2013-10-25 11:34:50 +0800

06 Sep, 2013

1 commit

0f1d26055 NFS: Don't check lock owner compatibility in writes unless file is locked

If we're doing buffered writes, and there is no file locking involved,
then we don't have to worry about whether or not the lock owner information
is identical.
By relaxing this check, we ensure that fork()ed child processes can write
to a page without having to first sync dirty data that was written
by the parent to disk.
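
A minimal Python sketch of the relaxed check, assuming only what the message states. `must_flush_first` and its parameters are hypothetical names; the kernel function performs further checks that are not modelled here.

```python
def must_flush_first(same_lock_owner: bool, file_has_locks: bool) -> bool:
    """Decide whether dirty data must be synced before a new write."""
    # Lock-owner mismatches only matter when file locking is in use.
    return file_has_locks and not same_lock_owner
```

For a fork()ed child writing to an unlocked file, `must_flush_first(False, False)` is false, so the child can dirty the page without forcing the parent's data to disk first.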

Reported-by: Quentin Barnes
Signed-off-by: Trond Myklebust
Tested-by: Quentin Barnes

Trond Myklebust
2013-09-06 06:11:42 +0800

05 Sep, 2013

2 commits

8c21c62c4 nfs4.1: Add SP4_MACH_CRED write and commit support

WRITE and COMMIT can use the machine credential.

If WRITE is supported and COMMIT is not, make all (mach cred) writes FILE_SYNC4.

Signed-off-by: Weston Andros Adamson
Signed-off-by: Trond Myklebust

Weston Andros Adamson
2013-09-05 22:50:45 +0800
ef1820f9b NFSv4: Don't try to recover NFSv4 locks when they are lost.

When an NFSv4 client loses contact with the server it can lose any
locks that it holds.

Currently when it reconnects to the server it simply tries to reclaim
those locks. This might succeed even though some other client has
held and released a lock in the meantime. So the first client might
think the file is unchanged, but it isn't. This isn't good.

If, when recovery happens, the locks cannot be claimed because some
other client still holds the lock, then we get a message in the kernel
logs, but the client can still write. So two clients can both think
they have a lock and can both write at the same time. This is equally
not good.

There was a patch a while ago
http://comments.gmane.org/gmane.linux.nfs/41917

which tried to address some of this, but it didn't seem to go
anywhere. That patch would also send a signal to the process. That
might be useful but for now this patch just causes writes to fail.

For NFSv4 (unlike v2/v3) there is a strong link between the lock and
the write request so we can fairly easily fail any IO if the lock is
gone. While some applications might not expect this, it is still
safer than allowing the write to succeed.

Because this is a fairly big change in behaviour, a module parameter,
"recover_locks", is introduced which defaults to true (the current
behaviour) but can be set to "false" to tell the client not to try to
recover things that were lost.
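
The behaviour described here can be sketched as a toy Python model. Everything in it is an assumption made for illustration: the names `after_reconnect`, `write_under_lock` and `LostLockError` are hypothetical, the choice of EIO as the error is a guess, and the model assumes that a reclaim attempt succeeds.

```python
import errno


class LostLockError(OSError):
    """Raised by this toy model when a write depends on a lost lock."""


def after_reconnect(held_lock: bool, recover_locks: bool) -> bool:
    """Return True if the lock should now be treated as lost."""
    # recover_locks=True models the default: try to reclaim the lock
    # (assumed to succeed here). recover_locks=False marks every
    # previously held lock as lost.
    return held_lock and not recover_locks


def write_under_lock(lock_lost: bool, data: bytes) -> int:
    # Once the lock is known to be lost, refuse the I/O instead of
    # risking a write that races with another lock holder.
    if lock_lost:
        raise LostLockError(errno.EIO, "lock lost, write refused")
    return len(data)
```

The trade-off is visible in the two functions: reclaiming keeps existing applications working, while refusing the write makes the loss of the lock visible to the application.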

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2013-09-05 00:26:32 +0800

04 Sep, 2013

1 commit

dc24826bf NFS avoid expired credential keys for buffered writes

We must avoid buffering a WRITE that is using a credential key (e.g. a GSS
context key) that is about to expire or has expired. We currently will
paint ourselves into a corner by returning success to the application
for such a buffered WRITE, only to discover that we do not have permission when
we attempt to flush the WRITE (and potentially associated COMMIT) to disk.

Use the RPC layer credential key timeout and expire routines which use
a watermark, gss_key_expire_timeo. We test the key in nfs_file_write.

If a WRITE is using a credential with a key that will expire within
watermark seconds, flush the inode in nfs_write_end and send only
NFS_FILE_SYNC WRITEs by adding nfs_ctx_key_to_expire to nfs_need_sync_write.
Note that this results in single page NFS_FILE_SYNC WRITEs.
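
The watermark test can be sketched in a few lines of Python. This is an illustration only: `key_to_expire` and `need_sync_write` are hypothetical stand-ins for the kernel helpers named above, times are plain seconds, and any watermark value passed in is arbitrary.

```python
def key_to_expire(expiry: float, now: float, watermark: float) -> bool:
    """True if the key has expired or will expire within `watermark` s."""
    return expiry - now <= watermark


def need_sync_write(o_dsync: bool, expiry: float, now: float,
                    watermark: float) -> bool:
    # Force synchronous writes either when the file was opened for
    # synchronous I/O or when the credential key is close to expiring.
    return o_dsync or key_to_expire(expiry, now, watermark)
```

For example, with a watermark of 240 seconds, a key expiring in 100 seconds forces synchronous writes, while one expiring in 900 seconds does not.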

Signed-off-by: Andy Adamson
[Trond: removed a pr_warn_ratelimited() for now]
Signed-off-by: Trond Myklebust

Andy Adamson
2013-09-04 03:25:09 +0800

22 Aug, 2013

1 commit

f4ce1299b NFS: Add event tracing for generic NFS events

Add tracepoints for inode attribute updates, attribute revalidation,
writeback start/end, fsync start/end, attribute change start/end,
permission check start/end.

The intention is to enable performance tracing using 'perf' as well as
improving debugging.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-08-22 20:58:17 +0800

10 Jul, 2013

1 commit

c7559663e NFS: Allow nfs_updatepage to extend a write under additional circumstances

Currently nfs_updatepage allows a write to be extended to cover a full
page only if we don't have a byte range lock on the file... but if
we have a write delegation on the file or if we have the whole file
locked for writing then we should be allowed to extend the write as
well.

Signed-off-by: Scott Mayhew
[Trond: fix up call to nfs_have_delegation()]
Signed-off-by: Trond Myklebust

Scott Mayhew
2013-07-10 07:32:50 +0800

26 Mar, 2013

1 commit

c58c84418 NFS: Don't accept more reads/writes if the open context recovery failed

If the state recovery failed, we want to ensure that the application
doesn't try to use the same file descriptor for more reads or writes.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-03-26 00:04:10 +0800

05 Jan, 2013

1 commit

6db6dd7d3 NFS: Ensure that we free the rpc_task after read and write cleanups are done

This patch ensures that we free the rpc_task after the cleanup callbacks
are done in order to avoid a deadlock problem that can be triggered if
the callback needs to wait for another workqueue item to complete.

Signed-off-by: Trond Myklebust
Cc: Weston Andros Adamson
Cc: Tejun Heo
Cc: Bruce Fields
Cc: stable@vger.kernel.org [>= 3.5]

Trond Myklebust
2013-01-05 01:59:10 +0800

21 Dec, 2012

1 commit

8c209ce72 NFS: nfs_migrate_page() does not wait for FS-Cache to finish with a page

nfs_migrate_page() does not wait for FS-Cache to finish with a page, probably
leading to the following bad-page-state:

BUG: Bad page state in process python-bin pfn:17d39b
page:ffffea00053649e8 flags:004000000000100c count:0 mapcount:0 mapping:(null)
index:38686 (Tainted: G B ---------------- )
Pid: 31053, comm: python-bin Tainted: G B ----------------
2.6.32-71.24.1.el6.x86_64 #1
Call Trace:
bad_page+0x107/0x160
free_hot_cold_page+0x1c9/0x220
__pagevec_free+0x59/0xb0
? flush_tlb_others_ipi+0x128/0x130
release_pages+0x21c/0x250
? remove_migration_pte+0x28a/0x2b0
? mem_cgroup_get_reclaim_stat_from_page+0x18/0x70
____pagevec_lru_add+0x167/0x180
__lru_cache_add+0x58/0x70
lru_cache_add_lru+0x21/0x40
putback_lru_page+0x69/0x100
migrate_pages+0x13d/0x5d0
? ____pagevec_lru_add+0x167/0x180
? compaction_alloc+0x0/0x370
compact_zone+0x4cc/0x600
? get_page_from_freelist+0x15c/0x820
? check_preempt_wakeup+0x1c4/0x3c0
compact_zone_order+0x7e/0xb0
try_to_compact_pages+0x109/0x170
__alloc_pages_nodemask+0x5ed/0x850
? thread_return+0x4e/0x778
alloc_pages_vma+0x93/0x150
do_huge_pmd_anonymous_page+0x135/0x340
? rwsem_down_read_failed+0x26/0x30
handle_mm_fault+0x245/0x2b0
do_page_fault+0x123/0x3a0
page_fault+0x25/0x30

nfs_migrate_page() calls nfs_fscache_release_page() which doesn't actually wait
- even if __GFP_WAIT is set. The reason that it doesn't wait is that
fscache_maybe_release_page() might deadlock the allocator as the work threads
writing to the cache may all end up sleeping on memory allocation.

However, I wonder if that is actually a problem. There are a number of things
I can do to deal with this:

(1) Make nfs_migrate_page() wait.

(2) Make fscache_maybe_release_page() honour the __GFP_WAIT flag.

(3) Set a timeout around the wait.

(4) Make nfs_migrate_page() return an error if the page is still busy.

For the moment, I'll select (2) and (4).
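
Options (2) and (4) together can be sketched as the following toy Python model. The function names mirror the kernel ones loosely but are hypothetical, `GFP_WAIT` is an illustrative bit rather than the kernel's value, the `wait_for_cache` callback is invented for the example, and the use of EBUSY as the error code is an assumption.

```python
import errno

GFP_WAIT = 0x1   # illustrative bit, not the kernel's value


def maybe_release_page(page_busy_in_cache: bool, gfp: int,
                       wait_for_cache) -> bool:
    """Model of option (2): only wait when the caller allows it."""
    if not page_busy_in_cache:
        return True
    if gfp & GFP_WAIT:
        # The caller may sleep, so wait for the cache to finish.
        return wait_for_cache()
    return False


def migrate_page(page_busy_in_cache: bool, gfp: int, wait_for_cache) -> int:
    """Model of option (4): report an error if the page is still busy."""
    if not maybe_release_page(page_busy_in_cache, gfp, wait_for_cache):
        return -errno.EBUSY
    return 0
```

In this model a busy page is never migrated: either the wait succeeds and migration proceeds, or the caller gets an error and can retry later.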

Signed-off-by: David Howells
Acked-by: Jeff Layton

David Howells
2012-12-21 06:12:03 +0800

16 Dec, 2012

1 commit

ada8e20d0 NFS: Don't use SetPageError in the NFS writeback code

The writeback code is already capable of passing errors back to user space
by means of the open_context->error. In the case of ENOSPC, Neil Brown
is reporting seeing 2 errors being returned.

Neil writes:

"e.g. if /mnt2/ is an nfs mounted filesystem that has no space then

strace dd if=/dev/zero conv=fsync >> /mnt2/afile count=1

reported Input/output error and the relevant parts of the strace output are:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1) = -1 EIO (Input/output error)
close(1) = -1 ENOSPC (No space left on device)"

Neil then shows that the duplication of error messages appears to be due to
the use of the PageError() mechanism, which causes filemap_fdatawait_range
to return the extra EIO. The regression was introduced by
commit 7b281ee026552f10862b617a2a51acf49c829554 (NFS: fsync() must exit
with an error if page writeback failed).

Fix this by removing the call to SetPageError(), and just relying on
open_context->error reporting the ENOSPC back to fsync().
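
The double report can be reproduced with a toy Python model of the two error channels. This is a sketch under assumptions, not the kernel logic: `OpenContext`, `toy_fsync` and `toy_close` are hypothetical, and the model assumes that reading the context error also clears it.

```python
import errno


class OpenContext:
    """Hypothetical stand-in for the per-open error slot."""

    def __init__(self):
        self.error = 0

    def take_error(self) -> int:
        err, self.error = self.error, 0
        return err


def toy_fsync(ctx: OpenContext, page_error_flag: bool) -> int:
    # A page-level error flag surfaces as a generic EIO and hides the
    # more specific error still stored in the open context.
    if page_error_flag:
        return -errno.EIO
    return ctx.take_error()


def toy_close(ctx: OpenContext) -> int:
    return ctx.take_error()
```

With the page flag set and ENOSPC stored in the context, `toy_fsync` returns EIO and `toy_close` then returns ENOSPC, matching the strace output quoted above. Without the page flag, `toy_fsync` returns ENOSPC and nothing is left for `toy_close` to report.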

Reported-by: Neil Brown
Tested-by: Neil Brown
Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org [3.6+]

Trond Myklebust
2012-12-16 06:12:14 +0800

11 Dec, 2012

2 commits

7ce0171d4 Merge branch 'bugfixes' into nfs-for-next

Trond Myklebust
2012-12-11 22:16:26 +0800
81d9bce53 nfs: don't extend writes to cover entire page if pagecache is invalid

Jian reported that the following sequence would leave "testfile" with
corrupt data:

# mount localhost:/export /mnt/nfs/ -o vers=3
# echo abc > /mnt/nfs/testfile; echo def >> /export/testfile; echo ghi >> /mnt/nfs/testfile
# cat -v /export/testfile
abc
^@^@^@^@ghi

While there's no locking involved here, the operations are serialized,
so close-to-open (CTO) consistency should prevent corruption.

The first write to the file is fine and writes 4 bytes. The file is then
extended on the server. When it's reopened a GETATTR is issued and the
size change is noticed. This causes NFS_INO_INVALID_DATA to be set on
the file. Because the file is opened for write only,
nfs_want_read_modify_write() returns 0 to nfs_write_begin().
nfs_updatepage then calls nfs_write_pageuptodate() to see if it should
extend the nfs_page to cover the whole page. NFS_INO_INVALID_DATA is
still set on the file at that point, but that flag is ignored and
nfs_pageuptodate erroneously extends the write to cover the whole page,
with the write done on the server side filled in with zeroes.

This patch just has that function check for NFS_INO_INVALID_DATA in
addition to NFS_INO_REVAL_PAGECACHE. This fixes the bug, but looking
over the code, I wonder if we might have a similar bug in
nfs_revalidate_size(). The difference between those two flags is very
subtle, so it seems like we ought to be checking for
NFS_INO_INVALID_DATA in most of the places that we look for
NFS_INO_REVAL_PAGECACHE.

I believe this is a regression introduced by commit 8d197a568. The code
did check for NFS_INO_INVALID_DATA prior to that patch.
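
The corruption in the reproducer can be replayed with a small Python simulation. It is a deliberately simplified model: `append_write`, the 16-byte `PAGE` size and the `cache_valid` switch are all invented for the example, and the "server" is just a byte array.

```python
PAGE = 16  # tiny illustrative page size


def append_write(server: bytearray, client_page: bytes, data: bytes,
                 cache_valid: bool) -> None:
    """Append `data` at the server's end of file, as the client would."""
    offset = len(server)
    # Build the client's view of the page with the new data in place.
    page = bytearray(client_page.ljust(PAGE, b"\0"))
    page[offset:offset + len(data)] = data
    if cache_valid:
        # Cache trusted: extend the write to cover the page from offset 0.
        start = 0
    else:
        # Cache flagged invalid: send only the bytes actually written.
        start = offset
    end = offset + len(data)
    server[start:end] = page[start:end]
```

Starting from a server file of `abc\ndef\n` and a stale client page holding only `abc\n`, appending `ghi\n` with `cache_valid=True` overwrites `def\n` with zero bytes, which is the `^@^@^@^@` seen in the report. With `cache_valid=False` only the four new bytes are sent and the file ends up as `abc\ndef\nghi\n`.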

Original bug report is here:

https://bugzilla.redhat.com/show_bug.cgi?id=885743

Cc: stable@vger.kernel.org # 3.5+
Reported-by: Jian Li
Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2012-12-11 22:14:51 +0800

26 Nov, 2012

1 commit

4c1002100 nfs: Fix wrong slab cache in nfs_commit_mempool

The slab cache in nfs_commit_mempool is wrong, and I think it is just a slip.
I tested it on an x86-32 machine, the size of nfs_write_header is 544, and
the size of nfs_commit_data is 408, so it works fine. It is also true that
sizeof(struct nfs_write_header) > sizeof(struct nfs_commit_data) on other
platforms in my opinion. Just fix it.

Signed-off-by: Yanchuan Nian
Signed-off-by: Trond Myklebust

Yanchuan Nian
2012-11-26 00:59:33 +0800

05 Nov, 2012

1 commit

deed85e76 NFS: Remove BUG_ON() calls from the generic writeback code

...and ensure that we set the return value for nfs_page_async_flush()
to zero! (Reported-by: Dros Adamson)

Signed-off-by: Trond Myklebust

Trond Myklebust
2012-11-05 03:43:39 +0800

29 Sep, 2012

2 commits

05990d1bf NFS: Fix fdatasync/fsync() when confronted with a server reboot

If the server reboots before it can commit the unstable writes to disk,
then nfs_commit_release_pages() will detect this when it compares the
verifier returned by COMMIT to the one returned by WRITE. When this
happens, the client needs to resend those writes in order to guarantee
that they make it to stable storage.

This patch adds a signalling mechanism to notify fsync() that it
needs to retry all writes before it can exit.
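
A rough Python sketch of the retry idea, assuming a much simpler world than the kernel's. Requests are plain dictionaries, `requests_to_resend` and `fsync_with_retry` are hypothetical names, and the `resend` callback (which rewrites the data and returns the new COMMIT verifier) is invented for the example.

```python
def requests_to_resend(requests, commit_verifier: bytes):
    """Return the requests whose WRITE verifier differs from COMMIT's."""
    return [r for r in requests if r["write_verifier"] != commit_verifier]


def fsync_with_retry(requests, commit_verifier: bytes, resend) -> int:
    """Keep resending until every write is covered by a matching COMMIT."""
    rounds = 0
    pending = requests_to_resend(requests, commit_verifier)
    while pending:
        # `resend` rewrites the data and returns the new COMMIT verifier.
        commit_verifier = resend(pending)
        rounds += 1
        pending = requests_to_resend(requests, commit_verifier)
    return rounds
```

The function only returns once no request is left with a mismatched verifier, which is the guarantee fsync() needs after a server reboot.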

Signed-off-by: Trond Myklebust

Trond Myklebust
2012-09-29 04:03:05 +0800
2a369153c NFS: Clean up helper function nfs4_select_rw_stateid()

We want to be able to pass on the information that the page was not
dirtied under a lock. Instead of adding a flag parameter, do this
by passing a pointer to a 'struct nfs_lock_owner' that may be NULL.

Also reuse this structure in struct nfs_lock_context to carry the
fl_owner_t and pid_t.

Signed-off-by: Trond Myklebust

Trond Myklebust
2012-09-29 04:03:04 +0800

03 Aug, 2012

1 commit

3dd4765fc nfs: tear down caches in nfs_init_writepagecache when allocation fails

...and ensure that we tear down the nfs_commit_data cache too when
unloading the module.

Cc: Bryan Schumaker
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2012-08-03 05:36:07 +0800

01 Aug, 2012

4 commits

ac694dbdb Merge branch 'akpm' (Andrew's patch-bomb)

Merge Andrew's second set of patches:
- MM
- a few random fixes
- a couple of RTC leftovers

* emailed patches from Andrew Morton: (120 commits)
rtc/rtc-88pm80x: remove unneed devm_kfree
rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails
mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables
tmpfs: distribute interleave better across nodes
mm: remove redundant initialization
mm: warn if pg_data_t isn't initialized with zero
mips: zero out pg_data_t when it's allocated
memcg: gix memory accounting scalability in shrink_page_list
mm/sparse: remove index_init_lock
mm/sparse: more checks on mem_section number
mm/sparse: optimize sparse_index_alloc
memcg: add mem_cgroup_from_css() helper
memcg: further prevent OOM with too many dirty pages
memcg: prevent OOM with too many dirty pages
mm: mmu_notifier: fix freed page still mapped in secondary MMU
mm: memcg: only check anon swapin page charges for swap cache
mm: memcg: only check swap cache pages for repeated charging
mm: memcg: split swapin charge function into private and public part
mm: memcg: remove needless !mm fixup to init_mm when charging
mm: memcg: remove unneeded shmem charge type
...

Linus Torvalds
2012-08-01 10:25:39 +0800
192e501b0 nfs: prevent page allocator recursions with swap over NFS.

GFP_NOFS is _more_ permissive than GFP_NOIO in that it will initiate IO,
just not of any filesystem data.

The problem is that previously NOFS was correct because that avoids
recursion into the NFS code. With swap-over-NFS, it is no longer correct
as swap IO can lead to this recursion.
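
The relationship between the two masks can be shown with a small Python sketch. The flag composition follows the description above; the `Gfp` class, the helper `may_start_io` and the numeric bit values are illustrative assumptions rather than definitions copied from the kernel headers.

```python
from enum import IntFlag


class Gfp(IntFlag):
    WAIT = 0x10   # allocator may sleep
    IO = 0x40     # allocator may start physical I/O
    FS = 0x80     # allocator may call into filesystem code


GFP_NOIO = Gfp.WAIT
GFP_NOFS = Gfp.WAIT | Gfp.IO
GFP_KERNEL = Gfp.WAIT | Gfp.IO | Gfp.FS


def may_start_io(mask: Gfp) -> bool:
    return bool(mask & Gfp.IO)
```

`may_start_io(GFP_NOFS)` is true while `may_start_io(GFP_NOIO)` is false. Once swap pages can live on NFS, I/O started by the allocator may itself be NFS I/O, so only the stricter mask avoids re-entering the NFS code.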

Signed-off-by: Peter Zijlstra
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
Cc: Christoph Hellwig
Cc: David S. Miller
Cc: Eric B Munson
Cc: Eric Paris
Cc: James Morris
Cc: Mel Gorman
Cc: Mike Christie
Cc: Neil Brown
Cc: Sebastian Andrzej Siewior
Cc: Trond Myklebust
Cc: Xiaotian Feng
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2012-08-01 09:42:48 +0800
29418aa4b nfs: disable data cache revalidation for swapfiles

The VM does not like PG_private set on PG_swapcache pages. As suggested
by Trond in http://lkml.org/lkml/2006/8/25/348, this patch disables NFS
data cache revalidation on swap files, as it does not make sense to have
other clients change the file while it is being used as swap. This avoids
setting PG_private on swap pages, since there ought to be no further races
with invalidate_inode_pages2() to deal with.

Since we cannot set PG_private we cannot use page->private which is
already used by PG_swapcache pages to store the nfs_page. Thus augment
the new nfs_page_find_request logic.

Signed-off-by: Peter Zijlstra
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
Cc: Christoph Hellwig
Cc: David S. Miller
Cc: Eric B Munson
Cc: Eric Paris
Cc: James Morris
Cc: Mel Gorman
Cc: Mike Christie
Cc: Neil Brown
Cc: Sebastian Andrzej Siewior
Cc: Trond Myklebust
Cc: Xiaotian Feng
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2012-08-01 09:42:47 +0800
d56b4ddf7 nfs: teach the NFS client how to treat PG_swapcache pages

Replace all relevant occurrences of page->index and page->mapping in the
NFS client with the new page_file_index() and page_file_mapping()
functions.

Signed-off-by: Peter Zijlstra
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
Cc: Christoph Hellwig
Cc: David S. Miller
Cc: Eric B Munson
Cc: Eric Paris
Cc: James Morris
Cc: Mel Gorman
Cc: Mike Christie
Cc: Neil Brown
Cc: Sebastian Andrzej Siewior
Cc: Trond Myklebust
Cc: Xiaotian Feng
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2012-08-01 09:42:47 +0800

31 Jul, 2012

4 commits

89d77c8fa NFS: Convert v4 into a module

This patch exports symbols needed by the v4 module. In addition, I also
switch over to using IS_ENABLED() to check if CONFIG_NFS_V4 or
CONFIG_NFS_V4_MODULE are set.
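
The effect of the check described here can be mimicked in Python. This is only an analogy for the preprocessor macro: `is_enabled` is a hypothetical helper and the configuration is modelled as a set of defined symbol names.

```python
def is_enabled(config: set, option: str) -> bool:
    """True if `option` is built in or built as a module."""
    return option in config or option + "_MODULE" in config
```

For instance, `is_enabled({"CONFIG_NFS_V4_MODULE"}, "CONFIG_NFS_V4")` is true, so code guarded this way is compiled whether v4 support is built in or built as nfs4.ko.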

The module (nfs4.ko) will be created in the same directory as nfs.ko and
will be automatically loaded the first time you try to mount over NFS v4.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-07-31 07:06:52 +0800
1c606fb74 NFS: Convert v3 into a module

This patch exports symbols and moves over the final structures needed by
the v3 module. In addition, I also switch over to using IS_ENABLED() to
check if CONFIG_NFS_V3 or CONFIG_NFS_V3_MODULE are set.

The module (nfs3.ko) will be created in the same directory as nfs.ko and
will be automatically loaded the first time you try to mount over NFS v3.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-07-31 07:06:46 +0800
ddda8e0aa NFS: Convert v2 into a module

The module (nfs2.ko) will be created in the same directory as nfs.ko and
will be automatically loaded the first time you try to mount over NFS v2.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-07-31 07:06:41 +0800
19d87ca36 NFS: Split out remaining NFS v4 inode functions

Somehow I missed this in my previous patch series, but these functions
are only needed by the v4 code and should be moved to a v4-only file. I
wasn't exactly sure where I should put these functions, so I moved them
into nfs4super.c where I could make them static.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-07-31 07:06:20 +0800

29 Jun, 2012

4 commits

a8d8f02cf NFS: Create custom NFS v4 write_inode() function

This gives pnfs a chance to do a layout commit inside the v4 code.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-06-29 23:46:47 +0800
57208fa7e NFS: Create an write_pageio_init() function

pNFS needs to select a write function based on the layout driver
currently in use, so I let each NFS version decide how to best handle
initializing writes.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-06-29 23:46:46 +0800
011e2a7fd NFS: Create a have_delegation rpc_op

Delegations are a v4 feature, so push them out of the generic code.

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust

Bryan Schumaker
2012-06-29 23:46:44 +0800
2f2c63bc2 NFS: Cleanup - only store the write verifier in struct nfs_page

The 'committed' field is not needed once we have put the struct nfs_page
on the right list.

Also correct the type of the verifier: it is not an array of __be32, but
simply an 8 byte long opaque array.

Signed-off-by: Trond Myklebust

Trond Myklebust
2012-06-29 05:20:50 +0800

06 Jun, 2012

1 commit

9bce008ba NFS: Fix a commit bug

The new commit code fails to copy the verifier into the wb_verf field
of _all_ the nfs_page structures; it only copies it into the first entry.
The consequence is that most requests end up failing to match in
nfs_commit_release.

Fix is to copy the verifier into the req->wb_verf field in
nfs_write_completion.
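
The difference between the buggy and the fixed behaviour can be sketched in Python as below. Requests are modelled as dictionaries with a `wb_verf` key borrowed from the field name above; the three function names are hypothetical.

```python
def write_completion_buggy(requests, verifier: bytes) -> None:
    # Only the first request records the verifier.
    if requests:
        requests[0]["wb_verf"] = verifier


def write_completion_fixed(requests, verifier: bytes) -> None:
    # Every request records the verifier returned with the WRITE.
    for req in requests:
        req["wb_verf"] = verifier


def commit_matches(requests, commit_verifier: bytes) -> int:
    """Count requests whose stored verifier matches the COMMIT verifier."""
    return sum(1 for r in requests if r.get("wb_verf") == commit_verifier)
```

With four requests and an 8-byte verifier, the buggy variant leaves only one request matching at commit time, while the fixed variant lets all four match.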

Signed-off-by: Trond Myklebust
Cc: Fred Isaman

Trond Myklebust
2012-06-06 06:38:47 +0800

20 May, 2012

1 commit

9f0ec176b NFSv4.1 set RPC_TASK_SOFTCONN for filelayout DS RPC calls

RPC_TASK_SOFTCONN returns connection errors to the caller which allows the pNFS
file layout to quickly try the MDS or perhaps another DS.

Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust

Andy Adamson
2012-05-20 05:54:19 +0800

10 May, 2012

4 commits

1d1afcbc2 NFS: Clean up - Rename nfs_unlock_request and nfs_unlock_request_dont_release

Function rename to ensure that the functionality of nfs_unlock_request()
mirrors that of nfs_lock_request(). Then let nfs_unlock_and_release_request()
do the work of what used to be called nfs_unlock_request()...

Signed-off-by: Trond Myklebust
Cc: Fred Isaman

Trond Myklebust
2012-05-10 03:17:43 +0800
7ad84aa94 NFS: Clean up - simplify nfs_lock_request()

We only have two places where we need to grab a reference when trying
to lock the nfs_page. We're better off making that explicit.

Signed-off-by: Trond Myklebust
Cc: Fred Isaman

Trond Myklebust
2012-05-10 03:17:34 +0800
d1182b33e NFS: nfs_set_page_writeback no longer needs to reference the page

We now hold a reference to the nfs_page across the calls to
nfs_set_page_writeback and nfs_end_page_writeback, and that
means we already have a reference to the struct page.

Signed-off-by: Trond Myklebust
Cc: Fred Isaman

Trond Myklebust
2012-05-10 03:17:28 +0800
3aff4ebb9 NFS: Prevent a deadlock in the new writeback code

We have to unlock the nfs_page before we call nfs_end_page_writeback
to avoid races with functions that expect the page to be unlocked
when PG_locked and PG_writeback are not set.
The problem is that nfs_unlock_request also releases the nfs_page,
causing a deadlock if the release of the nfs_open_context
triggers an iput() while the PG_writeback flag is still set...

The solution is to separate the unlocking and release of the nfs_page,
so that we can do the former before nfs_end_page_writeback and the
latter after.

Signed-off-by: Trond Myklebust
Cc: Fred Isaman

Trond Myklebust
2012-05-10 03:16:07 +0800