Eric Lee / smarc-fsl-linux-kernel

15 Aug, 2020

1 commit

37711e5e2 Merge tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Stable fixes:
- pNFS: Don't return layout segments that are being used for I/O
- pNFS: Don't move layout segments off the active list when being used for I/O

Features:
- NFS: Add support for user xattrs through the NFSv4.2 protocol
- NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNC
- NFSv4.0 allow nconnect for v4.0

Bugfixes and cleanups:
- nfs: ensure correct writeback errors are returned on close()
- nfs: nfs_file_write() should check for writeback errors
- nfs: Fix getxattr kernel panic and memory overflow
- NFS: Fix the pNFS/flexfiles mirrored read failover code
- SUNRPC: dont update timeout value on connection reset
- freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS
- sunrpc: destroy rpc_inode_cachep after unregister_filesystem"

* tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (32 commits)
NFS: Fix flexfiles read failover
fs: nfs: delete repeated words in comments
rpc_pipefs: convert comma to semicolon
nfs: Fix getxattr kernel panic and memory overflow
NFS: Don't return layout segments that are in use
NFS: Don't move layouts to plh_return_segs list while in use
NFS: Add layout segment info to pnfs read/write/commit tracepoints
NFS: Add tracepoints for layouterror and layoutstats.
NFS: Report the stateid + status in trace_nfs4_layoutreturn_on_close()
SUNRPC dont update timeout value on connection reset
nfs: nfs_file_write() should check for writeback errors
nfs: ensure correct writeback errors are returned on close()
NFSv4.2: xattr cache: get rid of cache discard work queue
NFS: remove redundant initialization of variable result
NFSv4.0 allow nconnect for v4.0
freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS
sunrpc: destroy rpc_inode_cachep after unregister_filesystem
NFSv4.2: add client side xattr caching.
NFSv4.2: hook in the user extended attribute handlers
NFSv4.2: add the extended attribute proc functions.
...

Linus Torvalds
2020-08-15 23:26:55 +0800

05 Aug, 2020

1 commit

ce368536d nfs: nfs_file_write() should check for writeback errors

The NFS_CONTEXT_ERROR_WRITE flag (as well as the check of said flag) was
removed by commit 6fbda89b257f. The absence of an error check allows
writes to be continually queued up for a server that may no longer be
able to handle them. Fix it by adding an error check using the generic
error reporting functions.
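
The sample-then-check pattern behind the generic error reporting functions can be modeled in user space. What follows is a simplified sketch, not kernel code: `WbErr`, `sample()`, `check()` and `buffered_write()` are hypothetical names that only loosely mirror filemap_sample_wb_err() and filemap_check_wb_err(), and where the sample is taken is an assumption.

```python
import errno


class WbErr:
    """Toy model of a mapping's writeback error cursor (hypothetical)."""

    def __init__(self):
        self.err = 0   # last recorded error, as a negative errno
        self.seq = 0   # bumped every time a new error is recorded

    def set(self, err):
        self.err = err
        self.seq += 1

    def sample(self):
        # Remember the current position so later errors can be detected.
        return self.seq

    def check(self, since):
        # Report an error only if one was recorded after `since`.
        return self.err if self.seq != since else 0


def buffered_write(wb, since, queue, data):
    """Refuse to queue more data once a writeback error has been seen."""
    error = wb.check(since)
    if error:
        return error
    queue.append(data)
    return len(data)
```

Without the check in `buffered_write()`, every call would keep appending to the queue even after the server had started failing writes.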

Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one")
Signed-off-by: Scott Mayhew
Signed-off-by: Trond Myklebust

Scott Mayhew
2020-08-05 11:16:36 +0800

02 Aug, 2020

1 commit

67dd23f9e nfs: ensure correct writeback errors are returned on close()

nfs_wb_all() calls filemap_write_and_wait(), which uses
filemap_check_errors() to determine the error to return.
filemap_check_errors() only looks at the mapping->flags and will
therefore only return either -ENOSPC or -EIO. To ensure that the
correct error is returned on close(), nfs{,4}_file_flush() should call
filemap_check_wb_err() which looks at the errseq value in
mapping->wb_err without consuming it.
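
The difference between the two checks can be illustrated with a small user-space model. This is a sketch under stated assumptions: the `Mapping` class, the bit values of `AS_EIO` and `AS_ENOSPC`, and the helper names are illustrative only, and the real errseq value also carries a sequence counter that is omitted here.

```python
import errno

AS_EIO, AS_ENOSPC = 1, 2   # illustrative bit values, not the kernel's


class Mapping:
    def __init__(self):
        self.flags = 0
        self.wb_err = 0

    def set_error(self, err):
        # Record the precise error plus a coarse flag for flag-based callers.
        self.wb_err = err
        self.flags |= AS_ENOSPC if err == -errno.ENOSPC else AS_EIO


def check_errors(m):
    """Flag-based check: can only ever report -ENOSPC or -EIO."""
    ret = 0
    if m.flags & AS_ENOSPC:
        m.flags &= ~AS_ENOSPC
        ret = -errno.ENOSPC
    if m.flags & AS_EIO:
        m.flags &= ~AS_EIO
        ret = -errno.EIO
    return ret


def check_wb_err(m):
    """errseq-style check: reports the recorded error without consuming it."""
    return m.wb_err
```

In this model a quota error recorded with `set_error(-errno.EDQUOT)` comes back as -EIO from the flag-based check, while the errseq-style check still returns -EDQUOT.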

Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one")
Signed-off-by: Scott Mayhew
Signed-off-by: Trond Myklebust

Scott Mayhew
2020-08-02 03:37:48 +0800

18 Jul, 2020

1 commit

65caafd0d SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")

Revert commit d03727b248d0 ("NFSv4 fix CLOSE not waiting for
direct IO compeletion"). That patch made fput(), by calling
inode_dio_done() in nfs_file_release(), wait uninterruptibly
for any outstanding direct IO to the file, but that wait on IO
should be killable.

The problem the patch was trying to address, REMOVE returning
ERR_ACCESS because the file is still open, is supposed to be resolved
by the server returning ERR_FILE_OPEN and not ERR_ACCESS.

Signed-off-by: Olga Kornievskaia
Signed-off-by: Anna Schumaker

Olga Kornievskaia
2020-07-18 02:47:38 +0800

26 Jun, 2020

1 commit

d03727b24 NFSv4 fix CLOSE not waiting for direct IO compeletion

Neil Brown figured out the root cause of the REMOVE/CLOSE race and
suggested the solution.

Currently, direct IO calls hold a reference on the open context,
which is dropped by an asynchronous task in nfs_direct_complete().
Before that reference is dropped, control is returned to the
application, which is free to close the file. When the close is
processed, it drops its reference on the open context, but since
direct IO still holds one, it doesn't send a CLOSE on the wire. It
returns control to the application, which is free to do other
operations. For instance, it can delete the file. Direct IO then
finally releases its reference and triggers an asynchronous CLOSE,
which races with the REMOVE. On the server, REMOVE can be processed
before the CLOSE, failing the REMOVE with EACCES as the file is
still open.
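
The ordering problem described above can be replayed with a toy reference count. This is a hypothetical user-space model (`OpenContext` and `racy_sequence()` are made-up names), in which a list stands in for the order of operations put on the wire.

```python
class OpenContext:
    """Toy open context: CLOSE is sent only when the last reference drops."""

    def __init__(self, wire):
        self.refs = 1      # held by the application's open file
        self.wire = wire   # ordered list of operations sent to the server

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.wire.append("CLOSE")


def racy_sequence():
    wire = []
    ctx = OpenContext(wire)
    ctx.get()              # direct IO takes its own reference
    ctx.put()              # application close(): refs 2 -> 1, no CLOSE yet
    wire.append("REMOVE")  # application deletes the file
    ctx.put()              # direct IO completes, CLOSE is finally sent
    return wire
```

In this model `racy_sequence()` yields REMOVE ahead of CLOSE, which is the ordering the server may reject.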

Signed-off-by: Olga Kornievskaia
Suggested-by: Neil Brown
CC: stable@vger.kernel.org
Signed-off-by: Anna Schumaker

Olga Kornievskaia
2020-06-26 20:43:14 +0800

15 Jan, 2020

2 commits

2197e9b06 NFS: Fix up fsync() when the server rebooted

Don't clear the NFS_CONTEXT_RESEND_WRITES flag until after calling
nfs_commit_inode(). Otherwise, if nfs_commit_inode() returns an
error, we end up with dirty pages in the page cache, but no tag
to tell us that those pages need resending.
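
The effect of the ordering can be sketched in a few lines. This is only an illustration under assumptions: `ctx` is a plain dict standing in for the open context, `commit` is any callable returning 0 or a negative errno, and both function names are hypothetical.

```python
def fsync_clear_first(ctx, commit):
    """Old order: the flag is cleared before the commit is attempted."""
    ctx["resend_writes"] = False
    return commit()


def fsync_clear_after(ctx, commit):
    """New order: the flag survives a failed commit."""
    ret = commit()
    if ret == 0:
        ctx["resend_writes"] = False
    return ret
```

With a commit that fails, the first variant leaves no record that pages still need resending, while the second keeps the flag set for the next attempt.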

Signed-off-by: Trond Myklebust
Signed-off-by: Anna Schumaker

Trond Myklebust
2020-01-15 23:54:32 +0800

bd89bc67f fs/nfs, swapon: check holes in swapfile

swapon over NFS does not go through generic_swapfile_activate
code path when setting up extents. This makes holes in NFS
swapfiles possible which is not expected for swapon.

Signed-off-by: Murphy Zhou
Signed-off-by: Anna Schumaker

Murphy Zhou
2020-01-15 23:54:31 +0800

18 Nov, 2019

1 commit

89658c4d0 NFS: Return -ETXTBSY when attempting to write to a swapfile

My understanding is that -EBUSY refers to the underlying device, and
that -ETXTBSY is used when attempting to access a file in use by the
kernel (like a swapfile). Changing this return code helps us pass
xfstests generic/569.

Signed-off-by: Anna Schumaker
Signed-off-by: Trond Myklebust

Anna Schumaker
2019-11-18 17:43:24 +0800

21 May, 2019

1 commit

457c89965 treewide: Add SPDX license identifier for missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-21 16:50:45 +0800

26 Apr, 2019

2 commits

6fbda89b2 NFS: Replace custom error reporting mechanism with generic one

Replace the NFS custom error reporting mechanism with the generic
mapping_set_error().

Signed-off-by: Trond Myklebust
Signed-off-by: Anna Schumaker

Trond Myklebust
2019-04-26 02:18:14 +0800

aded8d7b5 NFS: Don't inadvertently clear writeback errors

vfs_fsync() has the side effect of clearing unreported writeback errors,
so we need to make sure that we do not abuse it in situations where
applications might not normally expect us to report those errors.

The solution is to replace calls to vfs_fsync() with calls to nfs_wb_all().

Signed-off-by: Trond Myklebust
Signed-off-by: Anna Schumaker

Trond Myklebust
2019-04-26 02:18:14 +0800

21 Feb, 2019

3 commits

2cde04e90 pNFS: Avoid read/modify/write when it is not necessary

As the block and SCSI layouts can only read/write fixed-length
blocks, we must perform read-modify-write when data to be written is
not aligned to a block boundary or smaller than the block size.
(612aa983a0410 pnfs: add flag to force read-modify-write in ->write_begin)
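
The alignment rule stated above comes down to simple arithmetic. The helper below is a minimal sketch with a hypothetical name and an assumed block size parameter; it is not the test the kernel performs.

```python
def needs_read_modify_write(offset, length, block_size):
    """True when a write does not start and end on block boundaries."""
    return offset % block_size != 0 or (offset + length) % block_size != 0
```

For example, with 4096-byte blocks a 4096-byte write at offset 0 covers one whole block and needs no read, whereas a 100-byte write at offset 0 or a 4096-byte write at offset 512 touches a partial block.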

The current code tries to see if we have to do read-modify-write
on block-oriented pNFS layouts by just checking !PageUptodate(page),
but the same condition also applies when overwriting any uncached
portions of existing files, making such operations excessively slow
even if they are block-aligned.

The change does not affect the optimization for read-modify-write
cases (38c73044f5f4d NFS: read-modify-write page updating),
because partial update of !PageUptodate() pages can only happen
in layouts that can do arbitrary length read/write and never
in block-based ones.

Testing results:

We ran fio on one of the pNFS clients running 4.20 kernel
(vanilla and patched) in this configuration to read/write/overwrite
files on the storage array, exported as pnfs share by the server.

pNFS clients ---1G Ethernet--- pNFS server
(HP DL360 G8)                  (HP DL360 G8)
      |                              |
      |                              |
      +------8G Fiber Channel--------+
                     |
               Storage Array
                (HP P6350)

Throughput of overwrite (both buffered and O_SYNC) is noticeably
improved.

  Ops.   |block size|   Throughput   |
         |  (KiB)   |    (MiB/s)     |
         |          |  4.20 | patched|
---------+----------+----------------+
buffered |         4|  21.3 | 232    |
overwrite|        32|  22.2 | 256    |
         |       512|  22.4 | 260    |
---------+----------+----------------+
O_SYNC   |         4|   3.84|   4.77 |
overwrite|        32|  12.2 |  32.0  |
         |       512|  18.5 | 152    |
---------+----------+----------------+

Read and write (buffered and O_SYNC) by the same client remain unchanged
by the patch either negatively or positively, as they should do.

  Ops.   |block size|   Throughput   |
         |  (KiB)   |    (MiB/s)     |
         |          |  4.20 | patched|
---------+----------+----------------+
read     |         4| 548   | 550    |
         |        32| 547   | 551    |
         |       512| 548   | 551    |
---------+----------+----------------+
buffered |         4| 237   | 244    |
write    |        32| 261   | 268    |
         |       512| 265   | 272    |
---------+----------+----------------+
O_SYNC   |         4|   0.46|   0.46 |
write    |        32|   3.60|   3.57 |
         |       512| 105   | 106    |
---------+----------+----------------+

Signed-off-by: Kazuo Ito
Tested-by: Hiroyuki Watanabe
Signed-off-by: Trond Myklebust

Kazuo Ito
2019-02-21 06:33:55 +0800

97ae91bbf pNFS: Fix potential corruption of page being written

nfs_want_read_modify_write() didn't check for !PagePrivate when pNFS
block or SCSI layout was in use, therefore we could lose data forever
if the page being written was filled by a read before completion.

Signed-off-by: Kazuo Ito
Signed-off-by: Trond Myklebust

Kazuo Ito
2019-02-21 06:33:55 +0800

302fad7bd NFS: Fix up documentation warnings

Fix up some compiler warnings about function parameters, etc. not being
correctly described or formatted.

Signed-off-by: Trond Myklebust

Trond Myklebust
2019-02-21 04:14:21 +0800

31 Jul, 2018

1 commit

01a368441 fs: nfs: Adding new return type vm_fault_t

Use new return type vm_fault_t for fault handler
in struct vm_operations_struct. For now, this is
just documenting that the function returns a
VM_FAULT value rather than an errno. Once all
instances are converted, vm_fault_t will become
a distinct type.

See commit 1c8f422059ae ("mm: change return type to
vm_fault_t") for reference.

Signed-off-by: Souptick Joarder
Reviewed-by: Matthew Wilcox
Signed-off-by: Anna Schumaker

Souptick Joarder
2018-07-31 01:19:40 +0800

18 Nov, 2017

1 commit

fcfa44706 NFS: Revert "NFS: Move the flock open mode check into nfs_flock()"

Commit e12937279c8b "NFS: Move the flock open mode check into nfs_flock()"
changed NFSv3 behavior for flock() such that the open mode must match the
lock type, however that requirement shouldn't be enforced for flock().
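
The flock() semantics in question can be probed from user space. The sketch below uses a hypothetical helper name and assumes a Linux system; it requests an exclusive flock() on a descriptor opened read-only. On a local filesystem the lock is expected to be granted, since flock() does not tie the lock type to the open mode. The result on a particular NFS mount depends on the kernel in use.

```python
import fcntl
import os


def exclusive_flock_on_readonly_fd(path):
    """Try LOCK_EX on a descriptor opened O_RDONLY; True if granted."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```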

Signed-off-by: Benjamin Coddington
Cc: stable@vger.kernel.org # v4.12
Signed-off-by: Anna Schumaker

Benjamin Coddington
2017-11-18 05:43:52 +0800

12 Sep, 2017

1 commit

bf4b49059 NFS: various changes relating to reporting IO errors.

1/ remove 'start' and 'end' args from nfs_file_fsync_commit().
They aren't used.

2/ Make nfs_context_set_write_error() a "static inline" in internal.h
so we can...

3/ Use nfs_context_set_write_error() instead of mapping_set_error()
if nfs_pageio_add_request() fails before sending any request.
NFS generally keeps errors in the open_context, not the mapping,
so this is more consistent.

4/ If filemap_write_and_wait_range() reports any error, still
check ctx->error. The value in ctx->error is likely to be
more useful. As part of this, NFS_CONTEXT_ERROR_WRITE is
cleared slightly earlier, before nfs_file_fsync_commit() is called,
rather than at the start of that function.

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2017-09-12 10:28:56 +0800

07 Sep, 2017

2 commits

e973b1a59 NFS: Sync the correct byte range during synchronous writes

Since commit 18290650b1c8 ("NFS: Move buffered I/O locking into
nfs_file_write()") nfs_file_write() has not flushed the correct byte
range during synchronous writes. generic_write_sync() expects that
iocb->ki_pos points to the right edge of the range rather than the
left edge.
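
The range arithmetic can be made concrete with a small model. The function name is hypothetical; the calculation follows the statement above that the position passed in is the right edge of the range just written.

```python
def synced_range(ki_pos, count):
    """Inclusive byte range flushed when ki_pos is the right edge."""
    return ki_pos - count, ki_pos - 1
```

For a 4096-byte write at offset 8192, a position of 12288 (right edge) gives the range 8192..12287. A position left at 8192 (left edge) gives 4096..8191, which is the block before the data that was actually written.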

To replicate the problem, open a file with O_DSYNC, have the client
write at increasing offsets, and then print the successful offsets.
Block port 2049 partway through that sequence, and observe that the
client application indicates successful writes in advance of what the
server received.
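
The client side of that reproduction might look like the sketch below. It is an assumption-laden outline: the helper name, chunk size and count are arbitrary, the path is expected to live on the NFS mount under test, and blocking port 2049 has to be done separately.

```python
import os


def dsync_writes(path, chunk=b"x" * 4096, count=4):
    """Write `count` chunks at increasing offsets with O_DSYNC and
    return the offsets whose writes reported success."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    ok = []
    try:
        for i in range(count):
            offset = i * len(chunk)
            try:
                if os.pwrite(fd, chunk, offset) == len(chunk):
                    ok.append(offset)
            except OSError:
                break  # stop at the first reported failure
    finally:
        os.close(fd)
    return ok
```

Comparing the returned offsets with what the server actually received shows whether the client reported success too early.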

Fixes: 18290650b1c8 ("NFS: Move buffered I/O locking into nfs_file_write()")
Signed-off-by: Jacob Strauss
Signed-off-by: Tarang Gupta
Tested-by: Tarang Gupta
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Trond Myklebust

tarangg@amazon.com
2017-09-07 23:07:13 +0800
779eafab0 NFS: flush data when locking a file to ensure cache coherence for mmap. ... Browse Code »

When a byte range lock (or flock) is taken out on an NFS file, the
validity of the cached data is checked and the inode is marked
NFS_INO_INVALID_DATA. However the cached data isn't flushed from
the page cache.

This is sufficient for future read() requests or mmap() requests as
they call nfs_revalidate_mapping() which performs the flush if
necessary.

However an existing mapping is not affected. Accessing data through
that mapping will continue to return old data even though the inode is
marked NFS_INO_INVALID_DATA.

This can easily be confirmed using the 'nfs' tool in
git://github.com/okirch/twopence-nfs.git
and running

    nfs coherence FILENAME
on one client, and
    nfs coherence -r FILENAME
on another client.

It appears that prior to Linux 2.6.0 this worked correctly.

However commit:

http://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/?id=ca9268fe3ddd075714005adecd4afbd7f9ab87d0

removed the call to inode_invalidate_pages() from nfs_zap_caches(). I
haven't tested this code, but inspection suggests that prior to this
commit, file locking would invalidate all inode pages.

This patch adds a call to nfs_revalidate_mapping() after a
successful SETLK so that invalid data is flushed. With this patch the
above test passes. To minimize impact (and possibly avoid a GETATTR
call) this only happens if the mapping might be mapped into
userspace.

Cc: Olaf Kirch
Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2017-09-07 00:31:15 +0800

27 Jul, 2017

2 commits

6ba80d434 NFS: Optimize fallocate by refreshing mapping when needed.

posix_fallocate() will allocate space in an NFS file by considering
the last byte of every 4K block. If it is before EOF, it will read
the byte and if it is zero, a zero is written out. If it is after EOF,
the zero is unconditionally written.

For the blocks beyond EOF, if NFS believes its cache is valid, it will
expand these writes to write full pages, and then will merge the pages.
This results in (typically) 1MB writes. If NFS believes its cache is
not valid (particularly if NFS_INO_INVALID_DATA or
NFS_INO_REVAL_PAGECACHE are set - see nfs_write_pageuptodate()), it will
send the individual 1-byte writes. This results in (typically) 256 times
as many RPC requests, and can be substantially slower.
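
The factor of 256 follows from the block and write sizes. The sketch below uses hypothetical helper names and assumes a block-aligned start, 4K blocks and a 1MB write size, as in the description above.

```python
def probe_offsets(start, length, block=4096):
    """Offsets of the last byte of every block in [start, start + length)."""
    return list(range(start + block - 1, start + length, block))


def rpc_counts(length, block=4096, wsize=1 << 20):
    """(merged, unmerged) write RPC counts for a region beyond EOF."""
    merged = -(-length // wsize)              # full pages merged into wsize writes
    unmerged = len(probe_offsets(0, length, block))  # one 1-byte write per block
    return merged, unmerged
```

For a 1MB region this gives one merged write against 256 single-byte writes.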

Currently nfs_revalidate_mapping() is only used when reading a file or
mmapping a file, as these are times when the content needs to be
up-to-date. Writes don't generally need the cache to be up-to-date, but
writes beyond EOF can benefit, particularly in the posix_fallocate()
case.

So this patch calls nfs_revalidate_mapping() when writing beyond EOF -
i.e. when there is a gap between the end of the file and the start of
the write. If the cache is thought to be out of date (as happens after
taking a file lock), this will cause a GETATTR, and the two flags
mentioned above will be cleared. With this, posix_fallocate() on a
newly locked file does not generate excessive tiny writes.

Signed-off-by: NeilBrown
Signed-off-by: Anna Schumaker

NeilBrown
2017-07-27 23:22:42 +0800

442ce0499 NFS: invalidate file size when taking a lock.

Prior to commit ca0daa277aca ("NFS: Cache aggressively when file is open
for writing"), NFS would revalidate, or invalidate, the file size when
taking a lock. Since that commit it only invalidates the file content.

If the file size is changed on the server while waiting for the lock, the
client will have an incorrect understanding of the file size and could
corrupt data. This particularly happens when writing beyond the
(supposed) end of file and can easily be demonstrated with
posix_fallocate().

If an application opens an empty file, waits for a write lock, and then
calls posix_fallocate(), glibc will determine that the underlying
filesystem doesn't support fallocate (assuming version 4.1 or earlier)
and will write out a '0' byte at the end of each 4K page in the region
being fallocated that is after the end of the file.
NFS will (usually) detect that these writes are beyond EOF and will
expand them to cover the whole page, and then will merge the pages.
Consequently, NFS will write out large blocks of zeroes beyond where it
thought EOF was. If EOF had moved, the pre-existing part of the file
will be over-written. Locking should have protected against this,
but it doesn't.
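
The overwrite can be replayed on a byte array standing in for the file on the server. This is a deliberately simplified model with hypothetical names: it assumes 4K pages, skips the read-and-rewrite of bytes before EOF (which leaves content unchanged), and zero-fills only the part of each page that the client believes lies beyond EOF.

```python
PAGE = 4096


def emulate_fallocate(server_file, client_eof, length):
    """Apply the emulated posix_fallocate() of [0, length) using the
    client's idea of EOF, and return the resulting file contents."""
    for last in range(PAGE - 1, length, PAGE):
        if last < client_eof:
            continue  # byte is before EOF: content is left as it is
        start = max(last - PAGE + 1, client_eof)
        if len(server_file) <= last:
            server_file.extend(bytes(last + 1 - len(server_file)))
        # The 1-byte write is expanded to zeroes from `start` to page end.
        server_file[start:last + 1] = bytes(last + 1 - start)
    return server_file
```

If another client wrote 100 bytes while this client was waiting for the lock, a stale `client_eof` of 0 zeroes those bytes, whereas the correct value of 100 preserves them.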

This patch restores the use of nfs_zap_caches(), which invalidates the
cached attributes. When posix_fallocate() asks for the file size, the
request will go to the server and get a correct answer.

cc: stable@vger.kernel.org (v4.8+)
Fixes: ca0daa277aca ("NFS: Cache aggressively when file is open for writing")
Signed-off-by: NeilBrown
Signed-off-by: Anna Schumaker

NeilBrown
2017-07-27 23:22:42 +0800

27 Apr, 2017

1 commit

c373fff7b NFSv4: Don't special case "launder"

If the client receives a fatal server error from nfs_pageio_add_request(),
then we should always truncate the page on which the error occurred.

Signed-off-by: Trond Myklebust

Trond Myklebust
2017-04-27 01:03:04 +0800

21 Apr, 2017

2 commits

f30cb757f NFS: Always wait for I/O completion before unlock

NFS attempts to wait for read and write completion before unlocking in
order to ensure that the data returned was protected by the lock. When
this waiting is interrupted by a signal, the unlock may be skipped, and
messages similar to the following are seen in the kernel ring buffer:

[20.167876] Leaked locks on dev=0x0:0x2b ino=0x8dd4c3:
[20.168286] POSIX: fl_owner=ffff880078b06940 fl_flags=0x1 fl_type=0x0 fl_pid=20183
[20.168727] POSIX: fl_owner=ffff880078b06680 fl_flags=0x1 fl_type=0x0 fl_pid=20185

For NFSv3, the missing unlock will cause the server to refuse conflicting
locks indefinitely. For NFSv4, the leftover lock will be removed by the
server after the lease timeout.

This patch fixes this issue by skipping the usual wait in
nfs_iocounter_wait if the FL_CLOSE flag is set when signaled. Instead, the
wait happens in the unlock RPC task on the NFS UOC rpc_waitqueue.

For NFSv3, use lockd's new nlmclnt_operations along with
nfs_async_iocounter_wait to defer NLM's unlock task until the lock
context's iocounter reaches zero.

For NFSv4, call nfs_async_iocounter_wait() directly from unlock's
current rpc_call_prepare.

Signed-off-by: Benjamin Coddington
Reviewed-by: Jeff Layton
Signed-off-by: Trond Myklebust

Benjamin Coddington
2017-04-21 22:45:01 +0800

e12937279 NFS: Move the flock open mode check into nfs_flock()

We only need to check lock exclusive/shared types against open mode when
flock() is used on NFS, so move it into the flock-specific path instead of
checking it for all locks.

Signed-off-by: Benjamin Coddington
Reviewed-by: Christoph Hellwig
Reviewed-by: Jeff Layton
Signed-off-by: Trond Myklebust

Benjamin Coddington
2017-04-21 22:45:00 +0800

25 Feb, 2017

1 commit

11bac8000 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf

->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang
Signed-off-by: Arnd Bergmann
Reviewed-by: Ross Zwisler
Cc: Theodore Ts'o
Cc: Darrick J. Wong
Cc: Matthew Wilcox
Cc: Dave Hansen
Cc: Christoph Hellwig
Cc: Jan Kara
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Jiang
2017-02-25 09:46:54 +0800

25 Dec, 2016

1 commit

7c0f6ba68 Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-25 03:46:01 +0800

22 Dec, 2016

1 commit

bc1ecd626 Merge tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull more NFS client updates from Trond Myklebust:
"Highlights include:

- further attribute cache improvements to make revalidation more fine
grained

- NFSv4 locking improvements

Bugfixes:

- nfs4_fl_prepare_ds must be careful about reporting success in files
layout

- pNFS/flexfiles: Instead of marking a device inactive, remove it
from the cache"

* tag 'nfs-for-4.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Retry the DELEGRETURN if the embedded GETATTR is rejected with EACCES
NFS: Retry the CLOSE if the embedded GETATTR is rejected with EACCES
NFSv4: Place the GETATTR operation before the CLOSE
NFSv4: Also ask for attributes when downgrading to a READ-only state
NFS: Don't abuse NFS_INO_REVAL_FORCED in nfs_post_op_update_inode_locked()
pNFS: Return RW layouts on OPEN_DOWNGRADE
NFSv4: Add encode/decode of the layoutreturn op in OPEN_DOWNGRADE
NFS: Don't disconnect open-owner on NFS4ERR_BAD_SEQID
NFSv4: ensure __nfs4_find_lock_state returns consistent result.
NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.
pNFS/flexfiles: delete deviceid, don't mark inactive
NFS: Clean up nfs_attribute_timeout()
NFS: Remove unused function nfs_revalidate_inode_rcu()
NFS: Fix and clean up the access cache validity checking
NFS: Only look at the change attribute cache state in nfs_weak_revalidate()
NFS: Clean up cache validity checking
NFS: Don't revalidate the file on close if we hold a delegation
NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN
NFSv4: Update the attribute cache info in update_changeattr

Linus Torvalds
2016-12-22 02:40:30 +0800

20 Dec, 2016

1 commit

61540bf6b NFS: Clean up cache validity checking

Consolidate the open-coded checking of NFS_I(inode)->cache_validity
into a couple of helper functions.

Signed-off-by: Trond Myklebust

Trond Myklebust
2016-12-20 06:29:35 +0800

18 Dec, 2016

1 commit

0110c350c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more vfs updates from Al Viro:
"In this pile:

- autofs-namespace series
- dedupe stuff
- more struct path constification"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
ocfs2: charge quota for reflinked blocks
ocfs2: fix bad pointer cast
ocfs2: always unlock when completing dio writes
ocfs2: don't eat io errors during _dio_end_io_write
ocfs2: budget for extent tree splits when adding refcount flag
ocfs2: prohibit refcounted swapfiles
ocfs2: add newlines to some error messages
ocfs2: convert inode refcount test to a helper
simple_write_end(): don't zero in short copy into uptodate
exofs: don't mess with simple_write_{begin,end}
9p: saner ->write_end() on failing copy into non-uptodate page
fix gfs2_stuffed_write_end() on short copies
fix ceph_write_end()
nfs_write_end(): fix handling of short copies
vfs: refactor clone/dedupe_file_range common functions
fs: try to clone files first in vfs_copy_file_range
vfs: misc struct path constification
namespace.c: constify struct path passed to a bunch of primitives
quota: constify struct path in quota_on
...

Linus Torvalds
2016-12-18 10:44:00 +0800

10 Dec, 2016

1 commit

c0cf3ef5e nfs_write_end(): fix handling of short copies

What matters when deciding if we should make a page uptodate is
not how much we _wanted_ to copy, but how much we actually have
copied. As it is, on architectures that do not zero tail on
short copy we can leave uninitialized data in page marked uptodate.
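
The distinction between the requested and the actual copy length can be shown with a toy check. The functions below are hypothetical and much simpler than the kernel's logic; they only contrast deciding on the bytes wanted with deciding on the bytes copied.

```python
def covers_whole_page(offset, nbytes, page_len):
    """True when bytes [offset, offset + nbytes) span the whole page."""
    return offset == 0 and nbytes >= page_len


def mark_uptodate_by_wanted(offset, wanted, copied, page_len):
    # Decides on what the caller asked for, ignoring a short copy.
    return covers_whole_page(offset, wanted, page_len)


def mark_uptodate_by_copied(offset, wanted, copied, page_len):
    # Decides on what was actually copied into the page.
    return covers_whole_page(offset, copied, page_len)
```

With 4096 bytes wanted but only 1000 copied, the first variant would mark the page uptodate although the remaining 3096 bytes were never filled in.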

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro

Al Viro
2016-12-10 11:41:47 +0800

05 Dec, 2016

1 commit

9310b224f NFS: Fix incorrect size revalidation when holding a delegation

We should only care about checking the attributes if the page cache
is marked as dubious (using NFS_INO_REVAL_PAGECACHE) and the
NFS_INO_REVAL_FORCED flag is set.

Signed-off-by: Trond Myklebust

Trond Myklebust
2016-12-05 07:08:40 +0800

14 Oct, 2016

1 commit

c4a86165d Merge tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
"Highlights include:

Stable bugfixes:
- sunrpc: fix write space race causing stalls
- NFS: Fix inode corruption in nfs_prime_dcache()
- NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()
- NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid
- NFSv4: Open state recovery must account for file permission changes
- NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic

Features:
- Add support for tracking multiple layout types with an ordered list
- Add support for using multiple backchannel threads on the client
- Add support for pNFS file layout session trunking
- Delay xprtrdma use of DMA API (for device driver removal)
- Add support for xprtrdma remote invalidation
- Add support for larger xprtrdma inline thresholds
- Use a scatter/gather list for sending xprtrdma RPC calls
- Add support for the CB_NOTIFY_LOCK callback
- Improve hashing sunrpc auth_creds by using both uid and gid

Bugfixes:
- Fix xprtrdma use of DMA API
- Validate filenames before adding to the dcache
- Fix corruption of xdr->nwords in xdr_copy_to_scratch
- Fix setting buffer length in xdr_set_next_buffer()
- Don't deadlock the state manager on the SEQUENCE status flags
- Various delegation and stateid related fixes
- Retry operations if an interrupted slot receives EREMOTEIO
- Make nfs boot time y2038 safe"

* tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (100 commits)
NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
fs: nfs: Make nfs boot time y2038 safe
sunrpc: replace generic auth_cred hash with auth-specific function
sunrpc: add RPCSEC_GSS hash_cred() function
sunrpc: add auth_unix hash_cred() function
sunrpc: add generic_auth hash_cred() function
sunrpc: add hash_cred() function to rpc_authops struct
Retry operation on EREMOTEIO on an interrupted slot
pNFS: Fix atime updates on pNFS clients
sunrpc: queue work on system_power_efficient_wq
NFSv4.1: Even if the stateid is OK, we may need to recover the open modes
NFSv4: If recovery failed for a specific open stateid, then don't retry
NFSv4: Fix retry issues with nfs41_test/free_stateid
NFSv4: Open state recovery must account for file permission changes
NFSv4: Mark the lock and open stateids as invalid after freeing them
NFSv4: Don't test open_stateid unless it is set
NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid
NFS: Always call nfs_inode_find_state_and_recover() when revoking a delegation
NFSv4: Fix a race when updating an open_stateid
NFSv4: Fix a race in nfs_inode_reclaim_delegation()
...

Linus Torvalds
2016-10-14 12:28:20 +0800

06 Oct, 2016

1 commit

82c156f85 switch generic_file_splice_read() to use of ->read_iter()

... and kill the ->splice_read() instances that can be switched to it

Signed-off-by: Al Viro

Al Viro
2016-10-06 06:23:56 +0800

23 Sep, 2016

1 commit

75575ddf2 nfs: eliminate pointless and confusing do_vfs_lock wrappers

Signed-off-by: Jeff Layton
Signed-off-by: Anna Schumaker

Jeff Layton
2016-09-23 01:56:04 +0800

20 Sep, 2016

1 commit

f844cd0d7 nfs: cover ->migratepage with CONFIG_MIGRATION

It is cleaner to use CONFIG_MIGRATION to cover NFS's private
.migratepage in nfs_file_aops, as we do in other parts of the NFS
operations.

Signed-off-by: Chao Yu
Signed-off-by: Anna Schumaker

Chao Yu
2016-09-20 21:29:39 +0800

04 Sep, 2016

1 commit

c49edecd5 NFS: Fix error reporting in nfs_file_write()

When doing O_DSYNC writes, the actual write errors are reported through
generic_write_sync(), so we must test the result.

Reported-by: J. R. Okajima
Fixes: 18290650b1c8 ("NFS: Move buffered I/O locking into nfs_file_write()")
Signed-off-by: Trond Myklebust

Trond Myklebust
2016-09-04 00:10:36 +0800

25 Jul, 2016

1 commit

362745268 Merge branch 'writeback'

Trond Myklebust
2016-07-25 05:08:31 +0800

20 Jul, 2016

1 commit

ce52914eb sunrpc: move NO_CRKEY_TIMEOUT to the auth->au_flags

A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
not really safe to use the generic_cred->acred->ac_flags to store
the NO_CRKEY_TIMEOUT flag. A lookup for a unx_cred triggered while the
KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
with the auth_cred to be in a state where they're perpetually doing 4K
NFS_FILE_SYNC writes.

This can be reproduced as follows:

1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
They do not need to be the same export, nor do they even need to be from
the same NFS server. Also, v3 is fine.
$ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
$ sudo mount -o v3,sec=sys server2:/export /mnt/sys

2. As the normal user, before accessing the kerberized mount, kinit with
a short lifetime (but not so short that renewing the ticket would leave
you within the 4-minute window again by the time the original ticket
expires), e.g.
$ kinit -l 10m -r 60m

3. Do some I/O to the kerberized mount and verify that the writes are
wsize, UNSTABLE:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

4. Wait until you're within 4 minutes of key expiry, then do some more
I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
set. Verify that the writes are 4K, FILE_SYNC:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

5. Now do some I/O to the sec=sys mount. This will cause
RPC_CRED_NO_CRKEY_TIMEOUT to be set:
$ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1

6. Writes for that user will now be permanently 4K, FILE_SYNC,
regardless of which mount is being written to, until you reboot
the client. Renewing the kerberos ticket (assuming it hasn't already
expired) will have no effect. Grabbing a new kerberos ticket at this
point will have no effect either.

Move the flag to the auth->au_flags field (which is currently unused)
and rename it slightly to reflect that it's no longer associated with
the auth_cred->ac_flags. Add the rpc_auth to the arg list of
rpcauth_cred_key_to_expire and check the au_flags there too. Finally,
add the inode to the arg list of nfs_ctx_key_to_expire so we can
determine the rpc_auth to pass to rpcauth_cred_key_to_expire.

Signed-off-by: Scott Mayhew
Signed-off-by: Trond Myklebust

Scott Mayhew
2016-07-20 04:23:24 +0800

06 Jul, 2016

2 commits

9a773e7c8 NFS nfs_vm_page_mkwrite: Don't freeze me, Bro...

Prevent filesystem freezes while handling the write page fault.

Signed-off-by: Trond Myklebust

Trond Myklebust
2016-07-06 07:11:08 +0800

f508d46ae NFS: Remove redundant waits for O_DIRECT in fsync() and write_begin()

We're now waiting immediately after taking the locks, so waiting
in fsync() and write_begin() is either redundant or potentially
subject to livelock (if not holding the lock).

Signed-off-by: Trond Myklebust

Trond Myklebust
2016-07-06 07:11:05 +0800