12 Oct, 2020
5 commits
-
Reply to the client with multiple hole and data segments. I use the
result of the first vfs_llseek() call for encoding as an optimization so
we don't have to immediately repeat the call. This also lets us encode
any remaining reply as data if we get an unexpected result while trying
to calculate a hole.
Signed-off-by: Anna Schumaker
Signed-off-by: J. Bruce Fields
-
But only one of each right now. We'll expand on this in the next patch.
Signed-off-by: Anna Schumaker
Signed-off-by: J. Bruce Fields
-
However, we still only reply to the READ_PLUS call with a single segment
at this time.
Signed-off-by: Anna Schumaker
Signed-off-by: J. Bruce Fields
-
This patch adds READ_PLUS support for returning a single
NFS4_CONTENT_DATA segment to the client. This is basically the same as
the READ operation, only with the extra information about data segments.
Signed-off-by: Anna Schumaker
Signed-off-by: J. Bruce Fields
-
The original intent was presumably to reduce code duplication. The
trade-off was:
- No support for an NFSD proc function returning a non-success
  RPC accept_stat value.
- No support for void NFS replies to non-NULL procedures.
- Everyone pays for the deduplication with a few extra conditional
  branches in a hot path.
In addition, nfsd_dispatch() leaves *statp uninitialized in the
success path, unlike svc_generic_dispatch().
Address all of these problems by moving the logic for encoding
the NFS status code into the NFS XDR encoders themselves. Then
update the NFS .pc_func methods to return an RPC accept_stat
value.
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields
02 Oct, 2020
1 commit
-
nfsd_dispatch() is a hot path. Let's optimize the XDR method calls
for the by-far common case, which is that the XDR methods are indeed
present.
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields
26 Sep, 2020
5 commits
-
Squelch some sparse warnings:
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: expected int status
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: got restricted __be32
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: expected restricted __be32
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: got int status
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields
-
Squelch some sparse warnings:
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: expected int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: got restricted __be32 [usertype]
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: expected int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: got restricted __be32 [usertype]
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: expected restricted __be32 [usertype] err
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: got int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: expected unsigned int [assigned] [usertype] count
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: got restricted __be32 [usertype]
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields
-
Reserving space for a large READ payload requires special handling when
reserving space in the xdr buffer pages. One problem we can have is use
of the scratch buffer, which is used to get a pointer to a contiguous
region of data up to PAGE_SIZE. When using the scratch buffer, calls to
xdr_commit_encode() shift the data to its proper alignment in the xdr
buffer. If we've reserved several pages in a vector, then this could
potentially invalidate earlier pointers and result in incorrect READ
data being sent to the client.
I get around this by looking at the amount of space left in the current
page, and never reserve more than that for each entry in the read
vector. This lets us place data directly where it needs to go in the
buffer pages.
Signed-off-by: Anna Schumaker
Signed-off-by: J. Bruce Fields
-
In nfsd4_encode_listxattrs(), the variable p is assigned to at one point
but this value is never used before p is reassigned. Fix this.
Addresses-Coverity: ("Unused value")
Signed-off-by: Alex Dewar
Signed-off-by: J. Bruce Fields
-
Missing "is".
Signed-off-by: Alex Dewar
Signed-off-by: J. Bruce Fields
14 Jul, 2020
3 commits
-
Check if user extended attributes are supported for an inode,
and return the answer when being queried for file attributes.
An exported filesystem can now signal its RFC8276 user extended
attributes capability.
Signed-off-by: Frank van der Linden
Signed-off-by: Chuck Lever
-
Implement the main entry points for the *XATTR operations.
Add functions to calculate the reply size for the user extended attribute
operations, and implement the XDR encode / decode logic for these
operations.
Add the user extended attributes operations to nfsd4_ops.
Signed-off-by: Frank van der Linden
Signed-off-by: Chuck Lever
-
nfs4_decode_write has code to parse incoming XDR write data into
a kvec head, and a list of pages.
Put this code into a separate function, so that it can be used
later by the xattr code, for setxattr. No functional change.
Signed-off-by: Frank van der Linden
Signed-off-by: Chuck Lever
17 Mar, 2020
3 commits
-
Address some minor nits I noticed while working on this function.
Signed-off-by: Chuck Lever
-
svcrdma expects that the payload falls precisely into the xdr_buf
page vector. This does not seem to be the case for
nfsd4_encode_readv().
This code is called only when fops->splice_read is missing or when
RQ_SPLICE_OK is clear, so it's not a noticeable problem in many
common cases.
Add new transport method: ->xpo_read_payload so that when a READ
payload does not fit exactly in rq_res's page vector, the XDR
encoder can inform the RPC transport exactly where that payload is,
without the payload's XDR pad.
That way, when a Write chunk is present, the transport knows what
byte range in the Reply message is supposed to be matched with the
chunk.
Note that the Linux NFS server implementation of NFS/RDMA can
currently handle only one Write chunk per RPC-over-RDMA message.
This simplifies the implementation of this fix.
Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
Signed-off-by: Chuck Lever
-
Currently, nfsd4_encode_exchange_id() encodes the utsname nodename
string in the server_scope field. In a multi-host container
environment, if an nfsd container is restarted on a different host than
it was originally running on, clients will see a server_scope mismatch
and will not attempt to reclaim opens.
Instead, set the server_scope while we're in a process context during
service startup, so we get the utsname nodename of the current process
and store that in nfsd_net.
Signed-off-by: Scott Mayhew
[bfields: fix up major_id too]
Signed-off-by: J. Bruce Fields
Signed-off-by: Chuck Lever
20 Dec, 2019
2 commits
-
The values in encode_time_delta are always small and don't
overflow the range of 'struct timespec', so changing it has
no effect.
Change it to timespec64 as a prerequisite for removing the
timespec definition later.
Signed-off-by: Arnd Bergmann
Signed-off-by: J. Bruce Fields
-
The replay variable is set in the only caller of nfsd4_encode_replay.
The assertion is unnecessary and the patch removes this check.
Signed-off-by: Aditya Pakki
Signed-off-by: J. Bruce Fields
10 Dec, 2019
2 commits
-
Signed-off-by: Olga Kornievskaia
-
Decode the ca_source_server list that's sent but only use the
first one. Presence of a non-zero list indicates an "inter" copy.
Signed-off-by: Andy Adamson
Signed-off-by: Olga Kornievskaia
08 Dec, 2019
1 commit
-
Pull nfsd updates from Bruce Fields:
"This is a relatively quiet cycle for nfsd, mainly various bugfixes.
Possibly most interesting are Trond's fixes for some callback races
that were due to my incomplete understanding of rpc client shutdown.
Unfortunately at the last minute I've started noticing a new
intermittent failure to send callbacks. As the logic seems basically
correct, I'm leaving Trond's patches in for now, and hope to find a
fix in the next week so I don't have to revert those patches"
* tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits)
nfsd: depend on CRYPTO_MD5 for legacy client tracking
NFSD fixing possible null pointer derefering in copy offload
nfsd: check for EBUSY from vfs_rmdir/vfs_unink.
nfsd: Ensure CLONE persists data and metadata changes to the target file
SUNRPC: Fix backchannel latency metrics
nfsd: restore NFSv3 ACL support
nfsd: v4 support requires CRYPTO_SHA256
nfsd: Fix cld_net->cn_tfm initialization
lockd: remove __KERNEL__ ifdefs
sunrpc: remove __KERNEL__ ifdefs
race in exportfs_decode_fh()
nfsd: Drop LIST_HEAD where the variable it declares is never used.
nfsd: document callback_wq serialization of callback code
nfsd: mark cb path down on unknown errors
nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
nfsd: minor 4.1 callback cleanup
SUNRPC: Fix svcauth_gss_proxy_init()
SUNRPC: Trace gssproxy upcall results
sunrpc: fix crash when cache_head become valid before update
nfsd: remove private bin2hex implementation
...
16 Nov, 2019
1 commit
-
Most of the callers of lookup_one_len_unlocked() treat negatives as
ERR_PTR(-ENOENT). Provide a helper that would do just that. Note
that a pinned positive dentry remains positive - its ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.
Signed-off-by: Al Viro
09 Oct, 2019
1 commit
-
Fixes gcc '-Wunused-but-set-variable' warning:
fs/nfsd/nfs4xdr.c: In function nfsd4_encode_splice_read:
fs/nfsd/nfs4xdr.c:3464:7: warning: variable len set but not used [-Wunused-but-set-variable]
It is not used since commit 83a63072c815 ("nfsd: fix nfs read eof detection")
Reported-by: Hulk Robot
Signed-off-by: YueHaibing
Signed-off-by: J. Bruce Fields
24 Sep, 2019
1 commit
-
Currently, the knfsd server assumes that a short read indicates an
end of file. That assumption is incorrect. The short read means that
either we've hit the end of file, or we've hit a read error.
In the case of a read error, the client may want to retry (as per the
implementation recommendations in RFC1813 and RFC7530), but currently it
is being told that it hit an eof.
Move the code to detect eof from version specific code into the generic
nfsd read.
Report eof only in the two following cases:
1) read() returns a zero length short read with no error.
2) the offset+length of the read is >= the file size.
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
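As an illustration of the two eof cases listed in the commit above, here is a minimal userspace sketch. The helper name read_reports_eof() and its parameters are hypothetical and are not the kernel's code. The sketch also assumes that "length" in case 2 means the number of bytes actually returned by the read.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: decide whether a
 * completed read should be reported to the client as end-of-file.
 *
 * err       - 0 on success, nonzero if the read failed
 * offset    - starting offset of the read
 * requested - number of bytes the client asked for
 * returned  - number of bytes actually read (assumed meaning of "length")
 * file_size - current size of the file
 */
static bool read_reports_eof(int err, uint64_t offset, uint64_t requested,
			     uint64_t returned, uint64_t file_size)
{
	/* A failed read is never eof; the client may want to retry. */
	if (err)
		return false;

	/* Case 1: a zero-length short read with no error. */
	if (requested != 0 && returned == 0)
		return true;

	/* Case 2: the read reached or passed the end of the file. */
	return offset + returned >= file_size;
}
```

With this rule, a short read that stops in the middle of the file is not reported as eof, which is the behaviour change the commit message describes.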
29 Aug, 2019
1 commit
-
We're unnecessarily limiting the size of an ACL to less than what most
filesystems will support. Some users do hit the limit and it's
confusing and unnecessary.
It still seems prudent to impose some limit on the number of ACEs the
client gives us before passing it straight to kmalloc(). So, let's just
limit it to the maximum number that would be possible given the amount
of data left in the argument buffer.
That will still leave one limit beyond whatever the filesystem imposes:
the client and server negotiate a limit on the size of a request, which
we have to respect.
But we're no longer imposing any additional arbitrary limit.
struct nfs4_ace is 20 bytes on my system and the maximum call size we'll
negotiate is about a megabyte, so in practice this is limiting the
allocation here to about a megabyte.
Reported-by: "de Vandiere, Louis"
Signed-off-by: J. Bruce Fields
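The bound described in the ACL commit above can be sketched roughly as follows. The function name ace_count_plausible() and the constant MIN_ACE_WIRE_BYTES are assumptions made for this example; in particular, the per-ACE minimum of 20 bytes on the wire is an assumed figure, not one quoted from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed minimum encoded size of one ACE, for illustration:
 * three 4-byte fields, a 4-byte name length, and at least
 * 4 bytes of name.
 */
#define MIN_ACE_WIRE_BYTES 20u

/*
 * Hypothetical check: could the argument buffer possibly hold
 * "nace" entries, given how many bytes are still undecoded?
 * If not, the count should be rejected before any allocation.
 */
static bool ace_count_plausible(uint32_t nace, size_t bytes_left)
{
	return nace <= bytes_left / MIN_ACE_WIRE_BYTES;
}
```

Because the remaining bytes are bounded by the negotiated request size, a check of this shape bounds the allocation without a separate hard-coded ACE limit.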
19 Aug, 2019
3 commits
-
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
-
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
-
Have nfs4_preprocess_stateid_op pass back a nfsd_file instead of a filp.
Since we now presume that the struct file will be persistent in most
cases, we can stop fiddling with the raparms in the read code. This
also means that we don't really care about the rd_tmp_file field
anymore.
Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
04 Jul, 2019
3 commits
-
Decode the implementation ID and display in nfsd/clients/#/info. It may
help identify the client. It won't be used otherwise.
(When this went into the protocol, I thought the implementation ID would
be a slippery slope towards implementation-specific workarounds as with
the http user-agent. But I guess I was wrong, the risk seems pretty low
now.)
Signed-off-by: J. Bruce Fields
-
Commit bf8d909705e "nfsd: Decode and send 64bit time values" fixed the
code without updating the comment.
Signed-off-by: J. Bruce Fields
-
After commit 95582b008388 "vfs: change inode times to use struct
timespec64" there are spots in the NFSv4 decoding where we decode the
protocol into a struct timeval and then convert that into a timeval64.
That's unnecessary in the NFSv4 case since the on-the-wire protocol also
uses 64-bit values. So just fix up our code to use timeval64 everywhere.
Signed-off-by: J. Bruce Fields
24 Apr, 2019
2 commits
-
Convert knfsd to use the user namespace of the container that started
the server processes.
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
-
clang warns that 'contextlen' may be accessed without an initialization:
fs/nfsd/nfs4xdr.c:2911:9: error: variable 'contextlen' is uninitialized when used here [-Werror,-Wuninitialized]
contextlen);
^~~~~~~~~~
fs/nfsd/nfs4xdr.c:2424:16: note: initialize the variable 'contextlen' to silence this warning
int contextlen;
^
= 0
Presumably this cannot happen, as FATTR4_WORD2_SECURITY_LABEL is
set if CONFIG_NFSD_V4_SECURITY_LABEL is enabled.
Adding another #ifdef like the other two in this function
avoids the warning.
Signed-off-by: Arnd Bergmann
Signed-off-by: J. Bruce Fields
26 Sep, 2018
3 commits
-
Upon receiving a request for async copy, create a new kthread. If we
get an asynchronous request, make sure to copy the needed arguments/state
from the stack before starting the copy. Then start the thread and reply
back to the client indicating copy is asynchronous.
nfsd_copy_file_range() will copy in a loop over the total number of
bytes that need to be copied. In case a failure happens in the middle, we
ignore the error and return how much we copied so far. Once done,
create a work item for the callback workqueue and send CB_OFFLOAD with
the results.
The lifetime of the copy stateid is bound to the vfs copy. This way we
don't need to keep the nfsd_net structure for the callback. We could
keep it around longer so that an OFFLOAD_STATUS that came late would
still get results, but clients should be able to deal without that.
We handle OFFLOAD_CANCEL by sending a signal to the copy thread and
calling kthread_stop.
A client should cancel any ongoing copies before calling DESTROY_CLIENT;
if not, we return a CLIENT_BUSY error.
If the client is destroyed for some other reason (lease expiration, or
server shutdown), we must clean up any ongoing copies ourselves.
Signed-off-by: Olga Kornievskaia
[colin.king@canonical.com: fix leak in error case]
[bfields@fieldses.org: remove signalling, merge patches]
Signed-off-by: J. Bruce Fields
-
Signed-off-by: Olga Kornievskaia
Signed-off-by: J. Bruce Fields
-
Signed-off-by: Olga Kornievskaia
Signed-off-by: J. Bruce Fields
10 Aug, 2018
1 commit
-
READ_BUF(8);
dummy = be32_to_cpup(p++);
dummy = be32_to_cpup(p++);
...
READ_BUF(4);
dummy = be32_to_cpup(p++);
Assigning a value to "dummy" here, but that stored value
is overwritten before it can be used.
At the same time READ_BUF() will re-update the pointer p.
Delete the invalid assignment statements.
Signed-off-by: nixiaoming
Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
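To make the point of the commit above concrete, here is a small userspace model of skipping unused XDR words without storing them. Everything in it (struct xdr_cursor, cursor_take(), decode_after_skipping()) is hypothetical and only stands in for the kernel's READ_BUF() macro; it is a sketch of the idea, not the patched code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decode cursor over a received XDR buffer. */
struct xdr_cursor {
	const unsigned char *pos;
	const unsigned char *end;
};

/*
 * Return a pointer to the next nbytes and advance the cursor,
 * or NULL if fewer than nbytes remain.
 */
static const unsigned char *cursor_take(struct xdr_cursor *c, size_t nbytes)
{
	const unsigned char *p;

	if ((size_t)(c->end - c->pos) < nbytes)
		return NULL;
	p = c->pos;
	c->pos += nbytes;
	return p;
}

/*
 * Skip two unused 32-bit words, then decode the third one.
 * The skipped words are consumed by advancing the cursor only;
 * no "dummy" variable is assigned.
 */
static int decode_after_skipping(struct xdr_cursor *c, uint32_t *value)
{
	const unsigned char *p;

	if (!cursor_take(c, 8))
		return -1;
	p = cursor_take(c, 4);
	if (!p)
		return -1;
	/* XDR integers are big-endian on the wire. */
	*value = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		 ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return 0;
}
```

Since the next reservation repositions the pointer anyway, dropping the stores changes nothing about what gets decoded.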
17 Jun, 2018
2 commits
-
The change attribute is what is used by clients to revalidate their
caches. Our server may use i_version or ctime for that purpose. Those
choices behave slightly differently, and it may be useful to the client
to know which we're using. This attribute tells the client that. The
Linux client doesn't use this attribute yet, though.
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields
-
Currently we return the worst-case value of 1 second in the time delta
attribute. That's not terribly useful. Instead, return a value
calculated from the time granularity supported by the filesystem and the
system clock.
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields
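One way to picture the calculation described in the commit above is the sketch below. The names struct time_delta and time_delta_from_gran() are hypothetical, and the choice to take the coarser of the two granularities is an assumption about how "calculated from" should be read, not something the message states.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/* Hypothetical stand-in for the seconds/nanoseconds pair sent to the client. */
struct time_delta {
	int64_t seconds;
	uint32_t nseconds;
};

/*
 * Assumed rule: timestamps can be no finer than the coarser of the
 * filesystem's time granularity and the system clock tick, both
 * given in nanoseconds by the caller.
 */
static struct time_delta time_delta_from_gran(uint32_t fs_gran_ns,
					      uint32_t clock_tick_ns)
{
	uint32_t ns = fs_gran_ns > clock_tick_ns ? fs_gran_ns : clock_tick_ns;
	struct time_delta d = {
		.seconds = ns / NSEC_PER_SEC,
		.nseconds = ns % NSEC_PER_SEC,
	};

	return d;
}
```

For a filesystem with 1 ns granularity and a 4 ms clock tick this yields 0 seconds and 4000000 nanoseconds, rather than the worst-case value of 1 second.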