Eric Lee / smarc-fsl-linux-kernel

06 Nov, 2020

1 commit

1905cac9d NFSD: NFSv3 PATHCONF Reply is improperly formed ... Browse Code »

Commit cc028a10a48c ("NFSD: Hoist status code encoding into XDR
encoder functions") missed a spot.

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2020-11-06 06:20:12 +0800

12 Oct, 2020

1 commit

cc028a10a NFSD: Hoist status code encoding into XDR encoder functions ... Browse Code »

The original intent was presumably to reduce code duplication. The
trade-off was:

- No support for an NFSD proc function returning a non-success
RPC accept_stat value.
- No support for void NFS replies to non-NULL procedures.
- Everyone pays for the deduplication with a few extra conditional
branches in a hot path.

In addition, nfsd_dispatch() leaves *statp uninitialized in the
success path, unlike svc_generic_dispatch().

Address all of these problems by moving the logic for encoding
the NFS status code into the NFS XDR encoders themselves. Then
update the NFS .pc_func methods to return an RPC accept_stat
value.

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2020-10-12 22:29:44 +0800

02 Oct, 2020

1 commit

dcc46991d NFSD: Encoder and decoder functions are always present ... Browse Code »

nfsd_dispatch() is a hot path. Let's optimize the XDR method calls
for the by-far common case, which is that the XDR methods are indeed
present.

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2020-10-02 21:37:41 +0800

23 Jan, 2020

2 commits

19e0663ff nfsd: Ensure sampling of the write verifier is atomic with the write ... Browse Code »

When doing an unstable write, we need to ensure that we sample the
write verifier before releasing the lock, and allowing a commit to
the same file to proceed.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2020-01-23 05:25:41 +0800
524ff1af2 nfsd: Ensure sampling of the commit verifier is atomic with the commit ... Browse Code »

When we have a successful commit, ensure we sample the commit verifier
before releasing the lock.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2020-01-23 05:25:41 +0800

20 Dec, 2019

1 commit

92c5e4691 nfsd: handle nfs3 timestamps as unsigned ... Browse Code »

The decode_time3 function behaves differently on 32-bit
and 64-bit architectures: on the former, a 32-bit timestamp
gets converted into an signed number and then into a timestamp
between 1902 and 2038, while on the latter it is interpreted
as unsigned in the range 1970-2106.

Change all the remaining 'timespec' in nfsd to 'timespec64'
to make the behavior the same, and use the current interpretation
of the dominant 64-bit architectures.

Signed-off-by: Arnd Bergmann
Signed-off-by: J. Bruce Fields

Arnd Bergmann
2019-12-20 06:46:08 +0800

16 Nov, 2019

1 commit

6c2d4798a new helper: lookup_positive_unlocked() ... Browse Code »

Most of the callers of lookup_one_len_unlocked() treat negatives are
ERR_PTR(-ENOENT). Provide a helper that would do just that. Note
that a pinned positive dentry remains positive - it's ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.

Signed-off-by: Al Viro

Al Viro
2019-11-16 02:49:04 +0800

10 Sep, 2019

1 commit

27c438f53 nfsd: Support the server resetting the boot verifier ... Browse Code »

Add support to allow the server to reset the boot verifier in order to
force clients to resend I/O after a timeout failure.

Signed-off-by: Trond Myklebust
Signed-off-by: Lance Shelton
Signed-off-by: J. Bruce Fields

Trond Myklebust
2019-09-10 21:23:41 +0800

24 Apr, 2019

1 commit

e45d1a183 nfsd: knfsd must use the container user namespace ... Browse Code »

Convert knfsd to use the user namespace of the container that started
the server processes.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2019-04-24 21:46:35 +0800

06 Apr, 2019

1 commit

3c86794ac nfsd/nfsd3_proc_readdir: fix buffer count and page pointers ... Browse Code »

After this commit
f875a79 nfsd: allow nfsv3 readdir request to be larger.
nfsv3 readdir request size can be larger than PAGE_SIZE. So if the
directory been read is large enough, we can use multiple pages
in rq_respages. Update buffer count and page pointers like we do
in readdirplus to make this happen.

Now listing a directory within 3000 files will panic because we
are counting in a wrong way and would write on random page.

Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger"
Signed-off-by: Murphy Zhou
Signed-off-by: J. Bruce Fields

Murphy Zhou
2019-04-06 07:57:24 +0800

09 Mar, 2019

1 commit

f875a792a nfsd: allow nfsv3 readdir request to be larger. ... Browse Code »

nfsd currently reports the NFSv3 dtpref FSINFO parameter
to be PAGE_SIZE, so NFS clients will typically ask for one
page of directory entries at a time. This is needlessly restrictive
as nfsd can handle larger replies easily.

Also, a READDIR request (but not a READDIRPLUS request) has the count
size clipped to PAGE_SIE, again unnecessary.

This patch lifts these limits so that larger readdir requests can be
used.

Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2019-03-09 03:30:31 +0800

06 Mar, 2019

1 commit

b602345da nfsd: fix memory corruption caused by readdir ... Browse Code »

If the result of an NFSv3 readdir{,plus} request results in the
"offset" on one entry having to be split across 2 pages, and is sized
so that the next directory entry doesn't fit in the requested size,
then memory corruption can happen.

When encode_entry() is called after encoding the last entry that fits,
it notices that ->offset and ->offset1 are set, and so stores the
offset value in the two pages as required. It clears ->offset1 but
*does not* clear ->offset.

Normally this omission doesn't matter as encode_entry_baggage() will
be called, and will set ->offset to a suitable value (not on a page
boundary).
But in the case where cd->buflen < elen and nfserr_toosmall is
returned, ->offset is not reset.

This means that nfsd3proc_readdirplus will see ->offset with a value 4
bytes before the end of a page, and ->offset1 set to NULL.
It will try to write 8bytes to ->offset.
If we are lucky, the next page will be read-only, and the system will
BUG: unable to handle kernel paging request at...

If we are unlucky, some innocent page will have the first 4 bytes
corrupted.

nfsd3proc_readdir() doesn't even check for ->offset1, it just blindly
writes 8 bytes to the offset wherever it is.

Fix this by clearing ->offset after it is used, and copying the
->offset handling code from nfsd3_proc_readdirplus into
nfsd3_proc_readdir.

(Note that the commit hash in the Fixes tag is from the 'history'
tree - this bug predates git).

Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding")
Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0b1d57cf7654
Cc: stable@vger.kernel.org (v2.6.12+)
Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2019-03-06 05:41:33 +0800

06 Jun, 2018

1 commit

95582b008 vfs: change inode times to use struct timespec64 ... Browse Code »

struct timespec is not y2038 safe. Transition vfs to use
y2038 safe struct timespec64 instead.

The change was made with the help of the following cocinelle
script. This catches about 80% of the changes.
All the header file and logic changes are included in the
first 5 rules. The rest are trivial substitutions.
I avoid changing any of the function signatures or any other
filesystem specific data structures to keep the patch simple
for review.

The script can be a little shorter by combining different cases.
But, this version was sufficient for my usecase.

virtual patch

@ depends on patch @
identifier now;
@@
- struct timespec
+ struct timespec64
current_time ( ... )
{
- struct timespec now = current_kernel_time();
+ struct timespec64 now = current_kernel_time64();
...
- return timespec_trunc(
+ return timespec64_trunc(
... );
}

@ depends on patch @
identifier xtime;
@@
struct $ iattr \| inode \| kstat $ {
...
- struct timespec xtime;
+ struct timespec64 xtime;
...
}

@ depends on patch @
identifier t;
@@
struct inode_operations {
...
int (*update_time) (...,
- struct timespec t,
+ struct timespec64 t,
...);
...
}

@ depends on patch @
identifier t;
identifier fn_update_time =~ "update_time$";
@@
fn_update_time (...,
- struct timespec *t,
+ struct timespec64 *t,
...) { ... }

@ depends on patch @
identifier t;
@@
lease_get_mtime( ... ,
- struct timespec *t
+ struct timespec64 *t
) { ... }

@te depends on patch forall@
identifier ts;
local idexpression struct inode *inode_node;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn_update_time =~ "update_time$";
identifier fn;
expression e, E3;
local idexpression struct inode *node1;
local idexpression struct inode *node2;
local idexpression struct iattr *attr1;
local idexpression struct iattr *attr2;
local idexpression struct iattr attr;
identifier i_xtime1 =~ "^i_[acm]time$";
identifier i_xtime2 =~ "^i_[acm]time$";
identifier ia_xtime1 =~ "^ia_[acm]time$";
identifier ia_xtime2 =~ "^ia_[acm]time$";
@@
(
(
- struct timespec ts;
+ struct timespec64 ts;
|
- struct timespec ts = current_time(inode_node);
+ struct timespec64 ts = current_time(inode_node);
)

i_xtime, &ts)
+ timespec64_equal(&inode_node->i_xtime, &ts)
|
- timespec_equal(&ts, &inode_node->i_xtime)
+ timespec64_equal(&ts, &inode_node->i_xtime)
|
- timespec_compare(&inode_node->i_xtime, &ts)
+ timespec64_compare(&inode_node->i_xtime, &ts)
|
- timespec_compare(&ts, &inode_node->i_xtime)
+ timespec64_compare(&ts, &inode_node->i_xtime)
|
ts = current_time(e)
|
fn_update_time(..., &ts,...)
|
inode_node->i_xtime = ts
|
node1->i_xtime = ts
|
ts = inode_node->i_xtime
|
ia_xtime ...+> = ts
|
ts = attr1->ia_xtime
|
ts.tv_sec
|
ts.tv_nsec
|
btrfs_set_stack_timespec_sec(..., ts.tv_sec)
|
btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
|
- ts = timespec64_to_timespec(
+ ts =
...
-)
|
- ts = ktime_to_timespec(
+ ts = ktime_to_timespec64(
...)
|
- ts = E3
+ ts = timespec_to_timespec64(E3)
|
- ktime_get_real_ts(&ts)
+ ktime_get_real_ts64(&ts)
|
fn(...,
- ts
+ timespec64_to_timespec(ts)
,...)
)
...+>
(

)
|
- timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
|
- timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
+ timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
|
- timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
|
node1->i_xtime1 =
- timespec_trunc(attr1->ia_xtime1,
+ timespec64_trunc(attr1->ia_xtime1,
...)
|
- attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
+ attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2,
...)
|
- ktime_get_real_ts(&attr1->ia_xtime1)
+ ktime_get_real_ts64(&attr1->ia_xtime1)
|
- ktime_get_real_ts(&attr.ia_xtime1)
+ ktime_get_real_ts64(&attr.ia_xtime1)
)

@ depends on patch @
struct inode *node;
struct iattr *attr;
identifier fn;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
expression e;
@@
(
- fn(node->i_xtime);
+ fn(timespec64_to_timespec(node->i_xtime));
|
fn(...,
- node->i_xtime);
+ timespec64_to_timespec(node->i_xtime));
|
- e = fn(attr->ia_xtime);
+ e = fn(timespec64_to_timespec(attr->ia_xtime));
)

@ depends on patch forall @
struct inode *node;
struct iattr *attr;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn;
@@
{
+ struct timespec ts;
i_xtime);
fn (...,
- &node->i_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
fn (...,
- &attr->ia_xtime,
+ &ts,
...);
)
...+>
}

@ depends on patch forall @
struct inode *node;
struct iattr *attr;
struct kstat *stat;
identifier ia_xtime =~ "^ia_[acm]time$";
identifier i_xtime =~ "^i_[acm]time$";
identifier xtime =~ "^[acm]time$";
identifier fn, ret;
@@
{
+ struct timespec ts;
i_xtime);
ret = fn (...,
- &node->i_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(node->i_xtime);
ret = fn (...,
- &node->i_xtime);
+ &ts);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
ret = fn (...,
- &attr->ia_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
ret = fn (...,
- &attr->ia_xtime);
+ &ts);
|
+ ts = timespec64_to_timespec(stat->xtime);
ret = fn (...,
- &stat->xtime);
+ &ts);
)
...+>
}

@ depends on patch @
struct inode *node;
struct inode *node2;
identifier i_xtime1 =~ "^i_[acm]time$";
identifier i_xtime2 =~ "^i_[acm]time$";
identifier i_xtime3 =~ "^i_[acm]time$";
struct iattr *attrp;
struct iattr *attrp2;
struct iattr attr ;
identifier ia_xtime1 =~ "^ia_[acm]time$";
identifier ia_xtime2 =~ "^ia_[acm]time$";
struct kstat *stat;
struct kstat stat1;
struct timespec64 ts;
identifier xtime =~ "^[acmb]time$";
expression e;
@@
(
( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1 ;
|
node->i_xtime2 = $ node2->i_xtime1 \| timespec64_trunc(...) $;
|
node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = $ts \| current_time(...) $;
|
node->i_xtime1 = node->i_xtime3 = $ts \| current_time(...) $;
|
stat->xtime = node2->i_xtime1;
|
stat1.xtime = node2->i_xtime1;
|
( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1 ;
|
( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
|
- e = node->i_xtime1;
+ e = timespec64_to_timespec( node->i_xtime1 );
|
- e = attrp->ia_xtime1;
+ e = timespec64_to_timespec( attrp->ia_xtime1 );
|
node->i_xtime1 = current_time(...);
|
node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
- e;
+ timespec_to_timespec64(e);
|
node->i_xtime1 = node->i_xtime3 =
- e;
+ timespec_to_timespec64(e);
|
- node->i_xtime1 = e;
+ node->i_xtime1 = timespec_to_timespec64(e);
)

Signed-off-by: Deepa Dinamani
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:
Cc:

Deepa Dinamani
2018-06-06 07:57:31 +0800

04 Apr, 2018

2 commits

38a703155 NFSD: Clean up legacy NFS SYMLINK argument XDR decoders ... Browse Code »

Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

- one fewer data copies on transports that support DDP
- consistent error checking across all versions
- reduction of code duplication
- support for both legal forms of SYMLINK requests on RDMA
transports for all versions of NFS (in particular, NFSv2, for
completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2018-04-04 03:08:16 +0800
8154ef277 NFSD: Clean up legacy NFS WRITE argument XDR decoders ... Browse Code »

Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2018-04-04 03:08:16 +0800

09 Feb, 2018

1 commit

39ca1bf62 nfsd: store stat times in fill_pre_wcc() instead of inode times ... Browse Code »

The time values in stat and inode may differ for overlayfs and stat time
values are the correct ones to use. This is also consistent with the fact
that fill_post_wcc() also stores stat time values.

This means introducing a stat call that could fail, where previously we
were just copying values out of the inode. To be conservative about
changing behavior, we fall back to copying values out of the inode in
the error case. It might be better just to clear fh_pre_saved (though
note the BUG_ON in set_change_info).

Signed-off-by: Amir Goldstein
Signed-off-by: J. Bruce Fields

Amir Goldstein
2018-02-09 02:40:17 +0800

19 Nov, 2017

1 commit

4dd3c2e5a Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Lots of good bugfixes, including:

- fix a number of races in the NFSv4+ state code

- fix some shutdown crashes in multiple-network-namespace cases

- relax our 4.1 session limits; if you've an artificially low limit
to the number of 4.1 clients that can mount simultaneously, try
upgrading"

* tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits)
SUNRPC: Improve ordering of transport processing
nfsd: deal with revoked delegations appropriately
svcrdma: Enqueue after setting XPT_CLOSE in completion handlers
nfsd: use nfs->ns.inum as net ID
rpc: remove some BUG()s
svcrdma: Preserve CB send buffer across retransmits
nfds: avoid gettimeofday for nfssvc_boot time
fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t
fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t
fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t
lockd: double unregister of inetaddr notifiers
nfsd4: catch some false session retries
nfsd4: fix cached replies to solo SEQUENCE compounds
sunrcp: make function _svc_create_xprt static
SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
nfsd: use ARRAY_SIZE
nfsd: give out fewer session slots as limit approaches
nfsd: increase DRC cache limit
nfsd: remove unnecessary nofilehandle checks
nfs_common: convert int to bool
...

Linus Torvalds
2017-11-19 03:22:04 +0800

08 Nov, 2017

1 commit

256a89fa3 nfds: avoid gettimeofday for nfssvc_boot time ... Browse Code »

do_gettimeofday() is deprecated and we should generally use time64_t
based functions instead.

In case of nfsd, all three users of nfssvc_boot only use the initial
time as a unique token, and are not affected by it overflowing, so they
are not affected by the y2038 overflow.

This converts the structure to timespec64 anyway and adds comments
to all uses, to document that we have thought about it and avoid
having to look at it again.

Signed-off-by: Arnd Bergmann
Signed-off-by: J. Bruce Fields

Arnd Bergmann
2017-11-08 05:44:00 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license ... Browse Code »

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

13 Jul, 2017

1 commit

630458e73 nfsd4: factor ctime into change attribute ... Browse Code »

Factoring ctime into the nfsv4 change attribute gives us better
properties than just i_version alone.

Eventually we'll likely also expose this (as opposed to raw i_version)
to userspace, at which point we'll want to move it to a common helper,
called from either userspace or individual filesystems. For now, nfsd
is the only user.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2017-07-13 03:55:00 +0800

29 Jun, 2017

1 commit

9a1d168e1 Merge tag 'v4.12-rc5' into nfsd tree ... Browse Code »

Update to get f0c3192ceee3 "virtio_net: lower limit on buffer size".
That bug was interfering with my nfsd testing.

J. Bruce Fields
2017-06-29 01:34:15 +0800

17 May, 2017

1 commit

9512a16b0 nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments" ... Browse Code »

This reverts commit 51f567777799 "nfsd: check for oversized NFSv2/v3
arguments", which breaks support for NFSv3 ACLs.

That patch was actually an earlier draft of a fix for the problem that
was eventually fixed by e6838a29ecb "nfsd: check for oversized NFSv2/v3
arguments". But somehow I accidentally left this earlier draft in the
branch that was part of my 2.12 pull request.

Reported-by: Eryu Guan
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2017-05-17 04:16:30 +0800

15 May, 2017

3 commits

63f8de379 sunrpc: properly type pc_encode callbacks ... Browse Code »

Drop the resp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig
Acked-by: Trond Myklebust

Christoph Hellwig
2017-05-15 23:42:25 +0800
026fec7e7 sunrpc: properly type pc_decode callbacks ... Browse Code »

Drop the argp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2017-05-15 23:42:24 +0800
8537488b5 sunrpc: properly type pc_release callbacks ... Browse Code »

Drop the p and resp arguments as they are always NULL or can trivially
be derived from the rqstp argument. With that all functions now have the
same prototype, and we can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2017-05-15 23:42:23 +0800

26 Apr, 2017

3 commits

51f567777 nfsd: check for oversized NFSv2/v3 arguments ... Browse Code »

A client can append random data to the end of an NFSv2 or NFSv3 RPC call
without our complaining; we'll just stop parsing at the end of the
expected data and ignore the rest.

Encoded arguments and replies are stored together in an array of pages,
and if a call is too large it could leave inadequate space for the
reply. This is normally OK because NFS RPC's typically have either
short arguments and long replies (like READ) or long arguments and short
replies (like WRITE). But a client that sends an incorrectly long reply
can violate those assumptions. This was observed to cause crashes.

So, insist that the argument not be any longer than we expect.

Also, several operations increment rq_next_page in the decode routine
before checking the argument size, which can leave rq_next_page pointing
well past the end of the page array, causing trouble later in
svc_free_pages.

As followup we may also want to rewrite the encoding routines to check
more carefully that they aren't running off the end of the page array.

Reported-by: Tuomas Haanpää
Reported-by: Ari Kauppi
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2017-04-26 05:25:53 +0800
13bf9fbff nfsd: stricter decoding of write-like NFSv2/v3 ops ... Browse Code »

The NFSv2/v3 code does not systematically check whether we decode past
the end of the buffer. This generally appears to be harmless, but there
are a few places where we do arithmetic on the pointers involved and
don't account for the possibility that a length could be negative. Add
checks to catch these.

Reported-by: Tuomas Haanpää
Reported-by: Ari Kauppi
Reviewed-by: NeilBrown
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2017-04-26 04:36:23 +0800
db44bac41 nfsd4: minor NFSv2/v3 write decoding cleanup ... Browse Code »

Use a couple shortcuts that will simplify a following bugfix.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2017-04-26 04:36:16 +0800

25 May, 2016

1 commit

5d22c5ab8 Merge tag 'nfsd-4.7' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"A very quiet cycle for nfsd, mainly just an RDMA update from Chuck
Lever"

* tag 'nfsd-4.7' of git://linux-nfs.org/~bfields/linux:
sunrpc: fix stripping of padded MIC tokens
svcrpc: autoload rdma module
svcrdma: Generalize svc_rdma_xdr_decode_req()
svcrdma: Eliminate code duplication in svc_rdma_recvfrom()
svcrdma: Drain QP before freeing svcrdma_xprt
svcrdma: Post Receives only for forward channel requests
svcrdma: Remove superfluous line from rdma_read_chunks()
svcrdma: svc_rdma_put_context() is invoked twice in Send error path
svcrdma: Do not add XDR padding to xdr_buf page vector
svcrdma: Support IPv6 with NFS/RDMA
nfsd: handle seqid wraparound in nfsd4_preprocess_layout_stateid
Remove unnecessary allocation

Linus Torvalds
2016-05-25 05:39:20 +0800

14 May, 2016

1 commit

6625d0913 svcrdma: Do not add XDR padding to xdr_buf page vector ... Browse Code »

An xdr_buf has a head, a vector of pages, and a tail. Each
RPC request is presented to the NFS server contained in an
xdr_buf.

The RDMA transport would like to supply the NFS server with only
the NFS WRITE payload bytes in the page vector. In some common
cases, that would allow the NFS server to swap those pages right
into the target file's page cache.

Have the transport's RDMA Read logic put XDR pad bytes in the tail
iovec, and not in the pages that hold the data payload.

The NFSv3 WRITE XDR decoder is finicky about the lengths involved,
so make sure it is looking in the correct places when computing
the total length of the incoming NFS WRITE request.

Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-05-14 03:53:05 +0800

11 Apr, 2016

1 commit

fc64005c9 don't bother with ->d_inode->i_sb - it's always equal to ->d_sb ... Browse Code »

... and neither can ever be NULL

Signed-off-by: Al Viro

Al Viro
2016-04-11 05:11:51 +0800

09 Jan, 2016

1 commit

bbddca8e8 nfsd: don't hold i_mutex over userspace upcalls ... Browse Code »

We need information about exports when crossing mountpoints during
lookup or NFSv4 readdir. If we don't already have that information
cached, we may have to ask (and wait for) rpc.mountd.

In both cases we currently hold the i_mutex on the parent of the
directory we're asking rpc.mountd about. We've seen situations where
rpc.mountd performs some operation on that directory that tries to take
the i_mutex again, resulting in deadlock.

With some care, we may be able to avoid that in rpc.mountd. But it
seems better just to avoid holding a mutex while waiting on userspace.

It appears that lookup_one_len is pretty much the only operation that
needs the i_mutex. So we could just drop the i_mutex elsewhere and do
something like

mutex_lock()
lookup_one_len()
mutex_unlock()

In many cases though the lookup would have been cached and not required
the i_mutex, so it's more efficient to create a lookup_one_len() variant
that only takes the i_mutex when necessary.

Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields
Signed-off-by: Al Viro

NeilBrown
2016-01-09 16:07:52 +0800

13 Oct, 2015

1 commit

aaf91ec14 nfsd: switch unsigned char flags in svc_fh to bools ... Browse Code »

...just for clarity.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2015-10-13 05:31:04 +0800

07 May, 2015

1 commit

43b0e7ea5 nfsd: stop READDIRPLUS returning inconsistent attributes ... Browse Code »

The NFSv3 READDIRPLUS gets some of the returned attributes from the
readdir, and some from an inode returned from a new lookup. The two
objects could be different thanks to intervening renames.

The attributes in READDIRPLUS are optional, so let's just skip them if
we notice this case.

Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2015-05-07 23:47:00 +0800

16 Apr, 2015

1 commit

2b0143b5c VFS: normal filesystems (and lustre): d_inode() annotations ... Browse Code »

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-04-16 03:06:57 +0800

23 Jun, 2014

1 commit

3c7aa15d2 NFSD: Using min/max/min_t/max_t for calculate ... Browse Code »

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-06-23 23:31:36 +0800

23 May, 2014

1 commit

d40aa3372 nfsd: Remove assignments inside conditions ... Browse Code »

Assignments should not happen inside an if conditional, but in the line
before. This issue was reported by checkpatch.

The semantic patch that makes this change is as follows
(http://coccinelle.lip6.fr/):

//

@@
identifier i1;
expression e1;
statement S;
@@
-if(!(i1 = e1)) S
+i1 = e1;
+if(!i1)
+S

//

It has been tested by compilation.

Signed-off-by: Benoit Taine
Signed-off-by: J. Bruce Fields

Benoit Taine
2014-05-23 03:52:23 +0800

24 Jan, 2014

1 commit

068c34c0c nfsd: fix encode_entryplus_baggage stack usage ... Browse Code »

We stick an extra svc_fh in nfsd3_readdirres to save the need to
kmalloc, though maybe it would be fine to kmalloc instead.

Acked-by: Jeff Layton
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2014-01-24 02:50:27 +0800

11 Dec, 2013

1 commit

6e14b46b9 nfsd: don't return high mode bits ... Browse Code »

The Linux NFS server replies among other things to a "Check access permission"
the following:

NFS: File type = 2 (Directory)
NFS: Mode = 040755

A netapp server replies here:
NFS: File type = 2 (Directory)
NFS: Mode = 0755

The RFC 1813 i read:
fattr3

struct fattr3 {
ftype3 type;
mode3 mode;
uint32 nlink;
...
For the mode bits only the lowest 9 are defined in the RFC

As far as I can tell, knfsd has always done this, so apparently it's harmless.
Nevertheless, it appears to be wrong.

Note this is already correct in the NFSv4 case, only v2 and v3 need
fixing.

Signed-off-by: J. Bruce Fields

Albert Fluegel
2013-12-11 09:35:58 +0800

27 Feb, 2013

1 commit

d895cb1af Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.

The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.

Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.

PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...

Linus Torvalds
2013-02-27 12:16:07 +0800