23 Oct, 2020

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The one new feature this time, from Anna Schumaker, is READ_PLUS,
    which has the same arguments as READ but allows the server to return
    an array of data and hole extents.

    Otherwise it's a lot of cleanup and bugfixes"

    * tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
    NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
    SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
    sunrpc: raise kernel RPC channel buffer size
    svcrdma: fix bounce buffers for unaligned offsets and multiple pages
    nfsd: remove unneeded break
    net/sunrpc: Fix return value for sysctl sunrpc.transports
    NFSD: Encode a full READ_PLUS reply
    NFSD: Return both a hole and a data segment
    NFSD: Add READ_PLUS hole segment encoding
    NFSD: Add READ_PLUS data support
    NFSD: Hoist status code encoding into XDR encoder functions
    NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
    NFSD: Remove the RETURN_STATUS() macro
    NFSD: Call NFSv2 encoders on error returns
    NFSD: Fix .pc_release method for NFSv2
    NFSD: Remove vestigial typedefs
    NFSD: Refactor nfsd_dispatch() error paths
    NFSD: Clean up nfsd_dispatch() variables
    NFSD: Clean up stale comments in nfsd_dispatch()
    NFSD: Clean up switch statement in nfsd_dispatch()
    ...

    Linus Torvalds
     

08 Oct, 2020

6 commits


26 Sep, 2020

2 commits

  • Reserving space for a large READ payload requires special handling when
    reserving space in the xdr buffer pages. One problem we can have is use
    of the scratch buffer, which is used to get a pointer to a contiguous
    region of data up to PAGE_SIZE. When using the scratch buffer, calls to
    xdr_commit_encode() shift the data to its proper alignment in the xdr
    buffer. If we've reserved several pages in a vector, then this could
    potentially invalidate earlier pointers and result in incorrect READ
    data being sent to the client.

    I get around this by looking at the amount of space left in the current
    page, and never reserve more than that for each entry in the read
    vector. This lets us place data directly where it needs to go in the
    buffer pages.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
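
    The page-bounded reservation described in this commit can be sketched
    in plain C. This is a minimal userspace illustration, not the kernel
    code: split_by_page() and SKETCH_PAGE_SIZE are hypothetical names, and
    the real patch works on the xdr buffer pages rather than on bare
    offsets. Each entry is capped at the space left in the current page,
    so no entry ever crosses a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Split a payload of `len` bytes that starts `offset` bytes into the
 * buffer into per-entry lengths, never letting a single entry cross a
 * page boundary. Returns the number of entries written to `out`, or 0
 * if `max_entries` is too small to hold them all.
 */
static size_t split_by_page(size_t offset, size_t len,
                            size_t *out, size_t max_entries)
{
    size_t n = 0;

    assert(out != NULL);

    while (len > 0) {
        /* Bytes remaining before the next page boundary. */
        size_t left_in_page = SKETCH_PAGE_SIZE - (offset % SKETCH_PAGE_SIZE);
        size_t chunk = len < left_in_page ? len : left_in_page;

        if (n == max_entries)
            return 0;

        out[n++] = chunk;
        offset += chunk;
        len -= chunk;
    }
    return n;
}
```

    For example, a 10000-byte payload starting at offset 4000 is split
    into entries of 96, 4096, 4096 and 1712 bytes.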
     
  • Drop duplicate words in net/sunrpc/.
    Also fix "Anyone" to be "Any one".

    Signed-off-by: Randy Dunlap
    Cc: "J. Bruce Fields"
    Cc: Chuck Lever
    Cc: linux-nfs@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Randy Dunlap
     

26 Jun, 2020

1 commit

  • @subbuf is an output parameter of xdr_buf_subsegment(). A survey of
    call sites shows that @subbuf is always uninitialized before
    xdr_buf_subsegment() is invoked by callers.

    There are some execution paths through xdr_buf_subsegment() that do
    not set all of the fields in @subbuf, leaving some pointer fields
    containing garbage addresses. Subsequent processing of that buffer
    then results in a page fault.

    Signed-off-by: Chuck Lever
    Cc:
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

27 Apr, 2020

1 commit

  • I've noticed that when krb5i or krb5p security is in use,
    retransmitted requests are missing the server's duplicate reply
    cache. The computed checksum on the retransmitted request does not
    match the cached checksum, resulting in the server performing the
    retransmitted request again instead of returning the cached reply.

    The assumptions made when removing xdr_buf_trim() were not correct.
    In the send paths, the upper layer has already set the segment
    lengths correctly, and shortening the buffer's content is simply a
    matter of reducing buf->len.

    xdr_buf_trim() is the right answer in the receive/unwrap path on
    both the client and the server. The buffer segment lengths have to
    be shortened one-by-one.

    On the server side in particular, head.iov_len needs to be updated
    correctly to enable nfsd_cache_csum() to work correctly. The simple
    buf->len computation doesn't do that, and that results in
    checksumming stale data in the buffer.

    The problem isn't noticed until there's significant instability of
    the RPC transport. At that point, the reliability of retransmit
    detection on the server becomes crucial.

    Fixes: 241b1f419f0e ("SUNRPC: Remove xdr_buf_trim()")
    Signed-off-by: Chuck Lever

    Chuck Lever
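
    The difference between reducing only buf->len and shortening the
    segments one by one can be illustrated with a simplified model. This
    is a sketch under stated assumptions: struct sketch_buf stands in for
    the kernel's struct xdr_buf, and sketch_buf_trim() is a hypothetical
    helper, not the restored xdr_buf_trim() itself. Bytes are removed from
    the end of the buffer, so the tail is consumed first, then the pages,
    then the head.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct xdr_buf (assumption). */
struct sketch_buf {
    size_t head_len;  /* bytes in the head segment */
    size_t page_len;  /* bytes in the page segment */
    size_t tail_len;  /* bytes in the tail segment */
    size_t len;       /* total content length */
};

/* Remove up to `want` bytes from one segment; return how many were cut. */
static size_t sketch_take(size_t *seg, size_t want)
{
    size_t cut = *seg < want ? *seg : want;

    *seg -= cut;
    return cut;
}

/*
 * Trim `len` bytes from the end of the buffer, shortening each segment
 * in turn so that the per-segment lengths stay consistent with buf->len.
 */
static void sketch_buf_trim(struct sketch_buf *buf, size_t len)
{
    size_t remaining = len;

    remaining -= sketch_take(&buf->tail_len, remaining);
    remaining -= sketch_take(&buf->page_len, remaining);
    remaining -= sketch_take(&buf->head_len, remaining);

    /* Only subtract what was actually removed from the segments. */
    buf->len -= len - remaining;
}
```

    With head/pages/tail lengths of 100/50/8, trimming 20 bytes leaves
    100/38/0 and a total length of 138. Reducing only the total length
    would leave the segment lengths describing 158 bytes, which is the
    kind of mismatch the commit message says breaks checksumming.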
     

16 Mar, 2020

1 commit


15 Jan, 2020

1 commit


18 Nov, 2019

1 commit

  • xdr_shrink_pagelen() BUG's when @len is larger than buf->page_len.
    This can happen when xdr_buf_read_mic() is given an xdr_buf with
    a small page array (like, only a few bytes).

    Instead, just cap the number of bytes that xdr_shrink_pagelen()
    will move.

    Fixes: 5f1bc39979d ("SUNRPC: Fix buffer handling of GSS MIC ... ")
    Signed-off-by: Chuck Lever
    Reviewed-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Chuck Lever
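
    The capping idea is small enough to show in isolation. This is only
    an illustration: capped_shrink_len() is a hypothetical helper, and the
    actual change lives inside xdr_shrink_pagelen(). Instead of asserting
    that the request fits, the amount to move is clamped to what the page
    array actually holds.

```c
#include <stddef.h>

/*
 * Return how many bytes may be moved out of a page array holding
 * `page_len` bytes when the caller asks for `len`. A request larger
 * than the array is clamped rather than treated as a fatal error.
 */
static size_t capped_shrink_len(size_t page_len, size_t len)
{
    return len > page_len ? page_len : len;
}
```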
     

21 Sep, 2019

2 commits

  • Let the name reflect the single use. The function now assumes the GSS MIC
    is the last object in the buffer.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
     
  • The GSS Message Integrity Check data for krb5i may lie partially in the XDR
    reply buffer's pages and tail. If so, we try to copy the entire MIC into
    free space in the tail. But as the estimations of the slack space required
    for authentication and verification have improved there may be less free
    space in the tail to complete this copy -- see commit 2c94b8eca1a2
    ("SUNRPC: Use au_rslack when computing reply buffer size"). In fact, there
    may only be room in the tail for a single copy of the MIC, and not part of
    the MIC and then another complete copy.

    The real world failure reported is that `ls` of a directory on NFS may
    sometimes return -EIO, which can be traced back to xdr_buf_read_netobj()
    failing to find available free space in the tail to copy the MIC.

    Fix this by checking for the case of the MIC crossing the boundaries of
    head, pages, and tail. If so, shift the buffer until the MIC is contained
    completely within the pages or tail. This allows the remainder of the
    function to create a sub buffer that directly addresses the complete MIC.

    Signed-off-by: Benjamin Coddington
    Cc: stable@vger.kernel.org # v5.1
    Reviewed-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
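
    The boundary check described above can be modelled on offsets alone.
    This is a rough sketch, not the kernel logic: the function names are
    hypothetical, and the buffer is reduced to the lengths of its head and
    page segments. An object crosses a boundary when it starts before the
    boundary and ends after it.

```c
#include <stddef.h>

/* Does the object [off, off + obj_len) straddle the position `b`? */
static int sketch_straddles(size_t off, size_t obj_len, size_t b)
{
    return off < b && off + obj_len > b;
}

/*
 * Report whether an object crosses the head/pages boundary or the
 * pages/tail boundary of a buffer with the given segment lengths.
 */
static int sketch_crosses_boundary(size_t head_len, size_t page_len,
                                   size_t off, size_t obj_len)
{
    return sketch_straddles(off, obj_len, head_len) ||
           sketch_straddles(off, obj_len, head_len + page_len);
}
```

    With a 100-byte head and 4096 bytes of pages, a 16-byte object at
    offset 4190 crosses into the tail, while the same object at offset
    4196 lies entirely in the tail and needs no shifting.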
     

20 Aug, 2019

1 commit

  • Micro-optimization: For xdr_commit_encode call sites in
    net/sunrpc/xdr.c, eliminate the extra calling sequence. On my
    client, this change saves about a microsecond for every 30 calls
    to xdr_reserve_space().

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

14 Feb, 2019

6 commits

  • Certain NFS results (e.g. READLINK) might expect a data payload that
    is not an exact multiple of 4 bytes. In this case, XDR encoding
    is required to pad that payload so its length on the wire is a
    multiple of 4 bytes. The constants that define the maximum size of
    each NFS result do not appear to account for this extra word.

    In each case where the data payload is to be received into pages:

    - 1 word is added to the size of the receive buffer allocated by
    call_allocate

    - rpc_inline_rcv_pages subtracts 1 word from @hdrsize so that the
    extra buffer space falls into the rcv_buf's tail iovec

    - If buf->page_len is word-aligned, an XDR pad is not needed and
    is thus removed from the tail

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
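
    The padding rule behind this change is plain XDR arithmetic: opaque
    data is padded with zero bytes up to the next multiple of 4. The
    sketch below uses hypothetical helper names and is not taken from the
    patch.

```c
#include <stddef.h>

/* Round a payload length up to the next multiple of 4 bytes. */
static size_t sketch_xdr_padded_len(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Number of pad bytes XDR appends after a payload of `n` bytes. */
static size_t sketch_xdr_pad_bytes(size_t n)
{
    return sketch_xdr_padded_len(n) - n;
}
```

    A 5-byte payload occupies 8 bytes on the wire with 3 pad bytes, while
    an 8-byte payload is already word-aligned and needs no pad, which is
    the case where the extra tail space is removed.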
     
  • prepare_reply_buffer() and its NFSv4 equivalents expose the details
    of the RPC header and the auth slack values to upper layer
    consumers, creating a layering violation, and duplicating code.

    Remedy these issues by adding a new RPC client API that hides those
    details from upper layers in a common helper function.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • The key action of xdr_buf_trim() is that it shortens buf->len, the
    length of the xdr_buf's content. The other actions -- shortening the
    head, pages, and tail components -- are actually not necessary. In
    particular, changing the size of those components can corrupt the
    RPC message contained in the buffer. This is an accident waiting to
    happen rather than a current bug, as far as we know.

    Signed-off-by: Chuck Lever
    Acked-by: Bruce Fields
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • We don't want READ payloads that are partially in the head iovec and
    in the page buffer because this requires pull-up, which can be
    expensive.

    The NFS/RPC client tries hard to predict the size of the head iovec
    so that the incoming READ data payload lands only in the page
    vector, but it doesn't always get it right. To help diagnose such
    problems, add a trace point in the logic that decodes READ-like
    operations that reports whether pull-up is being done.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • This can help field troubleshooting without needing the overhead of
    a full network capture (i.e., tcpdump).

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Having access to the controlling rpc_rqst means a trace point in the
    XDR code can report:

    - the XID
    - the task ID and client ID
    - the p_name of the RPC being processed

    Subsequent patches will introduce such trace points.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

09 Nov, 2018

1 commit


06 Nov, 2018

1 commit

  • When truncating the encode buffer, the page_ptr is getting
    advanced, causing the next page to be skipped while encoding.
    The page is still included in the response, so the response
    contains a page of bogus data.

    We need to adjust the page_ptr backwards to ensure we encode
    the next page into the correct place.

    We saw this triggered when concurrent directory modifications caused
    nfsd4_encode_dirent_fattr() to return nfserr_noent, and the resulting
    call to xdr_truncate_encode() corrupted the READDIR reply.

    Signed-off-by: Frank Sorenson
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Frank Sorenson
     

01 Oct, 2018

1 commit


11 Apr, 2018

1 commit


26 Apr, 2017

1 commit


22 Feb, 2017

1 commit


23 Sep, 2016

2 commits


09 May, 2016

1 commit


05 Apr, 2016

2 commits

  • Mostly direct substitution with occasional adjustment or removing
    outdated comments.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely that it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion whether
    PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

08 Jan, 2015

1 commit

  • A struct xdr_stream at a page boundary might point to the end of one
    page or the beginning of the next, but xdr_truncate_encode isn't
    prepared to handle the former.

    This can cause corruption of NFSv4 READDIR replies in the case that a
    readdir entry that would have exceeded the client's dircount/maxcount
    limit would have ended exactly on a 4k page boundary. You're more
    likely to hit this case on large directories.

    Other xdr_truncate_encode callers are probably also affected.

    Reported-by: Holger Hoffstätte
    Tested-by: Holger Hoffstätte
    Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode"
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

24 Oct, 2014

2 commits


18 Jul, 2014

1 commit


07 Jun, 2014

1 commit

  • The rpc code makes available to the NFS server an array of pages to
    encode into. The server represents its reply as an xdr buf, with the
    head pointing into the first page in that array, the pages ** array
    starting just after that, and the tail (if any) sharing any leftover
    space in the page used by the head.

    While encoding, we use xdr_stream->page_ptr to keep track of which page
    we're currently using.

    Currently we set xdr_stream->page_ptr to buf->pages, which makes the
    head a weird exception to the rule that page_ptr always points to the
    page we're currently encoding into. So, instead set it to buf->pages -
    1 (the page actually containing the head), and remove the need for a
    little unintuitive logic in xdr_get_next_encode_buffer() and
    xdr_truncate_encode.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields