28 Mar, 2020

16 commits


26 Mar, 2020

4 commits

  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Refactor pnfs_generic_commit_pagelist() to simplify the conversion
    to layout segment based commit lists.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Just allocate the array at the end of the layout segment structure,
    instead of allocating it as a separate array of pointers.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Ever since commit 2c94b8eca1a2 ("SUNRPC: Use au_rslack when computing
    reply buffer size"). It changed how "req->rq_rcvsize" is calculated. It
    used to use au_cslack value which was nice and large and changed it to
    au_rslack value which turns out to be too small.

    Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
    because client's receive buffer it too small.

    For gss krb5p, we need to account for the mic token in the verifier,
    and the wrap token in the wrap token.

    RFC 4121 defines:
    mic token
    Octet no Name Description
    --------------------------------------------------------------
    0..1 TOK_ID Identification field. Tokens emitted by
    GSS_GetMIC() contain the hex value 04 04
    expressed in big-endian order in this
    field.
    2 Flags Attributes field, as described in section
    4.2.2.
    3..7 Filler Contains five octets of hex value FF.
    8..15 SND_SEQ Sequence number field in clear text,
    expressed in big-endian order.
    16..last SGN_CKSUM Checksum of the "to-be-signed" data and
    octet 0..15, as described in section 4.2.4.

    that's 16bytes (GSS_KRB5_TOK_HDR_LEN) + chksum

    wrap token
    Octet no Name Description
    --------------------------------------------------------------
    0..1 TOK_ID Identification field. Tokens emitted by
    GSS_Wrap() contain the hex value 05 04
    expressed in big-endian order in this
    field.
    2 Flags Attributes field, as described in section
    4.2.2.
    3 Filler Contains the hex value FF.
    4..5 EC Contains the "extra count" field, in big-
    endian order as described in section 4.2.3.
    6..7 RRC Contains the "right rotation count" in big-
    endian order, as described in section
    4.2.5.
    8..15 SND_SEQ Sequence number field in clear text,
    expressed in big-endian order.
    16..last Data Encrypted data for Wrap tokens with
    confidentiality, or plaintext data followed
    by the checksum for Wrap tokens without
    confidentiality, as described in section
    4.2.4.

    Also 16bytes of header (GSS_KRB5_TOK_HDR_LEN), encrypted data, and cksum
    (other things like padding)

    RFC 3961 defines known cksum sizes:
    Checksum type sumtype checksum section or
    value size reference
    ---------------------------------------------------------------------
    CRC32 1 4 6.1.3
    rsa-md4 2 16 6.1.2
    rsa-md4-des 3 24 6.2.5
    des-mac 4 16 6.2.7
    des-mac-k 5 8 6.2.8
    rsa-md4-des-k 6 16 6.2.6
    rsa-md5 7 16 6.1.1
    rsa-md5-des 8 24 6.2.4
    rsa-md5-des3 9 24 ??
    sha1 (unkeyed) 10 20 ??
    hmac-sha1-des3-kd 12 20 6.3
    hmac-sha1-des3 13 20 ??
    sha1 (unkeyed) 14 20 ??
    hmac-sha1-96-aes128 15 20 [KRB5-AES]
    hmac-sha1-96-aes256 16 20 [KRB5-AES]
    [reserved] 0x8003 ? [GSS-KRB5]

    Linux kernel now mainly supports type 15,16 so max cksum size is 20bytes.
    (GSS_KRB5_MAX_CKSUM_LEN)

    Re-use already existing define of GSS_KRB5_MAX_SLACK_NEEDED that's used
    for encoding the gss_wrap tokens (same tokens are used in reply).

    Fixes: 2c94b8eca1a2 ("SUNRPC: Use au_rslack when computing reply buffer size")
    Signed-off-by: Olga Kornievskaia
    Reviewed-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
     

25 Mar, 2020

2 commits

  • UDP was originally disabled in 6da1a034362f for NFSv4. Later in
    b24ee6c64ca7 UDP is by default disabled by NFS_DISABLE_UDP_SUPPORT=y for
    all NFS versions. Therefore remove v4 from error message.

    Fixes: b24ee6c64ca7 ("NFS: allow deprecation of NFS UDP protocol")

    Signed-off-by: Petr Vorel
    Signed-off-by: Trond Myklebust

    Petr Vorel
     
  • UDP is disabled by default in commit b24ee6c64ca7 ("NFS: allow
    deprecation of NFS UDP protocol"), but the default mount options
    is still udp, change it to tcp to avoid the "Unsupported transport
    protocol udp" error if no protocol is specified when mount nfs.

    Fixes: b24ee6c64ca7 ("NFS: allow deprecation of NFS UDP protocol")
    Signed-off-by: Liwei Song
    Signed-off-by: Trond Myklebust

    Liwei Song
     

23 Mar, 2020

1 commit

  • When dreq is allocated by nfs_direct_req_alloc(), dreq->kref is
    initialized to 2. Therefore we need to call nfs_direct_req_release()
    twice to release the allocated dreq. Usually it is called in
    nfs_file_direct_{read, write}() and nfs_direct_complete().

    However, current code only calls nfs_direct_req_relese() once if
    nfs_get_lock_context() fails in nfs_file_direct_{read, write}().
    So, that case would result in memory leak.

    Fix this by adding the missing call.

    Signed-off-by: Misono Tomohiro
    Signed-off-by: Trond Myklebust

    Misono Tomohiro
     

18 Mar, 2020

1 commit


16 Mar, 2020

16 commits

  • By preventing compiler inlining of the integrity and privacy
    helpers, stack utilization for the common case (authentication only)
    goes way down.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: this function is no longer used.

    Signed-off-by: Chuck Lever
    Reviewed-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • xdr_buf_read_mic() tries to find unused contiguous space in a
    received xdr_buf in order to linearize the checksum for the call
    to gss_verify_mic. However, the corner cases in this code are
    numerous and we seem to keep missing them. I've just hit yet
    another buffer overrun related to it.

    This overrun is at the end of xdr_buf_read_mic():

    1284 if (buf->tail[0].iov_len != 0)
    1285 mic->data = buf->tail[0].iov_base + buf->tail[0].iov_len;
    1286 else
    1287 mic->data = buf->head[0].iov_base + buf->head[0].iov_len;
    1288 __read_bytes_from_xdr_buf(&subbuf, mic->data, mic->len);
    1289 return 0;

    This logic assumes the transport has set the length of the tail
    based on the size of the received message. base + len is then
    supposed to be off the end of the message but still within the
    actual buffer.

    In fact, the length of the tail is set by the upper layer when the
    Call is encoded so that the end of the tail is actually the end of
    the allocated buffer itself. This causes the logic above to set
    mic->data to point past the end of the receive buffer.

    The "mic->data = head" arm of this if statement is no less fragile.

    As near as I can tell, this has been a problem forever. I'm not sure
    that minimizing au_rslack recently changed this pathology much.

    So instead, let's use a more straightforward approach: kmalloc a
    separate buffer to linearize the checksum. This is similar to
    how gss_validate() currently works.

    Coming back to this code, I had some trouble understanding what
    was going on. So I've cleaned up the variable naming and added
    a few comments that point back to the XDR definition in RFC 2203
    to help guide future spelunkers, including myself.

    As an added clean up, the functionality that was in
    xdr_buf_read_mic() is folded directly into gss_unwrap_resp_integ(),
    as that is its only caller.

    Signed-off-by: Chuck Lever
    Reviewed-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Trond Myklebust

    Gustavo A. R. Silva
     
  • This fixes xfstests generic/356 failure on NFSv4.2.

    Signed-off-by: Murphy Zhou
    Signed-off-by: Trond Myklebust

    Murphy Zhou
     
  • In function nfs_permission:
    1. the rcu_read_lock and rcu_read_unlock around nfs_do_access
    is unnecessary because the rcu critical data structure is already
    protected in subsidiary function nfs_access_get_cached_rcu. No other
    data structure needs rcu_read_lock in nfs_do_access.

    2. call nfs_do_access once is enough, because:
    2-1. when mask has MAY_NOT_BLOCK bit
    The second call to nfs_do_access will not happen.

    2-2. when mask has no MAY_NOT_BLOCK bit
    The second call to nfs_do_access will happen if res == -ECHILD, which
    means the first nfs_do_access goes out after statement if (!may_block).
    The second call to nfs_do_access will go through this procedure once
    again except continue the work after if (!may_block).
    But above work can be performed by only one call to nfs_do_access
    without mangling the mask flag.

    Tested in x86_64
    Signed-off-by: Zhouyi Zhou
    Signed-off-by: Trond Myklebust

    Zhouyi Zhou
     
  • The variable status is being initialized with a value that is never
    read and it is being updated later with a new value. The initialization
    is redundant and can be removed.

    Addresses-Coverity: ("Unused value")
    Signed-off-by: Colin Ian King
    Signed-off-by: Trond Myklebust

    Colin Ian King
     
  • When we receive a CB_RECALL_ANY that asks us to return flexfiles
    layouts, we iterate through all the layouts and look at whether or
    not there are active open file descriptors that might need them
    for I/O. If there are no such descriptors, we return the layouts.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Convert to use nfs_client_for_each_server() for efficiency.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Convert nfs_delegation_reap_unclaimed() to use nfs_client_for_each_server()
    for efficiency.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Convert it to use the nfs_client_for_each_server() helper, and
    make it more efficient by skipping delegations for inodes we
    know are in the process of being freed. Also improve the efficiency
    of the cursor by skipping delegations that are being freed.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Add a helper nfs_client_for_each_server() to iterate through all the
    filesystems that are attached to a struct nfs_client, and apply
    a function to all the active ones.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Now that we can rely on just the rcu_read_lock(), remove the
    clp->cl_lock and clean up.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Make sure to test the stateid for validity so that we catch instances
    where the server may have been reusing stateids in
    nfs_layout_find_inode_by_stateid().

    Fixes: 7b410d9ce460 ("pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers")
    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Ensure that if the DS is returning too many DELAY and GRACE errors, we
    also report that to the MDS through the layouterror mechanism.

    Signed-off-by: Trond Myklebust

    Trond Myklebust