20 Jan, 2021

1 commit


06 Jan, 2021

1 commit


12 Aug, 2020

1 commit

  • The current mirrored read failover code correctly resets the mirror
    index between failed reads, but it is not able to actually flip the
    RPC call over to the next RPC client.
    The end result is that we keep resending the RPC call to the same client
    over and over.

    The fix is to use the pnfs_read_resend_pnfs() mechanism to schedule a
    new RPC call, but we need to add the ability to pass in a mirror
    index so that we always retry the next mirror in the list.

    Fixes: 166bd5b889ac ("pNFS/flexfiles: Fix layoutstats handling during read failovers")
    Signed-off-by: Trond Myklebust

    Trond Myklebust
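
The failover behaviour described above can be sketched in user-space C. This is only an illustration, not the kernel code: `struct mirror`, `send_read()` and `read_with_failover()` are hypothetical names, and wrapping around to the start of the list is an assumption made for the sketch. The point it shows is that each retry is issued with an explicit mirror index, so the call moves to a different mirror instead of being resent to the same one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one mirror's RPC client; not a kernel type. */
struct mirror {
    bool healthy;   /* whether a read sent to this mirror succeeds */
    int  attempts;  /* how many reads were sent to this mirror */
};

/* Simulate sending one read to the given mirror. */
static bool send_read(struct mirror *m)
{
    m->attempts++;
    return m->healthy;
}

/*
 * Try each mirror once, starting at 'start'. Every retry is issued with
 * an explicit next index, so a failed read is never resent to the mirror
 * that just failed. Returns the index that served the read, or -1 if
 * every mirror failed. (Wrapping around the list is an assumption.)
 */
static int read_with_failover(struct mirror *mirrors, size_t n, size_t start)
{
    assert(n > 0 && start < n);
    for (size_t tried = 0; tried < n; tried++) {
        size_t idx = (start + tried) % n;

        if (send_read(&mirrors[idx]))
            return (int)idx;
    }
    return -1;
}
```

With three mirrors of which only the last is healthy, a read started at index 0 is sent once to each mirror and is served by index 2.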
     

28 Mar, 2020

7 commits


26 Mar, 2020

1 commit


16 Mar, 2020

3 commits


15 Jan, 2020

1 commit


21 Sep, 2019

2 commits


26 Apr, 2019

1 commit


02 Mar, 2019

1 commit


24 Feb, 2019

1 commit


20 Dec, 2018

1 commit

  • SUNRPC has two sorts of credentials, both of which appear as
    "struct rpc_cred".
    There are "generic credentials" which are supplied by clients
    such as NFS and passed in 'struct rpc_message' to indicate
    which user should be used to authorize the request, and there
    are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
    which describe the credential to be sent over the wire.

    This patch replaces all the generic credentials by 'struct cred'
    pointers - the credential structure used throughout Linux.

    For machine credentials, there is a special 'struct cred *' pointer
    which is statically allocated and recognized where needed as
    having a special meaning. A look-up of a low-level cred will
    map this to a machine credential.

    Signed-off-by: NeilBrown
    Acked-by: J. Bruce Fields
    Signed-off-by: Anna Schumaker

    NeilBrown
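
The "special pointer" idea for machine credentials can be illustrated with a small user-space sketch. The types below are simplified, hypothetical stand-ins (the real `struct cred` is far richer), and `lookup_lowlevel()` is not a SUNRPC function. The sketch only shows that the sentinel is recognized by its address rather than by its contents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified credential, for illustration only. */
struct cred {
    unsigned int uid;
};

/* Statically allocated sentinel: its address, not its contents, matters. */
static const struct cred machine_cred_sentinel = { 0 };

enum lowlevel_kind { LL_USER, LL_MACHINE };

/* Hypothetical stand-in for a low-level (on-the-wire) credential. */
struct lowlevel_cred {
    enum lowlevel_kind kind;
    unsigned int uid;
};

/* Map a generic credential pointer to a low-level credential. */
static struct lowlevel_cred lookup_lowlevel(const struct cred *c)
{
    struct lowlevel_cred ll;

    assert(c != NULL);
    if (c == &machine_cred_sentinel) {
        /* Pointer identity selects the machine credential. */
        ll.kind = LL_MACHINE;
        ll.uid = 0;
    } else {
        ll.kind = LL_USER;
        ll.uid = c->uid;
    }
    return ll;
}
```

Note that an ordinary credential with uid 0 is still treated as a user credential here, because only the sentinel's address carries the special meaning.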
     

01 Oct, 2018

1 commit


17 Aug, 2018

1 commit


09 Aug, 2018

1 commit


19 Jun, 2018

1 commit


01 Jun, 2018

5 commits


15 Jan, 2018

2 commits

  • pNFS block/SCSI layouts should gracefully handle cases where block devices
    are not available when a layout is retrieved, or the block devices are
    removed while the client holds a layout.

    While setting up a layout segment, keep a record of an unavailable or
    un-parsable block device in cache with a flag so that subsequent layouts do
    not spam the server with GETDEVINFO. We can reuse the current
    NFS_DEVICEID_UNAVAILABLE handling with one variation: instead of reusing
    the device, we will discard it and send a fresh GETDEVINFO after the
    timeout, since the lookup and validation of the device occurs within the
    GETDEVINFO response handling.

    A lookup of a layout segment that references an unavailable device will
    return a segment with the NFS_LSEG_UNAVAILABLE flag set. This will allow
    the pgio layer to mark the layout with the appropriate fail bit, which
    forces subsequent IO to the MDS, and prevents spamming the server with
    LAYOUTGET and LAYOUTRETURN.

    Finally, when IO to a block device fails, look up the block device(s)
    referenced by the pgio header, and mark them as unavailable.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Benjamin Coddington
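
The caching rule for unavailable devices can be sketched as a small decision function. This is a user-space illustration under stated assumptions: `struct dev_entry`, `dev_check()` and the `RETRY_TIMEOUT` value are hypothetical and do not come from the kernel source. It shows the variation described above: within the timeout the cached entry suppresses new GETDEVINFO calls, and after the timeout the entry is discarded and fetched afresh rather than reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached device record; not the kernel's structure. */
struct dev_entry {
    bool unavailable;   /* set when the device could not be found or parsed */
    long marked_at;     /* time the entry was flagged, in seconds */
};

enum dev_action {
    DEV_USE,      /* device is fine, use it */
    DEV_SKIP,     /* flagged recently: do not ask the server again yet */
    DEV_REFETCH   /* flag has expired: discard entry, send fresh GETDEVINFO */
};

#define RETRY_TIMEOUT 120L  /* hypothetical timeout, in seconds */

static enum dev_action dev_check(const struct dev_entry *d, long now)
{
    assert(d != NULL);
    if (!d->unavailable)
        return DEV_USE;
    if (now - d->marked_at < RETRY_TIMEOUT)
        return DEV_SKIP;
    return DEV_REFETCH;
}
```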
     
  • If there's an error doing I/O to a block device, and the client resends the
    I/O to the MDS, the MDS must recall the layout from the client before
    processing the I/O. Let's preempt that exchange by returning the layout
    before falling back to the MDS when there's an error.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Benjamin Coddington
     

18 Nov, 2017

4 commits

  • If our layoutreturn on close operation returns an NFS4ERR_OLD_STATEID,
    then try to update the stateid and retry. We know that there should
    be no further LAYOUTGET requests being launched.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
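
The retry on NFS4ERR_OLD_STATEID can be pictured with the following sketch. Everything here is a hypothetical simplification: the "server" is a function that accepts only one sequence id, and `refresh_stateid()` and `layoutreturn_on_close()` are illustrative names, not kernel functions. It assumes the client holds a newer copy of the stateid than the one placed in the request; if no newer copy exists, the error is returned as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ret_status { RET_OK, RET_OLD_STATEID };

/* Hypothetical stateid reduced to its sequence id. */
struct stateid {
    unsigned int seqid;
};

/* Hypothetical server: accepts only the seqid it currently holds. */
static enum ret_status server_layoutreturn(unsigned int server_seqid,
                                           const struct stateid *sent)
{
    return sent->seqid == server_seqid ? RET_OK : RET_OLD_STATEID;
}

/* Copy the latest known stateid into the request; false if nothing changed. */
static bool refresh_stateid(struct stateid *arg, const struct stateid *latest)
{
    if (latest->seqid == arg->seqid)
        return false;
    *arg = *latest;
    return true;
}

/* Send the layoutreturn, refreshing the stateid and retrying when it is old. */
static enum ret_status layoutreturn_on_close(unsigned int server_seqid,
                                             struct stateid arg,
                                             const struct stateid *latest)
{
    assert(latest != NULL);
    for (;;) {
        enum ret_status st = server_layoutreturn(server_seqid, &arg);

        if (st != RET_OLD_STATEID)
            return st;
        if (!refresh_stateid(&arg, latest))
            return st;  /* nothing newer to try: give up */
    }
}
```

The loop terminates because a refresh can succeed at most once: after it, the request already carries the latest stateid.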
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - the counter is initialized to 1 using atomic_set()
    - a resource is freed when the counter reaches zero
    - once the counter reaches zero, further increments aren't allowed
    - the counter schema uses basic atomic operations
      (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situations and be exploitable.

    The variable pnfs_layout_hdr.plh_refcount is used as a pure reference counter.
    Convert it to refcount_t and fix up the operations.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Anna Schumaker

    Elena Reshetova
     
  • The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This avoids accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: Anna Schumaker

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - the counter is initialized to 1 using atomic_set()
    - a resource is freed when the counter reaches zero
    - once the counter reaches zero, further increments aren't allowed
    - the counter schema uses basic atomic operations
      (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situations and be exploitable.

    The variable nfs4_pnfs_ds.ds_count is used as a pure reference counter.
    Convert it to refcount_t and fix up the operations.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Anna Schumaker

    Elena Reshetova
     

15 Aug, 2017

1 commit


24 May, 2017

1 commit

  • It's possible and acceptable for NFS to attempt to add requests beyond the
    range of the current pgio->pg_lseg, a case which should be caught and
    limited by the pg_test operation. However, the current handling of this
    case replaces pgio->pg_lseg with a new layout segment (after a WARN) within
    that pg_test operation. That will cause all the previously added requests
    to be submitted with this new layout segment, which may not be valid for
    those requests.

    Fix this problem by only returning zero for the number of bytes to coalesce
    from pg_test in this case, which allows any previously added requests to
    complete on the current layout segment. The check for requests starting
    out of range of the layout segment moves to pg_init, so that the
    replacement of pgio->pg_lseg will be done when the next request is added.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust

    Benjamin Coddington
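
The fixed pg_test behaviour can be approximated by a pure range check that never touches the current segment. This is a sketch under assumptions: `struct seg` and `seg_test()` are hypothetical names, and clamping an in-range request to the end of the segment is an illustrative choice rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte range covered by the current layout segment. */
struct seg {
    size_t offset;
    size_t length;
};

/*
 * Return how many bytes of the request may be coalesced onto the current
 * segment. A request that starts outside the segment yields zero, and the
 * segment itself is left untouched, so previously added requests still
 * complete against it.
 */
static size_t seg_test(const struct seg *s, size_t req_offset, size_t req_len)
{
    size_t end, room;

    assert(s != NULL);
    end = s->offset + s->length;
    if (req_offset < s->offset || req_offset >= end)
        return 0;
    room = end - req_offset;
    return req_len < room ? req_len : room;
}
```

Because the function takes the segment as `const`, replacing the segment has to happen elsewhere, which mirrors moving that step to pg_init when the next request is added.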
     

25 Apr, 2017

1 commit


21 Apr, 2017

1 commit