08 Oct, 2016

1 commit

  • I only implemented the sync version of this call, since it's the
    easiest. I can simply call vfs_copy_file_range() and have the vfs do the
    right thing for the filesystem being exported.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
     

14 Jul, 2016

2 commits

  • This addresses the conundrum referenced in RFC 5661, section 18.35.3,
    and will allow clients to return state to the server using the
    machine credentials.

    The biggest part of the problem is that we need to allow the client
    to send a compound op with integrity/privacy on mounts that don't
    have it enabled.

    Add server support for properly decoding and using spo_must_enforce
    and spo_must_allow bits. Add support for machine credentials to be
    used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN,
    and TEST/FREE STATEID.
    Implement a check so as to not throw WRONGSEC errors when these
    operations are used if integrity/privacy isn't turned on.

    Without this, Linux clients with credentials that expired while holding
    delegations were getting stuck in an endless loop.

    Signed-off-by: Andrew Elble
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Andrew Elble
     
  • Rename mach_creds_match() to nfsd4_mach_creds_match() and un-staticify

    Signed-off-by: Andrew Elble
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Andrew Elble
     

08 Dec, 2015

1 commit

  • This is basically a remote version of the btrfs CLONE operation,
    so the implementation is fairly trivial. Made even more trivial
    by stealing the XDR code and general framework from Anna Schumaker's
    COPY prototype.

    Signed-off-by: Christoph Hellwig
    Acked-by: J. Bruce Fields
    Signed-off-by: Al Viro

    Christoph Hellwig
     

13 Oct, 2015

1 commit


23 Jun, 2015

1 commit

  • This patch changes nfs4_preprocess_stateid_op so it always returns
    a valid struct file if it has been asked for that. For that we
    now allocate a temporary struct file for special stateids, and check
    permissions if we got the file structure from the stateid. This
    ensures that all callers will get their handling of special stateids
    right, and avoids code duplication.

    There is a little wart in here because the read code needs to know
    if we allocated a file structure so that it can copy around the
    read-ahead parameters. In the long run we should probably aim to
    cache full file structures used with special stateids instead.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Christoph Hellwig
     

05 May, 2015

1 commit

  • For the sake of forgetful clients, the server should return the layouts
    to the file system on 'last close' of a file (assuming that there are no
    delegations outstanding to that particular client) or on delegreturn
    (assuming that there are no opens on a file from that particular
    client).

    In theory the information is all there in current data structures, but
    it's not efficiently available; nfs4_file->fi_ref includes references on
    the file across all clients, but we need a per-(client, file) count.
    Walking through lots of stateid's to calculate this on each close or
    delegreturn would be painful.

    This patch introduces infrastructure to maintain per-client open and
    delegation counters on a per-file basis.

    [hch: ported to the mainline pNFS support, merged various fixes from Jeff]
    Signed-off-by: Sachin Bhamare
    Signed-off-by: Jeff Layton
    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Sachin Bhamare
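
    The per-(client, file) counting described above can be sketched roughly
    as follows. This is a minimal userspace illustration, not the kernel
    code: the struct and function names are hypothetical, and it assumes
    layouts are returned only once both counters for that client reach zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-(client, file) record; names are illustrative only. */
struct client_file_counts {
    unsigned int opens;        /* opens this client holds on the file */
    unsigned int delegations;  /* delegations this client holds on the file */
};

void cf_open(struct client_file_counts *c)     { c->opens++; }
void cf_delegate(struct client_file_counts *c) { c->delegations++; }

/* Drop one open; returns true if layouts should now be returned. */
bool cf_close(struct client_file_counts *c)
{
    assert(c->opens > 0);
    c->opens--;
    return c->opens == 0 && c->delegations == 0;
}

/* Drop one delegation; returns true if layouts should now be returned. */
bool cf_delegreturn(struct client_file_counts *c)
{
    assert(c->delegations > 0);
    c->delegations--;
    return c->opens == 0 && c->delegations == 0;
}
```

    With counters like these, the decision on each close or delegreturn is a
    constant-time check instead of a walk over the client's stateids.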
     

27 Apr, 2015

1 commit

  • Pull fourth vfs update from Al Viro:
    "d_inode() annotations from David Howells (sat in for-next since before
    the beginning of merge window) + four assorted fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    RCU pathwalk breakage when running into a symlink overmounting something
    fix I_DIO_WAKEUP definition
    direct-io: only inc/dec inode->i_dio_count for file systems
    fs/9p: fix readdir()
    VFS: assorted d_backing_inode() annotations
    VFS: fs/inode.c helpers: d_inode() annotations
    VFS: fs/cachefiles: d_backing_inode() annotations
    VFS: fs library helpers: d_inode() annotations
    VFS: assorted weird filesystems: d_inode() annotations
    VFS: normal filesystems (and lustre): d_inode() annotations
    VFS: security/: d_inode() annotations
    VFS: security/: d_backing_inode() annotations
    VFS: net/: d_inode() annotations
    VFS: net/unix: d_backing_inode() annotations
    VFS: kernel/: d_inode() annotations
    VFS: audit: d_backing_inode() annotations
    VFS: Fix up some ->d_inode accesses in the chelsio driver
    VFS: Cachefiles should perform fs modifications on the top layer only
    VFS: AF_UNIX sockets should call mknod on the top layer only

    Linus Torvalds
     

16 Apr, 2015

1 commit


01 Apr, 2015

2 commits


03 Feb, 2015

1 commit

  • Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
    LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
    outstanding layouts and devices.

    Layout management is very straightforward, with a nfs4_layout_stateid
    structure that extends nfs4_stid to manage layout stateids as the
    top-level structure. It is linked into the nfs4_file and nfs4_client
    structures like the other stateids, and contains a linked list of
    layouts that hang off the stateid. The actual layout operations are
    implemented in layout drivers that are not part of this commit, but
    will be added later.

    The worst part of this commit is the management of the pNFS device IDs,
    which suffers from a specification that is not sanely implementable due
    to the fact that the device-IDs are global and not bound to an export,
    and have a small enough size so that we can't store the fsid portion of
    a file handle, and must never be reused. As we still do need to perform all
    export authentication and validation checks on a device ID passed to
    GETDEVICEINFO we are caught between a rock and a hard place. To work
    around this issue we add a new hash that maps from a 64-bit integer to a
    fsid so that we can look up the export to authenticate against it,
    a 32-bit integer as a generation that we can bump when changing the device,
    and a currently unused 32-bit integer that could be used in the future
    to handle more than a single device per export. Entries in this hash
    table are never deleted as we can't reuse the ids anyway, and would have
    a severe lifetime problem anyway as Linux export structures are temporary
    structures that can go away under load.

    Parts of the XDR data, structures and marshaling/unmarshaling code, as
    well as many concepts are derived from the old pNFS server implementation
    from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
    Mike Sager, Ricardo Labiaga and many others.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
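
    The device ID layout described above (a 64-bit hash key, a 32-bit
    generation and an unused 32-bit field) fits the 16-byte opaque deviceid4
    of NFSv4.1. A minimal sketch of packing and unpacking such an ID follows.
    The struct, field and function names are hypothetical, and the big-endian
    wire order is an assumption made for this illustration, not something the
    commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEVID_WIRE_SIZE 16  /* deviceid4 is a 16-byte opaque in NFSv4.1 */

/* Hypothetical in-memory form; names are illustrative only. */
struct devid_sketch {
    uint64_t fsid_idx;    /* key into the hash that maps to an fsid */
    uint32_t generation;  /* bumped when the device changes */
    uint32_t pad;         /* currently unused */
};

static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Pack the three fields into the 16-byte wire form (assumed big-endian). */
void devid_encode(const struct devid_sketch *id,
                  unsigned char out[DEVID_WIRE_SIZE])
{
    assert(id != NULL && out != NULL);
    put_be32(out, (uint32_t)(id->fsid_idx >> 32));
    put_be32(out + 4, (uint32_t)id->fsid_idx);
    put_be32(out + 8, id->generation);
    put_be32(out + 12, id->pad);
}

/* Recover the fields; the caller would then look fsid_idx up in the hash. */
void devid_decode(const unsigned char in[DEVID_WIRE_SIZE],
                  struct devid_sketch *id)
{
    assert(id != NULL && in != NULL);
    id->fsid_idx = ((uint64_t)get_be32(in) << 32) | get_be32(in + 4);
    id->generation = get_be32(in + 8);
    id->pad = get_be32(in + 12);
}
```

    Because the 64-bit key only indexes a server-side table, the ID stays
    small enough for the wire format while still letting GETDEVICEINFO find
    the export to authenticate against.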
     

08 Nov, 2014

2 commits

  • DEALLOCATE only returns a status value, meaning we can use the noop()
    xdr encoder to reply to the client.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
     
  • The ALLOCATE operation is used to preallocate space in a file. I can do
    this by using vfs_fallocate() to do the actual preallocation.

    ALLOCATE only returns a status indicator, so we don't need to write a
    special encode() function.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
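
    As a rough userspace analogue of the ALLOCATE commit above, the sketch
    below preallocates a byte range and reports only a status, mirroring the
    status-only reply. It uses posix_fallocate() rather than the in-kernel
    vfs_fallocate(), and the function name allocate_range is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Preallocate [offset, offset + length) in the file behind fd.
 * Returns 0 on success or an errno value on failure; there is no other
 * result to hand back to the caller.
 */
int allocate_range(int fd, off_t offset, off_t length)
{
    int rc;

    assert(fd >= 0);  /* precondition of this sketch: an open descriptor */
    rc = posix_fallocate(fd, offset, length);
    if (rc != 0)
        fprintf(stderr, "allocate_range: %s\n", strerror(rc));
    return rc;
}
```

    Since the only result is a status code, no operation-specific reply
    encoding is needed beyond that status.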
     

30 Sep, 2014

1 commit

  • This patch adds server support for the NFS v4.2 operation SEEK, which
    returns the position of the next hole or data segment in a file.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
     

01 Aug, 2014

1 commit

  • We don't want to rely on the client_mutex for protection in the case of
    NFSv4 open owners. Instead, we add a mutex that will only be taken for
    NFSv4.0 state mutating operations, and that will be released once the
    entire compound is done.

    Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay
    take a reference to the stateowner when they are using it for NFSv4.0
    open and lock replay caching.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

10 Jul, 2014

1 commit

  • We want to use the nfsd4_compound_state to cache the nfs4_client in
    order to optimise away extra lookups of the clid.

    In the v4.0 case, we use this to ensure that we only have to look up the
    client at most once per compound for each call into lookup_clientid. For
    v4.1+ we set the pointer in the cstate during SEQUENCE processing so we
    should never need to do a search for it.

    Signed-off-by: Trond Myklebust
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Jeff Layton
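
    The caching idea in this commit can be sketched as a memoized lookup
    held in per-compound state. All names below are hypothetical. The
    behaviour on a mismatching client ID (returning NULL) and the choice not
    to cache failed searches are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct client_sketch {
    unsigned long clid;
};

/* Hypothetical per-compound state. */
struct cstate_sketch {
    struct client_sketch *clp;  /* cached client, NULL until first found */
    unsigned int slow_lookups;  /* counts table searches, for illustration */
};

static struct client_sketch *search_table(struct client_sketch *table,
                                          size_t n, unsigned long clid)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].clid == clid)
            return &table[i];
    return NULL;
}

/*
 * Return the client for clid, searching the table at most once per
 * compound on success. A v4.1+ server would instead fill cs->clp while
 * processing SEQUENCE, so the search path would never run.
 */
struct client_sketch *lookup_clientid_sketch(struct cstate_sketch *cs,
                                             struct client_sketch *table,
                                             size_t n, unsigned long clid)
{
    assert(cs != NULL);
    if (cs->clp != NULL)
        return cs->clp->clid == clid ? cs->clp : NULL;
    cs->slow_lookups++;
    cs->clp = search_table(table, n, clid);
    return cs->clp;
}
```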
     

09 Jul, 2014

4 commits


31 May, 2014

4 commits

  • Currently we limit readdir results to a single page. This can result in
    a performance regression compared to NFSv3 when reading large
    directories.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • It will turn out to be useful to have a more accurate estimate of reply
    size; so, piggyback on the existing op reply-size estimators.

    Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct
    nfsd4_operation and friends. (Thanks to Christoph Hellwig for pointing
    out that simplification.)

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Limits on maxresp_sz mean that we only ever need to replay rpc's that
    are contained entirely in the head.

    The one exception is very small zero-copy reads. That's an odd corner
    case as clients wouldn't normally ask those to be cached.

    In any case, this seems a little more robust.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
    and special checking in those operations (getattr) whose result can vary
    enormously.

    However:
    - COMPOUND_SLACK_SPACE may be difficult to maintain as we add
    more protocol.
    - BUG_ON or page faulting on failure seems overly fragile.
    - Especially in the 4.1 case, we prefer not to fail compounds
    just because the returned result came *close* to session
    limits. (Though perfect enforcement here may be difficult.)
    - I'd prefer encoding to be uniform for all encoders instead of
    having special exceptions for encoders containing, for
    example, attributes.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

27 May, 2014

1 commit

  • If nfsd4_check_resp_size() returns an error then we should really be
    truncating the reply here, otherwise we may leave extra garbage at the
    end of the rpc reply.

    Also add a warning to catch any cases where our reply-size estimates may
    be wrong in the case of a non-idempotent operation.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

23 May, 2014

2 commits


29 Mar, 2014

1 commit


07 Jan, 2014

1 commit


04 Jan, 2014

1 commit


15 May, 2013

1 commit

  • Implement labeled NFS on the server: encoding and decoding, and writing
    and reading, of file labels.

    Enabled with CONFIG_NFSD_V4_SECURITY_LABEL.

    Signed-off-by: Matthew N. Dodd
    Signed-off-by: Miguel Rodel Felipe
    Signed-off-by: Phua Eu Gene
    Signed-off-by: Khin Mi Mi Aung
    Signed-off-by: J. Bruce Fields

    David Quigley
     

08 Apr, 2013

1 commit

  • Closed stateid's are kept around a little while to handle close replays
    in the 4.0 case. So we stash the last-used stateid in the
    oo_last_closed_stateid field of the open owner. We can free that in
    encode_seqid_op_tail once the seqid on the open owner is next
    incremented. But we don't want to do that on the close itself; so we
    set the NFS4_OO_PURGE_CLOSE flag on the open owner, skip freeing it the
    first time through encode_seqid_op_tail, then when we see that flag set
    next time we free it.

    This is unnecessarily baroque.

    Instead, just move the logic that increments the seqid out of the xdr
    code and into the operation code itself.

    The justification given for the current placement is that we need to
    wait till the last minute to be sure we know whether the status is a
    sequence-id-mutating error or not, but examination of the code shows
    that can't actually happen.

    Reported-by: Yanchuan Nian
    Tested-by: Yanchuan Nian
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

03 Apr, 2013

2 commits


24 Jan, 2013

1 commit

  • It seems slightly simpler to make nfsd4_encode_fattr rather than its
    callers responsible for advancing the write pointer on success.

    (Also: the count == 0 check in the verify case looks superfluous.
    Running out of buffer space is really the only reason fattr encoding
    should fail with eresource.)

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

18 Dec, 2012

1 commit


26 Nov, 2012

2 commits

  • Our server rejects compounds containing more than one write operation.
    It's unclear whether this is really permitted by the spec; with 4.0,
    it's possibly OK, with 4.1 (which has clearer limits on compound
    parameters), it's probably not OK. No client that we're aware of has
    ever done this, but in theory it could be useful.

    The source of the limitation: we need an array of iovecs to pass to the
    write operation. In the worst case that array of iovecs could have
    hundreds of elements (the maximum rwsize divided by the page size), so
    it's too big to put on the stack, or in each compound op. So we instead
    keep a single such array in the compound argument.

    We fill in that array at the time we decode the xdr operation.

    But we decode every op in the compound before executing any of them. So
    once we've used that array we can't decode another write.

    If we instead delay filling in that array till the time we actually
    perform the write, we can reuse it.

    Another option might be to switch to decoding compound ops one at a
    time. I considered doing that, but it has a number of other side
    effects, and I'd rather fix just this one problem for now.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • In preparation for moving some of this elsewhere.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
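
    For the first commit of 26 Nov, 2012 (compounds containing more than one
    write), the deferred fill can be sketched as below. This is a simplified
    userspace illustration with hypothetical names and sizes: decoding only
    records where each payload lives, and the single shared iovec array is
    filled when a write is actually executed, so it can be reused by the next
    write in the same compound.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAX_VECS  8     /* stands in for max rwsize / page size */

/* One shared array per compound, as in the commit message. */
struct compound_args_sketch {
    struct iovec vecs[SKETCH_MAX_VECS];
};

/* Per-op state stays small: just a pointer and a length. */
struct write_op_sketch {
    char  *data;
    size_t len;
};

/* Decode time: remember the payload, do not touch the shared array. */
void write_decode(struct write_op_sketch *w, char *payload, size_t len)
{
    assert(w != NULL);
    w->data = payload;
    w->len = len;
}

/*
 * Execute time: split the payload into page-sized iovecs in the shared
 * array. Returns the number of iovecs used, or -1 if the payload is too
 * large for the array.
 */
int write_fill_vecs(struct compound_args_sketch *args,
                    const struct write_op_sketch *w)
{
    size_t left = w->len;
    char *p = w->data;
    int n = 0;

    while (left > 0) {
        size_t chunk = left < SKETCH_PAGE_SIZE ? left : SKETCH_PAGE_SIZE;

        if (n == SKETCH_MAX_VECS)
            return -1;
        args->vecs[n].iov_base = p;
        args->vecs[n].iov_len = chunk;
        p += chunk;
        left -= chunk;
        n++;
    }
    return n;
}
```

    Each write overwrites the array only when it runs, so a second write in
    the same compound no longer conflicts with the first.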
     

15 Nov, 2012

1 commit