05 Sep, 2013

1 commit

  • When an NFSv4 client loses contact with the server it can lose any
    locks that it holds.

    Currently when it reconnects to the server it simply tries to reclaim
    those locks. This might succeed even though some other client has
    held and released a lock in the mean time. So the first client might
    think the file is unchanged, but it isn't. This isn't good.

    If, when recovery happens, the locks cannot be claimed because some
    other client still holds the lock, then we get a message in the kernel
    logs, but the client can still write. So two clients can both think
    they have a lock and can both write at the same time. This is equally
    not good.

    There was a patch a while ago
    http://comments.gmane.org/gmane.linux.nfs/41917

    which tried to address some of this, but it didn't seem to go
    anywhere. That patch would also send a signal to the process. That
    might be useful but for now this patch just causes writes to fail.

    For NFSv4 (unlike v2/v3) there is a strong link between the lock and
    the write request so we can fairly easily fail any IO of the lock is
    gone. While some applications might not expect this, it is still
    safer than allowing the write to succeed.

    Because this is a fairly big change in behaviour a module parameter,
    "recover_locks", is introduced which defaults to true (the current
    behaviour) but can be set to "false" to tell the client not to try to
    recover things that were lost.

    Signed-off-by: NeilBrown
    Signed-off-by: Trond Myklebust

    NeilBrown
     

22 Aug, 2013

1 commit


10 Jul, 2013

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "Feature highlights include:
    - Add basic client support for NFSv4.2
    - Add basic client support for Labeled NFS (selinux for NFSv4.2)
    - Fix the use of credentials in NFSv4.1 stateful operations, and add
    support for NFSv4.1 state protection.

    Bugfix highlights:
    - Fix another NFSv4 open state recovery race
    - Fix an NFSv4.1 back channel session regression
    - Various rpc_pipefs races
    - Fix another issue with NFSv3 auth negotiation

    Please note that Labeled NFS does require some additional support from
    the security subsystem. The relevant changesets have all been
    reviewed and acked by James Morris."

    * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
    NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
    NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
    nfs: have NFSv3 try server-specified auth flavors in turn
    nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
    nfs: move server_authlist into nfs_try_mount_request
    nfs: refactor "need_mount" code out of nfs_try_mount
    SUNRPC: PipeFS MOUNT notification optimization for dying clients
    SUNRPC: split client creation routine into setup and registration
    SUNRPC: fix races on PipeFS UMOUNT notifications
    SUNRPC: fix races on PipeFS MOUNT notifications
    NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
    NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
    NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
    NFS: Improve legacy idmapping fallback
    NFSv4.1 end back channel session draining
    NFS: Apply v4.1 capabilities to v4.2
    NFSv4.1: Clean up layout segment comparison helper names
    NFSv4.1: layout segment comparison helpers should take 'const' parameters
    NFSv4: Move the DNS resolver into the NFSv4 module
    rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
    ...

    Linus Torvalds
     

09 Jun, 2013

1 commit

  • After looking at all of the nfsv4 operations the label structure has been added
    to the prototypes of the functions which can transmit label data.

    Signed-off-by: Matthew N. Dodd
    Signed-off-by: Miguel Rodel Felipe
    Signed-off-by: Phua Eu Gene
    Signed-off-by: Khin Mi Mi Aung
    Signed-off-by: Steve Dickson
    Signed-off-by: Trond Myklebust

    David Quigley
     

12 May, 2013

1 commit

  • NFS calls the freezable helpers with locks held, which is unsafe
    and will cause lockdep warnings when 6aa9707 "lockdep: check
    that no locks held at freeze time" is reapplied (it was reverted
    in dbf520a). NFS shouldn't be doing this, but it has
    long-running syscalls that must hold a lock but also shouldn't
    block suspend. Until NFS freeze handling is rewritten to use a
    signal to exit out of the critical section, add new *_unsafe
    versions of the helpers that will not run the lockdep test when
    6aa9707 is reapplied, and call them from NFS.

    In practice the likley result of holding the lock while freezing
    is that a second task blocked on the lock will never freeze,
    aborting suspend, but it is possible to manufacture a case using
    the cgroup freezer, the lock, and the suspend freezer to create
    a deadlock. Silencing the lockdep warning here will allow
    problems to be found in other drivers that may have a more
    serious deadlock risk, and prevent new problems from being added.

    Signed-off-by: Colin Cross
    Acked-by: Pavel Machek
    Acked-by: Tejun Heo
    Signed-off-by: Rafael J. Wysocki

    Colin Cross
     

23 Feb, 2013

1 commit


13 Dec, 2012

1 commit

  • Currently, when an RPCSEC_GSS context has expired or is non-existent
    and the users (Kerberos) credentials have also expired or are non-existent,
    the client receives the -EKEYEXPIRED error and tries to refresh the context
    forever. If an application is performing I/O, or other work against the share,
    the application hangs, and the user is not prompted to refresh/establish their
    credentials. This can result in a denial of service for other users.

    Users are expected to manage their Kerberos credential lifetimes to mitigate
    this issue.

    Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number
    of times to refresh the gss_context, and then return -EACCES to the application.

    Signed-off-by: Andy Adamson
    Signed-off-by: Trond Myklebust

    Andy Adamson
     

05 Sep, 2012

1 commit

  • When the NFS_COOKIEVERF helper macro was converted into a static
    inline function in commit 99fadcd764 (nfs: convert NFS_*(inode)
    helpers to static inline), we broke the initialisation of the
    readdir cookies, since that depended on doing a memset with an
    argument of 'sizeof(NFS_COOKIEVERF(inode))' which therefore
    changed from sizeof(be32 cookieverf[2]) to sizeof(be32 *).

    At this point, NFS_COOKIEVERF seems to be more of an obfuscation
    than a helper, so the best thing would be to just get rid of it.

    Also see: https://bugzilla.kernel.org/show_bug.cgi?id=46881

    Reported-by: Andi Kleen
    Reported-by: David Binderman
    Signed-off-by: Trond Myklebust
    Cc: stable@vger.kernel.org

    Trond Myklebust
     

21 Aug, 2012

1 commit


31 Jul, 2012

2 commits


18 Jul, 2012

1 commit


17 Jul, 2012

1 commit


14 Jul, 2012

1 commit

  • Don't pass nfs_open_context() to ->create(). Only the NFS4 implementation
    needed that and only because it wanted to return an open file using open
    intents. That task has been replaced by ->atomic_open so it is not necessary
    anymore to pass the context to the create rpc operation.

    Despite nfs4_proc_create apparently being okay with a NULL context it Oopses
    somewhere down the call chain. So allocate a context here.

    Signed-off-by: Miklos Szeredi
    CC: Trond Myklebust
    Signed-off-by: Al Viro

    Miklos Szeredi
     

29 Jun, 2012

6 commits


30 May, 2012

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "New features include:
    - Rewrite the O_DIRECT code so that it can share the same coalescing
    and pNFS functionality as the page cache code.
    - Allow the server to provide hints as to when we should use pNFS,
    and when it is more efficient to read and write through the
    metadata server.
    - NFS cache consistency updates:
    * Use the ctime to emulate a change attribute for NFSv2/v3 so that
    all NFS versions can share the same cache management code.
    * New cache management code will only look at the change attribute
    and size attribute when deciding whether or not our cached data
    is still valid or not.
    * Don't request NFSv4 post-op attributes on writes in cases such as
    O_DIRECT, where we don't care about data cache consistency, or
    when we have a write delegation, and know that our cache is still
    consistent.
    * Don't request NFSv4 post-op attributes on operations such as
    COMMIT, where there are no expected metadata updates.
    * Don't request NFSv4 directory post-op attributes in cases where
    the operations themselves already return change attribute
    updates: i.e. operations such as OPEN, CREATE, REMOVE, LINK and
    RENAME.
    - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
    if we detect no attempts to lookup filenames.
    - Improve the code sharing between NFSv2/v3 and v4 mounts
    - NFSv4.1 state management efficiency improvements
    - More patches in preparation for NFSv4/v4.1 migration functionality."

    Fix trivial conflict in fs/nfs/nfs4proc.c that was due to the dcache
    qstr name initialization changes (that made the length/hash a 64-bit
    union)

    * tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (146 commits)
    NFSv4: Add debugging printks to state manager
    NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO
    NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE
    NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error
    NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION
    NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager
    NFSv4.1: Handle errors in nfs4_bind_conn_to_session
    NFSv4.1: nfs4_bind_conn_to_session should drain the session
    NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid
    NFSv4.1: Add DESTROY_CLIENTID
    NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session
    NFSv4.1: Ensure we use the correct credentials for session create/destroy
    NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations
    NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease
    NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM
    NFSv4: Clean up the error handling for nfs4_reclaim_lease
    NFSv4.1: Exchange ID must use GFP_NOFS allocation mode
    nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN*
    nfs4.1: add BIND_CONN_TO_SESSION operation
    NFSv4.1 test the mdsthreshold hint parameters
    ...

    Linus Torvalds
     

11 May, 2012

1 commit

  • This allows comparing hash and len in one operation on 64-bit
    architectures. Right now only __d_lookup_rcu() takes advantage of this,
    since that is the case we care most about.

    The use of anonymous struct/unions hides the alternate 64-bit approach
    from most users, the exception being a few cases where we initialize a
    'struct qstr' with a static initializer. This makes the problematic
    cases use a new QSTR_INIT() helper function for that (but initializing
    just the name pointer with a "{ .name = xyzzy }" initializer remains
    valid, as does just copying another qstr structure).

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Apr, 2012

4 commits

  • Now that I'm doing secinfo automatically in the v4 code this extra
    argument isn't needed.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     
  • This simplifies the code for v2 and v3 and gives v4 a chance to decide
    on referrals without needing to modify the generic client.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     
  • In order to avoid duplicating all the data in nfs_read_data whenever we
    split it up into multiple RPC calls (either due to a short read result
    or due to rsize < PAGE_SIZE), we split out the bits that are the same
    per RPC call into a separate "header" structure.

    The goal this patch moves towards is to have a single header
    refcounted by several rpc_data structures. Thus, want to always refer
    from rpc_data to the header, and not the other way. This patch comes
    close to that ideal, but the directio code currently needs some
    special casing, isolated in the nfs_direct_[read_write]hdr_release()
    functions. This will be dealt with in a future patch.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Commits don't need the vectors of pages, etc. that writes do. Split out
    a separate structure for the commit operation.

    Signed-off-by: Fred Isaman
    Signed-off-by: Trond Myklebust

    Fred Isaman
     

21 Mar, 2012

4 commits


07 Dec, 2011

1 commit

  • Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
    layer. This should allow suspend and hibernate events to proceed, even
    when there are RPC's pending on the wire.

    Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
    and freezer_count calls. This allows the freezer to skip tasks that are
    sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

    Signed-off-by: Jeff Layton
    Signed-off-by: Rafael J. Wysocki

    Jeff Layton
     

05 Nov, 2011

1 commit

  • commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
    from an OPEN call. Prior to that patch, that caused the client to fall
    back to doing a normal lookup. When that patch went in, the code began
    returning that error to userspace. The d_revalidate codepath however
    never had the corresponding change, so it was still possible to end up
    with a NULL ctx->state pointer after that.

    That patch caused a regression. When we attempt to open a directory that
    does not have a cached dentry, that open now errors out with EISDIR. If
    you attempt the same open with a cached dentry, it will succeed.

    Fix this by reverting the change in nfs_atomic_lookup and allowing
    attempts to open directories to fall back to a normal lookup

    Also, add a NFSv4-specific f_ops->open routine that just returns
    -ENOTDIR. This should never be called if things are working properly,
    but if it ever is, then the dprintk may help in debugging.

    To facilitate this, a new file_operations field is also added to the
    nfs_rpc_ops struct.

    Cc: stable@kernel.org
    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     

01 Aug, 2011

1 commit


25 Mar, 2011

1 commit


12 Mar, 2011

1 commit


24 Oct, 2010

1 commit

  • We can use vmapped pages to read more information from the network at once.
    This will reduce the number of calls needed to complete a readdir.

    Signed-off-by: Bryan Schumaker
    [trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c]
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     

22 Sep, 2010

1 commit


18 Sep, 2010

3 commits

  • A synchronous rename can be interrupted by a SIGKILL. If that happens
    during a sillyrename operation, it's possible for the rename call to
    be sent to the server, but the task exits before processing the
    reply. If this happens, the sillyrenamed file won't get cleaned up
    during nfs_dentry_iput and the server is left with a dangling .nfs* file
    hanging around.

    Fix this problem by turning sillyrename into an asynchronous operation
    and have the task doing the sillyrename just wait on the reply. If the
    task is killed before the sillyrename completes, it'll still proceed
    to completion.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Right now, v3 and v4 have their own variants. Create a standard struct
    that will work for v3 and v4. v2 doesn't get anything but a simple error
    and so isn't affected by this.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Each NFS version has its own version of the rename args container.
    Standardize them on a common one that's identical to the one NFSv4
    uses.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Jeff Layton