24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
    markings where they are unnecessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
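
As a rough illustration of the conversion described above, the following userspace sketch defines a stand-in fallthrough macro and uses it in a switch. The macro definition here is an approximation written for this example, and cleanup_steps() is a hypothetical function, not something taken from the patch.

```c
#include <assert.h>

/* Illustrative stand-in for the kernel's pseudo-keyword. The real macro is
 * provided by the kernel's compiler-attribute headers. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallback when the attribute is missing */
#endif

/* Hypothetical example: each stage also needs the cleanup of earlier stages. */
static int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		fallthrough;	/* was: a fall-through comment */
	case 1:
		steps++;
		fallthrough;
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the attribute form is visible to the compiler, so an unmarked fall-through can be reported by -Wimplicit-fallthrough.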
     

29 Jul, 2020

1 commit

  • A sequence counter write side critical section must be protected by some
    form of locking to serialize writers. A plain seqcount_t does not
    contain the information of which lock must be held when entering a write
    side critical section.

    Use the new seqcount_spinlock_t data type, which allows a spinlock to
    be associated with the sequence counter. This enables lockdep to verify
    that the spinlock used for writer serialization is held when the write
    side critical section is entered.

    If lockdep is disabled this lock association is compiled out and has
    neither storage size nor runtime overhead.

    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200720155530.1173732-22-a.darwish@linutronix.de

    Ahmed S. Darwish
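
The idea of tying a sequence counter to the lock that serializes its writers can be sketched in userspace as follows. This is a toy analogue under stated assumptions, not the kernel API: every toy_* name is hypothetical, the "lock" is only a flag, and an assert() stands in for the lockdep check. Unlike the kernel type, this sketch always stores the lock pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock; only records whether it is held. */
struct toy_lock {
	bool held;
};

/* Sequence counter that remembers which lock its writers must hold. */
struct toy_seqcount {
	atomic_uint sequence;
	struct toy_lock *lock;
};

static void toy_write_begin(struct toy_seqcount *s)
{
	assert(s->lock->held);			/* lockdep-style check */
	atomic_fetch_add(&s->sequence, 1);	/* odd: write in progress */
}

static void toy_write_end(struct toy_seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);	/* even: write finished */
}

static unsigned int toy_read_begin(struct toy_seqcount *s)
{
	return atomic_load(&s->sequence);
}

/* True when the reader overlapped a write and must try again. */
static bool toy_read_retry(struct toy_seqcount *s, unsigned int start)
{
	return (start & 1) || atomic_load(&s->sequence) != start;
}
```

A reader samples the counter with toy_read_begin(), copies the protected data, and repeats the whole read if toy_read_retry() returns true.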
     

12 May, 2020

1 commit

  • We add the new state to the nfsi->open_states list, making it
    potentially visible to other threads, before we've finished initializing
    it.

    That wasn't a problem when all the readers were also taking the i_lock
    (as we do here), but since we switched to RCU, there's now a possibility
    that a reader could see the partially initialized state.

    Symptoms observed were a crash when another thread called
    nfs4_get_valid_delegation() on a NULL inode, resulting in an oops like:

    BUG: unable to handle page fault for address: ffffffffffffffb0 ...
    RIP: 0010:nfs4_get_valid_delegation+0x6/0x30 [nfsv4] ...
    Call Trace:
    nfs4_open_prepare+0x80/0x1c0 [nfsv4]
    __rpc_execute+0x75/0x390 [sunrpc]
    ? finish_task_switch+0x75/0x260
    rpc_async_schedule+0x29/0x40 [sunrpc]
    process_one_work+0x1ad/0x370
    worker_thread+0x30/0x390
    ? create_worker+0x1a0/0x1a0
    kthread+0x10c/0x130
    ? kthread_park+0x80/0x80
    ret_from_fork+0x22/0x30

    Fixes: 9ae075fdd190 "NFSv4: Convert open state lookup to use RCU"
    Reviewed-by: Seiichi Ikarashi
    Tested-by: Daisuke Matsuda
    Tested-by: Masayoshi Mizuma
    Signed-off-by: J. Bruce Fields
    Cc: stable@vger.kernel.org # v4.20+
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
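
The ordering problem described above (an object becoming reachable by lock-free readers before it is fully initialized) can be illustrated with C11 atomics. This is a minimal userspace sketch, not the NFS code: the toy_* names are hypothetical, a single atomic pointer stands in for the open_states list, and release/acquire ordering stands in for the RCU publish and dereference primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inode and the open state. */
struct toy_inode {
	int ino;
};

struct toy_state {
	struct toy_inode *inode;
	int flags;
};

/* Single-slot "list" that lock-free readers may observe at any time. */
static _Atomic(struct toy_state *) published_state;

static void toy_publish_state(struct toy_state *state, struct toy_inode *inode)
{
	/* Finish every initializing store first ... */
	state->inode = inode;
	state->flags = 0;
	/* ... and only then make the object reachable. The release store
	 * pairs with the acquire load in toy_lookup_state(). */
	atomic_store_explicit(&published_state, state, memory_order_release);
}

static struct toy_state *toy_lookup_state(void)
{
	return atomic_load_explicit(&published_state, memory_order_acquire);
}
```

If the store to published_state came before the assignment to state->inode, a concurrent reader could see a non-NULL state whose inode is still NULL, which matches the symptom in the oops above.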
     

16 Mar, 2020

1 commit


05 Feb, 2020

1 commit

  • Currently, each time nfs4_do_fsinfo() is called it will do an implicit
    NFS4 lease renewal, which is not compliant with the NFS4 specification.
    This can result in a lease being expired by an NFS server.

    Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
    introduced implicit client lease renewal in nfs4_do_fsinfo(),
    which can cause the NFSv4.0 lease to expire on the server side,
    with servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

    This can easily be reproduced by frequently unmounting a sub-mount,
    then stat'ing it to get it mounted again, which will delay or even
    completely prevent the client from sending RENEW operations if no
    other NFS operations are issued. Eventually the NFS server will expire
    the client's lease and return an error on file access or on the next
    RENEW.

    This can also happen when a sub-mount is automatically unmounted
    due to inactivity (after nfs_mountpoint_expiry_timeout) and is then
    mounted again via stat(). This can result in a short window during
    which the client's lease has expired on the server but not on the
    client. This specific case was observed on production systems.

    This patch removes the implicit lease renewal from nfs4_do_fsinfo().

    Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing leases")
    Signed-off-by: Robert Milkowski
    Signed-off-by: Anna Schumaker

    Robert Milkowski
     

04 Feb, 2020

1 commit


15 Jan, 2020

1 commit

  • Fixes coccicheck warnings:

    fs/nfs/nfs4state.c:1138:2-3: Unneeded semicolon
    fs/nfs/nfs4proc.c:6862:2-3: Unneeded semicolon
    fs/nfs/nfs4proc.c:8629:2-3: Unneeded semicolon

    Reported-by: Hulk Robot
    Signed-off-by: zhengbin
    Signed-off-by: Anna Schumaker

    zhengbin
     

18 Nov, 2019

2 commits

  • One of the most frustrating messages our sustaining team sees is
    the "Lock reclaim failed!" message. Add some observability in the
    client's lock reclaim logic so we can capture better data the
    first time a problem occurs.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Add a trace point in the main state manager loop to observe state
    recovery operations. This helps track down state recovery bugs.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

06 Nov, 2019

1 commit


04 Nov, 2019

1 commit


10 Oct, 2019

2 commits

  • When the source server reboots after a server-to-server copy was
    issued, we need to retry the copy from COPY_NOTIFY. We need to
    detect that the source server rebooted and there is a copy waiting
    on a destination server and wake it up.

    Signed-off-by: Olga Kornievskaia

    Olga Kornievskaia
     
  • Mark the open created for the source file on the destination
    server. Then, if this open is going through a recovery, fail
    the recovery, as we don't need to be recovering a "fake" open.
    We need to fail the ongoing READs and vfs_copy_file_range().

    Signed-off-by: Olga Kornievskaia

    Olga Kornievskaia
     

21 Sep, 2019

1 commit


22 Aug, 2019

1 commit

  • In nfs4_try_migration(), if nfs4_begin_drain_session() fails, the
    previously allocated 'page' and 'locations' are not deallocated, leading to
    memory leaks. To fix this issue, go to the 'out' label to free 'page' and
    'locations' before returning the error.

    Signed-off-by: Wenwen Wang
    Signed-off-by: Anna Schumaker

    Wenwen Wang
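
The fix described above follows the common single-exit cleanup pattern. The sketch below shows that pattern in userspace C. It is not the nfs4_try_migration() code: all toy_* names, buffer sizes, and the allocation counter are hypothetical and exist only to make the leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int toy_live_allocs;	/* outstanding allocations, for the demo only */

static void *toy_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		toy_live_allocs++;
	return p;
}

static void toy_free(void *p)
{
	if (p) {
		toy_live_allocs--;
		free(p);
	}
}

/* Hypothetical stand-in for a step that can fail after the buffers exist. */
static int toy_begin_drain(int should_fail)
{
	return should_fail ? -EIO : 0;
}

static int toy_try_migration(int drain_fails)
{
	void *page, *locations;
	int result = -ENOMEM;

	page = toy_alloc(4096);
	locations = toy_alloc(256);
	if (!page || !locations)
		goto out;

	result = toy_begin_drain(drain_fails);
	if (result < 0)
		goto out;	/* a bare "return result" here would leak both buffers */

	result = 0;
out:
	toy_free(page);
	toy_free(locations);
	return result;
}
```

Routing every exit through the out label keeps the frees in one place, so a new early-error path cannot skip them.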
     

08 Aug, 2019

1 commit


05 Aug, 2019

3 commits

  • John Hubbard reports seeing the following stack trace:

    nfs4_do_reclaim
    rcu_read_lock /* we are now in_atomic() and must not sleep */
    nfs4_purge_state_owners
    nfs4_free_state_owner
    nfs4_destroy_seqid_counter
    rpc_destroy_wait_queue
    cancel_delayed_work_sync
    __cancel_work_timer
    __flush_work
    start_flush_work
    might_sleep:
    (kernel/workqueue.c:2975: BUG)

    The solution is to separate out the freeing of the state owners
    from nfs4_purge_state_owners(), and perform that outside the atomic
    context.

    Reported-by: John Hubbard
    Fixes: 0aaaf5c424c7f ("NFS: Cache state owners after files are closed")
    Signed-off-by: Trond Myklebust

    Trond Myklebust
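
The split described above (unlink in atomic context, free later) can be sketched as a two-step pattern. This is a userspace illustration only: the toy_* names are hypothetical, a boolean flag stands in for the RCU read-side section, and an assert() stands in for the might_sleep() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_owner {
	struct toy_owner *next;
};

static bool toy_in_atomic;	/* stands in for an RCU read-side section */
static int toy_freed;		/* number of owners freed, for the demo only */

/* Freeing may sleep in the real code, so it must not run in atomic context. */
static void toy_free_owner(struct toy_owner *o)
{
	assert(!toy_in_atomic);
	free(o);
	toy_freed++;
}

/* Step 1: safe in atomic context. Only unlinks and returns a private list. */
static struct toy_owner *toy_purge_owners(struct toy_owner **cache)
{
	struct toy_owner *doomed = *cache;

	*cache = NULL;
	return doomed;
}

/* Step 2: called later, outside the atomic section, to do the freeing. */
static void toy_free_owners(struct toy_owner *doomed)
{
	while (doomed) {
		struct toy_owner *next = doomed->next;

		toy_free_owner(doomed);
		doomed = next;
	}
}
```

The caller runs toy_purge_owners() while atomic, leaves the atomic section, and only then passes the returned list to toy_free_owners().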
     
  • If the server returns with EAGAIN when we're trying to recover from
    a server reboot, we currently delay for 1 second, but then mark the
    stateid as needing recovery after the grace period has expired.

    Instead, we should just retry the same recovery process immediately
    after the 1 second delay. Break out of the loop after 10 retries.

    Fixes: 35a61606a612 ("NFS: Reduce indentation of the switch statement...")
    Signed-off-by: Trond Myklebust

    Trond Myklebust
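
A bounded retry of this kind might look like the sketch below. It is an illustration under assumptions, not the patch itself: the toy_* names are hypothetical, the one-second delay is reduced to a comment, and "10 retries" is read here as ten retries after the first attempt.

```c
#include <assert.h>
#include <errno.h>

#define TOY_MAX_RETRIES 10

/* Hypothetical recovery callback; returns 0 or a negative errno. */
typedef int (*toy_recover_fn)(void *ctx);

static int toy_recover_with_retry(toy_recover_fn recover, void *ctx)
{
	int retries = TOY_MAX_RETRIES;
	int status;

	for (;;) {
		status = recover(ctx);
		if (status != -EAGAIN)
			return status;	/* success or a different error */
		if (retries-- == 0)
			return status;	/* give up, still -EAGAIN */
		/* A real client would sleep for about one second here. */
	}
}

/* Demo callback: returns -EAGAIN until the counter in ctx reaches zero. */
static int toy_flaky_recover(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

The same recovery step is retried directly rather than deferring the stateid to a later recovery pass, and the retry budget keeps a persistently busy server from stalling the loop forever.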
     
  • When error recovery fails due to a fatal error on the server, ensure
    we log it in the syslog.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

19 Jul, 2019

1 commit


13 Jul, 2019

2 commits

  • RFC 7530 requires us to refetch the lease time attribute once a new
    clientID is established. This is already implemented for the
    nfs4.1(+) clients by nfs41_init_clientid, which calls
    nfs41_finish_session_reset, which calls nfs4_setup_state_renewal.

    To make nfs4_setup_state_renewal available for nfs4.0, move it
    further to the top of the source file to include it regardless of
    CONFIG_NFS_V4_1 and to save a forward declaration.

    Call nfs4_setup_state_renewal from nfs4_init_clientid.

    Signed-off-by: Donald Buczek
    Signed-off-by: Trond Myklebust

    Donald Buczek
     
  • The function nfs41_setup_state_renewal is useful to the nfs 4.0 client
    as well, so rename the function to nfs4_setup_state_renewal.

    Signed-off-by: Donald Buczek
    Signed-off-by: Trond Myklebust

    Donald Buczek
     

10 May, 2019

2 commits

  • Only delegations and layouts can be recalled, so it shouldn't be
    necessary to recover all opens when handling the status bit
    SEQ4_STATUS_RECALLABLE_STATE_REVOKED. We'll still wind up calling
    nfs41_open_expired() when a TEST_STATEID returns NFS4ERR_DELEG_REVOKED.

    Signed-off-by: Scott Mayhew
    Reviewed-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Scott Mayhew
     
  • A stat command on a soft mount never returns after the server is
    stopped.

    When a new client is allocated, its state is set to
    NFS4CLNT_LEASE_EXPIRED.

    When the server is stopped, the state manager runs and recovers
    according to that state. Because the state is NFS4CLNT_LEASE_EXPIRED,
    it drains the slot table and sends other tasks to the wait queue
    until the client has recovered. The stat command then hangs.

    When server trunking is discovered, the client renews the lease
    but checks the client state, which leads to corruption of the
    client state.

    So we need to call the state manager to recover it when server
    IP trunking is detected.

    Signed-off-by: ZhangXiaoxu
    Cc: stable@vger.kernel.org
    Signed-off-by: Anna Schumaker

    ZhangXiaoxu
     

21 Feb, 2019

1 commit


03 Jan, 2019

1 commit

  • The original commit (e4648aa4f98a "NFS recover from destination server
    reboot for copies") used memcmp(). It was later changed to use
    nfs4_stateid_match_other(), but that function returns the opposite of
    memcmp(). As a result, recovery can't find the copy, which leads to
    the copy hanging.

    Fixes: 80f42368868e ("NFSv4: Split out NFS v4.2 copy completion functions")
    Fixes: cb7a8384dc02 ("NFS: Split out the body of nfs4_reclaim_open_state")
    Signed-off-by: Olga Kornievskaia
    Signed-off-by: Anna Schumaker

    Olga Kornievskaia
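
The inverted-result bug described above is easy to reproduce in miniature. In the sketch below, a boolean match helper replaces a raw memcmp() call, so the condition in the search loop has to be negated. All toy_* names and the 12-byte field size are assumptions made for this example, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_OTHER_SIZE 12

struct toy_stateid {
	unsigned char other[TOY_OTHER_SIZE];
};

/* Boolean-style helper: true when the two "other" fields are identical.
 * Note that memcmp() itself returns 0 in that case. */
static bool toy_stateid_match_other(const struct toy_stateid *a,
				    const struct toy_stateid *b)
{
	return memcmp(a->other, b->other, TOY_OTHER_SIZE) == 0;
}

/* Returns the index of the copy whose stateid matches, or -1 if none does. */
static int toy_find_copy(const struct toy_stateid *copies, int n,
			 const struct toy_stateid *want)
{
	for (int i = 0; i < n; i++) {
		/* memcmp-style code would "continue" on a nonzero result.
		 * With a boolean helper the test must be negated instead. */
		if (!toy_stateid_match_other(&copies[i], want))
			continue;
		return i;
	}
	return -1;
}
```

Dropping the negation makes the loop skip exactly the entry it is looking for, which corresponds to recovery never finding the pending copy.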
     

20 Dec, 2018

4 commits

  • SUNRPC has two sorts of credentials, both of which appear as
    "struct rpc_cred".
    There are "generic credentials" which are supplied by clients
    such as NFS and passed in 'struct rpc_message' to indicate
    which user should be used to authorize the request, and there
    are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
    which describe the credential to be sent over the wire.

    This patch replaces all the generic credentials by 'struct cred'
    pointers - the credential structure used throughout Linux.

    For machine credentials, there is a special 'struct cred *' pointer
    which is statically allocated and recognized where needed as
    having a special meaning. A look-up of a low-level cred will
    map this to a machine credential.

    Signed-off-by: NeilBrown
    Acked-by: J. Bruce Fields
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • When NFS creates a machine credential, it is a "generic" credential,
    not tied to any auth protocol, and is really just a container for
    the principal name.
    This doesn't get linked to a genuine credential until rpcauth_bindcred()
    is called.
    The lookup always succeeds, so the various places that test whether the
    machine credential is NULL are pointless.

    As a step towards getting rid of generic credentials, this patch gets
    rid of generic machine credentials. The nfs_client and rpc_client
    just hold a pointer to a constant principal name.
    When a machine credential is wanted, a special static 'struct rpc_cred'
    pointer is used. rpcauth_bindcred() recognizes this, finds the
    principal from the client, and binds the correct credential.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • This lock is no longer necessary.

    If nfs4_get_renew_cred() needs to hunt through the open-state
    creds for a user cred, it still takes the lock to stabilize
    the rbtree, but otherwise there are no races.

    Note that this completely removes the lock from nfs4_renew_state().
    It appears that the original need for the locking here was removed
    long ago, and there is no longer anything to protect.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     
  • NFSv4 state management tries a root credential when no machine
    credential is available, as can happen with kerberos.
    It does this by replacing the cl_machine_cred with a root credential.
    This means that any user of the machine credential needs to take
    a lock while getting a reference to the machine credential, which is
    a little cumbersome.

    So introduce an explicit cl_root_cred, and never free either
    credential until client shutdown. This means that no locking
    is needed to reference these credentials. Future patches
    will make use of this.

    This is only a temporary addition. Both cl_machine_cred and
    cl_root_cred will disappear later in the series.

    Signed-off-by: NeilBrown
    Signed-off-by: Anna Schumaker

    NeilBrown
     

20 Nov, 2018

1 commit

  • Fix a deadlock whereby the NFSv4 state manager can get stuck in the
    delegation return code, waiting for a layout return to complete in
    another thread. If the server reboots before that other thread
    completes, then we need to be able to start a second state
    manager thread in order to perform recovery.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

13 Nov, 2018

2 commits


01 Oct, 2018

7 commits