02 Oct, 2020

1 commit

  • Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
    CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE races
    with the update of the nfs_state:

    Process 1           Process 2           Server
    =========           =========           ========
    OPEN file
                        OPEN file
                                            Reply OPEN (1)
                                            Reply OPEN (2)
    Update state (1)
    CLOSE file (1)
                                            Reply OLD_STATEID (1)
    CLOSE file (2)
                                            Reply CLOSE (-1)
                        Update state (2)
                        wait for state change
    OPEN file
                        wake
    CLOSE file
    OPEN file
                        wake
    CLOSE file
    ...
                        ...

    We can avoid this situation by not issuing an immediate retry with a bumped
    seqid when CLOSE/OPEN_DOWNGRADE receives NFS4ERR_OLD_STATEID. Instead,
    take the same approach used by OPEN and wait at least 5 seconds for
    outstanding stateid updates to complete if we can detect that we're out of
    sequence.

    Note that after this change it is still possible (though unlikely) that
    CLOSE waits a full 5 seconds, bumps the seqid, and retries -- and that
    attempt races with another OPEN at the same time. In order to avoid this
    race (which would result in the livelock), update
    nfs_need_update_open_stateid() to handle the case where:
    - the state is NFS_OPEN_STATE, and
    - the stateid doesn't match the current open stateid

    Finally, nfs_need_update_open_stateid() is modified to be idempotent and
    renamed to better suit the purpose of signaling that the stateid passed
    is the next stateid in sequence.

    Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
    Cc: stable@vger.kernel.org # v5.4+
    Signed-off-by: Benjamin Coddington
    Signed-off-by: Anna Schumaker

    Benjamin Coddington
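
    The "next stateid in sequence" test described in this commit can be
    illustrated with a small userspace sketch. It is not the kernel code:
    struct stateid, next_seqid() and stateid_is_next() are hypothetical names,
    and the sketch assumes that seqid zero is reserved and is skipped when
    the 32-bit counter wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STATEID_OTHER_SIZE 12 /* size of the opaque "other" field */

/* Simplified stand-in for an NFSv4 stateid (hypothetical type). */
struct stateid {
    uint32_t seqid;
    unsigned char other[STATEID_OTHER_SIZE];
};

/* Seqid that follows 'seqid'; zero is assumed reserved, so skip it on wrap. */
uint32_t next_seqid(uint32_t seqid)
{
    uint32_t next = seqid + 1;

    return next == 0 ? 1 : next;
}

/*
 * True when 'incoming' names the same state as 'current' and carries
 * exactly the next seqid. The function only reads its arguments, so
 * calling it repeatedly always gives the same answer.
 */
bool stateid_is_next(const struct stateid *current,
                     const struct stateid *incoming)
{
    assert(current != NULL && incoming != NULL);
    if (memcmp(current->other, incoming->other, STATEID_OTHER_SIZE) != 0)
        return false;
    return incoming->seqid == next_seqid(current->seqid);
}
```

    A caller that sees a stateid that is not next in sequence would wait for
    the outstanding update instead of applying it immediately.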
     

12 Aug, 2020

1 commit


05 Aug, 2020

2 commits


16 Mar, 2020

1 commit


15 Jan, 2020

5 commits


18 Nov, 2019

2 commits

  • One of the most frustrating messages our sustaining team sees is
    the "Lock reclaim failed!" message. Add some observability in the
    client's lock reclaim logic so we can capture better data the
    first time a problem occurs.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Add a trace point in the main state manager loop to observe state
    recovery operation. Help track down state recovery bugs.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

19 Jul, 2019

1 commit


09 Jul, 2019

3 commits

  • When triggering an nfs_xdr_status trace point, record the task ID
    and XID of the failing RPC to better pinpoint the problem.

    This feels like a bit of a layering violation.

    Suggested-by: Trond Myklebust
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • I noticed that NFS status values stopped working again.

    trace_print_symbols_seq() takes an unsigned long. Passing a negative
    errno or negative NFSERR value just confuses it, and since we're
    using C macros here and not static inline functions, all bets are
    off due to implicit type conversion.

    Straight-line the calling conventions so that error codes are stored
    in the trace record as positive values in an unsigned long field,
    mapped to symbolic as an unsigned long, and displayed as a negative
    value, to continue to enable grepping on "error=-".

    It's often the case that an error value that is positive is a byte
    count but when it's negative, it's an error (e.g. nfs4_write). Fix
    those cases so that the value that is eventually stored in the
    error field is a positive NFS status or errno, or zero.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
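
    The convention described above (store the error as a positive unsigned
    long, look up the symbol with that value, display it negated) can be
    sketched in userspace C. The table and the helper names lookup_symbol(),
    error_field() and format_error() are made up for illustration; only the
    idea comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table, keyed by positive status values. */
struct sym {
    unsigned long value;
    const char *name;
};

static const struct sym status_names[] = {
    { 0,     "OK" },
    { 5,     "EIO" },
    { 10024, "NFS4ERR_OLD_STATEID" },
};

/* Return the symbolic name for 'value', or NULL if there is no match. */
const char *lookup_symbol(unsigned long value)
{
    size_t i;

    for (i = 0; i < sizeof(status_names) / sizeof(status_names[0]); i++)
        if (status_names[i].value == value)
            return status_names[i].name;
    return NULL;
}

/*
 * Normalize a call result for the error field: a negative result is an
 * error and is stored as its magnitude; zero or a positive byte count
 * is stored as zero.
 */
unsigned long error_field(long result)
{
    return result < 0 ? 0UL - (unsigned long)result : 0UL;
}

/* Print the stored value negated so that grepping for "error=-" still works. */
int format_error(char *buf, size_t len, long result)
{
    unsigned long err = error_field(result);
    const char *name = lookup_symbol(err);

    assert(buf != NULL);
    return snprintf(buf, len, "error=%ld (%s)", -(long)err,
                    name ? name : "UNKNOWN");
}
```

    Passing a negative value straight to the lookup, for example
    lookup_symbol((unsigned long)-5L), finds nothing because the conversion
    yields a very large positive number.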
     
  • Help debug NFSv4 callback failures.

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

14 Feb, 2019

1 commit


03 Jan, 2019

1 commit

  • These symbolic values were not being displayed in string form.
    TRACE_DEFINE_ENUM was missing in many cases. It also turns out that
    __print_symbolic wants an unsigned long in the first field...

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

15 Sep, 2018

2 commits


18 Nov, 2017

3 commits

  • Pull NFS client updates from Anna Schumaker:
    "Stable bugfixes:
    - Revalidate "." and ".." correctly on open
    - Avoid RCU usage in tracepoints
    - Fix ugly referral attributes
    - Fix a typo in nomigration mount option
    - Revert "NFS: Move the flock open mode check into nfs_flock()"

    Features:
    - Implement a stronger send queue accounting system for NFS over RDMA
    - Switch some atomics to the new refcount_t type

    Other bugfixes and cleanups:
    - Clean up access mode bits
    - Remove special-case revalidations in nfs_opendir()
    - Improve invalidating NFS over RDMA memory for async operations that
    time out
    - Handle NFS over RDMA replies with a workqueue
    - Handle NFS over RDMA sends with a workqueue
    - Fix up replaying interrupted requests
    - Remove dead NFS over RDMA definitions
    - Update NFS over RDMA copyright information
    - Be more consistent with bool initialization and comparisons
    - Mark expected switch fall throughs
    - Various sunrpc tracepoint cleanups
    - Fix various OPEN races
    - Fix a typo in nfs_rename()
    - Use common error handling code in nfs_lock_and_join_request()
    - Check that some structures are properly cleaned up during
    net_exit()
    - Remove net pointer from dprintk()s"

    * tag 'nfs-for-4.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (62 commits)
    NFS: Revert "NFS: Move the flock open mode check into nfs_flock()"
    NFS: Fix typo in nomigration mount option
    nfs: Fix ugly referral attributes
    NFS: super: mark expected switch fall-throughs
    sunrpc: remove net pointer from messages
    nfs: remove net pointer from messages
    sunrpc: exit_net cleanup check added
    nfs client: exit_net cleanup check added
    nfs/write: Use common error handling code in nfs_lock_and_join_requests()
    NFSv4: Replace closed stateids with the "invalid special stateid"
    NFSv4: nfs_set_open_stateid must not trigger state recovery for closed state
    NFSv4: Check the open stateid when searching for expired state
    NFSv4: Clean up nfs4_delegreturn_done
    NFSv4: cleanup nfs4_close_done
    NFSv4: Retry NFS4ERR_OLD_STATEID errors in layoutreturn
    pNFS: Retry NFS4ERR_OLD_STATEID errors in layoutreturn-on-close
    NFSv4: Don't try to CLOSE if the stateid 'other' field has changed
    NFSv4: Retry CLOSE and DELEGRETURN on NFS4ERR_OLD_STATEID.
    NFS: Fix a typo in nfs_rename()
    NFSv4: Fix open create exclusive when the server reboots
    ...

    Linus Torvalds
     
  • Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     
  • There isn't an obvious way to acquire and release the RCU lock during a
    tracepoint, so we can't use the rpc_peeraddr2str() function here.
    Instead, rely on the client's cl_hostname, which should have similar
    enough information without needing an rcu_dereference().

    Reported-by: Dave Jones
    Cc: stable@vger.kernel.org # v3.12
    Signed-off-by: Anna Schumaker

    Anna Schumaker
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

14 Jul, 2017

1 commit


31 Jan, 2017

1 commit


16 Jul, 2016

1 commit

  • Use __get_str(str) rather than __get_dynamic_array(str) when
    dealing with strings.

    It is just a code cleanup, no changes on tracepoint ABI.

    Link: http://lkml.kernel.org/r/ea260df91817411cca2a1f3db2abd88860094788.1467407618.git.bristot@redhat.com

    Cc: Trond Myklebust
    Cc: Anna Schumaker
    Cc: Ingo Molnar
    Cc: linux-nfs@vger.kernel.org
    Suggested-by: Steven Rostedt
    Reviewed-by: Steven Rostedt
    Signed-off-by: Daniel Bristot de Oliveira
    Signed-off-by: Steven Rostedt

    Daniel Bristot de Oliveira
     

18 May, 2016

1 commit

  • There are several problems in the way a stateid is selected for a
    LAYOUTGET operation:

    We pick a stateid to use in the RPC prepare op, but that makes
    it difficult to serialize LAYOUTGETs that use the open stateid. That
    serialization is done in pnfs_update_layout, which occurs well before
    the rpc_prepare operation.

    Between those two events, the i_lock is dropped and reacquired.
    pnfs_update_layout can find that the list has lsegs in it and not do any
    serialization, but then later pnfs_choose_layoutget_stateid ends up
    choosing the open stateid.

    This patch changes the client to select the stateid to use in the
    LAYOUTGET earlier, when we're searching for a usable layout segment.
    This way we can do it all while holding the i_lock the first time, and
    ensure that we serialize any LAYOUTGET call that uses a non-layout
    stateid.

    This also means a rework of how LAYOUTGET replies are handled, as we
    must now get the latest stateid if we want to retransmit in response
    to a retryable error.

    Most of those errors boil down to the fact that the layout state has
    changed in some fashion. Thus, what we really want to do is to re-search
    for a layout when it fails with a retryable error, so that we can avoid
    reissuing the RPC at all if possible.

    While the LAYOUTGET RPC is async, the initiating thread always waits for
    it to complete, so it's effectively synchronous anyway. Currently, when
    we need to retry a LAYOUTGET because of an error, we drive that retry
    via the rpc state machine.

    This means that once the call has been submitted, it runs until it
    completes. So, we must move the error handling for this RPC out of the
    rpc_call_done operation and into the caller.

    In order to handle errors like NFS4ERR_DELAY properly, we must also
    pass a pointer to the sliding timeout, which is now moved to the stack
    in pnfs_update_layout.

    The complicating errors are -NFS4ERR_RECALLCONFLICT and
    -NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
    up and return NULL back to the caller. So, there is some special
    handling for those errors to ensure that the layers driving the retries
    can handle that appropriately.

    Signed-off-by: Jeff Layton
    Signed-off-by: Anna Schumaker

    Jeff Layton
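
    The "sliding timeout" passed by pointer can be pictured as a caller-owned
    backoff value that doubles on every retry. The following is a minimal
    sketch, not the kernel implementation: next_delay() is a hypothetical
    helper, and the millisecond bounds are assumed values chosen for the
    example.

```c
#include <assert.h>
#include <stddef.h>

#define RETRY_MIN_MS 100L   /* assumed floor for the first retry */
#define RETRY_MAX_MS 15000L /* assumed ceiling for any retry */

/*
 * Return the delay to sleep before the next retry and double the
 * caller-owned timeout for the retry after that. Keeping the state in
 * the caller (for example on its stack) lets one retry loop span
 * several RPC submissions.
 */
long next_delay(long *timeout)
{
    long delay;

    assert(timeout != NULL);
    if (*timeout <= 0)
        *timeout = RETRY_MIN_MS;
    if (*timeout > RETRY_MAX_MS)
        *timeout = RETRY_MAX_MS;
    delay = *timeout;
    *timeout <<= 1;
    return delay;
}
```

    A caller would start with "long timeout = 0;" and call
    next_delay(&timeout) each time the server answers with a retryable
    error such as NFS4ERR_DELAY.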
     

08 Jan, 2016

1 commit

  • * bugfixes:
    SUNRPC: Fixup socket wait for memory
    SUNRPC: Fix a missing break in rpc_anyaddr()
    pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()
    NFS: Fix attribute cache revalidation
    NFS: Ensure we revalidate attributes before using execute_ok()
    NFS: Flush reclaim writes using FLUSH_COND_STABLE
    NFS: Background flush should not be low priority
    NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturn
    NFSv4: Don't perform cached access checks before we've OPENed the file
    NFS: Allow the combination pNFS and labeled NFS
    NFS42: handle layoutstats stateid error
    nfs: Fix race in __update_open_stateid()
    nfs: fix missing assignment in nfs4_sequence_done tracepoint

    Trond Myklebust
     

29 Dec, 2015

1 commit


28 Dec, 2015

5 commits

  • Instead of displaying a layout segment pointer in these tracepoints,
    let's use the layout stateid, now that Olga gave us a set of tools for
    displaying them.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • pnfs_update_layout is really the "nexus" of layout handling. If it
    returns NULL then we end up going through the MDS. This patch adds
    some tracepoints to that function that allow us to determine the
    cause when we end up going through the MDS unexpectedly.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Signed-off-by: Olga Kornievskaia
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
     
  • Operations to which stateid information is added:
    close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid,
    write, lock, locku, lockt

    Format is "stateid=:", also "openstateid=",
    "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock
    tracepoints.

    A new function, nfs_stateid_hash(), is added to internal.h to compute the hash.

    trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr()
    to get access to stateid.

    trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT
    to a new event type, INODE_STATEID_EVENT, which is the same as INODE_EVENT
    but adds stateid information.

    For locking tracepoints, trace_nfs4_set_lock() is moved into _nfs4_do_setlk()
    to get access to stateid information, and trace_nfs4_lock_reclaim() and
    trace_nfs4_lock_expired() are removed, as they call into _nfs4_do_setlk() and
    both were previously the same LOCK_EVENT type.

    Signed-off-by: Olga Kornievskaia
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
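
    Hashing a stateid for display keeps trace lines short while still letting
    two events be matched to the same state. The sketch below assumes the
    hash is a standard CRC-32 over the 12-byte "other" field; the commit
    message does not spell out the algorithm, so treat that choice, and the
    names crc32_le_sketch() and stateid_hash_sketch(), as illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATEID_OTHER_SIZE 12 /* size of the opaque "other" field */

/* Bitwise little-endian CRC-32 update (reflected polynomial 0xEDB88320). */
uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *data, size_t len)
{
    size_t i;
    int bit;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc;
}

/* Hash the opaque part of a stateid: seed with all ones, invert the result. */
uint32_t stateid_hash_sketch(const unsigned char other[STATEID_OTHER_SIZE])
{
    return (uint32_t)~crc32_le_sketch(0xFFFFFFFFu, other, STATEID_OTHER_SIZE);
}
```

    Two stateids with identical "other" bytes always produce the same hash,
    which is what makes the value useful for correlating trace events.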
     
  • status_flags not set

    Signed-off-by: Andrew Elble
    Signed-off-by: Trond Myklebust

    Andrew Elble
     

07 Oct, 2015

1 commit

  • Running xfstest generic/013 with the tracepoint nfs:nfs4_open_file
    enabled produces a NULL-pointer dereference when calculating fileid and
    filehandle of the opened file. Fix this by checking if state is NULL
    before trying to use the inode pointer.

    Reported-by: Olga Kornievskaia
    Signed-off-by: Anna Schumaker
    Signed-off-by: Trond Myklebust

    Anna Schumaker
     

26 Aug, 2015

3 commits


16 Apr, 2015

1 commit