24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
    markings where they are unnecessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
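
    As a rough illustration of this kind of conversion (not taken from the
    patch itself), the sketch below defines a userspace stand-in for the
    fallthrough macro and uses it in a switch. The function cleanup_steps()
    and its stage numbering are hypothetical.

```c
/* Userspace stand-in for the kernel's pseudo-keyword (assumption: a
 * GCC- or Clang-compatible compiler; other compilers get a no-op). */
#ifdef __has_attribute
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Hypothetical example: count how many teardown steps run for a stage. */
int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;	/* undo stage 2 */
		fallthrough;	/* was: a fall through comment */
	case 1:
		steps++;	/* undo stage 1 */
		fallthrough;
	case 0:
		steps++;	/* undo stage 0 */
		break;
	default:
		break;		/* unknown stage: nothing to undo */
	}
	return steps;
}
```

    Unlike a comment, the macro is a real statement, so the compiler can
    verify that every fall-through is intentional.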

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

20 Dec, 2019

1 commit

  • The nfsd4_cb_layout_done() function takes a 'time_t' value,
    multiplied by NSEC_PER_SEC*2 to get a nanosecond value.

    This works fine on 64-bit architectures, but on 32-bit, any
    value over 1 second results in a signed integer overflow
    with unexpected results.

    Cast one input to a 64-bit type in order to produce the
    same result that we have on 64-bit architectures, regardless
    of the type of nfsd4_lease.
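
    A minimal sketch of the widening cast, assuming a 32-bit signed type
    stands in for time_t. The helper name lease_timeout_ns() is
    hypothetical and is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/*
 * int32_t models time_t on a 32-bit architecture. Without the cast the
 * product would be computed in 32 bits and overflow for any lease longer
 * than one second, because 2 * NSEC_PER_SEC already exceeds INT32_MAX.
 */
int64_t lease_timeout_ns(int32_t lease_seconds)
{
	assert(lease_seconds >= 0);
	/* Casting one operand makes the whole multiplication 64-bit. */
	return (int64_t)lease_seconds * NSEC_PER_SEC * 2;
}
```

    For a 90 second lease this yields 180000000000 ns, which does not fit
    in 32 bits.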

    Fixes: 6b9b21073d3b ("nfsd: give up on CB_LAYOUTRECALLs after two lease periods")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: J. Bruce Fields

    Arnd Bergmann
     

19 Aug, 2019

1 commit


03 May, 2019

1 commit

  • Instead of having the convention where individual nfsd4_callback_ops->done
    operations return -1 to indicate the callback path is down, move the check
    to nfsd4_cb_done. Only mark the callback path down on transport-level
    errors, not NFS-level errors.

    The existing logic causes the server to set SEQ4_STATUS_CB_PATH_DOWN
    just because the client returned an error to a CB_RECALL for a
    delegation that the client had already done a FREE_STATEID for. But
    clearly that error doesn't mean that there's anything wrong with the
    backchannel.

    Additionally, handle NFS4ERR_DELAY in nfsd4_cb_recall_done. The client
    returns NFS4ERR_DELAY if it is already in the process of returning the
    delegation.

    Signed-off-by: Scott Mayhew
    Signed-off-by: J. Bruce Fields

    Scott Mayhew
     

28 Dec, 2018

1 commit

  • Drop LIST_HEAD where the variable it declares is never used.

    This was introduced in c5c707f96fc9a ("nfsd: implement pNFS
    layout recalls"), but was not used even in that commit.

    The semantic patch that fixes this problem is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @@
    identifier x;
    @@
    - LIST_HEAD(x);
    ... when != x
    //

    Fixes: c5c707f96fc9a ("nfsd: implement pNFS layout recalls")
    Signed-off-by: Julia Lawall
    Signed-off-by: J. Bruce Fields

    Julia Lawall
     

19 Jun, 2018

1 commit

  • Commit 30181faae37f ("nfsd: Check queue type before submitting a SCSI
    request") did the work of ensuring that we don't send SCSI requests to a
    request queue that won't support them, but that check is in the
    GETDEVICEINFO path. Let's not set the SCSI layout in fs_layout_type in the
    first place, and then we'll have fewer clients sending GETDEVICEINFO for
    non-SCSI request queues and fewer unnecessary WARN_ONs.

    While we're in here, remove some outdated comments that refer to
    "overwriting" layout selection because commit 8a4c3926889e ("nfsd: allow
    nfsd to advertise multiple layout types") changed things to no longer
    overwrite the layout type.

    Signed-off-by: Benjamin Coddington
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     

04 Apr, 2018

1 commit


19 Nov, 2017

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Lots of good bugfixes, including:

    - fix a number of races in the NFSv4+ state code

    - fix some shutdown crashes in multiple-network-namespace cases

    - relax our 4.1 session limits; if you've an artificially low limit
    to the number of 4.1 clients that can mount simultaneously, try
    upgrading"

    * tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits)
    SUNRPC: Improve ordering of transport processing
    nfsd: deal with revoked delegations appropriately
    svcrdma: Enqueue after setting XPT_CLOSE in completion handlers
    nfsd: use nfs->ns.inum as net ID
    rpc: remove some BUG()s
    svcrdma: Preserve CB send buffer across retransmits
    nfds: avoid gettimeofday for nfssvc_boot time
    fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t
    fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t
    fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t
    lockd: double unregister of inetaddr notifiers
    nfsd4: catch some false session retries
    nfsd4: fix cached replies to solo SEQUENCE compounds
    sunrcp: make function _svc_create_xprt static
    SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
    nfsd: use ARRAY_SIZE
    nfsd: give out fewer session slots as limit approaches
    nfsd: increase DRC cache limit
    nfsd: remove unnecessary nofilehandle checks
    nfs_common: convert int to bool
    ...

    Linus Torvalds
     

08 Nov, 2017

1 commit

  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to use-after-free situation and be exploitable.

    The variable nfs4_stid.sc_count is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.
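
    The properties listed above can be sketched in userspace with C11
    atomics. This is only an illustration of the idea (refuse to increment
    from zero, saturate instead of wrapping); it is not the kernel's
    refcount_t implementation, and struct ref and its helpers are
    hypothetical names.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
	atomic_uint count;
};

/* A new object starts with one reference. */
void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference unless the object is already dead. */
bool ref_inc_not_zero(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0)
			return false;	/* never resurrect a freed object */
		if (old == UINT_MAX)
			return true;	/* saturated: stay pinned, no wrap */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));
	return true;
}

/* Drop a reference; returns true when the caller must free the object. */
bool ref_dec_and_test(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0 || old == UINT_MAX)
			return false;	/* underflow or saturated: leave alone */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));
	return old == 1;
}
```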

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: J. Bruce Fields

    Elena Reshetova
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

23 Feb, 2017

1 commit

  • Pull driver core updates from Greg KH:
    "Here is the "small" driver core patches for 4.11-rc1.

    Not much here, some firmware documentation and self-test updates, a
    debugfs code formatting issue, and a new feature for call_usermodehelper
    to make it more robust on systems that want to lock it down in a more
    secure way.

    All of these have been in linux-next for a while now with no reported
    issues"

    * tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    kernfs: handle null pointers while printing node name and path
    Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()
    Make static usermode helper binaries constant
    kmod: make usermodehelper path a const string
    firmware: revamp firmware documentation
    selftests: firmware: send expected errors to /dev/null
    selftests: firmware: only modprobe if driver is missing
    platform: Print the resource range if device failed to claim
    kref: prefer atomic_inc_not_zero to atomic_add_unless
    debugfs: improve formatting of debugfs_real_fops()

    Linus Torvalds
     

01 Feb, 2017

1 commit

  • nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().

    If nfsd doesn't go through init_lock_stateid() and puts the stateid at
    the end, there is a NULL dereference of .sc_free when calling
    nfs4_put_stid(ns).

    This patch moves the nfs4_stid.sc_free assignment into nfs4_alloc_stid().

    Cc: stable@vger.kernel.org
    Fixes: 356a95ece7aa "nfsd: clean up races in lock stateid searching..."
    Signed-off-by: Kinglong Mee
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     

19 Jan, 2017

1 commit

  • There are a number of usermode helper binaries that are "hard coded" in
    the kernel today, so mark them as "const" to make it harder for someone
    to change where the variables point to.

    Cc: Benjamin Herrenschmidt
    Cc: Thomas Sailer
    Cc: "Rafael J. Wysocki"
    Cc: Johan Hovold
    Cc: Alex Elder
    Cc: "J. Bruce Fields"
    Cc: Jeff Layton
    Cc: David Howells
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

02 Nov, 2016

1 commit

  • Currently, when the client continually returns NFS4ERR_DELAY on a
    CB_LAYOUTRECALL, we'll give up trying to retransmit after two lease
    periods, but leave the layout in place.

    What we really need to do here is fence the client in this case. Have it
    fall through to that code in that case instead of into the
    NFS4ERR_NOMATCHING_LAYOUT case.

    Signed-off-by: Jeff Layton
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

17 Sep, 2016

1 commit

  • We currently can hit a deadlock (of sorts) when trying to use flexfiles
    layouts with XFS. XFS will call break_layout when something wants to
    write to the file. In the case of the (super-simple) flexfiles layout
    driver in knfsd, the MDS and DS are the same machine.

    The client can get a layout and then issue a v3 write to do its I/O. XFS
    will then call xfs_break_layouts, which will cause a CB_LAYOUTRECALL to
    be issued to the client. The client however can't return the layout
    until the v3 WRITE completes, but XFS won't allow the write to proceed
    until the layout is returned.

    Christoph says:

    XFS only cares about block-like layouts where the client has direct
    access to the file blocks. I'd need to look how to propagate the
    flag into break_layout, but in principle we don't need to do any
    recalls on truncate ever for file and flexfile layouts.

    If we're never going to recall the layout, then we don't even need to
    set the lease at all. Just skip doing so on flexfiles layouts by
    adding a new flag to struct nfsd4_layout_ops and skipping the lease
    setting and removal when that flag is true.

    Cc: Christoph Hellwig
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

16 Jul, 2016

1 commit

  • If the underlying filesystem supports multiple layout types, then there
    is little reason not to advertise that fact to clients and let them
    choose what type to use.

    Turn the ex_layout_type field into a bitfield. For each supported
    layout type, we set a bit in that field. When the client requests a
    layout, ensure that the bit for that layout type is set. When the
    client requests attributes, send back a list of supported types.
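
    A small sketch of the bitfield approach, assuming one bit per layout
    type number. The helper names are hypothetical, and the enum values
    follow the pNFS layout type numbering as an assumption rather than
    being copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout type numbers (assumed here for illustration). */
enum {
	LAYOUT_NFSV4_1_FILES = 1,
	LAYOUT_OSD2_OBJECTS  = 2,
	LAYOUT_BLOCK_VOLUME  = 3,
	LAYOUT_FLEX_FILES    = 4,
};

/* Bit that represents one layout type in the export's mask. */
uint32_t layout_bit(unsigned int type)
{
	return type < 32 ? UINT32_C(1) << type : 0;
}

/* On a layout request: is the requested type advertised for this export? */
bool layout_supported(uint32_t mask, unsigned int type)
{
	return (mask & layout_bit(type)) != 0;
}

/* On an attribute request: expand the mask into a list of type numbers. */
size_t layout_list(uint32_t mask, unsigned int *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL || max == 0);
	for (unsigned int type = 0; type < 32 && n < max; type++)
		if (mask & layout_bit(type))
			out[n++] = type;
	return n;
}
```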

    Signed-off-by: Jeff Layton
    Reviewed-by: Weston Andros Adamson
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

14 Jul, 2016

1 commit

  • Have a simple flex file server where the mds (NFSv4.1 or NFSv4.2)
    is also the ds (NFSv3). I.e., the metadata and the data file are
    the exact same file.

    This will allow testing of the flex file client.

    Simply add the "pnfs" export option to your export
    in /etc/exports and mount from a client that supports
    flex files.

    Signed-off-by: Tom Haynes
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Tom Haynes
     

14 May, 2016

1 commit


18 Mar, 2016

2 commits

  • This is a simple extension to the block layout driver to use SCSI
    persistent reservations for access control and fencing, as well as
    SCSI VPD pages for device identification.

    For this we need to pass the nfs4_client to the proc_getdeviceinfo method
    to generate the reservation key, and add a new fence_client method
    to allow for fence actions in the layout driver.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Christoph Hellwig
     
  • Split the config symbols into a generic pNFS one, which is invisible
    and gets selected by the layout drivers, and one for the block layout
    driver.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Christoph Hellwig
     

16 Jan, 2016

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Smaller bugfixes and cleanup, including a fix for failures of
    kerberized NFSv4.1 mounts, and Scott Mayhew's work addressing ACK
    storms that can affect some high-availability NFS setups"

    * tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linux:
    nfsd: add new io class tracepoint
    nfsd: give up on CB_LAYOUTRECALLs after two lease periods
    nfsd: Fix nfsd leaks sunrpc module references
    lockd: constify nlmsvc_binding structure
    lockd: use to_delayed_work
    nfsd: use to_delayed_work
    Revert "svcrdma: Do not send XDR roundup bytes for a write chunk"
    lockd: Register callbacks on the inetaddr_chain and inet6addr_chain
    nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain
    sunrpc: Add a function to close temporary transports immediately
    nfsd: don't base cl_cb_status on stale information
    nfsd4: fix gss-proxy 4.1 mounts for some AD principals
    nfsd: fix unlikely NULL deref in mach_creds_match
    nfsd: minor consolidation of mach_cred handling code
    nfsd: helper for dup of possibly NULL string
    svcrpc: move some initialization to common code
    nfsd: fix a warning message
    nfsd: constify nfsd4_callback_ops structure
    nfsd: recover: constify nfsd4_client_tracking_ops structures
    svcrdma: Do not send XDR roundup bytes for a write chunk

    Linus Torvalds
     

09 Jan, 2016

1 commit

  • Have the CB_LAYOUTRECALL code treat NFS4_OK and NFS4ERR_DELAY returns
    equivalently. Change the code to periodically resend CB_LAYOUTRECALLS
    until the ls_layouts list is empty or the client returns a different
    error code.

    If we go for two lease periods without the list being emptied or the
    client sending a hard error, then we give up and clean out the list
    anyway.
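
    The retry policy described above could be modelled roughly as follows.
    This is a simplified decision function, not the nfsd callback code;
    the enum recall_action and recall_next() are hypothetical, and times
    are plain seconds.

```c
#include <assert.h>
#include <stdbool.h>

#define NFS4_OK		0
#define NFS4ERR_DELAY	10008

enum recall_action {
	RECALL_DONE,		/* nothing left to recall */
	RECALL_RESEND,		/* send CB_LAYOUTRECALL again later */
	RECALL_GIVE_UP,		/* two lease periods passed: clean out the list */
	RECALL_HARD_ERROR,	/* client returned some other error */
};

enum recall_action recall_next(int status, bool layouts_empty,
			       long long elapsed_s, long long lease_s)
{
	assert(lease_s > 0);

	switch (status) {
	case NFS4_OK:
	case NFS4ERR_DELAY:
		/* Both replies are treated the same way. */
		if (layouts_empty)
			return RECALL_DONE;
		if (elapsed_s >= 2 * lease_s)
			return RECALL_GIVE_UP;
		return RECALL_RESEND;
	default:
		return RECALL_HARD_ERROR;
	}
}
```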

    Signed-off-by: Jeff Layton
    Tested-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

17 Dec, 2015

1 commit

  • We do need to serialize layout stateid morphing operations, but we
    currently hold the ls_mutex across a layout recall which is pretty
    ugly. It's also unnecessary -- once we've bumped the seqid and
    copied it, we don't need to serialize the rest of the CB_LAYOUTRECALL
    vs. anything else. Just drop the mutex once the copy is done.

    This was causing a "workqueue leaked lock or atomic" warning and an
    occasional deadlock.

    There's more work to be done here but this fixes the immediate
    regression.

    Fixes: cc8a55320b5f "nfsd: serialize layout stateid morphing operations"
    Cc: stable@vger.kernel.org
    Reported-by: Kinglong Mee
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

24 Nov, 2015

1 commit


24 Oct, 2015

2 commits

  • Bruce points out that the increment of the seqid in stateids is not
    serialized in any way, so it's possible for racing calls to bump it
    twice and end up sending the same stateid. While we don't have any
    reports of this problem it _is_ theoretically possible, and could lead
    to spurious state recovery by the client.

    In the current code, update_stateid is always followed by a memcpy of
    that stateid, so we can combine the two operations. For better
    atomicity, we add a spinlock to the nfs4_stid and hold that when bumping
    the seqid and copying the stateid.
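
    A userspace sketch of the combined bump-and-copy, using a C11
    atomic_flag as a stand-in for the kernel spinlock. The struct layout
    and function names are hypothetical, and skipping zero on wraparound
    is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* An NFSv4 stateid: a 4-byte seqid followed by 12 opaque bytes. */
struct stateid {
	uint32_t seqid;
	unsigned char other[12];
};

struct stid {
	atomic_flag lock;	/* stand-in for a spinlock */
	struct stateid id;
};

void stid_init(struct stid *s)
{
	atomic_flag_clear(&s->lock);
	memset(&s->id, 0, sizeof(s->id));
}

/* Bump the seqid and copy the result out as one critical section, so two
 * racing callers can never hand out the same stateid. */
void stid_inc_and_copy(struct stateid *dst, struct stid *s)
{
	assert(dst != NULL && s != NULL);

	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;	/* spin until the lock is free */

	if (++s->id.seqid == 0)
		s->id.seqid = 1;	/* assumption: never hand out seqid 0 */
	memcpy(dst, &s->id, sizeof(*dst));

	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}
```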

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • In order to allow the client to make a sane determination of what
    happened with racing LAYOUTGET/LAYOUTRETURN/CB_LAYOUTRECALL calls, we
    must ensure that the seqids returned accurately represent the order of
    operations. The simplest way to do that is to ensure that operations on
    a single stateid are serialized.

    This patch adds a mutex to the layout stateid, and locks it when
    checking the layout stateid's seqid. The mutex is held over the entire
    operation and released after the seqid is bumped.

    Note that in the case of CB_LAYOUTRECALL we must move the increment of
    the seqid and setting into a new cb "prepare" operation. The lease
    infrastructure will call the lm_break callback with a spinlock held, so
    we can't take the mutex in that codepath.

    Cc: Christoph Hellwig
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

21 Jul, 2015

1 commit


31 Mar, 2015

1 commit


26 Mar, 2015

1 commit

  • With return layout as follows (seg is the returned layout, lo is the
    recorded layout):
    if seg->offset <= lo->offset and layout_end(seg) < layout_end(lo),
    nfsd should update lo's offset to seg's end;
    and if seg->offset > lo->offset and layout_end(seg) >= layout_end(lo),
    nfsd should update lo's end to seg's offset.
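
    The two trimming cases can be sketched as below: a returned range that
    covers the front of a recorded layout moves the layout's offset
    forward, and one that covers the tail pulls its end back. This is a
    simplified model, not the nfsd code; struct seg, enum trim_result and
    layout_trim() are hypothetical.

```c
#include <stdint.h>

struct seg {
	uint64_t offset;
	uint64_t length;
};

enum trim_result {
	TRIM_NONE,		/* ranges do not overlap */
	TRIM_FRONT,		/* lo's offset moved to seg's end */
	TRIM_BACK,		/* lo's end moved to seg's offset */
	TRIM_WHOLE,		/* lo is fully covered and can be dropped */
	TRIM_SPLIT_UNSUPPORTED,	/* seg is strictly inside lo */
};

/* End of a segment, saturating instead of wrapping on overflow. */
uint64_t layout_end(const struct seg *s)
{
	uint64_t end = s->offset + s->length;

	return end >= s->offset ? end : UINT64_MAX;
}

enum trim_result layout_trim(struct seg *lo, const struct seg *seg)
{
	uint64_t lo_end = layout_end(lo);
	uint64_t seg_end = layout_end(seg);

	if (seg_end <= lo->offset || seg->offset >= lo_end)
		return TRIM_NONE;

	if (seg->offset <= lo->offset) {
		if (seg_end >= lo_end)
			return TRIM_WHOLE;
		lo->offset = seg_end;		/* keep only the tail */
		lo->length = lo_end - seg_end;
		return TRIM_FRONT;
	}

	if (seg_end < lo_end)
		return TRIM_SPLIT_UNSUPPORTED;
	lo->length = seg->offset - lo->offset;	/* keep only the head */
	return TRIM_BACK;
}
```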

    Fixes: 9cf514ccfa ("nfsd: implement pNFS operations")
    Signed-off-by: Kinglong Mee
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     

21 Mar, 2015

2 commits

  • According to RFC5661:
    " When lr_returntype is LAYOUTRETURN4_FSID, the current filehandle is used
    to identify the file system and all layouts matching the client ID,
    the fsid of the file system, lora_layout_type, and lora_iomode are
    returned. When lr_returntype is LAYOUTRETURN4_ALL, all layouts
    matching the client ID, lora_layout_type, and lora_iomode are
    returned and the current filehandle is not used. "

    When returning client layouts, always check layout type.

    Signed-off-by: Kinglong Mee
    Reviewed-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • 31ef83dc05 "nfsd: add trace events" had a typo that dropped a trace
    event and replaced it by an incorrect recursive call to
    nfsd4_cb_layout_fail. 133d558216d9 "Subject: nfsd: don't recursively
    call nfsd4_cb_layout_fail" fixed the crash, this restores the
    tracepoint.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     

20 Mar, 2015

1 commit

  • Due to a merge error when creating c5c707f9 ("nfsd: implement pNFS
    layout recalls"), we recursively call nfsd4_cb_layout_fail from itself,
    leading to stack overflows.

    Signed-off-by: Christoph Hellwig
    Fixes: c5c707f9 ("nfsd: implement pNFS layout recalls")
    Signed-off-by: J. Bruce Fields
    ---
    fs/nfsd/nfs4layouts.c | 2 --
    1 file changed, 2 deletions(-)

    diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
    index 3c1bfa1..1028a06 100644
    --- a/fs/nfsd/nfs4layouts.c
    +++ b/fs/nfsd/nfs4layouts.c
    @@ -587,8 +587,6 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)

    rpc_ntop((struct sockaddr *)&clp->cl_addr, addr_str, sizeof(addr_str));

    - nfsd4_cb_layout_fail(ls);
    -
    printk(KERN_WARNING
    "nfsd: client %s failed to respond to layout recall. "
    " Fencing..\n", addr_str);
    --
    1.9.1

    Christoph Hellwig
     

05 Feb, 2015

1 commit

  • Add a small shim between core nfsd and filesystems to translate the
    somewhat cumbersome pNFS data structures and semantics to something
    more palatable for Linux filesystems.

    Thanks to Rick McNeal for the old prototype pNFS blocklayout server
    code, which gave a lot of inspiration to this version even if no
    code is left from it.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

03 Feb, 2015

3 commits

  • For now just a few simple events to trace the layout stateid lifetime, but
    these already were enough to find several bugs in the Linux client layout
    stateid handling.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Add support to issue layout recalls to clients. For now we only support
    full-file recalls to get a simple and stable implementation. This allows
    us to embed an nfsd4_callback structure in the layout_state and thus avoid
    any memory allocations under spinlocks during a recall. For normal
    use cases that do not intend to share a single file between multiple
    clients this implementation is fully sufficient.

    To ensure layouts are recalled on local filesystem access each layout
    state registers a new FL_LAYOUT lease with the kernel file locking code,
    which filesystems that support pNFS exports that require recalls need
    to break on conflicting access patterns.

    The XDR code is based on the old pNFS server implementation by
    Andy Adamson, Benny Halevy, Boaz Harrosh, Dean Hildebrand, Fred Isaman,
    Marc Eshel, Mike Sager and Ricardo Labiaga.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
    LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
    outstanding layouts and devices.

    Layout management is very straightforward, with a nfs4_layout_stateid
    structure that extends nfs4_stid to manage layout stateids as the
    top-level structure. It is linked into the nfs4_file and nfs4_client
    structures like the other stateids, and contains a linked list of
    layouts that hang off the stateid. The actual layout operations are
    implemented in layout drivers that are not part of this commit, but
    will be added later.

    The worst part of this commit is the management of the pNFS device IDs,
    which suffers from a specification that is not sanely implementable due
    to the fact that the device-IDs are global and not bound to an export,
    and have a small enough size so that we can't store the fsid portion of
    a file handle, and must never be reused. As we still do need to perform all
    export authentication and validation checks on a device ID passed to
    GETDEVICEINFO we are caught between a rock and a hard place. To work
    around this issue we add a new hash that maps from a 64-bit integer to a
    fsid so that we can look up the export to authenticate against it,
    a 32-bit integer as a generation that we can bump when changing the device,
    and a currently unused 32-bit integer that could be used in the future
    to handle more than a single device per export. Entries in this hash
    table are never deleted as we can't reuse the ids anyway, and would have
    a severe lifetime problem anyway as Linux export structures are temporary
    structures that can go away under load.
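
    The device ID scheme described above could look roughly like this: a
    16-byte ID made of a 64-bit table index, a 32-bit generation and a
    32-bit unused field, plus an append-only table from index to fsid.
    This is a sketch under stated assumptions, not the nfsd data
    structures: the fsid is reduced to a single 64-bit value, a flat array
    stands in for the hash, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEVICEID_SIZE	16	/* an NFSv4.1 device ID is 16 opaque bytes */
#define MAX_MAPPINGS	64	/* hypothetical table capacity */

struct deviceid {
	uint64_t fsid_idx;	/* key into the index-to-fsid table */
	uint32_t generation;	/* bumped when the device changes */
	uint32_t pad;		/* currently unused */
};
static_assert(sizeof(struct deviceid) == DEVICEID_SIZE,
	      "device ID must stay 16 bytes");

struct fsid_map {
	uint64_t count;
	uint64_t fsid[MAX_MAPPINGS];	/* entries are never deleted */
};

/* Return the 1-based index for fsid, adding it if new; 0 if the table is full. */
uint64_t map_fsid(struct fsid_map *m, uint64_t fsid)
{
	for (uint64_t i = 0; i < m->count; i++)
		if (m->fsid[i] == fsid)
			return i + 1;
	if (m->count == MAX_MAPPINGS)
		return 0;
	m->fsid[m->count] = fsid;
	return ++m->count;
}

/* Resolve a device ID back to its fsid; false for an unknown index. */
bool lookup_fsid(const struct fsid_map *m, const struct deviceid *id,
		 uint64_t *fsid)
{
	if (id->fsid_idx == 0 || id->fsid_idx > m->count)
		return false;
	*fsid = m->fsid[id->fsid_idx - 1];
	return true;
}
```

    Because indices are only ever appended, an ID handed to a client stays
    resolvable even after the export structure it came from has gone away.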

    Parts of the XDR data, structures and marshaling/unmarshaling code, as
    well as many concepts are derived from the old pNFS server implementation
    from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
    Mike Sager, Ricardo Labiaga and many others.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig