30 Dec, 2020

1 commit

  • [ Upstream commit 9b82d88d5976e5f2b8015d58913654856576ace5 ]

    NLM uses an interval-based rebinding, i.e. it clears the transport's
    binding under certain conditions if more than 60 seconds have elapsed
    since the connection was last bound.

    This rebinding is not necessary for an autobind RPC client over a
    connection-oriented protocol like TCP.

    It can also cause problems: it is possible for nlm_bind_host() to clear
    XPRT_BOUND whilst a connection worker is in the middle of trying to
    reconnect, after it had already been checked in xprt_connect().

    When the connection worker notices that XPRT_BOUND has been cleared
    under it, in xs_tcp_finish_connecting(), that results in:

    xs_tcp_setup_socket: connect returned unhandled error -107

    Worse, it's possible that the two can get into lockstep, resulting in
    the same behaviour repeated indefinitely, with the above error every
    300 seconds, without ever recovering, and the connection never being
    established. This has been seen in practice, with a large number of NLM
    client tasks, following a server restart.

    The existing callers of nlm_bind_host & nlm_rebind_host should not need
    to force the rebind, for TCP, so restrict the interval-based rebinding
    to UDP only.

    For TCP, we will still rebind when needed, e.g. on timeout, and connection
    error (including closure), since connection-related errors on an existing
    connection, ECONNREFUSED when trying to connect, and rpc_check_timeout(),
    already unconditionally clear XPRT_BOUND.

    To avoid having to add the fix, and explanation, to both nlm_bind_host()
    and nlm_rebind_host(), remove the duplicate code from the former, and
    have it call the latter.

    Drop the dprintk, which adds no value over a trace.

    Signed-off-by: Calum Mackay
    Fixes: 35f5a422ce1a ("SUNRPC: new interface to force an RPC rebind")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Sasha Levin

    Calum Mackay
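
The rebinding rule described in the commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `Host`, `rebind_host`, and `REBIND_INTERVAL` are hypothetical names, the `bound` field merely stands in for the transport's XPRT_BOUND flag, and none of this is the kernel code itself.

```python
from dataclasses import dataclass

REBIND_INTERVAL = 60.0  # seconds; the interval quoted in the commit message


@dataclass
class Host:
    proto: str           # "udp" or "tcp" (hypothetical representation)
    next_rebind: float   # earliest time at which a forced rebind is allowed
    bound: bool = True   # stands in for the transport's XPRT_BOUND flag


def rebind_host(host: Host, now: float) -> bool:
    """Force a rebind only for UDP hosts whose interval has elapsed.

    Returns True if the binding was cleared.
    """
    if host.proto != "udp":
        # TCP: leave the binding alone; timeouts and connection errors
        # are expected to clear it through other paths.
        return False
    if now < host.next_rebind:
        return False
    host.bound = False
    host.next_rebind = now + REBIND_INTERVAL
    return True
```

In this model a TCP host is never unbound by the interval check, so a reconnect in progress cannot have its binding cleared underneath it by this path.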
     

23 Oct, 2020

1 commit

  • Pull nfsd updates from Bruce Fields:
    "The one new feature this time, from Anna Schumaker, is READ_PLUS,
    which has the same arguments as READ but allows the server to return
    an array of data and hole extents.

    Otherwise it's a lot of cleanup and bugfixes"

    * tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
    NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
    SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
    sunrpc: raise kernel RPC channel buffer size
    svcrdma: fix bounce buffers for unaligned offsets and multiple pages
    nfsd: remove unneeded break
    net/sunrpc: Fix return value for sysctl sunrpc.transports
    NFSD: Encode a full READ_PLUS reply
    NFSD: Return both a hole and a data segment
    NFSD: Add READ_PLUS hole segment encoding
    NFSD: Add READ_PLUS data support
    NFSD: Hoist status code encoding into XDR encoder functions
    NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
    NFSD: Remove the RETURN_STATUS() macro
    NFSD: Call NFSv2 encoders on error returns
    NFSD: Fix .pc_release method for NFSv2
    NFSD: Remove vestigial typedefs
    NFSD: Refactor nfsd_dispatch() error paths
    NFSD: Clean up nfsd_dispatch() variables
    NFSD: Clean up stale comments in nfsd_dispatch()
    NFSD: Clean up switch statement in nfsd_dispatch()
    ...

    Linus Torvalds
     

02 Oct, 2020

1 commit

  • Clean up: Follow-up on ten-year-old commit b9081d90f5b9 ("NFS: kill
    off complicated macro 'PROC'") by performing the same conversion in
    the lockd code. To reduce the chance of error, I copied the original
    C preprocessor output and then made some minor edits.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

21 Sep, 2020

1 commit

  • Rationale:
    Reduces attack surface on kernel devs opening the links for MITM
    as HTTPS traffic is much harder to manipulate.

    Deterministic algorithm:
    For each file:
      If not .svg:
        For each line:
          If doesn't contain `\bxmlns\b`:
            For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
              If both the HTTP and HTTPS versions
              return 200 OK and serve the same content:
                Replace HTTP with HTTPS.

    Signed-off-by: Alexander A. Klimov
    Signed-off-by: Anna Schumaker

    Alexander A. Klimov
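
The deterministic algorithm in the commit above can be sketched in Python. This is an illustration, not the author's actual script: `upgrade_links` is a hypothetical name, and the network check ("both versions return 200 OK and serve the same content") is left to a caller-supplied function, `same_over_https`, so the sketch performs no network access itself. The two regular expressions are taken verbatim from the commit message.

```python
import re
from typing import Callable

XMLNS = re.compile(r"\bxmlns\b")
HTTP_LINK = re.compile(r"\bhttp://[^# \t\r\n]*(?:\w|/)")


def upgrade_links(filename: str, text: str,
                  same_over_https: Callable[[str], bool]) -> str:
    """Rewrite http:// links to https:// where the caller's check passes."""
    if filename.endswith(".svg"):
        return text  # .svg files are skipped entirely

    def replace(match):
        url = match.group(0)
        if same_over_https(url):
            return "https://" + url[len("http://"):]
        return url

    out = []
    for line in text.splitlines(keepends=True):
        # Lines mentioning xmlns are left untouched.
        if not XMLNS.search(line):
            line = HTTP_LINK.sub(replace, line)
        out.append(line)
    return "".join(out)
```

Passing the check in as a function keeps the text-rewriting step deterministic and easy to test with a stub.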
     

04 Feb, 2020

1 commit

  • The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
    seq_file.h.

    Conversion rule is:

    llseek => proc_lseek
    unlocked_ioctl => proc_ioctl

    xxx => proc_xxx

    delete ".owner = THIS_MODULE" line

    [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
    [sfr@canb.auug.org.au: fix kernel/sched/psi.c]
    Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
    Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
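
The conversion rule in the commit above is mechanical, so it can be modelled as a rename over member names. The sketch below assumes an initializer represented as a Python dict of member name to handler name; `convert_member` and `convert_initializer` are hypothetical helpers for illustration, not a tool used by the commit.

```python
from typing import Dict, Optional

# The two members whose new names do not follow the plain "proc_" prefix rule.
SPECIAL = {"llseek": "proc_lseek", "unlocked_ioctl": "proc_ioctl"}


def convert_member(name: str) -> Optional[str]:
    """Map an old member name to its new name, or None if it is deleted."""
    if name == "owner":
        return None  # the ".owner = THIS_MODULE" line is dropped
    return SPECIAL.get(name, "proc_" + name)


def convert_initializer(fields: Dict[str, str]) -> Dict[str, str]:
    """Apply the rule to every member of an initializer."""
    converted = {}
    for name, handler in fields.items():
        new_name = convert_member(name)
        if new_name is not None:
            converted[new_name] = handler
    return converted
```

For example, an initializer with `owner`, `open`, `read`, `llseek`, and `release` members becomes one with `proc_open`, `proc_read`, `proc_lseek`, and `proc_release`, with the handlers unchanged.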
     

04 Nov, 2019

1 commit

  • NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
    at the source port when deciding whether or not an RPC call is a replay
    of a previous call. This requires clients to perform strange TCP gymnastics
    in order to ensure that when they reconnect to the server, they bind
    to the same source port.

    NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
    that do not look at the source port of the connection. This patch therefore
    ensures they can ignore the rebind requirement.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

04 Jul, 2019

5 commits

  • Fix sparse warnings:

    fs/lockd/clntproc.c:57:6: warning: symbol 'nlmclnt_put_lockowner' was not declared. Should it be static?
    fs/lockd/svclock.c:409:35: warning: symbol 'nlmsvc_lock_ops' was not declared. Should it be static?

    Reported-by: Hulk Robot
    Signed-off-by: YueHaibing
    Signed-off-by: J. Bruce Fields

    YueHaibing
     
  • Use the pid of lockd instead of the remote lock's svid for the fl_pid for
    local POSIX locks. This allows proper enumeration of which local process
    owns which lock. The svid is meaningless to local lock readers.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     
  • Now that the NLM server allocates an nlm_lockowner for fl_owner, there's
    no need for special hashing or comparison.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     
  • Do as the NLM client: allocate and track a struct nlm_lockowner for use as
    the fl_owner for locks created by the NLM server. This allows us to keep
    the svid within this structure for matching locks, and will allow us to
    track the pid of lockd in a future patch. It should also allow easier
    reference of the nlm_host in conflicting locks, and simplify lock hashing
    and comparison.

    Signed-off-by: Benjamin Coddington
    [bfields@redhat.com: fix type of some error returns]
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     
  • The nlm_lockowner structure that the client uses to track locks is
    generally useful to the server as well. Very similar functions to handle
    allocation and tracking of the nlm_lockowner will follow. Rename the client
    functions for clarity.

    Signed-off-by: Benjamin Coddington
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     

31 May, 2019

1 commit

  • This reverts most of commit b8eee0e90f97 ("lockd: Show pid of lockd for
    remote locks"), which caused remote locks to not be differentiated between
    remote processes for NLM.

    We retain the fixup for setting the client's fl_pid to a negative value.

    Fixes: b8eee0e90f97 ("lockd: Show pid of lockd for remote locks")
    Cc: stable@vger.kernel.org

    Signed-off-by: Benjamin Coddington
    Reviewed-by: XueWei Zhang
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     

21 May, 2019

2 commits

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have MODULE_LICENCE("GPL*") inside which was used in the initial
    scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

16 May, 2019

1 commit

  • Pull nfsd updates from Bruce Fields:
    "This consists mostly of nfsd container work:

    Scott Mayhew revived an old api that communicates with a userspace
    daemon to manage some on-disk state that's used to track clients
    across server reboots. We've been using a usermode_helper upcall for
    that, but it's tough to run those with the right namespaces, so a
    daemon is much friendlier to container use cases.

    Trond fixed nfsd's handling of user credentials in user namespaces. He
    also contributed patches that allow containers to support different
    sets of NFS protocol versions.

    The only remaining container bug I'm aware of is that the NFS reply
    cache is shared between all containers. If anyone's aware of other
    gaps in our container support, let me know.

    The rest of this is miscellaneous bugfixes"

    * tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
    nfsd: update callback done processing
    locks: move checks from locks_free_lock() to locks_release_private()
    nfsd: fh_drop_write in nfsd_unlink
    nfsd: allow fh_want_write to be called twice
    nfsd: knfsd must use the container user namespace
    SUNRPC: rsi_parse() should use the current user namespace
    SUNRPC: Fix the server AUTH_UNIX userspace mappings
    lockd: Pass the user cred from knfsd when starting the lockd server
    SUNRPC: Temporary sockets should inherit the cred from their parent
    SUNRPC: Cache the process user cred in the RPC server listener
    nfsd: Allow containers to set supported nfs versions
    nfsd: Add custom rpcbind callbacks for knfsd
    SUNRPC: Allow further customisation of RPC program registration
    SUNRPC: Clean up generic dispatcher code
    SUNRPC: Add a callback to initialise server requests
    SUNRPC/nfs: Fix return value for nfs4_callback_compound()
    nfsd: handle legacy client tracking records sent by nfsdcld
    nfsd: re-order client tracking method selection
    nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
    nfsd: un-deprecate nfsdcld
    ...

    Linus Torvalds
     

27 Apr, 2019

2 commits


26 Apr, 2019

1 commit

  • The RPC_TASK_KILLED flag should really not be set from another context
    because it can clobber data in the struct task when task->tk_flags is
    changed non-atomically.
    Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add
    a function to set that flag and safely wake up the task.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Anna Schumaker

    Trond Myklebust
     

24 Apr, 2019

4 commits


19 Mar, 2019

1 commit

  • If the last NFSv3 unmount from a given host races with a mount from the
    same host, we can destroy an nlm_host that is still in use.

    Specifically nlmclnt_lookup_host() can increment h_count on
    an nlm_host that nlmclnt_release_host() has just successfully called
    refcount_dec_and_test() on.
    Once nlmclnt_lookup_host() drops the mutex, nlm_destroy_host_lock()
    will be called to destroy the nlmclnt which is now in use again.

    The cause of the problem is that the dec_and_test happens outside the
    locked region. This is easily fixed by using
    refcount_dec_and_mutex_lock().

    Fixes: 8ea6ecc8b075 ("lockd: Create client-side nlm_host cache")
    Cc: stable@vger.kernel.org (v2.6.38+)
    Signed-off-by: NeilBrown
    Signed-off-by: Trond Myklebust

    NeilBrown
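
The fix in the commit above relies on the final decrement and the mutex acquisition behaving as one step. The following userspace model sketches that idea. It is an assumption-laden illustration: `RefCount` uses an internal lock where the kernel uses atomics, and `dec_and_mutex_lock` only mirrors the general shape of refcount_dec_and_mutex_lock(); it is not the kernel implementation.

```python
import threading


class RefCount:
    """Userspace stand-in for a reference counter (internal lock, no atomics)."""

    def __init__(self, value=1):
        self._value = value
        self._guard = threading.Lock()

    def read(self):
        with self._guard:
            return self._value

    def inc(self):
        with self._guard:
            self._value += 1

    def dec_not_one(self):
        """Decrement unless the value is 1; return True if decremented."""
        with self._guard:
            if self._value == 1:
                return False
            self._value -= 1
            return True

    def dec_and_test(self):
        """Decrement and return True if the value reached zero."""
        with self._guard:
            self._value -= 1
            return self._value == 0


def dec_and_mutex_lock(ref, mutex):
    """Drop a reference; return True with `mutex` held if it was the last."""
    if ref.dec_not_one():
        return False          # fast path: others still hold references
    mutex.acquire()           # possibly the last reference: decide under the mutex
    if not ref.dec_and_test():
        mutex.release()       # a lookup took a new reference in the meantime
        return False
    return True               # caller destroys the object, then releases the mutex
```

Because the drop to zero can only happen while the mutex is held, a lookup that also runs under the mutex cannot revive an object that is about to be destroyed.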
     

14 Feb, 2019

1 commit


03 Jan, 2019

2 commits

  • Pull NFS client updates from Anna Schumaker:
    "Stable bugfixes:
    - xprtrdma: Yet another double DMA-unmap # v4.20

    Features:
    - Allow some /proc/sys/sunrpc entries without CONFIG_SUNRPC_DEBUG
    - Per-xprt rdma receive workqueues
    - Drop support for FMR memory registration
    - Make port= mount option optional for RDMA mounts

    Other bugfixes and cleanups:
    - Remove unused nfs4_xdev_fs_type declaration
    - Fix comments for behavior that has changed
    - Remove generic RPC credentials by switching to 'struct cred'
    - Fix crossing mountpoints with different auth flavors
    - Various xprtrdma fixes from testing and auditing the close code
    - Fixes for disconnect issues when using xprtrdma with krb5
    - Clean up and improve xprtrdma trace points
    - Fix NFS v4.2 async copy reboot recovery"

    * tag 'nfs-for-4.21-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (63 commits)
    sunrpc: convert to DEFINE_SHOW_ATTRIBUTE
    sunrpc: Add xprt after nfs4_test_session_trunk()
    sunrpc: convert unnecessary GFP_ATOMIC to GFP_NOFS
    sunrpc: handle ENOMEM in rpcb_getport_async
    NFS: remove unnecessary test for IS_ERR(cred)
    xprtrdma: Prevent leak of rpcrdma_rep objects
    NFSv4.2 fix async copy reboot recovery
    xprtrdma: Don't leak freed MRs
    xprtrdma: Add documenting comment for rpcrdma_buffer_destroy
    xprtrdma: Replace outdated comment for rpcrdma_ep_post
    xprtrdma: Update comments in frwr_op_send
    SUNRPC: Fix some kernel doc complaints
    SUNRPC: Simplify defining common RPC trace events
    NFS: Fix NFSv4 symbolic trace point output
    xprtrdma: Trace mapping, alloc, and dereg failures
    xprtrdma: Add trace points for calls to transport switch methods
    xprtrdma: Relocate the xprtrdma_mr_map trace points
    xprtrdma: Clean up of xprtrdma chunk trace points
    xprtrdma: Remove unused fields from rpcrdma_ia
    xprtrdma: Cull dprintk() call sites
    ...

    Linus Torvalds
     
  • Pull nfsd updates from Bruce Fields:
    "Thanks to Vasily Averin for fixing a use-after-free in the
    containerized NFSv4.2 client, and cleaning up some convoluted
    backchannel server code in the process.

    Otherwise, miscellaneous smaller bugfixes and cleanup"

    * tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux: (25 commits)
    nfs: fixed broken compilation in nfs_callback_up_net()
    nfs: minor typo in nfs4_callback_up_net()
    sunrpc: fix debug message in svc_create_xprt()
    sunrpc: make visible processing error in bc_svc_process()
    sunrpc: remove unused xpo_prep_reply_hdr callback
    sunrpc: remove svc_rdma_bc_class
    sunrpc: remove svc_tcp_bc_class
    sunrpc: remove unused bc_up operation from rpc_xprt_ops
    sunrpc: replace svc_serv->sv_bc_xprt by boolean flag
    sunrpc: use-after-free in svc_process_common()
    sunrpc: use SVC_NET() in svcauth_gss_* functions
    nfsd: drop useless LIST_HEAD
    lockd: Show pid of lockd for remote locks
    NFSD remove OP_CACHEME from 4.2 op_flags
    nfsd: Return EPERM, not EACCES, in some SETATTR cases
    sunrpc: fix cache_head leak due to queued request
    nfsd: clean up indentation, increase indentation in switch statement
    svcrdma: Optimize the logic that selects the R_key to invalidate
    nfsd: fix a warning in __cld_pipe_upcall()
    nfsd4: fix crash on writing v4_end_grace before nfsd startup
    ...

    Linus Torvalds
     

20 Dec, 2018

1 commit

  • SUNRPC has two sorts of credentials, both of which appear as
    "struct rpc_cred".
    There are "generic credentials" which are supplied by clients
    such as NFS and passed in 'struct rpc_message' to indicate
    which user should be used to authorize the request, and there
    are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
    which describe the credential to be sent over the wires.

    This patch replaces all the generic credentials by 'struct cred'
    pointers - the credential structure used throughout Linux.

    For machine credentials, there is a special 'struct cred *' pointer
    which is statically allocated and recognized where needed as
    having a special meaning. A look-up of a low-level cred will
    map this to a machine credential.

    Signed-off-by: NeilBrown
    Acked-by: J. Bruce Fields
    Signed-off-by: Anna Schumaker

    NeilBrown
     

15 Dec, 2018

1 commit

  • Commit 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid
    for remote locks") specified that the l_pid returned for F_GETLK on a local
    file that has a remote lock should be the pid of the lock manager process.
    That commit, while updating other filesystems, failed to update lockd, such
    that locks created by lockd had their fl_pid set to that of the remote
    process holding the lock. Fix that here to be the pid of lockd.

    Also, fix the client case so that the returned lock pid is negative, which
    indicates a remote lock on a remote file.

    Fixes: 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
    Cc: stable@vger.kernel.org

    Signed-off-by: Benjamin Coddington
    Signed-off-by: J. Bruce Fields

    Benjamin Coddington
     

07 Dec, 2018

1 commit

  • posix_unblock_lock() is not specific to posix locks, and behaves
    nearly identically to locks_delete_block() - the former returning a
    status while the latter doesn't.

    So discard posix_unblock_lock() and use locks_delete_block() instead,
    after giving that function an appropriate return value.

    Signed-off-by: NeilBrown
    Reviewed-by: J. Bruce Fields
    Signed-off-by: Jeff Layton

    NeilBrown
     

28 Nov, 2018

1 commit

  • We fail to advance the read pointer when reading the stat.oh field that
    identifies the lock-holder in a TEST result.

    This turns out not to matter if the server is knfsd, which always
    returns a zero-length field. But other servers (Ganesha is an example)
    may not do this. The result is bad values in fcntl F_GETLK results.

    Fix this.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
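
The bug class described in the commit above, a decoder that reads a variable-length field without advancing past it, can be shown with a small XDR sketch. The field order used here (exclusive, svid, oh, l_offset, l_len) is an assumption based on the NLMv4 lock-holder layout, and `decode_opaque` and `decode_holder` are hypothetical helpers, not the kernel's decoder.

```python
import struct


def decode_opaque(buf, offset):
    """Decode an XDR variable-length opaque and return (data, new_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    if offset + length > len(buf):
        raise ValueError("opaque field runs past end of buffer")
    data = bytes(buf[offset:offset + length])
    # Advance past the data AND its padding to a 4-byte boundary.
    # Skipping this step is the kind of bug the commit fixes.
    offset += (length + 3) & ~3
    return data, offset


def decode_holder(buf, offset=0):
    """Decode a lock holder: exclusive, svid, oh, l_offset, l_len (assumed)."""
    exclusive, svid = struct.unpack_from(">Ii", buf, offset)
    offset += 8
    oh, offset = decode_opaque(buf, offset)
    l_offset, l_len = struct.unpack_from(">QQ", buf, offset)
    offset += 16
    holder = {"exclusive": bool(exclusive), "svid": svid, "oh": oh,
              "l_offset": l_offset, "l_len": l_len}
    return holder, offset
```

With a zero-length `oh` the padding step adds nothing, which matches the commit's observation that the missing advance goes unnoticed against a server that always returns an empty field.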
     

30 Oct, 2018

1 commit


10 Aug, 2018

1 commit

  • nfsd and lockd call vfs_lock_file() to lock/unlock the inode
    returned by locks_inode(file).

    Many places in nfsd/lockd code use the inode returned by
    file_inode(file) for lock manipulation. With Overlayfs, file_inode()
    (the underlying inode) is not the same object as locks_inode() (the
    overlay inode). This can result in "Leaked POSIX lock" messages
    and eventually to a kernel crash as reported by Eddie Horng:
    https://marc.info/?l=linux-unionfs&m=153086643202072&w=2

    Fix all the call sites in nfsd/lockd that should use locks_inode().
    This is a correctness bug that manifested when overlayfs gained
    NFS export support in v4.16.

    Reported-by: Eddie Horng
    Tested-by: Eddie Horng
    Cc: Jeff Layton
    Fixes: 8383f1748829 ("ovl: wire up NFS export operations")
    Cc: stable@vger.kernel.org
    Signed-off-by: Amir Goldstein
    Signed-off-by: J. Bruce Fields

    Amir Goldstein
     

20 Mar, 2018

1 commit

  • The variables nlm_ntf_refcnt and nlm_ntf_wq are local to the source and
    do not need to be in global scope, so make them static.

    Cleans up sparse warnings:
    fs/lockd/svc.c:60:10: warning: symbol 'nlm_ntf_refcnt' was not declared.
    Should it be static?
    fs/lockd/svc.c:61:1: warning: symbol 'nlm_ntf_wq' was not declared.
    Should it be static?

    Signed-off-by: Colin Ian King
    Signed-off-by: J. Bruce Fields

    Colin Ian King
     

25 Jan, 2018

1 commit

  • The server shouldn't actually delete the struct nlm_host until it hits
    the garbage collector. In order to make that work correctly with the
    refcount API, we can bump the refcount by one, and then use
    refcount_dec_if_one() in the garbage collector.

    Signed-off-by: Trond Myklebust
    Acked-by: J. Bruce Fields

    Trond Myklebust
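
The scheme in the commit above, where the cache holds one extra reference and only the garbage collector may take the count from one to zero, can be sketched as a single-threaded model. All names here (`ServerHost`, `get_host`, `release_host`, `dec_if_one`, `garbage_collect`) are hypothetical, and the model ignores locking and atomicity entirely.

```python
class ServerHost:
    def __init__(self, name):
        self.name = name
        self.refs = 1  # the extra reference owned by the host cache


def get_host(host):
    """A user takes a reference."""
    host.refs += 1


def release_host(host):
    """A user drops its reference; this never frees the host."""
    host.refs -= 1


def dec_if_one(host):
    """Drop the cache's reference only if nobody else holds one."""
    if host.refs == 1:
        host.refs = 0
        return True
    return False


def garbage_collect(cache):
    """Destroy idle hosts; return the names that were removed."""
    destroyed = []
    for name in list(cache):
        if dec_if_one(cache[name]):
            del cache[name]
            destroyed.append(name)
    return destroyed
```

A host that is still in use has a count above one, so the collector skips it and retries on a later pass.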
     

15 Jan, 2018

4 commits

  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to a use-after-free situation and be exploitable.

    The variable nlm_rqst.a_count is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    **Important note for maintainers:

    Some functions from refcount_t API defined in lib/refcount.c
    have different memory ordering guarantees than their atomic
    counterparts.
    The full comparison can be seen in
    https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
    in state to be merged to the documentation tree.
    Normally the differences should not matter since refcount_t provides
    enough guarantees to satisfy the refcounting use cases, but in
    some rare cases it might matter.
    Please double check that you don't have some undocumented
    memory guarantees for this variable usage.

    For the nlm_rqst.a_count it might make a difference
    in following places:
    - nlmclnt_release_call() and nlmsvc_release_call(): decrement
    in refcount_dec_and_test() only
    provides RELEASE ordering and control dependency on success
    vs. fully ordered atomic counterpart

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Trond Myklebust

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to a use-after-free situation and be exploitable.

    The variable nlm_lockowner.count is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    **Important note for maintainers:

    Some functions from refcount_t API defined in lib/refcount.c
    have different memory ordering guarantees than their atomic
    counterparts.
    The full comparison can be seen in
    https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
    in state to be merged to the documentation tree.
    Normally the differences should not matter since refcount_t provides
    enough guarantees to satisfy the refcounting use cases, but in
    some rare cases it might matter.
    Please double check that you don't have some undocumented
    memory guarantees for this variable usage.

    For the nlm_lockowner.count it might make a difference
    in following places:
    - nlm_put_lockowner(): decrement in refcount_dec_and_lock() only
    provides RELEASE ordering, control dependency on success and
    holds a spin lock on success vs. fully ordered atomic counterpart.
    No changes in spin lock guarantees.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Trond Myklebust

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to a use-after-free situation and be exploitable.

    The variable nsm_handle.sm_count is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    **Important note for maintainers:

    Some functions from refcount_t API defined in lib/refcount.c
    have different memory ordering guarantees than their atomic
    counterparts.
    The full comparison can be seen in
    https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
    in state to be merged to the documentation tree.
    Normally the differences should not matter since refcount_t provides
    enough guarantees to satisfy the refcounting use cases, but in
    some rare cases it might matter.
    Please double check that you don't have some undocumented
    memory guarantees for this variable usage.

    For the nsm_handle.sm_count it might make a difference
    in following places:
    - nsm_release(): decrement in refcount_dec_and_lock() only
    provides RELEASE ordering, control dependency on success
    and holds a spin lock on success vs. fully ordered atomic
    counterpart. No change for the spin lock guarantees.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Trond Myklebust

    Elena Reshetova
     
  • atomic_t variables are currently used to implement reference
    counters with the following properties:
    - counter is initialized to 1 using atomic_set()
    - a resource is freed upon counter reaching zero
    - once counter reaches zero, its further
    increments aren't allowed
    - counter schema uses basic atomic operations
    (set, inc, inc_not_zero, dec_and_test, etc.)

    Such atomic variables should be converted to a newly provided
    refcount_t type and API that prevents accidental counter overflows
    and underflows. This is important since overflows and underflows
    can lead to a use-after-free situation and be exploitable.

    The variable nlm_host.h_count is used as pure reference counter.
    Convert it to refcount_t and fix up the operations.

    **Important note for maintainers:

    Some functions from refcount_t API defined in lib/refcount.c
    have different memory ordering guarantees than their atomic
    counterparts.
    The full comparison can be seen in
    https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
    in state to be merged to the documentation tree.
    Normally the differences should not matter since refcount_t provides
    enough guarantees to satisfy the refcounting use cases, but in
    some rare cases it might matter.
    Please double check that you don't have some undocumented
    memory guarantees for this variable usage.

    For the nlm_host.h_count it might make a difference
    in following places:
    - nlmsvc_release_host(): decrement in refcount_dec()
    provides RELEASE ordering, while original atomic_dec()
    was fully unordered. Since the change is for better, it
    should not matter.
    - nlmclnt_release_host(): decrement in refcount_dec_and_test() only
    provides RELEASE ordering and control dependency on success
    vs. fully ordered atomic counterpart. It doesn't seem to
    matter in this case since object freeing happens under mutex
    lock anyway.

    Suggested-by: Kees Cook
    Reviewed-by: David Windsor
    Reviewed-by: Hans Liljestrand
    Signed-off-by: Elena Reshetova
    Signed-off-by: Trond Myklebust

    Elena Reshetova
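
The counter properties listed in the four conversions above (start at 1, free at zero, no increments after zero, protection against overflow and underflow) can be modelled as follows. This is a simplified sketch: `SaturatingRef` is a hypothetical class, the `limit` parameter exists only to make saturation easy to demonstrate, and the kernel's exact saturation value and warning behaviour differ between versions.

```python
class SaturatingRef:
    """Simplified model of a reference counter that refuses unsafe changes."""

    def __init__(self, limit=2**32 - 1):
        self.value = 1       # counters start at 1
        self.limit = limit   # hypothetical saturation point

    def inc(self):
        if self.value == 0:
            # The object may already have been freed.
            raise RuntimeError("increment on 0: possible use-after-free")
        if self.value < self.limit:
            self.value += 1  # otherwise stay saturated instead of wrapping

    def dec_and_test(self):
        """Return True when the caller should free the object."""
        if self.value == self.limit:
            return False     # saturated: leak the object rather than free it
        if self.value == 0:
            raise RuntimeError("decrement on 0: underflow")
        self.value -= 1
        return self.value == 0
```

The design choice being modelled is that a saturated counter leaks its object, which is preferable to wrapping around and freeing memory that is still referenced.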
     

28 Nov, 2017

2 commits

  • nlm_complain_hosts() walks through nlm_server_hosts hlist, which should
    be protected by nlm_host_mutex.

    Signed-off-by: Vasily Averin
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Vasily Averin
     
  • lockd_inet[6]addr_event use nlmsvc_rqst without taking nlmsvc_mutex, so
    nlmsvc_rqst can be changed while the notifiers run and crash the host.

    Patch enables access to nlmsvc_rqst only when it was correctly initialized
    and delays its cleanup until notifiers are no longer in use.

    Note that nlmsvc_rqst can be temporarily set to ERR_PTR, so the "if
    (nlmsvc_rqst)" check in notifiers is insufficient on its own.

    Signed-off-by: Vasily Averin
    Tested-by: Scott Mayhew
    Signed-off-by: J. Bruce Fields

    Vasily Averin