14 Aug, 2014

1 commit

  • Pull NFS client updates from Trond Myklebust:
    "Highlights include:

    - stable fix for a bug in nfs3_list_one_acl()
    - speed up NFS path walks by supporting LOOKUP_RCU
    - more read/write code cleanups
    - pNFS fixes for layout return on close
    - fixes for the RCU handling in the rpcsec_gss code
    - more NFS/RDMA fixes"

    * tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
    nfs: reject changes to resvport and sharecache during remount
    NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
    SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
    NFS: fix two problems in lookup_revalidate in RCU-walk
    NFS: allow lockless access to access_cache
    NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
    NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
    NFS: support RCU_WALK in nfs_permission()
    sunrpc/auth: allow lockless (rcu) lookup of credential cache.
    NFS: prepare for RCU-walk support but pushing tests later in code.
    NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
    NFS: add checks for returned value of try_module_get()
    nfs: clear_request_commit while holding i_lock
    pnfs: add pnfs_put_lseg_async
    pnfs: find swapped pages on pnfs commit lists too
    nfs: fix comment and add warn_on for PG_INODE_REF
    nfs: check wait_on_bit_lock err in page_group_lock
    sunrpc: remove "ec" argument from encrypt_v2 operation
    sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
    sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
    ...

    Linus Torvalds
     

10 Aug, 2014

1 commit

  • Pull nfsd updates from Bruce Fields:
    "This includes a major rewrite of the NFSv4 state code, which has
    always depended on a single mutex. As an example, open creates are no
    longer serialized, fixing a performance regression on NFSv3->NFSv4
    upgrades. Thanks to Jeff, Trond, and Benny, and to Christoph for
    review.

    Also some RDMA fixes from Chuck Lever and Steve Wise, and
    miscellaneous fixes from Kinglong Mee and others"

    * 'for-3.17' of git://linux-nfs.org/~bfields/linux: (167 commits)
    svcrdma: remove rdma_create_qp() failure recovery logic
    nfsd: add some comments to the nfsd4 object definitions
    nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers
    nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net
    nfsd: remove nfs4_lock_state: nfs4_laundromat
    nfsd: Remove nfs4_lock_state(): reclaim_complete()
    nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew
    nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session()
    nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm
    nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn()
    nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close
    nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt()
    nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner
    nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid
    nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op()
    nfsd: remove old fault injection infrastructure
    nfsd: add more granular locking to *_delegations fault injectors
    nfsd: add more granular locking to forget_openowners fault injector
    nfsd: add more granular locking to forget_locks fault injector
    nfsd: add a list_head arg to nfsd_foreach_client_lock
    ...

    Linus Torvalds
     

06 Aug, 2014

1 commit

  • In svc_rdma_accept(), if rdma_create_qp() fails, there is useless
    logic to try and call rdma_create_qp() again with reduced sge depths.
    The assumption, I guess, was that perhaps the initial sge depths
    chosen were too big. However, the initial depths are selected based
    on the rdma device attribute max_sge returned from ib_query_device().
    If rdma_create_qp() fails, it would not be because the max_send_sge and
    max_recv_sge values passed in exceed the device's max. So just remove
    this code.

    Signed-off-by: Steve Wise
    Signed-off-by: J. Bruce Fields

    Steve Wise
     

04 Aug, 2014

8 commits

  • current_cred() can only be changed by 'current', and
    cred->group_info is never changed. If a new group_info is
    needed, a new 'cred' is created.

    Consequently it is always safe to access
    current_cred()->group_info

    without taking any further references.
    So drop the refcounting and the incorrect rcu_dereference().

    Signed-off-by: NeilBrown
    Signed-off-by: Trond Myklebust

    NeilBrown
     
  • The new flag RPCAUTH_LOOKUP_RCU to credential lookup avoids locking,
    does not take a reference on the returned credential, and returns
    -ECHILD if a simple lookup was not possible.

    The returned value can only be used within an rcu_read_lock protected
    region.

    The main user of this is the new rpc_lookup_cred_nonblock() which
    returns a pointer to the current credential which is only rcu-safe (no
    ref-count held), and might return -ECHILD if allocation was required.

    Signed-off-by: NeilBrown
    Signed-off-by: Trond Myklebust

    NeilBrown
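The lookup contract described in the entry above can be modelled in user space. This is a minimal sketch, not the kernel code: `demo_cred`, `demo_cache`, `demo_lookup_cred()`, `demo_get_cred()` and `DEMO_LOOKUP_RCU` are hypothetical stand-ins, and the retry-in-blocking-mode fallback is an assumption about how a caller might react to -ECHILD.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_LOOKUP_RCU 0x1   /* hypothetical flag modelling RPCAUTH_LOOKUP_RCU */
#define DEMO_SLOTS      4

/* Hypothetical credential; not the kernel's struct rpc_cred. */
struct demo_cred {
    int in_use;
    int uid;
    int refcount;
};

/* One credential is already cached; the other slots are free. */
static struct demo_cred demo_cache[DEMO_SLOTS] = {
    { .in_use = 1, .uid = 1000, .refcount = 0 },
};

/*
 * With DEMO_LOOKUP_RCU set: never allocate, never take a reference,
 * and return -ECHILD when the credential is not already cached.
 * Without it: take a reference, allocating a slot on a miss.
 */
static int demo_lookup_cred(int uid, int flags, struct demo_cred **out)
{
    struct demo_cred *free_slot = NULL;
    size_t i;

    assert(out != NULL);
    for (i = 0; i < DEMO_SLOTS; i++) {
        struct demo_cred *c = &demo_cache[i];

        if (c->in_use && c->uid == uid) {
            if (!(flags & DEMO_LOOKUP_RCU))
                c->refcount++;
            *out = c;
            return 0;
        }
        if (!c->in_use && free_slot == NULL)
            free_slot = c;
    }

    if (flags & DEMO_LOOKUP_RCU)
        return -ECHILD;          /* a simple lookup was not possible */
    if (free_slot == NULL)
        return -ENOMEM;

    free_slot->in_use = 1;
    free_slot->uid = uid;
    free_slot->refcount = 1;
    *out = free_slot;
    return 0;
}

/* Assumed caller pattern: try the lockless path, then fall back. */
static int demo_get_cred(int uid, struct demo_cred **out)
{
    int err = demo_lookup_cred(uid, DEMO_LOOKUP_RCU, out);

    if (err == -ECHILD)
        err = demo_lookup_cred(uid, 0, out);
    return err;
}
```

In this model a lockless hit leaves the reference count untouched, which is why the result would only be usable inside the read-side critical section.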
     
  • It's always 0.

    Signed-off-by: Jeff Layton
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Fix the endianness handling in gss_wrap_kerberos_v1 and drop the memset
    call there in favor of setting the filler bytes directly.

    In gss_wrap_kerberos_v2, get rid of the "ec" variable which is always
    zero, and drop the endianness conversion of 0. Sparse handles 0 as a
    special case, so it's not necessary.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • Use u16 pointer in setup_token and setup_token_v2. None of the fields
    are actually handled as __be16, so this simplifies the code a bit. Also
    get rid of some unneeded pointer increments.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • The handling of the gc_ctx pointer only seems to be partially RCU-safe.
    The assignment and freeing are done using RCU, but many places in the
    code seem to dereference that pointer without proper RCU safeguards.

    Fix them to use rcu_dereference under rcu_read_lock/unlock, and to
    properly handle the case where the pointer is NULL.

    Cc: Arnd Bergmann
    Cc: Paul McKenney
    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     
  • * 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (916 commits)
    xprtrdma: Handle additional connection events
    xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro
    xprtrdma: Make rpcrdma_ep_disconnect() return void
    xprtrdma: Schedule reply tasklet once per upcall
    xprtrdma: Allocate each struct rpcrdma_mw separately
    xprtrdma: Rename frmr_wr
    xprtrdma: Disable completions for LOCAL_INV Work Requests
    xprtrdma: Disable completions for FAST_REG_MR Work Requests
    xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()
    xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request
    xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect
    xprtrdma: Properly handle exhaustion of the rb_mws list
    xprtrdma: Chain together all MWs in same buffer pool
    xprtrdma: Back off rkey when FAST_REG_MR fails
    xprtrdma: Unclutter struct rpcrdma_mr_seg
    xprtrdma: Don't invalidate FRMRs if registration fails
    xprtrdma: On disconnect, don't ignore pending CQEs
    xprtrdma: Update rkeys after transport reconnect
    xprtrdma: Limit data payload size for ALLPHYSICAL
    xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs
    ...

    Trond Myklebust
     
  • In some cases where the credentials are not often reused, we may want
    to limit their total number just in order to make the negative lookups
    in the hash table more manageable.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

01 Aug, 2014

21 commits

  • Commit 38ca83a5 added RDMA_CM_EVENT_TIMEWAIT_EXIT. But that status
    is relevant only for consumers that re-use their QPs on new
    connections. xprtrdma creates a fresh QP on reconnection, so that
    event should be explicitly ignored.

    Squelch the alarming "unexpected CM event" message.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up.

    RPCRDMA_PERSISTENT_REGISTRATION was a compile-time switch between
    RPCRDMA_REGISTER mode and RPCRDMA_ALLPHYSICAL mode. Since
    RPCRDMA_REGISTER has been removed, there's no need for the extra
    conditional compilation.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up: The return code is used only for dprintk's that are
    already redundant.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Minor optimization: grab rpcrdma_tk_lock_g and disable hard IRQs
    just once after clearing the receive completion queue.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Currently rpcrdma_buffer_create() allocates struct rpcrdma_mw's as
    a single contiguous area of memory. It amounts to quite a bit of
    memory, and there's no requirement for these to be carved from a
    single piece of contiguous memory.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Clean up: Name frmr_wr after the opcode of the Work Request,
    consistent with the send and local invalidation paths.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Instead of relying on a completion to change the state of an FRMR
    to FRMR_IS_INVALID, set it in advance. If an error occurs, a completion
    will fire anyway and mark the FRMR FRMR_IS_STALE.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Instead of relying on a completion to change the state of an FRMR
    to FRMR_IS_VALID, set it in advance. If an error occurs, a completion
    will fire anyway and mark the FRMR FRMR_IS_STALE.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
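The two entries above describe the same idea for LOCAL_INV and FAST_REG_MR: set the FRMR state when the Work Request is posted, and let only an error completion override it. A minimal user-space sketch of that state handling follows; the `demo_` names are hypothetical, and only the three state names come from the commit messages.

```c
#include <assert.h>
#include <stddef.h>

/* State names follow the commit text; the enum itself is a stand-in. */
enum demo_frmr_state {
    DEMO_FRMR_IS_INVALID,
    DEMO_FRMR_IS_VALID,
    DEMO_FRMR_IS_STALE,
};

struct demo_frmr {
    enum demo_frmr_state state;
};

/* Posting FAST_REG_MR: mark the FRMR valid in advance. */
static void demo_post_fast_reg(struct demo_frmr *f)
{
    assert(f != NULL);
    f->state = DEMO_FRMR_IS_VALID;
}

/* Posting LOCAL_INV: mark the FRMR invalid in advance. */
static void demo_post_local_inv(struct demo_frmr *f)
{
    assert(f != NULL);
    f->state = DEMO_FRMR_IS_INVALID;
}

/*
 * Completion handler. Successful completions no longer need to touch
 * the state; a failed or flushed Work Request marks the FRMR stale.
 */
static void demo_completion(struct demo_frmr *f, int success)
{
    assert(f != NULL);
    if (!success)
        f->state = DEMO_FRMR_IS_STALE;
}
```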
     
  • Any FRMR arriving in rpcrdma_register_frmr_external() is now
    guaranteed to be either invalid, or to be targeted by a queued
    LOCAL_INV that will invalidate it before the adapter processes
    the FAST_REG_MR being built here.

    The problem with the current arrangement of chaining a LOCAL_INV to the
    FAST_REG_MR is that if the transport is not connected, the LOCAL_INV
    is flushed and the FAST_REG_MR is flushed. This leaves the FRMR
    valid with the old rkey. But rpcrdma_register_frmr_external() has
    already bumped the in-memory rkey.

    Next time through rpcrdma_register_frmr_external(), a LOCAL_INV and
    FAST_REG_MR are attempted again because the FRMR is still valid. But
    the rkey no longer matches the hardware's rkey, and a memory
    management operation error occurs.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • When a LOCAL_INV Work Request is flushed, it leaves an FRMR in the
    VALID state. This FRMR can be returned by rpcrdma_buffer_get(), and
    must be knocked down in rpcrdma_register_frmr_external() before it
    can be re-used.

    Instead, capture these in rpcrdma_buffer_get(), and reset them.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • FAST_REG_MR Work Requests update a Memory Region's rkey. Rkeys are
    used to block unwanted access to the memory controlled by an MR. The
    rkey is passed to the receiver (the NFS server, in our case), and is
    also used by xprtrdma to invalidate the MR when the RPC is complete.

    When a FAST_REG_MR Work Request is flushed after a transport
    disconnect, xprtrdma cannot tell whether the WR actually hit the
    adapter or not. So it is indeterminate at that point whether the
    existing rkey is still valid.

    After the transport connection is re-established, the next
    FAST_REG_MR or LOCAL_INV Work Request against that MR can sometimes
    fail because the rkey value does not match what xprtrdma expects.

    The only reliable way to recover in this case is to deregister and
    register the MR before it is used again. These operations can be
    done only in a process context, so handle it in the transport
    connect worker.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • If the rb_mws list is exhausted, clean up and return NULL so that
    call_allocate() will delay and try again.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • During connection loss recovery, xprtrdma needs to visit every MW in a
    buffer pool. Any MW that is in use by an RPC will not be on the
    rb_mws list.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • If posting a FAST_REG_MR Work Request fails, revert the rkey update
    to avoid subsequent IB_WC_MW_BIND_ERR completions.

    Suggested-by: Steve Wise
    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Chuck Lever
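The back-off described in the entry above can be sketched as follows. This is a user-space model under stated assumptions: `demo_mr`, `demo_update_key()` and `demo_register()` are hypothetical, the variable part of the rkey is assumed to be its low 8 bits, and `post_ok` merely simulates whether posting the Work Request succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory region holding only the in-memory rkey. */
struct demo_mr {
    uint32_t rkey;
};

/* Replace the low 8 bits of the rkey (assumed to be the variable part). */
static void demo_update_key(struct demo_mr *mr, uint8_t newkey)
{
    assert(mr != NULL);
    mr->rkey = (mr->rkey & 0xffffff00u) | newkey;
}

/*
 * Bump the rkey, then "post" the registration. If posting fails,
 * restore the previous key so the in-memory rkey still matches
 * what the adapter knows about.
 */
static int demo_register(struct demo_mr *mr, int post_ok)
{
    uint8_t key;

    assert(mr != NULL);
    key = (uint8_t)(mr->rkey & 0xffu);
    demo_update_key(mr, (uint8_t)(key + 1));
    if (!post_ok) {
        demo_update_key(mr, key);   /* back off the rkey update */
        return -1;
    }
    return 0;
}
```

Without the restore step, a failed post would leave the in-memory key one step ahead of the hardware, which is the mismatch the entry associates with later IB_WC_MW_BIND_ERR completions.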
     
  • Clean ups:
    - make it obvious that the rl_mw field is a pointer -- allocated
    separately, not as part of struct rpcrdma_mr_seg
    - promote "struct {} frmr;" to a named type
    - promote the state enum to a named type
    - name the MW state field the same way other fields in
    rpcrdma_mw are named

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • If FRMR registration fails, it's likely to transition the QP to the
    error state. Or, registration may have failed because the QP is
    _already_ in ERROR.

    Thus calling rpcrdma_deregister_external() in
    rpcrdma_create_chunks() is useless in FRMR mode: the LOCAL_INVs just
    get flushed.

    It is safe to leave existing registrations: when FRMR registration
    is tried again, rpcrdma_register_frmr_external() checks if each FRMR
    is already/still VALID, and knocks it down first if it is.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • xprtrdma is currently throwing away queued completions during
    a reconnect. RPC replies posted just before connection loss, or
    successful completions that change the state of an FRMR, can be
    missed.

    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • Various reports of:

    rpcrdma_qp_async_error_upcall: QP error 3 on device mlx4_0
    ep ffff8800bfd3e848

    Ensure that rkeys in already-marshalled RPC/RDMA headers are
    refreshed after the QP has been replaced by a reconnect.

    BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=249
    Suggested-by: Selvin Xavier
    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • When the client uses physical memory registration, each page in the
    payload gets its own array entry in the RPC/RDMA header's chunk list.

    Therefore, don't advertise a maximum payload size that would require
    more array entries than can fit in the RPC buffer where RPC/RDMA
    headers are built.

    BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=248
    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
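The limit in the entry above is simple arithmetic: one chunk-list entry per page, so the payload cannot exceed the number of entries that fit in the header buffer times the page size. The sketch below illustrates that calculation only; the function names and the sizes used in the note after it are assumptions, not values taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest payload that physical registration can describe when every
 * page needs its own chunk-list entry.
 *
 * buf_bytes   - room available for chunk-list entries (assumed input)
 * entry_bytes - size of one chunk-list entry (assumed input)
 * page_bytes  - page size
 */
static size_t demo_physical_max_payload(size_t buf_bytes,
                                        size_t entry_bytes,
                                        size_t page_bytes)
{
    assert(entry_bytes > 0);
    return (buf_bytes / entry_bytes) * page_bytes;
}

/* Never advertise more than the chunk list can describe. */
static size_t demo_clamp_payload(size_t requested, size_t max_payload)
{
    return requested < max_payload ? requested : max_payload;
}
```

For example, with an assumed 1024 bytes of room, 16-byte entries and 4096-byte pages, 64 entries fit, so the advertised maximum would be 64 * 4096 = 262144 bytes.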
     
  • Ensure ia->ri_id remains valid while invoking dma_unmap_page() or
    posting LOCAL_INV during a transport reconnect. Otherwise,
    ia->ri_id->device or ia->ri_id->qp is NULL, which triggers a panic.

    BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=259
    Fixes: ec62f40 'xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting'
    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     
  • seg1->mr_nsegs is not yet initialized when it is used to unmap
    segments during an error exit. Use the same unmapping logic for
    all error exits.

    "if (frmr_wr.wr.fast_reg.length < len) {" used to be a BUG_ON check.
    The broken code will never be executed under normal operation.

    Fixes: c977dea (xprtrdma: Remove BUG_ON() call sites)
    Signed-off-by: Chuck Lever
    Tested-by: Steve Wise
    Tested-by: Shirley Ma
    Tested-by: Devesh Sharma
    Signed-off-by: Anna Schumaker

    Chuck Lever
     

30 Jul, 2014

3 commits


23 Jul, 2014

2 commits

  • See RFC 5666 section 3.7: clients don't have to send zero XDR
    padding.

    BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=246
    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
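For context, XDR encodes variable-length data in units of four bytes, so a payload whose length is not a multiple of four is followed by one to three zero pad bytes. A receiver that tolerates a client omitting that padding has to be prepared to see either length. The helpers below are a hypothetical illustration of the arithmetic, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Number of zero bytes XDR appends after len bytes of data (0 to 3). */
static size_t demo_xdr_pad_len(size_t len)
{
    return (4 - (len & 3)) & 3;
}

/* Length of the data once rounded up to the next 4-byte boundary. */
static size_t demo_xdr_padded_len(size_t len)
{
    return len + demo_xdr_pad_len(len);
}

/* A received length is acceptable with or without the trailing pad. */
static int demo_len_acceptable(size_t data_len, size_t received_len)
{
    assert(received_len <= data_len + 3);
    return received_len == data_len ||
           received_len == demo_xdr_padded_len(data_len);
}
```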
     
  • Fix the following warning when DMA-API debug is enabled by checking ib_dma_map_single result:
    [ 1455.345548] ------------[ cut here ]------------
    [ 1455.346863] WARNING: CPU: 3 PID: 3929 at /home/yanb/kernel/net-next/lib/dma-debug.c:1140 check_unmap+0x4e5/0x990()
    [ 1455.349350] mlx4_core 0000:00:07.0: DMA-API: device driver failed to check map error[device address=0x000000007c9f2090] [size=2656 bytes] [mapped as single]
    [ 1455.349350] Modules linked in: xprtrdma netconsole configfs nfsv3 nfs_acl ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm autofs4 auth_rpcgss oid_registry nfsv4 nfs fscache lockd sunrpc dm_mirror dm_region_hash dm_log microcode pcspkr mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_en ipv6 ptp pps_core vxlan mlx4_core virtio_balloon cirrus ttm drm_kms_helper drm sysimgblt sysfillrect syscopyarea i2c_piix4 i2c_core button ext3 jbd virtio_blk virtio_net virtio_pci virtio_ring virtio uhci_hcd ata_generic ata_piix libata
    [ 1455.349350] CPU: 3 PID: 3929 Comm: mount.nfs Not tainted 3.15.0-rc1-dbg+ #13
    [ 1455.349350] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
    [ 1455.349350] 0000000000000474 ffff880069dcf628 ffffffff8151c341 ffffffff817b69d8
    [ 1455.349350] ffff880069dcf678 ffff880069dcf668 ffffffff8105b5fc 0000000069dcf658
    [ 1455.349350] ffff880069dcf778 ffff88007b0c9f00 ffffffff8255ec40 0000000000000a60
    [ 1455.349350] Call Trace:
    [ 1455.349350] [] dump_stack+0x52/0x81
    [ 1455.349350] [] warn_slowpath_common+0x8c/0xc0
    [ 1455.349350] [] warn_slowpath_fmt+0x46/0x50
    [ 1455.349350] [] check_unmap+0x4e5/0x990
    [ 1455.349350] [] ? _raw_spin_unlock_irq+0x30/0x60
    [ 1455.349350] [] debug_dma_unmap_page+0x5a/0x60
    [ 1455.349350] [] rpcrdma_deregister_internal+0xb3/0xd0 [xprtrdma]
    [ 1455.349350] [] rpcrdma_buffer_destroy+0x69/0x170 [xprtrdma]
    [ 1455.349350] [] xprt_rdma_destroy+0x3f/0xb0 [xprtrdma]
    [ 1455.349350] [] xprt_destroy+0x6f/0x80 [sunrpc]
    [ 1455.349350] [] xprt_put+0x15/0x20 [sunrpc]
    [ 1455.349350] [] rpc_free_client+0x8a/0xe0 [sunrpc]
    [ 1455.349350] [] rpc_release_client+0x68/0xa0 [sunrpc]
    [ 1455.349350] [] rpc_shutdown_client+0xb0/0xc0 [sunrpc]
    [ 1455.349350] [] ? rpc_ping+0x5d/0x70 [sunrpc]
    [ 1455.349350] [] rpc_create_xprt+0xbb/0xd0 [sunrpc]
    [ 1455.349350] [] rpc_create+0xb3/0x160 [sunrpc]
    [ 1455.349350] [] ? __probe_kernel_read+0x69/0xb0
    [ 1455.349350] [] nfs_create_rpc_client+0xdc/0x100 [nfs]
    [ 1455.349350] [] nfs_init_client+0x3a/0x90 [nfs]
    [ 1455.349350] [] nfs_get_client+0x478/0x5b0 [nfs]
    [ 1455.349350] [] ? nfs_get_client+0x100/0x5b0 [nfs]
    [ 1455.349350] [] ? kmem_cache_alloc_trace+0x24d/0x260
    [ 1455.349350] [] nfs_create_server+0xf3/0x4c0 [nfs]
    [ 1455.349350] [] ? nfs_request_mount+0xf0/0x1a0 [nfs]
    [ 1455.349350] [] nfs3_create_server+0x13/0x30 [nfsv3]
    [ 1455.349350] [] nfs_try_mount+0x1f3/0x230 [nfs]
    [ 1455.349350] [] ? get_parent_ip+0x11/0x50
    [ 1455.349350] [] ? __this_cpu_preempt_check+0x13/0x20
    [ 1455.349350] [] ? try_module_get+0x6b/0x190
    [ 1455.349350] [] nfs_fs_mount+0x187/0x9d0 [nfs]
    [ 1455.349350] [] ? nfs_clone_super+0x140/0x140 [nfs]
    [ 1455.349350] [] ? nfs_auth_info_match+0x40/0x40 [nfs]
    [ 1455.349350] [] mount_fs+0x20/0xe0
    [ 1455.349350] [] vfs_kern_mount+0x76/0x160
    [ 1455.349350] [] do_mount+0x428/0xae0
    [ 1455.349350] [] SyS_mount+0x90/0xe0
    [ 1455.349350] [] system_call_fastpath+0x16/0x1b
    [ 1455.349350] ---[ end trace f1f31572972e211d ]---

    Signed-off-by: Yan Burman
    Reviewed-by: Chuck Lever
    Signed-off-by: Anna Schumaker

    Yan Burman
     

18 Jul, 2014

2 commits

  • Quell another sparse warning.

    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Trond Myklebust
     
  • The current code always selects XPRT_TRANSPORT_BC_TCP for the back
    channel, even when the forward channel was not TCP (e.g., RDMA). When
    a 4.1 mount is attempted with RDMA, the server panics in the TCP BC
    code when trying to send CB_NULL.

    Instead, construct the transport protocol number from the forward
    channel transport or'd with XPRT_TRANSPORT_BC. Transports that do
    not support bi-directional RPC will not have registered a "BC"
    transport, causing create_backchannel_client() to fail immediately.

    Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265
    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
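The selection logic in the entry above can be sketched in user space. The constants below are illustrative assumptions that mirror the names in the commit message (a flag bit OR'd onto the forward channel's protocol number); the registry check is a hypothetical stand-in for looking up a registered transport class.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values; treat them as assumptions, not kernel definitions. */
#define DEMO_XPRT_TRANSPORT_BC    (1u << 31)
#define DEMO_XPRT_TRANSPORT_TCP   6u
#define DEMO_XPRT_TRANSPORT_RDMA  256u

/* Derive the back channel identifier from the forward channel. */
static unsigned int demo_bc_ident(unsigned int forward_ident)
{
    return forward_ident | DEMO_XPRT_TRANSPORT_BC;
}

/* In this model only TCP has registered a "BC" transport. */
static int demo_bc_registered(unsigned int ident)
{
    return ident == (DEMO_XPRT_TRANSPORT_TCP | DEMO_XPRT_TRANSPORT_BC);
}

/*
 * Fails immediately when the forward transport has no matching
 * back channel transport, instead of falling through to TCP code.
 */
static int demo_create_backchannel(unsigned int forward_ident)
{
    if (!demo_bc_registered(demo_bc_ident(forward_ident)))
        return -1;
    return 0;
}
```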
     

16 Jul, 2014

1 commit

  • It is currently not possible for various wait_on_bit functions
    to implement a timeout.

    While the "action" function that is called to do the waiting
    could certainly use schedule_timeout(), there is no way to carry
    forward the remaining timeout after a false wake-up.
    As false-wakeups are clearly possible at least due to possible
    hash collisions in bit_waitqueue(), this is a real problem.

    The 'action' function is currently passed a pointer to the word
    containing the bit being waited on. No current action functions
    use this pointer. So changing it to something else will be a
    little noisy but will have no immediate effect.

    This patch changes the 'action' function to take a pointer to
    the "struct wait_bit_key", which contains a pointer to the word
    containing the bit so nothing is really lost.

    It also adds a 'private' field to "struct wait_bit_key", which
    is initialized to zero.

    An action function can now implement a timeout with something
    like

    static int timed_out_waiter(struct wait_bit_key *key)
    {
        unsigned long waited;

        /* First call: record the start time; 0 is reserved for "unset". */
        if (key->private == 0) {
            key->private = jiffies;
            if (key->private == 0)
                key->private -= 1;
        }
        waited = jiffies - key->private;
        if (waited > 10 * HZ)
            return -EAGAIN;
        /* Sleep only for what remains of the 10 second budget. */
        schedule_timeout(10 * HZ - waited);
        return 0;
    }

    If any other need for context in a waiter were found it would be
    easy to use ->private for some other purpose, or even extend
    "struct wait_bit_key".

    My particular need is to support timeouts in nfs_release_page()
    to avoid deadlocks with loopback mounted NFS.

    While wait_on_bit_timeout() would be a cleaner interface, it
    will not meet my need. I need the timeout to be sensitive to
    the state of the connection with the server, which could change.
    So I need to use an 'action' interface.

    Signed-off-by: NeilBrown
    Acked-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Steve French
    Cc: David Howells
    Cc: Steven Whitehouse
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
    Signed-off-by: Ingo Molnar

    NeilBrown
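To see how the remaining timeout is carried across a false wake-up, the waiter from the entry above can be modelled in user space. This is a sketch under stated assumptions: `demo_jiffies` is a fake clock advanced by hand, `demo_last_sleep` records what would have been passed to schedule_timeout(), `private_data` stands in for the new ->private field, and the 10 second budget and `DEMO_HZ` value are illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_HZ 100UL                   /* assumed ticks per second */

/* Stand-in for struct wait_bit_key; only the new field is modelled. */
struct demo_wait_key {
    unsigned long private_data;         /* 0 means "start time not set" */
};

static unsigned long demo_jiffies;      /* fake clock, advanced by the caller */
static unsigned long demo_last_sleep;   /* what schedule_timeout() would get */

static int demo_timed_waiter(struct demo_wait_key *key)
{
    unsigned long waited;

    assert(key != NULL);
    if (key->private_data == 0) {
        key->private_data = demo_jiffies;
        if (key->private_data == 0)
            key->private_data -= 1;     /* keep 0 reserved for "unset" */
    }
    waited = demo_jiffies - key->private_data;
    if (waited > 10 * DEMO_HZ)
        return -EAGAIN;                 /* budget used up */
    demo_last_sleep = 10 * DEMO_HZ - waited;
    return 0;
}
```

Because the start time lives in the key rather than in the action function, each call after a false wake-up sleeps only for the time that is left.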