Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

14 Aug, 2014

1 commit

06b8ab552 Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
"Highlights include:

- stable fix for a bug in nfs3_list_one_acl()
- speed up NFS path walks by supporting LOOKUP_RCU
- more read/write code cleanups
- pNFS fixes for layout return on close
- fixes for the RCU handling in the rpcsec_gss code
- more NFS/RDMA fixes"

* tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
nfs: reject changes to resvport and sharecache during remount
NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
NFS: fix two problems in lookup_revalidate in RCU-walk
NFS: allow lockless access to access_cache
NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
NFS: support RCU_WALK in nfs_permission()
sunrpc/auth: allow lockless (rcu) lookup of credential cache.
NFS: prepare for RCU-walk support but pushing tests later in code.
NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
NFS: add checks for returned value of try_module_get()
nfs: clear_request_commit while holding i_lock
pnfs: add pnfs_put_lseg_async
pnfs: find swapped pages on pnfs commit lists too
nfs: fix comment and add warn_on for PG_INODE_REF
nfs: check wait_on_bit_lock err in page_group_lock
sunrpc: remove "ec" argument from encrypt_v2 operation
sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
...

Linus Torvalds
2014-08-14 08:13:19 +0800

10 Aug, 2014

1 commit

0d10c2c17 Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"This includes a major rewrite of the NFSv4 state code, which has
always depended on a single mutex. As an example, open creates are no
longer serialized, fixing a performance regression on NFSv3->NFSv4
upgrades. Thanks to Jeff, Trond, and Benny, and to Christoph for
review.

Also some RDMA fixes from Chuck Lever and Steve Wise, and
miscellaneous fixes from Kinglong Mee and others"

* 'for-3.17' of git://linux-nfs.org/~bfields/linux: (167 commits)
svcrdma: remove rdma_create_qp() failure recovery logic
nfsd: add some comments to the nfsd4 object definitions
nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers
nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net
nfsd: remove nfs4_lock_state: nfs4_laundromat
nfsd: Remove nfs4_lock_state(): reclaim_complete()
nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew
nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session()
nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm
nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn()
nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close
nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt()
nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner
nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid
nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op()
nfsd: remove old fault injection infrastructure
nfsd: add more granular locking to *_delegations fault injectors
nfsd: add more granular locking to forget_openowners fault injector
nfsd: add more granular locking to forget_locks fault injector
nfsd: add a list_head arg to nfsd_foreach_client_lock
...

Linus Torvalds
2014-08-10 05:31:18 +0800

06 Aug, 2014

1 commit

d1e458fe6 svcrdma: remove rdma_create_qp() failure recovery logic

In svc_rdma_accept(), if rdma_create_qp() fails, there is useless
logic to try and call rdma_create_qp() again with reduced sge depths.
The assumption, I guess, was that perhaps the initial sge depths
chosen were too big. However, the initial depths are selected based
on the rdma device attribute max_sge returned from ib_query_device().
If rdma_create_qp() fails, it would not be because the max_send_sge and
max_recv_sge values passed in exceed the device's max. So just remove
this code.

Signed-off-by: Steve Wise
Signed-off-by: J. Bruce Fields

Steve Wise
2014-08-06 04:09:21 +0800

04 Aug, 2014

8 commits

122a8cda6 SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred

current_cred() can only be changed by 'current', and
cred->group_info is never changed. If a new group_info is
needed, a new 'cred' is created.

Consequently it is always safe to access
current_cred()->group_info

without taking any further references.
So drop the refcounting and the incorrect rcu_dereference().

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2014-08-04 21:22:08 +0800

bd9560805 sunrpc/auth: allow lockless (rcu) lookup of credential cache.

The new flag RPCAUTH_LOOKUP_RCU to credential lookup avoids locking,
does not take a reference on the returned credential, and returns
-ECHILD if a simple lookup was not possible.

The returned value can only be used within an rcu_read_lock protected
region.

The main user of this is the new rpc_lookup_cred_nonblock() which
returns a pointer to the current credential which is only rcu-safe (no
ref-count held), and might return -ECHILD if allocation was required.
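The contract described above (no locking, no reference taken, -ECHILD when the fast path cannot answer) can be illustrated with a small userspace model. This is only a sketch: demo_cred, demo_cache, DEMO_LOOKUP_RCU and demo_lookup_cred() are hypothetical names invented for illustration and are not the sunrpc API, and real RCU protection is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_LOOKUP_RCU 0x01    /* hypothetical stand-in for RPCAUTH_LOOKUP_RCU */
#define DEMO_CACHE_SIZE 4

struct demo_cred {
    int uid;
    int refcount;
    int in_use;
};

/* Tiny fixed-size cache standing in for the credential hash table. */
static struct demo_cred demo_cache[DEMO_CACHE_SIZE];

/*
 * Look up a credential by uid.
 * With DEMO_LOOKUP_RCU: never allocate and never take a reference;
 * return -ECHILD when a simple lookup is not enough.
 * Without it: take a reference on a hit, allocate an entry on a miss.
 */
static int demo_lookup_cred(int uid, int flags, struct demo_cred **out)
{
    struct demo_cred *free_slot = NULL;
    size_t i;

    assert(out != NULL);
    *out = NULL;
    for (i = 0; i < DEMO_CACHE_SIZE; i++) {
        struct demo_cred *c = &demo_cache[i];

        if (c->in_use && c->uid == uid) {
            if (!(flags & DEMO_LOOKUP_RCU))
                c->refcount++;          /* only the slow path pins the entry */
            *out = c;
            return 0;
        }
        if (!c->in_use && free_slot == NULL)
            free_slot = c;
    }
    if (flags & DEMO_LOOKUP_RCU)
        return -ECHILD;                 /* caller must retry in blocking mode */
    if (free_slot == NULL)
        return -ENOMEM;
    free_slot->uid = uid;
    free_slot->refcount = 1;
    free_slot->in_use = 1;
    *out = free_slot;
    return 0;
}
```

A caller in this model tries the flagged lookup first and repeats it without the flag only when it sees -ECHILD, which mirrors how an RCU path walk falls back to the ref-counted walk.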

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust

NeilBrown
2014-08-04 05:14:12 +0800

ec25422c6 sunrpc: remove "ec" argument from encrypt_v2 operation

It's always 0.

Signed-off-by: Jeff Layton
Reviewed-by: Christoph Hellwig
Signed-off-by: Trond Myklebust

Jeff Layton
2014-08-04 05:05:24 +0800

b36e9c44a sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c

Fix the endianness handling in gss_wrap_kerberos_v1 and drop the memset
call there in favor of setting the filler bytes directly.

In gss_wrap_kerberos_v2, get rid of the "ec" variable which is always
zero, and drop the endianness conversion of 0. Sparse handles 0 as a
special case, so it's not necessary.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2014-08-04 05:05:24 +0800

6ac0fbbfc sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c

Use u16 pointer in setup_token and setup_token_v2. None of the fields
are actually handled as __be16, so this simplifies the code a bit. Also
get rid of some unneeded pointer increments.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2014-08-04 05:05:23 +0800

c5e6aecd0 sunrpc: fix RCU handling of gc_ctx field

The handling of the gc_ctx pointer only seems to be partially RCU-safe.
The assignment and freeing are done using RCU, but many places in the
code seem to dereference that pointer without proper RCU safeguards.

Fix them to use rcu_dereference and rcu_read_lock/unlock, and to
properly handle the case where the pointer is NULL.

Cc: Arnd Bergmann
Cc: Paul McKenney
Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2014-08-04 05:05:23 +0800

9806755c5 Merge branch 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma into linux-next

* 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (916 commits)
xprtrdma: Handle additional connection events
xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro
xprtrdma: Make rpcrdma_ep_disconnect() return void
xprtrdma: Schedule reply tasklet once per upcall
xprtrdma: Allocate each struct rpcrdma_mw separately
xprtrdma: Rename frmr_wr
xprtrdma: Disable completions for LOCAL_INV Work Requests
xprtrdma: Disable completions for FAST_REG_MR Work Requests
xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()
xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request
xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect
xprtrdma: Properly handle exhaustion of the rb_mws list
xprtrdma: Chain together all MWs in same buffer pool
xprtrdma: Back off rkey when FAST_REG_MR fails
xprtrdma: Unclutter struct rpcrdma_mr_seg
xprtrdma: Don't invalidate FRMRs if registration fails
xprtrdma: On disconnect, don't ignore pending CQEs
xprtrdma: Update rkeys after transport reconnect
xprtrdma: Limit data payload size for ALLPHYSICAL
xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs
...

Trond Myklebust
2014-08-04 05:04:51 +0800

bae6746ff SUNRPC: Enforce an upper limit on the number of cached credentials

In some cases where the credentials are not often reused, we may want
to limit their total number just in order to make the negative lookups
in the hash table more manageable.

Signed-off-by: Trond Myklebust

Trond Myklebust
2014-08-04 04:02:50 +0800

01 Aug, 2014

21 commits

8079fb785 xprtrdma: Handle additional connection events

Commit 38ca83a5 added RDMA_CM_EVENT_TIMEWAIT_EXIT. But that status
is relevant only for consumers that re-use their QPs on new
connections. xprtrdma creates a fresh QP on reconnection, so that
event should be explicitly ignored.

Squelch the alarming "unexpected CM event" message.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:59 +0800

a779ca5fa xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro

Clean up.

RPCRDMA_PERSISTENT_REGISTRATION was a compile-time switch between
RPCRDMA_REGISTER mode and RPCRDMA_ALLPHYSICAL mode. Since
RPCRDMA_REGISTER has been removed, there's no need for the extra
conditional compilation.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:59 +0800

282191cb7 xprtrdma: Make rpcrdma_ep_disconnect() return void

Clean up: The return code is used only for dprintk's that are
already redundant.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:58 +0800

bb96193d9 xprtrdma: Schedule reply tasklet once per upcall

Minor optimization: grab rpcrdma_tk_lock_g and disable hard IRQs
just once after clearing the receive completion queue.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:58 +0800

2e84522c2 xprtrdma: Allocate each struct rpcrdma_mw separately

Currently rpcrdma_buffer_create() allocates struct rpcrdma_mw's as
a single contiguous area of memory. It amounts to quite a bit of
memory, and there's no requirement for these to be carved from a
single piece of contiguous memory.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:57 +0800

f590e878c xprtrdma: Rename frmr_wr

Clean up: Name frmr_wr after the opcode of the Work Request,
consistent with the send and local invalidation paths.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:57 +0800

dab7e3b8d xprtrdma: Disable completions for LOCAL_INV Work Requests

Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_INVALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.
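The "set the state in advance, let only error completions correct it" idea can be shown with a tiny state machine. This is a userspace sketch under stated assumptions: the demo_ names are hypothetical, no Work Request is actually posted, and the enum only mirrors the three state names mentioned in these commit messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the three FRMR states named in the text. */
enum demo_frmr_state {
    DEMO_FRMR_IS_INVALID,
    DEMO_FRMR_IS_VALID,
    DEMO_FRMR_IS_STALE
};

struct demo_frmr {
    enum demo_frmr_state state;
};

/*
 * Model of posting an unsignaled LOCAL_INV: the new state is recorded
 * up front because no success completion will arrive to record it later.
 */
static void demo_post_local_inv(struct demo_frmr *f)
{
    assert(f != NULL);
    f->state = DEMO_FRMR_IS_INVALID;
}

/*
 * Model of the completion handler. In this sketch it runs only for
 * failed or flushed requests, so it always marks the FRMR stale.
 */
static void demo_error_completion(struct demo_frmr *f)
{
    assert(f != NULL);
    f->state = DEMO_FRMR_IS_STALE;
}
```

The design choice is a trade: successful requests generate no completion traffic at all, at the cost of the in-memory state being optimistic until an error says otherwise.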

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:57 +0800

050557220 xprtrdma: Disable completions for FAST_REG_MR Work Requests

Instead of relying on a completion to change the state of an FRMR
to FRMR_IS_VALID, set it in advance. If an error occurs, a completion
will fire anyway and mark the FRMR FRMR_IS_STALE.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:56 +0800

440ddad51 xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external()

Any FRMR arriving in rpcrdma_register_frmr_external() is now
guaranteed to be either invalid, or to be targeted by a queued
LOCAL_INV that will invalidate it before the adapter processes
the FAST_REG_MR being built here.

The problem with the current arrangement of chaining a LOCAL_INV to the
FAST_REG_MR is that if the transport is not connected, the LOCAL_INV
is flushed and the FAST_REG_MR is flushed. This leaves the FRMR
valid with the old rkey. But rpcrdma_register_frmr_external() has
already bumped the in-memory rkey.

Next time through rpcrdma_register_frmr_external(), a LOCAL_INV and
FAST_REG_MR are attempted again because the FRMR is still valid. But
the rkey no longer matches the hardware's rkey, and a memory
management operation error occurs.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:56 +0800

ddb6bebcc xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request

When a LOCAL_INV Work Request is flushed, it leaves an FRMR in the
VALID state. This FRMR can be returned by rpcrdma_buffer_get(), and
must be knocked down in rpcrdma_register_frmr_external() before it
can be re-used.

Instead, capture these in rpcrdma_buffer_get(), and reset them.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:55 +0800

9f9d802a2 xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect

FAST_REG_MR Work Requests update a Memory Region's rkey. Rkey's are
used to block unwanted access to the memory controlled by an MR. The
rkey is passed to the receiver (the NFS server, in our case), and is
also used by xprtrdma to invalidate the MR when the RPC is complete.

When a FAST_REG_MR Work Request is flushed after a transport
disconnect, xprtrdma cannot tell whether the WR actually hit the
adapter or not. So it is indeterminate at that point whether the
existing rkey is still valid.

After the transport connection is re-established, the next
FAST_REG_MR or LOCAL_INV Work Request against that MR can sometimes
fail because the rkey value does not match what xprtrdma expects.

The only reliable way to recover in this case is to deregister and
register the MR before it is used again. These operations can be
done only in a process context, so handle it in the transport
connect worker.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:55 +0800

c2922c023 xprtrdma: Properly handle exhaustion of the rb_mws list

If the rb_mws list is exhausted, clean up and return NULL so that
call_allocate() will delay and try again.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:55 +0800

3111d72c7 xprtrdma: Chain together all MWs in same buffer pool

During connection loss recovery, xprtrdma needs to visit every MW in a
buffer pool. Any MW that is in use by an RPC will not be on the
rb_mws list.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:54 +0800

c93e986a2 xprtrdma: Back off rkey when FAST_REG_MR fails

If posting a FAST_REG_MR Work Request fails, revert the rkey update
to avoid subsequent IB_WC_MW_BIND_ERR completions.
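The back-off can be sketched in userspace as "bump the key, try to post, restore the old key on failure". Everything named demo_ here is hypothetical, and the layout (the low 8 bits of the rkey acting as a consumer-owned key) is an assumption made for illustration rather than something this commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_mr {
    uint32_t rkey;
};

/* Replace the low 8 "key" bits of the rkey and keep the upper 24 bits. */
static void demo_update_key(struct demo_mr *mr, uint8_t newkey)
{
    mr->rkey = (mr->rkey & 0xffffff00u) | newkey;
}

/* Hypothetical callback that models posting the Work Request. */
typedef int (*demo_post_fn)(const struct demo_mr *mr);

static int demo_register(struct demo_mr *mr, demo_post_fn post)
{
    uint8_t old_key, new_key;
    int rc;

    assert(mr != NULL && post != NULL);
    old_key = (uint8_t)(mr->rkey & 0xffu);
    new_key = (uint8_t)(old_key + 1);       /* wraps from 0xff to 0x00 */
    demo_update_key(mr, new_key);

    rc = post(mr);
    if (rc != 0)
        demo_update_key(mr, old_key);       /* back off: hardware never saw new_key */
    return rc;
}

/* Two trivial posting functions for trying the sketch out. */
static int demo_post_ok(const struct demo_mr *mr)   { (void)mr; return 0; }
static int demo_post_fail(const struct demo_mr *mr) { (void)mr; return -1; }
```

Without the restore step, the in-memory rkey would run ahead of the one the adapter holds, which is the mismatch the commit sets out to avoid.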

Suggested-by: Steve Wise
Signed-off-by: Chuck Lever
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:54 +0800

0dbb4108a xprtrdma: Unclutter struct rpcrdma_mr_seg

Clean ups:
- make it obvious that the rl_mw field is a pointer -- allocated
separately, not as part of struct rpcrdma_mr_seg
- promote "struct {} frmr;" to a named type
- promote the state enum to a named type
- name the MW state field the same way other fields in
rpcrdma_mw are named

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:54 +0800

539431a43 xprtrdma: Don't invalidate FRMRs if registration fails

If FRMR registration fails, it's likely to transition the QP to the
error state. Or, registration may have failed because the QP is
_already_ in ERROR.

Thus calling rpcrdma_deregister_external() in
rpcrdma_create_chunks() is useless in FRMR mode: the LOCAL_INVs just
get flushed.

It is safe to leave existing registrations: when FRMR registration
is tried again, rpcrdma_register_frmr_external() checks if each FRMR
is already/still VALID, and knocks it down first if it is.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:53 +0800

a7bc211ac xprtrdma: On disconnect, don't ignore pending CQEs

xprtrdma is currently throwing away queued completions during
a reconnect. RPC replies posted just before connection loss, or
successful completions that change the state of an FRMR, can be
missed.

Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:53 +0800

6ab59945f xprtrdma: Update rkeys after transport reconnect

Various reports of:

rpcrdma_qp_async_error_upcall: QP error 3 on device mlx4_0
ep ffff8800bfd3e848

Ensure that rkeys in already-marshalled RPC/RDMA headers are
refreshed after the QP has been replaced by a reconnect.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=249
Suggested-by: Selvin Xavier
Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:53 +0800

43e959881 xprtrdma: Limit data payload size for ALLPHYSICAL

When the client uses physical memory registration, each page in the
payload gets its own array entry in the RPC/RDMA header's chunk list.

Therefore, don't advertise a maximum payload size that would require
more array entries than can fit in the RPC buffer where RPC/RDMA
headers are built.
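The limit amounts to simple arithmetic: one chunk-list entry per page, so the payload cannot exceed the number of entries that fit in the header buffer multiplied by the page size. The sketch below assumes hypothetical sizes and a hypothetical demo_max_payload() helper; the real buffer and entry sizes are not given in this commit message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * hdr_buf_bytes:   size of the buffer in which the header is built
 * fixed_hdr_bytes: header bytes that are not chunk-list entries
 * entry_bytes:     size of one chunk-list entry (one per page)
 * page_bytes:      size of one page of payload
 * wanted_bytes:    payload size the transport would otherwise advertise
 */
static size_t demo_max_payload(size_t hdr_buf_bytes, size_t fixed_hdr_bytes,
                               size_t entry_bytes, size_t page_bytes,
                               size_t wanted_bytes)
{
    size_t entries, limit;

    assert(entry_bytes > 0 && hdr_buf_bytes >= fixed_hdr_bytes);
    entries = (hdr_buf_bytes - fixed_hdr_bytes) / entry_bytes;
    limit = entries * page_bytes;       /* one page of payload per entry */
    return wanted_bytes < limit ? wanted_bytes : limit;
}
```

With illustrative numbers (a 1024-byte buffer, 32 fixed bytes, 16-byte entries, 4096-byte pages), 62 entries fit, so a requested 1 MiB payload would be clamped to 253952 bytes while a 64 KiB request would pass through unchanged.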

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=248
Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:52 +0800

73806c883 xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs

Ensure ia->ri_id remains valid while invoking dma_unmap_page() or
posting LOCAL_INV during a transport reconnect. Otherwise,
ia->ri_id->device or ia->ri_id->qp is NULL, which triggers a panic.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=259
Fixes: ec62f40 'xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting'
Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:52 +0800

5fc83f470 xprtrdma: Fix panic in rpcrdma_register_frmr_external()

seg1->mr_nsegs is not yet initialized when it is used to unmap
segments during an error exit. Use the same unmapping logic for
all error exits.

"if (frmr_wr.wr.fast_reg.length < len) {" used to be a BUG_ON check.
The broken code will never be executed under normal operation.

Fixes: c977dea (xprtrdma: Remove BUG_ON() call sites)
Signed-off-by: Chuck Lever
Tested-by: Steve Wise
Tested-by: Shirley Ma
Tested-by: Devesh Sharma
Signed-off-by: Anna Schumaker

Chuck Lever
2014-08-01 04:22:52 +0800

30 Jul, 2014

3 commits

518776800 SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2014-07-30 04:10:20 +0800

c7fb3f063 SUNRPC: svc_tcp_write_space: don't clear SOCK_NOSPACE prematurely

If requests are queued in the socket inbuffer waiting for an
svc_tcp_has_wspace() requirement to be satisfied, then we do not want
to clear the SOCK_NOSPACE flag until we've satisfied that requirement.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2014-07-30 04:10:19 +0800

0971374e2 SUNRPC: Reduce contention in svc_xprt_enqueue()

Ensure that all calls to svc_xprt_enqueue() except svc_xprt_received()
check the value of XPT_BUSY, before attempting to grab spinlocks etc.
This is to avoid situations such as the following "perf" trace,
which shows heavy contention on the pool spinlock:

    54.15%  nfsd  [kernel.kallsyms]  [k] _raw_spin_lock_bh
            |
            --- _raw_spin_lock_bh
               |
               |--71.43%-- svc_xprt_enqueue
               |          |
               |          |--50.31%-- svc_reserve
               |          |
               |          |--31.35%-- svc_xprt_received
               |          |
               |          |--18.34%-- svc_tcp_data_ready
...
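The "test the busy flag before touching the lock" pattern described above can be sketched in userspace with C11 atomics. The demo_ names are hypothetical, the spinlock is represented only by a counter, and this is not the svc_xprt code itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_xprt {
    atomic_bool busy;        /* stand-in for the XPT_BUSY flag */
    int lock_acquisitions;   /* counts trips through the "locked" slow path */
};

/* Returns true when this caller claimed the transport for processing. */
static bool demo_enqueue(struct demo_xprt *x)
{
    assert(x != NULL);

    /* Cheap unlocked test first: most callers bail out right here. */
    if (atomic_load(&x->busy))
        return false;

    /* The contended pool spinlock would be taken at this point. */
    x->lock_acquisitions++;

    /* Re-check and claim in one atomic step; another caller may have won. */
    if (atomic_exchange(&x->busy, true))
        return false;
    return true;
}
```

The early test is only an optimisation: correctness still rests on the atomic claim after it, so a stale read costs one unnecessary trip through the lock rather than a double enqueue.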

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2014-07-30 04:10:15 +0800

23 Jul, 2014

2 commits

e560e3b51 svcrdma: Add zero padding if the client doesn't send it

See RFC 5666 section 3.7: clients don't have to send zero XDR
padding.
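XDR rounds variable-length data up to a multiple of four bytes, so a receiver that gets an unpadded payload has to supply the missing zero bytes itself. The sketch below shows that computation in userspace; demo_xdr_pad_len() and demo_xdr_pad() are hypothetical helpers, not svcrdma functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of zero bytes needed to round len up to a multiple of 4. */
static size_t demo_xdr_pad_len(size_t len)
{
    return (4 - (len & 3)) & 3;
}

/*
 * Append the zero padding after len bytes of data in buf.
 * cap is the total capacity of buf; returns the padded length.
 */
static size_t demo_xdr_pad(unsigned char *buf, size_t len, size_t cap)
{
    size_t pad = demo_xdr_pad_len(len);

    assert(buf != NULL && len + pad <= cap);
    memset(buf + len, 0, pad);
    return len + pad;
}
```

For a 5-byte payload the helper adds three zero bytes and reports a padded length of 8; lengths that are already multiples of four are left untouched.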

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=246
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2014-07-23 04:40:21 +0800

bf858ab0a xprtrdma: Fix DMA-API-DEBUG warning by checking dma_map result

Fix the following warning when DMA-API debug is enabled by checking ib_dma_map_single result:
[ 1455.345548] ------------[ cut here ]------------
[ 1455.346863] WARNING: CPU: 3 PID: 3929 at /home/yanb/kernel/net-next/lib/dma-debug.c:1140 check_unmap+0x4e5/0x990()
[ 1455.349350] mlx4_core 0000:00:07.0: DMA-API: device driver failed to check map error[device address=0x000000007c9f2090] [size=2656 bytes] [mapped as single]
[ 1455.349350] Modules linked in: xprtrdma netconsole configfs nfsv3 nfs_acl ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm autofs4 auth_rpcgss oid_registry nfsv4 nfs fscache lockd sunrpc dm_mirror dm_region_hash dm_log microcode pcspkr mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_en ipv6 ptp pps_core vxlan mlx4_core virtio_balloon cirrus ttm drm_kms_helper drm sysimgblt sysfillrect syscopyarea i2c_piix4 i2c_core button ext3 jbd virtio_blk virtio_net virtio_pci virtio_ring virtio uhci_hcd ata_generic ata_piix libata
[ 1455.349350] CPU: 3 PID: 3929 Comm: mount.nfs Not tainted 3.15.0-rc1-dbg+ #13
[ 1455.349350] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 1455.349350] 0000000000000474 ffff880069dcf628 ffffffff8151c341 ffffffff817b69d8
[ 1455.349350] ffff880069dcf678 ffff880069dcf668 ffffffff8105b5fc 0000000069dcf658
[ 1455.349350] ffff880069dcf778 ffff88007b0c9f00 ffffffff8255ec40 0000000000000a60
[ 1455.349350] Call Trace:
[ 1455.349350] dump_stack+0x52/0x81
[ 1455.349350] warn_slowpath_common+0x8c/0xc0
[ 1455.349350] warn_slowpath_fmt+0x46/0x50
[ 1455.349350] check_unmap+0x4e5/0x990
[ 1455.349350] ? _raw_spin_unlock_irq+0x30/0x60
[ 1455.349350] debug_dma_unmap_page+0x5a/0x60
[ 1455.349350] rpcrdma_deregister_internal+0xb3/0xd0 [xprtrdma]
[ 1455.349350] rpcrdma_buffer_destroy+0x69/0x170 [xprtrdma]
[ 1455.349350] xprt_rdma_destroy+0x3f/0xb0 [xprtrdma]
[ 1455.349350] xprt_destroy+0x6f/0x80 [sunrpc]
[ 1455.349350] xprt_put+0x15/0x20 [sunrpc]
[ 1455.349350] rpc_free_client+0x8a/0xe0 [sunrpc]
[ 1455.349350] rpc_release_client+0x68/0xa0 [sunrpc]
[ 1455.349350] rpc_shutdown_client+0xb0/0xc0 [sunrpc]
[ 1455.349350] ? rpc_ping+0x5d/0x70 [sunrpc]
[ 1455.349350] rpc_create_xprt+0xbb/0xd0 [sunrpc]
[ 1455.349350] rpc_create+0xb3/0x160 [sunrpc]
[ 1455.349350] ? __probe_kernel_read+0x69/0xb0
[ 1455.349350] nfs_create_rpc_client+0xdc/0x100 [nfs]
[ 1455.349350] nfs_init_client+0x3a/0x90 [nfs]
[ 1455.349350] nfs_get_client+0x478/0x5b0 [nfs]
[ 1455.349350] ? nfs_get_client+0x100/0x5b0 [nfs]
[ 1455.349350] ? kmem_cache_alloc_trace+0x24d/0x260
[ 1455.349350] nfs_create_server+0xf3/0x4c0 [nfs]
[ 1455.349350] ? nfs_request_mount+0xf0/0x1a0 [nfs]
[ 1455.349350] nfs3_create_server+0x13/0x30 [nfsv3]
[ 1455.349350] nfs_try_mount+0x1f3/0x230 [nfs]
[ 1455.349350] ? get_parent_ip+0x11/0x50
[ 1455.349350] ? __this_cpu_preempt_check+0x13/0x20
[ 1455.349350] ? try_module_get+0x6b/0x190
[ 1455.349350] nfs_fs_mount+0x187/0x9d0 [nfs]
[ 1455.349350] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1455.349350] ? nfs_auth_info_match+0x40/0x40 [nfs]
[ 1455.349350] mount_fs+0x20/0xe0
[ 1455.349350] vfs_kern_mount+0x76/0x160
[ 1455.349350] do_mount+0x428/0xae0
[ 1455.349350] SyS_mount+0x90/0xe0
[ 1455.349350] system_call_fastpath+0x16/0x1b
[ 1455.349350] ---[ end trace f1f31572972e211d ]---

Signed-off-by: Yan Burman
Reviewed-by: Chuck Lever
Signed-off-by: Anna Schumaker

Yan Burman
2014-07-23 01:55:30 +0800

18 Jul, 2014

2 commits

22cb43855 SUNRPC: xdr_get_next_encode_buffer should be declared static

Quell another sparse warning.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2014-07-18 23:35:46 +0800

3c45ddf82 svcrdma: Select NFSv4.1 backchannel transport based on forward channel

The current code always selects XPRT_TRANSPORT_BC_TCP for the back
channel, even when the forward channel was not TCP (e.g., RDMA). When
a 4.1 mount is attempted with RDMA, the server panics in the TCP BC
code when trying to send CB_NULL.

Instead, construct the transport protocol number from the forward
channel transport or'd with XPRT_TRANSPORT_BC. Transports that do
not support bi-directional RPC will not have registered a "BC"
transport, causing create_backchannel_client() to fail immediately.
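The selection rule (forward-channel identifier or'd with a backchannel bit, then looked up among registered transports) can be modelled in a few lines. The constants and the registration table below are illustrative assumptions with hypothetical demo_ names; they are not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative identifiers; the values are assumptions made for this sketch. */
#define DEMO_TRANSPORT_BC   (UINT32_C(1) << 31)
#define DEMO_TRANSPORT_TCP  UINT32_C(6)
#define DEMO_TRANSPORT_RDMA UINT32_C(256)

/* Transports "registered" in this model: only TCP has a backchannel variant. */
static const uint32_t demo_registered[] = {
    DEMO_TRANSPORT_TCP,
    DEMO_TRANSPORT_RDMA,
    DEMO_TRANSPORT_TCP | DEMO_TRANSPORT_BC,
};

/* Derive the backchannel identifier from the forward-channel identifier. */
static uint32_t demo_bc_ident(uint32_t forward)
{
    assert((forward & DEMO_TRANSPORT_BC) == 0);
    return forward | DEMO_TRANSPORT_BC;
}

/* Returns 0 if a matching backchannel transport is registered, else -1. */
static int demo_create_backchannel(uint32_t forward)
{
    uint32_t want = demo_bc_ident(forward);
    size_t i;

    for (i = 0; i < sizeof(demo_registered) / sizeof(demo_registered[0]); i++)
        if (demo_registered[i] == want)
            return 0;
    return -1;      /* fail cleanly instead of borrowing the TCP code path */
}
```

In this model a TCP forward channel finds its backchannel, while an RDMA forward channel fails the lookup up front, which is the behaviour the commit prefers over a panic in the TCP backchannel code.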

Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2014-07-18 23:35:45 +0800

16 Jul, 2014

1 commit

c1221321b sched: Allow wait_on_bit_action() functions to support a timeout

It is currently not possible for various wait_on_bit functions
to implement a timeout.

While the "action" function that is called to do the waiting
could certainly use schedule_timeout(), there is no way to carry
forward the remaining timeout after a false wake-up.
As false wake-ups are clearly possible, at least due to possible
hash collisions in bit_waitqueue(), this is a real problem.

The 'action' function is currently passed a pointer to the word
containing the bit being waited on. No current action functions
use this pointer. So changing it to something else will be a
little noisy but will have no immediate effect.

This patch changes the 'action' function to take a pointer to
the "struct wait_bit_key", which contains a pointer to the word
containing the bit so nothing is really lost.

It also adds a 'private' field to "struct wait_bit_key", which
is initialized to zero.

An action function can now implement a timeout with something
like

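    static int timed_out_waiter(struct wait_bit_key *key)
    {
        unsigned long waited;

        /* First call: record the start time; 0 means "not yet set". */
        if (key->private == 0) {
            key->private = jiffies;
            /* Avoid the sentinel if jiffies happens to be 0. */
            if (key->private == 0)
                key->private -= 1;
        }
        waited = jiffies - key->private;
        if (waited > 10 * HZ)
            return -EAGAIN;
        /* Sleep only for the part of the 10 second budget that remains. */
        schedule_timeout(10 * HZ - waited);
        return 0;
    }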

If any other need for context in a waiter were found it would be
easy to use ->private for some other purpose, or even extend
"struct wait_bit_key".
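The bookkeeping that lets a timeout survive false wake-ups can be tried out in userspace. In this sketch *start plays the role of key->private and now plays the role of jiffies; demo_remaining() is a hypothetical helper and does no sleeping of its own.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Returns the number of ticks still left to sleep, or -1 once the
 * budget has been used up. *start == 0 means "no start time recorded".
 */
static long demo_remaining(unsigned long *start, unsigned long now,
                           unsigned long budget)
{
    unsigned long waited;

    assert(start != NULL && budget <= (unsigned long)LONG_MAX);
    if (*start == 0) {
        *start = now;
        if (*start == 0)
            *start -= 1;        /* keep 0 reserved as the "unset" marker */
    }
    waited = now - *start;      /* unsigned arithmetic tolerates counter wrap */
    if (waited > budget)
        return -1;
    return (long)(budget - waited);
}
```

Because the start time lives in the caller-owned word rather than on the stack of the action function, every wake-up, genuine or not, resumes with the time already spent taken into account.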

My particular need is to support timeouts in nfs_release_page()
to avoid deadlocks with loopback mounted NFS.

While wait_on_bit_timeout() would be a cleaner interface, it
will not meet my need. I need the timeout to be sensitive to
the state of the connection with the server, which could change.
So I need to use an 'action' interface.

Signed-off-by: NeilBrown
Acked-by: Peter Zijlstra
Cc: Oleg Nesterov
Cc: Steve French
Cc: David Howells
Cc: Steven Whitehouse
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
Signed-off-by: Ingo Molnar

NeilBrown
2014-07-16 21:10:41 +0800