29 Jun, 2013
2 commits
-
The GETDEVICEINFO gdia_maxcount represents all of the data being returned
within the GETDEVICEINFO4resok structure and includes the XDR overhead.The CREATE_SESSION ca_maxresponsesize is the maximum reply and includes the RPC
headers (including security flavor credentials and verifiers).Split out the struct pnfs_device field maxcount which is the gdia_maxcount
from the pglen field which is the reply (the total) buffer length.Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust -
Fallback should happen only when the request_key() call fails, because
this indicates that there was a problem running the nfsidmap program.
We shouldn't call the legacy code if the error was elsewhere.Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust
20 Jun, 2013
1 commit
-
We need to ensure that we clear NFS4_SLOT_TBL_DRAINING on the back
channel when we're done recovering the session.Regression introduced by commit 774d5f14e (NFSv4.1 Fix a pNFS session
draining deadlock)Signed-off-by: Andy Adamson
[Trond: Changed order to start back-channel first. Minor code cleanup]
Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org [>=3.10]
19 Jun, 2013
5 commits
-
Give them names that are a bit more consistent with the general
pNFS naming scheme.- lo_seg_contained -> pnfs_lseg_range_contained
- lo_seg_intersecting -> pnfs_lseg_range_intersecting
- cmp_layout -> pnfs_lseg_range_cmp
- is_matching_lseg -> pnfs_lseg_range_matchSigned-off-by: Trond Myklebust
-
Also strip off the unnecessary 'inline' declarations.
Signed-off-by: Trond Myklebust
-
The other protocols don't use it, so make it local to NFSv4, and
remove the EXPORT.
Also ensure that we only compile in cache_lib.o if we're using
the legacy DNS resolver.Signed-off-by: Trond Myklebust
Cc: Bryan Schumaker -
We had a report of a reproducible WARNING:
[ 1360.039358] ------------[ cut here ]------------
[ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
[ 1360.049880] Hardware name: HP Z200 Workstation
[ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
auth_rpcgss]
[ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G I -------------- 3.9.0-0.55.el7.x86_64 #1
[ 1360.116771] Call Trace:
[ 1360.119219] [] warn_slowpath_common+0x70/0xa0
[ 1360.125208] [] warn_slowpath_null+0x1a/0x20
[ 1360.131025] [] d_set_d_op+0x8d/0xc0
[ 1360.136159] [] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
[ 1360.143710] [] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
[ 1360.150311] [] nfs_idmap_new+0x96/0x130 [nfsv4]
[ 1360.156475] [] nfs4_init_client+0xad/0x2d0 [nfsv4]
[ 1360.162902] [] ? idr_get_empty_slot+0x16f/0x3c0
[ 1360.169062] [] ? idr_mark_full+0x52/0x60
[ 1360.174615] [] ? idr_alloc+0x79/0xe0
[ 1360.179826] [] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
[ 1360.187635] [] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
[ 1360.194493] [] nfs_get_client+0x27a/0x350 [nfs]
[ 1360.200666] [] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
[ 1360.207624] [] nfs4_create_server+0xf3/0x3a0 [nfsv4]
[ 1360.214222] [] nfs4_remote_mount+0x2e/0x60 [nfsv4]
[ 1360.220644] [] mount_fs+0x39/0x1b0
[ 1360.225691] [] ? __alloc_percpu+0x10/0x20
[ 1360.231348] [] vfs_kern_mount+0x5f/0xf0
[ 1360.236822] [] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[ 1360.243246] [] nfs4_try_mount+0x44/0xc0 [nfsv4]
[ 1360.249410] [] ? get_nfs_version+0x27/0x80 [nfs]
[ 1360.255659] [] nfs_fs_mount+0x5c5/0xd10 [nfs]
[ 1360.261650] [] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1360.268074] [] ? param_set_portnr+0x60/0x60 [nfs]
[ 1360.274406] [] mount_fs+0x39/0x1b0
[ 1360.279443] [] ? __alloc_percpu+0x10/0x20
[ 1360.285088] [] vfs_kern_mount+0x5f/0xf0
[ 1360.290556] [] do_mount+0x1fd/0xa00
[ 1360.295677] [] ? __get_free_pages+0xe/0x50
[ 1360.301405] [] ? copy_mount_options+0x36/0x170
[ 1360.307479] [] sys_mount+0x83/0xc0
[ 1360.312515] [] system_call_fastpath+0x16/0x1b
[ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---The problem is that we're ending up in __rpc_lookup_create_exclusive
with a negative dentry that already has d_op set. A little debugging
has shown that when we hit this, the d_ops are already set to
simple_dentry_operations.I believe that what's happening is that during a mount, idmapd is racing
in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
Before that dentry reference is released, the kernel races in to create
that file and finds the new negative dentry, which already has the
d_op set.This patch just avoids setting the d_op if it's already set.
simple_dentry_operations and rpc_dentry_operations are functionally
equivalent so it shouldn't matter which one it's set to.Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust -
Make sure that NFSv4 SETCLIENTID does not parse the NETID as a
format string.Signed-off-by: Djalal Harouni
Signed-off-by: Trond Myklebust
07 Jun, 2013
17 commits
-
State recovery currently relies on being able to find a valid
nfs_open_context in the inode->open_files list.
We therefore need to put the nfs_open_context on the list while
we're still protected by the sp->so_reclaim_seqcount in order
to avoid reboot races.Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust
-
Instead of having the callers set ctx->state, do it inside
_nfs4_open_and_get_state.Signed-off-by: Trond Myklebust
-
All the callers have an open_context at this point, and since we always
need one in order to do state recovery, it makes sense to use it as the
basis for the nfs4_do_open() call.Signed-off-by: Trond Myklebust
-
We already check the EXEC access mode in the lower layers.
Signed-off-by: Trond Myklebust
-
The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable()
once we get past the test for out_of_line_wait_on_bit() returning
ERESTARTSYS.Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust
-
If the rpc_task is asynchronous, it could theoretically finish executing
on the workqueue it was assigned by rpc_make_runnable() before we get
round to testing RPC_IS_ASYNC() in rpc_execute.In practice, however, all the existing callers hold a reference to the
rpc_task, so this can't happen today...Signed-off-by: Trond Myklebust
-
ctx->cred == ctx->state->owner->so_cred, so let's just use the former.
Signed-off-by: Trond Myklebust
-
Use the EXCHGID4_FLAG_BIND_PRINC_STATEID exchange_id flag to enable
stateid protection. This means that if we create a stateid using a
particular principal, then we must use the same principal if we
want to change that state.
IOW: if we OPEN a file using a particular credential, then we have
to use the same credential in subsequent OPEN_DOWNGRADE, CLOSE,
or DELEGRETURN operations that use that stateid.Signed-off-by: Trond Myklebust
-
This is not strictly needed, since get_deviceinfo is not allowed to
return NFS4ERR_ACCESS or NFS4ERR_WRONG_CRED, but lets do it anyway
for consistency with other pNFS operations.Signed-off-by: Trond Myklebust
-
Signed-off-by: Trond Myklebust
-
We want to use the same credential for reclaim_complete as we used
for the exchange_id call.Signed-off-by: Trond Myklebust
-
We need to use the same credential as was used for the layoutget
and/or layoutcommit operations.Signed-off-by: Trond Myklebust
-
Ensure that we use the same credential for layoutget, layoutcommit and
layoutreturn.Signed-off-by: Trond Myklebust
31 May, 2013
1 commit
-
Darrick J. Wong reports:
> I have a kvm-based testing setup that netboots VMs over NFS, the
> client end of which seems to have broken somehow in 3.10-rc1. The
> server's exports file looks like this:
>
> /storage/mtr/x64 192.168.122.0/24(ro,sync,no_root_squash,no_subtree_check)
>
> On the client end (inside the VM), the initrd runs the following
> command to try to mount the rootfs over NFS:
>
> # mount -o nolock -o ro -o retrans=10 192.168.122.1:/storage/mtr/x64/ /root
>
> (Note: This is the busybox mount command.)
>
> The mount fails with -EINVAL.Commit 4580a92d44 "NFS: Use server-recommended security flavor by
default (NFSv3)" introduced a behavior regression for NFS mounts
done via a legacy binary mount(2) call.Ensure that a default security flavor is specified for legacy binary
mount requests, since they do not invoke nfs_select_flavor() in the
kernel.Busybox uses klibc's nfsmount command, which performs NFS mounts
using the legacy binary mount data format. /sbin/mount.nfs is not
affected by this regression.Reported-by: Darrick J. Wong
Signed-off-by: Chuck Lever
Tested-by: Darrick J. Wong
Acked-by: Weston Andros Adamson
Signed-off-by: Trond Myklebust
30 May, 2013
1 commit
-
We need to pass the full open mode flags to nfs_may_open() when doing
a delegated open.Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org
24 May, 2013
1 commit
-
Commit 79d852bf "NFS: Retry SETCLIENTID with AUTH_SYS instead of
AUTH_NONE" did not take into account commit 23631227 "NFSv4: Fix the
fallback to AUTH_NULL if krb5i is not available".Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust
23 May, 2013
1 commit
-
The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to
be careful about ordering the calls to rpc_test_and_set_running(task) and
rpc_clear_queued(task). If we get the order wrong, then we may end up
testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped
and changed the state of the rpc_task.Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org
21 May, 2013
1 commit
-
On a CB_RECALL the callback service thread flushes the inode using
filemap_flush prior to scheduling the state manager thread to return the
delegation. When pNFS is used and I/O has not yet gone to the data server
servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
filemap_flush call, the LAYOUTGET must proceed to completion.If the state manager starts to recover data while the inode flush is sending
the LAYOUTGET, a deadlock occurs as the callback service thread holds the
single callback session slot until the flushing is done which blocks the state
manager thread, and the state manager thread has set the session draining bit
which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
table waitq.Separate the draining of the back channel from the draining of the fore channel
by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
and back slot tables. Drain the back channel first allowing the LAYOUTGET
call to proceed (and fail) so the callback service thread frees the callback
slot. Then proceed with draining the forechannel.Signed-off-by: Andy Adamson
Signed-off-by: Trond Myklebust
16 May, 2013
3 commits
-
This seems to have been overlooked when we did the namespace
conversion. If a container is running a legacy version of rpc.gssd
then it will be disrupted if the global 'pipe_version' is set by a
container running the new version of rpc.gssd.Signed-off-by: Trond Myklebust
-
Recent changes to the NFS security flavour negotiation mean that
we have a stronger dependency on rpc.gssd. If the latter is not
running, because the user failed to start it, then we time out
and mark the container as not having an instance. We then
use that information to time out faster the next time.If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
then we mark the container as having an rpc.gssd instance.Signed-off-by: Trond Myklebust
-
If wait_event_interruptible_timeout() is successful, it returns
the number of seconds remaining until the timeout. In that
case, we should be retrying the upcall.Signed-off-by: Trond Myklebust
12 May, 2013
6 commits
-
Pull tracing/kprobes update from Steven Rostedt:
"The majority of these changes are from Masami Hiramatsu bringing
kprobes up to par with the latest changes to ftrace (multi buffering
and the new function probes).He also discovered and fixed some bugs in doing so. When pulling in
his patches, I also found a few minor bugs as well and fixed them.This also includes a compile fix for some archs that select the ring
buffer but not tracing.I based this off of the last patch you took from me that fixed the
merge conflict error, as that was the commit that had all the changes
I needed for this set of changes."* tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Support soft-mode disabling
tracing/kprobes: Support ftrace_event_file base multibuffer
tracing/kprobes: Pass trace_probe directly from dispatcher
tracing/kprobes: Increment probe hit-count even if it is used by perf
tracing/kprobes: Use bool for retprobe checker
ftrace: Fix function probe when more than one probe is added
ftrace: Fix the output of enabled_functions debug file
ftrace: Fix locking in register_ftrace_function_probe()
tracing: Add helper function trace_create_new_event() to remove duplicate code
tracing: Modify soft-mode only if there's no other referrer
tracing: Indicate enabled soft-mode in enable file
tracing/kprobes: Fix to increment return event probe hit-count
ftrace: Cleanup regex_lock and ftrace_lock around hash updating
ftrace, kprobes: Fix a deadlock on ftrace_regex_lock
ftrace: Have ftrace_regex_write() return either read or error
tracing: Return error if register_ftrace_function_probe() fails for event_enable_func()
tracing: Don't succeed if event_enable_func did not register anything
ring-buffer: Select IRQ_WORK -
…nux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
- More fixes in the vCPU PVHVM hotplug path.
- Add more documentation.
- Fix various ARM related issues in the Xen generic drivers.
- Updates in the xen-pciback driver per Bjorn's updates.
- Mask the x2APIC feature for PV guests.* tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pci: Used cached MSI-X capability offset
xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST
xen: mask x2APIC feature in PV
xen: SWIOTLB is only used on x86
xen/spinlock: Fix check from greater than to be also be greater or equal to.
xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info
xen/vcpu: Document the xen_vcpu_info and xen_vcpu
xen/vcpu/pvhvm: Fix vcpu hotplugging hanging. -
Pull second SCSI update from James "Jaj B" Bottomley:
"This is the final round of SCSI patches for the merge window. It
consists mostly of driver updates (bnx2fc, ibmfc, fnic, lpfc,
be2iscsi, pm80xx, qla4x and ipr).There's also the power management updates that complete the patches in
Jens' tree, an iscsi refcounting problem fix from the last pull, some
dif handling in scsi_debug fixes, a few nice code cleanups and an
error handling busy bug fix."* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (92 commits)
[SCSI] qla2xxx: Update firmware link in Kconfig file.
[SCSI] iscsi class, qla4xxx: fix sess/conn refcounting when find fns are used
[SCSI] sas: unify the pointlessly separated enums sas_dev_type and sas_device_type
[SCSI] pm80xx: thermal, sas controller config and error handling update
[SCSI] pm80xx: NCQ error handling changes
[SCSI] pm80xx: WWN Modification for PM8081/88/89 controllers
[SCSI] pm80xx: Changed module name and debug messages update
[SCSI] pm80xx: Firmware flash memory free fix, with addition of new memory region for it
[SCSI] pm80xx: SPC new firmware changes for device id 0x8081 alone
[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files
[SCSI] pm80xx: MSI-X implementation for using 64 interrupts
[SCSI] pm80xx: Updated common functions common for SPC and SPCv/ve
[SCSI] pm80xx: Multiple inbound/outbound queue configuration
[SCSI] pm80xx: Added SPCv/ve specific ids, variables and modify for SPC
[SCSI] lpfc: fix up Kconfig dependencies
[SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd
[SCSI] sd: change to auto suspend mode
[SCSI] sd: use REQ_PM in sd's runtime suspend operation
[SCSI] qla4xxx: Fix iocb_cnt calculation in qla4xxx_send_mbox_iocb()
[SCSI] ufs: Correct the expected data transfersize
... -
Pull idle update from Len Brown:
"Add support for new Haswell-ULT CPU idle power states"* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
intel_idle: initial C8, C9, C10 support
tools/power turbostat: display C8, C9, C10 residency -
Pull audit changes from Eric Paris:
"Al used to send pull requests every couple of years but he told me to
just start pushing them to you directly.Our touching outside of core audit code is pretty straight forward. A
couple of interface changes which hit net/. A simple argument bug
calling audit functions in namei.c and the removal of some assembly
branch prediction code on ppc"* git://git.infradead.org/users/eparis/audit: (31 commits)
audit: fix message spacing printing auid
Revert "audit: move kaudit thread start from auditd registration to kaudit init"
audit: vfs: fix audit_inode call in O_CREAT case of do_last
audit: Make testing for a valid loginuid explicit.
audit: fix event coverage of AUDIT_ANOM_LINK
audit: use spin_lock in audit_receive_msg to process tty logging
audit: do not needlessly take a lock in tty_audit_exit
audit: do not needlessly take a spinlock in copy_signal
audit: add an option to control logging of passwords with pam_tty_audit
audit: use spin_lock_irqsave/restore in audit tty code
helper for some session id stuff
audit: use a consistent audit helper to log lsm information
audit: push loginuid and sessionid processing down
audit: stop pushing loginid, uid, sessionid as arguments
audit: remove the old depricated kernel interface
audit: make validity checking generic
audit: allow checking the type of audit message in the user filter
audit: fix build break when AUDIT_DEBUG == 2
audit: remove duplicate export of audit_enabled
Audit: do not print error when LSMs disabled
...
11 May, 2013
1 commit
-
Pull nfsd fixes from Bruce Fields:
"Small fixes for two bugs and two warnings"* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
SUNRPC: fix decoding of optional gss-proxy xdr fields
SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
nfsd4: don't allow owner override on 4.1 CLAIM_FH opens