Doug / smarc-fsl-linux-kernel | Embedian Git Server

12 Jul, 2013

2 commits

1466b77a7 Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs ... Browse Code »

Pull second set of NFS client updates from Trond Myklebust:
"This mainly contains some small readdir optimisations that had
dependencies on Al Viro's readdir rewrite. There is also a fix for a
nasty deadlock which surfaced earlier in this merge window.

Highlights include:
- Fix an_rpc pipefs regression that causes a deadlock on mount
- Readdir optimisations by Scott Mayhew and Jeff Layton
- clean up the rpc_pipefs dentry operation setup"

* tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Fix a deadlock in rpc_client_register()
rpc_pipe: rpc_dir_inode_operations can be static
NFS: Allow nfs_updatepage to extend a write under additional circumstances
NFS: Make nfs_readdir revalidate less often
NFS: Make nfs_attribute_cache_expired() non-static
rpc_pipe: set dentry operations at d_alloc time
nfs: set verifier on existing dentries in nfs_prime_dcache

Linus Torvalds
2013-07-12 03:11:35 +0800
0ff08ba5d Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd changes from Bruce Fields:
"Changes this time include:

- 4.1 enabled on the server by default: the last 4.1-specific issues
I know of are fixed, so we're not going to find the rest of the
bugs without more exposure.
- Experimental support for NFSv4.2 MAC Labeling (to allow running
selinux over NFS), from Dave Quigley.
- Fixes for some delicate cache/upcall races that could cause rare
server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
debugging persistence.
- Fixes for some bugs found at the recent NFS bakeathon, mostly v4
and v4.1-specific, but also a generic bug handling fragmented rpc
calls"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
nfsd4: support minorversion 1 by default
nfsd4: allow destroy_session over destroyed session
svcrpc: fix failures to handle -1 uid's
sunrpc: Don't schedule an upcall on a replaced cache entry.
net/sunrpc: xpt_auth_cache should be ignored when expired.
sunrpc/cache: ensure items removed from cache do not have pending upcalls.
sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
sunrpc/cache: remove races with queuing an upcall.
nfsd4: return delegation immediately if lease fails
nfsd4: do not throw away 4.1 lock state on last unlock
nfsd4: delegation-based open reclaims should bypass permissions
svcrpc: don't error out on small tcp fragment
svcrpc: fix handling of too-short rpc's
nfsd4: minor read_buf cleanup
nfsd4: fix decoding of compounds across page boundaries
nfsd4: clean up nfs4_open_delegation
NFSD: Don't give out read delegations on creates
nfsd4: allow client to send no cb_sec flavors
nfsd4: fail attempts to request gss on the backchannel
nfsd4: implement minimal SP4_MACH_CRED
...

Linus Torvalds
2013-07-12 01:17:13 +0800

11 Jul, 2013

1 commit

eeee24526 SUNRPC: Fix a deadlock in rpc_client_register() ... Browse Code »

Commit 384816051ca9125cd54750e59c780c2a2655fa4f (SUNRPC: fix races on
PipeFS MOUNT notifications) introduces a regression when we call
rpc_setup_pipedir() with RPCSEC_GSS as the auth flavour.

By calling rpcauth_create() while holding the sn->pipefs_sb_lock, we
end up deadlocking in gss_pipes_dentries_create_net().
Fix is to register the client and release the mutex before calling
rpcauth_create().

Reported-by: Weston Andros Adamson
Tested-by: Weston Andros Adamson
Cc: Stanislav Kinsbursky
Cc: # : 3848160: SUNRPC: fix races on PipeFS MOUNT
Cc: # : e73f4cc: SUNRPC: split client creation
Signed-off-by: Trond Myklebust

Trond Myklebust
2013-07-11 03:58:55 +0800

10 Jul, 2013

4 commits

4f8568cb5 rpc_pipe: rpc_dir_inode_operations can be static ... Browse Code »

Hi Jeff,

FYI, there are new sparse warnings show up in

tree: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git nfs-for-next
head: 296afe1f58d55fd56ed85daaafafcfee39f59ece
commit: 76fa66657900071016f2bae61de28f059f3f2abf [2/5] rpc_pipe: set dentry operations at d_alloc time

>> net/sunrpc/rpc_pipe.c:496:31: sparse: symbol 'rpc_dir_inode_operations' was not declared. Should it be static?

Please consider folding the attached diff :-)

Signed-off-by: Fengguang Wu
Signed-off-by: Trond Myklebust

Fengguang Wu
2013-07-10 09:35:27 +0800
496322bc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:
"This is a re-do of the net-next pull request for the current merge
window. The only difference from the one I made the other day is that
this has Eliezer's interface renames and the timeout handling changes
made based upon your feedback, as well as a few bug fixes that have
trickeled in.

Highlights:

1) Low latency device polling, eliminating the cost of interrupt
handling and context switches. Allows direct polling of a network
device from socket operations, such as recvmsg() and poll().

Currently ixgbe, mlx4, and bnx2x support this feature.

Full high level description, performance numbers, and design in
commit 0a4db187a999 ("Merge branch 'll_poll'")

From Eliezer Tamir.

2) With the routing cache removed, ip_check_mc_rcu() gets exercised
more than ever before in the case where we have lots of multicast
addresses. Use a hash table instead of a simple linked list, from
Eric Dumazet.

3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

4) Support reporting the TUN device persist flag to userspace, from
Pavel Emelyanov.

5) Allow controlling network device VF link state using netlink, from
Rony Efraim.

6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
Daniel Borkmann and Eric Dumazet.

8) Allow controlling of TCP quickack behavior on a per-route basis,
from Cong Wang.

9) Several bug fixes and improvements to vxlan from Stephen
Hemminger, Pravin B Shelar, and Mike Rapoport. In particular,
support receiving on multiple UDP ports.

10) Major cleanups, particular in the area of debugging and cookie
lifetime handline, to the SCTP protocol code. From Daniel
Borkmann.

11) Allow packets to cross network namespaces when traversing tunnel
devices. From Nicolas Dichtel.

12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
manner akin to how we monitor real network traffic via ptype_all.
From Daniel Borkmann.

13) Several bug fixes and improvements for the new alx device driver,
from Johannes Berg.

14) Fix scalability issues in the netem packet scheduler's time queue,
by using an rbtree. From Eric Dumazet.

15) Several bug fixes in TCP loss recovery handling, from Yuchung
Cheng.

16) Add support for GSO segmentation of MPLS packets, from Simon
Horman.

17) Make network notifiers have a real data type for the opaque
pointer that's passed into them. Use this to properly handle
network device flag changes in arp_netdev_event(). From Jiri
Pirko and Timo Teräs.

18) Convert several drivers over to module_pci_driver(), from Peter
Huewe.

19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
O(1) calculation instead. From Eric Dumazet.

20) Support setting of explicit tunnel peer addresses in ipv6, just
like ipv4. From Nicolas Dichtel.

21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

22) Prevent a single high rate flow from overruning an individual cpu
during RX packet processing via selective flow shedding. From
Willem de Bruijn.

23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
Dumazet.

24) Don't just drop GSO packets which are above the TBF scheduler's
burst limit, chop them up so they are in-bounds instead. Also
from Eric Dumazet.

25) VLAN offloads are missed when configured on top of a bridge, fix
from Vlad Yasevich.

26) Support IPV6 in ping sockets. From Lorenzo Colitti.

27) Receive flow steering targets should be updated at poll() time
too, from David Majnemer.

28) Fix several corner case regressions in PMTU/redirect handling due
to the routing cache removal, from Timo Teräs.

29) We have to be mindful of ipv4 mapped ipv6 sockets in
upd_v6_push_pending_frames(). From Hannes Frederic Sowa.

30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
drivers/net: caif: fix wrong rtnl_is_locked() usage
drivers/net: enic: release rtnl_lock on error-path
vhost-net: fix use-after-free in vhost_net_flush
net: mv643xx_eth: do not use port number as platform device id
net: sctp: confirm route during forward progress
virtio_net: fix race in RX VQ processing
virtio: support unlocked queue poll
net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
Documentation: Fix references to defunct linux-net@vger.kernel.org
net/fs: change busy poll time accounting
net: rename low latency sockets functions to busy poll
bridge: fix some kernel warning in multicast timer
sfc: Fix memory leak when discarding scattered packets
sit: fix tunnel update via netlink
dt:net:stmmac: Add dt specific phy reset callback support.
dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
dt:net:stmmac: Allocate platform data only if its NULL.
net:stmmac: fix memleak in the open method
ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
net: ipv6: fix wrong ping_v6_sendmsg return value
...

Linus Torvalds
2013-07-10 09:24:39 +0800
76fa66657 rpc_pipe: set dentry operations at d_alloc time ... Browse Code »

Currently the way these get set is a little convoluted. If the dentry is
allocated via lookup from userland, then it gets set by simple_lookup.
If it gets allocated when the kernel is populating the directory, then
it gets set via __rpc_lookup_create_exclusive, which has to check
whether they might already be set. Between both of these, this ensures
that all dentries have their d_op pointer set.

Instead of doing that, just have them set at d_alloc time by pointing
sb->s_d_op at them. With that change, we no longer want the lookup op
to set them, so we must move to using our own lookup routine.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2013-07-10 05:16:39 +0800
be0c5d8c0 Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs ... Browse Code »

Pull NFS client updates from Trond Myklebust:
"Feature highlights include:
- Add basic client support for NFSv4.2
- Add basic client support for Labeled NFS (selinux for NFSv4.2)
- Fix the use of credentials in NFSv4.1 stateful operations, and add
support for NFSv4.1 state protection.

Bugfix highlights:
- Fix another NFSv4 open state recovery race
- Fix an NFSv4.1 back channel session regression
- Various rpc_pipefs races
- Fix another issue with NFSv3 auth negotiation

Please note that Labeled NFS does require some additional support from
the security subsystem. The relevant changesets have all been
reviewed and acked by James Morris."

* tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
nfs: have NFSv3 try server-specified auth flavors in turn
nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
nfs: move server_authlist into nfs_try_mount_request
nfs: refactor "need_mount" code out of nfs_try_mount
SUNRPC: PipeFS MOUNT notification optimization for dying clients
SUNRPC: split client creation routine into setup and registration
SUNRPC: fix races on PipeFS UMOUNT notifications
SUNRPC: fix races on PipeFS MOUNT notifications
NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
NFS: Improve legacy idmapping fallback
NFSv4.1 end back channel session draining
NFS: Apply v4.1 capabilities to v4.2
NFSv4.1: Clean up layout segment comparison helper names
NFSv4.1: layout segment comparison helpers should take 'const' parameters
NFSv4: Move the DNS resolver into the NFSv4 module
rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
...

Linus Torvalds
2013-07-10 03:09:43 +0800

09 Jul, 2013

1 commit

0979292bf svcrpc: fix failures to handle -1 uid's ... Browse Code »

As of f025adf191924e3a75ce80e130afcd2485b53bb8 "sunrpc: Properly decode
kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
(0xffff) uid or gid would fail with a badcred error.

Commit afe3c3fd5392b2f0066930abc5dbd3f4b14a0f13 "svcrpc: fix failures to
handle -1 uid's and gid's" fixed part of the problem, but overlooked the
gid upcall--the kernel can request supplementary gid's for the -1 uid,
but mountd's attempt write a response will get -EINVAL.

Symptoms were nfsd failing to reply to the first attempt to use a newly
negotiated krb5 context.

Reported-by: Sven Geggus
Tested-by: Sven Geggus
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-07-09 05:27:23 +0800

04 Jul, 2013

3 commits

7f0ef0267 Merge branch 'akpm' (updates from Andrew Morton) ... Browse Code »

Merge first patch-bomb from Andrew Morton:
- various misc bits
- I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
distracted. There has been quite a bit of activity.
- About half the MM queue
- Some backlight bits
- Various lib/ updates
- checkpatch updates
- zillions more little rtc patches
- ptrace
- signals
- exec
- procfs
- rapidio
- nbd
- aoe
- pps
- memstick
- tools/testing/selftests updates

* emailed patches from Andrew Morton : (445 commits)
tools/testing/selftests: don't assume the x bit is set on scripts
selftests: add .gitignore for kcmp
selftests: fix clean target in kcmp Makefile
selftests: add .gitignore for vm
selftests: add hugetlbfstest
self-test: fix make clean
selftests: exit 1 on failure
kernel/resource.c: remove the unneeded assignment in function __find_resource
aio: fix wrong comment in aio_complete()
drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
drivers/memstick/host/r592.c: convert to module_pci_driver
drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
pps-gpio: add device-tree binding and support
drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
drivers/parport/share.c: use kzalloc
Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
aoe: update internal version number to v83
aoe: update copyright date
aoe: perform I/O completions in parallel
...

Linus Torvalds
2013-07-04 08:12:13 +0800
f170168b9 drivers: avoid parsing names as kthread_run() format strings ... Browse Code »

Calling kthread_run with a single name parameter causes it to be handled
as a format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2013-07-04 07:07:41 +0800
f991fae5c Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management and ACPI updates from Rafael Wysocki:
"This time the total number of ACPI commits is slightly greater than
the number of cpufreq commits, but Viresh Kumar (who works on cpufreq)
remains the most active patch submitter.

To me, the most significant change is the addition of offline/online
device operations to the driver core (with the Greg's blessing) and
the related modifications of the ACPI core hotplug code. Next are the
freezer updates from Colin Cross that should make the freezing of
tasks a bit less heavy weight.

We also have a couple of regression fixes, a number of fixes for
issues that have not been identified as regressions, two new drivers
and a bunch of cleanups all over.

Highlights:

- Hotplug changes to support graceful hot-removal failures.

It sometimes is necessary to fail device hot-removal operations
gracefully if they cannot be carried out completely. For example,
if memory from a memory module being hot-removed has been allocated
for the kernel's own use and cannot be moved elsewhere, it's
desirable to fail the hot-removal operation in a graceful way
rather than to crash the kernel, but currenty a success or a kernel
crash are the only possible outcomes of an attempted memory
hot-removal. Needless to say, that is not a very attractive
alternative and it had to be addressed.

However, in order to make it work for memory, I first had to make
it work for CPUs and for this purpose I needed to modify the ACPI
processor driver. It's been split into two parts, a resident one
handling the low-level initialization/cleanup and a modular one
playing the actual driver's role (but it binds to the CPU system
device objects rather than to the ACPI device objects representing
processors). That's been sort of like a live brain surgery on a
patient who's riding a bike.

So this is a little scary, but since we found and fixed a couple of
regressions it caused to happen during the early linux-next testing
(a month ago), nobody has complained.

As a bonus we remove some duplicated ACPI hotplug code, because the
ACPI-based CPU hotplug is now going to use the common ACPI hotplug
code.

- Lighter weight freezing of tasks.

These changes from Colin Cross and Mandeep Singh Baines are
targeted at making the freezing of tasks a bit less heavy weight
operation. They reduce the number of tasks woken up every time
during the freezing, by using the observation that the freezer
simply doesn't need to wake up some of them and wait for them all
to call refrigerator(). The time needed for the freezer to decide
to report a failure is reduced too.

Also reintroduced is the check causing a lockdep warining to
trigger when try_to_freeze() is called with locks held (which is
generally unsafe and shouldn't happen).

- cpufreq updates

First off, a commit from Srivatsa S Bhat fixes a resume regression
introduced during the 3.10 cycle causing some cpufreq sysfs
attributes to return wrong values to user space after resume. The
fix is kind of fresh, but also it's pretty obvious once Srivatsa
has identified the root cause.

Second, we have a new freqdomain_cpus sysfs attribute for the
acpi-cpufreq driver to provide information previously available via
related_cpus. From Lan Tianyu.

Finally, we fix a number of issues, mostly related to the
CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean
up some code. The majority of changes from Viresh Kumar with bits
from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia,
Arnd Bergmann, and Tang Yuantian.

- ACPICA update

A usual bunch of updates from the ACPICA upstream.

During the 3.4 cycle we introduced support for ACPI 5 extended
sleep registers, but they are only supposed to be used if the
HW-reduced mode bit is set in the FADT flags and the code attempted
to use them without checking that bit. That caused suspend/resume
regressions to happen on some systems. Fix from Lv Zheng causes
those registers to be used only if the HW-reduced mode bit is set.

Apart from this some other ACPICA bugs are fixed and code cleanups
are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and
Zhang Rui.

- cpuidle updates

New driver for Xilinx Zynq processors is added by Michal Simek.

Multidriver support simplification, addition of some missing
kerneldoc comments and Kconfig-related fixes come from Daniel
Lezcano.

- ACPI power management updates

Changes to make suspend/resume work correctly in Xen guests from
Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and
cleanups and fixes of the ACPI device power state selection
routine.

- ACPI documentation updates

Some previously missing pieces of ACPI documentation are added by
Lv Zheng and Aaron Lu (hopefully, that will help people to
uderstand how the ACPI subsystem works) and one outdated doc is
updated by Hanjun Guo.

- Assorted ACPI updates

We finally nailed down the IA-64 issue that was the reason for
reverting commit 9f29ab11ddbf ("ACPI / scan: do not match drivers
against objects having scan handlers"), so we can fix it and move
the ACPI scan handler check added to the ACPI video driver back to
the core.

A mechanism for adding CMOS RTC address space handlers is
introduced by Lan Tianyu to allow some EC-related breakage to be
fixed on some systems.

A spec-compliant implementation of acpi_os_get_timer() is added by
Mika Westerberg.

The evaluation of _STA is added to do_acpi_find_child() to avoid
situations in which a pointer to a disabled device object is
returned instead of an enabled one with the same _ADR value. From
Jeff Wu.

Intel BayTrail PCH (Platform Controller Hub) support is added to
the ACPI driver for Intel Low-Power Subsystems (LPSS) and that
driver is modified to work around a couple of known BIOS issues.
Changes from Mika Westerberg and Heikki Krogerus.

The EC driver is fixed by Vasiliy Kulikov to use get_user() and
put_user() instead of dereferencing user space pointers blindly.

Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi
Kani.

- Assorted power management updates

The "runtime idle" helper routine is changed to take the return
values of the callbacks executed by it into account and to call
rpm_suspend() if they return 0, which allows us to reduce the
overall code bloat a bit (by dropping some code that's not
necessary any more after that modification).

The runtime PM documentation is updated by Alan Stern (to reflect
the "runtime idle" behavior change).

New trace points for PM QoS are added by Sahara
().

PM QoS documentation is updated by Lan Tianyu.

Code cleanups are made and minor issues are addressed by Bernie
Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan.

- devfreq updates

New driver for the Exynos5-bus device from Abhilash Kesavan.

Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham,
Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun.

- OMAP power management updates

Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver
updates from Andrii Tseglytskyi and Nishanth Menon."

* tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
cpufreq: Fix cpufreq regression after suspend/resume
ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state()
PM / Sleep: Warn about system time after resume with pm_trace
cpufreq: don't leave stale policy pointer in cdbs->cur_policy
acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
cpufreq: make sure frequency transitions are serialized
ACPI: implement acpi_os_get_timer() according the spec
ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
ACPI: Add CMOS RTC Operation Region handler support
ACPI / processor: Drop unused variable from processor_perflib.c
cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases
cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases
...

Linus Torvalds
2013-07-04 05:35:40 +0800

02 Jul, 2013

10 commits

0bebc633f sunrpc: Don't schedule an upcall on a replaced cache entry. ... Browse Code »

When a cache entry is replaced, the "expiry_time" get set to
zero by a call to "cache_fresh_locked(..., 0)" at the end of
"sunrpc_cache_update".

This low expiry time makes cache_check() think that the 'refresh_age'
is negative, so the 'age' is comparatively large and a refresh is
triggered.
However refreshing a replaced entry it pointless, it cannot achieve
anything useful.

So teach cache_check to ignore a low refresh_age when expiry_time
is zero.

Reported-by: Bodo Stroesser
Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2013-07-02 05:53:28 +0800
7715cde86 net/sunrpc: xpt_auth_cache should be ignored when expired. ... Browse Code »

commit d202cce8963d9268ff355a386e20243e8332b308
sunrpc: never return expired entries in sunrpc_cache_lookup

moved the 'entry is expired' test from cache_check to
sunrpc_cache_lookup, so that it happened early and some races could
safely be ignored.

However the ip_map (in svcauth_unix.c) has a separate single-item
cache which allows quick lookup without locking. An entry in this
case would not be subject to the expiry test and so could be used
well after it has expired.

This is not normally a big problem because the first time it is used
after it is expired an up-call will be scheduled to refresh the entry
(if it hasn't been scheduled already) and the old entry will then
be invalidated. So on the second attempt to use it after it has
expired, ip_map_cached_get will discard it.

However that is subtle and not ideal, so replace the "!cache_valid"
test with "cache_is_expired".
In doing this we drop the test on the "CACHE_VALID" bit. This is
unnecessary as the bit is never cleared, and an entry will only
be cached if the bit is set.

Reported-by: Bodo Stroesser
Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2013-07-02 05:53:28 +0800
013920eb5 sunrpc/cache: ensure items removed from cache do not have pending upcalls. ... Browse Code »

It is possible for a race to set CACHE_PENDING after cache_clean()
has removed a cache entry from the cache.
If CACHE_PENDING is still set when the entry is finally 'put',
the cache_dequeue() will never happen and we can leak memory.

So set a new flag 'CACHE_CLEANED' when we remove something from
the cache, and don't queue any upcall if it is set.

If CACHE_PENDING is set before CACHE_CLEANED, the call that
cache_clean() makes to cache_fresh_unlocked() will free memory
as needed. If CACHE_PENDING is set after CACHE_CLEANED, the
test in sunrpc_cache_pipe_upcall will ensure that the memory
is not allocated.

Reported-by:
Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2013-07-02 05:53:28 +0800
2a1c7f53f sunrpc/cache: use cache_fresh_unlocked consistently and correctly. ... Browse Code »

cache_fresh_unlocked() is called when a cache entry
has been updated and ensures that if there were any
pending upcalls, they are cleared.

So every time we update a cache entry, we should call this,
and this should be the only way that we try to clear
pending calls (that sort of uniformity makes code sooo much
easier to read).

try_to_negate_entry() will (possibly) mark an entry as
negative. If it doesn't, it is because the entry already
is VALID.
So the entry will be valid on exit, so it is appropriate to
call cache_fresh_unlocked().
So tidy up try_to_negate_entry() to do that, and remove
partial open-coded cache_fresh_unlocked() from the one
call-site of try_to_negate_entry().

In the other branch of the 'switch(cache_make_upcall())',
we again have a partial open-coded version of cache_fresh_unlocked().
Replace that with a real call.

And again in cache_clean(), use a real call to cache_fresh_unlocked().

These call sites might previously have called
cache_revisit_request() if CACHE_PENDING wasn't set.
This is never necessary because cache_revisit_request() can
only do anything if the item is in the cache_defer_hash,
However any time that an item is added to the cache_defer_hash
(setup_deferral), the code immediately tests CACHE_PENDING,
and removes the entry again if it is clear. So all other
places we only need to 'cache_revisit_request' if we've
just cleared CACHE_PENDING.

Reported-by: Bodo Stroesser
Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2013-07-02 05:53:28 +0800
f9e1aedc6 sunrpc/cache: remove races with queuing an upcall. ... Browse Code »

We currently queue an upcall after setting CACHE_PENDING,
and dequeue after clearing CACHE_PENDING.
So a request should only be present when CACHE_PENDING is set.

However we don't combine the test and the enqueue/dequeue in
a protected region, so it is possible (if unlikely) for a race
to result in a request being queued without CACHE_PENDING set,
or a request to be absent despite CACHE_PENDING.

So: include a test for CACHE_PENDING inside the regions of
enqueue and dequeue where queue_lock is held, and abort
the operation if the value is not as expected.

Also remove the early 'return' from cache_dequeue() to ensure that it
always removes all entries: As there is no locking between setting
CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not
inconceivable for some other thread to clear CACHE_PENDING and then
someone else to set it and call sunrpc_cache_pipe_upcall, both before
the original threads completed the call.

With this, it perfectly safe and correct to:
- call cache_dequeue() if and only if we have just
cleared CACHE_PENDING
- call sunrpc_cache_pipe_upcall() (via cache_make_upcall)
if and only if we have just set CACHE_PENDING.

Reported-by: Bodo Stroesser
Signed-off-by: NeilBrown
Signed-off-by: Bodo Stroesser
Signed-off-by: J. Bruce Fields

NeilBrown
2013-07-02 05:42:53 +0800
1f691b07c svcrpc: don't error out on small tcp fragment ... Browse Code »

Though clients we care about mostly don't do this, it is possible for
rpc requests to be sent in multiple fragments. Here we have a sanity
check to ensure that the final received rpc isn't too small--except that
the number we're actually checking is the length of just the final
fragment, not of the whole rpc. So a perfectly legal rpc that's
unluckily fragmented could cause the server to close the connection
here.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-07-02 05:32:04 +0800
cf3aa02cb svcrpc: fix handling of too-short rpc's ... Browse Code »

If we detect that an rpc is too short, we abort and close the
connection. Except, there's a bug here: we're leaving sk_datalen
nonzero without leaving any pages in the sk_pages array. The most
likely result of the inconsistency is a subsequent crash in
svc_tcp_clear_pages.

Also demote the BUG_ON in svc_tcp_clear_pages to a WARN.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-07-02 05:32:04 +0800
0dc1531ac svcrpc: store gss mech in svc_cred ... Browse Code »

Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-07-02 05:23:06 +0800
442340639 svcrpc: introduce init_svc_cred ... Browse Code »

Common helper to zero out fields of the svc_cred.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-07-02 05:23:06 +0800
0de934936 Merge branch 'for-3.10' into 'for-3.11' ... Browse Code »

Merge bugfixes into my for-3.11 branch.

J. Bruce Fields
2013-07-02 05:22:24 +0800

29 Jun, 2013

5 commits

e77e43003 more open-coded file_inode() calls ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:57:21 +0800
4f6bb246f SUNRPC: PipeFS MOUNT notification optimization for dying clients ... Browse Code »

Not need to create pipes for dying client. So just skip them.

Note: we can safely dereference the client structure, because notification
caller is holding sn->pipefs_sb_lock.

Signed-off-by: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust

Stanislav Kinsbursky
2013-06-29 03:42:26 +0800
e73f4cc05 SUNRPC: split client creation routine into setup and registration ... Browse Code »

This helper moves all "registration" code to the new rpc_client_register()
helper.
This helper will be used later in the series to synchronize against PipeFS
MOUNT/UMOUNT events.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: Trond Myklebust

Stanislav Kinsbursky
2013-06-29 03:42:16 +0800
adb6fa7ff SUNRPC: fix races on PipeFS UMOUNT notifications ... Browse Code »

CPU#0 CPU#1
----------------------------- -----------------------------
rpc_kill_sb
sn->pipefs_sb = NULL rpc_release_client
(UMOUNT_EVENT) rpc_free_auth
rpc_pipefs_event
rpc_get_client_for_event
!atomic_inc_not_zero(cl_count)

atomic_inc(cl_count)
rpc_free_client
rpc_clnt_remove_pipedir

To fix this, this patch does the following:

1) Calls RPC_PIPEFS_UMOUNT notification with sn->pipefs_sb_lock being held.
2) Removes SUNRPC client from the list AFTER pipes destroying.
3) Doesn't hold RPC client on notification: if client in the list, then it
can't be destroyed while sn->pipefs_sb_lock in hold by notification caller.

Signed-off-by: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust

Stanislav Kinsbursky
2013-06-29 03:42:02 +0800
384816051 SUNRPC: fix races on PipeFS MOUNT notifications ... Browse Code »

Below are races, when RPC client can be created without PiepFS dentries

CPU#0 CPU#1
----------------------------- -----------------------------
rpc_new_client rpc_fill_super
rpc_setup_pipedir
mutex_lock(&sn->pipefs_sb_lock)
rpc_get_sb_net == NULL
(no per-net PipeFS superblock)
sn->pipefs_sb = sb;
notifier_call_chain(MOUNT)
(client is not in the list)
rpc_register_client
(client without pipes dentries)

To fix this patch:
1) makes PipeFS mount notification call with pipefs_sb_lock being held.
2) releases pipefs_sb_lock on new SUNRPC client creation only after
registration.

Signed-off-by: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust

Stanislav Kinsbursky
2013-06-29 03:41:18 +0800

28 Jun, 2013

1 commit

207bc1181 Merge branch 'freezer' ... Browse Code »

* freezer:
af_unix: use freezable blocking calls in read
sigtimedwait: use freezable blocking call
nanosleep: use freezable blocking call
futex: use freezable blocking call
select: use freezable blocking call
epoll: use freezable blocking call
binder: use freezable blocking calls
freezer: add new freezable helpers using freezer_do_not_count()
freezer: convert freezable helpers to static inline where possible
freezer: convert freezable helpers to freezer_do_not_count()
freezer: skip waking up tasks with PF_FREEZER_SKIP set
freezer: shorten freezer sleep time using exponential backoff
lockdep: check that no locks held at freeze time
lockdep: remove task argument from debug_check_no_locks_held
freezer: add unsafe versions of freezable helpers for CIFS
freezer: add unsafe versions of freezable helpers for NFS

Rafael J. Wysocki
2013-06-28 19:00:53 +0800

19 Jun, 2013

1 commit

e401452d9 rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ... Browse Code »

We had a report of a reproducible WARNING:

[ 1360.039358] ------------[ cut here ]------------
[ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
[ 1360.049880] Hardware name: HP Z200 Workstation
[ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
auth_rpcgss]
[ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G I -------------- 3.9.0-0.55.el7.x86_64 #1
[ 1360.116771] Call Trace:
[ 1360.119219] [] warn_slowpath_common+0x70/0xa0
[ 1360.125208] [] warn_slowpath_null+0x1a/0x20
[ 1360.131025] [] d_set_d_op+0x8d/0xc0
[ 1360.136159] [] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
[ 1360.143710] [] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
[ 1360.150311] [] nfs_idmap_new+0x96/0x130 [nfsv4]
[ 1360.156475] [] nfs4_init_client+0xad/0x2d0 [nfsv4]
[ 1360.162902] [] ? idr_get_empty_slot+0x16f/0x3c0
[ 1360.169062] [] ? idr_mark_full+0x52/0x60
[ 1360.174615] [] ? idr_alloc+0x79/0xe0
[ 1360.179826] [] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
[ 1360.187635] [] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
[ 1360.194493] [] nfs_get_client+0x27a/0x350 [nfs]
[ 1360.200666] [] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
[ 1360.207624] [] nfs4_create_server+0xf3/0x3a0 [nfsv4]
[ 1360.214222] [] nfs4_remote_mount+0x2e/0x60 [nfsv4]
[ 1360.220644] [] mount_fs+0x39/0x1b0
[ 1360.225691] [] ? __alloc_percpu+0x10/0x20
[ 1360.231348] [] vfs_kern_mount+0x5f/0xf0
[ 1360.236822] [] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[ 1360.243246] [] nfs4_try_mount+0x44/0xc0 [nfsv4]
[ 1360.249410] [] ? get_nfs_version+0x27/0x80 [nfs]
[ 1360.255659] [] nfs_fs_mount+0x5c5/0xd10 [nfs]
[ 1360.261650] [] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1360.268074] [] ? param_set_portnr+0x60/0x60 [nfs]
[ 1360.274406] [] mount_fs+0x39/0x1b0
[ 1360.279443] [] ? __alloc_percpu+0x10/0x20
[ 1360.285088] [] vfs_kern_mount+0x5f/0xf0
[ 1360.290556] [] do_mount+0x1fd/0xa00
[ 1360.295677] [] ? __get_free_pages+0xe/0x50
[ 1360.301405] [] ? copy_mount_options+0x36/0x170
[ 1360.307479] [] sys_mount+0x83/0xc0
[ 1360.312515] [] system_call_fastpath+0x16/0x1b
[ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---

The problem is that we're ending up in __rpc_lookup_create_exclusive
with a negative dentry that already has d_op set. A little debugging
has shown that when we hit this, the d_ops are already set to
simple_dentry_operations.

I believe that what's happening is that during a mount, idmapd is racing
in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
Before that dentry reference is released, the kernel races in to create
that file and finds the new negative dentry, which already has the
d_op set.

This patch just avoids setting the d_op if it's already set.
simple_dentry_operations and rpc_dentry_operations are functionally
equivalent so it shouldn't matter which one it's set to.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2013-06-19 01:46:50 +0800

13 Jun, 2013

1 commit

fe2c6338f net: Convert uses of typedef ctl_table to struct ctl_table ... Browse Code »

Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
xargs perl -p -i -e '\
sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches
Signed-off-by: David S. Miller

Joe Perches
2013-06-13 17:36:09 +0800

07 Jun, 2013

3 commits

9ec2ef53b SUNRPC: Remove redundant call to rpc_set_running() in __rpc_execute() ... Browse Code »

The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable()
once we get past the test for out_of_line_wait_on_bit() returning
ERESTARTSYS.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-06-07 04:24:40 +0800
0053a8e65 SUNRPC: Remove unused function rpc_queue_empty ... Browse Code »

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-06-07 04:24:39 +0800
a76580fbf SUNRPC: Fix a potential race in rpc_execute ... Browse Code »

If the rpc_task is asynchronous, it could theoretically finish executing
on the workqueue it was assigned by rpc_make_runnable() before we get
round to testing RPC_IS_ASYNC() in rpc_execute.

In practice, however, all the existing callers hold a reference to the
rpc_task, so this can't happen today...

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-06-07 04:24:38 +0800

31 May, 2013

1 commit

4203afc3f Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd fixes from Bruce Fields:
"A couple minor fixes for the (new to 3.10) gss-proxy code.

And one regression from user-namespace changes. (XBMC clients were
doing something admittedly weird--sending -1 gid's--but something that
we used to allow.)"

* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
svcrpc: fix failures to handle -1 uid's and gid's
svcrpc: implement O_NONBLOCK behavior for use-gss-proxy
svcauth_gss: fix error code in use_gss_proxy()

Linus Torvalds
2013-05-31 08:48:56 +0800

29 May, 2013

2 commits

afe3c3fd5 svcrpc: fix failures to handle -1 uid's and gid's ... Browse Code »

As of f025adf191924e3a75ce80e130afcd2485b53bb8 "sunrpc: Properly decode
kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1
(0xffff) uid or gid would fail with a badcred error.

Reported symptoms were xmbc clients failing on upgrade of the NFS
server; examination of the network trace showed them sending -1 as the
gid.

Reported-by: Julian Sikorski
Tested-by: Julian Sikorski
Cc: "Eric W. Biederman"
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-05-29 22:37:47 +0800
b161c1444 svcrpc: implement O_NONBLOCK behavior for use-gss-proxy ... Browse Code »

Somebody noticed LTP was complaining about O_NONBLOCK opens of
/proc/net/rpc/use-gss-proxy succeeding and then a following read
hanging.

I'm not convinced LTP really has any business opening random proc files
and expecting them to behave a certain way. Maybe this isn't really a
bug.

But in any case the O_NONBLOCK behavior could be useful for someone that
wants to test whether gss-proxy is up without waiting.

Reported-by: Jan Stancek
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2013-05-29 04:46:51 +0800

23 May, 2013

1 commit

a3c3cac5d SUNRPC: Prevent an rpc_task wakeup race ... Browse Code »

The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to
be careful about ordering the calls to rpc_test_and_set_running(task) and
rpc_clear_queued(task). If we get the order wrong, then we may end up
testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped
and changed the state of the rpc_task.

Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org

Trond Myklebust
2013-05-23 02:55:32 +0800

21 May, 2013

1 commit

b6040f970 sunrpc: the cache_detail in cache_is_valid is unused any more ... Browse Code »

The cache_detail(*detail) in function cache_is_valid is not used any
more.

Signed-off-by: fanchaoting
Signed-off-by: J. Bruce Fields

chaoting fan
2013-05-21 23:02:02 +0800

16 May, 2013

3 commits

2aed8b476 SUNRPC: Convert auth_gss pipe detection to work in namespaces ... Browse Code »

This seems to have been overlooked when we did the namespace
conversion. If a container is running a legacy version of rpc.gssd
then it will be disrupted if the global 'pipe_version' is set by a
container running the new version of rpc.gssd.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-05-16 21:17:54 +0800
abfdbd53a SUNRPC: Faster detection if gssd is actually running ... Browse Code »

Recent changes to the NFS security flavour negotiation mean that
we have a stronger dependency on rpc.gssd. If the latter is not
running, because the user failed to start it, then we time out
and mark the container as not having an instance. We then
use that information to time out faster the next time.

If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
then we mark the container as having an rpc.gssd instance.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-05-16 21:15:41 +0800
d36ccb9ce SUNRPC: Fix a bug in gss_create_upcall ... Browse Code »

If wait_event_interruptible_timeout() is successful, it returns
the number of seconds remaining until the timeout. In that
case, we should be retrying the upcall.

Signed-off-by: Trond Myklebust

Trond Myklebust
2013-05-16 01:49:58 +0800