Eric Lee / smarc-fsl-linux-kernel

05 Aug, 2016

5 commits

d03d9fe47 nfsd: remove unnecessary positive-dentry check

vfs_{create,mkdir,mknod} each begin with a call to may_create(), which
returns EEXIST if the object already exists.

This check is therefore unnecessary.

(In the NFSv2 case, nfsd_proc_create also has such a check. Contrary to
RFC 1094, our code seems to believe that a CREATE of an existing file
should succeed. I'm leaving that behavior alone.)

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-08-05 05:11:50 +0800
b44061d0b nfsd: reorganize nfsd_create

There's some odd logic in nfsd_create() that allows it to be called with
the parent directory either locked or unlocked. The only already-locked
caller is NFSv2's nfsd_proc_create(). It's less confusing to split out
the unlocked case into a separate function which the NFSv2 code can call
directly.

Also fix some comments while we're here.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-08-05 05:11:49 +0800
e75b23f9e nfsd: check d_can_lookup in fh_verify of directories

Create and other nfsd ops generally assume we can call lookup_one_len on
inodes with S_IFDIR set. Al says that this assumption isn't true in
general, though it should be for the filesystem objects nfsd sees.

Add a check just to make sure our assumption isn't violated.

Remove a couple checks for i_op->lookup in create code.

Cc: Al Viro
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-08-05 05:11:48 +0800
12391d072 nfsd: remove redundant zero-length check from create

lookup_one_len already has this check.

The only effect of this patch is to return nfserr_acces instead of
nfserr_perm in the 0-length-filename case. I actually prefer nfserr_perm
(or _inval?), but I doubt anyone cares.

The isdotent check seems redundant too, but I worry that some client
might actually care about that strange nfserr_exist error.

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-08-05 05:11:47 +0800
7eed34f18 nfsd: Make creates return EEXIST instead of EACCES

When doing a create (mkdir/mknod) on a name, it's worth
checking whether the name already exists before returning EACCES in case
the directory is not writeable by the user.
This makes return values on the client more consistent
regardless of whether the entry is cached in the local
cache or not.
Another positive side effect is that certain programs expect only
EEXIST in that case, even though POSIX allows any valid
error to be returned.
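
The ordering described above can be sketched in plain C. This is only an illustration of checking existence before permission; the struct, its fields, and check_create() are hypothetical names, not the nfsd code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state a create request depends on. */
struct create_request {
	bool name_exists;   /* the target name already has an entry */
	bool dir_writable;  /* the caller may write to the parent directory */
};

/*
 * Return 0 if the create may proceed, or a negative errno.
 * Existence is checked first, so an existing name yields -EEXIST
 * even when the parent directory is not writable.
 */
static int check_create(const struct create_request *req)
{
	if (req->name_exists)
		return -EEXIST;
	if (!req->dir_writable)
		return -EACCES;
	return 0;
}
```

With this order, a request for an existing name in a read-only directory reports -EEXIST rather than -EACCES, which matches what a client with the entry cached would report.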

Signed-off-by: Oleg Drokin
Signed-off-by: J. Bruce Fields

Oleg Drokin
2016-08-05 05:11:46 +0800

02 Aug, 2016

2 commits

c7995f8a7 SUNRPC: Detect immediate closure of accepted sockets

This modification is useful for debugging issues that happen while
the socket is being initialised.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-08-02 05:53:42 +0800
b2f21f7d8 SUNRPC: accept() may return sockets that are still in SYN_RECV

We're seeing traces of the following form:

[10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
[10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
[10952.396362] nfsd: connect from 10.2.6.1, port=187
[10952.396364] svc: svc_setup_socket ffff8800b99bcf00
[10952.396368] setting up TCP socket for reading
[10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
[10952.396373] svc: transport ffff8803eb10a000 put into queue
[10952.396375] svc: transport ffff88042ba4a000 put into queue
[10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
[10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
[10952.396381] svc_recv: found XPT_CLOSE
[10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
[10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
[10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
[10952.396412] svc: svc_sock_free(ffff8803eb10a000)

i.e. an immediate close of the socket after initialisation.

The culprit appears to be the test at the end of svc_tcp_init, which
checks if the newly created socket is in the TCP_ESTABLISHED state,
and immediately closes it if not. The evidence appears to suggest that
the socket might still be in the SYN_RECV state at this time.

The fix is to check for both states, and then to add a check in
svc_tcp_state_change() to ensure we don't close the socket when
it transitions into TCP_ESTABLISHED.
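
A rough model of the two checks described above, written as standalone C. The enum and both function names are invented for this sketch and only approximate the decisions; they are not the svc_tcp_init() or svc_tcp_state_change() code.

```c
#include <stdbool.h>

/* Illustrative subset of TCP states; the values are local to this sketch. */
enum sketch_tcp_state {
	SKETCH_TCP_ESTABLISHED,
	SKETCH_TCP_SYN_RECV,
	SKETCH_TCP_CLOSE,
};

/*
 * At init time: close the freshly accepted socket only if it is in
 * neither ESTABLISHED nor SYN_RECV.
 */
static bool close_after_init(enum sketch_tcp_state st)
{
	return st != SKETCH_TCP_ESTABLISHED && st != SKETCH_TCP_SYN_RECV;
}

/*
 * On a later state change: the transition into ESTABLISHED is expected
 * for a socket accepted in SYN_RECV, so it must not trigger a close.
 */
static bool close_on_state_change(enum sketch_tcp_state new_state)
{
	return new_state != SKETCH_TCP_ESTABLISHED;
}
```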

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-08-02 05:53:41 +0800

16 Jul, 2016

4 commits

8a4c39268 nfsd: allow nfsd to advertise multiple layout types

If the underlying filesystem supports multiple layout types, then there
is little reason not to advertise that fact to clients and let them
choose what type to use.

Turn the ex_layout_type field into a bitfield. For each supported
layout type, we set a bit in that field. When the client requests a
layout, ensure that the bit for that layout type is set. When the
client requests attributes, send back a list of supported types.
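
The bitfield scheme can be illustrated with a small self-contained C sketch. The layout type numbers and the helper names below are assumptions made for the example, not the identifiers used in nfsd.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout type numbers, used directly as bit positions. */
enum { SKETCH_LAYOUT_BLOCK = 3, SKETCH_LAYOUT_FLEX_FILES = 4 };

/* Mark one layout type as supported; out-of-range types are ignored. */
static uint32_t layout_add(uint32_t mask, unsigned type)
{
	if (type >= 32)
		return mask;
	return mask | (UINT32_C(1) << type);
}

/* Check a client's requested layout type against the mask. */
static bool layout_supported(uint32_t mask, unsigned type)
{
	return type < 32 && ((mask >> type) & 1);
}

/* Expand the mask into a list of supported types, lowest first. */
static size_t layout_list(uint32_t mask, unsigned *out, size_t cap)
{
	size_t n = 0;

	for (unsigned t = 0; t < 32 && n < cap; t++)
		if (mask & (UINT32_C(1) << t))
			out[n++] = t;
	return n;
}
```

layout_supported() corresponds to the check made when a client requests a layout, and layout_list() to building the list returned for an attribute request.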

Signed-off-by: Jeff Layton
Reviewed-by: Weston Andros Adamson
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-07-16 03:31:32 +0800
885848186 nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock

nfsd4_release_lockowner finds a lock owner that has no lock state,
and drops cl_lock. Then release_lockowner picks up cl_lock and
unhashes the lock owner.

During the window where cl_lock is dropped, I don't see anything
preventing a concurrent nfsd4_lock from finding that same lock owner
and adding lock state to it.

Move release_lockowner() into nfsd4_release_lockowner and hang onto
the cl_lock until after the lock owner's state cannot be found
again.

Found by inspection; we don't currently have a reproducer.

Fixes: 2c41beb0e5cf ("nfsd: reduce cl_lock thrashing in ... ")
Reviewed-by: Jeff Layton
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-07-16 03:31:31 +0800
dd51db188 nfsd/blocklayout: Make sure calculate signature/designator length aligned

These values are all multiples of 4 already, so there's no change in
behavior from this patch. But perhaps this will prevent mistakes in the
future.
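
For reference, rounding a length up to the next multiple of 4 can be written as below. The helper name is made up for this sketch, and the patch itself may express the alignment differently.

```c
#include <stddef.h>

/*
 * Round len up to the next multiple of 4.
 * Values that are already multiples of 4 are returned unchanged,
 * which is why aligning such lengths causes no change in behavior.
 */
static size_t sketch_align4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}
```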

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2016-07-16 03:31:30 +0800
15d66ac20 xfs: abstract block export operations from nfsd layouts

Instead of creeping pnfs layout configuration into filesystems, move the
definition of block-based export operations under a more abstract
configuration.

Signed-off-by: Benjamin Coddington
Reviewed-by: Christoph Hellwig
Acked-by: Dave Chinner
Signed-off-by: J. Bruce Fields

Benjamin Coddington
2016-07-16 03:31:29 +0800

14 Jul, 2016

17 commits

f4a4906e5 SUNRPC: Remove unused callback xpo_adjust_wspace()

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:50 +0800
637600f3f SUNRPC: Change TCP socket space reservation

The current server rpc tcp code attempts to predict how much writeable
socket space will be available to a given RPC call before accepting it
for processing. On a 40GigE network, we've found this throttles
individual clients long before the network or disk is saturated. The
server may handle more clients easily, but the bandwidth of individual
clients is still artificially limited.

Instead of trying (and failing) to predict how much writeable socket space
will be available to the RPC call, just fall back to the simple model of
deferring processing until the socket is uncongested.

This may increase the risk of fast clients starving slower clients; in
such cases, the previous patch allows setting a hard per-connection
limit.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:49 +0800
ff3ac5c3d SUNRPC: Add a server side per-connection limit

Allow the user to limit the number of requests serviced through a single
connection, to help prevent faster clients from starving slower clients.
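
The idea of a per-connection cap can be sketched as a simple admission check. The struct, the function name, and the convention that a limit of 0 means "unlimited" are all assumptions of this example, not details taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical per-connection bookkeeping. */
struct sketch_conn {
	unsigned in_flight;  /* requests currently being serviced */
};

/*
 * Decide whether another request from this connection may be started.
 * In this sketch a limit of 0 is taken to mean "no limit".
 */
static bool sketch_may_service(const struct sketch_conn *c, unsigned limit)
{
	return limit == 0 || c->in_flight < limit;
}
```

A fast client that already has limit requests in flight is held back until one completes, which leaves service threads free for slower clients.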

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:48 +0800
4720b0703 SUNRPC: Micro optimisation for svc_data_ready

Don't call svc_xprt_enqueue() if the XPT_DATA flag is already set.
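
The pattern is "set the flag, and only do the expensive step if it was previously clear". A userspace sketch with C11 atomics follows; the struct, the flag constant, and the function names are stand-ins, and the kernel uses its own bit operations rather than <stdatomic.h>.

```c
#include <stdatomic.h>

#define SKETCH_XPT_DATA (1u << 0)

/* Hypothetical transport with a flags word and an enqueue counter. */
struct sketch_xprt {
	atomic_uint flags;
	unsigned enqueue_calls;
};

/* Stand-in for the (comparatively expensive) enqueue operation. */
static void sketch_enqueue(struct sketch_xprt *x)
{
	x->enqueue_calls++;
}

/*
 * Data-ready callback: atomically set the DATA flag and enqueue only
 * when the flag was not already set by an earlier callback.
 */
static void sketch_data_ready(struct sketch_xprt *x)
{
	unsigned old = atomic_fetch_or(&x->flags, SKETCH_XPT_DATA);

	if (!(old & SKETCH_XPT_DATA))
		sketch_enqueue(x);
}
```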

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:46 +0800
fa9251afc SUNRPC: Call the default socket callbacks instead of open coding

Rather than code up our own versions of the socket callbacks, just
call the defaults.
This also allows us to merge svc_udp_data_ready() and svc_tcp_data_ready().

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:45 +0800
069c225b8 SUNRPC: lock the socket while detaching it

Prevent callbacks from triggering while we're detaching the socket.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:44 +0800
104f6351f SUNRPC: Add tracepoints for dropped and deferred requests

Dropping and/or deferring requests has an impact on performance. Let's
make sure we can trace those events.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:43 +0800
82ea2d761 SUNRPC: Add a tracepoint for server socket out-of-space conditions

Add a tracepoint to track when the processing of incoming RPC data gets
deferred due to out-of-space issues on the outgoing transport.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-14 03:53:42 +0800
d28c442f5 nfsd: Fix some indent inconsistancy

Silence a few smatch warnings about indentation.

Signed-off-by: Christophe JAILLET
Signed-off-by: J. Bruce Fields

Christophe JAILLET
2016-07-14 03:53:41 +0800
93f580a9a nfsd: Correct a comment for NFSD_MAY_ defines location

Those are now defined in fs/nfsd/vfs.h

Signed-off-by: Oleg Drokin
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Oleg Drokin
2016-07-14 03:53:40 +0800
9b9960a0c nfsd: Add a super simple flex file server

Have a simple flex file server where the mds (NFSv4.1 or NFSv4.2)
is also the ds (NFSv3). I.e., the metadata and the data file are
the exact same file.

This will allow testing of the flex file client.

Simply add the "pnfs" export option to your export
in /etc/exports and mount from a client that supports
flex files.

Signed-off-by: Tom Haynes
Reviewed-by: Christoph Hellwig
Signed-off-by: J. Bruce Fields

Tom Haynes
2016-07-14 03:40:48 +0800
d7c920d13 nfsd: flex file device id encoding will need the server address

Signed-off-by: Tom Haynes
Reviewed-by: Christoph Hellwig
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Tom Haynes
2016-07-14 03:40:47 +0800
04d70edad sunrpc: add gss minor status to svcauth_gss_proxy_init

GSS-Proxy doesn't produce very much debug logging at all. Printing out
the gss minor status will aid in troubleshooting if the
GSS_Accept_sec_context upcall fails.

Signed-off-by: Scott Mayhew
Signed-off-by: J. Bruce Fields

Scott Mayhew
2016-07-14 03:40:46 +0800
ed9416439 nfsd: implement machine credential support for some operations

This addresses the conundrum referenced in RFC5661 18.35.3,
and will allow clients to return state to the server using the
machine credentials.

The biggest part of the problem is that we need to allow the client
to send a compound op with integrity/privacy on mounts that don't
have it enabled.

Add server support for properly decoding and using spo_must_enforce
and spo_must_allow bits. Add support for machine credentials to be
used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN,
and TEST/FREE STATEID.
Implement a check so as to not throw WRONGSEC errors when these
operations are used if integrity/privacy isn't turned on.

Without this, Linux clients with credentials that expired while holding
delegations were getting stuck in an endless loop.

Signed-off-by: Andrew Elble
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Andrew Elble
2016-07-14 03:32:47 +0800
dedeb13f9 nfsd: allow mach_creds_match to be used more broadly

Rename mach_creds_match() to nfsd4_mach_creds_match() and un-staticify it.

Signed-off-by: Andrew Elble
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Andrew Elble
2016-07-14 03:32:47 +0800
1adf0c5a4 nfs/nfsd: Move useful bitfield ops to a commonly accessible place

So these may be used in nfsd as well

Signed-off-by: Andrew Elble
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Andrew Elble
2016-07-14 03:32:47 +0800
d8d29138b sunrpc: remove 'inuse' flag from struct cache_detail.

This field is not currently in use.

Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2016-07-14 03:32:47 +0800

02 Jul, 2016

1 commit

db1bb44c4 SUNRPC: Don't allocate a full sockaddr_storage for tracing

We're always tracing IPv4 or IPv6 addresses, so we can save a lot
of space on the ringbuffer by allocating the correct sockaddr size.
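
To make the size difference concrete, here is a userspace sketch that picks a storage size from the address family. The function name is hypothetical and the patch may size its trace buffer differently (for example, by always reserving room for the larger IPv6 address).

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Return how many bytes are needed to record the given address.
 * Only IPv4 and IPv6 are handled, mirroring the assumption in the
 * commit message; other families yield 0 in this sketch.
 */
static size_t sketch_traced_addr_len(const struct sockaddr *sa)
{
	switch (sa->sa_family) {
	case AF_INET:
		return sizeof(struct sockaddr_in);
	case AF_INET6:
		return sizeof(struct sockaddr_in6);
	default:
		return 0;
	}
}
```

Both results are smaller than sizeof(struct sockaddr_storage), which is sized to hold any supported address family.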

Signed-off-by: Trond Myklebust
Cc: stable@vger.kernel.org
Fixes: 83a712e0afef "sunrpc: add some tracepoints around ..."
Signed-off-by: J. Bruce Fields

Trond Myklebust
2016-07-02 04:28:44 +0800

01 Jul, 2016

2 commits

6343a2120 locks: use file_inode()

(Another one for the f_path debacle.)

ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.

The reason is that generic_add_lease() used filp->f_path.dentry->d_inode
while all the others use file_inode(). This makes a difference for files
opened on overlayfs, since the former points to the overlay inode and the
latter to the underlying inode.

So generic_add_lease() added the lease to the overlay inode and
generic_delete_lease() removed it from the underlying inode. When the file
was released the lease remained on the overlay inode's lock list, resulting
in use after free.

Reported-by: Eryu Guan
Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
Cc:
Signed-off-by: Miklos Szeredi
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Miklos Szeredi
2016-07-01 22:24:18 +0800
cb7d224f8 lockd: unregister notifier blocks if the service fails to come up completely

If the lockd service fails to start up then we need to be sure that the
notifier blocks are not registered, otherwise a subsequent start of the
service could cause the same notifier to be registered twice, leading to
soft lockups.
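
The failure mode and its fix can be modeled with a counter standing in for the notifier chain. Everything below is a hypothetical sketch of the usual goto-based unwind; it does not reproduce the lockd startup code.

```c
#include <stdbool.h>

/* How many times the sketch notifier is currently registered. */
static int sketch_registered;

static void sketch_register_notifier(void)   { sketch_registered++; }
static void sketch_unregister_notifier(void) { sketch_registered--; }

/*
 * Bring the service up. If a later step fails, undo the registration
 * so that a subsequent start cannot register the notifier twice.
 */
static int sketch_service_up(bool later_step_fails)
{
	sketch_register_notifier();

	if (later_step_fails)
		goto err_unregister;
	return 0;

err_unregister:
	sketch_unregister_notifier();
	return -1;
}
```

After a failed start followed by a successful one, the counter is 1; without the unwind it would be 2, which is the double registration the commit message warns about.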

Signed-off-by: Scott Mayhew
Cc: stable@vger.kernel.org
Fixes: 0751ddf77b6a "lockd: Register callbacks on the inetaddr_chain..."
Signed-off-by: J. Bruce Fields

Scott Mayhew
2016-07-01 04:35:07 +0800

27 Jun, 2016

2 commits

4c2e07c6a Linux 4.7-rc5

Linus Torvalds
2016-06-27 08:52:03 +0800
2ac9b9735 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Two straightforward fixes.

One is a concurrency issue only affecting SAS connected SATA drives,
but which could hang the storage subsystem if it triggers (because the
outstanding command count on error never goes back to zero) and the
other is a NO_TAG fallout from the switch to hostwide tags which
causes the system to crash on module insertion (we've checked
carefully and only the 53c700 family of drivers is vulnerable to this
issue)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
53c700: fix BUG on untagged commands
scsi: fix race between simultaneous decrements of ->host_failed

Linus Torvalds
2016-06-27 01:08:49 +0800

25 Jun, 2016

7 commits

da2f6aba4 Merge branch 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes part 2 from Chris Mason:
"This has one patch from Omar to bring iterate_shared back to btrfs.

We have a tree of work we queue up for directory items and it doesn't
lend itself well to shared access. While we're cleaning it up, Omar
has changed things to use an exclusive lock when there are delayed
items"

* 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes

Linus Torvalds
2016-06-25 23:53:38 +0800
b971712af Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"I have a two part pull this time because one of the patches Dave
Sterba collected needed to be against v4.7-rc2 or higher (we used
rc4). I try to make my for-linus-xx branch testable on top of the
last major so we can hand fixes to people on the list more easily, so
I've split this pull in two.

This first part has some fixes and two performance improvements that
we've been testing for some time.

Josef's two performance fixes are most notable. The transid tracking
patch makes a big improvement on pretty much every workload"

* 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: Force stripesize to the value of sectorsize
btrfs: fix disk_i_size update bug when fallocate() fails
Btrfs: fix error handling in map_private_extent_buffer
Btrfs: fix error return code in btrfs_init_test_fs()
Btrfs: don't do nocow check unless we have to
btrfs: fix deadlock in delayed_ref_async_start
Btrfs: track transid for delayed ref flushing

Linus Torvalds
2016-06-25 23:42:31 +0800
ca83a55c9 Merge tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Again pretty calm weeks: we've had only a few trivial / stable
HD-audio fixes in addition to a possible race fix for snd-dummy driver
spotted by syzkaller"

* tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: dummy: Fix a use-after-free at closing
ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup
ALSA: hda - Fix the headset mic jack detection on Dell machine
ALSA: hda/tegra: iomem fixups for sparse warnings
ALSA: hdac_regmap - fix the register access for runtime PM

Linus Torvalds
2016-06-25 21:55:48 +0800
9a949a985 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 kprobe fix from Thomas Gleixner:
"A single fix clearing the TF bit when a fault is single stepped"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes/x86: Clear TF bit in fault on single-stepping

Linus Torvalds
2016-06-25 21:49:32 +0800
57801c1b8 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
"A couple of scheduler fixes:

- force watchdog reset while processing sysrq-w

- fix a deadlock when enabling trace events in the scheduler

- fixes to the throttled next buddy logic

- fixes for the average accounting (missing serialization and
underflow handling)

- allow kernel threads for fallback to online but not active cpus"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Allow kthreads to fall back to online && !active cpus
sched/fair: Do not announce throttled next buddy in dequeue_task_fair()
sched/fair: Initialize throttle_count for new task-groups lazily
sched/fair: Fix cfs_rq avg tracking underflow
kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w
sched/debug: Fix deadlock when enabling sched events
sched/fair: Fix post_init_entity_util_avg() serialization

Linus Torvalds
2016-06-25 21:38:42 +0800
02dbfc99b Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes

Commit fe742fd4f90f ("Revert "btrfs: switch to ->iterate_shared()"")
backed out the conversion to ->iterate_shared() for Btrfs because the
delayed inode handling in btrfs_real_readdir() is racy. However, we can
still do readdir in parallel if there are no delayed nodes.

This is a temporary fix which upgrades the shared inode lock to an
exclusive lock only when we have delayed items until we come up with a
more complete solution. While we're here, rename the
btrfs_{get,put}_delayed_items functions to make it very clear that
they're just for readdir.

Tested with xfstests and by doing a parallel kernel build:

while make tinyconfig && make -j4 && git clean -dqfx; do
:
done

along with a bunch of parallel finds in another shell:

while true; do
for ((i=0; i/dev/null &
done
wait
done

Signed-off-by: Omar Sandoval
Signed-off-by: David Sterba
Signed-off-by: Chris Mason

Omar Sandoval
2016-06-25 21:20:10 +0800
e3b22bc3d Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
"A single fix to address a race in the static key logic"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_key: Fix concurrent static_key_slow_inc()

Linus Torvalds
2016-06-25 21:14:44 +0800