Eric Lee / smarc-fsl-linux-kernel

25 Mar, 2016

1 commit

5b1e167d8 Merge tag 'nfsd-4.6' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"Various bugfixes, a RDMA update from Chuck Lever, and support for a
new pnfs layout type from Christoph Hellwig. The new layout type is a
variant of the block layout which uses SCSI features to offer improved
fencing and device identification.

(Also: note this pull request also includes the client side of SCSI
layout, with Trond's permission.)"

* tag 'nfsd-4.6' of git://linux-nfs.org/~bfields/linux:
sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race
nfsd: recover: fix memory leak
nfsd: fix deadlock secinfo+readdir compound
nfsd4: resfh unused in nfsd4_secinfo
svcrdma: Use new CQ API for RPC-over-RDMA server send CQs
svcrdma: Use new CQ API for RPC-over-RDMA server receive CQs
svcrdma: Remove close_out exit path
svcrdma: Hook up the logic to return ERR_CHUNK
svcrdma: Use correct XID in error replies
svcrdma: Make RDMA_ERROR messages work
rpcrdma: Add RPCRDMA_HDRLEN_ERR
svcrdma: svc_rdma_post_recv() should close connection on error
svcrdma: Close connection when a send error occurs
nfsd: Lower NFSv4.1 callback message size limit
svcrdma: Do not send Write chunk XDR pad with inline content
svcrdma: Do not write xdr_buf::tail in a Write chunk
svcrdma: Find client-provided write and reply chunks once per reply
nfsd: Update NFS server comments related to RDMA support
nfsd: Fix a memory leak when meeting unsupported state_protect_how4
nfsd4: fix bad bounds checking

Linus Torvalds
2016-03-25 01:41:00 +0800

18 Mar, 2016

1 commit

956ccef3c nfsd: recover: fix memory leak ... Browse Code »

nfsd4_cltrack_grace_start() will allocate the memory for grace_start but
when we returned due to error we missed freeing it.

Signed-off-by: Sudip Mukherjee
Signed-off-by: J. Bruce Fields

Sudip Mukherjee
2016-03-18 02:57:15 +0800

27 Jan, 2016

1 commit

1edb82d20 nfsd: Use shash ... Browse Code »

This patch replaces uses of the long obsolete hash interface with
shash.

Signed-off-by: Herbert Xu

Herbert Xu
2016-01-27 20:36:13 +0800

23 Jan, 2016

1 commit

5955102c9 wrappers for ->i_mutex access ... Browse Code »

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro

Al Viro
2016-01-23 07:04:28 +0800

24 Nov, 2015

1 commit

7c582e4fa nfsd: recover: constify nfsd4_client_tracking_ops structures ... Browse Code »

The nfsd4_client_tracking_ops structures are never modified, so declare
them as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Julia Lawall
2015-11-24 03:15:30 +0800

01 Sep, 2015

1 commit

46cc8ba30 nfsd: don't WARN/backtrace for invalid container deployment. ... Browse Code »

These messages, combined with the backtrace they trigger, makes it seem
like a serious problem, though a quick search shows distros marking
it as a "won't fix" non-issue when the problem is reported by users.

The backtrace is overkill, and only really manages to show that if
you follow the code path, you can't really avoid it with bootargs
or configuration settings in the container.

Given that, lets tone it down a bit and get rid of the WARN severity,
and the associated backtrace, so people aren't needlessly alarmed.

Also, lets drop the split printk line, since they are grep unfriendly.

Signed-off-by: Paul Gortmaker
Signed-off-by: J. Bruce Fields

Paul Gortmaker
2015-09-01 04:32:08 +0800

21 Jul, 2015

1 commit

4691b271a nfsd: Fix a memory leak in nfsd4_list_rec_dir() ... Browse Code »

If lookup_one_len() failed, nfsd should free those memory allocated for fname.

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2015-07-21 02:58:45 +0800

16 Apr, 2015

1 commit

2b0143b5c VFS: normal filesystems (and lustre): d_inode() annotations ... Browse Code »

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-04-16 03:06:57 +0800

23 Feb, 2015

1 commit

e36cb0b89 VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) ... Browse Code »

Convert the following where appropriate:

(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.

In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).

Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.

However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.

There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.

The following perl+coccinelle script was used:

use strict;

my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = ;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}

my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );

my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}

[AV: overlayfs parts skipped]

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-02-23 00:38:41 +0800

20 Nov, 2014

1 commit

ef8a1a10e nfsd: get rid of ->f_dentry ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2014-11-20 02:01:23 +0800

01 Nov, 2014

1 commit

ac7576f4b vfs: make first argument of dir_context.actor typed ... Browse Code »

Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro

Miklos Szeredi
2014-11-01 05:48:54 +0800

13 Oct, 2014

1 commit

faafcba3b Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:

- Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
Hansen)

- Various sched/idle refinements for better idle handling (Nicolas
Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)

- sched/numa updates and optimizations (Rik van Riel)

- sysbench speedup (Vincent Guittot)

- capacity calculation cleanups/refactoring (Vincent Guittot)

- Various cleanups to thread group iteration (Oleg Nesterov)

- Double-rq-lock removal optimization and various refactorings
(Kirill Tkhai)

- various sched/deadline fixes

... and lots of other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
sched/fair: Delete resched_cpu() from idle_balance()
sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
sched: Improve sysbench performance by fixing spurious active migration
sched/x86: Fix up typo in topology detection
x86, sched: Add new topology for multi-NUMA-node CPUs
sched/rt: Use resched_curr() in task_tick_rt()
sched: Use rq->rd in sched_setaffinity() under RCU read lock
sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
sched: Use dl_bw_of() under RCU read lock
sched/fair: Remove duplicate code from can_migrate_task()
sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
sched: print_rq(): Don't use tasklist_lock
sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
sched: Fix the task-group check in tg_has_rt_tasks()
sched/fair: Leverage the idle state info when choosing the "idlest" cpu
sched: Let the scheduler see CPU idle states
sched/deadline: Fix inter- exclusive cpusets migrations
sched/deadline: Clear dl_entity params when setscheduling to different class
sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
...

Linus Torvalds
2014-10-13 22:23:15 +0800

19 Sep, 2014

1 commit

f139caf2e sched, cleanup, treewide: Remove set_current_state(TASK_RUNNING) after schedule() ... Browse Code »

schedule(), io_schedule() and schedule_timeout() always return
with TASK_RUNNING state set, so one more setting is unnecessary.

(All places in patch are visible good, only exception is
kiblnd_scheduler() from:

drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c

Its schedule() is one line above standard 3 lines of unified diff)

No places where set_current_state() is used for mb().

Signed-off-by: Kirill Tkhai
Signed-off-by: Peter Zijlstra (Intel)
Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai
Cc: Alasdair Kergon
Cc: Anil Belur
Cc: Arnd Bergmann
Cc: Dave Kleikamp
Cc: David Airlie
Cc: David Howells
Cc: Dmitry Eremin
Cc: Frank Blaschka
Cc: Greg Kroah-Hartman
Cc: Heiko Carstens
Cc: Helge Deller
Cc: Isaac Huang
Cc: James E.J. Bottomley
Cc: James E.J. Bottomley
Cc: J. Bruce Fields
Cc: Jeff Dike
Cc: Jesper Nilsson
Cc: Jiri Slaby
Cc: Laura Abbott
Cc: Liang Zhen
Cc: Linus Torvalds
Cc: Martin Schwidefsky
Cc: Masaru Nomura
Cc: Michael Opdenacker
Cc: Mikael Starvik
Cc: Mike Snitzer
Cc: Neil Brown
Cc: Oleg Drokin
Cc: Peng Tao
Cc: Richard Weinberger
Cc: Robert Love
Cc: Steven Rostedt
Cc: Trond Myklebust
Cc: Ursula Braun
Cc: Zi Shen Lim
Cc: devel@driverdev.osuosl.org
Cc: dm-devel@redhat.com
Cc: dri-devel@lists.freedesktop.org
Cc: fcoe-devel@open-fcoe.org
Cc: jfs-discussion@lists.sourceforge.net
Cc: linux390@de.ibm.com
Cc: linux-afs@lists.infradead.org
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-nfs@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: qla2xxx-upstream@qlogic.com
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: user-mode-linux-user@lists.sourceforge.net
Signed-off-by: Ingo Molnar

Kirill Tkhai
2014-09-19 18:35:17 +0800

18 Sep, 2014

5 commits

65decb650 nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients ... Browse Code »

In the case of v4.0 clients, we may call into the "create" client
tracking operation multiple times (once for each openowner). Upcalling
for each one of those is wasteful and slow however. We can skip doing
further "create" operations after the first one if we know that one has
already been done.

v4.1+ clients generally only call into this function once (on
RECLAIM_COMPLETE), and we can't skip upcalling on the create even if the
STABLE bit is set. Doing so would make it impossible for nfsdcltrack to
lift the grace period early since the timestamp has a different meaning
in the case where the client is expected to issue a RECLAIM_COMPLETE.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:17 +0800
788a7914a nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls ... Browse Code »

The nfsdcltrack upcall doesn't utilize the NFSD4_CLIENT_STABLE flag,
which basically results in an upcall every time we call into the client
tracking ops.

Change it to set this bit on a successful "check" or "create" request,
and clear it on a "remove" request. Also, check to see if that bit is
set before upcalling on a "check" or "remove" request, and skip
upcalling appropriately, depending on its state.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:17 +0800
d682e750c nfsd: serialize nfsdcltrack upcalls for a particular client ... Browse Code »

In a later patch, we want to add a flag that will allow us to reduce the
need for upcalls. In order to handle that correctly, we'll need to
ensure that racing upcalls for the same client can't occur. In practice
it should be rare for this to occur with a well-behaved client, but it
is possible.

Convert one of the bits in the cl_flags field to be an upcall bitlock,
and use it to ensure that upcalls for the same client are serialized.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:16 +0800
d4318acd5 nfsd: pass extra info in env vars to upcalls to allow for early grace period end ... Browse Code »

In order to support lifting the grace period early, we must tell
nfsdcltrack what sort of client the "create" upcall is for. We can't
reliably tell if a v4.0 client has completed reclaiming, so we can only
lift the grace period once all the v4.1+ clients have issued a
RECLAIM_COMPLETE and if there are no v4.0 clients.

Also, in order to lift the grace period, we have to tell userland when
the grace period started so that it can tell whether a RECLAIM_COMPLETE
has been issued for each client since then.

Since this is all optional info, we pass it along in environment
variables to the "init" and "create" upcalls. By doing this, we don't
need to revise the upcall format. The UMH upcall can simply make use of
this info if it happens to be present. If it's not then it can just
avoid lifting the grace period early.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:15 +0800
919b8049f nfsd: remove redundant boot_time parm from grace_done client tracking op ... Browse Code »

Since it's stored in nfsd_net, we don't need to pass it in separately.

Signed-off-by: Jeff Layton

Jeff Layton
2014-09-18 04:33:12 +0800

04 Sep, 2014

2 commits

15d176c19 NFSD: Fix a memory leak if nfsd4_recdir_load fail ... Browse Code »

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-09-04 05:43:01 +0800
c2236f141 NFSD: Reset creds after mnt_want_write_file() fail ... Browse Code »

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-09-04 05:43:01 +0800

25 Oct, 2013

1 commit

a6a9f18f0 nfsd: switch to %p[dD] ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-10-25 11:34:51 +0800

31 Aug, 2013

1 commit

248f807b4 nfsd4: nfsd4_create_clid_dir prints uninitialized data ... Browse Code »

Take the easy way out and just remove the printk.

Reported-by: David Howells

J. Bruce Fields
2013-08-31 05:30:52 +0800

29 Jun, 2013

3 commits

ac6614b76 [readdir] constify ->actor ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:57:05 +0800
bb6f619b3 [readdir] introduce ->iterate(), ctx->pos, dir_emit() ... Browse Code »

New method - ->iterate(file, ctx). That's the replacement for ->readdir();
it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not*
update file->f_pos (or look at it, for that matter); iterate_dir() does the
update.

Note that dir_emit() takes the offset from ctx->pos (and eventually
filldir_t will lose that argument).

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:46:47 +0800
5c0ba4e07 [readdir] introduce iterate_dir() and dir_context ... Browse Code »

iterate_dir(): new helper, replacing vfs_readdir().

struct dir_context: contains the readdir callback (and will get more stuff
in it), embedded into whatever data that callback wants to deal with;
eventually, we'll be passing it to ->readdir() replacement instead of
(data,filldir) pair.

Signed-off-by: Al Viro

Al Viro
2013-06-29 16:46:46 +0800

10 May, 2013

1 commit

7255e716b nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error ... Browse Code »

Toralf reported the following oops to the linux-nfs mailing list:

-----------------[snip]------------------
NFSD: unable to generate recoverydir name (-2).
NFSD: disabling legacy clientid tracking. Reboot recovery will not function correctly!
BUG: unable to handle kernel NULL pointer dereference at 000003c8
IP: [] nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
*pdpt = 000000002ba33001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in: loop nfsd auth_rpcgss ipt_MASQUERADE xt_owner xt_multiport ipt_REJECT xt_tcpudp xt_recent xt_conntrack nf_conntrack_ftp xt_limit xt_LOG iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables af_packet pppoe pppox ppp_generic slhc bridge stp llc tun arc4 iwldvm mac80211 coretemp kvm_intel uvcvideo sdhci_pci sdhci mmc_core videobuf2_vmalloc videobuf2_memops usblp videobuf2_core i915 iwlwifi psmouse videodev cfg80211 kvm fbcon bitblit cfbfillrect acpi_cpufreq mperf evdev softcursor font cfbimgblt i2c_algo_bit cfbcopyarea intel_agp intel_gtt drm_kms_helper snd_hda_codec_conexant drm agpgart fb fbdev tpm_tis thinkpad_acpi tpm nvram e1000e rfkill thermal ptp wmi pps_core tpm_bios 8250_pci processor 8250 ac snd_hda_intel snd_hda_codec snd_pcm battery video i2c_i801 snd_page_alloc snd_timer button serial_core i2c_core snd soundcore thermal_sys hwmon aesni_intel ablk_helper cryp
td lrw aes_i586 xts gf128mul cbc fuse nfs lockd sunrpc dm_crypt dm_mod hid_monterey hid_microsoft hid_logitech hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid hid sr_mod cdrom sg [last unloaded: microcode]
Pid: 6374, comm: nfsd Not tainted 3.9.1 #6 LENOVO 4180F65/4180F65
EIP: 0060:[] EFLAGS: 00010202 CPU: 0
EIP is at nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
EAX: 00000000 EBX: fffffffe ECX: 00000007 EDX: 00000007
ESI: eb9dcb00 EDI: eb2991c0 EBP: eb2bde38 ESP: eb2bde34
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 80050033 CR2: 000003c8 CR3: 2ba80000 CR4: 000407f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process nfsd (pid: 6374, ti=eb2bc000 task=eb2711c0 task.ti=eb2bc000)
Stack:
fffffffe eb2bde4c f90a3e0c f90a7754 fffffffe eb0a9c00 eb2bdea0 f90a41ed
eb2991c0 1b270000 eb2991c0 eb2bde7c f9099ce9 eb2bde98 0129a020 eb29a020
eb2bdecc eb2991c0 eb2bdea8 f9099da5 00000000 eb9dcb00 00000001 67822f08
Call Trace:
[] legacy_recdir_name_error+0x3c/0x40 [nfsd]
[] nfsd4_create_clid_dir+0x15d/0x1c0 [nfsd]
[] ? nfsd4_lookup_stateid+0x99/0xd0 [nfsd]
[] ? nfs4_preprocess_seqid_op+0x85/0x100 [nfsd]
[] nfsd4_client_record_create+0x37/0x50 [nfsd]
[] nfsd4_open_confirm+0xfe/0x130 [nfsd]
[] ? nfsd4_encode_operation+0x61/0x90 [nfsd]
[] ? nfsd4_free_stateid+0xc0/0xc0 [nfsd]
[] nfsd4_proc_compound+0x41b/0x530 [nfsd]
[] nfsd_dispatch+0x8b/0x1a0 [nfsd]
[] svc_process+0x3dd/0x640 [sunrpc]
[] nfsd+0xad/0x110 [nfsd]
[] ? nfsd_destroy+0x70/0x70 [nfsd]
[] kthread+0x94/0xa0
[] ret_from_kernel_thread+0x1b/0x28
[] ? flush_kthread_work+0xd0/0xd0
Code: 86 b0 00 00 00 90 c5 0a f9 c7 04 24 70 76 0a f9 e8 74 a9 3d c8 eb ba 8d 76 00 55 89 e5 53 66 66 66 66 90 8b 15 68 c7 0a f9 85 d2 88 c8 03 00 00 74 2c 3b 11 77 28 8b 5c 91 08 85 db 74 22 8b
EIP: [] nfsd4_client_tracking_exit+0x11/0x50 [nfsd] SS:ESP 0068:eb2bde34
CR2: 00000000000003c8
---[ end trace 09e54015d145c9c6 ]---

The problem appears to be a regression that was introduced in commit
9a9c6478 "nfsd: make NFSv4 recovery client tracking options per net".
Prior to that commit, it was safe to pass a NULL net pointer to
nfsd4_client_tracking_exit in the legacy recdir case, and
legacy_recdir_name_error did so. After that comit, the net pointer must
be valid.

This patch just fixes legacy_recdir_name_error to pass in a valid net
pointer to that function.

Cc: # v3.8+
Cc: Stanislav Kinsbursky
Reported-and-tested-by: Toralf Förster
Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2013-05-10 00:58:36 +0800

01 Mar, 2013

1 commit

b6669737d Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd changes from J Bruce Fields:
"Miscellaneous bugfixes, plus:

- An overhaul of the DRC cache by Jeff Layton. The main effect is
just to make it larger. This decreases the chances of intermittent
errors especially in the UDP case. But we'll need to watch for any
reports of performance regressions.

- Containerized nfsd: with some limitations, we now support
per-container nfs-service, thanks to extensive work from Stanislav
Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

- idr_remove_all() had been added by a memory leak fix, but has since
become deprecated since idr_destroy() does it for us now.

- xs_local_connect() had been added by this branch to make AF_LOCAL
connections be synchronous, but in the meantime Trond had changed the
calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
SUNRPC: make AF_LOCAL connect synchronous
nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
svcrpc: fix rpc server shutdown races
svcrpc: make svc_age_temp_xprts enqueue under sv_lock
lockd: nlmclnt_reclaim(): avoid stack overflow
nfsd: enable NFSv4 state in containers
nfsd: disable usermode helper client tracker in container
nfsd: use proper net while reading "exports" file
nfsd: containerize NFSd filesystem
nfsd: fix comments on nfsd_cache_lookup
SUNRPC: move cache_detail->cache_request callback call to cache_read()
SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
SUNRPC: rework cache upcall logic
SUNRPC: introduce cache_detail->cache_request callback
NFS: simplify and clean cache library
NFS: use SUNRPC cache creation and destruction helper for DNS cache
nfsd4: free_stid can be static
nfsd: keep a checksum of the first 256 bytes of request
sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
sunrpc: fix comment in struct xdr_buf definition
...

Linus Torvalds
2013-03-01 10:02:55 +0800

16 Feb, 2013

1 commit

71a503069 nfsd: disable usermode helper client tracker in container ... Browse Code »

This tracker uses khelper kthread to execute binaries.
Execution itself is done from kthread context - i.e. global root is used.
This is not suitable for containers with own root.
So, disable this tracker for a while.

Note: one of possible solutions can be pass "init" callback to khelper, which
will swap root to desired one.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2013-02-16 00:21:01 +0800

13 Feb, 2013

1 commit

6fab87790 nfsd: Properly compare and initialize kuids and kgids ... Browse Code »

Use uid_eq(uid, GLOBAL_ROOT_UID) instead of !uid.
Use gid_eq(gid, GLOBAL_ROOT_GID) instead of !gid.
Use uid_eq(uid, INVALID_UID) instead of uid == -1
Use gid_eq(uid, INVALID_GID) instead of gid == -1
Use uid = GLOBAL_ROOT_UID instead of uid = 0;
Use gid = GLOBAL_ROOT_GID instead of gid = 0;
Use !uid_eq(uid1, uid2) instead of uid1 != uid2.
Use !gid_eq(gid1, gid2) instead of gid1 != gid2.
Use uid_eq(uid1, uid2) instead of uid1 == uid2.

Cc: "J. Bruce Fields"
Cc: Trond Myklebust
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:16:09 +0800

11 Dec, 2012

1 commit

9a9c6478a nfsd: make NFSv4 recovery client tracking options per net ... Browse Code »

Pointer to client tracking operations - client_tracking_ops - have to be
containerized, because different environment can support different trackers
(for example, legacy tracker currently is not suported in container).

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-12-11 05:25:30 +0800

28 Nov, 2012

3 commits

f141f79d7 nfsd: recovery - make in_grace per net ... Browse Code »

Flag in_grace is a part of client tracking state, which is network namesapce
aware. So let'a replace global static variable with per-net one.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-28 23:13:54 +0800
3a0733692 nfsd: recovery - make rec_file per net ... Browse Code »

Opening and closing of this file is done in client tracking init and exit
operations.
Client tracking is done in network namespace context already. So let's make
this file opened and closed per network context - this will simlify it's
management.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-28 23:13:53 +0800
dba88ba55 nfsd4: remove state lock from nfsd4_load_reboot_recovery_data ... Browse Code »

That function is only called under nfsd_mutex: we know that because the
only caller is nfsd_svc, via

nfsd_svc
nfsd_startup
nfs4_state_start
nfsd4_client_tracking_init
client_tracking_ops->init == nfsd4_load_reboot_recovery_data

The shared state accessed here includes:

- user_recovery_dirname: used here, modified only by
nfs4_reset_recoverydir, which can be verified to only be
called under nfsd_mutex.
- filesystem state, protected by i_mutex (handwaving slightly
here)
- rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other
than here, used only from code called from nfsd or laundromat
threads, both of which should be started only after this runs
(see nfsd_svc) and stopped before this could run again (see
nfsd_shutdown, called from nfsd_last_thread).

Signed-off-by: J. Bruce Fields

J. Bruce Fields
2012-11-28 23:13:48 +0800

15 Nov, 2012

3 commits

12760c668 nfsd: pass nfsd_net instead of net to grace enders ... Browse Code »

Passing net context looks as overkill.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-15 20:40:50 +0800
52e19c09a nfsd: make reclaim_str_hashtbl allocated per net ... Browse Code »

This hash holds nfs4_clients info, which are network namespace aware.
So let's make it allocated per network namespace.

Note: this hash is used only by legacy tracker. So let's allocate hash in
tracker init.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-15 20:40:43 +0800
c212cecfa nfsd: make nfs4_client network namespace dependent ... Browse Code »

And use it's net where possible.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: J. Bruce Fields

Stanislav Kinsbursky
2012-11-15 20:40:42 +0800

13 Nov, 2012

4 commits

7e4f015d8 nfsd: release the legacy reclaimable clients list in grace_done ... Browse Code »

The current code holds on to this list until nfsd is shut down, but it's
never touched once the grace period ends. Release that memory back into
the wild when the grace period ends.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2012-11-13 07:55:12 +0800
2216d449a nfsd: get rid of cl_recdir field ... Browse Code »

Remove the cl_recdir field from the nfs4_client struct. Instead, just
compute it on the fly when and if it's needed, which is now only when
the legacy client tracking code is in effect.

The error handling in the legacy client tracker is also changed to
handle the case where md5 is unavailable. In that case, we'll warn
the admin with a KERN_ERR message and disable the client tracking.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2012-11-13 07:55:11 +0800
0ce0c2b5d nfsd: don't search for client by hash on legacy reboot recovery gracedone ... Browse Code »

When nfsd starts, the legacy reboot recovery code creates a tracking
struct for each directory in the v4recoverydir. When the grace period
ends, it basically does a "readdir" on the directory again, and matches
each dentry in there to an existing client id to see if it should be
removed or not. If the matching client doesn't exist, or hasn't
reclaimed its state then it will remove that dentry.

This is pretty inefficient since it involves doing a lot of hash-bucket
searching. It also means that we have to keep relying on being able to
search for a nfs4_client by md5 hashed cl_recdir name.

Instead, add a pointer to the nfs4_client that indicates the association
between the nfs4_client_reclaim and nfs4_client. When a reclaim operation
comes in, we set the pointer to make that association. On gracedone, the
legacy client tracker will keep the recdir around iff:

1/ there is a reclaim record for the directory

...and...

2/ there's an association between the reclaim record and a client record
-- that is, a create or check operation was performed on the client that
matches that directory.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2012-11-13 07:55:11 +0800
278c931cb nfsd: have nfsd4_find_reclaim_client take a char * argument ... Browse Code »

Currently, it takes a client pointer, but later we're going to need to
search for these records without knowing whether a matching client even
exists.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2012-11-13 07:55:11 +0800