25 Mar, 2016

1 commit

  • Pull nfsd updates from Bruce Fields:
    "Various bugfixes, a RDMA update from Chuck Lever, and support for a
    new pnfs layout type from Christoph Hellwig. The new layout type is a
    variant of the block layout which uses SCSI features to offer improved
    fencing and device identification.

    (Also: note this pull request also includes the client side of SCSI
    layout, with Trond's permission.)"

    * tag 'nfsd-4.6' of git://linux-nfs.org/~bfields/linux:
    sunrpc/cache: drop reference when sunrpc_cache_pipe_upcall() detects a race
    nfsd: recover: fix memory leak
    nfsd: fix deadlock secinfo+readdir compound
    nfsd4: resfh unused in nfsd4_secinfo
    svcrdma: Use new CQ API for RPC-over-RDMA server send CQs
    svcrdma: Use new CQ API for RPC-over-RDMA server receive CQs
    svcrdma: Remove close_out exit path
    svcrdma: Hook up the logic to return ERR_CHUNK
    svcrdma: Use correct XID in error replies
    svcrdma: Make RDMA_ERROR messages work
    rpcrdma: Add RPCRDMA_HDRLEN_ERR
    svcrdma: svc_rdma_post_recv() should close connection on error
    svcrdma: Close connection when a send error occurs
    nfsd: Lower NFSv4.1 callback message size limit
    svcrdma: Do not send Write chunk XDR pad with inline content
    svcrdma: Do not write xdr_buf::tail in a Write chunk
    svcrdma: Find client-provided write and reply chunks once per reply
    nfsd: Update NFS server comments related to RDMA support
    nfsd: Fix a memory leak when meeting unsupported state_protect_how4
    nfsd4: fix bad bounds checking

    Linus Torvalds
     

18 Mar, 2016

1 commit


27 Jan, 2016

1 commit


23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
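
The naming rule in the entry above, inode_foo(inode) being mutex_foo(&inode->i_mutex), can be sketched in userspace as follows. This is only an illustration under stated assumptions: `mock_mutex` and `mock_inode` are hypothetical stand-ins for the kernel types, and the real wrappers may differ in return types and locking semantics.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel's struct mutex and struct inode. */
struct mock_mutex { bool locked; };
struct mock_inode { struct mock_mutex i_mutex; };

static void mock_mutex_lock(struct mock_mutex *m) { m->locked = true; }
static void mock_mutex_unlock(struct mock_mutex *m) { m->locked = false; }
static bool mock_mutex_is_locked(const struct mock_mutex *m) { return m->locked; }

static bool mock_mutex_trylock(struct mock_mutex *m)
{
    if (m->locked)
        return false;
    m->locked = true;
    return true;
}

/* Each inode_foo(inode) simply forwards to mutex_foo(&inode->i_mutex). */
static void inode_lock(struct mock_inode *inode) { mock_mutex_lock(&inode->i_mutex); }
static void inode_unlock(struct mock_inode *inode) { mock_mutex_unlock(&inode->i_mutex); }
static bool inode_trylock(struct mock_inode *inode) { return mock_mutex_trylock(&inode->i_mutex); }
static bool inode_is_locked(const struct mock_inode *inode) { return mock_mutex_is_locked(&inode->i_mutex); }
```

Callers that go through the wrappers never name ->i_mutex directly, which is what lets the lock type behind them change later without touching every call site.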
     

24 Nov, 2015

1 commit


01 Sep, 2015

1 commit

  • These messages, combined with the backtrace they trigger, make it seem
    like a serious problem, though a quick search shows distros marking
    it as a "won't fix" non-issue when the problem is reported by users.

    The backtrace is overkill, and only really manages to show that if
    you follow the code path, you can't really avoid it with bootargs
    or configuration settings in the container.

    Given that, let's tone it down a bit and get rid of the WARN severity,
    and the associated backtrace, so people aren't needlessly alarmed.

    Also, let's drop the split printk lines, since they are grep-unfriendly.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: J. Bruce Fields

    Paul Gortmaker
     

21 Jul, 2015

1 commit


16 Apr, 2015

1 commit


23 Feb, 2015

1 commit

  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    open(my $fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
    print "No matches\n";
    exit(0);
    }

    my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
    die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
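
The distinction drawn in point (3) above, and the way a type check can subsume a NULL d_inode check, can be sketched with a small userspace model. All `mock_` names are hypothetical; the sketch assumes a dentry carries a type field in which a "real" directory and an automount-point directory are distinct values, as the entry describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the dentry type field described above. */
enum mock_dentry_type {
    MOCK_MISS_TYPE,       /* negative dentry: no inode attached */
    MOCK_DIRECTORY_TYPE,  /* real dir with a ->lookup op */
    MOCK_AUTODIR_TYPE,    /* fake dir with a ->d_automount op */
    MOCK_REGULAR_TYPE,
    MOCK_SYMLINK_TYPE
};

struct mock_dentry {
    enum mock_dentry_type type;
    void *d_inode;        /* NULL when the dentry is negative */
};

/* True only for directories that can actually be looked up in. */
static bool mock_d_can_lookup(const struct mock_dentry *d)
{
    return d->type == MOCK_DIRECTORY_TYPE;
}

/* True for both real directories and automount-point directories. */
static bool mock_d_is_dir(const struct mock_dentry *d)
{
    return d->type == MOCK_DIRECTORY_TYPE || d->type == MOCK_AUTODIR_TYPE;
}

/* The type test never dereferences d_inode, so it is safe on a negative dentry. */
static bool mock_d_is_reg(const struct mock_dentry *d)
{
    return d->type == MOCK_REGULAR_TYPE;
}

static bool mock_d_is_symlink(const struct mock_dentry *d)
{
    return d->type == MOCK_SYMLINK_TYPE;
}
```

A call site that only wants to know "is this a directory at all" would use the broader test, while one that is about to walk into the directory would want the narrower one.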
     

20 Nov, 2014

1 commit


01 Nov, 2014

1 commit


13 Oct, 2014

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
    Hansen)

    - Various sched/idle refinements for better idle handling (Nicolas
    Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)

    - sched/numa updates and optimizations (Rik van Riel)

    - sysbench speedup (Vincent Guittot)

    - capacity calculation cleanups/refactoring (Vincent Guittot)

    - Various cleanups to thread group iteration (Oleg Nesterov)

    - Double-rq-lock removal optimization and various refactorings
    (Kirill Tkhai)

    - various sched/deadline fixes

    ... and lots of other changes"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
    sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
    sched/fair: Delete resched_cpu() from idle_balance()
    sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
    sched: Improve sysbench performance by fixing spurious active migration
    sched/x86: Fix up typo in topology detection
    x86, sched: Add new topology for multi-NUMA-node CPUs
    sched/rt: Use resched_curr() in task_tick_rt()
    sched: Use rq->rd in sched_setaffinity() under RCU read lock
    sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
    sched: Use dl_bw_of() under RCU read lock
    sched/fair: Remove duplicate code from can_migrate_task()
    sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
    sched: print_rq(): Don't use tasklist_lock
    sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
    sched: Fix the task-group check in tg_has_rt_tasks()
    sched/fair: Leverage the idle state info when choosing the "idlest" cpu
    sched: Let the scheduler see CPU idle states
    sched/deadline: Fix inter- exclusive cpusets migrations
    sched/deadline: Clear dl_entity params when setscheduling to different class
    sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
    ...

    Linus Torvalds
     

19 Sep, 2014

1 commit

  • schedule(), io_schedule() and schedule_timeout() always return
    with TASK_RUNNING state set, so one more setting is unnecessary.

    (All places in the patch are visibly good; the only exception is
    kiblnd_scheduler() from:

    drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c

    Its schedule() call is one line above the standard 3 lines of unified diff
    context.)

    No places were found where set_current_state() is used for mb().

    Signed-off-by: Kirill Tkhai
    Signed-off-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai
    Cc: Alasdair Kergon
    Cc: Anil Belur
    Cc: Arnd Bergmann
    Cc: Dave Kleikamp
    Cc: David Airlie
    Cc: David Howells
    Cc: Dmitry Eremin
    Cc: Frank Blaschka
    Cc: Greg Kroah-Hartman
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Isaac Huang
    Cc: James E.J. Bottomley
    Cc: J. Bruce Fields
    Cc: Jeff Dike
    Cc: Jesper Nilsson
    Cc: Jiri Slaby
    Cc: Laura Abbott
    Cc: Liang Zhen
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Masaru Nomura
    Cc: Michael Opdenacker
    Cc: Mikael Starvik
    Cc: Mike Snitzer
    Cc: Neil Brown
    Cc: Oleg Drokin
    Cc: Peng Tao
    Cc: Richard Weinberger
    Cc: Robert Love
    Cc: Steven Rostedt
    Cc: Trond Myklebust
    Cc: Ursula Braun
    Cc: Zi Shen Lim
    Cc: devel@driverdev.osuosl.org
    Cc: dm-devel@redhat.com
    Cc: dri-devel@lists.freedesktop.org
    Cc: fcoe-devel@open-fcoe.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux390@de.ibm.com
    Cc: linux-afs@lists.infradead.org
    Cc: linux-cris-kernel@axis.com
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-raid@vger.kernel.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-scsi@vger.kernel.org
    Cc: qla2xxx-upstream@qlogic.com
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Cc: user-mode-linux-user@lists.sourceforge.net
    Signed-off-by: Ingo Molnar

    Kirill Tkhai
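
The redundancy removed by the entry above can be illustrated with a userspace model of a wait loop. This is a sketch under assumptions: `mock_schedule()` and the `mock_` state names are hypothetical, and the only property modelled is the one the entry states, that schedule() always returns with TASK_RUNNING set.

```c
/* Hypothetical model of a task's scheduling state. */
enum mock_task_state { MOCK_TASK_RUNNING, MOCK_TASK_INTERRUPTIBLE };

static enum mock_task_state mock_current_state = MOCK_TASK_RUNNING;

/* Models the stated property: the task is RUNNING again when this returns. */
static void mock_schedule(void)
{
    mock_current_state = MOCK_TASK_RUNNING;
}

/* Waits until *event_ready is nonzero; returns how many times it slept. */
static int mock_wait_for_event(int *event_ready)
{
    int sleeps = 0;

    for (;;) {
        mock_current_state = MOCK_TASK_INTERRUPTIBLE;
        if (*event_ready)
            break;
        mock_schedule();
        /* Setting the state to RUNNING here would be redundant. */
        *event_ready = 1;   /* pretend the waker fired while we slept */
        sleeps++;
    }
    /* Still needed: this path left the loop without calling mock_schedule(). */
    mock_current_state = MOCK_TASK_RUNNING;
    return sleeps;
}
```

Note that only the assignment directly after the schedule call is redundant; the one after the loop covers the path that never slept.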
     

18 Sep, 2014

5 commits

  • In the case of v4.0 clients, we may call into the "create" client
    tracking operation multiple times (once for each openowner). Upcalling
    for each one of those is wasteful and slow however. We can skip doing
    further "create" operations after the first one if we know that one has
    already been done.

    v4.1+ clients generally only call into this function once (on
    RECLAIM_COMPLETE), and we can't skip upcalling on the create even if the
    STABLE bit is set. Doing so would make it impossible for nfsdcltrack to
    lift the grace period early since the timestamp has a different meaning
    in the case where the client is expected to issue a RECLAIM_COMPLETE.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • The nfsdcltrack upcall doesn't utilize the NFSD4_CLIENT_STABLE flag,
    which basically results in an upcall every time we call into the client
    tracking ops.

    Change it to set this bit on a successful "check" or "create" request,
    and clear it on a "remove" request. Also, check to see if that bit is
    set before upcalling on a "check" or "remove" request, and skip
    upcalling appropriately, depending on its state.

    Signed-off-by: Jeff Layton

    Jeff Layton
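
The flag handling described in the two entries above can be sketched as a small state machine. This is a userspace illustration, not the nfsd code: the `mock_` names and the upcall counter are hypothetical, and the skip rules (skip "check" when the bit is set, skip "remove" when it is clear, skip repeated "create" only for v4.0 clients) are one reading of the description.

```c
#include <stdbool.h>

/* Hypothetical client record: 'stable' plays the role of the STABLE bit. */
struct mock_client {
    bool stable;
    int minorversion;
};

static int mock_upcalls;   /* counts how many upcalls were actually made */

/* Stand-in for the userspace upcall; always succeeds in this sketch. */
static bool mock_upcall(const char *cmd)
{
    (void)cmd;
    mock_upcalls++;
    return true;
}

static void mock_create(struct mock_client *clp)
{
    /* v4.0 only: one successful create is enough; v4.1+ always upcalls. */
    if (clp->minorversion == 0 && clp->stable)
        return;
    if (mock_upcall("create"))
        clp->stable = true;
}

static bool mock_check(struct mock_client *clp)
{
    if (clp->stable)        /* already known to be on stable storage */
        return true;
    if (!mock_upcall("check"))
        return false;
    clp->stable = true;
    return true;
}

static void mock_remove(struct mock_client *clp)
{
    if (!clp->stable)       /* nothing recorded, nothing to remove */
        return;
    if (mock_upcall("remove"))
        clp->stable = false;
}
```

With this shape, a v4.0 client that triggers "create" once per openowner costs one upcall in total, while a v4.1+ client still upcalls on every "create" so the helper sees each RECLAIM_COMPLETE.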
     
  • In a later patch, we want to add a flag that will allow us to reduce the
    need for upcalls. In order to handle that correctly, we'll need to
    ensure that racing upcalls for the same client can't occur. In practice
    it should be rare for this to occur with a well-behaved client, but it
    is possible.

    Convert one of the bits in the cl_flags field to be an upcall bitlock,
    and use it to ensure that upcalls for the same client are serialized.

    Signed-off-by: Jeff Layton

    Jeff Layton
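
Using one bit of a flags word as a lock, as the entry above describes, can be sketched with C11 atomics. The names here are hypothetical, and the sketch is a non-blocking try-lock for brevity; the kernel code may instead sleep until the bit becomes free.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit position reserved for the upcall lock. */
#define MOCK_UPCALL_LOCK_BIT (1u << 0)

struct mock_client {
    atomic_uint cl_flags;   /* other bits carry unrelated client state */
};

/* Atomically sets the lock bit; succeeds only if it was previously clear. */
static bool mock_upcall_trylock(struct mock_client *clp)
{
    unsigned int old = atomic_fetch_or(&clp->cl_flags, MOCK_UPCALL_LOCK_BIT);

    return !(old & MOCK_UPCALL_LOCK_BIT);
}

/* Clears only the lock bit, leaving the other flag bits untouched. */
static void mock_upcall_unlock(struct mock_client *clp)
{
    atomic_fetch_and(&clp->cl_flags, ~MOCK_UPCALL_LOCK_BIT);
}
```

Because the set and the test happen in a single atomic operation, two racing callers for the same client cannot both believe they hold the lock.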
     
  • In order to support lifting the grace period early, we must tell
    nfsdcltrack what sort of client the "create" upcall is for. We can't
    reliably tell if a v4.0 client has completed reclaiming, so we can only
    lift the grace period once all the v4.1+ clients have issued a
    RECLAIM_COMPLETE and if there are no v4.0 clients.

    Also, in order to lift the grace period, we have to tell userland when
    the grace period started so that it can tell whether a RECLAIM_COMPLETE
    has been issued for each client since then.

    Since this is all optional info, we pass it along in environment
    variables to the "init" and "create" upcalls. By doing this, we don't
    need to revise the upcall format. The UMH upcall can simply make use of
    this info if it happens to be present. If it's not then it can just
    avoid lifting the grace period early.

    Signed-off-by: Jeff Layton

    Jeff Layton
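
The "optional info in environment variables" approach from the entry above can be sketched from the helper's side. The variable name `MOCK_GRACE_START` and both function names are hypothetical, chosen only for illustration; the point is that a missing or malformed variable simply disables the early lift instead of being an error.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical variable name; the real upcall may use a different one. */
#define MOCK_GRACE_START_ENV "MOCK_GRACE_START"

/* Reads the grace-period start time, if the kernel side supplied one. */
static bool mock_get_grace_start(long *start)
{
    const char *val = getenv(MOCK_GRACE_START_ENV);
    char *end;

    if (val == NULL || *val == '\0')
        return false;
    *start = strtol(val, &end, 10);
    return *end == '\0';
}

/*
 * A client counts towards lifting the grace period early only if its
 * RECLAIM_COMPLETE timestamp is not older than the grace-period start.
 * Without the optional variable, never lift early.
 */
static bool mock_may_lift_grace_early(long reclaim_complete_time)
{
    long start;

    if (!mock_get_grace_start(&start))
        return false;
    return reclaim_complete_time >= start;
}
```

An older helper that never looks at the variable keeps working unchanged, which is why the upcall format itself does not need a revision.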
     
  • Since it's stored in nfsd_net, we don't need to pass it in separately.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

04 Sep, 2014

2 commits


25 Oct, 2013

1 commit


31 Aug, 2013

1 commit


29 Jun, 2013

3 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • New method - ->iterate(file, ctx). That's the replacement for ->readdir();
    it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
    calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not*
    update file->f_pos (or look at it, for that matter); iterate_dir() does the
    update.

    Note that dir_emit() takes the offset from ctx->pos (and eventually
    filldir_t will lose that argument).

    Signed-off-by: Al Viro

    Al Viro
     
  • iterate_dir(): new helper, replacing vfs_readdir().

    struct dir_context: contains the readdir callback (and will get more stuff
    in it), embedded into whatever data that callback wants to deal with;
    eventually, we'll be passing it to ->readdir() replacement instead of
    (data,filldir) pair.

    Signed-off-by: Al Viro

    Al Viro
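
The division of labour described in the two entries above can be sketched in userspace: the context carries the callback and the position, the iterate method emits entries through it, and only the outer helper touches the file position. All `mock_` names are hypothetical and the callback signature is simplified compared with the kernel's.

```c
#include <stddef.h>
#include <stdbool.h>

struct mock_dir_context;
typedef int (*mock_actor_t)(struct mock_dir_context *ctx, const char *name, long pos);

/* The context holds the callback and the current position. */
struct mock_dir_context {
    mock_actor_t actor;
    long pos;
};

struct mock_file { long f_pos; };

/* Takes the offset from ctx->pos; returns false when the actor wants to stop. */
static bool mock_dir_emit(struct mock_dir_context *ctx, const char *name)
{
    return ctx->actor(ctx, name, ctx->pos) == 0;
}

static const char *const mock_entries[] = { "a", "b", "c" };

/* Stand-in for ->iterate(file, ctx): advances ctx->pos, never file->f_pos. */
static int mock_iterate(struct mock_file *file, struct mock_dir_context *ctx)
{
    (void)file;
    while (ctx->pos < 3) {
        if (!mock_dir_emit(ctx, mock_entries[ctx->pos]))
            break;
        ctx->pos++;
    }
    return 0;
}

/* Stand-in for iterate_dir(): the only place that updates file->f_pos. */
static int mock_iterate_dir(struct mock_file *file, struct mock_dir_context *ctx)
{
    int ret;

    ctx->pos = file->f_pos;
    ret = mock_iterate(file, ctx);
    file->f_pos = ctx->pos;
    return ret;
}

/* Caller data with the context embedded in it. */
struct mock_counter {
    struct mock_dir_context ctx;
    int seen;
    int limit;
};

/* Recovers the enclosing struct from the embedded context pointer. */
static int mock_count_actor(struct mock_dir_context *ctx, const char *name, long pos)
{
    struct mock_counter *c =
        (struct mock_counter *)((char *)ctx - offsetof(struct mock_counter, ctx));

    (void)name;
    (void)pos;
    if (c->seen >= c->limit)
        return -1;          /* caller's buffer is full: stop iterating */
    c->seen++;
    return 0;
}
```

Embedding the context in the caller's own structure is what removes the need for a separate (data, filldir) pair: the callback gets one pointer and derives everything else from it.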
     

10 May, 2013

1 commit

  • Toralf reported the following oops to the linux-nfs mailing list:

    -----------------[snip]------------------
    NFSD: unable to generate recoverydir name (-2).
    NFSD: disabling legacy clientid tracking. Reboot recovery will not function correctly!
    BUG: unable to handle kernel NULL pointer dereference at 000003c8
    IP: [] nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
    *pdpt = 000000002ba33001 *pde = 0000000000000000
    Oops: 0000 [#1] SMP
    Modules linked in: loop nfsd auth_rpcgss ipt_MASQUERADE xt_owner xt_multiport ipt_REJECT xt_tcpudp xt_recent xt_conntrack nf_conntrack_ftp xt_limit xt_LOG iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables af_packet pppoe pppox ppp_generic slhc bridge stp llc tun arc4 iwldvm mac80211 coretemp kvm_intel uvcvideo sdhci_pci sdhci mmc_core videobuf2_vmalloc videobuf2_memops usblp videobuf2_core i915 iwlwifi psmouse videodev cfg80211 kvm fbcon bitblit cfbfillrect acpi_cpufreq mperf evdev softcursor font cfbimgblt i2c_algo_bit cfbcopyarea intel_agp intel_gtt drm_kms_helper snd_hda_codec_conexant drm agpgart fb fbdev tpm_tis thinkpad_acpi tpm nvram e1000e rfkill thermal ptp wmi pps_core tpm_bios 8250_pci processor 8250 ac snd_hda_intel snd_hda_codec snd_pcm battery video i2c_i801 snd_page_alloc snd_timer button serial_core i2c_core snd soundcore thermal_sys hwmon aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul cbc fuse nfs lockd sunrpc dm_crypt dm_mod hid_monterey hid_microsoft hid_logitech hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid hid sr_mod cdrom sg [last unloaded: microcode]
    Pid: 6374, comm: nfsd Not tainted 3.9.1 #6 LENOVO 4180F65/4180F65
    EIP: 0060:[] EFLAGS: 00010202 CPU: 0
    EIP is at nfsd4_client_tracking_exit+0x11/0x50 [nfsd]
    EAX: 00000000 EBX: fffffffe ECX: 00000007 EDX: 00000007
    ESI: eb9dcb00 EDI: eb2991c0 EBP: eb2bde38 ESP: eb2bde34
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    CR0: 80050033 CR2: 000003c8 CR3: 2ba80000 CR4: 000407f0
    DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
    DR6: ffff0ff0 DR7: 00000400
    Process nfsd (pid: 6374, ti=eb2bc000 task=eb2711c0 task.ti=eb2bc000)
    Stack:
    fffffffe eb2bde4c f90a3e0c f90a7754 fffffffe eb0a9c00 eb2bdea0 f90a41ed
    eb2991c0 1b270000 eb2991c0 eb2bde7c f9099ce9 eb2bde98 0129a020 eb29a020
    eb2bdecc eb2991c0 eb2bdea8 f9099da5 00000000 eb9dcb00 00000001 67822f08
    Call Trace:
    [] legacy_recdir_name_error+0x3c/0x40 [nfsd]
    [] nfsd4_create_clid_dir+0x15d/0x1c0 [nfsd]
    [] ? nfsd4_lookup_stateid+0x99/0xd0 [nfsd]
    [] ? nfs4_preprocess_seqid_op+0x85/0x100 [nfsd]
    [] nfsd4_client_record_create+0x37/0x50 [nfsd]
    [] nfsd4_open_confirm+0xfe/0x130 [nfsd]
    [] ? nfsd4_encode_operation+0x61/0x90 [nfsd]
    [] ? nfsd4_free_stateid+0xc0/0xc0 [nfsd]
    [] nfsd4_proc_compound+0x41b/0x530 [nfsd]
    [] nfsd_dispatch+0x8b/0x1a0 [nfsd]
    [] svc_process+0x3dd/0x640 [sunrpc]
    [] nfsd+0xad/0x110 [nfsd]
    [] ? nfsd_destroy+0x70/0x70 [nfsd]
    [] kthread+0x94/0xa0
    [] ret_from_kernel_thread+0x1b/0x28
    [] ? flush_kthread_work+0xd0/0xd0
    Code: 86 b0 00 00 00 90 c5 0a f9 c7 04 24 70 76 0a f9 e8 74 a9 3d c8 eb ba 8d 76 00 55 89 e5 53 66 66 66 66 90 8b 15 68 c7 0a f9 85 d2 88 c8 03 00 00 74 2c 3b 11 77 28 8b 5c 91 08 85 db 74 22 8b
    EIP: [] nfsd4_client_tracking_exit+0x11/0x50 [nfsd] SS:ESP 0068:eb2bde34
    CR2: 00000000000003c8
    ---[ end trace 09e54015d145c9c6 ]---

    The problem appears to be a regression that was introduced in commit
    9a9c6478 "nfsd: make NFSv4 recovery client tracking options per net".
    Prior to that commit, it was safe to pass a NULL net pointer to
    nfsd4_client_tracking_exit in the legacy recdir case, and
    legacy_recdir_name_error did so. After that commit, the net pointer must
    be valid.

    This patch just fixes legacy_recdir_name_error to pass in a valid net
    pointer to that function.

    Cc: # v3.8+
    Cc: Stanislav Kinsbursky
    Reported-and-tested-by: Toralf Förster
    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

01 Mar, 2013

1 commit

  • Pull nfsd changes from J Bruce Fields:
    "Miscellaneous bugfixes, plus:

    - An overhaul of the DRC cache by Jeff Layton. The main effect is
    just to make it larger. This decreases the chances of intermittent
    errors especially in the UDP case. But we'll need to watch for any
    reports of performance regressions.

    - Containerized nfsd: with some limitations, we now support
    per-container nfs-service, thanks to extensive work from Stanislav
    Kinsbursky over the last year."

    Some notes about conflicts, since there were *two* non-data semantic
    conflicts here:

    - idr_remove_all() had been added by a memory leak fix, but has since
    become deprecated since idr_destroy() does it for us now.

    - xs_local_connect() had been added by this branch to make AF_LOCAL
    connections be synchronous, but in the meantime Trond had changed the
    calling convention in order to avoid a RCU dereference.

    There were a couple of more obvious actual source-level conflicts due to
    the hlist traversal changes and one just due to code changes next to
    each other, but those were trivial.

    * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
    SUNRPC: make AF_LOCAL connect synchronous
    nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
    svcrpc: fix rpc server shutdown races
    svcrpc: make svc_age_temp_xprts enqueue under sv_lock
    lockd: nlmclnt_reclaim(): avoid stack overflow
    nfsd: enable NFSv4 state in containers
    nfsd: disable usermode helper client tracker in container
    nfsd: use proper net while reading "exports" file
    nfsd: containerize NFSd filesystem
    nfsd: fix comments on nfsd_cache_lookup
    SUNRPC: move cache_detail->cache_request callback call to cache_read()
    SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
    SUNRPC: rework cache upcall logic
    SUNRPC: introduce cache_detail->cache_request callback
    NFS: simplify and clean cache library
    NFS: use SUNRPC cache creation and destruction helper for DNS cache
    nfsd4: free_stid can be static
    nfsd: keep a checksum of the first 256 bytes of request
    sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
    sunrpc: fix comment in struct xdr_buf definition
    ...

    Linus Torvalds
     

16 Feb, 2013

1 commit

  • This tracker uses the khelper kthread to execute binaries.
    Execution itself is done from kthread context, i.e. the global root is used.
    This is not suitable for containers with their own root.
    So, disable this tracker in containers for the time being.

    Note: one possible solution is to pass an "init" callback to khelper, which
    would swap the root to the desired one.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     

13 Feb, 2013

1 commit

  • Use uid_eq(uid, GLOBAL_ROOT_UID) instead of !uid.
    Use gid_eq(gid, GLOBAL_ROOT_GID) instead of !gid.
    Use uid_eq(uid, INVALID_UID) instead of uid == -1
    Use gid_eq(gid, INVALID_GID) instead of gid == -1
    Use uid = GLOBAL_ROOT_UID instead of uid = 0;
    Use gid = GLOBAL_ROOT_GID instead of gid = 0;
    Use !uid_eq(uid1, uid2) instead of uid1 != uid2.
    Use !gid_eq(gid1, gid2) instead of gid1 != gid2.
    Use uid_eq(uid1, uid2) instead of uid1 == uid2.

    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
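
Why the conversions listed above are needed can be seen from a userspace model of a wrapped id type. The `mock_` names are hypothetical; the sketch assumes only that the id is wrapped in a struct, so that plain integer operators such as `!uid` or `uid == -1` no longer compile and an explicit comparison helper is required.

```c
#include <stdbool.h>

typedef unsigned int mock_uid_t;

/* Wrapping the value in a struct makes accidental integer use a compile error. */
typedef struct { mock_uid_t val; } mock_kuid_t;

#define MOCK_KUIDT_INIT(v)   ((mock_kuid_t){ (v) })
#define MOCK_GLOBAL_ROOT_UID MOCK_KUIDT_INIT(0)
#define MOCK_INVALID_UID     MOCK_KUIDT_INIT((mock_uid_t)-1)

/* Replaces uid1 == uid2; negate the result to replace uid1 != uid2. */
static bool mock_uid_eq(mock_kuid_t left, mock_kuid_t right)
{
    return left.val == right.val;
}

/* Replaces the old "!uid" test. */
static bool mock_uid_is_root(mock_kuid_t uid)
{
    return mock_uid_eq(uid, MOCK_GLOBAL_ROOT_UID);
}

/* Replaces the old "uid == -1" test. */
static bool mock_uid_is_invalid(mock_kuid_t uid)
{
    return mock_uid_eq(uid, MOCK_INVALID_UID);
}
```

The same pattern would apply to group ids with a separate wrapper type, which also stops a uid from being compared against a gid by mistake.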
     

11 Dec, 2012

1 commit


28 Nov, 2012

3 commits

  • The in_grace flag is part of the client tracking state, which is network
    namespace aware. So let's replace the global static variable with a per-net one.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • Opening and closing of this file is done in client tracking init and exit
    operations.
    Client tracking is done in network namespace context already. So let's make
    this file opened and closed per network context - this will simplify its
    management.

    Signed-off-by: Stanislav Kinsbursky
    Signed-off-by: J. Bruce Fields

    Stanislav Kinsbursky
     
  • That function is only called under nfsd_mutex: we know that because the
    only caller is nfsd_svc, via

    nfsd_svc
    nfsd_startup
    nfs4_state_start
    nfsd4_client_tracking_init
    client_tracking_ops->init == nfsd4_load_reboot_recovery_data

    The shared state accessed here includes:

    - user_recovery_dirname: used here, modified only by
    nfs4_reset_recoverydir, which can be verified to only be
    called under nfsd_mutex.
    - filesystem state, protected by i_mutex (handwaving slightly
    here)
    - rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other
    than here, used only from code called from nfsd or laundromat
    threads, both of which should be started only after this runs
    (see nfsd_svc) and stopped before this could run again (see
    nfsd_shutdown, called from nfsd_last_thread).

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

15 Nov, 2012

3 commits


13 Nov, 2012

4 commits

  • The current code holds on to this list until nfsd is shut down, but it's
    never touched once the grace period ends. Release that memory back into
    the wild when the grace period ends.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • Remove the cl_recdir field from the nfs4_client struct. Instead, just
    compute it on the fly when and if it's needed, which is now only when
    the legacy client tracking code is in effect.

    The error handling in the legacy client tracker is also changed to
    handle the case where md5 is unavailable. In that case, we'll warn
    the admin with a KERN_ERR message and disable the client tracking.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • When nfsd starts, the legacy reboot recovery code creates a tracking
    struct for each directory in the v4recoverydir. When the grace period
    ends, it basically does a "readdir" on the directory again, and matches
    each dentry in there to an existing client id to see if it should be
    removed or not. If the matching client doesn't exist, or hasn't
    reclaimed its state then it will remove that dentry.

    This is pretty inefficient since it involves doing a lot of hash-bucket
    searching. It also means that we have to keep relying on being able to
    search for a nfs4_client by md5 hashed cl_recdir name.

    Instead, add a pointer to the nfs4_client that indicates the association
    between the nfs4_client_reclaim and nfs4_client. When a reclaim operation
    comes in, we set the pointer to make that association. On gracedone, the
    legacy client tracker will keep the recdir around iff:

    1/ there is a reclaim record for the directory

    ...and...

    2/ there's an association between the reclaim record and a client record
    -- that is, a create or check operation was performed on the client that
    matches that directory.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
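
The keep-or-remove rule from the entry above can be sketched as follows. The struct layouts and the `clp` field name are hypothetical; the sketch only encodes the two stated conditions, that a reclaim record exists for the directory and that it has been associated with a client record.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the client and reclaim records. */
struct mock_client { int id; };

struct mock_reclaim_record {
    const char *recdir;          /* name of the recovery directory */
    struct mock_client *clp;     /* set once a create or check matched a client */
};

/* Called when a create or check operation is performed for a matching client. */
static void mock_associate(struct mock_reclaim_record *rec, struct mock_client *clp)
{
    rec->clp = clp;
}

/*
 * On gracedone, keep the directory iff (1) a reclaim record exists for it
 * and (2) that record has been associated with a client record.
 */
static bool mock_keep_recdir(const struct mock_reclaim_record *rec)
{
    return rec != NULL && rec->clp != NULL;
}
```

Following one pointer replaces a hash-bucket search by hashed directory name for every entry found in the recovery directory.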
     
  • Currently, it takes a client pointer, but later we're going to need to
    search for these records without knowing whether a matching client even
    exists.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton