13 Jul, 2011

1 commit


15 Jun, 2011

1 commit

  • If the NLM daemon is killed on the NFS server, we can currently end up
    hanging forever on an 'unlock' request, instead of aborting. Basically,
    if the rpcbind request fails, or the server keeps returning garbage, we
    really want to quit instead of retrying.

    Tested-by: Vasily Averin
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
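
The behaviour described in this commit, giving up on fatal errors rather than retrying forever, can be sketched in user-space C. This is only an illustration, not the kernel change itself: the outcome codes, `call_with_bounded_retry()`, and the `scripted_attempt()` helper are all hypothetical names, and treating a garbage reply as immediately fatal is an assumption made to keep the sketch short.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for one attempt of an unlock-style RPC. */
enum rpc_outcome {
    RPC_OK,          /* server answered with a valid reply */
    RPC_TRANSIENT,   /* e.g. a timeout: retrying may help */
    RPC_BIND_FAILED, /* rpcbind lookup failed: the daemon is likely gone */
    RPC_GARBAGE      /* reply could not be decoded */
};

/* One attempt of the call, supplied by the caller. */
typedef enum rpc_outcome (*rpc_attempt_fn)(void *ctx);

/* Only transient failures are worth another attempt (assumption). */
static bool outcome_is_retryable(enum rpc_outcome o)
{
    return o == RPC_TRANSIENT;
}

/*
 * Retry only on transient failures, for at most max_tries attempts.
 * Returns 0 on success, -1 if the call was abandoned.
 */
int call_with_bounded_retry(rpc_attempt_fn attempt, void *ctx, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        enum rpc_outcome o = attempt(ctx);

        if (o == RPC_OK)
            return 0;
        if (!outcome_is_retryable(o))
            return -1; /* fatal: give up immediately */
    }
    return -1; /* transient failures used up the whole budget */
}

/* Illustration helper: replays a fixed script of outcomes and counts calls. */
struct script {
    const enum rpc_outcome *steps;
    int len;
    int calls;
};

enum rpc_outcome scripted_attempt(void *ctx)
{
    struct script *s = ctx;
    int i = s->calls < s->len ? s->calls : s->len - 1;

    s->calls++;
    return s->steps[i];
}
```

With a script of one transient failure followed by success, the call succeeds on the second attempt; with a script that starts with a failed rpcbind lookup, it is abandoned after a single attempt.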
     

17 Dec, 2010

2 commits

  • NFS clients don't need the garbage collection processing that is
    performed on nlm_host structures. The client picks up an nlm_host at
    mount time and holds a reference to it until the file system is
    unmounted.

    Servers, on the other hand, don't have a precise way to tell when an
    nlm_host is no longer being used, so zero refcount nlm_host entries
    are left to expire in the cache after a time.

    Basically there's nothing holding a reference to an nlm_host between
    individual server-side NLM requests, but we can't afford the expense
    of recreating them for every new NLM request from a client. The
    nlm_host cache adds some lifetime hysteresis to entries in the cache
    so the next time a particular nlm_host is needed, it's likely to be
    discovered by a lookup rather than created from whole cloth.

    With the new implementation, client nlm_host cache items are no longer
    garbage collected, and are destroyed directly by a new release
    function specialized for client entries, nlmclnt_release_host(). They
    are cached in their own data structure, and have their own lookup
    logic, simplified and specialized for client nlm_host entries.

    However, the client nlm_host cache still shares reboot recovery logic
    with the server nlm_host cache. The NSM "peer rebooted" downcalls for
    clients and servers still come through the same RPC call. This is a
    legacy formal API that would be difficult to alter, and besides, the
    user space NSM implementation can't tell the difference between peers
    that are clients or servers.

    For this reason, the client cache continues to share the
    nlm_host_mutex (and reboot recovery logic) with the server cache.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
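
The idea of destroying a client entry directly when its last reference is dropped, instead of leaving it for a garbage collector, can be sketched as follows. This is a simplified user-space illustration, not the lockd code: `struct client_host` and the `client_host_*()` functions are hypothetical, the `destroyed` pointer exists only so the effect can be observed, and the plain integer counter is not thread-safe (the kernel would use atomic counters and a mutex).

```c
#include <stdlib.h>

/* Hypothetical stand-in for a client-side host cache entry. */
struct client_host {
    int refcount;   /* references held, e.g. one per mounted file system */
    int *destroyed; /* observation hook: set to 1 when the entry is freed */
};

/* Create an entry that starts with one reference, as taken at mount time. */
struct client_host *client_host_create(int *destroyed_flag)
{
    struct client_host *h = calloc(1, sizeof(*h));

    if (h == NULL)
        return NULL;
    h->refcount = 1;
    h->destroyed = destroyed_flag;
    return h;
}

/* Take an additional reference. */
struct client_host *client_host_get(struct client_host *h)
{
    h->refcount++;
    return h;
}

/* Drop a reference; free the entry at once when the count reaches zero. */
void client_host_release(struct client_host *h)
{
    if (h == NULL)
        return;
    if (--h->refcount == 0) {
        if (h->destroyed != NULL)
            *h->destroyed = 1;
        free(h);
    }
}
```

The entry survives as long as any holder keeps a reference and disappears on the final release, with no expiry timer involved.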
     
  • The nlm_release_call() function is invoked from both the server and
    the client side. We're about to introduce a distinct server- and
    client-side nlm_release_host(), so nlm_release_call() must first be
    split into a client-side and a server-side version.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

18 Nov, 2010

1 commit


22 Sep, 2010

1 commit

  • This patch removes all calls to lock_kernel() from the client. This patch
    should be applied after the "fs/lock.c prepare for BKL removal" patch submitted
    by Arnd Bergmann on September 18.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

22 Sep, 2009

1 commit


13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

18 Jun, 2009

2 commits

  • When rpc.statd starts up in user space at boot time, it attempts to
    write the latest NSM local state number into
    /proc/sys/fs/nfs/nsm_local_state.

    If lockd.ko isn't loaded yet (as is the case in most configurations),
    that file doesn't exist, thus the kernel's NSM state remains set to
    its initial value of zero during lockd operation.

    This is a problem because rpc.statd and lockd use the NSM state number
    to prevent repeated lock recovery on rebooted hosts. If lockd sends
    a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
    number is received, there is no way for lockd or rpc.statd to
    distinguish that stale SM_NOTIFY from an actual reboot. Thus lock
    recovery could be performed after the rebooted host has already
    started reclaiming locks, and those locks will be lost.

    We could change /etc/init.d/nfslock so it always modprobes lockd.ko
    before starting rpc.statd. However, if lockd.ko is ever unloaded
    and reloaded, we are back at square one, since the NSM state is not
    preserved across an unload/reload cycle. This may happen frequently
    on clients that use automounter. A period of NFS inactivity causes
    lockd.ko to be unloaded, and the kernel loses its NSM state setting.

    Instead, let's use the fact that rpc.statd plants the local system's
    NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a
    synchronous SM_MON upcall to the local rpc.statd _before_ sending its
    first NLM request to a new remote. This would permit rpc.statd to
    provide the current NSM state to lockd, even after lockd.ko had been
    unloaded and reloaded.

    Note that NLMPROC_LOCK arguments are constructed before the
    nsm_monitor() call, so we have to rearrange argument construction very
    slightly to make this all work out.

    And, the kernel appears to treat NSM state as a u32 (see struct
    nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure
    we don't get bogus comparison results.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
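
The approach in this commit, picking up the local NSM state from each successful SM_MON reply rather than relying on a value written at boot, can be sketched as follows. This is a user-space illustration under stated assumptions: `struct sm_mon_reply`, `record_sm_mon_reply()`, and `current_nsm_state()` are hypothetical names, and a status of zero is assumed to mean success.

```c
#include <stdint.h>

/* Hypothetical decoded SM_MON reply: a status plus the local NSM state. */
struct sm_mon_reply {
    uint32_t status; /* 0 is assumed to mean success */
    uint32_t state;  /* local NSM state number planted by the daemon */
};

/*
 * Cached local state. It starts at zero, which is what a freshly
 * (re)loaded module would hold before its first monitor call.
 * Kept as a 32-bit unsigned value so comparisons match the wire type.
 */
static uint32_t local_nsm_state;

/* Refresh the cached state from a reply; ignore failed replies. */
int record_sm_mon_reply(const struct sm_mon_reply *reply)
{
    if (reply->status != 0)
        return -1;
    local_nsm_state = reply->state;
    return 0;
}

/* Value to place in the next outgoing lock request. */
uint32_t current_nsm_state(void)
{
    return local_nsm_state;
}
```

Because the monitor call happens before the first lock request to a new peer, the request arguments would be filled in from `current_nsm_state()` only after the reply has been recorded.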
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     

07 Jan, 2009

2 commits


26 Jul, 2008

1 commit

  • Fix nlm_fopen() to return NLM_FAILED (or NLM_LCK_DENIED_NOLOCKS) instead
    of NLM_LCK_DENIED. The latter means the lock request failed because of a
    conflicting lock (i.e. a temporary error), which is wrong in this case.

    Also fix the client to return ENOLCK instead of EAGAIN if a blocking lock
    request returns with NLM_LCK_DENIED.

    Signed-off-by: Miklos Szeredi
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Matthew Wilcox
    Cc: David Teigland
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
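
The distinction drawn in this commit, between a temporary conflict and a hard failure, can be sketched as a status-to-errno mapping. This is an illustration only: the `enum lock_status` values and `lock_status_to_errno()` are hypothetical stand-ins for the NLM status codes, and the mapping of the non-conflict statuses to ENOLCK is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the lock reply statuses named above. */
enum lock_status {
    ST_GRANTED,        /* lock was granted */
    ST_DENIED,         /* conflicting lock held by someone else */
    ST_DENIED_NOLOCKS, /* server is out of lock resources */
    ST_FAILED          /* request could not be processed */
};

/*
 * Map a reply status to a negative errno value.
 * A non-blocking request that hits a conflict gets -EAGAIN (try again);
 * a blocking request that still comes back denied gets -ENOLCK.
 */
int lock_status_to_errno(enum lock_status st, bool blocking)
{
    switch (st) {
    case ST_GRANTED:
        return 0;
    case ST_DENIED:
        return blocking ? -ENOLCK : -EAGAIN;
    case ST_DENIED_NOLOCKS:
    case ST_FAILED:
        return -ENOLCK;
    }
    return -ENOLCK; /* unknown status: treat as a hard failure */
}
```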
     

16 Jul, 2008

2 commits

  • Push it into those callback functions that actually need it.

    Note that all the NFS operations use their own locking, so don't need the
    BKL. Ditto for the rpcbind client.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • fcntl(F_GETLK) on an NFS client returns incorrect values for the
    conflicting lock: the fl_len value is always 1.
    If the conflicting lock is (0, 4095), the F_GETLK
    request for (1024, 10) returns (0, 1), which doesn't
    even cover the requested range and is quite confusing.
    The fix is trivial: set fl_end from the fl_end value
    received from the NFS server.

    Signed-off-by: Felix Blyakher
    Signed-off-by: "J. Bruce Fields"
    Signed-off-by: Trond Myklebust

    Felix Blyakher
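
The arithmetic behind this report can be sketched as follows, assuming that (0, 4095) denotes the first and last byte of the conflicting lock and that (1024, 10) is a start offset and a length. The names `range_len()`, `ranges_overlap()`, and `RANGE_END_MAX` are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Hypothetical marker for a range that extends to end of file. */
#define RANGE_END_MAX INT64_MAX

/*
 * Length of the inclusive byte range [start, end].
 * A length of 0 is the POSIX convention for "to end of file".
 */
int64_t range_len(int64_t start, int64_t end)
{
    if (end == RANGE_END_MAX)
        return 0;
    return end - start + 1;
}

/* Nonzero if inclusive ranges [s1, e1] and [s2, e2] share any byte. */
int ranges_overlap(int64_t s1, int64_t e1, int64_t s2, int64_t e2)
{
    return s1 <= e2 && s2 <= e1;
}
```

Under these assumptions the request covers bytes 1024 through 1033. A conflicting lock reported with its real end of 4095 has length 4096 and overlaps the request, whereas a lock reported as starting at 0 with length 1 covers only byte 0 and does not.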
     

30 Apr, 2008

1 commit


20 Apr, 2008

7 commits

  • Now that we've added the 'generic' credentials (that are independent of the
    rpc_client) to the nfs_open_context, we can use those in the NLM client to
    ensure that the lock/unlock requests are authenticated to whoever
    originally opened the file.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • We shouldn't remove the lock from the list of blocked locks until the
    CANCEL call has completed since we may be racing with a GRANTED callback.

    Also ensure that we send an UNLOCK if the CANCEL request failed. Normally
    that should only happen if the process gets hit with a fatal signal.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Currently, it returns success as long as the RPC call was sent. We'd like
    to know if the CANCEL operation succeeded on the server.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The signal masks have been rendered obsolete by the preceding patch.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Peter Staubach comments:

    > In the course of investigating testing failures in the locking phase of
    > the Connectathon testsuite, I discovered a couple of things. One was
    > that one of the tests in the locking tests was racy when it didn't seem
    > to need to be and two, that the NFS client asynchronously releases locks
    > when a process is exiting.
    ...
    > The Single UNIX Specification Version 3 specifies that: "All locks
    > associated with a file for a given process shall be removed when a file
    > descriptor for that file is closed by that process or the process holding
    > that file descriptor terminates.".
    >
    > This does not specify whether those locks must be released prior to the
    > completion of the exit processing for the process or not. However,
    > general assumptions seem to be that those locks will be released. This
    > leads to more deterministic behavior under normal circumstances.

    The following patch converts the NFSv2/v3 locking code to use the same
    mechanism as NFSv4 for sending asynchronous RPC calls and then waiting for
    them to complete. This ensures that the UNLOCK and CANCEL RPC calls will
    complete even if the user interrupts the call, yet satisfies the
    above request for synchronous behaviour on process exit.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • When we replace the existing synchronous RPC calls with asynchronous calls,
    the reference count will be needed in order to allow us to examine the
    result of the RPC call.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Also fix up nlmclnt_lock() so that it doesn't pass modified versions of
    fl->fl_flags to nlmclnt_cancel() and other helpers.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

30 Jan, 2008

2 commits

  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Now that each NFS mount point caches its own nlm_host structure, it can be
    passed to nlmclnt_proc() for each lock request. By pinning an nlm_host for
    each mount point, we trade the overhead of looking up or creating a fresh
    nlm_host struct during every NLM procedure call for a little extra memory.

    We also restrict the nlmclnt_proc symbol to limit the use of this call to
    in-tree modules.

    Note that nlm_lookup_host() (just removed from the client's per-request
    NLM processing) could also trigger an nlm_host garbage collection. Now
    client-side nlm_host garbage collection occurs only during NFS mount
    processing. Since the NFS client now holds a reference on these nlm_host
    structures, they wouldn't have been affected by garbage collection
    anyway.

    Given that nlm_lookup_host() reorders the global nlm_host chain after
    every successful lookup, and that a garbage collection could be triggered
    during the call, we've removed a significant amount of per-NLM-request
    CPU processing overhead.

    Sidebar: there are only a few remaining references to the internals of
    NFS inodes in the client-side NLM code. The only references I found are
    related to extracting or comparing the inode's file handle via NFS_FH().
    One is in nlmclnt_grant(); the other is in nlmclnt_setlockargs().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

09 May, 2007

1 commit


04 Feb, 2007

1 commit


14 Dec, 2006

1 commit


09 Dec, 2006

1 commit


08 Dec, 2006

2 commits


06 Dec, 2006

1 commit


04 Oct, 2006

3 commits

  • The way we incremented the NLM cookie in nlmclnt_next_cookie was not thread
    safe. This patch changes the counter to an atomic_t.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
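
A thread-safe counter of this kind can be sketched in portable C11 with `<stdatomic.h>`, which is a user-space analogue and not the kernel's atomic_t API. The names `cookie_counter` and `next_cookie_value()` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared counter; objects with static storage start at zero. */
static _Atomic uint32_t cookie_counter;

/*
 * Return the next cookie value. atomic_fetch_add performs the
 * read-modify-write as one indivisible step and returns the old value,
 * so two concurrent callers can never observe the same cookie.
 */
uint32_t next_cookie_value(void)
{
    return atomic_fetch_add(&cookie_counter, 1);
}
```

A plain `counter++` compiles to a separate load, add, and store, which is where two threads could interleave and hand out the same value.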
     
  • This patch adds the peer's hostname (and name length) to all calls to
    nlm*_lookup_host functions. A subsequent patch will make use of these (if
    requested by a sysctl).

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch moves all checks of the h_monitored flag into the
    nsm_monitor/unmonitor functions. A subsequent patch will replace the
    mechanism by which we mark a host as being monitored.

    There is still one occurrence of h_monitored outside of mon.c and that is in
    clntlock.c where we respond to a reboot. The subsequent patch will modify
    this too.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     

02 Oct, 2006

1 commit

  • Replace references to system_utsname with the per-process uts namespace
    where appropriate. This includes things like uname.

    Changes: Per Eric Biederman's comments, use the per-process uts namespace
    for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c

    [jdike@addtoit.com: UML fix]
    [clg@fr.ibm.com: cleanup]
    [akpm@osdl.org: build fix]
    Signed-off-by: Serge E. Hallyn
    Cc: Kirill Korotaev
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Andrey Savochkin
    Signed-off-by: Cedric Le Goater
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

27 Sep, 2006

1 commit


23 Sep, 2006

1 commit


06 Jul, 2006

1 commit