17 Dec, 2010

1 commit

  • NFS clients don't need the garbage collection processing that is
    performed on nlm_host structures. The client picks up an nlm_host at
    mount time and holds a reference to it until the file system is
    unmounted.

    Servers, on the other hand, don't have a precise way to tell when an
    nlm_host is no longer being used, so zero refcount nlm_host entries
    are left to expire in the cache after a time.

    Basically there's nothing holding a reference to an nlm_host between
    individual server-side NLM requests, but we can't afford the expense
    of recreating them for every new NLM request from a client. The
    nlm_host cache adds some lifetime hysteresis to entries in the cache
    so the next time a particular nlm_host is needed, it's likely to be
    discovered by a lookup rather than created from whole cloth.

    With the new implementation, client nlm_host cache items are no longer
    garbage collected, and are destroyed directly by a new release
    function specialized for client entries, nlmclnt_release_host(). They
    are cached in their own data structure, and have their own lookup
    logic, simplified and specialized for client nlm_host entries.

    However, the client nlm_host cache still shares reboot recovery logic
    with the server nlm_host cache. The NSM "peer rebooted" downcalls for
    clients and servers still come through the same RPC call. This is a
    legacy formal API that would be difficult to alter, and besides, the
    user space NSM implementation can't tell the difference between peers
    that are clients or servers.

    For this reason, the client cache continues to share the
    nlm_host_mutex (and reboot recovery logic) with the server cache.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
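
The two lifetimes described in this commit can be contrasted in a few lines of userspace C. This is only a sketch: `struct host`, `client_release()`, `server_release()` and `server_gc()` are hypothetical simplifications, not the kernel's `struct nlm_host` or `nlmclnt_release_host()`, and destruction is recorded in a flag rather than by freeing memory.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct nlm_host. */
struct host {
    int  refcount;   /* number of active users of this entry */
    long expires;    /* server entries only: earliest time GC may destroy it */
    bool destroyed;  /* set instead of freeing, so the result can be inspected */
};

/* Client side: the last put destroys the entry immediately, with no GC pass. */
static void client_release(struct host *h)
{
    if (--h->refcount == 0)
        h->destroyed = true;
}

/* Server side: the last put only arms an expiry time; the entry stays cached
 * so the next request from the same peer can find it by lookup. */
static void server_release(struct host *h, long now, long ttl)
{
    if (--h->refcount == 0)
        h->expires = now + ttl;
}

/* Server side: a periodic sweep destroys idle entries whose time has passed. */
static void server_gc(struct host *h, long now)
{
    if (h->refcount == 0 && now >= h->expires)
        h->destroyed = true;
}
```

In this model a client entry disappears as soon as its single mount-time reference is dropped, while a server entry survives until a later sweep finds it both unreferenced and expired.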
     

18 Nov, 2010

1 commit


22 Sep, 2010

1 commit

  • This patch removes all calls to lock_kernel() from the client. This patch
    should be applied after the "fs/lock.c prepare for BKL removal" patch submitted
    by Arnd Bergmann on September 18.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

21 Aug, 2009

1 commit


29 Mar, 2009

1 commit

  • Apparently a lot of people need to disable IPv6 completely on their
    distributor-built systems, which have CONFIG_IPV6_MODULE enabled at
    build time.

    They do this by blacklisting the ipv6.ko module. This causes the
    creation of the lockd service listener to fail if CONFIG_IPV6_MODULE
    is set, but the module cannot be loaded.

    Now that the kernel's PF_INET6 RPC listeners are completely separate
    from PF_INET listeners, we can always start PF_INET. Then lockd can
    try to start PF_INET6, but it isn't required to be available.

    Note this has the added benefit that NLM callbacks from AF_INET6
    servers will never come from AF_INET remotes. We no longer have to
    worry about matching mapped IPv4 addresses to AF_INET when comparing
    addresses.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
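
The start-up policy in this commit (IPv4 listener mandatory, IPv6 listener best-effort) can be sketched as follows. Everything here is illustrative: `start_listeners()`, the `make_listener_fn` callback type, the `FAM_*` constants and the two sample callbacks are hypothetical, and treating `-EAFNOSUPPORT` as the "IPv6 unavailable" result is an assumption, not something the commit message states.

```c
#include <errno.h>

/* Illustrative stand-ins for PF_INET and PF_INET6. */
enum { FAM_INET = 4, FAM_INET6 = 6 };

/* Hypothetical callback: returns 0 on success or a negative errno value. */
typedef int (*make_listener_fn)(int family);

/* Always require the IPv4 listener; tolerate a missing IPv6 stack. */
static int start_listeners(make_listener_fn make)
{
    int err = make(FAM_INET);
    if (err < 0)
        return err;                 /* IPv4 is mandatory */

    err = make(FAM_INET6);
    if (err < 0 && err != -EAFNOSUPPORT)
        return err;                 /* a real IPv6 failure is still fatal */

    return 0;                       /* IPv6 absent: carry on with IPv4 only */
}

/* Sample callback: IPv6 is unavailable, e.g. ipv6.ko is blacklisted. */
static int no_ipv6(int family)
{
    return family == FAM_INET6 ? -EAFNOSUPPORT : 0;
}

/* Sample callback: the IPv4 listener itself cannot be created. */
static int no_ipv4(int family)
{
    return family == FAM_INET ? -EADDRINUSE : 0;
}
```

With `no_ipv6` the service still starts; with `no_ipv4` the error is passed back to the caller.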
     

11 Mar, 2009

1 commit

  • The NFS mount command may pass an AF_INET server address to lockd. If
    lockd happens to be using a PF_INET6 listener, the nlm_cmp_addr() in
    nlmclnt_grant() will fail to match requests from that host because they
    will all have a mapped IPv4 AF_INET6 address.

    Adopt the same solution used in nfs_sockaddr_match_ipaddr() for NFSv4
    callbacks: if either address is AF_INET, map it to an AF_INET6 address
    before doing the comparison.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
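
The comparison strategy described in this commit, mapping AF_INET addresses to IPv4-mapped AF_INET6 form before comparing, can be sketched in userspace C. The helper names `map_to_v6()` and `addr_match()` are hypothetical, the sketch assumes both inputs are either AF_INET or AF_INET6, and it is not the kernel's `nlm_cmp_addr()`.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Produce the 128-bit form of an address. AF_INET input is converted to
 * its IPv4-mapped form (::ffff:a.b.c.d); AF_INET6 input is copied as-is. */
static void map_to_v6(const struct sockaddr *sa, struct in6_addr *out)
{
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

        memset(out, 0, sizeof(*out));
        out->s6_addr[10] = 0xff;
        out->s6_addr[11] = 0xff;
        memcpy(&out->s6_addr[12], &sin->sin_addr.s_addr, 4);
    } else {
        *out = ((const struct sockaddr_in6 *)sa)->sin6_addr;
    }
}

/* Compare two addresses after normalising both to AF_INET6 form. */
static bool addr_match(const struct sockaddr *a, const struct sockaddr *b)
{
    struct in6_addr a6, b6;

    map_to_v6(a, &a6);
    map_to_v6(b, &b6);
    return memcmp(&a6, &b6, sizeof(a6)) == 0;
}
```

Under this scheme an AF_INET `192.0.2.1` and an AF_INET6 `::ffff:192.0.2.1` compare equal, which is the case the mount-time address and the callback's source address fall into.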
     

24 Dec, 2008

2 commits

  • If the admin has specified the "noresvport" option for an NFS mount
    point, the kernel's NFS client uses an unprivileged source port for
    the main NFS transport. The kernel's lockd client should use an
    unprivileged port in this case as well.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • My understanding is that there is a push to turn the kernel_thread
    interface into a non-exported symbol and move all kernel threads to use
    the kthread API. This patch changes lockd to use kthread_run to spawn
    the reclaimer thread.

    I've made the assumption here that the extra module references taken
    when we spawn this thread are unnecessary and removed them. I've also
    added a KERN_ERR printk that pops if the thread can't be spawned to warn
    the admin that the locks won't be reclaimed.

    In the future, it would be nice to be able to notify userspace that
    locks have been lost (probably by implementing SIGLOST), and adding some
    good policies about how long we should reattempt to reclaim the locks.

    Finally, I removed a comment about memory leaks that I believe is
    obsolete and added a new one to clarify the result of sending a SIGKILL
    to the reclaimer thread. As best I can tell, doing so doesn't actually
    cause a memory leak.

    I consider this patch 2.6.29 material.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
     

05 Oct, 2008

1 commit


04 Oct, 2008

2 commits


30 Sep, 2008

2 commits


30 Jan, 2008

2 commits

  • Clean up: pass 5 arguments to nlmclnt_init() in a structure similar to the
    new nfs_client_initdata structure.

    Signed-off-by: Chuck Lever

    Chuck Lever
     
  • We would like to remove the per-lock-operation nlm_lookup_host() call from
    nlmclnt_proc().

    The new architecture pins an nlm_host structure to each NFS client
    superblock that has the "lock" mount option set. The NFS client passes
    in the pinned nlm_host structure during each call to nlmclnt_proc(). NFS
    client unmount processing "puts" the nlm_host so it can be garbage-
    collected later.

    This patch introduces externally callable NLM functions that handle
    mount-time nlm_host set up and tear-down.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

15 May, 2007

1 commit


31 Jan, 2007

1 commit


14 Dec, 2006

1 commit


09 Dec, 2006

1 commit


21 Oct, 2006

1 commit


04 Oct, 2006

3 commits

  • nlmclnt_recovery would try to force a portmap rebind by setting
    host->h_nextrebind to 0. The right thing to do here is to set it to the
    current time.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch makes the SM_NOTIFY handling understand and use the nsm_handle.

    To make it a bit clearer what is happening:

    nlmclnt_prepare_reclaim and nlmclnt_finish_reclaim
    get open-coded into 'reclaimer'

    The result is tidied up.

    Then some of that functionality is moved out into nlm_host_rebooted (which
    calls nlmclnt_recovery which starts a thread which runs reclaimer).

    Also host_rebooted now finds an nsm_handle rather than a host, then
    iterates over all hosts and deals with each host that shares that nsm_handle.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch introduces the nsm_handle, which is shared by all nlm_host objects
    referring to the same client.

    With this patch applied, all nlm_hosts from the same address will share the
    same nsm_handle. A future patch will add sharing by name.

    Note: this patch changes h_name so that it is no longer guaranteed to be an IP
    address of the host. When the host represents an NFS server, h_name will be
    the name passed in the mount call. When the host represents a client, h_name
    will be the name presented in the lock request received from the client. As
    h_name is only used for printing informational messages, this change should
    not be significant.

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
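
For the h_nextrebind fix in the first commit of this group, the message does not say why 0 is the wrong value. One plausible reason, assuming the rebind check compares a free-running tick counter against h_nextrebind with a wrap-safe test in the style of the kernel's `time_after_eq()`, is that 0 is not guaranteed to look like "the past". The sketch below uses hypothetical names (`tick_after_eq()`, `rebind_due()`) to show the effect.

```c
#include <limits.h>
#include <stdbool.h>

/* Wrap-safe "a is at or after b" test for a free-running tick counter.
 * The unsigned difference is reinterpreted as signed; this relies on the
 * usual two's-complement conversion that mainstream compilers provide. */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* A rebind is due once the current tick has reached nextrebind. */
static bool rebind_due(unsigned long now, unsigned long nextrebind)
{
    return tick_after_eq(now, nextrebind);
}
```

When the counter is in the upper half of its range, for example `ULONG_MAX - 1000`, `rebind_due(now, 0)` is false because 0 appears to lie in the future, whereas `rebind_due(now, now)` is always true.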
     

02 Oct, 2006

2 commits

  • If lockd_up fails - what should we expect? Do we have to later call
    lockd_down?

    Well the nfs client thinks "no", the nfs server thinks "yes". lockd thinks
    "yes".

    The only answer that really makes sense is "no" !!

    So:
    Make lockd_up only increment nlmsvc_users on success.
    Make nfsd handle errors from lockd_up properly.
    Make sure lockd_up(0) never fails when lockd is running
    so that the 'reclaimer' call to lockd_up doesn't need to
    be error checked.

    Cc: "J. Bruce Fields"
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set.

    However, as lockd performs services for the client as well, this is a problem.
    If CONFIG_NFSD_TCP is not set, and a TCP mount is used, the server will not be
    able to call back to lockd.

    So:
    - add an option to lockd_up saying which protocol is needed
    - Always open sockets for which an explicit port was given, otherwise
    only open a socket of the type required
    - Change nfsd to do one lockd_up per socket rather than one per thread.

    This
    - removes the dependency on CONFIG_NFSD_TCP
    - means that lockd may open sockets other than at startup
    - means that lockd will *not* listen on UDP if the only
    mounts are TCP mounts (and nfsd hasn't started).

    The latter is the only one that concerns me at all - I don't know if this
    might be a problem with some servers.

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
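
The lockd_up contract settled on in the first commit of this group (count a user only on success, and never fail once the service is running) can be modelled in a few lines. This is a sketch under stated assumptions: `service_up()`, `service_down()`, the `users` and `running` variables and the two sample start callbacks are hypothetical and leave out locking and the protocol argument.

```c
static int users;    /* stands in for nlmsvc_users */
static int running;  /* nonzero once the service has been started */

/* Take a reference on the service, starting it on first use.
 * On failure the user count is untouched, so no matching down call is owed. */
static int service_up(int (*start)(void))
{
    if (!running) {
        int err = start();
        if (err < 0)
            return err;
        running = 1;
    }
    users++;         /* reached only on success */
    return 0;
}

/* Drop a reference; the last user stops the service. */
static void service_down(void)
{
    if (users > 0 && --users == 0)
        running = 0;
}

/* Sample start callbacks for experimentation. */
static int start_ok(void)   { return 0; }
static int start_fail(void) { return -1; }
```

Because the start callback is skipped while the service is running, a second `service_up()` succeeds even with a callback that would fail, which mirrors the guarantee that lets the reclaimer skip error checking.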
     

27 Sep, 2006

1 commit


09 Jun, 2006

1 commit

  • Currently it is possible for a task to remove its locks at the same time as
    the NLM recovery thread is trying to recover them. This quickly leads to an
    Oops.
    Protect the locks using an rw semaphore while they are being recovered.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     

21 Mar, 2006

4 commits

  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
    The patch "stop abusing file_lock_list" introduces a couple of bugs, since
    the locks may be copied and need to be removed from the lists when they are
    destroyed.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
    Currently lockd directly accesses the file_lock_list from fs/locks.c.
    It does so to mark locks granted or reclaimable. This is very
    suboptimal, because a) lockd needs to poke into locks.c internals, and
    b) it needs to iterate over all locks in the system for marking locks
    granted or reclaimable.

    This patch adds lists for granted and reclaimable locks to the nlm_host
    structure instead, and adds locks to those.

    nlmclnt_lock:
    now adds the lock to h_granted instead of setting the
    NFS_LCK_GRANTED flag, still O(1)

    nlmclnt_mark_reclaim:
    goes away completely, replaced by a list_splice_init.
    Complexity reduced from O(locks in the system) to O(1)

    reclaimer:
    iterates over h_reclaim now, complexity reduced from
    O(locks in the system) to O(locks per nlm_host)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Trond Myklebust

    Christoph Hellwig
     
  • Instead we use the nlm_lockowner->pid.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
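
The O(1) replacement for nlmclnt_mark_reclaim described in the third commit of this group rests on splicing one circular doubly linked list into another. The following is a minimal userspace re-implementation for illustration, not the kernel's list.h; `struct host`, `mark_reclaim()` and `list_count()` are hypothetical names.

```c
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static int list_empty(const struct list_head *h) { return h->next == h; }

/* Append node n at the tail of the list headed by h. */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Move every entry of src to the front of dst with a constant number of
 * pointer updates, then leave src as a valid empty list. */
static void list_splice_init(struct list_head *src, struct list_head *dst)
{
    struct list_head *first, *last, *at;

    if (list_empty(src))
        return;
    first = src->next;
    last = src->prev;
    at = dst->next;
    first->prev = dst;
    dst->next = first;
    last->next = at;
    at->prev = last;
    list_init(src);
}

/* Hypothetical per-host lists, after the h_granted/h_reclaim idea. */
struct host { struct list_head granted, reclaim; };

/* Marking all granted locks as reclaimable is a single splice. */
static void mark_reclaim(struct host *h)
{
    list_splice_init(&h->granted, &h->reclaim);
}

/* Walk a list and count its entries (O(n), used only for checking). */
static int list_count(const struct list_head *h)
{
    int n = 0;
    const struct list_head *p;

    for (p = h->next; p != h; p = p->next)
        n++;
    return n;
}
```

The reclaimer then only has to walk the per-host reclaim list, so its cost depends on the number of locks held against that host rather than on every lock in the system.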
     

15 Feb, 2006

1 commit

  • If 2 threads attached to the same process are blocking on different locks on
    different files (maybe even on different servers) but have the same lock
    arguments (i.e. same offset+length - actually quite common, since most
    processes try to lock the entire file) then the first GRANTED call that wakes
    one up will also wake the other.

    Currently when the NLM_GRANTED callback comes in, lockd walks the list of
    blocked locks in search of a match to the lock that the NLM server has
    granted. Although it checks the lock pid, start and end, it fails to check
    the filehandle and the server address.

    By checking the filehandle and server IP address, we ensure that this only
    happens if the locks truly are referencing the same file.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
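
The stricter match described in this commit can be sketched as a predicate over a blocked-lock record. The types and names below (`struct fh`, `struct wait_entry`, `grant_matches()`) are hypothetical simplifications: a real NLM client compares socket addresses and NFS file handles, which are reduced here to a 32-bit address and a small byte array.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FH_MAX 64

/* Simplified file handle: an opaque byte string with a length. */
struct fh {
    unsigned int  size;
    unsigned char data[FH_MAX];
};

/* Simplified record of a blocked lock request, or of an incoming grant. */
struct wait_entry {
    uint32_t  server_addr;  /* stand-in for the server's socket address */
    struct fh fh;           /* file the lock applies to */
    uint32_t  pid;          /* lock owner */
    uint64_t  start, end;   /* byte range */
};

static bool fh_equal(const struct fh *a, const struct fh *b)
{
    return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}

/* Return true only if the grant refers to exactly this blocked request. */
static bool grant_matches(const struct wait_entry *w, const struct wait_entry *g)
{
    /* Checks the commit message says were already performed. */
    if (w->pid != g->pid || w->start != g->start || w->end != g->end)
        return false;
    /* Checks the commit adds: same server and same file. */
    if (w->server_addr != g->server_addr)
        return false;
    return fh_equal(&w->fh, &g->fh);
}
```

Two waiters with the same owner and a whole-file range now differ as long as their file handles or servers differ, so a grant wakes only the intended one.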
     

20 Dec, 2005

1 commit


23 Jun, 2005

2 commits


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds