30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As this
    conversion needs to touch a large number of source files, the
    following script was used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following (a small illustration of the resulting
    include rule appears after this list).

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    include gfp.h; if slab is used, include slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It is put in the include block that contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, reverse Christmas tree - or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
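
    As a small illustration of the rule the script applies (the file and
    function names below are made up for the example), a file that only
    passes gfp flags around ends up needing gfp.h, while one that actually
    calls the slab allocator needs slab.h:

      /* foo.c: only uses gfp flags -> needs linux/gfp.h */
      #include <linux/gfp.h>
      #include <linux/types.h>

      static gfp_t foo_gfp_flags(bool atomic)
      {
              return atomic ? GFP_ATOMIC : GFP_KERNEL;
      }

      /* bar.c: calls the slab allocator -> needs linux/slab.h */
      #include <linux/slab.h>

      static int *bar_alloc_counter(void)
      {
              return kmalloc(sizeof(int), GFP_KERNEL);
      }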

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some files didn't need the
    inclusion, some needed manual addition, and for others adding it to an
    implementation .h or embedding .c file was more appropriate. This step
    added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as a bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers, which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

22 Sep, 2009

1 commit


25 Apr, 2009

1 commit

  • For every lock request lockd creates a new file_lock object
    in nlmsvc_setgrantargs() by copying the passed in file_lock with
    locks_copy_lock(). A filesystem can attach its own lock_operations
    vector to the file_lock. It has to be cleaned up at the end of the
    file_lock's life. However, lockd doesn't do it today, yet it
    asserts in nlmclnt_release_lockargs() that the per-filesystem
    state is clean.
    This patch fixes it by exporting locks_release_private() and adding
    it to nlmsvc_freegrantargs(), to be symmetrical to creating a
    file_lock in nlmsvc_setgrantargs().
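
    A sketch of the resulting cleanup (the function body is abridged; only
    the added call is of interest, and the surrounding code shown is an
    assumption):

      static void nlmsvc_freegrantargs(struct nlm_rqst *call)
      {
              /* ... existing teardown of the GRANTED arguments ... */

              /* Let the filesystem release any private state it attached
               * to the copied file_lock via its lock operations. */
              locks_release_private(&call->a_args.lock.fl);
      }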

    Signed-off-by: Felix Blyakher
    Signed-off-by: J. Bruce Fields

    Felix Blyakher
     

19 Mar, 2009

1 commit


10 Feb, 2009

1 commit

  • If a client requests a blocking lock, is denied, then requests it again,
    then here in nlmsvc_lock() we will call vfs_lock_file() without FL_SLEEP
    set, because we've already queued a block and don't need the locks code
    to do it again.

    But that means vfs_lock_file() will return -EAGAIN instead of
    FILE_LOCK_DEFERRED. So we still need to translate that -EAGAIN return
    into a nlm_lck_blocked error in this case, and put ourselves back on
    lockd's block list.
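
    A minimal sketch of that translation (not the literal fix; "wait"
    stands in for "the client asked for a blocking lock"):

      /* Map the vfs_lock_file() result back to an NLM status. */
      static __be32 nlmsvc_lock_result_sketch(int error, int wait)
      {
              switch (error) {
              case 0:
                      return nlm_granted;
              case -EAGAIN:
                      /* A block is already queued for this request, so
                       * -EAGAIN still means "blocked", not a denial. */
                      return wait ? nlm_lck_blocked : nlm_lck_denied;
              default:
                      return nlm_lck_denied_nolocks;
              }
      }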

    The bug was introduced by bde74e4bc64415b1 "locks: add special return
    value for asynchronous locks".

    Thanks to Frank van Maarseveen for the report; his original test
    case was essentially

    for i in `seq 30`; do flock /nfsmount/foo sleep 10 & done

    Tested-by: Frank van Maarseveen
    Reported-by: Frank van Maarseveen
    Cc: Miklos Szeredi
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

04 Oct, 2008

2 commits

  • The current lockd does not reject reclaims that arrive outside of the
    grace period.

    Accepting a reclaim means promising to the client that no conflicting
    locks were granted since last it held the lock. We can meet that
    promise if we assume the only lockers are nfs clients, and that they are
    sufficiently well-behaved to reclaim only locks that they held before,
    and that only reclaim locks have been permitted so far. Once we leave
    the grace period (and start permitting non-reclaims), we can no longer
    keep that promise. So we must start rejecting reclaims at that point.
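
    A sketch of the check this implies (the locks_in_grace() predicate is
    used here only for illustration; lockd at the time tracked the grace
    period itself):

      static __be32 nlm_grace_check_sketch(int reclaim)
      {
              if (locks_in_grace() && !reclaim)
                      return nlm_lck_denied_grace_period; /* too early for new locks */
              if (!locks_in_grace() && reclaim)
                      return nlm_lck_denied_grace_period; /* too late to reclaim */
              return nlm_granted;
      }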

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Do all the grace period checks in svclock.c. This simplifies the code a
    bit, and will ease some later changes.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

26 Jul, 2008

1 commit

  • Use a special error value FILE_LOCK_DEFERRED to mean that a locking
    operation returned asynchronously. This is returned by:

    * posix_lock_file() for sleeping locks, to mean that the lock has been
    queued on the block list and will be woken up when it might become
    available and needs to be retried (either fl_lmops->fl_notify() is
    called or fl_wait is woken up).

    * f_op->lock(), to mean either the above, or that the filesystem will
    call back with fl_lmops->fl_grant() when the result of the locking
    operation is known. The filesystem can do this for sleeping as well
    as non-sleeping locks.

    This is to make sure that -EAGAIN and -EINPROGRESS return values from
    filesystems are not mistaken to mean asynchronous locking.

    This also makes error handling in fs/locks.c and lockd/svclock.c slightly
    cleaner.
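
    A sketch of how a caller is expected to tell the deferral apart from a
    real error (an illustrative fragment, not code from fs/locks.c):

      error = vfs_lock_file(filp, cmd, fl, NULL);
      switch (error) {
      case FILE_LOCK_DEFERRED:
              /* Queued asynchronously: retry after fl_lmops->fl_notify()
               * or fl_lmops->fl_grant() is called, or fl_wait wakes up. */
              break;
      case -EAGAIN:
              /* A genuine conflict reported by the filesystem. */
              break;
      default:
              /* 0 on success, or some other error. */
              break;
      }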

    Signed-off-by: Miklos Szeredi
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Matthew Wilcox
    Cc: David Teigland
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

21 Jul, 2008

1 commit

  • * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
    nfsd: nfs4xdr.c do-while is not a compound statement
    nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
    lockd: Pass "struct sockaddr *" to new failover-by-IP function
    lockd: get host reference in nlmsvc_create_block() instead of callers
    lockd: minor svclock.c style fixes
    lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
    lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
    lockd: nlm_release_host() checks for NULL, caller needn't
    file lock: reorder struct file_lock to save space on 64 bit builds
    nfsd: take file and mnt write in nfs4_upgrade_open
    nfsd: document open share bit tracking
    nfsd: tabulate nfs4 xdr encoding functions
    nfsd: dprint operation names
    svcrdma: Change WR context get/put to use the kmem cache
    svcrdma: Create a kmem cache for the WR contexts
    svcrdma: Add flush_scheduled_work to module exit function
    svcrdma: Limit ORD based on client's advertised IRD
    svcrdma: Remove unused wait q from svcrdma_xprt structure
    svcrdma: Remove unneeded spin locks from __svc_rdma_free
    svcrdma: Add dma map count and WARN_ON
    ...

    Linus Torvalds
     

16 Jul, 2008

5 commits

  • Push the BKL into those callback functions that actually need it.

    Note that all the NFS operations use their own locking, so they don't
    need the BKL. Ditto for the rpcbind client.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • It may not be obvious (till you look at the definition of
    nlm_alloc_call()) that a function like nlmsvc_create_block() should
    consume a reference on success or failure, so I find it clearer if it
    takes the reference it needs itself.

    And both callers already do this immediately before the call anyway.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The
    callers of this function, however, call nlmsvc_retrieve_args or
    nlm4svc_retrieve_args, which also return a nlm_host struct.

    Change nlmsvc_lock to take a host arg instead of calling
    nlmsvc_lookup_host itself and change the callers to pass a pointer to
    the nlm_host they've already found.

    Since nlmsvc_testlock() now just uses the caller's reference, we no
    longer need to get or release it.
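
    For illustration, the shape of the change (the parameter lists are an
    approximation of the svclock.c interface of the time, not exact
    declarations):

      /* before: nlmsvc_lock() looked up the nlm_host itself */
      __be32 nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
                         struct nlm_lock *lock, int wait,
                         struct nlm_cookie *cookie);

      /* after: the caller passes the nlm_host it already got back from
       * nlmsvc_retrieve_args()/nlm4svc_retrieve_args() */
      __be32 nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
                         struct nlm_host *host, struct nlm_lock *lock,
                         int wait, struct nlm_cookie *cookie);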

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The
    callers of this function, however, call nlmsvc_retrieve_args or
    nlm4svc_retrieve_args, which also return a nlm_host struct.

    Change nlmsvc_testlock to take a host arg instead of calling
    nlmsvc_lookup_host itself and change the callers to pass a pointer to
    the nlm_host they've already found.

    We take a reference to host in the place where nlmsvc_testlock()
    previously did a new lookup, so the reference counting is unchanged from
    before.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

30 Apr, 2008

1 commit


26 Apr, 2008

1 commit

  • The file_lock structure is used both as a heavy-weight representation of
    an active lock, with pointers to reference-counted structures, etc., and
    as a simple container for parameters that describe a file lock.

    The conflicting lock returned from __posix_lock_file is an example of
    the latter; so don't call the filesystem or lock manager callbacks when
    copying to it. This also removes the need for an otherwise unnecessary
    locks_init_lock in the nfsv4 server.

    Thanks to Trond for pointing out the error.

    Signed-off-by: J. Bruce Fields
    Cc: Trond Myklebust

    J. Bruce Fields
     

24 Apr, 2008

2 commits

  • As of 5996a298da43a03081e9ba2116983d173001c862 ("NLM: don't unlock on
    cancel requests") we no longer unlock in this case, so the comment is no
    longer accurate.

    Thanks to Stuart Friedberg for pointing out the inconsistency.

    Cc: Stuart Friedberg
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Have lockd_up start lockd using kthread_run. With this change,
    lockd_down now blocks until lockd actually exits, so there's no longer
    any need for the waitqueue code at the end of lockd_down. This also means
    that only one lockd can be running at a time which simplifies the code
    within lockd's main loop.

    This also adds a check for kthread_should_stop in the main loop of
    nlmsvc_retry_blocked and after that function returns. There's no sense
    continuing to retry blocks if lockd is coming down anyway.
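
    A sketch of the kthread pattern this moves to (the names and the loop
    body are simplified assumptions, not the real lockd() function):

      #include <linux/err.h>
      #include <linux/kthread.h>
      #include <linux/sched.h>

      static struct task_struct *lockd_task;

      static int lockd_sketch(void *dummy)
      {
              while (!kthread_should_stop()) {
                      /* service requests, retry blocked locks, ... */
                      schedule_timeout_interruptible(HZ);
              }
              return 0;
      }

      static int lockd_up_sketch(void)
      {
              lockd_task = kthread_run(lockd_sketch, NULL, "lockd");
              return IS_ERR(lockd_task) ? PTR_ERR(lockd_task) : 0;
      }

      static void lockd_down_sketch(void)
      {
              /* blocks until lockd_sketch() returns, so no waitqueue
               * handshake is needed any more */
              kthread_stop(lockd_task);
      }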

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

11 Feb, 2008

2 commits

  • It's possible for lockd to catch a SIGKILL while a GRANT_MSG callback
    is in flight. If this happens we don't want lockd to insert the block
    back into the nlm_blocked list.

    This helps that situation, but there's still a possible race. Fixing
    that will mean adding real locking for nlm_blocked.

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • With the current scheme in nlmsvc_grant_blocked, we can end up with more
    than one GRANT_MSG callback for a block in flight. Right now, we requeue
    the block unconditionally so that a GRANT_MSG callback is done again in
    30s. If the client is unresponsive, it can take more than 30s for the
    call already in flight to time out.

    There's no benefit to having more than one GRANT_MSG RPC queued up at a
    time, so put it on the list with a timeout of NLM_NEVER before doing the
    RPC call. If the RPC call submission fails, we requeue it with a short
    timeout. If it works, then nlmsvc_grant_callback will end up requeueing
    it with a shorter timeout after it completes.
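
    A sketch of the ordering described above (helper names follow
    fs/lockd/svclock.c of that era; the fragment is simplified):

      /* Park the block with no automatic retry before issuing the call,
       * so at most one GRANT_MSG is ever in flight for it. */
      nlmsvc_insert_block(block, NLM_NEVER);

      error = nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG,
                             &nlmsvc_grant_ops);
      if (error < 0)
              /* submission failed: requeue with a short timeout instead */
              nlmsvc_insert_block(block, 10 * HZ);
      /* on success, nlmsvc_grant_callback() requeues the block with a
       * short timeout once the RPC completes */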

    Signed-off-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

02 Feb, 2008

2 commits


10 Oct, 2007

1 commit

  • The recent fix for a circular lock dependency unfortunately introduced a
    potential memory leak in the event where the call to nlmsvc_lookup_host
    fails for some reason.

    Thanks to Roel Kluin for spotting this.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     

27 Sep, 2007

1 commit

    The problem is that the garbage collector for the 'host' structures,
    nlm_gc_hosts(), holds nlm_host_mutex while calling down to
    nlmsvc_mark_resources, which eventually takes the file->f_mutex.
    We cannot therefore call nlmsvc_lookup_host() from within
    nlmsvc_create_block, since the caller will already hold file->f_mutex, so
    the attempt to grab nlm_host_mutex may deadlock.

    Fix the problem by calling nlmsvc_lookup_host() outside the file->f_mutex.

    Signed-off-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     

27 Jul, 2007

1 commit


07 May, 2007

8 commits

  • Rewrite nlmsvc_lock() to use the asynchronous interface.

    As with testlock, we answer nlm requests in nlmsvc_lock by first looking up
    the block and then using the results we find in the block if B_QUEUED is
    set, and calling vfs_lock_file() otherwise.

    If this is a new lock request and we get an -EINPROGRESS return on a
    non-blocking request, then we defer the request.

    Also modify nlmsvc_unlock() to call the filesystem method if appropriate.

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • Normally we could skip ever having to allocate a block in the case where
    the client asks for a non-blocking lock, or asks for a blocking lock that
    succeeds immediately.

    However we're going to want to always look up a block first in order to
    check whether we're revisiting a deferred lock call, and to be prepared to
    handle the case where the filesystem returns -EINPROGRESS--in that case we
    want to make sure the lock we've given the filesystem is the one embedded
    in the block that we'll use to track the deferred request.

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • Rewrite nlmsvc_testlock() to use the new asynchronous interface: instead of
    immediately doing a posix_test_lock(), we first look for a matching block.
    If the subsequent test_lock returns anything other than -EINPROGRESS, we
    then remove the block we've found and return the results.

    If it returns -EINPROGRESS, then we defer the lock request.

    In the case where the block we find in the first step has B_QUEUED set,
    we bypass the vfs_test_lock entirely, instead using the block to decide
    how to respond:

    * with nlm_lck_denied if B_TIMED_OUT is set;
    * with nlm_granted if B_GOT_CALLBACK is set;
    * by dropping if neither B_TIMED_OUT nor B_GOT_CALLBACK is set.
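
    In code form, that decision amounts to something like the following
    sketch (the flag and status names are the ones described above):

      if (block->b_flags & B_QUEUED) {
              if (block->b_flags & B_TIMED_OUT)
                      return nlm_lck_denied;
              if (block->b_flags & B_GOT_CALLBACK)
                      return nlm_granted;
              /* no answer from the filesystem yet: drop the request and
               * let the client retransmit later */
              return nlm_drop_reply;
      }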

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • Change NLM internal interface to pass more information for test lock; we
    need this to make sure the cookie information is pushed down to the place
    where we do request deferral, which is handled for testlock by the
    following patch.

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • Add code to handle file system callback when the lock is finally granted.

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • We need to keep some state for a pending asynchronous lock request, so this
    patch adds that state to struct nlm_block.

    This also adds a function which defers the request, by calling
    rqstp->rq_chandle.defer and storing the resulting deferred request in a
    nlm_block structure which we insert into lockd's global block list. That
    new function isn't called yet, so it's dead code until a later patch.
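
    A sketch of that deferral step (the b_deferred_req field name and the
    timeout are assumptions based on the description above; the function
    is abridged and not the exact patch):

      static __be32 nlmsvc_defer_sketch(struct svc_rqst *rqstp,
                                        struct nlm_block *block)
      {
              block->b_flags |= B_QUEUED;

              /* Ask sunrpc to hold on to this request and revisit it
               * later; keep the handle so the eventual fl_grant()
               * callback can complete the call. */
              block->b_deferred_req =
                      rqstp->rq_chandle.defer(&rqstp->rq_chandle);
              if (block->b_deferred_req == NULL)
                      return nlm_lck_denied_nolocks;  /* cannot defer */

              /* keep the block on lockd's global list for a while */
              nlmsvc_insert_block(block, 30 * HZ);
              return nlm_drop_reply;                  /* answer later */
      }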

    Signed-off-by: Marc Eshel
    Signed-off-by: J. Bruce Fields

    Marc Eshel
     
  • The nfsv4 protocol's lock operation, in the case of a conflict, returns
    information about the conflicting lock.

    It's unclear how clients can use this, so for now we're not going so far as to
    add a filesystem method that can return a conflicting lock, but we may as well
    return something in the local case when it's easy to.

    Signed-off-by: Marc Eshel
    Signed-off-by: "J. Bruce Fields"

    Marc Eshel
     
  • posix_test_lock() and ->lock() do the same job but have gratuitously
    different interfaces. Modify posix_test_lock() so the two agree,
    simplifying some code in the process.

    Signed-off-by: Marc Eshel
    Signed-off-by: "J. Bruce Fields"

    Marc Eshel
     

04 Feb, 2007

1 commit


14 Dec, 2006

1 commit


09 Dec, 2006

1 commit


21 Oct, 2006

1 commit


04 Oct, 2006

3 commits

  • Both the (recently introduced) nsm_sema and the older f_sema are
    converted over.
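
    A sketch of what the conversion looks like for one of them (the struct
    is a stand-in used only to show the semaphore-to-mutex swap):

      #include <linux/mutex.h>

      struct nlm_file_sketch {
              struct mutex f_mutex;         /* was: struct semaphore f_sema */
              /* ... */
      };

      static void touch_file_state(struct nlm_file_sketch *file)
      {
              mutex_lock(&file->f_mutex);   /* was: down(&file->f_sema) */
              /* manipulate the per-file lock/block/share state */
              mutex_unlock(&file->f_mutex); /* was: up(&file->f_sema) */
      }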

    Cc: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • When we send a GRANTED_MSG call, we currently copy the NLM cookie provided in
    the original LOCK call - because in 1996, some broken clients seemed to rely
    on this bug. However, this means the cookies are not unique, so that when the
    client's GRANTED_RES message comes back, we cannot simply match it based on
    the cookie, but have to use the client's IP address in addition. Which breaks
    when you have a multi-homed NFS client.

    The X/Open spec explicitly mentions that clients should not expect the same
    cookie; so one may hope that any clients that were broken in 1996 have either
    been fixed or rendered obsolete.
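
    A minimal sketch of the alternative, generating a fresh cookie for
    each GRANTED_MSG call instead of echoing the client's (the helper
    below is hypothetical, not the actual patch):

      #include <linux/atomic.h>
      #include <linux/string.h>

      static void nlm_fresh_cookie_sketch(struct nlm_cookie *c)
      {
              static atomic_t nlm_cookie_seq = ATOMIC_INIT(0);
              u32 seq = atomic_inc_return(&nlm_cookie_seq);

              memcpy(c->data, &seq, sizeof(seq));
              c->len = sizeof(seq);
      }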

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
     
  • This patch makes nlm_traverse{locks,blocks,shares} and friends use a function
    pointer rather than an "action" enum.

    This function pointer is given two nlm_hosts (one given by the caller, the
    other taken from the lock/block/share currently visited), and is free to do
    with them as it wants. If it returns a non-zero value, the lock/block/share
    is released.
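
    Schematically (the typedef and match function here are illustrative,
    not the exact declarations):

      /* Return non-zero if the visited entry belongs to the matching host
       * and the lock/block/share should therefore be released. */
      typedef int (*nlm_host_match_sketch_t)(struct nlm_host *visited,
                                             struct nlm_host *ref);

      static int nlm_match_host_sketch(struct nlm_host *visited,
                                       struct nlm_host *ref)
      {
              return visited == ref;
      }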

    Signed-off-by: Olaf Kirch
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch