13 Jan, 2012

1 commit


07 Jan, 2012

1 commit


14 Sep, 2011

1 commit

  • For an IPv6 link-local address, lockd cannot call back to the client
    because the scope id is missing when the address is bound in inet6_bind:

    if (addr_type & IPV6_ADDR_LINKLOCAL) {
            if (addr_len >= sizeof(struct sockaddr_in6) &&
                addr->sin6_scope_id) {
                    /* Override any existing binding, if another one
                     * is supplied by user.
                     */
                    sk->sk_bound_dev_if = addr->sin6_scope_id;
            }

            /* Binding to link-local address requires an interface */
            if (!sk->sk_bound_dev_if) {
                    err = -EINVAL;
                    goto out_unlock;
            }

    Replace svc_addr_u with sockaddr_storage so that rqstp->rq_daddr can
    carry more information than the address alone.

    Reviewed-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: Mi Jinlong
    Signed-off-by: J. Bruce Fields

    Mi Jinlong
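
The rule quoted from inet6_bind can be illustrated in userspace. This is a minimal sketch, not the kernel code: the helper names parse_addr() and check_link_local_bind() are hypothetical, and bound_dev_if stands in for sk->sk_bound_dev_if. It shows why a bare address is not enough for a link-local bind and why a full sockaddr_in6, which carries sin6_scope_id, is.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build a sockaddr_in6 from text plus a scope id. */
static int parse_addr(struct sockaddr_in6 *out, const char *text,
                      unsigned int scope_id)
{
    assert(out != NULL && text != NULL);
    memset(out, 0, sizeof(*out));
    out->sin6_family = AF_INET6;
    out->sin6_scope_id = scope_id;
    return inet_pton(AF_INET6, text, &out->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Hypothetical userspace mirror of the quoted check: a link-local
 * address is only bindable when an interface is known, either from
 * the scope id in the address or from an earlier binding.
 */
static int check_link_local_bind(const struct sockaddr_in6 *addr,
                                 unsigned int *bound_dev_if)
{
    if (IN6_IS_ADDR_LINKLOCAL(&addr->sin6_addr)) {
        /* A scope id supplied with the address overrides the binding. */
        if (addr->sin6_scope_id)
            *bound_dev_if = addr->sin6_scope_id;

        /* No interface known: the bind cannot succeed. */
        if (*bound_dev_if == 0)
            return -EINVAL;
    }
    return 0;
}
```

With an address type that has no room for the scope id, the first branch can never supply an interface, so a link-local callback address fails with -EINVAL unless the socket was already bound to a device.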
     

20 Aug, 2011

1 commit

  • Use NUMA aware allocations to reduce latencies and increase throughput.

    sunrpc kthreads can use kthread_create_on_node() if pool_mode is
    "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
    also take into account NUMA node affinity for memory allocations.

    Signed-off-by: Eric Dumazet
    CC: "J. Bruce Fields"
    CC: Neil Brown
    CC: David Miller
    Reviewed-by: Greg Banks
    [bfields@redhat.com: fix up caller nfs41_callback_up]
    Signed-off-by: J. Bruce Fields

    Eric Dumazet
     

28 Jul, 2011

1 commit

  • * 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits)
    NFSv4: Don't use the delegation->inode in nfs_mark_return_delegation()
    nfs: don't use d_move in nfs_async_rename_done
    RDMA: Increasing RPCRDMA_MAX_DATA_SEGS
    SUNRPC: Replace xprt->resend and xprt->sending with a priority queue
    SUNRPC: Allow caller of rpc_sleep_on() to select priority levels
    SUNRPC: Support dynamic slot allocation for TCP connections
    SUNRPC: Clean up the slot table allocation
    SUNRPC: Initalise the struct xprt upon allocation
    SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot
    pnfs: simplify pnfs files module autoloading
    nfs: document nfsv4 sillyrename issues
    NFS: Convert nfs4_set_ds_client to EXPORT_SYMBOL_GPL
    SUNRPC: Convert the backchannel exports to EXPORT_SYMBOL_GPL
    SUNRPC: sunrpc should not explicitly depend on NFS config options
    NFS: Clean up - simplify the switch to read/write-through-MDS
    NFS: Move the pnfs write code into pnfs.c
    NFS: Move the pnfs read code into pnfs.c
    NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is needed
    NFS: Use the nfs_pageio_descriptor->pg_bsize in the read/write request
    NFS: Cache rpc_ops in struct nfs_pageio_descriptor
    ...

    Linus Torvalds
     

21 Jul, 2011

1 commit

  • Both the filesystem and the lock manager can associate operations with a
    lock. Confusingly, one of them (fl_release_private) actually has the
    same name in both operation structures.

    It would save some confusion to give the lock-manager ops different
    names.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

13 Jul, 2011

1 commit


15 Jun, 2011

1 commit

  • If the NLM daemon is killed on the NFS server, we can currently end up
    hanging forever on an 'unlock' request, instead of aborting. Basically,
    if the rpcbind request fails, or the server keeps returning garbage, we
    really want to quit instead of retrying.

    Tested-by: Vasily Averin
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
     

26 Jan, 2011

1 commit

  • Nick Bowler reports:

    > We were just having some NFS server troubles, and my client machine
    > running 2.6.38-rc1+ (specifically, commit 2b1caf6ed7b888c95) crashed
    > hard (syslog output appended to this mail).
    >
    > I'm not sure what the exact timeline was or how to reproduce this,
    > but the server was rebooted during all this. Since I've never seen
    > this happen before, it is possibly a regression from previous kernel
    > releases. However, I recently updated my nfs-utils (on the client) to
    > version 1.2.3, so that might be related as well.

    [ BUG output redacted ]

    When done searching, the for_each_host loop in next_host_state() falls
    through and returns the final host on the host chain without bumping
    its reference count.

    Since the host's ref count is only one at that point, releasing the
    host in nlm_host_rebooted() attempts to destroy the host prematurely,
    and therefore hits a BUG().

    Likely, the original intent of the for_each_host behavior in
    next_host_state() was to handle the case when the host chain is empty.
    Searching the chain and finding no suitable host to return needs to be
    handled as well.

    Defensively restructure next_host_state() always to return NULL when
    the loop falls through.

    Introduced by commit b10e30f6 "lockd: reorganize nlm_host_rebooted".

    Cc: J. Bruce Fields
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
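
The defensive shape described in this message can be sketched in plain C. This is a simplified illustration under assumptions, not the kernel function: struct host, next_stale_host() and put_host() are hypothetical names, the hash table is reduced to a single chain, and locking is omitted. The point is that a reference is taken only on the path that returns a match, and the fall-through path returns NULL rather than whatever the cursor last pointed at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an nlm_host entry on one chain. */
struct host {
    int refcount;
    int state;
    struct host *next;
};

/*
 * Return the first host whose state differs from new_state, with a
 * reference taken on behalf of the caller. When the search finds
 * nothing (empty chain or no match), return NULL.
 */
static struct host *next_stale_host(struct host *chain, int new_state)
{
    struct host *h;

    for (h = chain; h != NULL; h = h->next) {
        if (h->state != new_state) {
            h->state = new_state;
            h->refcount++;      /* the caller must drop this */
            return h;
        }
    }
    return NULL;                /* loop fell through: nothing to return */
}

/* Drop a reference obtained from next_stale_host(). */
static void put_host(struct host *h)
{
    assert(h->refcount > 0);
    h->refcount--;
}
```

A caller loops until NULL comes back, releasing each host after handling it; because every returned host carries its own reference, the release never drops the count the cache itself holds.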
     

05 Jan, 2011

1 commit


17 Dec, 2010

17 commits

  • Clean up.

    The contents of the src_sap field are not used in nlm_alloc_host().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up.

    Remove the now unused helper nlm_lookup_host().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up.

    nlm_hosts now contains only server-side entries. Rename it to match
    convention of client side cache.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up.

    Change nlmsvc_lookup_host() to be purpose-built for server-side
    nlm_host management. This replaces the generic nlm_lookup_host()
    helper function, just like on the client side. The lookup logic is
    specialized for server host lookups.

    The server side cache also gets its own specialized equivalent of the
    nlm_release_host() function.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • NFS clients don't need the garbage collection processing that is
    performed on nlm_host structures. The client picks up an nlm_host at
    mount time and holds a reference to it until the file system is
    unmounted.

    Servers, on the other hand, don't have a precise way to tell when an
    nlm_host is no longer being used, so zero refcount nlm_host entries
    are left to expire in the cache after a time.

    Basically there's nothing holding a reference to an nlm_host between
    individual server-side NLM requests, but we can't afford the expense
    of recreating them for every new NLM request from a client. The
    nlm_host cache adds some lifetime hysteresis to entries in the cache
    so the next time a particular nlm_host is needed, it's likely to be
    discovered by a lookup rather than created from whole cloth.

    With the new implementation, client nlm_host cache items are no longer
    garbage collected, and are destroyed directly by a new release
    function specialized for client entries, nlmclnt_release_host(). They
    are cached in their own data structure, and have their own lookup
    logic, simplified and specialized for client nlm_host entries.

    However, the client nlm_host cache still shares reboot recovery logic
    with the server nlm_host cache. The NSM "peer rebooted" downcall for
    clients and servers still comes through the same RPC call. This is a
    legacy formal API that would be difficult to alter, and besides, the
    user space NSM implementation can't tell the difference between peers
    that are clients or servers.

    For this reason, the client cache continues to share the
    nlm_host_mutex (and reboot recovery logic) with the server cache.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
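
The two lifetimes described above can be contrasted in a small model. This is a sketch under stated assumptions: the struct, the function names and the HOST_EXPIRE value are hypothetical, a "destroyed" flag replaces real freeing so the behavior is observable, and no locking is shown. Client entries are destroyed as soon as the last reference goes away; server entries merely drop to zero and wait for a garbage-collection pass after their expiry time.

```c
#include <assert.h>
#include <stdbool.h>

#define HOST_EXPIRE 300L    /* hypothetical cache lifetime, in seconds */

/* Hypothetical model of a cached host entry. */
struct host {
    int  refcount;
    long expires;
    bool destroyed;
};

/* Client side: the last reference destroys the entry directly. */
static void client_release_host(struct host *h)
{
    assert(h->refcount > 0);
    if (--h->refcount == 0)
        h->destroyed = true;
}

/* Server side: each lookup takes a reference and refreshes the expiry. */
static struct host *server_get_host(struct host *h, long now)
{
    h->refcount++;
    h->expires = now + HOST_EXPIRE;
    return h;
}

/* Server side: release only drops the count; the entry stays cached. */
static void server_release_host(struct host *h)
{
    assert(h->refcount > 0);
    h->refcount--;
}

/* Server side: a GC pass destroys idle entries that have expired. */
static void server_gc_host(struct host *h, long now)
{
    if (h->refcount == 0 && now >= h->expires)
        h->destroyed = true;
}
```

The delay between server_release_host() and a successful server_gc_host() is the "lifetime hysteresis" the message refers to: a request arriving inside that window finds the entry by lookup instead of rebuilding it.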
     
  • The nlm_release_call() function is invoked from both the server and
    the client side. We're about to introduce a distinct server- and
    client-side nlm_release_host(), so nlm_release_call() must first be
    split into a client-side and a server-side version.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Refactor the tail of nlm_gc_hosts() into nlm_destroy_host() so that
    this logic can be used separately from garbage collection.

    Rename it _locked() to document that it must be called with the hosts
    cache mutex held.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Refactor nlm_host allocation and initialization into a separate
    function. This will be the common piece of server and client nlm_host
    lookup logic after the nlm_host cache is split.

    Small change: use kmalloc() instead of kzalloc(), as we're overwriting
    almost all fields in the new nlm_host struct with non-zero values
    immediately after it is allocated. An added benefit is we now have an
    explicit reference to each field name where it is initialized (for all
    you cscope fans out there).

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Minor reorganization; no change in behavior. This will save some
    duplicated code after we split the client and server host caches.

    Signed-off-by: J. Bruce Fields
    [ cel: Forward-ported to 2.6.37 ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • We've got a lot of loops like this, and I find them a little easier to
    read with the macros. More such loops are coming.

    Signed-off-by: J. Bruce Fields
    [ cel: Forward-ported to 2.6.37 ]
    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
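
An iteration macro of the kind this message describes might look like the following. It is only a sketch: the name for_each_host is borrowed from a later entry in this log, while the macro body, NR_BUCKETS and struct host are assumptions made for illustration and use a plain linked list rather than the kernel's list types.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BUCKETS 4        /* hypothetical table size */

struct host {
    int id;
    struct host *next;
};

/* Visit every host on every chain of a bucket table. */
#define for_each_host(h, b, table)                              \
    for ((b) = 0; (b) < NR_BUCKETS; (b)++)                      \
        for ((h) = (table)[(b)]; (h) != NULL; (h) = (h)->next)

/* Count all cached hosts. */
static int count_hosts(struct host *table[NR_BUCKETS])
{
    struct host *h;
    int b, n = 0;

    assert(table != NULL);
    for_each_host(h, b, table)
        n++;
    return n;
}

/* Find a host by id, or return NULL when the walk falls through. */
static struct host *find_host(struct host *table[NR_BUCKETS], int id)
{
    struct host *h;
    int b;

    for_each_host(h, b, table) {
        if (h->id == id)
            return h;
    }
    return NULL;
}
```

One design point worth keeping in mind with a two-level macro like this: a `break` in the loop body leaves only the inner chain walk, so early exits are clearer as a `return` or a `goto`.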
     
  • Now that all client-side XDR decoder routines use xdr_streams, there
    should be no need to support the legacy calling sequence [rpc_rqst *,
    __be32 *, RPC res *] anywhere. We can construct an xdr_stream in the
    generic RPC code, instead of in each decoder function.

    This is a refactoring change. It should not cause different behavior.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Now that all client-side XDR encoder routines use xdr_streams, there
    should be no need to support the legacy calling sequence [rpc_rqst *,
    __be32 *, RPC arg *] anywhere. We can construct an xdr_stream in the
    generic RPC code, instead of in each encoder function.

    Also, all the client-side encoder functions return 0 now, making a
    return value superfluous. Take this opportunity to convert them to
    return void instead.

    This is a refactoring change. It should not cause different behavior.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
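
The calling convention described here, where generic code builds the stream once and per-procedure encoders return void, can be sketched in userspace. All names below (enc_stream, put_u32, encode_unlock, generic_encode, unlock_args) are hypothetical, and assert() stands in for the kernel's BUG() on a local sizing error.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much simplified encode stream. */
struct enc_stream {
    uint8_t *p;
    uint8_t *end;
};

/* Encoders cannot fail, so they return void. */
typedef void (*encode_fn)(struct enc_stream *xdr, const void *args);

/* Append one 32-bit value in network byte order. */
static void put_u32(struct enc_stream *xdr, uint32_t value)
{
    uint32_t raw = htonl(value);

    /* Running out of room here is a local coding error. */
    assert((size_t)(xdr->end - xdr->p) >= sizeof(raw));
    memcpy(xdr->p, &raw, sizeof(raw));
    xdr->p += sizeof(raw);
}

/* Hypothetical argument structure and its encoder. */
struct unlock_args {
    uint32_t offset;
    uint32_t length;
};

#define UNLOCK_ARGS_WORDS 2     /* encoded size in 4-byte XDR words */

static void encode_unlock(struct enc_stream *xdr, const void *args)
{
    const struct unlock_args *a = args;

    put_u32(xdr, a->offset);
    put_u32(xdr, a->length);
}

/* Generic layer: build the stream once, then call the encoder. */
static size_t generic_encode(encode_fn encode, const void *args,
                             uint8_t *buf, size_t size)
{
    struct enc_stream xdr = { buf, buf + size };

    encode(&xdr, args);
    return (size_t)(xdr.p - buf);
}
```

Because the stream is created in one place, an encoder contains only the field-by-field layout, and the buffer size can be stated as a count of XDR words next to the encoder it belongs to.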
     
  • Clean up.

    The trend in the other XDR encoder functions is to BUG() when encoding
    problems occur, since a problem here is always due to a local coding
    error. Then, instead of a status, zero is unconditionally returned.

    Update the NSM XDR encoders to behave this way.

    To finish the update, use the new-style be32_to_cpup() and
    cpu_to_be32() macros, and compute the buffer sizes using raw integers
    instead of sizeof(). This matches the conventions used in other XDR
    functions.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up. nlmdbg_cookie2a() is used only in svclock.c.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • We'd like to prevent local buffer overflows caused by malicious or
    broken servers. New xdr_stream style decoders can do that.

    For efficiency, we also want to be able to pass xdr_streams from
    call_encode() to all XDR encoding functions, rather than building
    an xdr_stream in every XDR encoding function in the kernel.

    Same idea as the NLM v3 XDR overhaul.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • We'd like to prevent local buffer overflows caused by malicious or
    broken servers. New xdr_stream style decoders can do that.

    For efficiency, we also eventually want to be able to pass xdr_streams
    from call_encode() and call_decode() to all XDR encoding functions,
    rather than building an xdr_stream in every XDR encoding and decoding
    function in the kernel.

    To do all of this, rewrite the XDR encoding and decoding functions in
    fs/lockd/xdr.c to use xdr_streams. This makes them more or less
    incompatible with server-side XDR helper functions, so break them out
    into a separate source file.

    Static helper functions are left without the "inline" directive. This
    allows the compiler to choose automatically how to optimize these for
    size or speed.

    SHARE-related functionality doesn't seem to be used, as those
    functions are hiding behind a #define that isn't set anywhere that I
    can find. And, they've been in there forever (at least as far back as
    the kernel's git history goes), yet remain unused. Let's take the
    opportunity to bin them. It should be easy enough for someone to
    introduce proper XDR functions if at some point SHARE-related NLM
    functionality is desired.

    Signed-off-by: Chuck Lever
    Tested-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Chuck Lever
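
The overflow protection mentioned above comes from checking every read against the end of the received buffer. The following is a userspace sketch of that idea, not the kernel's xdr_stream API: xdr_cursor, cursor_take(), decode_u32() and decode_opaque() are hypothetical names. It decodes an XDR variable-length opaque (a 4-byte length followed by data padded to a multiple of four bytes) and refuses input whose claimed length exceeds either the destination or the bytes actually present.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decode cursor over a received reply buffer. */
struct xdr_cursor {
    const uint8_t *p;
    const uint8_t *end;
};

/* Reserve nbytes of input, or return NULL if the buffer is too short. */
static const uint8_t *cursor_take(struct xdr_cursor *xdr, size_t nbytes)
{
    const uint8_t *start = xdr->p;

    assert(xdr->p <= xdr->end);
    if ((size_t)(xdr->end - xdr->p) < nbytes)
        return NULL;
    xdr->p += nbytes;
    return start;
}

/* Decode one 32-bit value from network byte order. */
static int decode_u32(struct xdr_cursor *xdr, uint32_t *value)
{
    uint32_t raw;
    const uint8_t *p = cursor_take(xdr, sizeof(raw));

    if (p == NULL)
        return -1;
    memcpy(&raw, p, sizeof(raw));
    *value = ntohl(raw);
    return 0;
}

/* Decode a variable-length opaque into dst, bounded on both sides. */
static int decode_opaque(struct xdr_cursor *xdr, uint8_t *dst,
                         size_t dst_size, size_t *len)
{
    uint32_t n;
    const uint8_t *p;

    if (decode_u32(xdr, &n) != 0)
        return -1;
    if (n > dst_size)           /* sender claims more than we can hold */
        return -1;
    p = cursor_take(xdr, ((size_t)n + 3) & ~(size_t)3);
    if (p == NULL)              /* sender claims more than it sent */
        return -1;
    memcpy(dst, p, n);
    *len = n;
    return 0;
}
```

Both failure paths return before any copy takes place, so a malformed length field from a broken or hostile peer cannot write past the local buffer.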
     

18 Nov, 2010

1 commit


16 Nov, 2010

1 commit

  • Nick Bowler reports:
    There are no unusual messages on the client... but I just logged into
    the server and I see lots of messages of the following form:

    nfsd: request from insecure port (192.168.8.199:35766)!
    nfsd: request from insecure port (192.168.8.199:35766)!
    nfsd: request from insecure port (192.168.8.199:35766)!
    nfsd: request from insecure port (192.168.8.199:35766)!
    nfsd: request from insecure port (192.168.8.199:35766)!

    Bisected to commit 9247685088398cf21bcb513bd2832b4cd42516c4 (SUNRPC:
    Properly initialize sock_xprt.srcaddr in all cases)

    Apparently, removing the 'transport->srcaddr.ss_family = family' from
    xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
    initialising the srcaddr family to AF_UNSPEC.

    Reported-by: Nick Bowler
    Signed-off-by: Trond Myklebust

    Trond Myklebust
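
The kind of initialisation at issue can be sketched as follows. This is an illustration only, not the patch: init_wildcard_srcaddr() is a hypothetical helper that gives a wildcard source address the same family as the destination instead of leaving it as AF_UNSPEC, so that code which later binds the source end has a usable family to work with.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: prepare a wildcard source address (any
 * address, any port) whose family matches the destination.
 * Returns 0 on success, -1 for an unsupported family.
 */
static int init_wildcard_srcaddr(struct sockaddr_storage *src,
                                 const struct sockaddr *dst)
{
    assert(src != NULL && dst != NULL);

    /* All-zero address and port mean "any" for both families. */
    memset(src, 0, sizeof(*src));

    switch (dst->sa_family) {
    case AF_INET:
    case AF_INET6:
        src->ss_family = dst->sa_family;
        return 0;
    default:
        return -1;
    }
}
```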
     

28 Oct, 2010

2 commits


27 Oct, 2010

1 commit

  • * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits)
    svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
    svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
    svcrpc: assume svc_delete_xprt() called only once
    svcrpc: never clear XPT_BUSY on dead xprt
    nfsd4: fix connection allocation in sequence()
    nfsd4: only require krb5 principal for NFSv4.0 callbacks
    nfsd4: move minorversion to client
    nfsd4: delay session removal till free_client
    nfsd4: separate callback change and callback probe
    nfsd4: callback program number is per-session
    nfsd4: track backchannel connections
    nfsd4: confirm only on succesful create_session
    nfsd4: make backchannel sequence number per-session
    nfsd4: use client pointer to backchannel session
    nfsd4: move callback setup into session init code
    nfsd4: don't cache seq_misordered replies
    SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
    SUNRPC: Use conventional switch statement when reclassifying sockets
    sunrpc/xprtrdma: clean up workqueue usage
    sunrpc: Turn list_for_each-s into the ..._entry-s
    ...

    Fix up trivial conflicts (two different deprecation notices added in
    separate branches) in Documentation/feature-removal-schedule.txt

    Linus Torvalds
     

02 Oct, 2010

2 commits


23 Sep, 2010

1 commit


22 Sep, 2010

1 commit

  • This patch removes all calls to lock_kernel() from the client. This patch
    should be applied after the "fs/lock.c prepare for BKL removal" patch submitted
    by Arnd Bergmann on September 18.

    Signed-off-by: Bryan Schumaker
    Signed-off-by: Trond Myklebust

    Bryan Schumaker
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

09 Feb, 2010

2 commits

  • When lockd gets a notify downcall from statd, it'll search its hosts
    cache and then clear the sm_monitored bit on the host it finds. The idea
    is apparently to make lockd redo a SM_MON on the next lock request.

    This is unnecessary and causes the kernel's NSM cache to go out of sync
    with statd. statd doesn't stop monitoring a host when it gets a
    SM_NOTIFY and there's no guarantee that another lock will occur after
    the reclaim and before the unmount. In that event, no SM_UNMON will
    occur.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • nsm_reboot_lookup takes a reference to the nsm_handle that it returns,
    but nlm_host_rebooted never releases that reference.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

27 Jan, 2010

1 commit

  • Clean up: Bruce observed we have more or less common logic in each of
    svc_create_xprt()'s callers: the check to create an IPv6 RPC listener
    socket only if CONFIG_IPV6 is set. I'm about to add another case
    that does just the same.

    If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
    call sites can get rid of the "#ifdef" ugliness, and can use the same
    logic with or without IPv6 support available in the kernel.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
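
The clean-up pattern described here, keeping the conditional compilation inside one helper so that call sites stay free of #ifdef, can be sketched as below. The names are hypothetical: HAVE_IPV6_SKETCH stands in for CONFIG_IPV6, and create_listener() does no real socket work; it only models which families are available.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical build-time switch standing in for CONFIG_IPV6. */
#ifndef HAVE_IPV6_SKETCH
#define HAVE_IPV6_SKETCH 1
#endif

/*
 * The only place that knows whether IPv6 was compiled in.
 * Returns 0 on success or a negative errno value.
 */
static int create_listener(int family)
{
    switch (family) {
    case AF_INET:
        return 0;
    case AF_INET6:
#if HAVE_IPV6_SKETCH
        return 0;
#else
        return -EAFNOSUPPORT;
#endif
    default:
        return -EAFNOSUPPORT;
    }
}

/* Call site: identical source with or without IPv6 support. */
static int create_all_listeners(void)
{
    int err;

    err = create_listener(AF_INET);
    assert(err <= 0);
    if (err < 0)
        return err;

    err = create_listener(AF_INET6);
    if (err < 0 && err != -EAFNOSUPPORT)
        return err;             /* a real failure, not a missing family */
    return 0;
}
```

The caller treats "family not supported" as an acceptable outcome for the IPv6 listener, which is what lets the same call-site logic serve both kernel configurations.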