30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    out an error message indicating which .h file needs to be added to
    the file.
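
The first bullet above (pick gfp.h or slab.h depending on what a file uses) can be sketched in C as follows. This is not the slabh-sweep.py script itself: the token lists and the function name `needed_header` are hypothetical, and a plain substring match is much cruder than whatever heuristics the real script applies.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical token lists, chosen for illustration only. */
static const char *const slab_tokens[] = {
    "kmalloc(", "kzalloc(", "kfree(", "kmem_cache_"
};
static const char *const gfp_tokens[] = { "GFP_", "__get_free_page" };

/* Return 1 if any token occurs anywhere in the source text. */
static int uses_any(const char *src, const char *const *tokens, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strstr(src, tokens[i]))
            return 1;
    return 0;
}

/*
 * Return the one header the file should include: slab.h if any slab
 * facility is used (slab.h includes gfp.h itself), gfp.h if only gfp
 * facilities are used, or NULL if neither is needed.
 */
const char *needed_header(const char *src)
{
    if (uses_any(src, slab_tokens, ARRAY_LEN(slab_tokens)))
        return "linux/slab.h";
    if (uses_any(src, gfp_tokens, ARRAY_LEN(gfp_tokens)))
        return "linux/gfp.h";
    return NULL;
}
```

Checking slab usage first is what keeps the result down to a single header when a file uses both facilities.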

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

09 Feb, 2010

2 commits

  • When lockd gets a notify downcall from statd, it'll search its hosts
    cache and then clear the sm_monitored bit on the host it finds. The idea
    is apparently to make lockd redo a SM_MON on the next lock request.

    This is unnecessary and causes the kernel's NSM cache to go out of sync
    with statd. statd doesn't stop monitoring a host when it gets a
    SM_NOTIFY and there's no guarantee that another lock will occur after
    the reclaim and before the unmount. In that event, no SM_UNMON will
    occur.

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • nsm_reboot_lookup takes a reference to the nsm_handle that it returns,
    but nlm_host_rebooted never releases that reference.
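
The leak pattern described here can be shown with a toy reference count in user-space C. The struct and function names below are hypothetical stand-ins, not the kernel's nsm_handle code; the only point is that a lookup which takes a reference needs a matching release in its caller.

```c
#include <stddef.h>

/* Toy stand-in for a reference-counted handle. */
struct sketch_handle {
    int refcount;
};

/* Lookup returns the handle with one extra reference held. */
static struct sketch_handle *sketch_lookup_get(struct sketch_handle *h)
{
    if (h)
        h->refcount++;
    return h;
}

/* Drop one reference. */
static void sketch_put(struct sketch_handle *h)
{
    if (h)
        h->refcount--;
}

/* Leaky variant: the reference taken by the lookup is never dropped. */
void sketch_rebooted_leaky(struct sketch_handle *h)
{
    struct sketch_handle *nsm = sketch_lookup_get(h);

    (void)nsm; /* ... work with nsm, then return without a put ... */
}

/* Balanced variant: every successful lookup is paired with a put. */
void sketch_rebooted_fixed(struct sketch_handle *h)
{
    struct sketch_handle *nsm = sketch_lookup_get(h);

    if (!nsm)
        return;
    /* ... work with nsm ... */
    sketch_put(nsm);
}
```

Each call to the leaky variant raises the count by one for good, so the object can never be freed; the balanced variant leaves the count where it found it.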

    Signed-off-by: Jeff Layton
    Reviewed-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     

27 Jan, 2010

1 commit

  • Clean up: Bruce observed we have more or less common logic in each of
    svc_create_xprt()'s callers: the check to create an IPv6 RPC listener
    socket only if CONFIG_IPV6 is set. I'm about to add another case
    that does just the same.

    If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
    call sites can get rid of the "#ifdef" ugliness, and can use the same
    logic with or without IPv6 support available in the kernel.
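
A minimal user-space sketch of the idea: the configuration check lives inside the creation function, so callers pass any family without an #ifdef of their own. `SKETCH_HAVE_IPV6` and `sketch_create_listener()` are hypothetical names standing in for CONFIG_IPV6 and the real transport code, and returning -EAFNOSUPPORT is an assumption of this sketch.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for CONFIG_IPV6. Uncomment to build the sketch
 * "with IPv6 support".
 */
/* #define SKETCH_HAVE_IPV6 1 */

/*
 * Create a listener for the given protocol family. Returns 0 on success
 * or a negative errno value, following kernel convention. The #ifdef is
 * confined to this one function.
 */
int sketch_create_listener(int family)
{
    switch (family) {
    case PF_INET:
        return 0;
#ifdef SKETCH_HAVE_IPV6
    case PF_INET6:
        return 0;
#endif
    default:
        return -EAFNOSUPPORT;
    }
}
```

A call site then reads the same with or without IPv6 support: it calls `sketch_create_listener(PF_INET6)` and handles the error code.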

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     

17 Dec, 2009

1 commit

  • * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits)
    nfsd: remove pointless paths in file headers
    nfsd: move most of nfsfh.h to fs/nfsd
    nfsd: remove unused field rq_reffh
    nfsd: enable V4ROOT exports
    nfsd: make V4ROOT exports read-only
    nfsd: restrict filehandles accepted in V4ROOT case
    nfsd: allow exports of symlinks
    nfsd: filter readdir results in V4ROOT case
    nfsd: filter lookup results in V4ROOT case
    nfsd4: don't continue "under" mounts in V4ROOT case
    nfsd: introduce export flag for v4 pseudoroot
    nfsd: let "insecure" flag vary by pseudoflavor
    nfsd: new interface to advertise export features
    nfsd: Move private headers to source directory
    vfs: nfsctl.c un-used nfsd #includes
    lockd: Remove un-used nfsd headers #includes
    s390: remove un-used nfsd #includes
    sparc: remove un-used nfsd #includes
    parsic: remove un-used nfsd #includes
    compat.c: Remove dependence on nfsd private headers
    ...

    Linus Torvalds
     

15 Dec, 2009

1 commit


19 Nov, 2009

1 commit


12 Nov, 2009

1 commit


24 Sep, 2009

1 commit

  • * remove asm/atomic.h inclusion from linux/utsname.h --
    not needed after kref conversion
    * remove linux/utsname.h inclusion from files which do not need it

    NOTE: it looks like fs/binfmt_elf.c does not need utsname.h; however,
    due to some personality stuff it _is_ needed -- cowardly leave ELF-related
    headers and files alone.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

22 Sep, 2009

3 commits

  • * 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits)
    nfsd4: nfsv4 clients should cross mountpoints
    nfsd: revise 4.1 status documentation
    sunrpc/cache: avoid variable over-loading in cache_defer_req
    sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
    nfsd: return success for non-NFS4 nfs4_state_start
    nfsd41: Refactor create_client()
    nfsd41: modify nfsd4.1 backchannel to use new xprt class
    nfsd41: Backchannel: Implement cb_recall over NFSv4.1
    nfsd41: Backchannel: cb_sequence callback
    nfsd41: Backchannel: Setup sequence information
    nfsd41: Backchannel: Server backchannel RPC wait queue
    nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
    nfsd41: Backchannel: callback infrastructure
    nfsd4: use common rpc_cred for all callbacks
    nfsd4: allow nfs4 state startup to fail
    SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous
    nfsd4: fix null dereference creating nfsv4 callback client
    nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
    nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
    sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
    ...

    Linus Torvalds
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

21 Aug, 2009

1 commit


10 Aug, 2009

2 commits


13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

23 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     

18 Jun, 2009

3 commits

  • Cut NSM upcall RPC traffic in half -- don't do a NULL call first.
    The cases where a ping would be helpful are rare.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • When rpc.statd starts up in user space at boot time, it attempts to
    write the latest NSM local state number into
    /proc/sys/fs/nfs/nsm_local_state.

    If lockd.ko isn't loaded yet (as is the case in most configurations),
    that file doesn't exist, thus the kernel's NSM state remains set to
    its initial value of zero during lockd operation.

    This is a problem because rpc.statd and lockd use the NSM state number
    to prevent repeated lock recovery on rebooted hosts. If lockd sends
    a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state
    number is received, there is no way for lockd or rpc.statd to
    distinguish that stale SM_NOTIFY from an actual reboot. Thus lock
    recovery could be performed after the rebooted host has already
    started reclaiming locks, and those locks will be lost.

    We could change /etc/init.d/nfslock so it always modprobes lockd.ko
    before starting rpc.statd. However, if lockd.ko is ever unloaded
    and reloaded, we are back at square one, since the NSM state is not
    preserved across an unload/reload cycle. This may happen frequently
    on clients that use the automounter. A period of NFS inactivity causes
    lockd.ko to be unloaded, and the kernel loses its NSM state setting.

    Instead, let's use the fact that rpc.statd plants the local system's
    NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a
    synchronous SM_MON upcall to the local rpc.statd _before_ sending its
    first NLM request to a new remote. This would permit rpc.statd to
    provide the current NSM state to lockd, even after lockd.ko had been
    unloaded and reloaded.

    Note that NLMPROC_LOCK arguments are constructed before the
    nsm_monitor() call, so we have to rearrange argument construction very
    slightly to make this all work out.

    And, the kernel appears to treat NSM state as a u32 (see struct
    nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure
    we don't get bogus comparison results.
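
The approach can be sketched in user-space C as below. The struct layout, the names, and the "status 0 means success" convention are assumptions of this sketch, not the kernel's definitions; it only shows the local state being refreshed from each successful SM_MON reply and held in a 32-bit unsigned type.

```c
#include <stdint.h>

/* Hypothetical shape of an SM_MON reply: a status plus the NSM state. */
struct sketch_nsm_res {
    uint32_t status; /* assumed: 0 means success */
    uint32_t state;  /* local NSM state reported by the statd side */
};

/* Stays zero until the first successful SM_MON reply arrives. */
static uint32_t sketch_local_state;

/*
 * Process an SM_MON reply. On success, remember the state it carries so
 * that later requests send a real value even after a module reload.
 * Returns 0 on success, -1 if the reply reported a failure.
 */
int sketch_monitor_done(const struct sketch_nsm_res *res)
{
    if (res->status != 0)
        return -1;
    sketch_local_state = res->state;
    return 0;
}

uint32_t sketch_get_local_state(void)
{
    return sketch_local_state;
}
```

Holding the value in the same unsigned 32-bit type as the reply field means a state with the top bit set compares equal to itself rather than turning negative in a signed variable.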

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     

16 Jun, 2009

1 commit


07 May, 2009

1 commit

  • If lockd is signalled soon enough after restart then locks_start_grace()
    will try to re-add an entry to a list and trigger a lock corruption
    warning.
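
To illustrate why re-adding an entry corrupts the list, and one way to guard against it, here is a small user-space circular list. It is a sketch with hypothetical names and is not claimed to be the change this commit makes: the entry is unlinked first if it is still on the list, then added exactly once.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list node. */
struct sketch_node {
    struct sketch_node *next, *prev;
};

/* An unlinked node points at itself. */
void sketch_node_init(struct sketch_node *n)
{
    n->next = n->prev = n;
}

static bool sketch_node_unlinked(const struct sketch_node *n)
{
    return n->next == n;
}

/* Insert n right after head. Calling this twice for one n corrupts the list. */
static void sketch_node_add(struct sketch_node *n, struct sketch_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and reset it to the self-pointing "unlinked" state. */
static void sketch_node_del_init(struct sketch_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    sketch_node_init(n);
}

/* Restart-safe add: drop a stale entry first, then add it once. */
void sketch_restart_grace(struct sketch_node *n, struct sketch_node *head)
{
    if (!sketch_node_unlinked(n))
        sketch_node_del_init(n);
    sketch_node_add(n, head);
}

/* Count the entries on the list, excluding the head itself. */
int sketch_list_len(const struct sketch_node *head)
{
    int len = 0;

    for (const struct sketch_node *p = head->next; p != head; p = p->next)
        len++;
    return len;
}
```

Without the unlink step, the second add makes the node point at itself, which is the kind of inconsistency the list debugging check reports.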

    Thanks to Wang Chen for the problem report and diagnosis.

    WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
    ...
    list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128).
    ...
    Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3
    Call Trace:
    warn_slowpath+0x71/0xa0
    ? update_curr+0x11d/0x125
    ? trace_hardirqs_on_caller+0x18/0x150
    ? trace_hardirqs_on+0xb/0xd
    ? _raw_spin_lock+0x53/0xfa
    __list_add+0x27/0x5c
    locks_start_grace+0x22/0x30 [lockd]
    set_grace_period+0x39/0x53 [lockd]
    ? lock_kernel+0x1c/0x28
    lockd+0x64/0x164 [lockd]
    ? trace_hardirqs_on_caller+0x18/0x150
    ? complete+0x34/0x3e
    ? lockd+0x0/0x164 [lockd]
    ? lockd+0x0/0x164 [lockd]
    kthread+0x45/0x6b
    ? kthread+0x0/0x6b
    kernel_thread_helper+0x7/0x10

    Reported-by: Wang Chen
    Signed-off-by: J. Bruce Fields
    Cc: stable@kernel.org

    J. Bruce Fields
     

25 Apr, 2009

1 commit

  • For every lock request lockd creates a new file_lock object
    in nlmsvc_setgrantargs() by copying the passed in file_lock with
    locks_copy_lock(). A filesystem can attach its own lock_operations
    vector to the file_lock. It has to be cleaned up at the end of the
    file_lock's life. However, lockd doesn't do it today, yet it
    asserts in nlmclnt_release_lockargs() that the per-filesystem
    state is clean.
    This patch fixes it by exporting locks_release_private() and adding
    it to nlmsvc_freegrantargs(), to be symmetrical to creating a
    file_lock in nlmsvc_setgrantargs().

    Signed-off-by: Felix Blyakher
    Signed-off-by: J. Bruce Fields

    Felix Blyakher
     

07 Apr, 2009

1 commit

  • * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits)
    nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4
    nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc
    nfsd41: Documentation/filesystems/nfs41-server.txt
    nfsd41: CREATE_EXCLUSIVE4_1
    nfsd41: SUPPATTR_EXCLCREAT attribute
    nfsd41: support for 3-word long attribute bitmask
    nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
    nfsd41: pass writable attrs mask to nfsd4_decode_fattr
    nfsd41: provide support for minor version 1 at rpc level
    nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
    nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap
    nfsd41: access_valid
    nfsd41: clientid handling
    nfsd41: check encode size for sessions maxresponse cached
    nfsd41: stateid handling
    nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
    nfsd41: destroy_session operation
    nfsd41: non-page DRC for solo sequence responses
    nfsd41: Add a create session replay cache
    nfsd41: create_session operation
    ...

    Linus Torvalds
     

02 Apr, 2009

1 commit


29 Mar, 2009

4 commits

  • Apparently a lot of people need to disable IPv6 completely on their
    distributor-built systems, which have CONFIG_IPV6_MODULE enabled at
    build time.

    They do this by blacklisting the ipv6.ko module. This causes the
    creation of the lockd service listener to fail if CONFIG_IPV6_MODULE
    is set, but the module cannot be loaded.

    Now that the kernel's PF_INET6 RPC listeners are completely separate
    from PF_INET listeners, we can always start PF_INET. Then lockd can
    try to start PF_INET6, but it isn't required to be available.

    Note this has the added benefit that NLM callbacks from AF_INET6
    servers will never come from AF_INET remotes. We no longer have to
    worry about matching mapped IPv4 addresses to AF_INET when comparing
    addresses.
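
The startup policy described above (IPv4 listener mandatory, IPv6 listener best effort) can be sketched in user-space C. All names are hypothetical, the `ipv6_loadable` flag merely models whether ipv6.ko can be loaded, and treating -EAFNOSUPPORT as the "IPv6 unavailable" error is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Pretend to create a listener. IPv6 fails with -EAFNOSUPPORT when the
 * (modelled) ipv6 module cannot be loaded; any other family is rejected
 * with -EINVAL.
 */
static int sketch_try_listener(int family, bool ipv6_loadable)
{
    if (family == PF_INET)
        return 0;
    if (family == PF_INET6)
        return ipv6_loadable ? 0 : -EAFNOSUPPORT;
    return -EINVAL;
}

/*
 * Start the service listeners. PF_INET must succeed. PF_INET6 is
 * attempted, but its absence is not treated as a failure.
 */
int sketch_make_listeners(bool ipv6_loadable)
{
    int err;

    err = sketch_try_listener(PF_INET, ipv6_loadable);
    if (err < 0)
        return err;

    err = sketch_try_listener(PF_INET6, ipv6_loadable);
    if (err < 0 && err != -EAFNOSUPPORT)
        return err;

    return 0;
}
```

With this shape, a system that blacklists the IPv6 module still gets a working IPv4 listener instead of a failed service start.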

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • We're about to convert over to using separate PF_INET and PF_INET6
    listeners, instead of a single PF_INET6 listener that also receives
    AF_INET requests and maps them to AF_INET6.

    Clear the way by removing the logic in lockd and the NFSv4 callback
    server that creates an AF_INET6 service listener.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Since an RPC service listener's protocol family is specified now via
    svc_create_xprt(), it no longer needs to be passed to svc_create() or
    svc_create_pooled(). Remove that argument from the synopsis of those
    functions, and remove the sv_family field from the svc_serv struct.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The sv_family field is going away. Pass a protocol family argument to
    svc_create_xprt() instead of extracting the family from the passed-in
    svc_serv struct.

    Again, as this is a listener socket and not an address, we make this
    new argument an "int" protocol family, instead of an "sa_family_t."

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

19 Mar, 2009

1 commit


11 Mar, 2009

1 commit

  • The NFS mount command may pass an AF_INET server address to lockd. If
    lockd happens to be using a PF_INET6 listener, the nlm_cmp_addr() in
    nlmclnt_grant() will fail to match requests from that host because they
    will all have a mapped IPv4 AF_INET6 address.

    Adopt the same solution used in nfs_sockaddr_match_ipaddr() for NFSv4
    callbacks: if either address is AF_INET, map it to an AF_INET6 address
    before doing the comparison.
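
A user-space sketch of the comparison: any AF_INET address is first converted to its IPv4-mapped AF_INET6 form (::ffff:a.b.c.d), and the two 16-byte addresses are then compared. The function names are hypothetical, and the sketch assumes both inputs are either AF_INET or AF_INET6.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Write the IPv6 form of sa into out. AF_INET addresses become
 * IPv4-mapped IPv6 addresses; AF_INET6 addresses are copied unchanged.
 */
static void sketch_map_to_v6(const struct sockaddr *sa, struct in6_addr *out)
{
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;

        memset(out, 0, sizeof(*out));
        out->s6_addr[10] = 0xff;
        out->s6_addr[11] = 0xff;
        /* s_addr is already in network byte order. */
        memcpy(&out->s6_addr[12], &sin->sin_addr.s_addr, 4);
    } else {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;

        *out = sin6->sin6_addr;
    }
}

/* Return 1 if the two addresses name the same host, 0 otherwise. */
int sketch_cmp_addr(const struct sockaddr *a, const struct sockaddr *b)
{
    struct in6_addr a6, b6;

    sketch_map_to_v6(a, &a6);
    sketch_map_to_v6(b, &b6);
    return memcmp(&a6, &b6, sizeof(a6)) == 0;
}
```

Mapping both sides to one representation means the caller does not need to know which family the listener happened to use.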

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     

10 Feb, 2009

1 commit

  • If a client requests a blocking lock, is denied, then requests it again,
    then here in nlmsvc_lock() we will call vfs_lock_file() without FL_SLEEP
    set, because we've already queued a block and don't need the locks code
    to do it again.

    But that means vfs_lock_file() will return -EAGAIN instead of
    FILE_LOCK_DENIED. So we still need to translate that -EAGAIN return
    into a nlm_lck_blocked error in this case, and put ourselves back on
    lockd's block list.
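
The translation described above can be sketched as a small pure function. The enum values and names are hypothetical stand-ins for the NLM status codes, and the mapping of errors other than -EAGAIN is a simplification of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NLM reply statuses. */
enum sketch_nlm_status {
    SKETCH_GRANTED,
    SKETCH_DENIED,
    SKETCH_BLOCKED,
    SKETCH_FAILED
};

/*
 * Translate the result of a lock attempt into a reply status.
 * vfs_err:  0 on success or a negative errno from the lock call.
 * wait:     true if the client asked for a blocking lock.
 * requeue:  set to true when the request must go back on the block list.
 */
enum sketch_nlm_status sketch_lock_status(int vfs_err, bool wait, bool *requeue)
{
    *requeue = false;

    if (vfs_err == 0)
        return SKETCH_GRANTED;

    if (vfs_err == -EAGAIN) {
        /* Non-blocking request: a conflict is a plain denial. */
        if (!wait)
            return SKETCH_DENIED;
        /* Blocking request that was already queued: block again. */
        *requeue = true;
        return SKETCH_BLOCKED;
    }

    return SKETCH_FAILED;
}
```

The case that matters is the middle one: a blocking request that comes back with -EAGAIN is reported as blocked and queued again, not denied.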

    The bug was introduced by bde74e4bc64415b1 "locks: add special return
    value for asynchronous locks".

    Thanks to Frank van Maarseveen for the report; his original test
    case was essentially

    for i in `seq 30`; do flock /nfsmount/foo sleep 10 & done

    Tested-by: Frank van Maarseveen
    Reported-by: Frank van Maarseveen
    Cc: Miklos Szeredi
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

08 Jan, 2009

2 commits


07 Jan, 2009

6 commits

  • If the kernel is configured to support IPv6 and the RPC server can register
    services via rpcbindv4, we are all set to enable IPv6 support for lockd.

    Signed-off-by: Chuck Lever
    Cc: Aime Le Rouzic
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: one last thing... relocate nsm_create() to eliminate the forward
    declaration and group it near the only function that actually uses it.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up.

    Treat the nsm_use_hostnames global variable like nsm_local_state.
    Note that the default value of nsm_use_hostnames is still zero.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: nsm_addr_in() is no longer used, and nsm_addr() is used only in
    fs/lockd/mon.c, so move it there.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
    now. Remove it.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • NLM provides file locking services for NFS files. Part of this service
    includes a second protocol, known as NSM, which is a reboot
    notification service. NLM uses this service to determine when to
    reclaim locks or enter a grace period after a client or server reboots.

    The NLM service (implemented by lockd in the Linux kernel) contacts
    the local NSM service (implemented by rpc.statd in Linux user space)
    via NSM protocol upcalls to register a callback when a particular
    remote peer reboots.

    To match the callback to the correct remote peer, the NLM service
    constructs a cookie that it passes in the request. The NSM service
    passes that cookie back to the NLM service when it is notified that
    the given remote peer has indeed rebooted.

    Currently on Linux, the cookie is the raw 32-bit IPv4 address of the
    remote peer. To support IPv6 addresses, which are larger, we could
    use all 16 bytes of the cookie to represent a full IPv6 address,
    although we still can't represent an IPv6 address with a scope ID in
    just 16 bytes.

    Instead, to avoid the need for future changes to support additional
    address types, we'll use a manufactured value for the cookie, and use
    that to find the corresponding nsm_handle struct in the kernel during
    the NLMPROC_SM_NOTIFY callback.

    This should provide complete support in the kernel's NSM
    implementation for IPv6 hosts, while remaining backwards compatible
    with older rpc.statd implementations.

    Note we also deal with another case where nsm_use_hostnames can change
    while there are outstanding notifications, possibly resulting in the
    loss of reboot notifications. After this patch, the priv cookie is
    always used to look up rebooted hosts in the kernel.
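
The lookup side of this scheme can be sketched in user-space C: each handle carries an opaque 16-byte cookie, and the notification path finds the handle by comparing cookies instead of decoding an address. The names are hypothetical, and filling the cookie from a caller-supplied unique number is an assumption made only so the sketch has something to store; the message above does not say how the value is manufactured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PRIV_LEN 16 /* the cookie is 16 opaque bytes */

struct sketch_handle {
    const char *name;              /* peer name, for illustration */
    uint8_t priv[SKETCH_PRIV_LEN]; /* manufactured cookie */
};

/* Fill the cookie from a unique number (assumption of this sketch). */
void sketch_init_priv(struct sketch_handle *h, uint64_t unique)
{
    memset(h->priv, 0, sizeof(h->priv));
    memcpy(h->priv, &unique, sizeof(unique));
}

/*
 * Find the handle whose cookie matches the one echoed back in a
 * notification. Returns NULL if no handle matches.
 */
struct sketch_handle *sketch_lookup_by_priv(struct sketch_handle *tbl,
                                            size_t n, const uint8_t *priv)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(tbl[i].priv, priv, SKETCH_PRIV_LEN) == 0)
            return &tbl[i];
    return NULL;
}
```

Because the cookie is compared byte for byte and never interpreted, the same lookup works whatever address family the peer uses.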

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever