08 Jan, 2009

2 commits


07 Jan, 2009

38 commits

  • If the kernel is configured to support IPv6 and the RPC server can register
    services via rpcbindv4, we are all set to enable IPv6 support for lockd.

    Signed-off-by: Chuck Lever
    Cc: Aime Le Rouzic
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: one last thing... relocate nsm_create() to eliminate the forward
    declaration and group it near the only function that actually uses it.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up.

    Treat the nsm_use_hostnames global variable like nsm_local_state.
    Note that the default value of nsm_use_hostnames is still zero.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: nsm_addr_in() is no longer used, and nsm_addr() is used only in
    fs/lockd/mon.c, so move it there.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
    now. Remove it.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • NLM provides file locking services for NFS files. Part of this service
    includes a second protocol, known as NSM, which is a reboot
    notification service. NLM uses this service to determine when to
    reclaim locks or enter a grace period after a client or server reboots.

    The NLM service (implemented by lockd in the Linux kernel) contacts
    the local NSM service (implemented by rpc.statd in Linux user space)
    via NSM protocol upcalls to register a callback when a particular
    remote peer reboots.

    To match the callback to the correct remote peer, the NLM service
    constructs a cookie that it passes in the request. The NSM service
    passes that cookie back to the NLM service when it is notified that
    the given remote peer has indeed rebooted.

    Currently on Linux, the cookie is the raw 32-bit IPv4 address of the
    remote peer. To support IPv6 addresses, which are larger, we could
    use all 16 bytes of the cookie to represent a full IPv6 address,
    although we still can't represent an IPv6 address with a scope ID in
    just 16 bytes.

    Instead, to avoid the need for future changes to support additional
    address types, we'll use a manufactured value for the cookie, and use
    that to find the corresponding nsm_handle struct in the kernel during
    the NLMPROC_SM_NOTIFY callback.

    This should provide complete support in the kernel's NSM
    implementation for IPv6 hosts, while remaining backwards compatible
    with older rpc.statd implementations.

    Note we also deal with another case where nsm_use_hostnames can change
    while there are outstanding notifications, possibly resulting in the
    loss of reboot notifications. After this patch, the priv cookie is
    always used to look up rebooted hosts in the kernel.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
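The cookie scheme described in the commit above can be sketched in user-space C. This is only an illustration: `struct handle`, `init_private()`, and `lookup_by_priv()` are hypothetical names, and filling the cookie from a caller-supplied unique number is an assumption, not the kernel's actual construction. The only detail taken from the commit text is that the cookie is a 16-byte opaque that the NSM service echoes back unchanged.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PRIV_LEN 16   /* the "priv" cookie is a 16-byte opaque */

struct handle {       /* hypothetical stand-in for nsm_handle */
    unsigned char priv[PRIV_LEN];
    const char *name;
};

/* Manufacture a cookie from a caller-supplied unique number (assumption),
 * zero-padded to the full 16 bytes. */
static void init_private(struct handle *h, uint64_t unique)
{
    memset(h->priv, 0, PRIV_LEN);
    memcpy(h->priv, &unique, sizeof(unique));
}

/* Match the cookie echoed back in a reboot notification to a handle. */
static struct handle *lookup_by_priv(struct handle *tab, size_t n,
                                     const unsigned char *priv)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(tab[i].priv, priv, PRIV_LEN) == 0)
            return &tab[i];
    return NULL;
}
```

Because the lookup compares opaque bytes only, it does not care which address family the peer uses.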
     
  • Clean up: refactor nsm_get_handle() so it is organized the same way that
    nsm_reboot_lookup() is.

    There is an additional micro-optimization here. This change moves the
    "hostname & nsm_use_hostnames" test out of the list_for_each_entry()
    clause in nsm_get_handle(), since it is loop-invariant.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
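The loop-invariant hoisting mentioned in the commit above might look roughly like the following. The names `struct entry`, `use_hostnames`, and `find_entry()` are hypothetical stand-ins, and a plain array replaces the kernel's linked list; the point is only that the test is evaluated once, before the loop.

```c
#include <stddef.h>
#include <string.h>

struct entry {                /* hypothetical stand-in for nsm_handle */
    const char *name;
    unsigned int addr;
};

static int use_hostnames;     /* stand-in for nsm_use_hostnames */

static struct entry *find_entry(struct entry *tab, size_t n,
                                const char *hostname, unsigned int addr)
{
    /* Evaluated once: neither operand changes while the table is walked. */
    int by_name = hostname != NULL && use_hostnames;

    for (size_t i = 0; i < n; i++) {
        if (by_name) {
            if (strcmp(tab[i].name, hostname) == 0)
                return &tab[i];
        } else if (tab[i].addr == addr) {
            return &tab[i];
        }
    }
    return NULL;
}
```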
     
  • Clean up. Refactor the creation of nsm_handles into a helper. Fields
    are initialized in increasing address order to make efficient use of
    CPU caches.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: nsm_find() now has only one caller, and that caller
    unconditionally sets the @create argument. Thus the @create
    argument is no longer needed.

    Since nsm_find() now has a more specific purpose, pick a more
    appropriate name for it.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Invoke the newly introduced nsm_reboot_lookup() function in
    nlm_host_rebooted() instead of nsm_find().

    This introduces just one behavioral change: debugging messages
    produced during reboot notification will now appear when the
    NLMDBG_MONITOR flag is set, but not when the NLMDBG_HOSTCACHE flag
    is set.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Introduce a new API to fs/lockd/mon.c that allows nlm_host_rebooted()
    to look up nsm_handles via the contents of an nlm_reboot struct.

    The new function is equivalent to calling nsm_find() with @create set
    to zero, but it takes a struct nlm_reboot instead of separate
    arguments.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The NLM XDR decoders for the NLMPROC_SM_NOTIFY procedure should treat
    their "priv" argument truly as an opaque, as defined by the protocol,
    and let the upper layers figure out what is in it.

    This will make it easier to modify the contents and interpretation of
    the "priv" argument, and keep knowledge about what's in "priv" local
    to fs/lockd/mon.c.

    For now, the NLM and NSM implementations should behave exactly as they
    did before.

    The formation of the address of the rebooted host in
    nlm_host_rebooted() may look a little strange, but it is the inverse
    of how nsm_init_private() forms the private cookie. Plus, it's
    going away soon anyway.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Pass the nlm_reboot data structure directly from the NLMPROC_SM_NOTIFY
    XDR decoders to nlm_host_rebooted(). This eliminates some packing and
    unpacking of the NLMPROC_SM_NOTIFY results, and prepares for passing
    these results, including the "priv" cookie, directly to a lookup
    routine in fs/lockd/mon.c.

    This patch changes code organization but should not cause any
    behavioral change.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Pass the new "priv" cookie to NSMPROC_MON's XDR encoder, instead of
    creating the "priv" argument in the encoder at call time.

    This patch should not cause a behavioral change: the contents of the
    cookie remain the same for the time being.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Introduce a new data type, used by both the in-kernel NLM and NSM
    implementations, that is used to manage the opaque "priv" argument
    for the NSMPROC_MON and NLMPROC_SM_NOTIFY calls.

    Construct the "priv" cookie when the nsm_handle is created.

    The nsm_init_private() function may look a little strange, but it is
    roughly equivalent to how the XDR encoder formed the "priv" argument.
    It's going to go away soon.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_release() function should never be called with a NULL handle
    pointer. If it is, that's a bug.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_find() function should never be called with a NULL IP address
    pointer. If it is, that's a bug.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Introduce some dprintk() calls in fs/lockd/mon.c that are enabled by
    the NLMDBG_MONITOR flag. These report when we find, create, and
    release nsm_handles.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_find() function sets up fresh nsm_handle entries. This is
    where we will store the "priv" cookie used to look up nsm_handles during
    reboot recovery. The cookie will be constructed when nsm_find()
    creates a new nsm_handle.

    As much as possible, I would like to keep everything that handles a
    "priv" cookie in fs/lockd/mon.c so that all the smarts are in one
    source file. That organization should make it pretty simple to see how
    all this works.

    To me, it makes more sense than the current arrangement to keep
    nsm_find() with nsm_monitor() and nsm_unmonitor().

    So, start reorganizing by moving nsm_find() into fs/lockd/mon.c. The
    nsm_release() function comes along too, since it shares the nsm_lock
    global variable.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Introduce xdr_stream-based XDR encoder and decoder functions, which are
    more careful about preventing RPC buffer overflows.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
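The general idea behind a bounds-checked stream encoder, as referred to in the commit above, is to reserve space before writing and to fail cleanly when the buffer is too small. The sketch below is a simplified user-space illustration; `struct stream`, `reserve()`, and `encode_u32()` are hypothetical and are not the kernel's xdr_stream API.

```c
#include <stddef.h>
#include <stdint.h>

struct stream {               /* hypothetical, simplified encode stream */
    unsigned char *p;         /* next byte to write */
    unsigned char *end;       /* one past the end of the buffer */
};

/* Return a pointer to nbytes of writable space, or NULL if it won't fit. */
static unsigned char *reserve(struct stream *s, size_t nbytes)
{
    unsigned char *q = s->p;

    if ((size_t)(s->end - s->p) < nbytes)
        return NULL;
    s->p += nbytes;
    return q;
}

/* Encode a 32-bit value in big-endian (XDR) byte order. */
static int encode_u32(struct stream *s, uint32_t v)
{
    unsigned char *q = reserve(s, 4);

    if (q == NULL)
        return -1;            /* would overflow: nothing is written */
    q[0] = (unsigned char)(v >> 24);
    q[1] = (unsigned char)(v >> 16);
    q[2] = (unsigned char)(v >> 8);
    q[3] = (unsigned char)v;
    return 0;
}
```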
     
  • Clean up: Move the RPC program and procedure numbers for NSM into the
    one source file that needs them: fs/lockd/mon.c.

    And, as with NLM, NFS, and rpcbind calls, use NSMPROC_FOO instead of
    SM_FOO for NSM procedure numbers.

    Finally, make a couple of comments more precise: what is referred to
    here as SM_NOTIFY is really the NLM (lockd) NLMPROC_SM_NOTIFY downcall,
    not NSMPROC_NOTIFY.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: NSM's XDR data structures are used only in fs/lockd/mon.c,
    so move them there.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Make sure any error returned by rpc.statd during an SM_UNMON call is
    reported rather than ignored completely. There isn't much to do with
    such an error, but we should log it in any case.

    Similar to a recent change to nsm_monitor().

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up.

    Make the nlm_host argument "const," and move the public declaration to
    lockd.h. Add a documenting comment.

    Bruce observed that nsm_unmonitor()'s only caller doesn't care about
    its return code, so make nsm_unmonitor() return void.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_handle's reference count is bumped in nlm_lookup_host(). It
    should be decremented in nlm_destroy_host() to make it easier to see
    the balance of these two operations.

    Move the nsm_release() call to fs/lockd/host.c.

    The h_nsmhandle pointer is set in nlm_lookup_host(), and never cleared.
    The nlm_destroy_host() function is never called for the same nlm_host
    twice, so h_nsmhandle won't ever be NULL when nsm_unmonitor() is
    called.

    All references to the nlm_host are gone before it is freed. We can
    skip making h_nsmhandle NULL just before the nlm_host is deallocated.

    It's also likely we can remove the h_nsmhandle NULL check in
    nlmsvc_is_client(), but we can do that later when rearchitecting
    the nlm_host cache.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up.

    Make the nlm_host argument "const," and move the public declaration to
    lockd.h with other NSM public function (e.g. nsm_release) and global
    variable declarations.

    Add a documenting comment.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_monitor() function reports an error and does not set sm_monitored
    if the SM_MON upcall reply has a non-zero result code, but nsm_monitor()
    does not return an error to its caller in this case.

    Since sm_monitored is not set, the upcall is retried when the next NLM
    request invokes nsm_monitor(). However, that may not come for a while.
    In the meantime, at least one NLM request will potentially proceed
    without the peer being monitored properly.

    Have nsm_monitor() return an error if the result code is non-zero.
    This will cause all NLM requests to fail immediately if the upcall
    completed successfully but rpc.statd returned an error.

    This may be inconvenient in some cases (for example if rpc.statd
    cannot complete a proper DNS reverse lookup of the hostname), but will
    make the reboot monitoring service more robust by forcing such issues
    to be corrected by an admin.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
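The behavior change in the commit above separates two failure modes: the upcall itself failing, and the upcall succeeding while rpc.statd reports a non-zero result. A minimal sketch follows, assuming hypothetical names (`struct peer`, `struct reply`, `monitor_peer()`) and assuming -EIO as the error returned for a non-zero result code.

```c
#include <errno.h>

struct reply { int status; };    /* hypothetical: result code from rpc.statd */
struct peer  { int monitored; }; /* hypothetical stand-in for nsm_handle */

/* upcall_status: 0 if the RPC completed, a negative errno otherwise. */
static int monitor_peer(struct peer *p, int upcall_status,
                        const struct reply *res)
{
    if (p->monitored)
        return 0;                /* already monitored, nothing to do */
    if (upcall_status < 0)
        return upcall_status;    /* the upcall itself failed */
    if (res->status != 0)
        return -EIO;             /* upcall completed, statd reported an error */
    p->monitored = 1;
    return 0;
}
```

Since the monitored flag stays clear on failure, the next request retries the upcall, but the current request now fails instead of proceeding unmonitored.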
     
  • Clean up: Remove the BUG_ON() invocation in nsm_monitor(). It's not
    likely that nsm_monitor() is ever called with a NULL host pointer, and
    the code will die anyway if host is NULL.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The nsm_monitor() function already generates a printk(KERN_NOTICE) if
    the SM_MON upcall fails, so the similar printk() in the nlmclnt_lock()
    function is redundant.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: Use the sm_name field for reporting the hostname in nsm_monitor()
    and nsm_unmonitor(), just as the other functions in fs/lockd/mon.c do.

    The h_name field is just a copy of the sm_name pointer.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The "mon_name" argument of the NSMPROC_MON and NSMPROC_UNMON upcalls
    is a string that contains the hostname or IP address of the remote peer
    to be notified when this host has rebooted. The sm-notify command uses
    this identifier to contact the peer when we reboot, so it must be
    either a well-qualified DNS hostname or a presentation format IP
    address string.

    When the "nsm_use_hostnames" sysctl is set to zero, the kernel's NSM
    provides a presentation format IP address in the "mon_name" argument.
    Otherwise, the "caller_name" argument from NLM requests is used,
    which is usually just the DNS hostname of the peer.

    To support IPv6 addresses for the mon_name argument, we use the
    nsm_handle's address eye-catcher, which already contains an appropriate
    presentation format address string. Using the eye-catcher string
    obviates the need to use a large buffer on the stack to form the
    presentation address string for the upcall.

    This patch also addresses a subtle bug.

    An NSMPROC_MON request and the subsequent NSMPROC_UNMON request for the
    same peer are required to use the same value for the "mon_name"
    argument. Otherwise, rpc.statd's NSMPROC_UNMON processing cannot
    locate the database entry for that peer and remove it.

    If the setting of nsm_use_hostnames is changed between the time the
    kernel sends an NSMPROC_MON request and the time it sends the
    NSMPROC_UNMON request for the same peer, the "mon_name" argument for
    these two requests may not be the same. This is because the value of
    "mon_name" is currently chosen at the moment the call is made based on
    the setting of nsm_use_hostnames.

    To ensure both requests pass identical contents in the "mon_name"
    argument, we now select which string to use for the argument in the
    nsm_monitor() function. A pointer to this string is saved in the
    nsm_handle so it can be used for a subsequent NSMPROC_UNMON upcall.

    NB: There are other potential problems, such as how nlm_host_rebooted()
    might behave if nsm_use_hostnames were changed while hosts are still
    being monitored. This patch does not attempt to address those
    problems.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
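The fix for the mon_name mismatch described in the commit above amounts to choosing the string once and remembering the choice. A sketch under stated assumptions: `struct handle`, its fields, `monitor_name()`, and `unmonitor_name()` are hypothetical simplifications, not the kernel's definitions.

```c
#include <stddef.h>

struct handle {                 /* hypothetical stand-in for nsm_handle */
    const char *name;           /* caller_name from the NLM request */
    const char *addrbuf;        /* presentation-format address string */
    const char *mon_name;       /* chosen at monitor time, reused later */
};

static int use_hostnames;       /* stand-in for the nsm_use_hostnames sysctl */

/* Pick the mon_name when monitoring starts and save the choice. */
static const char *monitor_name(struct handle *h)
{
    h->mon_name = use_hostnames ? h->name : h->addrbuf;
    return h->mon_name;
}

/* The unmonitor path reuses the saved pointer and ignores the sysctl. */
static const char *unmonitor_name(const struct handle *h)
{
    return h->mon_name;
}
```

Flipping `use_hostnames` between the two calls no longer changes what the unmonitor request sends.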
     
  • Clean up: make the printk(KERN_DEBUG) in nsm_mon_unmon() a dprintk,
    and add another dprintk to note if creating an RPC client for the
    upcall failed.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: Use a C99 structure initializer instead of open-coding the
    initialization of nsm_args.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Clean up: introduce a helper function to generate IPv4 addresses using
    the same style as the IPv6 helper function we just added.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Scope ID support is needed since the kernel's NSM implementation is
    about to use these displayed addresses as a mon_name in some cases.

    When nsm_use_hostnames is zero, without scope ID support NSM will fail
    to handle peers that contact us via a link-local address. Link-local
    addresses do not work without an interface ID, which is stored in the
    sockaddr's sin6_scope_id field.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
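To illustrate why the scope ID matters for the commit above, the following user-space sketch formats an IPv6 socket address and appends a numeric scope ID for link-local addresses. `display_ipv6()` is a hypothetical helper written for this example, and the "%<number>" suffix format is an assumption; it relies only on the standard `inet_ntop()` call and the `IN6_IS_ADDR_LINKLOCAL()` macro.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Format sin6 into buf; link-local addresses get a "%<scope id>" suffix.
 * Returns 0 on success, -1 on failure or truncation. */
static int display_ipv6(const struct sockaddr_in6 *sin6, char *buf, size_t len)
{
    char addr[INET6_ADDRSTRLEN];
    int n;

    if (inet_ntop(AF_INET6, &sin6->sin6_addr, addr, sizeof(addr)) == NULL)
        return -1;
    if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
        n = snprintf(buf, len, "%s%%%u", addr,
                     (unsigned int)sin6->sin6_scope_id);
    else
        n = snprintf(buf, len, "%s", addr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Without the suffix, a link-local string such as fe80::1 does not say which interface to use when contacting the peer.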
     
  • AF_UNSPEC support is no longer needed in nlm_display_address() now
    that a presentation address is no longer generated for the h_srcaddr
    field.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The h_name field in struct nlm_host is just a copy of
    h_nsmhandle->sm_name. Likewise, the contents of the h_addrbuf field
    should be identical to the sm_addrbuf field.

    The h_srcaddrbuf field is used only in one place for debugging. We can
    live without this until we get %pI formatting for printk().

    Currently these buffers are 48 bytes, but we need to support scope IDs
    in IPv6 presentation addresses, which means making the buffers even
    larger. Instead, let's find ways to eliminate them to save space.

    Signed-off-by: Chuck Lever
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • The default method for calculating the number of connections allowed
    per RPC service arbitrarily limits single-threaded services to 80
    connections. This is too low for services like lockd and artificially
    limits the number of TCP clients that it can support.

    Have lockd set a default sv_maxconn value of 1024 (which is the typical
    default value for RLIMIT_NOFILE). Also add a module parameter to allow an
    admin to set this to an arbitrary value.

    Signed-off-by: Jeff Layton
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Jeff Layton
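The connection-limit arithmetic behind the last commit above can be sketched as follows. The formula (threads + 3) * 20 is an assumption chosen because it reproduces the figure of 80 quoted for a single-threaded service; `conn_limit()` is a hypothetical helper, and a `maxconn` of zero is assumed to mean "use the default calculation".

```c
/* Return the connection limit for a service: an explicit maxconn if one
 * is set, otherwise a default derived from the thread count (assumed). */
static unsigned int conn_limit(unsigned int maxconn, unsigned int nrthreads)
{
    return maxconn ? maxconn : (nrthreads + 3) * 20;
}
```

With one thread and no explicit setting the limit is 80; with sv_maxconn set to 1024, the thread count no longer matters.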