13 Sep, 2005

4 commits

  • We could try to unlock the state lock here without having first locked it.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • In the case of a lock which introduces a new lockowner, the openowner's
    sequence id should be incremented, even when the operation fails, if the
    error is a sequence-id-mutating error. The current code fails to do that
    in some cases. Fix this by using the same sequence-id-incrementing
    mechanism that all other such operations use.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • It seems more natural to move the setting of the replay_owner into the
    relevant procedure instead of doing it in nfsv4_proc_compound.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • Demote some printk's that look like they could be triggered by non-buggy
    clients to dprintk's. (For example, stale clientid's are normal
    occurrences on reboot, and on a server with a lot of clients these messages
    could become annoying.)

    Also remove some redundant dprintk's (e.g. no need for both STALE_CLIENTID
    and its callers to do dprintks).

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     

08 Sep, 2005

1 commit

  • When registering an RPC cache, cache_register() always sets the owner as the
    sunrpc module. However, there are RPC caches owned by other modules. With
    the incorrect owner setting, the real owning module can be removed potentially
    with an open reference to the cache from userspace.

    For example, if one were to stop the nfs server and unmount the nfsd
    filesystem, the nfsd module could be removed eventhough rpc.idmapd had
    references to the idtoname and nametoid caches (i.e.
    /proc/net/rpc/nfs4./channel is still open). This resulted in a
    system panic on one of our machines when attempting to restart the nfs
    services after reloading the nfsd module.

    The following patch adds a 'struct module *owner' field in struct
    cache_detail. The owner is further assigned to the struct proc_dir_entry
    in cache_register() so that the module cannot be unloaded while user-space
    daemons have an open reference on the associated file under /proc.

    Signed-off-by: Bruce Allan
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bruce Allan
     

02 Sep, 2005

2 commits

  • Since the patch to add a NULL short-circuit to crypto_free_tfm() went in,
    there's no longer any need for callers of that function to check for NULL.
    This patch removes the redundant NULL checks and also a few similar checks
    for NULL before calls to kfree() that I ran into while doing the
    crypto_free_tfm bits.

    I've succesfuly compile tested this patch, and a kernel with the patch
    applied boots and runs just fine.

    When I posted the patch to LKML (and other lists/people on Cc) it drew the
    following comments :

    J. Bruce Fields commented
    "I've no problem with the auth_gss or nfsv4 bits.--b."

    Sridhar Samudrala said
    "sctp change looks fine."

    Herbert Xu signed off on the patch.

    So, I guess this is ready to be dropped into -mm and eventually mainline.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Jesper Juhl
     
  • This patch goes through the current users of the crypto layer and sets
    CRYPTO_TFM_REQ_MAY_SLEEP at crypto_alloc_tfm() where all crypto operations
    are performed in process context.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

18 Aug, 2005

1 commit

  • The nfsd holds the big kernel lock upon exit, when it really shouldn't.
    Not to mention that this breaks Ingo's RT patch. This is a trivial fix
    to release the lock.

    Ingo, this patch also works with your kernel, and stops the problem with
    nfsd.

    Note, there's a "goto out;" where "out:" is right above svc_exit_thread.
    The point of the goto also holds the kernel_lock, so I don't see any
    problem here in releasing it.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     

13 Jul, 2005

1 commit

  • inotify is intended to correct the deficiencies of dnotify, particularly
    its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
    that you intend to watch. This quickly results in too many
    open files and pins removable media, preventing unmount.
    * dnotify is directory-based. You only learn about changes to
    directories. Sure, a change to a file in a directory affects
    the directory, but you are then forced to keep a cache of
    stat structures.
    * dnotify's interface to user-space is awful. Signals?

    inotify provides a more usable, simple, powerful solution to file change
    notification:

    * inotify's interface is a system call that returns a fd, not SIGIO.
    You get a single fd, which is select()-able.
    * inotify has an event that says "the filesystem that the item
    you were watching is on was unmounted."
    * inotify can watch directories or files.

    Inotify is currently used by Beagle (a desktop search infrastructure),
    Gamin (a FAM replacement), and other projects.

    See Documentation/filesystems/inotify.txt.

    Signed-off-by: Robert Love
    Cc: John McCutchan
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Love
     

08 Jul, 2005

19 commits

  • After discussion at the recent NFSv4 bake-a-thon, I realized that my
    assumption that NFS4_FH_PERSISTENT required filehandles to persist was a
    misreading of the spec. This also fixes an interoperability problem with the
    Solaris client.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • We shouldn't be allowing, e.g., write locks on files not open for read. To
    enforce this, we add a pointer from the lock stateid back to the open stateid
    it came from, so that the check will continue to be correct even after the
    open is upgraded or downgraded.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • As long as we're here, do some miscellaneous cleanup.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The handling of close_lru in preprocess_stateid_op was a source of some
    confusion here recently. Try to make the logic a little clearer, by renaming
    find_openstateowner_id to make its purpose clearer and untangling some
    unnecessarily complicated goto's.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • nfs4_preprocess_seqid_op is called by NFSv4 operations that imply an implicit
    renewal of the client lease.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • from RFC 3530:
    "Share reservations are established by OPEN operations and by their
    nature are mandatory in that when the OPEN denies READ or WRITE
    operations, that denial results in such operations being rejected
    with error NFS4ERR_LOCKED."

    (Note that share_denied is really only a legal error for OPEN.)

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • An OPEN from the same client/open stateowner requires a stateid update because
    of the share/deny access update.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • We're insisting that the lock sequence id field passed in the
    open_to_lockowner struct always be zero. This is probably thanks to the
    sentence in rfc3530: "The first request issued for any given lock_owner is
    issued with a sequence number of zero."

    But there doesn't seem to be any problem with allowing initial sequence
    numbers other than zero. And currently this is causing lock reclaims from the
    Linux client to fail.

    In the spirit of "be liberal in what you accept, conservative in what you
    send", we'll relax the check (and patch the Linux client as well).

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Add some comments on the use of so_seqid, in an attempt to avoid some of the
    confusion outlined in the previous patch....

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The sequence number we store in the sequence id is the last one we received
    from the client. So on the next operation we'll check that the client gives
    us the next higher number.

    We increment sequence id's at the last moment, in encode, so that we're sure
    of knowing the right error return. (The decision to increment the sequence id
    depends on the exact error returned.)

    However on the *first* use of a sequence number, if we set the sequence number
    to the one received from the client and then let the increment happen on
    encode, we'll be left with a sequence number one to high.

    For that reason, ENCODE_SEQID_OP_TAIL only increments the sequence id on
    *confirmed* stateowners.

    This creates a problem for open reclaims, which are confirmed on first use.
    Therefore the open reclaim code, as a special exception, *decrements* the
    sequence id, cancelling out the undesired increment on encode. But this
    prevents the sequence id from ever being incremented in the case where
    multiple reclaims are sent with the same openowner. Yuch!

    We could add another exception to the open reclaim code, decrementing the
    sequence id only if this is the first use of the open owner.

    But it's simpler by far to modify the meaning of the op_seqid field: instead
    of representing the previous value sent by the client, we take op_seqid, after
    encoding, to represent the *next* sequence id that we expect from the client.
    This eliminates the need for special-case handling of the first use of a
    stateowner.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Yeah, it's trivial, but this drives me up the wall....

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • A misreading of the spec lead us to convert all errors on open and lock
    reclaims to RECLAIM_BAD. This causes problems--for example, a reboot within
    the grace period could lead to reclaims with stale stateid's, and we'd like to
    return STALE errors in those cases.

    What rfc3530 actually says about RECLAIM_BAD: "The reclaim provided by the
    client does not match any of the server's state consistency checks and is
    bad." I'm assuming that "state consistency checks" refers to checks for
    consistency with the state recorded to stable storage, and that the error
    should be reserved for that case.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • A GRACE or NOGRACE response to a lock request should also bump the sequence
    id. So we delay the handling of grace period errors till after we've found
    the relevant owner.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The GRACE and NOGRACE errors should bump the sequence id on open. So we delay
    the handling of these errors until nfsd4_process_open2, at which point we've
    set the open owner, so the encode routine will be able to bump the sequence
    id.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • We oops in list_for_each_entry(), because release_stateowner frees something
    on the list we're traversing.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Make sure we don't try to delete client recovery directories multiple times;
    fixes some spurious error messages.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Oops, this lookup_one_len needs the i_sem.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • We need to fsync the recovery directory after writing to it, but we weren't
    doing this correctly. (For example, we weren't taking the i_sem when calling
    ->fsync().)

    Just reuse the existing nfsd fsync code instead.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • We need to remove the recovery directory here too. (This chunk just got lost
    somehow in the process of commuting the reboot recovery patches past the other
    patches.)

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

29 Jun, 2005

1 commit


24 Jun, 2005

11 commits

  • Set the recovery directory via /proc/fs/nfsd/nfs4recoverydir.

    It may be changed any time, but is used only on startup.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • This patch adds the code to create and remove client subdirectories from the
    recovery directory, as described in the previous patch comment.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • NFSv4 clients are required to know what state they have on the server so that
    they can reclaim it on server reboot. However, it is possible for
    pathalogical combinations of server reboots and network partitions to leave a
    client in a state where it cannot know whether it has lost its state on the
    server.

    For this reason, rfc3530 requires that we store some information about clients
    to stable storage.

    So we maintain a directory /var/lib/nfs/v4recovery with a subdirectory for
    each client with active state. We leave open the possibility of including
    files underneath each such subdirectory with information about the client, but
    for now the subdirectories are empty.

    We create a client subdirectory whenever a client makes its first non-reclaim
    open_confirm.

    We remove a client subdirectory whenever either
    a) its lease expires, or
    b) the grace period ends without it reclaiming anything.
    When handling reclaims, we allow the reclaim if and only if the client doing
    the reclaim has a subdirectory.

    This patch adds just the code to scan the recovery directory on nfsd startup.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • The cb_parsed field is only used by probe_callback, to determine whether the
    callback information has been filled in by setclientid. But there is no way
    that probe_callback() can be called without that having already happened, so
    that check is superfluous, as is cb_parsed.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • >From the language of rfc3530 section 8.1.3 (e.g., the suggestion that a
    "process id" might be a reasonable lockowner value) it's conceivable that a
    client might want to use the same lockowner string on multiple files, so we may
    as well allow that. We expect each use of open_to_lockowner to create a
    distinct seqid stream, though.

    For now we're also allowing multiple uses of open_to_lockowner with the same
    open, though it seems unlikely clients would actually do that.

    Also add a comment reminding myself of some very non-scalable data structures.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Trivial renaming patch:

    I can never remember, while looking at various lists relating the nfsd4 state
    structures, which are the "heads" and which are items on other lists, or which
    structures are actually on the various lists. The following convention helps
    me: given structures foo and bar, with foo containing the head of a list of
    bars, use "bars" for the name of the head of the list contained in the struct
    foo, and use "per_foo" for the entries in the struct bars.

    Already done for struct nfs4_file; go ahead and do it for the other nfsd4
    state structures.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Minor cleanup, remove some unnecessary printk's.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Trivial whitespace and comment fixes.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Change from "goto" to "else if" format in setclientid_confirm.

    From: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • NFS4_INVAL is not a valid error for setclientid_confirm, and INUSE is the more
    logical error here anyway.

    From: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     
  • Setclientid_confirm code confused states 1 and 3 (numbering from the
    IMPLEMENTATION section of rfc3530, section 14.2.33). Fix this.

    State 1 allows the client to change the callback channel on the fly. We don't
    implement this currently, so just turn off the callback channel in this case.

    From: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown