23 Jul, 2007

1 commit


20 Jul, 2007

13 commits


19 Jul, 2007

2 commits

  • Since posix_test_lock(), like fcntl() and ->lock(), indicates the absence
    or presence of a conflicting lock by setting fl_type to, respectively,
    F_UNLCK or something other than F_UNLCK, the return value is no longer
    needed.

    Signed-off-by: "J. Bruce Fields"

    J. Bruce Fields
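
The same convention is visible from user space through fcntl(F_GETLK), which the message cites as precedent. The following is a minimal sketch, not kernel code; the helper name probe_lock() is hypothetical and not part of the patch.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): ask whether a lock of the
 * given type on the whole file would conflict with an existing lock.
 * Returns 1 if a conflicting lock exists, 0 if not, -1 on fcntl() error.
 */
int probe_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;        /* F_RDLCK or F_WRLCK */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "to end of file" */

    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;

    /* No conflict is reported by l_type being rewritten to F_UNLCK. */
    return fl.l_type != F_UNLCK;
}
```

The answer is carried entirely by the lock type that fcntl() writes back, which is why a separate "conflict found" return value is redundant.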
     
  • As Peter Staubach says elsewhere
    (http://marc.info/?l=linux-kernel&m=118113649526444&w=2):

    > The problem is that some file system such as NFSv2 and NFSv3 do
    > not have sufficient support to be able to support leases correctly.
    > In particular for these two file systems, there is no over the wire
    > protocol support.
    >
    > Currently, these two file systems fail the fcntl(F_SETLEASE) call
    > accidentally, due to a reference counting difference. These file
    > systems should fail more consciously, with a proper error to
    > indicate that the call is invalid for them.

    Define an nfs setlease method that just returns -EINVAL.

    If someone can demonstrate a real need, perhaps we could reenable
    them in the presence of the "nolock" mount option.

    Signed-off-by: "J. Bruce Fields"
    Cc: Peter Staubach
    Cc: Trond Myklebust

    J. Bruce Fields
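
The idea of failing "more consciously" can be pictured as a per-filesystem method that refuses the lease outright. This is a toy user-space model under stated assumptions: every type and function name ending in _model is hypothetical, and the dispatch rule (use the filesystem's method if it has one, otherwise a generic one) is an assumption about how such a hook would be wired up.

```c
#include <errno.h>
#include <stddef.h>

struct file_model;

/* Hypothetical per-filesystem operations table with an optional hook. */
struct fops_model {
    int (*setlease)(struct file_model *f, long arg);
};

struct file_model {
    const struct fops_model *ops;
    long lease;              /* 0 means no lease recorded */
};

/* Generic path: simply record the requested lease. */
static int generic_setlease_model(struct file_model *f, long arg)
{
    f->lease = arg;
    return 0;
}

/* NFS-like path: refuse deliberately, with a proper error. */
static int nfs_setlease_model(struct file_model *f, long arg)
{
    (void)f;
    (void)arg;
    return -EINVAL;
}

const struct fops_model nfs_fops_model = { nfs_setlease_model };

/* Assumed dispatch: prefer the filesystem's own method when present. */
int vfs_setlease_model(struct file_model *f, long arg)
{
    if (f->ops != NULL && f->ops->setlease != NULL)
        return f->ops->setlease(f, arg);
    return generic_setlease_model(f, arg);
}
```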
     

18 Jul, 2007

2 commits

  • Currently, the freezer treats all tasks as freezable, except for the kernel
    threads that explicitly set the PF_NOFREEZE flag for themselves. This
    approach is problematic, since it requires every kernel thread to either
    set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
    care for the freezing of tasks at all.

    It seems better to only require the kernel threads that want to or need to
    be frozen to use some freezer-related code and to remove any
    freezer-related code from the other (nonfreezable) kernel threads, which is
    done in this patch.

    The patch causes all kernel threads to be nonfreezable by default (i.e. to
    have PF_NOFREEZE set by default) and introduces the set_freezable()
    function that should be called by the freezable kernel threads in order to
    unset PF_NOFREEZE. It also makes all of the currently freezable kernel
    threads call set_freezable(), so it shouldn't cause any (intentional)
    change of behaviour to appear. Additionally, it updates documentation to
    describe the freezing of tasks more accurately.

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: Oleg Nesterov
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
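
The change of default can be illustrated with a small user-space model. This is only a sketch: the struct, the flag value, and all names ending in _model are hypothetical stand-ins, and the real set_freezable() operates on the current task rather than on an argument.

```c
/* Hypothetical flag value; the real PF_NOFREEZE lives in the kernel. */
#define PF_NOFREEZE_MODEL 0x1u

struct task_model {
    unsigned int flags;
    int is_kthread;
};

/* After the patch, kernel threads start out nonfreezable. */
void init_task_model(struct task_model *t, int is_kthread)
{
    t->is_kthread = is_kthread;
    t->flags = is_kthread ? PF_NOFREEZE_MODEL : 0u;
}

/* A kernel thread that wants to be frozen opts in by clearing the flag. */
void set_freezable_model(struct task_model *t)
{
    t->flags &= ~PF_NOFREEZE_MODEL;
}

/* The freezer only considers tasks without the flag. */
int is_freezable_model(const struct task_model *t)
{
    return (t->flags & PF_NOFREEZE_MODEL) == 0u;
}
```

In this model a user task is freezable from the start, while a kernel thread becomes freezable only after calling set_freezable_model(), mirroring the opt-in behaviour the message describes.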
     
  • I can never remember what the function to register to receive VM pressure
    is called. I have to trace down from __alloc_pages() to find it.

    It's called "set_shrinker()", and it needs Your Help.

    1) Don't hide struct shrinker. It contains no magic.
    2) Don't allocate "struct shrinker". It's not helpful.
    3) Call them "register_shrinker" and "unregister_shrinker".
    4) Call the function "shrink" not "shrinker".
    5) Reduce the 17 lines of waffly comments to 13, but document it properly.

    Signed-off-by: Rusty Russell
    Cc: David Chinner
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
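
Points 1 to 4 above amount to a caller-owned structure with a "shrink" callback and a register/unregister pair. The sketch below models that shape in user space; it is not the kernel API. All names ending in _model, the callback signature, and the demo cache are hypothetical.

```c
#include <stddef.h>

/* Visible, caller-allocated structure (points 1 and 2). */
struct shrinker_model {
    int (*shrink)(int nr_to_scan);   /* returns objects left (point 4) */
    struct shrinker_model *next;
};

static struct shrinker_model *shrinker_list;

/* Point 3: register/unregister naming. */
void register_shrinker_model(struct shrinker_model *s)
{
    s->next = shrinker_list;
    shrinker_list = s;
}

void unregister_shrinker_model(struct shrinker_model *s)
{
    struct shrinker_model **pp;

    for (pp = &shrinker_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == s) {
            *pp = s->next;
            s->next = NULL;
            return;
        }
    }
}

/* Stand-in for memory pressure: ask every registered cache to shrink. */
int shrink_all_model(int nr_to_scan)
{
    int remaining = 0;
    struct shrinker_model *s;

    for (s = shrinker_list; s != NULL; s = s->next)
        remaining += s->shrink(nr_to_scan);
    return remaining;
}

/* Hypothetical cache used only to exercise the model. */
int demo_cache_objects = 10;

int demo_shrink(int nr_to_scan)
{
    if (nr_to_scan > demo_cache_objects)
        nr_to_scan = demo_cache_objects;
    demo_cache_objects -= nr_to_scan;
    return demo_cache_objects;
}

struct shrinker_model demo_shrinker = { demo_shrink, NULL };
```

Because the caller owns the structure, registration needs no allocation and cannot fail in this model, which is the practical benefit of point 2.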
     

17 Jul, 2007

1 commit


11 Jul, 2007

21 commits

  • I ran into a curious issue when a lock is being canceled. The
    cancellation results in a lock request to the vfs layer instead of an
    unlock request. This is particularly insidious when the process that
    owns the lock is exiting. In that case, sometimes the erroneous lock is
    applied AFTER the process has entered zombie state, preventing the lock
    from ever being released. Eventually other processes block on the lock,
    causing a slow degradation of the system. In the 2.6.16 kernel on which
    this was investigated, the problem is compounded by the fact that the
    cl_sem is held while blocking on the vfs lock, which results in most
    processes accessing the nfs file system in question hanging.

    In more detail, here is how the situation occurs:

    first _nfs4_do_setlk():

        static int _nfs4_do_setlk(struct nfs4_state *state, int cmd,
                                  struct file_lock *fl, int reclaim)
        ...
                ret = nfs4_wait_for_completion_rpc_task(task);
                if (ret == 0) {
        ...
                } else
                        data->cancelled = 1;

    then nfs4_lock_release():

        static void nfs4_lock_release(void *calldata)
        ...
                if (data->cancelled != 0) {
                        struct rpc_task *task;
                        task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                        data->arg.lock_seqid);

    The problem is that the same file_lock that was passed in to
    _nfs4_do_setlk() gets passed to nfs4_do_unlck() from nfs4_lock_release(),
    so the type is still F_RDLCK or F_WRLCK, not F_UNLCK. At some point, when
    cancelling the lock, the type needs to be changed to F_UNLCK. It seemed
    easiest to do that in nfs4_do_unlck(), but it could be done in
    nfs4_lock_release(). The concern I had with doing it there was whether
    something still needed the original file_lock. As it turns out, the
    original file_lock still needs to be modified by nfs4_do_unlck(), because
    nfs4_do_unlck() passes the original file_lock to the vfs layer and uses a
    copy of it for the RPC request.

    It seems like the simplest solution is to force all situations where
    nfs4_do_unlck() is being used to result in an unlock, so with that in
    mind, I made the following change:

    Signed-off-by: Frank Filz
    Signed-off-by: Trond Myklebust

    Frank Filz
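
The effect of reusing the lock request for the unlock path can be reduced to a few lines. This is a toy model, not the patch itself (the diff is not reproduced in the message above): the struct and the functions ending in _model are hypothetical, and the "vfs" here is just a function that reports whether a lock ends up held.

```c
#include <fcntl.h>

/* Hypothetical stand-in for struct file_lock; only the type matters here. */
struct file_lock_model {
    short fl_type;           /* F_RDLCK, F_WRLCK or F_UNLCK */
};

/* Model of the vfs layer: returns 1 if a lock is held after the request. */
int vfs_apply_model(const struct file_lock_model *fl)
{
    return fl->fl_type != F_UNLCK;
}

/*
 * Model of the described fix: the unlock path forces the type to
 * F_UNLCK before handing the request on, whatever the caller passed in.
 */
int do_unlck_model(struct file_lock_model *fl)
{
    fl->fl_type = F_UNLCK;
    return vfs_apply_model(fl);
}
```

Passing the original F_WRLCK request straight to vfs_apply_model() leaves a lock held, which is the bug; routing it through do_unlck_model() always results in an unlock.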
     
  • Consider the case where the user has mounted the remote filesystem
    server:/foo on the two local directories /bar and /baz using the
    nosharedcache mount option. The files /bar/file and /baz/file are
    represented by different inodes in the local namespace, but refer to the
    same file /foo/file on the server.
    Consider the case where a process opens both /bar/file and /baz/file, then
    closes /bar/file: because the nfs4_state is not shared between /bar/file
    and /baz/file, the kernel will see that the nfs4_state for /bar/file is no
    longer referenced, so it will send off a CLOSE rpc call. Unless the
    open_owners differ, that CLOSE call will invalidate the open state on
    /baz/file too.

    Conclusion: we cannot share open state owners between two different
    non-shared mount instances of the same filesystem.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should
    return EBUSY if the filesystem is already mounted on a superblock that
    has set conflicting mount options.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Prior to David Howells' mount changes in 2.6.18, users who mounted
    different directories which happened to be from the same filesystem on the
    server would get different super blocks, and hence could choose different
    mount options. As long as there were no hard linked files that crossed from
    one subtree to another, this was quite safe.
    After those changes, if the two directories are on the same filesystem
    (have the same 'fsid'), they will share the same super block, and hence
    the same mount options.

    Add a flag to allow users to elect not to share the NFS super block with
    another mount point, even if the fsids are the same. This will allow
    users to set different mount options for the two different super blocks, as
    was previously possible. It is still up to the user to ensure that there
    are no cache coherency issues when doing this. The default behaviour
    remains to share super blocks whenever two paths result in the same fsid.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
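
Taken together, the mount commits above describe a three-way decision when a new mount matches an existing one. The sketch below is a user-space model under stated assumptions: the struct, its fields, the flag value, and the function name are all hypothetical, and a single rsize field stands in for "mount options".

```c
#include <errno.h>

/* Hypothetical flag value standing in for NFS_MOUNT_NOSHAREDCACHE. */
#define NOSHAREDCACHE_MODEL 0x1u

struct mount_model {
    unsigned long fsid;      /* identifies the filesystem on the server */
    unsigned int flags;
    unsigned int rsize;      /* one representative mount option */
};

/*
 * Returns 1 to share the existing super block, 0 to create a new one,
 * or -EBUSY when sharing is required but the options conflict.
 */
int compare_mount_model(const struct mount_model *old,
                        const struct mount_model *new)
{
    if (old->fsid != new->fsid)
        return 0;            /* different filesystem: never shared */
    if ((old->flags | new->flags) & NOSHAREDCACHE_MODEL)
        return 0;            /* either side opted out of sharing */
    if (old->flags != new->flags || old->rsize != new->rsize)
        return -EBUSY;       /* must share, but options conflict */
    return 1;
}
```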
     
  • Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Hook in final components required for supporting in-kernel mount option
    parsing for NFSv2 and NFSv3 mounts.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • For NFSv2 and v3 mounts, the first step is to contact the server's MOUNTD
    and request the file handle for the root of the mounted share. Add a
    function to the NFS client that handles this operation.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • This generic infrastructure works for both NFS and NFSv4 mounts.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up white space and coding conventions.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • In preparation for supporting NFSv2 and NFSv3 mount option handling in the
    kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS
    client, instead of built only when CONFIG_ROOT_NFS is enabled.

    In addition, we also replace the "struct sockaddr_in *" argument with
    something more generic, to help support IPv6 at some later point.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • A couple of callers just use a stringified IP address for the rpc client's
    hostname. Move the logic for constructing this into rpc_create(), so it can
    be shared.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
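
A "stringified IP address" used as a hostname can be produced in user space with inet_ntop(). This is an analogue for illustration only; the helper name addr_to_hostname() is hypothetical, and the message does not say how rpc_create() formats the address internally.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: render an IPv4 socket address as dotted-quad
 * text so it can serve as a hostname when none was supplied.
 * Returns buf on success, NULL if the buffer is too small.
 */
const char *addr_to_hostname(const struct sockaddr_in *sin,
                             char *buf, size_t len)
{
    return inet_ntop(AF_INET, &sin->sin_addr, buf, (socklen_t)len);
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for an IPv4 address. Centralising this in one place is what lets several callers drop their own copies of the formatting code.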
     
  • In preparation for handling NFS mount option parsing in the kernel,
    rename rpcb_getport_external as rpcb_getport_sync, and make it always
    available (instead of only when CONFIG_ROOT_NFS is enabled).

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Refactor NFSv4 mount processing to break out mount data validation
    in the same way it's broken out in the NFSv2/v3 mount path.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Move error handling code out of the main code path. The switch statement
    was also improperly indented, according to Documentation/CodingStyle. This
    prepares nfs_validate_mount_data for the addition of option string parsing.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • NFS and NFSv4 mounts can now share server address sanity checking. This
    also provides an easy mechanism for adding IPv6 address checking at some
    later point.

    Signed-off-by: Chuck Lever
    Cc: Aurelien Charbon
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • /home/cel/linux/fs/nfs/super.c: In function 'nfs_pseudoflavour_to_name':
    /home/cel/linux/fs/nfs/super.c:270: warning: comparison between signed and unsigned

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
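
The message quotes only the compiler warning, not the fix. As a general illustration of how this class of warning is usually avoided, the sketch below keeps both sides of every comparison unsigned. The table contents, the type, and the function name are hypothetical and are not taken from fs/nfs/super.c.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical flavour table; values and names are illustrative only. */
struct flavour_name {
    unsigned int flavour;
    const char *name;
};

static const struct flavour_name flavour_table[] = {
    { 0u, "null" },
    { 1u, "sys" },
};

const char *flavour_to_name_model(unsigned int flavour)
{
    size_t i;   /* unsigned, like the ARRAY_SIZE() result it is compared to */

    for (i = 0; i < ARRAY_SIZE(flavour_table); i++) {
        /* Both operands are unsigned int, so no sign conversion occurs. */
        if (flavour_table[i].flavour == flavour)
            return flavour_table[i].name;
    }
    return "unknown";
}
```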
     
  • The error return logic in nfs_get_sb now matches nfs4_get_sb, and is more maintainable.
    A subsequent patch will take advantage of this simplification.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The new string utility function strndup_user can be used instead of
    nfs_copy_user_string, eliminating an unnecessary duplication of function.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • inode->i_blocks is a blkcnt_t these days, which can be a u64 or unsigned
    long, depending on the setting of CONFIG_LSF.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
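
When a type's width depends on configuration, the usual way to print it portably is to cast to the widest matching type and use the corresponding format. The message does not show the actual change, so this is only a user-space sketch of that idiom; format_blocks() is a hypothetical helper, and note that the user-space blkcnt_t is a signed type, unlike the kernel one described above.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: format a block count without assuming whether
 * blkcnt_t is 32 or 64 bits wide. The caller is expected to pass a
 * non-negative count. Returns the number of characters written,
 * as snprintf() does.
 */
int format_blocks(char *buf, size_t len, blkcnt_t blocks)
{
    /* Cast to a fixed, wide type so "%llu" is always correct. */
    return snprintf(buf, len, "%llu", (unsigned long long)blocks);
}
```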