10 Jul, 2008

32 commits

  • All instances are set to nfs_open(), so we should just remove the redundant
    indirection. Ditto for the file_release op.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The RPC client uses the rq_xtime field in each RPC request to determine the
    round-trip time of the request. Currently, the rq_xtime field is
    initialized by each transport just before it starts enqueuing a request to
    be sent. However, transports do not handle initializing this value
    consistently; sometimes they don't initialize it at all.

    To make the measurement of request round-trip time consistent for all
    RPC client transport capabilities, pull rq_xtime initialization into the
    RPC client's generic transport logic. Now all transports will get a
    standardized RTT measure automatically, from:

    xprt_transmit()

    to

    xprt_complete_rqst()

    This makes round-trip time calculation more accurate for the TCP transport.
    The socket ->sendmsg() method can return "-EAGAIN" if the socket's output
    buffer is full, so the TCP transport's ->send_request() method may call
    the ->sendmsg() method repeatedly until it gets all of the request's bytes
    queued in the socket's buffer.

    Currently, the TCP transport sets the rq_xtime field every time through
    that loop so the final value is the timestamp just before the *last* call
    to the underlying socket's ->sendmsg() method. After this patch, the
    rq_xtime field contains a timestamp that reflects the time just before the
    *first* call to ->sendmsg().

    This is consequential under heavy workloads because large requests often
    take multiple ->sendmsg() calls to get all the bytes of a request queued.
    The TCP transport causes the request to sleep until the remote end of the
    socket has received enough bytes to clear space in the socket's local
    output buffer. This delay can be quite significant.

    The method introduced by this patch is a more accurate measure of RTT
    for stream transports, since the server can cause enough back pressure
    to delay (i.e., increase the latency of) requests from the client.

    Additionally, this patch corrects the behavior of the RDMA transport, which
    entirely neglected to initialize the rq_xtime field. RPC performance
    metrics for RDMA transports now display correct RPC request round trip
    times.

    Signed-off-by: Chuck Lever
    Acked-by: Tom Talpey
    Signed-off-by: Trond Myklebust

    Chuck Lever
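
    The difference between stamping before every ->sendmsg() attempt and
    stamping once before the first attempt can be sketched in user space.
    This is a minimal illustration, not the kernel code: struct sketch_req,
    sketch_clock, sketch_sendmsg() and both sketch_stamp_*() helpers are
    hypothetical names, and the "socket" is a stub that queues a fixed number
    of bytes per call (assumed to be greater than zero).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct sketch_req {
    long xtime;         /* start of the RTT interval */
    size_t bytes_left;  /* bytes not yet queued in the socket buffer */
};

long sketch_clock;      /* fake monotonic clock, in ticks */

/* Pretend ->sendmsg(): queues at most `room` bytes per call (room > 0). */
int sketch_sendmsg(struct sketch_req *req, size_t room)
{
    size_t n = req->bytes_left < room ? req->bytes_left : room;

    req->bytes_left -= n;
    return req->bytes_left ? -EAGAIN : 0;
}

/* Old TCP behaviour: stamp before every attempt, so the last stamp wins. */
long sketch_stamp_each_attempt(struct sketch_req *req, size_t room, long stall)
{
    int rc;

    do {
        req->xtime = sketch_clock;
        rc = sketch_sendmsg(req, room);
        if (rc == -EAGAIN)
            sketch_clock += stall;  /* wait for buffer space to clear */
    } while (rc == -EAGAIN);
    return req->xtime;
}

/* Patched behaviour: stamp once, before the first attempt. */
long sketch_stamp_once(struct sketch_req *req, size_t room, long stall)
{
    int rc;

    req->xtime = sketch_clock;
    do {
        rc = sketch_sendmsg(req, room);
        if (rc == -EAGAIN)
            sketch_clock += stall;  /* wait for buffer space to clear */
    } while (rc == -EAGAIN);
    return req->xtime;
}
```

    With the clock starting at 100, a 3000-byte request, room for 1000 bytes
    per call and a 10-tick stall, the first helper records 120 and the second
    records 100, so only the second includes the 20 ticks of back pressure in
    the measured round trip.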
     
  • ftruncate() access checking is supposed to be performed at open() time,
    just like reads and writes.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Try to make the comment here a little more clear and concise.

    Also, this macro definition seems unnecessary.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • There used to be a print_hexl() function that used isprint(), now gone.
    I don't know why NFS_NGROUPS and CA_RUN_AS_MACHINE were here.

    I also don't know why another #define that's actually used was marked
    "unused".

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Also, a minor comment grammar fix in the same file.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • The cl_chatty flag allows us to control whether a given rpc client leaves

    "server X not responding, timed out"

    messages in the syslog. Such messages make sense for ordinary nfs
    clients (where an unresponsive server means applications on the
    mountpoint are probably hanging), but not for the callback client (which
    can fail more commonly, with the only consequence being that some
    optimizations are disabled).

    cl_chatty was previously removed due to lack of users; reinstate it, and
    use it for nfsd's callback client.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Olga Kornievskaia
     
  • When remounting an NFS or NFS4 filesystem, the new NFS options are not
    respected, yet the remount will still return success. This patch adds
    a remount_fs sb op for NFS that checks any new nfs mount options against
    the existing ones and fails the mount if any have changed.

    This is only implemented for string-based mount options since doing
    this with binary options isn't really feasible.

    This is essentially the same as the original patch I sent out, but
    adds a check to see if the addr= option has changed.

    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Jeff Layton
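
    The remount check described above amounts to comparing the newly parsed
    options with the ones already in effect and refusing the remount on any
    difference. A minimal user-space sketch follows; struct sketch_mount_opts,
    its fields and sketch_compare_remount() are hypothetical, and returning
    -EINVAL is an assumption rather than something the commit message states.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical subset of parsed string-based mount options. */
struct sketch_mount_opts {
    unsigned int rsize;   /* read size */
    unsigned int wsize;   /* write size */
    int flags;            /* boolean options folded into a bitmask */
    char addr[64];        /* server address from the addr= option */
};

/*
 * Return 0 when the requested options match the current ones, or a negative
 * errno (assumed here to be -EINVAL) when any of them has changed.
 */
int sketch_compare_remount(const struct sketch_mount_opts *cur,
                           const struct sketch_mount_opts *req)
{
    if (cur->rsize != req->rsize ||
        cur->wsize != req->wsize ||
        cur->flags != req->flags ||
        strcmp(cur->addr, req->addr) != 0)
        return -EINVAL;
    return 0;
}
```

    Comparing field by field, rather than with memcmp(), avoids false
    mismatches from padding bytes or from stale data after the terminating
    NUL in addr.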
     
  • This patch removes from a comment a CVS keyword that had not been updated
    for a long time.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Trond Myklebust

    Adrian Bunk
     
  • Recent changes to the RPC client's transport connect logic make connect
    status values ECONNREFUSED and ECONNRESET impossible.

    Clean up xprt_connect_status() to account for these changes.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: fix a few dprintk messages that still need to show the RPC task ID
    correctly, and be sure we use the preferred %lld or %llu instead of %Ld or
    %Lu.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: some fops use NFSDBG_FILE, some use NFSDBG_VFS. Let's use
    NFSDBG_FILE for all fops, and consistently report file names instead
    of inode numbers.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Recent work in fs/nfs/file.c neglected to add appropriate trace debugging
    for the NFS client's address space operations.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Report the same debugging info and count function calls the
    same for files and directories in nfs_opendir() and nfs_file_open().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Report the same debugging info in nfs_llseek_dir() and
    nfs_llseek_file().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: Report the same debugging info, count function calls the same,
    and use similar function naming in nfs_fsync_dir() and nfs_fsync().

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • In rpc_show_tasks(), display the program name, version number, procedure
    name and tk_action as human-readable variable-length text fields rather
    than columnar numbers.

    Doing the symbol lookup here helps in cases where we have actual
    debugging output from a kernel log, but don't have access to the kernel
    image or RPC module that generated the output.

    Sample output:

    -pid- flgs status -client- --rqstp- -timeout ---ops--
    5608 0001 -11 eeb42690 f6d93710 0 f8fa1764 nfsv3 WRITE a:call_transmit_status q:none
    5609 0001 -11 eeb42690 f6d937e0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5610 0001 -11 eeb42690 f6d93230 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5611 0001 -11 eeb42690 f6d93300 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5612 0001 -11 eeb42690 f6d93090 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5613 0001 -11 eeb42690 f6d933d0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5614 0001 -11 eeb42690 f6d93cc0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5615 0001 -11 eeb42690 f6d93a50 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5616 0001 -11 eeb42690 f6d93640 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5617 0001 -11 eeb42690 f6d93b20 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending
    5618 0001 -11 eeb42690 f6d93160 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: move the logic that displays each task to its own function.
    This removes indentation and makes future changes easier.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: don't display the rpc_show_tasks column header unless there is at
    least one task to display. As far as I can tell, it is safe to let the
    list_for_each_entry macro decide that each list is empty.

    scripts/checkpatch.pl also wants a KERN_FOO at the start of any newly added
    printk() calls, so this and subsequent patches will also add KERN_INFO.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • The RPC client uses a finite state machine to move RPC tasks through each
    step of an RPC request. Each state is contained in a function in
    net/sunrpc/clnt.c, and named call_foo.

    Some of the functions named call_foo have changed over the past few years and
    are no longer states in the FSM. These include: call_encode, call_header,
    and call_verify. As a clean up, rename the functions that have changed.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Improve debugging messages in call_start() and call_verify() by having
    them show the RPC procedure name instead of the procedure number.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Clean up: refresh the help text for Kconfig items related to the NFS
    client. Remove obsolete URLs, and make the language consistent among
    the options.

    Also move the ROOT_NFS config option next to the options related to the
    NFS client.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Since the credentials may be allocated during the call to rpc_new_task(),
    which again may be called by a memory allocator...

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Revert commit 44dd151d "NFS: Don't mark a written page as uptodate until it
    is on disk". While it is true that the write may fail, that is always the
    case. There is no reason why we should treat data on pages that are not
    already marked as PG_uptodate as being special. The only thing we gain is a
    noticeable slowdown when re-reading these pages.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If a file is being extended, and we're creating a hole, we might as well
    declare the entire page to be up to date.

    This patch significantly improves the write performance for sparse files
    in the case where lseek(SEEK_END) is used to append several non-contiguous
    writes at intervals of < PAGE_SIZE.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The special 'ENOMEM' case that was previously flagged as non-fatal is
    bogus: auth_gss always returns EAGAIN for non-fatal errors, and may in fact
    return ENOMEM in the special case where xdr_buf_read_netobj runs out of
    preallocated buffer space (invariably a _fatal_ error, since there is no
    provision for preallocating larger buffers).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • All errors from call_encode(), with the exception of EAGAIN, are fatal, so we
    should immediately return instead of proceeding to xprt_transmit().

    Signed-off-by: Trond Myklebust

    Trond Myklebust
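
    The rule stated above can be written as a small decision function. This
    is only a sketch: enum sketch_next and sketch_after_encode() are
    hypothetical names, and treating EAGAIN as "try again later" is an
    assumption, since the commit message says only that EAGAIN is not fatal.

```c
#include <errno.h>

/* Hypothetical outcomes after the encode step. */
enum sketch_next {
    SKETCH_TRANSMIT,  /* encoding succeeded, go on to transmit */
    SKETCH_RETRY,     /* non-fatal: assumed to mean "try again later" */
    SKETCH_EXIT       /* fatal: return immediately, do not transmit */
};

/* Map the status of the encode step to the next action. */
enum sketch_next sketch_after_encode(int status)
{
    if (status == 0)
        return SKETCH_TRANSMIT;
    if (status == -EAGAIN)
        return SKETCH_RETRY;
    return SKETCH_EXIT;   /* every other error, including -ENOMEM */
}
```

    Under this rule -ENOMEM lands in the fatal branch instead of being
    special-cased as non-fatal.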
     
  • NFSv2 file locking currently fails the Connectathon tests, because the
    calls to the VFS locking code do not return an EINVAL error if the
    struct file_lock overflows the 32-bit boundaries.

    The problem is due to the fact that we occasionally call helpers from
    fs/locks.c in order to avoid RPC calls to the server when we know that a
    local process holds the lock. These helpers are, of course, always
    64-bit enabled, so EINVAL is not returned in cases where it would have
    been had the call gone to the NLM code.

    For consistency, we therefore add support for a bounds-checking helper.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
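
    A bounds check of this kind can be sketched as follows. It is an
    illustration under stated assumptions, not the helper the patch adds:
    sketch_check_lock_bounds() and SKETCH_OFFSET_MAX are hypothetical, the
    lock range is passed as two 64-bit offsets, and an end equal to
    SKETCH_OFFSET_MAX is assumed to mean "lock to end of file".

```c
#include <errno.h>
#include <stdint.h>

/* Assumed sentinel meaning "lock extends to end of file". */
#define SKETCH_OFFSET_MAX INT64_MAX

/*
 * Return 0 if the lock range fits in signed 32-bit offsets, else -EINVAL.
 * `start` and `end` are inclusive byte offsets.
 */
int sketch_check_lock_bounds(int64_t start, int64_t end)
{
    if (start < 0 || start > INT32_MAX)
        return -EINVAL;
    if (end != SKETCH_OFFSET_MAX && (end < start || end > INT32_MAX))
        return -EINVAL;
    return 0;
}
```

    Running such a check before consulting the local lock helpers gives the
    same EINVAL result whether or not the request would have reached the
    server.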
     
  • The commit 2785259631697ebb0749a3782cca206e2e542939 (nfs: use GFP_NOFS
    preloads for radix-tree insertion) appears to have introduced a bug:
    We only want to call radix_tree_preload() once after creating a request.
    Calling it on every loop iteration after the request has been created
    will cause preemption count leaks.

    Signed-off-by: Trond Myklebust
    Cc: Nick Piggin

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     

09 Jul, 2008

8 commits