02 Feb, 2006

5 commits


19 Jan, 2006

4 commits

  • Allow mechanisms to return more varied errors on the context creation
    downcall.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     
  • We require the server's gssd to create a completed context before asking the
    kernel to send a final context init reply. However, gssd could be buggy, or
    under some bizarre circumstances we might purge the context from our cache
    before we get the chance to use it here.

    Handle this case by returning GSS_S_NO_CONTEXT to the client.

    Also move the relevant code here to a separate function rather than nesting
    excessively.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kevin Coffman
     
  • Kerberos context initiation is handled in a single round trip, but other
    mechanisms (including spkm3) may require more, so we need to handle the
    GSS_S_CONTINUE case in svcauth_gss_accept. Send a null verifier.

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Adamson
     
  • The server code currently keeps track of the destination address on every
    request so that it can reply using the same address. However, we forget to
    do that in the case of a deferred request. Remedy this oversight. From folks
    at PolyServe.

    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J. Bruce Fields
     

12 Jan, 2006

1 commit


11 Jan, 2006

3 commits


10 Jan, 2006

3 commits


09 Jan, 2006

1 commit

  • A long time ago, struct dentry was carefully tuned so that on 32-bit UP,
    sizeof(struct dentry) was exactly 128, i.e. a power of 2 and a multiple
    of the memory cache line size.

    Then RCU was added and struct dentry grew by two pointers, with nice
    results for SMP but not so good ones on UP, because it broke the above
    tuning (128 + 8 = 136 bytes).

    This patch reverts this unwanted side effect by using a union (d_u),
    where d_rcu and d_child are placed so that these two fields can share
    their storage.

    At the time d_free() is called (and d_rcu is really used), d_child is known
    to be empty and not touched by the dentry freeing.

    Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
    the previous content of d_child is not needed if said dentry was unhashed
    but still accessed by a CPU because of RCU constraints)

    As the dentry cache easily contains millions of entries, a size reduction
    is worth the extra complexity of the ugly C union.

    Signed-off-by: Eric Dumazet
    Cc: Dipankar Sarma
    Cc: Maneesh Soni
    Cc: Miklos Szeredi
    Cc: "Paul E. McKenney"
    Cc: Ian Kent
    Cc: Paul Jackson
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Trond Myklebust
    Cc: Neil Brown
    Cc: James Morris
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
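
The space saving described above can be illustrated with a small sketch. The types below are simplified stand-ins, not the kernel's real definitions, and `node_separate`, `node_shared` and `bytes_saved` are hypothetical names. The sketch assumes, as the commit message states for d_child and d_rcu, that the two fields are never in use at the same time.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's list_head and rcu_head. */
struct list_head {
    struct list_head *next, *prev;
};

struct rcu_head {
    struct rcu_head *next;
    void (*func)(struct rcu_head *head);
};

/* Before: both fields occupy space in every object. */
struct node_separate {
    int flags;
    struct list_head child; /* used while the object is live */
    struct rcu_head rcu;    /* used only while the object is being freed */
};

/* After: the two fields overlap because they are never live together. */
struct node_shared {
    int flags;
    union {
        struct list_head child;
        struct rcu_head rcu;
    } u;
};

/* Bytes saved per object by letting the two fields share storage. */
static size_t bytes_saved(void)
{
    return sizeof(struct node_separate) - sizeof(struct node_shared);
}
```

On common platforms, where data and function pointers have the same size, the saving is two pointers per object, which matches the 8 bytes quoted for 32-bit machines.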
     

07 Jan, 2006

20 commits

  • Print messages when an unsupported encryption algorithm is requested or
    there is an error locating a supported algorithm.

    Signed-off-by: Kevin Coffman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Print messages when an unsupported encryption algorithm is requested or
    there is an error locating a supported algorithm.

    Signed-off-by: Kevin Coffman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • Also update the tokenlen calculations to accommodate g_token_size().

    Signed-off-by: Andy Adamson
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    J. Bruce Fields
     
  • We ought never to be calling xprt_destroy() if there are still active
    rpc_tasks. Optimise away the broken code that attempts to "fix" that case.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If the server decides to close the RPC socket, we currently don't actually
    respond until either another RPC call is scheduled, or until xprt_autoclose()
    gets called by the socket expiry timer (which may be up to 5 minutes
    later).

    This patch ensures that xprt_autoclose() is called much sooner if the
    server closes the socket.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Clean up: Every ULP that uses the in-kernel RPC client, except the NLM
    client, sets cl_chatty. There's no reason why NLM shouldn't set it, so
    just get rid of cl_chatty and always be verbose.

    Test-plan:
    Compile with CONFIG_NFS enabled.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • At some point, transport endpoint addresses will no longer be IPv4. To hide
    the structure of the rpc_xprt's address field from ULPs and port mappers,
    add an API for setting the port number during an RPC bind operation.

    Test-plan:
    Destructive testing (unplugging the network temporarily). Connectathon
    with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked.
    Probably need to rig a server where certain services aren't running, or
    that returns an error for some typical operation.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols.
    Start by creating an API to force RPC rebind, replacing logic that simply
    sets cl_port to zero.

    Test-plan:
    Destructive testing (unplugging the network temporarily). Connectathon
    with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked.
    Probably need to rig a server where certain services aren't running, or
    that returns an error for some typical operation.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Add RPC client transport switch support for replacing buffer management
    on a per-transport basis.

    In the current IPv4 socket transport implementation, RPC buffers are
    allocated as needed for each RPC message that is sent. Some transport
    implementations may choose to use pre-allocated buffers for encoding,
    sending, receiving, and unmarshalling RPC messages, however. For
    transports capable of direct data placement, the buffers can be carved
    out of a pre-registered area of memory rather than from a slab cache.

    Test-plan:
    Millions of fsx operations. Performance characterization with "sio" and
    "iozone". Use oprofile and other tools to look for significant regression
    in CPU utilization.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • This patch removes the unused function xdr_decode_string().

    Signed-off-by: Adrian Bunk
    Acked-by: Neil Brown
    Acked-by: Charles Lever
    Signed-off-by: Trond Myklebust

    Adrian Bunk
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • ...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
    interrupt RPC calls (as per the Solaris implementation).

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • The NFSv4 model requires us to complete all RPC calls that might
    establish state on the server whether or not the user wants to
    interrupt it. We may also need to schedule new work (including
    new RPC calls) in order to cancel the new state.

    The asynchronous RPC model will allow us to ensure that RPC calls
    always complete, but in order to allow for "synchronous" RPC, we
    want to add the ability to wait for completion.
    The waits are, of course, interruptible.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Shrink the RPC task structure. Instead of storing separate pointers
    for task->tk_exit and task->tk_release, put them in a structure.

    Also pass the user data pointer as a parameter instead of passing it via
    task->tk_calldata. This enables us to nest callbacks.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
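
A rough sketch of the idea in the commit above: the callbacks live in one shared table, and the user data is passed as an argument, which lets one callback wrap another. All names here (`call_ops`, `task_finish`, `wrapper`, `count_done`) are hypothetical and are not the identifiers used by the patch.

```c
#include <stddef.h>

struct task;

/* One table of callbacks, referenced by a single pointer in the task. */
struct call_ops {
    void (*done)(struct task *task, void *data);
    void (*release)(void *data);
};

struct task {
    const struct call_ops *ops;
    void *calldata;
    int status;
};

/* Run the completion callbacks, handing the user data over as a parameter. */
static void task_finish(struct task *task)
{
    if (task->ops->done)
        task->ops->done(task, task->calldata);
    if (task->ops->release)
        task->ops->release(task->calldata);
}

/* Nesting: an outer callback that forwards to an inner one with its own data. */
struct wrapper {
    const struct call_ops *inner_ops;
    void *inner_data;
    int outer_calls;
};

static void wrapper_done(struct task *task, void *data)
{
    struct wrapper *w = data;

    w->outer_calls++;
    if (w->inner_ops->done)
        w->inner_ops->done(task, w->inner_data);
}

static const struct call_ops wrapper_ops = { .done = wrapper_done };

/* Example inner callback: accumulate the task status into a counter. */
static void count_done(struct task *task, void *data)
{
    int *counter = data;

    *counter += task->status;
}

static const struct call_ops count_ops = { .done = count_done };
```

Because the inner callback receives its data as an argument rather than reading it from the task, the wrapper can substitute `inner_data` without touching the task itself.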
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • I submitted this one previously - svc_tcp_recvfrom currently returns
    any errors to the caller, including ECONNRESET and the like.

    This is something svc_recv isn't able to deal with:

        len = svsk->sk_recvfrom(rqstp);
        [...]
        if (len == 0 || len == -EAGAIN) {
                [...]
                return -EAGAIN;
        }

        [...]
        return len;

    The nfsd main loop will exit when it sees an error code other than
    EAGAIN.

    The following patch fixes this problem.

    svc_recv is not equipped to deal with error codes other than EAGAIN,
    and will propagate anything else (such as ECONNRESET) up to nfsd,
    causing it to exit.

    Signed-off-by: Olaf Kirch
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Kirch
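
One way to keep hard errors from reaching the main loop is sketched below. This is an assumption about the general shape of such a fix, not the code of the patch itself: on any error other than EAGAIN the connection is marked for teardown and the caller is told to retry. The names `conn`, `recv_or_retry`, `fake_reset` and `fake_ok` are hypothetical.

```c
#include <errno.h>

/* Minimal stand-in for a per-connection structure. */
struct conn {
    int closed;
};

/* A transport receive routine: returns a byte count or a negative errno. */
typedef int (*recv_fn)(struct conn *c);

/*
 * Call the raw receive routine. Successful reads and -EAGAIN pass through.
 * Any other error closes this one connection and is reported as -EAGAIN,
 * so the caller's loop keeps serving the remaining connections.
 */
static int recv_or_retry(struct conn *c, recv_fn raw_recv)
{
    int len = raw_recv(c);

    if (len >= 0 || len == -EAGAIN)
        return len;

    c->closed = 1;
    return -EAGAIN;
}

/* Test doubles: one simulates a reset peer, one a successful read. */
static int fake_reset(struct conn *c)
{
    (void)c;
    return -ECONNRESET;
}

static int fake_ok(struct conn *c)
{
    (void)c;
    return 128;
}
```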
     
  • The hash.h hash_long function, when used on a 64-bit machine, ignores many
    of the middle-order bits. (The prime chosen is too bit-sparse.)

    IP addresses for clients of an NFS server are very likely to differ only in
    the low-order bits. As addresses are stored in network-byte-order, these
    bits become middle-order bits in a little-endian 64bit 'long', and so do
    not contribute to the hash. Thus you can have the situation where all
    clients appear on one hash chain.

    So, until hash_long is fixed (or maybe forever), use a hash function that
    works well on IP addresses - xor the bytes together.

    Thanks to "Iozone" for identifying this problem.

    Cc: "Iozone"

    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
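
As an illustration of "xor the bytes together", here is a minimal sketch for a 32-bit IPv4 address. The name `ip_hash` and the 256-bucket table size are assumptions made for this example; the patch may differ in detail.

```c
#include <stdint.h>

/*
 * Fold the four bytes of an IPv4 address into one byte by xor.
 * Every byte contributes to the result, so addresses that differ in any
 * single byte land in different buckets of a 256-entry table.
 */
static inline unsigned int ip_hash(uint32_t ip)
{
    uint32_t h = ip ^ (ip >> 16); /* fold the upper half onto the lower half */

    return (h ^ (h >> 8)) & 0xff; /* fold again and keep one byte */
}
```

Since xor is commutative, the result does not depend on byte order: 192.168.1.5 hashes to 0xc0 ^ 0xa8 ^ 0x01 ^ 0x05 = 0x6c whether the address is held in network or host order.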
     

04 Jan, 2006

1 commit

  • I noticed that some of the 'struct proto_ops' used in the kernel may share
    a cache line used by locks or other heavily modified data. (default
    linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
    least)

    This patch makes sure a 'struct proto_ops' can be declared as const,
    so that all cpus can share all parts of it without false sharing.

    This is not mandatory: a driver can still use a read/write structure
    if it needs to (and eventually a __read_mostly)

    I made a global substitution to change all existing occurrences to make
    them const.

    This should reduce the possibility of false sharing on SMP, and
    speed up some socket system calls.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
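
The pattern described above, a table of function pointers declared const and reached through a pointer-to-const, can be sketched as follows. The names (`ops_table`, `endpoint`, `demo_bind`, `demo_release`) are hypothetical and much simpler than the real 'struct proto_ops'. Toolchains typically place such const objects in a read-only section, away from frequently written data, though the exact placement depends on the compiler and linker.

```c
/* A small table of operations, analogous in spirit to an ops structure. */
struct ops_table {
    int (*bind)(int fd);
    int (*release)(int fd);
};

static int demo_bind(int fd)
{
    return fd + 1;
}

static int demo_release(int fd)
{
    (void)fd;
    return 0;
}

/* Declared const: never written after initialisation. */
static const struct ops_table demo_ops = {
    .bind = demo_bind,
    .release = demo_release,
};

/* Users hold a pointer-to-const, so they cannot modify the shared table. */
struct endpoint {
    const struct ops_table *ops;
    int fd;
};

static int endpoint_bind(const struct endpoint *ep)
{
    return ep->ops->bind(ep->fd);
}
```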
     

20 Dec, 2005

2 commits

  • gss_create_upcall() should not error just because rpc.gssd closed the
    pipe on its end. Instead, it should requeue the pending requests and then
    retry.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • If we get something like the following,
    [ 125.300636] [] schedule_timeout+0x54/0xa5
    [ 125.305931] [] io_schedule_timeout+0x29/0x33
    [ 125.311495] [] blk_congestion_wait+0x70/0x85
    [ 125.317058] [] throttle_vm_writeout+0x69/0x7d
    [ 125.322720] [] shrink_zone+0xe0/0xfa
    [ 125.327560] [] shrink_caches+0x6d/0x6f
    [ 125.332581] [] try_to_free_pages+0xd0/0x1b5
    [ 125.338056] [] __alloc_pages+0x135/0x2e8
    [ 125.343258] [] tcp_sendmsg+0xaa0/0xb78
    [ 125.348281] [] inet_sendmsg+0x48/0x53
    [ 125.353212] [] sock_sendmsg+0xb8/0xd3
    [ 125.358147] [] kernel_sendmsg+0x42/0x4f
    [ 125.363259] [] sock_no_sendpage+0x5e/0x77
    [ 125.368556] [] xs_tcp_send_request+0x2af/0x375
    then the socket is blocked until memory is reclaimed, and no
    progress can ever be made.

    Try to access the emergency pools by using GFP_ATOMIC.

    Signed-off-by: Trond Myklebust

    Trond Myklebust