19 Apr, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26: (1090 commits)
    [NET]: Fix and allocate less memory for ->priv'less netdevices
    [IPV6]: Fix dangling references on error in fib6_add().
    [NETLABEL]: Fix NULL deref in netlbl_unlabel_staticlist_gen() if ifindex not found
    [PKT_SCHED]: Fix datalen check in tcf_simp_init().
    [INET]: Uninline the __inet_inherit_port call.
    [INET]: Drop the inet_inherit_port() call.
    SCTP: Initialize partial_bytes_acked to 0, when all of the data is acked.
    [netdrvr] forcedeth: internal simplifications; changelog removal
    phylib: factor out get_phy_id from within get_phy_device
    PHY: add BCM5464 support to broadcom PHY driver
    cxgb3: Fix __must_check warning with dev_dbg.
    tc35815: Statistics cleanup
    natsemi: fix MMIO for PPC 44x platforms
    [TIPC]: Cleanup of TIPC reference table code
    [TIPC]: Optimized initialization of TIPC reference table
    [TIPC]: Remove inlining of reference table locking routines
    e1000: convert uint16_t style integers to u16
    ixgb: convert uint16_t style integers to u16
    sb1000.c: make const arrays static
    sb1000.c: stop inlining largish static functions
    ...

    Linus Torvalds
     

18 Apr, 2008

1 commit


17 Apr, 2008

1 commit

  • Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
    "send with invalidate" work request as defined in the iWARP verbs and
    the InfiniBand base memory management extensions. Also put "imm_data"
    and a new "invalidate_rkey" member in a new "ex" union in struct
    ib_send_wr. The invalidate_rkey member can be used to pass in an
    R_Key/STag to be invalidated. Add this new union to struct
    ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in
    ib_uverbs_post_send().

    Fix up low-level drivers to deal with the change to struct ib_send_wr,
    and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
    since that code never does any send with immediate operations.

    Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
    the iWARP drivers currently in the tree set the bit. The amso1100
    driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
    if passed in as part of userspace send requests (since it does not
    implement kernel bypass work request queueing). Remove the flag from
    all existing drivers that set it until we know which ones are OK.

    The value chosen for the new flag is not consecutive, to avoid clashing
    with flags defined in the XRC patches, which are not merged yet but
    which are already in use and are likely to be merged soon.

    This resurrects a patch sent long ago by Mikkel Hagen.

    Signed-off-by: Roland Dreier

    Roland Dreier
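
The "ex" union described in the commit above can be pictured with a trimmed-down sketch. The `sketch_` type and function names are hypothetical stand-ins, not the kernel definitions, and plain `uint32_t` is assumed for both members for simplicity:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the kernel definitions. */
enum sketch_wr_opcode {
    SKETCH_WR_SEND,
    SKETCH_WR_SEND_WITH_IMM,
    SKETCH_WR_SEND_WITH_INV,
};

struct sketch_send_wr {
    enum sketch_wr_opcode opcode;
    union {
        uint32_t imm_data;         /* used by "send with immediate" */
        uint32_t invalidate_rkey;  /* used by "send with invalidate" */
    } ex;
};

/* Build a "send with invalidate" request for the given R_Key/STag. */
static struct sketch_send_wr sketch_send_with_inv(uint32_t rkey)
{
    struct sketch_send_wr wr = { .opcode = SKETCH_WR_SEND_WITH_INV };

    wr.ex.invalidate_rkey = rkey;
    return wr;
}
```

Because a work request carries either immediate data or a key to invalidate, never both, sharing storage in a union costs nothing and keeps the structure size unchanged.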
     

09 Apr, 2008

2 commits


04 Apr, 2008

1 commit


27 Mar, 2008

1 commit

  • The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly
    when the last chunk in the read-list spanned multiple pages. This
    resulted in a kernel panic when the wrong context was used to
    build the RPC iovec page list.

    RDMA_READ is used to fetch RPC data from the client for
    NFS_WRITE requests. A scatter-gather is used to map the
    advertised client side buffer to the server-side iovec and
    associated page list.

    WR contexts are used to convey which scatter-gather entries are
    handled by each WR. When the write data is large, a single RPC may
    require multiple RDMA_READ requests so the contexts for a single RPC
    are chained together in a linked list. The last context in this list
    is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes,
    the CQ handler code can enqueue the RPC for processing.

    The code in rdma_read_xdr was setting this bit on the last two
    contexts on this list when the last read-list chunk spanned multiple
    pages. This caused the svc_rdma_recvfrom logic to incorrectly build
    the RPC and caused the kernel to crash because the second-to-last
    context doesn't contain the iovec page list.

    Modified the condition that sets this bit so that it correctly detects
    the last context for the RPC.

    Signed-off-by: Tom Tucker
    Tested-by: Roland Dreier
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Linus Torvalds

    Tom Tucker
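
The invariant restored by the commit above (exactly one context in the chain carries the "last" flag) can be illustrated as follows. This is a minimal sketch, not the condition actually changed in rdma_read_xdr; the `sketch_` names and the flag value are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_F_LAST_CTXT 0x1u  /* stand-in for RDMACTXT_F_LAST_CTXT */

struct sketch_ctxt {
    unsigned int flags;
    struct sketch_ctxt *next;
};

/* Walk the chain and flag only the final context. */
static void sketch_mark_last(struct sketch_ctxt *head)
{
    for (struct sketch_ctxt *c = head; c != NULL; c = c->next) {
        if (c->next == NULL)
            c->flags |= SKETCH_F_LAST_CTXT;
        else
            c->flags &= ~SKETCH_F_LAST_CTXT;
    }
}

/* Count how many contexts in the chain carry the flag. */
static int sketch_count_last(const struct sketch_ctxt *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        if (head->flags & SKETCH_F_LAST_CTXT)
            n++;
    return n;
}
```

A completion handler that relies on the flag can then safely treat the flagged context as the one that owns the page list for the whole RPC.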
     

25 Mar, 2008

1 commit

  • The iWARP protocol limits RDMA read requests to a single scatter
    entry. NFS/RDMA has code in rdma_read_max_sge() that is supposed to
    limit the sge_count for RDMA read requests to 1, but the code to do
    that is inside an #ifdef RDMA_TRANSPORT_IWARP block. In the mainline
    kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a
    preprocessor #define, so the #ifdef'ed code is never compiled.

    In my test of a kernel build with -j8 on an NFS/RDMA mount, this
    problem eventually leads to trouble starting with:

    svcrdma: Error posting send = -22
    svcrdma : RDMA_READ error = -22

    and things go downhill from there.

    The trivial fix is to delete the #ifdef guard. The check seems to be
    a remnant of when the NFS/RDMA code was not merged and needed to
    compile against multiple kernel versions, although I don't think it
    ever worked as intended. In any case now that the code is upstream
    there's no need to test whether the RDMA_TRANSPORT_IWARP constant is
    defined or not.

    Without this patch, my kernel build on an NFS/RDMA mount using NetEffect
    adapters quickly and 100% reproducibly failed with an error like:

    ld: final link failed: Software caused connection abort

    With the patch applied I was able to complete a kernel build on the
    same setup.

    (Tom Tucker says this is "actually an _ancient_ remnant when it had to
    compile against iWARP vs. non-iWARP enabled OFA trees.")

    Signed-off-by: Roland Dreier
    Acked-by: Tom Tucker
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Linus Torvalds

    Roland Dreier
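
The pitfall described in the commit above is easy to reproduce in isolation: the preprocessor knows nothing about enumerators, so an `#ifdef` on one is always false. The sketch below uses hypothetical `sketch_` names and an assumed function signature, not the real rdma_read_max_sge():

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's transport-type enum. */
enum sketch_transport {
    SKETCH_TRANSPORT_IB,
    SKETCH_TRANSPORT_IWARP,
};

/* Returns 1 if the preprocessor can see the enumerator, 0 otherwise. */
static int sketch_ifdef_sees_enum(void)
{
#ifdef SKETCH_TRANSPORT_IWARP
    return 1;   /* never compiled: enumerators are not macros */
#else
    return 0;
#endif
}

/* A run-time test works no matter how the constant is defined. */
static int sketch_max_read_sge(enum sketch_transport t, int device_max)
{
    return (t == SKETCH_TRANSPORT_IWARP) ? 1 : device_max;
}
```

With the guard removed, the iWARP limit of a single scatter entry is applied by an ordinary comparison instead of being silently compiled out.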
     

18 Mar, 2008

4 commits


13 Mar, 2008

2 commits

  • The assertion that checks for sge context overflow is
    incorrectly hard-coded to 32. This causes a kernel bug
    check when using big-data mounts. Changed the BUG_ON to
    use the computed value RPCSVC_MAXPAGES.

    Signed-off-by: Tom Tucker
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Linus Torvalds

    Tom Tucker
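
The difference between a hard-coded bound and a computed one, as discussed in the commit above, can be sketched like this. The page size, payload size, and formula are illustrative assumptions only; the kernel defines RPCSVC_MAXPAGES itself, and the exact comparison used in the BUG_ON is not shown in the commit message:

```c
#include <assert.h>

/* Illustrative values only; the kernel computes RPCSVC_MAXPAGES itself. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_MAXPAYLOAD  (1024u * 1024u)
#define SKETCH_MAXPAGES \
    ((SKETCH_MAXPAYLOAD + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE + 2)

/* Returns 1 when an sge count fits the context, 0 when it would overflow. */
static int sketch_sge_count_ok(unsigned int count)
{
    return count <= SKETCH_MAXPAGES;
}
```

Under these assumed values a request needing 33 entries, which a fixed limit of 32 would reject, is well within the computed bound.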
     
  • RDMA connection shutdown on an SMP machine can cause a kernel crash due
    to the transport close path racing with the I/O tasklet.

    Additional transport references were added as follows:
    - A reference when on the DTO Q to avoid having the transport
    deleted while queued for I/O.
    - A reference while there is a QP able to generate events.
    - A reference until the DISCONNECTED event is received on the CM ID

    Signed-off-by: Tom Tucker
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Linus Torvalds

    Tom Tucker
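
The reference-counting idea in the commit above can be sketched with C11 atomics. This is a simplified illustration with hypothetical `sketch_` names; the kernel uses its own reference-counting primitives and the real teardown is more involved:

```c
#include <assert.h>
#include <stdatomic.h>

struct sketch_xprt {
    atomic_int refcount;
    int freed;          /* set when the last reference is dropped */
};

/* Take an extra reference, e.g. while queued for I/O. */
static void sketch_xprt_get(struct sketch_xprt *x)
{
    atomic_fetch_add(&x->refcount, 1);
}

/* Drop one reference; tear down only when none remain. */
static void sketch_xprt_put(struct sketch_xprt *x)
{
    if (atomic_fetch_sub(&x->refcount, 1) == 1)
        x->freed = 1;
}
```

Each holder listed in the commit message (the DTO queue entry, the live QP, the pending DISCONNECTED event) would take its own reference, so the close path dropping its reference cannot free the transport while any of them is still outstanding.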
     

08 Mar, 2008

2 commits


06 Mar, 2008

1 commit


29 Feb, 2008

1 commit


22 Feb, 2008

1 commit

  • Sorry for the noise, but here's the v3 of this compilation fix :)

    Some places declare a char buf[...] on the stack in order to pass it
    to dprintk() later. Since dprintk() sometimes (when CONFIG_SYSCTL=n)
    becomes an empty do { } while (0) stub, these buffers end up unused
    and cause gcc to produce warnings.

    Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
    compile them out when not needed.

    Signed-off-by: Pavel Emelyanov
    Acked-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust

    Pavel Emelyanov
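
The technique from the commit above, compiling a debug-only buffer out together with the debug print, can be sketched as follows. `SKETCH_DEBUG`, `SKETCH_IFDEBUG`, `sketch_dprintk`, and `sketch_handle` are hypothetical names modeled on the description, not the kernel macros themselves:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch; define SKETCH_DEBUG to compile the debug path in. */
#ifdef SKETCH_DEBUG
# define SKETCH_IFDEBUG(x)   x
# define sketch_dprintk(...) printf(__VA_ARGS__)
#else
# define SKETCH_IFDEBUG(x)
# define sketch_dprintk(...) do { } while (0)
#endif

static int sketch_handle(int port)
{
    /* The buffer exists only when something will actually read it. */
    SKETCH_IFDEBUG(char buf[32];)

    SKETCH_IFDEBUG(snprintf(buf, sizeof(buf), "port %d", port);)
    sketch_dprintk("handling %s\n", buf);
    return port;
}
```

When debugging is disabled, the declaration, the formatting call, and the print all vanish, so the compiler never sees an unused variable.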
     

15 Feb, 2008

2 commits

  • * Add path_put() functions for releasing a reference to the dentry and
    vfsmount of a struct path in the right order

    * Switch from path_release(nd) to path_put(&nd->path)

    * Rename dput_path() to path_put_conditional()

    [akpm@linux-foundation.org: fix cifs]
    Signed-off-by: Jan Blunck
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Steven French
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
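
The release ordering that path_put() encapsulates can be illustrated with mock objects. This sketch assumes the dentry is released before the mount it lives on; all `sketch_` names are hypothetical and no real reference counting is performed:

```c
#include <assert.h>

/* Mock objects: each records the order in which it was released. */
struct sketch_dentry   { int released_at; };
struct sketch_vfsmount { int released_at; };

struct sketch_path {
    struct sketch_vfsmount *mnt;
    struct sketch_dentry *dentry;
};

static int sketch_clock;  /* increments on every release */

static void sketch_dput(struct sketch_dentry *d)
{
    d->released_at = ++sketch_clock;
}

static void sketch_mntput(struct sketch_vfsmount *m)
{
    m->released_at = ++sketch_clock;
}

/* Release the dentry first, then the mount it lives on. */
static void sketch_path_put(struct sketch_path *path)
{
    sketch_dput(path->dentry);
    sketch_mntput(path->mnt);
}
```

Putting both releases behind one helper means callers can no longer get the order wrong by hand.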
     
  • This is the central patch of a cleanup series. In most cases there is no good
    reason why someone would want to use a dentry on its own. This series reflects
    that fact and embeds a struct path into nameidata.

    Together with the other patches of this series
    - it enforces the correct order of getting/releasing the reference count on
    pairs
    - it prepares the VFS for stacking support since it is essential to have a
    struct path in every place where the stack can be traversed
    - it reduces the overall code size:

    without patch series:
    text data bss dec hex filename
    5321639 858418 715768 6895825 6938d1 vmlinux

    with patch series:
    text data bss dec hex filename
    5320026 858418 715768 6894212 693284 vmlinux

    This patch:

    Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix cifs]
    [akpm@linux-foundation.org: fix smack]
    Signed-off-by: Jan Blunck
    Signed-off-by: Andreas Gruenbacher
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Cc: Casey Schaufler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     

14 Feb, 2008

1 commit

  • Use updated file list for docbook files and
    fix kernel-doc warnings in sunrpc:
    Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:689): No description found for parameter 'rpc_client'
    Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:765): No description found for parameter 'flags'
    Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:584): No description found for parameter 'tk_ops'
    Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:618): No description found for parameter 'bufsize'

    Signed-off-by: Randy Dunlap
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Neil Brown
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

11 Feb, 2008

1 commit


02 Feb, 2008

17 commits

  • Clean up: When looping over RPC version and procedure numbers, use
    unsigned index variables.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Chuck Lever
     
  • Do it for the server code...

    Signed-off-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    Trond Myklebust
     
  • Move the initialization in __svc_create_thread that happens prior to
    thread creation to a new function. Export the function to allow
    services to have better control over the svc_rqst structs.

    Also rearrange the rqstp initialization to prevent NULL pointer
    dereferences in svc_exit_thread in case allocations fail.

    Signed-off-by: Jeff Layton
    Reviewed-by: NeilBrown
    Signed-off-by: J. Bruce Fields

    Jeff Layton
     
  • Add the svcrdma module to the xprtrdma makefile.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This logic parses the ONCRDMA protocol headers that
    precede the actual RPC header. It is placed in a separate
    file to keep all protocol aware code in a single place.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This file implements the RDMA transport sendto function. An RPC reply
    on an RDMA transport consists of some number of RDMA_WRITE requests
    followed by an RDMA_SEND request. The sendto function parses the
    ONCRPC RDMA reply header to determine how to send the reply back to
    the client. The send queue is sized so as to be able to send complete
    replies for requests in most cases. In the event that there are not
    enough SQ WR slots to reply, e.g. big data, the send will block the
    NFSD thread. The I/O callback functions in svc_rdma_transport.c that
    reap WR completions wake any waiters blocked on the SQ. In general,
    the goal is not to block NFSD threads and the has_wspace method
    stalls requests when the SQ is nearly full.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This file implements the RDMA transport recvfrom function. The function
    dequeues work request completion contexts from an I/O list that it shares
    with the I/O tasklet in svc_rdma_transport.c. For ONCRPC RDMA, an RPC may
    not be complete when it is received. Instead, the RDMA header that precedes
    the RPC message informs the transport where to get the RPC data from on
    the client and where to place it in the RPC message before it is delivered
    to the server. The svc_rdma_recvfrom function therefore parses this RDMA
    header and issues any necessary RDMA operations to fetch the remainder of
    the RPC from the client.

    Special handling is required when the request involves an RDMA_READ.
    In this case, recvfrom submits the RDMA_READ requests to the underlying
    transport driver and then returns 0. When the transport
    completes the last RDMA_READ for the request, it enqueues it on a
    read completion queue and enqueues the transport. The recvfrom code
    favors this queue over the regular DTO queue when satisfying reads.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
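
The queue preference described in the commit above (completed reads are served before newly received data) can be sketched with two singly linked queues. The `sketch_` names are hypothetical, and the locking that the real code needs is deliberately left out:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_ctxt {
    struct sketch_ctxt *next;
    int id;
};

struct sketch_queue {
    struct sketch_ctxt *head;
};

/* Remove and return the first context, or NULL if the queue is empty. */
static struct sketch_ctxt *sketch_pop(struct sketch_queue *q)
{
    struct sketch_ctxt *c = q->head;

    if (c != NULL)
        q->head = c->next;
    return c;
}

/* Prefer completed reads; fall back to newly received data. */
static struct sketch_ctxt *sketch_next_ctxt(struct sketch_queue *read_complete_q,
                                            struct sketch_queue *dto_q)
{
    struct sketch_ctxt *c = sketch_pop(read_complete_q);

    return c != NULL ? c : sketch_pop(dto_q);
}
```

Serving the read completion queue first lets requests that have already paid for their RDMA_READ round trips finish before new work is started.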
     
  • This file implements the core transport data management and I/O
    path. The I/O path for RDMA involves receiving callbacks in interrupt
    context. Since all the svc transport locks are _bh locks, we enqueue the
    transport on a list and schedule a tasklet to dequeue data indications from
    the RDMA completion queue. The tasklet in turn takes _bh locks to
    enqueue receive data indications on a list for the transport. The
    svc_rdma_recvfrom transport function dequeues data from this list in an
    NFSD thread context.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This file implements the RDMA transport module initialization and
    termination logic and registers the transport sysctl variables.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • Create a transport independent version of the svc_sock_names function.

    The toclose capability of the svc_sock_names service can be implemented
    using the svc_xprt_find and svc_xprt_close services.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • Update the write handler for the portlist file to allow creating new
    listening endpoints on a transport. The general form of the string is:

    For example:

    echo "tcp 2049" > /proc/fs/nfsd/portlist

    This is intended to support the creation of a listening endpoint for
    RDMA transports without adding #ifdef code to the nfssvc.c file.

    Transports can also be removed as follows:

    '-'

    For example:

    echo "-tcp 2049" > /proc/fs/nfsd/portlist

    Attempting to add a listener with an invalid transport string results
    in EPROTONOSUPPORT and a perror string of "Protocol not supported".

    Attempting to remove a non-existent listener (e.g. bad proto or port)
    results in ENOTCONN and a perror string of
    "Transport endpoint is not connected"

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
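
A user-space sketch of how strings shaped like the two examples in the commit above ("tcp 2049" and "-tcp 2049") might be parsed is shown below. It is not the nfsd write handler; the `sketch_` names, the 15-character name limit, the port range check, and the assumption that only "tcp" is a known transport are all made up for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing one portlist line. */
struct sketch_portreq {
    int remove;          /* 1 if the line started with '-' */
    char transport[16];
    int port;
};

/* Parse lines shaped like "tcp 2049" or "-tcp 2049"; 0 or -EINVAL. */
static int sketch_parse_portlist(const char *line, struct sketch_portreq *req)
{
    memset(req, 0, sizeof(*req));
    if (line[0] == '-') {
        req->remove = 1;
        line++;
    }
    if (sscanf(line, "%15s %d", req->transport, &req->port) != 2)
        return -EINVAL;
    if (req->port < 1 || req->port > 65535)
        return -EINVAL;
    return 0;
}

/* Only "tcp" is known to this sketch; anything else is unsupported. */
static int sketch_add_listener(const struct sketch_portreq *req)
{
    return strcmp(req->transport, "tcp") == 0 ? 0 : -EPROTONOSUPPORT;
}
```

Keeping the transport as a string is what lets a new transport such as RDMA be added without touching the parser.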
     
  • Add a new svc function that allows a service to query whether a
    transport instance has already been created. This is used in lockd
    to determine whether or not a transport needs to be created when
    a lockd instance is brought up.

    Specifying 0 for the address family or port is effectively a wild-card,
    and will result in matching the first transport in the service's list
    that has a matching class name.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
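
The wildcard matching described in the commit above can be sketched as a linear search in which 0 matches any address family or port. The structure layout, the `sketch_` names, and the function signature are assumptions for illustration, not the kernel interface:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_xprt_entry {
    const char *class_name;
    int family;            /* 0 is never a real family in this sketch */
    unsigned short port;
};

/*
 * Return the first entry whose class name matches, with 0 acting as a
 * wildcard for family and port; NULL if nothing matches.
 */
static const struct sketch_xprt_entry *
sketch_find_xprt(const struct sketch_xprt_entry *list, size_t n,
                 const char *class_name, int family, unsigned short port)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i].class_name, class_name) != 0)
            continue;
        if (family != 0 && list[i].family != family)
            continue;
        if (port != 0 && list[i].port != port)
            continue;
        return &list[i];
    }
    return NULL;
}
```

A caller that only wants to know whether any transport of a given class exists passes 0 for both family and port.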
     
  • Add a file that when read lists the set of registered svc
    transports.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • Some transports have a header in front of the RPC header. The current
    defer/revisit processing considers only the iov_len and arg_len to
    determine how much to back up when saving the original request
    to revisit. Add a field to the rqstp structure to save the size
    of the transport header so svc_defer can correctly compute
    the start of a request.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This functionally trivial patch moves all of the transport independent
    functions from the svcsock.c file to the transport independent svc_xprt.c
    file.

    In addition the following formatting changes were made:
    - White space cleanup
    - Function signatures on single line
    - The inline directive was removed
    - Lines over 80 columns were reformatted
    - The term 'socket' was changed to 'transport' in comments
    - The SMP comment was moved and updated.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • The svc_check_conn_limits function only manipulates xprt fields. Change references
    to svc_sock->sk_xprt to svc_xprt directly.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker
     
  • This functionally empty patch removes rq_sock and the unnamed union
    from the rqstp structure.

    Signed-off-by: Tom Tucker
    Acked-by: Neil Brown
    Reviewed-by: Chuck Lever
    Reviewed-by: Greg Banks
    Signed-off-by: J. Bruce Fields

    Tom Tucker