17 Sep, 2016

1 commit


03 Aug, 2016

5 commits

  • The use of the specific opcode test is redundant since
    all ack entry users correctly manipulate the mr pointer
    to selectively trigger the reference clearing.

    The overly specific test hinders the use of implementation
    specific operations.

    The change needs to get rid of the union to ensure that
    an atomic value is not seen as an MR pointer.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Ira Weiny
    Signed-off-by: Doug Ledford

    Ira Weiny
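
A minimal sketch of the idea in this commit, assuming a simplified ack entry layout. The type and function names (`sketch_mr`, `sketch_ack_entry`, `sketch_release_ack_entry`) are hypothetical and are not the rdmavt definitions. With the atomic value and the MR pointer held in separate fields, the release path can key off the pointer alone:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory region with a reference count. */
struct sketch_mr {
    int refcount;
};

/*
 * Ack queue entry with separate fields instead of a union, so an
 * atomic result can never be misread as an MR pointer.
 */
struct sketch_ack_entry {
    uint8_t opcode;
    uint64_t atomic_data;
    struct sketch_mr *mr;   /* NULL unless the entry holds a reference */
};

/* Drop the MR reference if one is held; no opcode test is needed. */
void sketch_release_ack_entry(struct sketch_ack_entry *e)
{
    if (e->mr) {
        e->mr->refcount--;
        e->mr = NULL;
    }
}
```

Because the decision depends only on `mr`, an implementation specific opcode can reuse the entry without touching the release path.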
     
  • Hanging has been observed while writing a file over NFSoRDMA. Dmesg on
    the server contains messages like these:

    [ 931.992501] svcrdma: Error -22 posting RDMA_READ
    [ 952.076879] svcrdma: Error -22 posting RDMA_READ
    [ 982.154127] svcrdma: Error -22 posting RDMA_READ
    [ 1012.235884] svcrdma: Error -22 posting RDMA_READ
    [ 1042.319194] svcrdma: Error -22 posting RDMA_READ

    Here is why:

    With the base memory management extension enabled, FRMR is used instead
    of FMR. The xprtrdma server issues each RDMA read request as the following
    bundle:

    (1) IB_WR_REG_MR, signaled;
    (2) IB_WR_RDMA_READ, signaled;
    (3) IB_WR_LOCAL_INV, signaled & fencing.

    These requests are signaled. In order to generate completion, the fast
    register work request is processed by the hfi1 send engine after being
    posted to the work queue, and the corresponding lkey is not valid until
    the request is processed. However, the rdmavt driver validates lkey when
    the RDMA read request is posted and thus it fails immediately with error
    -EINVAL (-22).

    This patch changes the work flow of local operations (fast register and
    local invalidate) so that fast register work requests are always
    processed immediately to ensure that the corresponding lkey is valid
    when subsequent work requests are posted. Local invalidate requests are
    processed immediately if fencing is not required and no previous local
    invalidate request is pending.

    To allow completion generation for signaled local operations that have
    been processed before posting to the work queue, an internal send flag
    RVT_SEND_COMPLETION_ONLY is added. The hfi1 send engine checks this flag
    and only generates completion for such requests.

    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Jianxin Xiong
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Jianxin Xiong
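
The work flow described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the rdmavt code: the `sketch_` names, the flag values, and the QP fields are hypothetical, and `SKETCH_SEND_COMPLETION_ONLY` only models the role of RVT_SEND_COMPLETION_ONLY.

```c
#include <stdbool.h>

#define SKETCH_SEND_SIGNALED        0x1u
#define SKETCH_SEND_FENCE           0x2u
#define SKETCH_SEND_COMPLETION_ONLY 0x4u  /* models RVT_SEND_COMPLETION_ONLY */

enum sketch_op { SKETCH_REG_MR, SKETCH_LOCAL_INV };

struct sketch_lqp {
    bool lkey_valid;            /* true once a fast register was processed */
    unsigned int pending_inv;   /* queued, not yet processed local invalidates */
    unsigned int queued;        /* entries placed on the send queue */
    unsigned int last_flags;    /* flags of the most recently queued entry */
};

/* Returns true if the operation was processed at post time. */
bool sketch_post_local(struct sketch_lqp *qp, enum sketch_op op,
                       unsigned int flags)
{
    bool now;

    if (op == SKETCH_REG_MR)
        now = true;     /* always immediate, so later posts see a valid lkey */
    else
        now = !(flags & SKETCH_SEND_FENCE) && qp->pending_inv == 0;

    if (now) {
        qp->lkey_valid = (op == SKETCH_REG_MR);
        if (!(flags & SKETCH_SEND_SIGNALED))
            return true;                        /* nothing left to queue */
        flags |= SKETCH_SEND_COMPLETION_ONLY;   /* engine only completes it */
    } else {
        qp->pending_inv++;      /* the send engine will do the invalidate */
    }
    qp->queued++;
    qp->last_flags = flags;
    return now;
}
```

In this model a signaled fast register is both processed at post time and queued, so the send engine has an entry to complete without processing the registration a second time.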
     
  • This fix allows for support of in-kernel reserved operations
    without impacting the ULP user.

    The low level driver can register a non-zero value which
    will be transparently added to the send queue size and hidden
    from the ULP in every respect.

    ULP post sends will never see a full queue due to a reserved
    post send and reserved operations will never exceed that
    registered value.

    The s_avail will continue to track the ULP swqe availability
    and the difference between the reserved value and the reserved
    in use will track reserved availability.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
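
The accounting described in this commit might look roughly like the sketch below. The struct and function names are hypothetical; only `s_avail` is a name taken from the commit message. The queue is sized as the ULP request plus the reserved value, and the two budgets are tracked independently.

```c
struct sketch_sq {
    unsigned int size;           /* ULP entries plus reserved entries */
    unsigned int s_avail;        /* entries still available to the ULP */
    unsigned int reserved_max;   /* value registered by the low level driver */
    unsigned int reserved_used;  /* reserved entries currently in use */
};

void sketch_sq_init(struct sketch_sq *sq, unsigned int ulp_entries,
                    unsigned int reserved)
{
    sq->size = ulp_entries + reserved;  /* extra slots are hidden from the ULP */
    sq->s_avail = ulp_entries;
    sq->reserved_max = reserved;
    sq->reserved_used = 0;
}

/* Returns 0 on success, -1 if the relevant budget is exhausted. */
int sketch_sq_post(struct sketch_sq *sq, int is_reserved)
{
    if (is_reserved) {
        if (sq->reserved_used >= sq->reserved_max)
            return -1;
        sq->reserved_used++;
        return 0;
    }
    if (sq->s_avail == 0)
        return -1;
    sq->s_avail--;
    return 0;
}
```

A reserved post never consumes `s_avail`, so the ULP still gets every entry it asked for.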
     
  • Some work requests are local operations, such as IB_WR_REG_MR and
    IB_WR_LOCAL_INV. They differ from non-local operations in that:

    (1) Local operations can be processed immediately without being posted
    to the send queue if neither fencing nor completion generation is needed.
    However, to ensure correct ordering, once a local operation is posted to
    the work queue due to fencing or completion requirement, all subsequent
    local operations must also be posted to the work queue until all the
    local operations on the work queue have completed.

    (2) Local operations don't send packets over the wire and thus don't
    need (and shouldn't update) the packet sequence numbers.

    Define a new flag bit for the post send table to identify local
    operations.

    Add a new field to the QP structure to track the number of local
    operations on the send queue to determine if direct processing of new
    local operations should be enabled/disabled.

    Reviewed-by: Mike Marciniszyn
    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Jianxin Xiong
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Jianxin Xiong
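
A rough sketch of the ordering rule in (1) and the sequence number rule in (2). The flag name `SKETCH_OP_LOCAL`, the field `local_ops_pending`, and the functions are hypothetical, and one packet per non-local request is assumed for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OP_LOCAL 0x1u  /* hypothetical table flag marking a local operation */

struct sketch_oqp {
    unsigned int local_ops_pending;  /* local operations on the send queue */
    uint32_t next_psn;               /* next packet sequence number */
};

/* Returns true if processed directly, false if posted to the send queue. */
bool sketch_post(struct sketch_oqp *qp, unsigned int op_flags,
                 bool fence, bool signaled)
{
    if (op_flags & SKETCH_OP_LOCAL) {
        if (!fence && !signaled && qp->local_ops_pending == 0)
            return true;            /* direct processing, queue untouched */
        qp->local_ops_pending++;    /* later local operations must queue too */
        return false;               /* the PSN is deliberately not advanced */
    }
    qp->next_psn++;                 /* assumption: one packet per request */
    return false;
}

/* Called when a queued local operation completes. */
void sketch_local_op_done(struct sketch_oqp *qp)
{
    qp->local_ops_pending--;
}
```

Direct processing resumes only after the counter drains back to zero, which keeps local operations in posting order.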
     
  • Add flexibility for driver dependent operations in post send
    because different drivers will have differing post send
    operation support.

    This includes data structure definitions to support a table
    driven scheme along with the necessary validation routine
    using the new table.

    Reviewed-by: Ashutosh Dixit
    Reviewed-by: Jianxin Xiong
    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
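
A table driven validation routine of the kind this commit describes could be sketched as below. The enum values, table contents, and names are all hypothetical stand-ins; the actual table layout in rdmavt may differ.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_qp_type { SKETCH_QPT_UD, SKETCH_QPT_UC, SKETCH_QPT_RC };
enum sketch_opcode {
    SKETCH_WR_SEND, SKETCH_WR_RDMA_WRITE, SKETCH_WR_RDMA_READ, SKETCH_WR_MAX
};

#define SKETCH_BIT(n) (1u << (n))

/* One entry per opcode; a zero mask means the driver lacks the operation. */
struct sketch_op_params {
    unsigned int qpt_support;   /* bitmask of QP types that accept the opcode */
};

/* Example table that a driver might supply. */
const struct sketch_op_params sketch_table[SKETCH_WR_MAX] = {
    [SKETCH_WR_SEND] = { SKETCH_BIT(SKETCH_QPT_UD) | SKETCH_BIT(SKETCH_QPT_UC) |
                         SKETCH_BIT(SKETCH_QPT_RC) },
    [SKETCH_WR_RDMA_WRITE] = { SKETCH_BIT(SKETCH_QPT_UC) |
                               SKETCH_BIT(SKETCH_QPT_RC) },
    [SKETCH_WR_RDMA_READ] = { SKETCH_BIT(SKETCH_QPT_RC) },
};

/* Returns 0 if the opcode is valid for this QP type, -EINVAL otherwise. */
int sketch_validate(const struct sketch_op_params *table, size_t entries,
                    enum sketch_qp_type qpt, unsigned int opcode)
{
    if (opcode >= entries)
        return -EINVAL;
    if (!(table[opcode].qpt_support & SKETCH_BIT(qpt)))
        return -EINVAL;
    return 0;
}
```

Each driver passes its own table, so the common code needs no per-driver opcode checks.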
     

27 May, 2016

1 commit

  • rdmavt allows the driver to specify the size of the ack queue, but
    only uses it for the modify QP limit testing for setting the atomic
    limit value.

    The driver dependent size is now used to size the s_ack_queue ring
    dynamically.

    Since the driver knows its size, the driver will use its define
    for any ring size dependent code.

    Reviewed-by: Mitko Haralanov
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
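
A user-space sketch of sizing the ring from a driver-supplied value. Only `s_ack_queue` is a name from the commit message; the other names are hypothetical, and `calloc` stands in for whatever allocator the kernel code uses.

```c
#include <stdlib.h>

struct sketch_ack { unsigned int psn; };

struct sketch_aqp {
    struct sketch_ack *s_ack_queue;  /* allocated at QP creation */
    unsigned int ack_queue_size;     /* driver dependent ring size */
};

/* Allocate the ring using the size the driver registered. */
int sketch_qp_init(struct sketch_aqp *qp, unsigned int driver_size)
{
    if (driver_size == 0)
        return -1;
    qp->s_ack_queue = calloc(driver_size, sizeof(*qp->s_ack_queue));
    if (!qp->s_ack_queue)
        return -1;
    qp->ack_queue_size = driver_size;
    return 0;
}

/* Advance a ring index, wrapping at the driver dependent size. */
unsigned int sketch_ack_next(const struct sketch_aqp *qp, unsigned int i)
{
    return (i + 1 == qp->ack_queue_size) ? 0 : i + 1;
}

void sketch_qp_free(struct sketch_aqp *qp)
{
    free(qp->s_ack_queue);
    qp->s_ack_queue = NULL;
}
```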
     

29 Apr, 2016

1 commit

  • The RVT_S_WAIT_PIO_DRAIN flag was missing from
    the set of flags indicating a qp is waiting
    on a resource.

    This caused the sleep/wakeup for adaptive pio
    drain to lose a wakeup, "hanging" a QP.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
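
To illustrate why membership in the combined mask matters, here is a small sketch. The flag values and the `SKETCH_` names are hypothetical; only the role of RVT_S_WAIT_PIO_DRAIN comes from the commit message.

```c
#define SKETCH_S_WAIT_TX        0x01u
#define SKETCH_S_WAIT_PIO       0x02u
#define SKETCH_S_WAIT_PIO_DRAIN 0x04u   /* models the flag that was left out */

/* Every flag that means "this QP is waiting on a resource". */
#define SKETCH_S_ANY_WAIT \
    (SKETCH_S_WAIT_TX | SKETCH_S_WAIT_PIO | SKETCH_S_WAIT_PIO_DRAIN)

/* Code that tests for a waiting QP sees only flags present in the mask. */
int sketch_is_waiting(unsigned int s_flags)
{
    return (s_flags & SKETCH_S_ANY_WAIT) != 0;
}
```

If the drain flag were missing from the mask, a QP waiting only on pio drain would be reported as not waiting.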
     

18 Mar, 2016

1 commit


11 Mar, 2016

15 commits

  • The change requires a new pio_busy field in the iowait structure to
    track the number of outstanding pios. The new counter together
    with the sdma counter serve as the basis for a packet by packet decision
    as to which egress mechanism to use. Since packets given to different
    egress mechanisms are not ordered, this scheme will preserve the order.

    The iowait drain/wait mechanisms are extended for a pio case. An
    additional qp wait flag is added for the PIO drain wait case.

    Currently the only pio wait is for buffers, so the no_bufs_available()
    routine name is changed to pio_wait() and a third argument is passed
    with one of the two pio wait flags to generalize the routine. A module
    parameter is added to hold a configurable threshold. For now, the
    module parameter is zero.

    A heuristic routine is added to return the func pointer of the proper
    egress routine to use.

    The heuristic is as follows:
    - SMI always uses pio
    - GSI,UD qps threshold use sdma
      o No coordination with sdma is required because order is not required
        and this qp pio count is not maintained for UD
    - RC/UC ONLY packets threshold use SDMA
      o If pio's are pending the pio_wait with the new wait flag is
        called to delay for pios to drain

    The threshold is potentially reduced by the QP's mtu.

    The sc_buffer_alloc() has two additional args (a callback, a void *)
    which are exploited by the RC/UC cases to pass a new complete routine
    and a qp *.

    When the shadow ring completes the credit associated with a packet,
    the new complete routine is called. The verbs_pio_complete() will then
    decrement the busy count and trigger any drain waiters in qp destroy
    or reset.

    Reviewed-by: Jubin John
    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
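
The egress decision can be sketched as a pure function. This is a guess at the shape of the heuristic, not the hfi1 routine. It assumes that packets longer than the threshold go to SDMA and shorter ones to pio, and that short RC/UC packets stay on SDMA while sdma work is pending. Neither assumption is spelled out in the message above, and all names are hypothetical.

```c
enum sketch_qp_type { SKETCH_SMI, SKETCH_GSI, SKETCH_UD, SKETCH_UC, SKETCH_RC };
enum sketch_egress { SKETCH_USE_PIO, SKETCH_USE_SDMA, SKETCH_WAIT_PIO_DRAIN };

enum sketch_egress sketch_pick_egress(enum sketch_qp_type type,
                                      unsigned int len,
                                      unsigned int threshold,
                                      unsigned int mtu,
                                      unsigned int pio_busy,
                                      unsigned int sdma_busy)
{
    if (type == SKETCH_SMI)
        return SKETCH_USE_PIO;          /* SMI always uses pio */

    if (type == SKETCH_GSI || type == SKETCH_UD)
        /* Order is not required, so the busy counters are ignored. */
        return len > threshold ? SKETCH_USE_SDMA : SKETCH_USE_PIO;

    /* RC/UC: the threshold is potentially reduced by the QP's mtu. */
    if (mtu < threshold)
        threshold = mtu;

    if (len > threshold)
        /* Outstanding pios must drain before switching to SDMA. */
        return pio_busy ? SKETCH_WAIT_PIO_DRAIN : SKETCH_USE_SDMA;

    /* Assumed: stay on SDMA while sdma work is pending, to keep order. */
    return sdma_busy ? SKETCH_USE_SDMA : SKETCH_USE_PIO;
}
```

With a threshold of zero, every RC/UC packet with a non-zero length takes the SDMA branch in this model.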
     
  • pahole noted the wasted 4 bytes after s_lock and r_lock.

    Move s_flags and r_psn to fill the holes.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
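
The effect of such a reordering can be reproduced with two small structs. The layouts are hypothetical: a 4-byte integer stands in for the lock, and `s_head_ptr` is an invented 8-byte member. Only `s_lock` and `s_flags` are names from the commit message, and the exact sizes depend on the ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Before: a 4-byte lock followed by an 8-byte-aligned member. */
struct sketch_before {
    uint32_t s_lock;       /* stand-in for a 4-byte lock */
                           /* 4-byte hole on typical 64-bit ABIs */
    uint64_t s_head_ptr;   /* hypothetical 8-byte member */
    uint32_t s_flags;
                           /* plus tail padding on typical 64-bit ABIs */
};

/* After: the 4-byte field is moved up to fill the hole. */
struct sketch_after {
    uint32_t s_lock;
    uint32_t s_flags;
    uint64_t s_head_ptr;
};

/* Bytes saved by the reordering on the current ABI. */
size_t sketch_bytes_saved(void)
{
    return sizeof(struct sketch_before) - sizeof(struct sketch_after);
}
```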
     
  • Remove exported functions which are no longer required as the
    functionality has moved into rdmavt. This also requires re-ordering some
    of the functions since their prototype no longer appears in a header
    file. Rather than add forward declarations it is just cleaner to
    re-order some of the functions.

    Reviewed-by: Jubin John
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • This patch adds an additional lock to reduce contention on the s_lock.

    This lock is used in post_send() so that the post_send is not
    serialized with the send engine and other send related processing.

    To do this the s_next_psn is now maintained on post_send() while
    post_send() related fields are moved to a new cache line. There is
    an s_avail maintained for the post_send() to mitigate trading cache
    lines with the send engine. The lock is released/acquired around
    releasing the just built packet to the egress mechanism.

    Reviewed-by: Jubin John
    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Dean Luick
    Signed-off-by: Harish Chegondi
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Ira Weiny
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
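
The role of s_avail can be sketched with a single-threaded model. Locking and memory barriers are omitted on purpose. `s_avail` is a name from the commit message; the other field names, the function names, and the one-slot-empty ring convention are assumptions. The poster consults the send engine's index only when its cached count runs out.

```c
struct sketch_ring {
    unsigned int size;     /* ring slots; one slot is always kept empty */
    unsigned int s_head;   /* written by post_send only */
    unsigned int s_last;   /* written by the send engine only */
    unsigned int s_avail;  /* cached free-slot count owned by post_send */
};

/* Returns 0 if a slot was claimed, -1 if the queue is full. */
int sketch_ring_post(struct sketch_ring *r)
{
    if (r->s_avail == 0) {
        /* Only now touch the send engine's cache line and recompute. */
        unsigned int used = (r->s_head + r->size - r->s_last) % r->size;

        r->s_avail = r->size - 1 - used;
        if (r->s_avail == 0)
            return -1;
    }
    r->s_avail--;
    r->s_head = (r->s_head + 1) % r->size;
    return 0;
}

/* The send engine retires the oldest entry. */
void sketch_ring_complete(struct sketch_ring *r)
{
    r->s_last = (r->s_last + 1) % r->size;
}
```

While `s_avail` is non-zero, a post reads nothing that the send engine writes.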
     
  • A busy_jiffies variable is maintained and updated when rc qps are
    created and deleted. busy_jiffies is a scaled value of the number
    of rc qps in the device. busy_jiffies is incremented every rc qp
    scaling interval. busy_jiffies is added to the rc timeout
    in add_retry_timer and mod_retry_timer. The rc qp scaling interval
    is selected based on extensive performance evaluation of targeted
    workloads.

    Reviewed-by: Dennis Dalessandro
    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Vennila Megavannan
    Signed-off-by: Jubin John
    Signed-off-by: Doug Ledford

    Vennila Megavannan
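
A sketch of the scaling described above. The interval value of 5 is a placeholder, since the message does not state the chosen interval, and the struct and function names are hypothetical. Only `busy_jiffies` is taken from the commit message.

```c
#define SKETCH_RC_QP_SCALING_INTERVAL 5u  /* placeholder, not the tuned value */

struct sketch_dev {
    unsigned int n_rc_qps;        /* rc qps currently in the device */
    unsigned long busy_jiffies;   /* scaled value of n_rc_qps */
};

void sketch_update_busy(struct sketch_dev *dev)
{
    /* One extra jiffy for every full scaling interval of rc qps. */
    dev->busy_jiffies = dev->n_rc_qps / SKETCH_RC_QP_SCALING_INTERVAL;
}

void sketch_rc_qp_created(struct sketch_dev *dev)
{
    dev->n_rc_qps++;
    sketch_update_busy(dev);
}

void sketch_rc_qp_destroyed(struct sketch_dev *dev)
{
    dev->n_rc_qps--;
    sketch_update_busy(dev);
}

/* Expiry used when arming or modifying the retry timer. */
unsigned long sketch_retry_expiry(const struct sketch_dev *dev,
                                  unsigned long now,
                                  unsigned long timeout_jiffies)
{
    return now + timeout_jiffies + dev->busy_jiffies;
}
```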
     
  • The field is a vestige from ipath.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
     
  • Update all files added by rdmavt which do not yet have 2016 as the
    copyright year.

    Reviewed-by: Ira Weiny
    Reviewed-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • s_sde should be in the low level driver QP private data.

    Remove the definition from rvt_qp.

    Signed-off-by: Ira Weiny
    Signed-off-by: Doug Ledford

    Ira Weiny
     
  • This patch adds in the multicast add and remove functions as well as the
    ancillary infrastructure needed.

    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • Add modify qp and supporting functions.

    Reviewed-by: Mike Marciniszyn
    Reviewed-by: Ira Weiny
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • Add in a post_send and post_one_send to rdmavt. The ULP will provide a WQE
    to rdmavt which will then walk and queue each element. Rdmavt will then
    queue the work to be done in the driver or kick the driver's progress
    routine.

    There needs to be a follow on patch which adds in another lock for the
    head of the queue so that it can be added to and read from in parallel.
    This will touch protocol handlers and require other changes in the
    drivers. This will be done separately.

    Reviewed-by: Mike Marciniszyn
    Reviewed-by: Ira Weiny
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
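
The walk over a chain of work requests might be sketched as follows. The types are reduced to the bare minimum and every name is hypothetical. The `kicked` field stands in for scheduling or calling the driver's progress routine, and the fixed ring of four entries is chosen only to keep the example short.

```c
#include <stddef.h>

struct sketch_wr {
    struct sketch_wr *next;   /* next work request in the chain */
    int id;
};

struct sketch_pqp {
    int ring[4];              /* tiny fixed-size send queue */
    unsigned int count;
    int kicked;               /* set once the progress routine is kicked */
};

/* Queue a single work request; -1 means the queue is full. */
int sketch_post_one_send(struct sketch_pqp *qp, const struct sketch_wr *wr)
{
    if (qp->count == 4)
        return -1;
    qp->ring[qp->count++] = wr->id;
    return 0;
}

/* Walk the chain; on failure report the first request not queued. */
int sketch_post_send(struct sketch_pqp *qp, struct sketch_wr *wr,
                     struct sketch_wr **bad_wr)
{
    unsigned int posted = 0;
    int err = 0;

    for (; wr; wr = wr->next) {
        err = sketch_post_one_send(qp, wr);
        if (err) {
            *bad_wr = wr;
            break;
        }
        posted++;
    }
    if (posted)
        qp->kicked = 1;   /* anything queued still needs the send engine */
    return err;
}
```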
     
  • Until all queue pair functionality is moved to rdmavt, we need to provide
    access to the reset function. This is only temporary and will be reverted
    to a static, non-exported function in the end.

    Reviewed-by: Ira Weiny
    Reviewed-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • Use the flags originally provided for hfi1 in the rdmavt driver. These will
    be made available to drivers in the qp header file.

    Reviewed-by: Harish Chegondi
    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • Add table init as well as teardown for handling qpn maps. Drivers can still
    provide this functionality by setting the QP_INIT_DRIVER bit.

    Reviewed-by: Ira Weiny
    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro
     
  • Until all functionality is moved over to rdmavt, drivers still need to
    access a number of fields in data structures that are predominantly
    meant to be used by rdmavt. Once these rdmavt_.h header
    files are no longer being touched by drivers their content should be
    moved to rdmavt/.h. While here move a couple #defines
    over to more general IB verbs header files because they fit better.

    Reviewed-by: Ira Weiny
    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Dennis Dalessandro