29 Aug, 2017

25 commits

  • There are currently 3 spots in the qib and hfi1 driver that have
    knowledge of the internal QP hash list that should only be in
    scope to rdmavt QP code.

    Add an iterator API for processing all QPs to hide the
    nature of the RCU hashlist.

    The API consists of:
    - rvt_qp_iter_init()
    * For iterating QPs one at a time for seq_file semantics
    - rvt_qp_iter_next()
    * For iterating QPs one at a time for seq_file semantics
    - rvt_qp_iter()
    * For iterating all QPs

    The first two are used for things like seq_file prints.

    The last is for code that just needs to iterate all QPs
    in the system.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
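
    The callback style of rvt_qp_iter() described above can be sketched in
    userspace as follows. This is a minimal illustration, not the rdmavt
    code: every demo_* name, the table layout, and the bucket count are
    assumptions, and the RCU locking the real hash list needs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKETS 4u /* hypothetical table size */

/* Hypothetical stand-in for a QP; not the rdmavt struct rvt_qp. */
struct demo_qp {
    uint32_t qpn;
    struct demo_qp *next; /* hash chain link, private to the table code */
};

struct demo_qp_table {
    struct demo_qp *bucket[DEMO_BUCKETS];
};

/* Insert at the head of the chain selected by the QP number. */
static void demo_qp_insert(struct demo_qp_table *t, struct demo_qp *qp)
{
    unsigned int n = qp->qpn % DEMO_BUCKETS;

    qp->next = t->bucket[n];
    t->bucket[n] = qp;
}

/* Visit every QP once; callers never see buckets or chain pointers. */
static void demo_qp_iter(struct demo_qp_table *t, uint64_t v,
                         void (*cb)(struct demo_qp *qp, uint64_t v))
{
    assert(t != NULL && cb != NULL);
    for (unsigned int n = 0; n < DEMO_BUCKETS; n++)
        for (struct demo_qp *qp = t->bucket[n]; qp; qp = qp->next)
            cb(qp, v);
}

/* Example callback: v carries the address of a running counter. */
static void demo_count_cb(struct demo_qp *qp, uint64_t v)
{
    (void)qp;
    (*(unsigned int *)(uintptr_t)v)++;
}
```

    A driver-side caller only supplies a callback and an opaque value, so
    the hash list layout can change without touching the callers.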
     
  • The qp_stats print will soon be moving to rdmavt, so use the proper
    accessor to get the ring size rather than a driver supplied constant.

    Fixes: Commit ff8d836efe06 ("IB/hfi1: Add receiving queue info to qp_stats")
    Reviewed-by: Kaike Wan
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kaike Wan
     
  • Replace 'strcpy' with 'strncpy' to restrict the number
    of bytes copied to the buffer.

    Reviewed-by: Michael J. Ruhl
    Signed-off-by: Kamenee Arumugam
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kamenee Arumugam
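
    One detail worth keeping in mind with this kind of change: strncpy()
    does not write a terminating NUL when the source fills the buffer. A
    minimal sketch of a bounded copy that terminates explicitly, with a
    hypothetical helper name (the patch itself may handle this differently):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst without writing more than dst_size bytes.
 * strncpy() leaves dst unterminated when src is too long, so the
 * last byte is set explicitly.
 */
static void demo_bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```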
     
  • The hfi1_cdbg() macro can be instantiated in the hot path even when it
    is not in use. This shows up on perf profiles.

    Rework the macros (for SDMA and MMU), to use the trace interface directly
    to eliminate this performance hit.

    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Michael J. Ruhl
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Michael J. Ruhl
     
  • Currently, QSFP information is not queried
    in cases where loopback is set up and a QSFP module is
    present.

    Acquire QSFP information in case of loopback.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Jan Sokolowski
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Jan Sokolowski
     
  • Make some structures const as they are only used during a copy
    operation.

    Signed-off-by: Bhumika Goyal
    Signed-off-by: Doug Ledford

    Bhumika Goyal
     
  • vm_operations_struct instances are not supposed to change at runtime,
    and struct vm_area_struct already works with a const
    vm_operations_struct. So mark the non-const vm_operations_struct
    structs as const.

    Signed-off-by: Arvind Yadav
    Signed-off-by: Doug Ledford

    Arvind Yadav
     
  • Calling memset to assign 0 immediately after allocating memory with
    kzalloc is unnecessary, as kzalloc already returns zero-filled
    memory.

    Signed-off-by: Himanshu Jha
    Reviewed-by: Yuval Shaia
    Signed-off-by: Doug Ledford

    Himanshu Jha
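
    The same redundancy can be shown in userspace, where calloc() plays the
    role of kzalloc(). The struct and helper names below are hypothetical
    and only serve the illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_ctx {
    int id;
    char name[16];
};

/* calloc() returns zero-filled memory, so no memset() follows it. */
static struct demo_ctx *demo_ctx_alloc(void)
{
    return calloc(1, sizeof(struct demo_ctx)); /* NULL on failure */
}

/* Return 1 when every byte in [p, p + len) is zero. */
static int demo_all_zero(const void *p, size_t len)
{
    const unsigned char *b = p;

    assert(p != NULL);
    for (size_t i = 0; i < len; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```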
     
  • usnic_uiom_get_dev_list() can return ERR_PTR(-ENOMEM) so we should check
    for that.

    Fixes: e3cf00d0a87f ("IB/usnic: Add Cisco VIC low-level hardware driver")
    Signed-off-by: Dan Carpenter
    Reviewed-by: Yuval Shaia
    Signed-off-by: Doug Ledford

    Dan Carpenter
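
    For readers unfamiliar with the idiom, here is a userspace sketch of
    error-encoded pointers and the check a caller has to make. The demo_*
    helpers imitate the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); the
    producer function is a hypothetical stand-in, not
    usnic_uiom_get_dev_list().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long error)
{
    assert(error < 0 && error >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_dev_list_storage[1];

/* Hypothetical producer: returns a list, or an encoded -ENOMEM. */
static void *demo_get_dev_list(int simulate_oom)
{
    if (simulate_oom)
        return demo_err_ptr(-ENOMEM);
    return demo_dev_list_storage;
}

/* The caller must test for an error pointer before using the result. */
static int demo_caller(int simulate_oom)
{
    void *list = demo_get_dev_list(simulate_oom);

    if (demo_is_err(list))
        return (int)demo_ptr_err(list);
    return 0;
}
```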
     
  • These fields allow for debugging send engine processing.

    Reviewed-by: Kaike Wan
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
     
  • The rvt_ack_entry pointed to by s_tail_ack_queue provides important
    info about the request that has just been processed or is being processed
    on the responder side of a RC connection. This patch adds this info to
    the qp_stats to assist debugging.

    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Kaike Wan
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kaike Wan
     
  • Fix a tab alignment issue present in pr_err_ratelimited
    error message.

    Reviewed-by: Michael J. Ruhl
    Signed-off-by: Kamenee Arumugam
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kamenee Arumugam
     
  • Clean up user_sdma.c by moving the structure and MACRO definitions into
    the header file user_sdma.h

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
     
  • Clean up user_exp_rcv.c file by moving structure definitions into header
    file user_exp_rcv.h. Since these structure definitions depend on the
    structure definitions in mmu_rb.h, move #include "mmu_rb.h" above
    the include "user_exp_rcv.h" or include of header files that include
    user_exp_rcv.h

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
     
  • num_user_pages() function has been defined in both user_exp_rcv.c file
    and user_sdma.c file. Move the function definition to a header file so
    there is only one definition in the source repo.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
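
    A shared helper of this kind typically counts the pages touched by a
    byte range. The sketch below is an approximation written for userspace:
    the demo_* names and the 4 KiB page size are assumptions, and the
    actual hfi1 definition may differ.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12UL /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Number of pages spanned by the byte range [addr, addr + len). */
static inline unsigned long demo_num_user_pages(unsigned long addr,
                                                unsigned long len)
{
    assert(len > 0);

    const unsigned long spage = addr & DEMO_PAGE_MASK;
    const unsigned long epage = (addr + len - 1) & DEMO_PAGE_MASK;

    return 1 + ((epage - spage) >> DEMO_PAGE_SHIFT);
}
```

    Placing a static inline function like this in a header gives both .c
    files the same definition without a link-time conflict.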
     
  • In pin_vector_pages() function, if there is any error while pinning the
    pages or while adding a pinned buffer to the cache, the bail out code
    needs to unpin any pinned pages that are not in the cache and adjust the
    n_locked counter that counts the total pages pinned. The current bail
    out code gets this wrong in two cases:

    1. Before pinning required pages for a buffer, the SDMA pinned buffer
    cache is searched to see if the virtual address range that needs to be
    pinned is already pinned. If there isn't a hit in the cache, a new node
    is created for the buffer and is added to the cache after the buffer is
    pinned. If adding the new node to the cache fails, the n_locked count is
    decremented properly but the pinned pages are not freed. This commit
    fixes this issue.

    2. If there is a hit in the SDMA cache, but the cached buffer doesn't
    have enough pages to cover the entire address range that needs to be
    pinned, the node for the cached buffer is extracted from the cache,
    remaining pages needed are pinned and added to the node. The node is
    finally added back into the cache. If there is an error pinning the
    extra pages, the bail out code frees all the pages in the node but the
    n_locked count is not decremented by the number of pages in the node
    that are freed. This commit fixes this issue.

    This commit fixes the above two issues by creating a new function that
    frees the pages in a node and decrements the n_locked count by the
    number of pages freed.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
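
    The idea of a single helper that both frees a node's pages and adjusts
    the counter can be sketched in userspace as below. All names are
    hypothetical, and malloc()/free() stand in for pinning and unpinning;
    the hfi1 code works on real pinned user pages.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_node {
    void **pages;
    unsigned int npages;
};

struct demo_mm_acct {
    unsigned long n_locked; /* total pages currently "pinned" */
};

/* Free every page in the node and keep n_locked in step. */
static void demo_unpin_node(struct demo_mm_acct *acct, struct demo_node *node)
{
    assert(acct->n_locked >= node->npages);
    for (unsigned int i = 0; i < node->npages; i++)
        free(node->pages[i]); /* stand-in for unpinning one page */
    acct->n_locked -= node->npages;
    free(node->pages);
    node->pages = NULL;
    node->npages = 0;
}

/* "Pin" npages pages; on failure, undo through the same helper. */
static int demo_pin_node(struct demo_mm_acct *acct, struct demo_node *node,
                         unsigned int npages)
{
    assert(npages > 0);
    node->npages = 0;
    node->pages = calloc(npages, sizeof(*node->pages));
    if (!node->pages)
        return -1;
    for (unsigned int i = 0; i < npages; i++) {
        node->pages[i] = malloc(1);
        if (!node->pages[i]) {
            demo_unpin_node(acct, node); /* frees pages 0..i-1 */
            return -1;
        }
        node->npages++;
        acct->n_locked++;
    }
    return 0;
}
```

    Because every error path goes through demo_unpin_node(), the freed
    pages and the counter cannot drift apart.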
     
  • Clean up pin_vector_pages() function by moving page pinning related code
    to a separate function since it really stands on its own.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
     
  • user_sdma_send_pkts() function is unnecessarily long. Clean it up by
    moving some of its code into separate functions.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
     
  • Clean up hfi1_user_exp_rcv_setup function by moving page pinning and
    unpinning related code to separate functions. In order to reduce the
    number of parameters passed between functions, a new data structure
    struct tid_user_buf is defined and used.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Harish Chegondi
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Harish Chegondi
     
  • Performance analysis shows that the cache callback function
    sdma_kmem_cache_ctor contributes to 1/2 of the kmem_cache_allocs
    time.

    Since all of the fields in the allocated data structure are initialized
    in the code path, remove the _ctor function.

    Reviewed-by: Mike Marciniszyn
    Signed-off-by: Michael J. Ruhl
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Michael J. Ruhl
     
  • Ratelimit error prints from the sdma_interrupt function
    that could otherwise flood dmesg.

    Reviewed-by: Dennis Dalessandro
    Signed-off-by: Grzegorz Morys
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Grzegorz Morys
     
  • Added checking on index value of array 'guids' in qib_ruc.c.
    Pass in the correct size of the array for the memset operation in qib_mad.c.

    Reviewed-by: Michael J. Ruhl
    Signed-off-by: Kamenee Arumugam
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kamenee Arumugam
     
  • Remove all the memory allocation implemented for boardname and
    directly assign the defined string literal.

    Reviewed-by: Michael J. Ruhl
    Signed-off-by: Kamenee Arumugam
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Kamenee Arumugam
     
  • Section 9.7.7.2.5 of the 1.3 IBTA spec clearly says that receive
    credits should never apply to RDMA write.

    qib and hfi1 were doing that. The following situation will result
    in a QP hang:
    - A prior SEND or RDMA_WRITE with immediate consumed the last
    credit for a QP using RC receive buffer credits
    - The prior op is acked so there are no more acks
    - The peer ULP fails to post receive for some reason
    - An RDMA write sees that the credits are exhausted and waits
    - The peer ULP posts receive buffers
    - The ULP posts a send or RDMA write that will be hung

    The fix is to avoid the credit test for the RDMA write operation.

    Cc:
    Reviewed-by: Kaike Wan
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
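
    The rule can be sketched as a small predicate: only operations that
    consume a receive buffer on the responder are gated on credits. The
    opcode enum and function names are hypothetical simplifications, and
    the assumption that RDMA write with immediate stays credit-gated is
    drawn from the scenario above, not from the driver source.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_opcode {
    DEMO_SEND,
    DEMO_SEND_WITH_IMM,
    DEMO_RDMA_WRITE,
    DEMO_RDMA_WRITE_WITH_IMM,
};

/* True when the operation consumes a receive buffer at the responder. */
static bool demo_consumes_recv_credit(enum demo_opcode op)
{
    switch (op) {
    case DEMO_SEND:
    case DEMO_SEND_WITH_IMM:
    case DEMO_RDMA_WRITE_WITH_IMM:
        return true;
    case DEMO_RDMA_WRITE:
        return false;
    }
    assert(!"unknown opcode");
    return false;
}

/* A plain RDMA write is never held back by exhausted credits. */
static bool demo_may_send(enum demo_opcode op, unsigned int credits)
{
    if (!demo_consumes_recv_credit(op))
        return true;
    return credits > 0;
}
```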
     
  • hfi1 and qib were converted in previous patches, do the same for rdmavt.

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Dennis Dalessandro
    Signed-off-by: Doug Ledford

    Mike Marciniszyn
     

25 Aug, 2017

15 commits

  • Signed-off-by: Doug Ledford

    Doug Ledford
     
  • Expose enhanced multi packet WQE capability to user space through
    query_device by uhw.

    Signed-off-by: Bodong Wang
    Reviewed-by: Daniel Jurgens
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Bodong Wang
     
  • Set the field to allow posting multi packet send WQEs if hardware
    supports this feature. This doesn't mean the send WQEs will be for
    multi packet unless the send WQE was prepared according to multi
    packet send WQE format.

    User space shall use flag MLX5_IB_ALLOW_MPW to check if hardware
    supports MPW and allows MPW in SQ context.

    Signed-off-by: Bodong Wang
    Reviewed-by: Daniel Jurgens
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Bodong Wang
     
  • Set underlay QPN as part of flow rule when it's applicable.

    There is one root flow table in the NIC RX namespace and all the
    underlay QPs steer the traffic to this flow table.
    In order to prevent a QP from receiving traffic that is not targeted
    at its underlay QP, we need to set the underlay QP number as part of
    the steering matching.

    Note:
    When multicast traffic is sent, the QPN filtering is done by the firmware
    as an early step. Adding the QPN match on the flow table entry would be
    wrong in that case, as by that time the target QPN holds the multicast
    address (e.g. FF(s)) and it won't match.

    Signed-off-by: Yishai Hadas
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Yishai Hadas
     
  • Fix a bug where MR registration fails when mlx5_ib_cont_pages
    indicates that the MR can be mapped using 2GB pages (page_shift == 31).

    Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
    Signed-off-by: Ilya Lesokhin
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Ilya Lesokhin
     
  • In clean_mr error path the 'mr' should be freed.

    Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
    Signed-off-by: Kamal Heib
    Signed-off-by: Doug Ledford

    Kamal Heib
     
  • mlx5 compatible devices have two ways of populating the MTT
    table of an MKEY: using a FW command and using a UMR WQE.

    A UMR is much faster, so it should be used whenever possible.
    Unfortunately the code today uses UMR only if the MKEY was allocated
    from the MR cache.

    Fix the code to use UMR even for MKEYs that were allocated using
    a FW command.

    Signed-off-by: Ilya Lesokhin
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Ilya Lesokhin
     
  • This patch is the first step in decoupling UMR usage and
    allocation from the MR cache. The only functional change
    in this patch is to enable UMR for MRs created with
    reg_create.

    This change fixes a bug where ODP memory regions that
    were not allocated from the MR cache did not have UMR
    enabled.

    Signed-off-by: Ilya Lesokhin
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Ilya Lesokhin
     
  • Software parsing (SWP) is a feature that can be used to instruct the
    device to stop using its internal parser and to parse packets on the
    transmit path according to offsets set for each packet.

    Through this feature, the device allows the handling of checksum and
    LSO by the hardware according to the location of IP and TCP/UDP
    headers.

    Enable SW parsing on Raw Ethernet send queue by default if firmware
    supports it and report these capabilities to user space.

    Signed-off-by: Noa Osherovich
    Reviewed-by: Maor Gottlieb
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Noa Osherovich
     
  • None of the calls to i40iw_netdev_vlan_ipv6 are using mac so let's
    remove it from func's args-list.

    Signed-off-by: Yuval Shaia
    Signed-off-by: Doug Ledford

    Yuval Shaia
     
  • Trivial fix to spelling mistake in DP_ERR error message

    Signed-off-by: Colin Ian King
    Reviewed-by: Leon Romanovsky
    Reviewed-by: Ram Amrani
    Signed-off-by: Doug Ledford

    Colin Ian King
     
  • IB CM calls ib_modify_port() irrespective of link layer. If a
    failure is returned, the MAD agent gets unregistered for those
    devices. Recently, the modify_port() hook was removed from some of the
    low level drivers as it was always returning success. This breaks
    RDMA connection establishment over those devices.
    For Ethernet devices, Qkey violation and port capabilities are not
    applicable. So return success for RoCE when the modify_port hook is
    not implemented.

    Cc: Leon Romanovsky
    Signed-off-by: Selvin Xavier
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Selvin Xavier
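
    The resulting dispatch logic can be sketched as follows. The struct,
    field and function names are hypothetical simplifications of the core
    device structure, and the -ENOSYS value for non-RoCE ports without a
    hook is an assumption about the surrounding code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for an RDMA device. */
struct demo_device {
    bool is_roce; /* port runs RoCE (Ethernet link layer) */
    int (*modify_port)(struct demo_device *dev, unsigned int port);
};

/* Call the driver hook when present; otherwise succeed only for RoCE. */
static int demo_modify_port(struct demo_device *dev, unsigned int port)
{
    assert(dev != NULL);
    if (dev->modify_port)
        return dev->modify_port(dev, port);
    return dev->is_roce ? 0 : -ENOSYS;
}

/* Example driver hook that reports a failure. */
static int demo_failing_hook(struct demo_device *dev, unsigned int port)
{
    (void)dev;
    (void)port;
    return -EIO;
}
```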
     
  • Added support for two device caps - max_sge_rd, max_fast_reg_page_list_len
    and the IP_BASED_GIDS port cap flag.

    Reviewed-by: Jorgen Hansen
    Reviewed-by: Bryan Tan
    Reviewed-by: Aditya Sarwade
    Signed-off-by: Adit Ranadive
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Adit Ranadive
     
  • The driver version is bumped for compatibility purposes. Also, send the
    correct GID type when registering with the device. Added compatibility
    check macros for the device.

    Reviewed-by: Jorgen Hansen
    Reviewed-by: Aditya Sarwade
    Signed-off-by: Bryan Tan
    Signed-off-by: Adit Ranadive
    Reviewed-by: Leon Romanovsky
    Reviewed-by: Yuval Shaia
    Signed-off-by: Doug Ledford

    Bryan Tan
     
  • Adds support for ioctl callback in the RDMA netdevs to allow
    supporting functions not handled by the generic interface code.

    Signed-off-by: Feras Daoud
    Signed-off-by: Eitan Rabin
    Signed-off-by: Leon Romanovsky
    Reviewed-by: Yuval Shaia
    Signed-off-by: Doug Ledford

    Feras Daoud