15 Feb, 2017

2 commits

  • commit 647bf3d8a8e5777319da92af672289b2a6c4dc66 upstream.

    Update the range check to avoid an integer overflow in an edge case.
    Resolves CVE-2016-8636.

    Signed-off-by: Eyal Itkin
    Signed-off-by: Dan Carpenter
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eyal Itkin
     
  • commit 628f07d33c1f2e7bf31e0a4a988bb07914bd5e73 upstream.

    Update the response's resid field when larger than MTU, instead of only
    updating the local resid variable.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Eyal Itkin
    Signed-off-by: Dan Carpenter
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eyal Itkin
     

09 Feb, 2017

1 commit

  • commit b414fa01c31318383ae29d9d23cb9ca4184bbd86 upstream.

    The current QP FetchBurstMax value is 256B, which
    is incorrect since a WR can exceed that value. The
    result is a partial WR fetched by hardware and
    a fatal "bad WR" error posted by the SGE.

    So bump the FetchBurstMax to 512B.

    Signed-off-by: Steve Wise
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Steve Wise
     

01 Feb, 2017

8 commits

  • commit 2d4b21e0a2913612274a69a3ba1bfee4cffc6e77 upstream.

    On a UD QP, the completer tasklet is scheduled for each packet sent.

    If this is followed by a destroy_qp(), the kernel panics because
    the completer tries to operate on a destroyed QP.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Yonatan Cohen
     
  • commit f39f775218a7520e3700de2003c84a042c3b5972 upstream.

    The first argument of list_add_tail is the new item and the second
    is the head of the list. Fix the code to pass arguments in the
    right order, otherwise not all the rxe devices will be removed
    during teardown.

    Fixes: 8700e3e7c4857 ('Soft RoCE driver')
    Signed-off-by: Maor Gottlieb
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Maor Gottlieb
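
    To illustrate why the argument order matters, here is a small
    userspace stand-in for a circular doubly linked list. It is a sketch
    that only mirrors the (new item, list head) calling convention
    described above; list_init() and list_count() are hypothetical
    helpers, and none of this is the kernel's list.h.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for a circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert new_item just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *new_item, struct list_head *head)
{
	assert(new_item != NULL && head != NULL);
	new_item->prev = head->prev;
	new_item->next = head;
	head->prev->next = new_item;
	head->prev = new_item;
}

/* Count the entries reachable from head. */
static size_t list_count(const struct list_head *head)
{
	size_t n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

    In this sketch, calling list_add_tail(&dev_list, &item) with the
    arguments swapped re-links the head onto each new item, so earlier
    items are no longer reachable when walking from the head. A teardown
    loop that walks the list would then miss them.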
     
  • commit 828f6fa65ce7e80f77f5ab12942e44eb3d9d174e upstream.

    1. Release the pid before entering the ODP flow
    2. Release the pid when memory allocation fails

    Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
    Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions")
    Signed-off-by: Kenneth Lee
    Reviewed-by: Haggai Eran
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Kenneth Lee
     
  • commit c12a67fec8d99bb554e8d4e99120d418f1a39c87 upstream.

    Commit ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting
    the last deref") introduced a bug where the RDMA QP EQ queue memory
    (and QIDs) are possibly freed before the underlying connection has been
    fully shut down. The result is a possible DMA read issued by HW after
    the queue memory has been unmapped and freed. This results in possible
    WR corruption in the worst case, system bus errors if an IOMMU is in use,
    and SGE "bad WR" errors reported at the very least. The fix is to defer
    unmap/free of queue memory and QID resources until the QP struct has
    been fully dereferenced. To do this, the c4iw_ucontext must also be kept
    around until the last QP that references it is fully freed. In addition,
    since the last QP deref can happen in an IRQ disabled context, we need
    a new workqueue thread to do the final unmap/free of the EQ queue memory.

    Fixes: ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref")
    Signed-off-by: Steve Wise
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Steve Wise
     
  • commit 0a475ef4226e305bdcffe12b401ca1eab06c4913 upstream.

    After setting the indirect_sg_entries module parameter to a huge value
    (e.g. 500,000), srp_alloc_req_data() fails to allocate indirect
    descriptors for the request ring (kmalloc fails). This commit limits
    indirect_sg_entries to at most SG_MAX_SEGMENTS, as stated in the
    module parameter description.

    Fixes: 65e8617fba17 (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS)
    Fixes: c07d424d6118 (IB/srp: add support for indirect tables that don't fit in SRP_CMD)
    Signed-off-by: Israel Rukshin
    Signed-off-by: Max Gurtovoy
    Reviewed-by: Laurence Oberman
    Reviewed-by: Bart Van Assche
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Israel Rukshin
     
  • commit ad8e66b4a80182174f73487ed25fd2140cf43361 upstream.

    If the device supports arbitrary sg list mapping (device cap
    IB_DEVICE_SG_GAPS_REG set), we allocate the memory regions with
    IB_MR_TYPE_SG_GAPS.

    Fixes: 509c5f33f4f6 ("IB/srp: Prevent mapping failures")
    Signed-off-by: Israel Rukshin
    Signed-off-by: Max Gurtovoy
    Reviewed-by: Leon Romanovsky
    Reviewed-by: Mark Bloch
    Reviewed-by: Yuval Shaia
    Reviewed-by: Bart Van Assche
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Israel Rukshin
     
  • commit 1e5db6c31ade4150c2e2b1a21e39f776c38fea39 upstream.

    For devices that can register a page list that is bigger than
    USHRT_MAX, we actually take the wrong value for sg_tablesize.
    E.g. for CX4, max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX),
    so we set sg_tablesize to 0 by mistake. Therefore, each IO that is
    bigger than 4k is split into "< 4k" chunks, which causes performance
    degradation. Remove the wrong sg_tablesize assignment, and use the value
    that was set during the address resolution handler with the needed casting.

    Signed-off-by: Max Gurtovoy
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Max Gurtovoy
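
    The truncation described above (65536 becoming 0 in a 16-bit field)
    can be reproduced in a few lines. The sketch below is illustrative
    only: both function names are hypothetical, it assumes a 16-bit
    unsigned short, and clamping before the cast is one way to narrow
    safely, not necessarily what the patch itself does.

```c
#include <assert.h>
#include <limits.h>

/* Clamp a device-reported page-list length before narrowing it. */
static unsigned short clamp_sg_tablesize(unsigned int max_page_list_len)
{
	unsigned int capped = max_page_list_len < USHRT_MAX ?
			      max_page_list_len : USHRT_MAX;

	assert(capped <= USHRT_MAX);
	return (unsigned short)capped;
}

/* Plain narrowing: values above USHRT_MAX wrap modulo USHRT_MAX + 1. */
static unsigned short narrow_unchecked(unsigned int max_page_list_len)
{
	return (unsigned short)max_page_list_len;
}
```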
     
  • commit b4cfe3971f6eab542dd7ecc398bfa1aeec889934 upstream.

    If IPV6 has not been enabled in the underlying kernel, we must avoid
    calling IPV6 procedures in rdma_cm.ko.

    This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements
    surrounding any code which calls external IPV6 procedures.

    In the instance fixed here, procedure cma_bind_addr() called
    ipv6_addr_type() -- which resulted in calling external procedure
    __ipv6_addr_type().

    Fixes: 6c26a77124ff ("RDMA/cma: fix IPv6 address resolution")
    Cc: Spencer Baugh
    Signed-off-by: Jack Morgenstein
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
     

26 Jan, 2017

14 commits

  • commit 0b59970e7d96edcb3c7f651d9d48e1a59af3c3b0 upstream.

    Remove the warning print of "can't use of GFP_NOIO" to avoid a print on
    each QP creation when the device does not support
    IB_QP_CREATE_USE_GFP_NOIO.

    This print becomes more annoying when the IPoIB interface is configured
    to work in connected mode.

    Fixes: 09b93088d750 ('IB: Add a QP creation flag to use GFP_NOIO allocations')
    Signed-off-by: Kamal Heib
    Signed-off-by: Leon Romanovsky
    Reviewed-by: Yuval Shaia
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Kamal Heib
     
  • commit bf08e884bfd5be068fd2ccf2bc450f085d8dd853 upstream.

    Before reading the GRH attributes, make sure the AH contains a GRH,
    and in addition, initialize the GID type.

    Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution')
    Signed-off-by: Eran Ben Elisha
    Signed-off-by: Daniel Jurgens
    Reviewed-by: Mark Bloch
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eran Ben Elisha
     
  • commit 1f22e454df2eb99ba6b7ace3f594f6805cdf5cbc upstream.

    According to the firmware spec, FLOW_STEERING_IB_UC_QP_RANGE command is
    supported only if dmfs_ipoib bit is set.

    If it isn't set, we want to ensure that allocating NET_IF QPs fails. We
    do so by filling out the allocation bitmap. As a result, the NET_IF QP
    allocation function won't find any free QP and will fail.

    Fixes: c1c98501121e ('IB/mlx4: Add support for steerable IB UD QPs')
    Signed-off-by: Eran Ben Elisha
    Signed-off-by: Daniel Jurgens
    Reviewed-by: Mark Bloch
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eran Ben Elisha
     
  • commit 6fa26208206c406fa529cd73f7ae6bf4181e270b upstream.

    Report the correct speed in the port attributes when using a 56Gbps
    ethernet link. Without this change the field is incorrectly set to 10.

    Fixes: a9c766bb75ee ('IB/mlx4: Fix info returned when querying IBoE ports')
    Fixes: 2e96691c31ec ('IB: Use central enum for speed instead of hard-coded values')
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Yishai Hadas
    Signed-off-by: Daniel Jurgens
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Saeed Mahameed
     
  • commit befcabcd530e4ffb6f016638f693b7d94986d2ba upstream.

    If OpenSM runs over a ConnectX-3, and there are ConnectX-4 or Connect-IB
    VFs active on the network, the OpenSM will receive QP1 packets containing
    a GRH where the destination GID is the "Well-Known GID" -- which is not a
    GID in the HCA Port's GID Table.

    This GID must be tested-for separately -- and packets which contain
    this destination GID should be routed to slave 0 (the PF).

    Fixes: 37bfc7c1e83f ('IB/mlx4: SR-IOV multiplex and demultiplex MADs')
    Signed-off-by: Jack Morgenstein
    Signed-off-by: Daniel Jurgens
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
     
  • commit c482af646d0809a8d5e1b7f4398cce3592589b98 upstream.

    For non-special QPs, the port value becomes non-zero only at the
    RESET-to-INIT transition. If the QP has not undergone that transition,
    its port number value is still zero.

    If such a QP is destroyed before being moved out of the RESET state,
    subtracting one from the qp port number results in a negative value.
    Using that negative value as an index into the qp1_proxy array
    results in an out-of-bounds array reference.

    Fix this by testing that the QP type is one that uses qp1_proxy before
    using the port number. For special QPs of all types, the port number is
    specified at QP creation time.

    Fixes: 9433c188915c ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes")
    Signed-off-by: Jack Morgenstein
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Jack Morgenstein
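
    The guard described above (check the QP type before deriving an array
    index from the port number) can be sketched as follows. The types,
    the NUM_PORTS value, and proxy_slot() are hypothetical stand-ins, not
    the mlx4 definitions.

```c
#include <assert.h>
#include <stddef.h>

enum qp_kind { QP_REGULAR, QP_PROXY_GSI };  /* hypothetical QP types */

#define NUM_PORTS 2                          /* assumed port count */

struct qp {
	enum qp_kind kind;
	int port;  /* 1-based; stays 0 until the RESET-to-INIT transition */
};

/* Return the per-port proxy slot for this QP, or -1 if it has none. */
static int proxy_slot(const struct qp *qp)
{
	assert(qp != NULL);
	if (qp->kind != QP_PROXY_GSI)
		return -1;  /* never derive an index from port for other types */
	if (qp->port < 1 || qp->port > NUM_PORTS)
		return -1;
	return qp->port - 1;
}
```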
     
  • commit af4295c117b82a521b05d0daf39ce879d26e6cb1 upstream.

    Set traffic class within sl_tclass_flowlabel when creating an IBoE AH.
    Without this the TOS value will be empty when running VLAN tagged
    traffic, because the TOS value is taken from the traffic class in the
    address handle attributes.

    Fixes: 9106c4106974 ('IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE')
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Daniel Jurgens
    Reviewed-by: Mark Bloch
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Maor Gottlieb
     
  • commit acbda523884dcf45613bf6818d8ead5180df35c2 upstream.

    Before continuing the unload, wait until all pending asynchronous mkey
    creation requests are done.

    Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
    Signed-off-by: Eli Cohen
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eli Cohen
     
  • commit c73b7911de97fad3ab9032a110af48d6ab2da48f upstream.

    Move the SRQ type assignment to be before actually using it
    in create_srq_user() and in create_srq_kernel() functions.

    Fixes: af1ba291c5e4 ('{net, IB}/mlx5: Refactor internal SRQ API')
    Signed-off-by: Maor Gottlieb
    Reviewed-by: Majd Dibbiny
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Maor Gottlieb
     
  • commit 288c01b746aab484651391ca6d64b585d3eb5ec6 upstream.

    Add the 512 bytes limit of RDMA READ and the size of remote
    address to the max SGE calculation.

    Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
    Signed-off-by: Eli Cohen
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eli Cohen
     
  • commit afd02cd3a9b6c04b41d946b5d7f6e17b3fc30c6b upstream.

    When enabling many VFs, the total number of DMA mappings increases
    significantly. This causes DMA allocations to take a lot of time
    since they are serialized in the kernel.

    As a result, the driver enters a fatal condition due to a
    timeout and the system hangs. To recover from this, we disable
    the MR cache for VFs.

    PFs will still have a full cache and VFs cache can be manipulated
    as usual after driver load.

    Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
    Signed-off-by: Eli Cohen
    Signed-off-by: Maor Gottlieb
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Eli Cohen
     
  • commit a0fa72683e78979ef1123d679b1c40ae28bd9096 upstream.

    A race condition fix added an rxe_qp structure to the stack in order
    to be able to perform rollback in rxe_requester(), but the structure
    is large enough to trigger the warning for possible stack overflow:

    drivers/infiniband/sw/rxe/rxe_req.c: In function 'rxe_requester':
    drivers/infiniband/sw/rxe/rxe_req.c:757:1: error: the frame size of 2064 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

    This changes the rollback function to only save the psn inside
    the qp, which is the only field we access in the rollback_qp
    anyway.

    Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit d680ebed91e0b45c43ae03a880a0b43211096161 upstream.

    Increase limit of max CQE from 8K to 32K to allow demanding
    applications to work over SoftRoCE with same configuration
    as most RoCEv2 HW vendors have.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Yonatan Cohen
     
  • commit aa6aae38f7fb2c030f326a6dd10b58fff1851dfa upstream.

    A failure in ib_cache_setup_one() during ib_register_device()
    leaks the allocated memory.

    Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Leon Romanovsky
     

20 Jan, 2017

1 commit

  • commit 15f7e3c21b76598bc6e5816d2577ce843b2b963f upstream.

    Fix to return error code -ENOMEM from the __get_free_page() error
    handling case instead of 0, as done elsewhere in this function.

    Fixes: 05eb23893c2c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
    Signed-off-by: Wei Yongjun
    Acked-by: Steve Wise
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
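
    The error-code pattern above can be sketched in userspace as
    follows. The allocator hook, failing_alloc(), and setup_status_page()
    are hypothetical names used only so that a failure can be injected;
    the 4096-byte size is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator hook so a failure can be injected in a test. */
typedef void *(*alloc_fn)(size_t size);

static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Returns 0 on success or a negative errno value on failure. */
static int setup_status_page(alloc_fn alloc, void **page_out)
{
	assert(alloc != NULL && page_out != NULL);
	*page_out = alloc(4096);
	if (!*page_out)
		return -ENOMEM;  /* returning 0 here would hide the failure */
	return 0;
}
```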
     

09 Jan, 2017

5 commits

  • commit 91c42b72f8e8b45961ff05a05009b644e6316ca2 upstream.

    hw_stats is a pointer to i40_iw_dev_stats struct in i40iw_get_hw_stats().
    Use hw_stats and not &hw_stats in the memcpy to copy the i40iw device stats
    data into rdma_hw_stats counters.

    Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")

    Signed-off-by: Shiraz Saleem
    Signed-off-by: Faisal Latif
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Shiraz Saleem
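
    The difference between passing a pointer and passing the address of
    that pointer to memcpy() is shown in the sketch below. struct
    dev_stats, NUM_COUNTERS, and copy_stats() are hypothetical stand-ins
    for the driver's types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_COUNTERS 4

/* Hypothetical stand-in for a device statistics structure. */
struct dev_stats {
	uint64_t counters[NUM_COUNTERS];
};

/* Copy the counters that hw_stats points to into dst. */
static void copy_stats(uint64_t dst[NUM_COUNTERS],
		       const struct dev_stats *hw_stats)
{
	assert(dst != NULL && hw_stats != NULL);
	/*
	 * Correct: hw_stats is already a pointer to the data.
	 * Passing &hw_stats would copy the bytes of the pointer variable
	 * itself (and read past it), not the counters.
	 */
	memcpy(dst, hw_stats, sizeof(*hw_stats));
}
```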
     
  • commit e259934d4df7f99f2a5c2c4f074f6a55bd4b1722 upstream.

    A socket is associated with every QP by the rxe driver but sock_release()
    is never called. Add a call to sock_release() in rxe_qp_cleanup().

    Fixes: commit 8700e3e7c48A5 ("Add Soft RoCE driver")
    Signed-off-by: Bart Van Assche
    Cc: Moni Shoua
    Cc: Kamal Heib
    Cc: Amir Vadai
    Cc: Haggai Eran
    Reviewed-by: Moni Shoua
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit d3a2418ee36a59bc02e9d454723f3175dcf4bfd9 upstream.

    This patch prevents Coverity from complaining about an unchecked
    ib_find_pkey() return value.

    Fixes: commit 547af76521b3 ("IB/multicast: Report errors on multicast groups if P_key changes")
    Signed-off-by: Bart Van Assche
    Cc: Sean Hefty
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit 11b642b84e8c43e8597de031678d15c08dd057bc upstream.

    This patch avoids that Coverity reports the following:

    Using uninitialized value port_attr.state when calling printk

    Fixes: commit 94232d9ce817 ("IPoIB: Start multicast join process only on active ports")
    Signed-off-by: Bart Van Assche
    Cc: Erez Shitrit
    Reviewed-by: Leon Romanovsky
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit 2fe2f378dd45847d2643638c07a7658822087836 upstream.

    The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS
    (80) elements. Hence compare the array index with that value instead
    of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity
    reports the following:

    Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127).

    Fixes: commit b7ab0b19a85f ("IB/mad: Verify mgmt class in received MADs")
    Signed-off-by: Bart Van Assche
    Cc: Sean Hefty
    Reviewed-by: Hal Rosenstock
    Signed-off-by: Doug Ledford
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
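
    A bounds check against the array's own size, as described above,
    might look like the following sketch. The two limits (80 and 128) are
    taken from the commit message; struct class_table and
    lookup_method_table() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MGMT_CLASS       80  /* number of method_table slots */
#define IB_MGMT_MAX_METHODS 128  /* a different, larger limit */

struct class_table {
	void *method_table[MAX_MGMT_CLASS];
};

/* Look up a slot, rejecting any index outside the array itself. */
static void *lookup_method_table(const struct class_table *tbl,
				 unsigned int idx)
{
	assert(tbl != NULL);
	if (idx >= MAX_MGMT_CLASS)  /* not IB_MGMT_MAX_METHODS */
		return NULL;
	return tbl->method_table[idx];
}
```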
     

18 Nov, 2016

1 commit

  • Pull rdma fixes from Doug Ledford.
    "First round of -rc fixes.

    Due to various issues, I've been away and couldn't send a pull request
    for about three weeks. There were a number of -rc patches that built
    up in the meantime (some were there already from the early -rc
    stages). Obviously, there were way too many to send now, so I tried to
    pare the list down to the more important patches for the -rc cycle.

    Most of the code has had plenty of soak time at the various vendors'
    testing setups, so I doubt there will be another -rc pull request this
    cycle. I also tried to limit the patches to those with smaller
    footprints, so even though a shortlog is longer than I would like, the
    actual diffstat is mostly very small with the exception of just three
    files that had more changes, and a couple files with pure removals.

    Summary:
    - Misc Intel hfi1 fixes
    - Misc Mellanox mlx4, mlx5, and rxe fixes
    - A couple cxgb4 fixes"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
    iw_cxgb4: invalidate the mr when posting a read_w_inv wr
    iw_cxgb4: set *bad_wr for post_send/post_recv errors
    IB/rxe: Update qp state for user query
    IB/rxe: Clear queue buffer when modifying QP to reset
    IB/rxe: Fix handling of erroneous WR
    IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
    IB/mlx4: Fix create CQ error flow
    IB/mlx4: Check gid_index return value
    IB/mlx5: Fix NULL pointer dereference on debug print
    IB/mlx5: Fix fatal error dispatching
    IB/mlx5: Resolve soft lock on massive reg MRs
    IB/mlx5: Use cache line size to select CQE stride
    IB/mlx5: Validate requested RQT size
    IB/mlx5: Fix memory leak in query device
    IB/core: Avoid unsigned int overflow in sg_alloc_table
    IB/core: Add missing check for addr_resolve callback return value
    IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
    IB/cm: Mark stale CM id's whenever the mad agent was unregistered
    IB/uverbs: Fix leak of XRC target QPs
    IB/hfi1: Remove incorrect IS_ERR check
    ...

    Linus Torvalds
     

17 Nov, 2016

8 commits

  • Also, rearrange things a bit to have a common c4iw_invalidate_mr()
    function used everywhere that we need to invalidate.

    Fixes: 49b53a93a64a ("iw_cxgb4: add fast-path for small REG_MR operations")
    Signed-off-by: Steve Wise
    Signed-off-by: Doug Ledford

    Steve Wise
     
  • There are a few cases in c4iw_post_send() and c4iw_post_receive()
    where *bad_wr is not set when an error is returned. This can
    cause a crash if the application tries to use bad_wr.

    Signed-off-by: Steve Wise
    Signed-off-by: Doug Ledford

    Steve Wise
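
    The contract described above (every error return also reports which
    WR failed) can be sketched as follows. struct work_request, MAX_SGE,
    and this post_send() are hypothetical simplifications, not the
    cxgb4 or verbs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed work request: a singly linked chain. */
struct work_request {
	struct work_request *next;
	int num_sge;
};

#define MAX_SGE 4  /* assumed per-WR limit for this sketch */

/* Post a chain of WRs; on error, *bad_wr names the WR that failed. */
static int post_send(struct work_request *wr, struct work_request **bad_wr)
{
	int err = 0;

	assert(bad_wr != NULL);
	for (; wr; wr = wr->next) {
		if (wr->num_sge > MAX_SGE) {
			err = -EINVAL;
			break;
		}
		/* ... hand the WR to the hardware queue here ... */
	}
	if (err)
		*bad_wr = wr;  /* set on every error path */
	return err;
}
```

    Funnelling all failures through one exit point makes it harder to
    add a new error return that forgets to set *bad_wr.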
     
  • Doug Ledford
     
  • The method rxe_qp_error() transitions the QP to the error state
    and makes sure the QP is drained. However, it did not update
    the QP state for the user's query.

    This patch fixes this.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Yonatan Cohen
     
  • RXE resets the send-q only once in rxe_qp_init_req() when
    QP is created, but when the QP is reused after QP reset, the send-q
    holds previous garbage data.

    This garbage data wrongly fails CQEs that otherwise
    should have completed successfully.

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Yonatan Cohen
     
  • To correctly handle an erroneous WR, this fix does the following:
    1. Make sure the bad WQE causes a user completion event.
    2. Call rxe_completer to handle the erred WQE.

    Before the fix, when rxe_requester found a bad WQE, it changed its
    status to IB_WC_LOC_PROT_ERR and exited with 0 for non-RC QPs.

    If this was the 1st WQE then there would be no ACK to invoke the
    completer and this bad WQE would be stuck in the QP's send-q.

    On top of that the requester exiting with 0 caused rxe_do_task to
    endlessly invoke rxe_requester, resulting in a soft-lockup attached
    below.

    In case the WQE was not the 1st and rxe_completer did get a chance to
    handle the bad WQE, it did not cause a completion event since the WQE's
    IB_SEND_SIGNALED flag was not set.

    Setting WQE status to IB_SEND_SIGNALED is subject to IBA spec
    version 1.2.1, section 10.7.3.1 Signaled Completions.

    NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
    [] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe]
    [] lookup_mem+0x3c/0xc0 [rdma_rxe]
    [] copy_data+0x1c4/0x230 [rdma_rxe]
    [] rxe_requester+0x9d0/0x1100 [rdma_rxe]
    [] ? kfree_skbmem+0x5a/0x60
    [] rxe_do_task+0x89/0xf0 [rdma_rxe]
    [] rxe_run_task+0x12/0x30 [rdma_rxe]
    [] rxe_post_send+0x41a/0x550 [rdma_rxe]
    [] ? __kmalloc+0x182/0x200
    [] ? down_read+0x12/0x40
    [] ib_uverbs_post_send+0x532/0x540 [ib_uverbs]
    [] ? tcp_sendmsg+0x402/0xb80
    [] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs]
    [] ? inet_recvmsg+0x7e/0xb0
    [] ? sock_recvmsg+0x3d/0x50
    [] __vfs_write+0x37/0x140
    [] vfs_write+0xb2/0x1b0
    [] SyS_write+0x55/0xc0
    [] entry_SYSCALL_64_fastpath+0x1a/0xa

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Yonatan Cohen
     
  • Missing initialization of udp_tunnel_sock_cfg causes the following
    kernel panic when the kernel tries to execute gro_receive().

    While at it, we converted udp_port_cfg to use the same
    initialization scheme as udp_tunnel_sock_cfg.

    ------------[ cut here ]------------
    kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
    BUG: unable to handle kernel paging request at ffffffffa0588c50
    IP: [] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
    PGD 1c09067 PUD 1c0a063 PMD bb394067 PTE 80000000ad5e8163
    Oops: 0011 [#1] SMP
    Modules linked in: ib_rxe ip6_udp_tunnel udp_tunnel
    CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc3+ #2
    Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
    task: ffff880235e4e680 ti: ffff880235e68000 task.ti: ffff880235e68000
    RIP: 0010:[]
    [] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
    RSP: 0018:ffff880237343c80 EFLAGS: 00010282
    RAX: 00000000dffe482d RBX: ffff8800ae330900 RCX: 000000002001b712
    RDX: ffff8800ae330900 RSI: ffff8800ae102578 RDI: ffff880235589c00
    RBP: ffff880237343cb0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ae33e262
    R13: ffff880235589c00 R14: 0000000000000014 R15: ffff8800ae102578
    FS: 0000000000000000(0000) GS:ffff880237340000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffffffa0588c50 CR3: 0000000001c06000 CR4: 00000000000006e0
    Stack:
    ffffffff8160860e ffff8800ae330900 ffff8800ae102578 0000000000000014
    000000000000004e ffff8800ae102578 ffff880237343ce0 ffffffff816088fb
    0000000000000000 ffff8800ae330900 0000000000000000 00000000ffad0000
    Call Trace:

    [] ? udp_gro_receive+0xde/0x130
    [] udp4_gro_receive+0x10b/0x2d0
    [] inet_gro_receive+0x1d3/0x270
    [] dev_gro_receive+0x269/0x3b0
    [] napi_gro_receive+0x38/0x120
    [] mlx5e_handle_rx_cqe+0x27e/0x340 [mlx5_core]
    [] mlx5e_poll_rx_cq+0x66/0x6d0 [mlx5_core]
    [] mlx5e_napi_poll+0x8e/0x400 [mlx5_core]
    [] net_rx_action+0x160/0x380
    [] __do_softirq+0xd7/0x2c5
    [] irq_exit+0xf5/0x100
    [] do_IRQ+0x56/0xd0
    [] common_interrupt+0x8c/0x8c

    [] ? native_safe_halt+0x6/0x10
    [] default_idle+0x1e/0xd0
    [] arch_cpu_idle+0xf/0x20
    [] default_idle_call+0x3c/0x50
    [] cpu_startup_entry+0x323/0x3c0
    [] start_secondary+0x15c/0x1a0
    RIP [] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
    RSP
    CR2: ffffffffa0588c50
    ---[ end trace 489ee31fa7614ac5 ]---
    Kernel panic - not syncing: Fatal exception in interrupt
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt
    ------------[ cut here ]------------

    Fixes: 8700e3e7c485 ("Soft RoCE driver")
    Signed-off-by: Yonatan Cohen
    Reviewed-by: Moni Shoua
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Yonatan Cohen
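
    One initialization scheme that avoids stale callback pointers is a
    C designated initializer, which zero-initializes every member that
    is not named. The sketch below is only in the spirit of
    udp_tunnel_sock_cfg: struct tunnel_cfg, its fields, and the helper
    functions are hypothetical, and it is an assumption that the patch
    uses this exact form.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical config with optional callbacks; names are illustrative. */
struct tunnel_cfg {
	int encap_type;
	int (*encap_rcv)(void *pkt);
	void *(*gro_receive)(void *pkt);
	int (*gro_complete)(void *pkt);
};

static int demo_encap_rcv(void *pkt)
{
	(void)pkt;
	return 0;
}

/*
 * Build the config with a designated initializer: members that are not
 * named are zero-initialized, so unused callbacks are NULL rather than
 * whatever happened to be on the stack.
 */
static struct tunnel_cfg make_cfg(void)
{
	struct tunnel_cfg cfg = {
		.encap_type = 1,
		.encap_rcv  = demo_encap_rcv,
	};

	assert(cfg.gro_receive == NULL);
	return cfg;
}
```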
     
  • Currently, if ib_copy_to_udata fails, the CQ
    won't be deleted from the radix tree and the HW (HW2SW).

    Fixes: 225c7b1feef1 ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
    Signed-off-by: Matan Barak
    Signed-off-by: Daniel Jurgens
    Reviewed-by: Mark Bloch
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Doug Ledford

    Matan Barak