03 Jun, 2020

1 commit


18 Jul, 2019

4 commits

  • Since "rds_ib_free_frmr" and "rds_ib_free_frmr_list" simply put
    the FRMR memory segments on the "drop_list" or "free_list",
    and it is the job of "rds_ib_flush_mr_pool" to reap those entries
    by ultimately issuing an "IB_WR_LOCAL_INV" work-request,
    we need to trigger and then wait for all those memory segments
    attached to a particular connection to be fully released before
    we can move on to release the QP, CQ, etc.

    So we make "rds_ib_conn_path_shutdown" wait for one more
    atomic_t called "i_fastreg_inuse_count" that keeps track of how
    many FRWR memory segments are out there marked "FRMR_IS_INUSE"
    (and also wake_up rds_ib_ring_empty_wait, as they go away).

    Signed-off-by: Gerd Rausch
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Gerd Rausch
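
The counting scheme described above can be sketched in user-space C. This is only a model under stated assumptions: "struct conn_model" and the helper names are hypothetical, C11 atomics stand in for the kernel's atomic_t, and the "last one out" return value is an assumed way of deciding when a waiter should be woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-connection state. */
struct conn_model {
    atomic_int fastreg_inuse_count; /* models "i_fastreg_inuse_count" */
};

/* A memory segment enters the in-use state: account for it. */
static void frmr_mark_inuse(struct conn_model *ic)
{
    atomic_fetch_add(&ic->fastreg_inuse_count, 1);
}

/*
 * A memory segment leaves the in-use state.
 * Returns true when it was the last one, i.e. when a waiting
 * shutdown path should be woken up (assumption of this model).
 */
static bool frmr_mark_released(struct conn_model *ic)
{
    int prev = atomic_fetch_sub(&ic->fastreg_inuse_count, 1);

    assert(prev > 0); /* releasing more than was marked is a bug */
    return prev == 1;
}

/* Condition the shutdown path waits on before releasing QP, CQ, etc. */
static bool shutdown_may_proceed(struct conn_model *ic)
{
    return atomic_load(&ic->fastreg_inuse_count) == 0;
}
```

In this model the shutdown path would block until "shutdown_may_proceed" holds, and every release that returns true would trigger the wake-up.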
     
  • Fix a bug where fr_state first goes to FRMR_IS_STALE, because of a failure
    of operation IB_WR_LOCAL_INV, but then gets set back to "FRMR_IS_FREE"
    unconditionally, even though the operation failed.

    Signed-off-by: Gerd Rausch
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Gerd Rausch
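
A minimal user-space sketch of the state handling this fix describes. The state names come from the message above; "struct frmr_model", "frmr_complete" and the "wc_ok" flag are hypothetical simplifications and not the kernel's completion handler.

```c
#include <assert.h>
#include <stdbool.h>

enum frmr_state { FRMR_IS_FREE, FRMR_IS_INUSE, FRMR_IS_STALE };

/* Hypothetical, reduced model of an FRMR memory region. */
struct frmr_model {
    enum frmr_state fr_state;
    bool fr_inv; /* an invalidation request is outstanding */
};

/* Models the completion of a work request; wc_ok is false on failure. */
static void frmr_complete(struct frmr_model *frmr, bool wc_ok)
{
    /* A failed operation marks an in-use region as stale. */
    if (!wc_ok && frmr->fr_state == FRMR_IS_INUSE)
        frmr->fr_state = FRMR_IS_STALE;

    if (frmr->fr_inv) {
        /*
         * Only go back to FREE if the region is still INUSE,
         * that is, if the invalidation did not fail above.
         */
        if (frmr->fr_state == FRMR_IS_INUSE)
            frmr->fr_state = FRMR_IS_FREE;
        frmr->fr_inv = false;
    }
}
```

With the inner state check removed, a failed invalidation would end up as FRMR_IS_FREE, which is the bug described above.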
     
  • In order to:
    1) avoid a silly bouncing between "clean_list" and "drop_list"
    triggered by function "rds_ib_reg_frmr" as it releases frmr
    regions whose state is not "FRMR_IS_FREE" right away.

    2) prevent an invalid access error in a race from a pending
    "IB_WR_LOCAL_INV" operation with a teardown ("dma_unmap_sg", "put_page")
    and de-registration ("ib_dereg_mr") of the corresponding
    memory region.

    Signed-off-by: Gerd Rausch
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Gerd Rausch
     
  • In the context of FRMR (ib_frmr.c):

    Memory regions make it onto the "clean_list" via "rds_ib_flush_mr_pool",
    after the memory region has been posted for invalidation via
    "rds_ib_post_inv".

    At that point in time, "fr_state" may still be in state "FRMR_IS_INUSE",
    since the only place where "fr_state" transitions to "FRMR_IS_FREE"
    is in "rds_ib_mr_cqe_handler", which is triggered by a tasklet.

    So in case we notice that "fr_state != FRMR_IS_FREE" (see below),
    we wait for "fr_inv_done" to trigger, for a maximum of 10 msec.
    Then we check again, and only put the memory region onto the drop_list
    (via "rds_ib_free_frmr") in case the situation remains unchanged.

    This avoids the problem of memory-regions bouncing between "clean_list"
    and "drop_list" before they even have a chance to be properly invalidated.

    Signed-off-by: Gerd Rausch
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Gerd Rausch
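
The check, bounded wait, and re-check sequence can be sketched as follows. This is a user-space model only: the blocking wait on "fr_inv_done" is abstracted into a hypothetical callback ("wait_inv_done_fn") so the decision logic can be shown without a tasklet, and "check_clean_region" and the two example waiters are invented names.

```c
#include <assert.h>
#include <stdbool.h>

enum frmr_state { FRMR_IS_FREE, FRMR_IS_INUSE, FRMR_IS_STALE };
enum reuse_decision { REUSE_REGION, DROP_REGION };

/* Hypothetical, reduced model of an FRMR memory region. */
struct frmr_model {
    enum frmr_state fr_state;
};

/*
 * Hypothetical hook: blocks until the invalidation completes
 * or until timeout_ms milliseconds have passed.
 */
typedef void (*wait_inv_done_fn)(struct frmr_model *frmr,
                                 unsigned int timeout_ms);

/* Decide what to do with a region taken from the "clean_list". */
static enum reuse_decision check_clean_region(struct frmr_model *frmr,
                                              wait_inv_done_fn wait_inv_done)
{
    /* Give a pending invalidation up to 10 msec to complete. */
    if (frmr->fr_state != FRMR_IS_FREE)
        wait_inv_done(frmr, 10);

    /* Re-check: only an unchanged situation sends it to the drop_list. */
    return frmr->fr_state == FRMR_IS_FREE ? REUSE_REGION : DROP_REGION;
}

/* Example waiter: the completion arrives within the timeout. */
static void completion_arrives(struct frmr_model *frmr, unsigned int timeout_ms)
{
    (void)timeout_ms;
    frmr->fr_state = FRMR_IS_FREE;
}

/* Example waiter: nothing happens before the timeout expires. */
static void wait_times_out(struct frmr_model *frmr, unsigned int timeout_ms)
{
    (void)frmr;
    (void)timeout_ms;
}
```

The two example waiters exist only to exercise both outcomes deterministically; a real implementation would sleep on a wait queue.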
     

10 Jul, 2019

1 commit

  • This reverts commit 56012459310a1dbcc55c2dbf5500a9f7571402cb.

    RDS kept spinning inside function "rds_ib_post_reg_frmr", waiting for
    "i_fastreg_wrs" to be incremented:
        while (atomic_dec_return(&ibmr->ic->i_fastreg_wrs) <= 0) {
                atomic_inc(&ibmr->ic->i_fastreg_wrs);
                cpu_relax();
        }

    Looking at the original commit:

    commit 56012459310a ("RDS: IB: split the mr registration and
    invalidation path")

    In there, the "rds_ib_mr_cqe_handler" was changed in the following
    way:

    void rds_ib_mr_cqe_handler(struct rds_ib_connection *ic,
                               struct ib_wc *wc)
        if (frmr->fr_inv) {
                frmr->fr_state = FRMR_IS_FREE;
                frmr->fr_inv = false;
                atomic_inc(&ic->i_fastreg_wrs);
        } else {
                atomic_inc(&ic->i_fastunreg_wrs);
        }

    It looks like it's got it exactly backwards:

    Function "rds_ib_post_reg_frmr" keeps track of the outstanding
    requests via "i_fastreg_wrs".

    Function "rds_ib_post_inv" keeps track of the outstanding requests
    via "i_fastunreg_wrs" (post original commit). It also sets:
    frmr->fr_inv = true;

    However the completion handler "rds_ib_mr_cqe_handler" adjusts
    "i_fastreg_wrs" when "fr_inv" had been true, and adjusts
    "i_fastunreg_wrs" otherwise.

    The original commit was done in the name of performance:
    to remove the performance bottleneck

    No performance benefit could be observed with a fixed-up version
    of the original commit measured between two Oracle X7 servers,
    both equipped with Mellanox ConnectX-5 HCAs.

    The prudent course of action is to revert this commit.

    Signed-off-by: Gerd Rausch
    Signed-off-by: Santosh Shilimkar

    Gerd Rausch
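
The mismatch described above can be reproduced in a small user-space model. It assumes two credit counters named after "i_fastreg_wrs" and "i_fastunreg_wrs"; "struct conn_model", "try_take", "post_reg", "post_inv" and "complete" are hypothetical helpers, and a failed post returns false here instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-connection credit counters. */
struct conn_model {
    atomic_int fastreg_wrs;   /* credits for registration requests */
    atomic_int fastunreg_wrs; /* credits for invalidation requests */
};

/* Mirrors atomic_dec_return(): decrement and return the new value. */
static int dec_return(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) - 1;
}

/* Take one credit; undo and report failure if none is available. */
static bool try_take(atomic_int *credits)
{
    if (dec_return(credits) <= 0) {
        atomic_fetch_add(credits, 1);
        return false;
    }
    return true;
}

static bool post_reg(struct conn_model *ic) { return try_take(&ic->fastreg_wrs); }
static bool post_inv(struct conn_model *ic) { return try_take(&ic->fastunreg_wrs); }

/*
 * Completion of one request. was_inv tells which kind completed.
 * swapped == true models the handler described above, which
 * returns the credit to the opposite counter.
 */
static void complete(struct conn_model *ic, bool was_inv, bool swapped)
{
    bool credit_unreg = swapped ? !was_inv : was_inv;

    atomic_fetch_add(credit_unreg ? &ic->fastunreg_wrs : &ic->fastreg_wrs, 1);
}
```

With "swapped" set, every completed registration moves a credit to the invalidation counter, so the registration counter drains and a spinning caller would never make progress again.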
     

05 Feb, 2019

1 commit


17 Aug, 2018

2 commits

  • rdma.git merge resolution for the 4.19 merge window

    Conflicts:
    drivers/infiniband/core/rdma_core.c
    - Use the rdma code and revise with the new spelling for
    atomic_fetch_add_unless
    drivers/nvme/host/rdma.c
    - Replace max_sge with max_send_sge in new blk code
    drivers/nvme/target/rdma.c
    - Use the blk code and revise to use NULL for ib_post_recv when
    appropriate
    - Replace max_sge with max_recv_sge in new blk code
    net/rds/ib_send.c
    - Use the net code and revise to use NULL for ib_post_recv when
    appropriate

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • Resolve merge conflicts from the -rc cycle against the rdma.git tree:

    Conflicts:
    drivers/infiniband/core/uverbs_cmd.c
    - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
    - Merge removal of file->ucontext in for-next with new code in -rc
    drivers/infiniband/core/uverbs_main.c
    - for-next removed code from ib_uverbs_write() that was modified
    in for-rc

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

08 Aug, 2018

1 commit

  • Fix a static code checker warning:
    net/rds/ib_frmr.c:82 rds_ib_alloc_frmr() warn: passing zero to 'ERR_PTR'

    The error path for ib_alloc_mr failure should set err to PTR_ERR.

    Fixes: 1659185fb4d0 ("RDS: IB: Support Fastreg MR (FRMR) memory registration mode")
    Signed-off-by: YueHaibing
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    YueHaibing
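
Why passing zero to ERR_PTR is a problem can be shown with a user-space imitation of the kernel's error-pointer helpers. The three helpers below are simplified re-implementations, and "fake_alloc_mr", "alloc_frmr_buggy" and "alloc_frmr_fixed" are hypothetical stand-ins that only model the error path described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified user-space versions of the kernel's error-pointer helpers. */
static void *ERR_PTR(intptr_t error) { return (void *)error; }
static intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator standing in for ib_alloc_mr(). */
static void *fake_alloc_mr(bool fail)
{
    static int mr;

    return fail ? ERR_PTR(-ENOMEM) : (void *)&mr;
}

/* Error path without the fix: err is still 0 when it is returned. */
static void *alloc_frmr_buggy(bool fail)
{
    intptr_t err = 0;
    void *mr = fake_alloc_mr(fail);

    if (IS_ERR(mr))
        return ERR_PTR(err); /* ERR_PTR(0) is NULL, not an error pointer */
    return mr;
}

/* Error path with the fix: propagate the allocator's error code. */
static void *alloc_frmr_fixed(bool fail)
{
    void *mr = fake_alloc_mr(fail);

    if (IS_ERR(mr))
        return ERR_PTR(PTR_ERR(mr));
    return mr;
}
```

A caller that only checks IS_ERR() would treat the NULL from the buggy variant as a valid pointer.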
     

27 Jul, 2018

1 commit

  • Registration of a memory region (MR) through FRMR/fastreg (unlike FMR)
    needs a connection/qp. With a proxy qp, this dependency on connection
    will be removed, but that needs more infrastructure patches, which is a
    work in progress.

    As an intermediate fix, get_mr returns EOPNOTSUPP when connection
    details are not populated. The MR registration through sendmsg() will
    continue to work even with fast registration, since connection in this
    case is formed upfront.

    This patch fixes the following crash:
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN
    Modules linked in:
    CPU: 1 PID: 4244 Comm: syzkaller468044 Not tainted 4.16.0-rc6+ #361
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544
    RSP: 0018:ffff8801b059f890 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff8801b07e1300 RCX: ffffffff8562d96e
    RDX: 000000000000000d RSI: 0000000000000001 RDI: 0000000000000068
    RBP: ffff8801b059f8b8 R08: ffffed0036274244 R09: ffff8801b13a1200
    R10: 0000000000000004 R11: ffffed0036274243 R12: ffff8801b13a1200
    R13: 0000000000000001 R14: ffff8801ca09fa9c R15: 0000000000000000
    FS: 00007f4d050af700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f4d050aee78 CR3: 00000001b0d9b006 CR4: 00000000001606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    __rds_rdma_map+0x710/0x1050 net/rds/rdma.c:271
    rds_get_mr_for_dest+0x1d4/0x2c0 net/rds/rdma.c:357
    rds_setsockopt+0x6cc/0x980 net/rds/af_rds.c:347
    SYSC_setsockopt net/socket.c:1849 [inline]
    SyS_setsockopt+0x189/0x360 net/socket.c:1828
    do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x4456d9
    RSP: 002b:00007f4d050aedb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
    RAX: ffffffffffffffda RBX: 00000000006dac3c RCX: 00000000004456d9
    RDX: 0000000000000007 RSI: 0000000000000114 RDI: 0000000000000004
    RBP: 00000000006dac38 R08: 00000000000000a0 R09: 0000000000000000
    R10: 0000000020000380 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fffbfb36d6f R14: 00007f4d050af9c0 R15: 0000000000000005
    Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 cc 01 00 00 4c 8b bb 80 04 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7f 68 48 89 fa 48 c1 ea 03 3c 02 00 0f 85 9c 01 00 00 4d 8b 7f 68 48 b8 00 00 00 00 00
    RIP: rds_ib_get_mr+0x5c/0x230 net/rds/ib_rdma.c:544 RSP: ffff8801b059f890
    ---[ end trace 7e1cea13b85473b0 ]---

    Reported-by: syzbot+b51c77ef956678a65834@syzkaller.appspotmail.com
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Avinash Repaka

    Signed-off-by: David S. Miller

    Avinash Repaka
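
The intermediate fix amounts to an early guard, sketched here in user-space C. All names ("struct conn_model", "struct mr_model", "get_mr_model", the "use_fastreg" flag) are hypothetical; the sketch only illustrates returning EOPNOTSUPP instead of dereferencing missing connection details.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the connection and the memory region. */
struct conn_model { int qp_id; };
struct mr_model { int key; };

/*
 * Returns 0 and fills *out on success, or a negative errno.
 * Fast registration needs a connection, so refuse early when
 * none is available instead of dereferencing a NULL pointer.
 */
static int get_mr_model(const struct conn_model *conn, bool use_fastreg,
                        struct mr_model *out)
{
    if (use_fastreg && conn == NULL)
        return -EOPNOTSUPP;

    /* With a connection the key is derived from it; otherwise use 0. */
    out->key = conn ? conn->qp_id : 0;
    return 0;
}
```

A registration that arrives with the connection already formed passes a non-NULL connection and takes the normal path.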
     

25 Jul, 2018

2 commits


03 Jan, 2017

2 commits


14 May, 2016

1 commit


03 Mar, 2016

1 commit