28 Oct, 2020

1 commit

  • There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED: either the
    handler triggers a completion and another thread calls rdma_connect(), or
    the handler calls rdma_connect() directly.

    In all cases rdma_connect() needs to hold the handler_mutex, but when
    handlers are invoked the mutex is already held by the core code. This
    causes ULPs using the second method to deadlock.

    Provide a rdma_connect_locked() and have all ULPs call it from their
    handlers.

    Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
    Reported-and-tested-by: Guoqing Jiang
    Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
    Acked-by: Santosh Shilimkar
    Acked-by: Jack Wang
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Max Gurtovoy
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
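
The locking split described in this commit can be sketched in user space as follows. This is only an illustration: `toy_mutex`, `toy_cm_id`, `toy_connect()`, `toy_connect_locked()` and `toy_dispatch()` are hypothetical stand-ins, not the RDMA CM API, and the toy lock reports a would-be self-deadlock as an error instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: taking it while held models a self-deadlock. */
struct toy_mutex { bool held; };

/* Returns false where a real mutex would block forever on itself. */
static bool toy_lock(struct toy_mutex *m)
{
    if (m->held)
        return false;
    m->held = true;
    return true;
}

static void toy_unlock(struct toy_mutex *m) { m->held = false; }

struct toy_cm_id {
    struct toy_mutex handler_mutex;
    int connects;
};

/* Caller must already hold handler_mutex (the event-handler path). */
static int toy_connect_locked(struct toy_cm_id *id)
{
    assert(id->handler_mutex.held);
    id->connects++;
    return 0;
}

/* Takes handler_mutex itself (the "another thread" path). */
static int toy_connect(struct toy_cm_id *id)
{
    int ret;

    if (!toy_lock(&id->handler_mutex))
        return -1; /* a real mutex would deadlock here */
    ret = toy_connect_locked(id);
    toy_unlock(&id->handler_mutex);
    return ret;
}

/* Models the core code invoking a handler with handler_mutex held. */
static int toy_dispatch(struct toy_cm_id *id,
                        int (*handler)(struct toy_cm_id *))
{
    int ret;

    toy_lock(&id->handler_mutex);
    ret = handler(id);
    toy_unlock(&id->handler_mutex);
    return ret;
}
```

In this sketch, dispatching `toy_connect` from a handler fails because the lock is already held, while dispatching `toy_connect_locked` succeeds; calling `toy_connect` from outside a handler still works as before.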
     

10 Oct, 2020

1 commit

  • RDS/IB tries to refill the recv buffer in softirq context using the
    GFP_NOWAIT flag. However, an allocation failure is handled by queueing
    work to refill the recv buffer with the GFP_KERNEL flag. This means a
    failure to allocate with GFP_NOWAIT isn't fatal. Do not print page
    allocation failure (PAF) warnings if the softirq context fails to refill
    the recv buffer. We will still see the PAF warnings when the worker also
    fails to allocate.

    Signed-off-by: Manjunath Patil
    Reviewed-by: Aruna Ramakrishna
    Signed-off-by: Jakub Kicinski

    Manjunath Patil
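
A minimal user-space sketch of the two-tier refill described above, assuming a hypothetical allocator `toy_alloc()` with a `quiet` switch. The names and the `fail` parameter (used to force a failure) are illustrative only; the kernel's GFP flags are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int toy_warnings;       /* number of failure warnings "printed" */
static bool toy_refill_queued; /* deferred refill pending for the worker */

/* Hypothetical allocator: fails on request and warns unless quiet. */
static void *toy_alloc(size_t size, bool fail, bool quiet)
{
    if (fail) {
        if (!quiet)
            toy_warnings++;
        return NULL;
    }
    return malloc(size);
}

/* Fast path (softirq analogue): failure is not fatal, so stay quiet
 * and hand the refill over to the worker. */
static void *toy_refill_fast(size_t size, bool fail)
{
    void *buf = toy_alloc(size, fail, true);

    if (!buf)
        toy_refill_queued = true;
    return buf;
}

/* Worker path: the last resort, so a failure here does warn. */
static void *toy_refill_worker(size_t size, bool fail)
{
    toy_refill_queued = false;
    return toy_alloc(size, fail, false);
}
```

The design point is that only the final fallback reports the failure, so a warning always means the refill really did not happen.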
     

21 Sep, 2020

1 commit

  • sg_init_table zeroes its first argument, so the allocation of that argument
    doesn't have to.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    // <smpl>
    @@
    expression x,n,flags;
    @@

    x =
    - kcalloc
    + kmalloc_array
    (n,sizeof(*x),flags)
    ...
    sg_init_table(x,n)
    // </smpl>

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
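
The reasoning behind this change can be illustrated in user space: when an initializer zeroes the whole array anyway, a zeroing allocation is redundant. The sketch below uses hypothetical analogues (`toy_sg`, `toy_sg_init_table()`, `toy_malloc_array()`), not the kernel's scatterlist API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy table entry; the layout is invented for this example. */
struct toy_sg {
    void *page;
    unsigned int length;
    unsigned int end;
};

/* Zeroes every entry, then marks the last one as the end of the table. */
static void toy_sg_init_table(struct toy_sg *sg, size_t n)
{
    memset(sg, 0, n * sizeof(*sg));
    if (n)
        sg[n - 1].end = 1;
}

/* Overflow-checked n * size allocation that does NOT zero the memory. */
static void *toy_malloc_array(size_t n, size_t size)
{
    if (size && n > SIZE_MAX / size)
        return NULL;
    return malloc(n * size);
}

/* The init call zeroes the table, so calloc() would do the work twice. */
static struct toy_sg *toy_sg_alloc(size_t n)
{
    struct toy_sg *sg = toy_malloc_array(n, sizeof(*sg));

    if (!sg)
        return NULL;
    toy_sg_init_table(sg, n);
    return sg;
}
```

The overflow check is kept because that is the part of the array allocator that still matters once zeroing is dropped.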
     

19 Sep, 2020

1 commit


24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
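
For readers outside the kernel tree, a user-space sketch of the pseudo-keyword follows. The macro mirrors the general shape of the kernel's definition (a statement attribute where the compiler supports it, an empty statement otherwise); `toy_steps_from()` is a hypothetical function made up for the example.

```c
#include <assert.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Counts how many steps run when starting at the given stage. */
static int toy_steps_from(int stage)
{
    int steps = 0;

    switch (stage) {
    case 3:
        steps++;
        fallthrough;
    case 2:
        steps++;
        fallthrough;
    case 1:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```

Unlike a comment, the macro is visible to the compiler, so warnings such as -Wimplicit-fallthrough can tell intentional fall-through apart from a missing break.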
     

02 Aug, 2020

1 commit


01 Aug, 2020

1 commit

  • rds_notify_queue_get() is potentially copying uninitialized kernel stack
    memory to userspace since the compiler may leave a 4-byte hole at the end
    of `cmsg`.

    In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which
    unfortunately does not always initialize that 4-byte hole. Fix it by using
    memset() instead.

    Cc: stable@vger.kernel.org
    Fixes: f037590fff30 ("rds: fix a leak of kernel memory")
    Fixes: bdbe6fbc6a2f ("RDS: recv.c")
    Suggested-by: Dan Carpenter
    Signed-off-by: Peilin Ye
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Peilin Ye
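
A small sketch of the padding problem, assuming a hypothetical struct `toy_notifier` with an 8-byte member followed by a 4-byte member (on many 64-bit ABIs this leaves 4 bytes of tail padding). The output buffer stands in for memory copied to user space. Clearing the object with memset() before filling the members is the pattern the commit describes; the code relies on common compiler behaviour of leaving the padding untouched by the later member stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: tail padding is likely after 'status'. */
struct toy_notifier {
    uint64_t user_token;
    int32_t status;
};

/* Builds the record and copies its full object representation to 'out',
 * which must hold at least sizeof(struct toy_notifier) bytes. */
static void toy_fill(unsigned char *out, uint64_t token, int32_t status)
{
    struct toy_notifier n;

    /* memset() clears members AND padding; "= { 0 }" is only
     * guaranteed to initialize the members. */
    memset(&n, 0, sizeof(n));
    n.user_token = token;
    n.status = status;
    memcpy(out, &n, sizeof(n));
}
```

Any byte of `out` past the last member then reads as zero instead of leftover stack contents.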
     

25 Jul, 2020

1 commit

  • Rework the remaining setsockopt code to pass a sockptr_t instead of a
    plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
    outside of architecture specific code.

    Signed-off-by: Christoph Hellwig
    Acked-by: Stefan Schmidt [ieee802154]
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

20 Jul, 2020

1 commit


02 Jul, 2020

1 commit

  • In testing with mprds enabled, Oracle Cluster nodes after reboot were
    not able to communicate with other nodes and so failed to rejoin
    the cluster. Peers with a lower IP address initiated a connection, but
    the node could not respond as it chose a different path, and it could not
    initiate a connection itself as it had a higher IP address.

    With this patch, when a node sends out a packet and the selected path
    is down, all other paths are also checked and any down paths are
    re-connected.

    Reviewed-by: Ka-cheong Poon
    Reviewed-by: David Edmondson
    Signed-off-by: Somasundaram Krishnasamy
    Signed-off-by: Rao Shoaib
    Signed-off-by: David S. Miller

    Rao Shoaib
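
The behaviour introduced by this patch can be sketched as follows. All names (`toy_conn`, `toy_send()`, `toy_check_all_paths()`) and the three-state model are assumptions made for illustration and do not reflect the actual RDS data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_DOWN, TOY_CONNECTING, TOY_UP };

struct toy_conn {
    enum toy_state path[8];
    size_t npaths;
};

/* Starts a reconnect on every path that is down; returns how many. */
static size_t toy_check_all_paths(struct toy_conn *c)
{
    size_t i, started = 0;

    for (i = 0; i < c->npaths; i++) {
        if (c->path[i] == TOY_DOWN) {
            c->path[i] = TOY_CONNECTING;
            started++;
        }
    }
    return started;
}

/* Returns true if the packet can go out on the selected path right away.
 * If the selected path is down, all other paths are checked as well. */
static bool toy_send(struct toy_conn *c, size_t selected)
{
    if (c->path[selected] == TOY_UP)
        return true;
    if (c->path[selected] == TOY_DOWN)
        toy_check_all_paths(c);
    return false;
}
```

Checking every path on a failed send means a peer that picked a different path still finds this side attempting to connect.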
     

26 Jun, 2020

1 commit


16 Jun, 2020

1 commit


14 Jun, 2020

1 commit

  • Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
    '---help---'"), the number of '---help---' has been gradually
    decreasing, but there are still more than 2400 instances.

    This commit finishes the conversion. While I touched the lines,
    I also fixed the indentation.

    There are a variety of indentation styles found.

    a) 4 spaces + '---help---'
    b) 7 spaces + '---help---'
    c) 8 spaces + '---help---'
    d) 1 space + 1 tab + '---help---'
    e) 1 tab + '---help---' (correct indentation)
    f) 1 tab + 1 space + '---help---'
    g) 1 tab + 2 spaces + '---help---'

    In order to convert all of them to 1 tab + 'help', I ran the
    following command:

    $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
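
To make the effect of the sed expression concrete, here is a per-line equivalent written in C. `toy_fix_help()` is a hypothetical helper for illustration; it handles one line without its trailing newline, as sed does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Rewrites "<any leading whitespace>---help---<rest>" to "\thelp<rest>".
 * Other lines are copied unchanged. Returns 1 if the line was rewritten. */
static int toy_fix_help(const char *line, char *out, size_t outsz)
{
    static const char old[] = "---help---";
    const char *p = line;

    /* Equivalent of ^[[:space:]]* in the sed pattern. */
    while (isspace((unsigned char)*p))
        p++;

    if (strncmp(p, old, sizeof(old) - 1) == 0) {
        snprintf(out, outsz, "\thelp%s", p + sizeof(old) - 1);
        return 1;
    }
    snprintf(out, outsz, "%s", line);
    return 0;
}
```

All seven indentation styles listed above collapse to the same output, because the leading whitespace is discarded before the single tab is written.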
     

06 Jun, 2020

1 commit

  • Pull rdma updates from Jason Gunthorpe:
    "A more active cycle than most of the recent past, with a few large,
    long discussed works this time.

    The RNBD block driver has been posted for nearly two years now, and is
    flowing through RDMA because it also introduces a new ULP.

    The removal of FMR has been a recurring discussion theme for a long
    time.

    And the usual smattering of features and bug fixes.

    Summary:

    - Various small driver bug fixes in rxe, mlx5, hfi1, and efa

    - Continuing driver cleanups in bnxt_re, hns

    - Big cleanup of mlx5 QP creation flows

    - More consistent use of src port and flow label when LAG is used and
    a mlx5 implementation

    - Additional set of cleanups for IB CM

    - 'RNBD' network block driver and target. This is a network block
    RDMA device specific to ionos's cloud environment. It brings strong
    multipath and resiliency capabilities.

    - Accelerated IPoIB for HFI1

    - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
    async fds

    - Support for exchanging the new IBTA defined ECE data during RDMA CM
    exchanges

    - Removal of the very old and insecure FMR interface from all ULPs
    and drivers. FRWR should be preferred for at least a decade now"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
    RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
    RDMA/mlx5: Return ECE DC support
    RDMA/mlx5: Don't rely on FW to set zeros in ECE response
    RDMA/mlx5: Return an error if copy_to_user fails
    IB/hfi1: Use free_netdev() in hfi1_netdev_free()
    RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
    RDMA/core: Move and rename trace_cm_id_create()
    IB/hfi1: Fix hfi1_netdev_rx_init() error handling
    RDMA: Remove 'max_map_per_fmr'
    RDMA: Remove 'max_fmr'
    RDMA/core: Remove FMR device ops
    RDMA/rdmavt: Remove FMR memory registration
    RDMA/mthca: Remove FMR support for memory registration
    RDMA/mlx4: Remove FMR support for memory registration
    RDMA/i40iw: Remove FMR leftovers
    RDMA/bnxt_re: Remove FMR leftovers
    RDMA/mlx5: Remove FMR leftovers
    RDMA/core: Remove FMR pool API
    RDMA/rds: Remove FMR support for memory registration
    RDMA/srp: Remove support for FMR memory registration
    ...

    Linus Torvalds
     

03 Jun, 2020

2 commits

  • Now that FMR support is gone, this attribute can be deleted from all
    places.

    Link: https://lore.kernel.org/r/12-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
    Reviewed-by: Max Gurtovoy
    Reviewed-by: Bernard Metzler
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • Use FRWR method for memory registration by default and remove the ancient
    and unsafe FMR method.

    Link: https://lore.kernel.org/r/3-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
    Signed-off-by: Max Gurtovoy
    Signed-off-by: Jason Gunthorpe

    Max Gurtovoy
     

29 May, 2020

7 commits


28 May, 2020

1 commit

  • IBTA declares a "vendor option not supported" reject reason in REJ
    messages for when the passive side doesn't want to accept the proposed
    ECE options.

    Because ECE is managed by userspace, users need a way to provide this
    reject reason.

    Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.org
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Jason Gunthorpe

    Leon Romanovsky
     

22 May, 2020

1 commit


21 May, 2020

1 commit

  • The conversion to pin_user_pages() had a bug: it overlooked
    the case of allocation of pages failing. Fix that by restoring
    an equivalent check.

    Reported-by: syzbot+118ac0af4ac7f785a45b@syzkaller.appspotmail.com
    Fixes: dbfe7d74376e ("rds: convert get_user_pages() --> pin_user_pages()")

    Cc: David S. Miller
    Cc: Jakub Kicinski
    Cc: netdev@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: rds-devel@oss.oracle.com
    Signed-off-by: John Hubbard
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    John Hubbard
     

18 May, 2020

1 commit

  • This code was using get_user_pages_fast(), in a "Case 2" scenario
    (DMA/RDMA), using the categorization from [1]. That means that it's
    time to convert the get_user_pages_fast() + put_page() calls to
    pin_user_pages_fast() + unpin_user_pages() calls.

    There is some helpful background in [2]: basically, this is a small
    part of fixing a long-standing disconnect between pinning pages, and
    file systems' use of those pages.

    [1] Documentation/core-api/pin_user_pages.rst

    [2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

    Cc: David S. Miller
    Cc: Jakub Kicinski
    Cc: netdev@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: rds-devel@oss.oracle.com
    Signed-off-by: John Hubbard
    Signed-off-by: David S. Miller

    John Hubbard
     

06 May, 2020

1 commit

  • When a client is added it isn't allowed to fail, but all the clients have
    various failure paths within their add routines.

    This creates the very fringe condition where the client was added, failed
    during add and didn't set the client_data. The core code will then still
    call other client_data centric ops like remove(), rename(), get_nl_info(),
    and get_net_dev_by_params() with NULL client_data - which is confusing and
    unexpected.

    If the add() callback fails, then do not call any more client ops for the
    device, even remove.

    Remove all the now redundant checks for NULL client_data in ops callbacks.

    Update all the add() callbacks to return error codes
    appropriately. EOPNOTSUPP is used for cases where the ULP does not support
    the ib_device, e.g. because it only works with IB.

    Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.org
    Signed-off-by: Leon Romanovsky
    Acked-by: Ursula Braun
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
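
The rule "if add() fails, call no further client ops, not even remove" can be sketched like this. The `toy_client` structure, the `attached` flag and the sample callbacks are hypothetical and much simpler than the real ib_client handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_client {
    int (*add)(void *dev);
    void (*remove)(void *dev);
    bool attached; /* set only when add() succeeded */
};

static int toy_remove_calls;

/* Sample callbacks for the example. */
static int toy_add_ok(void *dev) { (void)dev; return 0; }
static int toy_add_unsupported(void *dev) { (void)dev; return -EOPNOTSUPP; }
static void toy_remove(void *dev) { (void)dev; toy_remove_calls++; }

/* Core side: remember whether add() worked and propagate its error. */
static int toy_add_client(struct toy_client *c, void *dev)
{
    int ret = c->add(dev);

    c->attached = (ret == 0);
    return ret;
}

/* Core side: never call remove() for a client whose add() failed. */
static void toy_remove_client(struct toy_client *c, void *dev)
{
    if (!c->attached)
        return;
    c->remove(dev);
    c->attached = false;
}
```

Because the core tracks success itself, the callbacks no longer need their own checks for missing per-client data.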
     

29 Apr, 2020

1 commit


27 Apr, 2020

1 commit

  • Instead of having all the sysctl handlers deal with user pointers, which
    is rather hairy in terms of the BPF interaction, copy the input to and
    from userspace in common code. This also means that the strings are
    always NUL-terminated by the common code, making the API a little bit
    safer.

    As most handlers just pass through the data to one of the common handlers
    a lot of the changes are mechanical.

    Signed-off-by: Christoph Hellwig
    Acked-by: Andrey Ignatov
    Signed-off-by: Al Viro

    Christoph Hellwig
     

16 Apr, 2020

1 commit

  • Returning the error code via a 'int *ret' when the function returns a
    pointer is very un-kernely and causes gcc 10's static analysis to choke:

    net/rds/message.c: In function ‘rds_message_map_pages’:
    net/rds/message.c:358:10: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    358 | return ERR_PTR(ret);

    Use a typical ERR_PTR return instead.

    Signed-off-by: Jason Gunthorpe
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Jason Gunthorpe
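
For readers unfamiliar with the ERR_PTR convention, a user-space approximation is sketched below. The `toy_` helpers imitate the idea of encoding a small negative errno in the pointer value itself; they are assumptions for illustration, and `toy_message_alloc()` is a made-up function, not the RDS one.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *toy_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static inline long toy_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error pointers occupy the top TOY_MAX_ERRNO addresses. */
static inline int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Returns a buffer, or an encoded error; no 'int *ret' out-parameter. */
static void *toy_message_alloc(size_t len)
{
    void *msg;

    if (len == 0)
        return toy_err_ptr(-EINVAL);
    msg = malloc(len);
    if (!msg)
        return toy_err_ptr(-ENOMEM);
    return msg;
}
```

With a single return channel there is no separate variable that a code path could leave uninitialized, which is the class of warning quoted above.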
     

10 Apr, 2020

2 commits

  • rds_free_mr() calls rds_destroy_mr(mr) directly. But this
    defeats the purpose of reference counting and makes MR free handling
    impossible. It means that holding a reference does not guarantee that
    it is safe to access some fields. For example, in
    rds_cmsg_rdma_dest(), it increases the ref count, unlocks and then
    calls mr->r_trans->sync_mr(). But if rds_free_mr() (and
    rds_destroy_mr()) is called in between (there is no lock preventing
    this from happening), r_trans_private is set to NULL, causing a panic.
    A similar issue exists in rds_rdma_unuse().

    Reported-by: zerons
    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
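
The principle at stake, that the free path should drop a reference rather than destroy the object outright, can be sketched as below. The `toy_mr` type and helpers are hypothetical, and the counter is a plain int for brevity; real code would need an atomic reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mr {
    int refcount;        /* not atomic: single-threaded sketch only */
    void *trans_private; /* cleared when the MR is destroyed */
};

static int toy_destroyed;

/* Tears the MR down; only safe once nobody holds a reference. */
static void toy_mr_destroy(struct toy_mr *mr)
{
    mr->trans_private = NULL;
    toy_destroyed++;
}

static void toy_mr_get(struct toy_mr *mr)
{
    mr->refcount++;
}

/* Destroys the MR only when the last reference is dropped. */
static void toy_mr_put(struct toy_mr *mr)
{
    if (--mr->refcount == 0)
        toy_mr_destroy(mr);
}

/* Free path: gives up the owner's reference instead of destroying
 * directly, so a concurrent user holding a reference stays safe. */
static void toy_mr_free(struct toy_mr *mr)
{
    toy_mr_put(mr);
}
```

A user that took a reference before the free can keep using `trans_private` until its own put, which is exactly the guarantee the direct destroy call broke.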
     
  • And removed rds_mr_put().

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     

17 Feb, 2020

1 commit

  • Convert net/rds to use the newly introduced pin_user_pages() API,
    which properly sets FOLL_PIN. Setting FOLL_PIN is now required for
    code that requires tracking of pinned pages.

    Note that this effectively changes the code's behavior: it now
    ultimately calls set_page_dirty_lock(), instead of set_page_dirty().
    This is probably more accurate.

    As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
    dealing with a file backed page where we have reference on the inode it
    hangs off." [1]

    [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

    Cc: Hans Westgaard Ry
    Cc: Santosh Shilimkar
    Signed-off-by: Leon Romanovsky
    Signed-off-by: John Hubbard
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Leon Romanovsky
     

18 Jan, 2020

2 commits


16 Jan, 2020

1 commit


17 Nov, 2019

2 commits


18 Oct, 2019

1 commit