25 Aug, 2019

1 commit

  • From IB specification 7.6.5 SERVICE LEVEL: Service Level (SL)
    is used to identify different flows within an IBA subnet.
    It is carried in the local route header of the packet.

    Before this commit, "rds-info -I" reports an SL of 0 for every
    connection, whatever its Tos:
    "
    RDS IB Connections:
    LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
    192.2.95.3 192.2.95.1 2 0 fe80::21:28:1a:39 fe80::21:28:10:b9
    192.2.95.3 192.2.95.1 1 0 fe80::21:28:1a:39 fe80::21:28:10:b9
    192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
    "
    After this commit, the output is as below:
    "
    RDS IB Connections:
    LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
    192.2.95.3 192.2.95.1 2 2 fe80::21:28:1a:39 fe80::21:28:10:b9
    192.2.95.3 192.2.95.1 1 1 fe80::21:28:1a:39 fe80::21:28:10:b9
    192.2.95.3 192.2.95.1 0 0 fe80::21:28:1a:39 fe80::21:28:10:b9
    "

    The commit fe3475af3bdf ("net: rds: add per rds connection cache
    statistics") adds cache_allocs in struct rds_info_rdma_connection
    as below:
    struct rds_info_rdma_connection {
            ...
            __u32 rdma_mr_max;
            __u32 rdma_mr_size;
            __u8 tos;
            __u32 cache_allocs;
    };
    The peer struct in rds-tools of struct rds_info_rdma_connection is as
    below:
    struct rds_info_rdma_connection {
            ...
            uint32_t rdma_mr_max;
            uint32_t rdma_mr_size;
            uint8_t tos;
            uint8_t sl;
            uint32_t cache_allocs;
    };
    The only difference between userspace and kernel is the member
    variable sl, which is missing from the kernel struct. The mismatch is
    risky because the two sides no longer agree on the struct layout, so
    this commit adds sl to the kernel struct.

    Fixes: fe3475af3bdf ("net: rds: add per rds connection cache statistics")
    CC: Joe Jin
    CC: JUNXIAO_BI
    Suggested-by: Gerd Rausch
    Signed-off-by: Zhu Yanjun
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Zhu Yanjun
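
    The layout mismatch described above can be illustrated with a small
    sketch. The two structs below are trimmed, hypothetical stand-ins
    (kernel_view, tools_view) for the definitions quoted in the commit
    message, and the sketch assumes naturally aligned, unpacked structs;
    the real structs have more members and may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for the kernel struct before this commit: no sl. */
struct kernel_view {
	uint32_t rdma_mr_size;
	uint8_t  tos;
	uint32_t cache_allocs;
};

/* Trimmed stand-in for the rds-tools struct: sl follows tos. */
struct tools_view {
	uint32_t rdma_mr_size;
	uint8_t  tos;
	uint8_t  sl;
	uint32_t cache_allocs;
};

/* Byte offset at which userspace expects to find sl. */
static size_t tools_sl_offset(void)
{
	return offsetof(struct tools_view, sl);
}

/* Returns 1 if a named member of the kernel view starts at or covers that byte. */
static int kernel_names_sl_byte(void)
{
	size_t off = tools_sl_offset();

	/* Both views agree on where tos lives. */
	assert(offsetof(struct kernel_view, tos) ==
	       offsetof(struct tools_view, tos));

	return off == offsetof(struct kernel_view, tos) ||
	       off >= offsetof(struct kernel_view, cache_allocs);
}
```

    Under these assumptions kernel_names_sl_byte() returns 0: the byte
    that userspace reads as sl is not covered by any member the kernel
    struct names.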
     

28 Jul, 2019

1 commit

  • In rds_rdma_cm_event_handler_cmn(), there are some if statements to
    check whether conn is NULL, such as on lines 65, 96 and 112.
    But conn is not checked before being used on line 108:
        trans->cm_connect_complete(conn, event);
    and on lines 140-143:
        rdsdebug("DISCONNECT event - dropping connection "
                 "%pI6c->%pI6c\n", &conn->c_laddr,
                 &conn->c_faddr);
        rds_conn_drop(conn);

    Thus, possible null-pointer dereferences may occur.

    To fix these bugs, conn is checked before being used.

    These bugs were found by STCheck, a static analysis tool written by us.

    Signed-off-by: Jia-Ju Bai
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Jia-Ju Bai
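
    The guard pattern this fix describes can be sketched as follows. All
    names here (struct conn, enum cm_event, handle_event, conn_drop) are
    hypothetical stand-ins, not the kernel's types or functions; the only
    point is that every branch checks conn before using it.

```c
#include <assert.h>
#include <stddef.h>

struct conn { int drops; };            /* hypothetical connection record */

enum cm_event { EV_ESTABLISHED, EV_DISCONNECTED };

/* Drops the connection; callers must already have checked for NULL. */
static void conn_drop(struct conn *c)
{
	assert(c != NULL);
	c->drops++;
}

/* Returns 0 when the event was handled, -1 when there was no connection. */
static int handle_event(struct conn *c, enum cm_event ev)
{
	switch (ev) {
	case EV_ESTABLISHED:
		if (!c)
			return -1;     /* nothing to complete */
		return 0;
	case EV_DISCONNECTED:
		if (!c)
			return -1;     /* nothing to drop */
		conn_drop(c);
		return 0;
	}
	return -1;
}
```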
     

10 Jul, 2019

2 commits

  • Connections with legitimate tos values can get into the usual
    connection race, which can result in a consumer reject. We don't want
    the tos value or protocol version to be demoted for such connections;
    otherwise peers would end up with different tos values, which can
    result in no connection. For example, a peer-initiated connection
    with, say, tos 8 can get downgraded to tos 0 while racing with the
    usual connection, which is not desirable.

    This patch fixes the above issue, introduced by
    commit d021fabf525f ("rds: rdma: add consumer reject").

    Reported-by: Yanjun Zhu
    Tested-by: Yanjun Zhu
    Signed-off-by: Santosh Shilimkar

    Santosh Shilimkar
     
  • Prior to
    commit d021fabf525ff ("rds: rdma: add consumer reject")

    function "rds_rdma_cm_event_handler_cmn" would always honor a rejected
    connection attempt by issuing a "rds_conn_drop".

    The commit mentioned above added a "break", eliminating
    the "fallthrough" case and made the "rds_conn_drop" rather conditional:

    Now it only happens if a "consumer defined" reject (i.e. "rdma_reject")
    carries an integer-value of "1" inside "private_data":

        if (!conn)
                break;
        err = (int *)rdma_consumer_reject_data(cm_id, event, &len);
        if (!err || (err && ((*err) == RDS_RDMA_REJ_INCOMPAT))) {
                pr_warn("RDS/RDMA: conn <%pI6c, %pI6c> rejected, dropping connection\n",
                        &conn->c_laddr, &conn->c_faddr);
                conn->c_proposed_version = RDS_PROTOCOL_COMPAT_VERSION;
                rds_conn_drop(conn);
        }
        rdsdebug("Connection rejected: %s\n",
                 rdma_reject_msg(cm_id, event->status));
        break;
        /* FALLTHROUGH */

    A number of issues are worth mentioning here:
    #1) Previous versions of the RDS code simply rejected a connection
    by calling "rdma_reject(cm_id, NULL, 0);"
    So the value of the payload in "private_data" will not be "1",
    but "0".

    #2) Now the code has become dependent on host byte order and sizing.
    If one peer is big-endian, the other is little-endian,
    or there's a difference in sizeof(int) (e.g. ILP64 vs LP64),
    the *err check does not work as intended.

    #3) There is no check for "len" to see if the data behind *err is even valid.
    Luckily, it appears that the "rdma_reject(cm_id, NULL, 0)" will always
    carry 148 bytes of zeroized payload.
    But that should probably not be relied upon here.

    #4) With the added "break;",
    we might as well drop the misleading "/* FALLTHROUGH */" comment.

    This commit does _not_ address issue #2, as the sender would have to
    agree on a byte order as well.

    Here is the sequence of messages in this observed error-scenario:
    Host-A is pre-QoS changes (excluding the commit mentioned above)
    Host-B is post-QoS changes (including the commit mentioned above)

    #1 Host-B
    issues a connection request via function "rds_conn_path_transition"
    connection state transitions to "RDS_CONN_CONNECTING"

    #2 Host-A
    rejects the incompatible connection request (from #1)
    It does so by calling "rdma_reject(cm_id, NULL, 0);"

    #3 Host-B
    receives an "RDMA_CM_EVENT_REJECTED" event (from #2)
    But since the code is changed in the way described above,
    it won't drop the connection here, simply because "*err == 0".

    #4 Host-A
    issues a connection request

    #5 Host-B
    receives an "RDMA_CM_EVENT_CONNECT_REQUEST" event
    and ends up calling "rds_ib_cm_handle_connect".
    But since the state is already in "RDS_CONN_CONNECTING"
    (as of #1) it will end up issuing a "rdma_reject" without
    dropping the connection:
        if (rds_conn_state(conn) == RDS_CONN_CONNECTING) {
                /* Wait and see - our connect may still be succeeding */
                rds_ib_stats_inc(s_ib_connect_raced);
        }
        goto out;

    #6 Host-A
    receives an "RDMA_CM_EVENT_REJECTED" event (from #5),
    drops the connection and tries again (goto #4) until it gives up.

    Tested-by: Zhu Yanjun
    Signed-off-by: Gerd Rausch
    Signed-off-by: Santosh Shilimkar

    Gerd Rausch
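
    Issues #1 and #3 above suggest one possible shape for a more
    defensive check, sketched below. This is not the code from the
    commit: should_drop_on_reject and REJ_INCOMPAT are illustrative
    names, and treating a zero payload like the value "1" is an
    assumption drawn from issue #1. The value is still read in host byte
    order, so issue #2 remains.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REJ_INCOMPAT 1   /* the "1" mentioned above; the name is illustrative */

/*
 * Decide whether a rejected connection attempt should be dropped.
 * Returns 1 to drop, 0 to leave the connection alone.
 */
static int should_drop_on_reject(const void *data, size_t len)
{
	int code;

	/* A NULL payload must come with a length of zero. */
	assert(data != NULL || len == 0);

	if (len < sizeof(code))
		return 1;                  /* no usable payload: honor the reject */

	memcpy(&code, data, sizeof(code)); /* host byte order, see issue #2 */

	/* Older peers send zeroes, newer ones send REJ_INCOMPAT. */
	return code == 0 || code == REJ_INCOMPAT;
}
```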
     

05 Feb, 2019

3 commits

  • For RDMA transports, RDS TOS is an extension of IB QoS (Annex A13)
    that gives clients the ability to segregate traffic flows for
    different types of data. RDMA CM abstracts it for ULPs through
    rdma_set_service_type(). Internally, each traffic flow is
    represented by a connection with all of its own independent
    resources, like a normal connection, and is differentiated by
    service type. In other words, there can be multiple QP connections
    between an IP pair, each supporting a unique service type.

    The feature has been available from RDSv4.1 onwards and supports
    rolling upgrades. RDMA connection metadata also carries the tos
    information to set up the SL in the end-to-end context. The original
    code was developed by Bang Nguyen in the downstream kernel back in
    the 2.6.32 kernel days and has evolved over time.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
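
    The idea that several connections can exist between one IP pair,
    told apart only by service type, can be sketched as a lookup keyed
    on (local address, peer address, tos). The struct and function below
    are hypothetical and much simpler than the kernel's connection
    table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct flow_conn {          /* hypothetical, trimmed connection record */
	uint32_t laddr;
	uint32_t faddr;
	uint8_t  tos;
};

/* Find the connection for this address pair and service type, or NULL. */
static const struct flow_conn *
find_conn(const struct flow_conn *tbl, size_t n,
	  uint32_t laddr, uint32_t faddr, uint8_t tos)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].laddr == laddr && tbl[i].faddr == faddr &&
		    tbl[i].tos == tos)
			return &tbl[i];
	}
	return NULL;
}
```

    With two entries for the same address pair, one with tos 0 and one
    with tos 8, each lookup returns its own entry, and a lookup for an
    unused tos returns NULL.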
     
  • The RDS service type (TOS) is user-defined and must be configured
    via the RDS IOCTL interface. It must be set before initiating any
    traffic, and once set it cannot be changed. All outgoing
    traffic from the socket is associated with its TOS.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
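
    The set-once rule above can be modelled in a few lines. This is a
    sketch of the semantics only: struct sock_state and sock_set_tos are
    hypothetical, and returning -EINVAL on a rejected change is an
    assumption, not a statement about what the kernel returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sock_state {             /* hypothetical per-socket state */
	uint8_t tos;
	bool    tos_set;
	bool    traffic_started;
};

/* Set the TOS once, before any traffic; later attempts are refused. */
static int sock_set_tos(struct sock_state *s, uint8_t tos)
{
	assert(s != NULL);

	if (s->tos_set || s->traffic_started)
		return -EINVAL;         /* assumed error code */

	s->tos = tos;
	s->tos_set = true;
	return 0;
}
```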
     
  • For legacy protocol version incompatibility with non-Linux RDS, the
    consumer reject reason is used to convey it to the peer. But the
    choice of '1' as the reject reason value was really poor.

    Anyway, for interoperability with shipping products,
    it needs to be supported. Any future version should use a properly
    encoded reject reason.

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
     

02 Aug, 2018

1 commit


25 Jul, 2018

1 commit


24 Jul, 2018

2 commits

  • This patch enables RDS to use IPv6 addresses. For RDS/TCP, the
    listener is now an IPv6 endpoint which accepts both IPv4 and IPv6
    connection requests. RDS/RDMA/IB uses a private data (struct
    rds_ib_connect_private) exchange between endpoints at RDS connection
    establishment time to support RDMA. This private data exchange uses a
    32 bit integer to represent an IP address. This needs to be changed in
    order to support IPv6. A new private data struct
    rds6_ib_connect_private is introduced to handle this. To ensure
    backward compatibility, an IPv6 capable RDS stack uses another RDMA
    listener port (RDS_CM_PORT) to accept IPv6 connections, and it
    continues to use the original RDS_PORT for IPv4 RDS connections. When
    it needs to communicate with an IPv6 peer, it sends the connection
    setup request to RDS_CM_PORT.

    v5: Fixed syntax problem (David Miller).

    v4: Changed port history comments in rds.h (Sowmini Varadhan).

    v3: Added support to set up IPv4 connection using mapped address
    (David Miller).
    Added support to set up connection between link local and non-link
    addresses.
    Various review comments from Santosh Shilimkar and Sowmini Varadhan.

    v2: Fixed bound and peer address scope mismatched issue.
    Added back rds_connect() IPv6 changes.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
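
    The two-port scheme can be sketched as a simple selection on the
    peer address. listener_port_for is a hypothetical helper, and the
    port numbers are assumed to match the kernel's rds.h; check them
    against your tree before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>

/* Assumed values; verify against the kernel headers. */
#define RDS_PORT     18634   /* original port, IPv4 connections */
#define RDS_CM_PORT  16385   /* additional listener for IPv6 connections */

/* Pick the listener port for a peer stored as a struct in6_addr. */
static uint16_t listener_port_for(const struct in6_addr *peer)
{
	assert(peer != NULL);

	/* IPv4 peers are represented as IPv4-mapped IPv6 addresses. */
	return IN6_IS_ADDR_V4MAPPED(peer) ? RDS_PORT : RDS_CM_PORT;
}
```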
     
  • This patch changes the internal representation of an IP address to use
    struct in6_addr. An IPv4 address is stored as an IPv4-mapped address.
    All the functions which take an IP address as argument are also
    changed to use struct in6_addr. The RDS socket layer is not modified,
    however, so it still does not accept IPv6 addresses from an
    application, and the RDS layer neither accepts nor initiates IPv6
    connections.

    v2: Fixed sparse warnings.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
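
    Storing an IPv4 address as an IPv4-mapped address means placing it
    in the last four bytes of a struct in6_addr behind the ::ffff:
    prefix. A minimal userspace sketch follows; v4_to_mapped is a
    hypothetical helper, not the kernel function used by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

/* Store a network-byte-order IPv4 address as ::ffff:a.b.c.d. */
static void v4_to_mapped(struct in_addr v4, struct in6_addr *out)
{
	assert(out != NULL);

	memset(out, 0, sizeof(*out));       /* first 80 bits are zero */
	out->s6_addr[10] = 0xff;            /* then 16 bits of ones */
	out->s6_addr[11] = 0xff;
	memcpy(&out->s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
}
```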
     

03 Jan, 2017

1 commit


15 Dec, 2016

1 commit


15 Jun, 2016

1 commit

  • In preparation for multipath RDS, split the rds_connection
    structure into a base structure, and a per-path struct rds_conn_path.
    The base structure tracks information and locks common to all
    paths. The workqs for send/recv/shutdown etc are tracked per
    rds_conn_path. Thus the workq callbacks now work with rds_conn_path.

    This commit allows for one rds_conn_path per rds_connection, and will
    be extended into multiple conn_paths in subsequent commits.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
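
    The split can be pictured as a base struct that embeds per-path
    structs, each holding a back-pointer to the base. The sketch below
    uses trimmed, illustrative structs and field names; it is not the
    kernel's definition, and a plain counter stands in for the per-path
    work queues.

```c
#include <assert.h>
#include <stddef.h>

struct conn;

struct conn_path {                      /* per-path state */
	struct conn *cp_conn;           /* back-pointer to the shared base */
	int          cp_index;
	int          cp_pending_work;   /* stands in for per-path work queues */
};

struct conn {                           /* state common to all paths */
	int              c_npaths;
	struct conn_path c_path[1];     /* exactly one path for now */
};

static void conn_init(struct conn *c)
{
	assert(c != NULL);

	c->c_npaths = 1;
	c->c_path[0].cp_conn = c;
	c->c_path[0].cp_index = 0;
	c->c_path[0].cp_pending_work = 0;
}

/* Work callbacks take the path and reach shared state through cp_conn. */
static void send_worker(struct conn_path *cp)
{
	assert(cp != NULL && cp->cp_conn != NULL);
	cp->cp_pending_work++;
}
```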
     

03 Mar, 2016

2 commits

  • Drop the RDS connection on RDMA_CM_EVENT_TIMEWAIT_EXIT so that
    it can reconnect and resume.

    While testing fastreg, this error happened in a couple of tests but
    went unnoticed.

    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    santosh.shilimkar@oracle.com
     
  • The RDS iWARP support code has become stale and untestable. As
    indicated earlier, I am dropping support for it.

    If new iWARP users show up in the future, we can adapt the RDS IB
    transport for the special RDMA READ sink case. iWARP needs an MR
    for the RDMA READ sink.

    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    santosh.shilimkar@oracle.com
     

29 Oct, 2015

1 commit

  • Add support for network namespaces in the ib_cma module. This is
    accomplished by:

    1. Adding network namespace parameter for rdma_create_id. This parameter is
    used to populate the network namespace field in rdma_id_private.
    rdma_create_id keeps a reference on the network namespace.
    2. Using the network namespace from the rdma_id instead of init_net inside
    of ib_cma, when listening on an ID and when looking for an ID for an
    incoming request.
    3. Decrementing the reference count for the appropriate network namespace
    when calling rdma_destroy_id.

    In order to preserve the current behavior init_net is passed when calling
    from other modules.

    Signed-off-by: Guy Shapiro
    Signed-off-by: Haggai Eran
    Signed-off-by: Yotam Kenneth
    Signed-off-by: Shachar Raindel
    Signed-off-by: Doug Ledford

    Guy Shapiro
     

26 Aug, 2015

1 commit


19 May, 2015

1 commit


31 May, 2014

1 commit

  • This patch replaces a comma between expression statements by a semicolon.

    A simplified version of the semantic patch that performs this
    transformation is as follows:

    //
    @r@
    expression e1,e2,e;
    type T;
    identifier i;
    @@

    e1
    -,
    +;
    e2;
    //

    Signed-off-by: Himangi Saraogi
    Acked-by: Julia Lawall
    Signed-off-by: David S. Miller

    Himangi Saraogi
     

01 Nov, 2011

1 commit


26 May, 2011

1 commit

  • The RDMA CM currently infers the QP type from the port space selected
    by the user. In the future (e.g. with RDMA_PS_IB or XRC), there may not
    be a 1-1 correspondence between port space and QP type. For netlink
    export of RDMA CM state, we want to export the QP type to userspace,
    so it is cleaner to explicitly associate a QP type to an ID.

    Modify rdma_create_id() to allow the user to specify the QP type, and
    use it to make our selections of datagram versus connected mode.

    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Sean Hefty
     

21 Oct, 2010

1 commit


09 Sep, 2010

2 commits


28 Apr, 2010

1 commit


23 Apr, 2010

1 commit

  • In the original code, the "goto out" calls "rdma_destroy_id(cm_id);"
    That isn't needed here and would cause problems because "cm_id" is an
    ERR_PTR. The new code just returns directly.

    Signed-off-by: Dan Carpenter
    Acked-by: Andy Grover
    Signed-off-by: David S. Miller

    Dan Carpenter
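
    The pattern behind this fix can be sketched in userspace. The
    err_ptr/is_err/ptr_err helpers imitate the kernel's ERR_PTR
    convention (assumed here to encode errno values in the top 4095
    addresses), and struct cm_id, create_id, destroy_id and listen_init
    are hypothetical stand-ins rather than the RDMA CM API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitation of ERR_PTR, IS_ERR and PTR_ERR. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

struct cm_id { int port; };             /* hypothetical stand-in */

static struct cm_id *create_id(int fail)
{
	struct cm_id *id;

	if (fail)
		return err_ptr(-ENOMEM);
	id = calloc(1, sizeof(*id));
	return id ? id : err_ptr(-ENOMEM);
}

static void destroy_id(struct cm_id *id)
{
	assert(!is_err(id));    /* destroying an ERR_PTR is the bug described above */
	free(id);
}

/* On success, hands the new id to the caller through *out. */
static int listen_init(struct cm_id **out, int fail_create, int fail_bind)
{
	struct cm_id *id = create_id(fail_create);

	if (is_err(id))
		return (int)ptr_err(id);   /* nothing to clean up: return directly */

	if (fail_bind) {
		destroy_id(id);            /* only reached with a valid id */
		return -EADDRINUSE;
	}
	*out = id;
	return 0;
}
```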
     

17 Mar, 2010

2 commits


24 Aug, 2009

1 commit

  • Enable the building of transports as modules.

    Also, improve consistency of Kconfig messages in relation to other
    protocols, and move build dependency on IB from the RDS core code
    to the rds_rdma module.

    Signed-off-by: Andy Grover
    Signed-off-by: David S. Miller

    Andy Grover
     

06 Aug, 2009

1 commit

  • Elsewhere the sin_family field holds a value with a name of the form
    AF_..., so it seems reasonable to do so here as well. Also the values of
    PF_INET and AF_INET are the same.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @@
    struct sockaddr_in sip;
    @@

    (
    sip.sin_family ==
    - PF_INET
    + AF_INET
    |
    sip.sin_family !=
    - PF_INET
    + AF_INET
    |
    sip.sin_family =
    - PF_INET
    + AF_INET
    )
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
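
    As a small illustration of the convention, the sketch below fills in
    a struct sockaddr_in using AF_INET for sin_family. init_sin is a
    hypothetical helper; on systems where PF_INET and AF_INET share a
    value, as the message notes, the change is purely one of naming.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Initialise an IPv4 socket address for the given host-order port. */
static void init_sin(struct sockaddr_in *sin, uint16_t port)
{
	assert(sin != NULL);

	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;      /* address family name, not PF_INET */
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_ANY);
}
```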
     

20 Jul, 2009

1 commit


10 Apr, 2009

1 commit


27 Feb, 2009

1 commit

  • Although most of IB and iWARP are separated from each other,
    there is some common code required to handle their shared
    CM listen port. This code listens for CM events and then
    dispatches the event to the appropriate transport, either
    IB or iWARP.

    Signed-off-by: Andy Grover
    Signed-off-by: David S. Miller

    Andy Grover