17 Jul, 2017

1 commit

  • We could end up executing rds_conn_shutdown before the rds_recv_worker
    thread, then rds_conn_shutdown -> rds_tcp_conn_shutdown can do a
    sock_release and set sock->sk to null, which may interleave in bad
    ways with rds_recv_worker, e.g., it could result in:

    "BUG: unable to handle kernel NULL pointer dereference at 0000000000000078"
    [ffff881769f6fd70] release_sock at ffffffff815f337b
    [ffff881769f6fd90] rds_tcp_recv at ffffffffa043c888 [rds_tcp]
    [ffff881769f6fdb0] rds_recv_worker at ffffffffa04a4810 [rds]
    [ffff881769f6fde0] process_one_work at ffffffff810a14c1
    [ffff881769f6fe40] worker_thread at ffffffff810a1940
    [ffff881769f6fec0] kthread at ffffffff810a6b1e

    Also, do not enqueue any new shutdown workq items when the connection is
    shutting down (this may happen for rds-tcp in softirq mode, if a FIN
    or CLOSE is received while the modules is in the middle of an unload)

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

01 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

06 Apr, 2017

1 commit


18 Nov, 2016

1 commit

  • As noted in rds_recv_incoming() sequence numbers on data packets
    can decreas for the failover case, and the Rx path is equipped
    to recover from this, if the RDS_FLAG_RETRANSMITTED is set
    on the rds header of an incoming message with a suspect sequence
    number.

    The RDS_FLAG_RETRANSMITTED is predicated on the RDS_FLAG_RETRANSMITTED
    flag in the rds_message, so make sure the flag is set on messages
    queued for retransmission.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

16 Jul, 2016

1 commit

  • Use RDS probe-ping to compute how many paths may be used with
    the peer, and to synchronously start the multiple paths. If mprds is
    supported, hash outgoing traffic to one of multiple paths in rds_sendmsg()
    when multipath RDS is supported by the transport.

    CC: Santosh Shilimkar
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

02 Jul, 2016

2 commits


30 Jun, 2016

1 commit


19 Jun, 2016

1 commit

  • Fix coding style issues in the following files:

    ib_cm.c: add space
    loop.c: convert spaces to tabs
    sysctl.c: add space
    tcp.h: convert spaces to tabs
    tcp_connect.c:remove extra indentation in switch statement
    tcp_recv.c: convert spaces to tabs
    tcp_send.c: convert spaces to tabs
    transport.c: move brace up one line on for statement

    Signed-off-by: Joshua Houghton
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Joshua Houghton
     

15 Jun, 2016

1 commit

  • In preparation for multipath RDS, split the rds_connection
    structure into a base structure, and a per-path struct rds_conn_path.
    The base structure tracks information and locks common to all
    paths. The workqs for send/recv/shutdown etc are tracked per
    rds_conn_path. Thus the workq callbacks now work with rds_conn_path.

    This commit allows for one rds_conn_path per rds_connection, and will
    be extended into multiple conn_paths in subsequent commits.

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

20 May, 2016

1 commit

  • TCP stack can now run from process context.

    Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous
    assumption.

    Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
    Fixes: d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Oct, 2015

1 commit


18 Apr, 2014

1 commit

  • Mostly scripted conversion of the smp_mb__* barriers.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
    Cc: Linus Torvalds
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

23 Aug, 2012

1 commit


21 Oct, 2010

1 commit


27 Sep, 2010

1 commit


25 Sep, 2010

1 commit

  • We have for each socket :

    One spinlock (sk_slock.slock)
    One rwlock (sk_callback_lock)

    Possible scenarios are :

    (A) (this is used in net/sunrpc/xprtsock.c)
    read_lock(&sk->sk_callback_lock) (without blocking BH)

    spin_lock(&sk->sk_slock.slock);
    ...
    read_lock(&sk->sk_callback_lock);
    ...

    (B)
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)

    (C)
    spin_lock_bh(&sk->sk_slock)
    ...
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)
    spin_unlock_bh(&sk->sk_slock)

    This (C) case conflicts with (A) :

    CPU1 [A] CPU2 [C]
    read_lock(callback_lock)
    spin_lock_bh(slock)

    We have one problematic (C) use case in inet_csk_listen_stop() :

    local_bh_disable();
    bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
    WARN_ON(sock_owned_by_user(child));
    ...
    sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)

    lockdep is not happy with this, as reported by Tetsuo Handa

    It seems only way to deal with this is to use read_lock_bh(callbacklock)
    everywhere.

    Thanks to Jarek for pointing a bug in my first attempt and suggesting
    this solution.

    Reported-by: Tetsuo Handa
    Tested-by: Tetsuo Handa
    Signed-off-by: Eric Dumazet
    CC: Jarek Poplawski
    Tested-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Sep, 2010

4 commits


17 Mar, 2010

1 commit


04 Feb, 2010

1 commit


24 Aug, 2009

1 commit

  • This code allows RDS to be tunneled over a TCP connection.

    RDMA operations are disabled when using TCP transport,
    but this frees RDS from the IB/RDMA stack dependency, and allows
    it to be used with standard Ethernet adapters, or in a VM.

    Signed-off-by: Andy Grover
    Signed-off-by: David S. Miller

    Andy Grover