26 Sep, 2018

1 commit

  • [ Upstream commit cc4dfb7f70a344f24c1c71e298deea0771dadcb2 ]

    When an rds sock is bound, it is inserted into the bind_hash_table,
    which is protected by RCU. But when an rds sock is released, it is
    freed immediately after being removed from this hash table, without
    waiting for an RCU grace period. This can cause use-after-free bugs,
    as reported by syzbot.

    Mark the rds sock with SOCK_RCU_FREE before inserting it into the
    bind_hash_table, so that it is always freed only after an RCU grace
    period.

    The other problem is in rds_find_bound(): the rds sock could be
    freed between rhashtable_lookup_fast() and rds_sock_addref(), so
    the RCU read-side critical section in rds_find_bound() has to be
    extended to cover both calls and close this race (both changes are
    sketched after this entry).

    Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com
    Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
    Cc: Sowmini Varadhan
    Cc: Santosh Shilimkar
    Cc: rds-devel@oss.oracle.com
    Signed-off-by: Cong Wang
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
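    A minimal sketch of the two changes described above, against the
    generic rhashtable and socket APIs. The table, parameter and field
    names (bind_hash_table, ht_parms, rs_bound_node, rds_rs_to_sk) follow
    the wording of the message but are illustrative, not verbatim kernel
    code:

        /* Sketch only: assumes kernel context (net/rds). */
        static int rds_add_bound_sketch(struct rds_sock *rs)
        {
                /* Freed only after an RCU grace period, so lockless
                 * readers of bind_hash_table never see freed memory. */
                sock_set_flag(rds_rs_to_sk(rs), SOCK_RCU_FREE);

                return rhashtable_insert_fast(&bind_hash_table,
                                              &rs->rs_bound_node, ht_parms);
        }

        static struct rds_sock *rds_find_bound_sketch(const void *key)
        {
                struct rds_sock *rs;

                rcu_read_lock();        /* held across lookup *and* addref */
                rs = rhashtable_lookup_fast(&bind_hash_table, key, ht_parms);
                if (rs)
                        rds_sock_addref(rs);    /* safe: rs cannot be freed yet */
                rcu_read_unlock();

                return rs;
        }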
     

12 Apr, 2018

1 commit

  • [ Upstream commit 7ae0c649c47f1c5d2db8cee6dd75855970af1669 ]

    If the rds_sock is not added to the bind_hash_table, we must
    reset rs_bound_addr so that rds_remove_bound will not trip on
    this rds_sock.

    rds_add_bound() does a rds_sock_put() in this failure path, so
    failing to reset rs_bound_addr results in a socket refcount bug
    and triggers a WARN_ON with the stack shown below when the
    application subsequently tries to close the PF_RDS socket (the
    failure path is sketched after this entry).

    WARNING: CPU: 20 PID: 19499 at net/rds/af_rds.c:496 \
    rds_sock_destruct+0x15/0x30 [rds]
    :
    __sk_destruct+0x21/0x190
    rds_remove_bound.part.13+0xb6/0x140 [rds]
    rds_release+0x71/0x120 [rds]
    sock_release+0x1a/0x70
    sock_close+0xe/0x20
    __fput+0xd5/0x210
    task_work_run+0x82/0xa0
    do_exit+0x2ce/0xb30
    ? syscall_trace_enter+0x1cc/0x2b0
    do_group_exit+0x39/0xa0
    SyS_exit_group+0x10/0x10
    do_syscall_64+0x61/0x1a0

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sowmini Varadhan
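    A hedged sketch of the fixed failure path; rs_bound_addr,
    rds_sock_put() and rds_add_bound() come from the message above,
    while the helper shape and the bind_hash_table / ht_parms /
    rs_bound_node names are illustrative:

        static int rds_add_bound_sketch(struct rds_sock *rs, __be32 addr)
        {
                int ret;

                rs->rs_bound_addr = addr;
                rds_sock_addref(rs);            /* reference held by the hash table */
                ret = rhashtable_insert_fast(&bind_hash_table,
                                              &rs->rs_bound_node, ht_parms);
                if (ret) {
                        rs->rs_bound_addr = 0;  /* the fix: rds_remove_bound()
                                                 * must not see a stale address */
                        rds_sock_put(rs);       /* drop the reference taken above */
                }
                return ret;
        }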
     

29 Aug, 2017

1 commit


03 Jan, 2017

1 commit


16 Jul, 2016

1 commit

  • Use RDS probe-ping to compute how many paths may be used with
    the peer, and to synchronously start the multiple paths. If the
    transport supports multipath RDS (mprds), hash outgoing traffic to
    one of the multiple paths in rds_sendmsg() (the hashing idea is
    sketched after this entry).

    CC: Santosh Shilimkar
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
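    A hedged sketch of the path-selection idea; the c_npaths / c_path
    fields match the mprds description above, but the helper itself and
    the choice of hashing on the local port are illustrative assumptions:

        /* Pick one of the negotiated paths for this flow.  Hashing on the
         * bound local port keeps a given socket's traffic on a single path. */
        static struct rds_conn_path *rds_pick_path_sketch(struct rds_connection *conn,
                                                          u16 local_port)
        {
                int hash = 0;

                if (conn->c_npaths > 1)         /* c_npaths learned via probe-ping */
                        hash = local_port % conn->c_npaths;

                return &conn->c_path[hash];
        }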
     

03 Nov, 2015

1 commit

  • To further improve RDS connection scalability on massive systems,
    where the number of sockets grows into the tens of thousands, a
    larger bind hash table is needed. A pre-allocated 8K or 16K table
    is not very flexible in terms of memory utilisation. The rhashtable
    infrastructure gives us the flexibility to grow the hash table on
    demand and also provides efficient built-in bucket (chain) handling
    (see the sketch after this entry).

    Reviewed-by: David Miller
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    santosh.shilimkar@oracle.com
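    A minimal sketch of moving the bind table onto the generic rhashtable
    infrastructure; struct rds_sock is assumed to come from net/rds/rds.h,
    and the field names (rs_bound_key, rs_bound_node) and sizing hint are
    illustrative assumptions:

        #include <linux/rhashtable.h>

        static struct rhashtable bind_hash_table;

        static const struct rhashtable_params ht_parms = {
                .nelem_hint  = 768,             /* start small, grow on demand */
                .key_len     = sizeof(u64),     /* e.g. address and port packed together */
                .key_offset  = offsetof(struct rds_sock, rs_bound_key),
                .head_offset = offsetof(struct rds_sock, rs_bound_node),
        };

        static int rds_bind_table_init_sketch(void)
        {
                return rhashtable_init(&bind_hash_table, &ht_parms);
        }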
     

13 Oct, 2015

1 commit


01 Oct, 2015

3 commits

  • One global lock protecting a hash table with 1024 buckets isn't
    efficient, and it shows up on massive systems with truckloads of
    RDS sockets serving multiple databases. The perf data clearly
    highlights the contention on this rw lock in such workloads.

    When the contention gets worse, the code backs off on the lock
    acquisition while interrupts are still disabled, which makes the
    system sluggish and eventually leads to all sorts of bad behaviour.

    The simple fix is to move the lock into the hash bucket and use a
    per-bucket lock to improve scalability (see the sketch after this
    entry).

    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar

    Santosh Shilimkar
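    A hedged sketch of the per-bucket locking idea; the bucket struct and
    the rs_bound_node / rs_bound_addr / rs_bound_port field names are
    illustrative rather than verbatim from net/rds/bind.c:

        #define BIND_HASH_SIZE 1024

        struct bind_hash_bucket {
                rwlock_t          lock;         /* protects only this chain */
                struct hlist_head head;
        };

        static struct bind_hash_bucket bind_hash_table[BIND_HASH_SIZE];

        static struct rds_sock *rds_bind_lookup_sketch(struct bind_hash_bucket *b,
                                                       __be32 addr, __be16 port)
        {
                struct rds_sock *rs;

                read_lock(&b->lock);            /* contention limited to one bucket */
                hlist_for_each_entry(rs, &b->head, rs_bound_node) {
                        if (rs->rs_bound_addr == addr && rs->rs_bound_port == port) {
                                rds_sock_addref(rs);
                                read_unlock(&b->lock);
                                return rs;
                        }
                }
                read_unlock(&b->lock);
                return NULL;
        }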
     
  • One needs to take an rds socket reference while using it and
    release it once done with it. The rds_add_bound() code path does
    not do that, so let's fix it (see the sketch after this entry).

    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar

    Santosh Shilimkar
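    A minimal sketch of the convention the fix enforces; the surrounding
    helper is illustrative, while rds_sock_addref()/rds_sock_put() are the
    reference helpers the message refers to:

        static void rds_use_bound_sock_sketch(struct rds_sock *rs)
        {
                rds_sock_addref(rs);    /* take a reference before use */

                /* ... use rs, e.g. finish the bind against it ... */

                rds_sock_put(rs);       /* release it once done with it */
        }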
     
  • The RDS bind and release locking scheme is very inefficient. It
    uses RCU to maintain the bind hash table, which is great, but it
    also needs to hold a spinlock for [add/remove]_bound(), so overall
    the concurrent hash-table speedup doesn't pay off. In fact, the
    blocking nature of synchronize_rcu() makes RDS socket shutdown too
    slow, which hurts RDS performance, since connection shutdown and
    re-connect happen quite often to maintain the RC part of the
    protocol.

    So we make the locking scheme simpler and more efficient by
    replacing the spinlocks with reader/writer locks and getting rid
    of RCU for the bind hash table (see the sketch after this entry).

    In a subsequent patch, we also convert the global lock to a
    per-bucket lock to reduce the global lock contention.

    Signed-off-by: Santosh Shilimkar
    Signed-off-by: Santosh Shilimkar

    Santosh Shilimkar
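    A hedged sketch of the simplified scheme: one global rwlock and no
    RCU on the release path; the lock and node names are illustrative:

        static DEFINE_RWLOCK(rds_bind_lock);    /* split per bucket in a later patch */

        static void rds_remove_bound_sketch(struct rds_sock *rs)
        {
                unsigned long flags;

                write_lock_irqsave(&rds_bind_lock, flags);
                hlist_del_init(&rs->rs_bound_node);     /* no synchronize_rcu() on release */
                write_unlock_irqrestore(&rds_bind_lock, flags);
        }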
     

08 Aug, 2015

1 commit


01 Jun, 2015

1 commit

  • An application may deterministically attach the underlying transport for
    a PF_RDS socket by invoking setsockopt(2) with the SO_RDS_TRANSPORT
    option at the SOL_RDS level. The integer argument to setsockopt must be
    one of the RDS_TRANS_* transport types, e.g., RDS_TRANS_TCP. The option
    must be specified before invoking bind(2) on the socket, and may only
    be used once on the socket. An attempt to set the option on a bound
    socket, or to invoke the option after a successful SO_RDS_TRANSPORT
    attachment, will return EOPNOTSUPP (usage is sketched after this
    entry).

    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
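    A hedged userspace sketch of attaching the TCP transport before
    bind(2); it assumes a libc that defines PF_RDS and a uapi
    <linux/rds.h> that provides SOL_RDS, SO_RDS_TRANSPORT and
    RDS_TRANS_TCP, and the port number is an arbitrary example:

        #include <stdio.h>
        #include <string.h>
        #include <sys/socket.h>
        #include <netinet/in.h>
        #include <linux/rds.h>

        int main(void)
        {
                struct sockaddr_in sin;
                int trans = RDS_TRANS_TCP;
                int fd = socket(PF_RDS, SOCK_SEQPACKET, 0);

                if (fd < 0) {
                        perror("socket(PF_RDS)");
                        return 1;
                }

                /* Must be done before bind(2), and only once per socket. */
                if (setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, &trans, sizeof(trans)) < 0)
                        perror("setsockopt(SO_RDS_TRANSPORT)");

                memset(&sin, 0, sizeof(sin));
                sin.sin_family = AF_INET;
                sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
                sin.sin_port = htons(18634);    /* arbitrary example port */

                if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
                        perror("bind");

                /* A second SO_RDS_TRANSPORT setsockopt here would fail with EOPNOTSUPP. */
                return 0;
        }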
     

15 Jan, 2014

1 commit


28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were
    conceived differently from the list ones, which look like:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not
    only do they not really need it, it also prevents the iterator from
    looking exactly like the list iterator, which is unfortunate (a
    before/after example follows this entry).

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter;
    these were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
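    An illustrative before/after of the iterator change; the bucket,
    member and process() names are hypothetical:

        struct rds_sock *rs;
        struct hlist_node *node;        /* the now-redundant extra parameter */

        /* Before: the hlist iterator needed the extra node argument. */
        hlist_for_each_entry(rs, node, &bucket->head, rs_bound_node)
                process(rs);

        /* After: same shape as list_for_each_entry(). */
        hlist_for_each_entry(rs, &bucket->head, rs_bound_node)
                process(rs);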
     

17 Jun, 2011

1 commit


09 Sep, 2010

3 commits


24 Aug, 2009

1 commit


27 Feb, 2009

1 commit