Eric Lee / smarc-fsl-linux-kernel

07 Oct, 2020

2 commits

aed60a174 net: virtio_vsock: Enhance connection semantics

[ Upstream commit df12eb6d6cd920ab2f0e0a43cd6e1c23a05cea91 ]

Whenever the vsock backend on the host sends a packet through the RX
queue, it expects an answer on the TX queue. Unfortunately, there is one
case where the host side will hang waiting for the answer and might
effectively never recover if no timeout mechanism was implemented.

This issue happens when the guest side starts binding to the socket,
which inserts a new bound socket into the list of already bound sockets.
At this time, we expect the guest to also start listening, which will
trigger the sk_state to move from TCP_CLOSE to TCP_LISTEN. The problem
occurs if the host side queued a RX packet and triggered an interrupt
right between the end of the binding process and the beginning of the
listening process. In this specific case, the function processing the
packet virtio_transport_recv_pkt() will find a bound socket, which means
it will hit the switch statement checking for the sk_state, but the
state won't be changed into TCP_LISTEN yet, which leads the code to pick
the default statement. This default statement will only free the buffer,
while it should also respond to the host side, by sending a packet on
its TX queue.

To fix this unfortunate chain of events simply, the default statement
must also send back a packet containing the operation
VIRTIO_VSOCK_OP_RST, because when it is entered we know the host side
is waiting for an answer.

One could say that a proper timeout mechanism on the host side will be
enough to keep the backend from hanging. But the point of this patch is to
ensure the normal use case will be provided with proper responsiveness
when it comes to establishing the connection.
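
The behaviour described above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the `SkState` enum, the `recv_pkt()` function and the `reply_in_default` flag are hypothetical names introduced here, and the real virtio_transport_recv_pkt() handles more states and operations. It shows that a request arriving while the socket is bound but not yet listening gets no answer without the fix, and an RST with it.

```python
from enum import Enum, auto


class SkState(Enum):
    # Hypothetical stand-ins for the sk_state values named in the message.
    CLOSE = auto()        # bound, listen() not called yet (TCP_CLOSE)
    LISTEN = auto()       # TCP_LISTEN
    ESTABLISHED = auto()  # connected; data path not modelled here


OP_REQUEST = "VIRTIO_VSOCK_OP_REQUEST"
OP_RESPONSE = "VIRTIO_VSOCK_OP_RESPONSE"
OP_RST = "VIRTIO_VSOCK_OP_RST"


def recv_pkt(sk_state, op, tx_queue, reply_in_default=True):
    """Toy version of the state switch; replies are appended to tx_queue."""
    if sk_state is SkState.LISTEN and op == OP_REQUEST:
        tx_queue.append(OP_RESPONSE)
    elif sk_state is SkState.ESTABLISHED:
        pass  # established-connection handling is out of scope for this model
    else:
        # Default branch: the buffer is freed in both variants. With the fix,
        # an RST is also queued so that the host is not left waiting.
        if reply_in_default:
            tx_queue.append(OP_RST)


def replies_for_request(sk_state, reply_in_default):
    """Return what the host would see on the TX queue after one request."""
    tx_queue = []
    recv_pkt(sk_state, OP_REQUEST, tx_queue, reply_in_default)
    return tx_queue
```

With `reply_in_default=False` the TX queue stays empty for a socket in the bound-only state, which corresponds to the hang on the host side.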

Signed-off-by: Sebastien Boeuf
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Sebastien Boeuf
2020-10-07 14:01:24 +0800
215459ff3 vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()

[ Upstream commit 4c7246dc45e2706770d5233f7ce1597a07e069ba ]

We are going to add 'struct vsock_sock *' parameter to
virtio_transport_get_ops().

In some cases, like in the virtio_transport_reset_no_sock(),
we don't have any socket assigned to the packet received,
so we can't use the virtio_transport_get_ops().

In order to allow virtio_transport_reset_no_sock() to use the
'.send_pkt' callback from the 'vhost_transport' or 'virtio_transport',
we add the 'struct virtio_transport *' to it and to its caller:
virtio_transport_recv_pkt().

We moved the 'vhost_transport' and 'virtio_transport' definition,
to pass their address to the virtio_transport_recv_pkt().

Reviewed-by: Stefan Hajnoczi
Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Stefano Garzarella
2020-10-07 14:01:24 +0800

29 Jul, 2020

1 commit

ad49d7666 vsock/virtio: annotate 'the_virtio_vsock' RCU pointer

[ Upstream commit f961134a612c793d5901a93d85a29337c74af978 ]

Commit 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free
on the_virtio_vsock") starts to use RCU to protect 'the_virtio_vsock'
pointer, but we forgot to annotate it.

This patch adds the annotation to fix the following sparse errors:

net/vmw_vsock/virtio_transport.c:73:17: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:73:17: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:73:17: struct virtio_vsock *
net/vmw_vsock/virtio_transport.c:171:17: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:171:17: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:171:17: struct virtio_vsock *
net/vmw_vsock/virtio_transport.c:207:17: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:207:17: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:207:17: struct virtio_vsock *
net/vmw_vsock/virtio_transport.c:561:13: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:561:13: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:561:13: struct virtio_vsock *
net/vmw_vsock/virtio_transport.c:612:9: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:612:9: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:612:9: struct virtio_vsock *
net/vmw_vsock/virtio_transport.c:631:9: error: incompatible types in comparison expression (different address spaces):
net/vmw_vsock/virtio_transport.c:631:9: struct virtio_vsock [noderef] __rcu *
net/vmw_vsock/virtio_transport.c:631:9: struct virtio_vsock *

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock")
Reported-by: Michael S. Tsirkin
Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: Jakub Kicinski
Signed-off-by: Sasha Levin

Stefano Garzarella
2020-07-29 16:18:31 +0800

11 Jun, 2020

1 commit

a02203349 vsock: fix timeout in vsock_accept()

[ Upstream commit 7e0afbdfd13d1e708fe96e31c46c4897101a6a43 ]

The accept(2) is an "input" socket interface, so we should use
SO_RCVTIMEO instead of SO_SNDTIMEO to set the timeout.

So this patch replaces sock_sndtimeo() with sock_rcvtimeo() to
use the right timeout in the vsock_accept().

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Stefano Garzarella
Reviewed-by: Jorgen Hansen
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Stefano Garzarella
2020-06-11 02:24:55 +0800

15 Feb, 2020

1 commit

b96c27b18 hv_sock: Remove the accept port restriction

[ Upstream commit c742c59e1fbd022b64d91aa9a0092b3a699d653c ]

Currently, hv_sock restricts the port the guest socket can accept
connections on. hv_sock divides the socket port namespace into two parts
for server side (listening socket), 0-0x7FFFFFFF & 0x80000000-0xFFFFFFFF
(there are no restrictions on client port namespace). The first part
(0-0x7FFFFFFF) is reserved for sockets where connections can be accepted.
The second part (0x80000000-0xFFFFFFFF) is reserved for allocating ports
for the peer (host) socket, once a connection is accepted.

This reservation of the port namespace is specific to hv_sock and not
known by the generic vsock library (ex: af_vsock). This is problematic
because auto-binds/ephemeral ports are handled by the generic vsock
library and it has no knowledge of this port reservation and could
allocate a port that is not compatible with hv_sock (and legitimately so).

The issue hasn't surfaced so far because the auto-bind code of vsock
(__vsock_bind_stream) prior to the change 'VSOCK: bind to random port for
VMADDR_PORT_ANY' would start walking up from LAST_RESERVED_PORT (1023) and
start assigning ports. That will take a large number of iterations to hit
0x7FFFFFFF. But, after the above change to randomize port selection, the
issue has started coming up more frequently.

There has really been no good reason to have this port reservation logic
in hv_sock from the get go. Reserving a local port for peer ports is not
how things are handled generally. Peer ports should reflect the peer port.

This fixes the issue by lifting the port reservation, and also returns the
right peer port. Since the code converts the GUID to the peer port (by
using the first 4 bytes), there is a possibility of conflicts, but that
seems like a reasonable risk to take, given this is limited to vsock and
that only applies to all local sockets.
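
To make the conflict risk concrete, here is a hedged sketch of a GUID-to-port conversion that keeps only the first 4 bytes. The function name `guid_to_port()`, the little-endian byte order and both GUIDs are assumptions chosen for illustration; the message only states that the first 4 bytes are used.

```python
import uuid


def guid_to_port(guid):
    """Assumed conversion: first 4 bytes of the GUID as a little-endian u32."""
    return int.from_bytes(uuid.UUID(guid).bytes_le[:4], "little")


# Two made-up GUIDs that differ everywhere except in their first 4 bytes.
guid_a = "00000400-1111-2222-3333-444444444444"
guid_b = "00000400-aaaa-bbbb-cccc-dddddddddddd"

port_a = guid_to_port(guid_a)
port_b = guid_to_port(guid_b)
```

Both GUIDs map to the same port number here, which is the kind of conflict the message accepts as a reasonable risk.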

Signed-off-by: Sunil Muthuswamy
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Sunil Muthuswamy
2020-02-15 05:34:07 +0800

09 Nov, 2019

1 commit

ad8a72203 vsock/virtio: fix sock refcnt holding during the shutdown

The "42f5cda5eaf4" commit rightly set SOCK_DONE on peer shutdown,
but there is an issue if we receive the SHUTDOWN(RDWR) while the
virtio_transport_close_timeout() is scheduled.
In this case, when the timeout fires, the SOCK_DONE is already
set and the virtio_transport_close_timeout() will not call
virtio_transport_reset() and virtio_transport_do_close().
As a result, both sockets remain open and will never be released,
preventing the unloading of [virtio|vhost]_transport modules.

This patch fixes this issue by calling virtio_transport_reset() and
virtio_transport_do_close() when we receive the SHUTDOWN(RDWR)
and there is nothing left to read.

Fixes: 42f5cda5eaf4 ("vsock/virtio: set SOCK_DONE on peer shutdown")
Cc: Stephen Barber
Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-11-09 04:17:50 +0800

29 Oct, 2019

1 commit

3ef7cf57c net: use skb_queue_empty_lockless() in poll() handlers

Many poll() handlers are lockless. Using skb_queue_empty_lockless()
instead of skb_queue_empty() is more appropriate.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-10-29 04:33:41 +0800

19 Oct, 2019

2 commits

ae6fcfbf5 vsock/virtio: discard packets if credit is not respected

If the remote peer doesn't respect the credit information
(buf_alloc, fwd_cnt), sending more data than it is allowed to send,
we should drop the packets to prevent a malicious peer
from using all of our memory.

This patch follows the VIRTIO spec: "VIRTIO_VSOCK_OP_RW data
packets MUST only be transmitted when the peer has sufficient
free buffer space for the payload"
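
A receiver-side check of this kind could look roughly like the following model. It is a sketch under assumptions: the `RxCredit` class, its `rx_bytes` counter and the method names are hypothetical, and only `buf_alloc` is taken from the message. A packet is accepted only if it fits in the space advertised to the peer; otherwise it is dropped.

```python
class RxCredit:
    """Toy model of per-socket receive accounting."""

    def __init__(self, buf_alloc):
        self.buf_alloc = buf_alloc  # bytes advertised to the peer
        self.rx_bytes = 0           # bytes queued and not yet read (assumed name)

    def accept(self, pkt_len):
        """Return True and account for the packet, or False to drop it."""
        if self.rx_bytes + pkt_len > self.buf_alloc:
            return False  # peer ignored the credit: discard the packet
        self.rx_bytes += pkt_len
        return True

    def consume(self, nbytes):
        """The application read nbytes, freeing that much space."""
        self.rx_bytes -= min(nbytes, self.rx_bytes)
```

A peer that respects the credit never triggers the drop path, so well-behaved senders are unaffected.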

Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-10-19 01:19:43 +0800
ec3359b68 vsock/virtio: send a credit update when buffer size is changed

When the user application sets a new buffer size value, we should
update the remote peer about this change, since it uses this
information to calculate the credit available.

Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-10-19 01:19:43 +0800

02 Oct, 2019

1 commit

0d9138ffa vsock: Fix a lockdep warning in __vsock_release()

Lockdep is unhappy if two locks from the same class are held.

Fix the below warning for hyperv and virtio sockets (vmci socket code
doesn't have the issue) by using lock_sock_nested() when __vsock_release()
is called recursively:

============================================
WARNING: possible recursive locking detected
5.3.0+ #1 Not tainted
--------------------------------------------
server/1795 is trying to acquire lock:
ffff8880c5158990 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]

but task is already holding lock:
ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(sk_lock-AF_VSOCK);
lock(sk_lock-AF_VSOCK);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by server/1795:
#0: ffff8880c5d05ff8 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
#1: ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]

stack backtrace:
CPU: 5 PID: 1795 Comm: server Not tainted 5.3.0+ #1
Call Trace:
dump_stack+0x67/0x90
__lock_acquire.cold.67+0xd2/0x20b
lock_acquire+0xb5/0x1c0
lock_sock_nested+0x6d/0x90
hvs_release+0x10/0x120 [hv_sock]
__vsock_release+0x24/0xf0 [vsock]
__vsock_release+0xa0/0xf0 [vsock]
vsock_release+0x12/0x30 [vsock]
__sock_release+0x37/0xa0
sock_close+0x14/0x20
__fput+0xc1/0x250
task_work_run+0x98/0xc0
do_exit+0x344/0xc60
do_group_exit+0x47/0xb0
get_signal+0x15c/0xc50
do_signal+0x30/0x720
exit_to_usermode_loop+0x50/0xa0
do_syscall_64+0x24e/0x270
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f4184e85f31

Tested-by: Stefano Garzarella
Signed-off-by: Dexuan Cui
Reviewed-by: Stefano Garzarella
Signed-off-by: David S. Miller

Dexuan Cui
2019-10-02 09:23:35 +0800

05 Sep, 2019

1 commit

f4d7c8e3d vsock/virtio: a better comment on credit update

The comment we have is just repeating what the code does.
Include the *reason* for the condition instead.

Cc: Stefano Garzarella
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Stefano Garzarella
Signed-off-by: David S. Miller

Michael S. Tsirkin
2019-09-05 15:53:01 +0800

07 Aug, 2019

1 commit

13dfb3fa4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Just minor overlapping changes in the conflicts here.

Signed-off-by: David S. Miller

David S. Miller
2019-08-07 09:44:57 +0800

03 Aug, 2019

1 commit

685703b49 hv_sock: Fix hang when a connection is closed

There is a race condition for an established connection that is being closed
by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the
'remove_sock' is false):

1 for the initial value;
1 for the sk being in the bound list;
1 for the sk being in the connected list;
1 for the delayed close_work.

After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
decrease the refcnt to 3.

Concurrently, hvs_close_connection() runs in another thread:
calls vsock_remove_sock() to decrease the refcnt by 2;
calls sock_put() to decrease the refcnt to 0, and frees the sk;
next, the "release_sock(sk)" may hang due to use-after-free.

In the above, after hvs_release() finishes, if hvs_close_connection() runs
faster than "__vsock_release() -> sock_put(sk)", then there is not any issue,
because at the beginning of hvs_close_connection(), the refcnt is still 4.

The issue can be resolved if an extra reference is taken when the
connection is established.
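
The reference counting above can be replayed as plain arithmetic. The sketch below is a model, not the kernel code: `racy_close()` is a hypothetical helper, and it assumes the extra reference is still held when release_sock(sk) runs and is dropped only afterwards.

```python
def racy_close(extra_ref):
    """Replay the bad interleaving.

    Returns (use_after_free, refcnt_at_release_sock).
    """
    refcnt = 4 + (1 if extra_ref else 0)  # value at the end of hvs_release()
    refcnt -= 1  # __vsock_release() -> sock_put(sk) runs first
    refcnt -= 2  # hvs_close_connection(): vsock_remove_sock()
    refcnt -= 1  # hvs_close_connection(): sock_put()
    # If the count reached zero, sk was freed before release_sock(sk).
    return refcnt == 0, refcnt
```

Without the extra reference the count reaches 0 before release_sock(sk); with it, one reference is still held at that point.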

Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
Signed-off-by: Dexuan Cui
Reviewed-by: Sunil Muthuswamy
Signed-off-by: David S. Miller

Dexuan Cui
2019-08-03 08:26:27 +0800

31 Jul, 2019

5 commits

0038ff357 vsock/virtio: change the maximum packet size allowed

Since now we are able to split packets, we can avoid limiting
their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
packet size.

Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-31 06:00:00 +0800
6dbd3e66e vhost/vsock: split packets to send using multiple buffers

If the packets to send to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.
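
A minimal sketch of the splitting idea, assuming a hypothetical `split_packet()` helper and ignoring the space the header itself takes in each buffer: the payload is cut into buffer-sized pieces and each piece carries its own length, as a header would. Because the socket is a stream, the receiver only needs the bytes in order.

```python
def split_packet(payload, buf_size):
    """Cut payload into pieces of at most buf_size bytes.

    Each piece is a dict with a 'len' field standing in for the length
    stored in the packet header.
    """
    if buf_size <= 0:
        raise ValueError("buf_size must be positive")
    pieces = []
    for offset in range(0, len(payload), buf_size):
        chunk = payload[offset:offset + buf_size]
        pieces.append({"len": len(chunk), "data": chunk})
    return pieces


parts = split_packet(b"abcdefghij", 4)
reassembled = b"".join(p["data"] for p in parts)
```

Reassembling the pieces in order gives back the original byte stream, which is why message boundaries do not need to be preserved.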

Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-31 06:00:00 +0800
9632e9f61 vsock/virtio: fix locking in virtio_transport_inc_tx_pkt()

fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.

Also move buf_alloc under the same lock.

Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-31 06:00:00 +0800
b89d882dc vsock/virtio: reduce credit update messages

In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
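
The condition can be sketched as follows. This is an illustration under assumptions: the 64 KiB value for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE and the 32-bit wrap-around of the counters are not stated in the message, and `needs_credit_update()` is a hypothetical name. The space the transmitter believes is free is the advertised buffer minus what has been consumed since the last update it received.

```python
MAX_PKT_BUF_SIZE = 64 * 1024  # assumed value of VIRTIO_VSOCK_MAX_PKT_BUF_SIZE
U32_MASK = 0xFFFFFFFF         # counters assumed to wrap at 32 bits


def needs_credit_update(buf_alloc, fwd_cnt, last_fwd_cnt):
    """True if the peer's view of our free space has dropped below one
    maximum-sized packet, so a credit update is worth sending."""
    unreported = (fwd_cnt - last_fwd_cnt) & U32_MASK
    free_space_seen_by_peer = buf_alloc - unreported
    return free_space_seen_by_peer < MAX_PKT_BUF_SIZE
```

With a 256 KiB buffer, no update is sent until more than 192 KiB has been consumed since the previous one.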

Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-31 06:00:00 +0800
473c7391c vsock/virtio: limit the memory used per-socket

Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).

The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.

This patch mitigates this issue by copying the payload of small
packets (< 128 bytes) into the buffer of the last packet queued, in
order to avoid wasting memory.
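
The worst case quoted above, and the effect of copying small payloads into the last queued buffer, can be checked with a short model. The helper names `queue_packet()` and `buffers_used()` are hypothetical, and the rule that a payload is merged only when it still fits in the last buffer is an assumption of this sketch.

```python
KB = 1024
CREDIT = 256 * KB      # default per-socket credit
BUF_SIZE = 4 * KB      # size of each preallocated buffer
SMALL_PKT = 128        # payloads below this size are copied

# One byte per packet: one 4 KB buffer is pinned for every byte of credit.
worst_case_buffers = CREDIT // 1
worst_case_bytes = worst_case_buffers * BUF_SIZE


def queue_packet(rx_queue, payload):
    """Append payload to the last buffer if it is small and fits,
    otherwise queue it in a buffer of its own."""
    last = rx_queue[-1] if rx_queue else None
    if (last is not None and len(payload) < SMALL_PKT
            and len(last) + len(payload) <= BUF_SIZE):
        last += payload  # bytearray grows in place
    else:
        rx_queue.append(bytearray(payload))


def buffers_used(n_packets, pkt_len, coalesce):
    """Count the buffers occupied after queuing n_packets of pkt_len bytes."""
    rx_queue = []
    for _ in range(n_packets):
        if coalesce:
            queue_packet(rx_queue, b"x" * pkt_len)
        else:
            rx_queue.append(bytearray(b"x" * pkt_len))
    return len(rx_queue)
```

In this model, 8192 one-byte packets occupy 8192 buffers without copying and 2 buffers with it.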

Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
Acked-by: Michael S. Tsirkin
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-31 06:00:00 +0800

24 Jul, 2019

1 commit

ce103204c hv_sock: Use consistent types for UUIDs

The rest of Hyper-V code is using new types for UUID handling.
Convert hv_sock as well.

Signed-off-by: Andy Shevchenko
Reviewed-by: Dexuan Cui
Signed-off-by: David S. Miller

Andy Shevchenko
2019-07-24 04:58:32 +0800

09 Jul, 2019

3 commits

e226121fc vsock/virtio: fix flush of works during the .remove()

This patch moves the flush of works after vdev->config->del_vqs(vdev),
because we need to be sure that no workers are running before we free
the 'vsock' object.

Since we stopped the workers using the [tx|rx|event]_run flags,
we are sure no one is accessing the device while we are calling
vdev->config->reset(vdev), so we can safely move the workers' flush.

Before the vdev->config->del_vqs(vdev), workers can be scheduled
by VQ callbacks, so we must flush them after del_vqs(), to avoid
use-after-free of 'vsock' object.

Suggested-by: Michael S. Tsirkin
Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-09 06:35:17 +0800
b917507e5 vsock/virtio: stop workers during the .remove()

Before calling vdev->config->reset(vdev) we need to be sure that
no one is accessing the device. For this reason, we add new variables
in the struct virtio_vsock to stop the workers during the .remove().

This patch also adds a few comments before vdev->config->reset(vdev)
and vdev->config->del_vqs(vdev).

Suggested-by: Stefan Hajnoczi
Suggested-by: Michael S. Tsirkin
Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-09 06:35:17 +0800
0deab087b vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock

Some callbacks used by the upper layers can run while we are in the
.remove(). A potential use-after-free can happen, because we free
the_virtio_vsock without knowing if the callbacks are over or not.

To solve this issue we move the assignment of the_virtio_vsock to the
end of .probe(), when all the initialization is finished, and to the
beginning of .remove(), before releasing resources.
For the same reason, we do the same for vdev->priv.

We use RCU to be sure that all callbacks that use the_virtio_vsock
have ended before freeing it. This is not required for callbacks that
use vdev->priv, because after the vdev->config->del_vqs() we are sure
that they have ended and will no longer be invoked.

We also take the mutex during the .remove() to prevent .probe() from
running while we are resetting the device.

Signed-off-by: Stefano Garzarella
Signed-off-by: David S. Miller

Stefano Garzarella
2019-07-09 06:35:17 +0800

22 Jun, 2019

3 commits

92ad6325c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Minor SPDX change conflict.

Signed-off-by: David S. Miller

David S. Miller
2019-06-22 20:59:24 +0800
c356dc4b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix leak of unqueued fragments in ipv6 nf_defrag, from Guillaume
Nault.

2) Don't access the DDM interface unless the transceiver implements it
in bnx2x, from Mauro S. M. Rodrigues.

3) Don't double fetch 'len' from userspace in sock_getsockopt(), from
JingYi Hou.

4) Sign extension overflow in lio_core, from Colin Ian King.

5) Various netem bug fixes wrt. corrupted packets from Jakub Kicinski.

6) Fix epollout hang in hvsock, from Sunil Muthuswamy.

7) Fix regression in default fib6_type, from David Ahern.

8) Handle memory limits in tcp_fragment more appropriately, from Eric
Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
tcp: refine memory limit test in tcp_fragment()
inet: clear num_timeout reqsk_alloc()
net: mvpp2: debugfs: Add pmap to fs dump
ipv6: Default fib6_type to RTN_UNICAST when not set
net: hns3: Fix inconsistent indenting
net/af_iucv: always register net_device notifier
net/af_iucv: build proper skbs for HiperTransport
net/af_iucv: remove GFP_DMA restriction for HiperTransport
net: dsa: mv88e6xxx: fix shift of FID bits in mv88e6185_g1_vtu_loadpurge()
hvsock: fix epollout hang from race condition
net/udp_gso: Allow TX timestamp with UDP GSO
net: netem: fix use after free and double free with packet corruption
net: netem: fix backlog accounting for corrupted GSO frames
net: lio_core: fix potential sign-extension overflow on large shift
tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb
ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL
ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL
tun: wake up waitqueues after IFF_UP is set
net: remove duplicate fetch in sock_getsockopt
tipc: fix issues with early FAILOVER_MSG from peer
...

Linus Torvalds
2019-06-22 13:23:35 +0800
c884d8ac7 Merge tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull still more SPDX updates from Greg KH:
"Another round of SPDX updates for 5.2-rc6

Here is what I am guessing is going to be the last "big" SPDX update
for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates
that were "easy" to determine by pattern matching. The ones after this
are going to be a bit more difficult and the people on the spdx list
will be discussing them on a case-by-case basis now.

Another 5000+ files are fixed up, so our overall totals are:
Files checked: 64545
Files with SPDX: 45529

Compared to the 5.1 kernel which was:
Files checked: 63848
Files with SPDX: 22576

This is a huge improvement.

Also, we deleted another 20000 lines of boilerplate license crud,
always nice to see in a diffstat"

* tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits)
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485
...

Linus Torvalds
2019-06-22 00:58:42 +0800

19 Jun, 2019

2 commits

7a338472f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482

Based on 1 normalized pattern(s):

this work is licensed under the terms of the gnu gpl version 2

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 48 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Reviewed-by: Enrico Weigelt
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081204.624030236@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-06-19 23:09:52 +0800
cb359b604 hvsock: fix epollout hang from race condition

Currently, hvsock can enter into a state where epoll_wait on EPOLLOUT will
not return even when the hvsock socket is writable, under some race
condition. This can happen under the following sequence:
- fd = socket(hvsocket)
- fd_out = dup(fd)
- fd_in = dup(fd)
- start a writer thread that writes data to fd_out with a combination of
epoll_wait(fd_out, EPOLLOUT) and
- start a reader thread that reads data from fd_in with a combination of
epoll_wait(fd_in, EPOLLIN)
- On the host, there are two threads that are reading/writing data to the
hvsocket

stack:
hvs_stream_has_space
hvs_notify_poll_out
vsock_poll
sock_poll
ep_poll

Race condition:
check for epollout from ep_poll():
    assume no writable space in the socket
    hvs_stream_has_space() returns 0
check for epollin from ep_poll():
    assume socket has some free space < HVS_PKT_LEN(HVS_SEND_BUF_SIZE)
    hvs_stream_has_space() will clear the channel pending send size
    host will not notify the guest because the pending send size has
    been cleared and so the hvsocket will never mark the
    socket writable

Now, the EPOLLOUT will never return even if the socket write buffer is
empty.

The fix is to set the pending size to the default size and never change it.
This way the host will always notify the guest whenever the writable space
is bigger than the pending size. The host is already optimized to *only*
notify the guest when the pending size threshold boundary is crossed and
not every time.

This change also reduces the cpu usage somewhat since hvs_stream_has_space()
is in the hotpath of send:
vsock_stream_sendmsg()->hvs_stream_has_space()
Earlier hvs_stream_has_space() was setting/clearing the pending size on every
call.

Signed-off-by: Sunil Muthuswamy
Reviewed-by: Dexuan Cui
Signed-off-by: David S. Miller

Sunil Muthuswamy
2019-06-19 09:41:12 +0800

18 Jun, 2019

2 commits

13091aa30 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Honestly all the conflicts were simple overlapping changes,
nothing really interesting to report.

Signed-off-by: David S. Miller

David S. Miller
2019-06-18 11:20:36 +0800
da0f38202 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Lots of bug fixes here:

1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer.

2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John
Crispin.

3) Use after free in psock backlog workqueue, from John Fastabend.

4) Fix source port matching in fdb peer flow rule of mlx5, from Raed
Salem.

5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet.

6) Network header needs to be set for packet redirect in nfp, from
John Hurley.

7) Fix udp zerocopy refcnt, from Willem de Bruijn.

8) Don't assume linear buffers in vxlan and geneve error handlers,
from Stefano Brivio.

9) Fix TOS matching in mlxsw, from Jiri Pirko.

10) More SCTP cookie memory leak fixes, from Neil Horman.

11) Fix VLAN filtering in rtl8366, from Linus Walleij.

12) Various TCP SACK payload size and fragmentation memory limit fixes
from Eric Dumazet.

13) Use after free in pneigh_get_next(), also from Eric Dumazet.

14) LAPB control block leak fix from Jeremy Sowden"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
lapb: fixed leak of control-blocks.
tipc: purge deferredq list for each grp member in tipc_group_delete
ax25: fix inconsistent lock state in ax25_destroy_timer
neigh: fix use-after-free read in pneigh_get_next
tcp: fix compile error if !CONFIG_SYSCTL
hv_sock: Suppress bogus "may be used uninitialized" warnings
be2net: Fix number of Rx queues used for flow hashing
net: handle 802.1P vlan 0 packets properly
tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
tcp: add tcp_min_snd_mss sysctl
tcp: tcp_fragment() should apply sane memory limits
tcp: limit payload size of sacked skbs
Revert "net: phylink: set the autoneg state in phylink_phy_change"
bpf: fix nested bpf tracepoints with per-cpu data
bpf: Fix out of bounds memory access in bpf_sk_storage
vsock/virtio: set SOCK_DONE on peer shutdown
net: dsa: rtl8366: Fix up VLAN filtering
net: phylink: set the autoneg state in phylink_phy_change
net: add high_order_alloc_disable sysctl/static key
tcp: add tcp_tx_skb_cache sysctl
...

Linus Torvalds
2019-06-18 06:55:34 +0800

17 Jun, 2019

1 commit

d424a2afd hv_sock: Suppress bogus "may be used uninitialized" warnings

gcc 8.2.0 may report these bogus warnings under some condition:

warning: ‘vnew’ may be used uninitialized in this function
warning: ‘hvs_new’ may be used uninitialized in this function

Actually, the 2 pointers are only initialized and used if the variable
"conn_from_host" is true. The code is not buggy here.

Signed-off-by: Dexuan Cui
Signed-off-by: David S. Miller

Dexuan Cui
2019-06-17 05:00:51 +0800

16 Jun, 2019

1 commit

42f5cda5e vsock/virtio: set SOCK_DONE on peer shutdown

Set the SOCK_DONE flag to match the TCP_CLOSING state when a peer has
shut down and there is nothing left to read.

This fixes the following bug:
1) Peer sends SHUTDOWN(RDWR).
2) Socket enters TCP_CLOSING but SOCK_DONE is not set.
3) read() returns -ENOTCONN until close() is called, then returns 0.

Signed-off-by: Stephen Barber
Signed-off-by: David S. Miller

Stephen Barber
2019-06-16 05:01:09 +0800

15 Jun, 2019

1 commit

d5afa82c9 vsock: correct removal of socket from the list

The current vsock code for removal of socket from the list is both
subject to race and inefficient. It takes the lock, checks whether
the socket is in the list, drops the lock and if the socket was on the
list, deletes it from the list. This is subject to race because as soon
as the lock is dropped once it is checked for presence, that condition
cannot be relied upon for any decision. It is also inefficient because
if the socket is present in the list, it takes the lock twice.
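
The two removal patterns can be contrasted in a small userspace model. This is a sketch only: the `SockList` class, its method names and the acquisition counter are hypothetical, and a Python set stands in for the kernel's list. The first method mirrors the check-then-delete sequence described above; the second does the check and the deletion under a single lock acquisition.

```python
import threading


class SockList:
    """Toy socket table that counts how often its lock is taken."""

    def __init__(self, socks=()):
        self._lock = threading.Lock()
        self._socks = set(socks)
        self.lock_taken = 0  # for illustration only

    def _locked(self):
        self.lock_taken += 1
        return self._lock

    def remove_check_then_delete(self, sk):
        """Old pattern: the result of the check is stale once the lock drops."""
        with self._locked():
            present = sk in self._socks
        # Window: another thread may remove (or re-add) sk right here.
        if present:
            with self._locked():
                self._socks.discard(sk)

    def remove_atomic(self, sk):
        """Check and delete under one lock acquisition."""
        with self._locked():
            self._socks.discard(sk)

    def __contains__(self, sk):
        with self._lock:
            return sk in self._socks
```

For a socket that is present, the first method takes the lock twice and leaves a window between the two critical sections, while the second takes it once.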

Signed-off-by: Sunil Muthuswamy
Signed-off-by: David S. Miller

Sunil Muthuswamy
2019-06-15 10:20:20 +0800

08 Jun, 2019

1 commit

a6cdeeb16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller

David S. Miller
2019-06-08 02:00:14 +0800

05 Jun, 2019

2 commits

685a6bf84 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 321

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 and no later version this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 33 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Kate Stewart
Reviewed-by: Alexios Zavras
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000435.345978407@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-06-05 23:37:05 +0800
2025cf9e1 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Reviewed-by: Alexios Zavras
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-06-05 23:36:37 +0800

31 May, 2019

1 commit

2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

23 May, 2019

2 commits

14a1eaa88 hv_sock: perf: loop in send() to maximize bandwidth

Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't take advantage of the case where
a simultaneous reader is draining data from the channel. In that case,
send() can maximize bandwidth (and consequently reduce CPU cycles) by
iterating until the channel is found to be full.
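
The difference between a single pass and a loop can be shown with a toy model.
This is a minimal Python sketch under stated assumptions: Channel is a
hypothetical bounded buffer standing in for the VMBUS channel, max_pkt is an
assumed per-iteration limit, and on_write is a hook that simulates a
concurrent reader. It is not the hv_sock implementation.

```python
class Channel:
    """Toy bounded channel standing in for the VMBUS ring (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def space(self):
        return self.capacity - self.used

    def write(self, nbytes):
        self.used += nbytes

    def drain(self, nbytes):
        self.used -= min(nbytes, self.used)


def send_once(chan, data_len, max_pkt):
    """Single pass: write at most one packet, then return."""
    n = min(data_len, max_pkt, chan.space())
    if n > 0:
        chan.write(n)
    return n


def send_loop(chan, data_len, max_pkt, on_write=None):
    """Keep writing until the data is consumed or the channel is full."""
    sent = 0
    while sent < data_len:
        n = send_once(chan, data_len - sent, max_pkt)
        if n == 0:
            break  # channel full: stop and report what was sent
        sent += n
        if on_write is not None:
            on_write(chan)  # simulated reader draining concurrently
    return sent
```

With a reader that drains as fast as the writer fills, one send_loop call moves
the whole buffer, whereas send_once would need one call per packet.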

Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------|-----------------------|-----------------------|
|        |        Current        |   Send Buffer Loop    |
|--------|------------|----------|------------|----------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------|------------|----------|------------|----------|
| Min    | 407        | 7.048    | 401        | 5.958    |
| Max    | 455        | 7.563    | 542        | 6.993    |
| Avg    | 440        | 7.411    | 451        | 6.639    |
| Median | 446        | 7.417    | 447        | 6.761    |
|--------|------------|----------|------------|----------|

Observations:
1. The average throughput barely changes in this scenario (440 vs. 451
MB/s), most probably because the throughput bottleneck is somewhere else.
2. The average system (or kernel) CPU time goes down by just over 10%
(7.411 s to 6.639 s) with this change, for the same amount of data
transferred.
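
The percentages behind these observations can be reproduced from the Avg row
of the table above. A short Python sketch (pct_change is a hypothetical helper
name):

```python
def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Avg row of the table: Current vs. Send Buffer Loop.
cpu_change = pct_change(7.411, 6.639)      # CPU sys time in seconds
throughput_change = pct_change(440, 451)   # throughput in MB/s

print(f"CPU sys time: {cpu_change:.1f}%")
print(f"Throughput:   {throughput_change:+.1f}%")
```

The CPU figure comes out at roughly -10.4% and the throughput figure at
roughly +2.5%, matching the "10%+" reduction and the near-flat throughput
noted above.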

Signed-off-by: Sunil Muthuswamy
Reviewed-by: Dexuan Cui
Signed-off-by: David S. Miller

Sunil Muthuswamy
2019-05-23 09:00:36 +0800
ac383f58f hv_sock: perf: Allow the socket buffer size options to influence the actual socket buffers

Currently, the hv_sock buffer size is static and can't scale to the
bandwidth requirements of the application. This change allows the
applications to influence the socket buffer sizes using the SO_SNDBUF and
the SO_RCVBUF socket options.

A few interesting points to note:
1. Since the VMBUS does not allow the ring to be resized, the socket
buffer size option should be set prior to establishing the connection
for it to take effect.
2. Setting the socket option comes at the cost of that much memory being
reserved/allocated by the kernel for the lifetime of the connection.
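
From an application's point of view, point 1 means the options have to be
applied between socket() and connect(). The following Python sketch shows that
ordering with the standard socket module. make_sized_socket is a hypothetical
helper, and the address family is left as a parameter because this example
makes no assumption about the family available on a given system.

```python
import socket


def make_sized_socket(family, sndbuf=None, rcvbuf=None):
    """Create a stream socket and apply buffer-size options before connect()."""
    s = socket.socket(family, socket.SOCK_STREAM)
    if sndbuf is not None:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    if rcvbuf is not None:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    return s  # the caller connects only after the options are in place


if __name__ == "__main__":
    s = make_sized_socket(socket.AF_UNIX, sndbuf=16 * 1024)
    # The kernel may report a larger value than requested (Linux doubles it).
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    s.close()
```

On a guest using hv_sock, the same helper would presumably be called with
socket.AF_VSOCK and followed by connect((cid, port)); that usage is an
assumption and is not exercised here.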

Perf data:
Total Data Transfer: 1GB
Single threaded reader/writer
Results below are summarized over 10 iterations.

Linux hvsocket writer + Windows hvsocket reader:
Throughput in MB/s (min/max/avg/median), by packet size (columns) and
SO_SNDBUF size (rows):
|----------------|-----------------|-----------------|-----------------|---------------------|
| SO_SNDBUF size | 128B            | 1KB             | 4KB             | 64KB                |
|----------------|-----------------|-----------------|-----------------|---------------------|
| Default        | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 | 410/491/462/470     |
| 16KB           | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 | 592/824/692/676     |
| 32KB           | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 |
| 64KB           | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 |
|----------------|-----------------|-----------------|-----------------|---------------------|

Windows hvsocket writer + Linux hvsocket reader:
Throughput in MB/s (min/max/avg/median), by packet size (columns) and
SO_RCVBUF size (rows):
|----------------|-------------|-----------------|---------------------|---------------------|
| SO_RCVBUF size | 128B        | 1KB             | 4KB                 | 64KB                |
|----------------|-------------|-----------------|---------------------|---------------------|
| Default        | 69/82/75/73 | 313/343/333/336 | 418/477/446/445     | 659/701/676/678     |
| 16KB           | 69/83/76/77 | 350/401/375/382 | 506/548/517/516     | 602/624/615/615     |
| 32KB           | 62/83/73/73 | 471/529/496/494 | 830/1046/935/939    | 944/1180/1070/1100  |
| 64KB           | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|----------------|-------------|-----------------|---------------------|---------------------|

Signed-off-by: Sunil Muthuswamy
Reviewed-by: Dexuan Cui
Signed-off-by: David S. Miller

Sunil Muthuswamy
2019-05-23 09:00:14 +0800

22 May, 2019

1 commit

2c1212de6 Merge tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull SPDX update from Greg KH:
"Here is a series of patches that add SPDX tags to different kernel
files, based on two different things:

- SPDX entries are added to a bunch of files that we missed a year
ago that do not have any license information at all.

These were either missed because the tool saw the MODULE_LICENSE()
tag, or some EXPORT_SYMBOL tags, and got confused and thought the
file had a real license, or the files have been added since the
last big sweep, or they were Makefile/Kconfig files, which we
didn't touch last time.

- Add GPL-2.0-only or GPL-2.0-or-later tags to files where our scan
tools can determine the license text in the file itself. Where this
happens, the license text is removed, in order to cut down on the
700+ different ways we have in the kernel today, in a quest to get
rid of all of these.

These patches have been out for review on the linux-spdx@vger mailing
list, and while they were created by automatic tools, they were
hand-verified by a bunch of different people, all of whose names are on
the patches as reviewers.

The reason for these "large" patches is if we were to continue to
progress at the current rate of change in the kernel, adding license
tags to individual files in different subsystems, we would be finished
in about 10 years at the earliest.

There will be more series of these types of patches coming over the
next few weeks as the tools and reviewers crunch through the more
"odd" variants of how to say "GPLv2" that developers have come up with
over the years, combined with other fun oddities (GPL + a BSD
disclaimer?) that are being unearthed, with the goal for the whole
kernel to be cleaned up.

These diffstats are not small: 3840 files are touched and over 10k lines
removed in just 24 patches"

* tag 'spdx-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (24 commits)
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 25
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 24
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 23
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 22
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 21
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 20
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 19
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 18
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 17
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 15
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 14
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 12
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 11
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 10
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 7
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 5
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 4
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 3
...

Linus Torvalds
2019-05-22 03:33:38 +0800

21 May, 2019

1 commit

ec8f24b7f treewide: Add SPDX license identifier - Makefile/Kconfig

Add SPDX license identifiers to all Make/Kconfig files which:

- Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only
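
For illustration, the tagging step can be sketched in a few lines of Python.
The sketch assumes the '#' comment form of the tag used in Makefile and
Kconfig files. It only checks for an existing SPDX tag, not for other forms of
license text. add_spdx_tag is a hypothetical helper, not the tooling that
produced this commit.

```python
SPDX_LINE = "# SPDX-License-Identifier: GPL-2.0-only"


def add_spdx_tag(text):
    """Prepend the SPDX tag line unless the file already carries one."""
    if "SPDX-License-Identifier:" in text:
        return text  # already tagged: leave untouched
    return SPDX_LINE + "\n" + text


if __name__ == "__main__":
    print(add_spdx_tag("obj-y += foo.o\n"), end="")
```

Applying the helper twice gives the same result as applying it once, so it is
safe to re-run over files that were already tagged.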

Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-21 16:50:46 +0800