24 Nov, 2020

1 commit

  • Starting from commit 8692cefc433f ("virtio_vsock: Fix race condition
    in virtio_transport_recv_pkt"), we discard packets in
    virtio_transport_recv_pkt() if the socket has been released.

    When the socket is connected, we schedule a delayed work to wait for the
    RST packet from the other peer, even if SHUTDOWN_MASK is set in
    sk->sk_shutdown.
    This is done to complete the virtio-vsock shutdown algorithm, releasing
    the port assigned to the socket definitively only when the other peer
    has consumed all the packets.

    If we discard the received RST packet, the socket is closed only when
    the VSOCK_CLOSE_TIMEOUT is reached.

    Sergio discovered the issue while running the ab(1) HTTP benchmark using
    libkrun [1] and observing a latency increase with that commit.

    To avoid this issue, we now discard a packet only if the socket is
    really closed (SOCK_DONE flag is set).
    We also set SOCK_DONE in virtio_transport_release() when we don't need
    to wait for any packets from the other peer (i.e. we didn't schedule the
    delayed work). In this case we also remove the socket from the vsock
    lists, releasing the assigned port.

    [1] https://github.com/containers/libkrun
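
    A minimal sketch of the idea (not the exact upstream diff; names taken
    from net/vmw_vsock/virtio_transport_common.c):

        /* virtio_transport_recv_pkt(): discard only when really closed */
        if (sock_flag(sk, SOCK_DONE)) {
                (void)virtio_transport_reset_no_sock(t, pkt);
                release_sock(sk);
                sock_put(sk);
                goto free_pkt;
        }

        /* virtio_transport_release(): no delayed work scheduled, so mark
         * the socket as done and release its bound port right away */
        if (remove_sock) {
                sock_set_flag(sk, SOCK_DONE);
                vsock_remove_sock(vsk);
        }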

    Fixes: 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt")
    Cc: justin.he@arm.com
    Reported-by: Sergio Lopez
    Tested-by: Sergio Lopez
    Signed-off-by: Stefano Garzarella
    Acked-by: Jia He
    Link: https://lore.kernel.org/r/20201120104736.73749-1-sgarzare@redhat.com
    Signed-off-by: Jakub Kicinski

    Stefano Garzarella
     

15 Nov, 2020

1 commit

    Before commit c0cfa2d8a788 ("vsock: add multi-transports support"),
    if a G2H transport was loaded (e.g. the virtio transport), every packet
    was forwarded to the host, regardless of the destination CID.
    The H2G transports implemented until then (vhost-vsock, VMCI) always
    responded with an error if the destination CID was not
    VMADDR_CID_HOST.

    Since that commit, we use the remote CID to decide which transport to
    use, so packets with remote CID > VMADDR_CID_HOST (2) are sent only
    through an H2G transport. If no H2G transport is available, packets are
    discarded directly in the guest.

    Some use cases (e.g. Nitro Enclaves [1]) rely on the old behaviour to
    implement sibling-VM communication, so we restore the old behaviour when
    no H2G transport is registered.
    It is then up to the host to discard packets if the destination is not
    the right one, as was already the case before multi-transport support
    was added.

    Tested with nested QEMU/KVM by me and with Nitro Enclaves by Andra.

    [1] Documentation/virt/ne_overview.rst
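
    Roughly, the transport selection in vsock_assign_transport() becomes
    (a sketch, assuming the upstream names transport_g2h/transport_h2g):

        case SOCK_STREAM:
                if (vsock_use_local_transport(remote_cid))
                        new_transport = transport_local;
                else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g)
                        /* no H2G transport: forward to the host as before
                         * multi-transport support, the host will discard
                         * packets with a wrong destination */
                        new_transport = transport_g2h;
                else
                        new_transport = transport_h2g;
                break;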

    Cc: Jorgen Hansen
    Cc: Dexuan Cui
    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Reported-by: Andra Paraschiv
    Tested-by: Andra Paraschiv
    Signed-off-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20201112133837.34183-1-sgarzare@redhat.com
    Signed-off-by: Jakub Kicinski

    Stefano Garzarella
     

27 Oct, 2020

1 commit

    During __vsock_create(), CAP_NET_ADMIN is used to determine whether
    vsock_sock->trusted should be set to true. This value is used later to
    determine whether a remote connection should be allowed to connect to a
    restricted VM. Unfortunately, if the caller doesn't have CAP_NET_ADMIN,
    an audit message such as an SELinux denial is generated even if the
    caller does not want a trusted socket.

    Logging errors on success is confusing. To avoid this, switch the
    capable(CAP_NET_ADMIN) check to the noaudit version.
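
    The change is essentially a one-liner in __vsock_create() (sketch):

        /* previously capable(CAP_NET_ADMIN), which audits the denial */
        vsk->trusted = psk ? psk->trusted :
                       ns_capable_noaudit(&init_user_ns, CAP_NET_ADMIN);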

    Reported-by: Roman Kiryanov
    https://android-review.googlesource.com/c/device/generic/goldfish/+/1468545/
    Signed-off-by: Jeff Vander Stoep
    Reviewed-by: James Morris
    Link: https://lore.kernel.org/r/20201023143757.377574-1-jeffv@google.com
    Signed-off-by: Jakub Kicinski

    Jeff Vander Stoep
     

13 Aug, 2020

1 commit

    syzbot reported an issue where vsock_poll() finds the socket state set
    to TCP_ESTABLISHED while 'transport' is NULL:
    general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN
    KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
    CPU: 0 PID: 8227 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:vsock_poll+0x75a/0x8e0 net/vmw_vsock/af_vsock.c:1038
    Call Trace:
    sock_poll+0x159/0x460 net/socket.c:1266
    vfs_poll include/linux/poll.h:90 [inline]
    do_pollfd fs/select.c:869 [inline]
    do_poll fs/select.c:917 [inline]
    do_sys_poll+0x607/0xd40 fs/select.c:1011
    __do_sys_poll fs/select.c:1069 [inline]
    __se_sys_poll fs/select.c:1057 [inline]
    __x64_sys_poll+0x18c/0x440 fs/select.c:1057
    do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    This issue can happen if the TCP_ESTABLISHED state is set after we read
    vsk->transport in vsock_poll().

    We could add barriers to synchronize, but since this can only happen
    during connection setup, we can simply check that 'transport' is valid
    before using it.
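
    A sketch of the check in vsock_poll() (the transport pointer was read
    earlier from vsk->transport):

        /* only touch transport callbacks when a transport is assigned */
        if (transport && sk->sk_state == TCP_ESTABLISHED) {
                ...
        }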

    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Reported-and-tested-by: syzbot+a61bac2fcc1a7c6623fe@syzkaller.appspotmail.com
    Signed-off-by: Stefano Garzarella
    Reviewed-by: Jorgen Hansen
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

26 Jul, 2020

1 commit

  • The UDP reuseport conflict was a little bit tricky.

    The net-next code, via bpf-next, extracted the reuseport handling
    into a helper so that the BPF sk lookup code could invoke it.

    At the same time, commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace
    changed the reuseport handling of unconnected sockets so that the
    reuseport result is carried on into the rest of the lookup loop if we
    do not return immediately.

    This requires moving the reuseport_has_conns() logic into the callers.

    While we are here, get rid of inline directives as they do not belong
    in foo.c files.

    The other changes were cases of more straightforward overlapping
    modifications.

    Signed-off-by: David S. Miller

    David S. Miller
     

25 Jul, 2020

1 commit

    Rework the remaining setsockopt code to pass a sockptr_t instead of a
    plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
    outside of architecture-specific code.
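
    A sketch of the pattern (not the exact vsock diff): the setsockopt
    callbacks now take a sockptr_t and use copy_from_sockptr(), which works
    for both kernel and user pointers:

        static int vsock_stream_setsockopt(struct socket *sock, int level,
                                           int optname, sockptr_t optval,
                                           unsigned int optlen)
        {
                u64 val;

                if (optlen < sizeof(val))
                        return -EINVAL;
                if (copy_from_sockptr(&val, optval, sizeof(val)))
                        return -EFAULT;
                ...
        }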

    Signed-off-by: Christoph Hellwig
    Acked-by: Stefan Schmidt [ieee802154]
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

16 Jul, 2020

1 commit

    Commit 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free
    on the_virtio_vsock") started using RCU to protect the
    'the_virtio_vsock' pointer, but we forgot to annotate it with __rcu.

    This patch adds the annotation to fix the following sparse errors:

    net/vmw_vsock/virtio_transport.c:73:17: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:73:17: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:73:17: struct virtio_vsock *
    net/vmw_vsock/virtio_transport.c:171:17: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:171:17: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:171:17: struct virtio_vsock *
    net/vmw_vsock/virtio_transport.c:207:17: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:207:17: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:207:17: struct virtio_vsock *
    net/vmw_vsock/virtio_transport.c:561:13: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:561:13: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:561:13: struct virtio_vsock *
    net/vmw_vsock/virtio_transport.c:612:9: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:612:9: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:612:9: struct virtio_vsock *
    net/vmw_vsock/virtio_transport.c:631:9: error: incompatible types in comparison expression (different address spaces):
    net/vmw_vsock/virtio_transport.c:631:9: struct virtio_vsock [noderef] __rcu *
    net/vmw_vsock/virtio_transport.c:631:9: struct virtio_vsock *
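
    The fix is the __rcu annotation on the global pointer, which sparse then
    checks against the rcu_dereference()/rcu_assign_pointer() usage (sketch):

        static struct virtio_vsock __rcu *the_virtio_vsock;
        ...
        struct virtio_vsock *vsock = rcu_dereference(the_virtio_vsock);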

    Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock")
    Reported-by: Michael S. Tsirkin
    Signed-off-by: Stefano Garzarella
    Reviewed-by: Stefan Hajnoczi
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jakub Kicinski

    Stefano Garzarella
     

06 Jun, 2020

1 commit

  • Fix the following gcc-9.3 warning when building with 'make W=1':
    net/vmw_vsock/vmci_transport.c:2058:6: warning: no previous prototype
    for ‘vmci_vsock_transport_cb’ [-Wmissing-prototypes]
    2058 | void vmci_vsock_transport_cb(bool is_host)
    | ^~~~~~~~~~~~~~~~~~~~~~~

    Fixes: b1bba80a4376 ("vsock/vmci: register vmci_transport only when VMCI guest/host are active")
    Reported-by: kernel test robot
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

31 May, 2020

1 commit

    When a client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to a
    server on the guest, there is a panic on a ThunderX2 (armv8a server):

    [ 463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [ 463.718848] Mem abort info:
    [ 463.718849] ESR = 0x96000044
    [ 463.718852] EC = 0x25: DABT (current EL), IL = 32 bits
    [ 463.718853] SET = 0, FnV = 0
    [ 463.718854] EA = 0, S1PTW = 0
    [ 463.718855] Data abort info:
    [ 463.718856] ISV = 0, ISS = 0x00000044
    [ 463.718857] CM = 0, WnR = 1
    [ 463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
    [ 463.718861] [0000000000000000] pgd=0000000000000000
    [ 463.718866] Internal error: Oops: 96000044 [#1] SMP
    [...]
    [ 463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G O 5.7.0-rc7+ #139
    [ 463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
    [ 463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
    [ 463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.719000] sp : ffff80002dbe3c40
    [...]
    [ 463.719025] Call trace:
    [ 463.719030] virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
    [ 463.719034] vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
    [ 463.719041] vhost_worker+0x100/0x1a0 [vhost]
    [ 463.719048] kthread+0x128/0x130
    [ 463.719052] ret_from_fork+0x10/0x18

    The race condition is as follows:

    Task1                              Task2
    =====                              =====
    __sock_release                     virtio_transport_recv_pkt
      __vsock_release                    vsock_find_bound_socket (found sk)
        lock_sock_nested
        vsock_remove_sock
        sock_orphan
          sk_set_socket(sk, NULL)
        sk->sk_shutdown = SHUTDOWN_MASK
        ...
        release_sock
                                         lock_sock
                                         virtio_transport_recv_connecting
                                           sk->sk_socket->state (panic!)

    The root cause is that vsock_find_bound_socket() can't hold the sock
    lock, so there is a small race window between vsock_find_bound_socket()
    and lock_sock(). If __vsock_release() is running in another task,
    sk->sk_socket will be set to NULL inadvertently.

    This fixes it by checking sk->sk_shutdown (suggested by Stefano) after
    lock_sock(), since sk->sk_shutdown is set to SHUTDOWN_MASK under the
    protection of lock_sock_nested().
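
    The added check, roughly (sketch of virtio_transport_recv_pkt()):

        lock_sock(sk);

        /* Check if sk has been released before lock_sock */
        if (sk->sk_shutdown == SHUTDOWN_MASK) {
                (void)virtio_transport_reset_no_sock(t, pkt);
                release_sock(sk);
                sock_put(sk);
                goto free_pkt;
        }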

    Signed-off-by: Jia He
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Jia He
     

28 May, 2020

1 commit

    accept(2) is an "input" socket interface, so we should use SO_RCVTIMEO
    instead of SO_SNDTIMEO to set the timeout.

    So this patch replaces sock_sndtimeo() with sock_rcvtimeo() to use the
    right timeout in vsock_accept().
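
    The change boils down to one line in vsock_accept() (sketch):

        /* was: timeout = sock_sndtimeo(sk, flags & O_NONBLOCK); */
        timeout = sock_rcvtimeo(sk, flags & O_NONBLOCK);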

    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Signed-off-by: Stefano Garzarella
    Reviewed-by: Jorgen Hansen
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

28 Apr, 2020

1 commit

    In virtio_transport.c, if the virtqueue is full, the packet being
    transmitted is queued up and will be sent in the next iteration. This
    causes the same packet to be delivered multiple times to monitoring
    devices.

    We want to keep delivering packets to monitoring devices before they
    are put in the virtqueue, to avoid replies appearing in the packet
    capture before the transmitted packet.

    This patch fixes the issue, adding a new flag (tap_delivered) in
    struct virtio_vsock_pkt, to check if the packet is already delivered
    to monitoring devices.

    In vhost/vsock.c, we are splitting packets, so we must set
    'tap_delivered' to false when we queue up the same virtio_vsock_pkt
    to handle the remaining bytes.
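
    A sketch of the mechanism (assuming the upstream helper names):

        /* include/linux/virtio_vsock.h */
        struct virtio_vsock_pkt {
                ...
                bool tap_delivered;   /* already seen by monitoring devices */
        };

        /* net/vmw_vsock/virtio_transport_common.c */
        void virtio_transport_deliver_tap_pkt(struct virtio_vsock_pkt *pkt)
        {
                if (pkt->tap_delivered)
                        return;

                vsock_deliver_tap(virtio_transport_build_skb, pkt);
                pkt->tap_delivered = true;
        }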

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

28 Feb, 2020

2 commits

  • The mptcp conflict was overlapping additions.

    The SMC conflict was an addition and a removal happening at the same
    time.

    Signed-off-by: David S. Miller

    David S. Miller
     
    Some transports (hyperv, virtio) acquire the sock lock during the
    .release() callback.

    In vsock_stream_connect() we call vsock_assign_transport(); if the
    socket was previously assigned to another transport,
    vsk->transport->release() is called, but the sock lock is already held
    in vsock_stream_connect(), causing a deadlock reported by syzbot:

    INFO: task syz-executor280:9768 blocked for more than 143 seconds.
    Not tainted 5.6.0-rc1-syzkaller #0
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    syz-executor280 D27912 9768 9766 0x00000000
    Call Trace:
    context_switch kernel/sched/core.c:3386 [inline]
    __schedule+0x934/0x1f90 kernel/sched/core.c:4082
    schedule+0xdc/0x2b0 kernel/sched/core.c:4156
    __lock_sock+0x165/0x290 net/core/sock.c:2413
    lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
    virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
    vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
    vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
    __sys_connect_file+0x161/0x1c0 net/socket.c:1857
    __sys_connect+0x174/0x1b0 net/socket.c:1874
    __do_sys_connect net/socket.c:1885 [inline]
    __se_sys_connect net/socket.c:1882 [inline]
    __x64_sys_connect+0x73/0xb0 net/socket.c:1882
    do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    To avoid this issue, this patch removes the lock acquisition from the
    .release() callback of the hyperv and virtio transports, and instead
    holds the lock when we call vsk->transport->release() in the vsock core.
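
    A sketch of the core side after the change (__vsock_release()): the lock
    is taken first, and the transports' .release() no longer takes it:

        lock_sock_nested(sk, level);

        if (vsk->transport)
                vsk->transport->release(vsk);
        else if (sk->sk_type == SOCK_STREAM)
                vsock_remove_sock(vsk);

        sock_orphan(sk);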

    Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com
    Fixes: 408624af4c89 ("vsock: use local transport when it is loaded")
    Signed-off-by: Stefano Garzarella
    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

17 Feb, 2020

1 commit

  • Whenever the vsock backend on the host sends a packet through the RX
    queue, it expects an answer on the TX queue. Unfortunately, there is one
    case where the host side will hang waiting for the answer and might
    effectively never recover if no timeout mechanism was implemented.

    This issue happens when the guest side starts binding to the socket,
    which inserts a new bound socket into the list of already bound sockets.
    At this time, we expect the guest to also start listening, which will
    trigger the sk_state to move from TCP_CLOSE to TCP_LISTEN. The problem
    occurs if the host side queued an RX packet and triggered an interrupt
    right between the end of the binding process and the beginning of the
    listening process. In this specific case, the function processing the
    packet, virtio_transport_recv_pkt(), will find a bound socket, which
    means it will hit the switch statement checking the sk_state, but the
    state won't have been changed to TCP_LISTEN yet, which leads the code
    to pick the default statement. This default statement only frees the
    buffer, while it should also respond to the host side by sending a
    packet on its TX queue.

    In order to fix this unfortunate chain of events simply, it is important
    that, when the default statement is entered, we send back a packet
    containing the operation VIRTIO_VSOCK_OP_RST, because at this stage we
    know the host side is waiting for an answer.

    One could argue that a proper timeout mechanism on the host side would
    be enough to keep the backend from hanging. But the point of this patch
    is to ensure the normal use case is provided with proper responsiveness
    when it comes to establishing the connection.
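
    The fix, roughly, in the sk_state switch of virtio_transport_recv_pkt()
    (sketch):

        default:
                (void)virtio_transport_reset_no_sock(t, pkt);
                break;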

    Signed-off-by: Sebastien Boeuf
    Signed-off-by: David S. Miller

    Sebastien Boeuf
     

15 Jan, 2020

1 commit

    Currently, hv_sock restricts the ports on which the guest socket can
    accept connections. hv_sock divides the socket port namespace into two
    parts for the server side (listening socket): 0-0x7FFFFFFF and
    0x80000000-0xFFFFFFFF (there are no restrictions on the client port
    namespace). The first part (0-0x7FFFFFFF) is reserved for sockets where
    connections can be accepted. The second part (0x80000000-0xFFFFFFFF) is
    reserved for allocating ports for the peer (host) socket once a
    connection is accepted.

    This reservation of the port namespace is specific to hv_sock and not
    known by the generic vsock library (e.g. af_vsock). This is problematic
    because auto-binds/ephemeral ports are handled by the generic vsock
    library, which has no knowledge of this port reservation and could
    allocate a port that is not compatible with hv_sock (and legitimately
    so). The issue hasn't surfaced so far because the auto-bind code of
    vsock (__vsock_bind_stream), prior to the change 'VSOCK: bind to random
    port for VMADDR_PORT_ANY', would start walking up from
    LAST_RESERVED_PORT (1023) and start assigning ports. That takes a large
    number of iterations to hit 0x7FFFFFFF. But, after the above change to
    randomize port selection, the issue has started coming up more
    frequently.

    There has really been no good reason to have this port reservation
    logic in hv_sock from the get-go. Reserving a local port for peer ports
    is not how things are handled generally: peer ports should reflect the
    peer's port. This fixes the issue by lifting the port reservation, and
    also returns the right peer port. Since the code converts the GUID to
    the peer port (by using the first 4 bytes), there is a possibility of
    conflicts, but that seems like a reasonable risk to take, given this is
    limited to vsock and only applies to local sockets.

    Signed-off-by: Sunil Muthuswamy
    Signed-off-by: David S. Miller

    Sunil Muthuswamy
     

17 Dec, 2019

2 commits

    virtio_transport_get_ops() and virtio_transport_send_pkt_info() can
    only be used on connecting/connected sockets, since a socket assigned
    to a transport is required.

    This patch adds a WARN_ON() in virtio_transport_get_ops() to check this
    requirement, plus a comment and a returned error in
    virtio_transport_send_pkt_info().
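
    The helper, roughly, after the change (sketch):

        static const struct virtio_transport *
        virtio_transport_get_ops(struct vsock_sock *vsk)
        {
                const struct vsock_transport *t = vsock_core_get_transport(vsk);

                if (WARN_ON(!t))
                        return NULL;

                return container_of(t, struct virtio_transport, transport);
        }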

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    With multi-transport support, listener sockets are not bound to any
    transport. So calling virtio_transport_reset() on a listener socket
    when an error occurs produces the following NULL-pointer dereference:

    BUG: kernel NULL pointer dereference, address: 00000000000000e8
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 20 Comm: kworker/0:1 Not tainted 5.5.0-rc1-ste-00003-gb4be21f316ac-dirty #56
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
    Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport]
    RIP: 0010:virtio_transport_send_pkt_info+0x20/0x130 [vmw_vsock_virtio_transport_common]
    Code: 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 49 89 fc 53 48 83 ec 10 44 8b 76 20 e8 c0 ba fe ff 8b 80 e8 00 00 00 e8 64 e3 7d c1 45 8b 45 00 41 8b 8c 24 d4 02
    RSP: 0018:ffffc900000b7d08 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88807bf12728 RCX: 0000000000000000
    RDX: ffff88807bf12700 RSI: ffffc900000b7d50 RDI: ffff888035c84000
    RBP: ffffc900000b7d40 R08: ffff888035c84000 R09: ffffc900000b7d08
    R10: ffff8880781de800 R11: 0000000000000018 R12: ffff888035c84000
    R13: ffffc900000b7d50 R14: 0000000000000000 R15: ffff88807bf12724
    FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000e8 CR3: 00000000790f4004 CR4: 0000000000160ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    virtio_transport_reset+0x59/0x70 [vmw_vsock_virtio_transport_common]
    virtio_transport_recv_pkt+0x5bb/0xe50 [vmw_vsock_virtio_transport_common]
    ? detach_buf_split+0xf1/0x130
    virtio_transport_rx_work+0xba/0x130 [vmw_vsock_virtio_transport]
    process_one_work+0x1c0/0x300
    worker_thread+0x45/0x3c0
    kthread+0xfc/0x130
    ? current_work+0x40/0x40
    ? kthread_park+0x90/0x90
    ret_from_fork+0x35/0x40
    Modules linked in: sunrpc kvm_intel kvm vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common irqbypass vsock virtio_rng rng_core
    CR2: 00000000000000e8
    ---[ end trace e75400e2ea2fa824 ]---

    This happens because virtio_transport_reset() calls
    virtio_transport_send_pkt_info(), which can be used only on
    connecting/connected sockets.

    This patch fixes the issue by using virtio_transport_reset_no_sock()
    instead of virtio_transport_reset() when we are handling a listener
    socket.
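
    One of the fixed error paths in virtio_transport_recv_listen(), as a
    sketch:

        if (sk_acceptq_is_full(sk)) {
                /* was virtio_transport_reset(vsk, pkt) */
                virtio_transport_reset_no_sock(t, pkt);
                return -ENOMEM;
        }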

    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

12 Dec, 2019

6 commits

  • We can remove the loopback handling from virtio_transport,
    because now the vsock core is able to handle local communication
    using the new vsock_loopback device.

    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    Now that we have a transport that can handle local communication, we
    can use it when it is loaded.

    A socket will use the local transport (loopback) when the remote CID is
    (see the helper sketch after this list):
    - equal to VMADDR_CID_LOCAL;
    - or equal to transport_g2h->get_local_cid(), if transport_g2h is
      loaded (this allows us to keep the same behaviour implemented by the
      virtio and vmci transports);
    - or equal to VMADDR_CID_HOST, if transport_g2h is not loaded.
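
    A sketch of the helper implementing the rules above (assuming the
    upstream name vsock_use_local_transport()):

        static bool vsock_use_local_transport(unsigned int remote_cid)
        {
                if (!transport_local)
                        return false;

                if (remote_cid == VMADDR_CID_LOCAL)
                        return true;

                if (transport_g2h)
                        return remote_cid == transport_g2h->get_local_cid();
                else
                        return remote_cid == VMADDR_CID_HOST;
        }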

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
  • This patch adds a new vsock_loopback transport to handle local
    communication.
    This transport is based on the loopback implementation of
    virtio_transport, so it uses the virtio_transport_common APIs
    to interface with the vsock core.

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    This patch makes it possible to register a transport able to handle
    local communication (loopback).

    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    The VMADDR_CID_RESERVED (1) CID was used by VMCI, but it is not used
    anymore, so we can reuse it for local communication (loopback), adding
    the new well-known CID VMADDR_CID_LOCAL.
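
    In include/uapi/linux/vm_sockets.h the reserved value is simply renamed
    (sketch):

        /* Used for local communication (loopback), was VMADDR_CID_RESERVED */
        #define VMADDR_CID_LOCAL 1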

    Cc: Jorgen Hansen
    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    We can remove the virtio header includes, because virtio_transport_common
    doesn't use the virtio API, but only provides common functions to
    interface the virtio/vhost transports with the af_vsock core and to
    handle the protocol.

    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

01 Dec, 2019

1 commit

  • Pull Hyper-V updates from Sasha Levin:

    - support for new VMBus protocols (Andrea Parri)

    - hibernation support (Dexuan Cui)

    - latency testing framework (Branden Bonaby)

    - decoupling Hyper-V page size from guest page size (Himadri Pandya)

    * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
    Drivers: hv: vmbus: Fix crash handler reset of Hyper-V synic
    drivers/hv: Replace binary semaphore with mutex
    drivers: iommu: hyperv: Make HYPERV_IOMMU only available on x86
    HID: hyperv: Add the support of hibernation
    hv_balloon: Add the support of hibernation
    x86/hyperv: Implement hv_is_hibernation_supported()
    Drivers: hv: balloon: Remove dependencies on guest page size
    Drivers: hv: vmbus: Remove dependencies on guest page size
    x86: hv: Add function to allocate zeroed page for Hyper-V
    Drivers: hv: util: Specify ring buffer size using Hyper-V page size
    Drivers: hv: Specify receive buffer size using Hyper-V page size
    tools: hv: add vmbus testing tool
    drivers: hv: vmbus: Introduce latency testing
    video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
    video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host
    hv_netvsc: Add the support of hibernation
    hv_sock: Add the support of hibernation
    video: hyperv_fb: Add the support of hibernation
    scsi: storvsc: Add the support of hibernation
    Drivers: hv: vmbus: Add module parameter to cap the VMBus version
    ...

    Linus Torvalds
     

15 Nov, 2019

11 commits

    When we are looking for a socket bound to a specific address, we also
    have to take the CID into account.

    This patch is useful with multi-transport support because it allows
    binding the same port with different CIDs, and it prevents a connection
    from reaching the wrong socket bound to the same port but with a
    different CID.
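
    A sketch of the lookup in __vsock_find_bound_socket(): the full address
    (CID + port) must match, or one of the two CIDs must be VMADDR_CID_ANY:

        list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) {
                if (vsock_addr_equals_addr(addr, &vsk->local_addr))
                        return sk_vsock(vsk);

                if (addr->svm_port == vsk->local_addr.svm_port &&
                    (vsk->local_addr.svm_cid == VMADDR_CID_ANY ||
                     addr->svm_cid == VMADDR_CID_ANY))
                        return sk_vsock(vsk);
        }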

    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    This patch adds a 'module' member to 'struct vsock_transport' in order
    to get/put the transport module. This prevents the module from being
    unloaded while sockets are assigned to it.

    We increase the module refcnt when a socket is assigned to a transport,
    and we decrease the module refcnt when the socket is destructed.
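
    A minimal sketch of the refcounting (assuming the new 'module' member):

        /* struct vsock_transport gains a pointer to its owning module */
        struct module *module;

        /* when assigning the transport to a socket */
        if (!try_module_get(new_transport->module))
                return -ENODEV;

        /* in the socket destructor */
        module_put(vsk->transport->module);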

    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    To allow other transports to be loaded alongside vmci_transport, we
    register vmci_transport as G2H or H2G only when a VMCI guest or host
    is active.

    To do that, this patch adds a callback, registered in the vmci driver,
    that will be called when the host or guest becomes active. This
    callback registers vmci_transport in the VSOCK core.

    Cc: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    This patch adds support for multiple transports in the VSOCK core.

    With multi-transport support, we can use vsock with nested VMs (also
    using different hypervisors), loading both guest->host and host->guest
    transports at the same time.

    Major changes:
    - the vsock core module can be loaded regardless of the transports;
    - vsock_core_init() and vsock_core_exit() are renamed to
      vsock_core_register() and vsock_core_unregister();
    - vsock_core_register() has a features parameter (H2G, G2H, DGRAM) to
      identify which directions the transport can handle and whether it
      supports DGRAM (only vmci); see the registration sketch after this
      list;
    - each stream socket is assigned to a transport when the remote CID is
      set (during connect() or when we receive a connection request on a
      listener socket).
      The remote CID is used to decide which transport to use:
      - remote CID <= VMADDR_CID_HOST will use the guest->host transport;
      - remote CID == local_cid (guest->host transport) will use the
        guest->host transport for loopback (host->guest transports don't
        support loopback);
      - remote CID > VMADDR_CID_HOST will use the host->guest transport;
    - listener sockets are not bound to any transport, since no transport
      operations are done on them. In this way we can create a listener
      socket even if the transports are not loaded, or with VMADDR_CID_ANY
      to listen on all transports;
    - DGRAM sockets are handled as before, since only vmci_transport
      provides this feature.
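
    Registration sketch, as done for example by the virtio transport (the
    feature flags are the upstream VSOCK_TRANSPORT_F_* constants):

        /* the transport declares which directions it can handle */
        ret = vsock_core_register(&virtio_transport.transport,
                                  VSOCK_TRANSPORT_F_G2H);

        /* the vmci transport also passes VSOCK_TRANSPORT_F_DGRAM, plus
         * H2G or G2H depending on whether the host or guest is active */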

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    The remote peer is always the host, so we set VMADDR_CID_HOST as the
    remote CID instead of VMADDR_CID_ANY.

    Reviewed-by: Dexuan Cui
    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    vsock_insert_unbound() was called only when the 'sock' parameter of
    __vsock_create() was not NULL. This only happened when
    __vsock_create() was called by vsock_create().

    In order to simplify the multi-transports support, this patch
    moves vsock_insert_unbound() at the end of vsock_create().

    Reviewed-by: Dexuan Cui
    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    All transports call __vsock_create() with the same parameters, most of
    them depending on the parent socket. In order to simplify the VSOCK
    core APIs exposed to the transports, this patch adds
    vsock_create_connected(), callable from transports to create a new
    socket when a connection request is received.
    We also unexport __vsock_create().

    Suggested-by: Stefan Hajnoczi
    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    virtio_transport and vmci_transport handle the buffer_size sockopts in
    a very similar way.

    In order to support multiple transports, this patch moves this handling
    into the core, to allow the user to change the options even if the
    socket is not yet assigned to any transport.

    This patch also adds the '.notify_buffer_size' callback in the
    'struct virtio_transport' in order to inform the transport when the
    buffer_size is changed by the user. It is also useful to limit the
    'buffer_size' requested (e.g. for the virtio transports).
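
    A rough sketch of the callback and of how the core invokes it (the
    exact clamping against buffer_min_size/buffer_max_size is omitted):

        /* callback added to the transport ops */
        void (*notify_buffer_size)(struct vsock_sock *vsk, u64 *val);

        /* core side, when the user sets SO_VM_SOCKETS_BUFFER_SIZE */
        if (transport && transport->notify_buffer_size)
                transport->notify_buffer_size(vsk, &val);   /* may lower val */

        vsk->buffer_size = val;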

    Acked-by: Dexuan Cui
    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    Now that the 'struct vsock_sock' object contains a pointer to the
    transport, this patch adds a parameter to vsock_core_get_transport()
    to return the right transport assigned to the socket.

    This patch also modifies virtio_transport_get_ops(), which uses
    vsock_core_get_transport(), adding the 'struct vsock_sock *' parameter.

    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    We are going to add a 'struct vsock_sock *' parameter to
    virtio_transport_get_ops().

    In some cases, like in virtio_transport_reset_no_sock(), we don't have
    any socket assigned to the packet received, so we can't use
    virtio_transport_get_ops().

    In order to allow virtio_transport_reset_no_sock() to use the
    '.send_pkt' callback from 'vhost_transport' or 'virtio_transport',
    we add a 'struct virtio_transport *' parameter to it and to its caller,
    virtio_transport_recv_pkt().

    We also move the 'vhost_transport' and 'virtio_transport' definitions
    so that their addresses can be passed to virtio_transport_recv_pkt().

    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
    As a preparation to support multiple transports, this patch adds a
    'transport' member to 'struct vsock_sock'.
    This new field is initialized during creation, in the __vsock_create()
    function.

    This patch also renames the global 'transport' pointer to
    'transport_single', since for now we only support a single transport
    registered at run-time.

    Reviewed-by: Stefan Hajnoczi
    Reviewed-by: Jorgen Hansen
    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella