25 Aug, 2022

2 commits

  • commit a3e7b29e30854ed67be0d17687e744ad0c769c4b upstream.

    Imagine two non-blocking vsock_connect() requests on the same socket.
    The first request schedules @connect_work, and after it times out,
    vsock_connect_timeout() sets *sock* state back to TCP_CLOSE, but keeps
    *socket* state as SS_CONNECTING.

    Later, the second request returns -EALREADY, meaning the socket "already
    has a pending connection in progress", even though the first request has
    already timed out.

    As suggested by Stefano, fix it by setting *socket* state back to
    SS_UNCONNECTED, so that the second request will return -ETIMEDOUT.

    Suggested-by: Stefano Garzarella
    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Reviewed-by: Stefano Garzarella
    Signed-off-by: Peilin Ye
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Peilin Ye
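
The state mismatch described in the commit above can be illustrated with a small user-space model. This is only a sketch: the `model_*` names are hypothetical, only two states per layer are modelled, and the way the pending error is surfaced to the second call is a simplification rather than the kernel's actual control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket-layer and sock-layer states. */
enum model_socket_state { MODEL_SS_UNCONNECTED, MODEL_SS_CONNECTING };
enum model_sk_state { MODEL_TCP_CLOSE, MODEL_TCP_SYN_SENT };

struct model_vsock {
    enum model_socket_state socket_state; /* the *socket* state */
    enum model_sk_state sk_state;         /* the *sock* state */
    int sk_err;                           /* pending error, positive errno */
};

/* Model of the connect timeout handler. With reset_socket_state == false
 * it mimics the old behaviour: only the sock state is reset. */
static void model_connect_timeout(struct model_vsock *vs, bool reset_socket_state)
{
    assert(vs != NULL);
    if (vs->sk_state != MODEL_TCP_SYN_SENT)
        return;
    vs->sk_state = MODEL_TCP_CLOSE;
    vs->sk_err = ETIMEDOUT;
    if (reset_socket_state)
        vs->socket_state = MODEL_SS_UNCONNECTED;
}

/* Model of a non-blocking connect(): returns a negative errno. */
static int model_connect_nonblock(struct model_vsock *vs)
{
    assert(vs != NULL);
    if (vs->socket_state == MODEL_SS_CONNECTING)
        return -EALREADY;               /* "connection already in progress" */
    if (vs->sk_err != 0) {
        int err = -vs->sk_err;          /* report and clear the pending error */
        vs->sk_err = 0;
        return err;
    }
    vs->sk_state = MODEL_TCP_SYN_SENT;
    vs->socket_state = MODEL_SS_CONNECTING;
    return -EINPROGRESS;
}
```

In this model, a second `model_connect_nonblock()` after a timeout returns `-EALREADY` when the handler leaves the socket state untouched, and `-ETIMEDOUT` when the handler also resets it.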
     
  • commit 7e97cfed9929eaabc41829c395eb0d1350fccb9d upstream.

    An O_NONBLOCK vsock_connect() request may try to reschedule
    @connect_work. Imagine the following sequence of vsock_connect()
    requests:

    1. The 1st, non-blocking request schedules @connect_work, which will
    expire after 200 jiffies. Socket state is now SS_CONNECTING;

    2. Later, the 2nd, blocking request gets interrupted by a signal after
    a few jiffies while waiting for the connection to be established.
    Socket state is back to SS_UNCONNECTED, but @connect_work is still
    pending, and will expire after 100 jiffies.

    3. Now, the 3rd, non-blocking request tries to schedule @connect_work
    again. Since @connect_work is already scheduled,
    schedule_delayed_work() silently returns. sock_hold() is called
    twice, but sock_put() will only be called once in
    vsock_connect_timeout(), causing a memory leak reported by syzbot:

    BUG: memory leak
    unreferenced object 0xffff88810ea56a40 (size 1232):
    comm "syz-executor756", pid 3604, jiffies 4294947681 (age 12.350s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    28 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............
    backtrace:
    sk_prot_alloc+0x3e/0x1b0 net/core/sock.c:1930
    sk_alloc+0x32/0x2e0 net/core/sock.c:1989
    __vsock_create.constprop.0+0x38/0x320 net/vmw_vsock/af_vsock.c:734
    vsock_create+0xc1/0x2d0 net/vmw_vsock/af_vsock.c:2203
    __sock_create+0x1ab/0x2b0 net/socket.c:1468
    sock_create net/socket.c:1519 [inline]
    __sys_socket+0x6f/0x140 net/socket.c:1561
    __do_sys_socket net/socket.c:1570 [inline]
    __se_sys_socket net/socket.c:1568 [inline]
    __x64_sys_socket+0x1a/0x20 net/socket.c:1568
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    Use mod_delayed_work() instead: if @connect_work is already scheduled,
    reschedule it, and undo sock_hold() to keep the reference count
    balanced.

    Reported-and-tested-by: syzbot+b03f55bf128f9a38f064@syzkaller.appspotmail.com
    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Co-developed-by: Stefano Garzarella
    Signed-off-by: Stefano Garzarella
    Reviewed-by: Stefano Garzarella
    Signed-off-by: Peilin Ye
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Peilin Ye
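
The reference-count imbalance in the commit above can be reproduced in a few lines of user-space C. This is a sketch under stated assumptions: `model_schedule()` and `model_mod()` are hypothetical helpers that only mimic the return-value contract of `schedule_delayed_work()` (does nothing and returns false if the work is already pending) and `mod_delayed_work()` (re-arms the work and returns true if it was pending); no real workqueue is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sock {
    int refcnt;         /* stands in for the sock reference count */
    bool work_pending;  /* stands in for @connect_work being queued */
};

/* Mimics schedule_delayed_work(): a no-op returning false if already pending. */
static bool model_schedule(struct model_sock *sk)
{
    if (sk->work_pending)
        return false;
    sk->work_pending = true;
    return true;
}

/* Mimics mod_delayed_work(): always (re)arms; returns true if it was pending. */
static bool model_mod(struct model_sock *sk)
{
    bool was_pending = sk->work_pending;

    sk->work_pending = true;
    return was_pending;
}

/* Old pattern: take a reference on every call, even if nothing was queued. */
static void model_connect_old(struct model_sock *sk)
{
    sk->refcnt++;
    model_schedule(sk);
}

/* Fixed pattern: drop the extra reference when the work was already pending. */
static void model_connect_fixed(struct model_sock *sk)
{
    sk->refcnt++;
    if (model_mod(sk))
        sk->refcnt--;
}

/* The timeout handler runs once per queued work item and drops one reference. */
static void model_timeout_fires(struct model_sock *sk)
{
    assert(sk->work_pending);
    sk->work_pending = false;
    sk->refcnt--;
}
```

Calling the old variant twice before the timeout leaves one reference that is never dropped; the fixed variant returns to the starting count.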
     

08 Apr, 2022

3 commits

  • [ Upstream commit 88704454ef8b00ea91537ae0d47d9348077e0e72 ]

    virtio spec requires drivers to set DRIVER_OK before using VQs.
    This is set automatically after probe returns, but virtio-vsock
    driver uses VQs in the probe function to fill rx and event VQs
    with new buffers.

    Fix this by calling virtio_device_ready() before using the VQs
    in the probe function.

    Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko")
    Signed-off-by: Stefano Garzarella
    Acked-by: Michael S. Tsirkin
    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Stefano Garzarella
     
  • [ Upstream commit c1011c0b3a9c8d2065f425407475cbcc812540b7 ]

    Complete the driver configuration, reading the negotiated features,
    before using the VQs in the virtio_vsock_probe().

    Fixes: 53efbba12cc7 ("virtio/vsock: enable SEQPACKET for transport")
    Suggested-by: Michael S. Tsirkin
    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Stefano Garzarella
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Stefano Garzarella
     
  • [ Upstream commit 4b5f1ad5566ada230aaa2ce861b28d1895f1ea68 ]

    When we fill VQs with empty buffers and kick the host, it may send
    an interrupt. `vdev->priv` must be initialized before this since it
    is used in the virtqueue callbacks.

    Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock")
    Suggested-by: Michael S. Tsirkin
    Signed-off-by: Stefano Garzarella
    Acked-by: Michael S. Tsirkin
    Reviewed-by: Stefan Hajnoczi
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Stefano Garzarella
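
Taken together, the three commits of this date constrain the order of operations in the probe function. The model below is a rough sketch: all `model_*` names are hypothetical, the "before" ordering is a reconstruction from the commit messages rather than a copy of the old code, and each unmet precondition is simply counted instead of causing real misbehaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_vdev {
    bool driver_ok;      /* DRIVER_OK status bit has been set */
    bool features_read;  /* negotiated features have been read */
    void *priv;          /* driver-private pointer used by VQ callbacks */
    int violations;      /* preconditions missing when the VQs were used */
};

/* Filling and kicking the VQs needs all three preconditions to hold. */
static void model_use_vqs(struct model_vdev *d)
{
    assert(d != NULL);
    if (!d->driver_ok)
        d->violations++;
    if (!d->features_read)
        d->violations++;
    if (d->priv == NULL)
        d->violations++;
}

/* Assumed old ordering: the VQs are used first, everything else follows. */
static void model_probe_before(struct model_vdev *d, void *vsock)
{
    model_use_vqs(d);
    d->features_read = true;
    d->priv = vsock;
    d->driver_ok = true;  /* set automatically after probe returns */
}

/* Ordering after the three fixes: VQs are used last. */
static void model_probe_after(struct model_vdev *d, void *vsock)
{
    d->priv = vsock;
    d->features_read = true;
    d->driver_ok = true;  /* corresponds to calling virtio_device_ready() */
    model_use_vqs(d);
}
```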
     

23 Mar, 2022

1 commit

  • [ Upstream commit 8e6ed963763fe21429eabfc76c69ce2b0163a3dd ]

    When iterating over sockets using vsock_for_each_connected_socket, make
    sure that a transport filters out sockets that don't belong to the
    transport.

    This caused a real issue: in a nested VM configuration, destroying
    the nested VM (which often involves closing /dev/vhost-vsock if there
    were h2g connections to the nested VM) kills not only the h2g
    connections, but also all existing g2h connections to the (outermost)
    host, which are totally unrelated.

    Tested: Executed the following steps on Cuttlefish (Android running on a
    VM) [1]: (1) Enter into an `adb shell` session - to have a g2h
    connection inside the VM, (2) open and then close /dev/vhost-vsock by
    `exec 3< /dev/vhost-vsock && exec 3
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jiyong Park
    Link: https://lore.kernel.org/r/20220311020017.1509316-1-jiyong@google.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Jiyong Park
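
The filtering idea in the commit above can be sketched as an iterator that takes the calling transport and skips sockets owned by any other transport. The types and the signature here are hypothetical and operate on a plain array rather than the kernel's connected table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_transport { const char *name; };

struct model_sock {
    const struct model_transport *transport; /* transport owning this socket */
    bool reset;                              /* set when the callback ran */
};

/* Invoke fn only on sockets that belong to the given transport. */
static void model_for_each_connected(const struct model_transport *transport,
                                     struct model_sock *socks, size_t n,
                                     void (*fn)(struct model_sock *))
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        if (socks[i].transport != transport)
            continue;  /* not ours: leave it alone */
        fn(&socks[i]);
    }
}

/* Example callback: mark the socket as reset. */
static void model_reset_sock(struct model_sock *sk)
{
    sk->reset = true;
}
```

With one h2g socket and one g2h socket in the array, iterating on behalf of the h2g transport resets only the first.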
     

23 Feb, 2022

1 commit

  • commit b9208492fcaecff8f43915529ae34b3bcb03877c upstream.

    vsock_connect() expects that the socket could already be in the
    TCP_ESTABLISHED state when the connecting task wakes up with a signal
    pending. If this happens the socket will be in the connected table, and
    it is not removed when the socket state is reset. In this situation it's
    common for the process to retry connect(), and if the connection is
    successful the socket will be added to the connected table a second
    time, corrupting the list.

    Prevent this by calling vsock_remove_connected() if a signal is received
    while waiting for a connection. This is harmless if the socket is not in
    the connected table, and if it is in the table then removing it will
    prevent list corruption from a double add.

    Note for backporting: this patch requires d5afa82c977e ("vsock: correct
    removal of socket from the list"), which is in all current stable trees
    except 4.9.y.

    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Signed-off-by: Seth Forshee
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman

    Seth Forshee
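
Why an unconditional removal is safe, while a double add is not, can be seen with a tiny circular doubly linked list. This is a user-space sketch with hypothetical `model_*` names; it borrows the "re-initialise on removal" idea so that removing an unlinked node is a no-op, and it uses an assertion where the kernel would silently corrupt the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_node { struct model_node *prev, *next; };

static void model_init(struct model_node *n)
{
    n->prev = n;
    n->next = n;
}

/* A node pointing at itself is not on any list. */
static bool model_linked(const struct model_node *n)
{
    return n->next != n;
}

/* Adding a node that is already linked would corrupt the list. */
static void model_insert_connected(struct model_node *head, struct model_node *n)
{
    assert(!model_linked(n));
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Harmless if the node is not linked; otherwise unlink and re-initialise. */
static void model_remove_connected(struct model_node *n)
{
    if (!model_linked(n))
        return;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    model_init(n);
}

static size_t model_count(const struct model_node *head)
{
    size_t count = 0;

    for (const struct model_node *p = head->next; p != head; p = p->next)
        count++;
    return count;
}
```

Removing the socket when the wait is interrupted means a retried connect can insert it again without tripping the assertion.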
     

22 Dec, 2021

1 commit

  • [ Upstream commit 1db8f5fc2e5c66a5c51e1f6488e0ba7d45c29ae4 ]

    The VMADDR_CID_ANY flag used by a socket means that the socket isn't bound
    to any specific CID. For example, a host vsock server may want to be bound
    with VMADDR_CID_ANY, so that a guest vsock client can connect to the host
    server with CID=VMADDR_CID_HOST (i.e. 2), and meanwhile, a host vsock
    client can connect to the same local server with CID=VMADDR_CID_LOCAL
    (i.e. 1).

    The current implementation sets the destination socket's svm_cid to a
    fixed CID value after the first client's connection, which isn't an
    expected operation. For example, if the guest client first connects to the
    host server, the server's svm_cid gets set to VMADDR_CID_HOST, then other
    host clients won't be able to connect to the server anymore.

    Reproduce steps:
    1. Run the host server:
    socat VSOCK-LISTEN:1234,fork -
    2. Run a guest client to connect to the host server:
    socat - VSOCK-CONNECT:2:1234
    3. Run a host client to connect to the host server:
    socat - VSOCK-CONNECT:1:1234

    Without this patch, step 3. above fails to connect, and socat complains
    "socat[1720] E connect(5, AF=40 cid:1 port:1234, 16): Connection
    reset by peer".
    With this patch, the above works well.

    Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
    Signed-off-by: Wei Wang
    Link: https://lore.kernel.org/r/20211126011823.1760-1-wei.w.wang@intel.com
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Stefano Garzarella
    Signed-off-by: Sasha Levin

    Wei Wang
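
The effect on address matching can be modelled as follows. This is a sketch, not the transport code: the `model_*` names are hypothetical, the constants mirror the CID values quoted in the commit message (with the wildcard assumed to be all ones), and `keep_wildcard` selects between the old and the fixed behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define MODEL_CID_ANY   0xFFFFFFFFu /* assumed value of VMADDR_CID_ANY */
#define MODEL_CID_LOCAL 1u
#define MODEL_CID_HOST  2u

struct model_listener {
    uint32_t cid;   /* bound CID, possibly the wildcard */
    uint32_t port;
};

/* A wildcard-bound listener accepts any destination CID on its port. */
static bool model_matches(const struct model_listener *l,
                          uint32_t dst_cid, uint32_t dst_port)
{
    assert(l != NULL);
    return l->port == dst_port &&
           (l->cid == MODEL_CID_ANY || l->cid == dst_cid);
}

/* On an incoming connection the bound CID is refreshed from the packet.
 * With keep_wildcard set, a wildcard binding is left untouched. */
static void model_update_cid(struct model_listener *l, uint32_t dst_cid,
                             bool keep_wildcard)
{
    if (keep_wildcard && l->cid == MODEL_CID_ANY)
        return;
    l->cid = dst_cid;
}
```

Without the wildcard check, the first guest connection pins the listener to CID 2 and a later local connection to CID 1 no longer matches, which corresponds to step 3 of the reproduction above.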
     

19 Nov, 2021

1 commit

  • [ Upstream commit c7cd82b90599fa10915f41e3dd9098a77d0aa7b6 ]

    Currently vsock_connect() increments the sock refcount for a
    nonblocking socket each time it's called, which can lead to a memory
    leak if it's called multiple times, because the connect timeout
    function decrements the sock refcount only once.

    Fix it by making vsock_connect() return -EALREADY immediately when
    the socket state is already SS_CONNECTING.

    Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
    Reviewed-by: Stefano Garzarella
    Signed-off-by: Eiichi Tsukata
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Eiichi Tsukata
     

06 Sep, 2021

3 commits

  • Records are supported via the MSG_EOR flag, while the current logic
    operates on messages, so rename variables from 'record' to 'message'.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20210903123306.3273757-1-arseny.krasnov@kaspersky.com
    Signed-off-by: Michael S. Tsirkin

    Arseny Krasnov
     
  • If a packet has the 'EOR' bit set, set MSG_EOR in the 'recvmsg()' flags.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20210903123251.3273639-1-arseny.krasnov@kaspersky.com
    Signed-off-by: Michael S. Tsirkin

    Arseny Krasnov
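
The flag translation described above amounts to a bit test per packet. The sketch below assumes the header bit values `1` for end-of-message and `2` for end-of-record; both constants and all `model_*` names are illustrative, while `MSG_EOR` is the real flag from `<sys/socket.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

#define MODEL_SEQ_EOM 1u /* assumed: last packet of a message */
#define MODEL_SEQ_EOR 2u /* assumed: last packet of a record */

struct model_pkt { uint32_t flags; };

/* Walk the packets of one message, up to and including the first one that
 * carries the end-of-message bit, and compute the flags to report back to
 * the caller of recvmsg(). */
static int model_message_flags(const struct model_pkt *pkts, size_t n)
{
    int msg_flags = 0;

    assert(pkts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pkts[i].flags & MODEL_SEQ_EOR)
            msg_flags |= MSG_EOR;
        if (pkts[i].flags & MODEL_SEQ_EOM)
            break;
    }
    return msg_flags;
}
```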
     
  • The currently implemented bit marks the end of a message
    ('EOM' - end of message), not the end of a record ('EOR' - end of
    record). Also rename 'record' to 'message' in the implementation, as
    they are different things.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20210903123109.3273053-1-arseny.krasnov@kaspersky.com
    Signed-off-by: Michael S. Tsirkin

    Arseny Krasnov
     

13 Aug, 2021

1 commit

  • There's a potential deadlock when removing the vsock device or
    processing the RESET event:

    vsock_for_each_connected_socket:
        spin_lock_bh(&vsock_table_lock) ----------- (1)
        ...
            virtio_vsock_reset_sock:
                lock_sock(sk) --------------------- (2)
        ...
        spin_unlock_bh(&vsock_table_lock)

    lock_sock() may sleep if 'sk' is owned by another thread at the
    same time, in which case we would receive the warning message
    "scheduling while atomic".

    Even worse, if the next task (selected by the scheduler) tries to
    release a 'sk', it needs to take vsock_table_lock, so a deadlock
    occurs and the system ends up in a softlockup state.
    Call trace:
    queued_spin_lock_slowpath
    vsock_remove_bound
    vsock_remove_sock
    virtio_transport_release
    __vsock_release
    vsock_release
    __sock_release
    sock_close
    __fput
    ____fput

    So we should not acquire sk_lock in this case, just like the behavior
    in vhost_vsock or vmci.

    Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko")
    Cc: Stefan Hajnoczi
    Signed-off-by: Longpeng(Mike)
    Reviewed-by: Stefano Garzarella
    Link: https://lore.kernel.org/r/20210812053056.1699-1-longpeng2@huawei.com
    Signed-off-by: Jakub Kicinski

    Longpeng(Mike)
     

04 Aug, 2021

1 commit

  • The original implementation of the virtio-vsock driver does not
    handle a VIRTIO_VSOCK_OP_CREDIT_REQUEST as required by the
    virtio-vsock specification. The vsock device emulated by
    vhost-vsock and the virtio-vsock driver never uses this request,
    which was probably why nobody noticed it. However, another
    implementation of the device may use this request type.

    Hence, this commit introduces a way to handle an explicit credit
    request by responding with a corresponding credit update as
    required by the virtio-vsock specification.

    Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
    Signed-off-by: Harshavardhan Unnibhavi
    Reviewed-by: Stefano Garzarella
    Acked-by: Michael S. Tsirkin
    Link: https://lore.kernel.org/r/20210802173506.2383-1-harshanavkis@gmail.com
    Signed-off-by: Jakub Kicinski

    Harshavardhan Unnibhavi
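
The request/response pairing introduced by the commit above can be sketched as a small packet handler. The opcode numbers follow the virtio-vsock specification as far as I recall it (5 for RW, 6 for CREDIT_UPDATE, 7 for CREDIT_REQUEST) and should be treated as assumptions, as should every `model_*` name and the reduced packet layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MODEL_OP_RW = 5,             /* assumed opcode values */
    MODEL_OP_CREDIT_UPDATE = 6,
    MODEL_OP_CREDIT_REQUEST = 7
};

struct model_pkt {
    int op;
    uint32_t buf_alloc; /* receive buffer space advertised to the peer */
    uint32_t fwd_cnt;   /* bytes consumed from the receive buffer so far */
};

struct model_conn {
    uint32_t buf_alloc;
    uint32_t fwd_cnt;
};

/* Returns true and fills *reply when the incoming packet needs an answer. */
static bool model_handle_pkt(const struct model_conn *conn,
                             const struct model_pkt *in,
                             struct model_pkt *reply)
{
    assert(conn != NULL && in != NULL && reply != NULL);
    switch (in->op) {
    case MODEL_OP_CREDIT_REQUEST:
        /* Answer an explicit request with the current credit state. */
        reply->op = MODEL_OP_CREDIT_UPDATE;
        reply->buf_alloc = conn->buf_alloc;
        reply->fwd_cnt = conn->fwd_cnt;
        return true;
    default:
        return false;
    }
}
```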
     

01 Jul, 2021

1 commit

  • Pull networking updates from Jakub Kicinski:
    "Core:

    - BPF:
    - add syscall program type and libbpf support for generating
    instructions and bindings for in-kernel BPF loaders (BPF loaders
    for BPF), this is a stepping stone for signed BPF programs
    - infrastructure to migrate TCP child sockets from one listener to
    another in the same reuseport group/map to improve flexibility
    of service hand-off/restart
    - add broadcast support to XDP redirect

    - allow bypass of the lockless qdisc to improve performance (for
    pktgen: +23% with one thread, +44% with 2 threads)

    - add a simpler version of "DO_ONCE()" which does not require jump
    labels, intended for slow-path usage

    - virtio/vsock: introduce SOCK_SEQPACKET support

    - add getsockopt to retrieve netns cookie

    - ip: treat lowest address of an IPv4 subnet as ordinary unicast
    address allowing reclaiming of precious IPv4 addresses

    - ipv6: use prandom_u32() for ID generation

    - ip: add support for more flexible field selection for hashing
    across multi-path routes (w/ offload to mlxsw)

    - icmp: add support for extended RFC 8335 PROBE (ping)

    - seg6: add support for SRv6 End.DT46 behavior

    - mptcp:
    - DSS checksum support (RFC 8684) to detect middlebox meddling
    - support Connection-time 'C' flag
    - time stamping support

    - sctp: packetization Layer Path MTU Discovery (RFC 8899)

    - xfrm: speed up state addition with seq set

    - WiFi:
    - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
    - aggregation handling improvements for some drivers
    - minstrel improvements for no-ack frames
    - deferred rate control for TXQs to improve reaction times
    - switch from round robin to virtual time-based airtime scheduler

    - add trace points:
    - tcp checksum errors
    - openvswitch - action execution, upcalls
    - socket errors via sk_error_report

    Device APIs:

    - devlink: add rate API for hierarchical control of max egress rate
    of virtual devices (VFs, SFs etc.)

    - don't require RCU read lock to be held around BPF hooks in NAPI
    context

    - page_pool: generic buffer recycling

    New hardware/drivers:

    - mobile:
    - iosm: PCIe Driver for Intel M.2 Modem
    - support for Qualcomm MSM8998 (ipa)

    - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

    - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

    - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

    - NXP SJA1110 Automotive Ethernet 10-port switch

    - Qualcomm QCA8327 switch support (qca8k)

    - Mikrotik 10/25G NIC (atl1c)

    Driver changes:

    - ACPI support for some MDIO, MAC and PHY devices from Marvell and
    NXP (our first foray into MAC/PHY description via ACPI)

    - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

    - Mellanox/Nvidia NIC (mlx5)
    - NIC VF offload of L2 bridging
    - support IRQ distribution to Sub-functions

    - Marvell (prestera):
    - add flower and match all
    - devlink trap
    - link aggregation

    - Netronome (nfp): connection tracking offload

    - Intel 1GE (igc): add AF_XDP support

    - Marvell DPU (octeontx2): ingress ratelimit offload

    - Google vNIC (gve): new ring/descriptor format support

    - Qualcomm mobile (rmnet & ipa): inline checksum offload support

    - MediaTek WiFi (mt76)
    - mt7915 MSI support
    - mt7915 Tx status reporting
    - mt7915 thermal sensors support
    - mt7921 decapsulation offload
    - mt7921 enable runtime pm and deep sleep

    - Realtek WiFi (rtw88)
    - beacon filter support
    - Tx antenna path diversity support
    - firmware crash information via devcoredump

    - Qualcomm WiFi (wcn36xx)
    - Wake-on-WLAN support with magic packets and GTK rekeying

    - Micrel PHY (ksz886x/ksz8081): add cable test support"

    * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
    tcp: change ICSK_CA_PRIV_SIZE definition
    tcp_yeah: check struct yeah size at compile time
    gve: DQO: Fix off by one in gve_rx_dqo()
    stmmac: intel: set PCI_D3hot in suspend
    stmmac: intel: Enable PHY WOL option in EHL
    net: stmmac: option to enable PHY WOL with PMT enabled
    net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
    net: use netdev_info in ndo_dflt_fdb_{add,del}
    ptp: Set lookup cookie when creating a PTP PPS source.
    net: sock: add trace for socket errors
    net: sock: introduce sk_error_report
    net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
    net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
    net: dsa: include fdb entries pointing to bridge in the host fdb list
    net: dsa: include bridge addresses which are local in the host fdb list
    net: dsa: sync static FDB entries on foreign interfaces to hardware
    net: dsa: install the host MDB and FDB entries in the master's RX filter
    net: dsa: reference count the FDB addresses at the cross-chip notifier level
    net: dsa: introduce a separate cross-chip notifier type for host FDBs
    net: dsa: reference count the MDB entries at the cross-chip notifier level
    ...

    Linus Torvalds
     

30 Jun, 2021

2 commits


23 Jun, 2021

1 commit

  • Make sure the_virtio_vsock is not NULL before dereferencing it.

    general protection fault, probably for non-canonical address 0xdffffc0000000071: 0000 [#1] PREEMPT SMP KASAN
    KASAN: null-ptr-deref in range [0x0000000000000388-0x000000000000038f]
    CPU: 0 PID: 8452 Comm: syz-executor406 Not tainted 5.13.0-rc6-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:virtio_transport_seqpacket_allow+0xbf/0x210 net/vmw_vsock/virtio_transport.c:503
    Code: e8 c6 d9 ab f8 84 db 0f 84 0f 01 00 00 e8 09 d3 ab f8 48 8d bd 88 03 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 04 02 84 c0 74 06 0f 8e 2a 01 00 00 44 0f b6 a5 88 03 00 00
    RSP: 0018:ffffc90003757c18 EFLAGS: 00010206
    RAX: dffffc0000000000 RBX: 0000000000000001 RCX: 0000000000000000
    RDX: 0000000000000071 RSI: ffffffff88c908e7 RDI: 0000000000000388
    RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffffff88c90a06 R11: 0000000000000000 R12: 0000000000000000
    R13: ffffffff88c90840 R14: 0000000000000000 R15: 0000000000000001
    FS: 0000000001bee300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000082 CR3: 000000002847e000 CR4: 00000000001506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    vsock_assign_transport+0x575/0x700 net/vmw_vsock/af_vsock.c:490
    vsock_connect+0x200/0xc00 net/vmw_vsock/af_vsock.c:1337
    __sys_connect_file+0x155/0x1a0 net/socket.c:1824
    __sys_connect+0x161/0x190 net/socket.c:1841
    __do_sys_connect net/socket.c:1851 [inline]
    __se_sys_connect net/socket.c:1848 [inline]
    __x64_sys_connect+0x6f/0xb0 net/socket.c:1848
    do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
    entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x43ee69
    Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffd49e7c788 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
    RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ee69
    RDX: 0000000000000010 RSI: 0000000020000080 RDI: 0000000000000003
    RBP: 0000000000402e50 R08: 0000000000000000 R09: 0000000000400488
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000402ee0
    R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488

    Fixes: 53efbba12cc7 ("virtio/vsock: enable SEQPACKET for transport")
    Signed-off-by: Eric Dumazet
    Cc: Arseny Krasnov
    Reported-by: syzbot
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Eric Dumazet
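
The guard added by the commit above follows a common pattern: load the global pointer once, then test it before use. The sketch below is a plain C model with hypothetical names; the kernel additionally protects the pointer with RCU, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_virtio_vsock { bool seqpacket_allow; };

/* NULL until a device has been probed (and again after removal). */
static struct model_virtio_vsock *model_the_vsock;

/* Report whether SEQPACKET is allowed; no device means "not allowed". */
static bool model_seqpacket_allow(void)
{
    struct model_virtio_vsock *vsock = model_the_vsock; /* single load */

    if (vsock == NULL)
        return false;
    return vsock->seqpacket_allow;
}
```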
     

22 Jun, 2021

1 commit

  • The client's sk_state will be set to TCP_ESTABLISHED if the server
    replies to the client's connect request.

    However, if the client has a pending signal, its sk_state will be set
    to TCP_CLOSE without notifying the server, so the server will hold the
    corrupt connection.

            client                       server

    1. sk_state=TCP_SYN_SENT     |
    2. call ->connect()          |
    3. wait reply                |
                                 | 4. sk_state=TCP_ESTABLISHED
                                 | 5. insert to connected list
                                 | 6. reply to the client
    7. sk_state=TCP_ESTABLISHED  |
    8. insert to connected list  |
    9. *signal pending* release() |
       virtio_transport_close
         if (!(sk->sk_state == TCP_ESTABLISHED ||
               sk->sk_state == TCP_CLOSING))
             return true; *return at here, the server cannot notice the connection is corrupt*

    So the client should notify the peer in this case.

    Cc: David S. Miller
    Cc: Jakub Kicinski
    Cc: Jorgen Hansen
    Cc: Norbert Slusarek
    Cc: Andra Paraschiv
    Cc: Colin Ian King
    Cc: David Brazdil
    Cc: Alexander Popov
    Suggested-by: Stefano Garzarella
    Link: https://lkml.org/lkml/2021/5/17/418
    Signed-off-by: lixianming
    Signed-off-by: Longpeng(Mike)
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Longpeng(Mike)
     

19 Jun, 2021

3 commits

  • When memcpy_to_msg() fails in virtio_transport_seqpacket_do_dequeue(),
    we already set `dequeued_len` with the negative error value returned
    by memcpy_to_msg().

    So we can directly check `dequeued_len` value instead of using a
    dedicated flag variable to skip the copy path for the rest of
    fragments.

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
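
The simplification in the first commit of this date (using `dequeued_len` itself as the error indicator) can be sketched as a loop over message fragments. The fragment type and `model_dequeue()` are hypothetical, and `-EFAULT` is used only as a representative error; a failed copy is simulated with a flag rather than a real `memcpy_to_msg()` call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_frag {
    size_t len;       /* payload bytes in this fragment */
    bool eom;         /* last fragment of the message */
    bool copy_fails;  /* simulate a failing copy to user space */
};

/* Dequeue one message. Returns the number of bytes copied or a negative
 * errno. *consumed receives how many fragments were taken off the queue. */
static int model_dequeue(const struct model_frag *frags, size_t n,
                         size_t *consumed)
{
    int dequeued_len = 0;
    size_t i;

    assert(consumed != NULL);
    for (i = 0; i < n; i++) {
        /* A negative dequeued_len means an earlier copy failed: keep
         * consuming fragments, but skip the copy path for the rest. */
        if (dequeued_len >= 0) {
            if (frags[i].copy_fails)
                dequeued_len = -EFAULT;
            else
                dequeued_len += (int)frags[i].len;
        }
        if (frags[i].eom) {
            i++;
            break;
        }
    }
    *consumed = i;
    return dequeued_len;
}
```

No separate flag variable is needed: the sign of `dequeued_len` carries the same information.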
     
  • vsock_wait_data() is used only by STREAM and SEQPACKET sockets,
    so let's rename it to vsock_connectible_wait_data(), using the same
    nomenclature (connectible) used in other functions after the
    introduction of SEQPACKET.

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     
  • vsock_has_data() is used only by STREAM and SEQPACKET sockets,
    so let's rename it to vsock_connectible_has_data(), using the same
    nomenclature (connectible) used in other functions after the
    introduction of SEQPACKET.

    Signed-off-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Stefano Garzarella
     

12 Jun, 2021

14 commits

  • Add SEQPACKET ops for loopback transport and 'seqpacket_allow()'
    callback.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • To make transport work with SOCK_SEQPACKET add two things:
    1) SOCK_SEQPACKET ops for virtio transport and 'seqpacket_allow()'
    callback.
    2) Handling of SEQPACKET bit: guest tries to negotiate it with vhost,
    so feature will be enabled only if bit is negotiated with device.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Small updates to make SOCK_SEQPACKET work:
    1) Send SHUTDOWN on socket close for SEQPACKET type.
    2) Set SEQPACKET packet type during send.
    3) Set 'VIRTIO_VSOCK_SEQ_EOR' bit in flags for last
    packet of message.
    4) Implement data check function for SEQPACKET.
    5) Check for max datagram size.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Update the current receive logic for SEQPACKET support: check the
    packet and socket types on receive (on mismatch, reset the
    connection). Increment the EOR counter on receive. Also, if the
    buffer of a new packet was appended to the buffer of the last packet
    in the rx queue, update the flags of the last packet with the flags
    of the new packet.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • The callback fetches RW packets from the socket's rx queue until the
    whole record is copied (if the user's buffer is full, the user is not
    woken up). This is done to avoid stalling the sender: if we wake up
    the user and it leaves the syscall, nobody will send a credit update
    for the rest of the record, and the sender will wait for the receiver
    to enter the read syscall again. So if the user buffer is full, we
    just send a credit update and drop the data.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • This function is static and 'hdr' arg was always NULL.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • There is no need to set a packet type that differs from the socket
    type, so move passing the packet type from the 'info' structure to
    the 'virtio_transport_send_pkt_info()' function. Since only the
    stream type is currently supported, set it directly in
    'virtio_transport_send_pkt_info()', so callers don't need to set it.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Replace 'stream' with 'connection oriented' in comments, as
    SEQPACKET is also connection oriented.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Add socket ops for SEQPACKET type and .seqpacket_allow() callback
    to query transports if they support SEQPACKET. Also split path
    for data check for STREAM and SEQPACKET branches.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Update current stream enqueue function for SEQPACKET
    support:
    1) Call transport's seqpacket enqueue callback.
    2) Return value from enqueue function is whole record length or error
    for SOCK_SEQPACKET.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Add receive loop for SEQPACKET. It looks like receive loop for
    STREAM, but there are differences:
    1) It doesn't call notify callbacks.
    2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
    these values are meaningless in the SEQPACKET case.
    3) It waits until whole record is received.
    4) It processes and sets 'MSG_TRUNC' flag.

    So to avoid extra conditions for two types of socket inside one loop, two
    independent functions were created.

    Signed-off-by: Arseny Krasnov
    Signed-off-by: David S. Miller

    Arseny Krasnov
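
Points 3) and 4) of the receive loop above can be illustrated with a model that delivers one whole message at a time and flags truncation. This is a sketch with a hypothetical `model_seqpacket_recv()`; it covers only the case where the caller did not itself pass `MSG_TRUNC`, and it returns the number of bytes copied. `MSG_TRUNC` is the real flag from `<sys/socket.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Copy one complete message into buf. If the message is longer than the
 * buffer, the excess is dropped and MSG_TRUNC is set in *msg_flags. */
static size_t model_seqpacket_recv(const unsigned char *msg, size_t msg_len,
                                   unsigned char *buf, size_t buf_len,
                                   int *msg_flags)
{
    size_t copied = msg_len < buf_len ? msg_len : buf_len;

    assert(msg != NULL && buf != NULL && msg_flags != NULL);
    memcpy(buf, msg, copied);
    if (msg_len > buf_len)
        *msg_flags |= MSG_TRUNC; /* the tail of this message is lost */
    return copied;
}
```

Unlike a STREAM socket, the dropped tail is not returned by a later call: the next receive starts at the next message boundary.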
     
  • Some code in the receive data loop can be shared between SEQPACKET
    and STREAM sockets, while other parts are type specific, so move the
    STREAM-specific receive logic to the dedicated
    '__vsock_stream_recvmsg()' function, while the checks that are the
    same for both STREAM and SEQPACKET sockets stay in
    'vsock_connectible_recvmsg()'.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Wait loop for data could be shared between SEQPACKET and STREAM
    sockets, so move it to dedicated function. While moving the code
    around, let's update an old comment.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     
  • Prepare af_vsock.c for SEQPACKET support: give functions such as
    setsockopt(), getsockopt(), connect(), recvmsg() and sendmsg() more
    general names, because they are shared with stream sockets.

    Signed-off-by: Arseny Krasnov
    Reviewed-by: Stefano Garzarella
    Signed-off-by: David S. Miller

    Arseny Krasnov
     

11 Jun, 2021

1 commit


15 May, 2021

1 commit

  • Pointers to ring-buffer packets sent by Hyper-V are used within the
    guest VM. Hyper-V can send packets with erroneous values or modify
    packet fields after they are processed by the guest. To defend
    against these scenarios, return a copy of the incoming VMBus packet
    after validating its length and offset fields in hv_pkt_iter_first().
    In this way, the packet can no longer be modified by the host.

    Signed-off-by: Andres Beltran
    Co-developed-by: Andrea Parri (Microsoft)
    Signed-off-by: Andrea Parri (Microsoft)
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20210408161439.341988-1-parri.andrea@gmail.com
    Signed-off-by: Wei Liu

    Andres Beltran
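
The copy-then-use defence described above can be sketched in user-space C. Everything here is hypothetical: the two-field header only loosely resembles a VMBus packet descriptor, sizes are counted in 8-byte units as an assumption, and `model_pkt_copy()` is not the signature of `hv_pkt_iter_first()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct model_pkt_hdr {
    uint16_t offset8; /* payload offset, in units of 8 bytes */
    uint16_t len8;    /* total packet length, in units of 8 bytes */
};

/* Copy a packet out of memory shared with the host into a private buffer.
 * Returns the packet length in bytes, or -1 if validation fails. */
static int model_pkt_copy(const unsigned char *shared, size_t shared_len,
                          unsigned char *priv, size_t priv_len)
{
    struct model_pkt_hdr hdr;
    size_t off, len;

    assert(shared != NULL && priv != NULL);
    if (shared_len < sizeof(hdr))
        return -1;
    memcpy(&hdr, shared, sizeof(hdr)); /* read the header exactly once */
    off = (size_t)hdr.offset8 * 8;
    len = (size_t)hdr.len8 * 8;
    if (off < sizeof(hdr) || off > len || len > shared_len || len > priv_len)
        return -1;
    memcpy(priv, shared, len);
    /* Pin the validated header so later host writes cannot change it. */
    memcpy(priv, &hdr, sizeof(hdr));
    return (int)len;
}
```

After the call, the consumer works only on the private copy, so a later change to the shared buffer has no effect on the data being processed.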
     

01 May, 2021

1 commit

  • Variable 'err' is set to zero but this value is never read as it is
    overwritten with a new value later on, hence it is a redundant
    assignment and can be removed.

    Clean up the following clang-analyzer warning:

    net/vmw_vsock/vmci_transport.c:948:2: warning: Value stored to 'err' is
    never read [clang-analyzer-deadcode.DeadStores]

    Reported-by: Abaci Robot
    Signed-off-by: Yang Li
    Signed-off-by: David S. Miller

    Yang Li
     

27 Apr, 2021

1 commit