02 Dec, 2020

1 commit

  • The copy_to_user() function returns the number of bytes remaining to be
    copied, but the ioctl should return -EFAULT to the user.

    Fixes: 1b48dc03e575 ("vhost: vdpa: report iova range")
    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/X8c32z5EtDsMyyIL@mwanda
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Reviewed-by: Stefano Garzarella

    Dan Carpenter
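
The return-value convention behind this fix can be sketched in userspace. This is only an illustration: `mock_copy_to_user()`, `report_range_buggy()` and `report_range_fixed()` are hypothetical names that mimic the "returns bytes not copied" convention, not code from drivers/vhost/vdpa.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_user(): returns the number of bytes
 * NOT copied (0 on success). A NULL destination models a faulting pointer. */
static unsigned long mock_copy_to_user(void *dst, const void *src, unsigned long n)
{
    assert(src != NULL);
    if (dst == NULL)
        return n;
    memcpy(dst, src, n);
    return 0;
}

struct iova_range {
    unsigned long long first;
    unsigned long long last;
};

/* Buggy shape: a positive "bytes remaining" count leaks out as the result. */
static long report_range_buggy(void *argp, const struct iova_range *r)
{
    return (long)mock_copy_to_user(argp, r, sizeof(*r));
}

/* Fixed shape: any short copy is turned into -EFAULT. */
static long report_range_fixed(void *argp, const struct iova_range *r)
{
    if (mock_copy_to_user(argp, r, sizeof(*r)))
        return -EFAULT;
    return 0;
}
```

With a faulting destination, the buggy variant returns a positive byte count that a caller could mistake for success, while the fixed variant returns -EFAULT.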
     

25 Nov, 2020

1 commit

  • Pinned pages are not properly accounted for, particularly when
    a mapping error occurs on IOTLB update. Clean up dangling
    pinned pages in the error path.

    The memory usage for bookkeeping pinned pages is reverted
    to what it was before: only a single free page is needed.
    This helps reduce the host memory demand for a VM with a large
    amount of memory, or in situations where the host is running
    short of free memory.

    Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
    Signed-off-by: Si-Wei Liu
    Link: https://lore.kernel.org/r/1604618793-4681-1-git-send-email-si-wei.liu@oracle.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Si-Wei Liu
     

30 Oct, 2020

3 commits

  • LKP flagged the variable 'ret' in vhost_vdpa_setup_vq_irq() as
    unused and suggested removing it. It actually stores the
    return value of irq_bypass_register_producer(), which was never
    checked; the failure case should be handled.

    This commit prints a message if registering the irq bypass producer
    fails. In this case, the vqs still remain functional.

    Signed-off-by: Zhu Lingshan
    Reported-by: kernel test robot
    Link: https://lore.kernel.org/r/20201023104046.404794-1-lingshan.zhu@intel.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Zhu Lingshan
     
  • This reverts commit 7ed9e3d97c32d969caded2dfb6e67c1a2cc5a0b1.

    The patch creates a DoS risk since it can result in a high order memory
    allocation.

    Fixes: 7ed9e3d97c32d ("vhost-vdpa: fix page pinning leakage in error path")
    Cc: stable@vger.kernel.org
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • The copy_to/from_user() functions return the number of bytes which we
    weren't able to copy but the ioctl should return -EFAULT if they fail.

    Fixes: a127c5bbb6a8 ("vhost-vdpa: fix backend feature ioctls")
    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/20201023120853.GI282278@mwanda
    Signed-off-by: Michael S. Tsirkin
    Cc: stable@vger.kernel.org
    Acked-by: Jason Wang

    Dan Carpenter
     

23 Oct, 2020

1 commit

  • This patch introduces a new ioctl for the vhost-vdpa device that
    reports the IOVA range supported by the device.

    For a device that implements the get_iova_range() method, we fetch the
    range from the vDPA device. If the device doesn't implement
    get_iova_range() but depends on the platform IOMMU, we query it via
    DOMAIN_ATTR_GEOMETRY; otherwise [0, ULLONG_MAX] is assumed.

    For safety, this patch also rejects map requests that are not within
    the valid range.

    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20201023090043.14430-3-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
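
A minimal sketch of the kind of range check described above, assuming inclusive `first`/`last` bounds. The function name `check_map_range()` is hypothetical, and the zero-size and wrap-around guards are additions for this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Inclusive IOVA bounds, e.g. [0, UINT64_MAX] when nothing narrower is known. */
struct iova_range {
    uint64_t first;
    uint64_t last;
};

/* Returns 0 if [iova, iova + size - 1] lies inside the range, -EINVAL otherwise. */
static int check_map_range(const struct iova_range *range,
                           uint64_t iova, uint64_t size)
{
    assert(range->first <= range->last);

    if (size == 0)
        return -EINVAL;
    if (iova < range->first)
        return -EINVAL;
    /* The last byte is iova + size - 1; reject requests where that wraps. */
    if (size - 1 > UINT64_MAX - iova)
        return -EINVAL;
    if (iova + size - 1 > range->last)
        return -EINVAL;
    return 0;
}
```

For a range of [0x1000, 0xffff], a request for 0x1000 bytes at 0xf000 ends exactly on the last valid byte and is accepted, while one more byte is rejected.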
     

21 Oct, 2020

2 commits

  • This commit removes unnecessary spin_locks in vhost_vring_call
    and related operations, because we manipulate irq offloading
    contents in the vhost_vdpa ioctl code path, which is already
    protected by the dev mutex and vq mutex.

    Signed-off-by: Zhu Lingshan
    Link: https://lore.kernel.org/r/20200909065234.3313-1-lingshan.zhu@intel.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Zhu Lingshan
     
  • linux/kernel.h is included more than once. Remove the one that isn't
    necessary.

    Signed-off-by: Tian Tao
    Link: https://lore.kernel.org/r/1600131102-24672-1-git-send-email-tiantao6@hisilicon.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Tian Tao
     

04 Oct, 2020

2 commits

  • Pinned pages are not properly accounted for, particularly when
    a mapping error occurs on IOTLB update. Clean up dangling
    pinned pages in the error path. Inflight pinned pages,
    specifically for a memory region that spans multiple
    chunks, would need more than one free page for
    bookkeeping and accounting. For simplicity, pin pages
    for all memory in the IOVA range in one go rather than
    making multiple pin_user_pages calls to cover the entire
    region. This way it's easier to track and account for the
    pages already mapped, particularly for clean-up in the
    error path.

    Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
    Signed-off-by: Si-Wei Liu
    Link: https://lore.kernel.org/r/1601701330-16837-3-git-send-email-si-wei.liu@oracle.com
    Signed-off-by: Michael S. Tsirkin

    Si-Wei Liu
     
  • vhost_vdpa_map() should remove the iotlb entry just added
    if the corresponding mapping fails to set up properly.

    Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
    Signed-off-by: Si-Wei Liu
    Link: https://lore.kernel.org/r/1601701330-16837-2-git-send-email-si-wei.liu@oracle.com
    Signed-off-by: Michael S. Tsirkin

    Si-Wei Liu
     

30 Sep, 2020

1 commit

  • We must free the vqs array in the open failure path, because
    vhost_vdpa_release will not be called.

    Signed-off-by: Mike Christie
    Link: https://lore.kernel.org/r/1600712588-9514-2-git-send-email-michael.christie@oracle.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Mike Christie
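
The cleanup pattern this fix relies on can be sketched as follows. All names (`toy_dev`, `toy_open()`, `toy_release()`, the `fail_later` switch) are hypothetical; the point is only that an open path which fails after allocating must free its own allocation, because the release path never runs for a failed open.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_vq {
    int placeholder;
};

struct toy_dev {
    struct toy_vq **vqs;
    int nvqs;
};

/* fail_later simulates a setup step that fails after the array is allocated. */
static int toy_open(struct toy_dev *d, int nvqs, int fail_later)
{
    int ret;

    assert(nvqs > 0);
    d->vqs = calloc((size_t)nvqs, sizeof(*d->vqs));
    if (!d->vqs)
        return -ENOMEM;
    d->nvqs = nvqs;

    if (fail_later) {
        ret = -EIO;
        goto err_free_vqs;
    }
    return 0;

err_free_vqs:
    /* Release is never called for a failed open, so free here. */
    free(d->vqs);
    d->vqs = NULL;
    d->nvqs = 0;
    return ret;
}

/* Normal teardown, only reached after a successful open. */
static void toy_release(struct toy_dev *d)
{
    free(d->vqs);
    d->vqs = NULL;
    d->nvqs = 0;
}
```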
     

24 Sep, 2020

1 commit

  • Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
    introduced two malfunctioning backend features ioctls:

    1) the ioctls were blindly added to the vring ioctl path instead of
    the vdpa device ioctl path
    2) vhost_set_backend_features() was called while the dev mutex was
    already held, which leads to a deadlock

    This patch fixes the above issues.

    Cc: Eli Cohen
    Reported-by: Zhu Lingshan
    Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200907104343.31141-1-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     

06 Aug, 2020

6 commits

  • Modify get_vq_state() so it returns an error code. In case of hardware
    acceleration, the available index may be retrieved from the device, an
    operation that can possibly fail.

    Reviewed-by: Parav Pandit
    Signed-off-by: Eli Cohen
    Link: https://lore.kernel.org/r/20200804162048.22587-9-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Eli Cohen
     
  • For now, VQ state consists of a 16-bit available index value encoded in
    a u64 variable. In the future it will be extended to contain more
    fields. Use a struct to hold the state, currently containing only a
    single u16 for the available index, so that fields can be added later.

    Reviewed-by: Parav Pandit
    Acked-by: Jason Wang
    Signed-off-by: Eli Cohen
    Link: https://lore.kernel.org/r/20200804162048.22587-8-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin

    Eli Cohen
     
  • This will enable vdpa providers to add support for multi queue feature
    and publish it to upper layers (vhost and virtio).

    Signed-off-by: Max Gurtovoy
    Reviewed-by: Jason Wang
    Link: https://lore.kernel.org/r/20200804162048.22587-7-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin

    Max Gurtovoy
     
  • This patch extends the vhost IOTLB API to accept batch updating hints
    from userspace. When userspace wants to update the device IOTLB in a
    batch, it may do:

    1) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_BEGIN flag
    2) Perform a batch of IOTLB updates via VHOST_IOTLB_UPDATE/INVALIDATE
    3) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_END flag

    Vhost-vdpa may decide to batch the IOMMU/IOTLB updating in step 3 when
    the vDPA device supports the set_map() op. This is useful for vDPA
    devices that want to know all the mappings in order to tweak their own
    DMA translation logic.

    For vDPA devices that don't require set_map(), behavior is unchanged.

    This capability is advertised via VHOST_BACKEND_F_IOTLB_BATCH capability.

    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200804162048.22587-5-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
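
The three-step batching flow above can be modelled with a small toy state machine. This is a sketch under assumptions: the enum values, `toy_backend` and `toy_process()` are hypothetical and local to this example, and the counters only stand in for calls into a device's set_map()-style or per-entry mapping ops.

```c
#include <assert.h>
#include <stddef.h>

/* Message kinds named after the flags in the commit message; the numeric
 * values are local to this sketch and are not the uAPI values. */
enum toy_msg {
    TOY_UPDATE,
    TOY_INVALIDATE,
    TOY_BATCH_BEGIN,
    TOY_BATCH_END
};

struct toy_backend {
    int has_set_map;      /* device wants the whole table via set_map() */
    int in_batch;         /* between BATCH_BEGIN and BATCH_END */
    int set_map_calls;    /* full-table pushes to the device */
    int per_entry_calls;  /* single-mapping map/unmap calls */
};

static void toy_process(struct toy_backend *b, enum toy_msg msg)
{
    assert(b != NULL);

    switch (msg) {
    case TOY_BATCH_BEGIN:
        b->in_batch = 1;
        break;
    case TOY_UPDATE:
    case TOY_INVALIDATE:
        if (!b->has_set_map)
            b->per_entry_calls++;   /* batching changes nothing here */
        else if (!b->in_batch)
            b->set_map_calls++;     /* unbatched: push on every change */
        /* batched with set_map: defer the push until BATCH_END */
        break;
    case TOY_BATCH_END:
        if (b->in_batch && b->has_set_map)
            b->set_map_calls++;     /* one push for the whole batch */
        b->in_batch = 0;
        break;
    }
}
```

In this model, three updates wrapped in a batch cost one full-table push on a set_map() device, versus three pushes when sent unbatched.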
     
  • This patch lets userspace get and set the backend features of
    vhost-vdpa.

    Signed-off-by: Cindy Lu
    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200804162048.22587-4-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     
  • Use a 'switch' statement to make the code easier to extend.

    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200804162048.22587-2-eli@mellanox.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     

05 Aug, 2020

5 commits

  • The IRQ of a vq is not expected to change in a DRIVER_OK ~ !DRIVER_OK
    period, for irq offloading purposes. Place this comment next to the
    bus op get_vq_irq rather than in set_status in vhost_vdpa.

    Signed-off-by: Zhu Lingshan
    Link: https://lore.kernel.org/r/20200804102123.69978-1-lingshan.zhu@intel.com
    Signed-off-by: Michael S. Tsirkin

    Zhu Lingshan
     
  • This patch introduces a set of functions to set up, tear down,
    and update irq offloading by registering, unregistering,
    and re-registering the irq_bypass_producer, respectively.

    With these functions, this commit can set up or tear down
    irq offloading when DRIVER_OK is set or cleared, and
    update irq offloading through SET_VRING_CALL.

    Signed-off-by: Zhu Lingshan
    Suggested-by: Jason Wang
    Link: https://lore.kernel.org/r/20200731065533.4144-5-lingshan.zhu@intel.com
    Signed-off-by: Michael S. Tsirkin

    Zhu Lingshan
     
  • This commit introduces struct vhost_vring_call, which replaces the
    raw struct eventfd_ctx *call_ctx in struct vhost_virtqueue.
    Besides the eventfd_ctx, it contains a spin lock and an
    irq_bypass_producer.

    Signed-off-by: Zhu Lingshan
    Suggested-by: Jason Wang
    Link: https://lore.kernel.org/r/20200731065533.4144-2-lingshan.zhu@intel.com
    Signed-off-by: Michael S. Tsirkin

    Zhu Lingshan
     
  • We used to have a per-device feature whitelist to filter out
    unsupported virtio features. But this seems unnecessary since:

    - the main idea behind the feature whitelist was to block the control
    vq feature until we finalize the control virtqueue API. But the current
    vhost-vDPA uAPI is sufficient to support a control virtqueue. For a
    device that has a hardware control virtqueue, the vDPA device driver
    can just set up the hardware virtqueue and let userspace use the
    hardware virtqueue directly. For a device that doesn't have a control
    virtqueue, the vDPA device driver needs to use e.g. vringh to emulate
    a software control virtqueue.
    - we don't do it in the virtio-vDPA driver

    So remove this limitation.

    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200720085043.16485-1-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     
  • For new helpers handling legacy features to be effective,
    vhost needs to invoke them. Tie them in.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

04 Aug, 2020

1 commit


23 Jun, 2020

1 commit

  • The "vma->vm_pgoff" variable is an unsigned long so if it's larger than
    INT_MAX then "index" can be negative leading to an underflow. Fix this
    by changing the type of "index" to "unsigned long".

    Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap")
    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/20200610085852.GB5439@mwanda
    Signed-off-by: Michael S. Tsirkin

    Dan Carpenter
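
A small sketch of why the type of "index" matters for the bounds check. The helper names and the `max_index` parameter are hypothetical. Converting an out-of-range `unsigned long` to `int` is implementation-defined in C; on common compilers it wraps to a negative value, which then slips past an upper-bound comparison.

```c
#include <assert.h>
#include <limits.h>

/* Old shape: the page offset is narrowed to int before the bounds check.
 * Returns 1 if the index is accepted, 0 if rejected. */
static int index_ok_int(unsigned long pgoff, int max_index)
{
    assert(max_index >= 0);
    /* Implementation-defined for pgoff > INT_MAX; commonly becomes negative. */
    int index = (int)pgoff;
    return !(index > max_index);
}

/* Fixed shape: the index keeps the full unsigned long width, so a huge
 * offset stays huge and is rejected by the same comparison. */
static int index_ok_ulong(unsigned long pgoff, unsigned long max_index)
{
    unsigned long index = pgoff;
    return !(index > max_index);
}
```

With `pgoff = (unsigned long)INT_MAX + 1`, the `unsigned long` variant rejects the index, whereas the `int` variant typically accepts it because the narrowed value is negative.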
     

11 Jun, 2020

1 commit

  • Pull virtio updates from Michael Tsirkin:

    - virtio-mem: paravirtualized memory hotplug

    - support doorbell mapping for vdpa

    - config interrupt support in ifc

    - fixes all over the place

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits)
    vhost/test: fix up after API change
    virtio_mem: convert device block size into 64bit
    virtio-mem: drop unnecessary initialization
    ifcvf: implement config interrupt in IFCVF
    vhost: replace -1 with VHOST_FILE_UNBIND in ioctls
    vhost_vdpa: Support config interrupt in vdpa
    ifcvf: ignore continuous setting same status value
    virtio-mem: Don't rely on implicit compiler padding for requests
    virtio-mem: Try to unplug the complete online memory block first
    virtio-mem: Use -ETXTBSY as error code if the device is busy
    virtio-mem: Unplug subblocks right-to-left
    virtio-mem: Drop manual check for already present memory
    virtio-mem: Add parent resource for all added "System RAM"
    virtio-mem: Better retry handling
    virtio-mem: Offline and remove completely unplugged memory blocks
    mm/memory_hotplug: Introduce offline_and_remove_memory()
    virtio-mem: Allow to offline partially unplugged memory blocks
    mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE
    virtio-mem: Paravirtualized memory hotunplug part 2
    virtio-mem: Paravirtualized memory hotunplug part 1
    ...

    Linus Torvalds
     

10 Jun, 2020

1 commit

  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

07 Jun, 2020

1 commit


05 Jun, 2020

3 commits

  • There could be ways to support doorbell mapping with !MMU, but things
    like pgprot_noncached are not universally supported.
    Fixable, but just disable this for now.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • Currently the doorbell is relayed via eventfd, which may have
    significant overhead because of the cost of vmexits or syscalls. This
    patch introduces mmap()-based doorbell mapping, which can eliminate
    that overhead.

    To ease the userspace modeling of the doorbell layout (usually
    virtio-pci), this patch starts from a doorbell-per-page
    model. Vhost-vdpa only supports hardware doorbells that sit at the
    boundary of a page and do not share the page with other registers.

    The doorbell of each virtqueue must be mapped separately; pgoff is the
    index of the virtqueue. This allows userspace to map a subset of the
    doorbells, which may be useful for the implementation of a
    software-assisted virtqueue (control vq) in the future.

    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200529080303.15449-5-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
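
From userspace, "pgoff is the index of the virtqueue" means the mmap() byte offset is the virtqueue index multiplied by the page size. The sketch below assumes an already-open vhost-vdpa file descriptor; `doorbell_offset()` and `map_doorbell()` are hypothetical helper names, and the `PROT_WRITE`/`MAP_SHARED` choice is an assumption about what the driver accepts rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* One doorbell per page: byte offset = virtqueue index * page size. */
static off_t doorbell_offset(unsigned int vq_index, long page_size)
{
    assert(page_size > 0);
    return (off_t)vq_index * (off_t)page_size;
}

/* Map the doorbell page of one virtqueue; returns MAP_FAILED on error. */
static void *map_doorbell(int vhost_vdpa_fd, unsigned int vq_index)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0)
        return MAP_FAILED;

    /* Exactly one page per mapping, one mapping per virtqueue. */
    return mmap(NULL, (size_t)page_size, PROT_WRITE, MAP_SHARED,
                vhost_vdpa_fd, doorbell_offset(vq_index, page_size));
}
```

A caller would invoke `map_doorbell(fd, i)` once for each virtqueue it wants to kick directly and `munmap()` each page when done.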
     
  • The vDPA device currently relays the eventfd via the vhost worker. This
    is inefficient due to the latency of wakeup and scheduling, so this
    patch introduces a use_worker attribute for the vhost device. When
    use_worker is not set with vhost_dev_init(), vhost won't try to
    allocate a worker thread and the vhost_poll will be processed directly
    in the wakeup function.

    This helps vDPA since it reduces the latency caused by the vhost worker.

    In my testing, it saves 0.2 ms in pings between VMs on a mutual host.

    Signed-off-by: Zhu Lingshan
    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200529080303.15449-2-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     

17 Apr, 2020

2 commits

  • Fix the following gcc warning:
    drivers/vhost/vdpa.c:299:5: warning: variable 'status' set but not used [-Wunused-but-set-variable]
    u8 status;
    ^~~~~~

    Reported-by: Hulk Robot
    Signed-off-by: Jason Yan
    Link: https://lore.kernel.org/r/20200402065106.20108-1-yanaijie@huawei.com
    Signed-off-by: Michael S. Tsirkin

    Jason Yan
     
  • container_of is never null, so this null check is
    unnecessary.

    Addresses-Coverity-ID: 1492006 ("Logically dead code")
    Fixes: 20453a45fb06 ("vhost: introduce vDPA-based backend")
    Signed-off-by: Gustavo A. R. Silva
    Link: https://lore.kernel.org/r/20200330235040.GA9997@embeddedor
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Gustavo A. R. Silva
     

02 Apr, 2020

1 commit

  • This patch introduces a vDPA-based vhost backend. This backend is
    built on top of the same interface defined in virtio-vDPA and provides
    a generic vhost interface for userspace to accelerate the virtio
    devices in guest.

    This backend is implemented as a vDPA device driver on top of the same
    ops used in virtio-vDPA. It will create a char device entry named
    vhost-vdpa-$index for userspace to use. Userspace can use vhost ioctls
    on top of this char device to set up the backend.

    Vhost ioctls are extended to make them type agnostic and behave like a
    virtio device. This helps to eliminate type-specific APIs like those
    of vhost_net/scsi/vsock:

    - VHOST_VDPA_GET_DEVICE_ID: get the virtio device ID, which is defined
    by the virtio specification to distinguish between different types of
    devices
    - VHOST_VDPA_GET_VRING_NUM: get the maximum size of virtqueue
    supported by the vDPA device
    - VHOST_VDPA_SET/GET_STATUS: set and get virtio status of vDPA device
    - VHOST_VDPA_SET/GET_CONFIG: access virtio config space
    - VHOST_VDPA_SET_VRING_ENABLE: enable a specific virtqueue

    For memory mapping, IOTLB API is mandated for vhost-vDPA which means
    userspace drivers are required to use
    VHOST_IOTLB_UPDATE/VHOST_IOTLB_INVALIDATE to add or remove mapping for
    a specific userspace memory region.

    The vhost-vDPA API is designed to be type agnostic, but it allows only
    net devices at the current stage. Due to the lack of control virtqueue
    support, some features are filtered out by vhost-vdpa.

    We will enable more features and devices in the near future.

    Signed-off-by: Tiwei Bie
    Signed-off-by: Eugenio Pérez
    Signed-off-by: Jason Wang
    Link: https://lore.kernel.org/r/20200326140125.19794-8-jasowang@redhat.com
    Signed-off-by: Michael S. Tsirkin

    Tiwei Bie