09 Nov, 2012

1 commit

  • Virtio wants to release used indices after the corresponding
    virtio device has been unregistered. However, virtio does not
    hold an extra reference, giving up its last reference with
    device_unregister(), making accessing dev->index afterwards
    invalid.

    I actually saw problems when testing my (not-yet-merged)
    virtio-ccw code:

    - device_add virtio-net,id=xxx
    -> creates device virtio with n>0

    - device_del xxx
    -> deletes virtio, but calls ida_simple_remove with an
    index of 0

    - device_add virtio-net,id=xxx
    -> tries to add virtio0, which is still in use...

    So let's save the index we want to release before calling
    device_unregister().

    Signed-off-by: Cornelia Huck
    Acked-by: Sjur Brændeland
    Cc: stable@kernel.org
    Signed-off-by: Rusty Russell

    Cornelia Huck
     

28 Sep, 2012

11 commits

  • If a virtio device reports a QueueNumMax of 0, vring_new_virtqueue()
    doesn't check this, and thanks to an unsigned (i < num - 1) loop
    guard, scribbles over memory when initialising the free list.

    Avoid by not trying to create zero-descriptor queues, as there's no
    way to do any I/O with one.

    Signed-off-by: Brian Foley
    Signed-off-by: Pawel Moll
    Signed-off-by: Rusty Russell

    Brian Foley
     
  • vm_setup_vq fails to allow VirtQueues needing only 2 pages of
    storage, as it should. Found with a kernel using 64kB pages, but
    can be provoked if a virtio device reports QueueNumMax where the
    descriptor table and available ring fit in one page, and the used
    ring on the second (
    Signed-off-by: Pawel Moll
    Signed-off-by: Rusty Russell

    Brian Foley
     
  • Convert a nonnegative error return code to a negative one, as returned
    elsewhere in the function.

    A simplified version of the semantic match that finds this problem is as
    follows: (http://coccinelle.lip6.fr/)

    //
    (
    if@p1 (\(ret < 0\|ret != 0\))
    { ... return ret; }
    |
    ret@p1 = 0
    )
    ... when != ret = e1
    when != &ret
    *if(...)
    {
    ... when != ret = e2
    when forall
    return ret;
    }
    //

    Signed-off-by: Peter Senna Tschudin
    Signed-off-by: Rusty Russell

    Peter Senna Tschudin
     
  • Because of a sanity check in virtio_dev_remove, a buggy device can crash
    kernel. And in case of rproc it's userspace so it's not a good idea.
    We are unloading a driver so how bad can it be?
    Be less aggressive in handling this error: if it's a driver bug,
    warning once should be enough.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • Everyone who selects VIRTIO is also made to select VIRTIO_RING; just make
    them synonymous, since we removed the indirection layer some time ago.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Trying to enable a virtio driver (eg CONFIG_VIRTIO_BLK) is painful
    because it depends on CONFIG_VIRTIO. CONFIG_VIRTIO doesn't tell you
    how to turn it on (it's selected from anything which provides a virtio
    bus).

    This patch at least adds some documentation, visible in menuconfig, as
    a hint.

    Reported-by: Kent Overstreet
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • virtio network device multiqueue support reserves
    vq 3 for future use (useful both for future extensions and to make it
    pretty - this way receive vqs have even and transmit - odd numbers).
    Make it possible to skip initialization for
    specific vq numbers by specifying NULL for name.
    Document this usage as well as (existing) NULL callback.

    Drivers using this not coded up yet, so I simply tested
    with virtio-pci and verified that this patch does
    not break existing drivers.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • Sometimes, virtio device need to configure irq affinity hint to maximize the
    performance. Instead of just exposing the irq of a virtqueue, this patch
    introduce an API to set the affinity for a virtqueue.

    The api is best-effort, the affinity hint may not be set as expected due to
    platform support, irq sharing or irq type. Currently, only pci method were
    implemented and we set the affinity according to:

    - if device uses INTX, we just ignore the request
    - if device has per vq vector, we force the affinity hint
    - if the virtqueues share MSI, make the affinity OR over all affinities
    requested

    Signed-off-by: Jason Wang
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Rusty Russell

    Jason Wang
     
  • Instead of storing the queue index in transport-specific virtio structs,
    this patch moves them to vring_virtqueue and introduces an helper to get
    the value. This lets drivers simplify their management and tracing of
    virtqueues.

    Signed-off-by: Jason Wang
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Rusty Russell

    Jason Wang
     
  • It is not experimental in any vaguely-sane sense.

    Reported-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Devices should depend on virtio, not select it. It's supposed to be
    selected by the particular driver, e.g. VIRTIO_PCI.
    Make balloon depend on VIRTIO and EXPERIMENTAL
    (to match description).

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     

20 Jul, 2012

1 commit

  • This patch changes virtio-scsi to use a new virtio_driver->scan() callback
    so that scsi_scan_host() can be properly invoked once virtio_dev_probe() has
    set add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK) to signal active virtio-ring
    operation, instead of from within virtscsi_probe().

    This fixes a bug where SCSI LUN scanning for both virtio-scsi-raw and
    virtio-scsi/tcm_vhost setups was happening before VIRTIO_CONFIG_S_DRIVER_OK
    had been set, causing VIRTIO_SCSI_S_BAD_TARGET to occur. This fixes a bug
    with virtio-scsi/tcm_vhost where LUN scan was not detecting LUNs.

    Tested with virtio-scsi-raw + virtio-scsi/tcm_vhost w/ IBLOCK on 3.5-rc2 code.

    Signed-off-by: Nicholas Bellinger
    Acked-by: Paolo Bonzini
    Signed-off-by: James Bottomley

    Nicholas Bellinger
     

09 Jul, 2012

1 commit

  • Since ee7cd8981e15bcb365fc762afe3fc47b8242f630 'virtio: expose added
    descriptors immediately.', in virtio balloon virtqueue_get_buf might
    now run concurrently with virtqueue_kick. I audited both and this
    seems safe in practice but this is not guaranteed by the API.
    Additionally, a spurious interrupt might in theory make
    virtqueue_get_buf run in parallel with virtqueue_add_buf, which is
    racy.

    While we might try to protect against spurious callbacks it's
    easier to fix the driver: balloon seems to be the only one
    (mis)using the API like this, so let's just fix balloon.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell (removed unused var)

    Michael S. Tsirkin
     

22 May, 2012

4 commits


17 May, 2012

1 commit

  • When the balloon module is removed, we deflate the balloon, reclaiming
    all the pages that were given to the host. However, we don't update the
    config values for the new balloon size, resulting in the host showing
    outdated balloon values.

    The size update is done after each leak and fill operation, only the
    module removal case was left out.

    Signed-off-by: Amit Shah
    Signed-off-by: Michael S. Tsirkin

    Amit Shah
     

15 Apr, 2012

2 commits

  • As reported by David Gibson, current code handles PAGE_SIZE != 4k
    completely wrong which can lead to guest memory corruption errors:

    - page_to_balloon_pfn is wrong: e.g. on system with 64K page size
    it gives the same pfn value for 16 different pages.

    - we also need to convert back to linux pfns when we free.

    - for each linux page we need to tell host about multiple balloon
    pages, but code only adds one pfn to the array.

    This patch fixes all that, tested with a 64k ppc64 kernel.

    Reported-by: David Gibson
    Tested-by: David Gibson
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • Although virtio config space fields are usually in guest-native endian,
    the spec for the virtio balloon device explicitly states that both fields
    in its config space are little-endian.

    However, the current virtio_balloon driver does not have a suitable endian
    swap for the 'num_pages' field, although it does have one for the 'actual'
    field. This patch corrects the bug, adding sparse annotation while we're
    at it.

    Signed-off-by: David Gibson
    Signed-off-by: Michael S. Tsirkin

    David Gibson
     

31 Mar, 2012

5 commits


29 Mar, 2012

1 commit


01 Mar, 2012

1 commit

  • commit e562966dbaf49e7804097cd991e5d3a8934fc148 added support for S4 to
    the balloon driver. The freeze function did nothing to free the pages,
    since reclaiming the pages from the host to immediately give them back
    (if S4 was successful) seemed wasteful. Also, if S4 wasn't successful,
    the guest would have to re-fill the balloon. On restore, the pages were
    supposed to be marked freed and the free page counters were incremented
    to reflect the balloon was totally deflated.

    However, this wasn't done right. The pages that were earlier taken away
    from the guest during a balloon inflation operation were just shown as
    used pages after a successful restore from S4. Just a fancy way of
    leaking lots of memory.

    Instead of trying that, just leak the balloon on freeze and fill it on
    restore/thaw paths. This works properly now. The optimisation to not
    leak can be added later on after a bit of refactoring of the code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     

28 Jan, 2012

2 commits

  • Use virtio_mb() to make sure the available index to be exposed before
    checking the the avail event. Otherwise we may get stale value of
    avail event in guest and never kick the host after.

    Note: this fixes a bug introduced by ee7cd8981e15bcb365fc762afe3fc47b8242f630.

    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Jason Wang
     
  • Note: this fixes a bug introduced recently in
    7b21e34fd1c272e3a8c3846168f2f6287a4cd72b.

    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Jason Wang
     

12 Jan, 2012

10 commits

  • Handling balloon hibernate / restore is tricky. If the balloon was
    inflated before going into the hibernation state, upon resume, the host
    will not have any memory of that. Any pages that were passed on to the
    host earlier would most likely be invalid, and the host will have to
    re-balloon to the previous value to get in the pre-hibernate state.

    So the only sane thing for the guest to do here is to discard all the
    pages that were put in the balloon. When to discard the pages is the
    next question.

    One solution is to deflate the balloon just before writing the image to
    the disk (in the freeze() PM callback). However, asking for pages from
    the host just to discard them immediately after seems wasteful of
    resources. Hence, it makes sense to do this by just fudging our
    counters soon after wakeup. This means we don't deflate the balloon
    before sleep, and also don't put unnecessary pressure on the host.

    This also helps in the thaw case: if the freeze fails for whatever
    reason, the balloon should continue to remain in the inflated state.
    This was tested by issuing 'swapoff -a' and trying to go into the S4
    state. That fails, and the balloon stays inflated, as expected. Both
    the host and the guest are happy.

    Finally, in the restore() callback, we empty the list of pages that were
    previously given off to the host, add the appropriate number of pages to
    the totalram_pages counter, reset the num_pages counter to 0, and
    all is fine.

    As a last step, delete the vqs on the freeze callback to prepare for
    hibernation, and re-create them in the restore and thaw callbacks to
    resume normal operation.

    The kthread doesn't race with any operations here, since it's frozen
    before the freeze() call and is thawed after the thaw() and restore()
    callbacks, so we're safe with that.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The probe and PM restore functions will share this code.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Handle thaw, restore and freeze notifications from the PM core. Expose
    these to individual virtio drivers that can quiesce and resume vq
    operations. For drivers not implementing the thaw() method, use the
    restore method instead.

    These functions also save device-specific data so that the device can be
    put in pre-suspend state after resume, and disable and enable the PCI
    device in the freeze and resume functions, respectively.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The older PM API doesn't have a way to get notifications on hibernate
    events. Switch to the newer one that gives us those notifications.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • Under the existing #ifdef DEBUG, check that they don't have more than
    1/10 of a second between an add_buf() and a
    virtqueue_notify()/virtqueue_kick_prepare() call.

    We could get false positives on a really busy system, but good for
    development.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • A virtio driver does virtqueue_add_buf() multiple times before finally
    calling virtqueue_kick(); previously we only exposed the added buffers
    in the virtqueue_kick() call. This means we don't need a memory
    barrier in virtqueue_add_buf(), but it reduces concurrency as the
    device (ie. host) can't see the buffers until the kick.

    In the unusual (but now possible) case where a driver does add_buf()
    and get_buf() without doing a kick, we do need to insert one before
    our counter wraps. Otherwise we could wrap num_added, and later on
    not realize that we have passed the marker where we should have
    kicked.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted
    in vring_new_virtqueue()).

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Based on patch by Christoph for virtio_blk speedup:

    Split virtqueue_kick to be able to do the actual notification
    outside the lock protecting the virtqueue. This patch was
    originally done by Stefan Hajnoczi, but I can't find the
    original one anymore and had to recreated it from memory.
    Pointers to the original or corrections for the commit message
    are welcome.

    Stefan's patch was here:

    https://github.com/stefanha/linux/commit/a6d06644e3a58e57a774e77d7dc34c4a5a2e7496
    http://www.spinics.net/lists/linux-virtualization/msg14616.html

    Third time's the charm!

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Remove wrapper functions. This makes the allocation type explicit in
    all callers; I used GPF_KERNEL where it seemed obvious, left it at
    GFP_ATOMIC otherwise.

    Signed-off-by: Rusty Russell
    Reviewed-by: Christoph Hellwig

    Rusty Russell
     
  • The old documentation is left over from when we used a structure with
    strategy pointers.

    And move the documentation to the C file as per kernel practice.
    Though I disagree...

    Signed-off-by: Rusty Russell
    Reviewed-by: Christoph Hellwig

    Rusty Russell