21 Apr, 2011

2 commits

  • In the case where a virtio-console port is in use (opened by a program)
    and a virtio-console device is removed, the port is kept around but all
    the virtio-related state is assumed to be gone.

    When the port is finally released (close() called), we call
    device_destroy() on the port's device. This results in the parent
    device's structures to be freed as well. This includes the PCI regions
    for the virtio-console PCI device.

    Once this is done, however, virtio_pci_release_dev() kicks in, as the
    last ref to the virtio device is now gone, and attempts to do

    pci_iounmap(pci_dev, vp_dev->ioaddr);
    pci_release_regions(pci_dev);
    pci_disable_device(pci_dev);

    which results in a double-free warning.

    Move the code that releases regions, etc., to the virtio_pci_remove()
    function, and all that's now left in release_dev is the final freeing of
    the vp_dev.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • When detaching a buffer from a vq, the avail.idx value should be
    decremented as well.

    This was noticed by hot-unplugging a virtio console port and then
    plugging in a new one on the same number (re-using the vqs which were
    just 'disowned'). qemu reported

    'Guest moved used index from 0 to 256'

    when any IO was attempted on the new port.

    CC: stable@kernel.org
    Reported-by: juzhang
    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     

20 Jan, 2011

1 commit

  • We sometimes need to map between the virtio device and
    the given pci device. One such use is OS installer that
    gets the boot pci device from BIOS and needs to
    find the relevant block device. Since it can't,
    installation fails.

    Instead of creating a top-level devices/virtio-pci
    directory, create each device under the corresponding
    pci device node. Symlinks to all virtio-pci
    devices can be found under the pci driver link in
    bus/pci/drivers/virtio-pci/devices, and all virtio
    devices under drivers/bus/virtio/devices.

    Signed-off-by: Milton Miller
    Signed-off-by: Rusty Russell
    Acked-by: Michael S. Tsirkin
    Tested-by: Michael S. Tsirkin
    Acked-by: Gleb Natapov
    Tested-by: "Daniel P. Berrange"
    Cc: stable@kernel.org

    Milton Miller
     

24 Nov, 2010

2 commits

  • The sysfs files for virtio produce the wrong format and are missing
    the required newline. The output for virtio bus vendor/device should
    have the same format as the corresponding entries for PCI devices.

    Although this technically changes the ABI for sysfs, these files were
    broken to start with!

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Rusty Russell

    Stephen Hemminger
     
  • We can't rely on indirect buffers for capacity
    calculations because they need a memory allocation
    which might fail. In particular, virtio_net can get
    into this situation under stress, and it drops packets
    and performs badly.

    So return the number of buffers we can guarantee users.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell
    Reported-By: Krishna Kumar2

    Michael S. Tsirkin
     

26 Jul, 2010

1 commit

  • virtio ring was changed to return an error code on OOM,
    but one caller was missed and still checks for vq->vring.num.
    The fix is just to check for
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell
    Tested-by: Chris Mason
    Cc: stable@kernel.org # .34.x
    Signed-off-by: Linus Torvalds

    Michael S. Tsirkin
     

23 Jun, 2010

2 commits

  • virtio-pci resets the device at startup by writing to the status
    register, but this does not clear the pci config space,
    specifically msi enable status which affects register
    layout.

    This breaks things like kdump when they try to use e.g. virtio-blk.

    Fix by forcing msi off at startup. Since pci.c already has
    a routine to do this, we export and use it instead of duplicating code.

    Signed-off-by: Michael S. Tsirkin
    Tested-by: Vivek Goyal
    Acked-by: Jesse Barnes
    Cc: linux-pci@vger.kernel.org
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Michael S. Tsirkin
     
  • add_buf returns ring size on out of memory,
    this is not what devices expect.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Amit Shah
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org # .34.x

    Michael S. Tsirkin
     

22 May, 2010

1 commit

  • * 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
    drivers/char: Eliminate use after free
    virtio: console: Accept console size along with resize control message
    virtio: console: Store each console's size in the console structure
    virtio: console: Resize console port 0 on config intr only if multiport is off
    virtio: console: Add support for nonblocking write()s
    virtio: console: Rename wait_is_over() to will_read_block()
    virtio: console: Don't always create a port 0 if using multiport
    virtio: console: Use a control message to add ports
    virtio: console: Move code around for future patches
    virtio: console: Remove config work handler
    virtio: console: Don't call hvc_remove() on unplugging console ports
    virtio: console: Return -EPIPE to hvc_console if we lost the connection
    virtio: console: Let host know of port or device add failures
    virtio: console: Add a __send_control_msg() that can send messages without a valid port
    virtio: Revert "virtio: disable multiport console support."
    virtio: add_buf_gfp
    trans_virtio: use virtqueue_xxx wrappers
    virtio-rng: use virtqueue_xxx wrappers
    virtio_ring: remove a level of indirection
    virtio_net: use virtqueue_xxx wrappers
    ...

    Fix up conflicts in drivers/net/virtio_net.c due to new virtqueue_xxx
    wrappers changes conflicting with some other cleanups.

    Linus Torvalds
     

19 May, 2010

3 commits


23 Apr, 2010

1 commit


22 Apr, 2010

1 commit

  • The virtio balloon driver can dig into the reservation pools of the OS
    to satisfy a balloon request. This is not advisable and other balloon
    drivers (drivers/xen/balloon.c) avoid this as well.

    The patch also adds changes to avoid printing a warning if allocation
    fails, since we retry after sometime anyway.

    Signed-off-by: Balbir Singh
    Signed-off-by: Rusty Russell
    Cc: kvm
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Balbir Singh
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

16 Mar, 2010

1 commit


02 Mar, 2010

1 commit


01 Mar, 2010

1 commit

  • I have observed the following error on virtio-net module unload:

    ------------[ cut here ]------------
    WARNING: at kernel/irq/manage.c:858 __free_irq+0xa0/0x14c()
    Hardware name: Bochs
    Trying to free already-free IRQ 0
    Modules linked in: virtio_net(-) virtio_blk virtio_pci virtio_ring
    virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last
    unloaded: scsi_wait_scan]
    Pid: 1957, comm: rmmod Not tainted 2.6.33-rc8-vhost #24
    Call Trace:
    [] warn_slowpath_common+0x7c/0x94
    [] warn_slowpath_fmt+0x41/0x43
    [] ? __free_pages+0x5a/0x70
    [] __free_irq+0xa0/0x14c
    [] free_irq+0x3f/0x65
    [] vp_del_vqs+0x81/0xb1 [virtio_pci]
    [] virtnet_remove+0xda/0x10b [virtio_net]
    [] virtio_dev_remove+0x22/0x4a [virtio]
    [] __device_release_driver+0x66/0xac
    [] driver_detach+0x83/0xa9
    [] bus_remove_driver+0x91/0xb4
    [] driver_unregister+0x6c/0x74
    [] unregister_virtio_driver+0xe/0x10 [virtio]
    [] fini+0x15/0x17 [virtio_net]
    [] sys_delete_module+0x1c3/0x230
    [] ? old_ich_force_enable_hpet+0x117/0x164
    [] ? do_page_fault+0x29c/0x2cc
    [] sysenter_dispatch+0x7/0x27
    ---[ end trace 15e88e4c576cc62b ]---

    The bug is in virtio-pci: we use msix_vector as array index to get irq
    entry, but some vqs do not have a dedicated vector so this causes an out
    of bounds access. By chance, we seem to often get 0 value, which
    results in this error.

    Fix by verifying that vector is legal before using it as index.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Anthony Liguori
    Acked-by: Shirley Ma
    Acked-by: Amit Shah

    Michael S. Tsirkin
     

24 Feb, 2010

8 commits

  • vq operations depend on vq->data[i] being NULL to figure out if the vq
    entry is in use (since the previous patch).

    We have to initialize them to NULL to ensure we don't work with junk
    data and trigger false BUG_ONs.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell
    Cc: Shirley Ma

    Amit Shah
     
  • There's currently no way for a virtio driver to ask for unused
    buffers, so it has to keep a list itself to reclaim them at shutdown.
    This is redundant, since virtio_ring stores that information. So
    add a new hook to do this.

    Signed-off-by: Shirley Ma
    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Shirley Ma
     
  • virtio is communicating with a virtual "device" that actually runs on
    another host processor. Thus SMP barriers can be used to control
    memory access ordering.

    Where possible, we should use SMP barriers which are more lightweight than
    mandatory barriers, because mandatory barriers also control MMIO effects on
    accesses through relaxed memory I/O windows (which virtio does not use)
    (compare specifically smp_rmb and rmb on x86_64).

    We can't just use smp_mb and friends though, because
    we must force memory ordering even if guest is UP since host could be
    running on another CPU, but SMP barriers are defined to barrier() in
    that configuration. So, for UP fall back to mandatory barriers instead.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • With DEBUG defined, we add an ->in_use flag to detect if the caller
    invokes two virtio methods in parallel. The barriers attempt to ensure
    timely update of the ->in_use flag.

    But they're voodoo: if we need these barriers it implies that the
    calling code doesn't have sufficient synchronization to ensure the
    code paths aren't invoked at the same time anyway, and we want to
    detect it.

    Also, adding barriers changes timing, so turning on debug has more
    chance of hiding real problems.

    Thanks to MST for drawing my attention to this code...

    CC: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • When running under qemu-kvm-0.11.0:

    BUG: unable to handle kernel paging request at 56e58955
    ...
    Process vballoon (pid: 1297, ti=c7976000 task=c70a6ca0 task.ti=c7
    ...
    Call Trace:
    [] ? balloon+0x1b3/0x440 [virtio_balloon]
    [] ? schedule+0x327/0x9d0
    [] ? balloon+0x0/0x440 [virtio_balloon]
    [] ? kthread+0x74/0x80
    [] ? kthread+0x0/0x80
    [] ? kernel_thread_helper+0x6/0x30

    need_stats_update should be zero-initialized.

    Signed-off-by: Rusty Russell
    Acked-by: Adam Litke

    Rusty Russell
     
  • This is a fix for my earlier patch: "virtio: Add memory statistics reporting to
    the balloon driver (V4)".

    I discovered that all_vm_events() can sleep and therefore stats collection
    cannot be done in interrupt context. One solution is to handle the interrupt
    by noting that stats need to be collected and waking the existing vballoon
    kthread which will complete the work via stats_handle_request(). Rusty, is
    this a saner way of doing business?

    There is one issue that I would like a broader opinion on. In stats_request, I
    update vb->need_stats_update and then wake up the kthread. The kthread uses
    vb->need_stats_update as a condition variable. Do I need a memory barrier
    between the update and wake_up to ensure that my kthread sees the correct
    value? My testing suggests that it is not needed but I would like some
    confirmation from the experts.

    Signed-off-by: Adam Litke
    To: Rusty Russell
    Cc: Anthony Liguori
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Rusty Russell

    Adam Litke
     
  • Changes since V3:
    - Do not do endian conversions as they will be done in the host
    - Report stats that reference a quantity of memory in bytes
    - Minor coding style updates

    Changes since V2:
    - Increase stat field size to 64 bits
    - Report all sizes in kb (not pages)
    - Drop anon_pages stat and fix endianness conversion

    Changes since V1:
    - Use a virtqueue instead of the device config space

    When using ballooning to manage overcommitted memory on a host, a system for
    guests to communicate their memory usage to the host can provide information
    that will minimize the impact of ballooning on the guests. The current method
    employs a daemon running in each guest that communicates memory statistics to a
    host daemon at a specified time interval. The host daemon aggregates this
    information and inflates and/or deflates balloons according to the level of
    host memory pressure. This approach is effective but overly complex since a
    daemon must be installed inside each guest and coordinated to communicate with
    the host. A simpler approach is to collect memory statistics in the virtio
    balloon driver and communicate them directly to the hypervisor.

    This patch enables the guest-side support by adding stats collection and
    reporting to the virtio balloon driver.

    Signed-off-by: Adam Litke
    Cc: Anthony Liguori
    Cc: virtualization@lists.linux-foundation.org
    Signed-off-by: Rusty Russell (minor fixes)

    Adam Litke
     
  • This is needed to compile with CONFIG_VIRTIO_PCI=y,
    because virtio_pci_remove is marked __devexit.

    Signed-off-by: Jamie Lokier
    Signed-off-by: Rusty Russell

    Jamie Lokier
     

17 Jan, 2010

1 commit

  • Fix fixes the following warnings by renaming the driver structures to be
    suffixed with _driver.

    WARNING: drivers/virtio/virtio_balloon.o(.data+0x88): Section mismatch in reference from the variable virtio_balloon to the function .devexit.text:virtballoon_remove()

    WARNING: drivers/char/hw_random/virtio-rng.o(.data+0x88): Section mismatch in reference from the variable virtio_rng to the function .devexit.text:virtrng_remove()

    Signed-off-by: Jeff Mahoney
    Acked-by: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     

29 Oct, 2009

2 commits

  • On SMP guests, reads from the ring might bypass used index reads. This
    causes guest crashes because host writes to used index to signal ring
    data readiness. Fix this by inserting rmb before used ring reads.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Michael S. Tsirkin
     
  • Commit f68d24082e22ccee3077d11aeb6dc5354f0ca7f1
    in 2.6.32-rc1 broke requesting IRQs for per-VQ MSI-X vectors:
    - vector number was used instead of the vector itself
    - we try to request an IRQ for VQ which does not
    have a callback handler

    This is a regression that causes warnings in kernel log,
    potentially lower performance as we need to scan vq list,
    and might cause system failure if the interrupt
    requested is in fact needed by another system.

    This was not noticed earlier because in most cases
    we were falling back on shared interrupt for all vqs.

    The warnings often look like this:

    virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 28 for MSI/MSI-X
    IRQ handler type mismatch for IRQ 1
    current handler: i8042
    Pid: 2400, comm: modprobe Tainted: G W
    2.6.32-rc3-11952-gf3ed8d8-dirty #1
    Call Trace:
    [] ? __setup_irq+0x299/0x304
    [] ? request_threaded_irq+0x144/0x1c1
    [] ? vring_interrupt+0x0/0x30
    [] ? vp_try_to_find_vqs+0x583/0x5c7
    [] ? skb_recv_done+0x0/0x34 [virtio_net]
    [] ? vp_find_vqs+0x2d/0x83
    [] ? vp_get+0x3c/0x4e
    [] ? virtnet_probe+0x2f1/0x428 [virtio_net]
    [] ? skb_recv_done+0x0/0x34 [virtio_net]
    [] ? skb_xmit_done+0x0/0x39 [virtio_net]
    [] ? sysfs_do_create_link+0xcb/0x116
    [] ? vp_get_status+0x14/0x16
    [] ? virtio_dev_probe+0xa9/0xc8
    [] ? driver_probe_device+0x8d/0x128
    [] ? __driver_attach+0x4f/0x6f
    [] ? __driver_attach+0x0/0x6f
    [] ? bus_for_each_dev+0x43/0x74
    [] ? bus_add_driver+0xea/0x22d
    [] ? driver_register+0xa7/0x111
    [] ? init+0x0/0xc [virtio_net]
    [] ? do_one_initcall+0x50/0x148
    [] ? sys_init_module+0xc5/0x21a
    [] ? system_call_fastpath+0x16/0x1b
    virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X

    Reported-by: Marcelo Tosatti
    Reported-by: Shirley Ma
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     

22 Oct, 2009

2 commits

  • The function virtballoon_remove is used only wrapped by __devexit_p so
    define it using __devexit.

    Signed-off-by: Uwe Kleine-König
    Acked-by: Sam Ravnborg
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Uwe Kleine-König
     
  • Rusty,

    commit 3ca4f5ca73057a617f9444a91022d7127041970a
    virtio: add virtio IDs file
    moved all device IDs into a single file. While the change itself is
    a very good one, it can break userspace applications. For example
    if a userspace tool wanted to get the ID of virtio_net it used to
    include virtio_net.h. This does no longer work, since virtio_net.h
    does not include virtio_ids.h.
    This patch moves all "#include " from the C
    files into the header files, making the header files compatible with
    the old ones.

    In addition, this patch exports virtio_ids.h to userspace.

    CC: Fernando Luis Vazquez Cao
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

    Christian Borntraeger
     

23 Sep, 2009

3 commits

  • Virtio IDs are spread all over the tree which makes assigning new IDs
    bothersome. Putting them together should make the process less error-prone.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Rusty Russell

    Fernando Luis Vazquez Cao
     
  • This API change means that virtio_net can tell how much capacity
    remains for buffers. It's necessarily fuzzy, since
    VIRTIO_RING_F_INDIRECT_DESC means we can fit any number of descriptors
    in one, *if* we can kmalloc.

    Signed-off-by: Rusty Russell
    Cc: Dinesh Subhraveti

    Rusty Russell
     
  • 1) Rename vp_request_vectors to vp_request_msix_vectors, and take
    non-MSI-X case out to caller.
    2) Comment weird pci_enable_msix API
    3) Rename vp_find_vq to setup_vq.
    4) Fix spaces to tabs
    5) Make nvectors calc internal to vp_try_to_find_vqs()
    6) Rename vector to msix_vector for more clarity.

    Signed-off-by: Rusty Russell
    Cc: "Michael S. Tsirkin"

    Rusty Russell
     

30 Jul, 2009

3 commits


17 Jul, 2009

1 commit


12 Jun, 2009

2 commits