13 Mar, 2014

1 commit

  • As result of deprecation of MSI-X/MSI enablement functions
    pci_enable_msix() and pci_enable_msi_block() all drivers
    using these two interfaces need to be updated to use the
    new pci_enable_msi_range() or pci_enable_msi_exact()
    and pci_enable_msix_range() or pci_enable_msix_exact()
    interfaces.

    Signed-off-by: Alexander Gordeev
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Alexander Gordeev
     

04 Dec, 2013

1 commit


29 Oct, 2013

1 commit

  • Currently a host kick error is silently ignored and not reflected in
    the virtqueue of a particular virtio device.

    Changing the notify API for guest->host notification seems to be one
    prerequisite in order to be able to handle such errors in the context
    where the kick is triggered.

    This patch changes the notify API. The notify function must return a
    bool return value. It returns false if the host notification failed.

    Signed-off-by: Heinz Graalfs
    Signed-off-by: Rusty Russell

    Heinz Graalfs
     

09 Sep, 2013

1 commit

  • The virtio_pci_freeze/restore are defined under CONFIG_PM but is used
    by SET_SYSTEM_SLEEP_PM_OPS macro, which is defined under
    CONFIG_PM_SLEEP. So if CONFIG_PM_SLEEP is not cofigured but
    CONFIG_PM_RUNTIME is, the following warning message appeared:

    drivers/virtio/virtio_pci.c:770:12: warning: ‘virtio_pci_freeze’ defined but not used [-Wunused-function]
    static int virtio_pci_freeze(struct device *dev)
    ^
    drivers/virtio/virtio_pci.c:790:12: warning: ‘virtio_pci_restore’ defined but not used [-Wunused-function]
    static int virtio_pci_restore(struct device *dev)
    ^
    Fix it by changing CONFIG_PM to CONFIG_PM_SLEEP.

    Signed-off-by: Aaron Lu
    Reviewed-by: Amit Shah
    Signed-off-by: Rusty Russell

    Aaron Lu
     

02 Jul, 2013

1 commit

  • vp_dev->msix_vectors should be initialized before allocating
    msix_affinity_masks, otherwise vp_free_vectors will not free these
    objects.

    unreferenced object 0xffff88010f969d88 (size 512):
    comm "systemd-udevd", pid 158, jiffies 4294673645 (age 80.545s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc+0x5e/0xc0
    [] kmem_cache_alloc_node_trace+0x141/0x2c0
    [] alloc_cpumask_var_node+0x23/0x80
    [] alloc_cpumask_var+0xe/0x10
    [] vp_try_to_find_vqs+0x25d/0x810
    [] vp_find_vqs+0x81/0xb0
    [] init_vqs+0x85/0x120 [virtio_balloon]
    [] virtballoon_probe+0xf9/0x1a0 [virtio_balloon]
    [] virtio_dev_probe+0xde/0x140
    [] driver_probe_device+0x98/0x3a0
    [] __driver_attach+0xab/0xb0
    [] bus_for_each_dev+0x94/0xb0
    [] driver_attach+0x1e/0x20
    [] bus_add_driver+0x200/0x280
    [] driver_register+0x74/0x160
    [] register_virtio_driver+0x20/0x40

    v2: change msix_vectors uncoditionaly in vp_free_vectors

    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Jason Wang
    Signed-off-by: Andrew Vagin
    Acked-by: Michael S. Tsirkin
    Acked-by: Jason Wang
    Signed-off-by: Rusty Russell

    Andrew Vagin
     

11 Feb, 2013

2 commits


04 Jan, 2013

1 commit

  • CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
    markings need to be removed.

    This change removes the use of __devinit, __devexit_p, and __devexit
    from these drivers.

    Based on patches originally written by Bill Pemberton, but redone by me
    in order to handle some of the coding style issues better, by hand.

    Cc: Bill Pemberton
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

18 Dec, 2012

2 commits


28 Sep, 2012

4 commits

  • Convert a nonnegative error return code to a negative one, as returned
    elsewhere in the function.

    A simplified version of the semantic match that finds this problem is as
    follows: (http://coccinelle.lip6.fr/)

    //
    (
    if@p1 (\(ret < 0\|ret != 0\))
    { ... return ret; }
    |
    ret@p1 = 0
    )
    ... when != ret = e1
    when != &ret
    *if(...)
    {
    ... when != ret = e2
    when forall
    return ret;
    }
    //

    Signed-off-by: Peter Senna Tschudin
    Signed-off-by: Rusty Russell

    Peter Senna Tschudin
     
  • virtio network device multiqueue support reserves
    vq 3 for future use (useful both for future extensions and to make it
    pretty - this way receive vqs have even and transmit - odd numbers).
    Make it possible to skip initialization for
    specific vq numbers by specifying NULL for name.
    Document this usage as well as (existing) NULL callback.

    Drivers using this not coded up yet, so I simply tested
    with virtio-pci and verified that this patch does
    not break existing drivers.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • Sometimes, virtio device need to configure irq affinity hint to maximize the
    performance. Instead of just exposing the irq of a virtqueue, this patch
    introduce an API to set the affinity for a virtqueue.

    The api is best-effort, the affinity hint may not be set as expected due to
    platform support, irq sharing or irq type. Currently, only pci method were
    implemented and we set the affinity according to:

    - if device uses INTX, we just ignore the request
    - if device has per vq vector, we force the affinity hint
    - if the virtqueues share MSI, make the affinity OR over all affinities
    requested

    Signed-off-by: Jason Wang
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Rusty Russell

    Jason Wang
     
  • Instead of storing the queue index in transport-specific virtio structs,
    this patch moves them to vring_virtqueue and introduces an helper to get
    the value. This lets drivers simplify their management and tracing of
    virtqueues.

    Signed-off-by: Jason Wang
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Rusty Russell

    Jason Wang
     

31 Mar, 2012

4 commits


12 Jan, 2012

3 commits

  • Handle thaw, restore and freeze notifications from the PM core. Expose
    these to individual virtio drivers that can quiesce and resume vq
    operations. For drivers not implementing the thaw() method, use the
    restore method instead.

    These functions also save device-specific data so that the device can be
    put in pre-suspend state after resume, and disable and enable the PCI
    device in the freeze and resume functions, respectively.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • The older PM API doesn't have a way to get notifications on hibernate
    events. Switch to the newer one that gives us those notifications.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     
  • We were cheating with our barriers; using the smp ones rather than the
    real device ones. That was fine, until rpmsg came along, which is
    used to talk to a real device (a non-SMP CPU).

    Unfortunately, just putting back the real barriers (reverting
    d57ed95d) causes a performance regression on virtio-pci. In
    particular, Amos reports netbench's TCP_RR over virtio_net CPU
    utilization increased up to 35% while throughput went down by up to
    14%.

    By comparison, this branch is in the noise.

    Reference: https://lkml.org/lkml/2011/12/11/22

    Signed-off-by: Rusty Russell

    Rusty Russell
     

03 Dec, 2011

1 commit


24 Nov, 2011

1 commit

  • virtio pci device reset actually just does an I/O
    write, which in PCI is really posted, that is it
    can complete on CPU before the device has received it.

    Further, interrupts might have been pending on
    another CPU, so device callback might get invoked after reset.

    This conflicts with how drivers use reset, which is typically:
    reset
    unregister
    a callback running after reset completed can race with
    unregister, potentially leading to use after free bugs.

    Fix by flushing out the write, and flushing pending interrupts.

    This assumes that device is never reset from
    its vq/config callbacks, or in parallel with being
    added/removed, document this assumption.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     

22 Nov, 2011

1 commit


17 Nov, 2011

1 commit


14 Nov, 2011

1 commit

  • Commit 31a3ddda166cda86d2b5111e09ba4bda5239fae6 introduced
    a use after free in virtio-pci. The main issue is
    that the release method signals removal of the virtio device,
    while remove signals removal of the pci device.

    For example, on driver removal or hot-unplug,
    virtio_pci_release_dev is called before virtio_pci_remove.
    We then might get a crash as virtio_pci_remove tries to use the
    device freed by virtio_pci_release_dev.

    We allocate/free all resources together with the
    pci device, so we can leave the release method empty.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Amit Shah
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Michael S. Tsirkin
     

02 Nov, 2011

1 commit

  • For the MSI but non-per_vq_vector case, the config/change vq
    also gets added to the list of vqs that need to process the
    MSI interrupt. This is not needed as config has it's own
    handler (vp_config_changed). In any case, vring_interrupt()
    finds nothing needs to be done on this vq.

    I tested this patch by testing the "Fallback:" and "Finally
    fall back" cases in vp_find_vqs(). Please review.

    Signed-off-by: Krishna Kumar
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Krishna Kumar
     

21 Apr, 2011

1 commit

  • In the case where a virtio-console port is in use (opened by a program)
    and a virtio-console device is removed, the port is kept around but all
    the virtio-related state is assumed to be gone.

    When the port is finally released (close() called), we call
    device_destroy() on the port's device. This results in the parent
    device's structures to be freed as well. This includes the PCI regions
    for the virtio-console PCI device.

    Once this is done, however, virtio_pci_release_dev() kicks in, as the
    last ref to the virtio device is now gone, and attempts to do

    pci_iounmap(pci_dev, vp_dev->ioaddr);
    pci_release_regions(pci_dev);
    pci_disable_device(pci_dev);

    which results in a double-free warning.

    Move the code that releases regions, etc., to the virtio_pci_remove()
    function, and all that's now left in release_dev is the final freeing of
    the vp_dev.

    Signed-off-by: Amit Shah
    Signed-off-by: Rusty Russell

    Amit Shah
     

20 Jan, 2011

1 commit

  • We sometimes need to map between the virtio device and
    the given pci device. One such use is OS installer that
    gets the boot pci device from BIOS and needs to
    find the relevant block device. Since it can't,
    installation fails.

    Instead of creating a top-level devices/virtio-pci
    directory, create each device under the corresponding
    pci device node. Symlinks to all virtio-pci
    devices can be found under the pci driver link in
    bus/pci/drivers/virtio-pci/devices, and all virtio
    devices under drivers/bus/virtio/devices.

    Signed-off-by: Milton Miller
    Signed-off-by: Rusty Russell
    Acked-by: Michael S. Tsirkin
    Tested-by: Michael S. Tsirkin
    Acked-by: Gleb Natapov
    Tested-by: "Daniel P. Berrange"
    Cc: stable@kernel.org

    Milton Miller
     

23 Jun, 2010

1 commit

  • virtio-pci resets the device at startup by writing to the status
    register, but this does not clear the pci config space,
    specifically msi enable status which affects register
    layout.

    This breaks things like kdump when they try to use e.g. virtio-blk.

    Fix by forcing msi off at startup. Since pci.c already has
    a routine to do this, we export and use it instead of duplicating code.

    Signed-off-by: Michael S. Tsirkin
    Tested-by: Vivek Goyal
    Acked-by: Jesse Barnes
    Cc: linux-pci@vger.kernel.org
    Signed-off-by: Rusty Russell
    Cc: stable@kernel.org

    Michael S. Tsirkin
     

23 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

16 Mar, 2010

1 commit


02 Mar, 2010

1 commit


01 Mar, 2010

1 commit

  • I have observed the following error on virtio-net module unload:

    ------------[ cut here ]------------
    WARNING: at kernel/irq/manage.c:858 __free_irq+0xa0/0x14c()
    Hardware name: Bochs
    Trying to free already-free IRQ 0
    Modules linked in: virtio_net(-) virtio_blk virtio_pci virtio_ring
    virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last
    unloaded: scsi_wait_scan]
    Pid: 1957, comm: rmmod Not tainted 2.6.33-rc8-vhost #24
    Call Trace:
    [] warn_slowpath_common+0x7c/0x94
    [] warn_slowpath_fmt+0x41/0x43
    [] ? __free_pages+0x5a/0x70
    [] __free_irq+0xa0/0x14c
    [] free_irq+0x3f/0x65
    [] vp_del_vqs+0x81/0xb1 [virtio_pci]
    [] virtnet_remove+0xda/0x10b [virtio_net]
    [] virtio_dev_remove+0x22/0x4a [virtio]
    [] __device_release_driver+0x66/0xac
    [] driver_detach+0x83/0xa9
    [] bus_remove_driver+0x91/0xb4
    [] driver_unregister+0x6c/0x74
    [] unregister_virtio_driver+0xe/0x10 [virtio]
    [] fini+0x15/0x17 [virtio_net]
    [] sys_delete_module+0x1c3/0x230
    [] ? old_ich_force_enable_hpet+0x117/0x164
    [] ? do_page_fault+0x29c/0x2cc
    [] sysenter_dispatch+0x7/0x27
    ---[ end trace 15e88e4c576cc62b ]---

    The bug is in virtio-pci: we use msix_vector as array index to get irq
    entry, but some vqs do not have a dedicated vector so this causes an out
    of bounds access. By chance, we seem to often get 0 value, which
    results in this error.

    Fix by verifying that vector is legal before using it as index.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Anthony Liguori
    Acked-by: Shirley Ma
    Acked-by: Amit Shah

    Michael S. Tsirkin
     

24 Feb, 2010

1 commit


29 Oct, 2009

1 commit

  • Commit f68d24082e22ccee3077d11aeb6dc5354f0ca7f1
    in 2.6.32-rc1 broke requesting IRQs for per-VQ MSI-X vectors:
    - vector number was used instead of the vector itself
    - we try to request an IRQ for VQ which does not
    have a callback handler

    This is a regression that causes warnings in kernel log,
    potentially lower performance as we need to scan vq list,
    and might cause system failure if the interrupt
    requested is in fact needed by another system.

    This was not noticed earlier because in most cases
    we were falling back on shared interrupt for all vqs.

    The warnings often look like this:

    virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 28 for MSI/MSI-X
    IRQ handler type mismatch for IRQ 1
    current handler: i8042
    Pid: 2400, comm: modprobe Tainted: G W
    2.6.32-rc3-11952-gf3ed8d8-dirty #1
    Call Trace:
    [] ? __setup_irq+0x299/0x304
    [] ? request_threaded_irq+0x144/0x1c1
    [] ? vring_interrupt+0x0/0x30
    [] ? vp_try_to_find_vqs+0x583/0x5c7
    [] ? skb_recv_done+0x0/0x34 [virtio_net]
    [] ? vp_find_vqs+0x2d/0x83
    [] ? vp_get+0x3c/0x4e
    [] ? virtnet_probe+0x2f1/0x428 [virtio_net]
    [] ? skb_recv_done+0x0/0x34 [virtio_net]
    [] ? skb_xmit_done+0x0/0x39 [virtio_net]
    [] ? sysfs_do_create_link+0xcb/0x116
    [] ? vp_get_status+0x14/0x16
    [] ? virtio_dev_probe+0xa9/0xc8
    [] ? driver_probe_device+0x8d/0x128
    [] ? __driver_attach+0x4f/0x6f
    [] ? __driver_attach+0x0/0x6f
    [] ? bus_for_each_dev+0x43/0x74
    [] ? bus_add_driver+0xea/0x22d
    [] ? driver_register+0xa7/0x111
    [] ? init+0x0/0xc [virtio_net]
    [] ? do_one_initcall+0x50/0x148
    [] ? sys_init_module+0xc5/0x21a
    [] ? system_call_fastpath+0x16/0x1b
    virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
    virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X

    Reported-by: Marcelo Tosatti
    Reported-by: Shirley Ma
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     

23 Sep, 2009

1 commit

  • 1) Rename vp_request_vectors to vp_request_msix_vectors, and take
    non-MSI-X case out to caller.
    2) Comment weird pci_enable_msix API
    3) Rename vp_find_vq to setup_vq.
    4) Fix spaces to tabs
    5) Make nvectors calc internal to vp_try_to_find_vqs()
    6) Rename vector to msix_vector for more clarity.

    Signed-off-by: Rusty Russell
    Cc: "Michael S. Tsirkin"

    Rusty Russell
     

30 Jul, 2009

2 commits

  • This refactors find_vqs, making it more readable and robust, and fixing
    two regressions from 2.6.30:
    - double free_irq causing BUG_ON on device removal
    - probe failure when vq can't be assigned to msi-x vector
    (reported on old host kernels)

    Tested-by: Amit Shah
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin
     
  • This makes delete vq the reverse of find vq.
    This is required to make it possible to retry find_vqs
    after a failure, otherwise the list gets corrupted.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Rusty Russell

    Michael S. Tsirkin