03 May, 2017

3 commits


11 Apr, 2017

6 commits

  • virtio-pci registers a per-vq affinity hint when using MSIX,
    but fails to remove it when freeing the interrupt, resulting
    in this type of splat:

    [ 31.111202] WARNING: CPU: 0 PID: 2823 at kernel/irq/manage.c:1503 __free_irq+0x2c4/0x2c8
    [ 31.114689] Modules linked in:
    [ 31.116101] CPU: 0 PID: 2823 Comm: kexec Not tainted 4.10.0+ #6941
    [ 31.118911] Hardware name: Generic DT based system
    [ 31.121319] [] (unwind_backtrace) from [] (show_stack+0x18/0x1c)
    [ 31.125017] [] (show_stack) from [] (dump_stack+0x84/0x98)
    [ 31.128427] [] (dump_stack) from [] (__warn+0xf4/0x10c)
    [ 31.131910] [] (__warn) from [] (warn_slowpath_null+0x28/0x30)
    [ 31.135543] [] (warn_slowpath_null) from [] (__free_irq+0x2c4/0x2c8)
    [ 31.139355] [] (__free_irq) from [] (free_irq+0x44/0x78)
    [ 31.142909] [] (free_irq) from [] (vp_del_vqs+0x68/0x1c0)
    [ 31.146299] [] (vp_del_vqs) from [] (pci_device_shutdown+0x3c/0x78)

    The obvious fix is to drop the affinity hint before freeing the
    interrupt.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Michael S. Tsirkin

    Marc Zyngier
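
The ordering issue described in this commit can be modeled in userspace. The sketch below is only an illustration: `struct fake_irq` and the `fake_*` helpers are hypothetical stand-ins, not the kernel's IRQ descriptor or API. It assumes the free path complains whenever a hint is still registered, which is what the splat above suggests.

```c
#include <stddef.h>

/* Hypothetical userspace model of an interrupt; not the kernel's struct irq_desc. */
struct fake_irq {
    const unsigned long *affinity_hint; /* non-NULL while a hint is registered */
    int requested;                      /* 1 while the interrupt is in use */
    int warnings;                       /* times the free path found a stale hint */
};

void fake_set_affinity_hint(struct fake_irq *irq, const unsigned long *mask)
{
    irq->affinity_hint = mask;
}

/* Models the free path: a leftover hint is counted as a warning, then cleared. */
void fake_free_irq(struct fake_irq *irq)
{
    if (irq->affinity_hint != NULL) {
        irq->warnings++;
        irq->affinity_hint = NULL;
    }
    irq->requested = 0;
}

/* Teardown in the order the commit describes: drop the hint, then free. */
void teardown_vq_irq(struct fake_irq *irq)
{
    fake_set_affinity_hint(irq, NULL);
    fake_free_irq(irq);
}
```

Calling `fake_free_irq()` directly on an interrupt that still carries a hint bumps `warnings`, while `teardown_vq_irq()` leaves it at zero.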
     
  • This reverts commit 5c34d002dcc7a6dd665a19d098b4f4cd5501ba1a.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why,
    but the revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    Unfortunately the idea does not work with threadirqs,
    as more than 32 queues can then map to a single interrupt.

    Further, the cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why,
    but the revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 53a020c661741f3b87ad3ac6fa545088aaebac9b.

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why,
    but the revert helps.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 52a61516125fa9a21b3bdf4f90928308e2e5573f.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why,
    but the revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit de85ec8b07f82c8c84de7687f769e74bf4c26a1e.

    Follow-up patches will revert 07ec51480b5e ("virtio_pci: use shared
    interrupts for virtqueues") that triggered the problem so no need for
    this one anymore.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

07 Apr, 2017

1 commit


29 Mar, 2017

4 commits

  • The latest gcc-7.0.1 snapshot reports a new warning:

    virtio/virtio_balloon.c: In function 'update_balloon_stats':
    virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]

    This seems absolutely right, so we should add an extra check to
    prevent copying uninitialized stack data into the statistics.
    From all I can tell, this has been broken since the statistics code
    was originally added in 2.6.34.

    Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Arnd Bergmann
     
  • The virtio balloon driver contained a not-so-obvious invariant that
    update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
    in order to send valid stats to the host. This commit fixes it by having
    update_balloon_stats return the actual number of counters, and its
    callers use it when pushing buffers to the stats virtqueue.

    Note that it is still out of spec to change the number of counters
    at run-time. "Driver MUST supply the same subset of statistics in all
    buffers submitted to the statsq."

    Suggested-by: Arnd Bergmann
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Ladi Prosek
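
A minimal sketch of the idea in this commit, assuming a simplified stats layout: the fill routine returns how many counters it actually wrote, and the caller sizes the buffer from that count instead of from a fixed maximum. The tags, `struct mem_sample`, and all function names here are hypothetical and do not match the driver's own definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags and capacity; the driver's own constants differ. */
enum { TAG_SWAP_IN, TAG_SWAP_OUT, TAG_MEM_FREE, TAG_MEM_TOTAL, TAG_MAX };

struct stat_entry {
    uint16_t tag;
    uint64_t val;
};

struct mem_sample {
    int have_vm_events;          /* zero when swap counters are unavailable */
    uint64_t swap_in, swap_out;
    uint64_t mem_free, mem_total;
};

/* Stores one counter at position idx and returns the next free position. */
static size_t put_stat(struct stat_entry *stats, size_t idx,
                       uint16_t tag, uint64_t val)
{
    stats[idx].tag = tag;
    stats[idx].val = val;
    return idx + 1;
}

/* Writes only the counters that are available and returns how many. */
size_t update_stats(struct stat_entry stats[TAG_MAX], const struct mem_sample *s)
{
    size_t idx = 0;

    if (s->have_vm_events) {
        idx = put_stat(stats, idx, TAG_SWAP_IN, s->swap_in);
        idx = put_stat(stats, idx, TAG_SWAP_OUT, s->swap_out);
    }
    idx = put_stat(stats, idx, TAG_MEM_FREE, s->mem_free);
    idx = put_stat(stats, idx, TAG_MEM_TOTAL, s->mem_total);
    return idx;
}

/* Buffer length to hand to the queue: only the entries actually filled. */
size_t stats_buffer_len(size_t num_stats)
{
    return num_stats * sizeof(struct stat_entry);
}
```

In this model the count depends on a build-time or boot-time property (`have_vm_events`), so it stays constant across calls, in line with the spec requirement quoted in the commit message.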
     
  • When init_vqs runs, virtio_balloon.stats is either uninitialized or
    contains stale values. The host updates its state with garbage data
    because it has no way of knowing that this is just a marker buffer
    used for signaling.

    This patch updates the stats before pushing the initial buffer.

    Alternative fixes:
    * Push an empty buffer in init_vqs. Not easily done with the current
    virtio implementation and violates the spec "Driver MUST supply the
    same subset of statistics in all buffers submitted to the statsq".
    * Push a buffer with invalid tags in init_vqs. Violates the same
    spec clause, plus "invalid tag" is not really defined.

    Note: the spec says:
    When using the legacy interface, the device SHOULD ignore all values in
    the first buffer in the statsq supplied by the driver after device
    initialization. Note: Historically, drivers supplied an uninitialized
    buffer in the first buffer.

    Unfortunately QEMU does not seem to implement the recommendation
    even for the legacy interface.

    Cc: stable@vger.kernel.org
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Ladi Prosek
     
  • Fedora has received multiple reports of crashes when running
    4.11 as a guest

    https://bugzilla.redhat.com/show_bug.cgi?id=1430297
    https://bugzilla.redhat.com/show_bug.cgi?id=1434462
    https://bugzilla.kernel.org/show_bug.cgi?id=194911
    https://bugzilla.redhat.com/show_bug.cgi?id=1433899

    The crashes are not always consistent but they are generally
    some flavor of oops or GPF in virtio related code. Multiple people
    have done bisections (Thank you Thorsten Leemhuis and
    Richard W.M. Jones) and found this commit to be at fault

    07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
    commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
    Author: Christoph Hellwig
    Date: Sun Feb 5 18:15:19 2017 +0100

    virtio_pci: use shared interrupts for virtqueues

    The issue seems to be an out of bounds access to the msix_names
    array corrupting kernel memory.

    Fixes: 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues")
    Reported-by: Laura Abbott
    Signed-off-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Christoph Hellwig
    Tested-by: Richard W.M. Jones
    Tested-by: Thorsten Leemhuis

    Jason Wang
     

04 Mar, 2017

1 commit

  • Pull sched.h split-up from Ingo Molnar:
    "The point of these changes is to significantly reduce the
    header footprint, to speed up the kernel build and to
    have a cleaner header structure.

    After these changes the new sched.h header's typical preprocessed
    size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
    lines), which is around 40% faster to build on typical configs.

    Not much changed from the last version (-v2) posted three weeks ago: I
    eliminated quirks, backmerged fixes plus I rebased it to an upstream
    SHA1 from yesterday that includes most changes queued up in -next plus
    all sched.h changes that were pending from Andrew.

    I've re-tested the series both on x86 and on cross-arch defconfigs,
    and did a bisectability test at a number of random points.

    I tried to test as many build configurations as possible, but some
    build breakage is probably still left - but it should be mostly
    limited to architectures that have no cross-compiler binaries
    available on kernel.org, and non-default configurations"

    * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
    sched/headers: Clean up
    sched/headers: Remove #ifdefs from
    sched/headers: Remove the include from
    sched/headers, hrtimer: Remove the include from
    sched/headers, x86/apic: Remove the header inclusion from
    sched/headers, timers: Remove the include from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/core: Remove unused prefetch_stack()
    sched/headers: Remove from
    sched/headers: Remove the 'init_pid_ns' prototype from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove the runqueue_is_locked() prototype
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove the include from
    sched/headers: Remove from
    ...

    Linus Torvalds
     

03 Mar, 2017

1 commit

  • Pull vhost updates from Michael Tsirkin:
    "virtio, vhost: optimizations, fixes

    Looks like a quiet cycle for vhost/virtio, just a couple of minor
    tweaks. Most notable is automatic interrupt affinity for blk and scsi.
    Hopefully other devices are not far behind"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    virtio-console: avoid DMA from stack
    vhost: introduce O(1) vq metadata cache
    virtio_scsi: use virtio IRQ affinity
    virtio_blk: use virtio IRQ affinity
    blk-mq: provide a default queue mapping for virtio device
    virtio: provide a method to get the IRQ affinity mask for a virtqueue
    virtio: allow drivers to request IRQ affinity when creating VQs
    virtio_pci: simplify MSI-X setup
    virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
    virtio_pci: use shared interrupts for virtqueues
    virtio_pci: remove struct virtio_pci_vq_info
    vhost: try avoiding avail index access when getting descriptor
    virtio_mmio: expose header to userspace

    Linus Torvalds
     

02 Mar, 2017

1 commit


28 Feb, 2017

6 commits

  • This passes the pci_irq_get_affinity information up through virtio
    via an optional get_vq_affinity method. It is only implemented
    by the PCI backend for now, and only when we use per-virtqueue IRQs.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
     
  • Add a struct irq_affinity pointer to the find_vqs methods, which if set
    is used to tell the PCI layer to create the MSI-X vectors for our I/O
    virtqueues with the proper affinity from the start. Compared to
    after-the-fact affinity hints, this gives us an instantly working setup
    and allows the IRQ descriptors to be allocated node-local, avoiding
    interconnect traffic. Last but not least, this will allow blk-mq queues
    to be created based on the interrupt affinity for storage drivers.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
     
  • Try to grab the MSI-X vectors early and fall back to the shared one
    before doing lots of allocations.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
     
  • This lets the IRQ layer handle dispatching IRQs to separate handlers for
    the case where we don't have per-VQ MSI-X vectors, and allows us to greatly
    simplify the code based on the assumption that we always have interrupt
    vector 0 (legacy INTx or config interrupt for MSI-X) available, and
    any other interrupt is requested/freed through the VQ, even if the
    actual interrupt line might be shared in some cases.

    This allows removing a great deal of variables keeping track of the
    interrupt state in struct virtio_pci_device, as we can now simply walk the
    list of VQs and deal with per-VQ interrupt handlers there, and only treat
    vector 0 special.

    Additionally clean up the VQ allocation code to properly unwind on error
    instead of having a single global cleanup label, which is error prone,
    and in this case also leads to more code.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
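
The last paragraph of this commit contrasts stepwise unwinding with a single global cleanup label. A generic sketch of the stepwise pattern follows; the three resources and the `fail_at` knob are hypothetical and exist only to make the control flow observable.

```c
#include <stddef.h>

/* Hypothetical resources; 1 means acquired, 0 means released. */
struct res_state {
    int a, b, c;
};

static int acquire(int *slot, int fail)
{
    if (fail)
        return -1;
    *slot = 1;
    return 0;
}

static void release(int *slot)
{
    *slot = 0;
}

/*
 * Acquire a, b, c in order. On failure, jump to the label that releases
 * exactly what was acquired so far, in reverse order.
 * fail_at selects which step fails (0 = none).
 */
int setup_all(struct res_state *s, int fail_at)
{
    int err;

    err = acquire(&s->a, fail_at == 1);
    if (err)
        goto out;
    err = acquire(&s->b, fail_at == 2);
    if (err)
        goto out_release_a;
    err = acquire(&s->c, fail_at == 3);
    if (err)
        goto out_release_b;
    return 0;

out_release_b:
    release(&s->b);
out_release_a:
    release(&s->a);
out:
    return err;
}
```

Each label undoes one step, so the error path never has to guess which resources are live, which is the property a single catch-all label lacks.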
     
  • We don't really need struct virtio_pci_vq_info, as most fields in there
    are redundant:

    - the vq backpointer is not strictly needed to start with
    - the entry in the vqs list is not needed - the generic virtqueue already
    has a list, we only need to check if it has a callback to get the same
    semantics
    - we can use a simple array to look up the MSI-X vec if needed.
    - that simple array now also doubles as a replacement for the
    per_vq_vectors flag

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
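
The last two list items can be pictured as follows. This is a sketch under assumptions, not the driver's code: `struct pci_dev_model`, `NO_VECTOR`, and both helpers are hypothetical names. The point is that a nullable lookup array carries the same information as a separate boolean flag.

```c
#include <stddef.h>

#define NO_VECTOR 0xffffu /* hypothetical sentinel for "no MSI-X vector" */

/* Hypothetical device model; only the fields needed for the lookup. */
struct pci_dev_model {
    const unsigned int *vector_map; /* NULL when vectors are not per-VQ */
    size_t nvqs;                    /* number of entries in vector_map */
};

/* The array's presence replaces a dedicated per-VQ-vectors flag. */
int has_per_vq_vectors(const struct pci_dev_model *dev)
{
    return dev->vector_map != NULL;
}

/* Looks up the vector for a virtqueue index, or NO_VECTOR if unavailable. */
unsigned int vq_to_vector(const struct pci_dev_model *dev, size_t index)
{
    if (!has_per_vq_vectors(dev) || index >= dev->nvqs)
        return NO_VECTOR;
    return dev->vector_map[index];
}
```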
     

27 Feb, 2017

1 commit


25 Feb, 2017

1 commit

  • With CONFIG_BALLOON_COMPACTION=y the kernel will mount balloon_mnt for
    balloon page migration when we probe a virtio_balloon device. However
    we do not unmount it when removing the device. Fix this.

    Fixes: b1123ea6d3b3 ("mm: balloon: use general non-lru movable page feature")
    Link: http://lkml.kernel.org/r/1486531318-35189-1-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Yisheng Xie
    Acked-by: Minchan Kim
    Cc: Rafael Aquini
    Cc: Konstantin Khlebnikov
    Cc: Gioh Kim
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Jason Wang
    Cc: Hanjun Guo
    Cc: Xishi Qiu
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yisheng Xie
     

08 Feb, 2017

1 commit


07 Feb, 2017

1 commit

  • For XDP we will need to reset the queues to allow for buffer headroom
    to be configured. In order to do this we need to essentially run the
    freeze()/restore() code path. Unfortunately, the locking requirements
    of the freeze/restore and reset paths differ, so we cannot simply
    reuse the code.

    This patch refactors the code path and adds a reset helper routine.

    Signed-off-by: John Fastabend
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    John Fastabend
     

04 Feb, 2017

1 commit

  • This reverts commit c7070619f3408d9a0dffbed9149e6f00479cf43b.

    This has been shown to regress on some ARM systems:

    by forcing on DMA API usage for ARM systems, we have inadvertently
    kicked open a hornets' nest in terms of cache-coherency. Namely that
    unless the virtio device is explicitly described as capable of coherent
    DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
    assume it is non-coherent. This turns out to cause a big problem for the
    likes of QEMU and kvmtool, which generate virtio-mmio devices in their
    guest DTs but neglect to add the often-overlooked "dma-coherent"
    property; as a result, we end up with the guest making non-cacheable
    accesses to the vring, the host doing so cacheably, both talking past
    each other and things going horribly wrong.

    We are working on a safer work-around.

    Fixes: c7070619f340 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
    Reported-by: Robin Murphy
    Cc:
    Signed-off-by: Will Deacon
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Marc Zyngier

    Michael S. Tsirkin
     

25 Jan, 2017

2 commits

  • Booting Linux on an ARM fastmodel containing an SMMU emulation results
    in an unexpected I/O page fault from the legacy virtio-blk PCI device:

    [ 1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
    [ 1.211800] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
    [ 1.211880] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
    [ 1.211959] arm-smmu-v3 2b400000.smmu: 0x00000008fa081002
    [ 1.212075] arm-smmu-v3 2b400000.smmu: 0x0000000000000000
    [ 1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
    [ 1.212234] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
    [ 1.212314] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
    [ 1.212394] arm-smmu-v3 2b400000.smmu: 0x00000008fa081000
    [ 1.212471] arm-smmu-v3 2b400000.smmu: 0x0000000000000000

    This is because the legacy virtio-blk device is behind an SMMU, so we
    have consequently swizzled its DMA ops and configured the SMMU to
    translate accesses. This then requires the vring code to use the DMA API
    to establish translations, otherwise all transactions will result in
    fatal faults and termination.

    Given that ARM-based systems only see an SMMU if one is really present
    (the topology is all described by firmware tables such as device-tree or
    IORT), then we can safely use the DMA API for all legacy virtio devices.
    Modern devices can advertise the presence of an IOMMU using the
    VIRTIO_F_IOMMU_PLATFORM feature flag.

    Cc: Andy Lutomirski
    Cc: Michael S. Tsirkin
    Cc:
    Fixes: 876945dbf649 ("arm64: Hook up IOMMU dma_ops")
    Signed-off-by: Will Deacon
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Marc Zyngier

    Will Deacon
     
  • Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
    inadvertently relying on the default 32-bit DMA mask, which leads to
    problems like rapidly exhausting SWIOTLB bounce buffers.

    Ensure that we set the appropriate 64-bit DMA mask whenever possible,
    with the coherent mask suitably limited for the legacy vring as per
    a0be1db4304f ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
    devices").

    Cc: Andy Lutomirski
    Cc: Michael S. Tsirkin
    Reported-by: Jean-Philippe Brucker
    Fixes: b42111382f0e ("virtio_mmio: Use the DMA API if enabled")
    Signed-off-by: Robin Murphy
    Signed-off-by: Michael S. Tsirkin

    Robin Murphy
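
The 44-bit limit referenced in this commit follows from simple arithmetic, assuming the legacy interface describes the ring location as a 32-bit page frame number with 4096-byte pages (32 + 12 = 44 address bits). The sketch below works through that arithmetic in userspace; the macro and function names are illustrative and not taken from the kernel.

```c
#include <stdint.h>

/* Assumed legacy layout: 32-bit page frame number, 4 KiB pages. */
#define LEGACY_QUEUE_PFN_BITS   32u
#define LEGACY_QUEUE_ADDR_SHIFT 12u

/* Mask with the low `bits` bits set; 64 or more yields all ones. */
uint64_t dma_bit_mask(unsigned int bits)
{
    return bits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* Widest address a legacy vring can be placed at: 32 + 12 = 44 bits. */
uint64_t legacy_vring_coherent_mask(void)
{
    return dma_bit_mask(LEGACY_QUEUE_PFN_BITS + LEGACY_QUEUE_ADDR_SHIFT);
}

/* Returns 1 if addr can be expressed through the legacy PFN field. */
int legacy_vring_addr_ok(uint64_t addr)
{
    return (addr & ~legacy_vring_coherent_mask()) == 0;
}
```

Streaming buffers are not subject to this limit, which is why the commit sets a full 64-bit mask where possible and restricts only the coherent mask used for the ring itself.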
     

16 Dec, 2016

7 commits


15 Dec, 2016

2 commits

  • # make C=2 CF="-D__CHECK_ENDIAN__" ./drivers/virtio/

    drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
    drivers/virtio/virtio_ring.c:423:19: expected unsigned int [unsigned] [assigned] i
    drivers/virtio/virtio_ring.c:423:19: got restricted __virtio16 [usertype] next
    drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
    drivers/virtio/virtio_ring.c:423:19: expected unsigned int [unsigned] [assigned] i
    drivers/virtio/virtio_ring.c:423:19: got restricted __virtio16 [usertype] next
    drivers/virtio/virtio_ring.c:423:19: warning: incorrect type in assignment (different base types)
    drivers/virtio/virtio_ring.c:423:19: expected unsigned int [unsigned] [assigned] i
    drivers/virtio/virtio_ring.c:423:19: got restricted __virtio16 [usertype] next
    drivers/virtio/virtio_ring.c:604:39: warning: incorrect type in initializer (different base types)
    drivers/virtio/virtio_ring.c:604:39: expected unsigned short [unsigned] [usertype] nextflag
    drivers/virtio/virtio_ring.c:604:39: got restricted __virtio16
    drivers/virtio/virtio_ring.c:612:33: warning: restricted __virtio16 degrades to integer

    Signed-off-by: Gonglei
    Signed-off-by: Michael S. Tsirkin

    Gonglei
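
The sparse warnings above all stem from mixing an endian-annotated 16-bit type with plain integers. As a rough userspace analogy (not the kernel's mechanism), wrapping the wire value in a struct makes the compiler reject direct assignment and forces every access through an explicit conversion. The `wire16` type and both helpers are hypothetical, and the sketch assumes a little-endian wire format.

```c
#include <stdint.h>

/* Hypothetical wrapper: a 16-bit value stored in little-endian byte order. */
typedef struct {
    uint8_t b[2];
} wire16;

/* Converts a host-order value into its little-endian wire form. */
wire16 host_to_wire16(uint16_t v)
{
    wire16 w = { { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) } };
    return w;
}

/* Converts a little-endian wire value back into host order. */
uint16_t wire16_to_host(wire16 w)
{
    return (uint16_t)(w.b[0] | (w.b[1] << 8));
}
```

Because `wire16` is a struct, an assignment such as `unsigned int i = desc_next;` fails to compile, and the conversion has to be spelled out as `wire16_to_host(desc_next)`.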
     
  • drivers/virtio/virtio_pci_modern.c:66:40: warning: incorrect type in argument 2 (different base types)
    drivers/virtio/virtio_pci_modern.c:66:40: expected unsigned int [noderef] [usertype] *addr
    drivers/virtio/virtio_pci_modern.c:66:40: got restricted __le32 [noderef] [usertype] *lo
    drivers/virtio/virtio_pci_modern.c:67:33: warning: incorrect type in argument 2 (different base types)
    drivers/virtio/virtio_pci_modern.c:67:33: expected unsigned int [noderef] [usertype] *addr
    drivers/virtio/virtio_pci_modern.c:67:33: got restricted __le32 [noderef] [usertype] *hi
    drivers/virtio/virtio_pci_modern.c:150:32: warning: incorrect type in argument 2 (different base types)
    drivers/virtio/virtio_pci_modern.c:150:32: expected unsigned int [noderef] [usertype] *addr
    drivers/virtio/virtio_pci_modern.c:150:32: got restricted __le32 [noderef] *
    drivers/virtio/virtio_pci_modern.c:151:39: warning: incorrect type in argument 1 (different base types)
    drivers/virtio/virtio_pci_modern.c:151:39: expected unsigned int [noderef] [usertype] *addr
    drivers/virtio/virtio_pci_modern.c:151:39: got restricted __le32 [noderef] *
    drivers/virtio/virtio_pci_modern.c:152:32: warning: incorrect type in argument 2 (different base types)
    drivers/virtio/virtio_pci_modern.c:152:32: expected unsigned int [noderef] [usertype] *addr

    Signed-off-by: Gonglei
    Signed-off-by: Michael S. Tsirkin

    Gonglei
     

19 Nov, 2016

1 commit