08 Jan, 2015

1 commit

  • vfio-pci currently supports only normal PCI devices, so vfio_pci_probe()
    returns early if the device is not a normal device. The current check is
    wrong, however: PCI_HEADER_TYPE is the offset of the header type register
    in configuration space, but the code uses this value as a mask on the type.

    This patch fixes the check by testing pci_dev->hdr_type directly.

    Signed-off-by: Wei Yang
    Signed-off-by: Alex Williamson
    Cc: stable@vger.kernel.org # v3.6+

    Wei Yang
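
    The mix-up described in this commit can be reproduced outside the kernel.
    The sketch below is illustrative only: the constants mirror the kernel's
    pci_regs.h, but the two helper functions are hypothetical and are not the
    driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel's pci_regs.h. */
#define PCI_HEADER_TYPE         0x0e /* config-space OFFSET of the header type byte */
#define PCI_HEADER_TYPE_NORMAL  0
#define PCI_HEADER_TYPE_BRIDGE  1
#define PCI_HEADER_TYPE_CARDBUS 2

/* Hypothetical helper: misuses the register offset (0x0e) as a bit mask. */
static bool is_normal_buggy(uint8_t type)
{
    return (type & PCI_HEADER_TYPE) == PCI_HEADER_TYPE_NORMAL;
}

/* Hypothetical helper: compares the header type value directly. */
static bool is_normal_fixed(uint8_t hdr_type)
{
    return hdr_type == PCI_HEADER_TYPE_NORMAL;
}
```

    With the offset used as a mask, a bridge (type 1) has no bits in common
    with 0x0e and is therefore wrongly accepted as a normal device.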
     

08 Nov, 2014

1 commit


30 Sep, 2014

1 commit

  • Locking both the remove() and release() path results in a deadlock
    that should have been obvious. To fix this we can get and hold the
    vfio_device reference as we evaluate whether to do a bus/slot reset.
    This will automatically block any remove() calls, allowing us to
    remove the explicit lock. Fixes 61d792562b53.

    Signed-off-by: Alex Williamson
    Cc: stable@vger.kernel.org [3.17]

    Alex Williamson
     

09 Aug, 2014

1 commit

  • The existing vfio_pci_open() fails when vfio_spapr_pci_eeh_open()
    returns an error, which breaks support for POWER7's P5IOC2 PHB.
    This patch restores that support.

    The patch fixes the issue by ignoring the return value of
    vfio_spapr_pci_eeh_open().

    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Gavin Shan
    Signed-off-by: Alex Williamson

    Alexey Kardashevskiy
     

08 Aug, 2014

3 commits

  • Each time a device is released, mark whether a local reset was
    successful or whether a bus/slot reset is needed. If a reset is
    needed and all of the affected devices are bound to vfio-pci and
    unused, allow the reset. This is most useful when the userspace
    driver is killed and releases all the devices in an unclean state,
    such as when a QEMU VM quits.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • Serializing open/release allows us to fix a refcnt error if we fail
    to enable the device and lets us prevent devices from being unbound
    or opened, giving us an opportunity to do bus resets on release.
    Binding devices to vfio-pci while the mutex is held is not
    restricted, though.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • Our current open/release path looks like this:

    vfio_pci_open
      vfio_pci_enable
        pci_enable_device
        pci_save_state
        pci_store_saved_state

    vfio_pci_release
      vfio_pci_disable
        pci_disable_device
        pci_restore_state

    pci_enable_device() doesn't modify PCI_COMMAND_MASTER, so if a device
    comes to us with it enabled, it persists through the open and gets
    stored as part of the device saved state. We then restore that saved
    state when released, which can allow the device to attempt to continue
    to do DMA. When the group is disconnected from the domain, this will
    get caught by the IOMMU, but if there are other devices in the group,
    the device may continue running and interfere with the user. Even in
    the former case, IOMMUs don't necessarily behave well and a stream of
    blocked DMA can result in unpleasant behavior on the host.

    Explicitly disable Bus Master as we're enabling the device and
    slightly re-work release to make sure that pci_disable_device() is
    the last thing that touches the device.

    Signed-off-by: Alex Williamson

    Alex Williamson
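
    The effect of disabling Bus Master before the state is saved can be
    sketched on a plain command-register value. This is a minimal model, not
    the driver code: PCI_COMMAND_MASTER matches the kernel's definition, while
    command_to_save() is a hypothetical helper standing in for clearing the
    bit on the device before the save.

```c
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4 /* Bus Master enable bit in the PCI command register */

/*
 * Hypothetical helper: given the command register value found at open time,
 * return the value that should end up in the saved state, with Bus Master
 * cleared so that a later restore cannot re-enable DMA.
 */
static uint16_t command_to_save(uint16_t cmd_at_open)
{
    return (uint16_t)(cmd_at_open & ~PCI_COMMAND_MASTER);
}
```

    All other command bits pass through unchanged, so only DMA capability is
    withheld from the restored state.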
     

05 Aug, 2014

1 commit

  • The patch adds new ioctl commands to the sPAPR VFIO container device
    to support EEH functionality for PCI devices that have been passed
    through from the host to another owner via VFIO.

    Signed-off-by: Gavin Shan
    Acked-by: Alexander Graf
    Acked-by: Alex Williamson
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     

31 May, 2014

2 commits

  • According to the PCI Local Bus Specification, bit 0 of the MSI Message
    Control register (offset: 2, length: 2) enables or disables MSI logic,
    so it must not contribute to the calculation of the MSI interrupt
    count. The patch fixes the issue.

    Signed-off-by: Gavin Shan
    Signed-off-by: Alex Williamson

    Gavin Shan
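
    As an illustration of the register layout involved: the Multiple Message
    Capable field occupies bits 3:1 of Message Control and encodes the vector
    count as a power of two. The sketch below uses the kernel's mask values;
    msi_vector_count() is a hypothetical name, not the function the patch
    touches.

```c
#include <stdint.h>

#define PCI_MSI_FLAGS_ENABLE 0x0001 /* bit 0: MSI enable, not part of the count */
#define PCI_MSI_FLAGS_QMASK  0x000e /* bits 3:1: Multiple Message Capable */

/* Hypothetical helper: number of MSI vectors advertised by Message Control. */
static unsigned int msi_vector_count(uint16_t flags)
{
    /* Isolate bits 3:1, shift them down past bit 0, then take 2^n. */
    return 1u << ((flags & PCI_MSI_FLAGS_QMASK) >> 1);
}
```

    Toggling the enable bit leaves the result unchanged, which is the
    property the commit message asks for.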
     
  • There's nothing we can do different if pci_load_and_free_saved_state()
    fails, other than maybe print some log message, but the actual re-load
    of the state is an unnecessary step here since we've only just saved
    it. We can cleanup a coverity warning and eliminate the unnecessary
    step by freeing the state ourselves.

    Detected by Coverity: CID 753101

    Signed-off-by: Alex Williamson

    Alex Williamson
     

25 Jan, 2014

1 commit


16 Jan, 2014

1 commit

  • PCI resets will attempt to take the device_lock for any device to be
    reset. This is a problem if that lock is already held, for instance
    in the device remove path. It's not sufficient to simply kill the
    user process or skip the reset if called after .remove as a race could
    result in the same deadlock. Instead, we handle all resets as "best
    effort" using the PCI "try" reset interfaces. This prevents the user
    from being able to induce a deadlock by triggering a reset.

    Signed-off-by: Alex Williamson
    Signed-off-by: Bjorn Helgaas

    Alex Williamson
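
    The "best effort" idea can be modelled in userspace with a try-lock: if
    the lock is already held, the reset is skipped instead of waiting. This is
    only a sketch using C11 atomics; try_reset() and the counter are
    hypothetical stand-ins and do not correspond to the kernel's PCI "try"
    reset functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical model of a best-effort reset. device_lock plays the role of
 * the per-device lock; reset_count records how many resets actually ran.
 * Returns false without blocking when the lock is already held.
 */
static bool try_reset(atomic_flag *device_lock, int *reset_count)
{
    if (atomic_flag_test_and_set(device_lock))
        return false;           /* lock busy: give up rather than deadlock */

    (*reset_count)++;           /* the reset itself would happen here */
    atomic_flag_clear(device_lock);
    return true;
}
```

    A caller that gets false back simply carries on without the reset, which
    is what removes the possibility of a user-triggered deadlock.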
     

15 Jan, 2014

1 commit

  • device_lock is much too prone to lockups. For instance if we have a
    pending .remove then device_lock is already held. If userspace
    attempts to modify AER signaling after that point, a deadlock occurs.
    eventfd setup/teardown is already protected in vfio with the igate
    mutex. AER is not a high performance interrupt, so we can also use
    the same mutex to protect signaling versus setup races.

    Signed-off-by: Alex Williamson

    Alex Williamson
     

05 Sep, 2013

1 commit

  • The current VFIO_DEVICE_RESET interface only maps to PCI use cases
    where we can isolate the reset to the individual PCI function. This
    means the device must support FLR (PCIe or AF), PM reset on D3hot->D0
    transition, device specific reset, or be a singleton device on a bus
    for a secondary bus reset. FLR does not have widespread support,
    PM reset is not very reliable, and bus topology is dictated by the
    system and device design. We need to provide a means for a user to
    induce a bus reset in cases where the existing mechanisms are not
    available or not reliable.

    This device specific extension to VFIO provides the user with this
    ability. Two new ioctls are introduced:
    - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO
    - VFIO_DEVICE_PCI_HOT_RESET

    The first provides the user with information about the extent of
    devices affected by a hot reset. This is essentially a list of
    devices and the IOMMU groups they belong to. The user may then
    initiate a hot reset by calling the second ioctl. We must be
    careful that the user has ownership of all the affected devices
    found via the first ioctl, so the second ioctl takes a list of file
    descriptors for the VFIO groups affected by the reset. Each group
    must have IOMMU protection established for the ioctl to succeed.

    Signed-off-by: Alex Williamson

    Alex Williamson
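
    From userspace, the second ioctl takes a variable-length argument whose
    tail is the list of group file descriptors. The following sketch shows how
    such an argument might be assembled, assuming the struct
    vfio_pci_hot_reset layout from <linux/vfio.h>; build_hot_reset() is a
    hypothetical helper, and error handling is reduced to the minimum.

```c
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate and fill a VFIO_DEVICE_PCI_HOT_RESET
 * argument for n VFIO group file descriptors. The caller frees the result.
 */
static struct vfio_pci_hot_reset *build_hot_reset(const int32_t *group_fds,
                                                  uint32_t n)
{
    size_t sz = sizeof(struct vfio_pci_hot_reset) + (size_t)n * sizeof(int32_t);
    struct vfio_pci_hot_reset *reset = calloc(1, sz);

    if (!reset)
        return NULL;

    reset->argsz = (uint32_t)sz;    /* total size, including the fd array */
    reset->count = n;
    if (n > 0)
        memcpy(reset->group_fds, group_fds, (size_t)n * sizeof(int32_t));
    return reset;
}
```

    The result would then be passed as ioctl(device_fd,
    VFIO_DEVICE_PCI_HOT_RESET, reset), where device_fd is an open VFIO device
    descriptor and the group fds cover every group reported by
    VFIO_DEVICE_PCI_GET_HOT_RESET_INFO.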
     

25 Jul, 2013

1 commit

  • If an attempt is made to unbind a device from vfio-pci while that
    device is in use, the request is blocked until the device becomes
    unused. Unfortunately, that unbind path still grabs the device_lock,
    which certain things like __pci_reset_function() also want to take.
    This means we need to try to acquire the locks ourselves and use the
    pre-locked version, __pci_reset_function_locked().

    Signed-off-by: Alex Williamson

    Alex Williamson
     

29 Jun, 2013

1 commit


03 May, 2013

1 commit

  • Pull vfio updates from Alex Williamson:
    "Changes include extension to support PCI AER notification to
    userspace, byte granularity of PCI config space and access to
    unarchitected PCI config space, better protection around IOMMU driver
    accesses, default file mode fix, and a few misc cleanups."

    * tag 'vfio-for-v3.10' of git://github.com/awilliam/linux-vfio:
    vfio: Set container device mode
    vfio: Use down_reads to protect iommu disconnects
    vfio: Convert container->group_lock to rwsem
    PCI/VFIO: use pcie_flags_reg instead of access PCI-E Capabilities Register
    vfio-pci: Enable raw access to unassigned config space
    vfio-pci: Use byte granularity in config map
    vfio: make local function vfio_pci_intx_unmask_handler() static
    VFIO-AER: Vfio-pci driver changes for supporting AER
    VFIO: Wrapper for getting reference to vfio_device

    Linus Torvalds
     

30 Apr, 2013

1 commit

  • Pull PCI updates from Bjorn Helgaas:
    "PCI changes for the v3.10 merge window:

    PCI device hotplug
    - Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe)
    - Make acpiphp builtin only, not modular (Jiang Liu)
    - Add acpiphp mutual exclusion (Jiang Liu)

    Power management
    - Skip "PME enabled/disabled" messages when not supported (Rafael
    Wysocki)
    - Fix fallback to PCI_D0 (Rafael Wysocki)

    Miscellaneous
    - Factor quirk_io_region (Yinghai Lu)
    - Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas)
    - Clean up EISA resource initialization and logging (Bjorn Helgaas)
    - Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas)
    - MIPS: Initialize of_node before scanning bus (Gabor Juhos)
    - Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor
    Juhos)
    - Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang)
    - Fix aer_inject return values (Prarit Bhargava)
    - Remove PME/ACPI dependency (Andrew Murray)
    - Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)"

    * tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits)
    vfio-pci: Use cached MSI/MSI-X capabilities
    vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
    PCI: Remove "extern" from function declarations
    PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
    PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
    PCI: Use msix_table_size() directly, drop multi_msix_capable()
    PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
    PCI: Drop is_64bit_address() and is_mask_bit_support() macros
    PCI: Drop msi_data_reg() macro
    PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
    PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
    PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
    PCI: Clean up MSI/MSI-X capability #defines
    PCI: Use cached MSI-X cap while enabling MSI-X
    PCI: Use cached MSI cap while enabling MSI interrupts
    PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
    PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
    PCI: Use u8, not int, for PM capability offset
    [SCSI] megaraid_sas: Use correct #define for MSI-X capability
    PCI: Remove "extern" from function declarations
    ...

    Linus Torvalds
     

25 Apr, 2013

2 commits


27 Mar, 2013

1 commit

  • The VFIO_DEVICE_SET_IRQS ioctl takes a start and count parameter, both
    of which are unsigned. We attempt to bounds check these, but fail to
    account for the case where start is a very large number, allowing
    start + count to wrap back into the valid range. Bounds check both
    start and start + count.

    Reported-by: Dan Carpenter
    Signed-off-by: Alex Williamson

    Alex Williamson
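
    The wraparound can be shown with plain 32-bit arithmetic. In this sketch
    both function names are hypothetical, and the safe variant avoids the
    addition altogether by comparing count against the remaining room, which
    is one possible way to bounds check both start and the end of the range;
    it is not claimed to be the exact expression used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: start + count can wrap around and look valid. */
static bool range_ok_naive(uint32_t start, uint32_t count, uint32_t max)
{
    return (uint32_t)(start + count) <= max;
}

/* Overflow-safe check: validate start first, then compare count
 * against the space left between start and max. */
static bool range_ok(uint32_t start, uint32_t count, uint32_t max)
{
    return start < max && count <= max - start;
}
```

    For example, with max = 8 a request of start = 0xFFFFFFFF and count = 2
    wraps to 1 in the naive check and is accepted, while the safe check
    rejects it because start is already out of range.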
     

11 Mar, 2013

1 commit

  • - New VFIO_SET_IRQ ioctl option to pass the eventfd that is signaled when
    an error occurs in the vfio_pci_device

    - Register pci_error_handler for the vfio_pci driver

    - When the device encounters an error, the error handler registered by
    the vfio_pci driver gets invoked by the AER infrastructure

    - In the error handler, signal the eventfd registered for the device.

    - This results in the qemu eventfd handler getting invoked and
    appropriate action taken for the guest.

    Signed-off-by: Vijay Mohan Pandarathil
    Signed-off-by: Alex Williamson

    Vijay Mohan Pandarathil
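
    The notification mechanism itself is an ordinary eventfd. The sketch
    below exercises only that mechanism from userspace: signal_error() is a
    hypothetical stand-in for the kernel-side error handler signalling the
    eventfd, and wait_error() for the handler on the QEMU side. Registering
    the eventfd with VFIO is not shown.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the error handler: bump the eventfd counter. */
static int signal_error(int efd)
{
    uint64_t one = 1;

    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical stand-in for the userspace handler: block until signalled,
 * then return how many signals accumulated since the last read. */
static int64_t wait_error(int efd)
{
    uint64_t pending;

    if (read(efd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending))
        return -1;
    return (int64_t)pending;
}
```

    Because an eventfd in its default mode accumulates a counter, several
    errors signalled before userspace reads are delivered as a single wakeup
    carrying the total.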
     

19 Feb, 2013

1 commit

  • PCI defines display class VGA regions at I/O port address 0x3b0, 0x3c0
    and MMIO address 0xa0000. As these are non-overlapping, we can ignore
    the I/O port vs MMIO difference and expose them both in a single
    region. We make use of the VGA arbiter around each access to
    configure chipset access as necessary.

    Signed-off-by: Alex Williamson

    Alex Williamson
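
    Because the ranges do not overlap, a single offset is enough to decide
    which address space an access belongs to. The sketch below assumes the
    conventional legacy VGA ranges (I/O ports 0x3b0-0x3bb and 0x3c0-0x3df,
    MMIO 0xa0000-0xbffff); the range ends, the enum, and vga_classify() are
    assumptions made for illustration and are not taken from the commit.

```c
#include <stdint.h>

enum vga_space { VGA_NONE, VGA_IOPORT, VGA_MMIO };

/* Hypothetical helper: map an offset within the combined VGA region to
 * the address space it belongs to. */
static enum vga_space vga_classify(uint32_t pos)
{
    if (pos >= 0xa0000 && pos <= 0xbffff)
        return VGA_MMIO;
    if ((pos >= 0x3b0 && pos <= 0x3bb) || (pos >= 0x3c0 && pos <= 0x3df))
        return VGA_IOPORT;
    return VGA_NONE; /* outside every VGA range: reject the access */
}
```

    An access function built this way can pick the I/O port or MMIO accessor
    per request while presenting one region to the user.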
     

15 Feb, 2013

2 commits

  • We can actually handle MMIO and I/O port from the same access function
    since PCI already does abstraction of this. The ROM BAR only requires
    a minor difference, so it gets included too. vfio_pci_config_readwrite
    gets renamed for consistency.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • The read and write functions are nearly identical, combine them
    and convert to a switch statement. This also makes it easy to
    narrow the scope of when we use the io/mem accessors in case new
    regions are added.

    Signed-off-by: Alex Williamson

    Alex Williamson
     

08 Dec, 2012

4 commits

  • Devices making use of PM reset are getting incorrectly identified as
    not supporting reset because pci_pm_reset() fails unless the device is
    in D0 power state. When first attached to vfio_pci, devices are
    typically in an unknown power state. We can fix this by explicitly
    setting the power state or simply calling pci_enable_device() before
    attempting a pci_reset_function(). We need to enable the device
    anyway, so move this up in our vfio_pci_enable() function, which also
    simplifies the error path a bit.

    Note that pci_disable_device() does not explicitly set the power
    state, so there's no need to re-order vfio_pci_disable().

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • The two labels for error recovery in function vfio_pci_init() are out of
    order, so fix them.

    Signed-off-by: Jiang Liu
    Signed-off-by: Alex Williamson

    Jiang Liu
     
  • Move the device reset to the end of our disable path, the device
    should already be stopped from pci_disable_device(). This also allows
    us to manipulate the save/restore to avoid the save/reset/restore +
    save/restore that we had before.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • Generated by: coccinelle/api/memdup_user.cocci

    Acked-by: Julia Lawall
    Reported-by: Fengguang Wu
    Signed-off-by: Alex Williamson

    Fengguang Wu
     

10 Oct, 2012

1 commit


09 Oct, 2012

1 commit

  • The VM_RESERVED flag was killed off in commit 314e51b9851b ("mm: kill
    vma flag VM_RESERVED and mm->reserved_vm counter"), and replaced by the
    proper semantic flags (eg "don't core-dump" etc). But there was a new
    use of VM_RESERVED that got missed by the merge.

    Fix the remaining use of VM_RESERVED in the vfio_pci driver, replacing
    the VM_RESERVED flag with VM_DONTEXPAND | VM_DONTDUMP.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

31 Jul, 2012

1 commit

  • Add PCI device support for VFIO. PCI devices expose regions
    for accessing config space, I/O port space, and MMIO areas
    of the device. PCI config access is virtualized in the kernel,
    allowing us to ensure the integrity of the system, by preventing
    various accesses while reducing duplicate support across various
    userspace drivers. I/O port supports read/write access while
    MMIO also supports mmap of sufficiently sized regions. Support
    for INTx, MSI, and MSI-X interrupts is provided using eventfds to
    userspace.

    Signed-off-by: Alex Williamson

    Alex Williamson