11 Jul, 2012

1 commit

  • The kernel no longer allows us to pass NULL for the hard handler
    without also specifying IRQF_ONESHOT. IRQF_ONESHOT imposes latency
    in the exit path that we don't need for MSI interrupts. Long term
    we'd like to inject these interrupts from the hard handler when
    possible. In the short term, we can create dummy hard handlers
    that return us to the previous behavior. Credit to Michael for the
    original patch.

    Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43328

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Alex Williamson
    Signed-off-by: Avi Kivity

    Alex Williamson
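
    A minimal userspace sketch of the dummy-handler idea (the irqreturn
    values are re-declared locally for illustration, and the handler name
    is hypothetical, not the patch's actual symbol):

```c
#include <assert.h>

/* Simplified local model of the kernel's irqreturn_t values
 * (assumption: re-declared here, not taken from <linux/interrupt.h>). */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;

/* A dummy hard handler: it does no work in hard-irq context and only
 * asks the core to wake the threaded handler.  Passing this instead of
 * NULL to request_threaded_irq() avoids the new IRQF_ONESHOT
 * requirement, so the MSI is not kept masked until the thread runs. */
irqreturn_t dummy_msi_hard_handler(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return IRQ_WAKE_THREAD;
}
```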
     

04 Jul, 2012

1 commit


03 Jul, 2012

2 commits


16 Jun, 2012

1 commit

  • The masking was wrong (it should have been 0x7f), and there is no need
    to re-read the value, as pci_setup_device already does this for us.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43339
    Signed-off-by: Jan Kiszka
    Acked-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     

05 Jun, 2012

1 commit

  • kvm_set_irq() has an internal buffer of three irq routing entries,
    allowing a GSI to be connected to three IRQ chips or to one MSI.
    However, setup_routing_entry() does not properly enforce this, allowing
    three irqchip routes followed by an MSI route to overflow the buffer.

    Fix by ensuring that an MSI entry is added to an empty list.

    Signed-off-by: Avi Kivity

    Avi Kivity
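
    A userspace model of the tightened check (the real buffer lives in
    kvm_set_irq(); all names below are illustrative, not the kernel's):

```c
#include <assert.h>

#define NR_ROUTES 3                       /* the three-entry buffer    */
enum entry_type { ENTRY_IRQCHIP, ENTRY_MSI };

struct gsi_routing {
    enum entry_type type[NR_ROUTES];
    int count;
};

/* Returns 0 on success, -1 (EINVAL-like) when the entry would overflow
 * the buffer: an MSI entry must be the only entry on a GSI. */
int add_routing_entry(struct gsi_routing *r, enum entry_type type)
{
    if (type == ENTRY_MSI && r->count > 0)
        return -1;                        /* MSI only on an empty list */
    if (r->count > 0 && r->type[0] == ENTRY_MSI)
        return -1;                        /* nothing may follow an MSI */
    if (r->count >= NR_ROUTES)
        return -1;                        /* at most three irqchip routes */
    r->type[r->count++] = type;
    return 0;
}
```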
     

01 May, 2012

1 commit


24 Apr, 2012

1 commit

  • Currently, MSI messages can only be injected into in-kernel irqchips
    by defining a corresponding IRQ route for each message. This is not
    only inconvenient when the MSI messages are generated "on the fly" by
    user space; IRQ routes are also a limited resource that user space has
    to manage carefully.

    By providing a direct injection path, we can both avoid using up limited
    resources and simplify the necessary steps for user land.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
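
    A sketch of the payload for the new direct path, i.e. what user space
    would hand to ioctl(vm_fd, KVM_SIGNAL_MSI, &msi). The struct is
    mirrored locally so the sketch is self-contained, and the actual ioctl
    is omitted since it needs a live /dev/kvm:

```c
#include <stdint.h>

/* Local mirror of struct kvm_msi from <linux/kvm.h> as introduced by
 * this patch (assumption: re-declared here for illustration). */
struct kvm_msi {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
    uint32_t flags;
    uint8_t  pad[16];
};

/* Split a 64-bit MSI address into the lo/hi halves of the payload. */
struct kvm_msi make_msi(uint64_t address, uint32_t data)
{
    struct kvm_msi msi = {0};
    msi.address_lo = (uint32_t)address;
    msi.address_hi = (uint32_t)(address >> 32);
    msi.data = data;
    return msi;
}
```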
     

20 Apr, 2012

1 commit


19 Apr, 2012

1 commit

  • As pointed out by Jason Baron, when assigning a device to a guest
    we first set the iommu domain pointer, which enables mapping
    and unmapping of memory slots to the iommu. This leaves a window
    where this path is enabled, but we haven't synchronized the iommu
    mappings to the existing memory slots. Thus a slot being removed
    at that point could send us down unexpected code paths removing
    non-existent pinnings and iommu mappings. Take the slots_lock
    around creating the iommu domain and initial mappings as well as
    around iommu teardown to avoid this race.

    Signed-off-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Alex Williamson
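
    A single-threaded model of the fixed ordering: the iommu domain
    pointer is set and the existing memslots are replayed while
    slots_lock is held, so slot add/remove cannot race with the initial
    mapping pass. All names are illustrative, and the lock is modeled as
    a held/not-held flag since a real lock isn't needed in this sketch:

```c
#include <assert.h>

#define MAX_SLOTS 8

struct vm_model {
    int slots_lock_held;     /* stands in for kvm->slots_lock          */
    int nr_slots;
    int domain_enabled;      /* stands in for the iommu domain pointer */
    int mapped[MAX_SLOTS];
};

void attach_iommu(struct vm_model *vm)
{
    vm->slots_lock_held = 1;            /* take slots_lock ...          */
    vm->domain_enabled = 1;             /* ... create the domain ...    */
    for (int i = 0; i < vm->nr_slots; i++)
        vm->mapped[i] = 1;              /* ... and replay every slot    */
    vm->slots_lock_held = 0;            /* only now may slots change    */
}
```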
     

17 Apr, 2012

1 commit

  • Intel spec says that TMR needs to be set/cleared
    when IRR is set, but kvm also clears it on EOI.

    I did some tests on a real (AMD-based) system,
    and I see the same TMR values both before
    and after EOI, so I think it's a minor bug in kvm.

    This patch fixes TMR to be set/cleared on IRR set
    only as per spec.

    And now that we don't clear TMR, we can avoid
    an atomic read of TMR on EOIs that are not propagated
    to the ioapic, by first checking whether the ioapic needs
    the vector at all and only then calculating
    the mode.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Marcelo Tosatti

    Michael S. Tsirkin
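
    A simplified model of the changed TMR handling, using plain 64-bit
    bitmaps in place of the 256-bit APIC registers (vectors 0..63 only;
    names and the EOI behavior are deliberately reduced for illustration):

```c
#include <stdint.h>

struct apic_model { uint64_t irr, tmr; };

/* The trigger-mode bit is written only when the vector is latched
 * into IRR, per the spec: set for level, cleared for edge. */
void apic_set_irr(struct apic_model *a, int vec, int level_triggered)
{
    a->irr |= 1ULL << vec;
    if (level_triggered)
        a->tmr |= 1ULL << vec;
    else
        a->tmr &= ~(1ULL << vec);
}

/* After the fix, EOI no longer touches TMR. */
void apic_eoi(struct apic_model *a, int vec)
{
    a->irr &= ~(1ULL << vec);
}
```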
     

12 Apr, 2012

1 commit

  • We've been adding new mappings, but not destroying old mappings.
    This can lead to a page leak as pages are pinned using
    get_user_pages, but only unpinned with put_page if they still
    exist in the memslots list on vm shutdown. A memslot that is
    destroyed while an iommu domain is enabled for the guest will
    therefore result in an elevated page reference count that is
    never cleared.

    Additionally, without this fix, the iommu is only programmed
    with the first translation for a gpa. This can result in
    peer-to-peer errors if a mapping is destroyed and replaced by a
    new mapping at the same gpa as the iommu will still be pointing
    to the original, pinned memory address.

    Signed-off-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Alex Williamson
     

08 Apr, 2012

4 commits

  • Now that we do neither double buffering nor heuristic selection of the
    write protection method these are not needed anymore.

    Note: some drivers have their own implementation of set_bit_le(), and
    making it generic needs a bit of work, so we use test_and_set_bit_le()
    and will later replace it with a generic set_bit_le().

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Avi Kivity

    Takuya Yoshikawa
     
  • S390's kvm_vcpu_stat does not contain halt_wakeup member.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • The kvm_vcpu_kick function performs roughly the same functionality on
    almost all architectures, so we shouldn't have separate copies.

    PowerPC keeps a pointer to interchanging waitqueues on the vcpu_arch
    structure, and to accommodate this special need a
    __KVM_HAVE_ARCH_VCPU_GET_WQ define and an accompanying function
    kvm_arch_vcpu_wq have been defined. For all other architectures this
    is a generic inline that just returns &vcpu->wq.

    Acked-by: Scott Wood
    Signed-off-by: Christoffer Dall
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Christoffer Dall
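
    A sketch of that default accessor (the waitqueue type is a stand-in
    so the snippet is self-contained):

```c
struct wait_queue_head { int dummy; };          /* stand-in type        */
struct kvm_vcpu { struct wait_queue_head wq; }; /* reduced vcpu struct  */

/* Unless an architecture defines __KVM_HAVE_ARCH_VCPU_GET_WQ and
 * supplies its own kvm_arch_vcpu_wq() (as PowerPC does to switch
 * between waitqueues), the generic inline just hands back the vcpu's
 * embedded waitqueue. */
#ifndef __KVM_HAVE_ARCH_VCPU_GET_WQ
static inline struct wait_queue_head *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
{
    return &vcpu->wq;
}
#endif
```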
     
  • This patch allows the kvm_io_range array to be resized dynamically.

    Signed-off-by: Amos Kong
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Amos Kong
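
    A userspace model of growing the range array on demand instead of
    using a fixed-size array; krealloc() becomes realloc() here, and the
    struct names are simplified stand-ins:

```c
#include <stdlib.h>

struct io_range { unsigned long addr, len; };

struct io_bus {
    struct io_range *range;   /* grows one slot per registered device */
    int dev_count;
};

/* Returns 0 on success, -1 on allocation failure (kernel: -ENOMEM). */
int io_bus_register(struct io_bus *bus, unsigned long addr, unsigned long len)
{
    struct io_range *grown =
        realloc(bus->range, (bus->dev_count + 1) * sizeof(*grown));
    if (!grown)
        return -1;
    bus->range = grown;
    bus->range[bus->dev_count].addr = addr;
    bus->range[bus->dev_count].len = len;
    bus->dev_count++;
    return 0;
}
```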
     

20 Mar, 2012

1 commit

  • As kvm_notify_acked_irq calls kvm_assigned_dev_ack_irq under
    rcu_read_lock, we cannot use a mutex in the latter function. Switch to a
    spin lock to address this.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Jan Kiszka
     

08 Mar, 2012

8 commits


05 Mar, 2012

5 commits

  • This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
    kvm_host.h to reduce the code duplication caused by the need for
    non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
    gfn_to_memslot() in real mode.

    Rather than putting gfn_to_memslot() itself in a header, which would
    lead to increased code size, this puts __gfn_to_memslot() in a header.
    Then, the non-modular uses of gfn_to_memslot() are changed to call
    __gfn_to_memslot() instead. This way there is only one place in the
    source code that needs to be changed should the gfn_to_memslot()
    implementation need to be modified.

    On powerpc, the Book3S HV style of KVM has code that is called from
    real mode which needs to call gfn_to_memslot() and thus needs this.
    (Module code is allocated in the vmalloc region, which can't be
    accessed in real mode.)

    With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.

    Signed-off-by: Paul Mackerras
    Acked-by: Avi Kivity
    Signed-off-by: Alexander Graf
    Signed-off-by: Avi Kivity

    Paul Mackerras
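
    A sketch of the lookup that moves into the header: a plain linear
    scan over the memslot array, simple enough to inline and free of
    module (vmalloc) code, which is what makes it callable from PowerPC
    real mode. Types are simplified stand-ins:

```c
#include <stddef.h>

typedef unsigned long long gfn_t;

struct memslot { gfn_t base_gfn; unsigned long npages; };

/* Return the slot containing gfn, or NULL if no slot covers it. */
static inline struct memslot *
search_memslots(struct memslot *slots, int nslots, gfn_t gfn)
{
    for (int i = 0; i < nslots; i++)
        if (gfn >= slots[i].base_gfn &&
            gfn < slots[i].base_gfn + slots[i].npages)
            return &slots[i];
    return NULL;
}
```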
     
  • find_index_from_host_irq returns 0 on error,
    but callers assume < 0 on error. This should
    not matter much: an out-of-range irq should never happen, since
    the irq handler was registered with this irq #,
    and even if it does happen we get a spurious MSI-X irq in the guest
    and typically nothing terrible happens.

    Still, it is better to make it consistent.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Michael S. Tsirkin
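
    A sketch of the consistent convention: return the matching table
    index, or a negative value when the host irq is not found, so callers
    checking `if (index < 0)` behave as they already expect. The entry
    struct is a simplified stand-in:

```c
struct msix_entry_model { int host_irq; };

/* Linear search over the guest's MSI-X entries by host irq number.
 * Returning -1 (instead of 0) on failure no longer aliases the first
 * valid index. */
int find_index_from_host_irq(struct msix_entry_model *e, int n, int irq)
{
    for (int i = 0; i < n; i++)
        if (e[i].host_irq == irq)
            return i;
    return -1;
}
```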
     
  • This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
    smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
    the correct answer when called without kvm->mmu_lock being held.
    PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
    a single global spinlock in order to improve the scalability of updates
    to the guest MMU hashed page table, and so needs this.

    Signed-off-by: Paul Mackerras
    Acked-by: Avi Kivity
    Signed-off-by: Alexander Graf
    Signed-off-by: Avi Kivity

    Paul Mackerras
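
    A userspace model of the barrier pairing using C11 atomics: the
    invalidate side bumps a sequence count and publishes it with release
    ordering (smp_wmb in the kernel), while the retry check reads it with
    acquire ordering (smp_rmb), which is what makes the check safe
    without holding kvm->mmu_lock. Names are illustrative:

```c
#include <stdatomic.h>

struct mmu_model {
    atomic_int seq;              /* stands in for mmu_notifier_seq */
};

/* Invalidate side: updates made before this increment become visible
 * to any reader that observes the new sequence value. */
void invalidate_range_end(struct mmu_model *m)
{
    atomic_fetch_add_explicit(&m->seq, 1, memory_order_release);
}

/* Returns nonzero when the fault must be retried because an
 * invalidation happened since saved_seq was sampled. */
int mmu_notifier_retry(struct mmu_model *m, int saved_seq)
{
    return atomic_load_explicit(&m->seq, memory_order_acquire) != saved_seq;
}
```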
     
  • This patch exports the s390 SIE hardware control block to userspace
    via the mapping of the vcpu file descriptor. In order to do so,
    a new arch callback named kvm_arch_vcpu_fault is introduced for all
    architectures. It allows mapping architecture-specific pages.

    Signed-off-by: Carsten Otte
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Carsten Otte
     
  • This patch introduces a new config option for user controlled kernel
    virtual machines. It introduces a parameter to KVM_CREATE_VM that
    allows setting bits that alter the capabilities of the newly created
    virtual machine.
    The parameter is passed to kvm_arch_init_vm for all architectures.
    The only valid modifier bit for now is KVM_VM_S390_UCONTROL.
    This requires CAP_SYS_ADMIN privileges and creates a user controlled
    virtual machine on s390 architectures.

    Signed-off-by: Carsten Otte
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Carsten Otte
     

01 Feb, 2012

1 commit

  • It is possible that the __set_bit() in mark_page_dirty() is called
    simultaneously on the same region of memory, which may result in only
    one bit being set, because some callers do not take mmu_lock before
    mark_page_dirty().

    This problem is hard to reproduce because when we reach mark_page_dirty()
    starting from, e.g., tdp_page_fault(), mmu_lock is being held during
    __direct_map(); making kvm-unit-tests' dirty log api test write to two
    pages concurrently was not useful for this reason.

    So we have confirmed that there can actually be a race condition by
    checking, with spin_is_locked(), whether some callers really reach
    there without holding mmu_lock; they probably came from
    kvm_write_guest_page().

    To fix this race, this patch changes the bit operation to the atomic
    version: note that nr_dirty_pages also suffers from the race but we do
    not need exactly correct numbers for now.

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
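
    A userspace model of the fix: the plain read-modify-write (__set_bit)
    becomes an atomic or (set_bit), so two CPUs dirtying neighbouring
    pages whose bits share a word cannot lose an update. C11 atomics
    stand in for the kernel's bitops:

```c
#include <stdatomic.h>

typedef _Atomic unsigned long bitmap_word;

/* Atomic equivalent of set_bit(nr, map) on a word-granular bitmap. */
void set_bit_atomic(bitmap_word *map, unsigned long nr)
{
    unsigned long bits = sizeof(unsigned long) * 8;
    atomic_fetch_or(&map[nr / bits], 1UL << (nr % bits));
}
```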
     

13 Jan, 2012

1 commit


11 Jan, 2012

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
    iommu/amd: Set IOTLB invalidation timeout
    iommu/amd: Init stats for iommu=pt
    iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
    iommu/amd: Add invalidate-context call-back
    iommu/amd: Add amd_iommu_device_info() function
    iommu/amd: Adapt IOMMU driver to PCI register name changes
    iommu/amd: Add invalid_ppr callback
    iommu/amd: Implement notifiers for IOMMUv2
    iommu/amd: Implement IO page-fault handler
    iommu/amd: Add routines to bind/unbind a pasid
    iommu/amd: Implement device acquisition code for IOMMUv2
    iommu/amd: Add driver stub for AMD IOMMUv2 support
    iommu/amd: Add stat counter for IOMMUv2 events
    iommu/amd: Add device errata handling
    iommu/amd: Add function to get IOMMUv2 domain for pdev
    iommu/amd: Implement function to send PPR completions
    iommu/amd: Implement functions to manage GCR3 table
    iommu/amd: Implement IOMMUv2 TLB flushing routines
    iommu/amd: Add support for IOMMUv2 domain mode
    iommu/amd: Add amd_iommu_domain_direct_map function
    ...

    Linus Torvalds
     

09 Jan, 2012

1 commit


27 Dec, 2011

6 commits