11 Jul, 2012

1 commit

  • The kernel no longer allows us to pass NULL for the hard handler
    without also specifying IRQF_ONESHOT. IRQF_ONESHOT imposes latency
    in the exit path that we don't need for MSI interrupts. Long term
    we'd like to inject these interrupts from the hard handler when
    possible. In the short term, we can create dummy hard handlers
    that return us to the previous behavior. Credit to Michael for the
    original patch.

    Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43328

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Alex Williamson
    Signed-off-by: Avi Kivity

    Alex Williamson
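
    A minimal userspace sketch of the dummy-handler idea (the irqreturn
    values are re-declared locally for illustration, and the handler name
    is hypothetical, not the patch's actual symbol):

```c
#include <assert.h>

/* Simplified local model of the kernel's irqreturn_t values
 * (assumption: re-declared here, not taken from <linux/interrupt.h>). */
typedef enum { IRQ_NONE, IRQ_HANDLED, IRQ_WAKE_THREAD } irqreturn_t;

/* A dummy hard handler: it does no work in hard-irq context and only
 * asks the core to wake the threaded handler.  Passing this instead of
 * NULL to request_threaded_irq() avoids the new IRQF_ONESHOT
 * requirement, so the MSI is not kept masked until the thread runs. */
irqreturn_t dummy_msi_hard_handler(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return IRQ_WAKE_THREAD;
}
```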
     

04 Jul, 2012

1 commit


03 Jul, 2012

2 commits


16 Jun, 2012

1 commit

  • The masking was wrong (it should have been 0x7f), and there is no need
    to re-read the value, as pci_setup_device already does this for us.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43339
    Signed-off-by: Jan Kiszka
    Acked-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     

05 Jun, 2012

1 commit

  • kvm_set_irq() has an internal buffer of three irq routing entries,
    allowing a GSI to be connected to three IRQ chips or to one MSI.
    However, setup_routing_entry() does not properly enforce this, allowing
    three irqchip routes followed by an MSI route to overflow the buffer.

    Fix by ensuring that an MSI entry is added to an empty list.

    Signed-off-by: Avi Kivity

    Avi Kivity
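
    A userspace model of the tightened check (the real buffer lives in
    kvm_set_irq(); all names below are illustrative, not the kernel's):

```c
#include <assert.h>

#define NR_ROUTES 3                       /* the three-entry buffer    */
enum entry_type { ENTRY_IRQCHIP, ENTRY_MSI };

struct gsi_routing {
    enum entry_type type[NR_ROUTES];
    int count;
};

/* Returns 0 on success, -1 (EINVAL-like) when the entry would overflow
 * the buffer: an MSI entry must be the only entry on a GSI. */
int add_routing_entry(struct gsi_routing *r, enum entry_type type)
{
    if (type == ENTRY_MSI && r->count > 0)
        return -1;                        /* MSI only on an empty list */
    if (r->count > 0 && r->type[0] == ENTRY_MSI)
        return -1;                        /* nothing may follow an MSI */
    if (r->count >= NR_ROUTES)
        return -1;                        /* at most three irqchip routes */
    r->type[r->count++] = type;
    return 0;
}
```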
     

01 May, 2012

1 commit


24 Apr, 2012

1 commit

  • Currently, MSI messages can only be injected into in-kernel irqchips
    by defining a corresponding IRQ route for each message. This is not
    only inconvenient when the MSI messages are generated "on the fly" by
    user space; IRQ routes are also a limited resource that user space has
    to manage carefully.

    By providing a direct injection path, we can both avoid using up limited
    resources and simplify the necessary steps for user land.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
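
    A sketch of the payload for the new direct path, i.e. what user space
    would hand to ioctl(vm_fd, KVM_SIGNAL_MSI, &msi). The struct is
    mirrored locally so the sketch is self-contained, and the actual ioctl
    is omitted since it needs a live /dev/kvm:

```c
#include <stdint.h>

/* Local mirror of struct kvm_msi from <linux/kvm.h> as introduced by
 * this patch (assumption: re-declared here for illustration). */
struct kvm_msi {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
    uint32_t flags;
    uint8_t  pad[16];
};

/* Split a 64-bit MSI address into the lo/hi halves of the payload. */
struct kvm_msi make_msi(uint64_t address, uint32_t data)
{
    struct kvm_msi msi = {0};
    msi.address_lo = (uint32_t)address;
    msi.address_hi = (uint32_t)(address >> 32);
    msi.data = data;
    return msi;
}
```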
     

20 Apr, 2012

1 commit


19 Apr, 2012

1 commit

  • As pointed out by Jason Baron, when assigning a device to a guest
    we first set the iommu domain pointer, which enables mapping
    and unmapping of memory slots to the iommu. This leaves a window
    where this path is enabled, but we haven't synchronized the iommu
    mappings to the existing memory slots. Thus a slot being removed
    at that point could send us down unexpected code paths removing
    non-existent pinnings and iommu mappings. Take the slots_lock
    around creating the iommu domain and initial mappings as well as
    around iommu teardown to avoid this race.

    Signed-off-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Alex Williamson
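
    A single-threaded model of the fixed ordering: the iommu domain
    pointer is set and the existing memslots are replayed while
    slots_lock is held, so slot add/remove cannot race with the initial
    mapping pass. All names are illustrative, and the lock is modeled as
    a held/not-held flag since a real lock isn't needed in this sketch:

```c
#include <assert.h>

#define MAX_SLOTS 8

struct vm_model {
    int slots_lock_held;     /* stands in for kvm->slots_lock          */
    int nr_slots;
    int domain_enabled;      /* stands in for the iommu domain pointer */
    int mapped[MAX_SLOTS];
};

void attach_iommu(struct vm_model *vm)
{
    vm->slots_lock_held = 1;            /* take slots_lock ...          */
    vm->domain_enabled = 1;             /* ... create the domain ...    */
    for (int i = 0; i < vm->nr_slots; i++)
        vm->mapped[i] = 1;              /* ... and replay every slot    */
    vm->slots_lock_held = 0;            /* only now may slots change    */
}
```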
     

17 Apr, 2012

1 commit

  • Intel spec says that TMR needs to be set/cleared
    when IRR is set, but kvm also clears it on EOI.

    I did some tests on a real (AMD-based) system,
    and I see the same TMR values both before
    and after EOI, so I think it's a minor bug in kvm.

    This patch fixes TMR to be set/cleared on IRR set
    only as per spec.

    And now that we don't clear TMR, we can avoid
    an atomic read of TMR on EOIs that are not propagated
    to the ioapic, by first checking whether the ioapic needs
    the vector at all and only then calculating
    the mode.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Marcelo Tosatti

    Michael S. Tsirkin
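
    A simplified model of the changed TMR handling, using plain 64-bit
    bitmaps in place of the 256-bit APIC registers (vectors 0..63 only;
    names and the EOI behavior are deliberately reduced for illustration):

```c
#include <stdint.h>

struct apic_model { uint64_t irr, tmr; };

/* The trigger-mode bit is written only when the vector is latched
 * into IRR, per the spec: set for level, cleared for edge. */
void apic_set_irr(struct apic_model *a, int vec, int level_triggered)
{
    a->irr |= 1ULL << vec;
    if (level_triggered)
        a->tmr |= 1ULL << vec;
    else
        a->tmr &= ~(1ULL << vec);
}

/* After the fix, EOI no longer touches TMR. */
void apic_eoi(struct apic_model *a, int vec)
{
    a->irr &= ~(1ULL << vec);
}
```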
     

12 Apr, 2012

1 commit

  • We've been adding new mappings, but not destroying old mappings.
    This can lead to a page leak as pages are pinned using
    get_user_pages, but only unpinned with put_page if they still
    exist in the memslots list on vm shutdown. A memslot that is
    destroyed while an iommu domain is enabled for the guest will
    therefore result in an elevated page reference count that is
    never cleared.

    Additionally, without this fix, the iommu is only programmed
    with the first translation for a gpa. This can result in
    peer-to-peer errors if a mapping is destroyed and replaced by a
    new mapping at the same gpa as the iommu will still be pointing
    to the original, pinned memory address.

    Signed-off-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti

    Alex Williamson
     

08 Apr, 2012

4 commits

  • Now that we do neither double buffering nor heuristic selection of the
    write protection method these are not needed anymore.

    Note: some drivers have their own implementation of set_bit_le(), and
    making it generic needs a bit of work, so we use test_and_set_bit_le()
    and will later replace it with a generic set_bit_le().

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Avi Kivity

    Takuya Yoshikawa
     
  • S390's kvm_vcpu_stat does not contain halt_wakeup member.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • The kvm_vcpu_kick function performs roughly the same functionality on
    almost all architectures, so we shouldn't have separate copies.

    PowerPC keeps a pointer to interchanging waitqueues on the vcpu_arch
    structure, and to accommodate this special need a
    __KVM_HAVE_ARCH_VCPU_GET_WQ define and an accompanying function
    kvm_arch_vcpu_wq have been defined. For all other architectures this
    is a generic inline that just returns &vcpu->wq.

    Acked-by: Scott Wood
    Signed-off-by: Christoffer Dall
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Christoffer Dall
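
    A sketch of that default accessor (the waitqueue type is a stand-in
    so the snippet is self-contained):

```c
struct wait_queue_head { int dummy; };          /* stand-in type        */
struct kvm_vcpu { struct wait_queue_head wq; }; /* reduced vcpu struct  */

/* Unless an architecture defines __KVM_HAVE_ARCH_VCPU_GET_WQ and
 * supplies its own kvm_arch_vcpu_wq() (as PowerPC does to switch
 * between waitqueues), the generic inline just hands back the vcpu's
 * embedded waitqueue. */
#ifndef __KVM_HAVE_ARCH_VCPU_GET_WQ
static inline struct wait_queue_head *kvm_arch_vcpu_wq(struct kvm_vcpu *vcpu)
{
    return &vcpu->wq;
}
#endif
```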
     
  • This patch allows the kvm_io_range array to be resized dynamically.

    Signed-off-by: Amos Kong
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Amos Kong
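
    A userspace model of growing the range array on demand instead of
    using a fixed-size array; krealloc() becomes realloc() here, and the
    struct names are simplified stand-ins:

```c
#include <stdlib.h>

struct io_range { unsigned long addr, len; };

struct io_bus {
    struct io_range *range;   /* grows one slot per registered device */
    int dev_count;
};

/* Returns 0 on success, -1 on allocation failure (kernel: -ENOMEM). */
int io_bus_register(struct io_bus *bus, unsigned long addr, unsigned long len)
{
    struct io_range *grown =
        realloc(bus->range, (bus->dev_count + 1) * sizeof(*grown));
    if (!grown)
        return -1;
    bus->range = grown;
    bus->range[bus->dev_count].addr = addr;
    bus->range[bus->dev_count].len = len;
    bus->dev_count++;
    return 0;
}
```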
     

20 Mar, 2012

1 commit

  • As kvm_notify_acked_irq calls kvm_assigned_dev_ack_irq under
    rcu_read_lock, we cannot use a mutex in the latter function. Switch to a
    spin lock to address this.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Jan Kiszka
     

08 Mar, 2012

8 commits


05 Mar, 2012

5 commits

  • This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
    kvm_host.h to reduce the code duplication caused by the need for
    non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
    gfn_to_memslot() in real mode.

    Rather than putting gfn_to_memslot() itself in a header, which would
    lead to increased code size, this puts __gfn_to_memslot() in a header.
    Then, the non-modular uses of gfn_to_memslot() are changed to call
    __gfn_to_memslot() instead. This way there is only one place in the
    source code that needs to be changed should the gfn_to_memslot()
    implementation need to be modified.

    On powerpc, the Book3S HV style of KVM has code that is called from
    real mode which needs to call gfn_to_memslot() and thus needs this.
    (Module code is allocated in the vmalloc region, which can't be
    accessed in real mode.)

    With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.

    Signed-off-by: Paul Mackerras
    Acked-by: Avi Kivity
    Signed-off-by: Alexander Graf
    Signed-off-by: Avi Kivity

    Paul Mackerras
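
    A sketch of the lookup that moves into the header: a plain linear
    scan over the memslot array, simple enough to inline and free of
    module (vmalloc) code, which is what makes it callable from PowerPC
    real mode. Types are simplified stand-ins:

```c
#include <stddef.h>

typedef unsigned long long gfn_t;

struct memslot { gfn_t base_gfn; unsigned long npages; };

/* Return the slot containing gfn, or NULL if no slot covers it. */
static inline struct memslot *
search_memslots(struct memslot *slots, int nslots, gfn_t gfn)
{
    for (int i = 0; i < nslots; i++)
        if (gfn >= slots[i].base_gfn &&
            gfn < slots[i].base_gfn + slots[i].npages)
            return &slots[i];
    return NULL;
}
```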
     
  • find_index_from_host_irq returns 0 on error,
    but callers assume < 0 on error. This should
    not matter much: an out-of-range irq should never happen, since
    the irq handler was registered with this irq #,
    and even if it does happen we get a spurious MSI-X irq in the guest
    and typically nothing terrible happens.

    Still, it is better to make it consistent.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Michael S. Tsirkin
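
    A sketch of the consistent convention: return the matching table
    index, or a negative value when the host irq is not found, so callers
    checking `if (index < 0)` behave as they already expect. The entry
    struct is a simplified stand-in:

```c
struct msix_entry_model { int host_irq; };

/* Linear search over the guest's MSI-X entries by host irq number.
 * Returning -1 (instead of 0) on failure no longer aliases the first
 * valid index. */
int find_index_from_host_irq(struct msix_entry_model *e, int n, int irq)
{
    for (int i = 0; i < n; i++)
        if (e[i].host_irq == irq)
            return i;
    return -1;
}
```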
     
  • This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
    smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
    the correct answer when called without kvm->mmu_lock being held.
    PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
    a single global spinlock in order to improve the scalability of updates
    to the guest MMU hashed page table, and so needs this.

    Signed-off-by: Paul Mackerras
    Acked-by: Avi Kivity
    Signed-off-by: Alexander Graf
    Signed-off-by: Avi Kivity

    Paul Mackerras
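
    A userspace model of the barrier pairing using C11 atomics: the
    invalidate side bumps a sequence count and publishes it with release
    ordering (smp_wmb in the kernel), while the retry check reads it with
    acquire ordering (smp_rmb), which is what makes the check safe
    without holding kvm->mmu_lock. Names are illustrative:

```c
#include <stdatomic.h>

struct mmu_model {
    atomic_int seq;              /* stands in for mmu_notifier_seq */
};

/* Invalidate side: updates made before this increment become visible
 * to any reader that observes the new sequence value. */
void invalidate_range_end(struct mmu_model *m)
{
    atomic_fetch_add_explicit(&m->seq, 1, memory_order_release);
}

/* Returns nonzero when the fault must be retried because an
 * invalidation happened since saved_seq was sampled. */
int mmu_notifier_retry(struct mmu_model *m, int saved_seq)
{
    return atomic_load_explicit(&m->seq, memory_order_acquire) != saved_seq;
}
```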
     
  • This patch exports the s390 SIE hardware control block to userspace
    via the mapping of the vcpu file descriptor. In order to do so,
    a new arch callback named kvm_arch_vcpu_fault is introduced for all
    architectures. It allows mapping architecture-specific pages.

    Signed-off-by: Carsten Otte
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Carsten Otte
     
  • This patch introduces a new config option for user controlled kernel
    virtual machines. It introduces a parameter to KVM_CREATE_VM that
    allows setting bits that alter the capabilities of the newly created
    virtual machine.
    The parameter is passed to kvm_arch_init_vm for all architectures.
    The only valid modifier bit for now is KVM_VM_S390_UCONTROL.
    This requires CAP_SYS_ADMIN privileges and creates a user controlled
    virtual machine on s390 architectures.

    Signed-off-by: Carsten Otte
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Carsten Otte
     

01 Feb, 2012

1 commit

  • It is possible that the __set_bit() in mark_page_dirty() is called
    simultaneously on the same region of memory, which may result in only
    one bit being set, because some callers do not take mmu_lock before
    mark_page_dirty().

    This problem is hard to reproduce because when we reach mark_page_dirty()
    starting from, e.g., tdp_page_fault(), mmu_lock is being held during
    __direct_map(); making kvm-unit-tests' dirty log api test write to two
    pages concurrently was not useful for this reason.

    So we have confirmed that there can actually be a race condition by
    checking, with spin_is_locked(), whether some callers really reach
    there without holding mmu_lock; they probably came from
    kvm_write_guest_page().

    To fix this race, this patch changes the bit operation to the atomic
    version: note that nr_dirty_pages also suffers from the race but we do
    not need exactly correct numbers for now.

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
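
    A userspace model of the fix: the plain read-modify-write (__set_bit)
    becomes an atomic or (set_bit), so two CPUs dirtying neighbouring
    pages whose bits share a word cannot lose an update. C11 atomics
    stand in for the kernel's bitops:

```c
#include <stdatomic.h>

typedef _Atomic unsigned long bitmap_word;

/* Atomic equivalent of set_bit(nr, map) on a word-granular bitmap. */
void set_bit_atomic(bitmap_word *map, unsigned long nr)
{
    unsigned long bits = sizeof(unsigned long) * 8;
    atomic_fetch_or(&map[nr / bits], 1UL << (nr % bits));
}
```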
     

13 Jan, 2012

1 commit


11 Jan, 2012

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
    iommu/amd: Set IOTLB invalidation timeout
    iommu/amd: Init stats for iommu=pt
    iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
    iommu/amd: Add invalidate-context call-back
    iommu/amd: Add amd_iommu_device_info() function
    iommu/amd: Adapt IOMMU driver to PCI register name changes
    iommu/amd: Add invalid_ppr callback
    iommu/amd: Implement notifiers for IOMMUv2
    iommu/amd: Implement IO page-fault handler
    iommu/amd: Add routines to bind/unbind a pasid
    iommu/amd: Implement device acquisition code for IOMMUv2
    iommu/amd: Add driver stub for AMD IOMMUv2 support
    iommu/amd: Add stat counter for IOMMUv2 events
    iommu/amd: Add device errata handling
    iommu/amd: Add function to get IOMMUv2 domain for pdev
    iommu/amd: Implement function to send PPR completions
    iommu/amd: Implement functions to manage GCR3 table
    iommu/amd: Implement IOMMUv2 TLB flushing routines
    iommu/amd: Add support for IOMMUv2 domain mode
    iommu/amd: Add amd_iommu_domain_direct_map function
    ...

    Linus Torvalds
     

09 Jan, 2012

1 commit


27 Dec, 2011

6 commits