Eric Lee / smarc-fsl-linux-kernel

19 Apr, 2012

1 commit

21a1416a1 KVM: lock slots_lock around device assignment

As pointed out by Jason Baron, when assigning a device to a guest
we first set the iommu domain pointer, which enables mapping
and unmapping of memory slots to the iommu. This leaves a window
where this path is enabled, but we haven't synchronized the iommu
mappings to the existing memory slots. Thus a slot being removed
at that point could send us down unexpected code paths removing
non-existent pinnings and iommu mappings. Take the slots_lock
around creating the iommu domain and initial mappings as well as
around iommu teardown to avoid this race.

Signed-off-by: Alex Williamson
Signed-off-by: Marcelo Tosatti

Alex Williamson
2012-04-19 11:04:18 +0800

12 Apr, 2012

1 commit

32f6daad4 KVM: unmap pages from the iommu when slots are removed

We've been adding new mappings, but not destroying old mappings.
This can lead to a page leak as pages are pinned using
get_user_pages, but only unpinned with put_page if they still
exist in the memslots list on vm shutdown. A memslot that is
destroyed while an iommu domain is enabled for the guest will
therefore result in an elevated page reference count that is
never cleared.

Additionally, without this fix, the iommu is only programmed
with the first translation for a gpa. This can result in
peer-to-peer errors if a mapping is destroyed and replaced by a
new mapping at the same gpa as the iommu will still be pointing
to the original, pinned memory address.

Signed-off-by: Alex Williamson
Signed-off-by: Marcelo Tosatti

Alex Williamson
2012-04-12 09:55:25 +0800

20 Mar, 2012

1 commit

cf9eeac46 KVM: Convert intx_mask_lock to spin lock

As kvm_notify_acked_irq calls kvm_assigned_dev_ack_irq under
rcu_read_lock, we cannot use a mutex in the latter function. Switch to a
spin lock to address this.

Signed-off-by: Jan Kiszka
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Jan Kiszka
2012-03-20 18:41:24 +0800

08 Mar, 2012

8 commits

bec87d6e3 KVM: use correct tlbs dirty type in cmpxchg

Using 'int' type is not suitable for a 'long' object. So, correct it.

Signed-off-by: Alex Shi
Signed-off-by: Avi Kivity

Alex Shi
2012-03-08 20:11:44 +0800
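
Illustration for bec87d6e3: a minimal userspace sketch of why a narrowed snapshot breaks a compare-and-exchange on a wider counter. It uses C11 atomics with fixed-width types so the effect is the same on every platform; `cas64` and both `reset_*` helpers are hypothetical names, not kernel code, and the narrowing cast assumes the usual two's-complement wrap-around that GCC and Clang provide.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cmpxchg(): store `desired` only if the
 * counter still holds `expected`. Returns true when the swap happened. */
static bool cas64(_Atomic int64_t *ptr, int64_t expected, int64_t desired)
{
    return atomic_compare_exchange_strong(ptr, &expected, desired);
}

/* Buggy pattern: the snapshot is narrowed to 32 bits, so once the
 * counter exceeds the 32-bit range the comparison can never match. */
static bool reset_with_narrow_snapshot(_Atomic int64_t *counter)
{
    int32_t snapshot = (int32_t)atomic_load(counter);
    return cas64(counter, snapshot, 0);
}

/* Fixed pattern: the snapshot has the same width as the counter. */
static bool reset_with_wide_snapshot(_Atomic int64_t *counter)
{
    int64_t snapshot = atomic_load(counter);
    return cas64(counter, snapshot, 0);
}
```

With a counter value above the 32-bit range, the narrow variant leaves the counter untouched while the wide variant resets it.
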
07700a94b KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

PCI 2.3 allows IRQ sources to be disabled generically at device level. This
enables us to share legacy IRQs of such devices with other host devices
when passing them to a guest.

The new IRQ sharing feature introduced here is optional, user space has
to request it explicitly. Moreover, user space can inform us about its
view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the
interrupt and signaling it if the guest masked it via the virtualized
PCI config space.

Signed-off-by: Jan Kiszka
Acked-by: Alex Williamson
Acked-by: Michael S. Tsirkin
Signed-off-by: Avi Kivity

Jan Kiszka
2012-03-08 20:11:36 +0800
3e515705a KVM: Ensure all vcpus are consistent with in-kernel irqchip settings

If some vcpus are created before KVM_CREATE_IRQCHIP, then
irqchip_in_kernel() and vcpu->arch.apic will be inconsistent, leading
to potential NULL pointer dereferences.

Fix by:
- ensuring that no vcpus are installed when KVM_CREATE_IRQCHIP is called
- ensuring that a vcpu has an apic if it is installed after KVM_CREATE_IRQCHIP

This is somewhat long-winded because vcpu->arch.apic is created without
kvm->lock held.

Based on earlier patch by Michael Ellerman.

Signed-off-by: Michael Ellerman
Signed-off-by: Avi Kivity

Avi Kivity
2012-03-08 20:10:30 +0800
565f3be21 KVM: mmu_notifier: Flush TLBs before releasing mmu_lock

Other threads may process the same page in that small window and skip
TLB flush and then return before these functions do flush.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2012-03-08 20:10:23 +0800
db3fe4eb4 KVM: Introduce kvm_memory_slot::arch and move lpage_info into it

Some members of kvm_memory_slot are not used by every architecture.

This patch is the first step to make this difference clear by
introducing kvm_memory_slot::arch; lpage_info is moved into it.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2012-03-08 20:10:22 +0800
189a2f7b2 KVM: Simplify ifndef conditional usage in __kvm_set_memory_region()

Narrow down the controlled text inside the conditional so that it will
include lpage_info and rmap stuff only.

For this we change the way we check whether the slot is being created
from "if (npages && !new.rmap)" to "if (npages && !old.npages)".

We also stop checking if lpage_info is NULL when we create lpage_info
because we do it from inside the slot creation code block.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2012-03-08 20:10:21 +0800
a64f273a0 KVM: Split lpage_info creation out from __kvm_set_memory_region()

This makes it easy to make lpage_info architecture specific.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2012-03-08 20:10:20 +0800
fb03cb6f4 KVM: Introduce gfn_to_index() which returns the index for a given level

This patch cleans up the code and removes the "(void)level;" warning
suppressor.

Note that we can also use this for PT_PAGE_TABLE_LEVEL to treat every
level uniformly later.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2012-03-08 20:10:19 +0800
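
Illustration for fb03cb6f4: a sketch of what a gfn_to_index() helper could look like. The shift of 9 bits per page-table level is an assumption borrowed from x86 paging, and `LEVEL_SHIFT` is a hypothetical macro; the kernel's own definition may differ.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Assumption: each level above the first covers 9 more address bits,
 * so level 1 maps single pages and level 2 maps 512-page blocks. */
#define LEVEL_SHIFT(level) (((level) - 1) * 9)

/* Index of `gfn` inside a per-level array that starts at `base_gfn`. */
static gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
{
    assert(level >= 1);
    return (gfn >> LEVEL_SHIFT(level)) - (base_gfn >> LEVEL_SHIFT(level));
}
```

Because level 1 yields a shift of zero, the same helper also covers the base page-table level, which is the uniform treatment the commit message alludes to.
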

05 Mar, 2012

5 commits

9d4cba7f9 KVM: Move gfn_to_memslot() to kvm_host.h

This moves __gfn_to_memslot() and search_memslots() from kvm_main.c to
kvm_host.h to reduce the code duplication caused by the need for
non-modular code in arch/powerpc/kvm/book3s_hv_rm_mmu.c to call
gfn_to_memslot() in real mode.

Rather than putting gfn_to_memslot() itself in a header, which would
lead to increased code size, this puts __gfn_to_memslot() in a header.
Then, the non-modular uses of gfn_to_memslot() are changed to call
__gfn_to_memslot() instead. This way there is only one place in the
source code that needs to be changed should the gfn_to_memslot()
implementation need to be modified.

On powerpc, the Book3S HV style of KVM has code that is called from
real mode which needs to call gfn_to_memslot() and thus needs this.
(Module code is allocated in the vmalloc region, which can't be
accessed in real mode.)

With this, we can remove builtin_gfn_to_memslot() from book3s_hv_rm_mmu.c.

Signed-off-by: Paul Mackerras
Acked-by: Avi Kivity
Signed-off-by: Alexander Graf
Signed-off-by: Avi Kivity

Paul Mackerras
2012-03-05 20:57:22 +0800
b93a35532 KVM: fix error handling for out of range irq

find_index_from_host_irq returns 0 on error
but callers assume < 0 on error. This should
not matter much: an out of range irq should never happen since
irq handler was registered with this irq #,
and even if it does we get a spurious msix irq in guest
and typically nothing terrible happens.

Still, better to make it consistent.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Michael S. Tsirkin
2012-03-05 20:52:43 +0800
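
Illustration for b93a35532: a sketch of a lookup that reports failure with a negative value, so that index 0 stays a valid result. The flat `vectors` array and the function name are simplifications for illustration, not the kernel's data structures.

```c
#include <stddef.h>

/* Return the position of `irq` in `vectors`, or -1 if it is absent.
 * Returning 0 on failure would be indistinguishable from "entry 0". */
static int find_index_from_irq(const int *vectors, size_t nr, int irq)
{
    for (size_t i = 0; i < nr; i++) {
        if (vectors[i] == irq)
            return (int)i;
    }
    return -1;
}
```

Callers can then test `index < 0`, which matches the convention the commit message says they already assumed.
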
a355aa54f KVM: Add barriers to allow mmu_notifier_retry to be used locklessly

This adds an smp_wmb in kvm_mmu_notifier_invalidate_range_end() and an
smp_rmb in mmu_notifier_retry() so that mmu_notifier_retry() will give
the correct answer when called without kvm->mmu_lock being held.
PowerPC Book3S HV KVM wants to use a bitlock per guest page rather than
a single global spinlock in order to improve the scalability of updates
to the guest MMU hashed page table, and so needs this.

Signed-off-by: Paul Mackerras
Acked-by: Avi Kivity
Signed-off-by: Alexander Graf
Signed-off-by: Avi Kivity

Paul Mackerras
2012-03-05 20:52:38 +0800
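
Illustration for a355aa54f: a userspace sketch of the sequence-and-count pattern that the barriers protect, written with C11 fences in place of smp_wmb() and smp_rmb(). The struct and function names are hypothetical, and the ordering shown (bump the sequence, fence, then drop the in-progress count) is an assumption about the scheme rather than a copy of the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct notifier_state {
    atomic_ulong seq;    /* bumped when an invalidation completes */
    atomic_long  count;  /* number of invalidations in progress */
};

static void invalidate_range_start(struct notifier_state *s)
{
    atomic_fetch_add_explicit(&s->count, 1, memory_order_relaxed);
}

static void invalidate_range_end(struct notifier_state *s)
{
    atomic_fetch_add_explicit(&s->seq, 1, memory_order_relaxed);
    /* Write barrier: the new seq must be visible before count drops. */
    atomic_thread_fence(memory_order_release);
    atomic_fetch_sub_explicit(&s->count, 1, memory_order_relaxed);
}

/* True if a page fault that sampled `seen_seq` earlier must be retried. */
static bool notifier_retry(struct notifier_state *s, unsigned long seen_seq)
{
    if (atomic_load_explicit(&s->count, memory_order_relaxed) != 0)
        return true;
    /* Read barrier: read seq only after count has been observed as zero. */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&s->seq, memory_order_relaxed) != seen_seq;
}
```

A reader that sees a zero count is then guaranteed to also see the sequence bump of any invalidation that finished before it, without taking a lock.
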
5b1c1493a KVM: s390: ucontrol: export SIE control block to user

This patch exports the s390 SIE hardware control block to userspace
via the mapping of the vcpu file descriptor. In order to do so,
a new arch callback named kvm_arch_vcpu_fault is introduced for all
architectures. It allows architecture-specific pages to be mapped.

Signed-off-by: Carsten Otte
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Carsten Otte
2012-03-05 20:52:19 +0800
e08b96371 KVM: s390: add parameter for KVM_CREATE_VM

This patch introduces a new config option for user controlled kernel
virtual machines. It introduces a parameter to KVM_CREATE_VM that
allows setting bits that alter the capabilities of the newly created
virtual machine.
The parameter is passed to kvm_arch_init_vm for all architectures.
The only valid modifier bit for now is KVM_VM_S390_UCONTROL.
This requires CAP_SYS_ADMIN privileges and creates a user controlled
virtual machine on s390 architectures.

Signed-off-by: Carsten Otte
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Carsten Otte
2012-03-05 20:52:18 +0800

01 Feb, 2012

1 commit

50e92b3c9 KVM: Fix __set_bit() race in mark_page_dirty() during dirty logging

It is possible that the __set_bit() in mark_page_dirty() is called
simultaneously on the same region of memory, which may result in only
one bit being set, because some callers do not take mmu_lock before
mark_page_dirty().

This problem is hard to produce because when we reach mark_page_dirty()
beginning from, e.g., tdp_page_fault(), mmu_lock is being held during
__direct_map(): making kvm-unit-tests' dirty log api test write to two
pages concurrently was not useful for this reason.

So we have confirmed that there can actually be a race condition by
checking if some callers really reach there without holding mmu_lock
using spin_is_locked(): probably they were from kvm_write_guest_page().

To fix this race, this patch changes the bit operation to the atomic
version: note that nr_dirty_pages also suffers from the race but we do
not need exactly correct numbers for now.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Marcelo Tosatti

Takuya Yoshikawa
2012-02-01 17:42:32 +0800
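
Illustration for 50e92b3c9: a userspace sketch of an atomic bit set, using a C11 atomic read-modify-write instead of a plain load, OR and store. The helper names and the bitmap layout are assumptions for illustration only.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit `nr`. Concurrent callers that touch the same word
 * cannot lose each other's update, unlike a non-atomic "word |= mask". */
static void set_bit_atomic(size_t nr, atomic_ulong *bitmap)
{
    atomic_fetch_or(&bitmap[nr / BITS_PER_WORD],
                    1UL << (nr % BITS_PER_WORD));
}

static bool bit_is_set(size_t nr, atomic_ulong *bitmap)
{
    unsigned long word = atomic_load(&bitmap[nr / BITS_PER_WORD]);
    return (word >> (nr % BITS_PER_WORD)) & 1UL;
}
```

The non-atomic variant can lose a bit when two writers read the same old word value; the atomic OR closes that window at the cost of a locked instruction.
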

13 Jan, 2012

1 commit

90ab5ee94 module_param: make bool parameters really bool (drivers & misc)

module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab
Signed-off-by: Rusty Russell

Rusty Russell
2012-01-13 07:02:20 +0800

11 Jan, 2012

1 commit

1c8106528 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
iommu/amd: Set IOTLB invalidation timeout
iommu/amd: Init stats for iommu=pt
iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
iommu/amd: Add invalidate-context call-back
iommu/amd: Add amd_iommu_device_info() function
iommu/amd: Adapt IOMMU driver to PCI register name changes
iommu/amd: Add invalid_ppr callback
iommu/amd: Implement notifiers for IOMMUv2
iommu/amd: Implement IO page-fault handler
iommu/amd: Add routines to bind/unbind a pasid
iommu/amd: Implement device aquisition code for IOMMUv2
iommu/amd: Add driver stub for AMD IOMMUv2 support
iommu/amd: Add stat counter for IOMMUv2 events
iommu/amd: Add device errata handling
iommu/amd: Add function to get IOMMUv2 domain for pdev
iommu/amd: Implement function to send PPR completions
iommu/amd: Implement functions to manage GCR3 table
iommu/amd: Implement IOMMUv2 TLB flushing routines
iommu/amd: Add support for IOMMUv2 domain mode
iommu/amd: Add amd_iommu_domain_direct_map function
...

Linus Torvalds
2012-01-11 03:08:21 +0800

09 Jan, 2012

1 commit

00fb5430f Merge branches 'iommu/fixes', 'arm/omap' and 'x86/amd' into next

Conflicts:
drivers/pci/hotplug/acpiphp_glue.c

Joerg Roedel
2012-01-09 20:04:05 +0800

27 Dec, 2011

14 commits

4f69b6805 KVM: ensure that debugfs entries have been created

By checking the return value from kvm_init_debug, we
can ensure that the entries under debugfs for KVM have
been created correctly.

Signed-off-by: Yang Bai
Signed-off-by: Marcelo Tosatti

Hamo
2011-12-27 17:22:33 +0800
d546cb406 KVM: drop bsp_vcpu pointer from kvm struct

Drop bsp_vcpu pointer from kvm struct since its only use is incorrect
anyway.

Signed-off-by: Gleb Natapov
Signed-off-by: Marcelo Tosatti

Gleb Natapov
2011-12-27 17:22:32 +0800
ff5c2c031 KVM: Use memdup_user instead of kmalloc/copy_from_user

Switch to using memdup_user when possible. This makes the code
smaller and more compact, and prevents errors.

Signed-off-by: Sasha Levin
Signed-off-by: Avi Kivity

Sasha Levin
2011-12-27 17:22:21 +0800
cdfca7b34 KVM: Use kmemdup() instead of kmalloc/memcpy

Switch to kmemdup() in two places to shorten the code and avoid possible bugs.

Signed-off-by: Sasha Levin
Signed-off-by: Avi Kivity

Sasha Levin
2011-12-27 17:22:20 +0800
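
Illustration for cdfca7b34: a userspace analogue of the allocate-and-copy pattern that kmemdup() folds into one call. `dup_buffer` is a hypothetical helper built on malloc() and memcpy(); it is not the kernel function and takes no allocation flags.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate `len` bytes and copy `src` into them. Returns NULL if the
 * allocation fails; the caller owns the result and must free() it. */
static void *dup_buffer(const void *src, size_t len)
{
    void *dst = malloc(len);

    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

Keeping the allocation, the NULL check and the copy in one helper removes the chance of getting the size wrong in one of two separate calls.
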
d77fe6354 KVM: Allow aligned byte and word writes to IOAPIC registers.

This fixes byte accesses to IOAPIC_REG_SELECT as mandated by at least the
ICH10 and Intel Series 5 chipset specs. It also makes ioapic_mmio_write
consistent with ioapic_mmio_read, which also allows byte and word accesses.

Signed-off-by: Julian Stecklina
Signed-off-by: Avi Kivity

Julian Stecklina
2011-12-27 17:17:44 +0800
f85e2cb5d KVM: introduce a table to map slot id to index in memslots array

Fetching the dirty log is a frequent operation when framebuffer-based
displays are used (for example, X Window), so we introduce a mapping table
to speed up id_to_memslot().

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:42 +0800
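
Illustration for f85e2cb5d: a sketch of an id-to-index table that makes the lookup by slot id constant time even when the slot array is kept in another order. The struct layout, `NUM_SLOTS` and `rebuild_id_table` are hypothetical, and the sketch assumes the ids are exactly 0 to NUM_SLOTS - 1.

```c
#include <assert.h>

#define NUM_SLOTS 4

struct memslot {
    int id;
    unsigned long npages;
};

struct memslots {
    struct memslot slots[NUM_SLOTS];  /* may be stored in any order */
    int id_to_index[NUM_SLOTS];       /* slot id -> position in slots[] */
};

/* Recompute the table after slots[] has been reordered. */
static void rebuild_id_table(struct memslots *s)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        s->id_to_index[s->slots[i].id] = i;
}

/* Constant-time lookup by slot id, with no scan of the array. */
static struct memslot *id_to_memslot(struct memslots *s, int id)
{
    assert(id >= 0 && id < NUM_SLOTS);
    return &s->slots[s->id_to_index[id]];
}
```

The table has to be rebuilt whenever the array order changes, which is a cheap price when lookups far outnumber updates.
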
bf3e05bc1 KVM: sort memslots by its size and use line search

Sort memslots based on their size and use a linear search to find them, so
that the larger memslots have a better fit

The idea is from Avi

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:40 +0800
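
Illustration for bf3e05bc1: a sketch of sorting slots by size, largest first, and then finding the slot for a guest frame number with a plain linear scan. The struct fields and function names are assumptions for illustration; the sort uses the standard qsort().

```c
#include <stddef.h>
#include <stdlib.h>

struct memslot {
    unsigned long base_gfn;
    unsigned long npages;
};

/* qsort() comparator: larger slots sort first. */
static int cmp_npages_desc(const void *a, const void *b)
{
    const struct memslot *x = a;
    const struct memslot *y = b;

    if (x->npages < y->npages)
        return 1;
    if (x->npages > y->npages)
        return -1;
    return 0;
}

static void sort_memslots(struct memslot *slots, size_t nr)
{
    qsort(slots, nr, sizeof(*slots), cmp_npages_desc);
}

/* Linear search; the largest slots are checked first. */
static struct memslot *search_memslots(struct memslot *slots, size_t nr,
                                       unsigned long gfn)
{
    for (size_t i = 0; i < nr; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn < slots[i].base_gfn + slots[i].npages)
            return &slots[i];
    }
    return NULL;
}
```

If lookups are spread roughly evenly over guest memory, most of them land in the largest slots and so finish within the first few comparisons.
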
28a37544f KVM: introduce id_to_memslot function

Introduce id_to_memslot to get a memslot by slot id

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:39 +0800
be6ba0f09 KVM: introduce kvm_for_each_memslot macro

Introduce kvm_for_each_memslot to walk all valid memslots

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:37 +0800
be593d628 KVM: introduce update_memslots function

Introduce update_memslots to update the slot that will be updated in
kvm->memslots

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:35 +0800
93a5cef07 KVM: introduce KVM_MEM_SLOTS_NUM macro

Introduce the KVM_MEM_SLOTS_NUM macro to replace
KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2011-12-27 17:17:34 +0800
7850ac542 KVM: Count the number of dirty pages for dirty logging

Needed for the next patch which uses this number to decide how to write
protect a slot.

Signed-off-by: Takuya Yoshikawa
Signed-off-by: Avi Kivity

Takuya Yoshikawa
2011-12-27 17:17:19 +0800
6da64fdb8 KVM: Use kmemdup rather than duplicating its implementation

Use kmemdup rather than duplicating its implementation

The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.

More information about semantic patching is available at
http://coccinelle.lip6.fr/

Signed-off-by: Thomas Meyer
Signed-off-by: Marcelo Tosatti

Thomas Meyer
2011-12-27 17:17:11 +0800
1a214246c KVM: make checks stricter in coalesced_mmio_in_range()

My testing version of Smatch complains that addr and len come from
the user and they can wrap. The path is:
-> kvm_vm_ioctl()
-> kvm_vm_ioctl_unregister_coalesced_mmio()
-> coalesced_mmio_in_range()

I don't know what the implications are of wrapping here, but we may
as well fix it, if only to silence the warning.

Signed-off-by: Dan Carpenter
Signed-off-by: Marcelo Tosatti

Dan Carpenter
2011-12-27 17:17:07 +0800
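
Illustration for 1a214246c: a sketch of a range check that rejects negative lengths and address wrap-around before comparing against the zone bounds. The `mmio_zone` struct and `zone_contains` are hypothetical names, and the sketch assumes the zone itself does not wrap.

```c
#include <stdbool.h>
#include <stdint.h>

struct mmio_zone {
    uint64_t addr;
    uint32_t size;
};

/* True only if [addr, addr + len) lies entirely inside the zone. */
static bool zone_contains(const struct mmio_zone *zone, uint64_t addr, int len)
{
    if (len < 0)
        return false;
    if (addr + (uint64_t)len < addr)   /* addr + len wrapped past 2^64 */
        return false;
    if (addr < zone->addr)
        return false;
    if (addr + (uint64_t)len > zone->addr + zone->size)
        return false;
    return true;
}
```

Without the first two checks, a user-supplied address near the top of the address space could wrap to a small end address and slip past the upper-bound comparison.
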

26 Dec, 2011

1 commit

3d27e23b1 KVM: Device assignment permission checks

Only allow KVM device assignment to attach to devices which:

- Are not bridges
- Have BAR resources (assume others are special devices)
- The user has permissions to use

Assigning a bridge is a configuration error, it's not supported, and
typically doesn't result in the behavior the user is expecting anyway.
Devices without BAR resources are typically chipset components that
also don't have host drivers. We don't want users to hold such devices
captive or cause system problems by fencing them off into an iommu
domain. We determine "permission to use" by testing whether the user
has access to the PCI sysfs resource files. By default a normal user
will not have access to these files, so it provides a good indication
that an administration agent has granted the user access to the device.

[Yang Bai: add missing #include]
[avi: fix comment style]

Signed-off-by: Alex Williamson
Signed-off-by: Yang Bai
Signed-off-by: Marcelo Tosatti

Alex Williamson
2011-12-26 01:03:54 +0800

25 Dec, 2011

1 commit

423873736 KVM: Remove ability to assign a device without iommu support

This option has no users and it exposes a security hole, in that
devices can be assigned without iommu protection. Make
KVM_DEV_ASSIGN_ENABLE_IOMMU a mandatory option.

Signed-off-by: Alex Williamson
Signed-off-by: Marcelo Tosatti

Alex Williamson
2011-12-25 23:13:31 +0800

10 Nov, 2011

1 commit

7d3002cc8 iommu/core: split mapping to page sizes as supported by the hardware

When mapping a memory region, split it to page sizes as supported
by the iommu hardware. Always prefer bigger pages, when possible,
in order to reduce the TLB pressure.

The logic to do that is now added to the IOMMU core, so neither the iommu
drivers themselves nor users of the IOMMU API have to duplicate it.

This allows a more lenient granularity of mappings; traditionally the
IOMMU API took 'order' (of a page) as a mapping size, and directly let
the low level iommu drivers handle the mapping, but now that the IOMMU
core can split arbitrary memory regions into pages, we can remove this
limitation, so users don't have to split those regions by themselves.

Currently the supported page sizes are advertised once and they then
remain static. That works well for OMAP and MSM but it would probably
not fly well with intel's hardware, where the page size capabilities
seem to have the potential to be different between several DMA
remapping devices.

register_iommu() currently sets a default pgsize behavior, so we can convert
the IOMMU drivers in subsequent patches. After all the drivers
are converted, the temporary default settings will be removed.

Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted
to deal with bytes instead of page order.

Many thanks to Joerg Roedel for significant review!

Signed-off-by: Ohad Ben-Cohen
Cc: David Brown
Cc: David Woodhouse
Cc: Joerg Roedel
Cc: Stepan Moskovchenko
Cc: KyongHo Cho
Cc: Hiroshi DOYU
Cc: Laurent Pinchart
Cc: kvm@vger.kernel.org
Signed-off-by: Joerg Roedel

Ohad Ben-Cohen
2011-11-10 18:40:37 +0800
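
Illustration for 7d3002cc8: a sketch of splitting a region into the largest supported page sizes, given a bitmap in which bit n set means a page of 2^n bytes is supported. `pick_pgsize` and `split_mapping` are hypothetical helpers written for this note; they follow the idea in the commit message and are not the IOMMU core code.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest supported page size that fits in `size` and for which both
 * addresses are aligned. Returns 0 if no supported size qualifies. */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova,
                            uint64_t paddr, uint64_t size)
{
    uint64_t best = 0;

    for (int bit = 0; bit < 64; bit++) {
        uint64_t pgsize = UINT64_C(1) << bit;

        if (pgsize > size)
            break;                      /* larger sizes cannot fit either */
        if ((iova | paddr) & (pgsize - 1))
            break;                      /* larger sizes are misaligned too */
        if (pgsize_bitmap & pgsize)
            best = pgsize;
    }
    return best;
}

/* Record the page size chosen for each chunk in `out`; returns the
 * number of chunks written (at most `max_out`). */
static size_t split_mapping(uint64_t pgsize_bitmap, uint64_t iova,
                            uint64_t paddr, uint64_t size,
                            uint64_t *out, size_t max_out)
{
    size_t n = 0;

    while (size != 0 && n < max_out) {
        uint64_t pgsize = pick_pgsize(pgsize_bitmap, iova, paddr, size);

        if (pgsize == 0)
            break;                      /* region not mappable as given */
        out[n++] = pgsize;
        iova += pgsize;
        paddr += pgsize;
        size -= pgsize;
    }
    return n;
}
```

For example, with 4 KiB, 64 KiB and 1 MiB pages supported, a 72 KiB region at 64 KiB-aligned addresses splits into one 64 KiB page followed by two 4 KiB pages.
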

01 Nov, 2011

2 commits

51441d434 kvm: iommu.c file requires the full module.h present.

This file has things like module_param_named() and MODULE_PARM_DESC()
so it needs the full module.h header present. Without it, you'll get:

CC arch/x86/kvm/../../../virt/kvm/iommu.o
virt/kvm/iommu.c:37: error: expected ‘)’ before ‘bool’
virt/kvm/iommu.c:39: error: expected ‘)’ before string constant
make[3]: *** [arch/x86/kvm/../../../virt/kvm/iommu.o] Error 1
make[2]: *** [arch/x86/kvm] Error 2

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:32:13 +0800
799fd8b23 kvm: fix implicit use of stat.h header file

This was coming in via an implicit module.h (and its sub-includes)
before, but we'll be cleaning that up shortly. Call out the stat.h
include requirement in advance.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:32:12 +0800

31 Oct, 2011

1 commit

0cfdc7243 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (33 commits)
iommu/core: Remove global iommu_ops and register_iommu
iommu/msm: Use bus_set_iommu instead of register_iommu
iommu/omap: Use bus_set_iommu instead of register_iommu
iommu/vt-d: Use bus_set_iommu instead of register_iommu
iommu/amd: Use bus_set_iommu instead of register_iommu
iommu/core: Use bus->iommu_ops in the iommu-api
iommu/core: Convert iommu_found to iommu_present
iommu/core: Add bus_type parameter to iommu_domain_alloc
Driver core: Add iommu_ops to bus_type
iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API
iommu/amd: Fix wrong shift direction
iommu/omap: always provide iommu debug code
iommu/core: let drivers know if an iommu fault handler isn't installed
iommu/core: export iommu_set_fault_handler()
iommu/omap: Fix build error with !IOMMU_SUPPORT
iommu/omap: Migrate to the generic fault report mechanism
iommu/core: Add fault reporting mechanism
iommu/core: Use PAGE_SIZE instead of hard-coded value
iommu/core: use the existing IS_ALIGNED macro
iommu/msm: ->unmap() should return order of unmapped page
...

Fixup trivial conflicts in drivers/iommu/Makefile: "move omap iommu to
dedicated iommu folder" vs "Rename the DMAR and INTR_REMAP config
options" just happened to touch lines next to each other.

Linus Torvalds
2011-10-31 06:46:19 +0800