09 Jan, 2017
1 commit
-
commit 0d808df06a44200f52262b6eb72bcb6042f5a7c5 upstream.
When switching from/to a guest that has a transaction in progress,
we need to save/restore the checkpointed register state. Although
XER is part of the CPU state that gets checkpointed, the code that
does this saving and restoring doesn't save/restore XER.

This fixes it by saving and restoring the XER. To allow userspace
to read/write the checkpointed XER value, we also add a new ONE_REG
specifier.

The visible effect of this bug is that the guest may see its XER
value being corrupted when it uses transactions.

Fixes: e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support")
Fixes: 0a8eccefcb34 ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit")
Signed-off-by: Paul Mackerras
Reviewed-by: Thomas Huth
Signed-off-by: Paul Mackerras
Signed-off-by: Greg Kroah-Hartman
20 Nov, 2016
1 commit
-
Userspace can read the exact value of kvmclock by reading the TSC
and fetching the timekeeping parameters out of guest memory. This
however is brittle and not necessary anymore with KVM 4.11. Provide
a mechanism that lets userspace know if the new KVM_GET_CLOCK
semantics are in effect, and---since we are at it---if the clock
is stable across all VCPUs.

Cc: Radim Krčmář
Cc: Marcelo Tosatti
Signed-off-by: Paolo Bonzini
Signed-off-by: Radim Krčmář
04 Aug, 2016
2 commits
-
The KVM_X2APIC_API_USE_32BIT_IDS feature applies to both
KVM_SET_GSI_ROUTING and KVM_SIGNAL_MSI, but was not mentioned in the
documentation for the latter ioctl.

Signed-off-by: Paolo Bonzini
-
…it/kvmarm/kvmarm into HEAD
KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
23 Jul, 2016
4 commits
-
KVM/ARM changes for Linux 4.8
- GICv3 ITS emulation
- Simpler idmap management that fixes potential TLB conflicts
- Honor the kernel protection in HYP mode
- Removal of the old vgic implementation
-
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries.

For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
includes SPI irqchip routes plus MSI routes. In the future this
might be extended.

Signed-off-by: Eric Auger
Reviewed-by: Andre Przywara
Signed-off-by: Marc Zyngier
-
This patch adds compilation and link against irqchip.
Main motivation behind using irqchip code is to enable MSI
routing code. In the future irqchip routing may also be useful
when targeting multiple irqchips.

Routing standard callbacks are now implemented in vgic-irqfd:
- kvm_set_routing_entry
- kvm_set_irq
- kvm_set_msi

They are only supported with the new_vgic code.
Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.

So from now on IRQCHIP routing is enabled and a routing table entry
must exist for irqfd injection to succeed for a given SPI. This patch
builds a default flat irqchip routing table (gsi=irqchip.pin) covering
all the VGIC SPI indexes. This routing table is overwritten by the
first user-space call to the KVM_SET_GSI_ROUTING ioctl.

MSI routing setup is not yet allowed.

Signed-off-by: Eric Auger
Signed-off-by: Marc Zyngier
-
On ARM, the MSI msg (address and data) comes along with
out-of-band device ID information. The device ID encodes the
device that writes the MSI msg. Let's convey the device id in
kvm_irq_routing_msi and use KVM_MSI_VALID_DEVID flag value in
kvm_irq_routing_entry to indicate the msi devid is populated.

Signed-off-by: Eric Auger
Reviewed-by: Andre Przywara
Acked-by: Radim Krčmář
Signed-off-by: Marc Zyngier
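
The default flat routing table (gsi=irqchip.pin) described in the vgic-irqfd commit above can be pictured with a small model. This is only an illustrative sketch, not the kernel code: the `IrqchipRoute` type, the `default_flat_routing` and `lookup` helpers, and the use of irqchip index 0 for a single VGIC are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IrqchipRoute:
    """Hypothetical stand-in for one irqchip routing entry."""
    gsi: int      # GSI number that irqfd injection is keyed on
    irqchip: int  # interrupt controller index (0 assumed for the single VGIC)
    pin: int      # SPI index on that controller


def default_flat_routing(nr_spis: int) -> List[IrqchipRoute]:
    """Build the identity mapping gsi == pin for every SPI index."""
    return [IrqchipRoute(gsi=i, irqchip=0, pin=i) for i in range(nr_spis)]


def lookup(table: List[IrqchipRoute], gsi: int) -> Optional[IrqchipRoute]:
    """Return the route for a GSI, or None if injection would have no target."""
    for route in table:
        if route.gsi == gsi:
            return route
    return None
```

A GSI with no entry returns `None` here, which mirrors the statement that a routing entry must exist for irqfd injection to succeed for a given SPI.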
19 Jul, 2016
2 commits
-
Now that all ITS emulation functionality is in place, we advertise
MSI functionality to userland and also the ITS device to the guest - if
userland has configured that.

Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier
-
The ARM GICv3 ITS MSI controller requires a device ID to be able to
assign the proper interrupt vector. On real hardware, this ID is
sampled from the bus. To be able to emulate an ITS controller, extend
the KVM MSI interface to let userspace provide such a device ID. For
PCI devices, the device ID is simply the 16-bit bus-device-function
triplet, which should be easily available to the userland tool.

Also there is a new KVM capability which advertises whether the
current VM requires a device ID to be set along with the MSI data.
This flag is still reported as not available everywhere; later we will
enable it when ITS emulation is used.

Signed-off-by: Andre Przywara
Reviewed-by: Eric Auger
Reviewed-by: Marc Zyngier
Acked-by: Christoffer Dall
Acked-by: Paolo Bonzini
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier
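
For PCI devices, the 16-bit bus-device-function triplet mentioned above packs 8 bits of bus, 5 bits of device and 3 bits of function. A minimal sketch of that packing follows; the helper name `pci_device_id` is made up for illustration and is not part of any KVM interface.

```python
def pci_device_id(bus: int, device: int, function: int) -> int:
    """Pack a PCI bus/device/function triplet into a 16-bit device ID."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    # Layout: bits 15..8 bus, bits 7..3 device, bits 2..0 function.
    return (bus << 8) | (device << 3) | function
```

For example, device 00:01.0 yields 0x0008 and 01:00.0 yields 0x0100.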
18 Jul, 2016
1 commit
-
We will use illegal instruction 0x0000 for handling 2 byte sw breakpoints
from user space. As it can be enabled dynamically via a capability,
let's move setting of ICTL_OPEREXC to the post creation step, so we avoid
any races when enabling that capability just while adding new cpus.

Acked-by: Janosch Frank
Reviewed-by: Cornelia Huck
Signed-off-by: David Hildenbrand
Signed-off-by: Christian Borntraeger
14 Jul, 2016
2 commits
-
Add KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK as a feature flag to
KVM_CAP_X2APIC_API.

The quirk made KVM interpret 0xff as a broadcast even in x2APIC mode.
The enableable capability is needed in order to support standard x2APIC and
remain backward compatible.

Signed-off-by: Radim Krčmář
[Expand kvm_apic_mda comment. - Paolo]
Signed-off-by: Paolo Bonzini
-
KVM_CAP_X2APIC_API is a capability for features related to x2APIC
enablement. KVM_X2APIC_API_32BIT_FORMAT feature can be enabled to
extend APIC ID in get/set ioctl and MSI addresses to 32 bits.
Both are needed to support x2APIC.

The feature has to be enableable and disabled by default, because
get/set ioctl shifted and truncated APIC ID to 8 bits by using a
non-standard protocol inspired by xAPIC and the change is not
backward-compatible.

Changes to MSI addresses follow the format used by interrupt remapping
unit. The upper address word, that used to be 0, contains upper 24 bits
of the LAPIC address in its upper 24 bits. Lower 8 bits are reserved as
0. Using the upper address word is not backward-compatible either as we
didn't check that userspace zeroed the word. Reserved bits are still
not explicitly checked, but non-zero data will affect LAPIC addresses,
which will cause a bug.

Signed-off-by: Radim Krčmář
Signed-off-by: Paolo Bonzini
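
The 32-bit MSI destination layout described above can be sketched as plain bit arithmetic. The placement of the low 8 destination bits in bits 19..12 of the lower address word is the conventional x86 MSI layout; the helper names and the decision to mask the reserved low byte of the upper word when decoding are assumptions of this sketch, not a description of what KVM does.

```python
MSI_ADDR_BASE = 0xFEE00000  # conventional x86 MSI address window


def split_dest_id(dest_id: int):
    """Encode a 32-bit APIC ID into (address_lo, address_hi)."""
    address_lo = MSI_ADDR_BASE | ((dest_id & 0xFF) << 12)  # low 8 bits
    address_hi = dest_id & 0xFFFFFF00                      # upper 24 bits
    return address_lo, address_hi


def join_dest_id(address_lo: int, address_hi: int) -> int:
    """Decode the 32-bit APIC ID; the reserved low byte of address_hi is
    ignored here (a choice made for this sketch)."""
    return (address_hi & 0xFFFFFF00) | ((address_lo >> 12) & 0xFF)
```

An APIC ID that fits in 8 bits leaves the upper word at zero, which is why existing userspace that zeroes the word keeps working.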
16 Jun, 2016
1 commit
-
Allow up to 6 KVM guest KScratch registers to be enabled and accessed
via the KVM guest register API and from the guest itself (the fallback
reading and writing of commpage registers is sufficient for KScratch
registers to work as expected).

User mode can expose the registers by setting the appropriate bits of
the guest Config4.KScrExist field. KScratch registers that aren't usable
won't be writeable via the KVM Ioctl API.

Signed-off-by: James Hogan
Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Ralf Baechle
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini
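
As a rough illustration of how user mode might compute the Config4.KScrExist bits, the sketch below builds and decodes the field. It rests on two assumptions that should be checked against the MIPS architecture manual: that KScrExist occupies bits 23..16 of Config4, and that the six KScratch registers correspond to selects 2..7. The helper names are hypothetical.

```python
KSCREXIST_SHIFT = 16                # assumed position of Config4.KScrExist
KVM_KSCRATCH_SELECTS = range(2, 8)  # assumed: selects 2..7 are KScratch1..6


def kscr_exist_field(selects) -> int:
    """Return the Config4 bits that expose the given KScratch selects."""
    field = 0
    for sel in selects:
        if sel not in KVM_KSCRATCH_SELECTS:
            raise ValueError("select %d is not a KScratch register" % sel)
        field |= 1 << sel
    return field << KSCREXIST_SHIFT


def usable_selects(config4: int):
    """List the KScratch selects that a Config4 value marks as existing."""
    field = (config4 >> KSCREXIST_SHIFT) & 0xFF
    return [sel for sel in KVM_KSCRATCH_SELECTS if field & (1 << sel)]
```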
10 Jun, 2016
1 commit
-
Let's not provide the device attribute for cmma enabling and clearing
if the hardware doesn't support it.

This also helps getting rid of the undocumented return value "-EINVAL"
in case CMMA is not available when trying to enable it.

Also properly document the meaning of -EINVAL for CMMA clearing.
Reviewed-by: Christian Borntraeger
Signed-off-by: David Hildenbrand
Signed-off-by: Christian Borntraeger
12 May, 2016
1 commit
-
The KVM_MAX_VCPUS define provides the maximum number of vCPUs per guest, and
also the upper limit for vCPU ids. This is okay for all archs except PowerPC
which can have higher ids, depending on the cpu/core/thread topology. In the
worst case (single threaded guest, host with 8 threads per core), it limits
the maximum number of vCPUs to KVM_MAX_VCPUS / 8.

This patch separates the vCPU numbering from the total number of vCPUs, with
the introduction of KVM_MAX_VCPU_ID, as the maximal valid value for vCPU ids
plus one.

The corresponding KVM_CAP_MAX_VCPU_ID allows userspace to validate vCPU ids
before passing them to KVM_CREATE_VCPU.

This patch only implements KVM_MAX_VCPU_ID with a specific value for PowerPC.
Other archs continue to return KVM_MAX_VCPUS instead.

Suggested-by: Radim Krcmar
Signed-off-by: Greg Kurz
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini
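
The worst case described above can be reproduced with a little arithmetic. The sketch assumes a numbering scheme in which each guest core starts at a multiple of the host threads-per-core value; that scheme, the helper names, and the limit of 1024 used in the usage note are illustrative assumptions rather than values taken from the patch.

```python
def vcpu_id(index: int, guest_threads: int, host_threads: int) -> int:
    """ID of the index-th vCPU when guest cores are spaced host_threads apart."""
    core, thread = divmod(index, guest_threads)
    return core * host_threads + thread


def max_vcpus(id_limit: int, guest_threads: int, host_threads: int) -> int:
    """Count how many vCPUs fit while every ID stays below id_limit."""
    count = 0
    while vcpu_id(count, guest_threads, host_threads) < id_limit:
        count += 1
    return count
```

With an ID limit of 1024, a single-threaded guest on a host with 8 threads per core gets `max_vcpus(1024, 1, 8) == 128`, i.e. the limit divided by 8, while a guest using all 8 threads per core gets the full 1024.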
09 May, 2016
1 commit
-
We forgot to document that capability, let's add documentation.
Reviewed-by: Christian Borntraeger
Signed-off-by: David Hildenbrand
Signed-off-by: Christian Borntraeger
09 Mar, 2016
1 commit
-
KVM/ARM updates for 4.6
- VHE support so that we can run the kernel at EL2 on ARMv8.1 systems
- PMU support for guests
- 32bit world switch rewritten in C
- Various optimizations to the vgic save/restore code

Conflicts:
include/uapi/linux/kvm.h
04 Mar, 2016
1 commit
-
Signed-off-by: Radim Krčmář
Signed-off-by: Paolo Bonzini
03 Mar, 2016
1 commit
-
…lus/powerpc into HEAD
The highlights are:
* Enable VFIO device on PowerPC, from David Gibson
* Optimizations to speed up IPIs between vcpus in HV KVM,
from Suresh Warrier (who is also Suresh E. Warrier)
* In-kernel handling of IOMMU hypercalls, and support for dynamic DMA
windows (DDW), from Alexey Kardashevskiy.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
02 Mar, 2016
1 commit
-
The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not
enough for directly mapped windows as the guest can get more than 4GB.

This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it
via KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against
the locked memory limit.

Since 64bit windows are to support Dynamic DMA windows (DDW), let's add
@bus_offset and @page_shift which are also required by DDW.

Signed-off-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
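
To see why a 32-bit window size is limiting and how table size grows with the window, here is a back-of-the-envelope sketch. It assumes 8-byte TCE entries and one entry per IOMMU page of size `1 << page_shift`; the helper names are hypothetical and the real locked-memory accounting in the kernel may include more than the raw table.

```python
TCE_ENTRY_BYTES = 8  # assumed size of one TCE entry


def tce_table_bytes(window_size: int, page_shift: int) -> int:
    """Bytes needed for a TCE table covering window_size bytes of DMA space."""
    entries = window_size >> page_shift  # one entry per IOMMU page
    return entries * TCE_ENTRY_BYTES


def fits_32bit_window(window_size: int) -> bool:
    """True if the size can be expressed in an unsigned 32-bit field."""
    return 0 <= window_size <= 0xFFFFFFFF
```

A 1 GiB window with 4 KiB pages needs 2 MiB of table, while a 64 GiB window with 64 KiB pages needs 8 MiB and cannot be described by a 32-bit size at all.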
01 Mar, 2016
2 commits
-
In some cases userspace needs to get/set attributes specific to a vcpu,
which requires something other than ONE_REG.

Let's copy the KVM_DEVICE approach, and define the respective ioctls
for the vcpu file descriptor.

Signed-off-by: Shannon Zhao
Reviewed-by: Andrew Jones
Acked-by: Peter Maydell
Signed-off-by: Marc Zyngier
-
To support guest PMUv3, use one bit of the VCPU INIT feature array.
Initialize the PMU when initializing the vcpu with that bit and PMU
overflow interrupt set.

Signed-off-by: Shannon Zhao
Acked-by: Peter Maydell
Reviewed-by: Andrew Jones
Signed-off-by: Marc Zyngier
17 Feb, 2016
1 commit
-
The patch implements KVM_EXIT_HYPERV userspace exit
functionality for Hyper-V VMBus hypercalls:
HV_X64_HCALL_POST_MESSAGE, HV_X64_HCALL_SIGNAL_EVENT.

Changes v3:
* use vcpu->arch.complete_userspace_io to setup hypercall result

Changes v2:
* use KVM_EXIT_HYPERV for hypercalls

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
CC: Gleb Natapov
CC: Paolo Bonzini
CC: Joerg Roedel
CC: "K. Y. Srinivasan"
CC: Haiyang Zhang
CC: Roman Kagan
CC: Denis V. Lunev
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini
16 Feb, 2016
1 commit
-
This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
devices or emulated PCI. These calls allow adding multiple entries
(up to 512) into the TCE table in one call which saves time on
transition between kernel and user space.

The current implementation of kvmppc_h_stuff_tce() allows it to be
executed in both real and virtual modes so there is one helper.
The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address
to the host address and since the translation is different, there are
2 helpers - one for each mode.

This implements the KVM_CAP_PPC_MULTITCE capability. When present,
the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these
are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL.
If they cannot be handled by the kernel, they are passed on to
the user space. The user space still has to have an implementation
for these.

Both HV and PR-style KVM are supported.
Signed-off-by: Alexey Kardashevskiy
Reviewed-by: David Gibson
Signed-off-by: Paul Mackerras
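
The saving from the 512-entry limit mentioned above is easy to quantify: a guest updating N entries needs about N/512 calls instead of N. The sketch below only models the batching; `batch_tces` is a hypothetical helper and does not issue any hypercall.

```python
MAX_TCES_PER_CALL = 512  # per-call limit stated in the commit message


def batch_tces(tces, limit=MAX_TCES_PER_CALL):
    """Split a list of TCE values into chunks of at most `limit` entries."""
    return [tces[i:i + limit] for i in range(0, len(tces), limit)]
```

Updating 1300 entries takes three batches (512, 512 and 276 entries) rather than 1300 single-entry calls.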
26 Jan, 2016
1 commit
-
The KVM_SMI capability follows the KVM_S390_SET_IRQ_STATE capability,
which is "4.95"; this changes the number of the KVM_SMI chapter to 4.96.

Signed-off-by: Alexey Kardashevskiy
Reviewed-by: David Gibson
Signed-off-by: Paolo Bonzini
26 Nov, 2015
2 commits
-
A new vcpu exit is introduced to notify the userspace of the
changes in Hyper-V SynIC configuration triggered by guest writing to the
corresponding MSRs.

Changes v4:
* exit into userspace only if guest writes into SynIC MSR's

Changes v3:
* added KVM_EXIT_HYPERV types and structs notes into docs

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
Signed-off-by: Denis V. Lunev
CC: Gleb Natapov
CC: Paolo Bonzini
CC: Roman Kagan
CC: Denis V. Lunev
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini
-
SynIC (synthetic interrupt controller) is a lapic extension,
which is controlled via MSRs and maintains for each vCPU
- 16 synthetic interrupt "lines" (SINT's); each can be configured to
trigger a specific interrupt vector optionally with auto-EOI
semantics
- a message page in the guest memory with 16 256-byte per-SINT message
slots
- an event flag page in the guest memory with 16 2048-bit per-SINT
event flag areas

The host triggers a SINT whenever it delivers a new message to the
corresponding slot or flips an event flag bit in the corresponding area.
The guest informs the host that it can try delivering a message by
explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
MSR.

The userspace (qemu) triggers interrupts and receives EOM notifications
via irqfd with resampler; for that, a GSI is allocated for each
configured SINT, and irq_routing api is extended to support GSI-SINT
mapping.

Changes v4:
* added activation of SynIC by vcpu KVM_ENABLE_CAP
* added per SynIC active flag
* added deactivation of APICv upon SynIC activation

Changes v3:
* added KVM_CAP_HYPERV_SYNIC and KVM_IRQ_ROUTING_HV_SINT notes into
docs

Changes v2:
* do not use posted interrupts for Hyper-V SynIC AutoEOI vectors
* add Hyper-V SynIC vectors into EOI exit bitmap
* Hyper-V SynIC SINT msr write logic simplified

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
Signed-off-by: Denis V. Lunev
CC: Gleb Natapov
CC: Paolo Bonzini
CC: Roman Kagan
CC: Denis V. Lunev
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini
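
The sizes quoted in the SynIC description above work out to exactly one 4096-byte page each: 16 slots of 256 bytes for messages, and 16 areas of 2048 bits (256 bytes) for event flags. A small sketch of the offset arithmetic follows; the helper names are hypothetical, and numbering bits from the least significant bit of each byte is an assumption of this example.

```python
SINT_COUNT = 16
MESSAGE_SLOT_BYTES = 256
EVENT_FLAG_BITS = 2048


def message_slot_offset(sint: int) -> int:
    """Byte offset of a SINT's message slot within the message page."""
    if not 0 <= sint < SINT_COUNT:
        raise ValueError("SINT out of range")
    return sint * MESSAGE_SLOT_BYTES


def event_flag_location(sint: int, flag: int):
    """(byte offset, bit within byte) of one event flag in the flag page."""
    if not (0 <= sint < SINT_COUNT and 0 <= flag < EVENT_FLAG_BITS):
        raise ValueError("SINT or flag out of range")
    bit = sint * EVENT_FLAG_BITS + flag
    return bit // 8, bit % 8  # assumes bit 0 is the LSB of each byte
```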
06 Nov, 2015
1 commit
-
Pull KVM updates from Paolo Bonzini:
"First batch of KVM changes for 4.4.

s390:
A bunch of fixes and optimizations for interrupt and time handling.

PPC:
Mostly bug fixes.

ARM:
No big features, but many small fixes and prerequisites including:

- a number of fixes for the arch-timer
- introducing proper level-triggered semantics for the arch-timers
- a series of patches to synchronously halt a guest (prerequisite
for IRQ forwarding)
- some tracepoint improvements
- a tweak for the EL2 panic handlers
- some more VGIC cleanups getting rid of redundant state

x86:
Quite a few changes:

- support for VT-d posted interrupts (i.e. PCI devices can inject
interrupts directly into vCPUs). This introduces a new
component (in virt/lib/) that connects VFIO and KVM together.
The same infrastructure will be used for ARM interrupt
forwarding as well.
- more Hyper-V features, though the main one Hyper-V synthetic
interrupt controller will have to wait for 4.5. These will let
KVM expose Hyper-V devices.
- nested virtualization now supports VPID (same as PCID but for
vCPUs) which makes it quite a bit faster
- for future hardware that supports NVDIMM, there is support for
clflushopt, clwb, pcommit
- support for "split irqchip", i.e. LAPIC in kernel +
IOAPIC/PIC/PIT in userspace, which reduces the attack surface of
the hypervisor
- obligatory smattering of SMM fixes
- on the guest side, stable scheduler clock support was rewritten
to not require help from the hypervisor"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits)
KVM: VMX: Fix commit which broke PML
KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0()
KVM: x86: allow RSM from 64-bit mode
KVM: VMX: fix SMEP and SMAP without EPT
KVM: x86: move kvm_set_irq_inatomic to legacy device assignment
KVM: device assignment: remove pointless #ifdefs
KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic
KVM: x86: zero apic_arb_prio on reset
drivers/hv: share Hyper-V SynIC constants with userspace
KVM: x86: handle SMBASE as physical address in RSM
KVM: x86: add read_phys to x86_emulate_ops
KVM: x86: removing unused variable
KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs
KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()
KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
KVM: arm/arm64: Optimize away redundant LR tracking
KVM: s390: use simple switch statement as multiplexer
KVM: s390: drop useless newline in debugging data
KVM: s390: SCA must not cross page boundaries
KVM: arm: Do not indent the arguments of DECLARE_BITMAP
...
12 Oct, 2015
1 commit
-
This patch fixes spelling typos in Documentation/virtual/kvm.
Signed-off-by: Masanari Iida
Signed-off-by: Jonathan Corbet
01 Oct, 2015
5 commits
-
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Signed-off-by: Paolo Bonzini
-
In order to enable userspace PIC support, the userspace PIC needs to
be able to inject local interrupts even when the APICs are in the
kernel.

KVM_INTERRUPT now supports sending local interrupts to an APIC when
APICs are in the kernel.

The ready_for_interrupt_request flag is now only set when the CPU/APIC
will immediately accept and inject an interrupt (i.e. APIC has not
masked the PIC).

When the PIC wishes to initiate an INTA cycle with, say, CPU0, it
kicks CPU0 out of the guest, and rendezvous with CPU0 once it arrives
in userspace.

When the CPU/APIC unmasks the PIC, a KVM_EXIT_IRQ_WINDOW_OPEN is
triggered, so that userspace has a chance to inject a PIC interrupt
if it had been pending.

Overall, this design can lead to a small number of spurious userspace
rendezvous. In particular, whenever the PIC transitions from low to
high while it is masked and whenever the PIC becomes unmasked while
it is low.

Note: this does not buffer more than one local interrupt in the
kernel, so the VMM needs to enter the guest in order to complete
interrupt injection before injecting an additional interrupt.

Compiles for x86.
Can pass the KVM Unit Tests.
Signed-off-by: Steve Rutherford
Signed-off-by: Paolo Bonzini
-
In order to support a userspace IOAPIC interacting with an in kernel
APIC, the EOI exit bitmaps need to be configurable.

If the IOAPIC is in userspace (i.e. the irqchip has been split), the
EOI exit bitmaps will be set whenever the GSI Routes are configured.
In particular, the low MSI routes are reservable for userspace
IOAPICs. For these MSI routes, the EOI Exit bit corresponding to the
destination vector of the route will be set for the destination VCPU.

The intention is for the userspace IOAPICs to use the reservable MSI
routes to inject interrupts into the guest.

This is a slight abuse of the notion of an MSI Route, given that MSIs
classically bypass the IOAPIC. It might be worthwhile to add an
additional route type to improve clarity.

Compile tested for Intel x86.
Signed-off-by: Steve Rutherford
Signed-off-by: Paolo Bonzini
-
Adds KVM_EXIT_IOAPIC_EOI which allows the kernel to EOI
level-triggered IOAPIC interrupts.

Uses a per VCPU exit bitmap to decide whether or not the IOAPIC needs
to be informed (which is identical to the EOI_EXIT_BITMAP field used
by modern x86 processors, but can also be used to elide kvm IOAPIC EOI
exits on older processors).

[Note: A prototype using ResampleFDs found that decoupling the EOI
from the VCPU's thread made it possible for the VCPU to not see a
recent EOI after reentering the guest. This does not match real
hardware.]

Compile tested for Intel x86.
Signed-off-by: Steve Rutherford
Signed-off-by: Paolo Bonzini
-
First patch in a series which enables the relocation of the
PIC/IOAPIC to userspace.

Adds capability KVM_CAP_SPLIT_IRQCHIP;
KVM_CAP_SPLIT_IRQCHIP enables the construction of LAPICs without the
rest of the irqchip.

Compile tested for x86.
Signed-off-by: Steve Rutherford
Suggested-by: Andrew Honig
Signed-off-by: Paolo Bonzini
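
The per-VCPU EOI exit bitmap used by the split-irqchip commits above has one bit per interrupt vector. The following sketch models it as four 64-bit words covering vectors 0..255; the word layout and the helper names are assumptions chosen for illustration, not the kernel's data structure.

```python
VECTORS = 256
WORD_BITS = 64


def new_eoi_exit_bitmap():
    """Empty bitmap: no vector causes an exit to userspace on EOI."""
    return [0] * (VECTORS // WORD_BITS)


def set_eoi_exit(bitmap, vector: int) -> None:
    """Mark a destination vector so that its EOI is reported to userspace."""
    if not 0 <= vector < VECTORS:
        raise ValueError("vector out of range")
    word, bit = divmod(vector, WORD_BITS)
    bitmap[word] |= 1 << bit


def needs_eoi_exit(bitmap, vector: int) -> bool:
    """Check whether an EOI for this vector should exit to userspace."""
    word, bit = divmod(vector, WORD_BITS)
    return bool(bitmap[word] & (1 << bit))
```

In this model, configuring a reserved MSI route whose destination vector is 0x41 sets bit 1 of the second word, and only EOIs for that vector would be forwarded to the userspace IOAPIC.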
23 Aug, 2015
1 commit
-
Patch queue for ppc - 2015-08-22
Highlights for KVM PPC this time around:
- Book3S: A few bug fixes
- Book3S: Allow micro-threading on POWER8
23 Jul, 2015
1 commit
-
Sending of notification is done by exiting vcpu to user space
if KVM_REQ_HV_CRASH is enabled for vcpu. At exit to user space
the kvm_run structure contains system_event with type
KVM_SYSTEM_EVENT_CRASH to notify that a guest crash occurred.

Signed-off-by: Andrey Smetanin
Signed-off-by: Denis V. Lunev
Reviewed-by: Peter Hornyack
CC: Paolo Bonzini
CC: Gleb Natapov
Signed-off-by: Paolo Bonzini
21 Jul, 2015
3 commits
-
Finally advertise the KVM capability for SET_GUEST_DEBUG. Once arm
support is added this check can be moved to the common
kvm_vm_ioctl_check_extension() code.

Signed-off-by: Alex Bennée
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier
-
This adds support for SW breakpoints inserted by userspace.
We do this by trapping all guest software debug exceptions to the
hypervisor (MDCR_EL2.TDE). The exit handler sets an exit reason of
KVM_EXIT_DEBUG with the kvm_debug_exit_arch structure holding the
exception syndrome information.

It will be up to userspace to extract the PC (via GET_ONE_REG) and
determine if the debug event was for a breakpoint it inserted. If not,
userspace will need to re-inject the correct exception and restart the
hypervisor to deliver the debug exception to the guest.

Any other guest software debug exception (e.g. single step or HW
assisted breakpoints) will cause an error and the VM to be killed. This
is addressed by later patches which add support for the other debug
types.

Signed-off-by: Alex Bennée
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
-
This commit adds a stub function to support the KVM_SET_GUEST_DEBUG
ioctl. Any unsupported flag will return -EINVAL. For now, only
KVM_GUESTDBG_ENABLE is supported, although it won't have any effect.

Signed-off-by: Alex Bennée
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
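
The userspace side of the software-breakpoint flow described in the KVM_EXIT_DEBUG commit above boils down to one decision per debug exit. The sketch below models only that decision; the function name, the return strings, and the idea of tracking inserted breakpoints in a set are assumptions for illustration, and fetching the PC from the vcpu is left out.

```python
def classify_debug_exit(pc: int, inserted_breakpoints) -> str:
    """Decide what to do with a debug exit given the guest PC.

    inserted_breakpoints is the set of addresses where the debugger
    itself planted a breakpoint instruction.
    """
    if pc in inserted_breakpoints:
        return "report-to-debugger"  # our breakpoint: stop and notify the user
    return "reinject-to-guest"       # guest's own exception: hand it back
```

For example, with breakpoints at `{0x80001000}`, an exit at PC 0x80001000 is reported to the debugger, while one at 0x80002000 is re-injected into the guest.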