15 Jun, 2017
2 commits
-
In order to start handling guest access to GICv3 system registers,
let's add a hook that will get called when we trap a system register
access. This is gated by a new static key (vgic_v3_cpuif_trap).Tested-by: Alexander Graf
Acked-by: David Daney
Reviewed-by: Eric Auger
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
08 Jun, 2017
7 commits
-
The PMU IRQ number is set through the VCPU device's KVM_SET_DEVICE_ATTR
ioctl handler for the KVM_ARM_VCPU_PMU_V3_IRQ attribute, but there is no
enforced or stated requirement that this must happen after initializing
the VGIC. As a result, calling vgic_valid_spi() which relies on the
nr_spis being set during the VGIC init can incorrectly fail.Introduce irq_is_spi, which determines if an IRQ number is within the
SPI range without verifying it against the actual VGIC properties.Signed-off-by: Christoffer Dall
Reviewed-by: Marc Zyngier -
When injecting an IRQ to the VGIC, you now have to present an owner
token for that IRQ line to show that you are the owner of that line.IRQ lines driven from userspace or via an irqfd do not have an owner and
will simply pass a NULL pointer.Also get rid of the unused kvm_vgic_inject_mapped_irq prototype.
Signed-off-by: Christoffer Dall
Acked-by: Marc Zyngier -
Having multiple devices being able to signal the same interrupt line is
very confusing and almost certainly guarantees a configuration error.Therefore, introduce a very simple allocator which allows a device to
claim an interrupt line from the vgic for a given VM.Signed-off-by: Christoffer Dall
Acked-by: Marc Zyngier -
First we define an ABI using the vcpu devices that lets userspace set
the interrupt numbers for the various timers on both the 32-bit and
64-bit KVM/ARM implementations.Second, we add the definitions for the groups and attributes introduced
by the above ABI. (We add the PMU define on the 32-bit side as well for
symmetry and it may get used some day.)Third, we set up the arch-specific vcpu device operation handlers to
call into the timer code for anything related to the
KVM_ARM_VCPU_TIMER_CTRL group.Fourth, we implement support for getting and setting the timer interrupt
numbers using the above defined ABI in the arch timer code.Fifth, we introduce error checking upon enabling the arch timer (which
is called when first running a VCPU) to check that all VCPUs are
configured to use the same PPI for the timer (as mandated by the
architecture) and that the virtual and physical timers are not
configured to use the same IRQ number.Signed-off-by: Christoffer Dall
Reviewed-by: Marc Zyngier -
We currently initialize the arch timer IRQ numbers from the reset code,
presumably because we once intended to model multiple CPU or SoC types
from within the kernel and have hard-coded reset values in the reset
code.As we are moving towards userspace being in charge of more fine-grained
CPU emulation and stitching together the pieces needed to emulate a
particular type of CPU, we should no longer have a tight coupling
between resetting a VCPU and setting IRQ numbers.Therefore, move the logic to define and use the default IRQ numbers to
the timer code and set the IRQ number immediately when creating the
VCPU.Signed-off-by: Christoffer Dall
Reviewed-by: Marc Zyngier -
We are about to need this define in the arch timer code as well so move
it to a common location.Signed-off-by: Christoffer Dall
Acked-by: Marc Zyngier -
Since we got support for devices in userspace which allows reporting the
PMU overflow output status to userspace, we should actually allow
creating the PMU on systems without an in-kernel irqchip, which in turn
requires us to slightly clarify error codes for the ABI and move things
around for the initialization phase.Signed-off-by: Christoffer Dall
Reviewed-by: Marc Zyngier
18 May, 2017
1 commit
-
If userspace creates the VCPUs after initializing the VGIC, then we end
up in a situation where we trigger a bug in kvm_vcpu_get_idx(), because
it is called prior to adding the VCPU into the vcpus array on the VM.There is no tight coupling between the VCPU index and the area of the
redistributor region used for the VCPU, so we can simply ensure that all
creations of redistributors are serialized per VM, and increment an
offset when we successfully add a redistributor.The vgic_register_redist_iodev() function can be called from two paths:
vgic_redister_all_redist_iodev() which is called via the kvm_vgic_addr()
device attribute handler. This patch already holds the kvm->lock mutex.The other path is via kvm_vgic_vcpu_init, which is called through a
longer chain from kvm_vm_ioctl_create_vcpu(), which releases the
kvm->lock mutex just before calling kvm_arch_vcpu_create(), so we can
simply take this mutex again later for our purposes.Fixes: ab6f468c10 ("KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs")
Signed-off-by: Christoffer Dall
Tested-by: Jean-Philippe Brucker
Reviewed-by: Eric Auger
09 May, 2017
4 commits
-
…l/git/kvmarm/kvmarm into HEAD
Second round of KVM/ARM Changes for v4.12.
Changes include:
- A fix related to the 32-bit idmap stub
- A fix to the bitmask used to deode the operands of an AArch32 CP
instruction
- We have moved the files shared between arch/arm/kvm and
arch/arm64/kvm to virt/kvm/arm
- We add support for saving/restoring the virtual ITS state to
userspace -
The its->initialized doesn't bring much to the table, and creates
unnecessary ordering between setting the address and initializing it
(which amounts to exactly nothing).Let's kill it altogether, making KVM_DEV_ARM_VGIC_CTRL_INIT the no-op
it deserves to be.Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Reviewed-by: Eric Auger -
Instead of waiting with registering KVM iodevs until the first VCPU is
run, we can actually create the iodevs when the redist base address is
set. The only downside is that we must now also check if we need to do
this for VCPUs which are created after creating the VGIC, because there
is no enforced ordering between creating the VGIC (and setting its base
addresses) and creating the VCPUs.Signed-off-by: Christoffer Dall
Reviewed-by: Eric Auger -
Pull KVM updates from Paolo Bonzini:
"ARM:
- HYP mode stub supports kexec/kdump on 32-bit
- improved PMU support
- virtual interrupt controller performance improvements
- support for userspace virtual interrupt controller (slower, but
necessary for KVM on the weird Broadcom SoCs used by the Raspberry
Pi 3)MIPS:
- basic support for hardware virtualization (ImgTec P5600/P6600/I6400
and Cavium Octeon III)PPC:
- in-kernel acceleration for VFIOs390:
- support for guests without storage keys
- adapter interruption suppressionx86:
- usual range of nVMX improvements, notably nested EPT support for
accessed and dirty bits
- emulation of CPL3 CPUID faultinggeneric:
- first part of VCPU thread request API
- kvm_stat improvements"* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
kvm: nVMX: Don't validate disabled secondary controls
KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick
Revert "KVM: Support vCPU-based gfn->hva cache"
tools/kvm: fix top level makefile
KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING
KVM: Documentation: remove VM mmap documentation
kvm: nVMX: Remove superfluous VMX instruction fault checks
KVM: x86: fix emulation of RSM and IRET instructions
KVM: mark requests that need synchronization
KVM: return if kvm_vcpu_wake_up() did wake up the VCPU
KVM: add explicit barrier to kvm_vcpu_kick
KVM: perform a wake_up in kvm_make_all_cpus_request
KVM: mark requests that do not need a wakeup
KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up
KVM: x86: always use kvm_make_request instead of set_bit
KVM: add kvm_{test,clear}_request to replace {test,clear}_bit
s390: kvm: Cpu model support for msa6, msa7 and msa8
KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK
kvm: better MWAIT emulation for guests
KVM: x86: virtualize cpuid faulting
...
08 May, 2017
1 commit
-
We plan to support different migration ABIs, ie. characterizing
the ITS table layout format in guest RAM. For example, a new ABI
will be needed if vLPIs get supported for nested use case.So let's introduce an array of supported ABIs (at the moment a single
ABI is supported though). The following characteristics are foreseen
to vary with the ABI: size of table entries, save/restore operation,
the way abi settings are applied.By default the MAX_ABI_REV is applied on its creation. In subsequent
patches we will introduce a way for the userspace to change the ABI
in use.The entry sizes now are set according to the ABI version and not
hardcoded anymore.Signed-off-by: Eric Auger
Reviewed-by: Christoffer Dall
09 Apr, 2017
5 commits
-
When not using an in-kernel VGIC, but instead emulating an interrupt
controller in userspace, we should report the PMU overflow status to
that userspace interrupt controller using the KVM_CAP_ARM_USER_IRQ
feature.Reviewed-by: Alexander Graf
Reviewed-by: Marc Zyngier
Signed-off-by: Christoffer Dall -
If you're running with a userspace gic or other interrupt controller
(that is no vgic in the kernel), then you have so far not been able to
use the architected timers, because the output of the architected
timers, which are driven inside the kernel, was a kernel-only construct
between the arch timer code and the vgic.This patch implements the new KVM_CAP_ARM_USER_IRQ feature, where we use a
side channel on the kvm_run structure, run->s.regs.device_irq_level, to
always notify userspace of the timer output levels when using a userspace
irqchip.This works by ensuring that before we enter the guest, if the timer
output level has changed compared to what we last told userspace, we
don't enter the guest, but instead return to userspace to notify it of
the new level. If we are exiting, because of an MMIO for example, and
the level changed at the same time, the value is also updated and
userspace can sample the line as it needs. This is nicely achieved
simply always updating the timer_irq_level field after the main run
loop.Note that the kvm_timer_update_irq trace event is changed to show the
host IRQ number for the timer instead of the guest IRQ number, because
the kernel no longer know which IRQ userspace wires up the timer signal
to.Also note that this patch implements all required functionality but does
not yet advertise the capability.Reviewed-by: Alexander Graf
Reviewed-by: Marc Zyngier
Signed-off-by: Alexander Graf
Signed-off-by: Christoffer Dall -
We don't use these fields anymore so let's nuke them completely.
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall -
There is no need to calculate and maintain live_lrs when we always
populate the lowest numbered LRs first on every entry and clear all LRs
on every exit.Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall -
We don't have to save/restore the VMCR on every entry to/from the guest,
since on GICv2 we can access the control interface from EL1 and on VHE
systems with GICv3 we can access the control interface from KVM running
in EL2.GICv3 systems without VHE becomes the rare case, which has to
save/restore the register on each round trip.Note that userspace accesses may see out-of-date values if the VCPU is
running while accessing the VGIC state via the KVM device API, but this
is already the case and it is up to userspace to quiesce the CPUs before
reading the CPU registers from the GIC for an up-to-date view.Reviewed-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Christoffer Dall
04 Apr, 2017
1 commit
-
We currently have some code to clear the list registers on GICv3, but we
never call this code, because the caller got nuked when removing the old
vgic. We also used to have a similar GICv2 part, but that got lost in
the process too.Let's reintroduce the logic for GICv2 and call the logic when we
initialize the use of hypervisors on the CPU, for example when first
loading KVM or when exiting a low power state.Reviewed-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
08 Feb, 2017
6 commits
-
Emulate read and write operations to CNTP_TVAL, CNTP_CVAL and CNTP_CTL.
Now VMs are able to use the EL1 physical timer.Signed-off-by: Jintack Lim
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier -
Initialize the emulated EL1 physical timer with the default irq number.
Signed-off-by: Jintack Lim
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier -
Add the EL1 physical timer context.
Signed-off-by: Jintack Lim
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier -
Now that we have a separate structure for timer context, make functions
generic so that they can work with any timer context, not just the
virtual timer context. This does not change the virtual timer
functionality.Signed-off-by: Jintack Lim
Acked-by: Marc Zyngier
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier -
Make cntvoff per each timer context. This is helpful to abstract kvm
timer functions to work with timer context without considering timer
types (e.g. physical timer or virtual timer).This also would pave the way for ever doing adjustments of the cntvoff
on a per-CPU basis if that should ever make sense.Signed-off-by: Jintack Lim
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier -
Abstract virtual timer context into a separate structure and change all
callers referring to timer registers, irq state and so on. No change in
functionality.This is about to become very handy when adding the EL1 physical timer.
Signed-off-by: Jintack Lim
Acked-by: Christoffer Dall
Acked-by: Marc Zyngier
Signed-off-by: Marc Zyngier
30 Jan, 2017
1 commit
-
VGICv3 CPU interface registers are accessed using
KVM_DEV_ARM_VGIC_CPU_SYSREGS ioctl. These registers are accessed
as 64-bit. The cpu MPIDR value is passed along with register id.
It is used to identify the cpu for registers access.The VM that supports SEIs expect it on destination machine to handle
guest aborts and hence checked for ICC_CTLR_EL1.SEIS compatibility.
Similarly, VM that supports Affinity Level 3 that is required for AArch64
mode, is required to be supported on destination machine. Hence checked
for ICC_CTLR_EL1.A3V compatibility.The arch/arm64/kvm/vgic-sys-reg-v3.c handles read and write of VGIC
CPU registers for AArch64.For AArch32 mode, arch/arm/kvm/vgic-v3-coproc.c file is created but
APIs are not implemented.Updated arch/arm/include/uapi/asm/kvm.h with new definitions
required to compile for AArch32.The version of VGIC v3 specification is defined here
Documentation/virtual/kvm/devices/arm-vgic-v3.txtAcked-by: Christoffer Dall
Reviewed-by: Eric Auger
Signed-off-by: Pavel Fedin
Signed-off-by: Vijaya Kumar K
Signed-off-by: Marc Zyngier
25 Jan, 2017
2 commits
-
Add a file to debugfs to read the in-kernel state of the vgic. We don't
do any locking of the entire VGIC state while traversing all the IRQs,
so if the VM is running the user/developer may not see a quiesced state,
but should take care to pause the VM using facilities in user space for
that purpose.We also don't support LPIs yet, but they can be added easily if needed.
Reviewed-by: Eric Auger
Tested-by: Eric Auger
Tested-by: Andre Przywara
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall -
One of the goals behind the VGIC redesign was to get rid of cached or
intermediate state in the data structures, but we decided to allow
ourselves to precompute the pending value of an IRQ based on the line
level and pending latch state. However, this has now become difficult
to base proper GICv3 save/restore on, because there is a potential to
modify the pending state without knowing if an interrupt is edge or
level configured.See the following post and related message for more background:
https://lists.cs.columbia.edu/pipermail/kvmarm/2017-January/023195.htmlThis commit gets rid of the precomputed pending field in favor of a
function that calculates the value when needed, irq_is_pending().The soft_pending field is renamed to pending_latch to represent that
this latch is the equivalent hardware latch which gets manipulated by
the input signal for edge-triggered interrupts and when writing to the
SPENDR/CPENDR registers.After this commit save/restore code should be able to simply restore the
pending_latch state, line_level state, and config state in any order and
get the desired result.Reviewed-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Andre Przywara
Signed-off-by: Christoffer Dall
13 Jan, 2017
1 commit
-
Current KVM world switch code is unintentionally setting wrong bits to
CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
timer. Bit positions of CNTHCTL_EL2 are changing depending on
HCR_EL2.E2H bit. EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
not set, but they are 11th and 10th bits respectively when E2H is set.In fact, on VHE we only need to set those bits once, not for every world
switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
1, which makes those bits have no effect for the host kernel execution.
So we just set those bits once for guests, and that's it.Signed-off-by: Jintack Lim
Reviewed-by: Marc Zyngier
Signed-off-by: Marc Zyngier
25 Dec, 2016
1 commit
-
There is no point in having an extra type for extra confusion. u64 is
unambiguous.Conversion was done with the following coccinelle script:
@rem@
@@
-typedef u64 cycle_t;@fix@
typedef cycle_t;
@@
-cycle_t
+u64Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: John Stultz
22 Sep, 2016
2 commits
-
This patch allows to build and use vgic-v3 in 32-bit mode.
Unfortunately, it can not be split in several steps without extra
stubs to keep patches independent and bisectable. For instance,
virt/kvm/arm/vgic/vgic-v3.c uses function from vgic-v3-sr.c, handling
access to GICv3 cpu interface from the guest requires vgic_v3.vgic_sre
to be already defined.It is how support has been done:
* handle SGI requests from the guest
* report configured SRE on access to GICv3 cpu interface from the guest
* required vgic-v3 macros are provided via uapi.h
* static keys are used to select GIC backend
* to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
the static inlinesAcked-by: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Vladimir Murzin
Signed-off-by: Christoffer Dall -
Currently GIC backend is selected via alternative framework and this
is fine. We are going to introduce vgic-v3 to 32-bit world and there
we don't have patching framework in hand, so we can either check
support for GICv3 every time we need to choose which backend to use or
try to optimise it by using static keys. The later looks quite
promising because we can share logic involved in selecting GIC backend
between architectures if both uses static keys.This patch moves arm64 from alternative to static keys framework for
selecting GIC backend. For that we embed static key into vgic_global
and enable the key during vgic initialisation based on what has
already been exposed by the host GIC driver.Acked-by: Marc Zyngier
Signed-off-by: Vladimir Murzin
Signed-off-by: Christoffer Dall
08 Sep, 2016
2 commits
-
Now that we have the necessary infrastructure to handle MMIO accesses
in HYP, perform the GICV access on behalf of the guest. This requires
checking that the access is strictly 32bit, properly aligned, and
falls within the expected range.When all condition are satisfied, we perform the access and tell
the rest of the HYP code that the instruction has been correctly
emulated.Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall -
In order to efficiently perform the GICV access on behalf of the
guest, we need to be able to avoid going back all the way to
the host kernel.For this, we introduce a new hook in the world switch code,
conveniently placed just after populating the fault info.
At that point, we only have saved/restored the GP registers,
and we can quickly perform all the required checks (data abort,
translation fault, valid faulting syndrome, not an external
abort, not a PTW).Coming back from the emulation code, we need to skip the emulated
instruction. This involves an additional bit of save/restore in
order to be able to access the guest's PC (and possibly CPSR if
this is a 32bit guest).At this stage, no emulation code is provided.
Signed-off-by: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Christoffer Dall
04 Aug, 2016
1 commit
-
…it/kvmarm/kvmarm into HEAD
KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
03 Aug, 2016
1 commit
-
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
23 Jul, 2016
1 commit
-
This patch adds compilation and link against irqchip.
Main motivation behind using irqchip code is to enable MSI
routing code. In the future irqchip routing may also be useful
when targeting multiple irqchips.Routing standard callbacks now are implemented in vgic-irqfd:
- kvm_set_routing_entry
- kvm_set_irq
- kvm_set_msiThey only are supported with new_vgic code.
Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.So from now on IRQCHIP routing is enabled and a routing table entry
must exist for irqfd injection to succeed for a given SPI. This patch
builds a default flat irqchip routing table (gsi=irqchip.pin) covering
all the VGIC SPI indexes. This routing table is overwritten by the
first first user-space call to KVM_SET_GSI_ROUTING ioctl.MSI routing setup is not yet allowed.
Signed-off-by: Eric Auger
Signed-off-by: Marc Zyngier
19 Jul, 2016
1 commit
-
Going from the ITS structure to the corresponding KVM structure
would be quite handy at times. The kvm_device pointer that is
passed at create time is quite convenient for this, so let's
keep a copy of it in the vgic_its structure.This will be put to a good use in subsequent patches.
Signed-off-by: Marc Zyngier