04 Aug, 2016
1 commit
-
…it/kvmarm/kvmarm into HEAD
KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
03 Aug, 2016
1 commit
-
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
30 Jul, 2016
1 commit
-
Pull smp hotplug updates from Thomas Gleixner:
"This is the next part of the hotplug rework.- Convert all notifiers with a priority assigned
- Convert all CPU_STARTING/DYING notifiers
The final removal of the STARTING/DYING infrastructure will happen
when the merge window closes.Another 700 hundred line of unpenetrable maze gone :)"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
timers/core: Correct callback order during CPU hot plug
leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
powerpc/numa: Convert to hotplug state machine
arm/perf: Fix hotplug state machine conversion
irqchip/armada: Avoid unused function warnings
ARC/time: Convert to hotplug state machine
clocksource/atlas7: Convert to hotplug state machine
clocksource/armada-370-xp: Convert to hotplug state machine
clocksource/exynos_mct: Convert to hotplug state machine
clocksource/arm_global_timer: Convert to hotplug state machine
rcu: Convert rcutree to hotplug state machine
KVM/arm/arm64/vgic-new: Convert to hotplug state machine
smp/cfd: Convert core to hotplug state machine
x86/x2apic: Convert to CPU hotplug state machine
profile: Convert to hotplug state machine
timers/core: Convert to hotplug state machine
hrtimer: Convert to hotplug state machine
x86/tboot: Convert to hotplug state machine
arm64/armv8 deprecated: Convert to hotplug state machine
hwtracing/coresight-etm4x: Convert to hotplug state machine
...
24 Jul, 2016
1 commit
-
kvm_set_routing_entry is changing in -next, and causes things to
explode. Add a temporary workaround that should be dropped when
we hit 4.8-rc1Signed-off-by: Marc Zyngier
23 Jul, 2016
4 commits
-
KVM/ARM changes for Linux 4.8
- GICv3 ITS emulation
- Simpler idmap management that fixes potential TLB conflicts
- Honor the kernel protection in HYP mode
- Removal of the old vgic implementation -
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries.For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
include SPI irqchip routes plus MSI routes. In the future this
might be extended.Signed-off-by: Eric Auger
Reviewed-by: Andre Przywara
Signed-off-by: Marc Zyngier -
This patch adds compilation and link against irqchip.
Main motivation behind using irqchip code is to enable MSI
routing code. In the future irqchip routing may also be useful
when targeting multiple irqchips.Routing standard callbacks now are implemented in vgic-irqfd:
- kvm_set_routing_entry
- kvm_set_irq
- kvm_set_msiThey only are supported with new_vgic code.
Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.So from now on IRQCHIP routing is enabled and a routing table entry
must exist for irqfd injection to succeed for a given SPI. This patch
builds a default flat irqchip routing table (gsi=irqchip.pin) covering
all the VGIC SPI indexes. This routing table is overwritten by the
first first user-space call to KVM_SET_GSI_ROUTING ioctl.MSI routing setup is not yet allowed.
Signed-off-by: Eric Auger
Signed-off-by: Marc Zyngier -
on ARM, a devid field is populated in kvm_msi struct in case the
flag is set to KVM_MSI_VALID_DEVID. Let's propagate both flags and
devid field in kvm_kernel_irq_routing_entry.Signed-off-by: Eric Auger
Reviewed-by: Andre Przywara
Acked-by: Christoffer Dall
Acked-by: Radim Krčmář
Signed-off-by: Marc Zyngier
19 Jul, 2016
26 commits
-
If we care to move all the checks that do not involve any memory
allocation, we can simplify the MAPI error handling. Let's do that,
it cannot hurt.Signed-off-by: Marc Zyngier
-
vgic_its_cmd_handle_mapi has an extra "subcmd" argument, which is
already contained in the command buffer that all command handlers
obtain from the command queue. Let's drop it, as it is not that
useful.Signed-off-by: Marc Zyngier
-
There is no need to have separate functions to validate devices
and collections, as the architecture doesn't really distinguish the
two, and they are supposed to be managed the same way.Let's turn the DevID checker into a generic one.
Signed-off-by: Marc Zyngier
-
Going from the ITS structure to the corresponding KVM structure
would be quite handy at times. The kvm_device pointer that is
passed at create time is quite convenient for this, so let's
keep a copy of it in the vgic_its structure.This will be put to a good use in subsequent patches.
Signed-off-by: Marc Zyngier
-
Instead of spreading random allocations all over the place,
consolidate allocation/init/freeing of collections in a pair
of constructor/destructor.Signed-off-by: Marc Zyngier
-
When checking that the storage address of a device entry is valid,
it is critical to compute the actual address of the entry, rather
than relying on the beginning of the page to match a CPU page of
the same size: for example, if the guest places the table at the
last 64kB boundary of RAM, but RAM size isn't a multiple of 64kB...Fix this by computing the actual offset of the device ID in the
L2 page, and check the corresponding GFN.Signed-off-by: Marc Zyngier
-
Checking that the device_id fits if the table, and we must make
sure that the associated memory is also accessible.Signed-off-by: Marc Zyngier
-
The nr_entries variable in vgic_its_check_device_id actually
describe the size of the L1 table, and not the number of
entries in this table.Rename it to l1_tbl_size, so that we can now change the code
with a better understanding of what is what.Signed-off-by: Marc Zyngier
-
The ITS tables are stored in LE format. If the host is reading
a L1 table entry to check its validity, it must convert it to
the CPU endianness.Signed-off-by: Marc Zyngier
-
The current code will fail on valid indirect tables, and happily
use the ones that are pointing out of the guest RAM. Funny what a
small "!" can do for you...Signed-off-by: Marc Zyngier
-
Instead of sprinkling raw kref_get() calls everytime we cannot
do a normal vgic_get_irq(), use the existing vgic_get_irq_kref(),
which does the same thing and is paired with a vgic_put_irq().vgic_get_irq_kref is moved to vgic.h in order to be easily shared.
Signed-off-by: Marc Zyngier
-
For VGICv2 save and restore the CPU interface registers
are accessed. Restore the modality which has been altered.
Also explicitly set the iodev_type for both the DIST and CPU
interface.Signed-off-by: Eric Auger
Signed-off-by: Marc Zyngier -
Now that all ITS emulation functionality is in place, we advertise
MSI functionality to userland and also the ITS device to the guest - if
userland has configured that.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
When userland wants to inject an MSI into the guest, it uses the
KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with
the payload and the device ID.
With the help of the KVM IO bus framework we learn the corresponding
ITS from the doorbell address. We then use our wrapper functions to
iterate the linked lists and find the proper Interrupt Translation Table
Entry (ITTE) and thus the corresponding struct vgic_irq to finally set
the pending bit.
We also provide the handler for the ITS "INT" command, which allows a
guest to trigger an MSI via the ITS command queue. Since this one knows
about the right ITS already, we directly call the MMIO handler function
without using the kvm_io_bus framework.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
The connection between a device, an event ID, the LPI number and the
associated CPU is stored in in-memory tables in a GICv3, but their
format is not specified by the spec. Instead software uses a command
queue in a ring buffer to let an ITS implementation use its own
format.
Implement handlers for the various ITS commands and let them store
the requested relation into our own data structures. Those data
structures are protected by the its_lock mutex.
Our internal ring buffer read and write pointers are protected by the
its_cmd mutex, so that only one VCPU per ITS can handle commands at
any given time.
Error handling is very basic at the moment, as we don't have a good
way of communicating errors to the guest (usually an SError).
The INT command handler is missing from this patch, as we gain the
capability of actually injecting MSIs into the guest only later on.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
The (system-wide) LPI configuration table is held in a table in
(guest) memory. To achieve reasonable performance, we cache this data
in our struct vgic_irq. If the guest updates the configuration data
(which consists of the enable bit and the priority value), it issues
an INV or INVALL command to allow us to update our information.
Provide functions that update that information for one LPI or all LPIs
mapped to a specific collection.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
The LPI pending status for a GICv3 redistributor is held in a table
in (guest) memory. To achieve reasonable performance, we cache the
pending bit in our struct vgic_irq. The initial pending state must be
read from guest memory upon enabling LPIs for this redistributor.
As we can't access the guest memory while we hold the lpi_list spinlock,
we create a snapshot of the LPI list and iterate over that.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
LPIs are dynamically created (mapped) at guest runtime and their
actual number can be quite high, but is mostly assigned using a very
sparse allocation scheme. So arrays are not an ideal data structure
to hold the information.
We use a spin-lock protected linked list to hold all mapped LPIs,
represented by their struct vgic_irq. This lock is grouped between the
ap_list_lock and the vgic_irq lock in our locking order.
Also we store a pointer to that struct vgic_irq in our struct its_itte,
so we can easily access it.
Eventually we call our new vgic_get_lpi() from vgic_get_irq(), so
the VGIC code gets transparently access to LPIs.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
Add emulation for some basic MMIO registers used in the ITS emulation.
This includes:
- GITS_{CTLR,TYPER,IIDR}
- ID registers
- GITS_{CBASER,CREADR,CWRITER}
(which implement the ITS command buffer handling)
- GITS_BASERMost of the handlers are pretty straight forward, only the CWRITER
handler is a bit more involved by taking the new its_cmd mutex and
then iterating over the command buffer.
The registers holding base addresses and attributes are sanitised before
storing them.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
Introduce a new KVM device that represents an ARM Interrupt Translation
Service (ITS) controller. Since there can be multiple of this per guest,
we can't piggy back on the existing GICv3 distributor device, but create
a new type of KVM device.
On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data
structure and store the pointer in the kvm_device data.
Upon an explicit init ioctl from userland (after having setup the MMIO
address) we register the handlers with the kvm_io_bus framework.
Any reference to an ITS thus has to go via this interface.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
The ARM GICv3 ITS emulation code goes into a separate file, but needs
to be connected to the GICv3 emulation, of which it is an option.
The ITS MMIO handlers require the respective ITS pointer to be passed in,
so we amend the existing VGIC MMIO framework to let it cope with that.
Also we introduce the basic ITS data structure and initialize it, but
don't return any success yet, as we are not yet ready for the show.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
In the GICv3 redistributor there are the PENDBASER and PROPBASER
registers which we did not emulate so far, as they only make sense
when having an ITS. In preparation for that emulate those MMIO
accesses by storing the 64-bit data written into it into a variable
which we later read in the ITS emulation.
We also sanitise the registers, making sure RES0 regions are respected
and checking for valid memory attributes.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
In the moment our struct vgic_irq's are statically allocated at guest
creation time. So getting a pointer to an IRQ structure is trivial and
safe. LPIs are more dynamic, they can be mapped and unmapped at any time
during the guest's _runtime_.
In preparation for supporting LPIs we introduce reference counting for
those structures using the kernel's kref infrastructure.
Since private IRQs and SPIs are statically allocated, we avoid actually
refcounting them, since they would never be released anyway.
But we take provisions to increase the refcount when an IRQ gets onto a
VCPU list and decrease it when it gets removed. Also this introduces
vgic_put_irq(), which wraps kref_put and hides the release function from
the callers.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
The kvm_io_bus framework is a nice place of holding information about
various MMIO regions for kernel emulated devices.
Add a call to retrieve the kvm_io_device structure which is associated
with a certain MMIO address. This avoids to duplicate kvm_io_bus'
knowledge of MMIO regions without having to fake MMIO calls if a user
needs the device a certain MMIO address belongs to.
This will be used by the ITS emulation to get the associated ITS device
when someone triggers an MSI via an ioctl from userspace.Signed-off-by: Andre Przywara
Reviewed-by: Eric Auger
Reviewed-by: Marc Zyngier
Acked-by: Christoffer Dall
Acked-by: Paolo Bonzini
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
kvm_register_device_ops() can return an error, so lets check its return
value and propagate this up the call chain.Signed-off-by: Andre Przywara
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier -
Logically a GICv3 redistributor is assigned to a (v)CPU, so we should
aim to keep redistributor related variables out of our struct vgic_dist.Let's start by replacing the redistributor related kvm_io_device array
with two members in our existing struct vgic_cpu, which are naturally
per-VCPU and thus don't require any allocation / freeing.
So apart from the better fit with the redistributor design this saves
some code as well.Signed-off-by: Andre Przywara
Reviewed-by: Eric Auger
Reviewed-by: Marc Zyngier
Tested-by: Eric Auger
Signed-off-by: Marc Zyngier
15 Jul, 2016
6 commits
-
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.Signed-off-by: Anna-Maria Gleixner
Cc: Andre Przywara
Cc: Christoffer Dall
Cc: Eric Auger
Cc: Linus Torvalds
Cc: Marc Zyngier
Cc: Paolo Bonzini
Cc: Peter Zijlstra
Cc: Radim Krcmar
Cc: Thomas Gleixner
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.900484868@linutronix.de
Signed-off-by: Ingo Molnar -
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.Signed-off-by: Richard Cochran
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: Christoffer Dall
Cc: Gleb Natapov
Cc: Linus Torvalds
Cc: Marc Zyngier
Cc: Paolo Bonzini
Cc: Peter Zijlstra
Cc: Radim Krcmar
Cc: Thomas Gleixner
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.634155707@linutronix.de
Signed-off-by: Ingo Molnar -
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.
The VGIC callback is run after KVM's main callback since it reflects the
makefile order.Signed-off-by: Richard Cochran
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Cc: Christoffer Dall
Cc: Gleb Natapov
Cc: Linus Torvalds
Cc: Marc Zyngier
Cc: Paolo Bonzini
Cc: Peter Zijlstra
Cc: Radim Krcmar
Cc: Thomas Gleixner
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.546953286@linutronix.de
Signed-off-by: Ingo Molnar -
Install the callbacks via the state machine. The core won't invoke the
callbacks on already online CPUs.Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Gleixner
Reviewed-by: Sebastian Andrzej Siewior
Acked-by: Paolo Bonzini
Cc: Gleb Natapov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Radim Krcmar
Cc: kvm@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153335.886159080@linutronix.de
Signed-off-by: Ingo Molnar -
Once anon_inode_getfd() has succeeded, it's impossible to undo
in a clean way and no, sys_close() is not usable in such cases.
Use anon_inode_getfile() and get_unused_fd_flags() to get struct file
and descriptor and do *not* install the file into the descriptor table
until after the last possible failure exit.Signed-off-by: Paolo Bonzini
-
This reverts commit 77ecc085fed1af1000ca719522977b960aa6da52.
Al Viro colorfully says: "You should *NEVER* use sys_close() on failure
exit paths like that. Moreover, this kvm_put_kvm() becomes a double-put,
since closing the damn file will drop that reference to kvm. Please,
revert. anon_inode_getfd() should be used only when there's no possible
failures past its call".Signed-off-by: Paolo Bonzini