17 Jan, 2019
1 commit
-
commit fb544d1ca65a89f7a3895f7531221ceeed74ada7 upstream.
We recently addressed a VMID generation race by introducing a read/write
lock around accesses and updates to the vmid generation values.However, kvm_arch_vcpu_ioctl_run() also calls need_new_vmid_gen() but
does so without taking the read lock.As far as I can tell, this can lead to the same kind of race:
VM 0, VCPU 0 VM 0, VCPU 1
------------ ------------
update_vttbr (vmid 254)
update_vttbr (vmid 1) // roll over
read_lock(kvm_vmid_lock);
force_vm_exit()
local_irq_disable
need_new_vmid_gen == false //because vmid gen matchesenter_guest (vmid 254)
kvm_arch.vttbr = :
read_unlock(kvm_vmid_lock);enter_guest (vmid 1)
Which results in running two VCPUs in the same VM with different VMIDs
and (even worse) other VCPUs from other VMs could now allocate clashing
VMID 254 from the new generation as long as VCPU 0 is not exiting.Attempt to solve this by making sure vttbr is updated before another CPU
can observe the updated VMID generation.Cc: stable@vger.kernel.org
Fixes: f0cf47d939d0 "KVM: arm/arm64: Close VMID generation race"
Reviewed-by: Julien Thierry
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
10 Jan, 2019
1 commit
-
commit 107352a24900fb458152b92a4e72fbdc83fd5510 upstream.
We currently only halt the guest when a vCPU messes with the active
state of an SPI. This is perfectly fine for GICv2, but isn't enough
for GICv3, where all vCPUs can access the state of any other vCPU.Let's broaden the condition to include any GICv3 interrupt that
has an active state (i.e. all but LPIs).Cc: stable@vger.kernel.org
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
14 Nov, 2018
1 commit
-
commit da5a3ce66b8bb51b0ea8a89f42aac153903f90fb upstream.
At boot time, KVM stashes the host MDCR_EL2 value, but only does this
when the kernel is not running in hyp mode (i.e. is non-VHE). In these
cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can
lead to CONSTRAINED UNPREDICTABLE behaviour.Since we use this value to derive the MDCR_EL2 value when switching
to/from a guest, after a guest have been run, the performance counters
do not behave as expected. This has been observed to result in accesses
via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant
counters, resulting in events not being counted. In these cases, only
the fixed-purpose cycle counter appears to work as expected.Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.
Cc: Christopher Dall
Cc: James Morse
Cc: Will Deacon
Cc: stable@vger.kernel.org
Fixes: 1e947bad0b63b351 ("arm64: KVM: Skip HYP setup when already running in HYP")
Tested-by: Robin Murphy
Signed-off-by: Mark Rutland
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
26 Sep, 2018
2 commits
-
[ Upstream commit 1d47191de7e15900f8fbfe7cccd7c6e1c2d7c31a ]
The vgic_init function can race with kvm_arch_vcpu_create() which does
not hold kvm_lock() and we therefore have no synchronization primitives
to ensure we're doing the right thing.As the user is trying to initialize or run the VM while at the same time
creating more VCPUs, we just have to refuse to initialize the VGIC in
this case rather than silently failing with a broken VCPU.Reviewed-by: Eric Auger
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 6b8b9a48545e08345b8ff77c9fd51b1aebdbefb3 ]
It's possible for userspace to control n. Sanitize n when using it as an
array index, to inhibit the potential spectre-v1 write gadget.Note that while it appears that n must be bound to the interval [0,3]
due to the way it is extracted from addr, we cannot guarantee that
compiler transformations (and/or future refactoring) will ensure this is
the case, and given this is a slow path it's better to always perform
the masking.Found by smatch.
Signed-off-by: Mark Rutland
Cc: Christoffer Dall
Cc: Marc Zyngier
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
05 Sep, 2018
2 commits
-
commit 976d34e2dab10ece5ea8fe7090b7692913f89084 upstream.
When there is contention on faulting in a particular page table entry
at stage 2, the break-before-make requirement of the architecture can
lead to additional refaulting due to TLB invalidation.Avoid this by skipping a page table update if the new value of the PTE
matches the previous value.Cc: stable@vger.kernel.org
Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Reviewed-by: Suzuki Poulose
Acked-by: Christoffer Dall
Signed-off-by: Punit Agrawal
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
commit 86658b819cd0a9aa584cd84453ed268a6f013770 upstream.
Contention on updating a PMD entry by a large number of vcpus can lead
to duplicate work when handling stage 2 page faults. As the page table
update follows the break-before-make requirement of the architecture,
it can lead to repeated refaults due to clearing the entry and
flushing the tlbs.This problem is more likely when -
* there are large number of vcpus
* the mapping is large block mappingsuch as when using PMD hugepages (512MB) with 64k pages.
Fix this by skipping the page table update if there is no change in
the entry being updated.Cc: stable@vger.kernel.org
Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages")
Reviewed-by: Suzuki Poulose
Acked-by: Christoffer Dall
Signed-off-by: Punit Agrawal
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
24 Aug, 2018
1 commit
-
[ Upstream commit ba56bc3a0786992755e6804fbcbdc60ef6cfc24c ]
When booting a 64 KB pages kernel on a ACPI GICv3 system that
implements support for v2 emulation, the following warning is
producedGICV size 0x2000 not a multiple of page size 0x10000
and support for v2 emulation is disabled, preventing GICv2 VMs
from being able to run on such hosts.The reason is that vgic_v3_probe() performs a sanity check on the
size of the window (it should be a multiple of the page size),
while the ACPI MADT parsing code hardcodes the size of the window
to 8 KB. This makes sense, considering that ACPI does not bother
to describe the size in the first place, under the assumption that
platforms implementing ACPI will follow the architecture and not
put anything else in the same 64 KB window.So let's just drop the sanity check altogether, and assume that
the window is at least 64 KB in size.Fixes: 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init")
Signed-off-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
22 Jul, 2018
4 commits
-
commit 5d81f7dc9bca4f4963092433e27b508cbe524a32 upstream.
Now that all our infrastructure is in place, let's expose the
availability of ARCH_WORKAROUND_2 to guests. We take this opportunity
to tidy up a couple of SMCCC constants.Acked-by: Christoffer Dall
Reviewed-by: Mark Rutland
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
commit 55e3748e8902ff641e334226bdcb432f9a5d78d3 upstream.
In order to offer ARCH_WORKAROUND_2 support to guests, we need
a bit of infrastructure.Let's add a flag indicating whether or not the guest uses
SSBD mitigation. Depending on the state of this flag, allow
KVM to disable ARCH_WORKAROUND_2 before entering the guest,
and enable it when exiting it.Reviewed-by: Christoffer Dall
Reviewed-by: Mark Rutland
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
Commit 44a497abd621a71c645f06d3d545ae2f46448830 upstream.
kvm_vgic_global_state is part of the read-only section, and is
usually accessed using a PC-relative address generation (adrp + add).It is thus useless to use kern_hyp_va() on it, and actively problematic
if kern_hyp_va() becomes non-idempotent. On the other hand, there is
no way that the compiler is going to guarantee that such access is
always PC relative.So let's bite the bullet and provide our own accessor.
Acked-by: Catalin Marinas
Reviewed-by: James Morse
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
Commit 36989e7fd386a9a5822c48691473863f8fbb404d upstream.
kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init()
used to store the host EL1 registers when KVM switches to a guest.Make it easier for ASM to generate pointers into this per-cpu memory
by making it a static allocation.Signed-off-by: James Morse
Acked-by: Christoffer Dall
Signed-off-by: Catalin Marinas
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
21 Jun, 2018
1 commit
-
[ Upstream commit 5e1ca5e23b167987d5b6d8b08f2d5b7dd2d13f49 ]
It's possible for userspace to control n. Sanitize n when using it as an
array index.Note that while it appears that n must be bound to the interval [0,3]
due to the way it is extracted from addr, we cannot guarantee that
compiler transformations (and/or future refactoring) will ensure this is
the case, and given this is a slow path it's better to always perform
the masking.Found by smatch.
Signed-off-by: Mark Rutland
Acked-by: Christoffer Dall
Acked-by: Marc Zyngier
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
30 May, 2018
1 commit
-
[ Upstream commit 62b06f8f429cd233e4e2e7bbd21081ad60c9018f ]
Our irq_is_pending() helper function accesses multiple members of the
vgic_irq struct, so we need to hold the lock when calling it.
Add that requirement as a comment to the definition and take the lock
around the call in vgic_mmio_read_pending(), where we were missing it
before.Fixes: 96b298000db4 ("KVM: arm/arm64: vgic-new: Add PENDING registers handlers")
Signed-off-by: Andre Przywara
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
23 May, 2018
2 commits
-
commit bf308242ab98b5d1648c3663e753556bef9bec01 upstream.
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.Provide a wrapper which does that and use that everywhere.
Note that ending the SRCU critical section before returning from the
kvm_read_guest() wrapper is safe, because the data has been *copied*, so
we don't need to rely on valid references to the memslot anymore.Cc: Stable # 4.8+
Reported-by: Jan Glauber
Signed-off-by: Andre Przywara
Acked-by: Christoffer Dall
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman -
commit 711702b57cc3c50b84bd648de0f1ca0a378805be upstream.
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Use the newly introduced wrapper for that.Cc: Stable # 4.12+
Reported-by: Jan Glauber
Signed-off-by: Andre Przywara
Acked-by: Christoffer Dall
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman
02 May, 2018
2 commits
-
commit 85bd0ba1ff9875798fad94218b627ea9f768f3c3 upstream.
Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
or 1.0 to a guest, defaulting to the latest version of the PSCI
implementation that is compatible with the requested version. This is
no different from doing a firmware upgrade on KVM.But in order to give a chance to hypothetical badly implemented guests
that would have a fit by discovering something other than PSCI 0.2,
let's provide a new API that allows userspace to pick one particular
version of the API.This is implemented as a new class of "firmware" registers, where
we expose the PSCI version. This allows the PSCI version to be
save/restored as part of a guest migration, and also set to
any supported version if the guest requires it.Cc: stable@vger.kernel.org #4.16
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
commit f0cf47d939d0b4b4f660c5aaa4276fa3488f3391 upstream.
Before entering the guest, we check whether our VMID is still
part of the current generation. In order to avoid taking a lock,
we start with checking that the generation is still current, and
only if not current do we take the lock, recheck, and update the
generation and VMID.This leaves open a small race: A vcpu can bump up the global
generation number as well as the VM's, but has not updated
the VMID itself yet.At that point another vcpu from the same VM comes in, checks
the generation (and finds it not needing anything), and jumps
into the guest. At this point, we end-up with two vcpus belonging
to the same VM running with two different VMIDs. Eventually, the
VMID used by the second vcpu will get reassigned, and things will
really go wrong...A simple solution would be to drop this initial check, and always take
the lock. This is likely to cause performance issues. A middle ground
is to convert the spinlock to a rwlock, and only take the read lock
on the fast path. If the check fails at that point, drop it and
acquire the write lock, rechecking the condition.This ensures that the above scenario doesn't occur.
Cc: stable@vger.kernel.org
Reported-by: Mark Rutland
Tested-by: Shannon Zhao
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
24 Apr, 2018
1 commit
-
commit 7d8b44c54e0c7c8f688e3a07f17e6083f849f01f upstream.
vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting
a given vcpu. We allocate the array containing the intids before taking
the lpi_list_lock, which means we can have an array size that is not
equal to the number of LPIs.This is particularly obvious when looking at the path coming from
vgic_enable_lpis, which is not a command, and thus can run in parallel
with commands:vcpu 0: vcpu 1:
vgic_enable_lpis
its_sync_lpi_pending_table
vgic_copy_lpi_list
intids = kmalloc_array(irq_count)
MAPI(lpi targeting vcpu 0)
list_for_each_entry(lpi_list_head)
intids[i++] = irq->intid;At that stage, we will happily overrun the intids array. Boo. An easy
fix is is to break once the array is full. The MAPI command will update
the config anyway, and we won't miss a thing. We also make sure that
lpi_list_count is read exactly once, so that further updates of that
value will not affect the array bound check.Cc: stable@vger.kernel.org
Fixes: ccb1d791ab9e ("KVM: arm64: vgic-its: Fix pending table sync")
Reviewed-by: Andre Przywara
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
21 Mar, 2018
3 commits
-
commit 16ca6a607d84bef0129698d8d808f501afd08d43 upstream.
The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenario where the physical CPUs are severely overcomitted.But as we now only inject a single instance of a multi-source SGI per
vcpu entry, we may delay those interrupts for longer than strictly
necessary, and run the risk of injecting lower priority interrupts
in the meantime.In order to address this, we adopt a three stage strategy:
- If we encounter a multi-source SGI in the AP list while computing
its depth, we force the list to be sorted
- When populating the LRs, we prevent the injection of any interrupt
of lower priority than that of the first multi-source SGI we've
injected.
- Finally, the injection of a multi-source SGI triggers the request
of a maintenance interrupt when there will be no pending interrupt
in the LRs (HCR_NPIE).At the point where the last pending interrupt in the LRs switches
from Pending to Active, the maintenance interrupt will be delivered,
allowing us to add the remaining SGIs using the same process.Cc: stable@vger.kernel.org
Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
commit 27e91ad1e746e341ca2312f29bccb9736be7b476 upstream.
On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.Cc: stable@vger.kernel.org
Fixes: f68d2b1b73cc ("arm64: KVM: Implement vgic-v3 save/restore")
Acked-by: Christoffer Dall
Reviewed-by: Andre Przywara
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman -
commit 76600428c3677659e3c3633bb4f2ea302220a275 upstream.
On my GICv3 system, the following is printed to the kernel log at boot:
kvm [1]: 8-bit VMID
kvm [1]: IDMAP page: d20e35000
kvm [1]: HYP VA range: 800000000000:ffffffffffff
kvm [1]: vgic-v2@2c020000
kvm [1]: GIC system register CPU interface enabled
kvm [1]: vgic interrupt IRQ1
kvm [1]: virtual timer IRQ4
kvm [1]: Hyp mode initialized successfullyThe KVM IDMAP is a mapping of a statically allocated kernel structure,
and so printing its physical address leaks the physical placement of
the kernel when physical KASLR in effect. So change the kvm_info() to
kvm_debug() to remove it from the log output.While at it, trim the output a bit more: IRQ numbers can be found in
/proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
informational either.Cc:
Acked-by: Will Deacon
Acked-by: Christoffer Dall
Signed-off-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
25 Feb, 2018
2 commits
-
[ Upstream commit 7465894e90e5a47e0e52aa5f1f708653fc40020f ]
vgic_set_owner acquires the irq lock without disabling interrupts,
resulting in a lockdep splat (an interrupt could fire and result
in the same lock being taken if the same virtual irq is to be
injected).In practice, it is almost impossible to trigger this bug, but
better safe than sorry. Convert the lock acquisition to a
spin_lock_irqsave() and keep lockdep happy.Reported-by: James Morse
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 58d0d19a204604ca0da26058828a53558b265da3 ]
Since it is perfectly legal to run the kernel at EL1, it is not
actually an error if HYP mode is not available when attempting to
initialize KVM, given that KVM support cannot be built as a module.
So demote the kvm_err() to kvm_info(), which prevents the error from
appearing on an otherwise 'quiet' console.Acked-by: Marc Zyngier
Acked-by: Christoffer Dall
Signed-off-by: Ard Biesheuvel
Signed-off-by: Christoffer Dall
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
17 Feb, 2018
9 commits
-
commit 58d6b15e9da5042a99c9c30ad725792e4569150e upstream.
cpu_pm_enter() calls the pm notifier chain with CPU_PM_ENTER, then if
there is a failure: CPU_PM_ENTER_FAILED.When KVM receives CPU_PM_ENTER it calls cpu_hyp_reset() which will
return us to the hyp-stub. If we subsequently get a CPU_PM_ENTER_FAILED,
KVM does nothing, leaving the CPU running with the hyp-stub, at odds
with kvm_arm_hardware_enabled.Add CPU_PM_ENTER_FAILED as a fallthrough for CPU_PM_EXIT, this reloads
KVM based on kvm_arm_hardware_enabled. This is safe even if CPU_PM_ENTER
never gets as far as KVM, as cpu_hyp_reinit() calls cpu_hyp_reset()
to make sure the hyp-stub is loaded before reloading KVM.Fixes: 67f691976662 ("arm64: kvm: allows kvm cpu hotplug")
CC: Lorenzo Pieralisi
Reviewed-by: Christoffer Dall
Signed-off-by: James Morse
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman -
Commit 6167ec5c9145 upstream.
A new feature of SMCCC 1.1 is that it offers firmware-based CPU
workarounds. In particular, SMCCC_ARCH_WORKAROUND_1 provides
BP hardening for CVE-2017-5715.If the host has some mitigation for this issue, report that
we deal with it using SMCCC_ARCH_WORKAROUND_1, as we apply the
host workaround on every guest exit.Tested-by: Ard Biesheuvel
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit a4097b351118 upstream.
We're about to need kvm_psci_version in HYP too. So let's turn it
into a static inline, and pass the kvm structure as a second
parameter (so that HYP can do a kern_hyp_va on it).Tested-by: Ard Biesheuvel
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit 09e6be12effd upstream.
The new SMC Calling Convention (v1.1) allows for a reduced overhead
when calling into the firmware, and provides a new feature discovery
mechanism.Make it visible to KVM guests.
Tested-by: Ard Biesheuvel
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit 58e0b2239a4d upstream.
PSCI 1.0 can be trivially implemented by providing the FEATURES
call on top of PSCI 0.2 and returning 1.0 as the PSCI version.We happily ignore everything else, as they are either optional or
are clarifications that do not require any additional change.PSCI 1.0 is now the default until we decide to add a userspace
selection API.Reviewed-by: Christoffer Dall
Tested-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit 84684fecd7ea upstream.
Instead of open coding the accesses to the various registers,
let's add explicit SMCCC accessors.Reviewed-by: Christoffer Dall
Tested-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit d0a144f12a7c upstream.
As we're about to trigger a PSCI version explosion, it doesn't
hurt to introduce a PSCI_VERSION helper that is going to be
used everywhere.Reviewed-by: Christoffer Dall
Tested-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit 1a2fb94e6a77 upstream.
As we're about to update the PSCI support, and because I'm lazy,
let's move the PSCI include file to include/kvm so that both
ARM architectures can find it.Acked-by: Christoffer Dall
Tested-by: Ard Biesheuvel
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Signed-off-by: Will Deacon
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman -
Commit 6840bdd73d07 upstream.
Now that we have per-CPU vectors, let's plug then in the KVM/arm64 code.
Signed-off-by: Marc Zyngier
Signed-off-by: Will Deacon
Signed-off-by: Catalin Marinas
Signed-off-by: Ard Biesheuvel
Signed-off-by: Greg Kroah-Hartman
04 Feb, 2018
1 commit
-
[ Upstream commit 20b7035c66bacc909ae3ffe92c1a1ea7db99fe4f ]
KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.Signed-off-by: Jan H. Schönherr
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
24 Jan, 2018
1 commit
-
commit c507babf10ead4d5c8cca704539b170752a8ac84 upstream.
KVM only supports PMD hugepages at stage 2 but doesn't actually check
that the provided hugepage memory pagesize is PMD_SIZE before populating
stage 2 entries.In cases where the backing hugepage size is smaller than PMD_SIZE (such
as when using contiguous hugepages), KVM can end up creating stage 2
mappings that extend beyond the supplied memory.Fix this by checking for the pagesize of userspace vma before creating
PMD hugepage at stage 2.Fixes: 66b3923a1a0f77a ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Punit Agrawal
Cc: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman
17 Jan, 2018
1 commit
-
commit e39d200fa5bf5b94a0948db0dae44c1b73b84a56 upstream.
Reported by syzkaller:
BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm]
Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #18
Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
Call Trace:
dump_stack+0xab/0xe1
print_address_description+0x6b/0x290
kasan_report+0x28a/0x370
write_mmio+0x11e/0x270 [kvm]
emulator_read_write_onepage+0x311/0x600 [kvm]
emulator_read_write+0xef/0x240 [kvm]
emulator_fix_hypercall+0x105/0x150 [kvm]
em_hypercall+0x2b/0x80 [kvm]
x86_emulate_insn+0x2b1/0x1640 [kvm]
x86_emulate_instruction+0x39a/0xb90 [kvm]
handle_exception+0x1b4/0x4d0 [kvm_intel]
vcpu_enter_guest+0x15a0/0x2640 [kvm]
kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm]
kvm_vcpu_ioctl+0x479/0x880 [kvm]
do_vfs_ioctl+0x142/0x9a0
SyS_ioctl+0x74/0x80
entry_SYSCALL_64_fastpath+0x23/0x9aThe path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall)
to the guest memory, however, write_mmio tracepoint always prints 8 bytes
through *(u64 *)val since kvm splits the mmio access into 8 bytes. This
leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes
it by just accessing the bytes which we operate on.Before patch:
syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f
After patch:
syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f
Reported-by: Dmitry Vyukov
Reviewed-by: Darren Kenny
Reviewed-by: Marc Zyngier
Tested-by: Marc Zyngier
Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Marc Zyngier
Cc: Christoffer Dall
Signed-off-by: Wanpeng Li
Signed-off-by: Paolo Bonzini
Cc: Mathieu Desnoyers
Signed-off-by: Greg Kroah-Hartman
30 Dec, 2017
1 commit
-
commit 7839c672e58bf62da8f2f0197fefb442c02ba1dd upstream.
When we unmap the HYP memory, we try to be clever and unmap one
PGD at a time. If we start with a non-PGD aligned address and try
to unmap a whole PGD, things go horribly wrong in unmap_hyp_range
(addr and end can never match, and it all goes really badly as we
keep incrementing pgd and parse random memory as page tables...).The obvious fix is to let unmap_hyp_range do what it does best,
which is to iterate over a range.The size of the linear mapping, which begins at PAGE_OFFSET, can be
easily calculated by subtracting PAGE_OFFSET form high_memory, because
high_memory is defined as the linear map address of the last byte of
DRAM, plus one.The size of the vmalloc region is given trivially by VMALLOC_END -
VMALLOC_START.Reported-by: Andre Przywara
Tested-by: Andre Przywara
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman
17 Dec, 2017
1 commit
-
commit 64afe6e9eb4841f35317da4393de21a047a883b3 upstream.
The current pending table parsing code assumes that we keep the
previous read of the pending bits, but keep that variable in
the current block, making sure it is discarded on each loop.We end-up using whatever is on the stack. Who knows, it might
just be the right thing...Fixes: 33d3bc9556a7d ("KVM: arm64: vgic-its: Read initial LPI pending table")
Reported-by: AKASHI Takahiro
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman
14 Dec, 2017
2 commits
-
commit 686f294f2f1ae40705283dd413ca1e4c14f20f93 upstream.
We miss a test against NULL after allocation.
Fixes: 6d03a68f8054 ("KVM: arm64: vgic-its: Turn device_id validation into generic ID validation")
Reported-by: AKASHI Takahiro
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman -
commit ddb4b0102cb9cdd2398d98b3e1e024e08a2f4239 upstream.
The current pending table parsing code assumes that we keep the
previous read of the pending bits, but keep that variable in
the current block, making sure it is discarded on each loop.We end-up using whatever is on the stack. Who knows, it might
just be the right thing...Fixes: 280771252c1ba ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
Reported-by: AKASHI Takahiro
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman