Eric Lee / smarc-fsl-linux-kernel

11 Mar, 2018

1 commit

2e112f365 KVM: mmu: Fix overlap between public and private memslots

commit b28676bb8ae4569cced423dc2a88f7cb319d5379 upstream.

Reported by syzkaller:

pte_list_remove: ffff9714eb1f8078 0->BUG
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1157!
invalid opcode: 0000 [#1] SMP
RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
Call Trace:
drop_spte+0x83/0xb0 [kvm]
mmu_page_zap_pte+0xcc/0xe0 [kvm]
kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
__mmu_notifier_release+0x79/0x110
? __mmu_notifier_release+0x5/0x110
exit_mmap+0x15a/0x170
? do_exit+0x281/0xcb0
mmput+0x66/0x160
do_exit+0x2c9/0xcb0
? __context_tracking_exit.part.5+0x4a/0x150
do_group_exit+0x50/0xd0
SyS_exit_group+0x14/0x20
do_syscall_64+0x73/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25

The reason is that when a new memslot is created, there is no guarantee
that it does not overlap with private memslots. This can be triggered by
the following program:

/* The original header names were lost; these are the ones the program needs. */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

long r[16];

int main(void)
{
    void *p = valloc(0x4000);

    r[2] = open("/dev/kvm", 0);
    r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);

    /* Place the identity map page at guest physical address 0xf000. */
    uint64_t addr = 0xf000;
    ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
    r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
    ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
    ioctl(r[6], KVM_RUN, 0);
    ioctl(r[6], KVM_RUN, 0);

    /* Register a user memslot starting at the same guest physical address. */
    struct kvm_userspace_memory_region mr = {
        .slot = 0,
        .flags = KVM_MEM_LOG_DIRTY_PAGES,
        .guest_phys_addr = 0xf000,
        .memory_size = 0x4000,
        .userspace_addr = (uintptr_t) p
    };
    ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
    return 0;
}

This patch fixes the bug by not adding a new memslot if it
overlaps with private memslots.
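
A minimal sketch of the kind of overlap check this description implies. It assumes a simplified `struct slot` and a hypothetical `USER_MEM_SLOTS` boundary above which slot ids are private; none of these names come from the kernel source. The point it illustrates is that private slots take part in the overlap scan, and only the slot being updated is skipped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define USER_MEM_SLOTS 509 /* hypothetical: ids at or above this are private */

struct slot {
    int id;
    uint64_t base_gfn; /* first guest frame number covered by the slot */
    uint64_t npages;   /* 0 means the slot is unused */
};

/*
 * Returns true if [base_gfn, base_gfn + npages) overlaps any existing slot
 * other than the one being replaced (same id). Private slots are checked too.
 */
static bool slot_overlaps(const struct slot *slots, size_t n,
                          int id, uint64_t base_gfn, uint64_t npages)
{
    for (size_t i = 0; i < n; i++) {
        const struct slot *s = &slots[i];

        if (s->id == id || s->npages == 0)
            continue; /* skip only the slot being updated and empty slots */
        if (base_gfn < s->base_gfn + s->npages &&
            s->base_gfn < base_gfn + npages)
            return true;
    }
    return false;
}
```

With a private slot covering frame 0xf (guest physical 0xf000), a request for four pages starting at frame 0xf is reported as overlapping and can be refused.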

Reported-by: Dmitry Vyukov
Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Dmitry Vyukov
Cc: Eric Biggers
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li

Wanpeng Li
2018-03-11 23:21:29 +0800

25 Dec, 2017

2 commits

206e1621b kvm, mm: account kvm related kmem slabs to kmemcg

[ Upstream commit 46bea48ac241fe0b413805952dda74dd0c09ba8b ]

The kvm slabs can consume a significant amount of system memory,
and indeed in our production environment we have observed that
a lot of machines are spending a significant amount of memory that
cannot be left as system memory overhead. Also, the allocations
from these slabs can be triggered directly by user space applications
which have access to kvm, and thus a buggy application can leak
such memory. So these caches should be accounted to kmemcg.

Signed-off-by: Shakeel Butt
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Shakeel Butt
2017-12-25 21:23:43 +0800
808ed3bd9 KVM: pci-assign: do not map smm memory slot pages in vt-d page tables

[ Upstream commit 0292e169b2d9c8377a168778f0b16eadb1f578fd ]

Otherwise, VM memory pages are not put and are thus leaked in
kvm_iommu_unmap_memslots() when the VM is destroyed.

This is consistent with current vfio implementation.

Signed-off-by: herongguang
Signed-off-by: Paolo Bonzini
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Herongguang (Stephen)
2017-12-25 21:23:42 +0800

16 Dec, 2017

1 commit

9414a6309 KVM: arm/arm64: vgic-its: Preserve the previous read from the pending table

commit 64afe6e9eb4841f35317da4393de21a047a883b3 upstream.

The current pending table parsing code assumes that we keep the
previous read of the pending bits, but it declares that variable in
the current block, making sure it is discarded on each loop iteration.

We end-up using whatever is on the stack. Who knows, it might
just be the right thing...
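
A minimal sketch of the scoping issue, under the assumption that the parser rereads a byte of the pending table only when the byte offset changes. The names `read_table_byte` and `count_pending` are hypothetical and only model the pattern: the cached byte has to be declared outside the loop so that it survives from one iteration to the next.

```c
#include <stddef.h>
#include <stdint.h>

static unsigned int table_reads; /* counts simulated guest-memory reads */

static uint8_t read_table_byte(const uint8_t *table, size_t offset)
{
    table_reads++;
    return table[offset];
}

/* Counts how many of the (sorted) interrupt IDs have their pending bit set. */
static size_t count_pending(const uint8_t *table,
                            const unsigned int *intids, size_t n)
{
    size_t last_offset = (size_t)-1; /* no byte cached yet */
    uint8_t cached = 0;              /* declared outside the loop on purpose */
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        size_t offset = intids[i] / 8;

        if (offset != last_offset) { /* reread only when the byte changes */
            cached = read_table_byte(table, offset);
            last_offset = offset;
        }
        if (cached & (1u << (intids[i] % 8)))
            pending++;
    }
    return pending;
}
```

Had `cached` been declared inside the loop body, iterations that skip the reread would test an indeterminate value.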

Fixes: 33d3bc9556a7d ("KVM: arm64: vgic-its: Read initial LPI pending table")
Cc: stable@vger.kernel.org # 4.8
Reported-by: AKASHI Takahiro
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-12-16 23:25:47 +0800

14 Dec, 2017

5 commits

b1f71147a KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled

[ Upstream commit a5e1e6ca94a8cec51571fd62e3eaec269717969c ]

The ITS spec says that ITS commands are only processed when the ITS
is enabled (section 8.19.4, Enabled, bit[0]). Our emulation was not taking
this into account.
Fix this by checking the enabled state before handling CWRITER writes.

On the other hand that means that CWRITER could advance while the ITS
is disabled, and enabling it would need those commands to be processed.
Fix this case as well by refactoring actual command processing and
calling this from both the GITS_CWRITER and GITS_CTLR handlers.
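
The refactoring described above could be modelled roughly as follows. This is a simplified sketch with a hypothetical `struct its_model`; it ignores queue wrap-around and real command decoding, and only shows one processing helper shared by the two register write handlers.

```c
#include <stdbool.h>
#include <stdint.h>

struct its_model {
    bool enabled;
    uint32_t creadr;      /* next command to process */
    uint32_t cwriter;     /* one past the last queued command */
    unsigned int handled; /* number of commands processed so far */
};

/* Shared by both register handlers: drain the queue only while enabled. */
static void its_process_commands(struct its_model *its)
{
    if (!its->enabled)
        return;
    while (its->creadr != its->cwriter) {
        its->creadr++;  /* a real ITS would decode and execute here */
        its->handled++;
    }
}

static void its_write_cwriter(struct its_model *its, uint32_t value)
{
    its->cwriter = value;      /* the write pointer always advances */
    its_process_commands(its);
}

static void its_write_ctlr(struct its_model *its, bool enable)
{
    its->enabled = enable;
    its_process_commands(its); /* catch up on commands queued while disabled */
}
```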

Reviewed-by: Eric Auger
Reviewed-by: Christoffer Dall
Signed-off-by: Andre Przywara
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Andre Przywara
2017-12-14 16:28:19 +0800
7df3dbef3 KVM: arm/arm64: vgic-its: Check result of allocation before use

commit 686f294f2f1ae40705283dd413ca1e4c14f20f93 upstream.

We miss a test against NULL after allocation.

Fixes: 6d03a68f8054 ("KVM: arm64: vgic-its: Turn device_id validation into generic ID validation")
Reported-by: AKASHI Takahiro
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-12-14 16:28:14 +0800
42c3f4c55 KVM: arm/arm64: vgic-irqfd: Fix MSI entry allocation

commit 150009e2c70cc3c6e97f00e7595055765d32fb85 upstream.

Using the size of the structure we're allocating is a good idea
and avoids any surprise... In this case, we're happily confusing
kvm_kernel_irq_routing_entry and kvm_irq_routing_entry...
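
As an illustration of the idiom, here is a userspace sketch with two hypothetical stand-in structures (they are not the kernel's definitions). Sizing the allocation from the pointer being assigned, `sizeof(*e)`, keeps the size tied to the declared type, so it cannot silently pick up a similarly named structure.

```c
#include <stdlib.h>

/* Hypothetical stand-ins: the in-kernel entry is larger than the user-visible one. */
struct user_entry   { unsigned int gsi; unsigned int type; };
struct kernel_entry { unsigned int gsi; unsigned int type; void *handler; void *link; };

/* sizeof(*e) follows the pointer's type, so it cannot drift from the declaration. */
static struct kernel_entry *alloc_kernel_entries(size_t n)
{
    struct kernel_entry *e = calloc(n, sizeof(*e));

    return e; /* the caller must check for NULL and free() the result */
}
```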

Fixes: 95b110ab9a09 ("KVM: arm/arm64: Enable irqchip routing")
Reported-by: AKASHI Takahiro
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-12-14 16:28:14 +0800
cf6668d57 KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion

commit fc396e066318c0a02208c1d3f0b62950a7714999 upstream.

We are incorrectly rearranging 32-bit words inside a 64-bit typed value
for big endian systems, which would result in never marking a virtual
interrupt as inactive on big endian systems (assuming 32 or fewer LRs on
the hardware). Fix this by not doing any word order manipulation for
the typed values.

Acked-by: Christoffer Dall
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Christoffer Dall
2017-12-14 16:28:14 +0800
9cf0eaf88 KVM: x86: fix APIC page invalidation

commit b1394e745b9453dcb5b0671c205b770e87dedb87 upstream.

Implementation of the unpinned APIC page didn't update the VMCS address
cache when invalidation was done through range mmu notifiers.
This became a problem when the page notifier was removed.

Re-introduce the arch-specific helper and call it from ...range_start.

Reported-by: Fabian Grünbichler
Fixes: 38b9917350cb ("kvm: vmx: Implement set_apic_access_page_addr")
Fixes: 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic v2")
Reviewed-by: Paolo Bonzini
Reviewed-by: Andrea Arcangeli
Tested-by: Wanpeng Li
Tested-by: Fabian Grünbichler
Signed-off-by: Radim Krčmář
Signed-off-by: Greg Kroah-Hartman

Radim Krčmář
2017-12-14 16:28:12 +0800

10 Dec, 2017

1 commit

ee01c59bf KVM: arm/arm64: Fix occasional warning from the timer work function

[ Upstream commit 63e41226afc3f7a044b70325566fa86ac3142538 ]

When a VCPU blocks (WFI) and has programmed the vtimer, we program a
soft timer to expire in the future to wake up the vcpu thread when
appropriate. Because such a wake up involves a vcpu kick, and the
timer expire function can get called from interrupt context, and the
kick may sleep, we have to schedule the kick in the work function.

The work function currently has a warning that gets raised if it turns
out that the timer shouldn't fire when it's run, which was added because
the idea was that in that case the work should never have been cancelled.

However, it turns out that this whole thing is racy and we can get
spurious warnings. The problem is that we clear the armed flag in the
work function, which may run in parallel with the
kvm_timer_unschedule->timer_disarm() call. This results in a possible
situation where the timer_disarm() call does not call
cancel_work_sync(), which effectively synchronizes the completion of the
work function with running the VCPU. As a result, the VCPU thread
proceeds before the work function completes, causing changes to the
timer state such that kvm_timer_should_fire(vcpu) returns false in the
work function.

All we do in the work function is to kick the VCPU, and an occasional
rare extra kick never harmed anyone. Since the race above is extremely
rare, we don't bother checking if the race happens but simply remove the
check and the clearing of the armed flag from the work function.

Reported-by: Matthias Brugger
Reviewed-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Christoffer Dall
2017-12-10 05:01:52 +0800

28 Jul, 2017

1 commit

8f9dec0c2 vfio: New external user group/file match

commit 5d6dee80a1e94cc284d03e06d930e60e8d3ecf7d upstream.

At the point where the kvm-vfio pseudo device wants to release its
vfio group reference, we can't always acquire a new reference to make
that happen. The group can be in a state where we wouldn't allow a
new reference to be added. This new helper function allows a caller
to match a file to a group to facilitate this. Given a file and
group, report if they match. Thus the caller needs to already have a
group reference to match to the file. This allows the deletion of a
group without acquiring a new reference.

Signed-off-by: Alex Williamson
Reviewed-by: Eric Auger
Reviewed-by: Paolo Bonzini
Tested-by: Eric Auger
Signed-off-by: Greg Kroah-Hartman

Alex Williamson
2017-07-28 06:08:03 +0800

14 Jun, 2017

2 commits

3e7a76b29 KVM: arm/arm64: vgic-v2: Do not use Active+Pending state for a HW interrupt

commit ddf42d068f8802de122bb7efdfcb3179336053f1 upstream.

When an interrupt is injected with the HW bit set (indicating that
deactivation should be propagated to the physical distributor),
special care must be taken so that we never mark the corresponding
LR with the Active+Pending state (as the pending state is kept in
the physical distributor).
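
A small sketch of the rule, using hypothetical bit positions rather than the real list register layout. When the HW bit is set and the interrupt is active, the pending bit is dropped from the value written to the LR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for illustration, not the real LR layout. */
#define LR_PENDING (1u << 0)
#define LR_ACTIVE  (1u << 1)
#define LR_HW      (1u << 2)

static uint32_t lr_state(bool pending, bool active, bool hw)
{
    uint32_t val = 0;

    if (pending)
        val |= LR_PENDING;
    if (active)
        val |= LR_ACTIVE;
    if (hw) {
        val |= LR_HW;
        /* The pending state lives in the physical distributor. */
        if (active)
            val &= ~LR_PENDING;
    }
    return val;
}
```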

Cc: stable@vger.kernel.org
Fixes: 140b086dd197 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
Signed-off-by: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-06-14 21:05:56 +0800
2a5c08a4d KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interrupt

commit 3d6e77ad1489650afa20da92bb589c8778baa8da upstream.

When an interrupt is injected with the HW bit set (indicating that
deactivation should be propagated to the physical distributor),
special care must be taken so that we never mark the corresponding
LR with the Active+Pending state (as the pending state is kept in
the physical distributor).

Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
Signed-off-by: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Christoffer Dall
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-06-14 21:05:56 +0800

08 Apr, 2017

2 commits

1563625c7 KVM: kvm_io_bus_unregister_dev() should never fail

commit 90db10434b163e46da413d34db8d0e77404cc645 upstream.

No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus and
will be used at least once more when the iobus gets torn down on
kvm_destroy_vm(), leading to use-after-free errors.

There is nothing the callers could do, except retrying over and over
again.

So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).

Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
Reported-by: Dmitry Vyukov
Reviewed-by: Cornelia Huck
Signed-off-by: David Hildenbrand
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman

David Hildenbrand
2017-04-08 15:30:34 +0800
ef46a13b9 KVM: x86: clear bus pointer when destroyed

commit df630b8c1e851b5e265dc2ca9c87222e342c093b upstream.

When releasing the bus, let's clear the bus pointers to mark it out. If
any further device unregister happens on this bus, we know that we're
done if we found the bus being released already.

Signed-off-by: Peter Xu
Signed-off-by: Radim Krčmář
Signed-off-by: Greg Kroah-Hartman

Peter Xu
2017-04-08 15:30:34 +0800

18 Mar, 2017

1 commit

d29e6215e KVM: arm/arm64: Let vcpu thread modify its own active state

commit 370a0ec1819990f8e2a93df7cc9c0146980ed45f upstream.

Currently, if a vcpu thread tries to change the active state of an
interrupt which is already on the same vcpu's AP list, it will loop
forever. Since the VGIC mmio handler is called after a vcpu has
already synced back the LR state to the struct vgic_irq, we can just
let it proceed safely.

Reviewed-by: Marc Zyngier
Signed-off-by: Jintack Lim
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman

Jintack Lim
2017-03-18 19:14:34 +0800

12 Mar, 2017

1 commit

d408d23ad KVM: arm/arm64: vgic: Stop injecting the MSI occurrence twice

commit 0bdbf3b071986ba80731203683cf623d5c0cacb1 upstream.

The IRQFD framework calls the architecture dependent function
twice if the corresponding GSI type is edge triggered. For ARM,
the function kvm_set_msi() is getting called twice whenever the
IRQFD receives the event signal. The rest of the code path then
tries to inject the MSI without any validation checks. There is no
need to call vgic_its_inject_msi() a second time; skipping it avoids
unnecessary overhead in the IRQ queue logic and also avoids the
possibility of the VM seeing the MSI twice.

The fix is simple: return -1 if the 'level' argument is zero.
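
A minimal sketch of that guard, assuming a hypothetical `set_msi()` handler that the framework calls once with level 1 and once with level 0 for an edge-triggered GSI. The counter stands in for the real injection.

```c
static unsigned int injected; /* counts simulated MSI injections */

/* Hypothetical handler: called once with level 1 and once with level 0. */
static int set_msi(int level)
{
    if (!level)
        return -1; /* second (de-assert) call: nothing to inject */
    injected++;    /* stands in for the real injection */
    return 0;
}
```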

Reviewed-by: Eric Auger
Reviewed-by: Christoffer Dall
Signed-off-by: Shanker Donthineni
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman

Shanker Donthineni
2017-03-12 13:41:48 +0800

26 Jan, 2017

1 commit

26c4d513b KVM: arm/arm64: vgic: Fix deadlock on error handling

commit 1193e6aeecb36c74c48c7cd0f641acbbed9ddeef upstream.

Dmitry Vyukov reported that the syzkaller fuzzer triggered a
deadlock in the vgic setup code when an error was detected, as
the cleanup code tries to take a lock that is already held by
the setup code.

The fix is to avoid retaking the lock when cleaning up, by
telling the cleanup function that we already hold it.

Reported-by: Dmitry Vyukov
Reviewed-by: Christoffer Dall
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman

Marc Zyngier
2017-01-26 15:24:39 +0800

20 Jan, 2017

1 commit

7caf473f9 KVM: eventfd: fix NULL deref irqbypass consumer

commit 4f3dbdf47e150016aacd734e663347fcaa768303 upstream.

Reported by syzkaller:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
PGD 0

Oops: 0002 [#1] SMP
CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
Call Trace:
irqfd_shutdown+0x66/0xa0 [kvm]
process_one_work+0x16b/0x480
worker_thread+0x4b/0x500
kthread+0x101/0x140
? process_one_work+0x480/0x480
? kthread_create_on_node+0x60/0x60
ret_from_fork+0x25/0x30
RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
CR2: 0000000000000008

The syzkaller folks reported a NULL pointer dereference caused by
unregistering a consumer whose registration had previously failed.
syzkaller occasionally creates two VMs with the same eventfd, so the
second VM fails to register an irqbypass consumer. The irqfd is then
marked inactive and a work item is queued to shut down the irqfd and
unregister the irqbypass consumer when the eventfd is closed. However,
the second consumer has been initialized even though its registration
failed. So its token (the same as the first VM's) is used to unregister
the consumer through the workqueue, the consumer of the first VM is
found and unregistered, and a NULL dereference then occurs on the path
that deletes the consumer from the consumers list.

This patch fixes it by making irq_bypass_register/unregister_consumer()
look for the consumer entry based on the consumer pointer itself instead
of token matching.
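
The difference between the two lookups can be sketched with a simplified singly linked list. The `struct consumer` below is a hypothetical stand-in, not the irqbypass structure. Because two consumers may carry the same token, only a pointer comparison identifies the object that was actually registered.

```c
#include <stddef.h>

struct consumer {
    void *token;           /* may be shared by two consumers */
    struct consumer *next;
};

/* Unlinks 'target' only if that exact object is on the list; returns 1 if removed. */
static int unregister_consumer(struct consumer **head, struct consumer *target)
{
    for (struct consumer **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == target) { /* compare the pointer, not the token */
            *pp = target->next;
            target->next = NULL;
            return 1;
        }
    }
    return 0; /* never registered: leave the list untouched */
}
```

A consumer that shares a token with a registered one but was never linked is simply not found, so the first VM's entry stays in place.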

Reported-by: Dmitry Vyukov
Suggested-by: Alex Williamson
Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Dmitry Vyukov
Cc: Alex Williamson
Signed-off-by: Wanpeng Li
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman

Wanpeng Li
2017-01-20 03:17:59 +0800

01 Dec, 2016

2 commits

a0f1d21c1 KVM: use after free in kvm_ioctl_create_device()

We should move the ops->destroy(dev) after the list_del(&dev->vm_node)
so that we don't use "dev" after freeing it.
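
The ordering can be illustrated in userspace with a hypothetical doubly linked `struct device`; this is an analogy for list_del() followed by ops->destroy(), not the kernel code. Unlinking reads and writes the node, so it has to happen before the node is freed.

```c
#include <stdlib.h>

struct device {
    struct device *prev, *next; /* circular list links */
};

static void list_unlink(struct device *dev)
{
    dev->prev->next = dev->next;
    dev->next->prev = dev->prev;
}

static void destroy_device(struct device *dev)
{
    free(dev);
}

static void remove_device(struct device *dev)
{
    list_unlink(dev);    /* touches dev, so it must come first */
    destroy_device(dev); /* dev is invalid after this call */
}
```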

Fixes: a28ebea2adc4 ("KVM: Protect device ops->create and list_add with kvm->lock")
Signed-off-by: Dan Carpenter
Reviewed-by: David Hildenbrand
Signed-off-by: Radim Krčmář

Dan Carpenter
2016-12-01 23:10:50 +0800
0f4828a1d Merge tag 'kvm-arm-for-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/ARM updates for v4.9-rc7

- Do not call kvm_notify_acked for PPIs

Radim Krčmář
2016-12-01 21:56:34 +0800

24 Nov, 2016

1 commit

8ca18eec2 KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs

When we inject a level triggered interrupt (and unless it
is backed by the physical distributor - timer style), we request
a maintenance interrupt. Part of the processing for that interrupt
is to feed to the rest of KVM (and to the eventfd subsystem) the
information that the interrupt has been EOIed.

But that notification only makes sense for SPIs, and not PPIs
(such as the PMU interrupt). Skip over the notification if
the interrupt is not an SPI.
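
A minimal sketch of the check, assuming the usual GIC interrupt ID ranges (SGIs 0-15, PPIs 16-31, SPIs 32-1019). The `handle_eoi()` helper and its counter are hypothetical stand-ins for the notification path.

```c
#include <stdbool.h>
#include <stdint.h>

/* GIC interrupt ID ranges: SGIs 0-15, PPIs 16-31, SPIs 32-1019. */
#define FIRST_SPI 32u
#define LAST_SPI  1019u

static bool is_spi(uint32_t intid)
{
    return intid >= FIRST_SPI && intid <= LAST_SPI;
}

static unsigned int eoi_notifications; /* counts simulated notifications */

static void handle_eoi(uint32_t intid)
{
    if (!is_spi(intid))
        return;          /* PPIs and SGIs have no one to notify */
    eoi_notifications++; /* stands in for the real notification */
}
```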

Cc: stable@vger.kernel.org # 4.7+
Fixes: 140b086dd197 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
Reported-by: Catalin Marinas
Tested-by: Catalin Marinas
Acked-by: Christoffer Dall
Signed-off-by: Marc Zyngier

Marc Zyngier
2016-11-24 21:12:07 +0800

20 Nov, 2016

2 commits

22583f0d9 KVM: async_pf: avoid recursive flushing of work items

This was reported by syzkaller:

[ INFO: possible recursive locking detected ]
4.9.0-rc4+ #49 Not tainted
---------------------------------------------
kworker/2:1/5658 is trying to acquire lock:
([ 1644.769018] (&work->work)
[< inline >] list_empty include/linux/compiler.h:243
[] flush_work+0x0/0x660 kernel/workqueue.c:1511

but task is already holding lock:
([ 1644.769018] (&work->work)
[] process_one_work+0x94b/0x1900 kernel/workqueue.c:2093

stack backtrace:
CPU: 2 PID: 5658 Comm: kworker/2:1 Not tainted 4.9.0-rc4+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: events async_pf_execute
ffff8800676ff630 ffffffff81c2e46b ffffffff8485b930 ffff88006b1fc480
0000000000000000 ffffffff8485b930 ffff8800676ff7e0 ffffffff81339b27
ffff8800676ff7e8 0000000000000046 ffff88006b1fcce8 ffff88006b1fccf0
Call Trace:
...
[] flush_work+0x93/0x660 kernel/workqueue.c:2846
[] __cancel_work_timer+0x17a/0x410 kernel/workqueue.c:2916
[] cancel_work_sync+0x17/0x20 kernel/workqueue.c:2951
[] kvm_clear_async_pf_completion_queue+0xd7/0x400 virt/kvm/async_pf.c:126
[< inline >] kvm_free_vcpus arch/x86/kvm/x86.c:7841
[] kvm_arch_destroy_vm+0x23d/0x620 arch/x86/kvm/x86.c:7946
[< inline >] kvm_destroy_vm virt/kvm/kvm_main.c:731
[] kvm_put_kvm+0x40e/0x790 virt/kvm/kvm_main.c:752
[] async_pf_execute+0x23d/0x4f0 virt/kvm/async_pf.c:111
[] process_one_work+0x9fc/0x1900 kernel/workqueue.c:2096
[] worker_thread+0xef/0x1480 kernel/workqueue.c:2230
[] kthread+0x244/0x2d0 kernel/kthread.c:209
[] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The reason is that kvm_put_kvm is causing the destruction of the VM, but
the page fault is still on the ->queue list. The ->queue list is owned
by the VCPU, not by the work items, so we cannot just add list_del to
the work item.

Instead, use work->vcpu to note async page faults that have been resolved
and will be processed through the done list. There is no need to flush
those.

Cc: Dmitry Vyukov
Signed-off-by: Paolo Bonzini
Signed-off-by: Radim Krčmář

Paolo Bonzini
2016-11-20 02:04:17 +0800
e5dbc4bf0 Merge tag 'kvm-arm-for-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/ARM updates for v4.9-rc6

- Fix handling of the 32bit cycle counter
- Fix cycle counter filtering

Radim Krčmář
2016-11-20 01:02:07 +0800

18 Nov, 2016

1 commit

b112c84a6 KVM: arm64: Fix the issues when guest PMCCFILTR is configured

KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
But this function can't deal with PMCCFILTR correctly because the evtCount
bits of PMCCFILTR, which are reserved as 0, conflict with the SW_INCR event
type of other PMXEVTYPER registers. To fix it, when eventsel == 0, this
function shouldn't return immediately; instead it needs to check further
if select_idx is ARMV8_PMU_CYCLE_IDX.

Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTR
blindly to attr.config. Instead it ought to convert the request to the
"cpu cycle" event type (i.e. 0x11).
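
The two rules above could be sketched as a single selection function. The helper name `pick_perf_event` and the value 31 for the cycle counter index are assumptions made for illustration; only the 0x11 "cpu cycle" event type is taken from the text.

```c
#include <stdint.h>

#define CYCLE_IDX        31   /* assumed index of the cycle counter */
#define EVENT_SW_INCR    0x00
#define EVENT_CPU_CYCLES 0x11 /* "cpu cycle" event type from the text */

/* Returns the event to hand to perf, or -1 if no perf event is needed. */
static int pick_perf_event(unsigned int select_idx, uint32_t eventsel)
{
    if (select_idx == CYCLE_IDX)
        return EVENT_CPU_CYCLES; /* ignore the reserved-zero event bits */
    if (eventsel == EVENT_SW_INCR)
        return -1;               /* software increment needs no perf event */
    return (int)eventsel;
}
```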

To support this patch and to prevent duplicated definitions, a limited
set of ARMv8 perf event types were relocated from perf_event.c to
asm/perf_event.h.

Cc: stable@vger.kernel.org # 4.6+
Acked-by: Will Deacon
Signed-off-by: Wei Huang
Signed-off-by: Marc Zyngier

Wei Huang
2016-11-18 17:06:58 +0800

11 Nov, 2016

1 commit

05d36a7df Merge tag 'kvm-arm-for-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM updates for v4.9-rc4

- Kick the vcpu when a pending interrupt becomes pending again
- Prevent access to invalid interrupt registers
- Invalidate TLBs when two vcpus from the same VM share a CPU

Paolo Bonzini
2016-11-11 18:13:36 +0800

05 Nov, 2016

3 commits

66cecb678 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
"One NULL pointer dereference, and two fixes for regressions introduced
during the merge window.

The rest are fixes for MIPS, s390 and nested VMX"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: Check memopp before dereference (CVE-2016-8630)
kvm: nVMX: VMCLEAR an active shadow VMCS after last use
KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
KVM: x86: fix wbinvd_dirty_mask use-after-free
kvm/x86: Show WRMSR data is in hex
kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
KVM: document lock orders
KVM: fix OOPS on flush_work
KVM: s390: Fix STHYI buffer alignment for diag224
KVM: MIPS: Precalculate MMIO load resume PC
KVM: MIPS: Make ERET handle ERL before EXL
KVM: MIPS: Fix lazy user ASID regenerate for SMP

Linus Torvalds
2016-11-05 04:08:05 +0800
d42c79701 KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs

In cases like IPI, we could be queueing an interrupt for a VCPU
that is already running and is not about to exit, because the
VCPU has entered the VM with the interrupt pending and would
not trap on EOI'ing that interrupt. This could result in delays
in interrupt deliveries or even loss of interrupts.
To guarantee prompt interrupt injection, here we have to try to
kick the VCPU.

Signed-off-by: Shih-Wei Li
Reviewed-by: Christoffer Dall
Signed-off-by: Marc Zyngier

Shih-Wei Li
2016-11-05 01:56:56 +0800
112b0b8f8 KVM: arm/arm64: vgic: Prevent access to invalid SPIs

In our VGIC implementation we limit the number of SPIs to a number
that the userland application told us. Accordingly we limit the
allocation of memory for virtual IRQs to that number.
However in our MMIO dispatcher we didn't check if we ever access an
IRQ beyond that limit, leading to out-of-bound accesses.
Add a test against the number of allocated SPIs in check_region().
Adjust the VGIC_ADDR_TO_INT macro to avoid an actual division, which
is not implemented on ARM(32).
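
A rough sketch of both points, with hypothetical helper names and a register layout simplified to "a power-of-two number of bits per interrupt". The offset-to-ID mapping uses shifts in place of a division, and the range check compares the resulting ID against the number of allocated interrupts.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_PRIVATE_IRQS 32u /* SGIs and PPIs precede the SPIs */

/*
 * Map a byte offset in a register bank to an interrupt ID, given log2 of the
 * number of bits each interrupt occupies (0 for 1 bit, 1 for 2, 3 for 8).
 */
static uint32_t offset_to_intid(uint32_t offset, unsigned int bits_log2)
{
    return (offset << 3) >> bits_log2; /* offset * 8 / bits, shifts only */
}

/* True if the access targets an interrupt that was actually allocated. */
static bool access_in_range(uint32_t offset, unsigned int bits_log2,
                            uint32_t nr_spis)
{
    return offset_to_intid(offset, bits_log2) < nr_spis + NR_PRIVATE_IRQS;
}
```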

[maz: cleaned-up original patch]

Cc: stable@vger.kernel.org
Reviewed-by: Christoffer Dall
Signed-off-by: Andre Przywara
Signed-off-by: Marc Zyngier

Andre Przywara
2016-11-05 01:56:54 +0800

26 Oct, 2016

1 commit

36343f6ea KVM: fix OOPS on flush_work

The conversion done by commit 3706feacd007 ("KVM: Remove deprecated
create_singlethread_workqueue") is broken. It flushes a single work
item &irqfd->shutdown instead of all of them, and even worse if there
is no irqfd on the list then you get a NULL pointer dereference.
Revert the virt/kvm/eventfd.c part of that patch; to avoid the
deprecated function, just allocate our own workqueue---it does
not even have to be unbound---with alloc_workqueue.

Fixes: 3706feacd007
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2016-10-26 20:06:51 +0800

25 Oct, 2016

1 commit

0d7317598 mm: unexport __get_user_pages()

This patch unexports the low-level __get_user_pages() function.

Recent refactoring of the get_user_pages* functions allows flags to be
passed through get_user_pages() which eliminates the need for access to
this function from its one user, kvm.

We can see that the two calls to get_user_pages() which replace
__get_user_pages() in kvm_main.c are equivalent by examining their call
stacks:

get_user_page_nowait():
get_user_pages(start, 1, flags, page, NULL)
__get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
false, flags | FOLL_TOUCH)
__get_user_pages(current, current->mm, start, 1,
flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)

check_user_page_hwpoison():
get_user_pages(addr, 1, flags, NULL, NULL)
__get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
false, flags | FOLL_TOUCH)
__get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
NULL, NULL)

Signed-off-by: Lorenzo Stoakes
Acked-by: Paolo Bonzini
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds

Lorenzo Stoakes
2016-10-25 10:13:20 +0800

19 Oct, 2016

1 commit

d4944b0ec mm: remove write/force parameters from __get_user_pages_unlocked()

This removes the redundant 'write' and 'force' parameters from
__get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes
Acked-by: Paolo Bonzini
Reviewed-by: Jan Kara
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds

Lorenzo Stoakes
2016-10-19 05:13:37 +0800

29 Sep, 2016

1 commit

45ca877ad Merge tag 'kvm-arm-for-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into next

KVM/ARM Changes for v4.9

- Various cleanups and removal of redundant code
- Two important fixes for not using an in-kernel irqchip
- A bit of optimizations
- Handle SError exceptions and present them to guests if appropriate
- Proxying of GICV access at EL2 if guest mappings are unsafe
- GICv3 on AArch32 on ARMv8
- Preparations for GICv3 save/restore, including ABI docs

Radim Krčmář
2016-09-29 22:01:51 +0800

28 Sep, 2016

2 commits

0099b7701 KVM: arm/arm64: vgic: Don't flush/sync without a working vgic

If the vgic hasn't been created and initialized, we shouldn't attempt to
look at its data structures or flush/sync anything to the GIC hardware.

This fixes an issue reported by Alexander Graf when using a userspace
irqchip.

Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Cc: stable@vger.kernel.org
Reported-by: Alexander Graf
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall

Christoffer Dall
2016-09-28 00:57:35 +0800
6fe407f2d KVM: arm64: Require in-kernel irqchip for PMU support

If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
irqchip, then we end up in a nasty path where we try to take an
uninitialized spinlock, which can lead to all sorts of breakages.

Luckily, QEMU always creates the VGIC before the PMU, so we can
establish this as ABI and check for the VGIC in the PMU init stage.
This can be relaxed at a later time if we want to support PMU with a
userspace irqchip.

Cc: stable@vger.kernel.org
Cc: Shannon Zhao
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall

Christoffer Dall
2016-09-28 00:57:07 +0800

22 Sep, 2016

5 commits

acda5430b ARM: KVM: Support vgic-v3

This patch allows building and using vgic-v3 in 32-bit mode.

Unfortunately, it cannot be split into several steps without extra
stubs to keep patches independent and bisectable. For instance,
virt/kvm/arm/vgic/vgic-v3.c uses functions from vgic-v3-sr.c, and handling
access to the GICv3 cpu interface from the guest requires vgic_v3.vgic_sre
to be already defined.

This is how support has been implemented:

* handle SGI requests from the guest

* report configured SRE on access to GICv3 cpu interface from the guest

* required vgic-v3 macros are provided via uapi.h

* static keys are used to select GIC backend

* to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
the static inlines

Acked-by: Marc Zyngier
Reviewed-by: Christoffer Dall
Signed-off-by: Vladimir Murzin
Signed-off-by: Christoffer Dall

Vladimir Murzin
2016-09-22 19:22:21 +0800
d7d0a11e4 KVM: arm: vgic: Support 64-bit data manipulation on 32-bit host systems

We have a couple of 64-bit registers defined in the GICv3 architecture, so
unsigned long accesses to these registers will only access a single
32-bit part of that register. On the other hand, these registers can't
be accessed as 64-bit with a single instruction like ldrd/strd or
ldmia/stmia if we run a 32-bit host, because KVM does not support
access to MMIO space done by these instructions.

It means that a 32-bit guest accesses these registers in 32-bit
chunks, so the only thing we need to do is to ensure that
extract_bytes() always takes 64-bit data.
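
A userspace sketch of what a 64-bit-safe byte extractor could look like. The signature here is an assumption rather than the kernel's definition. Taking the data as a 64-bit value means the upper half is still reachable when a 32-bit access targets it.

```c
#include <stdint.h>

/* Return 'num' bytes of 'data' starting at byte 'offset' (num from 1 to 8). */
static uint64_t extract_bytes(uint64_t data, unsigned int offset,
                              unsigned int num)
{
    uint64_t mask = (num >= 8) ? UINT64_MAX
                               : ((UINT64_C(1) << (num * 8)) - 1);

    return (data >> (offset * 8)) & mask;
}
```

A guest reading a 64-bit register in two 32-bit chunks would call this once with offset 0 and once with offset 4.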

Acked-by: Marc Zyngier
Signed-off-by: Vladimir Murzin
Signed-off-by: Christoffer Dall

Vladimir Murzin
2016-09-22 19:21:59 +0800
e533a37f7 KVM: arm: vgic: Fix compiler warnings when built for 32-bit

Well, this patch is looking ahead of time, but we'll get the following
compiler warnings as soon as we introduce vgic-v3 to the 32-bit world:

CC arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.o
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_mmio_read_v3r_typer':
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:184:35: warning: left shift count >= width of type [-Wshift-count-overflow]
value = (mpidr & GENMASK(23, 0)) << 32;
^
In file included from ./include/linux/kernel.h:10:0,
from ./include/asm-generic/bug.h:13,
from ./arch/arm/include/asm/bug.h:59,
from ./include/linux/bug.h:4,
from ./include/linux/io.h:23,
from ./arch/arm/include/asm/arch_gicv3.h:23,
from ./include/linux/irqchip/arm-gic-v3.h:411,
from arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:14:
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_v3_dispatch_sgi':
./include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow]
#define BIT(nr) (1UL << (nr))
^
arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:614:20: note: in expansion of macro 'BIT'
broadcast = reg & BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
^
Let's fix them now.
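
One way such warnings are typically avoided is to widen the operand to 64 bits before shifting. The sketch below uses hypothetical helper names and takes the bit position as a parameter, so it is not a copy of the actual change.

```c
#include <stdint.h>

/* Place the low 24 bits of mpidr in bits 32 and up of a 64-bit value. */
static uint64_t typer_affinity(uint32_t mpidr)
{
    return ((uint64_t)(mpidr & 0x00ffffffu)) << 32; /* widen, then shift */
}

/* Test a bit that may sit above bit 31 of a 64-bit register value. */
static int routing_mode_set(uint64_t reg, unsigned int bit)
{
    return (reg & (UINT64_C(1) << bit)) != 0; /* 64-bit one, not 1UL */
}
```

On a 32-bit host `unsigned long` is 32 bits wide, which is why shifting it by 32 or more triggers -Wshift-count-overflow.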

Acked-by: Marc Zyngier
Signed-off-by: Vladimir Murzin
Signed-off-by: Christoffer Dall

Vladimir Murzin
2016-09-22 19:21:48 +0800
7a1ff7082 KVM: arm64: vgic-its: Introduce config option to guard ITS specific code

Until now, ITS code has been guarded by the KVM_ARM_VGIC_V3 config option,
which was introduced to hide everything specific to vgic-v3 from the 32-bit
world. We are going to support vgic-v3 in the 32-bit world and
KVM_ARM_VGIC_V3 will be gone, but we don't have support for ITS there yet
and we need to continue keeping ITS away.
Introduce a new config option to prevent ITS code from being built in
32-bit mode once support for vgic-v3 is done.

Signed-off-by: Vladimir Murzin
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall

Vladimir Murzin
2016-09-22 19:21:47 +0800
19f0ece43 arm64: KVM: Move vgic-v3 save/restore to virt/kvm/arm/hyp

So we can reuse the code under arch/arm

Signed-off-by: Vladimir Murzin
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall

Vladimir Murzin
2016-09-22 19:21:46 +0800