14 Dec, 2012
1 commit
-
Pull KVM updates from Marcelo Tosatti:
"Considerable KVM/PPC work, x86 kvmclock vsyscall support,
IA32_TSC_ADJUST MSR emulation, amongst others."

Fix up trivial conflict in kernel/sched/core.c due to cross-cpu
migration notifier added next to rq migration call-back.

* tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits)
KVM: emulator: fix real mode segment checks in address linearization
VMX: remove unneeded enable_unrestricted_guest check
KVM: VMX: fix DPL during entry to protected mode
x86/kexec: crash_vmclear_local_vmcss needs __rcu
kvm: Fix irqfd resampler list walk
KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump
x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary
KVM: MMU: optimize for set_spte
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
KVM: PPC: bookehv: Add guest computation mode for irq delivery
KVM: PPC: Make EPCR a valid field for booke64 and bookehv
KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
KVM: PPC: e500: Add emulation helper for getting instruction ea
KVM: PPC: bookehv64: Add support for interrupt handling
KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
KVM: PPC: booke: Fix get_tb() compile error on 64-bit
KVM: PPC: e500: Silence bogus GCC warning in tlb code
...
10 Dec, 2012
1 commit
-
* 'for-upstream' of https://github.com/agraf/linux-2.6: (28 commits)
KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
KVM: PPC: bookehv: Add guest computation mode for irq delivery
KVM: PPC: Make EPCR a valid field for booke64 and bookehv
KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
KVM: PPC: e500: Add emulation helper for getting instruction ea
KVM: PPC: bookehv64: Add support for interrupt handling
KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
KVM: PPC: booke: Fix get_tb() compile error on 64-bit
KVM: PPC: e500: Silence bogus GCC warning in tlb code
KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking
KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations
MAINTAINERS: Add git tree link for PPC KVM
KVM: PPC: Book3S PR: MSR_DE doesn't exist on Book 3S
KVM: PPC: Book3S PR: Fix VSX handling
KVM: PPC: Book3S PR: Emulate PURR, SPURR and DSCR registers
KVM: PPC: Book3S HV: Don't give the guest RW access to RO pages
KVM: PPC: Book3S HV: Report correct HPT entry index when reading HPT
...
08 Dec, 2012
1 commit
-
…/git/frederic/linux-dynticks into sched/core
Pull more cputime cleanups from Frederic Weisbecker:
* Get rid of underscores polluting the vtime namespace
* Consolidate context switch and tick handling
* Improve debuggability by detecting irq unsafe callers
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
07 Dec, 2012
1 commit
-
Pick up the autogroups fix and other fixes.
Signed-off-by: Ingo Molnar
06 Dec, 2012
1 commit
-
The current eventfd code assumes that when we have eventfd, we also have
irqfd for in-kernel interrupt delivery. This is not necessarily true. On
PPC we don't have an in-kernel irqchip yet, but we can still easily
support eventfd.

Signed-off-by: Alexander Graf
05 Dec, 2012
1 commit
-
Add an API to inject IRQ from atomic context.
Return EWOULDBLOCK if impossible (e.g. for multicast).
Only MSI is supported ATM.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Gleb Natapov
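The calling pattern this commit message implies (try the atomic path first, fall back to a sleepable path on EWOULDBLOCK) can be sketched in userspace C as follows. This is only an illustration: struct irq_route, set_irq_inatomic() and inject_or_defer() are hypothetical stand-ins, not the kernel's types or functions, and the "deferred" counter merely represents handing the work to a context that may sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical route description; not the kernel's routing entry. */
enum route_type { ROUTE_MSI, ROUTE_IRQCHIP };

struct irq_route {
    enum route_type type;
    bool multicast;   /* more than one destination */
    int delivered;    /* interrupts delivered on the fast path */
};

/* Fast path: only unicast MSI can be delivered without sleeping. */
static int set_irq_inatomic(struct irq_route *r)
{
    assert(r != NULL);
    if (r->type != ROUTE_MSI || r->multicast)
        return -EWOULDBLOCK;
    r->delivered++;
    return 0;
}

/* Caller in atomic context: defer whatever the fast path refuses. */
static bool inject_or_defer(struct irq_route *r, int *deferred)
{
    if (set_irq_inatomic(r) == -EWOULDBLOCK) {
        (*deferred)++;   /* stand-in for queueing sleepable work */
        return false;
    }
    return true;
}
```

A unicast MSI route is delivered immediately, while a multicast or non-MSI route is counted as deferred and left for the slow path.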
28 Nov, 2012
2 commits
-
TSC initialization will soon make use of online_vcpus.
Signed-off-by: Marcelo Tosatti
-
KVM added a global variable to guarantee monotonicity in the guest.
One of the reasons for that is that the time between

1. ktime_get_ts(&timespec);
2. rdtscll(tsc);

is variable. That is, given a host with stable TSC, suppose that
two VCPUs read the same time via ktime_get_ts() above.

The time required to execute 2. is not the same on those two instances
executing in different VCPUs (cache misses, interrupts...).

If the TSC value that is used by the host to interpolate when
calculating the monotonic time is the same value used to calculate
the tsc_timestamp value stored in the pvclock data structure, and
a single tuple is visible to all
vcpus simultaneously, this problem disappears. See comment on top
of pvclock_update_vm_gtod_copy for details.

Monotonicity is then guaranteed by synchronicity of the host TSCs
and guest TSCs.

Set TSC stable pvclock flag in that case, allowing the guest to read
clock from userspace.

Signed-off-by: Marcelo Tosatti
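To make the argument concrete, here is a minimal userspace sketch of how a guest-side reader might turn a TSC value into nanoseconds from one shared time tuple. It assumes the usual pvclock-style scaling (shift the TSC delta, then multiply by a 32.32 fixed-point factor); struct pv_tuple and pv_clock_read() are hypothetical names for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified copy of the time tuple shared by all vcpus. */
struct pv_tuple {
    uint64_t system_time;    /* host monotonic time in ns at tsc_timestamp */
    uint64_t tsc_timestamp;  /* host TSC sampled for that same instant */
    uint32_t tsc_to_ns_mul;  /* 32.32 fixed-point cycles-to-ns multiplier */
    int8_t   tsc_shift;      /* pre-scaling shift applied to the delta */
};

/* Computes (delta * mul) >> 32 without needing a 128-bit intermediate. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    return (delta >> 32) * mul + (((delta & 0xffffffffu) * mul) >> 32);
}

/* Every vcpu interpolates from the same tuple, so equal TSCs give equal time. */
static uint64_t pv_clock_read(const struct pv_tuple *t, uint64_t tsc)
{
    assert(t != NULL);
    return t->system_time +
           scale_delta(tsc - t->tsc_timestamp, t->tsc_to_ns_mul, t->tsc_shift);
}
```

Because the result depends only on the shared tuple and the TSC, two vcpus with synchronized TSCs cannot observe time going backwards relative to each other, which is the property the stable flag advertises.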
19 Nov, 2012
1 commit
-
Prepending irq-unsafe vtime APIs with underscores was actually
a bad idea as the result is a big mess in the API namespace that
is even waiting to be further extended. Also these helpers
are always called from irq safe callers except kvm. Just
provide a vtime_account_system_irqsafe() for this specific
case so that we can remove the underscore prefix on other
vtime functions.

Signed-off-by: Frederic Weisbecker
Reviewed-by: Steven Rostedt
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Steven Rostedt
Cc: Paul Gortmaker
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Heiko Carstens
01 Nov, 2012
1 commit
-
After commit b3356bf0dbb349 (KVM: emulator: optimize "rep ins" handling),
the pieces of io data can be collected and written to the guest memory
or MMIO together.

Unfortunately, kvm splits the mmio access into 8-byte pieces and stores them in
vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
will cause vcpu->mmio_fragments to overflow.

The bug can be exposed by isapc (-M isapc):

[23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ ......]
[23154.858083] Call Trace:
[23154.859874] [] kvm_get_cr8+0x1d/0x28 [kvm]
[23154.861677] [] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
[23154.863604] [] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]

Actually, we can use one mmio_fragment to store a large mmio access and then
split it when we pass the mmio-exit-info to userspace. After that, we only
need two entries to store mmio info for an access that crosses mmio pages.

Signed-off-by: Xiao Guangrong
Signed-off-by: Marcelo Tosatti
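The idea of keeping one large fragment and carving 8-byte pieces off it only at exit time can be sketched like this. It is a userspace illustration under stated assumptions: struct mmio_fragment here is a simplified stand-in, and struct mmio_exit, MMIO_EXIT_MAX and next_mmio_exit() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MMIO_EXIT_MAX 8u  /* bytes one exit to userspace can carry */

/* One fragment describes a whole contiguous access, however large. */
struct mmio_fragment {
    uint64_t gpa;
    uint8_t *data;
    size_t len;
};

/* What userspace would see for a single exit. */
struct mmio_exit {
    uint64_t gpa;
    const uint8_t *data;
    size_t len;
};

/* Returns 1 and fills *out while bytes remain, 0 once the fragment is drained. */
static int next_mmio_exit(struct mmio_fragment *frag, struct mmio_exit *out)
{
    size_t n;

    assert(frag != NULL && out != NULL);
    if (frag->len == 0)
        return 0;

    n = frag->len < MMIO_EXIT_MAX ? frag->len : MMIO_EXIT_MAX;
    out->gpa = frag->gpa;
    out->data = frag->data;
    out->len = n;

    /* Advance the fragment in place instead of storing one entry per piece. */
    frag->gpa += n;
    frag->data += n;
    frag->len -= n;
    return 1;
}
```

With this shape the number of stored fragments no longer grows with the size of the access, only with the number of pages it touches.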
30 Oct, 2012
2 commits
-
This patch filters noslot pfn out from error pfns based on Marcelo's comment:
a noslot pfn is not an error pfn.

After this patch,
- is_noslot_pfn indicates that the gfn is not in a slot
- is_error_pfn indicates that the gfn is in a slot but an error occurred
when translating the gfn to a pfn
- is_error_noslot_pfn indicates that the pfn is either an error pfn or a
noslot pfn

And is_invalid_pfn can be removed, which makes the code cleaner.

Signed-off-by: Xiao Guangrong
Signed-off-by: Marcelo Tosatti
-
Switching to or from guest context is done on ioctl context.
So by the time we call kvm_guest_enter() or kvm_guest_exit()
we know we are not running the idle task.

As a result, we can directly account the cputime using
vtime_account_system().

There are three good reasons to do this:

* We avoid some useless checks on guest switch. It optimizes
this fast path a bit.

* In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account()
checks for irq time to account. This is pointless since we know
we are not in an irq on guest switch. This is wasting cpu cycles
for no good reason. vtime_account_system() OTOH is a no-op in
this config option.

* We can remove the irq disable/enable around kvm guest switch in s390.

A further optimization may consist in introducing a vtime_account_guest()
that directly calls account_guest_time().

Signed-off-by: Frederic Weisbecker
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Avi Kivity
Cc: Marcelo Tosatti
Cc: Joerg Roedel
Cc: Alexander Graf
Cc: Xiantao Zhang
Cc: Christian Borntraeger
Cc: Cornelia Huck
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Steven Rostedt
Cc: Paul Gortmaker
23 Oct, 2012
1 commit
-
The mmu_notifier_retry is not specific to any vcpu (and never will be)
so only take struct kvm as a parameter.

The motivation is the ARM mmu code that needs to call this from
somewhere where we have long since let go of the vcpu pointer.

Signed-off-by: Christoffer Dall
Signed-off-by: Avi Kivity
11 Oct, 2012
1 commit
-
* 'for-upstream' of http://github.com/agraf/linux-2.6: (56 commits)
arch/powerpc/kvm/e500_tlb.c: fix error return code
KVM: PPC: Book3S HV: Provide a way for userspace to get/set per-vCPU areas
KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface
KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface
KVM: PPC: set IN_GUEST_MODE before checking requests
KVM: PPC: e500: MMU API: fix leak of shared_tlb_pages
KVM: PPC: e500: fix allocation size error on g2h_tlb1_map
KVM: PPC: Book3S HV: Fix calculation of guest phys address for MMIO emulation
KVM: PPC: Book3S HV: Remove bogus update of physical thread IDs
KVM: PPC: Book3S HV: Fix updates of vcpu->cpu
KVM: Move some PPC ioctl definitions to the correct place
KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly
KVM: PPC: Move kvm->arch.slot_phys into memslot.arch
KVM: PPC: Book3S HV: Take the SRCU read lock before looking up memslots
KVM: PPC: bookehv: Allow duplicate calls of DO_KVM macro
KVM: PPC: BookE: Support FPU on non-hv systems
KVM: PPC: 440: Implement mfdcrx
KVM: PPC: 440: Implement mtdcrx
Document IACx/DACx registers access using ONE_REG API
KVM: PPC: E500: Remove E500_TLB_DIRTY flag
...
09 Oct, 2012
1 commit
-
There are no external callers of this function as there is no concept of
resetting a vcpu from generic code.

Signed-off-by: Jan Kiszka
Signed-off-by: Marcelo Tosatti
06 Oct, 2012
1 commit
-
This patch adds the watchdog emulation in KVM. The watchdog
emulation is enabled by the KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl.
A kernel timer is used for watchdog emulation and emulates the
h/w watchdog state machine. On watchdog timer expiry, it exits to QEMU
if TCR.WRC is non-zero. QEMU can reset/shutdown etc. depending upon how
it is configured.

Signed-off-by: Liu Yu
Signed-off-by: Scott Wood
[bharat.bhushan@freescale.com: reworked patch]
Signed-off-by: Bharat Bhushan
[agraf: adjust to new request framework]
Signed-off-by: Alexander Graf
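The h/w watchdog state machine mentioned in the watchdog commit above can be pictured with a small userspace sketch. It assumes the usual Book E progression through TSR[ENW] and TSR[WIS] before the final expiry consults TCR[WRC]; struct wdt_state, enum wdt_action and wdt_expire() are hypothetical names for illustration, and the sketch ignores locking and the actual timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the guest timer registers. */
struct wdt_state {
    bool enw;          /* TSR[ENW]: enable next watchdog */
    bool wis;          /* TSR[WIS]: watchdog interrupt status */
    unsigned int wrc;  /* TCR[WRC]: action on final expiry, 0 = none */
};

enum wdt_action {
    WDT_NONE,              /* first expiry: only arm the next stage */
    WDT_INTERRUPT,         /* a watchdog interrupt is pending for the guest */
    WDT_EXIT_TO_USERSPACE  /* final expiry with WRC != 0: let QEMU decide */
};

/* Called each time the emulated watchdog timer expires. */
static enum wdt_action wdt_expire(struct wdt_state *w)
{
    assert(w != NULL);
    if (!w->enw) {
        w->enw = true;
        return WDT_NONE;
    }
    if (!w->wis) {
        w->wis = true;
        return WDT_INTERRUPT;
    }
    /* Third stage: WIS stays set; only a non-zero WRC triggers the exit. */
    return w->wrc != 0 ? WDT_EXIT_TO_USERSPACE : WDT_INTERRUPT;
}
```

A guest that services the watchdog in time would clear ENW and WIS between expiries, so only an unresponsive guest reaches the third stage.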
05 Oct, 2012
1 commit
-
Pull KVM updates from Avi Kivity:
"Highlights of the changes for this release include support for vfio
level triggered interrupts, improved big real mode support on older
Intels, a streamlined guest page table walker, guest APIC speedups,
PIO optimizations, better overcommit handling, and read-only memory."

* tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
KVM: s390: Fix vcpu_load handling in interrupt code
KVM: x86: Fix guest debug across vcpu INIT reset
KVM: Add resampling irqfds for level triggered interrupts
KVM: optimize apic interrupt delivery
KVM: MMU: Eliminate pointless temporary 'ac'
KVM: MMU: Avoid access/dirty update loop if all is well
KVM: MMU: Eliminate eperm temporary
KVM: MMU: Optimize is_last_gpte()
KVM: MMU: Simplify walk_addr_generic() loop
KVM: MMU: Optimize pte permission checks
KVM: MMU: Update accessed and dirty bits after guest pagetable walk
KVM: MMU: Move gpte_access() out of paging_tmpl.h
KVM: MMU: Optimize gpte_access() slightly
KVM: MMU: Push clean gpte write protection out of gpte_access()
KVM: clarify kvmclock documentation
KVM: make processes waiting on vcpu mutex killable
KVM: SVM: Make use of asm.h
KVM: VMX: Make use of asm.h
KVM: VMX: Make lto-friendly
KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()
...

Conflicts:
arch/s390/include/asm/processor.h
arch/x86/kvm/i8259.c
25 Sep, 2012
1 commit
-
Use a naming based on vtime as a prefix for virtual based
cputime accounting APIs:

- account_system_vtime() -> vtime_account()
- account_switch_vtime() -> vtime_task_switch()

It makes it easier to allow for further declension such
as vtime_account_system(), vtime_account_idle(), ... if we
want to find out the context we account to from generic code.

This also makes it clearer which subsystem these APIs
refer to.

Signed-off-by: Frederic Weisbecker
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Peter Zijlstra
23 Sep, 2012
1 commit
-
To emulate level triggered interrupts, add a resample option to
KVM_IRQFD. When specified, a new resamplefd is provided that notifies
the user when the irqchip has been resampled by the VM. This may, for
instance, indicate an EOI. Also in this mode, posting of an interrupt
through an irqfd only asserts the interrupt. On resampling, the
interrupt is automatically de-asserted prior to user notification.
This enables level triggered interrupts to be posted and re-enabled
from vfio with no userspace intervention.

All resampling irqfds can make use of a single irq source ID, so we
reserve a new one for this interface.

Signed-off-by: Alex Williamson
Signed-off-by: Avi Kivity
18 Sep, 2012
1 commit
-
vcpu mutex can be held for unlimited time so
taking it with mutex_lock on an ioctl is wrong:
one process could be passed a vcpu fd and
call this ioctl on the vcpu used by another process,
it will then be unkillable until the owner exits.

Call mutex_lock_killable instead and return status.
Note: mutex_lock_interruptible would be even nicer,
but I am not sure all users are prepared to handle EINTR
from these ioctls. They might misinterpret it as an error.

Cleanup paths expect a vcpu that can't be used by
any userspace so this will always succeed - catch bugs
by calling BUG_ON.

Catch callers that don't check return state by adding
__must_check.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti
06 Sep, 2012
1 commit
-
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.

Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
28 Aug, 2012
1 commit
-
The build error was caused by builtin functions calling
functions implemented in modules. This error was introduced by
commit 4d8b81abc4 ("KVM: introduce readonly memslot").

The patch fixes the build error by moving function __gfn_to_hva_memslot()
from kvm_main.c to kvm_host.h and making it "inline" so that the
builtin function (kvmppc_h_enter) can use it.

Acked-by: Paul Mackerras
Signed-off-by: Gavin Shan
Signed-off-by: Marcelo Tosatti
22 Aug, 2012
6 commits
-
In current code, if we map a readonly memory space from host to guest
and the page is not currently mapped in the host, we will get a fault
pfn and async is not allowed, then the vm will crash.

We introduce a readonly memory region to map ROM/ROMD to the guest. Read access
works as usual on a readonly memslot; write access on a readonly memslot will cause
a KVM_EXIT_MMIO exit.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
In a later patch, it indicates failure when we try to get a writable
hva from a readonly memslot.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, remove bad_hva and inline kvm_is_error_hva
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
In a later patch, it indicates failure when we try to get a writable
pfn from a readonly memslot.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
It can be used instead of hva_to_pfn_atomic
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Quote Avi's comment:
| KVM_MEMSLOT_INVALID is actually an internal symbol, not used by
| userspace. Please move it to kvm_host.h.

Also, we divide memslot->flags into two parts: the lower 16 bits
are visible to userspace, the higher 16 bits are used internally in
kvm.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
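The flag split described in the commit above (lower 16 bits for userspace, upper 16 bits internal) can be sketched as below. All DEMO_ names are hypothetical and chosen so they cannot be mistaken for the real kernel macros; the bit positions follow the description in the message, with the internal "invalid" marker assumed to be the first bit of the upper half.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_USER_FLAGS_MASK  0x0000ffffu  /* bits userspace may set and see */
#define DEMO_MEM_LOG_DIRTY    (1u << 0)    /* example userspace flag */
#define DEMO_MEM_READONLY     (1u << 1)    /* example userspace flag */
#define DEMO_MEMSLOT_INVALID  (1u << 16)   /* internal marker, upper half */

/* Reject any request from userspace that touches the internal half. */
static int demo_check_user_flags(uint32_t flags)
{
    return (flags & ~DEMO_USER_FLAGS_MASK) ? -EINVAL : 0;
}

/* What would be reported back to userspace for a slot. */
static uint32_t demo_user_visible(uint32_t slot_flags)
{
    return slot_flags & DEMO_USER_FLAGS_MASK;
}

/* Internal transition: tag a slot that is being deleted or moved. */
static uint32_t demo_mark_invalid(uint32_t slot_flags)
{
    uint32_t marked = slot_flags | DEMO_MEMSLOT_INVALID;

    /* Internal bits must never leak into the userspace-visible half. */
    assert(demo_user_visible(marked) == demo_user_visible(slot_flags));
    return marked;
}
```

Keeping the two halves disjoint means new internal markers can be added later without changing what userspace is allowed to pass in.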
06 Aug, 2012
9 commits
-
Currently, we use the error code as the error pfn to indicate the error
condition. This is not straightforward and will not work on a PAE
32-bit cpu with huge memory, since a valid physical address
can be at most 52 bits.

For a normal pfn, the highest 12 bits should be zero, so we can
mask these bits to indicate the error.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
After commit a2766325cf9f9, the error page is replaced by the
error code, so it need not be released anymore.

[ The patch has been compile tested for powerpc ]
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
It is used to eliminate the overhead of a function call and clean up
the code.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Remove it since it is not used anymore
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
These functions are exported and cannot be inlined; move them
to kvm_host.h to eliminate the overhead of a function call.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, remove get_bad_pfn
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, get_hwpoison_pfn and is_hwpoison_pfn can be removed
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
After that, the exported and un-inline function, get_fault_pfn,
can be removed.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Two reasons:
- x86 can integrate rmap and rmap_pde and remove heuristics in
__gfn_to_rmap().
- Some architectures do not need rmap.

Since rmap is one of the most memory consuming stuff in KVM, ppc'd
better restrict the allocation to Book3S HV.

Signed-off-by: Takuya Yoshikawa
Acked-by: Paul Mackerras
Signed-off-by: Avi Kivity
26 Jul, 2012
3 commits
-
Handle KVM_IRQ_LINE and KVM_IRQ_LINE_STATUS in the generic
kvm_vm_ioctl() function and call into kvm_vm_ioctl_irq_line().

This is even more relevant when KVM/ARM also uses this ioctl.
Signed-off-by: Christoffer Dall
Signed-off-by: Avi Kivity
-
Currently, kvm allocates some pages and uses them as error indicators,
which wastes memory and is not good for scalability.

Based on Avi's suggestion, we use error codes instead of these pages
to indicate the error conditions.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Merge patches queued during the run-up to the merge window.
* queue: (25 commits)
KVM: Choose better candidate for directed yield
KVM: Note down when cpu relax intercepted or pause loop exited
KVM: Add config to support ple or cpu relax optimzation
KVM: switch to symbolic name for irq_states size
KVM: x86: Fix typos in pmu.c
KVM: x86: Fix typos in lapic.c
KVM: x86: Fix typos in cpuid.c
KVM: x86: Fix typos in emulate.c
KVM: x86: Fix typos in x86.c
KVM: SVM: Fix typos
KVM: VMX: Fix typos
KVM: remove the unused parameter of gfn_to_pfn_memslot
KVM: remove is_error_hpa
KVM: make bad_pfn static to kvm_main.c
KVM: using get_fault_pfn to get the fault pfn
KVM: MMU: track the refcount when unmap the page
KVM: x86: remove unnecessary mark_page_dirty
KVM: MMU: Avoid handling same rmap_pde in kvm_handle_hva_range()
KVM: MMU: Push trace_kvm_age_page() into kvm_age_rmapp()
KVM: MMU: Add memslot parameter to hva handlers
...

Signed-off-by: Avi Kivity