24 Oct, 2012
1 commit
-
Pull kvm fixes from Avi Kivity:
"KVM updates for 3.7-rc2"

* tag 'kvm-3.7-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM guest: exit idleness when handling KVM_PV_REASON_PAGE_NOT_PRESENT
KVM: apic: fix LDR calculation in x2apic mode
KVM: MMU: fix release noslot pfn
23 Oct, 2012
1 commit
-
We cannot directly call kvm_release_pfn_clean to release the pfn,
since we may encounter a noslot pfn, which is used to cache mmio info
in the spte.

Signed-off-by: Xiao Guangrong
Cc: stable@vger.kernel.org
Signed-off-by: Avi Kivity
06 Oct, 2012
1 commit
-
Now that we have defined generic set_bit_le() we do not need to use
test_and_set_bit_le() for atomically setting a bit.

Signed-off-by: Takuya Yoshikawa
Cc: Avi Kivity
Cc: Marcelo Tosatti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Oct, 2012
1 commit
-
Pull KVM updates from Avi Kivity:
"Highlights of the changes for this release include support for vfio
level triggered interrupts, improved big real mode support on older
Intels, a streamlined guest page table walker, guest APIC speedups,
PIO optimizations, better overcommit handling, and read-only memory."

* tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
KVM: s390: Fix vcpu_load handling in interrupt code
KVM: x86: Fix guest debug across vcpu INIT reset
KVM: Add resampling irqfds for level triggered interrupts
KVM: optimize apic interrupt delivery
KVM: MMU: Eliminate pointless temporary 'ac'
KVM: MMU: Avoid access/dirty update loop if all is well
KVM: MMU: Eliminate eperm temporary
KVM: MMU: Optimize is_last_gpte()
KVM: MMU: Simplify walk_addr_generic() loop
KVM: MMU: Optimize pte permission checks
KVM: MMU: Update accessed and dirty bits after guest pagetable walk
KVM: MMU: Move gpte_access() out of paging_tmpl.h
KVM: MMU: Optimize gpte_access() slightly
KVM: MMU: Push clean gpte write protection out of gpte_access()
KVM: clarify kvmclock documentation
KVM: make processes waiting on vcpu mutex killable
KVM: SVM: Make use of asm.h
KVM: VMX: Make use of asm.h
KVM: VMX: Make lto-friendly
KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()
...

Conflicts:
arch/s390/include/asm/processor.h
arch/x86/kvm/i8259.c
03 Oct, 2012
1 commit
-
Pull workqueue changes from Tejun Heo:
"This is workqueue updates for v3.7-rc1. A lot of activities this
round including considerable API and behavior cleanups.

* delayed_work combines a timer and a work item. The handling of the
timer part has always been a bit clunky leading to confusing
cancelation API with weird corner-case behaviors. delayed_work is
updated to use new IRQ safe timer and cancelation now works as
expected.

* Another deficiency of delayed_work was lack of the counterpart of
mod_timer() which led to cancel+queue combinations or open-coded
timer+work usages. mod_delayed_work[_on]() are added.

These two delayed_work changes make delayed_work provide interface
and behave like timer which is executed with process context.

* A work item could be executed concurrently on multiple CPUs, which
is rather unintuitive and made flush_work() behavior confusing and
half-broken under certain circumstances. This problem doesn't
exist for non-reentrant workqueues. While non-reentrancy check
isn't free, the overhead is incurred only when a work item bounces
across different CPUs and even in simulated pathological scenario
the overhead isn't too high.

All workqueues are made non-reentrant. This removes the
distinction between flush_[delayed_]work() and
flush_[delayed_]work_sync(). The former is now as strong as the
latter and the specified work item is guaranteed to have finished
execution of any previous queueing on return.

* In addition to the various bug fixes, Lai redid and simplified CPU
hotplug handling significantly.

* Joonsoo introduced system_highpri_wq and used it during CPU
hotplug.

There are two merge commits - one to pull in IRQ safe timer from
tip/timers/core and the other to pull in CPU hotplug fixes from
wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."

Fixed a number of trivial conflicts, but the more interesting conflicts
were silent ones where the deprecated interfaces had been used by new
code in the merge window, and thus didn't cause any real data conflicts.

Tejun pointed out a few of them, I fixed a couple more.
* 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
workqueue: remove @delayed from cwq_dec_nr_in_flight()
workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
workqueue: use __cpuinit instead of __devinit for cpu callbacks
workqueue: rename manager_mutex to assoc_mutex
workqueue: WORKER_REBIND is no longer necessary for idle rebinding
workqueue: WORKER_REBIND is no longer necessary for busy rebinding
workqueue: reimplement idle worker rebinding
workqueue: deprecate __cancel_delayed_work()
workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
workqueue: use mod_delayed_work() instead of __cancel + queue
workqueue: use irqsafe timer for delayed_work
workqueue: clean up delayed_work initializers and add missing one
workqueue: make deferrable delayed_work initializer names consistent
workqueue: cosmetic whitespace updates for macro definitions
workqueue: deprecate system_nrt[_freezable]_wq
workqueue: deprecate flush[_delayed]_work_sync()
...
23 Sep, 2012
1 commit
-
To emulate level triggered interrupts, add a resample option to
KVM_IRQFD. When specified, a new resamplefd is provided that notifies
the user when the irqchip has been resampled by the VM. This may, for
instance, indicate an EOI. Also in this mode, posting of an interrupt
through an irqfd only asserts the interrupt. On resampling, the
interrupt is automatically de-asserted prior to user notification.
This enables level triggered interrupts to be posted and re-enabled
from vfio with no userspace intervention.

All resampling irqfds can make use of a single irq source ID, so we
reserve a new one for this interface.

Signed-off-by: Alex Williamson
Signed-off-by: Avi Kivity
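A minimal userspace sketch of how a caller might request a resampling irqfd, assuming the struct kvm_irqfd layout and the KVM_IRQFD_FLAG_RESAMPLE flag exported by <linux/kvm.h> on a kernel that carries this change. The helper names make_resample_irqfd() and register_resample_irqfd() are hypothetical, and both eventfds are assumed to have been created by the caller:

```c
#include <assert.h>
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: build a KVM_IRQFD request for a level-triggered
 * interrupt on `gsi`. `trigger_fd` asserts the interrupt; `resample_fd`
 * is signalled when the irqchip resamples it (for instance on EOI).
 */
struct kvm_irqfd make_resample_irqfd(int trigger_fd, int resample_fd,
                                     unsigned int gsi)
{
    struct kvm_irqfd req;

    assert(trigger_fd >= 0 && resample_fd >= 0);
    memset(&req, 0, sizeof(req));          /* padding must be zero */
    req.fd = (unsigned int)trigger_fd;
    req.gsi = gsi;
    req.flags = KVM_IRQFD_FLAG_RESAMPLE;   /* enable resample mode */
    req.resamplefd = (unsigned int)resample_fd;
    return req;
}

/* Hypothetical helper: hand the request to an open VM file descriptor. */
int register_resample_irqfd(int vm_fd, const struct kvm_irqfd *req)
{
    return ioctl(vm_fd, KVM_IRQFD, req);
}
```

With this arrangement, userspace only has to wait on the resample eventfd to learn that the guest has serviced the interrupt; de-assertion happens in the kernel.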
20 Sep, 2012
1 commit
-
Most interrupts are delivered to only one vcpu. Use pre-built tables to
find the interrupt destination instead of looping through all vcpus. In
logical mode, loop only through the vcpus in the logical cluster the irq
is sent to.

Signed-off-by: Gleb Natapov
Acked-by: Michael S. Tsirkin
Signed-off-by: Avi Kivity
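The table-lookup idea can be sketched as a small userspace model. This is not the kernel's data structure: the type and function names below (vcpu_model, apic_map_model, apic_map_build, apic_map_lookup) are hypothetical, and only physical destination mode with 8-bit APIC IDs is modeled; logical clusters are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_ID_SLOTS 256  /* one slot per possible 8-bit APIC ID */

/* Hypothetical stand-in for a vcpu. */
struct vcpu_model {
    int     index;
    uint8_t apic_id;
};

/* Pre-built table: APIC ID -> vcpu, NULL when no vcpu owns that ID. */
struct apic_map_model {
    struct vcpu_model *phys_map[APIC_ID_SLOTS];
};

/* Rebuild the table; done rarely, e.g. when an APIC ID changes. */
void apic_map_build(struct apic_map_model *map,
                    struct vcpu_model *vcpus, size_t count)
{
    assert(map != NULL && (vcpus != NULL || count == 0));
    for (size_t i = 0; i < APIC_ID_SLOTS; i++)
        map->phys_map[i] = NULL;
    for (size_t i = 0; i < count; i++)
        map->phys_map[vcpus[i].apic_id] = &vcpus[i];
}

/* Delivery path: a single array index instead of a loop over all vcpus. */
struct vcpu_model *apic_map_lookup(const struct apic_map_model *map,
                                   uint8_t dest_id)
{
    return map->phys_map[dest_id];
}
```

The cost of building the table is paid on the rare configuration change, while every interrupt delivery becomes a constant-time lookup.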
18 Sep, 2012
1 commit
-
vcpu mutex can be held for unlimited time so
taking it with mutex_lock on an ioctl is wrong:
one process could be passed a vcpu fd and
call this ioctl on the vcpu used by another process,
it will then be unkillable until the owner exits.

Call mutex_lock_killable instead and return status.
Note: mutex_lock_interruptible would be even nicer,
but I am not sure all users are prepared to handle EINTR
from these ioctls. They might misinterpret it as an error.

Cleanup paths expect a vcpu that can't be used by
any userspace so this will always succeed - catch bugs
by calling BUG_ON.

Catch callers that don't check return state by adding
__must_check.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti
06 Sep, 2012
3 commits
-
Other arches do not need this.
Signed-off-by: Marcelo Tosatti
v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya)
Signed-off-by: Avi Kivity
-
PPC must flush all translations before the new memory slot
is visible.

Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
-
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.

Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
28 Aug, 2012
1 commit
-
The build error was caused by built-in functions calling
functions implemented in modules. This error was introduced by
commit 4d8b81abc4 ("KVM: introduce readonly memslot").

The patch fixes the build error by moving the function __gfn_to_hva_memslot()
from kvm_main.c to kvm_host.h and making it "inline" so that the
built-in function (kvmppc_h_enter) can use it.

Acked-by: Paul Mackerras
Signed-off-by: Gavin Shan
Signed-off-by: Marcelo Tosatti
27 Aug, 2012
1 commit
-
KVM_SET_SIGNAL_MASK passed a NULL argument leaves the on stack signal
sets uninitialized. It then passes them through to
kvm_vcpu_ioctl_set_sigmask.

We should be passing a NULL in this case not translated garbage.
Signed-off-by: Alan Cox
Signed-off-by: Marcelo Tosatti
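The shape of this fix can be illustrated with a userspace model. Everything here is an assumption made for illustration: vcpu_model, set_sigmask() and handle_set_signal_mask() are hypothetical names, and a plain struct copy stands in for the kernel's copy from userspace and sigset translation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vcpu signal mask state. */
struct vcpu_model {
    int      sigset_active;
    sigset_t sigset;
};

/* Models the callee: a NULL mask means "disable the vcpu signal mask". */
int set_sigmask(struct vcpu_model *vcpu, const sigset_t *mask)
{
    assert(vcpu != NULL);
    if (mask) {
        vcpu->sigset_active = 1;
        vcpu->sigset = *mask;
    } else {
        vcpu->sigset_active = 0;
    }
    return 0;
}

/* Models the ioctl handler after the fix. */
int handle_set_signal_mask(struct vcpu_model *vcpu, const sigset_t *user_mask)
{
    sigset_t local;
    const sigset_t *p = NULL;   /* stays NULL when no mask was supplied */

    if (user_mask) {
        local = *user_mask;     /* stands in for copy + translation */
        p = &local;
    }
    /* Never pass the address of an uninitialized on-stack set. */
    return set_sigmask(vcpu, p);
}
```

The key point is that the pointer handed to the callee is derived from whether the caller supplied an argument, not from the address of the local buffer.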
22 Aug, 2012
7 commits
-
In the current code, if we map a readonly memory space from host to guest
and the page is not currently mapped in the host, we will get a fault
pfn and async is not allowed, so the vm will crash.

We introduce a readonly memory region to map ROM/ROMD to the guest. Read
access works normally on a readonly memslot; write access to a readonly
memslot causes a KVM_EXIT_MMIO exit.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, remove bad_hva and inline kvm_is_error_hva
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
In the current code, we always map a writable pfn for a read fault. In
order to support readonly memslots, we map a writable pfn only if
'writable' is not NULL.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
We do too many things in hva_to_pfn; this patch reorganizes the code
to make it more readable.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
This set of functions is only used to read data from host space. In a
later patch, we will only get a readonly hva in gfn_to_hva_read, and
the function name is a good hint to pair gfn_to_hva_read with
kvm_read_hva()/kvm_read_hva_atomic().

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
It can be used instead of hva_to_pfn_atomic
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Check flags when a memslot is registered from userspace, as per Avi's suggestion
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
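From the userspace side, the read-only memslot series above is consumed through the memory region ioctl. The following is a minimal sketch, assuming the struct kvm_userspace_memory_region layout and the KVM_MEM_READONLY flag from <linux/kvm.h>; the helper names make_rom_slot() and register_rom_slot() are hypothetical, and a real caller would first check that the host advertises read-only memory support.

```c
#include <assert.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: describe a guest-physical range backed by
 * `host_mem` that the guest may read but not write. Guest writes are
 * expected to surface to userspace as KVM_EXIT_MMIO.
 */
struct kvm_userspace_memory_region make_rom_slot(uint32_t slot, uint64_t gpa,
                                                 uint64_t size, void *host_mem)
{
    struct kvm_userspace_memory_region region;

    assert(host_mem != NULL && size != 0);
    memset(&region, 0, sizeof(region));
    region.slot = slot;
    region.flags = KVM_MEM_READONLY;
    region.guest_phys_addr = gpa;
    region.memory_size = size;
    region.userspace_addr = (uint64_t)(uintptr_t)host_mem;
    return region;
}

/* Hypothetical helper: register the slot on an open VM file descriptor. */
int register_rom_slot(int vm_fd,
                      const struct kvm_userspace_memory_region *region)
{
    return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, region);
}
```

A ROM image would typically be loaded into the host buffer before the slot is registered, since the guest cannot populate it afterwards.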
21 Aug, 2012
1 commit
-
flush[_delayed]_work_sync() are now spurious. Mark them deprecated
and convert all users to flush[_delayed]_work().

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant and the regular flushes guarantee that the work item is
not pending or running on any CPU on return, so there's no reason to
use the sync flushes at all and they're going away.

This patch doesn't make any functional difference.
Signed-off-by: Tejun Heo
Cc: Russell King
Cc: Paul Mundt
Cc: Ian Campbell
Cc: Jens Axboe
Cc: Mattia Dongili
Cc: Kent Yoder
Cc: David Airlie
Cc: Jiri Kosina
Cc: Karsten Keil
Cc: Bryan Wu
Cc: Benjamin Herrenschmidt
Cc: Alasdair Kergon
Cc: Mauro Carvalho Chehab
Cc: Florian Tobias Schandinat
Cc: David Woodhouse
Cc: "David S. Miller"
Cc: linux-wireless@vger.kernel.org
Cc: Anton Vorontsov
Cc: Sangbeom Kim
Cc: "James E.J. Bottomley"
Cc: Greg Kroah-Hartman
Cc: Eric Van Hensbergen
Cc: Takashi Iwai
Cc: Steven Whitehouse
Cc: Petr Vandrovec
Cc: Mark Fasheh
Cc: Christoph Hellwig
Cc: Avi Kivity
15 Aug, 2012
1 commit
-
We validate irq pin number when routing is setup, so
code handling illegal irq # in pic and ioapic on each injection
is never called.
Drop it, replace with BUG_ON to catch out of bounds access bugs.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti
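The pattern described above, validating once at setup and asserting on the hot path, can be sketched in userspace C. This is a model only: pic_model and both functions are hypothetical, assert() stands in for the kernel's BUG_ON, and the pin count of 16 is taken from the PIC_NUM_PINS commit further down this log.

```c
#include <assert.h>
#include <stdbool.h>

#define PIC_NUM_PINS 16

/* Hypothetical stand-in for the PIC's per-pin state. */
struct pic_model {
    int irq_states[PIC_NUM_PINS];
};

/* Setup path: reject bad pin numbers once, when routing is configured. */
bool pic_route_is_valid(int irq)
{
    return irq >= 0 && irq < PIC_NUM_PINS;
}

/* Injection path: a bad pin here means a bug elsewhere, so fail loudly. */
void pic_set_irq(struct pic_model *pic, int irq, int level)
{
    assert(irq >= 0 && irq < PIC_NUM_PINS);  /* models BUG_ON */
    pic->irq_states[irq] = level;
}
```

Keeping the check as an assertion documents the invariant without adding an error-handling branch that can never be taken.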
06 Aug, 2012
9 commits
-
After commit a2766325cf9f9, the error page is replaced by the
error code, so it no longer needs to be released.

[ The patch has been compile tested for powerpc ]
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
After commit a2766325cf9f9, the error pfn is replaced by the
error code, so it no longer needs to be released.

[ The patch has been compile tested for powerpc ]
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
It is used to eliminate the overhead of the function call and clean up
the code.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
These functions are exported and cannot be inlined; move them
to kvm_host.h to eliminate the overhead of the function call.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, remove get_bad_pfn
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Then, get_hwpoison_pfn and is_hwpoison_pfn can be removed
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
After that, the exported and un-inline function, get_fault_pfn,
can be removed.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
There are two bugs:
- the 'error page' is never released
[ it is unneeded after commit a2766325cf9f9, for backport, we
still do kvm_release_pfn_clean for the error pfn ]
- guest pages are always released regardless of the unmapped page
(e.g., caused by hwpoison)

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Two reasons:
- x86 can integrate rmap and rmap_pde and remove heuristics in
__gfn_to_rmap().
- Some architectures do not need rmap.

Since rmap is one of the most memory-consuming structures in KVM, ppc
had better restrict the allocation to Book3S HV.

Signed-off-by: Takuya Yoshikawa
Acked-by: Paul Mackerras
Signed-off-by: Avi Kivity
26 Jul, 2012
4 commits
-
Handle KVM_IRQ_LINE and KVM_IRQ_LINE_STATUS in the generic
kvm_vm_ioctl() function and call into kvm_vm_ioctl_irq_line().

This is even more relevant when KVM/ARM also uses this ioctl.
Signed-off-by: Christoffer Dall
Signed-off-by: Avi Kivity
-
Currently, kvm allocates some pages and uses them as error indicators,
which wastes memory and is not good for scalability.

Based on Avi's suggestion, we use error codes instead of these pages
to indicate the error conditions.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
In kvm_async_pf_wakeup_all, it uses bad_page to generate broadcast wakeup,
and uses put_page to release bad_page; this works only because
bad_page is a normal page. But we will use the error code instead of
bad_page, so use kvm_release_page_clean to release the page, which will
release the error code properly.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity
-
Merge patches queued during the run-up to the merge window.
* queue: (25 commits)
KVM: Choose better candidate for directed yield
KVM: Note down when cpu relax intercepted or pause loop exited
KVM: Add config to support ple or cpu relax optimzation
KVM: switch to symbolic name for irq_states size
KVM: x86: Fix typos in pmu.c
KVM: x86: Fix typos in lapic.c
KVM: x86: Fix typos in cpuid.c
KVM: x86: Fix typos in emulate.c
KVM: x86: Fix typos in x86.c
KVM: SVM: Fix typos
KVM: VMX: Fix typos
KVM: remove the unused parameter of gfn_to_pfn_memslot
KVM: remove is_error_hpa
KVM: make bad_pfn static to kvm_main.c
KVM: using get_fault_pfn to get the fault pfn
KVM: MMU: track the refcount when unmap the page
KVM: x86: remove unnecessary mark_page_dirty
KVM: MMU: Avoid handling same rmap_pde in kvm_handle_hva_range()
KVM: MMU: Push trace_kvm_age_page() into kvm_age_rmapp()
KVM: MMU: Add memslot parameter to hva handlers
...

Signed-off-by: Avi Kivity
25 Jul, 2012
1 commit
-
Pull KVM updates from Avi Kivity:
"Highlights include
- full big real mode emulation on pre-Westmere Intel hosts (can be
disabled with emulate_invalid_guest_state=0)
- relatively small ppc and s390 updates
- PCID/INVPCID support in guests
- EOI avoidance (3.6 guests should perform better on 3.6 hosts on
interrupt intensive workloads)
- Lockless write faults during live migration
- EPT accessed/dirty bits support for new Intel processors"

Fix up conflicts in:
- Documentation/virtual/kvm/api.txt:
Stupid subchapter numbering, added next to each other.
- arch/powerpc/kvm/booke_interrupts.S:
PPC asm changes clashing with the KVM fixes
- arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c:
Duplicated commits through the kvm tree and the s390 tree, with
subsequent edits in the KVM tree.

* tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
KVM: fix race with level interrupts
x86, hyper: fix build with !CONFIG_KVM_GUEST
Revert "apic: fix kvm build on UP without IOAPIC"
KVM guest: switch to apic_set_eoi_write, apic_write
apic: add apic_set_eoi_write for PV use
KVM: VMX: Implement PCID/INVPCID for guests with EPT
KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check
KVM: PPC: Critical interrupt emulation support
KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests
KVM: PPC64: booke: Set interrupt computation mode for 64-bit host
KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt
KVM: PPC: bookehv64: Add support for std/ld emulation.
booke: Added crit/mc exception handler for e500v2
booke/bookehv: Add host crit-watchdog exception support
KVM: MMU: document mmu-lock and fast page fault
KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint
KVM: MMU: trace fast page fault
KVM: MMU: fast path of handling guest page fault
KVM: MMU: introduce SPTE_MMU_WRITEABLE bit
KVM: MMU: fold tlb flush judgement into mmu_spte_update
...
23 Jul, 2012
3 commits
-
Currently, on large vcpu guests, there is a high probability of
yielding to the same vcpu that had recently done a pause-loop exit or
cpu relax intercepted. Such a yield can lead to the vcpu spinning
again and hence degrade the performance.

The patchset keeps track of the pause loop exit/cpu relax interception
and gives chance to a vcpu which:
(a) Has not done pause loop exit or cpu relax intercepted at all
(probably he is preempted lock-holder)
(b) Was skipped in last iteration because it did pause loop exit or
cpu relax intercepted, and probably has become eligible now
(next eligible lock holder)

Signed-off-by: Raghavendra K T
Reviewed-by: Marcelo Tosatti
Reviewed-by: Rik van Riel
Tested-by: Christian Borntraeger # on s390x
Signed-off-by: Avi Kivity
-
Noting pause loop exited vcpu or cpu relax intercepted helps in
filtering right candidate to yield. Wrong selection of vcpu;
i.e., a vcpu that just did a pl-exit or cpu relax intercepted may
contribute to performance degradation.

Signed-off-by: Raghavendra K T
Reviewed-by: Marcelo Tosatti
Reviewed-by: Rik van Riel
Tested-by: Christian Borntraeger # on s390x
Signed-off-by: Avi Kivity
-
Suggested-by: Avi Kivity
Signed-off-by: Raghavendra K T
Reviewed-by: Marcelo Tosatti
Reviewed-by: Rik van Riel
Tested-by: Christian Borntraeger # on s390x
Signed-off-by: Avi Kivity
21 Jul, 2012
1 commit
-
Use PIC_NUM_PINS instead of hard-coded 16 for pic pins.
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti