Eric Lee / linux-smarc-t335x-v3.2

01 Mar, 2010

15 commits

70e335e16 KVM: Convert kvm->requests_lock to raw_spinlock_t

The code relies on kvm->requests_lock inhibiting preemption.

Noted by Jan Kiszka.

Signed-off-by: Avi Kivity

Avi Kivity
2010-03-01 23:36:13 +0800
8f0b1ab6f KVM: Introduce kvm_host_page_size

This patch introduces a generic function to find out the
host page size for a given gfn. This function is needed by
the kvm iommu code. This patch also simplifies the x86
host_mapping_level function.

Signed-off-by: Joerg Roedel
Signed-off-by: Avi Kivity

Joerg Roedel
2010-03-01 23:36:08 +0800
f0f4b9309 KVM: Fix kvm_coalesced_mmio_ring duplicate allocation

Commit 0953ca73 "KVM: Simplify coalesced mmio initialization"
allocated kvm_coalesced_mmio_ring in kvm_coalesced_mmio_init(), but
didn't discard the original allocation...

Signed-off-by: Sheng Yang
Signed-off-by: Marcelo Tosatti

Sheng Yang
2010-03-01 23:36:03 +0800
647492047 KVM: fix cleanup_srcu_struct on vm destruction

cleanup_srcu_struct on VM destruction remains broken:

BUG: unable to handle kernel paging request at ffffffffffffffff
IP: [] srcu_read_lock+0x16/0x21
RIP: 0010:[] [] srcu_read_lock+0x16/0x21
Call Trace:
[] kvm_arch_vcpu_uninit+0x1b/0x48 [kvm]
[] kvm_vcpu_uninit+0x9/0x15 [kvm]
[] vmx_free_vcpu+0x7f/0x8f [kvm_intel]
[] kvm_arch_destroy_vm+0x78/0x111 [kvm]
[] kvm_put_kvm+0xd4/0xfe [kvm]

Move it to kvm_arch_destroy_vm.

Signed-off-by: Marcelo Tosatti
Reported-by: Jan Kiszka

Marcelo Tosatti
2010-03-01 23:36:01 +0800
79fac95ec KVM: convert slots_lock to a mutex

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:45 +0800
e93f8a0f8 KVM: convert io_bus to SRCU

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:45 +0800
a983fb238 KVM: x86: switch kvm_set_memory_alias to SRCU update

Using a similar two-step procedure as for memslots.

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:45 +0800
bc6678a33 KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU update

Use two steps for memslot deletion: mark the slot invalid (which stops
instantiation of new shadow pages for that slot, but allows destruction),
then instantiate the new empty slot.

Also simplifies kvm_handle_hva locking.

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:44 +0800
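
The two-step deletion described in bc6678a33 can be illustrated with a small userspace sketch. This is not the kernel code: the table layout, the SLOT_INVALID flag name, and every function below are hypothetical, and wait_for_readers() is an empty stand-in for synchronize_srcu() because the sketch has no concurrent readers. It only shows the shape of the update: copy the table, change one slot, publish the new pointer, wait, free the old copy.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define NSLOTS 4
#define SLOT_INVALID 0x1u   /* hypothetical flag name */

struct slot { unsigned long base_gfn, npages; unsigned int flags; };
struct slot_table { struct slot slots[NSLOTS]; };

/* Readers dereference this pointer; writers only ever replace it whole. */
static _Atomic(struct slot_table *) current_table;

/* Stand-in for synchronize_srcu(): this sketch has no concurrent readers. */
static void wait_for_readers(void) { }

typedef void (*slot_edit_fn)(struct slot *s, const struct slot *arg);

static void edit_set(struct slot *s, const struct slot *arg) { *s = *arg; }
static void edit_invalidate(struct slot *s, const struct slot *arg)
{
    (void)arg;
    s->flags |= SLOT_INVALID;
}
static void edit_clear(struct slot *s, const struct slot *arg)
{
    (void)arg;
    memset(s, 0, sizeof(*s));
}

/* Copy the live table (or start zeroed), edit one slot, publish the copy. */
static int update_slot(int id, slot_edit_fn edit, const struct slot *arg)
{
    struct slot_table *old = atomic_load(&current_table);
    struct slot_table *next;

    if (id < 0 || id >= NSLOTS)
        return -1;
    next = calloc(1, sizeof(*next));
    if (!next)
        return -1;
    if (old)
        memcpy(next, old, sizeof(*next));
    edit(&next->slots[id], arg);
    atomic_store(&current_table, next); /* new readers see the new table */
    wait_for_readers();                 /* readers of the old table drain */
    free(old);
    return 0;
}

int set_slot(int id, unsigned long base_gfn, unsigned long npages)
{
    struct slot v = { base_gfn, npages, 0 };
    return update_slot(id, edit_set, &v);
}

/* Step 1: mark the slot invalid so nothing new is built on top of it. */
int mark_slot_invalid(int id) { return update_slot(id, edit_invalidate, NULL); }

/* Step 2: publish a table in which the slot is empty. */
int install_empty_slot(int id) { return update_slot(id, edit_clear, NULL); }

int delete_slot(int id)
{
    return mark_slot_invalid(id) ? -1 : install_empty_slot(id);
}

/* Look at the currently published copy of one slot (id must be in range). */
const struct slot *peek_slot(int id)
{
    struct slot_table *t = atomic_load(&current_table);
    return &t->slots[id];
}
```

Between the two steps the slot is still present but flagged, which corresponds to the window in which the commit message says existing shadow pages can be destroyed while no new ones are instantiated.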
3ad26d813 KVM: use gfn_to_pfn_memslot in kvm_iommu_map_pages

So it's possible to iommu map a memslot before making it visible to
kvm.

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:44 +0800
506f0d6f9 KVM: introduce gfn_to_pfn_memslot

Which takes a memslot pointer instead of using kvm->memslots.

To be used by the SRCU conversion later.

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:44 +0800
f7784b8ec KVM: split kvm_arch_set_memory_region into prepare and commit

Required for the SRCU conversion later.

Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:44 +0800
46a26bf55 KVM: modify memslots layout in struct kvm

Have a pointer to an allocated region inside struct kvm.

[alex: fix ppc book 3s]

Signed-off-by: Alexander Graf
Signed-off-by: Marcelo Tosatti

Marcelo Tosatti
2010-03-01 23:35:43 +0800
980da6ce5 KVM: Simplify coalesced mmio initialization

- add destructor function
- move related allocation into constructor
- add stubs for !CONFIG_KVM_MMIO

Signed-off-by: Avi Kivity

Avi Kivity
2010-03-01 23:35:41 +0800
4c07b0a4b KVM: Remove ifdefs from mmu notifier initialization

Signed-off-by: Avi Kivity

Avi Kivity
2010-03-01 23:35:41 +0800
283d0c65e KVM: Disentangle mmu notifiers and coalesced_mmio registration

They aren't related.

Signed-off-by: Avi Kivity

Avi Kivity
2010-03-01 23:35:41 +0800

27 Dec, 2009

2 commits

b4329db0d KVM: get rid of kvm_create_vm() unused label warning on s390

arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_create_vm':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:409: warning: label 'out_err' defined but not used

Signed-off-by: Heiko Carstens
Signed-off-by: Avi Kivity

Heiko Carstens
2009-12-27 23:36:34 +0800
fae3a3536 KVM: Fix possible circular locking in kvm_vm_ioctl_assign_device()

One possible order is:

KVM_CREATE_IRQCHIP ioctl(took kvm->lock) -> kvm_iobus_register_dev() ->
down_write(kvm->slots_lock).

The other one is in kvm_vm_ioctl_assign_device(), which takes kvm->slots_lock
first, then kvm->lock.

Update the comment of lock order as well.

Observed through kernel locking debug warnings.

Cc: stable@kernel.org
Signed-off-by: Sheng Yang
Signed-off-by: Avi Kivity

Sheng Yang
2009-12-27 23:36:31 +0800

23 Dec, 2009

1 commit

628ff7c1d anonfd: Allow making anon files read-only

It seems a couple places such as arch/ia64/kernel/perfmon.c and
drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile()
instead of a private pseudo-fs + alloc_file(), if only there were a way
to get a read-only file. So provide this by having anon_inode_getfile()
create a read-only file if we pass O_RDONLY in flags.

Signed-off-by: Roland Dreier
Signed-off-by: Al Viro

Roland Dreier
2009-12-23 01:27:34 +0800

09 Dec, 2009

1 commit

bcd6acd51 Merge commit 'origin/master' into next

Conflicts:
include/linux/kvm.h

Benjamin Herrenschmidt
2009-12-09 14:14:38 +0800

03 Dec, 2009

9 commits

a9c7399d6 KVM: Allow internal errors reported to userspace to carry extra data

Usually userspace will freeze the guest so we can inspect it, but some
internal state is not available. Add extra data to internal error
reporting so we can expose it to the debugger. Extra data is specific
to the suberror.

Signed-off-by: Avi Kivity

Avi Kivity
2009-12-03 15:32:24 +0800
6ff5894cd KVM: Enable 32bit dirty log pointers on 64bit host

With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.

This is what happens with dirty logging. To get the pointer interpreted
correctly, we thus need Arnd's patch to implement a compat layer for
the ioctl:

A better way to do this is to add a separate compat_ioctl() method that
converts this for you.

Based on initial patch from Arnd Bergmann.

Signed-off-by: Arnd Bergmann
Signed-off-by: Alexander Graf
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Arnd Bergmann
2009-12-03 15:32:20 +0800
d255f4f2b KVM: introduce kvm_vcpu_on_spin

Introduce kvm_vcpu_on_spin, to be used by VMX/SVM to yield processing
once the cpu detects pause-based looping.

Signed-off-by: "Zhai, Edwin"
Signed-off-by: Marcelo Tosatti

Zhai, Edwin
2009-12-03 15:32:17 +0800
10474ae89 KVM: Activate Virtualization On Demand

X86 CPUs need to have some magic happening to enable the virtualization
extensions on them. This magic can result in unpleasant results for
users, like blocking other VMMs from working (vmx) or using invalid TLB
entries (svm).

Currently KVM activates virtualization when the respective kernel module
is loaded. This blocks us from autoloading KVM modules without breaking
other VMMs.

To circumvent this problem at least a bit, this patch introduces
on-demand activation of virtualization. This means that virtualization
is instead enabled on creation of the first virtual machine
and disabled on destruction of the last one.

So using this, KVM can be easily autoloaded, while keeping other
hypervisors usable.

Signed-off-by: Alexander Graf
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Alexander Graf
2009-12-03 15:32:10 +0800
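
The on-demand activation described in 10474ae89 amounts to reference counting. The following is a minimal userspace sketch of that idea, not the kernel implementation: the counter, the flag, and all function names are hypothetical, the enable and disable steps are reduced to setting a boolean, and the locking a real implementation would need around the counter is left out.

```c
#include <stdbool.h>

static int  vm_count;      /* hypothetical: number of live VMs */
static bool virt_enabled;  /* stands in for the hardware state */
static int  enable_calls;  /* how often the enable step actually ran */

static void hardware_enable(void)  { virt_enabled = true; enable_calls++; }
static void hardware_disable(void) { virt_enabled = false; }

/* The first VM turns virtualization on; later VMs only bump the count. */
void vm_created(void)
{
    if (vm_count++ == 0)
        hardware_enable();
}

/* The last VM to go away turns virtualization off again. */
void vm_destroyed(void)
{
    if (vm_count > 0 && --vm_count == 0)
        hardware_disable();
}

bool virtualization_active(void) { return virt_enabled; }
int  times_enabled(void)         { return enable_calls; }
```

With this shape, loading the module costs nothing until a VM exists, which is the property the commit message relies on for autoloading.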
bfd99ff5d KVM: Move assigned device code to own file

Signed-off-by: Avi Kivity

Avi Kivity
2009-12-03 15:32:09 +0800
680b3648b KVM: Drop kvm->irq_lock lock from irq injection path

The only thing it protects now is interrupt injection into lapic and
this can work lockless. Even now with kvm->irq_lock in place access
to lapic is not entirely serialized since vcpu access doesn't take
kvm->irq_lock.

Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity

Gleb Natapov
2009-12-03 15:32:08 +0800
136bdfeee KVM: Move irq ack notifier list to arch independent code

Mask irq notifier list is already there.

Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity

Gleb Natapov
2009-12-03 15:32:07 +0800
46e624b95 KVM: Change irq routing table to use gsi indexed array

Use gsi indexed array instead of scanning all entries on each interrupt
injection.

Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity

Gleb Natapov
2009-12-03 15:32:07 +0800
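
The change in 46e624b95 replaces a per-injection scan with a lookup indexed by GSI. A minimal sketch of the difference follows, under simplifying assumptions that are not taken from the commit: each GSI maps to a single non-negative pin, MAX_GSI is an arbitrary small bound, and the struct and function names are hypothetical.

```c
#include <stddef.h>

#define MAX_GSI 16   /* arbitrary bound for this sketch */

struct route { unsigned int gsi; int pin; };

/* Scanning shape: walk every entry on each lookup, O(n). */
int lookup_by_scan(const struct route *r, size_t n, unsigned int gsi)
{
    for (size_t i = 0; i < n; i++)
        if (r[i].gsi == gsi)
            return r[i].pin;
    return -1;
}

/* Indexed shape: build the table once, whenever the routing changes. */
void build_index(const struct route *r, size_t n, int index[MAX_GSI])
{
    for (unsigned int g = 0; g < MAX_GSI; g++)
        index[g] = -1;
    for (size_t i = 0; i < n; i++)
        if (r[i].gsi < MAX_GSI && index[r[i].gsi] < 0)
            index[r[i].gsi] = r[i].pin;   /* keep the first match, like the scan */
}

/* Each lookup is then a bounds check plus one array read, O(1). */
int lookup_by_index(const int index[MAX_GSI], unsigned int gsi)
{
    return gsi < MAX_GSI ? index[gsi] : -1;
}
```

The cost moves from every interrupt injection to the comparatively rare routing update.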
45ec431c5 KVM: Don't wrap schedule() with vcpu_put()/vcpu_load()

Preemption notifiers will do that for us automatically.

Signed-off-by: Avi Kivity

Avi Kivity
2009-12-03 15:32:05 +0800

05 Nov, 2009

1 commit

c8240bd6f Use Little Endian for Dirty Bitmap

We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested to always use Little Endian, which looks reasonable.

We only have dirty bitmap implemented in Little Endian targets so far
and since PowerPC would be the first Big Endian platform, we can just
as well switch to Little Endian always with little effort without
breaking existing targets.

Signed-off-by: Alexander Graf
Signed-off-by: Benjamin Herrenschmidt

Alexander Graf
2009-11-05 13:50:27 +0800
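
One way to picture the fixed little-endian layout from c8240bd6f is to address the bitmap byte by byte, so that bit N always lands in byte N/8 at bit position N%8 regardless of the host's word order. The helpers below are a hypothetical userspace sketch of that numbering, not the kernel's bit operations.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian bit numbering: bit nr lives in byte nr/8, position nr%8. */
void set_bit_le(uint8_t *bitmap, size_t nr)
{
    bitmap[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

int test_bit_le(const uint8_t *bitmap, size_t nr)
{
    return (bitmap[nr / 8] >> (nr % 8)) & 1;
}
```

Because the layout is defined in bytes, a reader that treats the buffer as an array of u32 or u64 in little-endian order finds bit 33 in the same place on every host, which is the portability problem the commit message describes for big-endian targets.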

16 Oct, 2009

1 commit

0ea4ed8e9 KVM: Prevent kvm_init from corrupting debugfs structures

I'm seeing an oops condition when kvm-intel and kvm-amd are modprobe'd
during boot (say on an Intel system) and then rmmod'd:

# modprobe kvm-intel
kvm_init()
kvm_init_debug()
kvm_arch_init()
Signed-off-by: Marcelo Tosatti

Darrick J. Wong
2009-10-16 23:30:26 +0800

04 Oct, 2009

1 commit

3da0dd433 KVM: add support for change_pte mmu notifiers

This is needed for kvm if it wants ksm to directly map pages into its
shadow page tables.

[marcelo: cast pfn assignment to u64]

Signed-off-by: Izik Eidus
Signed-off-by: Marcelo Tosatti

Izik Eidus
2009-10-04 23:04:53 +0800

02 Oct, 2009

1 commit

828c09509 const: constify remaining file_operations

[akpm@linux-foundation.org: fix KVM]
Signed-off-by: Alexey Dobriyan
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-10-02 07:11:11 +0800

28 Sep, 2009

1 commit

f0f37e2f7 const: mark struct vm_struct_operations

* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-09-28 02:39:25 +0800

24 Sep, 2009

1 commit

79f559977 cpumask: use zalloc_cpumask_var() where possible

Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().

Signed-off-by: Li Zefan
Signed-off-by: Rusty Russell

Li Zefan
2009-09-24 08:04:24 +0800

10 Sep, 2009

6 commits

28bcb1121 KVM: fix compile warnings on s390

CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function '__kvm_set_memory_region':
arch/s390/kvm/../../../virt/kvm/kvm_main.c:485: warning: unused variable 'j'
arch/s390/kvm/../../../virt/kvm/kvm_main.c:484: warning: unused variable 'lpages'
arch/s390/kvm/../../../virt/kvm/kvm_main.c:483: warning: unused variable 'ugfn'

Cc: Carsten Otte
Signed-off-by: Heiko Carstens
Signed-off-by: Marcelo Tosatti

Heiko Carstens
2009-09-10 23:11:11 +0800
6621fbc2c KVM: Move #endif KVM_CAP_IRQ_ROUTING to correct place

The symbol only controls irq routing, not MSI-X.

Signed-off-by: Avi Kivity

Avi Kivity
2009-09-10 15:46:42 +0800
aed665f7b KVM: fix kvm_init() error handling

Remove the debugfs file if kvm_arch_init() returns an error.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2009-09-10 13:33:17 +0800
e601e3be7 KVM: Drop obsolete cpu_get/put in make_all_cpus_request

spin_lock disables preemption, so we can simply read the current cpu.

Signed-off-by: Jan Kiszka
Signed-off-by: Marcelo Tosatti

Jan Kiszka
2009-09-10 13:33:16 +0800
a1b37100d KVM: Reduce runnability interface with arch support code

Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.

Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity

Gleb Natapov
2009-09-10 13:33:13 +0800
d34e6b175 KVM: add ioeventfd support

ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest. Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.

Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.

However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc). For these
patterns, the synchronous call is particularly expensive since we really
only want to get our notification transmitted asynchronously and
return as quickly as possible. All the synchronous infrastructure to ensure
proper data-dependencies are met in the normal IO case is just unnecessary
overhead for signalling. This adds additional computational load on the
system, as well as latency to the signalling path.

Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd. This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and peripheral components.

To test this theory, we built a test-harness called "doorbell". This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled. It supports signalling
from either an eventfd, or an ioctl().

We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl(). The other is direct via
ioeventfd.

You can download this test harness here:

ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2

The measured results are as follows:

qemu-mmio: 110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio: 367300 iops, 2.72us rtt

I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy. However, for now we
can extrapolate from the NULLIO runs (+2.56us for MMIO and -350ns for HC),
which gives:

qemu-pio: 153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt

These are just for fun, for now, until I can gather more data.

Here is a graph for your convenience:

http://developer.novell.com/wiki/images/7/76/Iofd-chart.png

The conclusion to draw is that we save about 4us by skipping the userspace
hop.
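
The measured figures above are consistent with treating the round-trip time as the reciprocal of the throughput, rtt = 1 / iops. The helpers below are a small sketch of that conversion (the function names are hypothetical); they cover the three measured rows, the qemu-pio extrapolation from the 2.56us MMIO offset, and the roughly 4us saving.

```c
/* Round-trip time in microseconds for a given operations-per-second rate. */
double rtt_us(double iops)
{
    return 1e6 / iops;
}

/* Inverse: operations per second for a given round-trip time in microseconds. */
double iops_from_rtt_us(double us)
{
    return 1e6 / us;
}

/* Time saved per operation, in microseconds, between two throughput figures. */
double saving_us(double slow_iops, double fast_iops)
{
    return rtt_us(slow_iops) - rtt_us(fast_iops);
}
```

For example, rtt_us(110000) is about 9.09, rtt_us(200100) is about 5.00, and their difference of about 4.09 is the saving quoted in the conclusion; iops_from_rtt_us(9.09 - 2.56) is about 153139, matching the qemu-pio estimate.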

--------------------

Signed-off-by: Gregory Haskins
Acked-by: Michael S. Tsirkin
Signed-off-by: Avi Kivity

Gregory Haskins
2009-09-10 13:33:12 +0800
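
The eventfd side of the mechanism in d34e6b175 can be tried from plain userspace. The sketch below is an illustration only and assumes a Linux host: it uses the standard eventfd(2) call, with a userspace write() standing in for the signal that KVM would raise when the guest writes to a registered address. Registering the descriptor with a VM (the KVM_IOEVENTFD ioctl) is deliberately not shown, since it needs a real VM file descriptor. The function name ring_and_drain is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Signal an eventfd `kicks` times, then read it once.
 * Returns the accumulated counter value, or -1 on any error
 * (including kicks == 0, where the non-blocking read finds nothing).
 */
int64_t ring_and_drain(unsigned int kicks)
{
    uint64_t one = 1, total = 0;
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0)
        return -1;

    /* Each write adds to the 64-bit counter held by the kernel. */
    for (unsigned int i = 0; i < kicks; i++) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }

    /* A single read returns the sum of all signals and resets the counter. */
    if (read(fd, &total, sizeof(total)) != (ssize_t)sizeof(total)) {
        close(fd);
        return -1;
    }

    close(fd);
    return (int64_t)total;
}
```

Several signals collapse into one counter value, so a consumer that wakes up late still learns how many kicks it missed without a queue of individual events.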