17 Nov, 2011

4 commits

  • KVM on s390 always had a sync mmu. Any mapping change in userspace
    mapping was always reflected immediately in the guest mapping.
    - In older code the guest mapping was just an offset
    - In newer code the last level page table is shared

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Christian Borntraeger
     
  • There is a potential host deadlock in the tprot intercept handling.
    We must not hold the mmap semaphore while resolving the guest
    address. If userspace is remapping, then the memory detection in
    the guest is broken anyway so we can safely separate the
    address translation from walking the vmas.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Christian Borntraeger
     
  • SIGP sense running may cause an intercept on higher level
    virtualization, so handle it by checking the CPUSTAT_RUNNING flag.

    Signed-off-by: Cornelia Huck
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Cornelia Huck
     
  • CPUSTAT_RUNNING was implemented signifying that a vcpu is not stopped.
    This is not, however, what the architecture says: RUNNING should be
    set when the host is acting on the behalf of the guest operating
    system.

    CPUSTAT_RUNNING has been changed to be set in kvm_arch_vcpu_load()
    and to be unset in kvm_arch_vcpu_put().

    For signifying stopped state of a vcpu, a host-controlled bit has
    been used and is set/unset basically in reverse to the old
    CPUSTAT_RUNNING bit (including pushing it down into stop handling
    proper in handle_stop()).

    Cc: stable@kernel.org
    Signed-off-by: Cornelia Huck
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Cornelia Huck
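
The split between the two bits can be illustrated with a small user-space sketch. The struct `toy_vcpu`, the `toy_*` helpers, and the flag values below are hypothetical stand-ins chosen for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative flag values; assumptions for this sketch only. */
#define TOY_CPUSTAT_RUNNING 0x1u /* host is acting on behalf of the guest */
#define TOY_CPUSTAT_STOPPED 0x2u /* host-controlled "vcpu is stopped" bit */

struct toy_vcpu {
    atomic_uint cpuflags;
};

/* Analogue of kvm_arch_vcpu_load(): vcpu is put onto a host cpu. */
static void toy_vcpu_load(struct toy_vcpu *v)
{
    atomic_fetch_or(&v->cpuflags, TOY_CPUSTAT_RUNNING);
}

/* Analogue of kvm_arch_vcpu_put(): vcpu leaves the host cpu. */
static void toy_vcpu_put(struct toy_vcpu *v)
{
    atomic_fetch_and(&v->cpuflags, ~TOY_CPUSTAT_RUNNING);
}

/* Stop handling sets the separate stopped bit. */
static void toy_vcpu_stop(struct toy_vcpu *v)
{
    atomic_fetch_or(&v->cpuflags, TOY_CPUSTAT_STOPPED);
}

/* Restarting the vcpu clears the stopped bit again. */
static void toy_vcpu_start(struct toy_vcpu *v)
{
    atomic_fetch_and(&v->cpuflags, ~TOY_CPUSTAT_STOPPED);
}

/* What a "sense running" style query would look at. */
static bool toy_vcpu_is_running(struct toy_vcpu *v)
{
    return (atomic_load(&v->cpuflags) & TOY_CPUSTAT_RUNNING) != 0;
}

static bool toy_vcpu_is_stopped(struct toy_vcpu *v)
{
    return (atomic_load(&v->cpuflags) & TOY_CPUSTAT_STOPPED) != 0;
}
```

In this model the two bits are independent: a vcpu can be loaded (running) and stopped at the same time, which the old single-bit scheme could not express.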
     

01 Nov, 2011

1 commit

  • * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (54 commits)
    [S390] Remove error checking from copy_oldmem_page()
    [S390] qdio: prevent dsci access without adapter interrupts
    [S390] irqstats: split IPI interrupt accounting
    [S390] add missing __tlb_flush_global() for !CONFIG_SMP
    [S390] sparse: fix sparse symbol shadow warning
    [S390] sparse: fix sparse NULL pointer warnings
    [S390] sparse: fix sparse warnings with __user pointers
    [S390] sparse: fix sparse warnings in math-emu
    [S390] sparse: fix sparse warnings about missing prototypes
    [S390] sparse: fix sparse ANSI-C warnings
    [S390] sparse: fix sparse static warnings
    [S390] sparse: fix access past end of array warnings
    [S390] dasd: prevent path verification before resume
    [S390] qdio: remove multicast polling
    [S390] qdio: reset outbound SBAL error states
    [S390] qdio: EQBS retry after CCQ 96
    [S390] qdio: add timestamp for last queue scan time
    [S390] Introduce get_clock_fast()
    [S390] kvm: Handle diagnose 0x10 (release pages)
    [S390] take mmap_sem when walking guest page table
    ...

    Linus Torvalds
     

30 Oct, 2011

5 commits

  • Linux on System z uses a ballooner based on diagnose 0x10 (also known as
    collaborative memory management). This patch implements diagnose
    0x10 on the guest address space.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     
  • Implement sigp external call, which might be required for guests that
    issue an external call instead of an emergency signal for IPI.

    This fixes an issue with "KVM: unknown SIGP: 0x02" when booting
    such an SMP guest.

    Signed-off-by: Christian Ehrhardt
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Marcelo Tosatti

    Christian Ehrhardt
     
  • KVM common code does vcpu_load prior to calling our arch ioctls and
    vcpu_put after we're done here. Via the kvm_arch_vcpu_load/put
    callbacks we do load the fpu and access register state into the
    processor, which saves us moving the state on every SIE exit the
    kernel handles. However this breaks register setting from userspace,
    because of the following sequence:
    1a. vcpu load stores userspace register content
    1b. vcpu load loads guest register content
    2. kvm_arch_vcpu_ioctl_set_fpu/sregs updates saved guest register content
    3a. vcpu put stores the guest registers and overwrites the new content
    3b. vcpu put loads the userspace register set again

    This patch loads the new guest register state into the cpu, so that the correct
    (new) set of guest registers will be stored in step 3a.

    Signed-off-by: Carsten Otte
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Marcelo Tosatti

    Carsten Otte
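
The sequence in steps 1a to 3b can be replayed with a toy model in which a single value stands in for the whole fpu/access register file. All `toy_*` names are hypothetical; this is a sketch of the ordering problem, not the kernel code.

```c
#include <assert.h>

/* One "register" stands in for the fpu/access register file. */
struct toy_cpu {
    unsigned long reg;          /* content of the physical cpu register */
};

struct toy_vcpu {
    unsigned long host_saved;   /* userspace content, stored at load */
    unsigned long guest_saved;  /* guest content, stored at put */
};

static void toy_vcpu_load(struct toy_cpu *cpu, struct toy_vcpu *v)
{
    v->host_saved = cpu->reg;   /* 1a: store userspace register content */
    cpu->reg = v->guest_saved;  /* 1b: load guest register content */
}

static void toy_vcpu_put(struct toy_cpu *cpu, struct toy_vcpu *v)
{
    v->guest_saved = cpu->reg;  /* 3a: store guest registers */
    cpu->reg = v->host_saved;   /* 3b: load userspace registers again */
}

/* Step 2 before the fix: only the saved copy is updated. */
static void toy_set_reg_buggy(struct toy_cpu *cpu, struct toy_vcpu *v,
                              unsigned long val)
{
    (void)cpu;
    v->guest_saved = val;
}

/* Step 2 with the fix: the new value is also loaded into the cpu. */
static void toy_set_reg_fixed(struct toy_cpu *cpu, struct toy_vcpu *v,
                              unsigned long val)
{
    v->guest_saved = val;
    cpu->reg = val;
}

/* Runs load, set, put as an ioctl would; returns the resulting guest value. */
static unsigned long toy_ioctl_set(void (*set)(struct toy_cpu *,
                                               struct toy_vcpu *,
                                               unsigned long),
                                   unsigned long old, unsigned long val)
{
    struct toy_cpu cpu = { .reg = 0xdead };   /* userspace content */
    struct toy_vcpu v = { .host_saved = 0, .guest_saved = old };

    toy_vcpu_load(&cpu, &v);
    set(&cpu, &v, val);
    toy_vcpu_put(&cpu, &v);
    return v.guest_saved;
}
```

With the buggy setter, `toy_ioctl_set()` returns the old value because step 3a overwrites the update; with the fixed setter it returns the new value.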
     
  • This patch fixes the return value of kvm_arch_init_vm in case a memory
    allocation goes wrong.

    Signed-off-by: Carsten Otte
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Marcelo Tosatti

    Carsten Otte
     
  • We use the cpu id provided by userspace as array index here. Thus we
    clearly need to check it first. Ooops.

    CC:
    Signed-off-by: Carsten Otte
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Marcelo Tosatti

    Carsten Otte
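
A minimal sketch of the kind of check this calls for, assuming a fixed-size per-VM array indexed by the userspace-supplied id. The names `toy_vm`, `toy_vcpu`, `toy_register_vcpu` and the limit `TOY_MAX_VCPUS` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 64   /* assumed array size for this sketch */

struct toy_vcpu {
    unsigned int id;
};

struct toy_vm {
    struct toy_vcpu *vcpus[TOY_MAX_VCPUS];
};

/* Returns 0 on success or a negative errno value. */
static int toy_register_vcpu(struct toy_vm *vm, struct toy_vcpu *vcpu,
                             unsigned int id)
{
    /* Validate the untrusted id before it is used as an array index. */
    if (id >= TOY_MAX_VCPUS)
        return -EINVAL;
    if (vm->vcpus[id] != NULL)
        return -EEXIST;

    vcpu->id = id;
    vm->vcpus[id] = vcpu;
    return 0;
}
```

Taking the id as `unsigned int` means a single upper-bound comparison also rejects values that would be negative in a signed type.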
     

20 Sep, 2011

2 commits

  • 598841ca9919d008b520114d8a4378c4ce4e40a1 ([S390] use gmap address
    spaces for kvm guest images) changed kvm on s390 to use a separate
    address space for kvm guests. We can now put KVM guests anywhere
    in the user address mode with a size up to 8PB - as long as the
    memory is 1MB-aligned. This change was done without KVM extension
    capability bit.
    The change was added after 3.0, but we still have a chance to add
    a feature bit before 3.1 (keeping the releases in a sane state).
    We use number 71 to avoid collisions with other pending kvm patches
    as requested by Alexander Graf.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Cc: Alexander Graf
    Signed-off-by: Heiko Carstens

    Christian Borntraeger
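
The placement constraints named above (1MB alignment, size up to 8PB) can be written as a small predicate. This is a sketch only: `toy_guest_mem_ok` is a hypothetical helper, and requiring the size to be 1MB-aligned as well is an assumption of this example, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SEG_MASK ((UINT64_C(1) << 20) - 1)  /* 1 MB alignment mask */
#define TOY_MAX_SIZE (UINT64_C(1) << 53)        /* 8 PB = 8 * 2^50 bytes */

/* Checks whether a guest memory area satisfies the stated constraints. */
static bool toy_guest_mem_ok(uint64_t user_addr, uint64_t size)
{
    if (user_addr & TOY_SEG_MASK)
        return false;               /* start must be 1 MB aligned */
    if (size & TOY_SEG_MASK)
        return false;               /* assumption: size aligned as well */
    return size != 0 && size <= TOY_MAX_SIZE;
}
```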
     
  • 598841ca9919d008b520114d8a4378c4ce4e40a1 ([S390] use gmap address
    spaces for kvm guest images) changed kvm to use a separate address
    space for kvm guests. This address space was switched in __vcpu_run.
    In some cases (preemption, page fault) there is the possibility that
    this address space switch is lost.
    The typical symptom was a huge amount of validity intercepts or
    random guest addressing exceptions.
    Fix this by doing the switch in sie_loop and sie_exit and saving the
    address space in the gmap structure itself. Also use the preempt
    notifier.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Signed-off-by: Heiko Carstens

    Christian Borntraeger
     

24 Jul, 2011

6 commits

  • SIGP emerg needs to pass the source vcpu address into __LC_CPU_ADDRESS of the
    target guest.

    Signed-off-by: Christian Ehrhardt
    Signed-off-by: Martin Schwidefsky

    Christian Ehrhardt
     
  • This patch removes the mmu reload logic for kvm on s390. Via Martin's
    new gmap interface, we can safely add or remove memory slots while
    guest CPUs are in-flight. Thus, the mmu reload logic is not needed
    anymore.

    Signed-off-by: Carsten Otte
    Signed-off-by: Martin Schwidefsky

    Carsten Otte
     
  • This patch removes kvm-s390 internal assumption of a linear mapping
    of guest address space to user space. Previously, guest memory was
    translated to user addresses using a fixed offset (gmsor). The new
    code uses gmap_fault to resolve guest addresses.

    Signed-off-by: Carsten Otte
    Signed-off-by: Martin Schwidefsky

    Carsten Otte
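
The difference between the two translation schemes can be sketched in user space. The per-segment table below is a toy stand-in for a lookup-based mapping and does not reflect how gmap_fault is implemented; all `toy_*` names and the 1 MB segment granularity used here are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SEG_SHIFT 20                              /* 1 MB segments */
#define TOY_SEG_SIZE  (UINT64_C(1) << TOY_SEG_SHIFT)
#define TOY_NR_SEGS   8                               /* tiny toy guest */
#define TOY_UNMAPPED  UINT64_MAX

/* Old scheme: guest memory is one linear block at a fixed offset. */
static uint64_t toy_linear_translate(uint64_t guest_addr, uint64_t origin)
{
    return guest_addr + origin;
}

/* New scheme (toy): every guest segment is mapped independently. */
struct toy_gmap {
    uint64_t seg_to_user[TOY_NR_SEGS];
};

static void toy_gmap_init(struct toy_gmap *g)
{
    for (int i = 0; i < TOY_NR_SEGS; i++)
        g->seg_to_user[i] = TOY_UNMAPPED;
}

/* Maps one guest segment to a user address; both must be segment aligned. */
static int toy_gmap_map(struct toy_gmap *g, uint64_t guest_addr,
                        uint64_t user_addr)
{
    uint64_t seg = guest_addr >> TOY_SEG_SHIFT;

    if ((guest_addr | user_addr) & (TOY_SEG_SIZE - 1))
        return -1;
    if (seg >= TOY_NR_SEGS)
        return -1;
    g->seg_to_user[seg] = user_addr;
    return 0;
}

/* Resolves a guest address by lookup; TOY_UNMAPPED if nothing is mapped. */
static uint64_t toy_gmap_translate(const struct toy_gmap *g,
                                   uint64_t guest_addr)
{
    uint64_t seg = guest_addr >> TOY_SEG_SHIFT;

    if (seg >= TOY_NR_SEGS || g->seg_to_user[seg] == TOY_UNMAPPED)
        return TOY_UNMAPPED;
    return g->seg_to_user[seg] + (guest_addr & (TOY_SEG_SIZE - 1));
}
```

With the lookup, adjacent guest segments no longer need to be adjacent in the user address space, which is what removes the linear-mapping assumption.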
     
  • This patch switches kvm from using (Qemu's) user address space to
    Martin's gmap address space. This way QEMU does not have to use a
    linker script in order to fit large guests at low addresses in its
    address space.

    Signed-off-by: Carsten Otte
    Signed-off-by: Martin Schwidefsky

    Carsten Otte
     
  • The entry to / exit from sie has subtle dependencies to the first level
    interrupt handler. Move the sie assembler code to entry64.S and replace
    the SIE_HOOK callback with a test and the new _TIF_SIE bit.
    In addition this patch fixes several problems in regard to the check for
    the _TIF_EXIT_SIE bits. The old code checked the TIF bits before executing
    the interrupt handler and it only modified the instruction address if it
    pointed directly to the sie instruction. In both cases it could miss
    a TIF bit that normally would cause an exit from the guest and would
    reenter the guest context.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • When running a kvm guest we can get intercepts for tprot, if the host
    page table is read-only or not populated. This patch implements the
    most common case (linux memory detection).
    This also allows host copy on write for guest memory on newer systems.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     

23 Jul, 2011

1 commit

  • virtio has been so far used only in the context of virtualization,
    and the virtio Kconfig was sourced directly by the relevant arch
    Kconfigs when VIRTUALIZATION was selected.

    Now that we start using virtio for inter-processor communications,
    we need to source the virtio Kconfig outside of the virtualization
    scope too.

    Moreover, some architectures might use virtio for both virtualization
    and inter-processor communications, so directly sourcing virtio
    might yield unexpected results due to conflicting selections.

    The simple solution offered by this patch is to always source virtio's
    Kconfig in drivers/Kconfig, and remove it from the appropriate arch
    Kconfigs. Additionally, a virtio menu entry has been added so virtio
    drivers don't show up in the general drivers menu.

    This way anyone can use virtio, though it's arguably less accessible
    (and neat!) for virtualization users now.

    Note: some architectures (mips and sh) seem to have a VIRTUALIZATION
    menu merely for sourcing virtio's Kconfig, so that menu is removed too.

    Signed-off-by: Ohad Ben-Cohen
    Signed-off-by: Rusty Russell

    Ohad Ben-Cohen
     

06 Jun, 2011

2 commits

  • Currently KVM masks out the known good facilities only for the first
    double word, but passes the 2nd double word without filtering. This
    breaks some code on newer systems:

    [ 0.593966] ------------[ cut here ]------------
    [ 0.594086] WARNING: at arch/s390/oprofile/hwsampler.c:696
    [ 0.594213] Modules linked in:
    [ 0.594321] Modules linked in:
    [ 0.594439] CPU: 0 Not tainted 3.0.0-rc1 #46
    [ 0.594564] Process swapper (pid: 1, task: 00000001effa8038, ksp: 00000001effafab8)
    [ 0.594735] Krnl PSW : 0704100180000000 00000000004ab89a (hwsampler_setup+0x75a/0x7b8)
    [ 0.594910] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3
    [ 0.595120] Krnl GPRS: ffffffff00000000 00000000ffffffea ffffffffffffffea 00000000004a98f8
    [ 0.595351] 00000000004aa002 0000000000000001 000000000080e720 000000000088b9f8
    [ 0.595522] 000000000080d3e8 0000000000000000 0000000000000000 000000000080e464
    [ 0.595725] 0000000000000000 00000000005db198 00000000004ab3a2 00000001effafd98
    [ 0.595901] Krnl Code: 00000000004ab88c: c0e5000673ca brasl %r14,57a020
    [ 0.596071] 00000000004ab892: a7f4fc77 brc 15,4ab180
    [ 0.596276] 00000000004ab896: a7f40001 brc 15,4ab898
    [ 0.596454] >00000000004ab89a: a7c8ffa1 lhi %r12,-95
    [ 0.596657] 00000000004ab89e: a7f4fc71 brc 15,4ab180
    [ 0.596854] 00000000004ab8a2: a7f40001 brc 15,4ab8a4
    [ 0.597029] 00000000004ab8a6: a7f4ff22 brc 15,4ab6ea
    [ 0.597230] 00000000004ab8aa: c0200011009a larl %r2,6cb9de
    [ 0.597441] Call Trace:
    [ 0.597511] ([] hwsampler_setup+0x262/0x7b8)
    [ 0.597676] [] oprofile_arch_init+0x32/0xd0
    [ 0.597834] [] oprofile_init+0x28/0x74
    [ 0.597991] [] do_one_initcall+0x3a/0x170
    [ 0.598151] [] kernel_init+0x142/0x1ec
    [ 0.598314] [] kernel_thread_starter+0x6/0xc
    [ 0.598468] [] kernel_thread_starter+0x0/0xc
    [ 0.598606] Last Breaking-Event-Address:
    [ 0.598707] [] hwsampler_setup+0x756/0x7b8
    [ 0.598863] ---[ end trace ce3179037f4e3e5b ]---

    So let's also mask the 2nd double word. Facilities 66,76,76,77 should be fine.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
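
A sketch of filtering a two-doubleword facility list against an allow mask. It assumes the usual numbering in which facility bit 0 is the leftmost bit of the first doubleword; the `toy_*` helpers and the mask contents are hypothetical and do not reproduce the kernel's actual mask values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FAC_DWORDS 2

/* Facility nr lives in doubleword nr / 64, counted from the leftmost bit. */
static uint64_t toy_fac_bit(unsigned int nr)
{
    return UINT64_C(1) << (63 - (nr % 64));
}

/* Adds one facility number to the allow mask. */
static void toy_fac_allow(uint64_t mask[TOY_FAC_DWORDS], unsigned int nr)
{
    assert(nr < 64 * TOY_FAC_DWORDS);
    mask[nr / 64] |= toy_fac_bit(nr);
}

/*
 * Filters every doubleword of the list. Masking only list[0] and passing
 * list[1] through unchanged corresponds to the behaviour being fixed.
 */
static void toy_fac_filter(uint64_t list[TOY_FAC_DWORDS],
                           const uint64_t mask[TOY_FAC_DWORDS])
{
    for (size_t i = 0; i < TOY_FAC_DWORDS; i++)
        list[i] &= mask[i];
}

/* Returns 1 if facility nr is set in the list, 0 otherwise. */
static int toy_fac_test(const uint64_t list[TOY_FAC_DWORDS], unsigned int nr)
{
    return (list[nr / 64] & toy_fac_bit(nr)) != 0;
}
```

After filtering, any facility in the second doubleword that is not in the mask reads as absent to the guest, so guest code does not try to use features the host does not virtualize.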
     
  • commit 9ff4cfb3fcfd48b49fdd9be7381b3be340853aa4 ([S390] kvm-390: Let
    kernel exit SIE instruction on work) fixed a problem of commit
    cd3b70f5d4d82f85d1e1d6e822f38ae098cf7c72 ([S390] virtualization
    aware cpu measurement) but uncovered another one.

    If a kvm guest accesses guest real memory that doesn't exist, the
    page fault handler calls the sie hook, which then rewrites
    the return psw from sie_inst to either sie_exit or sie_reenter.
    On return, the page fault handler will then detect the wrong access
    as a kernel fault causing a kernel oops in sie_reenter or sie_exit.

    We have to add these two addresses to the exception table to allow
    graceful exits.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     

20 Apr, 2011

1 commit

  • From: Christian Borntraeger

    This patch fixes the sie exit on interrupts. The low level
    interrupt handler returns to the PSW address in pt_regs and not
    to the PSW address in the lowcore.
    Without this fix a cpu bound guest might never leave guest state
    since the host interrupt handler would blindly return to the
    SIE instruction, even on need_resched and friends.

    Cc: stable@kernel.org
    Signed-off-by: Carsten Otte
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Carsten Otte
     

31 Mar, 2011

1 commit


17 Mar, 2011

1 commit


14 Jan, 2011

1 commit

  • * 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits)
    KVM: Initialize fpu state in preemptible context
    KVM: VMX: when entering real mode align segment base to 16 bytes
    KVM: MMU: handle 'map_writable' in set_spte() function
    KVM: MMU: audit: allow audit more guests at the same time
    KVM: Fetch guest cr3 from hardware on demand
    KVM: Replace reads of vcpu->arch.cr3 by an accessor
    KVM: MMU: only write protect mappings at pagetable level
    KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear()
    KVM: MMU: Initialize base_role for tdp mmus
    KVM: VMX: Optimize atomic EFER load
    KVM: VMX: Add definitions for more vm entry/exit control bits
    KVM: SVM: copy instruction bytes from VMCB
    KVM: SVM: implement enhanced INVLPG intercept
    KVM: SVM: enhance mov DR intercept handler
    KVM: SVM: enhance MOV CR intercept handler
    KVM: SVM: add new SVM feature bit names
    KVM: cleanup emulate_instruction
    KVM: move complete_insn_gp() into x86.c
    KVM: x86: fix CR8 handling
    KVM guest: Fix kvm clock initialization when it's configured out
    ...

    Linus Torvalds
     

12 Jan, 2011

1 commit

  • IA64 support forces us to abstract the allocation of the kvm structure.
    But instead of mixing this up with arch-specific initialization and
    doing the same on destruction, split both steps. This allows moving
    generic destruction calls into generic code.

    It also fixes error clean-up on failures of kvm_create_vm for IA64.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
     

05 Jan, 2011

1 commit


25 Oct, 2010

2 commits


01 Aug, 2010

6 commits


09 Jun, 2010

2 commits

  • The containing function is called from several places. At one of them, in
    the function __sigp_stop, the spin lock &fi->lock is held.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @gfp exists@
    identifier fn;
    position p;
    @@

    fn(...) {
    ... when != spin_unlock
    when any
    GFP_KERNEL@p
    ... when any
    }

    @locked@
    identifier gfp.fn;
    @@

    spin_lock(...)
    ... when != spin_unlock
    fn(...)

    @depends on locked@
    position gfp.p;
    @@

    - GFP_KERNEL@p
    + GFP_ATOMIC
    //

    Signed-off-by: Julia Lawall
    Acked-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Julia Lawall
     
  • Add missing GFP flag to memory allocations. The part in cio only
    changes a comment.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

27 May, 2010

1 commit

  • This config option enables or disables three single instructions
    which aren't expensive. This is too fine grained.
    Besides that, everybody who uses kvm would enable it anyway in order
    to debug performance problems.
    Just remove it.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

22 May, 2010

1 commit

  • * 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits)
    KVM: x86: Add missing locking to arch specific vcpu ioctls
    KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls
    KVM: MMU: Segregate shadow pages with different cr0.wp
    KVM: x86: Check LMA bit before set_efer
    KVM: Don't allow lmsw to clear cr0.pe
    KVM: Add cpuid.txt file
    KVM: x86: Tell the guest we'll warn it about tsc stability
    x86, paravirt: don't compute pvclock adjustments if we trust the tsc
    x86: KVM guest: Try using new kvm clock msrs
    KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID
    KVM: x86: add new KVMCLOCK cpuid feature
    KVM: x86: change msr numbers for kvmclock
    x86, paravirt: Add a global synchronization point for pvclock
    x86, paravirt: Enable pvclock flags in vcpu_time_info structure
    KVM: x86: Inject #GP with the right rip on efer writes
    KVM: SVM: Don't allow nested guest to VMMCALL into host
    KVM: x86: Fix exception reinjection forced to true
    KVM: Fix wallclock version writing race
    KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots
    KVM: VMX: enable VMXON check with SMX enabled (Intel TXT)
    ...

    Linus Torvalds
     

19 May, 2010

1 commit