15 Dec, 2014

1 commit


13 Dec, 2014

3 commits

  • When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
    of the VM should really be turned off, as the PSCI spec suggests; it is
    also the sane thing to do.

    Also, clarify the behavior and expectations for exits to user space with
    the KVM_EXIT_SYSTEM_EVENT case.

    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
     
  • It is not clear that this ioctl can be called multiple times for a given
    vcpu. Userspace already does this, so clarify the ABI.

    Also specify that userspace is expected to always make secondary and
    subsequent calls to the ioctl with the same parameters for the VCPU as
    the initial call (which userspace also already does).

    Add code to check that userspace doesn't violate that ABI in the future,
    and move the kvm_vcpu_set_target() function which is currently
    duplicated between the 32-bit and 64-bit versions in guest.c to a common
    static function in arm.c, shared between both architectures.

    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
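
The ABI rule above (repeated KVM_ARM_VCPU_INIT calls must carry the same parameters as the first call) can be illustrated with a toy model. This is a sketch only, not the kernel code: the class name, the feature strings, and the choice of -EINVAL as the error are assumptions made for illustration.

```python
import errno


class VcpuInitModel:
    """Toy model of a vcpu that remembers its first KVM_ARM_VCPU_INIT call."""

    def __init__(self):
        self.target = None      # nothing recorded until the first init call
        self.features = None

    def vcpu_init(self, target, features):
        """Return 0 on success, or -EINVAL (assumed) if the parameters differ
        from those of the initial call."""
        features = frozenset(features)
        if self.target is not None:
            if (target, features) != (self.target, self.features):
                return -errno.EINVAL
        self.target, self.features = target, features
        return 0
```

Calling `vcpu_init` twice with identical arguments succeeds both times in this model; changing either the target or the feature set on a later call is rejected.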
     
  • The implementation of KVM_ARM_VCPU_INIT is currently not doing what
    userspace expects, namely making sure that a vcpu which may have been
    turned off using PSCI is returned to its initial state, which would be
    powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

    Implement the expected functionality and clarify the ABI.

    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
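
The power-state behaviour described in these commits can be sketched as a small state model. All names here (`VmModel`, `psci_system_off`, the `power_off` argument standing in for the KVM_ARM_VCPU_POWER_OFF flag) are hypothetical; the real logic lives in the kernel and is more involved.

```python
class VcpuState:
    def __init__(self):
        self.powered_off = True   # arbitrary pre-init state in this model


class VmModel:
    """Toy VM tracking only the powered-on/off state of each vcpu."""

    def __init__(self, nr_vcpus):
        self.vcpus = [VcpuState() for _ in range(nr_vcpus)]

    def vcpu_init(self, index, power_off=False):
        # Re-running init returns the vcpu to its initial state: powered on
        # unless the caller passes the (modelled) POWER_OFF flag.
        self.vcpus[index].powered_off = power_off

    def psci_system_off(self):
        # SYSTEM_OFF / SYSTEM_RESET: turn off every vcpu of the VM, then
        # report the exit reason that user space would see.
        for vcpu in self.vcpus:
            vcpu.powered_off = True
        return "KVM_EXIT_SYSTEM_EVENT"
```

In this model, a vcpu that was turned off by `psci_system_off` comes back powered on after a plain `vcpu_init`, which is the behaviour the commit says userspace expects.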
     

20 Nov, 2014

1 commit


07 Nov, 2014

2 commits


03 Nov, 2014

2 commits

  • No kernel ever reported KVM_CAP_DEVICE_MSIX, KVM_CAP_DEVICE_MSI,
    KVM_CAP_DEVICE_ASSIGNMENT, KVM_CAP_DEVICE_DEASSIGNMENT.

    This makes the documentation wrong, and no application ever
    written to use these capabilities has a chance to work correctly.
    The only way to detect support is to try, and test errno for ENOTTY.
    That's unfortunate, but we can't fix the past.

    Document the actual semantics, and drop the definitions from
    the exported header to make it easier for application
    developers to note and fix the bug.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Paolo Bonzini

    Michael S. Tsirkin
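
The "try it and test errno for ENOTTY" detection described above might look like the following sketch. The helper name `probe_ioctl` and its injectable `ioctl` parameter are assumptions added for illustration and testing; the request number must be supplied by the caller and is not defined here. Note that probing by calling a real ioctl can have side effects if the call succeeds.

```python
import errno
import fcntl


def probe_ioctl(fd, request, arg=0, ioctl=fcntl.ioctl):
    """Return False if the kernel answers ENOTTY (ioctl unknown), else True.

    Any other error is taken to mean the ioctl exists but this particular
    call failed, which still counts as "supported" for detection purposes.
    """
    try:
        ioctl(fd, request, arg)
    except OSError as exc:
        if exc.errno == errno.ENOTTY:
            return False
        return True
    return True
```

The `ioctl` parameter defaults to `fcntl.ioctl`; passing a fake callable makes the three outcomes (success, ENOTTY, other error) easy to exercise without a KVM file descriptor.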
     
  • Commit 6adba5274206 (KVM: Let host know whether the guest can
    handle async PF in non-userspace context.) left bit 2 reserved, so
    it should still be zero. It is bit 1 that, when set, indicates that
    asynchronous page faults can be injected when the vcpu is in
    cpl == 0. See kvm_para.h:

    #define KVM_ASYNC_PF_SEND_ALWAYS (1 << 1)

    Signed-off-by: Tiejun Chen
    Signed-off-by: Paolo Bonzini

    Tiejun Chen
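
The bit layout described in this commit can be written out as a short sketch. Only `KVM_ASYNC_PF_SEND_ALWAYS` is quoted in the commit message; the enable bit at position 0 and the helper name `async_pf_flags` are assumptions for illustration.

```python
KVM_ASYNC_PF_ENABLED = 1 << 0       # assumption: enable flag in bit 0
KVM_ASYNC_PF_SEND_ALWAYS = 1 << 1   # quoted in the commit message
ASYNC_PF_RESERVED_BIT2 = 1 << 2     # reserved, must stay zero


def async_pf_flags(enabled, send_always):
    """Compose the low flag bits, leaving the reserved bit 2 clear."""
    value = 0
    if enabled:
        value |= KVM_ASYNC_PF_ENABLED
    if send_always:
        value |= KVM_ASYNC_PF_SEND_ALWAYS
    return value
```

With both flags requested the result is `0b011`: bit 1 carries the "inject even at cpl == 0" meaning and bit 2 is never set.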
     

27 Sep, 2014

1 commit

  • …marm/kvmarm into kvm-next

    Changes for KVM for arm/arm64 for 3.18

    This includes a bunch of changes:
    - Support read-only memory slots on arm/arm64
    - Various changes to fix Sparse warnings
    - Correctly detect write vs. read Stage-2 faults
    - Various VGIC cleanups and fixes
    - Dynamic VGIC data structure sizing
    - Fix SGI set_clear_pend offset bug
    - Fix VTTBR_BADDR Mask
    - Correctly report the FSC on Stage-2 faults

    Conflicts:
    virt/kvm/eventfd.c
    [duplicate, different patch where the kvm-arm version broke x86.
    The kvm tree instead has the right one]

    Paolo Bonzini
     

22 Sep, 2014

1 commit


19 Sep, 2014

1 commit


10 Sep, 2014

2 commits

  • It looks like when this was initially merged it got accidentally included
    in the following section. I've just moved it back in the correct section
    and re-numbered it as other ioctls have been added since.

    Signed-off-by: Alex Bennée
    Acked-by: Borislav Petkov
    Signed-off-by: Paolo Bonzini

    Alex Bennée
     
  • In preparation for working on the ARM implementation I noticed the debug
    interface was missing from the API document. I've pieced together the
    expected behaviour from the code and commit messages and written it up
    as best I can.

    Signed-off-by: Alex Bennée
    Signed-off-by: Paolo Bonzini

    Alex Bennée
     

03 Sep, 2014

1 commit

  • vcpu exits and memslot mutations can run concurrently as long as the
    vcpu does not acquire the slots mutex. Thus it is theoretically possible
    for memslots to change underneath a vcpu that is handling an exit.

    If we increment the memslot generation number again after
    synchronize_srcu_expedited(), vcpus can safely cache memslot generation
    without maintaining a single rcu_dereference through an entire vm exit.
    And much of the x86/kvm code does not maintain a single rcu_dereference
    of the current memslots during each exit.

    We can prevent the following case:

    vcpu (CPU 0)                                | thread (CPU 1)
    --------------------------------------------+--------------------------
    1 vm exit                                   |
    2 srcu_read_unlock(&kvm->srcu)              |
    3 decide to cache something based on        |
      old memslots                              |
    4                                           | change memslots
                                                | (increments generation)
    5                                           | synchronize_srcu(&kvm->srcu);
    6 retrieve generation # from new memslots   |
    7 tag cache with new memslot generation     |
    8 srcu_read_unlock(&kvm->srcu)              |
    ...                                         |
                                                |
    ...                                         |
                                                |
                                                |

    By incrementing the generation after synchronizing with kvm->srcu readers,
    we ensure that the generation retrieved in (6) will become invalid soon
    after (8).

    Keeping the existing increment is not strictly necessary, but we
    do keep it and just move it for consistency from update_memslots to
    install_new_memslots. It invalidates old cached MMIOs immediately,
    instead of having to wait for the end of synchronize_srcu_expedited,
    which makes the code more clearly correct in case CPU 1 is preempted
    right after synchronize_srcu() returns.

    To avoid halving the generation space in SPTEs, always presume that the
    low bit of the generation is zero when reconstructing a generation number
    out of an SPTE. This effectively disables MMIO caching in SPTEs during
    the call to synchronize_srcu_expedited. Using the low bit this way is
    somewhat like a seqcount---where the protected thing is a cache, and
    instead of retrying we can simply punt if we observe the low bit to be 1.

    Cc: stable@vger.kernel.org
    Signed-off-by: David Matlack
    Reviewed-by: Xiao Guangrong
    Reviewed-by: David Matlack
    Signed-off-by: Paolo Bonzini

    David Matlack
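
The double-increment scheme described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the x86 MMU code: `MemslotsModel`, `MmioCacheModel`, and `update_memslots` are hypothetical names, and the cache stores the generation with its low bit dropped to mirror "always presume that the low bit of the generation is zero".

```python
class MemslotsModel:
    def __init__(self):
        self.generation = 0


class MmioCacheModel:
    """Cache entry tagged with a generation whose low bit is not stored."""

    def __init__(self):
        self.stored = None

    def tag(self, generation):
        self.stored = generation >> 1          # low bit is discarded

    def is_valid(self, current_generation):
        if self.stored is None:
            return False
        # Reconstruct with the low bit presumed zero, then compare.
        return (self.stored << 1) == current_generation


def update_memslots(slots, during_sync=None):
    slots.generation += 1      # odd: update in progress
    if during_sync is not None:
        during_sync()          # stands in for vcpus racing with the sync
    slots.generation += 1      # even: incremented again after synchronizing
```

While the generation is odd, no reconstructed tag can match it, so a vcpu that tags its cache inside the window simply gets a miss later instead of a stale hit.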
     

25 Aug, 2014

1 commit

  • This patch clarifies that kvm_dirty_regs are just a hint to the kernel and
    that the kernel might ignore some flags and sync the values anyway (as is
    done for acrs and gprs now).

    Signed-off-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     

05 Aug, 2014

1 commit

  • Patch queue for ppc - 2014-08-01

    Highlights in this release include:

    - BookE: Rework instruction fetch, not racy anymore now
    - BookE HV: Fix ONE_REG accessors for some in-hardware registers
    - Book3S: Good number of LE host fixes, enable HV on LE
    - Book3S: Some misc bug fixes
    - Book3S HV: Add in-guest debug support
    - Book3S HV: Preload cache lines on context switch
    - Remove 440 support

    Alexander Graf (31):
    KVM: PPC: Book3s PR: Disable AIL mode with OPAL
    KVM: PPC: Book3s HV: Fix tlbie compile error
    KVM: PPC: Book3S PR: Handle hyp doorbell exits
    KVM: PPC: Book3S PR: Fix ABIv2 on LE
    KVM: PPC: Book3S PR: Fix sparse endian checks
    PPC: Add asm helpers for BE 32bit load/store
    KVM: PPC: Book3S HV: Make HTAB code LE host aware
    KVM: PPC: Book3S HV: Access guest VPA in BE
    KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE
    KVM: PPC: Book3S HV: Access XICS in BE
    KVM: PPC: Book3S HV: Fix ABIv2 on LE
    KVM: PPC: Book3S HV: Enable for little endian hosts
    KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
    KVM: PPC: Deflect page write faults properly in kvmppc_st
    KVM: PPC: Book3S: Stop PTE lookup on write errors
    KVM: PPC: Book3S: Add hack for split real mode
    KVM: PPC: Book3S: Make magic page properly 4k mappable
    KVM: PPC: Remove 440 support
    KVM: Rename and add argument to check_extension
    KVM: Allow KVM_CHECK_EXTENSION on the vm fd
    KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode
    KVM: PPC: Implement kvmppc_xlate for all targets
    KVM: PPC: Move kvmppc_ld/st to common code
    KVM: PPC: Remove kvmppc_bad_hva()
    KVM: PPC: Use kvm_read_guest in kvmppc_ld
    KVM: PPC: Handle magic page in kvmppc_ld/st
    KVM: PPC: Separate loadstore emulation from priv emulation
    KVM: PPC: Expose helper functions for data/inst faults
    KVM: PPC: Remove DCR handling
    KVM: PPC: HV: Remove generic instruction emulation
    KVM: PPC: PR: Handle FSCR feature deselects

    Alexey Kardashevskiy (1):
    KVM: PPC: Book3S: Fix LPCR one_reg interface

    Aneesh Kumar K.V (4):
    KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation
    KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
    KVM: PPC: BOOK3S: PR: Emulate instruction counter
    KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page

    Anton Blanchard (2):
    KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue
    KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()

    Bharat Bhushan (10):
    kvm: ppc: bookehv: Added wrapper macros for shadow registers
    kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1
    kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR
    kvm: ppc: booke: Add shared struct helpers of SPRN_ESR
    kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7
    kvm: ppc: Add SPRN_EPR get helper function
    kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit
    KVM: PPC: Booke-hv: Add one reg interface for SPRG9
    KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer
    KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr

    Michael Neuling (1):
    KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling

    Mihai Caraman (8):
    KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule
    KVM: PPC: e500: Fix default tlb for victim hint
    KVM: PPC: e500: Emulate power management control SPR
    KVM: PPC: e500mc: Revert "add load inst fixup"
    KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1
    KVM: PPC: Book3s: Remove kvmppc_read_inst() function
    KVM: PPC: Allow kvmppc_get_last_inst() to fail
    KVM: PPC: Bookehv: Get vcpu's last instruction for emulation

    Paul Mackerras (4):
    KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling
    KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled
    KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call
    KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication

    Stewart Smith (2):
    Split out struct kvmppc_vcore creation to separate function
    Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8

    Conflicts:
    Documentation/virtual/kvm/api.txt

    Paolo Bonzini
     

29 Jul, 2014

1 commit


28 Jul, 2014

4 commits

  • The KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately
    on PPC some of the capabilities change depending on the way a VM was created.

    So instead we need a way to expose capabilities as VM ioctl, so that we can
    see which VM type we're using (HV or PR). To enable this, add the
    KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio.

    Signed-off-by: Alexander Graf
    Acked-by: Paolo Bonzini

    Alexander Graf
     
  • Unfortunately, the LPCR got defined as a 32-bit register in the
    one_reg interface. This is unfortunate because KVM allows userspace
    to control the DPFD (default prefetch depth) field, which is in the
    upper 32 bits. The result is that DPFD always gets set to 0, which
    reduces performance in the guest.

    We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID,
    since that would break existing userspace binaries. Instead we define
    a new KVM_REG_PPC_LPCR_64 id which is 64-bit. Userspace can still use
    the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in
    the bottom 32 bits that userspace can modify (ILE, TC and AIL).
    If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD
    as well.

    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Paul Mackerras
    Cc: stable@vger.kernel.org
    Signed-off-by: Alexander Graf

    Alexey Kardashevskiy
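
The difference between the two register ids can be sketched as a masked update. The bit values below are assumptions for illustration only (the commit message states just that DPFD lies in the upper 32 bits and that ILE, TC and AIL lie in the lower 32); `set_lpcr` and `preserve_top32` are hypothetical names.

```python
# Assumed field masks, for illustration only.
LPCR_DPFD = 0x7 << 52      # default prefetch depth, upper 32 bits
LPCR_ILE = 0x02000000
LPCR_AIL = 0x01800000
LPCR_TC = 0x00000200


def set_lpcr(current, new, preserve_top32):
    """Update only the userspace-modifiable fields of the LPCR value.

    preserve_top32=True models the old 32-bit KVM_REG_PPC_LPCR id;
    preserve_top32=False models the new KVM_REG_PPC_LPCR_64 id.
    """
    mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL
    if preserve_top32:
        mask &= 0xFFFFFFFF     # the 32-bit id must not touch the top half
    return (current & ~mask) | (new & mask)
```

With the 32-bit id the existing DPFD value survives the write, while the 64-bit id lets the caller replace it.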
     
  • This adds code to check that, when the KVM_CAP_PPC_ENABLE_HCALL
    capability is used to enable or disable in-kernel handling of an
    hcall, the hcall is actually implemented by the kernel.
    If it is not, an EINVAL error is returned.

    This also checks the default-enabled list of hcalls and prints a
    warning if any hcall there is not actually implemented.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Alexander Graf

    Paul Mackerras
     
  • This provides a way for userspace to control which sPAPR hcalls get
    handled in the kernel. Each hcall can be individually enabled or
    disabled for in-kernel handling, except for H_RTAS. The exception
    for H_RTAS is because userspace can already control whether
    individual RTAS functions are handled in-kernel or not via the
    KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for
    H_RTAS is out of the normal sequence of hcall numbers.

    Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the
    KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM.
    The args field of the struct kvm_enable_cap specifies the hcall number
    in args[0] and the enable/disable flag in args[1]; 0 means disable
    in-kernel handling (so that the hcall will always cause an exit to
    userspace) and 1 means enable. Enabling or disabling in-kernel
    handling of an hcall is effective across the whole VM.

    The ability for KVM_ENABLE_CAP to be used on a VM file descriptor
    on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM
    capability advertises that this ability exists.

    When a VM is created, an initial set of hcalls is enabled for
    in-kernel handling. The set that is enabled is the set of hcalls that
    have an in-kernel implementation at this point. Any new hcall
    implementations from this point onwards should not be added to the
    default set without a good reason.

    No distinction is made between real-mode and virtual-mode hcall
    implementations; the one setting controls them both.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Alexander Graf

    Paul Mackerras
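
The args[0]/args[1] convention and the "must be implemented" check from the two commits above can be illustrated with a toy model. This is a sketch, not the kernel implementation: `HcallVmModel` is a hypothetical class, the numeric value used for H_RTAS is an assumption, and rejecting H_RTAS and flag values other than 0 or 1 with -EINVAL is an assumed way of modelling the exception.

```python
import errno

H_RTAS = 0xF000   # assumed value; described as outside the normal sequence


class HcallVmModel:
    """Toy VM tracking which hcalls are handled in the kernel."""

    def __init__(self, implemented, default_enabled):
        self.implemented = set(implemented)
        self.enabled = set(default_enabled)

    def enable_cap_hcall(self, args):
        hcall, flag = args[0], args[1]
        if hcall == H_RTAS or hcall not in self.implemented:
            return -errno.EINVAL
        if flag not in (0, 1):
            return -errno.EINVAL
        if flag:
            self.enabled.add(hcall)       # handle in the kernel
        else:
            self.enabled.discard(hcall)   # always exit to userspace
        return 0

    def handled_in_kernel(self, hcall):
        return hcall in self.enabled
```

The setting is per VM in this model, matching the statement that enabling or disabling an hcall is effective across the whole VM.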
     

22 Jul, 2014

1 commit


21 Jul, 2014

3 commits

  • …vms390/linux into kvm-next

    This series enables the "KVM_(S|G)ET_MP_STATE" ioctls on s390 to make
    the cpu state settable by user space.

    This is necessary to avoid races in s390 SIGP/reset handling which
    happen because some SIGPs are handled in QEMU, while others are
    handled in the kernel. Together with the busy conditions returned by
    SIGP, races happen especially in areas like starting and stopping of
    CPUs. (For example, there is a program, 'cpuplugd', that runs on
    several s390 distros and automatically onlines and offlines cpus.)

    As soon as the MPSTATE interface is used, user space takes complete
    control of the cpu states. Otherwise the kernel will use the old way.

    Therefore, the new kernel continues to work fine with old QEMUs.

    Paolo Bonzini
     
  • Let's document that this is a capability that may be enabled per-vm.

    Signed-off-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Cornelia Huck
     
  • Capabilities can be enabled on a vcpu or (since recently) on a vm. Document
    this and note for the existing capabilities whether they are per-vcpu or
    per-vm.

    Signed-off-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Cornelia Huck
     

10 Jul, 2014

5 commits

  • This patch
    - adds s390 specific MP states to linux headers and documents them
    - implements the KVM_{SET,GET}_MP_STATE ioctls
    - enables KVM_CAP_MP_STATE
    - allows user space to control the VCPU state on s390.

    If user space sets the VCPU state using the ioctl KVM_SET_MP_STATE, we can disable
    manual changing of the VCPU state and trust user space to do the right thing.

    Signed-off-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Acked-by: Christian Borntraeger
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Highlight the aspects of the ioctls that are actually specific to x86
    and ia64. As defined restrictions (irqchip) and mp states may not apply
    to other architectures, these parts are flagged to belong to x86 and ia64.

    In preparation for the use of KVM_(S|G)ET_MP_STATE by s390.
    Fix a spelling error (KVM_SET_MP_STATE vs. KVM_SET_MPSTATE) on the way.

    Signed-off-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Acked-by: Christian Borntraeger
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Document the MIPS specific parts of the KVM API, including:
    - The layout of the kvm_regs structure.
    - The interrupt number passed to KVM_INTERRUPT.
    - The registers supported by the KVM_{GET,SET}_ONE_REG interface, and
    the encoding of those register ids.
    - That KVM_INTERRUPT and KVM_GET_REG_LIST are supported on MIPS.

    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: Gleb Natapov
    Cc: kvm@vger.kernel.org
    Cc: Randy Dunlap
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    James Hogan
     
  • Some of the MIPS registers that can be accessed with the
    KVM_{GET,SET}_ONE_REG interface have fairly long names, so widen the
    Register column of the table in the KVM_SET_ONE_REG documentation to
    allow them to fit.

    Tabs in the table are replaced with spaces at the same time for
    consistency.

    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: Gleb Natapov
    Cc: kvm@vger.kernel.org
    Cc: Randy Dunlap
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    James Hogan
     
  • KVM_SET_SIGNAL_MASK is implemented in generic code and isn't x86
    specific, so document it as being applicable for all architectures.

    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: Gleb Natapov
    Cc: kvm@vger.kernel.org
    Cc: Randy Dunlap
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    James Hogan
     

04 Jun, 2014

2 commits

  • Pull trivial tree changes from Jiri Kosina:
    "Usual pile of patches from trivial tree that make the world go round"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
    staging: go7007: remove reference to CONFIG_KMOD
    aic7xxx: Remove obsolete preprocessor define
    of: dma: doc fixes
    doc: fix incorrect formula to calculate CommitLimit value
    doc: Note need of bc in the kernel build from 3.10 onwards
    mm: Fix printk typo in dmapool.c
    modpost: Fix comment typo "Modules.symvers"
    Kconfig.debug: Grammar s/addition/additional/
    wimax: Spelling s/than/that/, wording s/destinatary/recipient/
    aic7xxx: Spelling s/termnation/termination/
    arm64: mm: Remove superfluous "the" in comment
    of: Spelling s/anonymouns/anonymous/
    dma: imx-sdma: Spelling s/determnine/determine/
    ath10k: Improve grammar in comments
    ath6kl: Spelling s/determnine/determine/
    of: Improve grammar for of_alias_get_id() documentation
    drm/exynos: Spelling s/contro/control/
    radio-bcm2048.c: fix wrong overflow check
    doc: printk-formats: do not mention casts for u64/s64
    doc: spelling error changes
    ...

    Linus Torvalds
     
  • Pull KVM updates from Paolo Bonzini:
    "At over 200 commits, covering almost all supported architectures, this
    was a pretty active cycle for KVM. Changes include:

    - a lot of s390 changes: optimizations, support for migration, GDB
    support and more

    - ARM changes are pretty small: support for the PSCI 0.2 hypercall
    interface on both the guest and the host (the latter acked by
    Catalin)

    - initial POWER8 and little-endian host support

    - support for running u-boot on embedded POWER targets

    - pretty large changes to MIPS too, completing the userspace
    interface and improving the handling of virtualized timer hardware

    - for x86, a larger set of changes is scheduled for 3.17. Still, we
    have a few emulator bugfixes and support for running nested
    fully-virtualized Xen guests (para-virtualized Xen guests have
    always worked). And some optimizations too.

    The only missing architecture here is ia64. It's not a coincidence
    that support for KVM on ia64 is scheduled for removal in 3.17"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits)
    KVM: add missing cleanup_srcu_struct
    KVM: PPC: Book3S PR: Rework SLB switching code
    KVM: PPC: Book3S PR: Use SLB entry 0
    KVM: PPC: Book3S HV: Fix machine check delivery to guest
    KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
    KVM: PPC: Book3S HV: Make sure we don't miss dirty pages
    KVM: PPC: Book3S HV: Fix dirty map for hugepages
    KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address
    KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()
    KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number
    KVM: PPC: Book3S: Add ONE_REG register names that were missed
    KVM: PPC: Add CAP to indicate hcall fixes
    KVM: PPC: MPIC: Reset IRQ source private members
    KVM: PPC: Graciously fail broken LE hypercalls
    PPC: ePAPR: Fix hypercall on LE guest
    KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler
    KVM: PPC: BOOK3S: Always use the saved DAR value
    PPC: KVM: Make NX bit available with magic page
    KVM: PPC: Disable NX for old magic page using guests
    KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest
    ...

    Linus Torvalds
     

30 May, 2014

4 commits


27 May, 2014

1 commit


15 May, 2014

1 commit

  • s390 has acquired irqfd support with commit "KVM: s390: irq routing for
    adapter interrupts" (84223598778ba08041f4297fda485df83414d57e) but
    failed to announce it. Let's fix that.

    Signed-off-by: Cornelia Huck
    Acked-by: Christian Borntraeger
    Signed-off-by: Christian Borntraeger

    Cornelia Huck
     

06 May, 2014

1 commit