25 Nov, 2017

1 commit

  • Pull KVM updates from Radim Krčmář:
    "Trimmed second batch of KVM changes for Linux 4.15:

    - GICv4 Support for KVM/ARM

    - re-introduce support for CPUs without virtual NMI (cc stable) and
    allow testing of KVM without virtual NMI on available CPUs

    - fix long-standing performance issues with assigned devices on AMD
    (cc stable)"

    * tag 'kvm-4.15-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
    kvm: vmx: Allow disabling virtual NMI support
    kvm: vmx: Reinstate support for CPUs without virtual NMI
    KVM: SVM: obey guest PAT
    KVM: arm/arm64: Don't queue VLPIs on INV/INVALL
    KVM: arm/arm64: Fix GICv4 ITS initialization issues
    KVM: arm/arm64: GICv4: Theory of operations
    KVM: arm/arm64: GICv4: Enable VLPI support
    KVM: arm/arm64: GICv4: Prevent userspace from changing doorbell affinity
    KVM: arm/arm64: GICv4: Prevent a VM using GICv4 from being saved
    KVM: arm/arm64: GICv4: Enable virtual cpuif if VLPIs can be delivered
    KVM: arm/arm64: GICv4: Hook vPE scheduling into vgic flush/sync
    KVM: arm/arm64: GICv4: Use the doorbell interrupt as an unblocking source
    KVM: arm/arm64: GICv4: Add doorbell interrupt handling
    KVM: arm/arm64: GICv4: Use pending_last as a scheduling hint
    KVM: arm/arm64: GICv4: Handle INVALL applied to a vPE
    KVM: arm/arm64: GICv4: Propagate property updates to VLPIs
    KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE
    KVM: arm/arm64: GICv4: Handle CLEAR applied to a VLPI
    KVM: arm/arm64: GICv4: Propagate affinity changes to the physical ITS
    KVM: arm/arm64: GICv4: Unmap VLPI when freeing an LPI
    ...

    Linus Torvalds
     

18 Nov, 2017

1 commit

  • Pull compat and uaccess updates from Al Viro:

    - {get,put}_compat_sigset() series

    - assorted compat ioctl stuff

    - more set_fs() elimination

    - a few more timespec64 conversions

    - several removals of pointless access_ok() in places where it was
    followed only by non-__ variants of primitives

    * 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
    coredump: call do_unlinkat directly instead of sys_unlink
    fs: expose do_unlinkat for built-in callers
    ext4: take handling of EXT4_IOC_GROUP_ADD into a helper, get rid of set_fs()
    ipmi: get rid of pointless access_ok()
    pi433: sanitize ioctl
    cxlflash: get rid of pointless access_ok()
    mtdchar: get rid of pointless access_ok()
    r128: switch compat ioctls to drm_ioctl_kernel()
    selection: get rid of field-by-field copyin
    VT_RESIZEX: get rid of field-by-field copyin
    i2c compat ioctls: move to ->compat_ioctl()
    sched_rr_get_interval(): move compat to native, get rid of set_fs()
    mips: switch to {get,put}_compat_sigset()
    sparc: switch to {get,put}_compat_sigset()
    s390: switch to {get,put}_compat_sigset()
    ppc: switch to {get,put}_compat_sigset()
    parisc: switch to {get,put}_compat_sigset()
    get_compat_sigset()
    get rid of {get,put}_compat_itimerspec()
    io_getevents: Use timespec64 to represent timeouts
    ...

    Linus Torvalds
     

17 Nov, 2017

2 commits

  • …/git/kvmarm/kvmarm into HEAD

    GICv4 Support for KVM/ARM for v4.15

    Paolo Bonzini
     
  • Pull KVM updates from Radim Krčmář:
    "First batch of KVM changes for 4.15

    Common:
    - Python 3 support in kvm_stat
    - Accounting of slabs to kmemcg

    ARM:
    - Optimized arch timer handling for KVM/ARM
    - Improvements to the VGIC ITS code and introduction of an ITS reset
    ioctl
    - Unification of the 32-bit fault injection logic
    - More exact external abort matching logic

    PPC:
    - Support for running hashed page table (HPT) MMU mode on a host that
    is using the radix MMU mode; single-threaded mode on POWER9 is
    added as a prerequisite
    - Resolution of merge conflicts with the last-second 4.14 HPT fixes
    - Fixes and cleanups

    s390:
    - Some initial preparation patches for exitless interrupts and crypto
    - New capability for AIS migration
    - Fixes

    x86:
    - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs,
    and after-reset state
    - Refined dependencies for VMX features
    - Fixes for nested SMI injection
    - A lot of cleanups"

    * tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (89 commits)
    KVM: s390: provide a capability for AIS state migration
    KVM: s390: clear_io_irq() requests are not expected for adapter interrupts
    KVM: s390: abstract conversion between isc and enum irq_types
    KVM: s390: vsie: use common code functions for pinning
    KVM: s390: SIE considerations for AP Queue virtualization
    KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup
    KVM: PPC: Book3S HV: Cosmetic post-merge cleanups
    KVM: arm/arm64: fix the incompatible matching for external abort
    KVM: arm/arm64: Unify 32bit fault injection
    KVM: arm/arm64: vgic-its: Implement KVM_DEV_ARM_ITS_CTRL_RESET
    KVM: arm/arm64: Document KVM_DEV_ARM_ITS_CTRL_RESET
    KVM: arm/arm64: vgic-its: Free caches when GITS_BASER Valid bit is cleared
    KVM: arm/arm64: vgic-its: New helper functions to free the caches
    KVM: arm/arm64: vgic-its: Remove kvm_its_unmap_device
    arm/arm64: KVM: Load the timer state when enabling the timer
    KVM: arm/arm64: Rework kvm_timer_should_fire
    KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate
    KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit
    KVM: arm/arm64: Move phys_timer_emulate function
    KVM: arm/arm64: Use kvm_arm_timer_set/get_reg for guest register traps
    ...

    Linus Torvalds
     

16 Nov, 2017

2 commits

  • KVM: s390: fixes and improvements for 4.15

    - Some initial preparation patches for exitless interrupts and crypto
    - New capability for AIS migration
    - Fixes
    - merge of the sthyi tree from the base s390 team, which moves the sthyi
    out of KVM into a shared function also for non-KVM

    Radim Krčmář
     
  • Pull arm64 updates from Will Deacon:
    "The big highlight is support for the Scalable Vector Extension (SVE)
    which required extensive ABI work to ensure we don't break existing
    applications by blowing away their signal stack with the rather large
    new vector context (...

    * ... of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
    arm64: Make ARMV8_DEPRECATED depend on SYSCTL
    arm64: Implement __lshrti3 library function
    arm64: support __int128 on gcc 5+
    arm64/sve: Add documentation
    arm64/sve: Detect SVE and activate runtime support
    arm64/sve: KVM: Hide SVE from CPU features exposed to guests
    arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
    arm64/sve: KVM: Prevent guests from using SVE
    arm64/sve: Add sysctl to set the default vector length for new processes
    arm64/sve: Add prctl controls for userspace vector length management
    arm64/sve: ptrace and ELF coredump support
    arm64/sve: Preserve SVE registers around EFI runtime service calls
    arm64/sve: Preserve SVE registers around kernel-mode NEON use
    arm64/sve: Probe SVE capabilities and usable vector lengths
    arm64: cpufeature: Move sys_caps_initialised declarations
    arm64/sve: Backend logic for setting the vector length
    arm64/sve: Signal handling support
    arm64/sve: Support vector length resetting for new processes
    arm64/sve: Core task context handling
    arm64/sve: Low-level CPU setup
    ...

    Linus Torvalds
     

10 Nov, 2017

23 commits

  • Since VLPIs are injected directly by the hardware there's no need to
    mark these as pending in software and queue them on the AP list.

    Reviewed-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
     
  • We should only try to initialize GICv4 data structures on a GICv4
    capable system. Move the vgic_supports_direct_msis() check into
    vgic_v4_init() so that any KVM VGIC initialization path does not fail
    on non-GICv4 systems.

    Also be slightly more strict in the checking of the return value in
    vgic_its_create, and only error out on negative return values from the
    vgic_v4_init() function. This is important because the kvm device code
    only treats negative values as errors and only cleans up in this case.
    Erroneously treating a positive return value as an error from the
    vgic_v4_init() function can lead to NULL pointer dereferences, as has
    recently been observed.

    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
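
The return-value convention described in this entry can be illustrated with a toy model. This is a Python sketch, not the kernel's C code: `toy_v4_init` and `toy_its_create` are hypothetical stand-ins for vgic_v4_init() and vgic_its_create(), and the positive return value is an assumption chosen only to show why the sign check matters.

```python
import errno

def toy_v4_init(supports_direct_msis, fail=False):
    """Hypothetical stand-in for vgic_v4_init()."""
    if not supports_direct_msis:
        return 0                # nothing to set up on a non-GICv4 system
    if fail:
        return -errno.ENOMEM    # negative value: a real error
    return 1                    # assumed positive, non-error result

def toy_its_create(supports_direct_msis, fail=False):
    """Hypothetical stand-in for vgic_its_create()."""
    ret = toy_v4_init(supports_direct_msis, fail)
    if ret < 0:                 # only negative values are treated as errors
        return ret
    return 0
```

With a plain `if ret:` check, the assumed positive result would be reported as a failure even though the caller only cleans up on negative values.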
     
  • Yet another braindump so I can free some cells...

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • All it takes is the has_v4 flag to be set in gic_kvm_info
    as well as "kvm-arm.vgic_v4_enable=1" being passed on the
    command line for GICv4 to be enabled in KVM.

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • We so far allocate the doorbell interrupts without taking any
    special measure regarding the affinity of these interrupts. We
    simply move them around as required when the vcpu gets scheduled
    on a different CPU.

    But that's counting without userspace (and the evil irqbalance) that
    can try and move the VPE interrupt around, causing the ITS code
    to emit VMOVP commands and remap the doorbell to another redistributor.
    Worse, this can happen while the vcpu is running, causing all kinds
    of trouble if the VPE is already resident, and we end up in UNPRED
    territory.

    So let's take a definitive action and prevent userspace from messing
    with us. This is just a matter of adding IRQ_NO_BALANCING to the
    set of flags we already have, leaving the kernel in sole control
    of the affinity.

    Acked-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • The GICv4 architecture doesn't make it easy for save/restore to
    work, as it doesn't give any guarantee that the pending state
    is written into the pending table.

    So let's not take any chance, and let's return an error if
    we encounter any LPI that has the HW bit set. In order for
    userspace to distinguish this error from other failure modes,
    use -EACCES as an error code.

    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
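
A minimal sketch of the save-time check described in this entry, written as a toy Python model rather than the kernel's C code. `ToyLpi` and `toy_save_lpis` are hypothetical names; only the -EACCES error code and the HW bit come from the commit message.

```python
import errno
from dataclasses import dataclass

@dataclass
class ToyLpi:
    intid: int
    hw: bool = False   # True when the LPI is backed by hardware (a VLPI)

def toy_save_lpis(lpis):
    """Return 0 on success, or -EACCES if any LPI has the HW bit set."""
    for lpi in lpis:
        if lpi.hw:
            # Pending state may not be in the pending table, so refuse to save.
            return -errno.EACCES
    return 0
```

A dedicated error code lets userspace tell "this VM uses GICv4" apart from other save failures.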
     
  • In order for VLPIs to be delivered to the guest, we must make sure that
    the virtual cpuif is always enabled, irrespective of the presence of
    virtual interrupts in the LRs.

    Acked-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • The redistributor needs to be told which vPE is about to be run,
    and tells us whether there is any pending VLPI on exit.

    Let's add the scheduling calls to the vgic flush/sync functions,
    allowing the VLPIs to be delivered to the guest.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • The doorbell interrupt is only useful if the vcpu is blocked on WFI.
    In all other cases, receiving a doorbell interrupt is just a waste
    of cycles.

    So let's only enable the doorbell if a vcpu is getting blocked,
    and disable it when it is unblocked. This is very similar to
    what we're doing for the background timer.

    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
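
The enable-on-block, disable-on-unblock rule from this entry can be sketched as a tiny state model. This is illustrative Python only; `ToyBlockingVcpu` and its methods are hypothetical and do not correspond to kernel functions.

```python
class ToyBlockingVcpu:
    """Hypothetical vcpu that tracks whether its doorbell is enabled."""

    def __init__(self):
        self.doorbell_enabled = False   # off while the vcpu can run

    def block(self):
        # The vcpu is about to wait (e.g. on WFI): a doorbell is now useful.
        self.doorbell_enabled = True

    def unblock(self):
        # The vcpu runs again: further doorbells would only waste cycles.
        self.doorbell_enabled = False
```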
     
  • When a vPE is not running, a VLPI being made pending results in a
    doorbell interrupt being delivered. Let's handle this interrupt
    and update the pending_last flag that indicates that VLPIs are
    pending. The corresponding vcpu is also kicked into action.

    Special care is taken to prevent the doorbell from being enabled
    at request time (this is controlled separately), and to make
    the disabling of the interrupt non-lazy.

    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • When a vPE exits, the pending_last flag is set when there are pending
    VLPIs stored in the pending table. Similarly, this flag will be set
    when a doorbell interrupt fires, as it indicates the same condition.

    Let's update kvm_vgic_vcpu_pending_irq() to account for that
    flag as well, making a vcpu runnable when set.

    Acked-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
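
A rough sketch of how the pending_last hint feeds the "is this vcpu runnable" decision. This is a toy Python model; `ToyVcpuState`, `software_pending` and the method names are assumptions for illustration, not the kernel's data structures.

```python
class ToyVcpuState:
    """Hypothetical per-vcpu interrupt state."""

    def __init__(self):
        self.pending_last = False    # set on vPE exit or by a doorbell
        self.software_pending = []   # interrupts queued in software

    def doorbell_fired(self):
        # A doorbell means at least one VLPI is pending for this vPE.
        self.pending_last = True

    def has_pending_irq(self):
        # Runnable if software has something queued OR the hint is set.
        return self.pending_last or bool(self.software_pending)
```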
     
  • There is no need to perform an INV for each interrupt when updating
    multiple interrupts. Instead, we can rely on the final VINVALL that
    gets sent to the ITS to do the work for all of them.

    Acked-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Upon updating a property, we propagate it all the way to the physical
    ITS, and ask for an INV command to be executed there.

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • The current implementation of MOVALL doesn't allow us to call
    into the core ITS code as we hold a number of spinlocks.

    Let's try a method used in other parts of the code, where we copy
    the intids of the candidate interrupts, and then do whatever
    we need to do with them outside of the critical section.

    This allows us to move the interrupts one by one, at the expense
    of a bit of CPU time. Who cares? MOVALL is such a stupid command
    anyway...

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
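
The "copy the intids under the lock, act on them afterwards" pattern from this entry can be sketched as follows. This is a toy Python model: `threading.Lock` stands in for the kernel spinlocks, and `ToyMovallIts`, `irq_target` and `move_one` are hypothetical names.

```python
import threading

class ToyMovallIts:
    def __init__(self):
        self.lock = threading.Lock()
        self.irq_target = {}   # intid -> vcpu id

    def movall(self, src_vcpu, dst_vcpu, move_one):
        # Critical section: only snapshot the candidate intids.
        with self.lock:
            intids = [i for i, v in self.irq_target.items() if v == src_vcpu]
        # Outside the lock: move the interrupts one by one. The callback
        # is free to take locks of its own, including self.lock.
        for intid in intids:
            move_one(intid, dst_vcpu)
        return intids
```

The cost is one call per interrupt instead of a single bulk operation, which matches the trade-off the commit message accepts.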
     
  • Handling CLEAR is pretty easy. Just ask the ITS driver to clear
    the corresponding pending bit (which will turn into a CLEAR
    command on the physical side).

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • When the guest issues an affinity change, we need to tell the physical
    ITS that we're now targeting a new vcpu. This is done by extracting
    the current mapping, updating the target, and reapplying the mapping.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
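
The extract, update, reapply sequence described in this entry can be sketched with a toy model. The Python below is illustrative only; `ToyVlpiMap`, `ToyPhysIts` and `toy_update_affinity` are hypothetical and do not mirror the kernel's ITS driver interface.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ToyVlpiMap:
    vintid: int   # virtual interrupt ID seen by the guest
    vcpu: int     # vcpu currently targeted

class ToyPhysIts:
    def __init__(self):
        self.maps = {}   # host irq -> ToyVlpiMap

    def get_map(self, host_irq):
        return self.maps[host_irq]

    def set_map(self, host_irq, mapping):
        self.maps[host_irq] = mapping

def toy_update_affinity(its, host_irq, new_vcpu):
    current = its.get_map(host_irq)             # 1. extract current mapping
    updated = replace(current, vcpu=new_vcpu)   # 2. update the target
    its.set_map(host_irq, updated)              # 3. reapply the mapping
    return updated
```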
     
  • When freeing an LPI (on a DISCARD command, for example), we need
    to unmap the VLPI down to the physical ITS level.

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • If the guest issues an INT command targeting a VLPI, let's
    call into the irq_set_irqchip_state() helper to make it pending
    on the physical side.

    This works just as well if userspace decides to inject an interrupt
    using the normal userspace API...

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Let's use the irq bypass mechanism also used for x86 posted interrupts
    to intercept the virtual PCIe endpoint configuration and establish our
    LPI->VLPI mapping.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • In order to control the GICv4 view of virtual CPUs, we rely
    on an irqdomain allocated for that purpose. Let's add a couple
    of helpers to that effect.

    At the same time, the vgic data structures gain new fields to
    track all this... erm... wonderful stuff.

    The way we hook into the vgic init is slightly convoluted. We
    need the vgic to be initialized (in order to guarantee that
    the number of vcpus is now fixed), and we must have a vITS
    (otherwise this is all very pointless). So we end up calling
    the init from both vgic_init and vgic_its_create.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Add a new has_gicv4 field in the global VGIC state that indicates
    whether the HW is GICv4 capable, as well as a per-VM predicate indicating
    if there is a possibility for a VM to support direct injection
    (the above being true and the VM having an ITS).

    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • In order to help integrate the vITS code with GICv4, let's add
    a new helper that deals with updating the affinity of an LPI,
    which will later be augmented with super duper extra GICv4
    goodness.

    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • The whole MSI injection process is fairly monolithic. An MSI write
    gets turned into an injected LPI in one swift go. But this is actually
    a more fine-grained process:

    - First, a virtual ITS gets selected using the doorbell address
    - Then the DevID/EventID pair gets translated into an LPI
    - Finally the LPI is injected

    Since the GICv4 code needs the first two steps in order to match
    an IRQ routing entry to an LPI, let's expose them as helpers,
    and refactor the existing code to use them.

    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
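
The three steps listed in this entry can be sketched as separate helpers plus a thin wrapper. This is a toy Python model under stated assumptions: `ToyVits`, `toy_find_its`, `toy_resolve_lpi` and `toy_inject_msi` are hypothetical names, not the helpers the patch introduces.

```python
class ToyVits:
    def __init__(self, doorbell_addr):
        self.doorbell_addr = doorbell_addr
        self.table = {}      # (devid, eventid) -> LPI intid
        self.injected = []   # LPIs injected so far

def toy_find_its(all_its, addr):
    """Step 1: select the virtual ITS by doorbell address."""
    for its in all_its:
        if its.doorbell_addr == addr:
            return its
    return None

def toy_resolve_lpi(its, devid, eventid):
    """Step 2: translate the DevID/EventID pair into an LPI."""
    return its.table.get((devid, eventid))

def toy_inject_msi(all_its, addr, devid, eventid):
    """Steps 1 to 3 chained, as the monolithic path would do."""
    its = toy_find_its(all_its, addr)
    if its is None:
        return False
    lpi = toy_resolve_lpi(its, devid, eventid)
    if lpi is None:
        return False
    its.injected.append(lpi)   # Step 3: inject the LPI
    return True
```

Exposing the first two steps on their own lets another caller stop after translation, for instance to match a routing entry to an LPI without injecting anything.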
     

09 Nov, 2017

1 commit

  • We will not see -ENOMEM (gfn_to_hva() will return KVM_ERR_PTR_BAD_PAGE
    for all errors). So we can also get rid of special handling in the
    callers of pin_guest_page() and always assume that it is a g2 error.

    As kvm_s390_inject_program_int() should also never fail, we can
    simplify pin_scb(), too.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     

08 Nov, 2017

1 commit


07 Nov, 2017

4 commits

  • Conflicts:
    include/linux/compiler-clang.h
    include/linux/compiler-gcc.h
    include/linux/compiler-intel.h
    include/uapi/linux/stddef.h

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The way we call kvm_vgic_destroy is a bit bizarre. We call it
    *after* having freed the vcpus, which sort of defeats the point
    of cleaning up things before that point.

    Let's move kvm_vgic_destroy towards the beginning of kvm_arch_destroy_vm,
    which seems more sensible.

    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • We want to reuse the core of the map/unmap functions for IRQ
    forwarding. Let's move the computation of the hwirq in
    kvm_vgic_map_phys_irq and pass the Linux IRQ as a parameter.
    The host_irq is added to struct vgic_irq.

    We introduce kvm_vgic_map/unmap_irq which take a struct vgic_irq
    handle as a parameter.

    Acked-by: Christoffer Dall
    Signed-off-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Eric Auger
     
  • This patch selects IRQ_BYPASS_MANAGER and HAVE_KVM_IRQ_BYPASS
    configs for ARM/ARM64.

    kvm_arch_has_irq_bypass() is now implemented and returns true.
    As a consequence the irq bypass consumer will be registered for
    ARM/ARM64 with the forwarding callbacks:

    - stop/start: halt/resume guest execution
    - add/del_producer: set/unset forwarding at vgic/irqchip level

    We don't have any actual support yet, so nothing gets actually
    forwarded.

    Acked-by: Christoffer Dall
    Signed-off-by: Eric Auger
    [maz: dropped the DEOI stuff for the time being in order to
    reduce the dependency chain, amended commit message]
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Eric Auger
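
A toy model of the four consumer callbacks listed in this entry. The Python below is illustrative only: `ToyConsumer` and `toy_connect` are hypothetical. The ordering in `toy_connect` (halt the guest, change forwarding, resume) is an assumption about how the callbacks are meant to be combined, and the toy records forwarding state even though this patch itself forwards nothing yet.

```python
class ToyConsumer:
    """Hypothetical model of the KVM-side irq bypass consumer."""

    def __init__(self):
        self.guest_running = True
        self.forwarded = set()   # IRQs with forwarding set up

    def stop(self):
        self.guest_running = False   # halt guest execution

    def start(self):
        self.guest_running = True    # resume guest execution

    def add_producer(self, irq):
        self.forwarded.add(irq)      # set forwarding

    def del_producer(self, irq):
        self.forwarded.discard(irq)  # unset forwarding

def toy_connect(consumer, irq):
    # Assumed ordering: the guest is halted while forwarding changes.
    consumer.stop()
    try:
        consumer.add_producer(irq)
    finally:
        consumer.start()
```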
     

06 Nov, 2017

5 commits

  • Both arm and arm64 implementations are capable of injecting
    faults, and yet have completely divergent implementations,
    leading to different bugs and reduced maintainability.

    Let's elect the arm64 version as the canonical one
    and move it into aarch32.c, which is common to both
    architectures.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • On reset we clear the valid bits of GITS_CBASER and GITS_BASER.
    We also clear command queue registers and free the cache (device,
    collection, and lpi lists).

    As we need to take the same locks as save/restore functions, we
    create a vgic_its_ctrl() wrapper that handles KVM_DEV_ARM_VGIC_GRP_CTRL
    group functions.

    Reviewed-by: Christoffer Dall
    Reviewed-by: Marc Zyngier
    Signed-off-by: Eric Auger
    Signed-off-by: Christoffer Dall

    Eric Auger
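
The reset behaviour described in this entry can be sketched as a toy model. This is illustrative Python, not the kernel's C code: `ToyResetIts` and all of its field names are hypothetical, chosen to mirror the registers and lists named in the commit message.

```python
class ToyResetIts:
    def __init__(self):
        self.cbaser_valid = True          # GITS_CBASER valid bit
        self.baser_valid = [True, True]   # GITS_BASER<n> valid bits
        self.creadr = 0x40                # command queue read pointer
        self.cwriter = 0x80               # command queue write pointer
        self.devices = ["dev0"]           # cached device list
        self.collections = ["col0"]       # cached collection list
        self.lpis = [8192]                # cached LPI list

    def reset(self):
        # Clear the valid bits of GITS_CBASER and GITS_BASER.
        self.cbaser_valid = False
        self.baser_valid = [False] * len(self.baser_valid)
        # Clear the command queue registers.
        self.creadr = 0
        self.cwriter = 0
        # Free the cached device, collection and LPI lists.
        self.devices.clear()
        self.collections.clear()
        self.lpis.clear()
```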
     
  • When the GITS_BASER.Valid gets cleared, the data structures in
    guest RAM are not valid anymore. The device, collection
    and LPI lists stored in the in-kernel ITS represent the same
    information in some form of cache. So let's void the cache.

    Reviewed-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Eric Auger
    Signed-off-by: Christoffer Dall

    Eric Auger
     
  • We create two new functions that free the device and
    collection lists. They are currently called by vgic_its_destroy()
    and other callers will be added in subsequent patches.

    We also remove the check on its->device_list.next.
    Lists are initialized in vgic_create_its() and the device
    is added to the device list only if the latter succeeds.

    vgic_its_destroy is the device destroy op. It is called
    by kvm_destroy_devices(), which loops over all created devices. So
    at this point the list is initialized.

    Acked-by: Marc Zyngier
    Signed-off-by: wanghaibin
    Signed-off-by: Eric Auger
    Signed-off-by: Christoffer Dall

    wanghaibin
     
  • Let's remove kvm_its_unmap_device and use kvm_its_free_device
    as both functions are identical.

    Signed-off-by: Eric Auger
    Acked-by: Marc Zyngier
    Acked-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Eric Auger