22 Sep, 2016

2 commits

  • This patch allows vgic-v3 to be built and used in 32-bit mode.

    Unfortunately, it cannot be split into several steps without extra
    stubs to keep the patches independent and bisectable. For instance,
    virt/kvm/arm/vgic/vgic-v3.c uses functions from vgic-v3-sr.c, and handling
    access to the GICv3 CPU interface from the guest requires vgic_v3.vgic_sre
    to be already defined.

    This is how support has been implemented:

    * handle SGI requests from the guest

    * report configured SRE on access to GICv3 cpu interface from the guest

    * required vgic-v3 macros are provided via uapi.h

    * static keys are used to select GIC backend

    * to make vgic-v3 build, the KVM_ARM_VGIC_V3 guard is removed along with
    the static inlines

    Acked-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
     
  • Currently the GIC backend is selected via the alternative framework, and
    this is fine. We are going to introduce vgic-v3 to the 32-bit world, and
    there we don't have a patching framework at hand, so we can either check
    support for GICv3 every time we need to choose which backend to use or
    try to optimise it by using static keys. The latter looks quite
    promising because we can share the logic involved in selecting the GIC
    backend between architectures if both use static keys.

    This patch moves arm64 from alternative to static keys framework for
    selecting GIC backend. For that we embed static key into vgic_global
    and enable the key during vgic initialisation based on what has
    already been exposed by the host GIC driver.
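
    A minimal user-space sketch of the idea, assuming a plain boolean as a
    stand-in for a kernel static key (a real static key patches the branch
    sites, which this flag does not model). All names ending in _sketch and
    the host_gic_type enum are hypothetical, not the kernel's identifiers:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel static key: a flag flipped once at init. */
struct static_key_sketch {
    bool enabled;
};

/* Hypothetical host GIC types, as a host GIC driver might report them. */
enum host_gic_type { HOST_GIC_V2, HOST_GIC_V3 };

/* Global VGIC state with the key embedded in it. */
struct vgic_global_sketch {
    enum host_gic_type type;
    struct static_key_sketch gicv3_cpuif;
};

static struct vgic_global_sketch kvm_vgic_global_sketch;

/* Init time: enable the key only if the host exposes a GICv3. */
void vgic_init_sketch(enum host_gic_type type)
{
    kvm_vgic_global_sketch.type = type;
    kvm_vgic_global_sketch.gicv3_cpuif.enabled = (type == HOST_GIC_V3);
}

/* Hot path: backend selection reduces to testing the key. */
bool vgic_use_v3_backend_sketch(void)
{
    return kvm_vgic_global_sketch.gicv3_cpuif.enabled;
}
```

    Because the key lives in the shared global state, both architectures
    could test it the same way on their save/restore paths.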

    Acked-by: Marc Zyngier
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
     

08 Sep, 2016

2 commits

  • Now that we have the necessary infrastructure to handle MMIO accesses
    in HYP, perform the GICV access on behalf of the guest. This requires
    checking that the access is strictly 32bit, properly aligned, and
    falls within the expected range.

    When all conditions are satisfied, we perform the access and tell
    the rest of the HYP code that the instruction has been correctly
    emulated.
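
    The three checks named above can be sketched as follows. This is an
    illustration only: the function name and its parameters are
    hypothetical, and the GICV base and size would come from wherever the
    host describes the region.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a guest access of 'len' bytes at 'addr' may be performed
 * on the guest's behalf: strictly 32-bit, naturally aligned, and entirely
 * inside the [base, base + size) window.
 */
bool gicv_access_ok_sketch(uint64_t addr, unsigned int len,
                           uint64_t base, uint64_t size)
{
    if (len != sizeof(uint32_t))        /* strictly 32-bit */
        return false;
    if (addr & (sizeof(uint32_t) - 1))  /* properly aligned */
        return false;
    if (addr < base || size < len)      /* below the window, or window too small */
        return false;
    return addr - base <= size - len;   /* last byte still inside the window */
}
```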

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • In order to efficiently perform the GICV access on behalf of the
    guest, we need to be able to avoid going back all the way to
    the host kernel.

    For this, we introduce a new hook in the world switch code,
    conveniently placed just after populating the fault info.
    At that point, we only have saved/restored the GP registers,
    and we can quickly perform all the required checks (data abort,
    translation fault, valid faulting syndrome, not an external
    abort, not a PTW).
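
    As a rough sketch, the checks listed above amount to a few tests on the
    exception syndrome. The bit positions below follow the ARMv8 ESR_EL2
    layout as I understand it and should be verified against the
    architecture manual; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ESR_EL2 fields (to be checked against the ARMv8 ARM). */
#define ESR_EC_SHIFT_SKETCH     26
#define ESR_EC_MASK_SKETCH      0x3fu
#define ESR_EC_DABT_LOW_SKETCH  0x24u        /* data abort from a lower EL */
#define ESR_ISV_SKETCH          (1u << 24)   /* syndrome is valid */
#define ESR_EA_SKETCH           (1u << 9)    /* external abort */
#define ESR_S1PTW_SKETCH        (1u << 7)    /* fault on stage-1 table walk */
#define ESR_FSC_TYPE_SKETCH     0x3cu        /* fault status, level bits masked */
#define ESR_FSC_FAULT_SKETCH    0x04u        /* translation fault, any level */

/* True if the fault is a candidate for emulation without leaving HYP. */
bool fault_is_emulation_candidate_sketch(uint32_t esr)
{
    if (((esr >> ESR_EC_SHIFT_SKETCH) & ESR_EC_MASK_SKETCH) !=
        ESR_EC_DABT_LOW_SKETCH)
        return false;                        /* not a data abort */
    if ((esr & ESR_FSC_TYPE_SKETCH) != ESR_FSC_FAULT_SKETCH)
        return false;                        /* not a translation fault */
    if (!(esr & ESR_ISV_SKETCH))
        return false;                        /* no valid syndrome to decode */
    if (esr & ESR_EA_SKETCH)
        return false;                        /* external abort */
    if (esr & ESR_S1PTW_SKETCH)
        return false;                        /* page table walk */
    return true;
}
```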

    Coming back from the emulation code, we need to skip the emulated
    instruction. This involves an additional bit of save/restore in
    order to be able to access the guest's PC (and possibly CPSR if
    this is a 32bit guest).

    At this stage, no emulation code is provided.

    Signed-off-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     

04 Aug, 2016

1 commit


03 Aug, 2016

1 commit

  • Pull KVM updates from Paolo Bonzini:

    - ARM: GICv3 ITS emulation and various fixes. Removal of the
    old VGIC implementation.

    - s390: support for trapping software breakpoints, nested
    virtualization (vSIE), the STHYI opcode, initial extensions
    for CPU model support.

    - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
    of cleanups, preliminary to this and the upcoming support for
    hardware virtualization extensions.

    - x86: support for execute-only mappings in nested EPT; reduced
    vmexit latency for TSC deadline timer (by about 30%) on Intel
    hosts; support for more than 255 vCPUs.

    - PPC: bugfixes.

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
    KVM: PPC: Introduce KVM_CAP_PPC_HTM
    MIPS: Select HAVE_KVM for MIPS64_R{2,6}
    MIPS: KVM: Reset CP0_PageMask during host TLB flush
    MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
    MIPS: KVM: Sign extend MFC0/RDHWR results
    MIPS: KVM: Fix 64-bit big endian dynamic translation
    MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
    MIPS: KVM: Use 64-bit CP0_EBase when appropriate
    MIPS: KVM: Set CP0_Status.KX on MIPS64
    MIPS: KVM: Make entry code MIPS64 friendly
    MIPS: KVM: Use kmap instead of CKSEG0ADDR()
    MIPS: KVM: Use virt_to_phys() to get commpage PFN
    MIPS: Fix definition of KSEGX() for 64-bit
    KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
    kvm: x86: nVMX: maintain internal copy of current VMCS
    KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
    KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
    KVM: arm64: vgic-its: Simplify MAPI error handling
    KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
    KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
    ...

    Linus Torvalds
     

23 Jul, 2016

1 commit

  • This patch adds compilation of and linking against irqchip.

    Main motivation behind using irqchip code is to enable MSI
    routing code. In the future irqchip routing may also be useful
    when targeting multiple irqchips.

    The standard routing callbacks are now implemented in vgic-irqfd:
    - kvm_set_routing_entry
    - kvm_set_irq
    - kvm_set_msi

    They are only supported with the new_vgic code.

    Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
    KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.

    So from now on IRQCHIP routing is enabled and a routing table entry
    must exist for irqfd injection to succeed for a given SPI. This patch
    builds a default flat irqchip routing table (gsi=irqchip.pin) covering
    all the VGIC SPI indexes. This routing table is overwritten by the
    first user-space call to the KVM_SET_GSI_ROUTING ioctl.
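
    A flat default table of this kind can be sketched as below. The entry
    layout and the function name are hypothetical simplifications, not the
    kernel's routing structures; the only property taken from the text is
    gsi == irqchip.pin for every SPI index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified routing entry. */
struct route_entry_sketch {
    uint32_t gsi;      /* global interrupt number seen by irqfd users */
    uint32_t irqchip;  /* irqchip index, always 0 in this sketch */
    uint32_t pin;      /* SPI index on that irqchip */
};

/*
 * Fill 'tbl' with one identity entry per SPI index, up to 'cap' entries.
 * Returns the number of entries written.
 */
size_t build_flat_routing_sketch(struct route_entry_sketch *tbl, size_t cap,
                                 uint32_t nr_spis)
{
    size_t n = nr_spis < cap ? nr_spis : cap;

    for (size_t i = 0; i < n; i++) {
        tbl[i].gsi = (uint32_t)i;
        tbl[i].irqchip = 0;
        tbl[i].pin = (uint32_t)i;
    }
    return n;
}
```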

    MSI routing setup is not yet allowed.

    Signed-off-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Eric Auger
     

19 Jul, 2016

9 commits

  • Going from the ITS structure to the corresponding KVM structure
    would be quite handy at times. The kvm_device pointer that is
    passed at create time is quite convenient for this, so let's
    keep a copy of it in the vgic_its structure.

    This will be put to a good use in subsequent patches.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • Now that all ITS emulation functionality is in place, we advertise
    MSI functionality to userland and also the ITS device to the guest - if
    userland has configured that.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • LPIs are dynamically created (mapped) at guest runtime and their
    actual number can be quite high, but is mostly assigned using a very
    sparse allocation scheme. So arrays are not an ideal data structure
    to hold the information.
    We use a spin-lock protected linked list to hold all mapped LPIs,
    represented by their struct vgic_irq. This lock is grouped between the
    ap_list_lock and the vgic_irq lock in our locking order.
    Also we store a pointer to that struct vgic_irq in our struct its_itte,
    so we can easily access it.
    Finally, we call our new vgic_get_lpi() from vgic_get_irq(), so
    the VGIC code gets transparent access to LPIs.
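
    The lookup scheme can be illustrated with a small user-space sketch.
    The structure and function names are hypothetical, the spinlock is left
    out for brevity, and the sketch assumes LPI INTIDs start at 8192 as in
    the GICv3 architecture:

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_LPI_SKETCH 8192u   /* assumed first LPI INTID */

struct irq_sketch {
    uint32_t intid;
    struct irq_sketch *next;     /* only used for LPIs on the list */
};

struct vgic_sketch {
    struct irq_sketch *static_irqs;  /* private IRQs and SPIs, by INTID */
    uint32_t nr_static;
    struct irq_sketch *lpi_head;     /* sparse LPIs: list, not array */
};

/* Map an LPI by pushing it onto the list (locking omitted here). */
void lpi_add_sketch(struct vgic_sketch *v, struct irq_sketch *irq)
{
    irq->next = v->lpi_head;
    v->lpi_head = irq;
}

/* Walk the list looking for a mapped LPI. */
struct irq_sketch *get_lpi_sketch(struct vgic_sketch *v, uint32_t intid)
{
    for (struct irq_sketch *p = v->lpi_head; p; p = p->next)
        if (p->intid == intid)
            return p;
    return NULL;
}

/* Callers use one entry point; LPIs are found transparently. */
struct irq_sketch *get_irq_sketch(struct vgic_sketch *v, uint32_t intid)
{
    if (intid < v->nr_static)
        return &v->static_irqs[intid];
    if (intid >= FIRST_LPI_SKETCH)
        return get_lpi_sketch(v, intid);
    return NULL;
}
```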

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • Add emulation for some basic MMIO registers used in the ITS emulation.
    This includes:
    - GITS_{CTLR,TYPER,IIDR}
    - ID registers
    - GITS_{CBASER,CREADR,CWRITER}
    (which implement the ITS command buffer handling)
    - GITS_BASER

    Most of the handlers are pretty straightforward; only the CWRITER
    handler is a bit more involved, as it takes the new its_cmd mutex and
    then iterates over the command buffer.
    The registers holding base addresses and attributes are sanitised before
    storing them.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • Introduce a new KVM device that represents an ARM Interrupt Translation
    Service (ITS) controller. Since there can be multiple of these per guest,
    we can't piggy back on the existing GICv3 distributor device, but create
    a new type of KVM device.
    On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data
    structure and store the pointer in the kvm_device data.
    Upon an explicit init ioctl from userland (after having set up the MMIO
    address) we register the handlers with the kvm_io_bus framework.
    Any reference to an ITS thus has to go via this interface.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • The ARM GICv3 ITS emulation code goes into a separate file, but needs
    to be connected to the GICv3 emulation, of which it is an option.
    The ITS MMIO handlers require the respective ITS pointer to be passed in,
    so we amend the existing VGIC MMIO framework to let it cope with that.
    Also we introduce the basic ITS data structure and initialize it, but
    don't return any success yet, as we are not yet ready for the show.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • In the GICv3 redistributor there are the PENDBASER and PROPBASER
    registers which we did not emulate so far, as they only make sense
    when having an ITS. In preparation for that, emulate those MMIO
    accesses by storing the 64-bit data written to them in a variable
    which we later read in the ITS emulation.
    We also sanitise the registers, making sure RES0 regions are respected
    and checking for valid memory attributes.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • At the moment our struct vgic_irq's are statically allocated at guest
    creation time. So getting a pointer to an IRQ structure is trivial and
    safe. LPIs are more dynamic: they can be mapped and unmapped at any time
    during the guest's _runtime_.
    In preparation for supporting LPIs we introduce reference counting for
    those structures using the kernel's kref infrastructure.
    Since private IRQs and SPIs are statically allocated, we avoid actually
    refcounting them, since they would never be released anyway.
    But we take provisions to increase the refcount when an IRQ gets onto a
    VCPU list and decrease it when it gets removed. Also this introduces
    vgic_put_irq(), which wraps kref_put and hides the release function from
    the callers.
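
    The policy described above can be sketched in user space as follows.
    This uses a plain counter instead of the kernel's atomic kref, so it is
    not thread-safe; the names and the 8192 LPI threshold are assumptions
    made for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define FIRST_LPI_SKETCH 8192u   /* assumed first LPI INTID */

struct irq_sketch {
    uint32_t intid;
    unsigned int refcount;       /* meaningful for LPIs only */
};

/* Allocate a dynamically mapped LPI holding one initial reference. */
struct irq_sketch *irq_alloc_lpi_sketch(uint32_t intid)
{
    struct irq_sketch *irq = malloc(sizeof(*irq));

    if (irq) {
        irq->intid = intid;
        irq->refcount = 1;
    }
    return irq;
}

/* Take a reference; static IRQs (private IRQs, SPIs) are not counted. */
void irq_get_sketch(struct irq_sketch *irq)
{
    if (irq->intid >= FIRST_LPI_SKETCH)
        irq->refcount++;
}

/*
 * Drop a reference and hide the release step from the caller.
 * Returns true if the structure was freed.
 */
bool irq_put_sketch(struct irq_sketch *irq)
{
    if (irq->intid < FIRST_LPI_SKETCH)
        return false;            /* static IRQs are never released */
    if (--irq->refcount)
        return false;
    free(irq);
    return true;
}
```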

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • Logically a GICv3 redistributor is assigned to a (v)CPU, so we should
    aim to keep redistributor related variables out of our struct vgic_dist.

    Let's start by replacing the redistributor related kvm_io_device array
    with two members in our existing struct vgic_cpu, which are naturally
    per-VCPU and thus don't require any allocation / freeing.
    So apart from the better fit with the redistributor design this saves
    some code as well.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     

04 Jul, 2016

1 commit

  • I don't think any single piece of the KVM/ARM code ever generated
    as much hatred as the GIC emulation.

    It was written by someone who had zero experience in modeling
    hardware (me), was riddled with design flaws, should have been
    scrapped and rewritten from scratch long before having a remote
    chance of reaching mainline, and yet we supported it for a good
    three years. No need to mention the names of those who suffered,
    the git log is singing their praises.

    Thankfully, we now have a much more maintainable implementation,
    and we can safely put the grumpy old GIC to rest.

    Fellow hackers, please raise your glass in memory of the GIC:

    The GIC is dead, long live the GIC!

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     

27 Jun, 2016

1 commit

  • When CONFIG_ARM_PMU is disabled, we get the following build error:

    arch/arm64/kvm/sys_regs.c: In function 'pmu_counter_idx_valid':
    arch/arm64/kvm/sys_regs.c:564:27: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function)
    if (idx >= val && idx != ARMV8_PMU_CYCLE_IDX)
    ^
    arch/arm64/kvm/sys_regs.c:564:27: note: each undeclared identifier is reported only once for each function it appears in
    arch/arm64/kvm/sys_regs.c: In function 'access_pmu_evcntr':
    arch/arm64/kvm/sys_regs.c:592:10: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function)
    idx = ARMV8_PMU_CYCLE_IDX;
    ^
    arch/arm64/kvm/sys_regs.c: In function 'access_pmu_evtyper':
    arch/arm64/kvm/sys_regs.c:638:14: error: 'ARMV8_PMU_CYCLE_IDX' undeclared (first use in this function)
    if (idx == ARMV8_PMU_CYCLE_IDX)
    ^
    arch/arm64/kvm/hyp/switch.c:86:15: error: 'ARMV8_PMU_USERENR_MASK' undeclared (first use in this function)
    write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);

    This patch fixes the build with CONFIG_ARM_PMU disabled.

    Cc: Christoffer Dall
    Cc: Marc Zyngier
    Signed-off-by: Sudeep Holla
    Signed-off-by: Christoffer Dall

    Sudeep Holla
     

20 May, 2016

21 commits

  • We now store the mapped hardware IRQ number in our struct, so we
    don't need the irq_phys_map for the new VGIC.
    Implement the hardware IRQ mapping on top of the reworked arch
    timer interface.

    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • map_resources is the last initialization step. It is executed on
    first VCPU run. At that stage the code checks that userspace has provided
    the base addresses for the relevant VGIC regions, which depend on the
    type of VGIC that is exposed to the guest. Also we check if the two
    regions overlap.
    If the checks succeed, we register the respective register frames with
    the kvm_io_bus framework.
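
    The overlap check mentioned above is the usual half-open interval test.
    A small sketch, with a hypothetical function name and assuming that
    base + size does not wrap around:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [a_base, a_base + a_size) and [b_base, b_base + b_size) share
 * at least one byte. Assumes neither sum overflows 64 bits.
 */
bool regions_overlap_sketch(uint64_t a_base, uint64_t a_size,
                            uint64_t b_base, uint64_t b_size)
{
    return a_base < b_base + b_size && b_base < a_base + a_size;
}
```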

    If we emulate a GICv2, the function also forces vgic_init execution if
    it has not been executed yet. Also we map the virtual GIC CPU interface
    onto the guest's CPU interface.

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Eric Auger
     
  • This patch allocates and initializes the data structures used
    to model the vgic distributor and virtual cpu interfaces. At that
    stage the number of IRQs and the number of virtual CPUs are frozen.

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Eric Auger
     
  • This patch implements the vgic_creation function which is
    called on the CREATE_IRQCHIP VM ioctl (v2 only) or KVM_CREATE_DEVICE.

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Eric Auger
     
  • Implements the kvm_vgic_hyp_init and vgic_probe functions.
    This uses the new firmware independent VGIC probing to support both ACPI
    and DT based systems (code from Marc Zyngier).

    The vgic_global struct is enriched with new fields populated
    by those functions.

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Eric Auger
     
  • kvm_vgic_addr is used by userspace to set the base address of
    the following register regions, as seen by the guest:
    - distributor (v2 and v3),
    - re-distributors (v3),
    - CPU interface (v2).

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Eric Auger
     
  • In contrast to GICv2, SGIs in a GICv3 implementation are not triggered
    by an MMIO write, but by a system register write. KVM knows about
    that register already; we just need to implement the handler and wire
    it up to the core KVM/ARM code.

    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • Add an MMIO handling framework to the VGIC emulation:
    Each register is described by its offset, size (or number of bits per
    IRQ, if applicable) and the read/write handler functions. We provide
    initialization macros so that each GIC register can easily be described later.

    Separate dispatch functions for read and write accesses are connected
    to the kvm_io_bus framework and binary-search for the responsible
    register handler based on the offset address within the region.
    We convert the incoming data (referenced by a pointer) to the host's
    endianness and use pass-by-value to hand the data over to the actual
    handler functions.
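
    A compact sketch of this dispatch scheme follows. The descriptor
    layout, the names, and the assumption that the bus data is
    little-endian are all illustrative choices, not the framework's actual
    definitions:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical register descriptor: offset, length, and two handlers. */
struct reg_region_sketch {
    uint32_t offset;
    uint32_t len;
    uint32_t (*read)(uint32_t offset);
    void (*write)(uint32_t offset, uint32_t val);   /* data passed by value */
};

/* bsearch() comparator: does the offset fall inside this region? */
static int match_region_sketch(const void *key, const void *elt)
{
    uint32_t off = *(const uint32_t *)key;
    const struct reg_region_sketch *r = elt;

    if (off < r->offset)
        return -1;
    if (off >= r->offset + r->len)
        return 1;
    return 0;
}

/* Find the handler region for 'off'; 'tbl' must be sorted by offset. */
const struct reg_region_sketch *
find_region_sketch(const struct reg_region_sketch *tbl, size_t n, uint32_t off)
{
    return bsearch(&off, tbl, n, sizeof(*tbl), match_region_sketch);
}

/* Turn 4 little-endian bus bytes into a host value on any host endianness. */
uint32_t le32_bytes_to_host_sketch(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

    A write dispatcher would look up the region, convert the pointed-to
    bytes once, and then call the write handler with the resulting value.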

    The register handler prototype and the endianness conversion are
    courtesy of Christoffer Dall.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall

    Marc Zyngier
     
  • Tell KVM whether a particular VCPU has an IRQ that needs handling
    in the guest. This is used to decide whether a VCPU is runnable.

    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall
    Reviewed-by: Marc Zyngier

    Eric Auger
     
  • Implement the framework for syncing IRQs between our emulation and
    the list registers, which represent the guest's view of IRQs.
    This is done in kvm_vgic_flush_hwstate and kvm_vgic_sync_hwstate,
    which get called on guest entry and exit.
    The code talking to the actual GICv2/v3 hardware is added in the
    following patches.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall
    Signed-off-by: Eric Auger
    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Marc Zyngier
     
  • Provide a vgic_queue_irq_unlock() function which decides whether a
    given IRQ needs to be queued to a VCPU's ap_list.
    This should be called whenever an IRQ becomes pending or enabled,
    either as a result of userspace injection, from in-kernel emulated
    devices like the architected timer or from MMIO accesses to the
    distributor emulation.
    Also provides the necessary functions to allow userland to inject an
    IRQ to a guest.
    Since this is the first code that starts using our locking mechanism, we
    add some (hopefully) clear documentation of our locking strategy and
    requirements along with this patch.

    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara

    Christoffer Dall
     
  • Add a new header file for the new and improved GIC implementation.
    The big change is that we now have a struct vgic_irq per IRQ instead
    of spreading all the information over various bitmaps.

    We include this new header conditionally from within the old header
    file for the time being to avoid touching all the users.

    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier

    Christoffer Dall
     
  • Currently the PMU uses a member of the struct vgic_dist directly,
    which not only breaks abstraction, but will fail with the new VGIC.
    Abstract this access in the VGIC header file and refactor the validity
    check in the PMU code.

    Signed-off-by: Andre Przywara

    Andre Przywara
     
  • The number of list registers is a property of the underlying system, not
    of the emulated VGIC CPU interface.

    As we are about to move this variable to global state in the new vgic
    for clarity, move it from the legacy implementation as well to make the
    merge of the new code easier.

    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Reviewed-by: Andre Przywara

    Christoffer Dall
     
  • We are about to modify the VGIC to allocate all data structures
    dynamically and store mapped IRQ information on a per-IRQ struct, which
    is indeed allocated dynamically at init time.

    Therefore, we cannot record the mapped IRQ info from the timer at timer
    reset time like it's done now, because VCPU reset happens before timer
    init.

    A possible later time to do this is on the first run of a VCPU; it
    just requires us to move the enable state to be a per-VCPU state and do
    the lookup of the physical IRQ number when we are about to run the VCPU.

    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara

    Christoffer Dall
     
  • Now that the virtual arch timer does not care about the irq_phys_map
    anymore, let's rework kvm_vgic_map_phys_irq() to return an error
    value instead. Any reference to that mapping can later be done by
    passing the correct combination of VCPU and virtual IRQ number.
    This makes the irq_phys_map handling completely private to the
    VGIC code.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • Now that the interface between the arch timer and the VGIC does not
    require passing the irq_phys_map entry pointer anymore, let's remove
    it from the virtual arch timer and use the virtual IRQ number instead
    directly.
    The remaining pointer returned by kvm_vgic_map_phys_irq() will be
    removed in the following patch.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • The communication of a Linux IRQ number from outside the VGIC to the
    vgic was a leftover from the day when the vgic code cared about how a
    particular device injects virtual interrupts mapped to a physical
    interrupt.

    We can safely remove this notion, leaving all physical IRQ handling to
    be done in the device driver (the arch timer in this case), which makes
    room for a saner API for the new VGIC.

    Signed-off-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger

    Christoffer Dall
     
  • kvm_vgic_unmap_phys_irq() only needs the virtual IRQ number, so let's
    just pass that between the arch timer and the VGIC to get rid of
    the irq_phys_map pointer.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • For getting the active state of a mapped IRQ, we actually only need
    the virtual IRQ number, not the pointer to the mapping entry.
    Pass the virtual IRQ number from the arch timer to the VGIC directly.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Andre Przywara
     
  • When we want to inject a hardware mapped IRQ into a guest, we actually
    only need the virtual IRQ number from the irq_phys_map.
    So let's pass this number directly from the arch timer to the VGIC
    to avoid using the map as a parameter.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall

    Andre Przywara
     

03 May, 2016

1 commit

  • Currently, the firmware tables are parsed twice: once in the GIC
    drivers, and again when initializing the vGIC. This means code
    duplication and makes it more tedious to add support for another
    firmware table (like ACPI).

    Use the recently introduced helper gic_get_kvm_info() to get
    information about the virtual GIC.

    With this change, the virtual GIC becomes agnostic to the firmware
    table and KVM will be able to initialize the vGIC on ACPI.

    Signed-off-by: Julien Grall
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Julien Grall