25 Jul, 2017

1 commit

  • kvm_pmu_overflow_set() is called from perf's interrupt handler, so
    calling kvm_vgic_inject_irq() from it, as introduced with
    "KVM: arm/arm64: PMU: remove request-less vcpu kick", is a really bad
    idea: it's quite easy to try to retake a lock that the interrupted
    context is already holding. The fix is to use a vcpu kick, leaving
    the interrupt injection to kvm_pmu_sync_hwstate(), as was done
    before the refactoring. We don't just revert, though, because the
    earlier kick was request-less, leaving the vcpu exposed to the
    request-less vcpu kick race, and also because the kick was used
    unnecessarily from register access handlers.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Andrew Jones
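
The split described above can be sketched as a toy model. This is not the kernel code: the struct, its fields and both function names are hypothetical stand-ins, and the "injection" is just a counter. It only illustrates that the interrupt-context path records state and asks for a kick, while the level is handed to the interrupt controller later, from vcpu context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-vcpu PMU state; not the kernel's struct. */
struct pmu_sketch {
    uint32_t overflow_bits;   /* models the guest's overflow status bits */
    bool kick_requested;      /* models a pending vcpu request plus kick */
    bool irq_level;           /* level last handed to the interrupt controller */
    int injections;           /* how many times the level was injected */
};

/* Safe in interrupt context: only records state and asks for a kick. */
static void overflow_set_sketch(struct pmu_sketch *pmu, uint32_t mask)
{
    assert(pmu);
    pmu->overflow_bits |= mask;
    pmu->kick_requested = true;   /* no locks taken, no injection here */
}

/* Runs later in vcpu context, where taking the injection lock is safe. */
static void sync_hwstate_sketch(struct pmu_sketch *pmu)
{
    bool level;

    assert(pmu);
    level = pmu->overflow_bits != 0;
    pmu->kick_requested = false;
    if (level != pmu->irq_level) {
        pmu->irq_level = level;
        pmu->injections++;        /* stands in for kvm_vgic_inject_irq() */
    }
}
```

In this model a second sync with no state change injects nothing, because the level is only forwarded when it differs from the last one.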
     

08 Jun, 2017

5 commits

  • The PMU IRQ number is set through the VCPU device's KVM_SET_DEVICE_ATTR
    ioctl handler for the KVM_ARM_VCPU_PMU_V3_IRQ attribute, but there is no
    enforced or stated requirement that this must happen after initializing
    the VGIC. As a result, calling vgic_valid_spi(), which relies on
    nr_spis being set during the VGIC init, can incorrectly fail.

    Introduce irq_is_spi, which determines if an IRQ number is within the
    SPI range without verifying it against the actual VGIC properties.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Marc Zyngier

    Christoffer Dall
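
A minimal sketch of such a range check, assuming the architectural GIC layout (interrupt IDs 0-15 are SGIs, 16-31 are PPIs, 32-1019 are SPIs). The macro and function names below are illustrative, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed architectural GIC layout: 0-15 SGIs, 16-31 PPIs, 32-1019 SPIs. */
#define SKETCH_NR_PRIVATE_IRQS 32
#define SKETCH_MAX_SPI         1019

/*
 * Range-only check: does not look at how many SPIs a particular
 * VGIC instance was configured with, so it works before VGIC init.
 */
static bool irq_is_spi_sketch(unsigned int irq)
{
    return irq >= SKETCH_NR_PRIVATE_IRQS && irq <= SKETCH_MAX_SPI;
}
```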
     
  • When injecting an IRQ to the VGIC, you now have to present an owner
    token for that IRQ line to show that you are the owner of that line.

    IRQ lines driven from userspace or via an irqfd do not have an owner and
    will simply pass a NULL pointer.

    Also get rid of the unused kvm_vgic_inject_mapped_irq prototype.

    Signed-off-by: Christoffer Dall
    Acked-by: Marc Zyngier

    Christoffer Dall
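
The owner check can be pictured with the toy model below. The struct and function are hypothetical, and how a mismatch is reported (here a plain false return that leaves the line untouched) is an assumption made for the sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one interrupt line. */
struct irq_line_sketch {
    void *owner;   /* NULL: driven from userspace or via an irqfd */
    bool level;
};

/* Returns true if the injection was accepted, false on an owner mismatch. */
static bool inject_irq_sketch(struct irq_line_sketch *line, bool level,
                              void *owner)
{
    assert(line);
    if (line->owner != owner)
        return false;          /* caller did not present the right token */
    line->level = level;
    return true;
}
```

An in-kernel device passes its own token, while userspace and irqfd paths pass NULL, which only matches lines that nobody has claimed.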
     
  • We check if other in-kernel devices have already been connected to the
    GIC for a particular interrupt line when possible.

    For the PMU, we can do this whenever setting the PMU interrupt number
    from userspace.

    For the timers, we have to wait until we try to enable the timer,
    because we have a concept of default IRQ numbers that userspace
    shouldn't have to work around in the initialization phase.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Marc Zyngier

    Christoffer Dall
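
A rough sketch of the "already connected" check: the first device to claim a line records its token, and a different device claiming the same line is refused. The names are hypothetical, and the -EEXIST error code is an assumption chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the ownership slot of one interrupt line. */
struct irq_owner_sketch {
    void *owner;   /* NULL until an in-kernel device claims the line */
};

/* Returns 0 on success, -EEXIST if another device already owns the line. */
static int claim_line_sketch(struct irq_owner_sketch *line, void *owner)
{
    assert(line && owner);
    if (line->owner && line->owner != owner)
        return -EEXIST;
    line->owner = owner;
    return 0;
}
```

For the PMU this check could run when userspace sets the interrupt number; for the timers it would run when the timer is enabled, as the message above explains.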
     
  • We are about to need this define in the arch timer code as well so move
    it to a common location.

    Signed-off-by: Christoffer Dall
    Acked-by: Marc Zyngier

    Christoffer Dall
     
  • Since we now have support for devices in userspace, which allows
    reporting the PMU overflow output status to userspace, we should
    actually allow
    creating the PMU on systems without an in-kernel irqchip, which in turn
    requires us to slightly clarify error codes for the ABI and move things
    around for the initialization phase.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Marc Zyngier

    Christoffer Dall
     

04 Jun, 2017

1 commit

  • Refactor PMU overflow handling in order to remove the request-less
    vcpu kick. Now, since kvm_vgic_inject_irq() uses vcpu requests,
    there should be no chance that a kick sent at just the wrong time
    (between the VCPU's call to kvm_pmu_flush_hwstate() and its entry
    into guest mode) leaves the guest unable to see updated GIC state
    until its next exit some time later for some other reason.

    Signed-off-by: Andrew Jones
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Andrew Jones
     

09 Apr, 2017

1 commit


18 Nov, 2016

1 commit

  • KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
    But this function can't deal with PMCCFILTR correctly because the evtCount
    bits of PMCCFILTR, which are reserved as 0, conflict with the SW_INCR
    event type of the other PMXEVTYPER registers. To fix it, when
    eventsel == 0, this function shouldn't return immediately; instead it
    needs to check further if select_idx is ARMV8_PMU_CYCLE_IDX.

    Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTR
    blindly to attr.config. Instead it ought to convert the request to the
    "cpu cycle" event type (i.e. 0x11).

    To support this patch and to prevent duplicated definitions, a limited
    set of ARMv8 perf event types was relocated from perf_event.c to
    asm/perf_event.h.

    Cc: stable@vger.kernel.org # 4.6+
    Acked-by: Will Deacon
    Signed-off-by: Wei Huang
    Signed-off-by: Marc Zyngier

    Wei Huang
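
The decision described above can be sketched as follows. This is a simplified model, not the patched function: the function name, the out-parameter and the constants are illustrative, the field width in SKETCH_EVENT_MASK is an assumption, and only the values 0 (SW_INCR) and 0x11 (CPU cycles) come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CYCLE_IDX        31      /* assumed index of the cycle counter */
#define SKETCH_EVENT_MASK       0x3ffu  /* assumed width of the evtCount field */
#define SKETCH_EVENT_SW_INCR    0x00u
#define SKETCH_EVENT_CPU_CYCLES 0x11u

/*
 * Returns true and stores the perf config when a perf event should be
 * created; returns false for a software-increment counter, which does
 * not need a backing perf event.
 */
static bool pick_event_config_sketch(uint32_t data, unsigned int select_idx,
                                     uint32_t *config)
{
    uint32_t eventsel = data & SKETCH_EVENT_MASK;

    assert(config);
    if (select_idx == SKETCH_CYCLE_IDX) {
        /* PMCCFILTR has no usable event field: always count CPU cycles. */
        *config = SKETCH_EVENT_CPU_CYCLES;
        return true;
    }
    if (eventsel == SKETCH_EVENT_SW_INCR)
        return false;
    *config = eventsel;
    return true;
}
```

With this ordering, a zero event field on the cycle counter no longer looks like a SW_INCR request.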
     

28 Sep, 2016

1 commit

  • If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
    irqchip, then we end up in a nasty path where we try to take an
    uninitialized spinlock, which can lead to all sorts of breakages.

    Luckily, QEMU always creates the VGIC before the PMU, so we can
    establish this as ABI and check for the VGIC in the PMU init stage.
    This can be relaxed at a later time if we want to support PMU with a
    userspace irqchip.

    Cc: stable@vger.kernel.org
    Cc: Shannon Zhao
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
     

20 May, 2016

1 commit


01 Apr, 2016

1 commit

  • The kernel is written in C, not Python, so we need braces around
    multi-line if statements. GCC 6 actually warns about this, thanks to the
    fantastic new "-Wmisleading-indentation" flag:

    | virt/kvm/arm/pmu.c: In function ‘kvm_pmu_overflow_status’:
    | virt/kvm/arm/pmu.c:198:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation]
    | reg &= vcpu_sys_reg(vcpu, PMCNTENSET_EL0);
    | ^~~
    | arch/arm64/kvm/../../../virt/kvm/arm/pmu.c:196:2: note: ...this ‘if’ clause, but it is not
    | if ((vcpu_sys_reg(vcpu, PMCR_EL0) & ARMV8_PMU_PMCR_E))
    | ^~

    As it turns out, this particular case is harmless (we just do some &=
    operations with 0), but worth fixing nonetheless.

    Signed-off-by: Will Deacon
    Signed-off-by: Christoffer Dall

    Will Deacon
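
To see why the missing braces were harmless here, the sketch below models the two shapes of the function with plain integers instead of vcpu_sys_reg() accesses. The function names, parameters and the SKETCH_PMCR_E macro are illustrative. The unbraced variant is indented the way the compiler actually parses it, so it does not trigger the warning itself.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PMCR_E 0x1u   /* global enable bit */

/* Intended logic: every statement is guarded by the enable check. */
static uint32_t status_braced(uint32_t pmcr, uint32_t ovs, uint32_t cnten,
                              uint32_t inten)
{
    uint32_t reg = 0;

    if (pmcr & SKETCH_PMCR_E) {
        reg = ovs;
        reg &= cnten;
        reg &= inten;
    }
    return reg;
}

/* What the compiler saw without braces: only the first statement is guarded. */
static uint32_t status_unbraced(uint32_t pmcr, uint32_t ovs, uint32_t cnten,
                                uint32_t inten)
{
    uint32_t reg = 0;

    if (pmcr & SKETCH_PMCR_E)
        reg = ovs;
    reg &= cnten;   /* runs unconditionally; harmless when reg is still 0 */
    reg &= inten;
    return reg;
}
```

Both variants return the same value for every input, since masking a zero value leaves it zero.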
     

01 Mar, 2016

11 commits

  • To configure the virtual PMUv3 overflow interrupt number, we use the
    vcpu kvm_device ioctl, encapsulating the KVM_ARM_VCPU_PMU_V3_IRQ
    attribute within the KVM_ARM_VCPU_PMU_V3_CTRL group.

    After configuring the PMUv3, call the vcpu ioctl with attribute
    KVM_ARM_VCPU_PMU_V3_INIT to initialize the PMUv3.

    Signed-off-by: Shannon Zhao
    Acked-by: Peter Maydell
    Reviewed-by: Andrew Jones
    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • To support guest PMUv3, use one bit of the VCPU INIT feature array.
    Initialize the PMU when initializing the vcpu with that bit and the
    PMU overflow interrupt set.

    Signed-off-by: Shannon Zhao
    Acked-by: Peter Maydell
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • When KVM frees a VCPU, it needs to free the perf_event of the PMU.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Marc Zyngier
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • When resetting a vcpu, KVM needs to reset the PMU state to its
    initial status.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Marc Zyngier
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • When calling perf_event_create_kernel_counter to create a perf_event,
    assign an overflow handler. Then when the perf event overflows, set the
    corresponding bit of the guest PMOVSSET register. If this counter is
    enabled and its interrupt is enabled as well, kick the vcpu to sync the
    interrupt.

    On VM entry, if a counter has overflowed and the interrupt level has
    changed, inject the interrupt with the corresponding level. On VM exit,
    sync the interrupt level as well if it has changed.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Marc Zyngier
    Reviewed-by: Andrew Jones
    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Shannon Zhao
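
A toy model of the overflow handler's decision, with plain integers standing in for the guest registers. The struct, its field names and the function are hypothetical; the sketch only shows the "set the overflow bit, then kick if the counter and its interrupt are both enabled" rule from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the relevant guest PMU registers. */
struct pmu_regs_sketch {
    uint32_t pmovsset;     /* overflow status bits */
    uint32_t pmcntenset;   /* counter enable bits */
    uint32_t pmintenset;   /* interrupt enable bits */
};

/* Record an overflow of counter idx; return true if the vcpu needs a kick. */
static bool on_overflow_sketch(struct pmu_regs_sketch *r, unsigned int idx)
{
    uint32_t bit;

    assert(r && idx < 32);
    bit = UINT32_C(1) << idx;
    r->pmovsset |= bit;
    return (r->pmcntenset & bit) && (r->pmintenset & bit);
}
```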
     
  • According to the ARMv8 spec, writing 1 to PMCR.E enables all counters
    that are enabled by PMCNTENSET, while writing 0 to PMCR.E disables all
    counters. Writing 1 to PMCR.P resets all event counters, not including
    PMCCNTR, to zero. Writing 1 to PMCR.C resets PMCCNTR to zero.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Shannon Zhao
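
The three PMCR bits can be illustrated with a small model. The struct, the "running" mask and the function name are hypothetical; the bit positions (E = bit 0, P = bit 1, C = bit 2) and the cycle counter sitting at index 31 are assumptions taken from the architecture rather than from the message.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PMCR_E    (1u << 0)   /* enable */
#define SKETCH_PMCR_P    (1u << 1)   /* reset event counters */
#define SKETCH_PMCR_C    (1u << 2)   /* reset cycle counter */
#define SKETCH_CYCLE_IDX 31

/* Hypothetical PMU model: [0..30] event counters, [31] cycle counter. */
struct pmu_counters_sketch {
    uint64_t counter[32];
    uint32_t pmcntenset;   /* per-counter enable bits */
    uint32_t running;      /* counters that are actually counting */
};

static void handle_pmcr_sketch(struct pmu_counters_sketch *pmu, uint32_t val)
{
    unsigned int i;

    assert(pmu);
    /* E: counters run only if globally enabled and set in PMCNTENSET. */
    pmu->running = (val & SKETCH_PMCR_E) ? pmu->pmcntenset : 0;

    /* P: zero every event counter, but leave the cycle counter alone. */
    if (val & SKETCH_PMCR_P)
        for (i = 0; i < SKETCH_CYCLE_IDX; i++)
            pmu->counter[i] = 0;

    /* C: zero only the cycle counter. */
    if (val & SKETCH_PMCR_C)
        pmu->counter[SKETCH_CYCLE_IDX] = 0;
}
```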
     
  • Add an access handler which emulates writing and reading the PMSWINC
    register and add support for creating a software increment event.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
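
A sketch of what a software increment write might do, assuming the architectural behaviour that each bit written to PMSWINC increments the matching event counter if that counter is enabled and programmed with the SW_INCR event (event number 0). The struct and names are hypothetical and the model ignores overflow signalling.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_EVENT_COUNTERS 31
#define SKETCH_EVENT_SW_INCR     0x00u

/* Hypothetical model of the event counters and their configuration. */
struct swinc_sketch {
    uint32_t evtype[SKETCH_NR_EVENT_COUNTERS];    /* programmed event number */
    uint32_t counter[SKETCH_NR_EVENT_COUNTERS];   /* 32-bit event counters */
    uint32_t pmcntenset;                          /* per-counter enable bits */
};

static void software_increment_sketch(struct swinc_sketch *pmu, uint32_t val)
{
    unsigned int i;

    assert(pmu);
    for (i = 0; i < SKETCH_NR_EVENT_COUNTERS; i++) {
        uint32_t bit = UINT32_C(1) << i;

        if (!(val & bit) || !(pmu->pmcntenset & bit))
            continue;                 /* not selected, or counter disabled */
        if (pmu->evtype[i] == SKETCH_EVENT_SW_INCR)
            pmu->counter[i]++;        /* unsigned, so it wraps at 32 bits */
    }
}
```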
     
  • Since the reset value of PMOVSSET and PMOVSCLR is UNKNOWN, use
    reset_unknown for their reset handler. Add a handler to emulate writing
    the PMOVSSET or PMOVSCLR register.

    When writing a non-zero value to PMOVSSET, if the counter and its
    interrupt are enabled, kick this vcpu to sync the PMU interrupt.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • When we use tools like perf on the host, perf passes the event type and
    the id of this event type category to the kernel, and the kernel then
    maps them to a hardware event number and writes this number to the
    PMU's PMEVTYPER_EL0 register. When getting the event number in KVM,
    directly use the raw event type to create a perf_event for it.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Shannon Zhao
     
  • Since the reset value of PMCNTENSET and PMCNTENCLR is UNKNOWN, use
    reset_unknown for their reset handler. Add a handler to emulate writing
    the PMCNTENSET or PMCNTENCLR register.

    When writing to PMCNTENSET, call perf_event_enable to enable the perf
    event. When writing to PMCNTENCLR, call perf_event_disable to disable
    the perf event.

    Signed-off-by: Shannon Zhao
    Signed-off-by: Marc Zyngier

    Shannon Zhao
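
The set/clear pair can be modelled as two views of one enable mask. In the sketch below the struct and function names are hypothetical, and the perf_running mask merely stands in for calls to perf_event_enable() and perf_event_disable(); the interaction with the global PMCR.E bit is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the counter enable state. */
struct cnten_sketch {
    uint32_t enabled;        /* guest view, shared by PMCNTENSET/PMCNTENCLR */
    uint32_t perf_running;   /* stand-in for which perf events are enabled */
};

/* Write to PMCNTENSET: bits written as 1 enable the matching counters. */
static void write_cntenset_sketch(struct cnten_sketch *s, uint32_t val)
{
    assert(s);
    s->enabled |= val;
    s->perf_running |= val;      /* stands in for perf_event_enable() */
}

/* Write to PMCNTENCLR: bits written as 1 disable the matching counters. */
static void write_cntenclr_sketch(struct cnten_sketch *s, uint32_t val)
{
    assert(s);
    s->enabled &= ~val;
    s->perf_running &= ~val;     /* stands in for perf_event_disable() */
}
```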
     
  • These registers include PMEVCNTRn, PMCCNTR and PMXEVCNTR, which is
    mapped to PMEVCNTRn.

    The access handler translates all aarch32 register offsets to aarch64
    ones and uses vcpu_sys_reg() to access their values, which avoids
    having to handle big endian.

    When reading these registers, return the sum of the register value and
    the value the perf event counts.

    Signed-off-by: Shannon Zhao
    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier

    Shannon Zhao
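
The read path described above amounts to a single addition. A minimal sketch, where the struct and function are hypothetical and the truncation to the counter's width via a bitmask is an assumption added for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one guest counter backed by a perf event. */
struct counter_sketch {
    uint64_t reg_value;    /* value stored in the guest-visible register */
    uint64_t perf_count;   /* events counted by the backing perf event */
    uint64_t bitmask;      /* assumed counter width, e.g. 0xffffffff */
};

/* Guest read: stored register value plus what perf has counted so far. */
static uint64_t read_counter_sketch(const struct counter_sketch *c)
{
    assert(c);
    return (c->reg_value + c->perf_count) & c->bitmask;
}
```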