18 Aug, 2016

1 commit


17 Aug, 2016

2 commits

  • Similarly to f005bd7e3b84 ("clocksource/arm_arch_timer: Force
    per-CPU interrupt to be level-triggered"), make sure we can
    survive an interrupt that has been misconfigured as edge-triggered
    by forcing it to be level-triggered (active low is assumed, but
    the GIC doesn't really care whether this is high or low).

    Hopefully, the amount of shouting in the kernel log will convince
    the user to do something about their firmware.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • When a guest wants to map a device-ID/event-ID combination that is
    already mapped, we may end up in a situation where an LPI is never
    "put", thus never being freed.
    Since the GICv3 spec says that mapping an already mapped LPI is
    UNPREDICTABLE, let's just bail out early in this situation to avoid
    any potential leaks.

    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Andre Przywara
     

16 Aug, 2016

3 commits

  • When userspace provides the doorbell address for an MSI to be
    injected into the guest, we find a KVM device which feels responsible.
    Let's check that this device is really an emulated ITS before we make
    real use of the container_of-ed pointer.

    [ Moved NULL-pointer check to caller of static function
    - Christoffer ]

    Signed-off-by: Andre Przywara
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Andre Przywara
     
  • Currently we register an ITS device upon userland issuing the CTLR_INIT
    ioctl to mark initialization of the ITS as done.
    This deviates from the initialization sequence of the existing GIC
    devices and does not play well with the way QEMU handles things.
    To be more in line with what we are used to, register the ITS(es) just
    before the first VCPU is about to run, so in the map_resources() call.
    This involves iterating through the list of KVM devices and mapping
    each ITS that we find.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Tested-by: Eric Auger
    Signed-off-by: Christoffer Dall

    Andre Przywara
     
  • There are two problems with the current implementation of the MMIO
    handlers for the propbaser and pendbaser:

    First, the write to the value itself is not guaranteed to be an atomic
    64-bit write so two concurrent writes to the structure field could be
    intermixed.

    Second, because we do a read-modify-update operation without any
    synchronization, if we have two 32-bit accesses to separate parts of the
    register, we can lose one of them.

    By using the atomic cmpxchg64 we should cover both issues above.

    Reviewed-by: Andre Przywara
    Signed-off-by: Christoffer Dall

    Christoffer Dall
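
    The compare-and-exchange approach described above can be sketched in
    userspace with C11 atomics. This is only an illustration, not the kernel
    code: reg64_t and reg64_write32() are hypothetical names, and
    atomic_compare_exchange_weak stands in for the kernel's cmpxchg64.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit emulated register such as PROPBASER. */
typedef _Atomic uint64_t reg64_t;

/*
 * Merge a 32-bit write at byte offset 0 or 4 into the 64-bit register.
 * If another writer changes the register between our read and our update,
 * the compare-exchange fails, reloads the current value into 'old', and we
 * retry, so a concurrent write to the other half is never lost.
 */
static void reg64_write32(reg64_t *reg, unsigned int offset, uint32_t val)
{
    assert(offset == 0 || offset == 4);

    unsigned int shift = offset * 8;
    uint64_t mask = (uint64_t)0xffffffffu << shift;
    uint64_t old = atomic_load(reg);
    uint64_t new;

    do {
        new = (old & ~mask) | ((uint64_t)val << shift);
    } while (!atomic_compare_exchange_weak(reg, &old, new));
}
```

    Writing the low half and then the high half leaves both values in the
    register, whichever order the two writers run in.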
     

12 Aug, 2016

2 commits

  • KVM devices were manipulating list data structures without any form of
    synchronization, and some implementations of the create operations also
    suffered from a lack of synchronization.

    Now that we've split the xics create operation into create and init, we
    can hold the kvm->lock mutex while calling the create operation and when
    manipulating the devices list.

    The error path in the generic code gets slightly ugly because we have to
    take the mutex again and delete the device from the list, but holding
    the mutex during anon_inode_getfd or releasing/locking the mutex in the
    common non-error path seemed wrong.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Paolo Bonzini
    Acked-by: Christian Borntraeger
    Signed-off-by: Radim Krčmář

    Christoffer Dall
     
  • As we are about to hold the kvm->lock during the create operation on KVM
    devices, we should move the call to xics_debugfs_init into its own
    function, since holding a mutex over extended amounts of time might not
    be a good idea.

    Introduce an init operation on the kvm_device_ops struct which cannot
    fail and call this, if configured, after the device has been created.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Paolo Bonzini
    Signed-off-by: Radim Krčmář

    Christoffer Dall
     

10 Aug, 2016

2 commits

  • Right now the following sequence of events can happen:

    1. Thread X calls vgic_put_irq
    2. Thread Y calls vgic_add_lpi
    3. Thread Y gets lpi_list_lock
    4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
    5. Thread Y finds the irq via the lpi_list_lock, raises the ref
    count to 1, and releases the lpi_list_lock.
    6. Thread X proceeds and frees the irq.

    Avoid this by holding the spinlock around the kref_put.

    Reviewed-by: Andre Przywara
    Signed-off-by: Christoffer Dall

    Christoffer Dall
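
    A minimal userspace sketch of the fix, assuming a pthread mutex in place
    of the lpi_list spinlock and a plain integer in place of the kref. The
    names struct lpi, lpi_add(), lpi_get() and lpi_put() are hypothetical.
    The point is that the final decrement and the list removal happen under
    the same lock that lookups take, so a lookup can never revive an object
    that is about to be freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct lpi {
    int refcount;        /* protected by list_lock in this sketch */
    int intid;
    struct lpi *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct lpi *lpi_list;

/* Allocate an LPI with one reference and link it into the list. */
static struct lpi *lpi_add(int intid)
{
    struct lpi *irq = malloc(sizeof(*irq));

    if (!irq)
        return NULL;
    irq->refcount = 1;
    irq->intid = intid;
    pthread_mutex_lock(&list_lock);
    irq->next = lpi_list;
    lpi_list = irq;
    pthread_mutex_unlock(&list_lock);
    return irq;
}

/* Look up an LPI and take a reference while holding the list lock. */
static struct lpi *lpi_get(int intid)
{
    struct lpi *irq;

    pthread_mutex_lock(&list_lock);
    for (irq = lpi_list; irq; irq = irq->next) {
        if (irq->intid == intid) {
            irq->refcount++;
            break;
        }
    }
    pthread_mutex_unlock(&list_lock);
    return irq;
}

/* Drop a reference; returns true if this was the last one and irq was freed. */
static bool lpi_put(struct lpi *irq)
{
    bool last;

    pthread_mutex_lock(&list_lock);
    assert(irq->refcount > 0);
    last = (--irq->refcount == 0);
    if (last) {
        /* Unlink under the same lock, so no lookup can find it anymore. */
        struct lpi **p = &lpi_list;

        while (*p && *p != irq)
            p = &(*p)->next;
        if (*p)
            *p = irq->next;
    }
    pthread_mutex_unlock(&list_lock);
    if (last)
        free(irq);
    return last;
}
```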
     
  • During low memory conditions, we could be dereferencing a NULL pointer
    when vgic_add_lpi fails to allocate memory.

    Consider for example this call sequence:

    vgic_its_cmd_handle_mapi
        itte->irq = vgic_add_lpi(kvm, lpi_nr);
        update_lpi_config(kvm, itte->irq, NULL);
            ret = kvm_read_guest(kvm, propbase + irq->intid
                                                 ^^^^
                                                 kaboom?

    Instead, return an error pointer from vgic_add_lpi and check the return
    value from its single caller.

    Signed-off-by: Christoffer Dall

    Christoffer Dall
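
    The error-pointer convention mentioned above can be sketched in
    userspace as follows. The helpers err_ptr(), is_err() and ptr_err()
    mimic the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); add_lpi(),
    handle_mapi() and the fail_alloc switch are hypothetical and exist only
    to simulate a failed allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
    assert(is_err(p));
    return (long)(intptr_t)p;
}

struct irq {
    unsigned int intid;
};

/* Hypothetical allocator: returns an error pointer instead of NULL. */
static struct irq *add_lpi(unsigned int intid, int fail_alloc)
{
    struct irq *irq = fail_alloc ? NULL : malloc(sizeof(*irq));

    if (!irq)
        return err_ptr(-ENOMEM);
    irq->intid = intid;
    return irq;
}

/* The single caller checks the result before dereferencing it. */
static int handle_mapi(unsigned int intid, int fail_alloc)
{
    struct irq *irq = add_lpi(intid, fail_alloc);
    int ret;

    if (is_err(irq))
        return (int)ptr_err(irq);

    ret = (irq->intid == intid) ? 0 : -EINVAL;
    free(irq);
    return ret;
}
```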
     

09 Aug, 2016

1 commit

  • According to the KVM API documentation a successful MSI injection
    should return a value > 0.
    Return possible errors in vgic_its_trigger_msi() and report a
    successful injection back to userland, while also reporting the
    case where the MSI could not be delivered due to the guest not
    having the LPI mapped, for instance.

    Signed-off-by: Andre Przywara
    Reviewed-by: Eric Auger
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Andre Przywara
     

04 Aug, 2016

1 commit


03 Aug, 2016

1 commit

  • Pull KVM updates from Paolo Bonzini:

    - ARM: GICv3 ITS emulation and various fixes. Removal of the
    old VGIC implementation.

    - s390: support for trapping software breakpoints, nested
    virtualization (vSIE), the STHYI opcode, initial extensions
    for CPU model support.

    - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
    of cleanups, preliminary to this and the upcoming support for
    hardware virtualization extensions.

    - x86: support for execute-only mappings in nested EPT; reduced
    vmexit latency for TSC deadline timer (by about 30%) on Intel
    hosts; support for more than 255 vCPUs.

    - PPC: bugfixes.

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
    KVM: PPC: Introduce KVM_CAP_PPC_HTM
    MIPS: Select HAVE_KVM for MIPS64_R{2,6}
    MIPS: KVM: Reset CP0_PageMask during host TLB flush
    MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
    MIPS: KVM: Sign extend MFC0/RDHWR results
    MIPS: KVM: Fix 64-bit big endian dynamic translation
    MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
    MIPS: KVM: Use 64-bit CP0_EBase when appropriate
    MIPS: KVM: Set CP0_Status.KX on MIPS64
    MIPS: KVM: Make entry code MIPS64 friendly
    MIPS: KVM: Use kmap instead of CKSEG0ADDR()
    MIPS: KVM: Use virt_to_phys() to get commpage PFN
    MIPS: Fix definition of KSEGX() for 64-bit
    KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
    kvm: x86: nVMX: maintain internal copy of current VMCS
    KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
    KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
    KVM: arm64: vgic-its: Simplify MAPI error handling
    KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
    KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
    ...

    Linus Torvalds
     

30 Jul, 2016

1 commit

  • Pull smp hotplug updates from Thomas Gleixner:
    "This is the next part of the hotplug rework.

    - Convert all notifiers with a priority assigned

    - Convert all CPU_STARTING/DYING notifiers

    The final removal of the STARTING/DYING infrastructure will happen
    when the merge window closes.

    Another 700 hundred line of unpenetrable maze gone :)"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
    timers/core: Correct callback order during CPU hot plug
    leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
    powerpc/numa: Convert to hotplug state machine
    arm/perf: Fix hotplug state machine conversion
    irqchip/armada: Avoid unused function warnings
    ARC/time: Convert to hotplug state machine
    clocksource/atlas7: Convert to hotplug state machine
    clocksource/armada-370-xp: Convert to hotplug state machine
    clocksource/exynos_mct: Convert to hotplug state machine
    clocksource/arm_global_timer: Convert to hotplug state machine
    rcu: Convert rcutree to hotplug state machine
    KVM/arm/arm64/vgic-new: Convert to hotplug state machine
    smp/cfd: Convert core to hotplug state machine
    x86/x2apic: Convert to CPU hotplug state machine
    profile: Convert to hotplug state machine
    timers/core: Convert to hotplug state machine
    hrtimer: Convert to hotplug state machine
    x86/tboot: Convert to hotplug state machine
    arm64/armv8 deprecated: Convert to hotplug state machine
    hwtracing/coresight-etm4x: Convert to hotplug state machine
    ...

    Linus Torvalds
     

24 Jul, 2016

1 commit


23 Jul, 2016

4 commits

  • KVM/ARM changes for Linux 4.8

    - GICv3 ITS emulation
    - Simpler idmap management that fixes potential TLB conflicts
    - Honor the kernel protection in HYP mode
    - Removal of the old vgic implementation

    Radim Krčmář
     
  • Up to now, only irqchip routing entries could be set. This patch
    adds the capability to insert MSI routing entries.

    For ARM64, let's also increase KVM_MAX_IRQ_ROUTES to 4096: this
    includes SPI irqchip routes plus MSI routes. In the future this
    might be extended.

    Signed-off-by: Eric Auger
    Reviewed-by: Andre Przywara
    Signed-off-by: Marc Zyngier

    Eric Auger
     
  • This patch adds compilation and link against irqchip.

    Main motivation behind using irqchip code is to enable MSI
    routing code. In the future irqchip routing may also be useful
    when targeting multiple irqchips.

    Standard routing callbacks are now implemented in vgic-irqfd:
    - kvm_set_routing_entry
    - kvm_set_irq
    - kvm_set_msi

    They are only supported with new_vgic code.

    Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
    KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.

    So from now on IRQCHIP routing is enabled and a routing table entry
    must exist for irqfd injection to succeed for a given SPI. This patch
    builds a default flat irqchip routing table (gsi=irqchip.pin) covering
    all the VGIC SPI indexes. This routing table is overwritten by the
    first user-space call to KVM_SET_GSI_ROUTING ioctl.

    MSI routing setup is not yet allowed.

    Signed-off-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Eric Auger
     
  • On ARM, a devid field is populated in kvm_msi struct in case the
    flag is set to KVM_MSI_VALID_DEVID. Let's propagate both flags and
    devid field in kvm_kernel_irq_routing_entry.

    Signed-off-by: Eric Auger
    Reviewed-by: Andre Przywara
    Acked-by: Christoffer Dall
    Acked-by: Radim Krčmář
    Signed-off-by: Marc Zyngier

    Eric Auger
     

19 Jul, 2016

21 commits

  • If we care to move all the checks that do not involve any memory
    allocation, we can simplify the MAPI error handling. Let's do that,
    it cannot hurt.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • vgic_its_cmd_handle_mapi has an extra "subcmd" argument, which is
    already contained in the command buffer that all command handlers
    obtain from the command queue. Let's drop it, as it is not that
    useful.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • There is no need to have separate functions to validate devices
    and collections, as the architecture doesn't really distinguish the
    two, and they are supposed to be managed the same way.

    Let's turn the DevID checker into a generic one.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • Going from the ITS structure to the corresponding KVM structure
    would be quite handy at times. The kvm_device pointer that is
    passed at create time is quite convenient for this, so let's
    keep a copy of it in the vgic_its structure.

    This will be put to a good use in subsequent patches.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • Instead of spreading random allocations all over the place,
    consolidate allocation/init/freeing of collections in a pair
    of constructor/destructor.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • When checking that the storage address of a device entry is valid,
    it is critical to compute the actual address of the entry, rather
    than relying on the beginning of the page to match a CPU page of
    the same size: for example, if the guest places the table at the
    last 64kB boundary of RAM, but RAM size isn't a multiple of 64kB...

    Fix this by computing the actual offset of the device ID in the
    L2 page, and check the corresponding GFN.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
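
    A rough sketch of the address computation described above, under stated
    assumptions: 64kB L2 pages, an entry size of esz bytes, and 4kB host
    pages. The function name l2_entry_gfn() and the HOST_PAGE_SHIFT constant
    are hypothetical. It returns the guest frame number of the entry itself
    rather than of the start of the 64kB page.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K          0x10000ULL
#define HOST_PAGE_SHIFT 12   /* assumption: 4kB host pages */

/*
 * Guest frame number holding entry 'id' inside the 64kB L2 page that
 * starts at guest physical address l2_base. Only the entry's index within
 * this L2 page matters, hence the modulo.
 */
static uint64_t l2_entry_gfn(uint64_t l2_base, uint32_t id, uint32_t esz)
{
    assert(esz > 0 && esz <= SZ_64K);

    uint64_t entries_per_page = SZ_64K / esz;
    uint64_t offset = (id % entries_per_page) * esz;

    return (l2_base + offset) >> HOST_PAGE_SHIFT;
}
```

    With 8-byte entries, the last entry of an L2 page at 0x40000000 lives in
    frame 0x4000f, not 0x40000, so checking only the first frame would miss
    the case where RAM ends in between.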
     
  • Checking that the device_id fits in the table is not enough: we must
    also make sure that the associated memory is accessible.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • The nr_entries variable in vgic_its_check_device_id actually
    describes the size of the L1 table, and not the number of
    entries in this table.

    Rename it to l1_tbl_size, so that we can now change the code
    with a better understanding of what is what.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • The ITS tables are stored in LE format. If the host is reading
    a L1 table entry to check its validity, it must convert it to
    the CPU endianness.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
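
    To illustrate the conversion, here is a portable sketch that decodes a
    little-endian 64-bit table entry byte by byte, so it gives the same
    result on little-endian and big-endian hosts. The names le64_decode()
    and l1_entry_valid() are hypothetical, and placing the valid flag in
    bit 63 of an L1 entry is an assumption of this sketch. In the kernel
    this job is done by le64_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: bit 63 of an L1 entry is the valid flag. */
#define L1_ENTRY_VALID (1ULL << 63)

/* Decode 8 bytes stored in little-endian order, whatever the host is. */
static uint64_t le64_decode(const uint8_t *buf)
{
    assert(buf != NULL);

    uint64_t v = 0;

    /* buf[7] is the most significant byte, buf[0] the least. */
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | buf[i];
    return v;
}

/* Check the valid flag of a raw L1 entry as read from guest memory. */
static int l1_entry_valid(const uint8_t *raw)
{
    return (le64_decode(raw) & L1_ENTRY_VALID) != 0;
}
```

    Reading the raw bytes as a native 64-bit integer on a big-endian host
    would test the wrong bit, which is the failure this commit addresses.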
     
  • The current code will fail on valid indirect tables, and happily
    use the ones that are pointing out of the guest RAM. Funny what a
    small "!" can do for you...

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • Instead of sprinkling raw kref_get() calls every time we cannot
    do a normal vgic_get_irq(), use the existing vgic_get_irq_kref(),
    which does the same thing and is paired with a vgic_put_irq().

    vgic_get_irq_kref is moved to vgic.h in order to be easily shared.

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     
  • For VGICv2 save and restore, the CPU interface registers
    are accessed. Restore the modality which has been altered.
    Also explicitly set the iodev_type for both the DIST and CPU
    interface.

    Signed-off-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Eric Auger
     
  • Now that all ITS emulation functionality is in place, we advertise
    MSI functionality to userland and also the ITS device to the guest - if
    userland has configured that.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • When userland wants to inject an MSI into the guest, it uses the
    KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with
    the payload and the device ID.
    With the help of the KVM IO bus framework we learn the corresponding
    ITS from the doorbell address. We then use our wrapper functions to
    iterate the linked lists and find the proper Interrupt Translation Table
    Entry (ITTE) and thus the corresponding struct vgic_irq to finally set
    the pending bit.
    We also provide the handler for the ITS "INT" command, which allows a
    guest to trigger an MSI via the ITS command queue. Since this one knows
    about the right ITS already, we directly call the MMIO handler function
    without using the kvm_io_bus framework.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • The connection between a device, an event ID, the LPI number and the
    associated CPU is stored in in-memory tables in a GICv3, but their
    format is not specified by the spec. Instead software uses a command
    queue in a ring buffer to let an ITS implementation use its own
    format.
    Implement handlers for the various ITS commands and let them store
    the requested relation into our own data structures. Those data
    structures are protected by the its_lock mutex.
    Our internal ring buffer read and write pointers are protected by the
    its_cmd mutex, so that only one VCPU per ITS can handle commands at
    any given time.
    Error handling is very basic at the moment, as we don't have a good
    way of communicating errors to the guest (usually an SError).
    The INT command handler is missing from this patch, as we gain the
    capability of actually injecting MSIs into the guest only later on.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
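
    The command queue handling can be sketched as a simple ring buffer walk.
    This is an illustration only: struct cmd_queue, process_commands() and
    record_opcode() are hypothetical names, and the locking described above
    is left out. It relies on ITS commands being 32 bytes long, with the
    command opcode in the lowest byte of the first little-endian doubleword.

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_SIZE 32   /* each ITS command occupies 32 bytes */

struct cmd_queue {
    const uint8_t *base;  /* start of the command buffer */
    uint32_t size;        /* buffer size in bytes */
    uint32_t creadr;      /* read offset, advanced by the emulation */
    uint32_t cwriter;     /* write offset, advanced by the guest */
};

typedef void (*cmd_handler_t)(const uint8_t *cmd, void *ctx);

/* Example handler: remember the opcode byte of the last command seen. */
static void record_opcode(const uint8_t *cmd, void *ctx)
{
    uint8_t *last = ctx;

    *last = cmd[0];
}

/*
 * Handle every command between the read and write offsets, wrapping at the
 * end of the buffer. Returns the number of commands processed.
 */
static unsigned int process_commands(struct cmd_queue *q,
                                     cmd_handler_t handle, void *ctx)
{
    unsigned int n = 0;

    assert(q->size > 0 && q->size % ITS_CMD_SIZE == 0);
    assert(q->creadr < q->size && q->cwriter < q->size);
    assert(q->creadr % ITS_CMD_SIZE == 0 && q->cwriter % ITS_CMD_SIZE == 0);

    while (q->creadr != q->cwriter) {
        handle(q->base + q->creadr, ctx);
        q->creadr += ITS_CMD_SIZE;
        if (q->creadr == q->size)
            q->creadr = 0;
        n++;
    }
    return n;
}
```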
     
  • The (system-wide) LPI configuration table is held in a table in
    (guest) memory. To achieve reasonable performance, we cache this data
    in our struct vgic_irq. If the guest updates the configuration data
    (which consists of the enable bit and the priority value), it issues
    an INV or INVALL command to allow us to update our information.
    Provide functions that update that information for one LPI or all LPIs
    mapped to a specific collection.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • The LPI pending status for a GICv3 redistributor is held in a table
    in (guest) memory. To achieve reasonable performance, we cache the
    pending bit in our struct vgic_irq. The initial pending state must be
    read from guest memory upon enabling LPIs for this redistributor.
    As we can't access the guest memory while we hold the lpi_list spinlock,
    we create a snapshot of the LPI list and iterate over that.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • LPIs are dynamically created (mapped) at guest runtime and their
    actual number can be quite high, but is mostly assigned using a very
    sparse allocation scheme. So arrays are not an ideal data structure
    to hold the information.
    We use a spin-lock protected linked list to hold all mapped LPIs,
    represented by their struct vgic_irq. This lock is grouped between the
    ap_list_lock and the vgic_irq lock in our locking order.
    Also we store a pointer to that struct vgic_irq in our struct its_itte,
    so we can easily access it.
    Eventually we call our new vgic_get_lpi() from vgic_get_irq(), so
    the VGIC code gets transparent access to LPIs.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • Add emulation for some basic MMIO registers used in the ITS emulation.
    This includes:
    - GITS_{CTLR,TYPER,IIDR}
    - ID registers
    - GITS_{CBASER,CREADR,CWRITER}
    (which implement the ITS command buffer handling)
    - GITS_BASER

    Most of the handlers are pretty straightforward, only the CWRITER
    handler is a bit more involved by taking the new its_cmd mutex and
    then iterating over the command buffer.
    The registers holding base addresses and attributes are sanitised before
    storing them.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • Introduce a new KVM device that represents an ARM Interrupt Translation
    Service (ITS) controller. Since there can be multiple of these per guest,
    we can't piggyback on the existing GICv3 distributor device, but create
    a new type of KVM device.
    On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data
    structure and store the pointer in the kvm_device data.
    Upon an explicit init ioctl from userland (after having set up the MMIO
    address) we register the handlers with the kvm_io_bus framework.
    Any reference to an ITS thus has to go via this interface.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara
     
  • The ARM GICv3 ITS emulation code goes into a separate file, but needs
    to be connected to the GICv3 emulation, of which it is an option.
    The ITS MMIO handlers require the respective ITS pointer to be passed in,
    so we amend the existing VGIC MMIO framework to let it cope with that.
    Also we introduce the basic ITS data structure and initialize it, but
    don't return any success yet, as we are not yet ready for the show.

    Signed-off-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Tested-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Andre Przywara