15 Mar, 2018

4 commits

  • For testing the exitless interrupt support it turned out to be useful
    to have separate counters for injection and delivery of I/O interrupts.
    While at it, do the same for all interrupt types. For timer-related
    interrupts (clock comparator and CPU timer) we did not even have
    delivery counters. Fix this as well. Along the way, some counters
    are renamed so that their names are consistent.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck

    Christian Borntraeger
     
  • This counter can be used for administration, debug or test purposes.

    Suggested-by: Vladislav Mironov
    Signed-off-by: QingFeng Hao
    Reviewed-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    QingFeng Hao
     
  • A case statement in kvm_s390_shadow_tables uses fallthrough annotations
    which are not recognized by gcc because they are hidden within a block.
    Move these annotations out of the block to fix (W=1) warnings like below:

    arch/s390/kvm/gaccess.c: In function 'kvm_s390_shadow_tables':
    arch/s390/kvm/gaccess.c:1029:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
    case ASCE_TYPE_REGION1: {
    ^

    Signed-off-by: Sebastian Ott
    Reviewed-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Sebastian Ott
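
    The fallthrough fix described in this commit can be illustrated with a
    minimal sketch. It is not the code from gaccess.c: the enum and the
    function levels_walked() are hypothetical names made up for this
    example. It only shows the annotation placed after the closing brace of
    a case block, directly before the next case label, which is where gcc
    looks for it.

```c
/* Hypothetical table levels, loosely modelled on the ASCE types in the warning. */
enum level { LEVEL_REGION1, LEVEL_REGION2, LEVEL_SEGMENT };

/*
 * Count how many levels are walked when starting at `start`.
 * Each case opens a block for its locals. The fallthrough annotation sits
 * outside that block, immediately before the next case label.
 */
int levels_walked(enum level start)
{
	int walked = 0;

	switch (start) {
	case LEVEL_REGION1: {
		int step = 1;

		walked += step;
	}
		/* fallthrough */
	case LEVEL_REGION2: {
		int step = 1;

		walked += step;
	}
		/* fallthrough */
	case LEVEL_SEGMENT:
		walked += 1;
		break;
	}
	return walked;
}
```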
     
  • We want to count IO exit requests in kvm_stat. At the same time
    we can get rid of the handle_noop function.

    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     

09 Mar, 2018

2 commits

  • use_cmma in kvm_arch means that the KVM hypervisor is allowed to use
    cmma, whereas use_cmma in the mm context means cmm has been used before.
    Let's rename the context one to uses_cmm, as the vm does use
    collaborative memory management but the host uses the cmm assist
    (interpretation facility).

    Also let's introduce use_pfmfi, so we can remove the pfmfi disablement
    when we activate cmma and rather not activate it in the first place.

    Signed-off-by: Janosch Frank
    Message-Id:
    Reviewed-by: David Hildenbrand
    Reviewed-by: Christian Borntraeger
    Signed-off-by: Christian Borntraeger

    Janosch Frank
     
  • Some facilities should only be provided to the guest, if they are
    enabled by a CPU model. This allows us to avoid capabilities and
    to simply fall back to the cpumodel for deciding about a facility
    without enabling it for older QEMUs or QEMUs without a CPU
    model.

    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     

06 Mar, 2018

1 commit

  • Even if we don't have extended SCA support, we can have more than 64 CPUs
    if we don't enable any HW features that might use the SCA entries.

    Now, this works just fine, but we missed a return, which is why we
    would actually store the SCA entries. If we have more than 64 CPUs, this
    means writing outside of the basic SCA - bad.

    Let's fix this. This allows > 64 CPUs when running nested (under vSIE)
    without random crashes.

    Fixes: a6940674c384 ("KVM: s390: allow 255 VCPUs when sca entries aren't used")
    Reported-by: Christian Borntraeger
    Tested-by: Christian Borntraeger
    Signed-off-by: David Hildenbrand
    Message-Id:
    Cc: stable@vger.kernel.org
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
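
    The class of bug fixed here, a missing early return before an array
    store, can be sketched as follows. The struct and the function
    sca_add_vcpu() are hypothetical simplifications and not the kernel's
    data structures; only the limit of 64 entries is taken from the
    commit text. The bounds check is an extra safeguard added for this
    sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BASIC_SCA_SLOTS 64 /* the basic SCA holds 64 entries */

/* Hypothetical, simplified stand-in for the basic SCA. */
struct basic_sca {
	uint64_t entry[BASIC_SCA_SLOTS];
};

/* Returns true if an entry was stored, false if the store was skipped. */
bool sca_add_vcpu(struct basic_sca *sca, unsigned int cpu_id, bool sca_in_use)
{
	if (!sca_in_use)
		return false; /* the kind of early return that was missing */
	if (cpu_id >= BASIC_SCA_SLOTS)
		return false; /* defensive check, an addition of this sketch */
	sca->entry[cpu_id] = 1;
	return true;
}
```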
     

01 Mar, 2018

1 commit


21 Feb, 2018

4 commits

  • Right now, SET CLOCK called in the guest does not properly take care of
    the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
    interface. So the epoch index is neither reset to 0, if required, nor
    properly set to e.g. 0xff on negative values.

    Fix this by providing a single kvm_s390_set_tod_clock() function. Move
    Multiple-epoch facility handling into it.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Christian Borntraeger
    Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
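
    As a rough illustration of the epoch index handling described above,
    the following sketch computes the guest epoch as the difference between
    the requested guest TOD and the host TOD, treating index and epoch as
    one wide value. All type and function names are assumptions made for
    this example. It is a simplified model, not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extended TOD value: an 8-bit epoch index plus 64-bit TOD. */
struct tod_ext {
	uint8_t epoch_idx;
	uint64_t tod;
};

/* Hypothetical guest epoch: offset the guest sees relative to the host TOD. */
struct guest_epoch {
	uint8_t idx;
	uint64_t epoch;
};

struct guest_epoch set_tod_clock(struct tod_ext guest, struct tod_ext host,
				 bool multiple_epoch)
{
	struct guest_epoch e;

	e.epoch = guest.tod - host.tod; /* wraps modulo 2^64 */
	e.idx = 0;                      /* reset when the facility is absent */
	if (multiple_epoch) {
		e.idx = (uint8_t)(guest.epoch_idx - host.epoch_idx);
		if (e.epoch > guest.tod)  /* the subtraction borrowed */
			e.idx -= 1;
	}
	return e;
}
```

    With a guest TOD of 5 and a host TOD of 10 (both with index 0), the
    epoch is negative, so the sketch yields an index of 0xff.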
     
  • For now, we don't take care of over/underflows. Especially underflows
    are critical:

    Assume the epoch is currently 0 and we get a sync request for delta=1,
    meaning the TOD is moved forward by 1 and we have to fix it up by
    subtracting 1 from the epoch. Right now, this will leave the epoch
    index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.

    We have to take care of over and underflows, also for the VSIE case. So
    let's factor out calculation into a separate function.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Christian Borntraeger
    Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Borntraeger
    [use u8 for idx]

    David Hildenbrand
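
    The underflow example from this commit message (epoch 0, delta 1) can
    be traced with a small sketch. The struct epoch72 and the function
    clock_sync() are hypothetical names. The arithmetic treats the index
    and the epoch as one wide signed value and propagates borrow and
    carry by hand.

```c
#include <stdint.h>

/* Hypothetical 72-bit epoch: 8-bit index (high part) plus 64-bit epoch. */
struct epoch72 {
	uint8_t idx;
	uint64_t epoch;
};

/* The TOD moved forward by `delta`; compensate by adding -delta. */
void clock_sync(struct epoch72 *e, uint64_t delta)
{
	uint8_t delta_idx = 0;

	delta = 0 - delta;          /* two's complement negation */
	if ((int64_t)delta < 0)
		delta_idx = 0xff;   /* sign extension into the index */

	e->epoch += delta;
	e->idx += delta_idx;
	if (e->epoch < delta)       /* the 64-bit addition carried */
		e->idx += 1;
}
```

    Starting from epoch=0, idx=0 with delta=1, the sketch ends with all
    epoch bits set and idx=0xff, that is, -1 in the wide representation.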
     
  • We must copy both the epoch and the epoch_idx.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
    Reviewed-by: Cornelia Huck
    Reviewed-by: Christian Borntraeger
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Missed when enabling the Multiple-epoch facility. If the facility is
    installed and the control is set, a sign-based comparison has to be
    performed.

    Right now we would inject wrong interrupts and ignore interrupt
    conditions. Also the sleep time is calculated in a wrong way.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Fixes: 8fa1696ea781 ("KVM: s390: Multiple Epoch Facility support")
    Cc: stable@vger.kernel.org
    Reviewed-by: Christian Borntraeger
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
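
    The difference between the two comparison modes can be shown with a
    short sketch. The function name ckc_elapsed() and its parameters are
    assumptions for illustration. The sketch only contrasts an unsigned
    with a signed 64-bit comparison of clock comparator and TOD.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the clock comparator value lies before the current
 * TOD, i.e. an interrupt condition exists. `signed_compare` models the
 * case where the facility is installed and the control is set.
 */
bool ckc_elapsed(uint64_t ckc, uint64_t now, bool signed_compare)
{
	if (signed_compare)
		return (int64_t)ckc < (int64_t)now;
	return ckc < now;
}
```

    A comparator value with the top bit set is "far in the future" when
    compared unsigned, but "in the past" when compared signed, which is
    why the wrong mode leads to wrong or missing interrupts.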
     

14 Feb, 2018

6 commits

  • Just like for the interception handlers, let's also use a switch-case
    in our interrupt delivery code.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Cornelia Huck
    Reviewed-by: Janosch Frank
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Instead of having huge jump tables for function selection,
    let's use normal switch/case statements for the instruction
    handlers in intercept.c. We can now also get rid of
    intercept_handler_t.

    This allows the compiler to make the right decision depending
    on the situation (e.g. avoid jump-tables for thunks).

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Janosch Frank
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
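
    The two dispatch styles can be compared in a small sketch. The opcodes
    and handler functions below are made up for this example and have
    nothing to do with the real s390 handlers. Both functions behave the
    same, but the switch leaves the choice of code generation to the
    compiler.

```c
typedef int (*handler_t)(int);

static int handle_a(int x) { return x + 1; }
static int handle_b(int x) { return x * 2; }
static int handle_unsupported(int x) { (void)x; return -1; }

/* Table-based dispatch: one indirect call through a 256-entry table. */
int dispatch_table(unsigned char opcode, int arg)
{
	static const handler_t table[256] = {
		[0x01] = handle_a,
		[0x02] = handle_b,
	};
	handler_t h = table[opcode];

	return h ? h(arg) : handle_unsupported(arg);
}

/* Switch-based dispatch: direct calls, the compiler picks the layout. */
int dispatch_switch(unsigned char opcode, int arg)
{
	switch (opcode) {
	case 0x01:
		return handle_a(arg);
	case 0x02:
		return handle_b(arg);
	default:
		return handle_unsupported(arg);
	}
}
```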
     
  • Instead of having huge jump tables for function selection,
    let's use normal switch/case statements for the instruction
    handlers in priv.c.

    This allows the compiler to make the right decision depending
    on the situation (e.g. avoid jump-tables for thunks).

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Reviewed-by: Janosch Frank
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     
  • If the guest runs with bp isolation when doing a SIE instruction,
    we must also run the nested guest with bp isolation when emulating
    that SIE instruction.
    This is done by activating BPBC in the lpar, which acts as an override
    for lower level guests.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Janosch Frank
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     
  • If GISA is available, we do not have to kick CPUs out of SIE to deliver
    interrupts. The hardware can deliver such interrupts while running.

    Cc: Michael Mueller
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     
  • For interrupt injection of floating interrupts we queue the interrupt
    either in the GISA or in the floating interrupt list. The first CPU
    that looks at these data structures, either in KVM code or in hardware,
    will then deliver that interrupt. To minimize latency we also:
    -a: choose a VCPU to deliver that interrupt. We prefer idle CPUs
    -b: we wake up the host thread that runs the VCPU
    -c: set an I/O intervention bit for that CPU so that it exits guest
    context as soon as the PSW I/O mask is enabled
    This will make sure that this CPU will execute the interrupt delivery
    code of KVM very soon.

    We can now optimize the injection case if we have exitless interrupts.
    The wakeup is still necessary in case the target CPU sleeps. We can
    avoid the I/O intervention request bit though. Whenever this
    intervention request would be handled, the hardware could also directly
    inject the interrupt on that CPU, no need to go through the interrupt
    injection loop of KVM.

    Cc: Michael Mueller
    Reviewed-by: Halil Pasic
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
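
    The injection steps listed above, and the optimization for exitless
    interrupts, can be sketched as follows. The struct vcpu_sketch and both
    functions are hypothetical and model only the decision logic, not the
    actual KVM data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VCPU state for this sketch. */
struct vcpu_sketch {
	bool idle;
	bool woken;
	bool io_intervention_requested;
};

/* Step a: prefer an idle VCPU, otherwise fall back to the first one. */
struct vcpu_sketch *pick_target(struct vcpu_sketch *vcpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (vcpus[i].idle)
			return &vcpus[i];
	return n ? &vcpus[0] : NULL;
}

/*
 * Steps b and c: the wakeup is always done because the target may sleep.
 * The I/O intervention request is skipped when interrupts can be
 * delivered without an exit.
 */
void kick_for_floating_io(struct vcpu_sketch *v, bool exitless)
{
	v->woken = true;
	if (!exitless)
		v->io_intervention_requested = true;
}
```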
     

01 Feb, 2018

1 commit


31 Jan, 2018

1 commit


26 Jan, 2018

9 commits

  • The patch modifies the previously defined GISA data structure to be
    able to store two GISA formats, format-0 and format-1. Additionally,
    it verifies the availability of the GISA format facility and enables
    the use of a format-1 GISA in the SIE control block accordingly.

    A format-1 GISA can do everything that a format-0 GISA can, and we
    will need it for real HW passthrough. As there are systems with only
    format-0, we keep both variants.

    Signed-off-by: Michael Mueller
    Reviewed-by: Pierre Morel
    Reviewed-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • If the AIV facility is available, a GISA will be used to manage emulated
    adapter interrupts.

    Signed-off-by: Michael Mueller
    Reviewed-by: Halil Pasic
    Reviewed-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • The function returns a pending I/O interrupt with the highest
    priority defined by its ISC.

    Together with AIV activation, pending adapter interrupts are
    managed by the GISA IPM. Thus kvm_s390_get_io_int() needs to
    inspect the IPM as well when the interrupt with the highest
    priority has to be identified.

    In case classic and adapter interrupts with the same ISC are
    pending, the classic interrupt will be returned first.

    Signed-off-by: Michael Mueller
    Reviewed-by: Halil Pasic
    Reviewed-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
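
    The selection rule described in this commit can be sketched with two
    8-bit masks, one for classic interrupts and one for the GISA IPM. The
    names below are hypothetical. The sketch further assumes that ISC 0 is
    the most significant bit of each mask and that a lower ISC number means
    a higher priority.

```c
#include <stdint.h>

enum io_source { IO_NONE, IO_CLASSIC, IO_ADAPTER_GISA };

struct io_pick {
	enum io_source source;
	int isc;
};

/* Assumption of this sketch: ISC 0 maps to the most significant bit. */
#define ISC_MASK(isc) ((uint8_t)(0x80u >> (isc)))

/*
 * Walk the ISCs from highest to lowest priority. Within one ISC the
 * classic interrupt is returned before the adapter interrupt.
 */
struct io_pick pick_io_int(uint8_t classic_pending, uint8_t gisa_ipm)
{
	for (int isc = 0; isc < 8; isc++) {
		if (classic_pending & ISC_MASK(isc))
			return (struct io_pick){ IO_CLASSIC, isc };
		if (gisa_ipm & ISC_MASK(isc))
			return (struct io_pick){ IO_ADAPTER_GISA, isc };
	}
	return (struct io_pick){ IO_NONE, -1 };
}
```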
     
  • Pending interrupts marked in the GISA IPM are required to
    become part of the answer of ioctl KVM_DEV_FLIC_GET_ALL_IRQS.

    The ioctl KVM_DEV_FLIC_ENQUEUE is already capable of enqueuing
    adapter interrupts when a GISA is present.

    With ioctl KVM_DEV_FLIC_CLEAR_IRQS the GISA IPM will be cleared
    now as well.

    Signed-off-by: Michael Mueller
    Reviewed-by: Halil Pasic
    Reviewed-by: Pierre Morel
    Reviewed-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • The function isc_to_int_word() allows the generation of interruption
    words for adapter interrupts.

    Signed-off-by: Michael Mueller
    Reviewed-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • The adapter interruption virtualization (AIV) facility is an
    optional facility that comes with functionality expected to increase
    the performance of adapter interrupt handling for both emulated and
    passed-through adapter interrupts. With AIV, adapter interrupts can be
    delivered to the guest without exiting SIE.

    This patch provides some preparations for using AIV for emulated adapter
    interrupts (including virtio) if it's available. When using AIV, the
    interrupts are delivered at the so called GISA by setting the bit
    corresponding to its Interruption Subclass (ISC) in the Interruption
    Pending Mask (IPM) instead of inserting a node into the floating interrupt
    list.

    To keep the change reasonably small, the handling of this new state is
    deferred in get_all_floating_irqs and handle_tpi. This patch concentrates
    on the code handling enqueuement of emulated adapter interrupts, and their
    delivery to the guest.

    Note that care is still required for adapter interrupts using AIV,
    because there is no guarantee that AIV is going to deliver the adapter
    interrupts pending at the GISA (consider all vcpus idle). When delivering
    GISA adapter interrupts by the host (usual mechanism) special attention
    is required to honor interrupt priorities.

    Empirical results show that the time window between making an interrupt
    pending at the GISA and doing kvm_s390_deliver_pending_interrupts is
    sufficient for a guest with at least moderate cpu activity to get adapter
    interrupts delivered within the SIE, and potentially save some SIE exits
    (if there are no other deliverable interrupts).

    The code will be activated with a follow-up patch.

    Signed-off-by: Michael Mueller
    Acked-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • The patch implements routines to access the GISA to test and modify
    its Interruption Pending Mask (IPM) from the host side.

    Signed-off-by: Michael Mueller
    Reviewed-by: Pierre Morel
    Reviewed-by: Halil Pasic
    Reviewed-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
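
    Host-side helpers of this kind could look roughly like the sketch
    below. It uses C11 atomics in place of the kernel's bit operations.
    The struct gisa_sketch, the helper names and the bit layout (ISC 0 as
    the most significant bit) are assumptions made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the Interruption Pending Mask. */
struct gisa_sketch {
	_Atomic uint8_t ipm;
};

/* Assumption of this sketch: ISC 0 maps to the most significant bit. */
static inline uint8_t ipm_mask(unsigned int isc)
{
	return (uint8_t)(0x80u >> isc);
}

/* Mark an adapter interrupt for `isc` as pending. */
void ipm_set(struct gisa_sketch *g, unsigned int isc)
{
	atomic_fetch_or(&g->ipm, ipm_mask(isc));
}

/* Check whether an interrupt for `isc` is pending. */
bool ipm_test(struct gisa_sketch *g, unsigned int isc)
{
	return (atomic_load(&g->ipm) & ipm_mask(isc)) != 0;
}

/* Atomically clear the bit and report whether it was set before. */
bool ipm_test_and_clear(struct gisa_sketch *g, unsigned int isc)
{
	uint8_t mask = ipm_mask(isc);
	uint8_t old = atomic_fetch_and(&g->ipm, (uint8_t)~mask);

	return (old & mask) != 0;
}
```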
     
  • In preparation for supporting pass-through adapter interrupts, the Guest
    Interruption State Area (GISA) and the Adapter Interruption Virtualization
    (AIV) features will be introduced here.

    This patch introduces the format-0 GISA (that is, it defines the struct
    describing the GISA, allocates storage for it, and introduces fields for
    the GISA address in kvm_s390_sie_block and kvm_s390_vsie).

    As the GISA requires storage below 2GB, it is put in sie_page2, which is
    already allocated in ZONE_DMA. In addition, the GISA requires alignment to
    its integral boundary. It is already naturally aligned via the
    padding in the sie_page2.

    Signed-off-by: Michael Mueller
    Reviewed-by: Pierre Morel
    Reviewed-by: Halil Pasic
    Reviewed-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • This patch prepares a simplification of bit operations between the irq
    pending mask for emulated interrupts and the Interruption Pending Mask
    (IPM) which is part of the Guest Interruption State Area (GISA), a feature
    that allows interrupt delivery to guests by means of the SIE instruction.

    Without that change, a bit-wise *or* operation on parts of these two masks
    would either require a look-up table of size 256 bytes to map the IPM
    to the emulated irq pending mask bit orientation (all bits mirrored at half
    byte) or a sequence of up to 8 conditional branches to perform tests of
    single bit positions. Both options are rejected for performance or space
    utilization reasons.

    Beyond that this change will be transparent.

    Signed-off-by: Michael Mueller
    Reviewed-by: Halil Pasic
    Reviewed-by: Pierre Morel
    Reviewed-by: Christian Borntraeger
    Reviewed-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     

25 Jan, 2018

4 commits


24 Jan, 2018

6 commits

  • The overall instruction counter is larger than the sum of the
    single counters. We should try to catch all instruction handlers
    to make this match the summary counter.
    Let us add sck,tb,sske,iske,rrbe,tb,tpi,tsch,lpsw,pswe....
    and remove other unused ones.

    Signed-off-by: Christian Borntraeger
    Acked-by: Janosch Frank
    Reviewed-by: David Hildenbrand

    Christian Borntraeger
     
  • KVM: s390: another fix for cmma migration

    This fixes races and potential use after free in the
    cmma migration code.

    Radim Krčmář
     
  • Make the diagnose counters also appear as instruction counters.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Janosch Frank
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck

    Christian Borntraeger
     
  • Some parts of the cmma migration bitmap are already protected
    with the kvm->lock (e.g. the migration start). On the other
    hand the read of the cmma bits is not protected against a
    concurrent free, neither is the emulation of the ESSA instruction.
    Let's extend the locking to all related ioctls by using
    the slots lock for
    - kvm_s390_vm_start_migration
    - kvm_s390_vm_stop_migration
    - kvm_s390_set_cmma_bits
    - kvm_s390_get_cmma_bits

    In addition to that, we use synchronize_srcu before freeing
    the migration structure as all users hold kvm->srcu for read.
    (e.g. the ESSA handler).

    Reported-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Cc: stable@vger.kernel.org # 4.13+
    Fixes: 190df4a212a7 (KVM: s390: CMMA tracking, ESSA emulation, migration mode)
    Reviewed-by: Claudio Imbrenda
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck

    Christian Borntraeger
     
  • This way, the values cannot change, even if another VCPU might try to
    mess with the nested SCB currently getting executed by another VCPU.

    We now always use the same gpa for pinning and unpinning a page (for
    unpinning, it is only relevant to mark the guest page dirty for
    migration).

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Another VCPU might try to modify the SCB while we are creating the
    shadow SCB. In general this is no problem - unless the compiler decides
    to not load values once, but e.g. twice.

    For us, this is only relevant when checking/working with such values.
    E.g. the prefix value, the mso, state of transactional execution and
    addresses of satellite blocks.

    E.g. if we blindly forward values (e.g. general purpose registers or
    execution controls after masking), we don't care.

    Leaving unpin_blocks() untouched for now, will handle it separately.

    The worst thing right now that I can see would be a missed prefix
    un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing
    critical, but let's try to avoid unpredictable behavior.

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
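
    The double-load hazard described above is commonly avoided by reading
    the shared value exactly once into a local copy and then only working
    with that copy. The sketch below shows the pattern with a userspace
    macro modelled on the kernel's READ_ONCE(). The struct, the limit and
    the function name are hypothetical.

```c
#include <stdint.h>

/* Userspace approximation of READ_ONCE() for 64-bit values. */
#define READ_ONCE_U64(x) (*(const volatile uint64_t *)&(x))

/* Hypothetical guest-controlled block that another VCPU may modify. */
struct scb_sketch {
	uint64_t prefix;
};

/* Hypothetical upper bound used only for this example. */
#define PREFIX_LIMIT 0x7fffe000ULL

/* Returns 0 and stores the checked prefix in *out, or -1 if invalid. */
int shadow_prefix(struct scb_sketch *guest_scb, uint64_t *out)
{
	/* Load once; check and use refer to the same local copy. */
	const uint64_t prefix = READ_ONCE_U64(guest_scb->prefix);

	if (prefix > PREFIX_LIMIT)
		return -1;
	*out = prefix;
	return 0;
}
```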
     

21 Jan, 2018

1 commit

  • The new firmware interfaces for branch prediction behaviour changes
    are transparently available for the guest. Nevertheless, there is
    new state attached that should be migrated and properly reset.
    Provide a mechanism for handling reset, migration and VSIE.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    [Changed capability number to 152. - Radim]
    Signed-off-by: Radim Krčmář

    Christian Borntraeger