26 Jan, 2017

1 commit

  • commit 04478197416e3a302e9ebc917ba1aa884ef9bfab upstream.

    kvm_s390_get_machine() populates the facility bitmap by copying bytes
    from the host results that are stored in a 256 byte array in the prefix
    page. The KVM code does use the size of the target buffer (2k), thus
    copying and exposing unrelated kernel memory (mostly machine check
    related logout data).

    Let's use the size of the source buffer instead. This is ok, as the
    target buffer will always be greater than or equal to the source buffer, as
    the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover
    the maximum possible size that is allowed by STFLE, which is 256
    doublewords. All structures are zero allocated so we can leave bytes
    256-2047 unchanged.

    Add a similar fix for kvm_arch_init_vm().

    Reported-by: Heiko Carstens
    [found with smatch]
    Signed-off-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
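
The sizing rule in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the struct names, field names and the two size macros are hypothetical stand-ins, and only the 256-byte and 2k sizes come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes from the commit message: 256-byte source array, 2k target buffer. */
#define HOST_FAC_LIST_BYTES 256   /* stand-in for the host array in the prefix page */
#define KVM_FAC_LIST_BYTES  2048  /* stand-in for S390_ARCH_FAC_LIST_SIZE_BYTE */

/* Hypothetical stand-in for the host-side source and the memory that follows it. */
struct host_lowcore {
    unsigned char stfle_fac_list[HOST_FAC_LIST_BYTES];
    unsigned char unrelated[KVM_FAC_LIST_BYTES]; /* data that must not be exposed */
};

/* Hypothetical stand-in for the zero-allocated KVM-side target buffer. */
struct kvm_machine {
    unsigned char fac_list[KVM_FAC_LIST_BYTES];
};

/* Copy only as many bytes as the source holds; bytes 256-2047 stay untouched. */
static void copy_fac_list(struct kvm_machine *dst, const struct host_lowcore *src)
{
    /* The fix relies on the target being at least as large as the source. */
    assert(sizeof(dst->fac_list) >= sizeof(src->stfle_fac_list));
    memcpy(dst->fac_list, src->stfle_fac_list, sizeof(src->stfle_fac_list));
}
```

Using `sizeof(dst->fac_list)` as the length instead would read 2048 bytes starting at a 256-byte array, which is the over-read the commit describes.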
     

26 Oct, 2016

1 commit

  • Diag224 requires a page-aligned 4k buffer to store the name table
    into. kmalloc does not guarantee page alignment, hence we replace it
    with __get_free_page for the buffer allocation.

    Cc: stable@vger.kernel.org # v4.8+
    Reported-by: Michael Holzheu
    Signed-off-by: Janosch Frank
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Janosch Frank
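
The alignment requirement above can be illustrated outside the kernel. The following is a userspace analogue only, assuming a C11 libc: it uses `aligned_alloc` where the commit uses `__get_free_page`, and the helper name `alloc_name_table` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NAME_TABLE_BYTES 4096u  /* 4k buffer size from the commit message */

/*
 * Request the alignment explicitly instead of relying on whatever a
 * general-purpose allocator happens to return.
 */
static void *alloc_name_table(void)
{
    void *buf = aligned_alloc(NAME_TABLE_BYTES, NAME_TABLE_BYTES);

    /* On success, the address must be a multiple of the page size. */
    if (buf != NULL)
        assert(((uintptr_t)buf % NAME_TABLE_BYTES) == 0);
    return buf;
}
```

The caller releases the buffer with `free()`; in the kernel, a page obtained from the page allocator has to be returned with the matching page-freeing call rather than `kfree()`.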
     

21 Oct, 2016

1 commit

  • Usually a validity intercept is a programming error of the host
    because of invalid entries in the state description.
    We can get a validity intercept if the mode of the runtime
    instrumentation control block is wrong. As the host does not know
    which modes are valid, this can be used by userspace to trigger
    a WARN.
    Instead of printing a WARN let's return an error to userspace as
    this can only happen if userspace provides a malformed initial
    value (e.g. on migration). The kernel should never warn on bogus
    input. Instead let's log it into the s390 debug feature.

    While at it, let's return -EINVAL for all validity intercepts as
    this will trigger an error in QEMU like

    error: kvm run failed Invalid argument
    PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
    R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
    R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
    R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
    [...]

    This will avoid an endless loop of validity intercepts.

    Cc: stable@vger.kernel.org # v4.5+
    Fixes: c6e5f166373a ("KVM: s390: implement the RI support of guest")
    Acked-by: Fan Zhang
    Reviewed-by: Pierre Morel
    Signed-off-by: Christian Borntraeger

    Christian Borntraeger
     

07 Oct, 2016

1 commit

  • Pull KVM updates from Radim Krčmář:
    "All architectures:
    - move `make kvmconfig` stubs from x86
    - use 64 bits for debugfs stats

    ARM:
    - Important fixes for not using an in-kernel irqchip
    - handle SError exceptions and present them to guests if appropriate
    - proxying of GICV access at EL2 if guest mappings are unsafe
    - GICv3 on AArch32 on ARMv8
    - preparations for GICv3 save/restore, including ABI docs
    - cleanups and a bit of optimizations

    MIPS:
    - A couple of fixes in preparation for supporting MIPS EVA host
    kernels
    - MIPS SMP host & TLB invalidation fixes

    PPC:
    - Fix the bug which caused guests to falsely report lockups
    - other minor fixes
    - a small optimization

    s390:
    - Lazy enablement of runtime instrumentation
    - up to 255 CPUs for nested guests
    - rework of machine check delivery
    - cleanups and fixes

    x86:
    - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
    - Hyper-V TSC page
    - per-vcpu tsc_offset in debugfs
    - accelerated INS/OUTS in nVMX
    - cleanups and fixes"

    * tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
    KVM: MIPS: Drop dubious EntryHi optimisation
    KVM: MIPS: Invalidate TLB by regenerating ASIDs
    KVM: MIPS: Split kernel/user ASID regeneration
    KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
    KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
    KVM: arm64: Require in-kernel irqchip for PMU support
    KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
    KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
    KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
    KVM: PPC: BookE: Fix a sanity check
    KVM: PPC: Book3S HV: Take out virtual core piggybacking code
    KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
    ARM: gic-v3: Work around definition of gic_write_bpr1
    KVM: nVMX: Fix the NMI IDT-vectoring handling
    KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
    KVM: nVMX: Fix reload apic access page warning
    kvmconfig: add virtio-gpu to config fragment
    config: move x86 kvm_guest.config to a common location
    arm64: KVM: Remove duplicating init code for setting VMID
    ARM: KVM: Support vgic-v3
    ...

    Linus Torvalds
     

05 Oct, 2016

1 commit

  • Pull s390 updates from Martin Schwidefsky:
    "The new features and main improvements in this merge for v4.9

    - Support for the UBSAN sanitizer

    - Set HAVE_EFFICIENT_UNALIGNED_ACCESS, it improves the code in some
    places

    - Improvements for the in-kernel fpu code, in particular the overhead
    for multiple consecutive in-kernel fpu users is reduced

    - Add a SIMD implementation for the RAID6 gen and xor operations

    - Add RAID6 recovery based on the XC instruction

    - The PCI DMA flush logic has been improved to increase the speed of
    the map / unmap operations

    - The time synchronization code has seen some updates

    And bug fixes all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
    s390/con3270: fix insufficient space padding
    s390/con3270: fix use of uninitialised data
    MAINTAINERS: update DASD maintainer
    s390/cio: fix accidental interrupt enabling during resume
    s390/dasd: add missing \n to end of dev_err messages
    s390/config: Enable config options for Docker
    s390/dasd: make query host access interruptible
    s390/dasd: fix panic during offline processing
    s390/dasd: fix hanging offline processing
    s390/pci_dma: improve lazy flush for unmap
    s390/pci_dma: split dma_update_trans
    s390/pci_dma: improve map_sg
    s390/pci_dma: simplify dma address calculation
    s390/pci_dma: remove dma address range check
    iommu/s390: simplify registration of I/O address translation parameters
    s390: migrate exception table users off module.h and onto extable.h
    s390: export header for CLP ioctl
    s390/vmur: fix irq pointer dereference in int handler
    s390/dasd: add missing KOBJ_CHANGE event for unformatted devices
    s390: enable UBSAN
    ...

    Linus Torvalds
     

16 Sep, 2016

1 commit

  • Two stubs are added:

    o kvm_arch_has_vcpu_debugfs(): must return true if the arch
    supports creating debugfs entries in the vcpu debugfs dir
    (which will be implemented by the next commit)

    o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
    entries in the vcpu debugfs dir

    For x86, this commit introduces a new file to avoid growing
    arch/x86/kvm/x86.c even more.

    Signed-off-by: Luiz Capitulino
    Signed-off-by: Paolo Bonzini

    Luiz Capitulino
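
A rough sketch of how the two stubs described above might look for an architecture that has nothing to add. The signatures, the reduced `struct kvm_vcpu` and the caller `create_vcpu_debugfs` are assumptions made for illustration; the commit text only gives the function names and their purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal stand-in; the real struct kvm_vcpu is far larger. */
struct kvm_vcpu {
    int vcpu_id;
};

/* Stub: this architecture creates no entries in the vcpu debugfs dir. */
static bool kvm_arch_has_vcpu_debugfs(void)
{
    return false;
}

/* Stub: nothing to create, so report success (assumed convention: 0 = ok). */
static int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
{
    (void)vcpu;
    return 0;
}

/* Hypothetical generic caller: only ask the arch code if it opted in. */
static int create_vcpu_debugfs(struct kvm_vcpu *vcpu)
{
    assert(vcpu != NULL);
    if (!kvm_arch_has_vcpu_debugfs())
        return 0;
    return kvm_arch_create_vcpu_debugfs(vcpu);
}
```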
     

08 Sep, 2016

11 commits

  • Christian Borntraeger
     
  • * Reuse existing functionality from memdup_user() instead of keeping
    duplicate source code.

    This issue was detected by using the Coccinelle software.

    * Return directly if this copy operation failed.

    Reviewed-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Markus Elfring
    Signed-off-by: Christian Borntraeger

    Markus Elfring
     
  • * A multiplication for the size determination of a memory allocation
    indicated that an array data structure should be processed.
    Thus reuse the corresponding function "kmalloc_array".

    Suggested-by: Paolo Bonzini

    This issue was detected also by using the Coccinelle software.

    * Replace the specification of data structures by pointer dereferences
    to make the corresponding size determination a bit safer according to
    the Linux coding style convention.

    * Delete the local variable "size" which became unnecessary with
    this refactoring.

    Signed-off-by: Markus Elfring
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger

    Markus Elfring
     
  • If the SCA entries aren't used by the hardware (no SIGPIF), we
    can simply not set the entries, stick to the basic sca and allow more
    than 64 VCPUs.

    To hinder any other facility from using these entries, let's properly
    provoke intercepts by not setting the MCN and keeping the entries
    unset.

    When running KVM under KVM (vSIE) or under z/VM, this effectively allows
    providing more than 64 VCPUs to a guest. Let's limit it to 255 for now, to
    not run into problems if the CPU numbers are limited somewhere else.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Only enable runtime instrumentation if the guest issues an RI related
    instruction or if userspace changes the riccb to a valid state.
    This makes entry/exit a tiny bit faster.

    Initial patch by Christian Borntraeger
    Signed-off-by: Fan Zhang
    Signed-off-by: Christian Borntraeger

    Fan Zhang
     
  • The payload data for protection exceptions is a superset of the
    payload of other translation exceptions. Let's set the additional
    flags and use a fall through to minimize code duplication.

    Signed-off-by: Janosch Frank
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    Janosch Frank
     
  • Let's avoid working with the PER_EVENT* defines, used for control register
    manipulation, when checking the u8 PER code. Introduce separate defines
    based on the existing defines.

    Reviewed-by: Eric Farman
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Let's also write the external damage code already provided by
    struct kvm_s390_mchk_info.

    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Vector registers are only to be stored if the facility is available
    and if the guest has set up the machine check extended save area.

    If anything goes wrong while writing the vector registers, the vector
    registers are to be marked as invalid. Please note that we are allowed
    to write the registers although they are marked as invalid.

    Machine checks and "store status" SIGP orders are two different concepts,
    let's correctly separate these. As the SIGP part is completely handled in
    user space, we can drop it.

    This patch is based on a patch from Cornelia Huck.

    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Store status writes the prefix which is not to be done by a machine check.
    Also, the psw is stored and later on overwritten by the failing-storage
    address, which looks strange at first sight.

    Store status and machine check handling look similar, but they are actually
    two different things.

    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • Let's factor this out to prepare for bigger changes. Reorder the calls to
    match the logical order given in the PoP.

    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
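
The "kmalloc_array" commit earlier in this group replaces an open-coded size multiplication with an array allocator. The idea behind such an allocator can be sketched in userspace C as follows; `alloc_array` is a hypothetical helper built on `malloc`, not the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for n elements of the given size, refusing requests
 * whose total byte count would overflow size_t.
 */
static void *alloc_array(size_t n, size_t size)
{
    assert(size != 0);            /* a zero element size is a caller bug here */
    if (n > SIZE_MAX / size)
        return NULL;              /* n * size would wrap around */
    return malloc(n * size);
}
```

Passing `sizeof(*ptr)` for the element size, as the commit suggests, keeps the size tied to the pointer's type, for example `uint64_t *v = alloc_array(8, sizeof(*v));`.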
     

05 Sep, 2016

1 commit

  • We store the address of riccbd at the wrong location, overwriting
    gvrd. This means that our nested guest will not be able to use runtime
    instrumentation. It also causes a memory leak if our KVM guest actually
    sets gvrd.

    Not noticed until now, as KVM guests never make use of gvrd and runtime
    instrumentation wasn't completely tested yet.

    Reported-by: Fan Zhang
    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Cornelia Huck

    David Hildenbrand
     

29 Aug, 2016

1 commit

  • The CPACF code makes some assumptions about the availability of hardware
    support. E.g. if the machine supports KM(AES-256) without chaining it is
    assumed that KMC(AES-256) with chaining is available as well. For the
    existing CPUs this is true but the architecturally correct way is to
    check each CPACF function on its own. This is what the query function
    of each instruction is all about.

    Reviewed-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
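
The per-function check described above can be sketched as one availability mask per instruction. Everything named here is an assumption for illustration: the 128-bit mask layout, the MSB-first bit numbering, the helper names and the function code value are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical: one 128-bit availability mask per instruction (KM, KMC, ...). */
struct query_mask {
    unsigned char bytes[16];
};

/* Illustrative function code only; not taken from the commit. */
#define FUNC_AES_256 20u

/* Assumption: function code 0 maps to the most significant bit of byte 0. */
static bool test_func(const struct query_mask *mask, unsigned int func)
{
    assert(func < 128);
    return (mask->bytes[func >> 3] & (0x80u >> (func & 7))) != 0;
}

/* Helper to build masks in this sketch; real masks come from the hardware. */
static void set_func(struct query_mask *mask, unsigned int func)
{
    assert(func < 128);
    mask->bytes[func >> 3] |= (unsigned char)(0x80u >> (func & 7));
}
```

With separate masks for KM and KMC, a caller tests `test_func(&kmc, FUNC_AES_256)` directly instead of inferring it from the KM result.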
     

26 Aug, 2016

2 commits


25 Aug, 2016

1 commit

  • As the meaning of these variables and pointers seems to change more
    frequently, let's directly access our save area, instead of going via
    current->thread.

    Right now, this is broken for set/get_fpu. They simply overwrite the
    host registers, as the pointers to the current save area were turned
    into the static host save area.

    Cc: stable@vger.kernel.org # 4.7
    Fixes: 3f6813b9a5e0 ("s390/fpu: allocate 'struct fpu' with the task_struct")
    Reported-by: Hao QingFeng
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     

12 Aug, 2016

2 commits

  • When triggering KVM_RUN without a user memory region being mapped
    (KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could
    happen, if the user memory region was not mapped initially or if it
    was unmapped after the vcpu is initialized. The function
    kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
    check function always clears this bit. If gmap_mprotect_notify
    returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
    was not set anymore. So the next time kvm_s390_handle_requests is
    called, the execution would fall through the check for
    KVM_REQ_MMU_RELOAD. The bit needs to be set again if
    gmap_mprotect_notify returns an error code. Setting the bit again with
    kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.

    Reviewed-by: David Hildenbrand
    Signed-off-by: Julius Niedworok
    Signed-off-by: Christian Borntraeger

    Julius Niedworok
     
  • When KVM_RUN is triggered on a VCPU without an initial reset, a
    validity intercept occurs.
    Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
    thus preventing the bug.

    Reviewed-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Julius Niedworok
    Signed-off-by: Christian Borntraeger

    Julius Niedworok
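
The first commit of this pair re-arms a request bit after a failed mapping. A simplified, single-threaded sketch of that pattern follows; the `struct vcpu`, the bit number and the callback type are hypothetical, and the kernel uses atomic bit operations where this sketch uses plain ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define REQ_MMU_RELOAD 0u  /* hypothetical bit number */

/* Hypothetical minimal vcpu: just a word of pending request bits. */
struct vcpu {
    unsigned long requests;
};

static void make_request(struct vcpu *v, unsigned int req)
{
    v->requests |= 1UL << req;
}

/* Test-and-clear: reports whether the bit was set and always clears it. */
static bool check_request(struct vcpu *v, unsigned int req)
{
    unsigned long bit = 1UL << req;

    if (!(v->requests & bit))
        return false;
    v->requests &= ~bit;
    return true;
}

/* Hypothetical stand-ins for a mapping step that can fail. */
typedef int (*protect_fn)(void);
static int protect_ok(void)   { return 0; }
static int protect_fail(void) { return -EFAULT; }

static int handle_requests(struct vcpu *v, protect_fn protect)
{
    assert(protect != NULL);
    if (check_request(v, REQ_MMU_RELOAD)) {
        int rc = protect();

        if (rc) {
            /* The check cleared the bit; set it again so we retry later. */
            make_request(v, REQ_MMU_RELOAD);
            return rc;
        }
    }
    return 0;
}
```

Without the `make_request` call in the error path, a failed attempt would leave the bit cleared and the next call would skip the reload entirely.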
     

03 Aug, 2016

1 commit

  • Pull KVM updates from Paolo Bonzini:

    - ARM: GICv3 ITS emulation and various fixes. Removal of the
    old VGIC implementation.

    - s390: support for trapping software breakpoints, nested
    virtualization (vSIE), the STHYI opcode, initial extensions
    for CPU model support.

    - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
    of cleanups, preliminary to this and the upcoming support for
    hardware virtualization extensions.

    - x86: support for execute-only mappings in nested EPT; reduced
    vmexit latency for TSC deadline timer (by about 30%) on Intel
    hosts; support for more than 255 vCPUs.

    - PPC: bugfixes.

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
    KVM: PPC: Introduce KVM_CAP_PPC_HTM
    MIPS: Select HAVE_KVM for MIPS64_R{2,6}
    MIPS: KVM: Reset CP0_PageMask during host TLB flush
    MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
    MIPS: KVM: Sign extend MFC0/RDHWR results
    MIPS: KVM: Fix 64-bit big endian dynamic translation
    MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
    MIPS: KVM: Use 64-bit CP0_EBase when appropriate
    MIPS: KVM: Set CP0_Status.KX on MIPS64
    MIPS: KVM: Make entry code MIPS64 friendly
    MIPS: KVM: Use kmap instead of CKSEG0ADDR()
    MIPS: KVM: Use virt_to_phys() to get commpage PFN
    MIPS: Fix definition of KSEGX() for 64-bit
    KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
    kvm: x86: nVMX: maintain internal copy of current VMCS
    KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
    KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
    KVM: arm64: vgic-its: Simplify MAPI error handling
    KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
    KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
    ...

    Linus Torvalds
     

27 Jul, 2016

1 commit

  • Pull s390 updates from Martin Schwidefsky:
    "There are a couple of new things for s390 with this merge request:

    - a new scheduling domain "drawer" is added to reflect the unusual
    topology found on z13 machines. Performance tests showed up to 8
    percent gain with the additional domain.

    - the new crc-32 checksum crypto module uses the vector-galois-field
    multiply and sum SIMD instruction to speed up crc-32 and crc-32c.

    - proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
    the generic vmlinux.lds linker script definitions.

    - kcov instrumentation support. A prerequisite for that is the
    inline assembly basic block cleanup, which is the reason for the
    net/iucv/iucv.c change.

    - support for 2GB pages is added to the hugetlbfs backend.

    Then there are two removals:

    - the oprofile hardware sampling support is dead code and is removed.
    The oprofile user space uses the perf interface nowadays.

    - the ETR clock synchronization is removed, this has been superseded
    by the STP clock synchronization. And it always has been
    "interesting" code..

    And the usual bug fixes and cleanups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
    s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
    s390/smp: clean up a condition
    s390/cio/chp : Remove deprecated create_singlethread_workqueue
    s390/chsc: improve channel path descriptor determination
    s390/chsc: sanitize fmt check for chp_desc determination
    s390/cio: make fmt1 channel path descriptor optional
    s390/chsc: fix ioctl CHSC_INFO_CU command
    s390/cio/device_ops: fix kernel doc
    s390/cio: allow to reset channel measurement block
    s390/console: Make preferred console handling more consistent
    s390/mm: fix gmap tlb flush issues
    s390/mm: add support for 2GB hugepages
    s390: have unique symbol for __switch_to address
    s390/cpuinfo: show maximum thread id
    s390/ptrace: clarify bits in the per_struct
    s390: stack address vs thread_info
    s390: remove pointless load within __switch_to
    s390: enable kcov support
    s390/cpumf: use basic block for ecctr inline assembly
    s390/hypfs: use basic block for diag inline assembly
    ...

    Linus Torvalds
     

18 Jul, 2016

2 commits

  • We don't emulate ptff subfunctions, therefore react to any attempt at
    execution by setting cc=3 (Requested function not available).

    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     
  • We will use illegal instruction 0x0000 for handling 2 byte sw breakpoints
    from user space. As it can be enabled dynamically via a capability,
    let's move setting of ICTL_OPEREXC to the post creation step, so we avoid
    any races when enabling that capability just while adding new cpus.

    Acked-by: Janosch Frank
    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     

14 Jul, 2016

1 commit


05 Jul, 2016

1 commit

  • In case we have to emulate an instruction or part of it (instruction,
    partial instruction, operation exception), we have to inject a PER
    instruction-fetching event for that instruction, if hardware told us to do
    so.

    In case we retry an instruction, we must not inject the PER event.

    Please note that we don't filter the events properly yet, so guest
    debugging will be visible for the guest.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger

    David Hildenbrand
     

01 Jul, 2016

1 commit


21 Jun, 2016

9 commits