13 Jan, 2017

3 commits

  • Dmitry Vyukov reported that the syzkaller fuzzer triggered a
    deadlock in the vgic setup code when an error was detected, as
    the cleanup code tries to take a lock that is already held by
    the setup code.

    The fix is to avoid retaking the lock when cleaning up, by
    telling the cleanup function that we already hold it.

    Cc: stable@vger.kernel.org
    Reported-by: Dmitry Vyukov
    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier

    Marc Zyngier
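
    The locking pattern described in the commit above can be sketched in
    user space as follows. This is a toy model, not the vgic code: the
    struct toy_dev type, the toy_* functions and the lock_held parameter
    are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: taking it twice models the reported deadlock. */
struct toy_dev {
    int lock_depth;     /* 0 = free, 1 = held */
    bool initialized;
};

static void toy_lock(struct toy_dev *d)
{
    assert(d->lock_depth == 0);  /* a second acquisition would deadlock */
    d->lock_depth++;
}

static void toy_unlock(struct toy_dev *d)
{
    assert(d->lock_depth == 1);
    d->lock_depth--;
}

/* Cleanup takes the lock only if the caller does not already hold it. */
static void toy_cleanup(struct toy_dev *d, bool lock_held)
{
    if (!lock_held)
        toy_lock(d);
    d->initialized = false;
    if (!lock_held)
        toy_unlock(d);
}

/* Setup holds the lock across its error path, so it passes lock_held = true. */
static int toy_setup(struct toy_dev *d, bool fail)
{
    int ret = 0;

    toy_lock(d);
    d->initialized = true;
    if (fail) {
        toy_cleanup(d, true);
        ret = -1;
    }
    toy_unlock(d);
    return ret;
}
```

    Calling toy_cleanup(d, false) from the error path instead would trip
    the assertion in toy_lock(), which stands in for the deadlock.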
     
  • Current KVM world switch code is unintentionally setting wrong bits to
    CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
    timer. Bit positions of CNTHCTL_EL2 are changing depending on
    HCR_EL2.E2H bit. EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
    not set, but they are 11th and 10th bits respectively when E2H is set.

    In fact, on VHE we only need to set those bits once, not for every world
    switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
    1, which makes those bits have no effect for the host kernel execution.
    So we just set those bits once for guests, and that's it.

    Signed-off-by: Jintack Lim
    Reviewed-by: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Jintack Lim
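
    The bit layout described in the commit above can be sketched as
    follows. This is a minimal user-space illustration, not the kernel's
    world switch code: the macro names and the helper
    cnthctl_el1_phys_timer_bits() are hypothetical, and only the bit
    positions are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions quoted in the commit message above. */
#define EL1PCTEN_BIT_NO_E2H  0   /* EL1PCTEN when HCR_EL2.E2H == 0 */
#define EL1PCEN_BIT_NO_E2H   1   /* EL1PCEN  when HCR_EL2.E2H == 0 */
#define EL1PCTEN_BIT_E2H     10  /* EL1PCTEN when HCR_EL2.E2H == 1 */
#define EL1PCEN_BIT_E2H      11  /* EL1PCEN  when HCR_EL2.E2H == 1 */

/* Hypothetical helper: mask covering EL1PCTEN and EL1PCEN in CNTHCTL_EL2. */
static uint64_t cnthctl_el1_phys_timer_bits(bool e2h)
{
    if (e2h)
        return (UINT64_C(1) << EL1PCTEN_BIT_E2H) |
               (UINT64_C(1) << EL1PCEN_BIT_E2H);
    return (UINT64_C(1) << EL1PCTEN_BIT_NO_E2H) |
           (UINT64_C(1) << EL1PCEN_BIT_NO_E2H);
}
```

    Using the non-E2H mask on a system with E2H set would touch bits 0
    and 1 instead of bits 10 and 11, which is the mistake the commit
    describes.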
     
  • When a VCPU blocks (WFI) and has programmed the vtimer, we program a
    soft timer to expire in the future to wake up the vcpu thread when
    appropriate. Because such a wake up involves a vcpu kick, and the
    timer expire function can get called from interrupt context, and the
    kick may sleep, we have to schedule the kick in the work function.

    The work function currently has a warning that gets raised if it turns
    out that the timer shouldn't fire when it's run, which was added because
    the idea was that in that case the work should never have been cancelled.

    However, it turns out that this whole thing is racy and we can get
    spurious warnings. The problem is that we clear the armed flag in the
    work function, which may run in parallel with the
    kvm_timer_unschedule->timer_disarm() call. This results in a possible
    situation where the timer_disarm() call does not call
    cancel_work_sync(), which effectively synchronizes the completion of the
    work function with running the VCPU. As a result, the VCPU thread
    proceeds before the work function completes, causing changes to the
    timer state such that kvm_timer_should_fire(vcpu) returns false in the
    work function.

    All we do in the work function is to kick the VCPU, and an occasional
    rare extra kick never harmed anyone. Since the race above is extremely
    rare, we don't bother checking if the race happens but simply remove the
    check and the clearing of the armed flag from the work function.

    Reported-by: Matthias Brugger
    Reviewed-by: Marc Zyngier
    Signed-off-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Christoffer Dall
     

26 Dec, 2016

2 commits

  • Pull timer type cleanups from Thomas Gleixner:
    "This series does a tree wide cleanup of types related to
    timers/timekeeping.

    - Get rid of cycles_t and use a plain u64. The type is not really
    helpful and caused more confusion than clarity

    - Get rid of the ktime union. The union has become useless as we use
    the scalar nanoseconds storage unconditionally now. The 32bit
    timespec alike storage got removed due to the Y2038 limitations
    some time ago.

    That leaves the odd union access around for no reason. Clean it up.

    Both changes have been done with coccinelle and a small amount of
    manual mopping up"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ktime: Get rid of ktime_equal()
    ktime: Cleanup ktime_set() usage
    ktime: Get rid of the union
    clocksource: Use a plain u64 instead of cycle_t

    Linus Torvalds
     
  • Pull SMP hotplug notifier removal from Thomas Gleixner:
    "This is the final cleanup of the hotplug notifier infrastructure. The
    series has been reintegrated in the last two days because there came a
    new driver using the old infrastructure via the SCSI tree.

    Summary:

    - convert the last leftover drivers utilizing notifiers

    - fixup for a completely broken hotplug user

    - prevent setup of already used states

    - removal of the notifiers

    - treewide cleanup of hotplug state names

    - consolidation of state space

    There is a sphinx based documentation pending, but that needs review
    from the documentation folks"

    * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/armada-xp: Consolidate hotplug state space
    irqchip/gic: Consolidate hotplug state space
    coresight/etm3/4x: Consolidate hotplug state space
    cpu/hotplug: Cleanup state names
    cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
    staging/lustre/libcfs: Convert to hotplug state machine
    scsi/bnx2i: Convert to hotplug state machine
    scsi/bnx2fc: Convert to hotplug state machine
    cpu/hotplug: Prevent overwriting of callbacks
    x86/msr: Remove bogus cleanup from the error path
    bus: arm-ccn: Prevent hotplug callback leak
    perf/x86/intel/cstate: Prevent hotplug callback leak
    ARM/imx/mmcd: Fix broken cpu hotplug handling
    scsi: qedi: Convert to hotplug state machine

    Linus Torvalds
     

25 Dec, 2016

3 commits

  • There is no point in having an extra type for extra confusion. u64 is
    unambiguous.

    Conversion was done with the following coccinelle script:

    @rem@
    @@
    -typedef u64 cycle_t;

    @fix@
    typedef cycle_t;
    @@
    -cycle_t
    +u64

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz

    Thomas Gleixner
     
  • When the state names got added a script was used to add the extra argument
    to the calls. The script basically converted the state constant to a
    string, but the cleanup to convert these strings into meaningful ones did
    not happen.

    Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
    are used in all the other places already.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
    $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

    to do the replacement at the end of the merge window.

    Requested-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 Dec, 2016

1 commit

  • Unexport the low-level __get_user_pages_unlocked() function and replace
    invocations with calls to more appropriate higher-level functions.

    In hva_to_pfn_slow() we are able to replace __get_user_pages_unlocked()
    with get_user_pages_unlocked() since we can now pass gup_flags.

    In async_pf_execute() and process_vm_rw_single_vec() we need to pass
    different tsk, mm arguments so get_user_pages_remote() is the sane
    replacement in these cases (having added manual acquisition and release
    of mmap_sem.)

    Additionally get_user_pages_remote() reintroduces use of the FOLL_TOUCH
    flag. However, this flag was originally silently dropped by commit
    1e9877902dc7 ("mm/gup: Introduce get_user_pages_remote()"), so this
    appears to have been unintentional and reintroducing it is therefore not
    an issue.

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20161027095141.2569-3-lstoakes@gmail.com
    Signed-off-by: Lorenzo Stoakes
    Acked-by: Michal Hocko
    Cc: Jan Kara
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Paolo Bonzini
    Cc: Radim Krcmar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes
     

14 Dec, 2016

2 commits

  • Pull KVM updates from Paolo Bonzini:
    "Small release, the most interesting stuff is x86 nested virt
    improvements.

    x86:
    - userspace can now hide nested VMX features from guests
    - nested VMX can now run Hyper-V in a guest
    - support for AVX512_4VNNIW and AVX512_FMAPS in KVM
    - infrastructure support for virtual Intel GPUs.

    PPC:
    - support for KVM guests on POWER9
    - improved support for interrupt polling
    - optimizations and cleanups.

    s390:
    - two small optimizations, more stuff is in flight and will be in
    4.11.

    ARM:
    - support for the GICv3 ITS on 32bit platforms"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
    arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest
    KVM: arm/arm64: timer: Check for properly initialized timer on init
    KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
    KVM: x86: Handle the kthread worker using the new API
    KVM: nVMX: invvpid handling improvements
    KVM: nVMX: check host CR3 on vmentry and vmexit
    KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
    KVM: nVMX: propagate errors from prepare_vmcs02
    KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
    KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry
    KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID
    KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation
    KVM: nVMX: support restore of VMX capability MSRs
    KVM: nVMX: generate non-true VMX MSRs based on true versions
    KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.
    KVM: x86: Add kvm_skip_emulated_instruction and use it.
    KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
    KVM: VMX: Reorder some skip_emulated_instruction calls
    KVM: x86: Add a return value to kvm_emulate_cpuid
    KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h
    ...

    Linus Torvalds
     
  • Pull VFIO updates from Alex Williamson:

    - VFIO updates for v4.10 primarily include a new Mediated Device
    interface, which essentially allows software defined devices to be
    exposed to users through VFIO. The host vendor driver providing this
    virtual device polices, or mediates user access to the device.

    These devices often incorporate portions of real devices, for
    instance the primary initial users of this interface expose vGPUs
    which allow the user to map mediated devices, or mdevs, to a portion
    of a physical GPU. QEMU composes these mdevs into PCI representations
    using the existing VFIO user API. This enables both Intel KVM-GT
    support, which is also expected to arrive into Linux mainline during
    the v4.10 merge window, as well as NVIDIA vGPU, and also Channel I/O
    devices (aka CCW devices) for s390 virtualization support. (Kirti
    Wankhede, Neo Jia)

    - Drop unnecessary uses of pcibios_err_to_errno() (Cao Jin)

    - Fixes to VFIO capability chain handling (Eric Auger)

    - Error handling fixes for fallout from mdev (Christophe JAILLET)

    - Notifiers to expose struct kvm to mdev vendor drivers (Jike Song)

    - type1 IOMMU model search fixes (Kirti Wankhede, Neo Jia)

    * tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
    vfio iommu type1: Fix size argument to vfio_find_dma() in pin_pages/unpin_pages
    vfio iommu type1: Fix size argument to vfio_find_dma() during DMA UNMAP.
    vfio iommu type1: WARN_ON if notifier block is not unregistered
    kvm: set/clear kvm to/from vfio_group when group add/delete
    vfio: support notifier chain in vfio_group
    vfio: vfio_register_notifier: classify iommu notifier
    vfio: Fix handling of error returned by 'vfio_group_get_from_dev()'
    vfio: fix vfio_info_cap_add/shift
    vfio/pci: Drop unnecessary pcibios_err_to_errno()
    MAINTAINERS: Add entry VFIO based Mediated device drivers
    docs: Sample driver to demonstrate how to use Mediated device framework.
    docs: Sysfs ABI for mediated device framework
    docs: Add Documentation for Mediated devices
    vfio: Define device_api strings
    vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()
    vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()
    vfio: Introduce vfio_set_irqs_validate_and_prepare()
    vfio_pci: Update vfio_pci to use vfio_info_add_capability()
    vfio: Introduce common function to add capabilities
    vfio iommu: Add blocking notifier to notify DMA_UNMAP
    ...

    Linus Torvalds
     

12 Dec, 2016

1 commit


11 Dec, 2016

1 commit


09 Dec, 2016

2 commits

  • When the arch timer code fails to initialize (for example because the
    memory mapped timer doesn't work, which is currently seen with the AEM
    model), then KVM just continues happily with a final result that KVM
    eventually does a NULL pointer dereference of the uninitialized cycle
    counter.

    Check directly for this in the init path and give the user a reasonable
    error in this case.

    Cc: Shih-Wei Li
    Signed-off-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Christoffer Dall
     
  • The GICv2 spec says in section 4.3.12 that a "CPU targets field bit that
    corresponds to an unimplemented CPU interface is RAZ/WI."
    Currently we allow the guest to write any value in there and it can
    read that back.
    Mask the written value with the proper CPU mask to be spec compliant.

    Signed-off-by: Andre Przywara
    Signed-off-by: Marc Zyngier

    Andre Przywara
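
    The masking step described in the commit above can be sketched as
    follows. This is a user-space illustration under stated assumptions:
    mask_cpu_targets() is a hypothetical helper that handles a single
    8-bit CPU targets field, and it is not the kernel's MMIO handler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only target bits for CPU interfaces that exist. */
static uint8_t mask_cpu_targets(uint8_t written, unsigned int nr_vcpus)
{
    /* GICv2 supports at most 8 CPU interfaces. */
    assert(nr_vcpus >= 1 && nr_vcpus <= 8);

    uint8_t cpu_mask = (uint8_t)((1u << nr_vcpus) - 1);

    /* Bits for unimplemented CPU interfaces read as zero, writes ignored. */
    return written & cpu_mask;
}
```

    With two VCPUs, a guest write of 0xff would be stored as 0x03, so a
    later read no longer returns bits for CPU interfaces that do not
    exist.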
     

02 Dec, 2016

1 commit

  • Sometimes users need to be aware when a vfio_group attaches to a
    KVM or detaches from it. KVM already calls the get/put methods from vfio
    to manipulate the vfio_group reference, so it can notify the vfio_group
    in a similar way.

    Cc: Kirti Wankhede
    Cc: Xiao Guangrong
    Signed-off-by: Jike Song
    Acked-by: Paolo Bonzini
    Signed-off-by: Alex Williamson

    Jike Song
     

01 Dec, 2016

2 commits


28 Nov, 2016

1 commit

  • The kvm module has the parameters halt_poll_ns, halt_poll_ns_grow, and
    halt_poll_ns_shrink. Halt polling was recently added to the powerpc kvm-hv
    module and these parameters were essentially duplicated for that. There is
    no benefit to this duplication and it can lead to confusion when trying to
    tune halt polling.

    Thus move the definition of these variables to kvm_host.h and export them.
    This will allow the kvm-hv module to use the same module parameters by
    accessing these variables, which will be implemented in the next patch,
    meaning that they will no longer be duplicated.

    Signed-off-by: Suraj Jitindar Singh
    Signed-off-by: Paul Mackerras

    Suraj Jitindar Singh
     

24 Nov, 2016

1 commit

  • When we inject a level triggered interrupt (and unless it
    is backed by the physical distributor - timer style), we request
    a maintenance interrupt. Part of the processing for that interrupt
    is to feed to the rest of KVM (and to the eventfd subsystem) the
    information that the interrupt has been EOIed.

    But that notification only makes sense for SPIs, and not PPIs
    (such as the PMU interrupt). Skip over the notification if
    the interrupt is not an SPI.

    Cc: stable@vger.kernel.org # 4.7+
    Fixes: 140b086dd197 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
    Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
    Reported-by: Catalin Marinas
    Tested-by: Catalin Marinas
    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Marc Zyngier
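
    The check described in the commit above can be sketched as follows.
    The INTID ranges are those of the GIC architecture (SGIs 0-15, PPIs
    16-31, SPIs 32-1019); intid_is_spi() and handle_eoi() are
    hypothetical names, and the counter stands in for the real
    notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIRST_SPI_INTID 32u    /* INTIDs 0-15 are SGIs, 16-31 are PPIs */
#define LAST_SPI_INTID  1019u

static bool intid_is_spi(uint32_t intid)
{
    return intid >= FIRST_SPI_INTID && intid <= LAST_SPI_INTID;
}

/* Hypothetical EOI hook: counts the notifications that would be sent. */
static void handle_eoi(uint32_t intid, unsigned int *notifications)
{
    if (!intid_is_spi(intid))
        return;              /* PPIs such as the PMU interrupt are skipped */
    (*notifications)++;
}
```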
     

22 Nov, 2016

1 commit

  • It allows us to update some status or field of a structure partially.

    We can also save a kvm_read_guest_cached() call if we just update one
    field of the struct regardless of its current value.

    Signed-off-by: Pan Xinhui
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Paolo Bonzini
    Cc: David.Laight@ACULAB.COM
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: benh@kernel.crashing.org
    Cc: boqun.feng@gmail.com
    Cc: borntraeger@de.ibm.com
    Cc: bsingharora@gmail.com
    Cc: dave@stgolabs.net
    Cc: jgross@suse.com
    Cc: kernellwp@gmail.com
    Cc: konrad.wilk@oracle.com
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: mpe@ellerman.id.au
    Cc: paulmck@linux.vnet.ibm.com
    Cc: paulus@samba.org
    Cc: rkrcmar@redhat.com
    Cc: virtualization@lists.linux-foundation.org
    Cc: will.deacon@arm.com
    Cc: xen-devel-request@lists.xenproject.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1478077718-37424-8-git-send-email-xinhui.pan@linux.vnet.ibm.com
    [ Typo fixes. ]
    Signed-off-by: Ingo Molnar

    Pan Xinhui
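
    The idea of a partial update at an offset can be sketched in user
    space as follows. This is not the KVM helper itself: struct
    toy_shared, its fields, and write_offset_cached() are hypothetical,
    and a plain memory buffer stands in for the cached guest region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shared structure; field names are for illustration only. */
struct toy_shared {
    uint64_t counter;
    uint32_t version;
    uint8_t  status;
    uint8_t  pad[3];
};

/* Write len bytes at a byte offset inside a region, leaving the rest untouched. */
static int write_offset_cached(void *region, size_t region_len,
                               const void *data, size_t offset, size_t len)
{
    if (offset > region_len || len > region_len - offset)
        return -1;               /* refuse writes that leave the region */
    memcpy((uint8_t *)region + offset, data, len);
    return 0;
}
```

    A caller that only wants to set the status field passes
    offsetof(struct toy_shared, status) and sizeof(uint8_t), and never
    needs to read the rest of the structure first.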
     

20 Nov, 2016

2 commits

  • This was reported by syzkaller:

    [ INFO: possible recursive locking detected ]
    4.9.0-rc4+ #49 Not tainted
    ---------------------------------------------
    kworker/2:1/5658 is trying to acquire lock:
    ([ 1644.769018] (&work->work)
    [< inline >] list_empty include/linux/compiler.h:243
    [] flush_work+0x0/0x660 kernel/workqueue.c:1511

    but task is already holding lock:
    ([ 1644.769018] (&work->work)
    [] process_one_work+0x94b/0x1900 kernel/workqueue.c:2093

    stack backtrace:
    CPU: 2 PID: 5658 Comm: kworker/2:1 Not tainted 4.9.0-rc4+ #49
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events async_pf_execute
    ffff8800676ff630 ffffffff81c2e46b ffffffff8485b930 ffff88006b1fc480
    0000000000000000 ffffffff8485b930 ffff8800676ff7e0 ffffffff81339b27
    ffff8800676ff7e8 0000000000000046 ffff88006b1fcce8 ffff88006b1fccf0
    Call Trace:
    ...
    [] flush_work+0x93/0x660 kernel/workqueue.c:2846
    [] __cancel_work_timer+0x17a/0x410 kernel/workqueue.c:2916
    [] cancel_work_sync+0x17/0x20 kernel/workqueue.c:2951
    [] kvm_clear_async_pf_completion_queue+0xd7/0x400 virt/kvm/async_pf.c:126
    [< inline >] kvm_free_vcpus arch/x86/kvm/x86.c:7841
    [] kvm_arch_destroy_vm+0x23d/0x620 arch/x86/kvm/x86.c:7946
    [< inline >] kvm_destroy_vm virt/kvm/kvm_main.c:731
    [] kvm_put_kvm+0x40e/0x790 virt/kvm/kvm_main.c:752
    [] async_pf_execute+0x23d/0x4f0 virt/kvm/async_pf.c:111
    [] process_one_work+0x9fc/0x1900 kernel/workqueue.c:2096
    [] worker_thread+0xef/0x1480 kernel/workqueue.c:2230
    [] kthread+0x244/0x2d0 kernel/kthread.c:209
    [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

    The reason is that kvm_put_kvm is causing the destruction of the VM, but
    the page fault is still on the ->queue list. The ->queue list is owned
    by the VCPU, not by the work items, so we cannot just add list_del to
    the work item.

    Instead, use work->vcpu to note async page faults that have been resolved
    and will be processed through the done list. There is no need to flush
    those.

    Cc: Dmitry Vyukov
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Radim Krčmář

    Paolo Bonzini
     
  • KVM/ARM updates for v4.9-rc6

    - Fix handling of the 32bit cycle counter
    - Fix cycle counter filtering

    Radim Krčmář
     

18 Nov, 2016

1 commit

  • KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
    But this function can't deal with PMCCFILTR correctly because the evtCount
    bits of PMCCFILTR, which are reserved as 0, conflict with the SW_INCR event
    type of other PMXEVTYPER registers. To fix it, when eventsel == 0, this
    function shouldn't return immediately; instead it needs to check further
    if select_idx is ARMV8_PMU_CYCLE_IDX.

    Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTR
    blindly to attr.config. Instead it ought to convert the request to the
    "cpu cycle" event type (i.e. 0x11).

    To support this patch and to prevent duplicated definitions, a limited
    set of ARMv8 perf event types were relocated from perf_event.c to
    asm/perf_event.h.

    Cc: stable@vger.kernel.org # 4.6+
    Acked-by: Will Deacon
    Signed-off-by: Wei Huang
    Signed-off-by: Marc Zyngier

    Wei Huang
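
    The selection logic described in the commit above can be sketched as
    follows. The event numbers 0x00 (SW_INCR) and 0x11 (cpu cycles) come
    from the commit message; the cycle counter index of 31, the EVT_NONE
    value and the function name select_perf_event() are assumptions made
    for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EVT_SW_INCR     0x00   /* software increment event */
#define EVT_CPU_CYCLES  0x11   /* "cpu cycle" event type named in the commit */
#define CYCLE_IDX       31     /* assumed index of the cycle counter */
#define EVT_NONE        (-1)   /* hypothetical: no perf event is created */

/* Returns the perf event type to program, or EVT_NONE to return early. */
static int select_perf_event(unsigned int select_idx, uint32_t eventsel)
{
    /* The cycle counter always counts cycles; its event bits are reserved 0. */
    if (select_idx == CYCLE_IDX)
        return EVT_CPU_CYCLES;

    /* For the other counters, eventsel == 0 really means SW_INCR. */
    if (eventsel == EVT_SW_INCR)
        return EVT_NONE;

    return (int)eventsel;
}
```

    Checking the index before the event number is what keeps a zero
    eventsel on the cycle counter from being mistaken for SW_INCR.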
     

17 Nov, 2016

1 commit


15 Nov, 2016

1 commit

  • 1) Since commit 41a54482 changed the timer enabled variable to per-vcpu,
    the corresponding comment in kvm_timer_enable is useless now.

    2) After the kvm module initializes successfully, the timecounter is always
    non-null, so we can remove the check of the timecounter.

    Signed-off-by: Longpeng(Mike)
    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Longpeng(Mike)
     

14 Nov, 2016

2 commits

  • This patch allows building and using the vGICv3 ITS in 32-bit mode.

    Signed-off-by: Vladimir Murzin
    Reviewed-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Vladimir Murzin
     
  • Evaluate GITS_BASER_ENTRY_SIZE once as an int (GITS_BASER's Entry Size
    field is only 5 bits wide), so that when it is used as a divisor no
    reference to __aeabi_uldivmod is generated when building for AArch32.

    Use unsigned long long for GITS_BASER_PAGE_SIZE_* since they are
    used in conjunction with 64-bit data.

    Signed-off-by: Vladimir Murzin
    Reviewed-by: Andre Przywara
    Reviewed-by: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Vladimir Murzin
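
    The narrowing described in the commit above can be sketched as
    follows. The 5-bit field width is from the commit message; the shift
    of 48, the "size minus one" encoding, and the helper names are
    assumptions made for this illustration and should be checked against
    the GIC headers.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_SIZE_SHIFT 48        /* assumed position of the Entry Size field */
#define ENTRY_SIZE_MASK  0x1fu     /* 5-bit field, as stated in the commit */

/* Extract the entry size once as an int (assumed to encode size minus one). */
static int baser_entry_size(uint64_t baser)
{
    return (int)((baser >> ENTRY_SIZE_SHIFT) & ENTRY_SIZE_MASK) + 1;
}

/* With an int divisor and a 32-bit dividend, no 64-bit division is needed. */
static unsigned int entries_per_page(unsigned int page_size, uint64_t baser)
{
    return page_size / (unsigned int)baser_entry_size(baser);
}
```

    On a 32-bit target, a division whose operands are 64 bits wide is
    what pulls in a library helper such as __aeabi_uldivmod, so keeping
    the divisor and dividend narrow avoids it.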
     

11 Nov, 2016

1 commit


05 Nov, 2016

3 commits

  • Pull KVM updates from Paolo Bonzini:
    "One NULL pointer dereference, and two fixes for regressions introduced
    during the merge window.

    The rest are fixes for MIPS, s390 and nested VMX"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    kvm: x86: Check memopp before dereference (CVE-2016-8630)
    kvm: nVMX: VMCLEAR an active shadow VMCS after last use
    KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
    KVM: x86: fix wbinvd_dirty_mask use-after-free
    kvm/x86: Show WRMSR data is in hex
    kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
    KVM: document lock orders
    KVM: fix OOPS on flush_work
    KVM: s390: Fix STHYI buffer alignment for diag224
    KVM: MIPS: Precalculate MMIO load resume PC
    KVM: MIPS: Make ERET handle ERL before EXL
    KVM: MIPS: Fix lazy user ASID regenerate for SMP

    Linus Torvalds
     
  • In cases like IPI, we could be queueing an interrupt for a VCPU
    that is already running and is not about to exit, because the
    VCPU has entered the VM with the interrupt pending and would
    not trap on EOI'ing that interrupt. This could result in delays
    in interrupt deliveries or even loss of interrupts.
    To guarantee prompt interrupt injection, here we have to try to
    kick the VCPU.

    Signed-off-by: Shih-Wei Li
    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Shih-Wei Li
     
  • In our VGIC implementation we limit the number of SPIs to a number
    that the userland application told us. Accordingly we limit the
    allocation of memory for virtual IRQs to that number.
    However in our MMIO dispatcher we didn't check if we ever access an
    IRQ beyond that limit, leading to out-of-bound accesses.
    Add a test against the number of allocated SPIs in check_region().
    Adjust the VGIC_ADDR_TO_INT macro to avoid an actual division, which
    is not implemented on ARM(32).

    [maz: cleaned-up original patch]

    Cc: stable@vger.kernel.org
    Reviewed-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Signed-off-by: Marc Zyngier

    Andre Przywara
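
    The two ideas in the commit above, a bounds check against the number
    of allocated SPIs and an address-to-interrupt conversion without a
    division, can be sketched as follows. This is not the kernel's macro
    or its check_region(): the helper names are hypothetical, and shifts
    are swapped in as one way to avoid the division for power-of-two
    field widths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_PRIVATE_IRQS 32u   /* SGIs and PPIs precede the SPIs */

/* log2 of bits-per-IRQ; only power-of-two widths are handled here. */
static unsigned int bits_shift(unsigned int bits_per_irq)
{
    unsigned int shift = 0;

    assert(bits_per_irq != 0 && (bits_per_irq & (bits_per_irq - 1)) == 0);
    while ((1u << shift) < bits_per_irq)
        shift++;
    return shift;
}

/* Register byte offset -> first INTID it covers, using shifts instead of '/'. */
static uint32_t offset_to_intid(uint32_t offset, unsigned int bits_per_irq)
{
    return (offset << 3) >> bits_shift(bits_per_irq);
}

/* Reject accesses to interrupts beyond the allocated SPIs. */
static bool intid_in_range(uint32_t intid, uint32_t nr_spis)
{
    return intid < NR_PRIVATE_IRQS + nr_spis;
}
```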
     

03 Nov, 2016

1 commit


26 Oct, 2016

1 commit

  • The conversion done by commit 3706feacd007 ("KVM: Remove deprecated
    create_singlethread_workqueue") is broken. It flushes a single work
    item &irqfd->shutdown instead of all of them, and even worse if there
    is no irqfd on the list then you get a NULL pointer dereference.
    Revert the virt/kvm/eventfd.c part of that patch; to avoid the
    deprecated function, just allocate our own workqueue---it does
    not even have to be unbound---with alloc_workqueue.

    Fixes: 3706feacd007
    Reviewed-by: Cornelia Huck
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     

25 Oct, 2016

1 commit

  • This patch unexports the low-level __get_user_pages() function.

    Recent refactoring of the get_user_pages* functions allows flags to be
    passed through get_user_pages() which eliminates the need for access to
    this function from its one user, kvm.

    We can see that the two calls to get_user_pages() which replace
    __get_user_pages() in kvm_main.c are equivalent by examining their call
    stacks:

    get_user_page_nowait():
    get_user_pages(start, 1, flags, page, NULL)
    __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, start, 1,
    flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)

    check_user_page_hwpoison():
    get_user_pages(addr, 1, flags, NULL, NULL)
    __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
    NULL, NULL)

    Signed-off-by: Lorenzo Stoakes
    Acked-by: Paolo Bonzini
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes
     

19 Oct, 2016

1 commit

  • This removes the redundant 'write' and 'force' parameters from
    __get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
    callers as use of this flag can result in surprising behaviour (and
    hence bugs) within the mm subsystem.

    Signed-off-by: Lorenzo Stoakes
    Acked-by: Paolo Bonzini
    Reviewed-by: Jan Kara
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes
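
    The move from boolean parameters to explicit flags described in the
    commit above can be sketched as follows. The flag values here are
    illustrative stand-ins (the kernel defines the real ones in its mm
    headers), and bools_to_gup_flags() is a hypothetical helper showing
    what a caller of the old interface was implicitly requesting.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; treat them as assumptions. */
#define FOLL_WRITE 0x01u
#define FOLL_FORCE 0x10u

/* What the old (write, force) pair looks like as explicit gup_flags. */
static unsigned int bools_to_gup_flags(bool write, bool force)
{
    unsigned int gup_flags = 0;

    if (write)
        gup_flags |= FOLL_WRITE;
    if (force)
        gup_flags |= FOLL_FORCE;   /* now visible at every call site */
    return gup_flags;
}
```

    With explicit flags, a call site that needs FOLL_FORCE has to spell
    it out, which makes such uses easy to find with a text search.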
     

29 Sep, 2016

1 commit


28 Sep, 2016

2 commits

  • If the vgic hasn't been created and initialized, we shouldn't attempt to
    look at its data structures or flush/sync anything to the GIC hardware.

    This fixes an issue reported by Alexander Graf when using a userspace
    irqchip.

    Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
    Cc: stable@vger.kernel.org
    Reported-by: Alexander Graf
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
     
  • If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
    irqchip, then we end up in a nasty path where we try to take an
    uninitialized spinlock, which can lead to all sorts of breakages.

    Luckily, QEMU always creates the VGIC before the PMU, so we can
    establish this as ABI and check for the VGIC in the PMU init stage.
    This can be relaxed at a later time if we want to support PMU with a
    userspace irqchip.

    Cc: stable@vger.kernel.org
    Cc: Shannon Zhao
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
     

22 Sep, 2016

1 commit

  • This patch allows building and using vgic-v3 in 32-bit mode.

    Unfortunately, it can not be split in several steps without extra
    stubs to keep patches independent and bisectable. For instance,
    virt/kvm/arm/vgic/vgic-v3.c uses functions from vgic-v3-sr.c, handling
    access to GICv3 cpu interface from the guest requires vgic_v3.vgic_sre
    to be already defined.

    This is how support has been done:

    * handle SGI requests from the guest

    * report configured SRE on access to GICv3 cpu interface from the guest

    * required vgic-v3 macros are provided via uapi.h

    * static keys are used to select GIC backend

    * to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
    the static inlines

    Acked-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin