26 Jan, 2017

1 commit

  • commit 1193e6aeecb36c74c48c7cd0f641acbbed9ddeef upstream.

    Dmitry Vyukov reported that the syzkaller fuzzer triggered a
    deadlock in the vgic setup code when an error was detected, as
    the cleanup code tries to take a lock that is already held by
    the setup code.

    The fix is to avoid retaking the lock when cleaning up, by
    telling the cleanup function that we already hold it.

    Reported-by: Dmitry Vyukov
    Reviewed-by: Christoffer Dall
    Reviewed-by: Eric Auger
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
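
    The locking pattern described above can be sketched in plain C. This is
    a minimal userspace illustration, not the vgic code: the toy_* names and
    the toy lock are hypothetical, and splitting cleanup into a "locked"
    variant is just one common way of telling a cleanup function that the
    lock is already held.

```c
#include <stdbool.h>

/* Toy non-recursive lock: taking it twice is recorded as a deadlock. */
struct toy_lock {
    bool held;
    bool deadlocked;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        l->deadlocked = true;   /* a real mutex would hang here */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

struct toy_dev {
    struct toy_lock lock;
    bool initialized;
};

/* Cleanup variant for callers that already hold dev->lock. */
static void toy_cleanup_locked(struct toy_dev *dev)
{
    dev->initialized = false;
}

/* Cleanup variant for callers that do not hold dev->lock. */
static void toy_cleanup(struct toy_dev *dev)
{
    toy_lock_acquire(&dev->lock);
    toy_cleanup_locked(dev);
    toy_lock_release(&dev->lock);
}

/* Setup that can fail part-way; returns 0 on success, -1 on error. */
static int toy_setup(struct toy_dev *dev, bool inject_error)
{
    int ret = 0;

    toy_lock_acquire(&dev->lock);
    dev->initialized = true;
    if (inject_error) {
        /* Lock is held here, so use the locked variant, not toy_cleanup(). */
        toy_cleanup_locked(dev);
        ret = -1;
    }
    toy_lock_release(&dev->lock);
    return ret;
}
```

    Calling toy_cleanup() from the error path instead would take the lock a
    second time, which is the shape of the deadlock reported here.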

20 Jan, 2017

1 commit

  • commit 4f3dbdf47e150016aacd734e663347fcaa768303 upstream.

    Reported by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    PGD 0

    Oops: 0002 [#1] SMP
    CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
    Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
    task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
    RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    Call Trace:
    irqfd_shutdown+0x66/0xa0 [kvm]
    ? process_one_work+0x480/0x480
    ? kthread_create_on_node+0x60/0x60
    RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
    CR2: 0000000000000008

    The syzkaller folks reported a NULL pointer dereference caused by
    unregistering a consumer whose registration had previously failed.
    syzkaller occasionally creates two VMs with the same eventfd, so the
    second VM fails to register an irqbypass consumer. The irqfd is then
    marked inactive, and a work item is queued to shut down the irqfd and
    unregister the irqbypass consumer when the eventfd is closed. However,
    the second consumer has been initialized even though its registration
    failed. So when the workqueue uses its token (the same as the first
    VM's) to unregister the consumer, the consumer of the first VM is found
    and unregistered, and a NULL dereference then occurs on the path that
    deletes the consumer from the consumers list.

    This patch fixes it by making irq_bypass_register/unregister_consumer()
    look for the consumer entry based on the consumer pointer itself instead
    of token matching.

    Reported-by: Dmitry Vyukov
    Suggested-by: Alex Williamson
    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Cc: Dmitry Vyukov
    Cc: Alex Williamson
    Signed-off-by: Wanpeng Li
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Wanpeng Li
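
    A small sketch of why matching on the consumer pointer is safer than
    matching on the token. This is a userspace toy, not the irqbypass
    manager: the toy_* names and the singly linked list are hypothetical
    stand-ins for the real consumers list.

```c
#include <stddef.h>

struct toy_consumer {
    struct toy_consumer *next;
    void *token;
};

static struct toy_consumer *consumers;  /* head of the registered list */

/* Returns 0 on success, -1 if this consumer or its token is already present. */
static int toy_register(struct toy_consumer *c)
{
    for (struct toy_consumer *it = consumers; it; it = it->next)
        if (it == c || it->token == c->token)
            return -1;
    c->next = consumers;
    consumers = c;
    return 0;
}

/*
 * Unlink by pointer identity. A consumer that never made it onto the list
 * is simply not found, even if it shares a token with a registered one.
 */
static int toy_unregister(struct toy_consumer *c)
{
    for (struct toy_consumer **pp = &consumers; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            c->next = NULL;
            return 0;
        }
    }
    return -1;
}
```

    With token matching, unregistering the second (never registered)
    consumer would have found and removed the first VM's entry instead.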

01 Dec, 2016

2 commits

24 Nov, 2016

1 commit

  • When we inject a level triggered interrupt (and unless it
    is backed by the physical distributor - timer style), we request
    a maintenance interrupt. Part of the processing for that interrupt
    is to feed to the rest of KVM (and to the eventfd subsystem) the
    information that the interrupt has been EOIed.

    But that notification only makes sense for SPIs, and not PPIs
    (such as the PMU interrupt). Skip over the notification if
    the interrupt is not an SPI.

    Cc: stable@vger.kernel.org # 4.7+
    Fixes: 140b086dd197 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend")
    Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend")
    Reported-by: Catalin Marinas
    Tested-by: Catalin Marinas
    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Marc Zyngier
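
    The "SPI only" test can be sketched as a range check on the interrupt
    ID. In the GIC architecture, IDs 0-15 are SGIs, 16-31 are PPIs and SPIs
    start at 32; the function name and the nr_spis parameter below are
    hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_PRIVATE_IRQS 32u  /* SGIs (0-15) plus PPIs (16-31) */

/*
 * Returns true if an EOI notification makes sense for this interrupt,
 * i.e. if it is one of the nr_spis shared peripheral interrupts.
 */
static bool toy_should_notify_eoi(uint32_t intid, uint32_t nr_spis)
{
    return intid >= TOY_NR_PRIVATE_IRQS &&
           intid - TOY_NR_PRIVATE_IRQS < nr_spis;
}
```

    A PPI such as the PMU interrupt falls below 32 and is skipped.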

20 Nov, 2016

2 commits

  • This was reported by syzkaller:

    [ INFO: possible recursive locking detected ]
    4.9.0-rc4+ #49 Not tainted
    kworker/2:1/5658 is trying to acquire lock:
    ([ 1644.769018] (&work->work)
    [< inline >] list_empty include/linux/compiler.h:243
    [] flush_work+0x0/0x660 kernel/workqueue.c:1511

    but task is already holding lock:
    ([ 1644.769018] (&work->work)
    [] process_one_work+0x94b/0x1900 kernel/workqueue.c:2093

    stack backtrace:
    CPU: 2 PID: 5658 Comm: kworker/2:1 Not tainted 4.9.0-rc4+ #49
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: events async_pf_execute
    ffff8800676ff630 ffffffff81c2e46b ffffffff8485b930 ffff88006b1fc480
    0000000000000000 ffffffff8485b930 ffff8800676ff7e0 ffffffff81339b27
    ffff8800676ff7e8 0000000000000046 ffff88006b1fcce8 ffff88006b1fccf0
    Call Trace:
    [] flush_work+0x93/0x660 kernel/workqueue.c:2846
    [] __cancel_work_timer+0x17a/0x410 kernel/workqueue.c:2916
    [] cancel_work_sync+0x17/0x20 kernel/workqueue.c:2951
    [] kvm_clear_async_pf_completion_queue+0xd7/0x400 virt/kvm/async_pf.c:126
    [< inline >] kvm_free_vcpus arch/x86/kvm/x86.c:7841
    [] kvm_arch_destroy_vm+0x23d/0x620 arch/x86/kvm/x86.c:7946
    [< inline >] kvm_destroy_vm virt/kvm/kvm_main.c:731
    [] kvm_put_kvm+0x40e/0x790 virt/kvm/kvm_main.c:752
    [] async_pf_execute+0x23d/0x4f0 virt/kvm/async_pf.c:111
    [] process_one_work+0x9fc/0x1900 kernel/workqueue.c:2096
    [] worker_thread+0xef/0x1480 kernel/workqueue.c:2230
    [] kthread+0x244/0x2d0 kernel/kthread.c:209
    [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

    The reason is that kvm_put_kvm is causing the destruction of the VM, but
    the page fault is still on the ->queue list. The ->queue list is owned
    by the VCPU, not by the work items, so we cannot just add list_del to
    the work item.

    Instead, use work->vcpu to note async page faults that have been resolved
    and will be processed through the done list. There is no need to flush

    Cc: Dmitry Vyukov
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Radim Krčmář

    Paolo Bonzini
  • KVM/ARM updates for v4.9-rc6

    - Fix handling of the 32bit cycle counter
    - Fix cycle counter filtering

    Radim Krčmář

18 Nov, 2016

1 commit

  • KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
    But this function can't deal with PMCCFILTR correctly because the evtCount
    bits of PMCCFILTR, which are reserved as 0, conflict with the SW_INCR event
    type of other PMXEVTYPER registers. To fix it, when eventsel == 0, this
    function shouldn't return immediately; instead it needs to check further
    if select_idx is ARMV8_PMU_CYCLE_IDX.

    Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTR
    blindly to attr.config. Instead it ought to convert the request to the
    "cpu cycle" event type (i.e. 0x11).

    To support this patch and to prevent duplicated definitions, a limited
    set of ARMv8 perf event types were relocated from perf_event.c to

    Cc: stable@vger.kernel.org # 4.6+
    Acked-by: Will Deacon
    Signed-off-by: Wei Huang
    Signed-off-by: Marc Zyngier

    Wei Huang
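
    The two rules above (do not bail out on eventsel == 0 for the cycle
    counter, and map the cycle counter to the "cpu cycle" event 0x11) can be
    sketched as follows. This is a standalone illustration: the toy_* names
    are hypothetical, the cycle counter index of 31 and the SW_INCR value of
    0 follow the ARMv8 PMU convention, and the 0x3ff event mask is an
    assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CYCLE_IDX       31u     /* index of the cycle counter */
#define TOY_EVT_SW_INCR     0x00u   /* software increment event */
#define TOY_EVT_CPU_CYCLES  0x11u   /* "cpu cycle" event type */
#define TOY_EVTYPE_EVENT    0x3ffu  /* assumed event-number field mask */

/*
 * Decide which event type to program for a counter.
 * Returns false if no backing event is needed (SW_INCR on an ordinary
 * counter); otherwise stores the event type in *config and returns true.
 */
static bool toy_select_event(uint32_t select_idx, uint32_t data,
                             uint32_t *config)
{
    uint32_t eventsel = data & TOY_EVTYPE_EVENT;

    /* eventsel == 0 only means SW_INCR when this is not the cycle counter. */
    if (eventsel == TOY_EVT_SW_INCR && select_idx != TOY_CYCLE_IDX)
        return false;

    /* The cycle counter always counts cycles, whatever the filter bits say. */
    *config = (select_idx == TOY_CYCLE_IDX) ? TOY_EVT_CPU_CYCLES : eventsel;
    return true;
}
```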

11 Nov, 2016

1 commit

05 Nov, 2016

3 commits

  • Pull KVM updates from Paolo Bonzini:
    "One NULL pointer dereference, and two fixes for regressions introduced
    during the merge window.

    The rest are fixes for MIPS, s390 and nested VMX"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    kvm: x86: Check memopp before dereference (CVE-2016-8630)
    kvm: nVMX: VMCLEAR an active shadow VMCS after last use
    KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
    KVM: x86: fix wbinvd_dirty_mask use-after-free
    kvm/x86: Show WRMSR data is in hex
    kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
    KVM: document lock orders
    KVM: fix OOPS on flush_work
    KVM: s390: Fix STHYI buffer alignment for diag224
    KVM: MIPS: Precalculate MMIO load resume PC
    KVM: MIPS: Make ERET handle ERL before EXL
    KVM: MIPS: Fix lazy user ASID regenerate for SMP

    Linus Torvalds
  • In cases like IPI, we could be queueing an interrupt for a VCPU
    that is already running and is not about to exit, because the
    VCPU has entered the VM with the interrupt pending and would
    not trap on EOI'ing that interrupt. This could result in delays
    in interrupt deliveries or even loss of interrupts.
    To guarantee prompt interrupt injection, here we have to try to
    kick the VCPU.

    Signed-off-by: Shih-Wei Li
    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Shih-Wei Li
  • In our VGIC implementation we limit the number of SPIs to a number
    that the userland application told us. Accordingly we limit the
    allocation of memory for virtual IRQs to that number.
    However in our MMIO dispatcher we didn't check if we ever access an
    IRQ beyond that limit, leading to out-of-bounds accesses.
    Add a test against the number of allocated SPIs in check_region().
    Adjust the VGIC_ADDR_TO_INT macro to avoid an actual division, which
    is not implemented on ARM(32).

    [maz: cleaned-up original patch]

    Cc: stable@vger.kernel.org
    Reviewed-by: Christoffer Dall
    Signed-off-by: Andre Przywara
    Signed-off-by: Marc Zyngier

    Andre Przywara
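
    A rough sketch of the two ideas in this message: turning a register
    offset into an interrupt ID with a shift rather than a division, and
    rejecting accesses that reach past the allocated SPIs. The toy_* helpers
    are hypothetical and do not reproduce the actual macro or check_region();
    this version also rejects accesses that only partly exceed the limit,
    which is a choice made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_PRIVATE_IRQS 32u  /* SGIs and PPIs, always present */

/*
 * Number of IRQs covered by `bytes` bytes of a register block that uses
 * 2^log2_bits bits per IRQ (1, 2 or 8 bits in practice). Because the
 * per-IRQ width is a power of two, the division becomes a right shift.
 */
static uint32_t toy_addr_to_intid(uint32_t bytes, unsigned int log2_bits)
{
    return (bytes * 8u) >> log2_bits;
}

/* Returns true if [offset, offset + len) only touches allocated IRQs. */
static bool toy_check_region(uint32_t offset, uint32_t len,
                             unsigned int log2_bits, uint32_t nr_spis)
{
    uint32_t first = toy_addr_to_intid(offset, log2_bits);
    uint32_t count = toy_addr_to_intid(len, log2_bits);
    uint32_t limit = TOY_NR_PRIVATE_IRQS + nr_spis;

    return count > 0 && first < limit && count <= limit - first;
}
```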

26 Oct, 2016

1 commit

  • The conversion done by commit 3706feacd007 ("KVM: Remove deprecated
    create_singlethread_workqueue") is broken. It flushes a single work
    item &irqfd->shutdown instead of all of them, and even worse if there
    is no irqfd on the list then you get a NULL pointer dereference.
    Revert the virt/kvm/eventfd.c part of that patch; to avoid the
    deprecated function, just allocate our own workqueue---it does
    not even have to be unbound---with alloc_workqueue.

    Fixes: 3706feacd007
    Reviewed-by: Cornelia Huck
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini

25 Oct, 2016

1 commit

  • This patch unexports the low-level __get_user_pages() function.

    Recent refactoring of the get_user_pages* functions allows flags to be
    passed through get_user_pages() which eliminates the need for access to
    this function from its one user, kvm.

    We can see that the two calls to get_user_pages() which replace
    __get_user_pages() in kvm_main.c are equivalent by examining their call

    get_user_pages(start, 1, flags, page, NULL)
    __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, start, 1,
    flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)

    get_user_pages(addr, 1, flags, NULL, NULL)
    __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
    false, flags | FOLL_TOUCH)
    __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,

    Signed-off-by: Lorenzo Stoakes
    Acked-by: Paolo Bonzini
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes

19 Oct, 2016

1 commit

  • This removes the redundant 'write' and 'force' parameters from
    __get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
    callers as use of this flag can result in surprising behaviour (and
    hence bugs) within the mm subsystem.

    Signed-off-by: Lorenzo Stoakes
    Acked-by: Paolo Bonzini
    Reviewed-by: Jan Kara
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes

29 Sep, 2016

1 commit

28 Sep, 2016

2 commits

  • If the vgic hasn't been created and initialized, we shouldn't attempt to
    look at its data structures or flush/sync anything to the GIC hardware.

    This fixes an issue reported by Alexander Graf when using a userspace

    Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
    Cc: stable@vger.kernel.org
    Reported-by: Alexander Graf
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
  • If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
    irqchip, then we end up in a nasty path where we try to take an
    uninitialized spinlock, which can lead to all sorts of breakages.

    Luckily, QEMU always creates the VGIC before the PMU, so we can
    establish this as ABI and check for the VGIC in the PMU init stage.
    This can be relaxed at a later time if we want to support PMU with a
    userspace irqchip.

    Cc: stable@vger.kernel.org
    Cc: Shannon Zhao
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Christoffer Dall
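
    The ordering rule can be pictured as a guard at the top of the PMU init
    path. The structures, the function name and the -ENODEV return value
    below are all hypothetical, intended only to show the shape of the
    check.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_vm {
    bool vgic_initialized;  /* set once the in-kernel irqchip is ready */
};

struct toy_vcpu {
    struct toy_vm *vm;
    bool pmu_ready;
};

/* Refuse to initialise the PMU until the in-kernel irqchip exists. */
static int toy_pmu_init(struct toy_vcpu *vcpu)
{
    if (!vcpu->vm->vgic_initialized)
        return -ENODEV;     /* assumed error code for this sketch */

    vcpu->pmu_ready = true;
    return 0;
}
```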

22 Sep, 2016

6 commits

  • This patch makes it possible to build and use vgic-v3 in 32-bit mode.

    Unfortunately, it cannot be split into several steps without extra
    stubs to keep patches independent and bisectable. For instance,
    virt/kvm/arm/vgic/vgic-v3.c uses functions from vgic-v3-sr.c, and
    handling access to the GICv3 cpu interface from the guest requires
    vgic_v3.vgic_sre to be already defined.

    Here is how support has been done:

    * handle SGI requests from the guest

    * report configured SRE on access to GICv3 cpu interface from the guest

    * required vgic-v3 macros are provided via uapi.h

    * static keys are used to select GIC backend

    * to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
    the static inlines

    Acked-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
  • We have a couple of 64-bit registers defined in GICv3 architecture, so
    unsigned long accesses to these registers will only access a single
    32-bit part of that register. On the other hand these registers can't
    be accessed as 64-bit with a single instruction like ldrd/strd or
    ldmia/stmia if we run a 32-bit host because KVM does not support
    access to MMIO space done by these instructions.

    It means that a 32-bit guest accesses these registers in 32-bit
    chunks, so the only thing we need to do is to ensure that
    extract_bytes() always takes 64-bit data.

    Acked-by: Marc Zyngier
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
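
    A sketch of a byte extraction helper that always works on 64-bit data,
    so that a 32-bit access at offset 4 sees the upper half of the register.
    The toy_extract_bytes() name and signature are assumptions for this
    example and may differ from the kernel's extract_bytes().

```c
#include <stdint.h>

/*
 * Return `num` bytes of `data`, starting `offset` bytes from the least
 * significant end. `offset` must be less than 8. Taking a uint64_t keeps
 * the upper 32 bits available even where unsigned long is 32 bits wide.
 */
static uint64_t toy_extract_bytes(uint64_t data, unsigned int offset,
                                  unsigned int num)
{
    /* Avoid shifting by 64, which is undefined, when all 8 bytes are wanted. */
    uint64_t mask = (num >= 8) ? UINT64_MAX
                               : ((UINT64_C(1) << (num * 8)) - 1);

    return (data >> (offset * 8)) & mask;
}
```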
  • Well, this patch is looking ahead of time, but we'll get the following
    compiler warnings as soon as we introduce vgic-v3 to the 32-bit world:

    CC arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.o
    arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_mmio_read_v3r_typer':
    arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:184:35: warning: left shift count >= width of type [-Wshift-count-overflow]
    value = (mpidr & GENMASK(23, 0)) << 32;
    In file included from ./include/linux/kernel.h:10:0,
    from ./include/asm-generic/bug.h:13,
    from ./arch/arm/include/asm/bug.h:59,
    from ./include/linux/bug.h:4,
    from ./include/linux/io.h:23,
    from ./arch/arm/include/asm/arch_gicv3.h:23,
    from ./include/linux/irqchip/arm-gic-v3.h:411,
    from arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:14:
    arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c: In function 'vgic_v3_dispatch_sgi':
    ./include/linux/bitops.h:6:24: warning: left shift count >= width of type [-Wshift-count-overflow]
    #define BIT(nr) (1UL << (nr))
    arch/arm/kvm/../../../virt/kvm/arm/vgic/vgic-mmio-v3.c:614:20: note: in expansion of macro 'BIT'
    broadcast = reg & BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
    Let's fix them now.

    Acked-by: Marc Zyngier
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
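
    Both warnings come from shifting a 32-bit-wide value by 32 or more bits.
    One way to avoid them, sketched here with hypothetical toy_* names, is
    to widen to a 64-bit type before shifting. The routing mode bit position
    of 40 and the 24-bit affinity mask mirror the expressions quoted in the
    warnings; whether the patch uses exactly these forms is not stated in
    the message.

```c
#include <stdint.h>

#define TOY_BIT_ULL(nr)        (UINT64_C(1) << (nr))
#define TOY_ROUTING_MODE_BIT   40

/* Place the low 24 affinity bits of mpidr into bits [55:32] of the result. */
static uint64_t toy_typer_affinity(uint32_t mpidr)
{
    /* Cast first: shifting a 32-bit value left by 32 is undefined. */
    return (uint64_t)(mpidr & 0xffffffu) << 32;
}

/* Test bit 40 of a 64-bit register value using a 64-bit mask. */
static int toy_is_broadcast(uint64_t reg)
{
    return (reg & TOY_BIT_ULL(TOY_ROUTING_MODE_BIT)) != 0;
}
```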
  • ITS code is currently guarded by the KVM_ARM_VGIC_V3 config option, which
    was introduced to hide everything specific to vgic-v3 from the 32-bit
    world. We are going to support vgic-v3 in the 32-bit world and
    KVM_ARM_VGIC_V3 will be gone, but we don't have support for ITS there
    yet and we need to continue keeping ITS away.
    Introduce a new config option to prevent ITS code from being built in
    32-bit mode when support for vgic-v3 is done.

    Signed-off-by: Vladimir Murzin
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
  • So we can reuse the code under arch/arm

    Signed-off-by: Vladimir Murzin
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
  • Currently the GIC backend is selected via the alternative framework and
    this is fine. We are going to introduce vgic-v3 to the 32-bit world and
    there we don't have a patching framework at hand, so we can either check
    support for GICv3 every time we need to choose which backend to use or
    try to optimise it by using static keys. The latter looks quite
    promising because we can share the logic involved in selecting the GIC
    backend between architectures if both use static keys.

    This patch moves arm64 from alternative to static keys framework for
    selecting GIC backend. For that we embed static key into vgic_global
    and enable the key during vgic initialisation based on what has
    already been exposed by the host GIC driver.

    Acked-by: Marc Zyngier
    Signed-off-by: Vladimir Murzin
    Signed-off-by: Christoffer Dall

    Vladimir Murzin

16 Sep, 2016

2 commits

  • This commit adds the ability for archs to export
    per-vcpu information via a new per-vcpu dir in
    the VM's debugfs directory.

    If kvm_arch_has_vcpu_debugfs() returns true, then KVM
    will create a vcpu dir for each vCPU in the VM's
    debugfs directory. Then kvm_arch_create_vcpu_debugfs()
    is responsible for populating each vcpu directory
    with arch specific entries.

    The per-vcpu path in debugfs will look like:


    This is all arch specific for now because the only
    user of this interface (x86) wants to export x86-specific
    per-vcpu information to user-space.

    Signed-off-by: Luiz Capitulino
    Signed-off-by: Paolo Bonzini

    Luiz Capitulino
  • This makes it possible to call kvm_destroy_vm_debugfs() from
    kvm_create_vm_debugfs() in error conditions.

    Reviewed-by: Paolo Bonzini
    Signed-off-by: Luiz Capitulino
    Signed-off-by: Paolo Bonzini

    Luiz Capitulino

13 Sep, 2016

1 commit

08 Sep, 2016

13 commits

  • Remove two unnecessary labels now that kvm_timer_hyp_init is not
    creating its own workqueue anymore.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Christoffer Dall

    Paolo Bonzini
  • If, when proxying a GICV access at EL2, we detect that the guest is
    doing something silly, report an EL1 SError instead of ignoring the

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • So far, we've been disabling KVM on systems where the GICV region couldn't
    be safely given to a guest. Now that we're able to handle this access
    safely by emulating it in HYP, we can enable this feature when we detect
    an unsafe configuration.

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • Now that we have the necessary infrastructure to handle MMIO accesses
    in HYP, perform the GICV access on behalf of the guest. This requires
    checking that the access is strictly 32bit, properly aligned, and
    falls within the expected range.

    When all conditions are satisfied, we perform the access and tell
    the rest of the HYP code that the instruction has been correctly

    Reviewed-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
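
    The three checks listed above (strictly 32-bit, properly aligned, within
    the expected range) can be sketched as a single predicate. The function
    name, parameters and the idea of passing the region as base and size are
    hypothetical; the sketch only shows the validation, not the access
    itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a guest access of `len` bytes at `addr` may be proxied
 * to a region starting at `base` and spanning `size` bytes.
 */
static bool toy_gicv_access_ok(uint64_t addr, unsigned int len,
                               uint64_t base, uint64_t size)
{
    if (len != sizeof(uint32_t))   /* strictly 32-bit accesses only */
        return false;
    if (addr & 3)                  /* must be word aligned */
        return false;
    if (addr < base || addr - base >= size)  /* must fall inside the region */
        return false;
    return true;
}
```

    Assuming the base is word aligned and the size is a multiple of four, an
    aligned 32-bit access that starts inside the region also ends inside it,
    so checking the start address is sufficient here.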
  • In order to efficiently perform the GICV access on behalf of the
    guest, we need to be able to avoid going back all the way to
    the host kernel.

    For this, we introduce a new hook in the world switch code,
    conveniently placed just after populating the fault info.
    At that point, we only have saved/restored the GP registers,
    and we can quickly perform all the required checks (data abort,
    translation fault, valid faulting syndrome, not an external
    abort, not a PTW).

    Coming back from the emulation code, we need to skip the emulated
    instruction. This involves an additional bit of save/restore in
    order to be able to access the guest's PC (and possibly CPSR if
    this is a 32bit guest).

    At this stage, no emulation code is provided.

    Signed-off-by: Marc Zyngier
    Reviewed-by: Christoffer Dall
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • As we plan to do some emulation at HYP, let's make kvm_skip_instr32
    part of the hyp_text section. This doesn't preclude the kernel
    from using it.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • Add the bit of glue and const-ification that is required to use
    the code inherited from the arm64 port, and move over to it.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • It would make some sense to share the conditional execution code
    between 32 and 64bit. In order to achieve this, let's move that
    code to virt/kvm/arm/aarch32.c. While we're at it, drop a
    superfluous BUG_ON() that wasn't that useful.

    Following patches will migrate the 32bit port to that code base.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • As kvm_set_routing_entry() was changing prototype between 4.7 and 4.8,
    an ugly hack was put in place in order to survive both building in
    -next and the merge window.

    Now that everything has been merged, let's dump the compatibility
    hack for good.

    Signed-off-by: Marc Zyngier
    Reviewed-by: Eric Auger
    Signed-off-by: Christoffer Dall

    Marc Zyngier
  • Just a rename so we can implement a v3-specific function later.

    We take the chance to get rid of the V2/V3 ops comments as well.

    No functional change.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Eric Auger

    Christoffer Dall
  • As we are about to deal with multiple data types and situations where
    the vgic should not be initialized when doing userspace accesses on the
    register attributes, factor out the functionality of
    vgic_attr_regs_access into smaller bits which can be reused by a new
    function later.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Eric Auger

    Christoffer Dall
  • vms and vcpus have statistics associated with them which can be viewed
    within the debugfs. Currently it is assumed within the vcpu_stat_get() and
    vm_stat_get() functions that all of these statistics are represented as
    u32s, however the next patch adds some u64 vcpu statistics.

    Change all vcpu statistics to u64 and modify vcpu_stat_get() accordingly.
    Since vcpu statistics are per vcpu, they will only be updated by a single
    vcpu at a time so this shouldn't present a problem on 32-bit machines
    which can't atomically increment 64-bit numbers. However vm statistics
    could potentially be updated by multiple vcpus from that vm at a time.
    To avoid the overhead of atomics make all vm statistics ulong such that
    they are 64-bit on 64-bit systems where they can be atomically incremented
    and are 32-bit on 32-bit systems which may not be able to atomically
    increment 64-bit numbers. Modify vm_stat_get() to expect ulongs.

    Signed-off-by: Suraj Jitindar Singh
    Reviewed-by: David Matlack
    Acked-by: Christian Borntraeger
    Signed-off-by: Paul Mackerras

    Suraj Jitindar Singh
  • The workqueue "irqfd_cleanup_wq" queues a single work item
    &irqfd->shutdown and hence doesn't require ordering. It is a host-wide
    workqueue for issuing deferred shutdown requests aggregated from all
    vm* instances. It is not being used on a memory reclaim path.
    Hence, it has been converted to use system_wq.
    The work item has been flushed in kvm_irqfd_release().

    The workqueue "wqueue" queues a single work item &timer->expired
    and hence doesn't require ordering. Also, it is not being used on
    a memory reclaim path. Hence, it has been converted to use system_wq.

    System workqueues have been able to handle a high level of concurrency
    for a long time now and hence it's not required to have a singlethreaded
    workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue
    created with create_singlethread_workqueue(), system_wq allows multiple
    work items to overlap executions even on the same CPU; however, a
    per-cpu workqueue doesn't have any CPU locality or global ordering
    guarantee unless the target CPU is explicitly specified and thus the
    increase of local concurrency shouldn't make any difference.

    Signed-off-by: Bhaktipriya Shridhar
    Signed-off-by: Paolo Bonzini

    Bhaktipriya Shridhar