12 Jan, 2011

28 commits

  • Since vmx blocks INIT signals, we disable virtualization extensions during
    reboot. This leads to virtualization instructions faulting; we trap these
    faults and spin while the reboot continues.

    Unfortunately spinning on a non-preemptible kernel may block a task that
    reboot depends on; this causes the reboot to hang.

    Fix by skipping over the instruction and hoping for the best.

    Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Quote from Avi:
    | I don't think we need to flush immediately; set a "tlb dirty" bit somewhere
    | that is cleared when we flush the tlb. kvm_mmu_notifier_invalidate_page()
    | can consult the bit and force a flush if set.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
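
    A minimal sketch of the deferred-flush idea Avi describes above. The
    struct and function names are illustrative, not the actual kvm code:

    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct demo_mmu {
            spinlock_t lock;
            bool tlbs_dirty;        /* sptes were zapped without a remote flush */
    };

    static void demo_flush_remote_tlbs(struct demo_mmu *mmu)
    {
            /* would call kvm_flush_remote_tlbs() in real code */
            mmu->tlbs_dirty = false;
    }

    /* zap path: record that a flush is pending instead of flushing now */
    static void demo_zap_spte(struct demo_mmu *mmu)
    {
            /* ... zap the spte ... */
            mmu->tlbs_dirty = true;
    }

    /* mmu notifier: consult the bit and force a flush only if needed */
    static void demo_invalidate_page(struct demo_mmu *mmu)
    {
            spin_lock(&mmu->lock);
            if (mmu->tlbs_dirty)
                    demo_flush_remote_tlbs(mmu);
            spin_unlock(&mmu->lock);
    }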
     
  • Store irq routing table pointer in the irqfd object,
    and use that to inject MSI directly without bouncing out to
    a kernel thread.

    While we touch this structure, rearrange irqfd fields to make fastpath
    better packed for better cache utilization.

    This also adds some comments about locking rules and rcu usage in code.

    Some notes on the design:
    - Use pointer into the rt instead of copying an entry,
    to make it possible to use rcu, thus side-stepping
    locking complexities. We also save some memory this way.
    - Old workqueue code is still used for level irqs.
    I don't think we DTRT with level interrupts anyway; however, it seems
    easier to keep that code around, as it has been thought through and
    debugged, and to fix level handling later than to rip the code out and
    re-instate it later. (A sketch of the MSI fast path follows this entry.)

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Marcelo Tosatti
    Acked-by: Gregory Haskins
    Signed-off-by: Avi Kivity

    Michael S. Tsirkin
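
    A sketch of that fast path, assuming illustrative demo_* names and
    omitting error handling; it is not the actual kvm irqfd code:

    #include <linux/rcupdate.h>
    #include <linux/types.h>
    #include <linux/workqueue.h>

    struct demo_route {
            u32 msi_data;                         /* stand-in for the MSI payload */
    };

    struct demo_irqfd {
            struct demo_route __rcu *irq_entry;   /* points into the routing table */
            struct work_struct inject;            /* legacy slow path */
    };

    static void demo_inject_msi(struct demo_route *rt)
    {
            /* would program the MSI into the guest's interrupt controller */
    }

    /* eventfd wakeup: inject directly under rcu_read_lock() instead of
     * bouncing out to a kernel thread */
    static void demo_irqfd_wakeup(struct demo_irqfd *irqfd)
    {
            struct demo_route *rt;

            rcu_read_lock();
            rt = rcu_dereference(irqfd->irq_entry);
            if (rt)
                    demo_inject_msi(rt);
            else
                    schedule_work(&irqfd->inject); /* no route: take the slow path */
            rcu_read_unlock();
    }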
     
  • The naming convention of the hardware_[dis|en]able family is a little bit
    confusing because only hardware_[dis|en]able_all are using the _nolock
    suffix.

    Renaming the current hardware_[dis|en]able() to *_nolock() and making
    hardware_[dis|en]able() wrapper functions which take kvm_lock for them
    reduces the confusion. (A sketch of the resulting shape follows this
    entry.)

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
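
    The resulting shape, roughly (a sketch; the real functions and kvm_lock
    live in virt/kvm/kvm_main.c):

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(kvm_lock);

    static void hardware_enable_nolock(void *junk)
    {
            /* per-cpu enable; callers must hold kvm_lock */
    }

    static void hardware_enable(void *junk)
    {
            spin_lock(&kvm_lock);
            hardware_enable_nolock(junk);
            spin_unlock(&kvm_lock);
    }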
     
  • In kvm_cpu_hotplug(), only the CPU_STARTING case is protected by kvm_lock.
    This patch adds the missing protection for the CPU_DYING case.

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
     
  • Any arch not supporting device assignment will also not build
    assigned-dev.c, so testing for KVM_CAP_DEVICE_DEASSIGNMENT is pointless;
    KVM_CAP_ASSIGN_DEV_IRQ is unconditionally set. Moreover, add a default
    case for dispatching the ioctl.

    Acked-by: Alex Williamson
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     
  • The guest may change states that pci_reset_function does not touch, so
    we had better save/restore the assigned device state across guest usage.

    Acked-by: Alex Williamson
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     
  • Cosmetic change, but it helps to correlate IRQs with PCI devices.

    Acked-by: Alex Williamson
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     
  • This improves the IRQ forwarding for assigned devices: by using the
    kernel's threaded IRQ scheme, we can get rid of the latency-prone work
    queue and simplify the code at the same time.

    Moreover, we no longer have to hold assigned_dev_lock while raising the
    guest IRQ, which can be a lengthy operation as we may have to iterate
    over all VCPUs. The lock is now only used for synchronizing masking vs.
    unmasking of INTx-type IRQs, and is thus renamed to intx_lock. (A sketch
    of the threaded-IRQ shape follows this entry.)

    Acked-by: Alex Williamson
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
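
    Roughly what the threaded-IRQ shape looks like (a sketch with illustrative
    demo_* names; the real code is in virt/kvm/assigned-dev.c):

    #include <linux/interrupt.h>

    struct demo_assigned_dev {
            int guest_irq;
    };

    static void demo_set_guest_irq(struct demo_assigned_dev *adev)
    {
            /* would raise adev->guest_irq, possibly iterating over all VCPUs */
    }

    /* runs in a kernel thread, so raising the guest IRQ no longer needs a
     * workqueue and no longer happens under assigned_dev_lock */
    static irqreturn_t demo_intx_thread_fn(int irq, void *dev_id)
    {
            demo_set_guest_irq(dev_id);
            return IRQ_HANDLED;
    }

    /* no hard-irq handler; IRQF_ONESHOT keeps the line masked until the
     * thread has run */
    static int demo_enable_intx(struct demo_assigned_dev *adev, int host_irq)
    {
            return request_threaded_irq(host_irq, NULL, demo_intx_thread_fn,
                                        IRQF_ONESHOT, "demo-assigned-dev", adev);
    }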
     
  • When we deassign a guest IRQ, clear the potentially asserted guest line.
    There might be no chance for the guest to do this, specifically if we
    switch from INTx to MSI mode.

    Acked-by: Alex Williamson
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     
  • IA64 support forces us to abstract the allocation of the kvm structure.
    But instead of mixing this up with arch-specific initialization and
    doing the same on destruction, split both steps. This allows generic
    destruction calls to be moved into generic code.

    It also fixes error clean-up on kvm_create_vm failures for IA64.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
     
  • Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
     
  • In kvm_async_pf_wakeup_all(), we add a dummy apf to vcpu->async_pf.done
    without holding vcpu->async_pf.lock; this will break if we are handling
    apfs at the same time.

    Also use 'list_empty_careful()' instead of 'list_empty()'. (A sketch of
    the locking fix follows this entry.)

    Signed-off-by: Xiao Guangrong
    Acked-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
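
    The locking fix, in outline (a sketch with illustrative demo_* types, not
    the actual kvm async_pf code):

    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct demo_async_pf {
            struct list_head link;
    };

    struct demo_apf_queue {
            spinlock_t lock;
            struct list_head done;
    };

    /* insert the dummy apf only while holding the lock, so a concurrent
     * completion handler cannot observe a half-linked entry */
    static void demo_wakeup_all(struct demo_apf_queue *q,
                                struct demo_async_pf *dummy)
    {
            spin_lock(&q->lock);
            list_add_tail(&dummy->link, &q->done);
            spin_unlock(&q->lock);
    }

    /* lockless peeks at the list use the conservative helper */
    static bool demo_queue_has_work(struct demo_apf_queue *q)
    {
            return !list_empty_careful(&q->done);
    }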
     
  • If there is no need to inject an async #PF into the PV guest, we can
    handle more completed apfs at one time, so the guest #PF can be retried
    as early as possible.

    Signed-off-by: Xiao Guangrong
    Acked-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
     
  • Let's use newly introduced vzalloc().

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Jesper Juhl
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
     
  • Fixes this:

    CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
    arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_dev_ioctl_create_vm':
    arch/s390/kvm/../../../virt/kvm/kvm_main.c:1828:10: warning: unused variable 'r'

    Signed-off-by: Heiko Carstens
    Signed-off-by: Marcelo Tosatti

    Heiko Carstens
     
  • Fixes this:

    CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
    arch/s390/kvm/../../../virt/kvm/kvm_main.c: In function 'kvm_clear_guest_page':
    arch/s390/kvm/../../../virt/kvm/kvm_main.c:1224:2: warning: passing argument 3 of 'kvm_write_guest_page' makes pointer from integer without a cast
    arch/s390/kvm/../../../virt/kvm/kvm_main.c:1185:5: note: expected 'const void *' but argument is of type 'long unsigned int'

    Signed-off-by: Heiko Carstens
    Signed-off-by: Marcelo Tosatti

    Heiko Carstens
     
  • Currently we are using vmalloc() for all dirty bitmaps, even if they are
    small enough, say less than a few kilobytes.

    We use kmalloc() if the dirty bitmap size is less than or equal to
    PAGE_SIZE, so that we can avoid vmalloc area usage for VGA.

    This will also make starting/stopping logging faster. (A sketch of the
    allocation policy follows this entry.)

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
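
    The allocation policy, as a sketch (illustrative helper names; the real
    code sits in virt/kvm/kvm_main.c):

    #include <linux/mm.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>

    static void *demo_alloc_dirty_bitmap(unsigned long bytes)
    {
            /* small bitmaps, e.g. a VGA slot, stay out of vmalloc space */
            if (bytes <= PAGE_SIZE)
                    return kzalloc(bytes, GFP_KERNEL);
            return vzalloc(bytes);
    }

    static void demo_free_dirty_bitmap(void *bitmap, unsigned long bytes)
    {
            if (bytes <= PAGE_SIZE)
                    kfree(bitmap);
            else
                    vfree(bitmap);
    }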
     
  • Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate, via
    vmalloc(), a bitmap to be used in the next round of logging, and this has
    been having a bad effect on VGA and live migration: vmalloc() consumes
    extra system time, triggers TLB flushes, etc.

    This patch resolves the issue by pre-allocating one more bitmap and
    switching between the two bitmaps during dirty logging. (A sketch of the
    switch follows this entry.)

    Performance improvement:
    I measured performance for the VGA update case with trace-cmd.
    The result was 1.5 times faster than the original.

    In the case of live migration, the improvement ratio depends on the
    workload and the guest memory size. In general, the larger the memory
    size, the more benefit we get.

    Note:
    This does not change other architectures' logic, but the allocation size
    is doubled. This will increase the actual memory consumption only when the
    new size changes the number of pages allocated by vmalloc().

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
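
    A sketch of the bitmap switch (illustrative; the real switch happens in
    kvm_vm_ioctl_get_dirty_log() and the x86 dirty-log path):

    #include <linux/string.h>

    struct demo_memslot {
            unsigned long *dirty_bitmap;        /* half currently written by vcpus */
            unsigned long *dirty_bitmap_head;   /* start of the doubled allocation */
            unsigned long  nbytes;              /* size of one bitmap in bytes */
    };

    /* GET_DIRTY_LOG: hand the active half to userspace and switch to the
     * pre-allocated spare instead of vmalloc()ing a fresh bitmap each time */
    static unsigned long *demo_switch_dirty_bitmap(struct demo_memslot *slot)
    {
            unsigned long *old = slot->dirty_bitmap;

            if (old == slot->dirty_bitmap_head)
                    slot->dirty_bitmap = (unsigned long *)
                            ((char *)slot->dirty_bitmap_head + slot->nbytes);
            else
                    slot->dirty_bitmap = slot->dirty_bitmap_head;

            memset(slot->dirty_bitmap, 0, slot->nbytes);
            return old;                         /* caller copies this half out */
    }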
     
  • This makes it easy to change the way of allocating/freeing dirty bitmaps.

    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Marcelo Tosatti

    Takuya Yoshikawa
     
  • Add tracepoint for userspace exit.

    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • As suggested by Andrea, pass r/w error code to gup(), upgrading read fault
    to writable if host pte allows it.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Improve vma handling code readability in hva_to_pfn() and fix
    async pf handling code to properly check vma returned by find_vma().

    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • Send an async page fault to a PV guest if it accesses swapped-out memory.
    The guest will choose another task to run upon receiving the fault.

    Allow async page fault injection only when the guest is in user mode,
    since otherwise the guest may be in a non-sleepable context and will not
    be able to reschedule.

    The vcpu will be halted if the guest faults on the same page again or if
    the vcpu is executing kernel code.

    Acked-by: Rik van Riel
    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • Guest enables async PF vcpu functionality using this MSR.

    Reviewed-by: Rik van Riel
    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • Keep track of memslots changes by keeping a generation number in the
    memslots structure. Provide a kvm_write_guest_cached() function that skips
    the gfn_to_hva() translation if memslots have not changed since the
    previous invocation. (A sketch of the cached path follows this entry.)

    Acked-by: Rik van Riel
    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
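
    Conceptually (a sketch with illustrative demo_* names; the real interface
    added here is kvm_write_guest_cached() with a gfn_to_hva cache):

    #include <linux/errno.h>
    #include <linux/types.h>
    #include <linux/uaccess.h>

    struct demo_kvm {
            u64 memslots_generation;    /* bumped whenever memslots change */
    };

    struct demo_hva_cache {
            u64 generation;             /* generation the hva was computed for */
            unsigned long hva;          /* cached host virtual address */
    };

    static void demo_refresh_cache(struct demo_kvm *kvm, struct demo_hva_cache *c)
    {
            /* would redo the slow gfn_to_hva() walk and record the generation */
            c->generation = kvm->memslots_generation;
    }

    /* skip the gfn_to_hva() walk when the memslots have not changed */
    static int demo_write_guest_cached(struct demo_kvm *kvm,
                                       struct demo_hva_cache *c,
                                       const void *data, unsigned long len)
    {
            if (c->generation != kvm->memslots_generation)
                    demo_refresh_cache(kvm, c);

            if (copy_to_user((void __user *)c->hva, data, len))
                    return -EFAULT;
            return 0;
    }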
     
  • When a page is swapped in, it is mapped into guest memory only after the
    guest tries to access it again and generates another fault. To save this
    fault we can map it immediately, since we know the guest is going to
    access the page. Do it only when tdp is enabled for now; the shadow paging
    case is more complicated, as the CR[034] and EFER registers would have to
    be switched before doing the mapping and then switched back.

    Acked-by: Rik van Riel
    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • If a guest accesses swapped-out memory, do not swap it in from vcpu thread
    context. Schedule work to do the swapping and put the vcpu into a halted
    state instead.

    Interrupts will still be delivered to the guest, and if an interrupt
    causes a reschedule, the guest will continue running another task.

    [avi: remove call to get_user_pages_noio(), nacked by Linus; this
    makes everything synchronous again]

    Acked-by: Rik van Riel
    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     

25 Oct, 2010

1 commit

  • * 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
    KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
    KVM: Fix signature of kvm_iommu_map_pages stub
    KVM: MCE: Send SRAR SIGBUS directly
    KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
    KVM: fix typo in copyright notice
    KVM: Disable interrupts around get_kernel_ns()
    KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
    KVM: MMU: move access code parsing to FNAME(walk_addr) function
    KVM: MMU: audit: check whether have unsync sps after root sync
    KVM: MMU: audit: introduce audit_printk to cleanup audit code
    KVM: MMU: audit: unregister audit tracepoints before module unloaded
    KVM: MMU: audit: fix vcpu's spte walking
    KVM: MMU: set access bit for direct mapping
    KVM: MMU: cleanup for error mask set while walk guest page table
    KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
    KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
    KVM: x86: Fix constant type in kvm_get_time_scale
    KVM: VMX: Add AX to list of registers clobbered by guest switch
    KVM guest: Move a printk that's using the clock before it's ready
    KVM: x86: TSC catchup mode
    ...

    Linus Torvalds
     

24 Oct, 2010

7 commits

  • We also have to call kvm_iommu_map_pages for CONFIG_AMD_IOMMU, so drop
    the dependency on the Intel IOMMU; kvm_iommu_map_pages will be a nop
    anyway if CONFIG_IOMMU_API is not defined.

    KVM-Stable-Tag.
    Signed-off-by: Jan Kiszka
    Signed-off-by: Marcelo Tosatti

    Jan Kiszka
     
  • Fix typo in copyright notice.

    Signed-off-by: Nicolas Kaiser
    Signed-off-by: Marcelo Tosatti

    Nicolas Kaiser
     
  • It doesn't really matter, but if we spin, we should spin in a more relaxed
    manner. This way, if something goes wrong at least it won't contribute to
    global warming.

    Signed-off-by: Avi Kivity
    Signed-off-by: Marcelo Tosatti

    Avi Kivity
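
    The "relaxed" spin above amounts to something like (sketch):

    #include <asm/processor.h>

    static void demo_spin_until(volatile int *done)
    {
            while (!*done)
                    cpu_relax();    /* rep; nop on x86: cooler, SMT-friendlier */
    }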
     
  • There is a bug in this function: we call gfn_to_pfn() and
    kvm_mmu_gva_to_gpa_read() in atomic context (kvm_mmu_audit() is called
    under the protection of the mmu_lock spinlock).

    This patch fixes it by:
    - introducing gfn_to_pfn_atomic instead of gfn_to_pfn
    - getting the mapped gfn from kvm_mmu_page_get_gfn()

    It also adds a 'notrap' pte check for unsync/direct sps.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Introduce this function to get the pages of consecutive gfns; it can
    reduce gup overhead and will be used by a later patch.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
     
  • Introduce hva_to_pfn_atomic(); it is the fast path and can be used in
    atomic context. A later patch will use it. (A sketch follows this entry.)

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
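
    A sketch of the atomic fast path (illustrative; it only attempts the
    lockless __get_user_pages_fast(), so it never sleeps and is safe under
    a spinlock):

    #include <linux/mm.h>

    /* return the pfn backing addr, or 0 as a stand-in error marker */
    static unsigned long demo_hva_to_pfn_atomic(unsigned long addr)
    {
            struct page *page;

            if (__get_user_pages_fast(addr, 1, 1, &page) == 1)
                    return page_to_pfn(page);

            return 0;   /* real code falls back to the sleeping slow path */
    }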
     
  • If there are active VCPUs which are marked as belonging to
    a particular hardware CPU, request a clock sync for them when
    enabling hardware; the TSC could be desynchronized on a newly
    arriving CPU, and we need to recompute guest system time
    relative to boot after a suspend event.

    This covers both cases.

    Note that it is acceptable to take the spinlock, as either
    no other tasks will be running and no locks held (BSP after
    resume), or other tasks will be guaranteed to drop the lock
    relatively quickly (AP on CPU_STARTING).

    Noting we now get clock synchronization requests for VCPUs
    which are starting up (or restarting), it is tempting to
    attempt to remove the arch/x86/kvm/x86.c CPU hot-notifiers
    at this time, however it is not correct to do so; they are
    required for systems with non-constant TSC as the frequency
    may not be known immediately after the processor has started
    until the cpufreq driver has had a chance to run and query
    the chipset.

    Updated: implement better locking semantics for hardware_enable

    Removed the hack of dropping and retaking the lock by adding the
    semantic that we always hold kvm_lock when hardware_enable is
    called. The one place that doesn't need to worry about it is
    resume, as resuming a frozen CPU, the spinlock won't be taken.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Marcelo Tosatti

    Zachary Amsden
     

23 Oct, 2010

1 commit

  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek. (An
    illustration of the resulting convention follows after the semantic
    patch.)

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
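
    To make the convention concrete, this is roughly what an affected driver
    ends up looking like (an illustration, not part of the patch): a chardev
    that opens with nonseekable_open() also declares .llseek = no_llseek.

    #include <linux/fs.h>
    #include <linux/module.h>

    static int demo_open(struct inode *inode, struct file *file)
    {
            return nonseekable_open(inode, file);   /* offsets are meaningless */
    }

    static ssize_t demo_read(struct file *file, char __user *buf,
                             size_t len, loff_t *ppos)
    {
            return 0;                               /* never touches f_pos */
    }

    static const struct file_operations demo_fops = {
            .owner  = THIS_MODULE,
            .open   = demo_open,
            .read   = demo_read,
            .llseek = no_llseek,    /* what the semantic patch adds explicitly */
    };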
     

23 Sep, 2010

2 commits

  • When we reboot, we disable vmx extensions, as otherwise INIT gets blocked.
    If a task on another cpu hits a vmx instruction, it will fault if vmx is
    disabled. We trap that to avoid a nasty oops and spin until the reboot
    completes.

    Problem is, we spin with interrupts disabled. This blocks smp_send_stop()
    from running, and the reboot process halts.

    Fix by enabling interrupts before spinning. (A sketch follows this entry.)

    KVM-Stable-Tag.
    Signed-off-by: Avi Kivity
    Signed-off-by: Marcelo Tosatti

    Avi Kivity
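
    In outline, the tail of the fault handler becomes (sketch):

    #include <linux/irqflags.h>
    #include <asm/processor.h>

    static void demo_spin_for_reboot(void)
    {
            /* we faulted on a vmx instruction because vmx was already disabled
             * for reboot; re-enable interrupts so IPIs such as smp_send_stop()
             * can still be serviced, then wait for the machine to go down */
            local_irq_enable();
            for (;;)
                    cpu_relax();
    }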
     
  • I think I see the following (theoretical) race:

    During irqfd assign, we drop the irqfds lock before we schedule the inject
    work. Therefore, a deassign running on another CPU could cause shutdown
    and flush to run before inject, causing a use after free in inject.

    A simple fix is to schedule inject under the lock. (A sketch follows this
    entry.)

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Gregory Haskins
    Signed-off-by: Marcelo Tosatti

    Michael S. Tsirkin
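
    A sketch of the fix (illustrative demo_* names; the real code is the
    irqfd assign path in virt/kvm/eventfd.c):

    #include <linux/list.h>
    #include <linux/poll.h>
    #include <linux/spinlock.h>
    #include <linux/workqueue.h>

    struct demo_irqfd {
            struct list_head list;
            struct work_struct inject;
    };

    struct demo_irqfds {
            spinlock_t lock;
            struct list_head items;
    };

    /* assign path: queue the inject work while still holding irqfds.lock,
     * so a concurrent deassign cannot run shutdown/flush ahead of it */
    static void demo_irqfd_assign_tail(struct demo_irqfds *irqfds,
                                       struct demo_irqfd *irqfd,
                                       unsigned int events)
    {
            spin_lock_irq(&irqfds->lock);
            list_add_tail(&irqfd->list, &irqfds->items);
            if (events & POLLIN)
                    schedule_work(&irqfd->inject);
            spin_unlock_irq(&irqfds->lock);
    }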