07 Jun, 2008

1 commit

  • There's a bug in the IOAPIC code for level-triggered interrupts. It's
    relatively easy to trigger by sharing an interrupt line (virtio-blk +
    usbtablet was the test case, initially reported by Gerd von Egidy).

    The "remote_irr" variable is used to indicate accepted but not yet acked
    interrupts. It's cleared from the EOI handler.

    Problem is that the EOI handler clears remote_irr unconditionally, even
    if it reinjected another pending interrupt.

    In that case, kvm_ioapic_set_irq() proceeds to ioapic_service() which
    sets remote_irr even if it failed to inject (since the IRR was high due
    to EOI reinjection).

    Since the TMR bit has been cleared by the first EOI, the second one
    fails to clear remote_irr.

    The end result is a dead interrupt line.

    Fix it by setting remote_irr only if a new pending interrupt has been
    generated (and the TMR bit for the vector in question has been set).

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
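
A minimal user-space sketch of the fix described above, not the kernel code. The struct `pin_model` and the functions `model_inject()` and `model_service()` are hypothetical names invented for illustration; the model keeps only the three bits the message talks about and assumes injection fails when the vector is already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of one level-triggered pin. */
struct pin_model {
    bool remote_irr; /* accepted but not yet acked */
    bool lapic_irr;  /* vector already pending in the local APIC */
    bool tmr;        /* trigger-mode bit for the vector */
};

/* Returns true only if a new pending interrupt was generated. */
static bool model_inject(struct pin_model *p)
{
    if (p->lapic_irr)
        return false;   /* already pending, e.g. after EOI reinjection */
    p->lapic_irr = true;
    p->tmr = true;
    return true;
}

/*
 * fixed == false: old behaviour, remote_irr is set unconditionally.
 * fixed == true:  remote_irr is set only if injection succeeded.
 */
static bool model_service(struct pin_model *p, bool fixed)
{
    bool injected;

    assert(p != NULL);
    injected = model_inject(p);
    if (!fixed || injected)
        p->remote_irr = true;
    return injected;
}
```

With the vector already pending, the old variant leaves remote_irr set although nothing new was injected, which is the state the message describes as never being cleared again; the fixed variant leaves it untouched.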
     

18 May, 2008

1 commit

  • There's still a race in kvm_vcpu_block(), if a wake_up_interruptible()
    call happens before the task state is set to TASK_INTERRUPTIBLE:

    CPU0                                    CPU1

    kvm_vcpu_block

    add_wait_queue

    kvm_cpu_has_interrupt = 0
                                            set interrupt
                                            if (waitqueue_active())
                                                wake_up_interruptible()

    kvm_cpu_has_pending_timer
    kvm_arch_vcpu_runnable
    signal_pending

    set_current_state(TASK_INTERRUPTIBLE)
    schedule()

    This can be fixed by using prepare_to_wait(), which sets the task state
    before testing for the wait condition.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
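
The ordering problem above can be replayed deterministically in user space. This is a single-threaded sketch under stated assumptions, not the kernel implementation: `vcpu_model`, `model_wake()`, `model_schedule()` and the two `*_block_sleeps()` functions are hypothetical names, and the waker is simply called at the point where CPU1 runs in the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE };

struct vcpu_model {
    enum model_state state;
    bool on_waitqueue;
    bool interrupt_pending;
};

/* CPU1: post the interrupt, then wake a queued waiter by making it runnable. */
static void model_wake(struct vcpu_model *v)
{
    assert(v != NULL);
    v->interrupt_pending = true;
    if (v->on_waitqueue)
        v->state = MODEL_RUNNING;
}

/* The task really sleeps only if it is still marked interruptible. */
static bool model_schedule(const struct vcpu_model *v)
{
    return v->state == MODEL_INTERRUPTIBLE;
}

/* Old order: queue, test condition, (wake arrives), set state, schedule. */
static bool racy_block_sleeps(void)
{
    struct vcpu_model v = { MODEL_RUNNING, false, false };
    bool seen;

    v.on_waitqueue = true;           /* add_wait_queue */
    seen = v.interrupt_pending;      /* condition is still false */
    model_wake(&v);                  /* CPU1 runs inside the window */
    if (seen)
        return false;
    v.state = MODEL_INTERRUPTIBLE;   /* overwrites the effect of the wake-up */
    return model_schedule(&v);
}

/* New order: queue and set state first, then test the condition. */
static bool fixed_block_sleeps(void)
{
    struct vcpu_model v = { MODEL_RUNNING, false, false };
    bool seen;

    v.on_waitqueue = true;
    v.state = MODEL_INTERRUPTIBLE;   /* what prepare_to_wait() does up front */
    seen = v.interrupt_pending;
    model_wake(&v);                  /* same window, state goes back to running */
    if (seen)
        return false;
    return model_schedule(&v);
}
```

In the old order the model goes to sleep with an interrupt pending; in the new order the wake-up lands after the state was set, so the schedule step returns immediately.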
     

04 May, 2008

1 commit


02 May, 2008

1 commit

  • a) none of the callers even looks at inode or file returned by anon_inode_getfd()
    b) any caller that would try to look at those would be racy, since by the time
    it returns we might have raced with close() from another thread and that
    file would be pining for fjords.

    Signed-off-by: Al Viro

    Al Viro
     

27 Apr, 2008

14 commits

  • Use kvm's own refcounting instead of playing with ->filp->f_count.
    That will allow us to get rid of a lot of crap in anon_inode_getfd() and
    kill a race in kvm_dev_ioctl_create_vm() (file might have been closed
    immediately by another thread, so ->filp might point to already freed
    struct file when we get around to setting it).

    Signed-off-by: Al Viro
    Signed-off-by: Avi Kivity

    Al Viro
     
  • It's a globally exported symbol now.

    Signed-off-by: Hollis Blanchard
    Signed-off-by: Avi Kivity

    Hollis Blanchard
     
  • So userspace can save/restore the mpstate during migration.

    [avi: export the #define constants describing the value]
    [christian: add s390 stubs]
    [avi: ditto for ia64]

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Timers that fire between guest hlt and vcpu_block's add_wait_queue() are
    ignored, possibly resulting in hangs.

    Also make sure that atomic_inc and waitqueue_active tests happen in the
    specified order, otherwise the following race is open:

    CPU0                                    CPU1
                                            if (waitqueue_active(wq))
    add_wait_queue()
    if (!atomic_read(pit_timer->pending))
        schedule()
                                            atomic_inc(pit_timer->pending)

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • This interface allows a user space application to read the trace of
    kvm-related events through relayfs.

    Signed-off-by: Feng (Eric) Liu
    Signed-off-by: Avi Kivity

    Feng (Eric) Liu
     
  • This patch introduces a gfn_to_pfn() function and corresponding functions like
    kvm_release_pfn_dirty(). Using these new functions, we can modify the x86
    MMU to no longer assume that it can always get a struct page for any given gfn.

    We don't want to eliminate gfn_to_page() entirely because a number of places
    assume they can do gfn_to_page() and then kmap() the results. When we support
    IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will
    succeed.

    This does not implement support for avoiding reference counting for reserved
    RAM or for IO memory. However, it should make those things pretty
    straightforward.

    Since we're only introducing new common symbols, I don't think it will break
    the non-x86 architectures but I haven't tested those. I've tested Intel,
    AMD, NPT, and hugetlbfs with Windows and Linux guests.

    [avi: fix overflow when shifting left pfns by adding casts]

    Signed-off-by: Anthony Liguori
    Signed-off-by: Avi Kivity

    Anthony Liguori
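
The "[avi: ...]" note above mentions an overflow when shifting pfns left. The message does not say which expressions were affected, so the following is only a sketch of that general class of bug, assuming a 32-bit frame number type and a 12-bit page shift; `model_pfn_t`, `MODEL_PAGE_SHIFT` and both helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12

/* Stands in for a frame number held in a 32-bit unsigned type. */
typedef uint32_t model_pfn_t;

/* The shift is done in 32 bits, so the high bits are lost before widening. */
static uint64_t pfn_to_phys_truncating(model_pfn_t pfn)
{
    return (model_pfn_t)(pfn << MODEL_PAGE_SHIFT);
}

/* Casting first makes the shift happen in 64 bits. */
static uint64_t pfn_to_phys_widened(model_pfn_t pfn)
{
    return (uint64_t)pfn << MODEL_PAGE_SHIFT;
}

/* Frame 0x100000 sits exactly at the 4 GiB boundary. */
static void shift_demo(void)
{
    assert(pfn_to_phys_truncating(0x100000u) == 0);
    assert(pfn_to_phys_widened(0x100000u) == 0x100000000ULL);
}
```

For frames below the 4 GiB boundary both helpers agree, which is why a bug of this kind can go unnoticed on small guests.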
     
  • The main purpose of adding these functions is the ability to release the
    spinlock that protects the kvm list while still being able to do
    operations on a specific kvm in a safe way.

    Signed-off-by: Izik Eidus
    Signed-off-by: Avi Kivity

    Izik Eidus
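
The message above does not name the functions it adds, so here is only a user-space sketch of the get/put pattern it describes, using C11 atomics. `vm_model`, `vm_model_create()`, `vm_model_get()` and `vm_model_put()` are hypothetical names, and the `destroyed` flag exists purely so the caller can observe teardown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct vm_model {
    atomic_int users;  /* number of holders keeping the object alive */
    bool *destroyed;   /* hypothetical hook: set when the object is freed */
};

/* The creator starts out holding one reference. */
static struct vm_model *vm_model_create(bool *destroyed_flag)
{
    struct vm_model *vm = malloc(sizeof(*vm));

    if (!vm)
        return NULL;
    atomic_init(&vm->users, 1);
    vm->destroyed = destroyed_flag;
    *destroyed_flag = false;
    return vm;
}

/* Take a reference, typically while the list lock is still held. */
static void vm_model_get(struct vm_model *vm)
{
    atomic_fetch_add(&vm->users, 1);
}

/* Drop a reference; the last holder tears the object down. */
static void vm_model_put(struct vm_model *vm)
{
    int old = atomic_fetch_sub(&vm->users, 1);

    assert(old > 0);
    if (old == 1) {
        *vm->destroyed = true;
        free(vm);
    }
}
```

A list walker would take a reference under the lock, drop the lock, work on the object, and then put the reference; the object cannot disappear in between even if the creator drops its own reference.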
     
  • Since the size of kvm_regs is too big to allocate from kernel stack on ia64,
    use kzalloc to allocate it.

    Signed-off-by: Xiantao Zhang
    Signed-off-by: Avi Kivity

    Xiantao Zhang
     
  • Create large page mappings if the guest PTEs are marked as such and
    the underlying memory is hugetlbfs-backed. If the large page contains
    write-protected pages, a large pte is not used.

    Gives a consistent 2% improvement for data copies on a RAM-mounted
    filesystem, without NPT/EPT.

    Anthony measures a 4% improvement on 4-way kernbench, with NPT.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Mark zapped root pagetables as invalid and ignore such pages during lookup.

    This is a problem with the cr3-target feature, where a zapped root table fools
    the faulting code into creating a read-only mapping. The result is a lockup
    if the instruction can't be emulated.

    Signed-off-by: Marcelo Tosatti
    Cc: Anthony Liguori
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • With CONFIG_PREEMPT=n, this is needed in order to prevent the fault-in
    code from sleeping.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Avi Kivity

    Andrea Arcangeli
     
  • The second page is only needed on archs that support pio.

    Noted by Carsten Otte.

    Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Signed-off-by: Jan Engelhardt
    Signed-off-by: Avi Kivity

    Jan Engelhardt
     

04 Mar, 2008

2 commits


09 Feb, 2008

1 commit

  • Sometimes simple attributes might need to return an error, e.g. for
    acquiring a mutex interruptibly. In fact we have that situation in
    spufs already, which is the original user of the simple attributes. This
    patch merges the temporarily forked attributes in spufs back into the
    main ones and allows them to return errors.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Christoph Hellwig
    Cc: Arnd Bergmann
    Cc: Greg KH
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
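
One way to let a getter report failure is to return a status code and pass the value back through an out-parameter. The sketch below shows that shape in user space; whether the kernel callbacks look exactly like this is an assumption, and `attr_get_fn`, `counter_model`, `counter_get()` and `attr_read()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Older shape: the value is the return value, so failure cannot be signalled. */
typedef uint64_t (*attr_get_old_fn)(void *data);

/* Error-capable shape: status is returned, the value goes through *val. */
typedef int (*attr_get_fn)(void *data, uint64_t *val);

struct counter_model {
    int lock_interrupted; /* hypothetical: nonzero means the lock was not acquired */
    uint64_t value;
};

static int counter_get(void *data, uint64_t *val)
{
    struct counter_model *c = data;

    assert(val != NULL);
    if (c->lock_interrupted)
        return -EINTR;    /* e.g. an interruptible mutex acquisition failed */
    *val = c->value;
    return 0;
}

/* Hypothetical caller standing in for the attribute read path. */
static int attr_read(attr_get_fn get, void *data, uint64_t *out)
{
    uint64_t v;
    int ret = get(data, &v);

    if (ret)
        return ret;       /* propagate the error, leave *out untouched */
    *out = v;
    return 0;
}
```

The caller checks the status first and only then trusts the value, so an interrupted lock shows up as an error instead of a bogus number.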
     

31 Jan, 2008

5 commits