Eric Lee / smarc-fsl-linux-kernel

26 Oct, 2016

1 commit

36343f6ea KVM: fix OOPS on flush_work

The conversion done by commit 3706feacd007 ("KVM: Remove deprecated
create_singlethread_workqueue") is broken. It flushes a single work
item &irqfd->shutdown instead of all of them, and even worse if there
is no irqfd on the list then you get a NULL pointer dereference.
Revert the virt/kvm/eventfd.c part of that patch; to avoid the
deprecated function, just allocate our own workqueue---it does
not even have to be unbound---with alloc_workqueue.

Fixes: 3706feacd007
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2016-10-26 20:06:51 +0800

08 Sep, 2016

1 commit

3706feacd KVM: Remove deprecated create_singlethread_workqueue

The workqueue "irqfd_cleanup_wq" queues a single work item
&irqfd->shutdown and hence doesn't require ordering. It is a host-wide
workqueue for issuing deferred shutdown requests aggregated from all
vm* instances. It is not being used on a memory reclaim path.
Hence, it has been converted to use system_wq.
The work item has been flushed in kvm_irqfd_release().

The workqueue "wqueue" queues a single work item &timer->expired
and hence doesn't require ordering. Also, it is not being used on
a memory reclaim path. Hence, it has been converted to use system_wq.

System workqueues have been able to handle high level of concurrency
for a long time now and hence it's not required to have a singlethreaded
workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue
created with create_singlethread_workqueue(), system_wq allows multiple
work items to overlap executions even on the same CPU; however, a
per-cpu workqueue doesn't have any CPU locality or global ordering
guarantee unless the target CPU is explicitly specified and thus the
increase of local concurrency shouldn't make any difference.

Signed-off-by: Bhaktipriya Shridhar
Signed-off-by: Paolo Bonzini

Bhaktipriya Shridhar
2016-09-08 01:34:28 +0800

12 May, 2016

1 commit

14717e203 kvm: Conditionally register IRQ bypass consumer

If we don't support a mechanism for bypassing IRQs, don't register as
a consumer. This eliminates meaningless dev_info()s when the connect
fails between producer and consumer, such as on AMD systems where
kvm_x86_ops->update_pi_irte is not implemented.

Signed-off-by: Alex Williamson
Signed-off-by: Paolo Bonzini

Alex Williamson
2016-05-12 04:37:55 +0800

04 Nov, 2015

1 commit

b97e6de9c KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic

We do not want to do too much work in atomic context, in particular
not walking all the VCPUs of the virtual machine. So we want
to distinguish the architecture-specific injection function for irqfd
from kvm_set_msi. Since it's still empty, reuse the newly added
kvm_arch_set_irq and rename it to kvm_arch_set_irq_inatomic.

Reviewed-by: Radim Krčmář
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2015-11-04 23:24:35 +0800

16 Oct, 2015

3 commits

c9a5eccac kvm/eventfd: add arch-specific set_irq

Allow for arch-specific interrupt types to be set. For that, add
kvm_arch_set_irq() which takes interrupt type-specific action if it
recognizes the interrupt type given, and -EWOULDBLOCK otherwise.

The default implementation always returns -EWOULDBLOCK.
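
As a rough illustration of the fallback convention described above, here is a minimal userspace sketch in C. Every name in it (route_entry, arch_set_irq_default, arch_set_irq_example, try_inject) is a hypothetical stand-in rather than the kernel's actual definition; only the -EWOULDBLOCK convention is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a routing entry. */
enum route_type { ROUTE_IRQCHIP, ROUTE_MSI, ROUTE_ARCH_SPECIFIC };

struct route_entry {
    enum route_type type;
    int gsi;
};

typedef int (*arch_set_irq_fn)(const struct route_entry *e);

/* Default hook: recognizes no interrupt type, so it always defers. */
static int arch_set_irq_default(const struct route_entry *e)
{
    assert(e != NULL);
    return -EWOULDBLOCK;
}

/* Example arch hook: handles one arch-specific type, defers the rest. */
static int arch_set_irq_example(const struct route_entry *e)
{
    assert(e != NULL);
    if (e->type == ROUTE_ARCH_SPECIFIC)
        return 1; /* pretend the interrupt was delivered */
    return -EWOULDBLOCK;
}

/*
 * Returns 1 if the hook handled the entry, 0 if the caller has to
 * fall back to a slower, deferred path.
 */
static int try_inject(arch_set_irq_fn hook, const struct route_entry *e)
{
    return hook(e) != -EWOULDBLOCK;
}
```

With the default hook every entry is deferred; with the example hook only entries of the arch-specific type are handled directly.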

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
Signed-off-by: Denis V. Lunev
CC: Vitaly Kuznetsov
CC: "K. Y. Srinivasan"
CC: Gleb Natapov
CC: Paolo Bonzini
Signed-off-by: Paolo Bonzini

Andrey Smetanin
2015-10-16 16:34:29 +0800

ba1aefcd6 kvm/eventfd: factor out kvm_notify_acked_gsi()

Factor out kvm_notify_acked_gsi() helper to iterate over EOI listeners
and notify those matching the given gsi.

It will be reused in the upcoming Hyper-V SynIC implementation.

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
Signed-off-by: Denis V. Lunev
CC: Vitaly Kuznetsov
CC: "K. Y. Srinivasan"
CC: Gleb Natapov
CC: Paolo Bonzini
Signed-off-by: Paolo Bonzini

Andrey Smetanin
2015-10-16 16:34:29 +0800

351dc6477 kvm/eventfd: avoid loop inside irqfd_update()

The for loop inside irqfd_update() is unnecessary
because any other value for irq_entry.type will just trigger
schedule_work(&irqfd->inject) in irqfd_wakeup.

Signed-off-by: Andrey Smetanin
Reviewed-by: Roman Kagan
Signed-off-by: Denis V. Lunev
CC: Vitaly Kuznetsov
CC: "K. Y. Srinivasan"
CC: Gleb Natapov
CC: Paolo Bonzini
Signed-off-by: Paolo Bonzini

Andrey Smetanin
2015-10-16 16:34:28 +0800

01 Oct, 2015

5 commits

f70c20aaf KVM: Add an arch specific hooks in 'struct kvm_kernel_irqfd'

This patch adds an arch-specific hook 'arch_update' to
'struct kvm_kernel_irqfd'. On the Intel side, it is used to
update the IRTE when VT-d posted interrupts are used.

Signed-off-by: Feng Wu
Reviewed-by: Alex Williamson
Signed-off-by: Paolo Bonzini

Feng Wu
2015-10-01 21:06:47 +0800

9016cfb57 KVM: eventfd: add irq bypass consumer management

This patch adds the registration/unregistration of an
irq_bypass_consumer on irqfd assignment/deassignment.

Signed-off-by: Eric Auger
Signed-off-by: Feng Wu
Reviewed-by: Alex Williamson
Signed-off-by: Paolo Bonzini

Eric Auger
2015-10-01 21:06:46 +0800

1a02b2703 KVM: introduce kvm_arch functions for IRQ bypass

This patch introduces
- kvm_arch_irq_bypass_add_producer
- kvm_arch_irq_bypass_del_producer
- kvm_arch_irq_bypass_stop
- kvm_arch_irq_bypass_start

They make it possible to specialize the KVM IRQ bypass consumer in
case CONFIG_KVM_HAVE_IRQ_BYPASS is set.

Signed-off-by: Eric Auger
[Add weak implementations of the callbacks. - Feng]
Signed-off-by: Feng Wu
Reviewed-by: Alex Williamson
Signed-off-by: Paolo Bonzini

Eric Auger
2015-10-01 21:06:45 +0800

166c9775f KVM: create kvm_irqfd.h

Move the _irqfd_resampler and _irqfd struct declarations into a new
public header: kvm_irqfd.h. They are renamed to
kvm_kernel_irqfd_resampler and kvm_kernel_irqfd, respectively. Those
datatypes will be used by architecture-specific code, in the context of
IRQ bypass manager integration.

Signed-off-by: Eric Auger
Signed-off-by: Feng Wu
Reviewed-by: Alex Williamson
Signed-off-by: Paolo Bonzini

Eric Auger
2015-10-01 21:06:44 +0800

e9ea5069d kvm: add capability for any-length ioeventfds

Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Signed-off-by: Paolo Bonzini

Jason Wang
2015-10-01 21:06:31 +0800

15 Sep, 2015

3 commits

eefd6b06b kvm: fix double free for fast mmio eventfd

We register a wildcard mmio eventfd on two buses, once on KVM_MMIO_BUS
and once on KVM_FAST_MMIO_BUS, but with a single iodev
instance. This leads to a problem: kvm_io_bus_destroy() knows
nothing about devices on two buses pointing to a single dev, which
leads to a double free[1] during exit. Fix this by allocating two
iodev instances, then registering one on KVM_MMIO_BUS and the other
on KVM_FAST_MMIO_BUS.
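
The ownership rule behind this fix can be sketched in userspace C as follows. This is only a toy model under stated assumptions: struct iodev, struct bus, bus_register, bus_destroy and register_wildcard are hypothetical names, and the real kernel structures are considerably richer. The point it shows is that when each bus frees everything registered on it, every bus needs its own device instance.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DEVS 4

/* Hypothetical stand-ins for a device and an I/O bus. */
struct iodev { int id; };

struct bus {
    struct iodev *devs[MAX_DEVS];
    int count;
};

static void bus_register(struct bus *b, struct iodev *d)
{
    assert(b->count < MAX_DEVS);
    b->devs[b->count++] = d;
}

/*
 * Frees every device on this bus. It knows nothing about other buses,
 * so a device shared between two buses would be freed twice.
 * Returns the number of devices freed.
 */
static int bus_destroy(struct bus *b)
{
    int freed = 0;

    for (int i = 0; i < b->count; i++) {
        free(b->devs[i]);
        freed++;
    }
    b->count = 0;
    return freed;
}

/*
 * Allocate one instance per bus, so each bus owns what it frees.
 * Returns 0 on success, -1 on allocation failure.
 */
static int register_wildcard(struct bus *mmio, struct bus *fast_mmio, int id)
{
    struct iodev *a = malloc(sizeof(*a));
    struct iodev *b = malloc(sizeof(*b));

    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }
    a->id = id;
    b->id = id;
    bus_register(mmio, a);
    bus_register(fast_mmio, b);
    return 0;
}
```

Destroying both buses after register_wildcard() releases two distinct allocations, one per bus.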

CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
RIP: 0010:[] [] ioeventfd_release+0x28/0x60 [kvm]
RSP: 0018:ffff88020e7f3bc8 EFLAGS: 00010292
RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
FS: 00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
Stack:
ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
Call Trace:
[] ioeventfd_destructor+0x12/0x20 [kvm]
[] kvm_put_kvm+0xca/0x210 [kvm]
[] kvm_vcpu_release+0x18/0x20 [kvm]
[] __fput+0xe7/0x250
[] ____fput+0xe/0x10
[] task_work_run+0xd4/0xf0
[] do_exit+0x368/0xa50
[] ? recalc_sigpending+0x1f/0x60
[] do_group_exit+0x45/0xb0
[] get_signal+0x291/0x750
[] do_signal+0x28/0xab0
[] ? do_futex+0xdb/0x5d0
[] ? __wake_up_locked_key+0x18/0x20
[] ? SyS_futex+0x76/0x170
[] do_notify_resume+0x69/0xb0
[] int_signal+0x12/0x17
Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 89 10 48 b8 00 01 10 00 00
RIP [] ioeventfd_release+0x28/0x60 [kvm]
RSP

Cc: stable@vger.kernel.org
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Jason Wang
2015-09-15 22:59:31 +0800

85da11ca5 kvm: factor out core eventfd assign/deassign logic

This patch factors out core eventfd assign/deassign logic and leaves
the argument checking and bus index selection to callers.

Cc: stable@vger.kernel.org
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Jason Wang
2015-09-15 22:58:47 +0800

8453fecbe kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd

We only want zero-length mmio eventfds to be registered on
KVM_FAST_MMIO_BUS. So check this explicitly when arg->len is zero to
make sure of this.

Cc: stable@vger.kernel.org
Cc: Gleb Natapov
Cc: Paolo Bonzini
Signed-off-by: Jason Wang
Reviewed-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Jason Wang
2015-09-15 22:58:27 +0800

27 Mar, 2015

2 commits

af669ac6d KVM: move iodev.h from virt/kvm/ to include/kvm

iodev.h contains definitions for the kvm_io_bus framework. This is
needed both by the generic KVM code in virt/kvm as well as by
architecture specific code under arch/. Putting the header file in
virt/kvm and using local includes in the architecture part seems at
least dodgy to me, so let's move the file into include/kvm, so that a
more natural "#include " can be used by all of the code.
This also solves a problem later when using struct kvm_io_device
in arm_vgic.h.
Also fix up the FSF address in the GPL header and a wrong include path
along the way.

Signed-off-by: Andre Przywara
Acked-by: Christoffer Dall
Reviewed-by: Marc Zyngier
Reviewed-by: Marcelo Tosatti
Signed-off-by: Marc Zyngier

Andre Przywara
2015-03-27 05:43:12 +0800

e32edf4fd KVM: Redesign kvm_io_bus_ API to pass VCPU structure to the callbacks.

This is needed in e.g. ARM vGIC emulation, where the MMIO handling
depends on the VCPU that does the access.

Signed-off-by: Nikolay Nikolaev
Signed-off-by: Andre Przywara
Acked-by: Paolo Bonzini
Acked-by: Christoffer Dall
Reviewed-by: Marc Zyngier
Signed-off-by: Marc Zyngier

Nikolay Nikolaev
2015-03-27 05:43:11 +0800

12 Mar, 2015

1 commit

01c94e64f KVM: introduce kvm_arch_intc_initialized and use it in irqfd

Introduce the __KVM_HAVE_ARCH_INTC_INITIALIZED define and the
associated kvm_arch_intc_initialized function. The latter
allows testing whether the virtual interrupt controller is initialized
and ready to accept virtual IRQ injection. On some architectures,
the virtual interrupt controller is dynamically instantiated, which
justifies this kind of check.

The new function can now be used by irqfd to check whether the
virtual interrupt controller is ready on KVM_IRQFD request. If not,
KVM_IRQFD returns -EAGAIN.
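
The readiness check described above could look roughly like this in a userspace model. The struct vm type, its intc_ready field, and the irqfd_request function are hypothetical; only the "return -EAGAIN when the interrupt controller is not ready" behaviour comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-VM state. */
struct vm {
    bool intc_ready; /* set once the virtual interrupt controller exists */
};

/* Models the arch hook: is the virtual interrupt controller ready? */
static bool arch_intc_initialized(const struct vm *vm)
{
    assert(vm != NULL);
    return vm->intc_ready;
}

/* Models the request path: refuse with -EAGAIN until the controller is ready. */
static int irqfd_request(const struct vm *vm)
{
    if (!arch_intc_initialized(vm))
        return -EAGAIN;
    return 0;
}
```

A caller that gets -EAGAIN can retry once the interrupt controller has been instantiated.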

Signed-off-by: Eric Auger
Acked-by: Christoffer Dall
Reviewed-by: Andre Przywara
Acked-by: Marc Zyngier
Signed-off-by: Christoffer Dall

Eric Auger
2015-03-12 22:15:32 +0800

22 Nov, 2014

1 commit

6ef768fac kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/

ia64 does not need them anymore. Ack notifiers become x86-specific
too.

Suggested-by: Gleb Natapov
Reviewed-by: Radim Krcmar
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2014-11-22 01:02:37 +0800

24 Sep, 2014

1 commit

29f1b65b5 KVM: EVENTFD: Remove inclusion of irq.h

Commit c77dcac (KVM: Move more code under CONFIG_HAVE_KVM_IRQFD) added
functionality that depends on definitions in ioapic.h when
__KVM_HAVE_IOAPIC is defined.

At the same time, kvm-arm commit 0ba0951 (KVM: EVENTFD: remove inclusion
of irq.h) removed the inclusion of irq.h, an architecture-specific header
that is not present on ARM but which happened to include ioapic.h on x86.

Include ioapic.h directly in eventfd.c if __KVM_HAVE_IOAPIC is defined.
This fixes x86 and lets ARM use eventfd.c.

Signed-off-by: Christoffer Dall
Signed-off-by: Paolo Bonzini

Christoffer Dall
2014-09-24 18:06:25 +0800

06 Aug, 2014

1 commit

c77dcacb3 KVM: Move more code under CONFIG_HAVE_KVM_IRQFD

Commit e4d57e1ee1ab (KVM: Move irq notifier implementation into
eventfd.c, 2014-06-30) included the irq notifier code unconditionally
in eventfd.c, while it was under CONFIG_HAVE_KVM_IRQCHIP before.

Similarly, commit 297e21053a52 (KVM: Give IRQFD its own separate enabling
Kconfig option, 2014-06-30) moved code from CONFIG_HAVE_IRQ_ROUTING
to CONFIG_HAVE_KVM_IRQFD but forgot to move the pieces that used to be
under CONFIG_HAVE_KVM_IRQCHIP.

Together, this broke compilation without CONFIG_KVM_XICS. Fix by adding
or changing the #ifdefs so that they point at CONFIG_HAVE_KVM_IRQFD.

Signed-off-by: Paolo Bonzini

Paolo Bonzini
2014-08-06 20:24:47 +0800

05 Aug, 2014

5 commits

297e21053 KVM: Give IRQFD its own separate enabling Kconfig option

Currently, the IRQFD code is conditional on CONFIG_HAVE_KVM_IRQ_ROUTING.
So that we can have the IRQFD code compiled in without having the
IRQ routing code, this creates a new CONFIG_HAVE_KVM_IRQFD, makes
the IRQFD code conditional on it instead of CONFIG_HAVE_KVM_IRQ_ROUTING,
and makes all the platforms that currently select HAVE_KVM_IRQ_ROUTING
also select HAVE_KVM_IRQFD.

Signed-off-by: Paul Mackerras
Tested-by: Eric Auger
Tested-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paul Mackerras
2014-08-05 20:26:28 +0800

e4d57e1ee KVM: Move irq notifier implementation into eventfd.c

This moves the functions kvm_irq_has_notifier(), kvm_notify_acked_irq(),
kvm_register_irq_ack_notifier() and kvm_unregister_irq_ack_notifier()
from irqchip.c to eventfd.c. The reason for doing this is that those
functions are used in connection with IRQFDs, which are implemented in
eventfd.c. In future we will want to use IRQFDs on platforms that
don't implement the GSI routing implemented in irqchip.c, so we won't
be compiling in irqchip.c, but we still need the irq notifiers. The
implementation is unchanged.

Signed-off-by: Paul Mackerras
Tested-by: Eric Auger
Tested-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paul Mackerras
2014-08-05 20:26:24 +0800

9957c86d6 KVM: Move all accesses to kvm::irq_routing into irqchip.c

Now that struct _irqfd does not keep a reference to storage pointed
to by the irq_routing field of struct kvm, we can move the statement
that updates it out from under the irqfds.lock and put it in
kvm_set_irq_routing() instead. That means we then have to take a
srcu_read_lock on kvm->irq_srcu around the irqfd_update call in
kvm_irqfd_assign(), since holding the kvm->irqfds.lock no longer
ensures that the routing can't change.

Combined with changing kvm_irq_map_gsi() and kvm_irq_map_chip_pin()
to take a struct kvm * argument instead of the pointer to the routing
table, this allows us to move all references to kvm->irq_routing
into irqchip.c. That in turn allows us to move the definition of the
kvm_irq_routing_table struct into irqchip.c as well.

Signed-off-by: Paul Mackerras
Tested-by: Eric Auger
Tested-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paul Mackerras
2014-08-05 20:26:20 +0800

8ba918d48 KVM: irqchip: Provide and use accessors for irq routing table

This provides accessor functions for the KVM interrupt mappings, in
order to reduce the amount of code that accesses the fields of the
kvm_irq_routing_table struct, and restrict that code to one file,
virt/kvm/irqchip.c. The new functions are kvm_irq_map_gsi(), which
maps from a global interrupt number to a set of IRQ routing entries,
and kvm_irq_map_chip_pin, which maps from IRQ chip and pin numbers to
a global interrupt number.

This also moves the update of kvm_irq_routing_table::chip[][]
into irqchip.c, out of the various kvm_set_routing_entry
implementations. That means that none of the kvm_set_routing_entry
implementations need the kvm_irq_routing_table argument anymore,
so this removes it.

This does not change any locking or data lifetime rules.

Signed-off-by: Paul Mackerras
Tested-by: Eric Auger
Tested-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paul Mackerras
2014-08-05 20:26:16 +0800

56f89f362 KVM: Don't keep reference to irq routing table in irqfd struct

This makes the irqfd code keep a copy of the irq routing table entry
for each irqfd, rather than a reference to the copy in the actual
irq routing table maintained in virt/kvm/irqchip.c. This will enable
us to change the routing table structure in future, or even not have a
routing table at all on some platforms.

The synchronization that was previously achieved using srcu_dereference
on the read side is now achieved using a seqcount_t structure. That
ensures that we don't get a halfway-updated copy of the structure if
we read it while another thread is updating it.
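
The sequence-counter idea can be sketched in portable C11 as below. This is not the kernel's seqcount_t API; struct routing_copy, copy_write and copy_read are hypothetical names, and the sketch assumes a single writer at a time (writers would be serialized by a lock in practice). The reader retries whenever it observes an odd counter or a counter that changed during the copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-irqfd copy of a routing entry, guarded by a counter. */
struct routing_copy {
    atomic_uint seq;  /* odd while a write is in progress */
    atomic_int gsi;
    atomic_int type;
};

/* Writer side: assumes writers are already serialized by the caller. */
static void copy_write(struct routing_copy *c, int gsi, int type)
{
    unsigned s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    assert((s & 1u) == 0); /* no other write may be in progress */
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&c->gsi, gsi, memory_order_relaxed);
    atomic_store_explicit(&c->type, type, memory_order_relaxed);
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: retry until a consistent snapshot has been read. */
static void copy_read(struct routing_copy *c, int *gsi, int *type)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&c->seq, memory_order_acquire);
        *gsi = atomic_load_explicit(&c->gsi, memory_order_relaxed);
        *type = atomic_load_explicit(&c->type, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

Readers never block the writer; they only pay for a retry when a write overlaps their read.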

We still use srcu_read_lock/unlock around the read side so that when
changing the routing table we can be sure that after calling
synchronize_srcu, nothing will be using the old routing.

Signed-off-by: Paul Mackerras
Tested-by: Eric Auger
Tested-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Paul Mackerras
2014-08-05 20:24:23 +0800

05 May, 2014

1 commit

719d93cd5 kvm/irqchip: Speed up KVM_SET_GSI_ROUTING

When starting lots of dataplane devices the bootup takes very long on
Christian's s390 with irqfd patches. With larger setups he is even
able to trigger some timeouts in some components. Turns out that the
KVM_SET_GSI_ROUTING ioctl takes very long (strace claims up to 0.1 sec)
when having multiple CPUs. This is caused by the synchronize_rcu and
the HZ=100 of s390. By changing the code to use a private srcu we can
speed things up. This patch reduces the boot time till mounting root
from 8 to 2 seconds on my s390 guest with 100 disks.

Uses of hlist_for_each_entry_rcu, hlist_add_head_rcu, hlist_del_init_rcu
are fine because they do not have lockdep checks (hlist_for_each_entry_rcu
uses rcu_dereference_raw rather than rcu_dereference, and write-sides
do not do rcu lockdep at all).

Note that we're hardly relying on the "sleepable" part of srcu. We just
want SRCU's faster detection of grace periods.

Testing was done by Andrew Theurer using netperf tests STREAM, MAERTS
and RR. The difference between results "before" and "after" the patch
has mean -0.2% and standard deviation 0.6%. Using a paired t-test on the
data points says that there is a 2.5% probability that the patch is the
cause of the performance difference (rather than a random fluctuation).

(Restricting the t-test to RR, which is the most likely to be affected,
changes the numbers to respectively -0.3% mean, 0.7% stdev, and 8%
probability that the numbers actually say something about the patch.
The probability increases mostly because there are fewer data points).

Cc: Marcelo Tosatti
Cc: Michael S. Tsirkin
Tested-by: Christian Borntraeger # s390
Reviewed-by: Christian Borntraeger
Signed-off-by: Christian Borntraeger
Signed-off-by: Paolo Bonzini

Christian Borntraeger
2014-05-05 22:29:11 +0800

18 Apr, 2014

2 commits

68c3b4d16 KVM: VMX: speed up wildcard MMIO EVENTFD

With KVM, MMIO is much slower than PIO, due to the need to
do page walk and emulation. But with EPT, it does not have to be: we
know the address from the VMCS so if the address is unique, we can look
up the eventfd directly, bypassing emulation.

Unfortunately, this only works if userspace does not need to match on
access length and data. The implementation adds a separate FAST_MMIO
bus internally. This serves two purposes:
- minimize overhead for old userspace that does not use eventfd with length = 0
- minimize disruption in other code (since we don't know the length,
  devices on the MMIO bus only get a valid address in write, this
  way we don't need to touch all devices to teach them to handle
  an invalid length)

At the moment, this optimization only has effect for EPT on x86.

It will be possible to speed up MMIO for NPT and MMU using the same
idea in the future.

With this patch applied, on VMX MMIO EVENTFD is essentially as fast as PIO.
I was unable to detect any measurable slowdown to non-eventfd MMIO.

Making MMIO faster is important for the upcoming virtio 1.0 which
includes an MMIO signalling capability.

The idea was suggested by Peter Anvin. Lots of thanks to Gleb for
pre-review and suggestions.

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti

Michael S. Tsirkin
2014-04-18 01:01:43 +0800

f848a5a8d KVM: support any-length wildcard ioeventfd

It is sometimes beneficial to ignore IO size, and only match on address.
In hindsight this would have been a better default than matching length
when KVM_IOEVENTFD_FLAG_DATAMATCH is not set. In particular, this kind
of access can be optimized on VMX: there is no need to do page lookups.
This can currently be done with many ioeventfds but in a suboptimal way.

However we can't change kernel/userspace ABI without risk of breaking
some applications.
Use len = 0 to mean "ignore length for matching" in a more optimal way.
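
A minimal sketch of the matching rule, assuming a simplified descriptor: struct ioevent_match and ioevent_in_range are hypothetical names chosen for illustration, not the kernel's definitions. A registered length of 0 matches on address alone; otherwise the access length, and optionally the data, must match too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of one registered ioeventfd. */
struct ioevent_match {
    uint64_t addr;
    int length;         /* 0 means "ignore length for matching" */
    bool wildcard;      /* true when no data match was requested */
    uint64_t datamatch; /* only used when wildcard is false */
};

/* Returns true if a write of len bytes of val at addr should signal. */
static bool ioevent_in_range(const struct ioevent_match *p,
                             uint64_t addr, int len, uint64_t val)
{
    assert(p != NULL);
    if (addr != p->addr)
        return false;
    if (p->length == 0)
        return true; /* any-length registration: address-only match */
    if (len != p->length)
        return false;
    if (p->wildcard)
        return true; /* length matches and no data match was requested */
    return val == p->datamatch;
}
```

With length 0, a 1-byte and an 8-byte write to the same address both match, which is what lets a fast path skip decoding the access size.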

Signed-off-by: Michael S. Tsirkin
Signed-off-by: Marcelo Tosatti

Michael S. Tsirkin
2014-04-18 01:01:42 +0800

19 Mar, 2014

1 commit

684a0b719 KVM: eventfd: Fix lock order inversion.

When registering a new irqfd, we call its ->poll method to collect any
event that might have previously been pending so that we can trigger it.
This is done under the kvm->irqfds.lock, which means the eventfd's ctx
lock is taken under it.

However, if we get a POLLHUP in irqfd_wakeup, we will be called with the
ctx lock held before getting the irqfds.lock to deactivate the irqfd,
causing lockdep to complain.

Calling the ->poll method does not really need the irqfds.lock, so let's
just move it after we've given up the irqfds.lock in kvm_irqfd_assign().

Signed-off-by: Cornelia Huck
Signed-off-by: Paolo Bonzini

Cornelia Huck
2014-03-19 00:06:04 +0800

04 Sep, 2013

1 commit

cffe78d92 kvm eventfd: switch to fdget

Signed-off-by: Al Viro

Al Viro
2013-09-04 11:04:45 +0800

04 Jun, 2013

1 commit

6ea34c9b7 kvm: exclude ioeventfd from counting kvm_io_range limit

We can easily reach the 1000 limit by starting a VM with a couple
hundred I/O devices (multifunction=on). The hardcoded limit
has already been adjusted 3 times (6 ~ 200 ~ 300 ~ 1000).

In userspace, we already have the maximum file descriptor count to
limit the ioeventfd count. But kvm_io_bus devices are also used
for pit, pic, ioapic, coalesced_mmio. Those can't be limited
by the maximum file descriptor count.

Currently only ioeventfds take too many kvm_io_bus devices,
so just exclude them from counting against the kvm_io_range limit.

Also fixed one indent issue in kvm_host.h.

Signed-off-by: Amos Kong
Reviewed-by: Stefan Hajnoczi
Signed-off-by: Gleb Natapov

Amos Kong
2013-06-04 16:49:38 +0800

27 Apr, 2013

1 commit

a725d56a0 KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING

Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOAPIC, but a different type of irq chip.

Make code that only relies on IRQ routing, not an APIC itself,
conditional on CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.

Signed-off-by: Alexander Graf
Acked-by: Michael S. Tsirkin

Alexander Graf
2013-04-27 02:27:14 +0800

16 Apr, 2013

1 commit

aa2fbe6d4 KVM: Let ioapic know the irq line status

Userspace may deliver an RTC interrupt without querying the status, so
we want to track RTC EOI for this case.

Signed-off-by: Yang Zhang
Reviewed-by: Gleb Natapov
Signed-off-by: Marcelo Tosatti

Yang Zhang
2013-04-16 10:20:34 +0800

07 Apr, 2013

1 commit

05e07f9bd kvm: fix MMIO/PIO collision misdetection

PIO and MMIO are separate address spaces, but
ioeventfd registration code mistakenly detected
two eventfds as duplicate if they use the same address,
even if one is PIO and another one MMIO.
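
The corrected duplicate check can be modelled as follows. This is a simplified sketch: struct ioevent_reg, enum bus_kind and ioevent_collides are hypothetical names, and the comparison of lengths and data is an assumption about what else a duplicate check would look at. The part taken from the commit message is that the address space (bus) has to be compared as well as the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* PIO and MMIO are separate address spaces. */
enum bus_kind { BUS_MMIO, BUS_PIO };

/* Hypothetical, simplified description of one registration. */
struct ioevent_reg {
    enum bus_kind bus;
    uint64_t addr;
    int length;
    bool wildcard;      /* true when no data match was requested */
    uint64_t datamatch;
};

/* Two registrations collide only if they live on the same bus. */
static bool ioevent_collides(const struct ioevent_reg *a,
                             const struct ioevent_reg *b)
{
    assert(a != NULL && b != NULL);
    return a->bus == b->bus &&
           a->addr == b->addr &&
           a->length == b->length &&
           (a->wildcard || b->wildcard || a->datamatch == b->datamatch);
}
```

Without the bus comparison, a PIO registration at 0x3f8 and an MMIO registration at 0x3f8 would be treated as duplicates.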

Reviewed-by: Paolo Bonzini
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Gleb Natapov

Michael S. Tsirkin
2013-04-07 19:53:47 +0800

06 Mar, 2013

2 commits

2b83451b4 KVM: ioeventfd for virtio-ccw devices.

Enhance KVM_IOEVENTFD with a new flag that allows attaching to virtio-ccw
devices on s390 via the KVM_VIRTIO_CCW_NOTIFY_BUS.

Signed-off-by: Cornelia Huck
Signed-off-by: Marcelo Tosatti

Cornelia Huck
2013-03-06 06:12:17 +0800

a0f155e96 KVM: Initialize irqfd from kvm_init().

Currently, eventfd introduces module_init/module_exit functions
to initialize/cleanup the irqfd workqueue. This only works, however,
if no other module_init/module_exit functions are built into the
same module.

Let's just move the initialization and cleanup to kvm_init and kvm_exit.
This way, it is also clearer where kvm startup may fail.

Signed-off-by: Cornelia Huck
Signed-off-by: Marcelo Tosatti

Cornelia Huck
2013-03-06 06:12:16 +0800

28 Feb, 2013

1 commit

b67bfe0d4 hlist: drop the node parameter from iterators

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
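
To show what a three-argument hlist iterator looks like in use, here is a small userspace sketch. It is not the kernel's linux/list.h: the node type is simplified (no pprev pointer), the names entry_of, entry_or_null, my_hlist_for_each_entry, struct item and sum_items are hypothetical, and it relies on the GNU __typeof__ extension (GCC and Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified singly linked hash-list node and head. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Map a node pointer back to the structure that embeds it. */
#define entry_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Same, but yields NULL at the end of the list. */
#define entry_or_null(ptr, type, member) \
    ((ptr) ? entry_of(ptr, type, member) : NULL)

/* Iterator with the same shape as list_for_each_entry(pos, head, member). */
#define my_hlist_for_each_entry(pos, head, member)                         \
    for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);   \
         pos;                                                              \
         pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
    int value;
    struct hlist_node node;
};

/* Sum the values of all items on the list, using only one cursor. */
static int sum_items(struct hlist_head *head)
{
    struct item *it;
    int sum = 0;

    assert(head != NULL);
    my_hlist_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```

The caller declares only the typed cursor; no separate node cursor is needed.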

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin
Acked-by: Paul E. McKenney
Signed-off-by: Sasha Levin
Cc: Wu Fengguang
Cc: Marcelo Tosatti
Cc: Gleb Natapov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2013-02-28 11:10:24 +0800

11 Dec, 2012

1 commit

49f8a1a53 kvm: Fix irqfd resampler list walk

Typo for the next pointer means we're walking random data here.

Signed-off-by: Alex Williamson
Signed-off-by: Marcelo Tosatti

Alex Williamson
2012-12-11 04:16:36 +0800

06 Dec, 2012

1 commit

914daba86 KVM: Distangle eventfd code from irqchip

The current eventfd code assumes that when we have eventfd, we also have
irqfd for in-kernel interrupt delivery. This is not necessarily true. On
PPC we don't have an in-kernel irqchip yet, but we can still easily
support eventfd.

Signed-off-by: Alexander Graf

Alexander Graf
2012-12-06 08:33:49 +0800