Eric Lee / smarc-fsl-linux-kernel

15 Feb, 2017

4 commits

1cf897fcc Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()

commit 433e19cf33d34bb6751c874a9c00980552fe508c upstream.

Commit a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")
added the proper mb(), but removed the test "prev_write_sz < pending_sz"
when making the signal decision.

As a result, the guest can signal the host unnecessarily,
and then the host can throttle the guest because the host
thinks the guest is buggy or malicious; finally the user
running stress test can perceive intermittent freeze of
the guest.

This patch brings back the test, and properly handles the
in-place consumption APIs used by NetVSC (see get_next_pkt_raw(),
put_pkt_raw() and commit_rd_index()).
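
The restored test can be sketched as a small standalone model. This is not the kernel code: the function name and the cur_write_sz parameter are assumptions made for illustration; only prev_write_sz and pending_sz come from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the read-side signaling decision (illustrative only).
 *
 * prev_write_sz: bytes the host could write before the guest's read
 * cur_write_sz:  bytes the host can write after the read (assumed name)
 * pending_sz:    bytes the host is blocked waiting for; 0 means not blocked
 */
static bool need_signal_on_read(uint32_t prev_write_sz,
                                uint32_t cur_write_sz,
                                uint32_t pending_sz)
{
    /* The host is not blocked on a write: nothing to signal. */
    if (pending_sz == 0)
        return false;

    /* Still not enough room for the blocked write: signaling is useless. */
    if (cur_write_sz < pending_sz)
        return false;

    /*
     * Signal only if this read is what made the room available.
     * If there was already enough room before the read, the host was
     * not waiting on us and a signal would look spurious.
     */
    return prev_write_sz < pending_sz;
}
```

Dropping the last comparison is what makes the guest signal when the host already had enough room, which is the unnecessary signaling described above.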

Fixes: a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")

Signed-off-by: Dexuan Cui
Reported-by: Rolf Neugebauer
Tested-by: Rolf Neugebauer
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Signed-off-by: K. Y. Srinivasan
Cc: Rolf Neugebauer
Signed-off-by: Greg Kroah-Hartman

Dexuan Cui
2017-02-15 07:25:39 +0800
964dfbe3d Drivers: hv: vmbus: On the read path cleanup the logic to interrupt the host

commit 3372592a140db69fd63837e81f048ab4abf8111e upstream.

Signal the host when we determine the host is to be signaled -
on the read path. The current code determines the need to signal in the
ringbuffer code and actually issues the signal elsewhere. This can result
in the host viewing this interrupt as spurious since the host may also
poll the channel. Make the necessary adjustments.

Signed-off-by: K. Y. Srinivasan
Cc: Rolf Neugebauer
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2017-02-15 07:25:38 +0800
e2fdf7841 Drivers: hv: vmbus: On write cleanup the logic to interrupt the host

commit 1f6ee4e7d83586c8b10bd4f2f4346353d04ce884 upstream.

Signal the host when we determine the host is to be signaled.
The current code determines the need to signal in the ringbuffer
code and actually issues the signal elsewhere. This can result
in the host viewing this interrupt as spurious since the host may also
poll the channel. Make the necessary adjustments.

Signed-off-by: K. Y. Srinivasan
Cc: Rolf Neugebauer
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2017-02-15 07:25:38 +0800
afbb98f91 Drivers: hv: vmbus: Base host signaling strictly on the ring state

commit 74198eb4a42c4a3c4fbef08fa01a291a282f7c2e upstream.

One of the factors that can result in the host concluding that a given
guest is mounting a DoS attack is if the guest generates interrupts
to the host when the host is not expecting it. If these "spurious"
interrupts reach a certain rate, the host can throttle the guest to
minimize the impact. The host computation of the "expected number
of interrupts" is strictly based on the ring transitions. Until
the host logic is fixed, base the guest logic to interrupt solely
on the ring state.
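
A minimal sketch of a decision based only on ring state, assuming that the relevant "ring transition" is the ring going from empty to non-empty on a write. The struct, its field names and the function name are illustrative, and treating an interrupt mask as part of the ring state is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative ring header; field names are assumptions. */
struct ring_state {
    uint32_t read_index;      /* where the reader will read next */
    uint32_t write_index;     /* where the writer will write next */
    uint32_t interrupt_mask;  /* non-zero: reader asked not to be signaled */
};

/*
 * Decide whether the writer should signal the reader after a write.
 * old_write_index is the write index sampled before the write.
 */
static bool need_signal_on_write(const struct ring_state *ring,
                                 uint32_t old_write_index)
{
    /* The reader has masked interrupts; it will poll the ring itself. */
    if (ring->interrupt_mask)
        return false;

    /*
     * The ring was empty before this write exactly when the old write
     * index equalled the read index. Only that transition is signaled.
     */
    return old_write_index == ring->read_index;
}
```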

Signed-off-by: K. Y. Srinivasan
Cc: Rolf Neugebauer
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2017-02-15 07:25:38 +0800

09 Jan, 2017

1 commit

bf6a9b31e hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()

commit abd1026da4a7700a8db370947f75cd17b6ae6f76 upstream.

"kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
module is unloaded. BUG_ON() was introduced in commit 85d9aa705184
("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
vmbus_free_channels() codepath was apparently forgotten.

Fixes: 85d9aa705184 ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()")

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2017-01-09 15:32:18 +0800

01 Nov, 2016

1 commit

f6b2db084 vmbus: make sysfs names consistent with PCI

In commit 9a56e5d6a0ba ("Drivers: hv: make VMBus bus ids persistent")
the name of vmbus devices in sysfs changed to be (in 4.9-rc1):
/sys/bus/vmbus/vmbus-6aebe374-9ba0-11e6-933c-00259086b36b

The prefix ("vmbus-") is redundant and differs from how PCI is
represented in sysfs. Therefore simplify to:
/sys/bus/vmbus/6aebe374-9ba0-11e6-933c-00259086b36b

Please merge this before 4.9 is released and the old format
has to live forever.
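
For illustration, the simplified name is just the textual form of the device's instance GUID. The sketch below formats a 16-byte GUID, assuming it is stored in the little-endian (Microsoft) byte layout; the function name is hypothetical and this is a userspace model, not the driver's code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a GUID stored in little-endian layout as lower-case text.
 * The first three fields are byte-swapped, the last eight bytes are
 * printed in storage order. out must hold at least 37 bytes.
 */
static void guid_le_to_name(const uint8_t g[16], char out[37])
{
    snprintf(out, 37,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x-%02x%02x%02x%02x%02x%02x",
             g[3], g[2], g[1], g[0],   /* 32-bit field, swapped */
             g[5], g[4],               /* 16-bit field, swapped */
             g[7], g[6],               /* 16-bit field, swapped */
             g[8], g[9],               /* remaining bytes as stored */
             g[10], g[11], g[12], g[13], g[14], g[15]);
}
```

Under that layout assumption, the bytes 74 e3 eb 6a a0 9b e6 11 93 3c 00 25 90 86 b3 6b produce the name shown in the example path above.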

Signed-off-by: Stephen Hemminger
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Stephen Hemminger
2016-11-01 23:07:13 +0800

25 Oct, 2016

1 commit

407a3aee6 hv: do not lose pending heartbeat vmbus packets

The host keeps sending heartbeat packets independent of the
guest responding to them. Even though we respond to the heartbeat messages at
interrupt level, we can have situations where there may be multiple heartbeat
messages pending that have not been responded to. For instance this occurs when the
VM is paused and the host continues to send the heartbeat messages.
Address this issue by draining and responding to all
the heartbeat messages that may be pending.
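
The drain-and-respond idea can be sketched with a toy queue. Everything here (struct hb_queue and the hb_* helpers) is a hypothetical stand-in for the channel and its receive/send calls, not the driver's API.

```c
/* Hypothetical stand-in for a channel holding heartbeat requests. */
struct hb_queue {
    unsigned pending;   /* requests waiting in the ring */
    unsigned answered;  /* responses sent so far */
};

/* Consume one request; returns 1 if one was available, 0 if the ring is empty. */
static int hb_recv(struct hb_queue *q)
{
    if (q->pending == 0)
        return 0;
    q->pending--;
    return 1;
}

/* Send one heartbeat response. */
static void hb_respond(struct hb_queue *q)
{
    q->answered++;
}

/*
 * Handle one interrupt: keep receiving until the ring is empty instead
 * of answering a single packet and leaving the rest behind.
 * Returns the number of requests handled.
 */
static unsigned hb_drain(struct hb_queue *q)
{
    unsigned handled = 0;

    while (hb_recv(q)) {
        hb_respond(q);
        handled++;
    }
    return handled;
}
```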

Signed-off-by: Long Li
Signed-off-by: K. Y. Srinivasan
CC: Stable
Signed-off-by: Greg Kroah-Hartman

Long Li
2016-10-25 14:52:10 +0800

27 Sep, 2016

2 commits

e7fca5d86 Drivers: hv: get rid of id in struct vmbus_channel

The auto incremented counter is not being used anymore, get rid of it.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-09-27 18:35:49 +0800
b294809db Drivers: hv: make VMBus bus ids persistent

Some tools use bus ids to identify devices and they count on the fact
that these ids are persistent across reboot. This may not be true for
VMBus as we use auto incremented counter from alloc_channel() as such
id. Switch to using if_instance from channel offer, this id is supposed
to be persistent.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-09-27 18:35:49 +0800

09 Sep, 2016

1 commit

3ba1eb17b Drivers: hv: hv_util: Avoid dynamic allocation in time synch

Under stress, we have seen allocation failure in time synch code. Avoid
this dynamic allocation.

Signed-off-by: Vivek Yadav
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vivek Yadav
2016-09-09 19:48:23 +0800

08 Sep, 2016

3 commits

8e1d26073 Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.

This enables support for more accurate TimeSync v4 samples when hosted
under Windows Server 2016 and newer hosts.

The new time samples include a "vmreferencetime" field that represents
the guest's TSC value when the host generated its time sample. This value
lets the guest calculate the latency in receiving the time sample. The
latency is added to the sample host time prior to updating the clock.
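
The latency compensation amounts to one subtraction and one addition. The sketch below assumes that the host time and both reference readings are already expressed in the same units; the function and parameter names are illustrative.

```c
#include <stdint.h>

/*
 * host_time:  time carried in the sample
 * sample_ref: reference counter value recorded when the host generated
 *             the sample (the "vmreferencetime" field)
 * now_ref:    reference counter value when the guest processes the sample
 *
 * All three are assumed to use the same units.
 */
static uint64_t adjust_host_time(uint64_t host_time,
                                 uint64_t sample_ref,
                                 uint64_t now_ref)
{
    /* Time the sample spent in flight before the guest looked at it. */
    uint64_t latency = now_ref - sample_ref;

    /* By now the host clock has advanced by roughly that amount. */
    return host_time + latency;
}
```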

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-09-08 19:53:07 +0800
2e338f7e0 Drivers: hv: utils: Use TimeSync samples to adjust the clock after boot.

Only the first 50 samples after boot were being used to discipline the
clock. After the first 50 samples, any samples from the host were ignored
and the guest clock would eventually drift from the host clock.

This patch allows TimeSync-enabled guests to continuously synchronize the
clock with the host clock, even after the first 50 samples.

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-09-08 19:53:07 +0800
abeda47eb Drivers: hv: utils: Rename version definitions to reflect protocol version.

Different Windows host versions may reuse the same protocol version when
negotiating the TimeSync, Shutdown, and Heartbeat protocols. We should only
refer to the protocol version to avoid conflating the two concepts.

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-09-08 19:53:07 +0800

07 Sep, 2016

2 commits

0f98829a9 Drivers: hv: vmbus: suppress some "hv_vmbus: Unknown GUID" warnings

Some VMBus devices are not needed by Linux guest[1][2], and, VMBus channels
of Hyper-V Sockets don't really mean usual synthetic devices, so let's
suppress the warnings for them.

[1] https://support.microsoft.com/en-us/kb/2925727
[2] https://msdn.microsoft.com/en-us/library/jj980180(v=winembedded.81).aspx

Signed-off-by: Dexuan Cui
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Dexuan Cui
2016-09-07 18:57:55 +0800
e2e808413 Driver: hv: vmbus: Make mmio resource local

This fixes a sparse warning because hyperv_mmio resources
are only used in this one file and should be static.

Signed-off-by: Stephen Hemminger
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Stephen Hemminger
2016-09-07 18:57:55 +0800

02 Sep, 2016

6 commits

db886e4d2 Drivers: hv: utils: Check VSS daemon is listening before a hot backup

Hyper-V host will send a VSS_OP_HOT_BACKUP request to check if guest is
ready for a live backup/snapshot. The driver should respond to the check
only if the daemon is running and listening to requests. This allows the
host to fall back to standard snapshots in case the VSS daemon is not
running.

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-09-02 23:22:51 +0800
497af84b8 Drivers: hv: utils: Continue to poll VSS channel after handling requests.

Multiple VSS_OP_HOT_BACKUP requests may arrive in quick succession, even
though the host only signals once. The driver was handling the first
request while ignoring the others in the ring buffer. We should poll the
VSS channel after handling a request to continue processing other requests.

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-09-02 23:22:51 +0800
509879bdb Drivers: hv: Introduce a policy for controlling channel affinity

Introduce a mechanism to control how channels will be affinitized. We will
support two policies:

1. HV_BALANCED: All performance critical channels will be distributed
evenly amongst all the available NUMA nodes. Once the Node is assigned,
we will assign the CPU based on a simple round robin scheme.

2. HV_LOCALIZED: Only the primary channels are distributed across all
NUMA nodes. Sub-channels will be in the same NUMA node as the primary
channel. This is the current behaviour.

The default policy will be the HV_BALANCED as it can minimize the remote
memory access on NUMA machines with applications that span NUMA nodes.
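
The difference between the two policies can be sketched as a node picker. This is a simplified model: the struct, the function name and the use of a plain round-robin counter to spread channels across nodes are assumptions, and CPU selection within the node is left out.

```c
#include <stdbool.h>

enum hv_numa_policy { HV_BALANCED, HV_LOCALIZED };

/* Illustrative picker state; num_nodes must be non-zero. */
struct node_picker {
    enum hv_numa_policy policy;
    unsigned num_nodes;  /* number of available NUMA nodes */
    unsigned next_node;  /* round-robin cursor */
};

/*
 * Pick the NUMA node for a performance critical channel.
 * primary_node is only consulted for sub-channels.
 */
static unsigned pick_node(struct node_picker *p, bool is_primary,
                          unsigned primary_node)
{
    unsigned node;

    /* HV_LOCALIZED keeps sub-channels on their primary channel's node. */
    if (p->policy == HV_LOCALIZED && !is_primary)
        return primary_node;

    /* Otherwise spread the channel across nodes in turn. */
    node = p->next_node;
    p->next_node = (p->next_node + 1) % p->num_nodes;
    return node;
}
```

With two nodes, a primary channel and its two sub-channels land on nodes 0, 1, 0 under HV_BALANCED, but all on node 0 under HV_LOCALIZED.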

Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2016-09-02 23:22:51 +0800
f24f0b495 Drivers: hv: ring_buffer: use wrap around mappings in hv_copy{from, to}_ringbuffer()

With wrap around mappings for ring buffers we can always use a single
memcpy() to do the job.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Tested-by: Dexuan Cui
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-09-02 23:22:51 +0800
9988ce685 Drivers: hv: ring_buffer: wrap around mappings for ring buffers

Make it possible to always use a single memcpy() or to provide a direct
link to a packet on the ring buffer by creating virtual mapping for two
copies of the ring buffer with vmap(). Utilize currently empty
hv_ringbuffer_cleanup() to do the unmap.
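
A userspace analogue of the double mapping, as a sketch only: the kernel patch uses vmap(), whereas this model maps an unlinked temporary file twice with mmap(). The helper names are hypothetical, and size is assumed to be a non-zero multiple of the page size.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `size` bytes of backing storage twice, back to back, so that
 * base[i] and base[i + size] refer to the same byte.
 * Returns NULL on failure.
 */
static unsigned char *ring_map_twice(size_t size)
{
    FILE *backing = tmpfile();  /* assumption: a temp file as backing store */
    unsigned char *base;

    if (backing == NULL)
        return NULL;
    if (ftruncate(fileno(backing), (off_t)size) != 0) {
        fclose(backing);
        return NULL;
    }

    /* Reserve 2 * size of address space; the first half maps the file. */
    base = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE, MAP_SHARED,
                fileno(backing), 0);
    if (base == MAP_FAILED) {
        fclose(backing);
        return NULL;
    }

    /* Overlay the second half with the same file pages. */
    if (mmap(base + size, size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fileno(backing), 0) == MAP_FAILED) {
        munmap(base, 2 * size);
        fclose(backing);
        return NULL;
    }

    fclose(backing);  /* the mappings keep the storage alive */
    return base;
}

/* Copy len (<= ring_size) bytes starting at offset with a single memcpy(). */
static void ring_copy_out(void *dst, const unsigned char *ring,
                          size_t ring_size, size_t offset, size_t len)
{
    memcpy(dst, ring + (offset % ring_size), len);
}
```

A read that starts two bytes before the end of the ring and spans four bytes needs no split at the wrap point, because the bytes after the end are the start of the ring again.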

While on it, replace sizeof(struct hv_ring_buffer) check
in hv_ringbuffer_init() with BUILD_BUG_ON() as it is a compile time check.

Signed-off-by: Vitaly Kuznetsov
Tested-by: Dexuan Cui
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-09-02 23:22:51 +0800
98f531b10 Drivers: hv: cleanup vmbus_open() for wrap around mappings

In preparation for doing wrap around mappings for ring buffers cleanup
vmbus_open() function:
- check that ring sizes are PAGE_SIZE aligned (they are for all in-kernel
drivers now);
- kfree(open_info) on error only after we kzalloc() it (not an issue as it
is valid to call kfree(NULL));
- rename poorly named labels;
- use alloc_pages() instead of __get_free_pages() as we need struct page
pointer for future.

Signed-off-by: Vitaly Kuznetsov
Tested-by: Dexuan Cui
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-09-02 23:22:51 +0800

31 Aug, 2016

14 commits

b605c2d91 Drivers: hv: balloon: Use available memory value in pressure report

Reports for available memory should use the si_mem_available() value.
The previous freeram value does not include available page cache memory.

Signed-off-by: Alex Ng
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Alex Ng
2016-08-31 19:05:42 +0800
eece30b9f Drivers: hv: balloon: replace ha_region_mutex with spinlock

lockdep reports possible circular locking dependency when udev is used
for memory onlining:

systemd-udevd/3996 is trying to acquire lock:
((memory_chain).rwsem){++++.+}, at: [] __blocking_notifier_call_chain+0x4e/0xc0

but task is already holding lock:
(&dm_device.ha_region_mutex){+.+.+.}, at: [] hv_memory_notifier+0x5e/0xc0 [hv_balloon]
...

which is probably a false positive because we take and release
ha_region_mutex from memory notifier chain depending on the arg. No real
deadlocks were reported so far (though I'm not really sure about
preemptible kernels...) but we don't really need to hold the mutex
for so long. We use it to protect ha_region_list (and its members) and the
num_pages_onlined counter. None of these operations require us to sleep
and nothing is slow, switch to using spinlock with interrupts disabled.

While on it, replace list_for_each -> list_for_each_entry as we actually
need entries in all these cases, drop meaningless list_empty() checks.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
a132c54cb Drivers: hv: balloon: don't wait for ol_waitevent when memhp_auto_online is enabled

With the recently introduced in-kernel memory onlining
(MEMORY_HOTPLUG_DEFAULT_ONLINE) there is no point in waiting for pages
to come online in the driver and we can get rid of the waiting.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
cb7a5724c Drivers: hv: balloon: account for gaps in hot add regions

I'm observing the following hot add requests from the WS2012 host:

hot_add_req: start_pfn = 0x108200 count = 330752
hot_add_req: start_pfn = 0x158e00 count = 193536
hot_add_req: start_pfn = 0x188400 count = 239616

As the host doesn't specify hot add regions we're trying to create
128Mb-aligned region covering the first request, we create the 0x108000 -
0x160000 region and we add 0x108000 - 0x158e00 memory. The second request
passes the pfn_covered() check, we enlarge the region to 0x108000 -
0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the
third request as it starts at 0x188400 so there is a 0x200 gap which is
not covered. As the end of our region is 0x190000 now it again passes the
pfn_covered() check where we just adjust the covered_end_pfn and make it
0x188400 instead of 0x188200 which means that we'll try to online
0x188200-0x188400 pages but these pages were never assigned to us and we
crash.

We can't react to such requests by creating new hot add regions as it may
happen that the whole suggested range falls into the previously identified
128Mb-aligned area so we'll end up adding nothing or create intersecting
regions and our current logic doesn't allow that. Instead, create a list of
such 'gaps' and check for them in the page online callback.
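
A minimal model of the gap check, using the numbers from the requests above. The struct and function names are illustrative and an array stands in for the list of gaps; it is not the driver's data structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Half-open PFN range [start_pfn, end_pfn). */
struct pfn_range {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/*
 * A page may be onlined only if it lies in the covered part of the
 * region and does not fall into any recorded gap.
 */
static bool pfn_is_backed(unsigned long pfn, struct pfn_range covered,
                          const struct pfn_range *gaps, size_t num_gaps)
{
    size_t i;

    if (pfn < covered.start_pfn || pfn >= covered.end_pfn)
        return false;

    for (i = 0; i < num_gaps; i++) {
        if (pfn >= gaps[i].start_pfn && pfn < gaps[i].end_pfn)
            return false;  /* never assigned to us by the host */
    }
    return true;
}
```

For the trace above, the second request ends at 0x158e00 + 193536 = 0x188200 and the third starts at 0x188400, so the recorded gap would be 0x188200 - 0x188400 and the online callback would skip those 0x200 pages.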

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
7cf3b79ec Drivers: hv: balloon: keep track of where ha_region starts

Windows 2012 (non-R2) does not specify hot add region in hot add requests
and the logic in hot_add_req() is trying to find a 128Mb-aligned region
covering the request. It may also happen that host's requests are not 128Mb
aligned and the created ha_region will start before the first specified
PFN. We can't online these non-present pages but we don't remember the real
start of the region.

This is a regression introduced by the commit 5abbbb75d733 ("Drivers: hv:
hv_balloon: don't lose memory when onlining order is not natural"). While
the idea of keeping the 'moving window' was wrong (as there is no guarantee
that hot add requests come ordered) we should still keep track of
covered_start_pfn. This is not a revert, the logic is different.
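
The alignment arithmetic behind this can be sketched as follows, assuming 4 KiB pages so that a 128 MB block is 0x8000 pages. The macro and function names are illustrative.

```c
/* 128 MB expressed in 4 KiB pages (assumes a 4 KiB page size). */
#define PAGES_PER_128MB 0x8000UL

/* Round a PFN down to the start of its 128 MB block. */
static unsigned long region_start(unsigned long first_pfn)
{
    return first_pfn & ~(PAGES_PER_128MB - 1);
}

/* Round an exclusive end PFN up to the next 128 MB boundary. */
static unsigned long region_end(unsigned long end_pfn)
{
    return (end_pfn + PAGES_PER_128MB - 1) & ~(PAGES_PER_128MB - 1);
}
```

A request starting at PFN 0x108200 yields a region starting at 0x108000, so the first 0x200 pages of the region were never offered by the host. Remembering the real start (covered_start_pfn) is what lets the driver avoid onlining them.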

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
3724287c0 Drivers: hv: vmbus: Implement a mechanism to tag the channel for low latency

On Hyper-V, performance critical channels use the monitor
mechanism to signal the host when the guest posts messages
for the host. This mechanism minimizes the hypervisor intercepts
and also makes the host more efficient in that each time the
host is woken up, it processes a batch of messages as opposed to
just one. The goal here is to improve the throughput and this is at
the expense of increased latency.
Implement a mechanism to let the client driver decide if latency
is important.

Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2016-08-31 19:05:41 +0800
8de0d7e95 Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg()

The current delay between retries is unnecessarily high and is negatively
affecting the time it takes to boot the system.

Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2016-08-31 19:05:41 +0800
ccef9bcc0 Drivers: hv: vmbus: Enable explicit signaling policy for NIC channels

For synthetic NIC channels, enable explicit signaling policy as netvsc wants to
explicitly control when the host is to be signaled.

Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

K. Y. Srinivasan
2016-08-31 19:05:41 +0800
638fea33a Drivers: hv: vmbus: fix the race when querying & updating the percpu list

There is a rare race when we remove an entry from the global list
hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
process_chn_event() -> pcpu_relid2channel() is trying to query the list,
we can get the kernel fault.

Similarly, we also have the issue in the code path: vmbus_process_offer() ->
percpu_channel_enq().

We can resolve the issue by disabling the tasklet when updating the list.

The patch also moves vmbus_release_relid() to a later place where
the channel has been removed from the per-cpu and the global lists.

Reported-by: Rolf Neugebauer
Signed-off-by: Dexuan Cui
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Dexuan Cui
2016-08-31 19:05:41 +0800
e0fa3e5e7 Drivers: hv: utils: fix a race on userspace daemons registration

Background: userspace daemons registration protocol for Hyper-V utilities
drivers has two steps:
1) daemon writes its own version to kernel
2) kernel reads it and replies with module version
at this point we consider the handshake procedure being completed and we
do hv_poll_channel() transitioning the utility device to HVUTIL_READY
state. At this point we're ready to handle messages from kernel.

When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a
single buffer for outgoing message. hvutil_transport_send() puts to this
buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT
to all subsequent calls. The host-guest protocol guarantees there is no more
than one request at a time and we will not get new requests till we reply
to the previous one so this single message buffer is enough.

Now to the race. When we finish negotiation procedure and send kernel
module version to userspace with hvutil_transport_send() it goes into the
above mentioned buffer and if the daemon is slow enough to read it from
there we can get a collision when a request from the host comes, we won't
be able to put anything to the buffer so the request will be lost. To
solve the issue we need to know when the negotiation is really done (when
the version message is read by the daemon) and transition to HVUTIL_READY
state after this happens. Implement a callback on read to support this.
Old style netlink communication is not affected by the change, we don't
really know when these messages are delivered but we don't have a single
message buffer there.
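
The fix can be modeled as a single-slot transport with a callback that fires when the daemon reads the slot. All names below are hypothetical; this is a toy state machine, not the hvutil_transport code.

```c
#include <stdbool.h>
#include <stddef.h>

enum util_state { UTIL_NEGOTIATING, UTIL_READY };

/* Toy transport with one outgoing-message slot. */
struct transport {
    bool slot_full;              /* a message is waiting to be read */
    void (*on_read)(void *ctx);  /* one-shot callback for that message */
    void *ctx;
};

/* Queue a message; fails while the previous one is still unread. */
static int transport_send(struct transport *t,
                          void (*on_read)(void *ctx), void *ctx)
{
    if (t->slot_full)
        return -1;
    t->slot_full = true;
    t->on_read = on_read;
    t->ctx = ctx;
    return 0;
}

/* The daemon reads the slot: free it, then run the callback once. */
static void transport_read(struct transport *t)
{
    void (*cb)(void *ctx) = t->on_read;

    if (!t->slot_full)
        return;
    t->slot_full = false;
    t->on_read = NULL;
    if (cb != NULL)
        cb(t->ctx);
}

/* Callback used for the version message: negotiation is really done. */
static void mark_ready(void *ctx)
{
    *(enum util_state *)ctx = UTIL_READY;
}
```

In this model the device stays in UTIL_NEGOTIATING until the daemon has read the version message, so it is not treated as ready for host requests while the slot is still occupied.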

Reported-by: Barry Davis
Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
396e287fa Drivers: hv: get rid of timeout in vmbus_open()

vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
second timeout in vmbus_open(). The issue is caused by the fact that gpadl
teardown operation won't ever succeed for an opened channel and the timeout
isn't always enough. As a guest, we can always trust the host to respond to
our request (and there is nothing we can do if it doesn't).

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
7cc80c980 Drivers: hv: don't leak memory in vmbus_establish_gpadl()

In some cases create_gpadl_header() allocates submessages but we never
free them.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:41 +0800
4d6376329 Drivers: hv: get rid of redundant messagecount in create_gpadl_header()

We use messagecount only once in vmbus_establish_gpadl() to check if
it is safe to iterate through the submsglist. We can just initialize
the list header in all cases in create_gpadl_header() instead.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:40 +0800
a9f61ca79 Drivers: hv: avoid vfree() on crash

When we crash from NMI context (e.g. after NMI injection from host when
'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit

kernel BUG at mm/vmalloc.c:1530!

as vfree() is denied. While the issue could be solved with in_nmi() check
instead I opted for skipping vfree on all sorts of crashes to reduce the
amount of work which can cause consequent crashes. We don't really need to
free anything on crash.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-08-31 19:05:40 +0800

13 Jun, 2016

1 commit

4b44f2d18 random: add interrupt callback to VMBus IRQ handler

The Hyper-V Linux Integration Services use the VMBus implementation for
communication with the Hypervisor. VMBus registers its own interrupt
handler that completely bypasses the common Linux interrupt handling.
This implies that the interrupt entropy collector is not triggered.

This patch adds the interrupt entropy collection callback into the VMBus
interrupt handler function.

Cc: stable@kernel.org
Signed-off-by: Stephan Mueller
Signed-off-by: Stephan Mueller
Signed-off-by: Theodore Ts'o

Stephan Mueller
2016-06-13 23:54:33 +0800

02 May, 2016

4 commits

d19a55d6e Drivers: hv: balloon: reset host_specified_ha_region

We set host_specified_ha_region = true on certain request but this is a
global state which stays 'true' forever. We need to reset it when we
receive a request where ha_region is not specified. I did not see any
real issues, the bug was found by code inspection.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-05-02 00:23:14 +0800
77c0c9735 Drivers: hv: balloon: don't crash when memory is added in non-sorted order

When we iterate through all HA regions in handle_pg_range() we have an
assumption that all these regions are sorted in the list and the
'start_pfn >= has->end_pfn' check is enough to find the proper region.
Unfortunately it's not the case with WS2016 where host can hot-add regions
in a different order. We end up modifying the wrong HA region and crashing
later on pages online. Modify the check to make sure we found the region
we were searching for while iterating. Fix the same check in pfn_covered()
as well.
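
A sketch of a lookup that does not depend on ordering: a region matches only if the PFN lies inside its bounds. The struct and function names are illustrative, and an array stands in for the region list.

```c
#include <stddef.h>

/* Half-open hot-add region [start_pfn, end_pfn). */
struct ha_region {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/*
 * Find the region containing pfn. Checking both bounds makes the result
 * independent of the order in which regions were added.
 */
static const struct ha_region *find_region(const struct ha_region *regions,
                                           size_t n, unsigned long pfn)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pfn < regions[i].start_pfn || pfn >= regions[i].end_pfn)
            continue;  /* not this region, keep searching */
        return &regions[i];
    }
    return NULL;
}
```

With regions added as 0x20000 - 0x28000 and then 0x10000 - 0x18000, a check of only 'start_pfn >= has->end_pfn' would accept the first region for PFN 0x10400; the two-sided check skips it and finds the second.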

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-05-02 00:23:14 +0800
cd95aad55 Drivers: hv: vmbus: handle various crash scenarios

Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always
delivered to the CPU which was used for initial contact or to CPU0
depending on host version. vmbus_wait_for_unload() doesn't account for
the fact that in case we're crashing on some other CPU we won't get the
CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will
never end.

Do the following:
1) Check for completion_done() in the loop. In case interrupt handler is
still alive we'll get the confirmation we need.

2) Read the message pages of all CPUs as we're unsure where
CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with
still-alive interrupt handler doing the same, add cmpxchg() to
vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message.

3) Cleanup message pages on all CPUs. This is required (at least for the
current CPU as we're clearing CPU0 messages now but we may want to bring
up additional CPUs on crash) as new messages won't be delivered till we
consume what's pending. On boot we'll place message pages somewhere else
and we won't be able to read stale messages.
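
The compare-and-exchange in step 2 can be illustrated with C11 atomics. This is a userspace sketch: MSG_NONE and the function name are hypothetical, and the kernel uses its own cmpxchg() primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MSG_NONE 0u  /* hypothetical "slot is free" marker */

/*
 * Mark a message slot as consumed, but only if it still holds the
 * message type we processed. Returns true if we freed the slot; false
 * means someone else (for example a still-alive interrupt handler)
 * already freed it and a new message may have arrived, which must not
 * be overwritten.
 */
static bool signal_eom(_Atomic uint32_t *message_type, uint32_t handled_type)
{
    uint32_t expected = handled_type;

    return atomic_compare_exchange_strong(message_type, &expected, MSG_NONE);
}
```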

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-05-02 00:23:14 +0800
4dbfc2e68 Drivers: hv: kvp: fix IP Failover

Hyper-V VMs can be replicated to other hosts and there is a feature to
set different IP for replicas, it is called 'Failover TCP/IP'. When
such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon
as we finish negotiation procedure. The problem is that it can happen (and
it actually happens) before userspace daemon connects and we reply with
HV_E_FAIL to the message. As there are no repetitions we fail to set the
requested IP.

Solve the issue by postponing our reply to the negotiation message till
userspace daemon is connected. We can't wait too long as there is a
host-side timeout (about 75 seconds) and if we fail to reply in this time
frame the whole KVP service will become inactive. The solution is not
ideal - if it takes userspace daemon more than 60 seconds to connect
IP Failover will still fail but I don't see a solution with our current
separation between kernel and userspace parts.

Other two modules (VSS and FCOPY) don't require such delay, leave them
untouched.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2016-05-02 00:23:14 +0800