30 Mar, 2017

1 commit

  • commit 5e030d5ce9d99a899b648413139ff65bab12b038 upstream.

    When we close a channel that has been rescinded, we will leak memory since
    vmbus_teardown_gpadl() returns an error. Fix this so that we can properly
    cleanup the memory allocated to the ring buffers.

    Fixes: ccb61f8a99e6 ("Drivers: hv: vmbus: Fix a rescind handling bug")

    Signed-off-by: K. Y. Srinivasan
    Cc: Dexuan Cui
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     

12 Mar, 2017

2 commits

  • commit ccb61f8a99e6c29df4fb96a65dad4fad740d5be9 upstream.

    The host can rescind a channel that has been offered to the
    guest and once the channel is rescinded, the host does not
    respond to any requests on that channel. Deal with the case where
    the guest may be blocked waiting for a response from the host.

    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • commit c0bb03924f1a80e7f65900e36c8e6b3dc167c5f8 upstream.

    DoS protection conditions were altered in WS2016 and now it's easy to get
    -EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
    netvsc device in a loop). All vmbus_post_msg() callers don't retry the
    operation and we usually end up with a non-functional device or crash.

    While host's DoS protection conditions are unknown to me my tests show that
    it can take up to 10 seconds before the message is sent so doing udelay()
    is not an option, we really need to sleep. Almost all vmbus_post_msg()
    callers are ready to sleep but there is one special case:
    vmbus_initiate_unload() which can be called from interrupt/NMI context and
    we can't sleep there. I'm also not sure about the lonely
    vmbus_send_tl_connect_request() which has no in-tree users but its external
    users are most likely waiting for the host to reply so sleeping there is
    also appropriate.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     

15 Feb, 2017

3 commits

  • commit 3372592a140db69fd63837e81f048ab4abf8111e upstream.

    Signal the host when we determine the host is to be signaled -
    on th read path. The currrent code determines the need to signal in the
    ringbuffer code and actually issues the signal elsewhere. This can result
    in the host viewing this interrupt as spurious since the host may also
    poll the channel. Make the necessary adjustments.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • commit 1f6ee4e7d83586c8b10bd4f2f4346353d04ce884 upstream.

    Signal the host when we determine the host is to be signaled.
    The currrent code determines the need to signal in the ringbuffer
    code and actually issues the signal elsewhere. This can result
    in the host viewing this interrupt as spurious since the host may also
    poll the channel. Make the necessary adjustments.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • commit 74198eb4a42c4a3c4fbef08fa01a291a282f7c2e upstream.

    One of the factors that can result in the host concluding that a given
    guest in mounting a DOS attack is if the guest generates interrupts
    to the host when the host is not expecting it. If these "spurious"
    interrupts reach a certain rate, the host can throttle the guest to
    minimize the impact. The host computation of the "expected number
    of interrupts" is strictly based on the ring transitions. Until
    the host logic is fixed, base the guest logic to interrupt solely
    on the ring state.

    Signed-off-by: K. Y. Srinivasan
    Cc: Rolf Neugebauer
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     

02 Sep, 2016

2 commits

  • Make it possible to always use a single memcpy() or to provide a direct
    link to a packet on the ring buffer by creating virtual mapping for two
    copies of the ring buffer with vmap(). Utilize currently empty
    hv_ringbuffer_cleanup() to do the unmap.

    While on it, replace sizeof(struct hv_ring_buffer) check
    in hv_ringbuffer_init() with BUILD_BUG_ON() as it is a compile time check.

    Signed-off-by: Vitaly Kuznetsov
    Tested-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • In preparation for doing wrap around mappings for ring buffers cleanup
    vmbus_open() function:
    - check that ring sizes are PAGE_SIZE aligned (they are for all in-kernel
    drivers now);
    - kfree(open_info) on error only after we kzalloc() it (not an issue as it
    is valid to call kfree(NULL);
    - rename poorly named labels;
    - use alloc_pages() instead of __get_free_pages() as we need struct page
    pointer for future.

    Signed-off-by: Vitaly Kuznetsov
    Tested-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     

31 Aug, 2016

6 commits

  • On Hyper-V, performance critical channels use the monitor
    mechanism to signal the host when the guest posts mesages
    for the host. This mechanism minimizes the hypervisor intercepts
    and also makes the host more efficient in that each time the
    host is woken up, it processes a batch of messages as opposed to
    just one. The goal here is improve the throughput and this is at
    the expense of increased latency.
    Implement a mechanism to let the client driver decide if latency
    is important.

    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • For synthetic NIC channels, enable explicit signaling policy as netvsc wants to
    explicitly control when the host is to be signaled.

    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • There is a rare race when we remove an entry from the global list
    hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
    percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
    process_chn_event() -> pcpu_relid2channel() is trying to query the list,
    we can get the kernel fault.

    Similarly, we also have the issue in the code path: vmbus_process_offer() ->
    percpu_channel_enq().

    We can resolve the issue by disabling the tasklet when updating the list.

    The patch also moves vmbus_release_relid() to a later place where
    the channel has been removed from the per-cpu and the global lists.

    Reported-by: Rolf Neugebauer
    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     
  • vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
    second timeout in vmbus_open(). The issue is caused by the fact that gpadl
    teardown operation won't ever succeed for an opened channel and the timeout
    isn't always enough. As a guest, we can always trust the host to respond to
    our request (and there is nothing we can do if it doesn't).

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • In some cases create_gpadl_header() allocates submessages but we never
    free them.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • We use messagecount only once in vmbus_establish_gpadl() to check if
    it is safe to iterate through the submsglist. We can just initialize
    the list header in all cases in create_gpadl_header() instead.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     

08 Feb, 2016

3 commits


15 Dec, 2015

6 commits

  • Currently, there is only one user for hv_ringbuffer_read()/
    hv_ringbuffer_peak() functions and the usage of these functions is:
    - insecure as we drop ring_lock between them, someone else (in theory
    only) can acquire it in between;
    - non-optimal as we do a number of things (acquire/release the above
    mentioned lock, calculate available space on the ring, ...) twice and
    this path is performance-critical.

    Remove hv_ringbuffer_peek() moving the logic from __vmbus_recvpacket() to
    hv_ringbuffer_read().

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • vmbus_recvpacket() and vmbus_recvpacket_raw() are almost identical but
    there are two discrepancies:
    1) vmbus_recvpacket() doesn't propagate errors from hv_ringbuffer_read()
    which looks like it is not desired.
    2) There is an error message printed in packetlen > bufferlen case in
    vmbus_recvpacket(). I'm removing it as it is usless for users to see
    such messages and /vmbus_recvpacket_raw() doesn't have it.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • Currently we have two policies for deciding when to signal the host:
    One based on the ring buffer state and the other based on what the
    VMBUS client driver wants to do. Consider the case when the client
    wants to explicitly control when to signal the host. In this case,
    if the client were to defer signaling, we will not be able to signal
    the host subsequently when the client does want to signal since the
    ring buffer state will prevent the signaling. Implement logic to
    have only one signaling policy in force for a given channel.

    Signed-off-by: K. Y. Srinivasan
    Reviewed-by: Haiyang Zhang
    Tested-by: Haiyang Zhang
    Cc: # v4.2+
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • In the path vmbus_onoffer_rescind() -> vmbus_device_unregister() ->
    device_unregister() -> ... -> __device_release_driver(), we can see for a
    device without a driver loaded: dev->driver is NULL, so
    dev->bus->remove(dev), namely vmbus_remove(), isn't invoked.

    As a result, vmbus_remove() -> hv_process_channel_removal() isn't invoked
    and some cleanups(like sending a CHANNELMSG_RELID_RELEASED message to the
    host) aren't done.

    We can demo the issue this way:
    1. rmmod hv_utils;
    2. disable the Heartbeat Integration Service in Hyper-V Manager and lsvmbus
    shows the device disappears.
    3. re-enable the Heartbeat in Hyper-V Manager and modprobe hv_utils, but
    lsvmbus shows the device can't appear again.
    This is because, the host thinks the VM hasn't released the relid, so can't
    re-offer the device to the VM.

    We can fix the issue by moving hv_process_channel_removal()
    from vmbus_close_internal() to vmbus_device_release(), since the latter is
    always invoked on device_unregister(), whether or not the dev has a driver
    loaded.

    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     
  • This fixes an incorrect assumption of channel state in the function.

    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     
  • process_chn_event(), running in the tasklet, can race with
    vmbus_close_internal() in the case of SMP guest, e.g., when the former is
    accessing channel->inbound.ring_buffer, the latter could be freeing the
    ring_buffer pages.

    To resolve the race, we can serialize them by disabling the tasklet when
    the latter is running here.

    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     

05 Aug, 2015

1 commit


13 Jun, 2015

1 commit


25 May, 2015

1 commit

  • In case there was an error reported in the response to the CHANNELMSG_OPENCHANNEL
    call we need to do the cleanup as a vmbus_open() user won't be doing it after
    receiving an error. The cleanup should be done on all failure paths. We also need
    to avoid returning open_info->response.open_result.status as the return value as
    all other errors we return from vmbus_open() are -EXXX and vmbus_open() callers
    are not supposed to analyze host error codes.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     

25 Mar, 2015

3 commits


02 Mar, 2015

6 commits


26 Jan, 2015

3 commits


24 Sep, 2014

2 commits