28 Sep, 2020

3 commits

  • When a guest communicates with the hypervisor, it must use HV_HYP_PAGE to
    calculate the PFN, so introduce a few hvpfn helper functions as the
    counterparts of the page helper functions. This is preparation for
    supporting guests whose PAGE_SIZE is not 4k.

    Signed-off-by: Boqun Feng
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20200916034817.30282-7-boqun.feng@gmail.com
    Signed-off-by: Wei Liu

    Boqun Feng
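
    The PFN arithmetic described in the entry above can be sketched in
    userspace C as follows. This is a minimal illustration, assuming a
    4 KiB hypervisor page; the sketch_* helper names are hypothetical and
    are not the helpers the commit adds.

```c
#include <assert.h>
#include <stdint.h>

/* Hyper-V always uses 4 KiB pages, whatever the guest page size is. */
#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1ULL << HV_HYP_PAGE_SHIFT)

/* Hypothetical helper: Hyper-V PFN containing a guest physical address. */
static inline uint64_t sketch_phys_to_hvpfn(uint64_t paddr)
{
	return paddr >> HV_HYP_PAGE_SHIFT;
}

/* Hypothetical helper: number of Hyper-V pages needed to cover `bytes`. */
static inline uint64_t sketch_hvpfn_up(uint64_t bytes)
{
	return (bytes + HV_HYP_PAGE_SIZE - 1) >> HV_HYP_PAGE_SHIFT;
}

/*
 * Hypothetical helper: first Hyper-V PFN of a guest page, given the guest
 * PFN and the guest page shift (for example 16 for 64 KiB guest pages).
 */
static inline uint64_t sketch_guest_pfn_to_hvpfn(uint64_t guest_pfn,
						 unsigned int guest_page_shift)
{
	assert(guest_page_shift >= HV_HYP_PAGE_SHIFT);
	return guest_pfn << (guest_page_shift - HV_HYP_PAGE_SHIFT);
}
```

    With 64 KiB guest pages, one guest page spans 16 Hyper-V pages, so
    guest PFN 3 starts at Hyper-V PFN 48; with 4 KiB guest pages the two
    numbers coincide.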
     
  • There will be more places other than vmbus where we need to calculate
    the Hyper-V page PFN from a virtual address, so move virt_to_hvpfn() to
    hyperv generic header.

    Signed-off-by: Boqun Feng
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20200916034817.30282-6-boqun.feng@gmail.com
    Signed-off-by: Wei Liu

    Boqun Feng
     
  • This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The
    types of GPADL are purely a guest-side concept, IOW the hypervisor
    treats them as the same.

    The reason for introducing the types for GPADL is to support guests whose
    page size is not 4k (the page size of the Hyper-V hypervisor). In these
    guests, both the headers and the data parts of the ringbuffers need to
    be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be
    mapped into userspace and 2) we use a "double mapping" mechanism to
    support fast wrap-around, and "double mapping" relies on ringbuffers
    being page-aligned. However, the Hyper-V hypervisor only uses 4k
    (HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make
    the headers of ringbuffers take one guest page and, when a GPADL is
    established between the guest and hypervisor, only the first 4k of the
    header is used. To handle this special case, we need the types of GPADL
    to distinguish the different guest memory usages for GPADL.

    A type enum is introduced along with several general interfaces to
    describe the differences between normal buffer GPADL and ringbuffer
    GPADL.

    Signed-off-by: Boqun Feng
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20200916034817.30282-4-boqun.feng@gmail.com
    Signed-off-by: Wei Liu

    Boqun Feng
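
    The size bookkeeping implied by the GPADL types above can be sketched
    as follows. This is an illustration under a simplifying assumption,
    namely that a ring GPADL covers exactly one ring buffer; the enum and
    function names are hypothetical, not the interfaces the patch adds.

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SIZE 4096u

/* Guest-side GPADL kinds; the hypervisor does not see this distinction. */
enum sketch_gpadl_type {
	SKETCH_GPADL_BUFFER,
	SKETCH_GPADL_RING,
};

/*
 * Bytes of a guest allocation that are actually described to the
 * hypervisor. Assumption: a RING GPADL holds one ring buffer whose header
 * occupies a full guest page, of which only the first 4 KiB is shared.
 */
static uint32_t sketch_gpadl_size(enum sketch_gpadl_type type,
				  uint32_t size, uint32_t guest_page_size)
{
	switch (type) {
	case SKETCH_GPADL_RING:
		/* Ring buffers must be a whole number of guest pages. */
		assert(size % guest_page_size == 0 && size >= guest_page_size);
		return size - (guest_page_size - HV_HYP_PAGE_SIZE);
	case SKETCH_GPADL_BUFFER:
	default:
		return size;
	}
}
```

    On a guest with 4 KiB pages the two types yield the same size, which
    is why the distinction only matters for larger guest page sizes.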
     

20 Jun, 2020

1 commit

  • The spinlock is (now) *not* used to protect test-and-set accesses
    to attributes of the structure or sc_list operations.

    There is, AFAICT, a distinct lack of {WRITE,READ}_ONCE()s in the
    handling of channel->state, but the changes below do not seem to
    make things "worse". ;-)

    Signed-off-by: Andrea Parri (Microsoft)
    Link: https://lore.kernel.org/r/20200617164642.37393-9-parri.andrea@gmail.com
    Reviewed-by: Michael Kelley
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
     

19 Jun, 2020

2 commits


23 May, 2020

1 commit

  • init_vp_index() uses the (per-node) hv_numa_map[] masks to record the
    CPUs allocated for channel interrupts at a given time, and distribute
    the performance-critical channels across the available CPUs: in
    particular, the mask of "candidate" target CPUs in a given NUMA node,
    for a newly offered channel, is determined by XOR-ing the node's CPU
    mask and the node's hv_numa_map. This operation/mechanism assumes that
    no offline CPU is set in the hv_numa_map mask, an assumption that does
    not hold since such a mask is currently not updated when a channel is
    removed or assigned to a different CPU.

    To address the issues described above, this adds hooks in the channel
    removal path (hv_process_channel_removal()) and in target_cpu_store()
    in order to clear, resp. to update, the hv_numa_map[] masks as needed.
    This also adds a (missed) update of the masks in init_vp_index() (cf.,
    e.g., the memory-allocation failure path in this function).

    As in the case of init_vp_index(), such hooks need to determine
    whether the given channel is performance critical. init_vp_index() does
    this by parsing the channel's offer; it cannot rely on the device
    data structure (device_obj) to retrieve such information because the
    device data structure has not been allocated/linked with the channel
    by the time that init_vp_index() executes. A similar situation may
    hold in hv_is_alloced_cpu() (defined below); the adopted approach is
    to "cache" the device type of the channel, as computed by parsing the
    channel's offer, in the channel structure itself.

    Fixes: 7527810573436f ("Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type")
    Signed-off-by: Andrea Parri (Microsoft)
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20200522171901.204127-3-parri.andrea@gmail.com
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
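
    Why a stale mask breaks the XOR step described above can be shown with
    a toy bitmask model. This is only a sketch: a 64-bit integer stands in
    for the kernel's CPU mask type, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a toy stand-in for the kernel's CPU mask type. */
typedef uint64_t sketch_cpumask;

/* Candidate CPUs as described above: node CPUs XOR already-allocated CPUs. */
static sketch_cpumask sketch_candidates(sketch_cpumask node_online,
					sketch_cpumask alloced)
{
	return node_online ^ alloced;
}

/* Hook for channel removal or re-targeting: forget the old CPU. */
static sketch_cpumask sketch_clear_cpu(sketch_cpumask alloced, unsigned int cpu)
{
	assert(cpu < 64);
	return alloced & ~(1ULL << cpu);
}
```

    If CPUs 0-2 are online (0x7) while the allocation mask still records
    CPUs 0 and 3 (0x9), the XOR yields 0xE, which wrongly offers the
    offline CPU 3 as a candidate. Clearing CPU 3 from the allocation mask
    first gives 0x6, i.e. only the online, unallocated CPUs 1 and 2.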
     

20 May, 2020

2 commits

  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
            int stuff;
            struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    sizeof(flexible-array-member) triggers a warning because flexible array
    members have incomplete type[1]. There are some instances of code in
    which the sizeof operator is being incorrectly/erroneously applied to
    zero-length arrays and the result is zero. Such instances may be hiding
    some bugs. So, this work (flexible-array member conversions) will also
    help to get completely rid of those sorts of issues.

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Link: https://lore.kernel.org/r/20200507185323.GA14416@embeddedor
    Signed-off-by: Wei Liu

    Gustavo A. R. Silva
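
    A minimal, self-contained C99 sketch of allocating a structure that
    ends in a flexible array member, as discussed in the entry above. The
    struct and function names are illustrative only, and the sketch omits
    the overflow checking a real allocation of this kind should have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_packet {
	size_t count;
	int items[];	/* flexible array member: must be the last member */
};

/* Allocate a packet with room for `count` items after the fixed header. */
static struct sketch_packet *sketch_packet_alloc(size_t count)
{
	struct sketch_packet *p;

	/* sizeof(*p) covers the header only; the array adds count elements. */
	p = malloc(sizeof(*p) + count * sizeof(p->items[0]));
	if (!p)
		return NULL;
	p->count = count;
	for (size_t i = 0; i < count; i++)
		p->items[i] = (int)i;
	return p;
}
```

    Applying sizeof to the whole structure is fine and excludes the
    array, whereas applying sizeof to the items member itself is rejected
    by the compiler because the member has incomplete type.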
     
  • For each storvsc_device, storvsc keeps track of the channel target CPUs
    associated to the device (alloced_cpus) and it uses this information to
    fill a "cache" (stor_chns) mapping CPU->channel according to a certain
    heuristic. Update the alloced_cpus mask and the stor_chns array when a
    channel of the storvsc device is re-assigned to a different CPU.

    Signed-off-by: Andrea Parri (Microsoft)
    Cc: "James E.J. Bottomley"
    Cc: "Martin K. Petersen"
    Cc:
    Link: https://lore.kernel.org/r/20200406001514.19876-12-parri.andrea@gmail.com
    Reviewed-by: Long Li
    Reviewed-by: Michael Kelley
    [ wei: fix a small issue reported by kbuild test robot ]
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
     

23 Apr, 2020

5 commits

  • VMBus version 4.1 and later support the CHANNELMSG_MODIFYCHANNEL(22)
    message type which can be used to request Hyper-V to change the vCPU
    that a channel will interrupt.

    Introduce the CHANNELMSG_MODIFYCHANNEL message type, and define the
    vmbus_send_modifychannel() function to send CHANNELMSG_MODIFYCHANNEL
    requests to the host via a hypercall. The function is then used to
    define a sysfs "store" operation, which allows changing the (v)CPU
    the channel will interrupt by using the sysfs interface. The feature
    can be used for load balancing or other purposes.

    One interesting catch here is that Hyper-V can *not* currently ACK
    CHANNELMSG_MODIFYCHANNEL messages with the promise that (after the ACK
    is sent) the channel won't send any more interrupts to the "old" CPU.

    The peculiarity of the CHANNELMSG_MODIFYCHANNEL messages is problematic
    if the user wants to take a CPU offline, since we don't want to take a
    CPU offline (and, potentially, "lose" channel interrupts on such CPU)
    if the host is still processing a CHANNELMSG_MODIFYCHANNEL message
    associated to that CPU.

    It is worth mentioning, however, that we have been unable to observe
    the above mentioned "race": in all our tests, CHANNELMSG_MODIFYCHANNEL
    requests appeared *as if* they were processed synchronously by the host.

    Suggested-by: Michael Kelley
    Signed-off-by: Andrea Parri (Microsoft)
    Link: https://lore.kernel.org/r/20200406001514.19876-11-parri.andrea@gmail.com
    Reviewed-by: Michael Kelley
    [ wei: fix conflict in channel_mgmt.c ]
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
     
  • The logic is unused since commit 509879bdb30b8 ("Drivers: hv: Introduce
    a policy for controlling channel affinity").

    This logic assumes that a channel target_cpu doesn't change during the
    lifetime of a channel, but this assumption is incompatible with the new
    functionality that allows changing the vCPU a channel will interrupt.

    Signed-off-by: Andrea Parri (Microsoft)
    Link: https://lore.kernel.org/r/20200406001514.19876-9-parri.andrea@gmail.com
    Reviewed-by: Michael Kelley
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
     
  • Since vmbus_chan_sched() dereferences the ring buffer pointer, we have
    to make sure that the ring buffer data structures don't get freed while
    such dereferencing is happening. Current code does this by sending an
    IPI to the CPU that is allowed to access that ring buffer from interrupt
    level, cf., vmbus_reset_channel_cb(). But with the new functionality
    to allow changing the CPU that a channel will interrupt, we can't be
    sure what CPU will be running the vmbus_chan_sched() function for a
    particular channel, so the current IPI mechanism is infeasible.

    Instead synchronize vmbus_chan_sched() and vmbus_reset_channel_cb() by
    using the (newly introduced) per-channel spin lock "sched_lock". Move
    the test for onchannel_callback being NULL before the "switch" control
    statement in vmbus_chan_sched(), in order to not access the ring buffer
    if the vmbus_reset_channel_cb() has been completed on the channel.

    Suggested-by: Michael Kelley
    Signed-off-by: Andrea Parri (Microsoft)
    Link: https://lore.kernel.org/r/20200406001514.19876-7-parri.andrea@gmail.com
    Reviewed-by: Michael Kelley
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
     
  • When Hyper-V sends an interrupt to the guest, the guest has to figure
    out which channel the interrupt is associated with. Hyper-V sets a bit
    in a memory page that is shared with the guest, indicating a particular
    "relid" that the interrupt is associated with. The current Linux code
    then uses a set of per-CPU linked lists to map a given "relid" to a
    pointer to a channel structure.

    This design introduces a synchronization problem if the CPU that Hyper-V
    will interrupt for a certain channel is changed. If the interrupt comes
    on the "old CPU" and the channel was already moved to the per-CPU list
    of the "new CPU", then the relid -> channel mapping will fail and the
    interrupt is dropped. Similarly, if the interrupt comes on the new CPU
    but the channel was not moved to the per-CPU list of the new CPU, then
    the mapping will fail and the interrupt is dropped.

    Relids are integers ranging from 0 to 2047. The mapping from relids to
    channel structures can be done by setting up an array with 2048 entries,
    each entry being a pointer to a channel structure (hence total size ~16K
    bytes, which is not a problem). The array is global, so there are no
    per-CPU linked lists to update. The array can be searched and updated
    by loading from/storing to the array at the specified index. With no
    per-CPU data structures, the above mentioned synchronization problem is
    avoided and the relid2channel() function gets simpler.

    Suggested-by: Michael Kelley
    Signed-off-by: Andrea Parri (Microsoft)
    Link: https://lore.kernel.org/r/20200406001514.19876-4-parri.andrea@gmail.com
    Reviewed-by: Michael Kelley
    Signed-off-by: Wei Liu

    Andrea Parri (Microsoft)
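
    The global relid-to-channel array described in the entry above can be
    sketched as follows. The names are hypothetical, and the sketch leaves
    out the locking and memory-ordering concerns that kernel code would
    have to handle.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_RELIDS 2048

struct sketch_channel {
	unsigned int relid;
};

/* Global table: the index is the relid, the value the channel (or NULL). */
static struct sketch_channel *sketch_channels[SKETCH_MAX_RELIDS];

/* Publish a channel under its relid; returns -1 if the relid is invalid. */
static int sketch_map_relid(struct sketch_channel *ch)
{
	if (ch->relid >= SKETCH_MAX_RELIDS)
		return -1;
	sketch_channels[ch->relid] = ch;
	return 0;
}

/* Remove the mapping for a relid, if it is in range. */
static void sketch_unmap_relid(unsigned int relid)
{
	if (relid < SKETCH_MAX_RELIDS)
		sketch_channels[relid] = NULL;
}

/* Lookup is a single bounds check plus one array load. */
static struct sketch_channel *sketch_relid2channel(unsigned int relid)
{
	if (relid >= SKETCH_MAX_RELIDS)
		return NULL;
	return sketch_channels[relid];
}
```

    Because the table is not per-CPU, moving a channel's interrupt to
    another CPU needs no table update at all, which is the property the
    commit relies on. With 8-byte pointers the table takes 2048 * 8 =
    16384 bytes, matching the ~16K figure above.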
     
  • vmbus_onmessage() doesn't need the header of the message; it only
    uses it to get to the payload, so we can pass the pointer to the
    payload directly.

    Signed-off-by: Vitaly Kuznetsov
    Reviewed-by: Michael Kelley
    Link: https://lore.kernel.org/r/20200406104154.45010-4-vkuznets@redhat.com
    Signed-off-by: Wei Liu

    Vitaly Kuznetsov
     

27 Jan, 2020

1 commit

  • Add util_pre_suspend() and util_pre_resume() for some hv_utils devices
    (e.g. kvp/vss/fcopy), because they need special handling before
    util_suspend() calls vmbus_close().

    For kvp, all the possible pending work items should be cancelled.

    For vss and fcopy, some extra clean-up needs to be done, i.e. fake a
    THAW message for hv_vss_daemon and fake a CANCEL_FCOPY message for
    hv_fcopy_daemon, otherwise when the VM resumes, the daemons
    can end up in an inconsistent state (i.e. the file systems are
    frozen but will never be thawed; the file transmitted via fcopy
    may not be complete). Note: there is an extra patch for the daemons:
    "Tools: hv: Reopen the devices if read() or write() returns errors",
    because the hv_utils driver can not guarantee the whole transaction
    finishes completely once util_suspend() starts to run (at this time,
    all the userspace processes are frozen).

    util_probe() disables channel->callback_event to avoid the race with
    the channel callback.

    Signed-off-by: Dexuan Cui
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Dexuan Cui
     

26 Jan, 2020

1 commit

  • When a Linux hv_sock app tries to connect to a Service GUID on which no
    host app is listening, a recent host (RS3+) sends a
    CHANNELMSG_TL_CONNECT_RESULT (23) message to Linux and this triggers such
    a warning:

    unknown msgtype=23
    WARNING: CPU: 2 PID: 0 at drivers/hv/vmbus_drv.c:1031 vmbus_on_msg_dpc

    Actually Linux can safely ignore the message because the Linux app's
    connect() will time out in 2 seconds: see VSOCK_DEFAULT_CONNECT_TIMEOUT
    and vsock_stream_connect(). We don't bother to make use of the message
    because: 1) it's only supported on recent hosts; 2) a non-trivial effort
    is required to use the message in Linux, but the benefit is small.

    So, let's not see the warning by silently ignoring the message.

    Signed-off-by: Dexuan Cui
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Dexuan Cui
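
    One way to picture "known but deliberately ignored" message types, as
    in the entry above, is a dispatch table whose entries may carry no
    handler. This is a hypothetical sketch, not the driver's actual table;
    only the message number 23 is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MSG_TL_CONNECT_RESULT 23
#define SKETCH_MSG_COUNT             24

enum sketch_result { SKETCH_HANDLED, SKETCH_IGNORED, SKETCH_UNKNOWN };

typedef void (*sketch_handler)(const void *payload);

struct sketch_entry {
	bool known;
	sketch_handler handler;	/* NULL for known-but-ignored types */
};

static void sketch_noop(const void *payload)
{
	(void)payload;
}

static const struct sketch_entry sketch_table[SKETCH_MSG_COUNT] = {
	[1] = { true, sketch_noop },	/* illustrative handled type */
	[SKETCH_MSG_TL_CONNECT_RESULT] = { true, NULL },
};

static enum sketch_result sketch_dispatch(unsigned int type,
					  const void *payload)
{
	if (type >= SKETCH_MSG_COUNT || !sketch_table[type].known)
		return SKETCH_UNKNOWN;	/* this is where a warning would fire */
	if (!sketch_table[type].handler)
		return SKETCH_IGNORED;	/* silently dropped */
	sketch_table[type].handler(payload);
	return SKETCH_HANDLED;
}
```

    Registering type 23 with an empty handler keeps the warning path for
    message types that are genuinely unknown.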
     

22 Nov, 2019

3 commits

  • Introduce user-specified latency in the packet reception path
    by exposing the test parameters as part of the debugfs channel
    attributes. We will control the testing state via these attributes.

    Signed-off-by: Branden Bonaby
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Branden Bonaby
     
  • Hyper-V has added VMBus protocol versions 5.1 and 5.2 in recent release
    versions. Allow Linux guests to negotiate these new protocol versions
    on versions of Hyper-V that support them. While on this, also allow
    guests to negotiate the VMBus protocol version 4.1 (which was missing).

    Signed-off-by: Andrea Parri
    Reviewed-by: Wei Liu
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Andrea Parri
     
  • The technique used to get the next VMBus version seems increasingly
    clumsy as the number of VMBus versions increases. Performance is
    not a concern since this is only done once during system boot; it's
    just that we'll end up with more lines of code than is really needed.

    As an alternative, introduce a table with the version numbers listed
    in order (from the most recent to the oldest). vmbus_connect() loops
    through the versions listed in the table until it gets an accepted
    connection or gets to the end of the table (invalid version).

    Suggested-by: Michael Kelley
    Signed-off-by: Andrea Parri
    Reviewed-by: Wei Liu
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Andrea Parri
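
    The table-driven negotiation described in the entry above might look
    roughly like the sketch below. The version encoding, the table
    contents and the sketch_host_accepts() stand-in for the host are all
    assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: major number in the high 16 bits, minor in the low. */
#define SKETCH_VERSION(major, minor) (((uint32_t)(major) << 16) | (minor))

/* Illustrative table, ordered from the most recent version to the oldest. */
static const uint32_t sketch_versions[] = {
	SKETCH_VERSION(5, 2),
	SKETCH_VERSION(5, 1),
	SKETCH_VERSION(5, 0),
	SKETCH_VERSION(4, 1),
	SKETCH_VERSION(4, 0),
};

/* Hypothetical stand-in for the host: accepts anything up to host_max. */
static int sketch_host_accepts(uint32_t version, uint32_t host_max)
{
	return version <= host_max;
}

/* Walk the table; return the first accepted version, or 0 if none is. */
static uint32_t sketch_negotiate(uint32_t host_max)
{
	size_t n = sizeof(sketch_versions) / sizeof(sketch_versions[0]);

	for (size_t i = 0; i < n; i++)
		if (sketch_host_accepts(sketch_versions[i], host_max))
			return sketch_versions[i];
	return 0;
}
```

    Supporting a new protocol version then amounts to adding one line at
    the top of the table, with no change to the loop.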
     

25 Sep, 2019

1 commit

  • Pull Hyper-V updates from Sasha Levin:

    - first round of vmbus hibernation support (Dexuan Cui)

    - remove dependencies on PAGE_SIZE (Maya Nakamura)

    - move the hyper-v tools/ code into the tools build system (Andy
    Shevchenko)

    - hyper-v balloon cleanups (Dexuan Cui)

    * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
    Drivers: hv: vmbus: Resume after fixing up old primary channels
    Drivers: hv: vmbus: Suspend after cleaning up hv_sock and sub channels
    Drivers: hv: vmbus: Clean up hv_sock channels by force upon suspend
    Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation
    Drivers: hv: vmbus: Ignore the offers when resuming from hibernation
    Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation
    Drivers: hv: vmbus: Add a helper function is_sub_channel()
    Drivers: hv: vmbus: Suspend/resume the synic for hibernation
    Drivers: hv: vmbus: Break out synic enable and disable operations
    HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
    Tools: hv: move to tools buildsystem
    hv_balloon: Reorganize the probe function
    hv_balloon: Use a static page for the balloon_up send buffer

    Linus Torvalds
     

07 Sep, 2019

3 commits


22 Aug, 2019

2 commits

  • This interface driver is a helper driver that allows other drivers to
    have a common interface with the Hyper-V PCI frontend driver.

    Signed-off-by: Haiyang Zhang
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Haiyang Zhang
     
  • Windows SR-IOV provides a backchannel mechanism in software for communication
    between a VF driver and a PF driver. These "configuration blocks" are
    similar in concept to PCI configuration space, but instead of doing reads and
    writes in 32-bit chunks through a very slow path, packets of up to 128 bytes
    can be sent or received asynchronously.

    Nearly every SR-IOV device contains just such a communications channel in
    hardware, so using this one in software is usually optional. Using the
    software channel, however, allows driver implementers to leverage software
    tools that fuzz the communications channel looking for vulnerabilities.

    The usage model for these packets puts the responsibility for reading or
    writing on the VF driver. The VF driver sends a read or a write packet,
    indicating which "block" is being referred to by number.

    If the PF driver wishes to initiate communication, it can "invalidate" one or
    more of the first 64 blocks. This invalidation is delivered via a callback
    supplied to this driver by the VF driver.

    No protocol is implied, except that supplied by the PF and VF drivers.

    Signed-off-by: Jake Oshins
    Signed-off-by: Dexuan Cui
    Cc: Haiyang Zhang
    Cc: K. Y. Srinivasan
    Cc: Stephen Hemminger
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Haiyang Zhang
    Signed-off-by: David S. Miller

    Dexuan Cui
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms and conditions of the gnu general public license
    version 2 as published by the free software foundation this program
    is distributed in the hope it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 59 temple place suite 330 boston ma 02111
    1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 33 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190530000435.254582722@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

11 Apr, 2019

1 commit

  • Fix a race condition that can result in a ring buffer pointer being set
    to null while a "_show" function is reading the ring buffer's data. This
    problem was discussed here: https://lkml.org/lkml/2018/10/18/779

    To fix the race condition, add a new mutex lock to the
    "hv_ring_buffer_info" struct. Add a new function,
    "hv_ringbuffer_pre_init()", where a channel's inbound and outbound
    ring_buffer_info mutex locks are initialized.

    Acquire/release the locks in the "hv_ringbuffer_cleanup()" function,
    which is where the ring buffer pointers are set to null.

    Acquire/release the locks in the four channel-level "_show" functions
    that access ring buffer data. Remove the "const" qualifier from the
    "vmbus_channel" parameter and the "rbi" variable of the channel-level
    "_show" functions so that the locks can be acquired/released in these
    functions.

    Acquire/release the locks in hv_ringbuffer_get_debuginfo(). Remove the
    "const" qualifier from the "hv_ring_buffer_info" parameter so that the
    locks can be acquired/released in this function.

    Signed-off-by: Kimberly Brown
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Kimberly Brown
     

15 Feb, 2019

2 commits

  • Counter values for per-channel interrupts and ring buffer full
    conditions are useful for investigating performance.

    Expose counters in sysfs for 2 types of guest to host interrupts:
    1) Interrupts caused by the channel's outbound ring buffer transitioning
    from empty to not empty
    2) Interrupts caused by the channel's inbound ring buffer transitioning
    from full to not full while a packet is waiting for enough buffer space to
    become available

    Expose 2 counters in sysfs for the number of times that write operations
    encountered a full outbound ring buffer:
    1) The total number of write operations that encountered a full
    condition
    2) The number of write operations that were the first to encounter a
    full condition

    Increment the outbound full condition counters in the
    hv_ringbuffer_write() function because, for most drivers, a full
    outbound ring buffer is detected in that function. Also increment the
    outbound full condition counters in the set_channel_pending_send_size()
    function. In the hv_sock driver, a full outbound ring buffer is detected
    and set_channel_pending_send_size() is called before
    hv_ringbuffer_write() is called.

    I tested this patch by confirming that the sysfs files were created and
    observing the counter values. The values seemed to increase by a
    reasonable amount when the Hyper-v related drivers were in use.

    Signed-off-by: Kimberly Brown
    Reviewed-by: Michael Kelley
    Signed-off-by: Sasha Levin

    Kimberly Brown
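
    The difference between the two "ring buffer full" counters in the
    entry above can be sketched with a toy ring that tracks only free
    space. The structure and field names are assumptions for illustration
    and do not reproduce the driver's write path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_ring {
	uint32_t free_bytes;	 /* space currently available for writing */
	uint64_t out_full_total; /* writes that hit a full ring */
	uint64_t out_full_first; /* writes that were first to hit a full ring */
	bool out_full_flag;	 /* set while the ring is considered full */
};

/* Returns 0 on success, -1 if the ring cannot take `len` bytes. */
static int sketch_ring_write(struct sketch_ring *r, uint32_t len)
{
	if (len > r->free_bytes) {
		r->out_full_total++;
		if (!r->out_full_flag) {
			/* First failure since the last successful write. */
			r->out_full_first++;
			r->out_full_flag = true;
		}
		return -1;
	}
	r->out_full_flag = false;
	r->free_bytes -= len;
	return 0;
}
```

    Repeated retries against a full ring raise only the total counter,
    while the "first" counter rises once per full episode, so the ratio
    of the two hints at how long writers stay blocked.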
     
  • There are new types and helpers that are supposed to be used in new code.

    As a preparation to get rid of legacy types and API functions do
    the conversion here.

    Cc: "K. Y. Srinivasan"
    Cc: Haiyang Zhang
    Cc: Stephen Hemminger
    Cc: devel@linuxdriverproject.org
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Michael Kelley
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Sasha Levin

    Andy Shevchenko
     

10 Jan, 2019

1 commit

  • fc96df16a1ce is good and can already fix the "return stack garbage" issue,
    but let's also improve hv_ringbuffer_get_debuginfo(), which would silently
    return stack garbage, if people forget to check channel->state or
    ring_info->ring_buffer, when using the function in the future.

    Having an error check in the function would eliminate the potential risk.

    Add a Fixes tag to indicate the patch dependency.

    Fixes: fc96df16a1ce ("Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels")
    Cc: stable@vger.kernel.org
    Cc: K. Y. Srinivasan
    Cc: Haiyang Zhang
    Signed-off-by: Stephen Hemminger
    Signed-off-by: Dexuan Cui
    Signed-off-by: Sasha Levin

    Dexuan Cui
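
    The defensive check described in the entry above can be sketched as
    below. The types are simplified stand-ins, and zeroing the output
    before the check is an extra precaution chosen for this sketch rather
    than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_ring_info {
	const uint32_t *ring_buffer;	/* NULL until the channel is opened */
	uint32_t read_index;
	uint32_t write_index;
};

struct sketch_debug_info {
	uint32_t read_index;
	uint32_t write_index;
};

/* Fill `out` and return 0, or return -EINVAL if the ring is not set up. */
static int sketch_get_debuginfo(const struct sketch_ring_info *ri,
				struct sketch_debug_info *out)
{
	/* Sketch-only precaution: never hand back uninitialized memory. */
	memset(out, 0, sizeof(*out));
	if (!ri->ring_buffer)
		return -EINVAL;
	out->read_index = ri->read_index;
	out->write_index = ri->write_index;
	return 0;
}
```

    A caller that forgets to check the channel state now gets an error
    code instead of whatever happened to be on its stack.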
     

10 Dec, 2018

1 commit


03 Dec, 2018

1 commit

  • vmbus_process_offer() mustn't call channel->sc_creation_callback()
    directly for sub-channels, because sc_creation_callback() ->
    vmbus_open() may never get the host's response to the
    OPEN_CHANNEL message (the host may rescind a channel at any time,
    e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
    may not wake up the vmbus_open() as it's blocked due to a non-zero
    vmbus_connection.offer_in_progress, and finally we have a deadlock.

    The above is also true for primary channels, if the related device
    drivers use sync probing mode by default.

    And, usually the handling of primary channels and sub-channels can
    depend on each other, so we should offload them to different
    workqueues to avoid possible deadlock, e.g. in sync-probing mode,
    NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
    rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
    and waits for all the sub-channels to appear, but the latter
    can't get the rtnl_lock and this blocks the handling of sub-channels.

    The patch can fix the multiple-NIC deadlock described above for
    v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
    of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
    but don't enable async-probing for Hyper-V drivers (yet).

    The patch can also fix the hang issue in sub-channel's handling described
    above for all versions of kernels, including v4.19 and v4.20-rc4.

    So actually the patch should be applied to all the existing kernels,
    not only the kernels that have 8195b1396ec8.

    Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
    Cc: stable@vger.kernel.org
    Cc: Stephen Hemminger
    Cc: K. Y. Srinivasan
    Cc: Haiyang Zhang
    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     

27 Nov, 2018

1 commit

  • Commit d86adf482b84 ("scsi: storvsc: Enable multi-queue support") removed
    the usage of the API in Jan 2017, and the API has not been used since then.

    netvsc and storvsc have their own algorithms to determine the outgoing
    channel, so this API is useless.

    And the API is potentially unsafe, because it reads primary->num_sc without
    any lock held. This can be risky considering the RESCIND-OFFER message.

    Let's remove the API.

    Cc: Long Li
    Cc: Stephen Hemminger
    Cc: K. Y. Srinivasan
    Cc: Haiyang Zhang
    Signed-off-by: Dexuan Cui
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     

26 Sep, 2018

3 commits


12 Sep, 2018

1 commit

  • Add support for overriding the default driver for a VMBus device
    in the same way that it can be done for PCI devices. This patch
    adds the /sys/bus/vmbus/devices/.../driver_override file
    and the logic for matching.

    This is used by driverctl tool to do driver override.
    https://gitlab.com/driverctl/driverctl

    Signed-off-by: Stephen Hemminger
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Stephen Hemminger
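
    The matching rule implied by a driver override can be sketched as
    follows. This is a simplified model, assuming that an override makes
    the named driver the only one allowed to bind; the structures are
    hypothetical and a plain string stands in for the device type GUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct sketch_device {
	const char *type_id;		/* stand-in for the device's GUID */
	const char *driver_override;	/* NULL when no override is set */
};

struct sketch_driver {
	const char *name;
	const char *type_id;		/* type the driver normally binds to */
};

static bool sketch_match(const struct sketch_device *dev,
			 const struct sketch_driver *drv)
{
	/* With an override in place, only the named driver may bind. */
	if (dev->driver_override)
		return strcmp(dev->driver_override, drv->name) == 0;
	/* Otherwise fall back to the ordinary ID-based match. */
	return strcmp(dev->type_id, drv->type_id) == 0;
}
```

    In this model, writing a driver name to the override file is what
    sets driver_override, and clearing it restores ID-based matching.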
     

02 Aug, 2018

1 commit

  • Before setting channel->rescind in vmbus_rescind_cleanup(), we should make
    sure the channel callback won't run any more, otherwise a high-level
    driver like pci_hyperv, which may be infinitely waiting for the host VSP's
    response and notices the channel has been rescinded, can't safely give
    up: e.g., in hv_pci_protocol_negotiation() -> wait_for_response(), it's
    unsafe to exit from wait_for_response() and proceed with the on-stack
    variable "comp_pkt" popped. The issue was originally spotted by
    Michael Kelley.

    In vmbus_close_internal(), the patch also minimizes the range protected by
    disabling/enabling channel->callback_event: we don't really need that for
    the whole function.

    Signed-off-by: Dexuan Cui
    Reviewed-by: Michael Kelley
    Cc: stable@vger.kernel.org
    Cc: K. Y. Srinivasan
    Cc: Stephen Hemminger
    Cc: Michael Kelley
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Dexuan Cui
     

03 Jul, 2018

1 commit

  • Add comments describing intricacies of Hyper-V ring buffer
    signaling code. This information is not in Hyper-V public
    documents, so include here to capture the knowledge for
    future coders.

    There are no code changes in this commit.

    Signed-off-by: Michael Kelley
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Michael Kelley
     

11 Jun, 2018

1 commit

  • Pull SCSI updates from James Bottomley:
    "This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
    xfcp, hisi_sas, cxlflash, qla2xxx.

    In the absence of Nic, we're also taking target updates which are
    mostly minor except for the tcmu refactor.

    The only real core change to worry about is the removal of high page
    bouncing (in sas, storvsc and iscsi). This has been well tested and no
    problems have shown up so far"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
    scsi: lpfc: update driver version to 12.0.0.4
    scsi: lpfc: Fix port initialization failure.
    scsi: lpfc: Fix 16gb hbas failing cq create.
    scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
    scsi: lpfc: correct oversubscription of nvme io requests for an adapter
    scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
    scsi: hisi_sas: Mark PHY as in reset for nexus reset
    scsi: hisi_sas: Fix return value when get_free_slot() failed
    scsi: hisi_sas: Terminate STP reject quickly for v2 hw
    scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
    scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
    scsi: hisi_sas: Try wait commands before before controller reset
    scsi: hisi_sas: Init disks after controller reset
    scsi: hisi_sas: Create a scsi_host_template per HW module
    scsi: hisi_sas: Reset disks when discovered
    scsi: hisi_sas: Add LED feature for v3 hw
    scsi: hisi_sas: Change common allocation mode of device id
    scsi: hisi_sas: change slot index allocation mode
    scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
    scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
    ...

    Linus Torvalds