13 Oct, 2018

2 commits

  • commit d9e427f6ab8142d6868eb719e6a7851aafea56b6 upstream.

    commit c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
    changed code to increment vb->num_pfns before call to
    set_page_pfns(), which used to happen only after.

    This patch fixes boot hang for me on ppc64le KVM guests.

    Fixes: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
    Cc: Michael S. Tsirkin
    Cc: Tetsuo Handa
    Cc: Michal Hocko
    Cc: Wei Wang
    Cc: stable@vger.kernel.org
    Signed-off-by: Jan Stancek
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Jan Stancek
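
The ordering issue described in this commit can be illustrated with a small user-space sketch in C. It is not the driver's code: the struct, the function names and the constants PFNS_PER_PAGE and MAX_PFNS are hypothetical. The sketch assumes that the PFNs for a page are written at offset num_pfns in the array, and that one guest page covers 16 balloon-sized units (as with a 64 KiB page split into 4 KiB units), so the write has to happen before the count is advanced.

```c
#include <stddef.h>
#include <stdint.h>

#define PFNS_PER_PAGE 16   /* hypothetical: one guest page = 16 balloon units */
#define MAX_PFNS      256  /* hypothetical capacity of the PFN array */

struct balloon_sketch {
    uint32_t pfns[MAX_PFNS];
    size_t   num_pfns;     /* number of valid entries in pfns[] */
};

/* Fill PFNS_PER_PAGE consecutive entries starting at slot. */
void set_page_pfns_sketch(uint32_t *slot, uint32_t first_pfn)
{
    for (size_t i = 0; i < PFNS_PER_PAGE; i++)
        slot[i] = first_pfn + (uint32_t)i;
}

/* Record one page. Returns 0 on success, -1 if the array is full. */
int record_page(struct balloon_sketch *b, uint32_t first_pfn)
{
    if (b->num_pfns + PFNS_PER_PAGE > MAX_PFNS)
        return -1;

    /* Write at the current offset first ... */
    set_page_pfns_sketch(b->pfns + b->num_pfns, first_pfn);
    /* ... and only then advance the count. Advancing first would skip
     * the slots at the old offset and write one page further along. */
    b->num_pfns += PFNS_PER_PAGE;
    return 0;
}
```

Calling record_page() twice on a zeroed struct fills entries 0..15 and then 16..31, with num_pfns ending at 32.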
     
  • commit c7cdff0e864713a089d7cb3a2b1136ba9a54881a upstream.

    fill_balloon doing memory allocations under balloon_lock
    can cause a deadlock when leak_balloon is called from
    virtballoon_oom_notify and tries to take the same lock.

    To fix, split page allocation and enqueue and do allocations outside the lock.

    Here's a detailed analysis of the deadlock by Tetsuo Handa:

    In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
    serialize against fill_balloon(). But in fill_balloon(),
    alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
    called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
    implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, even though __GFP_NORETRY
    is specified, this allocation attempt might indirectly depend on somebody
    else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
    __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
    virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
    out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
    mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
    will cause OOM lockup.

    Thread1                                       Thread2
      fill_balloon()
        takes a balloon_lock
        balloon_page_enqueue()
          alloc_page(GFP_HIGHUSER_MOVABLE)
            direct reclaim (__GFP_FS context)       takes a fs lock
              waits for that fs lock                  alloc_page(GFP_NOFS)
                                                        __alloc_pages_may_oom()
                                                          takes the oom_lock
                                                          out_of_memory()
                                                            blocking_notifier_call_chain()
                                                              leak_balloon()
                                                                tries to take that balloon_lock and deadlocks

    Reported-by: Tetsuo Handa
    Cc: Michal Hocko
    Cc: Wei Wang
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Michael S. Tsirkin
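
The fix described in this commit (split allocation from enqueue and allocate outside the lock) can be sketched in user-space C as follows. This is only an illustration under assumptions: a pthread mutex stands in for vb->balloon_lock, malloc() stands in for alloc_page(), and every type and function name here is hypothetical rather than taken from the driver.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096            /* hypothetical page size for the sketch */

struct page_node {
    struct page_node *next;
    void *mem;
};

struct balloon {
    pthread_mutex_t lock;          /* stands in for vb->balloon_lock */
    struct page_node *pages;       /* pages currently held by the balloon */
    size_t num_pages;
};

/* Step 1: allocate into a private list. The lock is NOT held here, so an
 * allocation that ends up waiting on reclaim cannot block the leak path. */
size_t alloc_batch(size_t want, struct page_node **out)
{
    size_t got = 0;

    *out = NULL;
    while (got < want) {
        struct page_node *n = malloc(sizeof(*n));
        if (!n)
            break;
        n->mem = malloc(PAGE_BYTES);
        if (!n->mem) {
            free(n);
            break;
        }
        n->next = *out;
        *out = n;
        got++;
    }
    return got;
}

/* Step 2: move the private list into the balloon under the lock.
 * Nothing in this critical section allocates memory. */
void enqueue_batch(struct balloon *b, struct page_node *batch)
{
    pthread_mutex_lock(&b->lock);
    while (batch) {
        struct page_node *n = batch;
        batch = n->next;
        n->next = b->pages;
        b->pages = n;
        b->num_pages++;
    }
    pthread_mutex_unlock(&b->lock);
}

size_t fill_balloon_sketch(struct balloon *b, size_t want)
{
    struct page_node *batch;
    size_t got = alloc_batch(want, &batch);

    enqueue_batch(b, batch);
    return got;
}

/* The leak path takes the same lock, but only ever frees memory. */
size_t leak_balloon_sketch(struct balloon *b, size_t want)
{
    size_t freed = 0;

    pthread_mutex_lock(&b->lock);
    while (b->pages && freed < want) {
        struct page_node *n = b->pages;
        b->pages = n->next;
        b->num_pages--;
        free(n->mem);
        free(n);
        freed++;
    }
    pthread_mutex_unlock(&b->lock);
    return freed;
}
```

Because the critical section in enqueue_batch() only relinks list nodes, a caller of leak_balloon_sketch() waits at most for that short section and never for an allocation.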
     

15 Sep, 2018

1 commit

  • [ Upstream commit 69599206ea9a3f8f2e94d46580579cbf9d08ad6c ]

    Legacy PCI over virtio uses a 32bit PFN for the queue. If the
    queue pfn is too large to fit in 32bits, which we could hit on
    arm64 systems with 52bit physical addresses (even with 64K page
    size), we simply miss out a proper link to the other side of
    the queue.

    Add a check to validate the PFN, rather than silently breaking
    the devices.

    Cc: "Michael S. Tsirkin"
    Cc: Jason Wang
    Cc: Marc Zyngier
    Cc: Christoffer Dall
    Cc: Peter Maydel
    Cc: Jean-Philippe Brucker
    Signed-off-by: Suzuki K Poulose
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Suzuki K Poulose
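
A check of the kind this commit describes might look like the following C sketch. It is a stand-alone illustration, not the driver's code: the function name, the SKETCH_QUEUE_SHIFT constant (4 KiB alignment is assumed) and the choice of -E2BIG as the error value are assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_QUEUE_SHIFT 12   /* hypothetical: queue address is 4 KiB aligned */

/*
 * Convert a queue's physical address to the 32-bit PFN a legacy device
 * register can hold. Returns 0 and stores the PFN on success, or -E2BIG
 * when the PFN does not fit in 32 bits, instead of silently truncating it.
 */
int queue_pfn_from_addr(uint64_t queue_phys_addr, uint32_t *pfn_out)
{
    uint64_t pfn = queue_phys_addr >> SKETCH_QUEUE_SHIFT;

    if (pfn >> 32)
        return -E2BIG;

    *pfn_out = (uint32_t)pfn;
    return 0;
}
```

With a 12-bit shift, any address at or above 1ULL << 44 is rejected, while lower addresses yield a PFN that fits the register.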
     

06 Aug, 2018

1 commit

  • commit 89da619bc18d79bca5304724c11d4ba3b67ce2c6 upstream.

    The kernel panics under high memory pressure; the call trace looks like this:

    PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java"
    #0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb
    #1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942
    #2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30
    #3 [ffff881ec7ed7778] oops_end at ffffffff816902c8
    #4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46
    #5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc
    #6 [ffff881ec7ed7838] __node_set at ffffffff81680300
    #7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f
    #8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5
    #9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8
    [exception RIP: _raw_spin_lock_irqsave+47]
    RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046
    RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8
    RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008
    RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098
    R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

    The panic happens in the page fault handler and results in a double
    page fault while compacting pages after a memory allocation fails.

    Analysis of the vmcore shows that the page leading to the second page
    fault is corrupted, with _mapcount=-256 but private=0.

    It is caused by a race between migration and ballooning, due to a
    missing lock in virtballoon_migratepage() of the virtio_balloon driver.
    This patch fixes the bug.

    Fixes: e22504296d4f64f ("virtio_balloon: introduce migration primitives to balloon pages")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jiang Biao
    Signed-off-by: Huang Chong
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Greg Kroah-Hartman

    Jiang Biao
     

15 Mar, 2018

1 commit

  • commit e82df670235138575b37ff0ec24412a471efd97f upstream.

    vq->vq.num_free has not been changed at the point where the error
    happens, so it should not be changed while handling the error.

    Fixes: 780bc7903a32 ("virtio_ring: Support DMA APIs")
    Cc: Andy Lutomirski
    Cc: Michael S. Tsirkin
    Cc: stable@vger.kernel.org
    Signed-off-by: Tiwei Bie
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Greg Kroah-Hartman

    Tiwei Bie
     

14 Dec, 2017

1 commit


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

08 Sep, 2017

1 commit

  • Pull SCSI updates from James Bottomley:
    "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
    megaraid_sas, zfcp and a host of minor updates.

    The major driver change here is the elimination of the block based
    cciss driver in favour of the SCSI based hpsa driver (which now drives
    all the legacy cases cciss used to be required for). Plus a reset
    handler clean up and the redo of the SAS SMP handler to use bsg lib"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
    scsi: scsi-mq: Always unprepare before requeuing a request
    scsi: Show .retries and .jiffies_at_alloc in debugfs
    scsi: Improve requeuing behavior
    scsi: Call scsi_initialize_rq() for filesystem requests
    scsi: qla2xxx: Reset the logo flag, after target re-login.
    scsi: qla2xxx: Fix slow mem alloc behind lock
    scsi: qla2xxx: Clear fc4f_nvme flag
    scsi: qla2xxx: add missing includes for qla_isr
    scsi: qla2xxx: Fix an integer overflow in sysfs code
    scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2()
    scsi: aacraid: get rid of one level of indentation
    scsi: aacraid: fix indentation errors
    scsi: storvsc: fix memory leak on ring buffer busy
    scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough
    scsi: smartpqi: remove the smp_handler stub
    scsi: hpsa: remove the smp_handler stub
    scsi: bsg-lib: pass the release callback through bsg_setup_queue
    scsi: Rework handling of scsi_device.vpd_pg8[03]
    scsi: Rework the code for caching Vital Product Data (VPD)
    scsi: rcu: Introduce rcu_swap_protected()
    ...

    Linus Torvalds
     

07 Sep, 2017

1 commit

  • Pull networking updates from David Miller:

    1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

    2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

    3) Allow generic XDP to work on virtual devices, from John Fastabend.

    4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

    5) Remove UFO offloads from the tree, gave us little other than bugs.

    6) Remove the IPSEC flow cache, from Florian Westphal.

    7) Support ipv6 route offload in mlxsw driver.

    8) Support VF representors in bnxt_en, from Sathya Perla.

    9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

    10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

    11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

    12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

    13) Add new jump instructions to eBPF, from Daniel Borkmann.

    14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

    15) Support XDP in tap driver, from Jason Wang.

    16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

    17) Add Huawei hinic ethernet driver.

    18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
    i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
    i40e: avoid NVM acquire deadlock during NVM update
    drivers: net: xgene: Remove return statement from void function
    drivers: net: xgene: Configure tx/rx delay for ACPI
    drivers: net: xgene: Read tx/rx delay for ACPI
    rocker: fix kcalloc parameter order
    rds: Fix non-atomic operation on shared flag variable
    net: sched: don't use GFP_KERNEL under spin lock
    vhost_net: correctly check tx avail during rx busy polling
    net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
    rxrpc: Make service connection lookup always check for retry
    net: stmmac: Delete dead code for MDIO registration
    gianfar: Fix Tx flow control deactivation
    cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
    cxgb4: Fix pause frame count in t4_get_port_stats
    cxgb4: fix memory leak
    tun: rename generic_xdp to skb_xdp
    tun: reserve extra headroom only when XDP is set
    net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
    net: dsa: bcm_sf2: Advertise number of egress queues
    ...

    Linus Torvalds
     

05 Sep, 2017

1 commit

  • Pull x86 asm updates from Ingo Molnar:

    - Introduce the ORC unwinder, which can be enabled via
    CONFIG_ORC_UNWINDER=y.

    The ORC unwinder is a lightweight, Linux kernel specific debuginfo
    implementation, which aims to be DWARF done right for unwinding.
    Objtool is used to generate the ORC unwinder tables during build, so
    the data format is flexible and kernel internal: there's no
    dependency on debuginfo created by an external toolchain.

    The ORC unwinder is almost two orders of magnitude faster than the
    (out of tree) DWARF unwinder - which is important for perf call graph
    profiling. It is also significantly simpler and is coded defensively:
    there has not been a single ORC related kernel crash so far, even
    with early versions. (knock on wood!)

    But the main advantage is that enabling the ORC unwinder allows
    CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
    measurably:

    With frame pointers disabled, GCC does not have to add frame pointer
    instrumentation code to every function in the kernel. The kernel's
    .text size decreases by about 3.2%, resulting in better cache
    utilization and fewer instructions executed, resulting in a broad
    kernel-wide speedup. Average speedup of system calls should be
    roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
    a speedup of 5-10% for some function execution intense workloads.

    The main cost of the unwinder is that the unwinder data has to be
    stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
    config - which is a modest cost on modern x86 systems.

    Given how young the ORC unwinder code is it's not enabled by default
    - but given the performance advantages the plan is to eventually make
    it the default unwinder on x86.

    See Documentation/x86/orc-unwinder.txt for more details.

    - Remove lguest support: its intended role was that of a temporary
    proof of concept for virtualization, plus its removal will enable the
    reduction (removal) of the paravirt API as well, so Rusty agreed to
    its removal. (Juergen Gross)

    - Clean up and fix FSGS related functionality (Andy Lutomirski)

    - Clean up IO access APIs (Andy Shevchenko)

    - Enhance the symbol namespace (Jiri Slaby)

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
    objtool: Handle GCC stack pointer adjustment bug
    x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
    x86/fpu/math-emu: Add ENDPROC to functions
    x86/boot/64: Extract efi_pe_entry() from startup_64()
    x86/boot/32: Extract efi_pe_entry() from startup_32()
    x86/lguest: Remove lguest support
    x86/paravirt/xen: Remove xen_patch()
    objtool: Fix objtool fallthrough detection with function padding
    x86/xen/64: Fix the reported SS and CS in SYSCALL
    objtool: Track DRAP separately from callee-saved registers
    objtool: Fix validate_branch() return codes
    x86: Clarify/fix no-op barriers for text_poke_bp()
    x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
    selftests/x86/fsgsbase: Test selectors 1, 2, and 3
    x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
    x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
    x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
    x86/asm/32: Fix regs_get_register() on segment registers
    x86/xen/64: Rearrange the SYSCALL entries
    x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
    ...

    Linus Torvalds
     

02 Sep, 2017

1 commit


26 Aug, 2017

1 commit

  • Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
    virtqueues"") removed the adjustment of the pre_vectors for the virtio
    MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
    allow drivers to request IRQ affinity when creating VQs"). This will
    lead to an incorrect assignment of MSI-X vectors, and potential
    deadlocks when offlining cpus.

    Signed-off-by: Christoph Hellwig
    Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues"")
    Reported-by: YASUAKI ISHIMATSU
    Cc: stable@vger.kernel.org
    Signed-off-by: Michael S. Tsirkin

    Christoph Hellwig
     

25 Aug, 2017

1 commit


24 Aug, 2017

1 commit

  • Lguest seems to be rather unused these days. It has seen only patches
    ensuring it still builds the last two years and its official state is
    "Odd Fixes".

    Remove it in order to be able to clean up the paravirt code.

    Signed-off-by: Juergen Gross
    Acked-by: Rusty Russell
    Acked-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: boris.ostrovsky@oracle.com
    Cc: lguest@lists.ozlabs.org
    Cc: rusty@rustcorp.com.au
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/20170816173157.8633-3-jgross@suse.com
    Signed-off-by: Ingo Molnar

    Juergen Gross
     

02 Aug, 2017

1 commit


25 Jul, 2017

3 commits


19 Jun, 2017

1 commit

  • virtio balloon bypasses the DMA API entirely so does not support the
    VIOMMU right now. It's not clear we need that support, for now let's
    just make sure we don't pretend to support it.

    Cc: stable@vger.kernel.org
    Cc: Wei Wang
    Fixes: 1a937693993f ("virtio: new feature to detect IOMMU device quirk")
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Jason Wang

    Michael S. Tsirkin
     

03 May, 2017

3 commits


11 Apr, 2017

6 commits

  • virtio-pci registers a per-vq affinity hint when using MSIX,
    but fails to remove it when freeing the interrupt, resulting
    in this type of splat:

    [ 31.111202] WARNING: CPU: 0 PID: 2823 at kernel/irq/manage.c:1503 __free_irq+0x2c4/0x2c8
    [ 31.114689] Modules linked in:
    [ 31.116101] CPU: 0 PID: 2823 Comm: kexec Not tainted 4.10.0+ #6941
    [ 31.118911] Hardware name: Generic DT based system
    [ 31.121319] [] (unwind_backtrace) from [] (show_stack+0x18/0x1c)
    [ 31.125017] [] (show_stack) from [] (dump_stack+0x84/0x98)
    [ 31.128427] [] (dump_stack) from [] (__warn+0xf4/0x10c)
    [ 31.131910] [] (__warn) from [] (warn_slowpath_null+0x28/0x30)
    [ 31.135543] [] (warn_slowpath_null) from [] (__free_irq+0x2c4/0x2c8)
    [ 31.139355] [] (__free_irq) from [] (free_irq+0x44/0x78)
    [ 31.142909] [] (free_irq) from [] (vp_del_vqs+0x68/0x1c0)
    [ 31.146299] [] (vp_del_vqs) from [] (pci_device_shutdown+0x3c/0x78)

    The obvious fix is to drop the affinity hint before freeing the
    interrupt.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Michael S. Tsirkin

    Marc Zyngier
     
  • This reverts commit 5c34d002dcc7a6dd665a19d098b4f4cd5501ba1a.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why
    but revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    Unfortunately the idea does not work with threadirqs
    as more than 32 queues can then map to a single interrupt.

    Further, the cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why
    but revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 53a020c661741f3b87ad3ac6fa545088aaebac9b.

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why
    but revert helps.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit 52a61516125fa9a21b3bdf4f90928308e2e5573f.

    Conflicts:
    drivers/virtio/virtio_pci_common.c

    The cleanup seems to be one of the changes that broke
    hibernation for some users. We are still not sure why
    but revert helps.

    This reverts the cleanup changes but keeps the affinity support.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • This reverts commit de85ec8b07f82c8c84de7687f769e74bf4c26a1e.

    Follow-up patches will revert 07ec51480b5e ("virtio_pci: use shared
    interrupts for virtqueues") that triggered the problem so no need for
    this one anymore.

    Tested-by: Mike Galbraith
    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

07 Apr, 2017

1 commit


29 Mar, 2017

4 commits

  • The latest gcc-7.0.1 snapshot reports a new warning:

    virtio/virtio_balloon.c: In function 'update_balloon_stats':
    virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
    virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]

    This seems absolutely right, so we should add an extra check to
    prevent copying uninitialized stack data into the statistics.
    From all I can tell, this has been broken since the statistics code
    was originally added in 2.6.34.

    Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Arnd Bergmann
     
  • The virtio balloon driver contained a not-so-obvious invariant that
    update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
    in order to send valid stats to the host. This commit fixes it by having
    update_balloon_stats return the actual number of counters, and its
    callers use it when pushing buffers to the stats virtqueue.

    Note that it is still out of spec to change the number of counters
    at run-time. "Driver MUST supply the same subset of statistics in all
    buffers submitted to the statsq."

    Suggested-by: Arnd Bergmann
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Ladi Prosek
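
The idea in this commit, having the update function return how many counters it actually filled so that callers size the buffer from that count, can be sketched in stand-alone C. Everything here is hypothetical: the tag enum, the sample values, the have_vm_events switch and the function names are made up for the illustration and are not the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for the sketch; STAT_NR is the maximum counter count. */
enum { STAT_SWAP_IN, STAT_SWAP_OUT, STAT_MEMFREE, STAT_MEMTOT, STAT_NR };

struct stat_entry {
    uint16_t tag;
    uint64_t val;
};

/*
 * Fill stats[] (capacity STAT_NR) and return the number of entries written.
 * Counters that are unavailable are skipped instead of being copied from
 * uninitialized data.
 */
size_t update_stats_sketch(struct stat_entry *stats, int have_vm_events)
{
    size_t idx = 0;

    if (have_vm_events) {
        stats[idx++] = (struct stat_entry){ .tag = STAT_SWAP_IN,  .val = 11 };
        stats[idx++] = (struct stat_entry){ .tag = STAT_SWAP_OUT, .val = 22 };
    }
    stats[idx++] = (struct stat_entry){ .tag = STAT_MEMFREE, .val = 33 };
    stats[idx++] = (struct stat_entry){ .tag = STAT_MEMTOT,  .val = 44 };

    return idx;
}

/* Callers size the buffer from the returned count, not from STAT_NR. */
size_t stats_buffer_bytes(size_t num_stats)
{
    return num_stats * sizeof(struct stat_entry);
}
```

As the commit message notes, the number of counters still must not change between buffers at run time; the sketch only shows how the length follows the count that was actually written.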
     
  • When init_vqs runs, virtio_balloon.stats is either uninitialized or
    contains stale values. The host updates its state with garbage data
    because it has no way of knowing that this is just a marker buffer
    used for signaling.

    This patch updates the stats before pushing the initial buffer.

    Alternative fixes:
    * Push an empty buffer in init_vqs. Not easily done with the current
    virtio implementation and violates the spec "Driver MUST supply the
    same subset of statistics in all buffers submitted to the statsq".
    * Push a buffer with invalid tags in init_vqs. Violates the same
    spec clause, plus "invalid tag" is not really defined.

    Note: the spec says:
    When using the legacy interface, the device SHOULD ignore all values in
    the first buffer in the statsq supplied by the driver after device
    initialization. Note: Historically, drivers supplied an uninitialized
    buffer in the first buffer.

    Unfortunately QEMU does not seem to implement the recommendation
    even for the legacy interface.

    Cc: stable@vger.kernel.org
    Signed-off-by: Ladi Prosek
    Signed-off-by: Michael S. Tsirkin

    Ladi Prosek
     
  • Fedora has received multiple reports of crashes when running
    4.11 as a guest

    https://bugzilla.redhat.com/show_bug.cgi?id=1430297
    https://bugzilla.redhat.com/show_bug.cgi?id=1434462
    https://bugzilla.kernel.org/show_bug.cgi?id=194911
    https://bugzilla.redhat.com/show_bug.cgi?id=1433899

    The crashes are not always consistent but they are generally
    some flavor of oops or GPF in virtio related code. Multiple people
    have done bisections (Thank you Thorsten Leemhuis and
    Richard W.M. Jones) and found this commit to be at fault

    07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
    commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
    Author: Christoph Hellwig
    Date: Sun Feb 5 18:15:19 2017 +0100

    virtio_pci: use shared interrupts for virtqueues

    The issue seems to be an out of bounds access to the msix_names
    array corrupting kernel memory.

    Fixes: 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues")
    Reported-by: Laura Abbott
    Signed-off-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin
    Reviewed-by: Christoph Hellwig
    Tested-by: Richard W.M. Jones
    Tested-by: Thorsten Leemhuis

    Jason Wang
     

04 Mar, 2017

1 commit

  • Pull sched.h split-up from Ingo Molnar:
    "The point of these changes is to significantly reduce the
    header footprint, to speed up the kernel build and to
    have a cleaner header structure.

    After these changes the new <linux/sched.h>'s typical preprocessed
    size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
    lines), which is around 40% faster to build on typical configs.

    Not much changed from the last version (-v2) posted three weeks ago: I
    eliminated quirks, backmerged fixes plus I rebased it to an upstream
    SHA1 from yesterday that includes most changes queued up in -next plus
    all sched.h changes that were pending from Andrew.

    I've re-tested the series both on x86 and on cross-arch defconfigs,
    and did a bisectability test at a number of random points.

    I tried to test as many build configurations as possible, but some
    build breakage is probably still left - but it should be mostly
    limited to architectures that have no cross-compiler binaries
    available on kernel.org, and non-default configurations"

    * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
    sched/headers: Clean up
    sched/headers: Remove #ifdefs from
    sched/headers: Remove the include from
    sched/headers, hrtimer: Remove the include from
    sched/headers, x86/apic: Remove the header inclusion from
    sched/headers, timers: Remove the include from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/core: Remove unused prefetch_stack()
    sched/headers: Remove from
    sched/headers: Remove the 'init_pid_ns' prototype from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove the runqueue_is_locked() prototype
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove from
    sched/headers: Remove the include from
    sched/headers: Remove from
    ...

    Linus Torvalds
     

03 Mar, 2017

1 commit

  • Pull vhost updates from Michael Tsirkin:
    "virtio, vhost: optimizations, fixes

    Looks like a quiet cycle for vhost/virtio, just a couple of minor
    tweaks. Most notable is automatic interrupt affinity for blk and scsi.
    Hopefully other devices are not far behind"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    virtio-console: avoid DMA from stack
    vhost: introduce O(1) vq metadata cache
    virtio_scsi: use virtio IRQ affinity
    virtio_blk: use virtio IRQ affinity
    blk-mq: provide a default queue mapping for virtio device
    virtio: provide a method to get the IRQ affinity mask for a virtqueue
    virtio: allow drivers to request IRQ affinity when creating VQs
    virtio_pci: simplify MSI-X setup
    virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
    virtio_pci: use shared interrupts for virtqueues
    virtio_pci: remove struct virtio_pci_vq_info
    vhost: try avoiding avail index access when getting descriptor
    virtio_mmio: expose header to userspace

    Linus Torvalds
     

02 Mar, 2017

1 commit


28 Feb, 2017

4 commits