15 Dec, 2017
1 commit
-
Recent rework of the virtio_mmio probe/remove paths balanced a
devm_ioremap() with an iounmap() rather than its devm variant. This ends
up corrupting the devm data structures, and results in the following
boot-time splat on arm64 under QEMU 2.9.0:
[ 3.450397] ------------[ cut here ]------------
[ 3.453822] Trying to vfree() nonexistent vm area (00000000c05b4844)
[ 3.460534] WARNING: CPU: 1 PID: 1 at mm/vmalloc.c:1525 __vunmap+0x1b8/0x220
[ 3.475898] Kernel panic - not syncing: panic_on_warn set ...
[ 3.475898]
[ 3.493933] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc3 #1
[ 3.513109] Hardware name: linux,dummy-virt (DT)
[ 3.525382] Call trace:
[ 3.531683] dump_backtrace+0x0/0x368
[ 3.543921] show_stack+0x20/0x30
[ 3.547767] dump_stack+0x108/0x164
[ 3.559584] panic+0x25c/0x51c
[ 3.569184] __warn+0x29c/0x31c
[ 3.576023] report_bug+0x1d4/0x290
[ 3.586069] bug_handler.part.2+0x40/0x100
[ 3.597820] bug_handler+0x4c/0x88
[ 3.608400] brk_handler+0x11c/0x218
[ 3.613430] do_debug_exception+0xe8/0x318
[ 3.627370] el1_dbg+0x18/0x78
[ 3.634037] __vunmap+0x1b8/0x220
[ 3.648747] vunmap+0x6c/0xc0
[ 3.653864] __iounmap+0x44/0x58
[ 3.659771] devm_ioremap_release+0x34/0x68
[ 3.672983] release_nodes+0x404/0x880
[ 3.683543] devres_release_all+0x6c/0xe8
[ 3.695692] driver_probe_device+0x250/0x828
[ 3.706187] __driver_attach+0x190/0x210
[ 3.717645] bus_for_each_dev+0x14c/0x1f0
[ 3.728633] driver_attach+0x48/0x78
[ 3.740249] bus_add_driver+0x26c/0x5b8
[ 3.752248] driver_register+0x16c/0x398
[ 3.757211] __platform_driver_register+0xd8/0x128
[ 3.770860] virtio_mmio_init+0x1c/0x24
[ 3.782671] do_one_initcall+0xe0/0x398
[ 3.791890] kernel_init_freeable+0x594/0x660
[ 3.798514] kernel_init+0x18/0x190
[ 3.810220] ret_from_fork+0x10/0x18
To fix this, we can simply rip out the explicit cleanup that the devm
infrastructure will do for us when our probe function returns an error
code, or when our remove function returns.
We only need to ensure that we call put_device() if a call to
register_virtio_device() fails in the probe path.
Signed-off-by: Mark Rutland
Fixes: 7eb781b1bbb7136f ("virtio_mmio: add cleanup for virtio_mmio_probe")
Fixes: 25f32223bce5c580 ("virtio_mmio: add cleanup for virtio_mmio_remove")
Cc: Cornelia Huck
Cc: Michael S. Tsirkin
Cc: weiping zhang
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Cornelia Huck
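Note: the ownership rule behind this fix can be illustrated outside the kernel. The following is a minimal userspace sketch, not the virtio_mmio code. `toy_dev`, `toy_devm_alloc()`, `toy_devres_release_all()` and `toy_bind()` are hypothetical stand-ins for a device, a devm_*() allocator, the devres teardown and the driver core. It assumes only what the message above states: managed resources are released by the infrastructure when probe fails, so the probe error path frees nothing itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of device-managed resources; all names are hypothetical. */
struct toy_res {
	void *ptr;
	struct toy_res *next;
};

struct toy_dev {
	struct toy_res *managed;   /* resources owned by the framework */
	int released;              /* how many the framework released */
};

/* Allocate memory and record it so the framework frees it later. */
void *toy_devm_alloc(struct toy_dev *dev, size_t size)
{
	struct toy_res *res;

	assert(dev != NULL);
	res = malloc(sizeof(*res));
	if (!res)
		return NULL;
	res->ptr = malloc(size);
	if (!res->ptr) {
		free(res);
		return NULL;
	}
	res->next = dev->managed;
	dev->managed = res;
	return res->ptr;
}

/* Called by the framework after a failed probe or after remove. */
void toy_devres_release_all(struct toy_dev *dev)
{
	while (dev->managed) {
		struct toy_res *res = dev->managed;

		dev->managed = res->next;
		free(res->ptr);
		free(res);
		dev->released++;
	}
}

/* Probe that may fail midway: it returns an error and frees nothing. */
int toy_probe(struct toy_dev *dev, int fail)
{
	void *regs = toy_devm_alloc(dev, 64);

	if (!regs)
		return -1;
	if (fail)
		return -1;   /* no manual cleanup: the framework owns regs */
	return 0;
}

/* What a driver core would do around probe. */
int toy_bind(struct toy_dev *dev, int fail)
{
	int ret = toy_probe(dev, fail);

	if (ret)
		toy_devres_release_all(dev);
	return ret;
}
```

In this model each resource is released exactly once. Adding a manual free(regs) to the error path of toy_probe() would make toy_devres_release_all() free the same pointer again, which is the class of mismatch the commit removes.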
08 Dec, 2017
2 commits
-
Clean up all resources allocated by virtio_mmio_probe.
Signed-off-by: weiping zhang
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Cornelia Huck
-
As mentioned at drivers/base/core.c:
/*
* NOTE: _Never_ directly free @dev after calling this function, even
* if it returned an error! Always use put_device() to give up the
* reference initialized in this function instead.
*/
so we don't free vm_dev until vm_dev.dev.release is called.
Signed-off-by: weiping zhang
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Cornelia Huck
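Note: the rule quoted from drivers/base/core.c can be shown with a small reference-counting model. This is a userspace sketch under stated assumptions, not the driver code: `toy_device`, `toy_register_device()` and `toy_put_device()` are hypothetical names, and the only behaviour taken from the message above is that registration initializes a reference even when it fails, so the caller must drop that reference rather than free the object directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object; names are hypothetical, not kernel API. */
struct toy_device {
	int refcount;
	void (*release)(struct toy_device *dev);
};

int toy_release_calls;   /* counts how often the release callback ran */

static void toy_release(struct toy_device *dev)
{
	toy_release_calls++;
	free(dev);
}

/* Drop one reference; the release callback runs when the last one goes. */
void toy_put_device(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Registration takes the initial reference even when it fails. */
int toy_register_device(struct toy_device *dev, int fail)
{
	dev->refcount = 1;
	dev->release = toy_release;
	return fail ? -1 : 0;
}

/* Probe-like helper: on failure, give up the reference instead of free(). */
int toy_probe(int fail, struct toy_device **out)
{
	struct toy_device *dev = calloc(1, sizeof(*dev));
	int ret;

	*out = NULL;
	if (!dev)
		return -1;
	ret = toy_register_device(dev, fail);
	if (ret) {
		toy_put_device(dev);   /* not free(dev) */
		return ret;
	}
	*out = dev;
	return 0;
}
```

The object is freed only from the release callback, so every path that ends the object's life goes through toy_put_device().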
01 Dec, 2017
2 commits
-
commit c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
changed code to increment vb->num_pfns before call to
set_page_pfns(), which used to happen only after.
This patch fixes boot hang for me on ppc64le KVM guests.
Fixes: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
Cc: Michael S. Tsirkin
Cc: Tetsuo Handa
Cc: Michal Hocko
Cc: Wei Wang
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek
Signed-off-by: Michael S. Tsirkin
-
index can be reused by other virtio device.
Cc: stable@vger.kernel.org
Signed-off-by: weiping zhang
Reviewed-by: Cornelia Huck
Signed-off-by: Michael S. Tsirkin
15 Nov, 2017
1 commit
-
fill_balloon doing memory allocations under balloon_lock
can cause a deadlock when leak_balloon is called from
virtballoon_oom_notify and tries to take the same lock.
To fix, split page allocation and enqueue and do allocations outside the lock.
Here's a detailed analysis of the deadlock by Tetsuo Handa:
In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
serialize against fill_balloon(). But in fill_balloon(),
alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
is specified, this allocation attempt might indirectly depend on somebody
else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
__GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
will cause OOM lockup.
  Thread1                                   Thread2
    fill_balloon()
      takes a balloon_lock
      balloon_page_enqueue()
        alloc_page(GFP_HIGHUSER_MOVABLE)
          direct reclaim (__GFP_FS context)   takes a fs lock
            waits for that fs lock              alloc_page(GFP_NOFS)
                                                  __alloc_pages_may_oom()
                                                    takes the oom_lock
                                                    out_of_memory()
                                                      blocking_notifier_call_chain()
                                                        leak_balloon()
                                                          tries to take that balloon_lock and deadlocks
Reported-by: Tetsuo Handa
Cc: Michal Hocko
Cc: Wei Wang
Signed-off-by: Michael S. Tsirkin
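Note: the fix described above (allocate first, take the lock only to enqueue) can be sketched in userspace. This is an illustrative model, not the balloon driver: `toy_balloon`, `toy_alloc_page()` and `toy_fill_balloon()` are hypothetical names, and the lock is a plain flag rather than a mutex so that the invariant "no allocation while the lock is held" can be checked with an assertion.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy balloon; the lock is a plain flag so the invariant can be checked. */
struct toy_page {
	struct toy_page *next;
};

struct toy_balloon {
	int lock_held;             /* stands in for balloon_lock */
	struct toy_page *pages;    /* pages currently in the balloon */
	unsigned int num_pages;
};

/* Allocation may end up in reclaim/OOM callbacks, so forbid it under the lock. */
static struct toy_page *toy_alloc_page(const struct toy_balloon *vb)
{
	assert(!vb->lock_held);
	return calloc(1, sizeof(struct toy_page));
}

/*
 * Step 1: allocate onto a private list without holding the lock.
 * Step 2: take the lock and move the pages into the balloon.
 */
unsigned int toy_fill_balloon(struct toy_balloon *vb, unsigned int want)
{
	struct toy_page *local = NULL;
	unsigned int i, moved = 0;

	for (i = 0; i < want; i++) {
		struct toy_page *page = toy_alloc_page(vb);

		if (!page)
			break;
		page->next = local;
		local = page;
	}

	vb->lock_held = 1;
	while (local) {
		struct toy_page *page = local;

		local = page->next;
		page->next = vb->pages;
		vb->pages = page;
		vb->num_pages++;
		moved++;
	}
	vb->lock_held = 0;
	return moved;
}
```

Because the locked section only relinks pages that already exist, nothing inside it can wait on memory reclaim, which is the dependency that closes the cycle in the Thread1/Thread2 trace.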
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
08 Sep, 2017
1 commit
-
Pull SCSI updates from James Bottomley:
"This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
megaraid_sas, zfcp and a host of minor updates.
The major driver change here is the elimination of the block based
cciss driver in favour of the SCSI based hpsa driver (which now drives
all the legacy cases cciss used to be required for). Plus a reset
handler clean up and the redo of the SAS SMP handler to use bsg lib"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
scsi: scsi-mq: Always unprepare before requeuing a request
scsi: Show .retries and .jiffies_at_alloc in debugfs
scsi: Improve requeuing behavior
scsi: Call scsi_initialize_rq() for filesystem requests
scsi: qla2xxx: Reset the logo flag, after target re-login.
scsi: qla2xxx: Fix slow mem alloc behind lock
scsi: qla2xxx: Clear fc4f_nvme flag
scsi: qla2xxx: add missing includes for qla_isr
scsi: qla2xxx: Fix an integer overflow in sysfs code
scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2()
scsi: aacraid: get rid of one level of indentation
scsi: aacraid: fix indentation errors
scsi: storvsc: fix memory leak on ring buffer busy
scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough
scsi: smartpqi: remove the smp_handler stub
scsi: hpsa: remove the smp_handler stub
scsi: bsg-lib: pass the release callback through bsg_setup_queue
scsi: Rework handling of scsi_device.vpd_pg8[03]
scsi: Rework the code for caching Vital Product Data (VPD)
scsi: rcu: Introduce rcu_swap_protected()
...
07 Sep, 2017
1 commit
-
Pull networking updates from David Miller:
1) Support ipv6 checksum offload in sunvnet driver, from Shannon
   Nelson.
2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
   Dumazet.
3) Allow generic XDP to work on virtual devices, from John Fastabend.
4) Add bpf device maps and XDP_REDIRECT, which can be used to build
   arbitrary switching frameworks using XDP. From John Fastabend.
5) Remove UFO offloads from the tree, gave us little other than bugs.
6) Remove the IPSEC flow cache, from Florian Westphal.
7) Support ipv6 route offload in mlxsw driver.
8) Support VF representors in bnxt_en, from Sathya Perla.
9) Add support for forward error correction modes to ethtool, from
   Vidya Sagar Ravipati.
10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.
11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.
12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.
13) Add new jump instructions to eBPF, from Daniel Borkmann.
14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.
15) Support XDP in tap driver, from Jason Wang.
16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.
17) Add Huawei hinic ethernet driver.
18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
i40e: avoid NVM acquire deadlock during NVM update
drivers: net: xgene: Remove return statement from void function
drivers: net: xgene: Configure tx/rx delay for ACPI
drivers: net: xgene: Read tx/rx delay for ACPI
rocker: fix kcalloc parameter order
rds: Fix non-atomic operation on shared flag variable
net: sched: don't use GFP_KERNEL under spin lock
vhost_net: correctly check tx avail during rx busy polling
net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
rxrpc: Make service connection lookup always check for retry
net: stmmac: Delete dead code for MDIO registration
gianfar: Fix Tx flow control deactivation
cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
cxgb4: Fix pause frame count in t4_get_port_stats
cxgb4: fix memory leak
tun: rename generic_xdp to skb_xdp
tun: reserve extra headroom only when XDP is set
net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
net: dsa: bcm_sf2: Advertise number of egress queues
...
05 Sep, 2017
1 commit
-
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
02 Sep, 2017
1 commit
-
Three cases of simple overlapping changes.
Signed-off-by: David S. Miller
26 Aug, 2017
1 commit
-
Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
virtqueues"") removed the adjustment of the pre_vectors for the virtio
MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
allow drivers to request IRQ affinity when creating VQs"). This will
lead to an incorrect assignment of MSI-X vectors, and potential
deadlocks when offlining cpus.
Signed-off-by: Christoph Hellwig
Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues")
Reported-by: YASUAKI ISHIMATSU
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin
25 Aug, 2017
1 commit
-
If using indirect descriptors, you can make the total_sg as large as you
want. If not, BUG is too serious because the function later returns
-ENOSPC.
Signed-off-by: Richard W.M. Jones
Reviewed-by: Paolo Bonzini
Signed-off-by: Martin K. Petersen
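Note: the reasoning in this message can be expressed as a small descriptor-accounting model. This is a sketch under stated assumptions, not virtio_ring code: `toy_vq` and `toy_add_buf()` are hypothetical, and the model simplifies by assuming an indirect request always occupies a single ring slot. It shows an oversized request being reported with -ENOSPC instead of being treated as a fatal bug.

```c
#include <assert.h>
#include <errno.h>

/* Toy descriptor accounting; names and fields are hypothetical. */
struct toy_vq {
	unsigned int num;        /* ring size */
	unsigned int num_free;   /* free descriptors */
	int indirect;            /* indirect descriptors in use */
};

/*
 * Reserve descriptors for a request with total_sg entries.
 * With indirect descriptors the chain lives in a separate table and
 * uses one ring slot; otherwise each entry needs its own slot.
 * A request that does not fit is reported with -ENOSPC.
 */
int toy_add_buf(struct toy_vq *vq, unsigned int total_sg)
{
	unsigned int needed;

	assert(total_sg > 0);
	needed = vq->indirect ? 1 : total_sg;
	if (needed > vq->num_free)
		return -ENOSPC;
	vq->num_free -= needed;
	return 0;
}
```

A caller can then treat -ENOSPC as a recoverable condition, for example by retrying later or splitting the request.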
24 Aug, 2017
1 commit
-
Lguest seems to be rather unused these days. It has seen only patches
ensuring it still builds the last two years and its official state is
"Odd Fixes".
Remove it in order to be able to clean up the paravirt code.
Signed-off-by: Juergen Gross
Acked-by: Rusty Russell
Acked-by: Thomas Gleixner
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: boris.ostrovsky@oracle.com
Cc: lguest@lists.ozlabs.org
Cc: rusty@rustcorp.com.au
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20170816173157.8633-3-jgross@suse.com
Signed-off-by: Ingo Molnar
02 Aug, 2017
1 commit
-
Two minor conflicts in virtio_net driver (bug fix overlapping addition
of a helper) and MAINTAINERS (new driver edit overlapping revamp of
PHY entry).
Signed-off-by: David S. Miller
25 Jul, 2017
3 commits
-
Clean up the comment format.
Signed-off-by: Wei Wang
Signed-off-by: Michael S. Tsirkin
-
This patch saves the deflated pages to a list, instead of the PFN array.
Accordingly, the balloon_pfn_to_page() function is removed.
Signed-off-by: Liang Li
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Wei Wang
Signed-off-by: Michael S. Tsirkin
-
Allow zero to be stored as a ctx. With this we could store e.g. a zero
value, which could be meaningful for the case of storing headroom
through ctx.
Signed-off-by: Jason Wang
Signed-off-by: David S. Miller
19 Jun, 2017
1 commit
-
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now. It's not clear we need that support, for now let's
just make sure we don't pretend to support it.
Cc: stable@vger.kernel.org
Cc: Wei Wang
Fixes: 1a937693993f ("virtio: new feature to detect IOMMU device quirk")
Signed-off-by: Michael S. Tsirkin
Acked-by: Jason Wang
03 May, 2017
3 commits
-
Allow extra context per descriptor. To avoid slow down for data path,
this disables use of indirect descriptors for this vq.
Signed-off-by: Michael S. Tsirkin
-
Allows maintaining extra context per vq. For ease of use, passing in
NULL is legal and disables the feature for all vqs.
Includes fixes by Christian for s390, acked by Cornelia.
Signed-off-by: Christian Borntraeger
Acked-by: Cornelia Huck
Signed-off-by: Michael S. Tsirkin
-
We are going to add more parameters to find_vqs, let's wrap the call so
we don't need to tweak all drivers every time.
Signed-off-by: Michael S. Tsirkin
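Note: the wrapping idea in the last commit above can be sketched as follows. This is an illustrative model only: `toy_config_ops`, `toy_vdev` and `toy_find_vqs()` are hypothetical, and the real find_vqs signature differs. The assumption is simply that drivers call a stable wrapper while the underlying op is free to grow parameters, which the wrapper fills with defaults.

```c
#include <assert.h>
#include <stddef.h>

/* Toy config ops; the real find_vqs signature differs. */
struct toy_config_ops {
	int (*find_vqs)(unsigned int nvqs, const char *const names[],
			const int *ctx);
};

struct toy_vdev {
	const struct toy_config_ops *ops;
};

/* Drivers call this wrapper; newer op parameters get defaults here. */
int toy_find_vqs(struct toy_vdev *vdev, unsigned int nvqs,
		 const char *const names[])
{
	assert(vdev && vdev->ops && vdev->ops->find_vqs);
	return vdev->ops->find_vqs(nvqs, names, NULL);
}

/* Example transport implementation: counts the named queues. */
int toy_transport_find_vqs(unsigned int nvqs, const char *const names[],
			   const int *ctx)
{
	unsigned int i;
	int found = 0;

	(void)ctx;   /* NULL means "feature unused" in this sketch */
	for (i = 0; i < nvqs; i++)
		if (names[i])
			found++;
	return found;
}
```

When the op gains another parameter, only the wrapper and the transports change; drivers that call toy_find_vqs() stay untouched.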
11 Apr, 2017
6 commits
-
virtio-pci registers a per-vq affinity hint when using MSIX,
but fails to remove it when freeing the interrupt, resulting
in this type of splat:
[ 31.111202] WARNING: CPU: 0 PID: 2823 at kernel/irq/manage.c:1503 __free_irq+0x2c4/0x2c8
[ 31.114689] Modules linked in:
[ 31.116101] CPU: 0 PID: 2823 Comm: kexec Not tainted 4.10.0+ #6941
[ 31.118911] Hardware name: Generic DT based system
[ 31.121319] [] (unwind_backtrace) from [] (show_stack+0x18/0x1c)
[ 31.125017] [] (show_stack) from [] (dump_stack+0x84/0x98)
[ 31.128427] [] (dump_stack) from [] (__warn+0xf4/0x10c)
[ 31.131910] [] (__warn) from [] (warn_slowpath_null+0x28/0x30)
[ 31.135543] [] (warn_slowpath_null) from [] (__free_irq+0x2c4/0x2c8)
[ 31.139355] [] (__free_irq) from [] (free_irq+0x44/0x78)
[ 31.142909] [] (free_irq) from [] (vp_del_vqs+0x68/0x1c0)
[ 31.146299] [] (vp_del_vqs) from [] (pci_device_shutdown+0x3c/0x78)
The obvious fix is to drop the affinity hint before freeing the
interrupt.
Signed-off-by: Marc Zyngier
Signed-off-by: Michael S. Tsirkin
-
This reverts commit 5c34d002dcc7a6dd665a19d098b4f4cd5501ba1a.
Conflicts:
drivers/virtio/virtio_pci_common.c
The cleanup seems to be one of the changes that broke
hibernation for some users. We are still not sure why
but revert helps.
This reverts the cleanup changes but keeps the affinity support.
Tested-by: Mike Galbraith
Signed-off-by: Michael S. Tsirkin
-
This reverts commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507.
Conflicts:
drivers/virtio/virtio_pci_common.c
Unfortunately the idea does not work with threadirqs
as more than 32 queues can then map to a single interrupt.
Further, the cleanup seems to be one of the changes that broke
hibernation for some users. We are still not sure why
but revert helps.
This reverts the cleanup changes but keeps the affinity support.
Tested-by: Mike Galbraith
Signed-off-by: Michael S. Tsirkin
-
This reverts commit 53a020c661741f3b87ad3ac6fa545088aaebac9b.
The cleanup seems to be one of the changes that broke
hibernation for some users. We are still not sure why
but revert helps.
Tested-by: Mike Galbraith
Signed-off-by: Michael S. Tsirkin
-
This reverts commit 52a61516125fa9a21b3bdf4f90928308e2e5573f.
Conflicts:
drivers/virtio/virtio_pci_common.c
The cleanup seems to be one of the changes that broke
hibernation for some users. We are still not sure why
but revert helps.
This reverts the cleanup changes but keeps the affinity support.
Tested-by: Mike Galbraith
Signed-off-by: Michael S. Tsirkin
-
This reverts commit de85ec8b07f82c8c84de7687f769e74bf4c26a1e.
Follow-up patches will revert 07ec51480b5e ("virtio_pci: use shared
interrupts for virtqueues") that triggered the problem so no need for
this one anymore.
Tested-by: Mike Galbraith
Signed-off-by: Michael S. Tsirkin
07 Apr, 2017
1 commit
-
Some drivers can't support all features in all configurations. At the
moment we blindly set FEATURES_OK and later FAILED. Support this better
by adding a callback drivers can use to do some early checks.
Signed-off-by: Michael S. Tsirkin
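Note: the early-check callback described above can be modelled in a few lines. This is a userspace sketch, not the virtio core: `toy_driver`, `toy_vdev`, `toy_dev_probe()` and the feature bit `TOY_F_EXAMPLE` are hypothetical. It assumes only what the message states: an optional driver callback runs before the feature set is finalized, so a driver can drop a feature it cannot support in the current configuration, or fail early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_F_EXAMPLE  (1ULL << 3)   /* hypothetical feature bit */

struct toy_vdev {
	uint64_t features;   /* features offered by the device */
	int features_ok;     /* set once negotiation is finalized */
};

struct toy_driver {
	/* Optional early check; may clear features or return an error. */
	int (*validate)(struct toy_vdev *vdev);
	int (*probe)(struct toy_vdev *vdev);
};

/* Core: run validate before finalizing features, then probe. */
int toy_dev_probe(const struct toy_driver *drv, struct toy_vdev *vdev)
{
	int err;

	assert(drv && vdev && drv->probe);
	if (drv->validate) {
		err = drv->validate(vdev);
		if (err)
			return err;
	}
	vdev->features_ok = 1;
	return drv->probe(vdev);
}

/* Example driver that cannot handle TOY_F_EXAMPLE in this configuration. */
int toy_validate(struct toy_vdev *vdev)
{
	vdev->features &= ~TOY_F_EXAMPLE;
	return 0;
}

int toy_probe(struct toy_vdev *vdev)
{
	/* By now the unsupported feature must already be gone. */
	return (vdev->features & TOY_F_EXAMPLE) ? -1 : 0;
}
```

Drivers that leave the callback NULL behave as before, since the core skips the check for them.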
29 Mar, 2017
4 commits
-
The latest gcc-7.0.1 snapshot reports a new warning:
virtio/virtio_balloon.c: In function 'update_balloon_stats':
virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]
This seems absolutely right, so we should add an extra check to
prevent copying uninitialized stack data into the statistics.
From all I can tell, this has been broken since the statistics code
was originally added in 2.6.34.
Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
Signed-off-by: Arnd Bergmann
Signed-off-by: Ladi Prosek
Signed-off-by: Michael S. Tsirkin
-
The virtio balloon driver contained a not-so-obvious invariant that
update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
in order to send valid stats to the host. This commit fixes it by having
update_balloon_stats return the actual number of counters, and its
callers use it when pushing buffers to the stats virtqueue.
Note that it is still out of spec to change the number of counters
at run-time. "Driver MUST supply the same subset of statistics in all
buffers submitted to the statsq."
Suggested-by: Arnd Bergmann
Signed-off-by: Ladi Prosek
Signed-off-by: Michael S. Tsirkin
-
When init_vqs runs, virtio_balloon.stats is either uninitialized or
contains stale values. The host updates its state with garbage data
because it has no way of knowing that this is just a marker buffer
used for signaling.
This patch updates the stats before pushing the initial buffer.
Alternative fixes:
* Push an empty buffer in init_vqs. Not easily done with the current
virtio implementation and violates the spec "Driver MUST supply the
same subset of statistics in all buffers submitted to the statsq".
* Push a buffer with invalid tags in init_vqs. Violates the same
spec clause, plus "invalid tag" is not really defined.
Note: the spec says:
When using the legacy interface, the device SHOULD ignore all values in
the first buffer in the statsq supplied by the driver after device
initialization. Note: Historically, drivers supplied an uninitialized
buffer in the first buffer.
Unfortunately QEMU does not seem to implement the recommendation
even for the legacy interface.
Cc: stable@vger.kernel.org
Signed-off-by: Ladi Prosek
Signed-off-by: Michael S. Tsirkin
-
Fedora has received multiple reports of crashes when running
4.11 as a guest
https://bugzilla.redhat.com/show_bug.cgi?id=1430297
https://bugzilla.redhat.com/show_bug.cgi?id=1434462
https://bugzilla.kernel.org/show_bug.cgi?id=194911
https://bugzilla.redhat.com/show_bug.cgi?id=1433899
The crashes are not always consistent but they are generally
some flavor of oops or GPF in virtio related code. Multiple people
have done bisections (Thank you Thorsten Leemhuis and
Richard W.M. Jones) and found this commit to be at fault
07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
Author: Christoph Hellwig
Date: Sun Feb 5 18:15:19 2017 +0100
virtio_pci: use shared interrupts for virtqueues
The issue seems to be an out of bounds access to the msix_names
array corrupting kernel memory.
Fixes: 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues")
Reported-by: Laura Abbott
Signed-off-by: Jason Wang
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Christoph Hellwig
Tested-by: Richard W.M. Jones
Tested-by: Thorsten Leemhuis
04 Mar, 2017
1 commit
-
Pull sched.h split-up from Ingo Molnar:
"The point of these changes is to significantly reduce the
header footprint, to speed up the kernel build and to
have a cleaner header structure.
After these changes the new <linux/sched.h>'s typical preprocessed
size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
lines), which is around 40% faster to build on typical configs.
Not much changed from the last version (-v2) posted three weeks ago: I
eliminated quirks, backmerged fixes plus I rebased it to an upstream
SHA1 from yesterday that includes most changes queued up in -next plus
all sched.h changes that were pending from Andrew.
I've re-tested the series both on x86 and on cross-arch defconfigs,
and did a bisectability test at a number of random points.
I tried to test as many build configurations as possible, but some
build breakage is probably still left - but it should be mostly
limited to architectures that have no cross-compiler binaries
available on kernel.org, and non-default configurations"
* 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
sched/headers: Clean up
sched/headers: Remove #ifdefs from
sched/headers: Remove the include from
sched/headers, hrtimer: Remove the include from
sched/headers, x86/apic: Remove the header inclusion from
sched/headers, timers: Remove the include from
sched/headers: Remove from
sched/headers: Remove from
sched/core: Remove unused prefetch_stack()
sched/headers: Remove from
sched/headers: Remove the 'init_pid_ns' prototype from
sched/headers: Remove from
sched/headers: Remove from
sched/headers: Remove the runqueue_is_locked() prototype
sched/headers: Remove from
sched/headers: Remove from
sched/headers: Remove from
sched/headers: Remove from
sched/headers: Remove the include from
sched/headers: Remove from
...
03 Mar, 2017
1 commit
-
Pull vhost updates from Michael Tsirkin:
"virtio, vhost: optimizations, fixes
Looks like a quiet cycle for vhost/virtio, just a couple of minor
tweaks. Most notable is automatic interrupt affinity for blk and scsi.
Hopefully other devices are not far behind"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-console: avoid DMA from stack
vhost: introduce O(1) vq metadata cache
virtio_scsi: use virtio IRQ affinity
virtio_blk: use virtio IRQ affinity
blk-mq: provide a default queue mapping for virtio device
virtio: provide a method to get the IRQ affinity mask for a virtqueue
virtio: allow drivers to request IRQ affinity when creating VQs
virtio_pci: simplify MSI-X setup
virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
virtio_pci: use shared interrupts for virtqueues
virtio_pci: remove struct virtio_pci_vq_info
vhost: try avoiding avail index access when getting descriptor
virtio_mmio: expose header to userspace
02 Mar, 2017
1 commit
-
Update files that depend on the magic.h inclusion.
Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
28 Feb, 2017
4 commits
-
This basically passed up the pci_irq_get_affinity information through
virtio through an optional get_vq_affinity method. It is only implemented
by the PCI backend for now, and only when we use per-virtqueue IRQs.
Signed-off-by: Christoph Hellwig
Reviewed-by: Jason Wang
Signed-off-by: Michael S. Tsirkin
-
Add a struct irq_affinity pointer to the find_vqs methods, which if set
is used to tell the PCI layer to create the MSI-X vectors for our I/O
virtqueues with the proper affinity from the start. Compared to after
the fact affinity hints this gives us an instantly working setup and
allows allocating the irq descriptors node-local and avoiding interconnect
traffic. Last but not least this will allow blk-mq queues to be created
based on the interrupt affinity for storage drivers.
Signed-off-by: Christoph Hellwig
Reviewed-by: Jason Wang
Signed-off-by: Michael S. Tsirkin
-
Try to grab the MSI-X vectors early and fall back to the shared one
before doing lots of allocations.
Signed-off-by: Christoph Hellwig
Reviewed-by: Jason Wang
Signed-off-by: Michael S. Tsirkin
-
Signed-off-by: Christoph Hellwig
Reviewed-by: Jason Wang
Signed-off-by: Michael S. Tsirkin