06 Aug, 2016
1 commit
-
Pull virtio/vhost updates from Michael Tsirkin:
- new vsock device support in host and guest
- platform IOMMU support in host and guest, including compatibility
quirks for legacy systems.
- misc fixes and cleanups.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
VSOCK: Use kvfree()
vhost: split out vringh Kconfig
vhost: detect 32 bit integer wrap around
vhost: new device IOTLB API
vhost: drop vringh dependency
vhost: convert pre sorted vhost memory array to interval tree
vhost: introduce vhost memory accessors
VSOCK: Add Makefile and Kconfig
VSOCK: Introduce vhost_vsock.ko
VSOCK: Introduce virtio_transport.ko
VSOCK: Introduce virtio_vsock_common.ko
VSOCK: defer sock removal to transports
VSOCK: transport-specific vsock_transport functions
vhost: drop vringh dependency
vop: pull in vhost Kconfig
virtio: new feature to detect IOMMU device quirk
balloon: check the number of available pages in leak balloon
vhost: lockless enqueuing
vhost: simplify work flushing
04 Aug, 2016
1 commit
-
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski
Acked-by: Vineet Gupta
Acked-by: Robin Murphy
Acked-by: Hans-Christian Noren Egtvedt
Acked-by: Mark Salter [c6x]
Acked-by: Jesper Nilsson [cris]
Acked-by: Daniel Vetter [drm]
Reviewed-by: Bart Van Assche
Acked-by: Joerg Roedel [iommu]
Acked-by: Fabien Dessenne [bdisp]
Reviewed-by: Marek Szyprowski [vb2-core]
Acked-by: David Vrabel [xen]
Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
Acked-by: Joerg Roedel [iommu]
Acked-by: Richard Kuo [hexagon]
Acked-by: Geert Uytterhoeven [m68k]
Acked-by: Gerald Schaefer [s390]
Acked-by: Bjorn Andersson
Acked-by: Hans-Christian Noren Egtvedt [avr32]
Acked-by: Vineet Gupta [arc]
Acked-by: Robin Murphy [arm64 and dma-iommu]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
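As a rough illustration of the DMA attributes change described above, the sketch below passes attributes by value as bits in an unsigned long. It is plain userspace C, not kernel code: the EX_ATTR_* flags and both functions are hypothetical stand-ins and are not the kernel's DMA_ATTR_* definitions or the dma-mapping API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical attribute bits, invented for this sketch. */
#define EX_ATTR_WEAK_ORDERING  (1UL << 0)
#define EX_ATTR_NO_KERNEL_MAP  (1UL << 1)
#define EX_ATTR_ALL            (EX_ATTR_WEAK_ORDERING | EX_ATTR_NO_KERNEL_MAP)

/*
 * Hypothetical callee. attrs arrives by value, so the callee cannot
 * modify the caller's copy, and "no attributes" is simply 0 rather
 * than a NULL pointer.
 */
bool ex_wants_kernel_mapping(unsigned long attrs)
{
    assert((attrs & ~EX_ATTR_ALL) == 0); /* reject unknown bits */
    return !(attrs & EX_ATTR_NO_KERNEL_MAP);
}

/* Hypothetical caller side: set bits directly, no stack struct needed. */
unsigned long ex_default_attrs(void)
{
    unsigned long attrs = 0;

    attrs |= EX_ATTR_WEAK_ORDERING;
    return attrs;
}
```

A caller would then write something like ex_wants_kernel_mapping(ex_default_attrs()) or pass 0 where it used to pass NULL, which is the substitution the second rule of each semantic patch performs.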
02 Aug, 2016
2 commits
-
vringh is pulled in by caif and mic, but the other
vhost config does not need to be there.
In particular, it makes no sense to have vhost net/scsi/sock
under caif/mic.
Create a separate Kconfig file and put vringh bits there.
Signed-off-by: Michael S. Tsirkin
-
VOP selects VHOST_RING. Pull in Kconfig that includes it
to make it self-contained.
Signed-off-by: Michael S. Tsirkin
01 May, 2016
3 commits
-
Return statements at the end of void functions are useless.
The Coccinelle semantic patch used to make this change is as follows:
//
@@
identifier f;
expression e;
@@
void f(...) {}
//
Signed-off-by: Amitoj Kaur Chawla
Signed-off-by: Greg Kroah-Hartman
-
My static checker complains that we still use "mark" even when the
_scif_fence_mark() call fails so it can be uninitialized.
Signed-off-by: Dan Carpenter
Signed-off-by: Greg Kroah-Hartman
-
Fixes randconfig build error reported at
https://lkml.org/lkml/2016/4/3/135 by ensuring that
the VOP driver selects VIRTIO.
Reported-by: Fengguang Wu
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Acked-by: Randy Dunlap
Signed-off-by: Greg Kroah-Hartman
28 Apr, 2016
1 commit
-
The MIC VOP driver does two successive reads from user space to read a
variable length data structure. Kernel memory corruption can result if
the data structure changes between the two reads. This patch disallows
the chance of this happening.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116651
Reported-by: Pengfei Wang
Reviewed-by: Sudeep Dutt
Signed-off-by: Ashutosh Dixit
Cc: stable
Signed-off-by: Greg Kroah-Hartman
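A minimal userspace sketch of the double-fetch problem described above, and one way to guard against it. Everything here is hypothetical: ex_user_mem simulates user memory that may change between reads, ex_copy_from_user() stands in for a copy from user space, and the one-byte header layout is invented for illustration. The actual VOP fix may differ in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed header that announces the payload length. */
struct ex_hdr {
    uint8_t payload_len;
};

/*
 * Simulated user memory. If racing_len is nonzero, the length byte is
 * overwritten after every read, imitating a concurrent writer.
 */
struct ex_user_mem {
    uint8_t bytes[32];
    uint8_t racing_len;
};

/* Stand-in for a user-space copy; not a real kernel interface. */
void ex_copy_from_user(void *dst, struct ex_user_mem *um, size_t len)
{
    assert(len <= sizeof(um->bytes));
    memcpy(dst, um->bytes, len);
    if (um->racing_len)
        um->bytes[0] = um->racing_len;
}

/* Returns a heap copy of header plus payload, or NULL if rejected. */
uint8_t *ex_fetch_desc(struct ex_user_mem *um)
{
    struct ex_hdr first, second;
    size_t total;
    uint8_t *buf;

    /* First read: header only, to learn the total size. */
    ex_copy_from_user(&first, um, sizeof(first));
    total = sizeof(first) + first.payload_len;
    if (total > sizeof(um->bytes))
        return NULL;

    buf = malloc(total);
    if (!buf)
        return NULL;

    /* Second read: the whole structure. */
    ex_copy_from_user(buf, um, total);

    /* Reject the copy if the length changed between the two reads. */
    memcpy(&second, buf, sizeof(second));
    if (second.payload_len != first.payload_len) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The design point is that the size used for the allocation and the size stored in the copied structure are forced to agree before the data is used.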
21 Mar, 2016
1 commit
-
Pull x86 protection key support from Ingo Molnar:
"This tree adds support for a new memory protection hardware feature
that is available in upcoming Intel CPUs: 'protection keys' (pkeys).
There's a background article at LWN.net:
https://lwn.net/Articles/643797/
The gist is that protection keys allow the encoding of
user-controllable permission masks in the pte. So instead of having a
fixed protection mask in the pte (which needs a system call to change
and works on a per page basis), the user can map a (handful of)
protection mask variants and can change the masks runtime relatively
cheaply, without having to change every single page in the affected
virtual memory range.
This allows the dynamic switching of the protection bits of large
amounts of virtual memory, via user-space instructions. It also
allows more precise control of MMU permission bits: for example the
executable bit is separate from the read bit (see more about that
below).
This tree adds the MM infrastructure and low level x86 glue needed for
that, plus it adds a high level API to make use of protection keys -
if a user-space application calls:
mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);
(note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
this special case, and will set a special protection key on this
memory range. It also sets the appropriate bits in the Protection
Keys User Rights (PKRU) register so that the memory becomes unreadable
and unwritable.
So using protection keys the kernel is able to implement 'true'
PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
PROT_READ as well. Unreadable executable mappings have security
advantages: they cannot be read via information leaks to figure out
ASLR details, nor can they be scanned for ROP gadgets - and they
cannot be used by exploits for data purposes either.
We know about no user-space code that relies on pure PROT_EXEC
mappings today, but binary loaders could start making use of this new
feature to map binaries and libraries in a more secure fashion.
There is other pending pkeys work that offers more high level system
call APIs to manage protection keys - but those are not part of this
pull request.
Right now there's a Kconfig that controls this feature
(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
(like most x86 CPU feature enablement code that has no runtime
overhead), but it's not user-configurable at the moment. If there's
any serious problem with this then we can make it configurable and/or
flip the default"
* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
mm/core, x86/mm/pkeys: Add execute-only protection keys support
x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
x86/mm/pkeys: Allow kernel to modify user pkey rights register
x86/fpu: Allow setting of XSAVE state
x86/mm: Factor out LDT init from context init
mm/core, x86/mm/pkeys: Add arch_validate_pkey()
mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
x86/mm/pkeys: Add Kconfig prompt to existing config option
x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
x86/mm/pkeys: Dump PKRU with other kernel registers
mm/core, x86/mm/pkeys: Differentiate instruction fetches
x86/mm/pkeys: Optimize fault handling in access_error()
mm/core: Do not enforce PKEY permissions on remote mm access
um, pkeys: Add UML arch_*_access_permitted() methods
mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
x86/mm/gup: Simplify get_user_pages() PTE bit handling
...
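The PROT_EXEC-only case mentioned in the pull request can be exercised from user space with a sketch like the following. The helper name ex_map_exec_only() is hypothetical. mmap() itself is the standard call. Whether reads of the mapping actually fault depends on the CPU and kernel supporting protection keys. On systems without that support the mapping is assumed to remain readable, as the text above notes.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of anonymous memory requesting only PROT_EXEC
 * (no PROT_READ, no PROT_WRITE). Returns NULL on failure.
 */
void *ex_map_exec_only(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A binary loader could use the same protection argument with mprotect() on an existing mapping. The sketch deliberately never reads from the returned pointer, since that is exactly the access a pkeys-capable system would deny.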
16 Feb, 2016
1 commit
-
We will soon modify the vanilla get_user_pages() so it can no
longer be used on mm/tasks other than 'current/current->mm',
which is by far the most common way it is called. For now,
we allow the old-style calls, but warn when they are used.
(implemented in previous patch)
This patch switches all callers of:
get_user_pages()
get_user_pages_unlocked()
get_user_pages_locked()
to stop passing tsk/mm so they will no longer see the warnings.
Signed-off-by: Dave Hansen
Reviewed-by: Thomas Gleixner
Cc: Andrea Arcangeli
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Dave Hansen
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Kirill A. Shutemov
Cc: Linus Torvalds
Cc: Naoya Horiguchi
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: Srikar Dronamraju
Cc: Vlastimil Babka
Cc: jack@suse.cz
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com
Signed-off-by: Ingo Molnar
15 Feb, 2016
2 commits
-
Static checkers complain that this is a potential array overflow.
We verify that it's not on the next line so this code is OK, but
static checker warnings are annoying.
Signed-off-by: Dan Carpenter
Signed-off-by: Greg Kroah-Hartman
-
Swap the printk and kfree() to avoid a use after free bug.
Fixes: 61e9c905df78 ('misc: mic: Enable VOP host side functionality')
Signed-off-by: Dan Carpenter
Signed-off-by: Greg Kroah-Hartman
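The ordering fix in the commit above can be sketched in userspace C. The names are hypothetical: struct ex_dev and ex_release_dev() are invented for illustration, with snprintf() and free() standing in for the kernel's printk and kfree().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device record. */
struct ex_dev {
    int id;
};

/*
 * Format the log message first, then free. Reversing these two
 * statements would read dev->id after the memory was released.
 */
int ex_release_dev(struct ex_dev *dev, char *out, size_t outlen)
{
    int n;

    assert(dev != NULL && out != NULL);
    n = snprintf(out, outlen, "removing device %d", dev->id); /* last use */
    free(dev); /* safe: nothing reads dev after this point */
    return n;
}
```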
10 Feb, 2016
8 commits
-
This patch modifies the MIC host and card drivers to start using the
VOP driver. The MIC host and card drivers now implement the VOP bus
operations and register a VOP device on the VOP bus. MIC driver stack
documentation is also updated to include the new VOP driver.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch moves the virtio specific debugfs hooks previously in
mic_debugfs.c in the MIC host driver into the VOP driver. The
Kconfig/Makefile is also updated to allow building the VOP driver.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch moves virtio functionality from the MIC card driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit 2141c7c5ee67 ("Intel MIC Card
Driver Changes for Virtio Devices.") in
drivers/misc/mic/card/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.
Signed-off-by: Sudeep Dutt
Signed-off-by: Ashutosh Dixit
Signed-off-by: Greg Kroah-Hartman
-
This patch moves virtio functionality from the MIC host driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit f69bcbf3b4c4 ("Intel MIC Host
Driver Changes for Virtio Devices.") in
drivers/misc/mic/host/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.
Signed-off-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch adds VOP driver data structures used in subsequent
patches. These data structures are refactored from similar data
structures used in the virtio parts of previous MIC host and card
drivers.
Signed-off-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
The Virtio Over PCIe (VOP) bus abstracts the low level hardware
details like interrupts and mapping remote memory so that the same VOP
driver can work without changes with different MIC host or card
drivers as long as the hardware bus operations are implemented. The
VOP driver registers itself on the VOP bus. The base PCIe drivers
implement the bus ops and register VOP devices on the bus, resulting
in the VOP driver being probed with the VOP devices. This allows the
VOP functionality to be shared between multiple generations of Intel
MIC products.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch deletes the virtio functionality from the MIC X100 card
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch deletes the virtio functionality from the MIC X100 host
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
08 Feb, 2016
9 commits
-
Instead of calling release_firmware() on every error and then jumping,
let's have a common release_firmware() in the error path.
This patch also fixes a memory leak where we missed release_firmware()
if mic_x100_load_command_line() fails.
Signed-off-by: Sudip Mukherjee
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
Instead of jumping to a label and then returning from there, let's return
directly.
Signed-off-by: Sudip Mukherjee
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
If request_firmware() succeeds then rc becomes 0. After that if the test
for strcmp() fails then we were jumping to label done: and returning rc.
But rc being 0 we returned success whereas we have failed here and we
were supposed to return an error.
Signed-off-by: Sudip Mukherjee
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
From the error path we are printing an error message with dev_err(). No
need to print almost the same message with dev_dbg().
Signed-off-by: Sudip Mukherjee
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
After the loop we test "if (!retry)" to see if we timed out. The problem
is "retry--" is a post-op so retry will be -1 at the end of the loop. I
have fixed this by changing it to a pre-op instead.
Signed-off-by: Dan Carpenter
Signed-off-by: Greg Kroah-Hartman
-
This patch fixes the following crash seen when MIC reset is invoked in
RESET_FAILED state due to device_del being called a second time on an
already deleted device:
[] device_del+0x45/0x1d0
[] device_unregister+0x1e/0x60
[] scif_unregister_device+0x12/0x20 [scif_bus]
[] cosm_stop+0xaa/0xe0 [mic_cosm]
[] cosm_reset_trigger_work+0x14/0x20 [mic_cosm]
The fix consists in realizing that because cosm_reset changes the
state to MIC_RESETTING, cosm_stop needs the previous state, before it
changed to MIC_RESETTING, to decide whether a hw_ops->stop had
previously been issued. This is now provided in a new cosm_device
member cdev->prev_state.
Reviewed-by: Sudeep Dutt
Signed-off-by: Ashutosh Dixit
Signed-off-by: Greg Kroah-Hartman
-
The error code passed to ERR_PTR() always should be negated. Also, the
return value of scif_add_mmu_notifier() was never checked.
Signed-off-by: Eric Biggers
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.
Signed-off-by: Geliang Tang
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
Signed integer overflow is undefined. Also I added a check for
"(offset < 0)" in scif_unregister() because that makes it match the
other conditions and because I didn't want to subtract a negative.
Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter
Signed-off-by: Greg Kroah-Hartman
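A minimal sketch of the kind of check the commit above describes: validating an offset and length without ever evaluating a signed addition that could overflow. The function ex_range_ok() and its parameter types are assumptions made for this example and are not the SCIF code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if [offset, offset + len) is a non-negative range whose
 * end still fits in int64_t. The sum offset + len is never computed in
 * signed arithmetic, so no undefined behaviour can occur.
 */
bool ex_range_ok(int64_t offset, uint64_t len)
{
    if (offset < 0)
        return false;
    /* INT64_MAX - offset cannot overflow because offset >= 0. */
    return len <= (uint64_t)(INT64_MAX - offset);
}
```

Rearranging the comparison so that the subtraction happens on known non-negative values is a common way to sidestep signed overflow. Rejecting negative offsets first is what makes that rearrangement safe.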
13 Jan, 2016
1 commit
-
checkpatch.pl wants arrays of strings declared as follows:
static const char * const names[] = { "vq-1", "vq-2", "vq-3" };
Currently the find_vqs() function takes a const char *names[] argument
so passing checkpatch.pl's const char * const names[] results in a
compiler error due to losing the second const.
This patch adjusts the find_vqs() prototype and updates all virtio
transports. This makes it possible for virtio_balloon.c, virtio_input.c,
virtgpu_kms.c, and virtio_rpmsg_bus.c to use the checkpatch.pl-friendly
type.
Signed-off-by: Stefan Hajnoczi
Signed-off-by: Michael S. Tsirkin
Acked-by: Bjorn Andersson
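The const issue in the commit above can be reproduced in a small standalone sketch. ex_total_name_len() is a hypothetical stand-in for a find_vqs()-style callee, not the virtio prototype. The point is only that the parameter must be const at both pointer levels to accept a checkpatch.pl-style array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical callee. Declaring names as "const char * const []"
 * lets callers pass fully const arrays. With "const char *names[]"
 * the call in ex_demo() would discard the second const and the
 * compiler would complain.
 */
size_t ex_total_name_len(size_t n, const char * const names[])
{
    size_t total = 0;

    assert(names != NULL);
    for (size_t i = 0; i < n; i++)
        total += strlen(names[i]);
    return total;
}

/* Array declared the way checkpatch.pl asks for. */
static const char * const ex_names[] = { "vq-1", "vq-2", "vq-3" };

size_t ex_demo(void)
{
    return ex_total_name_len(sizeof(ex_names) / sizeof(ex_names[0]),
                             ex_names);
}
```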
18 Oct, 2015
5 commits
-
We should be returning -ENOMEM here instead of success.
Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
The caller expects that we take this lock again before returning
otherwise you get double unlocks and races.
Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
In scif_node_connect() we were returning if the initialization of p2p_ji
fails. But at that time p2p_ij has already been initialized and
resources allocated for it. Since p2p_ij has not yet been added to the
list at that point, those resources are leaked.
Let's deinitialize and release the resources connected to p2p_ij.
Signed-off-by: Sudip Mukherjee
Reviewed-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
Handle a failed device_register(), replace kfree() with put_device(),
which will call cosm/mbus/scif_release_dev().
Signed-off-by: Geliang Tang
Signed-off-by: Greg Kroah-Hartman
-
Fixes randconfig build error reported at
http://www.spinics.net/lists/kernel/msg2092346.html
Reported-by: Jim Davis
Reviewed-by: Dasaratharaman Chandramouli
Signed-off-by: Ashutosh Dixit
Signed-off-by: Greg Kroah-Hartman
05 Oct, 2015
1 commit
-
SCIF depends on IOVA which requires IOMMU_SUPPORT to be enabled.
The long term fix is to move IOVA from drivers/iommu to lib/
but this current patch should fix the reported issue.
Reported-by: Fengguang Wu
Reviewed-by: Ashutosh Dixit
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
04 Oct, 2015
4 commits
-
This patch adds the SCIF kernel node QP control messages required to
enable SCIF RMAs. Examples of such node QP control messages include
registration, unregistration, remote memory allocation requests,
remote memory unmap and SCIF remote fence requests.
The patch also updates the SCIF driver with minor changes required to
enable SCIF RMAs by adding the new files to the build, initializing
RMA specific information during SCIF endpoint creation, reserving SCIF
DMA channels, initializing SCIF RMA specific global data structures,
adding the IOCTL hooks required for SCIF RMAs and updating RMA
specific debugfs hooks.
Reviewed-by: Ashutosh Dixit
Reviewed-by: Nikhil Rao
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch implements the fence APIs required to synchronize
DMAs. SCIF provides an interface to return a "mark" for all DMAs
programmed at the instant the API was called. Users can then "wait" on
the mark provided previously by blocking inside the kernel. Upon
receipt of a DMA completion interrupt the waiting thread is woken
up. There is also an interface to signal DMA completion by polling for
a location to be updated via a "signal" cookie to avoid the interrupt
overhead in the mark/wait interface. SCIF allows programming fences on
both the local and the remote node for both the mark/wait or the fence
signal APIs.
Reviewed-by: Ashutosh Dixit
Reviewed-by: Nikhil Rao
Signed-off-by: Jacek Lawrynowicz
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
SCIF allows users to read from or write to registered remote memory
via CPU copies or DMA. The API verifies that both local and remote
windows are valid before initiating the CPU or DMA transfers. SCIF has
optimized algorithms for handling byte aligned as well as cache line
aligned DMA engines. A registration cache is maintained to avoid the
overhead of pinning pages repeatedly if buffers are reused. The
registration cache is invalidated upon receipt of MMU notifier
callbacks. SCIF windows are destroyed and the pages are unpinned only
once all prior DMAs initiated using that window are drained. Users can
request synchronous DMA operations as well as tail byte ordering if
required. CPU copies are always performed synchronously.
Reviewed-by: Ashutosh Dixit
Reviewed-by: Nikhil Rao
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman
-
This patch implements the SCIF mmap/munmap interface. A similar
capability is provided to kernel clients via the
scif_get_pages()/scif_put_pages() APIs. The SCIF mmap interface
queries to check if a window is valid and then remaps the local
virtual address to the remote physical pages. These mappings are
subsequently destroyed upon receipt of the VMA close operation or
scif_get_pages(). This functionality allows SCIF users to directly
access remote memory without any driver interaction once the mappings
are created thereby providing bare-metal PCIe latency. These mappings
are zapped to avoid RMA accesses from user space, if a Coprocessor is
reset.
Reviewed-by: Ashutosh Dixit
Reviewed-by: Nikhil Rao
Signed-off-by: Sudeep Dutt
Signed-off-by: Greg Kroah-Hartman