06 Aug, 2016

1 commit

  • Pull virtio/vhost updates from Michael Tsirkin:

    - new vsock device support in host and guest

    - platform IOMMU support in host and guest, including compatibility
    quirks for legacy systems.

    - misc fixes and cleanups.

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    VSOCK: Use kvfree()
    vhost: split out vringh Kconfig
    vhost: detect 32 bit integer wrap around
    vhost: new device IOTLB API
    vhost: drop vringh dependency
    vhost: convert pre sorted vhost memory array to interval tree
    vhost: introduce vhost memory accessors
    VSOCK: Add Makefile and Kconfig
    VSOCK: Introduce vhost_vsock.ko
    VSOCK: Introduce virtio_transport.ko
    VSOCK: Introduce virtio_vsock_common.ko
    VSOCK: defer sock removal to transports
    VSOCK: transport-specific vsock_transport functions
    vhost: drop vringh dependency
    vop: pull in vhost Kconfig
    virtio: new feature to detect IOMMU device quirk
    balloon: check the number of available pages in leak balloon
    vhost: lockless enqueuing
    vhost: simplify work flushing

    Linus Torvalds
     

04 Aug, 2016

1 commit

  • The dma-mapping core and the implementations do not change the DMA
    attributes passed by pointer. Thus the pointer can point to const data.
    However the attributes do not have to be a bitfield. Instead unsigned
    long will do fine:

    1. This is just simpler, both in terms of reading the code and setting
    attributes. Instead of initializing local attributes on the stack
    and passing a pointer to them to dma_set_attr(), just set the bits.

    2. It brings safety and const-correctness checking because the
    attributes are passed by value.
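
    The two points above can be pictured with a small user-space sketch.
    The flag values and the count_known_attrs() helper are assumptions made
    for illustration only; they are not taken from the patch.

```c
#include <assert.h>

/* Illustrative attribute bits. The names follow kernel style, but the
 * values are assumptions made for this sketch. */
#define DMA_ATTR_WEAK_ORDERING  (1UL << 1)
#define DMA_ATTR_WRITE_COMBINE  (1UL << 2)

/* Hypothetical consumer. The attributes arrive by value, so the callee
 * cannot modify the caller's copy. Returns how many known flags are set. */
int count_known_attrs(unsigned long attrs)
{
    int n = 0;

    if (attrs & DMA_ATTR_WEAK_ORDERING)
        n++;
    if (attrs & DMA_ATTR_WRITE_COMBINE)
        n++;
    return n;
}

void example_attrs(void)
{
    /* No struct to initialise and no setter call: just OR the bits. */
    unsigned long attrs = DMA_ATTR_WEAK_ORDERING | DMA_ATTR_WRITE_COMBINE;

    assert(count_known_attrs(attrs) == 2);
    assert(count_known_attrs(0) == 0);  /* 0 takes the place of NULL */
}
```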

    Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
    Signed-off-by: Krzysztof Kozlowski
    Acked-by: Vineet Gupta
    Acked-by: Robin Murphy
    Acked-by: Hans-Christian Noren Egtvedt
    Acked-by: Mark Salter [c6x]
    Acked-by: Jesper Nilsson [cris]
    Acked-by: Daniel Vetter [drm]
    Reviewed-by: Bart Van Assche
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Fabien Dessenne [bdisp]
    Reviewed-by: Marek Szyprowski [vb2-core]
    Acked-by: David Vrabel [xen]
    Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Richard Kuo [hexagon]
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Gerald Schaefer [s390]
    Acked-by: Bjorn Andersson
    Acked-by: Hans-Christian Noren Egtvedt [avr32]
    Acked-by: Vineet Gupta [arc]
    Acked-by: Robin Murphy [arm64 and dma-iommu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Kozlowski
     

02 Aug, 2016

2 commits


01 May, 2016

3 commits


28 Apr, 2016

1 commit

  • The MIC VOP driver does two successive reads from user space to read a
    variable length data structure. Kernel memory corruption can result if
    the data structure changes between the two reads. This patch prevents
    that from happening.
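
    A minimal user-space sketch of this kind of double-fetch check follows.
    The descriptor layout, fetch_desc() and the hook are hypothetical, and
    memcpy() stands in for copy_from_user(); the actual driver change may
    differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-length descriptor: a fixed header followed by
 * payload_len bytes of payload. */
struct desc_hdr {
    unsigned char type;
    unsigned char payload_len;
};

/* Test hook that lets a simulated racing writer change the "user"
 * buffer between the two reads. */
static void (*between_reads_hook)(void *user_buf);

/* Simulated racing writer: enlarges the length field after the first read. */
void grow_payload_len(void *user_buf)
{
    ((unsigned char *)user_buf)[1] = 200;
}

int fetch_desc(void *user_buf, unsigned char **out)
{
    struct desc_hdr hdr;
    size_t total;
    unsigned char *copy;

    /* First read: header only, to learn the total size. */
    memcpy(&hdr, user_buf, sizeof(hdr));
    total = sizeof(hdr) + hdr.payload_len;

    copy = malloc(total);
    if (!copy)
        return -ENOMEM;

    if (between_reads_hook)
        between_reads_hook(user_buf);

    /* Second read: the whole structure, sized from the first read. */
    memcpy(copy, user_buf, total);

    /* Reject the request if the header changed between the two reads. */
    if (memcmp(&hdr, copy, sizeof(hdr)) != 0) {
        free(copy);
        return -EINVAL;
    }

    *out = copy;
    return 0;
}

void example_fetch(void)
{
    unsigned char buf[8] = { 1, 4, 'a', 'b', 'c', 'd' };
    unsigned char *out = NULL;

    assert(fetch_desc(buf, &out) == 0);
    free(out);
}
```

    Without the memcmp() step, later code that trusts the length stored in
    the second copy could run past the allocation sized from the first.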

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116651
    Reported-by: Pengfei Wang
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Ashutosh Dixit
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Ashutosh Dixit
     

21 Mar, 2016

1 commit

  • Pull x86 protection key support from Ingo Molnar:
    "This tree adds support for a new memory protection hardware feature
    that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

    There's a background article at LWN.net:

    https://lwn.net/Articles/643797/

    The gist is that protection keys allow the encoding of
    user-controllable permission masks in the pte. So instead of having a
    fixed protection mask in the pte (which needs a system call to change
    and works on a per page basis), the user can map a (handful of)
    protection mask variants and can change the masks at runtime relatively
    cheaply, without having to change every single page in the affected
    virtual memory range.

    This allows the dynamic switching of the protection bits of large
    amounts of virtual memory, via user-space instructions. It also
    allows more precise control of MMU permission bits: for example the
    executable bit is separate from the read bit (see more about that
    below).

    This tree adds the MM infrastructure and low level x86 glue needed for
    that, plus it adds a high level API to make use of protection keys -
    if a user-space application calls:

    mmap(..., PROT_EXEC);

    or

    mprotect(ptr, sz, PROT_EXEC);

    (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
    this special case, and will set a special protection key on this
    memory range. It also sets the appropriate bits in the Protection
    Keys User Rights (PKRU) register so that the memory becomes unreadable
    and unwritable.

    So using protection keys the kernel is able to implement 'true'
    PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
    PROT_READ as well. Unreadable executable mappings have security
    advantages: they cannot be read via information leaks to figure out
    ASLR details, nor can they be scanned for ROP gadgets - and they
    cannot be used by exploits for data purposes either.

    We know about no user-space code that relies on pure PROT_EXEC
    mappings today, but binary loaders could start making use of this new
    feature to map binaries and libraries in a more secure fashion.

    There is other pending pkeys work that offers more high level system
    call APIs to manage protection keys - but those are not part of this
    pull request.

    Right now there's a Kconfig that controls this feature
    (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
    (like most x86 CPU feature enablement code that has no runtime
    overhead), but it's not user-configurable at the moment. If there's
    any serious problem with this then we can make it configurable and/or
    flip the default"
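
    The PROT_EXEC-only request described in the pull message can be issued
    from user space roughly as sketched below. The helper names are
    hypothetical, and whether the mapping really ends up unreadable depends
    on the CPU and kernel supporting protection keys; the sketch only shows
    the call shape.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Request an anonymous mapping with PROT_EXEC only (no PROT_READ or
 * PROT_WRITE). Returns NULL if the kernel refuses the request. */
void *map_exec_only(size_t len)
{
    void *p = mmap(NULL, len, PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from map_exec_only(). */
int unmap_region(void *p, size_t len)
{
    return munmap(p, len);
}

void example_exec_only(void)
{
    void *p = map_exec_only(4096);

    /* The mapping is never read here: on pkeys-capable systems a read
     * would be expected to fault. */
    if (p)
        assert(unmap_region(p, 4096) == 0);
}
```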

    * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
    x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
    mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
    x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
    mm/core, x86/mm/pkeys: Add execute-only protection keys support
    x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
    x86/mm/pkeys: Allow kernel to modify user pkey rights register
    x86/fpu: Allow setting of XSAVE state
    x86/mm: Factor out LDT init from context init
    mm/core, x86/mm/pkeys: Add arch_validate_pkey()
    mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
    x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
    x86/mm/pkeys: Add Kconfig prompt to existing config option
    x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
    x86/mm/pkeys: Dump PKRU with other kernel registers
    mm/core, x86/mm/pkeys: Differentiate instruction fetches
    x86/mm/pkeys: Optimize fault handling in access_error()
    mm/core: Do not enforce PKEY permissions on remote mm access
    um, pkeys: Add UML arch_*_access_permitted() methods
    mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
    x86/mm/gup: Simplify get_user_pages() PTE bit handling
    ...

    Linus Torvalds
     

16 Feb, 2016

1 commit

  • We will soon modify the vanilla get_user_pages() so it can no
    longer be used on mm/tasks other than 'current/current->mm',
    which is by far the most common way it is called. For now,
    we allow the old-style calls, but warn when they are used.
    (implemented in previous patch)

    This patch switches all callers of:

    get_user_pages()
    get_user_pages_unlocked()
    get_user_pages_locked()

    to stop passing tsk/mm so they will no longer see the warnings.

    Signed-off-by: Dave Hansen
    Reviewed-by: Thomas Gleixner
    Cc: Andrea Arcangeli
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Naoya Horiguchi
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Srikar Dronamraju
    Cc: Vlastimil Babka
    Cc: jack@suse.cz
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

15 Feb, 2016

2 commits


10 Feb, 2016

8 commits

  • This patch modifies the MIC host and card drivers to start using the
    VOP driver. The MIC host and card drivers now implement the VOP bus
    operations and register a VOP device on the VOP bus. MIC driver stack
    documentation is also updated to include the new VOP driver.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch moves the virtio specific debugfs hooks previously in
    mic_debugfs.c in the MIC host driver into the VOP driver. The
    Kconfig/Makefile is also updated to allow building the VOP driver.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch moves virtio functionality from the MIC card driver into a
    separate hardware independent Virtio Over PCIe (VOP) driver. This
    functionality was introduced in commit 2141c7c5ee67 ("Intel MIC Card
    Driver Changes for Virtio Devices.") in
    drivers/misc/mic/card/mic_virtio.c. Apart from being moved into a
    separate driver the functionality is essentially unchanged. See the
    above mentioned commit for a description of this functionality.

    Signed-off-by: Sudeep Dutt
    Signed-off-by: Ashutosh Dixit
    Signed-off-by: Greg Kroah-Hartman

    Ashutosh Dixit
     
  • This patch moves virtio functionality from the MIC host driver into a
    separate hardware independent Virtio Over PCIe (VOP) driver. This
    functionality was introduced in commit f69bcbf3b4c4 ("Intel MIC Host
    Driver Changes for Virtio Devices.") in
    drivers/misc/mic/host/mic_virtio.c. Apart from being moved into a
    separate driver the functionality is essentially unchanged. See the
    above mentioned commit for a description of this functionality.

    Signed-off-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch adds VOP driver data structures used in subsequent
    patches. These data structures are refactored from similar data
    structures used in the virtio parts of previous MIC host and card
    drivers.

    Signed-off-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • The Virtio Over PCIe (VOP) bus abstracts the low level hardware
    details like interrupts and mapping remote memory so that the same VOP
    driver can work without changes with different MIC host or card
    drivers as long as the hardware bus operations are implemented. The
    VOP driver registers itself on the VOP bus. The base PCIe drivers
    implement the bus ops and register VOP devices on the bus, resulting
    in the VOP driver being probed with the VOP devices. This allows the
    VOP functionality to be shared between multiple generations of Intel
    MIC products.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch deletes the virtio functionality from the MIC X100 card
    driver. A subsequent patch will re-enable this functionality by
    consolidating the hardware independent logic in a new Virtio over PCIe
    (VOP) driver.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch deletes the virtio functionality from the MIC X100 host
    driver. A subsequent patch will re-enable this functionality by
    consolidating the hardware independent logic in a new Virtio over PCIe
    (VOP) driver.

    Reviewed-by: Ashutosh Dixit
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     

08 Feb, 2016

9 commits

  • Instead of calling release_firmware() on every error and then jumping,
    let's have a common release_firmware() in the error path.
    This patch also fixes a memory leak where we missed release_firmware()
    if mic_x100_load_command_line() fails.

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudip Mukherjee
     
  • Instead of jumping to a label and then returning from there, let's
    return directly.

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudip Mukherjee
     
  • If request_firmware() succeeds then rc becomes 0. If the subsequent
    strcmp() test then fails, we jumped to the label done: and returned rc.
    Because rc was 0 we returned success, even though the check had failed
    and we were supposed to return an error.
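
    The pattern can be sketched in user space as follows. The fake_*()
    helpers and check_firmware() are hypothetical stand-ins for
    request_firmware()/release_firmware() and the driver code; only the
    control flow is the point.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

static int fw_held;  /* number of firmware images currently held */

/* Hypothetical stand-in for request_firmware(): 0 on success. */
static int fake_request_firmware(const char *name)
{
    if (!name || !name[0])
        return -ENOENT;
    fw_held++;
    return 0;
}

/* Hypothetical stand-in for release_firmware(). */
static void fake_release_firmware(void)
{
    fw_held--;
}

int firmware_held(void)
{
    return fw_held;
}

int check_firmware(const char *name, const char *want, const char *got)
{
    int rc;

    rc = fake_request_firmware(name);
    if (rc < 0)
        return rc;      /* nothing to release yet, so return directly */

    if (strcmp(want, got) != 0) {
        /* rc is 0 at this point; without this assignment the
         * function would report success for a failed check. */
        rc = -EINVAL;
        goto release;
    }

    /* ... the firmware would be used here ... */

release:
    /* Single release point shared by the success and error paths. */
    fake_release_firmware();
    return rc;
}

void example_check(void)
{
    assert(check_firmware("fw.bin", "boot", "other") == -EINVAL);
    assert(firmware_held() == 0);
}
```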

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudip Mukherjee
     
  • From the error path we are printing an error message with dev_err(). No
    need to print almost the same message with dev_dbg().

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudip Mukherjee
     
  • After the loop we test "if (!retry)" to see if we timed out. The problem
    is "retry--" is a post-op so retry will be -1 at the end of the loop. I
    have fixed this by changing it to a pre-op instead.
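
    The difference is easy to reproduce outside the kernel. In the sketch
    below the loop bodies are empty (the device never becomes ready), the
    function names are made up, and retry is assumed to start at 1 or more.

```c
#include <assert.h>

/* Buggy form: the post-decrement leaves retry at -1 once the loop runs
 * out, so the "!retry" test never detects the timeout. */
int timed_out_postdec(int retry)
{
    while (retry--)
        ;               /* poll the device; never ready in this sketch */
    return !retry;
}

/* Fixed form: the pre-decrement leaves retry at exactly 0 on timeout. */
int timed_out_predec(int retry)
{
    while (--retry)
        ;               /* poll the device; never ready in this sketch */
    return !retry;
}

void example_retry(void)
{
    assert(timed_out_postdec(5) == 0);  /* timeout goes unnoticed */
    assert(timed_out_predec(5) == 1);   /* timeout is detected */
}
```

    Note that the pre-decrement form polls one time fewer for the same
    starting value.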

    Signed-off-by: Dan Carpenter
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • This patch fixes the following crash seen when MIC reset is invoked in
    RESET_FAILED state due to device_del being called a second time on an
    already deleted device:

    [] device_del+0x45/0x1d0
    [] device_unregister+0x1e/0x60
    [] scif_unregister_device+0x12/0x20 [scif_bus]
    [] cosm_stop+0xaa/0xe0 [mic_cosm]
    [] cosm_reset_trigger_work+0x14/0x20 [mic_cosm]

    The fix consists in realizing that because cosm_reset changes the
    state to MIC_RESETTING, cosm_stop needs the previous state, before it
    changed to MIC_RESETTING, to decide whether a hw_ops->stop had
    previously been issued. This is now provided in a new cosm_device
    member cdev->prev_state.

    Reviewed-by: Sudeep Dutt
    Signed-off-by: Ashutosh Dixit
    Signed-off-by: Greg Kroah-Hartman

    Ashutosh Dixit
     
  • The error code passed to ERR_PTR() should always be negative. Also, the
    return value of scif_add_mmu_notifier() was never checked.
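
    Why the sign matters can be seen with a user-space re-creation of the
    error-pointer helpers. The definitions below are simplified
    approximations written for this sketch, not copies of the kernel
    headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode an errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Only the top MAX_ERRNO addresses count as errors, which is where
 * small negative values land when converted to a pointer. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

void example_err_ptr(void)
{
    /* Negative code: recognised as an error by the caller. */
    assert(IS_ERR(ERR_PTR(-ENOMEM)));
    assert(PTR_ERR(ERR_PTR(-ENOMEM)) == -ENOMEM);

    /* Positive code: looks like a small valid pointer, so the caller's
     * IS_ERR() check silently passes. */
    assert(!IS_ERR(ERR_PTR(ENOMEM)));
}
```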

    Signed-off-by: Eric Biggers
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • list_next_entry has been defined in list.h, so I replace list_entry_next
    with it.

    Signed-off-by: Geliang Tang
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Geliang Tang
     
  • Signed integer overflow is undefined. Also I added a check for
    "(offset < 0)" in scif_unregister() because that makes it match the
    other conditions and because I didn't want to subtract a negative.
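
    One common way to test for wrap-around without triggering it is to
    compare against the remaining headroom instead of computing the sum.
    The range_ok() helper below is a hypothetical illustration of that
    idea, not the code from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if [offset, offset + len) is a valid, non-wrapping range. */
int range_ok(long offset, long len)
{
    /* Reject negatives first so the subtraction below cannot overflow. */
    if (offset < 0 || len < 0)
        return 0;

    /* Compare against the headroom rather than evaluating offset + len,
     * which would be undefined behaviour if it overflowed. */
    if (len > LONG_MAX - offset)
        return 0;

    return 1;
}

void example_range(void)
{
    assert(range_ok(0, 10));
    assert(!range_ok(LONG_MAX, 1));
}
```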

    Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
    Signed-off-by: Dan Carpenter
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     

13 Jan, 2016

1 commit

  • checkpatch.pl wants arrays of strings declared as follows:

    static const char * const names[] = { "vq-1", "vq-2", "vq-3" };

    Currently the find_vqs() function takes a const char *names[] argument
    so passing checkpatch.pl's const char * const names[] results in a
    compiler error due to losing the second const.

    This patch adjusts the find_vqs() prototype and updates all virtio
    transports. This makes it possible for virtio_balloon.c, virtio_input.c,
    virtgpu_kms.c, and virtio_rpmsg_bus.c to use the checkpatch.pl-friendly
    type.
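
    A small sketch of the adjusted parameter type follows.
    total_name_len() is a hypothetical stand-in for a find_vqs()-style
    callee; it only shows that a fully const array can now be passed
    without discarding a qualifier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* With "const char * const names[]" neither the array slots nor the
 * strings can be modified through the parameter. */
size_t total_name_len(size_t n, const char * const names[])
{
    size_t i, total = 0;

    for (i = 0; i < n; i++)
        total += strlen(names[i]);
    return total;
}

/* The checkpatch.pl-friendly declaration from the commit message. */
static const char * const vq_names[] = { "vq-1", "vq-2", "vq-3" };

void example_names(void)
{
    /* Three names of four characters each. */
    assert(total_name_len(3, vq_names) == 12);
}
```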

    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Bjorn Andersson

    Stefan Hajnoczi
     

18 Oct, 2015

5 commits

  • We should be returning -ENOMEM here instead of success.

    Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
    Signed-off-by: Dan Carpenter
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • The caller expects that we take this lock again before returning,
    otherwise you get double unlocks and races.

    Fixes: ba612aa8b487 ('misc: mic: SCIF memory registration and unregistration')
    Signed-off-by: Dan Carpenter
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • In scif_node_connect() we were returning if the initialization of p2p_ji
    fails. But at that time p2p_ij has already been initialized and
    resources allocated for it. Since p2p_ij has not yet been added to the
    list at that point, we will have a leak.
    Let's deinitialize and release the resources connected to p2p_ij.

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudip Mukherjee
     
  • Handle a failed device_register(), replace kfree() with put_device(),
    which will call cosm/mbus/scif_release_dev().

    Signed-off-by: Geliang Tang
    Signed-off-by: Greg Kroah-Hartman

    Geliang Tang
     
  • Fixes randconfig build error reported at
    http://www.spinics.net/lists/kernel/msg2092346.html

    Reported-by: Jim Davis
    Reviewed-by: Dasaratharaman Chandramouli
    Signed-off-by: Ashutosh Dixit
    Signed-off-by: Greg Kroah-Hartman

    Ashutosh Dixit
     

05 Oct, 2015

1 commit


04 Oct, 2015

4 commits

  • This patch adds the SCIF kernel node QP control messages required to
    enable SCIF RMAs. Examples of such node QP control messages include
    registration, unregistration, remote memory allocation requests,
    remote memory unmap and SCIF remote fence requests.

    The patch also updates the SCIF driver with minor changes required to
    enable SCIF RMAs by adding the new files to the build, initializing
    RMA specific information during SCIF endpoint creation, reserving SCIF
    DMA channels, initializing SCIF RMA specific global data structures,
    adding the IOCTL hooks required for SCIF RMAs and updating RMA
    specific debugfs hooks.

    Reviewed-by: Ashutosh Dixit
    Reviewed-by: Nikhil Rao
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch implements the fence APIs required to synchronize
    DMAs. SCIF provides an interface to return a "mark" for all DMAs
    programmed at the instant the API was called. Users can then "wait" on
    the mark provided previously by blocking inside the kernel. Upon
    receipt of a DMA completion interrupt the waiting thread is woken
    up. There is also an interface to signal DMA completion by polling for
    a location to be updated via a "signal" cookie to avoid the interrupt
    overhead in the mark/wait interface. SCIF allows programming fences on
    both the local and the remote node for both the mark/wait or the fence
    signal APIs.

    Reviewed-by: Ashutosh Dixit
    Reviewed-by: Nikhil Rao
    Signed-off-by: Jacek Lawrynowicz
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • SCIF allows users to read from or write to registered remote memory
    via CPU copies or DMA. The API verifies that both local and remote
    windows are valid before initiating the CPU or DMA transfers. SCIF has
    optimized algorithms for handling byte aligned as well as cache line
    aligned DMA engines. A registration cache is maintained to avoid the
    overhead of pinning pages repeatedly if buffers are reused. The
    registration cache is invalidated upon receipt of MMU notifier
    callbacks. SCIF windows are destroyed and the pages are unpinned only
    once all prior DMAs initiated using that window are drained. Users can
    request synchronous DMA operations as well as tail byte ordering if
    required. CPU copies are always performed synchronously.

    Reviewed-by: Ashutosh Dixit
    Reviewed-by: Nikhil Rao
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt
     
  • This patch implements the SCIF mmap/munmap interface. A similar
    capability is provided to kernel clients via the
    scif_get_pages()/scif_put_pages() APIs. The SCIF mmap interface
    queries to check if a window is valid and then remaps the local
    virtual address to the remote physical pages. These mappings are
    subsequently destroyed upon receipt of the VMA close operation or
    scif_put_pages(). This functionality allows SCIF users to directly
    access remote memory without any driver interaction once the mappings
    are created thereby providing bare-metal PCIe latency. These mappings
    are zapped to avoid RMA accesses from user space, if a Coprocessor is
    reset.

    Reviewed-by: Ashutosh Dixit
    Reviewed-by: Nikhil Rao
    Signed-off-by: Sudeep Dutt
    Signed-off-by: Greg Kroah-Hartman

    Sudeep Dutt