08 Oct, 2020

1 commit

  • * tag 'v5.4.70': (3051 commits)
    Linux 5.4.70
    netfilter: ctnetlink: add a range check for l3/l4 protonum
    ep_create_wakeup_source(): dentry name can change under you...
    ...

    Conflicts:
    arch/arm/mach-imx/pm-imx6.c
    arch/arm64/boot/dts/freescale/imx8mm-evk.dts
    arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
    drivers/crypto/caam/caamalg.c
    drivers/gpu/drm/imx/dw_hdmi-imx.c
    drivers/gpu/drm/imx/imx-ldb.c
    drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
    drivers/mmc/host/sdhci-esdhc-imx.c
    drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
    drivers/net/ethernet/freescale/enetc/enetc.c
    drivers/net/ethernet/freescale/enetc/enetc_pf.c
    drivers/thermal/imx_thermal.c
    drivers/usb/cdns3/ep0.c
    drivers/xen/swiotlb-xen.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c

    Signed-off-by: Jason Liu

    Jason Liu
     

01 Oct, 2020

3 commits

  • [ Upstream commit b872d0640840018669032b20b6375a478ed1f923 ]

    vfio_pci_release() frees and clears the error and request eventfd
    contexts, but other paths such as vfio_pci_request() may be using
    them at the same time. These contexts are expected to be protected
    by the vdev->igate mutex, which vfio_pci_release() does not take.

    The issue was introduced by commit 1518ac272e78 ("vfio/pci: fix memory
    leaks of eventfd ctx"), and since commit 5c5866c593bb ("vfio/pci: Clear
    error and request eventfd ctx after releasing") it is very easy to
    trigger a kernel panic like this:

    [ 9513.904346] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
    [ 9513.913091] Mem abort info:
    [ 9513.915871] ESR = 0x96000006
    [ 9513.918912] EC = 0x25: DABT (current EL), IL = 32 bits
    [ 9513.924198] SET = 0, FnV = 0
    [ 9513.927238] EA = 0, S1PTW = 0
    [ 9513.930364] Data abort info:
    [ 9513.933231] ISV = 0, ISS = 0x00000006
    [ 9513.937048] CM = 0, WnR = 0
    [ 9513.940003] user pgtable: 4k pages, 48-bit VAs, pgdp=0000007ec7d12000
    [ 9513.946414] [0000000000000008] pgd=0000007ec7d13003, p4d=0000007ec7d13003, pud=0000007ec728c003, pmd=0000000000000000
    [ 9513.956975] Internal error: Oops: 96000006 [#1] PREEMPT SMP
    [ 9513.962521] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio hclge hns3 hnae3 [last unloaded: vfio_pci]
    [ 9513.972998] CPU: 4 PID: 1327 Comm: bash Tainted: G W 5.8.0-rc4+ #3
    [ 9513.980443] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B270.01 05/08/2020
    [ 9513.989274] pstate: 80400089 (Nzcv daIf +PAN -UAO BTYPE=--)
    [ 9513.994827] pc : _raw_spin_lock_irqsave+0x48/0x88
    [ 9513.999515] lr : eventfd_signal+0x6c/0x1b0
    [ 9514.003591] sp : ffff800038a0b960
    [ 9514.006889] x29: ffff800038a0b960 x28: ffff007ef7f4da10
    [ 9514.012175] x27: ffff207eefbbfc80 x26: ffffbb7903457000
    [ 9514.017462] x25: ffffbb7912191000 x24: ffff007ef7f4d400
    [ 9514.022747] x23: ffff20be6e0e4c00 x22: 0000000000000008
    [ 9514.028033] x21: 0000000000000000 x20: 0000000000000000
    [ 9514.033321] x19: 0000000000000008 x18: 0000000000000000
    [ 9514.038606] x17: 0000000000000000 x16: ffffbb7910029328
    [ 9514.043893] x15: 0000000000000000 x14: 0000000000000001
    [ 9514.049179] x13: 0000000000000000 x12: 0000000000000002
    [ 9514.054466] x11: 0000000000000000 x10: 0000000000000a00
    [ 9514.059752] x9 : ffff800038a0b840 x8 : ffff007ef7f4de60
    [ 9514.065038] x7 : ffff007fffc96690 x6 : fffffe01faffb748
    [ 9514.070324] x5 : 0000000000000000 x4 : 0000000000000000
    [ 9514.075609] x3 : 0000000000000000 x2 : 0000000000000001
    [ 9514.080895] x1 : ffff007ef7f4d400 x0 : 0000000000000000
    [ 9514.086181] Call trace:
    [ 9514.088618] _raw_spin_lock_irqsave+0x48/0x88
    [ 9514.092954] eventfd_signal+0x6c/0x1b0
    [ 9514.096691] vfio_pci_request+0x84/0xd0 [vfio_pci]
    [ 9514.101464] vfio_del_group_dev+0x150/0x290 [vfio]
    [ 9514.106234] vfio_pci_remove+0x30/0x128 [vfio_pci]
    [ 9514.111007] pci_device_remove+0x48/0x108
    [ 9514.115001] device_release_driver_internal+0x100/0x1b8
    [ 9514.120200] device_release_driver+0x28/0x38
    [ 9514.124452] pci_stop_bus_device+0x68/0xa8
    [ 9514.128528] pci_stop_and_remove_bus_device+0x20/0x38
    [ 9514.133557] pci_iov_remove_virtfn+0xb4/0x128
    [ 9514.137893] sriov_disable+0x3c/0x108
    [ 9514.141538] pci_disable_sriov+0x28/0x38
    [ 9514.145445] hns3_pci_sriov_configure+0x48/0xb8 [hns3]
    [ 9514.150558] sriov_numvfs_store+0x110/0x198
    [ 9514.154724] dev_attr_store+0x44/0x60
    [ 9514.158373] sysfs_kf_write+0x5c/0x78
    [ 9514.162018] kernfs_fop_write+0x104/0x210
    [ 9514.166010] __vfs_write+0x48/0x90
    [ 9514.169395] vfs_write+0xbc/0x1c0
    [ 9514.172694] ksys_write+0x74/0x100
    [ 9514.176079] __arm64_sys_write+0x24/0x30
    [ 9514.179987] el0_svc_common.constprop.4+0x110/0x200
    [ 9514.184842] do_el0_svc+0x34/0x98
    [ 9514.188144] el0_svc+0x14/0x40
    [ 9514.191185] el0_sync_handler+0xb0/0x2d0
    [ 9514.195088] el0_sync+0x140/0x180
    [ 9514.198389] Code: b9001020 d2800000 52800022 f9800271 (885ffe61)
    [ 9514.204455] ---[ end trace 648de00c8406465f ]---
    [ 9514.212308] note: bash[1327] exited with preempt_count 1

    Cc: Qian Cai
    Cc: Alex Williamson
    Fixes: 1518ac272e78 ("vfio/pci: fix memory leaks of eventfd ctx")
    Signed-off-by: Zeng Tao
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Zeng Tao
     
  • [ Upstream commit 5c5866c593bbd444d0339ede6a8fb5f14ff66d72 ]

    The next use of the device will generate an underflow from the
    stale reference.

    Cc: Qian Cai
    Fixes: 1518ac272e78 ("vfio/pci: fix memory leaks of eventfd ctx")
    Reported-by: Daniel Wagner
    Reviewed-by: Cornelia Huck
    Tested-by: Daniel Wagner
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Alex Williamson
     
  • [ Upstream commit 1518ac272e789cae8c555d69951b032a275b7602 ]

    Finishing a qemu-kvm run (-device vfio-pci,host=0001:01:00.0) triggers a
    few memory leaks after a while because vfio_pci_set_ctx_trigger_single()
    calls eventfd_ctx_fdget() without a matching eventfd_ctx_put() later.
    Fix it by calling eventfd_ctx_put() on those contexts in
    vfio_pci_release() before vfio_device_release().

    unreferenced object 0xebff008981cc2b00 (size 128):
    comm "qemu-kvm", pid 4043, jiffies 4294994816 (age 9796.310s)
    hex dump (first 32 bytes):
    01 00 00 00 6b 6b 6b 6b 00 00 00 00 ad 4e ad de ....kkkk.....N..
    ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........
    backtrace:
    [] slab_post_alloc_hook+0x74/0x9c
    [] kmem_cache_alloc_trace+0x2b4/0x3d4
    [] do_eventfd+0x54/0x1ac
    [] __arm64_sys_eventfd2+0x34/0x44
    [] do_el0_svc+0x128/0x1dc
    [] el0_sync_handler+0xd0/0x268
    [] el0_sync+0x164/0x180
    unreferenced object 0x29ff008981cc4180 (size 128):
    comm "qemu-kvm", pid 4043, jiffies 4294994818 (age 9796.290s)
    hex dump (first 32 bytes):
    01 00 00 00 6b 6b 6b 6b 00 00 00 00 ad 4e ad de ....kkkk.....N..
    ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........
    backtrace:
    [] slab_post_alloc_hook+0x74/0x9c
    [] kmem_cache_alloc_trace+0x2b4/0x3d4
    [] do_eventfd+0x54/0x1ac
    [] __arm64_sys_eventfd2+0x34/0x44
    [] do_el0_svc+0x128/0x1dc
    [] el0_sync_handler+0xd0/0x268
    [] el0_sync+0x164/0x180

    Signed-off-by: Qian Cai
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Qian Cai
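
The three fixes in this group converge on one pattern: drop the eventfd reference on release, clear the pointer so nothing stale remains, and do both under the same lock that the signalling path takes. The following is a minimal userspace sketch of that pattern, not the kernel code. It assumes a pthread mutex standing in for vdev->igate and a plain counter standing in for the eventfd context; `fake_ctx`, `fake_vdev`, `dev_release()` and `dev_request()` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an eventfd context: a reference count and
 * a counter of delivered signals. */
struct fake_ctx {
    int refs;
    int signals;
};

/* Hypothetical stand-in for the device state; igate plays the role of
 * vdev->igate from the commit message. */
struct fake_vdev {
    pthread_mutex_t igate;
    struct fake_ctx *req_trigger;
};

/* Drop the reference and clear the pointer under the lock, so that a
 * concurrent dev_request() sees either a valid context or NULL. */
static void dev_release(struct fake_vdev *vdev)
{
    pthread_mutex_lock(&vdev->igate);
    if (vdev->req_trigger) {
        vdev->req_trigger->refs--;   /* models eventfd_ctx_put() */
        vdev->req_trigger = NULL;    /* leave no stale pointer behind */
    }
    pthread_mutex_unlock(&vdev->igate);
}

/* Signal the context only if it is still set. Returns 1 if a signal
 * was delivered, 0 if there was nothing to signal. */
static int dev_request(struct fake_vdev *vdev)
{
    int signalled = 0;

    pthread_mutex_lock(&vdev->igate);
    if (vdev->req_trigger) {
        vdev->req_trigger->signals++;  /* models eventfd_signal() */
        signalled = 1;
    }
    pthread_mutex_unlock(&vdev->igate);
    return signalled;
}
```

In this model a request that arrives after release simply finds a NULL pointer and does nothing, instead of dereferencing freed memory as in the oops above.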
     

10 Sep, 2020

4 commits

  • commit ebfa440ce38b7e2e04c3124aa89c8a9f4094cf21 upstream.

    SR-IOV VFs do not implement the memory enable bit of the command
    register, therefore this bit is not set in config space after
    pci_enable_device(). This leads to an unintended difference
    between PF and VF in hand-off state to the user. We can correct
    this by setting the initial value of the memory enable bit in our
    virtualized config space. There's really no need, however, to
    ever fault a user on a VF, as this would only indicate an
    error in the user's management of the enable bit, versus a PF
    where the same access could trigger hardware faults.

    Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
    Signed-off-by: Alex Williamson
    Signed-off-by: Greg Kroah-Hartman

    Alex Williamson
     
  • commit abafbc551fddede3e0a08dee1dcde08fc0eb8476 upstream.

    Accessing the disabled memory space of a PCI device would typically
    result in a master abort response on conventional PCI, or an
    unsupported request on PCI express. The user would generally see
    these as a -1 response for the read return data and the write would be
    silently discarded, possibly with an uncorrected, non-fatal AER error
    triggered on the host. Some systems however take it upon themselves
    to bring down the entire system when they see something that might
    indicate a loss of data, such as this discarded write to a disabled
    memory space.

    To avoid this, we want to try to block the user from accessing memory
    spaces while they're disabled. We start with a semaphore around the
    memory enable bit, where writers modify the memory enable state and
    must be serialized, while readers make use of the memory region and
    can access in parallel. Writers include both direct manipulation via
    the command register, as well as any reset path where the internal
    mechanics of the reset may both explicitly and implicitly disable
    memory access, and manipulation of the MSI-X configuration, where the
    MSI-X vector table resides in MMIO space of the device. Readers
    include the read and write file ops to access the vfio device fd
    offsets as well as memory mapped access. In the latter case, we make
    use of our new vma list support to zap, or invalidate, those memory
    mappings in order to force them to be faulted back in on access.

    Our semaphore usage will stall user access to MMIO spaces across
    internal operations like reset, but the user might experience new
    behavior when trying to access the MMIO space while disabled via the
    PCI command register. Access via read or write while disabled will
    return -EIO and access via memory maps will result in a SIGBUS. This
    is expected to be compatible with known use cases and potentially
    provides better error handling capabilities than present in the
    hardware, while avoiding the more readily accessible and severe
    platform error responses that might otherwise occur.

    Fixes: CVE-2020-12888
    Reviewed-by: Peter Xu
    Signed-off-by: Alex Williamson
    Signed-off-by: Ajay Kaher
    Signed-off-by: Sasha Levin

    Ajay Kaher
     
  • commit 11c4cd07ba111a09f49625f9e4c851d83daf0a22 upstream.

    Rather than calling remap_pfn_range() when a region is mmap'd, set up
    a vm_ops handler to support dynamic faulting of the range on access.
    This allows us to manage a list of vmas actively mapping the area that
    we can later use to invalidate those mappings. The open callback
    invalidates the vma range so that all tracking is inserted in the
    fault handler and removed in the close handler.

    Reviewed-by: Peter Xu
    Signed-off-by: Alex Williamson
    Signed-off-by: Ajay Kaher
    Signed-off-by: Sasha Levin

    Ajay Kaher
     
  • commit 41311242221e3482b20bfed10fa4d9db98d87016 upstream.

    With conversion to follow_pfn(), DMA mapping a PFNMAP range depends on
    the range being faulted into the vma. Add support to manually provide
    that, in the same way as done on KVM with hva_to_pfn_remapped().

    Reviewed-by: Peter Xu
    Signed-off-by: Alex Williamson
    Signed-off-by: Ajay Kaher
    Signed-off-by: Sasha Levin

    Ajay Kaher
     

26 Aug, 2020

1 commit

  • [ Upstream commit aae7a75a821a793ed6b8ad502a5890fb8e8f172d ]

    The vfio_iommu_replay() function does not currently unwind on error,
    yet it does pin pages, perform IOMMU mapping, and modify the vfio_dma
    structure to indicate IOMMU mapping. The IOMMU mappings are torn down
    when the domain is destroyed, but the other actions go on to cause
    trouble later. For example, the iommu->domain_list can be empty if we
    only have a non-IOMMU backed mdev attached. We don't currently check
    if the list is empty before getting the first entry in the list, which
    leads to a bogus domain pointer. If a vfio_dma entry is erroneously
    marked as iommu_mapped, we'll attempt to use that bogus pointer to
    retrieve the existing physical page addresses.

    This is the scenario that uncovered this issue, attempting to hot-add
    a vfio-pci device to a container with an existing mdev device and DMA
    mappings, one of which could not be pinned, causing a failure adding
    the new group to the existing container and setting the conditions
    for a subsequent attempt to explode.

    To resolve this, we can first check if the domain_list is empty so
    that we can reject replay of a bogus domain, should we ever encounter
    this inconsistent state again in the future. The real fix though is
    to add the necessary unwind support, which means cleaning up the
    current pinning if an IOMMU mapping fails, then walking back through
    the r-b tree of DMA entries, reading from the IOMMU which ranges are
    mapped, and unmapping and unpinning those ranges. To be able to do
    this, we also defer marking the DMA entry as IOMMU mapped until all
    entries are processed, in order to allow the unwind to know the
    disposition of each entry.

    Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
    Reported-by: Zhiyi Guo
    Tested-by: Zhiyi Guo
    Reviewed-by: Cornelia Huck
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Alex Williamson
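
The unwind strategy in the commit above (undo partial work on failure, and defer the "mapped" flag until every entry has succeeded) can be sketched as follows. This is an illustrative model only: it uses a flat array where the kernel walks an r-b tree, and `fake_dma`, its fields and `replay()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical DMA entry; the field names only loosely echo vfio_dma. */
struct fake_dma {
    bool can_pin;       /* test knob: whether pinning this entry succeeds */
    bool pinned;        /* pages currently pinned and mapped */
    bool iommu_mapped;  /* committed flag, set only after full success */
};

/* Replay all entries. On failure, undo the work done so far and leave
 * every entry exactly as it was on entry. */
static int replay(struct fake_dma *dma, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (dma[i].iommu_mapped)
            continue;            /* already mapped before this replay */
        if (!dma[i].can_pin)
            goto unwind;
        dma[i].pinned = true;    /* models pin + IOMMU map */
    }

    /* Deferred commit: mark entries only once nothing can fail. */
    for (i = 0; i < n; i++)
        dma[i].iommu_mapped = true;
    return 0;

unwind:
    /* Walk back over the entries handled before the failure. The
     * uncommitted flag tells us which ones this pass touched. */
    while (i-- > 0) {
        if (!dma[i].iommu_mapped)
            dma[i].pinned = false;  /* models unmap + unpin */
    }
    return -1;
}
```

Because the flag is committed last, the unwind loop can tell entries mapped by this pass apart from entries that were mapped earlier.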
     

24 Jun, 2020

3 commits

  • [ Upstream commit aa8ba13cae3134b8ef1c1b6879f66372531da738 ]

    kobject_init_and_add() takes a reference even when it fails.
    If this function returns an error, kobject_put() must be called to
    properly clean up the memory associated with the object. Thus,
    replace kfree() with kobject_put() to fix this issue. Previous
    commit "b8eb718348b8" fixed a similar problem.

    Fixes: 7b96953bc640 ("vfio: Mediated device Core driver")
    Signed-off-by: Qiushi Wu
    Reviewed-by: Cornelia Huck
    Reviewed-by: Kirti Wankhede
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Qiushi Wu
     
  • [ Upstream commit bc138db1b96264b9c1779cf18d5a3b186aa90066 ]

    The PCI Code and ID Assignment Specification changed capability ID 0
    from reserved to a NULL capability in the v1.1 revision. The NULL
    capability is defined to include only the 16-bit capability header,
    i.e. only the ID and next pointer. Unfortunately vfio-pci creates a
    map of config space, where ID 0 is used to reserve the standard type
    0 header. Finding an actual capability with this ID therefore results
    in a bogus range marked in that map and conflicts with subsequent
    capabilities. As this seems to be a dummy capability anyway and we
    already support dropping capabilities, let's hide this one rather than
    delving into the potentially subtle dependencies within our map.

    Seen on an NVIDIA Tesla T4.

    Reviewed-by: Cornelia Huck
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Alex Williamson
     
  • [ Upstream commit 3e63b94b6274324ff2e7d8615df31586de827c4e ]

    vfio_pci_disable() calls vfio_config_free() but forgets to call
    free_perm_bits(), resulting in memory leaks:

    unreferenced object 0xc000000c4db2dee0 (size 16):
    comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
    hex dump (first 16 bytes):
    00 00 ff 00 ff ff ff ff ff ff ff ff ff ff 00 00 ................
    backtrace:
    [] alloc_perm_bits+0x58/0xe0 [vfio_pci]
    [] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
    init_pci_cap_msi_perm at drivers/vfio/pci/vfio_pci_config.c:1125
    (inlined by) vfio_msi_cap_len at drivers/vfio/pci/vfio_pci_config.c:1180
    (inlined by) vfio_cap_len at drivers/vfio/pci/vfio_pci_config.c:1241
    (inlined by) vfio_cap_init at drivers/vfio/pci/vfio_pci_config.c:1468
    (inlined by) vfio_config_init at drivers/vfio/pci/vfio_pci_config.c:1707
    [] vfio_pci_open+0x234/0x700 [vfio_pci]
    [] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
    [] ksys_ioctl+0xd8/0x130
    [] sys_ioctl+0x28/0x40
    [] system_call_exception+0x114/0x1e0
    [] system_call_common+0xf0/0x278
    unreferenced object 0xc000000c4db2e330 (size 16):
    comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
    hex dump (first 16 bytes):
    00 ff ff 00 ff ff ff ff ff ff ff ff ff ff 00 00 ................
    backtrace:
    [] alloc_perm_bits+0x44/0xe0 [vfio_pci]
    [] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
    [] vfio_pci_open+0x234/0x700 [vfio_pci]
    [] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
    [] ksys_ioctl+0xd8/0x130
    [] sys_ioctl+0x28/0x40
    [] system_call_exception+0x114/0x1e0
    [] system_call_common+0xf0/0x278

    Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
    Signed-off-by: Qian Cai
    [aw: rolled in follow-up patch]
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Qian Cai
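
The NULL capability commit in this group hides capability ID 0 rather than tracking it. As a rough illustration of what hiding means for a linked capability list, the sketch below unlinks ID 0 entries from a 256-byte config space image by pointing the previous entry's next pointer past them. It is a simplified model, not the vfio-pci code; `hide_null_caps()` and the constant names are hypothetical, and offset 0x34 is the standard capabilities pointer in a type 0 header.

```c
#include <stdint.h>

#define CAP_PTR_OFFSET 0x34  /* capabilities pointer in a type 0 header */
#define CAP_ID_NULL    0x00  /* NULL capability: ID and next pointer only */

/* Unlink every capability with ID 0 from the list in a 256-byte
 * config space image. Returns the number of capabilities hidden. */
static int hide_null_caps(uint8_t cfg[256])
{
    uint8_t *prev = &cfg[CAP_PTR_OFFSET];
    uint8_t pos = *prev & 0xfc;  /* capabilities are dword aligned */
    int hidden = 0;
    int guard = 48;              /* bound the walk in case the list loops */

    while (pos >= 0x40 && guard-- > 0) {
        uint8_t id = cfg[pos];
        uint8_t next = cfg[pos + 1];

        if (id == CAP_ID_NULL) {
            *prev = next;          /* previous entry now skips this one */
            hidden++;
        } else {
            prev = &cfg[pos + 1];  /* keep it; its next pointer is the new link */
        }
        pos = next & 0xfc;
    }
    return hidden;
}
```

For a list 0x40 (ID 0x01) -> 0x50 (ID 0x00) -> 0x60 (ID 0x10), the call rewrites the next pointer at 0x41 from 0x50 to 0x60 and leaves the other entries untouched.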
     

19 Jun, 2020

1 commit

  • * tag 'v5.4.47': (2193 commits)
    Linux 5.4.47
    KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
    KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
    ...

    Conflicts:
    arch/arm/boot/dts/imx6qdl.dtsi
    arch/arm/mach-imx/Kconfig
    arch/arm/mach-imx/common.h
    arch/arm/mach-imx/suspend-imx6.S
    arch/arm64/boot/dts/freescale/imx8qxp-mek.dts
    arch/powerpc/include/asm/cacheflush.h
    drivers/cpufreq/imx6q-cpufreq.c
    drivers/dma/imx-sdma.c
    drivers/edac/synopsys_edac.c
    drivers/firmware/imx/imx-scu.c
    drivers/net/ethernet/freescale/fec.h
    drivers/net/ethernet/freescale/fec_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/phy_device.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/usb/cdns3/gadget.c
    drivers/usb/dwc3/gadget.c
    include/uapi/linux/dma-buf.h

    Signed-off-by: Jason Liu

    Jason Liu
     

06 May, 2020

2 commits

  • commit 5cbf3264bc715e9eb384e2b68601f8c02bb9a61d upstream.

    Use follow_pfn() to get the PFN of a PFNMAP VMA instead of assuming that
    vma->vm_pgoff holds the base PFN of the VMA. This fixes a bug where
    attempting to do VFIO_IOMMU_MAP_DMA on an arbitrary PFNMAP'd region of
    memory calculates garbage for the PFN.

    Hilariously, this only got detected because the first "PFN" calculated
    by vaddr_get_pfn() is PFN 0 (vma->vm_pgoff==0), and iommu_iova_to_phys()
    uses PA==0 as an error, which triggers a WARN in vfio_unmap_unpin()
    because the translation "failed". PFN 0 is now unconditionally reserved
    on x86 in order to mitigate L1TF, which causes is_invalid_reserved_pfn()
    to return true and in turn results in vaddr_get_pfn() returning success
    for PFN 0. Eventually the bogus calculation runs into PFNs that aren't
    reserved and leads to failure in vfio_pin_map_dma(). The subsequent
    call to vfio_remove_dma() attempts to unmap PFN 0 and WARNs.

    WARNING: CPU: 8 PID: 5130 at drivers/vfio/vfio_iommu_type1.c:750 vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
    Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio ...
    CPU: 8 PID: 5130 Comm: sgx Tainted: G W 5.6.0-rc5-705d787c7fee-vfio+ #3
    Hardware name: Intel Corporation Mehlow UP Server Platform/Moss Beach Server, BIOS CNLSE2R1.D00.X119.B49.1803010910 03/01/2018
    RIP: 0010:vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
    Code: 0b 49 81 c5 00 10 00 00 e9 c5 fe ff ff bb 00 10 00 00 e9 3d fe
    RSP: 0018:ffffbeb5039ebda8 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff9a55cbf8d480 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9a52b771c200
    RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000fffffff2
    R10: 0000000000000001 R11: ffff9a51fa896000 R12: 0000000184010000
    R13: 0000000184000000 R14: 0000000000010000 R15: ffff9a55cb66ea08
    FS: 00007f15d3830b40(0000) GS:ffff9a55d5600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000561cf39429e0 CR3: 000000084f75f005 CR4: 00000000003626e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    vfio_remove_dma+0x17/0x70 [vfio_iommu_type1]
    vfio_iommu_type1_ioctl+0x9e3/0xa7b [vfio_iommu_type1]
    ksys_ioctl+0x92/0xb0
    __x64_sys_ioctl+0x16/0x20
    do_syscall_64+0x4c/0x180
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f15d04c75d7
    Code: 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48

    Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
    Signed-off-by: Sean Christopherson
    Signed-off-by: Alex Williamson
    Signed-off-by: Greg Kroah-Hartman

    Sean Christopherson
     
  • commit 0ea971f8dcd6dee78a9a30ea70227cf305f11ff7 upstream.

    Add parentheses to avoid possible vaddr overflow.

    Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
    Signed-off-by: Yan Zhao
    Signed-off-by: Alex Williamson
    Signed-off-by: Greg Kroah-Hartman

    Yan Zhao
     

17 Apr, 2020

1 commit

  • commit 723fe298ad85ad1278bd2312469ad14738953cc6 upstream.

    Since commit 7723f4c5ecdb ("driver core: platform: Add an error
    message to platform_get_irq*()"), platform_get_irq() calls dev_err()
    on an error. As we enumerate all interrupts until platform_get_irq()
    fails, we now systematically get a message such as:
    "vfio-platform fff51000.ethernet: IRQ index 3 not found" which is
    a false positive.

    Let's use platform_get_irq_optional() instead.

    Signed-off-by: Eric Auger
    Cc: stable@vger.kernel.org # v5.3+
    Reviewed-by: Andre Przywara
    Tested-by: Andre Przywara
    Signed-off-by: Alex Williamson
    Signed-off-by: Greg Kroah-Hartman

    Eric Auger
     

08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

24 Feb, 2020

1 commit

  • [ Upstream commit 338b4e10f939a71194d8ecef7ece205a942cec05 ]

    The nvlink2 subdriver for IBM Witherspoon machines preregisters
    GPU memory in the IOMMU API so KVM TCE code can map this memory
    for DMA as well. This is done by mm_iommu_newdev() called from
    vfio_pci_nvgpu_regops::mmap.

    In the unlikely event of failure, data->mem remains NULL, and
    since mm_iommu_put() (which unregisters the region and unpins memory
    if that was regular memory) does not expect mem=NULL, it should not be
    called.

    This adds a check to only call mm_iommu_put() for a valid data->mem.

    Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Alex Williamson
    Signed-off-by: Sasha Levin

    Alexey Kardashevskiy
     

21 Dec, 2019

1 commit

  • commit d567fb8819162099035e546b11a736e29c2af0ea upstream.

    Since irq_bypass_register_producer() is called after request_irq(), we
    should do tear-down in reverse order: irq_bypass_unregister_producer()
    then free_irq().

    Specifically, free_irq() may release resources required by the
    irqbypass del_producer() callback. Marc Zyngier provided a notable
    example on arm64 with GICv4 that he indicates has the potential
    to wedge the hardware:

    free_irq(irq)
      __free_irq(irq)
        irq_domain_deactivate_irq(irq)
          its_irq_domain_deactivate()
            [unmap the VLPI from the ITS]

    kvm_arch_irq_bypass_del_producer(cons, prod)
      kvm_vgic_v4_unset_forwarding(kvm, irq, ...)
        its_unmap_vlpi(irq)
          [Unmap the VLPI from the ITS (again), remap the original LPI]

    Signed-off-by: Jiang Yi
    Cc: stable@vger.kernel.org # v4.4+
    Fixes: 6d7425f109d26 ("vfio: Register/unregister irq_bypass_producer")
    Link: https://lore.kernel.org/kvm/20191127164910.15888-1-giangyi@amazon.com
    Reviewed-by: Marc Zyngier
    Reviewed-by: Eric Auger
    [aw: commit log]
    Signed-off-by: Alex Williamson
    Signed-off-by: Greg Kroah-Hartman

    Jiang Yi
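
The ordering rule from the commit above (tear down in the reverse order of setup) can be shown with a small model. This is only a sketch: the functions below are hypothetical stand-ins for request_irq(), irq_bypass_register_producer() and their counterparts, and the `order_errors` counter is an assumption used to make a wrong ordering observable.

```c
#include <stdbool.h>

/* Hypothetical interrupt state; order_errors counts teardown steps that
 * ran after a resource they depend on was already gone. */
struct fake_irq_ctx {
    bool irq_requested;
    bool producer_registered;
    int order_errors;
};

/* Setup order from the commit message: request the IRQ, then register
 * the bypass producer. */
static void fake_setup(struct fake_irq_ctx *c)
{
    c->irq_requested = true;        /* models request_irq() */
    c->producer_registered = true;  /* models irq_bypass_register_producer() */
}

/* Models irq_bypass_unregister_producer(): its callback is assumed to
 * still need the IRQ, so running it after the IRQ is freed is an error. */
static void fake_unregister_producer(struct fake_irq_ctx *c)
{
    if (!c->irq_requested)
        c->order_errors++;
    c->producer_registered = false;
}

/* Models free_irq(). */
static void fake_free_irq(struct fake_irq_ctx *c)
{
    c->irq_requested = false;
}

/* Teardown in reverse order of setup: producer first, then the IRQ. */
static void fake_teardown(struct fake_irq_ctx *c)
{
    fake_unregister_producer(c);
    fake_free_irq(c);
}
```

Swapping the two calls in `fake_teardown()` would increment `order_errors`, which mirrors the double unmap shown in the call chains above.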
     

26 Nov, 2019

10 commits


16 Oct, 2019

1 commit

  • After enabling CONFIG_IOMMU_DMA on X86 a new warning appears when
    compiling vfio:

    drivers/vfio/vfio_iommu_type1.c: In function ‘vfio_iommu_type1_attach_group’:
    drivers/vfio/vfio_iommu_type1.c:1827:7: warning: ‘resv_msi_base’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    ret = iommu_get_msi_cookie(domain->domain, resv_msi_base);
    ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    The warning is a false positive, because the call to iommu_get_msi_cookie()
    only happens when vfio_iommu_has_sw_msi() returned true. And that only
    happens when it also set resv_msi_base.

    But initialize the variable anyway to get rid of the warning.

    Signed-off-by: Joerg Roedel
    Reviewed-by: Cornelia Huck
    Reviewed-by: Eric Auger
    Signed-off-by: Alex Williamson

    Joerg Roedel
     

26 Sep, 2019

1 commit

    This patch is part of a series that extends the kernel ABI to allow
    passing tagged user pointers (with the top byte set to something other
    than 0x00) as syscall arguments.

    vaddr_get_pfn() uses provided user pointers for vma lookups, which can
    only be done with untagged pointers.

    Untag user pointers in this function.

    Link: http://lkml.kernel.org/r/87422b4d72116a975896f2b19b00f38acbd28f33.1563904656.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Reviewed-by: Eric Auger
    Reviewed-by: Vincenzo Frascino
    Reviewed-by: Catalin Marinas
    Reviewed-by: Kees Cook
    Cc: Dave Hansen
    Cc: Will Deacon
    Cc: Al Viro
    Cc: Felix Kuehling
    Cc: Jens Wiklander
    Cc: Khalid Aziz
    Cc: Mauro Carvalho Chehab
    Cc: Mike Rapoport
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
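
To see why the vma lookup needs an untagged pointer, consider the simplified model below. It assumes a 64-bit user address whose tag lives in the top byte and whose bit 55 is 0, so clearing bits 63:56 is enough; the kernel's own untagged_addr() helper is architecture specific and is not reproduced here. `untag_user_addr()`, `fake_vma` and `addr_in_vma()` are hypothetical names.

```c
#include <stdint.h>

/* Clear a tag stored in bits 63:56 of a user-space address. Assumes
 * bit 55 is 0, i.e. the address is a user address, not a kernel one. */
static uint64_t untag_user_addr(uint64_t addr)
{
    return addr & 0x00ffffffffffffffULL;
}

/* Hypothetical stand-in for a vma: a half-open address range. */
struct fake_vma {
    uint64_t start;
    uint64_t end;
};

/* Range check of the kind a vma lookup performs; it compares raw
 * address values, so a tagged pointer falls outside every range. */
static int addr_in_vma(const struct fake_vma *vma, uint64_t addr)
{
    return addr >= vma->start && addr < vma->end;
}
```

With a vma covering 0x7f12340000 to 0x7f12350000, the tagged address 0x2a00007f12345000 misses the range, while its untagged form 0x0000007f12345000 hits it.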
     

25 Sep, 2019

1 commit

  • Replace PAGE_SHIFT + compound_order(page) with the new page_shift()
    function. Minor improvements in readability.

    [akpm@linux-foundation.org: fix build in tce_page_is_contained()]
    Link: http://lkml.kernel.org/r/201907241853.yNQTrJWd%25lkp@intel.com
    Link: http://lkml.kernel.org/r/20190721104612.19120-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Andrew Morton
    Reviewed-by: Ira Weiny
    Acked-by: Kirill A. Shutemov
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     

21 Sep, 2019

2 commits

  • Pull VFIO updates from Alex Williamson:

    - Fix spapr iommu error case (Alexey Kardashevskiy)

    - Consolidate region type definitions (Cornelia Huck)

    - Restore saved original PCI state on release (hexin)

    - Simplify mtty sample driver interrupt path (Parav Pandit)

    - Support for reporting valid IOVA regions to user (Shameer Kolothum)

    * tag 'vfio-v5.4-rc1' of git://github.com/awilliam/linux-vfio:
    vfio_pci: Restore original state on release
    vfio/type1: remove duplicate retrieval of reserved regions
    vfio/type1: Add IOVA range capability support
    vfio/type1: check dma map request is within a valid iova range
    vfio/spapr_tce: Fix incorrect tce_iommu_group memory free
    vfio-mdev/mtty: Simplify interrupt generation
    vfio: re-arrange vfio region definitions
    vfio/type1: Update iova list on detach
    vfio/type1: Check reserved region conflict and update iova list
    vfio/type1: Introduce iova list and add iommu aperture validity check

    Linus Torvalds
     
  • Pull powerpc updates from Michael Ellerman:
    "This is a bit late, partly due to me travelling, and partly due to a
    power outage knocking out some of my test systems *while* I was
    travelling.

    - Initial support for running on a system with an Ultravisor, which
    is software that runs below the hypervisor and protects guests
    against some attacks by the hypervisor.

    - Support for building the kernel to run as a "Secure Virtual
    Machine", ie. as a guest capable of running on a system with an
    Ultravisor.

    - Some changes to our DMA code on bare metal, to allow devices with
    medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of
    DMA space.

    - Support for firmware assisted crash dumps on bare metal (powernv).

    - Two series fixing bugs in and refactoring our PCI EEH code.

    - A large series refactoring our exception entry code to use gas
    macros, both to make it more readable and also enable some future
    optimisations.

    As well as many cleanups and other minor features & fixups.

    Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew
    Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual,
    Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe
    JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig,
    Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand,
    Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg
    Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim
    Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo
    Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar,
    Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan
    Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver
    O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm,
    Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu,
    Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom
    Lendacky, Vasant Hegde"

    * tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (264 commits)
    powerpc/mm/mce: Keep irqs disabled during lockless page table walk
    powerpc: Use ftrace_graph_ret_addr() when unwinding
    powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
    ftrace: Look up the address of return_to_handler() using helpers
    powerpc: dump kernel log before carrying out fadump or kdump
    docs: powerpc: Add missing documentation reference
    powerpc/xmon: Fix output of XIVE IPI
    powerpc/xmon: Improve output of XIVE interrupts
    powerpc/mm/radix: remove useless kernel messages
    powerpc/fadump: support holes in kernel boot memory area
    powerpc/fadump: remove RMA_START and RMA_END macros
    powerpc/fadump: update documentation about option to release opalcore
    powerpc/fadump: consider f/w load area
    powerpc/opalcore: provide an option to invalidate /sys/firmware/opal/core file
    powerpc/opalcore: export /sys/firmware/opal/core for analysing opal crashes
    powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP
    powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel
    powerpc/fadump: improve how crashed kernel's memory is reserved
    powerpc/fadump: consider reserved ranges while releasing memory
    powerpc/fadump: make crash memory ranges array allocation generic
    ...

    Linus Torvalds
     

30 Aug, 2019

1 commit

  • Invalidating a TCE cache entry for each updated TCE is quite expensive.
    This makes use of the new iommu_table_ops::xchg_no_kill()/tce_kill()
    callbacks to bring down the time spent in mapping a huge guest DMA window.

    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190829085252.72370-4-aik@ozlabs.ru

    Alexey Kardashevskiy
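
The saving described above comes from splitting "update an entry" from "invalidate the cache" so that one invalidation can cover a whole run of updates. A minimal sketch of that batching, assuming a flat 64-entry table and a counter in place of a real TCE cache; `fake_tce_table`, `tce_set_no_kill()`, `tce_kill_range()` and `map_range()` are hypothetical names that only echo the xchg_no_kill()/tce_kill() callbacks.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_TCE_ENTRIES 64
#define FAKE_PAGE_SIZE   4096u

/* Hypothetical TCE table with a counter of issued invalidations. */
struct fake_tce_table {
    uint64_t entries[FAKE_TCE_ENTRIES];
    int invalidations;
};

/* Update one entry without touching the cache (models xchg_no_kill()). */
static void tce_set_no_kill(struct fake_tce_table *t, size_t idx, uint64_t val)
{
    t->entries[idx] = val;
}

/* Invalidate the cache for a run of entries (models tce_kill()). The
 * model only counts the call; a real table would use the range. */
static void tce_kill_range(struct fake_tce_table *t, size_t first, size_t count)
{
    (void)first;
    (void)count;
    t->invalidations++;
}

/* Map a contiguous run of entries, then invalidate once for the run.
 * Returns 0 on success, -1 if the range does not fit in the table. */
static int map_range(struct fake_tce_table *t, size_t first, size_t count,
                     uint64_t base)
{
    size_t i;

    if (first > FAKE_TCE_ENTRIES || count > FAKE_TCE_ENTRIES - first)
        return -1;

    for (i = 0; i < count; i++)
        tce_set_no_kill(t, first + i, base + i * FAKE_PAGE_SIZE);

    if (count)
        tce_kill_range(t, first, count);
    return 0;
}
```

Mapping 16 entries this way issues one invalidation instead of 16, which is the effect the commit is after for large guest DMA windows.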
     

24 Aug, 2019

1 commit


23 Aug, 2019

1 commit

  • vfio_pci_enable() saves the device's initial configuration information
    with the intent that it is restored in vfio_pci_disable(). However,
    the commit referenced in Fixes: below replaced the call to
    __pci_reset_function_locked(), which is not wrapped in a state save
    and restore, with pci_try_reset_function(), which overwrites the
    restored device state with the current state before applying it to the
    device. Reinstate use of __pci_reset_function_locked() to return to
    the desired behavior.

    Fixes: 890ed578df82 ("vfio-pci: Use pci "try" reset interface")
    Signed-off-by: hexin
    Signed-off-by: Liu Qi
    Signed-off-by: Zhang Yu
    Signed-off-by: Alex Williamson

    hexin
     

20 Aug, 2019

3 commits