20 May, 2019

1 commit

  • ZONE_DMA was removed by commit
    ad67f5a6545 ("arm64: replace ZONE_DMA with ZONE_DMA32"),
    so we need to use __GFP_DMA32 instead (see the sketch after this
    entry); otherwise we hit:
    "
    [ 2.560837] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 2.569456] iommu: Failed to add device 5f020000.sata to group 0: -12
    [ 2.623471] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 2.632022] iommu: Failed to add device 5b040000.ethernet to group 0: -12
    [ 2.703749] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 2.712294] iommu: Failed to add device 5b050000.ethernet to group 0: -12
    [ 3.294966] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 3.303514] iommu: Failed to add device 5b010000.usdhc to group 0: -12
    [ 3.358298] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 3.366858] iommu: Failed to add device 5b020000.usdhc to group 0: -12
    [ 4.184750] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 4.193311] iommu: Failed to add device 5b050000.ethernet to group 0: -12
    [ 4.205606] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 4.214175] iommu: Failed to add device 5b050000.ethernet to group 0: -12
    [ 4.224358] arm-smmu 51400000.iommu: Cannot accommodate DMA translation for IOMMU page tables
    [ 4.232938] iommu: Failed to add device 5b050000.ethernet to group 0: -12
    "

    Fixes: 90d512cf0ab ("MLK-15007-1 iommu: arm: pgtable: alloc pagetable in DMA area")
    Signed-off-by: Peng Fan
    Reviewed-by: Richard Zhu

    Peng Fan
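    A minimal sketch of the allocation change described above, assuming the
    downstream page-table code requests the DMA zone explicitly (the function
    name here is illustrative, not the actual NXP patch):

        #include <linux/gfp.h>
        #include <linux/mm.h>

        /* With ZONE_DMA gone on arm64, request ZONE_DMA32 instead. */
        static void *alloc_pgtable_pages(size_t size)
        {
                /* was: GFP_KERNEL | __GFP_DMA */
                gfp_t gfp = GFP_KERNEL | __GFP_DMA32 | __GFP_ZERO;

                return (void *)__get_free_pages(gfp, get_order(size));
        }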
     

18 Apr, 2019

1 commit

  • Normally the iommu pagetable could live anywhere in the 64bit address
    space, but we carry one patch for the PCIe driver, commit 9e03e5076269
    ("MLK-15064-2 ARM64: DMA: limit the dma mask to be 32bit").

    That patch restricts swiotlb and iommu dma to 32bit addresses.

    So if we allocate pages in high memory, dma_map_single will return
    a 32bit address and we will get "Cannot accommodate DMA
    translation for IOMMU page tables", because `dma != virt_to_phys(pages)`
    (see the sketch of that check after this entry).

    So we restrict the lpae iommu pgtable allocation to the DMA area to fix
    this issue.

    Signed-off-by: Peng Fan

    Vipul: while rebasing on v4.19, applied manually, using
    commit 4b123757eeaa ("iommu/io-pgtable-arm: Make allocations NUMA-aware")
    Signed-off-by: Vipul Kumar

    Peng Fan
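    The check that produces the "Cannot accommodate DMA translation" message
    lives in the upstream LPAE page-table allocator; a trimmed sketch of that
    logic (the allocation call and quirk handling are simplified here):

        static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
                                            struct io_pgtable_cfg *cfg)
        {
                struct device *dev = cfg->iommu_dev;
                void *pages = alloc_pages_exact(size, gfp | __GFP_ZERO);
                dma_addr_t dma;

                if (!pages)
                        return NULL;

                dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
                if (dma_mapping_error(dev, dma))
                        goto out_free;

                /*
                 * The walker must be able to use physical addresses directly,
                 * so a translating or truncating DMA layer is a problem.
                 */
                if (dma != virt_to_phys(pages)) {
                        dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
                        dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
                        goto out_free;
                }

                return pages;

        out_free:
                free_pages_exact(pages, size);
                return NULL;
        }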
     

06 Apr, 2019

1 commit

  • [ Upstream commit 032ebd8548c9d05e8d2bdc7a7ec2fe29454b0ad0 ]

    L1 tables are allocated with __get_dma_pages, and therefore already
    ignored by kmemleak.

    Without this, the kernel would print this error message on boot,
    when the first L1 table is allocated:

    [ 2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black
    [ 2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S 4.19.16 #8
    [ 2.831227] Workqueue: events deferred_probe_work_func
    [ 2.836353] Call trace:
    ...
    [ 2.852532] paint_ptr+0xa0/0xa8
    [ 2.855750] kmemleak_ignore+0x38/0x6c
    [ 2.859490] __arm_v7s_alloc_table+0x168/0x1f4
    [ 2.863922] arm_v7s_alloc_pgtable+0x114/0x17c
    [ 2.868354] alloc_io_pgtable_ops+0x3c/0x78
    ...

    Fixes: e5fc9753b1a8314 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
    Signed-off-by: Nicolas Boichat
    Acked-by: Will Deacon
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Nicolas Boichat
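    A sketch of the resulting allocation path: only the slab-backed L2 tables
    are handed to kmemleak_ignore(), while L1 tables (from __get_dma_pages)
    are already invisible to kmemleak. Driver-internal names (lvl,
    data->l2_tables) follow io-pgtable-arm-v7s.c:

        if (lvl == 1)
                table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
        else if (lvl == 2)
                table = kmem_cache_zalloc(data->l2_tables, gfp);

        if (table && lvl == 2)
                kmemleak_ignore(table);   /* skip L1: kmemleak never saw it */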
     

03 Apr, 2019

1 commit

  • commit 0a352554da69b02f75ca3389c885c741f1f63235 upstream.

    IOMMUs using ARMv7 short-descriptor format require page tables (level 1
    and 2) to be allocated within the first 4GB of RAM, even on 64-bit
    systems.

    For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is
    defined (e.g. on arm64 platforms).

    For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note that
    we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is
    not strictly necessary, and would cause a warning in mm/sl*b.c, as we
    did not update GFP_SLAB_BUG_MASK.

    Also, print an error when the physical address does not fit in
    32-bit, to make debugging easier in the future.

    Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org
    Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
    Signed-off-by: Nicolas Boichat
    Acked-by: Will Deacon
    Cc: Christoph Hellwig
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Hsin-Yi Wang
    Cc: Huaisheng Ye
    Cc: Joerg Roedel
    Cc: Joonsoo Kim
    Cc: Matthew Wilcox
    Cc: Matthias Brugger
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Pekka Enberg
    Cc: Robin Murphy
    Cc: Sasha Levin
    Cc: Tomasz Figa
    Cc: Vlastimil Babka
    Cc: Yingjoe Chen
    Cc: Yong Wu
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Boichat
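    A sketch of what the description above amounts to in io-pgtable-arm-v7s.c
    (macro names follow the driver; treat the exact spelling as illustrative):

        #define ARM_V7S_TABLE_GFP_DMA (IS_ENABLED(CONFIG_ZONE_DMA32) ? \
                                       GFP_DMA32 : GFP_DMA)

        #define ARM_V7S_TABLE_SLAB_FLAGS (IS_ENABLED(CONFIG_ZONE_DMA32) ? \
                                          SLAB_CACHE_DMA32 : SLAB_CACHE_DMA)

        /* L2 tables come from a slab cache constrained below 4GB ... */
        data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
                                            ARM_V7S_TABLE_SIZE(2),
                                            ARM_V7S_TABLE_SIZE(2),
                                            ARM_V7S_TABLE_SLAB_FLAGS, NULL);

        /* ... and L1 tables are allocated with GFP_DMA32 when available. */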
     

27 Mar, 2019

1 commit

  • commit 4e50ce03976fbc8ae995a000c4b10c737467beaa upstream.

    Take into account that sg->offset can be bigger than PAGE_SIZE when
    setting the segment's sg->dma_address. Otherwise sg->dma_address will
    point at a different page, which makes DMA impossible, with errors like
    this:

    xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
    xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
    xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
    xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
    xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]

    Additionally, with a wrong sg->dma_address, unmap_sg will free the wrong
    pages, which can cause crashes like this:

    Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1
    Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
    Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
    Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
    Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
    Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
    Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
    Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
    Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1
    Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
    Feb 28 19:27:45 kernel: Call Trace:
    Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80
    Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2
    Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0
    Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180
    Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20
    Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60
    Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340
    Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30
    Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100
    Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180
    Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250
    Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0
    Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290
    Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30
    Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170
    Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Cc: stable@vger.kernel.org
    Reported-and-tested-by: Jan Viktorin
    Reviewed-by: Alexander Duyck
    Signed-off-by: Stanislaw Gruszka
    Fixes: 80187fd39dcb ('iommu/amd: Optimize map_sg and unmap_sg')
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Stanislaw Gruszka
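    An illustrative helper (not the driver's actual code) showing the
    arithmetic the fix relies on: the whole-page part of a large sg->offset
    belongs in the page index, and only the intra-page remainder may be added
    to the DMA address:

        #include <linux/mm.h>
        #include <linux/types.h>

        static dma_addr_t sg_segment_dma_addr(dma_addr_t iova_base,
                                              unsigned long page_idx,
                                              unsigned int sg_offset)
        {
                page_idx += sg_offset >> PAGE_SHIFT;      /* whole pages */
                return iova_base + ((dma_addr_t)page_idx << PAGE_SHIFT) +
                       (sg_offset & ~PAGE_MASK);          /* in-page remainder */
        }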
     

14 Mar, 2019

3 commits

  • [ Upstream commit 9825bd94e3a2baae1f4874767ae3a7d4c049720e ]

    When a VM is terminated, the VFIO driver detaches all pass-through
    devices from VFIO domain by clearing domain id and page table root
    pointer from each device table entry (DTE), and then invalidates
    the DTE. Then, the VFIO driver unmaps pages and invalidates IOMMU pages.

    Currently, the IOMMU driver keeps track of which IOMMU and how many
    devices are attached to the domain. When invalidating IOMMU pages,
    the driver checks if the IOMMU is still attached to the domain before
    issuing the invalidate page command.

    However, since VFIO has already detached all devices from the domain,
    the subsequent INVALIDATE_IOMMU_PAGES commands are being skipped as
    there is no IOMMU attached to the domain. This results in data
    corruption and could cause the PCI device to end up in an indeterminate
    state.

    Fix this by invalidating IOMMU pages when detaching a device,
    before decrementing the per-domain device reference counts.

    Cc: Boris Ostrovsky
    Suggested-by: Joerg Roedel
    Co-developed-by: Brijesh Singh
    Signed-off-by: Brijesh Singh
    Signed-off-by: Suravee Suthikulpanit
    Fixes: 6de8ad9b9ee0 ('x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs')
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Suravee Suthikulpanit
     
  • [ Upstream commit f1724c0883bb0ce93b8dcb94b53dcca3b75ac9a7 ]

    In the error path of map_sg there is an incorrect if condition
    for breaking out of the loop that searches the scatterlist
    for mapped pages to unmap. Instead of breaking out of the
    loop once all the pages that were mapped have been unmapped,
    it will break out of the loop after it has unmapped 1 page.
    Fix the condition, so it breaks out of the loop only after
    all the mapped pages have been unmapped.

    Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg")
    Cc: Joerg Roedel
    Signed-off-by: Jerry Snitselaar
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Jerry Snitselaar
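    The control flow after the fix, roughly (identifiers other than
    for_each_sg/iommu_num_pages are illustrative, not the driver's exact
    names):

        /* Undo every page that was mapped before the failure. */
        for_each_sg(sglist, s, nelems, i) {
                int j, npages = iommu_num_pages(sg_phys(s), s->length,
                                                PAGE_SIZE);

                for (j = 0; j < npages; j++) {
                        unmap_one_page(dom, s->dma_address +
                                            ((dma_addr_t)j << PAGE_SHIFT));

                        /* buggy version broke out when --mapped_pages != 0 */
                        if (--mapped_pages == 0)
                                goto out_free_iova;
                }
        }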
     
  • [ Upstream commit 51d8838d66d3249508940d8f59b07701f2129723 ]

    In the error path of map_sg, free_iova_fast is being called with
    address instead of the pfn. This results in a bad value getting into
    the rcache, and can result in hitting a BUG_ON when
    iova_magazine_free_pfns is called.

    Cc: Joerg Roedel
    Cc: Suravee Suthikulpanit
    Signed-off-by: Jerry Snitselaar
    Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg")
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Jerry Snitselaar
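    The fix boils down to passing a page frame number rather than a DMA
    address (a sketch; dma_dom and npages follow the driver's naming):

        /* free_iova_fast() and the IOVA rcache operate on PFNs */
        free_iova_fast(&dma_dom->iovad, address >> PAGE_SHIFT, npages);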
     

13 Feb, 2019

4 commits

  • [ Upstream commit a868e8530441286342f90c1fd9c5f24de3aa2880 ]

    After removing an entry from a queue (e.g. reading an event in
    arm_smmu_evtq_thread()) it is necessary to advance the MMIO consumer
    pointer to free the queue slot back to the SMMU. A memory barrier is
    required here so that all reads targeting the queue entry have
    completed before the consumer pointer is updated.

    The implementation of queue_inc_cons() relies on a writel() to complete
    the previous reads, but this is incorrect because writel() is only
    guaranteed to complete prior writes. This patch replaces the call to
    writel() with an mb(); writel_relaxed() sequence, which gives us the
    read->write ordering which we require.

    Cc: Robin Murphy
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Will Deacon
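    Roughly what the fixed helper looks like (the Q_* macros are the driver's
    queue index/wrap helpers):

        static void queue_inc_cons(struct arm_smmu_queue *q)
        {
                u32 cons = (Q_WRP(q, q->cons) | Q_IDX(q, q->cons)) + 1;

                q->cons = Q_OVF(q, q->cons) | Q_WRP(q, cons) | Q_IDX(q, cons);

                /*
                 * Complete all reads of the queue entry before the slot is
                 * handed back to the SMMU via the consumer pointer.
                 */
                mb();
                writel_relaxed(q->cons, q->cons_reg);
        }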
     
  • [ Upstream commit 89cddc563743cb1e0068867ac97013b2a5bf86aa ]

    qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
    clock and power requirements.
    On msm8996, multiple cores, viz. mdss, video, etc. use this
    smmu. On sdm845, this smmu is used with gpu.
    Add bindings for the same.

    Signed-off-by: Vivek Gautam
    Reviewed-by: Rob Herring
    Reviewed-by: Tomasz Figa
    Tested-by: Srinivas Kandagatla
    Reviewed-by: Robin Murphy
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Vivek Gautam
     
  • [ Upstream commit 84a9a75774961612d0c7dd34a1777e8f98a65abd ]

    The GITS_TRANSLATER MMIO doorbell register in the ITS hardware is
    architected to be 4 bytes in size, yet on hi1620 and earlier, Hisilicon
    have allocated the adjacent 4 bytes to carry some IMPDEF sideband
    information which results in an 8-byte MSI payload being delivered when
    signalling an interrupt:

    MSIAddr:
     |----4bytes----|----4bytes----|
     |    MSIData   |    IMPDEF    |

    This poses no problem for the ITS hardware because the adjacent 4 bytes
    are reserved in the memory map. However, when delivering MSIs to memory,
    as we do in the SMMUv3 driver for signalling the completion of a SYNC
    command, the extended payload will corrupt the 4 bytes adjacent to the
    "sync_count" member in struct arm_smmu_device. Fortunately, the current
    layout allocates these bytes to padding, but this is fragile and we
    should make this explicit.

    Reviewed-by: Robin Murphy
    Signed-off-by: Zhen Lei
    [will: Rewrote commit message and comment]
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Zhen Lei
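    One way to make the padding explicit, as a sketch (the upstream patch may
    word the comment and layout slightly differently):

        struct arm_smmu_device {
                /* ... */

                /*
                 * The SMMU writes its SYNC completion MSI here; Hisilicon
                 * hi1620 and earlier deliver an 8-byte payload, so reserve
                 * the adjacent word explicitly.
                 */
                union {
                        u32     sync_count;
                        u64     padding;
                };

                /* ... */
        };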
     
  • [ Upstream commit c12b08ebbe16f0d3a96a116d86709b04c1ee8e74 ]

    The parameter is still there but it's ignored. We need to check its
    value before deciding to go into passthrough mode for an AMD IOMMU v2
    capable device.

    We occasionally use this parameter to force a v2 capable device into
    translation mode to debug memory corruption that we suspect is
    caused by DMA writes.

    To address the following comment from Joerg Roedel on the first
    version, the v2 capability of the device is now completely ignored:
    > This breaks the iommu_v2 use-case, as it needs a direct mapping for the
    > devices that support it.

    And from Documentation/admin-guide/kernel-parameters.txt:
    This option does not override iommu=pt

    Fixes: aafd8ba0ca74 ("iommu/amd: Implement add_device and remove_device")

    Signed-off-by: Yu Zhao
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yu Zhao
     

07 Feb, 2019

1 commit

  • commit 198bc3252ea3a45b0c5d500e6a5b91cfdd08f001 upstream.

    Commit 9d3a4de4cb8d ("iommu: Disambiguate MSI region types") changed
    the reserved region type in intel_iommu_get_resv_regions() from
    IOMMU_RESV_RESERVED to IOMMU_RESV_MSI, but it forgot to also change
    the type in intel_iommu_put_resv_regions().

    This leads to a memory leak, because now the check in
    intel_iommu_put_resv_regions() for IOMMU_RESV_RESERVED will never
    be true, and no allocated regions will be freed.

    Fix this by changing the region type in intel_iommu_put_resv_regions()
    to IOMMU_RESV_MSI, matching the type of the allocated regions.

    Fixes: 9d3a4de4cb8d ("iommu: Disambiguate MSI region types")
    Cc: # v4.11+
    Signed-off-by: Gerald Schaefer
    Reviewed-by: Eric Auger
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Gerald Schaefer
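    The fixed cleanup path, roughly:

        static void intel_iommu_put_resv_regions(struct device *dev,
                                                 struct list_head *head)
        {
                struct iommu_resv_region *entry, *next;

                list_for_each_entry_safe(entry, next, head, list) {
                        /* was IOMMU_RESV_RESERVED, so nothing was ever freed */
                        if (entry->type == IOMMU_RESV_MSI)
                                kfree(entry);
                }
        }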
     

13 Jan, 2019

1 commit

  • commit 3569dd07aaad71920c5ea4da2d5cc9a167c1ffd4 upstream.

    The Intel IOMMU driver opportunistically skips a few top level page
    tables from the domain paging directory while programming the IOMMU
    context entry. However there is an implicit assumption in the code that
    domain's adjusted guest address width (agaw) would always be greater
    than IOMMU's agaw.

    The IOMMU capabilities in an upcoming platform cause the domain's agaw
    to be lower than IOMMU's agaw. The issue is seen when the IOMMU supports
    both 4-level and 5-level paging. The domain builds a 4-level page table
    based on agaw of 2. However the IOMMU's agaw is set as 3 (5-level). In
    this case the code incorrectly tries to skip page table levels.
    This causes the IOMMU driver to avoid programming the context entry. The
    fix handles this case and programs the context entry accordingly.

    Fixes: de24e55395698 ("iommu/vt-d: Simplify domain_context_mapping_one")
    Cc:
    Cc: Ashok Raj
    Cc: Jacob Pan
    Cc: Lu Baolu
    Reviewed-by: Lu Baolu
    Reported-by: Ramos Falcon, Ernesto R
    Tested-by: Ricardo Neri
    Signed-off-by: Sohil Mehta
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Sohil Mehta
     

10 Jan, 2019

1 commit

  • commit 3cd508a8c1379427afb5e16c2e0a7c986d907853 upstream.

    When we insert the sync sequence number into the CMD_SYNC.MSIData field,
    we do so in CPU-native byte order, before writing out the whole command
    as explicitly little-endian dwords. Thus on big-endian systems, the SMMU
    will receive and write back a byteswapped version of sync_nr, which would
    be perfect if it were targeting a similarly-little-endian ITS, but since
    it's actually writing back to memory being polled by the CPUs, they're
    going to end up seeing the wrong thing.

    Since the SMMU doesn't care what the MSIData actually contains, the
    minimal-overhead solution is to simply add an extra byteswap initially,
    such that it then writes back the big-endian format directly.

    Cc:
    Fixes: 37de98f8f1cf ("iommu/arm-smmu-v3: Use CMD_SYNC completion MSI")
    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Robin Murphy
     

13 Dec, 2018

4 commits

  • [ Upstream commit 829383e183728dec7ed9150b949cd6de64127809 ]

    memunmap() should be used to free the return of memremap(), not
    iounmap().

    Fixes: dfddb969edf0 ('iommu/vt-d: Switch from ioremap_cache to memremap')
    Signed-off-by: Pan Bian
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Pan Bian
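    The pairing rule in a nutshell (sketch):

        #include <linux/io.h>

        void *va = memremap(phys_addr, size, MEMREMAP_WB);

        if (va) {
                /* ... use the mapping ... */
                memunmap(va);   /* memremap() pairs with memunmap(), not iounmap() */
        }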
     
  • [ Upstream commit ab99be4683d9db33b100497d463274ebd23bd67e ]

    This register should have been programmed with the physical address
    of the memory location containing the shadow tail pointer for
    the guest virtual APIC log instead of the base address.

    Fixes: 8bda0cfbdc1a ('iommu/amd: Detect and initialize guest vAPIC log')
    Signed-off-by: Filippo Sironi
    Signed-off-by: Wei Wang
    Signed-off-by: Suravee Suthikulpanit
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Filippo Sironi
     
  • [ Upstream commit e5b78f2e349eef5d4fca5dc1cf5a3b4b2cc27abd ]

    If iommu_ops.add_device() fails, iommu_ops.domain_free() is still
    called, leading to a crash, as the domain was only partially
    initialized:

    ipmmu-vmsa e67b0000.mmu: Cannot accommodate DMA translation for IOMMU page tables
    sata_rcar ee300000.sata: Unable to initialize IPMMU context
    iommu: Failed to add device ee300000.sata to group 0: -22
    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
    ...
    Call trace:
    ipmmu_domain_free+0x1c/0xa0
    iommu_group_release+0x48/0x68
    kobject_put+0x74/0xe8
    kobject_del.part.0+0x3c/0x50
    kobject_put+0x60/0xe8
    iommu_group_get_for_dev+0xa8/0x1f0
    ipmmu_add_device+0x1c/0x40
    of_iommu_configure+0x118/0x190

    Fix this by checking if the domain's context already exists, before
    trying to destroy it.

    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Robin Murphy
    Fixes: d25a2a16f0889 ('iommu: Add driver for Renesas VMSA-compatible IPMMU')
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Geert Uytterhoeven
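    The fix is an early-out in the context teardown path, roughly:

        static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
        {
                if (!domain->mmu)
                        return;   /* add_device() failed before a context existed */

                /* ... disable the context and flush the TLB ... */
        }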
     
  • [ Upstream commit 19ed3e2dd8549c1a34914e8dad01b64e7837645a ]

    When handling a page request event without a PASID, go to the
    "no_pasid" branch instead of "bad_req". Otherwise, a NULL pointer
    dereference will happen there.

    Cc: Ashok Raj
    Cc: Jacob Pan
    Cc: Sohil Mehta
    Signed-off-by: Lu Baolu
    Fixes: a222a7f0bb6c9 'iommu/vt-d: Implement page request handling'
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Lu Baolu
     

14 Nov, 2018

1 commit

  • commit 7d321bd3542500caf125249f44dc37cb4e738013 upstream.

    The IO-pgtable code relies on the driver TLB invalidation callbacks to
    ensure that all page-table updates are visible to the IOMMU page-table
    walker.

    In the case that the page-table walker is cache-coherent, we cannot rely
    on an implicit DSB from the DMA-mapping code, so we must ensure that we
    execute a DSB in our tlb_add_flush() callback prior to triggering the
    invalidation.

    Cc:
    Cc: Robin Murphy
    Fixes: 2df7a25ce4a7 ("iommu/arm-smmu: Clean up DMA API usage")
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
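    The shape of the fix, as a sketch of the TLB-invalidation callback
    (feature flag name as in the driver; register writes elided):

        static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
                                                  size_t granule, bool leaf,
                                                  void *cookie)
        {
                struct arm_smmu_domain *smmu_domain = cookie;

                /*
                 * A coherent walker skips the DMA API sync, so publish the
                 * PTE updates ourselves before issuing the TLBI.
                 */
                if (smmu_domain->smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
                        wmb();

                /* ... write the TLBI registers ... */
        }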
     

05 Oct, 2018

1 commit

  • Boris Ostrovsky reported a memory leak with device passthrough when SME
    is active.

    The VFIO driver uses iommu_iova_to_phys() to get the physical address for
    an iova. This physical address is later passed into vfio_unmap_unpin() to
    unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning
    the memory. The pfn_valid() check was failing because the encryption mask
    was part of the physical address returned. This resulted in the memory not
    being unpinned and therefore leaked after the guest terminates.

    The memory encryption mask must be cleared from the physical address in
    iommu_iova_to_phys().

    Fixes: 2543a786aa25 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
    Reported-by: Boris Ostrovsky
    Cc: Tom Lendacky
    Cc: Joerg Roedel
    Cc:
    Cc: Borislav Petkov
    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Cc: kvm@vger.kernel.org
    Cc: Boris Ostrovsky
    Cc: # 4.14+
    Signed-off-by: Brijesh Singh
    Signed-off-by: Joerg Roedel

    Singh, Brijesh
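    An illustrative helper showing the required masking (the driver does this
    inside its iova_to_phys callback; __sme_clr() is the real kernel helper,
    the function and parameter names here are hypothetical):

        #include <linux/mem_encrypt.h>
        #include <linux/types.h>

        /* Strip the SME C-bit so callers see a true physical address. */
        static phys_addr_t decrypted_phys(u64 pte_paddr, unsigned long page_offset)
        {
                return __sme_clr(pte_paddr) | page_offset;
        }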
     

26 Sep, 2018

1 commit

  • ACPI HID devices do not actually have an alias for
    them in the IVRS. But dev_data->alias is still used
    for indexing into the IOMMU device table for devices
    being handled by the IOMMU. So for ACPI HID devices,
    we simply return the corresponding devid as an alias,
    as parsed from the IVRS table.

    Signed-off-by: Arindam Nath
    Fixes: 2bf9a0a12749 ('iommu/amd: Add iommu support for ACPI HID devices')
    Signed-off-by: Joerg Roedel

    Arindam Nath
     

25 Sep, 2018

2 commits

  • PASID table memory allocation could fail due to memory shortage.
    Limit the PASID table size to 1MiB, because the current 8MiB
    contiguous physical memory allocation can be hard to come by. Without
    a PASID table, the device can continue to work, with only shared
    virtual memory impacted. So let's go ahead with context mapping
    even if the memory allocation for the PASID table failed.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107783
    Fixes: cc580e41260d ("iommu/vt-d: Per PCI device pasid table interfaces")

    Cc: Ashok Raj
    Cc: Jacob Pan
    Cc: Mika Westerberg
    Reported-and-tested-by: Pelton Kyle D
    Tested-by: Mika Westerberg
    Signed-off-by: Lu Baolu
    Signed-off-by: Joerg Roedel

    Lu Baolu
     
  • In the iommu's shutdown handler we disable runtime PM, which could
    result in the irq handler running unclocked; since commit
    3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework")
    we warn about that fact.

    This can cause warnings on shutdown on some Rockchip machines, so
    free the irqs in the shutdown handler before we disable runtime PM
    (see the sketch after this entry).

    Reported-by: Enric Balletbo i Serra
    Fixes: 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework")
    Signed-off-by: Heiko Stuebner
    Tested-by: Enric Balletbo i Serra
    Acked-by: Marc Zyngier
    Signed-off-by: Joerg Roedel

    Heiko Stuebner
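    Roughly what the shutdown handler looks like after the change (the irq
    array fields follow the driver's naming):

        static void rk_iommu_shutdown(struct platform_device *pdev)
        {
                struct rk_iommu *iommu = platform_get_drvdata(pdev);
                int i;

                /* Drop the IRQs first so the handler can't run unclocked. */
                for (i = 0; i < iommu->num_irq; i++)
                        devm_free_irq(iommu->dev, iommu->irq[i], iommu);

                pm_runtime_force_suspend(&pdev->dev);
        }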
     

26 Aug, 2018

1 commit

  • Pull ARM SoC late updates from Olof Johansson:
    "A couple of late-merged changes that would be useful to get in this
    merge window:

    - Driver support for reset of audio complex on Meson platforms. The
    audio driver went in this merge window, and these changes have been
    in -next for a while (just not in our tree).

    - Power management fixes for IOMMU on Rockchip platforms, getting
    closer to kexec working on them, including Chromebooks.

    - Another pass updating "arm,psci" -> "psci" for some properties that
    have snuck in since last time it was done"

    * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    iommu/rockchip: Move irq request past pm_runtime_enable
    iommu/rockchip: Handle errors returned from PM framework
    arm64: rockchip: Force CONFIG_PM on Rockchip systems
    ARM: rockchip: Force CONFIG_PM on Rockchip systems
    arm64: dts: Fix various entry-method properties to reflect documentation
    reset: imx7: Fix always writing bits as 0
    reset: meson: add meson audio arb driver
    reset: meson: add dt-bindings for meson-axg audio arb

    Linus Torvalds
     

25 Aug, 2018

1 commit

  • Pull IOMMU updates from Joerg Roedel:

    - PASID table handling updates for the Intel VT-d driver. It implements
    a global PASID space now so that applications using multiple devices
    will just have one PASID.

    - A new config option to make iommu passthrough mode the default.

    - New sysfs attribute for iommu groups to export the type of the
    default domain.

    - A debugfs interface (for debug only) usable by IOMMU drivers to
    export internals to user-space.

    - R-Car Gen3 SoCs support for the ipmmu-vmsa driver

    - The ARM-SMMU now aborts transactions from unknown devices and devices
    not attached to any domain.

    - Various cleanups and smaller fixes all over the place.

    * tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits)
    iommu/omap: Fix cache flushes on L2 table entries
    iommu: Remove the ->map_sg indirection
    iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel
    iommu/arm-smmu-v3: Prevent any devices access to memory without registration
    iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA
    iommu/ipmmu-vmsa: Clarify supported platforms
    iommu/ipmmu-vmsa: Fix allocation in atomic context
    iommu: Add config option to set passthrough as default
    iommu: Add sysfs attribyte for domain type
    iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register
    iommu/arm-smmu: Error out only if not enough context interrupts
    iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
    iommu/io-pgtable-arm: Fix pgtable allocation in selftest
    iommu/vt-d: Remove the obsolete per iommu pasid tables
    iommu/vt-d: Apply per pci device pasid table in SVA
    iommu/vt-d: Allocate and free pasid table
    iommu/vt-d: Per PCI device pasid table interfaces
    iommu/vt-d: Add for_each_device_domain() helper
    iommu/vt-d: Move device_domain_info to header
    iommu/vt-d: Apply global PASID in SVA
    ...

    Linus Torvalds
     

24 Aug, 2018

2 commits

  • Enabling the interrupt early, before power has been applied to the
    device, can result in an interrupt being delivered too early if:

    - the IOMMU shares an interrupt with a VOP
    - the VOP has a pending interrupt (after a kexec, for example)

    In these conditions, we end up taking the interrupt without
    the IOMMU being ready to handle the interrupt (not powered on).

    Moving the interrupt request past the pm_runtime_enable() call
    makes sure we can at least access the IOMMU registers. Note that
    this is only a partial fix, and that the VOP interrupt will still
    be screaming until the VOP driver kicks in, which advocates for
    a more synchronized interrupt enabling/disabling approach.

    Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support")
    Reviewed-by: Heiko Stuebner
    Signed-off-by: Marc Zyngier
    Signed-off-by: Olof Johansson

    Marc Zyngier
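    The probe-ordering change, as a sketch (handler and label names follow
    the driver; error handling trimmed):

        pm_runtime_enable(dev);

        /*
         * Request the (possibly shared) IRQ only once the IOMMU can be
         * powered, so a pending VOP interrupt can't fire into a dead block.
         */
        for (i = 0; i < iommu->num_irq; i++) {
                err = devm_request_irq(iommu->dev, iommu->irq[i], rk_iommu_irq,
                                       IRQF_SHARED, dev_name(dev), iommu);
                if (err)
                        goto err_remove_sysfs;
        }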
     
  • pm_runtime_get_if_in_use can fail: either PM has been disabled
    altogether (-EINVAL), or the device hasn't been enabled yet (0).
    Sadly, the Rockchip IOMMU driver tends to conflate the two things
    by considering a non-zero return value as successful.

    This has the consequence of hiding other bugs, so let's handle this
    case throughout the driver, with a WARN_ON_ONCE so that we can try
    and work out what happened.

    Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support")
    Reviewed-by: Heiko Stuebner
    Signed-off-by: Marc Zyngier
    Signed-off-by: Olof Johansson

    Marc Zyngier
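    A sketch of the pattern the driver now uses around hardware access:

        int ret = pm_runtime_get_if_in_use(iommu->dev);

        /* -EINVAL: runtime PM disabled entirely; 0: device not powered up. */
        if (WARN_ON_ONCE(ret < 0))
                return ret;
        if (!ret)
                return 0;       /* powered down: nothing to do */

        /* ... access IOMMU registers ... */

        pm_runtime_put(iommu->dev);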
     

19 Aug, 2018

1 commit

  • Pull driver core updates from Greg KH:
    "Here are all of the driver core and related patches for 4.19-rc1.

    Nothing huge here, just a number of small cleanups and the ability to
    now stop the deferred probing after init happens.

    All of these have been in linux-next for a while with only a merge
    issue reported"

    * tag 'driver-core-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (21 commits)
    base: core: Remove WARN_ON from link dependencies check
    drivers/base: stop new probing during shutdown
    drivers: core: Remove glue dirs from sysfs earlier
    driver core: remove unnecessary function extern declare
    sysfs.h: fix non-kernel-doc comment
    PM / Domains: Stop deferring probe at the end of initcall
    iommu: Remove IOMMU_OF_DECLARE
    iommu: Stop deferring probe at end of initcalls
    pinctrl: Support stopping deferred probe after initcalls
    dt-bindings: pinctrl: add a 'pinctrl-use-default' property
    driver core: allow stopping deferred probe after init
    driver core: add a debugfs entry to show deferred devices
    sysfs: Fix internal_create_group() for named group updates
    base: fix order of OF initialization
    linux/device.h: fix kernel-doc notation warning
    Documentation: update firmware loader fallback reference
    kobject: Replace strncpy with memcpy
    drivers: base: cacheinfo: use OF property_read_u32 instead of get_property,read_number
    kernfs: Replace strncpy with memcpy
    device: Add #define dev_fmt similar to #define pr_fmt
    ...

    Linus Torvalds
     

18 Aug, 2018

2 commits

  • The CMA memory allocator doesn't support standard gfp flags for memory
    allocation, so there is no point having one as a parameter of the
    dma_alloc_from_contiguous() function. Replace it with a boolean no_warn
    argument, which covers all that the underlying cma_alloc() function
    supports.

    This will help avoid giving the false impression that this function
    supports standard gfp flags and that callers can pass __GFP_ZERO to get
    a zeroed buffer, which has already been an issue: see commit dd65a941f6ba
    ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

    Link: http://lkml.kernel.org/r/20180709122020eucas1p21a71b092975cb4a3b9954ffc63f699d1~-sqUFoa-h2939329393eucas1p2Y@eucas1p2.samsung.com
    Signed-off-by: Marek Szyprowski
    Acked-by: Michał Nazarewicz
    Acked-by: Vlastimil Babka
    Reviewed-by: Christoph Hellwig
    Cc: Laura Abbott
    Cc: Michal Hocko
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
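    The resulting interface change, in brief (prototype as described above;
    the caller line is illustrative):

        #include <linux/dma-contiguous.h>

        struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
                                               unsigned int align, bool no_warn);

        /* Callers map the only gfp bit CMA honours onto the new argument: */
        struct page *page = dma_alloc_from_contiguous(dev, count,
                                                      get_order(size),
                                                      gfp & __GFP_NOWARN);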
     
  • Use new return type vm_fault_t for fault handler. For now, this is just
    documenting that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    Ref: commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    In this patch all the callers of handle_mm_fault() are changed to use the
    vm_fault_t type.

    Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Cc: Matthew Wilcox
    Cc: Richard Henderson
    Cc: Tony Luck
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Richard Kuo
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: James Hogan
    Cc: Ley Foon Tan
    Cc: Jonas Bonn
    Cc: James E.J. Bottomley
    Cc: Benjamin Herrenschmidt
    Cc: Palmer Dabbelt
    Cc: Yoshinori Sato
    Cc: David S. Miller
    Cc: Richard Weinberger
    Cc: Guan Xuetao
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: "Levin, Alexander (Sasha Levin)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
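    The change, reduced to its signature (vm_fault_t is declared in
    linux/mm_types.h):

        #include <linux/mm_types.h>

        /* was: int handle_mm_fault(...) */
        vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
                                   unsigned long address, unsigned int flags);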
     

08 Aug, 2018

3 commits


30 Jul, 2018

1 commit


28 Jul, 2018

1 commit

  • Take the new bus limit into account (when present) for IOVA allocations,
    to accommodate those SoCs which integrate off-the-shelf IP blocks with
    narrower interconnects such that the link between a device output and an
    IOMMU input can truncate DMA addresses to even fewer bits than the
    native size of either block's interface would imply.

    Eventually it might make sense for the DMA core to apply this constraint
    up-front in dma_set_mask() and friends, but for now this seems like the
    least risky approach.

    Signed-off-by: Robin Murphy
    Acked-by: Ard Biesheuvel
    Acked-by: Joerg Roedel
    Signed-off-by: Christoph Hellwig

    Robin Murphy
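    The gist of the change in the IOVA allocation path, as a sketch
    (dev->bus_dma_mask is the new bus limit referred to above; the other
    variable names follow the surrounding dma-iommu code):

        /* Clamp the usable IOVA range to the narrowest link on the path. */
        if (dev->bus_dma_mask)
                dma_limit &= dev->bus_dma_mask;

        iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift, true);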
     

27 Jul, 2018

4 commits