17 Oct, 2022

6 commits

  • Check whether there is any RMR info associated with the devices
    behind the SMMU and, if so, install bypass SMRs for them. This
    keeps any ongoing traffic associated with these devices alive when
    we enable/reset the SMMU during probe().

    Signed-off-by: Jon Nettleton
    Signed-off-by: Steven Price
    Tested-by: Steven Price
    Tested-by: Laurentiu Tudor
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-10-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Jon Nettleton
     
  • Check whether there is any RMR info associated with the devices
    behind the SMMUv3 and, if so, install bypass STEs for them. This
    keeps any ongoing traffic associated with these devices alive when
    we enable/reset the SMMUv3 during probe().

    Tested-by: Hanjun Guo
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-9-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Shameer Kolothum
     
  • By default, the disable_bypass flag is set and any device without
    an iommu domain installs an STE with CFG_ABORT during
    arm_smmu_init_bypass_stes(). Introduce a "force" flag and
    move the STE update logic to arm_smmu_init_bypass_stes()
    so that we can force it to install a CFG_BYPASS STE for specific
    SIDs.

    This will be useful in a follow-up patch to install bypass
    for IORT RMR SIDs.

    Tested-by: Hanjun Guo
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-8-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Shameer Kolothum
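    The abort-vs-bypass decision described above can be modeled in a few
    lines of C. This is an illustrative sketch only, not the driver's
    code: the enum and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the decision in arm_smmu_init_bypass_stes():
 * names and types here are illustrative, not the driver's. */
enum ste_cfg { STE_CFG_ABORT, STE_CFG_BYPASS };

static enum ste_cfg init_bypass_ste(bool disable_bypass, bool force)
{
	/* "force" wins: RMR SIDs must bypass even with disable_bypass set */
	if (force || !disable_bypass)
		return STE_CFG_BYPASS;
	return STE_CFG_ABORT;
}
```

    The point of the "force" flag is the first case: with disable_bypass
    set (the default), only forced SIDs get a bypass STE.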
     
  • Introduce a helper to check the SID range and to initialize the l2
    strtab entries (bypass). This will be useful when we have to
    initialize the l2 strtab with bypass for RMR SIDs.

    Tested-by: Hanjun Guo
    Acked-by: Will Deacon
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-7-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Shameer Kolothum
     
  • Currently IORT provides a helper to retrieve HW MSI reserved
    regions. Change this to a generic helper that retrieves any
    IORT-related reserved regions. This will be useful when we add
    support for RMR nodes in subsequent patches.

    [Lorenzo: For ACPI IORT]

    Reviewed-by: Lorenzo Pieralisi
    Reviewed-by: Christoph Hellwig
    Tested-by: Steven Price
    Tested-by: Laurentiu Tudor
    Tested-by: Hanjun Guo
    Reviewed-by: Hanjun Guo
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-4-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Shameer Kolothum
     
  • A callback is introduced to struct iommu_resv_region to free memory
    allocations associated with the reserved region. This will be useful
    when we introduce support for IORT RMR based reserved regions.

    Reviewed-by: Christoph Hellwig
    Tested-by: Steven Price
    Tested-by: Laurentiu Tudor
    Tested-by: Hanjun Guo
    Signed-off-by: Shameer Kolothum
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20220615101044.1972-2-shameerali.kolothum.thodi@huawei.com
    Signed-off-by: Joerg Roedel

    Shameer Kolothum
     

28 Sep, 2022

1 commit

  • commit 154897807050c1161cb2660e502fc0470d46b986 upstream.

    Check the 5-level paging capability for the 57-bit address width
    instead of checking the 1GB large page capability.

    Fixes: 53fc7ad6edf2 ("iommu/vt-d: Correctly calculate sagaw value of IOMMU")
    Cc: stable@vger.kernel.org
    Reported-by: Raghunathan Srinivasan
    Signed-off-by: Yi Liu
    Reviewed-by: Jerry Snitselaar
    Reviewed-by: Kevin Tian
    Reviewed-by: Raghunathan Srinivasan
    Link: https://lore.kernel.org/r/20220916071212.2223869-2-yi.l.liu@intel.com
    Signed-off-by: Lu Baolu
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Yi Liu
     

20 Sep, 2022

1 commit

  • [ Upstream commit 0c5f6c0d8201a809a6585b07b6263e9db2c874a3 ]

    The translation table copying code for kdump kernels is currently based
    on the extended root/context entry formats of ECS mode defined in older
    VT-d v2.5, and doesn't handle the scalable mode formats. This causes
    the kexec capture kernel boot failure with DMAR faults if the IOMMU was
    enabled in scalable mode by the previous kernel.

    The ECS mode has already been deprecated by the VT-d spec since v3.0 and
    Intel IOMMU driver doesn't support this mode as there's no real hardware
    implementation. Hence this converts ECS checking in copying table code
    into scalable mode.

    The existing copying code consumes a bit in the context entry as a
    mark of a copied entry. That mark needs to work for the old format
    as well as for the extended context entries, and it is hard to find
    such a common bit for both legacy and scalable mode context
    entries. Replace it with a per-IOMMU bitmap instead.

    Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support")
    Cc: stable@vger.kernel.org
    Reported-by: Jerry Snitselaar
    Tested-by: Wen Jin
    Signed-off-by: Lu Baolu
    Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Lu Baolu
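    The per-IOMMU bitmap idea can be sketched in user-space C. This is a
    hypothetical model (the struct, names, and the fixed entry count are
    invented here): one "copied" bit per context entry lives beside the
    IOMMU state instead of being stolen from a format-dependent entry.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical per-IOMMU "copied" bitmap: one bit per context entry,
 * instead of stealing a bit inside the (format-dependent) entry. */
#define NR_CTX_ENTRIES 256
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct iommu_state {
	unsigned long copied[NR_CTX_ENTRIES / BITS_PER_WORD + 1];
};

static void mark_copied(struct iommu_state *iommu, unsigned int idx)
{
	iommu->copied[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

static int is_copied(const struct iommu_state *iommu, unsigned int idx)
{
	return !!(iommu->copied[idx / BITS_PER_WORD] &
		  (1UL << (idx % BITS_PER_WORD)));
}
```

    Because the mark is external to the entry, the same scheme works for
    legacy and scalable mode entry formats alike.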
     

15 Sep, 2022

2 commits

  • commit 53fc7ad6edf210b497230ce74b61b322a202470c upstream.

    The Intel IOMMU driver possibly selects between the first-level and the
    second-level translation tables for DMA address translation. However,
    the levels of page-table walks for the 4KB base page size are calculated
    from the SAGAW field of the capability register, which is only valid for
    the second-level page table. This causes the IOMMU driver to stop working
    if the hardware (or the emulated IOMMU) advertises only first-level
    translation capability and reports the SAGAW field as 0.

    This solves the above problem by considering both the first level and the
    second level when calculating the supported page table levels.

    Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level")
    Cc: stable@vger.kernel.org
    Signed-off-by: Lu Baolu
    Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Lu Baolu
     
  • [ Upstream commit 94a568ce32038d8ff9257004bb4632e60eb43a49 ]

    We started using a 64 bit completion value. Unfortunately, we only
    stored the low 32-bits, so a very large completion value would never
    be matched in iommu_completion_wait().

    Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore")
    Signed-off-by: John Sperbeck
    Link: https://lore.kernel.org/r/20220801192229.3358786-1-jsperbeck@google.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    John Sperbeck
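    The truncation described above is easy to reproduce outside the
    kernel. This is an illustrative user-space sketch (the function
    names are invented; the real semaphore lives in IOMMU write-back
    memory):

```c
#include <assert.h>
#include <stdint.h>

/* Buggy: only the low 32 bits of the 64-bit completion value are
 * stored, so a large value never matches on readback. */
static int sem_matches_buggy(uint64_t data)
{
	uint32_t stored = (uint32_t)data;	/* truncating store */
	return stored == data;			/* stored is widened back */
}

/* Fixed: store the full 64-bit value. */
static int sem_matches_fixed(uint64_t data)
{
	uint64_t stored = data;
	return stored == data;
}
```

    Any completion value above 2^32 - 1 fails the buggy comparison, so
    iommu_completion_wait() would spin forever waiting for a match.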
     

25 Aug, 2022

1 commit

  • [ Upstream commit bfdd231374181254742c5e2faef0bef2d30c0ee4 ]

    The single memory zone feature will remove ZONE_DMA32 and ZONE_DMA,
    which can make the pgtable PA size larger than 32 bits.

    Since the MediaTek IOMMU hardware supports at most a 35-bit PA in
    the pgtable, add a quirk to allow the PA of pgtables to go up to
    bit 35.

    Signed-off-by: Ning Li
    Signed-off-by: Yunfei Wang
    Reviewed-by: Robin Murphy
    Acked-by: Will Deacon
    Link: https://lore.kernel.org/r/20220630092927.24925-2-yf.wang@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yunfei Wang
     

17 Aug, 2022

3 commits

  • [ Upstream commit b0b0b77ea611e3088e9523e60860f4f41b62b235 ]

    KASAN reports:

    [ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
    [ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
    [ 4.683454][ T0]
    [ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
    [ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
    [ 4.703196][ T0] Call Trace:
    [ 4.706334][ T0]
    [ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)

    after converting the type of the first argument (@nr, bit number)
    of arch_test_bit() from `long` to `unsigned long`[0].

    Under certain conditions (for example, when ACPI NUMA is disabled
    via the command line), pxm_to_node() can return %NUMA_NO_NODE (-1).
    It is a valid 'magic' number for a NUMA node, but not a valid bit
    number to use in bitops.
    node_online() eventually descends to test_bit() without checking
    the input, assuming that is done on the caller side (which might be
    good for perf-critical tasks). There, -1 becomes %ULONG_MAX, which
    leads to an insane array index when calculating the bit position in
    memory.

    For now, add an explicit check for @node being not %NUMA_NO_NODE
    before calling test_bit(). The actual logic doesn't change here
    at all.

    [0] https://github.com/norov/linux/commit/0e862838f290147ea9c16db852d8d494b552d38d

    Fixes: ee34b32d8c29 ("dmar: support for parsing Remapping Hardware Static Affinity structure")
    Cc: stable@vger.kernel.org # 2.6.33+
    Reported-by: kernel test robot
    Signed-off-by: Alexander Lobakin
    Reviewed-by: Andy Shevchenko
    Reviewed-by: Lu Baolu
    Signed-off-by: Yury Norov
    Signed-off-by: Sasha Levin

    Alexander Lobakin
     
  • [ Upstream commit a91eb6803c1c715738682fece095145cbd68fe0b ]

    In qcom_iommu_has_secure_context(), we should call of_node_put()
    for the reference 'child' when breaking out of
    for_each_child_of_node(), which otherwise automatically increases
    and decreases the refcount as it iterates.

    Fixes: d051f28c8807 ("iommu/qcom: Initialize secure page table")
    Signed-off-by: Liang He
    Link: https://lore.kernel.org/r/20220719124955.1242171-1-windhl@126.com
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Liang He
     
  • [ Upstream commit fce398d2d02c0a9a2bedf7c7201b123e153e8963 ]

    If iommu_device_register() fails in exynos_sysmmu_probe(), the previous
    calls have to be cleaned up. In this case, the iommu_device_sysfs_add()
    should be cleaned up, by calling its remove counterpart call.

    Fixes: d2c302b6e8b1 ("iommu/exynos: Make use of iommu_device_register interface")
    Signed-off-by: Sam Protsenko
    Reviewed-by: Krzysztof Kozlowski
    Acked-by: Marek Szyprowski
    Link: https://lore.kernel.org/r/20220714165550.8884-3-semen.protsenko@linaro.org
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Sam Protsenko
     

12 Jul, 2022

1 commit

  • commit 316f92a705a4c2bf4712135180d56f3cca09243a upstream.

    Notifier calling chain uses priority to determine the execution
    order of the notifiers or listeners registered to the chain.
    PCI bus device hot add utilizes the notification mechanism.

    The current code sets a low priority (INT_MIN) for Intel's
    dmar_pci_bus_notifier and postpones DMAR decoding until after the
    new device is added to the IOMMU. The result is that the struct
    device pointer cannot be found in the DRHD search for the new
    device's DMAR/IOMMU. Subsequently, the device is put under the
    "catch-all" IOMMU instead of the correct one. This can cause a
    system hang when a device TLB invalidation is sent to the wrong
    IOMMU. Invalidation timeout errors and hard lockups have been
    observed, and data inconsistency/crashes may occur as well.

    This patch fixes the issue by setting a positive priority (1) for
    dmar_pci_bus_notifier while the priority of the IOMMU bus notifier
    uses the default value (0); therefore DMAR decoding happens ahead
    of the DRHD search for a new device, finding the correct IOMMU.

    Following is a 2-step example that triggers the bug by simulating
    PCI device hot add behavior in Intel Sapphire Rapids server.

    echo 1 > /sys/bus/pci/devices/0000:6a:01.0/remove
    echo 1 > /sys/bus/pci/rescan

    Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
    Cc: stable@vger.kernel.org # v3.15+
    Reported-by: Zhang, Bernice
    Signed-off-by: Jacob Pan
    Signed-off-by: Yian Chen
    Link: https://lore.kernel.org/r/20220521002115.1624069-1-yian.chen@intel.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Yian Chen
     

15 Jun, 2022

2 commits

  • [ Upstream commit b131fa8c1d2afd05d0b7598621114674289c2fbb ]

    It will cause a null-ptr-deref if platform_get_resource() returns
    NULL; we need to check the return value.

    Signed-off-by: Yang Yingliang
    Link: https://lore.kernel.org/r/20220425114525.2651143-1-yangyingliang@huawei.com
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Yang Yingliang
     
  • [ Upstream commit d9ed8af1dee37f181096631fb03729ece98ba816 ]

    It will cause a null-ptr-deref when using 'res' if
    platform_get_resource() returns NULL, so move the use of 'res'
    after devm_ioremap_resource(), which checks it, to avoid the
    null-ptr-deref. Also use devm_platform_get_and_ioremap_resource()
    to simplify the code.

    Signed-off-by: Yang Yingliang
    Link: https://lore.kernel.org/r/20220425114136.2649310-1-yangyingliang@huawei.com
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Yang Yingliang
     

09 Jun, 2022

11 commits

  • commit a3884774d731f03d3a3dd4fb70ec2d9341ceb39d upstream.

    The data type of the return value of iommu_map_sg_atomic() is
    ssize_t, but the data type of the iova size is size_t, i.e. one is
    signed while the other is unsigned.

    When the iommu_map_sg_atomic() return value is compared with the
    iova size, the signed value is converted to unsigned. If the iova
    map fails and iommu_map_sg_atomic() returns a negative error code,
    (ret < iova_len) is false, so the iova is not freed and the master
    can still successfully obtain the iova of the failed map, which is
    not expected.

    Therefore, check the return value of iommu_map_sg_atomic() in two
    steps, handling the negative case first.

    Fixes: ad8f36e4b6b1 ("iommu: return full error code from iommu_map_sg[_atomic]()")
    Signed-off-by: Yunfei Wang
    Cc: # 5.15.*
    Reviewed-by: Robin Murphy
    Reviewed-by: Miles Chen
    Link: https://lore.kernel.org/r/20220507085204.16914-1-yf.wang@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Yunfei Wang
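    The comparison pitfall can be demonstrated in a few lines of C.
    This is a sketch of the pattern, not the dma-iommu code itself; the
    helper names are invented:

```c
#include <assert.h>
#include <sys/types.h>	/* ssize_t */

/* Returns 1 when the caller would free the iova.
 * Buggy check: the usual arithmetic conversions turn the signed ret
 * into size_t, so a negative error code compares as a huge value. */
static int should_free_buggy(ssize_t ret, size_t iova_len)
{
	return ret < iova_len;	/* -ENOMEM becomes ~SIZE_MAX here */
}

/* Fixed check: handle the negative error case explicitly. */
static int should_free_fixed(ssize_t ret, size_t iova_len)
{
	return ret < 0 || (size_t)ret < iova_len;
}
```

    With the buggy check, a failed mapping (negative ret) looks like a
    fully successful one, so the iova leaks back to the caller.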
     
  • commit 8b9ad480bd1dd25f4ff4854af5685fa334a2f57a upstream.

    The bug is here:
    if (!iommu || iommu->dev->of_node != spec->np) {

    The list iterator value 'iommu' will *always* be set and non-NULL
    by list_for_each_entry(), so it is incorrect to assume that the
    iterator value will be NULL if the list is empty or no element is
    found (in fact, it will point to an invalid structure object
    containing HEAD).

    To fix the bug, use a new value 'iter' as the list iterator, while use
    the old value 'iommu' as a dedicated variable to point to the found one,
    and remove the unneeded check for 'iommu->dev->of_node != spec->np'
    outside the loop.

    Cc: stable@vger.kernel.org
    Fixes: f78ebca8ff3d6 ("iommu/msm: Add support for generic master bindings")
    Signed-off-by: Xiaomeng Tong
    Link: https://lore.kernel.org/r/20220501132823.12714-1-xiam0nd.tong@gmail.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Xiaomeng Tong
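    The iterator pitfall generalizes beyond list_for_each_entry(). The
    following standalone model (hypothetical types; a plain circular
    singly linked list stands in for the kernel's list_head machinery)
    shows why the found-element pointer must be a separate variable:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular list, standing in for the kernel's list_head.
 * Names and types are illustrative only. */
struct iommu { int np; struct iommu *next; };

/* Buggy pattern: the loop iterator doubles as the result, so when
 * nothing matches it ends up pointing at the head sentinel, which is
 * not a valid element (mirrors list_for_each_entry()). */
static struct iommu *find_buggy(struct iommu *head, int np)
{
	struct iommu *iommu;

	for (iommu = head->next; iommu != head; iommu = iommu->next)
		if (iommu->np == np)
			return iommu;
	return iommu;		/* == head, never NULL */
}

/* Fixed pattern: a dedicated 'iter' walks the list while a separate
 * variable records the match. */
static struct iommu *find_fixed(struct iommu *head, int np)
{
	struct iommu *found = NULL, *iter;

	for (iter = head->next; iter != head; iter = iter->next)
		if (iter->np == np) {
			found = iter;
			break;
		}
	return found;
}
```

    The buggy version's "not found" result is a pointer into the list
    head, so a NULL check after the loop can never fire.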
     
  • [ Upstream commit 42bb5aa043382f09bef2cc33b8431be867c70f8e ]

    On some systems it can take a long time for the hardware to enable the
    GA log of the AMD IOMMU. The current wait time is only 0.1ms, but
    testing showed that it can take up to 14ms for the GA log to enter
    running state after it has been enabled.

    Sometimes the long delay happens when booting the system, sometimes
    only on resume. Adjust the timeout accordingly to not print a
    warning when the hardware takes longer than usual.

    There has already been an attempt to fix this with commit

    9b45a7738eec ("iommu/amd: Fix loop timeout issue in iommu_ga_log_enable()")

    But that commit was based on some wrong math and did not fix the issue
    in all cases.

    Cc: "D. Ziegfeld"
    Cc: Jörg-Volker Peetz
    Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log")
    Signed-off-by: Joerg Roedel
    Link: https://lore.kernel.org/r/20220520102214.12563-1-joro@8bytes.org
    Signed-off-by: Sasha Levin

    Joerg Roedel
     
  • [ Upstream commit de78657e16f41417da9332f09c2d67d100096939 ]

    When larbdev is NULL (in the case I hit, the node is incorrectly
    set to iommus = ), device_link_add() fails and the kernel crashes
    when we try to print dev_name(larbdev).

    Let's fail the probe if larbdev is NULL to avoid invalid inputs
    from the dts.

    It should work for a normal, correct setting and avoid the crash
    caused by my incorrect setting.

    Error log:
    [ 18.189042][ T301] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
    ...
    [ 18.344519][ T301] pstate: a0400005 (NzCv daif +PAN -UAO)
    [ 18.345213][ T301] pc : mtk_iommu_probe_device+0xf8/0x118 [mtk_iommu]
    [ 18.346050][ T301] lr : mtk_iommu_probe_device+0xd0/0x118 [mtk_iommu]
    [ 18.346884][ T301] sp : ffffffc00a5635e0
    [ 18.347392][ T301] x29: ffffffc00a5635e0 x28: ffffffd44a46c1d8
    [ 18.348156][ T301] x27: ffffff80c39a8000 x26: ffffffd44a80cc38
    [ 18.348917][ T301] x25: 0000000000000000 x24: ffffffd44a80cc38
    [ 18.349677][ T301] x23: ffffffd44e4da4c6 x22: ffffffd44a80cc38
    [ 18.350438][ T301] x21: ffffff80cecd1880 x20: 0000000000000000
    [ 18.351198][ T301] x19: ffffff80c439f010 x18: ffffffc00a50d0c0
    [ 18.351959][ T301] x17: ffffffffffffffff x16: 0000000000000004
    [ 18.352719][ T301] x15: 0000000000000004 x14: ffffffd44eb5d420
    [ 18.353480][ T301] x13: 0000000000000ad2 x12: 0000000000000003
    [ 18.354241][ T301] x11: 00000000fffffad2 x10: c0000000fffffad2
    [ 18.355003][ T301] x9 : a0d288d8d7142d00 x8 : a0d288d8d7142d00
    [ 18.355763][ T301] x7 : ffffffd44c2bc640 x6 : 0000000000000000
    [ 18.356524][ T301] x5 : 0000000000000080 x4 : 0000000000000001
    [ 18.357284][ T301] x3 : 0000000000000000 x2 : 0000000000000005
    [ 18.358045][ T301] x1 : 0000000000000000 x0 : 0000000000000000
    [ 18.360208][ T301] Hardware name: MT6873 (DT)
    [ 18.360771][ T301] Call trace:
    [ 18.361168][ T301] dump_backtrace+0xf8/0x1f0
    [ 18.361737][ T301] dump_stack_lvl+0xa8/0x11c
    [ 18.362305][ T301] dump_stack+0x1c/0x2c
    [ 18.362816][ T301] mrdump_common_die+0x184/0x40c [mrdump]
    [ 18.363575][ T301] ipanic_die+0x24/0x38 [mrdump]
    [ 18.364230][ T301] atomic_notifier_call_chain+0x128/0x2b8
    [ 18.364937][ T301] die+0x16c/0x568
    [ 18.365394][ T301] __do_kernel_fault+0x1e8/0x214
    [ 18.365402][ T301] do_page_fault+0xb8/0x678
    [ 18.366934][ T301] do_translation_fault+0x48/0x64
    [ 18.368645][ T301] do_mem_abort+0x68/0x148
    [ 18.368652][ T301] el1_abort+0x40/0x64
    [ 18.368660][ T301] el1h_64_sync_handler+0x54/0x88
    [ 18.368668][ T301] el1h_64_sync+0x68/0x6c
    [ 18.368673][ T301] mtk_iommu_probe_device+0xf8/0x118 [mtk_iommu]
    ...

    Cc: Robin Murphy
    Cc: Yong Wu
    Reported-by: kernel test robot
    Fixes: 635319a4a744 ("media: iommu/mediatek: Add device_link between the consumer and the larb devices")
    Signed-off-by: Miles Chen
    Reviewed-by: Yong Wu
    Reviewed-by: AngeloGioacchino Del Regno
    Link: https://lore.kernel.org/r/20220505132731.21628-1-miles.chen@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Miles Chen
     
  • [ Upstream commit cbd23144f7662b00bcde32a938c4a4057e476d68 ]

    We currently call arm64_mm_context_put() without holding a reference to
    the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
    ensure the mm only gets freed after we unpinned the ASID.

    Fixes: 32784a9562fb ("iommu/arm-smmu-v3: Implement iommu_sva_bind/unbind()")
    Signed-off-by: Jean-Philippe Brucker
    Tested-by: Zhangfei Gao
    Link: https://lore.kernel.org/r/20220426130444.300556-1-jean-philippe@linaro.org
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Jean-Philippe Brucker
     
  • [ Upstream commit 0e5a3f2e630b28e88e018655548212ef8eb4dfcb ]

    Add a mutex to protect the data in the structure mtk_iommu_data,
    such as "m4u_group" and "m4u_dom". We should protect this internal
    data within our own driver, so add a mutex for it.
    This could be a fix for the multi-group support.

    Fixes: c3045f39244e ("iommu/mediatek: Support for multi domains")
    Signed-off-by: Yunfei Wang
    Signed-off-by: Yong Wu
    Reviewed-by: AngeloGioacchino Del Regno
    Reviewed-by: Matthias Brugger
    Link: https://lore.kernel.org/r/20220503071427.2285-8-yong.wu@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yong Wu
     
  • [ Upstream commit 98df772bdd1c4ce717a26289efea15cbbe4b64ed ]

    After commit b34ea31fe013 ("iommu/mediatek: Always enable the clk
    on resume"), the iommu clock is controlled by the runtime callback,
    thus remove the clk control from mtk_iommu_remove.

    Otherwise, it will warn like:

    echo 14018000.iommu > /sys/bus/platform/drivers/mtk-iommu/unbind

    [ 51.413044] ------------[ cut here ]------------
    [ 51.413648] vpp0_smi_iommu already disabled
    [ 51.414233] WARNING: CPU: 2 PID: 157 at */v5.15-rc1/kernel/mediatek/
    drivers/clk/clk.c:952 clk_core_disable+0xb0/0xb8
    [ 51.417174] Hardware name: MT8195V/C(ENG) (DT)
    [ 51.418635] pc : clk_core_disable+0xb0/0xb8
    [ 51.419177] lr : clk_core_disable+0xb0/0xb8
    ...
    [ 51.429375] Call trace:
    [ 51.429694] clk_core_disable+0xb0/0xb8
    [ 51.430193] clk_core_disable_lock+0x24/0x40
    [ 51.430745] clk_disable+0x20/0x30
    [ 51.431189] mtk_iommu_remove+0x58/0x118
    [ 51.431705] platform_remove+0x28/0x60
    [ 51.432197] device_release_driver_internal+0x110/0x1f0
    [ 51.432873] device_driver_detach+0x18/0x28
    [ 51.433418] unbind_store+0xd4/0x108
    [ 51.433886] drv_attr_store+0x24/0x38
    [ 51.434363] sysfs_kf_write+0x40/0x58
    [ 51.434843] kernfs_fop_write_iter+0x164/0x1e0

    Fixes: b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume")
    Reported-by: Hsin-Yi Wang
    Signed-off-by: Yong Wu
    Reviewed-by: AngeloGioacchino Del Regno
    Reviewed-by: Matthias Brugger
    Link: https://lore.kernel.org/r/20220503071427.2285-7-yong.wu@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yong Wu
     
  • [ Upstream commit ee55f75e4bcade81d253163641b63bef3e76cac4 ]

    The list_del is missing in mtk_iommu_remove; add it. Also remove
    bus_set_iommu(*, NULL), since there may be several iommu HWs and we
    cannot set the bus iommu ops to NULL when only one iommu driver
    unbinds.

    This could be a fix for mt2712, which supports 2 M4U HWs and lists
    them.

    Fixes: 7c3a2ec02806 ("iommu/mediatek: Merge 2 M4U HWs into one iommu domain")
    Signed-off-by: Yong Wu
    Reviewed-by: AngeloGioacchino Del Regno
    Reviewed-by: Matthias Brugger
    Link: https://lore.kernel.org/r/20220503071427.2285-6-yong.wu@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yong Wu
     
  • [ Upstream commit 645b87c190c959e9bb4f216b8c4add4ee880451a ]

    In commit 4f956c97d26b ("iommu/mediatek: Move domain_finalise into
    attach_device"), I overlooked the shared pgtable case.
    After that commit, the "data" in mtk_iommu_domain_finalise is
    always the data of the current IOMMU HW. Fix this for the shared
    pgtable case.

    This only affects mt2712, which is currently the only SoC that
    shares a pgtable.

    Fixes: 4f956c97d26b ("iommu/mediatek: Move domain_finalise into attach_device")
    Signed-off-by: Yong Wu
    Reviewed-by: AngeloGioacchino Del Regno
    Reviewed-by: Matthias Brugger
    Link: https://lore.kernel.org/r/20220503071427.2285-5-yong.wu@mediatek.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Yong Wu
     
  • [ Upstream commit 121660bba631104154b7c15e88f208c48c8c3297 ]

    Previously the AMD IOMMU would only enable SWIOTLB in certain
    circumstances:
    * IOMMU in passthrough mode
    * SME enabled

    This logic however doesn't work when an untrusted device is plugged in
    that doesn't do page aligned DMA transactions. The expectation is
    that a bounce buffer is used for those transactions.

    This fails like this:

    swiotlb buffer is full (sz: 4096 bytes), total 0 (slots), used 0 (slots)

    That happens because the bounce buffers were allocated and then
    freed during startup, but the bounce-buffering code expects that
    all IOMMUs have left them enabled.

    Remove the criteria to set up bounce buffers on AMD systems to ensure
    they're always available for supporting untrusted devices.

    Fixes: 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers")
    Suggested-by: Christoph Hellwig
    Signed-off-by: Mario Limonciello
    Reviewed-by: Robin Murphy
    Reviewed-by: Christoph Hellwig
    Link: https://lore.kernel.org/r/20220404204723.9767-2-mario.limonciello@amd.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Mario Limonciello
     
  • [ Upstream commit 0a967f5bfd9134b89681cae58deb222e20840e76 ]

    The VT-d spec requires (10.4.4 Global Command Register, TE
    field) that:

    Hardware implementations supporting DMA draining must drain
    any in-flight DMA read/write requests queued within the
    Root-Complex before completing the translation enable
    command and reflecting the status of the command through
    the TES field in the Global Status register.

    Unfortunately, some integrated graphics devices fail to do so after
    some kind of power state transition. As a result, the system might
    get stuck in iommu_disable_translation(), waiting for the
    completion of the TE transition.

    This adds RPLS to a quirk list for those devices and skips TE
    disabling if the quirk hits.

    Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4898
    Tested-by: Raviteja Goud Talla
    Cc: Rodrigo Vivi
    Acked-by: Lu Baolu
    Signed-off-by: Tejas Upadhyay
    Reviewed-by: Rodrigo Vivi
    Signed-off-by: Rodrigo Vivi
    Link: https://patchwork.freedesktop.org/patch/msgid/20220302043256.191529-1-tejaskumarx.surendrakumar.upadhyay@intel.com
    Signed-off-by: Sasha Levin

    Tejas Upadhyay
     

18 May, 2022

1 commit

  • [ Upstream commit 4a25f2ea0e030b2fc852c4059a50181bfc5b2f57 ]

    Tegra194 and Tegra234 SoCs have an erratum that causes walk cache
    entries to not be invalidated correctly. The problem is that the
    walk cache index generated for an IOVA is not the same across
    translation and invalidation requests. This leads to page faults
    when a PMD entry is released during unmap and populated with a new
    PTE table during a subsequent map request. Disabling large page
    mappings avoids the release of the PMD entry and avoids
    translations seeing a stale PMD entry in the walk cache.
    Fix this by limiting the page mappings to PAGE_SIZE for Tegra194
    and Tegra234 devices. This is the recommended fix from the Tegra
    hardware design team.

    Acked-by: Robin Murphy
    Reviewed-by: Krishna Reddy
    Co-developed-by: Pritesh Raithatha
    Signed-off-by: Pritesh Raithatha
    Signed-off-by: Ashish Mhetre
    Link: https://lore.kernel.org/r/20220421081504.24678-1-amhetre@nvidia.com
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Ashish Mhetre
     

12 May, 2022

5 commits

  • [ Upstream commit 2ac2fab52917ae82cbca97cf6e5d2993530257ed ]

    This is required to make loading this as a module work.

    Signed-off-by: Hector Martin
    Fixes: 46d1fb072e76 ("iommu/dart: Add DART iommu driver")
    Reviewed-by: Sven Peter
    Link: https://lore.kernel.org/r/20220502092238.30486-1-marcan@marcan.st
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Hector Martin
     
  • commit a15932f4377062364d22096afe25bc579134a1c3 upstream.

    It will cause a null-ptr-deref in resource_size() if
    platform_get_resource() returns NULL, so move the call to
    resource_size() after devm_ioremap_resource(), which checks 'res',
    to avoid the null-ptr-deref.
    Also use devm_platform_get_and_ioremap_resource() to simplify the
    code.

    Fixes: 46d1fb072e76 ("iommu/dart: Add DART iommu driver")
    Signed-off-by: Yang Yingliang
    Reviewed-by: Sven Peter
    Link: https://lore.kernel.org/r/20220425090826.2532165-1-yangyingliang@huawei.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Yang Yingliang
     
  • commit da8669ff41fa31573375c9a4180f5c080677204b upstream.

    The page fault handling framework in the IOMMU core explicitly states
    that it doesn't handle PCI PASID Stop Marker and the IOMMU drivers must
    discard them before reporting faults. This handles Stop Marker messages
    in prq_event_thread() before reporting events to the core.

    The VT-d driver explicitly drains the pending page requests when a CPU
    page table (represented by a mm struct) is unbound from a PASID according
    to the procedures defined in the VT-d spec. The Stop Marker messages do
    not need a response. Hence, it is safe to drop the Stop Marker messages
    silently if any of them is found in the page request queue.

    Fixes: d5b9e4bfe0d88 ("iommu/vt-d: Report prq to io-pgfault framework")
    Signed-off-by: Lu Baolu
    Reviewed-by: Jacob Pan
    Reviewed-by: Kevin Tian
    Link: https://lore.kernel.org/r/20220421113558.3504874-1-baolu.lu@linux.intel.com
    Link: https://lore.kernel.org/r/20220423082330.3897867-2-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Lu Baolu
     
  • commit 95d4782c34a60800ccf91d9f0703137d4367a2fc upstream.

    The arm_smmu_mm_invalidate_range function is designed to be called
    by the mm core for Shared Virtual Addressing purposes between the
    IOMMU and the CPU MMU. However, the two subsystems define their
    "end" addresses slightly differently: the IOMMU defines its "end"
    address as the last address of an address range, while the mm core
    defines it as the address following the range:

    include/linux/mm_types.h:
    unsigned long vm_end;
    /* The first byte after our end address ...

    This mismatch resulted in an incorrect size calculation, so the
    size failed to be page-size aligned. Further, it caused an endless
    loop at the "while (iova < end)" check in the
    __arm_smmu_tlb_inv_range function.

    This patch fixes the issue by doing the calculation correctly.

    Fixes: 2f7e8c553e98 ("iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops")
    Cc: stable@vger.kernel.org
    Signed-off-by: Nicolin Chen
    Reviewed-by: Jason Gunthorpe
    Reviewed-by: Robin Murphy
    Reviewed-by: Jean-Philippe Brucker
    Link: https://lore.kernel.org/r/20220419210158.21320-1-nicolinc@nvidia.com
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Nicolin Chen
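    The off-by-one between the two "end" conventions is easy to see
    with concrete numbers. A minimal sketch (variable names are
    illustrative; the buggy arm treats the mm core's exclusive end as
    an inclusive last address, per the description above):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 0x1000UL

/* Buggy: assumes 'end' is the last byte of the range (IOMMU style),
 * so the extra +1 on an mm-style exclusive end breaks alignment. */
static size_t size_buggy(unsigned long start, unsigned long end)
{
	return end - start + 1;
}

/* Fixed: mm core's 'end' is already the first byte after the range. */
static size_t size_fixed(unsigned long start, unsigned long end)
{
	return end - start;
}
```

    A size that is not page aligned then trips the "while (iova < end)"
    loop in the invalidation path, since iova advances in page steps
    and can step over a misaligned end.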
     
  • commit 59bf3557cf2f8a469a554aea1e3d2c8e72a579f7 upstream.

    Calculate the appropriate mask for non-size-aligned page selective
    invalidation. Since psi uses the mask value to mask out the lower order
    bits of the target address, properly flushing the iotlb requires using a
    mask value such that [pfn, pfn+pages) all lie within the flushed
    size-aligned region. This is not normally an issue because iova.c
    always allocates iovas that are aligned to their size. However, iovas
    which come from other sources (e.g. userspace via VFIO) may not be
    aligned.

    To properly flush the IOTLB, both the start and end pfns need to be
    equal after applying the mask. That means that the most efficient mask
    to use is the index of the lowest bit that is equal where all higher
    bits are also equal. For example, if pfn=0x17f and pages=3, then
    end_pfn=0x181, so the smallest mask we can use is 8. Any differences
    above the highest bit of pages are due to carrying, so by xnor'ing pfn
    and end_pfn and then masking out the lower order bits based on pages, we
    get 0xffffff00, where the first set bit is the mask we want to use.

    Fixes: 6fe1010d6d9c ("vfio/type1: DMA unmap chunking")
    Cc: stable@vger.kernel.org
    Signed-off-by: David Stevens
    Reviewed-by: Kevin Tian
    Link: https://lore.kernel.org/r/20220401022430.1262215-1-stevensd@google.com
    Signed-off-by: Lu Baolu
    Link: https://lore.kernel.org/r/20220410013533.3959168-2-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    David Stevens
     

14 Apr, 2022

2 commits

  • [ Upstream commit 71ff461c3f41f6465434b9e980c01782763e7ad8 ]

    Commit 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") started
    triggering a NULL pointer dereference for some omap variants:

    __iommu_probe_device from probe_iommu_group+0x2c/0x38
    probe_iommu_group from bus_for_each_dev+0x74/0xbc
    bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
    bus_iommu_probe from bus_set_iommu+0x80/0xc8
    bus_set_iommu from omap_iommu_init+0x88/0xcc
    omap_iommu_init from do_one_initcall+0x44/0x24

    This is caused by omap iommu probe returning 0 instead of ERR_PTR(-ENODEV)
    as noted by Jason Gunthorpe.

    Looks like the regression already happened with an earlier commit
    6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
    that changed the function return type and missed converting one place.

    Cc: Drew Fustini
    Cc: Lu Baolu
    Cc: Suman Anna
    Suggested-by: Jason Gunthorpe
    Fixes: 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
    Fixes: 3f6634d997db ("iommu: Use right way to retrieve iommu_ops")
    Signed-off-by: Tony Lindgren
    Tested-by: Drew Fustini
    Reviewed-by: Jason Gunthorpe
    Link: https://lore.kernel.org/r/20220331062301.24269-1-tony@atomide.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Tony Lindgren
     
  • [ Upstream commit 30de2b541af98179780054836b48825fcfba4408 ]

    During event processing, events are read from the event queue one
    by one until the queue is empty. If the master device continuously
    requests address access at the same time, the SMMU keeps generating
    events, so the event-processing loop can run for a long time and
    soft-lockup warnings may be reported.

    arm-smmu-v3 arm-smmu-v3.34.auto: event 0x0a received:
    arm-smmu-v3 arm-smmu-v3.34.auto: 0x00007f220000280a
    arm-smmu-v3 arm-smmu-v3.34.auto: 0x000010000000007e
    arm-smmu-v3 arm-smmu-v3.34.auto: 0x00000000034e8670
    watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [irq/268-arm-smm:247]
    Call trace:
    _dev_info+0x7c/0xa0
    arm_smmu_evtq_thread+0x1c0/0x230
    irq_thread_fn+0x30/0x80
    irq_thread+0x128/0x210
    kthread+0x134/0x138
    ret_from_fork+0x10/0x1c
    Kernel panic - not syncing: softlockup: hung tasks

    Fix this by calling cond_resched() after the event information is
    printed.

    Signed-off-by: Zhou Guanghui
    Link: https://lore.kernel.org/r/20220119070754.26528-1-zhouguanghui1@huawei.com
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Zhou Guanghui
     

08 Apr, 2022

4 commits

  • commit 2cbc61a1b1665c84282dbf2b1747ffa0b6248639 upstream.

    Pass the non-aligned size to __iommu_dma_map when using swiotlb bounce
    buffers in iommu_dma_map_page, to account for min_align_mask.

    To deal with granule alignment, __iommu_dma_map maps iova_align(size +
    iova_off) bytes starting at phys - iova_off. If iommu_dma_map_page
    passes aligned size when using swiotlb, then this becomes
    iova_align(iova_align(orig_size) + iova_off). Normally iova_off will be
    zero when using swiotlb. However, this is not the case for devices that
    set min_align_mask. When iova_off is non-zero, __iommu_dma_map ends up
    mapping an extra page at the end of the buffer. Beyond just being a
    security issue, the extra page is not cleaned up by __iommu_dma_unmap.
    This causes problems when the IOVA is reused, due to collisions in the
    iommu driver. Just passing the original size is sufficient, since
    __iommu_dma_map will take care of granule alignment.

    Fixes: 1f221a0d0dbf ("swiotlb: respect min_align_mask")
    Signed-off-by: David Stevens
    Link: https://lore.kernel.org/r/20210929023300.335969-8-stevensd@google.com
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    David Stevens
     
  • commit e81e99bacc9f9347bda7808a949c1ce9fcc2bbf4 upstream.

    Add an argument to swiotlb_tbl_map_single that specifies the desired
    alignment of the allocated buffer. This is used by dma-iommu to ensure
    the buffer is aligned to the iova granule size when using swiotlb with
    untrusted sub-granule mappings. This addresses an issue where adjacent
    slots could be exposed to the untrusted device if IO_TLB_SIZE < iova
    granule < PAGE_SIZE.

    Signed-off-by: David Stevens
    Reviewed-by: Christoph Hellwig
    Link: https://lore.kernel.org/r/20210929023300.335969-7-stevensd@google.com
    Signed-off-by: Joerg Roedel
    Cc: Mario Limonciello
    Signed-off-by: Greg Kroah-Hartman

    David Stevens
     
  • commit 2e727bffbe93750a13d2414f3ce43de2f21600d2 upstream.

    Introduce a new dev_use_swiotlb function to guard swiotlb code, instead
    of overloading dev_is_untrusted. This allows CONFIG_SWIOTLB to be
    checked more broadly, so the swiotlb related code can be removed more
    aggressively.

    Signed-off-by: David Stevens
    Reviewed-by: Robin Murphy
    Reviewed-by: Christoph Hellwig
    Link: https://lore.kernel.org/r/20210929023300.335969-6-stevensd@google.com
    Signed-off-by: Joerg Roedel
    Cc: Mario Limonciello
    Signed-off-by: Greg Kroah-Hartman

    David Stevens
     
  • commit 9b49bbc2c4dfd0431bf7ff4e862171189cf94b7e upstream.

    Fold the _swiotlb helper functions into the respective _page functions,
    since recent fixes have moved all logic from the _page functions to the
    _swiotlb functions.

    Signed-off-by: David Stevens
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Robin Murphy
    Link: https://lore.kernel.org/r/20210929023300.335969-5-stevensd@google.com
    Signed-off-by: Joerg Roedel
    Cc: Mario Limonciello
    Signed-off-by: Greg Kroah-Hartman

    David Stevens