17 Feb, 2016

2 commits

  • As the number of io-pgtable implementations grows beyond 1, it's time
    to rationalise the quirks mechanism before things have a chance to
    start getting really ugly and out-of-hand.

    To that end:
    - Indicate exactly which quirks each format can/does support.
    - Fail creating a table if a caller wants unsupported quirks.
    - Properly document where each quirk applies and why.

    Reviewed-by: Laurent Pinchart
    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
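
    A minimal user-space sketch of the "fail if a caller wants unsupported
    quirks" rule described above. The quirk names, struct layouts and
    fmt_alloc_pgtable() are hypothetical and are not the kernel's API; only
    the mask check itself is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quirk bits; the names are illustrative only. */
#define QUIRK_A (1UL << 0)
#define QUIRK_B (1UL << 1)
#define QUIRK_C (1UL << 2)

struct pgtable_cfg {
	unsigned long quirks;	/* quirks requested by the caller */
};

struct pgtable {
	unsigned long quirks;	/* quirks in effect for this table */
};

/* Assume this toy format only knows how to honour QUIRK_A and QUIRK_B. */
#define FMT_SUPPORTED_QUIRKS (QUIRK_A | QUIRK_B)

/*
 * Returns 0 and fills in *tbl on success, or -1 if the caller asked for
 * any quirk outside the set this format declares as supported.
 */
static int fmt_alloc_pgtable(const struct pgtable_cfg *cfg, struct pgtable *tbl)
{
	assert(cfg && tbl);

	if (cfg->quirks & ~FMT_SUPPORTED_QUIRKS)
		return -1;

	tbl->quirks = cfg->quirks;
	return 0;
}
```

    Rejecting the request up front means a caller cannot silently end up
    with a table that ignores a quirk it relies on.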
     
  • Add some simple wrappers to avoid having the guts of the TLB operations
    spilled all over the page table implementations, and to provide a point
    to implement extra common functionality.

    Acked-by: Will Deacon
    Acked-by: Laurent Pinchart
    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
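
    A rough sketch of what such wrappers could look like, in user-space C.
    The struct and function names here are assumptions made for
    illustration, and the "fake" driver exists only so the wrappers have
    something to call.

```c
#include <assert.h>
#include <stddef.h>

/* Callbacks supplied by an IOMMU driver (toy version). */
struct tlb_ops {
	void (*flush_all)(void *cookie);
	void (*add_flush)(unsigned long iova, size_t size, void *cookie);
	void (*sync)(void *cookie);
};

struct pgtable {
	const struct tlb_ops *tlb;
	void *cookie;	/* driver-private data handed back to callbacks */
};

/* Wrappers: page table code calls these instead of poking at tlb_ops. */
static void pgtable_tlb_flush_all(struct pgtable *pt)
{
	pt->tlb->flush_all(pt->cookie);
}

static void pgtable_tlb_add_flush(struct pgtable *pt, unsigned long iova,
				  size_t size)
{
	pt->tlb->add_flush(iova, size, pt->cookie);
}

static void pgtable_tlb_sync(struct pgtable *pt)
{
	pt->tlb->sync(pt->cookie);
}

/* A fake driver that only counts how often each callback fires. */
struct fake_stats {
	int flush_all, add_flush, sync;
};

static void fake_flush_all(void *cookie)
{
	((struct fake_stats *)cookie)->flush_all++;
}

static void fake_add_flush(unsigned long iova, size_t size, void *cookie)
{
	(void)iova;
	(void)size;
	((struct fake_stats *)cookie)->add_flush++;
}

static void fake_sync(void *cookie)
{
	((struct fake_stats *)cookie)->sync++;
}

static const struct tlb_ops fake_tlb_ops = {
	.flush_all = fake_flush_all,
	.add_flush = fake_add_flush,
	.sync      = fake_sync,
};
```

    With a single choke point like this, any extra common behaviour can be
    added inside the wrappers without touching each page table format.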
     

29 Jan, 2016

1 commit

  • Trying to build a kernel for ARC with both options CONFIG_COMPILE_TEST
    and CONFIG_IOMMU_IO_PGTABLE_LPAE enabled (e.g. as a result of "make
    allyesconfig") results in the following build failure:

    | CC drivers/iommu/io-pgtable-arm.o
    | linux/drivers/iommu/io-pgtable-arm.c: In
    | function ‘__arm_lpae_alloc_pages’:
    | linux/drivers/iommu/io-pgtable-arm.c:221:3:
    | error: implicit declaration of function ‘dma_map_single’
    | [-Werror=implicit-function-declaration]
    | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
    | ^
    | linux/drivers/iommu/io-pgtable-arm.c:221:42:
    | error: ‘DMA_TO_DEVICE’ undeclared (first use in this function)
    | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
    | ^

    Since IOMMU_IO_PGTABLE_LPAE depends on DMA API, io-pgtable-arm.c should
    include linux/dma-mapping.h. This fixes the reported failure.

    Cc: Alexey Brodkin
    Cc: Vineet Gupta
    Cc: Joerg Roedel
    Signed-off-by: Lada Trimasova
    Signed-off-by: Will Deacon
    Signed-off-by: Joerg Roedel

    Lada Trimasova
     

17 Dec, 2015

4 commits

  • When tearing down page tables, we return early for the final level
    since we know that we won't have any table pointers to follow.
    Unfortunately, this also means that we forget to free the final level,
    so we end up leaking memory.

    Fix the issue by always freeing the current level, but just don't bother
    to iterate over the ptes if we're at the final level.

    Cc:
    Reported-by: Zhang Bo
    Signed-off-by: Will Deacon

    Will Deacon
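
    The shape of the fix can be sketched with a toy table tree. Everything
    below (the fan-out, the level count, struct table, free_pgtable()) is a
    simplified assumption and not the kernel code; it only shows "always
    free this level, recurse only when not at the final level".

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES  4	/* toy fan-out per table */
#define LAST_LVL 2	/* levels run 0..2 in this toy */

struct table {
	struct table *child[ENTRIES];	/* unused at the final level */
};

static int tables_freed;

static struct table *alloc_table(void)
{
	return calloc(1, sizeof(struct table));
}

static void free_pgtable(struct table *t, int lvl)
{
	/* Only non-final levels hold table pointers worth following. */
	if (lvl < LAST_LVL) {
		for (int i = 0; i < ENTRIES; i++)
			if (t->child[i])
				free_pgtable(t->child[i], lvl + 1);
	}

	/* The current level is freed unconditionally, final level included. */
	free(t);
	tables_freed++;
}
```

    An early return placed before the free() for lvl == LAST_LVL would
    reproduce the leak described above.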
     
  • There is no need to keep a useful accessor for a public structure hidden
    away in a private implementation. Move it out alongside the structure
    definition so that other implementations may reuse it.

    Acked-by: Laurent Pinchart
    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
     
  • IOMMU hardware with range-based TLB maintenance commands can work
    happily with the iova and size arguments passed via the tlb_add_flush
    callback, but for IOMMUs which require separate commands per entry in
    the range, it is not straightforward to infer the necessary granularity
    when it comes to issuing the actual commands.

    Add an additional argument indicating the granularity for the benefit
    of drivers needing to know, and update the ARM LPAE code appropriately
    (for non-leaf invalidations we currently just assume the worst-case
    page granularity rather than walking the table to check).

    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
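
    To illustrate why the granule matters for per-entry hardware, here is a
    small sketch of a driver-side callback. The function names and the
    command counter are hypothetical; a real driver would write commands to
    its hardware queue instead of incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int cmds_issued;

/* Stand-in for writing one "invalidate by address" command. */
static void issue_inv_cmd(unsigned long iova)
{
	(void)iova;
	cmds_issued++;
}

/*
 * Toy tlb_add_flush for an IOMMU that needs one command per entry:
 * step through the range in units of the granule the caller reports.
 */
static void tlb_add_flush(unsigned long iova, size_t size, size_t granule)
{
	assert(granule != 0 && size % granule == 0);

	for (size_t off = 0; off < size; off += granule)
		issue_inv_cmd(iova + off);
}
```

    Invalidating a 2MB range costs one command when it was mapped as a
    single 2MB block, but 512 commands when the granule is 4KB, which is
    why guessing the granularity from iova and size alone is not enough.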
     
  • In the case of corrupted page tables, or when an invalid size is given,
    __arm_lpae_unmap() may recurse beyond the maximum number of levels.
    Unfortunately the detection of this error condition only happens *after*
    calculating a nonsense offset from something which might not be a valid
    table pointer and dereferencing that to see if it is a valid PTE.

    Make things a little more robust by checking the level is valid before
    doing anything which depends on it being so.

    Reviewed-by: Laurent Pinchart
    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
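
    A sketch of the "validate the level first" ordering, using assumed
    constants (4 levels, 9 bits per level, 4KB pages) and hypothetical
    helper names. It is not the kernel's __arm_lpae_unmap(), only the index
    calculation that depends on the level being sane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS     4
#define BITS_PER_LEVEL 9
#define PAGE_SHIFT     12

/* Shift that selects the index bits for a given (valid) level. */
static int level_shift(int lvl)
{
	return PAGE_SHIFT + (MAX_LEVELS - 1 - lvl) * BITS_PER_LEVEL;
}

/*
 * Compute the table index for iova at lvl. The level is checked before
 * anything is derived from it; an out-of-range level returns -1 rather
 * than producing a nonsense shift and offset.
 */
static int lookup_index(uint64_t iova, int lvl, size_t *idx)
{
	if (lvl < 0 || lvl >= MAX_LEVELS)
		return -1;

	assert(idx);
	*idx = (size_t)((iova >> level_shift(lvl)) &
			((UINT64_C(1) << BITS_PER_LEVEL) - 1));
	return 0;
}
```

    With these constants, an unchecked lvl of 4 would yield a shift of 3,
    so the resulting index would have nothing to do with the table layout.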
     

23 Sep, 2015

1 commit

  • In checking whether DMA addresses differ from physical addresses, using
    dma_to_phys() is actually the wrong thing to do, since it may hide any
    DMA offset, which is precisely one of the things we are checking for.
    Simply casting between the two address types, whilst ugly, is in fact
    the appropriate course of action. Further care (and ugliness) is also
    necessary in the comparison to avoid truncation if phys_addr_t and
    dma_addr_t differ in size.

    We can also reject any device with a fixed DMA offset up-front at page
    table creation, leaving the allocation-time check for the more subtle
    cases like bounce buffering due to an incorrect DMA mask.

    Furthermore, we can then fix the hackish Kconfig dependency so that
    architectures without a dma_to_phys() implementation may still
    COMPILE_TEST (or even use!) the code. The true dependency is on the
    DMA API, so use the appropriate symbol for that.

    Signed-off-by: Robin Murphy
    [will: folded in selftest fix from Yong Wu]
    Signed-off-by: Will Deacon

    Robin Murphy
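
    The truncation hazard in the comparison can be shown with a small
    sketch. The typedef widths below are assumptions chosen to make the two
    types differ in size, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;	/* assumption: 64-bit physical addresses */
typedef uint32_t dma_addr_t;	/* assumption: 32-bit DMA addresses */

/* Widen both sides to a common 64-bit type so neither is truncated. */
static bool dma_matches_phys(dma_addr_t dma, phys_addr_t phys)
{
	return (uint64_t)dma == (uint64_t)phys;
}

/* Buggy variant: the cast drops the high bits of phys before comparing. */
static bool dma_matches_phys_truncating(dma_addr_t dma, phys_addr_t phys)
{
	return dma == (dma_addr_t)phys;
}
```

    For a page at physical address 0x100001000 and a DMA address of 0x1000,
    the truncating variant reports a match even though the two addresses
    differ, which is exactly the case the check is meant to catch.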
     

18 Aug, 2015

1 commit

  • When installing a block mapping, we unconditionally overwrite a non-leaf
    PTE if we find one. However, this can cause a problem if the following
    sequence of events occurs:

    (1) iommu_map is called for a 4k (i.e. PAGE_SIZE) mapping at some address
        - We initialise the page table all the way down to a leaf entry
        - No TLB maintenance is required, because we're going from invalid
          to valid.

    (2) iommu_unmap is called on the mapping installed in (1)
        - We walk the page table to the final (leaf) entry and zero it
        - We only changed a valid leaf entry, so we invalidate leaf-only

    (3) iommu_map is called on the same address as (1), but this time for
        a 2MB (i.e. BLOCK_SIZE) mapping
        - We walk the page table down to the penultimate level, where we
          find a table entry
        - We overwrite the table entry with a block mapping and return
          without any TLB maintenance and without freeing the memory used
          by the now-orphaned table.

    This last step can lead to a walk-cache caching the overwritten table
    entry, causing unexpected faults when the new mapping is accessed by a
    device. One way to fix this would be to collapse the page table when
    freeing the last page at a given level, but this would require expensive
    iteration on every map call. Instead, this patch detects the case when
    we are overwriting a table entry and explicitly unmaps the table first,
    which takes care of both freeing and TLB invalidation.

    Cc:
    Reported-by: Brian Starkey
    Tested-by: Brian Starkey
    Signed-off-by: Will Deacon
    Signed-off-by: Joerg Roedel

    Will Deacon
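
    A toy model of the approach taken: when a block mapping is about to
    replace a table entry, the table is unmapped first so that it is both
    freed and invalidated. The PTE representation, counters and function
    names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

enum pte_type { PTE_INVALID, PTE_TABLE, PTE_BLOCK };

struct pte {
	enum pte_type type;
	void *table;	/* next-level table when type == PTE_TABLE */
};

static unsigned int tables_released, tlb_invalidations;

/* Tear down a table entry: free the table and invalidate cached walks. */
static void unmap_table(struct pte *p)
{
	free(p->table);
	tables_released++;
	tlb_invalidations++;	/* stand-in for real TLB maintenance */
	p->type = PTE_INVALID;
	p->table = NULL;
}

/* Install a block mapping; returns -1 if a block is already present. */
static int install_block(struct pte *p)
{
	assert(p);

	if (p->type == PTE_BLOCK)
		return -1;

	/* Overwriting a table entry: unmap it first, never just clobber it. */
	if (p->type == PTE_TABLE)
		unmap_table(p);

	p->type = PTE_BLOCK;
	return 0;
}
```

    The extra work is only paid on the rare path where a table entry is
    being replaced, rather than on every map call.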
     

06 Aug, 2015

3 commits

  • With the users fully converted to DMA API operations, the flush_pgtable
    callback is dead, Jim.

    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
     
  • With all current users now opted in to DMA API operations, make the
    iommu_dev pointer mandatory, rendering the flush_pgtable callback
    redundant for cache maintenance. However, since the DMA calls could be
    nops in the case of a coherent IOMMU, we still need to ensure the page
    table updates are fully synchronised against a subsequent page table
    walk. In the unmap path, the TLB sync will usually need to do this
    anyway, so just cement that requirement; in the map path which may
    consist solely of cacheable memory writes (in the coherent case),
    insert an appropriate barrier at the end of the operation, and obviate
    the need to call flush_pgtable on every individual update for
    synchronisation.

    Signed-off-by: Robin Murphy
    [will: slight clarification to tlb_sync comment]
    Signed-off-by: Will Deacon

    Robin Murphy
     
  • Currently, users of the LPAE page table code are (ab)using dma_map_page()
    as a means to flush page table updates for non-coherent IOMMUs. Since
    from the CPU's point of view, creating IOMMU page tables *is* passing
    DMA buffers to a device (the IOMMU's page table walker), there's little
    reason not to use the DMA API correctly.

    Allow IOMMU drivers to opt into DMA API operations for page table
    allocation and updates by providing their appropriate device pointer.
    The expectation is that an LPAE IOMMU should have a full view of system
    memory, so use streaming mappings to avoid unnecessary pressure on
    ZONE_DMA, and treat any DMA translation as a warning sign.

    Signed-off-by: Robin Murphy
    Signed-off-by: Will Deacon

    Robin Murphy
     

27 Mar, 2015

1 commit

  • Although we set TCR.T1SZ to 0, the input address range covered by TTBR1
    is actually calculated using T0SZ in this case on the ARM SMMU. This
    could theoretically lead to speculative table walks through physical
    address zero, leading to all sorts of fun and games if we have MMIO
    regions down there.

    This patch avoids the issue by setting EPD1 to disable walks through
    the unused TTBR1 register.

    Signed-off-by: Will Deacon

    Will Deacon
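
    A sketch of building a TCR value with walks through TTBR1 disabled. The
    field positions (T0SZ at bit 0, T1SZ at bit 16, EPD1 at bit 23) follow
    the usual ARM long-descriptor TCR layout as I understand it and should
    be checked against the architecture manual; build_tcr() is a
    hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define TCR_T0SZ_SHIFT 0
#define TCR_T1SZ_SHIFT 16
#define TCR_SZ_MASK    UINT64_C(0x3f)
#define TCR_EPD1       (UINT64_C(1) << 23)

/* Build a toy TCR for an input address size of ias bits. */
static uint64_t build_tcr(unsigned int ias)
{
	uint64_t tcr;

	assert(ias > 0 && ias <= 64);

	/* T0SZ encodes the TTBR0 region size as 64 - ias; T1SZ stays 0. */
	tcr = (UINT64_C(64) - ias) << TCR_T0SZ_SHIFT;

	/* TTBR1 is unused, so disable table walks through it. */
	tcr |= TCR_EPD1;

	return tcr;
}
```

    Leaving T1SZ at zero is harmless once EPD1 is set, since no walk is
    ever started from TTBR1.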
     

25 Feb, 2015

1 commit

  • Various build/boot bots have reported WARNs being triggered by the ARM
    iopgtable LPAE self-tests on i386 machines.

    This boils down to two instances of right-shifting a 32-bit unsigned
    long (i.e. an iova) by more than the size of the type. On 32-bit ARM,
    this happens to give us zero, hence my testing didn't catch this
    earlier.

    This patch fixes the issue by using DIV_ROUND_UP and explicit casts to
    avoid the erroneous shifts.

    Reported-by: Fengguang Wu
    Reported-by: Huang Ying
    Signed-off-by: Will Deacon
    Signed-off-by: Joerg Roedel

    Will Deacon
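
    The two techniques named above can be sketched as follows. In C,
    shifting a value by an amount greater than or equal to the width of its
    type is undefined, so the result differs between architectures. The
    helper names are hypothetical; DIV_ROUND_UP is written out here in its
    common form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Levels needed to resolve va_bits at bits_per_level bits per level. */
static unsigned int levels_needed(unsigned int va_bits,
				  unsigned int bits_per_level)
{
	return DIV_ROUND_UP(va_bits, bits_per_level);
}

/*
 * Does a 32-bit iova have any bits set at or above bit ias?
 * Writing iova >> ias directly would be undefined for ias >= 32, so the
 * value is widened to 64 bits before shifting.
 */
static bool iova_out_of_range(uint32_t iova, unsigned int ias)
{
	assert(ias < 64);
	return ((uint64_t)iova >> ias) != 0;
}
```

    Counting levels by division avoids deriving them from a shift at all,
    and the explicit widening cast keeps the range check well defined for
    any ias below 64.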
     

19 Jan, 2015

3 commits