23 Dec, 2016

1 commit

  • Pull more ACPI updates from Rafael Wysocki:
    "Here are new versions of two ACPICA changes that were deferred
    previously due to a problem they had introduced, two cleanups on top
    of them and the removal of a useless warning message from the ACPI
    core.

    Specifics:

    - Move some Linux-specific functionality to upstream ACPICA and
    update the in-kernel users of it accordingly (Lv Zheng)

    - Drop a useless warning (triggered by the lack of an optional
    object) from the ACPI namespace scanning code (Zhang Rui)"

    * tag 'acpi-extra-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
    ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
    ACPICA: Tables: Allow FADT to be customized with virtual address
    ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
    ACPI: do not warn if _BQC does not exist

    Linus Torvalds
     

22 Dec, 2016

1 commit

  • * acpica:
    ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory()
    ACPI / osl: Remove acpi_get_table_with_size()/early_acpi_os_unmap_memory() users
    ACPICA: Tables: Allow FADT to be customized with virtual address
    ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

    * acpi-scan:
    ACPI: do not warn if _BQC does not exist

    Rafael J. Wysocki
     

21 Dec, 2016

1 commit

  • This patch removes the users of the deprecated APIs:
    acpi_get_table_with_size()
    early_acpi_os_unmap_memory()
    The following APIs should be used instead:
    acpi_get_table()
    acpi_put_table()

    The deprecated APIs were invented as a replacement for acpi_get_table()
    during the early stage, so that the early mapped pointer would not be stored
    in the ACPICA core and the late stage acpi_get_table() would not return a
    wrong pointer. The mapping size is returned only because it is required by
    early_acpi_os_unmap_memory() to unmap the pointer during the early stage.

    But the mapping size equals acpi_table_header.length
    (see acpi_tb_init_table_descriptor() and acpi_tb_validate_table()), and when
    such a convenient result is returned, driver code starts to use it
    instead of accessing acpi_table_header to obtain the length.

    Thus this patch cleans up the drivers by replacing returned table size with
    acpi_table_header.length, and should be a no-op.

    Reported-by: Dan Williams
    Signed-off-by: Lv Zheng
    Signed-off-by: Rafael J. Wysocki

    Lv Zheng
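
The point about acpi_table_header.length can be illustrated outside the kernel. The following is a minimal sketch, not kernel code: it assumes the standard 36-byte ACPI table header layout and reads the Length field directly from the header, which is what the cleaned-up drivers do instead of relying on a separately returned mapping size. The helper names and the fake table contents are hypothetical.

```python
import struct

# Standard ACPI table header, little-endian, 36 bytes in total:
# Signature[4], Length(u32), Revision(u8), Checksum(u8), OEM ID[6],
# OEM Table ID[8], OEM Revision(u32), Creator ID[4], Creator Revision(u32)
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


def table_length(raw: bytes) -> int:
    """Return the Length field stored in the table header itself."""
    return ACPI_HEADER.unpack_from(raw)[1]


def make_fake_table(signature: bytes, payload: bytes) -> bytes:
    """Build a hypothetical table for illustration (checksum left at 0)."""
    length = ACPI_HEADER.size + len(payload)
    header = ACPI_HEADER.pack(signature, length, 1, 0,
                              b"OEMID ", b"OEMTABLE", 1, b"CRTR", 1)
    return header + payload


table = make_fake_table(b"DMAR", b"\x00" * 12)
print(table_length(table), len(table))
```

Because the header already carries the total table length, a caller that holds a table pointer does not need a second size value.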
     

16 Dec, 2016

1 commit

  • Pull IOMMU updates from Joerg Roedel:
    "These changes include:

    - support for the ACPI IORT table on ARM systems and patches to make
    the ARM-SMMU driver make use of it

    - conversion of the Exynos IOMMU driver to device dependency links
    and implementation of runtime pm support based on that conversion

    - update the Mediatek IOMMU driver to use the new struct
    device->iommu_fwspec member

    - implementation of dma_map/unmap_resource in the generic ARM
    dma-iommu layer

    - a number of smaller fixes and improvements all over the place"

    * tag 'iommu-updates-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (44 commits)
    ACPI/IORT: Make dma masks set-up IORT specific
    iommu/amd: Missing error code in amd_iommu_init_device()
    iommu/s390: Drop duplicate header pci.h
    ACPI/IORT: Introduce iort_iommu_configure
    ACPI/IORT: Add single mapping function
    ACPI/IORT: Replace rid map type with type mask
    iommu/arm-smmu: Add IORT configuration
    iommu/arm-smmu: Split probe functions into DT/generic portions
    iommu/arm-smmu-v3: Add IORT configuration
    iommu/arm-smmu-v3: Split probe functions into DT/generic portions
    ACPI/IORT: Add support for ARM SMMU platform devices creation
    ACPI/IORT: Add node match function
    ACPI: Implement acpi_dma_configure
    iommu/arm-smmu-v3: Convert struct device of_node to fwnode usage
    iommu/arm-smmu: Convert struct device of_node to fwnode usage
    iommu: Make of_iommu_set/get_ops() DT agnostic
    ACPI/IORT: Add support for IOMMU fwnode registration
    ACPI/IORT: Introduce linker section for IORT entries probing
    ACPI: Add FWNODE_ACPI_STATIC fwnode type
    iommu/arm-smmu: Set SMTNMB_TLBEN in ACR to enable caching of bypass entries
    ...

    Linus Torvalds
     

13 Dec, 2016

1 commit

  • Pull smp hotplug updates from Thomas Gleixner:
    "This is the final round of converting the notifier mess to the state
    machine. The removal of the notifiers and the related infrastructure
    will happen around rc1, as there are conversions outstanding in other
    trees.

    The whole exercise removed about 2000 lines of code in total and in
    course of the conversion several dozen bugs got fixed. The new
    mechanism allows to test almost every hotplug step standalone, so
    usage sites can exercise all transitions extensively.

    There is more room for improvement, like integrating all the
    pointlessly different architecture mechanisms of synchronizing,
    setting cpus online etc into the core code"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
    tracing/rb: Init the CPU mask on allocation
    soc/fsl/qbman: Convert to hotplug state machine
    soc/fsl/qbman: Convert to hotplug state machine
    zram: Convert to hotplug state machine
    KVM/PPC/Book3S HV: Convert to hotplug state machine
    arm64/cpuinfo: Convert to hotplug state machine
    arm64/cpuinfo: Make hotplug notifier symmetric
    mm/compaction: Convert to hotplug state machine
    iommu/vt-d: Convert to hotplug state machine
    mm/zswap: Convert pool to hotplug state machine
    mm/zswap: Convert dst-mem to hotplug state machine
    mm/zsmalloc: Convert to hotplug state machine
    mm/vmstat: Convert to hotplug state machine
    mm/vmstat: Avoid on each online CPU loops
    mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead()
    tracing/rb: Convert to hotplug state machine
    oprofile/nmi timer: Convert to hotplug state machine
    net/iucv: Use explicit clean up labels in iucv_init()
    x86/pci/amd-bus: Convert to hotplug state machine
    x86/oprofile/nmi: Convert to hotplug state machine
    ...

    Linus Torvalds
     

07 Dec, 2016

1 commit


02 Dec, 2016

1 commit

  • Install the callbacks via the state machine.

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Cc: Joerg Roedel
    Cc: iommu@lists.linux-foundation.org
    Cc: rt@linutronix.de
    Cc: David Woodhouse
    Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de
    Signed-off-by: Thomas Gleixner

    Anna-Maria Gleixner
     

30 Nov, 2016

3 commits


29 Nov, 2016

12 commits

  • In ACPI based systems, in order to be able to create platform
    devices and initialize them for ARM SMMU components, the IORT
    kernel implementation requires a set of static functions to be
    used by the IORT kernel layer to configure platform devices for
    ARM SMMU components.

    Add static configuration functions to the IORT kernel layer for
    the ARM SMMU components, so that the ARM SMMU driver can
    initialize its respective platform device by relying on the IORT
    kernel infrastructure and by adding a corresponding ACPI device
    early probe section entry.

    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Robin Murphy
    Cc: Joerg Roedel
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • The current ARM SMMU probe functions intermingle HW and DT probing
    in the initialization functions to detect and programme the ARM SMMU
    driver features. To allow probing the ARM SMMU with firmware
    interfaces other than DT, this patch splits the ARM SMMU init functions into
    DT and HW specific portions so that other FW interfaces (i.e. ACPI) can
    reuse the HW probing functions and skip the DT portion accordingly.

    This patch implements no functional change, only code reshuffling.

    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Will Deacon
    Reviewed-by: Robin Murphy
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Robin Murphy
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • In ACPI based systems, in order to be able to create platform
    devices and initialize them for ARM SMMU v3 components, the IORT
    kernel implementation requires a set of static functions to be
    used by the IORT kernel layer to configure platform devices for
    ARM SMMU v3 components.

    Add static configuration functions to the IORT kernel layer for
    the ARM SMMU v3 components, so that the ARM SMMU v3 driver can
    initialize its respective platform device by relying on the IORT
    kernel infrastructure and by adding a corresponding ACPI device
    early probe section entry.

    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Robin Murphy
    Cc: Joerg Roedel
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • The current ARM SMMUv3 probe functions intermingle HW and DT probing in the
    initialization functions to detect and programme the ARM SMMUv3 driver
    features. To allow probing the ARM SMMUv3 with firmware interfaces other
    than DT, this patch splits the ARM SMMUv3 init functions into DT and HW
    specific portions so that other FW interfaces (i.e. ACPI) can reuse the HW
    probing functions and skip the DT portion accordingly.

    This patch implements no functional change, only code reshuffling.

    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Robin Murphy
    Cc: Joerg Roedel
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • The current ARM SMMU v3 driver relies on the struct device.of_node pointer for
    device look-up and iommu_ops retrieval.

    In preparation for ACPI probing enablement, convert the driver to use
    the struct device.fwnode member for device and iommu_ops look-up so that
    the driver infrastructure can also be used on systems that do not
    associate an of_node pointer with a struct device (e.g. ACPI), making the
    device look-up and iommu_ops retrieval firmware agnostic.

    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Will Deacon
    Reviewed-by: Robin Murphy
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Robin Murphy
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • The current ARM SMMU driver relies on the struct device.of_node pointer for
    device look-up and iommu_ops retrieval.

    In preparation for ACPI probing enablement, convert the driver to use
    the struct device.fwnode member for device and iommu_ops look-up so that
    the driver infrastructure can also be used on systems that do not
    associate an of_node pointer with a struct device (e.g. ACPI), making the
    device look-up and iommu_ops retrieval firmware agnostic.

    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Will Deacon
    Reviewed-by: Robin Murphy
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Robin Murphy
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     
  • The of_iommu_{set/get}_ops() API is used to associate a device
    tree node with a specific set of IOMMU operations. The same
    kernel interface is required on systems booting with ACPI, where
    devices are not associated with a device tree node, therefore
    the interface requires generalization.

    The struct device fwnode member represents the fwnode token associated
    with the device and the struct it points at is firmware specific;
    regardless, it is initialized on both ACPI and DT systems and makes an
    ideal candidate to use it to associate a set of IOMMU operations to a
    given device, through its struct device.fwnode member pointer, paving
    the way for representing per-device iommu_ops (ie an iommu instance
    associated with a device).

    Convert the DT specific of_iommu_{set/get}_ops() interface to
    use struct device.fwnode as a look-up token, making the interface
    usable on ACPI systems and rename the data structures and the
    registration API so that they are made to represent their usage
    more clearly.

    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Will Deacon
    Reviewed-by: Robin Murphy
    Reviewed-by: Tomasz Nowicki
    Tested-by: Hanjun Guo
    Tested-by: Tomasz Nowicki
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Robin Murphy
    Cc: Joerg Roedel
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
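
The idea of using the fwnode as a firmware-agnostic look-up token can be sketched as a small registry. This is a toy model under stated assumptions, not the kernel implementation: FwNode, set_ops() and get_ops() are hypothetical names, and a plain dictionary stands in for whatever list or table the kernel keeps.

```python
class FwNode:
    """Hypothetical opaque token; what it points at is firmware specific."""

    def __init__(self, kind: str, name: str):
        self.kind = kind   # "dt" or "acpi", for illustration only
        self.name = name


_ops_by_fwnode = {}  # keyed by token identity, not by firmware type


def set_ops(fwnode: FwNode, ops: str) -> None:
    """Associate a set of IOMMU operations with a firmware node token."""
    _ops_by_fwnode[fwnode] = ops


def get_ops(fwnode: FwNode):
    """Return the registered operations, or None if nothing is registered."""
    return _ops_by_fwnode.get(fwnode)


dt_smmu = FwNode("dt", "smmu@0")
acpi_smmu = FwNode("acpi", "IORT node 1")
set_ops(dt_smmu, "arm_smmu_ops")
set_ops(acpi_smmu, "arm_smmu_v3_ops")
print(get_ops(dt_smmu), get_ops(acpi_smmu))
```

The look-up path never inspects whether the token came from DT or ACPI, which is the property the commit message describes.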
     
  • The SMTNMB_TLBEN bit in the Auxiliary Configuration Register (ACR) provides an
    option to enable updating the TLB for bypass transactions caused by
    no stream match in the stream match table. This reduces the latency of
    subsequent transactions with the same stream-id that bypass the SMMU.
    This provides a significant performance benefit for certain networking
    workloads.

    With this change, a substantial performance improvement of ~9% is observed with
    the DPDK l3fwd application (http://dpdk.org/doc/guides/sample_app_ug/l3_forward.html)
    on NXP's LS2088a platform.

    Reviewed-by: Robin Murphy
    Signed-off-by: Nipun Gupta
    Signed-off-by: Will Deacon

    Nipun Gupta
     
  • Check for iommu_gather_ops structures that are only stored in the tlb
    field of an io_pgtable_cfg structure. The tlb field is of type
    const struct iommu_gather_ops *, so iommu_gather_ops structures
    having this property can be declared as const. Also, replace __initdata
    with __initconst.

    Acked-by: Julia Lawall
    Signed-off-by: Bhumika Goyal
    Signed-off-by: Will Deacon

    Bhumika Goyal
     
  • Check for iommu_gather_ops structures that are only stored in the tlb
    field of an io_pgtable_cfg structure. The tlb field is of type
    const struct iommu_gather_ops *, so iommu_gather_ops structures
    having this property can be declared as const.

    Acked-by: Julia Lawall
    Signed-off-by: Bhumika Goyal
    Signed-off-by: Will Deacon

    Bhumika Goyal
     
  • Check for iommu_gather_ops structures that are only stored in the tlb
    field of an io_pgtable_cfg structure. The tlb field is of type
    const struct iommu_gather_ops *, so iommu_gather_ops structures
    having this property can be declared as const.

    Acked-by: Julia Lawall
    Signed-off-by: Bhumika Goyal
    Signed-off-by: Will Deacon

    Bhumika Goyal
     
  • We can use for_each_set_bit() to simplify the code slightly in the
    ARM io-pgtable self tests.

    Reviewed-by: Robin Murphy
    Signed-off-by: Kefeng Wang
    Signed-off-by: Will Deacon

    Kefeng Wang
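
For readers unfamiliar with for_each_set_bit(), the kernel macro visits the index of every set bit in a bitmap, in ascending order. A rough Python equivalent, written only to show the iteration pattern and not how the self tests use it, could look like this:

```python
def for_each_set_bit(bitmap: int, size: int):
    """Yield the index of each set bit among the low `size` bits."""
    for bit in range(size):
        if bitmap & (1 << bit):
            yield bit


# Hypothetical bitmap with bits 12, 21 and 30 set.
example_bitmap = (1 << 12) | (1 << 21) | (1 << 30)
for shift in for_each_set_bit(example_bitmap, 64):
    print(shift, hex(1 << shift))
```

This replaces the usual pattern of finding the first set bit and then repeatedly asking for the next one.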
     

28 Nov, 2016

1 commit

  • Pull IOMMU fixes from David Woodhouse:
    "Two minor fixes.

    The first fixes the assignment of SR-IOV virtual functions to the
    correct IOMMU unit, and the second fixes the excessively large (and
    physically contiguous) PASID tables used with SVM"

    * git://git.infradead.org/intel-iommu:
    iommu/vt-d: Fix PASID table allocation
    iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions

    Linus Torvalds
     

20 Nov, 2016

1 commit

  • Somehow I ended up with an off-by-three error in calculating the size of
    the PASID and PASID State tables, which triggers allocation failures as
    those tables unfortunately have to be physically contiguous.

    In fact, even the *correct* maximum size of 8MiB is problematic and is
    wont to lead to allocation failures. Since I have extracted a promise
    that this *will* be fixed in hardware, I'm happy to limit it on the
    current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
    tables — still not ideal, but better than before.

    Reported by Mika Kuoppala and also by
    Xunlei Pang who submitted a simpler patch to fix
    only the allocation (and not the free) to the "correct" limit... which
    was still problematic.

    Signed-off-by: David Woodhouse
    Cc: stable@vger.kernel.org

    David Woodhouse
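
The table sizes quoted above can be checked with a little arithmetic. The sketch below assumes 8 bytes per table entry and a 20-bit PASID space; both values are inferred from the 8MiB and 1MiB figures in the message rather than stated in it.

```python
PASID_ENTRY_BYTES = 8      # assumption: one 64-bit entry per PASID
MAX_PASID_BITS = 20        # assumption: full 20-bit PASID space
MIB = 1 << 20


def pasid_table_bytes(num_pasids: int) -> int:
    """Size of a physically contiguous table with one entry per PASID."""
    return num_pasids * PASID_ENTRY_BYTES


full = pasid_table_bytes(1 << MAX_PASID_BITS)
capped = pasid_table_bytes(0x20000)
print(full // MIB, "MiB at the architectural maximum")
print(capped // MIB, "MiB with the 0x20000 PASID limit")
```

An off-by-three error in the shift used for the size calculation would inflate the allocation by a factor of 2**3 = 8 on top of these figures.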
     

15 Nov, 2016

13 commits

  • When searching for a free IOVA range, we optimise the tree traversal
    by starting from the cached32_node, instead of the last node, when
    limit_pfn is equal to dma_32bit_pfn. However, if limit_pfn happens to
    be smaller, then we'll go ahead and start from the top even though
    dma_32bit_pfn is still a more suitable upper bound. Since this is
    clearly a silly thing to do, adjust the lookup condition appropriately.

    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
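
The adjusted lookup condition can be modelled as a simple predicate. This is a hedged sketch of the decision described in the message, not the allocator itself: the function name and the `fixed` switch are hypothetical, and the real code works on red-black tree nodes rather than booleans.

```python
def start_from_cached32(limit_pfn: int, dma_32bit_pfn: int,
                        have_cached_node: bool, fixed: bool = True) -> bool:
    """Decide whether the search may start from the cached 32-bit node."""
    if not have_cached_node:
        return False
    if fixed:
        # New behavior: any limit at or below the 32-bit boundary qualifies.
        return limit_pfn <= dma_32bit_pfn
    # Old behavior: only an exact match used the cached node.
    return limit_pfn == dma_32bit_pfn


boundary = 0xFFFFF  # hypothetical dma_32bit_pfn value
print(start_from_cached32(0x7FFFF, boundary, True, fixed=False))
print(start_from_cached32(0x7FFFF, boundary, True, fixed=True))
```

With the old condition a smaller limit_pfn fell through to a search from the top of the tree, even though the cached node was the tighter starting point.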
     
  • For each subsequent device assigned to the m4u_group after its initial
    allocation, we need to take an additional reference. Otherwise, the
    caller of iommu_group_get_for_dev() will inadvertently remove the
    reference taken by iommu_group_add_device(), and the group will be
    freed prematurely if any device is removed.

    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
     
  • For each subsequent device assigned to the m4u_group after its initial
    allocation, we need to take an additional reference. Otherwise, the
    caller of iommu_group_get_for_dev() will inadvertently remove the
    reference taken by iommu_group_add_device(), and the group will be
    freed prematurely if any device is removed.

    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
     
  • If acpihid_device_group() finds an existing group for the relevant
    devid, it should be taking an additional reference on that group.
    Otherwise, the caller of iommu_group_get_for_dev() will inadvertently
    remove the reference taken by iommu_group_add_device(), and the group
    will be freed prematurely if any device is removed.

    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
     
  • When arm_smmu_device_group() finds an existing group due to Stream ID
    aliasing, it should be taking an additional reference on that group.
    Otherwise, the caller of iommu_group_get_for_dev() will inadvertently
    remove the reference taken by iommu_group_add_device(), and the group
    will be freed prematurely if any device is removed.

    Reported-by: Sricharan R
    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
     
  • iommu_group_get_for_dev() expects that the IOMMU driver's device_group
    callback return a group with a reference held for the given device.
    Whilst allocating a new group is fine, and pci_device_group() correctly
    handles reusing an existing group, there is no general means for IOMMU
    drivers doing their own group lookup to take additional references on an
    existing group pointer without having to also store device pointers or
    resort to elaborate trickery.

    Add an IOMMU-driver-specific function to fill the hole.

    Acked-by: Sricharan R
    Signed-off-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Robin Murphy
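
The reference-counting problem behind this helper, and behind the driver fixes that use it, can be reproduced with a toy model. The sketch below is an illustration under simplifying assumptions: Group, attach_device() and simulate() are hypothetical names, and a plain counter stands in for the kernel's kobject reference counting.

```python
class Group:
    """Toy stand-in for an iommu_group with a plain reference count."""

    def __init__(self):
        self.refs = 1        # reference held by whoever allocated the group
        self.freed = False

    def get(self):
        self.refs += 1
        return self

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def attach_device(lookup):
    group = lookup()   # device_group callback: expected to return a held reference
    group.get()        # reference taken on behalf of the device being added
    group.put()        # caller drops the reference returned by the lookup
    return group


def simulate(take_ref_on_reuse: bool) -> bool:
    """Attach two devices to one group, remove one, report whether it was freed."""
    shared = Group()
    allocated = False

    def lookup():
        nonlocal allocated
        if not allocated:
            allocated = True
            return shared                      # fresh allocation holds one reference
        return shared.get() if take_ref_on_reuse else shared

    attach_device(lookup)
    attach_device(lookup)
    shared.put()                               # one device is removed
    return shared.freed


print("freed early without extra reference:", simulate(False))
print("freed early with extra reference:", simulate(True))
```

In the first case the group is freed while a second device still points at it, which is the premature free the messages describe.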
     
  • This patch uses the recently introduced device dependency links to track the
    runtime PM state of the master's device. The goal is to let the SYSMMU
    controller device's runtime PM follow the runtime PM state of the
    respective master's device. This way each SYSMMU controller is active
    only when its master's device is active, and can properly save or restore
    its state on runtime PM transitions of the master's device.
    This approach replaces the old behavior, in which the SYSMMU controller was
    set to runtime active once, after attaching to the master device. In the new
    approach, SYSMMU controllers no longer prevent their respective power domains
    from being turned off when the master's device is not being used.

    This patch reduces the total power consumption of an idle system, because most
    power domains can finally be turned off. For example, on the Exynos 4412
    based Odroid U3 this patch reduces power consumption from 136mA to 130mA
    at 5V (by 4.4%).

    The dependency links also enforce proper order of suspending/restoring
    devices during system sleep transition, so there is no more need to use
    LATE_SYSTEM_SLEEP_PM_OPS-based workaround for ensuring that SYSMMUs are
    suspended after their master devices.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
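
The quoted saving is easy to verify. The short calculation below uses only the figures from the message (136mA and 130mA at 5V); the function names are illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction, in percent of the original value."""
    return (before - after) / before * 100


def power_mw(current_ma: int, volts: int) -> int:
    """Power in milliwatts for a current in mA at a given voltage."""
    return current_ma * volts


print(round(percent_reduction(136, 130), 1), "% less current")
print(power_mw(136, 5) - power_mw(130, 5), "mW saved at 5V")
```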
     
  • This patch adds a runtime PM implementation, which is based on the previous
    suspend/resume code. The SYSMMU controller is now enabled/disabled mainly
    from the runtime PM callbacks. System sleep callbacks rely on the generic
    pm_runtime_force_suspend/pm_runtime_force_resume helpers. To ensure
    internal state consistency, an additional lock for runtime PM transitions
    was introduced.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     
  • This patch reworks locking in the exynos_iommu_attach/detach_device
    functions to ensure that all entries of the sysmmu_drvdata and
    exynos_iommu_owner structure are updated under the respective spinlocks,
    while runtime pm functions are called without any spinlocks held.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     
  • To avoid possible races, set the master device pointer in each SYSMMU
    controller once on boot. Suspend/resume callbacks now properly rely on
    the configured iommu domain to enable or disable the SYSMMU controller.
    While changing the code, also update the sleep debug messages and make
    them conditional.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     
  • Remove the remaining leftovers of the ref-count related code in the
    __sysmmu_enable/disable functions and inline __sysmmu_enable/disable_nocount
    into them. Suspend/resume callbacks now check whether the master device is
    set for a given SYSMMU controller instead of relying on the activation count.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     
  • The __sysmmu_enable/disable functions were designed to do ref-count based
    operations, but the current code always calls them only once, so the code for
    checking the conditions and invalid conditions can simply be removed
    without affecting the driver's operation.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     
  • Remove excessive, useless debug about skipping TLB invalidation, which
    is a normal situation when more aggressive power management is enabled.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
     

14 Nov, 2016

2 commits

  • With the new dma_{map,unmap}_resource() functions added to the DMA API
    for the benefit of cases like slave DMA, add suitable implementations to
    the arsenal of our generic layer. Since cache maintenance should not be
    a concern, these can both be standalone callback implementations without
    the need for arch code wrappers.

    CC: Joerg Roedel
    Signed-off-by: Robin Murphy
    Reviewed-by: Catalin Marinas
    Signed-off-by: Joerg Roedel

    Robin Murphy
     
  • This patch adds support for page access protection bits. Until now this
    feature was disabled and the Exynos SYSMMU always mapped pages as read/write.
    Now page access bits are set according to the protection bits provided
    in iommu_map(), so the Exynos SYSMMU is able to detect incorrect access to
    mapped pages. Exynos SYSMMU versions earlier than v5 do not support write-only
    mappings, so pages with such protection bits are mapped as read/write.

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel

    Marek Szyprowski
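
The write-only fallback can be sketched as a small function. IOMMU_READ and IOMMU_WRITE mirror the kernel's protection flags; effective_prot() and its version parameter are hypothetical, and the real driver encodes the result into page table entry bits rather than returning flags.

```python
IOMMU_READ = 1 << 0
IOMMU_WRITE = 1 << 1


def effective_prot(prot: int, sysmmu_major_version: int) -> int:
    """Return the access rights a mapping ends up with for a SYSMMU version."""
    write_only = prot == IOMMU_WRITE
    if write_only and sysmmu_major_version < 5:
        # Older controllers cannot express write-only, so widen to read/write.
        return IOMMU_READ | IOMMU_WRITE
    return prot


print(effective_prot(IOMMU_WRITE, 3))   # widened to read/write
print(effective_prot(IOMMU_WRITE, 5))   # stays write-only
print(effective_prot(IOMMU_READ, 3))    # read-only is kept as requested
```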
     

10 Nov, 2016

1 commit