04 Jul, 2014

1 commit

  • Register our DRHD IOMMUs, cross link devices, and provide a base set
    of attributes for the IOMMU. Note that IRQ remapping support parses
    the DMAR table very early in boot, well before the iommu_class can
    reasonably be set up, so our registration is split between
    intel_iommu_init(), which occurs later, and alloc_iommu(), which
    typically occurs much earlier, but may happen at any time later
    with IOMMU hot-add support.

    On a typical desktop system, this provides the following (pruned):

    $ find /sys | grep dmar
    /sys/devices/virtual/iommu/dmar0
    /sys/devices/virtual/iommu/dmar0/devices
    /sys/devices/virtual/iommu/dmar0/devices/0000:00:02.0
    /sys/devices/virtual/iommu/dmar0/intel-iommu
    /sys/devices/virtual/iommu/dmar0/intel-iommu/cap
    /sys/devices/virtual/iommu/dmar0/intel-iommu/ecap
    /sys/devices/virtual/iommu/dmar0/intel-iommu/address
    /sys/devices/virtual/iommu/dmar0/intel-iommu/version
    /sys/devices/virtual/iommu/dmar1
    /sys/devices/virtual/iommu/dmar1/devices
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:00.0
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:01.0
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:16.0
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:1a.0
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:1b.0
    /sys/devices/virtual/iommu/dmar1/devices/0000:00:1c.0
    ...
    /sys/devices/virtual/iommu/dmar1/intel-iommu
    /sys/devices/virtual/iommu/dmar1/intel-iommu/cap
    /sys/devices/virtual/iommu/dmar1/intel-iommu/ecap
    /sys/devices/virtual/iommu/dmar1/intel-iommu/address
    /sys/devices/virtual/iommu/dmar1/intel-iommu/version
    /sys/class/iommu/dmar0
    /sys/class/iommu/dmar1

    (devices also link back to the dmar units)

    This makes address, version, capabilities, and extended capabilities
    available, just like printed on boot. I've tried not to duplicate
    data that can be found in the DMAR table, with the exception of the
    address, which provides an easy way to associate the sysfs device with
    a DRHD entry in the DMAR. It's tempting to add scopes and RMRR data
    here, but the full DMAR table is already exposed under /sys/firmware/
    and therefore already provides a way for userspace to learn such
    details.

    Signed-off-by: Alex Williamson
    Signed-off-by: Joerg Roedel

    Alex Williamson
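
As a rough illustration of how userspace might consume the attributes listed above, the sketch below reads one register value from a sysfs file. It assumes, without confirmation from the commit message, that `cap`, `ecap` and `address` each hold a single hexadecimal number followed by a newline. The helper names `parse_hex_attr` and `read_hex_attr` are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one hexadecimal value given as text, e.g. the contents of
 * .../intel-iommu/cap. ASSUMPTION: the attribute is a single hex number,
 * optionally followed by a newline. Returns 0 on success, -1 otherwise. */
static int parse_hex_attr(const char *text, uint64_t *out)
{
    char *end = NULL;

    errno = 0;
    unsigned long long v = strtoull(text, &end, 16);
    if (errno != 0 || end == text)
        return -1;                      /* overflow or no digits at all */
    if (*end != '\0' && *end != '\n')
        return -1;                      /* trailing garbage */
    *out = (uint64_t)v;
    return 0;
}

/* Read the first line of a sysfs attribute and parse it as hex.
 * Returns 0 on success, -1 if the file is missing or malformed. */
static int read_hex_attr(const char *path, uint64_t *out)
{
    char buf[64];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_hex_attr(buf, out);
}
```

A caller could pass `/sys/class/iommu/dmar0/intel-iommu/cap` to `read_hex_attr()` and compare the result with the capability value printed at boot.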
     

24 Mar, 2014

1 commit


09 Jan, 2014

2 commits

  • Data structure drhd->iommu is shared between the DMA remapping driver and
    the interrupt remapping driver, so the DMA remapping driver shouldn't release
    drhd->iommu when it fails to initialize IOMMU devices. Otherwise it
    may cause invalid memory accesses in the interrupt remapping driver.

    Sample stack dump:
    [ 13.315090] BUG: unable to handle kernel paging request at ffffc9000605a088
    [ 13.323221] IP: [] qi_submit_sync+0x15c/0x400
    [ 13.330107] PGD 82f81e067 PUD c2f81e067 PMD 82e846067 PTE 0
    [ 13.336818] Oops: 0002 [#1] SMP
    [ 13.340757] Modules linked in:
    [ 13.344422] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.13.0-rc1-gerry+ #7
    [ 13.352474] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
    [ 13.365659] Workqueue: events work_for_cpu_fn
    [ 13.370774] task: ffff88042ddf00d0 ti: ffff88042ddee000 task.ti: ffff88042ddee000
    [ 13.379389] RIP: 0010:[] [] qi_submit_sync+0x15c/0x400
    [ 13.389055] RSP: 0000:ffff88042ddef940 EFLAGS: 00010002
    [ 13.395151] RAX: 00000000000005e0 RBX: 0000000000000082 RCX: 0000000200000025
    [ 13.403308] RDX: ffffc9000605a000 RSI: 0000000000000010 RDI: ffff88042ddb8610
    [ 13.411446] RBP: ffff88042ddef9a0 R08: 00000000000005d0 R09: 0000000000000001
    [ 13.419599] R10: 0000000000000000 R11: 000000000000005d R12: 000000000000005c
    [ 13.427742] R13: ffff88102d84d300 R14: 0000000000000174 R15: ffff88042ddb4800
    [ 13.435877] FS: 0000000000000000(0000) GS:ffff88043de00000(0000) knlGS:0000000000000000
    [ 13.445168] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 13.451749] CR2: ffffc9000605a088 CR3: 0000000001a0b000 CR4: 00000000000407f0
    [ 13.459895] Stack:
    [ 13.462297] ffff88042ddb85d0 000000000000005d ffff88042ddef9b0 00000000000005d0
    [ 13.471147] 00000000000005c0 ffff88042ddb8000 000000000000005c 0000000000000015
    [ 13.480001] ffff88042ddb4800 0000000000000282 ffff88042ddefa40 ffff88042ddefac0
    [ 13.488855] Call Trace:
    [ 13.491771] [] modify_irte+0x9d/0xd0
    [ 13.497778] [] intel_setup_ioapic_entry+0x10d/0x290
    [ 13.505250] [] ? trace_hardirqs_on_caller+0x16/0x1e0
    [ 13.512824] [] ? default_init_apic_ldr+0x60/0x60
    [ 13.519998] [] setup_ioapic_remapped_entry+0x20/0x30
    [ 13.527566] [] io_apic_setup_irq_pin+0x12a/0x2c0
    [ 13.534742] [] ? acpi_pci_irq_find_prt_entry+0x2b9/0x2d8
    [ 13.544102] [] io_apic_setup_irq_pin_once+0x85/0xa0
    [ 13.551568] [] ? mp_find_ioapic_pin+0x8f/0xf0
    [ 13.558434] [] io_apic_set_pci_routing+0x34/0x70
    [ 13.565621] [] mp_register_gsi+0xaf/0x1c0
    [ 13.572111] [] acpi_register_gsi_ioapic+0xe/0x10
    [ 13.579286] [] acpi_register_gsi+0xf/0x20
    [ 13.585779] [] acpi_pci_irq_enable+0x171/0x1e3
    [ 13.592764] [] pcibios_enable_device+0x31/0x40
    [ 13.599744] [] do_pci_enable_device+0x3b/0x60
    [ 13.606633] [] pci_enable_device_flags+0xc8/0x120
    [ 13.613887] [] pci_enable_device+0x13/0x20
    [ 13.620484] [] pcie_port_device_register+0x1e/0x510
    [ 13.627947] [] ? trace_hardirqs_on_caller+0x16/0x1e0
    [ 13.635510] [] ? trace_hardirqs_on+0xd/0x10
    [ 13.642189] [] pcie_portdrv_probe+0x58/0xc0
    [ 13.648877] [] local_pci_probe+0x45/0xa0
    [ 13.655266] [] work_for_cpu_fn+0x14/0x20
    [ 13.661656] [] process_one_work+0x369/0x710
    [ 13.668334] [] ? process_one_work+0x2f2/0x710
    [ 13.675215] [] ? worker_thread+0x46/0x690
    [ 13.681714] [] worker_thread+0x484/0x690
    [ 13.688109] [] ? cancel_delayed_work_sync+0x20/0x20
    [ 13.695576] [] kthread+0xf0/0x110
    [ 13.701300] [] ? local_clock+0x3f/0x50
    [ 13.707492] [] ? kthread_create_on_node+0x250/0x250
    [ 13.714959] [] ret_from_fork+0x7c/0xb0
    [ 13.721152] [] ? kthread_create_on_node+0x250/0x250

    Signed-off-by: Jiang Liu
    Signed-off-by: Joerg Roedel

    Jiang Liu
     
  • Functions alloc_iommu() and parse_ioapics_under_ir()
    are only used internally, so mark them as static.

    [Joerg: Made detect_intel_iommu() non-static again for IA64]

    Signed-off-by: Jiang Liu
    Signed-off-by: Joerg Roedel

    Jiang Liu
     

08 Jan, 2014

1 commit

  • Currently the Intel interrupt remapping driver uses the "present" flag bit
    in a remapping entry to track whether the entry is allocated or not.
    It works as follows:
    1) allocate a remapping entry and set its "present" flag bit to 1
    2) compose other fields for the entry
    3) update the remapping entry with the composed value

    The remapping hardware may access the entry between step 1 and step 3,
    in which case it observes an entry with the "present" flag set but random
    values in all other fields.

    This patch introduces a dedicated bitmap to track remapping entry
    allocation status instead of sharing the "present" flag with hardware,
    thus eliminating the race window. It also simplifies the implementation.

    Tested-and-reviewed-by: Yijing Wang
    Signed-off-by: Jiang Liu
    Signed-off-by: Joerg Roedel

    Jiang Liu
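
The idea in the commit above can be sketched in plain C as follows. This is a simplified userspace model, not the driver's code: the table size, the 64-bit entry layout, the `ENTRY_PRESENT` bit and all function names are assumptions made for illustration, and locking is omitted.

```c
#include <stdint.h>

#define NR_ENTRIES    64        /* hypothetical table size for this sketch */
#define ENTRY_PRESENT 1ULL      /* bit 0 stands in for the "present" flag */

struct remap_table {
    uint64_t entries[NR_ENTRIES];   /* what the hardware would read */
    uint64_t bitmap;                /* software-only allocation state */
};

/* Reserve a free index in the software bitmap. The entry itself is left
 * untouched, so hardware never sees a half-built "present" entry.
 * Returns the index, or -1 if the table is full. */
static int alloc_entry(struct remap_table *t)
{
    for (int i = 0; i < NR_ENTRIES; i++) {
        if (!(t->bitmap & (1ULL << i))) {
            t->bitmap |= 1ULL << i;
            return i;
        }
    }
    return -1;
}

/* Publish a fully composed value, present bit included, in one store. */
static void publish_entry(struct remap_table *t, int idx, uint64_t fields)
{
    t->entries[idx] = fields | ENTRY_PRESENT;
}

/* Clear the hardware-visible entry, then release the index. */
static void free_entry(struct remap_table *t, int idx)
{
    t->entries[idx] = 0;
    t->bitmap &= ~(1ULL << idx);
}
```

Between `alloc_entry()` and `publish_entry()` the hardware-visible entry stays all zeroes, which is the window the original scheme left open.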
     

24 Sep, 2013

1 commit


08 Jun, 2012

1 commit

  • Intel-iommu initialization doesn't currently reserve the memory
    used for the IOMMU registers. This can allow the pci resource
    allocator to assign a device BAR to the same address as the
    IOMMU registers. This can cause some not so nice side effects
    when the driver ioremaps that region.

    Introduced two helper functions to map & unmap the IOMMU
    registers as well as simplify the init and exit paths.

    Signed-off-by: Donald Dutile
    Acked-by: Chris Wright
    Cc: iommu@lists.linux-foundation.org
    Cc: suresh.b.siddha@intel.com
    Cc: dwmw2@infradead.org
    Link: http://lkml.kernel.org/r/1338845342-12464-3-git-send-email-ddutile@redhat.com
    Signed-off-by: Ingo Molnar

    Donald Dutile
     

26 Oct, 2011

1 commit

  • * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock()
    lockdep: Comment all warnings
    lib: atomic64: Change the type of local lock to raw_spinlock_t
    locking, lib/atomic64: Annotate atomic64_lock::lock as raw
    locking, x86, iommu: Annotate qi->q_lock as raw
    locking, x86, iommu: Annotate irq_2_ir_lock as raw
    locking, x86, iommu: Annotate iommu->register_lock as raw
    locking, dma, ipu: Annotate bank_lock as raw
    locking, ARM: Annotate low level hw locks as raw
    locking, drivers/dca: Annotate dca_lock as raw
    locking, powerpc: Annotate uic->lock as raw
    locking, x86: mce: Annotate cmci_discover_lock as raw
    locking, ACPI: Annotate c3_lock as raw
    locking, oprofile: Annotate oprofilefs lock as raw
    locking, video: Annotate vga console lock as raw
    locking, latencytop: Annotate latency_lock as raw
    locking, timer_stats: Annotate table_lock as raw
    locking, rwsem: Annotate inner lock as raw
    locking, semaphores: Annotate inner lock as raw
    locking, sched: Annotate thread_group_cputimer as raw
    ...

    Fix up conflicts in kernel/posix-cpu-timers.c manually: making
    cputimer->cputime a raw lock conflicted with the ABBA fix in commit
    bcd5cff7216f ("cputimer: Cure lock inversion").

    Linus Torvalds
     

21 Sep, 2011

1 commit

  • Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
    with the other IOMMU options.

    Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
    irq subsystem name.

    And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
    routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.

    Signed-off-by: Suresh Siddha
    Cc: yinghai@kernel.org
    Cc: youquan.song@intel.com
    Cc: joerg.roedel@amd.com
    Cc: tony.luck@intel.com
    Cc: dwmw2@infradead.org
    Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

13 Sep, 2011

2 commits

  • The qi->q_lock lock can be taken in atomic context and therefore
    cannot be preempted on -rt - annotate it.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • The iommu->register_lock can be taken in atomic context and therefore
    must not be preempted on -rt - annotate it.

    In mainline this change documents the low level nature of
    the lock - otherwise there's no functional difference. Lockdep
    and Sparse checking will work as usual.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

05 Oct, 2009

1 commit


11 Sep, 2009

1 commit

  • The BIOS clears the DMAR table INTR_REMAP flag to disable interrupt remapping.
    The current kernel only checks the interrupt remapping (IR) flag in the DRHD's
    extended capability register to decide whether interrupt remapping is
    supported. But the IR flag does not change when the BIOS disables or enables
    interrupt remapping.

    A user may disable interrupt remapping in the BIOS, and an immature BIOS often
    disables the feature by default. Even though the BIOS has disabled interrupt
    remapping, the intr_remapping_supported function will always report to the OS
    that interrupt remapping is supported if a VT-d2 chipset is populated. In this
    case, the kernel goes on to enable interrupt remapping, which results in a
    kernel panic. This bug exists on almost all platforms with interrupt remapping
    support.

    This patch adds a DMAR table INTR_REMAP flag check before enabling interrupt
    remapping.

    Signed-off-by: Youquan Song
    Signed-off-by: David Woodhouse

    Youquan Song
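
A minimal sketch of the combined check described above: interrupt remapping is treated as usable only when both the firmware flag and the hardware capability say so. The bit positions (bit 0 of the DMAR table flags, bit 3 of the extended capability register) and the function name are assumptions for illustration and should be checked against the VT-d specification and the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMAR_FLAG_INTR_REMAP 0x1u               /* ASSUMED: bit 0 of DMAR flags */
#define ECAP_IR_SUPPORT(e)   (((e) >> 3) & 0x1) /* ASSUMED: bit 3 of ecap */

/* Return true only if the BIOS left INTR_REMAP set in the DMAR table
 * AND the DRHD's extended capability register advertises IR support. */
static bool intr_remap_usable(uint8_t dmar_flags, uint64_t ecap)
{
    if (!(dmar_flags & DMAR_FLAG_INTR_REMAP))
        return false;   /* firmware opted out, regardless of hardware */
    return ECAP_IR_SUPPORT(ecap) != 0;
}
```

Checking the table flag first means a BIOS setting can veto the feature even on hardware that would otherwise support it.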
     

18 May, 2009

3 commits


11 May, 2009

2 commits

  • As we just did for context cache flushing, clean up the logic around
    whether we need to flush the iotlb or just the write-buffer, depending
    on caching mode.

    Fix the same bug in qi_flush_iotlb() that qi_flush_context() had -- it
    isn't supposed to be returning an error; it's supposed to be returning a
    flag which triggers a write-buffer flush.

    Remove some superfluous conditional write-buffer flushes which could
    never have happened because they weren't for non-present-to-present
    mapping changes anyway.

    Signed-off-by: David Woodhouse

    David Woodhouse
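
The decision this commit centralizes can be modeled as a small pure function. This is a sketch under assumptions, not the driver's code: the enum, the function name and the `needs_write_buffer_flush` parameter are hypothetical stand-ins for whatever the hardware capability bits report.

```c
#include <stdbool.h>

enum flush_action { FLUSH_NONE, FLUSH_IOTLB, FLUSH_WRITE_BUFFER };

/* Decide what to do after a non-present-to-present mapping change.
 * In caching mode the hardware may have cached the non-present entry,
 * so the IOTLB must be flushed. Otherwise only the write buffer needs
 * flushing, and only on hardware that requires it (ASSUMED to be
 * reported by the caller through needs_write_buffer_flush). */
static enum flush_action flush_after_map(bool caching_mode,
                                         bool needs_write_buffer_flush)
{
    if (caching_mode)
        return FLUSH_IOTLB;
    if (needs_write_buffer_flush)
        return FLUSH_WRITE_BUFFER;
    return FLUSH_NONE;
}
```

Keeping the choice in the caller, rather than encoding it in a return value of the flush routine, is the simplification the commit message argues for.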
     
  • It really doesn't make a lot of sense to have some of the logic to
    handle caching vs. non-caching mode duplicated in qi_flush_context() and
    __iommu_flush_context(), while the return value indicates whether the
    caller should take other action which depends on the same thing.

    Especially since qi_flush_context() thought it was returning something
    entirely different anyway.

    This patch makes qi_flush_context() and __iommu_flush_context() both
    return void, removes the 'non_present_entry_flush' argument and makes
    the only call site which _set_ that argument to 1 do the right thing.

    Signed-off-by: David Woodhouse

    David Woodhouse
     

29 Apr, 2009

1 commit

  • The patch adds kernel parameter intel_iommu=pt to set up pass through
    mode in context mapping entry. This disables DMAR in linux kernel; but
    KVM still runs on VT-d and interrupt remapping still works.

    In this mode, the kernel uses swiotlb for DMA API functions but other VT-d
    functionalities are enabled for KVM. KVM always uses multi level
    translation page table in VT-d. By default, pass through mode is disabled
    in the kernel.

    This is useful when people don't want to enable VT-d DMAR in the kernel but
    still want to use KVM and interrupt remapping, for reasons such as DMAR
    performance concerns or debugging.

    Signed-off-by: Fenghua Yu
    Acked-by: Weidong Han
    Signed-off-by: David Woodhouse

    Fenghua Yu
     

04 Apr, 2009

3 commits

  • When extended interrupt mode (x2apic mode) is not supported in a
    system, compatibility format interrupts must be set to bypass
    interrupt remapping; otherwise compatibility format interrupts
    will be blocked.

    This will be used when interrupt remapping is enabled while x2apic
    is not supported.

    Signed-off-by: Weidong Han
    Acked-by: Ingo Molnar
    Signed-off-by: David Woodhouse

    Han, Weidong
     
  • This patch implements the suspend and resume feature for Intel IOMMU
    DMAR. It hooks into the kernel suspend and resume interface. On suspend, it
    saves the necessary hardware registers. On resume, it restores the
    registers and restarts the IOMMU by enabling translation, setting up the root
    entry, and re-enabling queued invalidation.

    Signed-off-by: Fenghua Yu
    Acked-by: Ingo Molnar
    Signed-off-by: David Woodhouse

    Fenghua Yu
     
  • * git://git.infradead.org/iommu-2.6:
    intel-iommu: Fix address wrap on 32-bit kernel.
    intel-iommu: Enable DMAR on 32-bit kernel.
    intel-iommu: fix PCI device detach from virtual machine
    intel-iommu: VT-d page table to support snooping control bit
    iommu: Add domain_has_cap iommu_ops
    intel-iommu: Snooping control support

    Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c

    Linus Torvalds
     

31 Mar, 2009

1 commit

  • * 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
    dma-debug: make memory range checks more consistent
    dma-debug: warn of unmapping an invalid dma address
    dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
    dma-debug/x86: register pci bus for dma-debug leak detection
    dma-debug: add a check dma memory leaks
    dma-debug: add checks for kernel text and rodata
    dma-debug: print stacktrace of mapping path on unmap error
    dma-debug: Documentation update
    dma-debug: x86 architecture bindings
    dma-debug: add function to dump dma mappings
    dma-debug: add checks for sync_single_sg_*
    dma-debug: add checks for sync_single_range_*
    dma-debug: add checks for sync_single_*
    dma-debug: add checking for [alloc|free]_coherent
    dma-debug: add add checking for map/unmap_sg
    dma-debug: add checking for map/unmap_page/single
    dma-debug: add core checking functions
    dma-debug: add debugfs interface
    dma-debug: add kernel command line parameters
    dma-debug: add initialization code
    ...

    Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c

    Linus Torvalds
     

24 Mar, 2009

1 commit


18 Mar, 2009

2 commits


05 Mar, 2009

1 commit


09 Feb, 2009

1 commit

  • When hardware detects any error with a descriptor from the invalidation
    queue, it stops fetching new descriptors from the queue until software
    clears the Invalidation Queue Error bit in the Fault Status register.
    The following fix handles the IQE so the kernel won't be trapped in an
    infinite loop.

    Signed-off-by: Yu Zhao
    Signed-off-by: David Woodhouse

    Yu Zhao
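
The failure mode and the fix can be illustrated with a small simulation. This is a sketch only: `struct fake_iommu`, `WAIT_DONE`, the return codes and the poll bound are hypothetical, and the IQE bit position is an assumption to be checked against the VT-d specification.

```c
#include <stdint.h>

#define FSTS_IQE  (1u << 4)  /* ASSUMED bit position of Invalidation Queue Error */
#define WAIT_DONE 2u         /* hypothetical "wait descriptor completed" value */

struct fake_iommu {
    uint32_t fault_status;   /* stands in for the Fault Status register */
    uint32_t wait_status;    /* set by "hardware" when the wait completes */
};

/* Poll for completion of a submitted invalidation, checking IQE on every
 * iteration. Without that check, a bad descriptor stops the queue and the
 * completion status never arrives, so the loop would spin forever.
 * Returns 0 on completion, -1 if IQE was seen (and cleared), -2 if the
 * sketch's poll bound ran out. */
static int wait_for_invalidation(struct fake_iommu *iommu, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (iommu->fault_status & FSTS_IQE) {
            /* Real hardware would need a register write to clear the bit;
             * the simulation just drops it. */
            iommu->fault_status &= ~FSTS_IQE;
            return -1;
        }
        if (iommu->wait_status == WAIT_DONE)
            return 0;
    }
    return -2;
}
```

The caller can then treat `-1` as a failed submission instead of waiting on a queue that the hardware has stopped servicing.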
     

29 Jan, 2009

1 commit


06 Jan, 2009

1 commit

  • This converts X86 and IA64 to use include/linux/dma-mapping.h.

    It's a bit large but pretty boring. The major change for X86 is
    converting 'int dir' to 'enum dma_data_direction dir' in DMA mapping
    operations. The major change for IA64 is using map_page and
    unmap_page instead of map_single and unmap_single.

    Signed-off-by: FUJITA Tomonori
    Acked-by: Tony Luck
    Signed-off-by: Ingo Molnar

    FUJITA Tomonori
     

03 Jan, 2009

8 commits


18 Oct, 2008

1 commit

  • The current Intel IOMMU code assumes that both host page size and Intel
    IOMMU page size are 4KiB. The first patch supports variable page size.
    This provides support for IA64 which has multiple page sizes.

    This patch also adds some other code hooks for IA64 platform including
    DMAR_OPERATION_TIMEOUT definition.

    [dwmw2: some cleanup]
    Signed-off-by: Fenghua Yu
    Signed-off-by: Tony Luck
    Signed-off-by: David Woodhouse

    Fenghua Yu
     

17 Oct, 2008

1 commit

  • If the queued invalidation interface is available and enabled, it
    will be used instead of the register based interface.

    According to the VT-d2 specification, when queued invalidation is enabled,
    invalidation command submission works only through the invalidation queue and
    not through the command registers interface.

    Signed-off-by: Youquan Song
    Signed-off-by: Suresh Siddha
    Signed-off-by: David Woodhouse

    Youquan Song
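
One way to picture the selection described above is a function pointer chosen once at setup time. The types and names below (`struct iommu_sketch`, `inv_path`, `select_invalidation_path`) are hypothetical and only model the choice; they are not the driver's structures.

```c
#include <stdbool.h>

enum inv_path { INV_VIA_REGISTERS, INV_VIA_QUEUE };

typedef enum inv_path (*inv_fn)(void);

/* Stand-ins for the two submission mechanisms; each just reports
 * which path it represents. */
static enum inv_path register_invalidate(void) { return INV_VIA_REGISTERS; }
static enum inv_path queued_invalidate(void)   { return INV_VIA_QUEUE; }

struct iommu_sketch {
    bool qi_enabled;     /* queued invalidation available and enabled */
    inv_fn invalidate;   /* chosen submission path */
};

/* Pick the queue when it is enabled; the register interface is only a
 * fallback, since the two must not be mixed once the queue is on. */
static void select_invalidation_path(struct iommu_sketch *iommu)
{
    iommu->invalidate = iommu->qi_enabled ? queued_invalidate
                                          : register_invalidate;
}
```

After selection, every caller goes through `iommu->invalidate()`, so no call site needs to know which mechanism is active.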