17 Dec, 2009

1 commit

  • * git://git.infradead.org/iommu-2.6:
    implement early_io{re,un}map for ia64
    Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
    intel-iommu: ignore page table validation in pass through mode
    intel-iommu: Fix oops with intel_iommu=igfx_off
    intel-iommu: Check for an RMRR which ends before it starts.
    intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
    intel-iommu: Detect DMAR in hyperspace at probe time.
    dmar: Fix build failure without NUMA, warn on bogus RHSA tables and don't abort
    iommu: Allocate dma-remapping structures using numa locality info
    intr_remap: Allocate intr-remapping table using numa locality info
    dmar: Allocate queued invalidation structure using numa locality info
    dmar: support for parsing Remapping Hardware Static Affinity structure

    Linus Torvalds
     

16 Dec, 2009

1 commit


12 Dec, 2009

1 commit

  • * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits)
    PCI: fix coding style issue in pci_save_state()
    PCI: add pci_request_acs
    PCI: fix BUG_ON triggered by logical PCIe root port removal
    PCI: remove ifdefed pci_cleanup_aer_correct_error_status
    PCI: unconditionally clear AER uncorr status register during cleanup
    x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
    PCI: portdrv: remove redundant definitions
    PCI: portdrv: remove unnecessary struct pcie_port_data
    PCI: portdrv: minor cleanup for pcie_port_device_register
    PCI: portdrv: add missing irq cleanup
    PCI: portdrv: enable device before irq initialization
    PCI: portdrv: cleanup service irqs initialization
    PCI: portdrv: check capabilities first
    PCI: portdrv: move PME capability check
    PCI: portdrv: remove redundant pcie type calculation
    PCI: portdrv: cleanup pcie_device registration
    PCI: portdrv: remove redundant pcie_port_device_probe
    PCI: Always set prefetchable base/limit upper32 registers
    PCI: read-modify-write the pcie device control register when initiating pcie flr
    PCI: show dma_mask bits in /sys
    ...

    Fixed up conflicts in:
    arch/x86/kernel/amd_iommu_init.c
    drivers/pci/dmar.c
    drivers/pci/hotplug/acpiphp_glue.c

    Linus Torvalds
     

10 Dec, 2009

1 commit

  • * 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
    ACPICA: Update version to 20091112.
    ACPICA: Add additional module-level code support
    ACPICA: Deploy new create integer interface where appropriate
    ACPICA: New internal utility function to create Integer objects
    ACPICA: Add repair for predefined methods that must return sorted lists
    ACPICA: Fix possible fault if return Package objects contain NULL elements
    ACPICA: Add post-order callback to acpi_walk_namespace
    ACPICA: Change package length error message to an info message
    ACPICA: Reduce severity of predefined repair messages, Warning to Info
    ACPICA: Update version to 20091013
    ACPICA: Fix possible memory leak for Scope ASL operator
    ACPICA: Remove possibility of executing _REG methods twice
    ACPICA: Add repair for bad _MAT buffers
    ACPICA: Add repair for bad _BIF/_BIX packages

    Linus Torvalds
     

09 Dec, 2009

1 commit


08 Dec, 2009

7 commits

  • commit eb3fa7cb51 said Intel IOMMU

    Intel IOMMU driver needs memory during DMA map calls to setup its
    internal page tables and for other data structures. As we all know
    that these DMA map calls are mostly called in the interrupt context
    or with the spinlock held by the upper level drivers(network/storage
    drivers), so in order to avoid any memory allocation failure due to
    low memory issues, this patch makes memory allocation by temporarily
    setting PF_MEMALLOC flags for the current task before making memory
    allocation calls.

    We evaluated mempools as a backup when kmem_cache_alloc() fails
    and found that mempools are really not useful here because
    1) We don't know for sure how much to reserve in advance
    2) And mempools are not useful for GFP_ATOMIC case (as we call
    memory alloc functions with GFP_ATOMIC)

    (akpm: point 2 is wrong...)

    The above description does not justify wasting the system's emergency
    memory at all. Non-MM subsystems must not use PF_MEMALLOC. Memory reclaim
    needs a little memory, and nothing must prevent it from getting that.
    Otherwise the system suffers mysterious hang-ups and/or OOM killer
    invocations.

    Plus, akpm already pointed out what we should do.

    So this patch reverts it.

    Cc: Keshavamurthy Anil S
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: David Woodhouse

    KOSAKI Motohiro
     
  • We are seeing a bug when booting w/ iommu=pt with current upstream
    (bisect blames 19943b0e30b05d42e494ae6fef78156ebc8c637e "intel-iommu:
    Unify hardware and software passthrough support").

    The issue is specific to this loop during identity map initialization
    of each device:

    domain_context_mapping_one(si_domain, ..., CONTEXT_TT_PASS_THROUGH)
    ...
    /* Skip top levels of page tables for
     * iommu which has less agaw than default.
     */
    for (agaw = domain->agaw; agaw != iommu->agaw; agaw--) {
        pgd = phys_to_virt(dma_pte_addr(pgd));
        if (!dma_pte_present(pgd)) {
            spin_unlock_irqrestore(&iommu->lock, flags);
            return -ENOMEM;
        }
    }

    This box has 2 iommu's in it. The catchall iommu has MGAW == 48, and
    SAGAW == 4. The other iommu has MGAW == 39, SAGAW == 2.

    The device that's failing the above pgd test is the only device connected
    to the non-catchall iommu, which has a smaller address width than the
    domain default. This test is not necessary since the context is in PT
    mode and the ASR is ignored.

    Thanks to Don Dutile for discovering and debugging this one.

    Cc: stable@kernel.org
    Signed-off-by: Chris Wright
    Signed-off-by: David Woodhouse

    Chris Wright
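
    The fix described above amounts to skipping the page-table walk when the
    context entry is programmed for pass-through. A minimal sketch of that
    control flow follows. The types and the function skip_top_levels() are
    hypothetical stand-ins for illustration and do not reproduce the
    intel-iommu driver's code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum translation_type { TT_MULTI_LEVEL, TT_PASS_THROUGH };

struct pt_level {
    struct pt_level *next;   /* next lower level; NULL means "not present" */
};

/*
 * Step *pgd down from the domain's address width to the narrower width
 * supported by this IOMMU. Returns 0 on success, -1 if a level is missing.
 * In pass-through mode the hardware ignores the page-table root, so the
 * walk and its presence check are skipped entirely.
 */
int skip_top_levels(struct pt_level **pgd, int domain_agaw, int iommu_agaw,
                    enum translation_type tt)
{
    if (tt == TT_PASS_THROUGH)
        return 0;

    for (int agaw = domain_agaw; agaw > iommu_agaw; agaw--) {
        if ((*pgd)->next == NULL)
            return -1;
        *pgd = (*pgd)->next;
    }
    return 0;
}
```

    With an unpopulated top level, the multi-level case fails while the
    pass-through case succeeds without touching the tables.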
     
  • The hotplug notifier will call find_domain() to see if the device in
    question has been assigned an IOMMU domain. However, this should never
    be called for devices with a "dummy" domain, such as graphics devices
    when intel_iommu=igfx_off is set and the corresponding IOMMU isn't even
    initialised. If you do that, it'll oops as it dereferences the (-1)
    pointer.

    The notifier function should check iommu_no_mapping() for the
    device before doing anything else.

    Cc: stable@kernel.org
    Signed-off-by: David Woodhouse

    David Woodhouse
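
    The ordering problem above can be illustrated with a small sketch: test
    for the "dummy" sentinel before anything dereferences the pointer. All
    names here (struct domain, struct device, DUMMY_DOMAIN,
    notifier_domain_id) are hypothetical and only model the idea; they are
    not the driver's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
struct domain { int id; };
struct device { struct domain *domain_info; };

/* Sentinel marking a device that deliberately bypasses the IOMMU. */
#define DUMMY_DOMAIN ((struct domain *)(intptr_t)-1)

bool device_has_no_mapping(const struct device *dev)
{
    return dev->domain_info == DUMMY_DOMAIN;
}

/* Returns the domain id, or -1 when there is no real domain to act on. */
int notifier_domain_id(const struct device *dev)
{
    /* Check the sentinel first so it is never dereferenced. */
    if (device_has_no_mapping(dev))
        return -1;
    if (dev->domain_info == NULL)
        return -1;
    return dev->domain_info->id;
}
```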
     
  • Some HP BIOSes report an RMRR region (a region which needs a 1:1 mapping
    in the IOMMU for a given device) which has an end address lower than its
    start address. Detect that and warn, rather than triggering the
    BUG() in dma_pte_clear_range().

    Cc: stable@kernel.org
    Signed-off-by: David Woodhouse

    David Woodhouse
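
    The sanity check described above is a simple comparison. The sketch below
    assumes a hypothetical struct rmrr_region with an inclusive end address;
    it warns and rejects the region instead of passing it on to code that
    would trip over it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical description of a reserved region; 'end' is inclusive. */
struct rmrr_region {
    uint64_t base;
    uint64_t end;
};

/* Returns false (after warning) when the region ends before it starts. */
bool rmrr_region_is_sane(const struct rmrr_region *r)
{
    if (r->end < r->base) {
        fprintf(stderr, "bogus RMRR: end 0x%llx is below base 0x%llx\n",
                (unsigned long long)r->end, (unsigned long long)r->base);
        return false;
    }
    return true;
}
```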
     
  • The BIOS errors where an IOMMU is reported either at zero or a bogus
    address are causing problems even when the IOMMU is disabled -- because
    interrupt remapping uses the same hardware. Ensure that the checks get
    applied for the interrupt remapping initialisation too.

    Cc: stable@kernel.org
    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • Many BIOSes will lie to us about the existence of an IOMMU, and claim
    that there is one at an address which actually returns all 0xFF.

    We need to detect this early, so that we know we don't have a viable
    IOMMU and can set up swiotlb before it's too late.

    Cc: stable@kernel.org
    Signed-off-by: Chris Wright
    Signed-off-by: David Woodhouse

    Chris Wright
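
    One way to detect such a phantom IOMMU is to read its capability
    registers and treat an all-ones result as "nothing there". The sketch
    below works on a plain byte buffer standing in for the mapped register
    window; the helper names are hypothetical, and checking only the
    capability and extended capability registers is an assumption made for
    the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Capability register offsets as given in the VT-d specification. */
#define DMAR_CAP_REG   0x08
#define DMAR_ECAP_REG  0x10

/* Read a 64-bit register from a buffer that models the register window. */
static uint64_t read_reg64(const unsigned char *window, size_t offset)
{
    uint64_t v;
    memcpy(&v, window + offset, sizeof(v));
    return v;
}

/* True when both capability registers read back as all ones. */
bool dmar_window_is_bogus(const unsigned char *window)
{
    uint64_t cap = read_reg64(window, DMAR_CAP_REG);
    uint64_t ecap = read_reg64(window, DMAR_ECAP_REG);

    return cap == UINT64_MAX && ecap == UINT64_MAX;
}
```

    Running the check at probe time is what lets the caller fall back to
    swiotlb early enough.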
     
  • Merge the BIOS workarounds from 2.6.32, and the swiotlb fallback on failure.

    David Woodhouse
     

06 Dec, 2009

1 commit

  • …/git/tip/linux-2.6-tip

    * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
    x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
    x86/amd-iommu: Remove amd_iommu_pd_table
    x86/amd-iommu: Move reset_iommu_command_buffer out of locked code
    x86/amd-iommu: Cleanup DTE flushing code
    x86/amd-iommu: Introduce iommu_flush_device() function
    x86/amd-iommu: Cleanup attach/detach_device code
    x86/amd-iommu: Keep devices per domain in a list
    x86/amd-iommu: Add device bind reference counting
    x86/amd-iommu: Use dev->arch->iommu to store iommu related information
    x86/amd-iommu: Remove support for domain sharing
    x86/amd-iommu: Rearrange dma_ops related functions
    x86/amd-iommu: Move some pte allocation functions in the right section
    x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc
    x86/amd-iommu: Use get_device_id and check_device where appropriate
    x86/amd-iommu: Move find_protection_domain to helper functions
    x86/amd-iommu: Simplify get_device_resources()
    x86/amd-iommu: Let domain_for_device handle aliases
    x86/amd-iommu: Remove iommu specific handling from dma_ops path
    x86/amd-iommu: Remove iommu parameter from __(un)map_single
    x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs
    ...

    Linus Torvalds
     

05 Dec, 2009

21 commits

  • Remove a stray space in pci_save_state().

    Signed-off-by: Kleber Sacilotto de Souza
    Signed-off-by: Jesse Barnes

    Kleber Sacilotto de Souza
     
  • Commit ae21ee65e8bc228416bbcc8a1da01c56a847a60c "PCI: acs p2p upsteram
    forwarding enabling" doesn't actually enable ACS.

    Add a function to pci core to allow an IOMMU to request that ACS
    be enabled. The existing mechanism of using iommu_found() in the pci
    core to know when ACS should be enabled doesn't actually work due to
    initialization order; the IOMMU has only been detected, not initialized.

    Have Intel and AMD IOMMUs request ACS, and Xen does as well during early
    init of dom0.

    Cc: Allen Kay
    Cc: David Woodhouse
    Cc: Jeremy Fitzhardinge
    Cc: Joerg Roedel
    Signed-off-by: Chris Wright
    Signed-off-by: Jesse Barnes

    Chris Wright
     
  • This problem happens when a PCIe root port is removed using a PCI
    logical hotplug operation.

    The immediate cause is that pcie_aspm_exit_link_state() passes a
    pointer to an invalid data structure to pcie_update_aspm_capable().
    When pcie_aspm_exit_link_state() receives a pointer to a root port
    link, it first unconfigures the root port link and frees its data
    structure. At that point there are no links left to configure under
    the root port, and the data structure for the root port link has
    already been freed. So pcie_aspm_exit_link_state() must not call
    pcie_update_aspm_capable() or pcie_config_aspm_path().

    This patch fixes the problem by changing pcie_aspm_exit_link_state()
    not to call pcie_update_aspm_capable() and pcie_config_aspm_path() if
    the specified link is a root port link.

    ------------[ cut here ]------------
    kernel BUG at drivers/pci/pcie/aspm.c:606!
    invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
    last sysfs file: /sys/devices/pci0000:40/0000:40:13.0/remove
    CPU 1
    Modules linked in: shpchp
    Pid: 9345, comm: sysfsd Not tainted 2.6.32-rc5 #98 ProLiant DL785 G6
    RIP: 0010:[] [] pcie_update_aspm_capable+0x15/0xbe
    RSP: 0018:ffff88082a2f5ca0 EFLAGS: 00010202
    RAX: 0000000000000e77 RBX: ffff88182cc3e000 RCX: ffff88082a33d006
    RDX: 0000000000000001 RSI: ffffffff811dff4a RDI: ffff88182cc3e000
    RBP: ffff88082a2f5cc0 R08: ffff88182cc3e000 R09: 0000000000000000
    R10: ffff88182fc00180 R11: ffff88182fc00198 R12: ffff88182cc3e000
    R13: 0000000000000000 R14: ffff88182cc3e000 R15: ffff88082a2f5e20
    FS: 00007f259a64b6f0(0000) GS:ffff880864600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00007feb53f73da0 CR3: 000000102cc94000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process sysfsd (pid: 9345, threadinfo ffff88082a2f4000, task ffff88082a33cf00)
    Stack:
    ffff88182cc3e000 ffff88182cc3e000 0000000000000000 ffff88082a33cf00
    ffff88082a2f5cf0 ffffffff811dff52 ffff88082a2f5cf0 ffff88082c525168
    ffff88402c9fd2f8 ffff88402c9fd2f8 ffff88082a2f5d20 ffffffff811d7db2
    Call Trace:
    [] pcie_aspm_exit_link_state+0xf5/0x11e
    [] pci_stop_bus_device+0x76/0x7e
    [] pci_stop_bus_device+0x2b/0x7e
    [] pci_remove_bus_device+0x15/0xb9
    [] remove_callback+0x29/0x3a
    [] sysfs_schedule_callback_work+0x15/0x6d
    [] worker_thread+0x19d/0x298
    [] ? worker_thread+0x148/0x298
    [] ? sysfs_schedule_callback_work+0x0/0x6d
    [] ? autoremove_wake_function+0x0/0x38
    [] ? worker_thread+0x0/0x298
    [] kthread+0x7d/0x85
    [] child_rip+0xa/0x20
    [] ? restore_args+0x0/0x30
    [] ? kthread+0x0/0x85
    [] ? child_rip+0x0/0x20
    Code: 89 e5 8a 50 48 31 c0 c0 ea 03 83 e2 07 e8 b2 de fe ff c9 48 98 c3 55 48 89 e5 41 56 49 89 fe 41 55 41 54 53 48 83 7f 10 00 74 04 0b eb fe 48 8b 05 da 7d 63 00 4c 8d 60 e8 4c 89 e1 eb 24 4c
    RIP [] pcie_update_aspm_capable+0x15/0xbe
    RSP
    ---[ end trace 6ae0f65bdeab8555 ]---

    Reported-by: Alex Chiang
    Tested-by: Alex Chiang
    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • The pci_cleanup_aer_correct_error_status() function has been
    #if 0'd out since 2.6.25. Time to remove the dead code.

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jesse Barnes

    Andrew Patterson
     
  • The current implementation of pci_cleanup_aer_uncorrect_error_status
    only clears either fatal or non-fatal error status bits depending
    on the state of the I/O channel. This implementation will then often
    leave some bits set after PCI error recovery completes. The uncleared bit
    settings will then be falsely reported the next time an AER interrupt is
    generated for that hierarchy. An easy way to illustrate this issue is to
    use the aer-inject module to simultaneously inject both an uncorrectable
    non-fatal and uncorrectable fatal error. One of the errors will not be
    cleared.

    This patch resolves this issue by unconditionally clearing all bits in
    the AER uncorrectable status register. All settings and corrective action
    strategies are saved and determined before
    pci_cleanup_aer_uncorrect_error_status is called, so this change should not
    affect error handling functionality.

    Signed-off-by: Andrew Patterson
    Signed-off-by: Jesse Barnes

    Andrew Patterson
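
    The difference between the two clearing strategies can be shown with a
    simulated write-1-to-clear status register. This is a sketch under
    assumptions: struct rw1c_reg and both functions are hypothetical models,
    not the AER driver's code, and the severity mask marks which bits count
    as fatal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated write-1-to-clear status register (hypothetical model). */
struct rw1c_reg { uint32_t bits; };

static uint32_t reg_read(const struct rw1c_reg *r) { return r->bits; }

/* Writing a 1 to a bit position clears that bit; zeros have no effect. */
static void reg_write(struct rw1c_reg *r, uint32_t v) { r->bits &= ~v; }

/* Selective clearing: only non-fatal or only fatal bits, by channel state. */
void clear_by_severity(struct rw1c_reg *status, uint32_t severity,
                       bool channel_normal)
{
    uint32_t s = reg_read(status);

    if (channel_normal)
        s &= ~severity;   /* clear only the non-fatal bits */
    else
        s &= severity;    /* clear only the fatal bits */
    if (s)
        reg_write(status, s);
}

/* Unconditional clearing: write back everything that was set. */
void clear_all(struct rw1c_reg *status)
{
    uint32_t s = reg_read(status);

    if (s)
        reg_write(status, s);
}
```

    With one fatal and one non-fatal bit set at the same time, the selective
    variant leaves one of them behind, while the unconditional variant
    leaves the register empty.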
     
  • Remove unnecessary definitions from portdrv.h and use generic
    definitions instead.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Remove the 'port_type' field from struct pcie_port_data, because we can
    get port type information from struct pci_dev. With this change, this
    patch also does the following:

    - Remove struct pcie_port_data because it no longer has any field.
    - Remove portdrv private definitions about port type (PCIE_RC_PORT,
    PCIE_SW_UPSTREAM_PORT and PCIE_SW_DOWNSTREAM_PORT), and use generic
    definitions instead.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Minor cleanups for pcie_port_device_register().

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Add missing service irqs cleanup in the error code path of
    pcie_port_device_register().

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Call pci_enable_device() before initializing service irqs, because
    legacy interrupt is initialized in pci_enable_device() on some
    architectures.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • This patch cleans up the service irqs initialization as follows:

    - Remove 'irq_mode' field in pcie_port_data and related definitions,
    which is not needed because we can get the same information from
    'is_msix', 'is_msi' and 'pin' fields in struct pci_dev.

    - Change the name of the 'vectors' argument of assign_interrupt_mode() to
    'irqs' because it actually holds irq numbers. People might confuse
    it with CPU vectors or MSI/MSI-X vectors.

    - Change the function name assign_interrupt_mode() to init_service_irqs()
    because we no longer have the 'irq_mode' data structure, and the new
    name is more straightforward (IMO).

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Moving the capability check to the beginning of
    pcie_port_device_register() avoids a redundant execution path.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • No reason to check PME capability outside get_port_device_capability().
    Do it in get_port_device_capability().

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • PCIe port type is already stored in 'pcie_type' field of struct
    pci_dev. So we don't need to get it from pci configuration space.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • In the current port bus driver implementation, pcie_device allocation,
    initialization and registration are done in separate functions. Doing
    them in one function makes the code simpler and easier to read.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • We don't need pcie_port_device_probe() because we can get pci
    device/port type using pci_is_pcie() and 'pcie_type' fields in struct
    pci_dev. Remove pcie_port_device_probe().

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Prior to 1f82de10 we always initialized the upper 32bits of the
    prefetchable memory window, regardless of the address range used.
    Now we only touch it for a >32bit address, which means the upper32
    registers remain whatever the BIOS initialized them to.

    It's valid for the BIOS to set the upper32 base/limit to
    0xffffffff/0x00000000, which makes us program prefetchable ranges
    like 0xffffffffabc00000 - 0x00000000abc00000

    Revert the chunk of 1f82de10 that made this conditional so we always
    write the upper32 registers and remove now unused pref_mem64 variable.

    Signed-off-by: Alex Williamson
    Signed-off-by: Jesse Barnes

    Alex Williamson
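
    The failure mode above can be reproduced in a few lines. The sketch
    models the two upper-32-bit registers as a hypothetical struct
    pref_upper32 and compares a conditional write with an unconditional one;
    none of these names come from the PCI core.

```c
#include <stdint.h>

/* Hypothetical model of the upper-32-bit prefetchable window registers. */
struct pref_upper32 {
    uint32_t base;
    uint32_t limit;
};

/* Conditional variant: registers are only written for >32-bit ranges. */
void program_if_64bit(struct pref_upper32 *regs, uint64_t start, uint64_t end)
{
    if ((start >> 32) || (end >> 32)) {
        regs->base = (uint32_t)(start >> 32);
        regs->limit = (uint32_t)(end >> 32);
    }
}

/* Unconditional variant: registers always reflect the programmed range. */
void program_always(struct pref_upper32 *regs, uint64_t start, uint64_t end)
{
    regs->base = (uint32_t)(start >> 32);
    regs->limit = (uint32_t)(end >> 32);
}

/* Full 64-bit base the bridge would decode, given the lower 32 bits. */
uint64_t effective_base(const struct pref_upper32 *regs, uint32_t low)
{
    return ((uint64_t)regs->base << 32) | low;
}
```

    Starting from BIOS values of 0xffffffff/0x00000000, the conditional
    variant leaves the stale upper base in place for a 32-bit range, which
    yields the bogus 0xffffffffabc00000 base quoted in the message.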
     
  • The pcie_flr routine writes the device control register with the FLR bit
    set, clearing all other fields for the FLR duration. Among other fields,
    Max_Payload_Size is also cleared, which can cause errors if there are
    transactions lurking in the HW pipeline. The patch replaces the blank
    write with a read-modify-write of the control register, keeping the other
    fields intact.

    Signed-off-by: Shmulik Ravid
    Signed-off-by: Jesse Barnes

    Shmulik Ravid
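
    The contrast between a blank write and a read-modify-write can be
    sketched on the register value alone. The bit masks follow the PCIe
    Device Control register layout; the two helper functions are
    hypothetical and only compute the value that would be written.

```c
#include <stdint.h>

/* Bit definitions for the PCIe Device Control register. */
#define DEVCTL_PAYLOAD  0x00e0   /* Max_Payload_Size field */
#define DEVCTL_FLR      0x8000   /* Initiate Function Level Reset */

/* Blank write: every field other than the FLR bit ends up as zero. */
uint16_t flr_blank_write(void)
{
    return DEVCTL_FLR;
}

/* Read-modify-write: keep the current fields and set only the FLR bit. */
uint16_t flr_read_modify_write(uint16_t devctl)
{
    return (uint16_t)(devctl | DEVCTL_FLR);
}
```

    For a device whose control register currently holds a non-zero
    Max_Payload_Size, only the second helper preserves that field.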
     
  • So we can catch if the driver sets an incorrect dma_mask.

    Reviewed-by: Grant Grundler
    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     
  • This allows us to find out what DMA mask is used for each PCI device at boot
    time; useful for debugging.

    After the patch:
    ehci_hcd 0000:00:02.1: using 31bit consistent DMA mask
    e1000 0000:0b:01.0: using 64bit DMA mask
    e1000 0000:0b:01.0: using 64bit consistent DMA mask
    e1000e 0000:04:00.0: using 64bit DMA mask
    e1000e 0000:04:00.0: using 64bit consistent DMA mask
    ixgb 0000:0c:01.0: using 64bit DMA mask
    ixgb 0000:0c:01.0: using 64bit consistent DMA mask
    aacraid 0000:86:00.0: using 32bit DMA mask
    aacraid 0000:86:00.0: using 32bit consistent DMA mask
    aacraid 0000:86:00.0: using 64bit DMA mask
    aacraid 0000:86:00.0: using 64bit consistent DMA mask
    qla2xxx 0000:0c:02.0: using 64bit consistent DMA mask
    qla2xxx 0000:0c:02.1: using 64bit consistent DMA mask
    lpfc 0000:06:00.0: using 64bit DMA mask
    lpfc 0000:06:00.1: using 64bit DMA mask
    pata_amd 0000:00:06.0: using 32bit DMA mask
    pata_amd 0000:00:06.0: using 32bit consistent DMA mask
    mptsas 0000:0c:04.0: using 64bit DMA mask
    mptsas 0000:0c:04.0: using 64bit consistent DMA mask

    forcedeth 0000:00:08.0: using 39bit DMA mask
    forcedeth 0000:00:08.0: using 39bit consistent DMA mask
    niu 0000:02:00.0: using 44bit DMA mask
    niu 0000:02:00.0: using 44bit consistent DMA mask
    sata_nv 0000:00:05.0: using 32bit DMA mask
    sata_nv 0000:00:05.0: using 32bit consistent DMA mask
    ib_mthca 0000:03:00.0: using 64bit DMA mask
    ib_mthca 0000:03:00.0: using 64bit consistent DMA mask

    Reviewed-by: Grant Grundler
    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     
  • If we stop the kthread, we may end up up'ing the sem twice, which seems
    unintended.

    Reported-by: Dan Carpenter
    Signed-off-by: Jesse Barnes

    Jesse Barnes
     

25 Nov, 2009

6 commits

  • The existing interface only has a pre-order callback. This change
    adds an additional parameter for a post-order callback which will
    be more useful for bus scans. ACPICA BZ 779.

    Also update the external calls to acpi_walk_namespace.

    http://www.acpica.org/bugzilla/show_bug.cgi?id=779

    Signed-off-by: Lin Ming
    Signed-off-by: Bob Moore
    Signed-off-by: Len Brown

    Lin Ming
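
    The idea of a walk with both a pre-order and a post-order callback can
    be sketched on a generic tree. Everything below (struct node,
    walk_tree, the trace helpers) is hypothetical and is not the ACPICA
    interface; it only shows when each callback fires.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tree node: first child plus next sibling. */
struct node {
    const char *name;
    struct node *child;
    struct node *sibling;
};

typedef void (*visit_fn)(const struct node *n, void *ctx);

/* Call 'pre' before descending into a node and 'post' after its subtree. */
void walk_tree(const struct node *n, visit_fn pre, visit_fn post, void *ctx)
{
    for (; n != NULL; n = n->sibling) {
        if (pre)
            pre(n, ctx);
        walk_tree(n->child, pre, post, ctx);
        if (post)
            post(n, ctx);
    }
}

/* Small recorder used to observe the visiting order. */
struct trace {
    char buf[64];
    size_t len;
};

static void trace_append(struct trace *t, char tag, const char *name)
{
    int n = snprintf(t->buf + t->len, sizeof(t->buf) - t->len,
                     "%c%s", tag, name);

    if (n > 0 && (size_t)n < sizeof(t->buf) - t->len)
        t->len += (size_t)n;
}

void trace_enter(const struct node *n, void *ctx)
{
    trace_append((struct trace *)ctx, '+', n->name);
}

void trace_leave(const struct node *n, void *ctx)
{
    trace_append((struct trace *)ctx, '-', n->name);
}
```

    The post-order callback runs only after everything below a node has
    been visited, so a caller can finish per-node work once the whole
    subtree is known.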
     
  • Enabling power fault detected event notification in current pciehp
    might cause a power fault interrupt storm on some machines. On those
    machines, the power fault detected bit in the slot status register was
    set again immediately after it was cleared in the interrupt service
    routine, and the next power fault detected interrupt was raised again.
    Therefore, disable power fault detected event notification for now.

    This patch also removes unnecessary handling for power fault cleared
    event because this event is not supported by PCIe spec.

    Tested-by: Jens Axboe
    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Change for PCI hotplug to use pci_is_pcie() instead of checking
    pci_dev->is_pcie.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Changes for PCIe AER driver to use pci_is_pcie() instead of checking
    pci_dev->is_pcie.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Change for PCIe ASPM driver to use pci_is_pcie() instead of checking
    pci_dev->is_pcie.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • Change for PCI core to use pci_is_pcie() instead of checking
    pci_dev->is_pcie.

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige