12 Sep, 2009

2 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (25 commits)
    pata_rz1000: use printk_once
    ahci: kill @force_restart and refine CLO for ahci_kick_engine()
    pata_cs5535: add pci id for AMD based CS5535 controllers
    ahci: Add AMD SB900 SATA/IDE controller device IDs
    drivers/ata: use resource_size
    sata_fsl: Defer non-ncq commands when ncq commands active
    libata: add SATA PMP revision information for spec 1.2
    libata: fix off-by-one error in ata_tf_read_block()
    ahci: Gigabyte GA-MA69VM-S2 can't do 64bit DMA
    ahci: make ahci_asus_m2a_vm_32bit_only() quirk more generic
    dmi: extend dmi_get_year() to dmi_get_date()
    dmi: fix date handling in dmi_get_year()
    libata: unbreak TPM filtering by reorganizing ata_scsi_pass_thru()
    sata_sis: convert to slave_link
    sata_sil24: always set protocol override for non-ATAPI data commands
    libata: Export AHCI capabilities
    libata: Delegate nonrot flag setting to SCSI
    [libata] Add pata_rdc driver for RDC ATA devices
    drivers/ata: Remove unnecessary semicolons
    libata: remove spindown skipping and warning
    ...

    Linus Torvalds
     
  • * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    pci/intr_remapping: Allocate irq_iommu on node
    irq: Add irq_node() primitive
    irq: Make sure irq_desc for legacy irq get correct node setting
    genirq: Add prototype for handle_nested_irq()
    irq: Remove superfluous NULL pointer check in check_irq_resend()
    irq: Clean up by removing irqfixup MODULE_PARM_DESC()
    genirq: Fix comment describing suspend_device_irqs()
    genirq: Remove obsolete defines and typedefs

    Linus Torvalds
     

11 Sep, 2009

1 commit


30 Aug, 2009

1 commit

  • An SR-IOV capable device includes an SR-IOV PCIe capability which
    describes the Virtual Function (VF) BAR requirements. A typical SR-IOV
    device can support multiple VFs whose BARs must be in a contiguous region,
    effectively an array of VF BARs. The BAR reports the size requirement
    for a single VF. We calculate the full range needed by simply multiplying
    the VF BAR size by the number of possible VFs and create a resource
    spanning the full range.

    This all seems sane enough except it artificially inflates the alignment
    requirement for the VF BAR. The VF BAR need only be aligned to the size
    of a single BAR, not to the contiguous range of VF BARs. This can cause us
    to fail to allocate resources for the BAR despite the fact that we
    actually have enough space.

    This patch adds a thin PCI specific layer over the generic
    resource_alignment() function which is aware of the special nature of
    VF BARs and does sorting and allocation based on the smaller alignment
    requirement.

    I recognize that while resource_alignment is generic, it's basically a
    PCI helper. An alternative to this patch is to add PCI VF BAR specific
    information to struct resource. I opted for the extra layer rather than
    adding such PCI specific information to struct resource. This does
    have the slight downside that we don't cache the BAR size and re-read
    for each alignment query (happens a small handful of times during boot
    for each VF BAR).
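
The alignment problem described above can be sketched in plain C. This is only an illustration under stated assumptions: struct vf_bar, vf_bar_span(), vf_bar_alignment() and fits() are hypothetical names, not the kernel's struct resource or resource_alignment(), and the window is a made-up address range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a VF BAR; not the kernel's struct resource. */
struct vf_bar {
    uint64_t single_size; /* size the BAR reports for one VF (power of two) */
    unsigned total_vfs;   /* number of possible VFs */
};

/* Size of the contiguous region holding every VF's copy of the BAR. */
static uint64_t vf_bar_span(const struct vf_bar *bar)
{
    return bar->single_size * bar->total_vfs;
}

/* Alignment actually required: that of one VF BAR, not of the whole span. */
static uint64_t vf_bar_alignment(const struct vf_bar *bar)
{
    assert(bar->single_size != 0);
    assert((bar->single_size & (bar->single_size - 1)) == 0);
    return bar->single_size;
}

/* Does `span` fit in [win_start, win_end] once its base is aligned to `align`?
 * `align` must be a power of two. */
static int fits(uint64_t win_start, uint64_t win_end,
                uint64_t span, uint64_t align)
{
    uint64_t base = (win_start + align - 1) & ~(align - 1);
    return base + span - 1 <= win_end;
}
```

With eight 16 KiB VF BARs and a 128 KiB window starting at 0x10000, aligning to the full span pushes the base past the window, while aligning to a single VF BAR lets the allocation succeed.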

    Signed-off-by: Chris Wright
    Cc: Ivan Kokshaysky
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Yu Zhao
    Cc: stable@kernel.org
    Signed-off-by: Jesse Barnes

    Chris Wright
     

29 Aug, 2009

1 commit


21 Aug, 2009

1 commit

  • Without the check, the config space may be filled with zeros. Although
    drivers should avoid calling restore before save, the PCI layer should
    also check for this.

    Also removes the existing check in pci_restore_standard_config, since
    it's superfluous with the new check in restore_state.
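
A minimal sketch of the kind of check described above, assuming a "state was saved" flag on the device. struct fake_dev and both functions are hypothetical models for illustration, not the PCI core's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS 16

/* Hypothetical model of a device; field names are illustrative only. */
struct fake_dev {
    uint32_t config[CFG_DWORDS];       /* live config space */
    uint32_t saved_config[CFG_DWORDS]; /* snapshot, all zeros until saved */
    bool state_saved;
};

static void fake_save_state(struct fake_dev *dev)
{
    assert(dev != NULL);
    memcpy(dev->saved_config, dev->config, sizeof(dev->config));
    dev->state_saved = true;
}

/* Returns true if the config space was rewritten from the snapshot. */
static bool fake_restore_state(struct fake_dev *dev)
{
    assert(dev != NULL);
    if (!dev->state_saved)
        return false; /* nothing saved: do not overwrite with zeros */
    memcpy(dev->config, dev->saved_config, sizeof(dev->config));
    return true;
}
```

Without the early return, a restore issued before any save would copy the zero-initialized snapshot over the live configuration.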

    Acked-by: Rafael J. Wysocki
    Signed-off-by: Alek Du
    Signed-off-by: Jesse Barnes

    Alek Du
     

11 Aug, 2009

1 commit


08 Aug, 2009

2 commits

  • With the PCI slot changes, the callbacks of attributes under the slot
    directory (/sys/bus/pci/slots) were changed to take a pointer to struct
    pci_slot instead of struct hotplug_slot. So path_show(), which
    assumes its parameter is a pointer to struct hotplug_slot, seems to be
    broken.

    Tested-by: Mike Habeck
    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     
  • The commit bd3d99c17039fd05a29587db3f4a180c48da115a ("PCI: Remove
    untested Electromechanical Interlock (EMI) support in pciehp."), which
    removes the definition of "struct hotplug_slot_attr", broke the SGI
    hotplug driver. Since that commit, we get the following compile errors.

    drivers/pci/hotplug/sgi_hotplug.c:106: error: variable 'sn_slot_path_attr' has initializer but incomplete type
    drivers/pci/hotplug/sgi_hotplug.c:106: error: unknown field 'attr' specified in initializer
    drivers/pci/hotplug/sgi_hotplug.c:106: error: extra brace group at end of initializer
    drivers/pci/hotplug/sgi_hotplug.c:106: error: (near initialization for 'sn_slot_path_attr')
    drivers/pci/hotplug/sgi_hotplug.c:106: warning: excess elements in struct initializer
    drivers/pci/hotplug/sgi_hotplug.c:106: warning: (near initialization for 'sn_slot_path_attr')
    drivers/pci/hotplug/sgi_hotplug.c:106: error: unknown field 'show' specified in initializer
    drivers/pci/hotplug/sgi_hotplug.c:106: warning: excess elements in struct initializer
    drivers/pci/hotplug/sgi_hotplug.c:106: warning: (near initialization for 'sn_slot_path_attr')
    drivers/pci/hotplug/sgi_hotplug.c: In function 'sn_hp_destroy':
    drivers/pci/hotplug/sgi_hotplug.c:203: error: invalid use of undefined type 'struct hotplug_slot_attribute'
    drivers/pci/hotplug/sgi_hotplug.c: In function 'sn_hotplug_slot_register':
    drivers/pci/hotplug/sgi_hotplug.c:655: error: invalid use of undefined type 'struct hotplug_slot_attribute'

    This patch fixes this regression by adding the definition of struct
    hotplug_slot_attr into sgi_hotplug.c.

    Tested-by: Mike Habeck
    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
     

06 Aug, 2009

1 commit

  • Two defects together cause KVM device passthrough to fail at random:
    1. iommu_snooping is not initialized to zero when vm_iommu_init() is called,
    so it can hold a random value.
    2. One line added by commit 2c2e2c38 ("IOMMU Identity Mapping Support")
    changes the code path so that it bypasses domain_update_iommu_cap() and
    also misses the increment of the domain iommu reference count.

    The latter is also likely to cause a leak of domains on repeated VMM
    assignment and deassignment.

    Signed-off-by: Sheng Yang
    Signed-off-by: David Woodhouse

    Sheng Yang
     

05 Aug, 2009

2 commits

  • The physical address passed to domain_pfn_mapping() should be rounded
    down to the start of the MM page, not the VT-d page.

    This issue causes a kernel panic on platforms where
    PAGE_SIZE > VTD_PAGE_SIZE, e.g. ia64.
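
The difference between the two roundings can be sketched as follows. The 16 KiB MM page size is an assumption chosen for illustration (one of several ia64 configurations), and round_down_to() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12 /* VT-d pages are 4 KiB */
#define MM_PAGE_SHIFT  14 /* assumed: 16 KiB MM pages */

/* Round a physical address down to the start of a 2^shift-byte page. */
static uint64_t round_down_to(uint64_t paddr, unsigned shift)
{
    assert(shift < 64);
    return paddr & ~((UINT64_C(1) << shift) - 1);
}
```

For a physical address of 0x12345, rounding to the VT-d page gives 0x12000, while rounding to the MM page gives 0x10000. Only the latter is the start of the MM page that contains the address.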

    Signed-off-by: Fenghua Yu
    Signed-off-by: David Woodhouse

    Fenghua Yu
     
  • In domain_sg_mapping(), use aligned_nrpages() instead of hand-coded
    rounding code for calculating the size of each sg elem. This means that
    on IA64 we correctly round up to the MM page size, not just to the VT-d
    page size.

    Also remove the incorrect mm_to_dma_pfn() when intel_map_sg() calls
    domain_sg_mapping() -- the 'size' variable is in VT-d pages already.
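
A rough userspace sketch of the page-count calculation, assuming 16 KiB MM pages. The function names here are hypothetical; the first mirrors the idea behind aligned_nrpages(), the second the hand-coded rounding it replaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12                          /* 4 KiB VT-d pages */
#define VTD_PAGE_SIZE  (UINT64_C(1) << VTD_PAGE_SHIFT)
#define MM_PAGE_SHIFT  14                          /* assumed: 16 KiB MM pages */
#define MM_PAGE_SIZE   (UINT64_C(1) << MM_PAGE_SHIFT)

/* VT-d pages needed for `size` bytes at `host_addr`, rounded up to MM pages. */
static uint64_t nrpages_mm_aligned(uint64_t host_addr, size_t size)
{
    assert(size > 0);
    uint64_t offset = host_addr & (MM_PAGE_SIZE - 1);
    uint64_t bytes = (offset + size + MM_PAGE_SIZE - 1) & ~(MM_PAGE_SIZE - 1);
    return bytes >> VTD_PAGE_SHIFT;
}

/* Same calculation, but rounding only to the VT-d page size. */
static uint64_t nrpages_vtd_aligned(uint64_t host_addr, size_t size)
{
    assert(size > 0);
    uint64_t offset = host_addr & (VTD_PAGE_SIZE - 1);
    uint64_t bytes = (offset + size + VTD_PAGE_SIZE - 1) & ~(VTD_PAGE_SIZE - 1);
    return bytes >> VTD_PAGE_SHIFT;
}
```

The result is already a count of VT-d pages, which is why converting it again with an MM-to-DMA pfn helper would be wrong.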

    Signed-off-by: Fenghua Yu
    Signed-off-by: David Woodhouse

    Fenghua Yu
     

03 Aug, 2009

1 commit

  • This function has traditionally used "insert_resource()", because before
    commit cebd78a8c5 ("Fix pci_claim_resource") it used to just insert the
    resource into whatever root resource tree that was indicated by
    "pcibios_select_root()".

    So there Matthew fixed it to actually look up the proper parent
    resource, which means that now it's actively wrong to then traverse the
    resource tree any more: we already know exactly where the new resource
    should go.

    And when we then did commit a76117dfd6 ("x86: Use pci_claim_resource"),
    which changed the x86 PCI code from the open-coded

    pr = pci_find_parent_resource(dev, r);
    if (!pr || request_resource(pr, r) < 0) {

    to using

    if (pci_claim_resource(dev, idx) < 0) {

    that "insert_resource()" now suddenly became a problem, and causes a
    regression covered by

    http://bugzilla.kernel.org/show_bug.cgi?id=13891

    which this fixes.

    Reported-and-tested-by: Rafael J. Wysocki
    Cc: Matthew Wilcox
    Cc: Andrew Patterson
    Cc: Linux PCI
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Jul, 2009

1 commit

  • * Remove smp_lock.h from files which don't need it (including some headers!)
    * Add smp_lock.h to files which do need it
    * Make smp_lock.h include conditional in hardirq.h
    It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

    This will make hardirq.h inclusion cheaper for every PREEMPT=n config
    (which includes allmodconfig/allyesconfig, BTW)

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 Jul, 2009

1 commit

  • After an API change, intel_iommu_unmap_range() started to assume that its
    size parameter is nonzero; otherwise dma_pte_clear_range() is passed an
    overflowed argument. Users such as KVM did not make this assumption
    before, so a BUG() was triggered.

    Fix it by ignoring size = 0.
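
The overflow comes from computing the last page frame as (iova + size - 1), which wraps when size is zero. A minimal sketch of the guard, using hypothetical helper names rather than the driver's own functions:

```c
#include <assert.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12

/* Last page frame touched by [iova, iova + size); only valid for size > 0. */
static uint64_t last_pfn(uint64_t iova, uint64_t size)
{
    assert(size > 0); /* size == 0 would make iova + size - 1 wrap around */
    return (iova + size - 1) >> VTD_PAGE_SHIFT;
}

/* Number of pages an unmap request covers; an empty request covers none. */
static uint64_t unmap_range_pages(uint64_t iova, uint64_t size)
{
    if (size == 0)
        return 0; /* ignore empty requests instead of computing a bogus range */
    return last_pfn(iova, size) - (iova >> VTD_PAGE_SHIFT) + 1;
}
```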

    Signed-off-by: Sheng Yang
    Signed-off-by: David Woodhouse
    Signed-off-by: Linus Torvalds

    Sheng Yang
     

07 Jul, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: Fix IRQ swizzling for ARI-enabled devices
    ia64/PCI: adjust section annotation for pcibios_setup()
    x86/PCI: get root CRS before scanning children
    x86/PCI: fix boundary checking when using root CRS
    PCI MSI: Fix restoration of MSI/MSI-X mask states in suspend/resume
    PCI MSI: Unmask MSI if setup failed
    PCI MSI: shorten PCI_MSIX_ENTRY_* symbol names
    PCI: make pci_name() take const argument
    PCI: More PATA quirks for not entering D3
    PCI: fix kernel-doc warnings
    PCI: check if bus has a proper bridge device before triggering SBR
    PCI: remove pci_dac_dma_... APIs on mn10300
    PCI ECRC: Remove unnecessary semicolons
    PCI MSI: Return if alloc_msi_entry for MSI-X failed

    Linus Torvalds
     

05 Jul, 2009

2 commits

  • Our current strategy for pass-through mode is to put all devices into
    the 1:1 domain at startup (which is before we know what their dma_mask
    will be), and only _later_ take them out of that domain, if it turns out
    that they really can't address all of memory.

    However, when there are a bunch of PCI devices behind a bridge, they all
    end up with the same source-id on their DMA transactions, and hence in
    the same IOMMU domain. This means that we _can't_ easily move them from
    the 1:1 domain into their own domain at runtime, because there might be DMA
    in-flight from their siblings.

    So we have to adjust our pass-through strategy: For PCI devices not on
    the root bus, and for the bridges which will take responsibility for
    their transactions, we have to start up _out_ of the 1:1 domain, just in
    case.

    This fixes the BUG() we see when we have 32-bit-capable devices behind a
    PCI-PCI bridge, and use the software identity mapping.

    It does mean that we might end up using 'normal' mapping mode for some
    devices which could actually live with the faster 1:1 mapping -- but
    this is only for PCI devices behind bridges, which presumably aren't the
    devices for which people are most concerned about performance.
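
The adjusted strategy can be summarised as a small decision function. This is a sketch only: struct fake_pci_dev and its fields are hypothetical, and the 32-bit threshold for "can't address all of memory" is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; field names are illustrative only. */
struct fake_pci_dev {
    bool on_root_bus;      /* device sits directly on the root bus */
    bool forwards_pci_dma; /* bridge whose source-id covers devices behind it */
    uint64_t dma_mask;     /* not yet meaningful at startup */
};

/* Should this device live in the 1:1 (identity) domain? */
static bool should_identity_map(const struct fake_pci_dev *dev, bool startup)
{
    assert(dev != NULL);
    /* Devices sharing a source-id cannot be moved out later, so keep them
     * out of the 1:1 domain from the start. */
    if (!dev->on_root_bus || dev->forwards_pci_dma)
        return false;
    /* At startup the dma_mask is unknown: assume 64-bit capable. */
    if (startup)
        return true;
    /* Later: stay in the 1:1 domain only if more than 32 bits are addressable. */
    return dev->dma_mask > UINT32_MAX;
}
```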

    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • At boot time, the dma_mask won't have been set on any devices, so we
    assume that all devices will be 64-bit capable (and thus get a 1:1 map).

    Signed-off-by: David Woodhouse

    David Woodhouse
     

04 Jul, 2009

6 commits


03 Jul, 2009

1 commit

  • * git://git.infradead.org/iommu-2.6: (38 commits)
    intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()
    intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loops
    intel-iommu: Use cmpxchg64_local() for setting PTEs
    intel-iommu: Warn about unmatched unmap requests
    intel-iommu: Kill superfluous mapping_lock
    intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386
    intel-iommu: Make iommu=pt work on i386 too
    intel-iommu: Performance improvement for dma_pte_free_pagetable()
    intel-iommu: Don't free too much in dma_pte_free_pagetable()
    intel-iommu: dump mappings but don't die on pte already set
    intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()
    intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()
    intel-iommu: Simplify __intel_alloc_iova()
    intel-iommu: Performance improvement for domain_pfn_mapping()
    intel-iommu: Performance improvement for dma_pte_clear_range()
    intel-iommu: Clean up iommu_domain_identity_map()
    intel-iommu: Remove last use of PHYSICAL_PAGE_MASK, for reserving PCI BARs
    intel-iommu: Make iommu_flush_iotlb_psi() take pfn as argument
    intel-iommu: Change aligned_size() to aligned_nrpages()
    intel-iommu: Clean up intel_map_sg(), remove domain_page_mapping()
    ...

    Linus Torvalds
     

02 Jul, 2009

8 commits


30 Jun, 2009

7 commits

  • As with other functions, batch the CPU data cache flushes and don't keep
    recalculating PTE addresses.

    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • The loop condition was wrong -- we should free a PMD only if its
    _entire_ range is within the range we're intending to clear. The
    early-termination condition was right, but not the loop.
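
The corrected loop condition can be illustrated by counting how many table pages lie entirely inside a pfn range. The 512-entry stride and the function name are assumptions for this sketch, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define LEVEL_STRIDE 9 /* assumed: 512 entries per page-table page */
#define BLOCK_PFNS   (UINT64_C(1) << LEVEL_STRIDE)

/* Count table pages whose whole pfn range lies within [start_pfn, last_pfn]. */
static unsigned count_freeable_tables(uint64_t start_pfn, uint64_t last_pfn)
{
    assert(start_pfn <= last_pfn);
    unsigned n = 0;
    /* First block boundary at or above start_pfn. */
    uint64_t blk = (start_pfn + BLOCK_PFNS - 1) & ~(BLOCK_PFNS - 1);
    /* Continue only while the block's last pfn is still inside the range. */
    while (blk + BLOCK_PFNS - 1 <= last_pfn) {
        n++;
        blk += BLOCK_PFNS;
    }
    return n;
}
```

A range that only partly overlaps a block, such as pfns 100 to 600, frees nothing, because neither the block starting at 0 nor the one starting at 512 is fully covered.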

    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • Signed-off-by: David Woodhouse

    David Woodhouse
     
  • Signed-off-by: David Woodhouse

    David Woodhouse
     
  • Instead of calling domain_pfn_mapping() repeatedly with single or
    small numbers of pages, just pass the sglist in. It can optimise the
    number of cache flushes like domain_pfn_mapping() does, and gives a huge
    speedup for large scatterlists.

    Signed-off-by: David Woodhouse

    David Woodhouse
     
  • There are two problems with mask states in suspend/resume.

    [1]:
    It is better to restore the mask states of MSI/MSI-X to their initial
    states (MSI is unmasked, MSI-X is masked) when we release the device.
    pci_msi_shutdown() restores the mask states for MSI, while
    msi_free_irqs() does it for MSI-X. In other words, in the "disable"
    path both MSI and MSI-X are handled, but in the "shutdown" path only
    MSI is handled.

    MSI:
      pci_disable_msi()
        => pci_msi_shutdown()
             [ mask states for MSI restored ]
             => msi_set_enable(dev, pos, 0);
        => msi_free_irqs()

    MSI-X:
      pci_disable_msix()
        => pci_msix_shutdown()
             => msix_set_enable(dev, 0);
        => msix_free_all_irqs
             => msi_free_irqs()
                  [ mask states for MSI-X restored ]

    This patch moves the masking for MSI-X from msi_free_irqs() to
    pci_msix_shutdown().

    This change has some positive side effects:
    - It prevents the OS from touching mask states before reading the
    preserved bits in the register, which can happen if msi_free_irqs() is
    called from the error path in msix_capability_init().
    - It also prevents touching the register after turning off MSI-X in the
    "disable" path, which can be a problem on some devices.

    [2]:
    We keep a cache of the mask state in msi_desc, which is automatically
    updated when msi/msix_mask_irq() is called. These cached states are
    used on resume.

    But since what needs to be restored on resume is the state from before
    the shutdown on suspend, calling msi/msix_mask_irq() from
    pci_msi/msix_shutdown() is not appropriate.

    This patch introduces __msi/msix_mask_irq(), which masks in the same
    way as msi/msix_mask_irq() but does not update the cached state, for
    use in pci_msi/msix_shutdown().
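
The split between a cache-updating and a non-updating mask helper can be sketched like this. struct fake_msi_desc and all function names are hypothetical models of the idea, not the actual msi.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an MSI descriptor and its mask register. */
struct fake_msi_desc {
    uint32_t hw_mask;     /* stands in for the device's mask register */
    uint32_t cached_mask; /* software copy replayed on resume */
};

/* Write the mask bits without touching the cache; returns the new value. */
static uint32_t raw_mask_irq(struct fake_msi_desc *d, uint32_t mask, uint32_t flag)
{
    assert((flag & ~mask) == 0); /* only bits selected by `mask` may be set */
    d->hw_mask = (d->hw_mask & ~mask) | flag;
    return d->hw_mask;
}

/* Normal path: update the register and remember the result for resume. */
static void mask_irq(struct fake_msi_desc *d, uint32_t mask, uint32_t flag)
{
    d->cached_mask = raw_mask_irq(d, mask, flag);
}

/* Shutdown path: put the register back to unmasked, leave the cache alone. */
static void shutdown_restore_initial(struct fake_msi_desc *d)
{
    raw_mask_irq(d, 1, 0);
}

/* Resume path: replay the state from before shutdown. */
static void resume_restore(struct fake_msi_desc *d)
{
    d->hw_mask = d->cached_mask;
}
```

Because the shutdown helper bypasses the cache, a vector that was masked before suspend is masked again after resume, even though shutdown cleared the register in between.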

    [updated: get rid of msi/msix_mask_irq_nocache() (proposed by Matthew Wilcox)]

    Reviewed-by: Matthew Wilcox
    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Jesse Barnes

    Hidetoshi Seto
     
  • The initial state of the MSI mask register is unmasked. We set it to
    masked before calling arch_setup_msi_irqs(). If arch_setup_msi_irqs()
    fails, it is better to restore the state of the mask register.

    Reviewed-by: Matthew Wilcox
    Signed-off-by: Hidetoshi Seto
    Signed-off-by: Jesse Barnes

    Hidetoshi Seto