13 Feb, 2014

2 commits

  • - add a simple imx pcie ep skeleton driver to demonstrate
    the msi trigger capability in the imx6 pcie rc/ep validation
    system
    - to avoid modifying common code,
    force the msi address to be 0x01ff8000

    Test howto:
    - Set CONFIG_PCI_MSI=y when rebuilding the rc/ep images

    - EP side(console command and kernel message):
    root@sabresd_6dq:/ # memtool 0x1ff8000=0
    Writing 32-bit value 0x0 to address 0x01FF8000
    root@sabresd_6dq:/ #

    - RC side(console command and kernel message):
    root@sabresd_6dq:/ # cat /proc/interrupts | grep MSI
    384: 1 0 0 0 PCI-MSI

    - EP side(console command and kernel message):
    root@sabresd_6dq:/ # memtool 0x1ff8000=0
    Writing 32-bit value 0x0 to address 0x01FF8000

    - RC side(console command and kernel message):
    root@sabresd_6dq:/ # cat /proc/interrupts | grep MSI
    384: 2 0 0 0 PCI-MSI

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • - set up one new outbound memory region on the rc side, so
    that the imx6 pcie rc can access the memory of the imx6 pcie ep
    in the imx6 pcie rc/ep validation system.
    - set the default address of the ddr memory to be 0x4000_0000

    NOTE:
    - default address 0x4000_0000 of ep side would be
    accessed in this demo.
    Test howto:
    step1:
    EP side:
    1.1:
    echo 0x40000000 > /sys/devices/soc0/soc.1/1ffc000.pcie/ep_bar0_addr

    1.2:
    memtool -32 0x40000000 4
    E
    Reading 0x4 count starting at address 0x40000000

    0x40000000: 6FE9E9F6 7583FBB9 39EAEFEA FBDCFD78

    step2:
    RC side:
    memtool -32 0x01000000=58D454DA
    memtool -32 0x01000004=7332095B

    step3:
    EP side:
    memtool -32 0x40000000 4
    E
    Reading 0x4 count starting at address 0x40000000

    0x40000000: 58D454DA 7332095B 39EAEFEA FBDCFD78
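
    The test above implies a linear address translation: an RC write to
    CPU address 0x01000000 lands at the EP address programmed through
    ep_bar0_addr (0x40000000). A minimal Python model of that offset
    arithmetic follows. The window size and the function name are
    assumptions made for illustration and are not taken from the driver.

```python
RC_WINDOW_BASE = 0x01000000   # RC-side CPU address written in step2
EP_BAR0_TARGET = 0x40000000   # value written to ep_bar0_addr in step 1.1
WINDOW_SIZE = 0x00100000      # assumed window size, for illustration only


def rc_to_ep_addr(rc_addr):
    """Map an RC CPU address inside the outbound window to the EP DDR address."""
    offset = rc_addr - RC_WINDOW_BASE
    if not 0 <= offset < WINDOW_SIZE:
        raise ValueError("address 0x%08X is outside the outbound window" % rc_addr)
    return EP_BAR0_TARGET + offset
```

    With these values, the two writes in step2 map to the first two
    words shown at 0x40000000 in step3.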

    Signed-off-by: Richard Zhu

    Richard Zhu
     

28 Nov, 2013

1 commit


20 Nov, 2013

20 commits

  • HW setup:
    * Two i.MX6Q SD boards, one is used as PCIe RC, the other
    is used as PCIe EP. Connected by 2*mini_PCIe to standard_PCIe
    adaptors, 2*PEX cable adaptors, One PCIe cable.

    SW setup:
    * When building the RC image, make sure that
    CONFIG_IMX_PCIE=y
    # CONFIG_EP_MODE_IN_EP_RC_SYS is not set
    # CONFIG_EP_SELF_IO_TEST is not set
    CONFIG_RC_MODE_IN_EP_RC_SYS=y
    * When building the EP image (enable CONFIG_EP_SELF_IO_TEST if you want the EP to do the self IO test):
    CONFIG_EP_MODE_IN_EP_RC_SYS=y
    CONFIG_EP_SELF_IO_TEST=y
    # CONFIG_RC_MODE_IN_EP_RC_SYS is not set

    Features:
    * Set up the link between RC and EP, each using its own
    internally generated 125MHz clock.
    * In EP's system, EP can access the reserved ddr memory
    (default address:0x40000000) of PCIe RC's system, by the
    interconnection between PCIe EP and PCIe RC.
    * add the configuration methods in the EP side, used to
    configure the start address and the size of the reserved
    RC's memory window.
    - cat /sys/devices/soc0/soc.1/1ffc000.pcie/rc_memw_info
    - echo 0x41000000 > /sys/devices/soc0/soc.1/1ffc000.pcie/rc_memw_start_set
    - echo 0x800000 > /sys/devices/soc0/soc.1/1ffc000.pcie/rc_memw_size_set
    * provide one example of how to configure the bar# and so on when
    the pcie ep emulates a memory ram ep device

    Throughput:
    * To Be fine-tuned.

    NOTE:
    * boot the EP platform first, then boot the RC platform.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • Fix the issue where the pcie switch is not detected
    Root causes why the switch could not be detected before:
    * The initialization sequence was incorrect: the 100ms reset
    should be issued just before ltssm is enabled.
    * The legacy INTx mapping was wrong.
    * Remove the incorrect IO/MEM iATU outbound mapping.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • enable pcie msi support on imx6 platforms
    * add check_device api in the msi chip.
    * add the quirks into pcie_port struct for the deviation
    from standard routines.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • The new struct msi_chip is used to associate an MSI controller with a
    PCI bus. It is automatically handed down from the root to its children
    during bus enumeration.

    This patch provides default (weak) implementations for the architecture-
    specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq()
    and arch_msi_check_device()) which check if a PCI device's bus has an
    attached MSI chip and forward the call appropriately.

    Signed-off-by: Thierry Reding
    Signed-off-by: Thomas Petazzoni
    Acked-by: Bjorn Helgaas
    Tested-by: Daniel Price
    Tested-by: Thierry Reding
    Signed-off-by: Jason Cooper

    Thierry Reding
     
  • Until now, the MSI architecture-specific functions could be overloaded
    using a fairly complex set of #define and compile-time
    conditionals. In order to prepare for the introduction of the msi_chip
    infrastructure, it is desirable to switch all those functions to use
    the 'weak' mechanism. This commit converts all the architectures that
    were overriding those MSI functions to use the new strategy.

    Note that we keep two separate, non-weak, functions
    default_teardown_msi_irqs() and default_restore_msi_irqs() for the
    default behavior of the arch_teardown_msi_irqs() and
    arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
    code.

    Signed-off-by: Thomas Petazzoni
    Acked-by: Bjorn Helgaas
    Acked-by: Benjamin Herrenschmidt
    Tested-by: Daniel Price
    Tested-by: Thierry Reding
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: linux-s390@vger.kernel.org
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: x86@kernel.org
    Cc: Russell King
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: linux-ia64@vger.kernel.org
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: David S. Miller
    Cc: sparclinux@vger.kernel.org
    Cc: Chris Metcalf
    Signed-off-by: Jason Cooper

    Thomas Petazzoni
     
  • Because of the encoding of the "Multiple Message Capable" and "Multiple
    Message Enable" fields, a device can only advertise that it's capable of a
    power-of-two number of vectors, and the OS can only enable a power-of-two
    number.

    For example, a device that's limited internally to using 18 vectors would
    have to advertise that it's capable of 32. The 14 extra vectors consume
    vector numbers and IRQ descriptors even though the device can't actually
    use them.

    This fix introduces a 'msi_desc::nvec_used' field to address this issue.
    When non-zero, it is the actual number of MSIs the device will send, as
    requested by the device driver. This value should be used by architectures
    to set up and tear down only as many interrupt resources as the device will
    actually use.

    Note, although the existing 'msi_desc::multiple' field might seem
    redundant, in fact it is not. The number of MSIs advertised need not be
    the smallest power-of-two larger than the number of MSIs the device will
    send. Thus, it is not always possible to derive the former from the
    latter, so we need to keep them both to handle this case.
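
    The arithmetic in the example above can be checked with a short
    Python sketch. It computes the smallest power of two a device could
    advertise for a given number of used vectors, and the log2 value
    that the "Multiple Message" fields encode. The function names are
    hypothetical; as noted above, a device may advertise more than this
    minimum.

```python
def advertised_vectors(nvec_used):
    """Smallest power of two that can hold nvec_used MSI vectors (1 to 32)."""
    if not 1 <= nvec_used <= 32:
        raise ValueError("MSI supports 1 to 32 vectors")
    n = 1
    while n < nvec_used:
        n *= 2
    return n


def multiple_field(nvec):
    """log2 encoding of a power-of-two vector count."""
    if nvec < 1 or nvec & (nvec - 1):
        raise ValueError("vector count must be a power of two")
    return nvec.bit_length() - 1
```

    For a device that uses 18 vectors, advertised_vectors(18) is 32,
    the encoded field is 5, and 32 - 18 = 14 vectors go unused.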

    [bhelgaas: changelog, rename to "nvec_used"]
    Signed-off-by: Alexander Gordeev
    Signed-off-by: Bjorn Helgaas

    Alexander Gordeev
     
  • Without irq_create_mapping(), the correct IRQ number cannot be
    provided, which causes problems such as a NULL dereference.
    Thus, irq_create_mapping() should be added for MSI.

    Suggested-by: Kishon Vijay Abraham I
    Signed-off-by: Pratyush Anand
    Signed-off-by: Jingoo Han
    Signed-off-by: Bjorn Helgaas

    Pratyush Anand
     
  • The following variables and functions are used only in pcie-designware.c,
    so make them static:

    global_io_offset
    dw_pcie_rd_own_conf()
    dw_pcie_wr_own_conf()
    dw_pcie_setup()
    dw_pcie_scan_bus()
    dw_pcie_map_irq()

    Signed-off-by: Bjorn Helgaas
    Acked-by: Jingoo Han

    Bjorn Helgaas
     
  • Add header guards to prevent redundant inclusion.

    Signed-off-by: Seungwon Jeon
    Signed-off-by: Jingoo Han
    Signed-off-by: Bjorn Helgaas

    Seungwon Jeon
     
  • Add the missing clk_disable_unprepare() before return
    from exynos_pcie_probe() in the error handling case.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Bjorn Helgaas
    Acked-by: Jingoo Han

    Wei Yongjun
     
  • When the link has failed, there is no need to turn on the phy block.
    Also, add an explicit power-on of the phy block, so that it is turned
    on regardless of the default value of the phy registers.

    Signed-off-by: Jingoo Han
    Signed-off-by: Bjorn Helgaas

    Jingoo Han
     
  • This patch adds support for Message Signaled Interrupt in the
    Exynos PCIe driver using Synopsys designware PCIe core IP.

    Signed-off-by: Siva Reddy Kallam
    Signed-off-by: Srikanth T Shivanand
    Signed-off-by: Jingoo Han
    Signed-off-by: Bjorn Helgaas
    Cc: Pratyush Anand
    Cc: Mohit KUMAR

    Jingoo Han
     
  • Probe the PCIe driver in fs_initcall() instead of module_init()
    to assure that pci_assign_unassigned_resources() will be called
    early. This function is called in dw_pcie_host_init(), which is
    in turn called from imx6_add_pcie_port(), which is called from
    imx6_pcie_probe(). If this is not called early, we will hit
    resource collisions since pcieport driver is then probed way too
    late.
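
    The ordering argument can be pictured with a small Python model of
    built-in initcall levels. The level numbers follow the kernel's
    initcall ordering; the driver names and the assumption that the
    pcieport driver registers at the device level are for illustration
    only.

```python
# Relative order of built-in initcall levels (lower runs earlier).
INITCALL_LEVEL = {
    "core_initcall": 1,
    "postcore_initcall": 2,
    "arch_initcall": 3,
    "subsys_initcall": 4,
    "fs_initcall": 5,
    "device_initcall": 6,   # a built-in module_init() maps to this level
    "late_initcall": 7,
}


def boot_order(registrations):
    """Sort (name, initcall_kind) pairs by level; ties keep registration order."""
    ordered = sorted(registrations, key=lambda reg: INITCALL_LEVEL[reg[1]])
    return [name for name, _kind in ordered]
```

    In this model, a driver registered with fs_initcall() always runs
    before one registered at the device level, whatever the order in
    which they were registered.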

    Signed-off-by: Marek Vasut
    Signed-off-by: Bjorn Helgaas
    Acked-by: Shawn Guo
    Cc: Frank Li
    Cc: Jingoo Han
    Cc: Mohit KUMAR
    Cc: Pratyush Anand
    Cc: Richard Zhu
    Cc: Sascha Hauer
    Cc: Sean Cross
    Cc: Siva Reddy Kallam
    Cc: Srikanth T Shivanand
    Cc: Tim Harvey
    Cc: Troy Kisky
    Cc: Yinghai Lu
    (cherry picked from commit f216f57ffe6eede3c8a763add65d331e688f8c56)

    Marek Vasut
     
  • imx6_pcie_of_match is always compiled in because PCI_IMX6 depends on
    SOC_IMX6Q, which only supports OF build. Hence of_match_ptr is not
    required.

    [bhelgaas: add changelog details from Shawn]
    Signed-off-by: Sachin Kamat
    Signed-off-by: Bjorn Helgaas
    Acked-by: Shawn Guo
    Cc: Sean Cross
    (cherry picked from commit 8bcadbe17207aee0df4a1f5cb41371d71bf3e4b0)

    Sachin Kamat
     
  • A longer link startup timeout is required when certain PCI switches are
    attached to the root complex. This was tested with a Pericom switch
    and a PLX switch.

    Signed-off-by: Marek Vasut
    Signed-off-by: Bjorn Helgaas
    Acked-by: Tim Harvey
    Acked-by: Shawn Guo
    (cherry picked from commit 017f10e1c78e14d48be7a28f2c33a32dae15fee5)

    Marek Vasut
     
  • An imprecise abort is triggered when a port behind a switch is accessed
    and no device is present. At enumeration, imprecise aborts are not enabled
    thus this ends up getting deferred until the kernel has completed init. At
    that point we must not adjust PC - the handler must do nothing, but a
    handler must exist.

    This fixes random crashes that occur right after freeing init.

    Tested-by: Marek Vasut
    Signed-off-by: Tim Harvey
    Signed-off-by: Bjorn Helgaas
    Acked-by: Shawn Guo
    Acked-by: Marek Vasut
    (cherry picked from commit 4ec3ed7f5e91e9325c810dcb995ef5a55e4a79a6)

    Tim Harvey
     
  • There is an error message within devm_ioremap_resource()
    already, so remove the dev_err() call to avoid redundant
    error message.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Jingoo Han
    Acked-by: Shawn Guo
    (cherry picked from commit 9b5cd0948b67e1750498b5ff85267e87d3b4c5b3)

    Wei Yongjun
     
  • Add support for the PCIe port present on the i.MX6 family of controllers.
    These use the Synopsys Designware core tied to their own PHY.

    Signed-off-by: Sean Cross
    Signed-off-by: Shawn Guo
    Signed-off-by: Bjorn Helgaas
    Acked-by: Sascha Hauer
    (cherry picked from commit bb38919ec56e0758c3ae56dfc091dcde1391353e)

    Sean Cross
     
  • switch to community upstreamed pcie driver.
    Revert "ENGR00275213-4 pcie_imx: enable pcie on imx6 platforms"
    This reverts commit dce7d25b770086a978d4dd9838c46f5ff52ee135.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • switch to community upstreamed pcie driver.
    Revert "ENGR00278492 imx: pcie: delay is required after REF_CLK_EN is set"
    This reverts commit 1976e889408175354a19824375bc5137f43ef14e.

    Signed-off-by: Richard Zhu

    Richard Zhu
     

30 Oct, 2013

6 commits

  • A delay is required after REF_CLK_EN of GPR1 is set.
    Otherwise, the system hangs when the registers of the PCIe RC
    are accessed while EARLY_PRINTK is not enabled.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • imx6q and imx6dl platforms have one x1 pcie interface;
    this patch sets up the pcie driver for this
    interface.

    Signed-off-by: Richard Zhu

    Richard Zhu
     
  • Exynos PCIe IP consists of Synopsys specific part and Exynos
    specific part. Only core block is a Synopsys Designware part;
    other parts are Exynos specific.

    Also, the Synopsys Designware part can be shared with other
    platforms; thus, the driver can be split into two parts: a Synopsys
    Designware part and an Exynos specific part.

    Signed-off-by: Jingoo Han
    Signed-off-by: Bjorn Helgaas
    Cc: Pratyush Anand
    Cc: Mohit KUMAR

    Jingoo Han
     
  • Exynos5440 has a PCIe controller which can be used as Root Complex.
    This driver supports a PCIe controller in Root Complex mode.

    Signed-off-by: Surendranath Gurivireddy Balla
    Signed-off-by: Siva Reddy Kallam
    Signed-off-by: Jingoo Han
    Acked-by: Bjorn Helgaas
    Acked-by: Kukjin Kim
    Cc: Pratyush Anand
    Cc: Mohit KUMAR
    Signed-off-by: Arnd Bergmann

    Jingoo Han
     
  • We allow the pci-mvebu driver to be compiled on the Kirkwood platform,
    and add the 'marvell,kirkwood-pcie' as a compatible string supported
    by the driver.

    Signed-off-by: Thomas Petazzoni
    Tested-by: Andrew Lunn
    Signed-off-by: Jason Cooper

    Thomas Petazzoni
     
  • This driver implements the support for the PCIe interfaces on the
    Marvell Armada 370/XP ARM SoCs. In the future, it might be extended to
    cover earlier families of Marvell SoCs, such as Dove, Orion and
    Kirkwood.

    The driver implements the hw_pci operations needed by the core ARM PCI
    code to setup PCI devices and get their corresponding IRQs, and the
    pci_ops operations that are used by the PCI core to read/write the
    configuration space of PCI devices.

    Since the PCIe interfaces of Marvell SoCs are completely separate and
    not linked together in a bus, this driver sets up an emulated PCI host
    bridge, with one PCI-to-PCI bridge as child for each hardware PCIe
    interface.

    In addition, this driver enumerates the different PCIe slots, and for
    those having a device plugged in, it sets up the necessary address
    decoding windows, using the mvebu-mbus driver.

    Signed-off-by: Thomas Petazzoni
    Acked-by: Bjorn Helgaas
    Signed-off-by: Jason Cooper

    Thomas Petazzoni
     

02 Oct, 2013

1 commit

  • commit 834145156bedadfb50121f0bc5e9d9f9f942bcca upstream.

    Commit 448bd85 (PCI/PM: add PCIe runtime D3cold support) added a
    piece of code to pci_acpi_wake_dev() causing that function to behave
    in a special way for devices in D3cold (so that their configuration
    registers are not accessed before those devices are resumed).
    However, it didn't take the clearing of the pme_poll flag into
    account. That has to be done for all devices, even if they are in
    D3cold, or pci_pme_list_scan() will not know that wakeup has been
    signaled for the device and will poll its PME Status bit
    unnecessarily.

    Fix the problem by moving the clearing of the pme_poll flag in
    pci_acpi_wake_dev() before the code introduced by commit 448bd85.

    Reported-and-tested-by: David E. Box
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     

30 Aug, 2013

1 commit

  • commit 60f75b8e97daf4a39790a20d962cb861b9220af5 upstream.

    In theory, under a given ACPI namespace node there should be only
    one child device object with _ADR whose value matches a given bus
    address exactly. In practice, however, there are systems in which
    multiple child device objects under a given parent have _ADR matching
    exactly the same address. In those cases we use _STA to determine
    which of the multiple matching devices is enabled, since some systems
    are known to indicate which ACPI device object to associate with the
    given physical (usually PCI) device this way.

    Unfortunately, as it turns out, there are systems in which many
    device objects under the same parent have _ADR matching exactly the
    same bus address and none of them has _STA, in which case they all
    should be regarded as enabled according to the spec. Still, if
    those device objects are supposed to represent bridges (e.g. this
    is the case for device objects corresponding to PCIe ports), we can
    try harder and skip the ones that have no child device objects in the
    ACPI namespace. With luck, we can avoid using device objects that we
    are not expected to use this way.

    Although this only works for bridges whose children also have ACPI
    namespace representation, it is sufficient to address graphics
    adapter detection issues on some systems, so rework the code finding
    a matching device ACPI handle for a given bus address to implement
    this idea.

    Introduce a new function, acpi_find_child(), taking three arguments:
    the ACPI handle of the device's parent, a bus address suitable for
    the device's bus type and a bool indicating if the device is a
    bridge and make it work as outlined above. Reimplement the function
    currently used for this purpose, acpi_get_child(), as a call to
    acpi_find_child() with the last argument set to 'false' and make
    the PCI subsystem use acpi_find_child() with the bridge information
    passed as the last argument to it. [Lan Tianyu notices that it is
    not sufficient to use pci_is_bridge() for that, because the device's
    subordinate pointer hasn't been set yet at this point, so use
    hdr_type instead.]

    This change fixes a regression introduced inadvertently by commit
    33f767d (ACPI: Rework acpi_get_child() to be more efficient) which
    overlooked the fact that for acpi_walk_namespace() "post-order" means
    "after all children have been visited" rather than "on the way back",
    so for device objects without children and for namespace walks of
    depth 1, as in the acpi_get_child() case, the "post-order" callbacks
    ordering is actually the same as the ordering of "pre-order" ones.
    Since that commit changed the namespace walk in acpi_get_child() to
    terminate after finding the first matching object instead of going
    through all of them and returning the last one, it effectively
    changed the result returned by that function in some rare cases and
    that led to problems (the switch from a "pre-order" to a "post-order"
    callback was supposed to prevent that from happening, but it was
    ineffective).

    As it turns out, the systems where the change made by commit
    33f767d actually matters are those where there are multiple ACPI
    device objects representing the same PCIe port (which effectively
    is a bridge). Moreover, only one of them, and the one we are
    expected to use, has child device objects in the ACPI namespace,
    so the regression can be addressed as described above.

    References: https://bugzilla.kernel.org/show_bug.cgi?id=60561
    Reported-by: Peter Wu
    Tested-by: Vladimir Lalov
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Bjorn Helgaas
    Cc: Peter Wu
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     

12 Aug, 2013

2 commits

  • commit aa914f5ec25e4371ba18b312971314be1b9b1076 upstream.

    Ben Herrenschmidt reported the following problem:

    - The bus has space for all desired MMIO resources, including optional
    space for SR-IOV devices
    - We attempt to allocate I/O port space, but it fails because the bus
    has no I/O space
    - Because of the I/O allocation failure, we retry MMIO allocation,
    requesting only the required space, without the optional SR-IOV space

    This means we don't allocate the optional SR-IOV space, even though we
    could.

    This is related to 0c5be0cb0e ("PCI: Retry on IORESOURCE_IO type
    allocations").

    This patch changes how we handle allocation failures. We will now retry
    allocation of only the resource type that failed. If MMIO allocation
    fails, we'll retry only MMIO allocation. If I/O port allocation fails,
    we'll retry only I/O port allocation.
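
    The retry policy described above can be sketched as a toy two-pass
    allocator in Python. The data layout (bytes available per type,
    required and optional sizes per request) and the function name are
    assumptions made for illustration; the kernel's resource
    assignment code is considerably more involved.

```python
def assign(bus_space, requests):
    """Two-pass allocation: try required + optional, then retry failures only.

    bus_space: available bytes per resource type, e.g. {"mmio": 0x300, "io": 0}
    requests:  per type, a (required, optional) pair of byte counts
    Returns the bytes granted per resource type.
    """
    granted, failed = {}, []
    for rtype, (required, optional) in requests.items():
        if required + optional <= bus_space.get(rtype, 0):
            granted[rtype] = required + optional
        else:
            failed.append(rtype)
    # Retry only the types that failed, asking for the required size alone.
    for rtype in failed:
        required, _optional = requests[rtype]
        granted[rtype] = required if required <= bus_space.get(rtype, 0) else 0
    return granted
```

    On a bus with no I/O space, the I/O request fails and is retried on
    its own, while the MMIO grant keeps its optional (for example SR-IOV)
    portion.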

    [bhelgaas: changelog]
    Reference: https://lkml.kernel.org/r/1367712653.11982.19.camel@pasglop
    Reported-by: Benjamin Herrenschmidt
    Tested-by: Gavin Shan
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Yinghai Lu
     
  • commit 29ed1f29b68a8395d5679b3c4e38352b617b3236 upstream.

    Hot-removing a device with SR-IOV enabled causes a null pointer dereference
    in v3.9 and v3.10.

    This is a regression caused by ba518e3c17 ("PCI: pciehp: Iterate over all
    devices in slot, not functions 0-7"). When we iterate over the
    bus->devices list, we first remove the PF, which also removes all the VFs
    from the list. Then the list iterator blows up because more than just the
    current entry was removed from the list.

    ac205b7bb7 ("PCI: make sriov work with hotplug remove") works around a
    similar problem in pci_stop_bus_devices() by iterating over the list in
    reverse, so the VFs are stopped and removed from the list first, before the
    PF.

    This patch changes pciehp_unconfigure_device() to iterate over the list in
    reverse, too.
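
    Why the direction matters can be shown with a toy Python model,
    assuming (as described above) that the PF precedes its VFs on the
    list and that removing a PF also removes its VFs. The class and
    function names are hypothetical. A "stale" entry stands for a list
    node the iterator would touch after it was already removed.

```python
class Dev:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # the PF for a VF, None otherwise
        self.on_bus = True


def make_bus():
    pf = Dev("pf")
    return [pf, Dev("vf0", pf), Dev("vf1", pf)]


def remove(bus, dev):
    """Remove dev; removing a PF also removes all of its VFs."""
    for vf in [d for d in bus if d.parent is dev]:
        vf.on_bus = False
        bus.remove(vf)
    dev.on_bus = False
    bus.remove(dev)


def unconfigure(bus, reverse):
    """Walk a snapshot of the bus and report entries found already removed."""
    order = list(bus)             # snapshot stands in for cached list pointers
    if reverse:
        order.reverse()
    stale = []
    for dev in order:
        if not dev.on_bus:
            stale.append(dev.name)
            continue
        remove(bus, dev)
    return stale
```

    Walking forward reports both VFs as stale, because they disappeared
    together with the PF; walking in reverse removes the VFs first and
    reports nothing.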

    [bhelgaas: bugzilla, changelog]
    Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60604
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas
    Acked-by: Yijing Wang
    Signed-off-by: Greg Kroah-Hartman

    Yinghai Lu
     

22 Jul, 2013

4 commits

  • commit fafe5c3d82a470d73de53e6b08eb4e28d974d895 upstream.

    Add the AMD CZ SATA controller device ID for IDE mode.

    [bhelgaas: drop pci_ids.h update]
    Signed-off-by: Shane Huang
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Shane Huang
     
  • commit 343df771e671d821478dd3ef525a0610b808dbf8 upstream.

    After calling device_register(&bridge->dev), the bridge is reference-
    counted, and it is illegal to call kfree() on it except in the release
    function.
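
    The rule can be illustrated with a toy reference count in Python.
    This is only a model of the get/put discipline, with hypothetical
    names; it is not the driver core's implementation. The error path
    drops the caller's reference instead of freeing the object, so the
    object is released only when the last holder is done with it.

```python
class Bridge:
    """Toy reference-counted object with a release step."""

    def __init__(self):
        self.refcount = 1         # reference held by the creator
        self.released = False

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.released = True  # only the release path frees the object


def create_bridge(fail_register, other_holder=False):
    """Create a bridge; on a simulated registration failure, put() it."""
    bridge = Bridge()
    if other_holder:
        bridge.get()              # someone else took a reference meanwhile
    if fail_register:
        bridge.put()              # correct error path: drop our reference
    return bridge
```

    If another holder still has a reference when registration fails,
    the object stays alive until that holder calls put(); freeing it
    directly would leave that holder with a dangling object.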

    [bhelgaas: changelog, use put_device() after device_register() failure]
    Signed-off-by: Jiang Liu
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Jiang Liu
     
  • commit fbf33f516bdbcc2ab1ba1e54dfb720b0cfaa6874 upstream.

    Commit 4f535093cf "PCI: Put pci_dev in device tree as early as possible"
    moves device registering from pci_bus_add_devices() to pci_device_add().
    That causes problems for virtual functions because device_add(&virtfn->dev)
    is called before setting the virtfn->is_virtfn flag, which then causes Xen
    to report PCI virtual functions as PCI physical functions.

    Fix it by setting virtfn->is_virtfn before calling pci_device_add().
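
    The ordering problem can be reproduced with a small Python model in
    which listeners (standing in for a consumer such as Xen) inspect
    the device at the moment it is added. All names here are
    hypothetical and the model ignores everything else that device
    registration does.

```python
class PciDev:
    def __init__(self):
        self.is_virtfn = False


def device_add(dev, listeners):
    """Stand-in for device registration: listeners see the device as it is now."""
    for notify in listeners:
        notify(dev)


def add_virtfn(listeners, set_flag_first):
    dev = PciDev()
    if set_flag_first:
        dev.is_virtfn = True      # fixed ordering: flag set before the add
    device_add(dev, listeners)
    dev.is_virtfn = True          # old ordering set the flag only at this point
    return dev
```

    With the old ordering a listener observes is_virtfn as False and
    treats the device as a physical function; with the flag set first
    it observes True.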

    [Jiang Liu]: Move the setting of virtfn->is_virtfn ahead further for better
    readability and modify changelog.

    Signed-off-by: Xudong Hao
    Signed-off-by: Jiang Liu
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Xudong Hao
     
  • commit 098b1aeaf4d6149953b8f1f8d55c21d85536fbff upstream.

    There are two tool-stacks that can instruct the Xen PCI frontend
    and backend to change states: 'xm' (Python code with a daemon),
    and 'xl' (C library - does not keep state changes).

    With the 'xm', the path to disconnect a single PCI device
    (xm pci-detach) is:

    4(Connected)->7(Reconfiguring*)-> 8(Reconfigured)-> 4(Connected)->5(Closing*).

    The * is for states that the tool-stack sets. For 'xl', it is similar:

    4(Connected)->7(Reconfiguring*)-> 8(Reconfigured)-> 4(Connected)

    Both of them also tear down the XenBus structure, so the backend
    state ends up going in the 3(Initialised) and calls pcifront_xenbus_remove.

    When a PCI device is plugged back in (xm pci-attach)
    both of them follow the same pattern:

    2(InitWait*), 3(Initialized*), 4(Connected*)->4(Connected).

    [xen-pcifront ignores the 2,3 state changes and only acts when
    4 (Connected) has been reached]

    Note that this is for a _single_ PCI device. If there were two
    PCI devices and only one was disconnected 'xm' would show the same
    state changes.

    The problem is that git commit 3d925320e9e2de162bd138bf97816bda8c3f71be
    ("xen/pcifront: Use Xen-SWIOTLB when initting if required") introduced
    a mechanism to initialize the SWIOTLB when the Xen PCI front moves to
    Connected state. It also had some aggressive seatbelt code check that
    would warn the user if one tried to change to Connected state without
    hitting first the Closing state:

    pcifront pci-0: PCI frontend already installed!

    However, that code can be relaxed and we can continue on working
    even if the frontend is instructed to be the 'Connected' state with
    no devices and then gets tickled to be in 'Connected' state again.

    In other words, this 4(Connected)->5(Closing)->4(Connected) state
    was expected, while 4(Connected)->.... anything but 5(Closing)->4(Connected)
    was not. This patch removes that aggressive check and allows
    Xen pcifront to work with the 'xl' toolstack (for one or more
    PCI devices) and with 'xm' toolstack (for more than two PCI
    devices).

    Acked-by: Bjorn Helgaas
    Cc: linux-pci@vger.kernel.org
    [v2: Added in the description about two PCI devices]
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Konrad Rzeszutek Wilk
     

24 Jun, 2013

1 commit

  • The interactions between the ACPI dock driver and the ACPI-based PCI
    hotplug (acpiphp) are currently problematic because of ordering
    issues during hot-remove operations.

    First of all, the current ACPI glue code expects that physical
    devices will always be deleted before deleting the companion ACPI
    device objects. Otherwise, acpi_unbind_one() will fail with a
    warning message printed to the kernel log, for example:

    [ 185.026073] usb usb5: Oops, 'acpi_handle' corrupt
    [ 185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
    [ 185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
    [ 180.013656] port1: Oops, 'acpi_handle' corrupt

    This means, in particular, that struct pci_dev objects have to
    be deleted before the struct acpi_device objects they are "glued"
    with.

    Now, the following happens during the undocking of an ACPI-based
    dock station:
    1) hotplug_dock_devices() invokes registered hotplug callbacks to
    destroy physical devices associated with the ACPI device objects
    depending on the dock station. It calls dd->ops->handler() for
    each of those device objects.
    2) For PCI devices dd->ops->handler() points to
    handle_hotplug_event_func() that queues up a separate work item
    to execute _handle_hotplug_event_func() for the given device and
    returns immediately. That work item will be executed later.
    3) hotplug_dock_devices() calls dock_remove_acpi_device() for each
    device depending on the dock station. This runs acpi_bus_trim()
    for each of them, which causes the underlying ACPI device object
    to be destroyed, but the work items queued up by
    handle_hotplug_event_func() haven't been started yet.
    4) _handle_hotplug_event_func() queued up in step 2) are executed
    and cause the above failure to happen, because the PCI devices
    they handle do not have the companion ACPI device objects any
    more (those objects have been deleted in step 3).

    The possible breakage doesn't end here, though, because
    hotplug_dock_devices() may return before at least some of the
    _handle_hotplug_event_func() work items spawned by it have a
    chance to complete and then undock() will cause _DCK to be
    evaluated and that will cause the devices handled by the
    _handle_hotplug_event_func() to go away possibly while they are
    being accessed.

    This means that dd->ops->handler() for PCI devices should not point
    to handle_hotplug_event_func(). Instead, it should point to a
    function that will do the work of _handle_hotplug_event_func()
    synchronously. For this reason, introduce such a function,
    hotplug_event_func(), and modify acpiphp_dock_ops to point to
    it as the handler.

    Unfortunately, however, this is not sufficient, because if the dock
    code were not changed further, hotplug_event_func() would now
    deadlock with hotplug_dock_devices() that called it, since it would
    run unregister_hotplug_dock_device() which in turn would attempt to
    acquire the dock station's hp_lock mutex already acquired by
    hotplug_dock_devices().

    To resolve that deadlock use the observation that
    unregister_hotplug_dock_device() won't need to acquire hp_lock
    if PCI bridges the devices on the dock station depend on are
    prevented from being removed prematurely while the first loop in
    hotplug_dock_devices() is in progress.

    To make that possible, introduce a mechanism by which the callers of
    register_hotplug_dock_device() can provide "init" and "release"
    routines that will be executed, respectively, during the addition
    and removal of the physical device object associated with the
    given ACPI device handle. Make acpiphp use two new functions,
    acpiphp_dock_init() and acpiphp_dock_release(), that call
    get_bridge() and put_bridge(), respectively, on the acpiphp bridge
    holding the given device, for this purpose.

    In addition to that, remove the dock station's list of
    "hotplug devices" and make the dock code always walk the whole list
    of "dependent devices" instead in such a way that the loops in
    hotplug_dock_devices() and dock_event() (replacing the loops over
    "hotplug devices") will take references to the list entries that
    register_hotplug_dock_device() has been called for. That prevents
    the "release" routines associated with those entries from being
    called while the given entry is being processed and for PCI
    devices this means that their bridges won't be removed (by a
    concurrent thread) while hotplug_event_func() handling them is
    being executed.

    This change is based on two earlier patches from Jiang Liu.

    References: https://bugzilla.kernel.org/show_bug.cgi?id=59501
    Reported-and-tested-by: Alexander E. Patrakov
    Tracked-down-by: Jiang Liu
    Tested-by: Illya Klymov
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu
    Cc: 3.9+

    Rafael J. Wysocki
     

23 Jun, 2013

1 commit

  • On x86 platforms, the kernel respects PCI resource assignments from
    the BIOS and only reassigns resources for unassigned BARs at boot
    time. However, with the ACPI-based hotplug (acpiphp), it ignores the
    BIOS' PCI resource assignments completely and reassigns all resources
    by itself. This causes differences in PCI resource allocation
    between boot time and runtime hotplug to occur, which is generally
    undesirable and sometimes actively breaks things.

    Namely, if there are enough resources, reassigning all PCI resources
    during runtime hotplug should work, but it may fail if the resources
    are constrained. This may happen, for instance, when some PCI
    devices with huge MMIO BARs are involved in the runtime hotplug
    operations, because the current PCI MMIO alignment algorithm may
    waste huge chunks of MMIO address space in those cases.

    On Alexander's Sony VAIO VPCZ23A4R the BIOS allocates limited
    MMIO resources for the dock station which contains a device
    (graphics adapter) with a 256MB MMIO BAR. An attempt to reassign
    that during runtime hotplug causes the dock station MMIO window to be
    exhausted and acpiphp fails to allocate resources for the majority
    of devices on the dock station as a result.

    To prevent that from happening, modify acpiphp to follow the boot
    time resources allocation behavior so that the BIOS' resource
    assignments are respected during runtime hotplug too.

    [rjw: Changelog]
    References: https://bugzilla.kernel.org/show_bug.cgi?id=56531
    Reported-and-tested-by: Alexander E. Patrakov
    Tested-by: Illya Klymov
    Signed-off-by: Jiang Liu
    Acked-by: Yinghai Lu
    Cc: 3.9+
    Signed-off-by: Rafael J. Wysocki

    Jiang Liu
     

31 May, 2013

1 commit

  • The following warning was seen on 3.9 when a corrected PCIe error was being
    handled by the AER subsystem.

    WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()

    This occurred because a call to pci_get_domain_bus_and_slot() was added to
    cper_print_pcie() to setup for the call to cper_print_aer(). The warning
    showed up because cper_print_pcie() is called in an interrupt context and
    pci_get* functions are not supposed to be called in that context.

    The solution is to move the cper_print_aer() call out of the interrupt
    context and into aer_recover_work_func() to avoid any warnings when calling
    pci_get* functions.

    Signed-off-by: Lance Ortiz
    Acked-by: Borislav Petkov
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Tony Luck

    Lance Ortiz