14 Nov, 2018

1 commit

  • commit aeae4f3e5c38d47bdaef50446dc0ec857307df68 upstream.

    Upon removal of the last device on a bus, pci_stop_dev() calls
    pcie_aspm_exit_link_state() to tear down the link_state of the bridge
    leading to that bus.

    When ASPM was originally introduced by commit 7d715a6c1ae5 ("PCI: add
    PCI Express ASPM support"), it determined whether the device being
    removed is the last one by calling list_empty() on the bridge's
    subordinate devices list. That didn't work because the device is only
    removed from the list slightly later in pci_destroy_dev().

    Commit 3419c75e15f8 ("PCI: properly clean up ASPM link state on device
    remove") attempted to fix it by calling list_is_last(), but that's not
    correct either because it checks whether the device is at the *end* of
    the list, not whether it's the last one *left* in the list. If the user
    removes the device which happens to be at the end of the list via sysfs
    but other devices are preceding the device in the list, the link_state
    is torn down prematurely.

    The real fix is to move the invocation of pcie_aspm_exit_link_state() to
    pci_destroy_dev() and reinstate the call to list_empty(). Remove a
    duplicate check for dev->bus->self because pcie_aspm_exit_link_state()
    already contains an identical check.
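
    The difference between the two checks can be sketched in userspace. The
    list helpers below are a minimal re-implementation modeled on the kernel's
    list.h, and remove_tail_device() is a hypothetical helper that exists only
    for this illustration; none of it is the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, modeled on the kernel's list.h. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = entry;
}

/* True only if the list holds no entries at all. */
static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* True if entry sits at the tail, no matter how many entries precede it. */
static bool list_is_last(const struct list_head *entry,
                         const struct list_head *head)
{
    return entry->next == head;
}

#define MAX_DEVS 8

struct removal_checks {
    bool is_last_before_del;  /* list_is_last() before unlinking */
    bool empty_before_del;    /* list_empty() before unlinking */
    bool empty_after_del;     /* list_empty() after unlinking */
};

/* Hypothetical: put ndevs devices on a bus list, then remove the tail one. */
static struct removal_checks remove_tail_device(size_t ndevs)
{
    struct list_head bus, devs[MAX_DEVS];
    struct removal_checks r;
    size_t i;

    assert(ndevs >= 1 && ndevs <= MAX_DEVS);
    list_init(&bus);
    for (i = 0; i < ndevs; i++)
        list_add_tail(&devs[i], &bus);

    r.is_last_before_del = list_is_last(&devs[ndevs - 1], &bus);
    r.empty_before_del = list_empty(&bus);
    list_del(&devs[ndevs - 1]);
    r.empty_after_del = list_empty(&bus);
    return r;
}
```

    With two devices, list_is_last() already reports true for the tail device
    although a sibling remains, while list_empty() only gives the right answer
    once the device has been unlinked.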

    Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
    Signed-off-by: Lukas Wunner
    Signed-off-by: Bjorn Helgaas
    Cc: Shaohua Li
    Cc: stable@vger.kernel.org # v2.6.26
    Signed-off-by: Greg Kroah-Hartman

    Lukas Wunner
     

20 Dec, 2017

1 commit

  • [ Upstream commit 16b6c8bb687cc3bec914de09061fcb8411951fda ]

    When removing a device, for example a VF being removed due to SR-IOV
    teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual
    hot-unplug, we first remove the procfs and sysfs attributes for the device
    before attempting to release the device from any driver bound to it.
    Unbinding the driver from the device can take time. The device might need
    to write out data or it might be actively in use. If it's in use by
    userspace through a vfio driver, the unbind might block until the user
    releases the device. This leads to a potentially non-trivial amount of
    time where the device exists, but we've torn down the interfaces that
    userspace uses to examine devices. For instance, lspci might generate this
    sort of error:

    pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config
    lspci: Unable to read the standard configuration space header of device 0000:01:0a.3

    We don't seem to have any dependence on this teardown ordering in the
    kernel, so let's unbind the driver first, which is also more symmetric with
    the instantiation of the device in pci_bus_add_device().

    Signed-off-by: Alex Williamson
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Alex Williamson
     

18 Nov, 2016

1 commit

  • The algorithm to update the flag indicating whether a bridge may go to D3
    makes a few optimizations based on whether the update was caused by the
    removal of a device on the one hand, versus the addition of a device or the
    change of its D3cold flags on the other hand.

    The information whether the update pertains to a removal is currently
    passed in by the caller, but the function may as well determine that itself
    by examining the device in question, thereby allowing for a considerable
    simplification and reduction of the code.

    Out of several options to determine removal, I've chosen the function
    device_is_registered() because it's cheap: It merely returns the
    dev->kobj.state_in_sysfs flag. That flag is set through device_add() when
    the root bus is scanned and cleared through device_del(). The call to
    pci_bridge_d3_update() happens after each of these calls, respectively, so
    the ordering is correct.
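
    A simplified userspace model of that idea follows. All types and the
    blocks_d3 field are hypothetical stand-ins, and the shortcuts shown are a
    paraphrase of the optimizations described above; the kernel's actual
    function may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

struct model_dev {
    bool state_in_sysfs;  /* stands in for dev->kobj.state_in_sysfs */
    bool blocks_d3;       /* hypothetical: this child forbids bridge D3 */
};

struct model_bridge {
    bool bridge_d3;                         /* may the bridge go to D3? */
    struct model_dev *child[MAX_CHILDREN];  /* devices still on the bus */
    size_t nchildren;
    unsigned int walks;                     /* sibling walks, counted for the demo */
};

static bool device_is_registered(const struct model_dev *dev)
{
    return dev->state_in_sysfs;
}

/* The caller no longer says whether dev is being removed; we look it up. */
static void bridge_d3_update(struct model_bridge *bridge,
                             const struct model_dev *dev)
{
    bool remove, d3_ok = true;
    size_t i;

    assert(bridge && dev);
    remove = !device_is_registered(dev);

    /* Removing a child cannot revoke a permission that is already granted. */
    if (remove && bridge->bridge_d3)
        return;

    /* An added or changed device is checked first; it may settle the answer. */
    if (!remove && dev->blocks_d3)
        d3_ok = false;

    /* Walk the siblings only if D3 is currently disallowed but still possible. */
    if (d3_ok && !bridge->bridge_d3) {
        bridge->walks++;
        for (i = 0; i < bridge->nchildren; i++)
            if (bridge->child[i]->blocks_d3)
                d3_ok = false;
    }

    bridge->bridge_d3 = d3_ok;
}
```

    In this model the device must already be unlinked from child[] and have
    its flag cleared before bridge_d3_update() is called for a removal, which
    mirrors the ordering requirement stated above.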

    No functional change intended.

    Tested-by: Mika Westerberg
    Signed-off-by: Lukas Wunner
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Rafael J. Wysocki

    Lukas Wunner
     

14 Sep, 2016

1 commit

  • Starting with v4.8, we allow a PCIe port to runtime suspend to D3hot if the
    port itself and its children satisfy a number of conditions. Once a child
    is removed, we recheck those conditions in case the removed device was
    blocking the port from suspending.

    The rechecking needs to happen *after* the device has been removed from the
    bus it resides on. Otherwise when walking the port's subordinate bus in
    pci_bridge_d3_update(), the device being removed would erroneously still be
    taken into account.

    However the device is removed from the bus_list in pci_destroy_dev() and we
    currently recheck *before* that. Fix it.

    Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
    Signed-off-by: Lukas Wunner
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Mika Westerberg
    Acked-by: Rafael J. Wysocki

    Lukas Wunner
     

14 Jun, 2016

1 commit

    Currently the Linux PCI core does not touch the power state of PCI bridges
    and PCIe ports when system suspend is entered. Leaving them in D0 consumes
    power unnecessarily and may prevent the CPU from entering deeper C-states.

    With recent PCIe hardware we can power down the ports to save power, given
    that we take a few restrictions into account:

    - The PCIe port hardware is recent enough, starting from 2015.

    - Devices connected to PCIe ports are effectively in D3cold once the port
    is transitioned to D3 (the config space is not accessible anymore and
    the link may be powered down).

    - Devices behind the PCIe port need to be allowed to transition to D3cold
    and back. There is a way both drivers and userspace can forbid this.

    - If the device behind the PCIe port is capable of waking the system it
    needs to be able to do so from D3cold.

    This patch adds a new flag to struct pci_dev called 'bridge_d3'. This
    flag is set and cleared by the PCI core whenever there is a change in the
    power management state of any of the devices behind the PCIe port. When the
    system is later suspended, we only need to check this flag: if it is true,
    we transition the port to D3; otherwise we leave it in D0.
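
    The restrictions listed above can be read as a predicate over the devices
    behind the port. The sketch below is a userspace model only: every type
    and field name is hypothetical, and the hardware-age check is reduced to a
    single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_child {
    bool d3cold_allowed;    /* neither driver nor userspace forbade D3cold */
    bool can_wake_system;   /* device is capable of waking the system */
    bool wake_from_d3cold;  /* ...and can do so from D3cold */
};

struct model_port {
    bool hw_recent_enough;  /* first restriction, reduced to one flag */
    bool bridge_d3;         /* cached result, consulted at suspend time */
};

enum model_state { MODEL_D0, MODEL_D3 };

static bool child_permits_d3(const struct model_child *c)
{
    if (!c->d3cold_allowed)
        return false;
    if (c->can_wake_system && !c->wake_from_d3cold)
        return false;
    return true;
}

/* Re-evaluated whenever the power management state of a child changes. */
static void port_update_bridge_d3(struct model_port *port,
                                  const struct model_child *children,
                                  size_t n)
{
    bool ok;
    size_t i;

    assert(port);
    ok = port->hw_recent_enough;
    for (i = 0; i < n; i++)
        if (!child_permits_d3(&children[i]))
            ok = false;
    port->bridge_d3 = ok;
}

/* At suspend time only the cached flag is checked. */
static enum model_state port_suspend_state(const struct model_port *port)
{
    return port->bridge_d3 ? MODEL_D3 : MODEL_D0;
}
```

    Caching the result in one flag keeps the suspend path cheap; the cost of
    walking the children is paid only when something behind the port changes.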

    Also provide an override mechanism via the command line parameter
    "pcie_port_pm=[off|force]" that can be used to disable or enable the
    feature regardless of the BIOS manufacturing date.

    Tested-by: Lukas Wunner
    Signed-off-by: Mika Westerberg
    Signed-off-by: Bjorn Helgaas
    Acked-by: Rafael J. Wysocki

    Mika Westerberg
     

15 Mar, 2016

1 commit

  • * pci/resource:
    PCI: Simplify pci_create_attr() control flow
    PCI: Don't leak memory if sysfs_create_bin_file() fails
    PCI: Simplify sysfs ROM cleanup
    PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
    MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
    MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
    ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
    ia64/PCI: Use ioremap() instead of open-coded equivalent
    ia64/PCI: Use temporary struct resource * to avoid repetition
    PCI: Clean up pci_map_rom() whitespace
    PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
    PCI: Set ROM shadow location in arch code, not in PCI core
    PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
    PCI: Don't assign or reassign immutable resources
    PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
    x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
    PCI: Disable IO/MEM decoding for devices with non-compliant BARs

    Bjorn Helgaas
     

12 Mar, 2016

1 commit


09 Mar, 2016

1 commit

  • Add pci_ops.{add,remove}_bus() callbacks, which will be called on every
    newly created bus and when a bus is being removed, respectively. This can
    be used by drivers to implement driver-specific initialization and teardown
    of the bus, in addition to the architecture-specifics implemented by the
    pcibios_add_bus() and the pcibios_remove_bus() functions.
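
    The shape of such optional callbacks might look as follows. This is a
    userspace model with hypothetical model_* types; the order in which the
    architecture hook and the driver hook run is an assumption made for the
    example, not something this changelog specifies.

```c
#include <assert.h>
#include <stddef.h>

struct model_bus;

/* Hypothetical reduction of struct pci_ops to the two optional hooks. */
struct model_pci_ops {
    int  (*add_bus)(struct model_bus *bus);
    void (*remove_bus)(struct model_bus *bus);
};

struct model_bus {
    const struct model_pci_ops *ops;
    int arch_state;    /* touched by the architecture hooks */
    int driver_state;  /* touched by the driver hooks */
};

/* Stand-ins for pcibios_add_bus() and pcibios_remove_bus(). */
static void model_pcibios_add_bus(struct model_bus *bus)
{
    bus->arch_state = 1;
}

static void model_pcibios_remove_bus(struct model_bus *bus)
{
    bus->arch_state = 0;
}

/* Called for every newly created bus; the driver hook is optional. */
static int model_bus_created(struct model_bus *bus)
{
    assert(bus && bus->ops);
    model_pcibios_add_bus(bus);
    if (bus->ops->add_bus)
        return bus->ops->add_bus(bus);
    return 0;
}

/* Called when a bus is being removed; the driver hook is optional. */
static void model_bus_removed(struct model_bus *bus)
{
    assert(bus && bus->ops);
    model_pcibios_remove_bus(bus);
    if (bus->ops->remove_bus)
        bus->ops->remove_bus(bus);
}

/* Example driver-specific hooks. */
static int demo_add_bus(struct model_bus *bus)
{
    bus->driver_state = 1;
    return 0;
}

static void demo_remove_bus(struct model_bus *bus)
{
    bus->driver_state = 0;
}

static const struct model_pci_ops demo_ops = {
    .add_bus = demo_add_bus,
    .remove_bus = demo_remove_bus,
};

/* A host driver without bus hooks leaves both members NULL. */
static const struct model_pci_ops plain_ops = { NULL, NULL };
```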

    Signed-off-by: Thierry Reding
    Signed-off-by: Bjorn Helgaas

    Thierry Reding
     

09 Apr, 2015

1 commit

  • Export the following symbols so they can be referenced by a PCI host bridge
    driver compiled as a kernel loadable module:

    pci_common_swizzle
    pci_create_root_bus
    pci_stop_root_bus
    pci_remove_root_bus
    pci_assign_unassigned_bus_resources
    pci_fixup_irqs

    Signed-off-by: Ray Jui
    Signed-off-by: Bjorn Helgaas
    Acked-by: Arnd Bergmann

    Ray Jui
     

02 Feb, 2014

1 commit

  • Revert commit ef83b0781a73 "PCI: Remove from bus_list and release
    resources in pci_release_dev()" that made some nasty race conditions
    become possible. For example, if a Thunderbolt link is unplugged
    and then replugged immediately, the pci_release_dev() resulting from
    the hot-remove code path may be racing with the hot-add code path
    which after that commit causes various kinds of breakage to happen
    (up to and including a hard crash of the whole system).

    Moreover, the problem that commit ef83b0781a73 attempted to address
    cannot happen any more after commit 8a4c5c329de7 "PCI: Check parent
    kobject in pci_destroy_dev()", because pci_destroy_dev() will now
    return immediately if it has already been executed for the given
    device.

    Note, however, that the invocation of msi_remove_pci_irq_vectors()
    removed by commit ef83b0781a73 from pci_free_resources() along with
    the other changes made by it is not added back because of subsequent
    code changes depending on that modification.

    Fixes: ef83b0781a73 (PCI: Remove from bus_list and release resources in pci_release_dev())
    Reported-by: Mika Westerberg
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

16 Jan, 2014

1 commit

  • If pci_stop_and_remove_bus_device() is run concurrently for a device and
    its parent bridge via remove_callback(), both code paths attempt to acquire
    pci_rescan_remove_lock. If the child device removal acquires it first,
    there will be no problems. However, if the parent bridge removal acquires
    it first, it will eventually execute pci_destroy_dev() for the child
    device, but that device object will not be freed yet due to the reference
    held by the concurrent child removal. Consequently, both
    pci_stop_bus_device() and pci_remove_bus_device() will be executed for that
    device unnecessarily and pci_destroy_dev() will see a corrupted list head
    in that object. Moreover, an excess put_device() will be executed for that
    device in that case which may lead to a use-after-free in the final
    kobject_put() done by sysfs_schedule_callback_work().

    To avoid that problem, make pci_destroy_dev() check if the device's parent
    kobject is NULL, which only happens after device_del() has already run for
    it. Make pci_destroy_dev() return immediately without doing anything in
    that case.
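
    The guard can be modeled in userspace as below. The model_* names are
    hypothetical and the teardown is reduced to a counter; the sketch only
    shows why a second call becomes a no-op once the parent pointer has been
    cleared.

```c
#include <assert.h>
#include <stddef.h>

struct model_kobject {
    struct model_kobject *parent;
};

struct model_dev {
    struct model_kobject kobj;
    int refcount;       /* references held on the device object */
    int teardown_runs;  /* how often the teardown body really ran */
};

/* Stand-in for device_del(): afterwards the parent kobject pointer is NULL. */
static void model_device_del(struct model_dev *dev)
{
    dev->kobj.parent = NULL;
}

/* Stand-in for put_device(). */
static void model_put_device(struct model_dev *dev)
{
    assert(dev->refcount > 0);
    dev->refcount--;
}

static void model_destroy_dev(struct model_dev *dev)
{
    /* A NULL parent means the deletion step already ran: do nothing. */
    if (!dev->kobj.parent)
        return;

    model_device_del(dev);
    dev->teardown_runs++;
    model_put_device(dev);
}
```

    Calling model_destroy_dev() twice on a device whose object is kept alive
    by a second reference runs the teardown once and drops one reference, so
    no excess put is issued.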

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Bjorn Helgaas

    Rafael J. Wysocki
     

14 Jan, 2014

1 commit

  • There are multiple PCI device addition and removal code paths that may be
    run concurrently with the generic PCI bus rescan and device removal that
    can be triggered via sysfs. If that happens, it may lead to multiple
    different, potentially dangerous race conditions.

    The most straightforward way to address those problems is to run
    the code in question under the same lock that is used by the
    generic rescan/remove code in pci-sysfs.c. To prepare for those
    changes, move the definition of the global PCI remove/rescan lock
    to probe.c and provide global wrappers, pci_lock_rescan_remove()
    and pci_unlock_rescan_remove(), allowing drivers to manipulate
    that lock. Also provide pci_stop_and_remove_bus_device_locked()
    for the callers of pci_stop_and_remove_bus_device() who only need
    to hold the rescan/remove lock around it.
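
    The wrapper pattern can be sketched in userspace with a pthread mutex.
    The model_* names and the lock_held bookkeeping flag are hypothetical and
    exist only for this illustration; the kernel uses its own mutex type and
    the function names given above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* One global lock shared by the rescan and remove paths. */
static pthread_mutex_t model_rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static bool lock_held;  /* bookkeeping for the example only */

static void model_lock_rescan_remove(void)
{
    pthread_mutex_lock(&model_rescan_remove_lock);
    lock_held = true;
}

static void model_unlock_rescan_remove(void)
{
    lock_held = false;
    pthread_mutex_unlock(&model_rescan_remove_lock);
}

struct model_dev {
    bool present;
    bool removed_under_lock;  /* records whether the lock was held */
};

/* Unlocked variant: the caller is responsible for holding the lock. */
static void model_stop_and_remove(struct model_dev *dev)
{
    assert(dev);
    dev->removed_under_lock = lock_held;
    dev->present = false;
}

/* Convenience variant for callers that need the lock only around removal. */
static void model_stop_and_remove_locked(struct model_dev *dev)
{
    model_lock_rescan_remove();
    model_stop_and_remove(dev);
    model_unlock_rescan_remove();
}
```

    Callers that perform several steps under the lock would take it
    themselves and call the unlocked variant; callers with a single removal
    can use the _locked helper.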

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Bjorn Helgaas

    Rafael J. Wysocki
     

19 Dec, 2013

3 commits

  • Previously we removed the pci_dev from the bus_list and released its
    resources in pci_destroy_dev(). But that's too early: it's possible to
    call pci_destroy_dev() twice for the same device (e.g., via sysfs), and
    that will cause an oops when we try to remove it from bus_list the second
    time.

    We should remove it from the bus_list only when the last reference to the
    pci_dev has been released, i.e., in pci_release_dev().

    [bhelgaas: changelog]
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas

    Yinghai Lu
     
    To be consistent with 4bff6749905d ("PCI: Move device_del() from
    pci_stop_dev() to pci_destroy_dev()"), this changes pci_stop_root_bus()
    to use device_release_driver() instead of device_del().

    This also changes pci_remove_root_bus() to use device_unregister()
    instead of put_device() so it corresponds with the device_register()
    call in pci_create_root_bus().

    [bhelgaas: changelog]
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas
    Acked-by: Rafael J. Wysocki

    Yinghai Lu
     
  • After commit bcdde7e221a8 (sysfs: make __sysfs_remove_dir() recursive)
    I'm seeing traces analogous to the one below in Thunderbolt testing:

    WARNING: CPU: 3 PID: 76 at /scratch/rafael/work/linux-pm/fs/sysfs/group.c:214 sysfs_remove_group+0x59/0xe0()
    sysfs group ffffffff81c6c500 not found for kobject '0000:08'
    Modules linked in: ...
    CPU: 3 PID: 76 Comm: kworker/u16:7 Not tainted 3.13.0-rc1+ #76
    Hardware name: Acer Aspire S5-391/Venus , BIOS V1.02 05/29/2012
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    0000000000000009 ffff8801644b9ac8 ffffffff816b23bf 0000000000000007
    ffff8801644b9b18 ffff8801644b9b08 ffffffff81046607 ffff88016925b800
    0000000000000000 ffffffff81c6c500 ffff88016924f928 ffff88016924f800
    Call Trace:
    dump_stack+0x4e/0x71
    warn_slowpath_common+0x87/0xb0
    warn_slowpath_fmt+0x41/0x50
    ? sysfs_get_dirent_ns+0x6f/0x80
    sysfs_remove_group+0x59/0xe0
    dpm_sysfs_remove+0x3b/0x50
    device_del+0x58/0x1c0
    device_unregister+0x48/0x60
    pci_remove_bus+0x6e/0x80
    pci_remove_bus_device+0x38/0x110
    pci_remove_bus_device+0x4d/0x110
    pci_stop_and_remove_bus_device+0x19/0x20
    disable_slot+0x20/0xe0
    acpiphp_check_bridge+0xa8/0xd0
    hotplug_event+0x17d/0x220
    hotplug_event_work+0x30/0x70
    acpi_hotplug_work_fn+0x18/0x24
    process_one_work+0x261/0x450
    worker_thread+0x21e/0x370
    ? rescuer_thread+0x300/0x300
    kthread+0xd2/0xe0
    ? flush_kthread_worker+0x70/0x70
    ret_from_fork+0x7c/0xb0
    ? flush_kthread_worker+0x70/0x70

    (Mika Westerberg sees them too in his tests).

    Some investigation documented in kernel bug #65281 led me to the
    conclusion that the source of the problem is the device_del() in
    pci_stop_dev() as it now causes the sysfs directory of the device to be
    removed recursively along with all of its subdirectories. That includes
    the sysfs directory of the device's subordinate bus (dev->subordinate) and
    its "power" group.

    Consequently, when pci_remove_bus() is called for dev->subordinate in
    pci_remove_bus_device(), it calls device_unregister(&bus->dev), but at this
    point the sysfs directory of bus->dev doesn't exist any more and its
    "power" group doesn't exist either. Thus, when dpm_sysfs_remove() called
    from device_del() tries to remove that group, it triggers the above
    warning.

    That indicates a logical mistake in the design of
    pci_stop_and_remove_bus_device(), which causes bus device objects to be
    left behind their parents (bridge device objects) and can be fixed by
    moving the device_del() from pci_stop_dev() into pci_destroy_dev(), so
    pci_remove_bus() can be called for the device's subordinate bus before the
    device itself is unregistered from the hierarchy. Still, the driver, if
    any, should be detached from the device in pci_stop_dev(), so use
    device_release_driver() directly from there.

    References: https://bugzilla.kernel.org/show_bug.cgi?id=65281#c6
    Reported-by: Mika Westerberg
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Bjorn Helgaas

    Rafael J. Wysocki
     

15 Nov, 2013

1 commit


13 Apr, 2013

2 commits

  • On ACPI-based platforms, the pci_slot driver creates PCI slot devices
    according to information from ACPI tables by registering an ACPI PCI
    subdriver. The ACPI PCI subdriver will only be called when creating/
    destroying PCI root buses, and it won't be called when hot-plugging
    P2P bridges. It may cause stale PCI slot devices after hot-removing
    a P2P bridge if that bridge has associated PCI slots. And the acpiphp
    driver has the same issue too.

    This patch introduces two hook points into the PCI core, which will
    be invoked when creating/destroying PCI buses for PCI host and P2P
    bridges. They could be used to setup/destroy platform dependent stuff
    in a unified way, both at boot time and for PCI hotplug operations.

    Signed-off-by: Jiang Liu
    Signed-off-by: Yijing Wang
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Yinghai Lu
    Cc: "Rafael J. Wysocki"
    Cc: Toshi Kani
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Myron Stowe

    Jiang Liu
     
  • We always call device_register() and pci_create_legacy_files() for a
    new bus before handing out the "struct pci_bus *". Therefore, there's
    no possibility of removing the bus with pci_remove_bus() before those
    calls have been made, so we don't need to check "bus->is_added" before
    calling pci_remove_legacy_files() and device_unregister().

    [bhelgaas: changelog]
    Signed-off-by: Jiang Liu
    Signed-off-by: Yijing Wang
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Yinghai Lu
    Cc: "Rafael J. Wysocki"
    Cc: Toshi Kani

    Jiang Liu
     

26 Feb, 2013

1 commit

  • Pull PCI changes from Bjorn Helgaas:
    "Host bridge hotplug
    - Major overhaul of ACPI host bridge add/start (Rafael Wysocki, Yinghai Lu)
    - Major overhaul of PCI/ACPI binding (Rafael Wysocki, Yinghai Lu)
    - Split out ACPI host bridge and ACPI PCI device hotplug (Yinghai Lu)
    - Stop caching _PRT and make independent of bus numbers (Yinghai Lu)

    PCI device hotplug
    - Clean up cpqphp dead code (Sasha Levin)
    - Disable ARI unless device and upstream bridge support it (Yijing Wang)
    - Initialize all hot-added devices (not functions 0-7) (Yijing Wang)

    Power management
    - Don't touch ASPM if disabled (Joe Lawrence)
    - Fix ASPM link state management (Myron Stowe)

    Miscellaneous
    - Fix PCI_EXP_FLAGS accessor (Alex Williamson)
    - Disable Bus Master in pci_device_shutdown (Konstantin Khlebnikov)
    - Document hotplug resource and MPS parameters (Yijing Wang)
    - Add accessor for PCIe capabilities (Myron Stowe)
    - Drop pciehp suspend/resume messages (Paul Bolle)
    - Make pci_slot built-in only (not a module) (Jiang Liu)
    - Remove unused PCI/ACPI bind ops (Jiang Liu)
    - Removed used pci_root_bus (Bjorn Helgaas)"

    * tag 'pci-v3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits)
    PCI/ACPI: Don't cache _PRT, and don't associate them with bus numbers
    PCI: Fix PCI Express Capability accessors for PCI_EXP_FLAGS
    ACPI / PCI: Make pci_slot built-in only, not a module
    PCI/PM: Clear state_saved during suspend
    PCI: Use atomic_inc_return() rather than atomic_add_return()
    PCI: Catch attempts to disable already-disabled devices
    PCI: Disable Bus Master unconditionally in pci_device_shutdown()
    PCI: acpiphp: Remove dead code for PCI host bridge hotplug
    PCI: acpiphp: Create companion ACPI devices before creating PCI devices
    PCI: Remove unused "rc" in virtfn_add_bus()
    PCI: pciehp: Drop suspend/resume ENTRY messages
    PCI/ASPM: Don't touch ASPM if forcibly disabled
    PCI/ASPM: Deallocate upstream link state even if device is not PCIe
    PCI: Document MPS parameters pci=pcie_bus_safe, pci=pcie_bus_perf, etc
    PCI: Document hpiosize= and hpmemsize= resource reservation parameters
    PCI: Use PCI Express Capability accessor
    PCI: Introduce accessor to retrieve PCIe Capabilities Register
    PCI: Put pci_dev in device tree as early as possible
    PCI: Skip attaching driver in device_add()
    PCI: acpiphp: Keep driver loaded even if no slots found
    ...

    Linus Torvalds
     

14 Feb, 2013

1 commit

  • Devices are added to pci_pme_list when drivers use pci_enable_wake()
    or pci_wake_from_d3(), but they aren't removed from the list unless
    the driver explicitly disables wakeup. Many drivers never disable
    wakeup, so their devices remain on the list even after they are
    removed, e.g., via hotplug. A subsequent PME poll will oops when
    it tries to touch the device.

    This patch disables PME# on a device before removing it, which removes
    the device from pci_pme_list. This is safe even if the device never
    had PME# enabled.

    This oops can be triggered by unplugging a Thunderbolt ethernet adapter
    on a Macbook Pro, as reported by Daniel below.

    [bhelgaas: changelog]
    Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.com
    Reported-and-tested-by: Daniel J Blueman
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Bjorn Helgaas
    CC: stable@vger.kernel.org

    Rafael J. Wysocki
     

26 Jan, 2013

1 commit

  • According to device model documentation, the way to create/destroy PCI
    devices should be symmetric. The rule is to either use
    1) device_register()/device_unregister()
    or
    2) device_initialize()/device_add()/device_del()/put_device().

    So change PCI core logic to follow the rule and get rid of the redundant
    pci_dev_get()/pci_dev_put() pair.

    Signed-off-by: Jiang Liu
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas
    Acked-by: Rafael J. Wysocki

    Jiang Liu
     

04 Nov, 2012

1 commit


21 Sep, 2012

1 commit

  • This restores the previous behavior of stopping all child devices before
    removing any of them. The current SR-IOV design, where removing the PF
    also drops references on all the VFs, depends on having the VFs continue
    to exist after having been stopped.

    [bhelgaas: changelog]
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas

    Yinghai Lu
     

23 Aug, 2012

8 commits


14 Jun, 2012

1 commit


28 Feb, 2012

3 commits


15 Feb, 2012

1 commit

  • When hot removing a pci express module that has a pcie switch and supports
    SRIOV, we got:

    [ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
    [ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
    [ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
    [ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
    [ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
    [ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
    [ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
    [ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
    [ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain:bus:device=0000:b0:00
    [ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
    [ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain:bus:dev = 0000:b0:00
    [ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
    [ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
    [ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
    [ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
    [ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
    [ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
    [ 5926.127189] IP: pci_stop_bus_device+0x33/0x83
    [ 5926.133377] PGD 0
    [ 5926.135402] Oops: 0000 [#1] SMP
    [ 5926.138659] CPU 2
    [ 5926.140499] Modules linked in:
    ...
    [ 5926.143754]
    [ 5926.275823] Call Trace:
    [ 5926.278267] pci_stop_bus_device+0x30/0x83
    [ 5926.284180] pci_remove_bus_device+0x1a/0xba
    [ 5926.290264] pciehp_unconfigure_device+0x110/0x17b
    [ 5926.296866] ? pciehp_disable_slot+0x188/0x188
    [ 5926.303123] pciehp_disable_slot+0x11e/0x188
    [ 5926.309206] pciehp_power_thread+0x8f/0xe0
    ...

    +-[0000:80]-+-00.0-[81-8f]--
    |           +-01.0-[90-9f]--
    |           +-02.0-[a0-af]--
    |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
    |           |                               |            +-00.1 Device
    |           |                               |            +-00.2 Device
    |           |                               |            \-00.3 Device
    |           |                               \-03.0-[b3]--+-00.0 Device
    |           |                                            +-00.1 Device
    |           |                                            +-00.2 Device
    |           |                                            \-00.3 Device

    root complex: 80:02.2
    pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
    end devices are b2:00.0 and b3:00.0.
    VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3

    Root cause: when pci_stop_bus_device() stops a physical function (PF), it
    also stops and removes that PF's virtual functions (VFs), so
    list_for_each_safe(l, n, &bus->devices)
    can end up referring to a cached n that points to an already freed VF entry.

    The solution is simply to replace list_for_each_safe() with
    list_for_each_prev_safe(). This ensures that n is a valid pointer to the
    PF rather than a pointer to a freed VF (because newly added devices are
    inserted at the tail of the bus->devices list).
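
    The hazard and the fix can be reproduced in a small userspace model. The
    list helpers are a minimal re-implementation modeled on the kernel's
    list.h, the model_* types are hypothetical, and "freeing" is only flagged
    so that the demonstration itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct list_head {
    struct list_head *next, *prev;
};

struct model_dev {
    struct list_head node;  /* first member: a node pointer is a dev pointer */
    bool is_pf;
    bool freed;             /* flagged instead of really freed */
    int stopped;
};

struct model_bus {
    struct list_head devices;
    struct model_dev pf, vf[2];  /* VFs were added later, so they sit at the tail */
};

static void list_add_tail(struct list_head *e, struct list_head *head)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = NULL;
}

static void model_bus_init(struct model_bus *bus)
{
    memset(bus, 0, sizeof(*bus));
    bus->devices.next = bus->devices.prev = &bus->devices;
    bus->pf.is_pf = true;
    list_add_tail(&bus->pf.node, &bus->devices);
    list_add_tail(&bus->vf[0].node, &bus->devices);
    list_add_tail(&bus->vf[1].node, &bus->devices);
}

/* Stopping the PF also removes its VFs from the list. */
static void model_stop(struct model_bus *bus, struct model_dev *dev)
{
    size_t i;

    dev->stopped++;
    if (!dev->is_pf)
        return;
    for (i = 0; i < 2; i++) {
        list_del(&bus->vf[i].node);
        bus->vf[i].freed = true;
    }
}

/*
 * "Safe" walk in either direction: the neighbour n is cached before the
 * current entry is stopped. Returns false if n turned out to be freed.
 */
static bool stop_all(struct model_bus *bus, bool reverse)
{
    struct list_head *head = &bus->devices, *l, *n;

    for (l = reverse ? head->prev : head->next; l != head; l = n) {
        n = reverse ? l->prev : l->next;
        model_stop(bus, (struct model_dev *)l);
        if (n != head && ((struct model_dev *)n)->freed)
            return false;
    }
    return true;
}
```

    Walking forward, the cached neighbour of the PF is its first VF, which is
    gone by the time the loop advances. Walking backward, the VFs are visited
    before the PF, and the PF's cached neighbour is never one of its own VFs.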

    While reviewing the patch, Bjorn said:
    | The PCI hot-remove path calls pci_stop_bus_devices() via
    | pci_remove_bus_device().
    |
    | pci_stop_bus_devices() traverses the bus->devices list (point A below),
    | stopping each device in turn, which calls the driver remove() method. When
    | the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
    | also uses pci_remove_bus_device() to remove the VF devices from the
    | bus->devices list (point B).
    |
    | pci_remove_bus_device
    |   pci_stop_bus_device
    |     pci_stop_bus_devices(subordinate)
    |       list_for_each(bus->devices)            (point A)
    |         ...
    |           driver remove()
    |             pci_disable_sriov
    |               ...
    |                 pci_remove_bus_device(VF)    (point B)
    |
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     

11 Feb, 2012

1 commit

    While testing busn_res allocation with CardBus, I found that PCI card
    removal no longer works, and it turns out that it was broken by:

    |commit 79cc9601c3e42b4f0650fe7e69132ebce7ab48f9
    |Date: Tue Nov 22 21:06:53 2011 -0800
    |
    | PCI: Only call pci_stop_bus_device() one time for child devices at remove

    The above changed the behavior of pci_remove_behind_bridge() that
    yenta_cardbus depended on. So restore the old behavior of
    pci_remove_behind_bridge() (which requires stopping and removing all
    devices) by:

    1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
    __pci_remove_bus_device() call it instead.
    2. add pci_stop_behind_bridge that will stop devices behind a bridge
    3. add back pci_remove_behind_bridge that will stop and remove devices
    under bridge.

    -v2: update commit description a little bit.

    Tested-by: Dominik Brodowski
    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     

07 Jan, 2012

1 commit

    While debugging PCIe hotplug with SR-IOV behind a PCIe switch, I found that
    pci_stop_bus_device() is called several times for some child devices.

    So rename the original pci_remove_bus_device() to __pci_remove_bus_device()
    and make it do only the removal work. Add a new pci_remove_bus_device()
    that calls pci_stop_bus_device() once and then calls
    __pci_remove_bus_device().
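
    The effect of the split can be shown with a small tree model. The model_*
    names are hypothetical, and model_remove_old() is an assumption about the
    previous control flow (stop at every level of the recursion), written
    only to contrast it with the stop-once shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

struct model_dev {
    struct model_dev *child[MAX_CHILDREN];
    size_t nchildren;
    unsigned int stop_calls;  /* how often this device was stopped */
    bool removed;
};

/* Stops a device and everything below it. */
static void model_stop(struct model_dev *dev)
{
    size_t i;

    assert(dev);
    for (i = 0; i < dev->nchildren; i++)
        model_stop(dev->child[i]);
    dev->stop_calls++;
}

/* Assumed old shape: every level of the recursion stops its whole subtree. */
static void model_remove_old(struct model_dev *dev)
{
    size_t i;

    model_stop(dev);
    for (i = 0; i < dev->nchildren; i++)
        model_remove_old(dev->child[i]);
    dev->removed = true;
}

/* Removal work only, no stopping (the double-underscore helper's role). */
static void model_remove_only(struct model_dev *dev)
{
    size_t i;

    for (i = 0; i < dev->nchildren; i++)
        model_remove_only(dev->child[i]);
    dev->removed = true;
}

/* New shape: stop the subtree once, then remove it. */
static void model_remove_new(struct model_dev *dev)
{
    model_stop(dev);
    model_remove_only(dev);
}
```

    In a root, bridge, leaf chain, the assumed old shape stops the leaf three
    times, whereas the new shape stops every device exactly once.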

    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     

22 May, 2011

1 commit


12 Jun, 2009

1 commit