11 Nov, 2020

1 commit


30 Oct, 2020

2 commits


26 Oct, 2020

1 commit


20 Oct, 2020

2 commits


15 Oct, 2020

1 commit


08 Oct, 2020

1 commit

  • * tag 'v5.4.70': (3051 commits)
    Linux 5.4.70
    netfilter: ctnetlink: add a range check for l3/l4 protonum
    ep_create_wakeup_source(): dentry name can change under you...
    ...

    Conflicts:
    arch/arm/mach-imx/pm-imx6.c
    arch/arm64/boot/dts/freescale/imx8mm-evk.dts
    arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
    drivers/crypto/caam/caamalg.c
    drivers/gpu/drm/imx/dw_hdmi-imx.c
    drivers/gpu/drm/imx/imx-ldb.c
    drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
    drivers/mmc/host/sdhci-esdhc-imx.c
    drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
    drivers/net/ethernet/freescale/enetc/enetc.c
    drivers/net/ethernet/freescale/enetc/enetc_pf.c
    drivers/thermal/imx_thermal.c
    drivers/usb/cdns3/ep0.c
    drivers/xen/swiotlb-xen.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c

    Signed-off-by: Jason Liu

    Jason Liu
     

01 Oct, 2020

6 commits

  • [ Upstream commit fcee90cdf6f3a3a371add04d41528d5ba9c3b411 ]

    pm_runtime_get_sync() increments the runtime PM usage counter even
    when it returns an error code. Thus a pairing decrement is needed on
    the error handling path to keep the counter balanced.

    Also, call pm_runtime_disable() when pm_runtime_get_sync() returns
    an error code.

    Link: https://lore.kernel.org/r/20200521024709.2368-1-dinghao.liu@zju.edu.cn
    Signed-off-by: Dinghao Liu
    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Thierry Reding
    Signed-off-by: Sasha Levin

    Dinghao Liu
     
  • [ Upstream commit 1c1dbb2c02623db18a50c61b175f19aead800b4e ]

    pm_runtime_get_sync() increments the runtime PM usage counter even
    when it returns an error code. Thus a pairing decrement is needed on
    the error handling path to keep the counter balanced.

    Link: https://lore.kernel.org/r/20200521031355.7022-1-dinghao.liu@zju.edu.cn
    Signed-off-by: Dinghao Liu
    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Thierry Reding
    Acked-by: Vidya Sagar
    Signed-off-by: Sasha Levin

    Dinghao Liu
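Both fixes above rely on the same rule: the usage counter goes up even when the get fails, so the error path needs its own decrement. The following is a minimal userspace sketch of that rule, not the driver code. `struct fake_dev`, `fake_get_sync()`, `fake_put_noidle()` and `fake_probe()` are hypothetical stand-ins, and the log does not say which put variant the actual patches call.

```c
#include <assert.h>

/* Hypothetical stand-in for a device's runtime PM state. */
struct fake_dev {
    int usage_count;   /* models the runtime PM usage counter */
    int resume_fails;  /* nonzero: simulate a failing resume */
};

/* Models pm_runtime_get_sync(): the counter is bumped even on failure. */
static int fake_get_sync(struct fake_dev *dev)
{
    dev->usage_count++;
    return dev->resume_fails ? -1 : 0;
}

/* Models a put that only drops the count (assumed variant). */
static void fake_put_noidle(struct fake_dev *dev)
{
    assert(dev->usage_count > 0);
    dev->usage_count--;
}

/* Probe-like path: on error, undo the increment before returning. */
static int fake_probe(struct fake_dev *dev)
{
    int ret = fake_get_sync(dev);

    if (ret < 0) {
        fake_put_noidle(dev);   /* the pairing decrement */
        return ret;
    }
    return 0;   /* success: the reference stays held until a later put */
}
```

With a failing resume, `fake_probe()` returns an error and leaves `usage_count` at 0. Without the pairing decrement it would stay at 1.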
     
  • [ Upstream commit 8edf5332c39340b9583cf9cba659eb7ec71f75b5 ]

    Without this commit, a PCIe hotplug port can stop generating interrupts on
    hotplug events, so device adds and removals will not be seen:

    The pciehp interrupt handler pciehp_isr() reads the Slot Status register
    and then writes back to it to clear the bits that caused the interrupt. If
    a different interrupt event bit gets set between the read and the write,
    pciehp_isr() returns without having cleared all of the interrupt event
    bits. If this happens when the MSI isn't masked (which by default it isn't
    in handle_edge_irq(), and which it will never be when MSI per-vector
    masking is not supported), we won't get any more hotplug interrupts from
    that device.

    That is expected behavior, according to the PCIe Base Spec r5.0, section
    6.7.3.4, "Software Notification of Hot-Plug Events".

    Because the Presence Detect Changed and Data Link Layer State Changed event
    bits can both get set at nearly the same time when a device is added or
    removed, this is more likely to happen than it might seem. The issue was
    found (and can be reproduced rather easily) by connecting and disconnecting
    an NVMe storage device on at least one system model where the NVMe devices
    were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).

    Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
    Status register immediately after writing to it, until it sees that all of
    the event status bits have been cleared.

    [lukas: drop loop count limitation, write "events" instead of "status",
    don't loop back in INTx and poll modes, tweak code comment & commit msg]
    Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
    Tested-by: Stuart Hayes
    Signed-off-by: Stuart Hayes
    Signed-off-by: Lukas Wunner
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Joerg Roedel
    Signed-off-by: Sasha Levin

    Stuart Hayes
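The re-read loop described above can be sketched in userspace with a simulated write-1-to-clear register. This is an illustration under assumptions, not the pciehp code: `struct fake_slot`, `slot_read()`, `slot_write()` and `collect_events()` are hypothetical, and the `pending` field exists only to model a second event arriving between the read and the write.

```c
#include <assert.h>
#include <stdint.h>

#define EV_PDC   0x0008u  /* Presence Detect Changed */
#define EV_DLLSC 0x0100u  /* Data Link Layer State Changed */
#define EV_MASK  (EV_PDC | EV_DLLSC)

/* Hypothetical model of a Slot Status register. */
struct fake_slot {
    uint16_t status;   /* write-1-to-clear event bits */
    uint16_t pending;  /* event that lands right after the next read */
};

static uint16_t slot_read(struct fake_slot *s)
{
    uint16_t v = s->status;

    /* Simulate a second event arriving between the read and the write. */
    s->status |= s->pending;
    s->pending = 0;
    return v;
}

static void slot_write(struct fake_slot *s, uint16_t v)
{
    s->status &= (uint16_t)~v;  /* writing 1 clears the bit */
}

/* Accumulate events, re-reading until no event bit remains set. */
static uint16_t collect_events(struct fake_slot *s)
{
    uint16_t events = 0, status;

    while ((status = slot_read(s) & EV_MASK) != 0) {
        events |= status;
        slot_write(s, status);
    }
    return events;
}
```

A single read-then-write pass would return with `EV_DLLSC` still set in the register. The loop returns only after both bits have been seen and cleared.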
     
  • [ Upstream commit 72e0ef0e5f067fd991f702f0b2635d911d0cf208 ]

    On some EFI systems, the video BIOS is provided by the EFI firmware. The
    boot stub code stores the physical address of the ROM image in pdev->rom.
    Currently we attempt to access this pointer using phys_to_virt(), which
    doesn't work with CONFIG_HIGHMEM.

    On these systems, attempting to load the radeon module on a x86_32 kernel
    can result in the following:

    BUG: unable to handle page fault for address: 3e8ed03c
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    *pde = 00000000
    Oops: 0000 [#1] PREEMPT SMP
    CPU: 0 PID: 317 Comm: systemd-udevd Not tainted 5.6.0-rc3-next-20200228 #2
    Hardware name: Apple Computer, Inc. MacPro1,1/Mac-F4208DC8, BIOS MP11.88Z.005C.B08.0707021221 07/02/07
    EIP: radeon_get_bios+0x5ed/0xe50 [radeon]
    Code: 00 00 84 c0 0f 85 12 fd ff ff c7 87 64 01 00 00 00 00 00 00 8b 47 08 8b 55 b0 e8 1e 83 e1 d6 85 c0 74 1a 8b 55 c0 85 d2 74 13 38 55 75 0e 80 78 01 aa 0f 84 a4 03 00 00 8d 74 26 00 68 dc 06
    EAX: 3e8ed03c EBX: 00000000 ECX: 3e8ed03c EDX: 00010000
    ESI: 00040000 EDI: eec04000 EBP: eef3fc60 ESP: eef3fbe0
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206
    CR0: 80050033 CR2: 3e8ed03c CR3: 2ec77000 CR4: 000006d0
    Call Trace:
    r520_init+0x26/0x240 [radeon]
    radeon_device_init+0x533/0xa50 [radeon]
    radeon_driver_load_kms+0x80/0x220 [radeon]
    drm_dev_register+0xa7/0x180 [drm]
    radeon_pci_probe+0x10f/0x1a0 [radeon]
    pci_device_probe+0xd4/0x140

    Fix the issue by updating all drivers which can access a platform provided
    ROM. Instead of calling the helper function pci_platform_rom() which uses
    phys_to_virt(), call ioremap() directly on the pdev->rom.

    radeon_read_platform_bios() previously directly accessed an __iomem
    pointer. Avoid this by calling memcpy_fromio() instead of kmemdup().

    pci_platform_rom() now has no remaining callers, so remove it.

    Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com
    Signed-off-by: Mikel Rychliski
    Signed-off-by: Bjorn Helgaas
    Acked-by: Alex Deucher
    Signed-off-by: Sasha Levin

    Mikel Rychliski
     
  • [ Upstream commit c13704f5685deb7d6eb21e293233e0901ed77377 ]

    Previously, the kernel sometimes assigned more MMIO or MMIO_PREF space than
    desired. For example, if the user requested 128M of space with
    "pci=realloc,hpmemsize=128M", we sometimes assigned 256M:

    pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
    pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M

    With this patch applied:

    pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
    pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M

    This happened when in the first pass, the MMIO_PREF succeeded but the MMIO
    failed. In the next pass, because MMIO_PREF was already assigned, the
    attempt to assign MMIO_PREF returned an error code instead of success
    (nothing more to do, already allocated). Hence, the size which was actually
    allocated, but thought to have failed, was placed in the MMIO window.

    The bug resulted in the MMIO_PREF being added to the MMIO window, which
    meant doubling if MMIO_PREF size = MMIO size. With a large MMIO_PREF, the
    MMIO window would likely fail to be assigned altogether due to lack of
    32-bit address space.

    Change find_free_bus_resource() to do the following:

    - Return first unassigned resource of the correct type.
    - If there is none, return first assigned resource of the correct type.
    - If none of the above, return NULL.

    Returning an assigned resource of the correct type allows the caller to
    distinguish between already assigned and no resource of the correct type.

    Add checks in pbus_size_io() and pbus_size_mem() to return success if
    resource returned from find_free_bus_resource() is already allocated.

    This avoids pbus_size_io() and pbus_size_mem() returning error code to
    __pci_bus_size_bridges() when a resource has been successfully assigned in
    a previous pass. This fixes the existing behaviour where space for a
    resource could be reserved multiple times in different parent bridge
    windows.

    Link: https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=203243
    Link: https://lore.kernel.org/r/PS2P216MB075563AA6AD242AA666EDC6A80760@PS2P216MB0755.KORP216.PROD.OUTLOOK.COM
    Reported-by: Kit Chow
    Reported-by: Nicholas Johnson
    Signed-off-by: Nicholas Johnson
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Mika Westerberg
    Reviewed-by: Logan Gunthorpe
    Signed-off-by: Sasha Levin

    Nicholas Johnson
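The selection order described for find_free_bus_resource(), together with the caller-side check, can be sketched as follows. This is a simplified userspace model under assumptions: `struct fake_res`, `find_free()` and `size_window()` are hypothetical names, and the real functions work on bridge windows of a struct pci_bus rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

enum res_type { RES_IO, RES_MEM, RES_MEM_PREF };

/* Hypothetical bridge window. */
struct fake_res {
    enum res_type type;
    int assigned;   /* nonzero once an earlier pass placed it */
    int sized;      /* set when this pass computes a size for it */
};

/* First unassigned match, else first assigned match, else NULL. */
static struct fake_res *find_free(struct fake_res *res, size_t n,
                                  enum res_type type)
{
    struct fake_res *first_assigned = NULL;

    for (size_t i = 0; i < n; i++) {
        if (res[i].type != type)
            continue;
        if (!res[i].assigned)
            return &res[i];
        if (!first_assigned)
            first_assigned = &res[i];
    }
    return first_assigned;
}

/* Sizing step: 0 on success, -1 if there is no window of this type. */
static int size_window(struct fake_res *res, size_t n, enum res_type type)
{
    struct fake_res *r = find_free(res, n, type);

    if (!r)
        return -1;
    if (r->assigned)
        return 0;   /* placed in an earlier pass: nothing more to do */
    r->sized = 1;   /* stand-in for the real size calculation */
    return 0;
}
```

Because an already assigned window now reports success, the caller has no reason to push its size into a window of another type.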
     
  • [ Upstream commit 35ff867b76576e32f34c698ccd11343f7d616204 ]

    When sriov_numvfs is being updated, we call the driver->sriov_configure()
    function, which may enable VFs and call probe functions, which may make new
    devices visible. This all happens before sriov_numvfs_store()
    updates sriov->num_VFs, so previously, concurrent sysfs reads of
    sriov_numvfs returned stale values.

    Serialize the sysfs read vs the write so the read returns the correct
    num_VFs value.

    [bhelgaas: hold device_lock instead of checking mutex_is_locked()]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=202991
    Link: https://lore.kernel.org/r/20190911072736.32091-1-pierre.cregut@orange.com
    Signed-off-by: Pierre Crégut
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Pierre Crégut
     

22 Sep, 2020

2 commits

  • Invoke the MIC driver API to probe the MIC card driver when imx_mic_epf
    sets up the PCI outbound configuration.

    Signed-off-by: Joakim Zhang
    Signed-off-by: Sherry Sun
    Reviewed-by: Frank Li
    Reviewed-by: Fugang Duan

    Sherry Sun
     
  • Add i.MX MIC endpoint function driver:
    bar0: fetch shared memory reserved by RC side.
    bar2: mapping MU register memory to RC side.
    bar4: mapping swiotlb region to RC side.

    Note: Since pcie map address and size should be aligned, for 64M swiotlb
    size, pci map address should be 64M aligned, but swiotlb region starts
    from 0xfbfff000, which is not 64M aligned. So for BAR4, we mapped 128M
    size from 0xf8000000 directly to avoid mapping errors.

    This endpoint driver is for MIC; a following patch invokes the MIC driver
    API to probe the MIC driver.

    Signed-off-by: Joakim Zhang
    Signed-off-by: Sherry Sun
    Reviewed-by: Frank Li
    Reviewed-by: Fugang Duan

    Sherry Sun
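The alignment note for BAR4 can be checked with a little arithmetic. The sketch below uses the two figures given in the commit message (a 64M swiotlb region starting at 0xfbfff000). The helper names `align_down()` and `window_covers()` are hypothetical, and the power-of-two alignment rule is the one the note describes.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64M  0x04000000ull
#define SZ_128M 0x08000000ull

/* Round addr down to a multiple of size (size must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t size)
{
    return addr & ~(size - 1);
}

/*
 * True if a window of `win` bytes, placed at the win-aligned address at
 * or below `start`, covers the whole range [start, start + len).
 */
static int window_covers(uint64_t start, uint64_t len, uint64_t win)
{
    uint64_t base = align_down(start, win);

    return start + len <= base + win;
}
```

A 64M window aligned at 0xf8000000 ends at 0xfc000000, while the region runs up to 0xfffff000, so it does not fit. A 128M window at the same base ends at 0x100000000 and covers it.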
     

03 Sep, 2020

4 commits

  • [ Upstream commit ee367e2cdd2202b5714982739e684543cd2cee0e ]

    Add missing ext reset used by ipq8064 SoC in PCIe qcom driver.

    Link: https://lore.kernel.org/r/20200615210608.21469-5-ansuelsmth@gmail.com
    Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
    Signed-off-by: Sham Muthayyan
    Signed-off-by: Ansuel Smith
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Rob Herring
    Reviewed-by: Philipp Zabel
    Acked-by: Stanimir Varbanov
    Cc: stable@vger.kernel.org # v4.5+
    Signed-off-by: Sasha Levin

    Ansuel Smith
     
  • [ Upstream commit dd58318c019f10bc94db36df66af6c55d4c0cbba ]

    The deinit issues reset_control_assert for PCI twice and does not contain
    phy reset.

    Link: https://lore.kernel.org/r/20200615210608.21469-4-ansuelsmth@gmail.com
    Signed-off-by: Abhishek Sahu
    Signed-off-by: Ansuel Smith
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Rob Herring
    Acked-by: Stanimir Varbanov
    Signed-off-by: Sasha Levin

    Abhishek Sahu
     
  • [ Upstream commit 8b6f0330b5f9a7543356bfa9e76d580f03aa2c1e ]

    Aux and Ref clk are missing in PCIe qcom driver. Add support for these
    optional clks for ipq8064/apq8064 SoC.

    Link: https://lore.kernel.org/r/20200615210608.21469-2-ansuelsmth@gmail.com
    Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
    Signed-off-by: Sham Muthayyan
    Signed-off-by: Ansuel Smith
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Rob Herring
    Acked-by: Stanimir Varbanov
    Signed-off-by: Sasha Levin

    Ansuel Smith
     
  • [ Upstream commit 8a94644b440eef5a7b9c104ac8aa7a7f413e35e5 ]

    kobject_init_and_add() takes a reference even when it fails. If it returns
    an error, kobject_put() must be called to clean up the memory associated
    with the object.

    When kobject_init_and_add() fails, call kobject_put() instead of kfree().

    b8eb718348b8 ("net-sysfs: Fix reference count leak in
    rx|netdev_queue_add_kobject") fixed a similar problem.

    Link: https://lore.kernel.org/r/20200528021322.1984-1-wu000273@umn.edu
    Signed-off-by: Qiushi Wu
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Qiushi Wu
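Why a put is needed instead of a plain free can be illustrated with a small userspace model. It assumes, as the message says, that the init-and-add step takes a reference and allocates object-owned memory before it can fail. `struct fake_kobj`, `fake_init_and_add()`, `fake_put()`, `fake_create()` and the `live_names` counter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted object that owns a separately allocated name. */
struct fake_kobj {
    int refcount;
    char *name;
};

static int live_names;  /* name allocations not yet freed */

/* Takes a reference and allocates the name even if the add step fails. */
static int fake_init_and_add(struct fake_kobj *k, const char *name,
                             int fail_add)
{
    k->refcount = 1;
    k->name = malloc(strlen(name) + 1);
    if (!k->name)
        return -1;
    strcpy(k->name, name);
    live_names++;
    return fail_add ? -1 : 0;
}

/* Dropping the last reference releases everything the object owns. */
static void fake_put(struct fake_kobj *k)
{
    if (--k->refcount == 0) {
        if (k->name) {
            free(k->name);
            live_names--;
        }
        free(k);
    }
}

static int fake_create(const char *name, int fail_add)
{
    struct fake_kobj *k = calloc(1, sizeof(*k));

    if (!k)
        return -1;
    if (fake_init_and_add(k, name, fail_add) < 0) {
        fake_put(k);    /* not free(k): that would leak k->name */
        return -1;
    }
    fake_put(k);        /* demo only: drop the object right away */
    return 0;
}
```

Calling `free(k)` on the error path would leave `live_names` at 1, which is the leak the commit describes.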
     

21 Aug, 2020

5 commits

  • commit de3c4bf648975ea0b1d344d811e9b0748907b47c upstream.

    Add tx term offset support to pcie qcom driver, needed in some revisions
    of the ipq806x SoC. Ipq8064 needs tx term offset set to 7.

    Link: https://lore.kernel.org/r/20200615210608.21469-9-ansuelsmth@gmail.com
    Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
    Signed-off-by: Sham Muthayyan
    Signed-off-by: Ansuel Smith
    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Stanimir Varbanov
    Cc: stable@vger.kernel.org # v4.5+
    Signed-off-by: Greg Kroah-Hartman

    Ansuel Smith
     
  • commit 5149901e9e6deca487c01cc434a3ac4125c7b00b upstream.

    Set specific values for Tx De-Emphasis, Tx Swing and Rx equalization
    needed on some ipq8064 based devices (Netgear R7800 for example). Without
    this the system locks up on kernel load.

    Link: https://lore.kernel.org/r/20200615210608.21469-8-ansuelsmth@gmail.com
    Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
    Signed-off-by: Ansuel Smith
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Rob Herring
    Acked-by: Stanimir Varbanov
    Cc: stable@vger.kernel.org # v4.5+
    Signed-off-by: Greg Kroah-Hartman

    Ansuel Smith
     
  • commit 2194bc7c39610be7cabe7456c5f63a570604f015 upstream.

    device_attach() returning failure indicates a driver error while trying to
    probe the device. In such a scenario, the PCI device should still be added
    in the system and be visible to the user.

    When device_attach() fails, merely warn about it and keep the PCI device in
    the system.

    This partially reverts ab1a187bba5c ("PCI: Check device_attach() return
    value always").

    Link: https://lore.kernel.org/r/20200706233240.3245512-1-rajatja@google.com
    Signed-off-by: Rajat Jain
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Greg Kroah-Hartman
    Cc: stable@vger.kernel.org # v4.6+
    Signed-off-by: Greg Kroah-Hartman

    Rajat Jain
     
  • commit 45beb31d3afb651bb5c41897e46bd4fa9980c51c upstream.

    We are seeing AMD Radeon Pro W5700 doesn't work when IOMMU is enabled:

    iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=63:00.0 address=0x42b5b01a0]
    iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=63:00.0 address=0x42b5b01c0]

    The error also makes graphics driver fail to probe the device.

    It appears to be the same issue that commit 5e89cd303e3a ("PCI: Mark AMD
    Navi14 GPU rev 0xc5 ATS as broken") addresses, and indeed the same ATS
    quirk can work around the issue.

    See-also: 5e89cd303e3a ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken")
    See-also: d28ca864c493 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken")
    See-also: 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken")
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208725
    Link: https://lore.kernel.org/r/20200728104554.28927-1-kai.heng.feng@canonical.com
    Signed-off-by: Kai-Heng Feng
    Signed-off-by: Bjorn Helgaas
    Acked-by: Alex Deucher
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Kai-Heng Feng
     
  • commit dae68d7fd4930315389117e9da35b763f12238f9 upstream.

    If context is not NULL in acpiphp_grab_context(), but the
    is_going_away flag is set for the device's parent, the reference
    counter of the context needs to be decremented before returning
    NULL or the context will never be freed, so make that happen.

    Fixes: edf5bf34d408 ("ACPI / dock: Use callback pointers from devices' ACPI hotplug contexts")
    Reported-by: Vasily Averin
    Cc: 3.15+ # 3.15+
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     

19 Aug, 2020

5 commits

  • commit ec0160891e387f4771f953b888b1fe951398e5d9 upstream.

    Commit 711419e504eb ("irqdomain: Add the missing assignment of
    domain->fwnode for named fwnode") unintentionally caused a dangling pointer
    page fault issue on firmware nodes that were freed after IRQ domain
    allocation. Commit e3beca48a45b fixed that dangling pointer issue by only
    freeing the firmware node after an IRQ domain allocation failure. That fix
    no longer frees the firmware node immediately, but leaves the firmware node
    allocated after the domain is removed.

    The firmware node must be kept around through irq_domain_remove, but should
    be freed afterwards.

    Add the missing free operations after domain removal where appropriate.

    Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated")
    Signed-off-by: Jon Derrick
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Andy Shevchenko
    Acked-by: Bjorn Helgaas # drivers/pci
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/1595363169-7157-1-git-send-email-jonathan.derrick@intel.com
    Signed-off-by: Greg Kroah-Hartman

    Jon Derrick
     
  • [ Upstream commit 090688fa4e448284aaa16136372397d7d10814db ]

    The acpi_get_table() should be coupled with acpi_put_table() if the mapped
    table is not used at runtime to release the table mapping.

    In pci_quirk_amd_sb_acs(), IVRS table is just used for checking AMD IOMMU
    is supported, not used at runtime, so put the table after using it.

    Fixes: 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge devices")
    Link: https://lore.kernel.org/r/1595411068-15440-1-git-send-email-guohanjun@huawei.com
    Signed-off-by: Hanjun Guo
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Hanjun Guo
     
  • [ Upstream commit e3bca37d15dca118f2ef1f0a068bb6e07846ea20 ]

    Commit 1b79c5284439 ("PCI: cadence: Add host driver for Cadence PCIe
    controller") wrote directly to the PCI_VENDOR_ID register in order to
    update the Vendor ID. However, PCI_VENDOR_ID in the root port configuration
    space is a read-only register, so writing to it has no effect.
    Use the local management register to configure Vendor ID and Subsystem
    Vendor ID.

    Link: https://lore.kernel.org/r/20200722110317.4744-10-kishon@ti.com
    Fixes: 1b79c5284439 ("PCI: cadence: Add host driver for Cadence PCIe controller")
    Signed-off-by: Kishon Vijay Abraham I
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Rob Herring
    Signed-off-by: Sasha Levin

    Kishon Vijay Abraham I
     
  • [ Upstream commit 3167e3d340c092fd47924bc4d23117a3074ef9a9 ]

    When I cat ASPM parameter 'policy' by sysfs, it displays as follows. Add a
    newline for easy reading. Other sysfs attributes already include a
    newline.

    [root@localhost ~]# cat /sys/module/pcie_aspm/parameters/policy
    [default] performance powersave powersupersave [root@localhost ~]#

    Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
    Link: https://lore.kernel.org/r/1594972765-10404-1-git-send-email-wangxiongfeng2@huawei.com
    Signed-off-by: Xiongfeng Wang
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Xiongfeng Wang
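The effect of the fix can be sketched with a userspace formatter that builds the same line and terminates it with a newline. `policy_show()` and `policy_names` are hypothetical names for this illustration. The kernel's parameter getter has a different signature.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char *const policy_names[] = {
    "default", "performance", "powersave", "powersupersave",
};

/*
 * Write all policy names into buf, bracketing the active one, and end
 * the line with a newline. The caller must pass at least 64 bytes.
 */
static int policy_show(char *buf, size_t size, size_t active)
{
    size_t n = sizeof(policy_names) / sizeof(policy_names[0]);
    int cnt = 0;

    assert(size >= 64);
    for (size_t i = 0; i < n; i++) {
        if (i == active)
            cnt += snprintf(buf + cnt, size - cnt, "[%s] ", policy_names[i]);
        else
            cnt += snprintf(buf + cnt, size - cnt, "%s ", policy_names[i]);
    }
    cnt += snprintf(buf + cnt, size - cnt, "\n");   /* the added newline */
    return cnt;
}
```

With the newline in place, the shell prompt starts on its own line after `cat` prints the value.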
     
  • [ Upstream commit 2a7e32d0547f41c5ce244f84cf5d6ca7fccee7eb ]

    The pci_cfg_wait queue is used to prevent user-space config accesses to
    devices while they are recovering from reset.

    Previously we used these operations on pci_cfg_wait:

    __add_wait_queue(&pci_cfg_wait, ...)
    __remove_wait_queue(&pci_cfg_wait, ...)
    wake_up_all(&pci_cfg_wait)

    The wake_up acquires the wait queue lock, but the add and remove do not.

    Originally these were all protected by the pci_lock, but cdcb33f98244
    ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved
    wake_up_all() outside pci_lock, so it could race with add/remove
    operations, which caused occasional kernel panics, e.g., during vfio-pci
    hotplug/unplug testing:

    Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000

    Resolve this by using wait_event() instead of __add_wait_queue() and
    __remove_wait_queue(). The wait queue lock is held by both wait_event()
    and wake_up_all(), so it provides mutual exclusion.

    Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock")
    Link: https://lore.kernel.org/linux-pci/79827f2f-9b43-4411-1376-b9063b67aee3@huawei.com/T/#u
    Based-on: https://lore.kernel.org/linux-pci/20191210031527.40136-1-zhengxiang9@huawei.com/
    Based-on-patch-by: Xiang Zheng
    Signed-off-by: Bjorn Helgaas
    Tested-by: Xiang Zheng
    Cc: Heyi Guo
    Cc: Biaoxiang Ye
    Signed-off-by: Sasha Levin

    Bjorn Helgaas
     

11 Aug, 2020

1 commit

  • commit e7b856dfcec6d3bf028adee8c65342d7035914a1 upstream.

    As reported in https://bugzilla.kernel.org/206217 , raw_violation_fixup
    is causing more harm than good in some common use-cases.

    This patch is a partial revert of commit:

    191cd6fb5d2c ("PCI: tegra: Add SW fixup for RAW violations")

    and fixes the following regression since then.

    * Description:

    When both the NIC and MMC are used one can see the following message:

    NETDEV WATCHDOG: enp1s0 (r8169): transmit queue 0 timed out

    and

    pcieport 0000:00:02.0: AER: Uncorrected (Non-Fatal) error received: 0000:01:00.0
    r8169 0000:01:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
    r8169 0000:01:00.0: AER: device [10ec:8168] error status/mask=00004000/00400000
    r8169 0000:01:00.0: AER: [14] CmpltTO (First)
    r8169 0000:01:00.0: AER: can't recover (no error_detected callback)
    pcieport 0000:00:02.0: AER: device recovery failed

    After that, the ethernet NIC is not functional anymore even after
    reloading the r8169 module. After a reboot, this is reproducible by
    copying a large file over the NIC to the MMC.

    For some reason this is not reproducible when files are copied to a tmpfs.

    * Little background on the fixup, by Manikanta Maddireddy:
    "In the internal testing with dGPU on Tegra124, CmplTO is reported by
    dGPU. This happened because FIFO queue in AFI(AXI to PCIe) module
    get full by upstream posted writes. Back to back upstream writes
    interleaved with infrequent reads, triggers RAW violation and CmpltTO.
    This is fixed by reducing the posted write credits and by changing
    updateFC timer frequency. These settings are fixed after stress test.

    In the current case, RTL NIC is also reporting CmplTO. These settings
    seems to be aggravating the issue instead of fixing it."

    Link: https://lore.kernel.org/r/20200718100710.15398-1-kwizart@gmail.com
    Fixes: 191cd6fb5d2c ("PCI: tegra: Add SW fixup for RAW violations")
    Signed-off-by: Nicolas Chauvet
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Manikanta Maddireddy
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Chauvet
     

05 Aug, 2020

1 commit

  • commit b361663c5a40c8bc758b7f7f2239f7a192180e7c upstream.

    Recently ASPM handling was changed to allow ASPM on PCIe-to-PCI/PCI-X
    bridges. Unfortunately the ASMedia ASM1083/1085 PCIe to PCI bridge device
    doesn't seem to function properly with ASPM enabled. On an Asus PRIME
    H270-PRO motherboard, it causes errors like these:

    pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
    pcieport 0000:00:1c.0: AER: device [8086:a292] error status/mask=00003000/00002000
    pcieport 0000:00:1c.0: AER: [12] Timeout
    pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
    pcieport 0000:00:1c.0: AER: can't find device of ID00e0

    In addition to flooding the kernel log, this also causes the machine to
    wake up immediately after suspend is initiated.

    The device advertises ASPM L0s and L1 support in the Link Capabilities
    register, but the ASMedia web page for ASM1083 [1] claims "No PCIe ASPM
    support".

    Windows 10 (build 2004) enables L0s, but it also logs correctable PCIe
    errors.

    Add a quirk to disable ASPM for this device.

    [1] https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114

    [bhelgaas: commit log]
    Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208667
    Link: https://lore.kernel.org/r/20200722021803.17958-1-hancockrwd@gmail.com
    Signed-off-by: Robert Hancock
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Robert Hancock
     

29 Jul, 2020

2 commits

  • [ Upstream commit d08c30d7a0d1826f771f16cde32bd86e48401791 ]

    This reverts commit ec411e02b7a2e785a4ed9ed283207cd14f48699d.

    Patrick reported that this commit broke hybrid graphics on a ThinkPad X1
    Extreme 2nd with Intel UHD Graphics 630 and NVIDIA GeForce GTX 1650 Mobile:

    nouveau 0000:01:00.0: fifo: PBDMA0: 01000000 [] ch 0 [00ff992000 DRM] subc 0 mthd 0008 data 00000000

    Karol reported that this commit broke Nouveau firmware loading on a Lenovo
    P1G2 with Intel UHD Graphics 630 and NVIDIA TU117GLM [Quadro T1000 Mobile]:

    nouveau 0000:01:00.0: acr: AHESASC binary failed

    In both cases, reverting ec411e02b7a2 solved the problem. Unfortunately,
    this revert will reintroduce the "Thunderbolt bridges take long time to
    resume from D3cold" problem:
    https://bugzilla.kernel.org/show_bug.cgi?id=206837

    Link: https://lore.kernel.org/r/CAErSpo5sTeK_my1dEhWp7aHD0xOp87+oHYWkTjbL7ALgDbXo-Q@mail.gmail.com
    Link: https://lore.kernel.org/r/CACO55tsAEa5GXw5oeJPG=mcn+qxNvspXreJYWDJGZBy5v82JDA@mail.gmail.com
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=208597
    Reported-by: Patrick Volkerding
    Reported-by: Karol Herbst
    Fixes: ec411e02b7a2 ("PCI/PM: Assume ports without DLL Link Active train links in 100 ms")
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Bjorn Helgaas
     
  • [ Upstream commit e3beca48a45b5e0e6e6a4e0124276b8248dcc9bb ]

    Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
    IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
    creating the irqdomain. The only purpose of these FW nodes is to convey
    name information. When this was introduced the core code did not store the
    pointer to the node in the irqdomain. A recent change stored the firmware
    node pointer in irqdomain for other reasons and failed to notice that the
    usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
    are broken by this. Storing a dangling pointer is dangerous itself, but in
    case that the domain is destroyed later on this leads to a double free.

    Remove the freeing of the firmware node after creating the irqdomain from
    all affected call sites to cure this.

    Fixes: 711419e504eb ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
    Reported-by: Andy Shevchenko
    Signed-off-by: Thomas Gleixner
    Acked-by: Bjorn Helgaas
    Acked-by: Marc Zyngier
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/873661qakd.fsf@nanos.tec.linutronix.de
    Signed-off-by: Sasha Levin

    Thomas Gleixner
     

22 Jul, 2020

1 commit

  • commit c3aaf086701d05a82c8156ee8620af41e5a7d6fe upstream.

    26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports") added
    the struct pci_platform_pm_ops.bridge_d3() function pointer and
    platform_pci_bridge_d3() to use it.

    The .bridge_d3() op is implemented by acpi_pci_platform_pm, but not by
    mid_pci_platform_pm. We don't expect platform_pci_bridge_d3() to be called
    on Intel MID platforms, but nothing in the code itself would prevent that.

    Check the .bridge_d3() pointer for NULL before calling it.

    Fixes: 26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports")
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Mika Westerberg
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Helgaas
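The NULL check can be sketched with an ops table in which the callback is optional. The types and names below (`struct fake_pm_ops`, `acpi_like_ops`, `mid_like_ops`, `platform_bridge_d3()`) are hypothetical stand-ins for illustration and do not reproduce the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_dev {
    bool wants_d3;
};

/* Hypothetical platform PM ops table; bridge_d3 is optional. */
struct fake_pm_ops {
    bool (*bridge_d3)(struct fake_dev *dev);
};

static bool acpi_like_bridge_d3(struct fake_dev *dev)
{
    return dev->wants_d3;
}

/* One platform implements the op, the other leaves it NULL. */
static const struct fake_pm_ops acpi_like_ops = {
    .bridge_d3 = acpi_like_bridge_d3,
};
static const struct fake_pm_ops mid_like_ops = {
    .bridge_d3 = NULL,
};

/* Call the op only if the platform provides it; default to false. */
static bool platform_bridge_d3(const struct fake_pm_ops *ops,
                               struct fake_dev *dev)
{
    if (ops && ops->bridge_d3)
        return ops->bridge_d3(dev);
    return false;
}
```

Without the check, calling through `mid_like_ops` would dereference a NULL function pointer.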
     

24 Jun, 2020

5 commits

  • [ Upstream commit 87dccf09323fc363bd0d072fcc12b96622ab8c69 ]

    The vim3l board does not work with a standard PCIe switch (ASM1184e),
    spitting all kind of errors - hinting at HW misconfiguration (no link,
    port enumeration issues, etc).

    According to the Synopsys DWC PCIe Reference Manual, in the section
    dedicated to the PLCR register, bit 7 is described (FAST_LINK_MODE) as:

    "Sets all internal timers to fast mode for simulation purposes."

    It is sound to set this bit from a simulation perspective, but on actual
    silicon, which expects timers to have a nominal value, it is not.

    Make sure the FAST_LINK_MODE bit is cleared when configuring the RC
    to solve this problem.

    Link: https://lore.kernel.org/r/20200429164230.309922-1-maz@kernel.org
    Fixes: 9c0ef6d34fdb ("PCI: amlogic: Add the Amlogic Meson PCIe controller driver")
    Signed-off-by: Marc Zyngier
    [lorenzo.pieralisi@arm.com: commit log]
    Signed-off-by: Lorenzo Pieralisi
    Reviewed-by: Neil Armstrong
    Acked-by: Rob Herring
    Signed-off-by: Sasha Levin

    Marc Zyngier
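Clearing the bit amounts to a single mask operation on the register value. The sketch below takes only the bit position (bit 7) from the message. The function name `plcr_for_silicon()` and the sample register values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FAST_LINK_MODE (1u << 7)   /* bit 7 of the PLCR register */

/* Return the register value with the simulation-only bit cleared. */
static uint32_t plcr_for_silicon(uint32_t plcr)
{
    return plcr & ~FAST_LINK_MODE;
}
```

In a driver this would sit in a read-modify-write sequence: read the register, clear the bit, write the result back.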
     
  • [ Upstream commit 0414b93e78d87ecc24ae1a7e61fe97deb29fa2f4 ]

    On a system that uses the internal DWC MSI widget, I get this
    warning from debugfs when CONFIG_GENERIC_IRQ_DEBUGFS is selected:

    debugfs: File ':soc:pcie@fc000000' in directory 'domains' already present!

    This is due to the fact that the DWC MSI code tries to register two
    IRQ domains for the same firmware node, without telling the low
    level code how to distinguish them (by setting a bus token). This
    further confuses debugfs which tries to create corresponding
    files for each domain.

    Fix it by tagging the inner domain as DOMAIN_BUS_NEXUS, which is
    the closest thing we have as to "generic MSI".

    Link: https://lore.kernel.org/r/20200501113921.366597-1-maz@kernel.org
    Signed-off-by: Marc Zyngier
    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Jingoo Han
    Signed-off-by: Sasha Levin

    Marc Zyngier
     
  • [ Upstream commit 7b38fd9760f51cc83d80eed2cfbde8b5ead9e93a ]

    Except for Endpoints, we enable PTM at enumeration-time. Previously we did
    not account for the fact that Switch Downstream Ports are not permitted to
    have a PTM capability; their PTM behavior is controlled by the Upstream
    Port (PCIe r5.0, sec 7.9.16). Since Downstream Ports don't have a PTM
    capability, we did not mark them as "ptm_enabled", which meant that
    pci_enable_ptm() on an Endpoint failed because there was no PTM path to it.

    Mark Downstream Ports as "ptm_enabled" if their Upstream Port has PTM
    enabled.

    Fixes: eec097d43100 ("PCI: Add pci_enable_ptm() for drivers to enable PTM on endpoints")
    Reported-by: Aditya Paluri
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Bjorn Helgaas
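The inheritance rule can be sketched with a tiny device tree model. This is a simplification under assumptions: `struct fake_pci_dev`, `ptm_init()` and `enable_ptm_on_endpoint()` are hypothetical, and the real code checks more conditions (port types, capability registers) than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified PCI device. */
struct fake_pci_dev {
    struct fake_pci_dev *upstream;  /* device above, NULL at the top */
    bool has_ptm_cap;
    bool is_downstream_port;
    bool ptm_enabled;
};

/* Enumeration-time step for non-Endpoint devices. */
static void ptm_init(struct fake_pci_dev *dev)
{
    if (!dev->has_ptm_cap) {
        /* Downstream Ports have no capability: follow the Upstream Port. */
        if (dev->is_downstream_port && dev->upstream &&
            dev->upstream->ptm_enabled)
            dev->ptm_enabled = true;
        return;
    }
    dev->ptm_enabled = true;
}

/* An Endpoint can enable PTM only if the device above it has it enabled. */
static bool enable_ptm_on_endpoint(struct fake_pci_dev *ep)
{
    if (!ep->has_ptm_cap)
        return false;
    if (!ep->upstream || !ep->upstream->ptm_enabled)
        return false;
    ep->ptm_enabled = true;
    return true;
}
```

If the Downstream Port were left unmarked, `enable_ptm_on_endpoint()` would fail even though the Upstream Port has PTM enabled.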
     
  • [ Upstream commit ec411e02b7a2e785a4ed9ed283207cd14f48699d ]

    Kai-Heng Feng reported that it takes a long time (> 1 s) to resume
    Thunderbolt-connected devices from both runtime suspend and system sleep
    (s2idle).

    This was because some Downstream Ports that support > 5 GT/s do not also
    support Data Link Layer Link Active reporting. Per PCIe r5.0 sec 6.6.1:

    With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
    software must wait a minimum of 100 ms after Link training completes
    before sending a Configuration Request to the device immediately below
    that Port. Software can determine when Link training completes by polling
    the Data Link Layer Link Active bit or by setting up an associated
    interrupt (see Section 6.7.3.3).

    Sec 7.5.3.6 requires such Ports to support DLL Link Active reporting, but
    at least the Intel JHL6240 Thunderbolt 3 Bridge [8086:15c0] and the Intel
    JHL7540 Thunderbolt 3 Bridge [8086:15ea] do not.

    Previously we tried to wait for Link training to complete, but since there
    was no DLL Link Active reporting, all we could do was wait the worst-case
    1000 ms, then another 100 ms.

    Instead of using the supported speeds to determine whether to wait for Link
    training, check whether the port supports DLL Link Active reporting. The
    Ports in question do not, so we'll wait only the 100 ms required for Ports
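    that support Link speeds <= 5 GT/s.

    This of course assumes these Ports always train the Link within 100 ms even
    if they are operating at > 5 GT/s, which is not required by the spec.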

    [bhelgaas: commit log, comment]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=206837
    Link: https://lore.kernel.org/r/20200514133043.27429-1-mika.westerberg@linux.intel.com
    Reported-by: Kai-Heng Feng
    Tested-by: Kai-Heng Feng
    Signed-off-by: Mika Westerberg
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Sasha Levin

    Mika Westerberg
     
  • [ Upstream commit 1b54ae8327a4d630111c8d88ba7906483ec6010b ]

    If device_register() has an error, we should bail out of
    pci_register_host_bridge() rather than continuing on.

    Fixes: 37d6a0a6f470 ("PCI: Add pci_register_host_bridge() interface")
    Link: https://lore.kernel.org/r/20200513223859.11295-1-robh@kernel.org
    Signed-off-by: Rob Herring
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: Lorenzo Pieralisi
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Sasha Levin

    Rob Herring