17 Apr, 2019

2 commits

  • commit e8458e7afa855317b14915d7b86ab3caceea7eb6 upstream.

    When CONFIG_SPARSE_IRQ is disabled, the request_mutex in struct irq_desc
    is not initialized, which causes a malfunction.
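
    A minimal sketch of the shape of the fix, assuming the !CONFIG_SPARSE_IRQ
    variant of early_irq_init() (surrounding per-descriptor setup elided):

        /* kernel/irq/irqdesc.c, !CONFIG_SPARSE_IRQ variant -- sketch */
        int __init early_irq_init(void)
        {
                struct irq_desc *desc = irq_desc;
                int count = ARRAY_SIZE(irq_desc), i;

                for (i = 0; i < count; i++) {
                        /* ... existing per-descriptor setup ... */
                        mutex_init(&desc[i].request_mutex); /* was missing */
                }
                return arch_early_irq_init();
        }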

    Fixes: 9114014cf4e6 ("genirq: Add mutex to irq desc to serialize request/free_irq()")
    Signed-off-by: Kefeng Wang
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Mukesh Ojha
    Cc: Marc Zyngier
    Cc:
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com
    Signed-off-by: Greg Kroah-Hartman

    Kefeng Wang
     
  • commit 325aa19598e410672175ed50982f902d4e3f31c5 upstream.

    If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip
    has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned.

    This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when
    the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to
    walk the chain of parents and set irq wake on any chips that don't have the
    flag set either. If the intent is to call the .irq_set_wake() callback of
    the parent irqchip, then we expect irqchip implementations to omit the
    IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that
    calls irq_chip_set_wake_parent().

    The problem has been observed on a Qualcomm sdm845 device where set wake
    fails on any GPIO interrupts after applying work in progress wakeup irq
    patches to the GPIO driver. The chain of chips looks like this:

    QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP)

    The GPIO controller's parent is the QCOM PDC irqchip, which in turn has the
    ARM GIC as its parent. The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE
    flag set, and so does the grandparent ARM GIC.

    The GPIO driver doesn't know if the parent needs to set wake or not, so it
    unconditionally calls irq_chip_set_wake_parent() causing this function to
    return a failure because the parent irqchip (PDC) doesn't have the
    .irq_set_wake() callback set. Returning 0 instead makes everything work and
    irqs from the GPIO controller can be configured for wakeup.

    Make it consistent by returning 0 (success) from irq_chip_set_wake_parent()
    when a parent chip has IRQCHIP_SKIP_SET_WAKE set.
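
    A sketch of the fixed function in kernel/irq/chip.c:

        int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on)
        {
                data = data->parent_data;

                /* Parent needs no wake configuration: report success. */
                if (data->chip->flags & IRQCHIP_SKIP_SET_WAKE)
                        return 0;

                if (data->chip->irq_set_wake)
                        return data->chip->irq_set_wake(data, on);

                return -ENOSYS;
        }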

    [ tglx: Massaged changelog ]

    Fixes: 08b55e2a9208e ("genirq: Add irqchip_set_wake_parent")
    Signed-off-by: Stephen Boyd
    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-gpio@vger.kernel.org
    Cc: Lina Iyer
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org
    Signed-off-by: Greg Kroah-Hartman

    Stephen Boyd
     

06 Apr, 2019

1 commit

  • [ Upstream commit 1136b0728969901a091f0471968b2b76ed14d9ad ]

    Waiman reported that on large systems with a large number of interrupts the
    readout of /proc/stat takes a long time to sum up the interrupt
    statistics. In principle this is not a problem, but for unknown reasons
    some enterprise quality software reads /proc/stat with a high frequency.

    The reason for this is that interrupt statistics are accounted per cpu. So
    the /proc/stat logic has to sum up the interrupt stats for each interrupt.

    This can be largely avoided for interrupts which are not marked as
    'PER_CPU' interrupts by simply adding a per interrupt summation counter
    which is incremented along with the per interrupt per cpu counter.

    The PER_CPU interrupts need to avoid that and use only per cpu accounting
    because they share the interrupt number and the interrupt descriptor and
    concurrent updates would conflict or require unwanted synchronization.
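
    A sketch of the accounting change, with field and helper shapes per the
    upstream patch (PER_CPU interrupt flows keep using the plain per-cpu
    increment via the double-underscore variant):

        struct irq_desc {
                /* ... */
                unsigned int    tot_count;      /* new summation counter */
                /* ... */
        };

        static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
        {
                __this_cpu_inc(*desc->kstat_irqs);
                __this_cpu_inc(kstat.irqs_sum);
        }

        static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
        {
                __kstat_incr_irqs_this_cpu(desc);
                desc->tot_count++;      /* summed copy for /proc/stat */
        }

        unsigned int kstat_irqs(unsigned int irq)
        {
                struct irq_desc *desc = irq_to_desc(irq);
                unsigned int sum = 0;
                int cpu;

                if (!desc || !desc->kstat_irqs)
                        return 0;
                if (!irq_settings_is_per_cpu_devid(desc) &&
                    !irq_settings_is_per_cpu(desc))
                        return desc->tot_count; /* O(1), no CPU loop */

                for_each_possible_cpu(cpu)
                        sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
                return sum;
        }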

    Reported-by: Waiman Long
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Waiman Long
    Reviewed-by: Marc Zyngier
    Reviewed-by: Davidlohr Bueso
    Cc: Matthew Wilcox
    Cc: Andrew Morton
    Cc: Alexey Dobriyan
    Cc: Kees Cook
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Davidlohr Bueso
    Cc: Miklos Szeredi
    Cc: Daniel Colascione
    Cc: Dave Chinner
    Cc: Randy Dunlap
    Link: https://lkml.kernel.org/r/20190208135020.925487496@linutronix.de

    Thomas Gleixner
     

06 Mar, 2019

4 commits

  • [ Upstream commit bddda606ec76550dd63592e32a6e87e7d32583f7 ]

    If all CPUs in the irq_default_affinity mask are offline when an interrupt
    is initialized then irq_setup_affinity() can set an empty affinity mask for
    a newly allocated interrupt.

    Fix this by falling back to cpu_online_mask in case the resulting affinity
    mask is zero.
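
    The fix amounts to a two-line fallback in irq_setup_affinity(); sketched:

        /* kernel/irq/manage.c: irq_setup_affinity() -- sketch */
        cpumask_and(&mask, cpu_online_mask, set);
        if (cpumask_empty(&mask))
                cpumask_copy(&mask, cpu_online_mask);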

    Signed-off-by: Srinivas Ramana
    Signed-off-by: Thomas Gleixner
    Cc: linux-arm-msm@vger.kernel.org
    Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.org
    Signed-off-by: Sasha Levin

    Srinivas Ramana
     
  • [ Upstream commit e8da8794a7fd9eef1ec9a07f0d4897c68581c72b ]

    On large systems with multiple devices of the same class (e.g. NVMe disks,
    using managed interrupts), the kernel can affinitize these interrupts to a
    small subset of CPUs instead of spreading them out evenly.

    irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
    of possible target CPUs which has the lowest number of interrupt vectors
    allocated.

    This is done by searching the CPU with the highest number of available
    vectors. While this is correct for non-managed interrupts, it can select
    the wrong CPU for managed interrupts. Under certain constellations this
    results in affinitizing the managed interrupts of several devices to a
    single CPU in a set.

    The bookkeeping of available vectors works the following way:

    1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

    2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affect the available count.

    The reservation persists up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

    For non-managed interrupts the available count is the correct selection
    criterion because the guaranteed reservations need to be taken into
    account. Using the allocated counter could lead to a failing allocation in
    the following situation (total vector space of 10 assumed):

                     CPU0   CPU1
     available:        2      0
     allocated:        5      3

    (Selecting by lowest allocated count would pick CPU1, which has no
    available vectors left.)

    Three successive non-managed allocations, which use the available count
    as the selection criterion, spread out properly from a common starting
    point:

     allocated:        3      3
     available:        4      4

     allocated:        4      3
     available:        3      4

     allocated:        4      4
     available:        3      3

    But the allocation of three managed interrupts starting from the same
    point will affinitize all of them to CPU0 because the available count is
    not affected by the allocation (see above). So the end result is:

                     CPU0   CPU1
     available:        5      4
     allocated:        5      3

    Introduce a "managed_allocated" field in struct cpumap to track the vector
    allocation for managed interrupts separately. Use this information to
    select the target CPU when a vector is allocated for a managed interrupt,
    which results in more evenly distributed vector assignments. The above
    example results in the following allocations:

                            CPU0   CPU1
     managed_allocated:       0      0
     allocated:               3      3

     managed_allocated:       1      0
     allocated:               3      4

     managed_allocated:       1      1
     allocated:               4      4

    The allocation of non-managed interrupts is not affected by this change and
    is still evaluating the available count.

    The overall distribution of interrupt vectors for both types of interrupts
    might still not be perfectly even depending on the number of non-managed
    and managed interrupts in a system, but due to the reservation guarantee
    for managed interrupts this cannot be avoided.

    Expose the new field in debugfs as well.
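
    A sketch of the new bookkeeping and the selection helper (shapes as in
    kernel/irq/matrix.c, surrounding code elided):

        struct cpumap {
                unsigned int    available;
                unsigned int    allocated;
                unsigned int    managed;
                unsigned int    managed_allocated;      /* new field */
                /* ... */
        };

        /* Find the CPU with the lowest number of active managed vectors */
        static unsigned int matrix_find_best_cpu_managed(struct irq_matrix *m,
                                                         const struct cpumask *msk)
        {
                unsigned int cpu, best_cpu, allocated = UINT_MAX;
                struct cpumap *cm;

                best_cpu = UINT_MAX;
                for_each_cpu(cpu, msk) {
                        cm = per_cpu_ptr(m->maps, cpu);

                        if (!cm->online || cm->managed_allocated > allocated)
                                continue;

                        best_cpu = cpu;
                        allocated = cm->managed_allocated;
                }
                return best_cpu;
        }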

    [ tglx: Clarified the background of the problem in the changelog and
    described it independent of NVME ]

    Signed-off-by: Long Li
    Signed-off-by: Thomas Gleixner
    Cc: Michael Kelley
    Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Long Li
     
  • [ Upstream commit 76f99ae5b54d48430d1f0c5512a84da0ff9761e0 ]

    Linux spreads out non-managed interrupts across the possible target CPUs
    to avoid vector space exhaustion.

    Managed interrupts are treated differently, as for them the vectors are
    reserved (with guarantee) when the interrupt descriptors are initialized.

    When the interrupt is requested a real vector is assigned. The assignment
    logic uses the first CPU in the affinity mask for assignment. If the
    interrupt has more than one CPU in the affinity mask, which happens when a
    multi-queue device has fewer queues than CPUs, then doing the same search
    as for non-managed interrupts makes sense as it puts the interrupt on the
    least interrupt-plagued CPU. For single-CPU affine vectors that's obviously
    a NOOP.

    Restructure the matrix allocation code so it does the 'best CPU' search,
    add the sanity check for an empty affinity mask and adapt the call site in
    the x86 vector management code.
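
    In outline, the reworked irq_matrix_alloc_managed() becomes (using the
    helper split out in the preparatory patch below; allocation internals
    elided):

        int irq_matrix_alloc_managed(struct irq_matrix *m,
                                     const struct cpumask *msk,
                                     unsigned int *mapped_cpu)
        {
                unsigned int cpu;

                if (cpumask_empty(msk))         /* new sanity check */
                        return -EINVAL;

                cpu = matrix_find_best_cpu(m, msk);
                if (cpu == UINT_MAX)
                        return -ENOSPC;

                /* ... allocate a vector from the reserved region on @cpu,
                 *     set *mapped_cpu and return the allocated bit ... */
        }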

    [ tglx: Added the empty mask check to the core and improved change log ]

    Signed-off-by: Dou Liyang
    Signed-off-by: Thomas Gleixner
    Cc: hpa@zytor.com
    Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dou Liyang
     
  • [ Upstream commit 8ffe4e61c06a48324cfd97f1199bb9838acce2f2 ]

    Linux finds the CPU which has the lowest vector allocation count to spread
    out the non-managed interrupts across the possible target CPUs, but does
    not do so for managed interrupts.

    Split out the CPU selection code into a helper function for reuse. No
    functional change.
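
    The split-out helper, modulo context:

        /* Find the best CPU, i.e. the one with the most available vectors */
        static unsigned int matrix_find_best_cpu(struct irq_matrix *m,
                                                 const struct cpumask *msk)
        {
                unsigned int cpu, best_cpu, maxavl = 0;
                struct cpumap *cm;

                best_cpu = UINT_MAX;

                for_each_cpu(cpu, msk) {
                        cm = per_cpu_ptr(m->maps, cpu);

                        if (!cm->online || cm->available <= maxavl)
                                continue;

                        best_cpu = cpu;
                        maxavl = cm->available;
                }
                return best_cpu;
        }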

    Signed-off-by: Dou Liyang
    Signed-off-by: Thomas Gleixner
    Cc: hpa@zytor.com
    Link: https://lkml.kernel.org/r/20180908175838.14450-1-dou_liyang@163.com
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dou Liyang
     

13 Feb, 2019

1 commit

  • [ Upstream commit b82592199032bf7c778f861b936287e37ebc9f62 ]

    If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
    which are allocated for a device, the interrupt affinity spreading code
    fails to spread them across all nodes.

    The reason is that the spreading code starts from node 0 and continues up
    to the number of interrupts requested for allocation. This leaves the
    nodes past the last interrupt unused.

    This results in interrupt concentration on the first nodes which violates
    the assumption of the block layer that all nodes are covered evenly. As a
    consequence the NUMA nodes above the number of interrupts are all assigned
    to hardware queue 0 and therefore NUMA node 0, which results in bad
    performance and has CPU hotplug implications, because queue 0 gets shut
    down when the last CPU of node 0 is offlined.

    Go over all NUMA nodes and assign them round-robin to all requested
    interrupts to solve this.
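
    Roughly, the node loop in the spreading code keeps cycling through the
    requested vectors instead of stopping at the last one (a sketch; variable
    names as in the upstream code):

        /* More nodes than vectors: assign nodes round-robin to the vectors */
        if (numvecs <= nodes) {
                for_each_node_mask(n, nodemsk) {
                        /* OR, not copy: several nodes may share a vector */
                        cpumask_or(masks + curvec, masks + curvec,
                                   node_to_cpumask[n]);
                        if (++curvec == last_affv)
                                curvec = affd->pre_vectors; /* wrap around */
                }
                goto done;
        }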

    [ tglx: Massaged changelog ]

    Signed-off-by: Long Li
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ming Lei
    Cc: Michael Kelley
    Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
    Signed-off-by: Sasha Levin

    Long Li
     

14 Nov, 2018

1 commit

  • commit 746a923b863a1065ef77324e1e43f19b1a3eab5c upstream.

    Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of
    threaded irqs") made detection of spurious interrupts work for threaded
    handlers by:

    a) incrementing a counter every time the thread returns IRQ_HANDLED, and
    b) checking whether that counter has increased every time the thread is
    woken.

    However for oneshot interrupts, the commit unmasks the interrupt before
    incrementing the counter. If another interrupt occurs right after
    unmasking but before the counter is incremented, that interrupt is
    incorrectly considered spurious:

    time
     |  irq_thread()
     |    irq_thread_fn()
     |      action->thread_fn()
     |    irq_finalize_oneshot()
     |      unmask_threaded_irq()  /* interrupt is unmasked */
     |
     |  /* interrupt fires, incorrectly deemed spurious */
     |
     |  atomic_inc(&desc->threads_handled);  /* counter is incremented */
     v

    This is observed with a hi3110 CAN controller receiving data at high volume
    (from a separate machine sending with "cangen -g 0 -i -x"): The controller
    signals a huge number of interrupts (hundreds of millions per day) and
    every second there are about a dozen which are deemed spurious.

    In theory with high CPU load and the presence of higher priority tasks, the
    number of incorrectly detected spurious interrupts might increase beyond
    the 99,900 threshold and cause disablement of the interrupt.

    In practice it just increments the spurious interrupt count. But that can
    cause people to waste time investigating it over and over.

    Fix it by moving the accounting before the invocation of
    irq_finalize_oneshot().
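
    After the fix, the thread functions account the result before finalizing
    the oneshot handling; a sketch of one of them:

        static irqreturn_t irq_thread_fn(struct irq_desc *desc,
                                         struct irqaction *action)
        {
                irqreturn_t ret;

                ret = action->thread_fn(action->irq, action->dev_id);
                if (ret == IRQ_HANDLED)
                        atomic_inc(&desc->threads_handled); /* count first */

                irq_finalize_oneshot(desc, action);         /* then unmask */
                return ret;
        }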

    [ tglx: Folded change log update ]

    Fixes: 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs")
    Signed-off-by: Lukas Wunner
    Signed-off-by: Thomas Gleixner
    Cc: Mathias Duckeck
    Cc: Akshay Bhat
    Cc: Casey Fitzpatrick
    Cc: stable@vger.kernel.org # v3.16+
    Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de
    Signed-off-by: Greg Kroah-Hartman

    Lukas Wunner
     

14 Aug, 2018

1 commit

  • Pull genirq updates from Thomas Gleixner:
    "The irq departement provides:

    - A synchronization fix for free_irq() to synchronize just the
    removed interrupt thread on shared interrupt lines.

    - Consolidate the multi low level interrupt entry handling and move
    it to the generic code instead of adding yet another copy for
    RISC-V

    - Refactoring of the ARM LPI allocator and LPI exposure to the
    hypervisor

    - Yet another interrupt chip driver for the JZ4725B SoC

    - Speed up for /proc/interrupts as people seem to love reading this
    file with high frequency

    - Miscellaneous fixes and updates"

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    irqchip/gic-v3-its: Make its_lock a raw_spin_lock_t
    genirq/irqchip: Remove MULTI_IRQ_HANDLER as it's now obselete
    openrisc: Use the new GENERIC_IRQ_MULTI_HANDLER
    arm64: Use the new GENERIC_IRQ_MULTI_HANDLER
    ARM: Convert to GENERIC_IRQ_MULTI_HANDLER
    irqchip: Port the ARM IRQ drivers to GENERIC_IRQ_MULTI_HANDLER
    irqchip/gic-v3-its: Reduce minimum LPI allocation to 1 for PCI devices
    dt-bindings: irqchip: renesas-irqc: Document r8a77980 support
    dt-bindings: irqchip: renesas-irqc: Document r8a77470 support
    irqchip/ingenic: Add support for the JZ4725B SoC
    irqchip/stm32: Add exti0 translation for stm32mp1
    genirq: Remove redundant NULL pointer check in __free_irq()
    irqchip/gic-v3-its: Honor hypervisor enforced LPI range
    irqchip/gic-v3: Expose GICD_TYPER in the rdist structure
    irqchip/gic-v3-its: Drop chunk allocation compatibility
    irqchip/gic-v3-its: Move minimum LPI requirements to individual busses
    irqchip/gic-v3-its: Use full range of LPIs
    irqchip/gic-v3-its: Refactor LPI allocator
    genirq: Synchronize only with single thread on free_irq()
    genirq: Update code comments wrt recycled thread_mask
    ...

    Linus Torvalds
     

03 Aug, 2018

2 commits

  • The support for force threading interrupts which are set up with both a
    primary and a threaded handler wrecked the setup of regular requested
    threaded interrupts (primary handler == NULL).

    The reason is that it does not check whether the primary handler is set to
    the default handler which wakes the handler thread. Instead it replaces the
    thread handler with the primary handler, as it would do with force-threaded
    interrupts which have been requested via request_irq(). So both the primary
    and the thread handler become the same, which then triggers the warning
    that the thread handler tries to wake up a not configured secondary thread.

    Fortunately this only happens when the driver omits the IRQF_ONESHOT flag
    when requesting the threaded interrupt, which is normally caught by the
    sanity checks when force irq threading is disabled.

    Fix it by skipping the force threading setup when a regular threaded
    interrupt is requested. As a consequence, an interrupt request which lacks
    the IRQF_ONESHOT flag is rejected correctly instead of being silently
    wrecked.
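
    The fix is an early return in irq_setup_forced_threading() for interrupts
    which were requested as threaded in the first place; sketched:

        static int irq_setup_forced_threading(struct irqaction *new)
        {
                if (!force_irqthreads)
                        return 0;
                if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
                        return 0;

                /*
                 * No further action required for interrupts which are
                 * requested as threaded interrupts already.
                 */
                if (new->handler == irq_default_primary_handler)
                        return 0;

                new->flags |= IRQF_ONESHOT;
                /* ... set up the secondary thread for primary+thread irqs ... */
                return 0;
        }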

    Fixes: 2a1d3ab8986d ("genirq: Handle force threading of irqs with primary and thread handler")
    Reported-by: Kurt Kanzenbach
    Signed-off-by: Thomas Gleixner
    Tested-by: Kurt Kanzenbach
    Cc: stable@vger.kernel.org

    Thomas Gleixner
     
  • Now that every user of MULTI_IRQ_HANDLER has been converted over to use
    GENERIC_IRQ_MULTI_HANDLER, remove the references to MULTI_IRQ_HANDLER.

    Signed-off-by: Palmer Dabbelt
    Signed-off-by: Thomas Gleixner
    Cc: linux@armlinux.org.uk
    Cc: catalin.marinas@arm.com
    Cc: Will Deacon
    Cc: jonas@southpole.se
    Cc: stefan.kristiansson@saunalahti.fi
    Cc: shorne@gmail.com
    Cc: jason@lakedaemon.net
    Cc: marc.zyngier@arm.com
    Cc: Arnd Bergmann
    Cc: nicolas.pitre@linaro.org
    Cc: vladimir.murzin@arm.com
    Cc: keescook@chromium.org
    Cc: jinb.park7@gmail.com
    Cc: yamada.masahiro@socionext.com
    Cc: alexandre.belloni@bootlin.com
    Cc: pombredanne@nexb.com
    Cc: Greg KH
    Cc: kstewart@linuxfoundation.org
    Cc: jhogan@kernel.org
    Cc: mark.rutland@arm.com
    Cc: ard.biesheuvel@linaro.org
    Cc: james.morse@arm.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: openrisc@lists.librecores.org
    Link: https://lkml.kernel.org/r/20180622170126.6308-6-palmer@sifive.com

    Palmer Dabbelt
     

17 Jul, 2018

1 commit

  • The NULL pointer check in __free_irq() triggers a 'dereference before NULL
    pointer check' warning in static code analysis. It turns out that the check
    is redundant because all callers have a NULL pointer check already.

    Remove it.

    Signed-off-by: RAGHU Halharvi
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20180717102009.7708-1-raghuhack78@gmail.com

    RAGHU Halharvi
     

24 Jun, 2018

2 commits

  • When pciehp is converted to threaded IRQ handling, removal of unplugged
    devices below a PCIe hotplug port happens synchronously in the IRQ thread.
    Removal of devices typically entails a call to free_irq() by their drivers.

    If those devices share their IRQ with the hotplug port, __free_irq()
    deadlocks because it calls synchronize_irq() to wait for all hard IRQ
    handlers as well as all threads sharing the IRQ to finish.

    Actually it's sufficient to wait only for the IRQ thread of the removed
    device, so call synchronize_hardirq() to wait for all hard IRQ handlers to
    finish, but no longer for any threads. Compensate by rearranging the
    control flow in irq_wait_for_interrupt() such that the device's thread is
    allowed to run one last time after kthread_stop() has been called.

    kthread_stop() blocks until the IRQ thread has completed. On completion
    the IRQ thread clears its oneshot thread_mask bit. This is safe because
    __free_irq() holds the request_mutex, thereby preventing __setup_irq() from
    handing out the same oneshot thread_mask bit to a newly requested action.
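
    In outline, the teardown in __free_irq() then looks like this (error
    handling and bookkeeping elided):

        /* Wait only for the hard IRQ handlers of all shared actions ... */
        synchronize_hardirq(irq);

        /* ... and synchronize only with the thread of the removed action. */
        if (action->thread) {
                kthread_stop(action->thread);   /* blocks until completion */
                put_task_struct(action->thread);
                if (action->secondary && action->secondary->thread) {
                        kthread_stop(action->secondary->thread);
                        put_task_struct(action->secondary->thread);
                }
        }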

    Stack trace for posterity:
    INFO: task irq/17-pciehp:94 blocked for more than 120 seconds.
    schedule+0x28/0x80
    synchronize_irq+0x6e/0xa0
    __free_irq+0x15a/0x2b0
    free_irq+0x33/0x70
    pciehp_release_ctrl+0x98/0xb0
    pcie_port_remove_service+0x2f/0x40
    device_release_driver_internal+0x157/0x220
    bus_remove_device+0xe2/0x150
    device_del+0x124/0x340
    device_unregister+0x16/0x60
    remove_iter+0x1a/0x20
    device_for_each_child+0x4b/0x90
    pcie_port_device_remove+0x1e/0x30
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x157/0x220
    pci_stop_bus_device+0x7d/0xa0
    pci_stop_bus_device+0x3d/0xa0
    pci_stop_and_remove_bus_device+0xe/0x20
    pciehp_unconfigure_device+0xb8/0x160
    pciehp_disable_slot+0x84/0x130
    pciehp_ist+0x158/0x190
    irq_thread_fn+0x1b/0x50
    irq_thread+0x143/0x1a0
    kthread+0x111/0x130

    Signed-off-by: Lukas Wunner
    Signed-off-by: Thomas Gleixner
    Cc: Bjorn Helgaas
    Cc: Mika Westerberg
    Cc: linux-pci@vger.kernel.org
    Link: https://lkml.kernel.org/r/d72b41309f077c8d3bee6cc08ad3662d50b5d22a.1529828292.git.lukas@wunner.de

    Lukas Wunner
     
  • Previously a race existed between __free_irq() and __setup_irq() wherein
    the thread_mask of a just removed action could be handed out to a newly
    added action and the freed irq thread would then tread on the oneshot
    mask bit of the newly added irq thread in irq_finalize_oneshot():

    time
     |  __free_irq()
     |    raw_spin_lock_irqsave(&desc->lock, flags);
     |
     |    raw_spin_unlock_irqrestore(&desc->lock, flags);
     |
     |  __setup_irq()
     |    raw_spin_lock_irqsave(&desc->lock, flags);
     |
     |    raw_spin_unlock_irqrestore(&desc->lock, flags);
     |
     |  irq_thread() of freed irq (__free_irq() waits in synchronize_irq())
     |    irq_thread_fn()
     |      irq_finalize_oneshot()
     |        raw_spin_lock_irq(&desc->lock);
     |        desc->threads_oneshot &= ~action->thread_mask;
     |        raw_spin_unlock_irq(&desc->lock);
     v

    The race was known at least since 2012 when it was documented in a code
    comment by commit e04268b0effc ("genirq: Remove paranoid warnons and bogus
    fixups"). The race itself is harmless as nothing touches any of the
    potentially freed data after synchronize_irq().

    In 2017 the race was closed by commit 9114014cf4e6 ("genirq: Add mutex to
    irq desc to serialize request/free_irq()"), apparently inadvertently so,
    because the race is neither mentioned in the commit message nor was the
    code comment updated. Make up for that.

    Signed-off-by: Lukas Wunner
    Signed-off-by: Thomas Gleixner
    Cc: Bjorn Helgaas
    Cc: Mika Westerberg
    Cc: linux-pci@vger.kernel.org
    Link: https://lkml.kernel.org/r/32fc25aa35ecef4b2692f57687bb7fc2a57230e2.1529828292.git.lukas@wunner.de

    Lukas Wunner
     

22 Jun, 2018

2 commits

  • Since commit 425a5072dcd1 ("genirq: Free irq_desc with rcu"),
    show_interrupts() can be switched to rcu locking, which removes possible
    contention on sparse_irq_lock.

    The per_cpu count scan and print can be done without holding desc spinlock.

    And there is no need to call kstat_irqs_cpu() and abuse irq_to_desc() while
    holding the rcu read lock, since desc and desc->kstat_irqs won't disappear
    or change.
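
    The locking in show_interrupts() then shrinks to roughly:

        rcu_read_lock();
        desc = irq_to_desc(i);
        if (!desc)
                goto outsparse;

        if (desc->kstat_irqs)
                for_each_online_cpu(j)
                        any_count |= *per_cpu_ptr(desc->kstat_irqs, j);
        /* ... print per-cpu counts and chip/action information ... */
    outsparse:
        rcu_read_unlock();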

    Signed-off-by: Eric Dumazet
    Signed-off-by: Thomas Gleixner
    Cc: Eric Dumazet
    Link: https://lkml.kernel.org/r/20180620150332.163320-1-edumazet@google.com

    Eric Dumazet
     
  • Debug is missing the IRQCHIP_SUPPORTS_LEVEL_MSI debug entry, making debugfs
    slightly less useful.

    Take this opportunity to also add a missing comment in the definition of
    IRQCHIP_SUPPORTS_LEVEL_MSI.
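
    The debugfs side is a one-line addition to the irqchip flags table in
    kernel/irq/debugfs.c; sketched:

        static const struct irq_bit_descr irqchip_flags[] = {
                BIT_MASK_DESCR(IRQCHIP_SET_TYPE_MASKED),
                /* ... */
                BIT_MASK_DESCR(IRQCHIP_SUPPORTS_LEVEL_MSI), /* new entry */
        };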

    Fixes: 6988e0e0d283 ("genirq/msi: Limit level-triggered MSI to platform devices")
    Signed-off-by: Marc Zyngier
    Signed-off-by: Thomas Gleixner
    Cc: Jason Cooper
    Cc: Alexandre Belloni
    Cc: Yang Yingliang
    Cc: Sumit Garg
    Link: https://lkml.kernel.org/r/20180622095254.5906-2-marc.zyngier@arm.com

    Marc Zyngier
     

19 Jun, 2018

2 commits

  • When the comment was reflowed to a wider format, the "*" snuck in.

    Fixes: ae88a23b32fa ("irq: refactor and clean up the free_irq() code flow")
    Signed-off-by: Jonathan Neuschäfer
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20180617124018.25539-1-j.neuschaefer@gmx.net

    Jonathan Neuschäfer
     
  • Jeremy Dorfman identified mutex contention when multiple threads
    parse /proc/stat concurrently.

    Since commit 425a5072dcd1 ("genirq: Free irq_desc with rcu"),
    kstat_irqs_usr() can be switched to rcu locking, which removes this mutex
    contention.

    show_interrupts() case will be handled in a separate patch.
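
    With irq descriptors freed via RCU, the helper reduces to (sketch):

        unsigned int kstat_irqs_usr(unsigned int irq)
        {
                unsigned int sum;

                rcu_read_lock();
                sum = kstat_irqs(irq);
                rcu_read_unlock();
                return sum;
        }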

    Reported-by: Jeremy Dorfman
    Signed-off-by: Eric Dumazet
    Signed-off-by: Thomas Gleixner
    Cc: Eric Dumazet
    Cc: Willem de Bruijn
    Link: https://lkml.kernel.org/r/20180618125612.155057-1-edumazet@google.com

    Eric Dumazet
     

11 Jun, 2018

1 commit

  • Pull x86 updates and fixes from Thomas Gleixner:

    - Fix the (late) fallout from the vector management rework, which
    caused hlist corruption and irq descriptor reference leaks due to a
    missing sanity check.

    The straightforward fix triggered another long-standing issue to
    surface. The pre-rework code hid the issue due to being way slower,
    but now the chance that user space sees an EBUSY error return when
    updating irq affinities is way higher, though quite a bunch of
    userspace tools do not handle it properly despite the fact that EBUSY
    could have been returned for at least 10 years.

    It turned out that the EBUSY return can be avoided completely by
    utilizing the existing delayed affinity update mechanism for irq
    remapped scenarios as well. That's a bit more error handling in the
    kernel, but avoids fruitless fingerpointing discussions with tool
    developers.

    - Decouple PHYSICAL_MASK from AMD SME as it's going to be required for
    the upcoming Intel memory encryption support as well.

    - Handle legacy device ACPI detection properly for newer platforms

    - Fix the wrong argument ordering in the vector allocation tracepoint

    - Simplify the IDT setup code for the APIC=n case

    - Use the proper string helpers in the MTRR code

    - Remove a stale unused VDSO source file

    - Convert the microcode update lock to a raw spinlock as it's used in
    atomic context.

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
    x86/apic/vector: Print APIC control bits in debugfs
    genirq/affinity: Defer affinity setting if irq chip is busy
    x86/platform/uv: Use apic_ack_irq()
    x86/ioapic: Use apic_ack_irq()
    irq_remapping: Use apic_ack_irq()
    x86/apic: Provide apic_ack_irq()
    genirq/migration: Avoid out of line call if pending is not set
    genirq/generic_pending: Do not lose pending affinity update
    x86/apic/vector: Prevent hlist corruption and leaks
    x86/vector: Fix the args of vector_alloc tracepoint
    x86/idt: Simplify the idt_setup_apic_and_irq_gates()
    x86/platform/uv: Remove extra parentheses
    x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME
    x86: Mark native_set_p4d() as __always_inline
    x86/microcode: Make the late update update_lock a raw lock for RT
    x86/mtrr: Convert to use strncpy_from_user() helper
    x86/mtrr: Convert to use match_string() helper
    x86/vdso: Remove unused file
    x86/i8237: Register device based on FADT legacy boot flag

    Linus Torvalds
     

06 Jun, 2018

4 commits

  • The case that interrupt affinity setting fails with -EBUSY can be handled
    in the kernel completely by using the already available generic pending
    infrastructure.

    If an irq_chip::set_affinity() call fails with -EBUSY, handle it like the
    interrupts for which irq_chip::set_affinity() can only be invoked from
    interrupt context. Copy the new affinity mask to irq_desc::pending_mask and
    set the affinity-pending bit. The next raised interrupt for the affected
    irq will check the pending bit and try to set the new affinity from the
    handler. This avoids returning -EBUSY to user space when an affinity change
    is requested while the previous change has not been cleaned up. The new
    affinity will take effect when the next interrupt is raised from the
    device.
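
    A sketch of the resulting logic in kernel/irq/manage.c:

        static inline int irq_set_affinity_pending(struct irq_data *data,
                                                   const struct cpumask *dest)
        {
                struct irq_desc *desc = irq_data_to_desc(data);

                irqd_set_move_pending(data);
                irq_copy_pending(desc, dest);
                return 0;
        }

        static int irq_try_set_affinity(struct irq_data *data,
                                        const struct cpumask *dest, bool force)
        {
                int ret = irq_do_set_affinity(data, dest, force);

                /*
                 * If the vector management is busy, defer the update via
                 * the generic pending mechanism instead of reporting -EBUSY.
                 */
                if (ret == -EBUSY && !force)
                        ret = irq_set_affinity_pending(data, dest);
                return ret;
        }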

    Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
    Signed-off-by: Thomas Gleixner
    Tested-by: Song Liu
    Cc: Joerg Roedel
    Cc: Peter Zijlstra
    Cc: Song Liu
    Cc: Dmitry Safonov
    Cc: stable@vger.kernel.org
    Cc: Mike Travis
    Cc: Borislav Petkov
    Cc: Tariq Toukan
    Link: https://lkml.kernel.org/r/20180604162224.819273597@linutronix.de

    Thomas Gleixner
     
  • The upcoming fix for the -EBUSY return from affinity settings requires to
    use the irq_move_irq() functionality even on irq remapped interrupts. To
    avoid the out of line call, move the check for the pending bit into an
    inline helper.

    Preparatory change for the real fix. No functional change.
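
    The helper keeps the cheap pending-bit test inline and only takes the out
    of line path when a move is actually pending (sketch, include/linux/irq.h):

        void __irq_move_irq(struct irq_data *data);  /* out of line worker */

        static inline void irq_move_irq(struct irq_data *data)
        {
                if (unlikely(irqd_is_setaffinity_pending(data)))
                        __irq_move_irq(data);
        }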

    Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
    Signed-off-by: Thomas Gleixner
    Cc: Joerg Roedel
    Cc: Peter Zijlstra
    Cc: Song Liu
    Cc: Dmitry Safonov
    Cc: stable@vger.kernel.org
    Cc: Mike Travis
    Cc: Borislav Petkov
    Cc: Tariq Toukan
    Cc: Dou Liyang
    Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de

    Thomas Gleixner
     
  • The generic pending interrupt mechanism moves interrupts from the interrupt
    handler on the original target CPU to the new destination CPU. This is
    required for x86 and ia64 due to the way the interrupt delivery and
    acknowledge works if the interrupts are not remapped.

    However that update can fail for various reasons. Some of them are valid
    reasons to discard the pending update, but the case, when the previous move
    has not been fully cleaned up is not a legit reason to fail.

    Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
    a pending cleanup, and rearm the pending move in the irq descriptor so it's
    tried again when the next interrupt arrives.
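
    In irq_move_masked_irq() the -EBUSY case then rearms the move instead of
    dropping it; roughly:

        ret = irq_do_set_affinity(&desc->irq_data, desc->pending_mask, false);
        /*
         * If there is a cleanup pending in the underlying vector
         * management, reschedule the move for the next interrupt.
         * Leave desc->pending_mask intact.
         */
        if (ret == -EBUSY) {
                irqd_set_move_pending(&desc->irq_data);
                return;
        }
        cpumask_clear(desc->pending_mask);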

    Fixes: 996c591227d9 ("x86/irq: Plug vector cleanup race")
    Signed-off-by: Thomas Gleixner
    Tested-by: Song Liu
    Cc: Joerg Roedel
    Cc: Peter Zijlstra
    Cc: Song Liu
    Cc: Dmitry Safonov
    Cc: stable@vger.kernel.org
    Cc: Mike Travis
    Cc: Borislav Petkov
    Cc: Tariq Toukan
    Link: https://lkml.kernel.org/r/20180604162224.386544292@linutronix.de

    Thomas Gleixner
     
  • The interrupts are enabled/disabled so the interrupt handler can run
    with enabled interrupts while serving the interrupt and not lose other
    interrupts especially the timer tick.
    If the system runs with force-threaded interrupts then there is no need
    to enable the interrupts.

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: David S. Miller
    Signed-off-by: David S. Miller

    Sebastian Andrzej Siewior
     

05 Jun, 2018

1 commit

  • Pull irq updates from Thomas Gleixner:

    - Consolidation of softirq pending:

    The softirq mask and its accessors/mutators have many implementations
    scattered around many architectures. Most do the same things
    consisting in a field in a per-cpu struct (often irq_cpustat_t)
    accessed through per-cpu ops. We can provide instead a generic
    efficient version that most of them can use. In fact s390 is the only
    exception because the field is stored in lowcore.

    - Support for level!?! triggered MSI (ARM)

    Over the past couple of years, we've seen some SoCs coming up with
    ways of signalling level interrupts using a new flavor of MSIs, where
    the MSI controller uses two distinct messages: one that raises a
    virtual line, and one that lowers it. The target MSI controller is in
    charge of maintaining the state of the line.

    This allows for a much simplified HW signal routing (no need to have
    hundreds of discrete lines to signal level interrupts if you already
    have a memory bus), but results in a departure from the current idea
    the kernel has of MSIs.

    - Support for Meson-AXG GPIO irqchip

    - Large stm32 irqchip rework (suspend/resume, hierarchical domains)

    - More SPDX conversions

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    ARM: dts: stm32: Add exti support to stm32mp157 pinctrl
    ARM: dts: stm32: Add exti support for stm32mp157c
    pinctrl/stm32: Add irq_eoi for stm32gpio irqchip
    irqchip/stm32: Add suspend/resume support for hierarchy domain
    irqchip/stm32: Add stm32mp1 support with hierarchy domain
    irqchip/stm32: Prepare common functions
    irqchip/stm32: Add host and driver data structures
    irqchip/stm32: Add suspend support
    irqchip/stm32: Add falling pending register support
    irqchip/stm32: Checkpatch fix
    irqchip/stm32: Optimizes and cleans up stm32-exti irq_domain
    irqchip/meson-gpio: Add support for Meson-AXG SoCs
    dt-bindings: interrupt-controller: New binding for Meson-AXG SoC
    dt-bindings: interrupt-controller: Fix the double quotes
    softirq/s390: Move default mutators of overwritten softirq mask to s390
    softirq/x86: Switch to generic local_softirq_pending() implementation
    softirq/sparc: Switch to generic local_softirq_pending() implementation
    softirq/powerpc: Switch to generic local_softirq_pending() implementation
    softirq/parisc: Switch to generic local_softirq_pending() implementation
    softirq/ia64: Switch to generic local_softirq_pending() implementation
    ...

    Linus Torvalds
     

13 May, 2018

1 commit

  • So far, MSIs have been used to signal edge-triggered interrupts, as
    a write is a good model for an edge (you can't "unwrite" something).
    On the other hand, routing zillions of wires in an SoC because you
    need level interrupts is a bit extreme.

    People have come up with a variety of schemes to support this, which
    involves sending two messages: one to signal the interrupt, and one
    to clear it. Since the kernel cannot represent this, we've ended up
    with side-band mechanisms that are pretty awful.

    Instead, let's acknowledge the requirement, and ensure that, under the
    right circumstances, the irq_compose_msi_msg and irq_write_msi_msg
    callbacks can take as a parameter an array of two messages instead of a
    pointer to a single one. We also add some checking that the compose method
    only clobbers the second message if the MSI domain has been created with
    the MSI_FLAG_LEVEL_CAPABLE flag.
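
    The activation path then composes and writes an array of two messages; a
    sketch of the shape, based on the reworked msi_domain_activate():

        static int msi_domain_activate(struct irq_domain *domain,
                                       struct irq_data *irq_data, bool early)
        {
                /* msg[0] raises the virtual line, msg[1] lowers it */
                struct msi_msg msg[2] = { [1] = { }, };

                msi_check_level(irq_data->domain, msg);  /* warn on clobber */
                irq_chip_compose_msi_msg(irq_data, msg);
                irq_chip_write_msi_msg(irq_data, msg);
                return 0;
        }

    where msi_check_level() warns if the compose callback touched the second
    message without the domain advertising MSI_FLAG_LEVEL_CAPABLE.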

    Signed-off-by: Marc Zyngier
    Signed-off-by: Thomas Gleixner
    Cc: Rob Herring
    Cc: Jason Cooper
    Cc: Ard Biesheuvel
    Cc: Srinivas Kandagatla
    Cc: Thomas Petazzoni
    Cc: Miquel Raynal
    Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com

    Marc Zyngier
     

27 Apr, 2018

1 commit

  • There is the SPDX license identifier now in the irq simulator. Remove the
    license boilerplate.

    While at it: update the copyright notice, since I did some changes in 2018.

    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20180426200747.8344-1-brgl@bgdev.pl

    Bartosz Golaszewski
     

06 Apr, 2018

5 commits

  • Commit 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
    tried to spread the interrupts across all possible CPUs to make sure that
    in case of physical hotplug (e.g. virtualization) the CPUs which get
    plugged in after the device was initialized are targeted by a hardware
    queue and the corresponding interrupt.

    This has a downside in cases where the ACPI tables claim that there are
    more possible CPUs than present CPUs and the number of interrupts to spread
    out is smaller than the number of possible CPUs. These bogus ACPI tables
    are unfortunately not uncommon.

    In such a case the vector spreading algorithm assigns interrupts to CPUs
    which can never be utilized and as a consequence these interrupts are
    unused instead of being mapped to present CPUs. As a result the performance
    of the device is suboptimal.

    To fix this spread the interrupt vectors in two stages:

    1) Spread as many interrupts as possible among the present CPUs

    2) Spread the remaining vectors among non present CPUs

    On a 8 core system, where CPU 0-3 are present and CPU 4-7 are not present,
    for a device with 4 queues the resulting interrupt affinity is:

    1) Before 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
    irq 39, cpu list 0
    irq 40, cpu list 1
    irq 41, cpu list 2
    irq 42, cpu list 3

    2) With 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
    irq 39, cpu list 0-2
    irq 40, cpu list 3-4,6
    irq 41, cpu list 5
    irq 42, cpu list 7

    3) With the refined vector spread applied:
    irq 39, cpu list 0,4
    irq 40, cpu list 1,6
    irq 41, cpu list 2,5
    irq 42, cpu list 3,7

    On a 8 core system, where all CPUs are present the resulting interrupt
    affinity for the 4 queues is:

    irq 39, cpu list 0,1
    irq 40, cpu list 2,3
    irq 41, cpu list 4,5
    irq 42, cpu list 6,7

    This is independent of the number of CPUs which are online at the point of
    initialization, because in such a system the offline CPUs can be easily
    onlined afterwards, while non-present CPUs need to be plugged physically
    or virtually, which requires external interaction.

    The downside of this approach is that in case of physical hotplug the
    interrupt vector spreading might be suboptimal when CPUs 4-7 are physically
    plugged. Suboptimal from a NUMA point of view and due to the single target
    nature of interrupt affinities the later plugged CPUs might not be targeted
    by interrupts at all.

    Though, physical hotplug systems are not the common case while the broken
    ACPI table disease is widespread. So it's preferred to have as many
    interrupts as possible utilized at the point where the device is
    initialized.

    Block multi-queue devices like NVME create a hardware queue per possible
    CPU, so the goal of commit 84676c1f21 to assign one interrupt vector per
    possible CPU is still achieved even with physical/virtual hotplug.
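
    The heart of the change is to run the spreading pass twice, first over the
    present CPUs and then over the remaining possible ones; roughly (helper
    name per the upstream patch, bookkeeping elided):

        /* Stage 1: spread among present CPUs, starting at affd->pre_vectors */
        usedvecs = irq_build_affinity_masks(affd, curvec, affvecs,
                                            node_to_cpumask, cpu_present_mask,
                                            nmsk, masks);

        /* Stage 2: spread the remaining vectors among non-present CPUs */
        if (usedvecs >= affvecs)
                curvec = affd->pre_vectors;
        else
                curvec = affd->pre_vectors + usedvecs;
        cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
        usedvecs += irq_build_affinity_masks(affd, curvec, affvecs,
                                             node_to_cpumask, npresmsk,
                                             nmsk, masks);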

    [ tglx: Changed from online to present CPUs for the first spreading stage,
    renamed variables for readability sake, added comments and massaged
    changelog ]

    Reported-by: Laurence Oberman
    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Hellwig
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Christoph Hellwig
    Link: https://lkml.kernel.org/r/20180308105358.1506-5-ming.lei@redhat.com

    Ming Lei
     
  • To support two stage irq vector spreading, it's required to add a starting
    point to the spreading function. No functional change, just preparatory
    work for the actual two stage change.

    [ tglx: Renamed variables, tidied up the code and massaged changelog ]

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Hellwig
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Laurence Oberman
    Cc: Christoph Hellwig
    Link: https://lkml.kernel.org/r/20180308105358.1506-4-ming.lei@redhat.com

    Ming Lei
     
  • No functional change, just prepare for converting to 2-stage irq vector
    spreading.

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Hellwig
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Laurence Oberman
    Cc: Christoph Hellwig
    Link: https://lkml.kernel.org/r/20180308105358.1506-3-ming.lei@redhat.com

    Ming Lei
     
  • The following patches will introduce two stage irq spreading for improving
    irq spread on all possible CPUs.

    No functional change.

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Hellwig
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Laurence Oberman
    Cc: Christoph Hellwig
    Link: https://lkml.kernel.org/r/20180308105358.1506-2-ming.lei@redhat.com

    Ming Lei
     
  • When the allocation of node_to_possible_cpumask fails, then
    irq_create_affinity_masks() returns with a pointer to the empty affinity
    masks array, which will cause malfunction.

    Reorder the allocations so the masks array allocation comes last and every
    failure path returns NULL.
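
    The fix reorders the allocations so the masks array comes last and every
    failure path can return NULL; in outline:

        /* irq_create_affinity_masks() allocation order after the fix */
        node_to_possible_cpumask = alloc_node_to_possible_cpumask();
        if (!node_to_possible_cpumask)
                return NULL;            /* fail before masks exist */

        masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
        if (!masks)
                goto out_free_node_cpumask;     /* free and return NULL */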

    Fixes: 9a0ef98e186d ("genirq/affinity: Assign vectors to all present CPUs")
    Signed-off-by: Thomas Gleixner
    Cc: Christoph Hellwig
    Cc: Ming Lei

    Thomas Gleixner
     

04 Apr, 2018

1 commit

  • These config switches enable the same code in the core and in the not yet
    converted architecture code. They can both be selected by randconfig
    builds, which causes linker errors because the same symbols are defined
    twice.

    Make the new GENERIC_IRQ_MULTI_HANDLER depend on !MULTI_IRQ_HANDLER to
    prevent that. The dependency will be removed once all architectures are
    converted over.
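
    The stopgap is a Kconfig dependency (kernel/irq/Kconfig):

        config GENERIC_IRQ_MULTI_HANDLER
                # Interim; removed once all architectures are converted
                depends on !MULTI_IRQ_HANDLER
                bool
                help
                  Allow to specify the low level IRQ handler at run time.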

    Signed-off-by: Palmer Dabbelt
    Signed-off-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Arnd Bergmann
    Link: https://lkml.kernel.org/r/20180404043130.31277-4-palmer@sifive.com

    Palmer Dabbelt
     

20 Mar, 2018

5 commits

  • Now that SPDX identifiers are in place, remove the boilerplate or
    references.

    The change in timings.c has been acked by the author.

    Signed-off-by: Thomas Gleixner
    Acked-by: Daniel Lezcano
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Cc: Philippe Ombredanne
    Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de

    Thomas Gleixner
     
  • Add SPDX identifiers to files

    - which contain an explicit license boiler plate or reference

    - which do not contain a license reference and were not updated in the
    initial SPDX conversion because the license was deduced by the scanners
    via EXPORT_SYMBOL_GPL as GPL2.0 only.

    [ tglx: Moved adding identifiers from the patch which removes the
    references/boilerplate ]

    Signed-off-by: Thomas Gleixner
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Cc: Philippe Ombredanne
    Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de

    Thomas Gleixner
     
  • Use the proper SPDX-Identifier format.

    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Cc: Philippe Ombredanne
    Link: https://lkml.kernel.org/r/20180314212030.492674761@linutronix.de

    Thomas Gleixner
     
  • Remove pointless references to the file name itself and condense the
    information so it wastes less space.

    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Cc: Philippe Ombredanne
    Link: https://lkml.kernel.org/r/20180314212030.412095827@linutronix.de

    Thomas Gleixner
     
  • Given that irq_to_desc() is a radix tree lookup, that the reverse
    operation is only a pointer dereference, and that all callers of
    __free_irq() already have the desc, pass the desc instead of the irq
    number.
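
    The signature change, in outline:

        /* before: __free_irq(unsigned int irq, void *dev_id) + irq_to_desc() */
        static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)
        {
                unsigned int irq = desc->irq_data.irq;  /* reverse lookup is
                                                           a dereference */
                /* ... */
        }

        /* callers, e.g. free_irq(), already hold the desc: */
        action = __free_irq(desc, dev_id);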

    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Thomas Gleixner
    Cc: kernel@pengutronix.de
    Link: https://lkml.kernel.org/r/20180319105202.9794-1-u.kleine-koenig@pengutronix.de

    Uwe Kleine-König