26 Apr, 2018
1 commit
-
[ Upstream commit 95a2562590c2f64a0398183f978d5cf3db6d0284 ]
On some platforms there's an ITS available but it's not enabled
because reading or writing the registers is denied by the
firmware. In fact, reading or writing them will cause the system
to reset. We could remove the node from DT in such a case, but
it's better to skip nodes that are marked as "disabled" in DT so
that we can describe the hardware that exists and use the status
property to indicate how the firmware has configured things.

Cc: Stuart Yoder
Cc: Laurentiu Tudor
Cc: Greg Kroah-Hartman
Cc: Marc Zyngier
Cc: Rajendra Nayak
Signed-off-by: Stephen Boyd
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
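The "skip nodes marked as disabled" policy from the commit above can be illustrated with a small userspace model. This is only a sketch, not the driver's code: `struct dt_node`, `node_is_available()` and `count_probed()` are hypothetical names, and the rule that an absent status, "okay" or "ok" means "available" is the usual devicetree convention assumed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a devicetree node; not a kernel type. */
struct dt_node {
    const char *name;
    const char *status;   /* NULL when the property is absent */
};

/* Assumed convention: absent status, "okay" or "ok" means available. */
static bool node_is_available(const struct dt_node *node)
{
    if (!node->status)
        return true;
    return strcmp(node->status, "okay") == 0 ||
           strcmp(node->status, "ok") == 0;
}

/* Count the nodes a probe loop would touch, skipping all the others. */
static size_t count_probed(const struct dt_node *nodes, size_t n)
{
    size_t i, probed = 0;

    for (i = 0; i < n; i++) {
        if (!node_is_available(&nodes[i]))
            continue;     /* never touch this node's registers */
        probed++;
    }
    return probed;
}
```

The point of the design is that the hardware description stays complete, while the status property alone decides whether the registers are ever accessed.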
21 Mar, 2018
1 commit
-
commit 4f2c7583e33eb08dc09dd2e25574b80175ba7d93 upstream.
When struct its_device instances are created, the nr_ites member
will be set to a power of 2 that equals or exceeds the requested
number of MSIs passed to the msi_prepare() callback. At the same
time, the LPI map is allocated to be some multiple of 32 in size,
where the allocated size may be less than the requested size
depending on whether a contiguous range of sufficient size is
available in the global LPI bitmap.

This may result in a situation where nr_ites < nr_lpis, and
since nr_ites is what we program into the hardware when we map the
device, the additional LPIs will be non-functional.

For bog standard hardware, this does not really matter. However,
in cases where ITS device IDs are shared between different PCIe
devices, we may end up allocating these additional LPIs without
taking into account that they don't actually work.

So let's make nr_ites at least 32. This ensures that all allocated
LPIs are 'live', and that its_alloc_device_irq() will fail when
attempts are made to allocate MSIs beyond what was allocated in
the first place.

Signed-off-by: Ard Biesheuvel
[maz: updated comment]
Signed-off-by: Marc Zyngier
Signed-off-by: Greg Kroah-Hartman
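The sizing rule from the commit above (a power of two that covers the request, but never less than 32) can be sketched as follows. The helper names `roundup_pow2()` and `nr_ites_for()` and the `LPI_CHUNK` constant are illustrative assumptions, not the driver's identifiers.

```c
#include <stdint.h>

#define LPI_CHUNK 32u   /* assumed LPI allocation granule, per the text above */

/* Round v up to the next power of two; valid for 1 <= v <= 2^31. */
static uint32_t roundup_pow2(uint32_t v)
{
    uint32_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/* Number of ITT entries to program for a request of nvecs MSIs. */
static uint32_t nr_ites_for(uint32_t nvecs)
{
    uint32_t n = roundup_pow2(nvecs);

    return n < LPI_CHUNK ? LPI_CHUNK : n;
}
```

With this rule a request for a single MSI still yields 32 entries, so every LPI in the smallest possible allocation is backed by an entry the hardware knows about.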
13 Oct, 2017
3 commits
-
The current ITS driver works fine as long as normal memory and GICR
regions are located within the lower 48bit (>=0 &&
Signed-off-by: Marc Zyngier
-
The VCPU table consists of vPE entries, and its size determines the number
of VPEs supported by GICv4 hardware. Unfortunately, the maximum size of
the VPE table is not discoverable like that of the Device table. All VLPI
commands limit the VPEID, which is the index into the VCPU table, to
16 bits. So don't apply the DEVID bits to the VCPU table; instead, assume
a maximum of 16 bits.

ITS log messages on QDF2400 without fix:
allocated 524288 Devices (indirect, esz 8, psz 64K, shr 1)
allocated 8192 Interrupt Collections (flat, esz 8, psz 64K, shr 1)
Virtual CPUs Table too large, reduce ids 32->26
Virtual CPUs too large, reduce ITS pages 8192->256
allocated 2097152 Virtual CPUs (flat, esz 8, psz 64K, shr 1)

ITS log messages on QDF2400 with fix:
allocated 524288 Devices (indirect, esz 8, psz 64K, shr 1)
allocated 8192 Interrupt Collections (flat, esz 8, psz 64K, shr 1)
allocated 65536 Virtual CPUs (flat, esz 8, psz 64K, shr 1)

Signed-off-by: Shanker Donthineni
Signed-off-by: Marc Zyngier -
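For the VCPU table sizing change above (16-bit VPEID), the numbers in the QDF2400 logs can be reproduced with simple arithmetic. The helpers below are a hypothetical sketch: the entry size (8 bytes) and page size (64K) come from the log lines, while the function names and the reading of "256" as a page cap are assumptions.

```c
#include <stdint.h>

/* Entries addressable with id_bits bits of ID (id_bits < 64). */
static uint64_t table_entries(unsigned int id_bits)
{
    return 1ULL << id_bits;
}

/* Pages needed by a flat table that holds 2^id_bits entries. */
static uint64_t flat_table_pages(unsigned int id_bits, uint64_t esz,
                                 uint64_t psz)
{
    uint64_t bytes = table_entries(id_bits) * esz;

    return (bytes + psz - 1) / psz;   /* round up to whole pages */
}

/* Entries that fit in a given number of pages. */
static uint64_t entries_in_pages(uint64_t pages, uint64_t esz, uint64_t psz)
{
    return pages * psz / esz;
}
```

With 16 bits, `flat_table_pages(16, 8, 65536)` gives 8 pages for 65536 entries, matching the log with the fix. With 26 bits the same formula gives 8192 pages, and 256 pages hold 2097152 entries, matching the two "reduce" messages in the log without the fix.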
The driver probe path hits 'BUG_ON(entries != vpe_proxy.dev->nr_ites)'
on systems that have VLPI capability, don't support the direct LPI
feature, and boot with a single CPU.

Relax the BUG_ON() condition to fix the issue.
Signed-off-by: Shanker Donthineni
Signed-off-by: Marc Zyngier
05 Sep, 2017
1 commit
-
Pull irq updates from Thomas Gleixner:
"The interrupt subsystem delivers this time:

- Refactoring of the GIC-V3 driver to prepare for the GIC-V4 support
- Initial GIC-V4 support
- Consolidation of the FSL MSI support
- Utilize the effective affinity interface in various ARM irqchip
  drivers
- Yet another interrupt chip driver (UniPhier AIDET)
- Bulk conversion of the irq chip driver to use %pOF
- The usual small fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
irqchip/ls-scfg-msi: Add MSI affinity support
irqchip/ls-scfg-msi: Add LS1043a v1.1 MSI support
irqchip/ls-scfg-msi: Add LS1046a MSI support
arm64: dts: ls1046a: Add MSI dts node
arm64: dts: ls1043a: Share all MSIs
arm: dts: ls1021a: Share all MSIs
arm64: dts: ls1043a: Fix typo of MSI compatible string
arm: dts: ls1021a: Fix typo of MSI compatible string
irqchip/ls-scfg-msi: Fix typo of MSI compatible strings
irqchip/irq-bcm7120-l2: Use correct I/O accessors for irq_fwd_mask
irqchip/mmp: Make mmp_intc_conf const
irqchip/gic: Make irq_chip const
irqchip/gic-v3: Advertise GICv4 support to KVM
irqchip/gic-v4: Enable low-level GICv4 operations
irqchip/gic-v4: Add some basic documentation
irqchip/gic-v4: Add VLPI configuration interface
irqchip/gic-v4: Add VPE command interface
irqchip/gic-v4: Add per-VM VPE domain creation
irqchip/gic-v3-its: Set implementation defined bit to enable VLPIs
irqchip/gic-v3-its: Allow doorbell interrupts to be injected/cleared
...
01 Sep, 2017
1 commit
-
…m-platforms into irq/core
Pull irqchip updates for 4.14 from Marc Zyngier:
- irqchip-specific part of the monster GICv4 series
- new UniPhier AIDET irqchip driver
- new variants of some Freescale MSI widget
- blanket removal of of_node->full_name in printk
- random collection of fixes
31 Aug, 2017
17 commits
-
Get the show on the road...
Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
A long time ago, GITS_CTLR[1] used to be called GITC_CTLR.EnableVLPI.
It has been subsequently deprecated and is now an "Implementation
Defined" bit that may or may not be set for GICv4. Brilliant.

And the current crop of the FastModel requires that bit for VLPIs
to be enabled. Oh well... Let's set it and find out what breaks.

Signed-off-by: Marc Zyngier
-
While the doorbell interrupts are usually driven by the HW itself,
having a way to trigger them independently has proved to be a
really useful debug feature. As it is actually very little code,
let's add it to the VPE irqchip operations.

Signed-off-by: Marc Zyngier
-
After moving a VPE from one redistributor to another, we're still left
with a potential pending doorbell interrupt on the old redistributor.
That interrupt should be moved to the new one to be either cleared
or taken, depending on what the hypervisor wishes to do.

So let's move it right after having executed VMOVP. This doesn't
add much cost in the !DirectLPI case (we trade a DISCARD for a MOVI),
and the cost of the DirectLPI case should be minimal (two extra MMIO
accesses).

Signed-off-by: Marc Zyngier
-
When we don't have the DirectLPI feature, we must work around the
architecture shortcomings to be able to perform the required
maintenance (interrupt masking, clearing and injection).

For this, we create a fake device whose sole purpose is to
provide a way to issue commands as if we were dealing with LPIs
coming from that device (while they actually originate from
the ITS). This fake device doesn't have LPIs allocated to it,
but instead uses the VPE LPIs.

Of course, this could be a real bottleneck, and a naive
implementation would require 6 commands to issue an invalidation.

Instead, let's allocate at least one event per physical CPU
(rounded up to the next power of 2), and opportunistically
map the VPE doorbell to an event. This doorbell will be mapped
until we roll over and need to reallocate this slot.

This ensures that most of the time, we only need 2 commands
to issue an INV, INT or CLEAR, making the performance a lot
better, given that we always issue a CLEAR on entry, and
an INV on each side of a trapped WFI.

Signed-off-by: Marc Zyngier
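The opportunistic slot mapping described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the driver's implementation: `struct vpe`, `struct proxy`, `proxy_map()` and the fixed `NR_SLOTS` are hypothetical, and the ITS commands that a real map or unmap would issue are left out.

```c
#include <stddef.h>

#define NR_SLOTS 4     /* assumed: CPU count rounded up to a power of two */
#define NO_SLOT  (-1)

struct vpe {                 /* hypothetical, trimmed-down VPE */
    int proxy_event;         /* slot index, or NO_SLOT when unmapped */
};

struct proxy {               /* hypothetical proxy device state */
    struct vpe *slots[NR_SLOTS];
    int next_victim;         /* slot reused on the next miss */
};

static void proxy_init(struct proxy *p)
{
    size_t i;

    for (i = 0; i < NR_SLOTS; i++)
        p->slots[i] = NULL;
    p->next_victim = 0;
}

/* Return the event the VPE doorbell is mapped to, mapping it if needed. */
static int proxy_map(struct proxy *p, struct vpe *v)
{
    struct vpe *old;

    if (v->proxy_event != NO_SLOT)
        return v->proxy_event;          /* fast path: already mapped */

    old = p->slots[p->next_victim];
    if (old)
        old->proxy_event = NO_SLOT;     /* roll over: evict previous owner */

    p->slots[p->next_victim] = v;
    v->proxy_event = p->next_victim;
    p->next_victim = (p->next_victim + 1) % NR_SLOTS;
    return v->proxy_event;
}
```

As long as a VPE keeps its slot, maintenance operations take the fast path; only a rollover pays for remapping, which is the trade-off the commit message describes.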
-
The normal course of action when allocating the ITS' view of a
device is to allocate the corresponding LPIs. But we're about
to introduce devices that borrow their interrupts from
some other entities.

So let's make the allocation optional.
Signed-off-by: Marc Zyngier
-
When masking/unmasking a doorbell interrupt, it is necessary
to issue an invalidation to the corresponding redistributor.
We use the DirectLPI feature by writing directly to the corresponding
redistributor.

Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
When we're about to run a vcpu, it is crucial that the redistributor
associated with the physical CPU is told about the new residency.

This is abstracted by hijacking the irq_set_affinity method for the
doorbell interrupt associated with the VPE. It is expected that the
hypervisor will call this method before scheduling the VPE.

Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
When a guest issues an INVALL command targeting a collection, it must
be translated into a VINVALL for the VPE that has this collection.

This patch implements a hook that offers this functionality to the
hypervisor.

Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
When a VPE is scheduled to run, the corresponding redistributor must
be told so, by setting VPROPBASER to the VM's property table, and
VPENDBASER to the vcpu's pending table.

When scheduled out, we preserve the IDAI and PendingLast bits. The
latter is especially important, as it tells the hypervisor that
there are pending interrupts for this vcpu.

Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
On activation, a VPE is mapped using the VMAPP command, followed
by a VINVALL for good measure. On deactivation, the VPE is
simply unmapped.

Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
When creating a VM, the low level GICv4 code is responsible for:
- allocating each VPE a unique VPEID
- allocating a doorbell interrupt for each VPE
- allocating the pending tables for each VPE
- allocating the property table for the VM

This of course has to be reversed when the VM is brought down.
All of this is wired into the irq domain alloc/free methods.
Signed-off-by: Marc Zyngier
-
Add the basic GICv4 VPE (vcpu in GICv4 parlance) infrastructure
(irqchip, irq domain) that is going to be populated in the following
patches.

Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
When a VLPI is reconfigured (enabled, disabled, change in priority),
the full configuration byte must be written, and the caches invalidated.

Also, when using the irq_mask/irq_unmask methods, it is necessary
to disable the doorbell for that particular interrupt (by mapping it
to 1023) on top of clearing the Enable bit.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
To let a VLPI be injected into a guest, the VLPI must
be mapped using the VMAPTI command. When moved to a different vcpu,
it must be moved with the VMOVI command.

These commands are issued via the irq_set_vcpu_affinity method,
making sure we unmap the corresponding host LPI first.

The reverse is also done when the VLPI is unmapped from the guest.
Signed-off-by: Marc Zyngier
-
Add the skeleton irq_set_vcpu_affinity method that will be used
to configure VLPIs.

Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
Add the new GICv4 ITS command definitions, most of them being
defined in terms of their physical counterparts.

Reviewed-by: Eric Auger
Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
23 Aug, 2017
11 commits
-
We're going to need to change a bit more than just the enable
bit in the LPI property table in the future. So let's change the
LPI configuration function to take a set of bits to be cleared,
and a set of bits to be set.

This way, we'll be able to use it when a guest updates an LPI
property (priority, for example).

Reviewed-by: Eric Auger
Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier -
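For the LPI configuration change above ("a set of bits to be cleared, and a set of bits to be set"), the clear-then-set pattern looks roughly like this. It is a minimal sketch: `lpi_modify_config()` is a hypothetical name, and the bit layout (enable in bit 0, priority in bits 7:2) is an assumption made for the example.

```c
#include <stdint.h>

#define LPI_PROP_ENABLED    (1u << 0)   /* assumed position of the enable bit */
#define LPI_PROP_PRIO_MASK  0xfcu       /* assumed priority field, bits 7:2 */

/* Clear the bits in clr, then set the bits in set, in one config byte. */
static uint8_t lpi_modify_config(uint8_t cfg, uint8_t clr, uint8_t set)
{
    return (uint8_t)((cfg & ~clr) | set);
}
```

Enabling an LPI is then `lpi_modify_config(cfg, 0, LPI_PROP_ENABLED)`, disabling it is `lpi_modify_config(cfg, LPI_PROP_ENABLED, 0)`, and a priority update clears `LPI_PROP_PRIO_MASK` while setting the new priority bits, all through the same helper.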
As we want to use 2-level tables for VCPUs, let's hack the device
table allocator in order to make it slightly more generic. It
will get reused in subsequent patches.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
Rework LPI deallocation so that it can be reused by the v4 support
code.

Reviewed-by: Eric Auger
Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
Just as for the property table, let's move the pending table
allocation to a separate function.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
The VCPU tables can be quite sparse as well, and it makes sense
to use indirect tables if possible.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
Move the LPI property table allocation into its own function, as
this is going to be required for those associated with VMs in
the future.

Reviewed-by: Eric Auger
Reviewed-by: Thomas Gleixner
Signed-off-by: Marc Zyngier
-
Allow the pending state of an LPI to be set or cleared via
irq_set_irqchip_state.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
Most ITS commands operate on a collection object, and require
a SYNC command to be performed on that collection in order to
guarantee the execution of the first command.

With GICv4 ITS, another set of commands performs similar operations
on a VPE object, and a VSYNC operation must be executed to guarantee
their execution.

Given the similarities (post a command, perform a synchronization
operation on a sync object), it makes sense to reuse the same
mechanism for both classes of commands.

Let's start with turning its_send_single_command into a huge macro
that performs the bulk of the work, and a set of helpers that
make this macro usable for the GICv3 ITS commands.

Signed-off-by: Marc Zyngier
-
Add the probing code for the ITS VLPI support. This includes
configuring the ITS number if not supporting the single VMOVP
command feature.

Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
The various LPI definitions are in the middle of the code, and
would be better placed at the beginning, given that we're going
to use some of them much earlier.

Reviewed-by: Thomas Gleixner
Reviewed-by: Eric Auger
Signed-off-by: Marc Zyngier
-
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.

Cc: Thomas Gleixner
Cc: Jason Cooper
Cc: Lee Jones
Cc: Stefan Wahren
Cc: Florian Fainelli
Cc: Ray Jui
Cc: Scott Branden
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: Sylvain Lemieux
Cc: Maxime Coquelin
Cc: Chen-Yu Tsai
Cc: Thierry Reding
Cc: Jonathan Hunter
Cc: Michal Simek
Cc: "Sören Brinkmann"
Cc: linux-rpi-kernel@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-tegra@vger.kernel.org
Acked-by: Eric Anholt
Acked-by: Baruch Siach
Acked-by: Vladimir Zapolskiy
Acked-by: Matthias Brugger
Acked-by: Alexandre Torgue
Acked-by: Maxime Ripard
Signed-off-by: Rob Herring
Signed-off-by: Marc Zyngier
19 Aug, 2017
1 commit
-
wait_for_range_completion() is nicely busted when handling
wrapping of the command queue, leading to an early exit
instead of waiting for the command to have been executed.

Fortunately, the impact is pretty minor, as it only impairs
the detection of an ITS that doesn't make any forward progress
for a whole second. And an ITS should *never* lock up.

Reported-by: Yang Yingliang
Signed-off-by: Marc Zyngier
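A wrap-aware completion check for a circular command queue, as discussed in the commit above, might look like the sketch below. It is modelled on the description rather than copied from the patch: `range_completed()` is a hypothetical helper, all values are byte offsets into the queue, and it assumes the read offset only moves forward from `from` towards `to`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Has the read offset rd reached the end of the range [from, to)?
 * Assumption: rd starts inside the range and only moves forward.
 */
static bool range_completed(uint64_t from, uint64_t to, uint64_t rd)
{
    if (from == to)
        return true;                    /* empty range, nothing to wait for */
    if (from < to)
        return rd >= to;                /* direct case: no wrap */
    return rd >= to && rd < from;       /* wrapped case: range crosses the end */
}
```

A check that only tests `rd >= to` reports completion too early in the wrapped case, because offsets near the end of the queue are numerically larger than a `to` that has wrapped around to the start.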
18 Aug, 2017
1 commit
-
The GICv3 ITS driver only targets a single CPU at a time, even if
the notional affinity is wider. Let's inform the core code
about this.

Signed-off-by: Marc Zyngier
Signed-off-by: Thomas Gleixner
Cc: Andrew Lunn
Cc: James Hogan
Cc: Jason Cooper
Cc: Paul Burton
Cc: Chris Zankel
Cc: Kevin Cernekee
Cc: Wei Xu
Cc: Max Filippov
Cc: Florian Fainelli
Cc: Gregory Clement
Cc: Matt Redfearn
Cc: Sebastian Hesselbarth
Link: http://lkml.kernel.org/r/20170818083925.10108-6-marc.zyngier@arm.com
14 Aug, 2017
1 commit
-
…arm-platforms into irq/urgent
Pull irqchip fixes for 4.13 from Marc Zyngier:
Mostly GIC related, again:
- GICv3 ITS NUMA handling fixes
- GICv3 force affinity handling
- Barrier adjustment in both GIC interrupt handling
- Error reporting when the DT presents an incompatible interrupt
- GICv3 platform MSI DT parsing bug fix
- Broadcom L2 PM fix
- Atmel AIC cleanups
10 Aug, 2017
1 commit
-
When enabling ITS NUMA support on D05, I got the boot log:
[ 0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[ 0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[ 0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[ 0.000000] SRAT: ITS affinity exceeding max count[4]

This is wrong on D05 as we have 8 ITSs with 4 NUMA nodes.
So instead of using its_srat_maps[MAX_NUMNODES], count the number
of ITS entries in SRAT, dynamically allocate its_srat_maps
as needed, then build the mapping of NUMA node to ITS ID. Of course,
its_srat_maps will be freed after ITS probing because
we don't need it after boot.

After doing this, I got what I wanted:
[ 0.000000] SRAT: PXM 0 -> ITS 0 -> Node 0
[ 0.000000] SRAT: PXM 0 -> ITS 1 -> Node 0
[ 0.000000] SRAT: PXM 0 -> ITS 2 -> Node 0
[ 0.000000] SRAT: PXM 1 -> ITS 3 -> Node 1
[ 0.000000] SRAT: PXM 2 -> ITS 4 -> Node 2
[ 0.000000] SRAT: PXM 2 -> ITS 5 -> Node 2
[ 0.000000] SRAT: PXM 2 -> ITS 6 -> Node 2
[ 0.000000] SRAT: PXM 3 -> ITS 7 -> Node 3

Fixes: dbd2b8267233 ("irqchip/gic-v3-its: Add ACPI NUMA node mapping")
Signed-off-by: Hanjun Guo
Reviewed-by: Lorenzo Pieralisi
Cc: Ganapatrao Kulkarni
Cc: John Garry
Signed-off-by: Marc Zyngier
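The count-then-allocate approach from the commit above can be sketched as a two-pass walk over the table entries. Everything here is illustrative: `struct srat_entry`, the `SRAT_*` tags and `build_its_map()` are hypothetical, and the example assumes each PXM maps to the NUMA node of the same number, as in the boot log.

```c
#include <stdlib.h>

enum srat_type { SRAT_CPU, SRAT_MEMORY, SRAT_ITS };   /* hypothetical tags */

struct srat_entry {            /* hypothetical parsed SRAT subtable */
    enum srat_type type;
    int pxm;
    int its_id;
};

struct its_numa_map {
    int numa_node;
    int its_id;
};

/* Build a map sized to the ITS entries actually present; caller frees it. */
static struct its_numa_map *build_its_map(const struct srat_entry *e,
                                          size_t n, size_t *out_n)
{
    struct its_numa_map *maps;
    size_t i, count = 0, k = 0;

    *out_n = 0;
    for (i = 0; i < n; i++)             /* pass 1: count ITS entries */
        if (e[i].type == SRAT_ITS)
            count++;
    if (count == 0)
        return NULL;

    maps = calloc(count, sizeof(*maps));
    if (!maps)
        return NULL;

    for (i = 0; i < n; i++) {           /* pass 2: fill the map */
        if (e[i].type != SRAT_ITS)
            continue;
        maps[k].numa_node = e[i].pxm;   /* assumed: PXM n is node n */
        maps[k].its_id = e[i].its_id;
        k++;
    }
    *out_n = count;
    return maps;
}
```

Sizing the array from the first pass removes the fixed upper bound that produced the "exceeding max count" message, and the caller can free the map once probing is done.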
02 Aug, 2017
1 commit
-
The version check was added due to a dependency on
a618c7f89a02 ACPICA: Add support for new SRAT subtable
Now that this code is in the kernel, remove the check. This is
especially useful to enable backports.

Signed-off-by: Robert Richter
Signed-off-by: Marc Zyngier