09 Mar, 2019

1 commit

  • Pull GPIO updates from Linus Walleij:
    "This is the bulk of GPIO changes for the v5.1 cycle:

    Core changes:

    - The big change this time around is the irqchip handling in the
    Qualcomm pin controllers, closely coupled with the gpiochip. This
    rework, in classic fall-between-the-chairs fashion, has been
    sidestepped for too long.

    The Qualcomm IRQchips using the SPMI and SSBI transport mechanisms
    have been rewritten to use hierarchical irqchip. This creates the
    base from which I intend to gradually pull support for hierarchical
    irqchips into the gpiolib irqchip helpers to cut down on duplicate
    code.

    We have too many hacks in the kernel because people have been
    working around the missing hierarchical irqchip for years, and once
    it was there, no one understood it for a while. We are now slowly
    adapting to using it.

    This is why this pull request includes changes to MFD, SPMI,
    IRQchip core and some ARM Device Trees pertaining to the Qualcomm
    chip family. Since Qualcomm has so many chips and such large
    deployments it is paramount that this platform gets this right, and
    now it (hopefully) does.

    - Core support for pull-up and pull-down configuration, also from the
    device tree. When a simple GPIO chip supports an "off or on" pull-up
    or pull-down resistor, we provide a way to set this up using
    machine descriptors or device tree.

    If more elaborate control of pull up/down (such as resistance shunt
    setting) is required, drivers should be phased over to use pin
    control. We do not yet provide a userspace ABI for this pull
    up-down setting but I suspect the makers are going to ask for it
    soon enough. PCA953x is the first user of this new API.

    - The GPIO mockup driver has been revamped after some discussion
    improving the IRQ simulator in the process.

    The idea is to make it possible to use the mockup for both testing
    and virtual prototyping, e.g. when you do not yet have a GPIO
    expander to play with but really want to get something to develop
    code around before hardware is available. It's neat. The blackbox
    testing usecase is currently making its way into kernelci.

    - ACPI GPIO core preserves non-direction flags when updating flags.

    - A new device core helper for devm_platform_ioremap_resource() is
    funneled through the GPIO tree with Greg's ACK.
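    To show what the new helper saves drivers, here is a minimal
    user-space model; every type and function below is a simplified
    stand-in for the kernel's, not the real API:

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Stand-ins for the kernel structures (modeled, not real). */
    struct resource { unsigned long start, end; };
    struct platform_device { struct resource *res; int nres; };

    /* Model of platform_get_resource(): look up a memory resource. */
    static struct resource *platform_get_resource(struct platform_device *pdev,
                                                  int type, unsigned int index)
    {
        (void)type;
        return index < (unsigned int)pdev->nres ? &pdev->res[index] : NULL;
    }

    /* Model of devm_ioremap_resource(): "map" the resource. */
    static void *devm_ioremap_resource(struct resource *res)
    {
        return res ? (void *)res->start : NULL; /* pretend mapping */
    }

    /* The new helper simply fuses the two calls drivers used to
     * open-code in nearly every probe() function. */
    static void *devm_platform_ioremap_resource(struct platform_device *pdev,
                                                unsigned int index)
    {
        return devm_ioremap_resource(platform_get_resource(pdev, 0, index));
    }

    int main(void)
    {
        struct resource r = { 0x1000, 0x1fff };
        struct platform_device pdev = { &r, 1 };

        assert(devm_platform_ioremap_resource(&pdev, 0) != NULL);
        assert(devm_platform_ioremap_resource(&pdev, 1) == NULL);
        return 0;
    }
    ```
    
    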

    New drivers:

    - TQ-Systems QTMX86 GPIO controllers (using port-mapped I/O)

    - Gateworks PLD GPIO driver (vacuumed up from OpenWrt)

    - AMD G-Series PCH (Platform Controller Hub) GPIO driver.

    - Fintek F81804 & F81966 subvariants.

    - PCA953x now supports NXP PCAL6416.

    Driver improvements:

    - IRQ support on the Nintendo Wii (Hollywood) GPIO.

    - get_direction() support for the MVEBU driver.

    - Set the right output level on SAMA5D2.

    - Drop the unused irq trigger setting on the Spreadtrum driver.

    - Wakeup support for PCA953x.

    - A slew of cleanups in the various Intel drivers"

    * tag 'gpio-v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (110 commits)
    gpio: gpio-omap: fix level interrupt idling
    gpio: amd-fch: Set proper output level for direction_output
    x86: apuv2: remove unused variable
    gpio: pca953x: Use PCA_LATCH_INT
    platform/x86: fix PCENGINES_APU2 Kconfig warning
    gpio: pca953x: Fix dereference of irq data in shutdown
    gpio: amd-fch: Fix type error found by sparse
    gpio: amd-fch: Drop const from resource
    gpio: mxc: add check to return defer probe if clock tree NOT ready
    gpio: ftgpio: Register per-instance irqchip
    gpio: ixp4xx: Add DT bindings
    x86: pcengines apuv2 gpio/leds/keys platform driver
    gpio: AMD G-Series PCH gpio driver
    drivers: depend on HAS_IOMEM for devm_platform_ioremap_resource()
    gpio: tqmx86: Set proper output level for direction_output
    gpio: sprd: Change to use SoC compatible string
    gpio: sprd: Use SoC compatible string instead of wildcard string
    gpio: of: Handle both enable-gpio{,s}
    gpio: of: Restrict enable-gpio quirk to regulator-gpio
    gpio: davinci: use devm_platform_ioremap_resource()
    ...

    Linus Torvalds
     

23 Feb, 2019

1 commit


21 Feb, 2019

2 commits

  • Linus Walleij
     
  • The default irq domain allows legacy code to create irqdomain
    mappings without having to track the domain it is allocating
    from. Setting the default domain is a one-shot, fire-and-forget
    operation, and no effort was made to be able to retrieve this
    information at a later point in time.

    Newer irqdomain APIs (the hierarchical stuff) rely on both
    the irqchip code to track the irqdomain it is allocating from,
    as well as some form of firmware abstraction to easily identify
    which piece of HW maps to which irq domain (DT, ACPI).

    For systems without such firmware (or legacy platforms that are
    getting dragged into the 21st century), things are a bit harder.
    For these cases (and these cases only!), let's provide a way
    to retrieve the default domain, allowing the use of the v2 API
    without having to resort to platform-specific hacks.
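    The one-shot semantics can be sketched in a few lines; this is a
    user-space model, though the function names mirror the kernel's
    irq_set_default_host()/irq_get_default_host() pair:

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct irq_domain { const char *name; };   /* simplified stand-in */

    static struct irq_domain *irq_default_domain;

    /* Fire-and-forget: set once, no unwind expected. */
    static void irq_set_default_host(struct irq_domain *domain)
    {
        irq_default_domain = domain;
    }

    /* The new accessor: lets DT/ACPI-less legacy platforms retrieve
     * the default domain instead of resorting to platform hacks. */
    static struct irq_domain *irq_get_default_host(void)
    {
        return irq_default_domain;
    }

    int main(void)
    {
        struct irq_domain root = { "root" };

        assert(irq_get_default_host() == NULL); /* nothing set yet */
        irq_set_default_host(&root);
        assert(irq_get_default_host() == &root);
        return 0;
    }
    ```
    
    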

    Signed-off-by: Marc Zyngier

    Marc Zyngier
     

20 Feb, 2019

2 commits


18 Feb, 2019

5 commits

  • Now that the NVME driver is converted over to the calc_sets() callback, the
    workarounds of the original set support can be removed.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ming Lei
    Acked-by: Marc Zyngier
    Cc: Christoph Hellwig
    Cc: Bjorn Helgaas
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-nvme@lists.infradead.org
    Cc: linux-pci@vger.kernel.org
    Cc: Keith Busch
    Cc: Sumit Saxena
    Cc: Kashyap Desai
    Cc: Shivasharan Srikanteshwara
    Link: https://lkml.kernel.org/r/20190216172228.689834224@linutronix.de

    Thomas Gleixner
     
  • The interrupt affinity spreading mechanism supports spreading out
    affinities for one or more interrupt sets. An interrupt set contains one or
    more interrupts. Each set is mapped to a specific functionality of a
    device, e.g. general I/O queues and read I/O queues of multiqueue block
    devices.

    The number of interrupts per set is defined by the driver. It depends on
    the total number of available interrupts for the device, which is
    determined by the PCI capabilities and the availability of underlying CPU
    resources, and the number of queues which the device provides and the
    driver wants to instantiate.

    The driver passes initial configuration for the interrupt allocation via a
    pointer to struct irq_affinity.

    Right now the allocation mechanism is complex as it requires a loop in the
    driver to determine the maximum number of interrupts which are provided by
    the PCI capabilities and the underlying CPU resources. This loop would
    have to be replicated in every driver which wants to utilize this
    mechanism. That's unwanted code duplication and is error prone.

    In order to move this into generic facilities, a mechanism is required
    which allows the recalculation of the interrupt sets and their size in the
    core code. As the core code does not have any knowledge about the
    underlying device, a driver specific callback is required in struct
    irq_affinity, which can be invoked by the core code. The callback gets the
    number of available interrupts as an argument, so the driver can calculate
    the corresponding number and size of interrupt sets.

    At the moment the struct irq_affinity pointer which is handed in from the
    driver and passed through to several core functions is marked 'const', but for
    the callback to be able to modify the data in the struct it's required to
    remove the 'const' qualifier.

    Add the optional callback to struct irq_affinity, which allows drivers to
    recalculate the number and size of interrupt sets and remove the 'const'
    qualifier.

    For simple invocations, which do not supply a callback, a default callback
    is installed, which just sets nr_sets to 1 and transfers the number of
    spreadable vectors to the set_size array at index 0.

    This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
    working until it is converted to the callback mechanism.

    To make sure that the driver configuration is correct under all
    circumstances, the callback is invoked even when there are no interrupts
    for queues left, i.e. the pre/post requirements already exhaust the number
    of available interrupts.

    At the PCI layer irq_create_affinity_masks() has to be invoked even for the
    case where the legacy interrupt is used. That ensures that the callback is
    invoked and the device driver can adjust to that situation.

    [ tglx: Fixed the simple case (no sets required). Moved the sanity check
    for nr_sets after the invocation of the callback so it catches
    broken drivers. Fixed the kernel doc comments for struct
    irq_affinity and de-'This patch'-ed the changelog ]

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: Christoph Hellwig
    Cc: Bjorn Helgaas
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-nvme@lists.infradead.org
    Cc: linux-pci@vger.kernel.org
    Cc: Keith Busch
    Cc: Sumit Saxena
    Cc: Kashyap Desai
    Cc: Shivasharan Srikanteshwara
    Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de

    Ming Lei
     
  • The interrupt affinity spreading mechanism supports spreading out
    affinities for one or more interrupt sets. An interrupt set contains one
    or more interrupts. Each set is mapped to a specific functionality of a
    device, e.g. general I/O queues and read I/O queues of multiqueue block
    devices.

    The number of interrupts per set is defined by the driver. It depends on
    the total number of available interrupts for the device, which is
    determined by the PCI capabilities and the availability of underlying CPU
    resources, and the number of queues which the device provides and the
    driver wants to instantiate.

    The driver passes initial configuration for the interrupt allocation via
    a pointer to struct irq_affinity.

    Right now the allocation mechanism is complex as it requires a loop in
    the driver to determine the maximum number of interrupts which are
    provided by the PCI capabilities and the underlying CPU resources. This
    loop would have to be replicated in every driver which wants to utilize
    this mechanism. That's unwanted code duplication and is error prone.

    In order to move this into generic facilities, a mechanism is required
    which allows the recalculation of the interrupt sets and their size in
    the core code. As the core code does not have any knowledge about the
    underlying device, a driver specific callback will be added to struct
    irq_affinity, which will be invoked by the core code. The callback will
    get the number of available interrupts as an argument, so the driver can
    calculate the corresponding number and size of interrupt sets.

    To support this, two modifications for the handling of struct irq_affinity
    are required:

    1) The (optional) interrupt sets size information is contained in a
    separate array of integers and struct irq_affinity contains a
    pointer to it.

    This is cumbersome and as the maximum number of interrupt sets is small,
    there is no reason to have separate storage. Moving the size array into
    struct irq_affinity avoids indirections and makes the code simpler.

    2) At the moment the struct irq_affinity pointer which is handed in from
    the driver and passed through to several core functions is marked
    'const'.

    With the upcoming callback to recalculate the number and size of
    interrupt sets, it's necessary to remove the 'const'
    qualifier. Otherwise the callback would not be able to update the data.

    Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

    No functional change.

    [ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
    source. Fixed the kernel doc comments for struct irq_affinity and
    de-'This patch'-ed the changelog ]

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: Christoph Hellwig
    Cc: Bjorn Helgaas
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-nvme@lists.infradead.org
    Cc: linux-pci@vger.kernel.org
    Cc: Keith Busch
    Cc: Sumit Saxena
    Cc: Kashyap Desai
    Cc: Shivasharan Srikanteshwara
    Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de

    Ming Lei
     
  • All information and calculations in the interrupt affinity spreading code
    are strictly unsigned int, yet the code uses plain int all over the place.

    Convert it over to unsigned int.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ming Lei
    Acked-by: Marc Zyngier
    Cc: Christoph Hellwig
    Cc: Bjorn Helgaas
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-nvme@lists.infradead.org
    Cc: linux-pci@vger.kernel.org
    Cc: Keith Busch
    Cc: Sumit Saxena
    Cc: Kashyap Desai
    Cc: Shivasharan Srikanteshwara
    Link: https://lkml.kernel.org/r/20190216172228.336424556@linutronix.de

    Thomas Gleixner
     
  • …rnel/git/brgl/linux into devel

    gpio updates for v5.1

    - support for a new variant of pca953x
    - documentation fix from Wolfram
    - some tegra186 name changes
    - two minor fixes for madera and altera-a10sr

    Linus Walleij
     

15 Feb, 2019

1 commit


14 Feb, 2019

1 commit


13 Feb, 2019

2 commits

  • The hierarchical irqchip code has never before run into a situation
    where the parent is not "simple", i.e. one that does not implement
    .irq_ack() and .irq_mask() like most, but qcom-pm8xxx.c
    happens to implement only .irq_mask_ack().

    Since we want to make ssbi-gpio a hierarchical child of this
    irqchip, it must *also* only implement .irq_mask_ack()
    and call down to the parent, and for this we of course
    need irq_chip_mask_ack_parent().

    Cc: Marc Zyngier
    Cc: Thomas Gleixner
    Acked-by: Marc Zyngier
    Signed-off-by: Brian Masney
    Signed-off-by: Linus Walleij

    Linus Walleij
     
  • Add a new function irq_domain_translate_twocell() that is to be used as
    the translate function in struct irq_domain_ops for the v2 IRQ API.

    This patch also changes irq_domain_xlate_twocell() from the v1 IRQ API
    to call irq_domain_translate_twocell() in the v2 IRQ API. This required
    changes to of_phandle_args_to_fwspec()'s arguments so that it can be
    called from multiple places.

    Cc: Thomas Gleixner
    Reviewed-by: Marc Zyngier
    Signed-off-by: Brian Masney
    Signed-off-by: Linus Walleij

    Brian Masney
     

11 Feb, 2019

2 commits

  • Waiman reported that on large systems with a large amount of interrupts the
    readout of /proc/stat takes a long time to sum up the interrupt
    statistics. In principle this is not a problem, but for unknown reasons
    some enterprise quality software reads /proc/stat with a high frequency.

    The reason for this is that interrupt statistics are accounted per cpu. So
    the /proc/stat logic has to sum up the interrupt stats for each interrupt.

    This can be largely avoided for interrupts which are not marked as
    'PER_CPU' interrupts by simply adding a per interrupt summation counter
    which is incremented along with the per interrupt per cpu counter.

    The PER_CPU interrupts need to avoid that and use only per cpu accounting
    because they share the interrupt number and the interrupt descriptor and
    concurrent updates would conflict or require unwanted synchronization.

    Reported-by: Waiman Long
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Waiman Long
    Reviewed-by: Marc Zyngier
    Reviewed-by: Davidlohr Bueso
    Cc: Matthew Wilcox
    Cc: Andrew Morton
    Cc: Alexey Dobriyan
    Cc: Kees Cook
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Davidlohr Bueso
    Cc: Miklos Szeredi
    Cc: Daniel Colascione
    Cc: Dave Chinner
    Cc: Randy Dunlap
    Link: https://lkml.kernel.org/r/20190208135020.925487496@linutronix.de


    Thomas Gleixner
     
  • 'node_to_cpumask' is just a temporary variable for irq_build_affinity_masks(),
    so move it into irq_build_affinity_masks().

    No functional change.

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Bjorn Helgaas
    Cc: Christoph Hellwig
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-nvme@lists.infradead.org
    Cc: linux-pci@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190125095347.17950-2-ming.lei@redhat.com

    Ming Lei
     

05 Feb, 2019

4 commits

  • NMI handling code should be executed between calls to nmi_enter and
    nmi_exit.

    Add a separate domain handler to properly setup NMI context when handling
    an interrupt requested as NMI.

    Signed-off-by: Julien Thierry
    Acked-by: Marc Zyngier
    Cc: Thomas Gleixner
    Cc: Marc Zyngier
    Cc: Will Deacon
    Cc: Peter Zijlstra
    Signed-off-by: Marc Zyngier

    Julien Thierry
     
  • Provide flow handlers that are NMI safe for interrupts and percpu_devid
    interrupts.

    Signed-off-by: Julien Thierry
    Acked-by: Marc Zyngier
    Cc: Thomas Gleixner
    Cc: Marc Zyngier
    Cc: Peter Zijlstra
    Signed-off-by: Marc Zyngier

    Julien Thierry
     
  • Add support for percpu_devid interrupts treated as NMIs.

    Percpu_devid NMIs need to be setup/torn down on each CPU they target.

    The same restrictions as for global NMIs still apply for percpu_devid NMIs.

    Signed-off-by: Julien Thierry
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Julien Thierry
     
  • Add functionality to allocate interrupt lines that will deliver IRQs
    as Non-Maskable Interrupts. These allocations are only successful if
    the irqchip provides the necessary support and allows NMI delivery for the
    interrupt line.

    Interrupt lines allocated for NMI delivery must be enabled/disabled through
    enable_nmi/disable_nmi_nosync to keep their state consistent.

    To treat a PERCPU IRQ as NMI, the interrupt must not be shared nor threaded,
    the irqchip directly managing the IRQ must be the root irqchip and the
    irqchip cannot be behind a slow bus.
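    The eligibility rules in the last paragraph can be modeled as a toy
    predicate. All types below are simplified stand-ins, not kernel
    structures; only the shape of the checks is meant to match:

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct irq_chip { bool can_nmi; };
    struct irq_data { struct irq_chip *chip; struct irq_data *parent; };
    struct irq_desc { struct irq_data d; bool shared, threaded, slow_bus; };

    /* Toy model of the checks an NMI allocation must pass. */
    static bool irq_supports_nmi(const struct irq_desc *desc)
    {
        if (desc->d.parent)                  /* must be the root irqchip */
            return false;
        if (desc->slow_bus)                  /* no slow-bus irqchips */
            return false;
        if (desc->shared || desc->threaded)  /* not shared nor threaded */
            return false;
        return desc->d.chip && desc->d.chip->can_nmi;
    }

    int main(void)
    {
        struct irq_chip gic = { .can_nmi = true };
        struct irq_desc ok = { .d = { &gic, NULL } };
        struct irq_desc shared = ok;

        shared.shared = true;
        assert(irq_supports_nmi(&ok));
        assert(!irq_supports_nmi(&shared));  /* shared lines are rejected */
        return 0;
    }
    ```
    
    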

    Signed-off-by: Julien Thierry
    Reviewed-by: Marc Zyngier
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Marc Zyngier
    Signed-off-by: Marc Zyngier

    Julien Thierry
     

04 Feb, 2019

1 commit


30 Jan, 2019

1 commit


18 Jan, 2019

1 commit

  • The recent rework of alloc_descs() introduced a double increment of the
    loop counter. As a consequence only every second affinity mask is
    validated.

    Remove it.
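    The effect of the stray increment is easy to demonstrate with a toy
    loop (the validation step is a stand-in for the real affinity-mask
    check in alloc_descs()):

    ```c
    #include <assert.h>

    /* Model of the loop-counter bug: the for-statement already advances
     * i, so an extra i++ in the body skips every second element. */
    static int validated_masks(int cnt, int buggy)
    {
        int validated = 0;

        for (int i = 0; i < cnt; i++) {
            validated++;     /* stands in for the affinity-mask check */
            if (buggy)
                i++;         /* the stray double increment */
        }
        return validated;
    }

    int main(void)
    {
        assert(validated_masks(8, 1) == 4);  /* only every second mask */
        assert(validated_masks(8, 0) == 8);  /* fixed: all masks checked */
        return 0;
    }
    ```
    
    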

    [ tglx: Massaged changelog ]

    Fixes: c410abbbacb9 ("genirq/affinity: Add is_managed to struct irq_affinity_desc")
    Signed-off-by: Huacai Chen
    Signed-off-by: Thomas Gleixner
    Cc: Fuxin Zhang
    Cc: Zhangjin Wu
    Cc: Huacai Chen
    Cc: Dou Liyang
    Link: https://lkml.kernel.org/r/1547694009-16261-1-git-send-email-chenhc@lemote.com

    Huacai Chen
     

15 Jan, 2019

3 commits

  • If all CPUs in the irq_default_affinity mask are offline when an interrupt
    is initialized then irq_setup_affinity() can set an empty affinity mask for
    a newly allocated interrupt.

    Fix this by falling back to cpu_online_mask in case the resulting affinity
    mask is zero.
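    The fallback can be sketched with cpumasks modeled as plain bitmasks
    (the kernel uses struct cpumask and cpumask_* helpers; this is just
    the shape of the fix):

    ```c
    #include <assert.h>

    typedef unsigned int cpumask_t;   /* one bit per CPU, modeled as an int */

    /* Model of irq_setup_affinity(): AND the preferred mask with the
     * online CPUs, and fall back when the result is empty. */
    static cpumask_t setup_affinity(cpumask_t irq_default_affinity,
                                    cpumask_t cpu_online_mask)
    {
        cpumask_t mask = irq_default_affinity & cpu_online_mask;

        if (!mask)                     /* all preferred CPUs are offline */
            mask = cpu_online_mask;
        return mask;
    }

    int main(void)
    {
        /* Preferred CPUs 4-7 all offline: fall back to online mask. */
        assert(setup_affinity(0xF0, 0x0F) == 0x0F);
        /* Normal case: intersection is non-empty and is used as-is. */
        assert(setup_affinity(0x03, 0x0F) == 0x03);
        return 0;
    }
    ```
    
    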

    Signed-off-by: Srinivas Ramana
    Signed-off-by: Thomas Gleixner
    Cc: linux-arm-msm@vger.kernel.org
    Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.org

    Srinivas Ramana
     
  • There is a plan to build the kernel with -Wimplicit-fallthrough. The
    fallthrough in __handle_irq_event_percpu() is annotated, but the
    annotation is followed by an additional comment and is not recognized by GCC.

    Separate the 'fall through' and the rest of the comment with a dash so the
    regular expression used by GCC matches.
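    The comment format matters because GCC matches fallthrough comments
    against a regular expression. The snippet below illustrates the
    accepted form with a made-up switch (the case values and logic are
    illustrative, not the kernel's):

    ```c
    #include <assert.h>

    /* With -Wimplicit-fallthrough, "fall through" followed by extra
     * prose is only accepted when separated with a dash, as below. */
    static int classify(int action_ret)
    {
        int randomness = 0, handled = 0;

        switch (action_ret) {
        case 1: /* e.g. IRQ_WAKE_THREAD */
            randomness = 1;
            /* fall through - to add to randomness */
        case 2: /* e.g. IRQ_HANDLED */
            handled = 1;
            break;
        default:
            break;
        }
        return randomness * 10 + handled;
    }

    int main(void)
    {
        assert(classify(1) == 11);  /* falls through: both actions taken */
        assert(classify(2) == 1);
        return 0;
    }
    ```
    
    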

    Signed-off-by: Mathieu Malaterre
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190114203633.18557-1-malat@debian.org

    Mathieu Malaterre
     
  • There is a plan to build the kernel with -Wimplicit-fallthrough. The
    fallthrough in __irq_set_trigger() lacks an annotation. Add it.

    Signed-off-by: Mathieu Malaterre
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190114203154.17125-1-malat@debian.org

    Mathieu Malaterre
     

19 Dec, 2018

4 commits

  • Devices which use managed interrupts usually have two classes of
    interrupts:

    - Interrupts for multiple device queues
    - Interrupts for general device management

    Currently both classes are treated the same way, i.e. as managed
    interrupts. The general interrupts get the default affinity mask assigned
    while the device queue interrupts are spread out over the possible CPUs.

    Treating the general interrupts as managed is both a limitation and under
    certain circumstances a bug. Assume the following situation:

    default_irq_affinity = 4..7

    So if CPUs 4-7 are offlined, then the core code will shut down the device
    management interrupts because the last CPU in their affinity mask went
    offline.

    It's also a limitation because it's desired to allow manual placement of
    the general device interrupts for various reasons. If they are marked
    managed then the interrupt affinity setting from both user and kernel space
    is disabled. That limitation was reported by Kashyap and Sumit.

    Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
    for truly managed interrupts (queue interrupts) and cleared for the general
    device interrupts.
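    A compilable sketch of the expanded descriptor (struct cpumask is
    modeled as a single word here; the real field is a full cpumask):

    ```c
    #include <assert.h>

    struct cpumask { unsigned long bits; };   /* simplified stand-in */

    /* Sketch of the expanded struct irq_affinity_desc. */
    struct irq_affinity_desc {
        struct cpumask mask;
        unsigned int   is_managed : 1;  /* queue irq vs. general mgmt irq */
    };

    int main(void)
    {
        /* e.g. one management interrupt plus two queue interrupts */
        struct irq_affinity_desc descs[3] = {
            { { 0xFF }, 0 },  /* general: default affinity, user-settable */
            { { 0x03 }, 1 },  /* queue 0: truly managed, spread by core  */
            { { 0x0C }, 1 },  /* queue 1 */
        };

        assert(!descs[0].is_managed);   /* survives CPU 4-7 going offline */
        assert(descs[1].is_managed && descs[2].is_managed);
        return 0;
    }
    ```
    
    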

    [ tglx: Simplify code and massage changelog ]

    Reported-by: Kashyap Desai
    Reported-by: Sumit Saxena
    Signed-off-by: Dou Liyang
    Signed-off-by: Thomas Gleixner
    Cc: linux-pci@vger.kernel.org
    Cc: shivasharan.srikanteshwara@broadcom.com
    Cc: ming.lei@redhat.com
    Cc: hch@lst.de
    Cc: bhelgaas@google.com
    Cc: douliyang1@huawei.com
    Link: https://lkml.kernel.org/r/20181204155122.6327-3-douliyangs@gmail.com

    Dou Liyang
     
  • The interrupt affinity management uses straight cpumask pointers to convey
    the automatically assigned affinity masks for managed interrupts. The core
    interrupt descriptor allocation also decides based on the pointer being
    non-NULL whether an interrupt is managed or not.

    Devices which use managed interrupts usually have two classes of
    interrupts:

    - Interrupts for multiple device queues
    - Interrupts for general device management

    Currently both classes are treated the same way, i.e. as managed
    interrupts. The general interrupts get the default affinity mask assigned
    while the device queue interrupts are spread out over the possible CPUs.

    Treating the general interrupts as managed is both a limitation and under
    certain circumstances a bug. Assume the following situation:

    default_irq_affinity = 4..7

    So if CPUs 4-7 are offlined, then the core code will shut down the device
    management interrupts because the last CPU in their affinity mask went
    offline.

    It's also a limitation because it's desired to allow manual placement of
    the general device interrupts for various reasons. If they are marked
    managed then the interrupt affinity setting from both user and kernel space
    is disabled.

    To remedy that situation it's required to convey more information than the
    cpumasks through various interfaces related to interrupt descriptor
    allocation.

    Instead of adding yet another argument, create a new data structure
    'irq_affinity_desc' which for now just contains the cpumask. This struct
    can be expanded to convey auxiliary information in the next step.

    No functional change, just preparatory work.

    [ tglx: Simplified logic and clarified changelog ]

    Suggested-by: Thomas Gleixner
    Suggested-by: Bjorn Helgaas
    Signed-off-by: Dou Liyang
    Signed-off-by: Thomas Gleixner
    Cc: linux-pci@vger.kernel.org
    Cc: kashyap.desai@broadcom.com
    Cc: shivasharan.srikanteshwara@broadcom.com
    Cc: sumit.saxena@broadcom.com
    Cc: ming.lei@redhat.com
    Cc: hch@lst.de
    Cc: douliyang1@huawei.com
    Link: https://lkml.kernel.org/r/20181204155122.6327-2-douliyangs@gmail.com

    Dou Liyang
     
  • Plus other coding style issues which stood out while staring at that code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • …m-platforms into irq/core

    Pull irqchip updates from Marc Zyngier:

    - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
    - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
    - A number of SPDX cleanups
    - A workaround for a very broken GICv3 implementation
    - A platform-msi fix
    - Various cleanups

    Thomas Gleixner
     

18 Dec, 2018

1 commit

  • Go over the IRQ subsystem source code (including irqchip drivers) and
    fix common typos in comments.

    No change in functionality intended.

    Signed-off-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Jason Cooper
    Cc: Marc Zyngier
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: linux-kernel@vger.kernel.org

    Ingo Molnar
     

13 Dec, 2018

1 commit

  • Two threads can try to fire the irq_sim with different offsets and will
    end up fighting for the irq_work assignment. Thomas Gleixner suggested a
    solution based on a bitfield where we set a bit for every offset
    associated with an interrupt that should be fired and then iterate over
    all set bits in the interrupt handler.

    This is a slightly modified solution using a bitmap so that we don't
    impose a limit on the number of interrupts one can allocate with
    irq_sim.
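    The bitmap scheme can be modeled in user space. The kernel version
    uses atomic bitmap ops and an irq_work; this sketch keeps only the
    set-then-iterate structure, with made-up function names:

    ```c
    #include <assert.h>
    #include <limits.h>

    #define MAX_IRQS 64
    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    static unsigned long pending[MAX_IRQS / BITS_PER_LONG];

    /* Producer side: mark an offset pending instead of fighting over a
     * single shared irq_work slot (atomic test_and_set in the kernel). */
    static void irq_sim_fire(unsigned int offset)
    {
        pending[offset / BITS_PER_LONG] |= 1UL << (offset % BITS_PER_LONG);
    }

    /* irq_work side: iterate over all set bits, clearing as we go. */
    static int irq_sim_handle(void (*handler)(unsigned int))
    {
        int fired = 0;

        for (unsigned int off = 0; off < MAX_IRQS; off++) {
            unsigned long bit = 1UL << (off % BITS_PER_LONG);

            if (pending[off / BITS_PER_LONG] & bit) {
                pending[off / BITS_PER_LONG] &= ~bit;
                handler(off);
                fired++;
            }
        }
        return fired;
    }

    static unsigned int last;
    static void record(unsigned int off) { last = off; }

    int main(void)
    {
        irq_sim_fire(3);
        irq_sim_fire(40);   /* two "threads" firing different offsets */
        assert(irq_sim_handle(record) == 2 && last == 40);
        assert(irq_sim_handle(record) == 0);   /* bits were cleared */
        return 0;
    }
    ```
    
    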

    Suggested-by: Thomas Gleixner
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Marc Zyngier

    Bartosz Golaszewski
     

07 Nov, 2018

1 commit

  • On large systems with multiple devices of the same class (e.g. NVMe disks,
    using managed interrupts), the kernel can affinitize these interrupts to a
    small subset of CPUs instead of spreading them out evenly.

    irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
    of possible target CPUs which has the lowest number of interrupt vectors
    allocated.

    This is done by searching the CPU with the highest number of available
    vectors. While this is correct for non-managed interrupts it can select
    the wrong CPU for managed interrupts. Under certain constellations this
    results in affinitizing the managed interrupts of several devices to a
    single CPU in a set.

    The bookkeeping of available vectors works the following way:

    1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

    2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affect the available count.

    The reservation persists up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

    For non-managed interrupts the available count is the correct selection
    criterion because the guaranteed reservations need to be taken into
    account. Using the allocated counter could lead to a failing allocation in
    the following situation (total vector space of 10 assumed):

                    CPU0   CPU1
    available:         2      0
    allocated:         5      3

    allocated:         3      3
    available:         4      4

    allocated:         4      3
    available:         3      4

    allocated:         4      4

    But the allocation of three managed interrupts starting from the same
    point will affinitize all of them to CPU0 because the available count is
    not affected by the allocation (see above). So the end result is:

                    CPU0   CPU1
    available:         5      4
    allocated:         5      3

    Introduce a "managed_allocated" field in struct cpumap to track the vector
    allocation for managed interrupts separately. Use this information to
    select the target CPU when a vector is allocated for a managed interrupt,
    which results in more evenly distributed vector assignments. The above
    example results in the following allocations:

                           CPU0   CPU1
    managed_allocated:        0      0
    allocated:                3      3

    managed_allocated:        1      0
    allocated:                3      4

    managed_allocated:        1      1
    allocated:                4      4

    The allocation of non-managed interrupts is not affected by this change and
    is still evaluating the available count.

    The overall distribution of interrupt vectors for both types of interrupts
    might still not be perfectly even depending on the number of non-managed
    and managed interrupts in a system, but due to the reservation guarantee
    for managed interrupts this cannot be avoided.

    Expose the new field in debugfs as well.
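    The selection change can be sketched as follows. This is a
    user-space model with the matrix reduced to per-CPU counters; the
    function name and struct layout are simplified from the kernel's
    struct cpumap and irq_matrix_alloc_managed():

    ```c
    #include <assert.h>

    #define NR_CPUS 2

    /* Modeled per-CPU matrix bookkeeping. */
    struct cpumap {
        unsigned int available;
        unsigned int allocated;
        unsigned int managed_allocated;   /* the new field */
    };

    /* Pick the target CPU for a managed vector by lowest
     * managed_allocated, instead of highest available (which the
     * reservation scheme never changes on allocation). */
    static int matrix_alloc_managed(struct cpumap *maps, int ncpus)
    {
        int best = 0;

        for (int cpu = 1; cpu < ncpus; cpu++)
            if (maps[cpu].managed_allocated < maps[best].managed_allocated)
                best = cpu;
        maps[best].managed_allocated++;
        maps[best].allocated++;
        return best;
    }

    int main(void)
    {
        struct cpumap maps[NR_CPUS] = { { 5, 3, 0 }, { 4, 3, 0 } };
        int a = matrix_alloc_managed(maps, NR_CPUS);
        int b = matrix_alloc_managed(maps, NR_CPUS);

        assert(a != b);  /* two managed vectors land on different CPUs */
        return 0;
    }
    ```
    
    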

    [ tglx: Clarified the background of the problem in the changelog and
    described it independent of NVME ]

    Signed-off-by: Long Li
    Signed-off-by: Thomas Gleixner
    Cc: Michael Kelley
    Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com

    Long Li
     

05 Nov, 2018

4 commits

  • A driver may have a need to allocate multiple sets of MSI/MSI-X interrupts,
    and have them appropriately affinitized.

    Add support for defining a number of sets in the irq_affinity structure, of
    varying sizes, and get each set affinitized correctly across the machine.

    [ tglx: Minor changelog tweaks ]

    Signed-off-by: Jens Axboe
    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Ming Lei
    Reviewed-by: Keith Busch
    Reviewed-by: Sagi Grimberg
    Cc: linux-block@vger.kernel.org
    Link: https://lkml.kernel.org/r/20181102145951.31979-5-ming.lei@redhat.com

    Jens Axboe
     
  • No functional change.

    Prepares for support of allocating and affinitizing sets of interrupts, in
    which each set of interrupts needs a full two stage spreading. The first
    vector argument is necessary for this so the affinitizing starts from the
    first vector of each set.

    [ tglx: Minor changelog tweaks ]

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Hannes Reinecke
    Cc: Keith Busch
    Cc: Sagi Grimberg
    Link: https://lkml.kernel.org/r/20181102145951.31979-4-ming.lei@redhat.com

    Ming Lei
     
  • No functional change. Prepares for supporting allocating and affinitizing
    interrupt sets.

    [ tglx: Minor changelog tweaks ]

    Signed-off-by: Ming Lei
    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: linux-block@vger.kernel.org
    Cc: Hannes Reinecke
    Cc: Keith Busch
    Cc: Sagi Grimberg
    Link: https://lkml.kernel.org/r/20181102145951.31979-3-ming.lei@redhat.com

    Ming Lei
     
  • If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
    which are allocated for a device, the interrupt affinity spreading code
    fails to spread them across all nodes.

    The reason is that the spreading code starts from node 0 and continues up
    to the number of interrupts requested for allocation. This leaves the nodes
    past the last interrupt unused.

    This results in interrupt concentration on the first nodes which violates
    the assumption of the block layer that all nodes are covered evenly. As a
    consequence the NUMA nodes above the number of interrupts are all assigned
    to hardware queue 0 and therefore NUMA node 0, which results in bad
    performance and has CPU hotplug implications, because queue 0 gets shut
    down when the last CPU of node 0 is offlined.

    Go over all NUMA nodes and assign them round-robin to all requested
    interrupts to solve this.
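    The round-robin assignment amounts to a modulo walk over the nodes;
    the sketch below is a toy model (function name and array mapping are
    made up, only the wrap-around is the point):

    ```c
    #include <assert.h>

    /* Walk all NUMA nodes and assign each to one of the requested
     * interrupts round-robin, so nodes past the interrupt count are
     * no longer left mapped to interrupt 0 by default. */
    static void spread_nodes(int nr_nodes, int nr_irqs, int irq_of_node[])
    {
        for (int n = 0; n < nr_nodes; n++)
            irq_of_node[n] = n % nr_irqs;
    }

    int main(void)
    {
        int irq_of_node[4];

        /* 4 NUMA nodes, only 2 interrupts allocated for the device. */
        spread_nodes(4, 2, irq_of_node);
        assert(irq_of_node[2] == 0);   /* node 2 wraps to interrupt 0 */
        assert(irq_of_node[3] == 1);   /* node 3 wraps to interrupt 1 */
        return 0;
    }
    ```
    
    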

    [ tglx: Massaged changelog ]

    Signed-off-by: Long Li
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ming Lei
    Cc: Michael Kelley
    Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com

    Long Li
     

01 Nov, 2018

1 commit

  • IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
    the size of a long, yielding a byte count. But it is used to size an array
    of longs, which is way more memory than is needed.

    Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
    arrays come out the correct size.
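    The unit mix-up is easy to see side by side. IRQ_MATRIX_BITS is set
    to 256 purely for illustration; BITS_TO_LONGS follows the kernel's
    definition:

    ```c
    #include <assert.h>
    #include <limits.h>

    #define IRQ_MATRIX_BITS 256   /* illustrative bitmap size */
    #define BITS_TO_LONGS(n) (((n) + (sizeof(long) * CHAR_BIT) - 1) / \
                              (sizeof(long) * CHAR_BIT))

    /* Buggy: a byte count used where an element count was needed. */
    #define IRQ_MATRIX_SIZE_OLD (BITS_TO_LONGS(IRQ_MATRIX_BITS) * sizeof(unsigned long))
    /* Fixed: just the number of longs the bitmap needs. */
    #define IRQ_MATRIX_SIZE     (BITS_TO_LONGS(IRQ_MATRIX_BITS))

    int main(void)
    {
        unsigned long map[IRQ_MATRIX_SIZE];   /* correctly sized array */

        (void)map;
        /* The old constant over-allocated by a factor of sizeof(long). */
        assert(IRQ_MATRIX_SIZE_OLD == IRQ_MATRIX_SIZE * sizeof(unsigned long));
        return 0;
    }
    ```
    
    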

    Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator")
    Signed-off-by: Michael Kelley
    Signed-off-by: Thomas Gleixner
    Cc: KY Srinivasan
    Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com

    Michael Kelley
     

26 Oct, 2018

1 commit

  • Pull irq updates from Thomas Gleixner:
    "The interrupt brigade came up with the following updates:

    - Driver for the Marvell System Error Interrupt machinery

    - Overhaul of the GIC-V3 ITS driver

    - Small updates and fixes all over the place"

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
    genirq: Fix race on spurious interrupt detection
    softirq: Fix typo in __do_softirq() comments
    genirq: Fix grammar s/an /a /
    irqchip/gic: Unify GIC priority definitions
    irqchip/gic-v3: Remove acknowledge loop
    dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller
    dt-bindings/interrupt-controller: Update Marvell ICU bindings
    irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI)
    arm64: marvell: Enable SEI driver
    irqchip/irq-mvebu-sei: Add new driver for Marvell SEI
    irqchip/irq-mvebu-icu: Support ICU subnodes
    irqchip/irq-mvebu-icu: Disociate ICU and NSR
    irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts
    irqchip/irq-mvebu-icu: Fix wrong private data retrieval
    dt-bindings/interrupt-controller: Fix Marvell ICU length in the example
    genirq/msi: Allow creation of a tree-based irqdomain for platform-msi
    dt-bindings: irqchip: renesas-irqc: Document r8a7744 support
    dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support
    irqchip/pdc: Setup all edge interrupts as rising edge at GIC
    irqchip/gic-v3-its: Allow use of LPI tables in reserved memory
    ...

    Linus Torvalds