11 Jul, 2018

1 commit

  • …SK_OFFSTACK=y kernels(v1)

    commit 10d94ff4d558b96bfc4f55bb0051ae4d938246fe upstream.

    When the irqaffinity= kernel parameter is passed in a CPUMASK_OFFSTACK=y
    kernel, it fails to boot, because zalloc_cpumask_var() cannot be used before
    initializing the slab allocator to allocate a cpumask.

    So, use alloc_bootmem_cpumask_var() instead.

    Also do some cleanups while at it: in init_irq_default_affinity() remove
    an #ifdef via using cpumask_available().

    Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/20171026045800.27087-1-rakib.mullick@gmail.com
    Link: http://lkml.kernel.org/r/20171101041451.12581-1-rakib.mullick@gmail.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Janne Huttunen <janne.huttunen@nokia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    Rakib Mullick
     

07 Sep, 2017

1 commit

  • for_each_active_irq() iterates the sparse irq allocation bitmap. The caller
    must hold sparse_irq_lock. Several code paths expect that an active bit in
    the sparse bitmap also has a valid interrupt descriptor.

    Unfortunately that's not true. The (de)allocation is a two step process,
    which holds the sparse_irq_lock only across the queue/remove from the radix
    tree and the set/clear in the allocation bitmap.

    If an iteration locks sparse_irq_lock between the two steps, then it might
    see an active bit but the corresponding irq descriptor is NULL. If that is
    dereferenced unconditionally, then the kernel oopses. Of course, all
    iterator sites could be audited and fixed, but....

    There is no reason why the sparse_irq_lock needs to be dropped between the
    two steps. In fact, the code becomes simpler when the mutex is held across
    both, and the semantics become more straightforward: future problems of
    missing NULL pointer checks in the iteration are avoided and all existing
    sites are fixed in one go.

    Expand the lock held sections so both operations are covered and the bitmap
    and the radix tree are in sync.

    Fixes: a05a900a51c7 ("genirq: Make sparse_lock a mutex")
    Reported-and-tested-by: Huang Ying
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
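
The locking change described in the commit above can be sketched as a small userspace model. This is only an illustration under stated assumptions: every `model_*` name is hypothetical, the arrays stand in for the allocation bitmap and the radix tree, and a plain flag stands in for sparse_irq_lock (no real mutex or concurrency is involved). The point is that both steps sit inside one locked section, so an iterator that takes the lock never observes an active bit without a descriptor.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_IRQS 8

struct model_desc {
    unsigned int irq;
};

/* Hypothetical stand-ins for sparse_irq_lock, the bitmap and the radix tree. */
static bool model_lock_held;
static unsigned char model_bitmap[MODEL_NR_IRQS];
static struct model_desc *model_tree[MODEL_NR_IRQS];
static struct model_desc model_storage[MODEL_NR_IRQS];

static void model_lock(void)   { model_lock_held = true; }
static void model_unlock(void) { model_lock_held = false; }

/* Allocate: descriptor insert and bitmap set happen in ONE locked section. */
static int model_alloc_irq(unsigned int irq)
{
    int ret = -1;

    if (irq >= MODEL_NR_IRQS)
        return -1;

    model_lock();
    if (!model_bitmap[irq]) {
        model_storage[irq].irq = irq;
        model_tree[irq] = &model_storage[irq];
        model_bitmap[irq] = 1;
        ret = 0;
    }
    model_unlock();
    return ret;
}

/* Free: descriptor removal and bitmap clear are likewise not split. */
static void model_free_irq(unsigned int irq)
{
    if (irq >= MODEL_NR_IRQS)
        return;

    model_lock();
    model_tree[irq] = NULL;
    model_bitmap[irq] = 0;
    model_unlock();
}

/* Iterate active bits under the lock and count bits without a descriptor. */
static unsigned int model_count_stale_bits(void)
{
    unsigned int irq, stale = 0;

    model_lock();
    for (irq = 0; irq < MODEL_NR_IRQS; irq++)
        if (model_bitmap[irq] && !model_tree[irq])
            stale++;
    model_unlock();
    return stale;
}
```

In this model `model_count_stale_bits()` always returns 0, because no locked section ever ends with the bitmap and the descriptor table out of sync.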
     

10 Jul, 2017

1 commit

  • Pull irq fixes from Thomas Gleixner:

    - A few fixes mopping up the fallout of the big irq overhaul

    - Move the interrupt resource management logic out of the spin locked,
    irq disabled region to avoid unnecessary restrictions of the resource
    callbacks

    - Preparation for reworking the per cpu irq request function.

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqdomain: Allow ACPI device nodes to be used as irqdomain identifiers
    genirq/debugfs: Remove redundant NULL pointer check
    genirq: Allow to pass the IRQF_TIMER flag with percpu irq request
    genirq/timings: Move free timings out of spinlocked region
    genirq: Move irq resource handling out of spinlocked region
    genirq: Add mutex to irq desc to serialize request/free_irq()
    genirq: Move bus locking into __setup_irq()
    genirq: Force inlining of __irq_startup_managed to prevent build failure
    genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN

    Linus Torvalds
     

04 Jul, 2017

2 commits

  • The irq_request/release_resources() callbacks are currently invoked under
    desc->lock with interrupts disabled. This is a source of problems on RT and
    conceptually not required.

    Add a separate mutex to struct irq_desc which allows to serialize
    request/free_irq(), which can be used to move the resource functions out of
    the desc->lock held region.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Marc Zyngier
    Cc: Heiko Stuebner
    Cc: Julia Cartwright
    Cc: Linus Walleij
    Cc: Brian Norris
    Cc: Doug Anderson
    Cc: linux-rockchip@lists.infradead.org
    Cc: John Keeping
    Cc: linux-gpio@vger.kernel.org
    Link: http://lkml.kernel.org/r/20170629214344.039220922@linutronix.de

    Thomas Gleixner
     
  • Pull documentation updates from Jonathan Corbet:
    "There has been a fair amount of activity in the docs tree this time
    around. Highlights include:

    - Conversion of a bunch of security documentation into RST

    - The conversion of the remaining DocBook templates by The Amazing
    Mauro Machine. We can now drop the entire DocBook build chain.

    - The usual collection of fixes and minor updates"

    * tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
    scripts/kernel-doc: handle DECLARE_HASHTABLE
    Documentation: atomic_ops.txt is core-api/atomic_ops.rst
    Docs: clean up some DocBook loose ends
    Make the main documentation title less Geocities
    Docs: Use kernel-figure in vidioc-g-selection.rst
    Docs: fix table problems in ras.rst
    Docs: Fix breakage with Sphinx 1.5 and upper
    Docs: Include the Latex "ifthen" package
    doc/kokr/howto: Only send regression fixes after -rc1
    docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
    doc: Document suitability of IBM Verse for kernel development
    Doc: fix a markup error in coding-style.rst
    docs: driver-api: i2c: remove some outdated information
    Documentation: DMA API: fix a typo in a function name
    Docs: Insert missing space to separate link from text
    doc/ko_KR/memory-barriers: Update control-dependencies example
    Documentation, kbuild: fix typo "minimun" -> "minimum"
    docs: Fix some formatting issues in request-key.rst
    doc: ReSTify keys-trusted-encrypted.txt
    doc: ReSTify keys-request-key.txt
    ...

    Linus Torvalds
     

26 Jun, 2017

1 commit

  • The irq default state is set to disabled when allocating irq desc, but the
    masked state flag is not set. This is inconsistent vs. the state tracking
    logic which is used to prevent unnecessary calls to hardware level irq chip
    functions.

    Set the masked state flag as well.

    Signed-off-by: Jeffy Chen
    Signed-off-by: Thomas Gleixner
    Cc: tfiga@chromium.org
    Cc: briannorris@chromium.org
    Cc: dianders@chromium.org
    Link: http://lkml.kernel.org/r/1498476814-12563-1-git-send-email-jeffy.chen@rock-chips.com

    Jeffy Chen
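
A minimal sketch of why the masked flag matters for state tracking, assuming a simplified model: the `MODEL_*` flag names echo the kernel's IRQD_IRQ_DISABLED and IRQD_IRQ_MASKED, but the bit values and all `model_*` helpers are assumptions made for illustration.

```c
/* Illustrative flag bits; names echo the kernel's, values are assumptions. */
#define MODEL_IRQD_IRQ_DISABLED (1u << 16)
#define MODEL_IRQD_IRQ_MASKED   (1u << 17)

struct model_irq_data {
    unsigned int state;
    unsigned int chip_mask_calls;   /* counts hardware-level mask calls */
};

/* Default state after allocation: disabled AND masked, kept consistent. */
static void model_desc_set_defaults(struct model_irq_data *d)
{
    d->state = MODEL_IRQD_IRQ_DISABLED | MODEL_IRQD_IRQ_MASKED;
    d->chip_mask_calls = 0;
}

/* State tracking: skip the chip callback if the line is already masked. */
static void model_mask_irq(struct model_irq_data *d)
{
    if (d->state & MODEL_IRQD_IRQ_MASKED)
        return;
    d->chip_mask_calls++;
    d->state |= MODEL_IRQD_IRQ_MASKED;
}
```

With the masked flag set at allocation time, a later `model_mask_irq()` on a freshly allocated descriptor does not reach the chip at all.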
     

23 Jun, 2017

3 commits

  • There is currently no way to evaluate the effective affinity mask of a
    given interrupt. Many irq chips allow only a single target CPU or a subset
    of CPUs in the affinity mask.

    Updating the mask at the time of setting the affinity to the subset would
    be counterproductive because information for cpu hotplug about assigned
    interrupt affinities gets lost. On CPU hotplug it's also pointless to force
    migrate an interrupt, which is not targeted at the CPU effectively. But
    currently the information is not available.

    Provide a separate mask to be updated by the irq_chip->irq_set_affinity()
    implementations. Implement the read only proc files so the user can see the
    effective mask as well w/o trying to deduce it from /proc/interrupts.

    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: Marc Zyngier
    Cc: Michael Ellerman
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de

    Thomas Gleixner
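
The idea of keeping the requested mask and the effective mask apart can be sketched as follows. This is a userspace model, not kernel code: masks are plain 64-bit integers, all `model_*` names are hypothetical, and the chip is assumed to support only a single target CPU, for which the lowest set bit is picked.

```c
#include <stdint.h>

struct model_irq {
    uint64_t affinity;            /* mask requested by the user or core code */
    uint64_t effective_affinity;  /* mask the chip actually programmed */
};

/* Hypothetical single-target chip: it picks the lowest CPU in the mask. */
static int model_chip_set_affinity(struct model_irq *irq, uint64_t mask)
{
    if (mask == 0)
        return -1;

    irq->affinity = mask;                           /* request is preserved */
    irq->effective_affinity = mask & (~mask + 1);   /* isolate lowest bit */
    return 0;
}

/* On CPU hotplug only an effective target CPU requires a forced migration. */
static int model_needs_migration(const struct model_irq *irq, unsigned int cpu)
{
    return cpu < 64 && ((irq->effective_affinity >> cpu) & 1u) != 0;
}
```

Requesting CPUs 2 and 3 (`0x0c`) leaves the requested mask intact while the effective mask holds only CPU 2, so unplugging CPU 3 would not need a migration in this model.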
     
  • All callers hand in GFP_KERNEL. No point to have an extra argument for
    that.

    Signed-off-by: Thomas Gleixner
    Cc: Jens Axboe
    Cc: Marc Zyngier
    Cc: Michael Ellerman
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20170619235446.082544752@linutronix.de

    Thomas Gleixner
     
  • Debugging (hierarchical) interrupt domains is tedious as there is no
    information about the hierarchy and no information about states of
    interrupts in the various domain levels.

    Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'.

    The domains directory contains the domain files. The content is information
    about the domain. If the domain is part of a hierarchy then the parent
    domains are printed as well.

    # ls /sys/kernel/debug/irq/domains/
    default INTEL-IR-2 INTEL-IR-MSI-2 IO-APIC-IR-2 PCI-MSI
    DMAR-MSI INTEL-IR-3 INTEL-IR-MSI-3 IO-APIC-IR-3 unknown-1
    INTEL-IR-0 INTEL-IR-MSI-0 IO-APIC-IR-0 IO-APIC-IR-4 VECTOR
    INTEL-IR-1 INTEL-IR-MSI-1 IO-APIC-IR-1 PCI-HT

    # cat /sys/kernel/debug/irq/domains/VECTOR
    name: VECTOR
    size: 0
    mapped: 216
    flags: 0x00000041

    # cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0
    name: IO-APIC-IR-0
    size: 24
    mapped: 19
    flags: 0x00000041
    parent: INTEL-IR-3
        name: INTEL-IR-3
        size: 65536
        mapped: 167
        flags: 0x00000041
        parent: VECTOR
            name: VECTOR
            size: 0
            mapped: 216
            flags: 0x00000041

    Unfortunately there is no per cpu information about the VECTOR domain (yet).

    The irqs directory contains detailed information about mapped interrupts.

    # cat /sys/kernel/debug/irq/irqs/3
    handler: handle_edge_irq
    status: 0x00004000
    istate: 0x00000000
    ddepth: 1
    wdepth: 0
    dstate: 0x01018000
        IRQD_IRQ_DISABLED
        IRQD_SINGLE_TARGET
        IRQD_MOVE_PCNTXT
    node: 0
    affinity: 0-143
    effectiv: 0
    pending:
    domain: IO-APIC-IR-0
    hwirq: 0x3
    chip: IR-IO-APIC
    flags: 0x10
        IRQCHIP_SKIP_SET_WAKE
    parent:
        domain: INTEL-IR-3
        hwirq: 0x20000
        chip: INTEL-IR
        flags: 0x0
        parent:
            domain: VECTOR
            hwirq: 0x3
            chip: APIC
            flags: 0x0

    This was developed to simplify the debugging of the managed affinity
    changes.

    Signed-off-by: Thomas Gleixner
    Acked-by: Marc Zyngier
    Cc: Jens Axboe
    Cc: Michael Ellerman
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20170619235444.537566163@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

26 May, 2017

1 commit

  • The printk in early_irq_init() is cryptic and badly formatted:

    NR_IRQS:33024 nr_irqs:968 16

    The last number is the number of preallocated interrupts, so add a prefix
    to it:

    NR_IRQS: 33024, nr_irqs: 968, preallocated irqs: 16

    Cleanup the formatting for better readability as well.

    Signed-off-by: Vincent Legoll
    Signed-off-by: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1494318849-6733-1-git-send-email-vincent.legoll@gmail.com

    Vincent Legoll
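
The reworked message format from the commit above can be reproduced in userspace with `snprintf()`. The helper name `model_format_irq_banner` and its parameters are hypothetical; only the output layout is taken from the changelog.

```c
#include <stdio.h>

/* Format the boot banner with a label in front of every number. */
static int model_format_irq_banner(char *buf, size_t len, int nr_irqs_max,
                                   int nr_irqs, int preallocated)
{
    return snprintf(buf, len,
                    "NR_IRQS: %d, nr_irqs: %d, preallocated irqs: %d",
                    nr_irqs_max, nr_irqs, preallocated);
}
```

Called with the values 33024, 968 and 16 from the changelog, the buffer holds the labelled line quoted there.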
     

16 May, 2017

1 commit


16 Sep, 2016

1 commit


15 Sep, 2016

1 commit

  • Switch MSI over to the new spreading code. If a pci device contains a valid
    pointer to a cpumask, then this mask is used for spreading otherwise the
    online cpu mask is used. This allows a driver to restrict the spread to a
    subset of CPUs, e.g. cpus on a particular node.

    Signed-off-by: Thomas Gleixner
    Cc: Christoph Hellwig
    Cc: axboe@fb.com
    Cc: keith.busch@intel.com
    Cc: agordeev@redhat.com
    Cc: linux-block@vger.kernel.org
    Link: http://lkml.kernel.org/r/1473862739-15032-4-git-send-email-hch@lst.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

14 Sep, 2016

1 commit

  • Information about interrupts is exposed via /proc/interrupts, but the
    format of that file has changed over kernel versions and differs across
    architectures. It also has varying column numbers depending on hardware.

    That all makes it hard for tools to parse.

    To solve this, expose the information through sysfs so each irq attribute
    is in a separate file in a consistent, machine parsable way.

    This feature is only available when both CONFIG_SPARSE_IRQ and
    CONFIG_SYSFS are enabled.

    Examples:
    /sys/kernel/irq/18/actions: i801_smbus,ehci_hcd:usb1,uhci_hcd:usb7
    /sys/kernel/irq/18/chip_name: IR-IO-APIC
    /sys/kernel/irq/18/hwirq: 18
    /sys/kernel/irq/18/name: fasteoi
    /sys/kernel/irq/18/per_cpu_count: 0,0
    /sys/kernel/irq/18/type: level

    /sys/kernel/irq/25/actions: ahci0
    /sys/kernel/irq/25/chip_name: IR-PCI-MSI
    /sys/kernel/irq/25/hwirq: 512000
    /sys/kernel/irq/25/name: edge
    /sys/kernel/irq/25/per_cpu_count: 29036,0
    /sys/kernel/irq/25/type: edge

    [ tglx: Moved kobject_del() under sparse_irq_lock, massaged code comments
    and changelog ]

    Signed-off-by: Craig Gallek
    Cc: David Decotigny
    Link: http://lkml.kernel.org/r/1473783291-122873-1-git-send-email-kraigatgoog@gmail.com
    Signed-off-by: Thomas Gleixner

    Craig Gallek
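
Because each sysfs attribute is a single file with one value format, a tool needs very little parsing code. The sketch below handles the comma-separated `per_cpu_count` content shown in the examples above; the function name is hypothetical and the input format is assumed to be exactly as in those examples (decimal numbers separated by commas, optional trailing newline).

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a string such as "29036,0" into counts[].
 * Returns the number of values parsed, or -1 on malformed input or
 * when more than max values are present.
 */
static int model_parse_per_cpu_count(const char *s, unsigned long *counts,
                                     size_t max)
{
    size_t n = 0;

    while (*s != '\0' && *s != '\n') {
        char *end;

        if (n == max || *s < '0' || *s > '9')
            return -1;

        counts[n++] = strtoul(s, &end, 10);

        if (*end == ',') {
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;      /* reject a trailing or doubled comma */
        } else {
            s = end;            /* '\0', '\n' or garbage caught above */
        }
    }
    return (int)n;
}
```

A caller would read the attribute file into a buffer first, then hand the buffer to this function; one array slot per CPU is filled in.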
     

04 Jul, 2016

2 commits

  • Use the affinity hint in the irqdesc allocator. The hint is used to determine
    the node for the allocation and to set the affinity of the interrupt.

    If multiple interrupts are allocated (multi-MSI) then the allocator iterates
    over the cpumask and for each set cpu it allocates on their node and sets the
    initial affinity to that cpu.

    If a single interrupt is allocated (MSI-X) then the allocator uses the first
    cpu in the mask to compute the allocation node and uses the mask for the
    initial affinity setting.

    Interrupts set up this way are marked with the AFFINITY_MANAGED flag to
    prevent userspace from messing with their affinity settings.

    Signed-off-by: Thomas Gleixner
    Cc: Christoph Hellwig
    Cc: linux-block@vger.kernel.org
    Cc: linux-pci@vger.kernel.org
    Cc: linux-nvme@lists.infradead.org
    Cc: axboe@fb.com
    Cc: agordeev@redhat.com
    Link: http://lkml.kernel.org/r/1467621574-8277-5-git-send-email-hch@lst.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
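
The multi-interrupt case described above (walk the cpumask, allocate on the node of each set CPU, and use that CPU as initial affinity) can be modelled in a few lines. Everything here is an assumption for illustration: the mask is a 64-bit integer, the `model_*` names are hypothetical, and the topology is a made-up one with two CPUs per node.

```c
#include <stdint.h>

/* Hypothetical topology: two CPUs per NUMA node. */
static int model_cpu_to_node(unsigned int cpu)
{
    return (int)(cpu / 2);
}

struct model_irq_alloc {
    int cpu;    /* initial affinity of this interrupt */
    int node;   /* node used for the descriptor allocation */
};

/*
 * Walk the mask and give each of up to cnt interrupts the next set CPU.
 * Returns how many interrupts received a target.
 */
static unsigned int model_spread(uint64_t mask, struct model_irq_alloc *out,
                                 unsigned int cnt)
{
    unsigned int cpu, n = 0;

    for (cpu = 0; cpu < 64 && n < cnt; cpu++) {
        if (!((mask >> cpu) & 1u))
            continue;
        out[n].cpu = (int)cpu;
        out[n].node = model_cpu_to_node(cpu);
        n++;
    }
    return n;
}
```

For a mask with CPUs 2, 3 and 5 set, three interrupts end up on nodes 1, 1 and 2 of the assumed topology.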
     
  • Add an extra argument to the irq(domain) allocation functions, so we can hand
    down affinity hints to the allocator. That's necessary to implement proper
    support for multiqueue devices.

    Signed-off-by: Thomas Gleixner
    Cc: Christoph Hellwig
    Cc: linux-block@vger.kernel.org
    Cc: linux-pci@vger.kernel.org
    Cc: linux-nvme@lists.infradead.org
    Cc: axboe@fb.com
    Cc: agordeev@redhat.com
    Link: http://lkml.kernel.org/r/1467621574-8277-4-git-send-email-hch@lst.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

02 May, 2016

1 commit

  • In order to prepare the genirq layer for the concept of partitioned
    percpu interrupts, let's allow an affinity to be associated with
    such an interrupt. We introduce:

    - irq_set_percpu_devid_partition: flag an interrupt as a percpu-devid
    interrupt, and associate it with an affinity
    - irq_get_percpu_devid_partition: allow the affinity of that interrupt
    to be retrieved.

    This will allow a driver to discover which CPUs the per-cpu interrupt
    can actually fire on.

    Signed-off-by: Marc Zyngier
    Cc: Mark Rutland
    Cc: devicetree@vger.kernel.org
    Cc: Jason Cooper
    Cc: Will Deacon
    Cc: Rob Herring
    Link: http://lkml.kernel.org/r/1460365075-7316-3-git-send-email-marc.zyngier@arm.com
    Signed-off-by: Thomas Gleixner

    Marc Zyngier
     

08 Feb, 2016

1 commit

  • If we isolate CPUs, then we don't want random device interrupts on them. Even
    w/o the user space irq balancer enabled we can end up with irqs on non boot
    cpus and chasing newly requested interrupts is a tedious task.

    Allow to restrict the default irq affinity mask.

    Signed-off-by: Thomas Gleixner
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1602031948190.25254@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

14 Dec, 2015

1 commit

  • The new VMD device driver needs to iterate over a list of
    "demultiplexing" interrupts. Protecting that list with a lock is not
    possible because the list is also required in code paths which hold
    irq descriptor lock. Therefore the demultiplexing interrupt handler
    would create a lock inversion scenario if it calls a demux handler
    with the list protection lock held.

    A solution for this is to free the irq descriptor via RCU, so the
    list can be walked with rcu read lock held.

    Signed-off-by: Thomas Gleixner
    Cc: Keith Busch

    Thomas Gleixner
     

16 Sep, 2015

5 commits

  • Most interrupt flow handlers do not use the irq argument. Those few
    which use it can retrieve the irq number from the irq descriptor.

    Remove the argument.

    Search and replace was done with coccinelle and some extra helper
    scripts around it. Thanks to Julia for her help!

    Signed-off-by: Thomas Gleixner
    Cc: Julia Lawall
    Cc: Jiang Liu

    Thomas Gleixner
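
The resulting handler shape can be sketched with a reduced descriptor. The struct and all `model_*` names are hypothetical; the accessor plays the role that `irq_desc_get_irq()` has in the kernel, and the handler type takes only the descriptor.

```c
struct model_irq_desc {
    unsigned int irq;       /* Linux interrupt number */
    unsigned int handled;   /* how often the flow handler ran */
};

/* Flow handlers receive only the descriptor. */
typedef void (*model_flow_handler_t)(struct model_irq_desc *desc);

/* The few handlers that need the number read it from the descriptor. */
static unsigned int model_desc_get_irq(const struct model_irq_desc *desc)
{
    return desc->irq;
}

static unsigned int model_last_irq;

static void model_handle_simple(struct model_irq_desc *desc)
{
    desc->handled++;
    model_last_irq = model_desc_get_irq(desc);
}

/* The core dispatches without passing a separate irq argument. */
static void model_dispatch(struct model_irq_desc *desc,
                           model_flow_handler_t handler)
{
    handler(desc);
}
```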
     
  • MSI descriptors are per-irq instead of per irqchip, so move it into
    struct irq_common_data.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Cc: Marc Zyngier
    Link: http://lkml.kernel.org/r/1433145945-789-35-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     
  • Irq affinity mask is per-irq instead of per irqchip, so move it into
    struct irq_common_data.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Link: http://lkml.kernel.org/r/1433303281-27688-1-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     
  • Handler data (handler_data) is per-irq instead of per irqchip, so move
    it into struct irq_common_data.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Cc: Marc Zyngier
    Link: http://lkml.kernel.org/r/1433145945-789-13-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     
  • NUMA node information is per-irq instead of per-irqchip, so move it into
    struct irq_common_data. Also use CONFIG_NUMA to guard irq_common_data.node.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Link: http://lkml.kernel.org/r/1433145945-789-8-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     

12 Jul, 2015

1 commit

  • The first parameter 'irq' is never used by
    kstat_incr_irqs_this_cpu(). Remove it.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Link: http://lkml.kernel.org/r/1433391238-19471-16-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     

21 Jun, 2015

1 commit


12 Jun, 2015

2 commits

  • Introduce helper function irq_data_get_node() and variants thereof to
    hide struct irq_data implementation details.

    Convert the core code to use them.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Link: http://lkml.kernel.org/r/1433145945-789-5-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
     
  • With the introduction of hierarchy irqdomain, struct irq_data becomes
    per-chip instead of per-irq and there may be multiple irq_datas
    associated with the same irq. Some per-irq data stored in struct
    irq_data now may get duplicated into multiple irq_datas, and causes
    inconsistent view.

    So introduce struct irq_common_data to host per-irq common data and to
    achieve consistent view among irq_chips.

    Signed-off-by: Jiang Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Tony Luck
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Randy Dunlap
    Cc: Yinghai Lu
    Cc: Borislav Petkov
    Cc: Jason Cooper
    Cc: Kevin Cernekee
    Cc: Arnd Bergmann
    Cc: Marc Zyngier
    Link: http://lkml.kernel.org/r/1433145945-789-4-git-send-email-jiang.liu@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Jiang Liu
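
A reduced model of the data layout described above: several per-chip `irq_data` instances in a hierarchy point at one shared block of per-irq data, so they cannot disagree about it. The struct layout and the `model_*` names are assumptions; the accessor mirrors the role of the `irq_data_get_node()` helper mentioned in the previous commit.

```c
/* Per-irq data shared by all levels of the hierarchy. */
struct model_common_data {
    int node;
};

/* Per-chip data: one instance per level, all pointing at the same common. */
struct model_irq_data {
    unsigned long hwirq;
    struct model_irq_data *parent_data;
    struct model_common_data *common;
};

/* Accessor hiding where the node information is actually stored. */
static int model_irq_data_get_node(const struct model_irq_data *d)
{
    return d->common->node;
}
```

Updating the node once in the common block is visible through every level of the hierarchy, which is the consistency property the commit is after.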
     

05 May, 2015

2 commits

  • The return type of kstat_irqs_usr() is unsigned int and kstat_irqs() also
    returns unsigned int so sum should be unsigned int here as well.

    Signed-off-by: Nicholas Mc Guire
    Link: http://lkml.kernel.org/r/1430642951-23964-1-git-send-email-hofrat@osadl.org
    Signed-off-by: Thomas Gleixner

    Nicholas Mc Guire
     
  • kstat_irqs is unsigned int and the return type of kstat_irqs() is also
    unsigned int so sum should be unsigned int as well even if the result
    is correct due to automatic type conversion.

    Signed-off-by: Nicholas Mc Guire
    Link: http://lkml.kernel.org/r/1430642930-23929-1-git-send-email-hofrat@osadl.org
    Signed-off-by: Thomas Gleixner

    Nicholas Mc Guire
     

13 Dec, 2014

1 commit

  • Since the rework of the sparse interrupt code to actually free the
    unused interrupt descriptors there exists a race between the /proc
    interfaces to the irq subsystem and the code which frees the interrupt
    descriptor.

    CPU0                          CPU1
                                  show_interrupts()
                                    desc = irq_to_desc(X);
    free_desc(desc)
      remove_from_radix_tree();
      kfree(desc);
                                    raw_spinlock_irq(&desc->lock);

    /proc/interrupts is the only interface which can actively corrupt
    kernel memory via the lock access. /proc/stat can only read from freed
    memory. Extremely hard to trigger, but possible.

    The interfaces in /proc/irq/N/ are not affected by this because the
    removal of the proc file is serialized in procfs against concurrent
    readers/writers. The removal happens before the descriptor is freed.

    For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
    as the descriptor is never freed. It's merely cleared out with the irq
    descriptor lock held. So any concurrent proc access will either see
    the old correct value or the cleared out ones.

    Protect the lookup and access to the irq descriptor in
    show_interrupts() with the sparse_irq_lock.

    Provide kstat_irqs_usr() which is protecting the lookup and access
    with sparse_irq_lock and switch /proc/stat to use it.

    Document the existing kstat_irqs interfaces so it's clear that the
    caller needs to take care about protection. The users of these
    interfaces are either not affected due to SPARSE_IRQ=n or already
    protected against removal.

    Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
     

03 Sep, 2014

1 commit

  • Calling irq_find_mapping from outside an irq_{enter,exit} section is
    unsafe and produces ugly messages if CONFIG_PROVE_RCU is enabled.
    If coming from the idle state, the rcu_read_lock call in irq_find_mapping
    will generate an unpleasant warning:

    ===============================
    [ INFO: suspicious RCU usage. ]
    3.16.0-rc1+ #135 Not tainted
    -------------------------------
    include/linux/rcupdate.h:871 rcu_read_lock() used illegally while idle!

    other info that might help us debug this:

    RCU used illegally from idle CPU!
    rcu_scheduler_active = 1, debug_locks = 0
    RCU used illegally from extended quiescent state!
    1 lock held by swapper/0/0:
    #0: (rcu_read_lock){......}, at: []
    irq_find_mapping+0x4c/0x198

    As this issue is fairly widespread and involves at least three
    different architectures, a possible solution is to add a new
    handle_domain_irq entry point into the generic IRQ code that
    the interrupt controller code can call.

    This new function takes an irq_domain, and calls into irq_find_mapping
    inside the irq_{enter,exit} block. An additional "lookup" parameter is
    used to allow non-domain architecture code to be replaced by this as well.

    Interrupt controllers can then be updated to use the new mechanism.

    This code is sitting behind a new CONFIG_HANDLE_DOMAIN_IRQ, as not all
    architectures implement set_irq_regs (yes, mn10300, I'm looking at you...).

    Reported-by: Vladimir Murzin
    Signed-off-by: Marc Zyngier
    Link: https://lkml.kernel.org/r/1409047421-27649-2-git-send-email-marc.zyngier@arm.com
    Signed-off-by: Jason Cooper

    Marc Zyngier
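
The ordering that the new entry point enforces can be sketched with a call trace. This is a userspace model with hypothetical `model_*` names, a made-up mapping (hwirq N maps to irq N + 32, with 0 meaning "no mapping") and no real RCU; it only shows that the lookup happens between the enter and exit calls, and that the "lookup" parameter lets callers without a domain skip the translation.

```c
#include <stddef.h>

#define MODEL_TRACE_MAX 8

static const char *model_trace[MODEL_TRACE_MAX];
static size_t model_trace_len;

static void model_note(const char *what)
{
    if (model_trace_len < MODEL_TRACE_MAX)
        model_trace[model_trace_len++] = what;
}

static void model_irq_enter(void) { model_note("irq_enter"); }
static void model_irq_exit(void)  { model_note("irq_exit"); }

/* Hypothetical domain: hwirq N maps to irq N + 32; 0 means unmapped. */
static unsigned int model_find_mapping(unsigned int hwirq)
{
    model_note("find_mapping");
    return hwirq < 16 ? hwirq + 32 : 0;
}

static void model_generic_handle_irq(unsigned int irq)
{
    (void)irq;
    model_note("handle");
}

/* Lookup and handling both run inside the enter/exit section. */
static int model_handle_domain_irq(unsigned int hwirq, int lookup)
{
    unsigned int irq = hwirq;
    int ret = 0;

    model_irq_enter();
    if (lookup)
        irq = model_find_mapping(hwirq);
    if (irq == 0)
        ret = -1;                       /* nothing mapped: report an error */
    else
        model_generic_handle_irq(irq);
    model_irq_exit();
    return ret;
}
```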
     

06 Jul, 2014

1 commit

  • irq_free_hwirqs() always calls irq_free_descs() with a cnt == 0
    which makes it a no-op since the interrupt count to free is
    decremented in itself.

    Fixes: 7b6ef1262549f6afc5c881aaef80beb8fd15f908

    Signed-off-by: Keith Busch
    Acked-by: David Rientjes
    Link: http://lkml.kernel.org/r/1404167084-8070-1-git-send-email-keith.busch@intel.com
    Signed-off-by: Thomas Gleixner

    Keith Busch
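
The class of bug described above is easy to reproduce in isolation. The sketch below is a reconstruction from the changelog text, not the kernel source: the `model_*` names are hypothetical and the exact shape of the buggy loop is an assumption. It contrasts a loop that counts its own argument down to zero before using it with a loop that keeps a separate counter.

```c
static unsigned int model_freed;

/* Stand-in for the descriptor free routine: records how many were freed. */
static void model_free_descs(unsigned int from, unsigned int cnt)
{
    (void)from;
    model_freed += cnt;
}

static void model_teardown_hwirq(unsigned int irq)
{
    (void)irq;
}

/* Buggy shape (assumed): cnt reaches 0 in the loop, so nothing is freed. */
static void model_free_hwirqs_buggy(unsigned int from, int cnt)
{
    unsigned int i;

    for (i = from; cnt > 0; i++, cnt--)
        model_teardown_hwirq(i);
    model_free_descs(from, (unsigned int)cnt);  /* cnt is 0 here: no-op */
}

/* Fixed shape: a separate loop counter leaves the free count intact. */
static void model_free_hwirqs_fixed(unsigned int from, int cnt)
{
    unsigned int i;
    int j;

    for (i = from, j = cnt; j > 0; i++, j--) {
        model_free_descs(i, 1);
        model_teardown_hwirq(i);
    }
}
```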
     

16 May, 2014

5 commits

  • No more users. Get rid of the cruft.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Grant Likely
    Tested-by: Tony Luck
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140507154341.012847637@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Create a new interface and confine it with a config switch which makes
    clear that this is just legacy support and not to be used for new code.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Grant Likely
    Tested-by: Tony Luck
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140507154340.574437049@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • No more users. And it's not going to come back. If you need
    hotpluggable irq chips, use irq domains.

    Signed-off-by: Thomas Gleixner
    Reviewed-and-acked-by: Grant Likely
    Tested-by: Tony Luck
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140507154340.302183048@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • We want to get rid of the public interface.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Grant Likely
    Tested-by: Tony Luck
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140507154340.061990194@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Not really the solution to the problem, but at least it confines the
    mess in the core code and allows to get rid of the create/destroy_irq
    variants from hell, i.e. 3 implementations with different semantics
    plus the x86 specific variants __create_irqs and create_irq_nr
    which have been invented in another circle of hell.

    x86 : x86 should be converted to irq domains and I'm deliberately
    making it impossible to do the multi-vector MSI support by
    adding more crap to the current mess. It's not that hard to do
    and I'm really tired of the trainwrecks which have been invented
    by band-aid engineering so far. Any attempt to do multi-vector
    MSI or ioapic hotplug without converting to irq domains is NAKed
    hereby.

    tile: Might use irq domains as well, but it has a very limited
    interrupt space, so handling it via this functionality might be
    the right thing to do even in the long run.

    ia64: That's a hopeless case, as I doubt that anyone has the stomach
    to rewrite the homebrewed dynamic allocation facilities. I stared
    at it for a couple of hours and gave up. The create/destroy_irq
    mess could be made private to itanic right away if there
    wouldn't be the iommu/dmar driver being shared with x86. So to
    do that I'm going to add a separate ia64 specific implementation
    later in order not to deep-six itanic right away.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Grant Likely
    Cc: Tony Luck
    Cc: Peter Zijlstra
    Cc: Chris Metcalf
    Cc: Fenghua Yu
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/20140507154334.208629358@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

28 Apr, 2014

1 commit

  • On x86 the allocation of irq descriptors may allocate interrupts which
    are in the range of the GSI interrupts. That's wrong as those
    interrupts are hardwired and we don't have the irq domain translation
    like PPC. So one of these interrupts can be hooked up later to one of
    the devices which are hard wired to it and the io_apic init code for
    that particular interrupt line happily reuses that descriptor with a
    completely different configuration so hell breaks loose.

    Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs,
    except for a few usage sites which have not yet blown up in our face
    for whatever reason. But for drivers which need an irq range, like the
    GPIO drivers, we have no limit in place and we don't want to expose
    such a detail to a driver.

    To cure this introduce a function which an architecture can implement
    to impose a lower bound on the dynamic interrupt allocations.

    Implement it for x86 and set the lower bound to nr_gsi_irqs, which is
    the end of the hardwired interrupt space, so all dynamic allocations
    happen above.

    That not only allows the GPIO driver to work sanely, it also protects
    the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and
    htirq code. They need to be cleaned up as well, but that's a separate
    issue.

    Reported-by: Jin Yao
    Signed-off-by: Thomas Gleixner
    Tested-by: Mika Westerberg
    Cc: Mathias Nyman
    Cc: Linus Torvalds
    Cc: Grant Likely
    Cc: H. Peter Anvin
    Cc: Rafael J. Wysocki
    Cc: Andy Shevchenko
    Cc: Krogerus Heikki
    Cc: Linus Walleij
    Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
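
The lower-bound hook described above amounts to a clamp on the start of the dynamic allocation search. In the sketch below the `model_*` names are hypothetical, the hook is passed as a function pointer rather than overridden by the architecture, and the value 24 for the end of the hardwired range is an arbitrary example.

```c
/* Hypothetical end of the hardwired GSI range on the modelled machine. */
static unsigned int model_nr_gsi_irqs = 24;

/* Default behaviour: no restriction, start where the caller asked. */
static unsigned int model_default_lower_bound(unsigned int from)
{
    return from;
}

/* x86-style behaviour: never start below the hardwired range. */
static unsigned int model_x86_lower_bound(unsigned int from)
{
    return from > model_nr_gsi_irqs ? from : model_nr_gsi_irqs;
}

/* The allocator asks the hook where a dynamic allocation may begin. */
static unsigned int model_first_dynamic_irq(unsigned int from,
                                            unsigned int (*lower_bound)(unsigned int))
{
    return lower_bound(from);
}
```

A driver that asks for a range starting at 0 is moved to 24 in the x86-style variant without having to know about the hardwired range, which is the detail the commit wants to keep out of drivers.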
     

05 Mar, 2014

1 commit

  • There is a common pattern all over the place:

    kstat_incr_irqs_this_cpu(irq, irq_to_desc(irq));

    This results in a call to core code anyway. So provide a function
    which does the same thing in core.

    While at it, replace the butt ugly macro with an inline.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140223212737.422068876@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner