13 Dec, 2014

1 commit

  • Since the rework of the sparse interrupt code to actually free the
    unused interrupt descriptors there exists a race between the /proc
    interfaces to the irq subsystem and the code which frees the interrupt
    descriptor.

    CPU0                            CPU1
                                    show_interrupts()
                                      desc = irq_to_desc(X);
    free_desc(desc)
      remove_from_radix_tree();
      kfree(desc);
                                      raw_spinlock_irq(&desc->lock);

    /proc/interrupts is the only interface which can actively corrupt
    kernel memory via the lock access. /proc/stat can only read from freed
    memory. Extremely hard to trigger, but possible.

    The interfaces in /proc/irq/N/ are not affected by this because the
    removal of the proc file is serialized in procfs against concurrent
    readers/writers. The removal happens before the descriptor is freed.

    For architectures which have CONFIG_SPARSE_IRQ=n this is a non-issue
    as the descriptor is never freed. It's merely cleared out with the irq
    descriptor lock held. So any concurrent proc access will either see
    the old correct value or the cleared out ones.

    Protect the lookup and access to the irq descriptor in
    show_interrupts() with the sparse_irq_lock.

    Provide kstat_irqs_usr() which protects the lookup and access
    with sparse_irq_lock and switch /proc/stat to use it.
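
    The locking rule can be illustrated with a toy user-space model. This is
    only a sketch, not kernel code: the dictionary stands in for the radix
    tree of descriptors, threading.Lock stands in for sparse_irq_lock, and
    the Python function names merely borrow the kernel names for readability.

```python
import threading

# Toy model: `descs` plays the role of the descriptor radix tree and
# `sparse_lock` the role of sparse_irq_lock. Both are illustrative.
sparse_lock = threading.Lock()
descs = {7: {"count": 42}}


def free_desc(irq):
    """Remove a descriptor. Nothing may touch it afterwards."""
    with sparse_lock:
        descs.pop(irq, None)


def kstat_irqs_usr(irq):
    """Look up and read under one lock hold, so removal cannot interleave."""
    with sparse_lock:
        desc = descs.get(irq)
        return desc["count"] if desc is not None else 0
```

    Because the lookup and the read happen inside the same critical section,
    a reader either sees a live descriptor or none at all.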

    Document the existing kstat_irqs interfaces so it's clear that the
    caller needs to take care of protection. The users of these
    interfaces are either not affected due to SPARSE_IRQ=n or already
    protected against removal.

    Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
     

19 Mar, 2014

1 commit

  • Includes:
    - /proc/irq/default_smp_affinity
    - /proc/irq/*/affinity_hint
    - /proc/irq/*/smp_affinity
    - /proc/irq/*/smp_affinity_list

    Users can distill the same information by reading /proc/interrupts.

    Signed-off-by: Chema Gonzalez
    Cc: Eric Dumazet
    Link: http://lkml.kernel.org/r/1394765455-1217-1-git-send-email-chema@google.com
    Signed-off-by: Thomas Gleixner

    Chema Gonzalez
     

24 Jun, 2013

1 commit

  • Add the hardware interrupt number to the output of /proc/interrupts.
    It is often important to have access to the hardware interrupt number because
    it identifies exactly how an interrupt signal is wired up to the interrupt
    controller. This is especially important when using irq_domains since irq
    numbers get dynamically allocated in that case, and have no relation to the
    actual hardware number.

    Note: This output is currently conditional on whether or not the irq_domain
    pointer is set; however hwirq could still be used without irq_domain. It
    may be worthwhile to always output the hwirq number regardless of the
    domain pointer.
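
    Why the two numbers diverge can be shown with a small model. This is a
    sketch under assumptions: ToyIrqDomain, its map() method and the starting
    value 16 are invented for illustration and are not the irq_domain API.

```python
from itertools import count

# Assumed global allocator for dynamically assigned Linux irq numbers.
_next_virq = count(16)


class ToyIrqDomain:
    """Maps controller-local hardware numbers to allocated irq numbers."""

    def __init__(self, name):
        self.name = name
        self.hwirq_of = {}  # Linux irq number -> hardware irq number

    def map(self, hwirq):
        virq = next(_next_virq)  # handed out in request order
        self.hwirq_of[virq] = hwirq
        return virq


gpio = ToyIrqDomain("gpio")
uart_virq = gpio.map(3)    # hardware line 3 is requested first
button_virq = gpio.map(0)  # hardware line 0 is requested second
```

    The allocated numbers follow request order, so only the stored hardware
    number tells how each line is wired to the controller.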

    Signed-off-by: Grant Likely
    Tested-by: Olof Johansson
    Cc: Ben Herrenschmidt
    Cc: Thomas Gleixner

    Grant Likely
     

02 May, 2013

1 commit

  • Supply a function (proc_remove()) to remove a proc entry (and any subtree
    rooted there) by proc_dir_entry pointer rather than by name and (optionally)
    root dir entry pointer. This allows us to eliminate all remaining pde->name
    accesses outside of procfs.

    Signed-off-by: David Howells
    Acked-by: Grant Likely
    cc: linux-acpi@vger.kernel.org
    cc: openipmi-developer@lists.sourceforge.net
    cc: devicetree-discuss@lists.ozlabs.org
    cc: linux-pci@vger.kernel.org
    cc: netdev@vger.kernel.org
    cc: netfilter-devel@vger.kernel.org
    cc: alsa-devel@alsa-project.org
    Signed-off-by: Al Viro

    David Howells
     

10 Apr, 2013

1 commit

  • The only part of proc_dir_entry the code outside of fs/proc
    really cares about is PDE(inode)->data. Provide a helper
    for that; static inline for now, eventually will be moved
    to fs/proc, along with the knowledge of struct proc_dir_entry
    layout.

    Signed-off-by: Al Viro

    Al Viro
     

23 Feb, 2013

1 commit


26 May, 2011

1 commit

  • commit 4b06042(bitmap, irq: add smp_affinity_list interface to
    /proc/irq) causes the following warning:

    [ 274.239500] WARNING: at fs/proc/generic.c:850 remove_proc_entry+0x24c/0x27a()
    [ 274.251761] remove_proc_entry: removing non-empty directory 'irq/184',
    leaking at least 'smp_affinity_list'

    Remove the new file in the exit path.

    Signed-off-by: Yinghai Lu
    Cc: Mike Travis
    Link: http://lkml.kernel.org/r/4DDDE094.6050505@kernel.org
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     

25 May, 2011

1 commit

  • Manually adjusting the smp_affinity for IRQs becomes unwieldy when the
    cpu count is large.

    Setting smp affinity to cpus 256 to 263 would be:

    echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity

    instead of:

    echo 256-263 > smp_affinity_list

    Think about what it looks like for cpus around say, 4088 to 4095.

    We already have many alternate "list" interfaces:

    /sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
    /sys/devices/system/cpu/cpuX/topology/thread_siblings_list
    /sys/devices/system/cpu/cpuX/topology/core_siblings_list
    /sys/devices/system/node/nodeX/cpulist
    /sys/devices/pci***/***/local_cpulist

    Add a companion interface, smp_affinity_list to use cpu lists instead of
    cpu maps. This conforms to other companion interfaces where both a map
    and a list interface exist.

    This required adding a bitmap_parselist_user() function in a manner
    similar to the bitmap_parse_user() function.
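
    To see how quickly the map form grows, the conversion from a cpu list to
    a map string can be sketched in user space. This assumes the map is
    written as comma-separated 32-bit hex groups with the most significant
    group first; cpus_to_mask() is a hypothetical helper, not a kernel or
    library function.

```python
def cpus_to_mask(cpus):
    """Format cpu numbers as 32-bit hex groups, most significant first."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    # Number of 32-bit groups needed to hold the highest set bit.
    ngroups = max(1, (bits.bit_length() + 31) // 32)
    groups = [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(ngroups)]
    return ",".join(f"{g:08x}" for g in reversed(groups))


# cpus 4088 to 4095: one non-zero group followed by 127 all-zero groups.
mask = cpus_to_mask(range(4088, 4096))
print(len(mask.split(",")), "groups, starting with", mask[:8])
```

    The equivalent list form is simply "4088-4095".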

    [akpm@linux-foundation.org: make __bitmap_parselist() static]
    Signed-off-by: Mike Travis
    Cc: Thomas Gleixner
    Cc: Jack Steiner
    Cc: Lee Schermerhorn
    Cc: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Travis
     

03 May, 2011

1 commit

  • commit ab7798ffcf98b11a9525cf65bacdae3fd58d357f ("genirq: Expand generic
    show_interrupts()") added the Kconfig option GENERIC_IRQ_SHOW_LEVEL to
    accommodate PowerPC, but this doesn't actually enable the functionality due
    to a typo in the #ifdef check.

    Signed-off-by: Geert Uytterhoeven
    Cc: Linux/PPC Development
    Link: http://lkml.kernel.org/r/%3Calpine.DEB.2.00.1104302251370.19068%40ayla.of.borg%3E
    Signed-off-by: Thomas Gleixner

    Geert Uytterhoeven
     

29 Mar, 2011

1 commit

  • The only subtle difference is that alpha uses ACTUAL_NR_IRQS and
    prints the IRQF_DISABLED flag.

    Change the generic implementation to deal with ACTUAL_NR_IRQS if
    defined.

    The IRQF_DISABLED printing is pointless, as we nowadays run all
    interrupts with irqs disabled.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

26 Mar, 2011

1 commit

  • Some archs want to print extra information for certain irq_chips which
    is per irq and not per chip. Allow them to provide a chip callback to
    print the chip name and the extra information.

    PowerPC wants to print the LEVEL/EDGE type information. Make it configurable.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

17 Mar, 2011

1 commit


19 Feb, 2011

4 commits

  • Add a !desc check while at it.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • chip implementations need to know about it. Keep status in sync until
    all users are fixed.

    Accessor function: irqd_is_setaffinity_pending(irqdata)

    Coders who access them directly will be tracked down and slapped with
    stinking trouts.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • All archs implement show_interrupts() in more or less the same
    way. That's tons of duplicated code with different bugs with no
    value. Implement a generic version and deprecate show_interrupts().

    Unfortunately we need some ifdeffery for !GENERIC_HARDIRQ archs.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • While rummaging through arch code I found that there are a few
    workarounds which deal with the fact that the initial affinity setting
    from request_irq() copies the mask into irq_data->affinity before the
    chip code is called. In the normal path we unconditionally copy the
    mask when the chip code returns 0.

    Copy after the code is called and add a return code
    IRQ_SET_MASK_OK_NOCOPY for the chip functions, which prevents the
    copy. That way we see the real mask when the chip function decided to
    truncate it further as some arches do. IRQ_SET_MASK_OK is 0, which is
    the current behaviour.
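
    The resulting control flow can be sketched as follows. This is a toy
    model, not the kernel implementation: irq_data is a plain dictionary, the
    two chip callbacks are invented, and the numeric value 1 for
    IRQ_SET_MASK_OK_NOCOPY is an assumption (only IRQ_SET_MASK_OK being 0 is
    stated above).

```python
IRQ_SET_MASK_OK = 0
IRQ_SET_MASK_OK_NOCOPY = 1  # assumed value, for illustration only


def set_affinity(irq_data, mask, chip_set_affinity):
    """Call the chip first, then copy the mask unless the chip opts out."""
    ret = chip_set_affinity(irq_data, mask)
    if ret == IRQ_SET_MASK_OK:
        irq_data["affinity"] = set(mask)
    elif ret == IRQ_SET_MASK_OK_NOCOPY:
        ret = 0  # success, the chip already stored the mask it programmed
    return ret


def plain_chip(irq_data, mask):
    """Hypothetical chip that accepts the requested mask unchanged."""
    return IRQ_SET_MASK_OK


def single_cpu_chip(irq_data, mask):
    """Hypothetical chip that can target only one cpu and truncates."""
    irq_data["affinity"] = {min(mask)}
    return IRQ_SET_MASK_OK_NOCOPY
```

    With single_cpu_chip the stored affinity ends up as the truncated mask
    instead of being overwritten by the caller's request.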

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

01 Dec, 2010

1 commit

  • Since commit a1afb637(switch /proc/irq/*/spurious to seq_file) all
    /proc/irq/XX/spurious files show the information of irq 0.

    Current irq_spurious_proc_open() passes on NULL as the 3rd argument,
    which is used as an IRQ number in irq_spurious_proc_show(), to the
    single_open(). Because of this, every /proc/irq/XX/spurious file
    shows IRQ 0 information regardless of the IRQ number.

    To fix the problem, irq_spurious_proc_open() must pass on the
    appropriate data (IRQ number) to single_open().
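
    The effect of the missing argument can be mimicked in a few lines. This
    is a toy stand-in only: the Python single_open() below takes two
    arguments and returns a callable, unlike the real seq_file function, and
    buggy_open()/fixed_open() are invented names.

```python
def single_open(show, data):
    """Toy stand-in: bind a show callback to its private data."""
    return lambda: show(data)


def irq_spurious_proc_show(irq):
    # A missing value is interpreted as irq number 0, as in the bug.
    return f"irq {0 if irq is None else irq}"


def buggy_open(irq):
    return single_open(irq_spurious_proc_show, None)  # private data dropped


def fixed_open(irq):
    return single_open(irq_spurious_proc_show, irq)   # private data passed on


print(buggy_open(184)())  # irq 0
print(fixed_open(184)())  # irq 184
```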

    Signed-off-by: Kenji Kaneshige
    Reviewed-by: Yong Zhang
    LKML-Reference:
    Cc: stable@kernel.org [2.6.33+]
    Signed-off-by: Thomas Gleixner

    Kenji Kaneshige
     

12 Oct, 2010

1 commit


04 Oct, 2010

2 commits


12 May, 2010

1 commit

  • When an interrupt is disabled and torn down, the CPU mask returned
    through affinity_hint right now is all CPUs. Also, for drivers that
    don't provide an affinity_hint mask, this can be misleading. There
    should be no hint at all, meaning an empty CPU mask.

    [ tglx: use zalloc_cpumask_var instead of clearing it under the lock ]

    Signed-off-by: Peter P Waskiewicz Jr
    Cc: davem@davemloft.net
    Cc: arjan@linux.jf.intel.com
    Cc: bhutchings@solarflare.com
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter P Waskiewicz Jr
     

03 May, 2010

1 commit

  • This patch adds a cpumask affinity hint to the irq_desc structure,
    along with a registration function and a read-only proc entry for each
    interrupt.

    This affinity_hint handle for each interrupt can be used by underlying
    drivers that need a better mechanism to control interrupt affinity.
    The underlying driver can register a cpumask for the interrupt, which
    will allow the driver to provide the CPU mask for the interrupt to
    anything that requests it. The intent is to extend the userspace
    daemon, irqbalance, to help hint to it a preferred CPU mask to balance
    the interrupt into.

    [ tglx: Fixed compile warnings, added WARN_ON, made SMP only ]

    Signed-off-by: Peter P Waskiewicz Jr
    Cc: davem@davemloft.net
    Cc: arjan@linux.jf.intel.com
    Cc: bhutchings@solarflare.com
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter P Waskiewicz Jr
     

13 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

24 Mar, 2010

1 commit

  • Expose irq_desc->node as /proc/irq/*/node.

    This file provides device hardware locality information for apps
    desiring to include hardware locality in irq mapping decisions.

    Signed-off-by: Dimitri Sivanich
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Dimitri Sivanich
     

15 Dec, 2009

1 commit


20 Nov, 2009

1 commit


18 Nov, 2009

1 commit


08 Nov, 2009

1 commit


12 Jan, 2009

1 commit

  • Impact: reduce memory usage, use new cpumask API.

    Replace the affinity and pending_masks with cpumask_var_t's. This adds
    to the significant size reduction done with the SPARSE_IRQS changes.

    The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
    in the include file so they can be inlined (and optimized out for the
    !CONFIG_CPUMASK_OFFSTACK case.) [Naming chosen to be consistent with
    the other init*irq functions, as well as the backwards arg declaration
    of "from, to" instead of the more common "to, from" standard.]

    Includes a slight change to the declaration of struct irq_desc to embed
    the pending_mask within ifdef(CONFIG_SMP) to be consistent with other
    references, and some small changes to Xen.

    Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64

    Signed-off-by: Mike Travis
    Cc: Chris Wright
    Cc: Jeremy Fitzhardinge
    Cc: KOSAKI Motohiro
    Cc: Venkatesh Pallipadi
    Cc: virtualization@lists.osdl.org
    Cc: xen-devel@lists.xensource.com
    Cc: Yinghai Lu

    Mike Travis
     

04 Jan, 2009

1 commit

  • Impact: build fix on ia64

    ia64's default_affinity_write() still had old cpumask_t usage:

    /home/mingo/tip/kernel/irq/proc.c: In function `default_affinity_write':
    /home/mingo/tip/kernel/irq/proc.c:114: error: incompatible type for argument 1 of `is_affinity_mask_valid'
    make[3]: *** [kernel/irq/proc.o] Error 1
    make[3]: *** Waiting for unfinished jobs....

    update it to cpumask_var_t.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

01 Jan, 2009

1 commit

  • Impact: Reduce stack usage, use new cpumask API. ALPHA mod!

    Main change is that irq_default_affinity becomes a cpumask_var_t, so
    treat it as a pointer (this affects alpha).

    Signed-off-by: Rusty Russell

    Rusty Russell
     

31 Dec, 2008

1 commit


13 Dec, 2008

2 commits

  • Impact: change existing irq_chip API

    Not much point in a gentle transition here: the struct irq_chip's
    setaffinity method signature needs to change.

    Fortunately, not widely used code, but hits a few architectures.

    Note: In irq_select_affinity() I save a temporary by mangling
    irq_desc[irq].affinity directly. Ingo, does this break anything?

    (Folded in fix from KOSAKI Motohiro)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Reviewed-by: Grant Grundler
    Acked-by: Ingo Molnar
    Cc: ralf@linux-mips.org
    Cc: grundler@parisc-linux.org
    Cc: jeremy@xensource.com
    Cc: KOSAKI Motohiro

    Rusty Russell
     
  • …t_scnprintf to take pointers.

    Impact: change calling convention of existing cpumask APIs

    Most cpumask functions started with cpus_: these have been replaced by
    cpumask_ ones which take struct cpumask pointers as expected.

    These four functions don't have good replacement names; fortunately
    they're rarely used, so we just change them over.

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Mike Travis <travis@sgi.com>
    Acked-by: Ingo Molnar <mingo@elte.hu>
    Cc: paulus@samba.org
    Cc: mingo@redhat.com
    Cc: tony.luck@intel.com
    Cc: ralf@linux-mips.org
    Cc: Greg Kroah-Hartman <gregkh@suse.de>
    Cc: cl@linux-foundation.org
    Cc: srostedt@redhat.com

    Rusty Russell
     

08 Dec, 2008

1 commit

  • Impact: new feature

    Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
    NR_CPUS set to large values. The goal is to be able to scale up to much
    larger NR_IRQS values without impacting the (important) common case.

    To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
    irq_desc pointers.

    When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
    this also makes the IRQ descriptors NUMA-local (to the site that calls
    request_irq()).

    This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
    uses desc->chip_data for x86 to store irq_cfg.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

10 Nov, 2008

1 commit

  • Impact: preserve user-modified affinities on interrupts

    Kumar Gala noticed that commit
    18404756765c713a0be4eb1082920c04822ce588 (genirq: Expose default irq
    affinity mask (take 3))

    overrides an already set affinity setting across a free /
    request_irq(). Happens e.g. with ifdown/ifup of a network device.

    Change the logic to mark the affinities as set and keep them
    intact. This also fixes the unlocked access to irq_desc in
    irq_select_affinity() when called from irq_affinity_proc_write()

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

22 Oct, 2008

1 commit


16 Oct, 2008

2 commits