12 May, 2010

1 commit

  • When an interrupt is disabled and torn down, the CPU mask returned
    through affinity_hint right now is all CPUs. Also, for drivers that
    don't provide an affinity_hint mask, this can be misleading. There
    should be no hint at all, meaning an empty CPU mask.

    [ tglx: use zalloc_cpumask_var instead of clearing it under the lock ]

    Signed-off-by: Peter P Waskiewicz Jr
    Cc: davem@davemloft.net
    Cc: arjan@linux.jf.intel.com
    Cc: bhutchings@solarflare.com
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter P Waskiewicz Jr
     

03 May, 2010

1 commit

  • This patch adds a cpumask affinity hint to the irq_desc structure,
    along with a registration function and a read-only proc entry for each
    interrupt.

    This affinity_hint handle for each interrupt can be used by underlying
    drivers that need a better mechanism to control interrupt affinity.
    The underlying driver can register a cpumask for the interrupt, which
    will allow the driver to provide the CPU mask for the interrupt to
    anything that requests it. The intent is to extend the userspace
    daemon, irqbalance, to help hint to it a preferred CPU mask to balance
    the interrupt into.

    [ tglx: Fixed compile warnings, added WARN_ON, made SMP only ]

    Signed-off-by: Peter P Waskiewicz Jr
    Cc: davem@davemloft.net
    Cc: arjan@linux.jf.intel.com
    Cc: bhutchings@solarflare.com
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter P Waskiewicz Jr
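
A minimal user-space sketch of how a consumer such as irqbalance might decode the mask text exposed by the read-only proc entry described in the commit above. It assumes the entry uses the comma-separated hexadecimal format of the other IRQ affinity files (32-bit groups, most significant first). The function name `parse_cpumask` is hypothetical and is not taken from the patch.

```python
def parse_cpumask(text):
    """Return the sorted list of CPU numbers set in a comma-separated hex mask.

    Assumption: commas only separate 32-bit groups, with the most
    significant group first, so they can be dropped before parsing.
    """
    digits = text.strip().replace(",", "")
    value = int(digits, 16) if digits else 0
    # Bit N of the mask corresponds to CPU N.
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    # An all-zero mask means "no hint at all".
    print(parse_cpumask("00000000"))
    print(parse_cpumask("00000000,0000000f"))
```

An empty list corresponds to the "no hint" case, so a caller can treat it as "fall back to the normal balancing policy".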
     

13 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
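
The include-selection rule from the commit above ("if only gfp is used, gfp.h, if slab is used, slab.h") can be illustrated with a short sketch. This is not the slabh-sweep.py script itself. The identifier patterns below are hypothetical and deliberately incomplete, and the function name `needed_header` is an assumption made for illustration.

```python
import re

# Hypothetical, incomplete identifier lists; the real script's patterns
# are not given in the commit message.
SLAB_PATTERN = re.compile(r"\b(?:k[mzc]alloc(?:_node)?|kfree|kmem_cache_\w+)\b")
GFP_PATTERN = re.compile(r"\b(?:(?:__)?GFP_[A-Z_]+|gfp_t)\b")


def needed_header(source):
    """Pick the single header a file should include, or None.

    slab usage wins over gfp usage, mirroring the rule quoted above.
    """
    if SLAB_PATTERN.search(source):
        return "linux/slab.h"
    if GFP_PATTERN.search(source):
        return "linux/gfp.h"
    return None


if __name__ == "__main__":
    print(needed_header("p = kmalloc(len, GFP_KERNEL);"))
    print(needed_header("gfp_t flags = GFP_ATOMIC;"))
    print(needed_header("int x = 0;"))
```

A text match like this cannot tell real usage from comments or strings, which is one reason the manual review steps in the commit message were still needed.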
     

24 Mar, 2010

1 commit

  • Expose irq_desc->node as /proc/irq/*/node.

    This file provides device hardware locality information for apps
    desiring to include hardware locality in irq mapping decisions.

    Signed-off-by: Dimitri Sivanich
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Dimitri Sivanich
     

15 Dec, 2009

1 commit


20 Nov, 2009

1 commit


18 Nov, 2009

1 commit


08 Nov, 2009

1 commit


12 Jan, 2009

1 commit

  • Impact: reduce memory usage, use new cpumask API.

    Replace the affinity and pending_masks with cpumask_var_t's. This adds
    to the significant size reduction done with the SPARSE_IRQS changes.

    The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
    in the include file so they can be inlined (and optimized out for the
    !CONFIG_CPUMASKS_OFFSTACK case.) [Naming chosen to be consistent with
    the other init*irq functions, as well as the backwards arg declaration
    of "from, to" instead of the more common "to, from" standard.]

    Includes a slight change to the declaration of struct irq_desc to embed
    the pending_mask within ifdef(CONFIG_SMP) to be consistent with other
    references, and some small changes to Xen.

    Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64

    Signed-off-by: Mike Travis
    Cc: Chris Wright
    Cc: Jeremy Fitzhardinge
    Cc: KOSAKI Motohiro
    Cc: Venkatesh Pallipadi
    Cc: virtualization@lists.osdl.org
    Cc: xen-devel@lists.xensource.com
    Cc: Yinghai Lu

    Mike Travis
     

04 Jan, 2009

1 commit

  • Impact: build fix on ia64

    ia64's default_affinity_write() still had old cpumask_t usage:

    /home/mingo/tip/kernel/irq/proc.c: In function `default_affinity_write':
    /home/mingo/tip/kernel/irq/proc.c:114: error: incompatible type for argument 1 of `is_affinity_mask_valid'
    make[3]: *** [kernel/irq/proc.o] Error 1
    make[3]: *** Waiting for unfinished jobs....

    update it to cpumask_var_t.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

01 Jan, 2009

1 commit

  • Impact: Reduce stack usage, use new cpumask API. ALPHA mod!

    Main change is that irq_default_affinity becomes a cpumask_var_t, so
    treat it as a pointer (this affects alpha).

    Signed-off-by: Rusty Russell

    Rusty Russell
     

31 Dec, 2008

1 commit


13 Dec, 2008

2 commits

  • Impact: change existing irq_chip API

    Not much point with gentle transition here: the struct irq_chip's
    setaffinity method signature needs to change.

    Fortunately, not widely used code, but hits a few architectures.

    Note: In irq_select_affinity() I save a temporary by mangling
    irq_desc[irq].affinity directly. Ingo, does this break anything?

    (Folded in fix from KOSAKI Motohiro)

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Reviewed-by: Grant Grundler
    Acked-by: Ingo Molnar
    Cc: ralf@linux-mips.org
    Cc: grundler@parisc-linux.org
    Cc: jeremy@xensource.com
    Cc: KOSAKI Motohiro

    Rusty Russell
     
  • …t_scnprintf to take pointers.

    Impact: change calling convention of existing cpumask APIs

    Most cpumask functions started with cpus_: these have been replaced by
    cpumask_ ones which take struct cpumask pointers as expected.

    These four functions don't have good replacement names; fortunately
    they're rarely used, so we just change them over.

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Mike Travis <travis@sgi.com>
    Acked-by: Ingo Molnar <mingo@elte.hu>
    Cc: paulus@samba.org
    Cc: mingo@redhat.com
    Cc: tony.luck@intel.com
    Cc: ralf@linux-mips.org
    Cc: Greg Kroah-Hartman <gregkh@suse.de>
    Cc: cl@linux-foundation.org
    Cc: srostedt@redhat.com

    Rusty Russell
     

08 Dec, 2008

1 commit

  • Impact: new feature

    Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with
    NR_CPUS set to large values. The goal is to be able to scale up to much
    larger NR_IRQS value without impacting the (important) common case.

    To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of
    irq_desc pointers.

    When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc,
    this also makes the IRQ descriptors NUMA-local (to the site that calls
    request_irq()).

    This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now
    uses desc->chip_data for x86 to store irq_cfg.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

10 Nov, 2008

1 commit

  • Impact: preserve user-modified affinities on interrupts

    Kumar Gala noticed that commit
    18404756765c713a0be4eb1082920c04822ce588 (genirq: Expose default irq
    affinity mask (take 3))

    overrides an already set affinity setting across a free /
    request_irq(). Happens e.g. with ifdown/ifup of a network device.

    Change the logic to mark the affinities as set and keep them
    intact. This also fixes the unlocked access to irq_desc in
    irq_select_affinity() when called from irq_affinity_proc_write()

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
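
The "mark the affinities as set and keep them intact" behaviour described in the commit above can be pictured with a toy user-space model. This is not the kernel code. The class `IrqDesc`, the field `affinity_set`, and both function names are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class IrqDesc:
    affinity: int                # bitmask of allowed CPUs
    affinity_set: bool = False   # hypothetical "user has set this" marker


def write_affinity(desc, mask):
    """Model of a user writing a new mask: store it and mark it as set."""
    desc.affinity = mask
    desc.affinity_set = True


def select_affinity(desc, default_mask):
    """Model of requesting the interrupt again after a free.

    The default is applied only if no affinity was explicitly set.
    """
    if not desc.affinity_set:
        desc.affinity = default_mask
    return desc.affinity


if __name__ == "__main__":
    desc = IrqDesc(affinity=0xF)
    write_affinity(desc, 0x2)                 # user pins the IRQ to CPU 1
    print(hex(select_affinity(desc, 0xF)))    # ifdown/ifup: the pin survives
```

Without the marker, the second call would overwrite the user's choice with the default mask, which is the regression the commit message describes.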
     

22 Oct, 2008

1 commit


16 Oct, 2008

4 commits


13 Aug, 2008

1 commit

  • Switch /proc/irq/*/smp_affinity , /proc/irq/default_smp_affinity to
    seq_files.

    cat(1) reads in 1024-byte chunks by default; with a high enough NR_CPUS,
    the read fails with -EINVAL.

    As a side effect, there are now two fewer users of the ->read_proc interface.

    Signed-off-by: Alexey Dobriyan
    Cc: Paul Jackson
    Cc: Mike Travis
    Cc: Al Viro
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
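
The size problem mentioned in the commit above can be checked with a little arithmetic. The sketch assumes the mask is printed as 32-bit groups of up to 8 hex digits separated by commas and followed by a newline, and that the old handler failed when this text did not fit in the reader's buffer. The function name `mask_text_length` is hypothetical.

```python
def mask_text_length(nr_cpus):
    """Length in bytes of the printed mask for nr_cpus CPUs (nr_cpus >= 1)."""
    groups = -(-nr_cpus // 32)                      # ceil(nr_cpus / 32)
    top_bits = nr_cpus - 32 * (groups - 1)          # bits in the leading group
    digits = -(-top_bits // 4) + 8 * (groups - 1)   # hex digits overall
    commas = groups - 1
    return digits + commas + 1                      # +1 for the newline


if __name__ == "__main__":
    for nr_cpus in (64, 2048, 4096):
        length = mask_text_length(nr_cpus)
        print(nr_cpus, length, "fits" if length <= 1024 else "too long")
```

Under these assumptions, NR_CPUS=4096 needs 1152 bytes, which no longer fits in a single 1024-byte read.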
     

05 Jun, 2008

1 commit

  • The current IRQ affinity interface does not provide a way to set affinity
    for IRQs that will be allocated/activated in the future.
    This patch creates /proc/irq/default_smp_affinity, which lets users set
    the default affinity mask for newly allocated IRQs. Changing the default
    does not affect the affinity masks of currently active IRQs; they
    have to be changed explicitly.

    Updated based on Paul J's comments and added some more documentation.

    Signed-off-by: Max Krasnyansky
    Cc: pj@sgi.com
    Cc: a.p.zijlstra@chello.nl
    Cc: tglx@linutronix.de
    Cc: rdunlap@xenotime.net
    Cc: mingo@elte.hu
    Signed-off-by: Thomas Gleixner

    Max Krasnyansky
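
A small sketch of how a user-space tool might build the text to write to /proc/irq/default_smp_affinity from a list of CPU numbers. It assumes the file accepts the same comma-separated hexadecimal format as the per-IRQ smp_affinity files. The function name `format_cpumask` is hypothetical, and the sketch only builds the string. It does not write to /proc, which would require root.

```python
def format_cpumask(cpus):
    """Format CPU numbers as comma-separated 32-bit hex groups."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    digits = f"{value:x}"
    # Pad to a multiple of 8 digits so groups align on 32-bit boundaries.
    digits = digits.zfill(-(-len(digits) // 8) * 8)
    return ",".join(digits[i:i + 8] for i in range(0, len(digits), 8))


if __name__ == "__main__":
    print(format_cpumask([0, 1, 2, 3]))
    print(format_cpumask([0, 32]))
```

As the commit message notes, a new default only applies to IRQs allocated afterwards. Already active IRQs keep their masks until changed explicitly.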
     

30 Jan, 2008

1 commit


22 Jul, 2007

1 commit


12 May, 2007

1 commit

  • On SN, only allow one bit to be set in the smp_affinity mask when
    redirecting an interrupt. Currently setting multiple bits is allowed, but
    only the first bit is used in determining the CPU to redirect to. This has
    caused confusion among some customers.

    [akpm@linux-foundation.org: fixes]
    Signed-off-by: John Keller
    Signed-off-by: Andrew Morton
    Signed-off-by: Tony Luck

    John Keller
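
The "exactly one bit set" rule from the commit above can be illustrated with the usual power-of-two test. This is a user-space sketch, not the kernel's validity check. The function name `is_single_cpu_mask` is hypothetical, and the input is assumed to be a comma-separated hexadecimal mask.

```python
def is_single_cpu_mask(text):
    """Return True if the hex mask selects exactly one CPU."""
    value = int(text.strip().replace(",", ""), 16)
    # A non-zero value with one bit set has no bits in common with value - 1.
    return value != 0 and (value & (value - 1)) == 0


if __name__ == "__main__":
    for mask in ("4", "6", "0", "00000001,00000000"):
        print(mask, is_single_cpu_mask(mask))
```

A tool could run such a check before writing a mask, so that a rejected multi-bit mask does not come as a surprise.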
     

09 May, 2007

1 commit


17 Feb, 2007

2 commits

  • Provide functions to:
    - check whether an interrupt can set the affinity
    - pin the interrupt to a given cpu

    Necessary for the ability to set up clocksources more flexibly (e.g. use
    the different HPET channels per CPU)

    [akpm@osdl.org: alpha build fix]
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Add a flag so we can prevent the irq balancing of an interrupt. Move the
    bits, so we have room for more :)

    Necessary for the ability to set up clocksources more flexibly (e.g. use
    the different HPET channels per CPU)

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

12 Feb, 2007

1 commit

  • Bug: pnx8550 code creates directory but resets ->nlink to 1.

    create_proc_entry() et al will correctly set ->nlink for you.

    Signed-off-by: Alexey Dobriyan
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Jeff Dike
    Cc: Corey Minyard
    Cc: Alan Cox
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: Greg KH
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

09 Dec, 2006

1 commit

  • While running my MCA test (hardware error injection) on 2.6.19,
    I got a warning like the following:

    > BUG: warning at kernel/irq/migration.c:27/move_masked_irq()
    >
    > Call Trace:
    > [] show_stack+0x40/0xa0
    > sp=e00000006b2578d0 bsp=e00000006b2510b0
    > [] dump_stack+0x30/0x60
    > sp=e00000006b257aa0 bsp=e00000006b251098
    > [] move_masked_irq+0xb0/0x240
    > sp=e00000006b257aa0 bsp=e00000006b251070
    > [] move_native_irq+0xe0/0x180
    > sp=e00000006b257aa0 bsp=e00000006b251040
    > [] iosapic_end_level_irq+0x30/0xe0
    > sp=e00000006b257aa0 bsp=e00000006b251020
    > [] __do_IRQ+0x170/0x400
    > sp=e00000006b257aa0 bsp=e00000006b250fd8
    > [] ia64_handle_irq+0x1b0/0x260
    > sp=e00000006b257aa0 bsp=e00000006b250fa8
    > [] ia64_leave_kernel+0x0/0x280
    > sp=e00000006b257aa0 bsp=e00000006b250fa8
    > [] _spin_unlock_irqrestore+0x30/0x60
    > sp=e00000006b257c70 bsp=e00000006b250f90

    It comes from:

    [kernel/irq/migration.c]
    26 if (CHECK_IRQ_PER_CPU(desc->status)) {
    27 WARN_ON(1);
    28 return;
    29 }

    By adding some printk calls to the kernel, I found that irqbalance is
    trying to move CPEI, which is handled as a PER_CPU irq. That is the cause.

    CPEI (Corrected Platform Error Interrupt) is an ia64-specific irq that may
    be pinned to a particular processor selected by the platform. Even though
    it is PER_CPU, it has a set_affinity handler (=iosapic_set_affinity), the
    same as other IO-SAPIC-level interrupts. (I don't know why, but I guess
    there are typical situations where the migration handler is needed, such
    as hotplug, when the processor is going to be taken offline/hot-removed.)

    To silence this warning, there are at least two options:
    a) fix the CPEI handling
    b) prohibit setting affinity on a PER_CPU irq

    I'm not sure what in the CPEI handling needs to be fixed, but I think that
    returning an error on an attempt to move a PER_CPU irq is useful for all
    applications, since such a move will never work.

    The following small patch takes approach b).
    With it, the warning disappears and irqbalance still runs well.

    Signed-off-by: Hidetoshi Seto
    Cc: Arjan van de Ven
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidetoshi Seto
     

12 Oct, 2006

1 commit

  • lib/bitmap.c:bitmap_parse() is a library function that receives a user
    buffer as input. This seems to have originated from the way the write_proc
    function of the /proc filesystem operates.

    This has been reworked to not use kmalloc and eliminates a lot of
    get_user() overhead by performing one access_ok before using __get_user().

    We need to test if we are in kernel or user space (is_user) and access the
    buffer differently. We cannot use __get_user() to access kernel addresses
    in all cases, for example in architectures with separate address space for
    kernel and user.

    This function will be useful for other uses as well; for example, taking
    input for /sysfs instead of /proc, so it was changed to accept kernel
    buffers. We have this use for the Linux UWB project, as part of the
    upcoming bandwidth allocator code.

    Only a few routines used this function and they were changed too.

    Signed-off-by: Reinette Chatre
    Signed-off-by: Inaky Perez-Gonzalez
    Cc: Paul Jackson
    Cc: Joe Korty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Reinette Chatre
     

30 Jun, 2006

4 commits

  • Rename no_irq_type to no_irq_chip.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Consolidation: remove the irq_dir[NR_IRQS] and the smp_affinity_entry[NR_IRQS]
    arrays and move them into the irq_desc[] array.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
    irq_desc[NR_IRQS].affinity field.

    [akpm@osdl.org: sparc64 build fix]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • This patch-queue improves the generic IRQ layer to be truly generic, by adding
    various abstractions and features to it, without impacting existing
    functionality.

    While the queue can be best described as "fix and improve everything in the
    generic IRQ layer that we could think of", and thus it consists of many
    smaller features and lots of cleanups, the one feature that stands out most is
    the new 'irq chip' abstraction.

    The irq-chip abstraction is about describing and coding an IRQ controller
    driver by mapping its raw hardware capabilities [and quirks, if needed] in a
    straightforward way, without having to think about "IRQ flow"
    (level/edge/etc.) type of details.

    This stands in contrast with the current 'irq-type' model of genirq
    architectures, which 'mixes' raw hardware capabilities with 'flow' details.
    The patchset supports both types of irq controller designs at once, and
    converts i386 and x86_64 to the new irq-chip design.

    As a bonus side-effect of the irq-chip approach, chained interrupt controllers
    (master/slave PIC constructs, etc.) are now supported by design as well.

    The end result of this patchset intends to be simpler architecture-level code
    and more consolidation between architectures.

    We reused many bits of code and many concepts from Russell King's ARM IRQ
    layer, the merging of which was one of the motivations for this patchset.

    This patch:

    rename desc->handler to desc->chip.

    Originally I did not want to do this, because it's a big patch. But having
    "desc->handler", "desc->handle_irq" and "action->handler" all at once
    caused a large degree of confusion and made the code appear a lot less
    clean than it truly is.

    I also attempted a dual approach by introducing a
    desc->chip alias - but that just wasn't robust enough and broke
    frequently.

    So let's get this over with quickly. The conversion was done automatically
    via scripts and converts all the code in the kernel.

    This renaming patch is the first one amongst the patches, so that the
    remaining patches can stay flexible and can be merged and split up
    without having some big monolithic patch act as a merge barrier.

    [akpm@osdl.org: build fix]
    [akpm@osdl.org: another build fix]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

23 Jun, 2006

1 commit

  • On i386, kernel irq balance doesn't work.

    1) In function do_irq_balance, after the kernel finds the min_loaded cpu
    but before calling set_pending_irq to actually pin the selected_irq to the
    target cpu, the kernel does a cpus_and with irq_affinity[selected_irq].
    Later on, when the irq is acked, the kernel calls
    move_native_irq=>desc->handler->set_affinity to change the irq affinity.
    However, every function pointed to by
    hw_interrupt_type->set_affinity(unsigned int irq, cpumask_t cpumask)
    always changes irq_affinity[irq] to cpumask. The next time
    do_irq_balance is called, it does the cpus_and with
    irq_affinity[selected_irq] again, but irq_affinity[selected_irq] has
    already been narrowed to the one cpu selected by the first irq balance.

    2) Function balance_irq in file arch/i386/kernel/io_apic.c has the same
    issue.

    [akpm@osdl.org: cleanups]
    Signed-off-by: Zhang Yanmin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Yanmin
     

09 Jan, 2006

1 commit

  • This patch contains the following cleanups:
    - make needlessly global functions static
    - every file should include the headers containing the prototypes for
    its global functions

    Signed-off-by: Adrian Bunk
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

07 Jan, 2006

1 commit

  • Thanks to Christoph for doing most of the work.

    This allows automatic SMP IRQ affinity assignment other than the default
    "all interrupts on all CPUs", which is rather expensive. This might be
    useful if the hardware can be programmed to distribute interrupts among
    different CPUs, as Alpha does.

    Signed-off-by: Ivan Kokshaysky
    Cc: Christoph Hellwig
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky