28 Oct, 2010

1 commit

  • In /proc/stat, the per-IRQ event count is computed by summing each
    irq's events over all cpus. We can use kstat_irqs() for this instead.

    kstat_irqs() does the same calculation. If !CONFIG_GENERIC_HARDIRQ,
    it's not a big cost. (Both the number of cpus and the number of irqs
    are small.)

    If a system is very big and CONFIG_GENERIC_HARDIRQ is set, it does

    for_each_irq()
        for_each_cpu()
            - look up a radix tree
            - read desc->irq_stat[cpu]

    This is inefficient. This patch adds kstat_irqs() for
    CONFIG_GENERIC_HARDIRQ and changes the calculation to

    for_each_irq()
        look up radix tree
        for_each_cpu()
            - read desc->irq_stat[cpu]

    This reduces the cost.

    A test on a (4096 cpus, 256 nodes, 4592 irqs) host (by Jack Steiner):

    %time cat /proc/stat > /dev/null

    Before Patch: 2.459 sec
    After Patch : 0.561 sec
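
    The loop reordering described above can be sketched in user-space C.
    This is only an illustration, not the kernel code: the array stands
    in for the radix tree, and every sketch_* / SKETCH_* name is
    hypothetical. A counter records how many lookups each shape performs.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4   /* assumption: tiny sizes for illustration */
#define SKETCH_NR_IRQS 3

/* Hypothetical stand-in for struct irq_desc: per-cpu counters only. */
struct desc_sketch {
    unsigned int irq_stat[SKETCH_NR_CPUS];
};

struct desc_sketch sketch_descs[SKETCH_NR_IRQS];
unsigned long sketch_lookups;   /* number of descriptor lookups performed */

/* Stand-in for the radix tree lookup; a plain array here. */
struct desc_sketch *sketch_lookup(unsigned int irq)
{
    sketch_lookups++;
    return irq < SKETCH_NR_IRQS ? &sketch_descs[irq] : NULL;
}

/* Record one event for (irq, cpu). */
void sketch_count_event(unsigned int irq, unsigned int cpu)
{
    assert(irq < SKETCH_NR_IRQS && cpu < SKETCH_NR_CPUS);
    sketch_descs[irq].irq_stat[cpu]++;
}

/* Old shape: the lookup sits inside the cpu loop. */
unsigned int sketch_sum_old(unsigned int irq)
{
    unsigned int sum = 0;

    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        struct desc_sketch *desc = sketch_lookup(irq);

        if (desc)
            sum += desc->irq_stat[cpu];
    }
    return sum;
}

/* New shape: look up the descriptor once, then walk the cpus. */
unsigned int sketch_sum_new(unsigned int irq)
{
    struct desc_sketch *desc = sketch_lookup(irq);
    unsigned int sum = 0;

    if (!desc)
        return 0;
    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
        sum += desc->irq_stat[cpu];
    return sum;
}
```

    Both functions return the same sum; the old shape does one lookup per
    cpu, the new shape one lookup per irq.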

    [akpm@linux-foundation.org: unexport kstat_irqs, coding-style tweaks]
    [akpm@linux-foundation.org: fix unused variable 'per_irq_sum']
    Signed-off-by: KAMEZAWA Hiroyuki
    Tested-by: Jack Steiner
    Acked-by: Jack Steiner
    Cc: Yinghai Lu
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

13 Oct, 2010

1 commit


12 Oct, 2010

15 commits

  • The allocator functions are now called outside of preempt disabled
    regions. Switch to GFP_KERNEL.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • No callers from atomic regions.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • The move_irq_desc() function was only used due to the problem that the
    allocator did not free the old descriptors. So the descriptors had to
    be moved in create_irq_nr(). That's history.

    The code would never have been able to move active interrupt
    descriptors on affinity settings. That can be done in a completely
    different way without all this horror.

    Remove all of it.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • Use the cleanup functions of the dynamic allocator. No need to have
    separate implementations.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • This function should not have been there in the first place.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • sparse irq sets up NR_IRQS_LEGACY irq descriptors and archs then go
    ahead and allocate more.

    Use the unused return value of arch_probe_nr_irqs() to let the
    architecture return the number of early allocations. Fix up all users.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • Make irq_to_desc_alloc_node() a wrapper around the new allocator.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • Mark a range of interrupts as allocated. In the SPARSE_IRQ=n case we
    need this to update the bitmap for the legacy irqs so the enumerator
    via irq_get_next_irq() works.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Use the allocator bitmap to look up active interrupts.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • /proc/irq never removes any entries, but when irq descriptors can be
    freed for real this is necessary. Otherwise we'd reference a freed
    descriptor in /proc/irq/N.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • The current sparse_irq allocator has several shortcomings due to
    failures in the design or the lack of it:

    - Requires iteration over the number of active irqs to find a free slot
      (Some architectures have grown their own workarounds for this)
    - Removal of entries is not possible
    - Racy between create_irq_nr and destroy_irq (plugged by horrible
      callbacks)
    - Migration of active irq descriptors is not possible
    - No bulk allocation of irq ranges
    - Sprinkled irq_desc references all over the place outside of kernel/irq/
      (The previous chip functions series is addressing this issue)

    Implement a sane allocator which fixes the above shortcomings (though
    migration of active descriptors needs a full tree wide cleanup of the
    direct and mostly unlocked access to irq_desc).

    The new allocator still uses a radix_tree, but uses a bitmap for
    keeping track of allocated irq numbers. That allows:

    - Fast lookup of a free slot
    - Removal of descriptors
    - Prevention of the create/destroy race
    - Bulk allocation of consecutive irq ranges
    - Basic design is ready for migration of live descriptors after
      further cleanups

    The bitmap is also used in the SPARSE_IRQ=n case for lookup and
    raceless (de)allocation of irq numbers. So it removes the requirement
    for looping through the descriptor array to find slots.

    Right now it uses sparse_irq_lock to protect the bitmap and the radix
    tree, but after cleaning up all users we should be able to convert that
    to a mutex and to switch the radix_tree and descriptor allocations to
    GFP_KERNEL.
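
    The bitmap bookkeeping can be illustrated with a small user-space
    sketch. It is not the kernel implementation: it uses a single 64-bit
    word and a linear scan, omits the locking that sparse_irq_lock
    provides, and all sketch_* / SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_IRQS 64   /* assumption: one 64-bit word is enough here */

/* One bit per irq number; a set bit means "allocated". */
unsigned long long sketch_allocated;

/* True if irq numbers start .. start + cnt - 1 are all free. */
bool sketch_range_is_free(unsigned int start, unsigned int cnt)
{
    for (unsigned int i = 0; i < cnt; i++)
        if (sketch_allocated & (1ULL << (start + i)))
            return false;
    return true;
}

/*
 * Find the first run of cnt free irq numbers at or above from, mark it
 * allocated and return its first number. Returns -1 if no run fits.
 */
int sketch_alloc_range(unsigned int from, unsigned int cnt)
{
    assert(cnt > 0);
    for (unsigned int start = from; start + cnt <= SKETCH_NR_IRQS; start++) {
        if (!sketch_range_is_free(start, cnt))
            continue;
        for (unsigned int i = 0; i < cnt; i++)
            sketch_allocated |= 1ULL << (start + i);
        return (int)start;
    }
    return -1;
}

/* Clear the bits again so the numbers can be reused. */
void sketch_free_range(unsigned int start, unsigned int cnt)
{
    assert(start + cnt <= SKETCH_NR_IRQS);
    for (unsigned int i = 0; i < cnt; i++)
        sketch_allocated &= ~(1ULL << (start + i));
}
```

    Because allocation and freeing only set or clear bits, finding a free
    slot needs no walk over the descriptors, and a consecutive range can
    be reserved in one step.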

    [ Folded in a bugfix from Yinghai Lu ]

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • Arch code sets its own irq_desc.status flags right after boot and for
    dynamically allocated interrupts. That might involve iterating over a
    huge array.

    Allow ARCH_IRQ_INIT_FLAGS to set other flags besides IRQ_DISABLED,
    which is the default.
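
    One way such a compile-time override could look is sketched below.
    The flag values, the SKETCH_* names and the OR-combination with the
    default are assumptions made for illustration, not taken from the
    patch; only ARCH_IRQ_INIT_FLAGS and IRQ_DISABLED are named above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_IRQ_DISABLED  0x1u   /* hypothetical flag values */
#define SKETCH_IRQ_NOREQUEST 0x2u

/* An architecture header would provide this; defined here to show the effect. */
#define ARCH_IRQ_INIT_FLAGS SKETCH_IRQ_NOREQUEST

/* Assumption: the arch flags are ORed onto the generic default. */
#ifdef ARCH_IRQ_INIT_FLAGS
#define SKETCH_DEFAULT_INIT_FLAGS (SKETCH_IRQ_DISABLED | ARCH_IRQ_INIT_FLAGS)
#else
#define SKETCH_DEFAULT_INIT_FLAGS SKETCH_IRQ_DISABLED
#endif

/* Hypothetical stand-in for the status field of struct irq_desc. */
struct status_sketch {
    unsigned int status;
};

/* Generic init stamps the flags once, so arch code needs no second pass. */
void sketch_init_descs(struct status_sketch *descs, size_t n)
{
    assert(descs != NULL);
    for (size_t i = 0; i < n; i++)
        descs[i].status = SKETCH_DEFAULT_INIT_FLAGS;
}
```

    With the override applied at initialization time, the architecture
    no longer has to iterate over the descriptors a second time just to
    adjust the flags.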

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • The statistics accessor is only used by /proc/stat and
    show_interrupts(). Both are compiled in.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • early_init_irq_lock_class() is called way before anything touches the
    irq descriptors. In case of SPARSE_IRQ=y this is a NOP because the
    radix tree is empty at this point. For the SPARSE_IRQ=n case it's
    sufficient to set the lock class in early_init_irq().

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner
     
  • kernel/irq/handle.c has become a dumping ground for random code in
    random order. Split out the irq descriptor management and the dummy
    irq_chip implementation into separate files. Clean up the include
    maze while at it.

    No code change.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar

    Thomas Gleixner