12 Oct, 2006

1 commit

  • lib/bitmap.c:bitmap_parse() is a library function that received as input a
    user buffer. This seemed to have originated from the way the write_proc
    function of the /proc filesystem operates.

    This has been reworked to not use kmalloc and eliminates a lot of
    get_user() overhead by performing one access_ok before using __get_user().

    We need to test if we are in kernel or user space (is_user) and access the
    buffer differently. We cannot use __get_user() to access kernel addresses
    in all cases, for example in architectures with separate address space for
    kernel and user.

    This function will be useful for other uses as well; for example, taking
    input for /sysfs instead of /proc, so it was changed to accept kernel
    buffers. We have this use for the Linux UWB project, as part of the
    upcoming bandwidth allocator code.

    Only a few routines used this function and they were changed too.

    Signed-off-by: Reinette Chatre
    Signed-off-by: Inaky Perez-Gonzalez
    Cc: Paul Jackson
    Cc: Joe Korty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Reinette Chatre
     

07 Oct, 2006

1 commit


05 Oct, 2006

3 commits

  • Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
    of passing regs around manually through all ~1800 interrupt handlers in the
    Linux kernel.

    The regs pointer is used in few places, but it potentially costs both stack
    space and code to pass it around. On the FRV arch, removing the regs parameter
    from all the genirq functions results in a 20% speed up of the IRQ exit path
    (i.e. from leaving timer_interrupt() to leaving do_IRQ()).

    Where appropriate, an arch may override the generic storage facility and do
    something different with the variable. On FRV, for instance, the address is
    maintained in GR28 at all times inside the kernel as part of general exception
    handling.

    Having looked over the code, it appears that the parameter may be handed down
    through up to twenty or so layers of functions. Consider a USB character
    device attached to a USB hub, attached to a USB controller that posts its
    interrupts through a cascaded auxiliary interrupt controller. A character
    device driver may want to pass regs to the sysrq handler through the input
    layer which adds another few layers of parameter passing.

    I've built this code with allyesconfig for x86_64 and i386. I've runtested the
    main part of the code on FRV and i386, though I can't test most of the drivers.
    I've also done partial conversion for powerpc and MIPS - these at least compile
    with minimal configurations.

    This will affect all archs. Mostly the changes should be relatively easy.
    Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

    struct pt_regs *old_regs = set_irq_regs(regs);

    And put the old one back at the end:

    set_irq_regs(old_regs);

    Don't pass regs through to generic_handle_irq() or __do_IRQ().

    In timer_interrupt(), this sort of change will be necessary:

    - update_process_times(user_mode(regs));
    - profile_tick(CPU_PROFILING, regs);
    + update_process_times(user_mode(get_irq_regs()));
    + profile_tick(CPU_PROFILING);

    I'd like to move update_process_times()'s use of get_irq_regs() into itself,
    except that i386, alone of the archs, uses something other than user_mode().

    Some notes on the interrupt handling in the drivers:

    (*) input_dev() is now gone entirely. The regs pointer is no longer stored in
    the input_dev struct.

    (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
    something different depending on whether it's been supplied with a regs
    pointer or not.

    (*) Various IRQ handler function pointers have been moved to type
    irq_handler_t.

    Signed-Off-By: David Howells
    (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)

    David Howells
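
The save/restore pattern described in this commit can be modelled in user space. The sketch below is a single-CPU approximation, not the kernel implementation: the one static pointer stands in for the per-CPU variable, and `struct pt_regs`, `demo_handler()` and `do_IRQ_model()` are hypothetical stand-ins introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the architecture's saved register frame (illustrative). */
struct pt_regs {
    unsigned long ip;
};

/* Single-CPU model of the per-CPU "current IRQ regs" pointer. */
static struct pt_regs *irq_regs_ptr;

static struct pt_regs *get_irq_regs(void)
{
    return irq_regs_ptr;
}

/* Install a new regs pointer and hand back the previous one. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
    struct pt_regs *old_regs = irq_regs_ptr;

    irq_regs_ptr = new_regs;
    return old_regs;
}

static unsigned long last_seen_ip;

/* Hypothetical handler: takes no regs argument and looks them up on demand. */
static void demo_handler(void)
{
    struct pt_regs *regs = get_irq_regs();

    assert(regs != NULL);
    last_seen_ip = regs->ip;
}

/* Hypothetical entry point following the store/restore steps in the message. */
static void do_IRQ_model(struct pt_regs *regs, void (*handler)(void))
{
    struct pt_regs *old_regs = set_irq_regs(regs);

    handler();                 /* no regs passed down the call chain */
    set_irq_regs(old_regs);    /* restore, so nested interrupts unwind correctly */
}
```

Returning the old pointer from `set_irq_regs()` is what lets a nested interrupt restore the outer frame when it finishes.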
     
  • Typedef the IRQ handler function type.

    Signed-Off-By: David Howells
    (cherry picked from 1356d1e5fd256997e3d3dce0777ab787d0515c7a commit)

    David Howells
     
  • Typedef the IRQ flow handler function type.

    Signed-Off-By: David Howells
    (cherry picked from 8e973fbdf5716b93a0a8c0365be33a31ca0fa351 commit)

    David Howells
     

04 Oct, 2006

4 commits

  • Currently msi.c is doing sanity checks that make certain before an irq is
    destroyed it has no more users.

    By adding irq_has_action I can perform the test in a generic way, instead of
    relying on a msi specific data structure.

    By performing the core check in dynamic_irq_cleanup I ensure every user of
    dynamic irqs has a test present and we don't free resources that are in use.

    In msi.c this allows me to kill the attrib.state member of msi_desc and all of
    the associated code to maintain it.

    To keep from freeing data structures when irq cleanup code is called too soon,
    changing dynamic_irq_cleanup is insufficient, because there are msi specific
    data structures that are also not safe to free.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
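
As a rough illustration of the "refuse to clean up while there are users" check, here is a user-space sketch. The `demo_` types and functions are hypothetical; they only mirror the idea of testing for an installed action before releasing per-irq state and do not reproduce the kernel's signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an installed interrupt handler. */
struct demo_irqaction {
    const char *name;
};

/* Hypothetical, heavily reduced irq descriptor. */
struct demo_cleanup_desc {
    struct demo_irqaction *action;  /* NULL when no handler is installed */
    void *chip_data;                /* per-irq state released on cleanup */
};

/* Generic "does this irq still have users?" test. */
static bool demo_irq_has_action(const struct demo_cleanup_desc *desc)
{
    assert(desc != NULL);
    return desc->action != NULL;
}

/* Returns true if the descriptor was cleaned, false if it is still in use. */
static bool demo_dynamic_irq_cleanup(struct demo_cleanup_desc *desc)
{
    if (demo_irq_has_action(desc))
        return false;           /* still in use: do not free anything */

    desc->chip_data = NULL;
    return true;
}
```

Putting the test inside the cleanup routine means every caller gets the check, rather than each subsystem tracking usage in its own data structure.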
     
  • With the msi support comes a new concept in irq handling, irqs that are
    created dynamically at run time.

    Currently the msi code allocates irqs backwards. First it allocates a
    platform dependent routing value for an interrupt, the ``vector'', and then it
    figures out from the vector which irq you are on.

    This msi backwards allocator suffers from two basic problems. The allocator
    suffers because it is trying to do something that is architecture specific in
    a generic way, making it brittle, inflexible, and tied too tightly to the
    architecture implementation. The allocator also suffers from its very
    backwards nature, as it has tied things together that should have no
    dependencies.

    To solve the basic dynamic irq allocation problem two new architecture
    specific functions are added: create_irq and destroy_irq.

    create_irq takes no input and returns an unused irq number, that won't be
    reused until it is returned to the free pool with destroy_irq. The irq then
    can be used for any purpose although the only initial consumer is the msi
    code.

    destroy_irq takes an irq number allocated with create_irq and returns it to
    the free pool.

    Making this functionality per architecture increases the simplicity of the irq
    allocation code and increases its flexibility.

    dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
    irq_desc initialization that should happen for dynamic irqs.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
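
The contract described for create_irq and destroy_irq (hand out an unused number, never reuse it until it is returned) can be sketched with a simple in-use table. This is a minimal user-space model under assumed names; `DEMO_NR_IRQS` and the `demo_` functions are hypothetical, and a real architecture would also set up routing state.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_IRQS 8   /* hypothetical and tiny, so exhaustion is easy to hit */

static bool demo_irq_in_use[DEMO_NR_IRQS];

/* Returns an unused irq number, or -1 when the pool is exhausted. */
static int demo_create_irq(void)
{
    for (int irq = 0; irq < DEMO_NR_IRQS; irq++) {
        if (!demo_irq_in_use[irq]) {
            demo_irq_in_use[irq] = true;
            return irq;
        }
    }
    return -1;
}

/* Returns an irq obtained from demo_create_irq() to the free pool. */
static void demo_destroy_irq(int irq)
{
    assert(irq >= 0 && irq < DEMO_NR_IRQS);
    assert(demo_irq_in_use[irq]);
    demo_irq_in_use[irq] = false;
}
```

The caller never sees a vector here; it only deals in irq numbers, which is the decoupling the message argues for.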
     
  • Currently move_native_irq disables and re-enables the irq we are migrating to
    ensure we don't take that irq when we are actually doing the migration
    operation. Disabling the irq needs to happen but sometimes doing the work in
    move_native_irq is too late.

    On x86 with ioapics the irq move sequence needs to be:

    edge_triggered:
      mask irq.
      move irq.
      unmask irq.
      ack irq.
    level_triggered:
      mask irq.
      ack irq.
      move irq.
      unmask irq.

    We can easily perform the edge triggered sequence with the current definition
    of move_native_irq. However, the level triggered case does not map well. For
    that I have added move_masked_irq, to allow me to disable the irqs around both
    the ack and the move.

    Q: Why have we not seen this problem earlier?

    A: The only symptom I have been able to reproduce is that if we change
    the vector before acknowledging an irq the wrong irq is acknowledged.
    Since we currently are not reprogramming the irq vector during
    migration no problems show up.

    We have to mask the irq before we acknowledge the irq or else we could
    hit a window where an irq is asserted just before we acknowledge it.

    Edge triggered irqs do not have this problem because acknowledgements
    do not propagate in the same way.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
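
The two orderings listed in this commit differ only in where the ack falls relative to the move. The sketch below records the order of operations for each trigger type; the enum names and `demo_move_sequence()` are hypothetical, and nothing here touches real interrupt hardware.

```c
#include <assert.h>
#include <stddef.h>

enum demo_op { OP_MASK, OP_ACK, OP_MOVE, OP_UNMASK };
enum demo_trigger { TRIGGER_EDGE, TRIGGER_LEVEL };

#define DEMO_SEQ_LEN 4

/* Fill `out` with the order of operations for migrating an irq. */
static void demo_move_sequence(enum demo_trigger trig,
                               enum demo_op out[DEMO_SEQ_LEN])
{
    size_t n = 0;

    out[n++] = OP_MASK;             /* always mask first */
    if (trig == TRIGGER_LEVEL)
        out[n++] = OP_ACK;          /* level: ack before the vector changes */
    out[n++] = OP_MOVE;
    out[n++] = OP_UNMASK;
    if (trig == TRIGGER_EDGE)
        out[n++] = OP_ACK;          /* edge: ack after the move completes */
    assert(n == DEMO_SEQ_LEN);
}
```

In the level-triggered case both the ack and the move sit between mask and unmask, which is the window move_masked_irq is described as covering.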
     
  • The primary aim of this patchset is to remove maintenance problems caused by
    the irq infrastructure. The two big issues I address are an artificially
    small cap on the number of irqs, and that MSI assumes vector == irq. My
    primary focus is on x86_64 but I have touched other architectures where
    necessary to keep them from breaking.

    - To increase the number of irqs I modify the code to look at the (cpu,
    vector) pair instead of just looking at the vector.

    With a large number of irqs available, systems with a large irq count no
    longer need to compress their irq numbers to fit, which removes a lot of
    brittle special cases.

    For acpi guys the result is that irq == gsi.

    - Addressing the fact that MSI assumes irq == vector takes a few more
    patches. But suffice it to say when I am done none of the generic irq code
    even knows what a vector is.

    In quick testing on a large Unisys x86_64 machine we stumbled over at least
    one driver that assumed that NR_IRQS could always fit into an 8 bit number.
    This driver is clearly buggy today. But this has become a class of bugs that
    it is now much easier to hit.

    This patch:

    This is a minor space optimization. In practice I don't think this has any
    effect because of our alignment constraints and the other fields, but there is
    no point in chewing up an unnecessary word, and since we already read the flag
    field this should improve the cache hit ratio of the irq handler.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
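
To see why keying on the (cpu, vector) pair raises the irq limit, consider a per-CPU lookup table: the same vector number can name a different irq on each CPU. The sketch below is an assumption-laden model with hypothetical names and sizes, not the x86_64 code from this series.

```c
#include <assert.h>

#define DEMO_NR_CPUS    2
#define DEMO_NR_VECTORS 4
#define DEMO_NO_IRQ     (-1)

/* One vector-to-irq table per CPU (hypothetical layout). */
static int demo_vector_irq[DEMO_NR_CPUS][DEMO_NR_VECTORS];

static void demo_vector_init(void)
{
    for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        for (int vector = 0; vector < DEMO_NR_VECTORS; vector++)
            demo_vector_irq[cpu][vector] = DEMO_NO_IRQ;
}

/* Route `irq` to `vector` on `cpu`; the slot must be free. */
static void demo_bind_irq(int irq, int cpu, int vector)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    assert(vector >= 0 && vector < DEMO_NR_VECTORS);
    assert(demo_vector_irq[cpu][vector] == DEMO_NO_IRQ);
    demo_vector_irq[cpu][vector] = irq;
}

/* Which irq does an interrupt on (cpu, vector) belong to? */
static int demo_irq_for(int cpu, int vector)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    assert(vector >= 0 && vector < DEMO_NR_VECTORS);
    return demo_vector_irq[cpu][vector];
}
```

With this layout the number of distinct irqs is bounded by CPUs times vectors rather than by the vector count alone.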
     

30 Sep, 2006

2 commits


26 Sep, 2006

1 commit


19 Sep, 2006

1 commit


17 Sep, 2006

1 commit

  • Fix a bug where the IRQ_PENDING flag is never cleared and the ISR is called
    endlessly without an actual interrupt.

    Signed-off-by: Imre Deak
    Acked-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Imre Deak
     

02 Sep, 2006

1 commit


01 Aug, 2006

1 commit

  • IRQs need refcounting and a state flag to track whether the IRQ should
    be enabled or disabled as a "normal IRQ" source after a series of calls to
    {en,dis}able_irq(). For shared IRQs, the IRQ must be enabled so long as at
    least one driver needs it active.

    Likewise, IRQs need the same support to track whether the IRQ should be
    enabled or disabled as a "wakeup event" source after a series of calls to
    {en,dis}able_irq_wake(). For shared IRQs, the IRQ must be enabled as a
    wakeup source during sleep so long as at least one driver needs it. But
    right now they _don't have_ that refcounting ... which means sharing a
    wakeup-capable IRQ can't work correctly in some configurations.

    This patch adds the refcount and flag mechanisms to set_irq_wake() -- which
    is what {en,dis}able_irq_wake() call -- and minimal documentation of what
    the irq wake mechanism does.

    Drivers relying on the older (broken) "toggle" semantics will trigger a
    warning; that'll be a handful of drivers on ARM systems.

    Signed-off-by: David Brownell
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Brownell
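
The refcount-plus-flag behaviour can be sketched as follows. This is a simplified user-space model, assuming a depth counter and a counter of warnings for unbalanced disables; `struct demo_wake_desc` and `demo_set_irq_wake()` are hypothetical names and the real function's signature and return value are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-irq wakeup bookkeeping. */
struct demo_wake_desc {
    unsigned int wake_depth;   /* number of drivers wanting wakeup */
    bool wakeup_enabled;       /* state flag: is the irq a wakeup source? */
    unsigned int warnings;     /* unbalanced disable calls seen */
};

static void demo_set_irq_wake(struct demo_wake_desc *desc, bool on)
{
    assert(desc != NULL);

    if (on) {
        /* First user switches the wakeup source on. */
        if (desc->wake_depth++ == 0)
            desc->wakeup_enabled = true;
    } else if (desc->wake_depth == 0) {
        /* Old "toggle" style callers end up here. */
        desc->warnings++;
    } else if (--desc->wake_depth == 0) {
        /* Last user switches the wakeup source off. */
        desc->wakeup_enabled = false;
    }
}
```

With two drivers sharing the line, one disable leaves the wakeup source armed, which is the case the message says could not work before.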
     

04 Jul, 2006

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
    powerpc: add defconfig for Freescale MPC8349E-mITX board
    powerpc: Add base support for the Freescale MPC8349E-mITX eval board
    Documentation: correct values in MPC8548E SEC example node
    [POWERPC] Actually copy over i8259.c to arch/ppc/syslib this time
    [POWERPC] Add new interrupt mapping core and change platforms to use it
    [POWERPC] Copy i8259 code back to arch/ppc
    [POWERPC] New device-tree interrupt parsing code
    [POWERPC] Use the genirq framework
    [PATCH] genirq: Allow fasteoi handler to retrigger disabled interrupts
    [POWERPC] Update the SWIM3 (powermac) floppy driver
    [POWERPC] Fix error handling in detecting legacy serial ports
    [POWERPC] Fix booting on Momentum "Apache" board (a Maple derivative)
    [POWERPC] Fix various offb and BootX-related issues
    [POWERPC] Add a default config for 32-bit CHRP machines
    [POWERPC] fix implicit declaration on cell.
    [POWERPC] change get_property to return void *

    Linus Torvalds
     
  • Make use of local_irq_enable_in_hardirq() API to annotate places that enable
    hardirqs in hardirq context.

    Has no effect on non-lockdep kernels.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Teach special (recursive) locking code to the lock validator. Has no effect
    on non-lockdep kernels.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Do 'make oldconfig' and accept all the defaults for new config options -
    reboot into the kernel and if everything goes well it should boot up fine and
    you should have /proc/lockdep and /proc/lockdep_stats files.

    Typically if the lock validator finds some problem it will print out
    voluminous debug output that begins with "BUG: ..."; this syslog output
    can be used by kernel developers to figure out the precise locking scenario.

    What does the lock validator do? It "observes" and maps all locking rules as
    they occur dynamically (as triggered by the kernel's natural use of spinlocks,
    rwlocks, mutexes and rwsems). Whenever the lock validator subsystem detects a
    new locking scenario, it validates this new rule against the existing set of
    rules. If this new rule is consistent with the existing set of rules then the
    new rule is added transparently and the kernel continues as normal. If the
    new rule could create a deadlock scenario then this condition is printed out.

    When determining validity of locking, all possible "deadlock scenarios" are
    considered: assuming an arbitrary number of CPUs, arbitrary irq context and
    task context constellations, running arbitrary combinations of all the
    existing locking scenarios. In a typical system this means millions of
    separate scenarios. This is why we call it a "locking correctness" validator:
    for all rules that are observed the lock validator proves with mathematical
    certainty that a deadlock could not occur (assuming that the lock validator
    implementation itself is correct and its internal data structures are not
    corrupted by some other kernel subsystem). [see more details and conditionals
    of this statement in include/linux/lockdep.h and
    Documentation/lockdep-design.txt]

    Furthermore, this "all possible scenarios" property of the validator also
    enables the finding of complex, highly unlikely multi-CPU multi-context races
    via single-context rules, increasing the likelihood of finding bugs
    drastically. In practical terms: the lock validator already found a bug in
    the upstream kernel that could only occur on systems with 3 or more CPUs, and
    which needed 3 very unlikely code sequences to occur at once on the 3 CPUs.
    That bug was found and reported on a single-CPU system (!). So in essence a
    race will be found piecemeal, triggering all the necessary components
    for the race, without having to reproduce the race scenario itself! In its
    short existence the lock validator found and reported many bugs before they
    actually caused a real deadlock.

    To further increase the efficiency of the validator, the mapping is not per
    "lock instance", but per "lock-class". For example, all struct inode objects
    in the kernel have inode->inotify_mutex. If there are 10,000 inodes cached,
    then there are 10,000 lock objects. But ->inotify_mutex is a single "lock
    type", and all locking activities that occur against ->inotify_mutex are
    "unified" into this single lock-class. The advantage of the lock-class
    approach is that all historical ->inotify_mutex uses are mapped into a single
    (and as narrow as possible) set of locking rules - regardless of how many
    different tasks or inode structures it took to build this set of rules. The
    set of rules persists during the lifetime of the kernel.

    To see the rough magnitude of checking that the lock validator does, here's a
    portion of /proc/lockdep_stats, fresh after bootup:

    lock-classes: 694 [max: 2048]
    direct dependencies: 1598 [max: 8192]
    indirect dependencies: 17896
    all direct dependencies: 16206
    dependency chains: 1910 [max: 8192]
    in-hardirq chains: 17
    in-softirq chains: 105
    in-process chains: 1065
    stack-trace entries: 38761 [max: 131072]
    combined max dependencies: 2033928
    hardirq-safe locks: 24
    hardirq-unsafe locks: 176
    softirq-safe locks: 53
    softirq-unsafe locks: 137
    irq-safe locks: 59
    irq-unsafe locks: 176

    The lock validator has observed 1598 actual single-thread locking patterns,
    and has validated all possible 2033928 distinct locking scenarios.

    More details about the design of the lock validator can be found in
    Documentation/lockdep-design.txt, which can also be found at:

    http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt

    [bunk@stusta.de: cleanups]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
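
The core idea of validating each new rule against the existing set can be shown with a tiny dependency graph over lock classes. The sketch below covers only lock-ordering cycles and ignores irq contexts; the fixed-size matrix and all `demo_` names are hypothetical and far simpler than the validator described above.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_CLASSES 8   /* hypothetical limit on lock classes */

/* demo_dep[a][b] is true once "b acquired while holding a" was observed. */
static bool demo_dep[DEMO_MAX_CLASSES][DEMO_MAX_CLASSES];

/* Depth-first search: is `to` reachable from `from` via recorded rules? */
static bool demo_reachable(int from, int to, bool visited[DEMO_MAX_CLASSES])
{
    if (from == to)
        return true;
    visited[from] = true;
    for (int next = 0; next < DEMO_MAX_CLASSES; next++) {
        if (demo_dep[from][next] && !visited[next] &&
            demo_reachable(next, to, visited))
            return true;
    }
    return false;
}

/*
 * Record the rule "acquired taken while held". Returns false, without
 * recording it, if the rule would close a cycle (a potential deadlock).
 * Taking the same class twice is also rejected in this simplified model.
 */
static bool demo_add_dependency(int held, int acquired)
{
    bool visited[DEMO_MAX_CLASSES] = { false };

    assert(held >= 0 && held < DEMO_MAX_CLASSES);
    assert(acquired >= 0 && acquired < DEMO_MAX_CLASSES);

    if (demo_reachable(acquired, held, visited))
        return false;
    demo_dep[held][acquired] = true;
    return true;
}
```

Note that the three orderings A then B, B then C, and C then A never have to run at the same time: observing each one once, even on a single CPU, is enough for the third to be rejected.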
     

03 Jul, 2006

5 commits

  • Make the fasteoi handler mark disabled interrupts as pending if they
    happen anyway. This allows implementation of a delayed disable scheme
    with the fasteoi handler.

    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Signed-off-by: Paul Mackerras

    Benjamin Herrenschmidt
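
A delayed disable scheme can be pictured like this: disabling only sets a software flag, and an interrupt that still arrives is remembered instead of handled. The sketch is a user-space model; the replay on enable is an assumption added for illustration and is not stated in the commit message, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced descriptor for a fasteoi-style interrupt. */
struct demo_eoi_desc {
    bool disabled;         /* software disable flag, hardware left unmasked */
    bool pending;          /* an interrupt arrived while disabled */
    unsigned int handled;  /* how many times the handler actually ran */
};

/* Flow handler: remember the interrupt if the line is disabled. */
static void demo_handle_fasteoi(struct demo_eoi_desc *desc)
{
    assert(desc != NULL);
    if (desc->disabled) {
        desc->pending = true;
        return;
    }
    desc->handled++;
}

/* Assumed replay step: run the handler for an interrupt remembered earlier. */
static void demo_enable(struct demo_eoi_desc *desc)
{
    assert(desc != NULL);
    desc->disabled = false;
    if (desc->pending) {
        desc->pending = false;
        desc->handled++;
    }
}
```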
     
  • The irqflags consolidation converted SA_PERCPU_IRQ to IRQF_PERCPU but
    did not define the new constant.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Linus: "The hacks in kernel/irq/handle.c are really horrid. REALLY
    horrid."

    They are indeed. Move the dyntick quirks to ARM where they belong.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • * 'genirq' of master.kernel.org:/home/rmk/linux-2.6-arm: (24 commits)
    [ARM] 3683/2: ARM: Convert at91rm9200 to generic irq handling
    [ARM] 3682/2: ARM: Convert ixp4xx to generic irq handling
    [ARM] 3702/1: ARM: Convert ixp23xx to generic irq handling
    [ARM] 3701/1: ARM: Convert plat-omap to generic irq handling
    [ARM] 3700/1: ARM: Convert lh7a40x to generic irq handling
    [ARM] 3699/1: ARM: Convert s3c2410 to generic irq handling
    [ARM] 3698/1: ARM: Convert sa1100 to generic irq handling
    [ARM] 3697/1: ARM: Convert shark to generic irq handling
    [ARM] 3696/1: ARM: Convert clps711x to generic irq handling
    [ARM] 3694/1: ARM: Convert ecard driver to generic irq handling
    [ARM] 3693/1: ARM: Convert omap1 to generic irq handling
    [ARM] 3691/1: ARM: Convert imx to generic irq handling
    [ARM] 3688/1: ARM: Convert clps7500 to generic irq handling
    [ARM] 3687/1: ARM: Convert integrator to generic irq handling
    [ARM] 3685/1: ARM: Convert pxa to generic irq handling
    [ARM] 3684/1: ARM: Convert l7200 to generic irq handling
    [ARM] 3681/1: ARM: Convert ixp2000 to generic irq handling
    [ARM] 3680/1: ARM: Convert footbridge to generic irq handling
    [ARM] 3695/1: ARM drivers/pcmcia: Fixup includes
    [ARM] 3689/1: ARM drivers/input/touchscreen: Fixup includes
    ...

    Manual conflict resolved in kernel/irq/handle.c (butt-ugly ARM tickless
    code).

    Linus Torvalds
     
  • Signed-off-by: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

02 Jul, 2006

4 commits


01 Jul, 2006

1 commit


30 Jun, 2006

10 commits