04 Oct, 2006

40 commits

  • This patch (as751) adds a new type of notifier chain, based on the SRCU
    (Sleepable Read-Copy Update) primitives recently added to the kernel. An
    SRCU notifier chain is much like a blocking notifier chain, in that it must
    be called in process context and its callout routines are allowed to sleep.
    The difference is that the chain's links are protected by the SRCU
    mechanism rather than by an rw-semaphore, so calling the chain has
    extremely low overhead: no memory barriers and no cache-line bouncing. On
    the other hand, unregistering from the chain is expensive and the chain
    head requires special runtime initialization (plus cleanup if it is to be
    deallocated).

    SRCU notifiers are appropriate for notifiers that will be called very
    frequently and for which unregistration occurs very seldom. The proposed
    "task notifier" scheme qualifies, as may some of the network notifiers.

    Signed-off-by: Alan Stern
    Acked-by: Paul E. McKenney
    Acked-by: Chandra Seetharaman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     
  • Adds SRCU operations to rcutorture and updates rcutorture documentation.
    Also increases the stress imposed by the rcutorture test.

    [bunk@stusta.de: make needlessly global code static]
    Signed-off-by: Paul E. McKenney
    Cc: Paul E. McKenney
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • Updated patch adding a variant of RCU that permits sleeping in read-side
    critical sections. SRCU is as follows:

    o Each use of SRCU creates its own srcu_struct, and each
    srcu_struct has its own set of grace periods. This is
    critical, as it prevents one subsystem with a blocking
    reader from holding up SRCU grace periods for other
    subsystems.

    o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
    and synchronize_srcu()) all take a pointer to a srcu_struct.

    o The SRCU primitives must be called from process context.

    o srcu_read_lock() returns an int that must be passed to
    the matching srcu_read_unlock(). Realtime RCU avoids the
    need for this by storing the state in the task struct,
    but SRCU needs to allow a given code path to pass through
    multiple SRCU domains -- storing state in the task struct
    would therefore require either arbitrary space in the
    task struct or arbitrary limits on SRCU nesting. So I
    kicked the state-storage problem up to the caller.

    Of course, it is not permitted to call synchronize_srcu()
    while in an SRCU read-side critical section.

    o There is no call_srcu(). It would not be hard to implement
    one, but it seems like too easy a way to OOM the system.
    (Hey, we have enough trouble with call_rcu(), which does
    -not- permit readers to sleep!!!) So, if you want it,
    please tell me why...
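
    The index-passing convention in the list above can be illustrated with a
    toy, single-threaded user-space model. This is only a sketch, not the
    kernel implementation: `toy_srcu`, `toy_read_lock()`, `toy_read_unlock()`
    and `toy_try_synchronize()` are hypothetical names, there are no per-CPU
    counters or memory barriers, and the "synchronize" step merely reports
    whether a grace period could complete instead of sleeping.

```c
#include <stdbool.h>

/* Toy model of one SRCU domain: hypothetical, single-threaded, no barriers. */
struct toy_srcu {
    unsigned completed;   /* number of grace periods finished so far */
    int readers[2];       /* active readers, indexed by grace-period parity */
};

/* Enter a read-side section; the returned index must go to toy_read_unlock(). */
static int toy_read_lock(struct toy_srcu *sp)
{
    int idx = sp->completed & 1;

    sp->readers[idx]++;
    return idx;
}

/* Leave a read-side section using the index handed out by toy_read_lock(). */
static void toy_read_unlock(struct toy_srcu *sp, int idx)
{
    sp->readers[idx]--;
}

/*
 * A real synchronize_srcu() would sleep until the old readers drain.
 * This toy version returns false while readers of this domain are active.
 */
static bool toy_try_synchronize(struct toy_srcu *sp)
{
    int old = sp->completed & 1;

    if (sp->readers[old] != 0)
        return false;
    sp->completed++;
    return true;
}
```

    Because each domain carries its own counters, a reader holding domain A
    does not delay a grace period in domain B, and the caller keeps the index
    so that nothing has to be stored in the task struct.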

    [josht@us.ibm.com: sparse notation]
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • This moves the declarations for the architecture helpers into
    include/linux/htirq.h from the generic include/linux/pci.h. Hopefully this
    will make this distinction clearer.

    htirq.h is included where it is needed.

    The dependency on the msi code is fixed and removed.

    The Makefile is tidied up.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is just a few makefile tweaks and some file renames.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • It turns out msi_ops was simply not enough to abstract the architecture
    specific details of msi. So I have moved the responsibility of constructing
    the struct irq_chip to the architectures, and have two architecture specific
    functions arch_setup_msi_irq, and arch_teardown_msi_irq.

    For simple architectures those functions can do all of the work. For
    architectures with platform dependencies they can call into the appropriate
    platform code.

    With this msi.c is finally free of assuming you have an apic, and this
    actually takes less code.

    The helpers for the architecture specific code are declared in linux/msi.h
    to keep them separate from the msi functions used by drivers in linux/pci.h.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The logic works like this.

    Since we no longer track the state logic by hand in msi.c, startup and shutdown
    are no longer needed.

    By updating msi_set_mask_bit to work on msi devices that do not implement a
    mask bit we can always call the mask/unmask functions.

    What we really have are mask and unmask so we use them to implement the .mask
    and .unmask functions instead of .enable and .disable.

    By switching to the handle_edge_irq handler we only need an ack function that
    moves the irq if necessary, which removes the old end and ack functions and
    their peculiar logic of sometimes disabling an irq.

    This removes the reliance on pre genirq irq handling methods.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Currently msi.c is doing sanity checks that make certain before an irq is
    destroyed it has no more users.

    By adding irq_has_action I can perform the test in a generic way, instead of
    relying on a msi specific data structure.

    By performing the core check in dynamic_irq_cleanup I ensure every user of
    dynamic irqs has a test present and we don't free resources that are in use.

    In msi.c this allows me to kill the attrib.state member of msi_desc and all of
    the associated code to maintain it.

    To keep from freeing data structures when irq cleanup code is called too soon,
    changing dynamic_irq_cleanup is insufficient because there are msi specific
    data structures that are also not safe to free.
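
    The in-use check described above could look roughly like the following
    user-space sketch. All `toy_` names are hypothetical stand-ins, and
    returning an error code from the cleanup routine is an assumption made
    here for illustration; it is not claimed to match the kernel function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered interrupt handler. */
struct toy_irqaction {
    const char *name;
};

/* Hypothetical stand-in for an irq descriptor. */
struct toy_irq_desc {
    struct toy_irqaction *action;  /* non-NULL while a handler is registered */
    bool initialized;
};

/* Generic "does this irq still have users?" test. */
static bool toy_irq_has_action(const struct toy_irq_desc *desc)
{
    return desc->action != NULL;
}

/* Refuse to tear down a descriptor that still has a handler attached. */
static int toy_dynamic_irq_cleanup(struct toy_irq_desc *desc)
{
    if (toy_irq_has_action(desc))
        return -1;              /* still in use: leave everything intact */
    desc->initialized = false;
    return 0;
}
```

    Placing the test in the shared cleanup path means every caller gets it,
    rather than each user tracking its own state flag.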

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Greg KH
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch implements two functions ht_create_irq and ht_destroy_irq for
    use by drivers. Several other functions are implemented as helpers for
    arch specific irq_chip handlers.

    The driver for the card I tested this on isn't yet ready to be merged.
    However this code is, and hypertransport irqs are in use in a few other
    places in the kernel. Not that any of this will get merged before 2.6.19.

    Because the ipath-ht400 is slightly out of spec this code will need to be
    generalized to work there.

    I think all of the powerpc uses are for a plain interrupt controller in a
    chipset so support for native hypertransport devices is a little less
    interesting.

    However I think this is a half way decent model on how to separate arch
    specific and generic helper code, and I think this is a functional model of
    how to get the architecture dependencies out of the msi code.

    [akpm@osdl.org: Kconfig fix]
    Signed-off-by: Eric W. Biederman
    Cc: Greg KH
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This adds defines for the hypertransport capability subtypes and starts
    using them a little.

    [akpm@osdl.org: fix typo]
    Signed-off-by: Eric W. Biederman
    Acked-by: Benjamin Herrenschmidt
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • With more irqs in the system we don't need this.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • After raising the number of irqs the system supports this function is no
    longer necessary.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This refactors the irq handling code to make the vectors a per cpu resource so
    the same vector number can be simultaneously used on multiple cpus for
    different irqs.

    This should make systems that were hitting limits on the total number of irqs
    much more livable.
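
    As a rough illustration of vectors being a per cpu resource, the sketch
    below keeps a separate vector-to-irq table for each cpu. The table name,
    its sizes and the helper functions are all hypothetical; the kernel's
    actual data structures are not shown in this log.

```c
#define TOY_NR_CPUS     4      /* hypothetical, small for illustration */
#define TOY_NR_VECTORS  256

/* toy_vector_irq[cpu][vector] holds the irq routed there, or -1 if free. */
static int toy_vector_irq[TOY_NR_CPUS][TOY_NR_VECTORS];

/* Mark every (cpu, vector) slot as free. */
static void toy_vector_init(void)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        for (int v = 0; v < TOY_NR_VECTORS; v++)
            toy_vector_irq[cpu][v] = -1;
}

/* Bind irq to (cpu, vector); returns 0 on success, -1 if invalid or taken. */
static int toy_bind(int cpu, int vector, int irq)
{
    if (cpu < 0 || cpu >= TOY_NR_CPUS)
        return -1;
    if (vector < 0 || vector >= TOY_NR_VECTORS)
        return -1;
    if (toy_vector_irq[cpu][vector] != -1)
        return -1;
    toy_vector_irq[cpu][vector] = irq;
    return 0;
}

/* Look up which irq a given (cpu, vector) pair delivers. */
static int toy_lookup(int cpu, int vector)
{
    return toy_vector_irq[cpu][vector];
}
```

    With this layout the same vector number can map to different irqs on
    different cpus, while a second binding on the same cpu is rejected.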

    [akpm@osdl.org: build fix]
    [akpm@osdl.org: __target_IO_APIC_irq is unneeded on UP]
    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is a small pessimization but it paves the way for making this information
    per cpu, which allows the maximum number of IRQS to become NR_CPUS*224.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch adds support for systems that cannot receive every interrupt on a
    single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS.

    MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardware
    generated interrupts per cpu.

    On architectures that support per cpu interrupt delivery this can be a
    significant space savings and scalability bonus.
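
    A minimal sketch of such a check, assuming hypothetical `TOY_` constants
    (the values below are placeholders chosen for illustration, not the
    kernel's), pairs a build-time test with an equivalent run-time helper:

```c
#define TOY_HARDIRQ_BITS          8     /* hypothetical counter width */
#define TOY_MAX_HARDIRQS_PER_CPU  224   /* hypothetical per-cpu interrupt count */

/* Build-time check: the counter must be wide enough for one cpu's interrupts. */
#if (1 << TOY_HARDIRQ_BITS) < TOY_MAX_HARDIRQS_PER_CPU
#error "TOY_HARDIRQ_BITS is too low"
#endif

/* The same condition as a function, so other values can be tried. */
static int toy_hardirq_bits_ok(int bits, int max_per_cpu)
{
    return (1 << bits) >= max_per_cpu;
}
```

    Sizing the check by the per cpu maximum rather than the system-wide irq
    count is what lets the bit width stay small as the irq count grows.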

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with
    set_irq_info and set_native_irq_info, with move_irq and move_native_irq. Both
    functions did the same thing but they were built and called under different
    circumstances. Now that the msi hacks are gone we can kill move_irq and
    set_irq_info.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch removes the change in behavior of the irq allocation code when
    CONFIG_PCI_MSI is defined, removing all instances of the assumption that irq
    == vector.

    create_irq is rewritten to first allocate a free irq and then to assign that
    irq a vector.

    assign_irq_vector is made static and the AUTO_ASSIGN case which allocates a
    vector not bound to an irq is removed.

    The ioapic vector methods are removed, and everything now works with irqs.

    The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI.

    [akpm@osdl.org: cleanup]
    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch removes the change in behavior of the irq allocation code when
    CONFIG_PCI_MSI is defined, removing all instances of the assumption that irq
    == vector.

    create_irq is rewritten to first allocate a free irq and then to assign that
    irq a vector.

    assign_irq_vector is made static and the AUTO_ASSIGN case which allocates a
    vector not bound to an irq is removed.

    The ioapic vector methods are removed, and everything now works with irqs.

    The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • After the previous changes ia64 is the only architecture using msi-apic.c.

    [akpm@osdl.org: unbreak MSI on ia64]
    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This removes the hardcoded assumption that irq == vector in the msi
    composition code, and it allows the msi message composition to set up logical
    mode, or lowest priority delivery mode as we do for other apic interrupts,
    and with the same selection criteria.

    Basically this moves the problem of what is in the msi message into the
    architecture irq management code where it belongs, not in a generic layer
    that doesn't have enough information to compose msi messages properly.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This removes the hardcoded assumption that irq == vector in the msi
    composition code, and it allows the msi message composition to set up logical
    mode, or lowest priority delivery mode as we do for other apic interrupts,
    and with the same selection criteria.

    Basically this moves the problem of what is in the msi message into the
    architecture irq management code where it belongs, not in a generic layer
    that doesn't have enough information to compose msi messages properly.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The msi code currently allocates irqs backwards. First it allocates a platform
    dependent routing value for an interrupt, the ``vector'', and then it figures
    out from the vector which irq you are on.

    For ia64 this is fine. For x86 and x86_64 this is complete nonsense and makes
    an enormous mess of the irq handling code and prevents some pretty
    significant cleanups in the code for handling large numbers of irqs.

    This patch refactors msi.c to work in terms of irqs and create_irq/destroy_irq
    for dynamically managing irqs.

    Hopefully this is finally a version of msi.c that is useful on more than just
    x86 derivatives.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The current implementation of create_irq() is a hack but it is the current
    hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
    this hack. Thus we are stuck with this hack of assuming irq == vector until the
    dependencies in the generic irq code are removed.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The current implementation of create_irq() is a hack but it is the current
    hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
    this hack. Thus we are stuck with this hack of assuming irq == vector until the
    dependencies in the generic msi code are removed.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • [akpm@osdl.org: build fix]
    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • With the msi support comes a new concept in irq handling, irqs that are
    created dynamically at run time.

    Currently the msi code allocates irqs backwards. First it allocates a
    platform dependent routing value for an interrupt, the ``vector'', and then it
    figures out from the vector which irq you are on.

    This msi backwards allocator suffers from two basic problems. The allocator
    suffers because it is trying to do something that is architecture specific in
    a generic way, making it brittle, inflexible, and tied too tightly to the
    architecture implementation. The allocator also suffers from its very
    backwards nature as it has tied things together that should have no
    dependencies.

    To solve the basic dynamic irq allocation problem two new architecture
    specific functions are added: create_irq and destroy_irq.

    create_irq takes no input and returns an unused irq number that won't be
    reused until it is returned to the free pool with destroy_irq. The irq then
    can be used for any purpose although the only initial consumer is the msi
    code.

    destroy_irq takes an irq number allocated with create_irq and returns it to
    the free pool.

    Making this functionality per architecture increases the simplicity of the irq
    allocation code and increases its flexibility.

    dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
    irq_desc initialization that should happen for dynamic irqs.
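
    The allocate-and-return contract described above can be sketched with a
    toy free pool. This is a user-space illustration only: the `toy_` names,
    the pool size and the first-fit search are assumptions, and the vector
    assignment and irq_desc setup that an architecture would perform are left
    out.

```c
#include <stdbool.h>

#define TOY_NR_IRQS 16          /* hypothetical, tiny for illustration */

/* true while an irq number is handed out and not yet returned. */
static bool toy_irq_in_use[TOY_NR_IRQS];

/* Return an unused irq number, or -1 if the pool is exhausted. */
static int toy_create_irq(void)
{
    for (int irq = 0; irq < TOY_NR_IRQS; irq++) {
        if (!toy_irq_in_use[irq]) {
            toy_irq_in_use[irq] = true;
            return irq;
        }
    }
    return -1;
}

/* Return an irq obtained from toy_create_irq() to the free pool. */
static void toy_destroy_irq(int irq)
{
    if (irq >= 0 && irq < TOY_NR_IRQS)
        toy_irq_in_use[irq] = false;
}
```

    A number handed out by the create call is never handed out again until the
    matching destroy call, which is the property the msi code relies on.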

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Currently we attempt to predict how many irqs we will be able to allocate with
    msi using pci_vector_resources and some complicated accounting, and then we
    only allow each device as many irqs as we think are available on average.

    Only the s2io driver even takes advantage of this feature; all other drivers
    have a fixed number of irqs they need and bail if they can't get them.

    pci_vector_resources is inaccurate if anyone ever frees an irq. The whole
    implementation is racy. The current irq limit policy does not appear to make
    sense with current drivers. So I have simplified things. We can revisit this
    when we need a more sophisticated policy.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The current msi_ops are short sighted in a number of ways; this patch attempts
    to fix the glaring deficiencies.

    - Report in msi_ops if a 64bit address is needed in the msi message, so we
    can fail 32bit only msi structures.

    - Send and receive a full struct msi_msg in both setup and target. This is
    a little cleaner and allows for architectures that need to modify the data
    to retarget the msi interrupt to a different cpu.

    - In target pass in the full cpu mask instead of just the first cpu in case
    we can make use of the full cpu mask.

    - Operate in terms of irqs and not vectors, currently there is still a 1-1
    relationship but on architectures other than ia64 I expect this will change.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • In support of this I also add a struct msi_msg that captures the two
    address fields and one data field in a typical msi message, and I remember the
    pos and whether the address is 64bit in struct msi_desc.

    This makes the code a little more readable and easier to maintain, and paves
    the way to further simplifications.
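
    A sketch of the shape such a message structure might take, and of why the
    64bit flag is worth remembering, follows. The `toy_` names are
    hypothetical, and the register offsets (data at capability offset 8 for a
    32bit address, 12 for a 64bit address) are stated here as an assumption
    about the PCI MSI capability layout rather than taken from this log.

```c
#include <stdint.h>

/* Two address words and one data word, as described in the text. */
struct toy_msi_msg {
    uint32_t address_lo;   /* low 32 bits of the message address */
    uint32_t address_hi;   /* high 32 bits, only used for 64bit capable devices */
    uint32_t data;         /* message data */
};

/* Hypothetical per-device bookkeeping: capability position and address width. */
struct toy_msi_desc {
    int pos;               /* offset of the MSI capability in config space */
    unsigned int is_64 : 1;
};

/*
 * Config-space offset of the data register. With a 64bit address the extra
 * upper-address word pushes the data register four bytes further out.
 */
static int toy_msi_data_offset(const struct toy_msi_desc *desc)
{
    return desc->pos + (desc->is_64 ? 12 : 8);
}
```

    Caching both values in the descriptor avoids re-reading the capability
    every time a message has to be written.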

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This allows the output of the msi tests to be stored directly in a bit field.
    If you don't do this, a value greater than one will be truncated and become 0,
    changing true to false with bizarre consequences.
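
    The truncation can be reproduced in a few lines of user-space C. The flag
    value and the `toy_` names below are hypothetical; the point is only that
    a masked test yields the mask value, not 1, so it must be normalized
    before being stored in a one-bit field.

```c
#define TOY_FLAG_64BIT 0x80u   /* hypothetical flag living above bit 0 */

struct toy_attrib {
    unsigned int is_64 : 1;    /* one-bit field: keeps only bit 0 of a value */
};

/* Raw test: a set flag yields 0x80, whose bit 0 is clear. */
static unsigned int toy_raw_test(unsigned int control)
{
    return control & TOY_FLAG_64BIT;
}

/* Normalized test: always 0 or 1, safe to store in a one-bit field. */
static unsigned int toy_bool_test(unsigned int control)
{
    return !!(control & TOY_FLAG_64BIT);
}
```

    Storing the raw result in `is_64` silently records "false" for a device
    that has the flag set; the normalized form records "true".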

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The problem. Because the disable routines leave the msi interrupts in all
    sorts of half enabled states the enable routines become impossible to
    implement correctly, and almost impossible to understand.

    Simplifying this allows me to simply kill the buggy reroute_msix_table, and
    generally makes the code more maintainable.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • In the latest changes the code for migrating x86_64 irqs was dropped. This
    re-adds it in a fashion that will work even if we change the vector on level
    triggered irqs when we migrate them.

    [akpm@osdl.org: build fix]
    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Currently move_native_irq disables and re-enables the irq we are migrating to
    ensure we don't take that irq when we are actually doing the migration
    operation. Disabling the irq needs to happen but sometimes doing the work in
    move_native_irq is too late.

    On x86 with ioapics the irq move sequence needs to be:

      edge_triggered:
        mask irq.
        move irq.
        unmask irq.
        ack irq.
      level_triggered:
        mask irq.
        ack irq.
        move irq.
        unmask irq.

    We can easily perform the edge triggered sequence with the current definition
    of move_native_irq. However the level triggered case does not map well. For
    that I have added move_masked_irq, to allow me to disable the irqs around both
    the ack and the move.

    Q: Why have we not seen this problem earlier?

    A: The only symptom I have been able to reproduce is that if we change
    the vector before acknowledging an irq the wrong irq is acknowledged.
    Since we currently are not reprogramming the irq vector during
    migration no problems show up.

    We have to mask the irq before we acknowledge the irq or else we could
    hit a window where an irq is asserted just before we acknowledge it.

    Edge triggered irqs do not have this problem because acknowledgements
    do not propagate in the same way.
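
    The symptom described in the answer can be modeled with a toy for the
    level triggered case. Everything here is hypothetical (the `toy_` names,
    the structure fields, and the idea that an ack simply records the
    currently programmed vector); it only shows why the ack has to come
    before the move once migration reprograms the vector.

```c
/* Hypothetical model of one level triggered irq being migrated. */
struct toy_irq {
    int vector;        /* vector currently programmed */
    int new_vector;    /* vector to program during migration */
    int masked;        /* nonzero while the irq is masked */
    int acked_vector;  /* vector the most recent ack was issued for */
};

static void toy_mask(struct toy_irq *i)   { i->masked = 1; }
static void toy_unmask(struct toy_irq *i) { i->masked = 0; }
static void toy_ack(struct toy_irq *i)    { i->acked_vector = i->vector; }
static void toy_move(struct toy_irq *i)   { i->vector = i->new_vector; }

/* Order given in the text for level triggered irqs: mask, ack, move, unmask. */
static void toy_level_migrate(struct toy_irq *i)
{
    toy_mask(i);
    toy_ack(i);
    toy_move(i);
    toy_unmask(i);
}

/* Wrong order for comparison: moving first makes the ack hit the new vector. */
static void toy_level_migrate_move_first(struct toy_irq *i)
{
    toy_mask(i);
    toy_move(i);
    toy_ack(i);
    toy_unmask(i);
}
```

    In the first function the ack is issued for the vector that actually
    fired; in the second it is issued for a vector that never did.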

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The primary aim of this patchset is to remove maintenance problems caused by
    the irq infrastructure. The two big issues I address are an artificially
    small cap on the number of irqs, and that MSI assumes vector == irq. My
    primary focus is on x86_64 but I have touched other architectures where
    necessary to keep them from breaking.

    - To increase the number of irqs I modify the code to look at the (cpu,
    vector) pair instead of just looking at the vector.

    With a large number of irqs available, systems with a large irq count no
    longer need to compress their irq numbers to fit, removing a lot of brittle
    special cases.

    For acpi guys the result is that irq == gsi.

    - Addressing the fact that MSI assumes irq == vector takes a few more
    patches. But suffice it to say when I am done none of the generic irq code
    even knows what a vector is.

    In quick testing on a large Unisys x86_64 machine we stumbled over at least
    one driver that assumed that NR_IRQS could always fit into an 8 bit number.
    This driver is clearly buggy today. But this has become a class of bugs that
    it is now much easier to hit.
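
    The class of bug mentioned above is easy to demonstrate in isolation. The
    helper below is hypothetical and not taken from any driver; it simply
    stores an irq number in an 8 bit field the way such a driver might.

```c
#include <stdint.h>

/* Hypothetical driver helper that keeps an irq number in a single byte. */
static uint8_t toy_store_irq_u8(unsigned int irq)
{
    return (uint8_t)irq;   /* silently drops everything above bit 7 */
}
```

    An irq number below 256 survives the round trip, while a larger one
    aliases a different, smaller number: 300 comes back as 44.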

    This patch:

    This is a minor space optimization. In practice I don't think this has any
    effect because of our alignment constraints and the other fields, but there is
    no point in chewing up an unnecessary word, and since we already read the flag
    field this should improve the cache hit ratio of the irq handler.

    Signed-off-by: Eric W. Biederman
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Rajesh Shah
    Cc: Andi Kleen
    Cc: "Protasevich, Natalie"
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch converts all the i386 PIC controllers (except VisWS and Voyager,
    which I could not test - but which should still work as old-style IRQ layers)
    to the new and simpler irq-chip interrupt handling layer.

    [akpm@osdl.org: build fix]
    [mingo@elte.hu: enable fasteoi handler for i386 level-triggered IO-APIC irqs]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • This patch converts all the x86_64 PIC controllers to the new and
    simpler irq-chip interrupt handling layer.

    [mingo@elte.hu: The patch also enables the fasteoi handler for x86_64]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: Benjamin Herrenschmidt
    Cc: Roland Dreier
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • drivers/video/riva/fbdev.c: In function `riva_get_EDID_OF':
    drivers/video/riva/fbdev.c:1846: warning: assignment discards qualifiers from pointer target type

    This code is being bad: copying a pointer to read-only OF data into a
    non-const pointer.
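
    One way to avoid this kind of warning is to carry the const qualifier
    through to the destination pointer, as in the sketch below. The `toy_`
    names and the placeholder data are hypothetical, and this is not claimed
    to be the change the patch actually makes.

```c
#include <stddef.h>

/* Placeholder bytes standing in for read-only firmware data. */
static const unsigned char toy_edid_blob[4] = { 0x00, 0xff, 0xff, 0xff };

/* Stand-in for a property lookup that returns a pointer to read-only data. */
static const unsigned char *toy_get_property(void)
{
    return toy_edid_blob;
}

/* Keeping the member const means the assignment discards no qualifier. */
struct toy_par {
    const unsigned char *edid;
};

/* Returns 1 if a property was found and recorded, 0 otherwise. */
static int toy_get_edid(struct toy_par *par)
{
    par->edid = toy_get_property();
    return par->edid != NULL;
}
```

    With a non-const `edid` member the same assignment would draw the
    "discards qualifiers" diagnostic quoted above.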

    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • eCryptfs is a stacked cryptographic filesystem for Linux. It is derived from
    Erez Zadok's Cryptfs, implemented through the FiST framework for generating
    stacked filesystems. eCryptfs extends Cryptfs to provide advanced key
    management and policy features. eCryptfs stores cryptographic metadata in the
    header of each file written, so that encrypted files can be copied between
    hosts; the file will be decryptable with the proper key, and there is no need
    to keep track of any additional information aside from what is already in the
    encrypted file itself.

    [akpm@osdl.org: updates for ongoing API changes]
    [bunk@stusta.de: cleanups]
    [akpm@osdl.org: alpha build fix]
    [akpm@osdl.org: cleanups]
    [tytso@mit.edu: inode-diet updates]
    [pbadari@us.ibm.com: generic_file_*_read/write() interface updates]
    [rdunlap@xenotime.net: printk format fixes]
    [akpm@osdl.org: make slab creation and teardown table-driven]
    Signed-off-by: Phillip Hellewell
    Signed-off-by: Michael Halcrow
    Signed-off-by: Erez Zadok
    Signed-off-by: Adrian Bunk
    Signed-off-by: Stephan Mueller
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Halcrow
     
  • make headers_check fails on linux/nfsd/const.h.

    Since linux/sunrpc/msg_prot.h does not seem to export anything interesting
    for userspace, this patch moves it in the __KERNEL__ protected section.

    Signed-off-by: Cedric Le Goater
    Cc: David Woodhouse
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     
  • Use all the pieces set up so far to implement referral support, allowing
    return of NFS4ERR_MOVED and fs_locations attribute.

    Signed-off-by: Manoj Naik
    Signed-off-by: Fred Isaman
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    J. Bruce Fields