30 Apr, 2012

1 commit

  • The switch from using irq_map to irq_alloc_desc*() for managing irq
    number allocations introduced new bugs in some of the powerpc
    interrupt code. Several functions rely on the value of NR_IRQS to
    determine the maximum irq number that could get allocated. However,
    with sparse_irq and using irq_alloc_desc*() the maximum possible irq
    number is now specified with 'nr_irqs' which may be a number larger
    than NR_IRQS. This has caused breakage on powermac when
    CONFIG_NR_IRQS is set to 32.

    This patch removes most of the direct references to NR_IRQS in the
    powerpc code and replaces them with either a nr_irqs reference or by
    using the common for_each_irq_desc() macro. The powerpc-specific
    for_each_irq() macro is removed at the same time.

    Also, the Cell axon_msi driver is refactored to remove the global
    build assumption on the size of NR_IRQS and instead add a limit to the
    maximum irq number when calling irq_domain_add_nomap().

    Signed-off-by: Grant Likely
    Signed-off-by: Benjamin Herrenschmidt

    Grant Likely
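
The NR_IRQS versus nr_irqs problem described in this commit can be illustrated with a small user-space sketch. It is a toy model, not the powerpc code: `TOY_NR_IRQS`, `toy_nr_irqs`, `toy_irq_alloc()` and both counting functions are hypothetical names introduced only for illustration, and the allocator is assumed to grow the runtime limit on demand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (assumptions): TOY_NR_IRQS plays the compile-time NR_IRQS,
 * toy_nr_irqs plays the runtime nr_irqs, which may grow beyond it. */
#define TOY_NR_IRQS  32
#define TOY_MAX_IRQS 512

static unsigned int toy_nr_irqs = TOY_NR_IRQS;
static bool toy_irq_allocated[TOY_MAX_IRQS];

/* Hypothetical allocator: returns the lowest free irq number at or above
 * 'from' and raises the runtime limit when the number exceeds it. */
static int toy_irq_alloc(unsigned int from)
{
    for (unsigned int irq = from; irq < TOY_MAX_IRQS; irq++) {
        if (!toy_irq_allocated[irq]) {
            toy_irq_allocated[irq] = true;
            if (irq >= toy_nr_irqs)
                toy_nr_irqs = irq + 1;
            assert(toy_nr_irqs <= TOY_MAX_IRQS);
            return (int)irq;
        }
    }
    return -1; /* no free number left */
}

/* Buggy pattern: the loop bound is the compile-time constant, so any irq
 * numbered TOY_NR_IRQS or higher is silently skipped. */
static unsigned int count_irqs_fixed_bound(void)
{
    unsigned int n = 0;
    for (unsigned int irq = 0; irq < TOY_NR_IRQS; irq++)
        if (toy_irq_allocated[irq])
            n++;
    return n;
}

/* Fixed pattern: the loop bound is the runtime limit. */
static unsigned int count_irqs_runtime_bound(void)
{
    unsigned int n = 0;
    for (unsigned int irq = 0; irq < toy_nr_irqs; irq++)
        if (toy_irq_allocated[irq])
            n++;
    return n;
}
```

With one irq allocated below the compile-time limit and one above it, the fixed-bound loop sees only the first, while the runtime-bound loop sees both.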
     

16 Feb, 2012

2 commits

  • Each revmap type has different arguments for setting up the revmap.
    This patch splits up the generator functions so that each revmap type
    can do its own setup and the user doesn't need to keep track of how
    each revmap type handles the arguments.

    This patch also adds a host_data argument to the generators. There are
    cases where the host_data pointer is needed before the function returns.
    For example, the legacy map calls the .map callback for each irq before
    returning.

    v2: - Add void *host_data argument to irq_domain_add_*() functions
        - Fixed failure to compile
        - Moved IRQ_DOMAIN_MAP_* defines into irqdomain.c

    Signed-off-by: Grant Likely
    Cc: Rob Herring
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Milton Miller
    Tested-by: Olof Johansson

    Grant Likely
     
  • There is only one user, and it is trivial to open-code.

    Signed-off-by: Grant Likely
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Milton Miller
    Tested-by: Olof Johansson

    Grant Likely
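
The reason the generators take host_data up front can be sketched in user-space C. This is a simplified model under stated assumptions: the `toy_domain` types and `toy_domain_add_*()` functions are hypothetical and only mimic the shape of one generator per revmap type; they are not the kernel's irq_domain API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_domain;

/* Hypothetical ops table with only the callback needed for the example. */
struct toy_domain_ops {
    int (*map)(struct toy_domain *d, unsigned int virq, unsigned int hwirq);
};

enum toy_revmap { TOY_REVMAP_LEGACY, TOY_REVMAP_LINEAR };

struct toy_domain {
    enum toy_revmap type;
    const struct toy_domain_ops *ops;
    void *host_data;
    unsigned int size;
};

/* Common allocation shared by the per-type generators. */
static struct toy_domain *toy_domain_alloc(enum toy_revmap type,
                                           const struct toy_domain_ops *ops,
                                           void *host_data)
{
    assert(ops != NULL);
    struct toy_domain *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->type = type;
    d->ops = ops;
    d->host_data = host_data;
    return d;
}

/* Linear generator: only needs a size; nothing is mapped at creation. */
static struct toy_domain *toy_domain_add_linear(unsigned int size,
                                                const struct toy_domain_ops *ops,
                                                void *host_data)
{
    struct toy_domain *d = toy_domain_alloc(TOY_REVMAP_LINEAR, ops, host_data);
    if (d)
        d->size = size;
    return d;
}

/* Legacy generator: maps every irq before returning, so host_data must
 * already be in place when .map runs. */
static struct toy_domain *toy_domain_add_legacy(unsigned int size,
                                                unsigned int first_irq,
                                                unsigned int first_hwirq,
                                                const struct toy_domain_ops *ops,
                                                void *host_data)
{
    struct toy_domain *d = toy_domain_alloc(TOY_REVMAP_LEGACY, ops, host_data);
    if (!d)
        return NULL;
    d->size = size;
    for (unsigned int i = 0; i < size; i++)
        if (ops->map)
            ops->map(d, first_irq + i, first_hwirq + i);
    return d;
}

/* Example .map callback that dereferences host_data to count its calls. */
static int toy_count_map(struct toy_domain *d, unsigned int virq,
                         unsigned int hwirq)
{
    unsigned int *count = d->host_data;
    (void)virq;
    (void)hwirq;
    (*count)++;
    return 0;
}

static const struct toy_domain_ops toy_count_ops = { .map = toy_count_map };
```

If host_data were assigned by the caller only after the generator returned, the legacy case would invoke `toy_count_map()` with a NULL pointer.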
     

15 Feb, 2012

1 commit

  • This patch drops the powerpc-specific irq_host structures and uses the common
    irq_domain structures defined in linux/irqdomain.h. It also fixes all
    the users to use the new structure names.

    Renaming irq_host to irq_domain has been discussed for a long time, and this
    patch is a step in the process of generalizing the powerpc virq code to be
    usable by all architectures.

    An astute reader will notice that this patch actually removes the irq_host
    structure instead of renaming it. This is because the irq_domain structure
    already exists in include/linux/irqdomain.h and has the needed data members.

    Signed-off-by: Grant Likely
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Milton Miller
    Tested-by: Olof Johansson

    Grant Likely
     

08 Dec, 2011

1 commit

  • I have an intermittent kdump fail where the hypervisor fails an H_EOI.
    As a result our CPPR is never reset to 0xff and we no longer accept
    interrupts.

    This patch calls icp_hv_set_cppr to reset the CPPR if H_EOI fails,
    fixing the kdump fail.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
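
The effect of this fix can be modelled in a few lines of user-space C. The sketch assumes that the top byte of the value passed to the EOI call carries the CPPR to restore (consistent with the `value=ff000306` seen in the related kdump report); `struct toy_icp` and the `toy_*` functions are hypothetical stand-ins, not the icp-hv code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_H_SUCCESS 0L
#define TOY_H_FAILURE (-4L) /* arbitrary non-zero code for the model */

/* Toy interrupt presentation state for one cpu. */
struct toy_icp {
    uint8_t cppr;   /* current processor priority */
    bool eoi_fails; /* make the modelled hypervisor call fail */
};

/* Model of the EOI hypercall: on success the priority is restored from the
 * top byte of the value; on failure nothing changes. */
static long toy_h_eoi(struct toy_icp *icp, uint32_t xirr)
{
    if (icp->eoi_fails)
        return TOY_H_FAILURE;
    icp->cppr = (uint8_t)(xirr >> 24);
    return TOY_H_SUCCESS;
}

/* Stand-in for setting the CPPR directly. */
static void toy_set_cppr(struct toy_icp *icp, uint8_t cppr)
{
    icp->cppr = cppr;
}

/* EOI with the fallback described in the commit: if the call fails, reset
 * the CPPR explicitly so interrupts are accepted again. */
static void toy_eoi(struct toy_icp *icp, uint32_t xirr)
{
    if (toy_h_eoi(icp, xirr) != TOY_H_SUCCESS)
        toy_set_cppr(icp, (uint8_t)(xirr >> 24));
    assert(icp->cppr == (uint8_t)(xirr >> 24));
}
```

Calling `toy_h_eoi()` alone on a failing controller leaves the priority stuck at its old value, which is the state in which no further interrupts are accepted.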
     

28 Nov, 2011

1 commit

  • During kdump stress testing I sometimes see the kdump kernel panic
    with:

    Interrupt 0x306 (real) is invalid, disabling it.
    Kernel panic - not syncing: bad return code EOI - rc = -4, value=ff000306

    Instead of panicking, print the error message, dump the stack the first
    time it happens and continue on. Add some more information to the
    debug messages as well.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
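
The "report every time, dump the stack only once, then continue" behaviour can be sketched as follows. This is an illustrative user-space pattern; `toy_report_eoi_error()` and its counters are hypothetical, and incrementing `toy_stack_dumps` merely stands in for an actual stack dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int toy_error_reports; /* messages printed so far */
static unsigned int toy_stack_dumps;   /* stack dumps taken so far */

/* Print the error on every failure, but take the (simulated) stack dump
 * only on the first one. The function returns instead of aborting. */
static void toy_report_eoi_error(long rc, unsigned int value)
{
    static bool dumped;

    fprintf(stderr, "bad return code EOI - rc = %ld, value=%x\n", rc, value);
    toy_error_reports++;

    if (!dumped) {
        dumped = true;
        toy_stack_dumps++; /* stands in for dumping the stack */
    }
    assert(toy_stack_dumps <= 1);
}
```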
     

25 Nov, 2011

1 commit


08 Nov, 2011

1 commit

  • Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled],
    we run all interrupt handlers with interrupts disabled
    and we even check and yell when an interrupt handler
    returns with interrupts enabled (see commit [b738a50a:
    genirq: Warn when handler enables interrupts]).

    So now this flag is a NOOP and can be removed.

    Signed-off-by: Yong Zhang
    Acked-by: Arnd Bergmann
    Acked-by: Geoff Levand
    Signed-off-by: Benjamin Herrenschmidt

    Yong Zhang
     

20 Sep, 2011

2 commits

  • OPAL handles HW access to the various ICS or equivalent chips
    for us (with the exception of p5ioc2 based HEA which uses a
    different backend) similarly to what RTAS does on pSeries.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • This should fix the following warning:

    LD arch/powerpc/sysdev/xics/built-in.o
    WARNING: arch/powerpc/sysdev/xics/built-in.o(.text+0x1310): Section mismatch in
    reference from the function .icp_native_init() to the function
    .init.text:.icp_native_init_one_node()
    The function .icp_native_init() references
    the function __init .icp_native_init_one_node().
    This is often because .icp_native_init lacks a __init
    annotation or the annotation of .icp_native_init_one_node is wrong.

    icp_native_init() is only referenced in `arch/powerpc/sysdev/xics/xics-common.c'
    by xics_init() which is itself marked with __init.

    = not built-tested =

    Reported-by: Timur Tabi
    Signed-off-by: Arnaud Lacombe
    Signed-off-by: Benjamin Herrenschmidt

    Arnaud Lacombe
     

26 Jul, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    fs: Merge split strings
    treewide: fix potentially dangerous trailing ';' in #defined values/expressions
    uwb: Fix misspelling of neighbourhood in comment
    net, netfilter: Remove redundant goto in ebt_ulog_packet
    trivial: don't touch files that are removed in the staging tree
    lib/vsprintf: replace link to Draft by final RFC number
    doc: Kconfig: `to be' -> `be'
    doc: Kconfig: Typo: square -> squared
    doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
    drivers/net: static should be at beginning of declaration
    drivers/media: static should be at beginning of declaration
    drivers/i2c: static should be at beginning of declaration
    XTENSA: static should be at beginning of declaration
    SH: static should be at beginning of declaration
    MIPS: static should be at beginning of declaration
    ARM: static should be at beginning of declaration
    rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
    Update my e-mail address
    PCIe ASPM: forcedly -> forcibly
    gma500: push through device driver tree
    ...

    Fix up trivial conflicts:
    - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
    - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
    - drivers/net/r8169.c (just context changes)

    Linus Torvalds
     

12 Jul, 2011

1 commit

  • This lifts the restriction that book3s_hv guests can only run one
    hardware thread per core, and allows them to use up to 4 threads
    per core on POWER7. The host still has to run single-threaded.

    This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
    capability. The return value of the ioctl querying this capability
    is the number of vcpus per virtual CPU core (vcore), currently 4.

    To use this, the host kernel should be booted with all threads
    active, and then all the secondary threads should be offlined.
    This will put the secondary threads into nap mode. KVM will then
    wake them from nap mode and use them for running guest code (while
    they are still offline). To wake the secondary threads, we send
    them an IPI using a new xics_wake_cpu() function, implemented in
    arch/powerpc/sysdev/xics/icp-native.c. In other words, at this stage
    we assume that the platform has a XICS interrupt controller and
    we are using icp-native.c to drive it. Since the woken thread will
    need to acknowledge and clear the IPI, we also export the base
    physical address of the XICS registers using kvmppc_set_xics_phys()
    for use in the low-level KVM book3s code.

    When a vcpu is created, it is assigned to a virtual CPU core.
    The vcore number is obtained by dividing the vcpu number by the
    number of threads per core in the host. This number is exported
    to userspace via the KVM_CAP_PPC_SMT capability. If qemu wishes
    to run the guest in single-threaded mode, it should make all vcpu
    numbers be multiples of the number of threads per core.

    We distinguish three states of a vcpu: runnable (i.e., ready to execute
    the guest), blocked (that is, idle), and busy in host. We currently
    implement a policy that the vcore can run only when all its threads
    are runnable or blocked. This way, if a vcpu needs to execute elsewhere
    in the kernel or in qemu, it can do so without being starved of CPU
    by the other vcpus.

    When a vcore starts to run, it executes in the context of one of the
    vcpu threads. The other vcpu threads all go to sleep and stay asleep
    until something happens requiring the vcpu thread to return to qemu,
    or to wake up to run the vcore (this can happen when another vcpu
    thread goes from busy in host state to blocked).

    It can happen that a vcpu goes from blocked to runnable state (e.g.
    because of an interrupt), and the vcore it belongs to is already
    running. In that case it can start to run immediately as long as
    none of the vcpus in the vcore have started to exit the guest.
    We send the next free thread in the vcore an IPI to get it to start
    to execute the guest. It synchronizes with the other threads via
    the vcore->entry_exit_count field to make sure that it doesn't go
    into the guest if the other vcpus are exiting by the time that it
    is ready to actually enter the guest.

    Note that there is no fixed relationship between the hardware thread
    number and the vcpu number. Hardware threads are assigned to vcpus
    as they become runnable, so we will always use the lower-numbered
    hardware threads in preference to higher-numbered threads if not all
    the vcpus in the vcore are runnable, regardless of which vcpus are
    runnable.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Alexander Graf

    Paul Mackerras
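
Two rules from this commit message lend themselves to a short sketch: the vcpu-to-vcore mapping and the condition under which a vcore may run. The names below are hypothetical and the code is a user-space model, not the KVM implementation. `toy_vcore_may_run()` checks only the condition stated in the message (no vcpu is busy in the host).

```c
#include <assert.h>
#include <stdbool.h>

/* The three vcpu states distinguished in the commit message. */
enum toy_vcpu_state { TOY_RUNNABLE, TOY_BLOCKED, TOY_BUSY_IN_HOST };

/* vcore number = vcpu number / threads per core in the host. */
static unsigned int toy_vcore_id(unsigned int vcpu_id,
                                 unsigned int threads_per_core)
{
    assert(threads_per_core > 0);
    return vcpu_id / threads_per_core;
}

/* A vcore may run only when every one of its vcpus is runnable or blocked,
 * i.e. none of them is busy in the host. */
static bool toy_vcore_may_run(const enum toy_vcpu_state *states,
                              unsigned int nr_vcpus)
{
    for (unsigned int i = 0; i < nr_vcpus; i++)
        if (states[i] == TOY_BUSY_IN_HOST)
            return false;
    return true;
}
```

With 4 threads per core, vcpu numbers 0, 4 and 8 map to vcores 0, 1 and 2, so choosing only multiples of 4 gives each vcpu its own vcore, which is the single-threaded guest mode the message describes.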
     

10 Jun, 2011

1 commit


19 May, 2011

5 commits

  • Some irq_host implementations are using virq_to_host to check if
    they are the irq_host for a virtual irq. To allow us to make space
    versus time tradeoffs, replace this usage with an assertive
    virq_is_host that confirms or denies the irq is associated with the
    given irq_host.

    Signed-off-by: Milton Miller
    Acked-by: Grant Likely
    Signed-off-by: Benjamin Herrenschmidt

    Milton Miller
     
  • Since we already have a special case in map to set the ipi handler, use
    the desired flow.

    If we don't find an ics to handle the interrupt, complain instead of
    returning 0 without having set a chip or handler.

    Signed-off-by: Milton Miller
    Signed-off-by: Benjamin Herrenschmidt

    Milton Miller
     
  • Compile the new smp ipi mux and demux code only if a platform
    will make use of it. The new config is selected as required.

    The new cause_ipi smp op is only available conditionally to point out
    configs where the select is required; this makes setting the op an
    immediate fail instead of a deferred unresolved symbol at link.

    This also creates a new config for power surge powermac upgrade support
    that can be disabled in expert mode but is default on.

    I also removed the depends / default y on CONFIG_XICS since it is selected
    by PSERIES.

    Signed-off-by: Milton Miller
    Signed-off-by: Benjamin Herrenschmidt

    Milton Miller
     
  • Consolidate the mux and demux of ipi messages into smp.c and call
    a new smp_ops callback to actually trigger the ipi.

    The powerpc architecture code is optimised for having 4 distinct
    ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
    single, scheduler ipi, and enter debugger). However, several interrupt
    controllers only provide a single software triggered interrupt that
    can be delivered to each cpu. To resolve this limitation, each smp_ops
    implementation created a per-cpu variable that is manipulated with atomic
    bitops. Since these lines will be contended they are optimally marked as
    shared_aligned and take a full cache line for each cpu. Distro kernels
    may have 2 or 3 of these in their config, each taking per-cpu space
    even though at most one will be in use.

    This consolidation removes smp_message_recv and replaces the single call
    actions cases with direct calls from the common message recognition loop.
    The complicated debugger ipi case with its muxed crash handling code is
    moved to debug_ipi_action which is now called from the demux code (instead
    of the multi-message action calling smp_message_recv).

    I put a call to reschedule_action to increase the likelihood of correctly
    merging the anticipated scheduler_ipi() hook coming from the scheduler
    tree; that single required call can be inlined later.

    The actual message decode is a copy of the old pseries xics code with its
    memory barriers and cache line spacing, augmented with a per-cpu unsigned
    long based on the book-e doorbell code. The optional data is set via a
    callback from the implementation and is passed to the new cause-ipi hook
    along with the logical cpu number. While currently only the doorbell
    implementation uses this data it should be almost zero cost to retrieve and
    pass it -- it adds a single register load for the argument from the same
    cache line to which we just completed a store and the register is dead
    on return from the call. I extended the data element from unsigned int
    to unsigned long in case some other code wanted to associate a pointer.

    The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
    conditioned on the CPU_DBELL feature. The ifdef guard could be relaxed
    to CONFIG_SMP but I left it with BOOKE for now.

    Also, the doorbell interrupt vector for book-e was not calling irq_enter
    and irq_exit, which throws off cpu accounting and causes code to not
    realize it is running in interrupt context. Add the missing calls.

    Signed-off-by: Milton Miller
    Signed-off-by: Benjamin Herrenschmidt

    Milton Miller
     
  • Now that smp_ops->smp_message_pass is always called with an (online) cpu
    number for the target, remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.

    Signed-off-by: Milton Miller
    Signed-off-by: Benjamin Herrenschmidt

    Milton Miller
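
The IPI mux and demux consolidation described in the commits above can be sketched with C11 atomics. This is a user-space model under stated assumptions: all `toy_*` names are hypothetical, the four message values only mirror the four messages named in the commit text, and the per-cpu cache-line alignment and memory barriers of the real code are left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NR_CPUS 4

/* The four logical messages multiplexed onto a single hardware IPI. */
enum toy_ipi_msg {
    TOY_MSG_CALL_FUNCTION,    /* "ipi many" */
    TOY_MSG_CALL_FUNC_SINGLE, /* "ipi single" */
    TOY_MSG_RESCHEDULE,       /* scheduler ipi */
    TOY_MSG_DEBUGGER_BREAK,   /* enter debugger */
    TOY_NR_MSGS
};

static atomic_ulong toy_pending[TOY_NR_CPUS]; /* one bit per message */
static unsigned int toy_handled[TOY_NR_CPUS][TOY_NR_MSGS];
static unsigned int toy_triggers[TOY_NR_CPUS];

/* Stand-in for the platform hook that raises the single hardware IPI. */
static void toy_cause_ipi(unsigned int cpu)
{
    toy_triggers[cpu]++;
}

/* Mux: record the message as a bit, then trigger the one real interrupt. */
static void toy_muxed_ipi_pass(unsigned int cpu, enum toy_ipi_msg msg)
{
    assert(cpu < TOY_NR_CPUS && msg < TOY_NR_MSGS);
    atomic_fetch_or(&toy_pending[cpu], 1UL << msg);
    toy_cause_ipi(cpu);
}

/* Demux: atomically take all pending bits and dispatch each message,
 * repeating until no new bits have arrived in the meantime. */
static void toy_ipi_demux(unsigned int cpu)
{
    unsigned long all;

    assert(cpu < TOY_NR_CPUS);
    while ((all = atomic_exchange(&toy_pending[cpu], 0UL)) != 0) {
        for (unsigned int msg = 0; msg < (unsigned int)TOY_NR_MSGS; msg++)
            if (all & (1UL << msg))
                toy_handled[cpu][msg]++;
    }
}
```

Because messages are recorded as bits, two sends of the same message before the receiver runs collapse into a single dispatch, while different messages are each dispatched once.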
     

06 May, 2011

1 commit

  • Add a platform for the Wire Speed Processor, based on the PPC A2.

    This includes code for the ICS & OPB interrupt controllers, as well
    as a SCOM backend, and SCOM based cpu bringup.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: David Gibson
    Signed-off-by: Jack Miller
    Signed-off-by: Ian Munsie
    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt

    David Gibson
     

04 May, 2011

1 commit


20 Apr, 2011

3 commits

  • An upcoming new ics backend will need to implement different matching
    semantics to the current ones, which are essentially the RTAS ics
    backends. So move the current match into the RTAS backend, and allow
    other ics backends to override.

    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt

    Michael Ellerman
     
  • Even when nothing is specified in the device tree, and despite the
    fact that we don't setup links properly yet, we still need a reasonable
    value in there or some interrupts won't be setup properly to point to
    an existing processor.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • This is a significant rework of the XICS driver, too significant to
    conveniently break it up into a series of smaller patches to be honest.

    The driver is moved to a more generic location to allow new platforms
    to use it, and is broken up into separate ICP and ICS "backends". For
    now we have the native and "hypervisor" ICP backends and one common
    RTAS ICS backend.

    The driver supports one ICP backend instantiation, and many ICS ones,
    in order to accommodate future platforms with multiple possibly different
    interrupt "source" mechanisms.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
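
The backend split and the per-backend match override described in the commits above can be sketched as a list of source backends, each carrying its own match callback. Everything here is hypothetical: the `toy_ics` structure, the registration helper, and the compatible strings are illustrative assumptions, and the matching rules are invented to show two different semantics rather than to reproduce the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interrupt source backend with an overridable match hook. */
struct toy_ics {
    const char *name;
    bool (*host_match)(const struct toy_ics *ics, const char *compatible);
};

#define TOY_MAX_ICS 4
static const struct toy_ics *toy_ics_list[TOY_MAX_ICS];
static size_t toy_ics_count;

/* Register a backend; many source backends may coexist. */
static bool toy_register_ics(const struct toy_ics *ics)
{
    assert(ics != NULL && ics->host_match != NULL);
    if (toy_ics_count == TOY_MAX_ICS)
        return false;
    toy_ics_list[toy_ics_count++] = ics;
    return true;
}

/* Ask each registered backend in turn whether it handles the node. */
static const struct toy_ics *toy_find_ics(const char *compatible)
{
    for (size_t i = 0; i < toy_ics_count; i++)
        if (toy_ics_list[i]->host_match(toy_ics_list[i], compatible))
            return toy_ics_list[i];
    return NULL;
}

/* Backend A (assumed rule): exact string match. */
static bool toy_rtas_match(const struct toy_ics *ics, const char *compatible)
{
    (void)ics;
    return strcmp(compatible, "toy,rtas-ics") == 0;
}

/* Backend B (assumed rule): prefix match, i.e. different semantics. */
static bool toy_other_match(const struct toy_ics *ics, const char *compatible)
{
    (void)ics;
    return strncmp(compatible, "toy,other-", strlen("toy,other-")) == 0;
}

static const struct toy_ics toy_rtas_ics = { "rtas", toy_rtas_match };
static const struct toy_ics toy_other_ics = { "other", toy_other_match };
```

Keeping the match decision inside each backend means the common code never needs to know how a particular source controller identifies its nodes.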