22 Dec, 2011

2 commits

  • With i386, NMIs and breakpoints use the current stack and they
    do not reset the stack pointer to a fixed point that might corrupt
    a previous NMI or breakpoint (as it does in x86_64). But NMIs are
    still not made to be re-entrant, so we need to prevent the case
    where an NMI hits a breakpoint (which does an iret) and thereby
    allows another NMI to run.

    The fix is to let the NMI be in 3 different states:

    1) not running
    2) executing
    3) latched

    When no NMI is executing on a given CPU, the state is "not running".
    When the first NMI comes in, the state is switched to "executing".
    On exit of that NMI, a cmpxchg is performed to switch the state
    back to "not running" and if that fails, the NMI is restarted.

    If a breakpoint is hit and does an iret, which re-enables NMIs,
    and another NMI comes in before the first NMI finished, it will
    detect that the state is not in the "not running" state and the
    current NMI is nested. In this case, the state is switched to "latched"
    to let the interrupted NMI know to restart the NMI handler, and
    the nested NMI exits without doing anything.
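The three-state scheme described above can be sketched in user-space C. This is only an illustration under stated assumptions: C11 atomics stand in for the kernel's per-CPU cmpxchg, a single global stands in for the per-CPU state, and the names nmi_enter_sketch, nmi_exit_sketch, do_nmi_sketch and handler_runs are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

enum nmi_state { NMI_NOT_RUNNING, NMI_EXECUTING, NMI_LATCHED };

/* Per-CPU in the kernel; one variable is enough for this sketch. */
static _Atomic int nmi_state = NMI_NOT_RUNNING;
static int handler_runs;	/* how many times the handler body ran */

/* Returns 1 for the outermost NMI, 0 for a nested one. */
static int nmi_enter_sketch(void)
{
	int expected = NMI_NOT_RUNNING;

	if (atomic_compare_exchange_strong(&nmi_state, &expected, NMI_EXECUTING))
		return 1;
	/* Nested: ask the interrupted NMI to restart, then do nothing. */
	atomic_store(&nmi_state, NMI_LATCHED);
	return 0;
}

/* Returns 1 if the handler must be restarted, 0 if the NMI is done. */
static int nmi_exit_sketch(void)
{
	int expected = NMI_EXECUTING;

	if (atomic_compare_exchange_strong(&nmi_state, &expected, NMI_NOT_RUNNING))
		return 0;
	/* The state was "latched": mark it "executing" again and rerun. */
	atomic_store(&nmi_state, NMI_EXECUTING);
	return 1;
}

/* nested_hits simulates NMIs that arrive while the handler body runs. */
static void do_nmi_sketch(int nested_hits)
{
	assert(nested_hits >= 0);
	if (!nmi_enter_sketch())
		return;
	do {
		handler_runs++;
		for (; nested_hits > 0; nested_hits--)
			nmi_enter_sketch();	/* simulated nested NMI */
	} while (nmi_exit_sketch());
}
```

In this sketch any number of nested NMIs collapses into a single restart, which matches the "latched" wording: the interrupted NMI only learns that at least one more NMI arrived.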

    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Paul Turner
    Cc: Frederic Weisbecker
    Cc: Mathieu Desnoyers
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • We want to allow NMI handlers to have breakpoints to be able to
    remove stop_machine from ftrace, kprobes and jump_labels. But if
    an NMI interrupts a current breakpoint, and then it triggers a
    breakpoint itself, it will switch to the breakpoint stack and
    corrupt the data on it for the breakpoint processing that it
    interrupted.

    Instead, have the NMI check if it interrupted breakpoint processing
    by checking if the stack that is currently used is a breakpoint
    stack. If it is, then load a special IDT that changes the IST
    for the debug exception to keep the same stack in kernel context.
    When the NMI is done, it puts it back.

    This way, if the NMI does trigger a breakpoint, it will keep
    using the same stack and not stomp on the breakpoint data for
    the breakpoint it interrupted.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

11 Nov, 2011

1 commit

  • With "apic=verbose" the print_IO_APIC() function tries to print
    IRQ to pin mappings for every active irq. It assumes chip_data
    is of type irq_cfg and may cause an oops if not.

    As print_IO_APIC() is called from a late_initcall, other
    chained irq chips may already be registered with custom
    chip_data information, causing an oops. This is the case with
    Intel MID SoC devices with gpio demuxers registered as irq_chips.
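One way to avoid the bad cast is to check which chip owns the irq before interpreting chip_data. The sketch below shows only that idea; the *_sketch types and ioapic_cfg_or_null are hypothetical stand-ins, and the message above does not say how the actual patch performs the check.

```c
#include <assert.h>
#include <stddef.h>

struct irq_chip_sketch { const char *name; };
struct irq_cfg_sketch { int pin; };

struct irq_desc_sketch {
	const struct irq_chip_sketch *chip;
	void *chip_data;	/* meaning depends on which chip owns the irq */
};

static const struct irq_chip_sketch ioapic_chip_sketch = { "IO-APIC" };

/*
 * Returns the IO-APIC config for this irq, or NULL when chip_data
 * belongs to some other chip and must not be treated as irq_cfg.
 */
static const struct irq_cfg_sketch *
ioapic_cfg_or_null(const struct irq_desc_sketch *desc)
{
	assert(desc != NULL);
	if (desc->chip != &ioapic_chip_sketch)
		return NULL;
	return desc->chip_data;
}
```

A printing loop would then skip every irq for which the helper returns NULL.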

    Signed-off-by: Mathias Nyman
    Signed-off-by: Alan Cox
    [ -v2: fixed build failure ]
    Signed-off-by: Ingo Molnar

    Mathias Nyman
     

10 Nov, 2011

1 commit

  • Moorestown/Medfield platform does not have port 0x61 to report
    NMI status, nor does it have external NMI sources. The only NMI
    sources are from lapic, as results of perf counter overflow or
    IPI, e.g. NMI watchdog or spin lock debug.

    Reading port 0x61 on Moorestown will return 0xff, which misleads
    NMI handlers into reporting false critical errors such as memory
    parity errors. The subsequent ioport access for NMI handling can
    also cause undefined behavior on Moorestown.

    This patch allows the kernel to process NMIs due to the watchdog
    or backtrace dump without unnecessary hangs.

    Signed-off-by: Jacob Pan
    Signed-off-by: Ingo Molnar
    [hand applied]
    Signed-off-by: Alan Cox

    Jacob Pan
     

01 Nov, 2011

1 commit

  • These files were implicitly getting EXPORT_SYMBOL via device.h
    which was including module.h, but that will be fixed up shortly.

    By fixing these now, we can avoid seeing things like:

    arch/x86/kernel/rtc.c:29: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
    arch/x86/kernel/pci-dma.c:20: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
    arch/x86/kernel/e820.c:69: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL_GPL’

    [ with input from Randy Dunlap and also
    from Stephen Rothwell ]

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

10 Oct, 2011

6 commits

  • nmi.c needs an #include :

    arch/x86/kernel/nmi.c: In function ‘unknown_nmi_error’:
    arch/x86/kernel/nmi.c:286:6: error: ‘MCA_bus’ undeclared (first use in this function)
    arch/x86/kernel/nmi.c:286:6: note: each undeclared identifier is reported only once for each function it appears in

    Another one is the hpwdt driver:

    drivers/watchdog/hpwdt.c:507:9: error: ‘NMI_DONE’ undeclared (first use in this function)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Now that the NMI handlers are broken into lists, increment the appropriate
    stats for each list. This allows us to see what is going on when they
    get printed out in the next patch.

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1317409584-23662-6-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     
  • Previous patches allow the NMI subsystem to process multiple NMI events
    in one NMI. As previously discussed this can cause issues when an event
    triggered another NMI but is processed in the current NMI. This causes the
    next NMI to go unprocessed and become an 'unknown' NMI.

    To handle this, we first have to flag whether the NMI handler handled
    more than one event. If it did, then there exists a chance that
    the next NMI might be already processed. Once the NMI is flagged as a
    candidate to be swallowed, we next look for a back-to-back NMI condition.

    This is determined by looking at the %rip from pt_regs. If it is the same
    as the previous NMI, it is assumed the cpu did not have a chance to jump
    back into a non-NMI context and execute code and instead handled another NMI.

    If both of those conditions are true then we will swallow any unknown NMI.

    There still exists a chance that we accidentally swallow a real unknown NMI,
    but for now things seem better.

    An optimization has also been added to the nmi notifier routine. Because x86
    can latch up to one NMI while currently processing an NMI, we don't have to
    worry about executing _all_ the handlers in a standalone NMI. The idea is
    if multiple NMIs come in, the second NMI will represent them. For those
    back-to-back NMI cases, we have the potential to drop NMIs. Therefore only
    execute all the handlers in the second half of a detected back-to-back NMI.
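The swallow decision described above can be sketched as follows. This is a user-space illustration, not the patch itself: the struct and function names are hypothetical, rip and handled are passed in as plain values, and the handler-list optimization from the last paragraph is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct nmi_cpu_sketch {
	bool swallow_nmi;	/* previous NMI handled more than one event */
	uintptr_t last_nmi_rip;	/* %rip seen by the previous NMI */
};

/*
 * handled is the number of events the handlers claimed for this NMI.
 * Returns true if the NMI should be reported as unknown.
 */
static bool nmi_is_unknown_sketch(struct nmi_cpu_sketch *s, uintptr_t rip,
				  int handled)
{
	/* Same %rip as last time: assume no non-NMI code ran in between. */
	bool b2b = (rip == s->last_nmi_rip);

	assert(handled >= 0);
	if (!b2b)
		s->swallow_nmi = false;
	s->last_nmi_rip = rip;

	if (handled > 0) {
		/* More than one event: the next NMI may already be serviced. */
		if (handled > 1)
			s->swallow_nmi = true;
		return false;
	}
	/* Unhandled: swallow it only if both conditions hold. */
	return !(b2b && s->swallow_nmi);
}
```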

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     
  • Just convert all the files that have an nmi handler to the new routines.
    Most of it is straightforward conversion. A couple of places needed some
    tweaking, like kgdb, which separates the debug notifier from the nmi handler,
    and mce, which removes a call to notify_die.

    [Thanks to Ying for finding out the history behind that mce call

    https://lkml.org/lkml/2010/5/27/114

    And Boris responding that he would like to remove that call because of it

    https://lkml.org/lkml/2011/9/21/163]

    The things that get converted are the registration/unregistration routines,
    and the nmi handler itself has its args changed, along with removal of the
    code that checks which list it is on (most are on one NMI list except for
    kgdb, which has both an NMI routine and an NMI Unknown routine).

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Acked-by: Corey Minyard
    Cc: Jason Wessel
    Cc: Andi Kleen
    Cc: Robert Richter
    Cc: Huang Ying
    Cc: Corey Minyard
    Cc: Jack Steiner
    Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     
  • The NMI handlers used to rely on the notifier infrastructure. This worked
    great until we wanted to support handling multiple events better.

    One of the key ideas of the nmi handling is to process _all_ the handlers for
    each NMI. The reason behind this switch is that NMIs are edge triggered.
    If enough NMIs are triggered, then they could be lost because the cpu can
    only latch at most one NMI (besides the one currently being processed).

    In order to deal with this we have decided to process all the NMI handlers
    for each NMI. This allows the handlers to determine if they received an
    event or not (the ones that can not determine this will be left to fend
    for themselves on the unknown NMI list).

    As a result of this change it is now possible to have an extra NMI that
    was destined to be received for an already processed event. Because the
    event was processed in the previous NMI, this NMI gets dropped and becomes
    an 'unknown' NMI. This of course will cause printks that scare people.

    However, we prefer to have extra NMIs as opposed to losing NMIs and as such
    have developed a basic mechanism to catch most of them. That will be
    a later patch.

    To accomplish this idea, I unhooked the nmi handlers from the notifier
    routines and created a new mechanism loosely based on doIRQ. The reason
    for this is the notifier routines have a couple of shortcomings. First, we
    couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
    NOTIFY_STOP. Second, we couldn't keep track of the number of events being
    handled in each routine (most only handle one, perf can handle more than one).
    Third, I wanted to eventually display which nmi handlers are registered in
    the system in /proc/interrupts to help see who is generating NMIs.

    The patch below just implements the new infrastructure but doesn't wire it up
    yet (that is the next patch). Its design is based on doIRQ structs and the
    atomic notifier routines. So the rcu stuff in the patch isn't entirely untested
    (as the notifier routines have soaked it) but it should be double checked in
    case I copied the code wrong.
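The core idea, calling every registered handler and summing the number of events each one claims, can be sketched as below. This is a simplified user-space illustration: the list is a plain singly linked list with no RCU or locking, and all names (nmi_action_sketch, nmi_handle_sketch, the two sample handlers) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A handler returns how many events it handled (0 if none were its own). */
typedef int (*nmi_handler_fn)(void);

struct nmi_action_sketch {
	nmi_handler_fn handler;
	const char *name;	/* could later be shown to the user */
	struct nmi_action_sketch *next;
};

static int calls;	/* counts handler invocations, for demonstration */

/* Sample handlers: one sees nothing, one handles two events at once. */
static int no_event_handler(void)
{
	calls++;
	return 0;
}

static int two_event_handler(void)
{
	calls++;
	return 2;
}

/*
 * Unlike a notifier chain that can stop early, every handler is
 * called and the per-handler counts are added up.
 */
static int nmi_handle_sketch(const struct nmi_action_sketch *head)
{
	const struct nmi_action_sketch *a;
	int handled = 0;

	for (a = head; a != NULL; a = a->next) {
		assert(a->handler != NULL);
		handled += a->handler();
	}
	return handled;
}
```

A total of zero corresponds to the "unknown NMI" case; a total above one is what the later swallow logic keys on.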

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     
  • The nmi stuff is changing a lot and adding more functionality. Split it
    out from the traps.c file so it doesn't continue to pollute that file.

    This makes it easier to find and expand all the future nmi related work.

    No real functional changes here.

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     

18 Feb, 2009

2 commits


17 Feb, 2009

1 commit


29 Jan, 2009

1 commit


18 Jan, 2009

1 commit


06 Jan, 2009

1 commit


03 Jan, 2009

1 commit


23 Dec, 2008

1 commit

  • …86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core

    Ingo Molnar
     

31 Oct, 2008

1 commit

  • Impact: introduce nmi_watchdog=lapic and nmi_watchdog=ioapic aliases

    Add sensible names such as "lapic" and "ioapic" to the
    nmi_watchdog boot parameter. Sometimes it is not
    that easy to recall what exactly nmi_watchdog=1
    means, so we allow the use of symbolic names here.

    Old numeric values remain valid.
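A parser that accepts both the symbolic names and the old numbers could look like the sketch below. It is an illustration only: parse_nmi_watchdog_sketch is a hypothetical name, the numeric values assume the long-documented mapping (1 selects the IO-APIC watchdog, 2 the local APIC one), and exact string matching is a simplification chosen for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed numeric values: 0 = off, 1 = IO-APIC, 2 = local APIC. */
enum { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2, NMI_INVALID = 3 };

/* Returns the watchdog mode for the given parameter value, or -1. */
static int parse_nmi_watchdog_sketch(const char *str)
{
	char *end;
	long val;

	assert(str != NULL);

	/* Symbolic aliases first. */
	if (strcmp(str, "lapic") == 0)
		return NMI_LOCAL_APIC;
	if (strcmp(str, "ioapic") == 0)
		return NMI_IO_APIC;

	/* Old numeric values remain valid. */
	val = strtol(str, &end, 0);
	if (end == str || *end != '\0')
		return -1;
	if (val < NMI_NONE || val >= NMI_INVALID)
		return -1;
	return (int)val;
}
```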

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Ingo Molnar

    Cyrill Gorcunov
     

28 Oct, 2008

2 commits


23 Sep, 2008

1 commit

  • There's a small window when the NMI watchdog is being set up in which,
    if any NMIs are triggered, the NMI code will make use of uninitialized
    wd_ops elements:

    void setup_apic_nmi_watchdog(void *unused)
    {
            if (__get_cpu_var(wd_enabled))
                    return;

            /* cheap hack to support suspend/resume */
            /* if cpu0 is not active neither should the other cpus */
            if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
                    return;
            (...)
                    __get_cpu_var(wd_enabled) = 1;
    -->             if (lapic_watchdog_init(nmi_hz) < 0) {
    (...)

    asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
    {
            (...)
            if (nmi_watchdog_tick(regs, reason))
                    return;
    (...)

    notrace __kprobes int
    nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
    {
            (...)
            if (!__get_cpu_var(wd_enabled))
                    return rc;
            switch (nmi_watchdog) {
            case NMI_LOCAL_APIC:
                    rc |= lapic_wd_event(nmi_hz);
    (...)

    int lapic_wd_event(unsigned nmi_hz)
    {
            struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
            u64 ctr;

    -->     rdmsrl(wd->perfctr_msr, ctr);

    and wd->*_msr will be initialized in each processor type specific setup, after
    enabling NMIs for PMIs. Since the counter was just set, the chance of a
    performance counter generated NMI is minimal, but any other unknown NMI would
    trigger the problem. This patch fixes the problem by setting everything up
    before enabling performance counter generated NMIs and by setting wd_enabled
    using a callback function.

    Signed-off-by: Aristeu Rozanski
    Acked-by: Don Zickus
    Acked-by: Prarit Bhargava
    Acked-by: Vivek Goyal
    Signed-off-by: Ingo Molnar

    Aristeu Rozanski
     

15 Aug, 2008

2 commits

  • clean up the failure message - and redirect people to bugzilla
    instead of lkml.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • > it just won't work at boot time - the second logic unit will be stuck:
    >
    > Booting processor 1/2 APIC 0x1
    > Initializing CPU#1
    > Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063)
    > CPU: Trace cache: 12K uops, L1 D cache: 16K
    > CPU: L2 cache: 1024K
    > CPU: Physical Processor ID: 0
    > CPU: Processor Core ID: 1
    > CPU1: Thermal monitoring enabled (TM1)
    > Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
    > Brought up 2 CPUs
    > testing NMI watchdog ... WARNING: CPU#1: NMI appears to be stuck (0->0)!

    while at it... - fix that newline

    Signed-off-by: Aristeu Rozanski
    Cc: jvillalo@redhat.com
    Signed-off-by: Ingo Molnar

    Aristeu Rozanski
     

21 Jul, 2008

1 commit


20 Jul, 2008

1 commit


18 Jul, 2008

1 commit

  • Use alternatives to select the workaround for the 11AP Pentium erratum
    for the affected steppings on the fly rather than build time. Remove the
    X86_GOOD_APIC configuration option and replace all the calls to
    apic_write_around() with plain apic_write(), protecting accesses to the
    ESR as appropriate due to the 3AP Pentium erratum. Remove
    apic_read_around() and all its invocations altogether as not needed.
    Remove apic_write_atomic() and all its implementing backends. The use of
    ASM_OUTPUT2() is not strictly needed for input constraints, but I have
    used it for readability's sake.

    I had the feeling no one else was brave enough to do it, so I went ahead
    and here it is. Verified by checking the generated assembly and tested
    with both a 32-bit and a 64-bit configuration, also with the 11AP
    "feature" forced on and verified with gdb on /proc/kcore to work as
    expected (as 11AP machines are quite hard to get hands on these days).
    Some script complained about the use of "volatile", but apic_write() needs
    it for the same reason and is effectively a replacement for writel(), so I
    have disregarded it.

    I am not sure what the policy wrt defconfig files is, they are generated
    and there is risk of a conflict resulting from an unrelated change, so I
    have left changes to them out. The option will get removed from them at
    the next run.

    Some testing with machines other than mine will be needed to avoid some
    stupid mistake, but despite its volume, the change is not really that
    intrusive, so I am fairly confident that because it works for me, it will
    work everywhere.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Ingo Molnar

    Maciej W. Rozycki
     

16 Jul, 2008

1 commit

  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

12 Jul, 2008

1 commit

  • In the course of the recent unification of the NMI watchdog, an assignment
    to timer_ack to switch off unnecessary POLL commands to the 8259A in the
    case of a watchdog failure has been accidentally removed. The statement
    used to be limited to the 32-bit variation because, since the rewrite of
    the timer code, it has been relevant for the 82489DX only. This change
    brings it back.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Ingo Molnar

    Maciej W. Rozycki
     

08 Jul, 2008

4 commits


19 Jun, 2008

1 commit


12 Jun, 2008

1 commit


05 Jun, 2008

2 commits

  • fix:

    arch/x86/kernel/built-in.o: In function `proc_nmi_enabled':
    : undefined reference to `nmi_watchdog_default'
    arch/x86/kernel/built-in.o: In function `native_smp_prepare_cpus':
    : undefined reference to `nmi_watchdog_default'

    Signed-off-by: Ingo Molnar

    mingo@elte.hu
     
  • The 64bit mode bootstrap code sets nmi_watchdog to NMI_NONE
    by default and doing the same in 32bit mode is safe too.
    Doing so saves us from several #ifdefs.

    Btw, my previous commit

    commit 19ec673ced067316b9732bc6d1c4ff4052e5f795
    Author: Cyrill Gorcunov
    Date: Wed May 28 23:00:47 2008 +0400

    x86: nmi - fix incorrect NMI watchdog used by default

    did not fix the problem completely; moreover, it
    introduced an additional bug - nmi_watchdog would be
    set to either NMI_LOCAL_APIC or NMI_IO_APIC
    _regardless_ of the boot option if enabled thru
    /proc/sys/kernel/nmi_watchdog. Sorry for that.
    Fix it too.

    Signed-off-by: Cyrill Gorcunov
    Cc: mingo@redhat.com
    Cc: hpa@zytor.com
    Cc: macro@linux-mips.org
    Signed-off-by: Thomas Gleixner

    Cyrill Gorcunov
     

02 Jun, 2008

2 commits