29 Dec, 2018

1 commit

  • commit c7c3f05e341a9a2bd1a92993d4f996cfd6e7348e upstream.

    From the printk()/serial console point of view, panic() is special because
    it may force a CPU to re-enter printk() and/or the serial console driver.
    Therefore, some serial console drivers are re-entrant. E.g. 8250:

    serial8250_console_write()
    {
            if (port->sysrq)
                    locked = 0;
            else if (oops_in_progress)
                    locked = spin_trylock_irqsave(&port->lock, flags);
            else
                    spin_lock_irqsave(&port->lock, flags);
            ...
    }

    panic() does set oops_in_progress via bust_spinlocks(1), so in theory
    we should be able to re-enter serial console driver from panic():

    CPU0

    uart_console_write()
    serial8250_console_write() // if (oops_in_progress)
    // spin_trylock_irqsave()
    call_console_drivers()
    console_unlock()
    console_flush_on_panic()
    bust_spinlocks(1) // oops_in_progress++
    panic()

    spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
    serial8250_console_write()
    call_console_drivers()
    console_unlock()
    printk()
    ...

    However, this does not happen and we deadlock in the serial console on
    the port->lock spinlock. The problem is that console_flush_on_panic() is
    called after bust_spinlocks(0):

    void panic(const char *fmt, ...)
    {
            bust_spinlocks(1);
            ...
            bust_spinlocks(0);
            console_flush_on_panic();
            ...
    }

    bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
    can go back to zero. Thus even re-entrant console drivers will simply
    spin on port->lock spinlock. Given that port->lock may already be
    locked either by a stopped CPU, or by the very same CPU we execute
    panic() on (for instance, NMI panic() on printing CPU) the system
    deadlocks and does not reboot.

    Fix this by removing bust_spinlocks(0), so oops_in_progress is always
    set in panic() now and, thus, re-entrant console drivers will trylock
    the port->lock instead of spinning on it forever, when we call them
    from console_flush_on_panic().
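
    The effect of the ordering can be sketched in user space. The following
    is a minimal model, not kernel code: port_lock_held, console_write(),
    panic_flush() and the result enum are hypothetical stand-ins, and the
    lock is reduced to a flag that a stopped CPU never clears.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel state discussed above. */
static int oops_in_progress;   /* raised by bust_spinlocks(1) */
static bool port_lock_held;    /* models port->lock owned by a stopped CPU */

enum write_result { WROTE_LOCKED, WROTE_UNLOCKED, WOULD_DEADLOCK };

static void bust_spinlocks(int yes)
{
	if (yes)
		oops_in_progress++;
	else if (oops_in_progress > 0)
		oops_in_progress--;
}

/* Mirrors the locking decision quoted from serial8250_console_write(). */
static enum write_result console_write(void)
{
	if (oops_in_progress) {
		/* trylock path: a failed trylock still prints, just unlocked */
		return port_lock_held ? WROTE_UNLOCKED : WROTE_LOCKED;
	}
	/* spin_lock_irqsave() path: spins forever if the owner is stopped */
	return port_lock_held ? WOULD_DEADLOCK : WROTE_LOCKED;
}

/* Stands in for panic() followed by console_flush_on_panic(). */
static enum write_result panic_flush(bool drop_oops_before_flush)
{
	bust_spinlocks(1);
	if (drop_oops_before_flush)
		bust_spinlocks(0);   /* the old ordering */
	return console_write();
}
```

    With port_lock_held set, panic_flush(true) models the old behaviour and
    reports the deadlock, while panic_flush(false) models the fixed ordering
    and gets the output through.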

    Link: http://lkml.kernel.org/r/20181025101036.6823-1-sergey.senozhatsky@gmail.com
    Cc: Steven Rostedt
    Cc: Daniel Wang
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Greg Kroah-Hartman
    Cc: Alan Cox
    Cc: Jiri Slaby
    Cc: Peter Feiner
    Cc: linux-serial@vger.kernel.org
    Cc: Sergey Senozhatsky
    Cc: stable@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Sergey Senozhatsky
     

17 Aug, 2017

1 commit

  • This implements refcount_t overflow protection on x86 without a noticeable
    performance impact, though without the fuller checking of REFCOUNT_FULL.

    This is done by duplicating the existing atomic_t refcount implementation
    but with, in the normal case, a single instruction added to detect whether
    the refcount has gone negative (e.g. wrapped past INT_MAX or below zero). When detected,
    the handler saturates the refcount_t to INT_MIN / 2. With this overflow
    protection, the erroneous reference release that would follow a wrap back
    to zero is blocked from happening, avoiding the class of refcount-overflow
    use-after-free vulnerabilities entirely.

    Only the overflow case of refcounting can be perfectly protected, since
    it can be detected and stopped before the reference is freed and left to
    be abused by an attacker. There isn't a way to block early decrements,
    and while REFCOUNT_FULL stops increment-from-zero cases (which would
    be the state _after_ an early decrement and stops potential double-free
    conditions), this fast implementation does not, since it would require
    the more expensive cmpxchg loops. Since the overflow case is much more
    common (e.g. missing a "put" during an error path), this protection
    provides real-world protection. For example, the two public refcount
    overflow use-after-free exploits published in 2016 would have been
    rendered unexploitable:

    http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/

    http://cyseclabs.com/page?n=02012016

    This implementation does, however, notice an unchecked decrement to zero
    (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it
    resulted in a zero). Decrements under zero are noticed (since they will
    have resulted in a negative value), though this only indicates that a
    use-after-free may have already happened. Such notifications are likely
    avoidable by an attacker that has already exploited a use-after-free
    vulnerability, but it's better to have them reported than allow such
    conditions to remain universally silent.

    On first overflow detection, the refcount value is reset to INT_MIN / 2
    (which serves as a saturation value) and a report and stack trace are
    produced. When operations detect only negative value results (such as
    changing an already saturated value), saturation still happens but no
    notification is performed (since the value was already saturated).

    On the matter of races, since the entire range beyond INT_MAX but before
    0 is negative, every operation at INT_MIN / 2 will trap, leaving no
    overflow-only race condition.
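
    The saturation behaviour described above can be modelled in portable C.
    This is only a sketch under stated assumptions: it is not atomic, it
    replaces the "js" trap and exception handler with an ordinary branch, and
    struct refcount, wrapping_add() and the reports counter are hypothetical
    names introduced for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define REFCOUNT_SATURATED (INT_MIN / 2)

struct refcount {
	int refs;
	int reports;   /* how many times a report would have been printed */
};

/*
 * Wrap-around add without signed-overflow UB: compute in unsigned. The
 * conversion back to int is implementation-defined; GCC and Clang wrap.
 */
static int wrapping_add(int a, int b)
{
	return (int)((unsigned int)a + (unsigned int)b);
}

/* Models the "result went negative" check and the saturating handler. */
static bool saturate_if_negative(struct refcount *r, int old)
{
	if (r->refs >= 0)
		return false;
	if (old >= 0)
		r->reports++;          /* report only on first detection */
	r->refs = REFCOUNT_SATURATED;  /* saturate in every negative case */
	return true;
}

static void refcount_inc(struct refcount *r)
{
	int old = r->refs;

	r->refs = wrapping_add(old, 1);
	saturate_if_negative(r, old);
}

static bool refcount_dec_and_test(struct refcount *r)
{
	int old = r->refs;

	r->refs = wrapping_add(old, -1);
	if (saturate_if_negative(r, old))
		return false;          /* never report "last reference" */
	return r->refs == 0;
}
```

    Because every negative result is pulled back to INT_MIN / 2, a saturated
    counter can never be walked back to zero, so the object is leaked rather
    than freed.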

    As for performance, this implementation adds a single "js" instruction
    to the regular execution flow of a copy of the standard atomic_t refcount
    operations. (The non-"and_test" refcount_dec() function, which is uncommon
    in regular refcount design patterns, has an additional "jz" instruction
    to detect reaching exactly zero.) Since this is a forward jump, it is by
    default the non-predicted path, which will be reinforced by dynamic branch
    prediction. The result is this protection having virtually no measurable
    change in performance over standard atomic_t operations. The error path,
    located in .text.unlikely, saves the refcount location and then uses UD0
    to fire a refcount exception handler, which resets the refcount, handles
    reporting, and returns to regular execution. This keeps the changes to
    .text size minimal, avoiding return jumps and open-coded calls to the
    error reporting routine.

    Example assembly comparison:

    refcount_inc() before:

    .text:
    ffffffff81546149:   f0 ff 45 f4             lock incl -0xc(%rbp)

    refcount_inc() after:

    .text:
    ffffffff81546149:   f0 ff 45 f4             lock incl -0xc(%rbp)
    ffffffff8154614d:   0f 88 80 d5 17 00       js     ffffffff816c36d3
    ...
    .text.unlikely:
    ffffffff816c36d3:   48 8d 4d f4             lea    -0xc(%rbp),%rcx
    ffffffff816c36d7:   0f ff                   (bad)

    These are the cycle counts comparing a loop of refcount_inc() from 1
    to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
    unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
    (refcount_t-full), and this overflow-protected refcount (refcount_t-fast):

    2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:
                        cycles          protections
    atomic_t            82249267387     none
    refcount_t-fast     82211446892     overflow, untested dec-to-zero
    refcount_t-full     144814735193    overflow, untested dec-to-zero, inc-from-zero

    This code is a modified version of the x86 PAX_REFCOUNT atomic_t
    overflow defense from the last public patch of PaX/grsecurity, based
    on my understanding of the code. Changes or omissions from the original
    code are mine and don't reflect the original grsecurity/PaX code. Thanks
    to PaX Team for various suggestions for improvement for repurposing this
    code to be a refcount-only protection.

    Signed-off-by: Kees Cook
    Reviewed-by: Josh Poimboeuf
    Cc: Alexey Dobriyan
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: David S. Miller
    Cc: Davidlohr Bueso
    Cc: Elena Reshetova
    Cc: Eric Biggers
    Cc: Eric W. Biederman
    Cc: Greg KH
    Cc: Hans Liljestrand
    Cc: James Bottomley
    Cc: Jann Horn
    Cc: Linus Torvalds
    Cc: Manfred Spraul
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Serge E. Hallyn
    Cc: Thomas Gleixner
    Cc: arozansk@redhat.com
    Cc: axboe@kernel.dk
    Cc: kernel-hardening@lists.openwall.com
    Cc: linux-arch
    Link: http://lkml.kernel.org/r/20170815161924.GA133115@beast
    Signed-off-by: Ingo Molnar

    Kees Cook
     

02 Mar, 2017

1 commit


24 Feb, 2017

1 commit


23 Feb, 2017

1 commit

  • Pull printk updates from Petr Mladek:

    - Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven
    Rostedt as the printk reviewer. This idea came up after the
    discussion about printk issues at Kernel Summit. It was formulated
    and discussed at lkml[1].

    - Extend the lock-less NMI per-CPU buffer idea to handle recursive
    printk() calls, by Sergey Senozhatsky[2]. It is the first step in
    sanitizing printk as discussed at Kernel Summit.

    The change makes it possible to see messages that would normally get
    ignored or would cause a deadlock.

    It also makes it possible to enable lockdep in printk(). This has already
    paid off: testing in linux-next helped to discover two old problems that
    were hidden before[3][4].

    - Remove unused parameter by Sergey Senozhatsky. Clean up after a past
    change.

    [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com
    [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com
    [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
    [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: drop call_console_drivers() unused param
    printk: convert the rest to printk-safe
    printk: remove zap_locks() function
    printk: use printk_safe buffers in printk
    printk: report lost messages in printk safe/nmi contexts
    printk: always use deferred printk when flush printk_safe lines
    printk: introduce per-cpu safe_print seq buffer
    printk: rename nmi.c and exported api
    printk: use vprintk_func in vprintk()
    MAINTAINERS: Add printk maintainers

    Linus Torvalds
     

08 Feb, 2017

1 commit

  • A preparation patch for printk_safe work. No functional change.
    - rename nmi.c to printk_safe.c
    - add the `printk_safe' prefix to those exported functions that are used
    by both printk-safe and printk-nmi.

    Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Tejun Heo
    Cc: Calvin Owens
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Peter Hurley
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

25 Jan, 2017

1 commit

  • When a system panics, the "Rebooting in X seconds.." message is never
    printed because it lacks a trailing newline. Fix it.

    Link: http://lkml.kernel.org/r/20170119114751.2724-1-jslaby@suse.cz
    Signed-off-by: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     

18 Jan, 2017

1 commit

  • Commit 7fd8329ba502 ("taint/module: Clean up global and module taint
    flags handling") used the keywords true and false as the names of
    character members of a new struct. These names cause problems when
    out-of-kernel modules such as VirtualBox include their own definitions
    of true and false.
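
    A minimal sketch of the clash, assuming an out-of-tree header that
    defines true and false as macros. The prefixed member names and the
    taint_char() helper below are illustrative assumptions, not quoted from
    the patch.

```c
/* An out-of-tree module might carry its own boolean definitions like this. */
#define true  1
#define false 0

/*
 * With the macros above in effect, members declared as "char true;" and
 * "char false;" would expand to "char 1;" and "char 0;" and fail to
 * compile. Member names that are not macro names avoid the problem.
 */
struct taint_flag {
	char c_true;    /* character shown when the flag is set */
	char c_false;   /* character shown when the flag is clear */
};

/* Hypothetical helper: pick the character for one flag. */
static char taint_char(const struct taint_flag *t, int is_set)
{
	return is_set ? t->c_true : t->c_false;
}
```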

    Fixes: 7fd8329ba502 ("taint/module: Clean up global and module taint flags handling")
    Signed-off-by: Larry Finger
    Cc: Petr Mladek
    Cc: Jessica Yu
    Cc: Rusty Russell
    Reported-by: Valdis Kletnieks
    Reviewed-by: Petr Mladek
    Acked-by: Rusty Russell
    Signed-off-by: Jessica Yu

    Larry Finger
     

27 Nov, 2016

1 commit

  • The commit 66cc69e34e86a231 ("Fix: module signature vs tracepoints:
    add new TAINT_UNSIGNED_MODULE") updated module_taint_flags() to
    potentially print one more character. But it did not increase the
    size of the corresponding buffers in m_show() and print_modules().

    We recently made the same mistake when adding a taint flag
    for livepatching, see
    https://lkml.kernel.org/r/cfba2c823bb984690b73572aaae1db596b54a082.1472137475.git.jpoimboe@redhat.com

    Also, struct module uses an incompatible type for the mod->taints flags.
    It has survived from commit 2bc2d61a9638dab670d ("[PATCH] list module
    taint flags in Oops/panic"). At that time, "int" was used for the global
    taint flags. But only the global taint flags were later changed
    to "unsigned long" by commit 25ddbb18aae33ad2 ("Make the taint
    flags reliable").

    This patch defines TAINT_FLAGS_COUNT, which can be used to create
    arrays and buffers of the right size. Note that we could not use an
    enum because the taint flag indexes are also used in assembly code.

    Then it reworks the table that describes the taint flags. The TAINT_*
    numbers can be used as the index, so the table no longer needs to store
    them. Instead, each entry records whether the taint flag is also shown
    per-module.

    Finally, it uses "unsigned long", bit operations, and the updated
    taint_flags table also for mod->taints.

    It is not optimal because only a few taint flags can be printed by
    module_taint_flags(). But it is better to be on the safe side. IMHO,
    the optimization is not worth it and this is a good compromise.
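
    The table layout described above might look roughly like this. It is a
    reduced user-space sketch: only three flags are listed, so
    TAINT_FLAGS_COUNT is 3 here rather than the kernel's value, and the
    member names and the module_flags_taint() helper are assumptions made for
    illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Plain #defines rather than an enum, as the text above requires. */
#define TAINT_PROPRIETARY_MODULE 0
#define TAINT_FORCED_MODULE      1
#define TAINT_CPU_OUT_OF_SPEC    2
#define TAINT_FLAGS_COUNT        3   /* reduced for this sketch */

struct taint_flag {
	char c_true;    /* character shown when the flag is set */
	char c_false;   /* character shown when the flag is clear */
	bool module;    /* also shown per-module? */
};

/* Indexed directly by the TAINT_* numbers. */
static const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
	[TAINT_PROPRIETARY_MODULE] = { 'P', 'G', true },
	[TAINT_FORCED_MODULE]      = { 'F', ' ', true },
	[TAINT_CPU_OUT_OF_SPEC]    = { 'S', ' ', false },
};

/*
 * Write the per-module taint characters into buf, which must hold at
 * least TAINT_FLAGS_COUNT + 1 bytes. Returns the number of characters.
 */
static size_t module_flags_taint(unsigned long taints, char *buf)
{
	size_t len = 0;

	for (int i = 0; i < TAINT_FLAGS_COUNT; i++) {
		if (taint_flags[i].module && (taints & (1UL << i)))
			buf[len++] = taint_flags[i].c_true;
	}
	buf[len] = '\0';
	return len;
}
```

    Sizing the buffer from TAINT_FLAGS_COUNT means that adding a flag to the
    table cannot silently overflow it, even though only some flags are ever
    printed per-module.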

    Signed-off-by: Petr Mladek
    Link: http://lkml.kernel.org/r/1474458442-21581-1-git-send-email-pmladek@suse.com
    [jeyu@redhat.com: fix broken lkml link in changelog]
    Signed-off-by: Jessica Yu

    Petr Mladek
     

12 Oct, 2016

1 commit

  • Daniel Walker reported problems which happen when the
    crash_kexec_post_notifiers kernel option is enabled
    (https://lkml.org/lkml/2015/6/24/44).

    In that case, smp_send_stop() is called before entering the kdump
    routines, which assume other CPUs are still online. As a result, on x86,
    the kdump routines fail to save other CPUs' registers and disable
    virtualization extensions.

    To fix this problem, call a new kdump-friendly function,
    crash_smp_send_stop(), instead of smp_send_stop() when
    crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak
    function, and it just calls smp_send_stop(). Architecture code should
    override it so that kdump can work appropriately. This patch only
    provides an x86-specific version.
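
    The weak-default pattern can be tried in user space with the GCC/Clang
    weak attribute. This is a sketch, not the kernel implementation:
    stop_calls and panic_stop_cpus() are hypothetical, and smp_send_stop()
    is reduced to a counter.

```c
static int stop_calls;   /* counts generic stop requests in this model */

/* Stand-in for the generic "stop all other CPUs" routine. */
static void smp_send_stop(void)
{
	stop_calls++;
}

/*
 * Weak default: it just calls smp_send_stop(). If another object file
 * provides a strong crash_smp_send_stop(), the linker picks that one,
 * which is how an architecture would override it.
 */
__attribute__((weak)) void crash_smp_send_stop(void)
{
	smp_send_stop();
}

/* Hypothetical helper modelling the choice made on the panic path. */
static void panic_stop_cpus(int crash_kexec_post_notifiers)
{
	if (!crash_kexec_post_notifiers)
		smp_send_stop();
	else
		crash_smp_send_stop();
}
```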

    For Xen's PV kernel, just keep the current behavior.

    NOTES:

    - The right solution would be to place crash_smp_send_stop() before the
    __crash_kexec() invocation in all cases and remove smp_send_stop(), but
    we can't do that until all architectures implement their own
    crash_smp_send_stop()

    - crash_smp_send_stop()-like work is still needed by
    machine_crash_shutdown() because crash_kexec() can be called without
    entering panic()

    Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
    Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jp
    Signed-off-by: Hidehiro Kawai
    Reported-by: Daniel Walker
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc: Eric Biederman
    Cc: Masami Hiramatsu
    Cc: Daniel Walker
    Cc: Xunlei Pang
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Cc: David Vrabel
    Cc: Toshi Kani
    Cc: Ralf Baechle
    Cc: David Daney
    Cc: Aaro Koskinen
    Cc: "Steven J. Hill"
    Cc: Corey Minyard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

03 Aug, 2016

1 commit

  • crash_kexec_post_notifiers is a boot option which controls whether the
    1st kernel calls panic notifiers or not before booting the 2nd kernel.
    However, there is no need to limit it to being modifiable only at boot
    time. So, use core_param instead of early_param.

    Link: http://lkml.kernel.org/r/20160705113327.5864.43139.stgit@softrs
    Signed-off-by: Hidehiro Kawai
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc: Eric Biederman
    Cc: Masami Hiramatsu
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

21 May, 2016

1 commit

  • In NMI context, printk() messages are stored into per-CPU buffers to
    avoid a possible deadlock. They are normally flushed to the main ring
    buffer via an IRQ work. But the work is never called when the system
    calls panic() in the very same NMI handler.

    This patch tries to flush NMI buffers before the crash dump is
    generated. In this case it does not risk a double release and bails out
    when the logbuf_lock is already taken. The aim is to get the messages
    into the main ring buffer when possible. It makes them more
    accessible in the vmcore.

    Then the patch tries to flush the buffers a second time when other CPUs
    are down. It might be more aggressive and reset logbuf_lock. The aim
    is to make the messages available for the subsequent kmsg_dump() and
    console_flush_on_panic() calls.

    The patch causes vprintk_emit() to be called even in NMI context again.
    But it is done via printk_deferred() so that the console handling is
    skipped. Consoles use internal locks and we could not prevent a
    deadlock easily. They are explicitly called later when the crash dump
    is not generated, see console_flush_on_panic().

    Signed-off-by: Petr Mladek
    Cc: Benjamin Herrenschmidt
    Cc: Daniel Thompson
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Martin Schwidefsky
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Russell King
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     

23 Mar, 2016

1 commit

  • Commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to panic
    on NMI") and commit 58c5661f2144 ("panic, x86: Allow CPUs to save
    registers even if looping in NMI context") introduced nmi_panic() which
    prevents concurrent/recursive execution of panic(). It also saves
    registers for the crash dump on x86.

    However, there are some cases where NMI handlers still use panic().
    This patch set partially replaces them with nmi_panic() in those cases.

    Even after this patchset is applied, some NMI or similar handlers (e.g.
    the MCE handler) continue to use panic(). This is because I can't test
    them well and actual problems are unlikely to occur. For example, the
    possibility that a normal panic and a panic on MCE happen simultaneously
    is very low.

    This patch (of 3):

    Convert nmi_panic() to a proper function and export it instead of
    exporting internal implementation details to modules, for obvious
    reasons.

    Signed-off-by: Hidehiro Kawai
    Acked-by: Borislav Petkov
    Acked-by: Michal Nazarewicz
    Cc: Michal Hocko
    Cc: Rasmus Villemoes
    Cc: Nicolas Iooss
    Cc: Javi Merino
    Cc: Gobinda Charan Maji
    Cc: "Steven Rostedt (Red Hat)"
    Cc: Thomas Gleixner
    Cc: Vitaly Kuznetsov
    Cc: HATAYAMA Daisuke
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hidehiro Kawai
     

18 Mar, 2016

1 commit

  • The traceoff_on_warning option doesn't have any effect on s390, powerpc,
    arm64, parisc, and sh because there are two different types of WARN
    implementations:

    1) The above mentioned architectures treat WARN() as a special case of a
    BUG() exception. They handle warnings in report_bug() in lib/bug.c.

    2) All other architectures just call warn_slowpath_*() directly. Their
    warnings are handled in warn_slowpath_common() in kernel/panic.c.

    Support traceoff_on_warning on all architectures and prevent any future
    divergence by using a single common function to emit the warning.

    Also remove the '()' from '%pS()', because the parentheses look funky:

    [ 45.607629] WARNING: at /root/warn_mod/warn_mod.c:17 .init_dummy+0x20/0x40 [warn_mod]()

    Reported-by: Chunyu Hu
    Signed-off-by: Josh Poimboeuf
    Acked-by: Heiko Carstens
    Tested-by: Prarit Bhargava
    Acked-by: Prarit Bhargava
    Acked-by: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Poimboeuf
     

17 Jan, 2016

1 commit

  • @console_may_schedule tracks whether console_sem was acquired through
    lock or trylock. If the former, we're inside a sleepable context and
    console_conditional_schedule() performs cond_resched(). This allows
    console drivers which use console_lock for synchronization to yield
    while performing time-consuming operations such as scrolling.

    However, the actual console outputting is performed while holding
    irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
    before starting outputting lines. Also, only a few drivers call
    console_conditional_schedule() to begin with. This means that when a
    lot of lines need to be output by console_unlock(), for example on a
    console registration, the task doing console_unlock() may not yield for
    a long time on a non-preemptible kernel.

    If this happens with a slow console device, for example a serial
    console, the outputting task may occupy the CPU for a very long time,
    long enough to trigger softlockup and/or RCU stall warnings, which in
    turn pile on more messages, sometimes enough to trigger the next cycle of
    warnings, incapacitating the system.

    Fix it by making console_unlock() insert cond_resched() between lines if
    @console_may_schedule.
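
    The shape of the fix can be sketched as a loop that offers a scheduling
    point after each line. This is a user-space model under stated
    assumptions: cond_resched() and emit_line() are stubs, and
    console_unlock_model() and resched_points are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

static int resched_points;   /* counts simulated cond_resched() calls */

/* Stub: in the kernel this may yield the CPU to another task. */
static void cond_resched(void)
{
	resched_points++;
}

/* Stub for pushing one line to a (possibly slow) console. */
static void emit_line(const char *line)
{
	(void)line;
}

/* Print all pending lines, yielding between them when that is allowed. */
static void console_unlock_model(const char *const *lines, size_t n,
				 bool console_may_schedule)
{
	for (size_t i = 0; i < n; i++) {
		emit_line(lines[i]);
		if (console_may_schedule)
			cond_resched();
	}
}
```

    When console_may_schedule is false (the trylock case), the loop behaves
    as before and never yields.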

    Signed-off-by: Tejun Heo
    Reported-by: Calvin Owens
    Acked-by: Jan Kara
    Cc: Dave Jones
    Cc: Kyle McMartin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

19 Dec, 2015

3 commits

  • Currently, panic() and crash_kexec() can be called at the same time.
    For example (x86 case):

    CPU 0:
      oops_end()
        crash_kexec()
          mutex_trylock() // acquired
            nmi_shootdown_cpus() // stop other CPUs

    CPU 1:
      panic()
        crash_kexec()
          mutex_trylock() // failed to acquire
        smp_send_stop() // stop other CPUs
        infinite loop

    If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump
    fails.

    In another case:

    CPU 0:
      oops_end()
        crash_kexec()
          mutex_trylock() // acquired

            io_check_error()
              panic()
                crash_kexec()
                  mutex_trylock() // failed to acquire
                infinite loop

    Clearly, this is an undesirable result.

    To fix this problem, this patch changes crash_kexec() to exclude others
    by using the panic_cpu atomic.
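
    One way to picture the exclusion is a compare-and-swap on a shared owner
    variable. The sketch below uses C11 atomics in user space and is an
    assumption about the general shape, not the patch itself:
    do_crash_dump(), dumps_started and the release step are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PANIC_CPU_INVALID (-1)

/* Which CPU currently owns the panic/crash path, if any. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;
static int dumps_started;

/* Stub for the work that must only ever run on one CPU. */
static void do_crash_dump(void)
{
	dumps_started++;
}

/* Returns true if this CPU won ownership and ran the dump. */
static bool crash_kexec(int this_cpu)
{
	int expected = PANIC_CPU_INVALID;

	/* Only one caller can swing panic_cpu from INVALID to its own id. */
	if (!atomic_compare_exchange_strong(&panic_cpu, &expected, this_cpu))
		return false;   /* panic() or another crash_kexec() owns it */

	do_crash_dump();

	/* In this model the dump returns, so give ownership back. */
	atomic_store(&panic_cpu, PANIC_CPU_INVALID);
	return true;
}
```

    Unlike a mutex_trylock() that panic() does not know about, a single
    shared owner variable lets panic() and crash_kexec() exclude each other.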

    Signed-off-by: Hidehiro Kawai
    Acked-by: Michal Hocko
    Cc: Andrew Morton
    Cc: Baoquan He
    Cc: Dave Young
    Cc: "Eric W. Biederman"
    Cc: HATAYAMA Daisuke
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Jonathan Corbet
    Cc: kexec@lists.infradead.org
    Cc: linux-doc@vger.kernel.org
    Cc: Martin Schwidefsky
    Cc: Masami Hiramatsu
    Cc: Minfei Huang
    Cc: Peter Zijlstra
    Cc: Seth Jennings
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Vitaly Kuznetsov
    Cc: Vivek Goyal
    Cc: x86-ml
    Link: http://lkml.kernel.org/r/20151210014630.25437.94161.stgit@softrs
    Signed-off-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner

    Hidehiro Kawai
     
  • Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
    sends an NMI IPI to CPUs which haven't called panic() to stop them,
    save their register information and do some cleanups for crash dumping.
    However, if such a CPU is infinitely looping in NMI context, we fail to
    save its register information into the crash dump.

    For example, this can happen when unknown NMIs are broadcast to all
    CPUs as follows:

    CPU 0                               CPU 1
    ===========================         ==========================
    receive an unknown NMI
    unknown_nmi_error()
      panic()                           receive an unknown NMI
        spin_trylock(&panic_lock)       unknown_nmi_error()
        crash_kexec()                     panic()
                                            spin_trylock(&panic_lock)
                                            panic_smp_self_stop()
                                              infinite loop
          kdump_nmi_shootdown_cpus()
            issue NMI IPI ------------> blocked until IRET
                                              infinite loop...

    Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
    blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
    so the NMI is not handled and the callback function to save registers is
    never called.

    In practice, this can happen on some servers which broadcast NMIs to all
    CPUs when the NMI button is pushed.

    To save registers in this case, we need to:

    a) Return from NMI handler instead of looping infinitely
    or
    b) Call the callback function directly from the infinite loop

    Inherently, a) is risky because NMI is also used to prevent corrupted
    data from being propagated to devices. So, we chose b).

    This patch does the following:

    1. Move the infinite looping of CPUs which haven't called panic() in NMI
    context (actually done by panic_smp_self_stop()) outside of panic() to
    enable us to refer to pt_regs. Please note that panic_smp_self_stop() is
    still used for normal context.

    2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
    registers and do some cleanups after setting waiting_for_crash_ipi, which
    is used for counting down the number of CPUs which handled the callback.

    Signed-off-by: Hidehiro Kawai
    Acked-by: Michal Hocko
    Cc: Aaron Tomlin
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Chris Metcalf
    Cc: Dave Young
    Cc: David Hildenbrand
    Cc: Don Zickus
    Cc: Eric Biederman
    Cc: Frederic Weisbecker
    Cc: Gobinda Charan Maji
    Cc: HATAYAMA Daisuke
    Cc: Hidehiro Kawai
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Javi Merino
    Cc: Jiang Liu
    Cc: Jonathan Corbet
    Cc: kexec@lists.infradead.org
    Cc: linux-doc@vger.kernel.org
    Cc: lkml
    Cc: Masami Hiramatsu
    Cc: Michal Nazarewicz
    Cc: Nicolas Iooss
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Prarit Bhargava
    Cc: Rasmus Villemoes
    Cc: Seth Jennings
    Cc: Stefan Lippers-Hollmann
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Ulrich Obergfell
    Cc: Vitaly Kuznetsov
    Cc: Vivek Goyal
    Cc: Yasuaki Ishimatsu
    Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
    [ Cleanup comments, fixup formatting. ]
    Signed-off-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner

    Hidehiro Kawai
     
  • If a panic on NMI happens just after panic() on the same CPU, panic() is
    called recursively. As a result, the kernel stalls after failing to
    acquire panic_lock.

    To avoid this problem, don't call panic() in NMI context if we've
    already entered panic().

    For that, introduce nmi_panic() macro to reduce code duplication. In
    the case of panic on NMI, don't return from NMI handlers if another CPU
    already panicked.
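
    The three outcomes described above can be sketched as a decision
    function. This is a user-space model using C11 atomics: the enum, the
    function name nmi_panic_action() and the explicit panic_cpu parameter
    are hypothetical, and the real code is a macro that acts rather than
    returning a value.

```c
#include <stdatomic.h>

#define PANIC_CPU_INVALID (-1)

enum nmi_action {
	NMI_DO_PANIC,    /* nobody panicked yet: this CPU calls panic() */
	NMI_RETURN,      /* this CPU is already in panic(): do not recurse */
	NMI_SELF_STOP    /* another CPU panicked: do not return, stop here */
};

/* Decide what an NMI handler that wants to panic should do. */
static enum nmi_action nmi_panic_action(atomic_int *panic_cpu, int this_cpu)
{
	int old_cpu = PANIC_CPU_INVALID;

	/* On failure, old_cpu is updated to the current owner. */
	if (atomic_compare_exchange_strong(panic_cpu, &old_cpu, this_cpu))
		return NMI_DO_PANIC;
	if (old_cpu == this_cpu)
		return NMI_RETURN;
	return NMI_SELF_STOP;
}
```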

    Signed-off-by: Hidehiro Kawai
    Acked-by: Michal Hocko
    Cc: Aaron Tomlin
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Chris Metcalf
    Cc: David Hildenbrand
    Cc: Don Zickus
    Cc: "Eric W. Biederman"
    Cc: Frederic Weisbecker
    Cc: Gobinda Charan Maji
    Cc: HATAYAMA Daisuke
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Javi Merino
    Cc: Jonathan Corbet
    Cc: kexec@lists.infradead.org
    Cc: linux-doc@vger.kernel.org
    Cc: lkml
    Cc: Masami Hiramatsu
    Cc: Michal Nazarewicz
    Cc: Nicolas Iooss
    Cc: Peter Zijlstra
    Cc: Prarit Bhargava
    Cc: Rasmus Villemoes
    Cc: Rusty Russell
    Cc: Seth Jennings
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Ulrich Obergfell
    Cc: Vitaly Kuznetsov
    Cc: Vivek Goyal
    Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
    [ Cleanup comments, fixup formatting. ]
    Signed-off-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner

    Hidehiro Kawai
     

21 Nov, 2015

1 commit

  • Commit 08d78658f393 ("panic: release stale console lock to always get the
    logbuf printed out") introduced an unwanted bad unlock balance report when
    panic() is called directly and not from OOPS (e.g. from out_of_memory()).
    The difference is that in the OOPS case lock debugging is disabled in
    oops_enter(), whereas on a direct panic() call nobody does that.

    Fixes: 08d78658f393 ("panic: release stale console lock to always get the logbuf printed out")
    Reported-by: kernel test robot
    Signed-off-by: Vitaly Kuznetsov
    Cc: HATAYAMA Daisuke
    Cc: Masami Hiramatsu
    Cc: Jiri Kosina
    Cc: Baoquan He
    Cc: Prarit Bhargava
    Cc: Xie XiuQi
    Cc: Seth Jennings
    Cc: "K. Y. Srinivasan"
    Cc: Jan Kara
    Cc: Petr Mladek
    Cc: Yasuaki Ishimatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Kuznetsov
     

07 Nov, 2015

1 commit

  • In some cases we may end up killing the CPU holding the console lock
    while still having valuable data in logbuf. E.g. I'm observing the
    following:

    - A crash is happening on one CPU and console_unlock() is being called on
    some other.

    - console_unlock() tries to print out the buffer before releasing the lock
    and on a slow console this takes time.

    - in the meantime the crashing CPU does lots of printk()-s with valuable
    data (which go to the logbuf) and sends IPIs to all other CPUs.

    - console_unlock() finishes printing the previous chunk and enables
    interrupts before trying to print out the rest; the CPU catches the IPI
    and never releases the console lock.

    This is not the only possible case: in VT/fb subsystems we have many other
    console_lock()/console_unlock() users. Non-masked interrupts (or
    receiving NMI in case of extreme slowness) will have the same result.
    Getting the whole console buffer printed out on crash should be top
    priority.

    [akpm@linux-foundation.org: tweak comment text]
    Signed-off-by: Vitaly Kuznetsov
    Cc: HATAYAMA Daisuke
    Cc: Masami Hiramatsu
    Cc: Jiri Kosina
    Cc: Baoquan He
    Cc: Prarit Bhargava
    Cc: Xie XiuQi
    Cc: Seth Jennings
    Cc: "K. Y. Srinivasan"
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Kuznetsov
     

01 Jul, 2015

2 commits

  • Commit f06e5153f4ae2e ("kernel/panic.c: add "crash_kexec_post_notifiers"
    option for kdump after panic_notifers") introduced the
    "crash_kexec_post_notifiers" kernel boot option, which toggles whether
    panic() calls crash_kexec() before or after running panic_notifiers and
    dumping kmsg.

    The problem is that the commit overlooks the panic_on_oops kernel boot
    option. If it is enabled, crash_kexec() is called directly without going
    through panic() in the oops path.

    To fix this issue, this patch adds a check to "crash_kexec_post_notifiers"
    in the condition of kexec_should_crash().
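
    The added check can be sketched as a predicate. This is a user-space
    model: struct oops_ctx and its fields are hypothetical stand-ins for the
    kernel state consulted on the oops path, and the conditions other than
    panic_on_oops and crash_kexec_post_notifiers are assumptions not stated
    in this changelog.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state consulted on the oops path. */
struct oops_ctx {
	bool in_interrupt;
	bool is_idle_task;
	bool is_global_init;
	bool panic_on_oops;
};

static bool crash_kexec_post_notifiers;

/* Should the oops path call crash_kexec() right away? */
static bool kexec_should_crash(const struct oops_ctx *c)
{
	/*
	 * With crash_kexec_post_notifiers set, crash_kexec() must wait
	 * until panic() has run the notifiers, so never do it here.
	 */
	if (crash_kexec_post_notifiers)
		return false;

	return c->in_interrupt || c->is_idle_task ||
	       c->is_global_init || c->panic_on_oops;
}
```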

    Also, put a comment in kexec_should_crash() to explain the non-obvious
    aspects of this patch.

    Signed-off-by: HATAYAMA Daisuke
    Acked-by: Baoquan He
    Tested-by: Hidehiro Kawai
    Reviewed-by: Masami Hiramatsu
    Cc: Vivek Goyal
    Cc: Ingo Molnar
    Cc: Hidehiro Kawai
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    HATAYAMA Daisuke
     
  • For compatibility with the behaviour before the commit f06e5153f4ae2e
    ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after
    panic_notifers"), the 2nd crash_kexec() should be called only if
    crash_kexec_post_notifiers is enabled.

    Note that crash_kexec() returns immediately if kdump crash kernel is not
    loaded, so in this case, this patch makes no functionality change, but the
    point is to make it explicit, from the caller panic() side, that the 2nd
    crash_kexec() does nothing.
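
A rough model of the two crash_kexec() call sites in panic() may make the ordering easier to follow. The names below (enum panic_step, model_panic_steps()) are invented for illustration, and the model assumes crash_kexec() never returns once a crash kernel is loaded.

```c
#include <stdbool.h>
#include <stddef.h>

enum panic_step { STEP_CRASH_KEXEC, STEP_NOTIFIERS, STEP_KMSG_DUMP };

/*
 * Records the steps a modelled panic() performs into out[] (room for
 * three entries) and returns how many were recorded. STEP_CRASH_KEXEC
 * is recorded only when a crash kernel is loaded, because otherwise
 * crash_kexec() returns immediately and does nothing.
 */
size_t model_panic_steps(bool post_notifiers, bool crash_loaded,
                         enum panic_step out[3])
{
    size_t n = 0;

    /* 1st call site: only when the option is disabled. */
    if (!post_notifiers && crash_loaded) {
        out[n++] = STEP_CRASH_KEXEC;
        return n; /* the crash kernel takes over, nothing else runs */
    }

    out[n++] = STEP_NOTIFIERS;
    out[n++] = STEP_KMSG_DUMP;

    /* 2nd call site: only when the option is enabled. */
    if (post_notifiers && crash_loaded)
        out[n++] = STEP_CRASH_KEXEC;

    return n;
}
```

In this model exactly one of the two call sites is live for any setting of the option, which is the explicitness the patch is after.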

    Signed-off-by: HATAYAMA Daisuke
    Suggested-by: Ingo Molnar
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Cc: Masami Hiramatsu
    Cc: Hidehiro Kawai
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    HATAYAMA Daisuke
     

22 Dec, 2014

1 commit

  • This adds a new taint flag to indicate when the kernel or a kernel
    module has been live patched. This will provide a clean indication in
    bug reports that live patching was used.

    Additionally, if the crash occurs in a live patched function, the live
    patch module will appear beside the patched function in the backtrace.
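
As an illustration of how a taint flag ends up as a letter in a bug report, here is a simplified user-space sketch. The bit numbers and letters in taint_table are assumptions based on the taint flags mentioned in this log (they are not quoted from the patch), and model_taint_string() is a hypothetical helper. Unlike the kernel's own formatting, it emits only the letters of flags that are set.

```c
#include <stddef.h>

struct taint_entry {
    unsigned bit; /* bit position in the taint mask */
    char letter;  /* letter shown in the "Tainted:" line */
};

/* Assumed subset of taint flags; values are illustrative. */
static const struct taint_entry taint_table[] = {
    {  9, 'W' }, /* TAINT_WARN */
    { 12, 'O' }, /* TAINT_OOT_MODULE */
    { 13, 'E' }, /* TAINT_UNSIGNED_MODULE */
    { 14, 'L' }, /* TAINT_SOFTLOCKUP */
    { 15, 'K' }, /* TAINT_LIVEPATCH */
};

/*
 * Writes the letters of all set flags into buf (NUL-terminated when
 * len > 0) and returns the number of letters written.
 */
size_t model_taint_string(unsigned long taints, char *buf, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < sizeof(taint_table) / sizeof(taint_table[0]); i++) {
        if ((taints & (1UL << taint_table[i].bit)) && n + 1 < len)
            buf[n++] = taint_table[i].letter;
    }
    if (len > 0)
        buf[n] = '\0';
    return n;
}
```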

    Signed-off-by: Seth Jennings
    Acked-by: Josh Poimboeuf
    Reviewed-by: Miroslav Benes
    Reviewed-by: Petr Mladek
    Reviewed-by: Masami Hiramatsu
    Signed-off-by: Jiri Kosina

    Seth Jennings
     

11 Dec, 2014

1 commit

  • There have been several times where I have had to rebuild a kernel to
    cause a panic when hitting a WARN() in the code in order to get a crash
    dump from a system. Sometimes this is easy to do, other times (such as
    in the case of a remote admin) it is not trivial to send new images to
    the user.

    A much easier method would be a switch to change the WARN() over to a
    panic. This makes debugging easier in that I can now test the actual
    image the WARN() was seen on and I do not have to engage in remote
    debugging.

    This patch adds a panic_on_warn kernel parameter and a
    /proc/sys/kernel/panic_on_warn sysctl that make the
    warn_slowpath_common() path call panic(). The function will still print
    out the location of the warning.
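
The behaviour described above can be modelled in a few lines of user-space C. This is only a sketch: model_warn() and enum warn_outcome are hypothetical, and the message format is a shortened imitation of the real WARNING line.

```c
#include <stdbool.h>
#include <stdio.h>

enum warn_outcome { WARN_CONTINUE, WARN_PANIC };

/*
 * Models the warning slow path: the location is always formatted
 * first, and only afterwards does panic_on_warn decide whether the
 * caller should go on to panic.
 */
enum warn_outcome model_warn(const char *file, int line, bool panic_on_warn,
                             char *msg, size_t len)
{
    snprintf(msg, len, "WARNING: at %s:%d", file, line);
    return panic_on_warn ? WARN_PANIC : WARN_CONTINUE;
}
```

The point of the ordering is that the warning's location is still reported even when the switch turns the warning into a panic.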

    An example of the panic_on_warn output:

    The first line below is from the WARN_ON() to output the WARN_ON()'s
    location. After that the panic() output is displayed.

    WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 30 PID: 11698 Comm: insmod Tainted: G W OE 3.17.0+ #57
    Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
    0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
    0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
    ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
    Call Trace:
    [] dump_stack+0x46/0x58
    [] panic+0xd0/0x204
    [] ? init_dummy+0x1f/0x30 [dummy_module]
    [] warn_slowpath_common+0xd0/0xd0
    [] ? dummy_greetings+0x40/0x40 [dummy_module]
    [] warn_slowpath_null+0x1a/0x20
    [] init_dummy+0x1f/0x30 [dummy_module]
    [] do_one_initcall+0xd4/0x210
    [] ? __vunmap+0xc2/0x110
    [] load_module+0x16a9/0x1b30
    [] ? store_uevent+0x70/0x70
    [] ? copy_module_from_fd.isra.44+0x129/0x180
    [] SyS_finit_module+0xa6/0xd0
    [] system_call_fastpath+0x12/0x17

    Successfully tested by me.

    hpa said: There is another very valid use for this: many operators would
    rather a machine shuts down than being potentially compromised either
    functionally or security-wise.

    Signed-off-by: Prarit Bhargava
    Cc: Jonathan Corbet
    Cc: Rusty Russell
    Cc: "H. Peter Anvin"
    Cc: Andi Kleen
    Cc: Masami Hiramatsu
    Acked-by: Yasuaki Ishimatsu
    Cc: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prarit Bhargava
     

14 Nov, 2014

1 commit


09 Aug, 2014

1 commit

  • This taint flag will be set if the system has ever entered a softlockup
    state. Similar to TAINT_WARN it is useful to know whether or not the
    system has been in a softlockup state when debugging.

    [akpm@linux-foundation.org: apply the taint before calling panic()]
    Signed-off-by: Josh Hunt
    Cc: Jason Baron
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Hunt
     

07 Jun, 2014

1 commit

  • Add a "crash_kexec_post_notifiers" boot option to run kdump after
    running panic_notifiers and dumping kmsg. This can help in rare
    situations where kdump fails because of an unstable crashed kernel or a
    hardware failure (memory corruption on critical data/code), or where the
    2nd kernel has already been broken by the 1st kernel (it's a broken
    behavior, but who can guarantee that the "crashed" kernel works
    correctly?).

    Usage: add "crash_kexec_post_notifiers" to the kernel boot options.

    Note that this actually increases the risk of kdump failure. This
    option should be set only if you are more concerned about the rare case
    of kdump failure described above than about maximizing kdump's overall
    chance of success.

    Signed-off-by: Masami Hiramatsu
    Acked-by: Motohiro Kosaki
    Acked-by: Vivek Goyal
    Cc: Eric Biederman
    Cc: Yoshihiro YUNOMAE
    Cc: Satoru MORIYA
    Cc: Tomoki Sekiyama
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masami Hiramatsu
     

08 Apr, 2014

1 commit

  • Currently, booting without initrd specified on 80x25 screen gives a call
    trace followed by atkbd : Spurious ACK. Original message ("VFS: Unable
    to mount root fs") is not available. Of course this could happen in
    other situations...

    This patch displays the panic reason after the call trace, which could
    help a lot of people even if it's not the very last line on screen.

    Also, convert all panic.c printk(KERN_EMERG to pr_emerg(

    [akpm@linux-foundation.org: missed a couple of pr_ conversions]
    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

07 Apr, 2014

1 commit

  • Pull module updates from Rusty Russell:
    "Nothing major: the stricter permissions checking for sysfs broke a
    staging driver; fix included. Greg KH said he'd take the patch but
    hadn't as the merge window opened, so it's included here to avoid
    breaking build"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    staging: fix up speakup kobject mode
    Use 'E' instead of 'X' for unsigned module taint flag.
    VERIFY_OCTAL_PERMISSIONS: stricter checking for sysfs perms.
    kallsyms: fix percpu vars on x86-64 with relocation.
    kallsyms: generalize address range checking
    module: LLVMLinux: Remove unused function warning from __param_check macro
    Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
    module: remove MODULE_GENERIC_TABLE
    module: allow multiple calls to MODULE_DEVICE_TABLE() per module
    module: use pr_cont

    Linus Torvalds
     

01 Apr, 2014

1 commit

  • Pull x86 LTO changes from Peter Anvin:
    "More infrastructure work in preparation for link-time optimization
    (LTO). Most of these changes are to make sure symbols accessed from
    assembly code are properly marked as visible so the linker doesn't
    remove them.

    My understanding is that the changes to support LTO are still not
    upstream in binutils, but are on the way there. This patchset should
    conclude the x86-specific changes, and remaining patches to actually
    enable LTO will be fed through the Kbuild tree (other than keeping up
    with changes to the x86 code base, of course), although not
    necessarily in this merge window"

    * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
    Kbuild, lto: Handle basic LTO in modpost
    Kbuild, lto: Disable LTO for asm-offsets.c
    Kbuild, lto: Add a gcc-ld script to let run gcc as ld
    Kbuild, lto: add ld-version and ld-ifversion macros
    Kbuild, lto: Drop .number postfixes in modpost
    Kbuild, lto, workaround: Don't warn for initcall_reference in modpost
    lto: Disable LTO for sys_ni
    lto: Handle LTO common symbols in module loader
    lto, workaround: Add workaround for initcall reordering
    lto: Make asmlinkage __visible
    x86, lto: Disable LTO for the x86 VDSO
    initconst, x86: Fix initconst mistake in ts5500 code
    initconst: Fix initconst mistake in dcdbas
    asmlinkage: Make trace_hardirqs_on/off_caller visible
    asmlinkage, x86: Fix 32bit memcpy for LTO
    asmlinkage Make __stack_chk_failed and memcmp visible
    asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage
    asmlinkage: Make main_extable_sort_needed visible
    asmlinkage, mutex: Mark __visible
    asmlinkage: Make trace_hardirq visible
    ...

    Linus Torvalds
     

31 Mar, 2014

1 commit

  • Takashi Iwai says:
    > The letter 'X' has been already used for SUSE kernels for very long
    > time, to indicate the external supported modules. Can the new flag be
    > changed to another letter for avoiding conflict...?
    > (BTW, we also use 'N' for "no support", too.)

    Note: this code should be cleaned up, so we don't have such maps in
    three places!

    Signed-off-by: Rusty Russell

    Rusty Russell
     

21 Mar, 2014

1 commit


13 Mar, 2014

1 commit

  • Users have reported being unable to trace non-signed modules loaded
    within a kernel supporting module signature.

    This is caused by tracepoint.c:tracepoint_module_coming() refusing to
    take into account tracepoints sitting within force-loaded modules
    (TAINT_FORCED_MODULE). The reason for this check, in the first place, is
    that a force-loaded module may have a struct module incompatible with
    the layout expected by the kernel, and can thus cause a kernel crash
    upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.

    Tracepoints, however, specifically accept TAINT_OOT_MODULE and
    TAINT_CRAP, since those modules do not lead to the "very likely system
    crash" issue cited above for force-loaded modules.

    With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
    module is tainted re-using the TAINT_FORCED_MODULE taint flag.
    Unfortunately, this means that Tracepoints treat that module as a
    force-loaded module, and thus silently refuse to consider any tracepoint
    within this module.

    Since an unsigned module does not fit within the "very likely system
    crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
    to specifically address this taint behavior, and accept those modules
    within Tracepoints. We use the letter 'X' as a taint flag character for
    a module being loaded that doesn't know how to sign its name (proposed
    by Steven Rostedt).

    Also add the missing 'O' entry to trace event show_module_flags() list
    for the sake of completeness.
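
The acceptance rule described above is essentially a bitmask test. The sketch below is a user-space approximation: the bit positions in the enum are assumptions for illustration, and model_tracepoints_accept() is a hypothetical name, not the function in tracepoint.c.

```c
#include <stdbool.h>

/* Assumed bit positions of the taint flags involved. */
enum {
    TAINT_FORCED_MODULE   = 1,
    TAINT_CRAP            = 10,
    TAINT_OOT_MODULE      = 12,
    TAINT_UNSIGNED_MODULE = 13,
};

/*
 * A module's tracepoints are considered only if every taint bit it
 * carries is in the accepted set. Any other bit, such as
 * TAINT_FORCED_MODULE, makes the module's tracepoints be ignored.
 */
bool model_tracepoints_accept(unsigned long taints)
{
    const unsigned long accepted = (1UL << TAINT_CRAP) |
                                   (1UL << TAINT_OOT_MODULE) |
                                   (1UL << TAINT_UNSIGNED_MODULE);

    return (taints & ~accepted) == 0;
}
```

Before the new flag existed, an unsigned module carried the forced-module bit and so failed this test; with its own bit in the accepted set it passes.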

    Signed-off-by: Mathieu Desnoyers
    Acked-by: Steven Rostedt
    NAKed-by: Ingo Molnar
    CC: Thomas Gleixner
    CC: David Howells
    CC: Greg Kroah-Hartman
    Signed-off-by: Rusty Russell

    Mathieu Desnoyers
     

14 Feb, 2014

1 commit

  • In LTO, symbols implicitly referenced by the compiler need
    to be visible. Earlier these symbols were visible implicitly
    from being exported, but we disabled implicit visibility for
    EXPORTs when modules are disabled to improve code size. So
    now these symbols have to be marked visible explicitly.

    Do this for __stack_chk_fail (with stack protector)
    and memcmp.

    Signed-off-by: Andi Kleen
    Link: http://lkml.kernel.org/r/1391845930-28580-10-git-send-email-ak@linux.intel.com
    Signed-off-by: H. Peter Anvin

    Andi Kleen
     

26 Nov, 2013

1 commit

  • The panic_timeout value can be set via the command line option
    'panic=x', or via /proc/sys/kernel/panic, however that is not
    sufficient when the panic occurs before we are able to set up
    these values. Thus, add a CONFIG_PANIC_TIMEOUT so that we can
    set the desired value from the .config.

    The default panic_timeout value continues to be 0 - wait
    forever. Also adds set_arch_panic_timeout(new_timeout,
    arch_default_timeout), which is intended to be used by arches in
    arch_setup(). The idea is that new_timeout is only applied if
    the user hasn't changed panic_timeout from arch_default_timeout.
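
The "only if unchanged" rule can be sketched as follows. This is a user-space model with hypothetical names (model_panic_timeout, model_set_arch_panic_timeout()); in the kernel the global would be seeded from CONFIG_PANIC_TIMEOUT or the 'panic=x' option.

```c
/* Stand-in for the kernel's panic_timeout global; 0 means wait forever. */
int model_panic_timeout = 0;

/*
 * Applies the architecture's preferred timeout only when the current
 * value still equals the architecture default, i.e. when the user has
 * not overridden it.
 */
void model_set_arch_panic_timeout(int new_timeout, int arch_default_timeout)
{
    if (model_panic_timeout == arch_default_timeout)
        model_panic_timeout = new_timeout;
}
```

A limitation of this comparison-based approach is that a user who explicitly sets the value to the architecture default cannot be told apart from one who set nothing.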

    Signed-off-by: Jason Baron
    Cc: benh@kernel.crashing.org
    Cc: paulus@samba.org
    Cc: ralf@linux-mips.org
    Cc: mpe@ellerman.id.au
    Cc: felipe.contreras@gmail.com
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1a1674daec27c534df409697025ac568ebcee91e.1385418410.git.jbaron@akamai.com
    Signed-off-by: Ingo Molnar

    Jason Baron
     

13 Nov, 2013

1 commit


12 Sep, 2013

1 commit

  • Since the panic handlers may produce additional information (via printk)
    for the kernel log, it should be reported as part of the panic output
    saved by kmsg_dump(). Without this re-ordering, nothing that adds
    information to a panic will show up in pstore's view when kmsg_dump runs,
    and is therefore not visible to crash reporting tools that examine pstore
    output.

    Signed-off-by: Kees Cook
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Acked-by: Tony Luck
    Cc: Stephen Boyd
    Cc: Vikram Mulukutla
    Cc: Peter Zijlstra
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

12 Jul, 2013

1 commit

  • Pull tracing changes from Steven Rostedt:
    "The majority of the changes here are cleanups for the large changes
    that were added to 3.10, which includes several bug fixes that have
    been marked for stable.

    As for new features, there were a few, but nothing to write to LWN
    about. These include:

    New function triggers called "dump" and "cpudump" that will cause
    ftrace to dump its buffer to the console when the function is called.
    The difference between "dump" and "cpudump" is that "dump" will dump
    the entire contents of the ftrace buffer, whereas "cpudump" will only
    dump the contents of the ftrace buffer for the CPU that called the
    function.

    Another small enhancement is a new sysctl switch called
    "traceoff_on_warning" which, when enabled, will disable tracing if any
    WARN_ON() is triggered. This is useful if you want to debug what
    caused a warning and do not want to risk losing your trace data by the
    ring buffer overwriting the data before you can disable it. There's
    also a kernel command line option of the same name that enables this
    at boot up"

    * tag 'trace-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (34 commits)
    tracing: Make tracing_open_generic_{tr,tc}() static
    tracing: Remove ftrace() function
    tracing: Remove TRACE_EVENT_TYPE enum definition
    tracing: Make tracer_tracing_{off,on,is_on}() static
    tracing: Fix irqs-off tag display in syscall tracing
    uprobes: Fix return value in error handling path
    tracing: Fix race between deleting buffer and setting events
    tracing: Add trace_array_get/put() to event handling
    tracing: Get trace_array ref counts when accessing trace files
    tracing: Add trace_array_get/put() to handle instance refs better
    tracing: Protect ftrace_trace_arrays list in trace_events.c
    tracing: Make trace_marker use the correct per-instance buffer
    ftrace: Do not run selftest if command line parameter is set
    tracing/kprobes: Don't pass addr=ip to perf_trace_buf_submit()
    tracing: Use flag buffer_disabled for irqsoff tracer
    tracing/kprobes: Turn trace_probe->files into list_head
    tracing: Fix disabling of soft disable
    tracing: Add missing syscall_metadata comment
    tracing: Simplify code for showing of soft disabled flag
    tracing/kprobes: Kill probe_enable_lock
    ...

    Linus Torvalds
     

10 Jul, 2013

1 commit


20 Jun, 2013

1 commit

  • Add a traceoff_on_warning option in both the kernel command line as well
    as a sysctl option. When set, any WARN*() function that is hit will cause
    the tracing_on variable to be cleared, which disables writing to the
    ring buffer.

    This is especially useful when tracing a bug with function tracing. When
    a warning is hit, the print caused by the warning can flood the trace with
    the functions that produce the output for the warning. This can make the
    resulting trace useless by either hiding where the bug happened, or worse,
    by overflowing the buffer and losing the trace of the bug totally.
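
A tiny user-space model of the switch, assuming a single tracing_on flag as the commit text describes. struct model_tracer and model_on_warning() are hypothetical names used only for this sketch.

```c
#include <stdbool.h>

struct model_tracer {
    bool traceoff_on_warning; /* command line / sysctl option */
    bool tracing_on;          /* when false, ring buffer writes stop */
};

/*
 * Called from the modelled WARN*() path, before the warning itself is
 * printed, so that the functions producing the warning output do not
 * overwrite the trace that led up to the bug.
 */
void model_on_warning(struct model_tracer *t)
{
    if (t->traceoff_on_warning)
        t->tracing_on = false;
}
```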

    Acked-by: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)