01 Jun, 2020

1 commit

  • While doing some tracing, I found a huge portion of the per-cpu buffer
    was taken by printk/serial output because we're disabling the trace far
    too late (after printing the CUT string).

    Improve matters for architectures that have GENERIC_BUG + _BUG_FLAGS by
    killing the tracer in the exception handler before printing anything
    much.

    Link: https://lkml.kernel.org/r/20200528145240.GF706495@hirez.programming.kicks-ass.net

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Steven Rostedt (VMware)

    Peter Zijlstra
     

26 Sep, 2019

1 commit

  • The original clean up of "cut here" missed the WARN_ON() case (that does
    not have a printk message), which was fixed recently by adding an explicit
    printk of "cut here". This had the downside of adding a printk() to every
    WARN_ON() caller, which reduces the utility of using an instruction
    exception to streamline the resulting code. By making this a new BUGFLAG,
    all of these can be removed and "cut here" can be handled by the exception
    handler.
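
    As a rough userspace illustration of a flag-driven "cut here" (not the
    kernel code itself), the sketch below lets the handler emit the marker
    unless an entry opts out. The names sketch_format_warning(),
    SKETCH_CUT_HERE and SKETCH_FLAG_NO_CUT_HERE, the flag value, and the
    polarity of the flag are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_CUT_HERE "------------[ cut here ]------------\n"

/* Hypothetical flag: set by callers that already printed their own marker. */
#define SKETCH_FLAG_NO_CUT_HERE 0x4u

/*
 * Format the start of a warning report into buf. The marker is emitted
 * here, in the (simulated) handler, so call sites need no printk() of
 * their own. Returns the value of snprintf().
 */
static int sketch_format_warning(char *buf, size_t len, unsigned int flags,
                                 const char *file, unsigned int line)
{
    assert(buf != NULL && len > 0);
    const char *cut = (flags & SKETCH_FLAG_NO_CUT_HERE) ? "" : SKETCH_CUT_HERE;

    return snprintf(buf, len, "%sWARNING: at %s:%u\n", cut, file, line);
}
```

    With the decision made in one place, a plain WARN_ON()-style call site
    carries only the trapping instruction and its table entry.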

    This was very pronounced on PowerPC, but the effect can be seen on x86 as
    well. The resulting text size of a defconfig build shows some small
    savings from this patch:

        text    data     bss      dec     hex filename
    19691167 5134320 1646664 26472151 193eed7 vmlinux.before
    19676362 5134260 1663048 26473670 193f4c6 vmlinux.after

    This change also opens the door for creating something like BUG_MSG(),
    where a custom printk() could be issued before BUG(), without confusing
    the "cut here" line.

    Link: http://lkml.kernel.org/r/201908200943.601DD59DCE@keescook
    Fixes: 6b15f678fb7d ("include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures")
    Signed-off-by: Kees Cook
    Reported-by: Christophe Leroy
    Cc: Peter Zijlstra
    Cc: Christophe Leroy
    Cc: Drew Davenport
    Cc: Arnd Bergmann
    Cc: "Steven Rostedt (VMware)"
    Cc: Feng Tang
    Cc: Petr Mladek
    Cc: Mauro Carvalho Chehab
    Cc: Borislav Petkov
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

10 Mar, 2018

2 commits

  • Commit b8347c219649 ("x86/debug: Handle warnings before the notifier
    chain, to fix KGDB crash") changed the ordering of fixups, and did not
    take into account the case of x86 processing non-WARN() and non-BUG()
    exceptions. This would lead to output of a false BUG line with no other
    information.

    In the case of a refcount exception, it would be immediately followed by
    the refcount WARN(), producing very strange double-"cut here":

    lkdtm: attempting bad refcount_inc() overflow
    ------------[ cut here ]------------
    Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
    ------------[ cut here ]------------
    refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
    WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
    ...

    In the prior ordering, exceptions were searched first:

    do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
    ...
            if (fixup_exception(regs, trapnr))
                    return 0;

    -       if (fixup_bug(regs, trapnr))
    -               return 0;
    -

    As a result, fixup_bug()'s is_valid_bugaddr() didn't take into account
    needing to search the exception list first, since that had already
    happened.

    So, instead of searching the exception list twice (once in
    is_valid_bugaddr() and then again in fixup_exception()), just add a
    simple sanity check to report_bug() that will immediately bail out if a
    BUG() (or WARN()) entry is not found.

    Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
    Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
    Signed-off-by: Kees Cook
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Richard Weinberger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • The BUG and stack protector reports were still using a raw %p. This
    changes it to %pB for more meaningful output.

    Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast
    Fixes: ad67b74d2469 ("printk: hash addresses printed with %p")
    Signed-off-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Richard Weinberger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

18 Nov, 2017

2 commits

  • The "cut here" string is used in a few paths. Define it in a single
    place.

    Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Arnd Bergmann
    Cc: Fengguang Wu
    Cc: Ingo Molnar
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra (Intel)
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Some architectures store the WARN_ONCE state in the flags field of the
    bug_entry. Clear that one too when resetting once state through
    /sys/kernel/debug/clear_warn_once

    Pointed out by Michael Ellerman

    Improves the earlier patch that adds clear_warn_once.
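
    A minimal userspace sketch of the reset step, assuming the once state is
    a single "done" bit in each entry's flags field. The struct layout, the
    flag value and the sketch_ names are hypothetical; only the idea of
    clearing the bit across a table comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit recording that a once-only warning has already fired. */
#define SKETCH_FLAG_DONE 0x4u

struct sketch_bug_entry {
    unsigned int flags;
};

/* Clear the "done" bit on every entry in [start, end), keeping other bits. */
static void sketch_clear_once_table(struct sketch_bug_entry *start,
                                    struct sketch_bug_entry *end)
{
    assert(start <= end);
    for (struct sketch_bug_entry *bug = start; bug < end; bug++)
        bug->flags &= ~SKETCH_FLAG_DONE;
}
```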

    [ak@linux.intel.com: add a missing ifdef CONFIG_MODULES]
    Link: http://lkml.kernel.org/r/20171020170633.9593-1-andi@firstfloor.org
    [akpm@linux-foundation.org: fix unused var warning]
    [akpm@linux-foundation.org: Use 0200 for clear_warn_once file, per mpe]
    [akpm@linux-foundation.org: clear BUGFLAG_DONE in clear_once_table(), per mpe]
    Link: http://lkml.kernel.org/r/20171019204642.7404-1-andi@firstfloor.org
    Signed-off-by: Andi Kleen
    Tested-by: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

30 Mar, 2017

1 commit

  • Josh suggested moving the _ONCE logic inside the trap handler, using a
    bit in the bug_entry::flags field, avoiding the need for the extra
    variable.

    Sadly this only works for WARN_ON_ONCE(), since the others have
    printk() statements prior to triggering the trap.
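
    The idea can be sketched in userspace C as follows. This is only an
    illustration under assumptions: the struct, the flag values and the
    function name sketch_should_report() are invented here, and a real trap
    handler does considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits kept in the per-site table entry. */
#define SKETCH_FLAG_ONCE 0x2u  /* site wants to report at most once */
#define SKETCH_FLAG_DONE 0x4u  /* site has already reported */

struct sketch_bug_entry {
    unsigned int flags;
};

/*
 * Called from the (simulated) trap handler. Returns true if the warning
 * should be reported. The once state lives in the entry itself, so the
 * call site needs no separate static variable.
 */
static bool sketch_should_report(struct sketch_bug_entry *bug)
{
    assert(bug != NULL);
    if (bug->flags & SKETCH_FLAG_ONCE) {
        if (bug->flags & SKETCH_FLAG_DONE)
            return false;
        bug->flags |= SKETCH_FLAG_DONE;
    }
    return true;
}
```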

    Still, this saves a fair amount of text and some data:

        text    data filename
    10682460 4530992 defconfig-build/vmlinux.orig
    10665111 4530096 defconfig-build/vmlinux.patched

    Suggested-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

02 Mar, 2017

1 commit


18 Mar, 2016

2 commits

  • The traceoff_on_warning option doesn't have any effect on s390, powerpc,
    arm64, parisc, and sh because there are two different types of WARN
    implementations:

    1) The above mentioned architectures treat WARN() as a special case of a
    BUG() exception. They handle warnings in report_bug() in lib/bug.c.

    2) All other architectures just call warn_slowpath_*() directly. Their
    warnings are handled in warn_slowpath_common() in kernel/panic.c.

    Support traceoff_on_warning on all architectures and prevent any future
    divergence by using a single common function to emit the warning.

    Also remove the '()' from '%pS()', because the parentheses look funky:

    [ 45.607629] WARNING: at /root/warn_mod/warn_mod.c:17 .init_dummy+0x20/0x40 [warn_mod]()

    Reported-by: Chunyu Hu
    Signed-off-by: Josh Poimboeuf
    Acked-by: Heiko Carstens
    Tested-by: Prarit Bhargava
    Acked-by: Prarit Bhargava
    Acked-by: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Poimboeuf
     
  • Christian Borntraeger reported that panic_on_warn doesn't have any
    effect on s390.

    The panic_on_warn feature was introduced with 9e3961a09798 ("kernel: add
    panic_on_warn"). However, it only covered the case where
    WANT_WARN_ON_SLOWPATH is defined. This in turn is only the case for
    architectures which do not define their own __WARN_TAINT.

    Other architectures which do have __WARN_TAINT defined call report_bug()
    for warnings within lib/bug.c which does not call panic() in case
    panic_on_warn is set.

    Let's simply enable the panic_on_warn feature by adding the same code
    that was added to warn_slowpath_common() in panic.c.

    This enables panic_on_warn also for arm64, parisc, powerpc, s390 and sh.

    Signed-off-by: Heiko Carstens
    Reported-by: Christian Borntraeger
    Tested-by: Christian Borntraeger
    Acked-by: Prarit Bhargava
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Tested-by: Michael Ellerman (powerpc)
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     

28 May, 2015

1 commit

    Currently the RCU usage in the module code is an inconsistent mess of
    RCU and RCU-sched; this is broken for CONFIG_PREEMPT where
    synchronize_rcu() does not imply synchronize_sched().

    Most usage sites use preempt_{dis,en}able() which is RCU-sched, but
    (most of) the modification sites use synchronize_rcu(). With the
    exception of the module bug list, which actually uses RCU.

    Convert everything over to RCU-sched.

    Furthermore add lockdep asserts to all sites, because it's not at all
    clear to me the required locking is observed, esp. on exported
    functions.

    Cc: Rusty Russell
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Rusty Russell

    Peter Zijlstra
     

11 Nov, 2014

1 commit


05 Jun, 2014

1 commit

  • - Coalesce formats

    - "WARNING:" prefix unchanged to keep bug format.

    - printk(KERN_DEFAULT not converted.

    - define pr_fmt without prefix to avoid any default prefix update
    (suggested by Joe Perches).

    Signed-off-by: Fabian Frederick
    Cc: Jeremy Fitzhardinge
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

21 Jan, 2013

1 commit


12 Jan, 2013

1 commit

  • Prarit's excellent bug report:
    > In recent Fedora releases (F17 & F18) some users have reported seeing
    > messages similar to
    >
    > [ 15.478160] kvm: Could not allocate 304 bytes percpu data
    > [ 15.478174] PERCPU: allocation failed, size=304 align=32, alloc from
    > reserved chunk failed
    >
    > during system boot. In some cases, users have also reported seeing this
    > message along with a failed load of other modules.
    >
    > What is happening is systemd is loading an instance of the kvm module for
    > each cpu found (see commit e9bda3b). When the module load occurs the kernel
    > currently allocates the modules percpu data area prior to checking to see
    > if the module is already loaded or is in the process of being loaded. If
    > the module is already loaded, or finishes load, the module loading code
    > releases the current instance's module's percpu data.

    Now we have a new state MODULE_STATE_UNFORMED, we can insert the
    module into the list (and thus guarantee its uniqueness) before we
    allocate the per-cpu region.

    Reported-by: Prarit Bhargava
    Signed-off-by: Rusty Russell
    Tested-by: Prarit Bhargava

    Rusty Russell
     

27 Jan, 2012

1 commit

  • rsyslog will display KERN_EMERG messages on a connected
    terminal. However, these messages are useless/undecipherable
    for a general user.

    For example, after a softlockup we get:

    Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
    kernel:Stack:

    Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
    kernel:Call Trace:

    Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
    kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
    d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89

    This happens because the printk levels for these messages are
    incorrect. Only an informational message should be displayed on
    a terminal.

    I modified the printk levels for various messages in the kernel
    and tested the output by using the drivers/misc/lkdtm.c kernel
    modules (ie, softlockups, panics, hard lockups, etc.) and
    confirmed that the console output was still the same and that
    the output to the terminals was correct.

    For example, in the case of a softlockup we now see the much
    more informative:

    Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
    BUG: soft lockup - CPU4 stuck for 60s!

    instead of the above confusing messages.

    AFAICT, the messages no longer have to be KERN_EMERG. In the
    most important case of a panic we set console_verbose(). As for
    the other less severe cases the correct data is output to the
    console and /var/log/messages.

    Successfully tested by me using the drivers/misc/lkdtm.c module.

    Signed-off-by: Prarit Bhargava
    Cc: dzickus@redhat.com
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com
    Signed-off-by: Ingo Molnar

    Prarit Bhargava
     

06 Oct, 2010

1 commit

  • With all the recent module loading cleanups, we've minimized the code
    that sits under module_mutex, fixing various deadlocks and making it
    possible to do most of the module loading in parallel.

    However, that whole conversion totally missed the rather obscure code
    that adds a new module to the list for BUG() handling. That code was
    doubly obscure because (a) the code itself lives in lib/bug.c (for
    dubious reasons) and (b) it gets called from the architecture-specific
    "module_finalize()" rather than from generic code.

    Calling it from arch-specific code makes no sense what-so-ever to begin
    with, and is now actively wrong since that code isn't protected by the
    module loading lock any more.

    So this commit moves the "module_bug_{finalize,cleanup}()" calls away
    from the arch-specific code, and into the generic code - and in the
    process protects it with the module_mutex so that the list operations
    are now safe.

    Future fixups:
    - move the module list handling code into kernel/module.c where it
    belongs.
    - get rid of 'module_bug_list' and just use the regular list of modules
    (called 'modules' - imagine that) that we already create and maintain
    for other reasons.

    Reported-and-tested-by: Thomas Gleixner
    Cc: Rusty Russell
    Cc: Adrian Bunk
    Cc: Andrew Morton
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

11 Aug, 2010

2 commits

  • We are missing the oops end marker for the exception based WARN implementation
    in lib/bug.c. This is useful for logfile analysis tools.

    Signed-off-by: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Arjan van de Ven
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     
  • There are a few issues with the exception based WARN implementation in
    lib/bug.c:

    - Inconsistent printk flags. The "cut here" line is printed at KERN_EMERG, so
    the console and all logged in users see the single line:

    ------------[ cut here ]------------

    for each WARN. Fix this so we print everything at KERN_WARNING to match the
    kernel/panic.c version.

    - The lib/bug.c WARN would print "Badness at". Change it to match the
    kernel/panic.c version which prints "WARNING: at".

    - Print the list of modules, similar to kernel/panic.c

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Anton Blanchard
    Cc: Ingo Molnar
    Cc: Arjan van de Ven
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     

19 May, 2010

1 commit

  • WARN() is used in some places to report firmware or hardware bugs that
    are then worked-around. These bugs do not affect the stability of the
    kernel and should not set the flag for TAINT_WARN. To allow for this,
    add WARN_TAINT() and WARN_TAINT_ONCE() macros that take a taint number
    as argument.

    Architectures that implement warnings using trap instructions instead
    of calls to warn_slowpath_*() now implement __WARN_TAINT(taint)
    instead of __WARN().
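
    A small userspace sketch of a warning helper that takes the taint number
    as an argument. The sketch_ names, the global mask and the numeric taint
    values are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical taint numbers; each selects one bit of the taint mask. */
#define SKETCH_TAINT_WARN                 9u
#define SKETCH_TAINT_FIRMWARE_WORKAROUND 11u

static unsigned long sketch_tainted_mask;

/*
 * If cond is true, record the given taint and report that the warning
 * fired. The caller picks the taint, so a worked-around firmware bug does
 * not have to set the generic warning taint.
 */
static bool sketch_warn_taint(bool cond, unsigned int taint)
{
    assert(taint < 32);
    if (cond)
        sketch_tainted_mask |= 1UL << taint;
    return cond;
}
```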

    Signed-off-by: Ben Hutchings
    Acked-by: Helge Deller
    Tested-by: Paul Mundt
    Signed-off-by: David Woodhouse

    Ben Hutchings
     

17 Dec, 2008

1 commit


05 Jul, 2008

1 commit

  • Commit 95b570c9cef3b12356454c7112571b7e406b4b51 ("Taint kernel after
    WARN_ON(condition)") introduced a TAINT_WARN that was implemented for
    all architectures using the generic warn_on_slowpath(), which excluded
    any architecture that set HAVE_ARCH_WARN_ON.

    As all of the architectures that implement their own WARN_ON() all go
    through the report_bug() path (specifically handling BUG_TRAP_TYPE_WARN),
    taint the kernel there as well for consistency.

    Tested on avr32 and sh. Also relevant for s390, parisc, and powerpc.

    Signed-off-by: Haavard Skinnemoen
    Signed-off-by: Paul Mundt
    Acked-by: Kyle McMartin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mundt
     

17 Jul, 2007

1 commit

  • The current generic bug implementation has a call to dump_stack() in case a
    WARN_ON(whatever) gets hit. Since report_bug(), which calls dump_stack(),
    gets called from an exception handler we can do better: just pass the
    pt_regs structure to report_bug() and pass it to show_regs() in case of a
    warning. This will give more debug information like register contents,
    etc... In addition this avoids some pointless lines that dump_stack()
    emits, since it includes a stack backtrace of the exception handler which
    is of no interest in case of a warning. E.g. on s390 the following lines
    are currently always present in a stack backtrace if dump_stack() gets
    called from report_bug():

    [] show_trace+0x92/0xe8)
    [] show_stack+0xa0/0xd0
    [] dump_stack+0x2e/0x3c
    [] report_bug+0x98/0xf8
    [] illegal_op+0x1fc/0x21c
    [] sysc_return+0x0/0x10

    Acked-by: Jeremy Fitzhardinge
    Acked-by: Haavard Skinnemoen
    Cc: Andi Kleen
    Cc: Kyle McMartin
    Cc: Paul Mackerras
    Cc: Paul Mundt
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     

09 Dec, 2006

1 commit

  • This patch adds common handling for kernel BUGs, for use by architectures as
    they wish. The code is derived from arch/powerpc.

    The advantages of having common BUG handling are:
    - consistent BUG reporting across architectures
    - shared implementation of out-of-line file/line data
    - implement CONFIG_DEBUG_BUGVERBOSE consistently

    This means that the inline impact of BUG is just the illegal instruction
    itself, which is an improvement for i386 and x86-64.

    A BUG is represented in the instruction stream as an illegal instruction,
    which has file/line information associated with it. This extra information is
    stored in the __bug_table section in the ELF file.

    When the kernel gets an illegal instruction, it first confirms it might
    possibly be from a BUG (ie, in kernel mode, the right illegal instruction).
    It then calls report_bug(). This searches __bug_table for a matching
    instruction pointer, and if found, prints the corresponding file/line
    information. If report_bug() determines that it wasn't a BUG which caused the
    trap, it returns BUG_TRAP_TYPE_NONE.

    Some architectures (powerpc) implement WARN using the same mechanism; if the
    illegal instruction was the result of a WARN, then report_bug() returns
    BUG_TRAP_TYPE_WARN; otherwise it returns BUG_TRAP_TYPE_BUG.
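
    The lookup and classification described above can be sketched in
    userspace C. This is a simplified model under assumptions: the table is
    a plain array rather than an ELF section, and the struct layout, the
    warning flag and every sketch_ name are invented for the example. Only
    the three BUG_TRAP_TYPE_* outcomes come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum bug_trap_type {
    BUG_TRAP_TYPE_NONE = 0,  /* trap did not come from a BUG or WARN site */
    BUG_TRAP_TYPE_WARN = 1,
    BUG_TRAP_TYPE_BUG  = 2,
};

/* Hypothetical flag marking an entry emitted by WARN rather than BUG. */
#define SKETCH_FLAG_WARNING 0x1u

struct sketch_bug_entry {
    uintptr_t    addr;   /* address of the trapping instruction */
    const char  *file;
    unsigned int line;
    unsigned int flags;
};

/* Linear search of the table for an entry at the faulting address. */
static const struct sketch_bug_entry *
sketch_find_bug(const struct sketch_bug_entry *table, size_t n, uintptr_t addr)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (table[i].addr == addr)
            return &table[i];
    return NULL;
}

/* Classify a trap at addr using the table. */
static enum bug_trap_type
sketch_report_bug(const struct sketch_bug_entry *table, size_t n, uintptr_t addr)
{
    const struct sketch_bug_entry *bug = sketch_find_bug(table, n, addr);

    if (!bug)
        return BUG_TRAP_TYPE_NONE;
    if (bug->flags & SKETCH_FLAG_WARNING)
        return BUG_TRAP_TYPE_WARN;
    return BUG_TRAP_TYPE_BUG;
}
```

    An address with no matching entry yields BUG_TRAP_TYPE_NONE, which tells
    the caller that the trap has to be handled some other way.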

    lib/bug.c keeps a list of loaded modules which can be searched for __bug_table
    entries. The architecture must call
    module_bug_finalize()/module_bug_cleanup() from its corresponding
    module_finalize/cleanup functions.

    Unsetting CONFIG_DEBUG_BUGVERBOSE will reduce the kernel size by some amount.
    At the very least, filename and line information will not be recorded for each
    bug, but architectures may decide to store no extra information per BUG at
    all.

    Unfortunately, gcc doesn't have a general way to mark an asm() as noreturn, so
    architectures will generally have to include an infinite loop (or similar) in
    the BUG code, so that gcc knows execution won't continue beyond that point.
    gcc does have a __builtin_trap() operator which may be useful to achieve the
    same effect, unfortunately it cannot be used to actually implement the BUG
    itself, because there's no way to get the instruction's address for use in
    generating the __bug_table entry.

    [randy.dunlap@oracle.com: Handle BUG=n, GENERIC_BUG=n to prevent build errors]
    [bunk@stusta.de: include/linux/bug.h must always #include
    Cc: Andi Kleen
    Cc: Hugh Dickens
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Rusty Russell
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge