15 Nov, 2020

1 commit

  • Before commit 3f388f28639f ("panic: dump registers on panic_on_warn"),
    __warn() was calling show_regs() when regs was not NULL, and show_stack()
    otherwise.

    After that commit, show_stack() is called regardless of whether
    show_regs() has been called or not, leading to duplicated Call Trace:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/nohash/8xx.c:186 mmu_mark_initmem_nx+0x24/0x94
    CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092
    NIP: c00128b4 LR: c0010228 CTR: 00000000
    REGS: c9023e40 TRAP: 0700 Not tainted (5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty)
    MSR: 00029032 CR: 24000424 XER: 00000000

    GPR00: c0010228 c9023ef8 c2100000 0074c000 ffffffff 00000000 c2151000 c07b3880
    GPR08: ff000900 0074c000 c8000000 c33b53a8 24000822 00000000 c0003a20 00000000
    GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00800000
    NIP [c00128b4] mmu_mark_initmem_nx+0x24/0x94
    LR [c0010228] free_initmem+0x20/0x58
    Call Trace:
    free_initmem+0x20/0x58
    kernel_init+0x1c/0x114
    ret_from_kernel_thread+0x14/0x1c
    Instruction dump:
    7d291850 7d234b78 4e800020 9421ffe0 7c0802a6 bfc10018 3fe0c060 3bff0000
    3fff4080 3bffffff 90010024 57ff0010 392001cd 7c3e0b78 953e0008
    CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092
    Call Trace:
    __warn+0x8c/0xd8 (unreliable)
    report_bug+0x11c/0x154
    program_check_exception+0x1dc/0x6e0
    ret_from_except_full+0x0/0x4
    --- interrupt: 700 at mmu_mark_initmem_nx+0x24/0x94
    LR = free_initmem+0x20/0x58
    free_initmem+0x20/0x58
    kernel_init+0x1c/0x114
    ret_from_kernel_thread+0x14/0x1c
    ---[ end trace 31702cd2a9570752 ]---

    Only call show_stack() when regs is NULL.

    Fixes: 3f388f28639f ("panic: dump registers on panic_on_warn")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Andrew Morton
    Cc: Alexey Kardashevskiy
    Cc: Kefeng Wang
    Link: https://lkml.kernel.org/r/e8c055458b080707f1bc1a98ff8bea79d0cec445.1604748361.git.christophe.leroy@csgroup.eu
    Signed-off-by: Linus Torvalds

    Christophe Leroy
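
As a rough illustration of the fix above, the choice between the two dump paths can be modelled in userspace C. This is only a sketch, not the kernel's `__warn()`: `struct regs_sketch`, `enum dump_kind`, and `warn_dump_kind()` are hypothetical names invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only its presence matters here. */
struct regs_sketch {
	unsigned long nip;
};

enum dump_kind {
	DUMP_REGS,	/* models show_regs(), which printed a call trace in the log above */
	DUMP_STACK	/* models show_stack(), needed only when no regs are available */
};

/* Pick exactly one dump path so the call trace is printed once. */
enum dump_kind warn_dump_kind(const struct regs_sketch *regs)
{
	return regs ? DUMP_REGS : DUMP_STACK;
}
```

With a non-NULL `regs` the sketch selects only the register dump, which matches the "Only call show_stack() when regs is NULL" rule stated in the commit message.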
     

17 Oct, 2020

1 commit

  • Currently we print stack and registers for ordinary warnings but we do not
    for panic_on_warn, which looks like an oversight: panic() will reboot the
    machine but won't print registers.

    This moves printing of registers and modules earlier.

    This does not move the stack dumping as panic() dumps it.

    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Andrew Morton
    Reviewed-by: Kees Cook
    Cc: Douglas Anderson
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Rafael Aquini
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Nicholas Piggin
    Link: https://lkml.kernel.org/r/20200804095054.68724-1-aik@ozlabs.ru
    Signed-off-by: Linus Torvalds

    Alexey Kardashevskiy
     

13 Aug, 2020

2 commits

  • Since print_oops_end_marker() is not used externally, also remove it in
    kernel.h at the same time.

    Signed-off-by: Yue Hu
    Signed-off-by: Andrew Morton
    Cc: Kees Cook
    Link: http://lkml.kernel.org/r/20200724011516.12756-1-zbestahu@gmail.com
    Signed-off-by: Linus Torvalds

    Yue Hu
     
  • The return value of oops_may_print() is true or false, so change its type
    to reflect that.

    Signed-off-by: Tiezhu Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Kees Cook
    Cc: Xuefeng Li
    Link: http://lkml.kernel.org/r/1591103358-32087-1-git-send-email-yangtiezhu@loongson.cn
    Signed-off-by: Linus Torvalds

    Tiezhu Yang
     

11 Jun, 2020

1 commit

  • Warnings, bugs and stack protection failures from noinstr sections, e.g.
    low level and early entry code, are likely to be fatal.

    Mark them as "safe" to be invoked from noinstr protected code to avoid
    annotating all usage sites. Getting the information out is important.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexandre Chartre
    Acked-by: Peter Zijlstra
    Link: https://lkml.kernel.org/r/20200505134100.376598577@linutronix.de

    Thomas Gleixner
     

09 Jun, 2020

2 commits

  • Usually when the kernel reaches an oops condition, it's a point of no
    return; in case not enough debug information is available in the kernel
    splat, one of the last resorts would be to collect a kernel crash dump
    and analyze it. The problem with this approach is that in order to
    collect the dump, a panic is required (to kexec-load the crash kernel).
    When in an environment of multiple virtual machines, users may prefer to
    try living with the oops, at least until being able to properly shutdown
    their VMs / finish their important tasks.

    This patch implements a way to collect a bit more debug detail when an
    oops event is reached, by printing the backtraces of all CPUs using
    NMIs (on architectures that support that). The sysctl added (and
    documented) here is called "oops_all_cpu_backtrace"; when set, it will
    (as the name suggests) dump the backtraces of all CPUs.

    Far from ideal, this may be the last option though for users that for
    some reason cannot panic on oops. Most of the time, oopses are clear
    enough to indicate the kernel portion that must be investigated, but in
    virtual environments it's possible to observe hypervisor/KVM issues that
    could lead to oopses shown on other guests' CPUs (like virtual APIC
    crashes). This patch hence aims to help debug such complex issues
    without resorting to kdump.

    Signed-off-by: Guilherme G. Piccoli
    Signed-off-by: Andrew Morton
    Reviewed-by: Kees Cook
    Cc: Luis Chamberlain
    Cc: Iurii Zaikin
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Cc: Randy Dunlap
    Cc: Matthew Wilcox
    Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com
    Signed-off-by: Linus Torvalds

    Guilherme G. Piccoli
     
  • Analogously to the introduction of panic_on_warn, this patch introduces
    a kernel option named panic_on_taint in order to provide a simple and
    generic way to stop execution and catch a coredump when the kernel gets
    tainted by any given flag.

    This is useful for debugging sessions as it avoids having to rebuild the
    kernel to explicitly add calls to panic() into the code sites that
    introduce the taint flags of interest.

    For instance, if one is interested in proceeding with a post-mortem
    analysis at the point a given code path is hitting a bad page (e.g.
    unaccount_page_cache_page(), or slab_bug()), a coredump can be collected
    by rebooting the kernel with 'panic_on_taint=0x20' appended to the
    command line.

    Another, perhaps less frequent, use for this option would be as a means
    for assuring a security policy case where only a subset of taints, or no
    single taint (in paranoid mode), is allowed for the running system. The
    optional switch 'nousertaint' is handy in this particular scenario, as
    it will avoid userspace induced crashes by writes to sysctl interface
    /proc/sys/kernel/tainted causing false positive hits for such policies.

    [akpm@linux-foundation.org: tweak kernel-parameters.txt wording]

    Suggested-by: Qian Cai
    Signed-off-by: Rafael Aquini
    Signed-off-by: Andrew Morton
    Reviewed-by: Luis Chamberlain
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Jonathan Corbet
    Cc: Kees Cook
    Cc: Randy Dunlap
    Cc: "Theodore Ts'o"
    Cc: Adrian Bunk
    Cc: Greg Kroah-Hartman
    Cc: Laura Abbott
    Cc: Jeff Mahoney
    Cc: Jiri Kosina
    Cc: Takashi Iwai
    Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.com
    Signed-off-by: Linus Torvalds

    Rafael Aquini
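
The bitmask check behind an option like 'panic_on_taint=0x20' can be sketched in plain C. This is a minimal userspace model, not the kernel code: `taint_triggers_panic()` and `SKETCH_TAINT_BAD_PAGE` are hypothetical names, and the sketch assumes each taint flag occupies one bit of the mask, so that the bad-page taint from the example above is bit 5 (1 << 5 == 0x20).

```c
#include <stdbool.h>

/* Assumed bit number of the "bad page" taint; 1UL << 5 == 0x20. */
#define SKETCH_TAINT_BAD_PAGE 5

/* Return true when the newly set taint bit is selected by the mask. */
bool taint_triggers_panic(unsigned long panic_on_taint_mask,
			  unsigned int taint_bit)
{
	return (panic_on_taint_mask & (1UL << taint_bit)) != 0;
}
```

A mask of 0 never matches, which corresponds to the option being unset; any other taint bit is ignored unless it is also present in the mask.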
     

25 Nov, 2019

1 commit

  • 'refcount_error_report()' has no callers. Remove it.

    Signed-off-by: Will Deacon
    Reviewed-by: Ard Biesheuvel
    Acked-by: Kees Cook
    Tested-by: Hanjun Guo
    Cc: Ard Biesheuvel
    Cc: Elena Reshetova
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20191121115902.2551-10-will@kernel.org
    Signed-off-by: Ingo Molnar

    Will Deacon
     

08 Oct, 2019

1 commit

  • Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the
    calling CPU in an infinite loop, but with interrupts and preemption
    enabled. From this state, userspace can continue to be scheduled,
    despite the system being "dead" as far as the kernel is concerned.

    This is easily reproducible on arm64 when booting with "nosmp" on the
    command line; a couple of shell scripts print out a periodic "Ping"
    message whilst another triggers a crash by writing to
    /proc/sysrq-trigger:

    | sysrq: Trigger a crash
    | Kernel panic - not syncing: sysrq triggered crash
    | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1
    | Hardware name: linux,dummy-virt (DT)
    | Call trace:
    | dump_backtrace+0x0/0x148
    | show_stack+0x14/0x20
    | dump_stack+0xa0/0xc4
    | panic+0x140/0x32c
    | sysrq_handle_reboot+0x0/0x20
    | __handle_sysrq+0x124/0x190
    | write_sysrq_trigger+0x64/0x88
    | proc_reg_write+0x60/0xa8
    | __vfs_write+0x18/0x40
    | vfs_write+0xa4/0x1b8
    | ksys_write+0x64/0xf0
    | __arm64_sys_write+0x14/0x20
    | el0_svc_common.constprop.0+0xb0/0x168
    | el0_svc_handler+0x28/0x78
    | el0_svc+0x8/0xc
    | Kernel Offset: disabled
    | CPU features: 0x0002,24002004
    | Memory Limit: none
    | ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
    | Ping 2!
    | Ping 1!
    | Ping 1!
    | Ping 2!

    The issue can also be triggered on x86 kernels if CONFIG_SMP=n,
    otherwise local interrupts are disabled in 'smp_send_stop()'.

    Disable preemption in 'panic()' before re-enabling interrupts.

    Link: http://lkml.kernel.org/r/20191002123538.22609-1-will@kernel.org
    Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone
    Signed-off-by: Will Deacon
    Reported-by: Xogium
    Reviewed-by: Kees Cook
    Cc: Russell King
    Cc: Greg Kroah-Hartman
    Cc: Ingo Molnar
    Cc: Petr Mladek
    Cc: Feng Tang
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Will Deacon
     

26 Sep, 2019

5 commits

  • Instead of having separate tests for __WARN_FLAGS, merge the two #ifdef
    blocks and replace the synonym WANT_WARN_ON_SLOWPATH macro.

    Link: http://lkml.kernel.org/r/20190819234111.9019-7-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Christophe Leroy
    Cc: Drew Davenport
    Cc: Feng Tang
    Cc: Mauro Carvalho Chehab
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: "Steven Rostedt (VMware)"
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In preparation for cleaning up "cut here", move the "cut here" logic up
    out of __warn() and into callers that pass non-NULL args. For anyone
    looking closely, there are two callers that pass NULL args: one already
    explicitly prints "cut here". The remaining case is covered by how a WARN
    is built, which will be cleaned up in the next patch.

    Link: http://lkml.kernel.org/r/20190819234111.9019-5-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Christophe Leroy
    Cc: Drew Davenport
    Cc: Feng Tang
    Cc: Mauro Carvalho Chehab
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: "Steven Rostedt (VMware)"
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Instead of having a separate helper for no printk output, just consolidate
    the logic into warn_slowpath_fmt().

    Link: http://lkml.kernel.org/r/20190819234111.9019-4-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Christophe Leroy
    Cc: Drew Davenport
    Cc: Feng Tang
    Cc: Mauro Carvalho Chehab
    Cc: Peter Zijlstra
    Cc: Petr Mladek
    Cc: "Steven Rostedt (VMware)"
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Patch series "Clean up WARN() "cut here" handling", v2.

    Christophe Leroy noticed that the fix for missing "cut here" in the WARN()
    case was adding explicit printk() calls instead of teaching the exception
    handler to add it. This refactors the bug/warn infrastructure to pass
    this information as a new BUGFLAG.

    Longer details repeated from the last patch in the series:

    bug: move WARN_ON() "cut here" into exception handler

    The original cleanup of "cut here" missed the WARN_ON() case (that does
    not have a printk message), which was fixed recently by adding an explicit
    printk of "cut here". This had the downside of adding a printk() to every
    WARN_ON() caller, which reduces the utility of using an instruction
    exception to streamline the resulting code. By making this a new BUGFLAG,
    all of these can be removed and "cut here" can be handled by the exception
    handler.

    This was very pronounced on PowerPC, but the effect can be seen on x86 as
    well. The resulting text size of a defconfig build shows some small
    savings from this patch:

        text    data     bss      dec     hex filename
    19691167 5134320 1646664 26472151 193eed7 vmlinux.before
    19676362 5134260 1663048 26473670 193f4c6 vmlinux.after

    This change also opens the door for creating something like BUG_MSG(),
    where a custom printk() is emitted before issuing BUG(), without
    confusing the "cut here" line.

    This patch (of 7):

    There's no reason to have specialized helpers for passing the warn taint
    down to __warn(). Consolidate and refactor helper macros, removing
    __WARN_printf() and warn_slowpath_fmt_taint().

    Link: http://lkml.kernel.org/r/20190819234111.9019-2-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Christophe Leroy
    Cc: Peter Zijlstra
    Cc: Drew Davenport
    Cc: Arnd Bergmann
    Cc: "Steven Rostedt (VMware)"
    Cc: Feng Tang
    Cc: Petr Mladek
    Cc: Mauro Carvalho Chehab
    Cc: Borislav Petkov
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
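
The idea of carrying the "cut here" decision in a bug flag, as the series above describes, can be sketched as follows. The constants and the helper are hypothetical (`SKETCH_BUGFLAG_*`, `handler_prints_cut_here()`), with illustrative bit values; the sketch assumes the new flag marks call sites that have already printed the marker themselves, so the exception handler prints it in every other case.

```c
#include <stdbool.h>

/* Illustrative flag values; the real constants live in the kernel headers. */
#define SKETCH_BUGFLAG_WARNING		(1u << 0)
#define SKETCH_BUGFLAG_NO_CUT_HERE	(1u << 3)

/*
 * Models the handler-side decision: print "cut here" unless the call
 * site has already done so (for example a WARN() with a message).
 */
bool handler_prints_cut_here(unsigned int bug_flags)
{
	return !(bug_flags & SKETCH_BUGFLAG_NO_CUT_HERE);
}
```

Under this model a bare WARN_ON() site needs no printk() of its own, which is where the text-size savings quoted above would come from.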
     
  • Right now kgdb/kdb hooks up to debug panics by registering for the panic
    notifier. This works OK except that it means that kgdb/kdb gets called
    _after_ the CPUs in the system are taken offline. That means that if
    anything important was happening on those CPUs (like something that might
    have contributed to the panic) you can't debug them.

    Specifically I ran into a case where I got a panic because a task was
    "blocked for more than 120 seconds" which was detected on CPU 2. I nicely
    got shown stack traces in the kernel log for all CPUs including CPU 0,
    which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of
    __mmc_switch().

    I then ended up at the kdb prompt, where I switched over to kgdb to try to
    look at local variables of the process on CPU 0. I found that I couldn't.
    Digging more, I found that I had no info on any tasks running on CPUs
    other than CPU 2 and that asking kdb for help showed me "Error: no saved
    data for this cpu". This was because all the CPUs were offline.

    Let's move the entry of kdb/kgdb to a direct call from panic() and stop
    using the generic notifier. Putting a direct call in allows us to order
    things more properly and it also doesn't seem like we're breaking any
    abstractions by calling into the debugger from the panic function.

    Daniel said:

    : This patch changes the way kdump and kgdb interact with each other.
    : However it would seem rather odd to have both tools simultaneously armed
    : and, even if they were, the user still has the option to use panic_timeout
    : to force a kdump to happen. Thus I think the change of order is
    : acceptable.

    Link: http://lkml.kernel.org/r/20190703170354.217312-1-dianders@chromium.org
    Signed-off-by: Douglas Anderson
    Reviewed-by: Daniel Thompson
    Cc: Jason Wessel
    Cc: Kees Cook
    Cc: Borislav Petkov
    Cc: Thomas Gleixner
    Cc: Feng Tang
    Cc: YueHaibing
    Cc: Sergey Senozhatsky
    Cc: "Steven Rostedt (VMware)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Douglas Anderson
     

15 Jul, 2019

2 commits

  • The stuff under sysctl describes the /proc/sys interface from a
    userspace point of view. So, add it to the admin-guide and remove the
    :orphan: from its index file.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     
  • Rename the /proc/sys/ documentation files to ReST, using the
    README file as a template for an index.rst, adding the other
    files there via TOC markup.

    Despite being written at different times with different
    styles, try to make them somewhat coherent with a similar
    look and feel, ensuring that they'll look nice both as
    raw text files and in the HTML output produced by the
    Sphinx build system.

    At its new index.rst, let's add a :orphan: while this is not linked to
    the main index.rst file, in order to avoid build warnings.

    Signed-off-by: Mauro Carvalho Chehab

    Mauro Carvalho Chehab
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

19 May, 2019

1 commit

  • Currently, on panic, the kernel will lower the loglevel and print out
    only the pending printk messages with console_flush_on_panic().

    Add an option for users to configure "panic_print" to replay all
    dmesg in the buffer, some of which they may have never seen due to the
    loglevel setting, which will help panic debugging.

    [feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()]
    Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com
    [feng.tang@intel.com: use logbuf lock to protect the console log index]
    Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com
    Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com
    Signed-off-by: Feng Tang
    Reviewed-by: Petr Mladek
    Cc: Aaro Koskinen
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Cc: Sergey Senozhatsky
    Cc: Kees Cook
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Feng Tang
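
A bitmask option such as "panic_print" can be decoded with a simple test per bit. The sketch below is a userspace model with hypothetical names (`SKETCH_PANIC_PRINT_*`, `panic_replays_log()`); the bit positions are an assumption based on the commonly documented layout (task info in bit 0, replay of all buffered messages in bit 5) and should be checked against the kernel parameter documentation for the version in use.

```c
#include <stdbool.h>

/* Assumed bit layout; verify against the kernel documentation. */
#define SKETCH_PANIC_PRINT_TASK_INFO		0x01UL
#define SKETCH_PANIC_PRINT_MEM_INFO		0x02UL
#define SKETCH_PANIC_PRINT_ALL_PRINTK_MSG	0x20UL

/* True when the mask asks for the whole log buffer to be replayed. */
bool panic_replays_log(unsigned long panic_print)
{
	return (panic_print & SKETCH_PANIC_PRINT_ALL_PRINTK_MSG) != 0;
}
```

Because each feature has its own bit, the replay can be combined with the other dumps by OR-ing the values together.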
     

15 May, 2019

2 commits

  • Allow specifying reboot_mode for panic only. This is needed on systems
    where ramoops is used to store panic logs and the user wants a warm
    reset to preserve them, while still having a cold reset on normal
    reboots.

    Link: http://lkml.kernel.org/r/20190322004735.27702-1-aaro.koskinen@iki.fi
    Signed-off-by: Aaro Koskinen
    Reviewed-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aaro Koskinen
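
The override described above can be pictured as "use the panic-specific mode only if one was configured". The following is a sketch under that assumption; `enum sketch_reboot_mode` and `effective_reboot_mode()` are hypothetical names, and the "undefined" sentinel is an illustrative way to represent "no panic-only mode given".

```c
/* Hypothetical modes; UNDEFINED means no panic-only mode was configured. */
enum sketch_reboot_mode {
	SKETCH_REBOOT_UNDEFINED = -1,
	SKETCH_REBOOT_COLD,
	SKETCH_REBOOT_WARM
};

/* Choose the reboot mode, preferring the panic-only setting during a panic. */
enum sketch_reboot_mode effective_reboot_mode(enum sketch_reboot_mode normal,
					      enum sketch_reboot_mode on_panic,
					      int in_panic)
{
	if (in_panic && on_panic != SKETCH_REBOOT_UNDEFINED)
		return on_panic;
	return normal;
}
```

With `normal` set to cold and `on_panic` set to warm, ordinary reboots stay cold while a panic reboot is warm, which is the ramoops use case from the commit message.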
     
  • When a kernel panic happens, the kernel first prints the panic call
    stack, then the ending message, like:

    [ 35.743249] ---[ end Kernel panic - not syncing: Fatal exception
    [ 35.749975] ------------[ cut here ]------------

    The above messages are very useful for debugging.

    But if the system is configured to not reboot on panic, say the
    "panic_timeout" parameter equals 0, it will likely print out many noisy
    messages, such as a WARN() call stack for each and every CPU except the
    panicking one, like the one below:

    WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190
    Call Trace:

    try_to_wake_up
    default_wake_function
    autoremove_wake_function
    __wake_up_common
    __wake_up_common_lock
    __wake_up
    wake_up_klogd_work_func
    irq_work_run_list
    irq_work_tick
    update_process_times
    tick_sched_timer
    __hrtimer_run_queues
    hrtimer_interrupt
    smp_apic_timer_interrupt
    apic_timer_interrupt

    For people working in console mode, the screen will first show the panic
    call stack, but it is immediately overwritten by these noisy extra
    messages, which makes debugging much more difficult, as the original
    context gets lost on screen.

    Also, these noisy messages confuse some users: I have seen many bug
    reporters post the noisy messages to bugzilla instead of the real
    panic call stack and context.

    Add a flag "suppress_printk", which gets set in panic(), to avoid those
    noisy messages without changing the current kernel behavior that both
    panic blinking and the sysrq magic key keep working as is, as suggested
    by Petr Mladek.

    To verify this, make sure the kernel is not configured to reboot on
    panic, then run on the console
    # echo c > /proc/sysrq-trigger
    and check that the console prints only the panic call stack.

    Link: http://lkml.kernel.org/r/1551430186-24169-1-git-send-email-feng.tang@intel.com
    Signed-off-by: Feng Tang
    Suggested-by: Petr Mladek
    Reviewed-by: Petr Mladek
    Acked-by: Steven Rostedt (VMware)
    Acked-by: Sergey Senozhatsky
    Cc: Thomas Gleixner
    Cc: Kees Cook
    Cc: Borislav Petkov
    Cc: Andi Kleen
    Cc: Peter Zijlstra
    Cc: Greg Kroah-Hartman
    Cc: Jiri Slaby
    Cc: Sasha Levin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Feng Tang
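
The effect of a "suppress_printk"-style flag can be modelled with a small userspace logger. This is a sketch only: `sketch_printk()` and `sketch_suppress_printk` are hypothetical names, and standard output stands in for the console.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Models the flag that panic() sets once its own output is complete. */
bool sketch_suppress_printk;

/* Returns the number of characters written, or 0 when output is suppressed. */
int sketch_printk(const char *fmt, ...)
{
	va_list args;
	int len;

	if (sketch_suppress_printk)
		return 0;

	va_start(args, fmt);
	len = vprintf(fmt, args);
	va_end(args);
	return len;
}
```

Messages issued before the flag is set still reach the console, so the panic call stack is kept, while later noise is dropped at the entry point.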
     

02 May, 2019

1 commit

  • The disabled_wait() function uses its argument as the PSW address when
    it stops the CPU with a wait PSW that is disabled for interrupts.
    The different callers sometimes use a specific number like 0xdeadbeef
    to indicate a specific failure, the early boot code uses 0 and some
    other call sites use __builtin_return_address(0).

    At the time a dump is created, the current PSW and the registers of a
    CPU are written to lowcore to make them available to the dump analysis
    tool. For a CPU stopped with disabled_wait the PSW and the registers
    do not really make sense together: the PSW address does not point to
    the function the registers belong to.

    Simplify disabled_wait() by using _THIS_IP_ for the PSW address and
    drop the argument to the function.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

08 Mar, 2019

1 commit

  • Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for
    debugfs files.

    Semantic patch information:
    Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file()
    imposes some significant overhead as compared to
    DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe().

    Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci

    The _unsafe() part suggests that some of the "safeness
    responsibilities" are now panic.c responsibilities. The patch is OK
    since panic's clear_warn_once_fops struct file_operations is safe
    against removal, so we don't have to use the otherwise necessary
    debugfs_file_get()/debugfs_file_put().

    [sergey.senozhatsky.work@gmail.com: changelog addition]
    Link: http://lkml.kernel.org/r/1545990861-158097-1-git-send-email-yuehaibing@huawei.com
    Signed-off-by: YueHaibing
    Reviewed-by: Sergey Senozhatsky
    Cc: Kees Cook
    Cc: Borislav Petkov
    Cc: Steven Rostedt (VMware)
    Cc: Petr Mladek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    YueHaibing
     

05 Jan, 2019

2 commits

  • So that we can also choose at runtime to print out the needed system
    info on panic, rather than only by setting the kernel cmdline.

    Link: http://lkml.kernel.org/r/1543398842-19295-3-git-send-email-feng.tang@intel.com
    Signed-off-by: Feng Tang
    Suggested-by: Steven Rostedt
    Acked-by: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: John Stultz
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Feng Tang
     
  • Kernel panic issues are always painful to debug, partially because it's
    not easy to get enough information about the context when the panic
    happens.

    We have ramoops and kdump for that, while this commit tries to
    provide an easier way to show the system info by adding a cmdline
    parameter, borrowing some ideas from the sysrq handler.

    Link: http://lkml.kernel.org/r/1543398842-19295-2-git-send-email-feng.tang@intel.com
    Signed-off-by: Feng Tang
    Reviewed-by: Kees Cook
    Acked-by: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: John Stultz
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Feng Tang
     

28 Dec, 2018

1 commit

  • Pull printk updates from Petr Mladek:

    - Keep spinlocks busted until the end of panic()

    - Fix races between calculating number of messages that would fit into
    user space buffers, filling the buffers, and switching printk.time
    parameter

    - Some code clean up

    * tag 'printk-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: Remove print_prefix() calls with NULL buffer.
    printk: fix printk_time race.
    printk: Make printk_emit() local function.
    panic: avoid deadlocks in re-entrant console drivers

    Linus Torvalds
     

22 Nov, 2018

1 commit

  • From the printk()/serial console point of view, panic() is special
    because it may force the CPU to re-enter printk() and/or the serial
    console driver. Therefore, some serial console drivers are re-entrant.
    E.g. 8250:

    serial8250_console_write()
    {
            if (port->sysrq)
                    locked = 0;
            else if (oops_in_progress)
                    locked = spin_trylock_irqsave(&port->lock, flags);
            else
                    spin_lock_irqsave(&port->lock, flags);
            ...
    }

    panic() does set oops_in_progress via bust_spinlocks(1), so in theory
    we should be able to re-enter serial console driver from panic():

    CPU0

    uart_console_write()
    serial8250_console_write() // if (oops_in_progress)
    // spin_trylock_irqsave()
    call_console_drivers()
    console_unlock()
    console_flush_on_panic()
    bust_spinlocks(1) // oops_in_progress++
    panic()

    spin_lock_irqsave(&port->lock, flags) // spin_lock_irqsave()
    serial8250_console_write()
    call_console_drivers()
    console_unlock()
    printk()
    ...

    However, this does not happen and we deadlock in the serial console on
    the port->lock spinlock. The problem is that console_flush_on_panic()
    is called after bust_spinlocks(0):

    void panic(const char *fmt, ...)
    {
            bust_spinlocks(1);
            ...
            bust_spinlocks(0);
            console_flush_on_panic();
            ...
    }

    bust_spinlocks(0) decrements oops_in_progress, so oops_in_progress
    can go back to zero. Thus even re-entrant console drivers will simply
    spin on port->lock spinlock. Given that port->lock may already be
    locked either by a stopped CPU, or by the very same CPU we execute
    panic() on (for instance, NMI panic() on printing CPU) the system
    deadlocks and does not reboot.

    Fix this by removing bust_spinlocks(0), so oops_in_progress is always
    set in panic() now and, thus, re-entrant console drivers will trylock
    the port->lock instead of spinning on it forever, when we call them
    from console_flush_on_panic().

    Link: http://lkml.kernel.org/r/20181025101036.6823-1-sergey.senozhatsky@gmail.com
    Cc: Steven Rostedt
    Cc: Daniel Wang
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Greg Kroah-Hartman
    Cc: Alan Cox
    Cc: Jiri Slaby
    Cc: Peter Feiner
    Cc: linux-serial@vger.kernel.org
    Cc: Sergey Senozhatsky
    Cc: stable@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
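
The interaction between the oops_in_progress counter and the console lock choice described above can be modelled in userspace. This is a simplified sketch: the names are hypothetical, the sysrq branch of the 8250 example is left out, and only the counter arithmetic stated in the commit message (increment on bust_spinlocks(1), decrement on bust_spinlocks(0)) is reproduced.

```c
/* Models oops_in_progress: nonzero while an oops or panic is being handled. */
int sketch_oops_in_progress;

/* Models bust_spinlocks(): 1 increments the counter, 0 decrements it. */
void sketch_bust_spinlocks(int yes)
{
	if (yes)
		++sketch_oops_in_progress;
	else
		--sketch_oops_in_progress;
}

enum lock_mode {
	LOCK_BLOCKING,	/* spin until the port lock is free: can deadlock */
	LOCK_TRYLOCK	/* give up on contention: safe during panic */
};

/* Models how a re-entrant console driver chooses its locking strategy. */
enum lock_mode console_lock_mode(void)
{
	return sketch_oops_in_progress ? LOCK_TRYLOCK : LOCK_BLOCKING;
}
```

Calling `sketch_bust_spinlocks(0)` before the final flush returns the model to `LOCK_BLOCKING`, which is the state in which the deadlock was observed; leaving the counter raised keeps the driver on the trylock path.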
     

31 Oct, 2018

2 commits

  • If a call to panic() terminates the string with a \n , the result puts the
    closing brace ']---' on a newline because panic() itself adds \n too.

    Now, if one goes and removes the newline chars from all panic()
    invocations - and the stats right now look like this:

    ~300 calls with a \n
    ~500 calls without a \n

    one is destined to a neverending game of whack-a-mole because the usual
    thing to do is add a newline at the end of a string a function is supposed
    to print.

    Therefore, simply zap any \n at the end of the panic string to avoid
    touching so many places in the kernel.

    Link: http://lkml.kernel.org/r/20181009205019.2786-1-bp@alien8.de
    Signed-off-by: Borislav Petkov
    Acked-by: Kees Cook
    Reviewed-by: Steven Rostedt (VMware)
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
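
Dropping a single trailing newline from an already formatted message, as the commit above describes, takes only a few lines of C. The helper below is a userspace sketch with a hypothetical name (`strip_trailing_newline()`), not the code in panic().

```c
#include <stddef.h>
#include <string.h>

/* Remove one trailing '\n' from buf, if present; return the new length. */
size_t strip_trailing_newline(char *buf)
{
	size_t len = strlen(buf);

	if (len && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

Doing this once at the point where the message is formatted avoids touching the individual callers, whichever way they terminate their strings.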
     
  • ... because panic() itself already does this. Otherwise you have
    line-broken trailer:

    [ 1.836965] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: pgd_alloc+0x29e/0x2a0
    [ 1.836965] ]---

    Link: http://lkml.kernel.org/r/20181008202901.7894-1-bp@alien8.de
    Signed-off-by: Borislav Petkov
    Acked-by: Kees Cook
    Cc: Masahiro Yamada
    Cc: "Steven Rostedt (VMware)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     

14 Jun, 2018

1 commit

  • The changes to automatically test for working stack protector compiler
    support in the Kconfig files removed the special STACKPROTECTOR_AUTO
    option that picked the strongest stack protector that the compiler
    supported.

    That was all a nice cleanup - it makes no sense to have the AUTO case
    now that the Kconfig phase can just determine the compiler support
    directly.

    HOWEVER.

    It also meant that doing "make oldconfig" would now _disable_ the strong
    stackprotector if you had AUTO enabled, because in a legacy config file,
    the sane stack protector configuration would look like

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    # CONFIG_CC_STACKPROTECTOR_NONE is not set
    # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
    # CONFIG_CC_STACKPROTECTOR_STRONG is not set
    CONFIG_CC_STACKPROTECTOR_AUTO=y

    and when you ran this through "make oldconfig" with the Kbuild changes,
    it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
    been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
    CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
    used to be disabled (because it was really enabled by AUTO), and would
    disable it in the new config, resulting in:

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
    CONFIG_CC_STACKPROTECTOR=y
    # CONFIG_CC_STACKPROTECTOR_STRONG is not set
    CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

    That's dangerously subtle - people could suddenly find themselves with
    the weaker stack protector setup without even realizing.

    The solution here is to rename not just the old REGULAR stack
    protector option, but also the strong one. This does that by just
    removing the CC_ prefix entirely for the user choices, because it really
    is not about the compiler support (the compiler support now instead
    automatically impacts _visibility_ of the options to users).

    This results in "make oldconfig" actually asking the user for their
    choice, so that we don't have any silent subtle security model changes.
    The end result would generally look like this:

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
    CONFIG_STACKPROTECTOR=y
    CONFIG_STACKPROTECTOR_STRONG=y
    CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

    where the "CC_" versions really are about internal compiler
    infrastructure, not the user selections.

    Acked-by: Masahiro Yamada
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

12 Apr, 2018

3 commits

  • Since the randstruct plugin can intentionally produce extremely unusual
    kernel structure layouts (even performance pathological ones), some
    maintainers want to be able to trivially determine if an Oops is coming
    from a randstruct-built kernel, so as to keep their sanity when
    debugging. This adds the new flag and initializes taint_mask
    immediately when built with randstruct.
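
    A minimal sketch of the idea, assuming a build-time switch decides the
    initial taint mask. The names BUILT_WITH_RANDSTRUCT, sk_tainted_mask and
    sk_test_taint, and the bit position, are illustrative assumptions and
    are not taken from this commit:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the build-time "randstruct enabled" symbol. */
#ifndef BUILT_WITH_RANDSTRUCT
#define BUILT_WITH_RANDSTRUCT 1
#endif

/* Illustrative bit position for the randstruct taint flag (assumption). */
enum { SK_TAINT_RANDSTRUCT = 17 };

/*
 * The mask starts out already tainted when the build used randstruct,
 * so any later Oops report carries the flag from the first instruction.
 */
static unsigned long sk_tainted_mask =
	BUILT_WITH_RANDSTRUCT ? (1UL << SK_TAINT_RANDSTRUCT) : 0;

/* Report whether a given taint bit is currently set. */
static bool sk_test_taint(unsigned int flag)
{
	assert(flag < sizeof(unsigned long) * CHAR_BIT);
	return (sk_tainted_mask >> flag) & 1UL;
}
```

    Setting the bit in the static initializer, rather than at some later
    init step, means there is no window in which a crash could be reported
    without the flag.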

    Link: http://lkml.kernel.org/r/1519084390-43867-4-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: Al Viro
    Cc: Alexey Dobriyan
    Cc: Jonathan Corbet
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • This consolidates the taint bit documentation into a single place with
    both numeric and letter values. Additionally adds the missing TAINT_AUX
    documentation.

    Link: http://lkml.kernel.org/r/1519084390-43867-3-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: Al Viro
    Cc: Alexey Dobriyan
    Cc: Jonathan Corbet
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • This converts to using indexed initializers instead of comments, adds a
    comment on why the taint flags can't be an enum, and makes sure that no
    one forgets to update the taint_flags when adding new bits.
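
    The pattern can be sketched as below. This is not the kernel's actual
    table: the SK_ names, the three-entry numbering and sk_format_taint are
    assumptions made for illustration. Only the indexed-initializer style
    and the build-time size check reflect what the commit describes:

```c
#include <assert.h>
#include <stddef.h>

/* Plain #defines, mirroring the commit's note that these are not an enum. */
#define SK_TAINT_PROPRIETARY_MODULE 0
#define SK_TAINT_FORCED_MODULE      1
#define SK_TAINT_AUX                2
#define SK_TAINT_FLAGS_COUNT        3

struct sk_taint_flag {
	char c_true;	/* letter shown when the bit is set */
	char c_false;	/* letter shown when the bit is clear */
};

/* Each entry is tied to its bit by index, not by a comment or by order. */
static const struct sk_taint_flag sk_taint_flags[] = {
	[SK_TAINT_PROPRIETARY_MODULE] = { 'P', 'G' },
	[SK_TAINT_FORCED_MODULE]      = { 'F', ' ' },
	[SK_TAINT_AUX]                = { 'X', ' ' },
};

/* Fails the build if a new bit is added without a matching table entry. */
_Static_assert(sizeof(sk_taint_flags) / sizeof(sk_taint_flags[0]) ==
	       SK_TAINT_FLAGS_COUNT, "taint table out of sync with flag count");

/* Render the mask as letters; buf must hold SK_TAINT_FLAGS_COUNT + 1 bytes. */
static void sk_format_taint(unsigned long mask, char *buf)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < SK_TAINT_FLAGS_COUNT; i++)
		buf[i] = ((mask >> i) & 1UL) ? sk_taint_flags[i].c_true
					     : sk_taint_flags[i].c_false;
	buf[i] = '\0';
}
```

    With indexed initializers, reordering the table lines cannot silently
    attach a letter to the wrong bit, which a comment-only scheme allows.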

    Link: http://lkml.kernel.org/r/1519084390-43867-2-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: Al Viro
    Cc: Alexey Dobriyan
    Cc: Jonathan Corbet
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

11 Apr, 2018

1 commit

  • Pull tracing updates from Steven Rostedt:
    "New features:

    - Tom Zanussi's extended histogram work.

    This adds the synthetic events to have histograms from multiple
    event data. Adds triggers "onmatch" and "onmax" to call the
    synthetic events. Several updates to the histogram code from this.

    - Allow way to nest ring buffer calls in the same context

    - Allow absolute time stamps in ring buffer

    - Rewrite of filter code parsing based on Al Viro's suggestions

    - Setting of trace_clock to global if TSC is unstable (on boot)

    - Better OOM handling when allocating large ring buffers

    - Added initcall tracepoints (consolidated initcall_debug code with
    them)

    And other various fixes and clean ups"

    * tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
    init: Have initcall_debug still work without CONFIG_TRACEPOINTS
    init, tracing: Have printk come through the trace events for initcall_debug
    init, tracing: instrument security and console initcall trace events
    init, tracing: Add initcall trace events
    tracing: Add rcu dereference annotation for test func that touches filter->prog
    tracing: Add rcu dereference annotation for filter->prog
    tracing: Fixup logic inversion on setting trace_global_clock defaults
    tracing: Hide global trace clock from lockdep
    ring-buffer: Add set/clear_current_oom_origin() during allocations
    ring-buffer: Check if memory is available before allocation
    lockdep: Add print_irqtrace_events() to __warn
    vsprintf: Do not preprocess non-dereferenced pointers for bprintf (%px and %pK)
    tracing: Uninitialized variable in create_tracing_map_fields()
    tracing: Make sure variable string fields are NULL-terminated
    tracing: Add action comparisons when testing matching hist triggers
    tracing: Don't add flag strings when displaying variable references
    tracing: Fix display of hist trigger expressions containing timestamps
    ftrace: Drop a VLA in module_exists()
    tracing: Mention trace_clock=global when warning about unstable clocks
    tracing: Default to using trace_global_clock if sched_clock is unstable
    ...

    Linus Torvalds
     

06 Apr, 2018

1 commit

  • Running a test on an x86_32 kernel I triggered a bug where an interrupt
    disable/enable isn't being caught by lockdep. At least knowing where the
    last one was found would be helpful, but the warnings that are produced do
    not show this information. Even without debugging lockdep, having the WARN()
    display the last place hard and soft irqs were enabled or disabled is
    valuable.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

03 Apr, 2018

1 commit


10 Mar, 2018

1 commit

  • The BUG and stack protector reports were still using a raw %p. This
    changes it to %pB for more meaningful output.

    Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast
    Fixes: ad67b74d2469 ("printk: hash addresses printed with %p")
    Signed-off-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Richard Weinberger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

08 Mar, 2018

1 commit


18 Nov, 2017

3 commits

  • This is the gist of a patch which we've been forward-porting in our
    kernels for a long time now, and it would probably make good sense to
    have such a TAINT_AUX flag upstream, which each distro etc. can use as
    they see fit. This way, we won't need to forward-port a distro-only
    version indefinitely.

    Add an auxiliary taint flag to be used by distros and others. This
    obviates the need to forward-port whatever internal solutions people
    have in favor of a single flag which they can map arbitrarily to a
    definition of their pleasing.

    The "X" mnemonic could also mean eXternal, which would be taint from a
    distro or something else but not the upstream kernel. We will use it to
    mark modules for which we don't provide support. I.e., a really
    eXternal module.

    Link: http://lkml.kernel.org/r/20170911134533.dp5mtyku5bongx4c@pd.tnic
    Signed-off-by: Borislav Petkov
    Cc: Kees Cook
    Cc: Jessica Yu
    Cc: Peter Zijlstra
    Cc: Jiri Slaby
    Cc: Jiri Olsa
    Cc: Michal Marek
    Cc: Jiri Kosina
    Cc: Takashi Iwai
    Cc: Petr Mladek
    Cc: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     
  • Prior to v4.11, x86 used warn_slowpath_fmt() for handling WARN()s.
    After WARN() was moved to using UD0 on x86, the warning text started
    appearing _before_ the "cut here" line. This appears to have been a
    long-standing bug on architectures that used __WARN_TAINT, but it didn't
    get fixed.

    v4.11 and earlier on x86:

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 2956 at drivers/misc/lkdtm_bugs.c:65 lkdtm_WARNING+0x21/0x30
    This is a warning message
    Modules linked in:

    v4.12 and later on x86:

    This is a warning message
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 2982 at drivers/misc/lkdtm_bugs.c:68 lkdtm_WARNING+0x15/0x20
    Modules linked in:

    With this fix:

    ------------[ cut here ]------------
    This is a warning message
    WARNING: CPU: 3 PID: 3009 at drivers/misc/lkdtm_bugs.c:67 lkdtm_WARNING+0x15/0x20

    Since the __FILE__ reporting happens as part of the UD0 handler, it
    isn't trivial to move the message to after the WARNING line, but at
    least we can fix the position of the "cut here" line so all the various
    logging tools will start including the actual runtime warning message
    again, when they follow the instruction and "cut here".
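
    The fixed ordering can be illustrated with a small user-space sketch.
    The helper names sk_format_warn and sk_marker_comes_first are
    hypothetical, and the WARNING line is simplified (no CPU, PID or
    symbol offset). The point is only that the marker is emitted first, so
    the runtime message lands inside the region a log tool keeps:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One shared definition of the marker, used by every reporting path. */
#define SK_CUT_HERE "------------[ cut here ]------------\n"

/*
 * Format a report in the fixed order: marker, runtime message, WARNING line.
 * Returns the snprintf() result, i.e. the length the full report needs.
 */
static int sk_format_warn(char *buf, size_t len, const char *msg,
			  const char *file, int line)
{
	assert(buf && msg && file);
	return snprintf(buf, len, "%s%s\nWARNING: at %s:%d\n",
			SK_CUT_HERE, msg, file, line);
}

/* True when the report begins with the marker, so nothing precedes the cut. */
static int sk_marker_comes_first(const char *report)
{
	return strncmp(report, SK_CUT_HERE, strlen(SK_CUT_HERE)) == 0;
}
```

    A tool that discards everything before the marker would have lost the
    message under the v4.12 ordering shown above; with the marker first,
    the message survives the cut.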

    Link: http://lkml.kernel.org/r/1510100869-73751-4-git-send-email-keescook@chromium.org
    Fixes: 9a93848fe787 ("x86/debug: Implement __WARN() using UD0")
    Signed-off-by: Kees Cook
    Cc: Peter Zijlstra (Intel)
    Cc: Josh Poimboeuf
    Cc: Fengguang Wu
    Cc: Arnd Bergmann
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • The "cut here" string is used in a few paths. Define it in a single
    place.

    Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Cc: Arnd Bergmann
    Cc: Fengguang Wu
    Cc: Ingo Molnar
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra (Intel)
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook