02 Jul, 2021

1 commit

  • kernel.h has been used as a dumping ground for all kinds of stuff for a
    long time. Here is an attempt to start cleaning it up by splitting out
    the panic and oops helpers.

    There are several purposes in doing this:
    - drop the dependency on bug.h
    - break an include loop by moving panic_notifier.h out
    - unload from kernel.h something which has its own domain

    At the same time, convert users tree-wide to use the new headers. For
    the time being, the new headers are included back into kernel.h to
    avoid twisted indirect includes for existing users.

    [akpm@linux-foundation.org: thread_info.h needs limits.h]
    [andriy.shevchenko@linux.intel.com: ia64 fix]
    Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com

    Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Bjorn Andersson
    Co-developed-by: Andrew Morton
    Acked-by: Mike Rapoport
    Acked-by: Corey Minyard
    Acked-by: Christian Brauner
    Acked-by: Arnd Bergmann
    Acked-by: Kees Cook
    Acked-by: Wei Liu
    Acked-by: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Acked-by: Sebastian Reichel
    Acked-by: Luis Chamberlain
    Acked-by: Stephen Boyd
    Acked-by: Thomas Bogendoerfer
    Acked-by: Helge Deller # parisc
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     

07 May, 2021

2 commits

  • kmsg_dump(KMSG_DUMP_SHUTDOWN) is called before machine_restart(),
    machine_halt(), and machine_power_off(). The only one that is missing
    is machine_kexec().

    The dmesg output that it contains can be used to study the shutdown
    performance of both kernel and systemd during kexec reboot.

    Here is an example of the dmesg data collected after kexec:

    root@dplat-cp22:~# cat /sys/fs/pstore/dmesg-ramoops-0 | tail
    ...
    [ 70.914592] psci: CPU3 killed (polled 0 ms)
    [ 70.915705] CPU4: shutdown
    [ 70.916643] psci: CPU4 killed (polled 4 ms)
    [ 70.917715] CPU5: shutdown
    [ 70.918725] psci: CPU5 killed (polled 0 ms)
    [ 70.919704] CPU6: shutdown
    [ 70.920726] psci: CPU6 killed (polled 4 ms)
    [ 70.921642] CPU7: shutdown
    [ 70.922650] psci: CPU7 killed (polled 0 ms)

    Link: https://lkml.kernel.org/r/20210319192326.146000-2-pasha.tatashin@soleen.com
    Signed-off-by: Pavel Tatashin
    Reviewed-by: Kees Cook
    Reviewed-by: Petr Mladek
    Reviewed-by: Bhupesh Sharma
    Acked-by: Baoquan He
    Reviewed-by: Tyler Hicks
    Cc: James Morris
    Cc: Sasha Levin
    Cc: Eric W. Biederman
    Cc: Anton Vorontsov
    Cc: Colin Cross
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Tatashin
     
  • The purpose is to notify kernel modules of a fast reboot.

    Upstream a patch from the SONiC network operating system [1].

    [1]: https://github.com/Azure/sonic-linux-kernel/pull/46

    Link: https://lkml.kernel.org/r/20210304124626.13927-1-pmenzel@molgen.mpg.de
    Signed-off-by: Joe LeVeque
    Signed-off-by: Paul Menzel
    Acked-by: Baoquan He
    Cc: Guohan Lu
    Cc: Joe LeVeque
    Cc: Paul Menzel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe LeVeque
     

22 Feb, 2021

1 commit

  • Pull ELF compat updates from Al Viro:
    "Sanitizing ELF compat support, especially for triarch architectures:

    - X32 handling cleaned up

    - MIPS64 uses compat_binfmt_elf.c both for O32 and N32 now

    - Kconfig side of things regularized

    Eventually I hope to have compat_binfmt_elf.c killed, with both native
    and compat built from fs/binfmt_elf.c, with -DELF_BITS={64,32} passed
    by kbuild, but that's a separate story - not included here"

    * 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    get rid of COMPAT_ELF_EXEC_PAGESIZE
    compat_binfmt_elf: don't bother with undef of ELF_ARCH
    Kconfig: regularize selection of CONFIG_BINFMT_ELF
    mips compat: switch to compat_binfmt_elf.c
    mips: don't bother with ELF_CORE_EFLAGS
    mips compat: don't bother with ELF_ET_DYN_BASE
    mips: KVM_GUEST makes no sense for 64bit builds...
    mips: kill unused definitions in binfmt_elf[on]32.c
    mips binfmt_elf*32.c: use elfcore-compat.h
    x32: make X32, !IA32_EMULATION setups able to execute x32 binaries
    [amd64] clean PRSTATUS_SIZE/SET_PR_FPVALID up properly
    elf_prstatus: collect the common part (everything before pr_reg) into a struct
    binfmt_elf: partially sanitize PRSTATUS_SIZE and SET_PR_FPVALID

    Linus Torvalds
     

26 Jan, 2021

1 commit

  • Function kernel_kexec() is called from the reboot system call with
    system_transition_mutex held. While inside kernel_kexec(), it acquires
    system_transition_mutex again, which leads to a deadlock.

    The deadlock should be easy to trigger; it has not caused any failure
    reports only because the 'kexec jump' feature is hardly used by anyone
    as far as I know. An inquiry can be made about who uses 'kexec jump'
    and where. Until then, let's simply remove the lock operation inside
    the CONFIG_KEXEC_JUMP ifdeffery scope.

    Fixes: 55f2503c3b69 ("PM / reboot: Eliminate race between reboot and suspend")
    Signed-off-by: Baoquan He
    Reported-by: Dan Carpenter
    Reviewed-by: Pingfan Liu
    Cc: 4.19+ # 4.19+
    Signed-off-by: Rafael J. Wysocki

    Baoquan He
     

06 Jan, 2021

1 commit

  • Preparation for handling i386 compat elf_prstatus sanely: rather than
    duplicating the beginning of compat_elf_prstatus, gather those fields
    into a separate structure (compat_elf_prstatus_common) so that it can
    be reused. Due to the incestuous relationship between binfmt_elf.c and
    compat_binfmt_elf.c, the same shape change is needed for the native
    struct elf_prstatus, gathering the fields prior to pr_reg into a new
    structure (struct elf_prstatus_common).

    Fortunately, the offset of pr_reg is always a multiple of 16 with no
    padding right before it, so it is possible to turn all the fields
    prior to it into a single member without disturbing the layout.

    [build fix from Geert Uytterhoeven folded in]

    Signed-off-by: Al Viro

    Al Viro
     

20 Nov, 2020

1 commit

  • Currently <crypto/sha.h> contains declarations for both SHA-1 and
    SHA-2, and <crypto/sha3.h> contains declarations for SHA-3.

    This organization is inconsistent, but more importantly SHA-1 is no
    longer considered to be cryptographically secure. So to the extent
    possible, SHA-1 shouldn't be grouped together with any of the other SHA
    versions, and usage of it should be phased out.

    Therefore, split <crypto/sha.h> into two headers, <crypto/sha1.h>
    and <crypto/sha2.h>, and make everyone explicitly specify whether
    they want the declarations for SHA-1, SHA-2, or both.

    This avoids making the SHA-1 declarations visible to files that don't
    want anything to do with SHA-1. It also prepares for potentially moving
    sha1.h into a new insecure/ or dangerous/ directory.

    Signed-off-by: Eric Biggers
    Acked-by: Ard Biesheuvel
    Acked-by: Jason A. Donenfeld
    Signed-off-by: Herbert Xu

    Eric Biggers
     

17 Oct, 2020

1 commit

  • Fix multiple occurrences of duplicated words in kernel/.

    Fix one typo/spello on the same line as a duplicate word. Change one
    instance of "the the" to "that the". Otherwise just drop one of the
    repeated words.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

10 Sep, 2020

1 commit


09 Jan, 2020

2 commits

  • It is the same as machine_kexec_prepare(), but is called after the
    segments are loaded. This way, an architecture can do processing work
    with the relocation segments already loaded. One such example is
    arm64: it has to have the segments loaded in order to create a page
    table, but it cannot do that at kexec time, because by then
    allocations are no longer possible.

    Signed-off-by: Pavel Tatashin
    Acked-by: Dave Young
    Signed-off-by: Will Deacon

    Pavel Tatashin
     
  • Here is a regular kexec command sequence and output:
    =====
    $ kexec --reuse-cmdline -i --load Image
    $ kexec -e
    [ 161.342002] kexec_core: Starting new kernel

    Welcome to Buildroot
    buildroot login:
    =====

    Even when the "quiet" kernel parameter is specified, "kexec_core:
    Starting new kernel" is printed.

    This message has KERN_EMERG level, but there is no emergency, it is a
    normal kexec operation, so quiet it down to appropriate KERN_NOTICE.

    Machines with slow console baud rates benefit from less output.

    Signed-off-by: Pavel Tatashin
    Reviewed-by: Simon Horman
    Acked-by: Dave Young
    Signed-off-by: Will Deacon

    Pavel Tatashin
     

26 Sep, 2019

1 commit

  • syzbot found that a thread can stall for minutes inside kexec_load() after
    that thread was killed by SIGKILL [1]. It turned out that the reproducer
    was trying to allocate 2408MB of memory using kimage_alloc_page() from
    kimage_load_normal_segment(). Let's check for SIGKILL before doing memory
    allocation.

    [1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e

    Link: http://lkml.kernel.org/r/993c9185-d324-2640-d061-bed2dd18b1f7@I-love.SAKURA.ne.jp
    Signed-off-by: Tetsuo Handa
    Reported-by: syzbot
    Cc: Eric Biederman
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this source code is licensed under the gnu general public license
    version 2 see the file copying for more details

    this source code is licensed under general public license version 2
    see

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 52 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

04 May, 2019

1 commit

  • This adds a function to disable secondary CPUs for suspend, where the
    secondary CPUs are not necessarily the non-zero / non-boot CPUs.
    Platforms will be able to use this to suspend using non-zero CPUs.

    Signed-off-by: Nicholas Piggin
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rafael J . Wysocki
    Cc: Thomas Gleixner
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lkml.kernel.org/r/20190411033448.20842-3-npiggin@gmail.com
    Signed-off-by: Ingo Molnar

    Nicholas Piggin
     

29 Dec, 2018

2 commits

  • totalram_pages and totalhigh_pages are made static inline functions.

    The main motivation was that managed_page_count_lock handling was
    complicating things. It was discussed at length here:
    https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemed
    better to remove the lock and convert the variables to atomic, with
    preventing potential store-to-read tearing as a bonus.

    [akpm@linux-foundation.org: coding style fixes]
    Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Suggested-by: Michal Hocko
    Suggested-by: Vlastimil Babka
    Reviewed-by: Konstantin Khlebnikov
    Reviewed-by: Pavel Tatashin
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
     
  • Patch series "mm: convert totalram_pages, totalhigh_pages and managed
    pages to atomic", v5.

    This series converts totalram_pages, totalhigh_pages and
    zone->managed_pages to atomic variables.

    totalram_pages, zone->managed_pages and totalhigh_pages updates are
    protected by managed_page_count_lock, but readers never care about it.
    Convert these variables to atomic to avoid readers potentially seeing a
    store tear.

    The main motivation was that managed_page_count_lock handling was
    complicating things. It was discussed at length here:
    https://lore.kernel.org/patchwork/patch/995739/#1181785 It seemed
    better to remove the lock and convert the variables to atomic. With
    the change, preventing potential store-to-read tearing comes as a
    bonus.

    This patch (of 4):

    This is in preparation for a later patch which converts totalram_pages
    and zone->managed_pages to atomic variables. Please note that
    re-reading the value might yield a different result, and as such could
    lead to unexpected behavior. There are no known bugs as a result of
    the current code, but it is better to prevent them in principle.

    Link: http://lkml.kernel.org/r/1542090790-21750-2-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Reviewed-by: Konstantin Khlebnikov
    Reviewed-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Reviewed-by: Pavel Tatashin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
     

06 Oct, 2018

1 commit

  • When SME is enabled in the first kernel, it needs to allocate decrypted
    pages for kdump because when the kdump kernel boots, these pages need to
    be accessed decrypted in the initial boot stage, before SME is enabled.

    [ bp: clean up text. ]

    Signed-off-by: Lianbo Jiang
    Signed-off-by: Borislav Petkov
    Reviewed-by: Tom Lendacky
    Cc: kexec@lists.infradead.org
    Cc: tglx@linutronix.de
    Cc: mingo@redhat.com
    Cc: hpa@zytor.com
    Cc: akpm@linux-foundation.org
    Cc: dan.j.williams@intel.com
    Cc: bhelgaas@google.com
    Cc: baiyaowei@cmss.chinamobile.com
    Cc: tiwai@suse.de
    Cc: brijesh.singh@amd.com
    Cc: dyoung@redhat.com
    Cc: bhe@redhat.com
    Cc: jroedel@suse.de
    Link: https://lkml.kernel.org/r/20180930031033.22110-3-lijiang@redhat.com

    Lianbo Jiang
     

15 Jun, 2018

1 commit

  • Without yielding while loading kimage segments, a large initrd will
    block all other work on the CPU performing the load until it is
    completed. For example loading an initrd of 200MB on a low power single
    core system will lock up the system for a few seconds.

    To increase system responsiveness to other tasks at that time, call
    cond_resched() in both the crash kernel and normal kernel segment
    loading loops.

    I did run into a practical problem. Hardware watchdogs on embedded
    systems can have short timeouts, on the order of seconds. If the
    system is locked up for a few seconds with only a single core
    available, the watchdog may not be petted in a timely fashion. If this
    happens, the hardware watchdog will fire and reset the system.

    This really only becomes a problem when you are working with a single
    core, a decently sized initrd, and have a constrained hardware watchdog.

    Link: http://lkml.kernel.org/r/1528738546-3328-1-git-send-email-jmf@amazon.com
    Signed-off-by: Jarrett Farnitano
    Reviewed-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarrett Farnitano
     

18 Jul, 2017

1 commit

  • Provide support so that kexec can be used to boot a kernel when SME is
    enabled.

    Support is needed to allocate pages for kexec without encryption. This
    is needed in order to be able to reboot in the kernel in the same manner
    as originally booted.

    Additionally, when shutting down all of the CPUs we need to be sure to
    flush the caches and then halt. This is needed when booting from a state
    where SME was not active into a state where SME is active (or vice-versa).
    Without these steps, it is possible for cache lines to exist for the same
    physical location but tagged both with and without the encryption bit. This
    can cause random memory corruption when caches are flushed depending on
    which cacheline is written last.

    Signed-off-by: Tom Lendacky
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Borislav Petkov
    Cc:
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brijesh Singh
    Cc: Dave Young
    Cc: Dmitry Vyukov
    Cc: Jonathan Corbet
    Cc: Konrad Rzeszutek Wilk
    Cc: Larry Woodman
    Cc: Linus Torvalds
    Cc: Matt Fleming
    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim Krčmář
    Cc: Rik van Riel
    Cc: Toshimitsu Kani
    Cc: kasan-dev@googlegroups.com
    Cc: kvm@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-doc@vger.kernel.org
    Cc: linux-efi@vger.kernel.org
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/b95ff075db3e7cd545313f2fb609a49619a09625.1500319216.git.thomas.lendacky@amd.com
    Signed-off-by: Ingo Molnar

    Tom Lendacky
     

13 Jul, 2017

1 commit

  • Currently vmcoreinfo data is updated at boot time in subsys_initcall(),
    so it risks being modified by buggy code while the system is running.

    As a result, the dumped vmcore may contain wrong vmcoreinfo. Later on,
    when using the "crash", "makedumpfile", etc. utilities to parse this
    vmcore, we will probably get a "Segmentation fault" or other
    unexpected errors.

    E.g.: 1) wrong code overwrites vmcoreinfo_data; 2) the system later
    crashes; 3) kdump is triggered, and we obviously fail to recognize the
    crash context correctly due to the corrupted vmcoreinfo.

    Now, except for vmcoreinfo, all the crash data is well protected
    (including the cpu note, which is fully updated in the crash path, so
    its correctness is guaranteed). Given that vmcoreinfo data is a large
    chunk prepared for kdump, we had better protect it as well.

    To solve this, we relocate and copy vmcoreinfo_data into the crash
    memory when kdump is loaded via the kexec syscalls. Because the whole
    crash memory is protected by the existing
    arch_kexec_protect_crashkres() mechanism, we naturally protect
    vmcoreinfo_data from write (even read) access under the kernel direct
    mapping after kdump is loaded.

    Since kdump is usually loaded at the very early stage after boot, we can
    trust the correctness of the vmcoreinfo data copied.

    On the other hand, we still need to update the vmcoreinfo safe copy
    when a crash happens, to generate vmcoreinfo_note again; for that we
    rely on vmap() to map a new kernel virtual address and use it instead
    in the subsequent crash_save_vmcoreinfo().

    BTW, we do not touch vmcoreinfo_note, because it will be fully updated
    using the protected vmcoreinfo_data after the crash, which is surely
    correct, just like the cpu crash note.

    Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Tested-by: Michael Holzheu
    Cc: Benjamin Herrenschmidt
    Cc: Dave Young
    Cc: Eric Biederman
    Cc: Hari Bathini
    Cc: Juergen Gross
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

30 Jun, 2017

1 commit

  • In preparation for an objtool rewrite which will have broader checks,
    whitelist functions and files which cause problems because they do
    unusual things with the stack.

    These whitelists serve as a TODO list for which functions and files
    don't yet have undwarf unwinder coverage. Eventually most of the
    whitelists can be removed in favor of manual CFI hint annotations or
    objtool improvements.

    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

09 May, 2017

2 commits

  • Get rid of multiple definitions of the append_elf_note() and
    final_note() functions, and reuse the ones compiled under
    CONFIG_CRASH_CORE. Also, define Elf_Word and use it instead of the
    generic u32 or the more specific Elf64_Word.

    Link: http://lkml.kernel.org/r/149035342324.6881.11667840929850361402.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Acked-by: Tony Luck
    Cc: Fenghua Yu
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     
  • Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
    reuse crashkernel parameter for fadump", v4.

    Traditionally, kdump is used to save the vmcore in case of a crash.
    Some architectures, like powerpc, can save the vmcore using
    architecture-specific support instead of the kexec/kdump mechanism.
    Such architecture-specific support also needs to reserve memory to be
    used by the dump capture kernel. The crashkernel parameter can be
    reused for memory reservation by such architecture-specific
    infrastructure.

    This patchset removes the dependency on CONFIG_KEXEC for the
    crashkernel parameter and the vmcoreinfo-related code, as they can be
    reused without kexec support. Also, the crashkernel parameter is
    reused instead of fadump_reserve_mem to reserve memory for fadump.

    The first patch moves crashkernel parameter parsing and vmcoreinfo
    related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE. The
    second patch reuses the definitions of append_elf_note() & final_note()
    functions under CONFIG_CRASH_CORE in IA64 arch code. The third patch
    removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
    in powerpc. The next patch reuses crashkernel parameter for reserving
    memory for fadump, instead of the fadump_reserve_mem parameter. This
    has the advantage of using all syntaxes crashkernel parameter supports,
    for fadump as well. The last patch updates fadump kernel documentation
    about use of crashkernel parameter.

    This patch (of 5):

    Traditionally, kdump is used to save the vmcore in case of a crash.
    Some architectures, like powerpc, can save the vmcore using
    architecture-specific support instead of the kexec/kdump mechanism.
    Such architecture-specific support also needs to reserve memory to be
    used by the dump capture kernel. The crashkernel parameter can be
    reused for memory reservation by such architecture-specific
    infrastructure.

    But currently, the code related to vmcoreinfo and the parsing of the
    crashkernel parameter is built under CONFIG_KEXEC_CORE. This patch
    introduces CONFIG_CRASH_CORE and moves the above-mentioned code under
    it, allowing code reuse without a dependency on CONFIG_KEXEC. There is
    no functional change with this patch.

    Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Cc: Fenghua Yu
    Cc: Tony Luck
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     

23 Feb, 2017

1 commit

  • Pull printk updates from Petr Mladek:

    - Add Petr Mladek and Sergey Senozhatsky as printk maintainers, and
    Steven Rostedt as the printk reviewer. This idea came up after the
    discussion about printk issues at Kernel Summit. It was formulated
    and discussed on lkml[1].

    - Extend the lock-less NMI per-cpu buffers idea to handle recursive
    printk() calls, by Sergey Senozhatsky[2]. It is the first step in
    sanitizing printk as discussed at Kernel Summit.

    The change makes it possible to see messages that would normally get
    ignored or would cause a deadlock.

    It also makes it possible to enable lockdep in printk(). This has
    already paid off: the testing in linux-next helped to discover two
    old problems that were hidden before[3][4].

    - Remove an unused parameter, by Sergey Senozhatsky. A clean-up after
    a past change.

    [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com
    [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com
    [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
    [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: drop call_console_drivers() unused param
    printk: convert the rest to printk-safe
    printk: remove zap_locks() function
    printk: use printk_safe buffers in printk
    printk: report lost messages in printk safe/nmi contexts
    printk: always use deferred printk when flush printk_safe lines
    printk: introduce per-cpu safe_print seq buffer
    printk: rename nmi.c and exported api
    printk: use vprintk_func in vprintk()
    MAINTAINERS: Add printk maintainers

    Linus Torvalds
     

08 Feb, 2017

1 commit

  • A preparation patch for the printk_safe work. No functional change.
    - rename nmi.c to printk_safe.c
    - add a `printk_safe' prefix to some of the exported functions (those
    used by both printk-safe and printk-nmi).

    Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Tejun Heo
    Cc: Calvin Owens
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Peter Hurley
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

11 Jan, 2017

1 commit

  • __pa_symbol is the correct api to get the physical address of kernel
    symbols. Switch to it to allow for better debug checking.

    Reviewed-by: Mark Rutland
    Tested-by: Mark Rutland
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon

    Laura Abbott
     

15 Dec, 2016

2 commits

  • A soft lockup occurs when I run trinity on the kexec_load syscall. The
    corresponding stack information is as follows.

    BUG: soft lockup - CPU#6 stuck for 22s! [trinity-c6:13859]
    Kernel panic - not syncing: softlockup: hung tasks
    CPU: 6 PID: 13859 Comm: trinity-c6 Tainted: G O L ----V------- 3.10.0-327.28.3.35.zhongjiang.x86_64 #1
    Hardware name: Huawei Technologies Co., Ltd. Tecal BH622 V2/BC01SRSA0, BIOS RMIBV386 06/30/2014
    Call Trace:
    dump_stack+0x19/0x1b
    panic+0xd8/0x214
    watchdog_timer_fn+0x1cc/0x1e0
    __hrtimer_run_queues+0xd2/0x260
    hrtimer_interrupt+0xb0/0x1e0
    ? call_softirq+0x1c/0x30
    local_apic_timer_interrupt+0x37/0x60
    smp_apic_timer_interrupt+0x3f/0x60
    apic_timer_interrupt+0x6d/0x80
    ? kimage_alloc_control_pages+0x80/0x270
    ? kmem_cache_alloc_trace+0x1ce/0x1f0
    ? do_kimage_alloc_init+0x1f/0x90
    kimage_alloc_init+0x12a/0x180
    SyS_kexec_load+0x20a/0x260
    system_call_fastpath+0x16/0x1b

    The first allocation of control pages may take too much time because
    crash_res.end can be set to a high value. We need to add
    cond_resched() to avoid the issue.

    The patch has been tested and the above issue no longer appears.

    Link: http://lkml.kernel.org/r/1481164674-42775-1-git-send-email-zhongjiang@huawei.com
    Signed-off-by: zhong jiang
    Acked-by: "Eric W. Biederman"
    Cc: Xunlei Pang
    Cc: Dave Young
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhong jiang
     
  • Currently on x86_64, the symbol address of phys_base is exported to
    vmcoreinfo. Dave Anderson complained that this is really useless for
    his Crash implementation, because in the user-space utilities Crash
    and Makedumpfile, which the exported vmcore information is mainly used
    for, the value of phys_base is needed to convert the virtual address
    of an exported kernel symbol to a physical address. Especially for
    init_level4_pgt: if we want to access and walk the page table to look
    up the PA corresponding to a VA, we first need to calculate

    page_dir = SYMBOL(init_level4_pgt) - __START_KERNEL_map + phys_base;

    Now in Crash and Makedumpfile, we have to analyze the vmcore ELF
    program headers to get the value of phys_base. As Dave said, it would
    be preferable if it were readily available in vmcoreinfo rather than
    depending upon the PT_LOAD semantics.

    Hence, this patch changes the code to export the value of phys_base
    instead of its virtual address.

    People also complained that the KERNEL_IMAGE_SIZE export is
    x86_64-only and should be moved into the arch-dependent function
    arch_crash_save_vmcoreinfo(). That move is done in this patch as well.

    Link: http://lkml.kernel.org/r/1478568596-30060-2-git-send-email-bhe@redhat.com
    Signed-off-by: Baoquan He
    Cc: Thomas Garnier
    Cc: Baoquan He
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H . Peter Anvin"
    Cc: Eric Biederman
    Cc: Xunlei Pang
    Cc: HATAYAMA Daisuke
    Cc: Kees Cook
    Cc: Eugene Surovegin
    Cc: Dave Young
    Cc: AKASHI Takahiro
    Cc: Atsushi Kumagai
    Cc: Dave Anderson
    Cc: Pratyush Anand
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Baoquan He
     

03 Aug, 2016

6 commits

  • I hit the following issue when running trinity on my system. The
    kernel is version 3.4, but mainline has the same issue.

    The root cause is that the segment size is too large, so the kernel
    spends too long trying to allocate a page. Other cases will block
    until the test case quits. Also, OOM conditions will occur.

    Call Trace:
    __alloc_pages_nodemask+0x14c/0x8f0
    alloc_pages_current+0xaf/0x120
    kimage_alloc_pages+0x10/0x60
    kimage_alloc_control_pages+0x5d/0x270
    machine_kexec_prepare+0xe5/0x6c0
    ? kimage_free_page_list+0x52/0x70
    sys_kexec_load+0x141/0x600
    ? vfs_write+0x100/0x180
    system_call_fastpath+0x16/0x1b

    The patch changes sanity_check_segment_list() to verify that the usage by
    all segments does not exceed half of memory.

    [akpm@linux-foundation.org: fix for kexec-return-error-number-directly.patch, update comment]
    Link: http://lkml.kernel.org/r/1469625474-53904-1-git-send-email-zhongjiang@huawei.com
    Signed-off-by: zhong jiang
    Suggested-by: Eric W. Biederman
    Cc: Vivek Goyal
    Cc: Dave Young
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhong jiang
     
  • Provide a wrapper function to be used by kernel code to check whether a
    crash kernel is loaded. It returns the same value that can be seen in
    /sys/kernel/kexec_crash_loaded by userspace programs.

    I'm exporting the function, because it will be used by Xen, and it is
    possible to compile Xen modules separately to enable the use of PV
    drivers with unmodified bare-metal kernels.

    Link: http://lkml.kernel.org/r/20160713121955.14969.69080.stgit@hananiah.suse.cz
    Signed-off-by: Petr Tesarik
    Cc: Juergen Gross
    Cc: Josh Triplett
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Eric Biederman
    Cc: "H. Peter Anvin"
    Cc: Boris Ostrovsky
    Cc: "Paul E. McKenney"
    Cc: Dave Young
    Cc: David Vrabel
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Tesarik
     
  • kexec physical addresses are the boot-time view of the system. For
    certain ARM systems (such as Keystone 2), the boot view of the system
    does not match the kernel's view of the system: the boot view uses a
    special alias in the lower 4GB of the physical address space.

    To cater for these kinds of setups, we need to translate between the
    boot view physical addresses and the normal kernel view physical
    addresses. This patch extracts the current translation points into
    linux/kexec.h, and allows an architecture to override the functions.

    Due to the translations required, we unfortunately end up with six
    translation functions, which are reduced down to four that the
    architecture can override.

    [akpm@linux-foundation.org: kexec.h needs asm/io.h for phys_to_virt()]
    Link: http://lkml.kernel.org/r/E1b8koP-0004HZ-Vf@rmk-PC.armlinux.org.uk
    Signed-off-by: Russell King
    Cc: Keerthy
    Cc: Pratyush Anand
    Cc: Vitaly Andrianov
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc: Simon Horman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Russell King
     
  • On PAE systems (eg, ARM LPAE) the vmcore note may be located above 4GB
    physical on 32-bit architectures, so we need a wider type than "unsigned
    long" here. Arrange for paddr_vmcoreinfo_note() to return a
    phys_addr_t, thereby allowing it to be located above 4GB.

    This makes no difference for kexec-tools, as they already assume a
    64-bit type when reading from this file.

    Link: http://lkml.kernel.org/r/E1b8koK-0004HS-K9@rmk-PC.armlinux.org.uk
    Signed-off-by: Russell King
    Reviewed-by: Pratyush Anand
    Acked-by: Baoquan He
    Cc: Keerthy
    Cc: Vitaly Andrianov
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Simon Horman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Russell King
     
  • Ensure that user memory sizes do not wrap around when validating the
    user input, since a wrap-around can cause the subsequent input
    validation to work incorrectly.

    [akpm@linux-foundation.org: fix it for kexec-return-error-number-directly.patch]
    Link: http://lkml.kernel.org/r/E1b8koF-0004HM-5x@rmk-PC.armlinux.org.uk
    Signed-off-by: Russell King
    Reviewed-by: Pratyush Anand
    Acked-by: Baoquan He
    Cc: Keerthy
    Cc: Vitaly Andrianov
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Simon Horman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Russell King
     
  • This is a cleanup patch that makes kexec clearer by returning error
    numbers directly. The variable result is useless, because no other
    function's return value is assigned to it, so remove it.

    Link: http://lkml.kernel.org/r/1464179273-57668-1-git-send-email-mnghuan@gmail.com
    Signed-off-by: Minfei Huang
    Cc: Dave Young
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Xunlei Pang
    Cc: Atsushi Kumagai
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minfei Huang
     

24 May, 2016

2 commits

  • …unprotect)_crashkres()

    Commit 3f625002581b ("kexec: introduce a protection mechanism for the
    crashkernel reserved memory") added a mechanism for protecting the
    crash kernel reserved memory similar to the previous
    crash_map/unmap_reserved_pages() implementation; the new one is more
    generic in name and cleaner in code (besides, some arches may not be
    allowed to unmap the pgtable).

    Therefore, this patch consolidates them, using the new
    arch_kexec_protect(unprotect)_crashkres() to replace the former
    crash_map/unmap_reserved_pages(), which by now is used only by
    S390.

    The consolidation work needs the crash memory to be mapped initially;
    this is done in machine_kdump_pm_init(), which runs after
    reserve_crashkernel(). Once the kdump kernel is loaded, the new
    arch_kexec_protect_crashkres() implemented for S390 will actually
    unmap the pgtable as before.

    Signed-off-by: Xunlei Pang <xlpang@redhat.com>
    Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
    Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
    Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Minfei Huang <mhuang@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Xunlei Pang
     
  • If some kernel (module) code path stamps on the crash reserved
    memory (already mapped by the kernel) where the second kernel's data
    has been loaded, the kdump kernel will probably fail to boot when a
    panic happens (or even when it does not), leaving the culprit at
    large. This is unacceptable.

    The patch introduces a mechanism for detecting such cases:

    1) After each crash kexec loading, it simply marks the reserved memory
    regions readonly, since we no longer access them after that. When
    someone stamps the region, the first kernel will panic and trigger
    the kdump. The weak arch_kexec_protect_crashkres() is introduced to
    do the actual protection.

    2) To allow multiple loading, once 1) is done we also need to remark
    the reserved memory to readwrite each time a system call related to
    kdump is made. The weak arch_kexec_unprotect_crashkres() is introduced
    to do the actual unprotection.

    The architecture can make its specific implementation by overriding
    arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().

    Signed-off-by: Xunlei Pang
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Minfei Huang
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

21 May, 2016

1 commit

  • In NMI context, printk() messages are stored into per-CPU buffers to
    avoid a possible deadlock. They are normally flushed to the main ring
    buffer via an IRQ work. But the work is never called when the system
    calls panic() in the very same NMI handler.

    This patch tries to flush NMI buffers before the crash dump is
    generated. In this case it does not risk a double release and bails out
    when the logbuf_lock is already taken. The aim is to get the messages
    into the main ring buffer when possible. It makes them better
    accessible in the vmcore.

    Then the patch tries to flush the buffers a second time when other CPUs
    are down. It might be more aggressive and reset logbuf_lock. The aim
    is to get the messages available for the consequent kmsg_dump() and
    console_flush_on_panic() calls.

    The patch causes vprintk_emit() to be called even in NMI context again.
    But it is done via printk_deferred() so that the console handling is
    skipped. Consoles use internal locks and we could not prevent a
    deadlock easily. They are explicitly called later when the crash dump
    is not generated, see console_flush_on_panic().

    Signed-off-by: Petr Mladek
    Cc: Benjamin Herrenschmidt
    Cc: Daniel Thompson
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Martin Schwidefsky
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Russell King
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     

20 May, 2016

1 commit

  • Many developers already know that the field for the reference count
    of struct page is _count and of atomic type. They may try to handle
    it directly, and this could break the purpose of the page reference
    count tracepoint. To prevent direct _count modification, this patch
    renames it to _refcount and adds a warning message in the code.
    After that, developers who need to handle the reference count will
    find that the field should not be accessed directly.

    [akpm@linux-foundation.org: fix comments, per Vlastimil]
    [akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
    [sfr@canb.auug.org.au: sync ethernet driver changes]
    Signed-off-by: Joonsoo Kim
    Signed-off-by: Stephen Rothwell
    Cc: Vlastimil Babka
    Cc: Hugh Dickins
    Cc: Johannes Berg
    Cc: "David S. Miller"
    Cc: Sunil Goutham
    Cc: Chris Metcalf
    Cc: Manish Chopra
    Cc: Yuval Mintz
    Cc: Tariq Toukan
    Cc: Saeed Mahameed
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

29 Apr, 2016

2 commits

  • PageAnon() always looks at the head page to check PAGE_MAPPING_ANON,
    and a tail page's page->mapping contains just poisoned data since
    commit 1c290f642101 ("mm: sanitize page->mapping for tail pages").

    If makedumpfile checks page->mapping of a compound tail page to
    distinguish anonymous pages as usual, it will fail on newer kernels.
    So it's necessary to export OFFSET(page.compound_head) to avoid
    checking compound tail pages.

    The problem is that unnecessary hugepages won't be removed from a dump
    file in kernels 4.5.x and later. This means that extra disk space would
    be consumed. It's a problem, but not critical.

    Signed-off-by: Atsushi Kumagai
    Acked-by: Dave Young
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Kumagai
     
  • makedumpfile refers to page.lru.next to get the order of compound
    pages for page filtering.

    However, the order is now stored in page.compound_order, hence
    VMCOREINFO should be updated to export the offset of
    page.compound_order.

    In fact, page.compound_order was introduced already in kernel 4.0,
    but its offset was the same as page.lru.next until kernel 4.3, so
    this was not an actual problem.

    The above also applies to page.lru.prev and page.compound_dtor,
    which are necessary to detect hugetlbfs pages. Further, the content
    was changed from a direct address to an ID which identifies the
    dtor.

    The problem is that unnecessary hugepages won't be removed from a dump
    file in kernels 4.4.x and later. This means that extra disk space would
    be consumed. It's a problem, but not critical.

    Signed-off-by: Atsushi Kumagai
    Acked-by: Dave Young
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Kumagai