17 Jul, 2018
1 commit
-
In order for LSMs and IMA-appraisal to differentiate between kexec_load
and kexec_file_load syscalls, both the original and new syscalls must
call an LSM hook. This patch adds a call to security_kernel_load_data()
in the original kexec_load syscall.Signed-off-by: Mimi Zohar
Cc: Eric Biederman
Cc: Kees Cook
Acked-by: Serge Hallyn
Acked-by: Kees Cook
Signed-off-by: James Morris
03 Apr, 2018
1 commit
-
do_kexec_load() can be called directly by compat_sys_kexec() as long as
the same parameters checks are completed which are currently handled
(also) by sys_kexec(). Therefore, move those to kexec_load_check(),
call that newly introduced helper function from both sys_kexec() and
compat_sys_kexec(), and duplicate the remaining code from sys_kexec()
in compat_sys_kexec().This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.netCc: Eric Biederman
Cc: kexec@lists.infradead.org
Signed-off-by: Dominik Brodowski
13 Jul, 2017
1 commit
-
Currently vmcoreinfo data is updated at boot time subsys_initcall(), it
has the risk of being modified by some wrong code during system is
running.As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on,
when using "crash", "makedumpfile", etc utility to parse this vmcore, we
probably will get "Segmentation fault" or other unexpected errors.E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
system; 3) trigger kdump, then we obviously will fail to recognize the
crash context correctly due to the corrupted vmcoreinfo.Now except for vmcoreinfo, all the crash data is well
protected(including the cpu note which is fully updated in the crash
path, thus its correctness is guaranteed). Given that vmcoreinfo data
is a large chunk prepared for kdump, we better protect it as well.To solve this, we relocate and copy vmcoreinfo_data to the crash memory
when kdump is loading via kexec syscalls. Because the whole crash
memory will be protected by existing arch_kexec_protect_crashkres()
mechanism, we naturally protect vmcoreinfo_data from write(even read)
access under kernel direct mapping after kdump is loaded.Since kdump is usually loaded at the very early stage after boot, we can
trust the correctness of the vmcoreinfo data copied.On the other hand, we still need to operate the vmcoreinfo safe copy
when crash happens to generate vmcoreinfo_note again, we rely on vmap()
to map out a new kernel virtual address and update to use this new one
instead in the following crash_save_vmcoreinfo().BTW, we do not touch vmcoreinfo_note, because it will be fully updated
using the protected vmcoreinfo_data after crash which is surely correct
just like the cpu crash note.Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang
Tested-by: Michael Holzheu
Cc: Benjamin Herrenschmidt
Cc: Dave Young
Cc: Eric Biederman
Cc: Hari Bathini
Cc: Juergen Gross
Cc: Mahesh Salgaonkar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Aug, 2016
1 commit
-
kexec physical addresses are the boot-time view of the system. For
certain ARM systems (such as Keystone 2), the boot view of the system
does not match the kernel's view of the system: the boot view uses a
special alias in the lower 4GB of the physical address space.To cater for these kinds of setups, we need to translate between the
boot view physical addresses and the normal kernel view physical
addresses. This patch extracts the current transation points into
linux/kexec.h, and allows an architecture to override the functions.Due to the translations required, we unfortunately end up with six
translation functions, which are reduced down to four that the
architecture can override.[akpm@linux-foundation.org: kexec.h needs asm/io.h for phys_to_virt()]
Link: http://lkml.kernel.org/r/E1b8koP-0004HZ-Vf@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King
Cc: Keerthy
Cc: Pratyush Anand
Cc: Vitaly Andrianov
Cc: Eric Biederman
Cc: Dave Young
Cc: Baoquan He
Cc: Vivek Goyal
Cc: Simon Horman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 May, 2016
4 commits
-
…unprotect)_crashkres()
Commit 3f625002581b ("kexec: introduce a protection mechanism for the
crashkernel reserved memory") is a similar mechanism for protecting the
crash kernel reserved memory to previous crash_map/unmap_reserved_pages()
implementation, the new one is more generic in name and cleaner in code
(besides, some arch may not be allowed to unmap the pgtable).Therefore, this patch consolidates them, and uses the new
arch_kexec_protect(unprotect)_crashkres() to replace former
crash_map/unmap_reserved_pages() which by now has been only used by
S390.The consolidation work needs the crash memory to be mapped initially,
this is done in machine_kdump_pm_init() which is after
reserve_crashkernel(). Once kdump kernel is loaded, the new
arch_kexec_protect_crashkres() implemented for S390 will actually
unmap the pgtable like before.Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Minfei Huang <mhuang@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> -
There are a lof of work to be done in function kexec_load, not only for
allocating structs and loading initram, but also for some misc.To make it more clear, wrap a new function do_kexec_load which is used
to allocate structs and load initram. And the pre-work will be done in
kexec_load.Signed-off-by: Minfei Huang
Cc: Vivek Goyal
Cc: "Eric W. Biederman"
Cc: Xunlei Pang
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
For some arch, kexec shall map the reserved pages, then use them, when
we try to start the kdump service.kexec may return directly, without unmaping the reserved pages, if it
fails during starting service. To fix it, we make a pair of map/unmap
reserved pages both in generic path and error path.This patch only affects s390. Other architecturess don't implement the
interface of crash_unmap_reserved_pages and crash_map_reserved_pages.It isn't a urgent patch. Kernel can work well without any risk,
although the reserved pages are not unmapped before returning in error
path.Signed-off-by: Minfei Huang
Cc: Vivek Goyal
Cc: "Eric W. Biederman"
Cc: Xunlei Pang
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
For the cases that some kernel (module) path stamps the crash reserved
memory(already mapped by the kernel) where has been loaded the second
kernel data, the kdump kernel will probably fail to boot when panic
happens (or even not happens) leaving the culprit at large, this is
unacceptable.The patch introduces a mechanism for detecting such cases:
1) After each crash kexec loading, it simply marks the reserved memory
regions readonly since we no longer access it after that. When someone
stamps the region, the first kernel will panic and trigger the kdump.
The weak arch_kexec_protect_crashkres() is introduced to do the actual
protection.2) To allow multiple loading, once 1) was done we also need to remark
the reserved memory to readwrite each time a system call related to
kdump is made. The weak arch_kexec_unprotect_crashkres() is introduced
to do the actual protection.The architecture can make its specific implementation by overriding
arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().Signed-off-by: Xunlei Pang
Cc: Eric Biederman
Cc: Dave Young
Cc: Minfei Huang
Cc: Vivek Goyal
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Jan, 2016
1 commit
-
sanity_check_segment_list() checks KEXEC_TYPE_CRASH flag to ensure all the
segments of the loaded crash kernel are within the kernel crash resource
limits, so set the flag beforehand.Signed-off-by: Xunlei Pang
Acked-by: Dave Young
Cc: Eric Biederman
Cc: Vivek Goyal
Acked-by: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Nov, 2015
1 commit
-
kexec output message misses the prefix "kexec", when Dave Young split the
kexec code. Now, we use file name as the output message prefix.Currently, the format of output message:
[ 140.290795] SYSC_kexec_load: hello, world
[ 140.291534] kexec: sanity_check_segment_list: hello, worldIdeally, the format of output message:
[ 30.791503] kexec: SYSC_kexec_load, Hello, world
[ 79.182752] kexec_core: sanity_check_segment_list, Hello, worldRemove the custom prefix "kexec" in output message.
Signed-off-by: Minfei Huang
Acked-by: Dave Young
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Sep, 2015
2 commits
-
There are two kexec load syscalls, kexec_load another and kexec_file_load.
kexec_file_load has been splited as kernel/kexec_file.c. In this patch I
split kexec_load syscall code to kernel/kexec.c.And add a new kconfig option KEXEC_CORE, so we can disable kexec_load and
use kexec_file_load only, or vice verse.The original requirement is from Ted Ts'o, he want kexec kernel signature
being checked with CONFIG_KEXEC_VERIFY_SIG enabled. But kexec-tools use
kexec_load syscall can bypass the checking.Vivek Goyal proposed to create a common kconfig option so user can compile
in only one syscall for loading kexec kernel. KEXEC/KEXEC_FILE selects
KEXEC_CORE so that old config files still work.Because there's general code need CONFIG_KEXEC_CORE, so I updated all the
architecture Kconfig with a new option KEXEC_CORE, and let KEXEC selects
KEXEC_CORE in arch Kconfig. Also updated general kernel code with to
kexec_load syscall.[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young
Cc: Eric W. Biederman
Cc: Vivek Goyal
Cc: Petr Tesarik
Cc: Theodore Ts'o
Cc: Josh Boyer
Cc: David Howells
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Split kexec_file syscall related code to another file kernel/kexec_file.c
so that the #ifdef CONFIG_KEXEC_FILE in kexec.c can be dropped.Sharing variables and functions are moved to kernel/kexec_internal.h per
suggestion from Vivek and Petr.[akpm@linux-foundation.org: fix bisectability]
[akpm@linux-foundation.org: declare the various arch_kexec functions]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Dave Young
Cc: Eric W. Biederman
Cc: Vivek Goyal
Cc: Petr Tesarik
Cc: Theodore Ts'o
Cc: Josh Boyer
Cc: David Howells
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Jul, 2015
1 commit
-
Commit f06e5153f4ae2e ("kernel/panic.c: add "crash_kexec_post_notifiers"
option for kdump after panic_notifers") introduced
"crash_kexec_post_notifiers" kernel boot option, which toggles wheather
panic() calls crash_kexec() before panic_notifiers and dump kmsg or after.The problem is that the commit overlooks panic_on_oops kernel boot option.
If it is enabled, crash_kexec() is called directly without going through
panic() in oops path.To fix this issue, this patch adds a check to "crash_kexec_post_notifiers"
in the condition of kexec_should_crash().Also, put a comment in kexec_should_crash() to explain not obvious things
on this patch.Signed-off-by: HATAYAMA Daisuke
Acked-by: Baoquan He
Tested-by: Hidehiro Kawai
Reviewed-by: Masami Hiramatsu
Cc: Vivek Goyal
Cc: Ingo Molnar
Cc: Hidehiro Kawai
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Apr, 2015
1 commit
-
Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
to override the gfp flags of the allocation for the kexec control
page. The loop in kimage_alloc_normal_control_pages allocates pages
with GFP_KERNEL until a page is found that happens to have an
address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
the loop will keep allocating memory until the oom killer steps in.Signed-off-by: Martin Schwidefsky
18 Feb, 2015
3 commits
-
Simplify the code around one of the conditionals in the kexec_load syscall
routine.The original code was confusing with a redundant check on KEXEC_ON_CRASH
and comments outside of the conditional block. This change switches the
order of the conditional check, and cleans up the comments for the
conditional. There is no functional change to the code.Signed-off-by: Geoff Levand
Acked-by: Vivek Goyal
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: H. Peter Anvin
Cc: Maximilian Attems
Cc: Michal Marek
Cc: Paul Bolle
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Alexander Kuleshov
Acked-by: "Eric W. Biederman"
Acked-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
struct kimage has a member destination which is used to store the real
destination address of each page when load segment from user space buffer
to kernel. But we never retrieve the value stored in kimage->destination,
so this member variable in kimage and its assignment operation are
redundent code.I guess for_each_kimage_entry just does the work that kimage->destination
is expected to do.So in this patch just make a cleanup to remove it.
Signed-off-by: Baoquan He
Cc: "Eric W. Biederman"
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Feb, 2015
1 commit
-
Pull trivial tree changes from Jiri Kosina:
"Patches from trivial.git that keep the world turning around.Mostly documentation and comment fixes, and a two corner-case code
fixes from Alan Cox"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
kexec, Kconfig: spell "architecture" properly
mm: fix cleancache debugfs directory path
blackfin: mach-common: ints-priority: remove unused function
doubletalk: probe failure causes OOPS
ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
msdos_fs.h: fix 'fields' in comment
scsi: aic7xxx: fix comment
ARM: l2c: fix comment
ibmraid: fix writeable attribute with no store method
dynamic_debug: fix comment
doc: usbmon: fix spelling s/unpriviledged/unprivileged/
x86: init_mem_mapping(): use capital BIOS in comment
26 Jan, 2015
1 commit
-
Grepping for "archicture" showed it actually twice! Most unusual
spelling error, very interesting. :)Signed-off-by: Borislav Petkov
Signed-off-by: Jiri Kosina
14 Dec, 2014
1 commit
-
Remove unnecessary KERN_ERR from pr_err() within kexec.c.
Signed-off-by: Masanari Iida
Acked-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Oct, 2014
2 commits
-
This is a cleanup. In function parse_crashkernel_suffix, the parameter
crash_base is not used. So here remove it.Signed-off-by: Baoquan He
Acked-by: Vivek Goyal
Cc: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In locate_mem_hole functions, a memory hole is located and added as
kexec_segment. But from the name of locate_mem_hole, it should only take
responsibility of searching a available memory hole to contain data of a
specified size.So in this patch add a new field 'mem' into kexec_buf, then take that
kexec segment adding code out of locate_mem_hole_top_down and
locate_mem_hole_bottom_up. This make clear of the functionality of
locate_mem_hole just like it declars to do. And by this
locate_mem_hole_callback chould be used later if anyone want to locate a
memory hole for other use.Meanwhile Vivek suggested opening code function __kexec_add_segment(),
that way we have to retreive ksegment pointer once and it is easy to read.
So just do it in this patch and remove __kexec_add_segment() since no one
use it anymore.Signed-off-by: Baoquan He
Acked-by: Vivek Goyal
Cc: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Aug, 2014
1 commit
-
Currently new system call kexec_file_load() and all the associated code
compiles if CONFIG_KEXEC=y. But new syscall also compiles purgatory
code which currently uses gcc option -mcmodel=large. This option seems
to be available only gcc 4.4 onwards.Hiding new functionality behind a new config option will not break
existing users of old gcc. Those who wish to enable new functionality
will require new gcc. Having said that, I am trying to figure out how
can I move away from using -mcmodel=large but that can take a while.I think there are other advantages of introducing this new config
option. As this option will be enabled only on x86_64, other arches
don't have to compile generic kexec code which will never be used. This
new code selects CRYPTO=y and CRYPTO_SHA256=y. And all other arches had
to do this for CONFIG_KEXEC. Now with introduction of new config
option, we can remove crypto dependency from other arches.Now CONFIG_KEXEC_FILE is available only on x86_64. So whereever I had
CONFIG_X86_64 defined, I got rid of that.For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
"depends on CRYPTO=y". This should be safer as "select" is not
recursive.Signed-off-by: Vivek Goyal
Cc: Eric Biederman
Cc: H. Peter Anvin
Tested-by: Shaun Ruffell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Aug, 2014
9 commits
-
This is the final piece of the puzzle of verifying kernel image signature
during kexec_file_load() syscall.This patch calls into PE file routines to verify signature of bzImage. If
signature are valid, kexec_file_load() succeeds otherwise it fails.Two new config options have been introduced. First one is
CONFIG_KEXEC_VERIFY_SIG. This option enforces that kernel has to be
validly signed otherwise kernel load will fail. If this option is not
set, no signature verification will be done. Only exception will be when
secureboot is enabled. In that case signature verification should be
automatically enforced when secureboot is enabled. But that will happen
when secureboot patches are merged.Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and previous one is set, kernel image loading will fail because kernel
does not have support to verify signature of bzImage.I tested these patches with both "pesign" and "sbsign" signed bzImages.
I used signing_key.priv key and signing_key.x509 cert for signing as
generated during kernel build process (if module signing is enabled).Used following method to sign bzImage.
pesign
======
- Convert DER format cert to PEM format cert
openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
PEM- Generate a .p12 file from existing cert and private key file
openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
signing_key.x509.PEM- Import .p12 file into pesign db
pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign- Sign bzImage
pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
-c "Glacier signing key - Magrathea" -ssbsign
======
sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
/boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+Patch details:
Well all the hard work is done in previous patches. Now bzImage loader
has just call into that code and verify whether bzImage signature are
valid or not.Also create two config options. First one is CONFIG_KEXEC_VERIFY_SIG.
This option enforces that kernel has to be validly signed otherwise kernel
load will fail. If this option is not set, no signature verification will
be done. Only exception will be when secureboot is enabled. In that case
signature verification should be automatically enforced when secureboot is
enabled. But that will happen when secureboot patches are merged.Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and previous one is set, kernel image loading will fail because kernel
does not have support to verify signature of bzImage.Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Cc: Matt Fleming
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch adds support for loading a kexec on panic (kdump) kernel usning
new system call.It prepares ELF headers for memory areas to be dumped and for saved cpu
registers. Also prepares the memory map for second kernel and limits its
boot to reserved areas only.Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is loader specific code which can load bzImage and set it up for
64bit entry. This does not take care of 32bit entry or real mode entry.32bit mode entry can be implemented if somebody needs it.
Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Load purgatory code in RAM and relocate it based on the location.
Relocation code has been inspired by module relocation code and purgatory
relocation code in kexec-tools.Also compute the checksums of loaded kexec segments and store them in
purgatory.Arch independent code provides this functionality so that arch dependent
bootloaders can make use of it.Helper functions are provided to get/set symbol values in purgatory which
are used by bootloaders later to set things like stack and entry point of
second kernel etc.Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Previous patch provided the interface definition and this patch prvides
implementation of new syscall.Previously segment list was prepared in user space. Now user space just
passes kernel fd, initrd fd and command line and kernel will create a
segment list internally.This patch contains generic part of the code. Actual segment preparation
and loading is done by arch and image specific loader. Which comes in
next patch.[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is the new syscall kexec_file_load() declaration/interface. I have
reserved the syscall number only for x86_64 so far. Other architectures
(including i386) can reserve syscall number when they enable the support
for this new syscall.Signed-off-by: Vivek Goyal
Cc: Michael Kerrisk
Cc: Borislav Petkov
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
kimage_normal_alloc() and kimage_crash_alloc() are doing lot of similar
things and differ only little. So instead of having two separate
functions create a common function kimage_alloc_init() and pass it the
"flags" argument which tells whether it is normal kexec or kexec_on_panic.
And this function should be able to deal with both the cases.This consolidation also helps later where we can use a common function
kimage_file_alloc_init() to handle normal and crash cases for new file
based kexec syscall.Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Previously do_kimage_alloc() will allocate a kimage structure, copy
segment list from user space and then do the segment list sanity
verification.Break down this function in 3 parts. do_kimage_alloc_init() to do actual
allocation and basic initialization of kimage structure.
copy_user_segment_list() to copy segment list from user space and
sanity_check_segment_list() to verify the sanity of segment list as passed
by user space.In later patches, I need to only allocate kimage and not copy segment list
from user space. So breaking down in smaller functions enables re-use of
code at other places.Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Let's use the more common "unusable".
This patch was originally written and posted by Boris. I am including it
in this patch series.Signed-off-by: Borislav Petkov
Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Jul, 2014
2 commits
-
free_huge_page() is undefined without CONFIG_HUGETLBFS and there's no
need to filter PageHuge() page is such a configuration either, so avoid
exporting the symbol to fix a build error:In file included from kernel/kexec.c:14:0:
kernel/kexec.c: In function 'crash_save_vmcoreinfo_init':
kernel/kexec.c:1623:20: error: 'free_huge_page' undeclared (first use in this function)
VMCOREINFO_SYMBOL(free_huge_page);
^Introduced by commit 8f1d26d0e59b ("kexec: export free_huge_page to
VMCOREINFO")Reported-by: kbuild test robot
Acked-by: Olof Johansson
Cc: Atsushi Kumagai
Cc: Baoquan He
Cc: Vivek Goyal
Cc: Andrew Morton
Signed-off-by: David Rientjes
Signed-off-by: Linus Torvalds -
PG_head_mask was added into VMCOREINFO to filter huge pages in b3acc56bfe1
("kexec: save PG_head_mask in VMCOREINFO"), but makedumpfile still need
another symbol to filter *hugetlbfs* pages.If a user hope to filter user pages, makedumpfile tries to exclude them by
checking the condition whether the page is anonymous, but hugetlbfs pages
aren't anonymous while they also be user pages.We know it's possible to detect them in the same way as PageHuge(),
so we need the start address of free_huge_page():int PageHuge(struct page *page)
{
if (!PageCompound(page))
return 0;page = compound_head(page);
return get_compound_page_dtor(page) == free_huge_page;
}For that reason, this patch changes free_huge_page() into public
to export it to VMCOREINFO.Signed-off-by: Atsushi Kumagai
Acked-by: Baoquan He
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 Jun, 2014
1 commit
-
To allow filtering of huge pages, makedumpfile must be able to identify
them in the dump. This can be done by checking the appropriate page
flag, so communicate its value to makedumpfile through the VMCOREINFO
interface.There's only one small catch. Depending on how many page flags are
available on a given architecture, this bit can be called PG_head or
PG_compound.I sent a similar patch back in 2012, but Eric Biederman did not like
using an #ifdef. So, this time I'm adding a common symbol
(PG_head_mask) instead.See https://lkml.org/lkml/2012/11/28/91 for the previous version.
Signed-off-by: Petr Tesarik
Acked-by: Vivek Goyal
Cc: Eric Biederman
Cc: Paul Mackerras
Cc: Fengguang Wu
Cc: Benjamin Herrenschmidt
Cc: Shaohua Li
Cc: Alexey Kardashevskiy
Cc: Sasha Levin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Jun, 2014
1 commit
-
+ some pr_warning -> pr_warn and checkpatch warning fixes
Signed-off-by: Fabian Frederick
Cc: Eric Biederman
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 May, 2014
1 commit
-
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:[ 0.089866] POWER8 performance monitor hardware support registered
[ 0.089985] power8-pmu: PMAO restore workaround active.
[ 5.095419] Processor 1 is stuck.
[ 10.097933] Processor 2 is stuck.
[ 15.100480] Processor 3 is stuck.
[ 20.102982] Processor 4 is stuck.
[ 25.105489] Processor 5 is stuck.
[ 30.108005] Processor 6 is stuck.
[ 35.110518] Processor 7 is stuck.
[ 40.113369] Processor 9 is stuck.
[ 45.115879] Processor 10 is stuck.
[ 50.118389] Processor 11 is stuck.
[ 55.120904] Processor 12 is stuck.
[ 60.123425] Processor 13 is stuck.
[ 65.125970] Processor 14 is stuck.
[ 70.128495] Processor 15 is stuck.
[ 75.131316] Processor 17 is stuck.Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.Fixes: c97102ba963 (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Benjamin Herrenschmidt
08 Apr, 2014
1 commit
-
To increase compiler portability there is which
provides convenience macros for various gcc constructs. Eg: __weak for
__attribute__((weak)). I've replaced all instances of gcc attributes
with the right macro in the kernel subsystem.Signed-off-by: Gideon Israel Dsouza
Cc: "Rafael J. Wysocki"
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Apr, 2014
1 commit
-
Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular. So using module_init as an
alias for __initcall can be somewhat misleading.Fix these up now, so that we can relocate module_init from init.h into
module.h in the future. If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.The audit targets the following module_init users for change:
kernel/user.c obj-y
kernel/kexec.c bool KEXEC (one instance per arch)
kernel/profile.c bool PROFILING
kernel/hung_task.c bool DETECT_HUNG_TASK
kernel/sched/stats.c bool SCHEDSTATS
kernel/user_namespace.c bool USER_NSNote that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e. slightly earlier). However no observable impact of that
difference has been observed during testing.Also, two instances of missing ";" at EOL are fixed in kexec.
Signed-off-by: Paul Gortmaker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Eric Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Mar, 2014
1 commit
-
In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that
performs proper zero and sign extension convert all 64 bit parameters
to their corresponding 32 bit compat counterparts.Signed-off-by: Heiko Carstens