Eric Lee / smarc-fsl-linux-kernel

03 Aug, 2016

1 commit

43546d866 kexec: allow architectures to override boot mapping

kexec physical addresses are the boot-time view of the system. For
certain ARM systems (such as Keystone 2), the boot view of the system
does not match the kernel's view of the system: the boot view uses a
special alias in the lower 4GB of the physical address space.

To cater for these kinds of setups, we need to translate between the
boot view physical addresses and the normal kernel view physical
addresses. This patch extracts the current translation points into
linux/kexec.h, and allows an architecture to override the functions.

Due to the translations required, we unfortunately end up with six
translation functions, which are reduced down to four that the
architecture can override.
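
A minimal user-space sketch of the override pattern described above. The
base addresses are hypothetical values chosen only for illustration, and
the "#define name name" idiom is assumed as the way an architecture
announces its override; only two of the translation helpers are shown.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Hypothetical alias layout, for illustration only. */
#define KERNEL_VIEW_BASE 0x800000000ULL	/* RAM as the running kernel sees it */
#define BOOT_VIEW_BASE   0x080000000ULL	/* alias of the same RAM below 4GB */

/* "Architecture" part: provide overrides and announce them via macros. */
static inline phys_addr_t phys_to_boot_phys(phys_addr_t phys)
{
	return phys - KERNEL_VIEW_BASE + BOOT_VIEW_BASE;
}
#define phys_to_boot_phys phys_to_boot_phys

static inline phys_addr_t boot_phys_to_phys(phys_addr_t boot_phys)
{
	return boot_phys - BOOT_VIEW_BASE + KERNEL_VIEW_BASE;
}
#define boot_phys_to_phys boot_phys_to_phys

/* "Generic" part: identity fallbacks, compiled only without an override. */
#ifndef phys_to_boot_phys
static inline phys_addr_t phys_to_boot_phys(phys_addr_t phys)
{
	return phys;
}
#endif

#ifndef boot_phys_to_phys
static inline phys_addr_t boot_phys_to_phys(phys_addr_t boot_phys)
{
	return boot_phys;
}
#endif
```

With this layout a kernel-view address just above KERNEL_VIEW_BASE maps
to the matching offset above BOOT_VIEW_BASE, and the two helpers are
inverses of each other.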

[akpm@linux-foundation.org: kexec.h needs asm/io.h for phys_to_virt()]
Link: http://lkml.kernel.org/r/E1b8koP-0004HZ-Vf@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King
Cc: Keerthy
Cc: Pratyush Anand
Cc: Vitaly Andrianov
Cc: Eric Biederman
Cc: Dave Young
Cc: Baoquan He
Cc: Vivek Goyal
Cc: Simon Horman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Russell King
2016-08-03 07:35:27 +0800

24 May, 2016

4 commits

7a0058ec7 s390/kexec: consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres()

Commit 3f625002581b ("kexec: introduce a protection mechanism for the
crashkernel reserved memory") added a mechanism for protecting the crash
kernel reserved memory that is similar to the previous
crash_map/unmap_reserved_pages() implementation. The new one is more
generic in name and cleaner in code (besides, some architectures may not
be allowed to unmap the pgtable).

Therefore, this patch consolidates them and uses the new
arch_kexec_protect(unprotect)_crashkres() to replace the former
crash_map/unmap_reserved_pages(), which by now is used only by
S390.

The consolidation work needs the crash memory to be mapped initially.
This is done in machine_kdump_pm_init(), which runs after
reserve_crashkernel(). Once the kdump kernel is loaded, the new
arch_kexec_protect_crashkres() implemented for S390 will actually
unmap the pgtable like before.

Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Minfei Huang <mhuang@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Xunlei Pang
2016-05-24 08:04:14 +0800
0eea08678 kexec: do a cleanup for function kexec_load

There is a lot of work to be done in function kexec_load, not only
allocating structs and loading initram, but also some miscellaneous work.

To make it clearer, wrap a new function do_kexec_load, which is used
to allocate structs and load initram. The pre-work will be done in
kexec_load.

Signed-off-by: Minfei Huang
Cc: Vivek Goyal
Cc: "Eric W. Biederman"
Cc: Xunlei Pang
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minfei Huang
2016-05-24 08:04:14 +0800
917a35605 kexec: make a pair of map/unmap reserved pages in error path

On some architectures, kexec must map the reserved pages before using
them when we try to start the kdump service.

kexec may return directly, without unmapping the reserved pages, if it
fails while starting the service. To fix this, we pair map/unmap of the
reserved pages in both the generic path and the error path.
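
The pairing can be sketched in user space as follows. The map/unmap
functions here are counting stand-ins that only borrow the names from
this message, and load_segments() is a hypothetical step that can fail;
the point is the single exit label that unmaps on both paths.

```c
#include <stdbool.h>

static int mapped_depth;	/* > 0 while the reserved pages are mapped */

/* Counting stand-ins for the real map/unmap hooks. */
static void crash_map_reserved_pages(void)   { mapped_depth++; }
static void crash_unmap_reserved_pages(void) { mapped_depth--; }

/* Hypothetical load step that can be told to fail. */
static int load_segments(bool simulate_failure)
{
	return simulate_failure ? -1 : 0;
}

int start_kdump_service(bool simulate_failure)
{
	int ret;

	crash_map_reserved_pages();

	ret = load_segments(simulate_failure);
	if (ret)
		goto out;	/* error path still reaches the unmap */

	/* further setup on success would go here */
out:
	crash_unmap_reserved_pages();
	return ret;
}

int reserved_pages_mapped(void)
{
	return mapped_depth > 0;
}
```

Whether the load succeeds or fails, reserved_pages_mapped() reports 0
once start_kdump_service() has returned.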

This patch only affects s390. Other architectures don't implement the
interface of crash_unmap_reserved_pages and crash_map_reserved_pages.

It isn't an urgent patch. The kernel works well without any risk,
although the reserved pages are not unmapped before returning in the
error path.

Signed-off-by: Minfei Huang
Cc: Vivek Goyal
Cc: "Eric W. Biederman"
Cc: Xunlei Pang
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minfei Huang
2016-05-24 08:04:14 +0800
9b492cf58 kexec: introduce a protection mechanism for the crashkernel reserved memory

If some kernel (module) path stamps on the crash reserved memory
(already mapped by the kernel) into which the second kernel's data has
been loaded, the kdump kernel will probably fail to boot when a panic
happens (or even if it does not happen), leaving the culprit at large.
This is unacceptable.

The patch introduces a mechanism for detecting such cases:

1) After each crash kexec load, it simply marks the reserved memory
regions read-only, since we no longer access them after that. When
someone stamps on the region, the first kernel will panic and trigger
the kdump. The weak arch_kexec_protect_crashkres() is introduced to do
the actual protection.

2) To allow multiple loads, once 1) is done we also need to re-mark
the reserved memory read-write each time a system call related to
kdump is made. The weak arch_kexec_unprotect_crashkres() is introduced
to undo the protection.

The architecture can make its specific implementation by overriding
arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().
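
A user-space sketch of the protect/unprotect cycle described in 1) and
2). The region, the readonly flag and the helper names other than the
two arch_kexec_*() hooks are hypothetical; in the kernel the hooks would
change page protections rather than a flag, and a stray write would
fault instead of returning an error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the crashkernel reserved region. */
struct crash_region {
	unsigned char mem[64];
	bool readonly;
};

static struct crash_region crashk;

/* Stand-ins for the arch hooks: here they only toggle a flag. */
static void arch_kexec_protect_crashkres(void)   { crashk.readonly = true; }
static void arch_kexec_unprotect_crashkres(void) { crashk.readonly = false; }

/* Every write to the region goes through this check. */
static int write_region(const void *src, size_t len)
{
	if (crashk.readonly || len > sizeof(crashk.mem))
		return -1;
	memcpy(crashk.mem, src, len);
	return 0;
}

/* A kdump-related "system call": unprotect, load, protect again. */
int load_crash_image(const void *img, size_t len)
{
	int ret;

	arch_kexec_unprotect_crashkres();
	ret = write_region(img, len);
	arch_kexec_protect_crashkres();
	return ret;
}

/* Any other writer hits the read-only region and is caught. */
int stray_write(void)
{
	unsigned char byte = 0xff;

	return write_region(&byte, 1);
}
```

Repeated loads keep working because each one unprotects first, while a
stray write between loads is rejected.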

Signed-off-by: Xunlei Pang
Cc: Eric Biederman
Cc: Dave Young
Cc: Minfei Huang
Cc: Vivek Goyal
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xunlei Pang
2016-05-24 08:04:14 +0800

21 Jan, 2016

1 commit

cdf4b3fa0 kexec: set KEXEC_TYPE_CRASH before sanity_check_segment_list()

sanity_check_segment_list() checks KEXEC_TYPE_CRASH flag to ensure all the
segments of the loaded crash kernel are within the kernel crash resource
limits, so set the flag beforehand.
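
The ordering issue can be illustrated with a small sketch. The structs,
the crash region bounds and the single-segment check are hypothetical
simplifications; the only point is that the check reads image->type, so
the type has to be set first.

```c
#define KEXEC_TYPE_DEFAULT 0
#define KEXEC_TYPE_CRASH   1

/* Hypothetical, simplified image with a single segment. */
struct segment_sketch {
	unsigned long mem;
	unsigned long memsz;
};

struct image_sketch {
	int type;
	struct segment_sketch seg;
};

/* Hypothetical bounds of the reserved crash kernel region. */
static const unsigned long crashk_start = 0x1000000UL;
static const unsigned long crashk_end   = 0x1ffffffUL;

/* The crash-region check only fires when the type is already set. */
static int sanity_check_segment(const struct image_sketch *image)
{
	unsigned long mstart = image->seg.mem;
	unsigned long mend = mstart + image->seg.memsz - 1;

	if (image->type == KEXEC_TYPE_CRASH &&
	    (mstart < crashk_start || mend > crashk_end))
		return -1;
	return 0;
}

int alloc_crash_image(struct image_sketch *image,
		      unsigned long mem, unsigned long memsz)
{
	image->seg.mem = mem;
	image->seg.memsz = memsz;

	/* Set the type before the check, otherwise the check is skipped. */
	image->type = KEXEC_TYPE_CRASH;

	return sanity_check_segment(image);
}
```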

Signed-off-by: Xunlei Pang
Acked-by: Dave Young
Cc: Eric Biederman
Cc: Vivek Goyal
Acked-by: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xunlei Pang
2016-01-21 09:09:18 +0800

07 Nov, 2015

1 commit

de90a6bca kexec: use file name as the output message prefix

The kexec output messages lost the "kexec" prefix when Dave Young split
the kexec code. Now we use the file name as the output message prefix.

Currently, the format of the output message is:
[ 140.290795] SYSC_kexec_load: hello, world
[ 140.291534] kexec: sanity_check_segment_list: hello, world

Ideally, the format of the output message is:
[ 30.791503] kexec: SYSC_kexec_load, Hello, world
[ 79.182752] kexec_core: sanity_check_segment_list, Hello, world

Remove the custom "kexec" prefix from the output messages.
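
One common way to get a per-file prefix is a pr_fmt() definition built
on KBUILD_MODNAME. The sketch below assumes that approach and emulates
it in user space: KBUILD_MODNAME is defined by hand here (kbuild would
normally provide it), and pr_info_to() is a hypothetical stand-in that
formats into a buffer instead of the kernel log.

```c
#include <stdio.h>

/* Assumption: normally supplied by the build system per object file. */
#define KBUILD_MODNAME "kexec_core"

/* Every message in this file gets the file-derived prefix. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Hypothetical stand-in for pr_info(): format into a caller buffer. */
#define pr_info_to(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)
```

A call such as pr_info_to(buf, sizeof(buf), "%s: hello, world\n",
"sanity_check_segment_list") then yields a line that starts with
"kexec_core: " without the caller spelling out the prefix.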

Signed-off-by: Minfei Huang
Acked-by: Dave Young
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minfei Huang
2015-11-07 09:50:42 +0800

11 Sep, 2015

2 commits

2965faa5e kexec: split kexec_load syscall from kexec core code

There are two kexec load syscalls, kexec_load and kexec_file_load.
kexec_file_load has been split out as kernel/kexec_file.c. In this patch
I split the kexec_load syscall code into kernel/kexec.c.

Also add a new kconfig option KEXEC_CORE, so we can disable kexec_load
and use kexec_file_load only, or vice versa.

The original requirement is from Ted Ts'o: he wants the kexec kernel
signature to be checked with CONFIG_KEXEC_VERIFY_SIG enabled. But
kexec-tools using the kexec_load syscall can bypass the check.

Vivek Goyal proposed creating a common kconfig option so that users can
compile in only one syscall for loading the kexec kernel. KEXEC/KEXEC_FILE
select KEXEC_CORE so that old config files still work.

Because there is general code that needs CONFIG_KEXEC_CORE, I updated all
the architecture Kconfig files with a new option KEXEC_CORE and let KEXEC
select KEXEC_CORE in the arch Kconfig. Also updated the general kernel
code related to the kexec_load syscall.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young
Cc: Eric W. Biederman
Cc: Vivek Goyal
Cc: Petr Tesarik
Cc: Theodore Ts'o
Cc: Josh Boyer
Cc: David Howells
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Young
2015-09-11 04:29:01 +0800
a43cac0d9 kexec: split kexec_file syscall code to kexec_file.c

Split kexec_file syscall related code to another file kernel/kexec_file.c
so that the #ifdef CONFIG_KEXEC_FILE in kexec.c can be dropped.

Shared variables and functions are moved to kernel/kexec_internal.h per
suggestion from Vivek and Petr.

[akpm@linux-foundation.org: fix bisectability]
[akpm@linux-foundation.org: declare the various arch_kexec functions]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Dave Young
Cc: Eric W. Biederman
Cc: Vivek Goyal
Cc: Petr Tesarik
Cc: Theodore Ts'o
Cc: Josh Boyer
Cc: David Howells
Cc: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Young
2015-09-11 04:29:01 +0800

01 Jul, 2015

1 commit

5375b708f kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path

Commit f06e5153f4ae2e ("kernel/panic.c: add "crash_kexec_post_notifiers"
option for kdump after panic_notifers") introduced the
"crash_kexec_post_notifiers" kernel boot option, which toggles whether
panic() calls crash_kexec() before panic_notifiers and dump kmsg or after.

The problem is that the commit overlooks panic_on_oops kernel boot option.
If it is enabled, crash_kexec() is called directly without going through
panic() in oops path.

To fix this issue, this patch adds a check of "crash_kexec_post_notifiers"
to the condition in kexec_should_crash().

Also, add a comment in kexec_should_crash() to explain the non-obvious
parts of this patch.
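
The decision can be sketched in user space as below. The context struct
and its field names are hypothetical simplifications of the conditions
the oops path looks at; only the early return on
crash_kexec_post_notifiers reflects the fix described here.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of what the oops path examines. */
struct oops_ctx {
	bool in_interrupt;
	bool is_idle_or_init;
	bool panic_on_oops;
};

static bool crash_kexec_post_notifiers;

void set_crash_kexec_post_notifiers(bool enabled)
{
	crash_kexec_post_notifiers = enabled;
}

bool kexec_should_crash_sketch(const struct oops_ctx *ctx)
{
	/*
	 * With the option set, crash_kexec() must not run from the oops
	 * path; it is left to panic(), after the panic notifiers.
	 */
	if (crash_kexec_post_notifiers)
		return false;

	return ctx->in_interrupt || ctx->is_idle_or_init ||
	       ctx->panic_on_oops;
}
```

With panic_on_oops set, the sketch returns true only while the option is
off, which is the behaviour the message asks for.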

Signed-off-by: HATAYAMA Daisuke
Acked-by: Baoquan He
Tested-by: Hidehiro Kawai
Reviewed-by: Masami Hiramatsu
Cc: Vivek Goyal
Cc: Ingo Molnar
Cc: Hidehiro Kawai
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

HATAYAMA Daisuke
2015-07-01 10:44:57 +0800

23 Apr, 2015

1 commit

7e01b5acd kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
to override the gfp flags of the allocation for the kexec control
page. The loop in kimage_alloc_normal_control_pages allocates pages
with GFP_KERNEL until a page is found that happens to have an
address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
the loop will keep allocating memory until the oom killer steps in.

Signed-off-by: Martin Schwidefsky

Martin Schwidefsky
2015-04-23 22:52:01 +0800

18 Feb, 2015

3 commits

518a0c716 kexec: simplify conditional

Simplify the code around one of the conditionals in the kexec_load syscall
routine.

The original code was confusing with a redundant check on KEXEC_ON_CRASH
and comments outside of the conditional block. This change switches the
order of the conditional check, and cleans up the comments for the
conditional. There is no functional change to the code.

Signed-off-by: Geoff Levand
Acked-by: Vivek Goyal
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: H. Peter Anvin
Cc: Maximilian Attems
Cc: Michal Marek
Cc: Paul Bolle
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geoff Levand
2015-02-18 06:34:51 +0800
ad6993498 kexec: fix a typo in comment

Signed-off-by: Alexander Kuleshov
Acked-by: "Eric W. Biederman"
Acked-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-02-18 06:34:51 +0800
73d7e3eac kexec: remove never used member destination in kimage

struct kimage has a member destination which is used to store the real
destination address of each page when loading a segment from a user space
buffer to the kernel. But we never retrieve the value stored in
kimage->destination, so this member variable in kimage and its assignment
operation are redundant code.

I guess for_each_kimage_entry just does the work that kimage->destination
is expected to do.

So this patch just makes a cleanup to remove it.

Signed-off-by: Baoquan He
Cc: "Eric W. Biederman"
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Baoquan He
2015-02-18 06:34:51 +0800

11 Feb, 2015

1 commit

29afc4e9a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

Pull trivial tree changes from Jiri Kosina:
"Patches from trivial.git that keep the world turning around.

Mostly documentation and comment fixes, and a two corner-case code
fixes from Alan Cox"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
kexec, Kconfig: spell "architecture" properly
mm: fix cleancache debugfs directory path
blackfin: mach-common: ints-priority: remove unused function
doubletalk: probe failure causes OOPS
ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
msdos_fs.h: fix 'fields' in comment
scsi: aic7xxx: fix comment
ARM: l2c: fix comment
ibmraid: fix writeable attribute with no store method
dynamic_debug: fix comment
doc: usbmon: fix spelling s/unpriviledged/unprivileged/
x86: init_mem_mapping(): use capital BIOS in comment

Linus Torvalds
2015-02-11 10:57:15 +0800

26 Jan, 2015

1 commit

edb0ec072 kexec, Kconfig: spell "architecture" properly

Grepping for "archicture" showed it actually twice! Most unusual
spelling error, very interesting. :)

Signed-off-by: Borislav Petkov
Signed-off-by: Jiri Kosina

Borislav Petkov
2015-01-26 21:36:46 +0800

14 Dec, 2014

1 commit

d5393955c kexec: remove unnecessary KERN_ERR from kexec.c

Remove unnecessary KERN_ERR from pr_err() within kexec.c.

Signed-off-by: Masanari Iida
Acked-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masanari Iida
2014-12-14 04:42:51 +0800

14 Oct, 2014

2 commits

36f3f500e kexec: remove the unused function parameter

This is a cleanup. In function parse_crashkernel_suffix, the parameter
crash_base is not used, so remove it.

Signed-off-by: Baoquan He
Acked-by: Vivek Goyal
Cc: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Baoquan He
2014-10-14 08:18:21 +0800
669280a15 kexec: take the segment adding out of locate_mem_hole functions

In the locate_mem_hole functions, a memory hole is located and added as a
kexec_segment. But going by the name locate_mem_hole, these functions
should only be responsible for searching for an available memory hole
that can contain data of a specified size.

So this patch adds a new field 'mem' to kexec_buf, then takes the kexec
segment adding code out of locate_mem_hole_top_down and
locate_mem_hole_bottom_up. This makes the functionality of
locate_mem_hole match what its name declares. With this change,
locate_mem_hole_callback could be used later if anyone wants to locate a
memory hole for another use.

Meanwhile, Vivek suggested open-coding function __kexec_add_segment().
That way we have to retrieve the ksegment pointer only once, and the code
is easy to read. So just do it in this patch and remove
__kexec_add_segment(), since no one uses it anymore.
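
A sketch of the separation this message describes: the search routine
only records the hole it found in a 'mem' field, and the caller decides
whether to turn it into a segment. The struct layouts and the "_sketch"
names are hypothetical simplifications; alignment is assumed to be a
non-zero power of two and 'end' is taken as inclusive.

```c
/* Hypothetical, cut-down buffer descriptor with the new 'mem' field. */
struct kexec_buf_sketch {
	unsigned long memsz;	/* size of the hole we need */
	unsigned long align;	/* power-of-two alignment */
	unsigned long mem;	/* filled in by the search */
};

struct segment_sketch {
	unsigned long mem;
	unsigned long memsz;
};

/* Only locates a hole in [start, end]; returns 1 when one was found. */
int locate_mem_hole_top_down_sketch(unsigned long start, unsigned long end,
				    struct kexec_buf_sketch *kbuf)
{
	unsigned long temp;

	if (kbuf->memsz == 0 || end < start ||
	    end - start < kbuf->memsz - 1)
		return 0;

	/* Highest possible start, rounded down to the alignment. */
	temp = end - kbuf->memsz + 1;
	temp &= ~(kbuf->align - 1);
	if (temp < start)
		return 0;

	kbuf->mem = temp;
	return 1;
}

/* The caller, not the search, adds the segment (open-coded). */
int add_buffer_sketch(unsigned long start, unsigned long end,
		      struct kexec_buf_sketch *kbuf,
		      struct segment_sketch *seg)
{
	if (!locate_mem_hole_top_down_sketch(start, end, kbuf))
		return -1;

	seg->mem = kbuf->mem;
	seg->memsz = kbuf->memsz;
	return 0;
}
```

Because the search no longer touches the segment list, the same routine
could serve a caller that only wants to know where a hole is.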

Signed-off-by: Baoquan He
Acked-by: Vivek Goyal
Cc: Eric W. Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Baoquan He
2014-10-14 08:18:21 +0800

30 Aug, 2014

1 commit

74ca317c2 kexec: create a new config option CONFIG_KEXEC_FILE for new syscall

Currently the new system call kexec_file_load() and all the associated
code compile if CONFIG_KEXEC=y. But the new syscall also compiles
purgatory code, which currently uses the gcc option -mcmodel=large. This
option seems to be available only from gcc 4.4 onwards.

Hiding the new functionality behind a new config option will not break
existing users of old gcc. Those who wish to enable the new functionality
will require a new gcc. Having said that, I am trying to figure out how
I can move away from using -mcmodel=large, but that can take a while.

I think there are other advantages of introducing this new config
option. As this option will be enabled only on x86_64, other arches
don't have to compile generic kexec code which will never be used. This
new code selects CRYPTO=y and CRYPTO_SHA256=y. And all other arches had
to do this for CONFIG_KEXEC. Now with introduction of new config
option, we can remove crypto dependency from other arches.

Now CONFIG_KEXEC_FILE is available only on x86_64. So wherever I had
CONFIG_X86_64 defined, I got rid of that.

For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
"depends on CRYPTO=y". This should be safer as "select" is not
recursive.

Signed-off-by: Vivek Goyal
Cc: Eric Biederman
Cc: H. Peter Anvin
Tested-by: Shaun Ruffell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-30 07:28:16 +0800

09 Aug, 2014

9 commits

8e7d83810 kexec: verify the signature of signed PE bzImage

This is the final piece of the puzzle of verifying kernel image signature
during kexec_file_load() syscall.

This patch calls into PE file routines to verify the signature of the
bzImage. If the signature is valid, kexec_file_load() succeeds; otherwise
it fails.

Two new config options have been introduced. The first one is
CONFIG_KEXEC_VERIFY_SIG. This option enforces that the kernel has to be
validly signed, otherwise the kernel load will fail. If this option is not
set, no signature verification will be done. The only exception will be
when secureboot is enabled. In that case signature verification should be
automatically enforced. But that will happen when the secureboot patches
are merged.

The second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and the previous one is set, kernel image loading will fail because
the kernel does not have support to verify the signature of a bzImage.
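
The interaction of the two options can be summarised in a small sketch.
The function and its boolean parameters are hypothetical (in the kernel
these are compile-time config options, and the failure would be a
specific errno rather than -1); the secureboot exception is left out
because the message says it is not merged yet.

```c
#include <stdbool.h>

/*
 * Returns 0 when the bzImage load may proceed, -1 when it must fail.
 *
 * kexec_verify_sig    stands in for CONFIG_KEXEC_VERIFY_SIG
 * bzimage_verify_sig  stands in for CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
 * signature_valid     result of the PE signature check, if it can run
 */
int bzimage_load_check(bool kexec_verify_sig, bool bzimage_verify_sig,
		       bool signature_valid)
{
	/* Enforcement off: no signature verification is done. */
	if (!kexec_verify_sig)
		return 0;

	/* Enforcement on, but nothing can verify a bzImage: fail. */
	if (!bzimage_verify_sig)
		return -1;

	/* Enforcement on and verifier present: signature decides. */
	return signature_valid ? 0 : -1;
}
```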

I tested these patches with both "pesign" and "sbsign" signed bzImages.

I used signing_key.priv key and signing_key.x509 cert for signing as
generated during kernel build process (if module signing is enabled).

Used the following method to sign bzImage.

pesign
======
- Convert DER format cert to PEM format cert
openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform PEM

- Generate a .p12 file from existing cert and private key file
openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in signing_key.x509.PEM

- Import .p12 file into pesign db
pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign

- Sign bzImage
pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign -c "Glacier signing key - Magrathea" -s

sbsign
======
sbsign --key signing_key.priv --cert signing_key.x509.PEM --output /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+

Patch details:

Well, all the hard work is done in previous patches. Now the bzImage
loader just has to call into that code and verify whether the bzImage
signature is valid or not.

Also create two config options. The first one is CONFIG_KEXEC_VERIFY_SIG.
This option enforces that the kernel has to be validly signed, otherwise
the kernel load will fail. If this option is not set, no signature
verification will be done. The only exception will be when secureboot is
enabled. In that case signature verification should be automatically
enforced. But that will happen when the secureboot patches are merged.

The second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
enables signature verification support on bzImage. If this option is not
set and the previous one is set, kernel image loading will fail because
the kernel does not have support to verify the signature of a bzImage.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Cc: Matt Fleming
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:33 +0800
dd5f72607 kexec: support for kexec on panic using new system call

This patch adds support for loading a kexec on panic (kdump) kernel using
the new system call.

It prepares ELF headers for the memory areas to be dumped and for the
saved cpu registers. It also prepares the memory map for the second
kernel and limits its boot to reserved areas only.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:33 +0800
27f48d3e6 kexec-bzImage64: support for loading bzImage using 64bit entry

This is loader specific code which can load bzImage and set it up for
64bit entry. This does not take care of 32bit entry or real mode entry.

32bit mode entry can be implemented if somebody needs it.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:33 +0800
12db5562e kexec: load and relocate purgatory at kernel load time

Load purgatory code in RAM and relocate it based on the location.
Relocation code has been inspired by module relocation code and purgatory
relocation code in kexec-tools.

Also compute the checksums of loaded kexec segments and store them in
purgatory.

Arch independent code provides this functionality so that arch dependent
bootloaders can make use of it.

Helper functions are provided to get/set symbol values in purgatory which
are used by bootloaders later to set things like stack and entry point of
second kernel etc.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
cb1052581 kexec: implementation of new syscall kexec_file_load

The previous patch provided the interface definition and this patch
provides the implementation of the new syscall.

Previously the segment list was prepared in user space. Now user space
just passes the kernel fd, initrd fd and command line, and the kernel
will create a segment list internally.

This patch contains the generic part of the code. Actual segment
preparation and loading is done by the arch and image specific loader,
which comes in the next patch.
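
From user space, the call described above might look like the sketch
below. It assumes the argument order documented in the kexec_file_load(2)
manual page (kernel fd, initrd fd, command line length including the
terminating NUL, command line, flags); the wrapper and helper names are
hypothetical, and the call needs suitable privileges to succeed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The length passed to the syscall counts the terminating NUL byte. */
unsigned long cmdline_len_with_nul(const char *cmdline)
{
	return strlen(cmdline) + 1;
}

/* Hypothetical thin wrapper; returns -1 with errno set on failure. */
long load_kernel_from_fds(int kernel_fd, int initrd_fd,
			  const char *cmdline, unsigned long flags)
{
#ifdef SYS_kexec_file_load
	return syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
		       cmdline_len_with_nul(cmdline), cmdline, flags);
#else
	/* No syscall number known for this architecture. */
	(void)kernel_fd;
	(void)initrd_fd;
	(void)cmdline;
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}
```

The caller only opens the kernel and initrd files and hands over the
descriptors; no segment list is built in user space.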

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
f0895685c kexec: new syscall kexec_file_load() declaration

This is the new syscall kexec_file_load() declaration/interface. I have
reserved the syscall number only for x86_64 so far. Other architectures
(including i386) can reserve syscall number when they enable the support
for this new syscall.

Signed-off-by: Vivek Goyal
Cc: Michael Kerrisk
Cc: Borislav Petkov
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
255aedd90 kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc()

kimage_normal_alloc() and kimage_crash_alloc() do a lot of similar
things and differ only a little. So instead of having two separate
functions, create a common function kimage_alloc_init() and pass it the
"flags" argument, which tells whether it is normal kexec or
kexec_on_panic. This function should be able to deal with both cases.

This consolidation also helps later where we can use a common function
kimage_file_alloc_init() to handle normal and crash cases for new file
based kexec syscall.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
dabe78628 kexec: move segment verification code in a separate function

Previously do_kimage_alloc() allocated a kimage structure, copied the
segment list from user space and then did the segment list sanity
verification.

Break this function down into 3 parts: do_kimage_alloc_init() to do the
actual allocation and basic initialization of the kimage structure,
copy_user_segment_list() to copy the segment list from user space, and
sanity_check_segment_list() to verify the sanity of the segment list as
passed by user space.

In later patches, I need to only allocate kimage and not copy segment list
from user space. So breaking down in smaller functions enables re-use of
code at other places.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
7d3e2bca2 kexec: rename unusebale_pages to unusable_pages

Let's use the more common "unusable".

This patch was originally written and posted by Boris. I am including it
in this patch series.

Signed-off-by: Borislav Petkov
Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800

31 Jul, 2014

2 commits

3a1122d26 kexec: fix build error when hugetlbfs is disabled

free_huge_page() is undefined without CONFIG_HUGETLBFS and there's no
need to filter PageHuge() pages in such a configuration either, so avoid
exporting the symbol to fix a build error:

In file included from kernel/kexec.c:14:0:
kernel/kexec.c: In function 'crash_save_vmcoreinfo_init':
kernel/kexec.c:1623:20: error: 'free_huge_page' undeclared (first use in this function)
VMCOREINFO_SYMBOL(free_huge_page);
^

Introduced by commit 8f1d26d0e59b ("kexec: export free_huge_page to
VMCOREINFO")

Reported-by: kbuild test robot
Acked-by: Olof Johansson
Cc: Atsushi Kumagai
Cc: Baoquan He
Cc: Vivek Goyal
Cc: Andrew Morton
Signed-off-by: David Rientjes
Signed-off-by: Linus Torvalds

David Rientjes
2014-07-31 11:09:37 +0800
8f1d26d0e kexec: export free_huge_page to VMCOREINFO

PG_head_mask was added into VMCOREINFO to filter huge pages in b3acc56bfe1
("kexec: save PG_head_mask in VMCOREINFO"), but makedumpfile still needs
another symbol to filter *hugetlbfs* pages.

If a user wants to filter user pages, makedumpfile tries to exclude them
by checking whether the page is anonymous, but hugetlbfs pages aren't
anonymous even though they are also user pages.

We know it's possible to detect them in the same way as PageHuge(),
so we need the start address of free_huge_page():

	int PageHuge(struct page *page)
	{
		if (!PageCompound(page))
			return 0;

		page = compound_head(page);
		return get_compound_page_dtor(page) == free_huge_page;
	}

For that reason, this patch makes free_huge_page() public in order
to export it to VMCOREINFO.

Signed-off-by: Atsushi Kumagai
Acked-by: Baoquan He
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Atsushi Kumagai
2014-07-31 08:16:13 +0800

24 Jun, 2014

1 commit

b3acc56bf kexec: save PG_head_mask in VMCOREINFO

To allow filtering of huge pages, makedumpfile must be able to identify
them in the dump. This can be done by checking the appropriate page
flag, so communicate its value to makedumpfile through the VMCOREINFO
interface.

There's only one small catch. Depending on how many page flags are
available on a given architecture, this bit can be called PG_head or
PG_compound.

I sent a similar patch back in 2012, but Eric Biederman did not like
using an #ifdef. So, this time I'm adding a common symbol
(PG_head_mask) instead.
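
The idea of a common symbol can be sketched as follows. The bit numbers
and the SKETCH_PAGEFLAGS_EXTENDED switch are hypothetical stand-ins for
the architecture-dependent page flag layout; what matters is that the
conditional lives in one place and consumers only ever see PG_head_mask.

```c
/* Hypothetical switch standing in for the page-flag layout choice. */
#define SKETCH_PAGEFLAGS_EXTENDED 1

enum sketch_pageflags {
#if SKETCH_PAGEFLAGS_EXTENDED
	PG_head = 14,		/* hypothetical bit numbers */
	PG_tail = 15,
#else
	PG_compound = 14,
#endif
};

/* The one place that knows which bit name applies. */
#if SKETCH_PAGEFLAGS_EXTENDED
#define PG_head_mask (1UL << PG_head)
#else
#define PG_head_mask (1UL << PG_compound)
#endif

/* A consumer (for example a dump filter) needs no #ifdef at all. */
int page_is_head(unsigned long flags)
{
	return (flags & PG_head_mask) != 0;
}
```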

See https://lkml.org/lkml/2012/11/28/91 for the previous version.

Signed-off-by: Petr Tesarik
Acked-by: Vivek Goyal
Cc: Eric Biederman
Cc: Paul Mackerras
Cc: Fengguang Wu
Cc: Benjamin Herrenschmidt
Cc: Shaohua Li
Cc: Alexey Kardashevskiy
Cc: Sasha Levin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Petr Tesarik
2014-06-24 07:47:43 +0800

07 Jun, 2014

1 commit

e1bebcf41 kernel/kexec.c: convert printk to pr_foo()

+ some pr_warning -> pr_warn and checkpatch warning fixes

Signed-off-by: Fabian Frederick
Cc: Eric Biederman
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-07 07:08:12 +0800

28 May, 2014

1 commit

011e4b02f powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode

If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:

[ 0.089866] POWER8 performance monitor hardware support registered
[ 0.089985] power8-pmu: PMAO restore workaround active.
[ 5.095419] Processor 1 is stuck.
[ 10.097933] Processor 2 is stuck.
[ 15.100480] Processor 3 is stuck.
[ 20.102982] Processor 4 is stuck.
[ 25.105489] Processor 5 is stuck.
[ 30.108005] Processor 6 is stuck.
[ 35.110518] Processor 7 is stuck.
[ 40.113369] Processor 9 is stuck.
[ 45.115879] Processor 10 is stuck.
[ 50.118389] Processor 11 is stuck.
[ 55.120904] Processor 12 is stuck.
[ 60.123425] Processor 13 is stuck.
[ 65.125970] Processor 14 is stuck.
[ 70.128495] Processor 15 is stuck.
[ 75.131316] Processor 17 is stuck.

Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:

[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.

Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot recipe for disaster, as explained
in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().

It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().

Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, at least on powerpc.

So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.

Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.
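
The failure mode described above can be modelled in user space. The following
is a simplified sketch, not the kernel code: NR_CPUS, cpu_online[],
hotplug_disabled and all of the sim_*() functions are hypothetical stand-ins
chosen for illustration. It assumes only what the text states: migrating to
the reboot CPU disables hotplug, and a CPU-up request made while hotplug is
disabled fails with -EBUSY.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 8	/* hypothetical: one core with eight threads */

bool cpu_online[NR_CPUS];
int hotplug_disabled;

/* Stand-in for cpu_up(): refuses to online a CPU while hotplug is disabled. */
int sim_cpu_up(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return -EINVAL;
	if (hotplug_disabled)
		return -EBUSY;
	cpu_online[cpu] = true;
	return 0;
}

/* Stand-in for migrate_to_reboot_cpu(): leaves hotplug disabled. */
void sim_migrate_to_reboot_cpu(void)
{
	hotplug_disabled = 1;
}

/* Stand-in for re-enabling hotplug in the kexec path. */
void sim_hotplug_enable(void)
{
	hotplug_disabled = 0;
}

/*
 * Model the kexec path starting with only the primary thread (CPU 0) online.
 * Returns the number of sibling threads that could not be brought online.
 */
int sim_kexec(bool reenable_hotplug)
{
	int cpu, failures = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_online[cpu] = (cpu == 0);
	hotplug_disabled = 0;

	sim_migrate_to_reboot_cpu();
	if (reenable_hotplug)
		sim_hotplug_enable();

	/* Wake every offline CPU, counting the requests that fail. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (!cpu_online[cpu] && sim_cpu_up(cpu) != 0)
			failures++;

	return failures;
}
```

In this model, sim_kexec(false) leaves every sibling thread offline, while
sim_kexec(true) brings all of them online before the new kernel would start.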

Fixes: c97102ba963 (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Benjamin Herrenschmidt

Srivatsa S. Bhat
2014-05-28 11:24:26 +0800

08 Apr, 2014

1 commit

52f5684c8 kernel: use macros from compiler.h instead of __attribute__((...))

To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs, e.g. __weak for
__attribute__((weak)). I've replaced all instances of gcc attributes
with the right macro in the kernel subsystem.
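
As a small user-space illustration of the substitution, the sketch below
defines __weak locally in the same shape the text describes and applies it to
a function. The names arch_hook() and call_hook() are hypothetical and exist
only for this example; it assumes a GCC-compatible compiler.

```c
/*
 * Local definition for this sketch only. Some toolchains predefine
 * __weak for unrelated purposes, so drop any such definition first.
 */
#undef __weak
#define __weak __attribute__((weak))

/* Before the change this would have been spelled:
 *	int __attribute__((weak)) arch_hook(void)
 */
__weak int arch_hook(void)
{
	/* Weak default; a strong definition elsewhere would override it. */
	return 0;
}

/* Calls whichever definition of arch_hook() the linker selected. */
int call_hook(void)
{
	return arch_hook();
}
```

With no strong definition linked in, call_hook() uses the weak default.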

Signed-off-by: Gideon Israel Dsouza
Cc: "Rafael J. Wysocki"
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Gideon Israel Dsouza
2014-04-08 07:36:11 +0800

04 Apr, 2014

1 commit

c96d6660d kernel: audit/fix non-modular users of module_init in core code

Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular. So using module_init as an
alias for __initcall can be somewhat misleading.

Fix these up now, so that we can relocate module_init from init.h into
module.h in the future. If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.

The audit targets the following module_init users for change:
kernel/user.c obj-y
kernel/kexec.c bool KEXEC (one instance per arch)
kernel/profile.c bool PROFILING
kernel/hung_task.c bool DETECT_HUNG_TASK
kernel/sched/stats.c bool SCHEDSTATS
kernel/user_namespace.c bool USER_NS

Note that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e. slightly earlier). However no observable impact of that
difference has been observed during testing.
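
The ordering effect can be sketched in user space as follows. This is a toy
model, not the kernel's initcall machinery: struct initcall, run_order() and
the LEVEL_* constants are hypothetical names, and the only facts taken from
the text are that subsys is level 4 and device is level 6. It assumes calls
run in ascending level order, and in registration order within a level.

```c
#include <stddef.h>

enum {
	LEVEL_SUBSYS = 4,	/* subsys_initcall */
	LEVEL_DEVICE = 6,	/* device_initcall, which __initcall maps onto */
	LEVEL_MAX = 7
};

struct initcall {
	int level;
	const char *name;
};

/*
 * Write the names of calls[] into out[] in the order they would run:
 * ascending level, registration order within a level. Entries whose
 * level is outside 0..LEVEL_MAX are skipped. Returns the entry count.
 */
size_t run_order(const struct initcall *calls, size_t n, const char **out)
{
	size_t count = 0;

	for (int level = 0; level <= LEVEL_MAX; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				out[count++] = calls[i].name;

	return count;
}
```

An entry registered at LEVEL_SUBSYS is emitted before one registered at
LEVEL_DEVICE even if it was registered later, which is the "slightly earlier"
shift the text refers to.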

Also, two instances of missing ";" at EOL are fixed in kexec.

Signed-off-by: Paul Gortmaker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Eric Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Gortmaker
2014-04-04 07:21:07 +0800

06 Mar, 2014

1 commit

ca2c405ab kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types

In order to allow the COMPAT_SYSCALL_DEFINE macro to generate code that
performs proper zero and sign extension, convert all 64 bit parameters
to their corresponding 32 bit compat counterparts.
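
To show what zero and sign extension mean here, the sketch below treats a
64-bit value as a register that carries a 32-bit argument in its lower half.
It does not reproduce the macro itself; zero_extend32() and sign_extend32()
are hypothetical helpers written for this example.

```c
#include <stdint.h>

/*
 * Zero extension: keep the low 32 bits and clear the upper half, as
 * needed for an unsigned 32-bit argument.
 */
uint64_t zero_extend32(uint64_t reg)
{
	return (uint32_t)reg;
}

/*
 * Sign extension: keep the low 32 bits and copy bit 31 into the upper
 * half, as needed for a signed 32-bit argument. The conversion to
 * int32_t relies on the two's-complement wrapping that GCC and Clang
 * provide.
 */
int64_t sign_extend32(uint64_t reg)
{
	return (int32_t)(uint32_t)reg;
}
```

If the upper half of the register holds stale bits, using the raw 64-bit
value would give a wrong argument; declaring the parameter with its 32-bit
type is what lets the compiler emit the right extension.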

Signed-off-by: Heiko Carstens

Heiko Carstens
2014-03-06 23:30:46 +0800

28 Jan, 2014

1 commit

a19428e5c kernel/kexec.c: use vscnprintf() instead of vsnprintf() in vmcoreinfo_append_str()

vsnprintf() may return an 'r' larger than sizeof(buf). In this case, if 'r' is
also less than "vmcoreinfo_max_size - vmcoreinfo_size" (the space left in the
destination buffer), the next memcpy() will read beyond the end of buf.
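
The difference between the two return values can be reproduced in user
space. In the sketch below, demo_vscnprintf() is a hypothetical stand-in
written to mimic the documented behaviour of the kernel's vscnprintf(), and
would_be_len() and stored_len() are illustrative wrappers; none of these
names come from the kernel.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Stand-in for vscnprintf(): returns the number of characters actually
 * stored in buf, not counting the terminating NUL.
 */
int demo_vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
{
	int r;

	if (size == 0)
		return 0;
	r = vsnprintf(buf, size, fmt, args);
	if (r < 0)
		return 0;
	if ((size_t)r >= size)
		return (int)(size - 1);	/* output was truncated */
	return r;
}

/* Returns what vsnprintf() reports: the length the full output needs. */
int would_be_len(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = vsnprintf(buf, size, fmt, args);
	va_end(args);
	return r;
}

/* Returns the length actually written, which is safe to pass to memcpy(). */
int stored_len(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = demo_vscnprintf(buf, size, fmt, args);
	va_end(args);
	return r;
}
```

Formatting a ten-character string into an eight-byte buffer makes
would_be_len() report 10 while stored_len() reports 7; copying 10 bytes out
of the eight-byte buffer is the over-read the commit guards against.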

Signed-off-by: Chen Gang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chen Gang
2014-01-28 13:02:40 +0800

24 Jan, 2014

1 commit

7984754b9 kexec: add sysctl to disable kexec_load

For general-purpose (i.e. distro) kernel builds it makes sense to build
with CONFIG_KEXEC to allow end users to choose what kind of things they
want to do with kexec. However, in the face of trying to lock down a
system with such a kernel, there needs to be a way to disable kexec_load
(much like module loading can be disabled). Without this, it is too easy
for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM
and modules_disabled are set. With this change, it is still possible to
load an image for use later, then disable kexec_load so the image (or lack
of image) can't be altered.
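
The intended behaviour can be sketched as a one-way latch. This is a
user-space model under stated assumptions, not the kernel implementation:
kexec_load_disabled_demo, demo_sysctl_write() and demo_kexec_load() are
hypothetical names, and the model assumes the setting can be turned on but
not turned off again, with kexec_load refused once it is on.

```c
#include <errno.h>

/* 0 = kexec_load allowed (default), 1 = disabled until the next boot. */
int kexec_load_disabled_demo;

/*
 * Model of a write to the sysctl: only the value 1 is accepted, so the
 * flag can never be cleared once it has been set.
 */
int demo_sysctl_write(int value)
{
	if (value != 1)
		return -EINVAL;
	kexec_load_disabled_demo = 1;
	return 0;
}

/* Model of kexec_load(): refused once the latch has been set. */
int demo_kexec_load(void)
{
	if (kexec_load_disabled_demo)
		return -EPERM;
	return 0;	/* an image would be loaded here */
}
```

Loading an image first and then setting the latch corresponds to scenario 3
in the list that follows: the loaded image stays in place and later load
attempts fail.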

This is intended for use in environments where "perfect"
enforcement is hard. Without a verified boot, along with verified
modules, and along with verified kexec, this is trying to give a system a
better chance to defend itself (or at least grow the window of
discoverability) against attack in the face of a privilege escalation.

In my mind, I consider several boot scenarios:

1) Verified boot of read-only verified root fs loading fd-based
verification of kexec images.
2) Secure boot of writable root fs loading signed kexec images.
3) Regular boot loading kexec (e.g. kcrash) image early and locking it.
4) Regular boot with no control of kexec image at all.

1 and 2 don't exist yet, but will soon once the verified kexec series has
landed. 4 is the state of things now. The gap between 2 and 4 is too
large, so this change creates scenario 3, a middle-ground above 4 when 2
and 1 are not possible for a system.

Signed-off-by: Kees Cook
Acked-by: Rik van Riel
Cc: Vivek Goyal
Cc: Eric Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2014-01-24 08:37:03 +0800

19 Dec, 2013

1 commit

c97102ba9 kexec: migrate to reboot cpu

Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
kernel") moved reboot= handling to generic code. In the process it also
removed the code in native_machine_shutdown() which was moving the reboot
process to reboot_cpu/cpu0.

I guess the thought must have been that all reboot paths are calling
migrate_to_reboot_cpu(), so we don't need this special handling. But the
kexec reboot path (kernel_kexec()) is not calling
migrate_to_reboot_cpu(), so the above change broke kexec. Now reboot can
happen on a non-boot cpu, and when INIT is sent in the second kernel to bring
up the BP, it brings down the machine.

So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
this problem.

Bisected by WANG Chao.

Reported-by: Matthew Whitehead
Reported-by: Dave Young
Signed-off-by: Vivek Goyal
Tested-by: Baoquan He
Tested-by: WANG Chao
Acked-by: H. Peter Anvin
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2013-12-19 11:04:50 +0800