Eric Lee / smarc-fsl-linux-kernel

17 Oct, 2020

1 commit

7cf603d17 kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED ... Browse Code »

IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
always set to 0 by hardware. This is far from beautiful (and confusing),
and the bit only applies to SYSRAM. So let's move it out of the
bus-specific (PnP) defined bits.

We'll add another SYSRAM specific bit soon. If we ever need more bits for
other purposes, we can steal some from "desc", or reshuffle/regroup what
we have.

Signed-off-by: David Hildenbrand
Signed-off-by: Andrew Morton
Cc: Michal Hocko
Cc: Dan Williams
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Ard Biesheuvel
Cc: Pankaj Gupta
Cc: Baoquan He
Cc: Wei Yang
Cc: Eric Biederman
Cc: Thomas Gleixner
Cc: Greg Kroah-Hartman
Cc: Anton Blanchard
Cc: Benjamin Herrenschmidt
Cc: Boris Ostrovsky
Cc: Christian Borntraeger
Cc: Dave Jiang
Cc: Haiyang Zhang
Cc: Heiko Carstens
Cc: Jason Wang
Cc: Juergen Gross
Cc: Julien Grall
Cc: "K. Y. Srinivasan"
Cc: Len Brown
Cc: Leonardo Bras
Cc: Libor Pechacek
Cc: Michael Ellerman
Cc: "Michael S. Tsirkin"
Cc: Nathan Lynch
Cc: "Oliver O'Halloran"
Cc: Paul Mackerras
Cc: Pingfan Liu
Cc: "Rafael J. Wysocki"
Cc: Roger Pau Monné
Cc: Stefano Stabellini
Cc: Stephen Hemminger
Cc: Vasily Gorbik
Cc: Vishal Verma
Cc: Wei Liu
Link: https://lkml.kernel.org/r/20200911103459.10306-3-david@redhat.com
Signed-off-by: Linus Torvalds

David Hildenbrand
2020-10-17 02:11:18 +0800

05 Oct, 2020

4 commits

0fa8e0846 fs/kernel_file_read: Add "offset" arg for partial reads ... Browse Code »

To perform partial reads, callers of kernel_read_file*() must have a
non-NULL file_size argument and a preallocated buffer. The new "offset"
argument can then be used to seek to specific locations in the file to
fill the buffer to, at most, "buf_size" per call.

Where possible, the LSM hooks can report whether a full file has been
read or not so that the contents can be reasoned about.

Signed-off-by: Kees Cook
Link: https://lore.kernel.org/r/20201002173828.2099543-14-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2020-10-05 19:37:04 +0800
885352881 fs/kernel_read_file: Add file_size output argument ... Browse Code »

In preparation for adding partial read support, add an optional output
argument to kernel_read_file*() that reports the file size so callers
can reason more easily about their reading progress.

Signed-off-by: Kees Cook
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Reviewed-by: James Morris
Acked-by: Scott Branden
Link: https://lore.kernel.org/r/20201002173828.2099543-8-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2020-10-05 19:37:03 +0800
f7a4f689b fs/kernel_read_file: Remove redundant size argument ... Browse Code »

In preparation for refactoring kernel_read_file*(), remove the redundant
"size" argument which is not needed: it can be included in the return
code, with callers adjusted. (VFS reads already cannot be larger than
INT_MAX.)

Signed-off-by: Kees Cook
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Reviewed-by: James Morris
Acked-by: Scott Branden
Link: https://lore.kernel.org/r/20201002173828.2099543-6-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2020-10-05 19:34:18 +0800
b89999d00 fs/kernel_read_file: Split into separate include file ... Browse Code »

Move kernel_read_file* out of linux/fs.h to its own linux/kernel_read_file.h
include file. That header gets pulled in just about everywhere
and doesn't really need functions not related to the general fs interface.

Suggested-by: Christoph Hellwig
Signed-off-by: Scott Branden
Signed-off-by: Kees Cook
Reviewed-by: Christoph Hellwig
Reviewed-by: Mimi Zohar
Reviewed-by: Luis Chamberlain
Acked-by: Greg Kroah-Hartman
Acked-by: James Morris
Link: https://lore.kernel.org/r/20200706232309.12010-2-scott.branden@broadcom.com
Link: https://lore.kernel.org/r/20201002173828.2099543-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman

Scott Branden
2020-10-05 19:34:18 +0800

16 Aug, 2020

1 commit

50f6c7dbd Merge tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull x86 fixes from Ingo Molnar:
"Misc fixes and small updates all around the place:

- Fix mitigation state sysfs output

- Fix an FPU xstate/sxave code assumption bug triggered by
Architectural LBR support

- Fix Lightning Mountain SoC TSC frequency enumeration bug

- Fix kexec debug output

- Fix kexec memory range assumption bug

- Fix a boundary condition in the crash kernel code

- Optimize porgatory.ro generation a bit

- Enable ACRN guests to use X2APIC mode

- Reduce a __text_poke() IRQs-off critical section for the benefit of
PREEMPT_RT"

* tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Acquire pte lock with interrupts enabled
x86/bugs/multihit: Fix mitigation reporting when VMX is not in use
x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs
x86/purgatory: Don't generate debug info for purgatory.ro
x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC
kexec_file: Correctly output debugging information for the PT_LOAD ELF header
kexec: Improve & fix crash_exclude_mem_range() to handle overlapping ranges
x86/crash: Correct the address boundary of function parameters
x86/acrn: Remove redundant chars from ACRN signature
x86/acrn: Allow ACRN guest to use X2APIC mode

Linus Torvalds
2020-08-16 01:38:03 +0800

08 Aug, 2020

1 commit

25d8d4eec Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux ... Browse Code »

Pull powerpc updates from Michael Ellerman:

- Add support for (optionally) using queued spinlocks & rwlocks.

- Support for a new faster system call ABI using the scv instruction on
Power9 or later.

- Drop support for the PROT_SAO mmap/mprotect flag as it will be
unsupported on Power10 and future processors, leaving us with no way
to implement the functionality it requests. This risks breaking
userspace, though we believe it is unused in practice.

- A bug fix for, and then the removal of, our custom stack expansion
checking. We now allow stack expansion up to the rlimit, like other
architectures.

- Remove the remnants of our (previously disabled) topology update
code, which tried to react to NUMA layout changes on virtualised
systems, but was prone to crashes and other problems.

- Add PMU support for Power10 CPUs.

- A change to our signal trampoline so that we don't unbalance the link
stack (branch return predictor) in the signal delivery path.

- Lots of other cleanups, refactorings, smaller features and so on as
usual.

Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey
Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju
T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan
S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris
Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan
Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn
Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand,
Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel
Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh
Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton
Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran,
Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud,
Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar
Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza
Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov,
Wei Yongjun, Wen Xiong, YueHaibing.

* tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits)
selftests/powerpc: Fix pkey syscall redefinitions
powerpc: Fix circular dependency between percpu.h and mmu.h
powerpc/powernv/sriov: Fix use of uninitialised variable
selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
powerpc/40x: Fix assembler warning about r0
powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
cpuidle: pseries: Fixup exit latency for CEDE(0)
cpuidle: pseries: Add function to parse extended CEDE records
cpuidle: pseries: Set the latency-hint before entering CEDE
selftests/powerpc: Fix online CPU selection
powerpc/perf: Consolidate perf_callchain_user_[64|32]()
powerpc/pseries/hotplug-cpu: Remove double free in error path
powerpc/pseries/mobility: Add pr_debug() for device tree changes
powerpc/pseries/mobility: Set pr_fmt()
powerpc/cacheinfo: Warn if cache object chain becomes unordered
powerpc/cacheinfo: Improve diagnostics about malformed cache lists
powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
powerpc/cacheinfo: Set pr_fmt()
powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
...

Linus Torvalds
2020-08-08 01:33:50 +0800

07 Aug, 2020

3 commits

475f63ae6 kexec_file: Correctly output debugging information for the PT_LOAD ELF header ... Browse Code »

Currently, when we enable the debugging switch to debug kexec_file,
we always get the following incorrect results:

kexec_file: Crash PT_LOAD elf header. phdr=00000000c988639b vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=51 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=000000003cca69a0 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=52 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000c584cb9f vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=53 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000cf85d57f vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=54 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000a4a8f847 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=55 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000272ec49f vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=56 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000ea0b65de vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=57 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=000000001f5e490c vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=58 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000dfe4109e vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=59 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000480ed2b6 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=60 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=0000000080b65151 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=61 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=0000000024e31c5e vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=62 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000332e0385 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=63 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=000000002754d5da vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=64 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=00000000783320dd vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=65 p_offset=0x0
kexec_file: Crash PT_LOAD elf header. phdr=0000000076fe5b64 vaddr=0x0, paddr=0x0, sz=0x0 e_phnum=66 p_offset=0x0

The reason is that kernel always prints the values of the next PT_LOAD
instead of the current PT_LOAD. Change it to ensure that we can get the
correct debugging information.

[ mingo: Amended changelog, capitalized "ELF". ]

Signed-off-by: Lianbo Jiang
Signed-off-by: Ingo Molnar
Acked-by: Dave Young
Link: https://lore.kernel.org/r/20200804044933.1973-4-lijiang@redhat.com

Lianbo Jiang
2020-08-07 07:32:00 +0800
a2e9a95d2 kexec: Improve & fix crash_exclude_mem_range() to handle overlapping ranges ... Browse Code »

The crash_exclude_mem_range() function can only handle one memory region a time.

It will fail in the case in which the passed in area covers several memory
regions. In this case, it will only exclude the first region, then return,
but leave the later regions unsolved.

E.g in a NEC system with two usable RAM regions inside the low 1M:

...
BIOS-e820: [mem 0x0000000000000000-0x000000000003efff] usable
BIOS-e820: [mem 0x000000000003f000-0x000000000003ffff] reserved
BIOS-e820: [mem 0x0000000000040000-0x000000000009ffff] usable

It will only exclude the memory region [0, 0x3efff], the memory region
[0x40000, 0x9ffff] will still be added into /proc/vmcore, which may cause
the following failure when dumping vmcore:

ioremap on RAM at 0x0000000000040000 - 0x0000000000040fff
WARNING: CPU: 0 PID: 665 at arch/x86/mm/ioremap.c:186 __ioremap_caller+0x2c7/0x2e0
...
RIP: 0010:__ioremap_caller+0x2c7/0x2e0
...
cp: error reading '/proc/vmcore': Cannot allocate memory
kdump: saving vmcore failed

In order to fix this bug, let's extend the crash_exclude_mem_range()
to handle the overlapping ranges.

[ mingo: Amended the changelog. ]

Signed-off-by: Lianbo Jiang
Signed-off-by: Ingo Molnar
Acked-by: Dave Young
Link: https://lore.kernel.org/r/20200804044933.1973-3-lijiang@redhat.com

Lianbo Jiang
2020-08-07 07:32:00 +0800
4cec92937 Merge tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity ... Browse Code »

Pull integrity updates from Mimi Zohar:
"The nicest change is the IMA policy rule checking. The other changes
include allowing the kexec boot cmdline line measure policy rules to
be defined in terms of the inode associated with the kexec kernel
image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA
appraise mode (log, fix, enforce), a runtime decision based on the
secure boot mode of the system, and including errno in the audit log"

* tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
integrity: remove redundant initialization of variable ret
ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
ima: AppArmor satisfies the audit rule requirements
ima: Rename internal filter rule functions
ima: Support additional conditionals in the KEXEC_CMDLINE hook function
ima: Use the common function to detect LSM conditionals in a rule
ima: Move comprehensive rule validation checks out of the token parser
ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
ima: Fail rule parsing when buffer hook functions have an invalid action
ima: Free the entire rule if it fails to parse
ima: Free the entire rule when deleting a list of rules
ima: Have the LSM free its audit rule
IMA: Add audit log for failure conditions
integrity: Add errno field in audit message

Linus Torvalds
2020-08-07 02:35:57 +0800

29 Jul, 2020

1 commit

f891f1973 kexec_file: Allow archs to handle special regions while locating memory hole ... Browse Code »

Some architectures may have special memory regions, within the given
memory range, which can't be used for the buffer in a kexec segment.
Implement weak arch_kexec_locate_mem_hole() definition which arch code
may override, to take care of special regions, while trying to locate
a memory hole.

Also, add the missing declarations for arch overridable functions and
and drop the __weak descriptors in the declarations to avoid non-weak
definitions from becoming weak.

Signed-off-by: Hari Bathini
Tested-by: Pingfan Liu
Reviewed-by: Thiago Jung Bauermann
Acked-by: Dave Young
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/159602273603.575379.17665852963340380839.stgit@hbathini

Hari Bathini
2020-07-29 21:47:53 +0800

21 Jul, 2020

1 commit

4834177e6 ima: Support additional conditionals in the KEXEC_CMDLINE hook function ... Browse Code »

Take the properties of the kexec kernel's inode and the current task
ownership into consideration when matching a KEXEC_CMDLINE operation to
the rules in the IMA policy. This allows for some uniformity when
writing IMA policy rules for KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK,
and KEXEC_CMDLINE operations.

Prior to this patch, it was not possible to write a set of rules like
this:

dont_measure func=KEXEC_KERNEL_CHECK obj_type=foo_t
dont_measure func=KEXEC_INITRAMFS_CHECK obj_type=foo_t
dont_measure func=KEXEC_CMDLINE obj_type=foo_t
measure func=KEXEC_KERNEL_CHECK
measure func=KEXEC_INITRAMFS_CHECK
measure func=KEXEC_CMDLINE

The inode information associated with the kernel being loaded by a
kexec_kernel_load(2) syscall can now be included in the decision to
measure or not

Additonally, the uid, euid, and subj_* conditionals can also now be
used in KEXEC_CMDLINE rules. There was no technical reason as to why
those conditionals weren't being considered previously other than
ima_match_rules() didn't have a valid inode to use so it immediately
bailed out for KEXEC_CMDLINE operations rather than going through the
full list of conditional comparisons.

Signed-off-by: Tyler Hicks
Cc: Eric Biederman
Cc: kexec@lists.infradead.org
Reviewed-by: Lakshmi Ramasubramanian
Signed-off-by: Mimi Zohar

Tyler Hicks
2020-07-21 01:28:16 +0800

26 Jun, 2020

1 commit

fd7af71be kexec: do not verify the signature without the lockdown or mandatory signature ... Browse Code »

Signature verification is an important security feature, to protect
system from being attacked with a kernel of unknown origin. Kexec
rebooting is a way to replace the running kernel, hence need be secured
carefully.

In the current code of handling signature verification of kexec kernel,
the logic is very twisted. It mixes signature verification, IMA
signature appraising and kexec lockdown.

If there is no KEXEC_SIG_FORCE, kexec kernel image doesn't have one of
signature, the supported crypto, and key, we don't think this is wrong,
Unless kexec lockdown is executed. IMA is considered as another kind of
signature appraising method.

If kexec kernel image has signature/crypto/key, it has to go through the
signature verification and pass. Otherwise it's seen as verification
failure, and won't be loaded.

Seems kexec kernel image with an unqualified signature is even worse
than those w/o signature at all, this sounds very unreasonable. E.g.
If people get a unsigned kernel to load, or a kernel signed with expired
key, which one is more dangerous?

So, here, let's simplify the logic to improve code readability. If the
KEXEC_SIG_FORCE enabled or kexec lockdown enabled, signature
verification is mandated. Otherwise, we lift the bar for any kernel
image.

Link: http://lkml.kernel.org/r/20200602045952.27487-1-lijiang@redhat.com
Signed-off-by: Lianbo Jiang
Reviewed-by: Jiri Bohac
Acked-by: Dave Young
Acked-by: Baoquan He
Cc: James Morris
Cc: Matthew Garrett
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lianbo Jiang
2020-06-26 15:27:36 +0800

05 Jun, 2020

1 commit

3fe4f4991 kexec_file: don't place kexec images on IORESOURCE_MEM_DRIVER_MANAGED ... Browse Code »

Memory flagged with IORESOURCE_MEM_DRIVER_MANAGED is special - it won't be
part of the initial memmap of the kexec kernel and not all memory might be
accessible. Don't place any kexec images onto it.

Signed-off-by: David Hildenbrand
Signed-off-by: Andrew Morton
Cc: Michal Hocko
Cc: Pankaj Gupta
Cc: Wei Yang
Cc: Baoquan He
Cc: Dave Hansen
Cc: Eric Biederman
Cc: Pavel Tatashin
Cc: Dan Williams
Link: http://lkml.kernel.org/r/20200508084217.9160-4-david@redhat.com
Signed-off-by: Linus Torvalds

David Hildenbrand
2020-06-05 10:06:23 +0800

09 Jan, 2020

1 commit

de68e4dae kexec: add machine_kexec_post_load() ... Browse Code »

It is the same as machine_kexec_prepare(), but is called after segments are
loaded. This way, can do processing work with already loaded relocation
segments. One such example is arm64: it has to have segments loaded in
order to create a page table, but it cannot do it during kexec time,
because at that time allocations won't be possible anymore.

Signed-off-by: Pavel Tatashin
Acked-by: Dave Young
Signed-off-by: Will Deacon

Pavel Tatashin
2020-01-09 00:32:55 +0800

02 Nov, 2019

1 commit

f973cce0e kexec: Fix pointer-to-int-cast warnings ... Browse Code »

Fix two pointer-to-int-cast warnings when compiling for the 32-bit parisc
platform:

kernel/kexec_file.c: In function ‘crash_prepare_elf64_headers’:
kernel/kexec_file.c:1307:19: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
phdr->p_vaddr = (Elf64_Addr)_text;
^
kernel/kexec_file.c:1324:19: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
phdr->p_vaddr = (unsigned long long) __va(mstart);
^

Signed-off-by: Helge Deller

Helge Deller
2019-11-02 04:42:58 +0800

28 Sep, 2019

1 commit

aefcf2f4b Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull kernel lockdown mode from James Morris:
"This is the latest iteration of the kernel lockdown patchset, from
Matthew Garrett, David Howells and others.

From the original description:

This patchset introduces an optional kernel lockdown feature,
intended to strengthen the boundary between UID 0 and the kernel.
When enabled, various pieces of kernel functionality are restricted.
Applications that rely on low-level access to either hardware or the
kernel may cease working as a result - therefore this should not be
enabled without appropriate evaluation beforehand.

The majority of mainstream distributions have been carrying variants
of this patchset for many years now, so there's value in providing a
doesn't meet every distribution requirement, but gets us much closer
to not requiring external patches.

There are two major changes since this was last proposed for mainline:

- Separating lockdown from EFI secure boot. Background discussion is
covered here: https://lwn.net/Articles/751061/

- Implementation as an LSM, with a default stackable lockdown LSM
module. This allows the lockdown feature to be policy-driven,
rather than encoding an implicit policy within the mechanism.

The new locked_down LSM hook is provided to allow LSMs to make a
policy decision around whether kernel functionality that would allow
tampering with or examining the runtime state of the kernel should be
permitted.

The included lockdown LSM provides an implementation with a simple
policy intended for general purpose use. This policy provides a coarse
level of granularity, controllable via the kernel command line:

lockdown={integrity|confidentiality}

Enable the kernel lockdown feature. If set to integrity, kernel features
that allow userland to modify the running kernel are disabled. If set to
confidentiality, kernel features that allow userland to extract
confidential information from the kernel are also disabled.

This may also be controlled via /sys/kernel/security/lockdown and
overriden by kernel configuration.

New or existing LSMs may implement finer-grained controls of the
lockdown features. Refer to the lockdown_reason documentation in
include/linux/security.h for details.

The lockdown feature has had signficant design feedback and review
across many subsystems. This code has been in linux-next for some
weeks, with a few fixes applied along the way.

Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf
when kernel lockdown is in confidentiality mode") is missing a
Signed-off-by from its author. Matthew responded that he is providing
this under category (c) of the DCO"

* 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits)
kexec: Fix file verification on S390
security: constify some arrays in lockdown LSM
lockdown: Print current->comm in restriction messages
efi: Restrict efivar_ssdt_load when the kernel is locked down
tracefs: Restrict tracefs when the kernel is locked down
debugfs: Restrict debugfs when the kernel is locked down
kexec: Allow kexec_file() with appropriate IMA policy when locked down
lockdown: Lock down perf when in confidentiality mode
bpf: Restrict bpf when kernel lockdown is in confidentiality mode
lockdown: Lock down tracing and perf kprobes when in confidentiality mode
lockdown: Lock down /proc/kcore
x86/mmiotrace: Lock down the testmmiotrace module
lockdown: Lock down module params that specify hardware parameters (eg. ioport)
lockdown: Lock down TIOCSSERIAL
lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down
acpi: Disable ACPI table override if the kernel is locked down
acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down
ACPI: Limit access to custom_method when the kernel is locked down
x86/msr: Restrict MSR access when the kernel is locked down
x86: Lock down IO port access when the kernel is locked down
...

Linus Torvalds
2019-09-28 23:14:15 +0800

20 Aug, 2019

3 commits

29d3c1c8d kexec: Allow kexec_file() with appropriate IMA policy when locked down ... Browse Code »

Systems in lockdown mode should block the kexec of untrusted kernels.
For x86 and ARM we can ensure that a kernel is trustworthy by validating
a PE signature, but this isn't possible on other architectures. On those
platforms we can use IMA digital signatures instead. Add a function to
determine whether IMA has or will verify signatures for a given event type,
and if so permit kexec_file() even if the kernel is otherwise locked down.
This is restricted to cases where CONFIG_INTEGRITY_TRUSTED_KEYRING is set
in order to prevent an attacker from loading additional keys at runtime.

Signed-off-by: Matthew Garrett
Acked-by: Mimi Zohar
Cc: Dmitry Kasatkin
Cc: linux-integrity@vger.kernel.org
Signed-off-by: James Morris

Matthew Garrett
2019-08-20 12:54:16 +0800
155bdd30a kexec_file: Restrict at runtime if the kernel is locked down ... Browse Code »

When KEXEC_SIG is not enabled, kernel should not load images through
kexec_file systemcall if the kernel is locked down.

[Modified by David Howells to fit with modifications to the previous patch
and to return -EPERM if the kernel is locked down for consistency with
other lockdowns. Modified by Matthew Garrett to remove the IMA
integration, which will be replaced by integrating with the IMA
architecture policy patches.]

Signed-off-by: Jiri Bohac
Signed-off-by: David Howells
Signed-off-by: Matthew Garrett
cc: kexec@lists.infradead.org
Signed-off-by: James Morris

Jiri Bohac
2019-08-20 12:54:15 +0800
99d5cadfd kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE ... Browse Code »

This is a preparatory patch for kexec_file_load() lockdown. A locked down
kernel needs to prevent unsigned kernel images from being loaded with
kexec_file_load(). Currently, the only way to force the signature
verification is compiling with KEXEC_VERIFY_SIG. This prevents loading
usigned images even when the kernel is not locked down at runtime.

This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE.
Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG
turns on the signature verification but allows unsigned images to be
loaded. KEXEC_SIG_FORCE disallows images without a valid signature.

Signed-off-by: Jiri Bohac
Signed-off-by: David Howells
Signed-off-by: Matthew Garrett
cc: kexec@lists.infradead.org
Signed-off-by: James Morris

Jiri Bohac
2019-08-20 12:54:15 +0800

09 Jul, 2019

1 commit

8b6815088 Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity ... Browse Code »

Pull integrity updates from Mimi Zohar:
"Bug fixes, code clean up, and new features:

- IMA policy rules can be defined in terms of LSM labels, making the
IMA policy dependent on LSM policy label changes, in particular LSM
label deletions. The new environment, in which IMA-appraisal is
being used, frequently updates the LSM policy and permits LSM label
deletions.

- Prevent an mmap'ed shared file opened for write from also being
mmap'ed execute. In the long term, making this and other similar
changes at the VFS layer would be preferable.

- The IMA per policy rule template format support is needed for a
couple of new/proposed features (eg. kexec boot command line
measurement, appended signatures, and VFS provided file hashes).

- Other than the "boot-aggregate" record in the IMA measuremeent
list, all other measurements are of file data. Measuring and
storing the kexec boot command line in the IMA measurement list is
the first buffer based measurement included in the measurement
list"

* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
integrity: Introduce struct evm_xattr
ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition
KEXEC: Call ima_kexec_cmdline to measure the boot command line args
IMA: Define a new template field buf
IMA: Define a new hook to measure the kexec boot command line arguments
IMA: support for per policy rule template formats
integrity: Fix __integrity_init_keyring() section mismatch
ima: Use designated initializers for struct ima_event_data
ima: use the lsm policy update notifier
LSM: switch to blocking policy update notifiers
x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY
ima: Make arch_policy_entry static
ima: prevent a file already mmap'ed write to be mmap'ed execute
x86/ima: check EFI SetupMode too

Linus Torvalds
2019-07-09 11:28:59 +0800

01 Jul, 2019

1 commit

6a31fcd4c KEXEC: Call ima_kexec_cmdline to measure the boot command line args ... Browse Code »

During soft reboot(kexec_file_load) boot command line
arguments are not measured.

Call ima hook ima_kexec_cmdline to measure the boot command line
arguments into IMA measurement list.

- call ima_kexec_cmdline from kexec_file_load.
- move the call ima_add_kexec_buffer after the cmdline
args have been measured.

Signed-off-by: Prakhar Srivastava
Reviewed-by: James Morris
Acked-by: Dave Young
Signed-off-by: Mimi Zohar

Prakhar Srivastava
2019-07-01 05:54:39 +0800

19 Jun, 2019

1 commit

40b0b3f8f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230 ... Browse Code »

Based on 2 normalized pattern(s):

this source code is licensed under the gnu general public license
version 2 see the file copying for more details

this source code is licensed under general public license version 2
see

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 52 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Enrico Weigelt
Reviewed-by: Allison Randal
Reviewed-by: Alexios Zavras
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-06-19 23:09:06 +0800

15 May, 2019

1 commit

350e88bad mm: memblock: make keeping memblock memory opt-in rather than opt-out ... Browse Code »

Most architectures do not need the memblock memory after the page
allocator is initialized, but only few enable ARCH_DISCARD_MEMBLOCK in the
arch Kconfig.

Replacing ARCH_DISCARD_MEMBLOCK with ARCH_KEEP_MEMBLOCK and inverting the
logic makes it clear which architectures actually use memblock after
system initialization and skips the necessity to add ARCH_DISCARD_MEMBLOCK
to the architectures that are still missing that option.

Link: http://lkml.kernel.org/r/1556102150-32517-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Acked-by: Michael Ellerman (powerpc)
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Richard Kuo
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Geert Uytterhoeven
Cc: Ralf Baechle
Cc: Paul Burton
Cc: James Hogan
Cc: Ley Foon Tan
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Yoshinori Sato
Cc: Rich Felker
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Eric Biederman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Rapoport
2019-05-15 00:47:50 +0800

25 Apr, 2019

1 commit

877b5691f crypto: shash - remove shash_desc::flags ... Browse Code »

The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
actually started sleeping. For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API. However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk. It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers
Signed-off-by: Herbert Xu

Eric Biggers
2019-04-25 15:38:12 +0800

06 Dec, 2018

4 commits

497e18586 kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump ... Browse Code »

In kdump case, there exists only one dedicated memblock region as usable
memory (crashk_res). With this patch, kexec_walk_memblock() runs a given
callback function on this region.

Cosmetic change: 0 to MEMBLOCK_NONE at for_each_free_mem_range*()

Signed-off-by: AKASHI Takahiro
Acked-by: Dave Young
Cc: Vivek Goyal
Cc: Baoquan He
Signed-off-by: Will Deacon

AKASHI Takahiro
2018-12-06 22:38:50 +0800
735c2f90e powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem() ... Browse Code »

Memblock list is another source for usable system memory layout.
So move powerpc's arch_kexec_walk_mem() to common code so that other
memblock-based architectures, particularly arm64, can also utilise it.
A moved function is now renamed to kexec_walk_memblock() and integrated
into kexec_locate_mem_hole(), which will now be usable for all
architectures with no need for overriding arch_kexec_walk_mem().

With this change, arch_kexec_walk_mem() need no longer be a weak function,
and was now renamed to kexec_walk_resources().

Since powerpc doesn't support kdump in its kexec_file_load(), the current
kexec_walk_memblock() won't work for kdump either in this form, this will
be fixed in the next patch.

Signed-off-by: AKASHI Takahiro
Cc: "Eric W. Biederman"
Acked-by: Dave Young
Cc: Vivek Goyal
Cc: Baoquan He
Acked-by: James Morse
Signed-off-by: Will Deacon

AKASHI Takahiro
2018-12-06 22:38:50 +0800
b6664ba42 s390, kexec_file: drop arch_kexec_mem_walk() ... Browse Code »

Since s390 already knows where to locate buffers, calling
arch_kexec_mem_walk() has no sense. So we can just drop it as kbuf->mem
indicates this while all other architectures sets it to 0 initially.

This change is a preparatory work for the next patch, where all the
variant memory walks, either on system resource or memblock, will be
put in one common place so that it will satisfy all the architectures'
need.

Signed-off-by: AKASHI Takahiro
Reviewed-by: Philipp Rudo
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Dave Young
Cc: Vivek Goyal
Cc: Baoquan He
Signed-off-by: Will Deacon

AKASHI Takahiro
2018-12-06 22:38:49 +0800
92a98a2b9 kexec_file: make kexec_image_post_load_cleanup_default() global ... Browse Code »

Change this function from static to global so that arm64 can implement
its own arch_kimage_file_post_load_cleanup() later using
kexec_image_post_load_cleanup_default().

Signed-off-by: AKASHI Takahiro
Acked-by: Dave Young
Cc: Vivek Goyal
Cc: Baoquan He
Signed-off-by: Will Deacon

AKASHI Takahiro
2018-12-06 22:38:49 +0800

04 Nov, 2018

1 commit

3383b3604 kernel/kexec_file.c: remove some duplicated includes ... Browse Code »

We include kexec.h and slab.h twice in kexec_file.c. It's unnecessary.
hence just remove them.

Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang
Reviewed-by: Bhupesh Sharma
Reviewed-by: Andrew Morton
Acked-by: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

zhong jiang
2018-11-04 01:09:37 +0800

13 Jun, 2018

1 commit

fad953ce0 treewide: Use array_size() in vzalloc() ... Browse Code »

The vzalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

vzalloc(a * b)

with:
vzalloc(array_size(a, b))

as well as handling cases of:

vzalloc(a * b * c)

with:

vzalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

vzalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
vzalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
vzalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
vzalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_ID
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_ID
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

vzalloc(
- SIZE * COUNT
+ array_size(COUNT, SIZE)
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
vzalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
vzalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
vzalloc(C1 * C2 * C3, ...)
|
vzalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
vzalloc(C1 * C2, ...)
|
vzalloc(
- E1 * E2
+ array_size(E1, E2)
, ...)
)

Signed-off-by: Kees Cook

Kees Cook
2018-06-13 07:19:22 +0800

14 Apr, 2018

9 commits

3be3f61d2 kernel/kexec_file.c: allow archs to set purgatory load address ... Browse Code »

For s390 new kernels are loaded to fixed addresses in memory before they
are booted. With the current code this is a problem as it assumes the
kernel will be loaded to an 'arbitrary' address. In particular,
kexec_locate_mem_hole searches for a large enough memory region and sets
the load address (kexec_bufer->mem) to it.

Luckily there is a simple workaround for this problem. By returning 1
in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This
allows the architecture to set kbuf->mem by hand. While the trick works
fine for the kernel it does not for the purgatory as here the
architectures don't have access to its kexec_buffer.

Give architectures access to the purgatories kexec_buffer by changing
kexec_load_purgatory to take a pointer to it. With this change
architectures have access to the buffer and can edit it as they need.

A nice side effect of this change is that we can get rid of the
purgatory_info->purgatory_load_address field. As now the information
stored there can directly be accessed from kbuf->mem.

Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Reviewed-by: Martin Schwidefsky
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
8da0b7249 kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load ... Browse Code »

The current code uses the sh_offset field in purgatory_info->sechdrs to
store a pointer to the current load address of the section. Depending
whether the section will be loaded or not this is either a pointer into
purgatory_info->purgatory_buf or kexec_purgatory. This is not only a
violation of the ELF standard but also makes the code very hard to
understand as you cannot tell if the memory you are using is read-only
or not.

Remove this misuse and store the offset of the section in
pugaroty_info->purgatory_buf in sh_offset.

Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
620f697cc kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs ... Browse Code »

The main loop currently uses quite a lot of variables to update the
section headers. Some of them are unnecessary. So clean them up a
little.

Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
f1b1cca39 kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs ... Browse Code »

To update the entry point there is an extra loop over all section
headers although this can be done in the main loop. So move it there
and eliminate the extra loop and variable to store the 'entry section
index'.

Also, in the main loop, move the usual case, i.e. non-bss section, out
of the extra if-block.

Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Reviewed-by: Martin Schwidefsky
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
930457057 kernel/kexec_file.c: split up __kexec_load_puragory ... Browse Code »

When inspecting __kexec_load_purgatory you find that it has two tasks

1) setting up the kexec_buffer for the new kernel and,
2) setting up pi->sechdrs for the final load address.

The two tasks are independent of each other. To improve readability
split up __kexec_load_purgatory into two functions, one for each task,
and call them directly from kexec_load_purgatory.

Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
8aec395b8 kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations* ... Browse Code »

When the relocations are applied to the purgatory only the section the
relocations are applied to is writable. The other sections, i.e. the
symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this
by marking the corresponding variables as 'const'.

While at it also change the signatures of arch_kexec_apply_relocations* to
take section pointers instead of just the index of the relocation section.
This removes the second lookup and sanity check of the sections in arch
code.

Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
961d921a1 kernel/kexec_file.c: search symbols in read-only kexec_purgatory ... Browse Code »

The stripped purgatory does not contain a symtab. So when looking for
symbols this is done in read-only kexec_purgatory. Highlight this by
marking the corresponding variables as 'const'.

Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
65c225d32 kernel/kexec_file.c: make purgatory_info->ehdr const ... Browse Code »

The kexec_purgatory buffer is read-only. Thus all pointers into
kexec_purgatory are read-only, too. Point this out by explicitly
marking purgatory_info->ehdr as 'const' and update the comments in
purgatory_info.

Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800
d2b8178ca kernel/kexec_file.c: remove checks in kexec_purgatory_load ... Browse Code »

Before the purgatory is loaded several checks are done whether the ELF
file in kexec_purgatory is valid or not. These checks are incomplete.
For example they don't check for the total size of the sections defined
in the section header table or if the entry point actually points into
the purgatory.

On the other hand the purgatory, although an ELF file on its own, is
part of the kernel. Thus not trusting the purgatory means not trusting
the kernel build itself.

So remove all validity checks on the purgatory and just trust the kernel
build.

Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo
Acked-by: Dave Young
Cc: AKASHI Takahiro
Cc: Eric Biederman
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Cc: Thiago Jung Bauermann
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Philipp Rudo
2018-04-14 08:10:28 +0800