28 Sep, 2019

1 commit

  • Pull kernel lockdown mode from James Morris:
    "This is the latest iteration of the kernel lockdown patchset, from
    Matthew Garrett, David Howells and others.

    From the original description:

    This patchset introduces an optional kernel lockdown feature,
    intended to strengthen the boundary between UID 0 and the kernel.
    When enabled, various pieces of kernel functionality are restricted.
    Applications that rely on low-level access to either hardware or the
    kernel may cease working as a result - therefore this should not be
    enabled without appropriate evaluation beforehand.

    The majority of mainstream distributions have been carrying variants
    of this patchset for many years now, so there's value in providing a
    unified upstream implementation to reduce the delta. This PR probably
    doesn't meet every distribution requirement, but gets us much closer
    to not requiring external patches.

    There are two major changes since this was last proposed for mainline:

    - Separating lockdown from EFI secure boot. Background discussion is
    covered here: https://lwn.net/Articles/751061/

    - Implementation as an LSM, with a default stackable lockdown LSM
    module. This allows the lockdown feature to be policy-driven,
    rather than encoding an implicit policy within the mechanism.

    The new locked_down LSM hook is provided to allow LSMs to make a
    policy decision around whether kernel functionality that would allow
    tampering with or examining the runtime state of the kernel should be
    permitted.

    The included lockdown LSM provides an implementation with a simple
    policy intended for general purpose use. This policy provides a coarse
    level of granularity, controllable via the kernel command line:

    lockdown={integrity|confidentiality}

    Enable the kernel lockdown feature. If set to integrity, kernel features
    that allow userland to modify the running kernel are disabled. If set to
    confidentiality, kernel features that allow userland to extract
    confidential information from the kernel are also disabled.

    This may also be controlled via /sys/kernel/security/lockdown and
    overridden by kernel configuration.

    New or existing LSMs may implement finer-grained controls of the
    lockdown features. Refer to the lockdown_reason documentation in
    include/linux/security.h for details.

    The lockdown feature has had significant design feedback and review
    across many subsystems. This code has been in linux-next for some
    weeks, with a few fixes applied along the way.

    Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf
    when kernel lockdown is in confidentiality mode") is missing a
    Signed-off-by from its author. Matthew responded that he is providing
    this under category (c) of the DCO"

    * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits)
    kexec: Fix file verification on S390
    security: constify some arrays in lockdown LSM
    lockdown: Print current->comm in restriction messages
    efi: Restrict efivar_ssdt_load when the kernel is locked down
    tracefs: Restrict tracefs when the kernel is locked down
    debugfs: Restrict debugfs when the kernel is locked down
    kexec: Allow kexec_file() with appropriate IMA policy when locked down
    lockdown: Lock down perf when in confidentiality mode
    bpf: Restrict bpf when kernel lockdown is in confidentiality mode
    lockdown: Lock down tracing and perf kprobes when in confidentiality mode
    lockdown: Lock down /proc/kcore
    x86/mmiotrace: Lock down the testmmiotrace module
    lockdown: Lock down module params that specify hardware parameters (eg. ioport)
    lockdown: Lock down TIOCSSERIAL
    lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down
    acpi: Disable ACPI table override if the kernel is locked down
    acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down
    ACPI: Limit access to custom_method when the kernel is locked down
    x86/msr: Restrict MSR access when the kernel is locked down
    x86: Lock down IO port access when the kernel is locked down
    ...

    Linus Torvalds
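
The two-level policy described in the merge message above can be sketched in user-space C as follows. This is a simplified model, not the kernel's implementation: every name ending in `_sketch`, the three example reasons, and the table that maps them to levels are assumptions made for illustration. The only facts taken from the message are that `integrity` blocks features that modify the running kernel and that `confidentiality` additionally blocks features that extract information from it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical model, not the kernel's code: levels are ordered, so a
 * higher level also blocks everything a lower level blocks. */
enum lockdown_level_sketch {
    LEVEL_NONE = 0,
    LEVEL_INTEGRITY,
    LEVEL_CONFIDENTIALITY,
};

/* Illustrative reasons, loosely named after commits in this series. */
enum lockdown_reason_sketch {
    REASON_KEXEC,   /* lets userland modify the running kernel */
    REASON_BPF,     /* lets userland read kernel state */
    REASON_PERF,    /* lets userland read kernel state */
    REASON_COUNT
};

/* Lowest level at which each reason is refused (assumed mapping). */
static const enum lockdown_level_sketch reason_min_level[REASON_COUNT] = {
    [REASON_KEXEC] = LEVEL_INTEGRITY,
    [REASON_BPF]   = LEVEL_CONFIDENTIALITY,
    [REASON_PERF]  = LEVEL_CONFIDENTIALITY,
};

/* Map the value of a lockdown= parameter to a level; unknown means none. */
enum lockdown_level_sketch parse_lockdown_sketch(const char *value)
{
    if (value && strcmp(value, "integrity") == 0)
        return LEVEL_INTEGRITY;
    if (value && strcmp(value, "confidentiality") == 0)
        return LEVEL_CONFIDENTIALITY;
    return LEVEL_NONE;
}

/* Return 0 if the operation may proceed, -EPERM if lockdown refuses it. */
int locked_down_sketch(enum lockdown_level_sketch active,
                       enum lockdown_reason_sketch what)
{
    assert(what < REASON_COUNT);
    return active >= reason_min_level[what] ? -EPERM : 0;
}
```

With this ordering, a caller only needs to ask whether one reason is refused at the active level; a finer-grained LSM could replace the table lookup with its own policy.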
     

20 Aug, 2019

3 commits

  • Systems in lockdown mode should block the kexec of untrusted kernels.
    For x86 and ARM we can ensure that a kernel is trustworthy by validating
    a PE signature, but this isn't possible on other architectures. On those
    platforms we can use IMA digital signatures instead. Add a function to
    determine whether IMA has or will verify signatures for a given event type,
    and if so permit kexec_file() even if the kernel is otherwise locked down.
    This is restricted to cases where CONFIG_INTEGRITY_TRUSTED_KEYRING is set
    in order to prevent an attacker from loading additional keys at runtime.

    Signed-off-by: Matthew Garrett
    Acked-by: Mimi Zohar
    Cc: Dmitry Kasatkin
    Cc: linux-integrity@vger.kernel.org
    Signed-off-by: James Morris

    Matthew Garrett
     
  • When KEXEC_SIG is not enabled, the kernel should not load images through
    the kexec_file system call if the kernel is locked down.

    [Modified by David Howells to fit with modifications to the previous patch
    and to return -EPERM if the kernel is locked down for consistency with
    other lockdowns. Modified by Matthew Garrett to remove the IMA
    integration, which will be replaced by integrating with the IMA
    architecture policy patches.]

    Signed-off-by: Jiri Bohac
    Signed-off-by: David Howells
    Signed-off-by: Matthew Garrett
    cc: kexec@lists.infradead.org
    Signed-off-by: James Morris

    Jiri Bohac
     
  • This is a preparatory patch for kexec_file_load() lockdown. A locked down
    kernel needs to prevent unsigned kernel images from being loaded with
    kexec_file_load(). Currently, the only way to force the signature
    verification is compiling with KEXEC_VERIFY_SIG. This prevents loading
    unsigned images even when the kernel is not locked down at runtime.

    This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE.
    Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG
    turns on the signature verification but allows unsigned images to be
    loaded. KEXEC_SIG_FORCE disallows images without a valid signature.

    Signed-off-by: Jiri Bohac
    Signed-off-by: David Howells
    Signed-off-by: Matthew Garrett
    cc: kexec@lists.infradead.org
    Signed-off-by: James Morris

    Jiri Bohac
     

09 Jul, 2019

1 commit

  • Pull integrity updates from Mimi Zohar:
    "Bug fixes, code clean up, and new features:

    - IMA policy rules can be defined in terms of LSM labels, making the
    IMA policy dependent on LSM policy label changes, in particular LSM
    label deletions. The new environment, in which IMA-appraisal is
    being used, frequently updates the LSM policy and permits LSM label
    deletions.

    - Prevent an mmap'ed shared file opened for write from also being
    mmap'ed execute. In the long term, making this and other similar
    changes at the VFS layer would be preferable.

    - The IMA per policy rule template format support is needed for a
    couple of new/proposed features (eg. kexec boot command line
    measurement, appended signatures, and VFS provided file hashes).

    - Other than the "boot-aggregate" record in the IMA measurement
    list, all other measurements are of file data. Measuring and
    storing the kexec boot command line in the IMA measurement list is
    the first buffer based measurement included in the measurement
    list"

    * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
    integrity: Introduce struct evm_xattr
    ima: Update MAX_TEMPLATE_NAME_LEN to fit largest reasonable definition
    KEXEC: Call ima_kexec_cmdline to measure the boot command line args
    IMA: Define a new template field buf
    IMA: Define a new hook to measure the kexec boot command line arguments
    IMA: support for per policy rule template formats
    integrity: Fix __integrity_init_keyring() section mismatch
    ima: Use designated initializers for struct ima_event_data
    ima: use the lsm policy update notifier
    LSM: switch to blocking policy update notifiers
    x86/ima: fix the Kconfig dependency for IMA_ARCH_POLICY
    ima: Make arch_policy_entry static
    ima: prevent a file already mmap'ed write to be mmap'ed execute
    x86/ima: check EFI SetupMode too

    Linus Torvalds
     

01 Jul, 2019

1 commit

  • During soft reboot (kexec_file_load), boot command line
    arguments are not measured.

    Call ima hook ima_kexec_cmdline to measure the boot command line
    arguments into IMA measurement list.

    - call ima_kexec_cmdline from kexec_file_load.
    - move the call ima_add_kexec_buffer after the cmdline
    args have been measured.

    Signed-off-by: Prakhar Srivastava
    Reviewed-by: James Morris
    Acked-by: Dave Young
    Signed-off-by: Mimi Zohar

    Prakhar Srivastava
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this source code is licensed under the gnu general public license
    version 2 see the file copying for more details

    this source code is licensed under general public license version 2
    see

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 52 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

15 May, 2019

1 commit

  • Most architectures do not need the memblock memory after the page
    allocator is initialized, but only few enable ARCH_DISCARD_MEMBLOCK in the
    arch Kconfig.

    Replacing ARCH_DISCARD_MEMBLOCK with ARCH_KEEP_MEMBLOCK and inverting the
    logic makes it clear which architectures actually use memblock after
    system initialization and skips the necessity to add ARCH_DISCARD_MEMBLOCK
    to the architectures that are still missing that option.

    Link: http://lkml.kernel.org/r/1556102150-32517-1-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Michael Ellerman (powerpc)
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Richard Kuo
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: James Hogan
    Cc: Ley Foon Tan
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Yoshinori Sato
    Cc: Rich Felker
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

25 Apr, 2019

1 commit

  • The flags field in 'struct shash_desc' never actually does anything.
    The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
    However, no shash algorithm ever sleeps, making this flag a no-op.

    With this being the case, inevitably some users who can't sleep wrongly
    pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
    actually started sleeping. For example, the shash_ahash_*() functions,
    which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
    from the ahash API to the shash API. However, the shash functions are
    called under kmap_atomic(), so actually they're assumed to never sleep.

    Even if it turns out that some users do need preemption points while
    hashing large buffers, we could easily provide a helper function
    crypto_shash_update_large() which divides the data into smaller chunks
    and calls crypto_shash_update() and cond_resched() for each chunk. It's
    not necessary to have a flag in 'struct shash_desc', nor is it necessary
    to make individual shash algorithms aware of this at all.

    Therefore, remove shash_desc::flags, and document that the
    crypto_shash_*() functions can be called from any context.

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Eric Biggers
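
The chunking helper that the commit message above proposes as an alternative to a flag might look roughly like the following user-space sketch. The toy FNV-1a style hash, the 4096-byte chunk size, and the `yield_point_sketch()` counter standing in for cond_resched() are all assumptions for illustration; none of this is the crypto API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE_SKETCH 4096u  /* assumed chunk size */

/* Toy streaming hash (FNV-1a style), standing in for a shash algorithm. */
struct toy_hash_sketch { uint64_t state; };

/* Counts preemption points; a stand-in for cond_resched(). */
unsigned long yield_calls_sketch;

static void yield_point_sketch(void)
{
    yield_calls_sketch++;
}

void toy_hash_init_sketch(struct toy_hash_sketch *h)
{
    assert(h != NULL);
    h->state = 0xcbf29ce484222325ULL;
}

/* Plain update: never sleeps, callable from any context. */
void toy_hash_update_sketch(struct toy_hash_sketch *h,
                            const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        h->state ^= data[i];
        h->state *= 0x100000001b3ULL;
    }
}

/* Large update: same result, but yields after every chunk. */
void toy_hash_update_large_sketch(struct toy_hash_sketch *h,
                                  const uint8_t *data, size_t len)
{
    while (len > 0) {
        size_t n = len < CHUNK_SIZE_SKETCH ? len : CHUNK_SIZE_SKETCH;

        toy_hash_update_sketch(h, data, n);
        yield_point_sketch();
        data += n;
        len -= n;
    }
}
```

Because the hash is a streaming one, splitting the input changes nothing about the digest; only the caller that can sleep opts into the wrapper, and the algorithm itself stays unaware of it.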
     

06 Dec, 2018

4 commits

  • In kdump case, there exists only one dedicated memblock region as usable
    memory (crashk_res). With this patch, kexec_walk_memblock() runs a given
    callback function on this region.

    Cosmetic change: 0 to MEMBLOCK_NONE at for_each_free_mem_range*()

    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     
  • Memblock list is another source for usable system memory layout.
    So move powerpc's arch_kexec_walk_mem() to common code so that other
    memblock-based architectures, particularly arm64, can also utilise it.
    A moved function is now renamed to kexec_walk_memblock() and integrated
    into kexec_locate_mem_hole(), which will now be usable for all
    architectures with no need for overriding arch_kexec_walk_mem().

    With this change, arch_kexec_walk_mem() need no longer be a weak function,
    and is now renamed to kexec_walk_resources().

    Since powerpc doesn't support kdump in its kexec_file_load(), the current
    kexec_walk_memblock() won't work for kdump either in this form; this will
    be fixed in the next patch.

    Signed-off-by: AKASHI Takahiro
    Cc: "Eric W. Biederman"
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Acked-by: James Morse
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     
  • Since s390 already knows where to locate buffers, calling
    arch_kexec_walk_mem() makes no sense. So we can just drop it, as kbuf->mem
    indicates this while all other architectures set it to 0 initially.

    This change is a preparatory work for the next patch, where all the
    variant memory walks, either on system resource or memblock, will be
    put in one common place so that it will satisfy all the architectures'
    need.

    Signed-off-by: AKASHI Takahiro
    Reviewed-by: Philipp Rudo
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
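
The idea in the commit above, that a preset kbuf->mem means the architecture already chose the load address and no memory walk is needed, can be sketched as follows. This is a simplified user-space model: the struct layouts, the names ending in `_sketch`, and the first-fit search without alignment handling are assumptions, not the kernel's search logic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* 0 means "no address chosen yet", as described in the commit message. */
#define BUF_MEM_UNKNOWN_SKETCH 0UL

struct mem_range_sketch {
    unsigned long start;
    unsigned long end;   /* inclusive */
};

struct kbuf_sketch {
    unsigned long mem;    /* load address, or BUF_MEM_UNKNOWN_SKETCH */
    unsigned long memsz;  /* bytes required */
};

/* Pick a load address unless the architecture already set one. */
int locate_mem_hole_sketch(struct kbuf_sketch *kbuf,
                           const struct mem_range_sketch *ranges, size_t nr)
{
    assert(kbuf != NULL && kbuf->memsz > 0);

    /* Address preset by arch code (the s390 case): nothing to search. */
    if (kbuf->mem != BUF_MEM_UNKNOWN_SKETCH)
        return 0;

    /* First fit over the free ranges; no alignment, for brevity. */
    for (size_t i = 0; i < nr; i++) {
        assert(ranges[i].end >= ranges[i].start);
        if (ranges[i].end - ranges[i].start >= kbuf->memsz - 1) {
            kbuf->mem = ranges[i].start;
            return 0;
        }
    }
    return -EADDRNOTAVAIL;
}
```

Checking the preset address in the common function is what removes the need for an architecture hook that exists only to switch the search off.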
     
  • Change this function from static to global so that arm64 can implement
    its own arch_kimage_file_post_load_cleanup() later using
    kexec_image_post_load_cleanup_default().

    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     

04 Nov, 2018

1 commit

  • We include kexec.h and slab.h twice in kexec_file.c. This is unnecessary,
    so just remove the duplicates.

    Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
    Signed-off-by: zhong jiang
    Reviewed-by: Bhupesh Sharma
    Reviewed-by: Andrew Morton
    Acked-by: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhong jiang
     

13 Jun, 2018

1 commit

  • The vzalloc() function has no 2-factor argument form, so multiplication
    factors need to be wrapped in array_size(). This patch replaces cases of:

    vzalloc(a * b)

    with:
    vzalloc(array_size(a, b))

    as well as handling cases of:

    vzalloc(a * b * c)

    with:

    vzalloc(array3_size(a, b, c))

    This does, however, attempt to ignore constant size factors like:

    vzalloc(4 * 1024)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    vzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    vzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    vzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_ID
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_ID
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    vzalloc(
    - SIZE * COUNT
    + array_size(COUNT, SIZE)
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    vzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    vzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    vzalloc(C1 * C2 * C3, ...)
    |
    vzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants.
    @@
    expression E1, E2;
    constant C1, C2;
    @@

    (
    vzalloc(C1 * C2, ...)
    |
    vzalloc(
    - E1 * E2
    + array_size(E1, E2)
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
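
The reason for wrapping the factors is that an open-coded multiplication can wrap around and yield a too-small allocation. A minimal user-space sketch of saturating size helpers follows; the `_sketch` names and the calloc()-based allocator are illustrative assumptions, while the saturate-to-SIZE_MAX behaviour is what makes the subsequent allocation fail instead of silently succeeding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiply two sizes, saturating at SIZE_MAX on overflow. */
size_t array_size_sketch(size_t a, size_t b)
{
    if (b != 0 && a > SIZE_MAX / b)
        return SIZE_MAX;
    return a * b;
}

/* Three-factor variant built from the two-factor one. */
size_t array3_size_sketch(size_t a, size_t b, size_t c)
{
    size_t ab = array_size_sketch(a, b);

    if (ab == SIZE_MAX)
        return SIZE_MAX;
    return array_size_sketch(ab, c);
}

/* Zeroing allocator stand-in: a saturated size is always refused. */
void *zalloc_sketch(size_t size)
{
    if (size == SIZE_MAX)
        return NULL;
    return calloc(1, size);
}

/* Small self-check showing the intended call pattern. */
void array_size_selfcheck_sketch(void)
{
    void *p = zalloc_sketch(array_size_sketch(16, sizeof(long)));

    assert(p != NULL);
    free(p);
    assert(zalloc_sketch(array_size_sketch(SIZE_MAX, 2)) == NULL);
}
```

Saturating rather than returning an error keeps the call sites unchanged in shape: the allocator sees an impossible size and fails through its normal error path.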
     

14 Apr, 2018

12 commits

  • For s390 new kernels are loaded to fixed addresses in memory before they
    are booted. With the current code this is a problem as it assumes the
    kernel will be loaded to an 'arbitrary' address. In particular,
    kexec_locate_mem_hole searches for a large enough memory region and sets
    the load address (kexec_buffer->mem) to it.

    Luckily there is a simple workaround for this problem. By returning 1
    in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This
    allows the architecture to set kbuf->mem by hand. While the trick works
    fine for the kernel it does not for the purgatory as here the
    architectures don't have access to its kexec_buffer.

    Give architectures access to the purgatory's kexec_buffer by changing
    kexec_load_purgatory to take a pointer to it. With this change
    architectures have access to the buffer and can edit it as they need.

    A nice side effect of this change is that we can get rid of the
    purgatory_info->purgatory_load_address field, as the information
    stored there can now be accessed directly from kbuf->mem.

    Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Reviewed-by: Martin Schwidefsky
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The current code uses the sh_offset field in purgatory_info->sechdrs to
    store a pointer to the current load address of the section. Depending
    on whether the section will be loaded or not this is either a pointer into
    purgatory_info->purgatory_buf or kexec_purgatory. This is not only a
    violation of the ELF standard but also makes the code very hard to
    understand as you cannot tell if the memory you are using is read-only
    or not.

    Remove this misuse and store the offset of the section in
    purgatory_info->purgatory_buf in sh_offset.

    Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The main loop currently uses quite a lot of variables to update the
    section headers. Some of them are unnecessary. So clean them up a
    little.

    Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • To update the entry point there is an extra loop over all section
    headers although this can be done in the main loop. So move it there
    and eliminate the extra loop and variable to store the 'entry section
    index'.

    Also, in the main loop, move the usual case, i.e. non-bss section, out
    of the extra if-block.

    Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Reviewed-by: Martin Schwidefsky
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • When inspecting __kexec_load_purgatory you find that it has two tasks

    1) setting up the kexec_buffer for the new kernel and,
    2) setting up pi->sechdrs for the final load address.

    The two tasks are independent of each other. To improve readability
    split up __kexec_load_purgatory into two functions, one for each task,
    and call them directly from kexec_load_purgatory.

    Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • When the relocations are applied to the purgatory only the section the
    relocations are applied to is writable. The other sections, i.e. the
    symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this
    by marking the corresponding variables as 'const'.

    While at it also change the signatures of arch_kexec_apply_relocations* to
    take section pointers instead of just the index of the relocation section.
    This removes the second lookup and sanity check of the sections in arch
    code.

    Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The stripped purgatory does not contain a symtab. So when looking for
    symbols this is done in read-only kexec_purgatory. Highlight this by
    marking the corresponding variables as 'const'.

    Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The kexec_purgatory buffer is read-only. Thus all pointers into
    kexec_purgatory are read-only, too. Point this out by explicitly
    marking purgatory_info->ehdr as 'const' and update the comments in
    purgatory_info.

    Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • Before the purgatory is loaded several checks are done whether the ELF
    file in kexec_purgatory is valid or not. These checks are incomplete.
    For example they don't check for the total size of the sections defined
    in the section header table or if the entry point actually points into
    the purgatory.

    On the other hand the purgatory, although an ELF file on its own, is
    part of the kernel. Thus not trusting the purgatory means not trusting
    the kernel build itself.

    So remove all validity checks on the purgatory and just trust the kernel
    build.

    Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • In the previous patches, commonly-used routines, exclude_mem_range() and
    prepare_elf64_headers(), were carved out. Now place them in kexec
    common code. A prefix "crash_" is given to each of their names to avoid
    possible name collisions.

    Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
     
  • As arch_kexec_kernel_image_{probe,load}(),
    arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
    are almost duplicated among architectures, they can be commonalized with
    an architecture-defined kexec_file_ops array. So let's factor them out.

    Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
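
The architecture-defined ops array mentioned in the commit above can be pictured as a NULL-terminated table that common code walks until one loader recognises the image. The sketch below is a user-space illustration only: the struct contents, the two toy probe functions, the made-up "RAW!" magic, and all `_sketch` names are assumptions, not the kernel's kexec_file_ops definition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Per-format operations; only probe is modelled here. */
struct file_ops_sketch {
    const char *name;
    int (*probe)(const char *buf, size_t len); /* 0 if recognised */
};

static int elf_probe_sketch(const char *buf, size_t len)
{
    return (len >= 4 && memcmp(buf, "\177ELF", 4) == 0) ? 0 : -ENOEXEC;
}

/* "RAW!" is a made-up magic for a second, hypothetical image format. */
static int raw_probe_sketch(const char *buf, size_t len)
{
    return (len >= 4 && memcmp(buf, "RAW!", 4) == 0) ? 0 : -ENOEXEC;
}

static const struct file_ops_sketch elf_ops_sketch = { "elf", elf_probe_sketch };
static const struct file_ops_sketch raw_ops_sketch = { "raw", raw_probe_sketch };

/* What an architecture would provide: its loaders, NULL-terminated. */
const struct file_ops_sketch *const loaders_sketch[] = {
    &elf_ops_sketch,
    &raw_ops_sketch,
    NULL,
};

/* Common code: return the first loader whose probe accepts the image. */
const struct file_ops_sketch *image_probe_sketch(const char *buf, size_t len)
{
    assert(buf != NULL);

    for (const struct file_ops_sketch *const *ops = loaders_sketch; *ops; ops++) {
        if ((*ops)->probe(buf, len) == 0)
            return *ops;
    }
    return NULL;
}
```

With the table in place, the probe loop lives once in common code and each architecture contributes only the list of formats it supports.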
     
  • Patch series "kexec_file, x86, powerpc: refactoring for other
    architectures", v2.

    This is a preparatory patchset for adding kexec_file support on arm64.

    It was originally included in an arm64 patch set[1], but Philipp is also
    working on their kexec_file support on s390[2] and some changes are now
    conflicting.

    So these common parts were extracted and put into a separate patch set
    for better integration. What's more, my original patch#4 was split into
    a few small chunks for easier review after Dave's comment.

    As such, the resulting code is basically identical with my original, and
    the only *visible* differences are:

    - renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup()

    - change one of types of arguments at prepare_elf64_headers()

    Those, unfortunately, require a couple of trivial changes on the rest
    (#1, #6 to #13) of my arm64 kexec_file patch set[1].

    Patch #1 allows making a use of purgatory optional, particularly useful
    for arm64.

    Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
    verify_sig}() and arch_kimage_file_post_load_cleanup() across
    architectures.

    Patches #3-#7 are also intended to generalize parse_elf64_headers(),
    along with exclude_mem_range(), so that they can be reused more widely.

    [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
    [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html

    This patch (of 7):

    On arm64, the crash dump kernel's usable memory is protected by
    *unmapping* it from kernel virtual space, unlike other architectures where
    the region is just made read-only. It is highly unlikely that the region
    is accidentally corrupted, which justifies dropping the digest check code
    from purgatory as well. The resulting code is simple enough that it does
    not need the somewhat ugly re-linking/relocation code, i.e.
    arch_kexec_apply_relocations_add().

    Please see:

    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html

    All that the purgatory does is shuffle arguments and jump into a new
    kernel, while we still need to reserve some space for a hash value
    (purgatory_sha256_digest) which is never checked.

    As such, it doesn't make sense to have trampoline code between the old
    kernel and the new kernel on arm64.

    This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
    allows related code to be compiled in only if necessary.

    [takahiro.akashi@linaro.org: fix trivial screwup]
    Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
    Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
     

07 Nov, 2017

1 commit

  • In preparation for a new function that will need additional resource
    information during the resource walk, update the resource walk callback to
    pass the resource structure. Since the current callback start and end
    arguments are pulled from the resource structure, the callback functions
    can obtain them from the resource structure directly.

    Signed-off-by: Tom Lendacky
    Signed-off-by: Brijesh Singh
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Reviewed-by: Borislav Petkov
    Tested-by: Borislav Petkov
    Cc: kvm@vger.kernel.org
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lkml.kernel.org/r/20171020143059.3291-10-brijesh.singh@amd.com

    Tom Lendacky
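
    The callback change described above can be pictured with a small
    user-space sketch. Instead of receiving start and end as separate
    arguments, the callback receives the whole resource and reads the bounds
    from it, which also gives it access to any other field. The names
    struct res_sketch, walk_res_sketch() and sum_sizes_sketch() are
    hypothetical and only model the idea; they are not the kernel's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a resource descriptor. */
struct res_sketch {
    uint64_t start;
    uint64_t end;        /* inclusive */
    unsigned long flags; /* extra information a callback may now inspect */
};

/* New-style callback: takes the resource itself, not (start, end). */
typedef int (*walk_fn_sketch)(struct res_sketch *res, void *arg);

/* Calls fn for every entry; stops at the first non-zero return value. */
int walk_res_sketch(struct res_sketch *tbl, size_t n, void *arg, walk_fn_sketch fn)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        int ret = fn(&tbl[i], arg);
        if (ret)
            return ret;
    }
    return 0;
}

/* Example callback: derives the size from the structure directly. */
int sum_sizes_sketch(struct res_sketch *res, void *arg)
{
    *(uint64_t *)arg += res->end - res->start + 1;
    return 0;
}
```

    A caller would pass a table and an accumulator, for example
    walk_res_sketch(tbl, n, &total, sum_sizes_sketch).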
     

13 Jul, 2017

2 commits

  • Defining kexec_purgatory as a zero-length char array upsets compile time
    size checking. Since this is built on a per-arch basis, define it as an
    unsized char array (like is done for other similar things, e.g. linker
    sections). This silences the warning generated by the future
    CONFIG_FORTIFY_SOURCE, which did not like the memcmp() of a "0 byte"
    array. This drops the __weak and uses an extern instead, since both
    users define kexec_purgatory.

    Link: http://lkml.kernel.org/r/1497903987-21002-4-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Acked-by: "Eric W. Biederman"
    Cc: Daniel Micay
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Currently vmcoreinfo data is updated at boot time in a subsys_initcall(),
    so it is at risk of being modified by buggy code while the system is
    running.

    As a result, the dumped vmcore may contain wrong vmcoreinfo. Later on,
    when using utilities such as "crash" or "makedumpfile" to parse this
    vmcore, we will probably get a "Segmentation fault" or other unexpected
    errors.

    E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
    system; 3) trigger kdump, then we obviously will fail to recognize the
    crash context correctly due to the corrupted vmcoreinfo.

    Now, except for vmcoreinfo, all the crash data is well protected
    (including the cpu note, which is fully updated in the crash path, thus
    its correctness is guaranteed). Given that vmcoreinfo data is a large
    chunk prepared for kdump, we had better protect it as well.

    To solve this, we relocate and copy vmcoreinfo_data to the crash memory
    when kdump is loaded via the kexec syscalls. Because the whole crash
    memory will be protected by the existing arch_kexec_protect_crashkres()
    mechanism, we naturally protect vmcoreinfo_data from write (even read)
    access under the kernel direct mapping after kdump is loaded.

    Since kdump is usually loaded at the very early stage after boot, we can
    trust the correctness of the vmcoreinfo data copied.

    On the other hand, we still need to operate on the safe copy of
    vmcoreinfo when a crash happens in order to generate vmcoreinfo_note
    again. We rely on vmap() to map out a new kernel virtual address and use
    this new one in the subsequent crash_save_vmcoreinfo().

    BTW, we do not touch vmcoreinfo_note, because it will be fully updated
    after a crash using the protected vmcoreinfo_data, which is surely
    correct, just like the cpu crash note.

    Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Tested-by: Michael Holzheu
    Cc: Benjamin Herrenschmidt
    Cc: Dave Young
    Cc: Eric Biederman
    Cc: Hari Bathini
    Cc: Juergen Gross
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

30 Jun, 2017

1 commit


11 Mar, 2017

1 commit

  • The purgatory code defines global variables which are referenced via a
    symbol lookup in the kexec code (core and arch).

    A recent commit addressing sparse warnings made these static and thereby
    broke kexec_file.

    Why did this happen? Simply because the whole machinery is undocumented and
    lacks any form of forward declarations. The variable names are unspecific
    and lack a prefix, so adding forward declarations creates shadow variables
    in the core code. Aside of that the code relies on magic constants and
    duplicate struct definitions with no way to ensure that these things stay
    in sync. The section placement of the purgatory variables happened by
    chance and not by design.

    Unbreak kexec and cleanup the mess:

    - Add proper forward declarations and document the usage
    - Use common struct definition
    - Use the proper common defines instead of magic constants
    - Add a purgatory_ prefix to have a proper name space
    - Use ARRAY_SIZE() instead of a home-brewed reimplementation
    - Add proper sections to the purgatory variables [ From Mike ]

    Fixes: 72042a8c7b01 ("x86/purgatory: Make functions and variables static")
    Reported-by: Mike Galbraith
    Signed-off-by: Thomas Gleixner
    Cc: Nicholas Mc Guire
    Cc: Borislav Petkov
    Cc: Vivek Goyal
    Cc: "Tobin C. Harding"
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
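
    Two of the cleanup items listed above, a shared struct definition and
    ARRAY_SIZE() in place of a hand-written element count, can be sketched in
    a few lines. This is an illustrative user-space model under assumptions:
    SEGMENT_MAX_SKETCH, struct sha_region_sketch and
    purgatory_add_region_sketch() are hypothetical names, and the value 16 is
    chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (must not be used on a pointer). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical shared constant, used in place of a magic number. */
#define SEGMENT_MAX_SKETCH 16

/* One struct definition shared by both sides, so the layouts cannot drift. */
struct sha_region_sketch {
    unsigned long start;
    unsigned long len;
};

/* The purgatory_ prefix gives the variables a recognizable name space. */
static struct sha_region_sketch purgatory_sha_regions_sketch[SEGMENT_MAX_SKETCH];
static size_t purgatory_region_count_sketch;

/* Records one region; returns -1 once the table is full. */
int purgatory_add_region_sketch(unsigned long start, unsigned long len)
{
    size_t cap = ARRAY_SIZE(purgatory_sha_regions_sketch);

    assert(purgatory_region_count_sketch <= cap);
    if (purgatory_region_count_sketch == cap)
        return -1;
    purgatory_sha_regions_sketch[purgatory_region_count_sketch].start = start;
    purgatory_sha_regions_sketch[purgatory_region_count_sketch].len = len;
    purgatory_region_count_sketch++;
    return 0;
}
```

    Because the capacity is derived from the array itself, changing
    SEGMENT_MAX_SKETCH in one place keeps the bounds check in sync.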
     

21 Dec, 2016

1 commit

  • The TPM PCRs are only reset on a hard reboot. In order to validate a
    TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
    list of the running kernel must be saved and restored on boot.

    This patch uses the kexec buffer passing mechanism to pass the
    serialized IMA binary_runtime_measurements to the next kernel.

    Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Mimi Zohar
    Acked-by: "Eric W. Biederman"
    Acked-by: Dmitry Kasatkin
    Cc: Andreas Steffen
    Cc: Josh Sklar
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Cc: Michael Ellerman
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Stewart Smith
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mimi Zohar
     

30 Nov, 2016

3 commits


02 Sep, 2016

1 commit

  • If kexec_apply_relocations fails, kexec_load_purgatory frees pi->sechdrs
    and pi->purgatory_buf. This is redundant, because in case of error
    kimage_file_prepare_segments calls kimage_file_post_load_cleanup, which
    will also free those buffers.

    This causes two warnings like the following, one for pi->sechdrs and the
    other for pi->purgatory_buf:

    kexec-bzImage64: Loading purgatory failed
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 2119 at mm/vmalloc.c:1490 __vunmap+0xc1/0xd0
    Trying to vfree() nonexistent vm area (ffffc90000e91000)
    Modules linked in:
    CPU: 1 PID: 2119 Comm: kexec Not tainted 4.8.0-rc3+ #5
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
    dump_stack+0x4d/0x65
    __warn+0xcb/0xf0
    warn_slowpath_fmt+0x4f/0x60
    ? find_vmap_area+0x19/0x70
    ? kimage_file_post_load_cleanup+0x47/0xb0
    __vunmap+0xc1/0xd0
    vfree+0x2e/0x70
    kimage_file_post_load_cleanup+0x5e/0xb0
    SyS_kexec_file_load+0x448/0x680
    ? putname+0x54/0x60
    ? do_sys_open+0x190/0x1f0
    entry_SYSCALL_64_fastpath+0x13/0x8f
    ---[ end trace 158bb74f5950ca2b ]---

    Fix by setting pi->sechdrs and pi->purgatory_buf to NULL, since vfree
    won't try to free a NULL pointer.

    Link: http://lkml.kernel.org/r/1472083546-23683-1-git-send-email-bauerman@linux.vnet.ibm.com
    Signed-off-by: Thiago Jung Bauermann
    Acked-by: Baoquan He
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Cc: Dave Young
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thiago Jung Bauermann
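
    The double-free pattern and its fix can be modeled in user space, where
    free(NULL) is a no-op in the same way the commit relies on vfree() not
    touching a NULL pointer. The sketch below is an analogy only: struct
    purgatory_info_sketch, load_sketch() and cleanup_sketch() are hypothetical
    names, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in holding the two buffers named in the commit. */
struct purgatory_info_sketch {
    void *sechdrs;
    void *purgatory_buf;
};

/* Common cleanup, always run by the caller after a failed load. */
void cleanup_sketch(struct purgatory_info_sketch *pi)
{
    assert(pi != NULL);
    free(pi->sechdrs);       /* safe even if already released: pointer is NULL */
    pi->sechdrs = NULL;
    free(pi->purgatory_buf);
    pi->purgatory_buf = NULL;
}

/* Loader with an error path that frees its buffers itself. */
int load_sketch(struct purgatory_info_sketch *pi, int fail_relocations)
{
    assert(pi != NULL);
    pi->sechdrs = malloc(64);
    pi->purgatory_buf = malloc(64);
    if (!pi->sechdrs || !pi->purgatory_buf)
        return -1;           /* caller's cleanup releases whatever was allocated */

    if (fail_relocations) {
        free(pi->sechdrs);
        pi->sechdrs = NULL;  /* the fix: later cleanup sees NULL, not a stale pointer */
        free(pi->purgatory_buf);
        pi->purgatory_buf = NULL;
        return -1;
    }
    return 0;
}
```

    Without the two NULL assignments in the error path, the caller's cleanup
    would free both buffers a second time, which is the situation the
    warnings in the commit message report.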
     

24 May, 2016

1 commit

  • If some kernel (module) path stomps on the crash reserved memory (already
    mapped by the kernel) into which the second kernel's data has been loaded,
    the kdump kernel will probably fail to boot when a panic happens (or even
    when it does not), leaving the culprit at large. This is unacceptable.

    The patch introduces a mechanism for detecting such cases:

    1) After each crash kexec load, it simply marks the reserved memory
    regions read-only, since we no longer access them after that. When someone
    stomps on the region, the first kernel will panic and trigger the kdump.
    The weak arch_kexec_protect_crashkres() is introduced to do the actual
    protection.

    2) To allow multiple loads, once 1) is done we also need to re-mark
    the reserved memory read-write each time a system call related to
    kdump is made. The weak arch_kexec_unprotect_crashkres() is introduced
    to do the actual unprotection.

    The architecture can make its specific implementation by overriding
    arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().

    Signed-off-by: Xunlei Pang
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Minfei Huang
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

18 Mar, 2016

1 commit

  • Pull security layer updates from James Morris:
    "There are a bunch of fixes to the TPM, IMA, and Keys code, with minor
    fixes scattered across the subsystem.

    IMA now requires signed policy, and that policy is also now measured
    and appraised"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (67 commits)
    X.509: Make algo identifiers text instead of enum
    akcipher: Move the RSA DER encoding check to the crypto layer
    crypto: Add hash param to pkcs1pad
    sign-file: fix build with CMS support disabled
    MAINTAINERS: update tpmdd urls
    MODSIGN: linux/string.h should be #included to get memcpy()
    certs: Fix misaligned data in extra certificate list
    X.509: Handle midnight alternative notation in GeneralizedTime
    X.509: Support leap seconds
    Handle ISO 8601 leap seconds and encodings of midnight in mktime64()
    X.509: Fix leap year handling again
    PKCS#7: fix unitialized boolean 'want'
    firmware: change kernel read fail to dev_dbg()
    KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert
    KEYS: Reserve an extra certificate symbol for inserting without recompiling
    modsign: hide openssl output in silent builds
    tpm_tis: fix build warning with tpm_tis_resume
    ima: require signed IMA policy
    ima: measure and appraise the IMA policy itself
    ima: load policy using path
    ...

    Linus Torvalds
     

21 Feb, 2016

1 commit

  • Replace copy_file_from_fd() with kernel_read_file_from_fd().

    Two new identifiers named READING_KEXEC_IMAGE and READING_KEXEC_INITRAMFS
    are defined for measuring, appraising or auditing the kexec image and
    initramfs.

    Changelog v3:
    - return -EBADF, not -ENOEXEC
    - identifier change
    - split patch, moving copy_file_from_fd() to a separate patch
    - split patch, moving IMA changes to a separate patch
    v0:
    - use kstat file size type loff_t, not size_t
    - Calculate the file hash from the in memory buffer - Dave Young

    Signed-off-by: Mimi Zohar
    Acked-by: Kees Cook
    Acked-by: Luis R. Rodriguez
    Cc: Eric Biederman
    Acked-by: Dave Young

    Mimi Zohar