06 Dec, 2018

4 commits

  • In kdump case, there exists only one dedicated memblock region as usable
    memory (crashk_res). With this patch, kexec_walk_memblock() runs a given
    callback function on this region.

    Cosmetic change: 0 to MEMBLOCK_NONE at for_each_free_mem_range*()

    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     
  • Memblock list is another source for usable system memory layout.
    So move powerpc's arch_kexec_walk_mem() to common code so that other
    memblock-based architectures, particularly arm64, can also utilise it.
    A moved function is now renamed to kexec_walk_memblock() and integrated
    into kexec_locate_mem_hole(), which will now be usable for all
    architectures with no need for overriding arch_kexec_walk_mem().

    With this change, arch_kexec_walk_mem() need no longer be a weak function,
    and was now renamed to kexec_walk_resources().

    Since powerpc doesn't support kdump in its kexec_file_load(), the current
    kexec_walk_memblock() won't work for kdump either in this form, this will
    be fixed in the next patch.

    Signed-off-by: AKASHI Takahiro
    Cc: "Eric W. Biederman"
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Acked-by: James Morse
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     
  • Since s390 already knows where to locate buffers, calling
    arch_kexec_mem_walk() has no sense. So we can just drop it as kbuf->mem
    indicates this while all other architectures sets it to 0 initially.

    This change is a preparatory work for the next patch, where all the
    variant memory walks, either on system resource or memblock, will be
    put in one common place so that it will satisfy all the architectures'
    need.

    Signed-off-by: AKASHI Takahiro
    Reviewed-by: Philipp Rudo
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     
  • Change this function from static to global so that arm64 can implement
    its own arch_kimage_file_post_load_cleanup() later using
    kexec_image_post_load_cleanup_default().

    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     

04 Nov, 2018

1 commit

  • We include kexec.h and slab.h twice in kexec_file.c. It's unnecessary.
    hence just remove them.

    Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
    Signed-off-by: zhong jiang
    Reviewed-by: Bhupesh Sharma
    Reviewed-by: Andrew Morton
    Acked-by: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhong jiang
     

13 Jun, 2018

1 commit

  • The vzalloc() function has no 2-factor argument form, so multiplication
    factors need to be wrapped in array_size(). This patch replaces cases of:

    vzalloc(a * b)

    with:
    vzalloc(array_size(a, b))

    as well as handling cases of:

    vzalloc(a * b * c)

    with:

    vzalloc(array3_size(a, b, c))

    This does, however, attempt to ignore constant size factors like:

    vzalloc(4 * 1024)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    vzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    vzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    vzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_ID
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_ID
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    vzalloc(
    - SIZE * COUNT
    + array_size(COUNT, SIZE)
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    vzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    vzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    vzalloc(C1 * C2 * C3, ...)
    |
    vzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants.
    @@
    expression E1, E2;
    constant C1, C2;
    @@

    (
    vzalloc(C1 * C2, ...)
    |
    vzalloc(
    - E1 * E2
    + array_size(E1, E2)
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

14 Apr, 2018

12 commits

  • For s390 new kernels are loaded to fixed addresses in memory before they
    are booted. With the current code this is a problem as it assumes the
    kernel will be loaded to an 'arbitrary' address. In particular,
    kexec_locate_mem_hole searches for a large enough memory region and sets
    the load address (kexec_bufer->mem) to it.

    Luckily there is a simple workaround for this problem. By returning 1
    in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This
    allows the architecture to set kbuf->mem by hand. While the trick works
    fine for the kernel it does not for the purgatory as here the
    architectures don't have access to its kexec_buffer.

    Give architectures access to the purgatories kexec_buffer by changing
    kexec_load_purgatory to take a pointer to it. With this change
    architectures have access to the buffer and can edit it as they need.

    A nice side effect of this change is that we can get rid of the
    purgatory_info->purgatory_load_address field. As now the information
    stored there can directly be accessed from kbuf->mem.

    Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Reviewed-by: Martin Schwidefsky
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The current code uses the sh_offset field in purgatory_info->sechdrs to
    store a pointer to the current load address of the section. Depending
    whether the section will be loaded or not this is either a pointer into
    purgatory_info->purgatory_buf or kexec_purgatory. This is not only a
    violation of the ELF standard but also makes the code very hard to
    understand as you cannot tell if the memory you are using is read-only
    or not.

    Remove this misuse and store the offset of the section in
    pugaroty_info->purgatory_buf in sh_offset.

    Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The main loop currently uses quite a lot of variables to update the
    section headers. Some of them are unnecessary. So clean them up a
    little.

    Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • To update the entry point there is an extra loop over all section
    headers although this can be done in the main loop. So move it there
    and eliminate the extra loop and variable to store the 'entry section
    index'.

    Also, in the main loop, move the usual case, i.e. non-bss section, out
    of the extra if-block.

    Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Reviewed-by: Martin Schwidefsky
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • When inspecting __kexec_load_purgatory you find that it has two tasks

    1) setting up the kexec_buffer for the new kernel and,
    2) setting up pi->sechdrs for the final load address.

    The two tasks are independent of each other. To improve readability
    split up __kexec_load_purgatory into two functions, one for each task,
    and call them directly from kexec_load_purgatory.

    Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • When the relocations are applied to the purgatory only the section the
    relocations are applied to is writable. The other sections, i.e. the
    symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this
    by marking the corresponding variables as 'const'.

    While at it also change the signatures of arch_kexec_apply_relocations* to
    take section pointers instead of just the index of the relocation section.
    This removes the second lookup and sanity check of the sections in arch
    code.

    Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The stripped purgatory does not contain a symtab. So when looking for
    symbols this is done in read-only kexec_purgatory. Highlight this by
    marking the corresponding variables as 'const'.

    Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • The kexec_purgatory buffer is read-only. Thus all pointers into
    kexec_purgatory are read-only, too. Point this out by explicitly
    marking purgatory_info->ehdr as 'const' and update the comments in
    purgatory_info.

    Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • Before the purgatory is loaded several checks are done whether the ELF
    file in kexec_purgatory is valid or not. These checks are incomplete.
    For example they don't check for the total size of the sections defined
    in the section header table or if the entry point actually points into
    the purgatory.

    On the other hand the purgatory, although an ELF file on its own, is
    part of the kernel. Thus not trusting the purgatory means not trusting
    the kernel build itself.

    So remove all validity checks on the purgatory and just trust the kernel
    build.

    Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com
    Signed-off-by: Philipp Rudo
    Acked-by: Dave Young
    Cc: AKASHI Takahiro
    Cc: Eric Biederman
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Rudo
     
  • In the previous patches, commonly-used routines, exclude_mem_range() and
    prepare_elf64_headers(), were carved out. Now place them in kexec
    common code. A prefix "crash_" is given to each of their names to avoid
    possible name collisions.

    Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
     
  • As arch_kexec_kernel_image_{probe,load}(),
    arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
    are almost duplicated among architectures, they can be commonalized with
    an architecture-defined kexec_file_ops array. So let's factor them out.

    Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Cc: Michael Ellerman
    Cc: Thiago Jung Bauermann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
     
  • Patch series "kexec_file, x86, powerpc: refactoring for other
    architecutres", v2.

    This is a preparatory patchset for adding kexec_file support on arm64.

    It was originally included in a arm64 patch set[1], but Philipp is also
    working on their kexec_file support on s390[2] and some changes are now
    conflicting.

    So these common parts were extracted and put into a separate patch set
    for better integration. What's more, my original patch#4 was split into
    a few small chunks for easier review after Dave's comment.

    As such, the resulting code is basically identical with my original, and
    the only *visible* differences are:

    - renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup()

    - change one of types of arguments at prepare_elf64_headers()

    Those, unfortunately, require a couple of trivial changes on the rest
    (#1, #6 to #13) of my arm64 kexec_file patch set[1].

    Patch #1 allows making a use of purgatory optional, particularly useful
    for arm64.

    Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
    verify_sig}() and arch_kimage_file_post_load_cleanup() across
    architectures.

    Patches #3-#7 are also intended to generalize parse_elf64_headers(),
    along with exclude_mem_range(), to be made best re-use of.

    [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
    [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html

    This patch (of 7):

    On arm64, crash dump kernel's usable memory is protected by *unmapping*
    it from kernel virtual space unlike other architectures where the region
    is just made read-only. It is highly unlikely that the region is
    accidentally corrupted and this observation rationalizes that digest
    check code can also be dropped from purgatory. The resulting code is so
    simple as it doesn't require a bit ugly re-linking/relocation stuff,
    i.e. arch_kexec_apply_relocations_add().

    Please see:

    http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html

    All that the purgatory does is to shuffle arguments and jump into a new
    kernel, while we still need to have some space for a hash value
    (purgatory_sha256_digest) which is never checked against.

    As such, it doesn't make sense to have trampline code between old kernel
    and new kernel on arm64.

    This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
    allows related code to be compiled in only if necessary.

    [takahiro.akashi@linaro.org: fix trivial screwup]
    Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
    Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
    Signed-off-by: AKASHI Takahiro
    Acked-by: Dave Young
    Tested-by: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    AKASHI Takahiro
     

07 Nov, 2017

1 commit

  • In preperation for a new function that will need additional resource
    information during the resource walk, update the resource walk callback to
    pass the resource structure. Since the current callback start and end
    arguments are pulled from the resource structure, the callback functions
    can obtain them from the resource structure directly.

    Signed-off-by: Tom Lendacky
    Signed-off-by: Brijesh Singh
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Reviewed-by: Borislav Petkov
    Tested-by: Borislav Petkov
    Cc: kvm@vger.kernel.org
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lkml.kernel.org/r/20171020143059.3291-10-brijesh.singh@amd.com

    Tom Lendacky
     

13 Jul, 2017

2 commits

  • Defining kexec_purgatory as a zero-length char array upsets compile time
    size checking. Since this is built on a per-arch basis, define it as an
    unsized char array (like is done for other similar things, e.g. linker
    sections). This silences the warning generated by the future
    CONFIG_FORTIFY_SOURCE, which did not like the memcmp() of a "0 byte"
    array. This drops the __weak and uses an extern instead, since both
    users define kexec_purgatory.

    Link: http://lkml.kernel.org/r/1497903987-21002-4-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Acked-by: "Eric W. Biederman"
    Cc: Daniel Micay
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Currently vmcoreinfo data is updated at boot time subsys_initcall(), it
    has the risk of being modified by some wrong code during system is
    running.

    As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on,
    when using "crash", "makedumpfile", etc utility to parse this vmcore, we
    probably will get "Segmentation fault" or other unexpected errors.

    E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
    system; 3) trigger kdump, then we obviously will fail to recognize the
    crash context correctly due to the corrupted vmcoreinfo.

    Now except for vmcoreinfo, all the crash data is well
    protected(including the cpu note which is fully updated in the crash
    path, thus its correctness is guaranteed). Given that vmcoreinfo data
    is a large chunk prepared for kdump, we better protect it as well.

    To solve this, we relocate and copy vmcoreinfo_data to the crash memory
    when kdump is loading via kexec syscalls. Because the whole crash
    memory will be protected by existing arch_kexec_protect_crashkres()
    mechanism, we naturally protect vmcoreinfo_data from write(even read)
    access under kernel direct mapping after kdump is loaded.

    Since kdump is usually loaded at the very early stage after boot, we can
    trust the correctness of the vmcoreinfo data copied.

    On the other hand, we still need to operate the vmcoreinfo safe copy
    when crash happens to generate vmcoreinfo_note again, we rely on vmap()
    to map out a new kernel virtual address and update to use this new one
    instead in the following crash_save_vmcoreinfo().

    BTW, we do not touch vmcoreinfo_note, because it will be fully updated
    using the protected vmcoreinfo_data after crash which is surely correct
    just like the cpu crash note.

    Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
    Signed-off-by: Xunlei Pang
    Tested-by: Michael Holzheu
    Cc: Benjamin Herrenschmidt
    Cc: Dave Young
    Cc: Eric Biederman
    Cc: Hari Bathini
    Cc: Juergen Gross
    Cc: Mahesh Salgaonkar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

30 Jun, 2017

1 commit


11 Mar, 2017

1 commit

  • The purgatory code defines global variables which are referenced via a
    symbol lookup in the kexec code (core and arch).

    A recent commit addressing sparse warnings made these static and thereby
    broke kexec_file.

    Why did this happen? Simply because the whole machinery is undocumented and
    lacks any form of forward declarations. The variable names are unspecific
    and lack a prefix, so adding forward declarations creates shadow variables
    in the core code. Aside of that the code relies on magic constants and
    duplicate struct definitions with no way to ensure that these things stay
    in sync. The section placement of the purgatory variables happened by
    chance and not by design.

    Unbreak kexec and cleanup the mess:

    - Add proper forward declarations and document the usage
    - Use common struct definition
    - Use the proper common defines instead of magic constants
    - Add a purgatory_ prefix to have a proper name space
    - Use ARRAY_SIZE() instead of a homebrewn reimplementation
    - Add proper sections to the purgatory variables [ From Mike ]

    Fixes: 72042a8c7b01 ("x86/purgatory: Make functions and variables static")
    Reported-by: Mike Galbraith <
    Signed-off-by: Thomas Gleixner
    Cc: Nicholas Mc Guire
    Cc: Borislav Petkov
    Cc: Vivek Goyal
    Cc: "Tobin C. Harding"
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

21 Dec, 2016

1 commit

  • The TPM PCRs are only reset on a hard reboot. In order to validate a
    TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
    list of the running kernel must be saved and restored on boot.

    This patch uses the kexec buffer passing mechanism to pass the
    serialized IMA binary_runtime_measurements to the next kernel.

    Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Mimi Zohar
    Acked-by: "Eric W. Biederman"
    Acked-by: Dmitry Kasatkin
    Cc: Andreas Steffen
    Cc: Josh Sklar
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Cc: Michael Ellerman
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Stewart Smith
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mimi Zohar
     

30 Nov, 2016

3 commits


02 Sep, 2016

1 commit

  • If kexec_apply_relocations fails, kexec_load_purgatory frees pi->sechdrs
    and pi->purgatory_buf. This is redundant, because in case of error
    kimage_file_prepare_segments calls kimage_file_post_load_cleanup, which
    will also free those buffers.

    This causes two warnings like the following, one for pi->sechdrs and the
    other for pi->purgatory_buf:

    kexec-bzImage64: Loading purgatory failed
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 2119 at mm/vmalloc.c:1490 __vunmap+0xc1/0xd0
    Trying to vfree() nonexistent vm area (ffffc90000e91000)
    Modules linked in:
    CPU: 1 PID: 2119 Comm: kexec Not tainted 4.8.0-rc3+ #5
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
    dump_stack+0x4d/0x65
    __warn+0xcb/0xf0
    warn_slowpath_fmt+0x4f/0x60
    ? find_vmap_area+0x19/0x70
    ? kimage_file_post_load_cleanup+0x47/0xb0
    __vunmap+0xc1/0xd0
    vfree+0x2e/0x70
    kimage_file_post_load_cleanup+0x5e/0xb0
    SyS_kexec_file_load+0x448/0x680
    ? putname+0x54/0x60
    ? do_sys_open+0x190/0x1f0
    entry_SYSCALL_64_fastpath+0x13/0x8f
    ---[ end trace 158bb74f5950ca2b ]---

    Fix by setting pi->sechdrs an pi->purgatory_buf to NULL, since vfree
    won't try to free a NULL pointer.

    Link: http://lkml.kernel.org/r/1472083546-23683-1-git-send-email-bauerman@linux.vnet.ibm.com
    Signed-off-by: Thiago Jung Bauermann
    Acked-by: Baoquan He
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Cc: Dave Young
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thiago Jung Bauermann
     

24 May, 2016

1 commit

  • For the cases that some kernel (module) path stamps the crash reserved
    memory(already mapped by the kernel) where has been loaded the second
    kernel data, the kdump kernel will probably fail to boot when panic
    happens (or even not happens) leaving the culprit at large, this is
    unacceptable.

    The patch introduces a mechanism for detecting such cases:

    1) After each crash kexec loading, it simply marks the reserved memory
    regions readonly since we no longer access it after that. When someone
    stamps the region, the first kernel will panic and trigger the kdump.
    The weak arch_kexec_protect_crashkres() is introduced to do the actual
    protection.

    2) To allow multiple loading, once 1) was done we also need to remark
    the reserved memory to readwrite each time a system call related to
    kdump is made. The weak arch_kexec_unprotect_crashkres() is introduced
    to do the actual protection.

    The architecture can make its specific implementation by overriding
    arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres().

    Signed-off-by: Xunlei Pang
    Cc: Eric Biederman
    Cc: Dave Young
    Cc: Minfei Huang
    Cc: Vivek Goyal
    Cc: Baoquan He
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xunlei Pang
     

18 Mar, 2016

1 commit

  • Pull security layer updates from James Morris:
    "There are a bunch of fixes to the TPM, IMA, and Keys code, with minor
    fixes scattered across the subsystem.

    IMA now requires signed policy, and that policy is also now measured
    and appraised"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (67 commits)
    X.509: Make algo identifiers text instead of enum
    akcipher: Move the RSA DER encoding check to the crypto layer
    crypto: Add hash param to pkcs1pad
    sign-file: fix build with CMS support disabled
    MAINTAINERS: update tpmdd urls
    MODSIGN: linux/string.h should be #included to get memcpy()
    certs: Fix misaligned data in extra certificate list
    X.509: Handle midnight alternative notation in GeneralizedTime
    X.509: Support leap seconds
    Handle ISO 8601 leap seconds and encodings of midnight in mktime64()
    X.509: Fix leap year handling again
    PKCS#7: fix unitialized boolean 'want'
    firmware: change kernel read fail to dev_dbg()
    KEYS: Use the symbol value for list size, updated by scripts/insert-sys-cert
    KEYS: Reserve an extra certificate symbol for inserting without recompiling
    modsign: hide openssl output in silent builds
    tpm_tis: fix build warning with tpm_tis_resume
    ima: require signed IMA policy
    ima: measure and appraise the IMA policy itself
    ima: load policy using path
    ...

    Linus Torvalds
     

21 Feb, 2016

1 commit

  • Replace copy_file_from_fd() with kernel_read_file_from_fd().

    Two new identifiers named READING_KEXEC_IMAGE and READING_KEXEC_INITRAMFS
    are defined for measuring, appraising or auditing the kexec image and
    initramfs.

    Changelog v3:
    - return -EBADF, not -ENOEXEC
    - identifier change
    - split patch, moving copy_file_from_fd() to a separate patch
    - split patch, moving IMA changes to a separate patch
    v0:
    - use kstat file size type loff_t, not size_t
    - Calculate the file hash from the in memory buffer - Dave Young

    Signed-off-by: Mimi Zohar
    Acked-by: Kees Cook
    Acked-by: Luis R. Rodriguez
    Cc: Eric Biederman
    Acked-by: Dave Young

    Mimi Zohar
     

30 Jan, 2016

2 commits

  • Change the callers of walk_iomem_res() scanning for the
    following resources by name to use walk_iomem_res_desc()
    instead.

    "ACPI Tables"
    "ACPI Non-volatile Storage"
    "Persistent Memory (legacy)"
    "Crash kernel"

    Note, the caller of walk_iomem_res() with "GART" will be removed
    in a later patch.

    Signed-off-by: Toshi Kani
    Signed-off-by: Borislav Petkov
    Reviewed-by: Dave Young
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Chun-Yi
    Cc: Dan Williams
    Cc: Denys Vlasenko
    Cc: Don Zickus
    Cc: H. Peter Anvin
    Cc: Lee, Chun-Yi
    Cc: Linus Torvalds
    Cc: Luis R. Rodriguez
    Cc: Minfei Huang
    Cc: Peter Zijlstra (Intel)
    Cc: Ross Zwisler
    Cc: Stephen Rothwell
    Cc: Takao Indoh
    Cc: Thomas Gleixner
    Cc: Toshi Kani
    Cc: kexec@lists.infradead.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mm
    Cc: linux-nvdimm@lists.01.org
    Link: http://lkml.kernel.org/r/1453841853-11383-15-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Toshi Kani
     
  • Set proper ioresource flags and types for crash kernel
    reservation areas.

    Signed-off-by: Toshi Kani
    Signed-off-by: Borislav Petkov
    Reviewed-by: Dave Young
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: HATAYAMA Daisuke
    Cc: Linus Torvalds
    Cc: Luis R. Rodriguez
    Cc: Minfei Huang
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Toshi Kani
    Cc: Vivek Goyal
    Cc: kexec@lists.infradead.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mm
    Link: http://lkml.kernel.org/r/1453841853-11383-8-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Toshi Kani
     

21 Jan, 2016

1 commit


07 Nov, 2015

1 commit

  • kexec output message misses the prefix "kexec", when Dave Young split the
    kexec code. Now, we use file name as the output message prefix.

    Currently, the format of output message:
    [ 140.290795] SYSC_kexec_load: hello, world
    [ 140.291534] kexec: sanity_check_segment_list: hello, world

    Ideally, the format of output message:
    [ 30.791503] kexec: SYSC_kexec_load, Hello, world
    [ 79.182752] kexec_core: sanity_check_segment_list, Hello, world

    Remove the custom prefix "kexec" in output message.

    Signed-off-by: Minfei Huang
    Acked-by: Dave Young
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minfei Huang
     

11 Sep, 2015

1 commit

  • Split kexec_file syscall related code to another file kernel/kexec_file.c
    so that the #ifdef CONFIG_KEXEC_FILE in kexec.c can be dropped.

    Sharing variables and functions are moved to kernel/kexec_internal.h per
    suggestion from Vivek and Petr.

    [akpm@linux-foundation.org: fix bisectability]
    [akpm@linux-foundation.org: declare the various arch_kexec functions]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Dave Young
    Cc: Eric W. Biederman
    Cc: Vivek Goyal
    Cc: Petr Tesarik
    Cc: Theodore Ts'o
    Cc: Josh Boyer
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Young