23 Apr, 2015

1 commit

  • Introduce KEXEC_CONTROL_MEMORY_GFP to allow the architecture code
    to override the gfp flags of the allocation for the kexec control
    page. The loop in kimage_alloc_normal_control_pages allocates pages
    with GFP_KERNEL until a page is found that happens to have an
    address smaller than the KEXEC_CONTROL_MEMORY_LIMIT. On systems
    with a large memory size but a small KEXEC_CONTROL_MEMORY_LIMIT
    the loop will keep allocating memory until the oom killer steps in.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

18 Feb, 2015

3 commits

  • Simplify the code around one of the conditionals in the kexec_load syscall
    routine.

    The original code was confusing with a redundant check on KEXEC_ON_CRASH
    and comments outside of the conditional block. This change switches the
    order of the conditional check, and cleans up the comments for the
    conditional. There is no functional change to the code.

    Signed-off-by: Geoff Levand
    Acked-by: Vivek Goyal
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: H. Peter Anvin
    Cc: Maximilian Attems
    Cc: Michal Marek
    Cc: Paul Bolle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geoff Levand
     
  • Signed-off-by: Alexander Kuleshov
    Acked-by: "Eric W. Biederman"
    Acked-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Kuleshov
     
  • struct kimage has a member destination which is used to store the real
    destination address of each page when load segment from user space buffer
    to kernel. But we never retrieve the value stored in kimage->destination,
    so this member variable in kimage and its assignment operation are
    redundent code.

    I guess for_each_kimage_entry just does the work that kimage->destination
    is expected to do.

    So in this patch just make a cleanup to remove it.

    Signed-off-by: Baoquan He
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Baoquan He
     

11 Feb, 2015

1 commit

  • Pull trivial tree changes from Jiri Kosina:
    "Patches from trivial.git that keep the world turning around.

    Mostly documentation and comment fixes, and a two corner-case code
    fixes from Alan Cox"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    kexec, Kconfig: spell "architecture" properly
    mm: fix cleancache debugfs directory path
    blackfin: mach-common: ints-priority: remove unused function
    doubletalk: probe failure causes OOPS
    ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
    msdos_fs.h: fix 'fields' in comment
    scsi: aic7xxx: fix comment
    ARM: l2c: fix comment
    ibmraid: fix writeable attribute with no store method
    dynamic_debug: fix comment
    doc: usbmon: fix spelling s/unpriviledged/unprivileged/
    x86: init_mem_mapping(): use capital BIOS in comment

    Linus Torvalds
     

26 Jan, 2015

1 commit


14 Dec, 2014

1 commit


14 Oct, 2014

2 commits

  • This is a cleanup. In function parse_crashkernel_suffix, the parameter
    crash_base is not used. So here remove it.

    Signed-off-by: Baoquan He
    Acked-by: Vivek Goyal
    Cc: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Baoquan He
     
  • In locate_mem_hole functions, a memory hole is located and added as
    kexec_segment. But from the name of locate_mem_hole, it should only take
    responsibility of searching a available memory hole to contain data of a
    specified size.

    So in this patch add a new field 'mem' into kexec_buf, then take that
    kexec segment adding code out of locate_mem_hole_top_down and
    locate_mem_hole_bottom_up. This make clear of the functionality of
    locate_mem_hole just like it declars to do. And by this
    locate_mem_hole_callback chould be used later if anyone want to locate a
    memory hole for other use.

    Meanwhile Vivek suggested opening code function __kexec_add_segment(),
    that way we have to retreive ksegment pointer once and it is easy to read.
    So just do it in this patch and remove __kexec_add_segment() since no one
    use it anymore.

    Signed-off-by: Baoquan He
    Acked-by: Vivek Goyal
    Cc: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Baoquan He
     

30 Aug, 2014

1 commit

  • Currently new system call kexec_file_load() and all the associated code
    compiles if CONFIG_KEXEC=y. But new syscall also compiles purgatory
    code which currently uses gcc option -mcmodel=large. This option seems
    to be available only gcc 4.4 onwards.

    Hiding new functionality behind a new config option will not break
    existing users of old gcc. Those who wish to enable new functionality
    will require new gcc. Having said that, I am trying to figure out how
    can I move away from using -mcmodel=large but that can take a while.

    I think there are other advantages of introducing this new config
    option. As this option will be enabled only on x86_64, other arches
    don't have to compile generic kexec code which will never be used. This
    new code selects CRYPTO=y and CRYPTO_SHA256=y. And all other arches had
    to do this for CONFIG_KEXEC. Now with introduction of new config
    option, we can remove crypto dependency from other arches.

    Now CONFIG_KEXEC_FILE is available only on x86_64. So whereever I had
    CONFIG_X86_64 defined, I got rid of that.

    For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
    "depends on CRYPTO=y". This should be safer as "select" is not
    recursive.

    Signed-off-by: Vivek Goyal
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Tested-by: Shaun Ruffell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     

09 Aug, 2014

9 commits

  • This is the final piece of the puzzle of verifying kernel image signature
    during kexec_file_load() syscall.

    This patch calls into PE file routines to verify signature of bzImage. If
    signature are valid, kexec_file_load() succeeds otherwise it fails.

    Two new config options have been introduced. First one is
    CONFIG_KEXEC_VERIFY_SIG. This option enforces that kernel has to be
    validly signed otherwise kernel load will fail. If this option is not
    set, no signature verification will be done. Only exception will be when
    secureboot is enabled. In that case signature verification should be
    automatically enforced when secureboot is enabled. But that will happen
    when secureboot patches are merged.

    Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
    enables signature verification support on bzImage. If this option is not
    set and previous one is set, kernel image loading will fail because kernel
    does not have support to verify signature of bzImage.

    I tested these patches with both "pesign" and "sbsign" signed bzImages.

    I used signing_key.priv key and signing_key.x509 cert for signing as
    generated during kernel build process (if module signing is enabled).

    Used following method to sign bzImage.

    pesign
    ======
    - Convert DER format cert to PEM format cert
    openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
    PEM

    - Generate a .p12 file from existing cert and private key file
    openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
    signing_key.x509.PEM

    - Import .p12 file into pesign db
    pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign

    - Sign bzImage
    pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
    -c "Glacier signing key - Magrathea" -s

    sbsign
    ======
    sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
    /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+

    Patch details:

    Well all the hard work is done in previous patches. Now bzImage loader
    has just call into that code and verify whether bzImage signature are
    valid or not.

    Also create two config options. First one is CONFIG_KEXEC_VERIFY_SIG.
    This option enforces that kernel has to be validly signed otherwise kernel
    load will fail. If this option is not set, no signature verification will
    be done. Only exception will be when secureboot is enabled. In that case
    signature verification should be automatically enforced when secureboot is
    enabled. But that will happen when secureboot patches are merged.

    Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG. This option
    enables signature verification support on bzImage. If this option is not
    set and previous one is set, kernel image loading will fail because kernel
    does not have support to verify signature of bzImage.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Cc: Matt Fleming
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This patch adds support for loading a kexec on panic (kdump) kernel usning
    new system call.

    It prepares ELF headers for memory areas to be dumped and for saved cpu
    registers. Also prepares the memory map for second kernel and limits its
    boot to reserved areas only.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This is loader specific code which can load bzImage and set it up for
    64bit entry. This does not take care of 32bit entry or real mode entry.

    32bit mode entry can be implemented if somebody needs it.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • Load purgatory code in RAM and relocate it based on the location.
    Relocation code has been inspired by module relocation code and purgatory
    relocation code in kexec-tools.

    Also compute the checksums of loaded kexec segments and store them in
    purgatory.

    Arch independent code provides this functionality so that arch dependent
    bootloaders can make use of it.

    Helper functions are provided to get/set symbol values in purgatory which
    are used by bootloaders later to set things like stack and entry point of
    second kernel etc.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • Previous patch provided the interface definition and this patch prvides
    implementation of new syscall.

    Previously segment list was prepared in user space. Now user space just
    passes kernel fd, initrd fd and command line and kernel will create a
    segment list internally.

    This patch contains generic part of the code. Actual segment preparation
    and loading is done by arch and image specific loader. Which comes in
    next patch.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This is the new syscall kexec_file_load() declaration/interface. I have
    reserved the syscall number only for x86_64 so far. Other architectures
    (including i386) can reserve syscall number when they enable the support
    for this new syscall.

    Signed-off-by: Vivek Goyal
    Cc: Michael Kerrisk
    Cc: Borislav Petkov
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • kimage_normal_alloc() and kimage_crash_alloc() are doing lot of similar
    things and differ only little. So instead of having two separate
    functions create a common function kimage_alloc_init() and pass it the
    "flags" argument which tells whether it is normal kexec or kexec_on_panic.
    And this function should be able to deal with both the cases.

    This consolidation also helps later where we can use a common function
    kimage_file_alloc_init() to handle normal and crash cases for new file
    based kexec syscall.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • Previously do_kimage_alloc() will allocate a kimage structure, copy
    segment list from user space and then do the segment list sanity
    verification.

    Break down this function in 3 parts. do_kimage_alloc_init() to do actual
    allocation and basic initialization of kimage structure.
    copy_user_segment_list() to copy segment list from user space and
    sanity_check_segment_list() to verify the sanity of segment list as passed
    by user space.

    In later patches, I need to only allocate kimage and not copy segment list
    from user space. So breaking down in smaller functions enables re-use of
    code at other places.

    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • Let's use the more common "unusable".

    This patch was originally written and posted by Boris. I am including it
    in this patch series.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Vivek Goyal
    Cc: Borislav Petkov
    Cc: Michael Kerrisk
    Cc: Yinghai Lu
    Cc: Eric Biederman
    Cc: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Greg Kroah-Hartman
    Cc: Dave Young
    Cc: WANG Chao
    Cc: Baoquan He
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     

31 Jul, 2014

2 commits

  • free_huge_page() is undefined without CONFIG_HUGETLBFS and there's no
    need to filter PageHuge() page is such a configuration either, so avoid
    exporting the symbol to fix a build error:

    In file included from kernel/kexec.c:14:0:
    kernel/kexec.c: In function 'crash_save_vmcoreinfo_init':
    kernel/kexec.c:1623:20: error: 'free_huge_page' undeclared (first use in this function)
    VMCOREINFO_SYMBOL(free_huge_page);
    ^

    Introduced by commit 8f1d26d0e59b ("kexec: export free_huge_page to
    VMCOREINFO")

    Reported-by: kbuild test robot
    Acked-by: Olof Johansson
    Cc: Atsushi Kumagai
    Cc: Baoquan He
    Cc: Vivek Goyal
    Cc: Andrew Morton
    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • PG_head_mask was added into VMCOREINFO to filter huge pages in b3acc56bfe1
    ("kexec: save PG_head_mask in VMCOREINFO"), but makedumpfile still need
    another symbol to filter *hugetlbfs* pages.

    If a user hope to filter user pages, makedumpfile tries to exclude them by
    checking the condition whether the page is anonymous, but hugetlbfs pages
    aren't anonymous while they also be user pages.

    We know it's possible to detect them in the same way as PageHuge(),
    so we need the start address of free_huge_page():

    int PageHuge(struct page *page)
    {
    if (!PageCompound(page))
    return 0;

    page = compound_head(page);
    return get_compound_page_dtor(page) == free_huge_page;
    }

    For that reason, this patch changes free_huge_page() into public
    to export it to VMCOREINFO.

    Signed-off-by: Atsushi Kumagai
    Acked-by: Baoquan He
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Kumagai
     

24 Jun, 2014

1 commit

  • To allow filtering of huge pages, makedumpfile must be able to identify
    them in the dump. This can be done by checking the appropriate page
    flag, so communicate its value to makedumpfile through the VMCOREINFO
    interface.

    There's only one small catch. Depending on how many page flags are
    available on a given architecture, this bit can be called PG_head or
    PG_compound.

    I sent a similar patch back in 2012, but Eric Biederman did not like
    using an #ifdef. So, this time I'm adding a common symbol
    (PG_head_mask) instead.

    See https://lkml.org/lkml/2012/11/28/91 for the previous version.

    Signed-off-by: Petr Tesarik
    Acked-by: Vivek Goyal
    Cc: Eric Biederman
    Cc: Paul Mackerras
    Cc: Fengguang Wu
    Cc: Benjamin Herrenschmidt
    Cc: Shaohua Li
    Cc: Alexey Kardashevskiy
    Cc: Sasha Levin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Tesarik
     

07 Jun, 2014

1 commit


28 May, 2014

1 commit

  • If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
    (ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
    get the following messages during boot:

    [ 0.089866] POWER8 performance monitor hardware support registered
    [ 0.089985] power8-pmu: PMAO restore workaround active.
    [ 5.095419] Processor 1 is stuck.
    [ 10.097933] Processor 2 is stuck.
    [ 15.100480] Processor 3 is stuck.
    [ 20.102982] Processor 4 is stuck.
    [ 25.105489] Processor 5 is stuck.
    [ 30.108005] Processor 6 is stuck.
    [ 35.110518] Processor 7 is stuck.
    [ 40.113369] Processor 9 is stuck.
    [ 45.115879] Processor 10 is stuck.
    [ 50.118389] Processor 11 is stuck.
    [ 55.120904] Processor 12 is stuck.
    [ 60.123425] Processor 13 is stuck.
    [ 65.125970] Processor 14 is stuck.
    [ 70.128495] Processor 15 is stuck.
    [ 75.131316] Processor 17 is stuck.

    Note that only the sibling threads are stuck, while the primary threads (0, 8,
    16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
    that kexec tries to wakeup (bring online) the sibling threads of all the cores,
    before performing kexec:

    [ 9464.131231] Starting new kernel
    [ 9464.148507] kexec: Waking offline cpu 1.
    [ 9464.148552] kexec: Waking offline cpu 2.
    [ 9464.148600] kexec: Waking offline cpu 3.
    [ 9464.148636] kexec: Waking offline cpu 4.
    [ 9464.148671] kexec: Waking offline cpu 5.
    [ 9464.148708] kexec: Waking offline cpu 6.
    [ 9464.148743] kexec: Waking offline cpu 7.
    [ 9464.148779] kexec: Waking offline cpu 9.
    [ 9464.148815] kexec: Waking offline cpu 10.
    [ 9464.148851] kexec: Waking offline cpu 11.
    [ 9464.148887] kexec: Waking offline cpu 12.
    [ 9464.148922] kexec: Waking offline cpu 13.
    [ 9464.148958] kexec: Waking offline cpu 14.
    [ 9464.148994] kexec: Waking offline cpu 15.
    [ 9464.149030] kexec: Waking offline cpu 17.

    Instrumenting this piece of code revealed that the cpu_up() operation actually
    fails with -EBUSY. Thus, only the primary threads of all the cores are online
    during kexec, and hence this is a sure-shot receipe for disaster, as explained
    in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
    as well as in the comment above wake_offline_cpus().

    It turns out that cpu_up() was returning -EBUSY because the variable
    'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
    by migrate_to_reboot_cpu() inside kernel_kexec().

    Now, migrate_to_reboot_cpu() was originally written with the assumption that
    any further code will not need to perform CPU hotplug, since we are anyway in
    the reboot path. However, kexec is clearly not such a case, since we depend on
    onlining CPUs, atleast on powerpc.

    So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
    kexec path, to fix this regression in kexec on powerpc.

    Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
    can catch such issues more easily in the future.

    Fixes: c97102ba963 (kexec: migrate to reboot cpu)
    Cc: stable@vger.kernel.org
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Benjamin Herrenschmidt

    Srivatsa S. Bhat
     

08 Apr, 2014

1 commit


04 Apr, 2014

1 commit

  • Code that is obj-y (always built-in) or dependent on a bool Kconfig
    (built-in or absent) can never be modular. So using module_init as an
    alias for __initcall can be somewhat misleading.

    Fix these up now, so that we can relocate module_init from init.h into
    module.h in the future. If we don't do this, we'd have to add module.h
    to obviously non-modular code, and that would be a worse thing.

    The audit targets the following module_init users for change:
    kernel/user.c obj-y
    kernel/kexec.c bool KEXEC (one instance per arch)
    kernel/profile.c bool PROFILING
    kernel/hung_task.c bool DETECT_HUNG_TASK
    kernel/sched/stats.c bool SCHEDSTATS
    kernel/user_namespace.c bool USER_NS

    Note that direct use of __initcall is discouraged, vs. one of the
    priority categorized subgroups. As __initcall gets mapped onto
    device_initcall, our use of subsys_initcall (which makes sense for these
    files) will thus change this registration from level 6-device to level
    4-subsys (i.e. slightly earlier). However no observable impact of that
    difference has been observed during testing.

    Also, two instances of missing ";" at EOL are fixed in kexec.

    Signed-off-by: Paul Gortmaker
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Gortmaker
     

06 Mar, 2014

1 commit


28 Jan, 2014

1 commit


24 Jan, 2014

1 commit

  • For general-purpose (i.e. distro) kernel builds it makes sense to build
    with CONFIG_KEXEC to allow end users to choose what kind of things they
    want to do with kexec. However, in the face of trying to lock down a
    system with such a kernel, there needs to be a way to disable kexec_load
    (much like module loading can be disabled). Without this, it is too easy
    for the root user to modify kernel memory even when CONFIG_STRICT_DEVMEM
    and modules_disabled are set. With this change, it is still possible to
    load an image for use later, then disable kexec_load so the image (or lack
    of image) can't be altered.

    The intention is for using this in environments where "perfect"
    enforcement is hard. Without a verified boot, along with verified
    modules, and along with verified kexec, this is trying to give a system a
    better chance to defend itself (or at least grow the window of
    discoverability) against attack in the face of a privilege escalation.

    In my mind, I consider several boot scenarios:

    1) Verified boot of read-only verified root fs loading fd-based
    verification of kexec images.
    2) Secure boot of writable root fs loading signed kexec images.
    3) Regular boot loading kexec (e.g. kcrash) image early and locking it.
    4) Regular boot with no control of kexec image at all.

    1 and 2 don't exist yet, but will soon once the verified kexec series has
    landed. 4 is the state of things now. The gap between 2 and 4 is too
    large, so this change creates scenario 3, a middle-ground above 4 when 2
    and 1 are not possible for a system.

    Signed-off-by: Kees Cook
    Acked-by: Rik van Riel
    Cc: Vivek Goyal
    Cc: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

19 Dec, 2013

1 commit

  • Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
    kernel") moved reboot= handling to generic code. In the process it also
    removed the code in native_machine_shutdown() which are moving reboot
    process to reboot_cpu/cpu0.

    I guess that thought must have been that all reboot paths are calling
    migrate_to_reboot_cpu(), so we don't need this special handling. But
    kexec reboot path (kernel_kexec()) is not calling
    migrate_to_reboot_cpu() so above change broke kexec. Now reboot can
    happen on non-boot cpu and when INIT is sent in second kerneo to bring
    up BP, it brings down the machine.

    So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
    this problem.

    Bisected by WANG Chao.

    Reported-by: Matthew Whitehead
    Reported-by: Dave Young
    Signed-off-by: Vivek Goyal
    Tested-by: Baoquan He
    Tested-by: WANG Chao
    Acked-by: H. Peter Anvin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     

08 Dec, 2013

1 commit

  • Add a flag to tell the PCI subsystem that kernel is shutting down in
    preparation to kexec a kernel. Add code in PCI subsystem to use this flag
    to clear Bus Master bit on PCI devices only in case of kexec reboot.

    This fixes a power-off problem on Acer Aspire V5-573G and likely other
    machines and avoids any other issues caused by clearing Bus Master bit on
    PCI devices in normal shutdown path. The problem was introduced by
    b566a22c2332 ("PCI: disable Bus Master on PCI device shutdown").

    This patch is based on discussion at
    http://marc.info/?l=linux-pci&m=138425645204355&w=2

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=63861
    Reported-by: Chang Liu
    Signed-off-by: Khalid Aziz
    Signed-off-by: Bjorn Helgaas
    Acked-by: Konstantin Khlebnikov
    Cc: stable@vger.kernel.org # v3.5+

    Khalid Aziz
     

16 Nov, 2013

1 commit

  • Pull trivial tree updates from Jiri Kosina:
    "Usual earth-shaking, news-breaking, rocket science pile from
    trivial.git"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
    doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
    doc: add missing files to timers/00-INDEX
    timekeeping: Fix some trivial typos in comments
    mm: Fix some trivial typos in comments
    irq: Fix some trivial typos in comments
    NUMA: fix typos in Kconfig help text
    mm: update 00-INDEX
    doc: Documentation/DMA-attributes.txt fix typo
    DRM: comment: `halve' -> `half'
    Docs: Kconfig: `devlopers' -> `developers'
    doc: typo on word accounting in kprobes.c in mutliple architectures
    treewide: fix "usefull" typo
    treewide: fix "distingush" typo
    mm/Kconfig: Grammar s/an/a/
    kexec: Typo s/the/then/
    Documentation/kvm: Update cpuid documentation for steal time and pv eoi
    treewide: Fix common typo in "identify"
    __page_to_pfn: Fix typo in comment
    Correct some typos for word frequency
    clk: fixed-factor: Fix a trivial typo
    ...

    Linus Torvalds
     

14 Oct, 2013

1 commit


12 Sep, 2013

1 commit

  • Code can not run here forever, so remove the unnecessary return.

    Signed-off-by: Xishi Qiu
    Suggested-by: Zhang Yanfei
    Reviewed-by: Simon Horman
    Reviewed-by: Zhang Yanfei
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     

01 May, 2013

2 commits

  • Simplify the logic of variable assignments.

    [akpm@linux-foundation.org: replace min_t with min, remove unneeded casts]
    Signed-off-by: Zhang Yanfei
    Cc: "Eric W. Biederman"
    Reviewed-by: Simon Horman
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Yanfei
     
  • The types of the following local variables:

    - ubytes/mbytes in kimage_load_crash_segment()/kimage_load_normal_segment()

    - r in vmcoreinfo_append_str()

    are wrong, so fix them.

    Signed-off-by: Zhang Yanfei
    Cc: "Eric W. Biederman"
    Cc: Simon Horman
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Yanfei
     

30 Apr, 2013

3 commits

  • Now, vmap_area_list is exported as VMCOREINFO for makedumpfile to get
    the start address of vmalloc region (vmalloc_start). The address which
    contains vmalloc_start value is represented as below:

    vmap_area_list.next - OFFSET(vmap_area.list) + OFFSET(vmap_area.va_start)

    However, both OFFSET(vmap_area.va_start) and OFFSET(vmap_area.list)
    aren't exported as VMCOREINFO.

    So this patch exports them externally with small cleanup.

    [akpm@linux-foundation.org: vmalloc.h should include list.h for list_head]
    Signed-off-by: Atsushi Kumagai
    Cc: Joonsoo Kim
    Cc: Joonsoo Kim
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Atsushi Kumagai
    Cc: Chris Metcalf
    Cc: Dave Anderson
    Cc: Eric Biederman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Kumagai
     
  • Although our intention is to unexport internal structure entirely, but
    there is one exception for kexec. kexec dumps address of vmlist and
    makedumpfile uses this information.

    We are about to remove vmlist, then another way to retrieve information
    of vmalloc layer is needed for makedumpfile. For this purpose, we
    export vmap_area_list, instead of vmlist.

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Joonsoo Kim
    Cc: Eric Biederman
    Cc: Dave Anderson
    Cc: Vivek Goyal
    Cc: Atsushi Kumagai
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Chris Metcalf
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Use common help functions to free reserved pages.

    Signed-off-by: Jiang Liu
    Cc: Eric Biederman
    Reviewed-by: Zhang Yanfei
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     

18 Apr, 2013

1 commit

  • We can extend kexec-tools to support multiple "Crash kernel" in /proc/iomem
    instead.

    So we can use "Crash kernel" instead of "Crash kernel low" in /proc/iomem.

    Suggested-by: Vivek Goyal
    Signed-off-by: Yinghai Lu
    Link: http://lkml.kernel.org/r/1366089828-19692-3-git-send-email-yinghai@kernel.org
    Acked-by: Vivek Goyal
    Signed-off-by: H. Peter Anvin

    Yinghai Lu