26 Jun, 2005

40 commits

  • Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Domen Puncer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Domen Puncer
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Domen Puncer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Domen Puncer
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Domen Puncer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Domen Puncer
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Domen Puncer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Domen Puncer
     
  • Put function prototypes for memset() and memcpy() ahead of where
    they are used, to kill sparse warnings:

    arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: undefined identifier 'memset'
    arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:11: warning: undefined identifier 'memcpy'
    arch/x86_64/boot/compressed/misc.c:151:2: warning: undefined identifier 'memcpy'
    arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: call with no type!
    arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:17: warning: call with no type!
    arch/x86_64/boot/compressed/misc.c:151:9: warning: call with no type!
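
    As a rough illustration of the idea (not the actual patch), the
    declarations only need to be visible before the first call. The helper
    name fill_and_copy() is made up for this sketch:

```c
#include <stddef.h>  /* size_t only; <string.h> is deliberately not included */

/* Prototypes placed ahead of the first use, so a checker such as sparse
 * sees typed declarations instead of undefined identifiers. */
void *memset(void *s, int c, size_t n);
void *memcpy(void *dest, const void *src, size_t n);

/* Hypothetical helper standing in for the decompressor code. */
void fill_and_copy(unsigned char *dst, unsigned char *scratch, size_t n)
{
    memset(scratch, 0xAB, n);  /* the call now has a known type */
    memcpy(dst, scratch, n);
}
```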

    Signed-off-by: randy_dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    randy_dunlap
     
  • o The following patch makes purely cosmetic changes and fixes CodingStyle
    guideline issues such as the following in kexec-related files:

    o braces for one line "if" statements, "for" loops,
    o more than 80 column wide lines,
    o No space after "while", "for" and "switch" key words

    o Changes:
    o take-2: Removed the extra tab before "case" key words.
    o take-3: Put operator at the end of line and space before "*/"

    Signed-off-by: Maneesh Soni
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maneesh Soni
     
  • If we are faulting in kernel it is quite possible this will lead to a
    panic. Save trap number, cr2 (in case of page fault) and error_code in the
    current thread (these fields already exist for signal delivery but are not
    used here).

    This helps later kdump crash analysis from user space (a script has been
    submitted to dig this info out in gdb).

    Signed-off-by: Alexander Nyberg
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
     
  • Makes kexec_crashdump() take a pt_regs * as an argument. This allows the
    exact register state at the point of the crash to be captured. If we come
    from a direct panic assertion, NULL will be passed and the current
    registers saved before the crashdump.

    This hooks into two places:
    die(): check the conditions under which we will panic when calling
    do_exit and go there directly with the pt_regs that caused the fatal
    fault.

    die_nmi(): If we receive an NMI lockup while in the kernel use the
    pt_regs and go directly to crash_kexec(). We're probably nested up badly
    at this point so this might be the only chance to escape with proper
    information.
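
    The NULL convention described above can be modelled in user space as
    follows. This is only a sketch: struct regs_snapshot,
    capture_current_regs() and crash_setup_regs_model() are hypothetical
    names, and the "captured" values are placeholders.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct regs_snapshot {
    unsigned long ip;
    unsigned long sp;
};

/* Placeholder for "save the current registers"; real code would read
 * the live CPU state. The constants are arbitrary. */
void capture_current_regs(struct regs_snapshot *out)
{
    out->ip = 0x1111;
    out->sp = 0x2222;
}

/* If the caller has the registers of the faulting context, keep them;
 * if it passes NULL (direct panic), fall back to the current state. */
void crash_setup_regs_model(struct regs_snapshot *saved,
                            const struct regs_snapshot *regs)
{
    if (regs != NULL)
        *saved = *regs;
    else
        capture_current_regs(saved);
}
```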

    Signed-off-by: Alexander Nyberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
     
  • This patch adds support for retrieving the address of the ELF core header
    if one is passed on the command line.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This patch provides the interfaces necessary to read the dump contents,
    treating it as a high memory device.

    Signed-off-by: Hariprasad Nellitheertha
    Signed-off-by: Eric Biederman
    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • - config option CONFIG_CRASH_DUMP

    - Made it dependent on HIGHMEM. This is required as the capture kernel
    treats the previous kernel's memory as high memory and stitches a PTE for
    accessing it.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This patch retrieves the max_pfn being used by previous kernel and stores it
    in a safe location (saved_max_pfn) before it is overwritten due to user
    defined memory map. This pfn is used to make sure that user does not try to
    read the physical memory beyond saved_max_pfn.
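
    A minimal user-space sketch of the bounds check, assuming a 4 KiB page
    size and treating pfn == saved_max_pfn as still readable (the exact
    boundary is an assumption). read_oldmem_page() is a hypothetical name:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size */

/* Highest page frame number the previous kernel was using. */
unsigned long saved_max_pfn;

/* Copy up to one page of the old kernel's memory (modelled here as a
 * plain byte array) into buf. Returns the number of bytes copied, or
 * -1 if the page lies beyond what the previous kernel used. */
long read_oldmem_page(const unsigned char *oldmem, unsigned long pfn,
                      unsigned char *buf, size_t count)
{
    if (pfn > saved_max_pfn)
        return -1;
    if (count > SKETCH_PAGE_SIZE)
        count = SKETCH_PAGE_SIZE;
    memcpy(buf, oldmem + pfn * SKETCH_PAGE_SIZE, count);
    return (long)count;
}
```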

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • o Problem: Kexec on panic hangs if first kernel is booted with nmi_watchdog
    command line parameter. This problem occurs because kexec crash shutdown
    code replaces the NMI callback handler. This handler saves the cpu register
    states and halts the cpu. If system is booted with nmi_watchdog parameter,
    then crashing cpu also runs this nmi handler and halts itself.

    o This patch fixes the problem by keeping track of the crashing cpu and
    not executing the new nmi handler on the crashing cpu.

    o There is a dependence on smp_processor_id() function which might return
    insane value for cpu, if cpu field of thread_info is corrupted.
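
    The fix described above can be modelled like this. It is a sketch only:
    the CPU count, note_crashing_cpu() and crash_nmi_callback_model() are
    invented for illustration, and the cpu argument stands in for what
    smp_processor_id() would return.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 4      /* assumed CPU count for the model */

int  crashing_cpu = -1;       /* recorded by the crash path */
bool cpu_halted[SKETCH_NR_CPUS];

/* Called by the crashing CPU before it sends NMIs to the others. */
void note_crashing_cpu(int cpu)
{
    crashing_cpu = cpu;
}

/* Model of the replacement NMI callback. Returns true if this CPU
 * was stopped, false if it was left alone. */
bool crash_nmi_callback_model(int cpu)
{
    /* The crashing CPU must keep running so it can start the new
     * kernel, even if an nmi_watchdog NMI arrives on it. */
    if (cpu == crashing_cpu)
        return false;

    /* Real code would save the register state here, then halt. */
    cpu_halted[cpu] = true;
    return true;
}
```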

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • CPU does not save ss and esp on stack if execution was already in kernel mode
    at the time of the NMI. This leads to erratic values being saved for ss
    and esp. This patch fixes the issue.
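
    One way to picture the fix, as a sketch under assumptions: the struct
    layout and function names below are hypothetical, and the caller is
    assumed to supply the kernel stack pointer and stack segment to use
    when the CPU did not push them.

```c
#include <stdint.h>

/* Simplified stand-in for the saved trap frame. */
struct regs_model {
    uint32_t cs;   /* low two bits: privilege level when the NMI hit */
    uint32_t esp;  /* only pushed by the CPU on a privilege change */
    uint32_t ss;   /* likewise */
};

/* Nonzero if the interrupted code was not running at ring 0. */
int from_user_mode(const struct regs_model *r)
{
    return (r->cs & 3) != 0;
}

/* When the NMI arrived in kernel mode, the esp/ss slots hold whatever
 * happened to be on the stack, so replace them with known values. */
void fixup_stack_regs(struct regs_model *r,
                      uint32_t kernel_esp, uint32_t kernel_ss)
{
    if (from_user_mode(r))
        return;  /* the CPU saved valid esp/ss itself */
    r->esp = kernel_esp;
    r->ss = kernel_ss;
}
```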

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • o Following patch exports kexec global variable "crash_notes" to user space
    through sysfs as kernel attribute in /sys/kernel.

    Signed-off-by: Maneesh Soni
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • Add kexec support for s390 architecture.

    From: Milton Miller

    - Fix passing of first argument to relocate_kernel assembly.
    - Fix Kconfig description.
    - Remove wrong comment and comments that describe obvious things.
    - Allow only KEXEC_TYPE_DEFAULT as image type -> dump not supported.

    Acked-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • This patch implements the kexec support for ppc64 platforms.

    A couple of notes:

    1) We copy the pages in virtual mode, using the full base kernel
    and a statically allocated stack. At kexec_prepare time we
    scan the pages and if any overlap our (0, _end[]) range we
    return -ETXTBSY.

    On PowerPC 64 systems running in LPAR (logical partitioning)
    mode, only a small region of memory, referred to as the RMO,
    can be accessed in real mode. Since Linux runs with only one
    zone of memory in the memory allocator, and it can be orders of
    magnitude more memory than the RMO, looping until we allocate
    pages in the source region is not feasible. Copying in virtual
    means we don't have to write a hash table generation and call
    hypervisor to insert translations, instead we rely on the pinned
    kernel linear mapping. The kernel already has move to linked
    location built in, so there is no requirement to load it at 0.

    If we want to load something other than a kernel, then a stub
    can be written to copy a linear chunk in real mode.

    2) The start entry point gets passed parameters from the kernel.
    Slaves are started at a fixed address after copying code from
    the entry point.

    All CPUs get passed their firmware assigned physical id in r3
    (most calling conventions use this register for the first
    argument).

    This is used to distinguish each CPU from all other CPUs.
    Since firmware is not around, there is no other way to obtain
    this information other than to pass it somewhere.

    A single CPU, referred to here as the master and the one executing
    the kexec call, branches to start with the address of start in r4.
    While this can be calculated, we have to load it through a gpr to
    branch to this point so defining the register this is contained
    in is free. A stack of unspecified size is available at r1
    (also common calling convention).

    All remaining running CPUs are sent to start at absolute address
    0x60 after copying the first 0x100 bytes from start to address 0.
    This convention was chosen because it matches what the kernel
    has been doing itself. (only gpr3 is defined).

    Note: This is not quite the convention of the kexec bootblock v2
    in the kernel. A stub has been written to convert between them,
    and we may adjust the kernel in the future to allow this directly
    without any stub.

    3) Destination pages can be placed anywhere, even where they
    would not be accessible in real mode. This will allow us to
    place ram disks above the RMO if we choose.
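
    The overlap test from note 1 might look roughly like the following.
    This is a sketch, not the ppc64 code: struct seg_model and
    check_segments() are hypothetical, and kernel_end plays the role of
    _end.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of one destination range. */
struct seg_model {
    unsigned long start;
    unsigned long len;
};

/* Reject any destination range that overlaps the running kernel,
 * which occupies [0, kernel_end). Because that range starts at 0,
 * a non-empty segment overlaps it exactly when it starts below
 * kernel_end. */
int check_segments(const struct seg_model *segs, size_t n,
                   unsigned long kernel_end)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (segs[i].len != 0 && segs[i].start < kernel_end)
            return -ETXTBSY;
    }
    return 0;
}
```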

    Signed-off-by: Milton Miller
    Signed-off-by: R Sharada
    Signed-off-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    R Sharada
     
  • Add code to clear the hash table and invalidate the tlb for native (SMP,
    non-LPAR) mode. Supports 16M and 4k pages.

    Signed-off-by: Milton Miller
    Signed-off-by: R Sharada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    R Sharada
     
  • I have tweaked this patch slightly to handle an empty list
    of pages to relocate passed to relocate_new_kernel, and
    I have added ppc_md.machine_crash_shutdown to keep up with
    the changes in the generic kexec infrastructure.

    From: Albert Herranz

    The following patch adds support for kexec on the ppc32 platform.

    Non-OpenFirmware based platforms are likely to work directly without
    additional changes on the kernel side. The kexec-tools userland package
    may need to be slightly updated, though.

    For OpenFirmware based machines, additional work is still needed on the
    kernel side before kexec support is ready. Benjamin Herrenschmidt is
    kindly working on that part.

    In order for a ppc platform to use the kexec kernel services it must
    implement some ppc_md hooks. Otherwise, kexec will be explicitly disabled,
    as suggested by benh.

    There are 3+1 new ppc_md hooks that a platform supporting kexec may
    implement. Two of them are mandatory for kexec to work. See
    include/asm-ppc/machdep.h for details.

    - machine_kexec_prepare(image)

    This function is called to make any arrangements to the image before it
    is loaded.

    This hook _MUST_ be provided by a platform in order to activate kexec
    support for that platform. Otherwise, the platform is considered to not
    support kexec and the kexec_load system call will fail (that makes all
    existing platforms by default non-kexec'able).

    - machine_kexec_cleanup(image)

    This function is called to make any cleanups on image after the loaded
    image data is freed. This hook is optional. A platform may or may
    not provide this hook.

    - machine_kexec(image)

    This function is called to perform the _actual_ kexec. This hook
    _MUST_ be provided by a platform in order to activate kexec support for
    that platform.

    If a platform provides machine_kexec_prepare but forgets to provide
    machine_kexec, a kexec will fall back to a reboot.

    A ready-to-use machine_kexec_simple() generic function is provided to,
    hopefully, simplify kexec adoption for embedded platforms. A platform
    may call this function from its specific machine_kexec hook, like this:

    void myplatform_kexec(struct kimage *image)
    {
            machine_kexec_simple(image);
    }

    - machine_shutdown()

    This function is called to perform any machine specific shutdowns, not
    already done by drivers. This hook is optional. A platform may or may
    not provide this hook.

    An example (trimmed) platform specific module for a platform supporting
    kexec through the existing machine_kexec_simple follows:

    /* ... */

    #ifdef CONFIG_KEXEC
    int myplatform_kexec_prepare(struct kimage *image)
    {
            /* here, we can place additional preparations */
            return 0; /* yes, we support kexec */
    }

    void myplatform_kexec(struct kimage *image)
    {
            machine_kexec_simple(image);
    }
    #endif /* CONFIG_KEXEC */

    /* ... */

    void __init
    platform_init(unsigned long r3, unsigned long r4,
                  unsigned long r5,
                  unsigned long r6, unsigned long r7)
    {

            /* ... */

    #ifdef CONFIG_KEXEC
            ppc_md.machine_kexec_prepare =
                    myplatform_kexec_prepare;
            ppc_md.machine_kexec =
                    myplatform_kexec;
    #endif /* CONFIG_KEXEC */

            /* ... */

    }
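
    For completeness, here is a guess at how generic code could consult
    these hooks. It is a stand-alone model, not the kernel's code: struct
    md_hooks_model, the *_model functions, the sample hooks and the use of
    -ENOSYS are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

struct kimage;  /* opaque in this sketch */

/* Reduced stand-in for the kexec-related part of ppc_md. */
struct md_hooks_model {
    int  (*machine_kexec_prepare)(struct kimage *image);
    void (*machine_kexec)(struct kimage *image);
};

/* Load path: without a prepare hook the platform does not support
 * kexec, so loading fails (error code assumed). */
int kexec_prepare_model(const struct md_hooks_model *md,
                        struct kimage *image)
{
    if (md->machine_kexec_prepare == NULL)
        return -ENOSYS;
    return md->machine_kexec_prepare(image);
}

/* Execute path: returns 1 if the platform hook ran, 0 if the caller
 * should fall back to a normal reboot. */
int kexec_run_model(const struct md_hooks_model *md, struct kimage *image)
{
    if (md->machine_kexec == NULL)
        return 0;
    md->machine_kexec(image);
    return 1;
}

/* Sample hooks so the model can be exercised. */
int sample_prepare(struct kimage *image) { (void)image; return 0; }
void sample_kexec(struct kimage *image) { (void)image; }
```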

    The kexec ppc kernel support has been heavily tested on the GameCube Linux
    port, and, as reported in the fastboot mailing list, it has been tested too
    on a Moto 82xx ppc by Rick Richardson.

    Signed-off-by: Albert Herranz
    Signed-off-by: Eric Biederman
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is the x86_64 implementation of the crashkernel option. It reserves
    a window of memory very early in the bootup process, so we never use
    it for anything but the kernel to switch to when the running
    kernel panics.

    In addition to reserving this memory a resource structure is registered
    so looking at /proc/iomem it is clear what happened to that memory.
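
    To make the reservation concrete, here is a sketch of parsing such an
    option, assuming it takes the form size@offset with optional K/M/G
    suffixes (the message above does not spell out the syntax).
    parse_size() and parse_crashkernel() are hypothetical user-space
    helpers, not the kernel's parser.

```c
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix and leave *end
 * pointing just past what was consumed. */
static unsigned long long parse_size(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return v;
}

/* Split "size@offset" into its two values. Returns 0 on success,
 * -1 if the string does not have that shape. */
int parse_crashkernel(const char *arg, unsigned long long *size,
                      unsigned long long *base)
{
    char *end;
    const char *p;

    *size = parse_size(arg, &end);
    if (end == arg || *end != '@')
        return -1;
    p = end + 1;
    *base = parse_size(p, &end);
    if (end == p)
        return -1;
    return 0;
}
```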

    ISSUES:
    Is it possible to implement this in a architecture generic way?
    What should be done with architectures that always use an iommu and
    thus don't report their RAM memory resources in /proc/iomem?

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is the x86_64 implementation of machine kexec. 32bit compatibility
    support has been implemented, and machine_kexec has been enhanced to not care
    about the changing internal kernel page table structures.

    From: Alexander Nyberg

    build fix

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Factor out the apic and smp shutdown code from machine_restart so it can be
    called in the kexec reboot path as well.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is the x86 implementation of the crashkernel option. It reserves a
    window of memory very early in the bootup process, so we never use it for
    anything but the kernel to switch to when the running kernel panics.

    In addition to reserving this memory a resource structure is registered so
    looking at /proc/iomem it is clear what happened to that memory.

    ISSUES:
    Is it possible to implement this in a architecture generic way?
    What should be done with architectures that always use an iommu and
    thus don't report their RAM memory resources in /proc/iomem?

    Signed-off-by: Eric Biederman
    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • In the case of a crash/panic an architecture specific function
    machine_crash_shutdown is called. This patch adds to the x86 machine_crash
    function the standard kernel code for shutting down apics.

    Every line of code added to that function increases the risk that we will call
    code after a kernel panic that is not safe.

    This patch should not make it to the stable kernel without being reviewed a
    lot more. It is unclear how much a hardened kernel can take when it comes to
    misconfigured apics. So since a normal kernel has problems this patch does a
    clean shutdown.

    It is my expectation this patch will be dropped from future generations of the
    kexec work. But for the moment it is a crutch to keep from breaking
    everything.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • After the kernel panics if we wish to generate an entire machine core file it
    is very nice to know the register state at the time the machine crashed.

    After long discussion it was realized that if you are going to be saving the
    information anyway it is reasonable to store the information in a format that
    it will be used and recognized in so the register state is stored in the
    standard ELF note format.
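
    For readers unfamiliar with the layout, a note is a small header
    (name size, descriptor size, type) followed by the name and the
    descriptor, each padded to a 4-byte boundary. The sketch below builds
    one in a buffer; struct note_hdr_model and append_note() are
    illustrative names, and the caller is assumed to provide a large
    enough buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field order as a 32-bit ELF note header. */
struct note_hdr_model {
    uint32_t namesz;  /* length of the name, including the NUL */
    uint32_t descsz;  /* length of the descriptor (payload) */
    uint32_t type;    /* note type, e.g. 1 for a process status note */
};

static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Write one note at buf and return the number of bytes used. */
size_t append_note(unsigned char *buf, const char *name, uint32_t type,
                   const void *desc, uint32_t descsz)
{
    struct note_hdr_model h;
    size_t off = 0;

    h.namesz = (uint32_t)strlen(name) + 1;
    h.descsz = descsz;
    h.type = type;

    memcpy(buf + off, &h, sizeof h);
    off += sizeof h;

    memset(buf + off, 0, align4(h.namesz));  /* zero the padding */
    memcpy(buf + off, name, h.namesz);
    off += align4(h.namesz);

    memset(buf + off, 0, align4(descsz));
    memcpy(buf + off, desc, descsz);
    off += align4(descsz);

    return off;
}
```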

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • One of the dangers when switching from one kernel to another is what happens
    to all of the other cpus that were running in the crashed kernel. In an
    attempt to avoid that problem this patch adds a nmi handler and attempts to
    shoot down the other cpus by sending them non maskable interrupts.

    The code then waits for 1 second or until all known cpus have stopped running
    and then jumps from the running kernel that has crashed to the kernel in
    reserved memory.
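
    The bounded wait can be sketched as below. This is a model, not the
    patch itself: the counter of CPUs still running and the 1 ms delay are
    passed in by the caller, and fake_delay()/stuck_delay() exist only to
    exercise the loop.

```c
/* Caller-supplied delay of roughly one millisecond. */
typedef void (*delay_1ms_fn)(void);

/* Wait until *pending drops to zero or about one second has passed.
 * Returns the number of delay steps taken. */
int wait_for_cpus(volatile int *pending, delay_1ms_fn delay_1ms)
{
    int msecs = 1000;

    while (*pending > 0 && msecs > 0) {
        delay_1ms();
        msecs--;
    }
    return 1000 - msecs;
}

/* Test doubles: one simulates a CPU stopping per step, the other
 * simulates CPUs that never respond. */
int pending_cpus;

void fake_delay(void)
{
    if (pending_cpus > 0)
        pending_cpus--;
}

void stuck_delay(void)
{
}
```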

    The kernel spin loop is used for the delay, as that should continue to
    be safe even after a crash.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is the i386 implementation of kexec.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Factor out the apic and smp shutdown code from machine_restart so it can be
    called in the kexec reboot path as well.

    By switching to the bootstrap cpu by default on reboot I can delete/simplify
    some motherboard fixups as well.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • For one kernel to report a crash another kernel has created we need
    to have 2 kernels loaded simultaneously in memory. To accomplish this
    the two kernels need to be built to run at different physical addresses.

    This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
    so we can do just that. You need to know what you are doing and what
    the ramifications are before changing this value, and most users
    won't care, so I have made it depend on CONFIG_EMBEDDED.

    bzImage kernels will work and run at a different address when compiled
    with this option but they will still load at 1MB. If you need a kernel
    loaded at a different address as well you need to boot a vmlinux.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch fixes a problem with reserving memory during boot up of a kernel
    built for a non-default location. Currently the boot memory allocator
    reserves the memory required by the kernel image, boot allocator bitmap etc.
    It assumes that the kernel is loaded at 1MB (HIGH_MEMORY hard coded to
    1024*1024). But the kernel can be built for a non-default location, hence
    the existing hardcoding will lead to reserving unnecessary memory. This
    patch fixes it.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • For one kernel to report a crash another kernel has created we need
    to have 2 kernels loaded simultaneously in memory. To accomplish this
    the two kernels need to be built to run at different physical addresses.

    This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel
    so we can do just that. You need to know what you are doing and what
    the ramifications are before changing this value, and most users
    won't care, so I have made it depend on CONFIG_EMBEDDED.

    bzImage kernels will work and run at a different address when compiled
    with this option but they will still load at 1MB. If you need a kernel
    loaded at a different address as well you need to boot a vmlinux.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The vmlinux on x86_64 does not report the correct physical address of
    the kernel. Instead in the physical address field it currently
    reports the virtual address of the kernel.

    This patch is a bug fix that corrects vmlinux to report the
    proper physical addresses.

    This is potentially a help for crash dump analysis tools.

    This definitely allows bootloaders that load vmlinux as a standard
    ELF executable. Bootloaders directly loading vmlinux become of
    practical importance when we consider the kexec on panic case.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The vmlinux on i386 does not report the correct physical address of
    the kernel. Instead in the physical address field it currently
    reports the virtual address of the kernel.

    This patch is a bug fix that corrects vmlinux to report the
    proper physical addresses.
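
    The relationship between the two address fields is simple arithmetic.
    As a sketch, assuming the common i386 layout where the kernel is
    mapped at virtual base 0xC0000000 (an assumption, not something this
    message states; phys_addr_for() is a hypothetical helper):

```c
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xC0000000u  /* assumed kernel virtual base */

/* Physical load address that corresponds to a kernel virtual address
 * under the assumed fixed offset. */
uint32_t phys_addr_for(uint32_t vaddr)
{
    return vaddr - SKETCH_PAGE_OFFSET;
}
```

    Under that assumption, a segment linked at virtual 0xC0100000 would
    report physical address 0x00100000 (1MB) rather than repeating the
    virtual address.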

    This is potentially a help for crash dump analysis tools.

    This definitely allows bootloaders that load vmlinux as a standard
    ELF executable. Bootloaders directly loading vmlinux become of
    practical importance when we consider the kexec on panic case.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • When coming out of apic mode attempt to set the appropriate
    apic back into virtual wire mode. This improves on previous versions
    of this patch by never setting both the local apic and the ioapic
    into virtual wire mode.

    This code looks at data from the mptable to see if an ioapic has
    an ExtInt input to make this decision. A future improvement
    is to figure out which apic or ioapic was in virtual wire mode
    at boot time and to remember it. That is potentially a more accurate
    method of selecting which apic to place in virtual wire mode.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • When coming out of apic mode attempt to set the appropriate
    apic back into virtual wire mode. This improves on previous versions
    of this patch by never setting both the local apic and the ioapic
    into virtual wire mode.

    This code looks at data from the mptable to see if an ioapic has
    an ExtInt input to make this decision. A future improvement
    is to figure out which apic or ioapic was in virtual wire mode
    at boot time and to remember it. That is potentially a more accurate
    method of selecting which apic to place in virtual wire mode.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman

    This patch disables interrupt generation from the legacy pic on reboot. Now
    that there is a sys_device class it should not be called while drivers are
    still using interrupts.

    There is a report about this breaking ACPI power off on some systems.
    http://bugme.osdl.org/show_bug.cgi?id=4041
    However the final comment seems to exonerate this code. So until
    I get more information I believe that was a false positive.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman

    It is ok to reserve resources > 4G on x86_64; struct resource is 64bit now :)

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: "Maciej W. Rozycki"

    Fix a kexec problem which causes local APIC detection failure.

    The problem is that detect_init_APIC() is called early, before the command
    line has been processed. Therefore "lapic" (and "nolapic") have not been
    seen yet.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman