08 Dec, 2006

2 commits

  • The elf note saving code is currently duplicated over several
    architectures. This cleanup patch simply adds code to a common file and
    then replaces the arch-specific code with calls to the newly added code.

    The only drawback with this approach is that s390 doesn't fully support
    kexec-on-panic which for that arch leads to introduction of unused code.

    Signed-off-by: Magnus Damm
    Cc: Vivek Goyal
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Magnus Damm
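
As a rough illustration of what a shared ELF note helper involves, the sketch below appends one note (header, padded name, padded descriptor) to a buffer. It is a userspace approximation, not the code from this commit: the struct note_hdr type and the append_note() and note_align() names are hypothetical, and the caller is assumed to provide a large enough buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same three fields as the standard ELF note header (Elf32_Nhdr/Elf64_Nhdr). */
struct note_hdr {
    uint32_t n_namesz;  /* length of the name, including the trailing NUL */
    uint32_t n_descsz;  /* length of the descriptor payload */
    uint32_t n_type;    /* note type, e.g. 1 for NT_PRSTATUS */
};

/* Round n up to a multiple of 4, the padding ELF notes use for name and desc. */
static size_t note_align(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/*
 * Append one note at buf and return the address just past it.
 * Hypothetical helper: the caller must guarantee enough room in buf.
 */
static unsigned char *append_note(unsigned char *buf, const char *name,
                                  uint32_t type, const void *desc,
                                  size_t descsz)
{
    struct note_hdr hdr;
    size_t namesz;

    assert(buf && name && (desc || descsz == 0));
    namesz = strlen(name) + 1;

    hdr.n_namesz = (uint32_t)namesz;
    hdr.n_descsz = (uint32_t)descsz;
    hdr.n_type = type;
    memcpy(buf, &hdr, sizeof(hdr));
    buf += sizeof(hdr);

    /* Zero the padded area first so the padding bytes are deterministic. */
    memset(buf, 0, note_align(namesz));
    memcpy(buf, name, namesz);
    buf += note_align(namesz);

    memset(buf, 0, note_align(descsz));
    if (descsz)
        memcpy(buf, desc, descsz);
    buf += note_align(descsz);

    return buf;
}
```

With one helper of this shape in a common file, each architecture only has to supply the register payload instead of repeating the layout logic.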
     
  • Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Burman Yan
     

30 Sep, 2006

2 commits

  • This fixes a couple of compiler warnings, and adds paranoia checks as well.

    Signed-off-by: Roland McGrath
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • This is an updated version of Eric Biederman's is_init() patch.
    (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
    replaces a few more instances of ->pid == 1 with is_init().

    Further, is_init() checks pid and thus removes dependency on Eric's other
    patches for now.

    Eric's original description:

    There are a lot of places in the kernel where we test for init
    because we give it special properties. Most significantly init
    must not die. This results in code all over the kernel testing
    ->pid == 1.

    Introduce is_init to capture this case.

    With multiple pid spaces for all of the cases affected we are
    looking for only the first process on the system, not some other
    process that has pid == 1.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Dave Hansen
    Cc: Serge Hallyn
    Cc: Cedric Le Goater
    Cc:
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
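
The idea of wrapping the ->pid == 1 test in a named predicate can be sketched as follows. This is a userspace stand-in: struct task here is a hypothetical reduction of the kernel's task structure to the one field the check needs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel task structure. */
struct task {
    int pid;
};

/*
 * Single place that decides whether a task is init. Callers stop
 * open-coding "->pid == 1", so the test can later be changed in one
 * spot (for example once multiple pid spaces exist).
 */
static inline bool is_init(const struct task *tsk)
{
    assert(tsk);
    return tsk->pid == 1;
}
```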
     

28 Jun, 2006

1 commit

  • With this patch, kdump uses the firmware soft-reset NMI for two purposes:
    1) Initiate the kdump (take a crash dump) by issuing a soft-reset.
    2) Break a CPU out of a deadlock condition that is detected during kdump
    processing.

    When a soft-reset is initiated, each CPU will enter
    system_reset_exception() and set its corresponding bit in the global
    bit-array cpus_in_sr, then call die(). When die() finds the CPU's bit set
    in cpus_in_sr, crash_kexec() is called to initiate a crash dump. The first
    CPU to enter crash_kexec() is called the "crashing CPU". All other CPUs
    are "secondary CPUs". The secondary CPUs pass through to
    crash_kexec_secondary() and sleep. The crashing CPU waits for all CPUs
    to enter via soft-reset, then boots the kdump kernel (see
    crash_soft_reset_check()).

    When the system crashes due to a panic or exception, crash_kexec() is
    called by panic() or die(). The crashing CPU sends an IPI to all other
    CPUs to notify them of the pending shutdown. If a CPU is in a deadlock
    or hung state with interrupts disabled, the IPI will not be delivered.
    As a result, the kdump kernel is not booted. This problem is
    solved with the use of a firmware-generated soft-reset. After the
    crashing_cpu has issued the IPI, it waits for 10 sec for all CPUs to
    enter crash_ipi_callback(). A CPU signifies its entry to
    crash_ipi_callback() by setting its corresponding bit in the
    cpus_in_crash bit array. After 10 sec, if one or more CPUs have not set
    their bit in cpus_in_crash, we assume that those CPUs are deadlocked. The
    operator is then prompted to generate a soft-reset to break the
    deadlock. Each CPU enters the soft reset handler as described above.

    Two conditions must be handled at this point:
    1) The system crashed because the operator generated a soft-reset. See
    2) The system had crashed before the soft-reset was generated (in the
    case of a panic or oops).

    The first CPU to enter crash_kexec() uses the state of the kexec_lock to
    determine which condition applies. If kexec_lock is already held, then
    condition 2 is true and crash_kexec_secondary() is called. Otherwise this
    CPU is flagged as the crashing CPU, the kexec_lock is acquired and
    crash_kexec() proceeds as described above.

    Each additional CPU responding to the soft-reset will pass through
    crash_kexec() to kexec_secondary(). All secondary CPUs call
    crash_ipi_callback(), readying themselves for the shutdown. When ready
    they clear their bit in cpus_in_sr. The crashing CPU waits in
    kexec_secondary() until all other CPUs have cleared their bits in
    cpus_in_sr. The kexec kernel boot is then started.

    Signed-off-by: Haren Myneni
    Signed-off-by: David Wilder
    Signed-off-by: Paul Mackerras

    David Wilder
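
The role selection and the cpus_in_sr handshake described in this commit can be modelled in userspace roughly as below. This is a sketch under assumptions, not the ppc64 code: C11 atomics stand in for the kernel primitives, and the function names enter_soft_reset(), claim_crash_role(), secondary_ready() and secondaries_done() are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

enum crash_role { CRASHING_CPU, SECONDARY_CPU };

static atomic_flag kexec_lock = ATOMIC_FLAG_INIT;
static atomic_ulong cpus_in_sr;   /* one bit per CPU, zero at start */

/* Each CPU marks itself as having entered the soft-reset handler. */
static void enter_soft_reset(unsigned int cpu)
{
    assert(cpu < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_or(&cpus_in_sr, 1UL << cpu);
}

/*
 * The first caller finds the lock free and becomes the crashing CPU.
 * Every later caller sees it already held and becomes a secondary.
 */
static enum crash_role claim_crash_role(void)
{
    return atomic_flag_test_and_set(&kexec_lock) ? SECONDARY_CPU
                                                 : CRASHING_CPU;
}

/* A secondary clears its bit once it is ready for the shutdown. */
static void secondary_ready(unsigned int cpu)
{
    assert(cpu < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_and(&cpus_in_sr, ~(1UL << cpu));
}

/* The crashing CPU polls this until all other bits are clear. */
static int secondaries_done(unsigned int crashing_cpu)
{
    return (atomic_load(&cpus_in_sr) & ~(1UL << crashing_cpu)) == 0;
}
```

Using a try-lock style test keeps the decision race-free: exactly one CPU can observe the lock as free, no matter how many enter at once.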
     

23 Jun, 2006

1 commit

  • Create two files in /sys/kernel, kexec_loaded and kexec_crash_loaded. Each
    file contains a simple boolean value indicating whether the relevant kernel
    has been loaded into memory. The motivation for this is geared around
    support.

    Signed-off-by: Jeff Moyer
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
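
A support tool could read these files with something like the sketch below. The helper name read_flag_file() is hypothetical, and the sketch assumes the file holds a single integer that is non-zero when the kernel is loaded.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return 1 or 0 for the boolean stored in the file at path,
 * or -1 if the file cannot be opened or does not start with an integer.
 */
static int read_flag_file(const char *path)
{
    FILE *f;
    int value = 0;
    int ok;

    assert(path);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%d", &value) == 1);
    fclose(f);
    if (!ok)
        return -1;
    return value != 0;
}
```

For example, read_flag_file("/sys/kernel/kexec_crash_loaded") would report whether a crash kernel is in place, and -1 on a kernel that does not provide the file.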
     

12 Jan, 2006

1 commit

  • - Move capable() from sched.h to capability.h;

    - Use <linux/capability.h> where capable() is used
    (in include/, block/, ipc/, kernel/, a few drivers/,
    mm/, security/, & sound/;
    many more drivers/ to go)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy.Dunlap
     

11 Jan, 2006

2 commits

  • - If the system panics then cpu register states are captured through the
    function crash_get_current_regs(). This is not an inline function, hence a
    stack frame is pushed on to the stack and then cpu register state is
    captured. Later this frame is popped and new frames are pushed
    (machine_kexec).

    - In theory this is not correct, as we are capturing register states for a
    frame that is no longer valid. This seems to have created back
    trace problems for ppc64.

    - This patch fixes it up. The very first thing it does after entering
    crash_kexec() is to capture the register states. We don't want the
    back trace beyond crash_kexec() anyway. crash_get_current_regs() has been
    made inline.

    - crash_setup_regs() is the top architecture dependent function which should
    be responsible for capturing the register states as well as doing some
    architecture dependent tricks, for example fixing up ss and esp for i386.
    crash_setup_regs() has also been made inline to ensure no new call frame is
    pushed onto the stack.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • - In case of system crash, current state of cpu registers is saved in memory
    in elf note format. So far memory for storing elf notes was being allocated
    statically for NR_CPUS.

    - This patch introduces dynamic allocation of memory for storing elf notes.
    It uses alloc_percpu() interface. This should lead to better memory usage.

    - Introduced based on Andi Kleen's and Eric W. Biederman's suggestions.

    - This patch also moves memory allocation for elf notes from architecture
    dependent portion to architecture independent portion. Now crash_notes is
    architecture independent. The whole idea is that size of memory to be
    allocated per cpu (MAX_NOTE_BYTES) can be architecture dependent and
    allocation of this memory can be architecture independent.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
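
The move from a static NR_CPUS-sized array to allocation sized at run time can be approximated in userspace as below. This is only a sketch: calloc() stands in for alloc_percpu(), the MAX_NOTE_BYTES value of 1024 is an assumption, and struct crash_notes and the crash_notes_*() helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-CPU note size; architecture dependent in the kernel, assumed here. */
#define MAX_NOTE_BYTES 1024

struct crash_notes {
    unsigned int nr_cpus;
    uint32_t *buf;          /* nr_cpus consecutive MAX_NOTE_BYTES areas */
};

/* Allocate zeroed note space for the CPUs actually present. */
static int crash_notes_alloc(struct crash_notes *cn, unsigned int nr_cpus)
{
    assert(cn && nr_cpus > 0);
    cn->buf = calloc(nr_cpus, MAX_NOTE_BYTES);
    if (!cn->buf)
        return -1;
    cn->nr_cpus = nr_cpus;
    return 0;
}

/* Return the start of the note area that belongs to one CPU. */
static uint32_t *crash_notes_for_cpu(const struct crash_notes *cn,
                                     unsigned int cpu)
{
    assert(cn && cn->buf && cpu < cn->nr_cpus);
    return cn->buf + (size_t)cpu * (MAX_NOTE_BYTES / sizeof(uint32_t));
}

static void crash_notes_free(struct crash_notes *cn)
{
    assert(cn);
    free(cn->buf);
    cn->buf = NULL;
    cn->nr_cpus = 0;
}
```

Only the constant is architecture specific; the allocation and lookup code is the same everywhere, which is the split the commit message describes.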
     

30 Oct, 2005

1 commit

  • Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
    a many-threaded application which concurrently initializes different parts of
    a large anonymous area.

    This patch corrects that, by using a separate spinlock per page table page, to
    guard the page table entries in that page, instead of using the mm's single
    page_table_lock. (But even then, page_table_lock is still used to guard page
    table allocation, and anon_vma allocation.)

    In this implementation, the spinlock is tucked inside the struct page of the
    page table page: with a BUILD_BUG_ON in case it overflows - which it would in
    the case of 32-bit PA-RISC with spinlock debugging enabled.

    Splitting the lock is not quite for free: another cacheline access. Ideally,
    I suppose we would use split ptlock only for multi-threaded processes on
    multi-cpu machines; but deciding that dynamically would have its own costs.
    So for now enable it by config, at some number of cpus - since the Kconfig
    language doesn't support inequalities, let preprocessor compare that with
    NR_CPUS. But I don't think it's worth being user-configurable: for good
    testing of both split and unsplit configs, split now at 4 cpus, and perhaps
    change that to 8 later.

    There is a benefit even for singly threaded processes: kswapd can be attacking
    one part of the mm while another part is busy faulting.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
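
The locking change can be pictured with the userspace sketch below, where each page table page carries its own lock and an update only takes the lock of the page it touches. This is an illustration under assumptions: a C11 atomic_flag spinlock stands in for the kernel spinlock, PTRS_PER_TABLE is set to 512 for the example, and struct pt_page, pt_lock(), pt_unlock() and set_pte() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PTRS_PER_TABLE 512   /* entries per page table page, assumed */

/* One page table page with its own lock tucked alongside the entries. */
struct pt_page {
    atomic_flag ptl;
    unsigned long pte[PTRS_PER_TABLE];
};

static void pt_page_init(struct pt_page *p)
{
    size_t i;

    assert(p);
    atomic_flag_clear(&p->ptl);
    for (i = 0; i < PTRS_PER_TABLE; i++)
        p->pte[i] = 0;
}

/* Spin until this page's lock is acquired. */
static void pt_lock(struct pt_page *p)
{
    while (atomic_flag_test_and_set_explicit(&p->ptl, memory_order_acquire))
        ;
}

static void pt_unlock(struct pt_page *p)
{
    atomic_flag_clear_explicit(&p->ptl, memory_order_release);
}

/*
 * Update one entry. Only the lock of the containing page is taken, so
 * threads working on different page table pages do not contend.
 */
static void set_pte(struct pt_page *tables, unsigned long index,
                    unsigned long val)
{
    struct pt_page *p;

    assert(tables);
    p = &tables[index / PTRS_PER_TABLE];
    pt_lock(p);
    p->pte[index % PTRS_PER_TABLE] = val;
    pt_unlock(p);
}
```

With a single lock for the whole address space, every set_pte() call would serialize; here contention is limited to threads that hit the same page table page.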
     

28 Oct, 2005

1 commit


29 Jun, 2005

1 commit


26 Jun, 2005

4 commits

  • o The following patch makes purely cosmetic changes and fixes CodingStyle
    guideline issues such as the following in kexec related files:

    o braces for one line "if" statements, "for" loops,
    o more than 80 column wide lines,
    o No space after "while", "for" and "switch" key words

    o Changes:
    o take-2: Removed the extra tab before "case" key words.
    o take-3: Put operator at the end of line and space before "*/"

    Signed-off-by: Maneesh Soni
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maneesh Soni
     
  • Makes kexec_crashdump() take a pt_regs * as an argument. This makes it
    possible to get the exact register state at the point of the crash. If we
    come from a direct panic assertion, NULL will be passed and the current
    registers are saved before the crashdump.

    This hooks into two places:
    die(): check the conditions under which we will panic when calling
    do_exit and go there directly with the pt_regs that caused the fatal
    fault.

    die_nmi(): If we receive an NMI lockup while in the kernel use the
    pt_regs and go directly to crash_kexec(). We're probably nested up badly
    at this point so this might be the only chance to escape with proper
    information.

    Signed-off-by: Alexander Nyberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
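
The calling convention described here (pass the faulting registers when they exist, pass NULL to capture the current ones) might be sketched as follows. Everything in this block is a hypothetical userspace stand-in: struct regs replaces pt_regs, and capture_current_regs() merely zeroes the structure where real code would read the CPU registers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct regs {
    unsigned long ip;
    unsigned long sp;
};

/* Placeholder: real code would read the live CPU registers here. */
static void capture_current_regs(struct regs *out)
{
    memset(out, 0, sizeof(*out));
}

/*
 * Save the register state for the crash dump. If the caller has the
 * registers from the faulting context, use them unchanged; otherwise
 * (NULL) fall back to capturing the current state.
 */
static void crash_save_regs(const struct regs *fault_regs, struct regs *saved)
{
    assert(saved);
    if (fault_regs)
        *saved = *fault_regs;
    else
        capture_current_regs(saved);
}
```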
     
  • This is a minor bug fix in kexec to resolve the problem of loading panic
    kernel with initrd.

    o Problem: Loading a capture kernel fails if an initrd is also being loaded.
    This has been observed for a vmlinux image in the kexec on panic case.

    o This patch fixes the problem by making a minor correction to the segment
    location and size verification logic. Segment memory end (mend) should be
    mstart + memsz - 1. This one byte offset was the source of the failure for
    an initrd that was being loaded at a hole boundary.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
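
The off-by-one can be seen in a small bounds check. The sketch below assumes the allowed region is described by inclusive start and end addresses; segment_fits() and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the segment occupying memsz bytes from mstart lies
 * entirely inside the region [region_start, region_end], both inclusive.
 */
static bool segment_fits(unsigned long mstart, unsigned long memsz,
                         unsigned long region_start,
                         unsigned long region_end)
{
    unsigned long mend;

    assert(memsz > 0);
    /* Last byte of the segment, so the end is inclusive as well. */
    mend = mstart + memsz - 1;
    return mstart >= region_start && mend <= region_end;
}
```

A 0x1000-byte segment at 0x1000 ends at 0x1fff, so it fits a region ending at 0x1fff. Computing mend as mstart + memsz would give 0x2000 and wrongly reject a segment that finishes exactly on the boundary.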
     
  • This patch introduces the architecture independent implementation of the
    sys_kexec_load and compat_sys_kexec_load system calls.

    Kexec on panic support has been integrated into the core patch and is
    relatively clean.

    In addition the hopefully architecture independent option
    crashkernel=size@location has been documented. Its purpose is to reserve
    space for the panic kernel to live in, where no DMA transfer will ever be
    set up to access.

    Signed-off-by: Eric Biederman
    Signed-off-by: Alexander Nyberg
    Signed-off-by: Adrian Bunk
    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
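
Parsing a value of the form size@location could look like the userspace sketch below. It is not the kernel's parser: parse_size() and parse_crashkernel() are hypothetical helpers, and support for K, M and G suffixes is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a number with an optional K, M or G suffix. On return *end
 * points past what was consumed; *end == s means nothing was parsed.
 */
static unsigned long long parse_size(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 0);

    if (*end == s)
        return 0;
    switch (**end) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return v;
}

/* Split "size@location" into its two values. Returns 0 on success. */
static int parse_crashkernel(const char *arg, unsigned long long *size,
                             unsigned long long *base)
{
    char *end;
    const char *loc;

    assert(arg && size && base);
    *size = parse_size(arg, &end);
    if (end == arg || *end != '@')
        return -1;
    loc = end + 1;
    *base = parse_size(loc, &end);
    if (end == loc || *end != '\0')
        return -1;
    return 0;
}
```

For example, "64M@16M" yields a size of 64 MiB and a location of 16 MiB, while a string without the @location part is rejected by this sketch.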