29 Apr, 2008

1 commit

  • SysRQ-P is not always useful on SMP systems, since it usually ends up showing
    the backtrace of a CPU that is doing just fine, instead of the backtrace of
    the CPU that is having problems.

    This patch adds SysRQ show-all-cpus(L), which shows the backtrace of every
    active CPU in the system. It skips idle CPUs because some SMP systems are
    just too large and we already know what the backtrace of the idle task looks
    like.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Rik van Riel
    Randy Dunlap
    Cc:
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
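
    For experimenting with this key without a keyboard, SysRq commands can
    also be issued by writing the command character to /proc/sysrq-trigger.
    The following is a minimal user-space sketch, not part of the patch: the
    helper name sysrq_write_key() is hypothetical, and it assumes a kernel
    built with SysRq support and a caller with root privileges.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one SysRq command character to a trigger file.
 * On a real system the path would be /proc/sysrq-trigger (root required).
 * Returns 0 on success, -1 on failure.
 */
static int sysrq_write_key(const char *path, char key)
{
    assert(path != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputc(key, f) != EOF);

    /* fclose() flushes the buffered byte; a failure means it was not written. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

    Calling sysrq_write_key("/proc/sysrq-trigger", 'l') would then request
    the backtrace of every active CPU described above.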
     

28 Apr, 2008

1 commit

  • Introduce a node_zonelist() helper function. It is used to look up the
    appropriate zonelist given a node and a GFP mask. The patch on its own is a
    cleanup but it helps clarify parts of the two-zonelist-per-node patchset. If
    necessary, it can be merged with the next patch in this set without problems.

    Reviewed-by: Christoph Lameter
    Signed-off-by: Mel Gorman
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
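
    The idea of such a helper can be sketched in user space as follows. This
    is an illustration only: the types are heavily simplified stand-ins, the
    flag values and the node_data array are assumptions made for the sketch,
    and the real kernel helper may be written differently.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; names and sizes are illustrative. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };
#define MAX_NUMNODES 4

typedef unsigned int gfp_t;
#define GFP_DMA     0x01u   /* hypothetical flag values for this sketch */
#define GFP_HIGHMEM 0x02u

struct zonelist { int unused; };

struct pglist_data {
    struct zonelist node_zonelists[MAX_NR_ZONES];   /* one list per zone type */
};

static struct pglist_data node_data[MAX_NUMNODES];

/* Map allocation flags to a zone type (simplified). */
static enum zone_type gfp_zone(gfp_t flags)
{
    if (flags & GFP_DMA)
        return ZONE_DMA;
    if (flags & GFP_HIGHMEM)
        return ZONE_HIGHMEM;
    return ZONE_NORMAL;
}

/* Single place that turns (node, GFP mask) into the matching zonelist. */
static struct zonelist *node_zonelist(int nid, gfp_t flags)
{
    assert(nid >= 0 && nid < MAX_NUMNODES);
    return &node_data[nid].node_zonelists[gfp_zone(flags)];
}
```

    Callers then ask for node_zonelist(nid, flags) instead of open-coding the
    array lookup, which is what makes a later change of the zonelist layout
    easier to review.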
     

20 Oct, 2007

1 commit

  • is_init() is an ambiguous name for the pid==1 check. Split it into
    is_global_init() and is_container_init().

    A cgroup init has its tsk->pid == 1.

    A global init also has its tsk->pid == 1 and its active pid namespace
    is the init_pid_ns. But rather than check the active pid namespace,
    compare the task structure with 'init_pid_ns.child_reaper', which is
    initialized during boot to the /sbin/init process and never changes.

    Changelog:

    2.6.22-rc4-mm2-pidns1:
    - Use 'init_pid_ns.child_reaper' to determine if a given task is the
    global init (/sbin/init) process. This would improve performance
    and remove dependence on the task_pid().

    2.6.21-mm2-pidns2:

    - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
    ppc,avr32}/traps.c for the _exception() call to is_global_init().
    This way, we kill only the cgroup if the cgroup's init has a
    bug rather than force a kernel panic.

    [akpm@linux-foundation.org: fix comment]
    [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
    [bunk@stusta.de: kernel/pid.c: remove unused exports]
    [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Pavel Emelianov
    Cc: Eric W. Biederman
    Cc: Cedric Le Goater
    Cc: Dave Hansen
    Cc: Herbert Poetzel
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
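
    The distinction between the two checks can be modelled in user space as
    below. This is a sketch under simplifying assumptions: struct task and
    its pid_in_ns field are hypothetical stand-ins for the kernel's task and
    pid structures, and only the comparison logic described above is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model: each task records its pid as seen in its own namespace. */
struct task {
    int pid_in_ns;
};

struct pid_namespace {
    struct task *child_reaper;   /* the init process of this namespace */
};

/* Stand-in for the kernel's init_pid_ns; child_reaper is set once at "boot". */
static struct pid_namespace init_pid_ns;

/* Global init: compare the task pointer itself, no pid lookup needed. */
static bool is_global_init(const struct task *tsk)
{
    assert(tsk != NULL);
    return tsk == init_pid_ns.child_reaper;
}

/* Container init: pid 1 within the task's own namespace. */
static bool is_container_init(const struct task *tsk)
{
    assert(tsk != NULL);
    return tsk->pid_in_ns == 1;
}
```

    In this model the global init satisfies both predicates, while the init
    of a nested namespace satisfies only is_container_init().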
     

17 Oct, 2007

2 commits

  • As of now, the kernel defaults to non-unicode and XLATE for the keyboard.
    We've been changing this in Fedora, but that requires patching the defaults
    in the kernel.

    The attached patch introduces CONFIG_VT_UNICODE, which sets the console in
    unicode mode by default on boot, including both the virtual terminal and
    the keyboard driver.

    Signed-off-by: Bill Nottingham
    Cc: Samuel Thibault
    Cc: Dmitry Torokhov
    Cc: Jiri Kosina
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bill Nottingham
     
  • Move the OOM killer's extern function prototypes to include/linux/oom.h and
    include it where necessary.

    [clg@fr.ibm.com: build fix]
    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

09 May, 2007

1 commit


17 Feb, 2007

1 commit


14 Feb, 2007

1 commit

  • Somewhere in the rewrite of the work queues my cleanup of SAK handling
    got broken. Maybe I didn't retest it properly or possibly the API
    was changing so fast I missed something. Regardless, triggering a SAK
    currently generates an ugly BUG_ON and kills the kernel.

    Thanks to Alexey Dobriyan for spotting this.

    This modifies the use of SAK_work to initialize it when the data
    structure it resides in is initialized, and to simply call
    schedule_work when we need to generate a SAK. I update both
    data structures that have a SAK_work member for consistency.

    All of the old PREPARE_WORK calls are now gone.

    If we call schedule_work again before it has generated the first SAK,
    it will simply ignore the duplicate schedule_work request.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
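
    The "initialize once, schedule many times" pattern and the handling of a
    duplicate request can be modelled as follows. This is a user-space
    sketch, not the kernel workqueue code: struct work, init_work(),
    schedule_work_model() and run_pending() are hypothetical names, and a
    single boolean stands in for the pending bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct work {
    bool pending;                    /* set while queued, cleared when run */
    void (*func)(struct work *);
};

/* Done once, when the structure containing the work item is set up. */
static void init_work(struct work *w, void (*func)(struct work *))
{
    assert(w != NULL && func != NULL);
    w->pending = false;
    w->func = func;
}

/* Returns true if the work was queued, false if it was already pending. */
static bool schedule_work_model(struct work *w)
{
    if (w->pending)
        return false;                /* duplicate request is ignored */
    w->pending = true;
    return true;
}

/* Stand-in for the worker thread: run the item if it is pending. */
static void run_pending(struct work *w)
{
    if (!w->pending)
        return;
    w->pending = false;
    w->func(w);
}

static int sak_count;

static void do_sak_work(struct work *w)
{
    (void)w;
    sak_count++;                     /* the real handler would perform the SAK */
}
```

    Two back-to-back calls to schedule_work_model() before the worker runs
    produce a single invocation of do_sak_work(), mirroring the behaviour
    described in the last paragraph of the message.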
     

12 Feb, 2007

1 commit

  • This does several things.
    - It moves looking up of the current foreground console into process
    context where we can safely take the semaphore that protects this
    operation.
    - It uses the new flavor of work queue processing.
    - It factors a helper, __do_SAK, out of do_SAK; the helper runs immediately.
    - It calls __do_SAK with the console semaphore held, ensuring nothing
    else happens to the console while we process the SAK operation.
    - With the console SAK processing moved into process context, this
    patch removes the xchg operations that I used to attempt to atomically
    update struct pid, because of the strange locking used in the SAK processing.
    With SAK using the normal console semaphore, nothing special is needed.

    Cc: Oleg Nesterov
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

02 Feb, 2007

1 commit

  • Change SysRq showBlockedTasks from sysrq-X to sysrq-W and show that in the
    Help message.

    It was previously done via X, but X is already used for Xmon on ppc & powerpc
    platforms and this collision needs to be avoided.

    All callers of register_sysrq_key() are now marked in the sysrq op/key table.
    I didn't mark 'h' as Help because Help is just printed for any unknown key,
    such as '?'.

    Added some omitted sysrq key entries in the sysrq.txt file.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

14 Dec, 2006

1 commit

  • Most distributions enable sysrq support but set it to 0 by default. Add a
    sysrq_always_enabled boot option to always-enable sysrq keys. Useful for
    debugging - without having to modify the distribution's config files (which
    might not be possible if the kernel is on a live CD, etc.).

    Also, while at it, clean up the sysrq interfaces.

    [bunk@stusta.de: make sysrq_always_enabled_setup() static]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
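
    How a boot option can override the runtime default may be sketched like
    this. It is a user-space model under stated assumptions: parse_cmdline()
    is a hypothetical stand-in for the kernel's boot-parameter handling, and
    the two flags and sysrq_on() only illustrate the "enabled or always
    enabled" decision.

```c
#include <assert.h>
#include <string.h>

static int sysrq_enabled;          /* runtime setting; 0 means disabled */
static int sysrq_always_enabled;   /* set when the boot option is present */

/* Scan a space-separated command line for the exact option name. */
static void parse_cmdline(const char *cmdline)
{
    static const char opt[] = "sysrq_always_enabled";
    const size_t optlen = sizeof(opt) - 1;

    assert(cmdline != NULL);

    const char *p = cmdline;
    while (*p != '\0') {
        p += strspn(p, " ");               /* skip separators */
        size_t len = strcspn(p, " ");      /* length of the next token */
        if (len == optlen && strncmp(p, opt, optlen) == 0)
            sysrq_always_enabled = 1;
        p += len;
    }
}

/* SysRq keys are honoured if either switch is on. */
static int sysrq_on(void)
{
    return sysrq_enabled || sysrq_always_enabled;
}
```

    With the runtime setting left at 0, sysrq_on() only becomes true once the
    option appears on the command line, so no configuration file on the
    (possibly read-only) root filesystem has to change.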
     

08 Dec, 2006

1 commit


22 Nov, 2006

1 commit

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for further
    scheduling or deallocation by clearing the pending bit prior to jumping to the
    work function. This means that, unless the driver makes some guarantee itself
    that the work_struct won't go away, the work function may not access anything
    else in the work_struct or its container lest they be deallocated. This is a
    problem if the auxiliary data is taken away (as done by the last patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).

    Signed-Off-By: David Howells

    David Howells
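
    The container_of() technique mentioned in the first paragraph can be
    demonstrated in plain C. The sketch below is illustrative: struct
    my_device, my_device_work() and run_work() are hypothetical, the
    work_struct is reduced to a single function pointer, and the pending-bit
    and release handling discussed above are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Recover a pointer to the enclosing structure from a pointer to a member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct work_struct {
    void (*func)(struct work_struct *work);
};

/* Hypothetical driver state that embeds its work item. */
struct my_device {
    int events_handled;
    struct work_struct work;
};

/* The work function receives only the work_struct pointer... */
static void my_device_work(struct work_struct *work)
{
    /* ...and works out its own data from it. */
    struct my_device *dev = container_of(work, struct my_device, work);

    dev->events_handled++;
}

/* Stand-in for the work queue executor. */
static void run_work(struct work_struct *work)
{
    assert(work != NULL && work->func != NULL);
    work->func(work);
}
```

    Because the context is derived from the work_struct's address, no
    separate data pointer needs to be stored alongside the function pointer.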
     

06 Oct, 2006

1 commit


05 Oct, 2006

1 commit

  • Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
    of passing regs around manually through all ~1800 interrupt handlers in the
    Linux kernel.

    The regs pointer is used in few places, but it potentially costs both stack
    space and code to pass it around. On the FRV arch, removing the regs parameter
    from all the genirq functions results in a 20% speed up of the IRQ exit path
    (ie: from leaving timer_interrupt() to leaving do_IRQ()).

    Where appropriate, an arch may override the generic storage facility and do
    something different with the variable. On FRV, for instance, the address is
    maintained in GR28 at all times inside the kernel as part of general exception
    handling.

    Having looked over the code, it appears that the parameter may be handed down
    through up to twenty or so layers of functions. Consider a USB character
    device attached to a USB hub, attached to a USB controller that posts its
    interrupts through a cascaded auxiliary interrupt controller. A character
    device driver may want to pass regs to the sysrq handler through the input
    layer which adds another few layers of parameter passing.

    I've built this code with allyesconfig for x86_64 and i386. I've runtested the
    main part of the code on FRV and i386, though I can't test most of the drivers.
    I've also done partial conversion for powerpc and MIPS - these at least compile
    with minimal configurations.

    This will affect all archs. Mostly the changes should be relatively easy.
    Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

    struct pt_regs *old_regs = set_irq_regs(regs);

    And put the old one back at the end:

    set_irq_regs(old_regs);

    Don't pass regs through to generic_handle_irq() or __do_IRQ().

    In timer_interrupt(), this sort of change will be necessary:

    - update_process_times(user_mode(regs));
    - profile_tick(CPU_PROFILING, regs);
    + update_process_times(user_mode(get_irq_regs()));
    + profile_tick(CPU_PROFILING);

    I'd like to move update_process_times()'s use of get_irq_regs() into itself,
    except that i386, alone of the archs, uses something other than user_mode().

    Some notes on the interrupt handling in the drivers:

    (*) input_regs() is now gone entirely. The regs pointer is no longer stored in
    the input_dev struct.

    (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
    something different depending on whether it's been supplied with a regs
    pointer or not.

    (*) Various IRQ handler function pointers have been moved to type
    irq_handler_t.

    Signed-Off-By: David Howells
    (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)

    David Howells
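
    The save/restore discipline around set_irq_regs() can be tried out in
    user space. In the sketch below a single global pointer stands in for
    the per-CPU variable, struct pt_regs is reduced to one field, and
    profile_tick() and do_IRQ() are simplified models rather than the kernel
    functions of the same names.

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs {
    unsigned long ip;   /* simplified: only an instruction pointer */
};

/* The kernel keeps one such pointer per CPU; one global suffices here. */
static struct pt_regs *irq_regs;

static struct pt_regs *get_irq_regs(void)
{
    return irq_regs;
}

/* Install new_regs and hand back the previous pointer so it can be restored. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
    struct pt_regs *old_regs = irq_regs;

    irq_regs = new_regs;
    return old_regs;
}

static unsigned long last_profiled_ip;

/* A handler deep in the call chain fetches regs instead of receiving them. */
static void profile_tick(void)
{
    struct pt_regs *regs = get_irq_regs();

    assert(regs != NULL);
    last_profiled_ip = regs->ip;
}

/* Entry point: store the pointer at the beginning, put the old one back at the end. */
static void do_IRQ(struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);

    profile_tick();
    set_irq_regs(old_regs);
}
```

    Returning the old pointer from set_irq_regs() is what keeps nested
    interrupts correct: the outer handler sees its own regs again once the
    inner one has finished.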
     

01 Oct, 2006

1 commit

  • SysRq : Emergency Sync
    Emergency Sync complete
    SysRq : Emergency Remount R/O
    Emergency Remount complete
    SysRq : Resetting
    BUG: warning at kernel/lockdep.c:1816/trace_hardirqs_on() (Not tainted)

    Call Trace:
    show_trace+0xae/0x319
    dump_stack+0x15/0x17
    trace_hardirqs_on+0xbc/0x13d
    sysrq_handle_reboot+0x9/0x11
    __handle_sysrq+0x99/0x130
    handle_sysrq+0x17/0x19
    kbd_event+0x32e/0x57d
    input_event+0x42d/0x45b
    atkbd_interrupt+0x44d/0x53d
    serio_interrupt+0x49/0x86
    i8042_interrupt+0x202/0x21a
    handle_IRQ_event+0x2c/0x64
    __do_IRQ+0xaf/0x114
    do_IRQ+0xf8/0x107
    ret_from_intr+0x0/0xf
    DWARF2 unwinder stuck at ret_from_intr+0x0/0xf
    Leftover inexact backtrace:
    mwait_idle+0x3f/0x54
    cpu_idle+0xa2/0xc5
    rest_init+0x2b/0x2d
    start_kernel+0x24a/0x24c
    _sinittext+0x28b/0x292

    Since we're shutting down anyway, don't bother being smart,
    just turn the thing off.

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Cc: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

30 Sep, 2006

1 commit

  • This is an updated version of Eric Biederman's is_init() patch.
    (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
    replaces a few more instances of ->pid == 1 with is_init().

    Further, is_init() checks pid and thus removes dependency on Eric's other
    patches for now.

    Eric's original description:

    There are a lot of places in the kernel where we test for init
    because we give it special properties. Most significantly init
    must not die. This results in code all over the kernel testing
    ->pid == 1.

    Introduce is_init to capture this case.

    With multiple pid spaces, for all of the cases affected we are
    looking for only the first process on the system, not some other
    process that has pid == 1.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Dave Hansen
    Cc: Serge Hallyn
    Cc: Cedric Le Goater
    Cc:
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     

04 Jul, 2006

2 commits

  • Print all lock-classes on SysRq-D.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Generic lock debugging:

    - generalized lock debugging framework. For example, a bug in one lock
    subsystem turns off debugging in all lock subsystems.

    - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from
    the mutex/rtmutex debugging code: it caused way too much prototype
    hackery, and lockdep will give the same information anyway.

    - ability to do silent tests

    - check lock freeing in vfree too.

    - more finegrained debugging options, to allow distributions to
    turn off more expensive debugging features.

    There's no separate 'held mutexes' list anymore - but there's a 'held locks'
    stack within lockdep, which unifies deadlock detection across all lock
    classes. (this is independent of the lockdep validation stuff - lockdep first
    checks whether we are holding a lock already)

    Here are the current debugging options:

    CONFIG_DEBUG_MUTEXES=y
    CONFIG_DEBUG_LOCK_ALLOC=y

    which do:

    config DEBUG_MUTEXES
    bool "Mutex debugging, basic checks"

    config DEBUG_LOCK_ALLOC
    bool "Detect incorrect freeing of live mutexes"

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

01 Jul, 2006

1 commit


26 Mar, 2006

1 commit


21 Feb, 2006

1 commit

  • Some allocations are restricted to a limited set of nodes (due to memory
    policies or cpuset constraints). If the page allocator is not able to find
    enough memory then that does not mean that overall system memory is low.

    In particular going postal and more or less randomly shooting at processes
    is not likely going to help the situation but may just lead to suicide (the
    whole system coming down).

    It is better to signal to the process that no memory exists given the
    constraints that the process (or the configuration of the process) has
    placed on the allocation behavior. The process may be killed but then the
    sysadmin or developer can investigate the situation. The solution is
    similar to what we do when running out of hugepages.

    This patch adds a check before we kill processes. At that point
    performance considerations do not matter much so we just scan the zonelist
    and reconstruct a list of nodes. If the list of nodes does not contain all
    online nodes then this is a constrained allocation and we should kill the
    current process.

    Signed-off-by: Christoph Lameter
    Cc: Nick Piggin
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
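
    The check described in the last paragraph amounts to comparing the set
    of nodes reachable through the zonelist with the set of online nodes. A
    minimal sketch, assuming at most 32 nodes and representing the zonelist
    simply as an array of node ids; is_constrained_alloc() and the bitmask
    nodemask_t are hypothetical simplifications, not the kernel's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int nodemask_t;   /* bit n set means node n; up to 32 nodes */

/*
 * zonelist_nodes holds the node id of every zone in the zonelist (a node
 * may appear several times, once per zone). Returns true when at least one
 * online node is not reachable, i.e. the allocation was constrained.
 */
static bool is_constrained_alloc(const int *zonelist_nodes, size_t n,
                                 nodemask_t online)
{
    nodemask_t allowed = 0;

    assert(zonelist_nodes != NULL || n == 0);

    /* Reconstruct the list of nodes by scanning the zonelist. */
    for (size_t i = 0; i < n; i++) {
        assert(zonelist_nodes[i] >= 0 && zonelist_nodes[i] < 32);
        allowed |= 1u << zonelist_nodes[i];
    }

    /* Constrained if some online node is missing from the zonelist. */
    return (online & ~allowed) != 0;
}
```

    When the function returns true, the shortage is local to the permitted
    nodes, so the policy above targets the current process rather than
    letting the OOM killer pick victims system-wide.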
     

10 Jan, 2006

1 commit


09 Nov, 2005

1 commit


27 Jul, 2005

1 commit

  • sysrq calls into the reboot path from an interrupt handler.
    We can either push the code down into process context and
    call kernel_restart to get a clean reboot, or we can simply
    reboot the machine and increase our chances of actually
    rebooting. emergency_reboot() seems like the closest match
    to what we have previously done, and what we want.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

08 Jul, 2005

1 commit


26 Jun, 2005

2 commits

  • Makes kexec_crashdump() take a pt_regs * as an argument. This allows us to
    get the exact register state at the point of the crash. If we come from a
    direct panic assertion, NULL will be passed and the current registers saved
    before the crashdump.

    This hooks into two places:
    die(): check the conditions under which we will panic when calling
    do_exit and go there directly with the pt_regs that caused the fatal
    fault.

    die_nmi(): If we receive an NMI lockup while in the kernel, use the
    pt_regs and go directly to crash_kexec(). We're probably nested up badly
    at this point so this might be the only chance to escape with proper
    information.

    Signed-off-by: Alexander Nyberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
     
  • Add a sysrq-trigger mechanism for kexec based crashdumps. Alt-Sysrq-c
    triggers a kexec based crashdump.

    Signed-off-by: Hariprasad Nellitheertha
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hariprasad Nellitheertha
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds