05 Sep, 2005

9 commits

  • Jeff Dike,
    Paolo 'Blaisorblade' Giarrusso,
    Bodo Stroesser

    Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
    except that the kernel does not execute the requested syscall; this is useful
    to improve performance for virtual environments, like UML, which want to run
    the syscall on their own.

    In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
    and on exit, and each time you also have two context switches; with SYSEMU you
    avoid the 2nd stop and so save two context switches per syscall.

    Also, some architectures don't have support in the host for changing the
    syscall number via ptrace(), which is currently needed to skip syscall
    execution (UML turns any syscall into getpid() to avoid it being executed on
    the host). Fixing that is hard, while SYSEMU is easier to implement.

    * This version of the patch includes some suggestions of Jeff Dike to avoid
    adding any instructions to the syscall fast path, plus some other little
    changes, by myself, to make it work even when the syscall is executed with
    SYSENTER (but I'm unsure about them). It has been widely tested for quite a
    long time.

    * Various fixes were included to handle the various switches between
    various states, i.e. when for instance a syscall entry is traced with one of
    PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
    Basically, this is done by remembering which one of them was used even after
    the call to ptrace_notify().

    * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
    to make do_syscall_trace() notice that the current syscall was started with
    SYSEMU on entry, so that no notification ought to be done in the exit path;
    this is a bit of a hack, so this problem is solved in another way in next
    patches.

    * Also, the effects of the patch:
    "Ptrace - i386: fix Syscall Audit interaction with singlestep"
    are cancelled; they are restored back in the last patch of this series.

    Detailed descriptions of the patches doing this kind of processing follow (but
    I've already summed everything up).

    * Fix behaviour when changing interception kind #1.

    In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
    only after doing the debugger notification; but the debugger might have
    changed the status of this flag because it continued execution with
    PTRACE_SYSCALL, so this is wrong. This patch fixes it by saving the flag
    status before calling ptrace_notify().

    * Fix behaviour when changing interception kind #2:
    avoid intercepting syscall on return when using SYSCALL again.

    A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
    crashes.

    The problem is in arch/i386/kernel/entry.S. The current SYSEMU patch
    inhibits the syscall handler from being called, but does not prevent
    do_syscall_trace() from being called after this for syscall completion
    interception.

    The appended patch fixes this. It reuses the flag TIF_SYSCALL_EMU to
    remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
    the flag is unused in the depicted situation.

    * Fix behaviour when changing interception kind #3:
    avoid intercepting syscall on return when using SINGLESTEP.

    When testing 2.6.9 with the skas3.v6 patch and my latest patch applied, I
    had problems with singlestepping on UML in SKAS with SYSEMU. It looped
    receiving SIGTRAPs without moving forward. EIP of the traced process was
    the same for all SIGTRAPs.

    What's missing is to handle switching from PTRACE_SYSCALL_EMU to
    PTRACE_SINGLESTEP in a way very similar to what is done for the change from
    PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.

    I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
    notified and then wakes up the process; the syscall is executed (or skipped,
    when do_syscall_trace() returns 0, i.e. when using PTRACE_SYSEMU), and
    do_syscall_trace() is called again. Since we are on the return path of a
    SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
    we must still avoid notifying the parent of the syscall exit. Now, this
    behaviour is extended even to resuming with PTRACE_SINGLESTEP.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laurent Vivier
     
  • Clean code up a bit, and only show suspend to disk as available when
    it is configured in.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • If process freezing fails, some processes are frozen and the rest are left
    in the "were asked to be frozen" state. That's wrong; we should leave them
    in a consistent state.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • Drop printing during normal boot (when no image exists in swap), print
    message when drivers fail, fix error paths and consolidate near-identical
    functions in disk.c (and functions with just one statement).

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • It is trying to protect swsusp_resume_device and software_resume() from two
    users banging on them from userspace at the same time.

    Signed-off-by: Shaohua Li
    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • The function calc_nr uses an iterative algorithm to calculate the number of
    pages needed for the image and the pagedir. Exactly the same result can be
    obtained with a one-line expression.

    Note that this was even proved correct ;-).

    Signed-off-by: Michal Schmidt
    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Schmidt
     
  • The patch protects against leaking sensitive data after resume from suspend.
    During suspend a temporary key is created and this key is used to encrypt the
    data written to disk. When, during resume, the data has been read back into
    memory, the temporary key is destroyed, which simply means that all data
    written to disk during suspend are then inaccessible so they can't be stolen
    later on.

    Think of the following: you suspend while an application is running that keeps
    sensitive data in memory. The application itself prevents the data from being
    swapped out. Suspend, however, must write these data to swap to be able to
    resume later on. Without suspend encryption your sensitive data are then
    stored in plaintext on disk. This means that after resume your sensitive data
    are accessible to all applications having direct access to the swap device
    which was used for suspend. If you don't need swap after resume, these data
    can remain on disk virtually forever. Thus it can happen that your system
    gets broken into weeks later and sensitive data which you thought were
    encrypted and protected are retrieved and stolen from the swap device.

    Signed-off-by: Andreas Steinmetz
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Steinmetz
     
  • This should make refrigerator sleep properly, not busywait after the first
    schedule() returns.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • Aha, swsusp dips into swap_info[], better update it to swap_lock. It's
    bitflipping flags with 0xFF, so get_swap_page will allocate from only the one
    chosen device: let's change that to flip SWP_WRITEOK.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

30 Aug, 2005

3 commits


27 Aug, 2005

2 commits

  • At the suggestion of Nick Piggin and Dinakar, totally disable
    the facility to allow cpu_exclusive cpusets to define dynamic
    sched domains in Linux 2.6.13, in order to avoid problems
    first reported by John Hawkes (corrupt sched data structures
    and kernel oops).

    This has been built for ppc64, i386, ia64, x86_64, sparc, alpha.
    It has been built, booted and tested for cpuset functionality
    on an SN2 (ia64).

    Dinakar or Nick - could you verify that it does, for sure, avoid
    the problems Hawkes reported? Hawkes is out of town, and I don't
    have the recipe to reproduce what he found.

    Signed-off-by: Paul Jackson
    Acked-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • The partial disabling of Dinakar's new facility to allow
    cpu_exclusive cpusets to define dynamic sched domains
    doesn't go far enough. At the suggestion of Nick Piggin
    and Dinakar, let us instead totally disable this facility
    for 2.6.13, in order to avoid problems first reported
    by John Hawkes (corrupt sched data structures and kernel oops).

    This patch removes the partial disabling code in 2.6.13-rc7,
    in anticipation of the next patch, which will totally disable
    it instead.

    Signed-off-by: Paul Jackson
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

25 Aug, 2005

1 commit

  • As reported by Paul Mackerras, the previous patch
    "cpu_exclusive sched domains fix" broke the ppc64 build with
    CONFIG_CPUSET, yielding error messages:

    kernel/cpuset.c: In function 'update_cpu_domains':
    kernel/cpuset.c:648: error: invalid lvalue in unary '&'
    kernel/cpuset.c:648: error: invalid lvalue in unary '&'

    On some arch's, node_to_cpumask() is a function returning
    a cpumask_t. But for_each_cpu_mask() requires an lvalue mask.

    The following patch fixes this build failure by making a copy
    of the cpumask_t on the stack.

    Signed-off-by: Paul Jackson
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

24 Aug, 2005

2 commits

  • This keeps the kernel/cpuset.c routine update_cpu_domains() from
    invoking the sched.c routine partition_sched_domains() if the cpuset in
    question doesn't fall on node boundaries.

    I have boot tested this on an SN2, and with the help of a couple of ad
    hoc printk's, determined that it does indeed avoid calling the
    partition_sched_domains() routine on partial nodes.

    I did not directly verify that this avoids setting up bogus sched
    domains or avoids the oops that Hawkes saw.

    This patch imposes a silent artificial constraint on which cpusets can
    be used to define dynamic sched domains.

    This patch should allow proceeding with this new feature in 2.6.13 for
    the configurations in which it is useful (node-aligned sched domains)
    while avoiding trying to set up sched domains in the less useful cases
    that can cause the kernel corruption and oops.

    Signed-off-by: Paul Jackson
    Acked-by: Ingo Molnar
    Acked-by: Dinakar Guniguntala
    Acked-by: John Hawkes
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • With CONFIG_PREEMPT && !CONFIG_SMP, it's possible for sys_getppid to
    return a bogus value if the parent's task_struct gets reallocated after
    current->group_leader->real_parent is read:

    asmlinkage long sys_getppid(void)
    {
            int pid;
            struct task_struct *me = current;
            struct task_struct *parent;

            parent = me->group_leader->real_parent;
    RACE HERE => for (;;) {
                    pid = parent->tgid;
    #ifdef CONFIG_SMP
                    {
                            struct task_struct *old = parent;

                            /*
                             * Make sure we read the pid before re-reading the
                             * parent pointer:
                             */
                            smp_rmb();
                            parent = me->group_leader->real_parent;
                            if (old != parent)
                                    continue;
                    }
    #endif
                    break;
            }
            return pid;
    }

    If the process gets preempted at the indicated point, the parent process
    can go ahead and call exit() and then get wait()'d on to reap its
    task_struct. When the preempted process gets resumed, it will not do any
    further checks of the parent pointer on !CONFIG_SMP: it will read the
    bad pid and return.

    So, the same algorithm used when SMP is enabled should be used when
    preempt is enabled, which will recheck ->real_parent in this case.

    Signed-off-by: David Meybohm
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Meybohm
     

19 Aug, 2005

1 commit


18 Aug, 2005

1 commit

  • This bug is quite subtle and only happens in a very interesting
    situation where a real-time threaded process is in the middle of a
    coredump when someone whacks it with a SIGKILL. However, this deadlock
    leaves the system pretty hosed and you have to reboot to recover.

    Not good for real-time priority-preemption applications like our
    telephony application, with 90+ real-time (SCHED_FIFO and SCHED_RR)
    processes, many of them multi-threaded, interacting with each other for
    high volume call processing.

    Acked-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Bhavesh P. Davda
     

11 Aug, 2005

1 commit

  • We have a check in there to make sure that the name won't overflow
    task_struct.comm[], but it's triggering for scsi with lots of HBAs, even
    though scsi is using single-threaded workqueues, which don't append the
    "/%d" anyway.

    All too hard. Just kill the BUG_ON.

    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton

    [ kthread_create() uses vsnprintf() and limits the thing, so no
    actual overflow can actually happen regardless ]

    Signed-off-by: Linus Torvalds

    James Bottomley
     

10 Aug, 2005

1 commit

  • Fix possible cpuset_sem ABBA deadlock if 'notify_on_release' set.

    For a particular usage pattern, creating and destroying cpusets fairly
    frequently using notify_on_release, on a very large system, this deadlock
    can be seen every few days. If you are not using the cpuset
    notify_on_release feature, you will never see this deadlock.

    The existing code, on task exit (or cpuset deletion) did:

    get cpuset_sem
    if cpuset marked notify_on_release and is ready to release:
        compute cpuset path relative to /dev/cpuset mount point
        call_usermodehelper() forks /sbin/cpuset_release_agent with path
    drop cpuset_sem

    Unfortunately, the fork in call_usermodehelper can allocate memory, and
    allocating memory can require cpuset_sem, if the mems_generation values
    changed in the interim. This results in an ABBA deadlock, trying to obtain
    cpuset_sem when it is already held by the current task.

    To fix this, I put the cpuset path (which must be computed while holding
    cpuset_sem) in a temporary buffer, to be used in the call_usermodehelper
    call of /sbin/cpuset_release_agent only _after_ dropping cpuset_sem.

    So the new logic is:

    get cpuset_sem
    if cpuset marked notify_on_release and is ready to release:
        compute cpuset path relative to /dev/cpuset mount point
        stash path in kmalloc'd buffer
    drop cpuset_sem
    call_usermodehelper() forks /sbin/cpuset_release_agent with path
    free path

    The sharp eyed reader might notice that this patch does not contain any
    calls to kmalloc. The existing code in the check_for_release() routine was
    already kmalloc'ing a buffer to hold the cpuset path. In the old code, it
    just held the buffer for a few lines, over the cpuset_release_agent() call
    that in turn invoked call_usermodehelper(). In the new code, with the
    application of this patch, it returns that buffer via the new char
    **ppathbuf parameter, for later use and freeing in cpuset_release_agent(),
    which is called after cpuset_sem is dropped. Whereas the old code has just
    one call to cpuset_release_agent(), right in the check_for_release()
    routine, the new code has three calls to cpuset_release_agent(), from the
    various places that a cpuset can be released.

    This patch has been build and booted on SN2, and passed a stress test that
    previously hit the deadlock within a few seconds.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

05 Aug, 2005

1 commit


04 Aug, 2005

1 commit

  • This removes the calls to device_suspend() from the shutdown path that
    were added sometime during 2.6.13-rc*. They aren't working properly on
    a number of configs (I got reports from both ppc powerbook users and x86
    users) causing the system to not shutdown anymore.

    I think it isn't the right approach at the moment anyway. We have
    already a shutdown() callback for the drivers that actually care about
    shutdown and the suspend() code isn't yet in a good enough shape to be
    so much generalized. Also, the semantics of suspend and shutdown are
    slightly different on a number of setups and the way this was patched in
    provides little way for drivers to cleanly differentiate. It should
    have been at least a different message.

    For 2.6.13, I think we should revert to 2.6.12 behaviour and have a
    working suspend back.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

02 Aug, 2005

2 commits

  • The module code assumes no one will ever ask for a per-cpu area more than
    SMP_CACHE_BYTES aligned. However, as these cases show, gcc sometimes
    asks for 32-byte alignment for the per-cpu section of a module, and if
    CONFIG_X86_L1_CACHE_SHIFT is 4, we hit that BUG_ON(). This is obviously an
    unusual combination, as there have been few reports, but better to warn
    than die.

    See:
    http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/0768.html

    And more recently:
    http://bugs.gentoo.org/show_bug.cgi?id=97006

    Signed-off-by: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • This removes sys_set_zone_reclaim() for now. While i'm sure Martin is
    trying to solve a real problem, we must not hard-code an incomplete and
    insufficient approach into a syscall, because syscalls are pretty much
    for eternity. I am quite strongly convinced that this syscall must not
    hit v2.6.13 in its current form.

    Firstly, the syscall lacks basic syscall design: e.g. it allows the
    global setting of VM policy for unprivileged users. (!) [ Imagine an
    Oracle installation and a SAP installation on the same NUMA box fighting
    over the 'optimal' setting for this flag. What will they do? Will they
    try to set the flag to their own preferred value every second or so? ]

    Secondly, it was added based on a single datapoint from Martin:

    http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2

    where Martin characterizes the numbers the following way:

    ' Run-to-run variability for "make -j" is huge, so these numbers aren't
    terribly useful except to see that with reclaim the benchmark still
    finishes in a reasonable amount of time. '

    in other words: the fundamental problem has likely not been solved, only
    a tendential move in the right direction has been observed, and a
    handful of numbers were picked out of a set of hugely variable results,
    without showing the variability data. How much variance is there
    run-to-run?

    I'd really suggest to first walk the walk and see what's needed to get
    stable & predictable kernel compilation numbers on that NUMA box, before
    adding random syscalls to tune a particular aspect of the VM ... which
    approach might not even matter once the whole picture has been analyzed
    and understood!

    The third, most important point is that the syscall exposes VM tuning
    internals in a completely unstructured way. What sense does it make to
    have a _GLOBAL_ per-node setting for 'should we go to another node for
    reclaim'? If anything, it might make sense to do this per-app, via numalib
    or so.

    The change is minimalistic in that it doesn't remove the syscall and the
    underlying infrastructure changes, only the user-visible changes. We
    could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a.k.a.
    /proc/sys/vm/swappiness, but even that looks quite counterproductive
    when the generic approach is that we are trying to reduce the number of
    external factors in the VM balance picture.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

31 Jul, 2005

1 commit


30 Jul, 2005

1 commit


29 Jul, 2005

2 commits

  • (We found this (after a customer complained) and it is in the kernel.org
    kernel. It seems that for CLOCK_MONOTONIC absolute timers and clock_nanosleep
    calls, both the request time and wall_to_monotonic are subtracted prior to
    the normalize, resulting in an overflow in the existing normalize test.
    This causes the result to be shifted ~4 seconds ahead instead of ~2 seconds
    back in time.)

    The normalize code in posix-timers.c fails when the tv_nsec member is ~1.2
    seconds negative. This can happen on absolute timers (and
    clock_nanosleeps) requested on CLOCK_MONOTONIC (both the request time and
    wall_to_monotonic are subtracted resulting in the possibility of a number
    close to -2 seconds.)

    This fix uses the set_normalized_timespec() (which does not have an
    overflow problem) to fix the problem and as a side effect makes the code
    cleaner.

    Signed-off-by: George Anzinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George Anzinger
     
  • This avoids some potential stack overflows with very deep softirq callchains.
    i386 does this too.

    TOADD CFI annotation

    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

28 Jul, 2005

8 commits

  • My fairly ordinary x86 test box gets stuck during reboot on the
    wait_for_completion() in ide_do_drive_cmd():

    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • `gcc -W' likes to complain if the static keyword is not at the beginning of
    the declaration. This patch fixes all remaining occurrences of "inline
    static" up with "static inline" in the entire kernel tree (140 occurrences in
    47 files).

    While making this change I came across a few lines with trailing whitespace
    that I also fixed up, I have also added or removed a blank line or two here
    and there, but there are no functional changes in the patch.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • Add kerneldoc to kernel/crash_dump.c

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Add kerneldoc to kernel/cpuset.c

    Fix cpuset typos in init/Kconfig

    Signed-off-by: Randy Dunlap
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Add kerneldoc to kernel/capability.c

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Split spin lock and r/w lock implementation into a single try which is done
    inline and an out of line function that repeatedly tries to get the lock
    before doing the cpu_relax(). Add a system control to set the number of
    retries before a cpu is yielded.

    The reason for the spin lock retry is that the diagnose 0x44 that is used to
    give up the virtual cpu is quite expensive. For spin locks that are held only
    for a short period of time, the cost of the diagnose outweighs the savings;
    it only pays off for spin locks that are held for a longer time. The default
    retry count is 1000.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • Fix the recent off-by-one fix in the itimer code:

    1. The repeating timer is figured using the requested time
    (not +1, as we know where we are in the jiffie).

    2. The tests for interval too large are left to the timeval-to-jiffie code.

    Signed-off-by: George Anzinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    George Anzinger
     
  • This patch fixes a warning in the disable_nonboot_cpus call in
    kernel/power/smp.c.

    Signed-off-by: Nigel Cunningham

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     

27 Jul, 2005

3 commits

  • Here's the patch again to fix the code to handle if the values between
    MAX_USER_RT_PRIO and MAX_RT_PRIO are different.

    Without this patch, an SMP system will crash if the values are
    different.

    Signed-off-by: Steven Rostedt
    Cc: Ingo Molnar
    Signed-off-by: Dean Nelson
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     
  • RLIMIT_RTPRIO is supposed to grant non-privileged users the right to use
    SCHED_FIFO/SCHED_RR scheduling policies, with priorities bounded by the
    RLIMIT_RTPRIO value, via sched_setscheduler(). This is usually used by
    audio users.

    Unfortunately this is broken in 2.6.13rc3 as you can see in the excerpt
    from sched_setscheduler below:

    /*
     * Allow unprivileged RT tasks to decrease priority:
     */
    if (!capable(CAP_SYS_NICE)) {
            /* can't change policy */
            if (policy != p->policy)
                    return -EPERM;

    After the above unconditional test which causes sched_setscheduler to
    fail with no regard to the RLIMIT_RTPRIO value the following check is made:

    /* can't increase priority */
    if (policy != SCHED_NORMAL &&
        param->sched_priority > p->rt_priority &&
        param->sched_priority >
                p->signal->rlim[RLIMIT_RTPRIO].rlim_cur)
            return -EPERM;

    Thus I do believe that the RLIMIT_RTPRIO value must be taken into
    account for the policy check, especially as the RLIMIT_RTPRIO limit is
    of no use without this change.

    The attached patch fixes this problem.

    Signed-off-by: Andreas Steinmetz
    Acked-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Andreas Steinmetz
     
  • The suspend-to-disk code was a poor copy of the code in
    sys_reboot. Now that we have kernel_power_off, kernel_restart
    and kernel_halt, use them instead of poorly duplicating them inline.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman