27 Jul, 2008
2 commits
-
This patch implements devices state save/restore before after kexec.
This patch together with features in kexec_jump patch can be used for
following:- A simple hibernation implementation without ACPI support. You can kexec a
hibernating kernel, save the memory image of original system and shutdown
the system. When resuming, you restore the memory image of original system
via ordinary kexec load then jump back.- Kernel/system debug through making system snapshot. You can make system
snapshot, jump back, do some thing and make another system snapshot.- Cooperative multi-kernel/system. With kexec jump, you can switch between
several kernels/systems quickly without boot process except the first time.
This appears like swap a whole kernel/system out/in.- A general method to call program in physical mode (paging turning
off). This can be used to invoke BIOS code under Linux.The following user-space tools can be used with kexec jump:
- kexec-tools needs to be patched to support kexec jump. The patches
and the precompiled kexec can be download from the following URL:
source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10- makedumpfile with patches are used as memory image saving tool, it
can exclude free pages from original kernel memory image file. The
patches and the precompiled makedumpfile can be download from the
following URL:
source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10- An initramfs image can be used as the root file system of kexeced
kernel. An initramfs image built with "BuildRoot" can be downloaded
from the following URL:
initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
All user space tools above are included in the initramfs image.Usage example of simple hibernation:
1. Compile and install patched kernel with following options selected:
CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y
CONFIG_HIBERNATION=y
CONFIG_KEXEC_JUMP=y2. Build an initramfs image contains kexec-tool and makedumpfile, or
download the pre-built initramfs image, called rootfs.gz in
following text.3. Prepare a partition to save memory image of original kernel, called
hibernating partition in following text.4. Boot kernel compiled in step 1 (kernel A).
5. In the kernel A, load kernel compiled in step 1 (kernel B) with
/sbin/kexec. The shell command line can be as follow:/sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000
--mem-max=0xffffff --initrd=rootfs.gz6. Boot the kernel B with following shell command line:
/sbin/kexec -e
7. The kernel B will boot as normal kexec. In kernel B the memory
image of kernel A can be saved into hibernating partition as
follow:jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='`
echo $jump_back_entry > kexec_jump_back_entry
cp /proc/vmcore dump.elfThen you can shutdown the machine as normal.
8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
root file system.9. In kernel C, load the memory image of kernel A as follow:
/sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf
10. Jump back to the kernel A as follow:
/sbin/kexec -e
Then, kernel A is resumed.
Implementation point:
To support jumping between two kernels, before jumping to (executing)
the new kernel and jumping back to the original kernel, the devices
are put into quiescent state, and the state of devices and CPU is
saved. After jumping back from kexeced kernel and jumping to the new
kernel, the state of devices and CPU are restored accordingly. The
devices/CPU state save/restore code of software suspend is called to
implement corresponding function.Known issues:
- Because the segment number supported by sys_kexec_load is limited,
hibernation image with many segments may not be load. This is
planned to be eliminated by adding a new flag to sys_kexec_load to
make a image can be loaded with multiple sys_kexec_load invoking.Now, only the i386 architecture is supported.
Signed-off-by: Huang Ying
Acked-by: Vivek Goyal
Cc: "Eric W. Biederman"
Cc: Pavel Machek
Cc: Nigel Cunningham
Cc: "Rafael J. Wysocki"
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Cope with a quirk of some RTCs (notably ACPI ones) which aren't guaranteed
to implement oneshot behavior when they woke the system from sleeep:
forcibly disable the alarm, just in case.Signed-off-by: David Brownell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Jul, 2008
4 commits
-
Fix try_to_freeze_tasks()'s use of do_div() on an s64 by making
elapsed_csecs64 a u64 instead and dividing that.Possibly this should be guarded lest the interval calculation turn up
negative, but the possible negativity of the result of the division is
cast away anyway.This was introduced by patch 438e2ce68dfd4af4cfcec2f873564fb921db4bb5.
Signed-off-by: David Howells
Acked-by: "Rafael J. Wysocki"
Acked-by: Pavel Machek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
schedule sysrq poweroff on boot cpu.
sysrq poweroff needs to disable nonboot cpus, and we need to run this on boot
cpu to avoid any recursion. http://bugzilla.kernel.org/show_bug.cgi?id=10897[kosaki.motohiro@jp.fujitsu.com: build fix]
Signed-off-by: Zhang Rui
Tested-by: Rus
Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch simplifies the memory bitmap manipulations.
- remove the member size in struct bm_block
It is not necessary for struct bm_block to have the number of bit chunks that
can be calculated by using end_pfn and start_pfn.- use find_next_bit() for memory_bm_next_pfn
No need to invent the bitmap library only for the memory bitmap.
Signed-off-by: Akinobu Mita
Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Boot-time test for system suspend states (STR or standby). The generic
RTC framework triggers wakeup alarms, which are used to exit those states.- Measures some aspects of suspend time ... this uses "jiffies" until
someone converts it to use a timebase that works properly even while
timer IRQs are disabled.- Triggered by a command line parameter. By default nothing even
vaguely troublesome will happen, but "test_suspend=mem" will give
you a brief STR test during system boot. (Or you may need to use
"test_suspend=standby" instead, if your hardware needs that.)This isn't without problems. It fires early enough during boot that for
example both PCMCIA and MMC stacks have misbehaved. The workaround in
those cases was to boot without such media cards inserted.[matthltc@us.ibm.com: fix compile failure in boot time suspend selftest]
Signed-off-by: David Brownell
Cc: Ingo Molnar
Cc: Pavel Machek
Cc: "Rafael J. Wysocki"
Signed-off-by: Matt Helsley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Jul, 2008
2 commits
-
Xen save/restore needs bits of code enabled by PM_SLEEP, and PM_SLEEP
depends on PM. So make XEN_SAVE_RESTORE depend on PM and PM_SLEEP
depend on XEN_SAVE_RESTORE.Signed-off-by: Jeremy Fitzhardinge
Acked-by: Rafael J. Wysocki
Signed-off-by: Ingo Molnar
17 Jul, 2008
4 commits
-
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
Revert "x86/PCI: ACPI based PCI gap calculation"
PCI: remove unnecessary volatile in PCIe hotplug struct controller
x86/PCI: ACPI based PCI gap calculation
PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
PCI PM: Fix pci_prepare_to_sleep
x86/PCI: Fix PCI config space for domains > 0
Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
PCI: Simplify PCI device PM code
PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
PCI ACPI: Rework PCI handling of wake-up
ACPI: Introduce new device wakeup flag 'prepared'
ACPI: Introduce acpi_device_sleep_wake function
PCI: rework pci_set_power_state function to call platform first
PCI: Introduce platform_pci_power_manageable function
ACPI: Introduce acpi_bus_power_manageable function
PCI: make pci_name use dev_name
PCI: handle pci_name() being const
PCI: add stub for pci_set_consistent_dma_mask()
PCI: remove unused arch pcibios_update_resource() functions
PCI: fix pci_setup_device()'s sprinting into a const buffer
...Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually. -
We can avoid taking the BKL in snapshot_ioctl() if pm_mutex is used to prevent
the ioctls from being executed concurrently.In addition, although it is only possible to open /dev/snapshot once, the task
which has done that may spawn a child that will inherit the open descriptor,
so in theory they can call snapshot_write(), snapshot_read() and
snapshot_release() concurrently. pm_mutex can also be used for mutual
exclusion in such cases.Signed-off-by: Rafael J. Wysocki
Signed-off-by: Andi Kleen
Acked-by: Pavel Machek
Signed-off-by: Len Brown -
Push BKL down into ioctl handlers - snapshot device.
Signed-off-by: Alan Cox
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown
Signed-off-by: Andi Kleen -
The freezer currently attempts to distinguish kernel threads from
user space tasks by checking if their mm pointer is unset and it
does not send fake signals to kernel threads. However, there are
kernel threads, mostly related to networking, that behave like
user space tasks and may want to be sent a fake signal to be frozen.Introduce the new process flag PF_FREEZER_NOSIG that will be set
by default for all kernel threads and make the freezer only send
fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide
the set_freezable_with_signal() function to be called by the kernel
threads that want to be sent a fake signal for freezing.This patch should not change the freezer's observable behavior.
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Andi Kleen
Acked-by: Pavel Machek
Signed-off-by: Len Brown
16 Jul, 2008
2 commits
-
This reverts commit 6fbbec428c8e7bb617da2e8a589af2e97bcf3bc4.
Rafael doesnt like it - it breaks various assumptions.
Signed-off-by: Ingo Molnar
-
Xen save/restore requires PM_SLEEP to be set without requiring
SUSPEND or HIBERNATION.Signed-off-by: Jeremy Fitzhardinge
Cc: Stephen Tweedie
Cc: Eduardo Habkost
Cc: Mark McLoughlin
Signed-off-by: Ingo Molnar
13 Jun, 2008
1 commit
-
ACPI PM: Add possibility to change suspend sequence
There are some systems out there that don't work correctly with
our current suspend/hibernation code ordering. Provide a workaround
for these systems allowing them to pass 'acpi_sleep=old_ordering' in
the kernel command line so that it will use the pre-ACPI 2.0 ("old")
suspend code ordering.Unfortunately, this requires us to add a platform hook to the
resuming of devices for recovering the platform in case one of the
device drivers' .suspend() routines returns error code. Namely,
ACPI 1.0 specifies that _PTS should be called before suspending
devices, but _WAK still should be called before resuming them in
order to undo the changes made by _PTS. However, if there is an
error during suspending devices, they are automatically resumed
without returning control to the PM core, so the _WAK has to be
called from within device_resume() in that cases.The patch also reorders and refactors the ACPI suspend/hibernation
code to avoid duplication as far as reasonably possible.Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Jesse Barnes
11 Jun, 2008
1 commit
-
Introduce 'struct pm_ops' and 'struct pm_ext_ops' ('ext' meaning
'extended') representing suspend and hibernation operations for bus
types, device classes, device types and device drivers.Modify the PM core to use 'struct pm_ops' and 'struct pm_ext_ops'
objects, if defined, instead of the ->suspend(), ->resume(),
->suspend_late(), and ->resume_early() callbacks (the old callbacks
will be considered as legacy and gradually phased out).The main purpose of doing this is to separate suspend (aka S2RAM and
standby) callbacks from hibernation callbacks in such a way that the
new callbacks won't take arguments and the semantics of each of them
will be clearly specified. This has been requested for multiple
times by many people, including Linus himself, and the reason is that
within the current scheme if ->resume() is called, for example, it's
difficult to say why it's been called (ie. is it a resume from RAM or
from hibernation or a suspend/hibernation failure etc.?).The second purpose is to make the suspend/hibernation callbacks more
flexible so that device drivers can handle more than they can within
the current scheme. For example, some drivers may need to prevent
new children of the device from being registered before their
->suspend() callbacks are executed or they may want to carry out some
operations requiring the availability of some other devices, not
directly bound via the parent-child relationship, in order to prepare
for the execution of ->suspend(), etc.Ultimately, we'd like to stop using the freezing of tasks for suspend
and therefore the drivers' suspend/hibernation code will have to take
care of the handling of the user space during suspend/hibernation.
That, in turn, would be difficult within the current scheme, without
the new ->prepare() and ->complete() callbacks.Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Jesse Barnes
01 May, 2008
1 commit
-
…-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release
28 Apr, 2008
1 commit
-
Prior to suspend, we allocate and switch to a new VT; after suspend, we switch
back to the original VT. This can be slow, and is completely unnecessary if
the framebuffer we're using can restore video properly.This adds a hook that allows drivers to select whether or not to do this vt
switch, and changes the gxfb driver to call this hook. It also adds a module
param to gxfb to allow controlling of the vt switch (defaulting to no switch).(Note: I'm not convinced that console_sem is the best way to protect this, but
we should probably have some form of locking..)[akpm@linux-foundation.org: build fix]
Signed-off-by: Andres Salomon
Cc: Jordan Crouse
Cc: "Antonino A. Daplas"
Cc: Pavel Machek
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Apr, 2008
1 commit
-
AFAICT pm_send_all is a nop when noone uses pm_register...
Hmm.. can we just force CONFIG_PM_LEGACY=n, and see what happens?
Or maybe this is better idea? It may break build somewhere, but it
should be easy to fix... (it builds here, i386 and x86-64).Signed-off-by: Pavel Machek
Acked-by: Ralf Baechle
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown
13 Mar, 2008
2 commits
-
Move 00-INDEX entries to power/00-INDEX (and add entry for
pm_qos_interface.txt).Update references to moved filenames.
Fix some trailing whitespace.
Signed-off-by: Randy Dunlap
Signed-off-by: Len Brown
12 Mar, 2008
1 commit
-
There is a problem in the hibernation code that triggers on some NUMA
systems on which pfn_valid() returns 'true' for some PFNs that don't
belong to any zone. Namely, there is a BUG_ON() in
memory_bm_find_bit() that triggers for PFNs not belonging to any
zone and passing the pfn_valid() test. On the affected systems it
triggers when we mark PFNs reported by the platform as not saveable,
because the PFNs in question belong to a region mapped directly using
iorepam() (i.e. the ACPI data area) and they pass the pfn_valid()
test.Modify memory_bm_find_bit() so that it returns an error if given PFN
doesn't belong to any zone instead of crashing the kernel and ignore
the result returned by it in mark_nosave_pages(), while marking the
"nosave" memory regions.This doesn't affect the hibernation functionality, as we won't touch
the PFNs in question anyway.http://bugzilla.kernel.org/show_bug.cgi?id=9966 .
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown
04 Mar, 2008
1 commit
-
This changes the "freezer" code used by suspend/hibernate in its treatment
of tasks in TASK_STOPPED (job control stop) and TASK_TRACED (ptrace) states.As I understand it, the intent of the "freezer" is to hold all tasks
from doing anything significant. For this purpose, TASK_STOPPED and
TASK_TRACED are "frozen enough". It's possible the tasks might resume
from ptrace calls (if the tracer were unfrozen) or from signals
(including ones that could come via timer interrupts, etc). But this
doesn't matter as long as they quickly block again while "freezing" is
in effect. Some minor adjustments to the signal.c code make sure that
try_to_freeze() very shortly follows all wakeups from both kinds of
stop. This lets the freezer code safely leave stopped tasks unmolested.Changing this fixes the longstanding bug of seeing after resuming from
suspend/hibernate your shell report "[1] Stopped" and the like for all
your jobs stopped by ^Z et al, as if you had freshly fg'd and ^Z'd them.
It also removes from the freezer the arcane special case treatment for
ptrace'd tasks, which relied on intimate knowledge of ptrace internals.Signed-off-by: Roland McGrath
Signed-off-by: Linus Torvalds
24 Feb, 2008
1 commit
-
During the last step of hibernation in the "platform" mode (with the
help of ACPI) we use the suspend code, including the devices'
->suspend() methods, to prepare the system for entering the ACPI S4
system sleep state.But at least for some devices the operations performed by the
->suspend() callback in that case must be different from its operations
during regular suspend.For this reason, introduce the new PM event type PM_EVENT_HIBERNATE and
pass it to the device drivers' ->suspend() methods during the last phase
of hibernation, so that they can distinguish this case and handle it as
appropriate. Modify the drivers that handle PM_EVENT_SUSPEND in a
special way and need to handle PM_EVENT_HIBERNATE in the same way.These changes are necessary to fix a hibernation regression related
to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Tested-by: Jeff Chua
Signed-off-by: Linus Torvalds
21 Feb, 2008
1 commit
-
Make hibernation work with CONFIG_DEBUG_PAGEALLOC set on x86, by
checking if the pages to be copied are marked as present in the
kernel mapping and temporarily marking them as present if that's not
the case. No functional modifications are introduced if
CONFIG_DEBUG_PAGEALLOC is unset.Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown
07 Feb, 2008
1 commit
-
Signed-off-by: Pavel Machek
Acked-by: Rafael J. Wysocki
Signed-off-by: Len Brown
06 Feb, 2008
2 commits
-
resume_file[] and create_image() can become static.
Signed-off-by: Adrian Bunk
Acked-by: Pavel Machek
Cc: "Rafael J. Wysocki"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- Add comments explaing how drain_pages() works.
- Eliminate useless functions
- Rename drain_all_local_pages to drain_all_pages(). It does drain
all pages not only those of the local processor.- Eliminate useless interrupt off / on sequences. drain_pages()
disables interrupts on its own. The execution thread is
pinned to processor by the caller. So there is no need to
disable interrupts.- Put drain_all_pages() declaration in gfp.h and remove the
declarations from suspend.h and from mm/memory_hotplug.c- Make software suspend call drain_all_pages(). The draining
of processor local pages is may not the right approach if
software suspend wants to support SMP. If they call drain_all_pages
then we can make drain_pages() static.[akpm@linux-foundation.org: fix build]
Signed-off-by: Christoph Lameter
Acked-by: Mel Gorman
Cc: "Rafael J. Wysocki"
Cc: Daniel Walker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Feb, 2008
12 commits
-
Following the recent change in the suspend code path, switch consoles before
calling PM notifiers during hibernation.Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
In order to fix APM emulation it is necessary to enable apm-emulation
notifications for suspends triggered in various ways via the suspend
notifiers. However, this will cause the systems using APM emulation
to lock up between X being needed to switch away from the VT and X
already waiting for resume in the APM ioctl.This patch moves the console switch (if enabled) before the suspend
notification (and after the resume notification) to avoid this issue.Signed-off-by: Johannes Berg
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
This patch makes the freezer optional for suspend to allow the
system to work (or not work) like the original PMU suspend.Signed-off-by: Johannes Berg
Acked-by: Pavel Machek
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
Introduce global hibernation callback .end() and rename global
hibernation callback .start() to .begin(), in analogy with the
recent modifications of the global suspend callbacks.Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
On ACPI systems the target state set by acpi_pm_set_target() is
reset by acpi_pm_finish(), but that need not be called if the
suspend fails. All platforms that use the .set_target() global
suspend callback are affected by analogous issues.For this reason, we need an additional global suspend callback that
will reset the target state regardless of whether or not the suspend
is successful. Also, it is reasonable to rename the .set_target()
callback, since it will be used for a different purpose on ACPI
systems (due to ACPI 1.0x code ordering requirements).Introduce the global suspend callback .end() to be executed at the
end of the suspend sequence and rename the .set_target() global
suspend callback to .begin().Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
kernel/power/main.c:488: error: ‘pm_test_attr’ undeclared here (not in a function)
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
This cleans up the suspend Kconfig and removes the need to
declare centrally which architectures support suspend. All
architectures that currently support suspend are modified
accordingly.Signed-off-by: Johannes Berg
Acked-by: Russell King
Acked-by: Paul Mackerras
Acked-by: Ralf Baechle
Acked-by: Paul Mundt
Cc: Pavel Machek
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
This cleans up the hibernation Kconfig and removes the need to
declare centrally which architectures support hibernation. All
architectures that currently support hibernation are modified
accordingly.Signed-off-by: Johannes Berg
Acked-by: Paul Mackerras
Cc: Pavel Machek
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Len Brown -
Make hibernation messages start with one common prefix "PM: " and use
the word "hibernation" in the messages as a synonym of "suspend to
disk".Turn some KERN_INFO messages into debug ones.
Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Len Brown -
Make suspend messages start with one common prefix "PM: ".
Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Len Brown -
Remove the unnecessary extern declaration of resume_file[]
from kernel/power/swap.c .Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Len Brown -
Fix a comment in kernel/power/disk.c so that it doesn't contain lines
longer that 80 characters.Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Len Brown