18 Mar, 2015

1 commit

  • Occasionally, the system can't come back up after suspend/resume
    due to problems of device suspending phase. This patch make
    PM_TRACE infrastructure cover device suspending phase of
    suspend/resume process, and the information in RTC can tell
    developers which device suspending function make system hang.

    Signed-off-by: Zhonghui Fu
    Signed-off-by: Rafael J. Wysocki

    Zhonghui Fu
     

28 Jul, 2014

1 commit


23 Jul, 2014

1 commit


21 Jul, 2014

1 commit


17 Jun, 2014

1 commit

  • To support using kernel features that are not compatible with hibernation,
    this creates the "nohibernate" kernel boot parameter to disable both
    hibernation and resume. This allows hibernation support to be a boot-time
    choice instead of only a compile-time choice.

    Signed-off-by: Kees Cook
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Kees Cook
     

26 May, 2014

3 commits

  • On some systems the platform doesn't support neither
    PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the
    only available system sleep state. However, some user space frameworks
    only use the "mem" and (sometimes) "standby" sleep state labels, so
    the users of those systems need to modify user space in order to be
    able to use system suspend at all and that is not always possible.

    For this reason, add a new kernel command line argument,
    relative_sleep_states, allowing the users of those systems to change
    the way in which the kernel assigns labels to system sleep states.
    Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze"
    labels will enumerate the available system sleem states from the
    deepest to the shallowest, respectively, so that "mem" is always
    present in /sys/power/state and the other state strings may or may
    not be presend depending on what is supported by the platform.

    Update system sleep states documentation to reflect this change.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Use the observation that, for platform-dependent sleep states
    (PM_SUSPEND_STANDBY, PM_SUSPEND_MEM), a given state is either
    always supported or always unsupported and store that information
    in pm_states[] instead of calling valid_state() every time we
    need to check it.

    Also do not use valid_state() for PM_SUSPEND_FREEZE, which is always
    valid, and move the pm_test_level validity check for PM_SUSPEND_FREEZE
    directly into enter_state().

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • To allow sleep states corresponding to the "mem", "standby" and
    "freeze" lables to be different from the pm_states[] indexes of
    those strings, introduce struct pm_sleep_state, consisting of
    a string label and a state number, and turn pm_states[] into an
    array of objects of that type.

    This modification should not lead to any functional changes.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

12 Mar, 2014

1 commit


28 Jun, 2013

1 commit

  • pm_trace uses the system's Real Time Clock (RTC) to save the magic
    number. The reason for this is that the RTC is the only reliably
    available piece of hardware during resume operations where a value
    can be set that will survive a reboot.

    Consequence is that after a resume (even if it is successful) your
    system clock will have a value corresponding to the magic number
    instead of the correct date/time! It is therefore advisable to use
    a program like ntp-date or rdate to reset the correct date/time from
    an external time source when using this trace option.

    There is no run-time message to warn users of the consequences of
    enabling pm_trace. Adding a warning message to pm_trace_store()
    will serve as a reminder to users to set the system date and time
    after resume.

    Signed-off-by: Shuah Khan
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Shuah Khan
     

21 Jun, 2013

1 commit

  • Commit a938da06 introduced a useful little log message to tell
    users/debuggers which wakeup source aborted a suspend. However,
    this message is only printed if the abort happens during the
    in-kernel suspend path (after writing /sys/power/state).

    The full specification of the /sys/power/wakeup_count facility
    allows user-space power managers to double-check if wakeups have
    already happened before it actually tries to suspend (e.g. while it
    was running user-space pre-suspend hooks), by writing the last known
    wakeup_count value to /sys/power/wakeup_count. This patch changes
    the sysfs handler for that node to also print said log message if
    that write fails, so that we can figure out the offending wakeup
    source for both kinds of suspend aborts.

    Signed-off-by: Julius Werner
    Signed-off-by: Rafael J. Wysocki

    Julius Werner
     

10 Feb, 2013

2 commits

  • At present, the value of timeout for freezing is 20s, which is
    meaningless in case that one thread is frozen with mutex locked
    and another thread is trying to lock the mutex, as this time of
    freezing will fail unavoidably.
    And if there is no new wakeup event registered, the system will
    waste at most 20s for such meaningless trying of freezing.

    With this patch, the value of timeout can be configured to smaller
    value, so such meaningless trying of freezing will be aborted in
    earlier time, and later freezing can be also triggered in earlier
    time. And more power will be saved.
    In normal case on mobile phone, it costs real little time to freeze
    processes. On some platform, it only costs about 20ms to freeze
    user space processes and 10ms to freeze kernel freezable threads.

    Signed-off-by: Liu Chuansheng
    Signed-off-by: Li Fei
    Signed-off-by: Rafael J. Wysocki

    Li Fei
     
  • PM_SUSPEND_FREEZE state is a general state that
    does not need any platform specific support, it equals
    frozen processes + suspended devices + idle processors.

    Compared with PM_SUSPEND_MEMORY,
    PM_SUSPEND_FREEZE saves less power
    because the system is still in a running state.
    PM_SUSPEND_FREEZE has less resume latency because it does not
    touch BIOS, and the processors are in idle state.

    Compared with RTPM/idle,
    PM_SUSPEND_FREEZE saves more power as
    1. the processor has longer sleep time because processes are frozen.
    The deeper c-state the processor supports, more power saving we can get.
    2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
    more power saving from the devices that does not have good RTPM support.

    This state is useful for
    1) platforms that do not have STR, or have a broken STR.
    2) platforms that have an extremely low power idle state,
    which can be used to replace STR.

    The following describes how PM_SUSPEND_FREEZE state works.
    1. echo freeze > /sys/power/state
    2. the processes are frozen.
    3. all the devices are suspended.
    4. all the processors are blocked by a wait queue
    5. all the processors idles and enters (Deep) c-state.
    6. an interrupt fires.
    7. a processor is woken up and handles the irq.
    8. if it is a general event,
    a) the irq handler runs and quites.
    b) goto step 4.
    9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
    a) the irq handler runs and activate the wakeup source
    b) wakeup_source_activate() notifies the wait queue.
    c) system starts resuming from PM_SUSPEND_FREEZE
    10. all the devices are resumed.
    11. all the processes are unfrozen.
    12. system is back to working.

    Known Issue:
    The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
    from the previous suspend state.
    Take ACPI platform for example, there are some GPEs that only enabled
    when the system is in sleep state, to wake the system backk from S3/S4.
    But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
    This means we may lose some wake event.
    But on the other hand, as we do not disable all the Interrupts during
    PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
    not available for S3/S4.

    The patches has been tested on an old Sony laptop, and here are the results:

    Average Power:
    1. RPTM/idle for half an hour:
    14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
    2. Freeze for half an hour:
    11W, 10.4W, 9.4W, 11.3W 10.5W
    3. RTPM/idle for three hours:
    11.6W
    4. Freeze for three hours:
    10W
    5. Suspend to Memory:
    0.5~0.9W

    Average Resume Latency:
    1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
    Less than 0.2s
    2. Freeze: (From pressing power button to screen back)
    2.50s
    3. Suspend to Memory: (From pressing power button to screen back)
    4.33s

    >From the results, we can see that all the platforms should benefit from
    this patch, even if it does not have Low Power S0.

    Signed-off-by: Zhang Rui
    Signed-off-by: Rafael J. Wysocki

    Zhang Rui
     

15 Nov, 2012

1 commit


01 Jul, 2012

2 commits

  • Change the behavior of the newly introduced
    /sys/power/pm_print_times attribute so that its initial value
    depends on initcall_debug, but setting it to 0 will cause device
    suspend/resume times not to be printed, even if initcall_debug has
    been set. This way, the people who use initcall_debug for reasons
    other than PM debugging will be able to switch the suspend/resume
    times printing off, if need be.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Srivatsa S. Bhat
    Acked-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • Added a new knob called /sys/power/pm_print_times. Setting it to 1
    enables printing of time taken by devices to suspend and resume.
    Setting it to 0 disables this printing (unless overridden by
    initcall_debug kernel command line option).

    Signed-off-by: Sameer Nanda
    Acked-by: Greg KH
    Signed-off-by: Rafael J. Wysocki

    Sameer Nanda
     

06 May, 2012

1 commit


02 May, 2012

2 commits

  • Android allows user space to manipulate wakelocks using two
    sysfs file located in /sys/power/, wake_lock and wake_unlock.
    Writing a wakelock name and optionally a timeout to the wake_lock
    file causes the wakelock whose name was written to be acquired (it
    is created before is necessary), optionally with the given timeout.
    Writing the name of a wakelock to wake_unlock causes that wakelock
    to be released.

    Implement an analogous interface for user space using wakeup sources.
    Add the /sys/power/wake_lock and /sys/power/wake_unlock files
    allowing user space to create, activate and deactivate wakeup
    sources, such that writing a name and optionally a timeout to
    wake_lock causes the wakeup source of that name to be activated,
    optionally with the given timeout. If that wakeup source doesn't
    exist, it will be created and then activated. Writing a name to
    wake_unlock causes the wakeup source of that name, if there is one,
    to be deactivated. Wakeup sources created with the help of
    wake_lock that haven't been used for more than 5 minutes are garbage
    collected and destroyed. Moreover, there can be only WL_NUMBER_LIMIT
    wakeup sources created with the help of wake_lock present at a time.

    The data type used to track wakeup sources created by user space is
    called "struct wakelock" to indicate the origins of this feature.

    This version of the patch includes an rbtree manipulation fix from John Stultz.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman
    Reviewed-by: NeilBrown

    Rafael J. Wysocki
     
  • Introduce a mechanism by which the kernel can trigger global
    transitions to a sleep state chosen by user space if there are no
    active wakeup sources.

    It consists of a new sysfs attribute, /sys/power/autosleep, that
    can be written one of the strings returned by reads from
    /sys/power/state, an ordered workqueue and a work item carrying out
    the "suspend" operations. If a string representing the system's
    sleep state is written to /sys/power/autosleep, the work item
    triggering transitions to that state is queued up and it requeues
    itself after every execution until user space writes "off" to
    /sys/power/autosleep.

    That work item enables the detection of wakeup events using the
    functions already defined in drivers/base/power/wakeup.c (with one
    small modification) and calls either pm_suspend(), or hibernate() to
    put the system into a sleep state. If a wakeup event is reported
    while the transition is in progress, it will abort the transition and
    the "system suspend" work item will be queued up again.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman
    Reviewed-by: NeilBrown

    Rafael J. Wysocki
     

18 Feb, 2012

1 commit


10 Feb, 2012

1 commit


30 Jan, 2012

1 commit

  • The current device suspend/resume phases during system-wide power
    transitions appear to be insufficient for some platforms that want
    to use the same callback routines for saving device states and
    related operations during runtime suspend/resume as well as during
    system suspend/resume. In principle, they could point their
    .suspend_noirq() and .resume_noirq() to the same callback routines
    as their .runtime_suspend() and .runtime_resume(), respectively,
    but at least some of them require device interrupts to be enabled
    while the code in those routines is running.

    It also makes sense to have device suspend-resume callbacks that will
    be executed with runtime PM disabled and with device interrupts
    enabled in case someone needs to run some special code in that
    context during system-wide power transitions.

    Apart from this, .suspend_noirq() and .resume_noirq() were introduced
    as a workaround for drivers using shared interrupts and failing to
    prevent their interrupt handlers from accessing suspended hardware.
    It appears to be better not to use them for other porposes, or we may
    have to deal with some serious confusion (which seems to be happening
    already).

    For the above reasons, introduce new device suspend/resume phases,
    "late suspend" and "early resume" (and analogously for hibernation)
    whose callback will be executed with runtime PM disabled and with
    device interrupts enabled and whose callback pointers generally may
    point to runtime suspend/resume routines.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Mark Brown
    Reviewed-by: Kevin Hilman

    Rafael J. Wysocki
     

09 Dec, 2011

1 commit


24 Nov, 2011

1 commit


19 Nov, 2011

1 commit

  • After commit 2a77c46de1e3dace73745015635ebbc648eca69c
    (PM / Suspend: Add statistics debugfs file for suspend to RAM)
    a missing pair of braces inside the state_store() function causes even
    invalid arguments to suspend to be wrongly treated as failed suspend
    attempts. Fix this.

    [rjw: Put the hash/subject of the buggy commit into the changelog.]

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     

01 Nov, 2011

1 commit


17 Oct, 2011

2 commits

  • Suspend statistics should depend on CONFIG_PM_SLEEP, so make that
    happen.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Record S3 failure time about each reason and the latest two failed
    devices' names in S3 progress.
    We can check it through 'suspend_stats' entry in debugfs.

    The motivation of the patch:

    We are enabling power features on Medfield. Comparing with PC/notebook,
    a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far
    more frequently. If it can't enter suspend-2-ram in time, the power
    might be used up soon.

    We often find sometimes, a device suspend fails. Then, system retries
    s3 over and over again. As display is off, testers and developers
    don't know what happens.

    Some testers and developers complain they don't know if system
    tries suspend-2-ram, and what device fails to suspend. They need
    such info for a quick check. The patch adds suspend_stats under
    debugfs for users to check suspend to RAM statistics quickly.

    If not using this patch, we have other methods to get info about
    what device fails. One is to turn on CONFIG_PM_DEBUG, but users
    would get too much info and testers need recompile the system.

    In addition, dynamic debug is another good tool to dump debug info.
    But it still doesn't match our utilization scenario closely.
    1) user need write a user space parser to process the syslog output;
    2) Our testing scenario is we leave the mobile for at least hours.
    Then, check its status. No serial console available during the
    testing. One is because console would be suspended, and the other
    is serial console connecting with spi or HSU devices would consume
    power. These devices are powered off at suspend-2-ram.

    Signed-off-by: ShuoX Liu
    Signed-off-by: Rafael J. Wysocki

    ShuoX Liu
     

16 Jul, 2011

1 commit


18 May, 2011

1 commit

  • Martin reports that on his system hibernation occasionally fails due
    to the lack of memory, because the radeon driver apparently allocates
    too much of it during the device freeze stage. It turns out that the
    amount of memory allocated by radeon during hibernation (and
    presumably during system suspend too) depends on the utilization of
    the GPU (e.g. hibernating while there are two KDE 4 sessions with
    compositing enabled causes radeon to allocate more memory than for
    one KDE 4 session).

    In principle it should be possible to use image_size to make the
    memory preallocation mechanism free enough memory for the radeon
    driver, but in practice it is not easy to guess the right value
    because of the way the preallocation code uses image_size. For this
    reason, it seems reasonable to allow users to control the amount of
    memory reserved for driver allocations made after the hibernate
    preallocation, which currently is constant and amounts to 1 MB.

    Introduce a new sysfs file, /sys/power/reserved_size, whose value
    will be used as the amount of memory to reserve for the
    post-preallocation reservations made by device drivers, in bytes.
    For backwards compatibility, set its default (and initial) value to
    the currently used number (1 MB).

    References: https://bugzilla.kernel.org/show_bug.cgi?id=34102
    Reported-and-tested-by: Martin Steigerwald
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

31 Mar, 2011

1 commit


15 Mar, 2011

2 commits

  • The variable pm_flags is used to prevent APM from being enabled
    along with ACPI, which would lead to problems. However, acpi_init()
    is always called before apm_init() and after acpi_init() has
    returned, it is known whether or not ACPI will be used. Namely, if
    acpi_disabled is not set after acpi_init() has returned, this means
    that ACPI is enabled. Thus, it is sufficient to check acpi_disabled
    in apm_init() to prevent APM from being enabled in parallel with
    ACPI.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Len Brown

    Rafael J. Wysocki
     
  • If direct references to pm_flags are removed from drivers/acpi/bus.c,
    CONFIG_ACPI will not need to depend on CONFIG_PM any more. Make that
    happen.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Len Brown

    Rafael J. Wysocki
     

17 Feb, 2011

1 commit

  • There are two spellings in use for 'freeze' + 'able' - 'freezable' and
    'freezeable'. The former is the more prominent one. The latter is
    mostly used by workqueue and in a few other odd places. Unify the
    spelling to 'freezable'.

    Signed-off-by: Tejun Heo
    Reported-by: Alan Stern
    Acked-by: "Rafael J. Wysocki"
    Acked-by: Greg Kroah-Hartman
    Acked-by: Dmitry Torokhov
    Cc: David Woodhouse
    Cc: Alex Dubov
    Cc: "David S. Miller"
    Cc: Steven Whitehouse

    Tejun Heo
     

17 Oct, 2010

4 commits

  • If the device which fails to resume is part of a loadable kernel module
    it won't be checked at startup against the magic number stored in the
    RTC.

    Add a read-only sysfs attribute /sys/power/pm_trace_dev_match which
    contains a list of newline separated devices (usually just the one)
    which currently match the last magic number. This allows the device
    which is failing to resume to be found after the modules are loaded
    again.

    Signed-off-by: James Hogan
    Signed-off-by: Rafael J. Wysocki

    James Hogan
     
  • Introduce struct wakeup_source for representing system wakeup sources
    within the kernel and for collecting statistics related to them.
    Make the recently introduced helper functions pm_wakeup_event(),
    pm_stay_awake() and pm_relax() use struct wakeup_source objects
    internally, so that wakeup statistics associated with wakeup devices
    can be collected and reported in a consistent way (the definition of
    pm_relax() is changed, which is harmless, because this function is
    not called directly by anyone yet). Introduce new wakeup-related
    sysfs device attributes in /sys/devices/.../power for reporting the
    device wakeup statistics.

    Change the global wakeup events counters event_count and
    events_in_progress into atomic variables, so that it is not necessary
    to acquire a global spinlock in pm_wakeup_event(), pm_stay_awake()
    and pm_relax(), which should allow us to avoid lock contention in
    these functions on SMP systems with many wakeup devices.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • The default hibernation image size is currently hard coded and euqal
    to 500 MB, which is not a reasonable default on many contemporary
    systems. Make it equal 2/5 of the total RAM size (this is slightly
    below the maximum, i.e. 1/2 of the total RAM size, and seems to be
    generally suitable).

    Signed-off-by: Rafael J. Wysocki
    Tested-by: M. Vefa Bicakci

    Rafael J. Wysocki
     
  • Although we need the PM workqueue to be freezable, we don't need it
    to be singlethread. Also, the number of concurrent work items
    running on a single CPU need not be constrained. For these reasons
    use alloc_workqueue() directly, with suitable arguments, instead of
    create_freezeable_workqueue(), to create the runtime PM workqueue.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Tejun Heo

    Rafael J. Wysocki
     

19 Jul, 2010

1 commit

  • One of the arguments during the suspend blockers discussion was that
    the mainline kernel didn't contain any mechanisms making it possible
    to avoid races between wakeup and system suspend.

    Generally, there are two problems in that area. First, if a wakeup
    event occurs exactly when /sys/power/state is being written to, it
    may be delivered to user space right before the freezer kicks in, so
    the user space consumer of the event may not be able to process it
    before the system is suspended. Second, if a wakeup event occurs
    after user space has been frozen, it is not generally guaranteed that
    the ongoing transition of the system into a sleep state will be
    aborted.

    To address these issues introduce a new global sysfs attribute,
    /sys/power/wakeup_count, associated with a running counter of wakeup
    events and three helper functions, pm_stay_awake(), pm_relax(), and
    pm_wakeup_event(), that may be used by kernel subsystems to control
    the behavior of this attribute and to request the PM core to abort
    system transitions into a sleep state already in progress.

    The /sys/power/wakeup_count file may be read from or written to by
    user space. Reads will always succeed (unless interrupted by a
    signal) and return the current value of the wakeup events counter.
    Writes, however, will only succeed if the written number is equal to
    the current value of the wakeup events counter. If a write is
    successful, it will cause the kernel to save the current value of the
    wakeup events counter and to abort the subsequent system transition
    into a sleep state if any wakeup events are reported after the write
    has returned.

    [The assumption is that before writing to /sys/power/state user space
    will first read from /sys/power/wakeup_count. Next, user space
    consumers of wakeup events will have a chance to acknowledge or
    veto the upcoming system transition to a sleep state. Finally, if
    the transition is allowed to proceed, /sys/power/wakeup_count will
    be written to and if that succeeds, /sys/power/state will be written
    to as well. Still, if any wakeup events are reported to the PM core
    by kernel subsystems after that point, the transition will be
    aborted.]

    Additionally, put a wakeup events counter into struct dev_pm_info and
    make these per-device wakeup event counters available via sysfs,
    so that it's possible to check the activity of various wakeup event
    sources within the kernel.

    To illustrate how subsystems can use pm_wakeup_event(), make the
    low-level PCI runtime PM wakeup-handling code use it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Jesse Barnes
    Acked-by: Greg Kroah-Hartman
    Acked-by: markgross
    Reviewed-by: Alan Stern

    Rafael J. Wysocki
     

27 Feb, 2010

1 commit