16 Oct, 2020

1 commit

  • The cpuidle.h header declares a function with an empty stub for the
    cpuidle-disabled case, but that function is only called by cpuidle
    governors, which depend on cpuidle anyway.

    In other words, the function is only called when cpuidle is enabled,
    so there is no need for the stub.

    Remove the pointless stub. (The pattern in question is sketched after
    this entry.)

    Signed-off-by: Daniel Lezcano
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
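
    For context, a generic illustration of the stub pattern being removed;
    the function name and return type here are purely illustrative, and the
    point is that only governor code, which depends on cpuidle, ever calls
    the helper:

    /* Before: a stub exists even though only governors call the helper. */
    #ifdef CONFIG_CPU_IDLE
    extern s64 example_governor_helper(unsigned int cpu);
    #else
    static inline s64 example_governor_helper(unsigned int cpu) { return 0; }
    #endif
    /* After the change, the #else branch simply goes away. */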
     

23 Sep, 2020

1 commit

  • CPUs may fail to enter the chosen idle state if there was a
    pending interrupt, causing the cpuidle driver to return an error
    value.

    Record that and export it via sysfs along with the other idle state
    statistics.

    This could prove useful in understanding the behavior of the governor
    and the system during use cases that involve multiple CPUs. (A sketch
    of the accounting follows this entry.)

    Signed-off-by: Lina Iyer
    [ rjw: Changelog and documentation edits ]
    Signed-off-by: Rafael J. Wysocki

    Lina Iyer
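
    A minimal sketch of the accounting described above, roughly as it could
    appear in the cpuidle core when ->enter() fails; the "rejected" counter
    name is an assumption here:

    entered_state = target_state->enter(dev, drv, index);
    if (entered_state < 0)
            dev->states_usage[index].rejected++;    /* entry was rejected */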
     

17 Sep, 2020

1 commit

  • Some drivers have to do significant work, some of which relies on RCU
    still being active. Instead of using RCU_NONIDLE in the drivers and
    flipping RCU back on, allow drivers to take over RCU-idle duty.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Ulf Hansson
    Tested-by: Borislav Petkov
    Signed-off-by: Rafael J. Wysocki

    Peter Zijlstra
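
    A hedged sketch of what "taking over RCU-idle duty" can look like in a
    driver's ->enter() callback; the flag name CPUIDLE_FLAG_RCU_IDLE, the
    RCU helper names and the work functions are assumptions for
    illustration:

    static int my_enter(struct cpuidle_device *dev,
                        struct cpuidle_driver *drv, int index)
    {
            int ret;

            driver_work_that_needs_rcu();   /* e.g. notifiers, tracing */
            rcu_idle_enter();               /* the driver, not the core, turns RCU off */
            ret = firmware_enter_idle(index);
            rcu_idle_exit();
            more_driver_work_that_needs_rcu();

            return ret;
    }

    /* The corresponding state would carry a flag (CPUIDLE_FLAG_RCU_IDLE or
     * similar) telling the cpuidle core to skip its own RCU-idle handling. */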
     

26 Aug, 2020

1 commit

  • This allows moving the leave_mm() call into generic code before
    rcu_idle_enter(). Gets rid of more trace_*_rcuidle() users.

    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Steven Rostedt (VMware)
    Reviewed-by: Thomas Gleixner
    Acked-by: Rafael J. Wysocki
    Tested-by: Marco Elver
    Link: https://lkml.kernel.org/r/20200821085348.369441600@infradead.org

    Peter Zijlstra
     

30 Jul, 2020

1 commit

  • Control Flow Integrity (CFI) is a security mechanism that disallows
    changes to the original control flow graph of a compiled binary,
    making control-flow hijacking attacks significantly harder to perform.

    init_state_node() assigns the same callback to two function pointers
    with different prototypes:

    static int init_state_node(struct cpuidle_state *idle_state,
                               const struct of_device_id *matches,
                               struct device_node *state_node)
    {
            /* ... */
            idle_state->enter = match_id->data;
            /* ... */
            idle_state->enter_s2idle = match_id->data;
    }

    Function declarations:

    struct cpuidle_state {
            /* ... */
            int (*enter)(struct cpuidle_device *dev,
                         struct cpuidle_driver *drv,
                         int index);

            void (*enter_s2idle)(struct cpuidle_device *dev,
                                 struct cpuidle_driver *drv,
                                 int index);
    };

    In this case, a call through either enter() or enter_s2idle() fails the
    CFI check, because the same callee is used with two different
    prototypes.

    Align enter_s2idle() with the prototype of enter(), which needs a return
    value for some use cases; the return value of enter_s2idle() is simply
    not used for now. (The resulting declarations are sketched after this
    entry.)

    Signed-off-by: Neal Liu
    Reviewed-by: Sami Tolvanen
    Signed-off-by: Rafael J. Wysocki

    Neal Liu
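
    For reference, after this change both callbacks share one prototype
    (sketch of the resulting declarations):

    struct cpuidle_state {
            /* ... */
            int (*enter)(struct cpuidle_device *dev,
                         struct cpuidle_driver *drv,
                         int index);

            int (*enter_s2idle)(struct cpuidle_device *dev,
                                struct cpuidle_driver *drv,
                                int index);
    };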
     

23 Jan, 2020

1 commit

  • Merge changes updating the ACPI processor driver in order to export
    acpi_processor_evaluate_cst() to the code outside of it and adding
    ACPI support to the intel_idle driver based on that.

    * intel_idle+acpi:
    Documentation: admin-guide: PM: Add intel_idle document
    intel_idle: Use ACPI _CST on server systems
    intel_idle: Add module parameter to prevent ACPI _CST from being used
    intel_idle: Allow ACPI _CST to be used for selected known processors
    cpuidle: Allow idle states to be disabled by default
    intel_idle: Use ACPI _CST for processor models without C-state tables
    intel_idle: Refactor intel_idle_cpuidle_driver_init()
    ACPI: processor: Export acpi_processor_evaluate_cst()
    ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR
    ACPI: processor: Clean up acpi_processor_evaluate_cst()
    ACPI: processor: Introduce acpi_processor_evaluate_cst()
    ACPI: processor: Export function to claim _CST control

    Rafael J. Wysocki
     

09 Jan, 2020

1 commit


27 Dec, 2019

1 commit

  • In certain situations it may be useful to prevent some idle states
    from being used by default while allowing user space to enable them
    later on.

    For this purpose, introduce a new state flag, CPUIDLE_FLAG_OFF, to
    mark idle states that should be disabled by default, make the core
    set CPUIDLE_STATE_DISABLED_BY_USER for those states at the
    initialization time and add a new state attribute in sysfs,
    "default_status", to inform user space of the initial status of
    the given idle state ("disabled" if CPUIDLE_FLAG_OFF is set for it,
    "enabled" otherwise).

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
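
    A sketch of how a driver might mark such a state; the state parameters
    and the callback name are illustrative:

    static struct cpuidle_state deep_state = {
            .name             = "DEEP",
            .desc             = "Deep idle, off by default",
            .flags            = CPUIDLE_FLAG_OFF,   /* disabled until user space enables it */
            .exit_latency     = 500,                /* us */
            .target_residency = 1500,               /* us */
            .enter            = deep_enter,         /* hypothetical callback */
    };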
     

29 Nov, 2019

1 commit

  • After recent cpuidle updates the "disabled" field in struct
    cpuidle_state is only used by two drivers (intel_idle and shmobile
    cpuidle) for marking unusable idle states, but that may as well be
    achieved with the help of a state flag, so define an "unusable" idle
    state flag, CPUIDLE_FLAG_UNUSABLE, make the drivers in question use
    it instead of the "disabled" field and make the core set
    CPUIDLE_STATE_DISABLED_BY_DRIVER for the idle states with that flag
    set.

    After the above changes, the "disabled" field in struct cpuidle_state
    is not used any more, so drop it.

    No intentional functional impact.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
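
    A sketch of the intended replacement for the old "disabled" field; the
    quirk condition and state index are illustrative:

    if (soc_quirk_breaks_deep_idle)
            drv->states[1].flags |= CPUIDLE_FLAG_UNUSABLE;
    /* The core then sets CPUIDLE_STATE_DISABLED_BY_DRIVER for that state. */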
     

20 Nov, 2019

2 commits

  • Modify cpuidle_use_deepest_state() to take an additional exit latency
    limit argument to be passed to find_deepest_idle_state() and make
    cpuidle_idle_call() pass dev->forced_idle_latency_limit_ns to it for
    forced idle.

    Suggested-by: Rafael J. Wysocki
    Signed-off-by: Daniel Lezcano
    [ rjw: Rebase and rearrange code, subject & changelog ]
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
     
  • In some cases it may be useful to specify an exit latency limit for
    the idle state to be used during CPU idle time injection.

    Instead of duplicating the information in struct cpuidle_device
    or propagating the latency limit in the call stack, replace the
    use_deepest_state field with forced_latency_limit_ns to represent
    that limit, so that the deepest idle state with exit latency within
    that limit is forced (i.e. no governors) when it is set.

    A zero exit latency limit for forced idle means to use governors in
    the usual way (analogous to use_deepest_state equal to "false" before
    this change).

    Additionally, add play_idle_precise() taking two arguments, the
    duration of forced idle and the idle state exit latency limit, both
    in nanoseconds, and redefine play_idle() as a wrapper around that
    new function.

    This change is preparatory, no functional impact is expected.

    Suggested-by: Rafael J. Wysocki
    Signed-off-by: Daniel Lezcano
    [ rjw: Subject, changelog, cpuidle_use_deepest_state() kerneldoc, whitespace ]
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
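
    A sketch of the wrapper relationship described above (the exact upstream
    definition may differ in detail):

    static inline void play_idle(unsigned long duration_us)
    {
            play_idle_precise(duration_us * NSEC_PER_USEC, U64_MAX);
    }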
     

19 Nov, 2019

1 commit

  • Commit 99e98d3fb100 ("cpuidle: Consolidate disabled state checks")
    overlooked the fact that the imx6q and tegra20 cpuidle drivers use
    the "disabled" field in struct cpuidle_state for quirks which trigger
    after the initialization of cpuidle, so reading the initial value of
    that field is not sufficient for those drivers.

    In order to allow them to implement the quirks without using the
    "disabled" field in struct cpuidle_state, introduce a new helper
    function and modify them to use it.

    Fixes: 99e98d3fb100 ("cpuidle: Consolidate disabled state checks")
    Reported-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
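
    Assuming the new helper is cpuidle_driver_state_disabled(), a driver
    quirk could then be implemented along these lines (the driver symbol,
    state index and condition are illustrative):

    if (soc_has_errata)
            cpuidle_driver_state_disabled(&imx6q_cpuidle_driver, 1, true);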
     

12 Nov, 2019

1 commit

  • Currently, the cpuidle subsystem uses microseconds as the unit of
    time which (among other things) causes the idle loop to incur some
    integer division overhead for no clear benefit.

    In order to allow cpuidle to measure time in nanoseconds, add two
    new fields, exit_latency_ns and target_residency_ns, to represent the
    exit latency and target residency of an idle state in nanoseconds,
    respectively, to struct cpuidle_state and initialize them with the
    help of the corresponding values in microseconds provided by drivers.
    Additionally, change cpuidle_governor_latency_req() to return the
    idle state exit latency constraint in nanoseconds.

    Also measure idle state residency (last_residency_ns in struct
    cpuidle_device and time_ns in struct cpuidle_driver) in nanoseconds
    and update the cpuidle core and governors accordingly.

    However, the menu governor still computes typical intervals in
    microseconds to avoid integer overflows.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Doug Smythies
    Tested-by: Doug Smythies

    Rafael J. Wysocki
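
    The initialization described above amounts to a straightforward
    conversion at driver-registration time, roughly:

    /* for each state in the driver's table: */
    state->exit_latency_ns     = state->exit_latency * NSEC_PER_USEC;
    state->target_residency_ns = state->target_residency * NSEC_PER_USEC;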
     

06 Nov, 2019

1 commit

  • There are two reasons why CPU idle states may be disabled: either
    because the driver has disabled them or because they have been
    disabled by user space via sysfs.

    In the former case, the state's "disabled" flag is set once during
    the initialization of the driver and it is never cleared later (it
    is read-only effectively). In the latter case, the "disable" field
    of the given state's cpuidle_state_usage struct is set and it may be
    changed via sysfs. Thus checking whether or not an idle state has
    been disabled involves reading these two flags every time.

    In order to avoid the additional check of the state's "disabled" flag
    (which is effectively read-only anyway), use the value of it at the
    init time to set a (new) flag in the "disable" field of that state's
    cpuidle_state_usage structure and use the sysfs interface to
    manipulate another (new) flag in it. This way the state is disabled
    whenever the "disable" field of its cpuidle_state_usage structure is
    nonzero, whatever the reason, and it is the only place to look into
    to check whether or not the state has been disabled.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Daniel Lezcano
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
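
    A sketch of the resulting scheme (the flag names follow the changelogs
    above; the surrounding state-selection loop is implied):

    /* set once, depending on the reason: */
    dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER;  /* at init */
    dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_USER;    /* via sysfs */

    /* checked in a single place when selecting a state: */
    if (dev->states_usage[i].disable)
            continue;   /* skip the state, whatever the reason */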
     

18 Sep, 2019

1 commit

  • Pull power management updates from Rafael Wysocki:
    "These include a rework of the main suspend-to-idle code flow (related
    to the handling of spurious wakeups), a switch over of several users
    of cpufreq notifiers to QoS-based limits, a new devfreq driver for
    Tegra20, a new cpuidle driver and governor for virtualized guests, an
    extension of the wakeup sources framework to expose wakeup sources as
    device objects in sysfs, and more.

    Specifics:

    - Rework the main suspend-to-idle control flow to avoid repeating
    "noirq" device resume and suspend operations in case of spurious
    wakeups from the ACPI EC and decouple the ACPI EC wakeups support
    from the LPS0 _DSM support (Rafael Wysocki).

    - Extend the wakeup sources framework to expose wakeup sources as
    device objects in sysfs (Tri Vo, Stephen Boyd).

    - Expose system suspend statistics in sysfs (Kalesh Singh).

    - Introduce a new haltpoll cpuidle driver and a new matching governor
    for virtualized guests wanting to do guest-side polling in the idle
    loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell).

    - Fix the menu and teo cpuidle governors to allow the scheduler tick
    to be stopped if PM QoS is used to limit the CPU idle state exit
    latency in some cases (Rafael Wysocki).

    - Increase the resolution of the play_idle() argument to microseconds
    for more fine-grained injection of CPU idle cycles (Daniel
    Lezcano).

    - Switch over some users of cpuidle notifiers to the new QoS-based
    frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY
    policy notifier events (Viresh Kumar).

    - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li).

    - Add support for MT8183 and MT8516 to the mediatek cpufreq driver
    (Andrew-sh.Cheng, Fabien Parent).

    - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson
    Huang).

    - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz).

    - Update the qcom cpufreq driver (among other things, to make it
    easier to extend and to use kryo cpufreq for other nvmem-based
    SoCs) and add qcs404 support to it (Niklas Cassel, Douglas
    RAILLARD, Sibi Sankar, Sricharan R).

    - Fix assorted issues and make assorted minor improvements in the
    cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli,
    Gustavo Silva, Hariprasad Kelam).

    - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd
    Bergmann).

    - Add new Exynos PPMU events to devfreq events and extend that
    mechanism (Lukasz Luba).

    - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny).

    - Improve devfreq documentation and governor code, fix spelling typos
    in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez,
    MyungJoo Ham, Gaël PORTAY).

    - Add regulators enable and disable to the OPP (operating performance
    points) framework (Kamil Konieczny).

    - Update the OPP framework to support multiple opp-suspend properties
    (Anson Huang).

    - Fix assorted issues and make assorted minor improvements in the OPP
    code (Niklas Cassel, Viresh Kumar, Yue Hu).

    - Clean up the generic power domains (genpd) framework (Ulf Hansson).

    - Clean up assorted pieces of power management code and documentation
    (Akinobu Mita, Amit Kucheria, Chuhong Yuan).

    - Update the pm-graph tool to version 5.5 including multiple fixes
    and improvements (Todd Brandt).

    - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven,
    Sébastien Szymanski)"

    * tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (126 commits)
    cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available
    cpuidle-haltpoll: do not set an owner to allow modunload
    cpuidle-haltpoll: return -ENODEV on modinit failure
    cpuidle-haltpoll: set haltpoll as preferred governor
    cpuidle: allow governor switch on cpuidle_register_driver()
    PM: runtime: Documentation: add runtime_status ABI document
    pm-graph: make setVal unbuffered again for python2 and python3
    powercap: idle_inject: Use higher resolution for idle injection
    cpuidle: play_idle: Increase the resolution to usec
    cpuidle-haltpoll: vcpu hotplug support
    cpufreq: Add qcs404 to cpufreq-dt-platdev blacklist
    cpufreq: qcom: Add support for qcs404 on nvmem driver
    cpufreq: qcom: Refactor the driver to make it easier to extend
    cpufreq: qcom: Re-organise kryo cpufreq to use it for other nvmem based qcom socs
    dt-bindings: opp: Add qcom-opp bindings with properties needed for CPR
    dt-bindings: opp: qcom-nvmem: Support pstates provided by a power domain
    Documentation: cpufreq: Update policy notifier documentation
    cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events
    PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state()
    PM / Domains: Simplify genpd_lookup_dev()
    ...

    Linus Torvalds
     

11 Sep, 2019

1 commit

  • The recently introduced haltpoll driver is largely only useful with the
    haltpoll governor. To allow drivers to associate with a particular idle
    behaviour, add a @governor property to 'struct cpuidle_driver' and thus
    allow a cpuidle driver to switch to a *preferred* governor on idle
    driver registration (see the sketch after this entry). The previous
    governor is saved, and when the idle driver is unregistered we switch
    back to it.

    The @governor preference can be overridden by the cpuidle.governor=
    boot parameter, and it is ignored if the requested governor doesn't
    exist.

    Signed-off-by: Joao Martins
    Signed-off-by: Rafael J. Wysocki

    Joao Martins
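
    A sketch of a driver declaring its preferred governor via the new field
    (the haltpoll driver does something along these lines):

    static struct cpuidle_driver haltpoll_driver = {
            .name     = "haltpoll",
            .governor = "haltpoll",   /* switch to this governor on registration */
            /* .states[], .state_count, ... */
    };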
     

10 Aug, 2019

1 commit

  • Current PSCI code handles idle state entry through the
    psci_cpu_suspend_enter() API, which takes an idle state index as a
    parameter and converts the index into a previously initialized
    power_state parameter before calling PSCI.CPU_SUSPEND() with it.

    This is unwieldy, since it forces the PSCI firmware layer to keep track
    of the power_state parameter for every idle state so that the
    index->power_state conversion can be made in the PSCI firmware layer
    instead of in the CPUidle driver implementations.

    Move the power_state handling out of drivers/firmware/psci
    into the respective ACPI/DT PSCI CPUidle backends and convert
    the psci_cpu_suspend_enter() API to get the power_state
    parameter as input, which makes it closer to its firmware
    interface PSCI.CPU_SUSPEND() API.

    A notable side effect is that the PSCI ACPI/DT CPUidle backends
    now can directly handle (and if needed update) power_state
    parameters before handing them over to the PSCI firmware
    interface to trigger PSCI.CPU_SUSPEND() calls.

    Signed-off-by: Lorenzo Pieralisi
    Acked-by: Daniel Lezcano
    Reviewed-by: Ulf Hansson
    Reviewed-by: Sudeep Holla
    Cc: Will Deacon
    Cc: Ulf Hansson
    Cc: Sudeep Holla
    Cc: Daniel Lezcano
    Cc: Catalin Marinas
    Cc: Mark Rutland
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Will Deacon

    Lorenzo Pieralisi
     

31 Jul, 2019

1 commit

  • An x86_64 allmodconfig build produces these errors:

    x86_64-linux-gnu-ld: kernel/sched/core.o: in function `cpuidle_poll_time':
    core.c:(.text+0x230): multiple definition of `cpuidle_poll_time';
    arch/x86/kernel/process.o:process.c:(.text+0xc0): first defined here

    (and more)

    Fixes: 259231a04561 ("cpuidle: add poll_limit_ns to cpuidle_device structure")
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Rafael J. Wysocki

    Stephen Rothwell
     

30 Jul, 2019

2 commits


10 Apr, 2019

1 commit

  • To be able to predict the sleep duration for a CPU entering idle, it
    is essential to know the expiration time of the next timer. Both the
    teo and the menu cpuidle governors already use this information for
    CPU idle state selection.

    Moving forward, a similar prediction needs to be made for a group of
    idle CPUs rather than for a single one and the following changes
    implement a new genpd governor for that purpose.

    In order to support that feature, add a new function called
    tick_nohz_get_next_hrtimer() that will return the next hrtimer
    expiration time of a given CPU to be invoked after deciding
    whether or not to stop the scheduler tick on that CPU.

    Make the cpuidle core call tick_nohz_get_next_hrtimer() right
    before invoking the ->enter() callback provided by the cpuidle
    driver for the given state and store its return value in the
    per-CPU struct cpuidle_device, so as to make it available to code
    outside of cpuidle.

    Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(),
    the governor's ->select() callback has already returned and indicated
    whether or not the tick should be stopped, so in fact the value
    returned by tick_nohz_get_next_hrtimer() always is the next hrtimer
    expiration time for the given CPU, possibly including the tick (if
    it hasn't been stopped).

    Co-developed-by: Lina Iyer
    Co-developed-by: Daniel Lezcano
    Acked-by: Daniel Lezcano
    Signed-off-by: Ulf Hansson
    [ rjw: Subject & changelog ]
    Signed-off-by: Rafael J. Wysocki

    Ulf Hansson
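
    A sketch of the recording step described above, roughly as it would sit
    in the cpuidle core; the field name next_hrtimer is an assumption:

    dev->next_hrtimer = tick_nohz_get_next_hrtimer();
    entered_state = target_state->enter(dev, drv, index);
    dev->next_hrtimer = 0;   /* only meaningful while the CPU is idle */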
     

18 Jan, 2019

1 commit


13 Dec, 2018

1 commit

  • Add two new metrics for CPU idle states, "above" and "below", to count
    the number of times the given state had been asked for (or entered
    from the kernel's perspective), but the observed idle duration turned
    out to be too short or too long for it (respectively).

    These metrics help to estimate the quality of the CPU idle governor
    in use.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
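
    A sketch of the accounting idea (not the exact kernel code); the
    measured duration, the field names and the deeper-state check are
    illustrative:

    if (measured_us < drv->states[entered].target_residency)
            dev->states_usage[entered].above++;   /* state was too deep for this sleep */
    else if (a_deeper_state_would_have_fit(drv, entered, measured_us))
            dev->states_usage[entered].below++;   /* a deeper state would have done better */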
     

04 Oct, 2018

1 commit

  • If the CPU exits the "polling" state due to the time limit in the
    loop in poll_idle(), this is not a real wakeup and it just means
    that the "polling" state selection was not adequate. The governor
    mispredicted short idle duration, but had a more suitable state been
    selected, the CPU might have spent more time in it. In fact, there
    is no reason to expect that there would have been a wakeup event
    earlier than the next timer in that case.

    Handling such cases as regular wakeups in menu_update() may cause the
    menu governor to make suboptimal decisions going forward, but ignoring
    them altogether would not be correct either, because every time
    menu_select() is invoked, it makes a separate new attempt to predict
    the idle duration taking distinct time to the closest timer event as
    input and the outcomes of all those attempts should be recorded.

    For this reason, make menu_update() always assume that if the
    "polling" state was exited due to the time limit, the next proper
    wakeup event for the CPU would be the next timer event (not
    including the tick).

    Fixes: a37b969a61c1 "cpuidle: poll_state: Add time limit to poll_idle()"
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)
    Reviewed-by: Daniel Lezcano

    Rafael J. Wysocki
     

18 Sep, 2018

1 commit

  • cpuidle_get_last_residency() is just a wrapper for retrieving
    the last_residency member of struct cpuidle_device. It is also,
    oddly, the only wrapper function for accessing a cpuidle_* struct
    member (my best guess is that it is a leftover from v2.x).

    Anyhow, since the only two users (the ladder and menu governors)
    can access dev->last_residency directly, and it's more intuitive to
    do it that way, let's just get rid of the wrapper.

    This patch tidies up CPU idle code a bit without functional changes.

    Signed-off-by: Fieah Lim
    [ rjw: Changelog cleanup ]
    Signed-off-by: Rafael J. Wysocki

    Fieah Lim
     

31 May, 2018

1 commit


06 Apr, 2018

1 commit

  • Add a new pointer argument to cpuidle_select() and to the ->select
    cpuidle governor callback to allow a boolean value indicating
    whether or not the tick should be stopped before entering the
    selected state to be returned from there.

    Make the ladder governor ignore that pointer (to preserve its
    current behavior) and make the menu governor return "false" through
    it if:
    (1) the idle exit latency is constrained at 0, or
    (2) the selected state is a polling one, or
    (3) the expected idle period duration is within the tick period
    range.

    In addition to that, the correction factor computations in the menu
    governor need to take the possibility that the tick may not be
    stopped into account to avoid artificially small correction factor
    values. To that end, add a mechanism to record tick wakeups, as
    suggested by Peter Zijlstra, and use it to modify the menu_update()
    behavior when tick wakeup occurs. Namely, if the CPU is woken up by
    the tick and the return value of tick_nohz_get_sleep_length() is not
    within the tick boundary, the predicted idle duration is likely too
    short, so make menu_update() try to compensate for that by updating
    the governor statistics as though the CPU was idle for a long time.

    Since the value returned through the new argument pointer of
    cpuidle_select() is not used by its caller yet, this change by
    itself is not expected to alter the functionality of the code.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
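
    For reference, the governor callback signature after this change becomes
    (sketch):

    struct cpuidle_governor {
            /* ... */
            int (*select)(struct cpuidle_driver *drv,
                          struct cpuidle_device *dev,
                          bool *stop_tick);
            /* ... */
    };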
     

29 Mar, 2018

1 commit

  • Add a new attribute group called "s2idle" under the sysfs directory
    of each cpuidle state that supports the ->enter_s2idle callback
    and put two new attributes, "usage" and "time", into that group to
    represent the number of times the given state was requested for
    suspend-to-idle and the total time spent in suspend-to-idle after
    requesting that state, respectively.

    That will allow diagnostic information related to suspend-to-idle
    to be collected without enabling advanced debug features and
    analyzing dmesg output.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

15 Feb, 2018

1 commit


12 Feb, 2018

1 commit

  • Commit f85942207516 (x86: PM: Make APM idle driver initialize polling
    state) made apm_init() call cpuidle_poll_state_init(), but that is
    only defined when CONFIG_CPU_IDLE is set, so make an empty stub of it
    available when CONFIG_CPU_IDLE is unset too, to fix the resulting
    build issue (see the sketch after this entry).

    Fixes: f85942207516 (x86: PM: Make APM idle driver initialize polling state)
    Cc: 4.14+ # 4.14+
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
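
    The fix boils down to providing a stub along these lines in cpuidle.h
    (sketch; the exact preprocessor condition also involves
    CONFIG_ARCH_HAS_CPU_RELAX):

    #if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_CPU_RELAX)
    void cpuidle_poll_state_init(struct cpuidle_driver *drv);
    #else
    static inline void cpuidle_poll_state_init(struct cpuidle_driver *drv) {}
    #endif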
     

02 Jan, 2018

1 commit

  • If a CPU is entering a low power idle state where it doesn't lose any
    context, then there is no need to call cpu_pm_enter()/cpu_pm_exit().
    Add a new macro, CPU_PM_CPU_IDLE_ENTER_RETENTION, to be used by cpuidle
    drivers when they are entering a retention state (see the sketch after
    this entry). By not calling cpu_pm_enter() and cpu_pm_exit() we reduce
    the latency involved in entering and exiting the retention idle states.

    CPU_PM_CPU_IDLE_ENTER_RETENTION assumes that no state is lost and
    hence CPU PM notifiers will not be called. We may need a broader
    change if we need to support partial retention states efficiently.

    On an ARM64-based Qualcomm server platform we measured the following
    overhead for calling cpu_pm_enter and cpu_pm_exit for retention states:

    workload: stress --hdd #CPUs --hdd-bytes 32M -t 30
    Average overhead of cpu_pm_enter - 1.2us
    Average overhead of cpu_pm_exit - 3.1us

    Acked-by: Rafael J. Wysocki
    Acked-by: Sudeep Holla
    Signed-off-by: Prashanth Prakash
    Signed-off-by: Catalin Marinas

    Prashanth Prakash
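
    A sketch of an enter callback for a retention state using the new macro;
    the callback and low-level helper names are illustrative:

    static int my_enter_retention(struct cpuidle_device *dev,
                                  struct cpuidle_driver *drv, int idx)
    {
            /* no context is lost, so cpu_pm_enter()/cpu_pm_exit() are skipped */
            return CPU_PM_CPU_IDLE_ENTER_RETENTION(my_lowlevel_suspend, idx);
    }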
     

04 Sep, 2017

1 commit

  • * pm-sleep:
    ACPI / PM: Check low power idle constraints for debug only
    PM / s2idle: Rename platform operations structure
    PM / s2idle: Rename ->enter_freeze to ->enter_s2idle
    PM / s2idle: Rename freeze_state enum and related items
    PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE
    ACPI / PM: Prefer suspend-to-idle over S3 on some systems
    platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle
    PM / suspend: Define pr_fmt() in suspend.c
    PM / suspend: Use mem_sleep_labels[] strings in messages
    PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG
    PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq()
    PM / core: Add error argument to dpm_show_time()
    PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq()
    PM / s2idle: Rearrange the main suspend-to-idle loop
    PM / timekeeping: Print debug messages when requested
    PM / sleep: Mark suspend/hibernation start and finish
    PM / sleep: Do not print debug messages by default
    PM / suspend: Export pm_suspend_target_state

    Rafael J. Wysocki
     

30 Aug, 2017

3 commits

  • Make the drivers that want to include the polling state into their
    states table initialize it explicitly and drop the initialization of
    it (which in fact is conditional, but that is not obvious from the
    code) from the core.

    Signed-off-by: Rafael J. Wysocki
    Tested-by: Sudeep Holla
    Acked-by: Daniel Lezcano

    Rafael J. Wysocki
     
  • Move the polling state initialization code to a separate file built
    conditionally on CONFIG_ARCH_HAS_CPU_RELAX to get rid of the #ifdef
    in driver.c.

    Signed-off-by: Rafael J. Wysocki
    Tested-by: Sudeep Holla
    Acked-by: Daniel Lezcano

    Rafael J. Wysocki
     
  • On some architectures the first (index 0) idle state is a polling
    one and it doesn't really save energy, so there is the
    CPUIDLE_DRIVER_STATE_START symbol allowing some pieces of
    cpuidle code to avoid using that state.

    However, this makes the code rather hard to follow. It is better
    to explicitly avoid the polling state, so add a new cpuidle state
    flag CPUIDLE_FLAG_POLLING to mark it and make the relevant code
    check that flag for the first state instead of using the
    CPUIDLE_DRIVER_STATE_START symbol.

    In the ACPI processor driver, which cannot always rely on the state
    flags (for instance before the states table has been set up), define
    a new internal symbol, ACPI_IDLE_STATE_START, equivalent to the
    CPUIDLE_DRIVER_STATE_START one, and drop the latter. (A sketch of the
    new flag check follows this entry.)

    Signed-off-by: Rafael J. Wysocki
    Tested-by: Sudeep Holla
    Acked-by: Daniel Lezcano

    Rafael J. Wysocki
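
    A sketch of the kind of check that replaces the CPUIDLE_DRIVER_STATE_START
    symbol (the variable name is illustrative):

    int first_real_state = 0;

    if (drv->states[0].flags & CPUIDLE_FLAG_POLLING)
            first_real_state = 1;   /* skip the polling state explicitly */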
     

11 Aug, 2017

1 commit


31 Jan, 2017

1 commit

  • In the current code for powernv_add_idle_states, there is a lot of code
    duplication while initializing an idle state in powernv_states table.

    Add an inline helper function to populate the powernv_states[] table
    for a given idle state. Invoke this for populating the "Nap",
    "Fastsleep" and the stop states in powernv_add_idle_states.

    Signed-off-by: Gautham R. Shenoy
    Acked-by: Balbir Singh
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Michael Ellerman

    Gautham R. Shenoy
     

29 Nov, 2016

1 commit

  • When idle injection is used to cap power, we need to override the
    governor's choice of idle states.

    For this reason, make it possible to enforce selection of the deepest
    idle state by setting a flag on a given CPU, so as to achieve the
    maximum potential power draw reduction.

    Signed-off-by: Jacob Pan
    [ rjw: Subject & changelog ]
    Signed-off-by: Rafael J. Wysocki

    Jacob Pan
     

21 Oct, 2016

1 commit

  • The governor code uses try_module_get() and module_put() to refcount
    the governor's module, but the governors are not built as modules.

    Since they are not modules, the refcounting neither prevents the
    governor from being switched nor prevents anything from being
    unloaded. The code is pointless, so remove it.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
     

22 Jul, 2016

1 commit

  • The function arm_enter_idle_state() is exactly the same in both generic
    ARM{32,64} CPUidle drivers and will be the same even in the ARM64
    backend of the ACPI processor idle driver. So unify it and move it to a
    common place by introducing a CPU_PM_CPU_IDLE_ENTER macro that can be
    used in all of these places, avoiding duplication (see the sketch after
    this entry).

    This is in preparation of reuse of the generic cpuidle entry function
    for ACPI LPI support on ARM64.

    Suggested-by: Rafael J. Wysocki
    Signed-off-by: Sudeep Holla
    Signed-off-by: Rafael J. Wysocki

    Sudeep Holla
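
    A sketch of the unified entry path as used by the generic ARM driver
    after this change; the low-level helper name (arm_cpuidle_suspend) is an
    assumption here:

    static int arm_enter_idle_state(struct cpuidle_device *dev,
                                    struct cpuidle_driver *drv, int idx)
    {
            /*
             * CPU_PM_CPU_IDLE_ENTER() wraps the low-level call with
             * cpu_pm_enter()/cpu_pm_exit() so that every backend does not
             * have to open-code it.
             */
            return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, idx);
    }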