27 Jul, 2012

1 commit

  • Pull ACPI & power management update from Len Brown:
    "Re-write of the turbostat tool.
    Lower overhead was necessary for measuring very large systems when
    they are very idle.

    IVB support in intel_idle
    It's what I run on my IVB, others should be able to also:-)

    ACPICA core update
    We have found some bugs due to divergence between Linux and the
    upstream ACPICA base. Most of these patches are to reduce that
    divergence to reduce the risk of future bugs.

    Some cpuidle updates, mostly for non-Intel
    More will be coming, as they depend on this part.

    Some thermal management changes needed by non-ACPI systems.

    Some _OST (OS Status Indication) updates for ACPI hot-plug."

    * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits)
    Thermal: Documentation update
    Thermal: Add Hysteresis attributes
    Thermal: Make Thermal trip points writeable
    ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check
    tools/power: turbostat: fix large c1% issue
    tools/power: turbostat v2 - re-write for efficiency
    ACPICA: Update to version 20120711
    ACPICA: AcpiSrc: Fix some translation issues for Linux conversion
    ACPICA: Update header files copyrights to 2012
    ACPICA: Add new ACPI table load/unload external interfaces
    ACPICA: Split file: tbxface.c -> tbxfload.c
    ACPICA: Add PCC address space to space ID decode function
    ACPICA: Fix some comment fields
    ACPICA: Table manager: deploy new firmware error/warning interfaces
    ACPICA: Add new interfaces for BIOS(firmware) errors and warnings
    ACPICA: Split exception code utilities to a new file, utexcep.c
    ACPI: acpi_pad: tune round_robin_time
    ACPICA: Update to version 20120620
    ACPICA: Add support for implicit notify on multiple devices
    ACPICA: Update comments; no functional change
    ...

    Linus Torvalds
     

19 Jul, 2012

1 commit

  • * pm-domains:
    PM / Domains: Fix build warning for CONFIG_PM_RUNTIME unset
    PM / Domains: Replace plain integer with NULL pointer in domain.c file
    PM / Domains: Add missing static storage class specifier in domain.c file
    PM / Domains: Allow device callbacks to be added at any time
    PM / Domains: Add device domain data reference counter
    PM / Domains: Add preliminary support for cpuidle, v2
    PM / Domains: Do not stop devices after restoring their states
    PM / Domains: Use subsystem runtime suspend/resume callbacks by default

    Rafael J. Wysocki
     

11 Jul, 2012

1 commit

  • On certain bios, resume hangs if cpus are allowed to enter idle states
    during suspend [1].

    This was fixed in the acpi idle driver [2]. But the intel_idle driver does not
    have this fix. Thus instead of replicating the fix in both the idle
    drivers, or in more platform specific idle drivers if needed, the
    more general cpuidle infrastructure could handle this.

    A suspend callback in cpuidle_driver could handle this fix. But
    a cpuidle_driver provides only basic functionalities like platform idle
    state detection capability and mechanisms to support entry and exit
    into CPU idle states. All other cpuidle functions are found in the
    cpuidle generic infrastructure, for the good reason that all cpuidle
    drivers, irrespective of their platforms, will support these functions.

    One option therefore would be to register a suspend callback in cpuidle
    which handles this fix. This could be called through a PM_SUSPEND_PREPARE
    notifier. But this is too generic a notifier for a driver to handle.

    Also, ideally the job of cpuidle is not to handle side effects of suspend.
    It should expose the interfaces which "handle cpuidle 'during' suspend"
    or any other operation, which the subsystems call during that respective
    operation.

    The fix demands that during suspend, no cpus should be allowed to enter
    deep C-states. The interface cpuidle_uninstall_idle_handler() in cpuidle
    ensures that. Not only that, it also kicks all the cpus which are already
    in idle out of their idle states, which was being done during cpu hotplug
    through a CPU_DYING_FROZEN callback.

    Now the question arises about when during suspend should
    cpuidle_uninstall_idle_handler() be called. Since we are dealing with
    drivers it seems best to call this function during dpm_suspend().
    Delaying the call till dpm_suspend_noirq() does no harm, as long as it is
    before cpu_hotplug_begin() to avoid race conditions with cpu hotplug
    operations. In dpm_suspend_noirq(), it would be wise to place this call
    before suspend_device_irqs() to avoid ugly interactions with the same.

    Analogously, during resume.

    References:
    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/674075.
    [2] http://marc.info/?l=linux-pm&m=133958534231884&w=2

    Reported-and-tested-by: Dave Hansen
    Signed-off-by: Preeti U Murthy
    Reviewed-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Preeti U Murthy
     

06 Jul, 2012

1 commit

  • When the system is booted with some cpus offline, the idle
    driver is not initialized. When a cpu is set online, the
    acpi code calls the intel idle init function. Unfortunately
    this code introduces a dependency between intel_idle and acpi.

    This patch is intended to remove this dependency by using the
    notifier of intel_idle. This patch has the benefit of
    encapsulating the intel_idle driver and removing some exported
    functions.

    Signed-off-by: Daniel Lezcano
    Acked-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
     

04 Jul, 2012

3 commits

  • On some systems there are CPU cores located in the same power
    domains as I/O devices. Then, power can only be removed from the
    domain if all I/O devices in it are not in use and the CPU core
    is idle. Add preliminary support for that to the generic PM domains
    framework.

    First, the platform is expected to provide a cpuidle driver with one
    extra state designated for use with the generic PM domains code.
    This state should be initially disabled and its exit_latency value
    should be set to whatever time is needed to bring up the CPU core
    itself after restoring power to it, not including the domain's
    power on latency. Its .enter() callback should point to a procedure
    that will remove power from the domain containing the CPU core at
    the end of the CPU power transition.

    The remaining characteristics of the extra cpuidle state, referred to
    as the "domain" cpuidle state below, (e.g. power usage, target
    residency) should be populated in accordance with the properties of
    the hardware.

    Next, the platform should execute genpd_attach_cpuidle() on the PM
    domain containing the CPU core. That will cause the generic PM
    domains framework to treat that domain in a special way such that:

    * When all devices in the domain have been suspended and it is about
    to be turned off, the states of the devices will be saved, but
    power will not be removed from the domain. Instead, the "domain"
    cpuidle state will be enabled so that power can be removed from
    the domain when the CPU core is idle and the state has been chosen
    as the target by the cpuidle governor.

    * When the first I/O device in the domain is resumed and
    __pm_genpd_poweron() is called for the first time after
    power has been removed from the domain, the "domain" cpuidle
    state will be disabled to avoid subsequent surprise power removals
    via cpuidle.

    The effective exit_latency value of the "domain" cpuidle state
    depends on the time needed to bring up the CPU core itself after
    restoring power to it as well as on the power on latency of the
    domain containing the CPU core. Thus the "domain" cpuidle state's
    exit_latency has to be recomputed every time the domain's power on
    latency is updated, which may happen every time power is restored
    to the domain, if the measured power on latency is greater than
    the latency stored in the corresponding generic_pm_domain structure.
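
    The recomputation described above can be sketched as a small userspace
    model. This is only an illustration under stated assumptions: the struct
    and function names (domain_model, recalc_exit_latency, note_power_on) are
    hypothetical and are not taken from the patch, and the units (nanoseconds
    for the domain latency, microseconds for the cpuidle latency) are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a PM domain with an attached "domain" cpuidle state. */
struct domain_model {
    int64_t power_on_latency_ns;   /* stored domain power-on latency */
    unsigned int cpu_bringup_us;   /* time to bring up the CPU core alone */
    unsigned int exit_latency_us;  /* effective exit latency of the state */
};

/* Effective exit latency = CPU bring-up time + domain power-on latency. */
static void recalc_exit_latency(struct domain_model *d)
{
    assert(d->power_on_latency_ns >= 0);
    d->exit_latency_us = d->cpu_bringup_us +
                         (unsigned int)(d->power_on_latency_ns / 1000);
}

/* Called after a power-on; only a larger measured latency triggers an update. */
static void note_power_on(struct domain_model *d, int64_t measured_ns)
{
    if (measured_ns > d->power_on_latency_ns) {
        d->power_on_latency_ns = measured_ns;
        recalc_exit_latency(d);
    }
}
```

    With a 50 us core bring-up time and a stored 100 us domain latency, the
    model reports 150 us; a later measurement of 250 us raises it to 300 us,
    while a smaller measurement leaves it unchanged.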

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Kevin Hilman

    Rafael J. Wysocki
     
  • Add a reference counter for the cpuidle driver, so that it can't
    be unregistered when it is in use.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Andrew J.Schorr raises a question. When he changes the disable setting on
    a single CPU, it affects all the other CPUs. Basically, currently, the
    disable field is per-driver instead of per-cpu. All the C states of the
    same driver are shared by all CPUs in the same machine.

    The patch changes the `disable' field to per-cpu, so we could set this
    separately for each cpu.
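
    A minimal userspace sketch of the idea, assuming a fixed number of cpus
    and states: the flag lives in a per-(cpu, state) table rather than in the
    shared state description. All names and sizes here are hypothetical and
    do not reflect the actual kernel structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS   4   /* hypothetical sizes for the sketch */
#define MODEL_NR_STATES 3

/* Per-CPU usage data: one "disable" flag per (cpu, state) pair. */
struct state_usage_model {
    bool disable;
};

static struct state_usage_model usage[MODEL_NR_CPUS][MODEL_NR_STATES];

/* Disable or re-enable one state on one cpu only. */
static void set_state_disabled(int cpu, int state, bool off)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    assert(state >= 0 && state < MODEL_NR_STATES);
    usage[cpu][state].disable = off;
}

/* Query the flag for one (cpu, state) pair. */
static bool state_disabled(int cpu, int state)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    assert(state >= 0 && state < MODEL_NR_STATES);
    return usage[cpu][state].disable;
}
```

    Disabling state 2 on cpu 1 in this model leaves state 2 on cpu 0
    untouched, which is the behaviour the reporter expected.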

    Signed-off-by: ShuoX Liu
    Reported-by: Andrew J.Schorr
    Reviewed-by: Yanmin Zhang
    Signed-off-by: Andrew Morton
    Signed-off-by: Rafael J. Wysocki

    ShuoX Liu
     

02 Jun, 2012

2 commits

  • Adds cpuidle_coupled_parallel_barrier, which can be used by coupled
    cpuidle state enter functions to handle resynchronization after
    determining if any cpu needs to abort. The normal use case will
    be:

    static bool abort_flag;
    static atomic_t abort_barrier;

    int arch_cpuidle_enter(struct cpuidle_device *dev, ...)
    {
            if (arch_turn_off_irq_controller()) {
                    /* returns an error if an irq is pending and would be lost
                       if idle continued and turned off power */
                    abort_flag = true;
            }

            cpuidle_coupled_parallel_barrier(dev, &abort_barrier);

            if (abort_flag) {
                    /* One of the cpus didn't turn off its irq controller */
                    arch_turn_on_irq_controller();
                    return -EINTR;
            }

            /* continue with idle */
            ...
    }

    This will cause all cpus to abort idle together if one of them needs
    to abort.
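
    For readers who want to see how a reusable barrier of this kind can work,
    here is a userspace model built on C11 atomics. It is one plausible
    two-phase counting scheme, written for illustration; the function name
    and the explicit participant count n are assumptions of the sketch, and
    it is not presented as the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Userspace model of a reusable two-phase barrier for n participants.
 * Phase 1: everyone increments and spins until the count reaches n.
 * Phase 2: everyone increments again; the last one (count == 2n) resets
 * the counter to 0, which releases the others and rearms the barrier.
 */
static void parallel_barrier_model(atomic_int *a, int n)
{
    assert(n > 0);

    atomic_fetch_add(a, 1);
    while (atomic_load(a) < n)
        ;   /* spin until all n participants have arrived */

    if (atomic_fetch_add(a, 1) + 1 == 2 * n) {
        atomic_store(a, 0);   /* last one out resets the barrier */
        return;
    }

    while (atomic_load(a) > n)
        ;   /* spin until the last participant resets the counter */
}
```

    Because the counter returns to zero once every participant has passed,
    the same variable can be reused for the next idle attempt without any
    separate reinitialisation step.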

    Reviewed-by: Santosh Shilimkar
    Tested-by: Santosh Shilimkar
    Reviewed-by: Kevin Hilman
    Tested-by: Kevin Hilman
    Signed-off-by: Colin Cross
    Signed-off-by: Len Brown

    Colin Cross
     
  • On some ARM SMP SoCs (OMAP4460, Tegra 2, and probably more), the
    cpus cannot be independently powered down, either due to
    sequencing restrictions (on Tegra 2, cpu 0 must be the last to
    power down), or due to HW bugs (on OMAP4460, a cpu powering up
    will corrupt the gic state unless the other cpu runs a work
    around). Each cpu has a power state that it can enter without
    coordinating with the other cpu (usually Wait For Interrupt, or
    WFI), and one or more "coupled" power states that affect blocks
    shared between the cpus (L2 cache, interrupt controller, and
    sometimes the whole SoC). Entering a coupled power state must
    be tightly controlled on both cpus.

    The easiest solution to implementing coupled cpu power states is
    to hotplug all but one cpu whenever possible, usually using a
    cpufreq governor that looks at cpu load to determine when to
    enable the secondary cpus. This causes problems, as hotplug is an
    expensive operation, so the number of hotplug transitions must be
    minimized, leading to very slow response to loads, often on the
    order of seconds.

    This file implements an alternative solution, where each cpu will
    wait in the WFI state until all cpus are ready to enter a coupled
    state, at which point the coupled state function will be called
    on all cpus at approximately the same time.

    Once all cpus are ready to enter idle, they are woken by an smp
    cross call. At this point, there is a chance that one of the
    cpus will find work to do, and choose not to enter idle. A
    final pass is needed to guarantee that all cpus will call the
    power state enter function at the same time. During this pass,
    each cpu will increment the ready counter, and continue once the
    ready counter matches the number of online coupled cpus. If any
    cpu exits idle, the other cpus will decrement their counter and
    retry.

    To use coupled cpuidle states, a cpuidle driver must:

    Set struct cpuidle_device.coupled_cpus to the mask of all
    coupled cpus, usually the same as cpu_possible_mask if all cpus
    are part of the same cluster. The coupled_cpus mask must be
    set in the struct cpuidle_device for each cpu.

    Set struct cpuidle_device.safe_state to a state that is not a
    coupled state. This is usually WFI.

    Set CPUIDLE_FLAG_COUPLED in struct cpuidle_state.flags for each
    state that affects multiple cpus.

    Provide a struct cpuidle_state.enter function for each state
    that affects multiple cpus. This function is guaranteed to be
    called on all cpus at approximately the same time. The driver
    should ensure that the cpus all abort together if any cpu tries
    to abort once the function is called.

    update1:

    cpuidle: coupled: fix count of online cpus

    online_count was never incremented on boot, and was also counting
    cpus that were not part of the coupled set. Fix both issues by
    introducing a new function that counts online coupled cpus, and
    call it from register as well as the hotplug notifier.

    update2:

    cpuidle: coupled: fix decrementing ready count

    cpuidle_coupled_set_not_ready sometimes refuses to decrement the
    ready count in order to prevent a race condition. This makes it
    unsuitable for use when finished with idle. Add a new function
    cpuidle_coupled_set_done that decrements both the ready count and
    waiting count, and call it after idle is complete.

    Cc: Amit Kucheria
    Cc: Arjan van de Ven
    Cc: Trinabh Gupta
    Cc: Deepthi Dharwar
    Reviewed-by: Santosh Shilimkar
    Tested-by: Santosh Shilimkar
    Reviewed-by: Kevin Hilman
    Tested-by: Kevin Hilman
    Signed-off-by: Colin Cross
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Colin Cross
     

30 Mar, 2012

5 commits

  • power_usage is always assigned a negative value and should be declared
    a signed integer
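
    A short illustration of why the signedness matters, assuming the values
    are compared to find the lowest one. The helper names are hypothetical;
    the point is only that a negative value stored in an unsigned variable
    wraps to a very large number and no longer compares as "lower".

```c
#include <assert.h>
#include <limits.h>

/* Signed comparison: a negative placeholder ranks below any real value. */
static int signed_less(int a, int b)
{
    return a < b;
}

/* Unsigned comparison: the same placeholder wraps to a huge value. */
static int unsigned_less(unsigned int a, unsigned int b)
{
    return a < b;
}

/* Index of the smallest power_usage entry, compared as signed integers. */
static int lowest_power_index(const int *power_usage, int n)
{
    int best = 0;

    assert(n > 0);
    for (int i = 1; i < n; i++)
        if (signed_less(power_usage[i], power_usage[best]))
            best = i;
    return best;
}
```

    Here signed_less(-1, 100) is true, while unsigned_less((unsigned int)-1,
    100u) is false because the converted value equals UINT_MAX.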

    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Len Brown

    Boris Ostrovsky
     
  • Currently when a CPU is off-lined it enters either MWAIT-based idle or,
    if MWAIT is not desired or supported, HLT-based idle (which places the
    processor in C1 state). This patch allows processors without MWAIT
    support to stay in states deeper than C1.

    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Len Brown

    Boris Ostrovsky
     
  • As far as I can see, this field is never used in the code.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Len Brown

    Daniel Lezcano
     
  • All the module names are ro-data; they are never copied to the array.

    eg.

    static struct cpuidle_driver intel_idle_driver = {
    .name = "intel_idle",
    .owner = THIS_MODULE,
    };

    It is safe to assign the pointer of this ro-data to a const char *.
    This way we save 12 bytes.
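
    The saving can be checked with sizeof. The sketch below assumes the name
    was previously a 16-byte character array (MODEL_NAME_LEN is a
    hypothetical stand-in, not a value quoted in this message); with 4-byte
    pointers that gives the 12 bytes mentioned, and with 8-byte pointers the
    saving would be 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NAME_LEN 16   /* assumed array length, for illustration only */

/* Before: the structure carries its own copy of the name. */
struct driver_with_array {
    char name[MODEL_NAME_LEN];
};

/* After: the structure only points at the read-only string literal. */
struct driver_with_pointer {
    const char *name;
};

/* Bytes saved per structure by storing a pointer instead of an array. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct driver_with_array) >=
           sizeof(struct driver_with_pointer));
    return sizeof(struct driver_with_array) -
           sizeof(struct driver_with_pointer);
}
```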

    Signed-off-by: Daniel Lezcano
    Acked-by: Deepthi Dharwar
    Tested-by: Deepthi Dharwar
    Signed-off-by: Len Brown

    Daniel Lezcano
     
  • Some C states of a new CPU might not work well. One reason is that the
    BIOS might configure them incorrectly. To help developers root cause it
    quickly, the patch adds a new sysfs entry, so developers can disable a
    specific C state manually.

    In addition, C state might have much impact on performance tuning, as it
    takes much time to enter/exit C states, which might delay interrupt
    processing. With the new debug option, developers could check if a deep C
    state could impact performance and how much impact it could cause.

    Also add this option in Documentation/cpuidle/sysfs.txt.
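
    As a usage sketch, a small userspace helper could toggle the entry. The
    path layout below (cpuN/cpuidle/stateM/disable under
    /sys/devices/system/cpu) is the usual cpuidle sysfs location and is an
    assumption here, as are the helper names; writing the file normally
    requires root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of the per-state "disable" attribute into buf. */
static int cstate_disable_path(char *buf, size_t len, int cpu, int state)
{
    int n;

    assert(buf != NULL && cpu >= 0 && state >= 0);
    n = snprintf(buf, len,
                 "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/disable",
                 cpu, state);
    return (n > 0 && (size_t)n < len) ? 0 : -1;   /* -1 on truncation */
}

/* Write 1 (disable) or 0 (enable) to the attribute; 0 on success. */
static int cstate_set_disabled(int cpu, int state, int disabled)
{
    char path[128];
    FILE *f;

    if (cstate_disable_path(path, sizeof(path), cpu, state) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;   /* no such state, or insufficient permission */
    fprintf(f, "%d\n", disabled ? 1 : 0);
    return fclose(f) == 0 ? 0 : -1;
}
```

    Calling cstate_set_disabled(0, 2, 1) would ask the kernel to stop using
    state 2, and cstate_set_disabled(0, 2, 0) would allow it again, which
    makes before/after performance comparisons straightforward.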

    [akpm@linux-foundation.org: check kstrtol return value]
    Signed-off-by: ShuoX Liu
    Reviewed-by: Yanmin Zhang
    Reviewed-and-Tested-by: Deepthi Dharwar
    Signed-off-by: Andrew Morton
    Signed-off-by: Len Brown

    ShuoX Liu
     

21 Mar, 2012

1 commit

  • Make necessary changes to implement time keeping and irq enabling
    in the core cpuidle code. This will allow the removal of these
    functionalities from various platform cpuidle implementations whose
    timekeeping and irq enabling follows the form in this common code.
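
    The shape of such a common wrapper can be modelled in userspace: take a
    timestamp, call the platform's enter routine, take a second timestamp,
    and record the difference. Everything named below (enter_fn,
    idle_dev_model, timed_enter, sample_enter) is hypothetical, and the irq
    enabling that the kernel code would perform is not modelled.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <time.h>

typedef int (*enter_fn)(int index);   /* hypothetical driver enter callback */

struct idle_dev_model {
    int last_residency_us;   /* time spent in the last idle period */
};

/* Monotonic time in microseconds. */
static long long now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/* Stand-in for a platform enter routine; a real one would idle the cpu. */
static int sample_enter(int index)
{
    return index;
}

/* Common wrapper: time the enter callback and record the residency. */
static int timed_enter(struct idle_dev_model *dev, enter_fn enter, int index)
{
    long long start, diff;

    assert(dev != NULL && enter != NULL);
    start = now_us();
    index = enter(index);
    diff = now_us() - start;

    /* Clamp so the residency always fits in an int. */
    dev->last_residency_us = diff > INT_MAX ? INT_MAX : (int)diff;
    return index;
}
```

    A platform whose timekeeping follows this form only needs to supply the
    enter callback; the measurement itself lives in one shared place.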

    Signed-off-by: Robert Lee
    Tested-by: Jean Pihet
    Tested-by: Amit Daniel
    Tested-by: Robert Lee
    Reviewed-by: Kevin Hilman
    Reviewed-by: Daniel Lezcano
    Reviewed-by: Deepthi Dharwar
    Acked-by: Jean Pihet
    Signed-off-by: Len Brown

    Robert Lee
     

19 Jan, 2012

1 commit

  • This includes initial support for the recently published ACPI 5.0 spec.
    In particular, support for the "hardware-reduced" bit that eliminates
    the dependency on legacy hardware.

    APEI has patches resulting from testing on real hardware.

    Plus other random fixes.

    * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
    acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
    intel_idle: Split up and provide per CPU initialization func
    ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
    ACPI processor: Remove unneeded cpuidle_unregister_driver call
    intel idle: Make idle driver more robust
    intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
    ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
    intel_idle: remove redundant local_irq_disable() call
    ACPI processor: Fix error path, also remove sysdev link
    ACPI: processor: fix acpi_get_cpuid for UP processor
    intel_idle: fix API misuse
    ACPI APEI: Convert atomicio routines
    ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
    ACPI: Fix possible alignment issues with GAS 'address' references
    ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
    ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
    ACPI: Store SRAT table revision
    ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
    ACPI, Record ACPI NVS regions
    ACPI, APEI, EINJ, Refine the fix of resource conflict
    ...

    Linus Torvalds
     

18 Jan, 2012

1 commit


08 Dec, 2011

1 commit

  • This patch enables cpuidle for pSeries and pSeries_idle is
    directly called from the idle loop. As a result of pSeries_idle, cpuidle
    driver registered with cpuidle subsystem comes into action. On
    failure of loading of the driver or cpuidle framework default idle
    is executed as part of the function. This patch
    also removes the routines pseries_shared_idle_sleep and
    pseries_dedicated_idle_sleep as they are now implemented as part of
    pseries_idle cpuidle driver.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Signed-off-by: Arun R Bharadwaj
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     

08 Nov, 2011

1 commit

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
    cpuidle: Single/Global registration of idle states
    cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
    cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
    cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
    ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
    ACPI: Export FADT pm_profile integer value to userspace
    thermal: Prevent polling from happening during system suspend
    ACPI: Drop ACPI_NO_HARDWARE_INIT
    ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
    PNPACPI: Simplify disabled resource registration
    ACPI: Fix possible recursive locking in hwregs.c
    ACPI: use kstrdup()
    mrst pmu: update comment
    tools/power turbostat: less verbose debugging

    Linus Torvalds
     

07 Nov, 2011

4 commits

  • This patch makes the cpuidle_states structure global (single copy)
    instead of per-cpu. The statistics needed on per-cpu basis
    by the governor are kept per-cpu. This simplifies the cpuidle
    subsystem as state registration is done by single cpu only.
    Having single copy of cpuidle_states saves memory. Rare case
    of asymmetric C-states can be handled within the cpuidle driver
    and architectures such as POWER do not have asymmetric C-states.

    Having single/global registration of all the idle states,
    dynamic C-state transitions on x86 are handled by
    the boot cpu. Here, the boot cpu would disable all the devices,
    re-populate the states and later enable all the devices,
    irrespective of the cpu that would receive the notification first.

    Reference:
    https://lkml.org/lkml/2011/4/25/83

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • This is the first step towards global registration of cpuidle
    states. The statistics used primarily by the governor are per-cpu
    and have to be split from rest of the fields inside cpuidle_state,
    which would be made global i.e. single copy. The driver_data field
    is also per-cpu and moved.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • The cpuidle_device->prepare() mechanism causes updates to the
    cpuidle_state[].flags, setting and clearing CPUIDLE_FLAG_IGNORE
    to tell the governor not to choose a state on a per-cpu basis at
    run-time. State demotion is now handled by the driver and it returns
    the actual state entered. Hence, this mechanism is not required.
    Also this removes per-cpu flags from cpuidle_state enabling
    it to be made global.

    Reference:
    https://lkml.org/lkml/2011/3/25/52

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Acked-by: Arjan van de Ven
    Reviewed-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • Cpuidle governor only suggests the state to enter using the
    governor->select() interface, but allows the low level driver to
    override the recommended state. The actual entered state
    may be different because of software or hardware demotion. Software
    demotion is done by the back-end cpuidle driver and can be accounted
    correctly. Current cpuidle code uses last_state field to capture the
    actual state entered and based on that updates the statistics for the
    state entered.

    Ideally the driver enter routine should update the counters,
    and it should return the state actually entered rather than the time
    spent there. The generic cpuidle code should simply handle where
    the counters live in the sysfs namespace, not updating the counters.
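
    The convention can be illustrated with a small model in which the enter
    routine records the residency and returns the index of the state it
    actually entered, so that statistics are credited to that state rather
    than to the requested one. The names are hypothetical, and where exactly
    the counters are updated is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_STATES 3

struct usage_model {
    unsigned long long usage;     /* number of times the state was entered */
    unsigned long long time_us;   /* total time spent in the state */
};

struct dev_model {
    int last_residency_us;        /* set by the driver's enter routine */
    struct usage_model states[MODEL_NR_STATES];
};

/* Stand-in driver: demotes any request deeper than state 1, reports 500 us. */
static int demoting_enter(struct dev_model *dev, int index)
{
    if (index > 1)
        index = 1;                /* software demotion */
    dev->last_residency_us = 500;
    return index;                 /* state actually entered */
}

/* Account the idle period against the returned index, not the request. */
static int idle_once(struct dev_model *dev, int requested)
{
    int entered;

    assert(dev != NULL);
    assert(requested >= 0 && requested < MODEL_NR_STATES);
    entered = demoting_enter(dev, requested);
    if (entered >= 0) {
        dev->states[entered].usage++;
        dev->states[entered].time_us +=
            (unsigned long long)dev->last_residency_us;
    }
    return entered;
}
```

    Requesting state 2 in this model ends up incrementing the counters of
    state 1, which is what correct accounting of a demotion looks like.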

    Reference:
    https://lkml.org/lkml/2011/3/25/52

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     

01 Nov, 2011

1 commit

  • The module.h header pretty much brings in the kitchen sink along
    with it, so it should be avoided wherever reasonably possible in
    terms of being included from other commonly used header
    files, as it results in a measurable increase on compile times.

    The worst culprit was probably device.h since it is used everywhere.
    This file also had an implicit dependency/usage of mutex.h which was
    masked by module.h, and is also fixed here at the same time.

    There are over a dozen other headers that simply declare the
    struct instead of pulling in the whole file, so follow their lead
    and simply make it a few more.

    Most of the implicit dependencies on module.h being present by
    these headers pulling it in have been now weeded out, so we can
    finally make this change with hopefully minimal breakage.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

04 Aug, 2011

2 commits

  • cpuidle users should call cpuidle_call_idle() directly
    rather than via (pm_idle)() function pointer.

    Architecture may choose to continue using (pm_idle)(),
    but cpuidle need not depend on it:

    my_arch_cpu_idle()
            ...
            if (cpuidle_call_idle())
                    pm_idle();

    cc: Kevin Hilman
    cc: Paul Mundt
    cc: x86@kernel.org
    Acked-by: H. Peter Anvin
    Signed-off-by: Len Brown

    Len Brown
     
  • When a Xen Dom0 kernel boots on a hypervisor, it gets access
    to the raw-hardware ACPI tables. While it parses the idle tables
    for the hypervisor's benefit, it uses HLT for its own idle.

    Rather than have xen scribble on pm_idle and access default_idle,
    have it simply disable_cpuidle() so acpi_idle will not load and
    architecture default HLT will be used.

    cc: xen-devel@lists.xensource.com
    Tested-by: Konrad Rzeszutek Wilk
    Acked-by: H. Peter Anvin
    Signed-off-by: Len Brown

    Len Brown
     

13 Jan, 2011

4 commits


01 Oct, 2010

1 commit

  • Avoid TLB flush IPIs for the cores in deeper c-states by voluntary leave_mm()
    before entering into that state. CPUs tend to flush TLB in those c-states
    anyways.

    acpi_idle does this with C3-type states, but it was not carried over
    when intel_idle was introduced. intel_idle can apply it
    to C-states in addition to those that ACPI might export as C3...

    Signed-off-by: Suresh Siddha
    Signed-off-by: Len Brown

    Suresh Siddha
     

10 Aug, 2010

1 commit

  • On some SoC chips, HW resources may be in use during any particular idle
    period. As a consequence, the cpuidle states that the SoC is safe to
    enter can change from idle period to idle period. In addition, the
    latency and threshold of each cpuidle state can vary, depending on the
    operating condition when the CPU becomes idle, e.g. the current cpu
    frequency, the current state of the HW blocks, etc.

    cpuidle core and the menu governor, in the current form, are geared
    towards cpuidle states that are static, i.e. the availability of the
    states, their latencies, their thresholds are non-changing during run
    time. cpuidle does not provide any hook that cpuidle drivers can use to
    adjust those values on the fly for the current idle period before the menu
    governor selects the target cpuidle state.

    This patch extends cpuidle core and the menu governor to handle states
    that are dynamic. There are three additions in the patch and the patch
    maintains backwards-compatibility with existing cpuidle drivers.

    1) add prepare() to struct cpuidle_device. A cpuidle driver can hook
    into the callback and cpuidle will call prepare() before calling the
    governor's select function. The callback gives the cpuidle driver a
    chance to update the dynamic information of the cpuidle states for the
    current idle period, e.g. state availability, latencies, thresholds,
    power values, etc.

    2) add CPUIDLE_FLAG_IGNORE as one of the state flags. In the prepare()
    function, a cpuidle driver can set/clear the flag to indicate to the
    menu governor whether a cpuidle state should be ignored, i.e. not
    available, during the current idle period.

    3) add power_specified bit to struct cpuidle_device. The menu governor
    currently assumes that the cpuidle states are arranged in the order of
    increasing latency, threshold, and power savings. This is true or can
    be made true for static states. Once the state parameters are dynamic,
    the latencies, thresholds, and power savings for the cpuidle states can
    increase or decrease by different amounts from idle period to idle
    period. So the assumption of increasing latency, threshold, and power
    savings from Cn to C(n+1) can no longer be guaranteed.

    It can be straightforward to calculate the power consumption of each
    available state and to specify it in power_usage for the idle period.
    Using the power_usage fields, the menu governor then selects the state
    that has the lowest power consumption and that still satisfies all other
    criteria. The power_specified bit defaults to 0. For existing cpuidle
    drivers, cpuidle detects that power_specified is 0 and fills in a dummy
    set of power_usage values.
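
    The selection rule described above can be sketched in userspace as
    follows. This is a simplified model, not the menu governor itself: the
    flag value, the struct layout, the fallback to state 0 and the particular
    dummy values (-1, -2, ...) are all assumptions made for illustration, and
    "threshold" is modelled as a target residency.

```c
#include <assert.h>
#include <limits.h>

#define FLAG_IGNORE_MODEL 0x1   /* hypothetical stand-in for CPUIDLE_FLAG_IGNORE */

struct state_model {
    unsigned int flags;
    unsigned int exit_latency_us;
    unsigned int target_residency_us;
    int power_usage;
};

/* Without driver-specified power, deeper states get lower dummy values. */
static void fill_dummy_power(struct state_model *s, int n)
{
    for (int i = 0; i < n; i++)
        s[i].power_usage = -1 - i;
}

/* Pick the lowest-power state that is usable and meets both constraints. */
static int select_state(const struct state_model *s, int n,
                        unsigned int predicted_us,
                        unsigned int latency_req_us)
{
    int best = 0, min_power = INT_MAX;   /* fall back to state 0 */

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (s[i].flags & FLAG_IGNORE_MODEL)
            continue;                     /* unavailable this idle period */
        if (s[i].target_residency_us > predicted_us)
            continue;                     /* idle period too short */
        if (s[i].exit_latency_us > latency_req_us)
            continue;                     /* would wake up too slowly */
        if (s[i].power_usage < min_power) {
            min_power = s[i].power_usage;
            best = i;
        }
    }
    return best;
}
```

    Because the choice is driven by power_usage rather than by position in
    the array, the states no longer have to be ordered by increasing latency
    and power savings.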

    Signed-off-by: Ai Li
    Cc: Len Brown
    Acked-by: Arjan van de Ven
    Cc: Ingo Molnar
    Cc: Venkatesh Pallipadi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ai Li
     

28 May, 2010

1 commit

  • cpuidle_register_driver() sets cpuidle_curr_driver
    cpuidle_unregister_driver() clears cpuidle_curr_driver

    We shouldn't expose cpuidle_curr_driver to
    potential modification except via these interfaces.
    So make it static and create cpuidle_get_driver() to observe it.

    Signed-off-by: Len Brown

    Len Brown
     

27 May, 2010

1 commit


12 Jun, 2008

1 commit

  • cpuidle and acpi driver interaction bug with the way cpuidle_register_driver()
    is called. Due to this bug, there will be oops on
    ACDC on some systems, where they support C-states in one DC and not in AC.

    The current code does
    ON BOOT:
    Look at CST and other C-state info to see whether more than C1 is
    supported. If it is, then acpi processor_idle does a
    cpuidle_register_driver() call, which internally enables the device.

    ON CST change notification (ACDC) and on suspend-resume:
    acpi driver temporarily disables device, updates the device with
    any new C-states, and reenables the device.

    The problem is that on boot, there are no C2, C3 states supported and we skip
    the register. Later on ACDC, we may get a CST notification and we try
    to reevaluate CST and enable the device, without actually registering it.
    This causes breakage as we try to create /sys fs sub directory, without the
    parent directory which is created at register time.

    Thanks to Sanjeev for reporting the problem here.
    http://bugzilla.kernel.org/show_bug.cgi?id=10394

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Len Brown

    Venkatesh Pallipadi
     

26 Mar, 2008

1 commit

  • cpuidle C-state sysfs node time and usage are very easy to overflow because
    they are all of unsigned int type, time will overflow within about two hours,
    usage will take longer time to overflow, but they are increasing for ever.

    This patch will convert them to unsigned long long.
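
    A rough calculation shows the scale of the problem. Assuming the time
    counter is kept in microseconds and that unsigned int is 32 bits wide
    (both assumptions, not stated in this message), the counter wraps after
    about 4294 seconds of accumulated idle time, roughly 71 minutes. The
    helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a microsecond counter with the given maximum value wraps. */
static uint64_t seconds_until_wrap(uint64_t max_value)
{
    assert(max_value > 0);
    return max_value / 1000000u;
}
```

    seconds_until_wrap(UINT32_MAX) evaluates to 4294, while the 64-bit
    equivalent is large enough that wrapping is no longer a practical
    concern.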

    Signed-off-by: Yi Yang
    Acked-by: Venkatesh Pallipadi
    Signed-off-by: Len Brown

    Yi Yang
     

14 Feb, 2008

1 commit

  • Add a new sysfs entry under cpuidle states. desc - can be used by driver to
    communicate to userspace any specific information about the state.
    This helps in identifying the exact hardware C-states behind the ACPI C-state
    definition.

    Idea is to export this through powertop, which will help to map the C-state
    reported by powertop to actual hardware C-state.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Len Brown

    Venkatesh Pallipadi
     

07 Feb, 2008

3 commits