29 Jan, 2014

6 commits

  • This patch ports the cpuidle framework to the powernv
    platform and implements a powernv cpuidle back-end
    driver that uses the power7_nap and snooze idle states.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • smt-snooze-delay was designed to disable the NAP state, or to delay
    entry into it, before the cpuidle framework was adopted. It is a
    per-cpu variable. With the cpuidle framework, states can be
    disabled on a per-cpu basis using the cpuidle/enable sysfs entry.

    Also, with the cpuidle driver, each state's target residency
    is per-driver, whereas earlier it was per-device. Therefore,
    the per-cpu sysfs smt-snooze-delay, which decides the target residency
    of the idle state on a particular cpu, only confuses the user,
    because we cannot have different smt-snooze-delay (target residency)
    values for each cpu.

    In the current code, smt-snooze-delay functionality is completely broken.
    It makes sense to remove smt-snooze-delay from the idle driver now that
    the cpuidle framework is in place.
    However, the sysfs files are retained because ppc64_util currently
    uses them. Once ppc64_util is fixed, we propose to clean
    up the kernel code.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • This patch removes the usage of the MAX_IDLE_STATE macro
    and the dead code around it. The number of states
    is determined at run time based on the cpuidle
    state table selected on a given platform.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • Currently the cpuidle-pseries backend driver cannot be
    built as a module due to its dependencies on the cpuidle framework.
    This patch removes all the module-related code in the driver.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • This patch replaces the cpuidle driver and device initialisation
    calls with a single generic cpuidle_register() call
    and also includes minor refactoring of the code around it.

    Remove the cpu online check in the snooze loop: this code can
    only run locally on a cpu that is online, so the check is
    not required.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • Move the file from the arch-specific pseries/processor_idle.c
    to drivers/cpuidle/cpuidle-pseries.c.
    Make the relevant Makefile and Kconfig changes.
    Also, introduce Kconfig.powerpc in drivers/cpuidle
    for all powerpc cpuidle drivers.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     

30 Dec, 2013

1 commit

  • Commit 60a66e370007e8535b7a561353b07b37deaf35ba changed the Calxeda
    cpuidle driver to a platform driver, copying the __init tag from the
    _init() to the newly used _probe() function. However, "probe should
    not be __init." (Rob said ;-)
    Remove the __init tag to fix a section mismatch in the Calxeda
    cpuidle driver.

    Signed-off-by: Andre Przywara
    Signed-off-by: Daniel Lezcano

    Andre Przywara
     

04 Dec, 2013

1 commit

  • If not, we could end up in the unfortunate situation where
    we dereference a NULL pointer because we have cpuidle disabled.

    This is the case when booting under Xen (which uses the
    ACPI P/C states but disables the CPU idle driver) - and can
    be easily reproduced when booting with cpuidle.off=1.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] cpuidle_unregister_device+0x2a/0x90
    .. snip..
    Call Trace:
    [] acpi_processor_power_exit+0x3c/0x5c
    [] acpi_processor_stop+0x61/0xb6
    [] __device_release_driver+0fffff81421653>] device_release_driver+0x23/0x30
    [] bus_remove_device+0x108/0x180
    [] device_del+0x129/0x1c0
    [] ? unregister_xenbus_watch+0x1f0/0x1f0
    [] device_unregister+0x1e/0x60
    [] unregister_cpu+0x39/0x60
    [] arch_unregister_cpu+0x23/0x30
    [] handle_vcpu_hotplug_event+0xc1/0xe0
    [] xenwatch_thread+0x45/0x120
    [] ? abort_exclusive_wait+0xb0/0xb0
    [] kthread+0xd2/0xf0
    [] ? kthread_create_on_node+0x180/0x180
    [] ret_from_fork+0x7c/0xb0
    [] ? kthread_create_on_node+0x180/0x180

    This problem also appears in 3.12 and could be a candidate for backport.

    Signed-off-by: Konrad Rzeszutek Wilk
    Cc: All applicable
    Signed-off-by: Rafael J. Wysocki

    Konrad Rzeszutek Wilk
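
A minimal userspace sketch of the guard this message describes: return early when there is no device, or when it was never registered, instead of dereferencing it. The struct and function names here (idle_device, unregister_idle_device, example) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cpu idle device. */
struct idle_device {
    int registered;   /* non-zero once the device has been registered */
};

/* Returns true if the device was actually unregistered.
 * A NULL or never-registered device is ignored instead of dereferenced. */
static bool unregister_idle_device(struct idle_device *dev)
{
    if (dev == NULL || !dev->registered)
        return false;

    dev->registered = 0;
    return true;
}

/* Usage: the "cpuidle disabled" case is modelled as a NULL device. */
static void example(void)
{
    struct idle_device dev = { .registered = 1 };

    assert(!unregister_idle_device(NULL)); /* no crash, nothing to do */
    assert(unregister_idle_device(&dev));  /* normal teardown */
    assert(!unregister_idle_device(&dev)); /* second call is a no-op */
}
```

The point of the sketch is only the ordering: the validity check comes before any field of the device is touched.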
     

14 Nov, 2013

1 commit

  • Pull ACPI and power management updates from Rafael J Wysocki:

    - New power capping framework and the Intel Running Average Power
    Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.

    - Addition of the in-kernel switching feature to the arm_big_little
    cpufreq driver from Viresh Kumar and Nicolas Pitre.

    - cpufreq support for iMac G5 from Aaro Koskinen.

    - Baytrail processors support for intel_pstate from Dirk Brandewie.

    - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.

    - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.

    - ACPI power management support for the I2C and SPI bus types from Mika
    Westerberg and Lv Zheng.

    - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
    Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.

    - cpufreq drivers updates (mostly fixes and cleanups) from Viresh
    Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz
    Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.

    - intel_pstate updates from Dirk Brandewie and Adrian Huang.

    - ACPICA update to version 20130927 including fixes and cleanups and
    some reduction of divergences between the ACPICA code in the kernel
    and ACPICA upstream in order to improve the automatic ACPICA patch
    generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh
    Bhat, Bjorn Helgaas, David E Box.

    - ACPI IPMI driver fixes and cleanups from Lv Zheng.

    - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang
    Yanfei, Rafael J Wysocki.

    - Conversion of the ACPI AC driver to the platform bus type and
    multiple driver fixes and cleanups related to ACPI from Zhang Rui.

    - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
    Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.

    - Fixes and cleanups and new blacklist entries related to the ACPI
    video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
    Kirill Tkhai.

    - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.

    - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
    Bartlomiej Zolnierkiewicz, Prarit Bhargava.

    - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.

    - Operation Performance Points (OPP) core updates from Nishanth Menon.

    - Runtime power management core fix from Rafael J Wysocki and update
    from Ulf Hansson.

    - Hibernation fixes from Aaron Lu and Rafael J Wysocki.

    - Device suspend/resume lockup detection mechanism from Benoit Goby.

    - Removal of unused proc directories created for various ACPI drivers
    from Lan Tianyu.

    - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
    handler from Heikki Krogerus and Jarkko Nikula.

    - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.

    - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al
    Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
    Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
    Liu Chuansheng.

    - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
    Jean-Christophe Plagniol-Villard.

    * tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits)
    cpufreq: conservative: fix requested_freq reduction issue
    ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines
    PM / runtime: Use pm_runtime_put_sync() in __device_release_driver()
    ACPI / event: remove unneeded NULL pointer check
    Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1"
    ACPI / video: Quirk initial backlight level 0
    ACPI / video: Fix initial level validity test
    intel_pstate: skip the driver if ACPI has power mgmt option
    PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
    ACPI / hotplug: Do not execute "insert in progress" _OST
    ACPI / hotplug: Carry out PCI root eject directly
    ACPI / hotplug: Merge device hot-removal routines
    ACPI / hotplug: Make acpi_bus_hot_remove_device() internal
    ACPI / hotplug: Simplify device ejection routines
    ACPI / hotplug: Fix handle_root_bridge_removal()
    ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug
    ACPI / scan: Start matching drivers after trying scan handlers
    ACPI: Remove acpi_pci_slot_init() headers from internal.h
    ACPI / blacklist: fix name of ThinkPad Edge E530
    PowerCap: Fix build error with option -Werror=format-security
    ...

    Conflicts:
    arch/arm/mach-omap2/opp.c
    drivers/Kconfig
    drivers/spi/spi.c

    Linus Torvalds
     

30 Oct, 2013

12 commits


17 Oct, 2013

1 commit

  • As the cpuidle driver code no longer depends on the pm code (the
    'standby' callback is now passed as a parameter in the device's platform
    data), we can move the cpuidle driver into the drivers/cpuidle directory.

    Signed-off-by: Daniel Lezcano
    Acked-by: Jean-Christophe PLAGNIOL-VILLARD
    Acked-by: Nicolas Ferre

    Conflicts:

    drivers/cpuidle/Kconfig.arm
    drivers/cpuidle/Makefile

    Daniel Lezcano
     

07 Oct, 2013

4 commits


02 Oct, 2013

3 commits


13 Sep, 2013

1 commit

  • Pull ACPI and power management fixes from Rafael Wysocki:
    "All of these commits are fixes that have emerged recently and some of
    them fix bugs introduced during this merge window.

    Specifics:

    1) ACPI-based PCI hotplug (ACPIPHP) fixes related to spurious events

    After the recent ACPIPHP changes we've seen some interesting
    breakage on a system that triggers device check notifications
    during boot for non-existing devices. Although those
    notifications are really spurious, we should be able to deal with
    them nevertheless and that shouldn't introduce too much overhead.
    Four commits to make that work properly.

    2) Memory hotplug and hibernation mutual exclusion rework

    This was meant to be a cleanup, but it happens to fix a classical
    ABBA deadlock between system suspend/hibernation and ACPI memory
    hotplug which is possible if they are started roughly at the same
    time. Three commits rework memory hotplug so that it doesn't
    acquire pm_mutex and make hibernation use device_hotplug_lock
    which prevents it from racing with memory hotplug.

    3) ACPI Intel LPSS (Low-Power Subsystem) driver crash fix

    The ACPI LPSS driver crashes during boot on Apple Macbook Air with
    Haswell that has slightly unusual BIOS configuration in which one
    of the LPSS device's _CRS method doesn't return all of the
    information expected by the driver. Fix from Mika Westerberg, for
    stable.

    4) ACPICA fix related to Store->ArgX operation

    AML interpreter fix for obscure breakage that causes AML to be
    executed incorrectly on some machines (observed in practice).
    From Bob Moore.

    5) ACPI core fix for PCI ACPI device objects lookup

    There still are cases in which there is more than one ACPI device
    object matching a given PCI device and we don't choose the one
    that the BIOS expects us to choose, so this makes the lookup take
    more criteria into account in those cases.

    6) Fix to prevent cpuidle from crashing in some rare cases

    If the result of cpuidle_get_driver() is NULL, which can happen on
    some systems, cpuidle_driver_ref() will crash trying to use that
    pointer, and Daniel Fu's fix prevents that from happening.

    7) cpufreq fixes related to CPU hotplug

    Stephen Boyd reported a number of concurrency problems with
    cpufreq related to CPU hotplug which are addressed by a series of
    fixes from Srivatsa S Bhat and Viresh Kumar.

    8) cpufreq fix for time conversion in time_in_state attribute

    Time conversion carried out by cpufreq when user space attempts to
    read /sys/devices/system/cpu/cpu*/cpufreq/stats/time_in_state
    won't work correctly if cputime_t doesn't map directly to jiffies.
    Fix from Andreas Schwab.

    9) Revert of a troublesome cpufreq commit

    Commit 7c30ed5 (cpufreq: make sure frequency transitions are
    serialized) was intended to address some known concurrency
    problems in cpufreq related to the ordering of transitions, but
    unfortunately it introduced several problems of its own, so I
    decided to revert it now and address the original problems later
    in a more robust way.

    10) Intel Haswell CPU models for intel_pstate from Nell Hardcastle.

    11) cpufreq fixes related to system suspend/resume

    The recent cpufreq changes that made it preserve CPU sysfs
    attributes over suspend/resume cycles introduced a possible NULL
    pointer dereference that caused it to crash during the second
    attempt to suspend. Three commits from Srivatsa S Bhat fix that
    problem and a couple of related issues.

    12) cpufreq locking fix

    cpufreq_policy_restore() should acquire the lock for reading, but
    it acquires it for writing. Fix from Lan Tianyu"

    * tag 'pm+acpi-fixes-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (25 commits)
    cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
    cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
    cpufreq: Restructure if/else block to avoid unintended behavior
    cpufreq: Fix crash in cpufreq-stats during suspend/resume
    intel_pstate: Add Haswell CPU models
    Revert "cpufreq: make sure frequency transitions are serialized"
    cpufreq: Use signed type for 'ret' variable, to store negative error values
    cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
    cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
    cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
    cpufreq: Split __cpufreq_remove_dev() into two parts
    cpufreq: Fix wrong time unit conversion
    cpufreq: serialize calls to __cpufreq_governor()
    cpufreq: don't allow governor limits to be changed when it is disabled
    ACPI / bind: Prefer device objects with _STA to those without it
    ACPI / hotplug / PCI: Avoid parent bus rescans on spurious device checks
    ACPI / hotplug / PCI: Use _OST to notify firmware about notify status
    ACPI / hotplug / PCI: Avoid doing too much for spurious notifies
    ACPICA: Fix for a Store->ArgX when ArgX contains a reference to a field.
    ACPI / hotplug / PCI: Don't trim devices before scanning the namespace
    ...

    Linus Torvalds
     

10 Sep, 2013

1 commit

  • Pull ARM SoC driver update from Kevin Hilman:
    "This contains the ARM SoC related driver updates for v3.12. The only
    things this cycle are core PM updates and CPUidle support for ARM's TC2
    big.LITTLE development platform"

    * tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
    ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
    drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()

    Linus Torvalds
     

31 Aug, 2013

1 commit


30 Aug, 2013

3 commits

  • The coupled cpuidle waiting loop clears pending pokes before
    entering the safe state. If a poke arrives just before the
    pokes are cleared, but after the while loop condition checks,
    the poke will be lost and the cpu will stay in the safe state
    until another interrupt arrives. This may cause the cpu that
    sent the poke to spin in the ready loop with interrupts off
    until another cpu receives an interrupt, and if no other cpus
    have interrupts routed to them it can spin forever.

    Change cpuidle_coupled_clear_pokes to return whether a poke
    was cleared, and move the need_resched() checks into the
    callers. In the waiting loop, if a poke was cleared, restart
    the loop to repeat the while condition checks.

    Reported-by: Neil Zhang
    Signed-off-by: Colin Cross
    Cc: 3.6+ # 3.6+
    Signed-off-by: Rafael J. Wysocki

    Colin Cross
     
  • Joseph Lo reported a lockup on Tegra20 caused
    by a race condition in coupled cpuidle. When two or more cpus
    enter idle at the same time, the first cpus to arrive may go to the
    ready loop without processing pending pokes from the last cpu to
    arrive.

    This patch adds a check for pending pokes once all cpus have been
    synchronized in the ready loop and resets the coupled state and
    retries if any cpus failed to handle their pending poke.

    Retrying on all cpus may trigger the same issue again, so this patch
    also adds a check to ensure that each cpu has received at least one
    poke between when it enters the waiting loop and when it moves on to
    the ready loop.

    Reported-and-tested-by: Joseph Lo
    Tested-by: Stephen Warren
    Signed-off-by: Colin Cross
    Cc: 3.6+ # 3.6+
    Signed-off-by: Rafael J. Wysocki

    Colin Cross
     
  • Calling cpuidle_enter_state is expected to return with interrupts
    enabled, but interrupts must be disabled before starting the
    ready loop synchronization stage. Call local_irq_disable after
    each call to cpuidle_enter_state for the safe state.

    Tested-by: Stephen Warren
    Signed-off-by: Colin Cross
    Signed-off-by: Rafael J. Wysocki

    Colin Cross
     

29 Aug, 2013

2 commits

  • From Lorenzo Pieralisi:
    This patch series contains:

    - GIC driver update to add a method to disable the GIC CPU IF
    - TC2 MCPM update to add GIC CPU disabling to suspend method
    - TC2 CPU idle big.LITTLE driver

    * cpuidle/biglittle:
    cpuidle: big.LITTLE: vexpress-TC2 CPU idle driver
    ARM: vexpress: tc2: disable GIC CPU IF in tc2_pm_suspend
    drivers: irq-chip: irq-gic: introduce gic_cpu_if_down()
    ARM: vexpress/TC2: implement PM suspend method
    ARM: vexpress/TC2: basic PM support
    ARM: vexpress: Add SCC to V2P-CA15_A7's device tree
    ARM: vexpress/TC2: add Serial Power Controller (SPC) support
    ARM: vexpress/dcscb: fix cache disabling sequences

    Signed-off-by: Olof Johansson

    Olof Johansson
     
  • The big.LITTLE architecture is composed of two clusters of cpus. One cluster
    contains less powerful but more energy efficient processors and the other
    cluster groups the powerful but energy-intensive cpus.

    The TC2 testchip implements two clusters of CPUs (A7 and A15 clusters in
    a big.LITTLE configuration) connected through a CCI interconnect that manages
    coherency of their respective L2 caches and intercluster distributed
    virtual memory messages (DVM).

    The TC2 testchip integrates a power controller that manages core resets,
    wake-up IRQs and cluster low-power states. Power states are managed at
    cluster level, which means that voltage is removed from a cluster iff all
    cores in the cluster are in a wfi state. Single cores can enter a reset
    state which is identical to wfi in terms of power consumption but
    simplifies the way cluster states are entered.

    This patch provides a multiple driver CPU idle implementation for TC2
    which paves the way for a generic big.LITTLE idle driver for all
    upcoming big.LITTLE based systems on chip.

    The driver relies on the MCPM infrastructure to coordinate and manage
    core power states; in particular, MCPM allows suspending specific cores
    and hides the CPU coordination required to shut down clusters of CPUs.

    Power down sequences for the respective clusters are implemented in the
    MCPM TC2 backend, with all code needed to clean caches and exit coherency.

    The multiple driver CPU idle infrastructure allows defining different
    C-states for big and little cores, determined at boot by checking the
    part id of the possible CPUs and initializing the respective logical
    masks in the big and little drivers.

    Current big.LITTLE systems are composed of A7 and A15 clusters, as
    implemented in TC2, but in the future that may change and the driver
    will have to evolve to retrieve what is a 'big' cpu and what is a
    'little' cpu in order to build the correct topology.

    Cc: Kevin Hilman
    Cc: Amit Kucheria
    Cc: Olof Johansson
    Cc: Nicolas Pitre
    Cc: Rafael J. Wysocki
    Signed-off-by: Daniel Lezcano
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Olof Johansson

    Lorenzo Pieralisi
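
As a rough illustration of "checking the part id of the possible CPUs and initializing the respective logical masks", the sketch below builds one bitmask per core type from a table of MIDR values. The helper names (midr_part, build_part_mask) and the idea of passing MIDR values in an array are assumptions made for the example; the driver itself reads the part id through kernel helpers. The part numbers 0xC07 (Cortex-A7) and 0xC0F (Cortex-A15) are the primary part number field, bits [15:4] of MIDR.

```c
#include <assert.h>
#include <stdint.h>

#define PART_CORTEX_A7   0xC07u   /* 'little' cores on TC2 */
#define PART_CORTEX_A15  0xC0Fu   /* 'big' cores on TC2 */

/* Extract the primary part number, bits [15:4] of the MIDR value. */
static unsigned int midr_part(uint32_t midr)
{
    return (midr >> 4) & 0xFFFu;
}

/* Build a logical CPU mask (bit n set = cpu n) of all CPUs whose
 * part number matches 'part'. Hypothetical helper for illustration. */
static uint32_t build_part_mask(const uint32_t *midr, unsigned int ncpus,
                                unsigned int part)
{
    uint32_t mask = 0;
    unsigned int cpu;

    assert(ncpus <= 32u);   /* one bit per cpu in a 32-bit mask */
    for (cpu = 0; cpu < ncpus; cpu++) {
        if (midr_part(midr[cpu]) == part)
            mask |= UINT32_C(1) << cpu;
    }
    return mask;
}
```

With an illustrative table of two A15 cores followed by three A7 cores, the big mask comes out as 0x03 and the little mask as 0x1C, and each driver would then be registered only for the CPUs in its own mask.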
     

23 Aug, 2013

3 commits

  • The value of the predicted_us field can never exceed the expected_us
    value, but it has a potentially larger type. As there is no need for an
    additional 32 bits of zeroes on 32 bit platforms, change the type of
    predicted_us to match the type of expected_us.

    The correction_factor field stores a value that cannot exceed the
    product of RESOLUTION and DECAY (default 1024*8 = 8192). In practice
    the constants cannot be raised to values large enough to overflow
    unsigned int, even on 32 bit systems, so the type is changed to avoid
    unnecessary 64 bit arithmetic on 32 bit systems.

    One multiplication of (now) 32 bit values needs a cast to avoid
    truncation of the result; the cast has been added.

    In order to avoid another multiplication from 32 bit domain to 64 bit
    domain, the new correction_factor calculation has been changed from
    new = old * (DECAY-1) / DECAY
    to
    new = old - old / DECAY,
    which with infinite precision would yield exactly the same result, but
    now changes the direction of rounding. The impact is not significant as
    the maximum accumulated difference cannot exceed the value of DECAY,
    which is relatively small compared to product of RESOLUTION and DECAY
    (8 / 8192).

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
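
The change in rounding direction described above can be checked with a small userspace sketch. The function names (decay_mul, decay_sub) are hypothetical, and the constants are the defaults quoted in the message (RESOLUTION 1024, DECAY 8); this shows only the decay step, not the governor's full update.

```c
#include <assert.h>
#include <stdint.h>

#define RESOLUTION 1024u
#define DECAY      8u

/* Old form: old * (DECAY-1) / DECAY. The multiply is widened to 64 bits
 * so it cannot overflow; the final division truncates the result. */
static unsigned int decay_mul(unsigned int old)
{
    return (unsigned int)(((uint64_t)old * (DECAY - 1u)) / DECAY);
}

/* New form: old - old / DECAY. Stays in 32 bit arithmetic. Here the
 * truncated quotient is subtracted, so the result can be one higher
 * than decay_mul() when old is not a multiple of DECAY. */
static unsigned int decay_sub(unsigned int old)
{
    /* Bound assumed from the commit message: at most RESOLUTION * DECAY. */
    assert(old <= RESOLUTION * DECAY);
    return old - old / DECAY;
}
```

For old = 8192 both forms give 7168. For old = 8191 the multiply form gives 7167 and the subtract form gives 7168, which is the one-unit rounding difference per step that the message refers to.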
     
  • The menu governor has a number of tunable constants that may be changed
    in the source. If certain combinations of values are chosen, an overflow
    is possible when the correction_factor is being recalculated.

    This patch adds a warning regarding this possibility and describes the
    change needed for fixing the issue. The change should not be permanently
    enabled, as it will hurt performance when it is not needed.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • The menu governor uses a static function get_typical_interval() to
    try to detect a repeating pattern of wakeups. The previous interval
    durations are stored as an array of unsigned ints, but the arithmetic
    in the function is performed exclusively as 64 bit values, even when
    the value stored in a variable is known not to exceed unsigned int,
    which may be smaller and more efficient on some platforms.

    This patch changes the types of the variables used to store some
    intermediates, the maximum and the cutoff threshold to unsigned
    ints. Average and standard deviation are still treated as 64 bit values,
    even when the values are known to be within the domain of unsigned int,
    to avoid the casts that would otherwise be needed to ensure correct
    integer promotion for arithmetic operations.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
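
The typing choice in this last message can be sketched as follows: samples, the maximum and the threshold are unsigned int, while the sum, average and variance stay 64 bit so that no casts are needed in the arithmetic. This is a simplified illustration, not the kernel's get_typical_interval(); the names interval_stats and summarize, and the window size of 8, are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

#define INTERVALS 8   /* assumed window size for this sketch */

struct interval_stats {
    uint64_t avg;        /* kept 64 bit so later arithmetic needs no casts */
    uint64_t variance;   /* mean of squared deviations, also 64 bit */
    unsigned int max;    /* never exceeds a sample, so unsigned int suffices */
    unsigned int count;  /* number of samples at or below the threshold */
};

/* Summarize the samples that do not exceed 'thresh'. Assumes the samples
 * are small enough that the sum of squared deviations fits in 64 bits. */
static struct interval_stats summarize(const unsigned int *iv,
                                       unsigned int thresh)
{
    struct interval_stats s = { 0, 0, 0, 0 };
    uint64_t sum = 0;
    int i;

    assert(iv != 0);
    for (i = 0; i < INTERVALS; i++) {
        unsigned int value = iv[i];
        if (value <= thresh) {
            sum += value;
            s.count++;
            if (value > s.max)
                s.max = value;
        }
    }
    if (s.count == 0)
        return s;

    s.avg = sum / s.count;
    for (i = 0; i < INTERVALS; i++) {
        unsigned int value = iv[i];
        if (value <= thresh) {
            /* Absolute difference avoids signed overflow when squaring. */
            uint64_t diff = value > s.avg ? value - s.avg : s.avg - value;
            s.variance += diff * diff;
        }
    }
    s.variance /= s.count;
    return s;
}
```

With seven samples of 10 and one of 1000, a threshold of 100 drops the outlier and yields an average of 10 with zero variance; a threshold of 2000 keeps all eight samples and yields an average of 133 (integer division of 1070 by 8).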