08 Apr, 2014

2 commits

  • * pm-cpuidle:
    cpuidle: sysfs: Export target residency information
    intel_idle: fine-tune IVT residency targets
    tools/power turbostat: Run on Broadwell
    tools/power turbostat: simplify output, add Avg_MHz
    intel_idle: Add CPU model 54 (Atom N2000 series)
    intel_idle: support Bay Trail
    intel_idle: allow sparse sub-state numbering, for Bay Trail
    ACPI idle: permit sparse C-state sub-state numbers

    Rafael J. Wysocki
     
  • From user space, there is no way to know the target residency for each idle
    state. If we want to write tools to measure the accuracy of the idle state
    selection from the governor, we need this info.

    As the exit latency is exported through sysfs, exporting the target residency
    in the same place makes sense.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
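
As a hedged illustration of how a user-space tool might consume this information, the sketch below reads the exit latency and target residency of each idle state of one CPU. The attribute names `latency` and `residency` and the `<base>/cpuN/cpuidle/stateM/` layout are assumptions about the exported interface; `read_idle_attr` and `dump_idle_states` are hypothetical helper names, not part of any kernel or library API.

```c
#include <stdio.h>

/*
 * Read one unsigned integer attribute of an idle state.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed.
 * The path layout is an assumption made for this sketch.
 */
int read_idle_attr(const char *base, unsigned int cpu, unsigned int state,
                   const char *attr, unsigned long *value)
{
    char path[256];
    FILE *f;
    int ok;
    int n = snprintf(path, sizeof(path), "%s/cpu%u/cpuidle/state%u/%s",
                     base, cpu, state, attr);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%lu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Print exit latency and target residency for every idle state of one CPU.
 * Returns the number of states found (0 if cpuidle is not available).
 */
int dump_idle_states(const char *base, unsigned int cpu)
{
    unsigned int state;
    unsigned long latency, residency;

    for (state = 0; ; state++) {
        if (read_idle_attr(base, cpu, state, "latency", &latency))
            break; /* no more states */
        if (read_idle_attr(base, cpu, state, "residency", &residency))
            printf("state%u: latency=%lu us, residency not exported\n",
                   state, latency);
        else
            printf("state%u: latency=%lu us, residency=%lu us\n",
                   state, latency, residency);
    }
    return (int)state;
}
```

Calling `dump_idle_states("/sys/devices/system/cpu", 0)` would list the states of CPU 0 on a system where cpuidle is active; a measurement tool could then compare observed idle durations against the residency values.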
     

06 Apr, 2014

1 commit

  • Pull ARM SoC driver changes from Arnd Bergmann:
    "These changes are mostly for ARM specific device drivers that either
    don't have an upstream maintainer, or that had the maintainer ask us
    to pick up the changes to avoid conflicts.

    A large chunk of this are clock drivers (bcm281xx, exynos, versatile,
    shmobile), aside from that, reset controllers for STi as well as a
    large rework of the Marvell Orion/EBU watchdog driver are notable"

    * tag 'drivers-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (99 commits)
    Revert "dts: socfpga: Add DTS entry for adding the stmmac glue layer for stmmac."
    Revert "net: stmmac: Add SOCFPGA glue driver"
    ARM: shmobile: r8a7791: Fix SCIFA3-5 clocks
    ARM: STi: Add reset controller support to mach-sti Kconfig
    drivers: reset: stih416: add softreset controller
    drivers: reset: stih415: add softreset controller
    drivers: reset: Reset controller driver for STiH416
    drivers: reset: Reset controller driver for STiH415
    drivers: reset: STi SoC system configuration reset controller support
    dts: socfpga: Add sysmgr node so the gmac can use to reference
    dts: socfpga: Add support for SD/MMC on the SOCFPGA platform
    reset: Add optional resets and stubs
    ARM: shmobile: r7s72100: fix bus clock calculation
    Power: Reset: Generalize qnap-poweroff to work on Synology devices.
    dts: socfpga: Update clock entry to support multiple parents
    ARM: socfpga: Update socfpga_defconfig
    dts: socfpga: Add DTS entry for adding the stmmac glue layer for stmmac.
    net: stmmac: Add SOCFPGA glue driver
    watchdog: orion_wdt: Use %pa to print 'phys_addr_t'
    drivers: cci: Export CCI PMU revision
    ...

    Linus Torvalds
     

03 Apr, 2014

2 commits

  • Pull sched/idle changes from Ingo Molnar:
    "More idle code reorganization, to prepare for more integration.

    (Sent separately because it depended on pending timer work, which is
    now upstream)"

    * 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/idle: Add more comments to the code
    sched/idle: Move idle conditions in cpuidle_idle main function
    sched/idle: Reorganize the idle loop
    cpuidle/idle: Move the cpuidle_idle_call function to idle.c
    idle/cpuidle: Split cpuidle_idle_call main function into smaller functions

    Linus Torvalds
     
  • Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt:
    "This is the branch I mentioned in my other pull request which contains
    our improved cpuidle support for the "powernv" platform
    (non-virtualized).

    It adds support for the "fast sleep" feature of the processor which
    provides higher power savings than our usual "nap" mode but at the
    cost of losing the timers while asleep, and thus exploits the new
    timer broadcast framework to work around that limitation.

    It's based on a tip timer tree that you seem to have already merged"

    * 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    cpuidle/powernv: Parse device tree to setup idle states
    cpuidle/powernv: Add "Fast-Sleep" CPU idle state
    powerpc/powernv: Add OPAL call to resync timebase on wakeup
    powerpc/powernv: Add context management for Fast Sleep
    powerpc: Split timer_interrupt() into timer handling and interrupt handling routines
    powerpc: Implement tick broadcast IPI as a fixed IPI message
    powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message

    Linus Torvalds
     

02 Apr, 2014

3 commits

  • Pull core block layer updates from Jens Axboe:
    "This is the pull request for the core block IO bits for the 3.15
    kernel. It's a smaller round this time, it contains:

    - Various little blk-mq fixes and additions from Christoph and
    myself.

    - Cleanup of the IPI usage from the block layer, and associated
    helper code. From Frederic Weisbecker and Jan Kara.

    - Duplicate code cleanup in bio-integrity from Gu Zheng. This will
    give you a merge conflict, but that should be easy to resolve.

    - blk-mq notify spinlock fix for RT from Mike Galbraith.

    - A blktrace partial accounting bug fix from Roman Pen.

    - Missing REQ_SYNC detection fix for blk-mq from Shaohua Li"

    * 'for-3.15/core' of git://git.kernel.dk/linux-block: (25 commits)
    blk-mq: add REQ_SYNC early
    rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock
    blk-mq: support partial I/O completions
    blk-mq: merge blk_mq_insert_request and blk_mq_run_request
    blk-mq: remove blk_mq_alloc_rq
    blk-mq: don't dump CPU -> hw queue map on driver load
    blk-mq: fix wrong usage of hctx->state vs hctx->flags
    blk-mq: allow blk_mq_init_commands() to return failure
    block: remove old blk_iopoll_enabled variable
    blktrace: fix accounting of partially completed requests
    smp: Rename __smp_call_function_single() to smp_call_function_single_async()
    smp: Remove wait argument from __smp_call_function_single()
    watchdog: Simplify a little the IPI call
    smp: Move __smp_call_function_single() below its safe version
    smp: Consolidate the various smp_call_function_single() declensions
    smp: Teach __smp_call_function_single() to check for offline cpus
    smp: Remove unused list_head from csd
    smp: Iterate functions through llist_for_each_entry_safe()
    block: Stop abusing rq->csd.list in blk-softirq
    block: Remove useless IPI struct initialization
    ...

    Linus Torvalds
     
  • Pull ACPI and power management updates from Rafael Wysocki:
    "The majority of this material spent some time in linux-next, some of
    it even several weeks. There are a few relatively fresh commits in
    it, but they are mostly fixes and simple cleanups.

    ACPI took the lead this time, both in terms of the number of commits
    and the number of modified lines of code, cpufreq follows and there
    are a few changes in the PM core and in cpuidle too.

    A new feature that already got some LWN.net's attention is the device
    PM QoS extension allowing latency tolerance requirements to be
    propagated from leaf devices to their ancestors with hardware
    interfaces for specifying latency tolerance. That should help systems
    with hardware-driven power management to avoid going too far with it
    in cases when there are latency tolerance constraints.

    There also are some significant changes in the ACPI core related to
    the way in which hotplug notifications are handled. They affect PCI
    hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line
    is that all those notifications now go through the root notify handler
    and are propagated to the interested subsystems by means of callbacks
    instead of having to install a notify handler for each device object
    that we can potentially get hotplug notifications for.

    In addition to that ACPICA will now advertise "Windows 2013"
    compatibility for _OSI, because some systems out there don't work
    correctly if that is not done (some of them don't even boot).

    On the system suspend side of things, all of the device suspend and
    resume callbacks, except for ->prepare() and ->complete(), are now
    going to be executed asynchronously as that turns out to speed up
    system suspend and resume on some platforms quite significantly and we
    have a few more optimizations in that area.

    Apart from that, there are some new device IDs and fixes and cleanups
    all over. In particular, the system suspend and resume handling by
    cpufreq should be improved and the cpuidle menu governor should be a
    bit more robust now.

    Specifics:

    - Device PM QoS support for latency tolerance constraints on systems
    with hardware interfaces allowing such constraints to be specified.
    That is necessary to prevent hardware-driven power management from
    becoming overly aggressive on some systems and to prevent power
    management features leading to excessive latencies from being used
    in some cases.

    - Consolidation of the handling of ACPI hotplug notifications for
    device objects. This causes all device hotplug notifications to go
    through the root notify handler (that was executed for all of them
    anyway before) that propagates them to individual subsystems, if
    necessary, by executing callbacks provided by those subsystems
    (those callbacks are associated with struct acpi_device objects
    during device enumeration). As a result, the code in question
    becomes both smaller in size and more straightforward and all of
    those changes should not affect users.

    - ACPICA update, including fixes related to the handling of _PRT in
    cases when it is broken and the addition of "Windows 2013" to the
    list of supported "features" for _OSI (which is necessary to
    support systems that work incorrectly or don't even boot without
    it). Changes from Bob Moore and Lv Zheng.

    - Consolidation of ACPI _OST handling from Jiang Liu.

    - ACPI battery and AC fixes allowing unusual system configurations to
    be handled by that code from Alexander Mezin.

    - New device IDs for the ACPI LPSS driver from Chiau Ee Chew.

    - ACPI fan and thermal optimizations related to system suspend and
    resume from Aaron Lu.

    - Cleanups related to ACPI video from Jean Delvare.

    - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
    Tianyu, Paul Bolle, Tomasz Nowicki.

    - Intel RAPL (Running Average Power Limits) driver cleanups from
    Jacob Pan.

    - intel_pstate fixes and cleanups from Dirk Brandewie.

    - cpufreq fixes related to system suspend/resume handling from Viresh
    Kumar.

    - cpufreq core fixes and cleanups from Viresh Kumar, Stratos
    Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.

    - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
    Herring.

    - cpuidle fixes related to the menu governor from Tuukka Tikkanen.

    - cpuidle fix related to coupled CPUs handling from Paul Burton.

    - Asynchronous execution of all device suspend and resume callbacks,
    except for ->prepare and ->complete, during system suspend and
    resume from Chuansheng Liu.

    - Delayed resuming of runtime-suspended devices during system suspend
    for the PCI bus type and ACPI PM domain.

    - New set of PM helper routines to allow device runtime PM callbacks
    to be used during system suspend and resume more easily from Ulf
    Hansson.

    - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
    Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.

    - devfreq fix from Saravana Kannan"

    * tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
    PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
    PM / sleep: Correct whitespace errors in
    intel_pstate: Set core to min P state during core offline
    cpufreq: Add stop CPU callback to cpufreq_driver interface
    cpufreq: Remove unnecessary braces
    cpufreq: Fix checkpatch errors and warnings
    cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
    MAINTAINERS: Reorder maintainer addresses for PM and ACPI
    PM / Runtime: Update runtime_idle() documentation for return value meaning
    video / output: Drop display output class support
    fujitsu-laptop: Drop unneeded include
    acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
    ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
    ACPI / video: fix ACPI_VIDEO dependencies
    cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
    cpufreq: Do not allow ->setpolicy drivers to provide ->target
    cpufreq: arm_big_little: set 'physical_cluster' for each CPU
    cpufreq: arm_big_little: make vexpress driver depend on bL core driver
    ACPI / button: Add ACPI Button event via netlink routine
    ACPI: Remove duplicate definitions of PREFIX
    ...

    Linus Torvalds
     
  • Pull timer changes from Thomas Gleixner:
    "This assorted collection provides:

    - A new timer based timer broadcast feature for systems which do not
    provide a global accessible timer device. That allows those
    systems to put CPUs into deep idle states where the per cpu timer
    device stops.

    - A few NOHZ_FULL related improvements to the timer wheel

    - The usual updates to timer devices found in ARM SoCs

    - Small improvements and updates all over the place"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
    tick: Remove code duplication in tick_handle_periodic()
    tick: Fix spelling mistake in tick_handle_periodic()
    x86: hpet: Use proper destructor for delayed work
    workqueue: Provide destroy_delayed_work_on_stack()
    clocksource: CMT, MTU2, TMU and STI should depend on GENERIC_CLOCKEVENTS
    timer: Remove code redundancy while calling get_nohz_timer_target()
    hrtimer: Rearrange comments in the order struct members are declared
    timer: Use variable head instead of &work_list in __run_timers()
    clocksource: exynos_mct: silence a static checker warning
    arm: zynq: Add support for cpufreq
    arm: zynq: Don't use arm_global_timer with cpufreq
    clocksource/cadence_ttc: Overhaul clocksource frequency adjustment
    clocksource/cadence_ttc: Call clockevents_update_freq() with IRQs enabled
    clocksource: Add Kconfig entries for CMT, MTU2, TMU and STI
    sh: Remove Kconfig entries for TMU, CMT and MTU2
    ARM: shmobile: Remove CMT, TMU and STI Kconfig entries
    clocksource: armada-370-xp: Use atomic access for shared registers
    clocksource: orion: Use atomic access for shared registers
    clocksource: timer-keystone: Delete unnecessary variable
    clocksource: timer-keystone: introduce clocksource driver for Keystone
    ...

    Linus Torvalds
     

12 Mar, 2014

1 commit

  • As described by a comment at the end of cpuidle_enter_state_coupled it
    can be inefficient for coupled idle states to return with IRQs enabled
    since they may proceed to service an interrupt instead of clearing the
    coupled idle state. Until they have finished & cleared the idle state
    all CPUs coupled with them will spin rather than being able to enter a
    safe idle state.

    Commits e1689795a784 "cpuidle: Add common time keeping and irq
    enabling" and 554c06ba3ee2 "cpuidle: remove en_core_tk_irqen flag" led
    to the cpuidle_enter_state enabling interrupts for all idle states,
    including coupled ones, making this inefficiency unavoidable by drivers
    & the local_irq_enable near the end of cpuidle_enter_state_coupled
    redundant. This patch avoids enabling interrupts in cpuidle_enter_state
    after a coupled state has been entered, allowing them to remain disabled
    until all coupled CPUs have exited the idle state and
    cpuidle_enter_state_coupled re-enables them.

    Cc: Daniel Lezcano
    Signed-off-by: Paul Burton
    Signed-off-by: Rafael J. Wysocki

    Paul Burton
     

11 Mar, 2014

2 commits

  • cpuidle_idle_call() does nothing more than call the three individual
    functions, and it is no longer used by any arch-specific code, only by
    the cpuidle framework code.

    We can move this function into the idle task code to ensure better
    proximity to the scheduler code.

    Signed-off-by: Daniel Lezcano
    Acked-by: Nicolas Pitre
    Signed-off-by: Peter Zijlstra
    Cc: rjw@rjwysocki.net
    Cc: preeti@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1393832934-11625-2-git-send-email-daniel.lezcano@linaro.org
    Signed-off-by: Ingo Molnar

    Daniel Lezcano
     
  • Better integration between the cpuidle framework and the scheduler
    requires reducing the distance between these two sub-components. Moving
    part of the cpuidle code into the idle task file does that and, because
    idle.c is in the sched directory, gives that code access to the
    scheduler's private structures.

    This patch splits the cpuidle_idle_call main entry function into 3 calls
    to a newly added API:

    1. select the idle state
    2. enter the idle state
    3. reflect the idle state

    The cpuidle_idle_call calls these three functions to implement the main
    idle entry function.

    Signed-off-by: Daniel Lezcano
    Acked-by: Nicolas Pitre
    Signed-off-by: Peter Zijlstra
    Cc: rjw@rjwysocki.net
    Cc: preeti@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1393832934-11625-1-git-send-email-daniel.lezcano@linaro.org
    Signed-off-by: Ingo Molnar

    Daniel Lezcano
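
The three-step split described above can be pictured with a small user-space model. This is only a sketch under stated assumptions: the `toy_` types and functions are hypothetical stand-ins, the selection policy is deliberately simplistic, and none of it is the kernel's actual implementation.

```c
#include <stddef.h>

/* Hypothetical idle state description, ordered from shallowest to deepest. */
struct toy_state {
    unsigned int exit_latency_us;
    unsigned int target_residency_us;
};

/* Hypothetical governor data. */
struct toy_governor {
    unsigned int predicted_us; /* predicted idle duration */
    int last_state;            /* feedback recorded by the reflect step */
};

/* Step 1: select the deepest state whose target residency fits the prediction. */
int toy_select(const struct toy_governor *gov,
               const struct toy_state *states, size_t n)
{
    int best = 0;
    size_t i;

    for (i = 1; i < n; i++)
        if (states[i].target_residency_us <= gov->predicted_us)
            best = (int)i;
    return best;
}

/* Step 2: enter the state; returns the index entered, or -1 on failure. */
int toy_enter(const struct toy_state *states, size_t n, int index)
{
    (void)states; /* a real driver would program the hardware here */
    if (index < 0 || (size_t)index >= n)
        return -1;
    return index;
}

/* Step 3: reflect the result back to the governor. */
void toy_reflect(struct toy_governor *gov, int entered)
{
    gov->last_state = entered;
}

/* The main idle entry function is then just the three calls in sequence. */
int toy_idle_call(struct toy_governor *gov,
                  const struct toy_state *states, size_t n)
{
    int next = toy_select(gov, states, n);
    int entered = toy_enter(states, n, next);

    if (entered >= 0)
        toy_reflect(gov, entered);
    return entered;
}
```

Keeping the steps separate means a caller can place its own logic between selecting and entering a state, which is what makes it practical to drive them from the idle task code.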
     

07 Mar, 2014

1 commit

  • For some platforms, a poll state is inserted in the cpuidle driver states.
    The flags for the state do not indicate that timekeeping is not affected.
    As the state does not do anything apart from calling cpu_relax(), the
    times returned by ktime_get should remain valid. Add the missing flag.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     

06 Mar, 2014

5 commits

  • The menu governor performance multiplier defines a minimum predicted
    idle duration to latency ratio. Instead of checking this separately
    in every iteration of the state selection loop, adjust the overall
    latency restriction for the whole loop if this restriction is tighter
    than what is set by the QoS subsystem.

    The original test
    s->exit_latency * multiplier > data->predicted_us
    becomes
    s->exit_latency > data->predicted_us / multiplier
    by dividing both sides of the comparison by "multiplier".

    While division is likely to be several times slower than multiplication,
    the minor performance hit allows making a generic sleep state selection
    function based on (sleep duration, maximum latency) tuple.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
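
The rearrangement can be checked with a short sketch. The function and struct names below are hypothetical and the selection loop is a simplified model, not the governor's code; it assumes a nonzero multiplier and unsigned integer arithmetic, for which the two tests agree even with truncating division.

```c
#include <stddef.h>

/* Hypothetical, simplified idle state description. */
struct idle_state {
    unsigned int exit_latency;
    unsigned int target_residency;
};

/* Original form: evaluated for every state inside the selection loop. */
int rejected_by_multiply(unsigned int exit_latency, unsigned int multiplier,
                         unsigned int predicted_us)
{
    return (unsigned long long)exit_latency * multiplier > predicted_us;
}

/* Rearranged form: the right-hand side no longer depends on the state. */
int rejected_by_divide(unsigned int exit_latency, unsigned int multiplier,
                       unsigned int predicted_us)
{
    return exit_latency > predicted_us / multiplier;
}

/* Fold the ratio into a single latency limit, computed once before the loop. */
unsigned int effective_latency_limit(unsigned int qos_latency,
                                     unsigned int predicted_us,
                                     unsigned int multiplier)
{
    unsigned int ratio_limit = predicted_us / multiplier;

    return ratio_limit < qos_latency ? ratio_limit : qos_latency;
}

/* Generic selection based only on (sleep duration, maximum latency). */
int select_state(const struct idle_state *s, size_t n,
                 unsigned int duration_us, unsigned int max_latency_us)
{
    int best = 0; /* fall back to the shallowest state */
    size_t i;

    for (i = 0; i < n; i++) {
        if (s[i].target_residency > duration_us)
            continue;
        if (s[i].exit_latency > max_latency_us)
            continue;
        best = (int)i;
    }
    return best;
}
```

With the limit computed up front, `select_state()` needs no knowledge of the multiplier, which is the point of accepting one division per selection.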
     
  • The menu governor statistics update function tries to determine the
    amount of time between entry to low power state and the occurrence
    of the wakeup event. However, the time measured by the framework
    includes exit latency on top of the desired value. This exit latency
    is subtracted from the measured value to obtain the desired value.

    When the measured value is not available, the menu governor assumes
    the wakeup was caused by the timer and the time is equal to the remaining
    timer length. No exit latency should be subtracted from this value.

    This patch prevents the erroneous subtraction and clarifies the
    associated comment. It also removes one intermediate variable that
    serves no purpose.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
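
A minimal sketch of the corrected logic follows. The function name and the `measured_valid` flag are hypothetical, and leaving a measurement untouched when it does not exceed the exit latency is an assumption made for this illustration.

```c
/*
 * Estimate the time between idle entry and the wakeup event, in microseconds.
 *
 * measured_valid   nonzero if the framework provided a measured duration
 * measured_us      measured duration, which includes the exit latency
 * next_timer_us    time that remained until the next timer at idle entry
 * exit_latency_us  exit latency of the idle state
 */
unsigned int idle_duration_us(int measured_valid, unsigned int measured_us,
                              unsigned int next_timer_us,
                              unsigned int exit_latency_us)
{
    /*
     * No measurement: assume the timer woke the CPU. The timer length
     * contains no exit latency, so nothing is subtracted.
     */
    if (!measured_valid)
        return next_timer_us;

    /* Measurement available: remove the exit latency it includes. */
    if (measured_us > exit_latency_us)
        return measured_us - exit_latency_us;

    /* Assumption: keep very short measurements as they are. */
    return measured_us;
}
```

For example, a measured 500 us with a 100 us exit latency yields 400 us, while a missing measurement with 1000 us left on the timer yields the full 1000 us.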
     
  • The menu governor uses coefficients as one method of actual idle
    period length estimation. The coefficients are, as detailed below,
    multipliers giving expected idle period length from time until next
    timer expiry. The multipliers are supposed to have domain of (0..1].

    The coefficients are fractions where only the numerators are stored
    and denominators are a shared constant RESOLUTION*DECAY. Since the
    value of the coefficient should always be greater than 0 and less
    than or equal to 1, the numerator must have a value greater than
    0 and less than or equal to RESOLUTION*DECAY.

    If the coefficients are updated with measured idle durations exceeding
    timer length, the multiplier may reach values exceeding unity (i.e.
    the stored numerator exceeds RESOLUTION*DECAY). This patch ensures that
    the multipliers are updated with durations capped to timer length.

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Nicolas Pitre
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
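
The effect of the cap can be seen in a small model of the coefficient update. This is a sketch, not the governor's code: the constants 1024 and 8, the decaying-average update rule, and the name `update_factor` are assumptions chosen for illustration.

```c
#define RESOLUTION 1024 /* assumed fixed-point scale */
#define DECAY 8         /* assumed decay divisor */

/*
 * Update a stored numerator whose implied denominator is RESOLUTION * DECAY.
 * The measured duration is capped to the timer length, so the term added
 * below never exceeds RESOLUTION and the result stays within
 * (0, RESOLUTION * DECAY] whenever the input was within that range.
 */
unsigned int update_factor(unsigned int factor, unsigned int measured_us,
                           unsigned int next_timer_us)
{
    /* The cap: a timer cannot explain more idle time than its own length. */
    if (measured_us > next_timer_us)
        measured_us = next_timer_us;

    /* Decay the old value, then blend in the newly observed ratio. */
    factor -= factor / DECAY;
    if (next_timer_us > 0)
        factor += (unsigned int)((unsigned long long)RESOLUTION *
                                 measured_us / next_timer_us);
    else
        factor += RESOLUTION;

    /* Keep the numerator strictly positive so the multiplier is never 0. */
    if (factor == 0)
        factor = 1;
    return factor;
}
```

Without the cap, repeatedly feeding durations longer than the timer length would push the numerator past `RESOLUTION * DECAY`, that is, a multiplier above 1.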
     
  • Currently the menu governor records the exit latency of the state it has
    chosen for the idle period. The stored latency value is then later
    used to calculate the actual length of the idle period. This value
    may however be incorrect, as the entered state may not be the one
    chosen by the governor. The entered state information is available,
    so we can use that to obtain the real exit latency.

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     
  • The field expected_us is used to store the time remaining until next
    timer expiry. The name is inaccurate, as we really do not expect all
    wakeups to be caused by timers. In addition, another field with a very
    similar name (predicted_us) is used to store the predicted time
    remaining until any wakeup source becomes active.

    This patch renames expected_us to next_timer_us in order to better
    reflect the contained information.

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Nicolas Pitre
    Acked-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     

05 Mar, 2014

2 commits

  • Add deep idle states such as nap and fast sleep to the cpuidle state table
    only if they are discovered from the device tree during cpuidle initialization.

    Signed-off-by: Preeti U Murthy
    Signed-off-by: Benjamin Herrenschmidt

    Preeti U Murthy
     
  • Fast sleep is one of the deep idle states on Power8 in which local timers of
    CPUs stop. On PowerPC we do not have an external clock device which can
    handle wakeup of such CPUs. Now that we have the support in the tick broadcast
    framework for archs that do not sport such a device and the low level support
    for fast sleep, enable it in the cpuidle framework on PowerNV.

    Signed-off-by: Preeti U Murthy
    Signed-off-by: Benjamin Herrenschmidt

    Preeti U Murthy
     

28 Feb, 2014

1 commit


25 Feb, 2014

3 commits

  • The name __smp_call_function_single() doesn't tell much about the
    properties of this function, especially when compared to
    smp_call_function_single().

    The comments above the implementation are also misleading. The main
    point of this function is actually not to be able to embed the csd
    in an object. This is actually a requirement that results from the
    purpose of this function, which is to raise an IPI asynchronously.

    As such it can be called with interrupts disabled. And this feature
    comes at the cost of the caller who then needs to serialize the
    IPIs on this csd.

    Let's rename the function and enhance the comments so that they reflect
    these properties.

    Suggested-by: Christoph Hellwig
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Frederic Weisbecker
     
  • The main point of calling __smp_call_function_single() is to send
    an IPI in a pure asynchronous way. By embedding a csd in an object,
    a caller can send the IPI without waiting for a previous one to complete
    as is required by smp_call_function_single() for example. As such,
    sending this kind of IPI can be safe even when irqs are disabled.

    This flexibility comes at the expense of the caller who then needs to
    synchronize the csd lifecycle by himself and make sure that IPIs on a
    single csd are serialized.

    This is how __smp_call_function_single() works when wait = 0 and this
    usecase is relevant.

    Now there doesn't seem to be any use case with wait = 1 that can't be
    covered by smp_call_function_single() instead, which is safer. Let's look
    at the two possible scenarios:

    1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
    in an object. It looks like a nice and convenient pattern at first
    sight because we can then retrieve the object from the IPI handler easily.

    But actually it is a waste of memory space in the object since the csd
    can be allocated from the stack by smp_call_function_single(wait = 1)
    and the object can be passed as the IPI argument.

    Besides that, embedding the csd in an object is more error prone
    because the caller must take care of the serialization of the IPIs
    for this csd.

    2) The user calls __smp_call_function_single(wait = 1) on a csd that
    is allocated on the stack. It's ok but smp_call_function_single()
    can do it as well and it already takes care of the allocation on the
    stack. Again it's simpler and less error prone.

    Therefore, using the underscore-prefixed API version with wait = 1
    is a bad pattern and a sign that the caller can do something safer and
    simpler.

    There was a single user of that which has just been converted.
    So let's remove this option to discourage further users.

    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Frederic Weisbecker
     
  • With the move of kirkwood into mach-mvebu, drivers Kconfig need
    tweaking to allow the kirkwood specific drivers to be built.

    Signed-off-by: Andrew Lunn
    Acked-by: Arnd Bergmann
    Acked-by: Mark Brown
    Acked-by: Kishon Vijay Abraham I
    Acked-by: Daniel Lezcano
    Acked-by: Viresh Kumar
    Tested-by: Jason Gunthorpe
    Cc: Viresh Kumar
    Cc: Rafael J. Wysocki
    Cc: Richard Purdie
    Cc: Bryan Wu
    Cc: Zhang Rui
    Cc: Eduardo Valentin
    Signed-off-by: Jason Cooper

    Andrew Lunn
     

23 Feb, 2014

1 commit

  • The core idle loop now takes care of it. We need to add the runlatch
    function calls to the idle routines, which were earlier taken care of by
    the arch specific idle routine.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Preeti U Murthy
    Reviewed-by: Deepthi Dharwar
    Signed-off-by: Peter Zijlstra
    Cc: Paul Burton
    Cc: "Rafael J. Wysocki"
    Cc: Daniel Lezcano
    Cc: linux-pm@vger.kernel.org
    Cc: linaro-kernel@lists.linaro.org
    Link: http://lkml.kernel.org/n/tip-nr4mtbkkzf2oomaj85m24o7c@git.kernel.org
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Nicolas Pitre
     

12 Feb, 2014

1 commit

  • Commit d8c6ad3184ca651 ("sched/idle, PPC: Remove redundant
    cpuidle_idle_call()") reintroduced ppc64_runlatch_off/on() in the
    pseries cpuidle backend driver. Hence the cleanup caused by the
    commit "c0c4301c54adde05:pseries/cpuidle: Remove redundant call
    to ppc64_runlatch_off() in cpu idle routines" in conjunction
    with the commit d8c6ad3184ca651 causes a build failure.

    Signed-off-by: Preeti U Murthy
    Cc: Peter Zijlstra
    Cc: Nicolas Pitre
    Cc: Stephen Rothwell
    Link: http://lkml.kernel.org/r/52FAFD2D.2090306@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Preeti U Murthy
     

11 Feb, 2014

1 commit

  • The core idle loop now takes care of it. However a few things need
    checking:

    - Invocation of cpuidle_idle_call() in pseries_lpar_idle() happened
    through arch_cpu_idle() and was therefore always preceded by a call
    to ppc64_runlatch_off(). To preserve this property now that
    cpuidle_idle_call() is invoked directly from core code, a call to
    ppc64_runlatch_off() has been added to idle_loop_prolog() in
    platforms/pseries/processor_idle.c.

    - Similarly, cpuidle_idle_call() was followed by ppc64_runlatch_off()
    so a call to the latter has been added to idle_loop_epilog().

    - And since arch_cpu_idle() always made sure to re-enable IRQs if they
    were not enabled, this is now done in idle_loop_epilog() as well.

    The above was made in order to keep the execution flow close to the
    original. I don't know if that was strictly necessary. Someone well
    acquainted with the platform details might find some room for possible
    optimizations.

    Signed-off-by: Nicolas Pitre
    Reviewed-by: Preeti U Murthy
    Cc: "Rafael J. Wysocki"
    Cc: Daniel Lezcano
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-sh@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: Russell King
    Cc: linaro-kernel@lists.linaro.org
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-47o4m03citrfg9y1vxic5asb@git.kernel.org
    Signed-off-by: Ingo Molnar

    Nicolas Pitre
     

07 Feb, 2014

1 commit

  • Some archs set the CPUIDLE_FLAG_TIMER_STOP flag for idle states in which the
    local timers stop. The cpuidle_idle_call() currently handles such idle states
    by calling into the broadcast framework so as to wakeup CPUs at their next
    wakeup event. With the hrtimer mode of broadcast, the BROADCAST_ENTER call
    into the broadcast framework can fail for archs that do not have an external
    clock device to handle wakeups and the CPU in question has thus to be made
    the stand by CPU. This patch handles such cases by failing the call into
    cpuidle so that the arch can take some default action. The arch will certainly
    not enter a similar idle state because a failed cpuidle call will also implicitly
    indicate that the broadcast framework has not registered this CPU to be woken up.
    Hence we are safe if we fail the cpuidle call.

    In the process move the functions that trace idle statistics just before and
    after the entry and exit into idle states respectively. In other
    scenarios where the call to cpuidle fails, we end up not tracing idle
    entry and exit since a decision on an idle state could not be taken. Similarly
    when the call to broadcast framework fails, we skip tracing idle statistics
    because we are in no further position to take a decision on an alternative
    idle state to enter into.

    Signed-off-by: Preeti U Murthy
    Cc: deepthi@linux.vnet.ibm.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: fweisbec@gmail.com
    Cc: paulus@samba.org
    Cc: srivatsa.bhat@linux.vnet.ibm.com
    Cc: svaidy@linux.vnet.ibm.com
    Cc: peterz@infradead.org
    Cc: benh@kernel.crashing.org
    Acked-by: Rafael J. Wysocki
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: http://lkml.kernel.org/r/20140207080652.17187.66344.stgit@preeti.in.ibm.com
    Signed-off-by: Thomas Gleixner

    Preeti U Murthy
     

29 Jan, 2014

6 commits

  • The following patch ports the cpuidle framework to the powernv
    platform and also implements a cpuidle back-end powernv
    idle driver that calls the power7_nap and snooze idle states.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • smt-snooze-delay was designed to disable the NAP state or delay entry
    to the NAP state prior to the adoption of the cpuidle framework. It
    is a per-cpu variable. With the coming of the cpuidle framework,
    states can be disabled on a per-cpu basis using the cpuidle/enable
    sysfs entry.

    Also, with the coming of cpuidle driver each state's target residency
    is per-driver unlike earlier which was per-device. Therefore,
    the per-cpu sysfs smt-snooze-delay which decides the target residency
    of the idle state on a particular cpu causes more confusion to the user
    as we cannot have different smt-snooze-delay (target residency)
    values for each cpu.

    In the current code, smt-snooze-delay functionality is completely broken.
    It makes sense to remove smt-snooze-delay from idle driver with the
    coming of cpuidle framework.
    However, sysfs files are retained as ppc64_util currently
    utilises it. Once we fix ppc64_util, propose to clean
    up the kernel code.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • This patch removes the usage of the MAX_IDLE_STATE macro
    and the dead code around it. The number of states
    is determined at run time based on the cpuidle
    state table selected on a given platform.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • Currently the cpuidle-pseries back-end driver cannot be
    built as a module due to its dependencies on the cpuidle framework.
    This patch removes all the module-related code in the driver.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • This patch replaces the cpuidle driver and devices initialisation
    calls with a single generic cpuidle_register() call
    and also includes minor refactoring of the code around it.

    Remove the cpu online check in the snooze loop: this code can
    only run locally on a cpu that is online, so the check
    is not required.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
     
  • Move the file from arch specific pseries/processor_idle.c
    to drivers/cpuidle/cpuidle-pseries.c
    Make the relevant Makefile and Kconfig changes.
    Also, introduce Kconfig.powerpc in drivers/cpuidle
    for all powerpc cpuidle drivers.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt

    Deepthi Dharwar
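The run-time state count described in the MAX_IDLE_STATE commit above can be illustrated with a small user-space sketch. It is not the driver's code: `struct idle_state`, the two tables, their contents, and `select_state_table()` are all hypothetical names and values chosen for illustration. The point is only that the count is derived from whichever table is selected, instead of from a fixed macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified idle-state descriptor (not the kernel's struct). */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;
    unsigned int target_residency_us;
};

/* Illustrative tables; the names and numbers are made up. */
static const struct idle_state table_a[] = {
    { "snooze", 0, 0 },
    { "deep", 10, 100 },
};

static const struct idle_state table_b[] = {
    { "shallow", 0, 0 },
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Pick a state table and report how many entries it has.
 * The count comes from the selected table itself, so no
 * compile-time maximum is needed.
 */
const struct idle_state *select_state_table(int use_table_b, size_t *count)
{
    if (use_table_b) {
        *count = ARRAY_SIZE(table_b);
        return table_b;
    }
    *count = ARRAY_SIZE(table_a);
    return table_a;
}
```

A caller would pass the returned count along with the table when filling in its driver description, so adding a state to a table needs no other change.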
     

30 Dec, 2013

1 commit

  • Commit 60a66e370007e8535b7a561353b07b37deaf35ba changed the Calxeda
    cpuidle driver to a platform driver, copying the __init tag from the
    _init() to the newly used _probe() function. However, "probe should
    not be __init." (Rob said ;-)
    Remove the __init tag to fix a section mismatch in the Calxeda
    cpuidle driver.

    Signed-off-by: Andre Przywara
    Signed-off-by: Daniel Lezcano

    Andre Przywara
     

04 Dec, 2013

1 commit

  • If not, we could end up in the unfortunate situation where
    we dereference a NULL pointer b/c we have cpuidle disabled.

    This is the case when booting under Xen (which uses the
    ACPI P/C states but disables the CPU idle driver) - and can
    be easily reproduced when booting with cpuidle.off=1.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] cpuidle_unregister_device+0x2a/0x90
    .. snip..
    Call Trace:
    [] acpi_processor_power_exit+0x3c/0x5c
    [] acpi_processor_stop+0x61/0xb6
    [] __device_release_driver+0x...
    [] device_release_driver+0x23/0x30
    [] bus_remove_device+0x108/0x180
    [] device_del+0x129/0x1c0
    [] ? unregister_xenbus_watch+0x1f0/0x1f0
    [] device_unregister+0x1e/0x60
    [] unregister_cpu+0x39/0x60
    [] arch_unregister_cpu+0x23/0x30
    [] handle_vcpu_hotplug_event+0xc1/0xe0
    [] xenwatch_thread+0x45/0x120
    [] ? abort_exclusive_wait+0xb0/0xb0
    [] kthread+0xd2/0xf0
    [] ? kthread_create_on_node+0x180/0x180
    [] ret_from_fork+0x7c/0xb0
    [] ? kthread_create_on_node+0x180/0x180

    This problem also appears in 3.12 and could be a candidate for backport.

    Signed-off-by: Konrad Rzeszutek Wilk
    Cc: All applicable
    Signed-off-by: Rafael J. Wysocki

    Konrad Rzeszutek Wilk
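The kind of guard this commit message describes can be sketched in user space as follows. This is a model under stated assumptions, not the kernel patch itself: `struct idle_device`, `idle_unregister_device()`, and `teardown_count` are hypothetical stand-ins. The assumption is that when cpuidle is disabled no device is ever set up, so the unregister path may be handed a NULL pointer and has to return early.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-CPU idle device. */
struct idle_device {
    int registered;   /* non-zero once the device has been registered */
};

/* Counts real teardowns; exists only so the behaviour can be observed. */
int teardown_count;

/*
 * Unregister a device, tolerating the case where it was never set up.
 * The NULL test must come first: reading dev->registered through a
 * NULL pointer is exactly the dereference the commit message reports.
 */
void idle_unregister_device(struct idle_device *dev)
{
    if (!dev || !dev->registered)
        return;               /* nothing to undo */

    dev->registered = 0;
    teardown_count++;
}
```

Putting the check inside the unregister function, and not in each caller, means hotplug and driver-removal paths are covered without having to know whether cpuidle is enabled.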
     

14 Nov, 2013

1 commit

  • Pull ACPI and power management updates from Rafael J Wysocki:

    - New power capping framework and the Intel Running Average Power
    Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.

    - Addition of the in-kernel switching feature to the arm_big_little
    cpufreq driver from Viresh Kumar and Nicolas Pitre.

    - cpufreq support for iMac G5 from Aaro Koskinen.

    - Baytrail processors support for intel_pstate from Dirk Brandewie.

    - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.

    - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.

    - ACPI power management support for the I2C and SPI bus types from Mika
    Westerberg and Lv Zheng.

    - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
    Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.

    - cpufreq drivers updates (mostly fixes and cleanups) from Viresh
    Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz
    Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.

    - intel_pstate updates from Dirk Brandewie and Adrian Huang.

    - ACPICA update to version 20130927 including fixes and cleanups and
    some reduction of divergences between the ACPICA code in the kernel
    and ACPICA upstream in order to improve the automatic ACPICA patch
    generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh
    Bhat, Bjorn Helgaas, David E Box.

    - ACPI IPMI driver fixes and cleanups from Lv Zheng.

    - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang
    Yanfei, Rafael J Wysocki.

    - Conversion of the ACPI AC driver to the platform bus type and
    multiple driver fixes and cleanups related to ACPI from Zhang Rui.

    - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
    Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.

    - Fixes and cleanups and new blacklist entries related to the ACPI
    video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
    Kirill Tkhai.

    - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.

    - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
    Bartlomiej Zolnierkiewicz, Prarit Bhargava.

    - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.

    - Operation Performance Points (OPP) core updates from Nishanth Menon.

    - Runtime power management core fix from Rafael J Wysocki and update
    from Ulf Hansson.

    - Hibernation fixes from Aaron Lu and Rafael J Wysocki.

    - Device suspend/resume lockup detection mechanism from Benoit Goby.

    - Removal of unused proc directories created for various ACPI drivers
    from Lan Tianyu.

    - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
    handler from Heikki Krogerus and Jarkko Nikula.

    - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.

    - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al
    Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
    Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
    Liu Chuansheng.

    - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
    Jean-Christophe Plagniol-Villard.

    * tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits)
    cpufreq: conservative: fix requested_freq reduction issue
    ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines
    PM / runtime: Use pm_runtime_put_sync() in __device_release_driver()
    ACPI / event: remove unneeded NULL pointer check
    Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1"
    ACPI / video: Quirk initial backlight level 0
    ACPI / video: Fix initial level validity test
    intel_pstate: skip the driver if ACPI has power mgmt option
    PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
    ACPI / hotplug: Do not execute "insert in progress" _OST
    ACPI / hotplug: Carry out PCI root eject directly
    ACPI / hotplug: Merge device hot-removal routines
    ACPI / hotplug: Make acpi_bus_hot_remove_device() internal
    ACPI / hotplug: Simplify device ejection routines
    ACPI / hotplug: Fix handle_root_bridge_removal()
    ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug
    ACPI / scan: Start matching drivers after trying scan handlers
    ACPI: Remove acpi_pci_slot_init() headers from internal.h
    ACPI / blacklist: fix name of ThinkPad Edge E530
    PowerCap: Fix build error with option -Werror=format-security
    ...

    Conflicts:
    arch/arm/mach-omap2/opp.c
    drivers/Kconfig
    drivers/spi/spi.c

    Linus Torvalds
     

30 Oct, 2013

4 commits

  • cpuidle_unregister_governor() and cpuidle_replace_governor() aren't
    used anymore and can be removed. They were used by cpuidle governors
    earlier, but since the governors can't be compiled as modules any
    more, these two functions aren't necessary.

    Suggested-by: Daniel Lezcano
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • poll_idle_init() just initializes drv->states[0] and so that is
    required to be done only once for each driver. Currently, it is
    called from cpuidle_enable_device() which is called for every CPU
    that the driver supports. That is not required, so move it to a
    better place and call it from __cpuidle_register_driver() so that
    the initialization is carried out only once.

    Acked-by: Daniel Lezcano
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • Instances of "struct cpuidle_driver *" are consistently named as "drv"
    in the cpuidle core except in show_current_driver().

    Make that function use variable naming consistent with the rest of the
    code.

    [rjw: Changelog]
    Acked-by: Daniel Lezcano
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • There are a few cpuidle_get_driver() calls that aren't made under
    cpuidle_driver_lock, which is incorrect.

    Fix them by calling cpuidle_get_driver() after taking cpuidle_driver_lock.

    Acked-by: Daniel Lezcano
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
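The locking rule in the last commit above (look up the driver only while holding the driver lock) can be sketched in user space. This is an illustration under assumptions: a pthread mutex stands in for the kernel lock, and `struct idle_driver`, `driver_lock`, `get_driver_locked()`, `set_driver()`, and `current_driver_name()` are hypothetical names, not the cpuidle API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified driver description. */
struct idle_driver {
    const char *name;
};

/* User-space stand-in for the kernel's driver lock. */
static pthread_mutex_t driver_lock = PTHREAD_MUTEX_INITIALIZER;
static struct idle_driver *current_driver;   /* protected by driver_lock */

/* Caller must hold driver_lock. */
struct idle_driver *get_driver_locked(void)
{
    return current_driver;
}

/* Install (or clear, with NULL) the current driver under the lock. */
void set_driver(struct idle_driver *drv)
{
    pthread_mutex_lock(&driver_lock);
    current_driver = drv;
    pthread_mutex_unlock(&driver_lock);
}

/*
 * Copy the current driver's name into buf.
 * The lookup and the use of the result both happen inside the
 * critical section, so the pointer cannot change in between.
 * Returns 0 on success, -1 if no driver is set or buf has no room.
 */
int current_driver_name(char *buf, size_t len)
{
    struct idle_driver *drv;
    int ret = -1;

    pthread_mutex_lock(&driver_lock);
    drv = get_driver_locked();
    if (drv && len > 0) {
        snprintf(buf, len, "%s", drv->name);
        ret = 0;
    }
    pthread_mutex_unlock(&driver_lock);
    return ret;
}
```

Taking the lock before the lookup, not after it, is the design point: fetching the pointer first and locking afterwards would leave a window in which the driver could be replaced.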