18 Aug, 2012

1 commit


27 Jul, 2012

1 commit

  • Pull ACPI & power management update from Len Brown:
    "Re-write of the turbostat tool.
    Lower overhead was necessary for measuring very large systems when
    they are very idle.

    IVB support in intel_idle
    It's what I run on my IVB, others should be able to also:-)

    ACPICA core update
    We have found some bugs due to divergence between Linux and the
    upstream ACPICA base. Most of these patches are to reduce that
    divergence to reduce the risk of future bugs.

    Some cpuidle updates, mostly for non-Intel
    More will be coming, as they depend on this part.

    Some thermal management changes needed by non-ACPI systems.

    Some _OST (OS Status Indication) updates for ACPI hot-plug."

    * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits)
    Thermal: Documentation update
    Thermal: Add Hysteresis attributes
    Thermal: Make Thermal trip points writeable
    ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check
    tools/power: turbostat: fix large c1% issue
    tools/power: turbostat v2 - re-write for efficiency
    ACPICA: Update to version 20120711
    ACPICA: AcpiSrc: Fix some translation issues for Linux conversion
    ACPICA: Update header files copyrights to 2012
    ACPICA: Add new ACPI table load/unload external interfaces
    ACPICA: Split file: tbxface.c -> tbxfload.c
    ACPICA: Add PCC address space to space ID decode function
    ACPICA: Fix some comment fields
    ACPICA: Table manager: deploy new firmware error/warning interfaces
    ACPICA: Add new interfaces for BIOS(firmware) errors and warnings
    ACPICA: Split exception code utilities to a new file, utexcep.c
    ACPI: acpi_pad: tune round_robin_time
    ACPICA: Update to version 20120620
    ACPICA: Add support for implicit notify on multiple devices
    ACPICA: Update comments; no functional change
    ...

    Linus Torvalds
     

06 Jul, 2012

1 commit

  • When the system is booted with some cpus offline, the idle
    driver is not initialized. When a cpu is set online, the
    acpi code calls the intel_idle init function. Unfortunately,
    this introduces a dependency between intel_idle and acpi.

    This patch removes that dependency by using the
    notifier of intel_idle. It has the added benefit of
    encapsulating the intel_idle driver and removing some exported
    functions.

    Signed-off-by: Daniel Lezcano
    Acked-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
     

06 Jun, 2012

1 commit


06 Apr, 2012

1 commit

  • Many users of debugfs copy the implementation of default_open() when
    they want to support a custom read/write function op. This leads to a
    proliferation of the default_open() implementation across the entire
    tree.

    Now that the common implementation has been consolidated into libfs we
    can replace all the users of this function with simple_open().

    This replacement was done with the following semantic patch:

    @ open @
    identifier open_f != simple_open;
    identifier i, f;
    @@
    -int open_f(struct inode *i, struct file *f)
    -{
    (
    -if (i->i_private)
    -f->private_data = i->i_private;
    |
    -f->private_data = i->i_private;
    )
    -return 0;
    -}

    @ has_open depends on open @
    identifier fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ...
    -.open = open_f,
    +.open = simple_open,
    ...
    };

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Stephen Boyd
    Cc: Greg Kroah-Hartman
    Cc: Al Viro
    Cc: Julia Lawall
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
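
The helper that the semantic patch above substitutes can be modeled in user space roughly as follows. This is a minimal sketch: `struct inode` and `struct file` here are simplified stand-ins containing only the two fields the patch touches, and `simple_open_sketch` is a hypothetical name chosen to avoid implying this is the kernel's own source.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields referenced by the semantic patch are modeled). */
struct inode { void *i_private; };
struct file  { void *private_data; };

/* Same shape as the open functions the semantic patch deletes:
 * hand the inode's private pointer to the file, then succeed. */
static int simple_open_sketch(struct inode *inode, struct file *file)
{
    if (inode->i_private)
        file->private_data = inode->i_private;
    return 0;
}
```

A custom read or write op can then pick up its context from `file->private_data` without each driver carrying its own copy of the open function.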
     

22 Mar, 2012

1 commit


16 Feb, 2012

1 commit

  • Commit b66b8b9a4a79087dde1b358a016e5c8739ccf186 ('intel-idle: convert
    to x86_cpu_id auto probing') added a distinction between Nehalem and
    Westmere processors and changed auto_demotion_disable_flags for the
    former to 0. This was not explained in the commit message, so change
    it back.

    Signed-off-by: Ben Hutchings
    Acked-by: Thomas Renninger
    Acked-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Ben Hutchings
     

14 Feb, 2012

1 commit

  • Commit b66b8b9a4a79087dde1b358a016e5c8739ccf186 ('intel-idle: convert
    to x86_cpu_id auto probing') put two entries for model 0x2f
    (Westmere-EX Xeon) in the device ID table and left out model 0x2e
    (Nehalem-EX Xeon).

    Signed-off-by: Ben Hutchings
    Acked-by: Thomas Renninger
    Acked-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Ben Hutchings
     

03 Feb, 2012

1 commit


27 Jan, 2012

1 commit

  • With this it should be automatically loaded on suitable systems by
    udev.

    The old switch () is replaced with a table based approach, this
    also cleans up the code.

    Cc: Len Brown
    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Renninger
    Acked-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
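
The table-based approach mentioned above might look roughly like this in plain C. It is a sketch only: `struct cpu_match`, `match_table` and `lookup_model` are hypothetical names, not the kernel's x86_cpu_id machinery, and the two entries reuse the model numbers quoted in the 14 Feb, 2012 fix listed earlier in this log.

```c
#include <stddef.h>

/* Hypothetical table entry: one row per supported CPU model. */
struct cpu_match {
    unsigned int model;
    const char *label;
};

/* Model numbers as quoted in the 0x2e/0x2f fix; the table is terminated
 * by a sentinel row with a NULL label. */
static const struct cpu_match match_table[] = {
    { 0x2e, "Nehalem-EX Xeon" },
    { 0x2f, "Westmere-EX Xeon" },
    { 0, NULL }
};

/* Walk the table instead of branching in a switch statement.
 * Returns NULL when the model is not listed. */
static const char *lookup_model(unsigned int model)
{
    for (const struct cpu_match *m = match_table; m->label != NULL; m++) {
        if (m->model == model)
            return m->label;
    }
    return NULL;
}
```

With a table, a duplicated or missing entry such as the 0x2e/0x2f mix-up is a one-line data fix rather than a control-flow change.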
     

20 Jan, 2012

1 commit


18 Jan, 2012

5 commits

  • Len Brown
     
  • Function split up, should have no functional change.

    Provides entry point for physically hotplugged CPUs
    to initialize and activate cpuidle.

    Signed-off-by: Thomas Renninger
    CC: Deepthi Dharwar
    CC: Shaohua Li
    CC: Andrew Morton
    Signed-off-by: Len Brown

    Thomas Renninger
     
  • kvm -cpu host passes the original cpuid info to the guest.

    The latest kvm versions seem to return true for the mwait_leaf cpuid
    function on recent Intel CPUs, but they do not return the mwait
    C-states (mwait_substates); zero is returned instead.

    While real CPUs seem to always return non-zero values, the intel_idle
    driver should not become active in the kvm (mwait_substates == 0)
    case and should bail out.
    Otherwise a NULL pointer dereference will happen later when the
    cpuidle subsystem tries to become active:
    [0.984807] BUG: unable to handle kernel NULL pointer dereference at (null)
    [0.984807] IP: [] (null)
    ...
    [0.984807][] ? cpuidle_idle_call+0xb4/0x340
    [0.984807][] ? __atomic_notifier_call_chain+0x4c/0x70
    [0.984807][] ? cpu_idle+0x78/0xd0

    Reference:
    https://bugzilla.novell.com/show_bug.cgi?id=726296

    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Renninger
    CC: Bruno Friedmann
    Signed-off-by: Len Brown

    Thomas Renninger
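
The bail-out described above can be sketched as a small probe check. This is an illustration under assumptions: `probe_mwait` and its parameters are hypothetical, and the real driver reads these values from CPUID rather than taking them as arguments.

```c
#include <errno.h>

/* Hypothetical probe helper: refuse to activate the driver unless the
 * MWAIT leaf is present AND it reports at least one sub-state. */
static int probe_mwait(int has_mwait_leaf, unsigned int mwait_substates)
{
    if (!has_mwait_leaf)
        return -ENODEV;

    /* The kvm case from the commit message: leaf present, but the
     * sub-state mask is zero, so there is nothing usable to register. */
    if (mwait_substates == 0)
        return -ENODEV;

    return 0;
}
```

Failing early at probe time keeps the cpuidle core from ever seeing a driver with no enterable states.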
     
  • Fix the following warning:

    drivers/idle/intel_idle.c: In function 'intel_idle_cpuidle_devices_init':
    drivers/idle/intel_idle.c:518:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

    By making get_driver_data() return a long instead of an int.

    Signed-off-by: David Howells
    Signed-off-by: Len Brown

    David Howells
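
As a rough illustration of why the return type matters: on the usual Linux data models a `long` has the same width as a pointer, so casting it to `void *` does not trigger -Wint-to-pointer-cast, whereas a 32-bit `int` does on 64-bit builds. The names and hint values below are hypothetical and are not taken from the driver.

```c
/* Hypothetical per-state hint values, for illustration only. */
static const unsigned long hints[] = { 0x00, 0x10, 0x20 };

/* Returning long (not int) keeps the value pointer-sized on LP64 and
 * ILP32 Linux, so the cast below is between same-sized types. */
static long get_hint(int idx)
{
    return (long)hints[idx];
}

/* Stash the hint in an opaque pointer-sized cookie. */
static void *hint_as_cookie(int idx)
{
    return (void *)get_hint(idx);
}
```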
     
  • irq disabling happens earlier, in process_32.c:cpu_idle. Basically,
    when cpuidle_state->enter is called, cpu irqs are already disabled, and
    cpuidle_state->enter turns irqs back on when exiting.

    intel_idle doesn't follow this assumption. Although this doesn't cause a
    real issue, it misleads developers. Remove the call to local_irq_disable()
    at entry.

    [akpm@linux-foundation.org: add comment]
    Signed-off-by: Yanmin Zhang
    Signed-off-by: Andrew Morton
    Signed-off-by: Len Brown

    Yanmin Zhang
     

17 Jan, 2012

1 commit

  • smp_call_function() only makes all *other* CPUs execute a specific function,
    while intel_idle expects every CPU, including the calling one, to run it.
    Without the fix, we could have one cpu which has auto_demotion enabled or
    has no broadcast timer set up. Usually we don't see any impact, because auto
    demotion just harms power and the intel_idle init is called on CPU 0, where
    the broadcast timer delivers interrupts, but this could still be a problem.

    Cc: stable@vger.kernel.org
    Signed-off-by: Shaohua Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Len Brown

    Shaohua Li
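
The difference described above can be modeled in user space with a toy "CPU" array. This is only a simulation: `model_call_others` and `model_call_each` are hypothetical stand-ins for the two calling conventions (all CPUs except the caller versus all CPUs including the caller), and no real cross-CPU calls are made.

```c
#define NCPUS 4

static int configured[NCPUS];

/* The per-CPU setup work that every CPU is supposed to run. */
static void setup_cpu(int cpu)
{
    configured[cpu] = 1;
}

/* Models "run on all other CPUs": the calling CPU is skipped. */
static void model_call_others(int self, void (*fn)(int))
{
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (cpu != self)
            fn(cpu);
    }
}

/* Models "run on every CPU", the calling CPU included. */
static void model_call_each(void (*fn)(int))
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        fn(cpu);
}

static void reset_cpus(void)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        configured[cpu] = 0;
}

static int count_configured(void)
{
    int n = 0;
    for (int cpu = 0; cpu < NCPUS; cpu++)
        n += configured[cpu];
    return n;
}
```

In the first model the caller is left unconfigured, which corresponds to the "one cpu which has auto_demotion enabled" case in the commit message.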
     

08 Nov, 2011

1 commit

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
    cpuidle: Single/Global registration of idle states
    cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
    cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
    cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
    ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
    ACPI: Export FADT pm_profile integer value to userspace
    thermal: Prevent polling from happening during system suspend
    ACPI: Drop ACPI_NO_HARDWARE_INIT
    ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
    PNPACPI: Simplify disabled resource registration
    ACPI: Fix possible recursive locking in hwregs.c
    ACPI: use kstrdup()
    mrst pmu: update comment
    tools/power turbostat: less verbose debugging

    Linus Torvalds
     

07 Nov, 2011

3 commits

  • This patch makes the cpuidle_states structure global (single copy)
    instead of per-cpu. The statistics needed on a per-cpu basis
    by the governor are kept per-cpu. This simplifies the cpuidle
    subsystem, as state registration is done by a single cpu only.
    Having a single copy of cpuidle_states saves memory. The rare case
    of asymmetric C-states can be handled within the cpuidle driver,
    and architectures such as POWER do not have asymmetric C-states.

    With single/global registration of all the idle states,
    dynamic C-state transitions on x86 are handled by
    the boot cpu. Here, the boot cpu disables all the devices,
    re-populates the states and later enables all the devices,
    irrespective of the cpu that receives the notification first.

    Reference:
    https://lkml.org/lkml/2011/4/25/83

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • This is the first step towards global registration of cpuidle
    states. The statistics used primarily by the governor are per-cpu
    and have to be split from the rest of the fields inside cpuidle_state,
    which will be made global, i.e. a single copy. The driver_data field
    is also per-cpu and is moved as well.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • The cpuidle governor only suggests the state to enter using the
    governor->select() interface, but allows the low level driver to
    override the recommended state. The state actually entered
    may be different because of software or hardware demotion. Software
    demotion is done by the back-end cpuidle driver and can be accounted
    correctly. The current cpuidle code uses the last_state field to capture
    the state actually entered and updates the statistics for that state
    based on it.

    Ideally the driver enter routine should update the counters,
    and it should return the state actually entered rather than the time
    spent there. The generic cpuidle code should simply handle where
    the counters live in the sysfs namespace, not update the counters.

    Reference:
    https://lkml.org/lkml/2011/3/25/52

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
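
The idea of accounting by the state actually entered, rather than the state requested, can be sketched as follows. Everything here is hypothetical: `demo_enter` stands in for a driver enter routine that applies software demotion, and `idle_once` for the caller that books usage against the returned index.

```c
#define NSTATES 3

/* Per-state usage counter (assumption: a single CPU is modeled). */
struct state_usage {
    unsigned long usage;
};

static struct state_usage stats[NSTATES];

/* Hypothetical enter routine: demotes any request deeper than state 1
 * and returns the index of the state that was really entered. */
static int demo_enter(int requested)
{
    return requested > 1 ? 1 : requested;
}

/* Account against the returned index, not the requested one. */
static int idle_once(int requested)
{
    int entered = demo_enter(requested);

    if (entered >= 0 && entered < NSTATES)
        stats[entered].usage++;
    return entered;
}
```

Because the routine reports an index instead of a residency time, the statistics stay correct even when the request is demoted.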
     

01 Nov, 2011

1 commit


01 Mar, 2011

1 commit

  • Userspace apps might have to cut off parts of the
    idle state name for display reasons.
    Switch NHM-C1 to C1-NHM (and others) so that a cut-off
    name is unique and makes sense to the user.

    Signed-off-by: Thomas Renninger
    CC: lenb@kernel.org
    Signed-off-by: Len Brown

    Thomas Renninger
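
A small sketch of why the ordering matters when a tool truncates names. `truncate_name` is a hypothetical helper, and the "NHM-C3"/"C3-NHM" pair is an assumed example of the "(and others)" mentioned above.

```c
#include <string.h>

/* Copy at most `keep` characters of `name` into `out`, always
 * NUL-terminating. `outsz` must be greater than zero. */
static void truncate_name(const char *name, size_t keep,
                          char *out, size_t outsz)
{
    size_t n = strlen(name);

    if (n > keep)
        n = keep;
    if (n >= outsz)
        n = outsz - 1;
    memcpy(out, name, n);
    out[n] = '\0';
}
```

Truncating "NHM-C1" and "NHM-C3" to three characters yields "NHM" for both, whereas truncating "C1-NHM" and "C3-NHM" to two characters yields the distinct names "C1" and "C3".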
     

18 Feb, 2011

2 commits

  • Just as we had to disable auto-demotion for NHM/WSM,
    we need to do the same for Atom (Lincroft version).

    In particular, auto-demotion will prevent Lincroft
    from entering the S0i3 idle power saving state.

    https://bugzilla.kernel.org/show_bug.cgi?id=25252

    Signed-off-by: Len Brown

    Len Brown
     
  • Hardware C-state auto-demotion is a mechanism where the HW overrides
    the OS C-state request, instead demoting to a shallower state,
    which is less expensive, but saves less power.

    Modern Linux should generally get exactly the states it requests.
    In particular, when a CPU is taken off-line, it must not be demoted, else
    it can prevent the entire package from reaching deep C-states.

    https://bugzilla.kernel.org/show_bug.cgi?id=25252

    Signed-off-by: Len Brown

    Len Brown
     

25 Jan, 2011

1 commit

  • Fix a shutdown regression caused by 2a2d31c8dc6f ("intel_idle: open
    broadcast clock event"). The clockevent framework can automatically
    shut down broadcast timers for hot-removed CPUs, and we get a shutdown
    regression when we also shut down the broadcast timer for a hot-removed
    CPU ourselves, so just delete that code.

    Also fix some section mismatch.

    Reported-by: Ari Savolainen
    Signed-off-by: Shaohua Li
    Tested-by: Linus Torvalds
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

13 Jan, 2011

7 commits

  • Len Brown
     
  • Len Brown
     
  • … from the cpuidle layer

    Currently the intel_idle and acpi_idle drivers show double cpu_idle "exit idle"
    events -> this patch fixes it and makes throwing cpu_idle events less complex.

    It also introduces cpu_idle events for all architectures which use
    the cpuidle subsystem, namely:
    - arch/arm/mach-at91/cpuidle.c
    - arch/arm/mach-davinci/cpuidle.c
    - arch/arm/mach-kirkwood/cpuidle.c
    - arch/arm/mach-omap2/cpuidle34xx.c
    - drivers/acpi/processor_idle.c (for all cases, not only mwait)
    - arch/x86/kernel/process.c (did throw events before, but was a mess)
    - drivers/idle/intel_idle.c (did throw events before)

    Convention should be:
    Fire cpu_idle events inside the current pm_idle function (not somewhere
    down the callee tree) to keep things easy.

    Current possible pm_idle functions in X86:
    c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
    -> this is really easy now.

    This affects userspace:
    The type field of the cpu_idle power event can now directly get
    mapped to:
    /sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
    instead of throwing very CPU/mwait specific values.
    This change is not visible for the intel_idle driver.
    For the acpi_idle driver it should only be visible if the vendor
    misses out C-states in his BIOS.
    Another (perf timechart) patch reads out cpuidle info of cpu_idle
    events from:
    /sys/.../cpuidle/stateX/*, then the cpuidle events are mapped
    to the correct C-/cpuidle state again, even if e.g. vendors miss
    out C-states in their BIOS and for example only export C1 and C3.
    -> everything is fine.

    Signed-off-by: Thomas Renninger <trenn@suse.de>
    CC: Robert Schoene <robert.schoene@tu-dresden.de>
    CC: Jean Pihet <j-pihet@ti.com>
    CC: Arjan van de Ven <arjan@linux.intel.com>
    CC: Ingo Molnar <mingo@elte.hu>
    CC: Frederic Weisbecker <fweisbec@gmail.com>
    CC: linux-pm@lists.linux-foundation.org
    CC: linux-acpi@vger.kernel.org
    CC: linux-kernel@vger.kernel.org
    CC: linux-perf-users@vger.kernel.org
    CC: linux-omap@vger.kernel.org
    Signed-off-by: Len Brown <len.brown@intel.com>

    Thomas Renninger
     
  • The intel_idle driver uses CLOCK_EVT_NOTIFY_BROADCAST_ENTER and
    CLOCK_EVT_NOTIFY_BROADCAST_EXIT
    for broadcast clock events. _ENTER/_EXIT alone don't actually open the broadcast
    clock events; see processor_idle.c for an example. In some situations this
    causes a boot hang, because some CPUs enter idle but the local APIC timer stalls.

    Reported-and-tested-by: Yan Zheng
    Signed-off-by: Shaohua Li
    cc: stable@kernel.org
    Signed-off-by: Len Brown

    Shaohua Li
     
  • Signed-off-by: Len Brown

    Len Brown
     
  • Having four variables for the same thing:
    idle_halt, idle_nomwait, force_mwait and boot_option_idle_overrides
    is rather confusing and unnecessarily complex.

    If an idle= boot param is passed, only set up one variable:
    boot_option_idle_overrides

    Introduces the following functional changes/fixes:
    - The intel_idle driver does not register if any idle=xy
      boot param is passed.
    - processor_idle.c will also not register a cpuidle driver
      and become active if idle=halt is passed.
      Before, a cpuidle driver with one (C1, halt) state got registered.
      Now the default_idle function is used, which ends up using
      the same idle call to enter the sleep state (safe_halt()), but
      without registering a whole cpuidle driver.

    That means an idle= param will always prevent cpuidle drivers from
    registering, with one exception (same behavior as before):
    idle=nomwait
    may still register the acpi_idle cpuidle driver, but C1 will not use
    mwait, but hlt. This can be a workaround for IO based deeper sleep
    states where C1 mwait causes problems.

    Signed-off-by: Thomas Renninger
    cc: x86@kernel.org
    Signed-off-by: Len Brown

    Thomas Renninger
     
  • Signed-off-by: Len Brown

    Len Brown
     

04 Jan, 2011

2 commits

  • Add these new power trace events:

    power:cpu_idle
    power:cpu_frequency
    power:machine_suspend

    The old C-state/idle accounting events:
    power:power_start
    power:power_end

    now have a replacement (but we are still keeping the old
    tracepoints for compatibility):

    power:cpu_idle

    and
    power:power_frequency

    is replaced with:
    power:cpu_frequency

    power:machine_suspend is newly introduced.

    Jean Pihet has a patch integrated into the generic layer
    (kernel/power/suspend.c) which will make use of it.

    The type= field was removed from both; it was never
    used, and the type is distinguished by the event type itself.

    perf timechart userspace tool gets adjusted in a separate patch.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Ingo Molnar
    Acked-by: Arjan van de Ven
    Acked-by: Jean Pihet
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: rjw@sisk.pl
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Thomas Renninger
     
  • power_frequency moved to drivers/cpufreq/cpufreq.c, which has
    to be compiled in, so there is no need to export it.

    intel_idle can be a module though...

    Signed-off-by: Thomas Renninger
    Signed-off-by: Ingo Molnar
    Acked-by: Jean Pihet
    Cc: Jean Pihet
    Cc: Arjan van de Ven
    Cc: rjw@sisk.pl
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Thomas Renninger
     

02 Dec, 2010

1 commit


27 Oct, 2010

2 commits


23 Oct, 2010

2 commits