Eric Lee / smarc-fsl-linux-kernel

10 Aug, 2010

1 commit

71abbbf85 cpuidle: extend cpuidle and menu governor to handle dynamic states

On some SoC chips, HW resources may be in use during any particular idle
period. As a consequence, the cpuidle states that the SoC is safe to
enter can change from idle period to idle period. In addition, the
latency and threshold of each cpuidle state can vary, depending on the
operating condition when the CPU becomes idle, e.g. the current cpu
frequency, the current state of the HW blocks, etc.

cpuidle core and the menu governor, in the current form, are geared
towards cpuidle states that are static, i.e. the availability of the
states, their latencies, and their thresholds do not change during run
time. cpuidle does not provide any hook that cpuidle drivers can use to
adjust those values on the fly for the current idle period before the menu
governor selects the target cpuidle state.

This patch extends cpuidle core and the menu governor to handle states
that are dynamic. There are three additions in the patch and the patch
maintains backwards-compatibility with existing cpuidle drivers.

1) add prepare() to struct cpuidle_device. A cpuidle driver can hook
into the callback and cpuidle will call prepare() before calling the
governor's select function. The callback gives the cpuidle driver a
chance to update the dynamic information of the cpuidle states for the
current idle period, e.g. state availability, latencies, thresholds,
power values, etc.

2) add CPUIDLE_FLAG_IGNORE as one of the state flags. In the prepare()
function, a cpuidle driver can set/clear the flag to indicate to the
menu governor whether a cpuidle state should be ignored, i.e. not
available, during the current idle period.

3) add power_specified bit to struct cpuidle_device. The menu governor
currently assumes that the cpuidle states are arranged in the order of
increasing latency, threshold, and power savings. This is true or can
be made true for static states. Once the state parameters are dynamic,
the latencies, thresholds, and power savings for the cpuidle states can
increase or decrease by different amounts from idle period to idle
period. So the assumption of increasing latency, threshold, and power
savings from Cn to C(n+1) can no longer be guaranteed.

It can be straightforward to calculate the power consumption of each
available state and to specify it in power_usage for the idle period.
Using the power_usage fields, the menu governor then selects the state
that has the lowest power consumption and that still satisfies all other
criteria. The power_specified bit defaults to 0. For existing cpuidle
drivers, cpuidle detects that power_specified is 0 and fills in a dummy
set of power_usage values.
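The selection rule described above can be sketched as follows. This is a minimal userspace illustration, not the kernel code: the sketch_* types, SKETCH_FLAG_IGNORE (standing in for CPUIDLE_FLAG_IGNORE), and the two limits passed to the selector are assumptions made for the example.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_FLAG_IGNORE 0x1u /* hypothetical stand-in for CPUIDLE_FLAG_IGNORE */

struct sketch_state {
    unsigned int flags;
    unsigned int exit_latency_us;     /* latency of leaving the state */
    unsigned int target_residency_us; /* the "threshold" */
    unsigned int power_usage;         /* lower means less power drawn */
};

struct sketch_device {
    struct sketch_state *states;
    size_t state_count;
    /* Optional driver hook, run before every selection. */
    void (*prepare)(struct sketch_device *dev);
};

/* Return the index of the lowest-power usable state, or -1 if none fits. */
int sketch_select_state(struct sketch_device *dev,
                        unsigned int predicted_idle_us,
                        unsigned int latency_limit_us)
{
    unsigned int best_power = UINT_MAX;
    int best = -1;
    size_t i;

    /* Let the driver refresh availability, latencies and power values. */
    if (dev->prepare)
        dev->prepare(dev);

    for (i = 0; i < dev->state_count; i++) {
        const struct sketch_state *s = &dev->states[i];

        if (s->flags & SKETCH_FLAG_IGNORE)
            continue; /* not available in this idle period */
        if (s->target_residency_us > predicted_idle_us)
            continue; /* idle period too short to pay off */
        if (s->exit_latency_us > latency_limit_us)
            continue; /* would violate the latency limit */
        if (s->power_usage < best_power) {
            best_power = s->power_usage;
            best = (int)i;
        }
    }
    return best;
}
```

Because the loop compares power_usage rather than relying on array order, the states no longer need to be sorted by increasing latency and power savings.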

Signed-off-by: Ai Li
Cc: Len Brown
Acked-by: Arjan van de Ven
Cc: Ingo Molnar
Cc: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ai Li
2010-08-10 11:45:04 +0800

04 Aug, 2010

1 commit

6f4f2723d [CPUFREQ] x86 cpufreq: Make trace_power_frequency cpufreq driver independent

and fix the broken case if a core's frequency depends on others.

trace_power_frequency was implemented only in the acpi-cpufreq driver's
target() function, which is not generic.
-> Move the call to trace_power_frequency to
cpufreq.c:cpufreq_notify_transition() where the CPUFREQ_POSTCHANGE
notifier is triggered.
This supports power frequency tracing by all cpufreq drivers.

trace_power_frequency did not trace frequency changes correctly when
the userspace governor was used or when CPU cores' frequencies depend
on each other.
-> Moving this into the CPUFREQ_POSTCHANGE notifier and passing the cpu
which gets switched automatically fixes this.

Robert Schoene provided some important fixes on top of my initial
quick shot version which are integrated in this patch:
- Forgot some changes in power_end trace (TP_printk/variable names)
- Variable dummy in power_end must now be cpu_id
- Use static 64 bit variable instead of unsigned int for cpu_id

Signed-off-by: Thomas Renninger
CC: davej@redhat.com
CC: arjan@infradead.org
CC: linux-kernel@vger.kernel.org
CC: robert.schoene@tu-dresden.de
Tested-by: robert.schoene@tu-dresden.de
Signed-off-by: Dave Jones

Thomas Renninger
2010-08-04 01:47:05 +0800

01 Jul, 2010

1 commit

8c215bd38 sched: Cure nr_iowait_cpu() users

Commit 0224cf4c5e (sched: Intoduce get_cpu_iowait_time_us())
broke things by not making sure preemption was indeed disabled
by the callers of nr_iowait_cpu() which took the iowait value of
the current cpu.

This resulted in a heap of preempt warnings. Cure this by making
nr_iowait_cpu() take a cpu number and fix up the callers to pass
in the right number.

Signed-off-by: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sergey Senozhatsky
Cc: Rafael J. Wysocki
Cc: Maxim Levitsky
Cc: Len Brown
Cc: Pavel Machek
Cc: Jiri Slaby
Cc: linux-pm@lists.linux-foundation.org
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-07-01 15:39:48 +0800

29 May, 2010

1 commit

e4f2e5eaa Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6

* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: native hardware cpuidle driver for latest Intel processors
ACPI: acpi_idle: touch TS_POLLING only in the non-MWAIT case
acpi_pad: uses MONITOR/MWAIT, so it doesn't need to clear TS_POLLING
sched: clarify commment for TS_POLLING
ACPI: allow a native cpuidle driver to displace ACPI
cpuidle: make cpuidle_curr_driver static
cpuidle: add cpuidle_unregister_driver() error check
cpuidle: fail to register if !CONFIG_CPU_IDLE

Linus Torvalds
2010-05-29 07:14:17 +0800

28 May, 2010

2 commits

752138df0 cpuidle: make cpuidle_curr_driver static

cpuidle_register_driver() sets cpuidle_curr_driver
cpuidle_unregister_driver() clears cpuidle_curr_driver

We shouldn't expose cpuidle_curr_driver to
potential modification except via these interfaces.
So make it static and create cpuidle_get_driver() to observe it.
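The encapsulation described above can be sketched like this. It is a simplified userspace illustration with hypothetical sketch_* names and return codes, not the kernel implementation; the guard in the unregister path mirrors the error check described for cpuidle_unregister_driver() elsewhere in this log.

```c
#include <stddef.h>

struct sketch_driver {
    const char *name;
};

/* File-local: other translation units cannot modify it directly. */
static struct sketch_driver *sketch_curr_driver;

/* Returns 0 on success, -1 for a NULL driver, -2 if a driver is already set
 * (hypothetical codes chosen for this sketch). */
int sketch_register_driver(struct sketch_driver *drv)
{
    if (drv == NULL)
        return -1;
    if (sketch_curr_driver != NULL)
        return -2;
    sketch_curr_driver = drv;
    return 0;
}

/* Read-only view of the current driver for everyone else. */
struct sketch_driver *sketch_get_driver(void)
{
    return sketch_curr_driver;
}

void sketch_unregister_driver(struct sketch_driver *drv)
{
    /* Do not clobber a driver that somebody else registered. */
    if (drv != sketch_curr_driver)
        return;
    sketch_curr_driver = NULL;
}
```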

Signed-off-by: Len Brown

Len Brown
2010-05-28 09:06:58 +0800
c0d64cb03 cpuidle: add cpuidle_unregister_driver() error check

Assure that cpuidle_unregister_driver() will not clobber
the registered driver if unregistered by somebody else.

Signed-off-by: Len Brown

Len Brown
2010-05-28 01:04:04 +0800

25 May, 2010

1 commit

1f85f87d4 cpuidle: add a repeating pattern detector to the menu governor

Currently, the menu governor uses the (corrected) next timer as key item
for predicting the idle duration.

It turns out that there are specific cases where this breaks down: There
are cases where we have a very repetitive pattern of idle durations, where
the idle period is pretty much the same, for reasons completely unrelated
to the next timer event. Examples of such repeating patterns are network
loads with irq mitigation, the mouse moving but in theory also the wifi
beacons.

This patch adds a relatively simple detector for such repeating patterns,
where the standard deviation of the last 8 idle periods is compared to a
threshold.

With this extra predictor in place, measurements show that the DECAY
factor can now be increased (the decaying average will now decay slower)
to get an even more stable result.
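A detector of the kind described above might look roughly like the sketch below. The function name, the caller-supplied limit, and the choice to compare the variance with the squared limit (equivalent to comparing the standard deviation with the limit, but without a square root) are assumptions for illustration; the actual threshold used by the patch is not stated here.

```c
#include <stdint.h>

#define SKETCH_INTERVALS 8 /* the last 8 idle periods */

/*
 * Return 1 and store the average in *predicted_us when the recorded idle
 * periods are close enough to each other, 0 otherwise.
 * Assumes idle periods stay below roughly 2^30 us so that the sum of
 * squared deviations fits in 64 bits.
 */
int sketch_detect_repeating(const uint32_t intervals_us[SKETCH_INTERVALS],
                            uint32_t stddev_limit_us,
                            uint32_t *predicted_us)
{
    uint64_t sum = 0, variance = 0, avg;
    int i;

    for (i = 0; i < SKETCH_INTERVALS; i++)
        sum += intervals_us[i];
    avg = sum / SKETCH_INTERVALS;

    for (i = 0; i < SKETCH_INTERVALS; i++) {
        uint64_t v = intervals_us[i];
        uint64_t d = v > avg ? v - avg : avg - v;

        variance += d * d;
    }
    variance /= SKETCH_INTERVALS;

    /* stddev <= limit  <=>  variance <= limit^2 */
    if (variance > (uint64_t)stddev_limit_us * stddev_limit_us)
        return 0;

    *predicted_us = (uint32_t)avg;
    return 1;
}
```

When the function reports a pattern, a governor could use the average as its idle-duration estimate instead of the next timer event.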

[arjan@infradead.org: fix bug identified by Frank]
Signed-off-by: Arjan van de Ven
Cc: Corrado Zoccolo
Cc: Frank Rowand
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2010-05-25 23:07:02 +0800

11 May, 2010

1 commit

ed77134bf PM QOS update

This patch changes the string-based list management to a handle-based
implementation to help with the hot path use of pm-qos. It also renames
much of the API to use "request" as opposed to "requirement" that was
used in the initial implementation. I did this because "request" more
accurately represents what it actually does.

Also, I added a string based ABI for users wanting to use a string
interface. So if the user writes 0xDDDDDDDD formatted hex it will be
accepted by the interface. (someone asked me for it and I don't think
it hurts anything.)

This patch updates some documentation input I got from Randy.

Signed-off-by: markgross
Signed-off-by: Rafael J. Wysocki

Mark Gross
2010-05-11 05:08:19 +0800

10 May, 2010

1 commit

1c6fe0364 cpuidle: Fix incorrect optimization

commit 672917dcc78 ("cpuidle: menu governor: reduce latency on exit")
added an optimization, where the analysis on the past idle period moved
from the end of idle, to the beginning of the new idle.

Unfortunately, this optimization had a bug where it zeroed one key
variable for new use, that is needed for the analysis. The fix is
simple, zero the variable after doing the work from the previous idle.

During the audit of the code that found this issue, another issue was
also found; the ->measured_us data structure member is never set, a
local variable is always used instead.

Signed-off-by: Arjan van de Ven
Cc: Corrado Zoccolo
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Arjan van de Ven
2010-05-10 09:35:36 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

08 Mar, 2010

2 commits

52cf25d0a Driver core: Constify struct sysfs_ops in struct kobj_type

Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
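As a small illustration of the pattern, the sketch below declares an ops table as const and points a second structure at it. The sketch_* types and the show callback are hypothetical userspace stand-ins, not the kernel's struct sysfs_ops or struct kobj_type.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ops table: a set of function pointers shared by many users. */
struct sketch_ops {
    int (*show)(char *buf, size_t len);
};

int sketch_show_version(char *buf, size_t len)
{
    return snprintf(buf, len, "1\n");
}

/* const: the table can live in read-only data and cannot be reassigned. */
static const struct sketch_ops sketch_version_ops = {
    .show = sketch_show_version,
};

/* The referencing structure only holds a pointer to const ops. */
struct sketch_ktype {
    const struct sketch_ops *ops;
};

const struct sketch_ktype sketch_version_ktype = {
    .ops = &sketch_version_ops,
};
```

With these declarations, an assignment such as sketch_version_ops.show = NULL is rejected at compile time, which is the accidental-modification protection listed above.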

Signed-off-by: Emese Revfy
Acked-by: David Teigland
Acked-by: Matt Domsch
Acked-by: Maciej Sosnowski
Acked-by: Hans J. Koch
Acked-by: Pekka Enberg
Acked-by: Jens Axboe
Acked-by: Stephen Hemminger
Signed-off-by: Greg Kroah-Hartman

Emese Revfy
2010-03-08 09:04:49 +0800
c9be0a36f sysdev: Pass attribute in sysdev_class attributes show/store

Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.

Also drivers can extend the attributes with own data fields
and use that in the low level function.

Similar to sysdev_attributes and normal attributes.

This is a tree-wide sweep, converting everything in one go.

No functional changes in this patch other than passing the new
argument everywhere.

Tested on x86, the non x86 parts are uncompiled.

Signed-off-by: Andi Kleen
Signed-off-by: Greg Kroah-Hartman

Andi Kleen
2010-03-08 09:04:47 +0800

07 Mar, 2010

1 commit

56e6943b4 cpuidle menu: remove 8 bytes of padding on 64 bit builds

Reorder struct menu_device to remove 8 bytes of padding on 64 bit builds.
Size drops from 136 to 128 bytes, so possibly needing one fewer cache
lines.
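The effect of member ordering on padding can be seen in a much smaller example. The two structs below are hypothetical and are not struct menu_device; on a typical LP64 ABI the first is expected to occupy 24 bytes and the second 16, but the exact sizes depend on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout with holes: padding after `a` (to align `b`) and after `c`. */
struct sketch_before {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};

/* Same members, widest first, so the two 32-bit fields sit side by side. */
struct sketch_after {
    uint64_t b;
    uint32_t a;
    uint32_t c;
};

/* Bytes saved by the reordering on the ABI this is compiled for. */
size_t sketch_bytes_saved(void)
{
    return sizeof(struct sketch_before) - sizeof(struct sketch_after);
}
```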

Signed-off-by: Richard Kennedy
Cc: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Kennedy
2010-03-07 03:26:28 +0800

12 Jan, 2010

1 commit

5787536ed drivers/cpuidle/governors/menu.c: fix undefined reference to `__udivdi3'

menu: use proper 64 bit math

The new menu governor is incorrectly doing a 64 bit divide. Compile
tested only

Signed-off-by: Stephen Hemminger
Cc: Arjan van de Ven
Cc: Len Brown
Cc: Venkatesh Pallipadi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stephen Hemminger
2010-01-12 01:34:07 +0800

16 Dec, 2009

1 commit

faa7b7ddc drivers/cpuidle: Move dereference after NULL test

It does not seem possible that ldev can be NULL, so drop the unnecessary
test. If ldev can somehow be NULL, then the initialization of last_idx
should be moved below the test.
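The alternative mentioned above, moving the initialization below the test, would have roughly the following shape. The struct and function names are hypothetical, chosen only to echo ldev and last_idx from the description.

```c
#include <stddef.h>

struct sketch_ladder {
    int last_state_idx;
};

/*
 * Problematic shape (shown only as a comment):
 *
 *     int last_idx = ldev->last_state_idx;   dereference happens first
 *     if (ldev == NULL)                      test comes too late
 *         return 0;
 */
int sketch_last_index(const struct sketch_ladder *ldev)
{
    int last_idx;

    if (ldev == NULL)
        return 0;

    /* Safe: the pointer has been checked before it is dereferenced. */
    last_idx = ldev->last_state_idx;
    return last_idx;
}
```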

A simplified version of the semantic match that detects this problem is as
follows (http://coccinelle.lip6.fr/):

// <smpl>
@match exists@
expression x, E;
identifier fld;
@@

* x->fld
  ... when != \(x = E\|&x\)
* x == NULL
// </smpl>

Signed-off-by: Julia Lawall
Acked-by: Arjan van de Ven
Cc: Ingo Molnar
Cc: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Julia Lawall
2009-12-16 00:53:25 +0800

09 Nov, 2009

1 commit

21ae2956c tree-wide: fix typos "aquire" -> "acquire", "cumsumed" -> "consumed"

This patch was generated by

git grep -E -i -l '[Aa]quire' | xargs -r perl -p -i -e 's/([Aa])quire/$1cquire/'

and the cumsumed was found by checking the diff for aquire.

Signed-off-by: Uwe Kleine-König
Signed-off-by: Jiri Kosina

Uwe Kleine-König
2009-11-09 16:40:57 +0800

29 Oct, 2009

1 commit

246eb7f0e cpuidle: always return with interrupts enabled

In the case where cpuidle_idle_call() returns before changing state due to
a need_resched(), it was returning with IRQs disabled.

The idle path assumes that the platform specific idle code returns with
interrupts enabled (although this too is undocumented AFAICT) and on ARM
we have a WARN_ON(irqs_disabled()) when returning from the idle loop, so
the user-visible effects were only a warning since interrupts were
eventually re-enabled later.

On x86, this same problem exists, but there is no WARN_ON() to detect it.
As on ARM, the interrupts are eventually re-enabled, so I'm not sure of
any actual bugs triggered by this. It's primarily a
correctness/consistency fix.

This patch ensures IRQs are (re)enabled before returning.

Reported-by: Hemanth V
Signed-off-by: Kevin Hilman
Cc: Arjan van de Ven
Cc: Len Brown
Cc: Venkatesh Pallipadi
Cc: Ingo Molnar
Cc: "Rafael J. Wysocki"
Tested-by: Martin Michlmayr
Cc: [2.6.31.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kevin Hilman
2009-10-29 22:39:31 +0800

22 Sep, 2009

2 commits

672917dcc cpuidle: menu governor: reduce latency on exit

Move the state residency accounting and statistics computation off the hot
exit path.

On exit, the need to recompute statistics is recorded, and new statistics
will be computed when menu_select is called again.

The expected effect is to reduce processor wakeup latency from sleep
(C-states). We are speaking of few hundreds of cycles reduction out of a
several microseconds latency (determined by the hardware transition), so
it is difficult to measure.
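The deferral described above can be sketched as a flag that the exit path sets and the next selection consumes. The struct fields and function names here are hypothetical and the "statistics" are reduced to a running average; only the record-now, compute-later structure is the point.

```c
#include <stdint.h>

struct sketch_menu {
    int needs_update;            /* set on exit, consumed on next select */
    uint32_t last_residency_us;  /* raw measurement from the last idle */
    uint64_t total_residency_us; /* accumulated statistics */
    uint32_t samples;
};

/* Hot exit path: record the measurement and return immediately. */
void sketch_menu_reflect(struct sketch_menu *m, uint32_t residency_us)
{
    m->last_residency_us = residency_us;
    m->needs_update = 1;
}

/* Entry path: fold in the pending measurement, then predict. */
uint32_t sketch_menu_select(struct sketch_menu *m)
{
    if (m->needs_update) {
        m->total_residency_us += m->last_residency_us;
        m->samples++;
        /* Clear the flag only after the pending work is done. */
        m->needs_update = 0;
    }
    return m->samples ? (uint32_t)(m->total_residency_us / m->samples) : 0;
}
```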

Signed-off-by: Corrado Zoccolo
Cc: Venkatesh Pallipadi
Cc: Len Brown
Cc: Adam Belay
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Corrado Zoccolo
2009-09-22 22:17:45 +0800
69d25870f cpuidle: fix the menu governor to boost IO performance

Fix the menu idle governor which balances power savings, energy efficiency
and performance impact.

The reason for a reworked governor is that there have been serious
performance issues reported with the existing code on Nehalem server
systems.

To show this I'm sure Andrew wants to see benchmark results:
(benchmark is "fio", "no cstates" is using "idle=poll")

            no cstates   current linux   new algorithm
1 disk      107 Mb/s     85 Mb/s         105 Mb/s
2 disks     215 Mb/s     123 Mb/s        209 Mb/s
12 disks    590 Mb/s     320 Mb/s        585 Mb/s

In various power benchmark measurements, no degradation was found by our
measurement&diagnostics team. Obviously a small percentage more power was
used in the "fio" benchmark, due to the much higher performance.

While it would be a novel idea to describe the new algorithm in this
commit message, I cheaped out and described it in comments in the code
instead.

[changes since first post: spelling fixes from akpm, review feedback,
folded menu-tng into menu.c]

Signed-off-by: Arjan van de Ven
Cc: Venkatesh Pallipadi
Cc: Len Brown
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Yanmin Zhang
Acked-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2009-09-22 22:17:45 +0800

20 Sep, 2009

1 commit

288f023e7 tracing, x86, cpuidle: Move the end point of a C state in the power tracer

The "end of a C state" trace point currently happens before
the code runs that corrects the TSC for having stopped during idle.

The result of this is that the timestamp of the end-of-C-state event
is garbage on cpus where the TSC stops during idle.

This patch moves the end point of the C state to after the timekeeping
engine of the kernel has been corrected.

Signed-off-by: Arjan van de Ven
Cc: Len Brown
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar

Arjan van de Ven
2009-09-20 00:57:52 +0800

31 Dec, 2008

1 commit

816bb611e cpuidle: Add decaying history logic to menu idle predictor

Add decaying history of predicted idle time, instead of using the last early
wakeup. This logic helps the menu governor do a better job of predicting idle time.

With this change, we also measured noticeable (~8%) power savings on
a DP server system with CPUs supporting deep C states, when the system
was lightly loaded. There was no change to power or perf under other load
conditions.
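For readers unfamiliar with the idea, a generic exponentially decaying average is sketched below. It is not the formula used by this patch, which is not given here; SKETCH_DECAY and the function name are assumptions for illustration.

```c
#include <stdint.h>

#define SKETCH_DECAY 4u /* hypothetical weight: each new sample counts 1/4 */

/*
 * Blend a new sample into the running average. Older samples fade
 * geometrically instead of being dropped all at once, so one unusual
 * wakeup does not dominate the next prediction.
 */
uint32_t sketch_decay_update(uint32_t avg_us, uint32_t sample_us)
{
    uint64_t blended = (uint64_t)avg_us * (SKETCH_DECAY - 1) + sample_us;

    return (uint32_t)(blended / SKETCH_DECAY);
}
```

A larger SKETCH_DECAY makes the average react more slowly and therefore more stably.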

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Pallipadi, Venkatesh
2008-12-31 07:48:01 +0800

10 Nov, 2008

1 commit

9a6558371 regression: disable timer peek-ahead for 2.6.28

It's showing up as regressions; disabling it very likely just papers
over an underlying issue, but time is running out for 2.6.28, let's get
back to this for 2.6.29

Fixes: #11826 and #11893

Signed-off-by: Arjan van de Ven
Signed-off-by: Linus Torvalds

Arjan van de Ven
2008-11-10 08:28:42 +0800

24 Oct, 2008

1 commit

1f6d6e8eb Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
hrtimers: add missing docbook comments to struct hrtimer
hrtimers: simplify hrtimer_peek_ahead_timers()
hrtimers: fix docbook comments
DECLARE_PER_CPU needs linux/percpu.h
hrtimers: fix typo
rangetimers: fix the bug reported by Ingo for real
rangetimer: fix BUG_ON reported by Ingo
rangetimer: fix x86 build failure for the !HRTIMERS case
select: fix alpha OSF wrapper
select: fix alpha OSF wrapper
hrtimer: peek at the timer queue just before going idle
hrtimer: make the futex() system call use the per process slack value
hrtimer: make the nanosleep() syscall use the per process slack
hrtimer: fix signed/unsigned bug in slack estimator
hrtimer: show the timer ranges in /proc/timer_list
hrtimer: incorporate feedback from Peter Zijlstra
hrtimer: add a hrtimer_start_range() function
hrtimer: another build fix
hrtimer: fix build bug found by Ingo
hrtimer: make select() and poll() use the hrtimer range feature
...

Linus Torvalds
2008-10-24 01:53:02 +0800

17 Oct, 2008

2 commits

89cedfefc cpuidle: upon BIOS bug, default to default_idle rather than polling

http://bugzilla.kernel.org/show_bug.cgi?id=11345

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Venkatesh Pallipadi
2008-10-17 07:00:08 +0800
887e301aa cpuidle: use last_state which can reflect the actual state entered

cpuidle accounts the idle time for the C-state it was trying to enter and
not to the actual state that the driver eventually entered. The driver may
select a different state than the one chosen by cpuidle due to
constraints like bus-mastering, etc.

Change the time accounting code to look at the dev->last_state after
returning from target_state->enter(). Driver can modify dev->last_state
internally, inside the enter routine to reflect the actual C-state
entered.
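The accounting change can be sketched as follows. This is a userspace illustration with hypothetical sketch_* types; the bus-master flag and the fixed residency values returned by the enter routines are invented so that the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_cdev;

struct sketch_cstate {
    uint64_t time_us; /* accumulated residency */
    uint64_t usage;   /* number of times entered */
    uint32_t (*enter)(struct sketch_cdev *dev, struct sketch_cstate *state);
};

struct sketch_cdev {
    struct sketch_cstate *last_state;
};

int sketch_bus_master_active; /* hypothetical constraint on the deep state */
extern struct sketch_cstate sketch_states[2];

uint32_t sketch_enter_shallow(struct sketch_cdev *dev, struct sketch_cstate *state)
{
    (void)dev;
    (void)state;
    return 10; /* pretend we idled for 10 us */
}

uint32_t sketch_enter_deep(struct sketch_cdev *dev, struct sketch_cstate *state)
{
    (void)state;
    if (sketch_bus_master_active) {
        /* Fall back and tell the core which state was really entered. */
        dev->last_state = &sketch_states[0];
        return sketch_enter_shallow(dev, dev->last_state);
    }
    return 500; /* pretend we idled for 500 us */
}

struct sketch_cstate sketch_states[2] = {
    { 0, 0, sketch_enter_shallow },
    { 0, 0, sketch_enter_deep },
};

void sketch_idle_call(struct sketch_cdev *dev, struct sketch_cstate *target)
{
    uint32_t residency_us;

    dev->last_state = target;
    residency_us = target->enter(dev, target);

    /* Account to the state actually entered, not the one requested. */
    dev->last_state->time_us += residency_us;
    dev->last_state->usage++;
}
```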

Signed-off-by: Venkatesh Pallipadi
Tested-by: Kevin Hilman
Signed-off-by: Len Brown

Venkatesh Pallipadi
2008-10-17 05:59:44 +0800

11 Sep, 2008

1 commit

2e94d1f71 hrtimer: peek at the timer queue just before going idle

As part of going idle, we already look at the time of the next timer event to determine
which C-state to select etc.

This patch adds functionality that causes the timers that are past their
soft expire time, to fire at this time, before we calculate the next wakeup
time. This functionality will thus avoid wakeups by running timers before
going idle rather than specially waking up for it.

Signed-off-by: Arjan van de Ven

Arjan van de Ven
2008-09-11 22:17:49 +0800

16 Aug, 2008

3 commits

06d9e908b cpuidle: Make ladder governor honor latency requirements fully

ladder governor only honored latency requirement when promoting C-states.
Instead, it should check for latency requirement on each idle call,
and demote to appropriate C-state when there is a latency requirement change.
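The per-call check might be sketched as below. The names are hypothetical, and the sketch assumes states are ordered from shallow to deep by increasing exit latency, with state 0 always acceptable.

```c
struct sketch_lstate {
    unsigned int exit_latency_us;
};

/*
 * Run on every idle call: step down the ladder until the current state
 * satisfies the present latency requirement.
 */
int sketch_ladder_clamp(const struct sketch_lstate *states,
                        int current_idx,
                        unsigned int latency_req_us)
{
    while (current_idx > 0 &&
           states[current_idx].exit_latency_us > latency_req_us)
        current_idx--;

    return current_idx;
}
```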

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andi Kleen

venkatesh.pallipadi@intel.com
2008-08-16 03:25:35 +0800
320eee776 cpuidle: Menu governor fix wrong usage of measured_us

There is a bug in the menu governor where we have

    if (data->elapsed_us < data->elapsed_us + measured_us)

with measured_us already having elapsed_us added in the tickless case here:

    unsigned int measured_us =
            cpuidle_get_last_residency(dev) + data->elapsed_us;

Also, it should be last_residency, not measured_us, that is used in the
comparison that distinguishes between expected & non-expected events.

Refactor menu_reflect() to fix these two problems.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Wei Gang
Signed-off-by: Andi Kleen

venkatesh.pallipadi@intel.com
2008-08-16 03:25:25 +0800
a2bd92023 cpuidle: Do not use poll_idle unless user asks for it

poll_idle was added to CPUIDLE, just as a low latency idle handler, to be
used in cases when user desires CPUs not to enter any idle state at all. It
was supposed to be a run time idle=poll option to the user. But, it was indeed
getting used during normal menu and ladder governor default case, with no
special user setting (Reported by Linus Torvalds).

Change below ensures that poll_idle will not be used unless user explicitly
asks pm_qos infrastructure for zero latency requirement.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andi Kleen

venkatesh.pallipadi@intel.com
2008-08-16 03:25:25 +0800

13 Aug, 2008

1 commit

66198f36a cpuidle: make sysfs attributes sysdev class attributes

These attributes are really sysdev class attributes. The incorrect
definition leads to an oops because of recent changes which make sysdev
attributes use a different prototype.

Based on Andi's f718cd4add5aea9d379faff92f162571e356cc5f ("sched: make
scheduler sysfs attributes sysdev class devices")

Reported-by: Eric Sesterhenn
Signed-off-by: Rabin Vincent
Acked-by: Andi Kleen
Cc: "Li, Shaohua"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rabin Vincent
2008-08-13 07:07:28 +0800

28 Jul, 2008

1 commit

b032bf70d ACPI/CPUIDLE: prevent setting pm_idle to NULL

pm_idle_save resp. pm_idle_old can be NULL when the restore code in
acpi_processor_cst_has_changed() resp. cpuidle_uninstall_idle_handler()
is called. This can set pm_idle unconditionally to NULL, which causes the
kernel to panic when calling pm_idle in the x86 idle code. This was
covered by an extra check for !pm_idle in the x86 idle code, which was
removed during the x86 idle code refactoring.

Instead of restoring the pm_idle check in the x86 code, prevent the
acpi/cpuidle code from setting pm_idle to NULL.

Reported by: Dhaval Giani http://lkml.org/lkml/2008/7/2/309
Based on a debug patch from Ingo Molnar

Signed-off-by: Thomas Gleixner
Signed-off-by: Linus Torvalds

Thomas Gleixner
2008-07-28 23:31:58 +0800

22 Jul, 2008

1 commit

4a0b2b4db sysdev: Pass the attribute to the low level sysdev show/store function

This allows dynamically generating attributes and sharing show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.
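A sketch of the pattern this enables: one shared show function recovers per-attribute data from the attribute pointer it is handed. All names here are hypothetical userspace stand-ins (including the container_of-style macro), not the sysdev API.

```c
#include <stddef.h>
#include <stdio.h>

struct sketch_attr {
    const char *name;
    /* The attribute itself is passed to the low level function. */
    int (*show)(struct sketch_attr *attr, char *buf, size_t len);
};

/* An attribute extended with its own data field. */
struct sketch_int_attr {
    struct sketch_attr attr;
    int *value;
};

#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One function serves every integer attribute. */
int sketch_show_int(struct sketch_attr *attr, char *buf, size_t len)
{
    struct sketch_int_attr *ia =
        sketch_container_of(attr, struct sketch_int_attr, attr);

    return snprintf(buf, len, "%d\n", *ia->value);
}

int sketch_bank0 = 7;
int sketch_bank1 = 42;

/* Two attributes, no per-attribute show function needed. */
struct sketch_int_attr sketch_bank_attrs[2] = {
    { { "bank0", sketch_show_int }, &sketch_bank0 },
    { { "bank1", sketch_show_int }, &sketch_bank1 },
};
```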

I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.

I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.

Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32

Signed-off-by: Andi Kleen
Signed-off-by: Greg Kroah-Hartman

Andi Kleen
2008-07-22 12:55:02 +0800

26 Jun, 2008

1 commit

8691e5a8f smp_call_function: get rid of the unused nonatomic/retry argument

It's never used and the comments refer to nonatomic and retry
interchangeably. So get rid of it.

Acked-by: Jeremy Fitzhardinge
Signed-off-by: Jens Axboe

Jens Axboe
2008-06-26 17:24:35 +0800

12 Jun, 2008

1 commit

dcb84f335 cpuidle acpi driver: fix oops on AC<->DC

There is a cpuidle and acpi driver interaction bug in the way cpuidle_register_driver()
is called. Due to this bug, there will be an oops on
AC<->DC on some systems, where they support C-states in one DC and not in AC.

The current code does
ON BOOT:
Look at CST and other C-state info to see whether more than C1 is
supported. If it is, then acpi processor_idle does a
cpuidle_register_driver() call, which internally enables the device.

ON CST change notification (AC<->DC) and on suspend-resume:
acpi driver temporarily disables device, updates the device with
any new C-states, and reenables the device.

The problem is that on boot, there are no C2, C3 states supported and we skip
the register. Later on AC<->DC, we may get a CST notification and we try
to reevaluate CST and enable the device, without actually registering it.
This causes breakage as we try to create /sys fs sub directory, without the
parent directory which is created at register time.

Thanks to Sanjeev for reporting the problem here.
http://bugzilla.kernel.org/show_bug.cgi?id=10394

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Venkatesh Pallipadi
2008-06-12 07:13:45 +0800

26 Mar, 2008

2 commits

8e92b6605 cpuidle: fix 100% C0 statistics regression

commit 9b12e18cdc1553de62d931e73443c806347cd974
'ACPI: cpuidle: Support C1 idle time accounting'
was implicated in a 100% C0 idle regression.
http://bugzilla.kernel.org/show_bug.cgi?id=10076

It pointed out a potential problem where the menu governor
may get confused by the C-state residency time from poll
idle or C1 idle, where this timing info is not accurate.
This inaccuracy is due to interrupts being handled
before we account for C-state exit.

Do not mark TIME_VALID for C0 poll state.
Mark C1 time as valid only with the MWAIT (CSTATE_FFH) entry method.

This makes governors use the timing information only when it is correct and
eliminates any wrong policy decisions that may result from invalid timing
information.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Venki Pallipadi
2008-03-26 12:58:19 +0800
8b78cf602 cpuidle: fix cpuidle time and usage overflow

cpuidle C-state sysfs node time and usage are very easy to overflow because
they are all of unsigned int type. Time will overflow within about two hours;
usage will take longer to overflow, but both increase forever.

This patch will convert them to unsigned long long.
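The arithmetic behind the overflow can be sketched as follows, assuming a 32-bit unsigned int and a counter that accumulates idle time in microseconds (the unit is an assumption; it is not stated here). The function names are hypothetical.

```c
#include <stdint.h>

/* Seconds of accumulated idle time before a 32-bit microsecond counter wraps. */
uint64_t sketch_wrap_seconds_32(void)
{
    return ((uint64_t)UINT32_MAX + 1) / 1000000u;
}

/* 32-bit accumulation: silently wraps around to small values. */
uint32_t sketch_add32(uint32_t total, uint32_t delta)
{
    return total + delta;
}

/* 64-bit accumulation: keeps counting past the 32-bit limit. */
uint64_t sketch_add64(uint64_t total, uint32_t delta)
{
    return total + delta;
}
```

Under these assumptions the 32-bit counter wraps after roughly 4294 seconds of idle time in a state, a little over 71 minutes, which a mostly idle CPU reaches within the window mentioned above.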

Signed-off-by: Yi Yang
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Yi Yang
2008-03-26 12:45:26 +0800

14 Feb, 2008

1 commit

4fcb2fcd4 ACPI, cpuidle: Clarify C-state description in sysfs

Add a new sysfs entry, desc, under cpuidle states. It can be used by the driver to
communicate to userspace any specific information about the state.
This helps in identifying the exact hardware C-states behind the ACPI C-state
definition.

Idea is to export this through powertop, which will help to map the C-state
reported by powertop to actual hardware C-state.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Venkatesh Pallipadi
2008-02-14 13:09:55 +0800

09 Feb, 2008

1 commit

a6869cc4c cpuidle: build fix for non-x86

The last posted version of this patch gave compile error
on IA64. So, here goes yet another rewrite of the patch.

Convert cpu_idle_wait() to cpuidle_kick_cpus() which is
SMP-only, and gives error on non supported CPU.

Changes from last patch sent by Kevin:
Moved the definition of kick_cpus back to cpuidle.c from cpuidle.h:
* Having it in .h gives #error on archs which includes the header file without
actually having CPU_IDLE configured. To make it work in .h, we need one more
#ifdef around that code which makes it messy.
* Also, the function is only called from one file. So, it can be declared
statically in .c rather than making it available to everyone who includes
the .h file.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Kevin Hilman
Signed-off-by: Len Brown

Venki Pallipadi
2008-02-09 16:33:40 +0800

07 Feb, 2008

2 commits

9b7131542 Revert "cpuidle: build fix for non-x86"

This reverts commit f757397097d0713c949af76dccabb65a2785782e,
which ironically broke the ia64 build.

Len Brown
2008-02-07 17:16:34 +0800
acf63867a Merge branches 'release', 'cpuidle-2.6.25' and 'idle' into release

Len Brown
2008-02-07 16:11:05 +0800