31 Jul, 2008
1 commit
-
Found an interactivity problem on a quad core test-system - simple
CPU loops would occasionally delay the system in an unacceptable way.

After much debugging with Peter Zijlstra it turned out that the problem
is caused by the string of sched_clock() changes - they caused the CPU
clock to jump backwards a bit - which confuses the scheduler arithmetic
(which is unsigned for performance reasons).

So revert:
# c300ba2: sched_clock: and multiplier for TSC to gtod drift
# c0c8773: sched_clock: only update deltas with local reads.
# af52a90: sched_clock: stop maximum check on NO HZ
# f7cce27: sched_clock: widen the max and min time

This solves the interactivity problems.

Signed-off-by: Ingo Molnar
Acked-by: Peter Zijlstra
Acked-by: Mike Galbraith
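
To illustrate why a small backwards step in the clock is so damaging to unsigned arithmetic, here is a minimal sketch. The helper names `clock_delta()` and `clock_delta_monotonic()` are hypothetical and are not taken from the kernel; the guard in the second helper is only one possible defence and is not what this commit does (the commit reverts the offending changes instead).

```c
#include <stdint.h>

/* Hypothetical helper: elapsed time computed the way unsigned
 * scheduler arithmetic would compute it. If "now" is smaller than
 * "prev" (the clock stepped backwards), the result wraps modulo 2^64
 * and looks like an enormous amount of elapsed time. */
uint64_t clock_delta(uint64_t now, uint64_t prev)
{
    return now - prev;
}

/* Hypothetical guard (an assumption, not the commit's fix): treat a
 * backwards step as "no time elapsed". */
uint64_t clock_delta_monotonic(uint64_t now, uint64_t prev)
{
    return now >= prev ? now - prev : 0;
}
```

With `now = 1000` and `prev = 1005`, `clock_delta()` returns `UINT64_MAX - 4` rather than a small negative number, which is the kind of value that can starve or delay tasks.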
25 Jul, 2008
1 commit
-
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
nohz: prevent tick stop outside of the idle loop
19 Jul, 2008
2 commits
-
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:

    scheduler switch to idle task
    enable interrupts

Window starts here

    ----> interrupt happens (does not set NEED_RESCHED)
          irq_exit() stops the tick

    ----> interrupt happens (does set NEED_RESCHED)

    return from schedule()

    cpu_idle(): preempt_disable();

Window ends here

The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.

The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremely
hard to reproduce and analyse. Kudos to Jack and Eric.

Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.

cpu_idle()
{
    preempt_disable();

    while(1) {
        tick_nohz_stop_sched_tick(1); ,
Debugged-by: eric miao
Signed-off-by: Thomas Gleixner
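
The idea of limiting tick stopping to the idle loop can be sketched as a tiny state machine. This is a simplified model under stated assumptions: `struct tick_state`, `stop_tick()` and `restart_tick()` are hypothetical stand-ins, and the assumption that the interrupt-exit path passes 0 while the idle loop passes 1 is inferred from the `tick_nohz_stop_sched_tick(1)` call shown in the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-CPU tick state. */
struct tick_state {
    bool inidle;        /* set only once the idle loop has been entered */
    bool tick_stopped;  /* true while the periodic tick is disabled */
};

/* inidle = true:  called from the idle loop itself.
 * inidle = false: called from interrupt exit (assumption). */
void stop_tick(struct tick_state *ts, bool inidle)
{
    if (inidle)
        ts->inidle = true;
    else if (!ts->inidle)
        return;             /* irq exit outside the idle loop: leave the tick alone */

    ts->tick_stopped = true;
}

/* Called when the idle loop is left: re-enable the tick and forget
 * that we were idle. */
void restart_tick(struct tick_state *ts)
{
    ts->inidle = false;
    ts->tick_stopped = false;
}
```

In this model an interrupt that arrives inside the race window, before the idle loop has announced itself, can no longer stop the tick, so the scheduler cannot switch away from idle with the tick left disabled.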
16 Jul, 2008
2 commits
-
Conflicts:
kernel/softlockup.c
Signed-off-by: Ingo Molnar
-
* 'timers/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: add PCI ID for 6300ESB force hpet
x86: add another PCI ID for ICH6 force-hpet
kernel-paramaters: document pmtmr= command line option
acpi_pm clccksource: fix printk format warning
nohz: don't stop idle tick if softirqs are pending.
pmtmr: allow command line override of ioport
nohz: reduce jiffies polling overhead
hrtimer: Remove unused variables in ktime_divns()
hrtimer: remove warning in hres_timers_resume
posix-timers: print RT watchdog message
11 Jul, 2008
2 commits
-
Working with ftrace I would get large jumps of 11 millisecs or more with
the clock tracer. This killed the latency timings of ftrace and also
caused the irqoff self tests to fail.

What was happening is that with NO_HZ the idle would stop the jiffy counter,
and before the jiffy counter was updated the sched_clock would have a bad
delta jiffies to combine with the gtod when computing the maximum.

The jiffies would stop and the last sched_tick would record the last gtod.
On wakeup, the sched clock update would take the gtod + delta jiffies
(which would be zero) and compare it to the TSC. The TSC would have
correctly (with a stable TSC) moved forward several jiffies. But because the
jiffies had not been updated yet the clock would be prevented from moving
forward because it would appear that the TSC jumped too far ahead.

The clock would then virtually stop, until the jiffies are updated. Then
the next sched clock update would see that the clock was very much behind
since the delta jiffies is now correct. This would then jump the clock
forward by several jiffies.

This caused ftrace to report several milliseconds of interrupts off
latency at every resume from NO_HZ idle.

This patch adds hooks into the nohz code to disable the checking of the
maximum clock update when nohz is in effect. It resumes the max check
when nohz has updated the jiffies again.

Signed-off-by: Steven Rostedt
Cc: Steven Rostedt
Cc: Peter Zijlstra
Cc: Andrew Morton
Signed-off-by: Ingo Molnar -
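
The stall-then-jump behaviour described above can be reproduced with a small clamping model. This is a sketch under assumptions: the function name `clamp_clock()`, the 1 ms tick length, and the exact width of the allowed window are illustrative and not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICK_NS 1000000ULL  /* assumed tick length of 1 ms, for illustration */

/* Clamp a raw (TSC-based) reading into the window derived from the gtod
 * value recorded at the last tick. Passing check_max = false models the
 * patch: the upper bound is skipped while nohz has stopped jiffies. */
uint64_t clamp_clock(uint64_t raw_ns, uint64_t tick_gtod_ns,
                     uint64_t delta_jiffies, bool check_max)
{
    uint64_t min_ns = tick_gtod_ns;
    uint64_t max_ns = tick_gtod_ns + (delta_jiffies + 1) * TICK_NS;

    if (raw_ns < min_ns)
        return min_ns;
    if (check_max && raw_ns > max_ns)
        return max_ns;      /* clock appears to stall at the stale maximum */
    return raw_ns;
}
```

With a stale `delta_jiffies` of 0 and a raw reading 11 ms past the last tick, the clamped clock is held at 1 ms; once jiffies catch up it leaps forward by the missing 10 ms, which matches the latency spikes reported by ftrace.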
In case a cpu goes idle but softirqs are pending only an error message is
printed to the console. It may take a very long time until the pending
softirqs will finally be executed. Worst case would be a hanging system.

With this patch the timer tick just continues and the softirqs will be
executed after the next interrupt. Still a delay but better than a
hanging system.

Currently we have at least two device drivers on s390 which under certain
circumstances schedule a tasklet from process context. This is a reason
why we can end up with pending softirqs when going idle. Fixing these
drivers seems to be non-trivial.
However there is no question that the drivers should be fixed.
This patch shouldn't be considered as a bug fix. It just is intended to
keep a system running even if device drivers are buggy.

Signed-off-by: Heiko Carstens
Cc: Jan Glauber
Cc: Stefan Weinhuber
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
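
The decision this patch changes can be modelled in a few lines. This is only a sketch: `idle_tick_decision()` and the `pending_softirqs` bitmask parameter are hypothetical names, not the kernel's interface.

```c
#include <stdint.h>

enum tick_action { TICK_KEEP, TICK_STOP };

/* pending_softirqs: bitmask of raised but not yet executed softirqs
 * (hypothetical parameter). If anything is pending, keep the tick
 * running so the next timer interrupt gets a chance to run it. */
enum tick_action idle_tick_decision(uint32_t pending_softirqs)
{
    if (pending_softirqs != 0)
        return TICK_KEEP;
    return TICK_STOP;
}
```

The delay is then bounded by one tick period instead of by the length of the idle sleep.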
30 May, 2008
2 commits
-
Signed-off-by: Ingo Molnar
-
Fix (probably theoretical only) rq->clock update bug:
in tick_nohz_update_jiffies() [which is called on all irq
entry on all cpus where the irq entry hits an idle cpu] we
call touch_softlockup_watchdog() before we update jiffies.
That works fine most of the time when idle timeouts are within
60 seconds. But when an idle timeout is beyond 60 seconds,
jiffies is updated with a jump of more than 60 seconds,
which causes a jump in cpu-clock of more than 60 seconds,
triggering a false positive.

Reported-by: David Miller
Signed-off-by: Ingo Molnar
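
Why the order of the two calls matters can be shown with a toy model. Everything here is an illustrative assumption: `struct cpu_sim`, its fields and the three helpers are hypothetical, and the model presumes the fix is simply to touch the watchdog after the jiffies update rather than before it.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOCKUP_THRESH_S 60  /* threshold in seconds, per the commit text */

/* Hypothetical per-CPU model: current clock and last watchdog touch. */
struct cpu_sim {
    uint64_t clock_s;
    uint64_t touch_s;
};

/* Account for the time spent idle in one jump. */
void update_jiffies(struct cpu_sim *c, uint64_t idle_s)
{
    c->clock_s += idle_s;
}

/* Record "this CPU is alive" at the current clock value. */
void touch_watchdog(struct cpu_sim *c)
{
    c->touch_s = c->clock_s;
}

/* True if the watchdog would report a soft lockup. */
bool watchdog_fires(const struct cpu_sim *c)
{
    return c->clock_s - c->touch_s > LOCKUP_THRESH_S;
}
```

Touching first and then jumping the clock by 90 seconds leaves a stale timestamp and a false positive; updating first and touching afterwards does not.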
25 Apr, 2008
1 commit
-
David Miller reported:
|--------------->
the following commit:

| commit 27ec4407790d075c325e1f4da0a19c56953cce23
| Author: Ingo Molnar
| Date: Thu Feb 28 21:00:21 2008 +0100
|
| sched: make cpu_clock() globally synchronous
|
| Alexey Zaytsev reported (and bisected) that the introduction of
| cpu_clock() in printk made the timestamps jump back and forth.
|
| Make cpu_clock() more reliable while still keeping it fast when it's
| called frequently.
|
| Signed-off-by: Ingo Molnar

causes watchdog triggers when a cpu exits NOHZ state when it has been
there for >= the soft lockup threshold, for example here are some
messages from a 128 cpu Niagara2 box:

[ 168.106406] BUG: soft lockup - CPU#11 stuck for 128s! [dd:3239]
[ 168.989592] BUG: soft lockup - CPU#21 stuck for 86s! [swapper:0]
[ 168.999587] BUG: soft lockup - CPU#29 stuck for 91s! [make:4511]
[ 168.999615] BUG: soft lockup - CPU#2 stuck for 85s! [swapper:0]
[ 169.020514] BUG: soft lockup - CPU#37 stuck for 91s! [swapper:0]
[ 169.020514] BUG: soft lockup - CPU#45 stuck for 91s! [sh:4515]
[ 169.020515] BUG: soft lockup - CPU#69 stuck for 92s! [swapper:0]
[ 169.020515] BUG: soft lockup - CPU#77 stuck for 92s! [swapper:0]
[ 169.020515] BUG: soft lockup - CPU#61 stuck for 92s! [swapper:0]
[ 169.112554] BUG: soft lockup - CPU#85 stuck for 92s! [swapper:0]
[ 169.112554] BUG: soft lockup - CPU#101 stuck for 92s! [swapper:0]
[ 169.112554] BUG: soft lockup - CPU#109 stuck for 92s! [swapper:0]
[ 169.112554] BUG: soft lockup - CPU#117 stuck for 92s! [swapper:0]
[ 169.171483] BUG: soft lockup - CPU#40 stuck for 80s! [dd:3239]
[ 169.331483] BUG: soft lockup - CPU#13 stuck for 86s! [swapper:0]
[ 169.351500] BUG: soft lockup - CPU#43 stuck for 101s! [dd:3239]
[ 169.531482] BUG: soft lockup - CPU#9 stuck for 129s! [mkdir:4565]
[ 169.595754] BUG: soft lockup - CPU#20 stuck for 93s! [swapper:0]
[ 169.626787] BUG: soft lockup - CPU#52 stuck for 93s! [swapper:0]
[ 169.626787] BUG: soft lockup - CPU#84 stuck for 92s! [swapper:0]
[ 169.636812] BUG: soft lockup - CPU#116 stuck for 94s! [swapper:0]

It's simple enough to trigger this by doing a 10 minute sleep after a
fresh bootup then starting a parallel kernel build.

I suspect this might be reintroducing a problem we've had and fixed
before, see the thread:

http://marc.info/?l=linux-kernel&m=119546414004065&w=2
20 Apr, 2008
1 commit
-
Various SMP balancing algorithms require that the bandwidth period
run in sync.

Possible improvements are moving the rt_bandwidth thing into root_domain
and keeping a span per rt_bandwidth which marks throttled cpus.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
17 Apr, 2008
1 commit
-
Call
ts = &per_cpu(tick_cpu_sched, cpu);
and
cpu = smp_processor_id();
once instead of twice.

No functional change done, as changed code runs with local irq off.
Reduces source lines and text size (20 bytes on x86_64).

[ akpm@linux-foundation.org: Build fix ]
Signed-off-by: Karsten Wiese
Cc: Andrew Morton
Signed-off-by: Thomas Gleixner
09 Mar, 2008
1 commit
-
Silences the WARN_ONs in rcu_enter_nohz() and rcu_exit_nohz() that were
previously triggered by (repeated) calls to:
$ echo 0 > /sys/devices/system/cpu/cpu1/online
$ echo 1 > /sys/devices/system/cpu/cpu1/online

Signed-off-by: Karsten Wiese
Cc: johnstul@us.ibm.com
Cc: Rafael Wysocki
Cc: Steven Rostedt
Cc: Ingo Molnar
Acked-by: Paul E. McKenney
Signed-off-by: Andrew Morton
Signed-off-by: Thomas Gleixner
01 Mar, 2008
1 commit
-
The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The
idle CPU will not progress the RCU through its grace period and a
synchronize_rcu may get stuck. Without this patch I have a box that will
not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine
with this patch.

This patch comes from the -rt kernel where it has been tested for
several months.

Signed-off-by: Steven Rostedt
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar
09 Feb, 2008
1 commit
-
Function timekeeping_is_continuous() no longer checks flag
CLOCK_IS_CONTINUOUS, and it checks CLOCK_SOURCE_VALID_FOR_HRES now. So rename
the function accordingly.

Signed-off-by: Li Zefan
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Feb, 2008
1 commit
-
To allow better diagnosis of tick-sched related, especially NOHZ
related problems, we need to know when the last wakeup via an irq
happened and when the CPU left the idle state.

Add two fields (idle_waketime, idle_exittime) to the tick_sched
structure and add them to the timer_list output.

Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
30 Jan, 2008
3 commits
-
Current idle time in kstat is based on jiffies and is coarse grained.
tick_sched.idle_sleeptime is making some attempt to keep track of idle time
in a fine grained manner. But, it is not handling the time spent in
interrupts fully.

Make tick_sched.idle_sleeptime accurate with respect to time spent on
handling interrupts and also add tick_sched.idle_lastupdate, which keeps
track of last time when idle_sleeptime was updated.

These statistics will be crucial for the cpufreq-ondemand governor, which can
shed some of the conservative guard band that it uses today while setting the
frequency. The ondemand changes that use the exact idle time are coming
soon.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner -
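
One way to keep idle time accurate across interrupts is to close the current idle interval whenever an interrupt arrives and open a new one when it returns. The sketch below shows that bookkeeping; `struct idle_acct` and the four helpers are hypothetical, and only the field names `idle_sleeptime` and `idle_lastupdate` come from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified accounting state (times in nanoseconds). */
struct idle_acct {
    uint64_t idle_entry_ns;       /* start of the current idle interval */
    uint64_t idle_sleeptime_ns;   /* accumulated time actually spent idle */
    uint64_t idle_lastupdate_ns;  /* when idle_sleeptime_ns was last updated */
    bool idle_active;
};

/* Close the current idle interval, if one is open. */
static void idle_close(struct idle_acct *a, uint64_t now_ns)
{
    if (!a->idle_active)
        return;
    a->idle_sleeptime_ns += now_ns - a->idle_entry_ns;
    a->idle_lastupdate_ns = now_ns;
    a->idle_active = false;
}

/* Open a new idle interval. */
void idle_enter(struct idle_acct *a, uint64_t now_ns)
{
    a->idle_entry_ns = now_ns;
    a->idle_active = true;
}

/* Interrupt entry: time spent in the handler must not count as idle. */
void idle_irq_enter(struct idle_acct *a, uint64_t now_ns)
{
    idle_close(a, now_ns);
}

/* Interrupt exit while still idle: resume accounting. */
void idle_irq_exit(struct idle_acct *a, uint64_t now_ns)
{
    idle_enter(a, now_ns);
}

/* Leaving the idle loop. */
void idle_exit(struct idle_acct *a, uint64_t now_ns)
{
    idle_close(a, now_ns);
}
```

For example, entering idle at t=100, taking an interrupt from t=400 to t=450 and leaving idle at t=1000 accumulates 850 ns of idle time, not 900.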
I was confused by FSEC = 10^15 NSEC statement, plus small whitespace
fixes. When there's copyright, there should be GPL.

Signed-off-by: Pavel Machek
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
-
Small cleanups to tick-related code. Wrong preempt count is followed
by BUG(), so it is hardly KERN_WARNING.

Signed-off-by: Pavel Machek
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
26 Jan, 2008
2 commits
-
In order to more easily allow for the scheduler to use timers, clean up
the locking a bit.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
We need to teach no_hz about the rt throttling because it's tick driven.
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
28 Nov, 2007
1 commit
-
David Miller reported soft lockup false-positives that trigger
on NOHZ due to CPUs idling for more than 10 seconds.

The solution is to touch the softlockup watchdog when we return from
idle. (by definition we are not 'locked up' when we were idle)

http://bugzilla.kernel.org/show_bug.cgi?id=9409
Reported-by: David Miller
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
06 Nov, 2007
1 commit
-
Signed-off-by: Li Zefan
Cc: Thomas Gleixner
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Oct, 2007
1 commit
-
This patch removes the unused
EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length).

Signed-off-by: Adrian Bunk
Signed-off-by: Thomas Gleixner
20 Oct, 2007
1 commit
-
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (41 commits)
ACPICA: hw: Don't carry spinlock over suspend
ACPICA: hw: remove use_lock flag from acpi_hw_register_{read, write}
ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle
ACPI: clean up acpi_enter_sleep_state_prep
Hibernation: Make sure that ACPI is enabled in acpi_hibernation_finish
ACPI: suppress uninitialized var warning
cpuidle: consolidate 2.6.22 cpuidle branch into one patch
ACPI: thinkpad-acpi: skip blanks before the data when parsing sysfs
ACPI: AC: Add sysfs interface
ACPI: SBS: Add sysfs alarm
ACPI: SBS: Add ACPI_PROCFS around procfs handling code.
ACPI: SBS: Add support for power_supply class (and sysfs)
ACPI: SBS: Make SBS reads table-driven.
ACPI: SBS: Simplify data structures in SBS
ACPI: SBS: Split host controller (ACPI0001) from SBS driver (ACPI0002)
ACPI: EC: Add new query handler to list head.
ACPI: Add acpi_bus_generate_event4() function
ACPI: Battery: add sysfs alarm
ACPI: Battery: Add sysfs support
ACPI: Battery: Misc clean-ups, no functional changes
...

Fix up conflicts in drivers/misc/thinkpad_acpi.[ch] manually
17 Oct, 2007
1 commit
-
To avoid lock contention, we distribute the sched_timer calls across the
cpus so they do not trigger at the same instant. However, I used NR_CPUS,
which can cause needless grouping on small smp systems depending on your
kernel config. This patch converts to using num_possible_cpus() so we
spread it as evenly as possible on every machine.

Briefly tested w/ NR_CPUS=255 and verified reduced contention.

Signed-off-by: John Stultz
Acked-by: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
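
The effect of the divisor on the spread can be seen in a short sketch. The helper `tick_offset_ns()` is a hypothetical name, and spreading the CPUs over half of the tick period is an assumption made for illustration; `ncpus` must be at least 1.

```c
#include <stdint.h>

/* Offset of a CPU's tick inside the tick period. "ncpus" is the divisor:
 * the compile-time NR_CPUS before the patch, the number of possible
 * CPUs after it. */
uint64_t tick_offset_ns(uint64_t period_ns, unsigned int ncpus,
                        unsigned int cpu)
{
    uint64_t step = (period_ns / 2) / ncpus;  /* spacing between neighbours */

    return step * cpu;
}
```

On a 4-CPU machine with a 1 ms period, dividing by 255 places CPU 3 only 5880 ns after CPU 0, so the four timers fire almost together; dividing by 4 places it 375000 ns later.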
10 Oct, 2007
1 commit
-
commit e5a16b1f9eec0af7cfa0830304b41c1c0833cf9f
Author: Len Brown
Date: Tue Oct 2 23:44:44 2007 -0400

cpuidle: shrink diff

processor_idle.c | 440 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 429 insertions(+), 11 deletions(-)

Signed-off-by: Len Brown

commit dfbb9d5aedfb18848a3e0d6f6e3e4969febb209c
Author: Len Brown
Date: Wed Sep 26 02:17:55 2007 -0400

cpuidle: reduce diff size

Reduces the cpuidle processor_idle.c diff vs 2.6.22 from this
processor_idle.c | 2006 ++++++++++++++++++++++++++-----------------
1 file changed, 1219 insertions(+), 787 deletions(-)

to this:
processor_idle.c | 502 +++++++++++++++++++++++++++++++++++++++----
1 file changed, 458 insertions(+), 44 deletions(-)

...for the purpose of making the cpuidle patch less invasive
and easier to review.

No functional changes. Build tested only.

Signed-off-by: Len Brown
commit 889172fc915f5a7fe20f35b133cbd205ce69bf6c
Author: Venki Pallipadi
Date: Thu Sep 13 13:40:05 2007 -0700

cpuidle: Retain old ACPI policy for !CONFIG_CPU_IDLE

Retain the old policy in processor_idle, so that when CPU_IDLE is not
configured, old C-state policy will still be used. This provides a
clean gradual migration path from old ACPI policy to new cpuidle
based policy.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 9544a8181edc7ecc33b3bfd69271571f98ed08bc
Author: Venki Pallipadi
Date: Thu Sep 13 13:39:17 2007 -0700

cpuidle: Configure governors by default

Quoting Len "Do not give an option to users to shoot themselves in the foot".
Remove the configurability of ladder and menu governors as they are
needed for default policy of cpuidle. That way users will not be able to
have cpuidle without any policy, losing all C-state power savings.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 8975059a2c1e56cfe83d1bcf031bcf4cb39be743
Author: Adam Belay
Date: Tue Aug 21 18:27:07 2007 -0400

CPUIDLE: load ACPI properly when CPUIDLE is disabled

Change the registration return codes for when CPUIDLE
support is not compiled into the kernel. As a result, the ACPI
processor driver will load properly even if CPUIDLE is unavailable.
However, it may be possible to cleanup the ACPI processor driver further
and eliminate some dead code paths.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit e0322e2b58dd1b12ec669bf84693efe0dc2414a8
Author: Adam Belay
Date: Tue Aug 21 18:26:06 2007 -0400

CPUIDLE: remove cpuidle_get_bm_activity()

Remove cpuidle_get_bm_activity() and update governors
accordingly.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 18a6e770d5c82ba26653e53d240caa617e09e9ab
Author: Adam Belay
Date: Tue Aug 21 18:25:58 2007 -0400

CPUIDLE: max_cstate fix

Currently max_cstate is limited to 0, resulting in no idle processor
power management on ACPI platforms. This patch restores the value to
the array size.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 1fdc0887286179b40ce24bcdbde663172e205ef0
Author: Adam Belay
Date: Tue Aug 21 18:25:40 2007 -0400

CPUIDLE: handle BM detection inside the ACPI Processor driver

Update the ACPI processor driver to detect BM activity and
limit state entry depth internally, rather than exposing such
requirements to CPUIDLE. As a result, CPUIDLE can drop this
ACPI-specific interface and become more platform independent. BM
activity is now handled much more aggressively than it was in the
original implementation, so some testing coverage may be needed to
verify that this doesn't introduce any DMA buffer under-run issues.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 0ef38840db666f48e3cdd2b769da676c57228dd9
Author: Adam Belay
Date: Tue Aug 21 18:25:14 2007 -0400

CPUIDLE: menu governor updates

Tweak the menu governor to more effectively handle non-timer
break events. Non-timer break events are detected by comparing the
actual sleep time to the expected sleep time. In future revisions, it
may be more reliable to use the timer data structures directly.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit bb4d74fca63fa96cf3ace644b15ae0f12b7df5a1
Author: Adam Belay
Date: Tue Aug 21 18:24:40 2007 -0400

CPUIDLE: fix 'current_governor' sysfs entry

Allow the "current_governor" sysfs entry to properly handle
input terminated with '\n'.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit df3c71559bb69b125f1a48971bf0d17f78bbdf47
Author: Len Brown
Date: Sun Aug 12 02:00:45 2007 -0400

cpuidle: fix IA64 build (again)

Signed-off-by: Len Brown
commit a02064579e3f9530fd31baae16b1fc46b5a7bca8
Author: Venkatesh Pallipadi
Date: Sun Aug 12 01:39:27 2007 -0400

cpuidle: Remove support for runtime changing of max_cstate

Remove support for runtime changeability of max_cstate. Drivers can
use latency APIs.

max_cstate can still be used as a boot time option and dmi override.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit 0912a44b13adf22f5e3f607d263aed23b4910d7e
Author: Venkatesh Pallipadi
Date: Sun Aug 12 01:39:16 2007 -0400

cpuidle: Remove ACPI cstate_limit calls from ipw2100

ipw2100 already has code to use acceptable_latency interfaces to limit the
C-state. Remove the calls to acpi_set_cstate_limit and acpi_get_cstate_limit
as they are redundant.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

commit c649a76e76be6bff1fd770d0a775798813a3f6e0
Author: Venkatesh Pallipadi
Date: Sun Aug 12 01:35:39 2007 -0400

cpuidle: compile fix for pause and resume functions

Fix the compilation failure when cpuidle is not compiled in.
Signed-off-by: Venkatesh Pallipadi
Acked-by: Adam Belay
Signed-off-by: Len Brown

commit 2305a5920fb8ee6ccec1c62ade05aa8351091d71
Author: Adam Belay
Date: Thu Jul 19 00:49:00 2007 -0400

cpuidle: re-write

Some portions have been rewritten to make the code cleaner and lighter
weight. The following is a list of changes:

1.) the state name is now included in the sysfs interface
2.) detection, hotplug, and available state modifications are handled by
CPUIDLE drivers directly
3.) the CPUIDLE idle handler is only ever installed when at least one
cpuidle_device is enabled and ready
4.) the menu governor BM code no longer overflows
5.) the sysfs attributes are now printed as unsigned integers, avoiding
negative values
6.) a variety of other small cleanups

Also, Idle drivers are no longer swappable during runtime through the
CPUIDLE sysfs interface. On i386 and x86_64 most idle handlers (e.g.
poll, mwait, halt, etc.) don't benefit from an infrastructure that
supports multiple states, so I think using a more general case idle
handler selection mechanism would be cleaner.

Signed-off-by: Adam Belay
Acked-by: Venkatesh Pallipadi
Acked-by: Shaohua Li
Signed-off-by: Len Brown

commit df25b6b56955714e6e24b574d88d1fd11f0c3ee5
Author: Len Brown
Date: Tue Jul 24 17:08:21 2007 -0400

cpuidle: fix IA64 buid

Signed-off-by: Len Brown
commit fd6ada4c14488755ff7068860078c437431fbccd
Author: Adrian Bunk
Date: Mon Jul 9 11:33:13 2007 -0700

cpuidle: static

make cpuidle_replace_governor() static
Signed-off-by: Adrian Bunk
Cc: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit c1d4a2cebcadf2429c0c72e1d29aa2a9684c32e0
Author: Adrian Bunk
Date: Tue Jul 3 00:54:40 2007 -0400

cpuidle: static

This patch makes the needlessly global struct menu_governor static.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit dbf8780c6e8d572c2c273da97ed1cca7608fd999
Author: Andrew Morton
Date: Tue Jul 3 00:49:14 2007 -0400

export symbol tick_nohz_get_sleep_length

ERROR: "tick_nohz_get_sleep_length" [drivers/cpuidle/governors/menu.ko] undefined!
ERROR: "tick_nohz_get_idle_jiffies" [drivers/cpuidle/governors/menu.ko] undefined!

And please be sure to get your changes to core kernel suitably reviewed.

Cc: Adam Belay
Cc: Venki Pallipadi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 29f0e248e7017be15f99febf9143a2cef00b2961
Author: Andrew Morton
Date: Tue Jul 3 00:43:04 2007 -0400

tick.h needs hrtimer.h

It uses hrtimers.
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit e40cede7d63a029e92712a3fe02faee60cc38fb4
Author: Venki Pallipadi
Date: Tue Jul 3 00:40:34 2007 -0400

cpuidle: first round of documentation updates

Documentation changes based on Pavel's feedback.
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 83b42be2efece386976507555c29e7773a0dfcd1
Author: Venki Pallipadi
Date: Tue Jul 3 00:39:25 2007 -0400

cpuidle: add rating to the governors and pick the one with highest rating by default

Introduce a governor rating scheme to pick the right governor by default.
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit d2a74b8c5e8f22def4709330d4bfc4a29209b71c
Author: Venki Pallipadi
Date: Tue Jul 3 00:38:08 2007 -0400

cpuidle: make cpuidle sysfs driver governor switch off by default

Make default cpuidle sysfs to show current_governor and current_driver in
read-only mode. More elaborate available_governors and available_drivers with
writeable current_governor and current_driver interface only appear with
"cpuidle_sysfs_switch" boot parameter.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 1f60a0e80bf83cf6b55c8845bbe5596ed8f6307b
Author: Venki Pallipadi
Date: Tue Jul 3 00:37:00 2007 -0400

cpuidle: menu governor: change the early break condition

Change the C-state early break out algorithm in menu governor.
We only look at early breakouts that result in wakeups shorter than idle
state's target_residency. If such a breakout is frequent enough, eliminate
the particular idle state up to a timeout period.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 45a42095cf64b003b4a69be3ce7f434f97d7af51
Author: Venki Pallipadi
Date: Tue Jul 3 00:35:38 2007 -0400

cpuidle: fix uninitialized variable in sysfs routine

Fix the uninitialized usage of ret.
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 80dca7cdba3e6ee13eae277660873ab9584eb3be
Author: Venki Pallipadi
Date: Tue Jul 3 00:34:16 2007 -0400

cpuidle: reenable /proc/acpi//power interface for the time being

Keep /proc/acpi/processor/CPU*/power around for a while as powertop depends
on it. It will be marked deprecated and removed in future. powertop can use
cpuidle interfaces instead.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 589c37c2646c5e3813a51255a5ee1159cb4c33fc
Author: Venki Pallipadi
Date: Tue Jul 3 00:32:37 2007 -0400

cpuidle: menu governor and hrtimer compile fix

Compile fix for menu governor.
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 0ba80bd9ab3ed304cb4f19b722e4cc6740588b5e
Author: Len Brown
Date: Thu May 31 22:51:43 2007 -0400

cpuidle: build fix - cpuidle vs ipw2100 module

ERROR: "acpi_set_cstate_limit" [drivers/net/wireless/ipw2100.ko] undefined!
Signed-off-by: Len Brown
commit d7d8fa7f96a7f7682be7c6cc0cc53fa7a18c3b58
Author: Adam Belay
Date: Sat Mar 24 03:47:07 2007 -0400

cpuidle: add the 'menu' governor

Here is my first take at implementing an idle PM governor that takes
full advantage of NO_HZ. I call it the 'menu' governor because it
considers the full list of idle states before each entry.

I've kept the implementation fairly simple. It attempts to guess the
next residency time and then chooses a state that would meet at least
the break-even point between power savings and entry cost. To this end,
it selects the deepest idle state that satisfies the following
constraints:
1. If the idle time elapsed since bus master activity was detected
is below a threshold (currently 20 ms), then limit the selection
to C2-type or above.
2. Do not choose a state with a break-even residency that exceeds
the expected time remaining until the next timer interrupt.
3. Do not choose a state with a break-even residency that exceeds
the elapsed time between the last pair of break events,
excluding timer interrupts.

This governor has an advantage over "ladder" governor because it
proactively checks how much time remains until the next timer interrupt
using the tick infrastructure. Also, it handles device interrupt
activity more intelligently by not including timer interrupts in break
event calculations. Finally, it doesn't make policy decisions using the
number of state entries, which can have variable residency times (NO_HZ
makes these potentially very large), and instead only considers sleep
time deltas.

The menu governor can be selected during runtime using the cpuidle sysfs
interface like so:
"echo "menu" > /sys/devices/system/cpu/cpuidle/current_governor"

Signed-off-by: Adam Belay
Signed-off-by: Len Brown

commit a4bec7e65aa3b7488b879d971651cc99a6c410fe
Author: Adam Belay
Date: Sat Mar 24 03:47:03 2007 -0400

cpuidle: export time until next timer interrupt using NO_HZ

Expose information about the time remaining until the next
timer interrupt expires by utilizing the dynticks infrastructure.
Also modify the main idle loop to allow dynticks to handle
non-interrupt break events (e.g. DMA). Finally, expose sleep ticks
information to external code. Thomas Gleixner is responsible for much
of the code in this patch. However, I've made some additional changes,
so I'm probably responsible if there are any bugs or oversights :)

Signed-off-by: Adam Belay
Signed-off-by: Len Brown

commit 2929d8996fbc77f41a5ff86bb67cdde3ca7d2d72
Author: Adam Belay
Date: Sat Mar 24 03:46:58 2007 -0400

cpuidle: governor API changes

This patch prepares cpuidle for the menu governor. It adds an optional
stage after idle state entry to give the governor an opportunity to
check why the state was exited. Also it makes sure the idle loop
returns after each state entry, allowing the appropriate dynticks code
to run.

Signed-off-by: Adam Belay
Signed-off-by: Len Brown

commit 3a7fd42f9825c3b03e364ca59baa751bb350775f
Author: Venki Pallipadi
Date: Thu Apr 26 00:03:59 2007 -0700

cpuidle: hang fix

Prevent hang on x86-64, when ACPI processor driver is added as a module on
a system that does not support C-states.

x86-64 expects all idle handlers to enable interrupts before returning from
idle handler. This is due to enter_idle(), exit_idle() races. Make
cpuidle_idle_call() conform to this when there is no pm_idle_old.

Also, cpuidle looks at the return values of attach_driver() and sets
current_driver to NULL if attach fails on all CPUs.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 4893339a142afbd5b7c01ffadfd53d14746e858e
Author: Shaohua Li
Date: Thu Apr 26 10:40:09 2007 +0800

cpuidle: add support for max_cstate limit

With CPUIDLE framework, the max_cstate (to limit max cpu c-state)
parameter is ignored. Some systems require it to ignore C2/C3
and some drivers like ipw require it too.

Signed-off-by: Shaohua Li
Signed-off-by: Len Brown

commit 43bbbbe1cb998cbd2df656f55bb3bfe30f30e7d1
Author: Shaohua Li
Date: Thu Apr 26 10:40:13 2007 +0800

cpuidle: add cpuidle_fore_redetect_devices API

Add the cpuidle_force_redetect_devices API,
which forces all CPUs to redetect idle states.
The next patch will use it.

Signed-off-by: Shaohua Li
Signed-off-by: Len Brown

commit d1edadd608f24836def5ec483d2edccfb37b1d19
Author: Shaohua Li
Date: Thu Apr 26 10:40:01 2007 +0800

cpuidle: fix sysfs related issue

Fix the cpuidle sysfs issue.
a. make kobject dynamically allocated
b. fixed sysfs init issue to avoid suspend/resume issue

Signed-off-by: Shaohua Li
Signed-off-by: Len Brown

commit 7169a5cc0d67b263978859672e86c13c23a5570d
Author: Randy Dunlap
Date: Wed Mar 28 22:52:53 2007 -0400

cpuidle: 1-bit field must be unsigned

A 1-bit bitfield has no room for a sign bit.
drivers/cpuidle/governors/ladder.c:54:16: error: dubious bitfield without explicit `signed' or `unsigned'

Signed-off-by: Randy Dunlap
Cc: Venkatesh Pallipadi
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 4658620158dc2fbd9e4bcb213c5b6fb5d05ba7d4
Author: Venkatesh Pallipadi
Date: Wed Mar 28 22:52:41 2007 -0400

cpuidle: fix boot hang

Patch for cpuidle boot hang reported by Larry Finger here.
http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/2025.html

Signed-off-by: Venkatesh Pallipadi
Cc: Larry Finger
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit c17e168aa6e5fe3851baaae8df2fbc1cf11443a9
Author: Len Brown
Date: Wed Mar 7 04:37:53 2007 -0500

cpuidle: ladder does not depend on ACPI

build fix for CONFIG_ACPI=n
In file included from drivers/cpuidle/governors/ladder.c:21:
include/acpi/processor.h:88: error: expected specifier-qualifier-list before ‘acpi_integer’
include/acpi/processor.h:106: error: expected specifier-qualifier-list before ‘acpi_integer’
include/acpi/processor.h:168: error: expected specifier-qualifier-list before ‘acpi_handle’

Signed-off-by: Len Brown

commit 8c91d958246bde68db0c3f0c57b535962ce861cb
Author: Adrian Bunk
Date: Tue Mar 6 02:29:40 2007 -0800

cpuidle: make code static

This patch makes the following needlessly global code static:
- driver.c: __cpuidle_find_driver()
- governor.c: __cpuidle_find_governor()
- ladder.c: struct ladder_governor

Signed-off-by: Adrian Bunk
Cc: Venkatesh Pallipadi
Cc: Adam Belay
Cc: Shaohua Li
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 0c39dc3187094c72c33ab65a64d2017b21f372d2
Author: Venkatesh Pallipadi
Date: Wed Mar 7 02:38:22 2007 -0500

cpu_idle: fix build break

This patch fixes a build breakage with !CONFIG_HOTPLUG_CPU and
CONFIG_CPU_IDLE.

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 8112e3b115659b07df340ef170515799c0105f82
Author: Venkatesh Pallipadi
Date: Tue Mar 6 02:29:39 2007 -0800

cpuidle: build fix for !CPU_IDLE

Fix the compile issues when CPU_IDLE is not configured.
Signed-off-by: Venkatesh Pallipadi
Cc: Adam Belay
Cc: Shaohua Li
Signed-off-by: Andrew Morton
Signed-off-by: Len Brown

commit 1eb4431e9599cd25e0d9872f3c2c8986821839dd
Author: Venkatesh Pallipadi
Date: Thu Feb 22 13:54:57 2007 -0800

cpuidle take2: Basic documentation for cpuidle

Documentation for cpuidle infrastructure

Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Adam Belay
Signed-off-by: Shaohua Li
Signed-off-by: Len Brown

commit ef5f15a8b79123a047285ec2e3899108661df779
Author: Venkatesh Pallipadi
Date: Thu Feb 22 13:54:03 2007 -0800

cpuidle take2: Hookup ACPI C-states driver with cpuidle

Hook up ACPI C-states to the generic cpuidle infrastructure.

drivers/acpi/processor_idle.c is now an ACPI C-states driver that registers as
a driver in the cpuidle infrastructure, and the policy part is removed from
drivers/acpi/processor_idle.c. We use a cpuidle governor instead.

Signed-off-by: Shaohua Li
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Adam Belay
Signed-off-by: Len Brown

commit 987196fa82d4db52c407e8c9d5dec884ba602183
Author: Venkatesh Pallipadi
Date: Thu Feb 22 13:52:57 2007 -0800

cpuidle take2: Core cpuidle infrastructure

Announcing 'cpuidle', a new CPU power management infrastructure to manage
idle CPUs in a clean and efficient manner.

cpuidle separates the drivers, which can provide support for multiple types
of idle states, from the policy governors, which decide what idle state to
use at run time.

A cpuidle driver can support multiple idle states based on parameters like
varying power consumption, wakeup latency, etc. (ACPI C-states, for example).
A cpuidle governor can be usage-model specific (laptop, server,
laptop on battery, etc.).

The main advantage of the infrastructure is that it allows independent
development of drivers and governors and allows for better CPU power management.

A huge thanks to Adam Belay and Shaohua Li, who were part of this mini-project
since its beginning and are greatly responsible for this patchset.

This patch:

Core cpuidle infrastructure.
Introduces a new abstraction layer for cpuidle, which:
* manages drivers that can support multiple idle states. Drivers
  can be generic or particular to specific hardware/platform
* allows plugging in multiple policy governors that can make idle state policy
  decisions
* provides a set of sysfs interfaces with which an administrator can learn
  about supported drivers and governors and switch them at run time.

Signed-off-by: Adam Belay
Signed-off-by: Shaohua Li
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Len Brown

Signed-off-by: Len Brown
16 Sep, 2007
1 commit
-
Taking a cpu offline removes the cpu from the online mask before the
CPU_DEAD notification is done. The clock events layer does the cleanup
of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is
used to avoid xtime lock contention by assigning the task of jiffies
xtime updates to one CPU. If a CPU is taken offline, then this
assignment becomes stale. This went unnoticed because most of the time
the offline CPU went dead before the online CPU reached __cpu_die(),
where the CPU_DEAD state is checked. In the case that the offline CPU did
not reach the DEAD state before we reach __cpu_die(), the code in there
goes to sleep for 100ms. Due to the stale time update assignment, the
system is stuck forever.

Take the assignment away when a cpu is no longer in the cpu_online_mask.
We do this in the last call to tick_nohz_stop_sched_tick() when the offline
CPU is on the way to the final play_dead() idle entry.

Signed-off-by: Thomas Gleixner
22 Jul, 2007
1 commit
-
After discussing w/ Thomas over IRC, it seems the issue is that the sched tick
fires on every cpu at the same time, causing extra lock contention.

This smaller change adds an extra offset per cpu so the ticks don't line up.
This patch also drops the idle latency from 40us down to under 20us.

Signed-off-by: john stultz
Signed-off-by: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
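
One way to picture the per-cpu offset described above is to spread the cpus evenly across part of a tick period. The sketch below is a user-space illustration only, not the patch itself: the function name `tick_stagger_offset_ns` and the choice to spread over half a period are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: offset (in ns) added to the first tick expiry of
 * a given cpu so that ticks on different cpus do not fire together.
 * Assumption: cpus are spread evenly over half of one tick period.
 */
uint64_t tick_stagger_offset_ns(uint64_t tick_period_ns,
                                unsigned int nr_cpus,
                                unsigned int cpu)
{
    assert(nr_cpus > 0 && cpu < nr_cpus);

    /* Distance between the ticks of two neighbouring cpus. */
    uint64_t step = (tick_period_ns / 2) / nr_cpus;

    return step * cpu;
}
```

For HZ=1000 (a 1,000,000 ns period) and four cpus, this gives offsets of 0, 125,000, 250,000 and 375,000 ns, so no two cpus take the tick at the same instant.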
30 May, 2007
1 commit
-
get_next_timer_interrupt() returns a delta of (LONG_MAX >> 1) in case
there is no timer pending. On 64 bit machines this results in a
multiplication overflow in tick_nohz_stop_sched_tick().

Reported by: Dave Miller

Make the return value a constant and limit the return value to a 32 bit
value.

When the max timeout value is returned, we can safely stop the tick
timer device. The max jiffies delta results in a 12 days timeout for
HZ=1000.

In the long term the get_next_timer_interrupt() code needs to be
reworked to return ktime instead of jiffies, but we have to wait until
the last users of the original NO_IDLE_HZ code are converted.

Signed-off-by: Thomas Gleixner
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds
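
The arithmetic behind this fix can be checked in user space. The sketch below assumes the jiffies delta is multiplied by the tick period in nanoseconds and stored in a signed 64-bit value, and it assumes a cap of 2^30 - 1 jiffies as the "32 bit value"; the macro and function names are illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap on the jiffies delta: 2^30 - 1, which fits in 32 bits. */
#define MAX_DELTA_JIFFIES ((UINT64_C(1) << 30) - 1)

/* Nanoseconds per tick for a given HZ. */
uint64_t tick_period_ns(unsigned int hz)
{
    assert(hz > 0);
    return UINT64_C(1000000000) / hz;
}

/*
 * Returns 1 if delta_jiffies * period_ns would not fit in a signed
 * 64-bit nanosecond value, 0 otherwise.
 */
int delta_ns_overflows(uint64_t delta_jiffies, uint64_t period_ns)
{
    assert(period_ns > 0);
    return delta_jiffies > (uint64_t)INT64_MAX / period_ns;
}

/* Whole days covered by delta_jiffies at the given HZ. */
uint64_t delta_days(uint64_t delta_jiffies, unsigned int hz)
{
    assert(hz > 0);
    return delta_jiffies / hz / 86400;
}
```

With HZ=1000 the tick period is 1,000,000 ns. A delta of 2^62 - 1 jiffies (half of LONG_MAX on a 64 bit machine) overflows the product, while the capped delta does not and corresponds to 12 whole days.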
24 May, 2007
1 commit
-
The warning in the NOHZ code, which triggers when a CPU goes idle with
softirqs pending, can fill up the logs quite quickly. Rate limit the output
until we find the root cause of that problem.

Signed-off-by: Thomas Gleixner
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
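
A simple form of rate limiting is to print only the first few occurrences of a warning and then stay quiet. The following user-space sketch shows that idea; the struct, the function name, the limit and the message text are all illustrative assumptions, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical limiter state: how many warnings were printed so far. */
struct warn_limiter {
    int printed;    /* warnings emitted so far */
    int limit;      /* maximum number of warnings to emit */
};

/*
 * Warn about pending softirqs, but only until the limit is reached.
 * Returns 1 if the message was printed, 0 if it was suppressed.
 */
int warn_softirq_pending(struct warn_limiter *rl, unsigned int pending_mask)
{
    assert(rl != NULL);

    if (rl->printed >= rl->limit)
        return 0;

    rl->printed++;
    fprintf(stderr, "NOHZ: softirq pending %02x\n", pending_mask);
    return 1;
}
```

A counter like this bounds the log volume but never resets; a time-based limiter would be needed if later occurrences should be reported again.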
09 May, 2007
2 commits
-
Fix the process idle load balancing in the presence of dynticks. CPUs for
which ticks are stopped will sleep until the next event wakes them up.
These sleeps can potentially last a long time, and during them there is
currently no periodic idle load balancing being done.

This patch nominates an owner among the idle cpus, which does the idle load
balancing on behalf of the other idle cpus. Once all the cpus are
completely idle, we can stop this idle load balancing too. Checks added
in the fast path are minimized. Whenever there are busy cpus in the system, there
will be an owner (idle cpu) doing the system-wide idle load balancing.

Open items:
1. Intelligent owner selection (like an idle core in a busy package).
2. Merge with rcu's nohz_cpu_mask?

Signed-off-by: Suresh Siddha
Acked-by: Ingo Molnar
Cc: Thomas Gleixner
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
While the !highres/!dyntick code assigns the duty of the do_timer() call to
one specific CPU, this was dropped in the highres/dyntick part during
development.

Steven Rostedt discovered the xtime lock contention on highres/dyntick due
to several CPUs trying to update jiffies.

Add the single CPU assignment back. In the dyntick case this needs to be
handled carefully, as the CPU which has the do_timer() duty must drop the
assignment and let it be grabbed by another CPU, which is active.
Otherwise the do_timer() calls would not happen during the long sleep.

Signed-off-by: Thomas Gleixner
Acked-by: Ingo Molnar
Cc: Steven Rostedt
Acked-by: Mark Lord
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
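
The hand-off of the do_timer() duty can be modelled in a few lines of user-space C. This is a toy model under stated assumptions: the struct, the function names and the NO_TIMER_CPU marker are hypothetical, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NO_TIMER_CPU (-1)   /* hypothetical marker: nobody holds the duty */

/* Toy model of the shared state. */
struct tick_state {
    int do_timer_cpu;       /* cpu that updates jiffies, or NO_TIMER_CPU */
    unsigned long jiffies;  /* global tick counter */
};

/* Called when `cpu` stops its tick before a long idle sleep. */
void tick_stop(struct tick_state *ts, int cpu)
{
    assert(ts != NULL);

    if (ts->do_timer_cpu == cpu)
        ts->do_timer_cpu = NO_TIMER_CPU;    /* drop the duty */
}

/* Called from the periodic tick on an active `cpu`. */
void tick_handler(struct tick_state *ts, int cpu)
{
    assert(ts != NULL);

    if (ts->do_timer_cpu == NO_TIMER_CPU)
        ts->do_timer_cpu = cpu;             /* grab the vacant duty */

    if (ts->do_timer_cpu == cpu)
        ts->jiffies++;                      /* only the owner updates jiffies */
}
```

In this model only one cpu increments the counter per tick, and when the owner goes into a long sleep the next active cpu picks the duty up, so the counter keeps advancing.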
27 Feb, 2007
1 commit
-
Acked-by: Thomas Gleixner
Signed-off-by: David S. Miller
20 Feb, 2007
2 commits
-
The BUG_ON() in tick_nohz_stop_sched_tick() triggers on some boxen.
Remove the BUG_ON and print information about the pending softirq
to allow better debugging of the problem.

Signed-off-by: Thomas Gleixner
Signed-off-by: Linus Torvalds
-
When a CPU is needed for RCU the tick has to continue even when it was
stopped before.

Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
Signed-off-by: Linus Torvalds
17 Feb, 2007
2 commits
-
Add /proc/timer_list, which prints all currently pending (high-res) timers,
all clock-event sources and their parameters in a human-readable form.

Sample output:

Timer List Version: v0.1
HRTIMER_MAX_CLOCK_BASES: 2
now at 4246046273872 nsecs

cpu: 0
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: , hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
# expires at 4246432689566 nsecs [in 386415694 nsecs]
#1: , hrtimer_wakeup, do_nanosleep, pcscd/2050
# expires at 4247018194689 nsecs [in 971920817 nsecs]
#2: , hrtimer_wakeup, do_nanosleep, irqbalance/1909
# expires at 4247351358392 nsecs [in 1305084520 nsecs]
#3: , hrtimer_wakeup, do_nanosleep, crond/2157
# expires at 4249097614968 nsecs [in 3051341096 nsecs]
#4: , it_real_fn, do_setitimer, syslogd/1888
# expires at 4251329900926 nsecs [in 5283627054 nsecs]
.expires_next : 4246432689566 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 31306
.idle_tick : 4246020791890 nsecs
.tick_stopped : 1
.idle_jiffies : 986504
.idle_calls : 40700
.idle_sleeps : 36014
.idle_entrytime : 4246019418883 nsecs
.idle_sleeptime : 4178181972709 nsecs

cpu: 1
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: , hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
# expires at 4246050084568 nsecs [in 3810696 nsecs]
#1: , hrtimer_wakeup, do_nanosleep, atd/2227
# expires at 4261010635003 nsecs [in 14964361131 nsecs]
#2: , hrtimer_wakeup, do_nanosleep, smartd/2332
# expires at 5469485798970 nsecs [in 1223439525098 nsecs]
.expires_next : 4246050084568 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 24043
.idle_tick : 4246046084568 nsecs
.tick_stopped : 0
.idle_jiffies : 986510
.idle_calls : 26360
.idle_sleeps : 22551
.idle_entrytime : 4246043874339 nsecs
.idle_sleeptime : 4170763761184 nsecs

tick_broadcast_mask: 00000003
event_broadcast_mask: 00000001

CPU#0's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246432689566 nsecs

CPU#1's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246050084568 nsecs

Clock Event Device: hpet
capabilities: 00000007
max_delta_ns: 2147483647
min_delta_ns: 3352
mult: 61496110
shift: 32
set_next_event: hpet_next_event
set_mode: hpet_set_mode
event_handler: handle_nextevt_broadcast

Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
Cc: john stultz
Cc: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
With Ingo Molnar

Add functions to provide dynamic ticks and high resolution timers. The code
which keeps track of jiffies and handles the long idle periods is shared
between tick based and high resolution timer based dynticks. The dyntick
functionality can be disabled on the kernel commandline. Also provide the
infrastructure to support high resolution timers.

Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Cc: john stultz
Cc: Roman Zippel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds