07 Jan, 2012

1 commit

  • * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
    sched/tracing: Add a new tracepoint for sleeptime
    sched: Disable scheduler warnings during oopses
    sched: Fix cgroup movement of waking process
    sched: Fix cgroup movement of newly created process
    sched: Fix cgroup movement of forking process
    sched: Remove cfs bandwidth period check in tg_set_cfs_period()
    sched: Fix load-balance lock-breaking
    sched: Replace all_pinned with a generic flags field
    sched: Only queue remote wakeups when crossing cache boundaries
    sched: Add missing rcu_dereference() around ->real_parent usage
    [S390] fix cputime overflow in uptime_proc_show
    [S390] cputime: add sparse checking and cleanup
    sched: Mark parent and real_parent as __rcu
    sched, nohz: Fix missing RCU read lock
    sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer
    sched, nohz: Fix the idle cpu check in nohz_idle_balance
    sched: Use jump_labels for sched_feat
    sched/accounting: Fix parameter passing in task_group_account_field
    sched/accounting: Fix user/system tick double accounting
    sched/accounting: Re-use scheduler statistics for the root cgroup
    ...

    Fix up conflicts in
    - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h
    usecs_to_cputime64() vs the sparse cleanups
    - kernel/sched/fair.c, kernel/time/tick-sched.c
    scheduler changes in multiple branches

    Linus Torvalds
     

12 Dec, 2011

4 commits

  • These two APIs were provided to combine the calls to
    tick_nohz_idle_enter() and rcu_idle_enter() into a single
    irq-disabled section. This way no interrupt happening in between
    would needlessly process any RCU work.

    However, this is an optimization whose benefits have yet to be
    measured. Let's start simple and completely decouple the idle RCU
    and dyntick-idle logics to simplify things.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Reviewed-by: Josh Triplett
    Signed-off-by: Paul E. McKenney

    Frederic Weisbecker
     
  • It is assumed that RCU won't be used once we switch to tickless
    mode and until we restart the tick. However, this is not always
    true, as on x86-64, where we dereference the idle notifiers after
    the tick is stopped.

    To prepare for fixing this, add two new APIs:
    tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

    If no use of RCU is made in the idle loop between the
    tick_nohz_idle_enter() and tick_nohz_idle_exit() calls, the arch
    must instead call the new *_norcu() versions, so that it doesn't
    need to call rcu_idle_enter() and rcu_idle_exit() itself.

    Otherwise the arch must call tick_nohz_idle_enter() and
    tick_nohz_idle_exit() and also explicitly call:

    - rcu_idle_enter() after its last use of RCU before the CPU is put
    to sleep.
    - rcu_idle_exit() before the first use of RCU after the CPU is woken
    up.

    Signed-off-by: Frederic Weisbecker
    Cc: Mike Frysinger
    Cc: Guan Xuetao
    Cc: David Miller
    Cc: Chris Metcalf
    Cc: Hans-Christian Egtvedt
    Cc: Ralf Baechle
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Russell King
    Cc: Paul Mackerras
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: Paul E. McKenney

    Frederic Weisbecker
     
  • The tick_nohz_stop_sched_tick() function, which tries to delay
    the next timer tick as long as possible, can be called from two
    places:

    - From the idle loop, to start the dyntick idle mode
    - From interrupt exit, if we have interrupted the dyntick
    idle mode, so that we reprogram the next tick event in
    case the irq changed some internal state that requires this
    action.

    There are only a few minor differences between the two cases,
    handled by that function and driven by the per-cpu ts->inidle
    variable and the inidle parameter. Together these guarantee
    that we only update the dyntick mode on irq exit if we actually
    interrupted the dyntick idle mode, and that we enter the RCU extended
    quiescent state from the idle loop entry only.

    Split this function into:

    - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
    dynticks idle mode unconditionally if it can, and enters into RCU
    extended quiescent state.

    - tick_nohz_irq_exit() which only updates the dynticks idle mode
    when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).

    To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
    to tick_nohz_idle_exit().

    This simplifies the code and micro-optimizes the irq exit path (no need
    for local_irq_save() there). It also prepares for the split between the
    dynticks and RCU extended quiescent state logics, which we'll need in order
    to further fix illegal uses of RCU in extended quiescent states in the idle
    loop.

    Signed-off-by: Frederic Weisbecker
    Cc: Mike Frysinger
    Cc: Guan Xuetao
    Cc: David Miller
    Cc: Chris Metcalf
    Cc: Hans-Christian Egtvedt
    Cc: Ralf Baechle
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Russell King
    Cc: Paul Mackerras
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Frederic Weisbecker
     
  • Earlier versions of RCU used the scheduling-clock tick to detect idleness
    by checking for the idle task, but handled idleness differently for
    CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side
    critical sections in the idle task, for example, for tracing. A more
    fine-grained detection of idleness is therefore required.

    This commit presses the old dyntick-idle code into full-time service,
    so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
    always invoked at the beginning of an idle loop iteration. Similarly,
    rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
    at the end of an idle-loop iteration. This allows the idle task to
    use RCU everywhere except between consecutive rcu_idle_enter() and
    rcu_idle_exit() calls, in turn allowing architecture maintainers to
    specify exactly where in the idle loop that RCU may be used.

    Because some of the userspace upcall uses can result in what looks
    to RCU like half of an interrupt, it is not possible to expect that
    the irq_enter() and irq_exit() hooks will give exact counts. This
    patch therefore expands the ->dynticks_nesting counter to 64 bits
    and uses two separate bitfields to count process/idle transitions
    and interrupt entry/exit transitions. It is presumed that userspace
    upcalls do not happen in the idle loop or from usermode execution
    (though usermode might do a system call that results in an upcall).
    The counter is hard-reset on each process/idle transition, which
    prevents the interrupt entry/exit error from accumulating. Overflow
    is avoided by the 64-bitness of the ->dynticks_nesting counter.

    This commit also adds warnings if a non-idle task asks RCU to enter
    idle state (and these checks will need some adjustment before applying
    Frederic's OS-jitter patches, http://lkml.org/lkml/2011/10/7/246).
    In addition, validation of ->dynticks and ->dynticks_nesting is added.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

06 Dec, 2011

1 commit

  • Introduce nr_busy_cpus in the struct sched_group_power [not in sched_group,
    because sched groups are duplicated for the SD_OVERLAP scheduler domain];
    for each cpu that enters or exits idle, this field is updated in every
    scheduler group of the scheduler domains that the cpu belongs to.

    To avoid frequent updates of this state as the cpu enters
    and exits idle, the update during idle exit is
    delayed to the first timer tick that happens after the cpu becomes busy.
    This is done using the NOHZ_IDLE flag in the struct rq's nohz_flags.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20111202010832.555984323@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

26 Oct, 2011

1 commit

  • * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    time, s390: Get rid of compile warning
    dw_apb_timer: constify clocksource name
    time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
    time: Change jiffies_to_clock_t() argument type to unsigned long
    alarmtimers: Fix error handling
    clocksource: Make watchdog reset lockless
    posix-cpu-timers: Cure SMP accounting oddities
    s390: Use direct ktime path for s390 clockevent device
    clockevents: Add direct ktime programming function
    clockevents: Make minimum delay adjustments configurable
    nohz: Remove "Switched to NOHz mode" debugging messages
    proc: Consider NO_HZ when printing idle and iowait times
    nohz: Make idle/iowait counter update conditional
    nohz: Fix update_ts_time_stat idle accounting
    cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
    alarmtimers: Rework RTC device selection using class interface
    alarmtimers: Add try_to_cancel functionality
    alarmtimers: Add more refined alarm state tracking
    alarmtimers: Remove period from alarm structure
    alarmtimers: Remove interval cap limit hack
    ...

    Linus Torvalds
     

29 Sep, 2011

1 commit

  • RCU no longer uses this global variable, nor does anyone else. This
    commit therefore removes this variable. This reduces memory footprint
    and also removes some atomic instructions and memory barriers from
    the dyntick-idle path.

    Signed-off-by: Alex Shi
    Signed-off-by: Paul E. McKenney

    Shi, Alex
     

08 Sep, 2011

3 commits

  • When performing cpu hotplug tests, the kernel printk log buffer gets flooded
    with pointless "Switched to NOHz mode..." messages, which may push more
    interesting messages out of the buffer. This is especially annoying when
    analyzing a dump afterwards.
    Assuming that switching to NOHz mode simply works, just remove the printk.

    Signed-off-by: Heiko Carstens
    Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com
    Signed-off-by: Thomas Gleixner

    Heiko Carstens
     
  • get_cpu_{idle,iowait}_time_us update the idle/iowait counters
    unconditionally if the given CPU is in the idle loop.

    This doesn't work well outside of CPU governors, which are singletons,
    so nobody (except for IRQs) can race with them.

    We will need to use both functions from the /proc/stat handler to properly
    handle nohz idle/iowait times.

    Make the update depend on a non-NULL last_update_time argument.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
     
  • update_ts_time_stats currently updates idle time even if we are in
    the iowait loop at the moment. The only real users of the idle counter
    (via get_cpu_idle_time_us) are CPU governors, and they expect to get
    cumulative time for both idle and iowait.
    The value (idle_sleeptime) is also printed to userspace by print_cpu,
    but it prints both idle and iowait times, so the idle part is misleading.

    Let's clean this up: fix update_ts_time_stats to account for both counters
    properly, and update the consumers of idle time to consider iowait time as
    well. If we do this, we can use get_cpu_{idle,iowait}_time_us from other
    contexts as well and get the expected values.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
     

01 Feb, 2011

1 commit

  • All callers of do_timer() are converted to xtime_update(). The only
    users of xtime_lock are in kernel/time/. Make both local to
    kernel/time/ and remove them from the global header files.

    [ tglx: Reuse tick-internal.h instead of creating another local header
    file. Massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     

20 Jan, 2011

1 commit

  • When NOHZ=y and high res timers are disabled (via cmdline or
    Kconfig) tick_nohz_switch_to_nohz() will notify the user about
    switching into NOHZ mode. Nothing is printed for the case where
    HIGH_RES_TIMERS=y. Fix this for the HIGH_RES_TIMERS=y case by
    duplicating the printk from the low res NOHZ path in the high
    res NOHZ path.

    This confused me since I was thinking 'dmesg | grep -i NOHZ' would
    tell me if NOHZ was enabled, but if I have hrtimers there is
    nothing.

    Signed-off-by: Stephen Boyd
    Acked-by: Thomas Gleixner
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephen Boyd
     

07 Aug, 2010

1 commit


05 Aug, 2010

1 commit


03 Aug, 2010

1 commit

  • Historically, Linux has tried to make the regular timer tick on the
    various CPUs not happen at the same time, to avoid contention on
    xtime_lock.

    Nowadays, with the tickless kernel, this contention no longer happens
    since time keeping and updating are done differently. In addition,
    this skew is actually hurting power consumption in a measurable way on
    many-core systems.

    Signed-off-by: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Arjan van de Ven
     

17 Jul, 2010

1 commit

  • Norbert reported that nohz_ratelimit() causes his laptop to burn about
    4W (40%) extra. For now back out the change and see if we can adjust
    the power management code to make better decisions.

    Reported-by: Norbert Preining
    Signed-off-by: Peter Zijlstra
    Acked-by: Mike Galbraith
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

01 Jul, 2010

1 commit

  • Commit 0224cf4c5e (sched: Intoduce get_cpu_iowait_time_us())
    broke things by not making sure preemption was indeed disabled
    in the callers of nr_iowait_cpu(), which reads the iowait value of
    the current cpu.

    This resulted in a heap of preempt warnings. Cure this by making
    nr_iowait_cpu() take a cpu number and fixing up the callers to pass
    in the right number.

    Signed-off-by: Peter Zijlstra
    Cc: Arjan van de Ven
    Cc: Sergey Senozhatsky
    Cc: Rafael J. Wysocki
    Cc: Maxim Levitsky
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: Jiri Slaby
    Cc: linux-pm@lists.linux-foundation.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

18 Jun, 2010

1 commit

  • Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
    serial console regression (unresponsiveness), and indeed it does. The
    reason is that the nohz code is skipped even when the tick was already
    stopped before the nohz_ratelimit(cpu) condition changed.

    Move the nohz_ratelimit() check to the other conditions which prevent
    long idle sleeps.

    Reported-by: Chris Wedgwood
    Tested-by: Brian Bloniarz
    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Cc: Jiri Kosina
    Cc: Linus Torvalds
    Cc: Greg KH
    Cc: Alan Cox
    Cc: OGAWA Hirofumi
    Cc: Jef Driesen
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

09 Jun, 2010

1 commit

  • In the new push model, all idle CPUs indeed go into nohz mode. There is
    still the concept of an idle load balancer (performing the load balancing
    on behalf of all the idle cpus in the system). A busy CPU kicks the nohz
    balancer when any of the nohz CPUs needs idle load balancing.
    The kicked CPU does the idle load balancing on behalf of all idle CPUs
    instead of the normal idle balance.

    This addresses the following two problems with the current nohz ilb logic:
    * The idle load balancer continued to have periodic ticks during idle and
    woke up frequently, even though it did not have any rebalancing to do on
    behalf of any of the idle CPUs.
    * On x86 and other CPUs whose APIC timer stops on idle, this
    periodic wakeup can result in an additional periodic interrupt on the CPU
    doing the timer broadcast.

    Also, currently we migrate unpinned timers from an idle cpu to the cpu
    doing idle load balancing (when all the cpus in the system are idle,
    there is no idle load balancing cpu and timers get added to the same idle
    cpu where the request was made, so the existing optimization works only on
    a semi-idle system).

    In a semi-idle system, we no longer have periodic ticks on the idle load
    balancer CPU. Using that cpu will add more delay to the timers than intended
    (as that cpu's timer base may not be up to date with respect to jiffies
    etc.). This was causing mysterious slowdowns during boot etc.

    For now, in the semi-idle case, use the nearest busy cpu for migrating
    timers from an idle cpu. This is good for power savings anyway.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Venkatesh Pallipadi
     

10 May, 2010

6 commits

  • For the ondemand cpufreq governor, it is desirable that the iowait
    time be micro-accounted in a similar way to idle time.

    This patch introduces the infrastructure to account and expose
    this information via the get_cpu_iowait_time_us() function.

    [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Now that the only user of ts->idle_lastupdate is
    update_ts_time_stats(), the entire field can be eliminated.

    In update_ts_time_stats(), idle_lastupdate is first set to
    "now", and a few lines later, the only user is an if() statement
    that assigns a variable either to "now" or to
    ts->idle_lastupdate, which has the value of "now" at that point.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • This patch folds the updating of the last_update_time into the
    update_ts_time_stats() function, and updates the callers.

    This allows for further cleanups that are done in the next
    patch.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Right now, get_cpu_idle_time_us() only reports the idle
    statistics up to the point the CPU last entered idle, not what is
    valid right now.

    This patch adds an update of the idle statistics to
    get_cpu_idle_time_us(), so that calling this function always
    returns statistics that are accurate at the point of the call.

    This includes resetting the start of the idle time for
    accounting purposes to avoid double accounting.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Currently, two places update the idle statistics (and more to
    come later in this series).

    This patch creates a helper function for updating these
    statistics.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • The exported function get_cpu_idle_time_us() has no comment
    describing it; add a kerneldoc comment.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     

12 Mar, 2010

1 commit

  • Entering nohz code on every micro-idle is costing ~10% throughput for
    netperf TCP_RR when scheduling cross-cpu. Rate-limiting entry fixes this,
    but raises ticks a bit. On my Q6600, an idle box goes from ~85
    interrupts/sec to 128.

    The higher the context switch rate, the more nohz entry costs. With this
    patch and some cycle recovery patches in my tree, the max cross-cpu context
    switch rate is improved by ~16%, a large portion of which is this rate
    limiting.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

09 Dec, 2009

1 commit

  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timers, init: Limit the number of per cpu calibration bootup messages
    posix-cpu-timers: optimize and document timer_create callback
    clockevents: Add missing include to pacify sparse
    x86: vmiclock: Fix printk format
    x86: Fix printk format due to variable type change
    sparc: fix printk for change of variable type
    clocksource/events: Fix fallout of generic code changes
    nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
    nohz: Track last do_timer() cpu
    nohz: Prevent clocksource wrapping during idle
    nohz: Type cast printk argument
    mips: Use generic mult/shift factor calculation for clocks
    clocksource: Provide a generic mult/shift factor calculation
    clockevents: Use u32 for mult and shift factors
    nohz: Introduce arch_needs_cpu
    nohz: Reuse ktime in sub-functions of tick_check_idle.
    time: Remove xtime_cache
    time: Implement logarithmic time accumulation

    Linus Torvalds
     

14 Nov, 2009

3 commits

  • The previous patch, which limits the sleep time to the maximum
    deferment time of the timekeeping clocksource, has a limitation on
    SMP machines: if all CPUs are idle, then the maximum sleep
    time is limited for all of them.

    Solve this by keeping track of which cpu had the do_timer() duty
    assigned last and limiting the sleep time only for that cpu.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: Jon Hunter
    Cc: John Stultz

    Thomas Gleixner
     
  • The dynamic tick allows the kernel to sleep for periods longer than a
    single tick, but it currently does not limit the sleep time. In the
    worst case the kernel could sleep longer than the wrap-around time of
    the timekeeping clocksource, which would result in losing track of
    time.

    Prevent this by limiting the sleep to the safe maximum sleep time of the
    current timekeeping clocksource. The value is calculated when the
    clocksource is registered.

    [ tglx: simplified the code a bit and massaged the commit msg ]

    Signed-off-by: Jon Hunter
    Cc: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Jon Hunter
     
  • On some archs local_softirq_pending() has a data type of unsigned long,
    on others it is unsigned int. Cast it to (unsigned int) in the
    printk to avoid the compiler warning.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:

    Thomas Gleixner
     

05 Nov, 2009

2 commits

  • Allow the architecture to request a normal jiffy tick when the system
    goes idle and tick_nohz_stop_sched_tick() is called. On s390 the hook is
    used to prevent the system from going fully idle if there has been an
    interrupt other than a clock comparator interrupt since the last wakeup.

    On s390 the HiperSockets response time for 1 connection ping-pong goes
    down from 42 to 34 microseconds. The CPU cost decreases by 27%.

    Signed-off-by: Martin Schwidefsky
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     
  • On a system with NOHZ=y, tick_check_idle() calls tick_nohz_stop_idle() and
    tick_nohz_update_jiffies(). Given the right conditions (ts->idle_active
    and/or ts->tick_stopped), both functions get a time stamp with ktime_get().
    The same time stamp can be reused if both functions require one.

    On s390 this change has the additional benefit that gcc inlines the
    tick_nohz_stop_idle function into tick_check_idle. The number of
    instructions to execute tick_check_idle drops from 225 to 144
    (without the ktime_get optimization it is 367 vs 215 instructions).

    before:

    0) | tick_check_idle() {
    0) | tick_nohz_stop_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.601 us | }
    0) 1.765 us | }
    0) 3.047 us | }
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.570 us | }
    0) 1.727 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.609 us | }
    0) 8.055 us | }

    after:

    0) | tick_check_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.617 us | }
    0) 1.773 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.593 us | }
    0) 4.477 us | }

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     

07 Oct, 2009

1 commit

  • Commit f2e21c9610991e95621a81407cdbab881226419b had unfortunate side
    effects with cpufreq governors on some systems.

    If the system did not switch into NOHZ mode, ts->inidle is not set when
    tick_nohz_stop_sched_tick() is called from the idle routine. Therefore
    all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick()
    fail to call tick_nohz_start_idle(). This results in bogus idle
    accounting information being passed to cpufreq governors.

    Set the inidle flag unconditionally, regardless of the NOHZ active state,
    to keep the idle time accounting correct in any case.

    [ tglx: Added comment and tweaked the changelog ]

    Reported-by: Steven Noonan
    Signed-off-by: Eero Nurkkala
    Cc: Rik van Riel
    Cc: Venkatesh Pallipadi
    Cc: Greg KH
    Cc: Steven Noonan
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Eero Nurkkala
     

27 May, 2009

1 commit

  • A call from irq_exit() may occasionally pause the timing
    info for the cpufreq ondemand governor. This causes the
    cpufreq ondemand governor to fail to calculate the
    system load properly. Thus, relocate the checks for this
    particular case to keep the governor always functional.

    Signed-off-by: Eero Nurkkala
    Reported-by: Tero Kristo
    Acked-by: Rik van Riel
    Acked-by: Venkatesh Pallipadi
    Signed-off-by: Thomas Gleixner

    Eero Nurkkala
     

15 Jan, 2009

1 commit