15 Dec, 2009
1 commit
-
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra
Acked-by: Ingo Molnar
11 Dec, 2009
1 commit
-
This build warning:
kernel/sched.c: In function 'set_task_cpu':
kernel/sched.c:2070: warning: unused variable 'old_rq'

Made me realize that the forced2_migrations stat looks pretty
pointless (and a misnomer) - remove it.
Cc: Peter Zijlstra
Cc: Mike Galbraith
LKML-Reference:
Signed-off-by: Ingo Molnar
09 Dec, 2009
2 commits
-
As scaling now takes place on all kinds of CPU add/remove events, a user
who configures values via proc should be able to control whether his set
values are still rescaled or kept whatever happens.

As the comments state that log2 was just a second guess that worked, the
interface is not just designed for on/off, but to choose a scaling type.
Currently this allows none, log and linear, but more importantly it allows
us to keep the interface even if someone has an even better idea how to
scale the values.

Signed-off-by: Christian Ehrhardt
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
-
WAKEUP_RUNNING was an experiment, not sure why that ever ended up being
merged...
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
05 Nov, 2009
1 commit
-
Rate limit newidle to migration_cost. It's a win for all
stages of sysbench oltp tests.
Signed-off-by: Mike Galbraith
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
17 Sep, 2009
1 commit
-
Create a new wakeup preemption mode, preempt towards tasks that run
shorter on avg. It sets next buddy to be sure we actually run the task
we preempted for.

Test results:

root@twins:~# while :; do :; done &
[1] 6537
root@twins:~# while :; do :; done &
[2] 6538
root@twins:~# while :; do :; done &
[3] 6539
root@twins:~# while :; do :; done &
[4] 6540

root@twins:/home/peter# ./latt -c4 sleep 4
Entries: 48 (clients=4)

Averages:
------------------------------
Max 4750 usec
Avg 497 usec
Stdev 737 usec

root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features
root@twins:/home/peter# ./latt -c4 sleep 4
Entries: 48 (clients=4)

Averages:
------------------------------
Max 14 usec
Avg 5 usec
Stdev 3 usec

Disabled by default - needs more testing.
Signed-off-by: Peter Zijlstra
Acked-by: Mike Galbraith
Signed-off-by: Ingo Molnar
LKML-Reference:
02 Sep, 2009
1 commit
-
For counting how long an application has been waiting for
(disk) IO, there currently is only the HZ sample driven
information available, while for all other counters in this
class, a high resolution version is available via
CONFIG_SCHEDSTATS.

In order to make an improved bootchart tool possible, we also
need a higher resolution version of the iowait time.

The patch below adds this scheduler statistic to the kernel.
Signed-off-by: Arjan van de Ven
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
18 Jun, 2009
1 commit
-
There are some places which refer to the per-cpu variable "runqueues" directly.
sched.c provides nice abstractions, such as cpu_rq() and this_rq(),
so we should use these macros when accessing runqueues.
Signed-off-by: Hitoshi Mitake
LKML-Reference:
Signed-off-by: Ingo Molnar
25 Mar, 2009
1 commit
-
Impact: cleanup, new schedstat ABI
Since they are only used in statistics and are always set to zero, the
following fields from struct rq have been removed: yld_exp_empty,
yld_act_empty and yld_both_empty.

Both the Sched Debug and SCHEDSTAT_VERSION versions have also been
incremented, since the ABIs have changed.

The schedtop tool has been updated to properly handle the new version of
schedstat:

http://rt.wiki.kernel.org/index.php/Schedtop_utility
Signed-off-by: Luis Henriques
Acked-by: Gregory Haskins
Acked-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
18 Mar, 2009
1 commit
-
The jiffies value was being printed for each CPU, which does not make
sense. Moved jiffies to the system section.
Signed-off-by: Luis Henriques
Acked-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar
15 Jan, 2009
1 commit
-
Introduce a new avg_wakeup statistic.
avg_wakeup is a measure of how frequently a task wakes up other tasks; it
represents the average time between wakeups, with a limit of avg_runtime
for when it doesn't wake up anybody.
Signed-off-by: Peter Zijlstra
Signed-off-by: Mike Galbraith
Signed-off-by: Ingo Molnar
11 Jan, 2009
1 commit
-
Impact: avoid accessing NULL tg.css->cgroup
In commit 0a0db8f5c9d4bbb9bbfcc2b6cb6bce2d0ef4d73d, I removed the check for
NULL tg.css->cgroup, but I realized I was wrong when I found that reading
/proc/sched_debug can race with cgroup_create().
Signed-off-by: Li Zefan
Signed-off-by: Ingo Molnar
02 Dec, 2008
1 commit
-
Impact: extend information in /proc/sched_debug
This patch adds uid information in sched_debug for CONFIG_USER_SCHED
Signed-off-by: Arun R Bharadwaj
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
19 Nov, 2008
1 commit
-
Conflicts:
kernel/Makefile
16 Nov, 2008
1 commit
-
Luis Henriques reported that with CONFIG_PREEMPT=y + CONFIG_PREEMPT_DEBUG=y +
CONFIG_SCHED_DEBUG=y + CONFIG_LATENCYTOP=y enabled, the following warning
triggers when using latencytop:

> [ 775.663239] BUG: using smp_processor_id() in preemptible [00000000] code: latencytop/6585
> [ 775.663303] caller is native_sched_clock+0x3a/0x80
> [ 775.663314] Pid: 6585, comm: latencytop Tainted: G W 2.6.28-rc4-00355-g9c7c354 #1
> [ 775.663322] Call Trace:
> [ 775.663343] [] debug_smp_processor_id+0xe4/0xf0
> [ 775.663356] [] native_sched_clock+0x3a/0x80
> [ 775.663368] [] sched_clock+0x9/0x10
> [ 775.663381] [] proc_sched_show_task+0x8bd/0x10e0
> [ 775.663395] [] sched_show+0x3e/0x80
> [ 775.663408] [] seq_read+0xdb/0x350
> [ 775.663421] [] ? security_file_permission+0x16/0x20
> [ 775.663435] [] vfs_read+0xc8/0x170
> [ 775.663447] [] sys_read+0x55/0x90
> [ 775.663460] [] system_call_fastpath+0x16/0x1b
> ...

This breakage was caused by me via:
7cbaef9: sched: optimize sched_clock() a bit
Change the calls to cpu_clock().
Reported-by: Luis Henriques
11 Nov, 2008
1 commit
-
Impact: extend /proc/sched_debug info
Since the statistics of a group entity isn't exported directly from the
kernel, it becomes difficult to obtain some of the group statistics.
For example, the current method to obtain the exec time of a group entity
is not always accurate. One has to read the exec times of all
the tasks (/proc/<pid>/sched) in the group and add them. This method
fails (or becomes difficult) if we want to collect stats of a group over
a duration where tasks get created and terminated.

This patch makes it easier to obtain group stats by directly including
them in /proc/sched_debug. Stats like group exec time would help user
programs (like LTP) to accurately measure group fairness.

An example output of group stats from /proc/sched_debug:

cfs_rq[3]:/3/a/1
.exec_clock : 89.598007
.MIN_vruntime : 0.000001
.min_vruntime : 256300.970506
.max_vruntime : 0.000001
.spread : 0.000000
.spread0 : -25373.372248
.nr_running : 0
.load : 0
.yld_exp_empty : 0
.yld_act_empty : 0
.yld_both_empty : 0
.yld_count : 4474
.sched_switch : 0
.sched_count : 40507
.sched_goidle : 12686
.ttwu_count : 15114
.ttwu_local : 11950
.bkl_count : 67
.nr_spread_over : 0
.shares : 0
.se->exec_start : 113676.727170
.se->vruntime : 1592.612714
.se->sum_exec_runtime : 89.598007
.se->wait_start : 0.000000
.se->sleep_start : 0.000000
.se->block_start : 0.000000
.se->sleep_max : 0.000000
.se->block_max : 0.000000
.se->exec_max : 1.000282
.se->slice_max : 1.999750
.se->wait_max : 54.981093
.se->wait_sum : 217.610521
.se->wait_count : 50
.se->load.weight : 2
Signed-off-by: Bharata B Rao
Acked-by: Srivatsa Vaddagiri
Acked-by: Dhaval Giani
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
10 Nov, 2008
1 commit
-
Impact: clean up and fix debug info printout
While looking over the sched_debug code I noticed that we printed the rq
schedstats for every cfs_rq; amend this.

Also change nr_spread_over into an int, and fix a little buglet in
min_vruntime printing.
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
04 Nov, 2008
1 commit
-
Impact: cleanup
cfs->tg is initialized in init_tg_cfs_entry() with tg != NULL, and
will never be invalidated to NULL. And the underlying cgroup of a
valid task_group is always valid.

Same for rt->tg.
Signed-off-by: Li Zefan
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
30 Oct, 2008
1 commit
-
Impact: change /proc/sched/debug from rw-r--r-- to r--r--r--
/proc/sched_debug is read-only.
Signed-off-by: Li Zefan
Acked-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
10 Oct, 2008
1 commit
-
lock_task_sighand() makes sure task->sighand is protected,
so we do not need rcu_read_lock().
[ exec() will get task->sighand->siglock before changing task->sighand! ]

But code using rcu_read_lock() _just_ to protect lock_task_sighand()
only appears in procfs. (And some code in procfs uses lock_task_sighand()
without such redundant protection.)

Other subsystems may put lock_task_sighand() into an rcu_read_lock()
critical region, but those rcu_read_lock() calls are used for protecting
"for_each_process()", "find_task_by_vpid()" etc., not for protecting
lock_task_sighand().
Signed-off-by: Lai Jiangshan
[ok from Oleg]
Signed-off-by: Alexey Dobriyan
27 Jun, 2008
2 commits
-
show all the schedstats in /debug/sched_debug as well.
Signed-off-by: Peter Zijlstra
Cc: Srivatsa Vaddagiri
Cc: Mike Galbraith
Signed-off-by: Ingo Molnar
-
Try again..
Initial commit: 18d95a2832c1392a2d63227a7a6d433cb9f2037e
Revert: 6363ca57c76b7b83639ca8c83fc285fa26a7880e

Signed-off-by: Peter Zijlstra
Cc: Srivatsa Vaddagiri
Cc: Mike Galbraith
Signed-off-by: Ingo Molnar
20 Jun, 2008
1 commit
-
Signed-off-by: Peter Zijlstra
Cc: "Daniel K."
Signed-off-by: Ingo Molnar
29 May, 2008
1 commit
-
Yanmin Zhang reported:
Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1.
It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito.

With bisect, I located the following patch:
| 18d95a2832c1392a2d63227a7a6d433cb9f2037e is first bad commit
| commit 18d95a2832c1392a2d63227a7a6d433cb9f2037e
| Author: Peter Zijlstra
| Date: Sat Apr 19 19:45:00 2008 +0200
|
| sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior.
Bisected-by: Yanmin Zhang
Signed-off-by: Ingo Molnar
06 May, 2008
1 commit
-
This replaces the rq->clock stuff (and possibly cpu_clock()).

- architectures that have an 'imperfect' hardware clock can set
  CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

- the 'jiffie' window might be superfluous when we update tick_gtod
  before the __update_sched_clock() call in sched_clock_tick()

- cpu_clock() might be implemented as:
  sched_clock_cpu(smp_processor_id())
  if the accuracy proves good enough - how far can TSC drift in a
  single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
01 May, 2008
1 commit
-
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the types of the divide. Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.
Signed-off-by: Roman Zippel
Cc: Avi Kivity
Cc: Russell King
Cc: Geert Uytterhoeven
Cc: Ralf Baechle
Cc: David Howells
Cc: Jeff Dike
Cc: Ingo Molnar
Cc: "David S. Miller"
Cc: Patrick McHardy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Apr, 2008
1 commit
-
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
are set up before gluing the PDE to the main tree.
Signed-off-by: Denis V. Lunev
Cc: Alexey Dobriyan
Cc: "Eric W. Biederman"
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Apr, 2008
3 commits
-
Signed-off-by: Ingo Molnar
-
Add some extra debug output so we can get a better overview of the
full hierarchy.

We print the cgroup path after each cfs_rq, so we can see what group
we're looking at.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
it's unused.
Signed-off-by: Ingo Molnar
19 Mar, 2008
1 commit
-
improve affine wakeups. Maintain the 'overlap' metric based on CFS's
sum_exec_runtime - which means the amount of time a task executes
after it wakes up some other task.

Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
it means there's strong workload coupling between this task and the
woken up task. If the 'overlap' is large then the workload is decoupled
and the scheduler will move them to separate CPUs more easily.

( Also slightly move the preempt_check within try_to_wake_up() - this has
no effect on functionality but allows 'early wakeups' (for still-on-rq
tasks) to be correctly accounted as well. )

Signed-off-by: Ingo Molnar
26 Jan, 2008
2 commits
-
Right now, the linux kernel (with scheduler statistics enabled) keeps track
of the maximum time a process is waiting to be scheduled. While the maximum
is a very useful metric, tracking average and total is equally useful
(at least for latencytop) to figure out the accumulated effect of scheduler
delays. The accumulated effect is important to judge the performance impact
of scheduler tuning/behavior.
Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
-
We monitor clock overflows, let's also monitor clock underflows.
Signed-off-by: Guillaume Chazarain
Signed-off-by: Ingo Molnar
31 Dec, 2007
1 commit
-
Meelis Roos reported these warnings on sparc64:
CC kernel/sched.o
In file included from kernel/sched.c:879:
kernel/sched_debug.c: In function 'nsec_high':
kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

The debug check in do_div() is over-eager here, because the long long
is always positive in these places. Mark this by casting them to
unsigned long long.

No change in code output:

   text    data     bss     dec     hex filename
  51471    6582     376   58429    e43d sched.o.before
  51471    6582     376   58429    e43d sched.o.after

md5:
  7f7729c111f185bf3ccea4d542abc049 sched.o.before.asm
  7f7729c111f185bf3ccea4d542abc049 sched.o.after.asm

Signed-off-by: Ingo Molnar
28 Nov, 2007
1 commit
-
clean up overlong line in kernel/sched_debug.c.
Signed-off-by: Ingo Molnar
27 Nov, 2007
1 commit
-
bump version of kernel/sched_debug.c and remove CFS version
information from it.
Signed-off-by: Ingo Molnar
10 Nov, 2007
1 commit
-
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.

No functionality changed.
[ mingo@elte.hu: some fixlets. ]
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
25 Oct, 2007
1 commit
-
Lockdep noticed that this lock can also be taken from hardirq context, and can
thus not unconditionally disable/enable irqs.

WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
[show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
[show_trace+18/32] show_trace+0x12/0x20
[dump_stack+22/32] dump_stack+0x16/0x20
[trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
[_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
[sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
[show_state_filter+326/368] show_state_filter+0x146/0x170
[sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
[__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
[handle_sysrq+40/64] handle_sysrq+0x28/0x40
[kbd_event+1045/1680] kbd_event+0x415/0x690
[input_pass_event+206/208] input_pass_event+0xce/0xd0
[input_handle_event+170/928] input_handle_event+0xaa/0x3a0
[input_event+95/112] input_event+0x5f/0x70
[atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
[serio_interrupt+59/128] serio_interrupt+0x3b/0x80
[i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
[handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
[handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
[do_IRQ+64/128] do_IRQ+0x40/0x80
[common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
19 Oct, 2007
1 commit
-
schedstat is useful in investigating CPU scheduler behavior. Ideally,
I think it is beneficial to have it on all the time. However, the
cost of turning it on in production system is quite high, largely due
to number of events it collects and also due to its large memory
footprint.

Most of the fields probably don't need to be a full 64 bits on 64-bit
arches. Rolling over 4 billion events will most likely take a long time
and user space tools can be made to accommodate that. I'm proposing the
kernel cut back most of the variable widths on 64-bit systems. (Note,
the following patch doesn't affect 32-bit systems.)
Signed-off-by: Ken Chen
Signed-off-by: Ingo Molnar
15 Oct, 2007
1 commit
-
In general, struct file_operations are const in the kernel, to not have
false cacheline sharing and to catch bugs at compiletime with accidental
writes to them. The new scheduler code introduces a new non-const one;
fix this up.
Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar