24 Oct, 2012

7 commits

  • Now that the machinery is in place to compute contributed load in a
    bottom-up fashion, replace the shares distribution code within
    update_shares() accordingly.

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141507.061208672@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Entities of equal weight should receive an equitable distribution of cpu time.
    This is challenging in the case of a task_group's shares as execution may be
    occurring on multiple cpus simultaneously.

    To handle this we divide up the shares into weights proportionate with the
    load on each cfs_rq. This does not, however, account for the fact that the
    sum of the parts may be less than one cpu, so we need to normalize:

    load(tg) = min(runnable_avg(tg), 1) * tg->shares

    where runnable_avg is the aggregate time in which the task_group had runnable
    children. (A small sketch of this normalization follows this entry.)

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.930124292@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
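
    A minimal userspace C sketch of the normalization above; the function and
    field names, the 1024 fixed-point scale, and the two-cpu example are
    assumptions for illustration, not the kernel's actual shares code:

        /*
         * Distribute a task_group's shares across its per-cpu cfs_rq's in
         * proportion to each cfs_rq's load, then scale the total by
         * min(runnable_avg(tg), 1) so a group that cannot fill one cpu does
         * not receive a full cpu's worth of weight.
         */
        #include <stdio.h>

        #define RUNNABLE_AVG_SCALE 1024   /* fixed point: 1024 == one full cpu */

        static unsigned long tg_effective_load(unsigned long tg_shares,
                                               unsigned long tg_runnable_avg)
        {
            /* load(tg) = min(runnable_avg(tg), 1) * tg->shares */
            if (tg_runnable_avg > RUNNABLE_AVG_SCALE)
                tg_runnable_avg = RUNNABLE_AVG_SCALE;
            return tg_shares * tg_runnable_avg / RUNNABLE_AVG_SCALE;
        }

        static unsigned long cfs_rq_share(unsigned long tg_shares,
                                          unsigned long tg_runnable_avg,
                                          unsigned long cfs_rq_load,
                                          unsigned long tg_total_load)
        {
            unsigned long load = tg_effective_load(tg_shares, tg_runnable_avg);

            if (!tg_total_load)
                return 0;
            /* weight this cfs_rq in proportion to the load it carries */
            return load * cfs_rq_load / tg_total_load;
        }

        int main(void)
        {
            /* a group with shares=1024 that is runnable only half the time */
            unsigned long shares = 1024, runnable = 512;
            unsigned long loads[2] = { 300, 100 };   /* per-cpu cfs_rq load */
            unsigned long total = loads[0] + loads[1];

            for (int cpu = 0; cpu < 2; cpu++)
                printf("cpu%d share = %lu\n", cpu,
                       cfs_rq_share(shares, runnable, loads[cpu], total));
            return 0;
        }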
     
  • Maintain a global running sum of the average load seen on each cfs_rq belonging
    to each task group, so that it may be used in calculating an appropriate
    shares:weight distribution. (A sketch of this bookkeeping follows this entry.)

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.792901086@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
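
    A hedged sketch of the bookkeeping described above: each cfs_rq folds the
    delta of its average load into a group-wide running sum that the shares
    code can read cheaply. The names are illustrative, not the kernel's:

        #include <stdatomic.h>
        #include <stdio.h>

        struct task_group_sk {
            atomic_long load_avg;        /* \Sum of all member cfs_rq load_avg */
        };

        struct cfs_rq_sk {
            long load_avg_contrib;       /* last value folded into the tg sum */
            struct task_group_sk *tg;
        };

        /* fold any change in this cfs_rq's average load into the group sum */
        static void update_tg_load_contrib(struct cfs_rq_sk *cfs_rq, long load_avg)
        {
            long delta = load_avg - cfs_rq->load_avg_contrib;

            if (delta) {
                atomic_fetch_add(&cfs_rq->tg->load_avg, delta);
                cfs_rq->load_avg_contrib = load_avg;
            }
        }

        int main(void)
        {
            struct task_group_sk tg = { 0 };
            struct cfs_rq_sk rq0 = { 0, &tg }, rq1 = { 0, &tg };

            update_tg_load_contrib(&rq0, 700);
            update_tg_load_contrib(&rq1, 300);
            update_tg_load_contrib(&rq0, 500);   /* cpu0's load dropped by 200 */

            printf("tg load_avg = %ld\n", atomic_load(&tg.load_avg));  /* 800 */
            return 0;
        }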
     
  • We are currently maintaining:

    runnable_load(cfs_rq) = \Sum task_load(t)

    for all running children t of cfs_rq. While this can be naturally updated for
    tasks in a runnable state (as they are scheduled), it does not account for
    the load contributed by blocked task entities.

    This can be solved by introducing a separate accounting for blocked load:

    blocked_load(cfs_rq) = \Sum runnable(b) * weight(b)

    Obviously we do not want to iterate over all blocked entities to account for
    their decay; instead we observe that:

    runnable_load(t) = \Sum p_i*y^i

    and that to account for an additional idle period we only need to compute:

    y*runnable_load(t).

    This means that we can compute all blocked entities at once by evaluating:

    blocked_load(cfs_rq)' = y * blocked_load(cfs_rq)

    Finally, we maintain a decay counter so that when a sleeping entity re-awakens
    we can determine how much of its load should be removed from the blocked sum.
    (A rough model of this decay scheme follows this entry.)

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.585389902@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
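
    A rough userspace model of the blocked-load scheme above; the names, the
    use of floating point, and the constant for y are assumptions for
    illustration only:

        #include <stdio.h>
        #include <math.h>

        #define DECAY_Y 0.97857206        /* ~= 0.5^(1/32), so y^32 = 1/2 */

        struct blocked_sk {
            double blocked_load;          /* \Sum decayed load of blocked entities */
            unsigned long decay_count;    /* periods elapsed, for waking entities */
        };

        /* 'periods' further 1024us windows passed with the entities still blocked;
         * decaying the single sum is equivalent to decaying each entity */
        static void decay_blocked(struct blocked_sk *b, unsigned int periods)
        {
            b->blocked_load *= pow(DECAY_Y, periods);
            b->decay_count += periods;
        }

        /* entity goes to sleep: its current load joins the blocked sum */
        static unsigned long entity_block(struct blocked_sk *b, double load)
        {
            b->blocked_load += load;
            return b->decay_count;        /* remember when it blocked */
        }

        /* entity wakes: remove what is left of its decayed contribution */
        static double entity_wake(struct blocked_sk *b, double load,
                                  unsigned long blocked_at)
        {
            double left = load * pow(DECAY_Y, (double)(b->decay_count - blocked_at));

            b->blocked_load -= left;
            return left;                  /* load the entity carries back with it */
        }

        int main(void)
        {
            struct blocked_sk b = { 0.0, 0 };
            unsigned long stamp = entity_block(&b, 1024.0);

            decay_blocked(&b, 32);        /* 32 periods: the blocked sum halves */
            printf("blocked_load = %.1f\n", b.blocked_load);           /* ~512 */
            printf("woken load   = %.1f\n", entity_wake(&b, 1024.0, stamp));
            return 0;
        }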
     
  • For a given task t, we can compute its contribution to load as:

    task_load(t) = runnable_avg(t) * weight(t)

    On a parenting cfs_rq we can then aggregate:

    runnable_load(cfs_rq) = \Sum task_load(t), for all runnable children t

    Maintain this bottom-up, with task entities adding their contributed load to
    the parenting cfs_rq sum. When a task entity's load changes, we add the same
    delta to the maintained sum. (A minimal sketch of this delta propagation
    follows this entry.)

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.514678907@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
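
    A minimal sketch of the delta propagation described above; the structures
    and names are illustrative, not the kernel's:

        #include <stdio.h>

        struct cfs_rq_sk {
            long runnable_load;        /* \Sum task_load(t) over runnable t */
        };

        struct task_sk {
            long weight;
            long runnable_avg;         /* fixed point: 1024 == always runnable */
            long load_contrib;         /* last value added to the parent sum */
            struct cfs_rq_sk *parent;
        };

        static void update_task_load(struct task_sk *t)
        {
            long contrib = t->weight * t->runnable_avg / 1024;
            long delta = contrib - t->load_contrib;

            t->parent->runnable_load += delta;   /* propagate only the delta */
            t->load_contrib = contrib;
        }

        int main(void)
        {
            struct cfs_rq_sk rq = { 0 };
            struct task_sk a = { 1024, 1024, 0, &rq };   /* always runnable */
            struct task_sk b = { 1024,  256, 0, &rq };   /* runnable 25% of the time */

            update_task_load(&a);
            update_task_load(&b);
            printf("runnable_load = %ld\n", rq.runnable_load);   /* 1280 */

            b.runnable_avg = 512;                        /* b became busier */
            update_task_load(&b);
            printf("runnable_load = %ld\n", rq.runnable_load);   /* 1536 */
            return 0;
        }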
     
  • Since runqueues do not have a corresponding sched_entity, we instead embed a
    sched_avg structure directly. (An illustrative layout follows this entry.)

    Signed-off-by: Ben Segall
    Reviewed-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.442637130@google.com
    Signed-off-by: Ingo Molnar

    Ben Segall
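
    An illustrative layout only; the field names are assumptions rather than
    the kernel's definitions:

        struct sched_avg_sk {
            unsigned long runnable_avg_sum;
            unsigned long runnable_avg_period;
        };

        struct sched_entity_sk {
            unsigned long weight;
            struct sched_avg_sk avg;    /* tasks track their average here */
        };

        struct cfs_rq_sk {
            unsigned long runnable_load;
            struct sched_avg_sk avg;    /* no backing sched_entity, so embed one */
        };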
     
  • Instead of tracking and averaging the load parented by a cfs_rq, we can track
    entity load directly, with the load for a given cfs_rq then being the sum of
    its children.

    To do this we represent the historical contribution to runnable average
    within each trailing 1024us of execution as the coefficients of a
    geometric series.

    We can express this for a given task t as:

    runnable_sum(t) = \Sum u_i * y^i
    runnable_avg_period(t) = \Sum 1024 * y^i
    load(t) = weight_t * runnable_sum(t) / runnable_avg_period(t)

    where u_i is the usage in the i-th most recent 1024us period (approximately
    1ms) and y is chosen such that y^k = 1/2. We currently choose k to be 32,
    which roughly translates to about a sched period. (A small sketch of this
    accumulation follows this entry.)

    Signed-off-by: Paul Turner
    Reviewed-by: Ben Segall
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20120823141506.372695337@google.com
    Signed-off-by: Ingo Molnar

    Paul Turner
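
    A small userspace sketch of this accumulation; the names and the use of
    floating point are illustrative assumptions, not the kernel's fixed-point
    implementation:

        #include <stdio.h>

        #define DECAY_Y 0.97857206        /* ~= 0.5^(1/32), so y^32 = 1/2 */

        struct entity_avg_sk {
            double runnable_sum;          /* \Sum u_i * y^i  */
            double runnable_period;       /* \Sum 1024 * y^i */
        };

        /* account one 1024us period in which the entity ran for 'usage' us */
        static void accumulate_period(struct entity_avg_sk *a, double usage)
        {
            a->runnable_sum    = a->runnable_sum    * DECAY_Y + usage;
            a->runnable_period = a->runnable_period * DECAY_Y + 1024.0;
        }

        static double entity_load(const struct entity_avg_sk *a, double weight)
        {
            if (!a->runnable_period)
                return 0.0;
            return weight * a->runnable_sum / a->runnable_period;
        }

        int main(void)
        {
            struct entity_avg_sk a = { 0.0, 0.0 };

            /* run 512us of every 1024us period; the average converges to 50% */
            for (int i = 0; i < 200; i++)
                accumulate_period(&a, 512.0);

            printf("load = %.1f\n", entity_load(&a, 1024.0));    /* ~512 */
            return 0;
        }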
     

14 May, 2012

1 commit

  • Some numbers like nr_running and nr_uninterruptible are fundamentally
    unsigned since it's impossible to have a negative number of tasks, yet
    we still print them as signed so that an underflow condition is easy to
    recognise.

    rq->nr_uninterruptible has 'special' accounting and can in fact very
    easily become negative on a per-cpu basis.

    It was noted that since the P() macro assumes things are long long, and
    the promotion of unsigned 'int/long' to long long on 32bit doesn't
    sign-extend, we print silly large numbers instead of the easier-to-read
    signed numbers.

    Therefore extend the P() macro so that it does not require the sign
    extension. (A small demo of the promotion issue follows this entry.)

    Reported-by: Diwakar Tundlam
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-gk5tm8t2n4ix2vkpns42uqqp@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
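
    A standalone demo of the promotion issue (not the kernel macro itself);
    the variable name is illustrative:

        #include <stdio.h>

        int main(void)
        {
            unsigned int nr_uninterruptible = 0;

            nr_uninterruptible -= 2;    /* per-cpu underflow: wraps around */

            /* widening the unsigned value zero-extends: a silly large number */
            printf("unsigned promotion: %lld\n", (long long)nr_uninterruptible);

            /* going through the signed type of the same width sign-extends,
             * printing the easier-to-read -2 (on two's-complement machines) */
            printf("signed conversion:  %lld\n",
                   (long long)(int)nr_uninterruptible);
            return 0;
        }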
     

09 May, 2012

1 commit

  • Since there's a PID space limit of 30bits (see
    futex.h:FUTEX_TID_MASK) and allocating that many tasks (assuming a
    lower bound of 2 pages per task) would still take 8T of memory, it
    seems reasonable to say that unsigned int is sufficient for
    rq->nr_running. (A quick check of that arithmetic follows this entry.)

    When we do get anywhere near that number of tasks I suspect other
    things would go funny; load-balancer load computations would really
    need to be hoisted to 128bit, etc.

    So save a few bytes and convert rq->nr_running and friends to
    unsigned int.

    Suggested-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-y3tvyszjdmbibade5bw8zl81@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
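
    A quick check of that arithmetic, assuming 4KiB pages:

        #include <stdio.h>

        int main(void)
        {
            /* 2^30 tasks * 2 pages per task * 4096 bytes per page = 2^43 bytes */
            unsigned long long bytes = (1ULL << 30) * 2 * 4096;

            printf("%llu TiB\n", bytes >> 40);    /* prints 8 */
            return 0;
        }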
     

27 Jan, 2012

1 commit

  • We do not utilize the sched_switch field anymore.

    But simply removing the sched_switch field from the middle of the
    sched_stat output would break tools.

    So, to stay compatible, we hardcode it to zero in the output and remove
    the field from the scheduler data structures. (An illustration of this
    compatibility trick follows this entry.)

    Update the schedstat documentation accordingly.

    Signed-off-by: Rakib Mullick
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1327422836.27181.5.camel@localhost.localdomain
    Signed-off-by: Ingo Molnar

    Rakib Mullick
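
    An illustration of keeping a stable text format while retiring a field;
    this is a hedged sketch, not the kernel's actual schedstat code, and the
    field names are assumptions:

        #include <stdio.h>

        struct rq_stats_sk {            /* sched_switch intentionally absent */
            unsigned int yld_count;
            unsigned int sched_count;
            unsigned int sched_goidle;
        };

        static void print_cpu_stats(int cpu, const struct rq_stats_sk *st)
        {
            /* the literal 0 stands in where sched_switch used to be printed,
             * so the column positions that tools parse do not shift */
            printf("cpu%d %u 0 %u %u\n",
                   cpu, st->yld_count, st->sched_count, st->sched_goidle);
        }

        int main(void)
        {
            struct rq_stats_sk st = { 12, 345, 67 };

            print_cpu_stats(0, &st);
            return 0;
        }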
     

17 Nov, 2011

1 commit