31 Mar, 2009

1 commit

  • * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits)
    lockdep: fix deadlock in lockdep_trace_alloc
    lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
    lockdep: annotate reclaim context (__GFP_NOFS), fix
    lockdep: build fix for !PROVE_LOCKING
    lockstat: warn about disabled lock debugging
    lockdep: use stringify.h
    lockdep: simplify check_prev_add_irq()
    lockdep: get_user_chars() redo
    lockdep: simplify get_user_chars()
    lockdep: add comments to mark_lock_irq()
    lockdep: remove macro usage from mark_held_locks()
    lockdep: fully reduce mark_lock_irq()
    lockdep: merge the !_READ mark_lock_irq() helpers
    lockdep: merge the _READ mark_lock_irq() helpers
    lockdep: simplify mark_lock_irq() helpers #3
    lockdep: further simplify mark_lock_irq() helpers
    lockdep: simplify the mark_lock_irq() helpers
    lockdep: split up mark_lock_irq()
    lockdep: generate usage strings
    lockdep: generate the state bit definitions
    ...

    Linus Torvalds
     

15 Jan, 2009

2 commits

  • Prefer tasks that wake other tasks to preempt quickly. This improves
    performance because more work is available sooner.

    The workload that prompted this patch was a kernel build over NFS4 (for some
    curious and not yet understood reason we had to revert commit
    18de9735300756e3ca9c361ef58409d8561dfe0d to make any progress at all).

    Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
    3m30-ish, with this patch we're down to 2m50-ish.

    psql-sysbench/mysql-sysbench show a slight improvement in peak performance as
    well; tbench and vmark did not seem to care.

    It is possible to improve the build time further (to 2m20-ish), but that
    seriously hurts other benchmarks (which just shows there is more room for
    tinkering).

    Much thanks to Mike who put in a lot of effort to benchmark things and proved
    a worthy opponent with a competing patch.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Change mutex contention behaviour such that it will sometimes busy wait on
    acquisition - moving its behaviour closer to that of spinlocks.

    This concept got ported to mainline from the -rt tree, where it was originally
    implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.

    Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
    gave a 345% boost for VFS scalability on my testbox:

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 296604

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 85870

    The key criterion for the busy wait is that the lock owner has to be running
    on a (different) CPU. The idea is that as long as the owner is running, there
    is a fair chance it will release the lock soon, and thus we are better off
    spinning instead of blocking/scheduling (a rough sketch of this idea follows
    at the end of this entry).

    Since regular mutexes (as opposed to rtmutexes) do not atomically track the
    owner, we add the owner in a non-atomic fashion and deal with the races in
    the slowpath.

    Furthermore, to ease testing of the performance impact of this new code,
    there is a means to disable this behaviour at runtime (without having to
    reboot the system) when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
    by issuing the following command:

    # echo NO_OWNER_SPIN > /debug/sched_features

    The following command re-enables spinning (this is also the default):

    # echo OWNER_SPIN > /debug/sched_features

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
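    Below is a minimal user-space sketch of the owner-spin idea described in
    this entry. The sketch_mutex type, the owner_running() stub and block_on()
    are hypothetical stand-ins, not the mainline mutex slowpath; the real code
    inspects the owner's runqueue state and falls back to schedule().

        #include <sched.h>
        #include <stdatomic.h>
        #include <stdbool.h>

        struct sketch_mutex {
            atomic_int      locked;  /* 0 = free, 1 = held */
            _Atomic(void *) owner;   /* recorded loosely, may race with unlock */
        };

        /* Stub: in this sketch any recorded owner is treated as running;
         * the kernel would check the owner's runqueue state instead. */
        static bool owner_running(void *owner)
        {
            return owner != NULL;
        }

        /* Stub: stands in for enqueueing ourselves and calling schedule(). */
        static void block_on(struct sketch_mutex *m)
        {
            (void)m;
            sched_yield();
        }

        static void sketch_mutex_lock(struct sketch_mutex *m, void *self)
        {
            for (;;) {
                int expected = 0;

                if (atomic_compare_exchange_weak(&m->locked, &expected, 1)) {
                    atomic_store(&m->owner, self);
                    return;
                }

                /* Owner is running on another CPU: keep spinning, it will
                 * probably release the lock soon. */
                if (owner_running(atomic_load(&m->owner)))
                    continue;

                /* Owner appears to be asleep: blocking beats burning CPU. */
                block_on(m);
            }
        }

    Unlocking (not shown) would clear the owner before releasing the lock word,
    which is exactly where the races mentioned above have to be handled.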
     

05 Nov, 2008

1 commit

  • Impact: improve/change/fix wakeup-buddy scheduling

    Currently we only have a forward-looking buddy, that is, we prefer to
    schedule to the task we last woke up, under the presumption that it is
    going to consume the data we just produced, and therefore will have
    cache-hot benefits.

    This allows co-waking producer/consumer task pairs to run ahead of the
    pack for a little while, keeping their cache warm. Without this, we
    would interleave all pairs, utterly thrashing the cache.

    This patch introduces a backward-looking buddy: suppose that in the above
    scenario the consumer preempts the producer before the producer can go to
    sleep. We then miss the wakeup from consumer to producer (it is already
    running, after all), breaking the cycle and reverting to the cache-thrashing
    interleaved schedule pattern.

    The backward buddy will try to schedule back to the task that woke us
    up in case the forward buddy is not available, under the assumption
    that, barring current, the task that woke us is the most cache-hot
    task around.

    This basically allows a task to continue after it has been preempted (a
    rough sketch of the buddy preference follows at the end of this entry).

    In order to avoid starvation, we allow either buddy to get wakeup_gran
    ahead of the pack.

    Signed-off-by: Peter Zijlstra
    Acked-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
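    A rough sketch of the buddy preference described above (invented names, not
    the actual pick_next_entity() code): the forward buddy is tried first, the
    backward buddy second, and either one is used only if it is within
    wakeup_gran of the fair pick.

        #include <stddef.h>

        struct task;

        struct runqueue_sketch {
            struct task *next;   /* forward buddy: the task we last woke  */
            struct task *last;   /* backward buddy: the task that woke us */
        };

        /* Placeholder: the real code picks the leftmost entity in the rbtree. */
        static struct task *fair_pick(struct runqueue_sketch *rq)
        {
            (void)rq;
            return NULL;
        }

        /* Placeholder: the real check compares vruntimes against wakeup_gran. */
        static int within_wakeup_gran(struct task *buddy, struct task *fair)
        {
            (void)buddy;
            (void)fair;
            return 1;
        }

        static struct task *pick_next_sketch(struct runqueue_sketch *rq)
        {
            struct task *fair = fair_pick(rq);

            /* Prefer the task we woke: it likely consumes data we produced. */
            if (rq->next && within_wakeup_gran(rq->next, fair))
                return rq->next;

            /* Otherwise fall back to the task that woke us: after current,
             * it is assumed to be the most cache-hot task around. */
            if (rq->last && within_wakeup_gran(rq->last, fair))
                return rq->last;

            return fair;
        }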
     

20 Oct, 2008

1 commit

  • David Miller reported that hrtick update overhead has tripled the
    wakeup overhead on Sparc64.

    That is too much - disable the HRTICK feature for now by default,
    until a faster implementation is found.

    Reported-by: David Miller
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

22 Sep, 2008

2 commits

  • WAKEUP_OVERLAP is not a winner on a 16way box, running psql+sysbench:

             .27-rc7-NO_WAKEUP_OVERLAP   .27-rc7-WAKEUP_OVERLAP
    ----------------------------------------------------------------------
      1:                           694                      811    +14.39%
      2:                          1454                     1427     -1.86%
      4:                          3017                     3070     +1.70%
      8:                          5694                     5808     +1.96%
     16:                         10592                    10612     +0.19%
     32:                          9693                     9647     -0.48%
     64:                          8507                     8262     -2.97%
    128:                          8402                     7087    -18.55%
    256:                          8419                     5124    -64.30%
    512:                          7990                     3671   -117.62%
    ----------------------------------------------------------------------
    SUM:                         64466                    55524    -16.11%

    ... so turn it off by default.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.

    The difference seems to come from different preemption aggressiveness,
    which affects the cache footprint of the workload and its effective
    cache thrashing.

    Aggressively preempt a task if its avg overlap is very small; this should
    avoid the task going to sleep and then finding it still running when we
    schedule back to it, saving a wakeup (a rough sketch of the check follows
    at the end of this entry).

    Reported-by: Lin Ming
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
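    A rough sketch of that check (the names and the nanosecond threshold are
    assumptions, not the exact kernel condition):

        #include <stdbool.h>
        #include <stdint.h>

        /*
         * Preempt right away when both the current task and the wakee tend to
         * run only very briefly around wakeups (small average overlap), so the
         * wakee does not go to sleep only to be found still runnable later.
         */
        static bool preempt_on_small_overlap(uint64_t curr_avg_overlap_ns,
                                             uint64_t wakee_avg_overlap_ns,
                                             uint64_t threshold_ns)
        {
            return curr_avg_overlap_ns < threshold_ns &&
                   wakee_avg_overlap_ns < threshold_ns;
        }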
     

21 Aug, 2008

1 commit

  • Yanmin reported a significant regression on his 16-core machine due to:

    commit 93b75217df39e6d75889cc6f8050343286aff4a5
    Author: Peter Zijlstra
    Date: Fri Jun 27 13:41:33 2008 +0200

    Flip back to the old behaviour.

    Reported-by: "Zhang, Yanmin"
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

27 Jun, 2008

5 commits

    Measurements show that the difference between the cgroup:/ and cgroup:/foo
    wake_affine() results is that the latter succeeds significantly more often.

    Therefore bias the calculations towards failing the test.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
    We found that the affine wakeup code needs rather accurate load figures
    to be effective. The trouble is that updating the load figures is fairly
    expensive with group scheduling. Therefore rate-limit the updates (a rough
    sketch of such rate limiting follows at the end of this entry).

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
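    A minimal sketch of such rate limiting (the names and the nanosecond
    interface are assumptions; the actual patch gates the load update on a
    tunable interval):

        #include <stdbool.h>
        #include <stdint.h>

        /* Skip the expensive load update unless at least min_interval_ns has
         * passed since the last one. */
        static bool should_update_load(uint64_t now_ns, uint64_t *last_update_ns,
                                       uint64_t min_interval_ns)
        {
            if (now_ns - *last_update_ns < min_interval_ns)
                return false;

            *last_update_ns = now_ns;
            return true;
        }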
     
    The bias given by the source/target_load functions can be very large; disable
    it by default to get faster convergence.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
    calc_delta_asym() is supposed to do the same as calc_delta_fair() except
    linearly shrink the result for negative-nice processes - this gives them a
    smaller preemption threshold so that they are more easily preempted.

    The problem is that for task groups se->load.weight is the per-CPU share of
    the actual task group weight; take that into account (a rough sketch of the
    underlying weighting follows at the end of this entry).

    Also provide a debug switch to disable the asymmetry (which I still don't
    like - but it does greatly benefit some workloads)

    This would explain the interactivity issues reported against group scheduling.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
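    To make the weighting concrete, here is a simplified sketch of the fair
    scaling that calc_delta_asym() builds on (NICE_0_LOAD_SKETCH and the lack of
    rounding are assumptions; the asymmetric variant additionally shrinks the
    result for negative-nice tasks, and for a group entity the weight is the
    group's per-CPU share):

        #include <stdint.h>

        #define NICE_0_LOAD_SKETCH 1024ULL  /* assumed weight of a nice-0 task */

        /* Scale a runtime delta inversely with the entity's weight: heavier
         * entities see a smaller weighted delta. */
        static uint64_t calc_delta_fair_sketch(uint64_t delta, uint64_t weight)
        {
            return delta * NICE_0_LOAD_SKETCH / weight;
        }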
     
  • Try again..

    initial commit: 8f1bc385cfbab474db6c27b5af1e439614f3025c
    revert: f9305d4a0968201b2818dbed0dc8cb0d4ee7aeb3

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

10 Jun, 2008

1 commit


20 Apr, 2008

1 commit