25 Jul, 2011

1 commit

  • * 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
    KVM: IOMMU: Disable device assignment without interrupt remapping
    KVM: MMU: trace mmio page fault
    KVM: MMU: mmio page fault support
    KVM: MMU: reorganize struct kvm_shadow_walk_iterator
    KVM: MMU: lockless walking shadow page table
    KVM: MMU: do not need atomicly to set/clear spte
    KVM: MMU: introduce the rules to modify shadow page table
    KVM: MMU: abstract some functions to handle fault pfn
    KVM: MMU: filter out the mmio pfn from the fault pfn
    KVM: MMU: remove bypass_guest_pf
    KVM: MMU: split kvm_mmu_free_page
    KVM: MMU: count used shadow pages on prepareing path
    KVM: MMU: rename 'pt_write' to 'emulate'
    KVM: MMU: cleanup for FNAME(fetch)
    KVM: MMU: optimize to handle dirty bit
    KVM: MMU: cache mmio info on page fault path
    KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code
    KVM: MMU: do not update slot bitmap if spte is nonpresent
    KVM: MMU: fix walking shadow page table
    KVM guest: KVM Steal time registration
    ...

    Linus Torvalds
     

21 Jul, 2011

1 commit

  • Allow for sched_domain spans that overlap by giving such domains their
    own sched_group list instead of sharing the sched_groups amongst
    each other.

    This is needed for machines with more than 16 nodes, because
    sched_domain_node_span() will generate a node mask from the
    16 nearest nodes without regard to whether these masks overlap.

    Currently sched_domains have a sched_group that maps to their child
    sched_domain span, and since there is no overlap we share the
    sched_group between the sched_domains of the various CPUs. If however
    there is overlap, we would need to link the sched_group list in
    different ways for each cpu, and hence sharing isn't possible.

    In order to solve this, allocate private sched_groups for each CPU's
    sched_domain but have the sched_groups share a sched_group_power
    structure such that we can uniquely track the power.
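
    To make the new layout concrete, here is a minimal user-space sketch of
    the idea (simplified, hypothetical field names; the real kernel structures
    carry far more state): every CPU gets its own sched_group node for the
    overlapping domain, while all of those nodes point at one shared
    sched_group_power.

    #include <stdio.h>
    #include <stdlib.h>

    /* Shared per-span power bookkeeping (hypothetical, simplified). */
    struct sched_group_power {
        unsigned int power;                /* tracked once per span */
    };

    /* Each CPU gets a private group node but shares the power struct. */
    struct sched_group {
        struct sched_group *next;          /* per-CPU circular group list */
        struct sched_group_power *sgp;     /* shared across overlapping CPUs */
        unsigned long cpumask;             /* toy stand-in for struct cpumask */
    };

    #define NR_CPUS 4

    int main(void)
    {
        struct sched_group_power *shared = calloc(1, sizeof(*shared));
        struct sched_group groups[NR_CPUS];

        shared->power = 1024;
        for (int cpu = 0; cpu < NR_CPUS; cpu++) {
            groups[cpu].sgp = shared;          /* power tracked uniquely */
            groups[cpu].cpumask = 1UL << cpu;
            groups[cpu].next = &groups[cpu];   /* private list, per CPU */
        }

        printf("cpu0 and cpu3 share the power struct: %s\n",
               groups[0].sgp == groups[3].sgp ? "yes" : "no");
        free(shared);
        return 0;
    }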

    Reported-and-tested-by: Anton Blanchard
    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/n/tip-08bxqw9wis3qti9u5inifh3y@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

14 Jul, 2011

1 commit

  • This patch makes update_rq_clock() aware of steal time.
    The mechanism of operation is no different from irq_time,
    and follows the same principles. It lives behind its own CONFIG
    option and can be compiled out independently of the rest of
    steal time reporting. The effect of disabling it is that the
    scheduler will still report steal time (that cannot be disabled),
    but won't use this information for cpu power adjustments.

    Every time update_rq_clock_task() is invoked, we query how much
    time was stolen since the last call, and feed it into
    sched_rt_avg_update().

    Although steal time reporting in account_process_tick() keeps
    track of the last time we read the steal clock, in prev_steal_time,
    this patch does it independently using another field,
    prev_steal_time_rq. This is because otherwise, information about time
    accounted in update_process_tick() would never reach us in update_rq_clock().
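
    A rough user-space model of the accounting described above (hypothetical
    names throughout; paravirt_steal_clock() here is just a stand-in for the
    real steal-clock read): on every clock update, take the steal delta since
    prev_steal_time_rq, subtract it from the task clock, and feed it into the
    rt average that cpu power scaling uses.

    #include <stdio.h>

    /* Toy stand-in for the hypervisor's steal counter (hypothetical). */
    static unsigned long long steal_clock_now;
    static unsigned long long paravirt_steal_clock(void) { return steal_clock_now; }

    struct rq {
        unsigned long long clock_task;         /* task-visible time           */
        unsigned long long prev_steal_time_rq; /* last steal value seen here  */
        unsigned long long rt_avg;             /* feeds cpu power scaling     */
    };

    static void sched_rt_avg_update(struct rq *rq, unsigned long long delta)
    {
        rq->rt_avg += delta;                   /* real code decays this average */
    }

    /* Modeled after the steal-time branch of update_rq_clock_task(). */
    static void update_rq_clock_task(struct rq *rq, unsigned long long delta)
    {
        unsigned long long steal = paravirt_steal_clock() - rq->prev_steal_time_rq;

        if (steal > delta)
            steal = delta;                     /* never account more than elapsed */

        rq->prev_steal_time_rq += steal;
        delta -= steal;                        /* stolen time is not task time */
        rq->clock_task += delta;

        sched_rt_avg_update(rq, steal);        /* lower cpu power accordingly  */
    }

    int main(void)
    {
        struct rq rq = {0};
        steal_clock_now = 300000;              /* 0.3 ms stolen so far         */
        update_rq_clock_task(&rq, 1000000);    /* 1 ms of wall time elapsed    */
        printf("clock_task=%llu rt_avg=%llu\n", rq.clock_task, rq.rt_avg);
        return 0;
    }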

    Signed-off-by: Glauber Costa
    Acked-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Tested-by: Eric B Munson
    CC: Jeremy Fitzhardinge
    CC: Anthony Liguori
    Signed-off-by: Avi Kivity

    Glauber Costa
     

14 Apr, 2011

1 commit

  • Now that we've removed the rq->lock requirement from the first part of
    ttwu() and can compute placement without holding any rq->lock, ensure
    we execute the second half of ttwu() on the actual cpu we want the
    task to run on.

    This avoids having to take rq->lock and doing the task enqueue
    remotely, saving lots on cacheline transfers.

    As measured using: http://oss.oracle.com/~mason/sembench.c

    $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
    $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
    $ ./sembench -t 2048 -w 1900 -o 0

    unpatched: run time 30 seconds 647278 worker burns per second
    patched: run time 30 seconds 816715 worker burns per second
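
    A much-simplified model of the queueing idea (hypothetical names, no real
    IPIs or locking): instead of grabbing the remote rq->lock, the waking CPU
    appends the task to the target CPU's wake list and pokes it, and the
    target CPU finishes the enqueue locally.

    #include <stdio.h>

    /* Toy model: a singly-linked wake list per CPU, drained by that CPU. */
    struct task { const char *name; struct task *wake_next; };
    struct rq   { struct task *wake_list; int nr_running; };

    static struct rq runqueues[2];

    /* Waking CPU: queue remotely instead of taking the remote rq->lock. */
    static void ttwu_queue_remote(struct task *p, int cpu)
    {
        struct rq *rq = &runqueues[cpu];

        p->wake_next = rq->wake_list;
        rq->wake_list = p;
        /* real code sends a rescheduling IPI here so 'cpu' drains the list */
    }

    /* Target CPU: second half of ttwu(), run locally with only its own lock. */
    static void sched_ttwu_pending(int cpu)
    {
        struct rq *rq = &runqueues[cpu];

        while (rq->wake_list) {
            struct task *p = rq->wake_list;

            rq->wake_list = p->wake_next;
            rq->nr_running++;                  /* local activate_task() */
            printf("cpu%d enqueued %s locally\n", cpu, p->name);
        }
    }

    int main(void)
    {
        struct task t = { .name = "worker" };

        ttwu_queue_remote(&t, 1);   /* cpu0 wakes a task destined for cpu1 */
        sched_ttwu_pending(1);      /* cpu1 does the enqueue itself        */
        return 0;
    }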

    Reviewed-by: Frank Rowand
    Cc: Mike Galbraith
    Cc: Nick Piggin
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20110405152729.515897185@chello.nl

    Peter Zijlstra
     

18 Nov, 2010

1 commit

  • By tracking a per-cpu load-avg for each cfs_rq and folding it into a
    global task_group load on each tick we can rework tg_shares_up to be
    strictly per-cpu.

    This should improve cpu-cgroup performance for smp systems
    significantly.

    [ Paul: changed to use queueing cfs_rq + bug fixes ]
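
    A toy illustration of the folding step (hypothetical names, no atomics or
    decay): each cfs_rq keeps its own load average and, on the tick, folds
    only the delta since its last contribution into the shared task_group sum.

    #include <stdio.h>

    struct task_group { long load_avg; };                  /* global sum          */
    struct cfs_rq {
        struct task_group *tg;
        long load_avg;                                     /* per-cpu average     */
        long load_contribution;                            /* what we last added  */
    };

    /* Fold the per-cpu delta into the global task_group load (tick-time work).
     * The real code uses atomics and only bothers when the delta is large. */
    static void update_cfs_load_contribution(struct cfs_rq *cfs_rq)
    {
        long delta = cfs_rq->load_avg - cfs_rq->load_contribution;

        if (delta) {
            cfs_rq->tg->load_avg += delta;
            cfs_rq->load_contribution += delta;
        }
    }

    int main(void)
    {
        struct task_group tg = { .load_avg = 0 };
        struct cfs_rq cpu0 = { .tg = &tg }, cpu1 = { .tg = &tg };

        cpu0.load_avg = 1024;              /* pretend per-cpu tracking ran */
        cpu1.load_avg = 2048;
        update_cfs_load_contribution(&cpu0);
        update_cfs_load_contribution(&cpu1);
        printf("tg load = %ld\n", tg.load_avg);    /* 3072 */
        return 0;
    }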

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

19 Oct, 2010

1 commit

  • The idea was suggested by Peter Zijlstra here:

    http://marc.info/?l=linux-kernel&m=127476934517534&w=2

    irq time is technically not available to the tasks running on the CPU.
    This patch removes irq time from CPU power, piggybacking on
    sched_rt_avg_update().

    Tested this by keeping CPU X busy with a network intensive task that has
    75% of a single CPU's worth of irq processing (hard+soft) on a 4-way
    system, and starting seven cycle soakers on the system. Without this
    change, there are two tasks on each CPU. With this change, there is a
    single task on the irq-busy CPU X and the remaining seven tasks are
    spread among the other three CPUs.
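
    A simplified model of the idea (hypothetical names and numbers): irq time
    is folded into the same rolling average already used for rt time, so the
    capacity left for fair tasks on that CPU shrinks accordingly.

    #include <stdio.h>

    #define SCHED_POWER_SCALE 1024UL

    struct rq {
        unsigned long long rt_avg;        /* rt + irq time over the window   */
        unsigned long long period;        /* length of the averaging window  */
    };

    /* Fold irq time into the existing rt average (what the patch piggybacks on). */
    static void sched_rt_avg_update(struct rq *rq, unsigned long long irq_delta)
    {
        rq->rt_avg += irq_delta;
    }

    /* Scale cpu power down by the fraction of the window lost to rt/irq work. */
    static unsigned long scale_rt_power(struct rq *rq)
    {
        unsigned long long available = rq->period - rq->rt_avg;

        return (unsigned long)(available * SCHED_POWER_SCALE / rq->period);
    }

    int main(void)
    {
        struct rq rq = { .period = 1000000 };      /* 1 ms window            */

        sched_rt_avg_update(&rq, 750000);          /* 75% eaten by irq work  */
        printf("cpu power: %lu of %lu\n", scale_rt_power(&rq), SCHED_POWER_SCALE);
        return 0;   /* prints roughly 256: a quarter of a CPU left for tasks */
    }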

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Venkatesh Pallipadi
     

12 Mar, 2010

7 commits

  • This feature has been enabled for quite a while, after testing showed that
    easing preemption for light tasks was harmful to high priority threads.

    Remove the feature flag.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Sync wakeups are critical functionality with a long history. Remove the
    feature flag; we don't need the branch or icache footprint.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • This feature never earned its keep, remove it.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Our preemption model relies too heavily on sleeper fairness to disable it
    without dire consequences. Remove the feature, and save a branch or two.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • This feature hasn't been enabled in a long time, remove effectively dead code.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Both avg_overlap and avg_wakeup had an inherent problem in that their accuracy
    was detrimentally affected by cross-cpu wakeups; this is because we are missing
    the necessary call to update_curr(). This can't be fixed without increasing
    overhead in our already too fat fastpath.

    Additionally, with recent load balancing changes making us prefer to place tasks
    in an idle cache domain (which is good for compute bound loads), communicating
    tasks suffer when a sync wakeup, which would enable affine placement, is turned
    into a non-sync wakeup by SYNC_LESS. With one task on the runqueue, wake_affine()
    rejects the affine wakeup request, leaving the unfortunate task where it
    was placed, taking frequent cache misses.

    Remove it, and recover some fastpath cycles.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Testing the load which led to this heuristic (nfs4 kbuild) shows that it has
    outlived its usefulness. With intervening load balancing changes, I cannot
    see any difference with/without, so recover those fastpath cycles.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

09 Dec, 2009

1 commit


17 Sep, 2009

1 commit

  • Create a new wakeup preemption mode, preempt towards tasks that run
    shorter on avg. It sets next buddy to be sure we actually run the task
    we preempted for.

    Test results:

    root@twins:~# while :; do :; done &
    [1] 6537
    root@twins:~# while :; do :; done &
    [2] 6538
    root@twins:~# while :; do :; done &
    [3] 6539
    root@twins:~# while :; do :; done &
    [4] 6540

    root@twins:/home/peter# ./latt -c4 sleep 4
    Entries: 48 (clients=4)

    Averages:
    ------------------------------
    Max 4750 usec
    Avg 497 usec
    Stdev 737 usec

    root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features

    root@twins:/home/peter# ./latt -c4 sleep 4
    Entries: 48 (clients=4)

    Averages:
    ------------------------------
    Max 14 usec
    Avg 5 usec
    Stdev 3 usec

    Disabled by default - needs more testing.
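
    Roughly, the new mode compares the average runtime of the current task
    with that of the freshly woken task and preempts in favour of the shorter
    runner, marking it as next buddy (sketch with hypothetical field names):

    #include <stdio.h>

    struct sched_entity { unsigned long long avg_running; }; /* avg runtime per run */

    static const struct sched_entity *next_buddy;

    /* Preempt towards the task that runs shorter on average, and make it the
     * next buddy so it is actually what we pick after the preemption. */
    static int wakeup_running_preempt(const struct sched_entity *curr,
                                      const struct sched_entity *woken)
    {
        if (woken->avg_running < curr->avg_running) {
            next_buddy = woken;      /* be sure we run the task we preempted for */
            return 1;                /* resched current */
        }
        return 0;
    }

    int main(void)
    {
        struct sched_entity hog  = { .avg_running = 4000000 };   /* 4 ms bursts  */
        struct sched_entity latt = { .avg_running = 50000 };     /* 50 us bursts */

        printf("preempt: %d\n", wakeup_running_preempt(&hog, &latt));   /* 1 */
        printf("next buddy set: %d\n", next_buddy == &latt);            /* 1 */
        return 0;
    }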

    Signed-off-by: Peter Zijlstra
    Acked-by: Mike Galbraith
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Peter Zijlstra
     

16 Sep, 2009

3 commits

  • We don't need to call update_shares() for each domain we iterate,
    just for the largest one.

    However, we should call it before wake_affine() as well, so that
    it can use up-to-date values too.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add back FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS.

    FAIR_SLEEPERS is the old logic: credit sleepers with their sleep time.

    GENTLE_FAIR_SLEEPERS dampens this a bit: 50% of their sleep time gets
    credited.

    The hope here is to still give the benefits of fair-sleepers logic
    (quick wakeups, etc.) while not allowing them to have 100% of their
    sleep time credited as if they were running.
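
    In place_entity() terms, the sleeper credit is a vruntime discount of up
    to one latency period, and the gentle variant halves it. A sketch with
    made-up numbers:

    #include <stdio.h>

    #define FAIR_SLEEPERS        1
    #define GENTLE_FAIR_SLEEPERS 1

    static unsigned long long sysctl_sched_latency = 6000000ULL;   /* 6 ms */

    /* Sketch of the sleeper-credit part of place_entity(): a waking sleeper is
     * placed up to one latency period behind min_vruntime; GENTLE halves that. */
    static unsigned long long place_sleeper(unsigned long long min_vruntime)
    {
        unsigned long long thresh = sysctl_sched_latency;

        if (!FAIR_SLEEPERS)
            return min_vruntime;              /* no credit at all */

        if (GENTLE_FAIR_SLEEPERS)
            thresh >>= 1;                     /* only 50% of the credit */

        return min_vruntime - thresh;
    }

    int main(void)
    {
        printf("placed vruntime: %llu\n", place_sleeper(100000000ULL));
        return 0;   /* 100 ms minus 3 ms with the gentle flag, minus 6 ms without */
    }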

    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Currently we use overlap to weaken the SYNC hint, but allow it to
    set the hint as well.

    echo NO_SYNC_WAKEUP > /debug/sched_features
    echo SYNC_MORE > /debug/sched_features

    preserves pipe-test behaviour without using the WF_SYNC hint.

    Worth playing with on more workloads...
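
    A sketch of the adjustment (the 0.5 ms cutoff below is made up; the real
    code compares avg_overlap against a migration-cost style threshold): a
    large overlap clears the caller's sync hint, a small overlap may set it.

    #include <stdio.h>

    #define SYNC_LESS 1   /* overlap may clear the caller-provided hint      */
    #define SYNC_MORE 1   /* overlap may set the hint even without WF_SYNC   */

    static unsigned long long overlap_cutoff = 500000;   /* 0.5 ms, made up  */

    static int adjust_sync(int sync, unsigned long long curr_overlap,
                           unsigned long long wakee_overlap)
    {
        int overlapping = curr_overlap > overlap_cutoff ||
                          wakee_overlap > overlap_cutoff;

        if (SYNC_LESS && overlapping)
            sync = 0;          /* tasks keep running after waking each other */
        if (SYNC_MORE && !overlapping)
            sync = 1;          /* short overlap: treat it as a sync wakeup   */
        return sync;
    }

    int main(void)
    {
        printf("pipe-like pair: sync=%d\n", adjust_sync(0, 10000, 20000));
        printf("busy pair:      sync=%d\n", adjust_sync(1, 900000, 800000));
        return 0;
    }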

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

15 Sep, 2009

5 commits

  • I suspect a feedback loop between cpuidle and the aperf/mperf
    cpu_power bits, where idle C-states lower the ratio, which leads
    to lower cpu_power and then less load, which generates more idle
    time, and so on.

    Put in a knob to disable it.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Make the idle balancer more aggressive, to improve a
    x264 encoding workload provided by Jason Garrett-Glaser:

    NEXT_BUDDY NO_LB_BIAS
    encoded 600 frames, 252.82 fps, 22096.60 kb/s
    encoded 600 frames, 250.69 fps, 22096.60 kb/s
    encoded 600 frames, 245.76 fps, 22096.60 kb/s

    NO_NEXT_BUDDY LB_BIAS
    encoded 600 frames, 344.44 fps, 22096.60 kb/s
    encoded 600 frames, 346.66 fps, 22096.60 kb/s
    encoded 600 frames, 352.59 fps, 22096.60 kb/s

    NO_NEXT_BUDDY NO_LB_BIAS
    encoded 600 frames, 425.75 fps, 22096.60 kb/s
    encoded 600 frames, 425.45 fps, 22096.60 kb/s
    encoded 600 frames, 422.49 fps, 22096.60 kb/s

    Peter pointed out that this is better done via newidle_idx,
    not via LB_BIAS, newidle balancing should look for where
    there is load _now_, not where there was load 2 ticks ago.

    Worst-case latencies are improved as well, since no buddies
    means less vruntime spread (as per prior lkml discussions).

    This change improves kbuild-peak parallelism as well.

    Reported-by: Jason Garrett-Glaser
    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Add text...

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add a NEXT_BUDDY feature flag to aid in debugging.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • It consists of two conditions; split them out into separate toggles
    so we can test them independently.

    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Sep, 2009

1 commit

  • Nikos Chantziaras and Jens Axboe reported that turning off
    NEW_FAIR_SLEEPERS improves desktop interactivity visibly.

    Nikos described his experiences the following way:

    " With this setting, I can do "nice -n 19 make -j20" and
    still have a very smooth desktop and watch a movie at
    the same time. Various other annoyances (like the
    "logout/shutdown/restart" dialog of KDE not appearing
    at all until the background fade-out effect has finished)
    are also gone. So this seems to be the single most
    important setting that vastly improves desktop behavior,
    at least here. "

    Jens described it the following way, referring to a 10-seconds
    xmodmap scheduling delay he was trying to debug:

    " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then
    I get:

    Performance counter stats for 'xmodmap .xmodmap-carl':

    9.009137 task-clock-msecs # 0.447 CPUs
    18 context-switches # 0.002 M/sec
    1 CPU-migrations # 0.000 M/sec
    315 page-faults # 0.035 M/sec

    0.020167093 seconds time elapsed

    Woot! "

    So disable it for now. In perf trace output I can see weird
    delta timestamps:

    cc1-9943 [001] 2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

    That nsec field is not supposed to be that large. More digging
    is needed - but let's turn it off until the real bug is found.

    Reported-by: Nikos Chantziaras
    Tested-by: Nikos Chantziaras
    Reported-by: Jens Axboe
    Tested-by: Jens Axboe
    Acked-by: Peter Zijlstra
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

31 Mar, 2009

1 commit

  • * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits)
    lockdep: fix deadlock in lockdep_trace_alloc
    lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB
    lockdep: annotate reclaim context (__GFP_NOFS), fix
    lockdep: build fix for !PROVE_LOCKING
    lockstat: warn about disabled lock debugging
    lockdep: use stringify.h
    lockdep: simplify check_prev_add_irq()
    lockdep: get_user_chars() redo
    lockdep: simplify get_user_chars()
    lockdep: add comments to mark_lock_irq()
    lockdep: remove macro usage from mark_held_locks()
    lockdep: fully reduce mark_lock_irq()
    lockdep: merge the !_READ mark_lock_irq() helpers
    lockdep: merge the _READ mark_lock_irq() helpers
    lockdep: simplify mark_lock_irq() helpers #3
    lockdep: further simplify mark_lock_irq() helpers
    lockdep: simplify the mark_lock_irq() helpers
    lockdep: split up mark_lock_irq()
    lockdep: generate usage strings
    lockdep: generate the state bit definitions
    ...

    Linus Torvalds
     

15 Jan, 2009

2 commits

  • Prefer tasks that wake other tasks to preempt quickly. This improves
    performance because more work is available sooner.

    The workload that prompted this patch was a kernel build over NFS4 (for some
    curious and not understood reason we had to revert commit:
    18de9735300756e3ca9c361ef58409d8561dfe0d to make any progress at all)

    Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
    3m30-ish, with this patch we're down to 2m50-ish.

    psql-sysbench/mysql-sysbench show a slight improvement in peak performance
    as well; tbench and vmark seemed not to care.

    It is possible to improve upon the build time (to 2m20-ish) but that seriously
    destroys other benchmarks (just shows that there's more room for tinkering).

    Much thanks to Mike who put in a lot of effort to benchmark things and proved
    a worthy opponent with a competing patch.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Change mutex contention behaviour such that it will sometimes busy wait on
    acquisition - moving its behaviour closer to that of spinlocks.

    This concept got ported to mainline from the -rt tree, where it was originally
    implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.

    Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
    gave a 345% boost for VFS scalability on my testbox:

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 296604

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 85870

    The key criteria for the busy wait is that the lock owner has to be running on
    a (different) cpu. The idea is that as long as the owner is running, there is a
    fair chance it'll release the lock soon, and thus we'll be better off spinning
    instead of blocking/scheduling.

    Since regular mutexes (as opposed to rtmutexes) do not atomically track the
    owner, we add the owner in a non-atomic fashion and deal with the races in
    the slowpath.

    Furthermore, to ease the testing of the performance impact of this new code,
    there is a means to disable this behaviour at runtime (without having to
    reboot the system) when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
    by issuing the following command:

    # echo NO_OWNER_SPIN > /debug/sched_features

    This command re-enables spinning again (this is also the default):

    # echo OWNER_SPIN > /debug/sched_features
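
    The core idea, as a stripped-down user-space sketch (hypothetical helpers,
    no preemption or need_resched checks): keep spinning while the recorded
    owner is running on a CPU, otherwise fall back to blocking.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct toy_task  { _Atomic bool on_cpu; };
    struct toy_mutex {
        _Atomic int count;                  /* 1 = unlocked, 0 = locked here;
                                               real mutexes also use negatives */
        _Atomic(struct toy_task *) owner;   /* set non-atomically in real code */
    };

    /* Stub for the blocking slowpath: sleep until the owner wakes us. */
    static void block_on(struct toy_mutex *lock) { (void)lock; }

    /* Sketch of the optimistic-spin step in the mutex slowpath. */
    static void toy_mutex_lock(struct toy_mutex *lock, struct toy_task *self)
    {
        for (;;) {
            int one = 1;

            if (atomic_compare_exchange_strong(&lock->count, &one, 0)) {
                atomic_store(&lock->owner, self);       /* acquired */
                return;
            }

            struct toy_task *owner = atomic_load(&lock->owner);

            if (owner && atomic_load(&owner->on_cpu))
                continue;   /* owner is running: likely to release soon, spin */

            block_on(lock); /* owner not running (or unknown): go to sleep    */
        }
    }

    int main(void)
    {
        struct toy_task me = { .on_cpu = true };
        struct toy_mutex m = { .count = 1 };

        toy_mutex_lock(&m, &me);
        printf("acquired, count=%d\n", atomic_load(&m.count));
        return 0;
    }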

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

05 Nov, 2008

1 commit

  • Impact: improve/change/fix wakeup-buddy scheduling

    Currently we only have a forward looking buddy, that is, we prefer to
    schedule to the task we last woke up, under the presumption that it's
    going to consume the data we just produced, and therefore will have
    cache hot benefits.

    This allows co-waking producer/consumer task pairs to run ahead of the
    pack for a little while, keeping their cache warm. Without this, we
    would interleave all pairs, utterly thrashing the cache.

    This patch introduces a backward looking buddy: suppose that in the
    above scenario the consumer preempts the producer before it can go to
    sleep; we will then miss the wakeup from consumer to producer (it's
    already running, after all), breaking the cycle and reverting to the
    cache-thrashing interleaved schedule pattern.

    The backward buddy will try to schedule back to the task that woke us
    up in case the forward buddy is not available, under the assumption
    that, barring current, the task that last woke us is the most cache-hot
    task around.

    This will basically allow a task to continue after it got preempted.

    In order to avoid starvation, we allow either buddy to get wakeup_gran
    ahead of the pack.
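
    Pick order with both buddies, as a sketch (hypothetical helper; the real
    pick also verifies that a buddy has not run too far ahead, which is the
    wakeup_gran check mentioned above):

    #include <stddef.h>
    #include <stdio.h>

    struct sched_entity { const char *comm; };

    static struct sched_entity *next_buddy;   /* task we last woke (forward)   */
    static struct sched_entity *last_buddy;   /* task that last woke us (back) */

    /* Sketch of the buddy preference in pick_next_entity(): forward buddy
     * first, then the backward buddy, then the leftmost (fairest) entity. */
    static struct sched_entity *pick_next(struct sched_entity *leftmost)
    {
        if (next_buddy)
            return next_buddy;
        if (last_buddy)
            return last_buddy;       /* continue the producer/consumer cycle */
        return leftmost;
    }

    int main(void)
    {
        struct sched_entity producer = { "producer" }, other = { "other" };

        /* consumer preempted the producer, so only the backward buddy is set */
        last_buddy = &producer;
        printf("next: %s\n", pick_next(&other)->comm);   /* producer */
        return 0;
    }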

    Signed-off-by: Peter Zijlstra
    Acked-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

20 Oct, 2008

1 commit

  • David Miller reported that hrtick update overhead has tripled the
    wakeup overhead on Sparc64.

    That is too much - disable the HRTICK feature for now by default,
    until a faster implementation is found.

    Reported-by: David Miller
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

22 Sep, 2008

2 commits

  • WAKEUP_OVERLAP is not a winner on a 16way box, running psql+sysbench:

             .27-rc7-NO_WAKEUP_OVERLAP   .27-rc7-WAKEUP_OVERLAP
    ------------------------------------------------------------
      1:              694                       811     +14.39%
      2:             1454                      1427      -1.86%
      4:             3017                      3070      +1.70%
      8:             5694                      5808      +1.96%
     16:            10592                     10612      +0.19%
     32:             9693                      9647      -0.48%
     64:             8507                      8262      -2.97%
    128:             8402                      7087     -18.55%
    256:             8419                      5124     -64.30%
    512:             7990                      3671    -117.62%
    ------------------------------------------------------------
    SUM:            64466                     55524     -16.11%

    ... so turn it off by default.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.

    The difference seems to come from different preemption aggressiveness,
    which affects the cache footprint of the workload and its effective
    cache thrashing.

    Aggressively preempt a task if its avg overlap is very small; this should
    avoid the task going to sleep only for us to find it still running when we
    schedule back to it - saving a wakeup.
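
    As a sketch (the cutoff below reuses the migration-cost sysctl as a
    plausible threshold; treat the names as illustrative): if both the current
    task and the waking task have a tiny average overlap, preempt right away.

    #include <stdio.h>

    static unsigned long long sched_migration_cost = 500000;   /* 0.5 ms cutoff */

    struct se { unsigned long long avg_overlap; };  /* avg run-after-wakeup time */

    /* Sketch of the WAKEUP_OVERLAP idea: very small overlap means the current
     * task is about to sleep anyway, so preempting it now saves a full wakeup. */
    static int wakeup_overlap_preempt(const struct se *curr, const struct se *woken)
    {
        return curr->avg_overlap < sched_migration_cost &&
               woken->avg_overlap < sched_migration_cost;
    }

    int main(void)
    {
        struct se oltp_server = { .avg_overlap = 30000 };   /* 30 us */
        struct se oltp_client = { .avg_overlap = 20000 };   /* 20 us */

        printf("preempt now: %d\n",
               wakeup_overlap_preempt(&oltp_server, &oltp_client));   /* 1 */
        return 0;
    }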

    Reported-by: Lin Ming
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

21 Aug, 2008

1 commit

  • Yanmin reported a significant regression on his 16-core machine due to:

    commit 93b75217df39e6d75889cc6f8050343286aff4a5
    Author: Peter Zijlstra
    Date: Fri Jun 27 13:41:33 2008 +0200

    Flip back to the old behaviour.

    Reported-by: "Zhang, Yanmin"
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

27 Jun, 2008

5 commits

  • Measurement shows that the difference between cgroup:/ and cgroup:/foo
    wake_affine() results is that the latter succeeds significantly more.

    Therefore bias the calculations towards failing the test.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • We found that the affine wakeup code needs rather accurate load figures
    to be effective. The trouble is that updating the load figures is fairly
    expensive with group scheduling. Therefore ratelimit the updating.
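
    A sketch of the ratelimit (hypothetical clock source and interval; the
    real knob was a sysctl in nanoseconds): skip the expensive recomputation
    unless enough time has passed since the last one.

    #include <stdio.h>

    static unsigned long long sysctl_shares_ratelimit = 250000;  /* 250 us, made up */
    static unsigned long long last_update;

    static void walk_tg_tree_and_update(void) { /* the expensive part, stubbed */ }

    /* Only recompute the group shares if the last update is old enough. */
    static void update_shares(unsigned long long now)
    {
        if (now - last_update < sysctl_shares_ratelimit)
            return;                      /* too soon, keep the stale figures */
        last_update = now;
        walk_tg_tree_and_update();
    }

    int main(void)
    {
        update_shares(1000000);   /* runs the update                     */
        update_shares(1100000);   /* skipped: only 100 us since the last */
        update_shares(2000000);   /* runs again                          */
        printf("last update at %llu ns\n", last_update);
        return 0;
    }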

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The bias given by source/target_load functions can be very large, disable
    it by default to get faster convergence.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • calc_delta_asym() is supposed to do the same as calc_delta_fair() except
    that it linearly shrinks the result for negative nice processes - this
    causes them to have a smaller preemption threshold so that they are more
    easily preempted.

    The problem is that for task groups se->load.weight is the per cpu share of
    the actual task group weight; take that into account.

    Also provide a debug switch to disable the asymmetry (which I still don't
    like - but it does greatly benefit some workloads)

    This would explain the interactivity issues reported against group scheduling.

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Try again..

    initial commit: 8f1bc385cfbab474db6c27b5af1e439614f3025c
    revert: f9305d4a0968201b2818dbed0dc8cb0d4ee7aeb3

    Signed-off-by: Peter Zijlstra
    Cc: Srivatsa Vaddagiri
    Cc: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

10 Jun, 2008

1 commit


20 Apr, 2008

1 commit