26 Jun, 2005

40 commits

  • One of the dangers when switching from one kernel to another is what happens
    to all of the other cpus that were running in the crashed kernel. In an
    attempt to avoid that problem this patch adds an NMI handler and attempts to
    shoot down the other cpus by sending them non-maskable interrupts.

    The code then waits for 1 second or until all known cpus have stopped running
    and then jumps from the running kernel that has crashed to the kernel in
    reserved memory.

    The kernel spin loop is used for the delay as that should continue to be
    safe even after a crash.
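
    A minimal sketch of that shootdown sequence follows; the helper
    send_nmi_allbutself() and the counter name are illustrative stand-ins,
    not the exact kernel symbols:

      /* Illustrative sketch only, assuming a registered NMI handler that
       * decrements waiting_for_crash_ipi on each cpu it stops. */
      static atomic_t waiting_for_crash_ipi;

      static void crash_shootdown_cpus(void)
      {
              int msecs = 1000;

              atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
              send_nmi_allbutself();  /* NMI every cpu except ourselves */

              /* spin for up to 1 second or until all cpus check in */
              while (atomic_read(&waiting_for_crash_ipi) > 0 && msecs--)
                      mdelay(1);
      }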

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is the i386 implementation of kexec.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Factor out the apic and smp shutdown code from machine_restart so it can be
    called in the kexec reboot path as well.

    By switching to the bootstrap cpu by default on reboot I can delete/simplify
    some motherboard fixups as well.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is a minor bug fix in kexec to resolve the problem of loading a panic
    kernel with an initrd.

    o Problem: Loading a capture kernel fails if an initrd is also being loaded.
    This has been observed for the vmlinux image in the kexec-on-panic case.

    o This patch fixes the problem with a minor correction in the segment
    location and size verification logic. The segment memory end (mend) should
    be mstart + memsz - 1. This one-byte offset was the source of failure for
    an initrd being loaded at a hole boundary.
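
    The corrected arithmetic, as a sketch rather than the verbatim hunk:

      /* the last byte of a segment is an inclusive bound */
      unsigned long mend = mstart + memsz - 1;  /* was: mstart + memsz */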

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • This patch introduces the architecture-independent implementation of the
    sys_kexec_load and compat_sys_kexec_load system calls.

    Kexec on panic support has been integrated into the core patch and is
    relatively clean.

    In addition the hopefully architecture-independent option
    crashkernel=size@location has been documented. Its purpose is to reserve
    space for the panic kernel to live in, where no DMA transfer will ever be
    set up to access it.
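
    For example, reserving 64MB for the panic kernel starting at the 16MB
    physical boundary would look like this on the kernel command line (the
    values are illustrative):

      crashkernel=64M@16M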

    Signed-off-by: Eric Biederman
    Signed-off-by: Alexander Nyberg
    Signed-off-by: Adrian Bunk
    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • For one kernel to report a crash another kernel has created we need
    to have 2 kernels loaded simultaneously in memory. To accomplish this
    the two kernels need to be built to run at different physical addresses.

    This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
    so we can do just that. You need to know what you are doing and what
    the ramifications are before changing this value, and most users
    won't care, so I have made it depend on CONFIG_EMBEDDED.

    bzImage kernels will work and run at a different address when compiled
    with this option but they will still load at 1MB. If you need a kernel
    loaded at a different address as well you need to boot a vmlinux.
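
    As a hypothetical example, a capture kernel meant to run from a region
    reserved at 16MB might be built with:

      CONFIG_PHYSICAL_START=0x1000000

    while the default remains the conventional 1MB address, 0x100000.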

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch fixes a problem with reserving memory during boot-up of a kernel
    built for a non-default location. Currently the boot memory allocator
    reserves the memory required by the kernel image, the boot allocator
    bitmap, etc. It assumes that the kernel is loaded at 1MB (HIGH_MEMORY hard
    coded to 1024*1024). But the kernel can be built for a non-default
    location, hence the existing hardcoding leads to reserving unnecessary
    memory. This patch fixes it.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     
  • For one kernel to report a crash another kernel has created we need
    to have 2 kernels loaded simultaneously in memory. To accomplish this
    the two kernels need to be built to run at different physical addresses.

    This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel
    so we can do just that. You need to know what you are doing and what
    the ramifications are before changing this value, and most users
    won't care, so I have made it depend on CONFIG_EMBEDDED.

    bzImage kernels will work and run at a different address when compiled
    with this option but they will still load at 1MB. If you need a kernel
    loaded at a different address as well you need to boot a vmlinux.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The vmlinux on x86_64 does not report the correct physical address of
    the kernel. Instead in the physical address field it currently
    reports the virtual address of the kernel.

    This patch is a bug fix that corrects vmlinux to report the
    proper physical addresses.

    This is potentially a help for crash dump analysis tools.

    This definitely allows bootloaders to load vmlinux as a standard
    ELF executable. Bootloaders directly loading vmlinux become of
    practical importance when we consider the kexec-on-panic case.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • The vmlinux on i386 does not report the correct physical address of
    the kernel. Instead in the physical address field it currently
    reports the virtual address of the kernel.

    This patch is a bug fix that corrects vmlinux to report the
    proper physical addresses.

    This is potentially a help for crash dump analysis tools.

    This definitely allows bootloaders to load vmlinux as a standard
    ELF executable. Bootloaders directly loading vmlinux become of
    practical importance when we consider the kexec-on-panic case.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • In vmlinux.lds.h the code is careful to define every section so vmlinux
    properly reports the correct physical load address of code, as well as
    its virtual address.

    The new SECURITY_INIT definition fails to follow that convention and
    causes an incorrect physical address to appear in the vmlinux if
    there are any security initcalls.

    This patch updates SECURITY_INIT to follow the convention in the rest of
    the file.
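
    The convention is to give each output section an AT() clause deriving its
    physical load address from LOAD_OFFSET; roughly (a sketch of the pattern,
    not the exact hunk):

      #define SECURITY_INIT                                             \
          .security_initcall.init :                                     \
                  AT(ADDR(.security_initcall.init) - LOAD_OFFSET) {     \
              __security_initcall_start = .;                            \
              *(.security_initcall.init)                                \
              __security_initcall_end = .;                              \
          }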

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • When coming out of apic mode attempt to set the appropriate
    apic back into virtual wire mode. This improves on previous versions
    of this patch by never setting both the local apic and the ioapic
    into virtual wire mode.

    This code looks at data from the mptable to see if an ioapic has
    an ExtInt input to make this decision. A future improvement
    is to figure out which apic or ioapic was in virtual wire mode
    at boot time and to remember it. That is potentially a more accurate
    method of selecting which apic to place in virtual wire mode.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • When coming out of apic mode attempt to set the appropriate
    apic back into virtual wire mode. This improves on previous versions
    of this patch by never setting both the local apic and the ioapic
    into virtual wire mode.

    This code looks at data from the mptable to see if an ioapic has
    an ExtInt input to make this decision. A future improvement
    is to figure out which apic or ioapic was in virtual wire mode
    at boot time and to remember it. That is potentially a more accurate
    method of selecting which apic to place in virtual wire mode.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman

    This patch disables interrupt generation from the legacy pic on reboot.
    Now that there is a sys_device class, it should not be called while
    drivers are still using interrupts.

    There is a report about this breaking ACPI power off on some systems.
    http://bugme.osdl.org/show_bug.cgi?id=4041
    However the final comment seems to exonerate this code. So until
    I get more information I believe that was a false positive.

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: Eric W. Biederman

    It is ok to reserve resources > 4G on x86_64; struct resource is 64bit now :)

    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: "Maciej W. Rozycki"

    Fix a kexec problem which causes local APIC detection failure.

    The problem is that detect_init_APIC() is called early, before the command
    line has been processed. Therefore "lapic" (and "nolapic") have not been
    seen yet.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Eric Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • From: "Maciej W. Rozycki"

    Rename APIC_MODE_EXINT to APIC_MODE_EXTINT - I think it should be named
    after what the mode is called in the documentation.

    From: "Eric W. Biederman"

    I have reduced this patch to just the name change in the header. And
    integrated the changes into the patches that add those
    lines. Otherwise I ran into some ugly dependencies.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This patch adds a new preemption model: 'Voluntary Kernel Preemption'. The
    3 models can be selected from a new menu:

    (X) No Forced Preemption (Server)
    ( ) Voluntary Kernel Preemption (Desktop)
    ( ) Preemptible Kernel (Low-Latency Desktop)

    We still default to the stock (Server) preemption model.

    Voluntary preemption works by adding a cond_resched()
    (reschedule-if-needed) call to every might_sleep() check. It is lighter
    than CONFIG_PREEMPT, at the cost of not having as tight latencies. It
    represents a different latency/complexity/overhead tradeoff.
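
    Conceptually the mechanism boils down to something like this sketch (not
    the exact kernel macros):

      #ifdef CONFIG_PREEMPT_VOLUNTARY
      # define might_resched() cond_resched()    /* reschedule if needed */
      #else
      # define might_resched() do { } while (0)  /* compiles away */
      #endif

      /* every might_sleep() check now doubles as a voluntary
       * preemption point */
      #define might_sleep() might_resched()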

    It has no runtime impact at all if disabled. Here are size stats that show
    how the various preemption models impact the kernel's size:

    text data bss dec hex filename
    3618774 547184 179896 4345854 424ffe vmlinux.stock
    3626406 547184 179896 4353486 426dce vmlinux.voluntary +0.2%
    3748414 548640 179896 4476950 445016 vmlinux.preempt +3.5%

    voluntary-preempt is +0.2% of .text, preempt is +3.5%.

    This feature has been tested for many months by lots of people (and it's
    also included in the RHEL4 distribution, and earlier variants were in
    Fedora as well), and it's intended for users and distributions who don't
    want to use full-blown CONFIG_PREEMPT for one reason or another.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • The only sane way to clean up the current 3 lock_kernel() variants seems to
    be to remove the spinlock-based BKL implementations altogether, and to keep
    the semaphore-based one only. If we don't want to do that for whatever
    reason then I'm afraid we have to live with the current complexity. (But
    I'm open to other cleanup suggestions as well.)

    To explore this possibility we'll (at a minimum) have to know whether the
    semaphore-based BKL works fine on plain SMP too. The patch below enables
    this.

    The patch may make sense in isolation as well, as it might bring
    performance benefits: code that would formerly spin on the BKL spinlock
    will now schedule away and give up the CPU. It might introduce performance
    regressions as well, if any performance-critical code uses the BKL heavily
    and gets overscheduled due to the semaphore. I very much hope there is no
    such performance-critical codepath left though.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL
    preemption options into kernel/Kconfig.preempt. This, besides reducing
    source-code, also enables more centralized tweaking of preemption related
    options.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • ia64 changes similar to kernel/sched.c.

    Signed-off-by: Dinakar Guniguntala
    Acked-by: Paul Jackson
    Acked-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dinakar Guniguntala
     
  • Adds the core update_cpu_domains code and updates the cpusets documentation.

    Signed-off-by: Dinakar Guniguntala
    Acked-by: Paul Jackson
    Acked-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dinakar Guniguntala
     
  • The following patches add the dynamic sched domains functionality that was
    extensively discussed on lkml and lse-tech. I would like to see this added
    to -mm.

    o The main advantage of this feature is that it ensures that the scheduler
    load balancing code only balances against the cpus that are in the sched
    domain as defined by an exclusive cpuset, and not all of the cpus in the
    system. This removes any overhead due to the load balancing code trying to
    pull tasks outside of the cpu exclusive cpuset only to be prevented by
    the tasks' cpus_allowed mask.
    o cpu exclusive cpusets are useful for servers running orthogonal
    workloads such as RT applications requiring low latency and HPC
    applications that are throughput sensitive.

    o It provides a new API, partition_sched_domains, in sched.c
    that makes dynamic sched domains possible (see the sketch after this list).
    o cpu_exclusive cpusets are now associated with a sched domain,
    which means that users can dynamically modify the sched domains
    through the cpuset file system interface.
    o The ia64 sched domain code has been updated to support this feature as
    well.
    o Currently, this does not support hotplug. (However some of my tests
    indicate hotplug+preempt is currently broken.)
    o I have tested it extensively on x86.
    o This should have very minimal impact on performance as none of
    the fast paths are affected.
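
    A sketch of the shape of that entry point, as described above (treat the
    exact signature as illustrative):

      /* Rebuild the scheduler domains so that load balancing happens
       * independently within each of the two given cpu partitions. */
      void partition_sched_domains(cpumask_t *partition1,
                                   cpumask_t *partition2);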

    Signed-off-by: Dinakar Guniguntala
    Acked-by: Paul Jackson
    Acked-by: Nick Piggin
    Acked-by: Matthew Dobson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dinakar Guniguntala
     
  • Presently, a process without the capability CAP_SYS_NICE cannot change
    its own policy, which is OK.

    But it can also not decrease its RT priority (if scheduled with policy
    SCHED_RR or SCHED_FIFO), which is what this patch changes.

    The rationale is the same as for the nice value: a process should be
    able to require less priority for itself. Increasing the priority is
    still not allowed.

    This is for example useful if you give a multithreaded user process an RT
    priority, and the process would like to organize its internal threads
    using priorities as well. Then you can give the process the highest
    priority needed, N, and the process starts its threads with lower
    priorities: N-1, N-2...
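
    A hypothetical, self-contained illustration of the behaviour this patch
    permits:

      #include <sched.h>
      #include <stdio.h>

      int main(void)
      {
              /* Assume we were started without CAP_SYS_NICE at SCHED_FIFO
               * priority 10 (e.g. via chrt). Dropping to 9 used to fail
               * with EPERM; with this patch it succeeds. Raising the
               * priority back up would still be denied. */
              struct sched_param sp = { .sched_priority = 9 };

              if (sched_setparam(0, &sp))
                      perror("sched_setparam");
              return 0;
      }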

    The POSIX standard says that the permissions are implementation-specific,
    so I think we can do that.

    In a sense, it makes the permissions consistent whatever the policy is:
    with this patch, processes scheduled by SCHED_FIFO, SCHED_RR and
    SCHED_OTHER can all decrease their priority.

    From: Ingo Molnar

    cleaned up and merged to -mm.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olivier Croquette
     
  • Micro-optimize task requeueing in schedule() & clean up recalc_task_prio().

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen Shang
     
  • The maximum rebalance interval allowed by the multiprocessor balancing
    backoff is often not large enough to handle corner cases where there are
    lots of tasks pinned on a CPU. Suresh reported:

    I see system livelock's if for example I have 7000 processes
    pinned onto one cpu (this is on the fastest 8-way system I
    have access to).

    After this patch, the machine is reported to go well above this number.

    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Consolidate balance-on-exec with balance-on-fork. This is made easy by the
    sched-domains RCU patches.

    As well as the general goodness of code reduction, this allows the runqueues
    to be unlocked during balance-on-fork.

    schedstats is a problem. Maybe just have balance-on-event instead of
    distinguishing fork and exec?

    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • One of the problems with the multilevel balance-on-fork/exec is that it
    needs to jump through hoops to satisfy sched-domains' locking semantics
    (that is, you may traverse your own domain when not preemptible, and you
    may traverse others' domains when holding their runqueue lock).

    balance-on-exec had to potentially migrate between more than one CPU before
    finding a final CPU to migrate to, and balance-on-fork needed to potentially
    take multiple runqueue locks.

    So bite the bullet and make sched-domains go completely RCU. This actually
    simplifies the code quite a bit.
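
    The resulting locking rule is roughly the following sketch (illustrative;
    in the classic RCU of this era the read-side section is simply disabled
    preemption):

      /* walking the domain tree no longer needs any runqueue locks */
      rcu_read_lock();
      for_each_domain(this_cpu, sd) {
              /* inspect sd->span, sd->flags, ... */
      }
      rcu_read_unlock();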

    From: Ingo Molnar

    schedstats RCU fix, and a nice comment on for_each_domain, from Ingo.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • The fundamental problem that Suresh has with balance on exec and fork is that
    it only tries to balance the top level domain with the flag set.

    This was worked around by removing degenerate domains, but is still a problem
    if people want to start using more complex sched-domains, especially
    multilevel NUMA that ia64 is already using.

    This patch makes balance on fork and exec try balancing over not just the
    topmost domain with the flag set, but all the way down the domain tree.

    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Remove degenerate scheduler domains during the sched-domain init.

    For example on x86_64, we always have NUMA configured in. On Intel EM64T
    systems, top most sched domain will be of NUMA and with only one sched_group
    in it.

    With fork/exec balancing (Nick's recent fixes in the -mm tree), we always
    end up taking wrong decisions because of this topmost domain (as it
    contains only one group, find_idlest_group always returns NULL). We end up
    loading the HT package completely first, letting active load balancing
    kick in and correct it.

    In general, this patch also makes sense without Nick's recent fixes in -mm.

    From: Nick Piggin

    Modified to account for more than just sched_groups when scanning for
    degenerate domains, and to allow a runqueue's sd to go NULL rather than
    keeping a single degenerate domain around (this happens when you run with
    maxcpus=1).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Suresh Siddha
     
  • Fix the last 2 places that directly access a runqueue's sched-domain and
    assume it cannot be NULL.

    That allows the use of NULL for domain, instead of a dummy domain, to signify
    no balancing is to happen. No functional changes.

    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Instead of requiring architecture code to interact with the scheduler's
    locking implementation, provide a couple of defines that can be used by the
    architecture to request runqueue unlocked context switches, and ask for
    interrupts to be enabled over the context switch.
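
    Concretely, an architecture opts in from its headers with two defines; the
    names below match the ones this series uses, as far as I can tell, and are
    shown here as a sketch:

      /* in the architecture's asm/system.h */
      #define __ARCH_WANT_UNLOCKED_CTXSW       /* switch with rq unlocked */
      #define __ARCH_WANT_INTERRUPTS_ON_CTXSW  /* and with irqs enabled */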

    Also replace the "switch_lock" used by these architectures with an oncpu
    flag (note, not a potentially slow bitflag). This eliminates one
    bus-locked memory operation when context switching, and simplifies the
    task_running function.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • "Chen, Kenneth W"

    uninline task_timeslice() - reduces code footprint noticeably, and it's
    slowpath code.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Do some basic initial tuning.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Add SCHEDSTAT statistics for sched-balance-fork.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Reimplement the balance on exec balancing to be sched-domains aware. Use this
    to also do balance on fork balancing. Make x86_64 do balance on fork over the
    NUMA domain.

    The problem with the non-sched-domains-aware balancing became apparent on
    dual-core, multi-socket opterons. What we want is for the new tasks to be
    sent to a different socket, but more often than not we would first load up
    our sibling core, or fill two cores of a single remote socket, before
    selecting a new one.

    This gives large improvements to STREAM on such systems.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Remove the very aggressive idle stuff that has recently gone into 2.6 - it is
    going against the direction we are trying to go. Hopefully we can regain
    performance through other methods.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Do fewer affine wakeups. We're trying to reduce dbt2-pgsql idle time
    regressions here... make sure we don't move tasks the wrong way in an
    imbalance condition. Also, remove the cache coldness requirement from the
    calculation - this seems to induce sharp cutoff points where behaviour will
    suddenly change on some workloads if the load creeps slightly over or under
    some point. It is good for periodic balancing because in that case we
    otherwise have no other context to determine which task to move.

    But also make a minor tweak to "wake balancing" - the imbalance tolerance
    is now set at half the domain's imbalance, so we get the opportunity to do
    wake balancing before the more random periodic rebalancing gets performed.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Do CPU load averaging over a number of different intervals. Allow each
    interval to be chosen by passing a parameter to source_load and
    target_load: 0 is instantaneous, idx > 0 returns a decaying average with
    the most recent sample weighted at 2^(idx-1), up to a maximum of 3 (which
    could easily be increased).

    So generally a higher number will result in more conservative balancing.
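
    A self-contained sketch of the decaying-average idea (names and constants
    are illustrative, not the kernel's exact code):

      #define NR_LOAD_IDX 4  /* index 0 is instantaneous; 1..3 decay */

      static unsigned long cpu_load[NR_LOAD_IDX];

      static void update_cpu_load(unsigned long load_now)
      {
              int idx;

              cpu_load[0] = load_now;  /* idx 0: instantaneous load */
              for (idx = 1; idx < NR_LOAD_IDX; idx++) {
                      unsigned long scale = 1UL << idx;

                      /* the old average keeps weight (scale-1)/scale and
                       * the new sample gets weight 1/scale, so a larger
                       * idx decays more slowly: more conservative */
                      cpu_load[idx] = (cpu_load[idx] * (scale - 1)
                                       + load_now) / scale;
              }
      }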

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin