12 Sep, 2007

3 commits

  • Seems to me that this timer will only get started on platforms that say
    they don't want it?

    Signed-off-by: Tony Breeds
    Cc: Paul Mackerras
    Cc: Gabriel Paubert
    Cc: Zachary Amsden
    Acked-by: Thomas Gleixner
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Breeds
     
  • The semantics of call_usermodehelper_pipe() used to be that it would fork
    the helper, and wait for the kernel thread to be started. This was
    implemented by setting sub_info.wait to 0 (implicitly), and doing a
    wait_for_completion().

    As part of the cleanup done in 0ab4dc92278a0f3816e486d6350c6652a72e06c8,
    call_usermodehelper_pipe() was changed to pass 1 as the value for wait to
    call_usermodehelper_exec().

    This is equivalent to setting sub_info.wait to 1, which is a change from
    the previous behaviour. Using 1 instead of 0 causes
    __call_usermodehelper() to start the kernel thread running
    wait_for_helper(), rather than directly calling ____call_usermodehelper().

    The end result is that the calling kernel code blocks until the user mode
    helper finishes. As the helper is expecting input on stdin, and now no one
    is writing anything, everything locks up (observed in do_coredump).

    The fix is to change the 1 to UMH_WAIT_EXEC (aka 0), indicating that we
    want to wait for the kernel thread to be started, but not for the helper to
    finish.

    Signed-off-by: Michael Ellerman
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Ellerman
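
    A minimal sketch of the wait modes involved, matching the "aka 0" note
    above (illustrative fragment; UMH_WAIT_EXEC's value comes from the text,
    the neighbouring constants and the exact call are shown as assumptions):

        enum umh_wait {
            UMH_NO_WAIT   = -1, /* don't wait at all */
            UMH_WAIT_EXEC = 0,  /* wait only until the helper has been started */
            UMH_WAIT_PROC = 1,  /* wait until the helper process has finished */
        };

        /*
         * call_usermodehelper_pipe() must return once the helper is running,
         * because the caller still has to feed it data through the pipe:
         */
        return call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);  /* was: 1 */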
     
  • The futex list traversal on the compat side appears to have
    a bug.

    Its loop termination condition compares:

    while (compat_ptr(uentry) != &head->list)

    But that can't be right because "uentry" has the special
    "pi" indicator bit still potentially set at bit 0. This
    is cleared by fetch_robust_entry() into the "entry"
    return value.

    What this seems to mean is that the list won't terminate
    when list iteration gets back to the head. And we'll
    also process the list head like a normal entry, which could
    cause all kinds of problems.

    So we should check for equality with "entry". That pointer
    is of the non-compat type so we have to do a little casting
    to keep the compiler and sparse happy.

    The same problem can in theory occur with the 'pending'
    variable, although that has not been reported from users
    so far.

    Based on the original patch from David Miller.

    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: David Miller
    Signed-off-by: Arnd Bergmann
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
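
    To make the failure mode concrete, here is a small self-contained
    user-space analogue of the tagged-pointer walk (hypothetical structure
    and names; it only mimics the compat robust-list traversal and is not
    the kernel code):

        #include <stdio.h>
        #include <stdint.h>

        struct rlist {
            struct rlist *next;         /* bit 0 may carry a "pi" flag */
        };

        /* mimic fetch_robust_entry(): strip the flag bit, return a clean pointer */
        static struct rlist *fetch_entry(uintptr_t uentry, unsigned int *pi)
        {
            *pi = uentry & 1;
            return (struct rlist *)(uentry & ~(uintptr_t)1);
        }

        int main(void)
        {
            struct rlist head, a, b;
            unsigned int pi;
            int limit = 10;

            /* two entries, both carrying the flag bit, linking back to the head */
            head.next = (struct rlist *)((uintptr_t)&a | 1);
            a.next    = (struct rlist *)((uintptr_t)&b | 1);
            b.next    = (struct rlist *)((uintptr_t)&head | 1);

            /* buggy condition: the still-tagged value never equals &head, so
             * the walk runs past the head and never stops on its own */
            uintptr_t uentry = (uintptr_t)head.next;
            struct rlist *entry = fetch_entry(uentry, &pi);
            while ((struct rlist *)uentry != &head && limit--) {
                printf("buggy walk visits %p (pi=%u)\n", (void *)entry, pi);
                uentry = (uintptr_t)entry->next;
                entry  = fetch_entry(uentry, &pi);
            }
            printf("buggy walk stopped only because of the safety limit\n");

            /* fixed condition: compare the cleaned pointer against the head */
            uentry = (uintptr_t)head.next;
            entry  = fetch_entry(uentry, &pi);
            while (entry != &head) {
                printf("fixed walk visits %p (pi=%u)\n", (void *)entry, pi);
                uentry = (uintptr_t)entry->next;
                entry  = fetch_entry(uentry, &pi);
            }
            printf("fixed walk terminated at the head\n");
            return 0;
        }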
     

11 Sep, 2007

1 commit

  • When PTRACE_SYSCALL was used and then PTRACE_DETACH is used, the
    TIF_SYSCALL_TRACE flag is left set on the formerly-traced task. This
    means that when a new tracer comes along and does PTRACE_ATTACH, it's
    possible he gets a syscall tracing stop even though he's never used
    PTRACE_SYSCALL. This happens if the task was in the middle of a system
    call when the second PTRACE_ATTACH was done. The symptom is an
    unexpected SIGTRAP when the tracer thinks that only SIGSTOP should have
    been provoked by his ptrace calls so far.

    A few machines already fixed this in ptrace_disable (i386, ia64, m68k).
    But all other machines do not, and still have this bug. On x86_64, this
    constitutes a regression in IA32 compatibility support.

    Since all machines now use TIF_SYSCALL_TRACE for this, I put the
    clearing of TIF_SYSCALL_TRACE in the generic ptrace_detach code rather
    than adding it to every other machine's ptrace_disable.

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
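
    A minimal sketch of the generic change described above, assuming it sits
    next to the existing architecture hook in ptrace_detach() (fragment only,
    not the literal kernel/ptrace.c hunk):

        ptrace_disable(child);                           /* arch-specific hardware disable */
        clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE); /* generic: don't leave the flag
                                                            behind for the next tracer */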
     

05 Sep, 2007

8 commits

  • fix ideal_runtime:

    - do not scale it using niced_granularity();
    it is checked against sum_exec_delta, so it's wall-time, not fair-time.

    - move the whole check into __check_preempt_curr_fair()
    so that wakeup preemption can also benefit from the new logic.

    this also results in code size reduction:

    text data bss dec hex filename
    13391 228 1204 14823 39e7 sched.o.before
    13369 228 1204 14801 39d1 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Second preparatory patch for fix-ideal runtime:

    Mark prev_sum_exec_runtime at the beginning of our run, the same spot
    that adds our wait period to wait_runtime. This seems a more natural
    location to do this, and it also reduces the code a bit:

    text data bss dec hex filename
    13397 228 1204 14829 39ed sched.o.before
    13391 228 1204 14823 39e7 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Preparatory patch for fix-ideal-runtime:

    simplify __check_preempt_curr_fair(): get rid of the integer return.

    text data bss dec hex filename
    13404 228 1204 14836 39f4 sched.o.before
    13393 228 1204 14825 39e9 sched.o.after

    functionality is unchanged.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • rename RSR to SRR - 'RSR' is already defined on xtensa.

    found by Adrian Bunk.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • when cleaning sched-stats also clear prev_sum_exec_runtime.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • the cfs_rq->wait_runtime debug/statistics counter was not maintained
    properly - fix this.

    this also removes some code:

    text data bss dec hex filename
    13420 228 1204 14852 3a04 sched.o.before
    13404 228 1204 14836 39f4 sched.o.after

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • fix niced_granularity(). This resulted in under-scheduling for
    CPU-bound negative nice level tasks (and this in turn caused
    higher than necessary latencies in nice-0 tasks).

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
    First, fix the check

    if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)

    by replacing it with

    if (*imbalance < busiest_load_per_task)

    as the current check is always false for nice-0 tasks: SCHED_LOAD_SCALE_FUZZ
    equals busiest_load_per_task for nice-0 tasks, so the condition reduces to
    *imbalance < 0, which can never hold for the unsigned imbalance.

    With the above change, the imbalance was getting reset to 0 in the
    corner-case condition, making the FUZZ logic fail. Fix that by not
    corrupting the imbalance, and changing it only when the HT/MC optimization
    is found to be needed.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

01 Sep, 2007

1 commit


31 Aug, 2007

6 commits

  • Spotted by taoyue and Jeremy Katz.

    collect_signal:                         sigqueue_free:

        list_del_init(&first->list);
                                            if (!list_empty(&q->list)) {
                                                    // not taken
                                            }
                                            q->flags &= ~SIGQUEUE_PREALLOC;

        __sigqueue_free(first);             __sigqueue_free(q);

    Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
    obviously bad implications.

    In particular, this double free breaks the array_cache->avail logic, so the
    same sigqueue could be "allocated" twice, and the bug can manifest itself via
    the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.

    Hopefully this can explain these mysterious bug-reports, see

    http://marc.info/?t=118766926500003
    http://marc.info/?t=118466273000005

    Alexey Dobriyan reports that this patch makes the difference for the
    testcase, but nobody has access to the application which originally
    exposed the problem.

    Also, this patch removes the tasklist lock/unlock; ->siglock is enough.

    Signed-off-by: Oleg Nesterov
    Cc: taoyue
    Cc: Jeremy Katz
    Cc: Sukadev Bhattiprolu
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Roland McGrath
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Cedric Le Goater
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Mariusz Kozlowski reported lockdep's warning:

    > =================================
    > [ INFO: inconsistent lock state ]
    > 2.6.23-rc2-mm1 #7
    > ---------------------------------
    > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
    > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
    > (&tp->lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
    > {in-hardirq-W} state was registered at:
    > [] __lock_acquire+0x949/0x11ac
    > [] lock_acquire+0x99/0xb2
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] handle_IRQ_event+0x28/0x59
    > [] handle_level_irq+0xad/0x10b
    > [] do_IRQ+0x93/0xd0
    > [] common_interrupt+0x2e/0x34
    ...
    > other info that might help us debug this:
    > 1 lock held by ifconfig/5492:
    > #0: (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
    >
    > stack backtrace:
    ...
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] free_irq+0x11b/0x146
    > [] rtl8139_close+0x8a/0x14a [8139too]
    > [] dev_close+0x57/0x74
    ...

    This shows that a driver's irq handler was running both in hard interrupt
    and process contexts with irqs enabled. The latter was done during a
    free_irq() call and was possible only with CONFIG_DEBUG_SHIRQ enabled.
    That case was fixed by another patch.

    But a similar problem is possible with request_irq(): any locks taken from
    the irq handler could be vulnerable - especially with soft interrupts. This
    patch fixes it by disabling local interrupts during the handler's run, as
    sketched below. (It seems disabling softirqs should be enough, but this
    needs more checking for possible races and other special cases.)

    Reported-by: Mariusz Kozlowski
    Signed-off-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
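
    A sketch of the change described above, assuming the spot being patched is
    the CONFIG_DEBUG_SHIRQ test-fire in request_irq() (fragment; details may
    differ from the actual kernel/irq/manage.c hunk):

        #ifdef CONFIG_DEBUG_SHIRQ
            if (irqflags & IRQF_SHARED) {
                unsigned long flags;

                /*
                 * Fire the handler once so unprepared shared-IRQ drivers blow
                 * up here rather than later -- but with local irqs disabled,
                 * so the handler never runs in process context with
                 * interrupts enabled.
                 */
                local_irq_save(flags);
                handler(irq, dev_id);
                local_irq_restore(flags);
            }
        #endif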
     
  • Dependencies of CONFIG_SUSPEND and CONFIG_HIBERNATION introduced by commit
    296699de6bdc717189a331ab6bbe90e05c94db06 "Introduce CONFIG_SUSPEND for
    suspend-to-Ram and standby" are incorrect, as they don't cover the facts that
    (1) not all architectures support suspend and (2) SMP hibernation is only
    possible on X86 and PPC64 (if CONFIG_PPC64_SWSUSP is set).

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Spotted by Marcin Kowalczyk.

    sys_setpgid(child) fails if the child was forked by a sub-thread.

    Fix the "is it our child" check. The previous commit
    ee0acf90d320c29916ba8c5c1b2e908d81f5057d was not complete.

    (this patch asks for the new same_thread_group() helper, but mainline doesn't
    have it yet).

    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Cc:
    Tested-by: "Marcin 'Qrczak' Kowalczyk"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • taskstats.ac_exitcode is assigned to task_struct.exit_code in bacct_add_tsk()
    through the following kernel function calls:

    do_exit()
    taskstats_exit()
    fill_pid()
    bacct_add_tsk()

    The problem is that in do_exit(), task_struct.exit_code is set to 'code' only
    after taskstats_exit() has been called. So we need to move the assignment
    before taskstats_exit().

    Signed-off-by: Jonathan Lim
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonathan Lim
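
    A minimal sketch of the reordering in do_exit() that the commit calls for
    (fragment; surrounding code omitted):

        /* record the exit code before the accounting hook samples it ... */
        tsk->exit_code = code;
        /* ... so that bacct_add_tsk() sees the real value via the call chain
         * taskstats_exit() -> fill_pid() -> bacct_add_tsk(): */
        taskstats_exit(tsk, group_dead);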
     

28 Aug, 2007

7 commits

  • cleanup: we have the 'se' and 'curr' entity-pointers already,
    no need to use p->se and current->se.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
     
  • small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes'
    statistics counters missed newly forked tasks and thus had a constant
    negative skew. Fix this.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
     
  • Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which
    is disabled by default at the moment): it relied on se.wait_start_fair
    being 0, but update_stats_wait_end() did not recognize a 0 value,
    so instead of 'skipping' the initial interval we gave the new child
    a maximum boost of +runtime-limit ...

    (No impact on the default kernel, but nice to fix for completeness.)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
     
  • update the fair-clock before using it for the key value.

    [ mingo@elte.hu: small cleanups. ]

    Signed-off-by: Ting Yang
    Signed-off-by: Ingo Molnar
    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra

    Ting Yang
     
  • de-HZ-ification of the granularity defaults unearthed a pre-existing
    property of CFS: while it correctly converges to the granularity goal,
    it does not prevent run-time fluctuations in the range of
    [-gran ... 0 ... +gran].

    With the increase of the granularity due to the removal of HZ
    dependencies, this becomes visible in chew-max output (with 5 tasks
    running):

    out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
    out: 27 . 27. 32 | flu: 0 . 0 | ran: 17 . 13 | per: 44 . 40
    out: 27 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 36 . 40
    out: 29 . 27. 32 | flu: 2 . 0 | ran: 17 . 13 | per: 46 . 40
    out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40
    out: 29 . 27. 32 | flu: 0 . 0 | ran: 18 . 13 | per: 47 . 40
    out: 28 . 27. 32 | flu: 0 . 0 | ran: 9 . 13 | per: 37 . 40

    average slice is the ideal 13 msecs and the period is picture-perfect 40
    msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
    mechanism in CFS to keep that from happening: it's a perfectly valid
    solution that CFS finds.

    to fix this we add a granularity/preemption rule that knows about
    the "target latency", which makes tasks that run longer than the ideal
    latency run a bit less. The simplest approach is to simply decrease the
    preemption granularity when a task overruns its ideal latency. For this
    we have to track how much the task executed since its last preemption.

    ( this adds a new field to task_struct, but we can eliminate that
    overhead in 2.6.24 by putting all the scheduler timestamps into an
    anonymous union. )

    with this change in place, chew-max output is fluctuation-less all
    around:

    out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
    out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
    out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
    out: 28 . 27. 39 | flu: 0 . 2 | ran: 13 . 13 | per: 41 . 40
    out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40
    out: 28 . 27. 39 | flu: 0 . 1 | ran: 13 . 13 | per: 41 . 40

    this patch has no impact on any fastpath or on any globally observable
    scheduling property. (unless you have sharp enough eyes to see
    millisecond-level ruckles in glxgears smoothness :-)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
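
    Schematically, the new rule boils down to a tick-time check along these
    lines, using the prev_sum_exec_runtime bookkeeping that the 05 Sep entries
    above refine (illustrative pseudo-kernel code, not the literal
    sched_fair.c hunk):

        /* how long has the current task run since it last got the CPU? */
        u64 ran = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;

        /* it has overrun its ideal slice: be more eager to preempt it */
        if (ran > ideal_runtime)
            resched_task(rq_of(cfs_rq)->curr);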
     
  • There is an Amarok song switch time increase (regression) under
    hefty load.

    What is happening is that sleeper_bonus is never consumed, and only
    rarely goes below runtime_limit, so for the most part, Amarok isn't
    getting any bonus at all. We're keeping sleeper_bonus right at
    runtime_limit (sched_latency == sched_runtime_limit == 40ms) forever, i.e.
    we don't consume it if we're lower than that, and don't add to it if we're
    above it. One Amarok thread waking (or anybody else) will push us past the
    threshold, so the next thread waking gets nada, but will reap pain from
    the previous thread waking until we drop back to runtime_limit. It
    looks to me like under load, some random task gets a bonus, and
    everybody else pays, whether deserving or not.

    This diff fixed the regression for me at any load rate.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Mike Galbraith
     
  • Fix bogus DEBUG_PREEMPT warning on x86_64, when cpu brought online after
    bootup: current_is_keventd is right to note its use of smp_processor_id
    is preempt-safe, but should use raw_smp_processor_id to avoid the warning.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

26 Aug, 2007

4 commits

    The runtime limit and the wakeup granularity used to be a function of
    granularity; that was incorrect, and they were changed to sched_latency.

    Fix this to make wakeup granularity a function of min-granularity,
    and the runtime limit equal to latency.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • due to adaptive granularity scheduling the role of sched_granularity
    has changed to "minimum granularity", so rename the variable (and the
    tunable) accordingly.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
    Instead of specifying the preemption granularity, specify the wanted
    latency. By fixing the granularity to a constant, the wakeup latency
    becomes a function of the number of running tasks on the rq.

    Invert this relation.

    sysctl_sched_granularity becomes a minimum for the dynamic granularity
    computed from the new sysctl_sched_latency.

    Then use this latency to make more intelligent granularity decisions: if
    there are fewer tasks running, we can schedule more coarsely (see the
    sketch below). This helps performance while still always meeting the
    latency target.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
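
    In other words, the dynamic granularity roughly follows this relation
    (schematic only; the actual sched_fair.c arithmetic may differ):

        /*
         * Divide the latency target among the runnable tasks, but never let
         * the dynamic granularity drop below the configured minimum:
         */
        unsigned long gran = sysctl_sched_latency / nr_running;

        if (gran < sysctl_sched_granularity)
            gran = sysctl_sched_granularity;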
     
  • Make the lockdep sysctls not depend on CONFIG_SCHED_DEBUG.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

25 Aug, 2007

8 commits

    fix task startup penalty miscalculation: sysctl_sched_granularity is an
    unsigned int and wait_runtime is a long, so we first have to convert it
    to long before turning it negative ... (see the demo below)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
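
    The underlying C pitfall is easy to reproduce in user space; on a 64-bit
    machine this self-contained demo (unrelated to the kernel sources) prints
    a huge positive value for the uncast version and the intended negative
    penalty for the cast one:

        #include <stdio.h>

        int main(void)
        {
            unsigned int granularity = 10000000;   /* e.g. 10 ms in ns */
            long wait_runtime;

            wait_runtime = -granularity;           /* unsigned negation wraps around */
            printf("without cast: %ld\n", wait_runtime);

            wait_runtime = -(long)granularity;     /* convert first, then negate */
            printf("with cast:    %ld\n", wait_runtime);
            return 0;
        }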
     
  • current code:

    delta = calc_delta_mine(delta_exec, curr->load.weight, lw);
    delta = min((u64)delta, cfs_rq->sleeper_bonus);

    Notice that this calc_delta_mine() line is exactly delta_mine, which
    gives:

    delta = min((u64)delta_mine, cfs_rq->sleeper_bonus);

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • current code:

    delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
    delta = calc_delta_mine(delta, curr->load.weight, lw);
    delta = min((u64)delta, cfs_rq->sleeper_bonus);

    drop the first min(), because we clip against sleeper_bonus in the 3rd line
    again. That gives:

    delta = calc_delta_mine(delta_exec, curr->load.weight, lw);
    delta = min((u64)delta, cfs_rq->sleeper_bonus);

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • make the bonus balance more consistent: do not hand out a bonus if
    there's too much in flight already, and only deduct from a runner as much
    as it has the capacity to give. This makes the bonus engine a zero-sum
    game (as intended).

    this also simplifies the code:

    text data bss dec hex filename
    34770 2998 24 37792 93a0 sched.o.before
    34749 2998 24 37771 938b sched.o.after

    and it also avoids overscheduling in sleep-happy workloads like
    hackbench.c.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Mitchell Erblich suggested a quality-of-implementation change to
    not requeue SCHED_RR tasks if there's only a single task on the
    runqueue, by checking for rq->nr_running == 1.

    Provide a more efficient implementation of that, which checks only that
    particular RT priority queue (see the sketch below).

    [ From: mingo@elte.hu ]

    Also first requeue the task then set need_resched - results in slightly
    better machine-instruction ordering. Also clean up the code a bit.

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar

    Dmitry Adamushko
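
    A sketch of the per-priority-queue check referred to above, assuming the
    2.6.23-era rt runqueue layout where each task sits on its priority level's
    run_list (fragment only):

        /*
         * Requeue to the end of the queue only if the task is not the sole
         * element on its own RT priority queue -- cheaper than consulting
         * rq->nr_running; requeue first, then mark for reschedule:
         */
        if (p->run_list.prev != p->run_list.next) {
            requeue_task_rt(rq, p);
            set_tsk_need_resched(p);
        }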
     
  • Remove trivial conditional branch in Linux scheduler's
    can_migrate_task() function.

    text data bss dec hex filename
    34770 2998 24 37792 93a0 sched.o.before
    34757 2998 24 37779 9393 sched.o.after

    Signed-off-by: Sven-Thorsten Dietrich
    Signed-off-by: Ingo Molnar

    Sven-Thorsten Dietrich
     
  • remove HZ dependency from the granularity default. Use 10 msec for
    the base granularity, 1 msec for wakeup granularity and 25 msec for
    batch wakeup granularity. (These defaults are close to the values
    that the default HZ=250 setting got previously, and thus it's the
    most common setting.)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • when I built with CONFIG_FAIR_GROUP_SCHED=y, I needed the following change
    to make things right.

    [ From: mingo@elte.hu ]

    this config option is not upstream-configurable right now but lets fix
    this for completeness.

    Signed-off-by: Bruce Ashfield
    Signed-off-by: Ingo Molnar

    Bruce Ashfield
     

24 Aug, 2007

2 commits