10 Oct, 2008
1 commit
-
lock_task_sighand() makes sure task->sighand is protected,
so we do not need rcu_read_lock().
[ exec() will get task->sighand->siglock before changing task->sighand! ]

But code using rcu_read_lock() _just_ to protect lock_task_sighand()
only appears in procfs. (And some code in procfs uses lock_task_sighand()
without such redundant protection.)

Other subsystems may put lock_task_sighand() into an rcu_read_lock()
critical region, but these rcu_read_lock() calls are used for protecting
"for_each_process()", "find_task_by_vpid()" etc., not for protecting
lock_task_sighand().

Signed-off-by: Lai Jiangshan
[ok from Oleg]
Signed-off-by: Alexey Dobriyan
27 Jun, 2008
2 commits
-
show all the schedstats in /debug/sched_debug as well.
Signed-off-by: Peter Zijlstra
Cc: Srivatsa Vaddagiri
Cc: Mike Galbraith
Signed-off-by: Ingo Molnar
-
Try again..
Initial commit: 18d95a2832c1392a2d63227a7a6d433cb9f2037e
Revert: 6363ca57c76b7b83639ca8c83fc285fa26a7880e

Signed-off-by: Peter Zijlstra
Cc: Srivatsa Vaddagiri
Cc: Mike Galbraith
Signed-off-by: Ingo Molnar
20 Jun, 2008
1 commit
-
Signed-off-by: Peter Zijlstra
Cc: "Daniel K."
Signed-off-by: Ingo Molnar
29 May, 2008
1 commit
-
Yanmin Zhang reported:
Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1.
It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito.

With bisect, I located the following patch:
| 18d95a2832c1392a2d63227a7a6d433cb9f2037e is first bad commit
| commit 18d95a2832c1392a2d63227a7a6d433cb9f2037e
| Author: Peter Zijlstra
| Date: Sat Apr 19 19:45:00 2008 +0200
|
| sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior.
Bisected-by: Yanmin Zhang
Signed-off-by: Ingo Molnar
06 May, 2008
1 commit
-
this replaces the rq->clock stuff (and possibly cpu_clock()).

- architectures that have an 'imperfect' hardware clock can set
  CONFIG_HAVE_UNSTABLE_SCHED_CLOCK

- the 'jiffie' window might be superfluous when we update tick_gtod
  before the __update_sched_clock() call in sched_clock_tick()

- cpu_clock() might be implemented as:

    sched_clock_cpu(smp_processor_id())

  if the accuracy proves good enough - how far can TSC drift in a
  single jiffie when considering the filtering and idle hooks?

[ mingo@elte.hu: various fixes and cleanups ]
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
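
The filtering idea in the commit above can be illustrated with a small userspace sketch. It assumes the filtered clock is clamped into a one-tick window above the trusted timestamp taken at the last tick, and that it never moves backwards. The names `clock_data`, `TICK_NSEC_SKETCH` and `update_clock` are hypothetical and are not taken from the kernel source; the real code may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick length: 1 ms (as with HZ=1000). */
#define TICK_NSEC_SKETCH 1000000ULL

struct clock_data {
    uint64_t tick_gtod; /* trusted timestamp taken at the last tick */
    uint64_t prev_raw;  /* raw, possibly unstable, clock at last update */
    uint64_t clock;     /* filtered clock handed to the scheduler */
};

/* Fold a new raw reading into the filtered clock and return it. */
static uint64_t update_clock(struct clock_data *cd, uint64_t raw_now)
{
    uint64_t min_clock = cd->tick_gtod;
    uint64_t max_clock = min_clock + TICK_NSEC_SKETCH;
    int64_t delta = (int64_t)(raw_now - cd->prev_raw);
    uint64_t clock = cd->clock;

    assert(cd != 0);

    /* Only a forward-moving raw clock contributes time. */
    if (delta > 0)
        clock += (uint64_t)delta;

    /* Keep the result inside the one-tick window. */
    if (clock > max_clock)
        clock = max_clock;
    if (clock < min_clock)
        clock = min_clock;

    /* Never let the filtered clock go backwards. */
    if (clock < cd->clock)
        clock = cd->clock;

    cd->prev_raw = raw_now;
    cd->clock = clock;
    return clock;
}
```

A raw clock that jumps backwards leaves the filtered value unchanged, and a raw clock that jumps far ahead is capped at one tick past `tick_gtod`.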
01 May, 2008
1 commit
-
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide. Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.

Signed-off-by: Roman Zippel
Cc: Avi Kivity
Cc: Russell King
Cc: Geert Uytterhoeven
Cc: Ralf Baechle
Cc: David Howells
Cc: Jeff Dike
Cc: Ingo Molnar
Cc: "David S. Miller"
Cc: Patrick McHardy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
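
As a rough illustration of the naming convention in the rename above, here is a userspace stand-in for `div64_u64()`. It assumes a platform with native 64-bit division; the generic kernel implementation for 32-bit architectures does more work than this, and the `assert()` is an addition for the sketch only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unsigned 64-bit by unsigned 64-bit division. The "_u64" suffix
 * spells out the type of the divisor, matching the other helpers.
 */
static inline uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0); /* dividing by zero is the caller's bug */
    return dividend / divisor;
}
```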
29 Apr, 2008
1 commit
-
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
are set up before gluing the PDE to the main tree.

Signed-off-by: Denis V. Lunev
Cc: Alexey Dobriyan
Cc: "Eric W. Biederman"
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Apr, 2008
3 commits
-
Signed-off-by: Ingo Molnar
-
Add some extra debug output so we can get a better overview of the
full hierarchy.

We print the cgroup path after each cfs_rq, so we can see what group
we're looking at.

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
it's unused.
Signed-off-by: Ingo Molnar
19 Mar, 2008
1 commit
-
improve affine wakeups. Maintain the 'overlap' metric based on CFS's
sum_exec_runtime - which means the amount of time a task executes
after it wakes up some other task.

Use the 'overlap' for the wakeup decisions: if the 'overlap' is short,
it means there's strong workload coupling between this task and the
woken up task. If the 'overlap' is large then the workload is decoupled
and the scheduler will move them to separate CPUs more easily.

( Also slightly move the preempt_check within try_to_wake_up() - this has
no effect on functionality but allows 'early wakeups' (for still-on-rq
tasks) to be correctly accounted as well.)

Signed-off-by: Ingo Molnar
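
The 'overlap' heuristic described above could be sketched roughly as follows. This is a userspace approximation under stated assumptions: the struct `task_sketch`, the 1/8 averaging weight and the caller-supplied threshold are all hypothetical choices for illustration, not values or names confirmed by the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct task_sketch {
    uint64_t sum_exec_runtime; /* total CPU time consumed, in ns */
    uint64_t last_wakeup;      /* sum_exec_runtime when it last woke a task */
    uint64_t avg_overlap;      /* running average of the overlap, in ns */
};

/* Move the average 1/8 of the way towards the new sample. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
    int64_t diff = (int64_t)(sample - *avg);

    *avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/* Record how long the waker kept running after its last wakeup. */
static void account_overlap(struct task_sketch *waker)
{
    update_avg(&waker->avg_overlap,
               waker->sum_exec_runtime - waker->last_wakeup);
}

/* Short overlap on both sides suggests coupling: keep them together. */
static bool prefer_same_cpu(const struct task_sketch *waker,
                            const struct task_sketch *wakee,
                            uint64_t threshold_ns)
{
    return waker->avg_overlap < threshold_ns &&
           wakee->avg_overlap < threshold_ns;
}
```

A task that blocks almost immediately after waking its partner accumulates a small average and is treated as coupled; a task that keeps running for a long time afterwards is not.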
26 Jan, 2008
2 commits
-
Right now, the linux kernel (with scheduler statistics enabled) keeps track
of the maximum time a process is waiting to be scheduled. While the maximum
is a very useful metric, tracking average and total is equally useful
(at least for latencytop) to figure out the accumulated effect of scheduler
delays. The accumulated effect is important to judge the performance impact
of scheduler tuning/behavior.

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
-
We monitor clock overflows, let's also monitor clock underflows.
Signed-off-by: Guillaume Chazarain
Signed-off-by: Ingo Molnar
31 Dec, 2007
1 commit
-
Meelis Roos reported these warnings on sparc64:

  CC kernel/sched.o
  In file included from kernel/sched.c:879:
  kernel/sched_debug.c: In function 'nsec_high':
  kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

the debug check in do_div() is over-eager here, because the long long
is always positive in these places. Mark this by casting them to
unsigned long long.

no change in code output:

   text    data     bss     dec     hex filename
  51471    6582     376   58429    e43d sched.o.before
  51471    6582     376   58429    e43d sched.o.after

md5:
  7f7729c111f185bf3ccea4d542abc049 sched.o.before.asm
  7f7729c111f185bf3ccea4d542abc049 sched.o.after.asm

Signed-off-by: Ingo Molnar
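
To make the cast in the commit above concrete, here is a userspace sketch of splitting a nanosecond value into a millisecond part and a six-digit remainder, doing the division on an unsigned value. It uses plain `/` and `%` in place of the kernel's do_div(), and `format_nsec()` is a hypothetical helper added for illustration, so treat it as an approximation of the idea only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Whole milliseconds of a nanosecond value; the sign is carried here. */
static long long nsec_high(unsigned long long nsec)
{
    if ((long long)nsec < 0) {
        nsec = -nsec; /* magnitude; unsigned negation is well defined */
        return -(long long)(nsec / 1000000ULL);
    }
    return (long long)(nsec / 1000000ULL);
}

/* Sub-millisecond remainder in nanoseconds, always non-negative. */
static unsigned long nsec_low(unsigned long long nsec)
{
    if ((long long)nsec < 0)
        nsec = -nsec;
    return (unsigned long)(nsec % 1000000ULL);
}

/* Hypothetical helper: format as "<ms>.<6-digit remainder>". */
static int format_nsec(char *buf, size_t len, long long nsec)
{
    return snprintf(buf, len, "%lld.%06lu",
                    nsec_high((unsigned long long)nsec),
                    nsec_low((unsigned long long)nsec));
}
```

Because the operand handed to the division is always the unsigned magnitude, the signedness check that triggered the warning has nothing to complain about.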
28 Nov, 2007
1 commit
-
clean up overlong line in kernel/sched_debug.c.
Signed-off-by: Ingo Molnar
27 Nov, 2007
1 commit
-
bump version of kernel/sched_debug.c and remove CFS version
information from it.

Signed-off-by: Ingo Molnar
10 Nov, 2007
1 commit
-
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.

no functionality changed.

[ mingo@elte.hu: some fixlets. ]
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
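
One way to picture "keeping the ratio maintained internally" is sketched below: the user sets latency and minimum granularity, and the ratio is derived from them rather than exposed. The struct and function names are hypothetical, the 20 ms and 4 ms figures in the usage note are illustrative only, and rejecting zero inputs is an assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

struct sched_tunables {
    uint64_t latency_ns;         /* targeted scheduling period */
    uint64_t min_granularity_ns; /* smallest slice a task should get */
    unsigned long nr_latency;    /* derived ratio, rounded up */
};

/* Recompute the internal ratio on every change; -1 on bad input. */
static int set_tunables(struct sched_tunables *t,
                        uint64_t latency_ns, uint64_t min_granularity_ns)
{
    if (latency_ns == 0 || min_granularity_ns == 0)
        return -1; /* reject instead of dividing by zero later */

    t->latency_ns = latency_ns;
    t->min_granularity_ns = min_granularity_ns;
    t->nr_latency = (unsigned long)
        ((latency_ns + min_granularity_ns - 1) / min_granularity_ns);
    return 0;
}

/* The period stretches once more tasks run than the ratio allows. */
static uint64_t sched_period(const struct sched_tunables *t,
                             unsigned long nr_running)
{
    if (nr_running > t->nr_latency)
        return t->min_granularity_ns * nr_running;
    return t->latency_ns;
}
```

With a latency of 20 ms and a minimum granularity of 4 ms, the derived ratio is 5: up to five runnable tasks share a 20 ms period, and eight tasks stretch it to 32 ms.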
25 Oct, 2007
1 commit
-
Lockdep noticed that this lock can also be taken from hardirq context, and can
thus not unconditionally disable/enable irqs.

WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
[show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
[show_trace+18/32] show_trace+0x12/0x20
[dump_stack+22/32] dump_stack+0x16/0x20
[trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
[_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
[sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
[show_state_filter+326/368] show_state_filter+0x146/0x170
[sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
[__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
[handle_sysrq+40/64] handle_sysrq+0x28/0x40
[kbd_event+1045/1680] kbd_event+0x415/0x690
[input_pass_event+206/208] input_pass_event+0xce/0xd0
[input_handle_event+170/928] input_handle_event+0xaa/0x3a0
[input_event+95/112] input_event+0x5f/0x70
[atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
[serio_interrupt+59/128] serio_interrupt+0x3b/0x80
[i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
[handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
[handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
[do_IRQ+64/128] do_IRQ+0x40/0x80
[common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
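
The problem reported by lockdep above can be shown with a toy model of the interrupt-enable flag. Everything here is a userspace simulation: the flag, the helper names and the two `show_*` functions are hypothetical, and the assumption is that the fix amounts to saving and restoring the caller's interrupt state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the CPU interrupt-enable flag. */
static bool irqs_enabled = true;

static void irq_disable(void) { irqs_enabled = false; }
static void irq_enable(void)  { irqs_enabled = true; }

/* Disable interrupts and return the previous state. */
static bool irq_save(void)
{
    bool flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

/* Put the interrupt state back to what the caller had. */
static void irq_restore(bool flags) { irqs_enabled = flags; }

/* Unconditional variant: always leaves interrupts enabled on exit. */
static void show_unconditional(void)
{
    irq_disable();
    /* ... walk the task list under the lock ... */
    irq_enable();
}

/* Save/restore variant: leaves the state as the caller had it. */
static void show_saved(void)
{
    bool flags = irq_save();

    /* ... walk the task list under the lock ... */
    irq_restore(flags);
}
```

Called with interrupts already off, as in the sysrq path in the trace, `show_unconditional()` returns with them switched on, while `show_saved()` leaves them off.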
19 Oct, 2007
1 commit
-
schedstat is useful in investigating CPU scheduler behavior. Ideally,
I think it is beneficial to have it on all the time. However, the
cost of turning it on in production system is quite high, largely due
to number of events it collects and also due to its large memory
footprint.

Most of the fields probably don't need to be full 64-bit on 64-bit
arch. Rolling over 4 billion events will most likely take a long time
and user space tool can be made to accommodate that. I'm proposing
kernel to cut back most of variable width on 64-bit system. (note,
the following patch doesn't affect 32-bit system).

Signed-off-by: Ken Chen
Signed-off-by: Ingo Molnar
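
The space saving and the user-space accommodation mentioned above might look like the sketch below. The two structs and their field names are hypothetical stand-ins for a group of event counters, and `counter_delta()` is an illustrative helper for a monitoring tool, not an existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event counters at full 64-bit width. */
struct stats_wide {
    uint64_t yld_count, sched_count, sched_goidle, ttwu_count;
};

/* The same counters cut back to 32 bits. */
struct stats_narrow {
    uint32_t yld_count, sched_count, sched_goidle, ttwu_count;
};

/*
 * Events between two samples of a 32-bit counter. Unsigned
 * arithmetic wraps modulo 2^32, so one rollover between the
 * samples still yields the right difference.
 */
static uint32_t counter_delta(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}
```

The narrow struct is half the size of the wide one, and a tool that samples often enough to see at most one rollover per interval loses no information.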
15 Oct, 2007
20 commits
-
In general, struct file_operations are const in the kernel, to not have
false cacheline sharing and to catch bugs at compiletime with accidental
writes to them. The new scheduler code introduces a new non-const one;
fix this up.

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
-
add new migration statistics when SCHED_DEBUG and SCHEDSTATS
are enabled. Available in /proc//sched.

Signed-off-by: Ingo Molnar
-
increase width of debug line - in preparation of more debugging info.
Signed-off-by: Ingo Molnar
-
Add tunables in sysfs to modify a user's cpu share.
A directory is created in sysfs for each new user in the system.
/sys/kernel/uids//cpu_share
Reading this file returns the cpu shares granted for the user.
Writing into this file modifies the cpu share for the user. Only an
administrator is allowed to modify a user's cpu share.

Ex:
# cd /sys/kernel/uids/
# cat 512/cpu_share
1024
# echo 2048 > 512/cpu_share
# cat 512/cpu_share
2048
#

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
-
cleanup: rename task_grp to task_group. No need to save two characters
and 'grp' is annoying to read.

Signed-off-by: Ingo Molnar
-
Fix coding style issues reported by Randy Dunlap and others
Signed-off-by: Dhaval Giani
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner
-
speed up and simplify vslice calculations.
[ From: Mike Galbraith : build fix ]
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
-
rename all 'cnt' fields and variables to the less yucky 'count' name.
yuckage noticed by Andrew Morton.
no change in code, other than the /proc/sched_debug bkl_count string got
a bit larger:

   text    data     bss     dec     hex filename
  38236    3506      24   41766    a326 sched.o.before
  38240    3506      24   41770    a32a sched.o.after

Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner
-
debug feature: check how well we schedule within a reasonable
vruntime 'spread' range. (note that CPU overload can increase
the spread, so this is not a hard condition, but normal loads
should be within the spread.)

Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
-
more width for parameter printouts in /proc/sched_debug.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
print the current value of all tunables in /proc/sched_debug output.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
build fix for the SCHED_DEBUG && !SCHEDSTATS case.
Signed-off-by: S.Ceglar Onur
Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner
-
add per task and per rq BKL usage statistics.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
Enable user-id based fair group scheduling. This is useful for anyone
who wants to test the group scheduler w/o having to enable
CONFIG_CGROUPS.

A separate scheduling group (i.e struct task_grp) is automatically created for
every new user added to the system. Upon uid change for a task, it is made to
move to the corresponding scheduling group.

A /proc tunable (/proc/root_user_share) is also provided to tune root
user's quota of cpu bandwidth.

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
- print nr_running and load information for cfs_rq in /proc/sched_debug
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
fix formatting of /proc/sched_debug
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
enhance debug output by changing 12345678 nsecs to 12.345678 output,
this is more human-readable.

Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
print the correct amount of dashes in /proc/sched_debug.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
Get rid of 'sched_entity::fair_key'.
As a side effect, 'current' is not kept within the tree for
SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code
(e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes
them (e.g. a single update_curr() now vs. dequeue/enqueue() before in
entity_tick()).

Signed-off-by: Dmitry Adamushko
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
remove wait_runtime based fields and features, now that the CFS
math has been changed over to the vruntime metric.

Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Signed-off-by: Mike Galbraith
Reviewed-by: Thomas Gleixner