19 Oct, 2007

1 commit

  • Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b
    during 2.6.9 development. The commit introduced the io_wait field, which
    was write-only then and still remains write-only.

    Also garbage collect macros which "use" io_wait.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

18 Oct, 2007

1 commit


17 Oct, 2007

11 commits

  • For those who don't care about CONFIG_SECURITY.

    Signed-off-by: Alexey Dobriyan
    Cc: "Serge E. Hallyn"
    Cc: Casey Schaufler
    Cc: James Morris
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • There is a nice 2-byte hole after the struct task_struct::ioprio field
    into which we can put two 1-byte fields: ->fpu_counter and ->oomkilladj.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • For those who deselect POSIX message queues.

    Reduces the SLAB size of user_struct from 64 to 32 bytes here, and the
    SLUB size from 40 bytes to 32 bytes.

    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • robust_list, compat_robust_list, pi_state_list and pi_state_cache are
    only really used if futexes are enabled.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • oomkilladj is an int, but the only values that can be assigned to it are
    -17 and [-16, 15], so it fits into an s8.

    While the patch itself doesn't make task_struct smaller, because of the
    natural alignment of ->link_count, it will make the picture clearer wrt
    further task_struct reduction patches. My plan is to move ->fpu_counter
    and ->oomkilladj after ->ioprio, filling the hole on i386 and x86_64. But
    that's for later, because bloated distro configs need looking at as well.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This adds the MMF_DUMP_ELF_HEADERS option to /proc/pid/coredump_filter.
    This dumps the first page (only) of a private file mapping if it appears to
    be a mapping of an ELF file. Including these pages in the core dump may
    give sufficient identifying information to associate the original DSO and
    executable file images and their debugging information with a core file in
    a generic way just from its contents (e.g. when those binaries were built
    with ld --build-id). I expect this to become the default behavior
    eventually. Existing versions of gdb can be confused by the core dumps it
    creates, so it won't be enabled by default for some time to come. Soon many
    people will have systems with a gdb that handles these dumps, so they can
    arrange to set the bit at boot and have it inherited system-wide.

    This also cleans up the checking of the MMF_DUMP_* flag bits, which did not
    need to be using atomic macros.

    Signed-off-by: Roland McGrath
    Cc: Hidehiro Kawai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • Control the trigger limit for softlockup warnings. This is useful for
    debugging softlockups: lowering softlockup_thresh makes it possible to
    identify softlockups earlier.

    This patch:
    1. Adds a sysctl, softlockup_thresh, with valid values of 1-60s
    (a higher value can be used to suppress false positives)
    2. Changes the softlockup printk to print the cpu softlockup time

    [akpm@linux-foundation.org: Fix various warnings and add definition of "two"]
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     
  • Based on ideas of Andrew:
    http://marc.info/?l=linux-kernel&m=102912915020543&w=2

    Scale the bdi dirty limit inversely with each task's dirty rate.
    This gives heavy writers a lower dirty limit than the occasional writer.

    Andrea proposed something similar:
    http://lwn.net/Articles/152277/

    The main disadvantage of his patch is that he uses an unrelated quantity to
    measure time, which leaves him with a workload-dependent tunable. Other
    than that, the two approaches appear quite similar.

    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • When CONFIG_SYSFS is not set, CONFIG_FAIR_USER_SCHED fails to build
    with

    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1488): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1490): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1480): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1494): undefined reference to `kernel_subsys'

    This patch fixes this build error.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar

    Dhaval Giani
     
  • Remove the cpuset hooks that defined sched domains depending on the setting
    of the 'cpu_exclusive' flag.

    The cpu_exclusive flag can only be set on a child if it is set on the
    parent.

    This made that flag painfully unsuitable for use as a flag defining a
    partitioning of a system.

    It was entirely unobvious to a cpuset user what partitioning of sched
    domains they would be causing when they set that one cpu_exclusive bit on
    one cpuset, because it depended on what CPUs were in the remainder of that
    cpuset's siblings and child cpusets, after subtracting out other
    cpu_exclusive cpusets.

    Furthermore, there was no way on production systems to query the
    result.

    Using the cpu_exclusive flag for this was simply wrong from the get-go.

    Fortunately, it was sufficiently borked that, so far as I know, almost no
    successful use has been made of this. One real-time group did use it to
    effectively isolate CPUs from any load balancing efforts. They are willing
    to adapt to alternative mechanisms for this, such as some way to manipulate
    the list of isolated CPUs on a running system. They can do without this
    present cpu_exclusive based mechanism while we develop an alternative.

    There is a real risk, to the best of my understanding, of users
    accidentally setting up partitioned scheduler domains, inhibiting desired
    load balancing across all their CPUs, due to the nonobvious (from the
    cpuset perspective) side effects of the cpu_exclusive flag.

    Furthermore, since there was no way on a running system to see what one was
    doing with sched domains, this change will be invisible to any using code.
    Unless they have real insight to the scheduler load balancing choices, they
    will be unable to detect that this change has been made in the kernel's
    behaviour.

    Initial discussion on lkml of this patch has generated much comment. My
    (probably controversial) take on that discussion is that it has reached a
    rough consensus that the current cpuset cpu_exclusive mechanism for
    defining sched domains is borked. There is no consensus on the
    replacement. But since we can remove this mechanism, and since its
    continued presence risks causing unwanted partitioning of the scheduler's
    load balancing, we should remove it while we can, as we proceed to work on
    the replacement scheduler domain mechanisms.

    Signed-off-by: Paul Jackson
    Cc: Ingo Molnar
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: Dinakar Guniguntala
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Move the definitions of struct mm_struct and struct vm_area_struct to
    include/mm_types.h. This allows defining more functions in asm/pgtable.h
    and friends with inline assembly instead of macros. Compile tested on
    i386, powerpc, powerpc64, s390-32, s390-64 and x86_64.

    [aurelien@aurel32.net: build fix]
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Aurelien Jarno
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     

15 Oct, 2007

27 commits

  • modify account_system_time() to add cputime to cpustat->guest if we are
    running a VCPU. We add this cputime to cpustat->user instead of
    cpustat->system because this part of the KVM code is in fact user code,
    although it is executed in the kernel. We duplicate VCPU time between
    guest and user to allow an unmodified "top(1)" to display correct values.
    A modified "top(1)" is able to display good cpu user time and cpu guest
    time by subtracting cpu guest time from cpu user time. Update "gtime" in
    task_struct accordingly.

    Signed-off-by: Laurent Vivier
    Acked-by: Avi Kivity
    Signed-off-by: Ingo Molnar

    Laurent Vivier
     
  • Like for cpustat, introduce the "gtime" (guest time of the task) and
    "cgtime" (guest time of the task's children) fields for
    tasks. Modify signal_struct and task_struct.

    Modify /proc/<pid>/stat to display these new fields.

    Signed-off-by: Laurent Vivier
    Acked-by: Avi Kivity
    Signed-off-by: Ingo Molnar

    Laurent Vivier
     
  • add new migration statistics when SCHED_DEBUG and SCHEDSTATS
    are enabled. Available in /proc/<pid>/sched.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • reintroduce a simplified version of cache-hot/cold scheduling
    affinity. This improves performance with certain SMP workloads,
    such as sysbench.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Prevent wakeup over-scheduling. Once a task has been preempted by a
    task of the same or lower priority, it becomes ineligible for repeated
    preemption by the same task until it has been ticked or has slept.
    Instead, the task is marked for preemption at the next tick. Tasks of
    higher priority still preempt immediately.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • Add tunables in sysfs to modify a user's cpu share.

    A directory is created in sysfs for each new user in the system.

    /sys/kernel/uids/<uid>/cpu_share

    Reading this file returns the cpu shares granted for the user.
    Writing into this file modifies the cpu share for the user. Only an
    administrator is allowed to modify a user's cpu share.

    Ex:
    # cd /sys/kernel/uids/
    # cat 512/cpu_share
    1024
    # echo 2048 > 512/cpu_share
    # cat 512/cpu_share
    2048
    #

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar

    Dhaval Giani
     
  • cleanup: rename task_grp to task_group. No need to save two characters
    and 'grp' is annoying to read.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Here's another piece of low-hanging obsolete fruit.

    Remove obsolete TASK_NONINTERACTIVE.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     
  • mark scheduling classes as const. This speeds up the code
    a bit and shrinks it:

    text data bss dec hex filename
    40027 4018 292 44337 ad31 sched.o.before
    40190 3842 292 44324 ad24 sched.o.after

    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • speed up and simplify vslice calculations.

    [ From: Mike Galbraith : build fix ]

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • rename all 'cnt' fields and variables to the less yucky 'count' name.

    yuckage noticed by Andrew Morton.

    no change in code, other than the /proc/sched_debug bkl_count string got
    a bit larger:

    text data bss dec hex filename
    38236 3506 24 41766 a326 sched.o.before
    38240 3506 24 41770 a32a sched.o.after

    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • undo some of the recent changes that are not needed after all,
    such as last_min_vruntime.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • add vslice: the load-dependent "virtual slice" a task should
    run ideally, so that the observed latency stays within the
    sched_latency window.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Peter Zijlstra
     
  • remove unneeded tunables.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • add per task and per rq BKL usage statistics.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • Enable user-id based fair group scheduling. This is useful for anyone
    who wants to test the group scheduler w/o having to enable
    CONFIG_CGROUPS.

    A separate scheduling group (i.e. struct task_grp) is automatically created
    for every new user added to the system. Upon uid change for a task, it is
    moved to the corresponding scheduling group.

    A /proc tunable (/proc/root_user_share) is also provided to tune root
    user's quota of cpu bandwidth.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • With the view of supporting user-id based fair scheduling (and not just
    container-based fair scheduling), this patch renames several functions
    and makes them independent of whether they are being used for container
    or user-id based fair scheduling.

    Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating an
    undersized array for tg->cfs_rq[] and tg->se[]).

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • Revert the removal of set_curr_task.
    Use put_prev_task/set_curr_task when changing groups/policies.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Srivatsa Vaddagiri
     
  • rework enqueue/dequeue_entity() to get rid of
    sched_class::set_curr_task(). This simplifies sched_setscheduler(),
    rt_mutex_setprio() and sched_move_tasks().

    text data bss dec hex filename
    24330 2734 20 27084 69cc sched.o.before
    24233 2730 20 26983 6967 sched.o.after

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • the 'p' (task_struct) parameter in sched_class::yield_task() is
    redundant, as the caller is always 'current'. Get rid of it.

    text data bss dec hex filename
    24341 2734 20 27095 69d7 sched.o.before
    24330 2734 20 27084 69cc sched.o.after

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • Get rid of 'sched_entity::fair_key'.

    As a side effect, 'current' is not kept within the tree for
    SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code
    (e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes
    them (e.g. a single update_curr() now vs. dequeue/enqueue() before in
    entity_tick()).

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • remove wait_runtime based fields and features, now that the CFS
    math has been changed over to the vruntime metric.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • remove the wait_runtime-limit fields and the code depending on it, now
    that the math has been changed over to rely on the vruntime metric.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • introduce se->vruntime as a sum of weighted delta-exec's, and use that
    as the key into the tree.

    the idea of using absolute virtual time as the basic metric of scheduling
    was first raised by William Lee Irwin, advanced by Tong Li and first
    prototyped by Roman Zippel in the "Really Fair Scheduler" (RFS) patchset.

    also see:

    http://lkml.org/lkml/2007/9/2/76

    for a simpler variant of this patch.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • remove the stat_gran code - it was disabled by default and it causes
    unnecessary overhead.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • use constants if !CONFIG_SCHED_DEBUG.

    this speeds up the code and reduces code-size:

    text data bss dec hex filename
    27464 3014 16 30494 771e sched.o.before
    26929 3010 20 29959 7507 sched.o.after

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • track the maximum amount of time a task has executed while
    the CPU load was at least 2x (i.e. at least two nice-0
    tasks were runnable).

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith
    Reviewed-by: Thomas Gleixner

    Ingo Molnar