Eric Lee / smarc-fsl-linux-kernel

11 Apr, 2008

1 commit

b6c3006d2 cgroups: include hierarchy ids in /proc/<pid>/cgroup ... Browse Code »

Extend the /proc//cgroup file to include the appropriate hierarchy ID on
each line.

Currently this ID isn't really needed since a hierarchy can be completely
identified by the set of subsystems bound to it, but this is likely to change
in the near future in order to support stateless subsystems and
merging/rebinding of subsystems. Getting this change into 2.6.25 reduces the
need for an API change later.

Signed-off-by: Paul Menage
Cc: Balbir Singh
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-04-11 23:06:43 +0800

05 Apr, 2008

1 commit

8bab8dded cgroups: add cgroup support for enabling controllers at boot time ... Browse Code »

The effects of cgroup_disable=foo are:

- foo isn't auto-mounted if you mount all cgroups in a single hierarchy
- foo isn't visible as an individually mountable subsystem

As a result there will only ever be one call to foo->create(), at init time;
all processes will stay in this group, and the group will never be mounted on
a visible hierarchy. Any additional effects (e.g. not allocating metadata)
are up to the foo subsystem.

This doesn't handle early_init subsystems (their "disabled" bit isn't set be,
but it could easily be extended to do so if any of the early_init systems
wanted it - I think it would just involve some nastier parameter processing
since it would occur before the command-line argument parser had been run.

Hugh said:

Ballpark figures, I'm trying to get this question out rather than
processing the exact numbers: CONFIG_CGROUP_MEM_RES_CTLR adds 15% overhead
to the affected paths, booting with cgroup_disable=memory cuts that back to
1% overhead (due to slightly bigger struct page).

I'm no expert on distros, they may have no interest whatever in
CONFIG_CGROUP_MEM_RES_CTLR=y; and the rest of us can easily build with or
without it, or apply the cgroup_disable=memory patches.

Unix bench's execl test result on x86_64 was

== just after boot without mounting any cgroup fs.==
mem_cgorup=off : Execl Throughput 43.0 3150.1 732.6
mem_cgroup=on : Execl Throughput 43.0 2932.6 682.0
==

[lizf@cn.fujitsu.com: fix boot option parsing]
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Balbir Singh
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: Hugh Dickins
Cc: Sudhir Kumar
Cc: YAMAMOTO Takashi
Cc: David Rientjes
Signed-off-by: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-04-05 05:46:26 +0800

31 Mar, 2008

1 commit

9dce07f1a NULL noise: fs/*, mm/*, kernel/* ... Browse Code »

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2008-03-31 05:18:41 +0800

05 Mar, 2008

1 commit

b6abdb0e6 cgroup: fix default notify_on_release setting ... Browse Code »

The documentation says the default value of notify_on_release of a child
cgroup is inherited from its parent, which is reasonable, but the
implementation just sets the flag disabled.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-03-05 08:35:09 +0800

24 Feb, 2008

5 commits

bc231d2a0 cgroup: remove dead code in cgroup_get_rootdir() ... Browse Code »

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-02-24 09:13:25 +0800
68db38f15 cgroup: remove duplicate code in find_css_set() ... Browse Code »

The list head res->tasks gets initialized twice in find_css_set().

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-02-24 09:13:25 +0800
8d53d55d2 cgroup: fix subsys bitops ... Browse Code »

Cgroup uses unsigned long for subsys bitops, not unsigned long long.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-02-24 09:13:24 +0800
f77707384 cgroup: fix memory leak in cgroup_get_sb() ... Browse Code »

opts.release_agent is not kfree()ed in all necessary places.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-02-24 09:13:24 +0800
a043e3b2c cgroup: fix comments ... Browse Code »

fix:
- comments about need_forkexit_callback
- comments about release agent
- typo and comment style, etc.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-02-24 09:13:24 +0800

08 Feb, 2008

9 commits

73507f335 Handle pid namespaces in cgroups code ... Browse Code »

There's one place that works with task pids - its the "tasks" file in cgroups.
The read/write handlers assume, that the pid values go to/come from the user
space and thus it is a virtual pid, i.e. the pid as it is seen from inside a
namespace.

Tune the code accordingly.

Signed-off-by: Pavel Emelyanov
Cc: "Eric W. Biederman"
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2008-02-08 00:42:22 +0800
956db3ca0 hotplug cpu: move tasks in empty cpusets to parent ... Browse Code »

This patch corrects a situation that occurs when one disables all the cpus in
a cpuset.

Currently, the disabled (cpu-less) cpuset inherits the cpus of its parent,
which is incorrect because it may then overlap its cpu-exclusive sibling.

Tasks of an empty cpuset should be moved to the cpuset which is the parent of
their current cpuset. Or if the parent cpuset has no cpus, to its parent,
etc.

And the empty cpuset should be released (if it is flagged notify_on_release).

Depends on the cgroup_scan_tasks() function (proposed by David Rientjes) to
iterate through all tasks in the cpu-less cpuset. We are deliberately
avoiding a walk of the tasklist.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Cliff Wickman
Cc: Paul Menage
Cc: Paul Jackson
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cliff Wickman
2008-02-08 00:42:22 +0800
31a7df01f cgroups: mechanism to process each task in a cgroup ... Browse Code »

Provide cgroup_scan_tasks(), which iterates through every task in a cgroup,
calling a test function and a process function for each. And call the process
function without holding the css_set_lock lock.

The idea is David Rientjes', predicting that such a function will make it much
easier in the future to extend things that require access to each task in a
cgroup without holding the lock,

[akpm@linux-foundation.org: cleanup]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Cliff Wickman
Cc: Paul Menage
Cc: Paul Jackson
Acked-by: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cliff Wickman
2008-02-08 00:42:22 +0800
4fca88c87 memory cgroup enhancements: add- pre_destroy() handler ... Browse Code »

Add a handler "pre_destroy" to cgroup_subsys. It is called before
cgroup_rmdir() checks all subsys's refcnt.

I think this is useful for subsys which have some extra refs even if there
are no tasks in cgroup. By adding pre_destroy(), the kernel keeps the rule
"destroy() against subsystem is called only when refcnt=0." and allows css
ref to be used by other objects than tasks.

Signed-off-by: KAMEZAWA Hiroyuki
Cc: "Eric W. Biederman"
Cc: Balbir Singh
Cc: David Rientjes
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Nick Piggin
Cc: Paul Menage
Cc: Pavel Emelianov
Cc: Peter Zijlstra
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2008-02-08 00:42:20 +0800
e9685a03c kernel/cgroup.c: make 2 functions static ... Browse Code »

cgroup_is_releasable() and notify_on_release() should be static,
not global inline.

Signed-off-by: Adrian Bunk
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2008-02-08 00:42:18 +0800
8dc4f3e17 cgroups: move cgroups destroy() callbacks to cgroup_diput() ... Browse Code »

Move the calls to the cgroup subsystem destroy() methods from
cgroup_rmdir() to cgroup_diput(). This allows control file reads and
writes to access their subsystem state without having to be concerned with
locking against cgroup destruction - the control file dentry will keep the
cgroup and its subsystem state objects alive until the file is closed.

The documentation is updated to reflect the changed semantics of destroy();
additionally the locking comments for destroy() and some other methods were
clarified and decrustified.

Signed-off-by: Paul Menage
Cc: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-02-08 00:42:18 +0800
622d42cac cgroup simplify space stripping ... Browse Code »

Simplify the space stripping code in cgroup file write.

[akpm@linux-foundation.org: s/BUG_ON/BUILD_BUG_ON/]
Signed-off-by: Paul Jackson
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2008-02-08 00:42:18 +0800
e18f6318e cgroup brace coding style fix ... Browse Code »

Coding style fix - one line conditionals don't get braces.

Signed-off-by: Paul Jackson
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Jackson
2008-02-08 00:42:17 +0800
3cdeed298 kernel/cgroup.c: remove dead code ... Browse Code »

This patch removes dead code spotted by the Coverity checker
(look at the "(nbytes >= PATH_MAX)" check).

Signed-off-by: Adrian Bunk
Cc: Paul Jackson
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2008-02-08 00:42:17 +0800

15 Nov, 2007

1 commit

cfe36bde5 Improve cgroup printks ... Browse Code »

When I boot with the 'quiet' parameter, I see on the screen:

[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 39.036026] Initializing cgroup subsys cpuacct
[ 39.036080] Initializing cgroup subsys debug
[ 39.036118] Initializing cgroup subsys ns

This patch lowers the priority of those messages, adds a "cgroup: " prefix
to another couple of printks and kills the useless reference to the source
file.

Signed-off-by: Diego Calleja
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Diego Calleja
2007-11-15 10:45:37 +0800

24 Oct, 2007

1 commit

3bdf590ea cgroup: kill unused variable ... Browse Code »

Signed-off-by: Jeff Garzik

Jeff Garzik
2007-10-24 09:28:39 +0800

20 Oct, 2007

11 commits

bd89aabc6 Control groups: Replace "cont" with "cgrp" and other misc renaming ... Browse Code »

Replace "cont" with "cgrp" and other misc renaming

This patch finishes some of the names that got missed in the great
"task containers" -> "control groups" rename. Primarily it renames
the local variable "cont" to "cgrp" in a number of places, and renames
the CONT_* enum members to CGRP_*.

This patch is not intended to have any effect on the generated code;
the output of "objdump -d kernel/cgroup.o" is unchanged.

Signed-off-by: Paul Menage
Acked-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:43 +0800
69cccb887 Use task_pid_nr() instead of pid_nr(task_pid()) ... Browse Code »

There are two places that do so - the cgroups subsystem and the autofs
code.

Signed-off-by: Pavel Emelyanov
Cc: Ian Kent
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelyanov
2007-10-20 02:53:43 +0800
846c7bb05 Add cgroupstats ... Browse Code »

This patch is inspired by the discussion at
http://lkml.org/lkml/2007/4/11/187 and implements per cgroup statistics
as suggested by Andrew Morton in http://lkml.org/lkml/2007/4/11/263. The
patch is on top of 2.6.21-mm1 with Paul's cgroups v9 patches (forward
ported)

This patch implements per cgroup statistics infrastructure and re-uses
code from the taskstats interface. A new set of cgroup operations are
registered with commands and attributes. It should be very easy to
*extend* per cgroup statistics, by adding members to the cgroupstats
structure.

The current model for cgroupstats is a pull, a push model (to post
statistics on interesting events), should be very easy to add. Currently
user space requests for statistics by passing the cgroup file
descriptor. Statistics about the state of all the tasks in the cgroup
is returned to user space.

TODO's/NOTE:

This patch provides an infrastructure for implementing cgroup statistics.
Based on the needs of each controller, we can incrementally add more statistics,
event based support for notification of statistics, accumulation of taskstats
into cgroup statistics in the future.

Sample output

# ./cgroupstats -C /cgroup/a
sleeping 2, blocked 0, running 1, stopped 0, uninterruptible 0

# ./cgroupstats -C /cgroup/
sleeping 154, blocked 0, running 0, stopped 0, uninterruptible 0

If the approach looks good, I'll enhance and post the user space utility for
the same

Feedback, comments, test results are always welcome!

[akpm@linux-foundation.org: build fix]
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Jay Lan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2007-10-20 02:53:36 +0800
81a6a5cdd Task Control Groups: automatic userspace notification of idle cgroups ... Browse Code »

Add the following files to the cgroup filesystem:

notify_on_release - configures/reports whether the cgroup subsystem should
attempt to run a release script when this cgroup becomes unused

release_agent - configures/reports the release agent to be used for this
hierarchy (top level in each hierarchy only)

releasable - reports whether this cgroup would have been auto-released if
notify_on_release was true and a release agent was configured (mainly useful
for debugging)

To avoid locking issues, invoking the userspace release agent is done via a
workqueue task; cgroups that need to have their release agents invoked by
the workqueue task are linked on to a list.

[pj@sgi.com: Need to include kmod.h]
Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
817929ec2 Task Control Groups: shared cgroup subsystem group arrays ... Browse Code »

Replace the struct css_set embedded in task_struct with a pointer; all tasks
that have the same set of memberships across all hierarchies will share a
css_set object, and will be linked via their css_sets field to the "tasks"
list_head in the css_set.

Assuming that many tasks share the same cgroup assignments, this reduces
overall space usage and keeps the size of the task_struct down (three pointers
added to task_struct compared to a non-cgroups kernel, no matter how many
subsystems are registered).

[akpm@linux-foundation.org: fix a printk]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
a424316ca Task Control Groups: add procfs interface ... Browse Code »

Add:

/proc/cgroups - general system info

/proc/*/cgroup - per-task cgroup membership info

[a.p.zijlstra@chello.nl: cgroups: bdi init hooks]
Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
697f41610 Task Control Groups: add cgroup_clone() interface ... Browse Code »

Add support for cgroup_clone(), a way to create new cgroups intended to
be used for systems such as namespace unsharing. A new subsystem callback,
post_clone(), is added to allow subsystems to automatically configure cloned
cgroups.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
b4f48b636 Task Control Groups: add fork()/exit() hooks ... Browse Code »

This adds the necessary hooks to the fork() and exit() paths to ensure
that new children inherit their parent's cgroup assignments, and that
exiting processes release reference counts on their cgroups.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
355e0c48b Add cgroup write_uint() helper method ... Browse Code »

Add write_uint() helper method for cgroup subsystems

This helper is analagous to the read_uint() helper method for
reporting u64 values to userspace. It's designed to reduce the amount
of boilerplate requierd for creating new cgroup subsystems.

Signed-off-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
bbcb81d09 Task Control Groups: add tasks file interface ... Browse Code »

Add the per-directory "tasks" file for cgroupfs mounts; this allows the
user to determine which tasks are members of a cgroup by reading a
cgroup's "tasks", and to move a task into a cgroup by writing its pid to
its "tasks".

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800
ddbcc7e8e Task Control Groups: basic task cgroup framework ... Browse Code »

Generic Process Control Groups
--------------------------

There have recently been various proposals floating around for
resource management/accounting and other task grouping subsystems in
the kernel, including ResGroups, User BeanCounters, NSProxy
cgroups, and others. These all need the basic abstraction of being
able to group together multiple processes in an aggregate, in order to
track/limit the resources permitted to those processes, or control
other behaviour of the processes, and all implement this grouping in
different ways.

This patchset provides a framework for tracking and grouping processes
into arbitrary "cgroups" and assigning arbitrary state to those
groupings, in order to control the behaviour of the cgroup as an
aggregate.

The intention is that the various resource management and
virtualization/cgroup efforts can also become task cgroup
clients, with the result that:

- the userspace APIs are (somewhat) normalised

- it's easier to test e.g. the ResGroups CPU controller in
conjunction with the BeanCounters memory controller, or use either of
them as the resource-control portion of a virtual server system.

- the additional kernel footprint of any of the competing resource
management systems is substantially reduced, since it doesn't need
to provide process grouping/containment, hence improving their
chances of getting into the kernel

This patch:

Add the main task cgroups framework - the cgroup filesystem, and the
basic structures for tracking membership and associating subsystem state
objects to tasks.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2007-10-20 02:53:36 +0800