Eric Lee / smarc-fsl-linux-kernel

24 Sep, 2009

4 commits

a6df63615 memory controller: soft limit documentation

Soft limits are a new feature for the memory resource controller. Something
similar has existed in the group scheduler in the form of shares, though the
CPU controller's interpretation of shares is very different.

Soft limits are the most useful feature to have for environments where the
administrator wants to overcommit the system, such that only on memory
contention do the limits become active. The current soft limits
implementation provides a soft_limit_in_bytes interface for the memory
controller but not for the memory+swap controller. The implementation
maintains an RB-Tree of groups that exceed their soft limit and starts
reclaiming from the group that exceeds this limit by the maximum amount.

This patch:

Add documentation for soft limits

Signed-off-by: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Cc: Li Zefan
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2009-09-24 22:20:59 +0800
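
The selection rule described in the soft limit commit above (reclaim first from the group that exceeds its soft limit by the largest amount) can be approximated from userspace. This is only a sketch: the kernel does this internally with an RB-tree, the function name `soft_limit_victim` is hypothetical, and the memory.usage_in_bytes and memory.soft_limit_in_bytes files are assumed to be readable in each child group.

```shell
# Hypothetical helper: print the child group of "$1" whose memory usage
# exceeds its soft limit by the largest amount (nothing if none exceeds).
soft_limit_victim() {
    for slv_dir in "$1"/*/; do
        # Skip entries that are not memory cgroups.
        [ -r "${slv_dir}memory.usage_in_bytes" ] || continue
        [ -r "${slv_dir}memory.soft_limit_in_bytes" ] || continue
        echo "$(basename "$slv_dir") $(cat "${slv_dir}memory.usage_in_bytes") $(cat "${slv_dir}memory.soft_limit_in_bytes")"
    done | awk '
        # max starts at 0, so only groups above their soft limit qualify.
        ($2 - $3) > max { max = $2 - $3; name = $1 }
        END { if (name != "") print name }'
}
```

For example, `soft_limit_victim /cgroup/memory` would name the child group most over its soft limit, assuming the memory controller is mounted at that (illustrative) path.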
4b3bde4c9 memcg: remove the overhead associated with the root cgroup

Change the memory cgroup to remove the overhead associated with accounting
all pages in the root cgroup. As a side-effect, we can no longer set a
memory hard limit in the root cgroup.

A new flag that tracks whether the page has been accounted has been
added as well. Flags are now set atomically for page_cgroup;
pcg_default_flags is now obsolete and has been removed.

[akpm@linux-foundation.org: fix a few documentation glitches]
Signed-off-by: Balbir Singh
Signed-off-by: Daisuke Nishimura
Reviewed-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Li Zefan
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2009-09-24 22:20:58 +0800
be367d099 cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time

Alter the ss->can_attach and ss->attach functions to be able to deal with
a whole threadgroup at a time, for use in cgroup_attach_proc. (This is a
pre-patch to cgroup-procs-writable.patch.)

Currently, the new mode of the attach function can only tell the subsystem
about the old cgroup of the threadgroup leader. No subsystem currently
needs that information for each thread that's being moved, but if one were
to be added (for example, one that counts tasks within a group) this part
would need to be reworked to tell the subsystem the right
information.

[hidave.darkstar@gmail.com: fix build]
Signed-off-by: Ben Blum
Signed-off-by: Paul Menage
Acked-by: Li Zefan
Reviewed-by: Matt Helsley
Cc: "Eric W. Biederman"
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Dave Young
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ben Blum
2009-09-24 22:20:58 +0800
c6d57f331 cgroups: support named cgroups hierarchies

To simplify referring to cgroup hierarchies in mount statements, and to
allow disambiguation in the presence of empty hierarchies and
multiply-bindable subsystems, this patch adds support for naming a new
cgroup hierarchy via the "name=" mount option.

A pre-existing hierarchy may be specified by either name or by subsystems;
a hierarchy's name cannot be changed by a remount operation.

Example usage:

# To create a hierarchy called "foo" containing the "cpu" subsystem
mount -t cgroup -oname=foo,cpu cgroup /mnt/cgroup1

# To mount the "foo" hierarchy on a second location
mount -t cgroup -oname=foo cgroup /mnt/cgroup2

Signed-off-by: Paul Menage
Reviewed-by: Li Zefan
Cc: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Dhaval Giani
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2009-09-24 22:20:57 +0800

01 Jul, 2009

1 commit

b37f2d4de cpusets: document adding/removing cpus to cpuset elaborately

By writing a task's pid to the tasks file, a process adds that task to that
cgroup/cpuset. But to add a cpu/mem to a cpuset, the complete new list of
cpus/mems must be written to the cpuset.cpus/cpuset.mems file, which
replaces the old list. Make this clearer in the documentation.

Signed-off-by: Nikanth Karthikesan
Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nikanth Karthikesan
2009-07-01 09:56:01 +0800
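
Because a write to the cpus file replaces the whole list, adding a cpu means reading the current list and writing back the combined one. The following is a minimal sketch under that assumption; `cpuset_add_cpus` is a hypothetical helper name and the file is assumed to be called cpuset.cpus inside the cpuset directory.

```shell
# Hypothetical helper: add cpus "$2" (e.g. "3" or "4-5") to the cpuset in
# directory "$1" by rewriting the complete list.
cpuset_add_cpus() {
    cac_current=$(cat "$1/cpuset.cpus")
    if [ -n "$cac_current" ]; then
        # Writing only "$2" would drop the cpus already in the set.
        echo "$cac_current,$2" > "$1/cpuset.cpus"
    else
        echo "$2" > "$1/cpuset.cpus"
    fi
}
```

Usage would look like `cpuset_add_cpus /dev/cpuset/my_set 3`, where the path is illustrative.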

19 Jun, 2009

2 commits

c5b947b28 memcg: add interface to reset limits

We don't have an interface to reset mem.limit or memsw.limit now.

This patch allows mem.limit or memsw.limit to be reset by setting them
to -1.

Signed-off-by: Daisuke Nishimura
Cc: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Li Zefan
Cc: Dhaval Giani
Cc: YAMAMOTO Takashi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daisuke Nishimura
2009-06-19 04:03:48 +0800
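
A minimal sketch of using the reset interface from the commit above, assuming mem.limit and memsw.limit correspond to the memory.limit_in_bytes and memory.memsw.limit_in_bytes files; `reset_mem_limits` is a hypothetical helper name. The memsw limit is reset first on the assumption that it may never be lower than the memory limit.

```shell
# Hypothetical helper: reset the limits of the memory cgroup in "$1".
reset_mem_limits() {
    # memory.memsw.* only exists when swap accounting is enabled.
    if [ -e "$1/memory.memsw.limit_in_bytes" ]; then
        echo -1 > "$1/memory.memsw.limit_in_bytes" || return 1
    fi
    echo -1 > "$1/memory.limit_in_bytes"
}
```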
22a668d7c memcg: fix behavior under memory.limit equals to memsw.limit

A user can set memory.limit_in_bytes == memory.memsw.limit_in_bytes when the
user just wants to limit the total size of applications, in other words,
when the user is not very interested in memory usage itself. In this case,
swap-out will be done only by global-LRU.

But under the current implementation, memory.limit_in_bytes is checked
first and try_to_free_page() may do swap-out. That swap-out is
useless for memsw.limit_in_bytes, and the thread may hit the limit again.

This patch tries to fix the current behavior in the memory.limit ==
memsw.limit case. The documentation is updated to explain the behavior of
this special case.

Signed-off-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Li Zefan
Cc: Dhaval Giani
Cc: YAMAMOTO Takashi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2009-06-19 04:03:48 +0800

14 Apr, 2009

2 commits

c863d835b memcg: fix documentation

The descriptions of various statistics in memory.stat are inaccurate
and at times confusing.

Correct this along with a few other minor cleanups.

Signed-off-by: Bharata B Rao
Acked-by: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bharata B Rao
2009-04-14 06:04:33 +0800
5341cfab9 res_counter: update documentation

After the introduction of resource counter hierarchies
(28dbc4b6a01fb579a9441c7b81e3d3413dc452df) the prototypes of
res_counter_init() and res_counter_charge() have been changed.

Keep the documentation consistent with the actual function prototypes.

Signed-off-by: Andrea Righi
Cc: Paul Menage
Cc: Pavel Emelyanov
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrea Righi
2009-04-14 06:04:30 +0800

08 Apr, 2009

1 commit

5af8c4e0f Merge commit 'v2.6.30-rc1' into sched/urgent

Merge reason: update to latest upstream to queue up fix

Signed-off-by: Ingo Molnar

Ingo Molnar
2009-04-08 23:26:00 +0800

04 Apr, 2009

1 commit

811158b14 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
trivial: Update my email address
trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
trivial: Fix misspelling of "Celsius".
trivial: remove unused variable 'path' in alloc_file()
trivial: fix a pdlfush -> pdflush typo in comment
trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
trivial: wusb: Storage class should be before const qualifier
trivial: drivers/char/bsr.c: Storage class should be before const qualifier
trivial: h8300: Storage class should be before const qualifier
trivial: fix where cgroup documentation is not correctly referred to
trivial: Give the right path in Documentation example
trivial: MTD: remove EOL from MODULE_DESCRIPTION
trivial: Fix typo in bio_split()'s documentation
trivial: PWM: fix of #endif comment
trivial: fix typos/grammar errors in Kconfig texts
trivial: Fix misspelling of firmware
trivial: cgroups: documentation typo and spelling corrections
trivial: Update contact info for Jochen Hein
trivial: fix typo "resgister" -> "register"
...

Linus Torvalds
2009-04-04 06:24:35 +0800

03 Apr, 2009

3 commits

0b7f569e4 memcg: fix OOM killer under memcg

This patch tries to fix OOM killer problems caused by hierarchy.
memcg now has its own OOM kill function (in oom_kill.c) and tries to
kill a task in the memcg.

But when hierarchy is used, this is broken and the correct task cannot
be killed. For example, in the following cgroup

    /groupA/    hierarchy=1, limit=1G,
        01      nolimit
        02      nolimit

the memory usage of all tasks under /groupA, /groupA/01 and /groupA/02 is
limited to groupA's 1 Gbyte, but the OOM killer only kills tasks in groupA.

This patch makes the bad process be selected from all tasks under the
hierarchy. In addition, oom_jiffies is currently updated only for groupA
in the above case; oom_jiffies of the whole tree should be updated.

To see how oom_jiffies is used, please check the callers of
mem_cgroup_oom_called().

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: const fix]
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Paul Menage
Cc: Li Zefan
Cc: Balbir Singh
Cc: Daisuke Nishimura
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2009-04-03 10:04:55 +0800
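
The candidate set that the OOM fix above is concerned with (every task under the hierarchy, not only those attached directly to groupA) can be listed from userspace. This is a sketch only: it does not reproduce the kernel's victim selection, `hierarchy_tasks` is a hypothetical name, and each group directory is assumed to expose a `tasks` file containing one pid per line.

```shell
# Hypothetical helper: print the pids of all tasks in the cgroup "$1" and in
# every group below it, one per line, sorted numerically.
hierarchy_tasks() {
    find "$1" -type f -name tasks -exec cat {} + | sort -n -u
}
```

With the layout from the commit message, `hierarchy_tasks /cgroup/groupA` (path illustrative) would include the tasks of groupA, groupA/01 and groupA/02.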
b6719ec1a cgroups: more documentation for remount and release_agent

This won't remove cpuacct from the mounted hierarchy:
# mount -t cgroup -o cpu,cpuacct xxx /mnt
# mount -o remount,cpu /mnt

This is because, for this usage, mount(8) appends the new options to the
original options.

This form does what is intended:
# mount [-t cgroup] -o remount,cpu xxx /mnt

Also document how to specify or change release_agent.

Signed-off-by: Li Zefan
Reviewed-by: KAMEZAWA Hiroyuki
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-04-03 10:04:54 +0800
ec64f5154 cgroup: fix frequent -EBUSY at rmdir

In the following situation, with the memory subsystem,

    /groupA use_hierarchy==1
        /01 some tasks
        /02 some tasks
        /03 some tasks
        /04 empty

when tasks under 01/02/03 hit the limit on /groupA, hierarchical reclaim
is triggered and the kernel walks the tree under groupA. In this case,
rmdir /groupA/04 frequently fails with -EBUSY because of a temporary
refcnt held by the kernel.

In general, a cgroup can be rmdir'd if it has no child groups and
no tasks. Frequent failures of rmdir() are not useful to users
(and in most cases the reason for -EBUSY is unknown to them).

This patch tries to modify the above behavior by
- retrying if css_refcnt is held by someone.
- adding a "return value" to pre_destroy(), which allows a subsystem to
  say "we're really busy!"

Signed-off-by: KAMEZAWA Hiroyuki
Cc: Paul Menage
Cc: Li Zefan
Cc: Balbir Singh
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2009-04-03 10:04:54 +0800
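
The commit above moves the retry into the kernel. The same idea, seen from userspace on a kernel without the fix, is a bounded retry around rmdir. This is an illustrative analogue rather than what the patch implements; `rmdir_retry` and its default of five attempts are assumptions.

```shell
# Hypothetical helper: try to remove cgroup directory "$1" up to "$2" times
# (default 5), pausing one second between attempts.
rmdir_retry() {
    rr_left=${2:-5}
    while :; do
        rmdir "$1" 2>/dev/null && return 0
        rr_left=$((rr_left - 1))
        # Give up once the attempts are used up; the group is really busy.
        [ "$rr_left" -gt 0 ] || return 1
        sleep 1
    done
}
```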

01 Apr, 2009

1 commit

ef12fefab cpuacct: add per-cgroup utime/stime statistics

Add per-cgroup cpuacct controller statistics like the system and user
time consumed by the group of tasks.

Changelog:

v7
- Changed the name of the statistic from utime to user and from stime to
system so that in future we can easily add other statistics like irq,
softirq, steal times etc.

v6
- Fixed a bug in the error path of cpuacct_create() (pointed by Li Zefan).

v5
- In cpuacct_stats_show(), use cputime64_to_clock_t() since we are
operating on a 64bit variable here.

v4
- Remove comments in cpuacct_update_stats() which explained why rcu_read_lock()
was needed (as per Peter Zijlstra's review comments).
- Don't say that percpu_counter_read() is broken in Documentation/cpuacct.txt
as per KAMEZAWA Hiroyuki's review comments.

v3
- Fix a small race in the cpuacct hierarchy walk.

v2
- stime and utime now exported in clock_t units instead of msecs.
- Addressed the code review comments from Balbir and Li Zefan.
- Moved to -tip tree.

v1
- Moved the stime/utime accounting to cpuacct controller.

Earlier versions
- http://lkml.org/lkml/2009/2/25/129

Signed-off-by: Bharata B Rao
Signed-off-by: Balaji Rao
Cc: Dhaval Giani
Cc: Paul Menage
Cc: Andrew Morton
Cc: KAMEZAWA Hiroyuki
Reviewed-by: Li Zefan
Acked-by: Peter Zijlstra
Acked-by: Balbir Singh
Tested-by: Balbir Singh
LKML-Reference:
Signed-off-by: Ingo Molnar

Bharata B Rao
2009-04-01 22:49:38 +0800
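
Since the changelog above says the statistics are named user and system and are exported in clock_t units, a reader of the file has to divide by the clock tick rate to get seconds. A minimal sketch, assuming the file holds lines of the form `user <ticks>` and `system <ticks>`; the helper name `cpuacct_seconds` and its optional second argument are hypothetical.

```shell
# Hypothetical helper: print user/system time from stat file "$1" in seconds.
# "$2" optionally overrides the ticks-per-second value.
cpuacct_seconds() {
    cs_hz=${2:-$(getconf CLK_TCK)}
    awk -v hz="$cs_hz" '
        $1 == "user" || $1 == "system" { printf "%s %.2f\n", $1, $2 / hz }
    ' "$1"
}
```

It would be called as `cpuacct_seconds /cgroup/cpuacct/mygroup/cpuacct.stat`, with the path being illustrative.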

30 Mar, 2009

3 commits

21acb9caa trivial: fix where cgroup documentation is not correctly referred to

cgroup documentation was moved to Documentation/cgroups/. There are some
places that still refer to Documentation/controllers/,
Documentation/cgroups.txt and Documentation/cpusets.txt. Fix those.

Signed-off-by: Thadeu Lima de Souza Cascardo
Reviewed-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Jiri Kosina

Thadeu Lima de Souza Cascardo
2009-03-30 21:22:02 +0800
6d5e147dd trivial: Give the right path in Documentation example

While the Documentation example creates /cgroup/test, it removes
/test/cgroup, which is clearly not the intended path. Change that to
/cgroup/test.

Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Jiri Kosina

Thadeu Lima de Souza Cascardo
2009-03-30 21:22:02 +0800
caa790ba6 trivial: cgroups: documentation typo and spelling corrections

Minor typo and spelling corrections fixed whilst reading
to learn about cgroups capabilities.

Signed-off-by: Chris Samuel
Acked-by: Paul Menage
Signed-off-by: Jiri Kosina

Chris Samuel
2009-03-30 21:21:58 +0800

21 Feb, 2009

1 commit

3fd076dd9 cpuset: various documentation fixes and updates

I noticed the old commit 8f5aa26c75b7722e80c0c5c5bb833d41865d7019
("cpusets: update_cpumask documentation fix") is not a complete fix,
resulting in inconsistent paragraphs. This patch fixes it and does other
fixes and updates:

- s/migrate_all_tasks()/migrate_live_tasks()/
- describe more cpuset control files
- s/cpumask_t/struct cpumask/
- document that cpu hotplug and changes to 'sched_relax_domain_level' may
cause a domain rebuild
- document various ways to query and modify cpusets
- the equivalent of "mount -t cpuset" is "mount -t cgroup -o cpuset,noprefix"

Signed-off-by: Li Zefan
Acked-by: Randy Dunlap
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-02-21 09:57:49 +0800

19 Feb, 2009

1 commit

b851ee792 cgroups: update documentation about css_set hash table

The css_set hash table was introduced in 2.6.26, so update the
documentation accordingly.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-02-19 07:37:53 +0800

30 Jan, 2009

1 commit

8d50d369d memcg: update document to mention that swapoff should be tested

Considering the recently found problem "memcg: fix refcnt handling at
swapoff", it's better to mention swapoff behavior in the memcg_test
document.

Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Cc: Li Zefan
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2009-01-30 10:04:44 +0800

16 Jan, 2009

1 commit

45ce80fb6 cgroups: consolidate cgroup documents

Move Documentation/cpusets.txt and Documentation/controllers/* to
Documentation/cgroups/

Signed-off-by: Li Zefan
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-01-16 08:39:37 +0800

09 Jan, 2009

2 commits

999cd8a45 cgroups: add a per-subsystem hierarchy_mutex

These patches introduce new locking/refcount support for cgroups to
reduce the need for subsystems to call cgroup_lock(). This will
ultimately allow the atomicity of cgroup_rmdir() (which was removed
recently) to be restored.

These three patches give:

1/3 - introduce a per-subsystem hierarchy_mutex which a subsystem can
use to prevent changes to its own cgroup tree

2/3 - use hierarchy_mutex in place of calling cgroup_lock() in the
memory controller

3/3 - introduce a css_tryget() function similar to the one recently
proposed by Kamezawa, but avoiding spurious refcount failures in
the event of a race between a css_tryget() and an unsuccessful
cgroup_rmdir()

Future patches will likely involve:

- using hierarchy mutex in place of cgroup_lock() in more subsystems
where appropriate

- restoring the atomicity of cgroup_rmdir() with respect to cgroup_create()

This patch:

Add a hierarchy_mutex to the cgroup_subsys object that protects changes to
the hierarchy observed by that subsystem. It is taken by the cgroup
subsystem (in addition to cgroup_mutex) for the following operations:

- linking a cgroup into that subsystem's cgroup tree
- unlinking a cgroup from that subsystem's cgroup tree
- moving the subsystem to/from a hierarchy (including across the
bind() callback)

Thus if the subsystem holds its own hierarchy_mutex, it can safely
traverse its own hierarchy.

Signed-off-by: Paul Menage
Tested-by: KAMEZAWA Hiroyuki
Cc: Li Zefan
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2009-01-09 00:31:10 +0800
18e7f1f0d cgroups: documentation updates

- remove 'releasable' since it has been moved to the debug subsys.
- update lock requirements of subsys callbacks.

Signed-off-by: Li Zefan
Cc: Paul Menage
Cc: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-01-09 00:31:01 +0800

13 Nov, 2008

1 commit

3b1b3f6e5 freezer_cg: disable writing freezer.state of root cgroup

With this change, the control file 'freezer.state' doesn't exist in the root
cgroup, making the root cgroup unfreezable.

I think it's reasonable to disallow freezing tasks in the root cgroup. We
can then avoid fork overhead when the freezer subsystem is compiled in but
not used.

Also make writing an invalid value to freezer.state return EINVAL rather
than EIO. This is more consistent with other cgroup subsystems.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Cc: Cedric Le Goater
Cc: Paul Menage
Cc: Matt Helsley
Cc: "Serge E. Hallyn"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-11-13 09:17:16 +0800
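
The two behaviors in the commit above (no freezer.state in the root cgroup, and invalid values rejected with EINVAL) can be mirrored in a small wrapper that checks before writing. This is a sketch under assumptions: `set_freezer_state` is a hypothetical name, FROZEN and THAWED are taken to be the only writable states, and 22 is used as the return code because that is the value of EINVAL on Linux.

```shell
# Hypothetical helper: write state "$2" to freezer.state in cgroup "$1".
set_freezer_state() {
    case "$2" in
        FROZEN|THAWED) ;;
        *)
            echo "invalid freezer state: $2" >&2
            return 22   # mirrors EINVAL
            ;;
    esac
    # The root cgroup has no freezer.state file after this change.
    if [ ! -e "$1/freezer.state" ]; then
        echo "no freezer.state in $1" >&2
        return 1
    fi
    echo "$2" > "$1/freezer.state"
}
```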

20 Oct, 2008

1 commit

bde5ab655 container freezer: document the cgroup freezer subsystem.

Describe why we need the freezer subsystem and how to use it in a
documentation file. Since the cgroups.txt file is focused on the
subsystem-agnostic portions of cgroups, make a directory and move the old
cgroups.txt file into it at the same time.

Signed-off-by: Matt Helsley
Cc: Paul Menage
Cc: containers@lists.linux-foundation.org
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matt Helsley
2008-10-20 23:52:34 +0800