Eric Lee / smarc-fsl-linux-kernel

07 Jan, 2009

1 commit

e5991371e mm: remove cgroup_mm_owner_callbacks

cgroup_mm_owner_callbacks() was brought in to support the memrlimit
controller, but sneaked into mainline ahead of it. That controller has
now been shelved, and the mm_owner_changed() args were inadequate for it
anyway (they needed an mm pointer instead of a task pointer).

Remove the dead code, and restore mm_update_next_owner() locking to how it
was before: taking mmap_sem there does nothing for memcontrol.c, now the
only user of mm->owner.

Signed-off-by: Hugh Dickins
Cc: Paul Menage
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2009-01-07 07:59:01 +0800

06 Jan, 2009

2 commits

520c85346 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
inotify: fix type errors in interfaces
fix breakage in reiserfs_new_inode()
fix the treatment of jfs special inodes
vfs: remove duplicate code in get_fs_type()
add a vfs_fsync helper
sys_execve and sys_uselib do not call into fsnotify
zero i_uid/i_gid on inode allocation
inode->i_op is never NULL
ntfs: don't NULL i_op
isofs check for NULL ->i_op in root directory is dead code
affs: do not zero ->i_op
kill suid bit only for regular files
vfs: lseek(fd, 0, SEEK_CUR) race condition

Linus Torvalds
2009-01-06 10:32:06 +0800
56ff5efad zero i_uid/i_gid on inode allocation

... and don't bother in callers. Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro

Al Viro
2009-01-06 00:54:28 +0800

05 Jan, 2009

1 commit

7b574b7b0 cgroups: fix a race between cgroup_clone and umount

The race occurs when cgroup_clone() is called while the ns cgroup subsys
is being unmounted: cgroup_clone() might access an invalid cgroup_fs, or
kill_sb() might be called after cgroup_clone() has created a new dir in it.

The BUG I triggered is BUG_ON(root->number_of_cgroups != 1);

------------[ cut here ]------------
kernel BUG at kernel/cgroup.c:1093!
invalid opcode: 0000 [#1] SMP
...
Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000)
...
Call Trace:
[] ? deactivate_super+0x3f/0x51
[] ? mntput_no_expire+0xb3/0xdd
[] ? sys_umount+0x265/0x2ac
[] ? sys_oldumount+0xd/0xf
[] ? sysenter_do_call+0x12/0x31
...
EIP: [] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c
---[ end trace c766c1be3bf944ac ]---

Cc: Serge E. Hallyn
Signed-off-by: Li Zefan
Cc: Paul Menage
Cc: "Serge E. Hallyn"
Cc: Balbir Singh
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2009-01-05 05:33:19 +0800

25 Dec, 2008

1 commit

cbacc2c7f Merge branch 'next' into for-linus

James Morris
2008-12-25 08:40:09 +0800

24 Dec, 2008

2 commits

20ca9b3f4 cgroups: avoid accessing uninitialized data in failure path

If cgroup_get_rootdir() failed, free_cg_links() will be called in the
failure path, but tmp_cg_links hasn't been initialized at that time.

I introduced this bug in the 2.6.27 merge window.

Signed-off-by: Li Zefan
Acked-by: Serge Hallyn
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-12-24 07:58:21 +0800
e368d3a83 cgroups: suppress bogus warning messages

Remove spurious warning messages that are thrown onto the console during
cgroup operations.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Sharyathi Nagesh
Acked-by: Serge E. Hallyn
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sharyathi Nagesh
2008-12-24 07:58:21 +0800

16 Dec, 2008

1 commit

307257cf4 cgroups: fix a race between rmdir and remount

When a cgroup is removed, it's unlinked from its parent's children list,
but not actually freed until the last dentry on it is released (at which
point cgrp->root->number_of_cgroups is decremented).

Currently rebind_subsystems checks for the top cgroup's child list being
empty in order to rebind subsystems into or out of a hierarchy - this can
result in the set of subsystems bound to a hierarchy changing while
removed-but-not-freed cgroups still exist.

The simplest fix for this is to forbid remounts that change the set of
subsystems on a hierarchy that has removed-but-not-freed cgroups. This
bug can be reproduced via:

mkdir /mnt/cg
mount -t cgroup -o ns,freezer cgroup /mnt/cg
mkdir /mnt/cg/foo
sleep 1h < /mnt/cg/foo &
rmdir /mnt/cg/foo
mount -t cgroup -o remount,ns,devices,freezer cgroup /mnt/cg
kill $!

The above causes an oops only in -mm, not in mainline, but in mainline the
bug can still cause a memory leak (and possibly even an oops).

Signed-off-by: Paul Menage
Reviewed-by: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-12-16 08:27:07 +0800

04 Dec, 2008

1 commit

ec98ce480 Merge branch 'master' into next

Conflicts:
fs/nfsd/nfs4recover.c

Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().

Signed-off-by: James Morris

James Morris
2008-12-04 14:16:36 +0800

20 Nov, 2008

2 commits

33d283bef cgroups: fix a serious bug in cgroupstats

Try this, and you'll get an oops immediately:
# cd Documentation/accounting/
# gcc -o getdelays getdelays.c
# mount -t cgroup -o debug xxx /mnt
# ./getdelays -C /mnt/tasks

Because a normal file's dentry->d_fsdata is a pointer to struct cftype,
not struct cgroup.

After the patch, it returns EINVAL if we try to get cgroupstats
from a normal file.

Cc: Balbir Singh
Signed-off-by: Li Zefan
Acked-by: Paul Menage
Cc: [2.6.25.x, 2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-11-20 10:50:00 +0800
3fa59dfbc cgroup: fix potential deadlock in pre_destroy

As Balbir pointed out, memcg's pre_destroy handler has a potential deadlock.

It takes locks in the following sequence:

cgroup_mutex (cgroup_rmdir)
    -> pre_destroy -> mem_cgroup_pre_destroy -> force_empty
        -> cpu_hotplug.lock (lru_add_drain_all ->
                             schedule_work ->
                             get_online_cpus)

But cpuset takes them in the opposite order:

cpu_hotplug.lock (call notifier)
    -> cgroup_mutex (within notifier)

So this lock sequence should be fixed.

Considering how pre_destroy works, it's not necessary to hold
cgroup_mutex while calling it.

As a side effect, we don't have to wait on this mutex while memcg's
force_empty works (which can take a long time when there are tons of pages).

Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Cc: Li Zefan
Cc: Paul Menage
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2008-11-20 10:49:58 +0800
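
The inversion described in this commit is the classic ABBA pattern: one path takes cgroup_mutex and then cpu_hotplug.lock, another takes them in the opposite order. The following user-space sketch records "taken while holding" edges and flags the first direct inversion. The enum, `order_reset()` and `order_note()` are hypothetical names made up for illustration; this is not how the kernel's lockdep is implemented, and it only catches inversions between two locks.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks in the message above. */
enum lock_id { CGROUP_MUTEX, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* before[a][b] is true once some path has taken b while holding a. */
static bool before[NR_LOCKS][NR_LOCKS];

/* Forget all recorded orderings. */
static void order_reset(void)
{
    memset(before, 0, sizeof(before));
}

/*
 * Record that 'next' is acquired while 'held' is already held.
 * Returns false if the opposite order was recorded earlier, meaning
 * the two paths can deadlock against each other (ABBA).
 */
static bool order_note(enum lock_id held, enum lock_id next)
{
    if (before[next][held])
        return false;
    before[held][next] = true;
    return true;
}
```

With the change described above, pre_destroy no longer runs under cgroup_mutex, so the rmdir path would never record the cgroup_mutex -> cpu_hotplug.lock edge and the checker stays quiet.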

14 Nov, 2008

4 commits

2b8289256 Merge branch 'master' into next

Conflicts:
security/keys/internal.h
security/keys/process_keys.c
security/keys/request_key.c

Fixed conflicts above by using the non 'tsk' versions.

Signed-off-by: James Morris

James Morris
2008-11-14 08:29:12 +0800
c69e8d9c0 CRED: Use RCU to access another task's creds and to release a task's own creds

Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:19 +0800
b6dff3ec5 CRED: Separate task security context from task_struct

Separate the task security context from task_struct. At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne

Signed-off-by: David Howells
Acked-by: James Morris
Acked-by: Serge Hallyn
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:16 +0800
76aac0e9a CRED: Wrap task credential accesses in the core kernel

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells
Reviewed-by: James Morris
Acked-by: Serge Hallyn
Cc: Al Viro
Cc: linux-audit@redhat.com
Cc: containers@lists.linux-foundation.org
Cc: linux-mm@kvack.org
Signed-off-by: James Morris

David Howells
2008-11-14 07:39:12 +0800
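
The point of wrapping credential accesses is that callers stop depending on where the fields live. A minimal user-space sketch of the idea follows; `struct cred_sketch`, `struct task_sketch` and `same_effective_user()` are hypothetical, and only the accessor naming (`task_uid()`, `task_euid()`) follows the pattern given in the message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the security context split out of the task. */
struct cred_sketch {
    unsigned int uid;
    unsigned int euid;
};

/* Hypothetical stand-in for task_struct; creds are embedded here. */
struct task_sketch {
    struct cred_sketch cred;
};

/* Accessors: the only code that knows where the fields are stored. */
static inline unsigned int task_uid(const struct task_sketch *t)
{
    return t->cred.uid;
}

static inline unsigned int task_euid(const struct task_sketch *t)
{
    return t->cred.euid;
}

/* A caller written against the wrappers rather than the raw fields. */
static bool same_effective_user(const struct task_sketch *a,
                                const struct task_sketch *b)
{
    return task_euid(a) == task_euid(b);
}
```

If the credentials later move behind a pointer, only the two accessors need to change; `same_effective_user()` stays as it is.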

07 Nov, 2008

1 commit

24eb08995 cgroups: fix invalid cgrp->dentry before cgroup has been completely removed

This fixes an oops when reading /proc/sched_debug.

A cgroup won't be removed completely until finishing cgroup_diput(), so we
shouldn't invalidate cgrp->dentry in cgroup_rmdir(). Otherwise, when a
group is being removed while cgroup_path() gets called, we may trigger
NULL dereference BUG.

The bug can be reproduced:

# cat test.sh
#!/bin/sh
mount -t cgroup -o cpu xxx /mnt
for (( ; ; ))
{
	mkdir /mnt/sub
	rmdir /mnt/sub
}
# ./test.sh &
# cat /proc/sched_debug

BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [] cgroup_path+0x39/0x90
...
Call Trace:
[] ? print_cfs_rq+0x6e/0x75d
[] ? sched_debug_show+0x72d/0xc1e
...

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: [2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-11-07 07:41:19 +0800

27 Oct, 2008

1 commit

207777664 cgroup: remove unused variable

/scratch/sfr/next/kernel/cgroup.c: In function 'cgroup_tasks_start':
/scratch/sfr/next/kernel/cgroup.c:2107: warning: unused variable 'i'

Introduced in commit cc31edceee04a7b87f2be48f9489ebb72d264844 "cgroups:
convert tasks file to use a seq_file with shared pid array".

Signed-off-by: Stephen Rothwell
Signed-off-by: Linus Torvalds

Stephen Rothwell
2008-10-27 00:38:17 +0800

20 Oct, 2008

2 commits

cc31edcee cgroups: convert tasks file to use a seq_file with shared pid array

Rather than pre-generating the entire text for the "tasks" file each
time the file is opened, we instead just generate/update the array of
process ids and use a seq_file to report these to userspace. All open
file handles on the same "tasks" file can share a pid array, which may
be updated any time that no thread is actively reading the array. By
sharing the array, the potential for userspace to DoS the system by
opening many handles on the same "tasks" file is removed.

[Based on a patch by Lai Jiangshan, extended to use seq_file]

Signed-off-by: Paul Menage
Reviewed-by: Lai Jiangshan
Cc: Serge Hallyn
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-10-20 23:52:38 +0800
146aa1bd0 cgroups: fix probable race with put_css_set[_taskexit] and find_css_set

put_css_set_taskexit may be called while find_css_set is running on
another cpu, and then the following race can occur:

put_css_set_taskexit side               find_css_set side

                                        |
atomic_dec_and_test(&kref->refcount)    |
    /* kref->refcount = 0 */            |
....................................................................
                                        | read_lock(&css_set_lock)
                                        | find_existing_css_set
                                        | get_css_set
                                        | read_unlock(&css_set_lock);
....................................................................
__release_css_set                       |
....................................................................
                                        | /* use a released css_set */
                                        |

[put_css_set has the same problem, but in the current code all put_css_set
calls sit inside the same cgroup mutex critical region as find_css_set.]

[akpm@linux-foundation.org: repair comments]
[menage@google.com: eliminate race in css_set refcounting]
Signed-off-by: Lai Jiangshan
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lai Jiangshan
2008-10-20 23:52:38 +0800
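
The race above comes down to a lookup taking a reference on an object whose count has already reached zero. One common way to close that window, shown here as a user-space sketch with C11 atomics, is to let the lookup side take a reference only while the count is still positive. `struct obj`, `ref_get_unless_zero()` and `ref_put()` are hypothetical names, and this is not claimed to be the exact change made by the commit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a css_set. */
struct obj {
    atomic_int refcount;
};

/* Take a reference only if the object is still live (count > 0). */
static bool ref_get_unless_zero(struct obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old > 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool ref_put(struct obj *o)
{
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

A lookup that gets `false` back treats the object as already gone instead of using memory that is about to be released.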

17 Oct, 2008

1 commit

9363b9f23 memrlimit: cgroup mm owner callback changes to add task info

This patch adds an additional field to the mm_owner callbacks. This field
is required to get to the mm that changed. Hold mmap_sem in write mode
before calling the mm_owner_changed callback.

[hugh@veritas.com: fix mmap_sem deadlock]
Signed-off-by: Balbir Singh
Cc: Sudhir Kumar
Cc: YAMAMOTO Takashi
Cc: Paul Menage
Cc: Li Zefan
Cc: Pavel Emelianov
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Cc: David Rientjes
Cc: Vivek Goyal
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-10-17 02:21:28 +0800

29 Sep, 2008

1 commit

31a78f23b mm owner: fix race between swapoff and exit

There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc//
or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via the mm_owner_changed() callback
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.

Reported-by: Hugh Dickins
Signed-off-by: Balbir Singh
Signed-off-by: Jiri Slaby
Signed-off-by: Daisuke Nishimura
Signed-off-by: Hugh Dickins
Cc: KAMEZAWA Hiroyuki
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-09-29 23:41:47 +0800

31 Jul, 2008

3 commits

55b6fd016 cgroup: uninline cgroup_has_css_refs()

It's not small enough, and has 2 call sites.

   text    data     bss     dec     hex filename
  12813    1676    4832   19321    4b79 cgroup.o.orig
  12775    1676    4832   19283    4b53 cgroup.o

Signed-off-by: Li Zefan
Cc: Paul Menage
Cc: Cedric Le Goater
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-07-31 00:41:44 +0800
36553434f cgroup: remove duplicate code in allocate_cg_link()

- just call free_cg_links() in allocate_cg_links()
- the list will get initialized in allocate_cg_links(), so don't init
it twice

Signed-off-by: Li Zefan
Cc: Paul Menage
Cc: Cedric Le Goater
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-07-31 00:41:44 +0800
5a3eb9f6b cgroup: fix possible memory leak

There's a leak if copy_from_user() returns failure.

Signed-off-by: Li Zefan
Cc: Paul Menage
Cc: Cedric Le Goater
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-07-31 00:41:44 +0800

27 Jul, 2008

2 commits

3f8206d49 [PATCH] get rid of indirect users of namei.h

fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all.
Several places in the tree needed direct include.

Signed-off-by: Al Viro

Al Viro
2008-07-27 08:53:42 +0800
96930a636 make cgroup_seqfile_release() static

cgroup_seqfile_release() can become static.

Signed-off-by: Adrian Bunk
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2008-07-27 03:00:11 +0800

26 Jul, 2008

9 commits

e885dcde7 cgroup_clone: use pid of newly created task for new cgroup

cgroup_clone creates a new cgroup with the pid of the task. This works
correctly for unshare, but for clone cgroup_clone is called from
copy_namespaces inside copy_process, which happens before the new pid is
created. As a result, the new cgroup was created with current's pid.
This patch:

1. Moves the call inside copy_process to after the new pid
is created
2. Passes the struct pid into ns_cgroup_clone (as it is not
yet attached to the task)
3. Passes a name from ns_cgroup_clone() into cgroup_clone()
so as to keep cgroup_clone() itself simpler
4. Uses pid_vnr() to get the process id value, so that the
pid used to name the new cgroup is always the pid as it
would be known to the task which did the cloning or
unsharing. I think that is the most intuitive thing to
do. This way, if task t1 does clone(CLONE_NEWPID) to get
t2, which does clone(CLONE_NEWPID) to get t3, then the
cgroup for t3 will be named for the pid by which t2 knows
t3.

(Thanks to Dan Smith for finding the main bug)

Changelog:
June 11: Incorporate Paul Menage's feedback: don't pass
NULL to ns_cgroup_clone from unshare, and reduce
patch size by using 'nodename' in cgroup_clone.
June 10: Original version

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge Hallyn
Acked-by: Paul Menage
Tested-by: Dan Smith
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2008-07-26 01:53:37 +0800
af351026a cgroup files: turn attach_task_by_pid directly into a cgroup write handler

This patch changes attach_task_by_pid() to take a u64 rather than a
string; as a result it can be called directly as a control groups
write_u64 handler, and cgroup_common_file_write() can be removed.

Signed-off-by: Paul Menage
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: Balbir Singh
Cc: Serge Hallyn
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-07-26 01:53:36 +0800
6379c1061 cgroup files: move notify_on_release file to separate write handler

This patch moves the write handler for the cgroups notify_on_release
file into a separate handler. This handler requires no cgroups locking
since it relies on atomic bitops for synchronization.

Signed-off-by: Paul Menage
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: Balbir Singh
Cc: Serge Hallyn
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-07-26 01:53:36 +0800
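
The reason a flag like notify_on_release can be written without taking a lock is that a single atomic read-modify-write on the flags word cannot lose a concurrent update to a neighbouring bit. A user-space sketch with C11 atomics follows; `struct group`, the bit position and both helper functions are hypothetical, not the kernel's definitions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit position for the notify-on-release flag. */
#define FLAG_NOTIFY_ON_RELEASE (1UL << 0)

/* Hypothetical stand-in for a cgroup with a flags word. */
struct group {
    atomic_ulong flags;
};

/* Set or clear the flag with one atomic operation; no lock needed. */
static void group_set_notify(struct group *g, bool on)
{
    if (on)
        atomic_fetch_or(&g->flags, FLAG_NOTIFY_ON_RELEASE);
    else
        atomic_fetch_and(&g->flags, ~FLAG_NOTIFY_ON_RELEASE);
}

/* Read the current state of the flag. */
static bool group_notify_enabled(struct group *g)
{
    return (atomic_load(&g->flags) & FLAG_NOTIFY_ON_RELEASE) != 0;
}
```

Other bits in the same word are left untouched by either operation, which is what makes the lock-free write handler safe.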
84eea8428 cgroups: misc cleanups to write_string patchset

This patch contains cleanups suggested by reviewers for the recent
write_string() patchset:

- pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for
clarity, rather than directly unlocking cgroup_mutex.

- make the return type of cgroup_lock_live_group() a bool

- use a #define'd constant for the local buffer size in read/write functions

Signed-off-by: Paul Menage
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: Balbir Singh
Acked-by: Serge Hallyn
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-07-26 01:53:35 +0800
e788e066c cgroup files: move the release_agent file to use typed handlers

Adds cgroup_release_agent_write() and cgroup_release_agent_show()
methods to handle writing/reading the path to a cgroup hierarchy's
release agent. As a result, cgroup_common_file_read() is now unnecessary.

As part of the change, a previously-tolerated race in
cgroup_release_agent() is avoided by copying the current
release_agent_path prior to calling call_usermode_helper().

Signed-off-by: Paul Menage
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: Balbir Singh
Acked-by: Serge Hallyn
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-07-26 01:53:35 +0800
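
Copying release_agent_path before the slow call is a snapshot-under-lock pattern: hold the lock only long enough to take a private copy, then work on the copy. A user-space sketch follows, using a C11 `atomic_flag` as a stand-in spinlock. The buffer size and every function name here are hypothetical; only the variable name `release_agent_path` comes from the message.

```c
#include <stdatomic.h>
#include <string.h>

#define AGENT_PATH_LEN 64  /* hypothetical buffer size */

static atomic_flag agent_lock = ATOMIC_FLAG_INIT;
static char release_agent_path[AGENT_PATH_LEN];

static void agent_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&agent_lock))
        ;  /* spin until the flag was previously clear */
}

static void agent_lock_release(void)
{
    atomic_flag_clear(&agent_lock);
}

/* Writer: update the shared path under the lock. */
static void set_agent_path(const char *path)
{
    agent_lock_acquire();
    strncpy(release_agent_path, path, AGENT_PATH_LEN - 1);
    release_agent_path[AGENT_PATH_LEN - 1] = '\0';
    agent_lock_release();
}

/* Reader: take a private copy under the lock, use it after unlocking. */
static void snapshot_agent_path(char out[AGENT_PATH_LEN])
{
    agent_lock_acquire();
    memcpy(out, release_agent_path, AGENT_PATH_LEN);
    agent_lock_release();
}
```

A later `set_agent_path()` cannot change what the reader passes on to the helper, because the reader only ever touches its own copy.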
db3b14978 cgroup files: add write_string cgroup control file method

This patch adds a write_string() method for cgroups control files. The
semantics are that a buffer is copied from userspace to kernelspace
and the handler function invoked on that buffer. The buffer is
guaranteed to be nul-terminated, and no longer than max_write_len
(defaulting to 64 bytes if unspecified). Later patches will convert
existing raw file write handlers in control group subsystems to use
this method.

Signed-off-by: Paul Menage
Cc: Paul Jackson
Cc: Pavel Emelyanov
Acked-by: Balbir Singh
Acked-by: Serge Hallyn
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Menage
2008-07-26 01:53:35 +0800
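
The write_string() semantics described above (bounded copy, guaranteed nul-termination, then the handler runs on the copy) can be sketched in user space as follows. `dispatch_write_string()`, `record_len()` and the `write_string_fn` type are hypothetical, the plain `memcpy()` stands in for a copy from userspace, and the choice of error codes is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEFAULT_MAX_WRITE_LEN 64  /* default named in the message */

/* Hypothetical handler type: receives a nul-terminated buffer. */
typedef int (*write_string_fn)(const char *buf, void *ctx);

/*
 * Copy at most max_write_len bytes into a local buffer, terminate it,
 * and invoke the handler. A max_write_len of 0 means "use the default".
 * This sketch only supports limits that fit the on-stack buffer.
 */
static int dispatch_write_string(const char *src, size_t nbytes,
                                 size_t max_write_len,
                                 write_string_fn handler, void *ctx)
{
    char local[DEFAULT_MAX_WRITE_LEN + 1];

    if (max_write_len == 0)
        max_write_len = DEFAULT_MAX_WRITE_LEN;
    if (max_write_len > DEFAULT_MAX_WRITE_LEN)
        return -EINVAL;
    if (nbytes > max_write_len)
        return -E2BIG;

    memcpy(local, src, nbytes);
    local[nbytes] = '\0';  /* the handler can rely on termination */
    return handler(local, ctx);
}

/* Example handler: stores the length of the string it was given. */
static int record_len(const char *buf, void *ctx)
{
    *(size_t *)ctx = strlen(buf);
    return 0;
}
```

Because the handler only ever sees the terminated copy, individual control files no longer need their own length checks or termination logic.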
8947f9d5b cgroups: annotate two variables with __read_mostly

- need_forkexit_callback will be read only after system boot.
- use_task_css_set_links will be read only after it's set.

And these 2 variables are checked when a new process is forked.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Acked-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-07-26 01:53:35 +0800
71cbb949d cgroup: list_for_each cleanup

--------------------------
while() {
list_entry();
...
}
--------------------------

is equivalent to the following code.

--------------------------
list_for_each_entry(){
...
}
--------------------------

This makes the code easier to review later.

This patch is just a cleanup; it doesn't change any behavior.

Signed-off-by: KOSAKI Motohiro
Cc: Paul Menage
Cc: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2008-07-26 01:53:35 +0800
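
The equivalence claimed in this commit can be checked with a small user-space model of an intrusive circular list. Everything below is a simplified stand-in: `LIST_ENTRY` and `LIST_FOR_EACH_ENTRY` are hypothetical macros that take the type explicitly (the kernel's versions use typeof), and `struct item` is made up for the example.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Recover the containing structure from a pointer to its list node. */
#define LIST_ENTRY(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified iterator: walks entries until the node is the head again. */
#define LIST_FOR_EACH_ENTRY(pos, head, type, member)            \
    for ((pos) = LIST_ENTRY((head)->next, type, member);        \
         &(pos)->member != (head);                              \
         (pos) = LIST_ENTRY((pos)->member.next, type, member))

struct item {
    int value;
    struct list_head node;
};

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Open-coded form: while loop plus an explicit entry lookup. */
static int sum_open_coded(struct list_head *head)
{
    int sum = 0;
    struct list_head *l = head->next;

    while (l != head) {
        struct item *it = LIST_ENTRY(l, struct item, node);
        sum += it->value;
        l = l->next;
    }
    return sum;
}

/* Same traversal expressed with the iterator macro. */
static int sum_with_macro(struct list_head *head)
{
    int sum = 0;
    struct item *it;

    LIST_FOR_EACH_ENTRY(it, head, struct item, node)
        sum += it->value;
    return sum;
}
```

Both functions visit the same entries in the same order; the macro form just hides the cursor and the entry lookup.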
7e9abd89c cgroup: use read lock to guard find_existing_css_set()

The function does not modify anything (except the temporary css template), so
it's sufficient to hold read lock.

Signed-off-by: Li Zefan
Acked-by: Paul Menage
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-07-26 01:53:35 +0800

25 May, 2008

1 commit

5c02b5757 cgroups: remove node_ prefix_from ns subsystem

This is a slight change in the namespace cgroup subsystem api.

The change is that previously when cgroup_clone() was called (currently
only from the unshare path in ns_proxy cgroup), you'd get a new group named
"node_$pid", whereas now you'll get a group named after just your pid.

The only users who would notice it are those who are using the ns_proxy
cgroup subsystem to auto-create cgroups when namespaces are unshared -
something of an experimental feature, which I think really needs more
complete container/namespace support in order to be useful. I suspect the
only users are Cedric and Serge, or maybe a few others on
containers@lists.linux-foundation.org. And in fact it would only be
noticed by the users who make the assumption about how the name is
generated, rather than getting it from the /proc//cgroups file for
the process in question.

Whether the change is actually needed or not I'm fairly agnostic on, but I
guess it is more elegant to just use the pid as the new group name rather
than adding a fairly arbitrary "node_" prefix on the front.

[menage@google.com: provided changelog]
Signed-off-by: Cedric Le Goater
Cc: "Paul Menage"
Cc: "Serge E. Hallyn"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2008-05-25 00:56:14 +0800

30 Apr, 2008

1 commit

e4ad08fe6 mm: bdi: add separate writeback accounting capability

Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is
set, then don't update the per-bdi writeback stats from
test_set_page_writeback() and test_clear_page_writeback().

Misc cleanups:

- convert bdi_cap_writeback_dirty() and friends to static inline functions
- create a flag that includes all three dirty/writeback related flags,
since almost all users will want to have them together

Signed-off-by: Miklos Szeredi
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Miklos Szeredi
2008-04-30 23:29:50 +0800
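
A sketch of the flag layout this commit describes: three independent capability bits, one combined mask, and static inline predicates instead of macros. Only `BDI_CAP_NO_ACCT_WB` and `bdi_cap_writeback_dirty()` are named in the message; the other identifiers, the bit values and `struct bdi_sketch` are assumptions made for illustration.

```c
#include <stdbool.h>

/* Bit values are illustrative, not taken from the kernel headers. */
#define BDI_CAP_NO_ACCT_DIRTY 0x01u  /* assumed name */
#define BDI_CAP_NO_WRITEBACK  0x02u  /* assumed name */
#define BDI_CAP_NO_ACCT_WB    0x04u  /* name from the commit message */

/* Combined flag for users that want all three at once (assumed name). */
#define BDI_CAP_NO_ACCT_AND_WRITEBACK \
    (BDI_CAP_NO_ACCT_DIRTY | BDI_CAP_NO_WRITEBACK | BDI_CAP_NO_ACCT_WB)

/* Hypothetical stand-in for the backing device info structure. */
struct bdi_sketch {
    unsigned int capabilities;
};

/* Does this device take part in dirty page writeback? */
static inline bool bdi_cap_writeback_dirty(const struct bdi_sketch *bdi)
{
    return !(bdi->capabilities & BDI_CAP_NO_WRITEBACK);
}

/* Should per-bdi writeback stats be updated? (assumed helper name) */
static inline bool bdi_cap_account_writeback(const struct bdi_sketch *bdi)
{
    return !(bdi->capabilities & BDI_CAP_NO_ACCT_WB);
}
```

Static inline functions give the compiler a typed argument to check, which the macro versions did not.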

29 Apr, 2008

3 commits

cf475ad28 cgroups: add an owner to the mm_struct

Remove the mem_cgroup member from mm_struct and instead add an owner.

This approach was suggested by Paul Menage. The advantage of this approach
is that, once the mm->owner is known, using the subsystem id, the cgroup
can be determined. It also allows several control groups that are
virtually grouped by mm_struct, to exist independent of the memory
controller i.e., without adding mem_cgroup's for each controller, to
mm_struct.

A new config option CONFIG_MM_OWNER is added and the memory resource
controller selects this config option.

This patch also adds cgroup callbacks to notify subsystems when mm->owner
changes. The mm_cgroup_changed callback is called with the task_lock() of
the new task held and is called just prior to changing the mm->owner.

I am indebted to Paul Menage for the several reviews of this patchset and
helping me make it lighter and simpler.

This patch was tested on a powerpc box, it was compiled with both the
MM_OWNER config turned on and off.

After the thread group leader exits, it's moved to init_css_set by
cgroup_exit(), thus all future charges from running threads would be
redirected to the init_css_set's subsystem.

Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Hugh Dickins
Cc: Sudhir Kumar
Cc: YAMAMOTO Takashi
Cc: Hirokazu Takahashi
Cc: David Rientjes ,
Cc: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Pekka Enberg
Reviewed-by: Paul Menage
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-04-29 23:06:10 +0800
29486df32 cgroups: introduce cft->read_seq()

Introduce a read_seq() helper in cftype, which uses seq_file to print out
lists. Use it in the devices cgroup. Also split devices.allow into two
files, so now devices.deny and devices.allow are the ones to use to manipulate
the whitelist, while devices.list outputs the cgroup's current whitelist.

Signed-off-by: Serge E. Hallyn
Acked-by: Paul Menage
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2008-04-29 23:06:10 +0800
28fd5dfc1 cgroups: remove the css_set linked-list

Now we can run through the hash table instead of running through the
linked-list.

Signed-off-by: Li Zefan
Reviewed-by: Paul Menage
Cc: Balbir Singh
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Li Zefan
2008-04-29 23:06:10 +0800