Doug / smarc-fsl-linux-kernel | Embedian Git Server

24 Feb, 2008

1 commit

eaeb16883 sparc: fix build

Fix build failure on sparc:

In file included from include/linux/mm.h:39,
from include/linux/memcontrol.h:24,
from include/linux/swap.h:8,
from include/linux/suspend.h:7,
from init/do_mounts.c:6:
include/asm/pgtable.h:344: warning: parameter names (without
types) in function declaration
include/asm/pgtable.h:345: warning: parameter names (without
types) in function declaration
include/asm/pgtable.h:346: error: expected '=', ',', ';', 'asm' or
'__attribute__' before '___f___swp_entry'

viro sayeth:

I've run allmodconfig builds on a bunch of targets, FWIW (essentially the
same patch). Note that these includes are a recent addition caused by an added
inline function that has since become a define. So while I agree with
your comments in general, in _this_ case it's pretty safe.

The commit that had done it is 3062fc67dad01b1d2a15d58c709eff946389eca4
("memcontrol: move mm_cgroup to header file") and the switch to #define
is in commit 60c12b1202a60eabb1c61317e5d2678fcea9893f ("memcontrol: add
vm_match_cgroup()") (BTW, that probably warranted mentioning in the
changelog of the latter).

Cc: Adrian Bunk
Cc: Robert Reif
Signed-off-by: David Rientjes
Cc: "David S. Miller"
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2008-02-24 09:12:14 +0800

10 Feb, 2008

1 commit

60c12b120 memcontrol: add vm_match_cgroup()

mm_cgroup() is exclusively used to test whether an mm's mem_cgroup pointer
is pointing to a specific cgroup. Instead of returning the pointer, we can
just do the test itself in a new macro:

vm_match_cgroup(mm, cgroup)

returns non-zero if the mm's mem_cgroup points to cgroup. Otherwise it
returns zero.
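
A minimal sketch of what such a macro could look like. The struct layouts below are stand-ins, and the `mem_cgroup` field name is an assumption for illustration; only the macro name and its return convention come from the commit message.

```c
/* Stand-in types; the real kernel structures carry far more state. */
struct mem_cgroup { int id; };
struct mm_struct { struct mem_cgroup *mem_cgroup; };

/*
 * Evaluates to non-zero when mm's mem_cgroup pointer refers to cgroup,
 * and to zero otherwise. The pointer itself is never handed back to
 * the caller; only the result of the comparison is.
 */
#define vm_match_cgroup(mm, cgroup) ((mm)->mem_cgroup == (cgroup))
```

A caller would write something like `if (vm_match_cgroup(mm, mem)) ...` instead of fetching the pointer and comparing it by hand.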

Signed-off-by: David Rientjes
Cc: Balbir Singh
Cc: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2008-02-10 03:08:33 +0800

08 Feb, 2008

16 commits

3c541e14b Memory controller remove control_type feature

Based on the discussion at http://lkml.org/lkml/2007/12/20/383, it was felt
that control_type might not be a good thing to implement right away. We
can add this flexibility at a later point when required.

Signed-off-by: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:22 +0800
cc38108e1 per-zone and reclaim enhancements for memory controller: calculate the number of pages to be scanned per cgroup

Define a function for calculating the number of pages to scan on each zone/LRU.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

KAMEZAWA Hiroyuki
2008-02-08 00:42:22 +0800
6c48a1d04 per-zone and reclaim enhancements for memory controller: remember reclaim priority in memory cgroup

Add functions to remember the reclaim priority per cgroup (as zone->prev_priority does for a zone).

[akpm@linux-foundation.org: build fixes]
[akpm@linux-foundation.org: more build fixes]
Signed-off-by: KAMEZAWA Hiroyuki
Cc: "Eric W. Biederman"
Cc: Balbir Singh
Cc: David Rientjes
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Nick Piggin
Cc: Paul Menage
Cc: Pavel Emelianov
Cc: Peter Zijlstra
Cc: Vaidyanathan Srinivasan
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2008-02-08 00:42:22 +0800
5932f3671 per-zone and reclaim enhancements for memory controller: calculate active/inactive imbalance per cgroup

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

KAMEZAWA Hiroyuki
2008-02-08 00:42:21 +0800
58ae83db2 per-zone and reclaim enhancements for memory controller: calculate mapper_ratio per cgroup

Define a function for calculating mapped_ratio in a memory cgroup.

Signed-off-by: KAMEZAWA Hiroyuki
Cc: "Eric W. Biederman"
Cc: Balbir Singh
Cc: David Rientjes
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Nick Piggin
Cc: Paul Menage
Cc: Pavel Emelianov
Cc: Peter Zijlstra
Cc: Vaidyanathan Srinivasan
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2008-02-08 00:42:21 +0800
ae41be374 bugfix for memory cgroup controller: migration under memory controller fix

When the memory control cgroup is in use, page migration under it works as follows:
==
1. Uncharge all refs at try_to_unmap().
2. Charge the refs again at remove_migration_ptes().
==
This is simple but has the following problems:
==
The page is uncharged and charged back again only if it is *mapped*.
- This means that the cgroup before migration can differ from the one after
migration.
- If the page is not mapped but is charged as page cache, the charge is just ignored
(because the page is not mapped, it will not be uncharged before migration).
This is a memory leak.
==
This patch tries to keep the memory cgroup across page migration by taking
one extra refcnt for its duration. Three functions are added.

mem_cgroup_prepare_migration() --- increase refcnt of page->page_cgroup
mem_cgroup_end_migration() --- decrease refcnt of page->page_cgroup
mem_cgroup_page_migration() --- copy page->page_cgroup from old page to
new page.

During migration
- old page is under PG_locked.
- new page is under PG_locked, too.
- neither the old page nor the new page is on the LRU.

These 3 facts guarantee that page_cgroup() migration has no race.
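
The reference-holding scheme can be sketched in a simplified, single-threaded model. Everything below is illustrative: the `sketch_` functions only mirror the roles of the three functions named above, the struct fields are assumptions, and the locking that the real code relies on (PG_locked, pages off the LRU) is not modeled.

```c
#include <stddef.h>

/* Simplified stand-ins; field names are assumptions for illustration. */
struct page_cgroup { int ref_cnt; };
struct page { struct page_cgroup *page_cgroup; };

/* Take an extra reference so unmapping cannot drop the last one. */
static void sketch_prepare_migration(struct page *page)
{
	if (page->page_cgroup)
		page->page_cgroup->ref_cnt++;
}

/* Hand the page_cgroup over from the old page to the new page. */
static void sketch_page_migration(struct page *oldpage, struct page *newpage)
{
	newpage->page_cgroup = oldpage->page_cgroup;
	oldpage->page_cgroup = NULL;
}

/* Drop the extra reference taken in sketch_prepare_migration(). */
static void sketch_end_migration(struct page *page)
{
	if (page->page_cgroup)
		page->page_cgroup->ref_cnt--;
}
```

Because the extra reference keeps the count above zero while the page is unmapped, the charge stays attached to the same cgroup before and after migration, and a page that is only in the page cache carries its charge over too.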

Tested and worked well on an x86_64/fake-NUMA box.

Signed-off-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2008-02-08 00:42:19 +0800
4c4a22148 memcontrol: move oom task exclusion to tasklist scan

Creates a helper function to return non-zero if a task is a member of a
memory controller:

int task_in_mem_cgroup(const struct task_struct *task,
const struct mem_cgroup *mem);

When the OOM killer is constrained by the memory controller, the exclusion
of tasks that are not a member of that controller was previously misplaced
and appeared in the badness scoring function. It should be excluded
during the tasklist scan in select_bad_process() instead.
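
A rough user-space sketch of the idea: filter by membership while walking the task list, before any scoring happens. The helper keeps the signature quoted above, but its body, the `task_struct` stand-in with its `mem` and `score` fields, and `sketch_select_victim()` are all assumptions made for illustration.

```c
#include <stddef.h>

struct mem_cgroup { int id; };

/* Stand-in task: the group it belongs to and a precomputed badness score. */
struct task_struct { const struct mem_cgroup *mem; unsigned long score; };

/* Non-zero if task is a member of mem (body is an assumption). */
int task_in_mem_cgroup(const struct task_struct *task,
		       const struct mem_cgroup *mem)
{
	return task->mem == mem;
}

/*
 * Pick the highest-scoring task. When mem is non-NULL the scan is
 * constrained: tasks outside that group are skipped before scoring,
 * so the scoring step never needs to know about cgroups.
 */
static const struct task_struct *
sketch_select_victim(const struct task_struct *tasks, size_t n,
		     const struct mem_cgroup *mem)
{
	const struct task_struct *chosen = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (mem && !task_in_mem_cgroup(&tasks[i], mem))
			continue;
		if (!chosen || tasks[i].score > chosen->score)
			chosen = &tasks[i];
	}
	return chosen;
}
```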

[akpm@linux-foundation.org: build fix]
Cc: Christoph Lameter
Cc: Balbir Singh
Signed-off-by: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2008-02-08 00:42:19 +0800
3062fc67d memcontrol: move mm_cgroup to header file

Inline functions must precede their use, so mm_cgroup() should be defined
in linux/memcontrol.h.

include/linux/memcontrol.h:48: warning: 'mm_cgroup' declared inline after
being called
include/linux/memcontrol.h:48: warning: previous declaration of
'mm_cgroup' was here
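
As an illustration of the ordering the warning asks for, here is a small header-style sketch. The struct layouts and the `mm_in_group()` caller are hypothetical; only the name `mm_cgroup()` comes from the commit, and its exact signature here is assumed.

```c
/* Stand-in types; the real kernel structures are much larger. */
struct mem_cgroup { int id; };
struct mm_struct { struct mem_cgroup *mem_cgroup; };

/* Defined first, so every later caller sees the full inline body. */
static inline struct mem_cgroup *mm_cgroup(const struct mm_struct *mm)
{
	return mm->mem_cgroup;
}

/* A later caller, in the same header or in any file that includes it. */
static inline int mm_in_group(const struct mm_struct *mm,
			      const struct mem_cgroup *cgroup)
{
	return mm_cgroup(mm) == cgroup;
}
```

Placing the definition in the header means each including file gets the body before its first call, rather than a plain declaration followed later by an inline definition.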

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: nuther build fix]
Cc: Balbir Singh
Signed-off-by: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2008-02-08 00:42:19 +0800
e1a1cd590 Memory controller: make charging gfp mask aware

Nick Piggin pointed out that swap cache and page cache addition routines
could be called from non-GFP_KERNEL contexts. This patch makes the
charging routine aware of the gfp context. Charging might fail if the
cgroup is over its limit, in which case a suitable error is returned.
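
One way to picture a gfp-aware charge path is the sketch below. It is a model under stated assumptions, not the patch itself: `SKETCH_GFP_WAIT`, the `usage`/`limit`/`reclaimable` fields, and both `sketch_` functions are invented for illustration, and `-ENOMEM` is assumed as the "suitable error".

```c
#include <errno.h>

typedef unsigned int gfp_t;
#define SKETCH_GFP_WAIT 0x1u	/* hypothetical flag: caller may sleep */

/* Stand-in counters; reclaimable is assumed never to exceed usage. */
struct mem_cgroup { unsigned long usage, limit, reclaimable; };

/* Hypothetical reclaim step: frees what it can and reports the amount. */
static unsigned long sketch_reclaim(struct mem_cgroup *mem)
{
	unsigned long freed = mem->reclaimable;

	mem->usage -= freed;
	mem->reclaimable = 0;
	return freed;
}

/* Charge one page; returns 0 on success or -ENOMEM when over the limit. */
static int sketch_charge(struct mem_cgroup *mem, gfp_t gfp_mask)
{
	while (mem->usage >= mem->limit) {
		/* Reclaim may block, so it is only tried if the mask allows. */
		if (!(gfp_mask & SKETCH_GFP_WAIT))
			return -ENOMEM;
		if (!sketch_reclaim(mem))
			return -ENOMEM;
	}
	mem->usage++;
	return 0;
}
```

The point of passing the mask down is that a caller in a context that must not sleep gets an error back immediately instead of entering reclaim.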

This patch was tested on a PowerPC box. I am still looking for a way
to test the path through which allocations happen in non-GFP_KERNEL
contexts.

[kamezawa.hiroyu@jp.fujitsu.com: problem with ZONE_MOVABLE]
Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:19 +0800
bed7161a5 Memory controller: make page_referenced() cgroup aware

Make page_referenced() cgroup aware. Without this patch, page_referenced()
can cause a page to be skipped while reclaiming pages. This patch ensures
that other cgroups do not hold pages in a particular cgroup hostage. It
is required to ensure that shared pages are freed from a cgroup when they
are not actively referenced from the cgroup that brought them in.
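
The effect can be modeled as a reference count that ignores mappings from other cgroups. The sketch below is a user-space approximation: `struct mapping`, its `young` flag, and `sketch_page_referenced()` are hypothetical names, and the real function walks reverse mappings rather than an array.

```c
#include <stddef.h>

struct mem_cgroup { int id; };
struct mm_struct { struct mem_cgroup *mem_cgroup; };

/* One mapping of a page: the mm that maps it and its accessed flag. */
struct mapping { const struct mm_struct *mm; int young; };

/*
 * Count recent references to a page. When mem is non-NULL, only
 * mappings whose mm belongs to that cgroup are counted, so accesses
 * from other cgroups cannot keep the page from being reclaimed.
 */
static int sketch_page_referenced(const struct mapping *maps, size_t n,
				  const struct mem_cgroup *mem)
{
	int referenced = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (mem && maps[i].mm->mem_cgroup != mem)
			continue;
		if (maps[i].young)
			referenced++;
	}
	return referenced;
}
```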

Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:19 +0800
8697d3319 Memory controller: add switch to control what type of pages to limit

Choose whether cached pages are accounted or not. By default both mapped
and cached pages are accounted for. A new set of tunables is added.

echo -n 1 > mem_control_type

switches the accounting to account for only mapped pages

echo -n 3 > mem_control_type

switches the behaviour back

[bunk@kernel.org: mm/memcontrol.c: cleanups]
[akpm@linux-foundation.org: fix sparc32 build]
Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:19 +0800
c7ba5c9e8 Memory controller: OOM handling

Out of memory handling for cgroups over their limit. A task from the
cgroup over limit is chosen using the existing OOM logic and killed.

TODO:
1. As discussed in the OLS BOF session, consider implementing a user
space policy for OOM handling.

[akpm@linux-foundation.org: fix build due to oom-killer changes]
Signed-off-by: Pavel Emelianov
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2008-02-08 00:42:19 +0800
66e1707bc Memory controller: add per cgroup LRU and reclaim

Add the page_cgroup to the per cgroup LRU. The reclaim algorithm has
been modified to make isolate_lru_pages() a pluggable component. The
scan_control data structure now accepts the cgroup on behalf of which
reclaims are carried out. try_to_free_pages() has been extended to become
cgroup aware.
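
A pluggable isolation step can be sketched with a function pointer in `scan_control`. The `isolate_pages` member name appears in the notes below; everything else here (the counters standing in for LRU lists, the callback signature, and the `sketch_` functions) is an assumption for illustration.

```c
#include <stddef.h>

/* Stand-in: the per cgroup LRU is reduced to a page count. */
struct mem_cgroup { unsigned long nr_lru; };

struct scan_control {
	struct mem_cgroup *mem_cgroup;	/* NULL means global reclaim */
	unsigned long (*isolate_pages)(unsigned long nr,
				       struct scan_control *sc);
};

static unsigned long sketch_global_lru = 100;	/* stand-in for a zone LRU */

/* Take up to nr pages from the global list. */
static unsigned long sketch_isolate_global(unsigned long nr,
					   struct scan_control *sc)
{
	unsigned long taken = nr < sketch_global_lru ? nr : sketch_global_lru;

	(void)sc;
	sketch_global_lru -= taken;
	return taken;
}

/* Take up to nr pages from the list of the cgroup being reclaimed. */
static unsigned long sketch_isolate_cgroup(unsigned long nr,
					   struct scan_control *sc)
{
	unsigned long avail = sc->mem_cgroup->nr_lru;
	unsigned long taken = nr < avail ? nr : avail;

	sc->mem_cgroup->nr_lru -= taken;
	return taken;
}

/* Reclaim calls through the pointer and never picks the list itself. */
static unsigned long sketch_shrink(unsigned long nr, struct scan_control *sc)
{
	return sc->isolate_pages(nr, sc);
}
```

With this shape, the same reclaim loop serves both cases; which list gets scanned depends only on how the caller filled in `scan_control`.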

[akpm@linux-foundation.org: fix warning]
[Lee.Schermerhorn@hp.com: initialize all scan_control's isolate_pages member]
[bunk@kernel.org: make do_try_to_free_pages() static]
[hugh@veritas.com: memcgroup: fix try_to_free order]
[kamezawa.hiroyu@jp.fujitsu.com: this unlock_page_cgroup() is unnecessary]
Signed-off-by: Pavel Emelianov
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Lee Schermerhorn
Signed-off-by: Hugh Dickins
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:18 +0800
8a9f3ccd2 Memory controller: memory accounting

Add the accounting hooks. The accounting is carried out for RSS and Page
Cache (unmapped) pages. There is now a common limit and accounting for both.
RSS is accounted at page_add_*_rmap() and page_remove_rmap()
time. Page cache is accounted at add_to_page_cache() and
__delete_from_page_cache(). Swap cache is also accounted for.

Each page's page_cgroup is protected with the last bit of the
page_cgroup pointer; this makes handling of race conditions involving
simultaneous mappings of a page easier. A reference count is kept in the
page_cgroup to deal with cases where a page might be unmapped from the RSS
of all tasks, but still lives in the page cache.
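
Keeping a lock flag in the low bit of a pointer works because aligned objects never use that bit for addressing. The sketch below shows the packing only; all names are hypothetical, and the lock and unlock helpers are plain, non-atomic bit operations, whereas real code would need an atomic bit lock.

```c
#include <stdint.h>

struct page_cgroup { int ref_cnt; };

/* The page stores the pointer and the lock flag in a single word. */
struct page { uintptr_t page_cgroup; };

#define SKETCH_PC_LOCK ((uintptr_t)1)	/* lowest bit, free due to alignment */

/* Strip the lock bit to recover the real pointer. */
static struct page_cgroup *sketch_get_pc(const struct page *page)
{
	return (struct page_cgroup *)(page->page_cgroup & ~SKETCH_PC_LOCK);
}

/* Store a new pointer while preserving the current lock state. */
static void sketch_set_pc(struct page *page, struct page_cgroup *pc)
{
	page->page_cgroup = (uintptr_t)pc | (page->page_cgroup & SKETCH_PC_LOCK);
}

/* Non-atomic stand-ins for taking and releasing the bit lock. */
static void sketch_lock_pc(struct page *page)
{
	page->page_cgroup |= SKETCH_PC_LOCK;
}

static void sketch_unlock_pc(struct page *page)
{
	page->page_cgroup &= ~SKETCH_PC_LOCK;
}

static int sketch_pc_locked(const struct page *page)
{
	return (page->page_cgroup & SKETCH_PC_LOCK) != 0;
}
```

Since the lock lives in the same word as the pointer, no extra field is needed per page, and the pointer can only be read correctly by masking the bit off first.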

Credits go to Vaidyanathan Srinivasan for helping with reference counting work
of the page cgroup. Almost all of the page cache accounting code has help
from Vaidyanathan Srinivasan.

[hugh@veritas.com: fix swapoff breakage]
[akpm@linux-foundation.org: fix locking]
Signed-off-by: Vaidyanathan Srinivasan
Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc:
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:18 +0800
78fb74669 Memory controller: accounting setup

Basic setup routines: the mm_struct has a pointer to the cgroup that
it belongs to, and the page has a page_cgroup associated with it.

Signed-off-by: Pavel Emelianov
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2008-02-08 00:42:18 +0800
8cdea7c05 Memory controller: cgroups setup

Set up the memory cgroup and add basic hooks and controls to integrate
and work with the cgroup.

Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-02-08 00:42:18 +0800