04 Apr, 2014
1 commit
-
Mark functions as static in page_cgroup.c because they are not used
outside this file. This eliminates the following warnings in mm/page_cgroup.c:
mm/page_cgroup.c:177:6: warning: no previous prototype for `__free_page_cgroup' [-Wmissing-prototypes]
mm/page_cgroup.c:190:15: warning: no previous prototype for `online_page_cgroup' [-Wmissing-prototypes]
mm/page_cgroup.c:225:15: warning: no previous prototype for `offline_page_cgroup' [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria
Reviewed-by: Josh Triplett
Acked-by: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
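For illustration, a minimal user-space sketch of this warning class and the
static fix (the helper below is a made-up stand-in, not kernel code; compile
with gcc -Wmissing-prototypes -Wall):

    /* Without 'static', gcc -Wmissing-prototypes warns that a function
     * with external linkage has no previous prototype. Marking the
     * file-local helper static silences the warning. */
    static int helper(int x)
    {
            return x * 2;
    }

    int main(void)
    {
            return helper(21) == 42 ? 0 : 1;
    }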
22 Jan, 2014
2 commits
-
Merge first patch-bomb from Andrew Morton:
- a couple of misc things
- inotify/fsnotify work from Jan
- ocfs2 updates (partial)
- about half of MM
* emailed patches from Andrew Morton: (117 commits)
mm/migrate: remove unused function, fail_migrate_page()
mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages
mm/migrate: correct failure handling if !hugepage_migration_support()
mm/migrate: add comment about permanent failure path
mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure
mm: compaction: reset scanner positions immediately when they meet
mm: compaction: do not mark unmovable pageblocks as skipped in async compaction
mm: compaction: detect when scanners meet in isolate_freepages
mm: compaction: reset cached scanner pfn's before reading them
mm: compaction: encapsulate defer reset logic
mm: compaction: trace compaction begin and end
memcg, oom: lock mem_cgroup_print_oom_info
sched: add tracepoints related to NUMA task migration
mm: numa: do not automatically migrate KSM pages
mm: numa: trace tasks that fail migration due to rate limiting
mm: numa: limit scope of lock for NUMA migrate rate limiting
mm: numa: make NUMA-migrate related functions static
lib/show_mem.c: show num_poisoned_pages when oom
mm/hwpoison: add '#' to hwpoison_inject
mm/memblock: use WARN_ONCE when MAX_NUMNODES passed as input parameter
...
-
Switch to memblock interfaces for the early memory allocator instead of
the bootmem allocator. No functional change in behavior from the bootmem
users' point of view. Archs already converted to NO_BOOTMEM now directly
use memblock interfaces instead of bootmem wrappers built on top of
memblock. For the archs which still use bootmem, these new APIs just fall
back to the existing bootmem APIs.
Signed-off-by: Grygorii Strashko
Signed-off-by: Santosh Shilimkar
Cc: "Rafael J. Wysocki"
Cc: Arnd Bergmann
Cc: Christoph Lameter
Cc: Greg Kroah-Hartman
Cc: Grygorii Strashko
Cc: H. Peter Anvin
Cc: Johannes Weiner
Cc: KAMEZAWA Hiroyuki
Cc: Konrad Rzeszutek Wilk
Cc: Michal Hocko
Cc: Paul Walmsley
Cc: Pavel Machek
Cc: Russell King
Cc: Tejun Heo
Cc: Tony Lindgren
Cc: Yinghai Lu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
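To make the fallback shape concrete, a compilable user-space sketch (the
stubs below stand in for memblock/bootmem internals and are not the kernel
API; toggle CONFIG_NO_BOOTMEM to pick the path):

    #include <stdio.h>
    #include <stdlib.h>

    /* #define CONFIG_NO_BOOTMEM */   /* defined on archs converted to memblock */

    static void *memblock_alloc_stub(size_t size)  /* memblock-backed path */
    {
            return calloc(1, size);
    }

    static void *bootmem_alloc_stub(size_t size)   /* legacy bootmem path */
    {
            return calloc(1, size);
    }

    static void *early_alloc(size_t size)
    {
    #ifdef CONFIG_NO_BOOTMEM
            return memblock_alloc_stub(size);
    #else
            return bootmem_alloc_stub(size);       /* new API falls back here */
    #endif
    }

    int main(void)
    {
            void *p = early_alloc(64);

            if (!p)
                    return 1;
            free(p);
            printf("early_alloc ok\n");
            return 0;
    }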
13 Jan, 2014
1 commit
-
Trivial: remove the few stray references to css_id, which itself
was removed in v3.13's 2ff2a7d03bbe "cgroup: kill css_id".
Signed-off-by: Hugh Dickins
Signed-off-by: Tejun Heo
13 Dec, 2012
1 commit
-
N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory. The code here needs
to handle the nodes which have memory, so we should use N_MEMORY instead.
Signed-off-by: Lai Jiangshan
Signed-off-by: Wen Congyang
Cc: Christoph Lameter
Cc: Hillf Danton
Cc: Lin Feng
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Dec, 2012
1 commit
-
When a memory block is onlined, we try to allocate memory on that node
to store page_cgroup. If onlining the memory block fails, we don't
offline the page cgroup, and we have no chance to offline this page cgroup
unless the memory block is onlined successfully again. As a result, we
can't hot-remove the memory device on that node, because some memory is
still used to store page cgroup. If onlining the memory block fails, there
is no need to store page cgroup for this memory. So automatically offline
page_cgroup when onlining the memory block fails.
Signed-off-by: Wen Congyang
Cc: David Rientjes
Cc: Jiang Liu
Cc: Len Brown
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Christoph Lameter
Cc: Minchan Kim
Acked-by: KOSAKI Motohiro
Cc: Yasuaki Ishimatsu
Cc: Dave Hansen
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Aug, 2012
1 commit
-
Sanity:
CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG
CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP
CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED
CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM
[mhocko@suse.cz: fix missed bits]
Cc: Glauber Costa
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: KAMEZAWA Hiroyuki
Cc: Hugh Dickins
Cc: Tejun Heo
Cc: Aneesh Kumar K.V
Cc: David Rientjes
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Jun, 2012
1 commit
-
Fix kernel-doc warnings such as
Warning(../mm/page_cgroup.c:432): No description found for parameter 'id'
Warning(../mm/page_cgroup.c:432): Excess function parameter 'mem' description in 'swap_cgroup_record'
Signed-off-by: Wanpeng Li
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Mar, 2012
1 commit
-
Why is memcg's swap accounting so broken? Insane counts, wrong
ownership, unfreeable structures, which later get freed and then
accessed after free. Turns out to be a tiny little 3.3-rc1 regression in
9fb4b7cc0724 "page_cgroup: add helper function to get swap_cgroup": the
helper function (actually named lookup_swap_cgroup()) returns an address
using void* arithmetic, but the structure in question is a short.
Signed-off-by: Hugh Dickins
Reviewed-by: Bob Liu
Cc: Michal Hocko
Cc: KAMEZAWA Hiroyuki
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
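For illustration, a minimal user-space demo of this bug class: gcc's void*
arithmetic advances one byte per unit, so indexing an array of 2-byte
records through a void* lands on the wrong entry (names below are
stand-ins; compile with gcc, which permits void* arithmetic):

    #include <assert.h>
    #include <stdint.h>

    static uint16_t records[8] = { 0, 10, 20, 30, 40, 50, 60, 70 };

    static void *lookup_wrong(unsigned int idx)
    {
            void *base = records;
            return base + idx;        /* advances idx BYTES, not records */
    }

    static uint16_t *lookup_right(unsigned int idx)
    {
            return &records[idx];     /* scales by sizeof(uint16_t) */
    }

    int main(void)
    {
            /* entry 3 really lives 6 bytes in, not 3 */
            assert((void *)lookup_right(3) == (void *)((char *)records + 6));
            assert(lookup_wrong(3) != (void *)lookup_right(3));
            return 0;
    }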
13 Jan, 2012
5 commits
-
No need for two CONFIG_MEMORY_HOTPLUG blocks.
Signed-off-by: Bob Liu
Acked-by: Michal Hocko
Cc: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There are multiple places which need to get the swap_cgroup address, so
add a helper function:
static struct swap_cgroup *swap_cgroup_getsc(swp_entry_t ent,
        struct swap_cgroup_ctrl **ctrl);
to simplify the code.
Signed-off-by: Bob Liu
Acked-by: Michal Hocko
Acked-by: KAMEZAWA Hiroyuki
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
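As a sketch of what such a lookup computes: the record for swap offset N
lives on backing page N / SC_PER_PAGE at index N % SC_PER_PAGE (types and
the SC_PER_PAGE value below are simplified stand-ins, not the kernel's):

    #include <assert.h>

    #define SC_PER_PAGE 2048UL        /* records per backing page (demo value) */

    struct swap_cgroup { unsigned short id; };

    static struct swap_cgroup page0[SC_PER_PAGE], page1[SC_PER_PAGE];
    static struct swap_cgroup *map[] = { page0, page1 };

    static struct swap_cgroup *swap_cgroup_getsc_demo(unsigned long offset)
    {
            return &map[offset / SC_PER_PAGE][offset % SC_PER_PAGE];
    }

    int main(void)
    {
            assert(swap_cgroup_getsc_demo(0)    == &page0[0]);
            assert(swap_cgroup_getsc_demo(2049) == &page1[1]);
            return 0;
    }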
-
lookup_page_cgroup() is usually used only against pages that are used in
userspace. The exception is the CONFIG_DEBUG_VM-only memcg check from the
page allocator: it can run on pages without page_cgroup descriptors
allocated when the pages are fed into the page allocator for the first
time during boot or memory hotplug. Include the array check only when
CONFIG_DEBUG_VM is set and save the unnecessary check in production
kernels.
Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Michal Hocko
Cc: Balbir Singh
Cc: David Rientjes
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
To find the page corresponding to a certain page_cgroup, the pc->flags
encoded the node or section ID with the base array to compare the pc
pointer to. Now that the per-memory cgroup LRU lists link page descriptors
directly, there is no longer any code that knows the struct page_cgroup
of a PFN but not the struct page.
[hughd@google.com: remove unused node/section info from pc->flags fix]
Signed-off-by: Johannes Weiner
Reviewed-by: KAMEZAWA Hiroyuki
Reviewed-by: Michal Hocko
Reviewed-by: Kirill A. Shutemov
Cc: KAMEZAWA Hiroyuki
Cc: Michal Hocko
Cc: "Kirill A. Shutemov"
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Ying Han
Cc: Greg Thelen
Cc: Michel Lespinasse
Cc: Rik van Riel
Cc: Minchan Kim
Cc: Christoph Hellwig
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
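For illustration, a sketch of the kind of encoding being removed: an ID
packed into the upper bits of a flags word (bit positions below are
illustrative, not the kernel's; assumes a 64-bit unsigned long):

    #include <assert.h>

    #define NODE_SHIFT 48UL   /* assumed position of the encoded ID */

    static unsigned long set_node(unsigned long flags, unsigned long nid)
    {
            return (flags & ((1UL << NODE_SHIFT) - 1)) | (nid << NODE_SHIFT);
    }

    static unsigned long get_node(unsigned long flags)
    {
            return flags >> NODE_SHIFT;
    }

    int main(void)
    {
            unsigned long flags = set_node(0x3UL /* low flag bits */, 5);

            assert(get_node(flags) == 5);
            assert((flags & 0xffUL) == 0x3UL);   /* flag bits preserved */
            return 0;
    }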
-
Now that all code that operated on global per-zone LRU lists is
converted to operate on per-memory cgroup LRU lists instead, there is no
reason to keep the double-LRU scheme around any longer. The pc->lru
member is removed and page->lru is linked directly to the per-memory
cgroup LRU lists, which removes two pointers from a descriptor that
exists for every page frame in the system.
Signed-off-by: Johannes Weiner
Signed-off-by: Hugh Dickins
Signed-off-by: Ying Han
Reviewed-by: KAMEZAWA Hiroyuki
Reviewed-by: Michal Hocko
Reviewed-by: Kirill A. Shutemov
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Greg Thelen
Cc: Michel Lespinasse
Cc: Rik van Riel
Cc: Minchan Kim
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Nov, 2011
2 commits
-
warning: symbol 'swap_cgroup_ctrl' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten
Cc: Paul Menage
Cc: Li Zefan
Acked-by: Balbir Singh
Cc: Daisuke Nishimura
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
When the cgroup base was allocated with kmalloc, it was necessary to
annotate the variable with kmemleak_not_leak(). But it has recently been
changed to be allocated with alloc_page() (which skips kmemleak checks),
which causes a warning on boot up.
I was triggering this output:
allocated 8388608 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
kmemleak: Trying to color unknown object at 0xf5840000 as Grey
Pid: 0, comm: swapper Not tainted 3.0.0-test #12
Call Trace:
[] ? printk+0x1d/0x1f
[] paint_ptr+0x4f/0x78
[] kmemleak_not_leak+0x58/0x7d
[] ? __rcu_read_unlock+0x9/0x7d
[] kmemleak_init+0x19d/0x1e9
[] start_kernel+0x346/0x3ec
[] ? loglevel+0x18/0x18
[] i386_start_kernel+0xaa/0xb0
After a bit of debugging I tracked the object 0xf5840000 (and others) down
to the cgroup code. The change from allocating base with kmalloc to
alloc_page() has the base not calling kmemleak_alloc(), which adds the
pointer to the object_tree_root, while kmemleak_not_leak() adds it to the
crt_early_log[] table. On kmemleak_init(), the entry is found in the
early_log[] but not the object_tree_root, and this error message is
displayed.
If alloc_page() fails then it defaults back to vmalloc(), which still uses
kmemleak_alloc(), which makes us still need the kmemleak_not_leak()
call. The solution is to call kmemleak_alloc() directly if the
alloc_page() succeeds.
Reviewed-by: Michal Hocko
Signed-off-by: Steven Rostedt
Acked-by: Catalin Marinas
Signed-off-by: Jonathan Nieder
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Sep, 2011
1 commit
-
Signed-off-by: Joe Perches
Acked-by: Paul Menage
Signed-off-by: Jiri Kosina
26 Jul, 2011
2 commits
-
Commit a539f3533b78e3 ("mm: add SECTION_ALIGN_UP() and
SECTION_ALIGN_DOWN() macro") introduced the SECTION_ALIGN_UP() and
SECTION_ALIGN_DOWN() macros. Use those macros to increase code
readability.
Signed-off-by: Daniel Kiper
Acked-by: David Rientjes
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
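For illustration, a runnable demo of those macros as introduced by
a539f3533b78e3 (the PFN_SECTION_SHIFT value is arch-dependent in the
kernel and assumed here):

    #include <assert.h>

    #define PFN_SECTION_SHIFT 15UL    /* assumed; arch-dependent in the kernel */
    #define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
    #define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))

    #define SECTION_ALIGN_UP(pfn)   (((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
    #define SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SECTION_MASK)

    int main(void)
    {
            assert(SECTION_ALIGN_DOWN(40000UL) == 32768UL);
            assert(SECTION_ALIGN_UP(40000UL)   == 65536UL);
            assert(SECTION_ALIGN_UP(32768UL)   == 32768UL);  /* already aligned */
            return 0;
    }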
-
In commit a2c8990aed5ab ("memsw: remove noswapaccount kernel parameter"),
Michal forgot to remove some leftover pieces of noswapaccount in the tree;
this patch removes them all.
Signed-off-by: WANG Cong
Acked-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Jun, 2011
1 commit
-
Commit 21a3c9646873 ("memcg: allocate memory cgroup structures in local
nodes") made page_cgroup allocation NUMA-aware. But that caused the
problem https://bugzilla.kernel.org/show_bug.cgi?id=36192: getting a NID
from invalid struct pages, which were not initialized because they were
out-of-node, outside [node_start_pfn, node_end_pfn). With sparsemem,
page_cgroup_init scans pfns from 0 to max_pfn, and so may scan a pfn
which is not on any node and access a memmap which is not initialized.
This makes page_cgroup_init() for SPARSEMEM node-aware and removes the
code that got the nid from page->flags. (Then, we'll always use a valid
NID.)
[akpm@linux-foundation.org: try to fix up comments]
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 May, 2011
3 commits
-
Move page-freeing code out of swap_cgroup_mutex in the hope that it could
reduce a few theoretical contentions between swapons and/or swapoffs.
This is just a cleanup, no functional changes.
Signed-off-by: Namhyung Kim
Acked-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Daisuke Nishimura
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
It allocated one more page than necessary if @max_pages was a multiple of
SC_PER_PAGE.
Signed-off-by: Namhyung Kim
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Cc: Daisuke Nishimura
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
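For illustration, the arithmetic behind that off-by-one as a runnable
sketch (the SC_PER_PAGE value below is assumed for the demo):

    #include <assert.h>

    #define SC_PER_PAGE 2048UL   /* records per page; demo value */

    static unsigned long pages_buggy(unsigned long max_pages)
    {
            return max_pages / SC_PER_PAGE + 1;   /* extra page on exact multiples */
    }

    static unsigned long pages_fixed(unsigned long max_pages)
    {
            return (max_pages + SC_PER_PAGE - 1) / SC_PER_PAGE;   /* DIV_ROUND_UP */
    }

    int main(void)
    {
            assert(pages_buggy(4096) == 3);   /* one page too many */
            assert(pages_fixed(4096) == 2);   /* exactly enough */
            assert(pages_buggy(4097) == pages_fixed(4097));
            return 0;
    }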
-
Commit ca371c0d7e23 ("memcg: fix page_cgroup fatal error in FLATMEM")
removed the call to alloc_bootmem() in the function so that it can be
marked as __meminit to reduce memory usage when MEMORY_HOTPLUG=n. Also,
as the new helper function alloc_page_cgroup() is called only in the
function, it should be marked too.
Signed-off-by: Namhyung Kim
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Balbir Singh
Cc: Michal Hocko
Cc: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 May, 2011
1 commit
-
Commit dde79e005a769 ("page_cgroup: reduce allocation overhead for
page_cgroup array for CONFIG_SPARSEMEM") added a regression: the
memory cgroup data structures all end up on node 0 because the first
attempt at allocating them would not pass in a node hint. Since the
initialization runs on CPU #0, it would all end up on node 0. This is a
problem on large memory systems, where node 0 would lose a lot of memory.
Change the alloc_pages_exact() to alloc_pages_exact_nid(). This will
still fall back to other nodes if not enough memory is available.
[ RED-PEN: right now it would fall back first before trying
vmalloc_node. Probably not the best strategy ... But I left it like
that for now. ]
Signed-off-by: Andi Kleen
Reported-by: Doug Nelson
Cc: David Rientjes
Reviewed-by: Michal Hocko
Cc: Dave Hansen
Acked-by: Balbir Singh
Acked-by: Johannes Weiner
Reviewed-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Mar, 2011
1 commit
-
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi
24 Mar, 2011
3 commits
-
KAMEZAWA Hiroyuki noted that free_pages_cgroup doesn't have to check for
PageReserved because we never store the array on reserved pages (neither
alloc_pages_exact nor vmalloc use those pages).
So we can replace the check by a BUG_ON.
Signed-off-by: Michal Hocko
Acked-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Currently we are allocating a single page_cgroup array per memory section
(stored in mem_section->base) when CONFIG_SPARSEMEM is selected. This is
correct but a memory-inefficient solution, because the allocated memory
(unless we fall back to vmalloc) is not kmalloc friendly:
- 32b - 16384 entries (20B per entry) fit into 327680B so the
  524288B slab cache is used
- 32b with PAE - 131072 entries with 2621440B fit into 4194304B
- 64b - 32768 entries (40B per entry) fit into 2097152B cache
This is ~37% wasted space per memory section, and it sums up for the
whole memory. On an x86_64 machine it is something like 6MB per 1GB of
RAM. We can reduce the internal fragmentation by using alloc_pages_exact,
which allocates PAGE_SIZE-aligned blocks, so we get down to the size that
is actually needed (rounded up to PAGE_SIZE).
Cc: Dave Hansen
Acked-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
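As a quick check of the numbers quoted above (slab caches round
allocations up to the next power of two, hence the waste):

    #include <assert.h>

    int main(void)
    {
            assert(16384UL  * 20 == 327680UL);    /* 32b: next slab is 524288B */
            assert(131072UL * 20 == 2621440UL);   /* 32b PAE: next slab 4194304B */
            assert(32768UL  * 40 == 1310720UL);   /* 64b: fits the 2097152B cache */
            /* ~37% waste in the 32b case: 1 - 327680/524288 = 0.375 */
            assert(524288UL - 327680UL == 196608UL);
            return 0;
    }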
-
In struct page_cgroup, we have a full word for flags but only a few are
reserved. Use the remaining upper bits to encode, depending on
configuration, the node or the section, to enable page_cgroup-to-page
lookups without a direct pointer. This saves a full word for every page
in a system with memory cgroups enabled.
Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Mar, 2011
1 commit
-
While looking at some other notifier callbacks I noticed this code could
use a simple cleanup. The call site no longer needs the if (ret)/else
conditional; that same conditional is now done inside notifier_from_errno().
Signed-off-by: Prarit Bhargava
Cc: Paul Menage
Cc: Li Zefan
Acked-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
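For illustration, a runnable sketch of why the wrapper became redundant;
the constants and helper mirror include/linux/notifier.h:

    #include <assert.h>

    #define NOTIFY_OK        0x0001
    #define NOTIFY_STOP_MASK 0x8000
    #define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

    static int notifier_from_errno(int err)
    {
            if (err)
                    return NOTIFY_STOP_MASK | (NOTIFY_BAD - err);
            return NOTIFY_OK;
    }

    int main(void)
    {
            /* Old: if (err) ret = notifier_from_errno(err); else ret = NOTIFY_OK;
             * New: the helper already maps err == 0 to NOTIFY_OK. */
            assert(notifier_from_errno(0) == NOTIFY_OK);
            assert(notifier_from_errno(-12 /* -ENOMEM */) & NOTIFY_STOP_MASK);
            return 0;
    }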
19 Jul, 2010
1 commit
-
The pointer to the page_cgroup table allocated in
init_section_page_cgroup() is stored in section->page_cgroup as (base -
pfn). Since this value does not point to the beginning or inside the
allocated memory block, kmemleak reports a false positive.
This was reported in bugzilla.kernel.org as #16297.
Signed-off-by: Catalin Marinas
Reported-by: Adrien Dessemond
Reviewed-by: KAMEZAWA Hiroyuki
Cc: Pekka Enberg
Cc: Andrew Morton
18 Mar, 2010
1 commit
-
swap_cgroup uses 2 bytes of data and uses cmpxchg in a new operation.
2-byte cmpxchg/xchg is not available on some archs. This patch replaces
cmpxchg/xchg with operations under lock.
Signed-off-by: KAMEZAWA Hiroyuki
Reported-by: Sachin Sant
Acked-by: Balbir Singh
Acked-by: Daisuke Nishimura
Cc: Li Zefan
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
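For illustration, a user-space sketch of the replacement pattern: a 16-bit
compare-and-exchange done under a lock instead of a native 2-byte cmpxchg
(names and the pthread mutex are stand-ins for the kernel's spinlock;
compile with -pthread):

    #include <assert.h>
    #include <pthread.h>
    #include <stdint.h>

    static pthread_mutex_t sc_lock = PTHREAD_MUTEX_INITIALIZER;
    static uint16_t record;

    /* Atomically: if record == old, set it to new; return the value seen. */
    static uint16_t record_cmpxchg(uint16_t old, uint16_t new)
    {
            uint16_t seen;

            pthread_mutex_lock(&sc_lock);
            seen = record;
            if (seen == old)
                    record = new;
            pthread_mutex_unlock(&sc_lock);
            return seen;
    }

    int main(void)
    {
            assert(record_cmpxchg(0, 7) == 0 && record == 7);
            assert(record_cmpxchg(0, 9) == 7 && record == 7);  /* mismatch: no store */
            return 0;
    }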
13 Mar, 2010
1 commit
-
This patch is another core part of the move-charge-at-task-migration
feature. It enables moving charges of anonymous swaps.
To move the charge of swap, we need to exchange swap_cgroup's record.
In the current implementation, swap_cgroup's record is protected by:
- page lock: if the entry is on swap cache.
- swap_lock: if the entry is not on swap cache.
This works well in usual swap-in/out activity, but this behavior makes
the moving-swap-charge feature check many conditions to exchange
swap_cgroup's record safely. So I changed the modification of
swap_cgroup's record (swap_cgroup_record()) to use xchg, and defined a
new function to cmpxchg swap_cgroup's record. This patch also enables
moving the charge of non-pte_present but not uncharged swap caches, which
can exist on the swap-out path, by getting the target pages via
find_get_page() as do_mincore() does.
[kosaki.motohiro@jp.fujitsu.com: fix ia64 build]
[akpm@linux-foundation.org: fix typos]
Signed-off-by: Daisuke Nishimura
Cc: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Cc: Li Zefan
Cc: Paul Menage
Cc: Daisuke Nishimura
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
22 Sep, 2009
1 commit
-
To initialize a hot-added node, some pages are allocated. At that time, the
node has no memory, so the allocation always fails. In such a case,
let's allocate pages from other nodes.
Signed-off-by: Shaohua Li
Signed-off-by: Yakui Zhao
Cc: Mel Gorman
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Jun, 2009
3 commits
-
We don't need to check do_swap_account in functions which will never get
called if do_swap_account == 0.
Signed-off-by: Li Zefan
Cc: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add file RSS tracking per memory cgroup.
We currently don't track file RSS; the RSS we report is actually anon RSS.
All the file mapped pages come in through the page cache and get
accounted there. This patch adds support for accounting file RSS pages.
It should:
1. Help improve the metrics reported by the memory resource controller
2. Form the basis for a future shared memory accounting heuristic
   that has been proposed by Kamezawa.
Unfortunately, we cannot rename the existing "rss" keyword used in
memory.stat to "anon_rss". We, however, add "mapped_file" data and hope
to educate the end user through documentation.
[hugh.dickins@tiscali.co.uk: fix mem_cgroup_update_mapped_file_stat oops]
Signed-off-by: Balbir Singh
Acked-by: KAMEZAWA Hiroyuki
Cc: Li Zefan
Cc: Paul Menage
Cc: Dhaval Giani
Cc: Daisuke Nishimura
Cc: YAMAMOTO Takashi
Cc: KOSAKI Motohiro
Cc: David Rientjes
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Fix some cgroup messages to read better.
Update MAINTAINERS to include mm/*cgroup* files.
Signed-off-by: Randy Dunlap
Cc: Paul Menage
Cc: Li Zefan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Jun, 2009
2 commits
-
Now, SLAB is configured at a very early stage and it can be used in
init routines. But replacing alloc_bootmem() in FLAT/DISCONTIGMEM's
page_cgroup() initialization breaks the allocation.
(It works well in the SPARSEMEM case... that supports MEMORY_HOTPLUG and
the size of page_cgroup is of reasonable size (< 1 << MAX_ORDER).)
This patch revives FLATMEM+memory cgroup by using alloc_bootmem.
In the future, we will stop supporting FLATMEM (if there are no users) or
rewrite the code for flatmem completely, but that would add more messy
code and overhead.
Reported-by: Li Zefan
Tested-by: Li Zefan
Tested-by: Ingo Molnar
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Pekka Enberg
-
The bootmem allocator is no longer available for page_cgroup_init() because we
set up the kernel slab allocator much earlier now.
Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Linus Torvalds
Signed-off-by: Yinghai Lu
Signed-off-by: Pekka Enberg
03 Apr, 2009
2 commits
-
It's pointed out that swap_cgroup's message at swapon() is nonsense, because:
* It can be calculated very easily if all necessary information is
  written in Kconfig.
* It's not necessary to annoy people at every swapon().
In another view, memory usage per swp_entry is now reduced to 2 bytes from
8 bytes (64-bit), and I think it's reasonably small.
Reported-by: Hugh Dickins
Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Try to use CSS ID for records in swap_cgroup. By this, on a 64-bit
machine, the size of swap_cgroup goes down to 2 bytes from 8 bytes.
This means, when 2GB of swap is equipped (assume the page size is 4096
bytes):
From: size of swap_cgroup = 2G/4k * 8 = 4Mbytes.
To: size of swap_cgroup = 2G/4k * 2 = 1Mbytes.
The reduction is large. Of course, there are trade-offs. This CSS ID will
add overhead to swap-in/swap-out/swap-free. But in general,
- swap is a resource which the user tends to avoid using.
- If swap is never used, the swap_cgroup area is not used.
- Reading traditional manuals, the size of swap should be proportional to
  the size of memory. Memory size of machines is increasing now.
I think reducing the size of swap_cgroup makes sense.
Note:
- The ID->CSS lookup routine has no locks; it's under RCU-Read-Side.
- memcg can be obsolete at rmdir() but is not freed while a refcnt from
  swap_cgroup is available.
Changelog v4->v5:
- reworked on to memcg-charge-swapcache-to-proper-memcg.patch
Changelog ->v4:
- fixed not configured case.
- deleted unnecessary comments.
- fixed NULL pointer bug.
- fixed message in dmesg.
[nishimura@mxp.nes.nec.co.jp: css_tryget can be called twice in !PageCgroupUsed case]
Signed-off-by: KAMEZAWA Hiroyuki
Cc: Li Zefan
Cc: Balbir Singh
Cc: Paul Menage
Cc: Hugh Dickins
Signed-off-by: Daisuke Nishimura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
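As a quick check of the arithmetic above (one record per 4KB swap page):

    #include <assert.h>

    int main(void)
    {
            unsigned long long swap  = 2ULL << 30;     /* 2GB of swap */
            unsigned long long slots = swap / 4096;    /* 524288 swp_entries */

            assert(slots * 8 == 4ULL << 20);   /* 8 bytes/record -> 4Mbytes */
            assert(slots * 2 == 1ULL << 20);   /* 2 bytes/record -> 1Mbytes */
            return 0;
    }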