27 Oct, 2010

1 commit

  • page_order() is called by memory hotplug's user interface,
    is_mem_section_removable(), to check whether a section is removable.

    is_mem_section_removable() calls page_order() without holding zone->lock.
    So even if the caller does

    if (PageBuddy(page))
    ret = page_order(page) ...

    the page may stop being a buddy page between the two calls, and the
    BUG_ON() inside page_order() can fire (see the sketch after this entry).

    There are two ways to fix this:
    1. take zone->lock around the check.
    2. remove the BUG_ON().

    is_mem_section_removable() only provides "advice" and doesn't need to be
    100% accurate, but it can be called from a user program, and we don't
    want a user request to hold this important lock for long. So this patch
    removes the BUG_ON().

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: Wu Fengguang
    Acked-by: Michal Hocko
    Acked-by: Mel Gorman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
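    A minimal sketch of the unlocked pattern this change makes safe, assuming
    the caller keeps the "check PageBuddy() first" convention once the
    BUG_ON() is removed from page_order() (simplified, not the actual
    hotplug code):

    /*
     * zone->lock is not held, so the page can stop being a buddy page at
     * any time.  With the BUG_ON() gone from page_order(), the worst case
     * is a stale order value -- acceptable for an advisory answer.
     */
    if (PageBuddy(page)) {
            int order = page_order(page);   /* may already be stale */

            if (order >= pageblock_order)
                    return 1;               /* whole block looks removable */
    }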
     

16 Dec, 2009

9 commits

  • In some use cases the user doesn't need extra filtering. E.g. a user
    program can inject errors through the madvise syscall into its own pages,
    but it might not know exactly what state a page is in or which inode it
    belongs to.

    So introduce a one-off interface, "corrupt-filter-enable".

    Echo 0 to switch off the page filters, and echo 1 to switch them on.
    [AK: changed default to 0]

    Signed-off-by: Haicheng Li
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Haicheng Li
     
  • The hwpoison test suite needs to inject hwpoison into a collection of
    selected task pages, and must not touch pages it does not own and
    thus kill important system processes such as init. (But it's OK to
    mis-hwpoison free/unowned pages as well as shared clean pages.
    Mis-hwpoison of shared dirty pages will kill all the tasks involved, so
    the test suite will target all or none of such tasks in the first place.)

    The memory cgroup serves this purpose well. We can put the target
    processes under the control of a memory cgroup, and tell the hwpoison
    injection code to only kill pages associated with some active memory
    cgroup.

    The prerequisite for doing hwpoison stress tests with mem_cgroup is that
    the mem_cgroup code tracks task pages _accurately_ (unless the page is
    locked), which we believe is, and should remain, true.

    The benefits are simplification of hwpoison injector code. Also the
    mem_cgroup code will automatically be tested by hwpoison test cases.

    The alternative interfaces pin-pfn/unpin-pfn can also delegate the
    (process and page flags) filtering functions reliably to user space.
    However prototype implementation shows that this scheme adds more
    complexity than we wanted.

    Example test case:

    mkdir /cgroup/hwpoison

    usemem -m 100 -s 1000 &
    echo `jobs -p` > /cgroup/hwpoison/tasks

    memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
    echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg

    page-types -p `pidof init` --hwpoison # shall do nothing
    page-types -p `pidof usemem` --hwpoison # poison its pages

    [AK: Fix documentation]
    [Add fix for problem noticed by Li Zefan ;
    dentry in the css could be NULL]

    CC: KOSAKI Motohiro
    CC: Hugh Dickins
    CC: Daisuke Nishimura
    CC: Balbir Singh
    CC: KAMEZAWA Hiroyuki
    CC: Li Zefan
    CC: Paul Menage
    CC: Nick Piggin
    CC: Andi Kleen
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • When specified, only poison pages if ((page_flags & mask) == value).

    - corrupt-filter-flags-mask
    - corrupt-filter-flags-value

    This allows stress testing of many kinds of pages (a sketch of the check
    follows this entry).

    Strictly speaking, poisoning buddy pages requires taking the zone lock,
    to avoid setting PG_hwpoison on a "was buddy but now allocated to
    someone" page. However, we can get away with doing nothing, because we
    set PG_locked at the beginning, and that prevents the page allocator from
    handing the page out to someone. (It will BUG() on the unexpected
    PG_locked, which is fine for hwpoison testing.)

    [AK: Add select PROC_PAGE_MONITOR to satisfy dependency]

    CC: Nick Piggin
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Wu Fengguang
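    A minimal sketch of the mask/value test described above, with the debugfs
    knobs shown as plain variables (simplified; the in-tree helper may differ
    in detail):

    static u64 hwpoison_filter_flags_mask;
    static u64 hwpoison_filter_flags_value;

    /* return 0 to poison the page, -EINVAL to skip it */
    static int hwpoison_filter_flags(struct page *p)
    {
            if (!hwpoison_filter_flags_mask)
                    return 0;               /* mask unset: filter disabled */

            /* the real code may use a synthesized, stable view of the page
             * flags rather than the raw p->flags word */
            if ((p->flags & hwpoison_filter_flags_mask) ==
                            hwpoison_filter_flags_value)
                    return 0;
            return -EINVAL;
    }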
     
  • __memory_failure()'s workflow is

    set PG_hwpoison
    //...
    unset PG_hwpoison if didn't pass hwpoison filter

    That could kill an unrelated process if it happens to page-fault on the
    page while the temporary PG_hwpoison is set. The race window should be
    big enough to show up in stress tests.

    Fix it by grabbing the page and checking the filters at inject time (see
    the sketch after this entry). This also avoids the very noisy "Injecting
    memory failure..." messages.

    - we don't touch madvise() based injection, because the filters are
    generally not necessary for it.
    - if we want to apply the filters to h/w aided injection, we had better
    rearrange the logic in __memory_failure() instead of this patch.

    AK: fix documentation, use drain all, cleanups

    CC: Haicheng Li
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Wu Fengguang
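    A minimal sketch of the inject-time check described above, assuming a
    hwpoison_filter() helper that aggregates the configured filters
    (simplified; the real injector handles more page states):

    /* filtered corrupt-pfn injection: check before PG_hwpoison is ever set */
    if (!get_page_unless_zero(p))
            return 0;                 /* free page: nothing to filter */

    if (hwpoison_filter(p)) {         /* not a targeted page: bail out quietly */
            put_page(p);
            return 0;
    }

    /* ... otherwise proceed into the memory-failure path, passing along the
     * reference taken above, so PG_hwpoison is only set on wanted pages */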
     
  • Filesystem data/metadata present the most tricky-to-isolate pages.
    Getting them right requires careful code review and stress testing.

    The fs/device filter helps to target the stress tests to some specific
    filesystem pages. The filter condition is block device's major/minor
    numbers:
    - corrupt-filter-dev-major
    - corrupt-filter-dev-minor
    When specified (non -1), only page cache pages that belong to that
    device will be poisoned (a sketch of the check follows this entry).

    The filters are checked reliably on the locked and refcounted page.

    Haicheng: clear PG_hwpoison and drop bad page count if filter not OK
    AK: Add documentation

    CC: Haicheng Li
    CC: Nick Piggin
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Wu Fengguang
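    A minimal sketch of the device filter described above, with the debugfs
    knobs shown as plain variables defaulting to -1/"unset" (simplified; not
    the literal in-tree helper):

    static u32 hwpoison_filter_dev_major = ~0U;
    static u32 hwpoison_filter_dev_minor = ~0U;

    /* return 0 to poison the page, -EINVAL to skip it */
    static int hwpoison_filter_dev(struct page *p)
    {
            struct address_space *mapping;
            dev_t dev;

            if (hwpoison_filter_dev_major == ~0U &&
                hwpoison_filter_dev_minor == ~0U)
                    return 0;                       /* filter disabled */

            mapping = page_mapping(p);              /* page is locked and pinned */
            if (!mapping || !mapping->host)
                    return -EINVAL;                 /* not a page cache page */

            dev = mapping->host->i_sb->s_dev;       /* backing block device */
            if (hwpoison_filter_dev_major != ~0U &&
                hwpoison_filter_dev_major != MAJOR(dev))
                    return -EINVAL;
            if (hwpoison_filter_dev_minor != ~0U &&
                hwpoison_filter_dev_minor != MINOR(dev))
                    return -EINVAL;

            return 0;
    }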
     
  • Most free pages in the buddy system have no PG_buddy set: only the first
    page of each free block carries the flag. Introduce is_free_buddy_page()
    to detect such pages reliably (see the sketch after this entry).

    CC: Nick Piggin
    CC: Mel Gorman
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andi Kleen

    Wu Fengguang
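    A minimal sketch of the detection, assuming the usual buddy layout where
    only the head page of a free block has PG_buddy and page_order() records
    the block's order (simplified from the in-tree helper):

    int is_free_buddy_page(struct page *page)
    {
            struct zone *zone = page_zone(page);
            unsigned long pfn = page_to_pfn(page);
            unsigned long flags;
            int order;

            spin_lock_irqsave(&zone->lock, flags);
            for (order = 0; order < MAX_ORDER; order++) {
                    /* head of the order-sized block this page would sit in */
                    struct page *head = page - (pfn & ((1 << order) - 1));

                    if (PageBuddy(head) && page_order(head) >= order)
                            break;          /* page lies inside a free block */
            }
            spin_unlock_irqrestore(&zone->lock, flags);

            return order < MAX_ORDER;
    }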
     
  • unevictable_migrate_page() in mm/internal.h is a relic of the since-removed
    UNEVICTABLE_LRU Kconfig option. This patch removes the function and open
    codes the test in migrate_page_copy() (see the sketch after this entry).

    Signed-off-by: Lee Schermerhorn
    Reviewed-by: Christoph Lameter
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
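    A minimal sketch of what the open-coded test in migrate_page_copy()
    amounts to (assumed shape, not the literal diff): carry the unevictable
    bit over from the old page to the new one.

    /* in migrate_page_copy(): propagate LRU state to the new page */
    if (TestClearPageActive(page))
            SetPageActive(newpage);
    else if (TestClearPageUnevictable(page))
            SetPageUnevictable(newpage);    /* was unevictable_migrate_page() */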
     
  • When KSM merges an mlocked page, it has been forgetting to munlock it:
    that's been left to free_page_mlock(), which reports it in /proc/vmstat as
    unevictable_pgs_mlockfreed instead of unevictable_pgs_munlocked (and
    whinges "Page flag mlocked set for process" in mmotm, whereas mainline is
    silently forgiving). Call munlock_vma_page() to fix that.

    Signed-off-by: Hugh Dickins
    Cc: Izik Eidus
    Cc: Andrea Arcangeli
    Cc: Chris Wright
    Acked-by: Rik van Riel
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Remove three degrees of obfuscation, left over from when we had
    CONFIG_UNEVICTABLE_LRU. MLOCK_PAGES is CONFIG_HAVE_MLOCKED_PAGE_BIT is
    CONFIG_HAVE_MLOCK is CONFIG_MMU. rmap.o (and memory-failure.o) are only
    built when CONFIG_MMU, so don't need such conditions at all.

    Somehow, I feel no compulsion to remove the CONFIG_HAVE_MLOCK* lines from
    169 defconfigs: leave those to evolve in due course.

    Signed-off-by: Hugh Dickins
    Cc: Izik Eidus
    Cc: Andrea Arcangeli
    Cc: Nick Piggin
    Reviewed-by: KOSAKI Motohiro
    Cc: Rik van Riel
    Cc: Lee Schermerhorn
    Cc: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Wu Fengguang
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

22 Sep, 2009

4 commits

  • Move highest_memmap_pfn __read_mostly from page_alloc.c next to zero_pfn
    __read_mostly in memory.c: to help them share a cacheline, since they're
    very often tested together in vm_normal_page().

    Signed-off-by: Hugh Dickins
    Cc: Rik van Riel
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • __get_user_pages() has been taking its own GUP flags, then processing
    them into FOLL flags for follow_page(). Though oddly named, the FOLL
    flags are more widely used, so pass them to __get_user_pages() now.
    Sorry, VM flags, VM_FAULT flags and FAULT_FLAGs are still distinct.

    (The patch to __get_user_pages() looks peculiar, with both gup_flags
    and foll_flags: the gup_flags remain constant; but as before there's
    an exceptional case, out of scope of the patch, in which foll_flags
    per page have FOLL_WRITE masked off.)

    Signed-off-by: Hugh Dickins
    Cc: Rik van Riel
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The "FOLL_ANON optimization" and its use_zero_page() test have caused
    confusion and bugs: why does it test VM_SHARED? for the very good but
    unsatisfying reason that VMware crashed without it. As we look to maybe
    reinstating anonymous use of the ZERO_PAGE, we need to sort this out.

    Easily done: it's silly for __get_user_pages() and follow_page() to
    be guessing whether it's safe to assume that they're being used for
    a coredump (which can take a shortcut snapshot where other uses must
    handle a fault) - just tell them with GUP_FLAGS_DUMP and FOLL_DUMP.

    get_dump_page() doesn't even want a ZERO_PAGE: an error suits fine.

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Acked-by: Mel Gorman
    Reviewed-by: Minchan Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • GUP_FLAGS_IGNORE_VMA_PERMISSIONS and GUP_FLAGS_IGNORE_SIGKILL were
    flags added solely to prevent __get_user_pages() from doing some of
    what it usually does, in the munlock case: we can now remove them.

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

17 Jun, 2009

5 commits

  • On NUMA machines, the administrator can configure zone_reclaim_mode, which
    is a more targeted form of direct reclaim. On machines with large NUMA
    distances, for example, zone_reclaim_mode defaults to 1, meaning that
    clean unmapped pages will be reclaimed if the zone watermarks are not
    being met. The problem is that zone_reclaim() failing at all means the
    zone gets marked full.

    This can cause situations where a zone is usable, but is being skipped
    because it has been considered full. Take a situation where a large tmpfs
    mount is occupying a large percentage of memory overall. Those pages do
    not get cleaned or reclaimed by zone_reclaim(), but the zone gets marked
    full and the zonelist cache considers it not worth trying in the future.

    This patch makes zone_reclaim() return more fine-grained information about
    what occurred when it failed (a sketch of the return values follows this
    entry). The zone only gets marked full if it really is unreclaimable. If
    the scan did not occur, or not enough pages were reclaimed with the
    limited reclaim_mode, the zone is simply skipped.

    There is a side-effect to this patch. Currently, if zone_reclaim()
    successfully reclaimed SWAP_CLUSTER_MAX, an allocation attempt would go
    ahead. With this patch applied, zone watermarks are rechecked after
    zone_reclaim() does some work.

    This bug was introduced by commit 9276b1bc96a132f4068fdee00983c532f43d3a26
    ("memory page_alloc zonelist caching speedup") way back in 2.6.19 when the
    zonelist_cache was introduced. It was not intended that zone_reclaim()
    aggressively consider the zone to be full when it failed as full direct
    reclaim can still be an option. Due to the age of the bug, it should be
    considered a -stable candidate.

    Signed-off-by: Mel Gorman
    Reviewed-by: Wu Fengguang
    Reviewed-by: Rik van Riel
    Reviewed-by: KOSAKI Motohiro
    Cc: Christoph Lameter
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
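    A minimal sketch of the fine-grained outcome handling described above,
    assuming the result names used by the mainline fix (the caller-side logic
    in get_page_from_freelist() is simplified here):

    #define ZONE_RECLAIM_NOSCAN   -2      /* scan did not occur */
    #define ZONE_RECLAIM_FULL     -1      /* scanned, but zone is unreclaimable */
    #define ZONE_RECLAIM_SOME      0      /* some pages freed, not enough */
    #define ZONE_RECLAIM_SUCCESS   1      /* enough pages were freed */

    ret = zone_reclaim(zone, gfp_mask, order);
    switch (ret) {
    case ZONE_RECLAIM_NOSCAN:
            goto try_next_zone;           /* skip the zone, do not mark it full */
    case ZONE_RECLAIM_FULL:
            goto this_zone_full;          /* only a real failure marks it full */
    default:
            /* some progress was made: recheck the watermark before allocating */
            if (!zone_watermark_ok(zone, order, mark, classzone_idx, alloc_flags))
                    goto this_zone_full;
    }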
     
  • Currently, nobody wants to turn UNEVICTABLE_LRU off. Thus this
    configurability is unnecessary.

    Signed-off-by: KOSAKI Motohiro
    Cc: Johannes Weiner
    Cc: Andi Kleen
    Acked-by: Minchan Kim
    Cc: David Woodhouse
    Cc: Matt Mackall
    Cc: Rik van Riel
    Cc: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • A series of patches to enhance the /proc/pagemap interface and to add a
    userspace executable which can be used to present the pagemap data.

    Export 10 more flags to end users (and more for kernel developers):

    11. KPF_MMAP (pseudo flag) memory mapped page
    12. KPF_ANON (pseudo flag) memory mapped page (anonymous)
    13. KPF_SWAPCACHE page is in swap cache
    14. KPF_SWAPBACKED page is swap/RAM backed
    15. KPF_COMPOUND_HEAD (*)
    16. KPF_COMPOUND_TAIL (*)
    17. KPF_HUGE hugeTLB pages
    18. KPF_UNEVICTABLE page is in the unevictable LRU list
    19. KPF_HWPOISON hardware detected corruption
    20. KPF_NOPAGE (pseudo flag) no page frame at the address

    (*) For compound pages, exporting _both_ head/tail info enables
    users to tell where a compound page starts/ends, and its order.

    A simple demo of the page-types tool:

    # ./page-types -h
    page-types [options]
    -r|--raw Raw mode, for kernel developers
    -a|--addr addr-spec Walk a range of pages
    -b|--bits bits-spec Walk pages with specified bits
    -l|--list Show page details in ranges
    -L|--list-each Show page details one by one
    -N|--no-summary Don't show summary info
    -h|--help Show this usage message
    addr-spec:
    N one page at offset N (unit: pages)
    N+M pages range from N to N+M-1
    N,M pages range from N to M-1
    N, pages range from N to end
    ,M pages range from 0 to M
    bits-spec:
    bit1,bit2 (flags & (bit1|bit2)) != 0
    bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1
    bit1,~bit2 (flags & (bit1|bit2)) == bit1
    =bit1,bit2 flags == (bit1|bit2)
    bit-names:
    locked error referenced uptodate
    dirty lru active slab
    writeback reclaim buddy mmap
    anonymous swapcache swapbacked compound_head
    compound_tail huge unevictable hwpoison
    nopage reserved(r) mlocked(r) mappedtodisk(r)
    private(r) private_2(r) owner_private(r) arch(r)
    uncached(r) readahead(o) slob_free(o) slub_frozen(o)
    slub_debug(o)
    (r) raw mode bits (o) overloaded bits

    # ./page-types
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 487369 1903 _________________________________
    0x0000000000000014 5 0 __R_D____________________________ referenced,dirty
    0x0000000000000020 1 0 _____l___________________________ lru
    0x0000000000000024 34 0 __R__l___________________________ referenced,lru
    0x0000000000000028 3838 14 ___U_l___________________________ uptodate,lru
    0x0001000000000028 48 0 ___U_l_______________________I___ uptodate,lru,readahead
    0x000000000000002c 6478 25 __RU_l___________________________ referenced,uptodate,lru
    0x000100000000002c 47 0 __RU_l_______________________I___ referenced,uptodate,lru,readahead
    0x0000000000000040 8344 32 ______A__________________________ active
    0x0000000000000060 1 0 _____lA__________________________ lru,active
    0x0000000000000068 348 1 ___U_lA__________________________ uptodate,lru,active
    0x0001000000000068 12 0 ___U_lA______________________I___ uptodate,lru,active,readahead
    0x000000000000006c 988 3 __RU_lA__________________________ referenced,uptodate,lru,active
    0x000100000000006c 48 0 __RU_lA______________________I___ referenced,uptodate,lru,active,readahead
    0x0000000000004078 1 0 ___UDlA_______b__________________ uptodate,dirty,lru,active,swapbacked
    0x000000000000407c 34 0 __RUDlA_______b__________________ referenced,uptodate,dirty,lru,active,swapbacked
    0x0000000000000400 503 1 __________B______________________ buddy
    0x0000000000000804 1 0 __R________M_____________________ referenced,mmap
    0x0000000000000828 1029 4 ___U_l_____M_____________________ uptodate,lru,mmap
    0x0001000000000828 43 0 ___U_l_____M_________________I___ uptodate,lru,mmap,readahead
    0x000000000000082c 382 1 __RU_l_____M_____________________ referenced,uptodate,lru,mmap
    0x000100000000082c 12 0 __RU_l_____M_________________I___ referenced,uptodate,lru,mmap,readahead
    0x0000000000000868 192 0 ___U_lA____M_____________________ uptodate,lru,active,mmap
    0x0001000000000868 12 0 ___U_lA____M_________________I___ uptodate,lru,active,mmap,readahead
    0x000000000000086c 800 3 __RU_lA____M_____________________ referenced,uptodate,lru,active,mmap
    0x000100000000086c 31 0 __RU_lA____M_________________I___ referenced,uptodate,lru,active,mmap,readahead
    0x0000000000004878 2 0 ___UDlA____M__b__________________ uptodate,dirty,lru,active,mmap,swapbacked
    0x0000000000001000 492 1 ____________a____________________ anonymous
    0x0000000000005808 4 0 ___U_______Ma_b__________________ uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 2839 11 ___U_lA____Ma_b__________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 30 0 __RU_lA____Ma_b__________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 513968 2007

    # ./page-types -r
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 468002 1828 _________________________________
    0x0000000100000000 19102 74 _____________________r___________ reserved
    0x0000000000008000 41 0 _______________H_________________ compound_head
    0x0000000000010000 188 0 ________________T________________ compound_tail
    0x0000000000008014 1 0 __R_D__________H_________________ referenced,dirty,compound_head
    0x0000000000010014 4 0 __R_D___________T________________ referenced,dirty,compound_tail
    0x0000000000000020 1 0 _____l___________________________ lru
    0x0000000800000024 34 0 __R__l__________________P________ referenced,lru,private
    0x0000000000000028 3794 14 ___U_l___________________________ uptodate,lru
    0x0001000000000028 46 0 ___U_l_______________________I___ uptodate,lru,readahead
    0x0000000400000028 44 0 ___U_l_________________d_________ uptodate,lru,mappedtodisk
    0x0001000400000028 2 0 ___U_l_________________d_____I___ uptodate,lru,mappedtodisk,readahead
    0x000000000000002c 6434 25 __RU_l___________________________ referenced,uptodate,lru
    0x000100000000002c 47 0 __RU_l_______________________I___ referenced,uptodate,lru,readahead
    0x000000040000002c 14 0 __RU_l_________________d_________ referenced,uptodate,lru,mappedtodisk
    0x000000080000002c 30 0 __RU_l__________________P________ referenced,uptodate,lru,private
    0x0000000800000040 8124 31 ______A_________________P________ active,private
    0x0000000000000040 219 0 ______A__________________________ active
    0x0000000800000060 1 0 _____lA_________________P________ lru,active,private
    0x0000000000000068 322 1 ___U_lA__________________________ uptodate,lru,active
    0x0001000000000068 12 0 ___U_lA______________________I___ uptodate,lru,active,readahead
    0x0000000400000068 13 0 ___U_lA________________d_________ uptodate,lru,active,mappedtodisk
    0x0000000800000068 12 0 ___U_lA_________________P________ uptodate,lru,active,private
    0x000000000000006c 977 3 __RU_lA__________________________ referenced,uptodate,lru,active
    0x000100000000006c 48 0 __RU_lA______________________I___ referenced,uptodate,lru,active,readahead
    0x000000040000006c 5 0 __RU_lA________________d_________ referenced,uptodate,lru,active,mappedtodisk
    0x000000080000006c 3 0 __RU_lA_________________P________ referenced,uptodate,lru,active,private
    0x0000000c0000006c 3 0 __RU_lA________________dP________ referenced,uptodate,lru,active,mappedtodisk,private
    0x0000000c00000068 1 0 ___U_lA________________dP________ uptodate,lru,active,mappedtodisk,private
    0x0000000000004078 1 0 ___UDlA_______b__________________ uptodate,dirty,lru,active,swapbacked
    0x000000000000407c 34 0 __RUDlA_______b__________________ referenced,uptodate,dirty,lru,active,swapbacked
    0x0000000000000400 538 2 __________B______________________ buddy
    0x0000000000000804 1 0 __R________M_____________________ referenced,mmap
    0x0000000000000828 1029 4 ___U_l_____M_____________________ uptodate,lru,mmap
    0x0001000000000828 43 0 ___U_l_____M_________________I___ uptodate,lru,mmap,readahead
    0x000000000000082c 382 1 __RU_l_____M_____________________ referenced,uptodate,lru,mmap
    0x000100000000082c 12 0 __RU_l_____M_________________I___ referenced,uptodate,lru,mmap,readahead
    0x0000000000000868 192 0 ___U_lA____M_____________________ uptodate,lru,active,mmap
    0x0001000000000868 12 0 ___U_lA____M_________________I___ uptodate,lru,active,mmap,readahead
    0x000000000000086c 800 3 __RU_lA____M_____________________ referenced,uptodate,lru,active,mmap
    0x000100000000086c 31 0 __RU_lA____M_________________I___ referenced,uptodate,lru,active,mmap,readahead
    0x0000000000004878 2 0 ___UDlA____M__b__________________ uptodate,dirty,lru,active,mmap,swapbacked
    0x0000000000001000 492 1 ____________a____________________ anonymous
    0x0000000000005008 2 0 ___U________a_b__________________ uptodate,anonymous,swapbacked
    0x0000000000005808 4 0 ___U_______Ma_b__________________ uptodate,mmap,anonymous,swapbacked
    0x000000000000580c 1 0 __RU_______Ma_b__________________ referenced,uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 2839 11 ___U_lA____Ma_b__________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 29 0 __RU_lA____Ma_b__________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 513968 2007

    # ./page-types --raw --list --no-summary --bits reserved
    offset count flags
    0 15 _____________________r___________
    31 4 _____________________r___________
    159 97 _____________________r___________
    4096 2067 _____________________r___________
    6752 2390 _____________________r___________
    9355 3 _____________________r___________
    9728 14526 _____________________r___________

    This patch:

    Introduce PageHuge(), which identifies huge/gigantic pages by their
    dedicated compound destructor functions (see the sketch after this entry).

    Also move prep_compound_gigantic_page() to hugetlb.c and make
    __free_pages_ok() non-static.

    Signed-off-by: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Andi Kleen
    Cc: Matt Mackall
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
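    A minimal sketch of the destructor-based test described above, assuming
    free_huge_page() is the compound destructor that hugetlb installs
    (simplified; the accessor reads the destructor stored with the compound
    page's second page):

    int PageHuge(struct page *page)
    {
            compound_page_dtor *dtor;

            if (!PageCompound(page))
                    return 0;

            page = compound_head(page);
            dtor = get_compound_page_dtor(page);

            /* hugetlb (and gigantic) pages always use free_huge_page() */
            return dtor == free_huge_page;
    }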
     
  • Currently, free_page_mlock() is only called from page_alloc.c. Thus, we
    can move it out of mm/internal.h and into page_alloc.c.

    Cc: Lee Schermerhorn
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Dave Hansen
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • free_page_mlock() tests and clears PG_mlocked using locked versions of the
    bit operations. If the bit is set, it disables interrupts to update the
    counters, and this happens on every page free even though interrupts are
    disabled again very shortly afterwards. This is wasteful.

    This patch splits what free_page_mlock() does (see the sketch after this
    entry). The bit check is still made, but the counter update is delayed
    until interrupts are already disabled, and the non-locked version is used
    to clear the bit. One potential weirdness with this split is that the
    counters do not get updated if the bad_page() check is triggered, but a
    system showing bad pages is getting screwed already.

    Signed-off-by: Mel Gorman
    Reviewed-by: Christoph Lameter
    Reviewed-by: Pekka Enberg
    Reviewed-by: KOSAKI Motohiro
    Cc: Peter Zijlstra
    Cc: Nick Piggin
    Cc: Dave Hansen
    Acked-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
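    A minimal sketch of the split described above (helper names assumed; the
    exact placement of the pieces differs in mainline):

    /* free path: remember and clear the bit with a non-locked bit op */
    int was_mlocked = __TestClearPageMlocked(page);

    local_irq_save(flags);
    if (unlikely(was_mlocked))
            free_page_mlock(page);          /* now only updates the counters */
    /* ... the rest of the IRQ-off free work ... */
    local_irq_restore(flags);

    /* free_page_mlock() itself shrinks to the accounting */
    static inline void free_page_mlock(struct page *page)
    {
            __dec_zone_page_state(page, NR_MLOCK);
            __count_vm_event(UNEVICTABLE_MLOCKFREED);
    }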
     

01 Apr, 2009

1 commit

  • The mlock() facility does not exist for NOMMU since all mappings are
    effectively locked anyway, so we don't make the bits available when
    they're not useful.

    Signed-off-by: David Howells
    Reviewed-by: KOSAKI Motohiro
    Cc: Peter Zijlstra
    Cc: Greg Ungerer
    Cc: Johannes Weiner
    Cc: Rik van Riel
    Cc: Lee Schermerhorn
    Cc: Enrik Berkhan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

07 Jan, 2009

2 commits

  • The initial implementation of checking TIF_MEMDIE covers the case of OOM
    killing: if the process has been OOM killed, TIF_MEMDIE is set and
    get_user_pages() returns immediately. This patch includes:

    1. add the case where the SIGKILL is sent by a user process. The
    process can keep pinning unlimited memory through get_user_pages() even
    after a user process has sent it a SIGKILL (maybe a monitor found the
    process exceeding its memory limit and tried to kill it). In the old
    implementation, the SIGKILL won't be handled until get_user_pages()
    returns.

    2. change the return value to ERESTARTSYS. It makes no sense to
    return ENOMEM if get_user_pages() returned because of a SIGKILL.
    The general convention for a system call interrupted by a signal is
    ERESTARTSYS, so the new return value is consistent with that.
    (A sketch of the check follows this entry.)

    Lee:

    An unfortunate side effect of "make-get_user_pages-interruptible" is that
    it prevents a SIGKILL'd task from munlock-ing pages that it had mlocked,
    resulting in freeing of mlocked pages. Freeing of mlocked pages, in
    itself, is not so bad. We just count them now--altho' I had hoped to
    remove this stat and add PG_MLOCKED to the free pages flags check.

    However, consider pages in shared libraries mapped by more than one task
    that a task mlocked--e.g., via mlockall(). If the task that mlocked the
    pages exits via SIGKILL, these pages would be left mlocked and
    unevictable.

    Proposed fix:

    Add another GUP flag to ignore sigkill when calling get_user_pages from
    munlock()--similar to Kosaki Motohiro's IGNORE_VMA_PERMISSIONS flag for
    the same purpose. We are not actually allocating memory in this case,
    which "make-get_user_pages-interruptible" intends to avoid. We're just
    munlocking pages that are already resident and mapped, and we're reusing
    get_user_pages() to access those pages.

    ?? Maybe we should combine 'IGNORE_VMA_PERMISSIONS' and 'IGNORE_SIGKILL'
    into a single flag: GUP_FLAGS_MUNLOCK ???

    [Lee.Schermerhorn@hp.com: ignore sigkill in get_user_pages during munlock]
    Signed-off-by: Paul Menage
    Signed-off-by: Ying Han
    Reviewed-by: KOSAKI Motohiro
    Reviewed-by: Pekka Enberg
    Cc: Nick Piggin
    Cc: Hugh Dickins
    Cc: Oleg Nesterov
    Cc: Lee Schermerhorn
    Cc: Rohit Seth
    Cc: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
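    A minimal sketch of the check added inside the __get_user_pages() loop,
    as described above (assumed placement; simplified):

    /* inside the per-page loop of __get_user_pages() */
    if (unlikely(fatal_signal_pending(current)))
            /* report the pages already pinned, or restart the syscall */
            return i ? i : -ERESTARTSYS;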
     
  • print_bad_pte() is so far being called only when zap_pte_range() finds
    negative page_mapcount, or there's a fault on a pte_file where it does not
    belong. That's weak coverage when we suspect pagetable corruption.

    Originally, it was called when vm_normal_page() found an invalid pfn: but
    pfn_valid is expensive on some architectures and configurations, so 2.6.24
    put that under CONFIG_DEBUG_VM (which doesn't help in the field), then
    2.6.26 replaced it by a VM_BUG_ON (likewise).

    Reinstate the print_bad_pte() in vm_normal_page(), but use a cheaper test
    than pfn_valid(): memmap_init_zone() (used in bootup and hotplug) keeps a
    __read_mostly note of the highest_memmap_pfn, and vm_normal_page() then
    checks the pfn against that (see the sketch after this entry). We could
    call this pfn_plausible() or pfn_sane(), but I doubt we'll need it
    elsewhere: of course it's not reliable, but it gives much stronger
    pagetable validation on many boxes.

    Also use print_bad_pte() when the pte_special bit is found outside a
    VM_PFNMAP or VM_MIXEDMAP area, instead of VM_BUG_ON.

    Signed-off-by: Hugh Dickins
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: Mel Gorman
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
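    A minimal sketch of the cheap check described above (simplified; the
    surrounding special-mapping handling is omitted):

    /* in vm_normal_page(): pfn_valid() is too expensive here, so just make
     * sure the pfn does not lie beyond the memmap we initialised */
    if (unlikely(pfn > highest_memmap_pfn)) {
            print_bad_pte(vma, addr, pte, NULL);
            return NULL;
    }
    return pfn_to_page(pfn);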
     

07 Nov, 2008

2 commits

  • As we can determine exactly when a gigantic page is in use we can optimise
    the common regular page cases by pulling out gigantic page initialisation
    into its own function. As gigantic pages are never released to buddy we
    do not need a destructor. This effectively reverts the previous change to
    the main buddy allocator. It also adds a paranoid check to ensure we
    never release gigantic pages from hugetlbfs to the main buddy.

    Signed-off-by: Andy Whitcroft
    Cc: Jon Tollefson
    Cc: Mel Gorman
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: [2.6.27.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • When working with hugepages, hugetlbfs assumes that those hugepages are
    smaller than MAX_ORDER. Specifically, it assumes that the mem_map is
    contiguous and uses that to optimise access to the elements of the mem_map
    that represent the hugepage. Gigantic pages (such as 16GB pages on
    powerpc) are by definition of greater order than MAX_ORDER (larger than
    MAX_ORDER_NR_PAGES in size). This means that we can no longer rely on the
    buddy allocator's guarantee that the mem_map is at least contiguous for
    maximally aligned areas of MAX_ORDER_NR_PAGES pages.

    This patch adds new mem_map accessors and iterator helpers which handle
    any discontiguity at MAX_ORDER_NR_PAGES boundaries (see the sketch after
    this entry). It then uses these to implement gigantic page versions of
    copy_huge_page and clear_huge_page, and to allow follow_hugetlb_page to
    handle gigantic pages.

    Signed-off-by: Andy Whitcroft
    Cc: Jon Tollefson
    Cc: Mel Gorman
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: [2.6.27.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
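    A minimal sketch of an accessor of the kind described above, assuming the
    mem_map is only guaranteed contiguous within maximally aligned blocks of
    MAX_ORDER_NR_PAGES pages (simplified):

    /*
     * Return the mem_map entry for the 'offset'-th subpage of the gigantic
     * page 'base', falling back to pfn arithmetic when the offset may cross
     * a mem_map discontiguity at a MAX_ORDER_NR_PAGES boundary.
     */
    static inline struct page *mem_map_offset(struct page *base, int offset)
    {
            if (unlikely(offset >= MAX_ORDER_NR_PAGES))
                    return pfn_to_page(page_to_pfn(base) + offset);
            return base + offset;
    }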
     

20 Oct, 2008

6 commits

  • Allow freeing of mlock()ed pages. This shouldn't happen, but during
    development, it occasionally did.

    This patch allows us to survive that condition, while keeping the
    statistics and events correct for debug.

    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Add NR_MLOCK zone page state, which provides a (conservative) count of
    mlocked pages (actually, the number of mlocked pages moved off the LRU).

    Reworked by lts to fit in with the modified mlock page support in the
    Reclaim Scalability series.

    [kosaki.motohiro@jp.fujitsu.com: fix incorrect Mlocked field of /proc/meminfo]
    [lee.schermerhorn@hp.com: mlocked-pages: add event counting with statistics]
    Signed-off-by: Nick Piggin
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Rik van Riel
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Originally by Nick Piggin

    Remove mlocked pages from the LRU using "unevictable infrastructure"
    during mmap(), munmap(), mremap() and truncate(). Try to move back to
    normal LRU lists on munmap() when last mlocked mapping removed. Remove
    PageMlocked() status when page truncated from file.

    [akpm@linux-foundation.org: cleanup]
    [kamezawa.hiroyu@jp.fujitsu.com: fix double unlock_page()]
    [kosaki.motohiro@jp.fujitsu.com: split LRU: munlock rework]
    [lee.schermerhorn@hp.com: mlock: fix __mlock_vma_pages_range comment block]
    [akpm@linux-foundation.org: remove bogus kerneldoc token]
    Signed-off-by: Nick Piggin
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Rik van Riel
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
     
  • Make sure that mlocked pages also live on the unevictable LRU, so kswapd
    will not scan them over and over again.

    This is achieved through various strategies:

    1) add yet another page flag--PG_mlocked--to indicate that
    the page is locked for efficient testing in vmscan and,
    optionally, fault path. This allows early culling of
    unevictable pages, preventing them from getting to
    page_referenced()/try_to_unmap(). Also allows separate
    accounting of mlock'd pages, as Nick's original patch
    did.

    Note: Nick's original mlock patch used a PG_mlocked
    flag. I had removed this in favor of the PG_unevictable
    flag + an mlock_count [new page struct member]. I
    restored the PG_mlocked flag to eliminate the new
    count field.

    2) add the mlock/unevictable infrastructure to mm/mlock.c,
    with internal APIs in mm/internal.h. This is a rework
    of Nick's original patch to these files, taking into
    account that mlocked pages are now kept on unevictable
    LRU list.

    3) update vmscan.c:page_evictable() to check PageMlocked()
    and, if vma passed in, the vm_flags. Note that the vma
    will only be passed in for new pages in the fault path;
    and then only if the "cull unevictable pages in fault
    path" patch is included.

    4) add try_to_unlock() to rmap.c to walk a page's rmap and
    ClearPageMlocked() if no other vmas have it mlocked.
    Reuses as much of try_to_unmap() as possible. This
    effectively replaces the use of one of the lru list links
    as an mlock count. If this mechanism lets pages in mlocked
    vmas leak through w/o PG_mlocked set [I don't know that it
    does], we should catch them later in try_to_unmap(). One
    hopes this will be rare, as it will be relatively expensive.

    Original mm/internal.h, mm/rmap.c and mm/mlock.c changes:
    Signed-off-by: Nick Piggin

    splitlru: introduce __get_user_pages():

    The new munlock processing needs GUP_FLAGS_IGNORE_VMA_PERMISSIONS,
    because the current get_user_pages() can't grab PROT_NONE pages, and
    therefore PROT_NONE pages can't be munlocked.

    [akpm@linux-foundation.org: fix this for pagemap-pass-mm-into-pagewalkers.patch]
    [akpm@linux-foundation.org: untangle patch interdependencies]
    [akpm@linux-foundation.org: fix things after out-of-order merging]
    [hugh@veritas.com: fix page-flags mess]
    [lee.schermerhorn@hp.com: fix munlock page table walk - now requires 'mm']
    [kosaki.motohiro@jp.fujitsu.com: build fix]
    [kosaki.motohiro@jp.fujitsu.com: fix truncate race and sevaral comments]
    [kosaki.motohiro@jp.fujitsu.com: splitlru: introduce __get_user_pages()]
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Rik van Riel
    Signed-off-by: Lee Schermerhorn
    Cc: Nick Piggin
    Cc: Dave Hansen
    Cc: Matt Mackall
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • When the system contains lots of mlocked or otherwise unevictable pages,
    the pageout code (kswapd) can spend lots of time scanning over these
    pages. Worse still, the presence of lots of unevictable pages can confuse
    kswapd into thinking that more aggressive pageout modes are required,
    resulting in all kinds of bad behaviour.

    Infrastructure to manage pages excluded from reclaim--i.e., hidden from
    vmscan. Based on a patch by Larry Woodman of Red Hat. Reworked to
    maintain "unevictable" pages on a separate per-zone LRU list, to "hide"
    them from vmscan.

    Kosaki Motohiro added the support for the memory controller unevictable
    lru list.

    Pages on the unevictable list have both PG_unevictable and PG_lru set.
    Thus, PG_unevictable is analogous to and mutually exclusive with
    PG_active--it specifies which LRU list the page is on.

    The unevictable infrastructure is enabled by a new mm Kconfig option
    [CONFIG_]UNEVICTABLE_LRU.

    A new function 'page_evictable(page, vma)' in vmscan.c tests whether or
    not a page may be evictable. Subsequent patches will add the various
    !evictable tests. We'll want to keep these tests light-weight for use in
    shrink_active_list() and, possibly, the fault path.

    To avoid races between tasks putting pages [back] onto an LRU list and
    tasks that might be moving the page from non-evictable to evictable state,
    the new function 'putback_lru_page()' -- inverse to 'isolate_lru_page()'
    -- tests the "evictability" of a page after placing it on the LRU, before
    dropping the reference. If the page has become unevictable,
    putback_lru_page() will redo the 'putback', thus moving the page to the
    unevictable list. This way, we avoid "stranding" evictable pages on the
    unevictable list.

    [akpm@linux-foundation.org: fix fallout from out-of-order merge]
    [riel@redhat.com: fix UNEVICTABLE_LRU and !PROC_PAGE_MONITOR build]
    [nishimura@mxp.nes.nec.co.jp: remove redundant mapping check]
    [kosaki.motohiro@jp.fujitsu.com: unevictable-lru-infrastructure: putback_lru_page()/unevictable page handling rework]
    [kosaki.motohiro@jp.fujitsu.com: kill unnecessary lock_page() in vmscan.c]
    [kosaki.motohiro@jp.fujitsu.com: revert migration change of unevictable lru infrastructure]
    [kosaki.motohiro@jp.fujitsu.com: revert to unevictable-lru-infrastructure-kconfig-fix.patch]
    [kosaki.motohiro@jp.fujitsu.com: restore patch failure of vmstat-unevictable-and-mlocked-pages-vm-events.patch]
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Rik van Riel
    Signed-off-by: KOSAKI Motohiro
    Debugged-by: Benjamin Kidwell
    Signed-off-by: Daisuke Nishimura
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • On large memory systems, the VM can spend way too much time scanning
    through pages that it cannot (or should not) evict from memory. Not only
    does it use up CPU time, but it also provokes lock contention and can
    leave large systems under memory pressure in a catatonic state.

    This patch series improves VM scalability by:

    1) putting filesystem backed, swap backed and unevictable pages
    onto their own LRUs, so the system only scans the pages that it
    can/should evict from memory

    2) switching to two handed clock replacement for the anonymous LRUs,
    so the number of pages that need to be scanned when the system
    starts swapping is bound to a reasonable number

    3) keeping unevictable pages off the LRU completely, so the
    VM does not waste CPU time scanning them. ramfs, ramdisk,
    SHM_LOCKED shared memory segments and mlock()ed VMA pages
    are kept on the unevictable list.

    This patch:

    isolate_lru_page logically belongs in vmscan.c rather than migrate.c.

    It is tough, because we don't need that function without memory migration
    so there is a valid argument to have it in migrate.c. However a
    subsequent patch needs to make use of it in the core mm, so we can happily
    move it to vmscan.c.

    Also, make the function a little more generic by not requiring that it
    adds an isolated page to a given list. Callers can do that.

    Note that we now have '__isolate_lru_page()', that does
    something quite different, visible outside of vmscan.c
    for use with memory controller. Methinks we need to
    rationalize these names/purposes. --lts

    [akpm@linux-foundation.org: fix mm/memory_hotplug.c build]
    Signed-off-by: Nick Piggin
    Signed-off-by: Rik van Riel
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

25 Jul, 2008

6 commits

  • hugetlb will need to get compound pages from bootmem to handle the case of
    them being greater than or equal to MAX_ORDER. Export the constructor
    function needed for this.

    Acked-by: Adam Litke
    Signed-off-by: Andi Kleen
    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • The double indirection here is not needed anywhere and hence (at least)
    confusing.

    Signed-off-by: Jan Beulich
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Luck, Tony"
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Acked-by: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • This patch prints out the zonelists during boot for manual verification by the
    user if the mminit_loglevel is MMINIT_VERIFY or higher.

    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • There are a number of different views of how much memory is currently
    active: the arch-independent zone-sizing view, the bootmem allocator view
    and the memory model view.

    Architectures register this information at different times, and it is not
    necessarily in sync, particularly with respect to some SPARSEMEM
    limitations.

    This patch introduces mminit_validate_memmodel_limits() which is able to
    validate and correct PFN ranges with respect to the memory model. It is only
    SPARSEMEM that currently validates itself.

    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Print out information on how the page flags are being used if
    mminit_loglevel is MMINIT_VERIFY or higher, and unconditionally perform
    sanity checks on the flags regardless of loglevel.

    When the page flags are updated with section, node and zone information, a
    check is made to ensure the values can be retrieved correctly. Finally we
    confirm that pfn_to_page and page_to_pfn are correct inverses of each
    other.

    [akpm@linux-foundation.org: fix printk warnings]
    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Boot initialisation is very complex, with significant numbers of
    architecture-specific routines, hooks and code ordering. While significant
    amounts of the initialisation is architecture-independent, it trusts the data
    received from the architecture layer. This is a mistake, and has resulted in
    a number of difficult-to-diagnose bugs.

    This patchset adds some validation and tracing to memory initialisation. It
    also introduces a few basic defensive measures. The validation code can be
    explicitly disabled for embedded systems.

    This patch:

    Add additional debugging and verification code for memory initialisation.

    Once enabled, the verification checks are always run and when required
    additional debugging information may be outputted via a mminit_loglevel=
    command-line parameter.

    The verification code is placed in a new file mm/mm_init.c. Ideally other mm
    initialisation code will be moved here over time.

    Signed-off-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Andy Whitcroft
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

28 Apr, 2008

1 commit

  • This patch frees memmaps which were allocated by bootmem.

    Freeing the usemap is not necessary: its pages may still be needed by
    other sections.

    If the section being removed is the last section on the node, it is the
    final user of the usemap page (usemaps are allocated on their own section
    by the previous patch). But even then it shouldn't be freed, because the
    section must be in the logically offline state, with all its pages
    isolated from the page allocator. If the usemap page were freed, the page
    allocator might hand it out even though it is about to be removed
    physically, which would be a disaster. So this patch keeps it as it is.

    Signed-off-by: Yasunori Goto
    Cc: Badari Pulavarty
    Cc: Yinghai Lu
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     

24 Feb, 2008

1 commit

  • WARNING: vmlinux.o(.meminit.text+0x649):
    Section mismatch in reference from the
    function free_area_init_core() to the function .init.text:setup_usemap()
    The function __meminit free_area_init_core() references
    a function __init setup_usemap().
    If free_area_init_core is only used by setup_usemap then
    annotate free_area_init_core with a matching annotation.

    The warning covers this stack of functions in mm/page_alloc.c:

    alloc_bootmem_node must be marked __init.
    alloc_bootmem_node is used by setup_usemap, if !SPARSEMEM.
    (usemap_size is only used by setup_usemap, if !SPARSEMEM.)
    setup_usemap is only used by free_area_init_core.
    free_area_init_core is only used by free_area_init_node.

    free_area_init_node is used by:
    arch/alpha/mm/numa.c: __init paging_init()
    arch/arm/mm/init.c: __init bootmem_init_node()
    arch/avr32/mm/init.c: __init paging_init()
    arch/cris/arch-v10/mm/init.c: __init paging_init()
    arch/cris/arch-v32/mm/init.c: __init paging_init()
    arch/m32r/mm/discontig.c: __init zone_sizes_init()
    arch/m32r/mm/init.c: __init zone_sizes_init()
    arch/m68k/mm/motorola.c: __init paging_init()
    arch/m68k/mm/sun3mmu.c: __init paging_init()
    arch/mips/sgi-ip27/ip27-memory.c: __init paging_init()
    arch/parisc/mm/init.c: __init paging_init()
    arch/sparc/mm/srmmu.c: __init srmmu_paging_init()
    arch/sparc/mm/sun4c.c: __init sun4c_paging_init()
    arch/sparc64/mm/init.c: __init paging_init()
    mm/page_alloc.c: __init free_area_init_nodes()
    mm/page_alloc.c: __init free_area_init()
    and
    mm/memory_hotplug.c: hotadd_new_pgdat()

    hotadd_new_pgdat can not be an __init function, but:

    It is compiled for MEMORY_HOTPLUG configurations only
    MEMORY_HOTPLUG depends on SPARSEMEM || X86_64_ACPI_NUMA
    X86_64_ACPI_NUMA depends on X86_64
    ARCH_FLATMEM_ENABLE depends on X86_32
    ARCH_DISCONTIGMEM_ENABLE depends on X86_32
    So X86_64_ACPI_NUMA implies SPARSEMEM, right?

    So we can mark the stack of functions __init for !SPARSEMEM, but we must mark
    them __meminit for SPARSEMEM configurations. This is ok, because then the
    calls to alloc_bootmem_node are also avoided.

    Compile-tested on:
    silly minimal config
    defconfig x86_32
    defconfig x86_64
    defconfig x86_64 -HIBERNATION +MEMORY_HOTPLUG

    Signed-off-by: Alexander van Heukelum
    Reviewed-by: Sam Ravnborg
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander van Heukelum
     

06 Feb, 2008

2 commits

  • The current PageTail semantic is that a PageTail page is first a
    PageCompound page. So remove the redundant PageCompound test in
    set_page_refcounted().

    Signed-off-by: Qi Yong
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qi Yong
     
  • fastcall is always defined to be empty, remove it

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Harvey Harrison
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison