27 Jul, 2016

3 commits

  • When there is an isolated_page, post_alloc_hook() is called with page
    but __free_pages() is called with isolated_page. Since they are the
    same, there is no actual problem, but it is very confusing. To reduce
    the confusion, this patch changes isolated_page to a boolean type and
    uses the page variable consistently.

    Link: http://lkml.kernel.org/r/1466150259-27727-10-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • This patch is motivated by Hugh and Vlastimil's concern [1].

    There are two ways to get a freepage from the allocator. One is using
    the normal memory allocation API and the other is __isolate_free_page(),
    which is used internally for compaction and pageblock isolation. The
    latter usage is rather tricky since it doesn't do the whole
    post-allocation processing done by the normal API.

    One problematic thing I already know of is that a poisoned page would
    not be checked if it is allocated by __isolate_free_page(). Perhaps
    there are more.

    We could add more debug logic for allocated pages in the future, and
    this separation would cause more problems. I'd like to fix this
    situation now. The solution is simple: this patch commonizes the logic
    for newly allocated pages and uses it at all sites. This will solve the
    problem.

    [1] http://marc.info/?i=alpine.LSU.2.11.1604270029350.7066%40eggly.anvils%3E
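
    The commonized hook ends up looking roughly like the following (a
    sketch based on the description above; the exact body and the set of
    debug hooks are assumptions, not the literal patch):

        void post_alloc_hook(struct page *page, unsigned int order,
                             gfp_t gfp_flags)
        {
                /* Post-allocation processing shared by the normal
                 * allocation path and __isolate_free_page() users. */
                set_page_private(page, 0);
                set_page_refcounted(page);

                arch_alloc_page(page, order);
                kernel_map_pages(page, 1 << order, 1);
                kernel_poison_pages(page, 1 << order, 1);
                kasan_alloc_pages(page, order);
                set_page_owner(page, order, gfp_flags);
        }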

    [iamjoonsoo.kim@lge.com: mm-page_alloc-introduce-post-allocation-processing-on-page-allocator-v3]
    Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
    Link: http://lkml.kernel.org/r/1466150259-27727-9-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • It's not necessary to initialize page_owner while holding the zone
    lock. Doing so causes more contention on the zone lock, although that
    is not a big problem since page_owner is just a debug feature. Still,
    it is better than before, so do it. This is also a preparation step
    for using stackdepot in the page owner feature: stackdepot allocates
    new pages when there is no reserved space, and holding the zone lock
    in this case would cause a deadlock.

    Link: http://lkml.kernel.org/r/1464230275-25791-2-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

20 May, 2016

2 commits

  • __offline_isolated_pages() and test_pages_isolated() are used by memory
    hotplug. These functions require that the range lie within a single
    zone, but there is no code verifying this, because memory hotplug
    checks it before calling them. To avoid confusing future users of
    these functions, this patch adds comments to them.

    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Laura Abbott
    Cc: Minchan Kim
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: "Aneesh Kumar K.V"
    Cc: "Rafael J. Wysocki"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Lots of code does

        node = next_node(node, XXX);
        if (node == MAX_NUMNODES)
                node = first_node(XXX);

    so create next_node_in() to do this and use it in various places.
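
    A minimal sketch of the helper (the in-tree definition may differ in
    detail, e.g. by taking the nodemask by pointer):

        /* Return the next node after @node in @mask, wrapping around
         * to the first node when the end of the mask is reached. */
        static inline int next_node_in(int node, nodemask_t mask)
        {
                int ret = next_node(node, mask);

                if (ret == MAX_NUMNODES)
                        ret = first_node(mask);
                return ret;
        }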

    [mhocko@suse.com: use next_node_in() helper]
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Signed-off-by: Michal Hocko
    Cc: Xishi Qiu
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: Naoya Horiguchi
    Cc: Laura Abbott
    Cc: Hui Zhu
    Cc: Wang Xiaoqiang
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

02 Apr, 2016

2 commits

  • Commit fea85cff11de ("mm/page_isolation.c: return last tested pfn rather
    than failure indicator") changed the meaning of the return value. Let's
    change the function comments as well.

    Signed-off-by: Neil Zhang
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Zhang
     
  • It is incorrect to use next_node() to find a target node; it may
    return MAX_NUMNODES or an invalid node. This can lead to a crash in
    buddy system allocation.

    Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
    Signed-off-by: Xishi Qiu
    Acked-by: Vlastimil Babka
    Acked-by: Naoya Horiguchi
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: "Laura Abbott"
    Cc: Hui Zhu
    Cc: Wang Xiaoqiang
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     

09 Sep, 2015

2 commits

  • Nowadays, set/unset_migratetype_isolate() are defined and used only in
    mm/page_isolation.c, so let's limit their scope to that file.

    Signed-off-by: Naoya Horiguchi
    Acked-by: David Rientjes
    Acked-by: Vlastimil Babka
    Cc: Joonsoo Kim
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • __test_page_isolated_in_pageblock() is used to verify whether all
    pages in a pageblock were either successfully isolated or are
    hwpoisoned. Two of the possible page states that it tests for are,
    however, bogus and misleading.

    Both tests rely on get_freepage_migratetype(page), which makes no
    guarantees about pages on freelists. Specifically, it doesn't
    guarantee that the migratetype it returns actually matches the
    migratetype of the freelist that the page is on. Such a guarantee is
    not its purpose, and providing one would hurt allocator performance.

    The first test checks whether the freepage_migratetype equals
    MIGRATE_ISOLATE, supposedly to catch races between page isolation and
    allocator activity. These races should be fixed nowadays with
    51bb1a4093 ("mm/page_alloc: add freepage on isolate pageblock to correct
    buddy list") and related patches. As explained above, the check
    wouldn't be able to catch them reliably anyway. For the same reason
    false positives can happen, although they are harmless, as the
    move_freepages() call would just move the page to the same freelist it's
    already on. So removing the test is not a bug fix, just cleanup. After
    this patch, we assume that all PageBuddy pages are on the correct
    freelist and that the races were really fixed. A truly reliable
    verification in the form of e.g. VM_BUG_ON() would be complicated and
    is arguably not needed.

    The second test (page_count(page) == 0 && get_freepage_migratetype(page)
    == MIGRATE_ISOLATE) is probably supposed (the code comes from a big
    memory isolation patch from 2007) to catch pages on MIGRATE_ISOLATE
    pcplists. However, pcplists don't contain MIGRATE_ISOLATE freepages
    nowadays, those are freed directly to free lists, so the check is
    obsolete. Remove it as well.
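
    After removing both tests, the per-page check reduces to roughly the
    following shape (a sketch, not the literal patch; the
    skip_hwpoisoned_pages parameter name is an assumption):

        while (pfn < end_pfn) {
                page = pfn_to_page(pfn);
                if (PageBuddy(page))
                        /* Assume buddy pages sit on the correct
                         * (MIGRATE_ISOLATE) freelist. */
                        pfn += 1 << page_order(page);
                else if (skip_hwpoisoned_pages && PageHWPoison(page))
                        /* A hwpoisoned page cannot also be PageBuddy. */
                        pfn++;
                else
                        break;
        }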

    Signed-off-by: Vlastimil Babka
    Acked-by: Joonsoo Kim
    Cc: Minchan Kim
    Acked-by: Michal Nazarewicz
    Cc: Laura Abbott
    Reviewed-by: Naoya Horiguchi
    Cc: Seungho Park
    Cc: Johannes Weiner
    Cc: "Kirill A. Shutemov"
    Acked-by: Mel Gorman
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

15 May, 2015

1 commit

  • I had an issue:

    Unable to handle kernel NULL pointer dereference at virtual address 0000082a
    pgd = cc970000
    [0000082a] *pgd=00000000
    Internal error: Oops: 5 [#1] PREEMPT SMP ARM
    PC is at get_pageblock_flags_group+0x5c/0xb0
    LR is at unset_migratetype_isolate+0x148/0x1b0
    pc : [] lr : [] psr: 80000093
    sp : c7029d00 ip : 00000105 fp : c7029d1c
    r10: 00000001 r9 : 0000000a r8 : 00000004
    r7 : 60000013 r6 : 000000a4 r5 : c0a357e4 r4 : 00000000
    r3 : 00000826 r2 : 00000002 r1 : 00000000 r0 : 0000003f
    Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
    Control: 10c5387d Table: 2cb7006a DAC: 00000015
    Backtrace:
    get_pageblock_flags_group+0x0/0xb0
    unset_migratetype_isolate+0x0/0x1b0
    undo_isolate_page_range+0x0/0xdc
    __alloc_contig_range+0x0/0x34c
    alloc_contig_range+0x0/0x18

    This issue arises because, when unset_migratetype_isolate() is called
    to unset a part of CMA memory, it tries to access the buddy page to
    get its status:

        if (order >= pageblock_order) {
                page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
                buddy_idx = __find_buddy_index(page_idx, order);
                buddy = page + (buddy_idx - page_idx);

                if (!is_migrate_isolate_page(buddy)) {

    But the beginning address of this part of CMA memory is very close to
    memory that was reserved at boot time (and is therefore not in the
    buddy system). So add a check before accessing the buddy page.
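
    The guarded access would look roughly like this (a sketch; the exact
    validity helper, here assumed to be pfn_valid_within(), may differ):

        if (order >= pageblock_order) {
                page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
                buddy_idx = __find_buddy_index(page_idx, order);
                buddy = page + (buddy_idx - page_idx);

                /* Only dereference the buddy if its pfn is backed
                 * by a valid struct page. */
                if (pfn_valid_within(page_to_pfn(buddy)) &&
                    !is_migrate_isolate_page(buddy)) {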

    [akpm@linux-foundation.org: use conventional code layout]
    Signed-off-by: Hui Zhu
    Suggested-by: Laura Abbott
    Suggested-by: Joonsoo Kim
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hui Zhu
     

26 Mar, 2015

1 commit

  • Commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on
    isolated pageblock") changed the logic of unset_migratetype_isolate to
    check the buddy allocator and explicitly call __free_pages to merge.

    The page being freed in this path never had prep_new_page() called,
    so set_page_refcounted() is called explicitly, but there is no call
    to kernel_map_pages(). With the default kernel_map_pages() this is
    mostly harmless, but if kernel_map_pages() manipulates the page
    tables (unmapping pages or setting them read-only), this may trigger
    a fault:

    alloc_contig_range test_pages_isolated(ceb00, ced00) failed
    Unable to handle kernel paging request at virtual address ffffffc0cec00000
    pgd = ffffffc045fc4000
    [ffffffc0cec00000] *pgd=0000000000000000
    Internal error: Oops: 9600004f [#1] PREEMPT SMP
    Modules linked in: exfatfs
    CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 3.10.49-gc72ad36-dirty #1
    task: ffffffc03de52100 ti: ffffffc015388000 task.ti: ffffffc015388000
    PC is at memset+0xc8/0x1c0
    LR is at kernel_map_pages+0x1ec/0x244

    Fix this by calling kernel_map_pages() to ensure the page is mapped
    in the page table properly.
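
    In unset_migratetype_isolate(), the fix amounts to mapping the page
    back before it is freed (a sketch of the idea, not the literal hunk):

        if (isolated_page) {
                /* This page never went through prep_new_page(), so undo
                 * any debug unmapping before handing it to the buddy. */
                kernel_map_pages(page, 1 << order, 1);
                set_page_refcounted(page);
                __free_pages(page, order);
        }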

    Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
    Signed-off-by: Laura Abbott
    Cc: Naoya Horiguchi
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Yasuaki Ishimatsu
    Cc: Zhang Yanfei
    Cc: Xishi Qiu
    Cc: Vladimir Davydov
    Acked-by: Joonsoo Kim
    Cc: Gioh Kim
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Vlastimil Babka
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     

11 Dec, 2014

2 commits

  • When setting MIGRATE_ISOLATE on a pageblock, pcplists are drained to
    improve the chance that all pages will be successfully isolated and
    not left in the per-cpu caches. Since isolation is always concerned
    with a single zone, we can reduce the pcplists drain to that single
    zone, which is now possible.

    The change should make memory isolation faster and no longer disturb
    unrelated pcplists.

    Signed-off-by: Vlastimil Babka
    Cc: Naoya Horiguchi
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Yasuaki Ishimatsu
    Cc: Zhang Yanfei
    Cc: Xishi Qiu
    Cc: Vladimir Davydov
    Cc: Joonsoo Kim
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • The functions for draining per-cpu pages back to the buddy allocator
    currently always operate on all zones. There are, however, several
    cases where the drain is only needed in the context of a single zone,
    and spilling other pcplists is a waste of time, both for the extra
    spilling and the later refilling.

    This patch introduces a new zone pointer parameter to
    drain_all_pages() and changes the dummy parameter of
    drain_local_pages() to also be a zone pointer. When NULL is passed,
    the functions operate on all zones as usual. Passing a specific zone
    pointer reduces the work to that single zone.

    All callers are updated to pass the NULL pointer in this patch.
    Conversion to single zone (where appropriate) is done in further
    patches.
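
    The resulting interface, per the description above (a sketch of the
    signatures, not the full implementation):

        /* Drain pcplists of all CPUs; @zone == NULL means all zones. */
        void drain_all_pages(struct zone *zone);

        /* Drain pcplists of the local CPU, same NULL convention. */
        void drain_local_pages(struct zone *zone);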

    Signed-off-by: Vlastimil Babka
    Cc: Naoya Horiguchi
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Yasuaki Ishimatsu
    Cc: Zhang Yanfei
    Cc: Xishi Qiu
    Cc: Vladimir Davydov
    Cc: Joonsoo Kim
    Cc: Michal Nazarewicz
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

14 Nov, 2014

2 commits

  • The current pageblock isolation logic isolates each pageblock
    individually. This causes a freepage accounting problem if a freepage
    of pageblock order on an isolate pageblock is merged with another
    freepage on a normal pageblock. We can prevent such merging by
    restricting the max order of merging to pageblock order when the
    freepage is on an isolate pageblock.

    A side-effect of this change is that there can be a non-merged buddy
    freepage even after pageblock isolation finishes, because undoing
    pageblock isolation just moves freepages from the isolate buddy list
    to the normal buddy list rather than considering merging. So the
    patch also makes undoing pageblock isolation consider freepage
    merging: when un-isolating, a freepage of more than pageblock order
    and its buddy are checked. If they are on a normal pageblock, instead
    of just moving the freepage, we isolate it and free it in order to
    get it merged.
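
    The merging restriction in __free_one_page() is roughly (a sketch
    based on the description above):

        if (is_migrate_isolate(migratetype)) {
                /* Stop merging at pageblock order so a freepage on an
                 * isolate pageblock never merges with one on a normal
                 * pageblock, which would break freepage accounting. */
                max_order = min(MAX_ORDER, pageblock_order + 1);
        }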

    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Yasuaki Ishimatsu
    Cc: Zhang Yanfei
    Cc: Tang Chen
    Cc: Naoya Horiguchi
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Wen Congyang
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Laura Abbott
    Cc: Heesub Shin
    Cc: "Aneesh Kumar K.V"
    Cc: Ritesh Harjani
    Cc: Gioh Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Before describing the bugs themselves, I first explain the definition
    of a freepage.

    1. Pages on a buddy list are counted as freepages.
    2. Pages on the isolate migratetype buddy list are *not* counted as
       freepages.
    3. Pages on the CMA buddy list are counted as CMA freepages, too.

    Now, I describe the problems and the related patches.

    Patch 1: There are race conditions in getting the pageblock
    migratetype that result in misplacement of freepages on the buddy
    list, an incorrect freepage count, and unavailability of freepages.

    Patch 2: Freepages on the pcp list could carry stale cached
    information used to determine the migratetype of the buddy list to go
    to. This causes misplacement of freepages on the buddy list and an
    incorrect freepage count.

    Patch 4: Merging between freepages on pageblocks of different
    migratetypes will cause a freepage accounting problem. This patch
    fixes it.

    Without patchset [3], the above problems don't happen in my CMA
    allocation test, because CMA reserved pages aren't used at all, so
    there is no chance for the above races.

    With patchset [3], I ran a simple CMA allocation test and got the
    result below:

    - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
    - run kernel build (make -j16) in the background
    - 30 CMA allocation attempts (8MB * 30 = 240MB) at 5 sec intervals
    - Result: more than 5000 freepages are missing from the count

    With patchset [3] and this patchset, no freepages are missing from
    the count, so I conclude that the problems are solved.

    These problems also occur in my simple memory offlining test
    environment.

    This patch (of 4):

    There are two paths that reach the core free function of the buddy
    allocator, __free_one_page(): one is
    free_one_page() -> __free_one_page() and the other is
    free_hot_cold_page() -> free_pcppages_bulk() -> __free_one_page().
    Each path has a race condition causing serious problems. This patch
    focuses on the first type of freepath; the following patch will solve
    the problem in the second type.

    In the first type of freepath, we get the migratetype of the page
    being freed without holding the zone lock, so it can be racy. There
    are two cases of this race.

    1. Pages are added to the isolate buddy list after restoring the
    original migratetype.

    CPU1                                    CPU2

    get migratetype => MIGRATE_ISOLATE
    call free_one_page() with MIGRATE_ISOLATE

                                            grab the zone lock
                                            unisolate pageblock
                                            release the zone lock

    grab the zone lock
    call __free_one_page() with MIGRATE_ISOLATE
    freepage goes onto the isolate buddy list,
    although the pageblock is already unisolated

    This may cause two problems. One is that we can't use this page
    anymore until the next isolation attempt on this pageblock, because
    the freepage is on the isolate buddy list. The other is that the
    freepage accounting could be wrong, due to merging between different
    buddy lists. Freepages on the isolate buddy list aren't counted as
    freepages, but ones on normal buddy lists are. If a merge happens, a
    buddy freepage on a normal buddy list is inevitably moved to the
    isolate buddy list without any consideration of freepage accounting,
    so the count can become incorrect.

    2. Pages are added to the normal buddy list while the pageblock is
    isolated. This is similar to the case above.

    It may also cause two problems. One is that we can't keep these
    freepages from being allocated: although the pageblock is isolated,
    the freepage is added to a normal buddy list, so it can be allocated
    without any restriction. The other problem is the same as in case 1,
    that is, incorrect freepage accounting.

    This race condition can be prevented by checking the migratetype
    again while holding the zone lock. Because that is a somewhat heavy
    operation and isn't needed in the common case, we want to avoid the
    recheck as much as possible. So this patch introduces a new field,
    nr_isolate_pageblock, in struct zone to track whether the zone has
    any isolated pageblocks. With this, we can skip re-checking the
    migratetype in the common case and do it only if there is an isolated
    pageblock or the migratetype is MIGRATE_ISOLATE. This solves the
    problems mentioned above (see the sketch below).

    Changes from v3:
    Add one more check in free_one_page() for whether the migratetype is
    MIGRATE_ISOLATE. Without this, the above-mentioned case 1 could still
    happen.
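
    The resulting recheck in free_one_page() is sketched below (assuming
    a has_isolate_pageblock() helper that reads nr_isolate_pageblock; the
    names follow the description above, not the literal patch):

        spin_lock(&zone->lock);
        if (unlikely(has_isolate_pageblock(zone) ||
                     is_migrate_isolate(migratetype))) {
                /* Rare path: re-read the migratetype under zone->lock. */
                migratetype = get_pfnblock_migratetype(page, pfn);
        }
        __free_one_page(page, pfn, zone, order, migratetype);
        spin_unlock(&zone->lock);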

    Signed-off-by: Joonsoo Kim
    Acked-by: Minchan Kim
    Acked-by: Michal Nazarewicz
    Acked-by: Vlastimil Babka
    Cc: "Kirill A. Shutemov"
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Yasuaki Ishimatsu
    Cc: Zhang Yanfei
    Cc: Tang Chen
    Cc: Naoya Horiguchi
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Wen Congyang
    Cc: Marek Szyprowski
    Cc: Laura Abbott
    Cc: Heesub Shin
    Cc: "Aneesh Kumar K.V"
    Cc: Ritesh Harjani
    Cc: Gioh Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

12 Sep, 2013

1 commit

  • Until now we couldn't offline memory blocks which contain hugepages,
    because a hugepage was considered an unmovable page. But with this
    patch series a hugepage becomes movable, so by using hugepage
    migration we can offline such memory blocks.

    What's different from other users of hugepage migration is that we
    need to decompose all the hugepages inside the target memory block
    into free buddy pages after hugepage migration, because otherwise
    free hugepages remaining in the memory block interfere with memory
    offlining. For this reason we introduce the new functions
    dissolve_free_huge_page() and dissolve_free_huge_pages().

    Other than that, what this patch does is straightforward: it adds
    hugepage support to the functions which scan over pfns and collect
    the pages to be migrated, and adds a hugepage allocation function to
    alloc_migrate_target().

    As for larger hugepages (1GB on x86_64), hotremove over them is not
    easy because they are larger than a memory block, so for now we
    simply let it fail.
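
    The range-walking variant can be sketched as below (assuming the
    per-page helper introduced above, and a minimum_order value for the
    smallest supported hugepage size; both details are assumptions):

        void dissolve_free_huge_pages(unsigned long start_pfn,
                                      unsigned long end_pfn)
        {
                unsigned long pfn;

                /* Step through the range at the smallest hugepage
                 * granularity, returning each free hugepage found to
                 * the buddy allocator. */
                for (pfn = start_pfn; pfn < end_pfn;
                     pfn += 1 << minimum_order)
                        dissolve_free_huge_page(pfn_to_page(pfn));
        }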

    [yongjun_wei@trendmicro.com.cn: remove duplicated include]
    Signed-off-by: Naoya Horiguchi
    Acked-by: Andi Kleen
    Cc: Hillf Danton
    Cc: Wanpeng Li
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Cc: KOSAKI Motohiro
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: "Aneesh Kumar K.V"
    Signed-off-by: Wei Yongjun
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

05 Jan, 2013

1 commit

  • Commit 702d1a6e0766 ("memory-hotplug: fix kswapd looping forever
    problem") added an isolated pageblocks counter (nr_pageblock_isolate
    in struct zone) and used it to adjust the free pages counter in
    zone_watermark_ok_safe() to prevent the kswapd-looping-forever
    problem.

    Later, commit 2139cbe627b8 ("cma: fix counting of isolated pages")
    fixed the accounting of isolated pages in the global free pages
    counter. That made the previous zone_watermark_ok_safe() fix
    unnecessary and potentially harmful (because isolated pages may now
    be accounted twice, making the free pages counter incorrect).

    This patch removes the special isolated pageblocks counter
    altogether, which fixes the zone_watermark_ok_safe() free pages
    check.

    Reported-by: Tomasz Stanislawski
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Minchan Kim
    Cc: KOSAKI Motohiro
    Cc: Aaditya Kumar
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     

12 Dec, 2012

1 commit

  • The hwpoisoned flag may be set when we offline a page via the sysfs
    interface /sys/devices/system/memory/soft_offline_page or
    /sys/devices/system/memory/hard_offline_page. If we don't clear this
    flag when onlining pages, such a page can't be freed and will not be
    on a free list, so we can't offline these pages again. We should
    therefore skip such pages when offlining pages.
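
    When scanning the range to be offlined, the skip amounts to something
    like this (a sketch of the idea, not the literal patch):

        if (PageHWPoison(page)) {
                /* A hwpoisoned page is neither free nor movable;
                 * treat it as offlineable and step over it. */
                pfn++;
                continue;
        }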

    Signed-off-by: Wen Congyang
    Cc: David Rientjes
    Cc: Jiang Liu
    Cc: Len Brown
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Christoph Lameter
    Cc: Minchan Kim
    Cc: KOSAKI Motohiro
    Cc: Yasuaki Ishimatsu
    Cc: Andi Kleen
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wen Congyang
     

09 Oct, 2012

6 commits

  • __alloc_contig_migrate_alloc() can be used by memory hotplug, so
    refactor it out (move it and give it a common name) into
    page_isolation.c.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Minchan Kim
    Cc: Kamezawa Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Wen Congyang
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • If a race between allocation and isolation happens during
    memory-hotplug offline, some pages could be on the MIGRATE_MOVABLE
    free_list although the pageblock's migratetype is MIGRATE_ISOLATE.

    The race can be detected by get_freepage_migratetype() in
    __test_page_isolated_in_pageblock(). Currently, when it is detected,
    EBUSY gets bubbled all the way up and the hotplug operation fails.

    A better idea is, instead of returning and failing memory hot-remove,
    to move the free page to the correct list at the time the race is
    detected. This should improve the memory hot-remove success ratio,
    although the race is really rare.

    Suggested by Mel Gorman.

    [akpm@linux-foundation.org: small cleanup]
    Signed-off-by: Minchan Kim
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • As shown below, memory hotplug creates a race between page isolation
    and page allocation, so it can hit the BUG_ON in
    __offline_isolated_pages().

    CPU A                                   CPU B

    start_isolate_page_range
    set_migratetype_isolate
    spin_lock_irqsave(zone->lock)

                                            free_hot_cold_page(page A)
                                            /* without zone->lock */
                                            migratetype =
                                              get_pageblock_migratetype(page A);
                                            /*
                                             * page could be moved into
                                             * MIGRATE_MOVABLE of per_cpu_pages
                                             */
                                            list_add_tail(&page->lru,
                                                &pcp->lists[migratetype]);

    set_pageblock_isolate
    move_freepages_block
    drain_all_pages

    /* page A could be on the MIGRATE_MOVABLE free_list. */

    check_pages_isolated
    __test_page_isolated_in_pageblock
    /*
     * We can't catch a freed page which
     * is on free_list[MIGRATE_MOVABLE]
     */
    if (PageBuddy(page A))
            pfn += 1 << page_order(page A);

    /* So page A could be allocated */

    __offline_isolated_pages
    /*
     * BUG_ON is hit, or we offline a page
     * which is in use by someone
     */
    BUG_ON(!PageBuddy(page A));

    This patch checks the page's migratetype in the freelist in
    __test_page_isolated_in_pageblock(). Now
    __test_page_isolated_in_pageblock() can catch a page affected by the
    above race and fail the memory offlining.

    Signed-off-by: Minchan Kim
    Acked-by: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • The page allocator uses set_page_private() and page_private() for
    handling the migratetype when it frees a page. Let's replace them
    with [set|get]_freepage_migratetype() to make the intent clearer.
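
    The new helpers are thin wrappers over the page_private field (a
    sketch per the description above):

        static inline void set_freepage_migratetype(struct page *page,
                                                    int migratetype)
        {
                set_page_private(page, migratetype);
        }

        static inline int get_freepage_migratetype(struct page *page)
        {
                return page_private(page);
        }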

    Signed-off-by: Minchan Kim
    Acked-by: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • Add a NR_FREE_CMA_PAGES counter to be used later for checking the
    watermark in __zone_watermark_ok(). For simplicity, and to avoid
    #ifdef hell, make this counter always available (not only when
    CONFIG_CMA=y).

    [akpm@linux-foundation.org: use conventional migratetype naming]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     
  • Isolated free pages shouldn't be accounted in the NR_FREE_PAGES
    counter. Fix this by properly decreasing/increasing NR_FREE_PAGES in
    set_migratetype_isolate()/unset_migratetype_isolate() and removing
    the counter adjustment for isolated pages from free_one_page() and
    split_free_page().
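
    In set_migratetype_isolate(), the adjustment looks roughly like this
    (a sketch; unset_migratetype_isolate() does the mirror-image
    increase):

        nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE);
        /* Isolated free pages no longer count toward NR_FREE_PAGES. */
        __mod_zone_page_state(zone, NR_FREE_PAGES, -nr_pages);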

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     

01 Aug, 2012

2 commits

  • When hotplug offlining happens on zone A, it starts to mark freed
    pages as MIGRATE_ISOLATE in the buddy allocator to prevent further
    allocation. (MIGRATE_ISOLATE is a rather ironic type: pages are
    apparently in the buddy allocator, but we can't allocate them.)

    When a memory shortage happens during hotplug offlining, the current
    task starts to reclaim and then wakes up kswapd. Kswapd checks the
    watermark and then goes to sleep, because the current
    zone_watermark_ok_safe() doesn't consider the MIGRATE_ISOLATE freed
    page count. The current task continues to reclaim in the direct
    reclaim path without kswapd's help. The problem is that
    zone->all_unreclaimable is set only by kswapd, so the current task
    can loop forever like below.

    __alloc_pages_slowpath
    restart:
        wake_all_kswapd
    rebalance:
        __alloc_pages_direct_reclaim
            do_try_to_free_pages
                if global_reclaim && !all_unreclaimable
                    return 1; /* It means we did did_some_progress */
        skip __alloc_pages_may_oom
        should_alloc_retry
            goto rebalance;

    If we apply KOSAKI's patch [1], which doesn't depend on kswapd for
    setting zone->all_unreclaimable, we can solve this problem by killing
    some task in the direct reclaim path. But it still doesn't wake up
    kswapd, which could remain a problem if some other subsystem needs a
    GFP_ATOMIC request. So kswapd should consider MIGRATE_ISOLATE when it
    calculates free pages BEFORE going to sleep.

    This patch counts the number of MIGRATE_ISOLATE pageblocks, and
    zone_watermark_ok_safe() will take them into account if the system
    has such blocks (fortunately this is very rare, so the overhead is
    not a problem, and kswapd is never a hotpath).

    Copy/modify from Mel's quote
    "
    Ideal solution would be "allocating" the pageblock.
    It would keep the free space accounting as it is but historically,
    memory hotplug didn't allocate pages because it would be difficult to
    detect if a pageblock was isolated or if part of some balloon.
    Allocating just full pageblocks would work around this, However,
    it would play very badly with CMA.
    "

    [1] http://lkml.org/lkml/2012/6/14/74
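
    zone_watermark_ok_safe() then subtracts the isolated free pages
    before the check, roughly as follows (a sketch using the counter and
    helper named in this series; the exact code is an assumption):

        free_pages = zone_page_state(z, NR_FREE_PAGES);
        if (z->nr_pageblock_isolate)
                /* Don't let unallocatable isolated pages satisfy
                 * the watermark. */
                free_pages -= nr_zone_isolate_freepages(z);
        return __zone_watermark_ok(z, order, mark, classzone_idx,
                                   0, free_pages);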

    [akpm@linux-foundation.org: simplify nr_zone_isolate_freepages(), rework zone_watermark_ok_safe() comment, simplify set_pageblock_isolate() and restore_pageblock_isolate()]
    [akpm@linux-foundation.org: fix CONFIG_MEMORY_ISOLATION=n build]
    Signed-off-by: Minchan Kim
    Suggested-by: KOSAKI Motohiro
    Tested-by: Aaditya Kumar
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • mm/page_alloc.c has some memory isolation functions, but they are
    used only when CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE} is enabled.
    So let's gate them behind a new CONFIG_MEMORY_ISOLATION option. This
    reduces binary size, and we can then simply check
    CONFIG_MEMORY_ISOLATION instead of
    defined(CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE}).

    Signed-off-by: Minchan Kim
    Cc: Andi Kleen
    Cc: Marek Szyprowski
    Acked-by: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     

21 May, 2012

1 commit

  • This commit changes various functions that change the migrate type
    of pages and pageblocks between MIGRATE_ISOLATE and MIGRATE_MOVABLE
    so that they can also work with the MIGRATE_CMA migrate type.

    Signed-off-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski
    Reviewed-by: KAMEZAWA Hiroyuki
    Tested-by: Rob Clark
    Tested-by: Ohad Ben-Cohen
    Tested-by: Benjamin Gaignard
    Tested-by: Robert Nelson
    Tested-by: Barry Song

    Michal Nazarewicz
     

27 Oct, 2010

1 commit

  • __test_page_isolated_in_pageblock() returns 1 if all pages in the range
    are isolated, so fix the comment. Variable `pfn' will be initialised in
    the following loop so remove it.

    Signed-off-by: Bob Liu
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     

07 Nov, 2008

1 commit

  • My last bugfix here (adding zone->lock) introduced a new problem: Using
    page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
    pfn will then be >= end_pfn, which may be in a different zone or not
    present at all. This may lead to an addressing exception in page_zone()
    or spin_lock_irqsave().

    Now I use __first_valid_page() again after the loop to find a valid page
    for page_zone().

    Signed-off-by: Gerald Schaefer
    Acked-by: Nathan Fontenot
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     

03 Oct, 2008

1 commit

  • __test_page_isolated_in_pageblock() in mm/page_isolation.c has a comment
    saying that the caller must hold zone->lock. But the only caller of that
    function, test_pages_isolated(), does not hold zone->lock and the lock is
    also not acquired anywhere before. This patch adds the missing zone->lock
    to test_pages_isolated().

    We reproducibly run into BUG_ON(!PageBuddy(page)) in __offline_isolated_pages()
    during memory hotplug stress test, see trace below. This patch fixes that
    problem, it would be good if we could have it in 2.6.27.

    kernel BUG at /home/autobuild/BUILD/linux-2.6.26-20080909/mm/page_alloc.c:4561!
    illegal operation: 0001 [#1] PREEMPT SMP
    Modules linked in: dm_multipath sunrpc bonding qeth_l3 dm_mod qeth ccwgroup vmur
    CPU: 1 Not tainted 2.6.26-29.x.20080909-s390default #1
    Process memory_loop_all (pid: 10025, task: 2f444028, ksp: 2b10dd28)
    Krnl PSW : 040c0000 801727ea (__offline_isolated_pages+0x18e/0x1c4)
    R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
    Krnl GPRS: 00000000 7e27fc00 00000000 7e27fc00
    00000000 00000400 00014000 7e27fc01
    00606f00 7e27fc00 00013fe0 2b10dd28
    00000005 80172662 801727b2 2b10dd28
    Krnl Code: 801727de: 5810900c l %r1,12(%r9)
    801727e2: a7f4ffb3 brc 15,80172748
    801727e6: a7f40001 brc 15,801727e8
    >801727ea: a7f4ffbc brc 15,80172762
    801727ee: a7f40001 brc 15,801727f0
    801727f2: a7f4ffaf brc 15,80172750
    801727f6: 0707 bcr 0,%r7
    801727f8: 0017 unknown
    Call Trace:
    ([] __offline_isolated_pages+0x116/0x1c4)
    [] offline_isolated_pages_cb+0x22/0x34
    [] walk_memory_resource+0xcc/0x11c
    [] offline_pages+0x36a/0x498
    [] remove_memory+0x36/0x44
    [] memory_block_change_state+0x112/0x150
    [] store_mem_state+0x90/0xe4
    [] sysdev_store+0x34/0x40
    [] sysfs_write_file+0xd0/0x178
    [] vfs_write+0x74/0x118
    [] sys_write+0x46/0x7c
    [] sysc_do_restart+0x12/0x16
    [] 0x77f3e8ca

    Signed-off-by: Gerald Schaefer
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     

15 Nov, 2007

1 commit

  • We should unset the "ISOLATE" migrate type when we have successfully
    removed memory. But the current code has a bug and does not work
    correctly.

    This patch also includes a bugfix: changing get_pageblock_flags to
    get_pageblock_migratetype().

    Thanks to Badari Pulavarty for finding this.

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

17 Oct, 2007

1 commit

  • Implement a generic chunk-of-pages isolation method by using the page
    grouping ops.

    This patch adds MIGRATE_ISOLATE to MIGRATE_TYPES. As a result:
    - MIGRATE_TYPES increases.
    - The bitmap for migratetype is enlarged.

    Pages of the MIGRATE_ISOLATE migratetype will not be allocated even
    if they are free. With this, you can isolate *freed* pages from
    users. How to free the pages is not the purpose of this patch; you
    may use the reclaim and migration code for that.

    If start_isolate_page_range(start, end) is called:
    - The migratetype of the range becomes MIGRATE_ISOLATE if its type is
      MIGRATE_MOVABLE. (*) This check can be updated as other memory
      reclaiming work makes progress.
    - MIGRATE_ISOLATE is not on the migratetype fallback list.
    - All free pages and will-be-freed pages are isolated.

    To check whether all pages in the range are isolated, use
    test_pages_isolated(); to cancel isolation, use
    undo_isolate_page_range(). A usage sketch follows.
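
    Typical usage of this API, per the description above (a sketch; later
    kernels add a migratetype argument to these calls):

        int ret;

        /* Turn the range's MIGRATE_MOVABLE pageblocks into
         * MIGRATE_ISOLATE so freed pages stay unallocatable. */
        ret = start_isolate_page_range(start_pfn, end_pfn);
        if (ret)
                return ret;

        /* ... free the pages, e.g. via reclaim or migration ... */

        /* Verify that every page in the range is now isolated. */
        ret = test_pages_isolated(start_pfn, end_pfn);

        /* When done (or on failure), restore the original type. */
        undo_isolate_page_range(start_pfn, end_pfn);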

    Changes V6 -> V7:
    - removed an unnecessary #ifdef

    There is still HOLES_IN_ZONE handling code... I'd be glad if we could
    remove it.

    Signed-off-by: Yasunori Goto
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki