12 Sep, 2013

1 commit

  • Until now we couldn't offline memory blocks that contain hugepages, because
    a hugepage was considered an unmovable page. With this patch series a
    hugepage becomes movable, so hugepage migration lets us offline such
    memory blocks.

    What differs from other users of hugepage migration is that we need to
    decompose all the hugepages inside the target memory block into free buddy
    pages after hugepage migration, because free hugepages remaining in the
    memory block would otherwise interfere with memory offlining. For this
    reason we introduce the new functions dissolve_free_huge_page() and
    dissolve_free_huge_pages().
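
    The dissolve step can be pictured with a small user-space model. This is
    only a sketch under stated assumptions: struct toy_page, HPAGE_NR_PAGES and
    the toy_* functions are hypothetical names for illustration and do not
    mirror the kernel's data structures or the real function bodies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy value so the example stays small; not the real hugepage size. */
#define HPAGE_NR_PAGES 4

/* Hypothetical page descriptor, not the kernel's struct page. */
struct toy_page {
	bool free_huge_head;  /* head of a free hugepage held in the hugepage pool */
	bool buddy_free;      /* free base page owned by the buddy allocator */
};

/* Turn one free hugepage into HPAGE_NR_PAGES free base pages. */
static void toy_dissolve_free_huge_page(struct toy_page *head)
{
	assert(head->free_huge_head);
	head->free_huge_head = false;
	for (size_t i = 0; i < HPAGE_NR_PAGES; i++)
		head[i].buddy_free = true;
}

/*
 * Walk [start_pfn, end_pfn) in hugepage-sized steps and dissolve every
 * free hugepage found. Returns the number of hugepages dissolved.
 */
static size_t toy_dissolve_free_huge_pages(struct toy_page *map,
					    size_t start_pfn, size_t end_pfn)
{
	size_t dissolved = 0;

	assert(start_pfn % HPAGE_NR_PAGES == 0);
	for (size_t pfn = start_pfn; pfn + HPAGE_NR_PAGES <= end_pfn;
	     pfn += HPAGE_NR_PAGES) {
		if (map[pfn].free_huge_head) {
			toy_dissolve_free_huge_page(&map[pfn]);
			dissolved++;
		}
	}
	return dissolved;
}
```

    In this model, once the walk has finished, the range holds only ordinary
    free pages, which is the state the offlining code expects.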

    Other than that, this patch straightforwardly adds hugepage migration
    code: it adds hugepage handling to the functions which scan over pfns and
    collect hugepages to be migrated, and adds a hugepage allocation function
    to alloc_migrate_target().

    As for larger hugepages (1GB on x86_64), it's not easy to hot-remove them
    because they are larger than a memory block. So for now we simply let the
    operation fail as it does today.

    [yongjun_wei@trendmicro.com.cn: remove duplicated include]
    Signed-off-by: Naoya Horiguchi
    Acked-by: Andi Kleen
    Cc: Hillf Danton
    Cc: Wanpeng Li
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Cc: KOSAKI Motohiro
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: "Aneesh Kumar K.V"
    Signed-off-by: Wei Yongjun
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

20 Aug, 2013

1 commit


05 Jan, 2013

1 commit

  • Commit 702d1a6e0766 ("memory-hotplug: fix kswapd looping forever
    problem") added an isolated pageblocks counter (nr_pageblock_isolate in
    struct zone) and used it to adjust the free pages counter in
    zone_watermark_ok_safe() to prevent the kswapd looping forever problem.

    Later, commit 2139cbe627b8 ("cma: fix counting of isolated pages")
    fixed the accounting of isolated pages in the global free pages counter.
    It made the previous zone_watermark_ok_safe() fix unnecessary and
    potentially harmful (because isolated pages may now be accounted twice,
    making the free pages counter incorrect).

    This patch removes the special isolated pageblocks counter altogether,
    which fixes the zone_watermark_ok_safe() free pages check.
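
    The double accounting can be shown with a little arithmetic. The sketch
    below is a user-space model, not kernel code: struct toy_zone,
    PAGEBLOCK_NR_PAGES and the toy_* helpers are hypothetical, and the "old"
    check assumes the estimate was one full pageblock per isolated block.

```c
#include <assert.h>

#define PAGEBLOCK_NR_PAGES 1024L  /* toy value for pages per pageblock */

/* Hypothetical zone model with only the two counters of interest. */
struct toy_zone {
	long nr_free_pages;         /* global free counter */
	long nr_pageblock_isolate;  /* the special counter this patch removes */
};

/* Isolating a block already removes its free pages from the global counter. */
static void toy_isolate_block(struct toy_zone *z, long free_in_block)
{
	assert(free_in_block >= 0 && free_in_block <= PAGEBLOCK_NR_PAGES);
	z->nr_free_pages -= free_in_block;
	z->nr_pageblock_isolate++;
}

/* Old watermark input: subtracts the isolated blocks a second time. */
static long toy_free_pages_old(const struct toy_zone *z)
{
	return z->nr_free_pages - z->nr_pageblock_isolate * PAGEBLOCK_NR_PAGES;
}

/* New watermark input: the global counter is already correct. */
static long toy_free_pages_new(const struct toy_zone *z)
{
	return z->nr_free_pages;
}
```

    Starting from 10000 free pages and isolating one fully free block, the new
    check sees 8976 free pages while the old one sees only 7952, because the
    same 1024 pages were subtracted twice.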

    Reported-by: Tomasz Stanislawski
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Minchan Kim
    Cc: KOSAKI Motohiro
    Cc: Aaditya Kumar
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     

12 Dec, 2012

1 commit

  • hwpoisoned may be set when we offline a page through the sysfs interface
    /sys/devices/system/memory/soft_offline_page or
    /sys/devices/system/memory/hard_offline_page. If we don't clear
    this flag when onlining pages, the page can't be freed and will
    not be on the free list, so we can't offline these pages again. We
    should therefore skip such pages when offlining pages.
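
    A minimal sketch of the skipping logic, assuming a simplified page array.
    struct toy_page and toy_range_isolated() are hypothetical names for
    illustration; the real check lives in the kernel's page isolation code and
    differs in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor, not the kernel's struct page. */
struct toy_page {
	bool buddy;      /* free page sitting on a buddy free list */
	unsigned order;  /* size of the free chunk, valid only if buddy is set */
	bool hwpoison;   /* poisoned page that never returns to the free list */
};

/*
 * Return true if every page in [pfn, end_pfn) is free or, when
 * skip_hwpoisoned is set, hwpoisoned.
 */
static bool toy_range_isolated(const struct toy_page *map, size_t pfn,
			       size_t end_pfn, bool skip_hwpoisoned)
{
	assert(pfn <= end_pfn);
	while (pfn < end_pfn) {
		const struct toy_page *page = &map[pfn];

		if (page->buddy)
			pfn += (size_t)1 << page->order;  /* skip the free chunk */
		else if (skip_hwpoisoned && page->hwpoison)
			pfn++;                            /* nothing left to free */
		else
			return false;                     /* page is still in use */
	}
	return true;
}
```

    Without the skip, a single hwpoisoned page makes the whole range look busy
    and the offline attempt can never succeed.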

    Signed-off-by: Wen Congyang
    Cc: David Rientjes
    Cc: Jiang Liu
    Cc: Len Brown
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Christoph Lameter
    Cc: Minchan Kim
    Cc: KOSAKI Motohiro
    Cc: Yasuaki Ishimatsu
    Cc: Andi Kleen
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wen Congyang
     

09 Oct, 2012

6 commits

  • __alloc_contig_migrate_alloc() can also be used by memory-hotplug, so
    factor it out (move it and give it a common name) into page_isolation.c.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Minchan Kim
    Cc: Kamezawa Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Michal Nazarewicz
    Cc: Marek Szyprowski
    Cc: Wen Congyang
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • If a race between allocation and isolation happens during memory-hotplug
    offlining, some pages could be on the MIGRATE_MOVABLE free_list although
    the migratetype of the page's pageblock is MIGRATE_ISOLATE.

    The race can be detected by get_freepage_migratetype in
    __test_page_isolated_in_pageblock. Currently, when it is detected, EBUSY
    gets bubbled all the way up and the hotplug operation fails.

    A better idea is, instead of returning and failing memory-hotremove, to
    move the free page to the correct list at the time the race is detected.
    This could improve the success ratio of memory-hotremove, although the
    race is really rare.

    Suggested by Mel Gorman.
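
    The "move instead of fail" idea can be sketched with per-list counters in
    place of real linked lists. Everything here (enum toy_migratetype,
    struct toy_zone, toy_fix_missed_free_page()) is a hypothetical user-space
    model, not the kernel implementation.

```c
#include <assert.h>

enum toy_migratetype { TOY_MOVABLE, TOY_ISOLATE, TOY_NR_TYPES };

/* Hypothetical free page: remembers which free list it sits on. */
struct toy_page {
	enum toy_migratetype freelist_type;
};

/* Hypothetical zone: one page counter per free list. */
struct toy_zone {
	unsigned long nr_on_list[TOY_NR_TYPES];
};

/*
 * Check one free page that belongs to an isolated pageblock. If the race
 * left it on the wrong free list, move it to the isolate list rather than
 * reporting failure. Returns 1 if the page was moved, 0 if it was fine.
 */
static int toy_fix_missed_free_page(struct toy_zone *zone,
				    struct toy_page *page)
{
	if (page->freelist_type == TOY_ISOLATE)
		return 0;

	assert(zone->nr_on_list[page->freelist_type] > 0);
	zone->nr_on_list[page->freelist_type]--;
	zone->nr_on_list[TOY_ISOLATE]++;
	page->freelist_type = TOY_ISOLATE;
	return 1;
}
```

    The caller can then keep scanning the range, where it would previously
    have returned an error for the whole offline request.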

    [akpm@linux-foundation.org: small cleanup]
    Signed-off-by: Minchan Kim
    Cc: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • As shown below, memory-hotplug has a race between page isolation
    and page allocation, so it can hit the BUG_ON in __offline_isolated_pages.

    CPU A                               CPU B

    start_isolate_page_range
    set_migratetype_isolate
    spin_lock_irqsave(zone->lock)

                                        free_hot_cold_page(Page A)
                                        /* without zone->lock */
                                        migratetype = get_pageblock_migratetype(Page A);
                                        /*
                                         * Page could be moved into MIGRATE_MOVABLE
                                         * of per_cpu_pages
                                         */
                                        list_add_tail(&page->lru, &pcp->lists[migratetype]);

    set_pageblock_isolate
    move_freepages_block
    drain_all_pages

                                        /* Page A could be in MIGRATE_MOVABLE of free_list. */

    check_pages_isolated
    __test_page_isolated_in_pageblock
    /*
     * We can't catch freed page which
     * is free_list[MIGRATE_MOVABLE]
     */
    if (PageBuddy(page A))
            pfn += 1 << page_order(page A);

                                        /* So, Page A could be allocated */

    __offline_isolated_pages
    /*
     * BUG_ON hit or offline page
     * which is used by someone
     */
    BUG_ON(!PageBuddy(page A));

    This patch checks the page's migratetype on the free list in
    __test_page_isolated_in_pageblock. So now
    __test_page_isolated_in_pageblock can catch a page affected by the above
    race and fail the memory offlining.

    Signed-off-by: Minchan Kim
    Acked-by: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • The page allocator uses set_page_private and page_private for handling the
    migratetype when it frees a page. Let's replace them with
    [set|get]_freepage_migratetype to make this clearer.
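
    The point of the wrappers is that the call site names what is being
    stored. A minimal sketch, assuming the value is kept in a general-purpose
    field of the page; struct toy_page and the toy_* accessors are
    hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>

/* Hypothetical page descriptor with one general-purpose field. */
struct toy_page {
	unsigned long private;  /* reused to remember the migratetype while free */
};

/* Record which free list the page was put on. */
static inline void toy_set_freepage_migratetype(struct toy_page *page,
						int migratetype)
{
	assert(migratetype >= 0);
	page->private = (unsigned long)migratetype;
}

/* Read back the migratetype recorded when the page was freed. */
static inline int toy_get_freepage_migratetype(const struct toy_page *page)
{
	return (int)page->private;
}
```

    A reader of the freeing path then sees a migratetype being set instead of
    an opaque write to a private field.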

    Signed-off-by: Minchan Kim
    Acked-by: KAMEZAWA Hiroyuki
    Reviewed-by: Yasuaki Ishimatsu
    Acked-by: Mel Gorman
    Cc: Xishi Qiu
    Cc: Wen Congyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • Add an NR_FREE_CMA_PAGES counter to be used later for checking the
    watermark in __zone_watermark_ok(). For simplicity and to avoid #ifdef
    hell, make this counter always available (not only when CONFIG_CMA=y).

    [akpm@linux-foundation.org: use conventional migratetype naming]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     
  • Isolated free pages shouldn't be accounted to NR_FREE_PAGES counter. Fix
    it by properly decreasing/increasing NR_FREE_PAGES counter in
    set_migratetype_isolate()/unset_migratetype_isolate() and removing counter
    adjustment for isolated pages from free_one_page() and split_free_page().
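
    The accounting rule can be modelled in a few lines. This is a sketch under
    stated assumptions: struct toy_zone, struct toy_pageblock and the toy_*
    functions are hypothetical, and a single counter stands in for
    NR_FREE_PAGES.

```c
#include <assert.h>

/* Hypothetical zone: nr_free_pages models the NR_FREE_PAGES counter. */
struct toy_zone {
	long nr_free_pages;
};

/* Hypothetical pageblock with its own count of free pages. */
struct toy_pageblock {
	long nr_free;
	int isolated;
};

/* Isolating a block takes its free pages out of the zone counter. */
static void toy_set_isolate(struct toy_zone *z, struct toy_pageblock *b)
{
	assert(!b->isolated);
	b->isolated = 1;
	z->nr_free_pages -= b->nr_free;
}

/* Undoing isolation gives the block's free pages back to the counter. */
static void toy_unset_isolate(struct toy_zone *z, struct toy_pageblock *b)
{
	assert(b->isolated);
	b->isolated = 0;
	z->nr_free_pages += b->nr_free;
}

/* Freeing into an isolated block must not touch the zone counter. */
static void toy_free_one_page(struct toy_zone *z, struct toy_pageblock *b)
{
	b->nr_free++;
	if (!b->isolated)
		z->nr_free_pages++;
}
```

    Because all adjustments happen at the isolate and unisolate boundaries,
    the zone counter never includes pages that cannot be allocated.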

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     

01 Aug, 2012

2 commits

  • When hotplug offlining happens on zone A, it starts marking freed pages as
    MIGRATE_ISOLATE in the buddy allocator to prevent further allocation.
    (MIGRATE_ISOLATE is an ironic type: the pages are apparently in the buddy
    allocator, but we can't allocate them.)

    When a memory shortage happens during hotplug offlining, the current task
    starts to reclaim and then wakes up kswapd. Kswapd checks the watermark
    and goes back to sleep, because zone_watermark_ok_safe currently doesn't
    account for the MIGRATE_ISOLATE free page count. The current task
    continues to reclaim in the direct reclaim path without kswapd's help.
    The problem is that zone->all_unreclaimable is set only by kswapd, so the
    current task can loop forever, as shown below.

    __alloc_pages_slowpath
    restart:
            wake_all_kswapd
    rebalance:
            __alloc_pages_direct_reclaim
                    do_try_to_free_pages
                            if global_reclaim && !all_unreclaimable
                                    return 1; /* It means we did did_some_progress */
            skip __alloc_pages_may_oom
            should_alloc_retry
                    goto rebalance;

    If we apply KOSAKI's patch[1], which doesn't depend on kswapd to set
    zone->all_unreclaimable, we can solve this problem by killing some task
    in the direct reclaim path. But that still doesn't wake up kswapd, which
    could still be a problem if another subsystem needs a GFP_ATOMIC request.
    So kswapd should consider MIGRATE_ISOLATE when it calculates free pages
    BEFORE going to sleep.

    This patch counts the number of MIGRATE_ISOLATE pageblocks, and
    zone_watermark_ok_safe will take them into account if the system has such
    blocks (fortunately that is very rare, so the overhead is not a problem,
    and kswapd is never a hot path).

    Copy/modify from Mel's quote
    "
    Ideal solution would be "allocating" the pageblock.
    It would keep the free space accounting as it is but historically,
    memory hotplug didn't allocate pages because it would be difficult to
    detect if a pageblock was isolated or if part of some balloon.
    Allocating just full pageblocks would work around this. However,
    it would play very badly with CMA.
    "

    [1] http://lkml.org/lkml/2012/6/14/74

    [akpm@linux-foundation.org: simplify nr_zone_isolate_freepages(), rework zone_watermark_ok_safe() comment, simplify set_pageblock_isolate() and restore_pageblock_isolate()]
    [akpm@linux-foundation.org: fix CONFIG_MEMORY_ISOLATION=n build]
    Signed-off-by: Minchan Kim
    Suggested-by: KOSAKI Motohiro
    Tested-by: Aaditya Kumar
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • mm/page_alloc.c has some memory isolation functions, but they are used only
    when CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE} is enabled. So make them
    configurable through a new CONFIG_MEMORY_ISOLATION option, which reduces
    binary size and lets us test simply for CONFIG_MEMORY_ISOLATION rather
    than for defined CONFIG_{CMA|MEMORY_HOTPLUG|MEMORY_FAILURE}.

    Signed-off-by: Minchan Kim
    Cc: Andi Kleen
    Cc: Marek Szyprowski
    Acked-by: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Mel Gorman
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     

21 May, 2012

1 commit

  • This commit changes various functions that switch the migrate type of
    pages and pageblocks between MIGRATE_ISOLATE and MIGRATE_MOVABLE so that
    they also work with the MIGRATE_CMA migrate type.

    Signed-off-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski
    Reviewed-by: KAMEZAWA Hiroyuki
    Tested-by: Rob Clark
    Tested-by: Ohad Ben-Cohen
    Tested-by: Benjamin Gaignard
    Tested-by: Robert Nelson
    Tested-by: Barry Song

    Michal Nazarewicz
     

27 Oct, 2010

1 commit

  • __test_page_isolated_in_pageblock() returns 1 if all pages in the range
    are isolated, so fix the comment. Variable `pfn' will be initialised in
    the following loop so remove it.

    Signed-off-by: Bob Liu
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Wu Fengguang
    Cc: KOSAKI Motohiro
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     

07 Nov, 2008

1 commit

  • My last bugfix here (adding zone->lock) introduced a new problem: Using
    page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
    pfn will then be >= end_pfn, which may be in a different zone or not
    present at all. This may lead to an addressing exception in page_zone()
    or spin_lock_irqsave().

    Now I use __first_valid_page() again after the loop to find a valid page
    for page_zone().
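
    The pitfall is generic: after a scan loop the cursor sits one past the
    range, so it must not be used for a lookup. A minimal sketch, assuming a
    flat page array with holes; struct toy_page, toy_first_valid_page() and
    toy_zone_of_range() are hypothetical names, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor. */
struct toy_page {
	int valid;    /* 0 models a hole with no usable page */
	int zone_id;  /* zone the page belongs to */
};

/* Return the first valid page in [pfn, pfn + nr_pages), or NULL. */
static const struct toy_page *toy_first_valid_page(const struct toy_page *map,
						    size_t pfn, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++)
		if (map[pfn + i].valid)
			return &map[pfn + i];
	return NULL;
}

/*
 * Look up the zone for [start_pfn, end_pfn). A scan loop over the range
 * would leave its cursor at end_pfn, which may lie in another zone or in
 * a hole, so search again from start_pfn instead of reusing the cursor.
 * Returns -1 if the range holds no valid page.
 */
static int toy_zone_of_range(const struct toy_page *map,
			     size_t start_pfn, size_t end_pfn)
{
	assert(start_pfn < end_pfn);
	const struct toy_page *page =
		toy_first_valid_page(map, start_pfn, end_pfn - start_pfn);

	return page ? page->zone_id : -1;
}
```

    In the model, a range that ends right before a page of another zone still
    reports the zone of its own pages.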

    Signed-off-by: Gerald Schaefer
    Acked-by: Nathan Fontenot
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     

03 Oct, 2008

1 commit

  • __test_page_isolated_in_pageblock() in mm/page_isolation.c has a comment
    saying that the caller must hold zone->lock. But the only caller of that
    function, test_pages_isolated(), does not hold zone->lock and the lock is
    also not acquired anywhere before. This patch adds the missing zone->lock
    to test_pages_isolated().

    We reproducibly run into BUG_ON(!PageBuddy(page)) in __offline_isolated_pages()
    during a memory hotplug stress test; see the trace below. This patch fixes
    that problem, and it would be good if we could have it in 2.6.27.

    kernel BUG at /home/autobuild/BUILD/linux-2.6.26-20080909/mm/page_alloc.c:4561!
    illegal operation: 0001 [#1] PREEMPT SMP
    Modules linked in: dm_multipath sunrpc bonding qeth_l3 dm_mod qeth ccwgroup vmur
    CPU: 1 Not tainted 2.6.26-29.x.20080909-s390default #1
    Process memory_loop_all (pid: 10025, task: 2f444028, ksp: 2b10dd28)
    Krnl PSW : 040c0000 801727ea (__offline_isolated_pages+0x18e/0x1c4)
    R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
    Krnl GPRS: 00000000 7e27fc00 00000000 7e27fc00
    00000000 00000400 00014000 7e27fc01
    00606f00 7e27fc00 00013fe0 2b10dd28
    00000005 80172662 801727b2 2b10dd28
    Krnl Code: 801727de: 5810900c l %r1,12(%r9)
    801727e2: a7f4ffb3 brc 15,80172748
    801727e6: a7f40001 brc 15,801727e8
    >801727ea: a7f4ffbc brc 15,80172762
    801727ee: a7f40001 brc 15,801727f0
    801727f2: a7f4ffaf brc 15,80172750
    801727f6: 0707 bcr 0,%r7
    801727f8: 0017 unknown
    Call Trace:
    ([] __offline_isolated_pages+0x116/0x1c4)
    [] offline_isolated_pages_cb+0x22/0x34
    [] walk_memory_resource+0xcc/0x11c
    [] offline_pages+0x36a/0x498
    [] remove_memory+0x36/0x44
    [] memory_block_change_state+0x112/0x150
    [] store_mem_state+0x90/0xe4
    [] sysdev_store+0x34/0x40
    [] sysfs_write_file+0xd0/0x178
    [] vfs_write+0x74/0x118
    [] sys_write+0x46/0x7c
    [] sysc_do_restart+0x12/0x16
    [] 0x77f3e8ca

    Signed-off-by: Gerald Schaefer
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     

02 Sep, 2008

1 commit


15 Nov, 2007

1 commit

  • We should unset the "ISOLATE" migrate type when we have successfully
    removed memory. But the current code has a bug and does not work correctly.

    This patch also includes a bugfix that changes get_pageblock_flags to
    get_pageblock_migratetype().

    Thanks to Badari Pulavarty for finding this.

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

17 Oct, 2007

1 commit

  • Implement a generic chunk-of-pages isolation method by using page grouping ops.

    This patch adds MIGRATE_ISOLATE to MIGRATE_TYPES. As a result:
    - MIGRATE_TYPES increases.
    - The bitmap for migratetype is enlarged.

    Pages of MIGRATE_ISOLATE migratetype will not be allocated even if they are
    free. This lets you isolate *freed* pages from users. How to free pages is
    not the purpose of this patch. You may use the reclaim and migration code
    to free pages.

    If start_isolate_page_range(start,end) is called,
    - the migratetype of the range becomes MIGRATE_ISOLATE if
      its type is MIGRATE_MOVABLE. (*) This check can be updated if other
      memory reclaiming work makes progress.
    - MIGRATE_ISOLATE is not on the migratetype fallback list.
    - All free pages and will-be-freed pages are isolated.
    To check whether all pages in the range are isolated, use test_pages_isolated().
    To cancel isolation, use undo_isolate_page_range().
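
    The start/undo protocol described above can be sketched with an array of
    per-pageblock migratetypes. This is a user-space model only:
    enum toy_migratetype and the toy_* functions are hypothetical and ignore
    locking, free lists and page draining.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_migratetype { TOY_UNMOVABLE, TOY_MOVABLE, TOY_ISOLATE };

/* Cancel isolation: isolated blocks in [start, end) become movable again. */
static void toy_undo_isolate_range(enum toy_migratetype *blocks,
				   size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		if (blocks[i] == TOY_ISOLATE)
			blocks[i] = TOY_MOVABLE;
}

/*
 * Isolate every block in [start, end). Only movable blocks may be
 * isolated; on the first other block, roll back and return -1.
 */
static int toy_start_isolate_range(enum toy_migratetype *blocks,
				   size_t start, size_t end)
{
	assert(start <= end);
	for (size_t i = start; i < end; i++) {
		if (blocks[i] != TOY_MOVABLE) {
			toy_undo_isolate_range(blocks, start, i);
			return -1;
		}
		blocks[i] = TOY_ISOLATE;
	}
	return 0;
}

/* The allocator never hands out pages from an isolated block. */
static bool toy_can_allocate_from(const enum toy_migratetype *blocks, size_t i)
{
	return blocks[i] != TOY_ISOLATE;
}
```

    Rolling back on failure keeps the range in its original state, so a failed
    attempt leaves nothing permanently unallocatable.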

    Changes V6 -> V7
    - removed unnecessary #ifdef

    There is HOLES_IN_ZONE handling code... I'd be glad if we could remove it.

    Signed-off-by: Yasunori Goto
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki