17 Oct, 2020

3 commits

  • The current page_order() can only be called on pages in the buddy
    allocator. For compound pages, you have to use compound_order(). This is
    confusing and led to a bug, so rename page_order() to buddy_order().
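
    A minimal sketch of the distinction, paraphrased from the mm-internal
    helpers of that era (exact definitions and field layout vary by kernel
    version):

      /* Only valid for a free page sitting in the buddy allocator, under
       * zone->lock: the order is stashed in page->private on the freelist. */
      static inline unsigned int buddy_order(struct page *page)
      {
              return page_private(page);
      }

      /* Only valid for an allocated compound page (THP, hugetlb, ...). */
      static inline unsigned int compound_order(struct page *page)
      {
              if (!PageHead(page))
                      return 0;
              return page[1].compound_order;
      }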

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20201001152259.14932-2-willy@infradead.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Whenever we move pages between freelists via move_to_free_list()/
    move_freepages_block(), we don't actually touch the pages:
    1. Page isolation doesn't actually touch the pages, it simply isolates
    pageblocks and moves all free pages to the MIGRATE_ISOLATE freelist.
    When undoing isolation, we move the pages back to the target list.
    2. Page stealing (steal_suitable_fallback()) moves free pages directly
    between lists without touching them.
    3. reserve_highatomic_pageblock()/unreserve_highatomic_pageblock() moves
    free pages directly between freelists without touching them.

    We already place pages to the tail of the freelists when undoing isolation
    via __putback_isolated_page(); let's do the same for all
    move_to_free_list()/move_freepages_block() users.
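
    A sketch of the idea, assuming the free_area/free_list layout of that era;
    the actual change adjusts move_to_free_list() and its callers:

      /* Sketch: queue moved pages at the tail so they are handed out last. */
      static inline void move_to_free_list(struct page *page, struct zone *zone,
                                           unsigned int order, int migratetype)
      {
              struct free_area *area = &zone->free_area[order];

              /* list_move_tail() instead of list_move(): pages merely shuffled
               * between freelists (isolation, stealing, highatomic reserves)
               * should not be the very next ones handed out. */
              list_move_tail(&page->lru, &area->free_list[migratetype]);
      }
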
    Signed-off-by: Andrew Morton
    Reviewed-by: Oscar Salvador
    Reviewed-by: Wei Yang
    Acked-by: Pankaj Gupta
    Acked-by: Michal Hocko
    Cc: Alexander Duyck
    Cc: Mel Gorman
    Cc: Dave Hansen
    Cc: Vlastimil Babka
    Cc: Mike Rapoport
    Cc: Scott Cheloha
    Cc: Michael Ellerman
    Cc: Haiyang Zhang
    Cc: "K. Y. Srinivasan"
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Stephen Hemminger
    Cc: Wei Liu
    Link: https://lkml.kernel.org/r/20201005121534.15649-4-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Callers no longer need the number of isolated pageblocks. Let's simplify.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Oscar Salvador
    Acked-by: Michal Hocko
    Cc: Wei Yang
    Cc: Baoquan He
    Cc: Pankaj Gupta
    Cc: Charan Teja Reddy
    Cc: Dan Williams
    Cc: Fenghua Yu
    Cc: Logan Gunthorpe
    Cc: "Matthew Wilcox (Oracle)"
    Cc: Mel Gorman
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Mike Rapoport
    Cc: Tony Luck
    Link: https://lkml.kernel.org/r/20200819175957.28465-7-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

14 Oct, 2020

3 commits

  • Let's clean it up a bit, simplifying the exit paths.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Pankaj Gupta
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Jason Wang
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-5-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Inside has_unmovable_pages(), we have a comment describing how unmovable
    data could end up in ZONE_MOVABLE - via "movablecore". Also, besides
    checking if the first page in the pageblock is reserved, we don't perform
    any further checks in case of ZONE_MOVABLE.

    In case of memory offlining, we set REPORT_FAILURE, properly dump_page()
    the page and handle the error gracefully. alloc_contig_pages() users
    currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses
    alloc_contig_pages() for the allocation of gigantic pages only, which will
    never end up on the MOVABLE zone (see htlb_alloc_mask()).

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Mike Kravetz
    Cc: Pankaj Gupta
    Cc: Jason Wang
    Cc: Mike Rapoport
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/20200816125333.7434-4-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Right now, if we have two isolations racing on a pageblock that's in the
    MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return
    directly, simplifying error handling.

    The change was introduced in commit 3d680bdf60a5 ("mm/page_isolation: fix
    potential warning from user"). As far as I can see, we currently don't
    have alloc_contig_range() users that use ZONE_MOVABLE (anymore), so it's
    currently more of a cleanup and a preparation for the future than a fix.

    Signed-off-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Reviewed-by: Pankaj Gupta
    Acked-by: Mike Kravetz
    Cc: Michal Hocko
    Cc: Michael S. Tsirkin
    Cc: Qian Cai
    Cc: Jason Wang
    Cc: Mike Rapoport
    Link: http://lkml.kernel.org/r/20200816125333.7434-3-david@redhat.com
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

20 Sep, 2020

1 commit

  • There is a race during page offline that can lead to infinite loop:
    a page never ends up on a buddy list and __offline_pages() keeps
    retrying infinitely or until a termination signal is received.

    Thread#1 - a new process:

      load_elf_binary
       begin_new_exec
        exec_mmap
         mmput
          exit_mmap
           tlb_finish_mmu
            tlb_flush_mmu
             release_pages
              free_unref_page_list
               free_unref_page_prepare
                set_pcppage_migratetype(page, migratetype);
                // record a migration type below MIGRATE_PCPTYPES in page->index

    Thread#2 - hot-removes memory:

      __offline_pages
       start_isolate_page_range
        set_migratetype_isolate
         set_pageblock_migratetype(page, MIGRATE_ISOLATE);
         // set the pageblock's migration type to MIGRATE_ISOLATE
         drain_all_pages(zone);
         // drain the per-cpu page lists into the buddy allocator

    Thread#1 - continues:

      free_unref_page_commit
       migratetype = get_pcppage_migratetype(page);
       // read the old (stale) migration type recorded above
       list_add(&page->lru, &pcp->lists[migratetype]);
       // add the page to the already-drained pcp list

    Thread#2 never drains the pcp lists again and therefore gets stuck in the
    loop.

    The fix is to try to drain per-cpu lists again after
    check_pages_isolated_cb() fails.
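
    A heavily simplified sketch of where the extra drain goes in the offlining
    retry loop (the real __offline_pages() has more steps and error handling):

      do {
              ret = scan_movable_pages(start_pfn, end_pfn, &pfn);
              if (!ret)
                      do_migrate_range(pfn, end_pfn);

              ret = test_pages_isolated(start_pfn, end_pfn, MEMORY_OFFLINE);
              if (ret)
                      /*
                       * A racing free may have put a page back on an
                       * already-drained pcp list with a stale migratetype;
                       * drain again so it reaches the buddy allocator and
                       * the next isolation check can succeed.
                       */
                      drain_all_pages(zone);
      } while (ret);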

    Fixes: c52e75935f8d ("mm: remove extra drain pages on pcp list")
    Signed-off-by: Pavel Tatashin
    Signed-off-by: Andrew Morton
    Acked-by: David Rientjes
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Acked-by: David Hildenbrand
    Cc: Oscar Salvador
    Cc: Wei Yang
    Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com
    Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com
    Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.cz
    Signed-off-by: Linus Torvalds

    Pavel Tatashin
     

13 Aug, 2020

3 commits

  • There is a well-defined standard migration target callback. Use it
    directly.

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Christoph Hellwig
    Cc: Mike Kravetz
    Cc: Naoya Horiguchi
    Cc: Roman Gushchin
    Link: http://lkml.kernel.org/r/1594622517-20681-8-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • There are some similar functions for migration target allocation. Since
    there is no fundamental difference, it's better to keep just one rather
    than keeping all variants. This patch implements base migration target
    allocation function. In the following patches, variants will be converted
    to use this function.

    Changes should be mechanical, but, unfortunately, there are some
    differences. First, some callers' nodemask is assigned NULL, since a NULL
    nodemask is considered to mean all available nodes, that is,
    &node_states[N_MEMORY]. Second, for hugetlb page allocation, gfp_mask is
    redefined as the regular hugetlb allocation gfp_mask plus __GFP_THISNODE
    if the user-provided gfp_mask has it. This is because a future caller of
    this function requires this node constraint to be set. Lastly, if the
    provided nodeid is NUMA_NO_NODE, nodeid is set to the node where the
    migration source lives. This helps remove simple wrappers whose only job
    was setting up the nodeid.

    Note that the PageHighmem() call in the previous function is changed to an
    open-coded "is_highmem_idx()" check, since that is more readable.

    [akpm@linux-foundation.org: tweak patch title, per Vlastimil]
    [akpm@linux-foundation.org: fix typo in comment]

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Cc: Christoph Hellwig
    Cc: Mike Kravetz
    Cc: Naoya Horiguchi
    Cc: Roman Gushchin
    Link: http://lkml.kernel.org/r/1594622517-20681-6-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Patch series "clean-up the migration target allocation functions", v5.

    This patch (of 9):

    For locality, it's better to migrate the page to the same node rather than
    the node of the current caller's cpu.
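
    A tiny sketch of the point, using a hypothetical callback name; the key
    change is preferring page_to_nid(page) over the calling CPU's node:

      /* Hypothetical migration-target callback illustrating the locality choice. */
      static struct page *new_page_same_node(struct page *page, unsigned long private)
      {
              int nid = page_to_nid(page);    /* node of the source page, not numa_node_id() */

              return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
      }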

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Acked-by: Roman Gushchin
    Acked-by: Michal Hocko
    Cc: Christoph Hellwig
    Cc: Mike Kravetz
    Cc: Naoya Horiguchi
    Link: http://lkml.kernel.org/r/1594622517-20681-1-git-send-email-iamjoonsoo.kim@lge.com
    Link: http://lkml.kernel.org/r/1594622517-20681-2-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

05 Jun, 2020

1 commit

  • virtio-mem wants to allow offlining memory blocks of which some parts
    were unplugged (allocated via alloc_contig_range()), especially to later
    offline and remove completely unplugged memory blocks. The important part
    is that PageOffline() has to remain set until the section is offline, so
    these pages will never get accessed (e.g., when dumping). The pages should
    not be handed back to the buddy (which would require clearing PageOffline()
    and would cause trouble if offlining fails and the pages suddenly end up
    in the buddy).

    Let's support this by allowing any PageOffline() page to be isolated when
    offlining. This way, we can reach the memory hotplug notifier
    MEM_GOING_OFFLINE, where the driver can signal that it is fine with
    offlining this page by dropping its reference count. PageOffline() pages
    with a reference count of 0 can then be skipped when offlining (as if they
    were free, even though they are not in the buddy).

    Anybody who uses PageOffline() pages and does not agree to offline them
    (e.g., Hyper-V balloon, Xen balloon, VMware balloon for 2MB pages) will not
    decrement the reference count, and offlining will fail when trying to
    migrate such an unmovable page. So there should be no observable change.
    The same applies to balloon compaction users (movable PageOffline() pages):
    those pages will simply be migrated.

    Note 1: If offlining fails, a driver has to increment the reference
    count again in MEM_CANCEL_OFFLINE.

    Note 2: A driver that makes use of this has to be aware that re-onlining
    the memory block has to be handled by hooking into the onlining code
    (online_page_callback_t), setting the pages PageOffline() again and not
    giving them to the buddy.
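
    A sketch of the resulting skip rule during offlining, using a hypothetical
    helper name; the real check lives in the page isolation/offlining paths
    and is gated on the MEMORY_OFFLINE isolation flag:

      /* Hypothetical helper: may this page be ignored while offlining? */
      static bool offlining_may_skip_page(struct page *page, int isol_flags)
      {
              if (!(isol_flags & MEMORY_OFFLINE))
                      return false;

              /*
               * A PageOffline() page whose owner (e.g. virtio-mem) dropped its
               * reference in MEM_GOING_OFFLINE agreed to the offlining: it is
               * not in the buddy, and nobody will touch it anymore.
               */
              return PageOffline(page) && !page_count(page);
      }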

    Reviewed-by: Alexander Duyck
    Acked-by: Michal Hocko
    Tested-by: Pankaj Gupta
    Acked-by: Andrew Morton
    Cc: Andrew Morton
    Cc: Juergen Gross
    Cc: Konrad Rzeszutek Wilk
    Cc: Pavel Tatashin
    Cc: Alexander Duyck
    Cc: Vlastimil Babka
    Cc: Johannes Weiner
    Cc: Anthony Yznaga
    Cc: Michal Hocko
    Cc: Oscar Salvador
    Cc: Mel Gorman
    Cc: Mike Rapoport
    Cc: Dan Williams
    Cc: Anshuman Khandual
    Cc: Qian Cai
    Cc: Pingfan Liu
    Signed-off-by: David Hildenbrand
    Link: https://lore.kernel.org/r/20200507140139.17083-7-david@redhat.com
    Signed-off-by: Michael S. Tsirkin

    David Hildenbrand
     

08 Apr, 2020

1 commit

  • There are cases where we would benefit from avoiding having to go through
    the allocation and free cycle to return an isolated page.

    Examples for this might include page poisoning in which we isolate a page
    and then put it back in the free list without ever having actually
    allocated it.

    This will also enable us to avoid notifiers for the future free page
    reporting, which will need to avoid re-triggering page reporting when
    returning pages that have already been reported on.
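
    A sketch of the new helper, close to the shape it had when introduced
    (flags added by later kernels are omitted):

      /* Return an isolated page directly to the buddy freelists. */
      void __putback_isolated_page(struct page *page, unsigned int order, int mt)
      {
              struct zone *zone = page_zone(page);

              /* zone->lock must be held, as for any other freelist manipulation */
              lockdep_assert_held(&zone->lock);

              /* put the page straight back on the buddy freelist */
              __free_one_page(page, page_to_pfn(page), zone, order, mt);
      }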

    Signed-off-by: Alexander Duyck
    Signed-off-by: Andrew Morton
    Acked-by: David Hildenbrand
    Acked-by: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: Konrad Rzeszutek Wilk
    Cc: Luiz Capitulino
    Cc: Matthew Wilcox
    Cc: Michael S. Tsirkin
    Cc: Michal Hocko
    Cc: Nitesh Narayan Lal
    Cc: Oscar Salvador
    Cc: Pankaj Gupta
    Cc: Paolo Bonzini
    Cc: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Wei Wang
    Cc: Yang Zhang
    Cc: wei qi
    Link: http://lkml.kernel.org/r/20200211224624.29318.89287.stgit@localhost.localdomain
    Signed-off-by: Linus Torvalds

    Alexander Duyck
     

01 Feb, 2020

4 commits

  • It makes sense to call the WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE)
    from start_isolate_page_range(), but we should avoid triggering it from
    userspace, i.e., from is_mem_section_removable(), because a non-root user
    could otherwise crash the system if panic_on_warn is set.

    While at it, simplify the code a bit by removing an unnecessary jump
    label.

    Link: http://lkml.kernel.org/r/20200120163915.1469-1-cai@lca.pw
    Signed-off-by: Qian Cai
    Suggested-by: Michal Hocko
    Acked-by: Michal Hocko
    Reviewed-by: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • It is not that hard to trigger lockdep splats by calling printk from
    under zone->lock. Most of them are false positives caused by lock
    chains introduced early in the boot process and they do not cause any
    real problems (although most of the early boot lock dependencies could
    happen after boot as well). There are some console drivers which do
    allocate from the printk context as well and those should be fixed. In
    any case, false positives are not that trivial to workaround and it is
    far from optimal to lose lockdep functionality for something that is a
    non-issue.

    So change has_unmovable_pages() so that it no longer calls dump_page()
    itself - instead it returns a "struct page *" of the unmovable page back
    to the caller, so that in the case of a has_unmovable_pages() failure the
    caller can call dump_page() after releasing zone->lock. Also, make
    dump_page() able to report a CMA page as well, so the reason string from
    has_unmovable_pages() can be removed.
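
    A sketch of the resulting caller pattern in set_migratetype_isolate(),
    with locking and surrounding logic simplified:

      struct page *unmovable;

      spin_lock_irqsave(&zone->lock, flags);
      unmovable = has_unmovable_pages(zone, page, migratetype, isol_flags);
      if (!unmovable) {
              /* ... mark the pageblock MIGRATE_ISOLATE, move free pages ... */
      }
      spin_unlock_irqrestore(&zone->lock, flags);

      if (unmovable && (isol_flags & REPORT_FAILURE))
              /* safe now: dump_page()/printk() no longer nest under zone->lock */
              dump_page(unmovable, "unmovable page");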

    Even though has_unmovable_pages doesn't hold any reference to the returned
    page, this should be reasonably safe for the purpose of reporting the page
    (dump_page), because it cannot be hot-removed while the memory unplug is
    still in progress. The state of the page might change, but that is the
    case even with the existing code, as zone->lock only plays a role for free
    pages.

    While at it, remove a similar but unnecessary debug-only printk() as
    well. A sample of one of those lockdep splats is,

    WARNING: possible circular locking dependency detected
    ------------------------------------------------------
    test.sh/8653 is trying to acquire lock:
    ffffffff865a4460 (console_owner){-.-.}, at:
    console_unlock+0x207/0x750

    but task is already holding lock:
    ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at:
    __offline_isolated_pages+0x179/0x3e0

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #3 (&(&zone->lock)->rlock){-.-.}:
    __lock_acquire+0x5b3/0xb40
    lock_acquire+0x126/0x280
    _raw_spin_lock+0x2f/0x40
    rmqueue_bulk.constprop.21+0xb6/0x1160
    get_page_from_freelist+0x898/0x22c0
    __alloc_pages_nodemask+0x2f3/0x1cd0
    alloc_pages_current+0x9c/0x110
    allocate_slab+0x4c6/0x19c0
    new_slab+0x46/0x70
    ___slab_alloc+0x58b/0x960
    __slab_alloc+0x43/0x70
    __kmalloc+0x3ad/0x4b0
    __tty_buffer_request_room+0x100/0x250
    tty_insert_flip_string_fixed_flag+0x67/0x110
    pty_write+0xa2/0xf0
    n_tty_write+0x36b/0x7b0
    tty_write+0x284/0x4c0
    __vfs_write+0x50/0xa0
    vfs_write+0x105/0x290
    redirected_tty_write+0x6a/0xc0
    do_iter_write+0x248/0x2a0
    vfs_writev+0x106/0x1e0
    do_writev+0xd4/0x180
    __x64_sys_writev+0x45/0x50
    do_syscall_64+0xcc/0x76c
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    -> #2 (&(&port->lock)->rlock){-.-.}:
    __lock_acquire+0x5b3/0xb40
    lock_acquire+0x126/0x280
    _raw_spin_lock_irqsave+0x3a/0x50
    tty_port_tty_get+0x20/0x60
    tty_port_default_wakeup+0xf/0x30
    tty_port_tty_wakeup+0x39/0x40
    uart_write_wakeup+0x2a/0x40
    serial8250_tx_chars+0x22e/0x440
    serial8250_handle_irq.part.8+0x14a/0x170
    serial8250_default_handle_irq+0x5c/0x90
    serial8250_interrupt+0xa6/0x130
    __handle_irq_event_percpu+0x78/0x4f0
    handle_irq_event_percpu+0x70/0x100
    handle_irq_event+0x5a/0x8b
    handle_edge_irq+0x117/0x370
    do_IRQ+0x9e/0x1e0
    ret_from_intr+0x0/0x2a
    cpuidle_enter_state+0x156/0x8e0
    cpuidle_enter+0x41/0x70
    call_cpuidle+0x5e/0x90
    do_idle+0x333/0x370
    cpu_startup_entry+0x1d/0x1f
    start_secondary+0x290/0x330
    secondary_startup_64+0xb6/0xc0

    -> #1 (&port_lock_key){-.-.}:
    __lock_acquire+0x5b3/0xb40
    lock_acquire+0x126/0x280
    _raw_spin_lock_irqsave+0x3a/0x50
    serial8250_console_write+0x3e4/0x450
    univ8250_console_write+0x4b/0x60
    console_unlock+0x501/0x750
    vprintk_emit+0x10d/0x340
    vprintk_default+0x1f/0x30
    vprintk_func+0x44/0xd4
    printk+0x9f/0xc5

    -> #0 (console_owner){-.-.}:
    check_prev_add+0x107/0xea0
    validate_chain+0x8fc/0x1200
    __lock_acquire+0x5b3/0xb40
    lock_acquire+0x126/0x280
    console_unlock+0x269/0x750
    vprintk_emit+0x10d/0x340
    vprintk_default+0x1f/0x30
    vprintk_func+0x44/0xd4
    printk+0x9f/0xc5
    __offline_isolated_pages.cold.52+0x2f/0x30a
    offline_isolated_pages_cb+0x17/0x30
    walk_system_ram_range+0xda/0x160
    __offline_pages+0x79c/0xa10
    offline_pages+0x11/0x20
    memory_subsys_offline+0x7e/0xc0
    device_offline+0xd5/0x110
    state_store+0xc6/0xe0
    dev_attr_store+0x3f/0x60
    sysfs_kf_write+0x89/0xb0
    kernfs_fop_write+0x188/0x240
    __vfs_write+0x50/0xa0
    vfs_write+0x105/0x290
    ksys_write+0xc6/0x160
    __x64_sys_write+0x43/0x50
    do_syscall_64+0xcc/0x76c
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    other info that might help us debug this:

    Chain exists of:
    console_owner --> &(&port->lock)->rlock --> &(&zone->lock)->rlock

    Possible unsafe locking scenario:

    CPU0 CPU1
    ---- ----
    lock(&(&zone->lock)->rlock);
    lock(&(&port->lock)->rlock);
    lock(&(&zone->lock)->rlock);
    lock(console_owner);

    *** DEADLOCK ***

    9 locks held by test.sh/8653:
    #0: ffff88839ba7d408 (sb_writers#4){.+.+}, at:
    vfs_write+0x25f/0x290
    #1: ffff888277618880 (&of->mutex){+.+.}, at:
    kernfs_fop_write+0x128/0x240
    #2: ffff8898131fc218 (kn->count#115){.+.+}, at:
    kernfs_fop_write+0x138/0x240
    #3: ffffffff86962a80 (device_hotplug_lock){+.+.}, at:
    lock_device_hotplug_sysfs+0x16/0x50
    #4: ffff8884374f4990 (&dev->mutex){....}, at:
    device_offline+0x70/0x110
    #5: ffffffff86515250 (cpu_hotplug_lock.rw_sem){++++}, at:
    __offline_pages+0xbf/0xa10
    #6: ffffffff867405f0 (mem_hotplug_lock.rw_sem){++++}, at:
    percpu_down_write+0x87/0x2f0
    #7: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at:
    __offline_isolated_pages+0x179/0x3e0
    #8: ffffffff865a4920 (console_lock){+.+.}, at:
    vprintk_emit+0x100/0x340

    stack backtrace:
    Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10,
    BIOS U34 05/21/2019
    Call Trace:
    dump_stack+0x86/0xca
    print_circular_bug.cold.31+0x243/0x26e
    check_noncircular+0x29e/0x2e0
    check_prev_add+0x107/0xea0
    validate_chain+0x8fc/0x1200
    __lock_acquire+0x5b3/0xb40
    lock_acquire+0x126/0x280
    console_unlock+0x269/0x750
    vprintk_emit+0x10d/0x340
    vprintk_default+0x1f/0x30
    vprintk_func+0x44/0xd4
    printk+0x9f/0xc5
    __offline_isolated_pages.cold.52+0x2f/0x30a
    offline_isolated_pages_cb+0x17/0x30
    walk_system_ram_range+0xda/0x160
    __offline_pages+0x79c/0xa10
    offline_pages+0x11/0x20
    memory_subsys_offline+0x7e/0xc0
    device_offline+0xd5/0x110
    state_store+0xc6/0xe0
    dev_attr_store+0x3f/0x60
    sysfs_kf_write+0x89/0xb0
    kernfs_fop_write+0x188/0x240
    __vfs_write+0x50/0xa0
    vfs_write+0x105/0x290
    ksys_write+0xc6/0x160
    __x64_sys_write+0x43/0x50
    do_syscall_64+0xcc/0x76c
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Link: http://lkml.kernel.org/r/20200117181200.20299-1-cai@lca.pw
    Signed-off-by: Qian Cai
    Reviewed-by: David Hildenbrand
    Cc: Michal Hocko
    Cc: Sergey Senozhatsky
    Cc: Petr Mladek
    Cc: Steven Rostedt (VMware)
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • Now that the memory isolate notifier is gone, the parameter is always 0.
    Drop it and cleanup has_unmovable_pages().

    Link: http://lkml.kernel.org/r/20191114131911.11783-3-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Michal Hocko
    Cc: Oscar Salvador
    Cc: Anshuman Khandual
    Cc: Qian Cai
    Cc: Pingfan Liu
    Cc: Stephen Rothwell
    Cc: Dan Williams
    Cc: Pavel Tatashin
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Mike Rapoport
    Cc: Wei Yang
    Cc: Alexander Duyck
    Cc: Alexander Potapenko
    Cc: Arun KS
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Luckily, we have no users left, so we can get rid of it. Cleanup
    set_migratetype_isolate() a little bit.

    Link: http://lkml.kernel.org/r/20191114131911.11783-2-david@redhat.com
    Signed-off-by: David Hildenbrand
    Reviewed-by: Greg Kroah-Hartman
    Acked-by: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Pavel Tatashin
    Cc: Dan Williams
    Cc: Oscar Salvador
    Cc: Qian Cai
    Cc: Anshuman Khandual
    Cc: Pingfan Liu
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

02 Dec, 2019

1 commit

  • We have two types of users of page isolation:

    1. Memory offlining: Offline memory so it can be unplugged. Memory
    won't be touched.

    2. Memory allocation: Allocate memory (e.g., alloc_contig_range()) to
    become the owner of the memory and make use of
    it.

    For example, in case we want to offline memory, we can ignore (skip
    over) PageHWPoison() pages, as the memory won't get used; we can allow
    the memory to be offlined. In contrast, we don't want to allow allocating
    such memory.

    Let's generalize the approach so we can special case other types of
    pages we want to skip over in case we offline memory. While at it, also
    pass the same flags to test_pages_isolated().
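
    A sketch of the generalized flag and one of its users (the values and the
    helper name below are illustrative):

      /* isolation flags, roughly as in include/linux/page-isolation.h of that era */
      #define MEMORY_OFFLINE  0x1     /* isolation is on behalf of memory offlining */
      #define REPORT_FAILURE  0x2     /* report details about an isolation failure */

      /* Hypothetical helper: pages we may ignore because the memory won't be used. */
      static bool may_skip_when_offlining(struct page *page, int isol_flags)
      {
              return (isol_flags & MEMORY_OFFLINE) && PageHWPoison(page);
      }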

    Link: http://lkml.kernel.org/r/20191021172353.3056-3-david@redhat.com
    Signed-off-by: David Hildenbrand
    Suggested-by: Michal Hocko
    Acked-by: Michal Hocko
    Cc: Oscar Salvador
    Cc: Anshuman Khandual
    Cc: David Hildenbrand
    Cc: Pingfan Liu
    Cc: Qian Cai
    Cc: Pavel Tatashin
    Cc: Dan Williams
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Mike Rapoport
    Cc: Alexander Duyck
    Cc: Mike Rapoport
    Cc: Pavel Tatashin
    Cc: Wei Yang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

13 Jul, 2019

1 commit


15 May, 2019

1 commit

  • pfn_valid_within() calls pfn_valid() when CONFIG_HOLES_IN_ZONE is enabled,
    making it redundant for both definitions (with and without
    CONFIG_MEMORY_HOTPLUG) of the helper pfn_to_online_page(), which already
    calls either pfn_valid() or pfn_valid_within(). pfn_valid_within() being 1
    when !CONFIG_HOLES_IN_ZONE is irrelevant either way. This does not change
    functionality.

    Link: http://lkml.kernel.org/r/1553141595-26907-1-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Anshuman Khandual
    Reviewed-by: Zi Yan
    Reviewed-by: Oscar Salvador
    Acked-by: Michal Hocko
    Cc: Mike Kravetz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     

30 Mar, 2019

2 commits

  • Due to has_unmovable_pages() taking an incorrect irqsave flag instead of
    the isolation flag in set_migratetype_isolate(), there are issues with
    HWPOISON and error reporting where dump_page() is not called when there
    is an unmovable page.

    Link: http://lkml.kernel.org/r/20190320204941.53731-1-cai@lca.pw
    Fixes: d381c54760dc ("mm: only report isolation failures when offlining memory")
    Acked-by: Michal Hocko
    Reviewed-by: Oscar Salvador
    Signed-off-by: Qian Cai
    Cc: [5.0.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • Commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded
    memory to zones until online") introduced move_pfn_range_to_zone() which
    calls memmap_init_zone() during onlining a memory block.
    memmap_init_zone() will reset pagetype flags and makes migrate type to
    be MOVABLE.

    However, __offline_pages() also calls undo_isolate_page_range() after
    offline_isolated_pages() to do the same thing. Because commit
    2ce13640b3f4 ("mm: __first_valid_page skip over offline pages") changed
    __first_valid_page() to skip offline pages, undo_isolate_page_range()
    here just wastes CPU cycles looping over the offlining PFN range while
    doing nothing, because __first_valid_page() will return NULL, as
    offline_isolated_pages() has already marked all memory sections within
    the pfn range as offline via offline_mem_sections().

    Also, after calling the "useless" undo_isolate_page_range() here, offlining
    reaches the point of no return by notifying MEM_OFFLINE. Those pages will
    be marked MIGRATE_MOVABLE again once they are onlined. The only thing left
    to do is to decrease the zone's counter of isolated pageblocks, which would
    otherwise leave some page allocation paths slower (a slowdown that the
    above commit introduced).

    Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages
    on ppc64, an "int" should still be enough to represent the number of
    pageblocks there. Fix an incorrect comment along the way.

    [cai@lca.pw: v4]
    Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw
    Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw
    Fixes: 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages")
    Signed-off-by: Qian Cai
    Acked-by: Michal Hocko
    Reviewed-by: Oscar Salvador
    Cc: Vlastimil Babka
    Cc: [4.13+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     

29 Dec, 2018

1 commit

  • Heiko has complained that his log is swamped by warnings from
    has_unmovable_pages:

    [ 20.536664] page dumped because: has_unmovable_pages
    [ 20.536792] page:000003d081ff4080 count:1 mapcount:0 mapping:000000008ff88600 index:0x0 compound_mapcount: 0
    [ 20.536794] flags: 0x3fffe0000010200(slab|head)
    [ 20.536795] raw: 03fffe0000010200 0000000000000100 0000000000000200 000000008ff88600
    [ 20.536796] raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
    [ 20.536797] page dumped because: has_unmovable_pages
    [ 20.536814] page:000003d0823b0000 count:1 mapcount:0 mapping:0000000000000000 index:0x0
    [ 20.536815] flags: 0x7fffe0000000000()
    [ 20.536817] raw: 07fffe0000000000 0000000000000100 0000000000000200 0000000000000000
    [ 20.536818] raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000

    which are not triggered by memory hotplug but rather by the CMA allocator.
    The original idea behind dumping the page state for all call paths was
    that these messages would be helpful for debugging failures. From the
    above it seems that this is not the case for the CMA path, because we are
    lacking much of the context. E.g. the second reported page might be a
    CMA-allocated page. It is still interesting to see a slab page in the CMA
    area, but it is hard to tell whether this is a bug from the above output
    alone.

    Address this issue by dumping the page state only on request. Both
    start_isolate_page_range and has_unmovable_pages already have an argument
    to ignore hwpoison pages, so make this argument more generic, turn it into
    flags, and allow callers to combine non-default modes into a mask. While
    we are at it, reporting the failure from the has_unmovable_pages call in
    is_pageblock_removable_nolock (the sysfs "removable" file) is
    questionable, so drop it from there as well.
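
    A sketch of the failure path after this change, wrapped in a hypothetical
    helper for readability:

      /* Hypothetical helper: handle an unmovable page found by has_unmovable_pages(). */
      static bool report_unmovable(struct page *page, int flags)
      {
              if (flags & REPORT_FAILURE)
                      /* only memory offlining asks for a report; CMA stays quiet */
                      dump_page(page, "unmovable page");
              return true;    /* tell the caller the range has unmovable pages */
      }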

    Link: http://lkml.kernel.org/r/20181218092802.31429-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Reported-by: Heiko Carstens
    Reviewed-by: Oscar Salvador
    Cc: Anshuman Khandual
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

12 Apr, 2018

1 commit

  • No allocation callback is using this argument anymore. new_page_node
    used to use this parameter to convey node_id resp. migration error up
    to move_pages code (do_move_page_to_node_array). The error status never
    made it into the final status field and we have a better way to
    communicate node id to the status field now. All other allocation
    callbacks simply ignored the argument so we can drop it finally.

    [mhocko@suse.com: fix migration callback]
    Link: http://lkml.kernel.org/r/20180105085259.GH2801@dhcp22.suse.cz
    [akpm@linux-foundation.org: fix alloc_misplaced_dst_page()]
    [mhocko@kernel.org: fix build]
    Link: http://lkml.kernel.org/r/20180103091134.GB11319@dhcp22.suse.cz
    Link: http://lkml.kernel.org/r/20180103082555.14592-3-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Reviewed-by: Zi Yan
    Cc: Andrea Reale
    Cc: Anshuman Khandual
    Cc: Kirill A. Shutemov
    Cc: Mike Kravetz
    Cc: Naoya Horiguchi
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

06 Apr, 2018

1 commit

  • start_isolate_page_range() is used to set the migrate type of a set of
    pageblocks to MIGRATE_ISOLATE while attempting to start a migration
    operation. It assumes that only one thread is calling it for the
    specified range. This routine is used by CMA, memory hotplug and
    gigantic huge pages. Each of these users synchronize access to the
    range within their subsystem. However, two subsystems (CMA and gigantic
    huge pages for example) could attempt operations on the same range. If
    this happens, one thread may 'undo' the work another thread is doing.
    This can result in pageblocks being incorrectly left marked as
    MIGRATE_ISOLATE and therefore not available for page allocation.

    What is ideally needed is a way to synchronize access to a set of
    pageblocks that are undergoing isolation and migration. The only thing
    we know about these pageblocks is that they are all in the same zone. A
    per-node mutex is too coarse as we want to allow multiple operations on
    different ranges within the same zone concurrently. Instead, we will
    use the migration type of the pageblocks themselves as a form of
    synchronization.

    start_isolate_page_range sets the migration type on a set of pageblocks
    going in order from the one associated with the smallest pfn to the
    largest pfn. The zone lock is acquired to check and set the migration
    type. When going through the list of pageblocks, check if MIGRATE_ISOLATE
    is already set. If so, this indicates another thread is working on this
    pageblock. We know exactly which pageblocks we set, so clean up by undoing
    those and return -EBUSY.

    This allows start_isolate_page_range to serve as a synchronization
    mechanism and will allow for more general use of callers making use of
    these interfaces. Update comments in alloc_contig_range to reflect this
    new functionality.

    Each CPU holds the associated zone lock to modify or examine the
    migration type of a pageblock. And, it will only examine/update a
    single pageblock per lock acquire/release cycle.
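
    A sketch of the loop, with parameters simplified relative to the actual
    start_isolate_page_range() of that time:

      int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
                                   unsigned migratetype, int flags)
      {
              unsigned long pfn, undo_pfn;
              struct page *page;

              for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
                      page = __first_valid_page(pfn, pageblock_nr_pages);
                      /* fails if the pageblock is already MIGRATE_ISOLATE */
                      if (page && set_migratetype_isolate(page, migratetype, flags)) {
                              undo_pfn = pfn;
                              goto undo;
                      }
              }
              return 0;
      undo:
              /* undo only the pageblocks this caller isolated, then back off */
              for (pfn = start_pfn; pfn < undo_pfn; pfn += pageblock_nr_pages) {
                      page = __first_valid_page(pfn, pageblock_nr_pages);
                      if (page)
                              unset_migratetype_isolate(page, migratetype);
              }
              return -EBUSY;
      }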

    Link: http://lkml.kernel.org/r/20180309224731.16978-1-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Reviewed-by: Andrew Morton
    Cc: KAMEZAWA Hiroyuki
    Cc: Luiz Capitulino
    Cc: Michal Nazarewicz
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     

16 Nov, 2017

1 commit

  • Joonsoo has noticed that "mm: drop migrate type checks from
    has_unmovable_pages" would break CMA allocator because it relies on
    has_unmovable_pages returning false even for CMA pageblocks which in
    fact don't have to be movable:

    alloc_contig_range
    start_isolate_page_range
    set_migratetype_isolate
    has_unmovable_pages

    This is a result of the code sharing between CMA and memory hotplug
    while each one has a different idea of what has_unmovable_pages should
    return. This is unfortunate but fixing it properly would require a lot
    of code duplication.

    Fix the issue by introducing a requested-migratetype argument and
    special-casing MIGRATE_CMA, where CMA pageblocks are handled properly.
    This will work for memory hotplug because it requires MIGRATE_MOVABLE.
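
    A sketch of the special case, expressed as a hypothetical helper around
    the check that ends up in has_unmovable_pages():

      /* Hypothetical helper: is a page in a CMA pageblock "unmovable" for this caller? */
      static bool cma_page_blocks_isolation(struct page *page, int requested_mt)
      {
              if (!is_migrate_cma_page(page))
                      return false;   /* not a CMA pageblock; judged elsewhere */

              /*
               * CMA allocations (requested_mt == MIGRATE_CMA) may reuse pages in
               * their own pageblocks even when those are not strictly movable.
               * Memory hotplug requests MIGRATE_MOVABLE and still refuses them.
               */
              return !is_migrate_cma(requested_mt);
      }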

    Link: http://lkml.kernel.org/r/20171019122118.y6cndierwl2vnguj@dhcp22.suse.cz
    Signed-off-by: Michal Hocko
    Reported-by: Joonsoo Kim
    Tested-by: Stefan Wahren
    Tested-by: Ran Wang
    Cc: Michael Ellerman
    Cc: Vlastimil Babka
    Cc: Igor Mammedov
    Cc: KAMEZAWA Hiroyuki
    Cc: Reza Arbab
    Cc: Vitaly Kuznetsov
    Cc: Xishi Qiu
    Cc: Yasuaki Ishimatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if <5
    lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

11 Jul, 2017

1 commit

  • Commit 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest
    neighbor node when mem-offline") duplicated a large part of
    alloc_migrate_target with some hotplug-specific special casing.

    To be more precise, it tried to enforce the allocation from a different
    node than the original page's. As a result the two functions diverged in
    their shared logic, e.g. the hugetlb allocation strategy.

    Let's unify the two and express different NUMA requirements by the given
    nodemask. new_node_page will simply exclude the node it doesn't care
    about and alloc_migrate_target will use all the available nodes.
    alloc_migrate_target will then learn to migrate hugetlb pages more
    sanely and use preallocated pool when possible.

    Please note that alloc_migrate_target used to call alloc_page resp.
    alloc_pages_current, and thus honored the memory policy of the current
    context, which is quite strange when we consider that it is used in the
    context of alloc_contig_range, which just tries to migrate pages that
    stand in the way.
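
    A sketch of how the NUMA requirement is expressed via a nodemask after the
    unification (close to the shape of new_node_page() in this series):

      static struct page *new_node_page(struct page *page, unsigned long private)
      {
              int nid = page_to_nid(page);
              nodemask_t nmask = node_states[N_MEMORY];

              /* prefer the other nodes, but keep the source node as a last resort */
              if (nodes_weight(nmask) > 1)
                      node_clear(nid, &nmask);

              return new_page_nodemask(page, nid, &nmask);
      }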

    Link: http://lkml.kernel.org/r/20170608074553.22152-4-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Naoya Horiguchi
    Cc: Xishi Qiu
    Cc: zhong jiang
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

07 Jul, 2017

1 commit

  • __first_valid_page skips over invalid pfns in the range, but it might
    still stumble over offline pages. At least start_isolate_page_range will
    mark those via set_migratetype_isolate. This doesn't represent an
    immediate problem AFAICS, because alloc_contig_range will fail to isolate
    those pages, but it relies on a not fully initialized page, which will
    become a problem later when we stop associating offline pages with zones.
    Use pfn_to_online_page to handle this.

    This is more a preparatory patch than a fix.
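
    A sketch of the adjusted lookup (essentially what __first_valid_page()
    becomes):

      static struct page *__first_valid_page(unsigned long pfn, unsigned long nr_pages)
      {
              unsigned long i;

              for (i = 0; i < nr_pages; i++) {
                      /* NULL for pfns that are invalid *or* belong to offline sections */
                      struct page *page = pfn_to_online_page(pfn + i);

                      if (page)
                              return page;
              }
              return NULL;
      }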

    Link: http://lkml.kernel.org/r/20170515085827.16474-10-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Andi Kleen
    Cc: Andrea Arcangeli
    Cc: Balbir Singh
    Cc: Dan Williams
    Cc: Daniel Kiper
    Cc: David Rientjes
    Cc: Heiko Carstens
    Cc: Igor Mammedov
    Cc: Jerome Glisse
    Cc: Joonsoo Kim
    Cc: Martin Schwidefsky
    Cc: Mel Gorman
    Cc: Reza Arbab
    Cc: Tobias Regnery
    Cc: Toshi Kani
    Cc: Vitaly Kuznetsov
    Cc: Xishi Qiu
    Cc: Yasuaki Ishimatsu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

09 May, 2017

1 commit

  • When stealing pages from pageblock of a different migratetype, we count
    how many free pages were stolen, and change the pageblock's migratetype
    if more than half of the pageblock was free. This might be too
    conservative, as there might be other pages that are not free, but were
    allocated with the same migratetype as our allocation requested.

    While we cannot determine the migratetype of allocated pages precisely
    (at least without the page_owner functionality enabled), we can count
    pages that compaction would try to isolate for migration - those are
    either on LRU or __PageMovable(). The rest can be assumed to be
    MIGRATE_RECLAIMABLE or MIGRATE_UNMOVABLE, which we cannot easily
    distinguish. This counting can be done as part of free page stealing
    with little additional overhead.

    The page stealing code is changed so that it considers free pages plus
    pages of the "good" migratetype for the decision whether to change
    pageblock's migratetype.

    The result should be more accurate migratetype of pageblocks wrt the
    actual pages in the pageblocks, when stealing from semi-occupied
    pageblocks. This should help the efficiency of page grouping by
    mobility.
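
    Two hypothetical helpers sketching the heuristic; the real counting
    happens inside move_freepages()/steal_suitable_fallback():

      /* Does an in-use page look movable (i.e. isolatable by compaction)? */
      static bool page_looks_movable(struct page *page)
      {
              return PageLRU(page) || __PageMovable(page);
      }

      /* Claim the whole pageblock once free + "alike" pages cover at least half of it. */
      static bool should_claim_whole_block(int free_pages, int alike_pages)
      {
              return free_pages + alike_pages >= (1 << (pageblock_order - 1));
      }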

    In testing based on 4.9 kernel with stress-highalloc from mmtests
    configured for order-4 GFP_KERNEL allocations, this patch has reduced
    the number of unmovable allocations falling back to movable pageblocks
    by 47%. The number of movable allocations falling back to other
    pageblocks are increased by 55%, but these events don't cause permanent
    fragmentation, so the tradeoff should be positive. Later patches also
    offset the movable fallback increase to some extent.

    [akpm@linux-foundation.org: merge fix]
    Link: http://lkml.kernel.org/r/20170307131545.28577-5-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Mel Gorman
    Cc: Johannes Weiner
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

04 May, 2017

1 commit

  • Use is_migrate_isolate_page() to simplify the code, no functional
    changes.

    Link: http://lkml.kernel.org/r/58B94FB1.8020802@huawei.com
    Signed-off-by: Xishi Qiu
    Acked-by: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     

23 Feb, 2017

2 commits

  • On architectures that allow memory holes, page_is_buddy() has to perform
    page_to_pfn() to check for the memory hole. After the previous patch,
    we have the pfn already available in __free_one_page(), which is the
    only caller of page_is_buddy(), so move the check there and avoid
    page_to_pfn().

    Link: http://lkml.kernel.org/r/20161216120009.20064-2-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Mel Gorman
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: "Kirill A. Shutemov"
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     
  • In __free_one_page() we do the buddy merging arithmetics on "page/buddy
    index", which is just the lower MAX_ORDER bits of pfn. The operations
    we do that affect the higher bits are bitwise AND and subtraction (in
    that order), where the final result will be the same with the higher
    bits left unmasked, as long as these bits are equal for both buddies -
    which must be true by the definition of a buddy.

    We can therefore use pfns directly instead of "index" and skip the
    zeroing of the >MAX_ORDER bits. This can help a bit by itself, although
    the compiler might be smart enough already. It also helps the next patch
    avoid page_to_pfn() for memory hole checks.
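
    A sketch of the pfn arithmetic this enables (mirroring __find_buddy_pfn()
    and the merge step):

      /* XOR only flips bit `order`; the higher bits are untouched. */
      static inline unsigned long find_buddy_pfn(unsigned long pfn, unsigned int order)
      {
              return pfn ^ (1UL << order);
      }

      /*
       * Merging: AND keeps the common higher bits and clears bit `order`,
       * giving the first pfn of the combined order+1 block. No masking of the
       * >MAX_ORDER bits is needed because, by definition, buddies share them.
       */
      static inline unsigned long combined_pfn(unsigned long pfn, unsigned long buddy_pfn)
      {
              return buddy_pfn & pfn;
      }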

    Link: http://lkml.kernel.org/r/20161216120009.20064-1-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Acked-by: Mel Gorman
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: "Kirill A. Shutemov"
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
     

08 Oct, 2016

1 commit


27 Jul, 2016

3 commits

  • When there is an isolated_page, post_alloc_hook() is called with page
    but __free_pages() is called with isolated_page. Since they are the
    same, there is no actual problem, but it's very confusing. To reduce the
    confusion, this patch changes isolated_page to a boolean type and uses
    the page variable consistently.

    Link: http://lkml.kernel.org/r/1466150259-27727-10-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • This patch is motivated from Hugh and Vlastimil's concern [1].

    There are two ways to get a free page from the allocator. One is using
    the normal memory allocation API, and the other is __isolate_free_page(),
    which is used internally for compaction and pageblock isolation. The
    latter usage is rather tricky since it doesn't do the whole
    post-allocation processing done by the normal API.

    One problematic thing I already know of is that a poisoned page would not
    be checked if it is allocated by __isolate_free_page(). Perhaps there are
    more.

    We could add more debug logic for allocated pages in the future, and this
    separation would cause more problems. I'd like to fix this situation now.
    The solution is simple: this patch commonizes some logic for newly
    allocated pages and uses it at all sites. This will solve the problem.

    [1] http://marc.info/?i=alpine.LSU.2.11.1604270029350.7066%40eggly.anvils%3E
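
    A sketch of the commonized hook, roughly as introduced (the ordering of
    the individual steps varies across kernel versions):

      void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags)
      {
              set_page_private(page, 0);
              set_page_refcounted(page);

              arch_alloc_page(page, order);
              kernel_map_pages(page, 1 << order, 1);          /* DEBUG_PAGEALLOC remap */
              kernel_poison_pages(page, 1 << order, 1);       /* catch poisoned pages */
              kasan_alloc_pages(page, order);
              set_page_owner(page, order, gfp_flags);
      }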

    [iamjoonsoo.kim@lge.com: mm-page_alloc-introduce-post-allocation-processing-on-page-allocator-v3]
    Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
    Link: http://lkml.kernel.org/r/1466150259-27727-9-git-send-email-iamjoonsoo.kim@lge.com
    Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • It's not necessary to initialize page_owner while holding the zone lock.
    Doing so causes more contention on the zone lock; that's not a big problem
    since it is just a debug feature, but avoiding it is still an improvement,
    so do it. This is also a preparation step for using stackdepot in the page
    owner feature. Stackdepot allocates new pages when there is no reserved
    space, and holding the zone lock in that case would cause a deadlock.

    Link: http://lkml.kernel.org/r/1464230275-25791-2-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Alexander Potapenko
    Cc: Hugh Dickins
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

20 May, 2016

2 commits

  • __offline_isolated_pages() and test_pages_isolated() are used by memory
    hotplug. These functions require that the range is in a single zone, but
    there is no code enforcing this because memory hotplug checks it before
    calling these functions. To avoid confusing future users of these
    functions, this patch adds comments to them.

    Signed-off-by: Joonsoo Kim
    Acked-by: Vlastimil Babka
    Cc: Rik van Riel
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Laura Abbott
    Cc: Minchan Kim
    Cc: Marek Szyprowski
    Cc: Michal Nazarewicz
    Cc: "Aneesh Kumar K.V"
    Cc: "Rafael J. Wysocki"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Lots of code does

      node = next_node(node, XXX);
      if (node == MAX_NUMNODES)
              node = first_node(XXX);

    so create next_node_in() to do this and use it in various places.
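
    A sketch of the helper's semantics (the in-tree version is generated as a
    macro plus an out-of-line __next_node_in()):

      /* Like next_node(), but wraps around to the first node in the mask. */
      static inline int next_node_in(int node, const nodemask_t *mask)
      {
              int nid = next_node(node, *mask);

              if (nid == MAX_NUMNODES)
                      nid = first_node(*mask);
              return nid;
      }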

    [mhocko@suse.com: use next_node_in() helper]
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Signed-off-by: Michal Hocko
    Cc: Xishi Qiu
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: Naoya Horiguchi
    Cc: Laura Abbott
    Cc: Hui Zhu
    Cc: Wang Xiaoqiang
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

02 Apr, 2016

2 commits

  • Commit fea85cff11de ("mm/page_isolation.c: return last tested pfn rather
    than failure indicator") changed the meaning of the return value. Let's
    change the function comments as well.

    Signed-off-by: Neil Zhang
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Zhang
     
  • It is incorrect to use next_node to find a target node; it can return
    MAX_NUMNODES or an invalid node. This will lead to a crash in the buddy
    system allocation.

    Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
    Signed-off-by: Xishi Qiu
    Acked-by: Vlastimil Babka
    Acked-by: Naoya Horiguchi
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: "Laura Abbott"
    Cc: Hui Zhu
    Cc: Wang Xiaoqiang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu