Commit 2247bb335ab9c40058484cac36ea74ee652f3b7b

Authored by Gerald Schaefer
Committed by Linus Torvalds
1 parent 914a051654

mm/hugetlb: fix memory offline with hugepage size > memory block size

Patch series "mm/hugetlb: memory offline issues with hugepages", v4.

This addresses several issues with hugepages and memory offline.  While
the first patch fixes a panic, and is therefore rather important, the
last patch is just a performance optimization.

The second patch fixes a theoretical issue with reserved hugepages, but
still leaves an ugly usability issue; see its description.

This patch (of 3):

dissolve_free_huge_pages() will either hit the VM_BUG_ON() or cause list
corruption and an addressing exception when trying to set a memory block
offline that is part (but not the first part) of a "gigantic" hugetlb
page with a size > memory block size.

When no other smaller hugetlb page sizes are present, the VM_BUG_ON()
will trigger directly.  In the other case we will run into an addressing
exception later, because dissolve_free_huge_page() operates on the tail
page it was given rather than on the head page of the compound hugetlb
page, which results in a NULL hstate from page_hstate().
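
The alignment failure behind the VM_BUG_ON() can be illustrated with a small
userspace sketch. The geometry here is an assumption chosen for illustration
(1 GiB gigantic page of 2^18 base pages, 256 MiB memory blocks of 2^16 base
pages), and pfn_is_aligned() and block_start_pfn() are hypothetical helpers,
not kernel functions. With only the gigantic size present, the minimum order
equals the gigantic order, so every memory block that does not start a
gigantic page fails the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed example geometry, not taken from the commit. */
#define GIGANTIC_ORDER 18UL          /* 1 GiB hugepage = 2^18 base pages of 4 KiB */
#define BLOCK_PAGES    (1UL << 16)   /* 256 MiB memory block = 2^16 base pages */

/* Power-of-two alignment test, the same idea as the kernel's IS_ALIGNED(). */
static bool pfn_is_aligned(unsigned long pfn, unsigned long nr_pages)
{
	/* nr_pages must be a non-zero power of two for the mask trick to hold. */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	return (pfn & (nr_pages - 1)) == 0;
}

/* First pfn of the n-th memory block, counting blocks from pfn 0. */
static unsigned long block_start_pfn(unsigned long n)
{
	return n * BLOCK_PAGES;
}
```

Under these assumptions, blocks 0 and 4 start on a gigantic-page boundary,
while blocks 1, 2 and 3 do not, so offlining any of those three would have
tripped the removed assertion.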

To fix this, first remove the VM_BUG_ON() because it is wrong, and then
use the compound head page in dissolve_free_huge_page().  This means
that an unused pre-allocated gigantic page that has any part of itself
inside the memory block that is going offline will be dissolved
completely.  Losing an unused gigantic hugepage is preferable to failing
the memory offline, for example in the situation where a (possibly
faulty) memory DIMM needs to go offline.
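
Why resolving the head page matters can be shown with a toy model. This is
only a sketch: struct toy_page and the toy_* helpers are hypothetical
stand-ins, not the kernel's struct page, compound_head() or page_hstate(). It
assumes that the per-hugepage metadata is valid on the head page only, which
is the situation the description above refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page; holds only what this sketch needs. */
struct toy_page {
	struct toy_page *head;  /* NULL on the head page, else points at the head */
	int hstate_id;          /* meaningful on the head page only; -1 on tails */
};

/* Return the head of the compound page that 'page' belongs to. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/* Set up nr pages as one compound page with pages[0] as its head. */
static void toy_make_compound(struct toy_page *pages, size_t nr, int hstate_id)
{
	assert(nr > 0);
	pages[0].head = NULL;
	pages[0].hstate_id = hstate_id;
	for (size_t i = 1; i < nr; i++) {
		pages[i].head = &pages[0];
		pages[i].hstate_id = -1;
	}
}

/* Lookup in the style of the old code: trusts whatever page it was handed. */
static int toy_hstate_before_fix(struct toy_page *page)
{
	return page->hstate_id;
}

/* Lookup in the style of the fix: always go through the head page first. */
static int toy_hstate_after_fix(struct toy_page *page)
{
	return toy_compound_head(page)->hstate_id;
}
```

Handing a tail page to toy_hstate_before_fix() yields the invalid marker,
which plays the role of the NULL hstate; toy_hstate_after_fix() returns the
head's value for any page of the compound page.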

Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Link: http://lkml.kernel.org/r/20160926172811.94033-2-gerald.schaefer@de.ibm.com
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rui Teng <rui.teng@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 1 changed file with 7 additions and 6 deletions

@@ -1443,13 +1443,14 @@
 {
 	spin_lock(&hugetlb_lock);
 	if (PageHuge(page) && !page_count(page)) {
-		struct hstate *h = page_hstate(page);
-		int nid = page_to_nid(page);
-		list_del(&page->lru);
+		struct page *head = compound_head(page);
+		struct hstate *h = page_hstate(head);
+		int nid = page_to_nid(head);
+		list_del(&head->lru);
 		h->free_huge_pages--;
 		h->free_huge_pages_node[nid]--;
 		h->max_huge_pages--;
-		update_and_free_page(h, page);
+		update_and_free_page(h, head);
 	}
 	spin_unlock(&hugetlb_lock);
 }
@@ -1457,7 +1458,8 @@
 /*
  * Dissolve free hugepages in a given pfn range. Used by memory hotplug to
  * make specified memory blocks removable from the system.
- * Note that start_pfn should aligned with (minimum) hugepage size.
+ * Note that this will dissolve a free gigantic hugepage completely, if any
+ * part of it lies within the given range.
  */
 void dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 {
@@ -1466,7 +1468,6 @@
 	if (!hugepages_supported())
 		return;
 
-	VM_BUG_ON(!IS_ALIGNED(start_pfn, 1 << minimum_order));
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order)
 		dissolve_free_huge_page(pfn_to_page(pfn));
 }
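
The effect of the range walk after this change can be modelled in userspace.
This is a rough sketch under stated assumptions, not kernel code:
head_pfn_of() and count_gigantic_pages_hit() are hypothetical helpers, and
the model assumes gigantic pages are naturally aligned, so the head pfn is
found by rounding down. It counts how many distinct gigantic pages a walk
over [start_pfn, end_pfn) in steps of the minimum hugepage size would reach.

```c
#include <assert.h>
#include <limits.h>

/* Round a pfn down to the first pfn of its (naturally aligned) gigantic page. */
static unsigned long head_pfn_of(unsigned long pfn, unsigned long huge_nr_pages)
{
	assert(huge_nr_pages != 0 && (huge_nr_pages & (huge_nr_pages - 1)) == 0);
	return pfn & ~(huge_nr_pages - 1);
}

/*
 * Walk [start_pfn, end_pfn) in steps of 'step' base pages and count the
 * distinct gigantic pages whose head is reached from inside the range.
 */
static unsigned long count_gigantic_pages_hit(unsigned long start_pfn,
					      unsigned long end_pfn,
					      unsigned long step,
					      unsigned long huge_nr_pages)
{
	unsigned long count = 0;
	unsigned long last_head = ULONG_MAX; /* sentinel: no head seen yet */

	assert(step != 0);
	for (unsigned long pfn = start_pfn; pfn < end_pfn; pfn += step) {
		unsigned long head = head_pfn_of(pfn, huge_nr_pages);

		if (head != last_head) {
			count++;
			last_head = head;
		}
	}
	return count;
}
```

With an assumed 2^18-page gigantic page and a 512-page step, a range covering
pfns 65536 to 131071 lies entirely inside one gigantic page and reaches
exactly one head, even though the range itself is not aligned to the gigantic
size. A range that straddles pfn 262144 reaches two.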
1472 1473 }