Commit e7a00c45f29c0155007aa150bf231a70fa470365

Authored by Andrea Arcangeli
Committed by Linus Torvalds
1 parent 4e6af67e97

thp: add pmd_huge_pte to mm_struct

This increases the size of the mm struct a bit, but it is needed to
preallocate one pte for each hugepage so that split_huge_page will not
require a fail path.  Guarantee of success is a fundamental property of
split_huge_page to avoid decreasing swapping reliability and to avoid
adding -ENOMEM fail paths that would otherwise force the hugepage-unaware
VM code to learn rolling back in the middle of its pte mangling operations
(if anything, we need it to learn to handle pmd_trans_huge natively rather
than to be capable of rollback).  When split_huge_page runs, a pte is
needed for the split to succeed, to map the newly split regular pages
with a regular pte.  This way all existing VM code remains backwards
compatible by just adding a split_huge_page* one liner.  The memory waste
of those preallocated ptes is negligible, so it is worth it.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 2 changed files with 10 additions and 0 deletions

include/linux/mm_types.h
@@ -310,6 +310,9 @@
 #ifdef CONFIG_MMU_NOTIFIER
 	struct mmu_notifier_mm *mmu_notifier_mm;
 #endif
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pgtable_t pmd_huge_pte; /* protected by page_table_lock */
+#endif
 	/* How many tasks sharing this mm are OOM_DISABLE */
 	atomic_t oom_disable_count;
 };

kernel/fork.c
@@ -529,6 +529,9 @@
 	mm_free_pgd(mm);
 	destroy_context(mm);
 	mmu_notifier_mm_destroy(mm);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	VM_BUG_ON(mm->pmd_huge_pte);
+#endif
 	free_mm(mm);
 }
 EXPORT_SYMBOL_GPL(__mmdrop);
@@ -668,6 +671,10 @@
 	/* Initializing for Swap token stuff */
 	mm->token_priority = 0;
 	mm->last_interval = 0;
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	mm->pmd_huge_pte = NULL;
+#endif
 
 	if (!mm_init(mm, tsk))
 		goto fail_nomem;
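
The preallocation idea described in the commit message can be illustrated with a small userspace sketch. This is not kernel code: `toy_mm`, `toy_pgtable`, and every `toy_*` function are hypothetical names invented for illustration, and the toy tracks a single preallocated table where the commit preallocates one per hugepage. It only shows the shape of the argument: the allocation that may fail happens when the huge mapping is set up, so the later split consumes memory that is already reserved and needs no -ENOMEM path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page table page; not the kernel's pgtable_t. */
struct toy_pgtable {
	unsigned long entries[8];
};

/* Hypothetical stand-in for mm_struct, reduced to the field this commit adds. */
struct toy_mm {
	struct toy_pgtable *pmd_huge_pte;
};

/* Mirrors the initialization hunk: a new mm starts with nothing preallocated. */
void toy_mm_init(struct toy_mm *mm)
{
	mm->pmd_huge_pte = NULL;
}

/*
 * Set up a huge mapping. This is the only step that allocates, so it is
 * the only step that may fail. Returns 0 on success, -1 on failure.
 * Assumption of this toy: at most one huge mapping exists at a time.
 */
int toy_map_huge(struct toy_mm *mm)
{
	struct toy_pgtable *pt;

	if (mm->pmd_huge_pte != NULL)
		return -1;
	pt = calloc(1, sizeof(*pt));
	if (pt == NULL)
		return -1;
	mm->pmd_huge_pte = pt; /* reserve the table for a later split */
	return 0;
}

/*
 * Split the huge mapping. No allocation happens here: the table reserved
 * by toy_map_huge() is handed out, so running out of memory cannot make
 * the split fail. The caller takes ownership of the returned table.
 */
struct toy_pgtable *toy_split_huge(struct toy_mm *mm)
{
	struct toy_pgtable *pt = mm->pmd_huge_pte;

	assert(pt != NULL); /* a huge mapping must exist before a split */
	mm->pmd_huge_pte = NULL;
	return pt;
}

/* Mirrors the VM_BUG_ON check: nothing may remain reserved at teardown. */
int toy_mm_can_drop(const struct toy_mm *mm)
{
	return mm->pmd_huge_pte == NULL;
}
```

A caller would run `toy_mm_init()`, then `toy_map_huge()`, and later `toy_split_huge()`; only the second call needs an error check. `toy_mm_can_drop()` plays the role of the teardown assertion: a leftover table at that point would indicate a bookkeeping bug.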