Commit efeda7a41e09efce506a68c3549b60b16dd7dedd

Authored by Jin Dongming
Committed by Linus Torvalds
1 parent b16957c643

thp: fix splitting of hwpoisoned hugepages

The poisoned THP is currently split with split_huge_page() in
collect_procs_anon().  If kmalloc() fails in collect_procs(),
split_huge_page() is never called, and the work that follows it,
collecting the processes that use the poisoned page, is not done
either.  As a result, the processes using the poisoned page cannot be
killed.

The situation is worse when CONFIG_DEBUG_VM=y.  Because the poisoned
THP was not split, VM_BUG_ON(PageTransHuge(page)) in try_to_unmap()
panics the system.

This patch:
  1. Moves split_huge_page() ahead of collect_procs().
     This ensures that a failure to split the THP comes from the split
     itself and not from an unrelated kmalloc() failure.
  2. Stops the subsequent operations when splitting the THP fails.
     This avoids an unexpected system panic or pointless work.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
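
The reordering described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_page, struct model_env, handle_old() and handle_new() are hypothetical names invented for illustration, and the fault-injection flags stand in for a failing kmalloc() or split_huge_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the flags this sketch needs. */
struct model_page {
	bool trans_huge;	/* models PageTransHuge() */
	bool hugetlb;		/* models PageHuge() */
	bool anon;		/* models PageAnon() */
};

/* Hypothetical fault-injection knobs; none of these exist in the kernel. */
struct model_env {
	bool kmalloc_fails;	/* allocation in the collection step fails */
	bool split_fails;	/* splitting the huge page fails */
	bool debug_vm;		/* models CONFIG_DEBUG_VM=y */
};

enum outcome {
	OUTCOME_KILLED,		/* users were collected and can be killed */
	OUTCOME_NOT_KILLED,	/* collection skipped, users survive */
	OUTCOME_PANIC,		/* unmap reached with an unsplit THP */
	OUTCOME_SPLIT_FAILED	/* stopped early because the split failed */
};

static bool needs_split(const struct model_page *page)
{
	return page->trans_huge && !page->hugetlb && page->anon;
}

/* Models split_huge_page(): clears the THP flag unless failure is injected. */
static bool model_split(struct model_page *page, const struct model_env *env)
{
	assert(page != NULL && env != NULL);
	if (env->split_fails)
		return false;
	page->trans_huge = false;
	return true;
}

/* Models the unmap step, including the debug check on unsplit THPs. */
static enum outcome unmap_and_report(const struct model_page *page,
				     const struct model_env *env,
				     bool collected)
{
	if (page->trans_huge && !page->hugetlb && env->debug_vm)
		return OUTCOME_PANIC;
	return collected ? OUTCOME_KILLED : OUTCOME_NOT_KILLED;
}

/* Old order: the split only happens inside the collection step. */
enum outcome handle_old(struct model_page page, const struct model_env *env)
{
	bool collected = false;

	if (!env->kmalloc_fails) {
		if (!needs_split(&page) || model_split(&page, env))
			collected = true;
	}
	return unmap_and_report(&page, env, collected);
}

/* New order: split first, and stop if the split itself fails. */
enum outcome handle_new(struct model_page page, const struct model_env *env)
{
	if (needs_split(&page) && !model_split(&page, env))
		return OUTCOME_SPLIT_FAILED;

	return unmap_and_report(&page, env, !env->kmalloc_fails);
}
```

In this model, an anonymous THP with kmalloc_fails and debug_vm set reaches the unmap step unsplit under handle_old(), while handle_new() splits it first and never reaches the debug check with a huge page. The pages are passed by value so that each call starts from the same state.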

Showing 1 changed file with 28 additions and 2 deletions

@@ -386,8 +386,6 @@
 	struct task_struct *tsk;
 	struct anon_vma *av;
 
-	if (!PageHuge(page) && unlikely(split_huge_page(page)))
-		return;
 	read_lock(&tasklist_lock);
 	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
@@ -893,6 +891,34 @@
 			printk(KERN_INFO
 	"MCE %#lx: corrupted page was clean: dropped without side effects\n",
 				pfn);
+		}
+	}
+
+	if (PageTransHuge(hpage)) {
+		/*
+		 * Verify that this isn't a hugetlbfs head page. The check for
+		 * PageAnon is just to avoid tripping a split_huge_page
+		 * internal debug check, as split_huge_page refuses to deal with
+		 * anything that isn't an anon page. PageAnon can't go away from
+		 * under us because we hold a refcount on the hpage. Without a
+		 * refcount on the hpage, split_huge_page can't be safely called
+		 * in the first place; having a refcount on the tail isn't
+		 * enough to be safe.
+		 */
+		if (!PageHuge(hpage) && PageAnon(hpage)) {
+			if (unlikely(split_huge_page(hpage))) {
+				/*
+				 * FIXME: if splitting the THP fails, it is
+				 * better to stop the following operations
+				 * rather than panic while unmapping. The system
+				 * might survive if the page is freed later.
+				 */
+				printk(KERN_INFO
+					"MCE %#lx: failed to split THP\n", pfn);
+
+				BUG_ON(!PageHWPoison(p));
+				return SWAP_FAIL;
+			}
 		}
 	}
 