Commit 117aad1e9e4d97448d1df3f84b08bd65811e6d6a

Authored by Rafael Aquini
Committed by Linus Torvalds
1 parent 0772dac1dc

mm: avoid reinserting isolated balloon pages into LRU lists

Isolated balloon pages can wrongly end up in LRU lists when
migrate_pages() finishes its round without draining all the isolated
page list.

The same issue can happen when reclaim_clean_pages_from_list() tries to
reclaim pages from an isolated page list, before migration, in the CMA
path.  Such balloon page leak opens a race window against LRU lists
shrinkers that leads us to the following kernel panic:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
  IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
  PGD 3cda2067 PUD 3d713067 PMD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  RIP: shrink_page_list+0x24e/0x897
  RSP: 0000:ffff88003da499b8  EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5
  RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40
  RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45
  R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40
  R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58
  FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0
  Call Trace:
    shrink_inactive_list+0x240/0x3de
    shrink_lruvec+0x3e0/0x566
    __shrink_zone+0x94/0x178
    shrink_zone+0x3a/0x82
    balance_pgdat+0x32a/0x4c2
    kswapd+0x2f0/0x372
    kthread+0xa2/0xaa
    ret_from_fork+0x7c/0xb0
  Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
  RIP  [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
   RSP <ffff88003da499b8>
  CR2: 0000000000000028
  ---[ end trace 703d2451af6ffbfd ]---
  Kernel panic - not syncing: Fatal exception

This patch fixes the issue, by assuring the proper tests are made at
putback_movable_pages() & reclaim_clean_pages_from_list() to avoid
isolated balloon pages being wrongly reinserted in LRU lists.

[akpm@linux-foundation.org: clarify awkward comment text]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 3 changed files with 29 additions and 2 deletions Side-by-side Diff

include/linux/balloon_compaction.h
... ... @@ -159,6 +159,26 @@
159 159 }
160 160  
161 161 /*
  162 + * isolated_balloon_page - identify an isolated balloon page on private
  163 + * compaction/migration page lists.
  164 + *
  165 + * After a compaction thread isolates a balloon page for migration, it raises
  166 + * the page refcount to prevent concurrent compaction threads from re-isolating
  167 + * the same page. For that reason putback_movable_pages(), or other routines
  168 + * that need to identify isolated balloon pages on private pagelists, cannot
  169 + * rely on balloon_page_movable() to accomplish the task.
  170 + */
  171 +static inline bool isolated_balloon_page(struct page *page)
  172 +{
  173 + /* Already isolated balloon pages, by default, have a raised refcount */
  174 + if (page_flags_cleared(page) && !page_mapped(page) &&
  175 + page_count(page) >= 2)
  176 + return __is_movable_balloon_page(page);
  177 +
  178 + return false;
  179 +}
  180 +
  181 +/*
162 182 * balloon_page_insert - insert a page into the balloon's page list and make
163 183 * the page->mapping assignment accordingly.
164 184 * @page : page to be assigned as a 'balloon page'
... ... @@ -239,6 +259,11 @@
239 259 }
240 260  
241 261 static inline bool balloon_page_movable(struct page *page)
  262 +{
  263 + return false;
  264 +}
  265 +
  266 +static inline bool isolated_balloon_page(struct page *page)
242 267 {
243 268 return false;
244 269 }
... ... @@ -107,7 +107,7 @@
107 107 list_del(&page->lru);
108 108 dec_zone_page_state(page, NR_ISOLATED_ANON +
109 109 page_is_file_cache(page));
110   - if (unlikely(balloon_page_movable(page)))
  110 + if (unlikely(isolated_balloon_page(page)))
111 111 balloon_page_putback(page);
112 112 else
113 113 putback_lru_page(page);
... ... @@ -48,6 +48,7 @@
48 48 #include <asm/div64.h>
49 49  
50 50 #include <linux/swapops.h>
  51 +#include <linux/balloon_compaction.h>
51 52  
52 53 #include "internal.h"
53 54  
... ... @@ -1113,7 +1114,8 @@
1113 1114 LIST_HEAD(clean_pages);
1114 1115  
1115 1116 list_for_each_entry_safe(page, next, page_list, lru) {
1116   - if (page_is_file_cache(page) && !PageDirty(page)) {
  1117 + if (page_is_file_cache(page) && !PageDirty(page) &&
  1118 + !isolated_balloon_page(page)) {
1117 1119 ClearPageActive(page);
1118 1120 list_move(&page->lru, &clean_pages);
1119 1121 }