Commit 6a01f8dd2508cf79abbdccc44a6a41b2e17fb3cb

Authored by David Rientjes
Committed by Jiri Slaby
1 parent 1d08848674

mm, thp: only collapse hugepages to nodes with affinity for zone_reclaim_mode

commit 14a4e2141e24304fff2c697be6382ffb83888185 upstream.

Commit 9f1b868a13ac ("mm: thp: khugepaged: add policy for finding target
node") improved the previous khugepaged logic which allocated a
transparent hugepage from the node of the first page being collapsed.

However, it is still possible to collapse pages into a huge page on a
remote node, which may suffer from additional access latency.  With the
current policy, as many as 255 pages (with PAGE_SHIFT == 12) can be
collapsed remotely if the majority of the pages in the range were
allocated from that remote node.
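
To make the arithmetic concrete, a back-of-the-envelope sketch (not part
of the patch; it assumes a 2 MB THP built from 4 KB base pages, i.e. 512
PTEs per PMD, as on x86):

#include <stdio.h>

int main(void)
{
	int pages_per_thp = 1 << 9;          /* 512 base pages per 2 MB THP */
	int remote = pages_per_thp / 2 + 1;  /* bare majority on a remote node */
	int local  = pages_per_thp - remote; /* 255 pages that were local */

	/*
	 * Under the old policy the collapse target follows the majority,
	 * so those 255 locally allocated pages end up inside a huge page
	 * on the remote node.
	 */
	printf("remote=%d local=%d\n", remote, local);
	return 0;
}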

When zone_reclaim_mode is enabled, the VM should make every attempt to
allocate locally to prevent NUMA performance degradation.  In this case,
we do not want to collapse hugepages to remote nodes that would suffer
from increased access latency.  Thus, when zone_reclaim_mode is enabled,
only allow collapsing to nodes at a distance of RECLAIM_DISTANCE or
less.
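
For illustration only (not part of the patch), a minimal user-space
sketch of that acceptance rule, using a made-up four-node distance table
and the kernel's default RECLAIM_DISTANCE of 30; the zone_reclaim_mode
gate and the early-accept of already-counted nodes are omitted here for
brevity (the real check is khugepaged_scan_abort() in the diff below):

#include <stdbool.h>
#include <stdio.h>

#define NR_NODES          4
#define RECLAIM_DISTANCE  30    /* kernel default */

/* Made-up SLIT-style distances: nodes 0/1 are close, 2/3 are remote. */
static const int distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 40, 40 },
	{ 20, 10, 40, 40 },
	{ 40, 40, 10, 20 },
	{ 40, 40, 20, 10 },
};

/* Pages seen so far in the current PMD scan, per node. */
static int node_load[NR_NODES];

/* Refuse to mix nodes farther apart than RECLAIM_DISTANCE. */
static bool scan_abort(int nid)
{
	for (int i = 0; i < NR_NODES; i++) {
		if (!node_load[i])
			continue;
		if (distance[nid][i] > RECLAIM_DISTANCE)
			return true;
	}
	return false;
}

int main(void)
{
	node_load[0] = 100;  /* pages already counted on node 0 */
	printf("node 1: %s\n", scan_abort(1) ? "abort" : "ok");  /* ok, distance 20 */
	printf("node 2: %s\n", scan_abort(2) ? "abort" : "ok");  /* abort, distance 40 */
	return 0;
}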

There is no functional change for systems that disable
zone_reclaim_mode.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>

Showing 1 changed file with 26 additions and 0 deletions

@@ -2191,6 +2191,30 @@
 
 static int khugepaged_node_load[MAX_NUMNODES];
 
+static bool khugepaged_scan_abort(int nid)
+{
+	int i;
+
+	/*
+	 * If zone_reclaim_mode is disabled, then no extra effort is made to
+	 * allocate memory locally.
+	 */
+	if (!zone_reclaim_mode)
+		return false;
+
+	/* If there is a count for this node already, it must be acceptable */
+	if (khugepaged_node_load[nid])
+		return false;
+
+	for (i = 0; i < MAX_NUMNODES; i++) {
+		if (!khugepaged_node_load[i])
+			continue;
+		if (node_distance(nid, i) > RECLAIM_DISTANCE)
+			return true;
+	}
+	return false;
+}
+
 #ifdef CONFIG_NUMA
 static int khugepaged_find_target_node(void)
 {
@@ -2507,6 +2531,8 @@
 		 * hit record.
 		 */
 		node = page_to_nid(page);
+		if (khugepaged_scan_abort(node))
+			goto out_unmap;
 		khugepaged_node_load[node]++;
 		VM_BUG_ON(PageCompound(page));
 		if (!PageLRU(page) || PageLocked(page) || !PageAnon(page))