Commit a4c51bde13ee78405827287ef24f80b1b40d45d7

Authored by Michal Hocko
Committed by Greg Kroah-Hartman
1 parent 5c0f0c017c

mm: exclude memoryless nodes from zone_reclaim

commit 70ef57e6c22c3323dce179b7d0d433c479266612 upstream.

We had a report about strange OOM killer strikes on a PPC machine
although there was a lot of swap free and a tons of anonymous memory
which could be swapped out.  In the end it turned out that the OOM was a
side effect of zone reclaim which wasn't unmapping and swapping out and
so the system was pushed to the OOM.  Although this sounds like a bug
somewhere in the kswapd vs.  zone reclaim vs.  direct reclaim
interaction numactl on the said hardware suggests that the zone reclaim
should not have been set in the first place:

  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 2 cpus:
  node 2 size: 7168 MB
  node 2 free: 6019 MB
  node distances:
  node   0   2
  0:  10  40
  2:  40  10

So all the CPUs are associated with Node0 which doesn't have any memory
while Node2 contains all the available memory.  Node distances cause an
automatic zone_reclaim_mode enabling.

Zone reclaim is intended to keep the allocations local but this doesn't
make any sense on the memoryless nodes.  So let's exclude such nodes for
init_zone_allows_reclaim which evaluates zone reclaim behavior and
suitable reclaim_nodes.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Showing 1 changed file with 3 additions and 2 deletions Side-by-side Diff

... ... @@ -1869,7 +1869,7 @@
1869 1869 {
1870 1870 int i;
1871 1871  
1872   - for_each_online_node(i)
  1872 + for_each_node_state(i, N_MEMORY)
1873 1873 if (node_distance(nid, i) <= RECLAIM_DISTANCE)
1874 1874 node_set(i, NODE_DATA(nid)->reclaim_nodes);
1875 1875 else
... ... @@ -4933,7 +4933,8 @@
4933 4933  
4934 4934 pgdat->node_id = nid;
4935 4935 pgdat->node_start_pfn = node_start_pfn;
4936   - init_zone_allows_reclaim(nid);
  4936 + if (node_state(nid, N_MEMORY))
  4937 + init_zone_allows_reclaim(nid);
4937 4938 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
4938 4939 get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
4939 4940 #endif