Commit a1e565aa3cfc7c6252cabc93de8391d12b9216aa

Authored by Tang Chen
Committed by Linus Torvalds
1 parent d822b86a99

memory-hotplug: do not allocate pgdat if it was not freed when offline.

Since there is no way to guarantee that the address of pgdat/zone is not on
the stack of any kernel thread or used by other kernel objects without
reference counting or another synchronization method, we cannot reset
node_data and free pgdat when offlining a node.  Just reset pgdat to 0
and reuse the memory when the node is onlined again.

The problem is suggested by Kamezawa Hiroyuki.  The idea is from Wen
Congyang.

NOTE: If we don't reset pgdat to 0, the WARN_ON in free_area_init_node()
      will be triggered.

[akpm@linux-foundation.org: fix warning when CONFIG_NEED_MULTIPLE_NODES=n]
[akpm@linux-foundation.org: fix the warning again again]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
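
The reuse-instead-of-free idea described in the commit message can be modeled in user space roughly as follows. This is only a sketch under stated assumptions, not kernel code: `struct node_desc`, `node_table`, `node_is_online`, `model_hotadd()` and `model_offline()` are hypothetical names standing in for `pg_data_t`, `NODE_DATA(nid)`, the node online mask, and the hot-add/offline paths. Where the zeroing happens in the model is an assumption too, since the diff on this page does not show that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 4	/* hypothetical limit for this model */

/* Hypothetical stand-in for pg_data_t. */
struct node_desc {
	int node_id;
	unsigned long start_pfn;
};

/* Plays the role of NODE_DATA(nid): one descriptor pointer per node. */
static struct node_desc *node_table[MAX_NODES];
static bool node_is_online[MAX_NODES];

/*
 * Bring a node up. Reuse the descriptor left behind by an earlier
 * offline; allocate only if the node never had one.
 */
static struct node_desc *model_hotadd(int nid, unsigned long start_pfn)
{
	struct node_desc *nd;

	assert(nid >= 0 && nid < MAX_NODES);

	nd = node_table[nid];
	if (!nd) {
		nd = calloc(1, sizeof(*nd));
		if (!nd)
			return NULL;
		node_table[nid] = nd;
	}

	nd->node_id = nid;
	nd->start_pfn = start_pfn;
	node_is_online[nid] = true;
	return nd;
}

/*
 * Take a node down. The descriptor is zeroed but never freed, so a
 * stale pointer held elsewhere still refers to valid memory.
 */
static void model_offline(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);

	if (node_table[nid])
		memset(node_table[nid], 0, sizeof(*node_table[nid]));
	node_is_online[nid] = false;
}
```

In this model, "descriptor exists" and "node is online" are two separate pieces of state, which mirrors the split between `new_pgdat` and `new_node` in the diff: a node can be offline while its descriptor is still allocated.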

Showing 1 changed file with 16 additions and 8 deletions

@@ -1017,11 +1017,14 @@
 	unsigned long zholes_size[MAX_NR_ZONES] = {0};
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 
-	pgdat = arch_alloc_nodedata(nid);
-	if (!pgdat)
-		return NULL;
+	pgdat = NODE_DATA(nid);
+	if (!pgdat) {
+		pgdat = arch_alloc_nodedata(nid);
+		if (!pgdat)
+			return NULL;
 
-	arch_refresh_nodedata(nid, pgdat);
+		arch_refresh_nodedata(nid, pgdat);
+	}
 
 	/* we can use NODE_DATA(nid) from here */
 
@@ -1074,7 +1077,8 @@
 int __ref add_memory(int nid, u64 start, u64 size)
 {
 	pg_data_t *pgdat = NULL;
-	int new_pgdat = 0;
+	bool new_pgdat;
+	bool new_node;
 	struct resource *res;
 	int ret;
 
 
@@ -1085,12 +1089,16 @@
 	if (!res)
 		goto out;
 
-	if (!node_online(nid)) {
+	{	/* Stupid hack to suppress address-never-null warning */
+		void *p = NODE_DATA(nid);
+		new_pgdat = !p;
+	}
+	new_node = !node_online(nid);
+	if (new_node) {
 		pgdat = hotadd_new_pgdat(nid, start);
 		ret = -ENOMEM;
 		if (!pgdat)
 			goto error;
-		new_pgdat = 1;
 	}
 
 	/* call arch's memory hotadd */
@@ -1102,7 +1110,7 @@
 	/* we online node here. we can't roll back from here. */
 	node_set_online(nid);
 
-	if (new_pgdat) {
+	if (new_node) {
 		ret = register_one_node(nid);
 		/*
 		 * If sysfs file of new node can't create, cpu on the node