Commit daaf1e68874c078a15ae6ae827751839c4d81739

Authored by KAMEZAWA Hiroyuki
Committed by Linus Torvalds
1 parent 1080d7a303

memcg: handle panic_on_oom=always case

Presently, if panic_on_oom=2, the whole system panics even if the OOM
happened in a constrained situation (such as cpuset or mempolicy).  In
other words, panic_on_oom=2 means panic_on_oom_always.

Currently, memcg doesn't check the panic_on_oom flag.  This patch adds that check.

BTW, how is it useful?

kdump+panic_on_oom=2 is the last-resort tool for investigating what
happened in an OOM-ed system.  When a task is killed, the system recovers
and few hints remain about what happened.  In a mission-critical system,
OOM should never happen.  So panic_on_oom=2+kdump is useful for avoiding
the next OOM by capturing precise information in a snapshot.

TODO:
 - Since memcg is for isolating the system's memory usage, an oom-notifier
   and freeze_at_oom (or rest_at_oom) should be implemented.  Then a
   management daemon can do similar jobs (as kdump) or take a snapshot
   per cgroup.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 3 changed files with 10 additions and 2 deletions

Documentation/cgroups/memory.txt
@@ -182,6 +182,8 @@
 NOTE: Reclaim does not work for the root cgroup, since we cannot set any
 limits on the root cgroup.
 
+Note2: When panic_on_oom is set to "2", the whole system will panic.
+
 2. Locking
 
 The memory controller uses the following hierarchy
@@ -379,7 +381,8 @@
 NOTE1: Enabling/disabling will fail if the cgroup already has other
 cgroups created below it.
 
-NOTE2: This feature can be enabled/disabled per subtree.
+NOTE2: When panic_on_oom is set to "2", the whole system will panic in
+case of an oom event in any cgroup.
 
 7. Soft limits
 
Documentation/sysctl/vm.txt
@@ -573,11 +573,14 @@
 may be not fatal yet.
 
 If this is set to 2, the kernel panics compulsorily even on the
-above-mentioned.
+above-mentioned. Even oom happens under memory cgroup, the whole
+system panics.
 
 The default value is 0.
 1 and 2 are for failover of clustering. Please select either
 according to your policy of failover.
+panic_on_oom=2+kdump gives you very strong tool to investigate
+why oom happens. You can get snapshot.
 
 =============================================================
 
@@ -473,6 +473,8 @@
 	unsigned long points = 0;
 	struct task_struct *p;
 
+	if (sysctl_panic_on_oom == 2)
+		panic("out of memory(memcg). panic_on_oom is selected.\n");
 	read_lock(&tasklist_lock);
 retry:
 	p = select_bad_process(&points, mem);
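
The policy described in the commit message and in the vm.txt hunk can be summarised with a small user-space model. This is only a sketch of the documented behaviour, not kernel code: the enum names and the oom_decide() helper are hypothetical and exist purely for illustration. It assumes that panic_on_oom=1 panics only on a system-wide OOM, and that panic_on_oom=2 panics on every OOM, including one raised inside a memory cgroup (the case this patch adds).

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
enum oom_scope {
	OOM_SCOPE_GLOBAL,	/* the whole system is out of memory */
	OOM_SCOPE_CPUSET,	/* OOM limited to the nodes of a cpuset */
	OOM_SCOPE_MEMPOLICY,	/* OOM limited to the nodes of a mempolicy */
	OOM_SCOPE_MEMCG		/* OOM inside a memory cgroup limit */
};

enum oom_action {
	OOM_ACTION_KILL_TASK,	/* let the OOM killer pick a victim */
	OOM_ACTION_PANIC	/* panic the whole system */
};

/*
 * Model of the documented panic_on_oom semantics:
 *   0: never panic, always kill a task
 *   1: panic only when the OOM is system-wide
 *   2: always panic, whatever the scope of the OOM
 */
enum oom_action oom_decide(int panic_on_oom, enum oom_scope scope)
{
	if (panic_on_oom == 2)
		return OOM_ACTION_PANIC;
	if (panic_on_oom == 1 && scope == OOM_SCOPE_GLOBAL)
		return OOM_ACTION_PANIC;
	return OOM_ACTION_KILL_TASK;
}

/* Convenience wrapper: would this configuration panic on a memcg OOM? */
bool memcg_oom_panics(int panic_on_oom)
{
	return oom_decide(panic_on_oom, OOM_SCOPE_MEMCG) == OOM_ACTION_PANIC;
}
```

In this model, memcg_oom_panics(2) is true while memcg_oom_panics(1) is false, which mirrors the hunk above: the memcg path tests only for the value 2.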