Commit 901608d9045146aec6f14a7777ea4b1501c379f0

Authored by Oleg Nesterov
Committed by Linus Torvalds
1 parent 67d58ac47d

mm: introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting

xacct_add_tsk() relies on do_exit()->update_hiwater_xxx() and uses
mm->hiwater_xxx directly. This leads to two problems:

- taskstats_user_cmd() can call fill_pid()->xacct_add_tsk() at any
  moment before the task exits, so we should check the current values of
  rss/vm anyway.

- do_exit()->update_hiwater_xxx() calls are racy.  An exiting thread can
  be preempted right before mm->hiwater_xxx = new_val, and another thread
  can use A_LOT of memory and exit in between.  When the first thread
  resumes it can be the last thread in the thread group, in that case we
  report the wrong hiwater_xxx values which do not take A_LOT into
  account.
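
For illustration, the interleaving in the second problem can be replayed
deterministically in user space.  This is a sketch, not kernel code:
struct mm_model, hiwater_check(), hiwater_store() and replay_race() are
hypothetical names, and splitting the update into a separate check and
store is an assumption made only to expose the preemption window.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the fields involved; units are pages. */
struct mm_model {
	unsigned long rss;         /* current resident pages, stands in for get_mm_rss(mm) */
	unsigned long hiwater_rss; /* recorded high-water mark */
};

/* Step 1 of the exit-time update: sample rss and decide whether to store. */
static bool hiwater_check(const struct mm_model *mm, unsigned long *new_val)
{
	*new_val = mm->rss;
	return mm->hiwater_rss < *new_val;
}

/* Step 2: the store.  Preemption between step 1 and step 2 is the window. */
static void hiwater_store(struct mm_model *mm, unsigned long new_val)
{
	mm->hiwater_rss = new_val;
}

/*
 * Replay the interleaving sequentially and return the mark that ends up
 * recorded.  The true peak is start_rss + a_lot.
 */
static unsigned long replay_race(unsigned long start_rss, unsigned long a_lot)
{
	struct mm_model mm = { .rss = start_rss, .hiwater_rss = 0 };
	unsigned long stale, b_val;

	/* Thread A starts exiting: samples rss, decides to store, is preempted. */
	bool a_will_store = hiwater_check(&mm, &stale);

	/* Thread B uses A_LOT of memory, records the peak, frees it and exits. */
	mm.rss += a_lot;
	if (hiwater_check(&mm, &b_val))
		hiwater_store(&mm, b_val);
	mm.rss -= a_lot;

	/* Thread A resumes and overwrites the mark with its stale sample. */
	if (a_will_store)
		hiwater_store(&mm, stale);

	return mm.hiwater_rss;
}
```

In this model replay_race(100, 1000) ends with a recorded mark of 100
pages although the peak was 1100, which is the loss described above.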

Introduce get_mm_hiwater_rss() and get_mm_hiwater_vm() helpers and change
xacct_add_tsk() to use them.  The first helper will also be used by
rusage->ru_maxrss accounting.
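
A minimal user-space sketch of what the helpers compute, assuming a
hypothetical struct mm_model in place of mm_struct and a plain rss field
in place of get_mm_rss(mm).  The model_* functions are illustrative
names, not kernel API; the kernel versions are macros built on max(),
and the page size is passed in explicitly here rather than taken from
PAGE_SIZE.

```c
/* Hypothetical user-space stand-in for mm_struct; all counts are in pages. */
struct mm_model {
	unsigned long rss;         /* current resident pages, stands in for get_mm_rss(mm) */
	unsigned long total_vm;    /* current mapped pages */
	unsigned long hiwater_rss; /* last recorded rss peak */
	unsigned long hiwater_vm;  /* last recorded total_vm peak */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* The recorded peak may lag behind the current value, so report the larger. */
static unsigned long model_hiwater_rss(const struct mm_model *mm)
{
	return max_ul(mm->hiwater_rss, mm->rss);
}

static unsigned long model_hiwater_vm(const struct mm_model *mm)
{
	return max_ul(mm->hiwater_vm, mm->total_vm);
}

/* Pages to KB, mirroring the "* PAGE_SIZE / KB" step in xacct_add_tsk(). */
static unsigned long model_pages_to_kb(unsigned long pages, unsigned long page_size)
{
	return pages * page_size / 1024;
}
```

With hiwater_rss = 100 and rss = 150 the rss helper reports 150, so a
reader that samples the task before any update_hiwater_xxx() call has
run still sees the current usage reflected in the peak.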

Kill do_exit()->update_hiwater_xxx() calls.  Unless we are going to
decrease rss/vm there is no point in updating mm->hiwater_xxx, and nobody
can look at this mm_struct when exit_mmap() actually unmaps the memory.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 4 changed files with 7 additions and 7 deletions

include/linux/sched.h
@@ -386,6 +386,9 @@
 		(mm)->hiwater_vm = (mm)->total_vm;	\
 } while (0)

+#define get_mm_hiwater_rss(mm)	max((mm)->hiwater_rss, get_mm_rss(mm))
+#define get_mm_hiwater_vm(mm)	max((mm)->hiwater_vm, (mm)->total_vm)
+
 extern void set_dumpable(struct mm_struct *mm, int value);
 extern int get_dumpable(struct mm_struct *mm);

@@ -1051,10 +1051,7 @@
 				preempt_count());

 	acct_update_integrals(tsk);
-	if (tsk->mm) {
-		update_hiwater_rss(tsk->mm);
-		update_hiwater_vm(tsk->mm);
-	}
+
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		hrtimer_cancel(&tsk->signal->real_timer);
@@ -92,8 +92,8 @@
 	mm = get_task_mm(p);
 	if (mm) {
 		/* adjust to KB unit */
-		stats->hiwater_rss = mm->hiwater_rss * PAGE_SIZE / KB;
-		stats->hiwater_vm = mm->hiwater_vm * PAGE_SIZE / KB;
+		stats->hiwater_rss = get_mm_hiwater_rss(mm) * PAGE_SIZE / KB;
+		stats->hiwater_vm = get_mm_hiwater_vm(mm) * PAGE_SIZE / KB;
 		mmput(mm);
 	}
 	stats->read_char = p->ioac.rchar;
@@ -2102,7 +2102,7 @@
 	lru_add_drain();
 	flush_cache_mm(mm);
 	tlb = tlb_gather_mmu(mm, 1);
-	/* Don't update_hiwater_rss(mm) here, do_exit already did */
+	/* update_hiwater_rss(mm) here? but nobody should be looking */
 	/* Use -1 here to ensure all VMAs in the mm are unmapped */
 	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
 	vm_unacct_memory(nr_accounted);