Commit d4abc238c9f4df8b3216f3e883f5d0a07b7ac75a

Authored by Bharath Ravi
Committed by Ingo Molnar
1 parent d819c49da6

sched, delay accounting: fix incorrect delay time when constantly waiting on runqueue

This patch corrects the per-process run-queue wait time reported by
delay statistics. The anomaly arose as follows. When a process leaves
the CPU and immediately starts waiting for the CPU on the runqueue
(which means it remains in the TASK_RUNNING state), the time of re-entry
into the run-queue is never recorded. As a result, the waiting time on
the runqueue from this point of re-entry up to the next time it gets the
CPU is not accounted for. This is solved by recording the re-entry time
in sched_info_depart(), by calling sched_info_queued(), if the process
will go back to waiting on the run-queue. That condition is verified by
checking whether the process is still in the TASK_RUNNING state.
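
The accounting gap described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: the model_*
names, the tick values, and the with_fix switch are hypothetical, and
the logic is loosely modeled on the last_queued/run_delay bookkeeping
that the sched_info helpers perform.

```c
#include <assert.h>

/* Hypothetical userspace model of the run-queue delay bookkeeping. */
enum model_state { MODEL_RUNNING, MODEL_SLEEPING };

struct model_task {
	enum model_state state;
	unsigned long long last_queued;  /* start of current wait, 0 = not waiting */
	unsigned long long last_arrival; /* when the task last got the CPU */
	unsigned long long cpu_time;     /* total time spent on the CPU */
	unsigned long long run_delay;    /* total time spent waiting on the runqueue */
};

/* Record the start of a wait, unless one is already being timed. */
static void model_queued(struct model_task *t, unsigned long long now)
{
	if (!t->last_queued)
		t->last_queued = now;
}

/* The task gets the CPU: charge the wait that just ended, if one was recorded. */
static void model_arrive(struct model_task *t, unsigned long long now)
{
	unsigned long long delta = 0;

	if (t->last_queued) {
		assert(now >= t->last_queued);
		delta = now - t->last_queued;
	}
	t->last_queued = 0;
	t->run_delay += delta;
	t->last_arrival = now;
}

/* The task leaves the CPU; with_fix mirrors the check added by this patch. */
static void model_depart(struct model_task *t, unsigned long long now,
			 int with_fix)
{
	t->cpu_time += now - t->last_arrival;

	if (with_fix && t->state == MODEL_RUNNING)
		model_queued(t, now);
}

/*
 * A CPU hog that never sleeps: each round it waits 10 ticks for the other
 * hog, then runs for 10 ticks. Returns the accumulated run_delay.
 */
static unsigned long long model_hog_run_delay(int rounds, int with_fix)
{
	struct model_task t = { MODEL_RUNNING, 0, 0, 0, 0 };
	unsigned long long now = 1; /* start at 1 because 0 means "not queued" */
	int i;

	model_queued(&t, now); /* initial enqueue */
	for (i = 0; i < rounds; i++) {
		now += 10;
		model_arrive(&t, now);
		now += 10;
		model_depart(&t, now, with_fix);
	}
	return t.run_delay;
}
```

In this model, three rounds without the check account for only the
first 10-tick wait, while three rounds with the check account for all
30 ticks of waiting.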

The patch was tested on 2.6.26-rc6 using two simple CPU hog programs.
The values noted prior to the fix did not account for the time spent on
the runqueue waiting. After the fix, the correct values were reported
back to user space.
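
One way to observe a per-process run-queue wait value from user space is
to read /proc/<pid>/schedstat, whose three fields are time spent on the
CPU, time spent waiting on a runqueue, and the number of timeslices run.
The sketch below assumes that file is available (it depends on kernel
configuration) and is not necessarily how the values above were
collected; the struct and function names are hypothetical, and the
units of the first two fields depend on the kernel version.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one /proc/<pid>/schedstat reading. */
struct schedstat_sample {
	unsigned long long cpu_time;  /* time spent on the CPU */
	unsigned long long run_delay; /* time spent waiting on a runqueue */
	unsigned long timeslices;     /* number of timeslices run */
};

/* Parse one schedstat line; returns 0 on success, -1 on malformed input. */
static int parse_schedstat(const char *line, struct schedstat_sample *out)
{
	int n = sscanf(line, "%llu %llu %lu",
		       &out->cpu_time, &out->run_delay, &out->timeslices);

	return n == 3 ? 0 : -1;
}

/* Read and parse /proc/<pid>/schedstat; returns 0 on success, -1 on error. */
static int read_schedstat(int pid, struct schedstat_sample *out)
{
	char path[64];
	char line[128];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/schedstat", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = parse_schedstat(line, out);
	fclose(f);
	return ret;
}
```

Sampling run_delay for each hog before and after a fixed interval and
comparing the difference with the interval length gives a rough check
that the wait time is being accounted for.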

Signed-off-by: Bharath Ravi <bharathravi1@gmail.com>
Signed-off-by: Madhava K R  <madhavakr@gmail.com>
Cc: dhaval@linux.vnet.ibm.com
Cc: vatsa@in.ibm.com
Cc: balbir@in.ibm.com
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Showing 1 changed file with 6 additions and 0 deletions

kernel/sched_stats.h
@@ -198,6 +198,9 @@
 /*
  * Called when a process ceases being the active-running process, either
  * voluntarily or involuntarily. Now we can calculate how long we ran.
+ * Also, if the process is still in the TASK_RUNNING state, call
+ * sched_info_queued() to mark that it has now again started waiting on
+ * the runqueue.
  */
 static inline void sched_info_depart(struct task_struct *t)
 {
@@ -206,6 +209,9 @@
 
 	t->sched_info.cpu_time += delta;
 	rq_sched_info_depart(task_rq(t), delta);
+
+	if (t->state == TASK_RUNNING)
+		sched_info_queued(t);
 }
 
 /*