Commit 5d18023294abc22984886bd7185344e0c2be0daf

Authored by Peter Zijlstra
Committed by Paul E. McKenney
1 parent 0d8ee37e2f

sched: Fix load avg vs cpu-hotplug

Rakib and Paul reported two different issues related to the same few
lines of code.

Rakib's issue is that the nr_uninterruptible migration code is wrong,
in that he sees artifacts due to it (Rakib, please expand in more
detail).

Paul's issue is that this code, as it stands, relies on us using
stop_machine() for unplug; we would all like to remove this assumption
so that eventually we can drop this stop_machine() usage altogether.

The only reason we'd have to migrate nr_uninterruptible is so that we
could use for_each_online_cpu() loops instead of
for_each_possible_cpu() loops; however, since nr_uninterruptible() is
the only such loop and it already uses the possible mask (sketched
below), let's not bother at all.
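
For context, a sketch of what nr_uninterruptible() looked like around
this time; this is from memory and not part of this diff, so treat the
details as approximate:

	unsigned long nr_uninterruptible(void)
	{
		unsigned long i, sum = 0;

		/*
		 * Sum over the possible mask, so residual counts left
		 * on offlined CPUs still get accounted.
		 */
		for_each_possible_cpu(i)
			sum += cpu_rq(i)->nr_uninterruptible;

		/* The lockless reads can race; never report negative. */
		if (unlikely((long)sum < 0))
			sum = 0;

		return sum;
	}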

The problem Rakib sees is (probably) caused by the fact that by
migrating nr_uninterruptible we screw up rq->calc_load_active for both
rqs involved.

So don't bother with fancy migration schemes (meaning we now have to
keep using for_each_possible_cpu()) and instead fold any nr_active delta
after we migrate all tasks away to make sure we don't have any skewed
nr_active accounting.
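
Concretely, the fold reuses the existing calc_load_fold_active()
helper against the dead rq; a sketch of that helper as it stood at the
time (again from memory, not part of this diff):

	static long calc_load_fold_active(struct rq *this_rq)
	{
		long nr_active, delta = 0;

		nr_active = this_rq->nr_running;
		nr_active += (long)this_rq->nr_uninterruptible;

		/* Publish only the change since the last fold. */
		if (nr_active != this_rq->calc_load_active) {
			delta = nr_active - this_rq->calc_load_active;
			this_rq->calc_load_active = nr_active;
		}

		return delta;
	}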

[ paulmck: Move call to calc_load_migrate() to CPU_DEAD to avoid
miscounting noted by Rakib. ]

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>

Showing 1 changed file with 20 additions and 21 deletions

@@ -5304,30 +5304,20 @@
 }
 
 /*
- * While a dead CPU has no uninterruptible tasks queued at this point,
- * it might still have a nonzero ->nr_uninterruptible counter, because
- * for performance reasons the counter is not strictly tracking tasks to
- * their home CPUs. So we just add the counter to another CPU's counter,
- * to keep the global sum constant after CPU-down:
+ * Since this CPU is going 'away' for a while, fold any nr_active delta
+ * we might have. Assumes we're called after migrate_tasks() so that the
+ * nr_active count is stable.
+ *
+ * Also see the comment "Global load-average calculations".
 */
-static void migrate_nr_uninterruptible(struct rq *rq_src)
+static void calc_load_migrate(struct rq *rq)
 {
-	struct rq *rq_dest = cpu_rq(cpumask_any(cpu_active_mask));
-
-	rq_dest->nr_uninterruptible += rq_src->nr_uninterruptible;
-	rq_src->nr_uninterruptible = 0;
+	long delta = calc_load_fold_active(rq);
+	if (delta)
+		atomic_long_add(delta, &calc_load_tasks);
 }
 
 /*
- * remove the tasks which were accounted by rq from calc_load_tasks.
- */
-static void calc_global_load_remove(struct rq *rq)
-{
-	atomic_long_sub(rq->calc_load_active, &calc_load_tasks);
-	rq->calc_load_active = 0;
-}
-
-/*
  * Migrate all tasks from the rq, sleeping tasks will be migrated by
  * try_to_wake_up()->select_task_rq().
  *
@@ -5617,9 +5607,18 @@
 		migrate_tasks(cpu);
 		BUG_ON(rq->nr_running != 1); /* the migration thread */
 		raw_spin_unlock_irqrestore(&rq->lock, flags);
+		break;
 
-		migrate_nr_uninterruptible(rq);
-		calc_global_load_remove(rq);
+	case CPU_DEAD:
+	{
+		struct rq *dest_rq;
+
+		local_irq_save(flags);
+		dest_rq = cpu_rq(smp_processor_id());
+		raw_spin_lock(&dest_rq->lock);
+		calc_load_migrate(rq);
+		raw_spin_unlock_irqrestore(&dest_rq->lock, flags);
+	}
 		break;
 #endif
 	}