Commit 7e0c5086c172ecf8b0c2ad860b02a586967d17d0

Authored by Thomas Gleixner
1 parent 507e123151

hrtimer: migration: do not check expiry time on current CPU

The timer migration code needs to check whether the expiry time of the
timer is before the programmed clock event expiry time when the timer
is enqueued on another CPU, because we cannot reprogram the timer
device on the other CPU. The current logic checks the expiry time even
when we enqueue on the current CPU, which happens when
get_nohz_load_balancer() returns the current CPU. This can lead to an
endless loop in the expiry check code when the expiry time of the timer
is before the currently programmed next event.

Check whether get_nohz_load_balancer() returns the current CPU and skip
the expiry check if this is the case.

The bug was triggered from the networking code. The patch fixes the
regression http://bugzilla.kernel.org/show_bug.cgi?id=13738
(Soft-Lockup/Race in networking in 2.6.31-rc1+195)

Cc: Arun Bharadwaj <arun@linux.vnet.ibm.com>
Tested-by: Joao Correia <joaomiguelcorreia@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Showing 1 changed file with 13 additions and 2 deletions

@@ -206,8 +206,19 @@
 #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
 	if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu)) {
 		preferred_cpu = get_nohz_load_balancer();
-		if (preferred_cpu >= 0)
-			cpu = preferred_cpu;
+		if (preferred_cpu >= 0) {
+			/*
+			 * We must not check the expiry value when
+			 * preferred_cpu is the current cpu. If base
+			 * != new_base we would loop forever when the
+			 * timer expires before the current programmed
+			 * next timer event.
+			 */
+			if (preferred_cpu != cpu)
+				cpu = preferred_cpu;
+			else
+				preferred_cpu = -1;
+		}
 	}
 #endif

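
The selection logic changed by this hunk can be modelled in user space. The following is a minimal sketch, not kernel code: `struct target`, `select_target()` and the `migrate` flag are hypothetical names introduced here for illustration, and the `pinned`, sysctl and `idle_cpu()` conditions are collapsed into that single flag. It also assumes that the later expiry check is gated on `preferred_cpu >= 0`, which the new code comment implies but the hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct target {
	int cpu;            /* CPU the timer would be queued on */
	bool check_expiry;  /* compare expiry against that CPU's next event? */
};

/*
 * this_cpu: CPU doing the enqueue.
 * balancer: value a get_nohz_load_balancer()-like call returned (< 0: none).
 * migrate:  stands in for !pinned && sysctl enabled && idle_cpu(cpu).
 */
static struct target select_target(int this_cpu, int balancer, bool migrate)
{
	struct target t = { .cpu = this_cpu, .check_expiry = false };
	int preferred_cpu = -1;

	if (migrate) {
		preferred_cpu = balancer;
		if (preferred_cpu >= 0) {
			if (preferred_cpu != t.cpu)
				t.cpu = preferred_cpu;  /* migrate to the remote CPU */
			else
				preferred_cpu = -1;     /* local CPU: no migration */
		}
	}

	/* Assumption: the expiry check only runs for a valid preferred_cpu. */
	t.check_expiry = (preferred_cpu >= 0);
	return t;
}
```

With this model, `select_target(0, 2, true)` picks CPU 2 and requests the expiry check, while `select_target(0, 0, true)` stays on CPU 0 and skips it. The second case is the one the commit message describes as looping before the fix.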