24 Jul, 2006

3 commits

  • Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
    callback_mutex lock.

    It only happens on cpu_exclusive cpusets, due to the dynamic
    sched domain code trying to take the cpu hotplug lock inside
    the cpuset callback_mutex lock.

    This bug has apparently been here for several months, but wasn't
    hit until the right customer workload ran on a large system.

    This fix appears correct from inspection, but it will take a few
    more days of running it on that customer's workload to be
    confident we nailed it. We don't have any other reproducible
    test case.

    The cpu hotplug lock tends to be held across large runs of code.
    The other places that hold both that lock and the cpuset callback
    mutex lock always nest the cpuset lock inside the hotplug lock.
    This place tries to do the reverse, risking an ABBA deadlock.

    This is in the cpuset_rmdir() code, where we:
    * take the callback_mutex lock
    * mark the cpuset CS_REMOVED
    * call update_cpu_domains for cpu_exclusive cpusets
    * in that call, take the cpu_hotplug lock if the cpuset
      is marked for removal.

    Thanks to Jack Steiner for identifying this deadlock.

    The fix is to tear down the dynamic sched domain before we grab
    the cpuset callback_mutex lock. This way, the two locks are
    serialized, with the hotplug lock taken and released before
    trying for the cpuset lock.
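
    A minimal sketch of the ordering bug and the fix, in plain
    userspace C with pthread mutexes standing in for the kernel's
    cpu hotplug lock and cpuset callback_mutex (the function names
    here are illustrative, not the actual kernel code):

        #include <pthread.h>

        static pthread_mutex_t hotplug_lock   = PTHREAD_MUTEX_INITIALIZER;
        static pthread_mutex_t callback_mutex = PTHREAD_MUTEX_INITIALIZER;

        /* Every other path nests the cpuset lock inside the hotplug
         * lock: A then B. */
        static void hotplug_path(void)
        {
                pthread_mutex_lock(&hotplug_lock);
                pthread_mutex_lock(&callback_mutex);
                /* ... */
                pthread_mutex_unlock(&callback_mutex);
                pthread_mutex_unlock(&hotplug_lock);
        }

        /* The buggy rmdir path took them in the reverse order, B then
         * A, which can deadlock against hotplug_path(). */
        static void rmdir_path_buggy(void)
        {
                pthread_mutex_lock(&callback_mutex);
                pthread_mutex_lock(&hotplug_lock);      /* ABBA */
                pthread_mutex_unlock(&hotplug_lock);
                pthread_mutex_unlock(&callback_mutex);
        }

        /* The fix: take and release the hotplug lock (tearing down
         * the dynamic sched domain) before ever taking the cpuset
         * lock, so the two locks are never held in reverse order. */
        static void rmdir_path_fixed(void)
        {
                pthread_mutex_lock(&hotplug_lock);
                /* tear down the dynamic sched domain */
                pthread_mutex_unlock(&hotplug_lock);

                pthread_mutex_lock(&callback_mutex);
                /* mark the cpuset CS_REMOVED, finish removal */
                pthread_mutex_unlock(&callback_mutex);
        }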

    I suspect that this bug was introduced when I changed the
    cpuset locking from one lock to two. The dynamic sched domain
    dependency on cpu_exclusive cpusets and its hotplug hooks were
    added to this code earlier, when cpusets had only a single lock.
    It may well have been fine then.

    Signed-off-by: Paul Jackson
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • The CPU hotplug locking was quite messy, with a recursive lock
    to handle the fact that the actual up/down sequence wanted to
    protect itself from being re-entered, while the callbacks it
    called also tended to want to protect themselves from CPU events.

    This splits the lock into two (one to serialize the whole hotplug
    sequence, the other to protect against the CPU present bitmaps
    changing). The latter still allows recursive usage because some
    subsystems (ondemand policy for cpufreq at least) had already gotten
    too used to the lax locking, but the locking mistakes are hopefully
    now less fundamental, and we now warn about recursive lock usage
    when we see it, in the hope that it can be fixed.
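
    A rough sketch of the two-lock scheme, again in userspace C with
    pthreads; the lock names and the recursion bookkeeping here are
    assumptions for illustration, not the kernel's actual
    identifiers:

        #include <pthread.h>
        #include <stdio.h>

        /* Lock 1: serializes an entire CPU up/down sequence. */
        static pthread_mutex_t hotplug_sequence_lock =
                PTHREAD_MUTEX_INITIALIZER;

        /* Lock 2: protects readers of the CPU present bitmaps.
         * Recursive use by the same thread is tolerated, but warned
         * about so it can eventually be fixed. */
        static pthread_mutex_t cpu_bitmask_lock =
                PTHREAD_MUTEX_INITIALIZER;
        static pthread_t bitmask_owner;
        static int bitmask_depth;

        static void lock_cpu_hotplug(void)
        {
                /* Only the current owner can observe a nonzero depth
                 * with itself as owner, so the unlocked check is
                 * benign; it mirrors the lax style being tolerated. */
                if (bitmask_depth &&
                    pthread_equal(bitmask_owner, pthread_self())) {
                        fprintf(stderr,
                                "warning: recursive lock_cpu_hotplug()\n");
                        bitmask_depth++;
                        return;
                }
                pthread_mutex_lock(&cpu_bitmask_lock);
                bitmask_owner = pthread_self();
                bitmask_depth = 1;
        }

        static void unlock_cpu_hotplug(void)
        {
                if (--bitmask_depth == 0)
                        pthread_mutex_unlock(&cpu_bitmask_lock);
        }

        /* An up/down sequence holds lock 1 for its whole duration and
         * takes lock 2 only around the bitmap updates. */
        static void cpu_down_sequence(void)
        {
                pthread_mutex_lock(&hotplug_sequence_lock);
                lock_cpu_hotplug();
                /* update the cpu present/online bitmaps */
                unlock_cpu_hotplug();
                /* run callbacks, free per-cpu resources, etc. */
                pthread_mutex_unlock(&hotplug_sequence_lock);
        }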

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Shutting down the ondemand policy was fraught with potential
    problems, causing issues for SMP suspend (which wants to
    hot-unplug all but the last CPU).

    This should fix at least the worst problems (divide-by-zero
    and infinite wait for the workqueue to shut down).
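
    A hedged sketch of the two failure modes and their guards, in
    userspace C; the names are made up for illustration, and this is
    not the ondemand code itself, which uses kernel workqueues rather
    than a raw thread:

        #include <pthread.h>
        #include <stdatomic.h>
        #include <unistd.h>

        static atomic_int shutting_down;
        static unsigned int sampling_rate_us = 100000;

        /* The periodic sampler re-arms itself, like a self-queueing
         * delayed work item. */
        static void *sampling_worker(void *arg)
        {
                (void)arg;
                while (!atomic_load(&shutting_down)) {
                        unsigned int rate = sampling_rate_us;
                        if (rate == 0)          /* guard the division */
                                break;
                        /* load = busy_time / rate; ... */
                        usleep(rate);
                }
                return NULL;
        }

        /* Stop the re-arm loop *before* waiting for the worker, so
         * shutdown cannot wait forever on work that keeps
         * rescheduling itself. */
        static void stop_sampling(pthread_t worker)
        {
                atomic_store(&shutting_down, 1);
                pthread_join(worker, NULL);
        }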

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

22 Jul, 2006

37 commits