Commit 035688290a740745adf9daff65bceac8b70e8732

Authored by Peter Zijlstra
Committed by Greg Kroah-Hartman
1 parent df8ad62006

locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()

commit 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa upstream.

Similar to commits:

  51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()")
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")

qspinlock suffers from the fact that the _Q_LOCKED_VAL store is
unordered inside the ACQUIRE of the lock.

And while this is not a problem for the regular, mutually exclusive
critical-section usage of spinlocks, it breaks creative locking like:

	spin_lock(A)			spin_lock(B)
	spin_unlock_wait(B)		if (!spin_is_locked(A))
	do_something()			  do_something()

Here both CPUs can end up running do_something() at the same time,
because the _Q_LOCKED_VAL store can be delayed past the loads in
spin_unlock_wait() / spin_is_locked() (even on x86!).

Rather than slow down the normal lock path, add an smp_mb() to each of
the less frequently used spin_unlock_wait() and spin_is_locked(), which
closes this race.

Reported-and-tested-by: Davidlohr Bueso <dave@stgolabs.net>
Reported-by: Giovanni Gherdovich <ggherdovich@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
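
The race described in the commit message is the classic store-buffering pattern. As a rough illustration only, the following C sketch models each CPU with a one-entry store buffer and enumerates every interleaving of the two code paths. Everything in it (`run_once`, `both_can_run`, the `STORE`/`FLUSH`/`LOAD` steps) is hypothetical and is not kernel code. A barrier is modeled, as an assumption, as forcing the buffered lock store to become globally visible before the load that follows it.

```c
/* Toy store-buffer model of the spin_lock()/spin_is_locked() race.
 * Hypothetical illustration only; none of these names exist in the kernel. */

enum op { STORE, FLUSH, LOAD };

/* Count the set bits among the low six bits of 'mask'. */
static int count_bits(unsigned mask)
{
	int n = 0;

	for (int i = 0; i < 6; i++)
		n += (mask >> i) & 1;
	return n;
}

/* Run one interleaving. Bit i of 'mask' set means step i is taken by CPU 1,
 * otherwise by CPU 0. Returns 1 if both CPUs saw the other lock as free. */
static int run_once(const enum op *seq0, const enum op *seq1, unsigned mask)
{
	const enum op *seq[2] = { seq0, seq1 };
	int mem[2] = { 0, 0 };     /* globally visible lock words */
	int pending[2] = { 0, 0 }; /* lock store still in the store buffer */
	int saw[2] = { 1, 1 };     /* value each CPU loaded from the other lock */
	int pos[2] = { 0, 0 };

	for (int step = 0; step < 6; step++) {
		int cpu = (mask >> step) & 1;

		switch (seq[cpu][pos[cpu]++]) {
		case STORE: /* own lock word: the store is buffered, not yet visible */
			pending[cpu] = 1;
			break;
		case FLUSH: /* the buffered store reaches memory */
			mem[cpu] = pending[cpu];
			pending[cpu] = 0;
			break;
		case LOAD:  /* read the other CPU's lock word from memory */
			saw[cpu] = mem[1 - cpu];
			break;
		}
	}
	return saw[0] == 0 && saw[1] == 0;
}

/* Returns 1 if some execution lets both CPUs run do_something(). */
int both_can_run(int use_barrier)
{
	static const enum op fenced[3]  = { STORE, FLUSH, LOAD }; /* store visible before load */
	static const enum op delayed[3] = { STORE, LOAD, FLUSH }; /* store drops past the load */
	const enum op *orders[2] = { fenced, delayed };
	int nr_orders = use_barrier ? 1 : 2; /* the barrier rules out 'delayed' */

	for (int a = 0; a < nr_orders; a++)
		for (int b = 0; b < nr_orders; b++)
			for (unsigned mask = 0; mask < 64; mask++) {
				if (count_bits(mask) != 3)
					continue; /* each CPU takes exactly three steps */
				if (run_once(orders[a], orders[b], mask))
					return 1;
			}
	return 0;
}
```

Under these assumptions, `both_can_run(0)` returns 1 (some interleaving lets both CPUs observe the other lock as free), while `both_can_run(1)` returns 0, which matches the reasoning behind adding the barrier on the load side.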

Showing 1 changed file with 26 additions and 1 deletion

include/asm-generic/qspinlock.h
@@ -27,7 +27,30 @@
  */
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
-	return atomic_read(&lock->val);
+	/*
+	 * queued_spin_lock_slowpath() can ACQUIRE the lock before
+	 * issuing the unordered store that sets _Q_LOCKED_VAL.
+	 *
+	 * See both smp_cond_acquire() sites for more detail.
+	 *
+	 * This however means that in code like:
+	 *
+	 *   spin_lock(A)           spin_lock(B)
+	 *   spin_unlock_wait(B)    spin_is_locked(A)
+	 *   do_something()         do_something()
+	 *
+	 * Both CPUs can end up running do_something() because the store
+	 * setting _Q_LOCKED_VAL will pass through the loads in
+	 * spin_unlock_wait() and/or spin_is_locked().
+	 *
+	 * Avoid this by issuing a full memory barrier between the spin_lock()
+	 * and the loads in spin_unlock_wait() and spin_is_locked().
+	 *
+	 * Note that regular mutual exclusion doesn't care about this
+	 * delayed store.
+	 */
+	smp_mb();
+	return atomic_read(&lock->val) & _Q_LOCKED_MASK;
 }
 
 /**
@@ -107,6 +130,8 @@
  */
 static inline void queued_spin_unlock_wait(struct qspinlock *lock)
 {
+	/* See queued_spin_is_locked() */
+	smp_mb();
 	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
 		cpu_relax();
 }
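
For readers outside the kernel tree, the two patched helpers can be approximated in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel implementation: `struct sketch_lock`, `sketch_is_locked()`, `sketch_unlock_wait()` and `SKETCH_LOCKED_MASK` are hypothetical names, a sequentially consistent `atomic_thread_fence()` stands in for `smp_mb()`, and the mask value 0xff assumes the locked flag occupies the low byte of the lock word.

```c
#include <stdatomic.h>

/* Assumption: the locked flag lives in the low byte of the lock word. */
#define SKETCH_LOCKED_MASK 0xffu

struct sketch_lock {
	atomic_uint val;
};

/* Userspace analog of the patched queued_spin_is_locked(). */
static inline unsigned sketch_is_locked(struct sketch_lock *lock)
{
	/* Full fence: order this CPU's earlier lock store before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&lock->val, memory_order_relaxed) &
	       SKETCH_LOCKED_MASK;
}

/* Userspace analog of the patched queued_spin_unlock_wait(). */
static inline void sketch_unlock_wait(struct sketch_lock *lock)
{
	/* Same reasoning as in sketch_is_locked(). */
	atomic_thread_fence(memory_order_seq_cst);
	while (atomic_load_explicit(&lock->val, memory_order_relaxed) &
	       SKETCH_LOCKED_MASK)
		; /* busy-wait; the kernel calls cpu_relax() here */
}
```

Besides the fence, the sketch mirrors the second change in the diff: the return value is masked, so a lock word with only higher bits set (for example 0x100) reads as not locked, whereas the unmasked pre-patch read would have returned nonzero.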