Commit 035688290a740745adf9daff65bceac8b70e8732
Committed by: Greg Kroah-Hartman
Parent: df8ad62006
Exists in: smarct4x-processor-sdk-linux-03.00.00.04 and 2 other branches
locking/qspinlock: Fix spin_is_locked() and spin_unlock_wait()
commit 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa upstream.

Similar to commits:

  51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()")
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")

qspinlock suffers from the fact that the _Q_LOCKED_VAL store is
unordered inside the ACQUIRE of the lock.

While this is not a problem for the regular, mutually exclusive
critical-section usage of spinlocks, it breaks creative locking like:

    spin_lock(A)                spin_lock(B)
    spin_unlock_wait(B)         if (!spin_is_locked(A))
    do_something()                do_something()

Here both CPUs can end up running do_something() at the same time,
because the _Q_LOCKED_VAL store can drop past the spin_unlock_wait() /
spin_is_locked() loads (even on x86!).

To avoid making the normal case slower, add smp_mb()s on the less used
spin_unlock_wait() / spin_is_locked() side of things.

Reported-and-tested-by: Davidlohr Bueso <dave@stgolabs.net>
Reported-by: Giovanni Gherdovich <ggherdovich@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
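[Editor's illustration] To make the "creative locking" pattern above concrete, here is a minimal, hypothetical sketch of the two paths written against the generic spinlock API of this kernel generation (spin_unlock_wait() still exists here). The locks A and B and do_something() follow the diagram in the message; cpu0_path()/cpu1_path() and the body of do_something() are invented for illustration and are not part of the patch.

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(A);
    static DEFINE_SPINLOCK(B);

    static void do_something(void)
    {
            /* work that must never run on both CPUs at once */
    }

    /* CPU0 */
    static void cpu0_path(void)
    {
            spin_lock(&A);
            /* Wait for any critical section under B to finish. */
            spin_unlock_wait(&B);
            do_something();
            spin_unlock(&A);
    }

    /* CPU1 */
    static void cpu1_path(void)
    {
            spin_lock(&B);
            /* Only proceed if no critical section under A is running. */
            if (!spin_is_locked(&A))
                    do_something();
            spin_unlock(&B);
    }

Without the smp_mb() this patch adds, the store that marks A (or B) as locked can still be invisible to the other CPU when that CPU's spin_is_locked()/spin_unlock_wait() loads run, so both paths can reach do_something() at once.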
Showing 1 changed file with 26 additions and 1 deletion
include/asm-generic/qspinlock.h
@@ -27,7 +27,30 @@
  */
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
-        return atomic_read(&lock->val);
+        /*
+         * queued_spin_lock_slowpath() can ACQUIRE the lock before
+         * issuing the unordered store that sets _Q_LOCKED_VAL.
+         *
+         * See both smp_cond_acquire() sites for more detail.
+         *
+         * This however means that in code like:
+         *
+         *   spin_lock(A)             spin_lock(B)
+         *   spin_unlock_wait(B)      spin_is_locked(A)
+         *   do_something()           do_something()
+         *
+         * Both CPUs can end up running do_something() because the store
+         * setting _Q_LOCKED_VAL will pass through the loads in
+         * spin_unlock_wait() and/or spin_is_locked().
+         *
+         * Avoid this by issuing a full memory barrier between the spin_lock()
+         * and the loads in spin_unlock_wait() and spin_is_locked().
+         *
+         * Note that regular mutual exclusion doesn't care about this
+         * delayed store.
+         */
+        smp_mb();
+        return atomic_read(&lock->val) & _Q_LOCKED_MASK;
 }
 
 /**
@@ -107,6 +130,8 @@
  */
 static inline void queued_spin_unlock_wait(struct qspinlock *lock)
 {
+        /* See queued_spin_is_locked() */
+        smp_mb();
         while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
                 cpu_relax();
 }
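[Editor's illustration] For readers wondering how a load can overtake an earlier store even on x86: the failing shape is the classic store-buffering pattern. Below is a small, purely illustrative userspace model using C11 atomics; it is not the kernel code, and every name in it is invented. Without the fences, both threads may read 0, which corresponds to each CPU concluding the other lock is free; uncommenting the seq_cst fence in each thread (the analogue of the smp_mb() added by this patch) forbids that outcome.

    /* Store-buffering litmus test modelling the reordering fixed above. */
    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    static atomic_int locked_A, locked_B;   /* stand-ins for the two lock words */
    static int saw_B, saw_A;

    static void *cpu0(void *arg)
    {
            (void)arg;
            /* "take A": like the _Q_LOCKED_VAL store, not a full barrier */
            atomic_store_explicit(&locked_A, 1, memory_order_release);
            /* atomic_thread_fence(memory_order_seq_cst);   smp_mb() analogue */
            /* "spin_is_locked(B)" / "spin_unlock_wait(B)" load */
            saw_B = atomic_load_explicit(&locked_B, memory_order_acquire);
            return NULL;
    }

    static void *cpu1(void *arg)
    {
            (void)arg;
            atomic_store_explicit(&locked_B, 1, memory_order_release);
            /* atomic_thread_fence(memory_order_seq_cst);   smp_mb() analogue */
            saw_A = atomic_load_explicit(&locked_A, memory_order_acquire);
            return NULL;
    }

    int main(void)
    {
            pthread_t t0, t1;

            pthread_create(&t0, NULL, cpu0, NULL);
            pthread_create(&t1, NULL, cpu1, NULL);
            pthread_join(t0, NULL);
            pthread_join(t1, NULL);

            /*
             * Without the fences, saw_A == 0 && saw_B == 0 is permitted:
             * the outcome that lets both sides run do_something() at once.
             */
            printf("saw_A=%d saw_B=%d\n", saw_A, saw_B);
            return 0;
    }

A single run may not hit the relaxed outcome; the point of the sketch is the memory-model shape, not a reliable reproducer.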