Commit 561920a0d2bb6d63343e83acfd784c0a77bd28d1

Authored by Suresh Siddha
Committed by Jens Axboe
1 parent e78042e5b8

generic-ipi: fix the smp_mb() placement

smp_mb() is needed on the sender before sending the IPI (to make the memory
operations globally visible), and the receiver (on Alpha at least) needs
smp_read_barrier_depends() in the handler before reading the call_single_queue
list in a lock-free fashion.
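
For illustration only, the ordering described above can be sketched in user
space with C11 atomics and pthreads. This is not the kernel code: struct
request, queue_head, doorbell, send_request(), receiver() and run_demo() are
all hypothetical names, a seq_cst fence stands in for smp_mb(), a flag store
stands in for the IPI, and an acquire fence is used as a stronger stand-in for
smp_read_barrier_depends(), which has no direct C11 equivalent.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct request {
	int payload;
	struct request *next;
};

static _Atomic(struct request *) queue_head;	/* plays the role of the queue */
static atomic_int doorbell;			/* plays the role of the IPI */

/* Sender: enqueue, full fence, then "send the IPI". */
static void send_request(struct request *req, int value)
{
	req->payload = value;
	req->next = NULL;
	atomic_store_explicit(&queue_head, req, memory_order_relaxed);

	/* Analogue of smp_mb(): order the enqueue before the notification. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&doorbell, 1, memory_order_relaxed);
}

/* Receiver: runs once the "IPI" arrives, then reads the queue lock-free. */
static void *receiver(void *out)
{
	struct request *req;

	while (!atomic_load_explicit(&doorbell, memory_order_relaxed))
		;	/* wait for the notification */

	/* Assumption: acquire fence used in place of the dependency barrier. */
	atomic_thread_fence(memory_order_acquire);

	req = atomic_load_explicit(&queue_head, memory_order_relaxed);
	*(int *)out = req ? req->payload : -1;
	return NULL;
}

/* Returns the payload observed by the receiver, or -1 on failure. */
int run_demo(void)
{
	static struct request req;
	pthread_t thread;
	int seen = 0;

	atomic_store(&queue_head, NULL);
	atomic_store(&doorbell, 0);
	if (pthread_create(&thread, NULL, receiver, &seen) != 0)
		return -1;
	send_request(&req, 42);
	pthread_join(thread, NULL);
	return seen;
}
```

In this sketch the fence sits between the enqueue and the notification, which
is the placement the patch adds in the kernel; without it, the C11 model would
allow the receiver to observe the doorbell without the queued request.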

On x86, x2apic mode register accesses for sending IPIs don't have serializing
semantics. So the need for smp_mb() before sending the IPI becomes more
critical in x2apic mode.

Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that
smp_mb() doesn't mean anything on the sender when the IPI receiver is not
doing anything special (like a memory fence) after clearing CSD_FLAG_WAIT.
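
As a rough user-space illustration of a wait loop with no fence per iteration,
the sketch below polls a flag until another thread clears it. It is an
assumption-laden analogue, not the kernel's csd_flag_wait(): FLAG_WAIT, flags,
result, handler(), flag_wait() and run_wait_demo() are hypothetical names, and
the release/acquire pair is there only because C11 requires it to hand over
result safely. The point it shows is that the waiter simply re-reads the flag;
a barrier inside the loop would order the waiter's own accesses but would not
make the remote store arrive any sooner.

```c
#include <pthread.h>
#include <stdatomic.h>

#define FLAG_WAIT 0x01u	/* hypothetical stand-in for CSD_FLAG_WAIT */

static atomic_uint flags;
static int result;

/* Plays the role of the IPI handler: do the work, then clear the flag. */
static void *handler(void *arg)
{
	(void)arg;
	result = 7;
	/* Release so that result is visible once the flag is seen clear. */
	atomic_fetch_and_explicit(&flags, ~FLAG_WAIT, memory_order_release);
	return NULL;
}

/* Waiter: keep re-reading the flag; no extra fence inside the loop. */
static void flag_wait(void)
{
	while (atomic_load_explicit(&flags, memory_order_acquire) & FLAG_WAIT)
		;	/* the kernel loop calls cpu_relax() here */
}

/* Returns the value produced by the handler, or -1 on failure. */
int run_wait_demo(void)
{
	pthread_t thread;
	int seen;

	atomic_store(&flags, FLAG_WAIT);
	result = 0;
	if (pthread_create(&thread, NULL, handler, NULL) != 0)
		return -1;
	flag_wait();
	seen = result;
	pthread_join(thread, NULL);
	return seen;
}
```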

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Showing 1 changed file with 12 additions and 6 deletions

@@ -51,10 +51,6 @@
 {
 	/* Wait for response */
 	do {
-		/*
-		 * We need to see the flags store in the IPI handler
-		 */
-		smp_mb();
 		if (!(data->flags & CSD_FLAG_WAIT))
 			break;
 		cpu_relax();
@@ -76,6 +72,11 @@
 	list_add_tail(&data->list, &dst->list);
 	spin_unlock_irqrestore(&dst->lock, flags);
 
+	/*
+	 * Make the list addition visible before sending the ipi.
+	 */
+	smp_mb();
+
 	if (ipi)
 		arch_send_call_function_single_ipi(cpu);
 
@@ -157,7 +158,7 @@
 	 * Need to see other stores to list head for checking whether
 	 * list is empty without holding q->lock
 	 */
-	smp_mb();
+	smp_read_barrier_depends();
 	while (!list_empty(&q->list)) {
 		unsigned int data_flags;
 
@@ -191,7 +192,7 @@
 		/*
 		 * See comment on outer loop
 		 */
-		smp_mb();
+		smp_read_barrier_depends();
 	}
 }
 
@@ -369,6 +370,11 @@
 	spin_lock_irqsave(&call_function_lock, flags);
 	list_add_tail_rcu(&data->csd.list, &call_function_queue);
 	spin_unlock_irqrestore(&call_function_lock, flags);
+
+	/*
+	 * Make the list addition visible before sending the ipi.
+	 */
+	smp_mb();
 
 	/* Send a message to all CPUs in the map */
 	arch_send_call_function_ipi(mask);