Commit e6cd1e07a185d5f9b0aa75e020df02d3c1c44940

Authored by Milton Miller
Committed by Linus Torvalds
1 parent ef2b4b95a6

call_function_many: fix list delete vs add race

Peter pointed out there was nothing preventing the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.
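
In outline, the problem interleaving looks like this (an illustrative
sketch: the writer-side steps follow the pre-patch code in the diff
below, while the reader-side steps paraphrase the interrupt handler of
this era):

	CPU A: smp_call_function_many()      CPU B: interrupt handler
	-------------------------------      -------------------------------
	write func, info, cpumask            still walking call_function.queue
	smp_wmb();                           from an earlier IPI; the reused
	atomic_set(&data->refs, n);          entry is reachable with no grace
	                                     period in between, so it sees our
	                                     cpumask bit and refs != 0, runs
	                                     func, drops the last ref, and does
	                                     list_del_rcu(&data->csd.list);
	raw_spin_lock_irqsave(...);
	list_add_rcu(&data->csd.list, ...);  <-- races with the delete above
	raw_spin_unlock_irqrestore(...);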

Fix this by not setting refs until we have acquired the lock for the
list.  Take advantage of the wmb() in list_add_rcu() to save an
additional explicit barrier.
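
Concretely, the publish sequence becomes (paraphrasing the post-patch
code in smp_call_function_many(), shown in full in the diff below):

	raw_spin_lock_irqsave(&call_function.lock, flags);
	list_add_rcu(&data->csd.list, &call_function.queue);
	/* the wmb() in list_add_rcu orders func, info and cpumask
	 * ahead of the refs store below */
	atomic_set(&data->refs, cpumask_weight(data->cpumask));
	raw_spin_unlock_irqrestore(&call_function.lock, flags);

A cpu scanning the queue ignores entries whose refs is still zero, so
it cannot take (and later drop) the final reference, and hence cannot
reach the list_del_rcu, until after our list_add_rcu has completed.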

I tried to force this race with a udelay before the lock and list_add,
and by mixing all 64 online cpus with just 3 random cpus in the mask,
but was unsuccessful.  Still, inspection shows a valid race, and the fix
is an extension of the existing protection window in the current code.

Cc: stable@kernel.org (v2.6.32 and later)
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 1 changed file with 13 additions and 7 deletions

@@ -491,15 +491,16 @@
 	cpumask_clear_cpu(this_cpu, data->cpumask);
 
 	/*
-	 * To ensure the interrupt handler gets an complete view
-	 * we order the cpumask and refs writes and order the read
-	 * of them in the interrupt handler. In addition we may
-	 * only clear our own cpu bit from the mask.
+	 * We reuse the call function data without waiting for any grace
+	 * period after some other cpu removes it from the global queue.
+	 * This means a cpu might find our data block as it is written.
+	 * The interrupt handler waits until it sees refs filled out
+	 * while its cpu mask bit is set; here we may only clear our
+	 * own cpu mask bit, and must wait to set refs until we are sure
+	 * previous writes are complete and we have obtained the lock to
+	 * add the element to the queue.
 	 */
-	smp_wmb();
 
-	atomic_set(&data->refs, cpumask_weight(data->cpumask));
-
 	raw_spin_lock_irqsave(&call_function.lock, flags);
 	/*
 	 * Place entry at the _HEAD_ of the list, so that any cpu still
@@ -507,6 +508,11 @@
 	 * will not miss any other list entries:
 	 */
 	list_add_rcu(&data->csd.list, &call_function.queue);
+	/*
+	 * We rely on the wmb() in list_add_rcu to order the writes
+	 * to func, data, and cpumask before this write to refs.
+	 */
+	atomic_set(&data->refs, cpumask_weight(data->cpumask));
 	raw_spin_unlock_irqrestore(&call_function.lock, flags);
 
 	/*
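
For context, the reader side this publish order pairs with, paraphrased
from generic_smp_call_function_interrupt() of the same era (a sketch,
not the verbatim source; the WARN_ON checks and the final csd_unlock
are elided):

	list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
		if (!cpumask_test_cpu(cpu, data->cpumask))
			continue;

		/* order the cpumask read before the refs read */
		smp_rmb();

		if (atomic_read(&data->refs) == 0)
			continue;	/* refs not yet published */

		if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
			continue;

		data->csd.func(data->csd.info);

		if (atomic_dec_return(&data->refs) == 0) {
			/* last cpu out deletes the entry; it may then
			 * be reused immediately, which is why the
			 * writer must publish refs last */
			raw_spin_lock(&call_function.lock);
			list_del_rcu(&data->csd.list);
			raw_spin_unlock(&call_function.lock);
		}
	}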