Commit e6cd1e07a185d5f9b0aa75e020df02d3c1c44940
Committed by: Linus Torvalds
Parent: ef2b4b95a6
Exists in: master and 20 other branches
call_function_many: fix list delete vs add race
Peter pointed out there was nothing preventing the list_del_rcu in smp_call_function_interrupt from running before the list_add_rcu in smp_call_function_many.

Fix this by not setting refs until we have taken the lock for the list. Take advantage of the wmb in list_add_rcu to save an explicit additional one.

I tried to force this race with a udelay before the lock & list_add and by mixing all 64 online cpus with just 3 random cpus in the mask, but was unsuccessful. Still, inspection shows a valid race, and the fix is an extension of the protection window already present in the current code.

Cc: stable@kernel.org (v2.6.32 and later)
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
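As an illustration only, the ordering the fix establishes can be sketched in userspace C11. Everything below is a hedged analogue, not kernel code: the names (struct call_data, queue_head, queue_publish, nreaders) are invented for this sketch, a pthread mutex stands in for call_function.lock, and an explicit release fence stands in for the wmb() inside list_add_rcu().

/* Userspace C11 analogue of the fixed publish ordering; all names
 * are invented for illustration and do not exist in the kernel. */
#include <pthread.h>
#include <stdatomic.h>

struct call_data {
        void (*func)(void *);
        void *info;
        atomic_int refs;                  /* nonzero only while ready */
        struct call_data *_Atomic next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct call_data *_Atomic queue_head;

static void queue_publish(struct call_data *data,
                          void (*func)(void *), void *info, int nreaders)
{
        /* Plain writes; readers must not act on them until refs != 0. */
        data->func = func;
        data->info = info;

        pthread_mutex_lock(&queue_lock);
        atomic_store_explicit(&data->next,
                        atomic_load_explicit(&queue_head, memory_order_relaxed),
                        memory_order_relaxed);
        atomic_store_explicit(&queue_head, data, memory_order_relaxed);

        /* Plays the role of the wmb() inside list_add_rcu(): the writes
         * to func, info, and next above become visible to any reader
         * that later observes refs != 0. */
        atomic_thread_fence(memory_order_release);
        atomic_store_explicit(&data->refs, nreaders, memory_order_relaxed);
        pthread_mutex_unlock(&queue_lock);
}

The fix is visible in the last three statements of queue_publish(): the element goes onto the queue under the lock first, a barrier follows, and only then does refs become nonzero, so a reader that observes refs != 0 is guaranteed a complete view of the element.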
Showing 1 changed file with 13 additions and 7 deletions
kernel/smp.c

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -491,15 +491,16 @@
 	cpumask_clear_cpu(this_cpu, data->cpumask);
 
 	/*
-	 * To ensure the interrupt handler gets an complete view
-	 * we order the cpumask and refs writes and order the read
-	 * of them in the interrupt handler. In addition we may
-	 * only clear our own cpu bit from the mask.
+	 * We reuse the call function data without waiting for any grace
+	 * period after some other cpu removes it from the global queue.
+	 * This means a cpu might find our data block as it is writen.
+	 * The interrupt handler waits until it sees refs filled out
+	 * while its cpu mask bit is set; here we may only clear our
+	 * own cpu mask bit, and must wait to set refs until we are sure
+	 * previous writes are complete and we have obtained the lock to
+	 * add the element to the queue.
 	 */
-	smp_wmb();
 
-	atomic_set(&data->refs, cpumask_weight(data->cpumask));
-
 	raw_spin_lock_irqsave(&call_function.lock, flags);
 	/*
 	 * Place entry at the _HEAD_ of the list, so that any cpu still
@@ -507,6 +508,11 @@
 	 * will not miss any other list entries:
 	 */
 	list_add_rcu(&data->csd.list, &call_function.queue);
+	/*
+	 * We rely on the wmb() in list_add_rcu to order the writes
+	 * to func, data, and cpumask before this write to refs.
+	 */
+	atomic_set(&data->refs, cpumask_weight(data->cpumask));
 	raw_spin_unlock_irqrestore(&call_function.lock, flags);
 
 	/*
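Continuing the same hedged sketch, the matching reader side (loosely modeled on smp_call_function_interrupt(), minus the per-cpu mask check and the rcu_read_lock()-protected traversal the kernel does) shows why the publication order matters: refs is the gate, so it must be written last.

/* Reader-side half of the userspace sketch above; invented names,
 * not kernel code. The kernel additionally checks its own bit in
 * data->cpumask and walks the queue under rcu_read_lock(). */
static void queue_consume(struct call_data *data)
{
        /* The publisher sets refs last, so refs == 0 means the element
         * is not ready yet (or is already drained): skip it. */
        if (atomic_load_explicit(&data->refs, memory_order_relaxed) == 0)
                return;

        /* Pairs with the release fence in queue_publish(): once refs is
         * seen nonzero, the writes to func and info must be visible. */
        atomic_thread_fence(memory_order_acquire);

        data->func(data->info);

        /* The last reader unlinks the element; in the kernel this is
         * the list_del_rcu() whose ordering against list_add_rcu() the
         * commit fixes. Unlinking is omitted in this sketch. */
        atomic_fetch_sub_explicit(&data->refs, 1, memory_order_acq_rel);
}

With the old ordering, refs could become nonzero before the element was back on the queue, so a straggling reader holding a stale pointer to the reused block could drain refs and issue list_del_rcu() before the new list_add_rcu() had run; setting refs only after the locked insert closes that window.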