Commit 8161239a8bcce9ad6b537c04a1fa3b5c68bae693

Authored by Lai Jiangshan
Committed by Steven Rostedt
1 parent 6fb1b30425

rtmutex: Simplify PI algorithm and make highest prio task get lock

In the current rtmutex, the pending owner may be boosted by the tasks
in the rtmutex's wait list when the pending owner is deboosted
or a task in the wait list is boosted. This boosting is unnecessary,
because the pending owner has not really taken the rtmutex yet,
so it is not reasonable.

Example.

time1:
A (high prio) owns the rtmutex.
B (mid prio) and C (low prio) are in the wait list.

time2:
A releases the lock, and B becomes the pending owner.
A (or another high-prio task) continues to run. B's prio is lower
than A's, so B is just queued on the runqueue.

time3:
A (or the other high-prio task) sleeps, but some time has passed.
B's and C's priorities changed in that period (time2 ~ time3)
due to boosting or deboosting, and C now has a higher priority
than B. Is it reasonable that C has to boost B and help B
get the rtmutex?

No! Such boosting is unrelated and unneeded before B really
owns the rtmutex. We should give C a chance to beat B and
win the rtmutex.

This is the motivation for this patch. The patch *ensures* that
only the top waiter or a higher-priority task can take the lock.

How?
1) We do not dequeue the top waiter on unlock; if the top waiter
   changes, the old top waiter will fail to take the lock and go back
   to sleep.
2) When acquiring the lock, a task gets it if the lock is not taken and:
   there is no waiter, OR it has higher priority than the waiters,
   OR it is the top waiter (see the sketch after this list).
3) Whenever the top waiter changes, the new top waiter is woken up.
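
A minimal standalone model of rule 2), with made-up types rather than the
kernel's (the real check is the new try_to_take_rt_mutex() in the diff
below); lower numerical prio means higher priority, as in the kernel:

    #include <stdbool.h>
    #include <stdio.h>

    struct model_lock {
            bool owned;            /* a real owner holds the lock */
            bool has_waiters;
            int  top_waiter_prio;  /* valid only if has_waiters */
    };

    /* Can a task of priority 'prio' take the lock right now? */
    static bool can_take(const struct model_lock *l, int prio, bool is_top_waiter)
    {
            if (l->owned)
                    return false;      /* a real owner exists */
            if (!l->has_waiters)
                    return true;       /* no waiter at all */
            if (prio < l->top_waiter_prio)
                    return true;       /* higher priority than every waiter */
            return is_top_waiter;      /* otherwise only the top waiter wins */
    }

    int main(void)
    {
            struct model_lock l = { .owned = false, .has_waiters = true,
                                    .top_waiter_prio = 50 };

            printf("%d\n", can_take(&l, 40, false)); /* 1: C outranks waiter B */
            printf("%d\n", can_take(&l, 50, true));  /* 1: B is the top waiter  */
            printf("%d\n", can_take(&l, 60, false)); /* 0: must block and wait  */
            return 0;
    }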

The algorithm is much simpler than before: there is no pending owner
and no boosting of a pending owner.

Other advantages of this patch:
1) The states of an rtmutex are reduced by half, which makes the code
   easier to read (a sketch of the new owner-field encoding follows
   this list).
2) The code becomes shorter.
3) The top waiter is not dequeued until it really takes the lock,
   so waiters retain FIFO order when the lock is stolen.
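
A standalone sketch of the simplified owner-field encoding: only a single
tag bit remains, mirroring rt_mutex_set_owner()/rt_mutex_owner() and
RT_MUTEX_HAS_WAITERS in the diff below (the struct and function names
here are made up):

    #include <stdint.h>
    #include <stdio.h>

    #define HAS_WAITERS 1UL    /* bit 0, like RT_MUTEX_HAS_WAITERS */

    struct task { int prio; };

    /* Pack the owner pointer and the "has waiters" bit into one word. */
    static void *encode_owner(struct task *owner, int has_waiters)
    {
            uintptr_t val = (uintptr_t)owner;

            if (has_waiters)
                    val |= HAS_WAITERS;
            return (void *)val;
    }

    /* Recover the owner pointer, masking off the tag bit. */
    static struct task *decode_owner(void *field)
    {
            return (struct task *)((uintptr_t)field & ~HAS_WAITERS);
    }

    int main(void)
    {
            struct task t = { .prio = 10 };
            void *field = encode_owner(&t, 1);

            printf("%d\n", decode_owner(field)->prio);                    /* 10 */
            printf("%d\n", decode_owner(encode_owner(NULL, 1)) == NULL);  /* 1  */
            return 0;
    }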

Neither advantage nor disadvantage:
1) Even though we may wake up multiple waiters (any time the top waiter
   changes), we hardly cause a "thundering herd";
   the number of woken tasks is likely 1 or very small.
2) Two APIs are changed (a usage sketch follows below).
   rt_mutex_owner()      no longer returns a pending owner; it returns NULL
                         when the top waiter is about to take the lock.
   rt_mutex_next_owner() always returns the top waiter and will not return
                         NULL while there are waiters, because the top
                         waiter is not dequeued.

   I have fixed the code that uses these APIs.
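
   A sketch of the resulting caller pattern (the helper name below is made
   up; the real code is the futex.c fixup hunk in the diff): with the
   rtmutex's wait_lock held, a NULL rt_mutex_owner() means the top waiter
   is about to take the lock, so fall back to rt_mutex_next_owner().

       /* Hypothetical helper; mirrors the futex.c fixup path below. */
       static struct task_struct *owner_or_next_owner(struct rt_mutex *lock)
       {
               struct task_struct *owner;

               raw_spin_lock(&lock->wait_lock);
               owner = rt_mutex_owner(lock);
               if (!owner)
                       owner = rt_mutex_next_owner(lock);
               raw_spin_unlock(&lock->wait_lock);

               return owner;
       }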

Needs updating after this patch is accepted:
1) Documentation/*
2) the test case scripts/rt-tester/t4-l2-pi-deboost.tst

Signed-off-by:  Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4D3012D5.4060709@cn.fujitsu.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Showing 4 changed files with 127 additions and 230 deletions

kernel/futex.c
... ... @@ -1556,10 +1556,10 @@
1556 1556  
1557 1557 /*
1558 1558 * We are here either because we stole the rtmutex from the
1559   - * pending owner or we are the pending owner which failed to
1560   - * get the rtmutex. We have to replace the pending owner TID
1561   - * in the user space variable. This must be atomic as we have
1562   - * to preserve the owner died bit here.
  1559 + * previous highest priority waiter or we are the highest priority
  1560 + * waiter but failed to get the rtmutex the first time.
  1561 + * We have to replace the newowner TID in the user space variable.
  1562 + * This must be atomic as we have to preserve the owner died bit here.
1563 1563 *
1564 1564 * Note: We write the user space value _before_ changing the pi_state
1565 1565 * because we can fault here. Imagine swapped out pages or a fork
... ... @@ -1608,8 +1608,8 @@
1608 1608  
1609 1609 /*
1610 1610 * To handle the page fault we need to drop the hash bucket
1611   - * lock here. That gives the other task (either the pending
1612   - * owner itself or the task which stole the rtmutex) the
  1611 + * lock here. That gives the other task (either the highest priority
  1612 + * waiter itself or the task which stole the rtmutex) the
1613 1613 * chance to try the fixup of the pi_state. So once we are
1614 1614 * back from handling the fault we need to check the pi_state
1615 1615 * after reacquiring the hash bucket lock and before trying to
1616 1616  
1617 1617  
1618 1618  
... ... @@ -1685,18 +1685,20 @@
1685 1685 /*
1686 1686 * pi_state is incorrect, some other task did a lock steal and
1687 1687 * we returned due to timeout or signal without taking the
1688   - * rt_mutex. Too late. We can access the rt_mutex_owner without
1689   - * locking, as the other task is now blocked on the hash bucket
1690   - * lock. Fix the state up.
  1688 + * rt_mutex. Too late.
1691 1689 */
  1690 + raw_spin_lock(&q->pi_state->pi_mutex.wait_lock);
1692 1691 owner = rt_mutex_owner(&q->pi_state->pi_mutex);
  1692 + if (!owner)
  1693 + owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
  1694 + raw_spin_unlock(&q->pi_state->pi_mutex.wait_lock);
1693 1695 ret = fixup_pi_state_owner(uaddr, q, owner);
1694 1696 goto out;
1695 1697 }
1696 1698  
1697 1699 /*
1698 1700 * Paranoia check. If we did not take the lock, then we should not be
1699   - * the owner, nor the pending owner, of the rt_mutex.
  1701 + * the owner of the rt_mutex.
1700 1702 */
1701 1703 if (rt_mutex_owner(&q->pi_state->pi_mutex) == current)
1702 1704 printk(KERN_ERR "fixup_owner: ret = %d pi-mutex: %p "
kernel/rtmutex-debug.c
... ... @@ -215,7 +215,6 @@
215 215 put_pid(waiter->deadlock_task_pid);
216 216 TRACE_WARN_ON(!plist_node_empty(&waiter->list_entry));
217 217 TRACE_WARN_ON(!plist_node_empty(&waiter->pi_list_entry));
218   - TRACE_WARN_ON(waiter->task);
219 218 memset(waiter, 0x22, sizeof(*waiter));
220 219 }
221 220  
kernel/rtmutex.c
... ... @@ -20,41 +20,34 @@
20 20 /*
21 21 * lock->owner state tracking:
22 22 *
23   - * lock->owner holds the task_struct pointer of the owner. Bit 0 and 1
24   - * are used to keep track of the "owner is pending" and "lock has
25   - * waiters" state.
  23 + * lock->owner holds the task_struct pointer of the owner. Bit 0
  24 + * is used to keep track of the "lock has waiters" state.
26 25 *
27   - * owner bit1 bit0
28   - * NULL 0 0 lock is free (fast acquire possible)
29   - * NULL 0 1 invalid state
30   - * NULL 1 0 Transitional State*
31   - * NULL 1 1 invalid state
32   - * taskpointer 0 0 lock is held (fast release possible)
33   - * taskpointer 0 1 task is pending owner
34   - * taskpointer 1 0 lock is held and has waiters
35   - * taskpointer 1 1 task is pending owner and lock has more waiters
  26 + * owner bit0
  27 + * NULL 0 lock is free (fast acquire possible)
  28 + * NULL 1 lock is free and has waiters and the top waiter
  29 + * is going to take the lock*
  30 + * taskpointer 0 lock is held (fast release possible)
  31 + * taskpointer 1 lock is held and has waiters**
36 32 *
37   - * Pending ownership is assigned to the top (highest priority)
38   - * waiter of the lock, when the lock is released. The thread is woken
39   - * up and can now take the lock. Until the lock is taken (bit 0
40   - * cleared) a competing higher priority thread can steal the lock
41   - * which puts the woken up thread back on the waiters list.
42   - *
43 33 * The fast atomic compare exchange based acquire and release is only
44   - * possible when bit 0 and 1 of lock->owner are 0.
  34 + * possible when bit 0 of lock->owner is 0.
45 35 *
46   - * (*) There's a small time where the owner can be NULL and the
47   - * "lock has waiters" bit is set. This can happen when grabbing the lock.
48   - * To prevent a cmpxchg of the owner releasing the lock, we need to set this
49   - * bit before looking at the lock, hence the reason this is a transitional
50   - * state.
  36 + * (*) It also can be a transitional state when grabbing the lock
  37 + * with ->wait_lock is held. To prevent any fast path cmpxchg to the lock,
  38 + * we need to set the bit0 before looking at the lock, and the owner may be
  39 + * NULL in this small time, hence this can be a transitional state.
  40 + *
  41 + * (**) There is a small time when bit 0 is set but there are no
  42 + * waiters. This can happen when grabbing the lock in the slow path.
  43 + * To prevent a cmpxchg of the owner releasing the lock, we need to
  44 + * set this bit before looking at the lock.
51 45 */
52 46  
53 47 static void
54   -rt_mutex_set_owner(struct rt_mutex *lock, struct task_struct *owner,
55   - unsigned long mask)
  48 +rt_mutex_set_owner(struct rt_mutex *lock, struct task_struct *owner)
56 49 {
57   - unsigned long val = (unsigned long)owner | mask;
  50 + unsigned long val = (unsigned long)owner;
58 51  
59 52 if (rt_mutex_has_waiters(lock))
60 53 val |= RT_MUTEX_HAS_WAITERS;
61 54  
62 55  
... ... @@ -203,15 +196,14 @@
203 196 * reached or the state of the chain has changed while we
204 197 * dropped the locks.
205 198 */
206   - if (!waiter || !waiter->task)
  199 + if (!waiter)
207 200 goto out_unlock_pi;
208 201  
209 202 /*
210 203 * Check the orig_waiter state. After we dropped the locks,
211   - * the previous owner of the lock might have released the lock
212   - * and made us the pending owner:
  204 + * the previous owner of the lock might have released the lock.
213 205 */
214   - if (orig_waiter && !orig_waiter->task)
  206 + if (orig_waiter && !rt_mutex_owner(orig_lock))
215 207 goto out_unlock_pi;
216 208  
217 209 /*
... ... @@ -254,6 +246,17 @@
254 246  
255 247 /* Release the task */
256 248 raw_spin_unlock_irqrestore(&task->pi_lock, flags);
  249 + if (!rt_mutex_owner(lock)) {
  250 + /*
  251 + * If the requeue above changed the top waiter, then we need
  252 + * to wake the new top waiter up to try to get the lock.
  253 + */
  254 +
  255 + if (top_waiter != rt_mutex_top_waiter(lock))
  256 + wake_up_process(rt_mutex_top_waiter(lock)->task);
  257 + raw_spin_unlock(&lock->wait_lock);
  258 + goto out_put_task;
  259 + }
257 260 put_task_struct(task);
258 261  
259 262 /* Grab the next task */
260 263  
261 264  
262 265  
... ... @@ -296,78 +299,16 @@
296 299 }
297 300  
298 301 /*
299   - * Optimization: check if we can steal the lock from the
300   - * assigned pending owner [which might not have taken the
301   - * lock yet]:
302   - */
303   -static inline int try_to_steal_lock(struct rt_mutex *lock,
304   - struct task_struct *task)
305   -{
306   - struct task_struct *pendowner = rt_mutex_owner(lock);
307   - struct rt_mutex_waiter *next;
308   - unsigned long flags;
309   -
310   - if (!rt_mutex_owner_pending(lock))
311   - return 0;
312   -
313   - if (pendowner == task)
314   - return 1;
315   -
316   - raw_spin_lock_irqsave(&pendowner->pi_lock, flags);
317   - if (task->prio >= pendowner->prio) {
318   - raw_spin_unlock_irqrestore(&pendowner->pi_lock, flags);
319   - return 0;
320   - }
321   -
322   - /*
323   - * Check if a waiter is enqueued on the pending owners
324   - * pi_waiters list. Remove it and readjust pending owners
325   - * priority.
326   - */
327   - if (likely(!rt_mutex_has_waiters(lock))) {
328   - raw_spin_unlock_irqrestore(&pendowner->pi_lock, flags);
329   - return 1;
330   - }
331   -
332   - /* No chain handling, pending owner is not blocked on anything: */
333   - next = rt_mutex_top_waiter(lock);
334   - plist_del(&next->pi_list_entry, &pendowner->pi_waiters);
335   - __rt_mutex_adjust_prio(pendowner);
336   - raw_spin_unlock_irqrestore(&pendowner->pi_lock, flags);
337   -
338   - /*
339   - * We are going to steal the lock and a waiter was
340   - * enqueued on the pending owners pi_waiters queue. So
341   - * we have to enqueue this waiter into
342   - * task->pi_waiters list. This covers the case,
343   - * where task is boosted because it holds another
344   - * lock and gets unboosted because the booster is
345   - * interrupted, so we would delay a waiter with higher
346   - * priority as task->normal_prio.
347   - *
348   - * Note: in the rare case of a SCHED_OTHER task changing
349   - * its priority and thus stealing the lock, next->task
350   - * might be task:
351   - */
352   - if (likely(next->task != task)) {
353   - raw_spin_lock_irqsave(&task->pi_lock, flags);
354   - plist_add(&next->pi_list_entry, &task->pi_waiters);
355   - __rt_mutex_adjust_prio(task);
356   - raw_spin_unlock_irqrestore(&task->pi_lock, flags);
357   - }
358   - return 1;
359   -}
360   -
361   -/*
362 302 * Try to take an rt-mutex
363 303 *
364   - * This fails
365   - * - when the lock has a real owner
366   - * - when a different pending owner exists and has higher priority than current
367   - *
368 304 * Must be called with lock->wait_lock held.
  305 + *
  306 + * @lock: the lock to be acquired.
  307 + * @task: the task which wants to acquire the lock
  308 + * @waiter: the waiter that is queued to the lock's wait list. (could be NULL)
369 309 */
370   -static int try_to_take_rt_mutex(struct rt_mutex *lock)
  310 +static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
  311 + struct rt_mutex_waiter *waiter)
371 312 {
372 313 /*
373 314 * We have to be careful here if the atomic speedups are
374 315  
375 316  
376 317  
... ... @@ -390,15 +331,52 @@
390 331 */
391 332 mark_rt_mutex_waiters(lock);
392 333  
393   - if (rt_mutex_owner(lock) && !try_to_steal_lock(lock, current))
  334 + if (rt_mutex_owner(lock))
394 335 return 0;
395 336  
  337 + /*
  338 + * It will get the lock because of one of these conditions:
  339 + * 1) there is no waiter
  340 + * 2) higher priority than waiters
  341 + * 3) it is top waiter
  342 + */
  343 + if (rt_mutex_has_waiters(lock)) {
  344 + if (task->prio >= rt_mutex_top_waiter(lock)->list_entry.prio) {
  345 + if (!waiter || waiter != rt_mutex_top_waiter(lock))
  346 + return 0;
  347 + }
  348 + }
  349 +
  350 + if (waiter || rt_mutex_has_waiters(lock)) {
  351 + unsigned long flags;
  352 + struct rt_mutex_waiter *top;
  353 +
  354 + raw_spin_lock_irqsave(&task->pi_lock, flags);
  355 +
  356 + /* remove the queued waiter. */
  357 + if (waiter) {
  358 + plist_del(&waiter->list_entry, &lock->wait_list);
  359 + task->pi_blocked_on = NULL;
  360 + }
  361 +
  362 + /*
  363 + * We have to enqueue the top waiter(if it exists) into
  364 + * task->pi_waiters list.
  365 + */
  366 + if (rt_mutex_has_waiters(lock)) {
  367 + top = rt_mutex_top_waiter(lock);
  368 + top->pi_list_entry.prio = top->list_entry.prio;
  369 + plist_add(&top->pi_list_entry, &task->pi_waiters);
  370 + }
  371 + raw_spin_unlock_irqrestore(&task->pi_lock, flags);
  372 + }
  373 +
396 374 /* We got the lock. */
397 375 debug_rt_mutex_lock(lock);
398 376  
399   - rt_mutex_set_owner(lock, current, 0);
  377 + rt_mutex_set_owner(lock, task);
400 378  
401   - rt_mutex_deadlock_account_lock(lock, current);
  379 + rt_mutex_deadlock_account_lock(lock, task);
402 380  
403 381 return 1;
404 382 }
... ... @@ -436,6 +414,9 @@
436 414  
437 415 raw_spin_unlock_irqrestore(&task->pi_lock, flags);
438 416  
  417 + if (!owner)
  418 + return 0;
  419 +
439 420 if (waiter == rt_mutex_top_waiter(lock)) {
440 421 raw_spin_lock_irqsave(&owner->pi_lock, flags);
441 422 plist_del(&top_waiter->pi_list_entry, &owner->pi_waiters);
442 423  
443 424  
... ... @@ -472,21 +453,18 @@
472 453 /*
473 454 * Wake up the next waiter on the lock.
474 455 *
475   - * Remove the top waiter from the current tasks waiter list and from
476   - * the lock waiter list. Set it as pending owner. Then wake it up.
  456 + * Remove the top waiter from the current tasks waiter list and wake it up.
477 457 *
478 458 * Called with lock->wait_lock held.
479 459 */
480 460 static void wakeup_next_waiter(struct rt_mutex *lock)
481 461 {
482 462 struct rt_mutex_waiter *waiter;
483   - struct task_struct *pendowner;
484 463 unsigned long flags;
485 464  
486 465 raw_spin_lock_irqsave(&current->pi_lock, flags);
487 466  
488 467 waiter = rt_mutex_top_waiter(lock);
489   - plist_del(&waiter->list_entry, &lock->wait_list);
490 468  
491 469 /*
492 470 * Remove it from current->pi_waiters. We do not adjust a
493 471  
494 472  
495 473  
496 474  
... ... @@ -495,43 +473,19 @@
495 473 * lock->wait_lock.
496 474 */
497 475 plist_del(&waiter->pi_list_entry, &current->pi_waiters);
498   - pendowner = waiter->task;
499   - waiter->task = NULL;
500 476  
501   - rt_mutex_set_owner(lock, pendowner, RT_MUTEX_OWNER_PENDING);
  477 + rt_mutex_set_owner(lock, NULL);
502 478  
503 479 raw_spin_unlock_irqrestore(&current->pi_lock, flags);
504 480  
505   - /*
506   - * Clear the pi_blocked_on variable and enqueue a possible
507   - * waiter into the pi_waiters list of the pending owner. This
508   - * prevents that in case the pending owner gets unboosted a
509   - * waiter with higher priority than pending-owner->normal_prio
510   - * is blocked on the unboosted (pending) owner.
511   - */
512   - raw_spin_lock_irqsave(&pendowner->pi_lock, flags);
513   -
514   - WARN_ON(!pendowner->pi_blocked_on);
515   - WARN_ON(pendowner->pi_blocked_on != waiter);
516   - WARN_ON(pendowner->pi_blocked_on->lock != lock);
517   -
518   - pendowner->pi_blocked_on = NULL;
519   -
520   - if (rt_mutex_has_waiters(lock)) {
521   - struct rt_mutex_waiter *next;
522   -
523   - next = rt_mutex_top_waiter(lock);
524   - plist_add(&next->pi_list_entry, &pendowner->pi_waiters);
525   - }
526   - raw_spin_unlock_irqrestore(&pendowner->pi_lock, flags);
527   -
528   - wake_up_process(pendowner);
  481 + wake_up_process(waiter->task);
529 482 }
530 483  
531 484 /*
532   - * Remove a waiter from a lock
  485 + * Remove a waiter from a lock and give up
533 486 *
534   - * Must be called with lock->wait_lock held
  487 + * Must be called with lock->wait_lock held and
  488 + * have just failed to try_to_take_rt_mutex().
535 489 */
536 490 static void remove_waiter(struct rt_mutex *lock,
537 491 struct rt_mutex_waiter *waiter)
538 492  
539 493  
... ... @@ -543,12 +497,14 @@
543 497  
544 498 raw_spin_lock_irqsave(&current->pi_lock, flags);
545 499 plist_del(&waiter->list_entry, &lock->wait_list);
546   - waiter->task = NULL;
547 500 current->pi_blocked_on = NULL;
548 501 raw_spin_unlock_irqrestore(&current->pi_lock, flags);
549 502  
550   - if (first && owner != current) {
  503 + if (!owner)
  504 + return;
551 505  
  506 + if (first) {
  507 +
552 508 raw_spin_lock_irqsave(&owner->pi_lock, flags);
553 509  
554 510 plist_del(&waiter->pi_list_entry, &owner->pi_waiters);
555 511  
556 512  
... ... @@ -614,21 +570,19 @@
614 570 * or TASK_UNINTERRUPTIBLE)
615 571 * @timeout: the pre-initialized and started timer, or NULL for none
616 572 * @waiter: the pre-initialized rt_mutex_waiter
617   - * @detect_deadlock: passed to task_blocks_on_rt_mutex
618 573 *
619 574 * lock->wait_lock must be held by the caller.
620 575 */
621 576 static int __sched
622 577 __rt_mutex_slowlock(struct rt_mutex *lock, int state,
623 578 struct hrtimer_sleeper *timeout,
624   - struct rt_mutex_waiter *waiter,
625   - int detect_deadlock)
  579 + struct rt_mutex_waiter *waiter)
626 580 {
627 581 int ret = 0;
628 582  
629 583 for (;;) {
630 584 /* Try to acquire the lock: */
631   - if (try_to_take_rt_mutex(lock))
  585 + if (try_to_take_rt_mutex(lock, current, waiter))
632 586 break;
633 587  
634 588 /*
635 589  
... ... @@ -645,39 +599,11 @@
645 599 break;
646 600 }
647 601  
648   - /*
649   - * waiter->task is NULL the first time we come here and
650   - * when we have been woken up by the previous owner
651   - * but the lock got stolen by a higher prio task.
652   - */
653   - if (!waiter->task) {
654   - ret = task_blocks_on_rt_mutex(lock, waiter, current,
655   - detect_deadlock);
656   - /*
657   - * If we got woken up by the owner then start loop
658   - * all over without going into schedule to try
659   - * to get the lock now:
660   - */
661   - if (unlikely(!waiter->task)) {
662   - /*
663   - * Reset the return value. We might
664   - * have returned with -EDEADLK and the
665   - * owner released the lock while we
666   - * were walking the pi chain.
667   - */
668   - ret = 0;
669   - continue;
670   - }
671   - if (unlikely(ret))
672   - break;
673   - }
674   -
675 602 raw_spin_unlock(&lock->wait_lock);
676 603  
677 604 debug_rt_mutex_print_deadlock(waiter);
678 605  
679   - if (waiter->task)
680   - schedule_rt_mutex(lock);
  606 + schedule_rt_mutex(lock);
681 607  
682 608 raw_spin_lock(&lock->wait_lock);
683 609 set_current_state(state);
684 610  
... ... @@ -698,12 +624,11 @@
698 624 int ret = 0;
699 625  
700 626 debug_rt_mutex_init_waiter(&waiter);
701   - waiter.task = NULL;
702 627  
703 628 raw_spin_lock(&lock->wait_lock);
704 629  
705 630 /* Try to acquire the lock again: */
706   - if (try_to_take_rt_mutex(lock)) {
  631 + if (try_to_take_rt_mutex(lock, current, NULL)) {
707 632 raw_spin_unlock(&lock->wait_lock);
708 633 return 0;
709 634 }
710 635  
711 636  
... ... @@ -717,12 +642,14 @@
717 642 timeout->task = NULL;
718 643 }
719 644  
720   - ret = __rt_mutex_slowlock(lock, state, timeout, &waiter,
721   - detect_deadlock);
  645 + ret = task_blocks_on_rt_mutex(lock, &waiter, current, detect_deadlock);
722 646  
  647 + if (likely(!ret))
  648 + ret = __rt_mutex_slowlock(lock, state, timeout, &waiter);
  649 +
723 650 set_current_state(TASK_RUNNING);
724 651  
725   - if (unlikely(waiter.task))
  652 + if (unlikely(ret))
726 653 remove_waiter(lock, &waiter);
727 654  
728 655 /*
... ... @@ -737,14 +664,6 @@
737 664 if (unlikely(timeout))
738 665 hrtimer_cancel(&timeout->timer);
739 666  
740   - /*
741   - * Readjust priority, when we did not get the lock. We might
742   - * have been the pending owner and boosted. Since we did not
743   - * take the lock, the PI boost has to go.
744   - */
745   - if (unlikely(ret))
746   - rt_mutex_adjust_prio(current);
747   -
748 667 debug_rt_mutex_free_waiter(&waiter);
749 668  
750 669 return ret;
... ... @@ -762,7 +681,7 @@
762 681  
763 682 if (likely(rt_mutex_owner(lock) != current)) {
764 683  
765   - ret = try_to_take_rt_mutex(lock);
  684 + ret = try_to_take_rt_mutex(lock, current, NULL);
766 685 /*
767 686 * try_to_take_rt_mutex() sets the lock waiters
768 687 * bit unconditionally. Clean this up.
... ... @@ -992,7 +911,7 @@
992 911 {
993 912 __rt_mutex_init(lock, NULL);
994 913 debug_rt_mutex_proxy_lock(lock, proxy_owner);
995   - rt_mutex_set_owner(lock, proxy_owner, 0);
  914 + rt_mutex_set_owner(lock, proxy_owner);
996 915 rt_mutex_deadlock_account_lock(lock, proxy_owner);
997 916 }
998 917  
... ... @@ -1008,7 +927,7 @@
1008 927 struct task_struct *proxy_owner)
1009 928 {
1010 929 debug_rt_mutex_proxy_unlock(lock);
1011   - rt_mutex_set_owner(lock, NULL, 0);
  930 + rt_mutex_set_owner(lock, NULL);
1012 931 rt_mutex_deadlock_account_unlock(proxy_owner);
1013 932 }
1014 933  
1015 934  
1016 935  
... ... @@ -1034,20 +953,14 @@
1034 953  
1035 954 raw_spin_lock(&lock->wait_lock);
1036 955  
1037   - mark_rt_mutex_waiters(lock);
1038   -
1039   - if (!rt_mutex_owner(lock) || try_to_steal_lock(lock, task)) {
1040   - /* We got the lock for task. */
1041   - debug_rt_mutex_lock(lock);
1042   - rt_mutex_set_owner(lock, task, 0);
  956 + if (try_to_take_rt_mutex(lock, task, NULL)) {
1043 957 raw_spin_unlock(&lock->wait_lock);
1044   - rt_mutex_deadlock_account_lock(lock, task);
1045 958 return 1;
1046 959 }
1047 960  
1048 961 ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
1049 962  
1050   - if (ret && !waiter->task) {
  963 + if (ret && !rt_mutex_owner(lock)) {
1051 964 /*
1052 965 * Reset the return value. We might have
1053 966 * returned with -EDEADLK and the owner
... ... @@ -1056,6 +969,10 @@
1056 969 */
1057 970 ret = 0;
1058 971 }
  972 +
  973 + if (unlikely(ret))
  974 + remove_waiter(lock, waiter);
  975 +
1059 976 raw_spin_unlock(&lock->wait_lock);
1060 977  
1061 978 debug_rt_mutex_print_deadlock(waiter);
1062 979  
... ... @@ -1110,12 +1027,11 @@
1110 1027  
1111 1028 set_current_state(TASK_INTERRUPTIBLE);
1112 1029  
1113   - ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter,
1114   - detect_deadlock);
  1030 + ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
1115 1031  
1116 1032 set_current_state(TASK_RUNNING);
1117 1033  
1118   - if (unlikely(waiter->task))
  1034 + if (unlikely(ret))
1119 1035 remove_waiter(lock, waiter);
1120 1036  
1121 1037 /*
... ... @@ -1125,14 +1041,6 @@
1125 1041 fixup_rt_mutex_waiters(lock);
1126 1042  
1127 1043 raw_spin_unlock(&lock->wait_lock);
1128   -
1129   - /*
1130   - * Readjust priority, when we did not get the lock. We might have been
1131   - * the pending owner and boosted. Since we did not take the lock, the
1132   - * PI boost has to go.
1133   - */
1134   - if (unlikely(ret))
1135   - rt_mutex_adjust_prio(current);
1136 1044  
1137 1045 return ret;
1138 1046 }
kernel/rtmutex_common.h
... ... @@ -91,25 +91,13 @@
91 91 /*
92 92 * lock->owner state tracking:
93 93 */
94   -#define RT_MUTEX_OWNER_PENDING 1UL
95   -#define RT_MUTEX_HAS_WAITERS 2UL
96   -#define RT_MUTEX_OWNER_MASKALL 3UL
  94 +#define RT_MUTEX_HAS_WAITERS 1UL
  95 +#define RT_MUTEX_OWNER_MASKALL 1UL
97 96  
98 97 static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
99 98 {
100 99 return (struct task_struct *)
101 100 ((unsigned long)lock->owner & ~RT_MUTEX_OWNER_MASKALL);
102   -}
103   -
104   -static inline struct task_struct *rt_mutex_real_owner(struct rt_mutex *lock)
105   -{
106   - return (struct task_struct *)
107   - ((unsigned long)lock->owner & ~RT_MUTEX_HAS_WAITERS);
108   -}
109   -
110   -static inline unsigned long rt_mutex_owner_pending(struct rt_mutex *lock)
111   -{
112   - return (unsigned long)lock->owner & RT_MUTEX_OWNER_PENDING;
113 101 }
114 102  
115 103 /*