14 Aug, 2012

1 commit

  • ARM recently moved to asm-generic/mutex-xchg.h for its mutex
    implementation after the previous implementation was found to be missing
    some crucial memory barriers. However, this has revealed some problems
    running hackbench on SMP platforms due to the way in which the
    MUTEX_SPIN_ON_OWNER code operates.

    The symptoms are that a bunch of hackbench tasks are left waiting on an
    unlocked mutex and therefore never get woken up to claim it. This boils
    down to the following sequence of events:

            Task A        Task B        Task C        Lock value
    0                                                  1
    1   lock()                                         0
    2                 lock()                           0
    3                 spin(A)                          0
    4   unlock()                                       1
    5                               lock()             0
    6                 cmpxchg(1,0)                     0
    7                 contended()                     -1
    8   lock()                                         0
    9   spin(C)                                        0
    10                              unlock()           1
    11  cmpxchg(1,0)                                   0
    12  unlock()                                       1

    At this point, the lock is unlocked, but Task B is in an uninterruptible
    sleep with nobody to wake it up.

    This patch fixes the problem by ensuring we put the lock into the
    contended state if we fail to acquire it on the fastpath, ensuring that
    any blocked waiters are woken up when the mutex is released.

    Signed-off-by: Will Deacon
    Cc: Arnd Bergmann
    Cc: Chris Mason
    Cc: Ingo Molnar
    Cc:
    Reviewed-by: Nicolas Pitre
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-6e9lrw2avczr0617fzl5vqb8@git.kernel.org
    Signed-off-by: Thomas Gleixner

    Will Deacon
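
The fix described in the 14 Aug, 2012 commit message can be modelled in user space roughly as follows. This is a minimal sketch using C11 atomics rather than the kernel's atomic_t API; the function names model_fastpath_lock and model_fastpath_unlock are hypothetical and exist only for illustration. The point it shows is that a failed fastpath attempt leaves the lock value at -1 instead of 0, so a later unlock notices the contention and takes the wake-up path.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Lock values as in the table above:
 *  1 = unlocked, 0 = locked, -1 = locked and possibly contended. */

/* Hypothetical model of an xchg-based fastpath lock.
 * Returns true if the lock was taken, false if the caller would
 * have to fall back to the slowpath (and possibly sleep). */
static bool model_fastpath_lock(atomic_int *count)
{
    /* First attempt: an old value of 1 means the lock was free. */
    if (atomic_exchange(count, 0) == 1)
        return true;

    /* The attempt failed. Mark the lock contended so that whoever
     * unlocks it sees a value other than 0 and wakes up waiters. */
    if (atomic_exchange(count, -1) == 1)
        return true; /* the lock was released in the meantime */

    return false;
}

/* Hypothetical model of the fastpath unlock.
 * Returns true if the unlocker must take the slowpath and wake waiters. */
static bool model_fastpath_unlock(atomic_int *count)
{
    /* Only an old value of 0 (locked, uncontended) skips the wake-up. */
    return atomic_exchange(count, 1) != 0;
}
```

In this model, step 8 of the table would leave the lock value at -1 rather than 0, so the unlock in step 10 would report that a wake-up is needed.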
     

24 Oct, 2008

1 commit

  • - Atomic operations which both modify the variable and return something imply
    full SMP memory barriers before and after the memory operations involved
    (failing atomic_cmpxchg, atomic_add_unless, etc. don't imply a barrier because
    they don't modify the target). See Documentation/atomic_ops.txt.
    So remove the extra barriers and branches.

    - All architectures support atomic_cmpxchg. This has no relation to
    __HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally.

    This reduces a simple single-threaded fastpath lock+unlock test from 590 cycles
    to 203 cycles on a ppc970 system.

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
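
The cmpxchg-only path from the 24 Oct, 2008 commit message can be illustrated with a small user-space model. This is a sketch under assumptions: it uses C11 atomics, the name model_trylock is hypothetical, and C11 memory orders only approximate the kernel's barrier semantics. It requests sequentially consistent ordering when the compare-and-exchange succeeds and relaxed ordering when it fails, mirroring the statement that a failing atomic_cmpxchg implies no barrier because it does not modify the target.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a cmpxchg-based trylock.
 * Lock values: 1 = unlocked, 0 = locked.
 * Returns true if the lock was acquired. */
static bool model_trylock(atomic_int *count)
{
    int expected = 1; /* only an unlocked mutex can be taken */

    /* Success: the value is modified, so ask for the strongest ordering.
     * Failure: nothing is written, so relaxed ordering is enough. */
    return atomic_compare_exchange_strong_explicit(
        count, &expected, 0,
        memory_order_seq_cst, memory_order_relaxed);
}
```

No explicit barrier or extra branch surrounds the call, which is the simplification the commit message describes.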
     

09 Feb, 2008

1 commit


04 Oct, 2006

1 commit


01 Apr, 2006

1 commit

  • Turn some macros into inline functions, which adds proper type checking
    and makes the code more readable. Also make a minor comment adjustment.

    Signed-off-by: Nicolas Pitre
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicolas Pitre
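
The benefit named in the 01 Apr, 2006 commit message can be seen in a small comparison. The original macros are not shown in this log, so everything below is a hypothetical illustration in user-space C11: MODEL_LOCK_MACRO, model_lock_inline and model_count_slowpath are invented names. The macro accepts anything that happens to expand to valid code, while the inline function makes the compiler check both argument types.

```c
#include <stdatomic.h>

/* Hypothetical slowpath stand-in that only counts how often it runs. */
static int model_slowpath_calls;

static void model_count_slowpath(atomic_int *count)
{
    (void)count;
    model_slowpath_calls++;
}

/* Macro form: no parameter types, so a wrong pointer type or a wrong
 * callback is only caught if the expansion fails to compile. */
#define MODEL_LOCK_MACRO(count, fail_fn)            \
    do {                                            \
        if (atomic_exchange((count), 0) != 1)       \
            (fail_fn)(count);                       \
    } while (0)

/* Inline form: same behaviour, but the prototype documents and
 * enforces the types of the counter and of the slowpath callback. */
static inline void model_lock_inline(atomic_int *count,
                                     void (*fail_fn)(atomic_int *))
{
    if (atomic_exchange(count, 0) != 1)
        fail_fn(count);
}
```

Both forms call the slowpath only when the old lock value was not 1.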
     

10 Jan, 2006

1 commit

  • Add three (generic) mutex fastpath implementations.

    The mutex-xchg.h implementation is atomic_xchg() based, and should
    work fine on every architecture.

    The mutex-dec.h implementation is atomic_dec_return() based - this
    one too should work on every architecture, but might not perform
    optimally on architectures that have no atomic-dec/inc instructions.

    The mutex-null.h implementation forces all calls into the slowpath. This
    is used for mutex debugging, but it can also be used on platforms that do
    not want (or need) a fastpath at all.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven

    Ingo Molnar
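
The three fastpath styles listed in the 10 Jan, 2006 commit message can be contrasted with a user-space sketch. It assumes C11 atomics in place of the kernel's atomic_xchg() and atomic_dec_return(), and the function names are hypothetical. Each function returns true when the fastpath took the lock and false when the caller would have to enter the slowpath.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Lock values: 1 = unlocked, 0 = locked, negative = contended. */

/* xchg style: swap in 0 and check whether the old value was 1. */
static bool model_xchg_lock(atomic_int *count)
{
    return atomic_exchange(count, 0) == 1;
}

/* dec style: decrement and check that the new value is not negative.
 * atomic_fetch_sub returns the old value, so subtract 1 to get the
 * value a dec-and-return operation would report. */
static bool model_dec_lock(atomic_int *count)
{
    return atomic_fetch_sub(count, 1) - 1 >= 0;
}

/* null style: never touch the counter, always defer to the slowpath. */
static bool model_null_lock(atomic_int *count)
{
    (void)count;
    return false;
}
```

The xchg variant needs only an atomic exchange, which is why the commit message expects it to work on every architecture; the dec variant relies on an atomic decrement that some architectures have to emulate.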