19 May, 2010

1 commit

  • Currently, we can hit a nasty case with optimistic
    spinning on mutexes:

    CPU A tries to take a mutex, while holding the BKL

    CPU B tries to take the BKL while holding the mutex

    This looks like an AB-BA scenario but, in practice, it is
    allowed and happens because the BKL is automatically
    released on schedule().

    In that case, the optimistic spinning code can get us
    into a situation where instead of going to sleep, A
    will spin waiting for B who is spinning waiting for
    A, and the only way out of that loop is the
    need_resched() test in mutex_spin_on_owner().

    This patch fixes it by completely disabling spinning
    if we own the BKL. This adds one more detail to the
    extensive list of reasons why it's a bad idea for
    kernel code to be holding the BKL.

    Signed-off-by: Tony Breeds
    Acked-by: Linus Torvalds
    Acked-by: Peter Zijlstra
    Cc: Benjamin Herrenschmidt
    Cc:
    LKML-Reference:
    [ added an unlikely() attribute to the branch ]
    Signed-off-by: Ingo Molnar

    Tony Breeds
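
The rule described in this commit can be sketched as a small user-space model. This is only an illustration, not the kernel patch itself: `struct task_model` and `may_spin_on_owner()` are hypothetical names, and the sketch assumes that a `lock_depth` of 0 or more means the task holds the BKL and -1 means it does not.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's task structure. */
struct task_model {
    int lock_depth;    /* assumed: >= 0 while the task holds the BKL, -1 otherwise */
    bool need_resched; /* set when the scheduler wants this CPU back */
};

/*
 * Decide whether a waiter may spin on the mutex owner instead of sleeping.
 * A BKL holder never spins: sleeping drops the BKL via schedule(), which
 * lets the other CPU make progress and breaks the A-waits-for-B,
 * B-waits-for-A loop.
 */
static bool may_spin_on_owner(const struct task_model *waiter, bool owner_running)
{
    if (waiter->lock_depth >= 0)
        return false;          /* holding the BKL: go to sleep */
    if (waiter->need_resched)
        return false;          /* the pre-existing way out of the spin loop */
    return owner_running;      /* spin only while the owner is on a CPU */
}
```

With this check in place, CPU A in the scenario above goes to sleep, releases the BKL on schedule(), and CPU B can then finish and drop the mutex.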
     

03 Dec, 2009

1 commit


11 Jun, 2009

2 commits


11 May, 2009

1 commit


06 May, 2009

1 commit


30 Apr, 2009

1 commit

  • include/linux/mutex.h:136: warning: 'mutex_lock' declared inline after being called
    include/linux/mutex.h:136: warning: previous declaration of 'mutex_lock' was here

    Uninline it.

    [ Impact: clean up and uninline, address compiler warning ]

    Signed-off-by: Andrew Morton
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Eric Paris
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Andrew Morton
     

29 Apr, 2009

1 commit


21 Apr, 2009

1 commit


10 Apr, 2009

1 commit

  • Impact: performance regression fix for s390

    The adaptive spinning mutexes will not always do what one would expect on
    virtualized architectures like s390. Especially the cpu_relax() loop in
    mutex_spin_on_owner might hurt if the mutex holding cpu has been scheduled
    away by the hypervisor.

    We would end up in a cpu_relax() loop when there is no chance that the
    state of the mutex changes until the target cpu has been scheduled again by
    the hypervisor.

    For that reason we should change the default behaviour to no-spin on s390.

    We do have an instruction which allows us to yield the current cpu in favour
    of a different target cpu. We also have an instruction which allows us to
    figure out if the target cpu is physically backed.

    However we need to do some performance tests until we can come up with
    a solution that will do the right thing on s390.

    Signed-off-by: Heiko Carstens
    Acked-by: Peter Zijlstra
    Cc: Martin Schwidefsky
    Cc: Christian Borntraeger
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

06 Apr, 2009

1 commit

  • Impact: build fix

    mutex_lock() was defined inline in kernel/mutex.c, but wasn't
    declared inline in <linux/mutex.h>. This didn't cause a problem until
    checkin 3a2d367d9aabac486ac4444c6c7ec7a1dab16267 added the
    atomic_dec_and_mutex_lock() inline in between declaration and
    definition.

    This broke building with CONFIG_ALLOW_WARNINGS=n, e.g. make
    allnoconfig.

    Neither in the source code nor in the allnoconfig binary output can I
    find any internal references to mutex_lock() in kernel/mutex.c, so
    presumably this "inline" is now-useless legacy.

    Cc: Eric Paris
    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Orig-LKML-Reference:
    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     

15 Jan, 2009

4 commits

  • Spin more aggressively. This is less fair but also markedly faster.

    The numbers:

    * dbench 50 (higher is better):
      spin          1282MB/s
      v10            548MB/s
      v10 no wait   1868MB/s

    * 4k creates (numbers in files/second, higher is better):
      spin          avg 200.60 median 193.20 std 19.71 high 305.93 low 186.82
      v10           avg 180.94 median 175.28 std 13.91 high 229.31 low 168.73
      v10 no wait   avg 232.18 median 222.38 std 22.91 high 314.66 low 209.12

    * File stats (numbers in seconds, lower is better):
      spin          2.27s
      v10           5.1s
      v10 no wait   1.6s

    ( The source changes are smaller than they look, I just moved the
    need_resched checks in __mutex_lock_common after the cmpxchg. )

    Signed-off-by: Chris Mason
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Chris Mason
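
The reordering mentioned in the parenthetical (attempt the cmpxchg first, check need_resched afterwards) can be illustrated with a user-space sketch using C11 atomics. The names `struct spin_mutex_model` and `spin_try_acquire()` are made up for this example, and the counter convention (1 = unlocked, 0 = locked) is an assumption of the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum spin_result { SPIN_ACQUIRED, SPIN_GIVE_UP };

/* Hypothetical model of the mutex counter: 1 = unlocked, 0 = locked. */
struct spin_mutex_model {
    atomic_int count;
};

/*
 * Spin for at most max_iters rounds. The acquisition attempt comes
 * before the need_resched test, so a free lock is taken even when a
 * reschedule is already pending. That is less fair but avoids a sleep.
 */
static enum spin_result spin_try_acquire(struct spin_mutex_model *m,
                                         const bool *need_resched,
                                         int max_iters)
{
    for (int i = 0; i < max_iters; i++) {
        int expected = 1;

        /* Try to flip the counter from unlocked to locked. */
        if (atomic_compare_exchange_strong(&m->count, &expected, 0))
            return SPIN_ACQUIRED;

        /* Only after a failed attempt do we honour need_resched. */
        if (*need_resched)
            break;
    }
    return SPIN_GIVE_UP;
}
```

In this model, a waiter with a pending reschedule still takes an unlocked mutex on its first attempt; it gives up only once the attempt has failed.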
     
  • Change mutex contention behaviour such that it will sometimes busy wait on
    acquisition - moving its behaviour closer to that of spinlocks.

    This concept got ported to mainline from the -rt tree, where it was originally
    implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.

    Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
    gave a 345% boost for VFS scalability on my testbox:

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 296604

    # ./test-mutex-shm V 16 10 | grep "^avg ops"
    avg ops/sec: 85870

    The key criterion for the busy wait is that the lock owner has to be running on
    a (different) cpu. The idea is that as long as the owner is running, there is a
    fair chance it'll release the lock soon, and thus we'll be better off spinning
    instead of blocking/scheduling.

    Since regular mutexes (as opposed to rtmutexes) do not atomically track the
    owner, we add the owner in a non-atomic fashion and deal with the races in
    the slowpath.

    Furthermore, to ease the testing of the performance impact of this new code,
    there is a means to disable this behaviour at runtime (without having to
    reboot the system), when scheduler debugging is enabled
    (CONFIG_SCHED_DEBUG=y), by issuing the following command:

    # echo NO_OWNER_SPIN > /debug/sched_features

    This command re-enables spinning (this is also the default):

    # echo OWNER_SPIN > /debug/sched_features

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
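
The owner-running criterion can be sketched as a single decision step in user-space C. This is a simplified model, not the kernel code: all names are hypothetical, the owner pointer is kept in a C11 atomic for portability (the commit tracks it non-atomically and handles the races in the slowpath), and an unknown owner is treated as "sleep" purely to keep the sketch short.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task: on_cpu is true while it is running. */
struct owner_model {
    atomic_bool on_cpu;
};

/* Hypothetical mutex model: count is 1 when unlocked, 0 when locked. */
struct adaptive_mutex {
    atomic_int count;
    _Atomic(struct owner_model *) owner;
};

enum wait_decision { WAIT_ACQUIRED, WAIT_SPIN, WAIT_SLEEP };

/* One step of the contended path: acquire, keep spinning, or block. */
static enum wait_decision adaptive_step(struct adaptive_mutex *m,
                                        struct owner_model *self)
{
    int expected = 1;

    if (atomic_compare_exchange_strong(&m->count, &expected, 0)) {
        atomic_store(&m->owner, self);   /* record the new owner */
        return WAIT_ACQUIRED;
    }

    struct owner_model *owner = atomic_load(&m->owner);

    /* A running owner is likely to release soon: busy-wait. */
    if (owner != NULL && atomic_load(&owner->on_cpu))
        return WAIT_SPIN;

    /* Owner not running (or unknown in this sketch): block instead. */
    return WAIT_SLEEP;
}
```

A caller would loop on `adaptive_step()` while it returns `WAIT_SPIN` and fall back to the sleeping slowpath on `WAIT_SLEEP`.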
     
  • The problem is that dropping the spinlock right before schedule is a voluntary
    preemption point and can cause a schedule, right after which we schedule again.

    Fix this inefficiency by keeping preemption disabled until we schedule. Do
    this by explicitly disabling preemption and providing a schedule() variant
    that assumes preemption is already disabled.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Remove a local variable by combining an assignment and test in one.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

24 Nov, 2008

1 commit


20 Oct, 2008

1 commit


29 Jul, 2008

1 commit


10 Jun, 2008

1 commit

  • Change __mutex_lock_common() to use signal_pending_state() for the sake of
    code re-use.

    This adds 7 bytes to kernel/mutex.o, but afaics only because gcc isn't smart
    enough.

    (btw, uninlining of __mutex_lock_common() shrinks .text from 2722 to 1542,
    perhaps it is worth doing).

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
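
For readers unfamiliar with the helper, the check it centralizes can be modelled roughly as follows. This is a user-space sketch under stated assumptions: the flag values and the names `struct sig_task_model` and `model_signal_pending_state()` are illustrative and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative sleep-state flags; the values are assumptions of this sketch. */
#define MODEL_TASK_INTERRUPTIBLE 0x1u
#define MODEL_TASK_WAKEKILL      0x2u

/* Hypothetical stand-in for the signal state of a task. */
struct sig_task_model {
    bool signal_pending;        /* any signal is pending */
    bool fatal_signal_pending;  /* a fatal signal is pending */
};

/*
 * Should a sleeper in the given state abort its wait because of a signal?
 *  - uninterruptible sleeps never abort,
 *  - interruptible sleeps abort on any pending signal,
 *  - killable sleeps abort only on a fatal signal.
 */
static bool model_signal_pending_state(unsigned state,
                                       const struct sig_task_model *t)
{
    if (!(state & (MODEL_TASK_INTERRUPTIBLE | MODEL_TASK_WAKEKILL)))
        return false;
    if (!t->signal_pending)
        return false;
    return (state & MODEL_TASK_INTERRUPTIBLE) || t->fatal_signal_pending;
}
```

Folding the three cases into one helper is what lets a common lock routine serve the plain, interruptible and killable variants with a single test.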
     

09 Feb, 2008

1 commit


07 Dec, 2007

1 commit


12 Oct, 2007

1 commit

  • The fancy mutex_lock fastpath has too many indirections to track the caller,
    hence all contentions are perceived to come from mutex_lock().

    Avoid this by explicitly not using the fastpath code (it was disabled already
    anyway).

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

20 Jul, 2007

2 commits

  •      __acquire
             |
            lock _____
             |        \
             |    __contended
             |         |
             |        wait
             | _______/
             |/
             |
        __acquired
             |
         __release
             |
          unlock

    We measure acquisition and contention bouncing.

    This is done by recording a cpu stamp in each lock instance.

    Contention bouncing requires the cpu stamp to be set on acquisition. Hence we
    move __acquired into the generic path.

    __acquired is then used to measure acquisition bouncing by comparing the
    current cpu with the old stamp before replacing it.

    __contended is used to measure contention bouncing (only useful for
    preemptable locks).

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
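
The cpu-stamp bookkeeping described in this commit can be sketched in a few lines of user-space C. The structure and function names below are hypothetical and the counters are a simplification; the sketch only shows how comparing the current cpu with the stored stamp yields the two kinds of bounce.

```c
#define MODEL_NO_CPU (-1)

/* Hypothetical per-lock statistics, including the cpu stamp. */
struct lock_stat_model {
    int cpu_stamp;              /* cpu of the last acquisition, or MODEL_NO_CPU */
    unsigned long acquisitions;
    unsigned long acq_bounces;  /* acquired on a different cpu than last time */
    unsigned long contentions;
    unsigned long con_bounces;  /* contended from a different cpu than the stamp */
};

/* Called on every acquisition: compare with the old stamp, then replace it. */
static void model_lock_acquired(struct lock_stat_model *s, int cpu)
{
    s->acquisitions++;
    if (s->cpu_stamp != MODEL_NO_CPU && s->cpu_stamp != cpu)
        s->acq_bounces++;
    s->cpu_stamp = cpu;
}

/* Called when a lock attempt has to wait: the stamp is read, not updated. */
static void model_lock_contended(struct lock_stat_model *s, int cpu)
{
    s->contentions++;
    if (s->cpu_stamp != MODEL_NO_CPU && s->cpu_stamp != cpu)
        s->con_bounces++;
}
```

Because the contended hook relies on a stamp written at acquisition time, the acquired hook has to run on every acquisition, which matches the commit's note about moving __acquired into the generic path.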
     
  • Call the new lockstat tracking functions from the various lock primitives.

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

10 May, 2007

1 commit

  • Recently a few direct accesses to the thread_info in the task structure snuck
    back, so this wraps them with the appropriate wrapper.

    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Zippel
     

09 Dec, 2006

1 commit

  • md_open takes ->reconfig_mutex which causes lockdep to complain. This
    (normally) doesn't have deadlock potential as the possible conflict is with a
    reconfig_mutex in a different device.

    I say "normally" because if a loop were created in the array->member hierarchy
    a deadlock could happen. However that causes bigger problems than a deadlock
    and should be fixed independently.

    So we flag the lock in md_open as a nested lock. This requires defining
    mutex_lock_interruptible_nested.

    Cc: Ingo Molnar
    Acked-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

04 Jul, 2006

4 commits

  • Use the lock validator framework to prove mutex locking correctness.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Work around weird section nesting build bug causing smp-alternatives failures
    under certain circumstances.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Generic lock debugging:

    - generalized lock debugging framework. For example, a bug in one lock
    subsystem turns off debugging in all lock subsystems.

    - got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from
    the mutex/rtmutex debugging code: it caused way too much prototype
    hackery, and lockdep will give the same information anyway.

    - ability to do silent tests

    - check lock freeing in vfree too.

    - more fine-grained debugging options, to allow distributions to
    turn off more expensive debugging features.

    There's no separate 'held mutexes' list anymore - but there's a 'held locks'
    stack within lockdep, which unifies deadlock detection across all lock
    classes. (this is independent of the lockdep validation stuff - lockdep first
    checks whether we are holding a lock already)

    Here are the current debugging options:

    CONFIG_DEBUG_MUTEXES=y
    CONFIG_DEBUG_LOCK_ALLOC=y

    which do:

    config DEBUG_MUTEXES
    bool "Mutex debugging, basic checks"

    config DEBUG_LOCK_ALLOC
    bool "Detect incorrect freeing of live mutexes"

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • Rename DEBUG_WARN_ON() to the less generic DEBUG_LOCKS_WARN_ON() name, so that
    it's clear that this is a lock-debugging internal mechanism.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

27 Jun, 2006

1 commit


11 Jan, 2006

3 commits


10 Jan, 2006

1 commit