28 Oct, 2014

1 commit

  • We're going to make might_sleep() test for TASK_RUNNING, because
    blocking without TASK_RUNNING will destroy the task state by setting
    it to TASK_RUNNING.

    There are a few occasions where it's 'valid' to call blocking
    primitives (and mutex_lock in particular) and not have TASK_RUNNING;
    typically such cases are right before we set TASK_RUNNING anyhow.

    Robustify the code by not assuming this; this has the beneficial side
    effect of allowing optional code emission for fixing the above
    might_sleep() false positives.
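
    As a minimal sketch of the pattern being described (the wait condition,
    lock and helper below are hypothetical, purely for illustration):

      for (;;) {
              set_current_state(TASK_UNINTERRUPTIBLE);
              if (condition)                      /* hypothetical wake condition */
                      break;
              /*
               * mutex_lock() may sleep; sleeping sets the task back to
               * TASK_RUNNING and silently corrupts the wait state set up
               * above -- this is what the new might_sleep() check flags.
               */
              mutex_lock(&some_mutex);            /* hypothetical lock */
              recheck_condition();                /* hypothetical helper */
              mutex_unlock(&some_mutex);
              schedule();
      }
      __set_current_state(TASK_RUNNING);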

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: tglx@linutronix.de
    Cc: ilya.dryomov@inktank.com
    Cc: umgwanakikbuti@gmail.com
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140924082241.988560063@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Oct, 2014

1 commit

  • Pull core locking updates from Ingo Molnar:
    "The main updates in this cycle were:

    - mutex MCS refactoring finishing touches: improve comments, refactor
    and clean up code, reduce debug data structure footprint, etc.

    - qrwlock finishing touches: remove old code, self-test updates.

    - small rwsem optimization

    - various smaller fixes/cleanups"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/lockdep: Revert qrwlock recusive stuff
    locking/rwsem: Avoid double checking before try acquiring write lock
    locking/rwsem: Move EXPORT_SYMBOL() lines to follow function definition
    locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
    locking/rwlock, x86: Clean up asm/spinlock*.h to remove old rwlock code
    locking/semaphore: Resolve some shadow warnings
    locking/selftest: Support queued rwlock
    locking/lockdep: Restrict the use of recursive read_lock() with qrwlock
    locking/spinlocks: Always evaluate the second argument of spin_lock_nested()
    locking/Documentation: Update locking/mutex-design.txt disadvantages
    locking/Documentation: Move locking related docs into Documentation/locking/
    locking/mutexes: Use MUTEX_SPIN_ON_OWNER when appropriate
    locking/mutexes: Refactor optimistic spinning code
    locking/mcs: Remove obsolete comment
    locking/mutexes: Document quick lock release when unlocking
    locking/mutexes: Standardize arguments in lock/unlock slowpaths
    locking: Remove deprecated smp_mb__() barriers

    Linus Torvalds
     

03 Oct, 2014

2 commits

  • Commit f0bab73cb539 ("locking/lockdep: Restrict the use of recursive
    read_lock() with qrwlock") changed lockdep to try and conform to the
    qrwlock semantics which differ from the traditional rwlock semantics.

    In particular qrwlock is fair outside of interrupt context, but in
    interrupt context readers will ignore all fairness.

    The problem modeling this is that read and write side have different
    lock state (interrupts) semantics but we only have a single
    representation of these. Therefore lockdep will get confused, thinking
    the lock can cause interrupt lock inversions.

    So revert it for now; the old rwlock semantics were already imperfectly
    modeled and the qrwlock extra won't fit either.

    If we want to properly fix this, I think we need to resurrect the work
    Gautham did a few years ago that split the read and write state of
    locks:

    http://lwn.net/Articles/332801/

    FWIW the locking selftest that would've failed (and was reported by
    Borislav earlier) is something like:

    RL(X1); /* IRQ-ON */
    LOCK(A);
    UNLOCK(A);
    RU(X1);

    IRQ_ENTER();
    RL(X1); /* IN-IRQ */
    RU(X1);
    IRQ_EXIT();

    At which point it would report that because A is an IRQ-unsafe lock we
    can suffer the following inversion:

    CPU0                        CPU1

    lock(A)
                                lock(X1)
                                lock(A)

    lock(X1)

    And this is 'wrong' because X1 can recurse (assuming the above locks are
    in fact read-locks) but lockdep doesn't know about this.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Waiman Long
    Cc: ego@linux.vnet.ibm.com
    Cc: bp@alien8.de
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/20140930132600.GA7444@worktop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Commit 9b0fc9c09f1b ("rwsem: skip initial trylock in rwsem_down_write_failed")
    checks whether there are known active lockers in order to avoid write trylocking
    using an expensive cmpxchg() when it likely wouldn't get the lock.

    However, a subsequent patch was added such that we directly
    check for sem->count == RWSEM_WAITING_BIAS right before trying
    that cmpxchg().

    Thus, commit 9b0fc9c09f1b now just adds overhead.

    This patch modifies it so that we only check whether
    count == RWSEM_WAITING_BIAS.

    Also, add a comment on why we do an "extra check" of count
    before the cmpxchg().
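
    A rough sketch of the resulting shape of the write trylock (the exact
    new count value and the waiter bookkeeping are simplified here):

      static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
      {
              /*
               * Only attempt the expensive cmpxchg() when the count says
               * there are queued waiters but no active lockers.
               */
              if (count != RWSEM_WAITING_BIAS)
                      return false;

              return cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
                             RWSEM_WAITING_BIAS + RWSEM_ACTIVE_WRITE_BIAS) ==
                     RWSEM_WAITING_BIAS;
      }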

    Signed-off-by: Jason Low
    Acked-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Aswin Chandramouleeswaran
    Cc: Chegu Vinod
    Cc: Peter Hurley
    Cc: Tim Chen
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1410913017.2447.22.camel@j-VirtualBox
    Signed-off-by: Ingo Molnar

    Jason Low
     

30 Sep, 2014

4 commits


17 Sep, 2014

8 commits

  • The number of global variables is getting pretty ugly. Group variables
    related to the execution (i.e., not parameters) into a new context structure.
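
    A hypothetical sketch of such a grouping (field names are illustrative,
    not taken from the patch):

      /* Hypothetical shape of a torture "context" structure: */
      struct lock_torture_cxt {
              int nrealwriters_stress;          /* writer threads actually created */
              bool debug_lock;                  /* lock debugging enabled? */
              atomic_t n_lock_torture_errors;   /* failures seen during the run */
              struct lock_torture_ops *cur_ops; /* the lock type under test */
      };

      static struct lock_torture_cxt cxt;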

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • We can easily do so with our new reader lock support. Just an arbitrary
    design default: readers have higher (5x) critical region latencies than
    writers: 50 ms and 10 ms, respectively.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Most of it is based on what we already have for writers. This allows
    readers to be very independent (and thus configurable), enabling
    future module parameters to control things such as rw distribution.
    Furthermore, readers have their own delaying function, allowing us
    to test different rw critical region latencies, and stress locking
    internals. Similarly, statistics will, for now, only track the
    number of lock acquisitions -- as opposed to writers, readers have
    no failure detection.

    In addition, introduce a new nreaders_stress module parameter. The
    default number of readers will be the same as the number of writer
    threads. Writer threads are interleaved with readers. Documentation
    is updated accordingly.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • When performing module cleanups by calling torture_cleanup(), the
    'torture_type' string is nullified. However, callers are not necessarily
    done, and might still need to reference the variable. This impacts
    both rcutorture and locktorture, causing printouts like:

    [ 94.226618] (null)-torture: Stopping lock_torture_writer task
    [ 94.226624] (null)-torture: Stopping lock_torture_stats task

    Thus delay this operation until the very end of the cleanup process.
    The consequence (which shouldn't matter for this kind of program) is,
    of course, that we delay the window between rmmod and modprobing,
    for instance in module_torture_begin().

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • The statistics structure can serve well for both reader and writer
    locks, thus simply rename some fields that mention 'write' and leave
    the declaration of lwsa.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Regular locks are very different from locks with debugging. For instance,
    for mutexes, debugging forces us to take only the slowpaths. As such, the
    locktorture module should take this into account when printing related
    information -- specifically when printing user-passed parameters, which
    seems the right place for such info.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • Add a "mutex_lock" torture test. The main difference with the already
    existing spinlock tests is that the latency of the critical region
    is much larger. We randomly delay for (arbitrarily) either 500 ms or,
    otherwise, 25 ms. While this can considerably reduce the number of
    writes compared to non-blocking locks, if run long enough it can have
    the same torturous effect. Furthermore it is more representative of
    mutex hold times and can better stress things like thrashing.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     
  • ... to just 'torture_runnable'. It follows other variable naming
    and is shorter.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney

    Davidlohr Bueso
     

16 Sep, 2014

1 commit


04 Sep, 2014

1 commit

  • Resolve some shadow warnings resulting from using the name
    jiffies, which is a well-known global. This is not a problem
    of course, but it could be a trap for someone copying and
    pasting code, and it just makes W=2 a little cleaner.
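
    For illustration, the kind of rename that silences -Wshadow here
    (shown as prototypes only; a sketch of the likely shape of the change):

      /* Before: the parameter shadows the well-known global tick counter. */
      extern int down_timeout(struct semaphore *sem, long jiffies);

      /* After: renamed, so W=2 (-Wshadow) stays quiet and copy/pasted code
       * cannot mistake the parameter for the global. */
      extern int down_timeout(struct semaphore *sem, long timeout);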

    Signed-off-by: Mark Rustad
    Signed-off-by: Jeff Kirsher
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Thomas Gleixner
    Cc: Paul E. McKenney
    Link: http://lkml.kernel.org/r/1409739444-13635-1-git-send-email-jeffrey.t.kirsher@intel.com
    Signed-off-by: Ingo Molnar

    Mark Rustad
     

13 Aug, 2014

7 commits

  • Unlike the original unfair rwlock implementation, a queued rwlock
    will grant the lock according to the chronological order of the lock
    requests, except when the lock requester is in interrupt context.
    Consequently, recursive read_lock calls will now hang the process if
    there is a write_lock call somewhere in between the read_lock calls.

    This patch updates the lockdep implementation to look for recursive
    read_lock calls. A new read state (3) is used to mark those read_lock
    calls that cannot be called recursively except in interrupt
    context. The new read state does exhaust the 2 bits available in the
    held_lock:read bit field. The addition of any new read state in the
    future may require a redesign of how all those bits are squeezed
    together in the held_lock structure.
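
    For illustration, the kind of recursion that now hangs in process
    context (lock name arbitrary); in interrupt context the second
    read_lock() would instead be granted immediately:

      DEFINE_RWLOCK(lock);

      CPU0                              CPU1

      read_lock(&lock);
                                        write_lock(&lock);  /* queues behind CPU0 */
      read_lock(&lock);                 /* queues behind the writer: deadlock */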

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra
    Cc: Maarten Lankhorst
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Fengguang Wu
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1407345722-61615-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Specifically:
    Documentation/locking/lockdep-design.txt
    Documentation/locking/lockstat.txt
    Documentation/locking/mutex-design.txt
    Documentation/locking/rt-mutex-design.txt
    Documentation/locking/rt-mutex.txt
    Documentation/locking/spinlocks.txt
    Documentation/locking/ww-mutex-design.txt

    Signed-off-by: Davidlohr Bueso
    Acked-by: Randy Dunlap
    Signed-off-by: Peter Zijlstra
    Cc: jason.low2@hp.com
    Cc: aswin@hp.com
    Cc: Alexei Starovoitov
    Cc: Al Viro
    Cc: Andrew Morton
    Cc: Chris Mason
    Cc: Dan Streetman
    Cc: David Airlie
    Cc: Davidlohr Bueso
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Josef Bacik
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Lubomir Rintel
    Cc: Masanari Iida
    Cc: Paul E. McKenney
    Cc: Randy Dunlap
    Cc: Tim Chen
    Cc: Vineet Gupta
    Cc: fengguang.wu@intel.com
    Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • 4badad35 ("locking/mutex: Disable optimistic spinning on some
    architectures") added an ARCH_SUPPORTS_ATOMIC_RMW flag to
    disable the mutex optimistic spinning feature on specific archs.

    Because CONFIG_MUTEX_SPIN_ON_OWNER only depended on DEBUG and
    SMP, it was OK to keep the conditional around the ->owner field a
    bit loose. However, with a new variable added to the mix, we can
    waste space with the unused field, i.e.: CONFIG_SMP &&
    (!CONFIG_MUTEX_SPIN_ON_OWNER && !CONFIG_DEBUG_MUTEX).
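
    A sketch of the layout in question -- the ->owner field wants to be
    guarded by the debug/spinning options rather than by SMP alone
    (approximate, not the exact diff):

      struct mutex {
              atomic_t                count;
              spinlock_t              wait_lock;
              struct list_head        wait_list;
      #if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
              struct task_struct      *owner;   /* only useful when debugging or spinning */
      #endif
      #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
              struct optimistic_spin_queue osq; /* spinner MCS lock */
      #endif
      };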

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: aswin@hp.com
    Cc: Davidlohr Bueso
    Cc: Heiko Carstens
    Cc: Jason Low
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/1406752916-3341-5-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • When we fail to acquire the mutex in the fastpath, we end up calling
    __mutex_lock_common(). A *lot* goes on in this function. Move out the
    optimistic spinning code into mutex_optimistic_spin() and simplify
    the former a bit. Furthermore, this is similar to what we have in
    rwsems. No logical changes.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: aswin@hp.com
    Cc: mingo@kernel.org
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1406752916-3341-4-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • ... as we clearly inline mcs_spin_lock() now.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1406752916-3341-3-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • When unlocking, we always want to reach the slowpath with the lock's
    counter indicating it is unlocked -- as returned by the asm fastpath
    call or by explicitly setting it. While doing so, at least in theory,
    we can optimize and allow faster lock stealing.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: jason.low2@hp.com
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1406752916-3341-2-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • Just as the locking end behaves, when unlocking, go ahead and
    obtain the proper data structure immediately after the previous
    (asm-end) call exits and there are (probably) pending waiters.
    This simplifies some of the layering a bit.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: jason.low2@hp.com
    Cc: aswin@hp.com
    Cc: mingo@kernel.org
    Cc: Linus Torvalds
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1406752916-3341-1-git-send-email-davidlohr@hp.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

17 Jul, 2014

3 commits

  • The arch_mutex_cpu_relax() function, introduced by 34b133f, is
    hacky and ugly. It was added a few years ago to address the fact
    that common cpu_relax() calls include yielding on s390, and thus
    impact the optimistic spinning functionality of mutexes. Nowadays
    we use this function well beyond mutexes: rwsem, qrwlock, mcs and
    lockref. Since the macro that defines the call is in the mutex header,
    any users must include mutex.h and the naming is misleading as well.

    This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
    only if you can do it with very low latency") and (ii) defines it in
    each arch's asm/processor.h local header, just like for regular cpu_relax
    functions. On all archs except s390, cpu_relax_lowlatency is simply cpu_relax,
    and thus we can take it out of mutex.h. While this can seem redundant,
    I believe it is a good choice as it allows us to move arch-specific
    logic out of the generic locking primitives and enables future(?) archs to
    transparently define it, similarly to System Z.
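
    Roughly, the per-arch definition pattern looks like this (a sketch;
    s390 shown as the exception):

      /* Most architectures, in their asm/processor.h: */
      #define cpu_relax_lowlatency()  cpu_relax()

      /* s390, where cpu_relax() may yield the hypervisor: */
      #define cpu_relax_lowlatency()  barrier()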

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Anton Blanchard
    Cc: Aurelien Jacquiot
    Cc: Benjamin Herrenschmidt
    Cc: Bharat Bhushan
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf
    Cc: Christian Borntraeger
    Cc: Chris Zankel
    Cc: David Howells
    Cc: David S. Miller
    Cc: Deepthi Dharwar
    Cc: Dominik Dingel
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Hirokazu Takata
    Cc: Ivan Kokshaysky
    Cc: James E.J. Bottomley
    Cc: James Hogan
    Cc: Jason Wang
    Cc: Jesper Nilsson
    Cc: Joe Perches
    Cc: Jonas Bonn
    Cc: Joseph Myers
    Cc: Kees Cook
    Cc: Koichi Yasutake
    Cc: Lennox Wu
    Cc: Linus Torvalds
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Neuling
    Cc: Michal Simek
    Cc: Mikael Starvik
    Cc: Nicolas Pitre
    Cc: Paolo Bonzini
    Cc: Paul Burton
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Paul Mackerras
    Cc: Qais Yousef
    Cc: Qiaowei Ren
    Cc: Rafael Wysocki
    Cc: Ralf Baechle
    Cc: Richard Henderson
    Cc: Richard Kuo
    Cc: Russell King
    Cc: Steven Miao
    Cc: Steven Rostedt
    Cc: Stratos Karafotis
    Cc: Tim Chen
    Cc: Tony Luck
    Cc: Vasily Kulikov
    Cc: Vineet Gupta
    Cc: Vineet Gupta
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: Wolfram Sang
    Cc: adi-buildroot-devel@lists.sourceforge.net
    Cc: linux390@de.ibm.com
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-am33-list@redhat.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-cris-kernel@axis.com
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux@lists.openrisc.net
    Cc: linux-m32r-ja@ml.linux-m32r.org
    Cc: linux-m32r@ml.linux-m32r.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-metag@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: sparclinux@vger.kernel.org
    Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • When lockdep turns itself off, the following message is logged:

    Please attach the output of /proc/lock_stat to the bug report

    Omit this message when CONFIG_LOCK_STAT is off, and /proc/lock_stat
    doesn't exist.
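
    The fix presumably amounts to guarding the hint, along these lines:

      /* Only advertise /proc/lock_stat when it can actually exist: */
      #ifdef CONFIG_LOCK_STAT
              printk(KERN_DEBUG "Please attach the output of /proc/lock_stat to the bug report\n");
      #endif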

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1405451452-3824-1-git-send-email-andreas.gruenbacher@gmail.com
    Signed-off-by: Ingo Molnar

    Andreas Gruenbacher
     
  • …and to refresh the branch with fixes

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

16 Jul, 2014

7 commits

  • Just like with mutexes (CONFIG_MUTEX_SPIN_ON_OWNER),
    encapsulate the dependencies for rwsem optimistic spinning.
    No logical changes here as it continues to depend on both
    SMP and the XADD algorithm variant.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Jason Low
    [ Also make it depend on ARCH_SUPPORTS_ATOMIC_RMW. ]
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1405112406-13052-2-git-send-email-davidlohr@hp.com
    Cc: aswin@hp.com
    Cc: Chris Mason
    Cc: Davidlohr Bueso
    Cc: Josef Bacik
    Cc: Linus Torvalds
    Cc: Waiman Long
    Signed-off-by: Ingo Molnar

    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • There are two definitions of struct rw_semaphore, one in linux/rwsem.h
    and one in linux/rwsem-spinlock.h.

    For some reason they have different names for the initial field. This
    makes it impossible to use C99 named initialization for
    __RWSEM_INITIALIZER() -- or we have to duplicate that entire thing
    along with the structure definitions.

    The simpler patch is renaming the rwsem-spinlock variant to match the
    regular rwsem.

    This allows us to switch to C99 named initialization.
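
    With the field names unified, the initializer can use designated
    initializers, roughly:

      #define __RWSEM_INITIALIZER(name)                                \
      {       .count     = RWSEM_UNLOCKED_VALUE,                       \
              .wait_list = LIST_HEAD_INIT((name).wait_list),           \
              .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock) }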

    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/n/tip-bmrZolsbGmautmzrerog27io@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • In the unlock function of the cancellable MCS spinlock, the first
    thing we do is to retrieve the current CPU's osq node. However, due to
    the changes made in the previous patch, in the common case where the
    lock is not contended, we wouldn't need to access the current CPU's
    osq node anymore.

    This patch optimizes this by only retrieving this CPU's osq node
    after we attempt the initial cmpxchg to unlock the osq and find
    that it's contended.
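
    An approximate sketch of the resulting fast/slow path split:

      void osq_unlock(struct optimistic_spin_queue *lock)
      {
              struct optimistic_spin_node *node, *next;
              int curr = encode_cpu(smp_processor_id());

              /* Fast path: uncontended, never touches this CPU's node. */
              if (likely(atomic_cmpxchg(&lock->tail, curr, OSQ_UNLOCKED_VAL) == curr))
                      return;

              /* Slow path: contended, only now fetch our per-cpu node. */
              node = this_cpu_ptr(&osq_node);
              next = xchg(&node->next, NULL);
              if (next)
                      next->locked = 1;
              /* (further contention handling omitted in this sketch) */
      }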

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1405358872-3732-5-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Currently, we initialize the osq lock by directly setting the lock's values. It
    would be preferable if we used an init macro to do the initialization, as we do
    with other locks.

    This patch introduces and uses a macro and function for initializing the osq lock.
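
    The macro and helper would look roughly like:

      #define OSQ_UNLOCKED_VAL (0)

      /* Static initializer: */
      #define OSQ_LOCK_UNLOCKED { ATOMIC_INIT(OSQ_UNLOCKED_VAL) }

      /* Runtime initializer: */
      static inline void osq_lock_init(struct optimistic_spin_queue *lock)
      {
              atomic_set(&lock->tail, OSQ_UNLOCKED_VAL);
      }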

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • The cancellable MCS spinlock is currently used to queue threads that are
    doing optimistic spinning. It uses per-cpu nodes, where a thread obtaining
    the lock would access and queue the local node corresponding to the CPU that
    it's running on. Currently, the cancellable MCS lock is implemented by using
    pointers to these nodes.

    In this patch, instead of operating on pointers to the per-cpu nodes, we
    store, in an atomic_t, the CPU numbers to which the per-cpu nodes correspond.
    A similar concept is used with the qspinlock.

    By operating on the CPU # of the nodes using atomic_t instead of pointers
    to those nodes, this can reduce the overhead of the cancellable MCS spinlock
    by 32 bits (on 64 bit systems).
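
    The resulting handle is tiny, roughly:

      struct optimistic_spin_queue {
              /*
               * Stores an encoded value of the CPU number of the tail node
               * in the queue; OSQ_UNLOCKED_VAL when the queue is empty.
               */
              atomic_t tail;
      };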

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Heiko Carstens
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-3-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Currently, the per-cpu nodes structure for the cancellable MCS spinlock is
    named "optimistic_spin_queue". However, in a follow up patch in the series
    we will be introducing a new structure that serves as the new "handle" for
    the lock. It would make more sense if that structure were named
    "optimistic_spin_queue". Additionally, since the current uses of the
    "optimistic_spin_queue" structure are as "nodes", it might be better to
    rename them to "node" anyway.

    This preparatory patch renames all current "optimistic_spin_queue"
    to "optimistic_spin_node".

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Scott Norton
    Cc: "Paul E. McKenney"
    Cc: Dave Chinner
    Cc: Waiman Long
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Steven Rostedt
    Cc: Tim Chen
    Cc: Konrad Rzeszutek Wilk
    Cc: Aswin Chandramouleeswaran
    Cc: Linus Torvalds
    Cc: Chris Mason
    Cc: Heiko Carstens
    Cc: Josef Bacik
    Link: http://lkml.kernel.org/r/1405358872-3732-2-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Commit 4fc828e24cd9 ("locking/rwsem: Support optimistic spinning")
    introduced a major performance regression for workloads such as
    xfs_repair which mix read and write locking of the mmap_sem across
    many threads. The result was xfs_repair ran 5x slower on 3.16-rc2
    than on 3.15 and using 20x more system CPU time.

    Perf profiles indicate that in some workloads significant time can
    be spent spinning on !owner. This is because we don't set the lock
    owner when reader(s) obtain the rwsem.

    In this patch, we'll modify rwsem_can_spin_on_owner() such that we'll
    return false if there is no lock owner. The rationale is that if we
    just entered the slowpath, yet there is no lock owner, then there is
    a possibility that a reader has the lock. To be conservative, we'll
    avoid spinning in these situations.

    This patch reduced the total run time of the xfs_repair workload from
    about 4 minutes 24 seconds down to approximately 1 minute 26 seconds,
    back to close to the same performance as on 3.15.

    Retesting with AIM7, whose workloads were among those used to test the
    original optimistic spinning code, confirmed that we still get big
    performance gains with optimistic spinning, even with this additional
    regression fix. Davidlohr found that while the 'custom' workload took
    a performance hit of ~-14% to throughput for >300 users with this
    additional patch, the overall gain with optimistic spinning is
    still ~+45%. The 'disk' workload even improved by ~+15% at >1000 users.
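
    A hedged sketch of the behavioural change -- default to "don't spin"
    when no owner is recorded, since a reader may hold the lock:

      static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
      {
              struct task_struct *owner;
              bool on_cpu = false;    /* conservative default: don't spin */

              rcu_read_lock();
              owner = ACCESS_ONCE(sem->owner);
              if (owner)
                      on_cpu = owner->on_cpu;
              rcu_read_unlock();

              /*
               * If sem->owner is not set, yet we have just entered the
               * slowpath, a reader may well hold the lock: avoid spinning.
               */
              return on_cpu;
      }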

    Tested-by: Dave Chinner
    Acked-by: Davidlohr Bueso
    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Cc: Tim Chen
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1404532172.2572.30.camel@j-VirtualBox
    Signed-off-by: Ingo Molnar

    Jason Low
     

05 Jul, 2014

4 commits

  • The mutex_trylock() function calls into __mutex_trylock_fastpath() when
    trying to obtain the mutex. On 32 bit x86, in the !__HAVE_ARCH_CMPXCHG
    case, __mutex_trylock_fastpath() calls directly into __mutex_trylock_slowpath()
    regardless of whether or not the mutex is locked.

    In __mutex_trylock_slowpath(), we then acquire the wait_lock spinlock, xchg()
    lock->count with -1, then set lock->count back to 0 if there are no waiters,
    and return true if the prev lock count was 1.

    However, if the mutex is already locked, then there isn't much point
    in attempting all of the above expensive operations. In this patch, we only
    attempt the above trylock operations if the mutex is unlocked.

    Signed-off-by: Jason Low
    Reviewed-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: akpm@linux-foundation.org
    Cc: tim.c.chen@linux.intel.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: rostedt@goodmis.org
    Cc: Waiman.Long@hp.com
    Cc: scott.norton@hp.com
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1402511843-4721-5-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • Upon entering the slowpath in __mutex_lock_common(), we try once more to
    acquire the mutex. We only try to acquire if (lock->count >= 0). However,
    what we actually want here is to try to acquire if the mutex is unlocked
    (lock->count == 1).

    This patch changes it so that we only try-acquire the mutex upon entering
    the slowpath if it is unlocked, rather than if the lock count is non-negative.
    This helps further reduce unnecessary atomic xchg() operations.

    Furthermore, this patch uses !mutex_is_locked(lock) to do the initial
    checks for if the lock is free rather than directly calling atomic_read()
    on the lock->count, in order to improve readability.
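
    In the slowpath this roughly becomes:

      /*
       * Once more, try to acquire the lock. Only try-acquire the mutex if
       * it is unlocked, to avoid unnecessary xchg() operations.
       */
      if (!mutex_is_locked(lock) && (atomic_xchg(&lock->count, 0) == 1))
              goto skip_wait;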

    Signed-off-by: Jason Low
    Acked-by: Waiman Long
    Signed-off-by: Peter Zijlstra
    Cc: akpm@linux-foundation.org
    Cc: tim.c.chen@linux.intel.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: rostedt@goodmis.org
    Cc: davidlohr@hp.com
    Cc: scott.norton@hp.com
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1402511843-4721-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • MUTEX_SHOW_NO_WAITER() is a macro which checks whether there are
    "no waiters" on a mutex by checking if the lock count is non-negative.
    Based on feedback from the discussion in the earlier version of this
    patchset, the macro is not very readable.

    Furthermore, checking lock->count isn't always the correct way to
    determine if there are "no waiters" on a mutex. For example, a negative
    count on a mutex really only means that there "potentially" are
    waiters. Likewise, there can be waiters on the mutex even if the count is
    non-negative. Thus, "MUTEX_SHOW_NO_WAITER" doesn't always do what the name
    of the macro suggests.

    So this patch deletes the MUTEX_SHOW_NO_WAITER() macro, directly
    uses atomic_read() instead of the macro, and adds comments which
    elaborate on how the extra atomic_read() checks can help reduce
    unnecessary xchg() operations.
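
    For reference, the removed macro was of roughly this form:

      /* Checking count >= 0 only shows there are *potentially* no waiters: */
      #define MUTEX_SHOW_NO_WAITER(mutex)  (atomic_read(&(mutex)->count) >= 0)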

    Signed-off-by: Jason Low
    Acked-by: Waiman Long
    Signed-off-by: Peter Zijlstra
    Cc: akpm@linux-foundation.org
    Cc: tim.c.chen@linux.intel.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: rostedt@goodmis.org
    Cc: davidlohr@hp.com
    Cc: scott.norton@hp.com
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1402511843-4721-3-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     
  • The mutex optimistic spinning documentation states that we spin for
    acquisition when we find that there are no pending waiters. However,
    in actuality, whether or not there are waiters for the mutex doesn't
    determine if we will spin for it.

    This patch removes that statement and also adds a comment which
    mentions that we spin for the mutex while we don't need to reschedule.

    Signed-off-by: Jason Low
    Acked-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra
    Cc: akpm@linux-foundation.org
    Cc: tim.c.chen@linux.intel.com
    Cc: paulmck@linux.vnet.ibm.com
    Cc: rostedt@goodmis.org
    Cc: Waiman.Long@hp.com
    Cc: scott.norton@hp.com
    Cc: aswin@hp.com
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1402511843-4721-2-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
     

22 Jun, 2014

1 commit

  • In case the deadlock detector is enabled we follow the lock chain to
    the end in rt_mutex_adjust_prio_chain(), even if we could stop earlier
    due to the priority/waiter constellation.

    But once we are no longer the top priority waiter in a certain step,
    or the task holding the lock already has the same priority, then there
    is no point in dequeueing and enqueueing along the lock chain, as there
    is no change at all.

    So stop the queueing at this point.
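
    A simplified sketch of the early-exit condition (the real check lives
    inside rt_mutex_adjust_prio_chain(); variable names approximate):

      if (waiter->prio == task->prio) {
              if (!detect_deadlock)
                      goto out_unlock_pi;     /* nothing changes along the chain */
              else
                      requeue = false;        /* keep walking, skip the requeueing */
      }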

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Reviewed-by: Steven Rostedt
    Cc: Lai Jiangshan
    Link: http://lkml.kernel.org/r/20140522031950.280830190@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner