21 Jul, 2015

1 commit

  • Enabling locking-selftest in a VM guest may cause the following
    kernel panic:

    kernel BUG at .../kernel/locking/qspinlock_paravirt.h:137!

    This is because the pvqspinlock unlock function expects either
    _Q_LOCKED_VAL or _Q_SLOW_VAL in the lock byte. This patch suppresses
    that bug report when debug_locks_silent is set; otherwise, a warning
    is printed if the lock byte contains an unexpected value.

    With this patch applied, the kernel locking-selftest completed
    without any noise.

    Tested-by: Masami Hiramatsu
    Signed-off-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1436663959-53092-1-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
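
    A rough user-space sketch of the behaviour described above: warn about
    an unexpected lock byte instead of BUG()ing, and stay quiet while
    debug_locks_silent is set. The Q_* values and the flag are stand-ins
    for the kernel's own (_Q_LOCKED_VAL, _Q_SLOW_VAL, debug_locks_silent);
    this is only an illustration, not the actual patch:

        #include <stdint.h>
        #include <stdio.h>

        #define Q_LOCKED_VAL 1U         /* illustrative values */
        #define Q_SLOW_VAL   3U

        /* Stand-in for the flag the locking-selftest sets. */
        static int debug_locks_silent;

        /* Tolerate an unexpected lock byte instead of panicking. */
        static void pv_unlock_check(uint8_t lockval)
        {
            if (lockval == Q_LOCKED_VAL || lockval == Q_SLOW_VAL)
                return;                         /* expected states */

            if (!debug_locks_silent)
                fprintf(stderr,
                        "WARNING: pvqspinlock: lock byte is 0x%02x on unlock\n",
                        lockval);
            /* ...and continue rather than BUG() */
        }

        int main(void)
        {
            pv_unlock_check(Q_LOCKED_VAL);      /* silent */
            pv_unlock_check(0x42);              /* warns */
            debug_locks_silent = 1;
            pv_unlock_check(0x42);              /* ignored, as during the selftest */
            return 0;
        }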
     

25 Jun, 2015

2 commits

  • Pull scheduler updates from Thomas Gleixner:
    "This series of scheduler updates depends on sched/core and timers/core
    branches, which are already in your tree:

    - Scheduler balancing overhaul to plug a hard to trigger race which
    causes an oops in the balancer (Peter Zijlstra)

    - Lockdep updates which are related to the balancing updates (Peter
    Zijlstra)"

    * 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched,lockdep: Employ lock pinning
    lockdep: Implement lock pinning
    lockdep: Simplify lock_release()
    sched: Streamline the task migration locking a little
    sched: Move code around
    sched,dl: Fix sched class hopping CBS hole
    sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks
    sched,dl: Remove return value from pull_dl_task()
    sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks
    sched,rt: Remove return value from pull_rt_task()
    sched: Allow balance callbacks for check_class_changed()
    sched: Use replace normalize_task() with __sched_setscheduler()
    sched: Replace post_schedule with a balance callback list

    Linus Torvalds
     
  • Pull locking updates from Thomas Gleixner:
    "These locking updates depend on the alreay merged sched/core branch:

    - Lockless top waiter wakeup for rtmutex (Davidlohr)

    - Reduce hash bucket lock contention for PI futexes (Sebastian)

    - Documentation update (Davidlohr)"

    * 'sched-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/rtmutex: Update stale plist comments
    futex: Lower the lock contention on the HB lock during wake up
    locking/rtmutex: Implement lockless top-waiter wakeup

    Linus Torvalds
     

23 Jun, 2015

4 commits

  • Pull timer updates from Thomas Gleixner:
    "A rather largish update for everything time and timer related:

    - Cache footprint optimizations for both hrtimers and timer wheel

    - Lower the NOHZ impact on systems which have NOHZ or timer migration
    disabled at runtime.

    - Optimize run time overhead of hrtimer interrupt by making the clock
    offset updates smarter

    - hrtimer cleanups and removal of restrictions to tackle some
    problems in sched/perf

    - Some more leap second tweaks

    - Another round of changes addressing the 2038 problem

    - First step to change the internals of clock event devices by
    introducing the necessary infrastructure

    - Allow constant folding for usecs/msecs_to_jiffies()

    - The usual pile of clockevent/clocksource driver updates

    The hrtimer changes contain updates to sched, perf and x86 as they
    depend on them plus changes all over the tree to cleanup API changes
    and redundant code, which got copied all over the place. The y2038
    changes touch s390 to remove the last non 2038 safe code related to
    boot/persistent clock"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
    clocksource: Increase dependencies of timer-stm32 to limit build wreckage
    timer: Minimize nohz off overhead
    timer: Reduce timer migration overhead if disabled
    timer: Stats: Simplify the flags handling
    timer: Replace timer base by a cpu index
    timer: Use hlist for the timer wheel hash buckets
    timer: Remove FIFO "guarantee"
    timers: Sanitize catchup_timer_jiffies() usage
    hrtimer: Allow hrtimer::function() to free the timer
    seqcount: Introduce raw_write_seqcount_barrier()
    seqcount: Rename write_seqcount_barrier()
    hrtimer: Fix hrtimer_is_queued() hole
    hrtimer: Remove HRTIMER_STATE_MIGRATE
    selftest: Timers: Avoid signal deadlock in leap-a-day
    timekeeping: Copy the shadow-timekeeper over the real timekeeper last
    clockevents: Check state instead of mode in suspend/resume path
    selftests: timers: Add leap-second timer edge testing to leap-a-day.c
    ntp: Do leapsecond adjustment in adjtimex read path
    time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
    ntp: Introduce and use SECS_PER_DAY macro instead of 86400
    ...

    Linus Torvalds
     
  • Pull scheduler updates from Ingo Molnar:
    "The main changes are:

    - lockless wakeup support for futexes and IPC message queues
    (Davidlohr Bueso, Peter Zijlstra)

    - Replace spinlocks with atomics in thread_group_cputimer(), to
    improve scalability (Jason Low)

    - NUMA balancing improvements (Rik van Riel)

    - SCHED_DEADLINE improvements (Wanpeng Li)

    - clean up and reorganize preemption helpers (Frederic Weisbecker)

    - decouple page fault disabling machinery from the preemption
    counter, to improve debuggability and robustness (David
    Hildenbrand)

    - SCHED_DEADLINE documentation updates (Luca Abeni)

    - topology CPU masks cleanups (Bartosz Golaszewski)

    - /proc/sched_debug improvements (Srikar Dronamraju)"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
    sched/deadline: Remove needless parameter in dl_runtime_exceeded()
    sched: Remove superfluous resetting of the p->dl_throttled flag
    sched/deadline: Drop duplicate init_sched_dl_class() declaration
    sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
    sched/deadline: Make init_sched_dl_class() __init
    sched/deadline: Optimize pull_dl_task()
    sched/preempt: Add static_key() to preempt_notifiers
    sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
    sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
    sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
    sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
    sched/debug: Properly format runnable tasks in /proc/sched_debug
    sched/numa: Only consider less busy nodes as numa balancing destinations
    Revert 095bebf61a46 ("sched/numa: Do not move past the balance point if unbalanced")
    sched/fair: Prevent throttling in early pick_next_task_fair()
    preempt: Reorganize the notrace definitions a bit
    preempt: Use preempt_schedule_context() as the official tracing preemption point
    sched: Make preempt_schedule_context() function-tracing safe
    x86: Remove cpu_sibling_mask() and cpu_core_mask()
    x86: Replace cpu_**_mask() with topology_**_cpumask()
    ...

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "The main changes are:

    - 'qspinlock' support, enabled on x86: queued spinlocks - these are
    now the spinlock variant used by x86 as they outperform ticket
    spinlocks in every category. (Waiman Long)

    - 'pvqspinlock' support on x86: paravirtualized variant of queued
    spinlocks. (Waiman Long, Peter Zijlstra)

    - 'qrwlock' support, enabled on x86: queued rwlocks. Similar to
    queued spinlocks, they are now the variant used by x86:

    CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
    CONFIG_QUEUED_SPINLOCKS=y
    CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
    CONFIG_QUEUED_RWLOCKS=y

    - various lockdep fixlets

    - various locking primitives cleanups, further WRITE_ONCE()
    propagation"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    locking/lockdep: Remove hard coded array size dependency
    locking/qrwlock: Don't contend with readers when setting _QW_WAITING
    lockdep: Do not break user-visible string
    locking/arch: Rename set_mb() to smp_store_mb()
    locking/arch: Add WRITE_ONCE() to set_mb()
    rtmutex: Warn if trylock is called from hard/softirq context
    arch: Remove __ARCH_HAVE_CMPXCHG
    locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
    locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
    locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
    locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
    locking/pvqspinlock, x86: Enable PV qspinlock for Xen
    locking/pvqspinlock, x86: Enable PV qspinlock for KVM
    locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
    locking/pvqspinlock: Implement simple paravirt support for the qspinlock
    locking/qspinlock: Revert to test-and-set on hypervisors
    locking/qspinlock: Use a simple write to grab the lock
    locking/qspinlock: Optimize for smaller NR_CPUS
    locking/qspinlock: Extract out code snippets for the next patch
    locking/qspinlock: Add pending bit
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:

    - Continued initialization/Kconfig updates: hide most Kconfig options
    from unsuspecting users.

    There's now a single high level configuration option:

    *
    * RCU Subsystem
    *
    Make expert-level adjustments to RCU configuration (RCU_EXPERT) [N/y/?] (NEW)

    Which if answered in the negative, leaves us with a single
    interactive configuration option:

    Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)

    All the rest of the RCU options are configured automatically. Later
    on we'll remove this single leftover configuration option as well.

    - Remove all uses of RCU-protected array indexes: replace the
    rcu_[access|dereference]_index_check() APIs with READ_ONCE() and
    rcu_lockdep_assert()

    - RCU CPU-hotplug cleanups

    - Updates to Tiny RCU: a race fix and further code shrinkage.

    - RCU torture-testing updates: fixes, speedups, cleanups and
    documentation updates.

    - Miscellaneous fixes

    - Documentation updates

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
    rcutorture: Allow repetition factors in Kconfig-fragment lists
    rcutorture: Display "make oldconfig" errors
    rcutorture: Update TREE_RCU-kconfig.txt
    rcutorture: Make rcutorture scripts force RCU_EXPERT
    rcutorture: Update configuration fragments for rcutree.rcu_fanout_exact
    rcutorture: TASKS_RCU set directly, so don't explicitly set it
    rcutorture: Test SRCU cleanup code path
    rcutorture: Replace barriers with smp_store_release() and smp_load_acquire()
    locktorture: Change longdelay_us to longdelay_ms
    rcutorture: Allow negative values of nreaders to oversubscribe
    rcutorture: Exchange TREE03 and TREE08 NR_CPUS, speed up CPU hotplug
    rcutorture: Exchange TREE03 and TREE04 geometries
    locktorture: fix deadlock in 'rw_lock_irq' type
    rcu: Correctly handle non-empty Tiny RCU callback list with none ready
    rcutorture: Test both RCU-sched and RCU-bh for Tiny RCU
    rcu: Further shrink Tiny RCU by making empty functions static inlines
    rcu: Conditionally compile RCU's eqs warnings
    rcu: Remove prompt for RCU implementation
    rcu: Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIO
    rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAF
    ...

    Linus Torvalds
     

20 Jun, 2015

2 commits

  • ... as of fb00aca4744 (rtmutex: Turn the plist into an rb-tree) we
    no longer use plists for queuing any waiters. Update stale comments.

    Signed-off-by: Davidlohr Bueso
    Cc: Steven Rostedt
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Sebastian Andrzej Siewior
    Cc: Davidlohr Bueso
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1432056298-18738-4-git-send-email-dave@stgolabs.net
    Signed-off-by: Thomas Gleixner

    Davidlohr Bueso
     
    wake_futex_pi() wakes the task before releasing the hash bucket lock
    (HB). The first thing the woken-up task usually does is acquire the
    lock, which requires the HB lock. On SMP systems this leads to
    blocking on the HB lock, which is released by the owner shortly after.
    This patch rearranges the unlock path by first releasing the HB lock
    and then waking up the task.

    [ tglx: Fixed up the rtmutex unlock path ]

    Originally-from: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Davidlohr Bueso
    Link: http://lkml.kernel.org/r/20150617083350.GA2433@linutronix.de
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     

19 Jun, 2015

5 commits

  • Jiri reported a machine stuck in multi_cpu_stop() with
    migrate_swap_stop() as function and with the following src,dst cpu
    pairs: {11, 4} {13, 11} { 4, 13}

    4 11 13

    cpuM: queue(4 ,13)
    *Ma
    cpuN: queue(13,11)
    *N Na
    *M Mb
    cpuO: queue(11, 4)
    *O Oa
    *Nb
    *Ob

    Where *X denotes the CPU running the queueing of cpu-X and X[ab]
    denotes the first/second queued work.

    You'll observe that the top of the work queue for each of CPUs 4, 11
    and 13 is work queued from CPUs M, O and N respectively. IOW, a
    deadlock.

    Do away with the queueing trickery and introduce lg_double_lock() to
    lock both CPUs and fully serialize the stop_two_cpus() callers instead
    of the partial (and buggy) serialization we have now.

    Reported-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150605153023.GH19282@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The current cmpxchg() loop in setting the _QW_WAITING flag for writers
    in queue_write_lock_slowpath() will contend with incoming readers
    causing possibly extra cmpxchg() operations that are wasteful. This
    patch changes the code to do a byte cmpxchg() to eliminate contention
    with new readers.

    A multithreaded microbenchmark running a 5M read_lock/write_lock loop
    on an 8-socket 80-core Westmere-EX machine running a 4.0-based kernel
    with the qspinlock patch has the following execution times (in ms)
    with and without the patch:

    With R:W ratio = 5:1

    Threads    w/o patch    with patch    % change
    -------    ---------    ----------    --------
       2            990           895       -9.6%
       3           2136          1912      -10.5%
       4           3166          2830      -10.6%
       5           3953          3629       -8.2%
       6           4628          4405       -4.8%
       7           5344          5197       -2.8%
       8           6065          6004       -1.0%
       9           6826          6811       -0.2%
      10           7599          7599        0.0%
      15           9757          9766       +0.1%
      20          13767         13817       +0.4%

    With small number of contending threads, this patch can improve
    locking performance by up to 10%. With more contending threads,
    however, the gain diminishes.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1433863153-30722-3-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
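
    A user-space C11 sketch of the byte-cmpxchg idea described above. The
    union layout (writer byte in the low 8 bits, reader count above it,
    little-endian assumed) and the QW_* values mirror the kernel's qrwlock
    conventions, but this is only a model, not the kernel code:

        #include <stdatomic.h>
        #include <stdint.h>
        #include <stdio.h>

        #define QW_WAITING 0x01U    /* writer waiting for readers to drain */
        #define QW_WMASK   0xffU    /* writer byte */

        /* Layout sketch: low byte = writer state, upper bytes = reader count. */
        union qrwlock {
            _Atomic uint32_t cnts;
            struct {
                _Atomic uint8_t wmode;
                uint8_t rcnt[3];
            };
        };

        /* Old approach: full-word cmpxchg, which fails every time a reader
         * arrives and bumps the count living in the same word. */
        static void set_waiting_word(union qrwlock *l)
        {
            uint32_t cnts = atomic_load(&l->cnts);

            for (;;) {
                if (!(cnts & QW_WMASK) &&
                    atomic_compare_exchange_weak(&l->cnts, &cnts,
                                                 cnts | QW_WAITING))
                    return;
                cnts = atomic_load(&l->cnts);
            }
        }

        /* New approach: byte-wide cmpxchg on the writer byte only, so reader
         * traffic in the upper bytes can no longer make the attempt fail. */
        static void set_waiting_byte(union qrwlock *l)
        {
            uint8_t expect = 0;

            while (!atomic_compare_exchange_weak(&l->wmode, &expect, QW_WAITING))
                expect = 0;     /* only a competing writer gets us here */
        }

        int main(void)
        {
            union qrwlock a = { .cnts = 0 }, b = { .cnts = 0 };

            set_waiting_word(&a);
            set_waiting_byte(&b);
            printf("a=0x%08x b=0x%08x\n",
                   (unsigned)atomic_load(&a.cnts), (unsigned)atomic_load(&b.cnts));
            return 0;
        }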
     
    Add a lockdep annotation that WARNs if you 'accidentally' unlock a
    lock.

    This is especially helpful for code with callbacks, where the upper
    layer assumes a lock remains taken, but a lower layer thinks it may be
    able to drop and reacquire the lock.

    By unwittingly breaking up the lock, races can be introduced.

    Lock pinning is a lockdep annotation that helps with this: when you
    lockdep_pin_lock() a held lock, any unlock without a matching
    lockdep_unpin_lock() will produce a WARN. Think of this as a relative
    of lockdep_assert_held(), except you don't only assert that it is held
    now, but ensure it stays held until you release your assertion.

    RFC: a possible alternative API would be something like:

    int cookie = lockdep_pin_lock(&foo);
    ...
    lockdep_unpin_lock(&foo, cookie);

    Where we pick a random number for the pin_count; this makes it
    impossible to sneak a lock break in without also passing the right
    cookie along.

    I've not done this because it ends up generating code for !LOCKDEP,
    esp. if you need to pass the cookie around for some reason.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: ktkhai@parallels.com
    Cc: rostedt@goodmis.org
    Cc: juri.lelli@gmail.com
    Cc: pang.xunlei@linaro.org
    Cc: oleg@redhat.com
    Cc: wanpeng.li@linux.intel.com
    Cc: umgwanakikbuti@gmail.com
    Link: http://lkml.kernel.org/r/20150611124743.906731065@infradead.org
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
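
    A minimal user-space model of the pin-count idea (this is not the
    kernel lockdep API, just an illustration of the semantics): pinning a
    held lock makes any unlock before the matching unpin produce a
    warning, even if the lock is immediately re-acquired.

        #include <pthread.h>
        #include <stdio.h>

        struct pinned_lock {
            pthread_mutex_t mtx;
            unsigned int pin_count;     /* only touched while mtx is held */
        };

        static void pin_lock(struct pinned_lock *l)   { l->pin_count++; }
        static void unpin_lock(struct pinned_lock *l) { l->pin_count--; }

        static void unlock_checked(struct pinned_lock *l)
        {
            if (l->pin_count)
                fprintf(stderr, "WARN: unlocking a pinned lock (pins=%u)\n",
                        l->pin_count);
            pthread_mutex_unlock(&l->mtx);
        }

        /* A lower layer must not silently drop the lock the caller pinned. */
        static void lower_layer(struct pinned_lock *l)
        {
            unlock_checked(l);              /* triggers the warning...     */
            pthread_mutex_lock(&l->mtx);    /* ...even though it re-locks  */
        }

        int main(void)
        {
            struct pinned_lock l = { PTHREAD_MUTEX_INITIALIZER, 0 };

            pthread_mutex_lock(&l.mtx);
            pin_lock(&l);       /* like lockdep_pin_lock(): must stay held */
            lower_layer(&l);
            unpin_lock(&l);
            unlock_checked(&l);
            return 0;
        }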
     
    lock_release() takes a 'nested' argument that is mostly pointless
    these days; remove the implementation but leave the argument in place
    as a rudiment for now.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: ktkhai@parallels.com
    Cc: rostedt@goodmis.org
    Cc: juri.lelli@gmail.com
    Cc: pang.xunlei@linaro.org
    Cc: oleg@redhat.com
    Cc: wanpeng.li@linux.intel.com
    Cc: umgwanakikbuti@gmail.com
    Link: http://lkml.kernel.org/r/20150611124743.840411606@infradead.org
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     
    Mark the task for later wakeup after the wait_lock has been released.
    This way, once the next task is awoken, it will have a better chance
    of finding the wait_lock free when it continues executing in
    __rt_mutex_slowlock() and tries to acquire the rtmutex, calling
    try_to_take_rt_mutex(). In contended scenarios, other tasks attempting
    to take the lock may acquire it first, right after the wait_lock is
    released, but (a) this can also occur with the current code, as it
    relies on the spinlock fairness, and (b) we are dealing with the
    top-waiter anyway, so it will always take the lock next.

    Signed-off-by: Davidlohr Bueso
    Cc: Steven Rostedt
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Sebastian Andrzej Siewior
    Cc: Davidlohr Bueso
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1432056298-18738-2-git-send-email-dave@stgolabs.net
    Signed-off-by: Thomas Gleixner

    Davidlohr Bueso
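
    A small user-space sketch of the ordering described above, loosely
    modeled on the kernel's deferred wake-queue idea: the waiter is only
    marked for wakeup while wait_lock is held, and the actual wakeup
    happens after the lock has been dropped. The types below are stand-ins:

        #include <pthread.h>
        #include <stdio.h>

        struct task { const char *name; };

        /* Deferred-wake list in the spirit of the kernel's wake_q. */
        struct wake_q {
            struct task *tasks[8];
            int n;
        };

        static void wake_q_add(struct wake_q *q, struct task *t)
        {
            q->tasks[q->n++] = t;           /* only mark for later wakeup */
        }

        static void wake_up_q(struct wake_q *q)
        {
            for (int i = 0; i < q->n; i++)
                printf("waking %s (wait_lock already released)\n",
                       q->tasks[i]->name);
            q->n = 0;
        }

        static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
        static struct task top_waiter = { "top-waiter" };

        static void slow_unlock(void)
        {
            struct wake_q q = { .n = 0 };

            pthread_mutex_lock(&wait_lock);
            /* ... dequeue the top waiter ... */
            wake_q_add(&q, &top_waiter);
            pthread_mutex_unlock(&wait_lock);

            wake_up_q(&q);  /* wakeup happens after wait_lock is free */
        }

        int main(void) { slow_unlock(); return 0; }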
     

07 Jun, 2015

1 commit

  • The lock_class iteration of /proc/lock_stat is not serialized against
    the lockdep_free_key_range() call from module unload.

    Therefore it can happen that we find a class of which ->name/->key are
    no longer valid.

    There is a further bug in zap_class() that left ->name dangling. Cure
    this. Use RCU_INIT_POINTER(), because the new value is NULL and so no
    ordering is required.

    Since lockdep_free_key_range() is rcu_sched serialized, we can read
    both ->name and ->key under rcu_read_lock_sched() (preempt-disable)
    and be assured that if we observe a !NULL value it stays safe to use
    for as long as we hold that lock.

    If we observe both NULL, skip the entry.

    Reported-by: Jerome Marchand
    Tested-by: Jerome Marchand
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150602105013.GS3644@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

03 Jun, 2015

1 commit

  • Remove the line-break in the user-visible string and add the
    missing space in this error message:

    WARNING: lockdep init error! lock-(console_sem).lock was acquiredbefore lockdep_init

    Also:

    - don't yell, it's just a debug warning

    - denote references to function calls with '()'

    - standardize the lock name quoting

    - and finish the sentence.

    The result:

    WARNING: lockdep init error: lock '(console_sem).lock' was acquired before lockdep_init().

    Signed-off-by: Borislav Petkov
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20150602133827.GD19887@pd.tnic
    [ Added a few more stylistic tweaks to the error message. ]
    Signed-off-by: Ingo Molnar

    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

02 Jun, 2015

1 commit

  • …k/linux-rcu into core/rcu

    Pull RCU changes from Paul E. McKenney:

    - Initialization/Kconfig updates: hide most Kconfig options from unsuspecting users.
    There's now a single high level configuration option:

    *
    * RCU Subsystem
    *
    Make expert-level adjustments to RCU configuration (RCU_EXPERT) [N/y/?] (NEW)

    Which if answered in the negative, leaves us with a single interactive
    configuration option:

    Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)

    All the rest of the RCU options are configured automatically.

    - Remove all uses of RCU-protected array indexes: replace the
    rcu_[access|dereference]_index_check() APIs with READ_ONCE() and rcu_lockdep_assert().

    - RCU CPU-hotplug cleanups.

    - Updates to Tiny RCU: a race fix and further code shrinkage.

    - RCU torture-testing updates: fixes, speedups, cleanups and
    documentation updates.

    - Miscellaneous fixes.

    - Documentation updates.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

14 May, 2015

1 commit

  • rt_mutex_trylock() must be called from thread context. It can be
    called from atomic regions (preemption or interrupts disabled), but
    not from hard/softirq/nmi context. Add a warning to alert abusers.

    The reasons for this are:

    1) There is a potential deadlock in the slowpath

    2) Another cpu which blocks on the rtmutex will boost the task
    which allegedly locked the rtmutex, but that cannot work
    because the hard/softirq context borrows the task context.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior

    Thomas Gleixner
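
    A user-space sketch of the guard described above. The in_*() context
    predicates are stand-in booleans here, since only the kernel can ask
    which context it is running in; the exact checks in the real patch may
    differ:

        #include <stdbool.h>
        #include <stdio.h>

        /* Stand-ins for the kernel's context predicates. */
        static bool in_irq_ctx, in_nmi_ctx, in_serving_softirq_ctx;

        static bool warn_on_once(bool cond, const char *what)
        {
            static bool warned;

            if (cond && !warned) {
                warned = true;
                fprintf(stderr, "WARNING: %s\n", what);
            }
            return cond;
        }

        /* Refuse (and warn) when called from hard/soft irq or NMI context. */
        static int rt_mutex_trylock_sketch(void)
        {
            if (warn_on_once(in_irq_ctx || in_nmi_ctx || in_serving_softirq_ctx,
                             "rt_mutex_trylock() called from invalid context"))
                return 0;       /* behave as if the trylock failed */

            /* ... real trylock fast/slow path would go here ... */
            return 1;
        }

        int main(void)
        {
            printf("thread context: %d\n", rt_mutex_trylock_sketch());
            in_serving_softirq_ctx = true;
            printf("softirq context: %d\n", rt_mutex_trylock_sketch());
            return 0;
        }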
     

13 May, 2015

1 commit

  • The rtmutex code is the only user of __HAVE_ARCH_CMPXCHG and we have a few
    other user of cmpxchg() which do not care about __HAVE_ARCH_CMPXCHG. This
    define was first introduced in 23f78d4a0 ("[PATCH] pi-futex: rt mutex core")
    which is v2.6.18. The generic cmpxchg was introduced later in 068fbad288
    ("Add cmpxchg_local to asm-generic for per cpu atomic operations") which is
    v2.6.25.
    Back then something was required to get rtmutex working with the fast
    path on architectures without cmpxchg and this seems to be the result.

    It popped up recently on rt-users because ARM (v6+) does not define
    __HAVE_ARCH_CMPXCHG (even though it implements it), which results in
    slower locking performance in the fast path.
    To put some numbers on it: preempt -RT, am335x, 10 loops of
    100000 invocations of rt_spin_lock() + rt_spin_unlock() (time "total" is
    the average of the 10 loops for the 100000 invocations, "loop" is
    "total / 100000 * 1000"):

    cmpxchg |     slowpath used    ||      cmpxchg used
            |    total    |  loop  ||    total     |  loop
    --------|-------------|--------||--------------|--------
    ARMv6   |  9129.4 us  |  91 ns ||   3311.9 us  |  33 ns
    generic |  9360.2 us  |  94 ns ||  10834.6 us  | 108 ns
    -------------------------------||----------------------

    Forcing it to the generic cmpxchg() made things worse for the slowpath
    and even worse in the cmpxchg() path. It boils down to 14 ns more per
    lock+unlock in a cache-hot loop, so it might not be that much in the
    real world. The last test was a substitute for a pre-ARMv6 machine,
    but then I was able to perform the comparison on an imx28, which is
    ARMv5 and therefore always uses the generic cmpxchg implementation.
    And the numbers:

             |     total     |   loop
    ---------|---------------|----------
    slowpath |  263937.2 us  |  2639 ns
    cmpxchg  |   16934.2 us  |   169 ns
    --------------------------------------

    The numbers are larger since the machine is slower in general.
    However, letting rtmutex use cmpxchg() instead of the slowpath seems
    to improve things.

    Since from the ARM (tested on am335x + imx28) point of view always
    using cmpxchg() in rt_mutex_lock() + rt_mutex_unlock() makes sense,
    I would drop the define.

    Signed-off-by: Sebastian Andrzej Siewior
    Cc: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: will.deacon@arm.com
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/20150225175613.GE6823@linutronix.de
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     

12 May, 2015

1 commit

  • To be consistent with the queued spinlocks which use
    CONFIG_QUEUED_SPINLOCKS config parameter, the one for the queued
    rwlocks is now renamed to CONFIG_QUEUED_RWLOCKS.

    Signed-off-by: Waiman Long
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431367031-36697-1-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

11 May, 2015

2 commits

  • Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS
    in arch/x86/kernel/paravirt_patch_32.c, while the symbol is
    called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S')

    But the typo was natural: the proper English term for such
    a generic object would be 'queued spinlocks' - so rename
    this and related symbols accordingly to the plural form.

    Reported-by: Valentin Rothberg
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The xchg() function was used in pv_wait_node() to set a certain
    value and provide a memory barrier which is what the set_mb()
    function is for. This patch replaces the xchg() call by
    set_mb().

    Suggested-by: Linus Torvalds
    Signed-off-by: Waiman Long
    Cc: Douglas Hatch
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Waiman Long
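
    A user-space C11 analogue of the two idioms: the xchg()-based store
    pays for a return value nobody uses, while the set_mb()-style store
    plus full barrier expresses the intent directly (set_mb() was later
    renamed to smp_store_mb(), as listed earlier in this log). This is a
    model of the idea, not the kernel primitives themselves:

        #include <stdatomic.h>
        #include <stdio.h>

        static _Atomic int flag;

        /* What the call site needed: store a value, then a full barrier. */

        static void store_with_xchg(int v)
        {
            (void)atomic_exchange(&flag, v);    /* value + barrier, result unused */
        }

        static void store_with_set_mb(int v)    /* the set_mb()/smp_store_mb() idiom */
        {
            atomic_store_explicit(&flag, v, memory_order_relaxed);
            atomic_thread_fence(memory_order_seq_cst);  /* ~ smp_mb() */
        }

        int main(void)
        {
            store_with_xchg(1);
            store_with_set_mb(2);
            printf("flag = %d\n", atomic_load(&flag));
            return 0;
        }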
     

08 May, 2015

9 commits

  • Provide a separate (second) version of the spin_lock_slowpath for
    paravirt along with a special unlock path.

    The second slowpath is generated by adding a few pv hooks to the
    normal slowpath, but where those will compile away for the native
    case, they expand into special wait/wake code for the pv version.

    The actual MCS queue can use extra storage in the mcs_nodes[] array to
    keep track of state and therefore uses directed wakeups.

    The head contender has no such storage directly visible to the
    unlocker. So the unlocker searches a hash table with open addressing
    using a simple binary Galois linear feedback shift register.

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1429901803-29771-9-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
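
    To illustrate the "open addressing with a Galois LFSR probe sequence"
    idea from the entry above, here is a self-contained sketch; the tap
    mask, table size and hash seed are arbitrary choices for the example
    and not the kernel's actual parameters:

        #include <stdint.h>
        #include <stdio.h>

        /* 16-bit Galois LFSR step; 0xB400 is a common maximal-length tap mask. */
        static uint16_t lfsr_step(uint16_t x)
        {
            unsigned int lsb = x & 1;

            x >>= 1;
            if (lsb)
                x ^= 0xB400;
            return x;
        }

        #define NBUCKETS 256            /* power of two so we can mask */

        struct entry { void *lock; int node; };
        static struct entry table[NBUCKETS];

        /* Open addressing: successive probe positions come from the LFSR. */
        static void hash_insert(void *lock, int node)
        {
            uint16_t h = (uint16_t)((uintptr_t)lock >> 4) | 1;  /* non-zero seed */

            for (;;) {                  /* sketch: assumes the table never fills */
                struct entry *e = &table[h & (NBUCKETS - 1)];

                if (!e->lock) {
                    e->lock = lock;
                    e->node = node;
                    return;
                }
                h = lfsr_step(h);       /* next probe position */
            }
        }

        static int hash_find(void *lock)
        {
            uint16_t h = (uint16_t)((uintptr_t)lock >> 4) | 1;

            while (table[h & (NBUCKETS - 1)].lock != lock)
                h = lfsr_step(h);
            return table[h & (NBUCKETS - 1)].node;
        }

        int main(void)
        {
            int dummy;

            hash_insert(&dummy, 42);
            printf("node for lock: %d\n", hash_find(&dummy));
            return 0;
        }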
     
  • When we detect a hypervisor (!paravirt, see qspinlock paravirt support
    patches), revert to a simple test-and-set lock to avoid the horrors
    of queue preemption.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-8-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
     
    Currently, atomic_cmpxchg() is used to get the lock. However, this
    is not really necessary if there is more than one task in the queue
    and the queue head doesn't need to reset the tail code. For that case,
    a simple write to set the lock bit is enough, as the queue head will
    be the only one eligible to get the lock as long as it checks that
    both the lock and pending bits are not set. The current pending-bit
    waiting code ensures that the pending bit will not be set once the
    tail code in the lock is set.

    With that change, there is some slight improvement in the performance
    of the queued spinlock in the 5M-loop micro-benchmark run on a
    4-socket Westmere-EX machine, as shown in the tables below.

    [Standalone/Embedded - same node]
    # of tasks    Before patch    After patch    %Change
    ----------    ------------    -----------    -------
         3          2324/2321      2248/2265     -3%/-2%
         4          2890/2896      2819/2831     -2%/-2%
         5          3611/3595      3522/3512     -2%/-2%
         6          4281/4276      4173/4160     -3%/-3%
         7          5018/5001      4875/4861     -3%/-3%
         8          5759/5750      5563/5568     -3%/-3%

    [Standalone/Embedded - different nodes]
    # of tasks    Before patch     After patch     %Change
    ----------    ------------     -----------     -------
         3         12242/12237     12087/12093     -1%/-1%
         4         10688/10696     10507/10521     -2%/-2%

    It was also found that this change produced a much bigger performance
    improvement in the newer IvyBridge-EX chip, essentially closing the
    performance gap between the ticket spinlock and the queued spinlock.

    The disk workload of the AIM7 benchmark was run on a 4-socket
    Westmere-EX machine with both ext4 and xfs RAM disks at 3000 users
    on a 3.14 based kernel. The results of the test runs were:

    AIM7 XFS Disk Test
    kernel          JPM      Real Time    Sys Time    Usr Time
    ----------    -------    ---------    --------    --------
    ticketlock    5678233       3.17        96.61       5.81
    qspinlock     5750799       3.13        94.83       5.97

    AIM7 EXT4 Disk Test
    kernel          JPM      Real Time    Sys Time    Usr Time
    ----------    -------    ---------    --------    --------
    ticketlock    1114551      16.15       509.72       7.11
    qspinlock     2184466       8.24       232.99       6.01

    The ext4 filesystem run had a much higher spinlock contention than
    the xfs filesystem run.

    The "ebizzy -m" test was also run with the following results:

    kernel        records/s    Real Time    Sys Time    Usr Time
    ----------    ---------    ---------    --------    --------
    ticketlock       2075        10.00       216.35       3.49
    qspinlock        3023        10.00       198.20       4.80

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-7-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When we allow for a max NR_CPUS < 2^14 we can optimize the pending
    wait-acquire and the xchg_tail() operations.

    By growing the pending bit to a byte, we reduce the tail to 16 bits.
    This means we can use xchg16 for the tail part and do away with all
    the repeated cmpxchg() operations.

    This in turn allows us to unconditionally acquire; the locked state
    as observed by the wait loops cannot change. And because both locked
    and pending are now a full byte we can use simple stores for the
    state transition, obviating one atomic operation entirely.

    This optimization is needed to make the qspinlock achieve performance
    parity with ticket spinlock at light load.

    All this is horribly broken on Alpha pre EV56 (and any other arch that
    cannot do single-copy atomic byte stores).

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-6-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
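
    A user-space C11 sketch of the word layout this enables (locked byte,
    pending byte, 16-bit tail; little-endian assumed) and of how the tail
    can then be published with a single 16-bit exchange. Field offsets
    follow the description above; treat the code as an illustration, not
    the kernel implementation:

        #include <stdatomic.h>
        #include <stdint.h>
        #include <stdio.h>

        /*
         * Word layout when NR_CPUS < 2^14:
         *   bits  0- 7: locked byte
         *   bits  8-15: pending byte
         *   bits 16-31: tail (node index + CPU number)
         */
        union qspinlock {
            _Atomic uint32_t val;
            struct {
                _Atomic uint8_t  locked;
                _Atomic uint8_t  pending;
                _Atomic uint16_t tail;
            };
        };

        /* Publish ourselves as the new tail with one 16-bit xchg, no cmpxchg loop. */
        static uint16_t xchg_tail(union qspinlock *l, uint16_t tail)
        {
            return atomic_exchange(&l->tail, tail);
        }

        /* Both locked and pending are full bytes, so plain stores suffice. */
        static void clear_pending_set_locked(union qspinlock *l)
        {
            atomic_store(&l->pending, 0);
            atomic_store(&l->locked, 1);
        }

        int main(void)
        {
            union qspinlock lock = { .val = 0 };
            uint16_t prev = xchg_tail(&lock, (3 /* cpu+1 */ << 2) | 0 /* idx */);

            clear_pending_set_locked(&lock);
            printf("prev tail %u, word now 0x%08x\n", prev,
                   (unsigned)atomic_load(&lock.val));
            return 0;
        }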
     
  • This is a preparatory patch that extracts out the following 2 code
    snippets to prepare for the next performance optimization patch.

    1) the logic for the exchange of new and previous tail code words
    into a new xchg_tail() function.
    2) the logic for clearing the pending bit and setting the locked bit
    into a new clear_pending_set_locked() function.

    This patch also simplifies the trylock operation before queuing by
    calling queued_spin_trylock() directly.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-5-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
    Because the qspinlock needs to touch a second cacheline (the per-cpu
    mcs_nodes[]), add a pending bit and allow a single in-word spinner
    before we punt to the second cacheline.

    It is possible to observe the pending bit without the locked bit when
    the last owner has just released but the pending owner has not yet
    taken ownership.

    In this case we would normally queue -- because the pending bit is
    already taken. However, in this case the pending bit is guaranteed
    to be released 'soon', therefore wait for it and avoid queueing.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-4-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
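
    A simplified, single-threaded sketch of the fast-path decision the
    pending bit enables: one in-word spinner claims the pending bit and
    waits for the lock byte to clear, and anything more contended falls
    back to queueing on the MCS node. The bit positions and the exact
    checks below are illustrative only:

        #include <stdatomic.h>
        #include <stdint.h>
        #include <stdio.h>

        #define Q_LOCKED    (1U << 0)
        #define Q_PENDING   (1U << 8)
        #define Q_TAIL_MASK (~0U << 16)     /* non-zero: someone is queued */

        static _Atomic uint32_t qlock;

        /* Return 1 if the lock was taken via the pending path, 0 if the
         * caller should fall back to the MCS queue. */
        static int try_pending_path(void)
        {
            uint32_t val = atomic_load(&qlock);

            if (val & (Q_PENDING | Q_TAIL_MASK))
                return 0;                   /* already a waiter: go queue */

            /* claim the single in-word spinner slot */
            if (!atomic_compare_exchange_strong(&qlock, &val, val | Q_PENDING))
                return 0;

            /* wait for the current owner to drop the lock byte */
            while (atomic_load(&qlock) & Q_LOCKED)
                ;

            /* take the lock: set LOCKED, clear PENDING */
            atomic_fetch_add(&qlock, Q_LOCKED - Q_PENDING);
            return 1;
        }

        int main(void)
        {
            printf("uncontended pending path: %s\n",
                   try_pending_path() ? "acquired" : "queued");
            return 0;
        }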
     
    This patch introduces a new generic queued spinlock implementation
    that can serve as an alternative to the default ticket spinlock. This
    queued spinlock should be almost as fair as the ticket spinlock. It
    has about the same speed in the single-threaded case and can be much
    faster in high-contention situations, especially when the spinlock is
    embedded within the data structure to be protected.

    Only in light to moderate contention where the average queue depth
    is around 1-3 will this queued spinlock be potentially a bit slower
    due to the higher slowpath overhead.

    This queued spinlock is especially suited to NUMA machines with a
    large number of cores, as the chance of spinlock contention is much
    higher on those machines. The cost of contention is also higher
    because of slower inter-node memory traffic.

    Due to the fact that spinlocks are acquired with preemption disabled,
    the process will not be migrated to another CPU while it is trying
    to get a spinlock. Ignoring interrupt handling, a CPU can only be
    contending in one spinlock at any one time. Counting soft IRQ, hard
    IRQ and NMI, a CPU can only have a maximum of 4 concurrent lock waiting
    activities. By allocating a set of per-cpu queue nodes and used them
    to form a waiting queue, we can encode the queue node address into a
    much smaller 24-bit size (including CPU number and queue node index)
    leaving one byte for the lock.

    Please note that the queue node is only needed when waiting for the
    lock. Once the lock is acquired, the queue node can be released to
    be used later.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
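
    A sketch of the encoding described above: per-CPU queue nodes (at most
    4 nesting levels) addressed by a small (CPU number + 1, node index)
    pair instead of a pointer, so the whole tail fits into the bits left
    over next to the lock byte. The exact shift and sizes here are
    illustrative:

        #include <stdint.h>
        #include <stdio.h>

        #define MAX_NODES 4     /* task, softirq, hardirq, NMI nesting levels */
        #define NR_CPUS   64    /* example size for this sketch */

        struct mcs_node { struct mcs_node *next; int locked; };

        /* Per-CPU array of queue nodes, as described above. */
        static struct mcs_node mcs_nodes[NR_CPUS][MAX_NODES];

        /*
         * Encode (cpu, idx) into a small tail value; the CPU is stored +1
         * so that tail == 0 can mean "no queue".
         */
        static uint32_t encode_tail(int cpu, int idx)
        {
            return ((uint32_t)(cpu + 1) << 2) | (uint32_t)idx;
        }

        static struct mcs_node *decode_tail(uint32_t tail)
        {
            int cpu = (int)(tail >> 2) - 1;
            int idx = (int)(tail & 3);

            return &mcs_nodes[cpu][idx];
        }

        int main(void)
        {
            uint32_t tail = encode_tail(17, 2);

            printf("tail = 0x%x, node = %p (expect %p)\n", tail,
                   (void *)decode_tail(tail), (void *)&mcs_nodes[17][2]);
            return 0;
        }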
     
  • In up_write()/up_read(), rwsem_wake() will be called whenever it
    detects that some writers/readers are waiting. The rwsem_wake()
    function will take the wait_lock and call __rwsem_do_wake() to do the
    real wakeup. For a heavily contended rwsem, doing a spin_lock() on
    wait_lock will cause further contention on the heavily contended rwsem
    cacheline resulting in delay in the completion of the up_read/up_write
    operations.

    This patch makes the wait_lock taking and the call to __rwsem_do_wake()
    optional if at least one spinning writer is present. The spinning
    writer will be able to take the rwsem and call rwsem_wake() later
    when it calls up_write(). With the presence of a spinning writer,
    rwsem_wake() will now try to acquire the lock using trylock. If that
    fails, it will just quit.

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Davidlohr Bueso
    Acked-by: Jason Low
    Cc: Andrew Morton
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1430428337-16802-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
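
    A user-space model of the optimization: when an optimistic-spinning
    writer is present, the wake path only trylocks the wait_lock and
    simply gives up if that fails, since the spinner will take the rwsem
    and run the wake path itself later. The structure below is a stand-in,
    not the kernel's struct rw_semaphore:

        #include <pthread.h>
        #include <stdbool.h>
        #include <stdio.h>

        struct rwsem_model {
            pthread_mutex_t wait_lock;
            bool has_spinner;   /* an active optimistic-spinning writer */
            int waiters;
        };

        static void rwsem_wake_model(struct rwsem_model *sem)
        {
            if (sem->has_spinner) {
                /* Never spin on wait_lock: the spinner will do the wakeup
                 * from its own up_write() later anyway. */
                if (pthread_mutex_trylock(&sem->wait_lock) != 0)
                    return;             /* contended: just quit */
            } else {
                pthread_mutex_lock(&sem->wait_lock);
            }

            if (sem->waiters)
                printf("waking %d waiter(s)\n", sem->waiters);
            pthread_mutex_unlock(&sem->wait_lock);
        }

        int main(void)
        {
            struct rwsem_model sem = {
                .wait_lock = PTHREAD_MUTEX_INITIALIZER,
                .has_spinner = true,
                .waiters = 1,
            };

            rwsem_wake_model(&sem);
            return 0;
        }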
     
  • Ronny reported that the following scenario is not handled correctly:

    T1 (prio = 10)
       lock(rtmutex);

    T2 (prio = 20)
       lock(rtmutex)
          boost T1

    T1 (prio = 20)
       sys_set_scheduler(prio = 30)
       T1 prio = 30
       ....
       sys_set_scheduler(prio = 10)
       T1 prio = 30

    The last step is wrong as T1 should now be back at prio 20.

    Commit c365c292d059 ("sched: Consider pi boosting in setscheduler()")
    only handles the case where a boosted task tries to lower its
    priority.

    Fix it by taking the new effective priority into account for the
    decision whether a change of the priority is required.

    Reported-by: Ronny Meeus
    Tested-by: Steven Rostedt
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Steven Rostedt
    Cc:
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Mike Galbraith
    Fixes: c365c292d059 ("sched: Consider pi boosting in setscheduler()")
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanos
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
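
    A tiny sketch of the decision the fix is about, using the same
    convention as the scenario above (a larger number means higher
    priority, unlike the kernel's internal inverted prio values): the
    priority that matters is the newly requested one boosted by the top
    pi-waiter, and only a change of that effective priority requires
    further action.

        #include <stdio.h>

        static int max(int a, int b) { return a > b ? a : b; }

        /* Effective priority: own (requested) priority boosted by the top waiter. */
        static int effective_prio(int requested_prio, int top_waiter_prio)
        {
            return max(requested_prio, top_waiter_prio);
        }

        int main(void)
        {
            int top_waiter = 20;    /* T2 blocks on the rtmutex */

            int before = effective_prio(30, top_waiter);  /* sys_set_scheduler(30) */
            int after  = effective_prio(10, top_waiter);  /* sys_set_scheduler(10) */

            printf("effective prio: %d -> %d (change needed: %s)\n",
                   before, after, before != after ? "yes" : "no");
            return 0;
        }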
     

22 Apr, 2015

1 commit

    The check for hrtimer_active() after starting the timer is
    pointless. If the timer is inactive, it has already expired and
    therefore the task pointer is already NULL.

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Marcelo Tosatti
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150414203503.081830481@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

17 Apr, 2015

1 commit

  • During sysrq's show-held-locks command it is possible that
    hlock_class() returns NULL for a given lock. The result is then (after
    the warning):

    |BUG: unable to handle kernel NULL pointer dereference at 0000001c
    |IP: [] get_usage_chars+0x5/0x100
    |Call Trace:
    | [] print_lock_name+0x23/0x60
    | [] print_lock+0x5d/0x7e
    | [] lockdep_print_held_locks+0x74/0xe0
    | [] debug_show_all_locks+0x132/0x1b0
    | [] sysrq_handle_showlocks+0x8/0x10

    This *might* happen because the thread on the other CPU drops the lock
    after we have looked at ->lockdep_depth, so ->held_locks no longer
    points to a lock that is held.

    The fix here is to simply ignore it and continue.

    Reported-by: Andreas Messerschmid
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Sebastian Andrzej Siewior
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

14 Apr, 2015

1 commit

  • Pull core locking changes from Ingo Molnar:
    "Main changes:

    - jump label asm preparatory work for PowerPC (Anton Blanchard)

    - rwsem optimizations and cleanups (Davidlohr Bueso)

    - mutex optimizations and cleanups (Jason Low)

    - futex fix (Oleg Nesterov)

    - remove broken atomicity checks from {READ,WRITE}_ONCE() (Peter
    Zijlstra)"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    powerpc, jump_label: Include linux/jump_label.h to get HAVE_JUMP_LABEL define
    jump_label: Allow jump labels to be used in assembly
    jump_label: Allow asm/jump_label.h to be included in assembly
    locking/mutex: Further simplify mutex_spin_on_owner()
    locking: Remove atomicy checks from {READ,WRITE}_ONCE
    locking/rtmutex: Rename argument in the rt_mutex_adjust_prio_chain() documentation as well
    locking/rwsem: Fix lock optimistic spinning when owner is not running
    locking: Remove ACCESS_ONCE() usage
    locking/rwsem: Check for active lock before bailing on spinning
    locking/rwsem: Avoid deceiving lock spinners
    locking/rwsem: Set lock ownership ASAP
    locking/rwsem: Document barrier need when waking tasks
    locking/futex: Check PF_KTHREAD rather than !p->mm to filter out kthreads
    locking/mutex: Refactor mutex_spin_on_owner()
    locking/mutex: In mutex_spin_on_owner(), return true when owner changes

    Linus Torvalds
     

09 Apr, 2015

1 commit

  • Similar to what Linus suggested for rwsem_spin_on_owner(), in
    mutex_spin_on_owner() instead of having while (true) and
    breaking out of the spin loop on lock->owner != owner, we can
    have the loop directly check for while (lock->owner == owner) to
    improve the readability of the code.

    It also shrinks the code a bit:

      text    data    bss     dec    hex    filename
      3721       0      0    3721    e89    mutex.o.before
      3705       0      0    3705    e79    mutex.o.after

    Signed-off-by: Jason Low
    Cc: Andrew Morton
    Cc: Aswin Chandramouleeswaran
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/1428521960-5268-2-git-send-email-jason.low2@hp.com
    [ Added code generation info. ]
    Signed-off-by: Ingo Molnar

    Jason Low
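
    A compilable sketch of the restructuring described above (mutex_model
    and owner_still_running() are stand-ins; the point is only the loop
    shape):

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdio.h>

        struct mutex_model { _Atomic(void *) owner; };

        static bool owner_still_running(void *owner)
        {
            (void)owner;
            return true;        /* stand-in for the "owner on_cpu" check */
        }

        /* Before: open-coded infinite loop with an explicit break. */
        static bool spin_on_owner_old(struct mutex_model *lock, void *owner)
        {
            while (true) {
                if (atomic_load(&lock->owner) != owner)
                    break;
                if (!owner_still_running(owner))
                    return false;
            }
            return true;
        }

        /* After: the loop condition states directly what we spin on. */
        static bool spin_on_owner_new(struct mutex_model *lock, void *owner)
        {
            while (atomic_load(&lock->owner) == owner) {
                if (!owner_still_running(owner))
                    return false;
            }
            return true;
        }

        int main(void)
        {
            struct mutex_model m = { .owner = NULL };
            int me;

            /* The owner already changed (NULL != &me), so both return at once. */
            printf("old: %d, new: %d\n",
                   spin_on_owner_old(&m, &me), spin_on_owner_new(&m, &me));
            return 0;
        }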
     

25 Mar, 2015

1 commit