13 Jan, 2021

1 commit

  • Changes in 5.10.6
    Revert "drm/amd/display: Fix memory leaks in S3 resume"
    Revert "mtd: spinand: Fix OOB read"
    rtc: pcf2127: move watchdog initialisation to a separate function
    rtc: pcf2127: only use watchdog when explicitly available
    dt-bindings: rtc: add reset-source property
    kdev_t: always inline major/minor helper functions
    Bluetooth: Fix attempting to set RPA timeout when unsupported
    ALSA: hda/realtek - Modify Dell platform name
    ALSA: hda/hdmi: Fix incorrect mutex unlock in silent_stream_disable()
    drm/i915/tgl: Fix Combo PHY DPLL fractional divider for 38.4MHz ref clock
    scsi: ufs: Allow an error return value from ->device_reset()
    scsi: ufs: Re-enable WriteBooster after device reset
    RDMA/core: remove use of dma_virt_ops
    RDMA/siw,rxe: Make emulated devices virtual in the device tree
    fuse: fix bad inode
    perf: Break deadlock involving exec_update_mutex
    rwsem: Implement down_read_killable_nested
    rwsem: Implement down_read_interruptible
    exec: Transform exec_update_mutex into a rw_semaphore
    mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
    Linux 5.10.6

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Id4c57a151a1e8f2162163d2337b6055f04edbe9b

    Greg Kroah-Hartman
     

09 Jan, 2021

2 commits

  • [ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_interruptible. This is needed for perf_event_open to be
    converted (with no semantic changes) from working on a mutex to
    working on a rwsem.

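    A minimal usage sketch (the semaphore and function names here are
    illustrative, not from the patch): the new primitive sleeps for the
    lock but lets a signal interrupt the wait.

    #include <linux/rwsem.h>
    #include <linux/errno.h>

    static DECLARE_RWSEM(example_sem);

    static int example_read_section(void)
    {
            /* Sleep waiting for the lock, but allow signals to interrupt. */
            if (down_read_interruptible(&example_sem))
                    return -EINTR;

            /* ... read-side critical section ... */

            up_read(&example_sem);
            return 0;
    }
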
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_killable_nested. This is needed so that kcmp_lock
    can be converted from working on mutexes to working on rw_semaphores.

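    A hedged sketch of the kind of nested, killable read locking this
    enables, in the style kcmp_lock() needs; the helper name is
    illustrative.

    #include <linux/rwsem.h>
    #include <linux/lockdep.h>
    #include <linux/errno.h>

    static int example_lock_pair(struct rw_semaphore *a, struct rw_semaphore *b)
    {
            if (down_read_killable(a))
                    return -EINTR;
            /*
             * The second acquisition uses a lockdep subclass so that taking
             * two locks of the same class is not reported as a deadlock.
             */
            if (down_read_killable_nested(b, SINGLE_DEPTH_NESTING)) {
                    up_read(a);
                    return -EINTR;
            }
            return 0;
    }
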
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     

05 Jan, 2021

1 commit

  • The rwsem_waiter struct is needed in the vendor hook alter_rwsem_list_add.
    The hook takes a parameter sem, which is a struct rw_semaphore (already
    exported in rwsem.h); inside that structure there is a wait_list linking
    "struct rwsem_waiter" items. The task information in each item of the
    wait_list needs to be referenced from vendor loadable modules.

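    An illustrative sketch (not the actual vendor code) of what a module
    attached to the alter_rwsem_list_add hook might do with the exported
    struct rwsem_waiter; the hook signature and the RT-priority policy
    below are assumptions for illustration only.

    static void example_alter_rwsem_list_add(struct rwsem_waiter *waiter,
                                             struct rw_semaphore *sem,
                                             bool *already_on_list)
    {
            struct list_head *pos = &sem->wait_list;
            struct rwsem_waiter *w;

            /* Called with sem->wait_lock held; each waiter's task is visible. */
            list_for_each_entry(w, &sem->wait_list, list) {
                    if (w->task && rt_task(w->task))
                            continue;
                    pos = &w->list;         /* first non-RT waiter */
                    break;
            }
            /* Queue the new waiter behind any RT waiters at the head. */
            list_add_tail(&waiter->list, pos);
            *already_on_list = true;
    }
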
    Bug: 174902706
    Signed-off-by: Huang Yiwei
    Change-Id: Ic7d21ffdd795eaa203989751d26f8b1f32134d8b

    Huang Yiwei
     

19 Aug, 2020

1 commit

  • - Add the hook to get mutex/rwsem information that the tasks
    are waiting for.

    - Add the hook to print messages for sched_show_task.

    - ANDROID_VENDOR_DATA_ARRAY added to task_struct

    Bug: 162776704

    Signed-off-by: Sangmoon Kim
    Change-Id: Ib436fbd8d0ad509c3b5a73ea8f5170e0761a13fd
    (cherry picked from commit b519ac423787d38f467ca479d2126b7204d6f498)

    Sangmoon Kim
     

28 Jul, 2020

1 commit

  • - Add the hook to apply the vendor's performance tuning for the
    owner of the rwsem.

    - Add the hook for the waiter list of the rwsem to allow the vendor
    to perform waiting-queue enhancements

    - ANDROID_VENDOR_DATA added to rw_semaphore

    Bug: 161400830

    Signed-off-by: JianMin Liu
    Change-Id: I007a5e26f3db2adaeaf4e5ccea414ce7abfa83b8

    JianMin Liu
     

21 Mar, 2020

1 commit

  • Extend lockdep to validate lock wait-type context.

    The current wait-types are:

    LD_WAIT_FREE, /* wait free, rcu etc.. */
    LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
    LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
    LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */

    Where lockdep validates that the current lock (the one being acquired)
    fits in the current wait-context (as generated by the held stack).

    This ensures that there is no attempt to acquire mutexes while holding
    spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
    other words, it's a fancier might_sleep().

    Obviously RCU made the entire ordeal more complex than a simple single
    value test because RCU can be acquired in (pretty much) any context,
    and while it presents a context to nested locks, that is not the same
    as the context it was acquired in.

    Therefore it's necessary to split the wait_type into two values, one
    representing the acquire (outer) and one representing the nested context
    (inner). For most 'normal' locks these two are the same.

    [ To make static initialization easier we have the rule that:
    .outer == INV means .outer == .inner; because INV == 0. ]

    It further means that it's required to find the minimal .inner of the held
    stack to compare against the outer of the new lock; because while 'normal'
    RCU presents a CONFIG type to nested locks, if it is taken while already
    holding a SPIN type it obviously doesn't relax the rules.

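    A hedged sketch of the rule (not the lockdep implementation): with
    LD_WAIT_FREE < LD_WAIT_SPIN < LD_WAIT_CONFIG < LD_WAIT_SLEEP, the
    outer type of the lock being acquired must not be more relaxed than
    the strictest inner type on the held stack.

    static bool wait_context_ok(short curr_inner, short next_outer)
    {
            /* curr_inner is the minimal .inner over all currently held locks */
            return next_outer <= curr_inner;
    }
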
    Below is an example output generated by the trivial test code:

    raw_spin_lock(&foo);
    spin_lock(&bar);
    spin_unlock(&bar);
    raw_spin_unlock(&foo);

    [ BUG: Invalid wait context ]
    -----------------------------
    swapper/0/1 is trying to lock:
    ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
    other info that might help us debug this:
    1 lock held by swapper/0/1:
    #0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

    The way to read it is to look at the new -{n:m} part in the lock
    description; -{3:3} for the attempted lock, and try and match that up to
    the held locks, which in this case is the one: -{2:2}.

    This tells that the acquiring lock requires a more relaxed environment than
    presented by the lock stack.

    Currently only the normal locks and RCU are converted, the rest of the
    lockdep users defaults to .inner = INV which is ignored. More conversions
    can be done when desired.

    The check for spinlock_t nesting is not enabled by default. It's a separate
    config option for now as there are known problems which are currently
    being addressed. The config option allows these problems to be identified
    and the proposed solutions to be verified.

    The config switch will be removed and the checks will be permanently
    enabled once the vast majority of issues have been addressed.

    [ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
    failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
    [ tglx: Add the config option ]

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de

    Peter Zijlstra
     

11 Feb, 2020

3 commits

  • Remove the now unused RWSEM_OWNER_UNKNOWN hack. This hack breaks
    PREEMPT_RT and getting rid of it was the entire motivation for
    re-writing the percpu rwsem.

    The biggest problem is that it is fundamentally incompatible with any
    form of Priority Inheritance: any exclusively held lock must have a
    distinct owner.

    Requested-by: Christoph Hellwig
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Davidlohr Bueso
    Acked-by: Will Deacon
    Acked-by: Waiman Long
    Tested-by: Juri Lelli
    Link: https://lkml.kernel.org/r/20200204092228.GP14946@hirez.programming.kicks-ass.net

    Peter Zijlstra
     
  • The filesystem freezer uses percpu-rwsem in a way that is effectively
    write_non_owner() and achieves this with a few horrible hacks that
    rely on the rwsem (!percpu) implementation.

    When PREEMPT_RT replaces the rwsem implementation with a PI aware
    variant this comes apart.

    Remove the embedded rwsem and implement it using a waitqueue and an
    atomic_t.

    - make readers_block an atomic, and use it, with the waitqueue
    for a blocking test-and-set write-side.

    - have the read-side wait for the 'lock' state to clear.

    Have the waiters use FIFO queueing and mark them (reader/writer) with
    a new WQ_FLAG. Use a custom wake_function to wake either a single
    writer or all readers until a writer.

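    A minimal sketch, under illustrative names, of the two points above
    (blocking test-and-set on the write side, read side waiting for the
    'lock' state to clear); this is not the kernel's actual percpu-rwsem
    code.

    #include <linux/atomic.h>
    #include <linux/wait.h>

    static atomic_t write_locked = ATOMIC_INIT(0);  /* stands in for readers_block */
    static DECLARE_WAIT_QUEUE_HEAD(waiters);

    static void writer_block_readers(void)
    {
            /* blocking test-and-set of the write-side 'lock' state */
            wait_event(waiters, atomic_cmpxchg(&write_locked, 0, 1) == 0);
    }

    static void writer_unblock_readers(void)
    {
            atomic_set_release(&write_locked, 0);
            wake_up_all(&waiters);
    }

    static void reader_wait_for_clear(void)
    {
            /* read side simply waits for the 'lock' state to clear */
            wait_event(waiters, atomic_read_acquire(&write_locked) == 0);
    }
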
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Davidlohr Bueso
    Acked-by: Will Deacon
    Acked-by: Waiman Long
    Tested-by: Juri Lelli
    Link: https://lkml.kernel.org/r/20200204092403.GB14879@hirez.programming.kicks-ass.net

    Peter Zijlstra
     
  • As preparation for replacing the embedded rwsem, give percpu-rwsem its
    own lockdep_map.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Ingo Molnar
    Reviewed-by: Davidlohr Bueso
    Acked-by: Will Deacon
    Acked-by: Waiman Long
    Tested-by: Juri Lelli
    Link: https://lkml.kernel.org/r/20200131151539.927625541@infradead.org

    Peter Zijlstra
     

17 Jan, 2020

1 commit

  • The commit 91d2a812dfb9 ("locking/rwsem: Make handoff writer
    optimistically spin on owner") will allow a recently woken up waiting
    writer to spin on the owner. Unfortunately, if the owner happens to be
    RWSEM_OWNER_UNKNOWN, the code will incorrectly spin on it leading to a
    kernel crash. This is fixed by passing the proper non-spinnable bits
    to rwsem_spin_on_owner() so that RWSEM_OWNER_UNKNOWN will be treated
    as a non-spinnable target.

    Fixes: 91d2a812dfb9 ("locking/rwsem: Make handoff writer optimistically spin on owner")

    Reported-by: Christoph Hellwig
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Christoph Hellwig
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20200115154336.8679-1-longman@redhat.com

    Waiman Long
     

09 Oct, 2019

1 commit

  • Since the following commit:

    b4adfe8e05f1 ("locking/lockdep: Remove unused argument in __lock_release")

    @nested is no longer used in lock_release(), so remove it from all
    lock_release() calls and friends.

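    For illustration, a typical call site in the rwsem code changes
    roughly like this:

    /* Before: the @nested argument (here 1) was ignored. */
    rwsem_release(&sem->dep_map, 1, _RET_IP_);

    /* After: the argument is dropped from lock_release() and its wrappers. */
    rwsem_release(&sem->dep_map, _RET_IP_);
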
    Signed-off-by: Qian Cai
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Acked-by: Daniel Vetter
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: airlied@linux.ie
    Cc: akpm@linux-foundation.org
    Cc: alexander.levin@microsoft.com
    Cc: daniel@iogearbox.net
    Cc: davem@davemloft.net
    Cc: dri-devel@lists.freedesktop.org
    Cc: duyuyang@gmail.com
    Cc: gregkh@linuxfoundation.org
    Cc: hannes@cmpxchg.org
    Cc: intel-gfx@lists.freedesktop.org
    Cc: jack@suse.com
    Cc: jlbec@evilplan.or
    Cc: joonas.lahtinen@linux.intel.com
    Cc: joseph.qi@linux.alibaba.com
    Cc: jslaby@suse.com
    Cc: juri.lelli@redhat.com
    Cc: maarten.lankhorst@linux.intel.com
    Cc: mark@fasheh.com
    Cc: mhocko@kernel.org
    Cc: mripard@kernel.org
    Cc: ocfs2-devel@oss.oracle.com
    Cc: rodrigo.vivi@intel.com
    Cc: sean@poorly.run
    Cc: st@kernel.org
    Cc: tj@kernel.org
    Cc: tytso@mit.edu
    Cc: vdavydov.dev@gmail.com
    Cc: vincent.guittot@linaro.org
    Cc: viro@zeniv.linux.org.uk
    Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw
    Signed-off-by: Ingo Molnar

    Qian Cai
     

06 Aug, 2019

2 commits

  • Currently, the rwsem is the only locking primitive that lacks this
    debug feature. Add it under CONFIG_DEBUG_RWSEMS and do the magic
    checking in the locking fastpath (trylock) operation such that
    we cover all cases. The unlocking part is pretty straightforward.

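    A hedged sketch of the shape of the check (the real macro prints more
    state): init stores a self-pointing magic value, and the fast paths
    verify it.

    /* at init time, under CONFIG_DEBUG_RWSEMS */
    sem->magic = sem;

    /* in the trylock fast path and in the unlock paths */
    DEBUG_RWSEMS_WARN_ON(sem->magic != sem, sem);
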
    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Waiman Long
    Cc: mingo@kernel.org
    Cc: Davidlohr Bueso
    Link: https://lkml.kernel.org/r/20190729044735.9632-1-dave@stgolabs.net

    Davidlohr Bueso
     
  • When the handoff bit is set by a writer, no task other than that
    writer is allowed to acquire the lock. If the
    to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
    period where the lock is free, but no one can acquire it. That is less
    than ideal.

    To reduce that latency, the handoff writer will now optimistically spin
    on the owner if it happens to be an on-CPU writer. It will spin until
    the owner releases the lock, and the to-be-handoff'ed writer can then
    acquire the lock immediately without any delay. Of course, if the owner
    is not an on-CPU writer, the to-be-handoff'ed writer will have to sleep
    anyway.

    The optimistic spinning code is also modified to not stop spinning
    when the handoff bit is set. This prevents an occasional setting of the
    handoff bit from driving a batch of optimistic spinners into the wait
    queue, which would significantly reduce throughput.

    On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
    workload was run with 7000 users. The throughput (jobs/min) of the
    following kernels were as follows:

    1) 5.2-rc6                                    - 8,092,486
    2) 5.2-rc6 + tip's rwsem patches              - 7,567,568
    3) 5.2-rc6 + tip's rwsem patches + this patch - 7,954,545

    Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
    rwsem_down_write_failed() and their callees for the 3 kernels were 1.70%,
    5.46% and 2.08% respectively.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: x86@kernel.org
    Cc: Ingo Molnar
    Cc: Will Deacon
    Cc: huang ying
    Cc: Tim Chen
    Cc: Linus Torvalds
    Cc: Borislav Petkov
    Cc: Thomas Gleixner
    Cc: Davidlohr Bueso
    Cc: "H. Peter Anvin"
    Link: https://lkml.kernel.org/r/20190625143913.24154-1-longman@redhat.com

    Waiman Long
     

25 Jul, 2019

4 commits

  • Since we just reviewed read_slowpath for ACQUIRE correctness, add a
    few comments to retain our findings.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • While reviewing another read_slowpath patch, both Will and I noticed
    another missing ACQUIRE, namely:

    X = 0;

    CPU0                                 CPU1

    rwsem_down_read()
      for (;;) {
        set_current_state(TASK_UNINTERRUPTIBLE);

                                         X = 1;
                                         rwsem_up_write();
                                           rwsem_mark_wake()
                                             atomic_long_add(adjustment, &sem->count);
                                             smp_store_release(&waiter->task, NULL);

        if (!waiter.task)
          break;

        ...
      }

    r = X;

    Allows 'r == 0'.

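    A sketch of the shape of the fix, for illustration: the waiter->task
    load in the sleep loop becomes an ACQUIRE so that it pairs with the
    smp_store_release() in rwsem_mark_wake().

    for (;;) {
            set_current_state(TASK_UNINTERRUPTIBLE);
            if (!smp_load_acquire(&waiter.task)) {
                    /* Matches rwsem_mark_wake()'s smp_store_release(). */
                    break;
            }
            schedule();
    }
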
    Reported-by: Peter Zijlstra (Intel)
    Reported-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • LTP mtest06 has been observed to occasionally hit "still mapped when
    deleted" and following BUG_ON on arm64.

    The extra mapcount originated from pagefault handler, which handled
    pagefault for vma that has already been detached. vma is detached
    under mmap_sem write lock by detach_vmas_to_be_unmapped(), which
    also invalidates vmacache.

    When the pagefault handler (under mmap_sem read lock) calls
    find_vma(), vmacache_valid() wrongly reports vmacache as valid.

    After rwsem down_read() returns via 'queue empty' path (as of v5.2),
    it does so without an ACQUIRE on sem->count:

    down_read()
      __down_read()
        rwsem_down_read_failed()
          __rwsem_down_read_failed_common()
            raw_spin_lock_irq(&sem->wait_lock);
            if (list_empty(&sem->wait_list)) {
              if (atomic_long_read(&sem->count) >= 0) {
                raw_spin_unlock_irq(&sem->wait_lock);
                return sem;

    The problem can be reproduced by running LTP mtest06 in a loop and
    building the kernel (-j $NCPUS) in parallel. It reproduces since
    v4.20 on arm64 HPE Apollo 70 (224 CPUs, 256GB RAM, 2 nodes). It
    triggers reliably in about an hour.

    The patched kernel ran fine for 10+ hours.

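    A hedged sketch of the shape of the fix (surrounding code abridged):
    an acquire barrier is added on the early-exit path so that memory
    accesses after the lock is considered acquired are properly ordered.

    raw_spin_lock_irq(&sem->wait_lock);
    if (list_empty(&sem->wait_list)) {
            if (atomic_long_read(&sem->count) >= 0) {
                    /* Provide lock ACQUIRE */
                    smp_acquire__after_ctrl_dep();
                    raw_spin_unlock_irq(&sem->wait_lock);
                    return sem;
            }
    }
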
    Signed-off-by: Jan Stancek
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Will Deacon
    Acked-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dbueso@suse.de
    Fixes: 4b486b535c33 ("locking/rwsem: Exit read lock slowpath if queue empty & no writer")
    Link: https://lkml.kernel.org/r/50b8914e20d1d62bb2dee42d342836c2c16ebee7.1563438048.git.jstancek@redhat.com
    Signed-off-by: Ingo Molnar

    Jan Stancek
     
  • For writer, the owner value is cleared on unlock. For reader, it is
    left intact on unlock for providing better debugging aid on crash dump
    and the unlock of one reader may not mean the lock is free.

    As a result, owner_on_cpu() shouldn't be used on a read-owner, as the
    task pointer value may not be valid and the task might have been
    freed. rwsem_spin_on_owner() already avoids this, but
    rwsem_can_spin_on_owner() does not, which can lead to a use-after-free
    error reported by KASAN. For example,

    BUG: KASAN: use-after-free in rwsem_down_write_slowpath
    (/home/miguel/kernel/linux/kernel/locking/rwsem.c:669
    /home/miguel/kernel/linux/kernel/locking/rwsem.c:1125)

    Fix this by checking for RWSEM_READER_OWNED flag before calling
    owner_on_cpu().

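    A sketch of the check described, abridged from rwsem_can_spin_on_owner():
    the owner task pointer is only dereferenced via owner_on_cpu() when the
    rwsem is not reader-owned.

    struct task_struct *owner;
    unsigned long flags;
    bool ret = true;

    owner = rwsem_owner_flags(sem, &flags);
    /* Don't check a read-owner: its task pointer may be stale. */
    if (owner && !(flags & RWSEM_READER_OWNED) && !owner_on_cpu(owner))
            ret = false;
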
    Reported-by: Luis Henriques
    Tested-by: Luis Henriques
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Jeff Layton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Fixes: 94a9717b3c40e ("locking/rwsem: Make rwsem->owner an atomic_long_t")
    Link: https://lkml.kernel.org/r/81e82d5b-5074-77e8-7204-28479bbe0df0@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

17 Jun, 2019

13 commits

  • The upper bits of the count field are used as the reader count. When a
    sufficient number of active readers are present, the most significant
    bit will be set and the count becomes negative. If the number of active
    readers keeps piling up, we may eventually overflow the reader count.
    This is not likely to happen unless the number of bits reserved for the
    reader count is reduced because those bits are needed for other purposes.

    To prevent this count overflow from happening, the most significant
    bit is now treated as a guard bit (RWSEM_FLAG_READFAIL). Read-lock
    attempts will now fail for both the fast and slow paths whenever this
    bit is set. So all those extra readers will be put to sleep in the wait
    list. Wakeup will not happen until the reader count reaches 0.

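    A hedged sketch of the count layout being described (64-bit case; the
    exact bit assignments follow this series):

    #define RWSEM_WRITER_LOCKED     (1UL << 0)
    #define RWSEM_FLAG_WAITERS      (1UL << 1)
    #define RWSEM_FLAG_HANDOFF      (1UL << 2)
    #define RWSEM_FLAG_READFAIL     (1UL << (BITS_PER_LONG - 1))   /* guard bit */

    #define RWSEM_READER_SHIFT      8
    #define RWSEM_READER_BIAS       (1UL << RWSEM_READER_SHIFT)

    /*
     * A reader fast path adds RWSEM_READER_BIAS; if the result has
     * RWSEM_FLAG_READFAIL set, it must back out and take the slowpath.
     */
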
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-17-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Reader optimistic spinning is helpful when the reader critical section
    is short and there aren't that many readers around. It makes readers
    relatively more preferred than writers. When a writer times out spinning
    on a reader-owned lock and sets the nonspinnable bits, there are two
    main reasons for that.

    1) The reader critical section is long, perhaps the task sleeps after
    acquiring the read lock.
    2) There are just too many readers contending the lock causing it to
    take a while to service all of them.

    In the former case, a long reader critical section will impede the
    progress of writers, which is usually more important for system
    performance. In the latter case, reader optimistic spinning tends to
    shrink the groups of readers that acquire the lock together, leading to
    more such groups. That may hurt performance in some cases. In other
    words, the setting of the nonspinnable bits indicates that reader
    optimistic spinning may not be helpful for those workloads that cause it.

    Therefore, any writers that have observed the setting of the writer
    nonspinnable bit for a given rwsem after they fail to acquire the lock
    via optimistic spinning will set the reader nonspinnable bit once they
    acquire the write lock. Similarly, readers that observe the setting
    of reader nonspinnable bit at slowpath entry will also set the reader
    nonspinnable bit when they acquire the read lock via the wakeup path.

    Once the reader nonspinnable bit is on, it will only be reset when
    a writer is able to acquire the rwsem in the fast path or somehow a
    reader or writer in the slowpath doesn't observe the nonspinnable bit.

    This is to discourage reader optimistic spinning on that particular
    rwsem and make writers more preferred. This adaptive disabling of reader
    optimistic spinning will alleviate some of the negative side effect of
    this feature.

    In addition, this patch tries to make readers in the spinning queue
    follow the phase-fair principle after quitting optimistic spinning
    by checking if another reader has somehow acquired a read lock after
    this reader enters the optimistic spinning queue. If so and the rwsem
    is still reader-owned, this reader is in the right read-phase and can
    attempt to acquire the lock.

    On a 2-socket 40-core 80-thread Skylake system, the page_fault1 test of
    the will-it-scale benchmark was run with various number of threads. The
    number of operations done before reader optimistic spinning patches,
    this patch and after this patch were:

    Threads   Before rspin   Before patch   After patch        %change
    -------   ------------   ------------   -----------        -------
       20        5541068        5345484        5455667    -3.5%/ +2.1%
       40       10185150        7292313        9219276   -28.5%/+26.4%
       60        8196733        6460517        7181209   -21.2%/+11.2%
       80        9508864        6739559        8107025   -29.1%/+20.3%

    This patch doesn't recover all the lost performance, but it is more
    than half. Given the fact that reader optimistic spinning does benefit
    some workloads, this is a good compromise.

    Using the rwsem locking microbenchmark with very short critical section,
    this patch doesn't have too much impact on locking performance as shown
    by the locking rates (kops/s) below with equal numbers of readers and
    writers before and after this patch:

    # of Threads Pre-patch Post-patch
    ------------ --------- ----------
    2 4,730 4,969
    4 4,814 4,786
    8 4,866 4,815
    16 4,715 4,511
    32 3,338 3,500
    64 3,212 3,389
    80 3,110 3,044

    When running the locking microbenchmark with 40 dedicated reader and writer
    threads, however, the reader performance is curtailed to favor the writer.

    Before patch:

    40 readers, Iterations Min/Mean/Max = 204,026/234,309/254,816
    40 writers, Iterations Min/Mean/Max = 88,515/95,884/115,644

    After patch:

    40 readers, Iterations Min/Mean/Max = 33,813/35,260/36,791
    40 writers, Iterations Min/Mean/Max = 95,368/96,565/97,798

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-16-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When the rwsem is owned by reader, writers stop optimistic spinning
    simply because there is no easy way to figure out if all the readers
    are actively running or not. However, there are scenarios where
    the readers are unlikely to sleep and optimistic spinning can help
    performance.

    This patch provides a simple mechanism for spinning on a reader-owned
    rwsem by a writer. It is a time threshold based spinning where the
    allowable spinning time can vary from 10us to 25us depending on the
    condition of the rwsem.

    When the time threshold is exceeded, the nonspinnable bits will be set
    in the owner field to indicate that no more optimistic spinning will
    be allowed on this rwsem until it becomes writer owned again. Not even
    readers are allowed to acquire the reader-locked rwsem via optimistic
    spinning, for fairness.

    We also want a writer to acquire the lock after the readers hold the
    lock for a relatively long time. In order to give preference to writers
    under such a circumstance, the single RWSEM_NONSPINNABLE bit is now split
    into two - one for reader and one for writer. When optimistic spinning
    is disabled, both bits will be set. When the reader count drops down
    to 0, the writer nonspinnable bit will be cleared to allow writers to
    spin on the lock, but not the readers. When a writer acquires the lock,
    it will write its own task structure pointer into sem->owner and clear
    the reader nonspinnable bit in the process.

    The time taken for each iteration of the reader-owned rwsem spinning
    loop varies. Below are sample minimum elapsed times for 16 iterations
    of the loop.

    System Time for 16 Iterations
    ------ ----------------------
    1-socket Skylake ~800ns
    4-socket Broadwell ~300ns
    2-socket ThunderX2 (arm64) ~250ns

    When the lock cacheline is contended, we can see up to almost 10X
    increase in elapsed time. So 25us will be at most 500, 1300 and 1600
    iterations for each of the above systems.

    With a locking microbenchmark running on 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
    equal numbers of readers and writers before and after this patch were
    as follows:

    # of Threads Pre-patch Post-patch
    ------------ --------- ----------
    2 1,759 6,684
    4 1,684 6,738
    8 1,074 7,222
    16 900 7,163
    32 458 7,316
    64 208 520
    128 168 425
    240 143 474

    This patch gives a big boost in performance for mixed reader/writer
    workloads.

    With 32 locking threads, the rwsem lock event data were:

    rwsem_opt_fail=79850
    rwsem_opt_nospin=5069
    rwsem_opt_rlock=597484
    rwsem_opt_wlock=957339
    rwsem_sleep_reader=57782
    rwsem_sleep_writer=55663

    With 64 locking threads, the data looked like:

    rwsem_opt_fail=346723
    rwsem_opt_nospin=6293
    rwsem_opt_rlock=1127119
    rwsem_opt_wlock=1400628
    rwsem_sleep_reader=308201
    rwsem_sleep_writer=72281

    So a lot more threads acquired the lock in the slowpath and more threads
    went to sleep.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-15-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • The rwsem->owner contains not just the task structure pointer, it also
    holds some flags for storing the current state of the rwsem. Some of
    the flags may have to be atomically updated. To reflect the new reality,
    the owner is now changed to an atomic_long_t type.

    New helper functions are added to properly separate out the task
    structure pointer and the embedded flags.

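    A hedged sketch of the helper style described (names follow the rwsem
    code; the flag set is abridged), assuming sem->owner is now an
    atomic_long_t:

    #define RWSEM_READER_OWNED      (1UL << 0)
    #define RWSEM_NONSPINNABLE      (1UL << 1)
    #define RWSEM_OWNER_FLAGS_MASK  (RWSEM_READER_OWNED | RWSEM_NONSPINNABLE)

    static inline struct task_struct *rwsem_owner(struct rw_semaphore *sem)
    {
            return (struct task_struct *)
                    (atomic_long_read(&sem->owner) & ~RWSEM_OWNER_FLAGS_MASK);
    }

    static inline struct task_struct *
    rwsem_owner_flags(struct rw_semaphore *sem, unsigned long *pflags)
    {
            unsigned long owner = atomic_long_read(&sem->owner);

            *pflags = owner & RWSEM_OWNER_FLAGS_MASK;
            return (struct task_struct *)(owner & ~RWSEM_OWNER_FLAGS_MASK);
    }
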
    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-14-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • This patch enables readers to optimistically spin on a
    rwsem when it is owned by a writer instead of going to sleep
    directly. The rwsem_can_spin_on_owner() function is extracted
    out of rwsem_optimistic_spin() and is called directly by
    rwsem_down_read_slowpath() and rwsem_down_write_slowpath().

    With a locking microbenchmark running on 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with equal
    numbers of readers and writers before and after the patch were as
    follows:

    # of Threads Pre-patch Post-patch
    ------------ --------- ----------
    4 1,674 1,684
    8 1,062 1,074
    16 924 900
    32 300 458
    64 195 208
    128 164 168
    240 149 143

    The performance change wasn't significant in this case, but this change
    is required by a follow-on patch.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-13-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Bit 1 of sem->owner (RWSEM_ANONYMOUSLY_OWNED) is used to designate an
    anonymous owner - readers or an anonymous writer. The setting of this
    anonymous bit is used as an indicator that optimistic spinning cannot
    be done on this rwsem.

    With the upcoming reader optimistic spinning patches, a reader-owned
    rwsem can be spun on for a limited period of time. We still need
    this bit to indicate that a rwsem is nonspinnable, but a cleared bit
    no longer means that the owner is known. So rename the bit
    to RWSEM_NONSPINNABLE to clarify its meaning.

    This patch also fixes a DEBUG_RWSEMS_WARN_ON() bug in __up_write().

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-12-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When the front of the wait queue is a reader, other readers
    immediately following the first reader will also be woken up at the
    same time. However, if there is a writer in between, the readers
    behind the writer will not be woken up.

    Because of optimistic spinning, the lock acquisition order is not FIFO
    anyway. The lock handoff mechanism will ensure that lock starvation
    will not happen.

    Assuming that the lock hold times of the other readers still in the
    queue will be about the same as the readers that are being woken up,
    there is really not much additional cost other than the additional
    latency due to the wakeup of additional tasks by the waker. Therefore
    all the readers up to a maximum of 256 in the queue are woken up when
    the first waiter is a reader to improve reader throughput. This is
    somewhat similar in concept to a phase-fair R/W lock.

    With a locking microbenchmark running on 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
    equal numbers of readers and writers before and after this patch were
    as follows:

    # of Threads Pre-Patch Post-patch
    ------------ --------- ----------
    4 1,641 1,674
    8 731 1,062
    16 564 924
    32 78 300
    64 38 195
    240 50 149

    There is no performance gain at low contention level. At high contention
    level, however, this patch gives a pretty decent performance boost.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-11-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • An RT task can do optimistic spinning only if the lock holder is
    actually running. If the state of the lock holder isn't known, there
    is a possibility that the high priority of the RT task may block the
    forward progress of the lock holder if they happen to reside on the
    same CPU, leading to deadlock. So we have to make sure that an RT task
    will not spin on a reader-owned rwsem.

    When the owner is temporarily set to NULL, there are two cases
    where we may want to continue spinning:

    1) The lock owner is in the process of releasing the lock, sem->owner
    is cleared but the lock has not been released yet.

    2) The lock was free and the owner cleared, but another task just comes
    in and acquires the lock before we try to get it. The new owner may
    be a spinnable writer.

    So an RT task is now made to retry one more time to see if it can
    acquire the lock or continue spinning on the new owning writer.

    When testing on an 8-socket IvyBridge-EX system, the one additional retry
    seems to improve locking performance of RT write locking threads under
    heavy contentions. The table below shows the locking rates (in kops/s)
    with various write locking threads before and after the patch.

    Locking threads Pre-patch Post-patch
    --------------- --------- -----------
    4 2,753 2,608
    8 2,529 2,520
    16 1,727 1,918
    32 1,263 1,956
    64 889 1,343

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-10-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • With the use of wake_q, we can do task wakeups without holding the
    wait_lock. There is one exception in the rwsem code, though. It is
    when the writer in the slowpath detects that there are waiters ahead
    but the rwsem is not held by a writer. This can lead to a long wait_lock
    hold time especially when a large number of readers are to be woken up.

    Remediate this situation by releasing the wait_lock before waking
    up tasks and re-acquiring it afterward. The rwsem_try_write_lock()
    function is also modified to read the rwsem count directly to avoid
    a stale count value.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-9-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Because of writer lock stealing, it is possible that a constant
    stream of incoming writers will cause a waiting writer or reader to
    wait indefinitely leading to lock starvation.

    This patch implements a lock handoff mechanism to disable lock stealing
    and force lock handoff to the first waiter or waiters (for readers)
    in the queue after at least a 4ms waiting period unless it is an RT
    writer task which doesn't need to wait. The waiting period is used to
    avoid discouraging lock stealing so much that it hurts performance.

    The setting and clearing of the handoff bit is serialized by the
    wait_lock. So racing is not possible.

    A rwsem microbenchmark was run for 5 seconds on a 2-socket 40-core
    80-thread Skylake system with a v5.1 based kernel and 240 write_lock
    threads with 5us sleep critical section.

    Before the patch, the min/mean/max numbers of locking operations for
    the locking threads were 1/7,792/173,696. After the patch, the figures
    became 5,842/6,542/7,458. It can be seen that the rwsem became much
    fairer, though there was a drop of about 16% in the mean number of
    locking operations done, the tradeoff for better fairness.

    Making the waiter set the handoff bit right after the first wakeup can
    impact performance especially with a mixed reader/writer workload. With
    the same microbenchmark with short critical section and equal number of
    reader and writer threads (40/40), the reader/writer locking operation
    counts with the current patch were:

    40 readers, Iterations Min/Mean/Max = 1,793/1,794/1,796
    40 writers, Iterations Min/Mean/Max = 1,793/34,956/86,081

    By making waiter set handoff bit immediately after wakeup:

    40 readers, Iterations Min/Mean/Max = 43/44/46
    40 writers, Iterations Min/Mean/Max = 43/1,263/3,191

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-8-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • This patch modifies rwsem_spin_on_owner() to return four possible
    values to better reflect the state of the lock holder, which enables us
    to make a better decision about what to do next.

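    A sketch of the four states, expressed as bit flags so callers can
    test groups of states at once:

    enum owner_state {
            OWNER_NULL              = 1 << 0,  /* owner is NULL */
            OWNER_WRITER            = 1 << 1,  /* held by a running writer */
            OWNER_READER            = 1 << 2,  /* held by readers */
            OWNER_NONSPINNABLE      = 1 << 3,  /* spinning should stop */
    };
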
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-7-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • After merging all the relevant rwsem code into one single file, there
    are a number of optimizations and cleanups that can be done:

    1) Remove all the EXPORT_SYMBOL() calls for functions that are not
    accessed elsewhere.
    2) Remove all the __visible tags as none of the functions will be
    called from assembly code anymore.
    3) Make all the internal functions static.
    4) Remove some unneeded blank lines.
    5) Remove the intermediate rwsem_down_{read|write}_failed*() functions
    and rename __rwsem_down_{read|write}_failed_common() to
    rwsem_down_{read|write}_slowpath().
    6) Remove "__" prefix of __rwsem_mark_wake().
    7) Use atomic_long_try_cmpxchg_acquire() as much as possible.
    8) Remove the rwsem_rtrylock and rwsem_wtrylock lock events as they
    are not that useful.

    That enables the compiler to do better optimization and reduce code
    size. The text+data size of rwsem.o on an x86-64 machine with gcc8 was
    reduced from 10237 bytes to 5030 bytes with this change.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-6-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Now we only have one implementation of rwsem. Even though we still use
    xadd to handle reader locking, we use cmpxchg for writers instead. So
    the filename rwsem-xadd.c is not strictly correct. Also, no one outside
    of the rwsem code needs to know the internal implementation other than
    the function prototypes for the two internal functions that are called
    directly from percpu-rwsem.c.

    So the rwsem-xadd.c and rwsem.h files are now merged into rwsem.c in
    the following order:




    The rwsem.h file now contains only 2 function declarations for
    __up_read() and __down_read().

    This is a code relocation patch with no code change at all except
    making __up_read() and __down_read() non-static functions so they
    can be used by percpu-rwsem.c.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-5-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

10 Apr, 2019

3 commits

  • Currently, the DEBUG_RWSEMS_WARN_ON() macro just dumps a stack trace
    when the rwsem isn't in the right state. It does not show the actual
    states of the rwsem. This may not be that helpful in the debugging
    process.

    Enhance the DEBUG_RWSEMS_WARN_ON() macro to also show the current
    content of the rwsem count and owner fields to give more information
    about what is wrong with the rwsem. The debug_locks_off() function is
    called as is done inside DEBUG_LOCKS_WARN_ON().

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-7-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When rwsem_down_read_failed*() return, the read lock is acquired
    indirectly by others. So debug checks are added in __down_read() and
    __down_read_killable() to make sure the rwsem is really reader-owned.

    The other debug check calls in kernel/locking/rwsem.c except the
    one in up_read_non_owner() are also moved over to rwsem-xadd.h.

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-6-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Move all the owner setting code closer to the rwsem-xadd fast paths
    directly within rwsem.h file as well as in the slowpaths where owner
    setting is done after acquiring the lock. This will enable us to add
    DEBUG_RWSEMS check in a later patch to make sure that read lock is
    really acquired when rwsem_down_read_failed() returns, for instance.

    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20190404174320.22416-3-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

10 Sep, 2018

1 commit

  • Currently, when a reader acquires a lock, it only sets the
    RWSEM_READER_OWNED bit in the owner field. The other bits are simply
    not used. When debugging hanging cases involving rwsems and readers,
    the owner value does not provide much useful information at all.

    This patch modifies the current behavior to always store the task_struct
    pointer of the last rwsem-acquiring reader in a reader-owned rwsem. This
    may be useful in debugging rwsem hanging cases especially if only one
    reader is involved. However, the task in the owner field may not be the
    real owner or one of the real owners at all when the owner value is
    examined, for example, in a crash dump. So it is just an additional
    hint about the past history.

    If CONFIG_DEBUG_RWSEMS=y is enabled, the owner field will be checked at
    unlock time too to make sure the task pointer value is valid. That does
    have a slight performance cost and so is only enabled as part of that
    debug option.

    From the performance point of view, it is expected that the changes
    shouldn't have any noticeable performance impact. A rwsem microbenchmark
    (with 48 worker threads and 1:1 reader/writer ratio) was run on a
    2-socket 24-core 48-thread Haswell system. The locking rates on a
    4.19-rc1 based kernel were as follows:

    1) Unpatched kernel: 543.3 kops/s
    2) Patched kernel: 549.2 kops/s
    3) Patched kernel (CONFIG_DEBUG_RWSEMS on): 546.6 kops/s

    There was actually a slight increase in performance (1.1%) in this
    particular case. Maybe it was caused by the elimination of a branch or
    just testing noise. Turning on the CONFIG_DEBUG_RWSEMS option also
    had less than the expected impact on performance.

    The least significant 2 bits of the owner value are now used to designate
    that the rwsem is reader-owned and that the owners are anonymous.

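    Sketched from the change described, the reader lock paths now record
    the acquiring task while keeping those two low bits set:

    static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
                                                struct task_struct *owner)
    {
            unsigned long val = (unsigned long)owner | RWSEM_READER_OWNED
                                                     | RWSEM_ANONYMOUSLY_OWNED;

            WRITE_ONCE(sem->owner, (struct task_struct *)val);
    }

    static inline void rwsem_set_reader_owned(struct rw_semaphore *sem)
    {
            __rwsem_set_reader_owned(sem, current);
    }
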
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1536265114-10842-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

20 Jun, 2018

1 commit

  • It was found that the use of up_read_non_owner() in NFS was causing
    the following warning when DEBUG_RWSEMS was configured.

    DEBUG_LOCKS_WARN_ON(sem->owner != ((struct task_struct *)(1UL << 0)))

    Looking into the rwsem.c file, it was discovered that the corresponding
    down_read_non_owner() function was not setting the owner field properly.
    This is fixed now, and the warning should be gone.

    Fixes: 5149cbac4235 ("locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches")
    Signed-off-by: Waiman Long
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Gavin Schenk
    Cc: Davidlohr Bueso
    Cc: Dan Williams
    Cc: Arnd Bergmann
    Cc: linux-nfs@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/1527168398-4291-1-git-send-email-longman@redhat.com

    Waiman Long
     

16 May, 2018

1 commit

  • There are use cases where a rwsem can be acquired by one task, but
    released by another task. In these cases, optimistic spinning may need
    to be disabled. One example will be the filesystem freeze/thaw code
    where the task that freezes the filesystem will acquire a write lock
    on a rwsem and then un-owns it before returning to userspace. Later on,
    another task will come along, acquire the ownership, thaw the filesystem
    and release the rwsem.

    Bit 0 of the owner field was used to designate that it is a reader
    owned rwsem. It is now repurposed to mean that the owner of the rwsem
    is not known. If only bit 0 is set, the rwsem is reader owned. If bit
    0 and other bits are set, it is writer owned with an unknown owner.
    One such value for the latter case is (-1L). So we can set owner to 1 for
    reader-owned, -1 for writer-owned. The owner is unknown in both cases.

    To handle transfer of rwsem ownership, the higher level code should
    set the owner field to -1 to indicate a write-locked rwsem with unknown
    owner. Optimistic spinning will be disabled in this case.

    Once the higher level code figures who the new owner is, it can then
    set the owner field accordingly.

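    A hedged sketch of the owner-value convention described above:

    #define RWSEM_ANONYMOUSLY_OWNED (1UL << 0)
    #define RWSEM_READER_OWNED      ((struct task_struct *)RWSEM_ANONYMOUSLY_OWNED)

    /* Write-locked, owner not known; optimistic spinning is disabled. */
    #define RWSEM_OWNER_UNKNOWN     ((struct task_struct *)-1L)
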
    Tested-by: Amir Goldstein
    Signed-off-by: Waiman Long
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Jan Kara
    Cc: Linus Torvalds
    Cc: Matthew Wilcox
    Cc: Oleg Nesterov
    Cc: Paul E. McKenney
    Cc: Theodore Y. Ts'o
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-fsdevel@vger.kernel.org
    Link: http://lkml.kernel.org/r/1526420991-21213-2-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

31 Mar, 2018

1 commit

  • For a rwsem, locking can either be exclusive or shared. The corresponding
    exclusive or shared unlock must be used. Otherwise, the protected data
    structures may get corrupted or the lock may be in an inconsistent state.

    In order to detect such an anomaly, a new configuration option DEBUG_RWSEMS
    is added which can be enabled to look for such mismatches and print
    warnings when that happens.

    Signed-off-by: Waiman Long
    Acked-by: Davidlohr Bueso
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    <5 lines).

    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman