Eric Lee / smarc-fsl-linux-kernel

09 Jul, 2019

1 commit

e19283286 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull locking updates from Ingo Molnar:
"The main changes in this cycle are:

- rwsem scalability improvements, phase #2, by Waiman Long, which are
rather impressive:

"On a 2-socket 40-core 80-thread Skylake system with 40 reader
and writer locking threads, the min/mean/max locking operations
done in a 5-second testing window before the patchset were:

40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

After the patchset, they became:

40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

There's a lot of changes to the locking implementation that makes
it similar to qrwlock, including owner handoff for more fair
locking.

Another microbenchmark shows how across the spectrum the
improvements are:

"With a locking microbenchmark running on 5.1 based kernel, the
total locking rates (in kops/s) on a 2-socket Skylake system
with equal numbers of readers and writers (mixed) before and
after this patchset were:

# of Threads Before Patch After Patch
------------ ------------ -----------
2 2,618 4,193
4 1,202 3,726
8 802 3,622
16 729 3,359
32 319 2,826
64 102 2,744"

The changes are extensive and the patch-set has been through
several iterations addressing various locking workloads. There
might be more regressions, but unless they are pathological I
believe we want to use this new implementation as the baseline
going forward.

- jump-label optimizations by Daniel Bristot de Oliveira: the primary
motivation was to remove IPI disturbance of isolated RT-workload
CPUs, which resulted in the implementation of batched jump-label
updates. Beyond the improvement of the real-time characteristics
kernel, in one test this patchset improved static key update
overhead from 57 msecs to just 1.4 msecs - which is a nice speedup
as well.

- atomic64_t cross-arch type cleanups by Mark Rutland: over the last
~10 years of atomic64_t existence the various types used by the
APIs only had to be self-consistent within each architecture -
which means they became wildly inconsistent across architectures.
Mark puts and end to this by reworking all the atomic64
implementations to use 's64' as the base type for atomic64_t, and
to ensure that this type is consistently used for parameters and
return values in the API, avoiding further problems in this area.

- A large set of small improvements to lockdep by Yuyang Du: type
cleanups, output cleanups, function return type and othr cleanups
all around the place.

- A set of percpu ops cleanups and fixes by Peter Zijlstra.

- Misc other changes - please see the Git log for more details"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
locking/lockdep: increase size of counters for lockdep statistics
locking/atomics: Use sed(1) instead of non-standard head(1) option
locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
x86/jump_label: Make tp_vec_nr static
x86/percpu: Optimize raw_cpu_xchg()
x86/percpu, sched/fair: Avoid local_clock()
x86/percpu, x86/irq: Relax {set,get}_irq_regs()
x86/percpu: Relax smp_processor_id()
x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
locking/rwsem: Guard against making count negative
locking/rwsem: Adaptive disabling of reader optimistic spinning
locking/rwsem: Enable time-based spinning on reader-owned rwsem
locking/rwsem: Make rwsem->owner an atomic_long_t
locking/rwsem: Enable readers spinning on writer
locking/rwsem: Clarify usage of owner's nonspinaable bit
locking/rwsem: Wake up almost all readers in wait queue
locking/rwsem: More optimal RT task handling of null owner
locking/rwsem: Always release wait_lock before waking up tasks
locking/rwsem: Implement lock handoff to prevent lock starvation
locking/rwsem: Make rwsem_spin_on_owner() return owner state
...

Linus Torvalds
2019-07-09 07:12:03 +0800

20 Jun, 2019

1 commit

2966f8d44 Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic() ... Browse Code »

The description of smp_mb__before_atomic() and smp_mb__after_atomic()
in Documentation/atomic_t.txt is slightly terse and misleading. It
does not clearly state which other instructions are ordered by these
barriers.

This improves the text to make the actual ordering implications clear,
and also to explain how these barriers differ from a RELEASE or
ACQUIRE ordering.

Signed-off-by: Alan Stern
Cc: Jonathan Corbet
Cc: Peter Zijlstra
Acked-by: Andrea Parri
Signed-off-by: Paul E. McKenney

Alan Stern
2019-06-20 00:31:53 +0800

17 Jun, 2019

1 commit

69d927bba x86/atomic: Fix smp_mb__{before,after}_atomic() ... Browse Code »

Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
fail for:

*x = 1;
atomic_inc(u);
smp_mb__after_atomic();
r0 = *y;

Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:

atomic_inc(u);
*x = 1;
smp_mb__after_atomic();
r0 = *y;

Which the CPU is then allowed to re-order (under TSO rules) like:

atomic_inc(u);
r0 = *y;
*x = 1;

And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.

NOTE: atomic_{or,and,xor} and the bitops already had the compiler
barrier.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Peter Zijlstra
2019-06-17 18:09:59 +0800

03 Jun, 2019

1 commit

fff9b6c7d Documentation/atomic_t.txt: Clarify pure non-rmw usage ... Browse Code »

Clarify that pure non-RMW usage of atomic_t is pointless, there is
nothing 'magical' about atomic_set() / atomic_read().

This is something that seems to confuse people, because I happen upon it
semi-regularly.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Greg Kroah-Hartman
Acked-by: Will Deacon
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: https://lkml.kernel.org/r/20190524115231.GN2623@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2019-06-03 18:32:57 +0800

19 Mar, 2019

1 commit

f1887143f Documentation/atomic_t: Clarify signed vs unsigned ... Browse Code »

Clarify the whole signed vs unsigned issue for atomic_t.

There has been enough confusion on this topic to warrant a few explicit
words I feel.

Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Acked-by: Boqun Feng
Signed-off-by: Paul E. McKenney

Peter Zijlstra
2019-03-19 01:27:52 +0800

25 Aug, 2017

1 commit

ca110694c Documentation/locking/atomic: Finish the document... ... Browse Code »

Julia reported that the document looked unfinished, and it is. I
forgot to include the example cooked up by Paul here:

https://lkml.kernel.org/r/20170731174345.GL3730@linux.vnet.ibm.com

and I added an explicit example showing how, while it is an ACQUIRE
pattern, it really does provide an MB.

Reported-by: Julia Cartwright
Signed-off-by: Peter Zijlstra (Intel)
Cc: Boqun Feng
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Signed-off-by: Ingo Molnar

Peter Zijlstra
2017-08-25 17:06:33 +0800

10 Aug, 2017

1 commit

706eeb3e9 Documentation/locking/atomic: Add documents for new atomic_t APIs ... Browse Code »

Since we've vastly expanded the atomic_t interface in recent years the
existing documentation is woefully out of date and people seem to get
confused a bit.

Start a new document to hopefully better explain the current state of
affairs.

The old atomic_ops.txt also covers bitmaps and a few more details so
this is not a full replacement and we'll therefore keep that document
around until such a time that we've managed to write more text to cover
its entire.

Also please, ReST people, go away.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Boqun Feng
Cc: Linus Torvalds
Cc: Paul McKenney
Cc: Peter Zijlstra
Cc: Randy Dunlap
Cc: Thomas Gleixner
Cc: Will Deacon
Signed-off-by: Ingo Molnar

Peter Zijlstra
2017-08-10 18:29:00 +0800