09 Jul, 2019

1 commit

  • Pull locking updates from Ingo Molnar:
    "The main changes in this cycle are:

    - rwsem scalability improvements, phase #2, by Waiman Long, which are
    rather impressive:

    "On a 2-socket 40-core 80-thread Skylake system with 40 reader
    and writer locking threads, the min/mean/max locking operations
    done in a 5-second testing window before the patchset were:

    40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
    40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

    After the patchset, they became:

    40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
    40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

    There are a lot of changes to the locking implementation that make
    it similar to qrwlock, including owner handoff for fairer locking.

    Another microbenchmark shows how broad the improvements are across
    the spectrum:

    "With a locking microbenchmark running on 5.1 based kernel, the
    total locking rates (in kops/s) on a 2-socket Skylake system
    with equal numbers of readers and writers (mixed) before and
    after this patchset were:

    # of Threads   Before Patch   After Patch
    ------------   ------------   -----------
          2            2,618         4,193
          4            1,202         3,726
          8              802         3,622
         16              729         3,359
         32              319         2,826
         64              102         2,744"

    The changes are extensive and the patch-set has been through
    several iterations addressing various locking workloads. There
    might be more regressions, but unless they are pathological I
    believe we want to use this new implementation as the baseline
    going forward.

    - jump-label optimizations by Daniel Bristot de Oliveira: the primary
    motivation was to remove IPI disturbance of isolated RT-workload
    CPUs, which resulted in the implementation of batched jump-label
    updates. Beyond improving the kernel's real-time characteristics,
    in one test this patchset improved static key update overhead
    from 57 msecs to just 1.4 msecs - which is a nice speedup as well.

    - atomic64_t cross-arch type cleanups by Mark Rutland: over the last
    ~10 years of atomic64_t existence the various types used by the
    APIs only had to be self-consistent within each architecture -
    which means they became wildly inconsistent across architectures.
    Mark puts an end to this by reworking all the atomic64
    implementations to use 's64' as the base type for atomic64_t, and
    to ensure that this type is consistently used for parameters and
    return values in the API, avoiding further problems in this area.

    - A large set of small improvements to lockdep by Yuyang Du: type
    cleanups, output cleanups, function return type and other cleanups
    all around the place.

    - A set of percpu ops cleanups and fixes by Peter Zijlstra.

    - Misc other changes - please see the Git log for more details"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
    locking/lockdep: increase size of counters for lockdep statistics
    locking/atomics: Use sed(1) instead of non-standard head(1) option
    locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    x86/jump_label: Make tp_vec_nr static
    x86/percpu: Optimize raw_cpu_xchg()
    x86/percpu, sched/fair: Avoid local_clock()
    x86/percpu, x86/irq: Relax {set,get}_irq_regs()
    x86/percpu: Relax smp_processor_id()
    x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
    locking/rwsem: Guard against making count negative
    locking/rwsem: Adaptive disabling of reader optimistic spinning
    locking/rwsem: Enable time-based spinning on reader-owned rwsem
    locking/rwsem: Make rwsem->owner an atomic_long_t
    locking/rwsem: Enable readers spinning on writer
    locking/rwsem: Clarify usage of owner's nonspinnable bit
    locking/rwsem: Wake up almost all readers in wait queue
    locking/rwsem: More optimal RT task handling of null owner
    locking/rwsem: Always release wait_lock before waking up tasks
    locking/rwsem: Implement lock handoff to prevent lock starvation
    locking/rwsem: Make rwsem_spin_on_owner() return owner state
    ...

    Linus Torvalds
     

29 Jun, 2019

1 commit

  • …k/linux-rcu into core/rcu

    Pull rcu/next + tools/memory-model changes from Paul E. McKenney:

    - RCU flavor consolidation cleanups and optimizations
    - Documentation updates
    - Miscellaneous fixes
    - SRCU updates
    - RCU-sync flavor consolidation
    - Torture-test updates
    - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

25 Jun, 2019

2 commits

  • When the system has been running for a long time, signed integer
    counters are not enough for some lockdep statistics. Using
    unsigned long counters satisfies the requirement. Besides, most
    lockdep statistics are unsigned, so it is better to use
    unsigned int instead of int.

    Remove unused variables.
    - max_recursion_depth
    - nr_cyclic_check_recursions
    - nr_find_usage_forwards_recursions
    - nr_find_usage_backwards_recursions

    Signed-off-by: Kobe Wu
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Eason Lin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/1561365348-16050-1-git-send-email-kobe-cp.wu@mediatek.com
    Signed-off-by: Ingo Molnar

    Kobe Wu
     
  • The last cleanup patch triggered another issue, as now another function
    should be moved into the same section:

    kernel/locking/lockdep.c:3580:12: error: 'mark_lock' defined but not used [-Werror=unused-function]
    static int mark_lock(struct task_struct *curr, struct held_lock *this,

    Move mark_lock() into the same #ifdef section as its only caller, and
    remove the now-unused mark_lock_irq() stub helper.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Bart Van Assche
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: Yuyang Du
    Fixes: 0d2cc3b34532 ("locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING")
    Link: https://lkml.kernel.org/r/20190617124718.1232976-1-arnd@arndb.de
    Signed-off-by: Ingo Molnar

    Arnd Bergmann
     

20 Jun, 2019

1 commit


17 Jun, 2019

18 commits

  • The upper bits of the count field are used as the reader count. When
    a sufficient number of active readers are present, the most significant
    bit will be set and the count becomes negative. If the number of active
    readers keeps piling up, we may eventually overflow the reader count.
    This is not likely to happen unless the number of bits reserved for the
    reader count is reduced because those bits are needed for other purposes.

    To prevent this count overflow from happening, the most significant
    bit is now treated as a guard bit (RWSEM_FLAG_READFAIL). Read-lock
    attempts will now fail for both the fast and slow paths whenever this
    bit is set. So all those extra readers will be put to sleep in the wait
    list. Wakeup will not happen until the reader count reaches 0.
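
    A minimal user-space sketch of the guard-bit idea (illustrative only;
    the constant and helper below are hypothetical, not the kernel's actual
    fast path): a reader adds its bias, and if the old count was already
    negative (guard bit set) it backs out and falls back to the wait list.

    #include <stdatomic.h>
    #include <stdbool.h>

    #define READER_BIAS 256L        /* hypothetical per-reader increment */

    /* Returns true if the read lock was taken on the fast path. */
    static bool read_trylock_fast(atomic_long *count)
    {
            long old = atomic_fetch_add(count, READER_BIAS);

            if (old < 0) {
                    /* Guard bit (MSB) is set: too many readers. Back out
                     * and let the caller sleep on the wait list instead. */
                    atomic_fetch_sub(count, READER_BIAS);
                    return false;
            }
            return true;
    }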

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-17-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Reader optimistic spinning is helpful when the reader critical section
    is short and there aren't that many readers around. It gives readers
    relative preference over writers. When a writer times out spinning
    on a reader-owned lock and sets the nonspinnable bits, there are two
    main reasons for that:

    1) The reader critical section is long, perhaps because the task sleeps
    after acquiring the read lock.
    2) There are just too many readers contending for the lock, so it takes
    a while to service all of them.

    In the former case, a long reader critical section will impede the
    progress of writers, which is usually more important for system
    performance. In the latter case, reader optimistic spinning tends to
    make the groups of readers that acquire the lock together smaller,
    leading to more of them. That may hurt performance in some cases. In
    other words, the setting of the nonspinnable bits indicates that reader
    optimistic spinning may not be helpful for the workloads that cause it.

    Therefore, any writer that has observed the setting of the writer
    nonspinnable bit for a given rwsem after it fails to acquire the lock
    via optimistic spinning will set the reader nonspinnable bit once it
    acquires the write lock. Similarly, readers that observe the setting
    of the reader nonspinnable bit at slowpath entry will also set the
    reader nonspinnable bit when they acquire the read lock via the
    wakeup path.

    Once the reader nonspinnable bit is on, it will only be reset when
    a writer is able to acquire the rwsem in the fast path, or when a
    reader or writer in the slowpath somehow doesn't observe the
    nonspinnable bit.

    This discourages reader optimistic spinning on that particular rwsem
    and makes writers more preferred. This adaptive disabling of reader
    optimistic spinning alleviates some of the negative side effects of
    the feature.

    In addition, this patch tries to make readers in the spinning queue
    follow the phase-fair principle after quitting optimistic spinning,
    by checking whether another reader has somehow acquired a read lock
    after this reader entered the optimistic spinning queue. If so, and
    the rwsem is still reader-owned, this reader is in the right read
    phase and can attempt to acquire the lock.
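
    In flag terms, the mechanism boils down to propagating an observed
    "a writer could not spin" condition into a "readers should not spin"
    condition. A compressed sketch (the bit names follow the description
    above and are illustrative, not the actual source):

    #include <stdbool.h>

    #define WR_NONSPINNABLE (1UL << 1)  /* a writer gave up spinning      */
    #define RD_NONSPINNABLE (1UL << 2)  /* readers should not spin either */

    /* Writer slow path, after finally acquiring the write lock. */
    static unsigned long adapt_owner_flags(unsigned long flags,
                                           bool saw_wr_nonspinnable)
    {
            if (saw_wr_nonspinnable)
                    flags |= RD_NONSPINNABLE;   /* discourage reader spinning */
            return flags;
    }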

    On a 2-socket 40-core 80-thread Skylake system, the page_fault1 test of
    the will-it-scale benchmark was run with various numbers of threads. The
    numbers of operations done before the reader optimistic spinning patches,
    before this patch, and after this patch were:

    Threads   Before rspin   Before patch   After patch    %change
    -------   ------------   ------------   -----------    -------
      20         5541068        5345484        5455667    -3.5%/ +2.1%
      40        10185150        7292313        9219276   -28.5%/+26.4%
      60         8196733        6460517        7181209   -21.2%/+11.2%
      80         9508864        6739559        8107025   -29.1%/+20.3%

    This patch doesn't recover all the lost performance, but it is more
    than half. Given the fact that reader optimistic spinning does benefit
    some workloads, this is a good compromise.

    Using the rwsem locking microbenchmark with very short critical section,
    this patch doesn't have too much impact on locking performance as shown
    by the locking rates (kops/s) below with equal numbers of readers and
    writers before and after this patch:

    # of Threads   Pre-patch   Post-patch
    ------------   ---------   ----------
          2          4,730        4,969
          4          4,814        4,786
          8          4,866        4,815
         16          4,715        4,511
         32          3,338        3,500
         64          3,212        3,389
         80          3,110        3,044

    When running the locking microbenchmark with 40 dedicated reader and writer
    threads, however, the reader performance is curtailed to favor the writer.

    Before patch:

    40 readers, Iterations Min/Mean/Max = 204,026/234,309/254,816
    40 writers, Iterations Min/Mean/Max = 88,515/95,884/115,644

    After patch:

    40 readers, Iterations Min/Mean/Max = 33,813/35,260/36,791
    40 writers, Iterations Min/Mean/Max = 95,368/96,565/97,798

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-16-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When the rwsem is owned by a reader, writers stop optimistic spinning
    simply because there is no easy way to figure out whether all the
    readers are actively running or not. However, there are scenarios
    where the readers are unlikely to sleep and optimistic spinning can
    help performance.

    This patch provides a simple mechanism for a writer to spin on a
    reader-owned rwsem. It is time-threshold-based spinning where the
    allowable spinning time can vary from 10us to 25us depending on the
    condition of the rwsem.

    When the time threshold is exceeded, the nonspinnable bits will be set
    in the owner field to indicate that no more optimistic spinning will
    be allowed on this rwsem until it becomes writer-owned again. For
    fairness, not even readers are allowed to acquire the reader-locked
    rwsem via optimistic spinning.

    We also want a writer to be able to acquire the lock after the readers
    have held it for a relatively long time. In order to give preference
    to writers under such a circumstance, the single RWSEM_NONSPINNABLE
    bit is now split into two - one for readers and one for writers. When
    optimistic spinning is disabled, both bits will be set. When the reader
    count drops to 0, the writer nonspinnable bit will be cleared to allow
    writers to spin on the lock, but not the readers. When a writer acquires
    the lock, it will write its own task structure pointer into sem->owner
    and clear the reader nonspinnable bit in the process.
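
    A simplified, self-contained sketch of the time-budgeted spin (the
    helper below is user-space C with an illustrative name; the real code
    keeps its state in the rwsem and uses kernel time sources):

    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    static uint64_t now_ns(void)
    {
            struct timespec ts;

            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
    }

    /* Spin on a reader-owned lock for at most budget_ns (e.g. 25000). */
    static bool spin_on_readers(volatile long *reader_count, uint64_t budget_ns)
    {
            uint64_t deadline = now_ns() + budget_ns;

            while (*reader_count > 0) {
                    if (now_ns() > deadline)
                            return false;  /* caller sets the nonspinnable bits */
            }
            return true;                   /* readers drained; try the lock now */
    }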

    The time taken for each iteration of the reader-owned rwsem spinning
    loop varies. Below are sample minimum elapsed times for 16 iterations
    of the loop.

    System                        Time for 16 Iterations
    ------                        ----------------------
    1-socket Skylake                     ~800ns
    4-socket Broadwell                   ~300ns
    2-socket ThunderX2 (arm64)           ~250ns

    When the lock cacheline is contended, we can see up to an almost 10X
    increase in elapsed time. So 25us corresponds to at most roughly 500,
    1300 and 1600 iterations, respectively, on the above systems.

    With a locking microbenchmark running on a 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
    equal numbers of readers and writers before and after this patch were
    as follows:

    # of Threads   Pre-patch   Post-patch
    ------------   ---------   ----------
          2          1,759        6,684
          4          1,684        6,738
          8          1,074        7,222
         16            900        7,163
         32            458        7,316
         64            208          520
        128            168          425
        240            143          474

    This patch gives a big boost in performance for mixed reader/writer
    workloads.

    With 32 locking threads, the rwsem lock event data were:

    rwsem_opt_fail=79850
    rwsem_opt_nospin=5069
    rwsem_opt_rlock=597484
    rwsem_opt_wlock=957339
    rwsem_sleep_reader=57782
    rwsem_sleep_writer=55663

    With 64 locking threads, the data looked like:

    rwsem_opt_fail=346723
    rwsem_opt_nospin=6293
    rwsem_opt_rlock=1127119
    rwsem_opt_wlock=1400628
    rwsem_sleep_reader=308201
    rwsem_sleep_writer=72281

    So a lot more threads acquired the lock in the slowpath and more threads
    went to sleep.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-15-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • The rwsem->owner field contains not just the task structure pointer;
    it also holds some flags storing the current state of the rwsem. Some
    of the flags may have to be updated atomically. To reflect this new
    reality, the owner field is now changed to an atomic_long_t type.

    New helper functions are added to properly separate out the task
    structure pointer and the embedded flags.
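
    A hedged sketch of what such helpers can look like when a task pointer
    and flag bits share one atomic word (the mask, flag layout and names
    here are assumptions for illustration, not the kernel's definitions):

    #include <stdatomic.h>
    #include <stdint.h>

    #define OWNER_FLAG_MASK 0x7UL           /* low bits reserved for flags */

    struct task;                            /* stand-in for struct task_struct */

    static struct task *owner_task(atomic_ulong *owner)
    {
            return (struct task *)(atomic_load(owner) & ~OWNER_FLAG_MASK);
    }

    static unsigned long owner_flags(atomic_ulong *owner)
    {
            return atomic_load(owner) & OWNER_FLAG_MASK;
    }

    static void set_owner(atomic_ulong *owner, struct task *t, unsigned long flags)
    {
            atomic_store(owner, (uintptr_t)t | flags);
    }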

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-14-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • This patch enables readers to optimistically spin on a
    rwsem when it is owned by a writer instead of going to sleep
    directly. The rwsem_can_spin_on_owner() function is extracted
    out of rwsem_optimistic_spin() and is called directly by
    rwsem_down_read_slowpath() and rwsem_down_write_slowpath().

    With a locking microbenchmark running on a 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with equal
    numbers of readers and writers before and after the patch were as
    follows:

    # of Threads   Pre-patch   Post-patch
    ------------   ---------   ----------
          4          1,674        1,684
          8          1,062        1,074
         16            924          900
         32            300          458
         64            195          208
        128            164          168
        240            149          143

    The performance change wasn't significant in this case, but this change
    is required by a follow-on patch.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-13-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Bit 1 of sem->owner (RWSEM_ANONYMOUSLY_OWNED) is used to designate an
    anonymous owner - readers or an anonymous writer. The setting of this
    anonymous bit is used as an indicator that optimistic spinning cannot
    be done on this rwsem.

    With the upcoming reader optimistic spinning patches, a reader-owned
    rwsem can be spun on for a limited period of time. We still need this
    bit to indicate that a rwsem is nonspinnable, but a cleared bit no
    longer means the owner is known. So rename the bit to
    RWSEM_NONSPINNABLE to clarify its meaning.

    This patch also fixes a DEBUG_RWSEMS_WARN_ON() bug in __up_write().

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-12-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • When the front of the wait queue is a reader, other readers
    immediately following the first reader will also be woken up at the
    same time. However, if there is a writer in between, the readers
    behind the writer will not be woken up.

    Because of optimistic spinning, the lock acquisition order is not FIFO
    anyway. The lock handoff mechanism will ensure that lock starvation
    does not happen.

    Assuming that the lock hold times of the other readers still in the
    queue will be about the same as those of the readers being woken up,
    there is really not much additional cost other than the extra latency
    of the waker waking up more tasks. Therefore, when the first waiter is
    a reader, all the readers in the queue, up to a maximum of 256, are
    woken up to improve reader throughput. This is somewhat similar in
    concept to a phase-fair R/W lock.
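
    A rough sketch of the bounded wakeup scan (the list type, helper name
    and cap constant below are illustrative; only the 256 limit comes from
    the description above): when the head of the queue is a reader, walk
    the whole queue and collect up to 256 readers, even past any
    intervening writers.

    #define MAX_READERS_TO_WAKE 256

    enum waiter_type { WAITER_READER, WAITER_WRITER };

    struct waiter {
            enum waiter_type type;
            struct waiter *next;
    };

    /* Returns the number of readers collected into to_wake[]. */
    static int collect_readers_to_wake(struct waiter *head,
                                       struct waiter **to_wake, int cap)
    {
            int n = 0;

            if (!head || head->type != WAITER_READER)
                    return 0;       /* a writer is first: wake only it instead */

            for (struct waiter *w = head; w && n < cap; w = w->next) {
                    if (w->type == WAITER_READER)
                            to_wake[n++] = w;
            }
            return n;
    }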

    With a locking microbenchmark running on a 5.1 based kernel, the total
    locking rates (in kops/s) on an 8-socket IvyBridge-EX system with
    equal numbers of readers and writers before and after this patch were
    as follows:

    # of Threads   Pre-patch   Post-patch
    ------------   ---------   ----------
          4          1,641        1,674
          8            731        1,062
         16            564          924
         32             78          300
         64             38          195
        240             50          149

    There is no performance gain at low contention level. At high contention
    level, however, this patch gives a pretty decent performance boost.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-11-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • An RT task can do optimistic spinning only if the lock holder is
    actually running. If the state of the lock holder isn't known, there
    is a possibility that the RT task's high priority may block forward
    progress of the lock holder if it happens to reside on the same CPU.
    This will lead to deadlock. So we have to make sure that an RT task
    will not spin on a reader-owned rwsem.

    When the owner is temporarily set to NULL, there are two cases
    where we may want to continue spinning:

    1) The lock owner is in the process of releasing the lock: sem->owner
    is cleared but the lock has not been released yet.

    2) The lock was free and the owner cleared, but another task just came
    in and acquired the lock before we tried to get it. The new owner may
    be a spinnable writer.

    So an RT task is now made to retry one more time to see if it can
    acquire the lock or continue spinning on the new owning writer.

    When testing on an 8-socket IvyBridge-EX system, the one additional
    retry seems to improve locking performance of RT write-locking threads
    under heavy contention. The table below shows the locking rates (in
    kops/s) with various numbers of write-locking threads before and after
    the patch.

    Locking threads   Pre-patch   Post-patch
    ---------------   ---------   ----------
           4            2,753        2,608
           8            2,529        2,520
          16            1,727        1,918
          32            1,263        1,956
          64              889        1,343

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-10-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • With the use of wake_q, we can do task wakeups without holding the
    wait_lock. There is one exception in the rwsem code, though. It is
    when the writer in the slowpath detects that there are waiters ahead
    but the rwsem is not held by a writer. This can lead to a long wait_lock
    hold time especially when a large number of readers are to be woken up.

    Remedy this situation by releasing the wait_lock before waking up
    tasks and re-acquiring it afterward. The rwsem_try_write_lock()
    function is also modified to read the rwsem count directly to avoid
    a stale count value.
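
    The general pattern is: collect the tasks to wake while holding the
    wait_lock, drop the lock, do the wakeups, then re-acquire the lock if
    more work is needed. A user-space flavored sketch of that pattern
    (pthread-based and purely illustrative, not the kernel's wake_q API):

    #include <pthread.h>

    struct wake_q {
            void (*wake_fn)(void *task);    /* how to wake one task    */
            void *tasks[64];                /* tasks queued for wakeup */
            int n;
    };

    static void wake_up_q(struct wake_q *q)
    {
            for (int i = 0; i < q->n; i++)
                    q->wake_fn(q->tasks[i]);
            q->n = 0;
    }

    static void release_and_wake(pthread_mutex_t *wait_lock, struct wake_q *q)
    {
            /* q->tasks[] was filled while wait_lock was held (not shown). */
            pthread_mutex_unlock(wait_lock);   /* drop wait_lock first...   */
            wake_up_q(q);                      /* ...then do the wakeups    */
            pthread_mutex_lock(wait_lock);     /* re-acquire to continue    */
    }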

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-9-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Because of writer lock stealing, it is possible that a constant
    stream of incoming writers will cause a waiting writer or reader to
    wait indefinitely, leading to lock starvation.

    This patch implements a lock handoff mechanism to disable lock stealing
    and force lock handoff to the first waiter (or waiters, for readers)
    in the queue after at least a 4ms waiting period, unless it is an RT
    writer task which doesn't need to wait. The waiting period is used to
    avoid discouraging lock stealing so much that it hurts performance.

    The setting and clearing of the handoff bit is serialized by the
    wait_lock, so racing is not possible.
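
    A compressed sketch of the handoff decision (the 4ms threshold comes
    from the text above; the flag, struct and helper names are illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    #define HANDOFF_GRACE_NS (4ULL * 1000 * 1000)   /* 4ms waiting period */

    struct waiter_state {
            uint64_t enqueue_time_ns;
            bool     handoff_set;
    };

    /* Called with the wait_lock held, so setting the bit is race-free. */
    static bool should_set_handoff(struct waiter_state *w, uint64_t now_ns,
                                   bool is_rt_writer)
    {
            if (w->handoff_set)
                    return false;
            if (is_rt_writer || now_ns - w->enqueue_time_ns >= HANDOFF_GRACE_NS) {
                    w->handoff_set = true;  /* disables stealing until handoff */
                    return true;
            }
            return false;
    }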

    A rwsem microbenchmark was run for 5 seconds on a 2-socket 40-core
    80-thread Skylake system with a v5.1 based kernel and 240 write_lock
    threads with a 5us-sleep critical section.

    Before the patch, the min/mean/max numbers of locking operations for
    the locking threads were 1/7,792/173,696. After the patch, the figures
    became 5,842/6,542/7,458. It can be seen that the rwsem became much
    fairer, though there was a drop of about 16% in the mean number of
    locking operations done - a tradeoff for the better fairness.

    Making the waiter set the handoff bit right after the first wakeup can
    impact performance, especially with a mixed reader/writer workload.
    With the same microbenchmark, a short critical section, and equal
    numbers of reader and writer threads (40/40), the reader/writer locking
    operation counts with the current patch were:

    40 readers, Iterations Min/Mean/Max = 1,793/1,794/1,796
    40 writers, Iterations Min/Mean/Max = 1,793/34,956/86,081

    By making waiter set handoff bit immediately after wakeup:

    40 readers, Iterations Min/Mean/Max = 43/44/46
    40 writers, Iterations Min/Mean/Max = 43/1,263/3,191

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-8-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • This patch modifies rwsem_spin_on_owner() to return four possible
    values to better reflect the state of the lock holder, which enables
    us to make a better decision about what to do next.
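
    A sketch of what such a four-way return value can look like (the state
    names below are modeled on the description and are illustrative):

    enum owner_state {
            OWNER_NULL,         /* no owner recorded at the moment       */
            OWNER_WRITER,       /* writer-owned and the owner is running */
            OWNER_READER,       /* reader-owned                          */
            OWNER_NONSPINNABLE, /* owner slept or spinning is disabled   */
    };

    /* The caller keeps spinning only while the state stays OWNER_WRITER. */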

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-7-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • After merging all the relevant rwsem code into one single file, there
    are a number of optimizations and cleanups that can be done:

    1) Remove all the EXPORT_SYMBOL() calls for functions that are not
    accessed elsewhere.
    2) Remove all the __visible tags as none of the functions will be
    called from assembly code anymore.
    3) Make all the internal functions static.
    4) Remove some unneeded blank lines.
    5) Remove the intermediate rwsem_down_{read|write}_failed*() functions
    and rename __rwsem_down_{read|write}_failed_common() to
    rwsem_down_{read|write}_slowpath().
    6) Remove "__" prefix of __rwsem_mark_wake().
    7) Use atomic_long_try_cmpxchg_acquire() as much as possible.
    8) Remove the rwsem_rtrylock and rwsem_wtrylock lock events as they
    are not that useful.

    That enables the compiler to do better optimization and reduce code
    size. The text+data size of rwsem.o on an x86-64 machine with gcc8 was
    reduced from 10237 bytes to 5030 bytes with this change.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-6-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Now we only have one implementation of rwsem. Even though we still use
    xadd to handle reader locking, we use cmpxchg for the writer instead.
    So the filename rwsem-xadd.c is not strictly correct. Also, no one
    outside of the rwsem code needs to know the internal implementation
    other than the function prototypes of two internal functions that are
    called directly from percpu-rwsem.c.

    So the rwsem-xadd.c and rwsem.h files are now merged into rwsem.c in
    the following order:




    The rwsem.h file now contains only 2 function declarations for
    __up_read() and __down_read().

    This is a code relocation patch with no code change at all except
    making __up_read() and __down_read() non-static functions so they
    can be used by percpu-rwsem.c.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-5-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • The current way of using various reader, writer and waiting biases
    in the rwsem code is confusing and hard to understand. I have to
    reread the rwsem count guide in the rwsem-xadd.c file from time to
    time to remind myself how this whole thing works. It also makes the
    rwsem code harder to optimize.

    To make rwsem more sane, a new locking scheme similar to the one in
    qrwlock is now being used. The atomic long count has the following
    bit definitions:

    Bit 0    - writer locked bit
    Bit 1    - waiters present bit
    Bits 2-7 - reserved for future extension
    Bits 8-X - reader count (24/56 bits)

    The cmpxchg instruction is now used to acquire the write lock. The read
    lock is still acquired with the xadd instruction, so there is no change
    there. This scheme will allow up to 16M/64P active readers, which should
    be more than enough. We can always use some more reserved bits if
    necessary.

    With that change, we can deterministically know if a rwsem has been
    write-locked. Looking at the count alone, however, one cannot determine
    for certain whether a rwsem is owned by readers, as the readers that
    set the reader count bits may be in the process of backing out. So we
    still need the reader-owned bit in the owner field to be sure.
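
    A minimal sketch of the count layout and the cmpxchg/xadd fast paths
    described above (user-space C11 atomics; the constant names are
    illustrative, not the kernel's):

    #include <stdatomic.h>
    #include <stdbool.h>

    #define WRITER_LOCKED (1UL << 0)        /* bit 0: writer locked   */
    #define FLAG_WAITERS  (1UL << 1)        /* bit 1: waiters present */
    #define READER_BIAS   (1UL << 8)        /* bits 8+: reader count  */

    /* Write lock fast path: cmpxchg the writer bit in, but only when the
     * count is 0 (no writer, no readers, no waiters). */
    static bool write_trylock(atomic_ulong *count)
    {
            unsigned long expected = 0;

            return atomic_compare_exchange_strong(count, &expected,
                                                  WRITER_LOCKED);
    }

    /* Read lock fast path: still an atomic add (xadd) of the reader bias. */
    static bool read_trylock(atomic_ulong *count)
    {
            unsigned long old = atomic_fetch_add(count, READER_BIAS);

            if (old & WRITER_LOCKED) {
                    atomic_fetch_sub(count, READER_BIAS);   /* back out */
                    return false;
            }
            return true;
    }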

    With a locking microbenchmark running on a 5.1 based kernel, the total
    locking rates (in kops/s) of the benchmark on an 8-socket 120-core
    IvyBridge-EX system before and after the patch were as follows:

                    Before Patch        After Patch
    # of Threads   wlock    rlock     wlock    rlock
    ------------   ------   ------    ------   ------
         1         30,659   31,341    31,055   31,283
         2          8,909   16,457     9,884   17,659
         4          9,028   15,823     8,933   20,233
         8          8,410   14,212     7,230   17,140
        16          8,217   25,240     7,479   24,607

    The locking rates of the benchmark on a Power8 system were as follows:

                    Before Patch        After Patch
    # of Threads   wlock    rlock     wlock    rlock
    ------------   ------   ------    ------   ------
         1         12,963   13,647    13,275   13,601
         2          7,570   11,569     7,902   10,829
         4          5,232    5,516     5,466    5,435
         8          5,233    3,386     5,467    3,168

    The locking rates of the benchmark on a 2-socket ARM64 system were
    as follows:

                    Before Patch        After Patch
    # of Threads   wlock    rlock     wlock    rlock
    ------------   ------   ------    ------   ------
         1         21,495   21,046    21,524   21,074
         2          5,293   10,502     5,333   10,504
         4          5,325   11,463     5,358   11,631
         8          5,391   11,712     5,470   11,680

    The performance is roughly the same before and after the patch. There
    are run-to-run variations in performance. Runs with higher variances
    usually have higher throughput.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-4-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • After the following commit:

    59aabfc7e959 ("locking/rwsem: Reduce spinlock contention in wakeup after up_read()/up_write()")

    rwsem_wake() forgoes doing a wakeup if the wait_lock cannot be directly
    acquired and an optimistic spinning locker is present. This can help
    performance by avoiding spinning on the wait_lock when it is contended.

    With the later commit:

    133e89ef5ef3 ("locking/rwsem: Enable lockless waiter wakeup(s)")

    the performance advantage of the above optimization diminishes as the
    average wait_lock hold time becomes much shorter.

    With a later patch that supports rwsem lock handoff, we can no longer
    rely on the presence of an optimistic spinning locker to ensure that
    the lock will soon be acquired by a task and that rwsem_wake() will be
    called later on to wake up the waiters. This can lead to missed
    wakeups and application hangs.

    So the original 59aabfc7e959 commit has to be reverted.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-3-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • The owner field in the rw_semaphore structure is used primarily for
    optimistic spinning. However, identifying the rwsem owner can also be
    helpful in debugging as well as tracing locking related issues when
    analyzing crash dump. The owner field may also store state information
    that can be important to the operation of the rwsem.

    So the owner field is now made a permanent member of the rw_semaphore
    structure irrespective of CONFIG_RWSEM_SPIN_ON_OWNER.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190520205918.22251-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • DEBUG_LOCKS_WARN_ON() will turn off debug_locks and make
    print_unlock_imbalance_bug() return directly.

    Remove a redundant whitespace.

    Signed-off-by: Kobe Wu
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Eason Lin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/1559217575-30298-1-git-send-email-kobe-cp.wu@mediatek.com
    Signed-off-by: Ingo Molnar

    Kobe Wu
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    distributed under the terms of the gnu gpl version 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 2 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Armijn Hemel
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190115.032570679@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Jun, 2019

16 commits

  • Instead of playing silly games with CONFIG_DEBUG_PREEMPT toggling
    between this_cpu_*() and __this_cpu_*(), use raw_cpu_*(), which is
    exactly what we want here.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: huang ying
    Link: https://lkml.kernel.org/r/20190527082326.GP2623@hirez.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The sequence

    static DEFINE_WW_CLASS(test_ww_class);

    struct ww_acquire_ctx ww_ctx;
    struct ww_mutex ww_lock_a;
    struct ww_mutex ww_lock_b;
    struct ww_mutex ww_lock_c;
    struct mutex lock_c;

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);
    ww_mutex_init(&ww_lock_c, &test_ww_class);

    mutex_init(&lock_c);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);

    mutex_lock(&lock_c);

    ww_mutex_lock(&ww_lock_b, &ww_ctx);
    ww_mutex_lock(&ww_lock_c, &ww_ctx);

    mutex_unlock(&lock_c); (*)

    ww_mutex_unlock(&ww_lock_c);
    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    ww_acquire_fini(&ww_ctx); (**)

    will trigger the following error in __lock_release() when calling
    mutex_release() at **:

    DEBUG_LOCKS_WARN_ON(depth <= 0)

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Ville Syrjälä
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: https://lkml.kernel.org/r/20190524201509.9199-2-imre.deak@intel.com
    Signed-off-by: Ingo Molnar

    Imre Deak
     
  • The sequence

    static DEFINE_WW_CLASS(test_ww_class);

    struct ww_acquire_ctx ww_ctx;
    struct ww_mutex ww_lock_a;
    struct ww_mutex ww_lock_b;
    struct mutex lock_c;
    struct mutex lock_d;

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);

    mutex_init(&lock_c);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);

    mutex_lock(&lock_c);

    ww_mutex_lock(&ww_lock_b, &ww_ctx);

    mutex_unlock(&lock_c); (*)

    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    ww_acquire_fini(&ww_ctx);

    triggers the following WARN in __lock_release() when doing the unlock at *:

    DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1);

    The problem is that the WARN check doesn't take into account the merging
    of ww_lock_a and ww_lock_b, which results in decreasing
    curr->lockdep_depth by 2, not just 1.

    Note that the following sequence doesn't trigger the WARN, since there
    won't be any hlock merging.

    ww_acquire_init(&ww_ctx, &test_ww_class);

    ww_mutex_init(&ww_lock_a, &test_ww_class);
    ww_mutex_init(&ww_lock_b, &test_ww_class);

    mutex_init(&lock_c);
    mutex_init(&lock_d);

    ww_mutex_lock(&ww_lock_a, &ww_ctx);

    mutex_lock(&lock_c);
    mutex_lock(&lock_d);

    ww_mutex_lock(&ww_lock_b, &ww_ctx);

    mutex_unlock(&lock_d);

    ww_mutex_unlock(&ww_lock_b);
    ww_mutex_unlock(&ww_lock_a);

    mutex_unlock(&lock_c);

    ww_acquire_fini(&ww_ctx);

    In general both of the above two sequences are valid and shouldn't
    trigger any lockdep warning.

    Fix this by taking the decrement due to the hlock merging into account
    during lock release and hlock class re-setting. Merging can't happen
    during lock downgrading since there won't be a new possibility to merge
    hlocks in that case, so add a WARN if merging still happens then.

    Signed-off-by: Imre Deak
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: ville.syrjala@linux.intel.com
    Link: https://lkml.kernel.org/r/20190524201509.9199-1-imre.deak@intel.com
    Signed-off-by: Ingo Molnar

    Imre Deak
     
  • In mark_lock_irq(), the following checks are performed:

     ----------------------------------
    |    ->     | unsafe | read unsafe |
    |-----------------------------------|
    | safe      |  F  B  |   F*  B*    |
    |-----------------------------------|
    | read safe |  F? B* |      -      |
     ----------------------------------

    Where:
    F: check_usage_forwards
    B: check_usage_backwards
    *: check enabled by STRICT_READ_CHECKS
    ?: check enabled by the !dir condition

    From a checking point of view, the special F? case does not make sense;
    it was perhaps done for performance reasons. As a later patch will
    address this issue, remove this exception, which makes the checks
    consistent later on.

    With STRICT_READ_CHECKS = 1, which is the default, there is no
    functional change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-24-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • The new usage bit passed in can be any possible lock usage except
    garbage, so the cases in the switch statement can be simplified. Warn
    early if a wrong usage bit is passed in without taking locks. No
    functional change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-23-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • Lock usage bit initialization is consolidated into one function
    mark_usage(). Trivial readability improvement. No functional change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-22-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • Peter has already put the case for this soundly and completely, so I
    simply quote:

    "It (check_redundant) was added for cross-release (which has since been
    reverted) which would generate a lot of redundant links (IIRC) but
    having it makes the reports more convoluted -- basically, if we had an
    A-B-C relation, then A-C will not be added to the graph because it is
    already covered. This then means any report will include B, even though
    a shorter cycle might have been possible."

    This would increase the number of direct dependencies. For a simple workload
    (make clean; reboot; make vmlinux -j8), the data looks like this:

    CONFIG_LOCKDEP_SMALL: direct dependencies: 6926

    !CONFIG_LOCKDEP_SMALL: direct dependencies: 9052 (+30.7%)

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-21-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • These two functions now handle different check results themselves. A new
    check_path function is added to check whether there is a path in the
    dependency graph. No functional change.
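
    A generic sketch of such a path check - a breadth-first search over a
    dependency graph from a source lock class to a target class (standalone
    C with an adjacency matrix, not lockdep's actual data structures):

    #include <stdbool.h>

    #define MAX_NODES 1024

    /* adj[i][j] != 0 means lock class i was seen taken before class j. */
    static bool check_path(const unsigned char adj[MAX_NODES][MAX_NODES],
                           int nodes, int src, int dst)
    {
            int queue[MAX_NODES], head = 0, tail = 0;
            bool seen[MAX_NODES] = { false };

            queue[tail++] = src;
            seen[src] = true;

            while (head < tail) {
                    int u = queue[head++];

                    if (u == dst)
                            return true;    /* a dependency path exists */
                    for (int v = 0; v < nodes; v++) {
                            if (adj[u][v] && !seen[v]) {
                                    seen[v] = true;
                                    queue[tail++] = v;
                            }
                    }
            }
            return false;
    }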

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-20-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • The @nested argument is not used in __release_lock(), so remove it,
    even though it is also unused in lock_release() in the first place.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-19-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • In check_deadlock(), the third argument 'read' can be derived from the
    second argument 'hlock', so it can be removed. No functional change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-18-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • The breadth-first search is now implemented as flat-out non-recursive,
    but the comments still describe it as recursive; update the comments
    accordingly.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-16-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • In the search for a dependency in the lock graph, there are constant
    checks for whether the search is forward or backward. Directly
    reference the field offset of the struct that differentiates the type
    of search to avoid those checks.

    No functional change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-15-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • With this change, we can slightly adjust the code that iterates the
    queue in the BFS search, which simplifies the code. No functional
    change.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-14-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • The element field is an array in struct circular_queue that keeps track
    of the locks in the search. Making it the same type as the locks avoids
    type casts. Also fix a typo and elaborate the comment above struct
    circular_queue.

    No functional change.
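
    For reference, a minimal circular queue of the kind used for that BFS,
    holding lock-list pointers directly so no casts are needed (a
    standalone sketch, not lockdep's actual struct circular_queue):

    #define CQ_SIZE 4096            /* must be a power of two */

    struct lock_list;               /* element type tracked during the search */

    struct circular_queue {
            struct lock_list *element[CQ_SIZE];
            unsigned int front, rear;
    };

    static void cq_init(struct circular_queue *cq)
    {
            cq->front = cq->rear = 0;
    }

    static int cq_empty(const struct circular_queue *cq)
    {
            return cq->front == cq->rear;
    }

    static int cq_full(const struct circular_queue *cq)
    {
            return ((cq->rear + 1) & (CQ_SIZE - 1)) == cq->front;
    }

    static int cq_enqueue(struct circular_queue *cq, struct lock_list *e)
    {
            if (cq_full(cq))
                    return -1;
            cq->element[cq->rear] = e;
            cq->rear = (cq->rear + 1) & (CQ_SIZE - 1);
            return 0;
    }

    static int cq_dequeue(struct circular_queue *cq, struct lock_list **e)
    {
            if (cq_empty(cq))
                    return -1;
            *e = cq->element[cq->front];
            cq->front = (cq->front + 1) & (CQ_SIZE - 1);
            return 0;
    }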

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Bart Van Assche
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-13-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • A leftover comment is removed. While at it, add more explanatory
    comments. Such a trivial patch!

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-12-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • The lockdep_map argument in them is not used, remove it.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Bart Van Assche
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-11-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du