31 May, 2019

1 commit

  • Based on 3 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version [author] [kishon] [vijay] [abraham]
    [i] [kishon]@[ti] [com] this program is distributed in the hope that
    it will be useful but without any warranty without even the implied
    warranty of merchantability or fitness for a particular purpose see
    the gnu general public license for more details

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version [author] [graeme] [gregory]
    [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i]
    [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema]
    [hk] [hemahk]@[ti] [com] this program is distributed in the hope
    that it will be useful but without any warranty without even the
    implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1105 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

02 Oct, 2018

1 commit

  • Both spin locks and write locks currently do:

    f0 0f b1 17    lock cmpxchg %edx,(%rdi)
    85 c0          test     %eax,%eax
    75 05          jne      [slowpath]

    This 'test' insn is superfluous; the cmpxchg insn already sets the Z
    flag appropriately. Peter pointed out that using
    atomic_try_cmpxchg_acquire() will let the compiler know this is true.
    Comparing before/after disassemblies shows the only effect is the
    removal of this insn.

    Take this opportunity to make the spin & write lock code resemble each
    other more closely and have similar likely() hints.
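
    For reference, the resulting write-lock fastpath looks roughly like
    this (a sketch based on the post-patch qrwlock code; names as in
    include/asm-generic/qrwlock.h):

    static inline void queued_write_lock(struct qrwlock *lock)
    {
            u32 cnts = 0;

            /*
             * atomic_try_cmpxchg_acquire() returns a boolean, so the
             * compiler can branch directly on the Z flag set by the
             * cmpxchg instruction instead of emitting a separate 'test'.
             */
            if (likely(atomic_try_cmpxchg_acquire(&lock->cnts, &cnts,
                                                  _QW_LOCKED)))
                    return;

            queued_write_lock_slowpath(lock);
    }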

    Suggested-by: Peter Zijlstra
    Signed-off-by: Matthew Wilcox
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Arnd Bergmann
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/20180820162639.GC25153@bombadil.infradead.org
    Signed-off-by: Ingo Molnar

    Matthew Wilcox
     

25 Oct, 2017

3 commits

  • When a prospective writer takes the qrwlock locking slowpath due to the
    lock being held, it attempts to cmpxchg the wmode field from 0 to
    _QW_WAITING so that concurrent lockers also take the slowpath and queue
    on the spinlock accordingly, allowing the lockers to drain.

    Unfortunately, this isn't fair, because a fastpath writer that comes in
    after the lock is made available but before the _QW_WAITING flag is set
    can effectively jump the queue. If there is a steady stream of prospective
    writers, then the waiter will be held off indefinitely.

    This patch restores fairness by separating _QW_WAITING and _QW_LOCKED
    into two distinct fields: _QW_LOCKED continues to occupy the bottom byte
    of the lockword so that it can be cleared unconditionally when unlocking,
    but _QW_WAITING now occupies what used to be the bottom bit of the reader
    count. This then forces the slow-path for concurrent lockers.
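
    For illustration, the writer-state definitions after this change look
    roughly like this (a sketch; values as in
    include/asm-generic/qrwlock.h):

    /*
     * Writer states & reader shift and bias.
     */
    #define _QW_WAITING     0x100   /* A writer is waiting     */
    #define _QW_LOCKED      0x0ff   /* A writer holds the lock */
    #define _QW_WMASK       0x1ff   /* Writer mask             */
    #define _QR_SHIFT       9       /* Reader count shift      */
    #define _QR_BIAS        (1U << _QR_SHIFT)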

    Tested-by: Waiman Long
    Tested-by: Jeremy Linton
    Tested-by: Adam Wallis
    Tested-by: Jan Glauber
    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra
    Cc: Boqun Feng
    Cc: Jeremy.Linton@arm.com
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/1507810851-306-6-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     
  • The qrwlock slowpaths involve spinning when either a prospective reader
    is waiting for a concurrent writer to drain, or a prospective writer is
    waiting for concurrent readers to drain. In both of these situations,
    atomic_cond_read_acquire() can be used to avoid busy-waiting and make use
    of any backoff functionality provided by the architecture.

    This patch replaces the open-coded loops and the
    rspin_until_writer_unlock() implementation with
    atomic_cond_read_acquire(). The write-mode transition from zero to
    _QW_WAITING is left alone, since (a) it doesn't need acquire
    semantics and (b) it should be fast.
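
    As an example, the reader slowpath's wait-for-writer spin reduces to
    a single call (a sketch of the post-patch code):

    /*
     * Wait until the writer releases the lock (the final read has
     * acquire semantics); architectures may implement the wait with
     * backoff (e.g. WFE on arm64) instead of busy-spinning.
     */
    atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));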

    Tested-by: Waiman Long
    Tested-by: Jeremy Linton
    Tested-by: Adam Wallis
    Tested-by: Jan Glauber
    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra
    Cc: Boqun Feng
    Cc: Jeremy.Linton@arm.com
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/1507810851-306-4-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     
  • There's no good reason to keep the internal structure of struct qrwlock
    hidden from qrwlock.h, particularly as it's actually needed for unlock
    and ends up being abstracted independently behind the __qrwlock_write_byte()
    function.

    Stop pretending we can hide this stuff, and move the __qrwlock
    definition into qrwlock.h, removing the __qrwlock_write_byte()
    nastiness and using the same struct definition everywhere instead.
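
    The unified definition ends up looking roughly like this (a sketch of
    the post-patch qrwlock_types.h):

    typedef struct qrwlock {
            union {
                    atomic_t cnts;
                    struct {
    #ifdef __LITTLE_ENDIAN
                            u8 wlocked;     /* Locked for write? */
                            u8 __lstate[3];
    #else
                            u8 __lstate[3];
                            u8 wlocked;
    #endif
                    };
            };
            arch_spinlock_t wait_lock;
    } arch_rwlock_t;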

    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra
    Cc: Boqun Feng
    Cc: Jeremy.Linton@arm.com
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/1507810851-306-2-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     

10 Oct, 2017

1 commit

  • Outside of the locking code itself, {read,spin,write}_can_lock() have no
    users in tree. Apparmor (the last remaining user of write_can_lock()) got
    moved over to lockdep by the previous patch.

    This patch removes the use of {read,spin,write}_can_lock() from the
    BUILD_LOCK_OPS macro, deferring to the trylock operation for testing the
    lock status, and subsequently removes the unused macros altogether. They
    aren't guaranteed to work in a concurrent environment and can give
    incorrect results in the case of qrwlock.
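
    A minimal sketch of the resulting pattern (not the exact
    BUILD_LOCK_OPS macro body): the lock status is probed by attempting
    the trylock itself rather than via a racy *_can_lock() check:

    while (!do_raw_spin_trylock(lock))
            cpu_relax();    /* back off, then try to acquire again */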

    Signed-off-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1507055129-12300-2-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     

10 Aug, 2016

1 commit

    This patch gets rid of the endianness dependency in
    queued_write_unlock(). We want to clear __qrwlock->wmode by writing 0
    to it, but on a big-endian machine that byte does not live at
    &lock->cnts, so queued_write_unlock() would write 0 to the wrong
    field of __qrwlock.

    So implement __qrwlock_write_byte() which returns the correct
    __qrwlock->wmode address.
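
    The helper ends up looking roughly like this (a sketch of the
    post-patch code):

    /*
     * Return the address of the write-mode byte: the least significant
     * byte of the lock word, which sits at offset 0 on little-endian
     * and offset 3 on big-endian machines.
     */
    static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
    {
            return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
    }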

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Pan Xinhui
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hpe.com
    Cc: arnd@arndb.de
    Cc: boqun.feng@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1468835259-4486-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    pan xinhui
     

12 Aug, 2015

2 commits

    The qrwlock implementation is slightly heavy in its use of memory
    barriers, mainly through the use of cmpxchg() and *_return() atomics,
    which imply full-barrier semantics.

    This patch modifies the qrwlock code to use the more relaxed atomic
    routines so that we can reduce the unnecessary barrier overhead on
    weakly-ordered architectures.
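
    For example, one of the conversions in this patch (a sketch): the
    reader fastpath only needs acquire semantics on the returned value,
    not a full barrier on both sides of the atomic:

    /* before: implies a full memory barrier */
    cnts = atomic_add_return(_QR_BIAS, &lock->cnts);

    /* after: acquire semantics are sufficient to take the lock */
    cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts);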

    Signed-off-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Waiman.Long@hp.com
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1438880084-18856-7-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     
  • Since the following commit:

    536fa402221f ("compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()")

    smp_store_release() supports byte accesses, so use that in writer unlock
    and remove the conditional macro override.
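
    The writer unlock then becomes a single byte-sized store-release (a
    sketch of the post-patch code):

    static inline void queued_write_unlock(struct qrwlock *lock)
    {
            /*
             * Clear the write-mode byte with release semantics; no
             * cmpxchg loop or full barrier is needed.
             */
            smp_store_release((u8 *)&lock->cnts, 0);
    }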

    Signed-off-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Waiman Long
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1438880084-18856-6-git-send-email-will.deacon@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
     

06 Jul, 2015

2 commits

    The qrwlock is fair in process context, but becomes unfair in
    interrupt context in order to support use cases like the
    tasklist_lock.

    The current code doesn't document well what happens in interrupt
    context. rspin_until_writer_unlock() will only spin if the writer has
    already taken the lock; if the writer is still waiting, the increment
    of the reader count keeps the writer in the waiting state, and the
    new interrupt-context reader gets the lock and returns immediately.
    The current code, however, does an additional read of the lock value,
    which is unnecessary since that information is already available from
    the fast path. This can sometimes cause an extra cacheline transfer
    when the lock is highly contended.

    This patch passes the lock value obtained in the fast path to the
    slow path, eliminating the additional read. It also documents the
    behaviour of interrupt-context readers more clearly.
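
    Sketched against the post-patch code, the slowpath now takes the
    fastpath's lock value as an argument:

    void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
    {
            /*
             * Readers come here when they cannot get the lock without
             * waiting.
             */
            if (unlikely(in_interrupt())) {
                    /*
                     * Readers in interrupt context will get the lock
                     * immediately if the writer is merely waiting (not
                     * yet holding the lock), so spin with ACQUIRE
                     * semantics until the lock is available, without
                     * joining the queue.
                     */
                    rspin_until_writer_unlock(lock, cnts);
                    return;
            }
            /* ... process-context readers queue on the wait lock ... */
    }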

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Will Deacon
    Cc: Arnd Bergmann
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1434729002-57724-3-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
    To sync up with the naming convention used in qspinlock, all the
    qrwlock functions were renamed to start with "queued" instead of
    "queue" (e.g. queue_read_lock() becomes queued_read_lock()).

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Douglas Hatch
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/1434729002-57724-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

06 Jun, 2014

1 commit

    This rwlock uses arch_spinlock_t as a waitqueue, and assuming
    arch_spinlock_t is a fair lock (ticket, MCS, etc.) the resulting
    rwlock is a fair lock.

    It fits in the same 8 bytes as the regular rwlock_t by folding the
    reader and writer count into a single integer, using the remaining 4
    bytes for the arch_spinlock_t.

    Architectures that have single-copy atomicity for byte stores can
    optimize queue_write_unlock() with a plain 0 write to the LSB (the
    write count).
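
    The original layout is roughly (a sketch of the qrwlock definition
    introduced by this patch):

    typedef struct qrwlock {
            atomic_t        cnts;   /* reader count + writer state     */
            arch_spinlock_t lock;   /* waitqueue for contended lockers */
    } arch_rwlock_t;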

    Performance as measured by Davidlohr Bueso (rwlock_t -> qrwlock_t):

    +--------------+-------------+---------------+
    | Workload     | #users      | delta         |
    +--------------+-------------+---------------+
    | alltests     | > 1400      | -4.83%        |
    | custom       | 0-100,> 100 | +1.43%,-1.57% |
    | high_systime | > 1000      | -2.61%        |
    | shared       | all         | +0.32%        |
    +--------------+-------------+---------------+

    http://www.stgolabs.net/qrwlock-stuff/aim7-results-vs-rwsem_optsin/

    Signed-off-by: Waiman Long
    [peterz: near complete rewrite]
    Signed-off-by: Peter Zijlstra
    Cc: Arnd Bergmann
    Cc: Linus Torvalds
    Cc: "Paul E.McKenney"
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Link: http://lkml.kernel.org/n/tip-gac1nnl3wvs2ij87zv2xkdzq@git.kernel.org
    Signed-off-by: Ingo Molnar

    Waiman Long