17 Jun, 2016

1 commit

  • Nothing in the control-dependencies section of memory-barriers.txt
    says that control dependencies don't extend beyond the end of the
    if-statement containing the control dependency. Worse yet, in many
    situations, they do extend beyond that if-statement. In particular,
    the compiler cannot destroy the control dependency given proper use of
    READ_ONCE() and WRITE_ONCE(). However, a weakly ordered system having
    a conditional-move instruction provides the control-dependency guarantee
    only to code within the scope of the if-statement itself.

    This commit therefore adds words and an example demonstrating this
    limitation of control dependencies.

    Reported-by: Will Deacon
    Signed-off-by: Paul E. McKenney
    Acked-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: linux-arch@vger.kernel.org
    Cc: linux-doc@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160615230817.GA18039@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
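
A rough illustration of the limitation described in this commit message, not the text that was added to memory-barriers.txt. READ_ONCE() and WRITE_ONCE() are userspace volatile-cast stand-ins for the kernel macros, and the variables and function names are hypothetical.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel macros (rely on the GCC/Clang
 * __typeof__ extension); assumptions for illustration only. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static int a, b, c;	/* hypothetical shared variables */

static void control_dep_scope(int p, int r)
{
	int q = READ_ONCE(a);

	if (q) {
		/* Inside the if-statement: ordered after the load from "a". */
		WRITE_ONCE(b, p);
	} else {
		/* Also inside the if-statement, so also ordered. */
		WRITE_ONCE(b, r);
	}

	/*
	 * After the if-statement: not ordered against the load from "a"
	 * on a weakly ordered CPU that implements the branch with a
	 * conditional move. An explicit barrier is needed if this store
	 * must follow the load.
	 */
	WRITE_ONCE(c, 1);
}

/* Functional check only; it cannot observe memory ordering. */
static int control_dep_scope_demo(void)
{
	WRITE_ONCE(a, 1);
	control_dep_scope(10, 20);
	assert(b == 10 && c == 1);
	return b;
}
```

The sketch only shows where the guarantee stops; demonstrating the reordering itself would need a second thread and weakly ordered hardware.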
     

28 Apr, 2016

3 commits

  • For compound atomics performing both a load and a store operation, make
    it clear that _acquire and _release variants refer only to the load and
    store portions of the compound atomic. For example, xchg_acquire is an xchg
    operation where the load takes on ACQUIRE semantics.

    Signed-off-by: Will Deacon
    Signed-off-by: Paul E. McKenney
    Acked-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: dave@stgolabs.net
    Cc: dhowells@redhat.com
    Cc: linux-doc@vger.kernel.org
    Link: http://lkml.kernel.org/r/1461691328-5429-3-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
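
A userspace analogue of the xchg_acquire() behaviour described above, sketched with C11 atomics rather than the kernel implementation. The names xchg_acquire_sketch(), demo_trylock(), demo_unlock() and lock_word are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int lock_word;	/* hypothetical: 0 = free, 1 = held */

/* Illustrative analogue of xchg_acquire(); not the kernel implementation. */
static int xchg_acquire_sketch(atomic_int *ptr, int newval)
{
	/*
	 * memory_order_acquire on a read-modify-write applies to the load
	 * half of the operation; the store half stays relaxed.
	 */
	return atomic_exchange_explicit(ptr, newval, memory_order_acquire);
}

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int demo_trylock(void)
{
	return xchg_acquire_sketch(&lock_word, 1) == 0;
}

static void demo_unlock(void)
{
	/* The RELEASE store pairs with the ACQUIRE load in demo_trylock(). */
	atomic_store_explicit(&lock_word, 0, memory_order_release);
}

/* Functional check only; it does not exercise ordering. */
static void lock_demo(void)
{
	assert(demo_trylock() == 1);
	assert(demo_trylock() == 0);
	demo_unlock();
}
```

A test-and-set lock is a natural fit because only the load half needs ACQUIRE ordering to keep the critical section from leaking upward.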
     
  • There has been some confusion about the purpose of memory-barriers.txt,
    so this commit adds a statement of purpose.

    Signed-off-by: David Howells
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: dave@stgolabs.net
    Cc: linux-doc@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1461691328-5429-2-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    David Howells
     
  • It appears people are reading this document as a requirements list for
    building hardware. This is not the intent of this document. Nor is it
    particularly suited for this purpose.

    The primary purpose of this document is our collective attempt to define
    a set of primitives that (hopefully) allow us to write correct code on
    the myriad of SMP platforms Linux supports.

    It's a definite work in progress as our understanding of these platforms,
    and memory ordering in general, progresses.

    Nor does being mentioned in this document mean we think it's a
    particularly good idea; the data dependency barrier required by Alpha
    being a prime example. Yes we have it, no you're insane to require it
    when building new hardware.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: dave@stgolabs.net
    Cc: dhowells@redhat.com
    Cc: linux-doc@vger.kernel.org
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/1461691328-5429-1-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

13 Apr, 2016

6 commits

  • ... do this next to smp_load_acquire() when first mentioning
    ACQUIRE. While this call is briefly explained and control
    dependencies are mentioned later, it does not hurt the reader.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Paul E. McKenney
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dhowells@redhat.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-7-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     
  • The document uses two newlines between sections, one newline between
    item and its detailed description, and two spaces between sentences.

    There are a few places that used these rules inconsistently - fix them.

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-5-git-send-email-paulmck@linux.vnet.ibm.com
    [ Fixed the changelog. ]
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-4-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • A 'Virtual Machine Guests' subsection was added by this commit:

    6a65d26385bf487 ("asm-generic: implement virt_xxx memory barriers")

    but the TOC was not updated - update it.

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-3-git-send-email-paulmck@linux.vnet.ibm.com
    [ Rewrote the changelog. ]
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • The terms 'lock'/'unlock' were changed to 'acquire'/'release' by the
    following commit:

    2e4f5382d12a4 ("locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE")

    However, that commit failed to update the table of contents - fix that.

    Also, the dumb rename changed the section name 'Locking functions' to an
    actively misleading 'Acquiring functions' section name.

    Rename it to 'Lock acquisition functions' instead.

    Suggested-by: David Howells
    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: mathieu.desnoyers@efficios.com
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-2-git-send-email-paulmck@linux.vnet.ibm.com
    [ Rewrote the changelog. ]
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • The current documentation claims that the compiler ignores barrier(),
    which is not the case. Instead, the compiler carefully pays attention
    to barrier(), but in a creative way that still manages to destroy
    the control dependency. This commit sets the story straight.

    Reported-by: Mathieu Desnoyers
    Signed-off-by: Paul E. McKenney
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bobby.prani@gmail.com
    Cc: dhowells@redhat.com
    Cc: dipankar@in.ibm.com
    Cc: dvhart@linux.intel.com
    Cc: edumazet@google.com
    Cc: fweisbec@gmail.com
    Cc: jiangshanlai@gmail.com
    Cc: josh@joshtriplett.org
    Cc: oleg@redhat.com
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1460476375-27803-1-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
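
A sketch of the pattern this commit message refers to, assuming the usual identical-stores form of the problem. The macros are userspace stand-ins and every variable and function name is hypothetical.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel macros (GCC/Clang extensions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define barrier()        __asm__ __volatile__("" ::: "memory")

static int a, b;
static int did_something, did_something_else;

/*
 * Fragile: both legs start with the same barrier() and the same store,
 * so the compiler may hoist them above the "if". Once hoisted, nothing
 * conditional sits between the load from "a" and the store to "b", and
 * the control dependency is gone.
 */
static void identical_stores(void)
{
	int q = READ_ONCE(a);

	if (q) {
		barrier();
		WRITE_ONCE(b, 1);
		did_something++;
	} else {
		barrier();
		WRITE_ONCE(b, 1);
		did_something_else++;
	}
}

/* Sturdier: the stores differ, so they cannot be merged and hoisted. */
static void distinct_stores(void)
{
	int q = READ_ONCE(a);

	if (q) {
		WRITE_ONCE(b, 1);
		did_something++;
	} else {
		WRITE_ONCE(b, 2);
		did_something_else++;
	}
}

/* Functional check only; the hoisting itself is a compiler decision. */
static void stores_demo(void)
{
	WRITE_ONCE(a, 0);
	identical_stores();
	assert(b == 1 && did_something_else == 1);
	distinct_stores();
	assert(b == 2 && did_something_else == 2);
}
```

Both functions compute the same thing when run alone; the difference lies only in what the compiler is permitted to do with them.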
     

15 Mar, 2016

8 commits

  • The compiler store-fusion example in memory-barriers.txt uses a C
    comment to represent arbitrary code that does not update a given
    variable. Unfortunately, someone could reasonably interpret the
    comment as instead referring to the following line of code. This
    commit therefore replaces the comment with a string that more
    clearly represents the arbitrary code.

    Signed-off-by: SeongJae Park
    Acked-by: David Howells
    Signed-off-by: Paul E. McKenney

    SeongJae Park
     
  • The "transitivity" section mentions cumulativity in a potentially
    confusing way. Contrary to the current wording, cumulativity is
    not transitivity, but rather a hardware discipline that can be used
    to implement transitivity on ARM and PowerPC CPUs. This commit
    therefore deletes the mention of cumulativity.

    Reported-by: Luc Maranget
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The memory-barriers.txt discussion of local transitivity and
    release-acquire chains leaves out discussion of the outcome of
    the read from "u". This commit therefore adds an outcome showing
    that you can get a "1" from this read even if the release-acquire
    pairs don't line up.

    Reported-by: Will Deacon
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The introduction of smp_load_acquire() and smp_store_release() had
    the side effect of introducing a weaker notion of transitivity:
    The transitivity of full smp_mb() barriers is global, but that
    of smp_store_release()/smp_load_acquire() chains is local. This
    commit therefore introduces the notion of local transitivity and
    gives an example.

    Reported-by: Peter Zijlstra
    Reported-by: Will Deacon
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current memory-barriers.txt does not address the possibility of
    a write to a dereferenced pointer. This should be rare, but when it
    happens, we need that write -not- to be clobbered by the initialization.
    This commit therefore adds an example showing a data dependency ordering
    a later data-dependent write.

    Reported-by: Leonid Yegoshin
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
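
One way to picture the case this commit message describes, using C11 atomics as a userspace approximation. The publisher/consumer split, the names B and P, and the values are assumptions for illustration, not a quote from memory-barriers.txt.

```c
#include <assert.h>
#include <stdatomic.h>

static int B;			/* hypothetical data being initialized */
static _Atomic(int *) P;	/* hypothetical published pointer, NULL at start */

/* "CPU 0": initialize B, then publish a pointer to it. */
static void publisher(void)
{
	B = 4;
	/* The release store stands in for a write barrier plus the store to P. */
	atomic_store_explicit(&P, &B, memory_order_release);
}

/* "CPU 1": load the pointer, then write through it. */
static int consumer(void)
{
	/* Dependency-ordered load; compilers commonly treat consume as acquire. */
	int *q = atomic_load_explicit(&P, memory_order_consume);

	if (!q)
		return 0;	/* nothing published yet */
	/*
	 * Data-dependent write: it must land after the initialization
	 * "B = 4", otherwise the initialization would clobber it.
	 */
	*q = 5;
	return 1;
}

/* Single-threaded functional check; real use involves two CPUs. */
static void dependent_write_demo(void)
{
	assert(consumer() == 0);
	publisher();
	assert(consumer() == 1);
	assert(B == 5);
}
```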
     
  • Commit #1ebee8017d84 (rcu: Eliminate array-index-based RCU primitives)
    eliminated the primitives supporting RCU-protected array indexes, but
    failed to update Documentation/memory-barriers.txt accordingly. This
    commit therefore removes the discussion of RCU-protected array indexes.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit fixes a couple of "Compiler Barrier" section references to
    be "COMPILER BARRIER". This makes it easier to find the section in
    the usual text editors.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The summary of the "CONTROL DEPENDENCIES" section incorrectly states that
    barrier() may be used to prevent compiler reordering when more than one
    leg of the control-dependent "if" statement starts with identical stores.
    This is incorrect at high optimization levels. This commit therefore
    updates the summary to match the detailed description.

    Reported-by: Jianyu Zhan
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

19 Jan, 2016

1 commit

  • Pull virtio barrier rework+fixes from Michael Tsirkin:
    "This adds a new kind of barrier, and reworks virtio and xen to use it.

    Plus some fixes here and there"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits)
    checkpatch: add virt barriers
    checkpatch: check for __smp outside barrier.h
    checkpatch.pl: add missing memory barriers
    virtio: make find_vqs() checkpatch.pl-friendly
    virtio_balloon: fix race between migration and ballooning
    virtio_balloon: fix race by fill and leak
    s390: more efficient smp barriers
    s390: use generic memory barriers
    xen/events: use virt_xxx barriers
    xen/io: use virt_xxx barriers
    xenbus: use virt_xxx barriers
    virtio_ring: use virt_store_mb
    sh: move xchg_cmpxchg to a header by itself
    sh: support 1 and 2 byte xchg
    virtio_ring: update weak barriers to use virt_xxx
    Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb"
    asm-generic: implement virt_xxx memory barriers
    x86: define __smp_xxx
    xtensa: define __smp_xxx
    tile: define __smp_xxx
    ...

    Linus Torvalds
     

13 Jan, 2016

1 commit

  • Guests running within virtual machines might be affected by SMP effects even if
    the guest itself is compiled without SMP support. This is an artifact of
    interfacing with an SMP host while running a UP kernel. Using mandatory
    barriers for this use-case would be possible but is often suboptimal.

    In particular, virtio uses a bunch of confusing ifdefs to work around
    this, while xen just uses the mandatory barriers.

    To better handle this case, low-level virt_mb() etc. macros are made available.
    These are implemented trivially using the low-level __smp_xxx macros;
    the purpose of these wrappers is to annotate those specific cases.

    These have the same effect as smp_mb() etc when SMP is enabled, but generate
    identical code for SMP and non-SMP systems. For example, virtual machine guests
    should use virt_mb() rather than smp_mb() when synchronizing against a
    (possibly SMP) host.

    Suggested-by: David Miller
    Signed-off-by: Michael S. Tsirkin
    Acked-by: Peter Zijlstra (Intel)

    Michael S. Tsirkin
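
A simplified picture of the layering this commit message describes. It is a userspace sketch that assumes a C11 fence as the stand-in for the architecture's __smp_mb(); the ring structure and guest_publish() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-ins for the kernel primitives; assumptions for illustration. */
#define barrier()   __asm__ __volatile__("" ::: "memory")
#define __smp_mb()  atomic_thread_fence(memory_order_seq_cst)

/* smp_mb() degrades to a compiler barrier when SMP support is compiled out. */
#ifdef CONFIG_SMP
#define smp_mb()    __smp_mb()
#else
#define smp_mb()    barrier()
#endif

/* virt_mb() always uses the real barrier, whatever CONFIG_SMP says. */
#define virt_mb()   __smp_mb()

/* Hypothetical structure shared between a guest and a (possibly SMP) host. */
struct ring {
	volatile int data;
	volatile int ready;
};

static void guest_publish(struct ring *r, int value)
{
	r->data = value;
	/* The host may run on another CPU even if the guest kernel is UP. */
	virt_mb();
	r->ready = 1;
}

/* Functional check only; it does not exercise ordering. */
static int guest_publish_demo(void)
{
	struct ring r = { 0, 0 };

	guest_publish(&r, 42);
	assert(r.ready == 1);
	return r.data;
}
```

Using smp_mb() in guest_publish() would compile to a bare compiler barrier on a UP guest, which is exactly the case the virt_xxx wrappers exist to annotate.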
     

12 Jan, 2016

1 commit

  • Pull locking updates from Ingo Molnar:
    "So we have a laundry list of locking subsystem changes:

    - continuing barrier API and code improvements

    - futex enhancements

    - atomics API improvements

    - pvqspinlock enhancements: in particular lock stealing and adaptive
    spinning

    - qspinlock micro-enhancements"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
    futex: Cleanup the goto confusion in requeue_pi()
    futex: Remove pointless put_pi_state calls in requeue()
    futex: Document pi_state refcounting in requeue code
    futex: Rename free_pi_state() to put_pi_state()
    futex: Drop refcount if requeue_pi() acquired the rtmutex
    locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation
    lcoking/barriers, arch: Use smp barriers in smp_store_release()
    locking/cmpxchg, arch: Remove tas() definitions
    locking/pvqspinlock: Queue node adaptive spinning
    locking/pvqspinlock: Allow limited lock stealing
    locking/pvqspinlock: Collect slowpath lock statistics
    sched/core, locking: Document Program-Order guarantees
    locking, sched: Introduce smp_cond_acquire() and use it
    locking/pvqspinlock, x86: Optimize the PV unlock code path
    locking/qspinlock: Avoid redundant read of next pointer
    locking/qspinlock: Prefetch the next node cacheline
    locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg()
    atomics: Add test for atomic operations with _relaxed variants

    Linus Torvalds
     

06 Dec, 2015

1 commit

  • In commit 2ecf810121c7 ("Documentation/memory-barriers.txt: Add
    needed ACCESS_ONCE() calls to memory-barriers.txt") the statement
    "Q = P" was converted to "ACCESS_ONCE(Q) = P". This should have
    been "Q = ACCESS_ONCE(P)". It later became "WRITE_ONCE(Q, P)".
    This doesn't match the following text, which is "Q = LOAD P".
    Change the statement to be "Q = READ_ONCE(P)".

    Signed-off-by: Chris Metcalf
    Signed-off-by: Paul E. McKenney

    Chris Metcalf
     

04 Dec, 2015

1 commit

  • It serves no purpose but to confuse readers, and is most
    likely a leftover from constant memory-barriers.txt
    updates. I.e.:

    http://lists.openwall.net/linux-kernel/2006/07/15/27

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Jonathan Corbet
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1445975631-17047-5-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

04 Nov, 2015

2 commits

  • This seems to be a mis-reading of how alpha memory ordering works, and
    is not backed up by the alpha architecture manual. The helper functions
    don't do anything special on any other architectures, and the arguments
    that support them being safe on other architectures also argue that they
    are safe on alpha.

    Basically, the "control dependency" is between a previous read and a
    subsequent write that is dependent on the value read. Even if the
    subsequent write is actually done speculatively, there is no way that
    such a speculative write could be made visible to other cpu's until it
    has been committed, which requires validating the speculation.

    Note that most weakly ordered architectures (very much including alpha)
    do not guarantee any ordering relationship between two loads that depend
    on each other on a control dependency:

        read A
        if (val == 1)
                read B

    because the conditional may be predicted, and the "read B" may be
    speculatively moved up to before reading the value A. So we require the
    user to insert a smp_rmb() between the two accesses to be correct:

        read A;
        if (A == 1)
                smp_rmb()
        read B

    Alpha is further special in that it can break that ordering even if the
    *address* of B depends on the read of A, because the cacheline that is
    read later may be stale unless you have a memory barrier in between the
    pointer read and the read of the value behind a pointer:

        read ptr
        read offset(ptr)

    whereas all other weakly ordered architectures guarantee that the data
    dependency (as opposed to just a control dependency) will order the two
    accesses. As a result, alpha needs a "smp_read_barrier_depends()" in
    between those two reads for them to be ordered.

    The control dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
    had was a control dependency to a subsequent *write*, however, and
    nobody can finalize such a subsequent write without having actually done
    the read. And were you to write such a value to a "stale" cacheline
    (the way the unordered reads came to be), that would seem to lose the
    write entirely.

    So the things that make alpha able to re-order reads even more
    aggressively than other weak architectures do not seem to be relevant
    for a subsequent write. Alpha memory ordering may be strange, but
    there's no real indication that it is *that* strange.

    Also, the alpha architecture reference manual very explicitly talks
    about the definition of "Dependence Constraints" in section 5.6.1.7,
    where a preceding read dominates a subsequent write.

    Such a dependence constraint admittedly does not impose a BEFORE (alpha
    architecture term for globally visible ordering), but it does guarantee
    that there can be no "causal loop". I don't see how you could avoid
    such a loop if another cpu could see the stored value and then impact
    the value of the first read. Put another way: the read and the write
    could not be seen as being out of order wrt other cpus.

    So I do not see how these "x_ctrl()" functions can currently be necessary.

    I may have to eat my words at some point, but in the absence of clear
    proof that alpha actually needs this, or indeed even an explanation of
    how alpha could _possibly_ need it, I do not believe these functions are
    called for.

    And if it turns out that alpha really _does_ need a barrier for this
    case, that barrier still should not be "smp_read_barrier_depends()".
    We'd have to make up some new speciality barrier just for alpha, along
    with the documentation for why it really is necessary.

    Cc: Peter Zijlstra
    Cc: Paul E McKenney
    Cc: Dmitry Vyukov
    Cc: Will Deacon
    Cc: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Linus Torvalds
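
The two cases contrasted in the message above can be put side by side in C. This is a userspace approximation: smp_rmb() is modelled with a C11 acquire fence, the other macros are volatile-cast stand-ins, and the variables are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-ins for the kernel primitives (GCC/Clang extensions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_rmb()        atomic_thread_fence(memory_order_acquire)

static int A, B, C;	/* hypothetical shared variables */

/*
 * Load -> load: the branch may be predicted and the read of B hoisted
 * above the read of A, so an explicit read barrier is required.
 */
static int read_b_if_a(void)
{
	int val = READ_ONCE(A);

	if (val == 1) {
		smp_rmb();
		return READ_ONCE(B);
	}
	return -1;
}

/*
 * Load -> store: a speculative store cannot become visible to other
 * CPUs before the speculation is validated, so per the argument above
 * the control dependency alone orders the two accesses.
 */
static void write_c_if_a(void)
{
	if (READ_ONCE(A) == 1)
		WRITE_ONCE(C, 1);
}

/* Functional check only; it cannot observe reordering. */
static void control_dep_demo(void)
{
	WRITE_ONCE(A, 0);
	assert(read_b_if_a() == -1);
	WRITE_ONCE(A, 1);
	WRITE_ONCE(B, 7);
	assert(read_b_if_a() == 7);
	write_c_if_a();
	assert(C == 1);
}
```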
     
  • Pull locking changes from Ingo Molnar:
    "The main changes in this cycle were:

    - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
    ops, synchronize atomic_{read,set}() ordering requirements between
    architectures, add atomic_long_t bitops. (Peter Zijlstra)

    - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
    use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
    This enables weakly ordered architectures (such as arm64) to make
    use of more locking related optimizations. (Davidlohr Bueso)

    - Implement atomic[64]_{inc,dec}_relaxed() on ARM. (Will Deacon)

    - Futex kernel data cache footprint micro-optimization. (Rasmus
    Villemoes)

    - pvqspinlock runtime overhead micro-optimization. (Waiman Long)

    - misc smaller fixlets"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
    locking/rwsem: Use acquire/release semantics
    locking/mcs: Use acquire/release semantics
    locking/rtmutex: Use acquire/release semantics
    locking/mutex: Use acquire/release semantics
    locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
    atomic: Implement atomic_read_ctrl()
    atomic, arch: Audit atomic_{read,set}()
    atomic: Add atomic_long_t bitops
    futex: Force hot variables into a single cache line
    locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
    locking/osq: Relax atomic semantics
    locking/qrwlock: Rename ->lock to ->wait_lock
    locking/Documentation/lockstat: Fix typo - lokcing -> locking
    locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h

    Linus Torvalds
     

07 Oct, 2015

2 commits

  • The recently added lockless_dereference() macro is not present in the
    Documentation/ directory, so this commit fixes that.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Documentation/memory-barriers.txt calls out RCU as one of the sets
    of primitives associated with ACQUIRE and RELEASE. There really
    is an association in that rcu_assign_pointer() includes a RELEASE
    operation, but a quick read can convince people that rcu_read_lock() and
    rcu_read_unlock() have ACQUIRE and RELEASE semantics, which they do not.

    This commit therefore removes RCU from this list in order to avoid
    this confusion.

    Reported-by: Boqun Feng
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

23 Sep, 2015

1 commit

  • Provide atomic_read_ctrl() to mirror READ_ONCE_CTRL(), such that we can
    more conveniently use atomics in control dependencies.

    Since we can assume atomic_read() implies a READ_ONCE(), we need only
    emit an extra smp_read_barrier_depends() in order to upgrade to
    READ_ONCE_CTRL() semantics.

    Requested-by: Dmitry Vyukov
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Cc: oleg@redhat.com
    Link: http://lkml.kernel.org/r/20150918115637.GM3604@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
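
The construction described in this commit message is small enough to sketch. The atomic_t type below is a simplified stand-in, smp_read_barrier_depends() is modelled as the no-op it is on non-Alpha systems, and the refs/flag usage is hypothetical.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel primitives (GCC/Clang extensions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
/* A real barrier only on Alpha; modelled here as a no-op. */
#define smp_read_barrier_depends() do { } while (0)

typedef struct { int counter; } atomic_t;	/* simplified stand-in */

static inline int atomic_read(atomic_t *v)
{
	return READ_ONCE(v->counter);
}

/* atomic_read() plus the extra barrier, as the message describes. */
static inline int atomic_read_ctrl(atomic_t *v)
{
	int val = atomic_read(v);

	smp_read_barrier_depends();
	return val;
}

static atomic_t refs;	/* hypothetical reference count */
static int flag;	/* hypothetical flag written under the dependency */

static void set_flag_if_unreferenced(void)
{
	/* The store to "flag" is control-dependent on the load of "refs". */
	if (atomic_read_ctrl(&refs) == 0)
		WRITE_ONCE(flag, 1);
}

/* Functional check only. */
static void atomic_read_ctrl_demo(void)
{
	refs.counter = 1;
	set_flag_if_unreferenced();
	assert(flag == 0);
	refs.counter = 0;
	set_flag_if_unreferenced();
	assert(flag == 1);
}
```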
     

04 Sep, 2015

1 commit

  • Pull locking and atomic updates from Ingo Molnar:
    "Main changes in this cycle are:

    - Extend atomic primitives with coherent logic op primitives
    (atomic_{or,and,xor}()) and deprecate the old partial APIs
    (atomic_{set,clear}_mask())

    The old ops were incoherent with incompatible signatures across
    architectures and with incomplete support. Now every architecture
    supports the primitives consistently (by Peter Zijlstra)

    - Generic support for 'relaxed atomics':

    - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
    - atomic_read_acquire()
    - atomic_set_release()

    This came out of porting qrwlock code to arm64 (by Will Deacon)

    - Clean up the fragile static_key APIs that were causing repeat bugs,
    by introducing a new one:

    DEFINE_STATIC_KEY_TRUE(name);
    DEFINE_STATIC_KEY_FALSE(name);

    which define a key of different types with an initial true/false
    value.

    Then allow:

    static_branch_likely()
    static_branch_unlikely()

    to take a key of either type and emit the right instruction for the
    case. To be able to know the 'type' of the static key we encode it
    in the jump entry (by Peter Zijlstra)

    - Static key self-tests (by Jason Baron)

    - qrwlock optimizations (by Waiman Long)

    - small futex enhancements (by Davidlohr Bueso)

    - ... and misc other changes"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
    jump_label/x86: Work around asm build bug on older/backported GCCs
    locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
    locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
    locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
    locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
    locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
    locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
    locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
    locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
    locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
    locking/static_keys: Make verify_keys() static
    jump label, locking/static_keys: Update docs
    locking/static_keys: Provide a selftest
    jump_label: Provide a self-test
    s390/uaccess, locking/static_keys: employ static_branch_likely()
    x86, tsc, locking/static_keys: Employ static_branch_likely()
    locking/static_keys: Add selftest
    locking/static_keys: Add a new static_key interface
    locking/static_keys: Rework update logic
    locking/static_keys: Add static_key_{en,dis}able() helpers
    ...

    Linus Torvalds
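
The atomic_{or,and,xor}() primitives named in the pull message above have close C11 counterparts, which gives a feel for their shape. The *_sketch() wrappers and the flags variable are hypothetical; the argument order (operand first, then pointer) and the relaxed ordering mirror the kernel's convention for atomics that return no value.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flags;	/* hypothetical bit-flag word, starts at 0 */

/* Illustrative wrappers; the kernel versions operate on atomic_t. */
static inline void atomic_or_sketch(int mask, atomic_int *v)
{
	atomic_fetch_or_explicit(v, mask, memory_order_relaxed);
}

static inline void atomic_and_sketch(int mask, atomic_int *v)
{
	atomic_fetch_and_explicit(v, mask, memory_order_relaxed);
}

static inline void atomic_xor_sketch(int mask, atomic_int *v)
{
	atomic_fetch_xor_explicit(v, mask, memory_order_relaxed);
}

/* Functional check: set bits, clear one, then toggle two. */
static int logic_ops_demo(void)
{
	atomic_or_sketch(0x5, &flags);		/* 0x0 | 0x5 = 0x5 */
	assert(atomic_load(&flags) == 0x5);
	atomic_and_sketch(~0x1, &flags);	/* clear bit 0: 0x4 */
	assert(atomic_load(&flags) == 0x4);
	atomic_xor_sketch(0x6, &flags);		/* 0x4 ^ 0x6 = 0x2 */
	return atomic_load(&flags);
}
```

Setting a mask maps to an OR and clearing a mask maps to an AND with the complement, which is how the deprecated atomic_{set,clear}_mask() calls can be expressed with the new primitives.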
     

04 Aug, 2015

1 commit


03 Aug, 2015

1 commit

  • A failed cmpxchg does not provide any memory ordering guarantees, a
    property that is used to optimise the cmpxchg implementations on Alpha,
    PowerPC and arm64.

    This patch updates atomic_ops.txt and memory-barriers.txt to reflect
    this.

    Signed-off-by: Will Deacon
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Davidlohr Bueso
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Jonathan Corbet
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/20150716151006.GH26390@arm.com
    Signed-off-by: Ingo Molnar

    Will Deacon
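
C11 expresses the same property explicitly, with separate memory orders for the success and failure paths of a compare-and-exchange, so it serves as a convenient userspace illustration. cmpxchg_sketch() and word are hypothetical names, and the sketch returns a success flag, whereas the kernel's cmpxchg() returns the old value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int word;	/* hypothetical shared word */

/* Returns true if *ptr held "old" and was replaced by "newval". */
static bool cmpxchg_sketch(atomic_int *ptr, int old, int newval)
{
	return atomic_compare_exchange_strong_explicit(
		ptr, &old, newval,
		memory_order_seq_cst,	/* success: fully ordered */
		memory_order_relaxed);	/* failure: no ordering guarantee */
}

/* Functional check only; it does not exercise ordering. */
static void cmpxchg_demo(void)
{
	atomic_store(&word, 5);
	/* Mismatch: fails, leaves the value alone, implies no barrier. */
	assert(!cmpxchg_sketch(&word, 4, 9));
	assert(atomic_load(&word) == 5);
	/* Match: succeeds and acts as a full barrier. */
	assert(cmpxchg_sketch(&word, 5, 9));
	assert(atomic_load(&word) == 9);
}
```

Code that depends on ordering even when the exchange fails has to supply its own barrier on the failure path.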
     

16 Jul, 2015

3 commits


23 Jun, 2015

1 commit

  • Pull locking updates from Ingo Molnar:
    "The main changes are:

    - 'qspinlock' support, enabled on x86: queued spinlocks - these are
    now the spinlock variant used by x86 as they outperform ticket
    spinlocks in every category. (Waiman Long)

    - 'pvqspinlock' support on x86: paravirtualized variant of queued
    spinlocks. (Waiman Long, Peter Zijlstra)

    - 'qrwlock' support, enabled on x86: queued rwlocks. Similar to
    queued spinlocks, they are now the variant used by x86:

    CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
    CONFIG_QUEUED_SPINLOCKS=y
    CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
    CONFIG_QUEUED_RWLOCKS=y

    - various lockdep fixlets

    - various locking primitives cleanups, further WRITE_ONCE()
    propagation"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    locking/lockdep: Remove hard coded array size dependency
    locking/qrwlock: Don't contend with readers when setting _QW_WAITING
    lockdep: Do not break user-visible string
    locking/arch: Rename set_mb() to smp_store_mb()
    locking/arch: Add WRITE_ONCE() to set_mb()
    rtmutex: Warn if trylock is called from hard/softirq context
    arch: Remove __ARCH_HAVE_CMPXCHG
    locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
    locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
    locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
    locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
    locking/pvqspinlock, x86: Enable PV qspinlock for Xen
    locking/pvqspinlock, x86: Enable PV qspinlock for KVM
    locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
    locking/pvqspinlock: Implement simple paravirt support for the qspinlock
    locking/qspinlock: Revert to test-and-set on hypervisors
    locking/qspinlock: Use a simple write to grab the lock
    locking/qspinlock: Optimize for smaller NR_CPUS
    locking/qspinlock: Extract out code snippets for the next patch
    locking/qspinlock: Add pending bit
    ...

    Linus Torvalds
     

28 May, 2015

3 commits

  • …plug.2015.05.27a', 'init.2015.05.27a', 'tiny.2015.05.27a' and 'torture.2015.05.27a' into HEAD

    array.2015.05.27a: Remove all uses of RCU-protected array indexes.
    doc.2015.05.27a: Documentation updates.
    fixes.2015.05.27a: Miscellaneous fixes.
    hotplug.2015.05.27a: CPU-hotplug updates.
    init.2015.05.27a: Initialization/Kconfig updates.
    tiny.2015.05.27a: Updates to Tiny RCU.
    torture.2015.05.27a: Torture-testing updates.

    Paul E. McKenney
     
  • The current formulation of control dependencies fails on DEC Alpha,
    which does not respect dependencies of any kind unless an explicit
    memory barrier is provided. This commit therefore creates a
    READ_ONCE_CTRL() that has the same overhead on non-Alpha systems, but
    causes Alpha to produce the needed ordering. This commit also applies
    READ_ONCE_CTRL() to the one known use of control dependencies.

    Use of READ_ONCE_CTRL() also has the beneficial effect of adding a bit
    of self-documentation to control dependencies.

    Signed-off-by: Paul E. McKenney
    Acked-by: Peter Zijlstra (Intel)

    Paul E. McKenney
     
  • Our current documentation claims that, when followed by an ACQUIRE,
    smp_mb__before_spinlock() orders prior loads against subsequent loads
    and stores, which isn't the intent. This commit therefore fixes the
    documentation to state that this sequence orders only prior stores
    against subsequent loads and stores.

    In addition, the original intent of smp_mb__before_spinlock() was to only
    order prior loads against subsequent stores; however, people have started
    using it as if it ordered prior loads against subsequent loads and stores.
    This commit therefore also updates smp_mb__before_spinlock()'s header
    comment to reflect this new reality.

    Cc: Oleg Nesterov
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Signed-off-by: Will Deacon
    Signed-off-by: Paul E. McKenney

    Will Deacon
     

19 May, 2015

1 commit

  • Since set_mb() is really about an smp_mb() -- not an IO/DMA barrier
    like mb() -- rename it to smp_store_mb() to match the recent
    smp_load_acquire() and smp_store_release().

    Suggested-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
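
A sketch of what the renamed primitive amounts to: a store followed by a full SMP barrier. The macros are userspace stand-ins, with smp_mb() modelled by a C11 fence, and the sleeper-style usage with task_state and wake_condition is a hypothetical example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-ins for the kernel primitives (GCC/Clang extensions). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_mb()         atomic_thread_fence(memory_order_seq_cst)

/* Store, then full barrier: the shape of smp_store_mb(). */
#define smp_store_mb(var, value) \
	do { WRITE_ONCE(var, value); smp_mb(); } while (0)

static int task_state;		/* hypothetical: 1 = about to sleep */
static int wake_condition;	/* hypothetical: set by a waker */

/*
 * Announce the intent to sleep, then check the condition. The barrier
 * keeps the later load from being satisfied before the store is visible.
 */
static int prepare_to_sleep_sketch(void)
{
	smp_store_mb(task_state, 1);
	return READ_ONCE(wake_condition);
}

/* Functional check only; it does not exercise ordering. */
static void store_mb_demo(void)
{
	assert(prepare_to_sleep_sketch() == 0);
	assert(task_state == 1);
	WRITE_ONCE(wake_condition, 1);
	assert(prepare_to_sleep_sketch() == 1);
}
```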
     

18 Apr, 2015

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "Numerous fixes, the overdue removal of the i2o docs, some new Chinese
    translations, and, hopefully, the README fix that will end the flow of
    identical patches to that file"

    * tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (34 commits)
    Documentation/memcg: update memcg/kmem status
    Documentation: blackfin: Makefile: Typo building issue
    Documentation/vm/pagemap.txt: correct location of page-types tool
    Documentation/memory-barriers.txt: typo fix
    doc: Add guest_nice column to example output of `cat /proc/stat'
    Documentation/kernel-parameters: Move "eagerfpu" to its right place
    Documentation: gpio: Update ACPI part of the document to mention _DSD
    docs/completion.txt: Various tweaks and corrections
    doc: completion: context, scope and language fixes
    Documentation:Update Documentation/zh_CN/arm64/memory.txt
    Documentation:Update Documentation/zh_CN/arm64/booting.txt
    Documentation: Chinese translation of arm64/legacy_instructions.txt
    DocBook media: fix broken EIA hyperlink
    Documentation: tweak the maintainers entry
    README: Change gzip/bzip2 to xz compression format
    README: Update version number reference
    doc:pci: Fix typo in Documentation/PCI
    Documentation: drm: Use '->' when describing access through pointers.
    Documentation: Remove mentioning of block barriers
    Documentation/email-clients.txt: Fix one grammar mistake, add extra info about TB
    ...

    Linus Torvalds